
Claude 4: Anthropic's first model built for agents, not just conversations
On May 22, 2025, Anthropic launched Claude Opus 4 and Sonnet 4 — the first Claude generation explicitly designed for sustained multi-hour autonomous work. The launch included four new agent API tools, general availability of Claude Code, and — in a company first — activation of ASL-3 safety protections for Opus 4.
Research Brief
What the models actually do

Memory, shortcuts, and the agent behavior gap
Four new API tools shipped the same day
| Tool | What it does |
|---|---|
| Code execution | Runs Python in a sandboxed container; Claude can generate, execute, and iterate on code within a single API call |
| MCP connector | Connects to any remote Model Context Protocol server without custom client code; Anthropic handles authentication and tool discovery |
| Files API | Stores documents server-side so they can be referenced across sessions without re-uploading |
| Extended prompt caching | Raises the cache TTL from 5 minutes to 1 hour; can cut costs by up to 90% and latency by up to 85% on long repeated prompts |
ASL-3: the first time Anthropic activated its highest safety tier
Claude Code goes generally available
What shifts with this generation
Related content
Picked from other channels by content similarity—find new creators to follow.
Article·Claude 4 发布:Opus 4 拿下 SWE-bench 72.5%,Anthropic 的 agent 押注
Anthropic 发布 Claude Opus 4 和 Sonnet 4,在 SWE-bench 软件工程 benchmark 上拿下当前最高分,并随附 extended thinking + tool use 和大幅改善的 agent 稳定性。本文解读两款模型的技术变化、定位分工和 Anthropic 押注长时间 agent 工作负载的逻辑。
三大公司大模型论文
Article·🚨 BREAKING: Anthropic Drops Claude Opus 4.8 — 4× Less Likely to Lie, Same Price, Hundreds of Parallel Subagents
🚨 BREAKING: Anthropic ships Claude Opus 4.8 — 42 days after Opus 4.7, same $5/$25 price, 4× better at catching its own mistakes. Dynamic Workflows unlocks hundreds of parallel subagents. The safety squad is playing offense now. #AILeague
AIL·Breaking
Audio·Opus 4.8:Anthropic 把旗舰模型做成更稳的代理工人
Anthropic 发布 Claude Opus 4.8,同价升级 Opus,并把努力程度控制、Claude Code 动态工作流和更强调诚实性的评估放到同一条线上。本期解读它为什么指向更长时间、更高自治度的代理工作,而不只是一次跑分提升。
Claude 博客解读播客
Article·AI Coding Tools Weekly: Opus 4.8 lands on three platforms, Copilot's $746 bill shock, and the June 18 Gemini CLI deadline
This week's digest covers 22 confirmed events across 8 tools: Anthropic closed a $65B Series H and shipped Claude Opus 4.8 simultaneously to Copilot, Cursor, and Windsurf. Claude Code's Dynamic Workflows (16 parallel sub-agents, 1,000 total) enabled a 750K-line Zig→Rust migration in 11 days. Cursor v3.6 launched Auto-review mode to keep agents running without constant approval interrupts. Copilot's June 1 usage-based billing is generating sticker shock — community posts document bills of 15–26× current rates under agentic workloads. Devin raised $1B at a $26B valuation with $492M run-rate revenue; async sessions now outnumber interactive ones. Grok Build shipped 7 releases in 7 days. Gemini CLI shuts down June 18. The BARE benchmark finds frontier models succeed on real maintainability tasks less than 23% of the time.
Global AI Coding Tools Update
Article·Claude 三个月迭代全景:从旗舰降价到 AI 安全分水岭
2026 年 2 月至 5 月,Anthropic 在模型、定价、产品、对齐研究四条线同步推进:Opus 4.6/4.7、Sonnet 4.6、Haiku 4.5 密集迭代,旗舰降价 67%,Mythos Preview 引发 AI 安全新关注,agent 编排架构全面成熟。
Claude 全动态追踪
Article·Claude Opus 4.8:当「诚实」成为旗舰模型的核心卖点
Anthropic 在 2026 年 5 月发布的 Claude Opus 4.8,以「诚实性」作为首要叙事方向:代码缺陷未标出率下降 4 倍、首个在关键 Agent 测试上漏报率为零的 Claude 模型。本文深度拆解其核心能力提升、Dynamic Workflows 新功能、benchmark 进退与竞品格局,以及 Mythos 下一代模型的时间线信号。
LLM Release Notes

Add more perspectives or context around this Post.