Claude Coded: Sonnet 4.5, Claude Code 2.0, and more

Anthropic here — I’m excited to share the latest developments for Claude and Claude Code in our newest Claude Coded update. We’ve shipped major model improvements, developer tooling upgrades, and powerful new API features that make building long-running, stateful agents and integrating Claude into your workflows even easier. Below is a concise news-style report on everything we announced and what it means for developers, product teams, and power users.

🚀 Claude Sonnet 4.5: A new high-water mark for coding and reasoning

The headline is simple: Claude Sonnet 4.5 is now available, and it’s our best coding model to date. On the SWE-bench Verified benchmark, Sonnet 4.5 leads with a score of 77.2%, and in real-world testing it has stayed focused on complex tasks for well over 30 hours straight. That kind of sustained coherence opens up new possibilities for extended coding sessions, deep debugging, and workflows that require the model to keep a lot of context across time.

"Claude Sonnet 4.5 is now available wherever you get your Claude, and it is the best coding model in the world."

Sonnet 4.5 isn’t just a better coder. We’ve seen substantial gains in general reasoning, math, and computer-use tasks. For example, on OSWorld, a benchmark that measures how well an AI can actually use a computer the way a human would, Claude jumped from 42% four months ago to over 61% now. That’s a meaningful improvement in the model’s ability to navigate interfaces, call tools, and complete multi-step, interactive tasks without human intervention.

💻 Claude Code: VS Code extension, refreshed terminal UI, and real-time diffs

For developers who prefer the IDE experience, we launched a native Claude Code extension for Visual Studio Code. The extension brings Claude directly into your IDE with a dedicated sidebar panel that shows inline diffs of the changes made — so you can instantly review what Claude suggests or applies to your codebase. This VS Code integration is in beta and available through the VS Code Marketplace.

We didn’t forget about terminal lovers either. Claude Code 2.0 includes a refreshed terminal UI with improved status visibility and a searchable prompt history, making command-line interactions smoother and easier to audit. These UI improvements are designed to make iterative, interactive development with Claude feel natural and fast.

⏪ Checkpoints and /rewind: Rollbacks you can trust

One of the most practical additions is the new checkpoints feature. Large automated edits and experiments can be nerve-wracking — so we built a way to confidently run big tasks and instantly roll back to a previous state if something goes sideways.

  • Invoke the rollback with the /rewind command or double-hit the Escape key.
  • Choose whether to restore just the code, the conversation, or both to a prior state.
  • Checkpoints only apply to edits made by Claude (they do not capture user edits or bash commands), so they work best as a complement to your existing version-control workflow.

Because checkpoints don’t replace version control, my recommendation is to keep frequent commits and use checkpoints as a safety net for Claude-driven runs. For long experiments, commit a stable baseline, let Claude run with checkpoints enabled, and then decide whether to accept the changes into your branch.
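The rewind behavior described above can be pictured with a small toy model. This is only a sketch of the semantics as stated in this post (Claude's edits are tracked, manual edits are not, and code and conversation can be restored separately). The `Session`, `Checkpoint`, `claude_edit`, `user_edit`, and `rewind` names are hypothetical and do not reflect how Claude Code is implemented.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Optional


@dataclass
class Checkpoint:
    """State recorded just before one Claude-made edit (hypothetical structure)."""
    path: str
    previous: Optional[str]   # file content before the edit; None if the file was new
    conversation_len: int     # number of conversation turns before the edit


@dataclass
class Session:
    files: Dict[str, str] = field(default_factory=dict)
    conversation: List[str] = field(default_factory=list)
    checkpoints: List[Checkpoint] = field(default_factory=list)

    def claude_edit(self, prompt: str, path: str, content: str) -> None:
        # Record what this edit is about to overwrite, then apply it.
        self.checkpoints.append(
            Checkpoint(path, self.files.get(path), len(self.conversation))
        )
        self.conversation.append(prompt)
        self.files[path] = content

    def user_edit(self, path: str, content: str) -> None:
        # Manual edits create no checkpoint; files Claude never touched
        # are unaffected by a rewind.
        self.files[path] = content

    def rewind(self, index: int, code: bool = True, conversation: bool = True) -> None:
        """Undo Claude's edits from checkpoint `index` onward."""
        target = self.checkpoints[index]
        if code:
            # Walk backwards so the oldest recorded content wins.
            for cp in reversed(self.checkpoints[index:]):
                if cp.previous is None:
                    self.files.pop(cp.path, None)
                else:
                    self.files[cp.path] = cp.previous
        if conversation:
            del self.conversation[target.conversation_len:]
        del self.checkpoints[index:]
```

In this model, a file the user created by hand survives every rewind, which is why a git commit remains the right tool for protecting manual work.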

🔁 Thinking, usage tracking, and small but powerful UX tweaks

We made several small UX improvements to make Claude feel more responsive to your workflow. The Thinking feature, which lets Claude reason through a problem before it responds, can now be toggled with the Tab key, and the preference is saved across sessions. That makes switching modes faster during interactive development.

On the usage front, you can monitor your real-time consumption using the /usage command in Claude Code, or by navigating to Settings → Usage in the Claude app. Having immediate visibility into usage helps teams manage costs and detect unexpected consumption patterns while iterating.

☁️ API advances: context editing and a file-based memory

For agent builders, two new API capabilities change the game: context editing and the memory tool.

Context editing is designed to prevent agents from getting bogged down by stale tool calls and results as they approach token limits. As your agent runs and accumulates tool outputs, context editing will automatically clear out stale content while preserving conversational flow. The net effect: agents can run for longer without manual intervention or repeated re-priming by the user.
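To make the idea concrete, here is a minimal local sketch of the pruning strategy described above: once a conversation exceeds a token budget, older tool results are replaced with a short placeholder while the most recent ones and all other turns stay intact. This is not the API's implementation or wire format. The message shape, the `clear_stale_tool_results` and `estimate_tokens` helpers, and the four-characters-per-token estimate are all assumptions made for illustration.

```python
from typing import Dict, List

PLACEHOLDER = "[tool result cleared to save context]"


def estimate_tokens(messages: List[Dict[str, str]]) -> int:
    # Crude assumption: roughly four characters per token.
    return sum(len(m["content"]) for m in messages) // 4


def clear_stale_tool_results(
    messages: List[Dict[str, str]],
    max_tokens: int,
    keep_recent: int = 2,
) -> List[Dict[str, str]]:
    """Return a copy of `messages` with older tool results blanked out.

    Nothing changes while the conversation fits in `max_tokens`. Over
    budget, every tool result except the `keep_recent` newest is replaced
    by a placeholder, so turn order and roles are preserved.
    """
    if estimate_tokens(messages) <= max_tokens:
        return list(messages)

    tool_indices = [i for i, m in enumerate(messages) if m["role"] == "tool"]
    if keep_recent > 0:
        stale = set(tool_indices[:-keep_recent])
    else:
        stale = set(tool_indices)

    return [
        {**m, "content": PLACEHOLDER} if i in stale else m
        for i, m in enumerate(messages)
    ]
```

Replacing content rather than deleting messages keeps the conversation structure intact, so the model can still see that a tool was called even though the bulky output is gone.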

The memory tool is equally impactful. Claude can now create, read, update, and delete files in a dedicated memory directory stored on your infrastructure, entirely client-side. This file-based memory persists across conversations, enabling stateful agents that can recall and build upon past interactions. It’s the equivalent of a CLAUDE.md file that your agent can manage programmatically through the API.

"It's kind of like having a CLAUDE.md file for your agent API."

These features were showcased in examples like the Claude Plays Catan demo, and we published cookbooks to help developers learn how to apply these capabilities to real tasks. The memory tool enables workflows such as long-term session state, persistent preferences, progressive knowledge accumulation, and more robust orchestration of multi-step processes.
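Because the memory directory lives on your side, your application is the one that actually executes the file operations. The sketch below shows one way such a client-side handler might look, assuming a simple create/read/update/delete interface rooted at a single directory. The `MemoryStore` class and its method names are hypothetical and simplified; the Memory Cookbook is the place to check the tool's real command set.

```python
from pathlib import Path


class MemoryStore:
    """File-backed memory confined to one directory (hypothetical helper)."""

    def __init__(self, root) -> None:
        self.root = Path(root).resolve()
        self.root.mkdir(parents=True, exist_ok=True)

    def _resolve(self, name: str) -> Path:
        # Reject paths such as "../secrets" that would escape the memory directory.
        path = (self.root / name).resolve()
        if not path.is_relative_to(self.root):  # requires Python 3.9+
            raise ValueError(f"path escapes memory directory: {name}")
        return path

    def create(self, name: str, text: str) -> None:
        path = self._resolve(name)
        path.parent.mkdir(parents=True, exist_ok=True)
        path.write_text(text, encoding="utf-8")

    def read(self, name: str) -> str:
        return self._resolve(name).read_text(encoding="utf-8")

    def update(self, name: str, old: str, new: str) -> None:
        # Replace the first occurrence of `old`; fail loudly if it is absent.
        text = self.read(name)
        if old not in text:
            raise ValueError(f"text to replace not found in {name}")
        self.create(name, text.replace(old, new, 1))

    def delete(self, name: str) -> None:
        self._resolve(name).unlink()
```

The path check matters because file names arrive from model output; treating them as untrusted input keeps the agent inside its own directory.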

🧩 Claude Agent SDK: from Claude Code SDK to a more complete developer toolkit

We’ve renamed and expanded the Claude Code SDK into the Claude Agent SDK. This isn’t just a rename — the SDK now packages the core tools, context management systems, and permissions frameworks that power Claude Code so you can build your own agents with the same building blocks we used internally.

Over the past six months we learned a lot about how people build agents: where tool orchestration breaks down, what primitives are missing, and how to make permissioning safe and manageable. The Claude Agent SDK encapsulates those lessons and gives you a practical starting point for constructing agents that are robust, auditable, and safe.
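As a rough picture of what a permissions layer does, the sketch below gates tool calls through an allow list and a deny list and records every decision for later audit. It is a generic illustration, not the Claude Agent SDK's API: `PermissionPolicy`, `Decision`, and the tool names used with them are hypothetical.

```python
from dataclasses import dataclass, field
from enum import Enum
from typing import List, Set, Tuple


class Decision(Enum):
    ALLOW = "allow"
    ASK = "ask"      # fall back to asking a human
    DENY = "deny"


@dataclass
class PermissionPolicy:
    allowed: Set[str] = field(default_factory=set)
    denied: Set[str] = field(default_factory=set)
    audit_log: List[Tuple[str, Decision]] = field(default_factory=list)

    def check(self, tool: str) -> Decision:
        # Deny rules win over allow rules; unknown tools require confirmation.
        if tool in self.denied:
            decision = Decision.DENY
        elif tool in self.allowed:
            decision = Decision.ALLOW
        else:
            decision = Decision.ASK
        self.audit_log.append((tool, decision))
        return decision
```

Defaulting unknown tools to "ask" rather than "allow" is the conservative choice here: a newly added tool cannot run unattended until someone has explicitly approved it.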

📊 Claude app: analyze data, create files, and get sharable outputs

Finally, the Claude app got upgraded to better handle file creation and data analysis. You can now ask Claude, in plain language, to:

  • Generate Excel spreadsheets
  • Create PowerPoint presentations
  • Draft Word documents
  • Produce PDFs you can download and share

Claude can analyze datasets, create visualizations, and put insights into the exact formats you need. This capability is available in preview to all paid plans and makes it faster to go from conversational analysis to deliverables you can send to stakeholders.

🛠️ How to get started — practical steps and tips

If you want to try these features today, here’s a short checklist that will get you productive quickly:

  1. Try Sonnet 4.5 for coding and reasoning-heavy tasks. Expect better correctness and longer sustained focus.
  2. Install the Claude Code VS Code extension from the VS Code Marketplace (beta) and use the sidebar to inspect inline diffs.
  3. Use the refreshed terminal UI for command-line workflows and try the searchable prompt history to find previous exchanges.
  4. Toggle Thinking with the Tab key and use /usage to keep an eye on consumption. In the Claude app, check Settings → Usage for a dashboard view.
  5. Use checkpoints (/rewind or double Escape) for large automated edits, but keep using git and your normal version-control practices.
  6. Explore the API’s context editing and memory tool to build agents that run longer and retain state across conversations.
  7. Download example cookbooks and demos (like the Memory Cookbook) to learn patterns for common agent tasks.
  8. If you prefer browser-based helpers, the Claude for Chrome extension has been expanded to everyone who was on the waitlist. Try it at claude.ai/chrome.

When building agents that use memory, treat the memory store like any other persistent system: control access, manage lifecycle (create/update/delete), and design your schema for what the agent needs to recall. For cost control and reliability, combine context editing with periodic checkpoints or persisted snapshots to avoid unexpected token growth.
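One way to act on the lifecycle advice above is to give each memory record an explicit schema with a time-to-live, then prune expired records on a schedule. The sketch below assumes such a design; `MemoryEntry`, its fields, and `prune_expired` are hypothetical names, not part of any Anthropic API.

```python
from dataclasses import dataclass
from typing import List


@dataclass(frozen=True)
class MemoryEntry:
    key: str            # what the agent looks the fact up by
    value: str          # the remembered content
    updated_at: float   # Unix timestamp of the last write, in seconds
    ttl_seconds: float  # how long the entry stays valid after a write


def prune_expired(entries: List[MemoryEntry], now: float) -> List[MemoryEntry]:
    """Keep only entries whose time-to-live has not yet elapsed at `now`."""
    return [e for e in entries if now - e.updated_at < e.ttl_seconds]
```

Passing `now` in as an argument, rather than reading the clock inside the function, keeps the pruning logic deterministic and easy to test.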

🔔 What this means for developers and teams

To summarize the impact: Sonnet 4.5 gives you a stronger model that holds focus longer on coding and reasoning tasks. Claude Code 2.0 and the VS Code extension make IDE-based development with Claude smoother and safer. Context editing and the memory tool let agents run longer and keep state across sessions, while the Claude Agent SDK packages up the pieces you need to build production-grade agents. And finally, the Claude app’s file creation features turn conversational work directly into deliverable formats.

"Happy coding and keep thinking."

If you’re building with Claude, these updates should make it easier to iterate faster, keep long-running workflows stable, and build agents that actually remember things across sessions. We’ll keep iterating — and we’ve built the SDKs, cookbooks, and examples to help you move quickly.

📎 Resources and next steps

Want to dive deeper? Here are the places to explore next (plain text links):

  • Join the Claude Developer Discord: https://anthropic.com/discord
  • Claude Code Docs: https://docs.claude.com/en/docs/claude-code/overview
  • Memory Cookbook: https://github.com/anthropics/claude-cookbooks/blob/main/tool_use/memory_cookbook.ipynb
  • Try the Claude for Chrome extension: claude.ai/chrome

We’ll keep shipping improvements and sharing cookbooks that reflect real-world use cases. If you try Sonnet 4.5, the VS Code extension, or the agent features, I’d love to hear about what you build and what you want next.

— Anthropic


AIWorldVision

AI and Technology News