Claude Sonnet 4.6: More capable AI with a 1M-token context

Feb 17, 20264 minutes

Claude Sonnet 4.6 is here and promises to be the most capable Sonnet yet. What does that mean for you, your team, or your business? In practical terms: better code, smarter use of on-screen apps, longer-form reasoning, and a beta 1 million token context that can hold contracts, codebases, or entire research libraries in a single query.

Qué trae Sonnet 4.6

Sonnet 4.6 is a broad update: improvements in coding, computer usage, long-context reasoning, agent-like planning, knowledge work, and design. If you use claude.ai or Claude Cowork on Free and Pro plans, Sonnet 4.6 is already the default model. The price hasn’t changed from Sonnet 4.5: it starts at $3/$15 per million tokens.

Sonnet 4.6 improves consistency, instruction following, and reduces useless repetition in code. In early tests, developers preferred Sonnet 4.6 over Sonnet 4.5 in about 70% of cases, and over Opus 4.5 in 59% of comparisons because it’s less prone to overengineering and more reliable on multi-step follow-through.

Uso de computador: por qué importa

Thinking of an AI that uses a computer like a person changes the game. Before, automating old or specialized software required custom connectors. Now the model can interact with interfaces, click and type inside a virtual machine, which expands the tasks you can delegate: navigating complex spreadsheets, filling multi-step forms, or orchestrating work across multiple tabs.

OSWorld, the standard benchmark for these tasks, shows steady improvements over 16 months. Still, Sonnet 4.6 isn’t perfect: it remains behind the most skilled humans on certain fine maneuvers. But the progress is notable and already useful in many real workflows.

Important: using computers opens risk vectors, like prompt injection attacks. Anthropic says Sonnet 4.6 shows greater resistance compared to Sonnet 4.5 and performance similar to Opus 4.6. Review mitigation best practices in the API documentation.

Rendimiento y evaluaciones clave

Sonnet 4.6 raises performance across multiple benchmarks and real-world tasks: OfficeQA (business documents), Vending-Bench Arena (simulated business strategy), deep-reasoning evaluations, and large-scale bug-fixing.

The beta 1M-token context window lets you keep entire codebases, long contracts, or dozens of papers in a single request, and crucially: reason across all of it. In Vending-Bench Arena, Sonnet 4.6 showed long-term planning strategies that gave it an edge over competitors.

Clients reported concrete improvements: smoother frontends, clearer financial analysis, fewer iterations to reach production, and better bug detection. Practical example: Rakuten AI got iOS code that better met specifications and modern architecture in a single pass.

Productos y herramientas: dónde está disponible

Sonnet 4.6 is already on claude.ai, Claude Cowork, Claude Code, the API, and major clouds.
The free tier was updated to use Sonnet 4.6 by default, including file creation, connectors, and compaction.
On the developer platform: it supports adaptive thinking, extended thinking, and context compaction in beta (automatic summarization of old context).
In the API: web search and fetch can now write and run code to filter results, improving answer quality and token efficiency. Code execution, memory, programmatic tool calls, and usage examples are generally available.
For Claude users in Excel: the add-in now supports MCP connectors with key financial providers (S&P Global, LSEG, PitchBook, FactSet, among others) on Pro, Max, Team, and Enterprise plans.

Seguridad y límites

Anthropic reports that Sonnet 4.6 passed extensive safety evaluations and describes its character as 'warm, honest, prosocial, and sometimes funny', with strong safety behaviors and no signs of major alignment failures. Still, remember: no benchmark fully captures real-world risk. Practical advice: test in controlled environments, monitor outputs, and apply mitigations against injection and misuse.

Cómo empezar hoy

If you’re a developer, use the identifier claude-sonnet-4-6 in the API to migrate. Try different effort/latency settings to find the balance between speed and quality for your use case. If you rely on Opus for ultra-critical deep-reasoning tasks, Anthropic suggests Opus 4.6 remains the best choice for those highest-demand scenarios.

Sonnet 4.6 is especially attractive if you want near-state-of-the-art performance at a more efficient cost: better performance-to-cost ratio, fewer iterations to production, and expanded capabilities for code and document tasks.

Reflexión final

It’s not just a version number: Sonnet 4.6 shows how AI becomes more practical for real work — from fixing bugs in large codebases to processing lengthy contracts — and it does so while keeping price and scalability. Ready to try it in your workflow? Start with a controlled experiment and you’ll see which tasks save you the most time.

Fuente original

https://www.anthropic.com/news/claude-sonnet-4-6

Stay up to date!

Get AI news, tool launches, and innovative products straight to your inbox. Everything clear and useful.