Anthropic Removes Long-Context Pricing for Claude: 1M-Token Prompts Now Cost the Same
Anthropic has announced a major pricing update for its latest AI models, removing the long-context surcharge for prompts approaching 1 million tokens. The change applies to both Claude Opus 4.6 and Claude Sonnet 4.6, allowing developers to run extremely large prompts at the same standard per-token rate as smaller requests.
Previously, prompts that exceeded roughly 200,000 tokens were moved into a premium pricing tier. Under the new model, that threshold has been removed entirely. Developers can now submit prompts containing hundreds of thousands — or even close to one million — tokens without triggering higher pricing bands.
For organizations building large-scale AI systems, this pricing shift could significantly simplify how applications are designed.
Claude’s 1-Million-Token Context Window Explained
A context window determines how much information an AI model can process in a single request. Larger context windows allow AI systems to analyze longer documents or larger datasets without splitting tasks into multiple prompts.
With a 1-million-token context window, Claude models can process massive inputs such as:
- Entire code repositories
- Long research papers
- Legal documents and contracts
- Large enterprise knowledge bases
- Multi-file development projects
This capability enables the model to reason across broader information sets, which can improve tasks like debugging complex software systems or analyzing large collections of documents.
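Before sending an entire repository or document collection in one request, it helps to estimate whether it actually fits in the window. The sketch below uses the common rough heuristic of about four characters per token; real counts depend on the model's tokenizer, so treat this as a ballpark only (the file-suffix filter is an arbitrary example).

```python
from pathlib import Path

CHARS_PER_TOKEN = 4  # rough heuristic, not an exact tokenizer


def estimate_tokens(root: str, suffixes=(".py", ".md", ".txt")) -> int:
    """Ballpark token count for all matching files under a directory."""
    total_chars = 0
    for path in Path(root).rglob("*"):
        if path.is_file() and path.suffix in suffixes:
            total_chars += len(path.read_text(errors="ignore"))
    return total_chars // CHARS_PER_TOKEN
```

A project that estimates well under 1,000,000 tokens is a candidate for a single long-context prompt; anything larger still needs splitting or retrieval.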
What the New Claude Pricing Looks Like
Under Anthropic’s updated pricing structure, long prompts are billed at the same per-token rate regardless of size.
Current pricing
| Model | Input Cost | Output Cost |
| --- | --- | --- |
| Claude Opus 4.6 | ~$5 per million tokens | ~$25 per million tokens |
| Claude Sonnet 4.6 | ~$3 per million tokens | ~$15 per million tokens |
Previously, these costs increased once prompts crossed the long-context threshold. For example, Sonnet's input token price could rise from about $3 to roughly $6 per million tokens, while Opus input pricing could double from $5 to around $10.
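The impact is easy to see with a worked example. The sketch below compares a single large Sonnet request under the former ~2x input surcharge against the new flat rate, using the approximate figures quoted above (the 900k/4k token sizes are illustrative, and the output rate is held constant since the article quotes the surcharge on input pricing).

```python
def cost(input_tokens: int, output_tokens: int,
         in_rate: float, out_rate: float) -> float:
    """Request cost in dollars, given per-million-token rates."""
    return input_tokens / 1e6 * in_rate + output_tokens / 1e6 * out_rate


INPUT, OUTPUT = 900_000, 4_000  # a near-1M-token prompt, short response

# Sonnet 4.6: former long-context input rate (~$6) vs the new flat rate (~$3)
old = cost(INPUT, OUTPUT, in_rate=6.0, out_rate=15.0)
new = cost(INPUT, OUTPUT, in_rate=3.0, out_rate=15.0)
print(f"old: ${old:.2f}  new: ${new:.2f}  saved: ${old - new:.2f}")
# → old: $5.46  new: $2.76  saved: $2.70
```

At these rates, removing the surcharge roughly halves the cost of a near-million-token Sonnet request.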
By removing that surcharge, Anthropic has made large-context workloads easier to experiment with and deploy in production environments.
Why This Change Matters for Developers
For years, developers have used architectural strategies such as retrieval-augmented generation (RAG) to keep prompts small and control costs. Retrieval systems send only the most relevant snippets of information to a model rather than entire datasets.
That approach helped manage token costs but also introduced complexity.
With the pricing barrier removed, developers can now choose between two approaches:
- Continue using retrieval systems to reduce token usage
- Send larger bodies of data directly to the model when broader context is beneficial
This flexibility could simplify many AI workflows. Instead of orchestrating multiple model calls or splitting documents into smaller segments, developers can sometimes place a larger portion of data into a single prompt and ask the model to analyze it all at once.
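One way to frame that choice in code: with flat pricing, the decision reduces to whether the whole corpus fits in the window with room left for the response; if not, fall back to retrieval. This is a minimal sketch of that routing logic, where `count_tokens` and `retrieve` are hypothetical callables standing in for a real tokenizer and a real retrieval system.

```python
CONTEXT_WINDOW = 1_000_000  # tokens available to the model
RESPONSE_BUDGET = 8_000     # tokens reserved for the model's answer


def build_prompt(question, docs, count_tokens, retrieve):
    """Send everything if it fits; otherwise fall back to retrieval."""
    corpus = "\n\n".join(docs)
    budget = CONTEXT_WINDOW - RESPONSE_BUDGET
    if count_tokens(corpus) + count_tokens(question) <= budget:
        return corpus + "\n\n" + question  # full-context path
    # RAG path: include only the most relevant snippets
    snippets = retrieve(question, docs)
    return "\n\n".join(snippets) + "\n\n" + question
```

The point of the pricing change is that the full-context branch no longer carries a rate penalty, only the ordinary per-token cost of the larger input.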
Large Context Windows Are Becoming an AI Benchmark
The race toward larger context windows has become a key area of competition among AI model providers.
Companies such as OpenAI and Google have also introduced models capable of processing extremely long prompts approaching the million-token mark.
Anthropic’s roadmap toward this milestone has evolved over several years:
- Early Claude models supported around 200,000 tokens
- The Claude 3 family (2024) demonstrated the ability to handle inputs beyond 1 million tokens for select use cases
- Claude Sonnet 4 (2025) introduced the first public 1-million-token context window
- Claude Opus 4.6 and Sonnet 4.6 (2026) now provide the full capability with simplified pricing
The latest update removes one of the final limitations that discouraged developers from fully using long prompts.
Where Developers Can Access the 1M Token Context
The million-token context window is available across the Claude ecosystem and major cloud AI platforms, including:
- Claude Platform
- Amazon Bedrock
- Google Vertex AI
- Microsoft Foundry
Enterprise users running Opus 4.6 through Claude Code Max, Team, or Enterprise subscriptions also receive the full 1-million-token context window by default.
The Bigger Picture for AI Development
Removing the long-context surcharge does not eliminate token costs altogether. Larger prompts still consume more tokens, which means developers must weigh the cost of sending large datasets against alternative architectural approaches.
However, by removing the pricing threshold, Anthropic has lowered the barrier to experimenting with long-context AI systems.
For AI-native coding tools, enterprise automation platforms, and research workflows, the ability to analyze entire projects or document collections in a single prompt could unlock new capabilities.
As the competition among AI model providers intensifies, pricing transparency and larger context windows are becoming important differentiators. Anthropic’s latest update signals that million-token reasoning may soon become a standard feature across advanced AI models.