News
Anthropic's New AI Model Targets Coding, Enterprise Work as Competition With OpenAI Intensifies
- By John K. Waters
- 02/10/2026
Anthropic released Claude Opus 4.6 on Thursday, introducing a million-token context window and automated agent coordination features as the AI company seeks to expand beyond software development into broader enterprise applications.
The San Francisco-based firm said the model improves performance on coding tasks, financial analysis, and document processing compared to its predecessor. Anthropic framed the release as a bid to strengthen its position in enterprise AI, an increasingly crowded market where it competes directly with OpenAI and Google.
"We're focused on building the most capable, reliable, and safe AI systems," an Anthropic spokesperson said. "Opus 4.6 is even better at planning, helping solve the most complex coding tasks."
The release comes three days after OpenAI launched a desktop application for its Codex AI coding system, underscoring the rapid pace of competition in AI development tools. Anthropic said in November that Claude Code, its coding product, reached $1 billion in annualized revenue six months after general availability.
Extended Context and Agent Coordination
Opus 4.6 supports up to one million tokens of context in beta on Anthropic's developer platform, a substantial increase from the 200,000-token limit of earlier Opus versions. The expansion allows the model to process larger codebases and longer documents without splitting tasks across multiple requests.
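For developers curious what a long-context request looks like in practice, the following is a minimal sketch using Anthropic's Python SDK. The beta flag name and the input file are assumptions made for illustration; the exact flag that enables the million-token window on Opus 4.6 is documented on Anthropic's developer platform.

```python
# Minimal sketch of a long-context request via Anthropic's Python SDK.
# LONG_CONTEXT_BETA is a placeholder (assumption); check Anthropic's docs
# for the exact beta flag that enables the 1M-token window on Opus 4.6.
import anthropic

client = anthropic.Anthropic()  # reads ANTHROPIC_API_KEY from the environment

LONG_CONTEXT_BETA = "context-1m-2025-08-07"  # placeholder flag name (assumption)

with open("large_codebase_dump.txt") as f:   # hypothetical input file
    big_context = f.read()

response = client.beta.messages.create(
    model="claude-opus-4-6",                 # identifier cited later in this article
    max_tokens=4096,
    betas=[LONG_CONTEXT_BETA],
    messages=[{
        "role": "user",
        "content": f"{big_context}\n\nSummarize the main modules in this codebase.",
    }],
)
print(response.content[0].text)
```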
The company also introduced agent teams in Claude Code as a research preview, allowing multiple AI agents to work simultaneously on segmented portions of a project. Scott White, Anthropic's head of product, compared the feature to coordinating a human team working in parallel.
Anthropic said Opus 4.6 addresses context degradation, a common problem in which AI performance declines as conversations lengthen. On a retrieval benchmark that hides information in large volumes of text, Opus 4.6 scored 76 percent, compared to 18.5 percent for Anthropic's Sonnet 4.5 model.
The model supports outputs of up to 128,000 tokens. Anthropic introduced adaptive thinking, which allows the model to determine when to apply deeper reasoning, and four effort settings that developers can adjust to balance performance, speed, and cost.
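Anthropic's announcement does not spell out how the effort settings are exposed to developers, so the snippet below is purely illustrative: it forwards a hypothetical "effort" field through the SDK's extra_body escape hatch, which passes extra fields straight to the API. Consult Anthropic's documentation for the actual parameter name and accepted values.

```python
# Illustrative only: "effort" is a hypothetical field name, not confirmed by
# this article; extra_body forwards arbitrary extra fields to the API.
import anthropic

client = anthropic.Anthropic()

response = client.messages.create(
    model="claude-opus-4-6",
    max_tokens=2048,
    messages=[{"role": "user", "content": "Refactor this module for clarity: ..."}],
    extra_body={"effort": "medium"},  # one of the four effort settings (name assumed)
)
print(response.content[0].text)
```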
Benchmark Performance
Anthropic reported that Opus 4.6 leads on Terminal-Bench 2.0, an evaluation of AI agents completing command-line tasks, with a 65.4 percent score under maximum-effort settings. The Terminal-Bench project's public leaderboard lists its own entries for Opus 4.6, including a score of 62.9 percent under one configuration.
On GDPval-AA, a benchmark measuring performance on professional tasks across finance, legal, and other domains, Anthropic said Opus 4.6 outperforms OpenAI's GPT-5.2 by approximately 144 Elo points, a gap that corresponds to a roughly 70 percent win rate in direct comparisons. Artificial Analysis, which maintains the GDPval-AA leaderboard, describes the evaluation framework in its methodology documentation.
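For readers unfamiliar with Elo, the standard Elo expectation formula converts a rating gap into an expected head-to-head win rate, and a quick calculation reproduces the roughly 70 percent figure from the reported 144-point gap.

```python
# Standard Elo expectation: P(win) = 1 / (1 + 10^(-gap / 400)).
def elo_win_probability(gap: float) -> float:
    """Expected win rate for the higher-rated side, given a rating gap."""
    return 1.0 / (1.0 + 10.0 ** (-gap / 400.0))

print(f"{elo_win_probability(144):.1%}")  # -> 69.6%, roughly a 70 percent win rate
```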
Anthropic also cited results from BrowseComp, an OpenAI benchmark for browsing agents that measures the ability to locate hard-to-find information across 1,266 questions that require persistent web navigation.
Safety Testing and Cybersecurity Measures
Anthropic said Opus 4.6 underwent extensive safety evaluations, including tests for deception, sycophancy, and cooperation with potential misuse. The company's system card reports the model showed low rates of problematic behaviors while achieving the lowest rate of over-refusals among recent Claude models.
The company developed six cybersecurity probes to detect harmful uses of the model's enhanced capabilities. Anthropic said it is using Opus 4.6 to identify and patch vulnerabilities in open-source software as part of defensive cybersecurity efforts.
"Agents have tremendous potential for positive impacts in work, but it's important that agents continue to be safe, reliable, and trustworthy," the spokesperson said, referring to a framework Anthropic published outlining core principles for agent development.
Product Integrations and Pricing
Anthropic released Claude in PowerPoint as a research preview for paid subscribers, building on existing integrations with Excel. The PowerPoint tool reads layouts, fonts, and slide templates to generate presentations, the company said.
White said Anthropic has observed Claude Code usage expanding beyond software engineers to product managers, financial analysts, and workers in other fields. The company cited deployments at Uber, Salesforce, Accenture, Spotify, and other enterprises.
Opus 4.6 is available on claude.ai and through the Claude API under the identifier claude-opus-4-6. Pricing remains $5 per million input tokens and $25 per million output tokens. Premium pricing of $10 per million input tokens and $37.50 per million output tokens applies when prompts exceed 200,000 tokens using the million-token context window. The model is also available through Amazon Bedrock and Google Cloud Vertex AI.
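As a quick illustration of how the two pricing tiers play out, the sketch below estimates per-request cost at the listed rates; the token counts are hypothetical examples, not figures from Anthropic.

```python
# Cost estimate at the listed rates: $5 / $25 per million input / output tokens
# for standard requests, $10 / $37.50 once the prompt exceeds 200,000 tokens.
def request_cost(input_tokens: int, output_tokens: int) -> float:
    if input_tokens > 200_000:           # long-context premium tier
        in_rate, out_rate = 10.00, 37.50
    else:                                # standard tier
        in_rate, out_rate = 5.00, 25.00
    return (input_tokens * in_rate + output_tokens * out_rate) / 1_000_000

# Hypothetical workloads:
print(f"${request_cost(50_000, 4_000):.2f}")   # standard request     -> $0.35
print(f"${request_cost(800_000, 8_000):.2f}")  # long-context request -> $8.30
```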
The release arrives as OpenAI's GPT-5.3-Codex began rolling out through GitHub Copilot, according to GitHub's changelog. GitHub described GPT-5.3-Codex as OpenAI's latest agentic coding model and outlined availability for Copilot Pro, Business, and Enterprise users.
An Andreessen Horowitz survey from January found that approximately 40 percent of surveyed companies were using Anthropic models in production, up from near zero in early 2024. OpenAI remained the most widely used provider, at approximately 77 percent.
About the Author
John K. Waters is the editor in chief of a number of Converge360.com sites, with a focus on high-end development, AI, and future tech. He's been writing about cutting-edge technologies and the culture of Silicon Valley for more than two decades, and he's written more than a dozen books. He also co-scripted the documentary film Silicon Valley: A 100 Year Renaissance, which aired on PBS. He can be reached at [email protected].