The Claude 4.x family

Three model sizes, one philosophy: same family, different speed/cost/capability sliders. Pick by task. Mix in production.

1. TL;DR

Model	ID	Best at	Cost	Context
Opus 4.7	`claude-opus-4-7`	Hard reasoning, multi-step plans, ambiguous specs.	$$$	200k (1M variant available)
Sonnet 4.6	`claude-sonnet-4-6`	The daily driver. 90% of pro work.	$$	200k
Haiku 4.5	`claude-haiku-4-5-20251001`	High-volume, latency-critical, bounded tasks.	$	200k

If you only remember one thing: default to Sonnet. Reach for Opus when the task is open-ended or ambiguous, and Haiku when you'll make a million calls and each one is small.

2. Claude Opus 4.7

The reasoning flagship. Best at multi-step plans, large refactors, design decisions, and anything where the model needs to think before answering. Extended thinking is supported and recommended for hard problems.

Use when

The task is open-ended ("redesign this auth flow"), ambiguous ("figure out why X is slow"), or has lots of state.
You're refactoring across many files at once.
You need long-horizon agentic behavior (Claude Code on a hard ticket).
The cost of a wrong answer is high (production code, contracts, security review).

Avoid when

The task is bounded and well-defined — Sonnet is just as good for 90% of well-scoped prompts and 5× cheaper.
You're doing high-volume bulk work — use Haiku or batch Sonnet.
Latency matters more than depth — Sonnet is faster.

1M context variant

For huge codebases or massive document workloads, claude-opus-4-7[1m] exposes a 1M-token context window. Use sparingly — token cost is the same per-token but you'll easily ship 500k tokens per request. Pairs perfectly with prompt caching.

3. Claude Sonnet 4.6

The default model for almost everything. Strong reasoning, fast enough for interactive UX, ~5× cheaper than Opus. Most production apps should start here.

Use when

You're building a chat UX — users feel the latency.
The task is well-scoped: "summarize this doc", "answer this question with citations", "rewrite this paragraph".
You want a strong agent for routine coding work.
You're optimizing for cost at moderate volume.

Promote to Opus when

Sonnet visibly struggles on your evals — wrong tool selection, missed edge cases, weak plans.
The task starts to need multi-step thinking and Sonnet's plans are shallow.

4. Claude Haiku 4.5

The cheap, fast model for high-volume tasks. Strong enough for classification, routing, intent detection, autocomplete, simple Q&A. Not the model for complex reasoning.

Use when

You'll make a lot of calls — millions/day.
The task is narrow and bounded.
Latency is critical (autocomplete, live UX).
You're routing or filtering before handing off to Sonnet / Opus.

Common patterns

Triage router — Haiku decides which downstream model to call.
Bulk classification — feed a million records through Haiku + Batch API for ~$0.50/M tokens.
Autocomplete — sub-second response times.
Guardrail check — pre-filter prompts for safety before invoking Opus.

No extended thinking on Haiku. It's a speed-first model. If you find yourself needing thinking, upgrade to Sonnet.

5. How to pick (decision flow)

Is the task open-ended / multi-step / ambiguous?
  └─ Yes → Opus 4.7
  └─ No  → Is latency critical OR are you making >100k calls/day?
            └─ Yes → Haiku 4.5
            └─ No  → Sonnet 4.6

Validate the choice with an eval set of ~50 representative inputs. If Sonnet hits your bar, stay there. If it doesn't, escalate.

6. Mixing models in one product

Most pro deployments use multiple models. Common topology:

Stage	Model	Why
Route the user's intent	Haiku 4.5	Fast, cheap, accurate enough for "is this support / sales / engineering?"
Filter / guardrail	Haiku 4.5	Same — cheap precheck before paying for Sonnet.
Execute the user's request	Sonnet 4.6	Quality + cost sweet spot.
Handle the gnarly edge case	Opus 4.7	Escalation path when Sonnet's confidence is low.
Generate the final polished output	Sonnet 4.6 or Opus 4.7	Quality matters at the end.

Cost-wise, this beats "always Opus" by 5–20×, often with no quality drop.

7. Migration notes (4.5 / 4.6 → 4.7)

Sonnet 4.6 → Opus 4.7 / Sonnet 4.7 is largely a model-string change. The API surface is stable. Things to know:

Extended thinking is more useful in 4.x than it was in 3.x — bump budget_tokens for hard tasks.
Tool selection is more aggressive about parallel calls. If your handler doesn't support concurrency, set disable_parallel_tool_use: true.
Prompt caching works the same; existing cache breakpoints continue to apply.
System prompt sensitivity — 4.x is better at following nuanced instructions. You may be able to trim verbose "always do X" lists.
Re-run your evals. Higher capability sometimes means new failure modes (over-confidence, more aggressive refactoring). Don't ship without a validation sweep.

8. Deprecated / retired models

Anthropic retires older models on a published schedule. If your code still references one of these, plan a migration:

Old	Replace with
`claude-3-opus-*`	`claude-opus-4-7`
`claude-3-5-sonnet-*`	`claude-sonnet-4-6`
`claude-3-haiku-*`	`claude-haiku-4-5-20251001`
`claude-3-5-haiku-*`	`claude-haiku-4-5-20251001`

Check docs.claude.com/about-claude/models for the live retirement schedule.

The Claude 4.x family

On this page

1. TL;DR

2. Claude Opus 4.7

Use when

Avoid when

1M context variant

3. Claude Sonnet 4.6

Use when

Promote to Opus when

4. Claude Haiku 4.5

Use when

Common patterns

5. How to pick (decision flow)

6. Mixing models in one product

7. Migration notes (4.5 / 4.6 → 4.7)

8. Deprecated / retired models