CrossModel
Back to model catalog

Anthropic · Model guide

Claude Haiku 4.5

anthropic/claude-haiku-4-5
Modalities
TextImageText
Context
200K
Max output
64K
Claude Haiku 4.5

The fastest Claude option at near-frontier quality

Context window
200K
tokens
Max output
65,536
tokens
Interfaces
2
OpenAI / Anthropic
Realtime
Low-latency chat and support agents
Coding
IDE assistants, pair programming, test generation
Sub-agents
Parallel subtasks under a Sonnet/Opus orchestrator

Coding quality close to Sonnet 4 at roughly one-third the cost and about 2x the speed; computer use can exceed Sonnet 4.

Overview

Claude Haiku 4.5 is Anthropic's fast, low-latency Claude model released on October 15, 2025. Anthropic positions it as the fastest near-frontier option: coding quality close to Claude Sonnet 4 at roughly one-third the cost and about 2x the speed, with computer-use results that can exceed Sonnet 4.

That makes Haiku 4.5 a strong default for real-time chat, support agents, pair-programming loops, and multi-agent systems where many subtasks run in parallel. It also supports Extended Thinking, so teams can spend a little more reasoning budget on harder steps without moving every request to Sonnet or Opus.

Key capabilities

DimensionDetail
Context window200,000 tokens
Max output65,536 tokens
Input modalitiesText, image
Output modalitiesText
Toolsfunction calling, structured outputs, streaming, computer use
ReasoningExtended Thinking

Prompt caching uses product-level multipliers: cache reads are 0.1x the base input rate, 5-minute writes are 1.25x, and 1-hour writes are 2x. See live pricing in the model catalog.

Benchmarks

Haiku 4.5's story is price-performance: it delivers near-Sonnet coding at the lowest latency in the family.

Software engineering: SWE-bench Verified

Haiku 4.5 SWE-bench Verified comparison

On SWE-bench Verified (n=500), Haiku 4.5 scores 73.3%, slightly ahead of Sonnet 4 at 72.7% and close to GPT-5 Codex at 74.5%. That is unusually strong for a latency-oriented model and makes it attractive for IDE assistants, small fixes, test generation, and code-review triage.

Speed and cost positioning

Speed & Cost

The best balance of speed and unit cost

SWE-bench Verified
73.3%
Sonnet 4: 72.7% GPT-5 Codex: 74.5%
Speed
4-5x
vs Sonnet 4.5
Sonnet 4.5 capability
~90%
agentic coding evals

SWE-bench Verified coding edges out Sonnet 4 while keeping the lowest latency and highest throughput in the Claude family.

Haiku 4.5 runs roughly 4-5x faster than Sonnet 4.5 while retaining about 90% of its agentic coding performance — the basis for Anthropic's recommendation to use it as a parallel worker under a Sonnet or Opus orchestrator. Its safety profile is also favorable: misaligned behavior is significantly lower than Sonnet 4.5 and Opus 4.1, and the model is classified as ASL-2.

When to use it

  • Realtime support and chat agents: low latency with strong language quality.
  • Pair programming and IDE plugins: fast output plus a 73.3% SWE-bench Verified score.
  • High-volume text processing: summarization, classification, extraction, and rewriting.
  • Sub-agents: run many Haiku workers under a Sonnet or Opus orchestrator to reduce total cost.

CrossModel exposes Claude Haiku 4.5 through Anthropic-compatible /v1/messages and OpenAI-compatible /v1/chat/completions. Current pricing is available in the model catalog.