Claude Haiku 4.5

The fastest Claude option at near-frontier quality

Context window

200K

tokens

Max output

65,536

tokens

Interfaces

OpenAI / Anthropic

Realtime

Low-latency chat and support agents

Coding

IDE assistants, pair programming, test generation

Sub-agents

Parallel subtasks under a Sonnet/Opus orchestrator

Coding quality close to Sonnet 4 at roughly one-third the cost and about 2x the speed; computer use can exceed Sonnet 4.

Overview

Claude Haiku 4.5 is Anthropic's fast, low-latency Claude model released on October 15, 2025. Anthropic positions it as the fastest near-frontier option: coding quality close to Claude Sonnet 4 at roughly one-third the cost and about 2x the speed, with computer-use results that can exceed Sonnet 4.

That makes Haiku 4.5 a strong default for real-time chat, support agents, pair-programming loops, and multi-agent systems where many subtasks run in parallel. It also supports Extended Thinking, so teams can spend a little more reasoning budget on harder steps without moving every request to Sonnet or Opus.

Key capabilities

Dimension	Detail
Context window	200,000 tokens
Max output	65,536 tokens
Input modalities	Text, image
Output modalities	Text
Tools	function calling, structured outputs, streaming, computer use
Reasoning	Extended Thinking

Prompt caching uses product-level multipliers: cache reads are 0.1x the base input rate, 5-minute writes are 1.25x, and 1-hour writes are 2x. See live pricing in the model catalog.

Benchmarks

Haiku 4.5's story is price-performance: it delivers near-Sonnet coding at the lowest latency in the family.

Software engineering: SWE-bench Verified

Haiku 4.5 SWE-bench Verified comparison

On SWE-bench Verified (n=500), Haiku 4.5 scores 73.3%, slightly ahead of Sonnet 4 at 72.7% and close to GPT-5 Codex at 74.5%. That is unusually strong for a latency-oriented model and makes it attractive for IDE assistants, small fixes, test generation, and code-review triage.

Speed and cost positioning

Speed & Cost

The best balance of speed and unit cost

SWE-bench Verified

73.3%

Sonnet 4: 72.7% GPT-5 Codex: 74.5%

Speed

4-5x

vs Sonnet 4.5

Sonnet 4.5 capability

~90%

agentic coding evals

SWE-bench Verified coding edges out Sonnet 4 while keeping the lowest latency and highest throughput in the Claude family.

Haiku 4.5 runs roughly 4-5x faster than Sonnet 4.5 while retaining about 90% of its agentic coding performance — the basis for Anthropic's recommendation to use it as a parallel worker under a Sonnet or Opus orchestrator. Its safety profile is also favorable: misaligned behavior is significantly lower than Sonnet 4.5 and Opus 4.1, and the model is classified as ASL-2.

When to use it

Realtime support and chat agents: low latency with strong language quality.
Pair programming and IDE plugins: fast output plus a 73.3% SWE-bench Verified score.
High-volume text processing: summarization, classification, extraction, and rewriting.
Sub-agents: run many Haiku workers under a Sonnet or Opus orchestrator to reduce total cost.

CrossModel exposes Claude Haiku 4.5 through Anthropic-compatible /v1/messages and OpenAI-compatible /v1/chat/completions. Current pricing is available in the model catalog.