GPT-5.4 strengths in a high-throughput small model
A strong fit for parallel subagents and frequent production traffic that needs more than simple classification.
Overview
GPT-5.4 Mini is the high-throughput small model in OpenAI's GPT-5.4 family. OpenAI describes it as its strongest mini model yet for coding, computer use, and subagents: a model that brings much of GPT-5.4's agentic shape to workloads where latency, concurrency, and cost discipline matter.
It keeps a 400,000-token context window, 128,000 output tokens, text and image input, text output, reasoning tokens, structured outputs, function calling, and the newer tool stack. In a multi-model system, Mini works well as the "worker thread" beside a flagship model: inspect files, summarize evidence, make small code changes, classify screenshots, or run focused subtasks in parallel.
Key capabilities
| Dimension | Detail |
|---|---|
| Context window | 400,000 tokens |
| Max output | 128,000 tokens |
| Input modalities | Text, image |
| Output modalities | Text |
| Tools | web search, file search, image generation, code interpreter, hosted shell, apply patch, skills, computer use, MCP, tool search |
CrossModel aligns GPT-5.4 Mini with OpenAI's 400K context window. Current pricing is shown in the live model catalog.
Coding and subagents
Better than Nano for coding, vision, and computer-use subtasks
Mini approaches the main model in several places, but high-risk final judgments still belong with GPT-5.4 or GPT-5.5.
OpenAI reports GPT-5.4 Mini at 54.4% on SWE-Bench Pro Public and 60.0% on Terminal-Bench 2.0. Those scores sit close enough to GPT-5.4 for many supporting engineering tasks, while the model is explicitly designed for faster, more efficient high-volume use.
The best pattern is delegation. Let Mini handle code search, draft patches, test generation, issue triage, and PR pre-review. Keep the final architecture call or risky production change on GPT-5.4, GPT-5.5, or GPT-5.5 Pro.
Vision and medium-long context
A 400K window for medium-long documents and multi-file subtasks
400K is very useful in production; near-1M retrieval and critical conclusions are still better served by flagship models.
Mini is also strong enough for multimodal and computer-use subtasks: 72.1% on OSWorld-Verified, 76.6% on MMMUPro, and 78.0% on MMMUPro with Python. For long context, it reaches 47.7% on MRCR v2 8-needle in the 64K-128K range and 76.3% on Graphwalks BFS 0K-128K.
When to use it
- Parallel subagents: retrieval, summarization, code search, evidence extraction, and draft generation.
- High-throughput coding help: small refactors, tests, lint fixes, issue triage, and PR pre-review.
- Vision workflows: screenshots, document images, UI state classification, and light computer-use tasks.
- Medium-long extraction: multi-document classification and structured output inside a 400K context.
CrossModel exposes GPT-5.4 Mini through an OpenAI-compatible API. Current pricing is available in the model catalog.