GPT-5.4 Nano

The GPT-5.4-class entry model for simple work at scale

Context window

400K

tokens

Max output

128K

tokens

Main advantage

Throughput

low-latency frequent calls

Classification

Intent, risk, topic, and ticket type

Extraction

Structured fields from text, screenshots, and documents

Routing

Judge difficulty before escalating to Mini or a flagship model

Nano is best as a routing, extraction, classification, and light-subagent layer, not as the final judge for hard tasks.

Overview

GPT-5.4 Nano is the GPT-5.4-family model optimized for speed, scale, and simple high-volume decisions. OpenAI positions it for classification, extraction, ranking, routing, and lightweight subagents rather than complex final reasoning.

Nano keeps the same 400,000-token context window and 128,000-token max output as GPT-5.4 Mini, with text and image input, text output, reasoning tokens, structured outputs, and tool support. The point is not to replace a flagship model; it is to make the first layer of a workflow cheap and fast enough that hard cases can be identified before they reach larger models.

Key capabilities

Dimension	Detail
Context window	400,000 tokens
Max output	128,000 tokens
Input modalities	Text, image
Output modalities	Text
Typical role	classification, extraction, ranking, routing, lightweight subagent

CrossModel aligns GPT-5.4 Nano with OpenAI's 400K context window. Current pricing is shown in the live model catalog.

Routing and scale

Routing Layer

Use medium-long context for first-pass filtering and structured output

MRCR 64K-128K

44.2%

8-needle

Graphwalks BFS

73.4%

0K-128K

MMMUPro

66.1%

with Python: 69.5%

Let Nano remove noise, extract fields, and judge difficulty before escalating the smaller hard set.

Nano is strongest when the job can be split into many small, local decisions. It can read a medium-long packet, extract normalized fields, assign labels, rank candidates, or decide whether a request should stay on Nano, move to Mini, or escalate to GPT-5.4 / GPT-5.5.

This makes it a good first pass for support queues, ingestion pipelines, content review, knowledge-base routing, and analytics preprocessing. The workflow should include clear validation: if the output violates a schema, lacks confidence, or hits a high-risk category, escalate instead of asking Nano to reason harder.

Benchmarks and boundaries

Benchmarks

Keeps reasoning and vision depth, but computer use should not be overestimated

GPQA Diamond

82.8%

GPT-5 mini: 81.6%

HLE with tools

37.7%

GPT-5 mini: 31.6%

OSWorld-Verified

39.0%

Mini: 72.1%

Nano is valuable for simple steps at scale; complex screen operation should go to Mini or the main model.

OpenAI's numbers show that Nano still carries real GPT-5.4-family capability: 82.8% on GPQA Diamond, 37.7% on Humanity's Last Exam with tools, 66.1% on MMMUPro, and 69.5% on MMMUPro with Python. But it reaches only 39.0% on OSWorld-Verified, far below Mini's 72.1%, so complex computer-use workflows should be routed upward.

When to use it

Large-scale classification: support intent, risk category, document type, topic, and moderation labels.
Structured extraction: turn documents, screenshots, emails, and messages into JSON fields.
Ranking and routing: choose the next model or workflow before spending flagship-model tokens.
Lightweight subagents: run many simple steps in parallel, then merge or escalate the small hard set.

CrossModel exposes GPT-5.4 Nano through an OpenAI-compatible API. Current pricing is available in the model catalog.