GPT-5.4 · Model guide

GPT-5.4

A mainline model for professional work, coding, and computer use

Context window

1.05M

tokens

Max output

128K

tokens

reasoning effort

none-xhigh

configurable depth

Professional work

Spreadsheets, documents, decks, finance, and legal analysis

Computer use

Browsers, desktops, screenshots, and cross-app workflows

Tool ecosystems

Tool search, MCP, files, search, and code execution

Before GPT-5.5, GPT-5.4 was the main frontier model for completing real work across tools.

Overview

GPT-5.4 is OpenAI's March 5, 2026 frontier model for professional work. It brought reasoning, coding, computer use, tool calling, and document-style deliverables into one mainline model, and it remains the more affordable GPT-5.4-class choice when you want strong agentic capability without moving all the way to GPT-5.5.

OpenAI described GPT-5.4 as its first general-purpose model with native, state-of-the-art computer-use capability. In CrossModel it is configured with a 1,050,000-token context window, 128,000 output tokens, text and image input, text output, and configurable reasoning effort from none through xhigh.

Key capabilities

Dimension	Detail
Context window	1,050,000 tokens
Max output	128,000 tokens
Input modalities	Text, image
Output modalities	Text
Tools	web search, file search, image generation, code interpreter, hosted shell, apply patch, skills, computer use, MCP, tool search

Inputs above 272K tokens enter OpenAI's long-context multiplier tier for GPT-5.4-class flagship requests. Current pricing is shown in the live model catalog.

Professional work

Professional Work

More complete spreadsheets, documents, decks, and business analysis

GDPval

83.0%

wins or ties · GPT-5.2: 70.9%

Banking modeling

87.3%

GPT-5.2: 68.4%

OfficeQA

68.1%

GPT-5.2: 63.1%

GDPval covers real knowledge-work outputs across 44 occupations; the banking eval focuses on complex spreadsheet quality.

GPT-5.4's biggest shift was not a single benchmark jump; it was better completion of real work products. OpenAI reports 83.0% wins or ties on GDPval, up from GPT-5.2 at 70.9%, and 87.3% on an internal investment-banking modeling benchmark, versus GPT-5.2 at 68.4%. Those tasks resemble the work users actually hand to agents: spreadsheets, presentations, schedules, diagrams, and reports.

This makes GPT-5.4 a good fit when the expected output is more than prose. It can plan, use tools, inspect artifacts, and keep revising until the deliverable is closer to something a person can review.

Computer use, tools, and coding

Computer Use & Tools

From screen use to tool lookup, agents behave more like real software operators

OSWorld-Verified

75.0%

human baseline: 72.4%

SWE-Bench Pro

57.7%

Public

MCP Atlas tokens

-47%

with tool search

BrowseComp

82.7%

agentic web search

OpenAI also introduced tool search so large tool sets can be retrieved on demand instead of pasted into every prompt.

On OSWorld-Verified, GPT-5.4 reaches 75.0%, above OpenAI's published human baseline of 72.4%. It also reaches 57.7% on SWE-Bench Pro Public, 82.7% on BrowseComp, and 54.6% on Toolathlon. Tool search is especially important for enterprise agents: in MCP Atlas, OpenAI reports a 47% reduction in total token usage while preserving accuracy when large tool sets are placed behind search.

When to use it

Professional deliverables: spreadsheets, reports, slides, legal analysis, and structured business work.
Computer-use agents: browser and desktop tasks that require screenshots, forms, and cross-app coordination.
Large tool ecosystems: MCP servers, internal tools, search, file retrieval, and code execution in the same workflow.
Long-context development: plan, patch, run, and verify across a 1M-scale context window.

CrossModel exposes GPT-5.4 through an OpenAI-compatible API. Current pricing is available in the model catalog.