A mainline model for professional work, coding, and computer use
Before GPT-5.5, GPT-5.4 was the main frontier model for completing real work across tools.
Overview
GPT-5.4 is OpenAI's March 5, 2026 frontier model for professional work. It brought reasoning, coding, computer use, tool calling, and document-style deliverables into one mainline model, and it remains the more affordable GPT-5.4-class choice when you want strong agentic capability without moving all the way to GPT-5.5.
OpenAI described GPT-5.4 as its first general-purpose model with native, state-of-the-art computer-use capability. In CrossModel it is configured with a 1,050,000-token context window, 128,000 output tokens, text and image input, text output, and configurable reasoning effort from none through xhigh.
Key capabilities
| Dimension | Detail |
|---|---|
| Context window | 1,050,000 tokens |
| Max output | 128,000 tokens |
| Input modalities | Text, image |
| Output modalities | Text |
| Tools | web search, file search, image generation, code interpreter, hosted shell, apply patch, skills, computer use, MCP, tool search |
Inputs above 272K tokens enter OpenAI's long-context multiplier tier for GPT-5.4-class flagship requests. Current pricing is shown in the live model catalog.
Professional work
More complete spreadsheets, documents, decks, and business analysis
GDPval covers real knowledge-work outputs across 44 occupations; the banking eval focuses on complex spreadsheet quality.
GPT-5.4's biggest shift was not a single benchmark jump; it was better completion of real work products. OpenAI reports 83.0% wins or ties on GDPval, up from GPT-5.2 at 70.9%, and 87.3% on an internal investment-banking modeling benchmark, versus GPT-5.2 at 68.4%. Those tasks resemble the work users actually hand to agents: spreadsheets, presentations, schedules, diagrams, and reports.
This makes GPT-5.4 a good fit when the expected output is more than prose. It can plan, use tools, inspect artifacts, and keep revising until the deliverable is closer to something a person can review.
Computer use, tools, and coding
From screen use to tool lookup, agents behave more like real software operators
OpenAI also introduced tool search so large tool sets can be retrieved on demand instead of pasted into every prompt.
On OSWorld-Verified, GPT-5.4 reaches 75.0%, above OpenAI's published human baseline of 72.4%. It also reaches 57.7% on SWE-Bench Pro Public, 82.7% on BrowseComp, and 54.6% on Toolathlon. Tool search is especially important for enterprise agents: in MCP Atlas, OpenAI reports a 47% reduction in total token usage while preserving accuracy when large tool sets are placed behind search.
When to use it
- Professional deliverables: spreadsheets, reports, slides, legal analysis, and structured business work.
- Computer-use agents: browser and desktop tasks that require screenshots, forms, and cross-app coordination.
- Large tool ecosystems: MCP servers, internal tools, search, file retrieval, and code execution in the same workflow.
- Long-context development: plan, patch, run, and verify across a 1M-scale context window.
CrossModel exposes GPT-5.4 through an OpenAI-compatible API. Current pricing is available in the model catalog.