A full-stack coding and agent model built for real productivity
Highspeed shares the same capability and context as the standard tier; the difference is output speed (about 60 → 100 tps).
Overview
MiniMax M2.5 is a MiniMax M2-series model released on February 12, 2026 under the theme "Built for Real-World Productivity." It targets full-stack development, agentic tool use, deep search, office automation, and financial modeling, while keeping OpenAI / Anthropic compatibility and raising the context window to 204,800 tokens.
MiniMax documents MiniMax-M2.5-highspeed and MiniMax-M2.5 as same-capability variants: identical context window, with highspeed described as "same performance, faster and more agile." Output speed is listed around 60 tps for the regular model and 100 tps for highspeed, so CrossModel uses one guide for both while model IDs and pricing stay separate.
Key capabilities
| Dimension | Detail |
|---|---|
| Context window | 204,800 tokens |
| Max output | 2,048 tokens |
| Input modalities | Text |
| Output modalities | Text |
| Tools | streaming, tool use, interleaved thinking, OpenAI / Anthropic compatible API |
M2.5 tends to plan architecture and specification before implementation — a learned operating style, not an extra prompt requirement. See live pricing in the model catalog.
Engineering and search benchmarks
MiniMax M2.5's evaluation axis is real engineering and tool-driven search rather than short snippets. It reports SWE-Bench Verified 80.2%, Multi-SWE-Bench 51.3%, and BrowseComp 76.3% with context management, and covers 13+ programming languages including Go, C, C++, TypeScript, Rust, Kotlin, Python, Java, JavaScript, PHP, Lua, Dart, and Ruby.
Core scores for coding, multilingual fixes, and deep search
Figures from the official announcement; BrowseComp uses the context-management setting.
The point of these numbers is breadth: M2.5 trains multi-language, cross-platform, and tool calling together, so Web, Android, iOS, and Windows are all in scope. It favors emitting an architecture plan and spec before writing code, which tends to be more stable on complex projects than a direct patch.
Speed and the highspeed tier
Faster end-to-end execution than M2.1
Highspeed suits interactive coding assistants, IDE plugins, and high-concurrency sub-agents.
Efficiency is a focus. MiniMax says M2.5 is 37% faster than M2.1 on SWE-Bench Verified, cutting average end-to-end time from 31.3 minutes to 22.8 minutes, close to Claude Opus 4.6 at 22.9 minutes. It uses a more efficient reasoning path and parallel tool calls to reduce wasted turns and tokens. The highspeed tier (MiniMax-M2.5-highspeed, ~100 tps vs ~60 tps) is useful for interactive coding assistants, IDE plugins, and high-concurrency sub-agent workloads.
When to use it
- Complex full-stack development: Web, Android, iOS, Windows, and cross-platform projects.
- Large-codebase edits: refactors, multi-file repairs, tests, and review.
- Deep search and tool use: repeated retrieval, verification, and external tool calls.
- Office and finance workflows: spreadsheets, financial models, and business processes.
- Interactive high-speed workloads: choose
MiniMax-M2.5-highspeedwhen output speed matters.
CrossModel exposes MiniMax M2.5 and MiniMax M2.5 Highspeed through OpenAI-compatible /v1/chat/completions and Anthropic-compatible /v1/messages. Current pricing is available in the model catalog.