CrossModel
Back to model catalog

Z.ai · Model guide

GLM-5.2

z-ai/glm-5.2
Modalities
TextText
Context
1M
Max output
128K
GLM-5.2 · Zhipu AI

A 1M-context flagship built for long-horizon tasks

Parameters
753B
~40B active MoE
Context window
1M
1,000,000 tokens (GLM-5.1 was 200K)
SWE-bench Pro
62.1
beats GPT-5.5 (58.6)
IndexShare
2.9×
FLOPs cut at 1M context
Long-horizon engineering
SWE-bench Pro / Terminal Bench / FrontierSWE
Agents & tool use
MCP-Atlas, Tool-Decathlon
Reasoning & math
AIME 2026, GPQA-Diamond

IndexShare shares one indexer across every four sparse-attention layers, cutting per-token FLOPs by 2.9× at 1M context.

Overview

GLM-5.2 is Zhipu AI's flagship model for long-horizon tasks, released on June 17, 2026 under the MIT license. It keeps the GLM-5 MoE lineage — 753B total parameters with ~40B active — but makes a step change in context: GLM-5.2 stably sustains a 1M-token working context, up from GLM-5.1's 200K.

The headline architectural change is IndexShare: every four sparse-attention layers share a single lightweight indexer placed on the first layer, and the top-k indices are reused across the other three. Combined with KVShare and a refined MTP layer, this cuts per-token FLOPs by 2.9× at 1M context and improves speculative-decoding acceptance length by up to 20%, so the long window stays affordable to serve.

Key capabilities

DimensionDetail
Context window1,000,000 tokens (1M)
Max output128,000 tokens
Input modalitiesText
Output modalitiesText
Toolsstreaming, JSON output, tool calls, High / Max effort levels

GLM-5.2 exposes High and Max thinking effort levels, letting you trade model capability against latency and compute cost per request. Max spends more internal reasoning for the hardest engineering and math work; High is the faster default for interactive coding. See live pricing in the model catalog.

Benchmarks

GLM-5.2's evaluation axis is long-horizon engineering: resolving real repository issues and driving terminals over many steps, not single-turn prompts.

Coding & Terminal

Closing in on closed-source flagships on long-horizon coding

SWE-bench Pro
62.1
GLM-5.1 58.4 · GPT-5.5 58.6
FrontierSWE
74.4
Opus 4.8 75.1 · GPT-5.5 72.6
Terminal Bench 2.1
81.0
Terminus-2, GLM-5.1 63.5
ProgramBench
63.7
GLM-5.1 50.9

Numbers from the official launch blog; detail lines show comparison models.

On SWE-bench Pro, GLM-5.2 scores 62.1, ahead of GPT-5.5 (58.6) and its own predecessor GLM-5.1 (58.4). On FrontierSWE it reaches 74.4, edging past GPT-5.5 (72.6) and finishing in a near-tie with Claude Opus 4.8 (75.1). Terminal Bench 2.1 (Terminus-2) climbs to 81.0, a large jump from GLM-5.1's 63.5, and ProgramBench rises to 63.7 from 50.9 — the clearest signal that the gains are about sustained, tool-driven execution rather than one-shot code.

Agents, tools, and reasoning

Agentic & Reasoning

Tool use and math reasoning rise together

MCP-Atlas
76.8
Public Set, GPT-5.5 75.3
Tool-Decathlon
48.2
GLM-5.1 40.7
AIME 2026
99.2
GPT-5.5 98.3
GPQA-Diamond
91.2
GLM-5.1 86.2

Numbers from the official launch blog; detail lines show comparison models.

On the MCP-Atlas tool-usage public set, GLM-5.2 scores 76.8, ahead of GPT-5.5 (75.3) and just behind Claude Opus 4.8 (77.8); Tool-Decathlon improves to 48.2 from GLM-5.1's 40.7. Reasoning rises in lockstep: 99.2 on AIME 2026 and 91.2 on GPQA-Diamond, both well above GLM-5.1 (95.3 / 86.2). The pattern across coding, agents, and math is consistent — GLM-5.2 narrows the gap to the leading closed models while remaining open-weight.

When to use it

  • Million-token codebases: whole-repo reading, cross-file refactors, and migrations that overflow a 200K window.
  • Long-horizon agents: multi-step tool chains where MCP-Atlas / Tool-Decathlon stability matters more than single-turn quality.
  • Hard reasoning and math: competition-level problems and research-style analysis where Max effort pays off.
  • Open-weight deployment: teams that need MIT-licensed weights they can self-host and fine-tune.

CrossModel exposes GLM-5.2 through an OpenAI-compatible /v1/chat/completions API. Current pricing is available in the model catalog.