Z.ai's GLM-5.2 Matches US Frontier AI at One-Sixth the API Cost

Beijing-based Z.ai released GLM-5.2 on June 16, 2026. The open-weight large language model performs on par with leading closed US models, independent evaluators say, at roughly one-sixth their API cost.

Architecture

GLM-5.2 uses a sparse mixture-of-experts (MoE) design with 753 billion total parameters, of which approximately 40 billion are active per token. A new “IndexShare” sparse-attention mechanism keeps inference costs manageable at the model’s full one-million-token context window, a fourfold increase over its predecessor, GLM-5.1. The weights are available under the MIT license on Hugging Face and through llama.cpp and Unsloth; the model is also offered via Z.ai’s API on its GLM Coding Plan.
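To make the gap between total and active parameters concrete, the short Python sketch below works through the figures quoted above and shows, in toy form, how a sparse MoE router runs only a few experts per token. It is an illustration under stated assumptions, not Z.ai's implementation: the expert count, the router scores, and the choice of two experts per token are hypothetical, since the article does not give them, and the GLM-5.1 context figure is simply what a fourfold increase to one million tokens would imply.

```python
# Figures quoted in the article.
TOTAL_PARAMS = 753e9          # total parameters
ACTIVE_PARAMS = 40e9          # approximate active parameters per token
CONTEXT_TOKENS = 1_000_000    # full context window
CONTEXT_GROWTH = 4            # "fourfold increase" over GLM-5.1


def active_fraction(total: float, active: float) -> float:
    """Share of the model's parameters used in one token's forward pass."""
    return active / total


def top_k_experts(router_scores: list[float], k: int) -> list[int]:
    """Toy router: indices of the k highest-scoring experts, best first."""
    ranked = sorted(
        range(len(router_scores)),
        key=lambda i: router_scores[i],
        reverse=True,
    )
    return ranked[:k]


fraction = active_fraction(TOTAL_PARAMS, ACTIVE_PARAMS)

# Implied by the fourfold claim; not a figure stated in the article.
implied_previous_context = CONTEXT_TOKENS // CONTEXT_GROWTH

# Hypothetical router scores for 8 experts; only 2 run for this token.
scores = [0.10, 0.70, 0.05, 0.30, 0.90, 0.20, 0.15, 0.40]
chosen = top_k_experts(scores, k=2)

print(f"active fraction: {fraction:.1%}")
print(f"implied GLM-5.1 context: {implied_previous_context:,} tokens")
print(f"experts selected for this token: {chosen}")
```

On the quoted figures, roughly 5 percent of the model's parameters take part in any single token, which is why per-token compute tracks the 40 billion active parameters rather than the 753 billion total.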

Performance

Multiple independent evaluations place GLM-5.2 directly alongside OpenAI’s GPT-5.5 and Anthropic’s Claude Opus 4.8 on coding and long-horizon agentic tasks. The model ranks second globally on Code Arena. Artificial Analysis, which tracks frontier model performance, placed it between GPT-5.5 and Opus 4.8 on its agentic knowledge-work evaluation.

Fast.ai co-founder Jeremy Howard described it as “at least as good as Opus 4.8 and GPT 5.5” for his workloads. David Sacks, former White House AI and cryptocurrency policy adviser, called it “just a tick below Opus 4.8 and right up there with GPT 5.5.”

Context

Z.ai, formerly Zhipu AI, was founded in 2019 by Tsinghua University researchers. It became the first major Chinese large language model company to complete an IPO when it listed on the Hong Kong Stock Exchange in January 2026. The combination of MIT licensing, frontier-adjacent performance, and developer-friendly pricing has pushed GLM-5.2 above Anthropic in API traffic share on the OpenRouter developer platform, where it climbed rapidly through the rankings in the weeks after its release.