Key Terms
- GLM Coding Plan - Zhipu AI's subscription service for using GLM models in coding agents. Supports Claude Code, Cline, OpenCode, Roo Code, Kilo Code, Cursor, Crush, and Goose. Uses a dedicated API endpoint separate from the general API. Source: Bigmodel – Overview
- GLM-5.1 - Zhipu AI's latest flagship model (April 2026). Claims to align with Claude Opus 4.6 in coding ability. 200K context, 128K max output. Supports thinking mode, function calling, context caching, structured output, and MCP. Source: Bigmodel – Glm 5.1
- Long-horizon tasks - GLM-5.1's headline capability: the model can work autonomously for up to 8 hours in a single task, performing planning, execution, testing, and iteration. Source: Bigmodel – Glm 5.1
- Peak multiplier - GLM-5.1 and GLM-5-Turbo consume 3x quota during peak hours (14:00-18:00 UTC+8) and 2x during off-peak. As a promotional offer valid through end of June 2026, off-peak usage counts as 1x. Source: Bigmodel – Overview
- MCP servers - GLM Coding Plan includes exclusive MCP servers for vision understanding, web search, web page reading, and open-source repository access. Source: Bigmodel – Overview
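The peak-multiplier rules above can be captured in a short sketch. This is an illustration of the documented schedule only, assuming the multiplier is selected purely by local wall-clock time in UTC+8 and that the promotional 1x off-peak rate applies uniformly; the actual metering logic is not published.

```python
from datetime import datetime, timezone, timedelta

UTC8 = timezone(timedelta(hours=8))

def quota_multiplier(ts: datetime, promo_active: bool = True) -> int:
    """Quota multiplier for GLM-5.1 / GLM-5-Turbo usage.

    Peak hours (14:00-18:00 UTC+8) consume 3x quota; off-peak normally
    consumes 2x, but a promotion through end of June 2026 prices
    off-peak usage at 1x.
    """
    local = ts.astimezone(UTC8)
    if 14 <= local.hour < 18:
        return 3  # peak window
    return 1 if promo_active else 2  # off-peak

# A prompt at 15:30 UTC+8 is peak; one at 09:00 UTC+8 is off-peak.
peak = quota_multiplier(datetime(2026, 5, 1, 15, 30, tzinfo=UTC8))
off_peak = quota_multiplier(datetime(2026, 5, 1, 9, 0, tzinfo=UTC8))
print(peak, off_peak)  # 3 1
```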
Latest Changes
First report for this supplier; all models, plans, and pricing below reflect the current state.
- New model: GLM-5.1 launched April 8-9. 744B parameters (40B active), MoE. Claims alignment with Claude Opus 4.6.
- Plan change: GLM Coding Plan prices doubled April 14. Lite: $18/mo, Pro: $72/mo, Max: $160/mo.
- Plan change: Legacy subscription plans being phased out in favor of new Lite/Pro/Max tier structure (April 23).
- Feature added: GLM-5.1 supports up to 8 hours of continuous autonomous work in a single task.
- Feature added: Promotional 1x off-peak multiplier for GLM-5.1 through end of June 2026.
- Plan change: Plans placed under "short-term sales restriction," with limited daily inventory released at 10:00 UTC+8.
Plans
GLM Coding Plan (Subscription)
| Plan | 5-Hour Limit | Weekly Limit | MCP Calls/mo | Recommended Projects | Price (monthly) | Price (quarterly, 10% off) |
|---|---|---|---|---|---|---|
| Lite | ~80 prompts | ~400 prompts | 100 | 1 project | $18/mo | $16.20/mo ($48.60/quarter) |
| Pro | ~400 prompts | ~2,000 prompts | 1,000 | 1-2 projects | $72/mo | $64.80/mo ($194.40/quarter) |
| Max | ~1,600 prompts | ~8,000 prompts | 4,000 | 2+ projects | $160/mo | $144/mo ($432/quarter) |
Each prompt triggers approximately 15-20 model calls. Monthly value is claimed to be 15-30x the subscription cost at API rates. Plans are currently in short-term sales restriction mode: limited inventory released daily at 10:00 UTC+8. Renewals and upgrades are not affected.
Available models: GLM-5.1, GLM-5-Turbo, GLM-4.7, GLM-4.5-Air. GLM-5.1 is recommended for complex tasks; GLM-4.7 for routine work to conserve quota.
Source: Bigmodel – Overview, Z – Subscribe
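Since each prompt fans out into roughly 15-20 model calls, the tier limits imply a much larger model-call budget than the prompt counts suggest. A rough back-of-envelope sketch, using the approximate 5-hour prompt limits from the table (actual consumption varies with task complexity):

```python
def plan_budget(prompts_per_5h: int, calls_per_prompt=(15, 20)):
    """Model-call range implied by a tier's approximate 5-hour
    prompt limit, at 15-20 model calls per prompt."""
    lo, hi = calls_per_prompt
    return prompts_per_5h * lo, prompts_per_5h * hi

plans = {"Lite": 80, "Pro": 400, "Max": 1600}
for name, prompts in plans.items():
    lo, hi = plan_budget(prompts)
    print(f"{name}: ~{prompts} prompts per 5h ~= {lo:,}-{hi:,} model calls")
# Lite: ~80 prompts per 5h ~= 1,200-1,600 model calls
# Pro: ~400 prompts per 5h ~= 6,000-8,000 model calls
# Max: ~1,600 prompts per 5h ~= 24,000-32,000 model calls
```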
API (Pay-as-you-go)
The general API at open.bigmodel.cn is separate from the Coding Plan. Free models available: GLM-4.7-Flash, GLM-4.5-Flash (no token cost). All prices below are per 1M tokens.
Source: Z – Overview
z.ai Chat (Consumer)
Free web-based chatbot at z.ai powered by GLM-5.1 and GLM-5. No subscription required.
Source: Z
API Pricing
Text Models ($/MTok)
| Model | Input | Cached Input | Cached Input Storage | Output |
|---|---|---|---|---|
| GLM-5.1 | $1.40 | $0.26 | Limited-time Free | $4.40 |
| GLM-5 | $1.00 | $0.20 | Limited-time Free | $3.20 |
| GLM-5-Turbo | $1.20 | $0.24 | Limited-time Free | $4.00 |
| GLM-4.7 | $0.60 | $0.11 | Limited-time Free | $2.20 |
| GLM-4.7-FlashX | $0.07 | $0.01 | Limited-time Free | $0.40 |
| GLM-4.6 | $0.60 | $0.11 | Limited-time Free | $2.20 |
| GLM-4.5 | $0.60 | $0.11 | Limited-time Free | $2.20 |
| GLM-4.5-X | $2.20 | $0.45 | Limited-time Free | $8.90 |
| GLM-4.5-Air | $0.20 | $0.03 | Limited-time Free | $1.10 |
| GLM-4.5-AirX | $1.10 | $0.22 | Limited-time Free | $4.50 |
| GLM-4-32B-0414-128K | $0.10 | - | - | $0.10 |
| GLM-4.7-Flash | Free | Free | Free | Free |
| GLM-4.5-Flash | Free | Free | Free | Free |
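A minimal cost calculator over the rates above, assuming the standard split billing model (cached input tokens billed at the cached-input rate, the remainder at the full input rate); only a few models from the table are included for brevity:

```python
# Per-1M-token rates from the table above (USD).
RATES = {
    "GLM-5.1":     {"input": 1.40, "cached": 0.26, "output": 4.40},
    "GLM-4.7":     {"input": 0.60, "cached": 0.11, "output": 2.20},
    "GLM-4.5-Air": {"input": 0.20, "cached": 0.03, "output": 1.10},
}

def request_cost(model, input_toks, output_toks, cached_toks=0):
    """USD cost of one request; cached tokens are billed at the
    cached-input rate instead of the full input rate."""
    r = RATES[model]
    fresh = input_toks - cached_toks
    return (fresh * r["input"] + cached_toks * r["cached"]
            + output_toks * r["output"]) / 1_000_000

# 50K-token prompt (30K served from cache) with an 8K-token
# completion on GLM-5.1:
print(f"${request_cost('GLM-5.1', 50_000, 8_000, cached_toks=30_000):.4f}")
# $0.0710
```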
Vision Models ($/MTok)
| Model | Input | Cached Input | Output |
|---|---|---|---|
| GLM-5V-Turbo | $1.20 | $0.24 | $4.00 |
| GLM-4.6V | $0.30 | $0.05 | $0.90 |
| GLM-OCR | $0.03 | - | $0.03 |
| GLM-4.6V-FlashX | $0.04 | $0.004 | $0.40 |
| GLM-4.5V | $0.60 | $0.11 | $1.80 |
| GLM-4.6V-Flash | Free | Free | Free |
Built-in Tools
| Tool | Cost |
|---|---|
| Web Search | $0.01/use |
Image Generation (per image)
| Model | Price |
|---|---|
| GLM-Image | $0.015 |
| CogView-4 | $0.01 |
Video Generation (per video)
| Model | Price |
|---|---|
| CogVideoX-3 | $0.20 |
| ViduQ1-Text | $0.40 |
| ViduQ1-Image | $0.40 |
Source: Z – Overview
Model Performance / Benchmarks
| Benchmark | GLM-5.1 | GPT-5.4 | Claude Opus 4.6 | Gemini 3.1 Pro |
|---|---|---|---|---|
| SWE-Bench Pro | 58.4 | 57.7 | 53.4 | 54.2 |
Additional GLM-5.1 capabilities:
- 8-hour continuous autonomous work (building a complete Linux desktop, 655-round vector database optimization to 6.9x throughput)
- KernelBench Level 3: 3.6x geometric mean speedup over torch.compile max-autotune
- 200K context window, 128K max output
Source: Bigmodel – Glm 5.1
Latest News
GLM-5.1 Launch (April 8-9, 2026)
Zhipu AI released GLM-5.1, the latest flagship model with significant coding and long-horizon task improvements:
- SWE-Bench Pro: 58.4 (claimed above GPT-5.4 at 57.7, Opus 4.6 at 53.4, Gemini 3.1 Pro at 54.2)
- Claims alignment with Claude Opus 4.6 in comprehensive and coding capabilities
- Long-horizon task capability: up to 8 hours of continuous autonomous work in a single task
- 200K context window, 128K max output
- Demonstrated building a complete Linux desktop system in 8 hours
- Demonstrated 655-round iteration optimizing a vector database to 6.9x throughput
- KernelBench Level 3: 3.6x geometric mean speedup over torch.compile max-autotune
- 744B parameters (40B active), MoE architecture (carried from GLM-5)
- Supports thinking mode, function calling, context caching, structured output, MCP
- HN: 618 points, 263 comments
Source: Bigmodel – Glm 5.1, News – From
GLM Coding Plan Price Increase (April 14, 2026)
Zhipu AI doubled the GLM Coding Plan prices. HN: 18 points, 6 comments. This followed the GLM-5.1 launch and the promotional 1x multiplier for GLM-5.1 usage (off-peak) valid through end of June 2026.
Source: News – From
Legacy Plan Migration (April 23, 2026)
Zhipu AI began phasing out original subscription plans in favor of the new Lite/Pro/Max tier structure. Users on legacy plans are being migrated. HN: 4 points.
Source: News – From
GLM-5 Turbo for OpenClaw (March 2026)
GLM-5-Turbo was released, optimized for the OpenClaw persistent-agent scenario, with improved execution continuity on complex long-running tasks.
Source: Bigmodel – Model Overview
Community Signals
GLM-5 Launch (January 2026, ongoing relevance)
HN: 484 points, 520 comments (largest Zhipu thread). GLM-5 was positioned as an open-source SOTA model with coding capabilities aligned to Claude Opus 4.5. Community noted the 744B MoE architecture and open-source availability.
Source: News – From
GLM Coding Plan Adoption
- GLM Coding Plan supports Claude Code, Cursor, Cline, OpenCode, Roo Code, Kilo Code, and other tools
- Community documentation shows step-by-step guides for using GLM models with Claude Code via Anthropic-compatible proxy API
- Zhipu provides an `npx @z_ai/coding-helper` tool for automatic configuration
- HN: "GLM 4.5 with Claude Code" thread (213 points, 84 comments) showed early adoption interest
Source: News – From
Price Sensitivity
- "Z.ai doubles it's coding plan prices" (HN, 18 points): community reacted negatively to the price increase following the GLM-5.1 launch
- The promotional 1x multiplier for GLM-5.1 (off-peak, through June 2026) suggests Zhipu is trying to drive adoption of the new model while managing compute costs
Sales Restriction
The platform implemented daily inventory limits for new subscriptions due to "user volume surge exceeding expectations." This suggests either genuine demand or capacity constraints.
Source: Bigmodel – Overview
Enterprise Readiness
| Feature | Available? | Details |
|---|---|---|
| SSO (SAML) | No | Not mentioned. GLM Coding Plan uses API keys. |
| SSO (OIDC) | No | Not mentioned. |
| SCIM | No | Not mentioned. |
| Audit logs | No | Not mentioned. |
| IP indemnity | No | Not mentioned. |
| Data residency | No | Not mentioned. API endpoints are China-focused. |
| HIPAA | No | Not mentioned. |
| Air-gapped / on-prem | No | Not available. |
| SLA | No | No published SLA. |
| Admin controls (RBAC) | No | No admin controls documented. Plans are single-user. |
Transparency Gaps
| Gap | Details | Severity |
|---|---|---|
| Prompt count is approximate | Plan limits use "approximately X prompts" where each prompt triggers 15-20 model calls. Actual consumption depends on task complexity, making cost estimation unreliable. | Medium |
| Peak multiplier creates uncertainty | GLM-5.1 consumes 3x quota during peak hours (14:00-18:00 UTC+8) and 2x off-peak. The promotional 1x rate (through June) will increase effective costs when it expires. | Medium |
| Dynamic concurrency limits | Concurrency is "dynamically adjusted" based on resource availability. Max users get priority, but specific concurrency numbers are not published. | Medium |
| Usage restrictions enforced by risk control | The platform monitors for "improper use" including account sharing and use in non-approved tools. Violations trigger throttling, freezing, or banning. The detection methodology is not disclosed. | Low |
| No batch pricing published | Batch API is mentioned as available but pricing is not documented. | Low |
| Model parameter count not disclosed for GLM-5.1 | GLM-5 is documented as 744B (40B active), but GLM-5.1's architecture is not specified. | Low |
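The peak-multiplier gap can be quantified. A sketch projecting quota consumption before and after the promotional 1x off-peak rate expires, for an assumed illustrative workload mix (the 100/300 peak/off-peak split is hypothetical):

```python
def effective_quota(prompts_peak, prompts_offpeak, promo=True):
    """Quota units consumed under the published multipliers:
    3x peak; off-peak is 1x during the promo, 2x after it ends."""
    offpeak_mult = 1 if promo else 2
    return prompts_peak * 3 + prompts_offpeak * offpeak_mult

# Hypothetical workload: 100 peak + 300 off-peak prompts.
now = effective_quota(100, 300)                  # 300 + 300 = 600
later = effective_quota(100, 300, promo=False)   # 300 + 600 = 900
print(now, later, f"+{later / now - 1:.0%}")  # 600 900 +50%
```

For an off-peak-heavy workload like this one, the promo's expiry alone raises effective quota consumption by half, before any plan price changes.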