Key Terms
- Grok 4.20 - xAI's current flagship model, used for both chat and coding. It features reasoning, function calling, structured outputs, and a 2M-token context window. Source: xAI – Grok 4.20 Reasoning
- Higher context pricing - requests exceeding 200K tokens are billed at 2x standard rates: $4.00 input / $12.00 output per MTok. Source: xAI – Grok 4.20 Reasoning
- Server-side tools - xAI-provided tools (web search, X search, code execution, file search) that can be invoked by the model autonomously. Priced per invocation in addition to token costs. Source: xAI – Models
- Batch API - asynchronous processing at 50% of standard token pricing, with results typically returned within 24 hours. Source: xAI – Models
- Cached prompt tokens - prompt caching is enabled automatically for all requests. Cached tokens cost $0.20/MTok, one tenth of the standard $2.00/MTok input rate. Source: xAI – Grok 4.20 Reasoning
Latest Changes
First report for this supplier. All models, plans, and pricing are listed as current state.
- New model: Grok 4.20 launched as the sole text model for both chat and coding. Previous models (Grok 3, Grok Code Fast 1) consolidated into Grok 4.20.
- Feature added: Per-invocation pricing for server-side tools: $5/1k calls for web search, X search, code execution.
- Feature added: Storage pricing for files and collections ($0.025-$0.10/GiB/day), effective April 20, 2026.
- Price change: Requests exceeding 200K context charged at 2x standard rate ($4/$12 per MTok).
Plans
xAI does not appear to offer subscription-based plans for coding agents. Access is API-only via console.x.ai with pay-as-you-go billing.
| Plan | Price | Usage | Key Inclusions |
|---|---|---|---|
| API access | Pay-as-you-go | Per-token billing | Grok 4.20 via api.x.ai |
| Enterprise | undisclosed | undisclosed | Contact required |
Source: xAI – Models
API Pricing
Grok 4.20 (Standard Context, up to 200K)
| Metric | Rate |
|---|---|
| Input tokens | $2.00/MTok |
| Cached input tokens | $0.20/MTok |
| Output tokens | $6.00/MTok |
| Batch API (all token types) | 50% off standard rates |
| Context window | 2,000,000 tokens |
| Rate limits | 1,800 RPM, 10,000,000 TPM |
Grok 4.20 (Higher Context, above 200K)
| Metric | Rate |
|---|---|
| Input tokens | $4.00/MTok |
| Cached input tokens | $0.40/MTok |
| Output tokens | $12.00/MTok |
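The two Grok 4.20 rate tables above can be turned into a quick per-request estimate. The following is a minimal Python sketch, not xAI's billing logic. The function name `estimate_token_cost` is hypothetical, and it rests on two assumptions the source does not confirm: the higher tier applies to the whole request once uncached plus cached input tokens exceed 200K, and the 50% batch discount applies on either tier.

```python
from decimal import Decimal

# Rates in USD per million tokens, copied from the Grok 4.20 tables above.
STANDARD = {"input": Decimal("2.00"), "cached": Decimal("0.20"), "output": Decimal("6.00")}
HIGHER = {"input": Decimal("4.00"), "cached": Decimal("0.40"), "output": Decimal("12.00")}
TIER_THRESHOLD = 200_000  # tokens; above this the higher-context rates apply
MTOK = Decimal(1_000_000)


def estimate_token_cost(input_tokens, cached_tokens, output_tokens, batch=False):
    """Estimate the token cost of one request in USD.

    Assumption: the tier is chosen from the total prompt size
    (uncached + cached input tokens) and applies to the whole request.
    Assumption: the batch discount halves the cost on either tier.
    """
    prompt_tokens = input_tokens + cached_tokens
    rates = HIGHER if prompt_tokens > TIER_THRESHOLD else STANDARD
    cost = (
        input_tokens * rates["input"]
        + cached_tokens * rates["cached"]
        + output_tokens * rates["output"]
    ) / MTOK
    return cost / 2 if batch else cost
```

Under these assumptions, a request with 100,000 uncached input tokens and 10,000 output tokens comes to $0.26, while 300,000 input tokens with the same output comes to $1.32, because every token in the larger request is billed at the doubled rate.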
Grok 4.1 Fast (via Google Vertex AI)
| Metric | Rate |
|---|---|
| Input tokens | $0.20/MTok |
| Cached input tokens | $0.05/MTok |
| Output tokens | $0.50/MTok |
Source: Google – Pricing
Tool Pricing
| Tool | Cost per 1,000 calls |
|---|---|
| Web Search | $5.00 |
| X Search | $5.00 |
| Code Execution | $5.00 |
| File Attachments Search | $10.00 |
| Collections Search (RAG) | $2.50 |
| Image Understanding (from search) | Token-based only |
| Remote MCP Tools | Token-based only |
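Because tool invocations are billed on top of tokens, it can help to total them separately. The sketch below does that from the per-1,000-call rates in the table above. The dictionary keys and the function name `estimate_tool_cost` are illustrative labels chosen for this example, not identifiers from the xAI API, and the token-based-only rows are left out because they carry no per-call fee.

```python
from decimal import Decimal

# USD per 1,000 calls, copied from the Tool Pricing table.
# Keys are hypothetical labels for this sketch, not xAI API tool names.
TOOL_RATE_PER_1K = {
    "web_search": Decimal("5.00"),
    "x_search": Decimal("5.00"),
    "code_execution": Decimal("5.00"),
    "file_attachments_search": Decimal("10.00"),
    "collections_search": Decimal("2.50"),
}


def estimate_tool_cost(calls):
    """Return the invocation cost in USD for a {tool_label: call_count} mapping.

    Raises KeyError for a label that is not in the rate table.
    """
    return sum(
        (Decimal(count) * TOOL_RATE_PER_1K[tool] / 1000 for tool, count in calls.items()),
        Decimal("0"),
    )
```

For example, a single request that triggers three web searches and two code executions would add $0.025 in tool fees to its token cost.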
Storage Pricing (effective April 20, 2026)
| Resource | Rate |
|---|---|
| File storage | $0.025/GiB/day |
| Collection storage | $0.10/GiB/day |
| File/Collection downloads | $0.20/GiB |
Source: xAI – Grok 4.20 Reasoning, xAI – Models
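The storage rates above are quoted per GiB per day, so a monthly figure takes a small calculation. This is a rough sketch that assumes a constant stored volume over the period and simple daily accrual; how xAI actually meters or rounds storage is not stated in the source. The function name `estimate_storage_cost` is hypothetical.

```python
from decimal import Decimal

# Rates copied from the Storage Pricing table.
FILE_RATE = Decimal("0.025")       # USD per GiB per day
COLLECTION_RATE = Decimal("0.10")  # USD per GiB per day
DOWNLOAD_RATE = Decimal("0.20")    # USD per GiB downloaded


def estimate_storage_cost(file_gib, collection_gib, days, download_gib=0):
    """Estimate storage plus download cost in USD.

    Assumption: the stored volume is constant for the whole period
    and is billed as a straight GiB x days product.
    """
    daily = (
        Decimal(str(file_gib)) * FILE_RATE
        + Decimal(str(collection_gib)) * COLLECTION_RATE
    )
    return daily * days + Decimal(str(download_gib)) * DOWNLOAD_RATE
```

With 10 GiB of files and 2 GiB of collections held for 30 days, plus 5 GiB of downloads, the estimate is $14.50: $7.50 for files, $6.00 for collections, and $1.00 for downloads.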
Model Performance / Benchmarks
xAI does not publish benchmark scores for Grok 4.20 on its models page. Community assessment:
- Coding performance is generally considered competitive but not state-of-the-art compared to Claude Opus 4.7 or GPT-5.3-Codex.
- 2M token context window is the largest among all suppliers in this report.
- No dedicated coding-optimized model variant (Grok Code Fast 1 was consolidated into Grok 4.20).
Source: xAI – Models
Latest News
Grok 4.20 Launch (April 2026)
Grok 4.20 is xAI's newest flagship model, described as "the most intelligent and fastest model we've built." It combines low hallucination rates with strict prompt adherence. Both the Chat and Coding use cases on xAI's models page now point to Grok 4.20, suggesting previous models (Grok 3, Grok Code Fast 1) have been consolidated.
Source: xAI – Models
Tool Pricing and Storage Introduced (April 2026)
xAI introduced per-invocation pricing for server-side tools ($5/1k calls for search and code execution) and storage pricing for files and collections ($0.025-$0.10/GiB/day), effective April 20, 2026.
Source: xAI – Models
Community Signals
- Grok 4.20 is frequently compared to GPT-5 and Claude Opus in community benchmarks, with mixed results. Its coding performance is generally considered competitive but not state-of-the-art compared to Claude Opus 4.7 or GPT-5.3-Codex.
- The consolidation of Grok Code Fast 1 into Grok 4.20 means there is no longer a dedicated coding-optimized model variant.
- The 2M token context window is the largest among all suppliers in this report, useful for large codebase analysis.
Enterprise Readiness
| Feature | Available? | Details |
|---|---|---|
| SSO (SAML) | Undisclosed | Enterprise plan exists but details are not published. Source: xAI – Models |
| SSO (OIDC) | Undisclosed | Not mentioned. |
| SCIM | Undisclosed | Not mentioned. |
| Audit logs | Undisclosed | Not mentioned. |
| IP indemnity | Undisclosed | Not mentioned. |
| Data residency | Undisclosed | Not mentioned. |
| HIPAA | Undisclosed | Not mentioned. |
| Air-gapped / on-prem | No | Not available. |
| SLA | No | No published SLA. |
| Admin controls (RBAC) | No | No admin controls documented. |
Transparency Gaps
| Gap | Details | Severity |
|---|---|---|
| Enterprise pricing | Enterprise plan details and pricing are not published. | Medium |
| Previous model deprecation | Grok 3 and Grok Code Fast 1 appear deprecated but no formal deprecation notice found. | Medium |
| No dedicated coding agent | No IDE integration, CLI agent, or cloud coding agent. Only raw API access. | Medium |
| Tool call cost unpredictability | The model decides autonomously how many tool calls to make, so per-request cost scales with query complexity in ways that are hard to predict. | Low |
| Usage guidelines violation fee | $0.05 per request flagged as violating usage guidelines, charged even if generation is blocked. | Low |
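One way to reason about the tool-cost gap in the table above is to bound spend rather than predict it. The sketch below computes a low and high estimate from an assumed range of tool calls per request, and totals the $0.05 violation fee separately. The function names and the per-request call range are hypothetical planning inputs, not figures from xAI, and the rate used is the $5.00 per 1,000 calls that applies to web search, X search, and code execution only.

```python
from decimal import Decimal

# $5.00 per 1,000 calls (web search, X search, code execution).
RATE_PER_CALL = Decimal("5.00") / 1000
# Fee per request flagged as violating usage guidelines.
VIOLATION_FEE = Decimal("0.05")


def tool_spend_bounds(requests, min_calls, max_calls):
    """Return (low, high) tool spend in USD for a batch of requests.

    min_calls and max_calls are assumed per-request bounds chosen by
    the planner; the model's real behaviour may fall outside them.
    """
    if min_calls > max_calls:
        raise ValueError("min_calls must not exceed max_calls")
    low = requests * min_calls * RATE_PER_CALL
    high = requests * max_calls * RATE_PER_CALL
    return low, high


def violation_fees(flagged_requests):
    """Return the total fee in USD for requests flagged as violations."""
    return flagged_requests * VIOLATION_FEE
```

For 10,000 requests that each make between one and eight tool calls, the bounds are $50 and $400, an eightfold spread that illustrates why the gap is listed even at low severity.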