Key Terms
- Unified credit pool - Kimi's consumer plans use a single credit balance metered by token consumption across all features (agent, Code, Swarm, Claw). Credits reset monthly. Source: Kimi – Membership Overview
- Kimi Code - Moonshot AI's coding agent product, available as a CLI and IDE extension. Uses Kimi K2.6 as its underlying model. Source: Kimi – Code
- Agent Swarm - Kimi's architecture for decomposing tasks into heterogeneous subtasks executed concurrently by self-created domain-specialized agents. K2.6 supports up to 300 sub-agents across 4,000 coordinated steps. Source: Kimi – Agent Swarm
- Kimi Claw - A persistent, proactive AI agent that operates across multiple applications with 24/7 execution (similar to Anthropic's computer use). Available on Allegretto ($39/mo) and above. Source: Kimi – Kimi Claw Introduction
- Context caching - Kimi automatically caches context. Cached tokens are billed at the "cache hit" rate, which is 83% cheaper than regular input for K2.6 ($0.16 vs $0.95 per MTok). Source: Kimi – Chat K26
Latest Changes
First report for this supplier. All models, plans, and pricing are listed as current state.
- New model: Kimi K2.6 launched April 19-20. SWE-Bench Pro: 58.6%, Terminal-Bench 2.0: 66.7%. API at $0.95/$4.00 per MTok.
- Feature added: Agent Swarm expanded to 300 sub-agents across 4,000 coordinated steps (up from 100/1,500 in K2.5).
- Deprecation: All Kimi K2 series models will be discontinued May 25, 2026. Users should migrate to K2.6.
- Feature added: Cursor Composer 2 partnership confirmed. Kimi K2.5 is the base model for Cursor's Composer 2 with RL fine-tuning.
- Feature added: K2.6 API available with OpenAI SDK compatibility. Top-up promotion: 20-30% bonus voucher on recharges of $100+ through May 3.
Plans
Consumer Plans (Kimi.com)
| Plan | Price (monthly) | Price (annual) | Agent Quota/mo | Kimi Code | Agent Swarm | Key Inclusions |
|---|---|---|---|---|---|---|
| Adagio (Free) | $0 | $0 | 6 agent tasks | Not included | Not included | 1 concurrent agent task, 200 database calls |
| Moderato | $19/mo | $15/mo ($180/yr) | 60 agent tasks | 1x quota (undisclosed token count) | Not included | 2 concurrent tasks, 4x speed priority, 2,000 database calls |
| Allegretto | $39/mo | $31/mo ($372/yr) | 150 agent tasks | 5x quota | 50 Swarm uses/mo, 4 concurrent subtasks | Kimi Claw, 2 concurrent tasks, 5,000 database calls |
| Allegro | $99/mo | $79/mo ($948/yr) | 360 agent tasks | 15x quota | 120 Swarm uses/mo, 4 concurrent subtasks | Kimi Claw, 4 concurrent tasks, 12,000 database calls |
| Vivace | $199/mo | $159/mo ($1,908/yr) | 720 agent tasks | 30x quota | 240 Swarm uses/mo, 8 concurrent subtasks | Kimi Claw, 4 concurrent tasks, 24,000 database calls |
Agent quotas are approximate values based on typical task token consumption. Actual usage varies by task complexity. All plans share a unified credit pool metered by tokens.
Source: Kimi – Membership Pricing
API Plans (Pay-as-you-go)
| Tier | Cumulative Recharge | Concurrency | RPM | TPM | TPD |
|---|---|---|---|---|---|
| Tier 0 | $1 | 1 | 3 | 500,000 | 1,500,000 |
| Tier 1 | $10 | 50 | 200 | 2,000,000 | Unlimited |
| Tier 2 | $20 | 100 | 500 | 3,000,000 | Unlimited |
| Tier 3 | $100 | 200 | 5,000 | 3,000,000 | Unlimited |
| Tier 4 | $1,000 | 400 | 5,000 | 4,000,000 | Unlimited |
| Tier 5 | $3,000 | 1,000 | 10,000 | 5,000,000 | Unlimited |
Minimum recharge: $1. At $5 cumulative recharge, users receive a $5 voucher. Vouchers do not count toward cumulative recharge. Enterprise custom limits available via api-service@moonshot.ai.
Source: Kimi – Limits
API Pricing
| Model | Input ($/MTok) | Output ($/MTok) | Cache Hit ($/MTok) | Context Window |
|---|---|---|---|---|
| Kimi K2.6 | $0.95 | $4.00 | $0.16 | 256K tokens |
| Kimi K2.5 | $0.60 | $3.00 | $0.10 | 256K tokens |
| Kimi K2 (0905) | $0.60 | $2.50 | $0.15 | 256K tokens |
| Kimi K2 Turbo | $1.15 | $8.00 | $0.15 | 256K tokens |
| Kimi K2 Thinking | $0.60 | $2.50 | $0.15 | 256K tokens |
| Kimi K2 Thinking Turbo | $1.15 | $8.00 | $0.15 | 256K tokens |
| Kimi K2 (0711) | $0.60 | $2.50 | $0.15 | 128K tokens |
Note: Kimi K2 series models will be discontinued on May 25, 2026 and will no longer be maintained. Users should migrate to Kimi K2.6.
Other API pricing:
- Web search ($web_search tool): $0.005 per successful tool call
- File upload/extract: temporarily free
- K2.6 supports text, image, and video input
- Thinking mode: can be enabled/disabled per request
- Temperature fixed at 1.0 (thinking mode) or 0.6 (non-thinking), cannot be changed
Source: Kimi – Chat K26, Kimi – Chat K2, Kimi – Chat K25, Kimi – Tools
Model Performance / Benchmarks
| Benchmark | Kimi K2.6 | GPT-5.4 | Claude Opus 4.6 |
|---|---|---|---|
| SWE-Bench Pro | 58.6% | 57.7% | 53.4% |
| Terminal-Bench 2.0 | 66.7% | 65.4% | 65.4% |
| HLE-Full (with tools) | 54.0% | 52.1% | 53.0% |
| SWE-Bench Verified | 80.2% | - | 80.8% |
Additional K2.6 capabilities:
- Agent Swarm: up to 300 sub-agents across 4,000 coordinated steps
- 256K context window, multimodal (text, image, video)
- Demonstrated 13-hour continuous coding session optimizing exchange-core for 185% throughput improvement
Source: Kimi – Kimi K2 6
Latest News
Kimi K2.6 Launch (April 19-20, 2026)
Moonshot AI released Kimi K2.6, its latest open-source model with state-of-the-art coding and agent capabilities. Key claims:
- SOTA on SWE-Bench Pro (58.6%), competitive with GPT-5.4 (57.7%) and above Claude Opus 4.6 (53.4%)
- Terminal-Bench 2.0: 66.7%, above GPT-5.4 (65.4%) and Opus 4.6 (65.4%)
- HLE-Full with tools: 54.0%, above GPT-5.4 (52.1%), Opus 4.6 (53.0%), Gemini 3.1 Pro (51.4%)
- SWE-Bench Verified: 80.2%, competitive with Opus 4.6 (80.8%) and Gemini 3.1 Pro (80.6%)
- Agent Swarm expanded to 300 sub-agents across 4,000 coordinated steps (up from 100/1,500 in K2.5)
- 256K context window, multimodal (text, image, video), thinking and non-thinking modes
- Demonstrated 13-hour continuous coding session optimizing exchange-core for 185% throughput improvement
- Demonstrated 12-hour session implementing Qwen3.5-0.8B inference in Zig (a niche language)
Source: Kimi – Kimi K2 6
K2.6 API Available (April 19, 2026)
K2.6 model available on Kimi API platform at $0.95/$4.00 per MTok (input/output). OpenAI SDK compatible. Top-up promotion: 20-30% bonus voucher on recharges of $100+ during April 19 - May 3, 2026.
Source: Kimi – Kimi K2 6 Quickstart
K2 Series Discontinuation (May 25, 2026)
All Kimi K2 series models (kimi-k2-0905-preview, kimi-k2-turbo-preview, kimi-k2-thinking, etc.) will be officially discontinued on May 25, 2026 and no longer maintained. Users should migrate to kimi-k2.6.
Source: Kimi – Chat K2
Cursor Composer 2 Partnership Confirmed (April 2026)
Community discovered Cursor's "in-house model" Composer 2 is Kimi K2.5 with RL fine-tuning, served via Fireworks AI. Moonshot confirmed it is an authorized commercial partnership. Cursor co-founder Lee Rob stated only ~1/4 of compute came from the base model; the rest is Cursor's own training. HN: 276 points, 168 comments.
Source: News – Item
Community Signals
Rate Limiting and Reliability Issues
Multiple reports of 429 errors and "system busy" messages on the Kimi Forum:
- "Kimi CLI stuck in engine overloaded loop for 48h" (2 replies, 36 views, April 28). Source: Forum – Kimi Cli Stuck In Engine Overloaded Loop For 48H
- "Error code 429: We're receiving too many requests at the moment" (12 replies, 841 views, ongoing since before April). Source: Forum – Error Code 429 Were Receiving Too Many Requests At The Moment
- "System is Currently Busy Error" (97 views, April 20). Source: Forum – System Is Currently Busy Error
Billing and Subscription Issues
- Multiple reports of double-charging after cancellation. "Charged twice after cancellation, no invoices, no authorization" (4 replies, 42 views, April 16). Source: Forum – Charged Twice After Cancellation No Invoices No Authorization
- "I was charged twice for my Vivace plan" (16 replies, 343 views). Source: Forum – I Was Charged Twice For My Vivace Plan
- Subscription state not syncing across devices: user upgraded from $19 to $39 plan, logged in on another device, and was downgraded to free. Source: Forum – I Upgraded My Subscription From 19 To 39 Plan And Logged In On Another Device And Some How Got Downgraded To Free
- "Refund request 1 minute after subscription" (5 replies, 125 views). Source: Forum – Refund Request 1 Minute After Subscription
- No link to cancel subscription (226 views). Source: Forum – No Link To Cancel Subscription
K2.6 Quality Feedback
- "Overengineered Answer, Zero Results. Kimi 2.6 Couldn't Solve a Simple Bug" (40 views, April 24). Source: Forum – Overengineered Answer Zero Results Kimi 2 6 Couldn T Solve A Simple Bug
- "129 Hour limit on Kimi" (58 views, April 4). Source: Forum – 129 Hour Limit On Kimi
Cursor/Kimi Licensing Discussion
- HN discussion on Composer 2 being Kimi K2.5 drew 168 comments. Key debate: whether Cursor's use of open-weight models without prominent attribution violated Kimi's modified MIT license (requires displaying "Kimi K2.5" for products with >100M MAU or >$20M monthly revenue). Moonshot confirmed authorized partnership.
- Quote: "There is so much money to be made repackaging open source these days." (HN user mohsen1)
- Quote: "K2.6 offers SOTA-level performance at a fraction of the cost." (Kilo.ai CEO Scott Breitenother, via Kimi blog)
Source: News – Item
Enterprise Readiness
| Feature | Available? | Details |
|---|---|---|
| SSO (SAML) | No | Not mentioned. Kimi is primarily a consumer and API product. |
| SSO (OIDC) | No | Not mentioned. |
| SCIM | No | Not mentioned. |
| Audit logs | No | Not mentioned. |
| IP indemnity | No | Not mentioned. |
| Data residency | No | Not mentioned. API endpoints are China-focused with limited global availability. |
| HIPAA | No | Not mentioned. |
| Air-gapped / on-prem | No | Not available. |
| SLA | No | No published SLA. |
| Admin controls (RBAC) | No | No admin controls documented. API tiers are single-user. |
Transparency Gaps
| Gap | Details | Severity |
|---|---|---|
| Agent quota is approximate | Plan inclusions use "approximate values based on typical task token consumption" rather than concrete token counts. A buyer cannot know exactly how much usage they get. | High |
| Kimi Code quota multiplier unclear | Kimi Code is listed as "1x", "5x", "15x", "30x" without a concrete base unit. The actual token allocation for Kimi Code per plan is undisclosed. | High |
| No invoice system | Multiple forum posts about inability to get invoices or billing transparency. No self-service invoice download. | Medium |
| Subscription management gaps | No clear cancel subscription link. Subscription state does not sync across devices. Double-charging reported by multiple users. | Medium |
| Refund policy | No refund policy published. Forum posts show users requesting refunds for immediate cancellations with no response. | Medium |
| Rate limit changes during peak | Documentation states "when the cluster load reaches its capacity limit, we may take temporary measures to adjust the rate limits" without specifying what adjustments are made or when. | Low |
| API batch pricing | No batch API or discounted async processing tier is documented. | Low |