Moonshot AI

AI Coding Agents Report: May 2026 · Updated 31 May 2026 · Version history

Executive Summary

What it is: Moonshot AI's Kimi is a consumer AI platform with coding capabilities (Kimi Code), available via CLI, IDE extensions (VS Code, Zed, JetBrains), and web. Consumer plans range from $0 (Adagio) to $199/mo (Vivace annual). The underlying K2.6 model is available via API at $0.95/$4.00 per MTok (input/output) with a 256K context window, roughly 5x cheaper than Claude Opus 4.8 ($5/$25 per MTok) for competitive coding quality. Kimi Code now defaults to K2.6 for all plans.

What to watch out for: Agent quotas remain approximate with no concrete token counts, and Kimi Code multipliers ("1x", "5x", "15x", "30x") still lack a defined base unit. K2 series models (kimi-k2-0905, kimi-k2-thinking, etc.) were discontinued on May 25, 2026. Billing issues from April persist: double-charging reports, no invoice system, no visible cancel subscription link. Rate limiting (429 errors) continues during peak usage.

Bottom line: Kimi K2.6 delivers frontier-competitive coding benchmarks (SWE-Bench Pro: 58.6%, Terminal-Bench 2.0: 66.7%) at API pricing well below Western models. The K2.6 model won the AI Coding Contest Word Gem Puzzle challenge outright, beating GPT-5.5, Claude Opus 4.7, and Gemini Pro 3.1. However, billing reliability, customer service gaps, and zero enterprise features (no SSO, no audit logs, no SLA) make it risky for teams. Best suited for individual developers comfortable with API-level integration who want frontier-competitive performance at a fraction of the cost.

Key Terms

Unified credit pool - Kimi's consumer plans use a single credit balance metered by token consumption across all features (agent, Code, Swarm, Claw). Credits reset monthly. Source: Kimi – Membership Credits
Kimi Code - Moonshot AI's coding agent product, available as a CLI and IDE extension. Uses Kimi K2.6 as its underlying model (branded as kimi-for-coding). Source: Kimi – Code
Agent Swarm - Kimi's architecture for decomposing tasks into heterogeneous subtasks executed concurrently by self-created domain-specialized agents. K2.6 supports up to 300 sub-agents across 4,000 coordinated steps. Source: Kimi – Agent Swarm
Kimi Claw - A persistent, proactive AI agent that operates across multiple applications with 24/7 execution. Available on Allegretto ($39/mo) and above. Source: Kimi – Kimi Claw Introduction
Context caching - Kimi automatically caches context. Cached tokens are billed at the "cache hit" rate, which is 83% cheaper than regular input for K2.6 ($0.16 vs $0.95 per MTok). Source: Kimi – Chat K26
Kimi WebBridge - A browser extension for AI agents, released in May 2026. Source: Kimi – Webbridge

Latest Changes

Changes since the 2026-04 report.

Deprecation: All Kimi K2 series models (kimi-k2-0905-preview, kimi-k2-turbo-preview, kimi-k2-thinking, kimi-k2-thinking-turbo) were officially discontinued on May 25, 2026. Only K2.5, K2.6, and Moonshot V1 remain available via API. See API Pricing.
Feature added: Kimi Code now defaults to K2.6 (branded as kimi-for-coding). The CLI shows "Model: kimi-for-coding (powered by kimi-k2.6)". Source: Kimi – Code
Feature added: Kimi WebBridge released, a browser extension enabling AI agents to interact with web content. Source: Kimi – Webbridge
Partnership: Berget AI announced Berget Code for European teams, powered by Kimi K2.6, targeting GDPR-compliant deployment. Source: Berget – Berget Code Launch En
Community win: Kimi K2.6 won the AI Coding Contest Word Gem Puzzle outright with 22 match points (7-1-0 record), beating GPT-5.5 (16 pts), Claude Opus 4.7 (12 pts), and Gemini Pro 3.1 (9 pts). Source: Thinkpol – An Open Weights Chinese Model Just Beat Claude Gpt 5 5 And Gemini In A Programming Challenge
No pricing changes: API and consumer plan pricing remain unchanged from April.

Plans

Consumer Plans (Kimi.com)

Plan	Price (monthly)	Price (annual)	Agent Quota/mo	Kimi Code	Agent Swarm	Key Inclusions
Adagio (Free)	$0	$0	6 agent tasks	Not included	Not included	1 concurrent agent task, 200 database calls
Moderato	$19/mo	$15/mo ($180/yr)	60 agent tasks	1x quota (undisclosed token count)	Not included	2 concurrent tasks, 4x speed priority, 2,000 database calls
Allegretto	$39/mo	$31/mo ($372/yr)	150 agent tasks	5x quota	50 Swarm uses/mo, 4 concurrent subtasks	Kimi Claw, 2 concurrent tasks, 5,000 database calls
Allegro	$99/mo	$79/mo ($948/yr)	360 agent tasks	15x quota	120 Swarm uses/mo, 4 concurrent subtasks	Kimi Claw, 4 concurrent tasks, 12,000 database calls
Vivace	$199/mo	$159/mo ($1,908/yr)	720 agent tasks	30x quota	240 Swarm uses/mo, 8 concurrent subtasks	Kimi Claw, 4 concurrent tasks, 24,000 database calls

Agent quotas are approximate values based on typical task token consumption. Actual usage varies by task complexity. All plans share a unified credit pool metered by tokens. Annual billing saves up to $480/year vs monthly.

Source: Kimi – Pricing

API Plans (Pay-as-you-go)

Tier	Cumulative Recharge	Concurrency	RPM	TPM	TPD
Tier 0	$1	1	3	500,000	1,500,000
Tier 1	$10	50	200	2,000,000	Unlimited
Tier 2	$20	100	500	3,000,000	Unlimited
Tier 3	$100	200	5,000	3,000,000	Unlimited
Tier 4	$1,000	400	5,000	4,000,000	Unlimited
Tier 5	$3,000	1,000	10,000	5,000,000	Unlimited

Minimum recharge: $1. At $5 cumulative recharge, users receive a $5 voucher. Vouchers do not count toward cumulative recharge. Enterprise custom limits available via api-service@moonshot.ai.

Source: Kimi – Limits

API Pricing

Model	Input ($/MTok)	Output ($/MTok)	Cache Hit ($/MTok)	Context Window
Kimi K2.6	$0.95	$4.00	$0.16	256K tokens
Kimi K2.5	$0.60	$3.00	$0.10	256K tokens
Moonshot V1 8K	$0.20	$2.00	N/A	8K tokens
Moonshot V1 32K	$1.00	$3.00	N/A	32K tokens
Moonshot V1 128K	$2.00	$5.00	N/A	128K tokens

Discontinued models (removed May 25, 2026): Kimi K2 (0905), K2 Turbo, K2 Thinking, K2 Thinking Turbo, K2 (0711).

Other API pricing:

Web search ($web_search tool): $0.005 per successful tool call
File upload/extract: temporarily free
K2.6 and K2.5 support text, image, and video input
Thinking mode: can be enabled/disabled per request ("thinking": {"type": "enabled"})
Temperature fixed at 1.0 (thinking mode) or 0.6 (non-thinking), cannot be changed
Top-p fixed at 0.95, cannot be changed

Source: Kimi – Chat K26, Kimi – Chat K25, Kimi – Chat V1, Kimi – Tools

Model Performance / Benchmarks

Kimi K2.6 Benchmarks

Benchmark	Kimi K2.6	GPT-5.4 (xhigh)	Claude Opus 4.6 (max)	Gemini 3.1 Pro (thinking high)	Kimi K2.5
SWE-Bench Pro	58.6%	57.7%	53.4%	54.2%	50.7%
Terminal-Bench 2.0	66.7%	65.4%	65.4%	68.5%	50.8%
SWE-Bench Verified	80.2%	-	80.8%	80.6%	76.8%
SWE-Bench Multilingual	76.7%	-	77.8%	76.9%	73.0%
HLE-Full (with tools)	54.0%	52.1%	53.0%	51.4%	50.2%
BrowseComp	83.2%	82.7%	83.7%	85.9%	74.9%
OSWorld-Verified	73.1%	75.0%	72.7%	-	63.3%
AIME 2026	96.4%	99.2%	96.7%	98.3%	95.8%
LiveCodeBench (v6)	89.6%	-	88.8%	91.7%	85.0%

Additional K2.6 capabilities:

Agent Swarm: up to 300 sub-agents across 4,000 coordinated steps
256K context window, multimodal (text, image, video)
Demonstrated 13-hour continuous coding session optimizing exchange-core for 185% throughput improvement
Demonstrated 12-hour session implementing Qwen3.5-0.8B inference in Zig

Source: Kimi – Kimi K2 6

Latest News

K2 Series Discontinued (May 25, 2026)

All Kimi K2 series models (kimi-k2-0905-preview, kimi-k2-turbo-preview, kimi-k2-thinking, kimi-k2-thinking-turbo) were officially discontinued on May 25, 2026. Users must migrate to kimi-k2.6 or kimi-k2.5.

Source: Kimi – Chat

Kimi K2.6 Wins AI Coding Contest (April 30, reported widely May)

In the ongoing AI Coding Contest Word Gem Puzzle challenge, Kimi K2.6 won outright with 22 match points (7-1-0 record), beating GPT-5.5 (16 pts), Claude Opus 4.7 (12 pts), and Gemini Pro 3.1 (9 pts). The challenge tested real-time decision-making and clean functional code connecting to a TCP server. The HN post drew 219 comments and 380 points.

Source: Thinkpol – An Open Weights Chinese Model Just Beat Claude Gpt 5 5 And Gemini In A Programming Challenge News – Item

Kimi WebBridge Released (May 26, 2026)

Moonshot AI released Kimi WebBridge, a browser extension enabling AI agents to interact with web content directly. This extends Kimi's agent capabilities into the browser.

Source: Kimi – Webbridge, News – Item

Berget Code for European Teams (May 13, 2026)

Berget AI announced Berget Code, a coding agent for European teams powered by Kimi K2.6, targeting GDPR-compliant European hosting.

Source: Berget – Berget Code Launch En, News – Item

DeepSeek V4 Pro vs Kimi K2.6 Comparison (May 15, 2026)

Kilo.ai published a comparison of DeepSeek V4 Pro and Flash vs Claude Opus 4.7 and Kimi K2.6, evaluating coding agent performance.

Source: Kilo – We Tested Deepseek V4 Pro And Flash, News – Item

Kimi K2.6 HN Launch Discussion (April 20, ongoing into May)

The original K2.6 launch HN thread reached 710 points and 372 comments, making it one of the most-discussed model launches in May. Key community themes: competitive benchmark performance at low cost, questions about Chinese model licensing, and excitement about the open-weight availability.

Source: News – Item

Community Signals

K2.6 Competitive Performance Recognition

The K2.6 launch received significant positive community attention across HN, with 710 points on the launch thread and 380 points on the AI Coding Contest win thread. Key themes:

Open-source models are closing the gap with Western frontier models
K2.6's SWE-Bench Pro score (58.6%) beating GPT-5.4 (57.7%) drew particular attention
Cost-performance ratio frequently highlighted: $0.95/$4.00 per MTok vs $5/$25 for Claude Opus

Quote: "Kimi K2.6 sets a new level for open-sourced models, especially in long-horizon, agent-style coding workflows." (Blackbox.ai CEO Robert Rizk, via K2.6 blog)

Quote from AI Coding Contest: "When models within a few index points of the frontier are also freely available to run locally, that's a different competitive situation than the one that existed a year ago." (Rohana Rezel, contest organizer)

Source: News – Item News – Item Thinkpol – An Open Weights Chinese Model Just Beat Claude Gpt 5 5 And Gemini In A Programming Challenge

Ecosystem Adoption

Multiple third-party products building on Kimi K2.6:

Kilo Code: uses K2.6 as a core model. "K2.6 offers SOTA-level performance at a fraction of the cost." (Kilo CEO)
Tencent CodeBuddy: integrates K2.6 Thinking for complex programming tasks
Berget Code: European GDPR-compliant coding agent powered by K2.6
Kimiflare: open-source Claude Code clone powered by K2.6 on Cloudflare Workers AI (12k+ npm downloads)
Ollama: K2.6 available for local deployment
Cursor: Composer 2 built on K2.5 (confirmed partnership in April)

Source: Kimi – Kimi K2 6, News – Item

Rate Limiting and Reliability Issues (Ongoing)

Reports of 429 errors and "system busy" messages continue on the Kimi Forum:

"Kimi CLI stuck in engine overloaded loop for 48h" (April 28). Source: Forum – Kimi Cli Stuck In Engine Overloaded Loop For 48H
"Error code 429: We're receiving too many requests at the moment" (12 replies, 841 views, ongoing). Source: Forum – Error Code 429 Were Receiving Too Many Requests At The Moment

Billing and Subscription Issues (Ongoing from April)

Billing issues reported in April appear unresolved:

Double-charging after cancellation reported by multiple users
No self-service invoice download
No visible cancel subscription link
Subscription state not syncing across devices

Source: Forum – 353, Forum – No Link To Cancel Subscription

Kimi K2.6 Self-Hosting and Optimization

Florian Leibert published "5.6x throughput on Kimi K2.6 by speculating less" on HuggingFace, demonstrating how to optimize K2.6 inference on MI300X hardware. 11 points on HN.

Source: Hugging Face – Kimi K26 Dflash Mi300X, News – Item

Enterprise Readiness

Feature	Available?	Details
SSO (SAML)	No	Not mentioned. Kimi is primarily a consumer and API product.
SSO (OIDC)	No	Not mentioned.
SCIM	No	Not mentioned.
Audit logs	No	Not mentioned.
IP indemnity	No	Not mentioned.
Data residency	Partial	Berget AI offers European-hosted K2.6 for GDPR compliance. No official data residency from Moonshot directly.
HIPAA	No	Not mentioned.
Air-gapped / on-prem	Partial	K2.6 is open-weight and available via Ollama for local deployment, but no official on-prem enterprise product.
SLA	No	No published SLA.
Admin controls (RBAC)	No	No admin controls documented. API tiers are single-user.

Transparency Gaps

Gap	Details	Severity
Agent quota is approximate	Plan inclusions use "approximate values based on typical task token consumption" rather than concrete token counts. A buyer cannot know exactly how much usage they get.	High
Kimi Code quota multiplier unclear	Kimi Code is listed as "1x", "5x", "15x", "30x" without a concrete base unit. The actual token allocation for Kimi Code per plan is undisclosed.	High
No invoice system	Multiple forum posts about inability to get invoices or billing transparency. No self-service invoice download.	Medium
Subscription management gaps	No clear cancel subscription link. Subscription state does not sync across devices. Double-charging reported by multiple users.	Medium
Refund policy	No refund policy published. Forum posts show users requesting refunds for immediate cancellations with no response.	Medium
Rate limit changes during peak	Documentation states "when the cluster load reaches its capacity limit, we may take temporary measures to adjust the rate limits" without specifying what adjustments are made or when.	Low
No batch API	No batch API or discounted async processing tier is documented.	Low
No thinking token pricing	K2.6 thinking mode generates reasoning tokens, but there is no separate pricing for thinking tokens vs regular output tokens. It is unclear whether thinking tokens are billed at output rates.	Low

Type: CLI, IDE, API
Starts at: $19.0/mo
API Input: $0.2/MTok
API Output: $2.0/MTok
Context: 256K
Free Tier: Yes

Compare all suppliers →