Google

Executive Summary

What it is: Google's Gemini Code Assist is an AI-powered coding assistant available via IDE extensions (VS Code, JetBrains), CLI (Gemini CLI, transitioning to Antigravity CLI), and Google Cloud services. Plans range from free (individuals, via Google Developer Program) to $22.80/user/month (Standard) and $54/user/month (Enterprise) for teams. The underlying Gemini models (3.5 Flash, 3.1 Pro Preview, 3 Flash Preview) are also available as raw API access at $0.25 to $2.00 input and $1.50 to $12.00 output per MTok on Vertex AI Agent Platform. New in May: Gemini 3.5 Flash launched at I/O 2026 as a frontier-class model at $1.50/$9.00 per MTok, outperforming the more expensive Gemini 3.1 Pro Preview on most benchmarks.

What to watch out for: Gemini CLI and the free/individual Code Assist IDE extensions will stop serving requests on June 18, 2026. Users must migrate to Antigravity CLI, a new product with a different architecture. Enterprise Code Assist customers are unaffected. The free tier of Gemini Code Assist has undisclosed usage limits and does not disclose daily request caps. Google's Flash models have a history of 503 errors during peak demand, with one HN commenter reporting 70% failure rates on busy days. The context window for Code Assist is listed as 1M tokens but per-request limits are not documented.

Bottom line: Google is aggressively consolidating its developer tools around the Antigravity platform and Gemini 3.5 model family. Gemini 3.5 Flash delivers frontier-level coding performance at half the price of comparable models from Anthropic or OpenAI, making it the current price/performance leader for API consumers. Enterprise Code Assist at $19 to $45/seat/month (annual) is competitive with GitHub Copilot, but the forced CLI migration and historical reliability issues warrant careful evaluation before committing.

Key Terms

  • Token-based billing - charges based on the number of input and output tokens processed. One token is roughly 4 characters. Gemini 3.x models use a 200K-token threshold: requests below 200K input tokens are billed at the standard rate, while requests above 200K are billed at a higher rate. Source: Google – Pricing
  • Prompt caching - stores frequently used prompts to reduce cost. Cached input tokens cost 90% less than standard input tokens. Available for all Gemini 3.x and 2.5 models. Source: Google – Pricing
  • Context window - the maximum number of tokens a model can process in a single conversation. Gemini 3.5 Flash, 3.1 Pro, and 3 Flash support a 1M token context window. Source: Google – Code Assist
  • Antigravity CLI - Google's new terminal-based AI agent, replacing Gemini CLI for free and individual users starting June 18, 2026. Built in Go, supports asynchronous multi-agent workflows, and shares the same agent harness as Antigravity 2.0 desktop app. Source: Googleblog – An Important Update Transitioning Gemini Cli To Antigravity Cli
  • Antigravity 2.0 - Google's agent-first development platform, announced at I/O 2026. Includes a standalone desktop app for orchestrating cohorts of autonomous AI agents. Source: Google – Sundar Pichai Io 2026
  • Managed Agents - a Gemini API feature (preview) that lets developers define agents as markdown files (AGENTS.md, SKILL.md) and run them in Google-managed cloud sandboxes. Powered by the Antigravity agent harness. Source: Google – Managed Agents Gemini Api
  • Agent mode - a multi-step collaborative reasoning agent in Gemini Code Assist chat that can perform multiple file edits, full project context, built-in tools, and MCP integration. Currently in preview. Source: Google – Code Assist
  • Global vs Non-global pricing - Gemini 3.5 Flash and 3.1 Flash-Lite offer lower pricing when requests are served from Google's global compute pool, and 10% higher pricing when routed to a specific non-global region. Source: Google – Pricing
  • Priority tier - a premium serving option that provides higher availability and lower latency at 1.8x the standard price. Available for all Gemini 3.x models. Source: Google – Pricing
  • Flex/Batch tier - a discounted serving option for non-urgent workloads at approximately 50% of standard pricing. Gemini 3.5 Flash Flex/Batch input is $0.75/MTok vs $1.50 standard. Source: Google – Pricing

Latest Changes

Changes since the 2026-04 report.

  • New model: Gemini 3.5 Flash launched May 19 at I/O 2026. Input $1.50/MTok (Global), output $9.00/MTok (Global). Outperforms Gemini 3.1 Pro Preview on most coding and agentic benchmarks. 4x faster output than other frontier models. Source: Google – Gemini 3 5
  • New model: Gemini Omni Flash launched at I/O 2026, a new multi-modal output model (text, image, video from any input). Source: Google – Sundar Pichai Io 2026
  • Product launch: Antigravity 2.0, a standalone desktop app for orchestrating autonomous AI agents. Includes an optimized Gemini 3.5 Flash variant running at 12x the speed of other frontier models. Source: Google – Sundar Pichai Io 2026
  • Product launch: Antigravity CLI, replacing Gemini CLI for free and individual users. Available immediately. Source: Googleblog – An Important Update Transitioning Gemini Cli To Antigravity Cli
  • Feature added: Managed Agents in the Gemini API (preview). Define agents as markdown files, run in cloud sandboxes. Source: Google – Managed Agents Gemini Api
  • Feature added: Agent mode in Gemini Code Assist (preview). Multi-step reasoning, multiple file edits, MCP integration, human-in-the-loop. Source: Google – Code Assist
  • Deprecation (upcoming): Gemini CLI and Code Assist IDE extensions will stop serving requests for free/individual users on June 18, 2026. Must migrate to Antigravity CLI. Enterprise Code Assist customers unaffected. Gemini Code Assist for GitHub also affected: no new installations after June 18. Source: Googleblog – An Important Update Transitioning Gemini Cli To Antigravity Cli
  • Feature added: Gemini Spark, a personal AI agent in the Gemini app running 24/7 on dedicated VMs, powered by Gemini 3.5 Flash and Antigravity harness. Rolling out to Google AI Ultra subscribers. Source: Google – Next Evolution Gemini App
  • Announcement: Gemini 3.5 Pro coming in June 2026. Currently used internally, showing "great improvements." Source: Google – Gemini 3 5
  • Scale milestone: Google now processes over 3.2 quadrillion tokens per month across surfaces (7x year-over-year growth). Over 8.5 million developers building with Gemini models monthly. Source: Google – Sundar Pichai Io 2026

Plans

PlanPrice (monthly)Price (annual)Key Inclusions
Gemini Code Assist for individuals$0N/ACode completion, generation, chat in IDE; Gemini CLI (until June 18, then Antigravity CLI); local codebase awareness. Via Google Developer Program. Usage limits: undisclosed
Gemini Code Assist Standard$22.80/user/month$19/user/monthEverything in individual + enterprise-grade security, Gen AI indemnification, agent mode (preview), Gemini CLI, code transformation, Gemini in Firebase, BigQuery data insights, code completion in Cloud Run, Colab Enterprise, database dev assistance. 30-day free trial for up to 50 users
Gemini Code Assist Enterprise$54/user/month$45/user/monthEverything in Standard + code customization (private repos), Gemini in Apigee, Gemini in Application Integration, Gemini Cloud Assist, increased agent mode usage, increased daily usage limits

Terms explained:

  • Code customization - tailors code suggestions to an organization's private codebases. Enterprise-only feature. Source: Google – Code Assist
  • Gemini Cloud Assist - AI-powered infrastructure design, diagnostics, cost optimization, and troubleshooting. Currently in preview at no additional charge for Enterprise customers. Source: Google – Pricing
  • Gen AI indemnification - Google covers legal costs if Gemini Code Assist output infringes third-party copyright. Available on both Standard and Enterprise. Source: Google – Protecting Customers With Generative Ai Indemnification

Source: Google – Code Assist and Google – Pricing

API Pricing

Standard Pricing

ModelInput ($/MTok) <= 200KInput ($/MTok) > 200KCached Input ($/MTok) <= 200KCached Input ($/MTok) > 200KOutput ($/MTok) <= 200KOutput ($/MTok) > 200K
Gemini 3.5 Flash (Global)$1.50$1.50$0.15$0.15$9.00$9.00
Gemini 3.5 Flash (Non-global)$1.65$1.65$0.165$0.165$9.90$9.90
Gemini 3.1 Pro Preview$2.00$4.00$0.20$0.40$12.00$18.00
Gemini 3 Flash Preview (text/image/video)$0.50$0.50$0.05$0.05$3.00$3.00
Gemini 3 Flash Preview (audio)$1.00$1.00$0.10$0.10$3.00$3.00
Gemini 3.1 Flash-Lite (Global)$0.25$0.25$0.025$0.025$1.50$1.50
Gemini 3.1 Flash-Lite (Non-global)$0.275$0.275$0.0275$0.0275$1.65$1.65
Gemini 2.5 Pro$1.25$2.50$0.13$0.25$10.00$15.00
Gemini 2.5 Flash (text/image/video)$0.30$0.30$0.03$0.03$2.50$2.50

Flex/Batch Pricing (50% discount)

ModelInput ($/MTok)Output ($/MTok)
Gemini 3.5 Flash (Global)$0.75$4.50
Gemini 3.1 Pro Preview (<=200K)$1.00$6.00
Gemini 3.1 Pro Preview (>200K)$2.00$9.00
Gemini 3 Flash Preview (text)$0.25$1.50
Gemini 3.1 Flash-Lite (Global)$0.125$0.75

Priority Pricing (1.8x standard)

ModelInput ($/MTok) <= 200KInput ($/MTok) > 200KOutput ($/MTok) <= 200KOutput ($/MTok) > 200K
Gemini 3.5 Flash (Global)$2.70$2.70$16.20$16.20
Gemini 3.1 Pro Preview$3.60$7.20$21.60$32.40
Gemini 3 Flash Preview$0.90$0.90$5.40$5.40
Gemini 3.1 Flash-Lite (Global)$0.45$0.45$2.70$2.70

Source: Google – Pricing

Model Performance / Benchmarks

Published benchmark results for Gemini 3.5 Flash from Google's official announcement:

ModelBenchmarkScore
Gemini 3.5 FlashTerminal-Bench 2.176.2%
Gemini 3.5 FlashGDPval-AA1656 Elo
Gemini 3.5 FlashMCP Atlas83.6%
Gemini 3.5 FlashCharXiv Reasoning84.2%

Google claims Gemini 3.5 Flash outperforms Gemini 3.1 Pro Preview on "almost all benchmarks" and delivers 4x faster output tokens per second compared to other frontier models. Specific comparison numbers against competitors were shown in a GIF chart but not provided as tabular data.

Google claims an Antigravity-optimized version of 3.5 Flash runs at 12x the speed of other frontier models within the Antigravity platform.

Source: Google – Gemini 3 5

Latest News

  1. Google I/O 2026 keynote (May 19): Sundar Pichai announced the "agentic Gemini era" with major launches including Gemini 3.5 Flash, Gemini Omni Flash, Antigravity 2.0, and Gemini Spark. Google now processes 3.2 quadrillion tokens/month, 8.5 million developers building with Gemini, and over 375 Cloud customers each processing more than 1 trillion tokens annually. Source: Google – Sundar Pichai Io 2026
  1. Gemini 3.5 Flash launch (May 19): Google's newest frontier model combining intelligence with action. Outperforms Gemini 3.1 Pro on coding and agentic benchmarks. Available immediately in Gemini app, Gemini API, Vertex AI Agent Platform, and Google Antigravity. Gemini 3.5 Pro coming in June. Source: Google – Gemini 3 5
  1. Gemini CLI to Antigravity CLI transition (May 19): Free and individual users must migrate to Antigravity CLI by June 18, 2026. Gemini CLI and Code Assist IDE extensions will stop serving requests. Enterprise Code Assist customers are not affected. Gemini Code Assist for GitHub also impacted: no new installations after June 18. Source: Googleblog – An Important Update Transitioning Gemini Cli To Antigravity Cli
  1. Managed Agents in Gemini API (May 19): New preview feature allowing developers to build custom agents defined as markdown files (AGENTS.md, SKILL.md), running in Google-managed cloud sandboxes via the Antigravity agent harness. Available via Interactions API and Google AI Studio. Source: Google – Managed Agents Gemini Api
  1. Gemini Omni Flash (May 19): New model generating samples in any output modality from any input. Starting with video outputs. Available in Gemini app, Google Flow, and YouTube Shorts. API access coming in "the coming weeks." Source: Google – Sundar Pichai Io 2026
  1. SynthID adoption by OpenAI (May 19): OpenAI, Kakao, and Eleven Labs announced adoption of Google's SynthID watermarking system. Over 100 billion images and videos watermarked to date. Source: Google – Sundar Pichai Io 2026

Community Signals

Hacker News (962 points, 658 comments on Gemini 3.5 Flash announcement)

The HN community had a broadly positive reaction to Gemini 3.5 Flash's technical capabilities and pricing, but raised several specific concerns:

  • Serving reliability: One commenter reported persistent availability issues with Google's Flash models: "We use flash models extensively (both 2.5 and 3.1) and I cannot overstate this, google cannot serve them without 503s 70% of the time on most days." Another suggested the $1.50/$9.00 pricing was set high intentionally to manage demand, not reflective of actual inference cost. Source: News – Item
  • Agentic task quality: A developer running model benchmarks noted: "Gemini 3.5 Flash is really smart in one-shot coding reasoning, btw. Near the frontier. But it doesn't do so well in long horizon agentic tasks with arbitrary tool availability." This aligns with a pattern noted across Google models. Source: News – Item
  • Price comparison: Commenters compared Gemini 3.5 Flash at $1.50/$9.00 unfavorably to DeepSeek V4 Flash at $0.14/$0.28, but favorably to Anthropic and OpenAI frontier models. One commenter estimated Google may be taking roughly a 90% profit margin on inference given the estimated model size (250-400B parameters). Source: News – Item
  • Model architecture speculation: A detailed technical comment estimated Gemini 3.5 Flash at 250-300B total parameters with 10-16B active, running on a single TPU 8i chip without tensor parallelism. With TurboQuant optimizations, the model could be as large as 400B. Source: News – Item
  • CLI migration concern: The forced migration from Gemini CLI to Antigravity CLI for free users generated concern, particularly the short 4-week window before the June 18 shutdown. Enterprise customers welcomed the clarification that their access is unaffected.

Hacker News (156 points, 46 comments on Gemini API File Search multimodal)

Earlier in May, Google announced multimodal RAG capabilities for Gemini API File Search, which received moderate attention from the developer community.

Source: News – Item

Enterprise Readiness

FeatureAvailable?Details
SSO (SAML/OIDC)YesAvailable via Google Cloud Identity and IAM integration. Source: Google – Pricing
SCIMYesGoogle Cloud supports SCIM provisioning for user management. Source: Google – Identity
Audit logsYesAvailable through Google Cloud Audit Logs and usage metrics dashboard (daily active users, chat exposures, acceptance rates, lines of code accepted). Source: Google – Code Assist
IP indemnityYesGoogle's generative AI indemnification policy covers Gemini Code Assist licensed users for copyright infringement claims. Available on both Standard and Enterprise plans. Source: Google – Protecting Customers With Generative Ai Indemnification
Data residencyYesAvailable via Google Cloud regional controls and VPC Service Controls. Source: Google – Code Assist
HIPAAYesGoogle Cloud HIPAA compliance available for covered workloads. Gemini Code Assist covered under Google Cloud BAA.
Air-gapped/on-premYesAvailable via Google Distributed Cloud Air-gapped for customers requiring fully disconnected deployment. Source: Google – Distributed Cloud Air Gapped
SLAYesGoverned by Google Cloud SLA terms. Specific uptime guarantees vary by service tier.
Admin controls (RBAC)YesGranular IAM permissions, VPC Service Controls, enterprise access controls, and admin console for managing licenses. Source: Google – Code Assist

Terms explained:

Transparency Gaps

  1. Usage limits undisclosed: Gemini Code Assist for individuals (free tier) does not disclose daily or monthly request limits. Standard and Enterprise plans reference "increased daily usage limits" for Enterprise but the specific limits are not published anywhere. Source: Google – Code Assist
  1. Agent mode limits undisclosed: Agent mode (preview) is available on both Standard and Enterprise, but Enterprise gets "increased agent usage" and "increased daily usage limits" without any specific numbers published. Source: Google – Pricing
  1. Model parameter counts not disclosed: Google does not disclose parameter counts for any Gemini 3.x model. The community estimates 250-400B total parameters for 3.5 Flash based on hardware analysis, but this is unconfirmed. Source: News – Item
  1. Context window usage not broken out: While Gemini 3.x models support a 1M token context window, Code Assist does not disclose how much of this context is available per request or how context usage affects billing for subscription plans.
  1. Free tier sunset timeline unclear: The migration from Gemini CLI to Antigravity CLI for free users has a hard deadline of June 18, but there is no published roadmap for feature parity between the two CLIs. Key Gemini CLI features (Skills, Hooks, Subagents, Extensions) are mapped to Antigravity plugins, but no feature parity matrix is published. Source: Googleblog – An Important Update Transitioning Gemini Cli To Antigravity Cli
  1. Pricing model opacity for Code Assist: Gemini Code Assist pricing is per-user per-month with no token metering or usage-based component. This is straightforward but makes it impossible to compare cost-per-token with API access or with competitors that offer metered plans.
  1. Benchmark details not fully published: Google showed benchmark comparisons in an animated GIF at I/O but did not publish a full benchmark table with exact numbers for all comparison models. The specific scores for Claude Opus 4.7, GPT-5.5, and other competitors were shown visually but not provided as structured data. Source: Google – Gemini 3 5