OpenAI

Executive Summary

What it is: OpenAI Codex is an agentic coding and general-purpose AI platform available via ChatGPT (web), CLI, VS Code extension, and API. Plans range from $0 (Free) to $200/mo (Pro 20x) for consumers, with a Business tier using pay-as-you-go per-token billing and no fixed seat fee. The latest model, GPT-5.5, leads on Terminal-Bench 2.0 at 82.7% and is priced at $5.00/$30.00 per MTok via API.

What to watch out for: Codex usage limits are wide ranges (e.g., "15 to 80" GPT-5.5 messages per 5 hours for Plus), making actual capacity unpredictable. The current Pro plan multipliers (10x for $100/mo, 25x for $200/mo) are promotional and expire May 31, 2026, reverting to 5x and 20x. OpenAI's critique of SWE-bench Verified argues the benchmark is saturated, but the replacement (SWE-Bench Pro) lacks independent verification.

Bottom line: GPT-5.5 leads on Terminal-Bench 2.0, and the seat-free Business Codex tier is a flexible option for teams that want per-token billing without subscriptions. However, promotional pricing and opaque usage ranges make it hard to forecast costs. The April 28 AWS partnership (OpenAI models on Bedrock, Codex on AWS) expands deployment options for enterprises already on Amazon infrastructure.

Key Terms

  • Token-based billing - charges based on the number of input and output tokens processed. One token is roughly 4 characters or 0.75 words. Codex credits are now consumed at token rates rather than per-message estimates for Business and new Enterprise customers. Source: OpenAI – Pricing
  • Prompt caching - stores frequently used prompts on OpenAI's servers to reduce latency and cost. Cached input tokens cost 90% less than regular input tokens for all flagship models (e.g., $0.50/MTok cached vs $5.00/MTok standard for GPT-5.5). Source: OpenAI – Pricing
  • Batch API - asynchronous processing that costs 50% less than standard API calls, with results available within 24 hours. GPT-5.5 Batch: $2.50/$15.00 per MTok vs standard $5.00/$30.00. Source: OpenAI – Pricing
  • Flex processing - best-effort processing at 50% of standard API rates, with higher latency and lower availability. GPT-5.5 Flex: $2.50/$15.00 per MTok. Source: OpenAI – Flex Processing
  • Priority processing - guaranteed faster processing at a premium over standard API rates (2x to 2.5x depending on model). GPT-5.5 Priority: $12.50/$75.00 per MTok. Source: OpenAI – Priority Processing
  • Context window - the maximum number of tokens a model can process in a single conversation. GPT-5.5 supports a 1M token context window via API, and 400K context in Codex for Pro/Enterprise users. Source: OpenAI – Introducing GPT-5.5
  • Fast mode - generates tokens 1.5x faster for 2.5x the credit cost in Codex. Available for supported models. Source: OpenAI – Pricing
  • Credits - the core billing unit for Codex usage beyond included limits. As of April 2, 2026, Business and new Enterprise customers are billed at API token rates mapped to credits; Plus and Pro users still use per-message averages. Source: OpenAI – Pricing
  • GPT-5.3-Codex - a specialized model optimized for coding tasks within Codex, used for cloud tasks and code review. Priced at $1.75/$14.00 per MTok via API. Source: OpenAI – Pricing
  • GPT-5.3-Codex-Spark - a fast Codex model in research preview for ChatGPT Pro users only. Not available via API. Usage is governed by a separate limit that may adjust based on demand. Source: OpenAI – Pricing
  • Managed Agents - OpenAI's agent deployment framework, now available on Amazon Bedrock as "Amazon Bedrock Managed Agents, powered by OpenAI" (limited preview). Agents maintain context, execute multi-step workflows, and use tools. Source: OpenAI – OpenAI on AWS
  • Symphony - an open-source specification for Codex orchestration, released April 27, 2026. Defines how multiple Codex agents coordinate tasks. Source: OpenAI – Open Source Codex Orchestration Symphony
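The processing-tier discounts above compound into very different effective rates for the same workload. A minimal sketch of that arithmetic for GPT-5.5, using only the multipliers quoted in this section; the 10M-input/2M-output workload is invented for illustration:

```python
# Effective GPT-5.5 cost across processing tiers, from the rates above:
# cached input -90%, batch/flex -50%, priority 2.5x (for GPT-5.5).
STANDARD_IN, STANDARD_OUT = 5.00, 30.00  # $/MTok, GPT-5.5 standard

def tier_cost(mtok_in, mtok_out, in_rate, out_rate):
    """Dollar cost for mtok_in / mtok_out million input/output tokens."""
    return mtok_in * in_rate + mtok_out * out_rate

tiers = {
    "standard":     (STANDARD_IN,        STANDARD_OUT),
    "cached_input": (STANDARD_IN * 0.10, STANDARD_OUT),        # caching discounts input only
    "batch":        (STANDARD_IN * 0.50, STANDARD_OUT * 0.50),
    "flex":         (STANDARD_IN * 0.50, STANDARD_OUT * 0.50),
    "priority":     (STANDARD_IN * 2.5,  STANDARD_OUT * 2.5),
}

# Hypothetical workload: 10M input tokens, 2M output tokens.
for name, (r_in, r_out) in tiers.items():
    print(f"{name:13s} ${tier_cost(10, 2, r_in, r_out):7.2f}")
```

On this workload the spread runs from $55 (batch/flex) to $275 (priority), a 5x range before any long-context or regional uplifts.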

Latest Changes

First report for this supplier. All models, plans, and pricing are listed as current state.

  • New model: GPT-5.5 launched April 23 at $5/$30 per MTok. Terminal-Bench 2.0: 82.7%. See Model Performance / Benchmarks.
  • New model: GPT-5.5 Pro launched alongside at $30/$180 per MTok.
  • Price change (promotional): Pro 5x at 10x usage and Pro 20x at 25x are promotional until May 31, 2026, reverting to 5x and 20x.
  • Plan change: Business Codex tier launched with no fixed seat fee, per-token billing.
  • Feature added: Microsoft partnership amended April 27. OpenAI can now serve products across any cloud provider.
  • Feature added: OpenAI models, Codex, and Managed Agents on AWS (limited preview, April 28).
  • Feature added: Symphony open-source orchestration spec (April 27).
  • Feature added: Workspace agents in ChatGPT (April 22).
  • Deprecation: SWE-bench Verified replaced by SWE-Bench Pro for OpenAI's own evaluations (April 25).

Plans

| Plan | Price | Usage Model | Key Inclusions |
|---|---|---|---|
| Free | $0/mo | Limited | Limited GPT-5.3 Instant access, limited Codex, limited messages/uploads, 27K context window |
| Go | $8/mo | Expanded | More GPT-5.3 access, 54K context window, more messages/uploads, may include ads |
| Plus | $20/mo | Shared 5h window | GPT-5.5 Thinking, 256K reasoning context, 54K instant context, expanded Codex usage, deep research, agent mode. Double Codex usage promo until May 31, 2026 |
| Pro 5x | $100/mo | 5x Plus usage | 5x Codex usage (10x during promo until May 31, 2026), GPT-5.5 Pro access, 400K context, unlimited GPT-5.3, GPT-5.3-Codex-Spark (research preview), ChatGPT Pulse |
| Pro 20x | $200/mo | 20x Plus usage | 20x Codex usage (25x 5-hour limits during promo until May 31, 2026), everything in Pro 5x |
| Business Codex | Pay-as-you-go | Per-token billing | No fixed seat fee, AI software engineering, automated code/security reviews, SAML SSO, MFA, no training on data, GDPR/CCPA support, SOC 2 Type 2 aligned |
| Business ChatGPT & Codex | Per user/mo | Per-seat + usage | Everything in Plus and Business Codex, 60+ app integrations (Slack, Google Drive, SharePoint, GitHub), shared projects, custom workspace GPTs |
| Enterprise | Custom pricing | Scales with credits | SCIM, EKM, audit logs, data residency (10 regions), 128K instant context, 400K reasoning context, 24/7 priority support, SLAs, RBAC, compliance API, IP allowlisting |

Source: ChatGPT – Pricing and OpenAI – Pricing

Codex Usage Limits by Plan and Model

| Model | Plus (local/5h) | Pro 5x (local/5h) | Pro 20x (local/5h) | Plus (cloud tasks/5h) | Pro 5x (cloud tasks/5h) | Pro 20x (cloud tasks/5h) |
|---|---|---|---|---|---|---|
| GPT-5.5 | 15-80 | 80-400 | 300-1600 | Not available | Not available | Not available |
| GPT-5.4 | 20-100 | 100-500 | 400-2000 | Not available | Not available | Not available |
| GPT-5.4-mini | 60-350 | 300-1750 | 1200-7000 | Not available | Not available | Not available |
| GPT-5.3-Codex | 30-150 | 150-750 | 600-3000 | 10-60 | 50-300 | 200-1200 |

Local messages and cloud tasks share a five-hour rolling window. Additional weekly limits may apply. Enterprise/Edu plans without flexible pricing have the same limits as Plus. Image generations consume included limits 3-5x faster than regular turns.

Source: OpenAI – Pricing
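The shared five-hour rolling window behaves like a sliding-log rate limiter. The sketch below is a hypothetical model only: OpenAI does not document its limiter, and the real one also applies undocumented weekly caps and complexity-dependent weighting (hence the "15-80" ranges), which this ignores.

```python
from collections import deque

class RollingWindowLimiter:
    """Hypothetical sketch of a 5-hour rolling usage window.

    Not OpenAI's implementation: actual Codex limits vary with task
    complexity and carry additional, undocumented weekly caps.
    """

    def __init__(self, limit, window_seconds=5 * 3600):
        self.limit = limit
        self.window = window_seconds
        self.events = deque()  # timestamps of counted messages

    def allow(self, now):
        # Evict events that have aged out of the rolling window.
        while self.events and now - self.events[0] >= self.window:
            self.events.popleft()
        if len(self.events) < self.limit:
            self.events.append(now)
            return True
        return False

# Pessimistic end of the Plus "15-80" GPT-5.5 range: 15 messages per 5h.
limiter = RollingWindowLimiter(limit=15)
granted = sum(limiter.allow(t) for t in range(20))  # 20 rapid requests
print(granted)  # 15 allowed, 5 blocked
```

Planning against the pessimistic end of each range gives a capacity floor; anything above it is upside that users cannot rely on.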

Credit Rates for Business and New Enterprise Customers

| Model | Input (credits/MTok) | Cached Input (credits/MTok) | Output (credits/MTok) |
|---|---|---|---|
| GPT-5.5 | 125 | 12.50 | 750 |
| GPT-5.4 | 62.50 | 6.25 | 375 |
| GPT-5.4-mini | 18.75 | 1.875 | 113 |
| GPT-5.3-Codex | 43.75 | 4.375 | 350 |
| GPT-5.2 | 43.75 | 4.375 | 350 |
| GPT-Image-2 (image) | 200 | 50 | 750 |
| GPT-Image-2 (text) | 125 | 31.25 | 250 |

Plus, Pro, existing Enterprise/Edu, and new Edu customers use per-message averages (e.g., GPT-5.5 local task: ~14 credits/message, GPT-5.3-Codex cloud task: ~25 credits/message).

Source: OpenAI – Pricing
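The credit rates above line up with the API dollar rates at a fixed 25 credits per dollar (e.g., GPT-5.5 input: $5.00/MTok × 25 = 125 credits/MTok). The sketch below makes that mapping explicit; note the 25:1 ratio is this report's inference from the two tables, not a published conversion factor.

```python
# Standard API $/MTok rates (input, cached input, output) quoted in this
# report, and the inferred credits-per-dollar ratio.
API_RATES = {
    "gpt-5.5": (5.00, 0.50, 30.00),
    "gpt-5.4": (2.50, 0.25, 15.00),
    "gpt-5.3-codex": (1.75, 0.175, 14.00),
}
CREDITS_PER_DOLLAR = 25  # inferred ratio, not an official figure

def credits_for(model, tok_in, tok_cached, tok_out):
    """Estimated Codex credits for a token mix, via the inferred 25:1 ratio."""
    r_in, r_cached, r_out = API_RATES[model]
    dollars = (tok_in * r_in + tok_cached * r_cached + tok_out * r_out) / 1e6
    return dollars * CREDITS_PER_DOLLAR

# 200K fresh input + 300K cached input + 20K output on GPT-5.5:
print(credits_for("gpt-5.5", 200_000, 300_000, 20_000))  # 43.75 credits
```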

API Pricing

Flagship Models (Standard, per 1M tokens)

| Model | Input ($/MTok) | Cached Input ($/MTok) | Output ($/MTok) | Long Context Input ($/MTok) | Long Context Cached ($/MTok) | Long Context Output ($/MTok) |
|---|---|---|---|---|---|---|
| gpt-5.5 | $5.00 | $0.50 | $30.00 | $10.00 | $1.00 | $45.00 |
| gpt-5.5-pro | $30.00 | undisclosed | $180.00 | $60.00 | undisclosed | $270.00 |
| gpt-5.4 | $2.50 | $0.25 | $15.00 | $5.00 | $0.50 | $22.50 |
| gpt-5.4-mini | $0.75 | $0.075 | $4.50 | undisclosed | undisclosed | undisclosed |
| gpt-5.4-nano | $0.20 | $0.02 | $1.25 | undisclosed | undisclosed | undisclosed |
| gpt-5.4-pro | $30.00 | undisclosed | $180.00 | $60.00 | undisclosed | $270.00 |

Regional processing (data residency) endpoints: 10% uplift on all models listed above.

Source: OpenAI – Pricing
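For gpt-5.5, long-context rates double input cost and add 50% to output. The pricing page does not state the context size at which long-context billing applies, so the 400K threshold in this sketch is purely a placeholder assumption:

```python
# Sketch: when long-context rates might matter for gpt-5.5. The cutoff at
# which OpenAI switches to long-context billing is not published; 400_000
# below is a placeholder assumption for illustration only.
LONG_CONTEXT_THRESHOLD = 400_000  # hypothetical cutoff, in input tokens

RATES = {  # ($/MTok input, $/MTok output) for gpt-5.5
    "standard": (5.00, 30.00),
    "long": (10.00, 45.00),
}

def request_cost(tok_in, tok_out):
    """Dollar cost of one request under the assumed threshold."""
    tier = "long" if tok_in > LONG_CONTEXT_THRESHOLD else "standard"
    r_in, r_out = RATES[tier]
    return (tok_in * r_in + tok_out * r_out) / 1e6, tier

print(request_cost(100_000, 5_000))  # billed at standard rates
print(request_cost(800_000, 5_000))  # billed at long-context rates
```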

Specialized Models (Standard, per 1M tokens)

| Category | Model | Input ($/MTok) | Cached Input ($/MTok) | Output ($/MTok) |
|---|---|---|---|---|
| Codex | gpt-5.3-codex | $1.75 | $0.175 | $14.00 |
| Codex Priority | gpt-5.3-codex | $3.50 | $0.35 | $28.00 |
| ChatGPT | gpt-5.3-chat-latest | $1.75 | $0.175 | $14.00 |
| Deep research | o3-deep-research | $5.00 | undisclosed | $20.00 |
| Deep research | o4-mini-deep-research | $1.00 | undisclosed | $4.00 |
| Computer use | computer-use-preview | $1.50 | undisclosed | $6.00 |

Batch and Flex Pricing (50% of standard)

| Model | Batch Input ($/MTok) | Batch Output ($/MTok) | Flex Input ($/MTok) | Flex Output ($/MTok) |
|---|---|---|---|---|
| gpt-5.5 | $2.50 | $15.00 | $2.50 | $15.00 |
| gpt-5.4 | $1.25 | $7.50 | $1.25 | $7.50 |
| gpt-5.4-mini | $0.375 | $2.25 | $0.375 | $2.25 |
| gpt-5.3-codex | undisclosed | undisclosed | undisclosed | undisclosed |

Priority Pricing (2x to 2.5x standard)

| Model | Priority Input ($/MTok) | Priority Output ($/MTok) |
|---|---|---|
| gpt-5.5 | $12.50 | $75.00 |
| gpt-5.4 | $5.00 | $30.00 |
| gpt-5.4-mini | $1.50 | $9.00 |

Note: the listed rates work out to 2.5x standard for gpt-5.5 but 2x standard for gpt-5.4 and gpt-5.4-mini.

Other API Pricing

  • Web search: $10.00 per 1,000 calls + content tokens billed at model rates
  • Web search preview (reasoning models): $10.00 per 1,000 calls
  • Web search preview (non-reasoning): $25.00 per 1,000 calls
  • Containers (hosted shell/code interpreter): $0.03 (1GB) to $1.92 (64GB) per 20-minute session
  • File search storage: $0.10/GB per day (1 GB free)
  • File search tool call: $2.50 per 1,000 calls
  • Fine-tuning (o4-mini): $100.00/hour training, $4.00/$16.00 per MTok inference (50% with data sharing)
  • Embeddings: undisclosed (not listed on pricing page for current models)

Source: OpenAI – Pricing
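The per-call and per-GB line items above are easy to underestimate at scale. A worked example for one hypothetical month (the workload numbers are invented; the unit prices are the ones quoted above):

```python
# Monthly cost of the tool-use line items for an invented workload.
web_search_calls = 50_000   # $10.00 per 1,000 calls
file_search_calls = 20_000  # $2.50 per 1,000 calls
storage_gb = 11             # $0.10/GB per day, first 1 GB free
container_sessions = 200    # 1GB containers at $0.03 per 20-minute session

cost = (
    web_search_calls / 1_000 * 10.00
    + file_search_calls / 1_000 * 2.50
    + max(storage_gb - 1, 0) * 0.10 * 30   # 30-day month
    + container_sessions * 0.03
)
print(f"${cost:.2f}")  # excludes content tokens billed at model rates
```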

Model Performance / Benchmarks

| Benchmark | GPT-5.5 | GPT-5.4 | Claude Opus 4.7 |
|---|---|---|---|
| Terminal-Bench 2.0 | 82.7% | 75.1% | 69.4% |
| SWE-Bench Pro | 58.6% | 57.7% | 64.3% |
| Expert-SWE (internal, median 20h tasks) | 73.1% | 68.5% | - |
| GDPval (wins or ties) | 84.9% | 83.0% | 80.3% |
| OSWorld-Verified | 78.7% | 75.0% | 78.0% |
| ARC-AGI-2 (Verified) | 85.0% | 73.3% | 75.8% |
| BrowseComp | 82.7% | - | - |
| BrowseComp (Pro variants) | 90.1% (GPT-5.5 Pro) | 89.3% (GPT-5.4 Pro) | - |

Source: OpenAI – Introducing GPT-5.5

Note: Cross-supplier scores in this table are from the GPT-5.5 announcement. The Anthropic report should be treated as the primary source for Opus 4.7 benchmarks.

Latest News

GPT-5.5 Launch (April 23, 2026)

OpenAI released GPT-5.5, its new flagship model, to Plus, Pro, Business, and Enterprise users in ChatGPT and Codex. API access followed on April 24.

Key benchmarks:

  • Terminal-Bench 2.0: 82.7% (vs GPT-5.4 at 75.1%, Claude Opus 4.7 at 69.4%)
  • SWE-Bench Pro: 58.6% (vs GPT-5.4 at 57.7%, Claude Opus 4.7 at 64.3%)
  • Expert-SWE (internal, median 20-hour tasks): 73.1% (vs GPT-5.4 at 68.5%)
  • GDPval (wins or ties): 84.9% (vs GPT-5.4 at 83.0%, Claude Opus 4.7 at 80.3%)
  • OSWorld-Verified: 78.7% (vs GPT-5.4 at 75.0%, Claude Opus 4.7 at 78.0%)
  • ARC-AGI-2 (Verified): 85.0% (vs GPT-5.4 at 73.3%, Claude Opus 4.7 at 75.8%)

Notable claims: GPT-5.5 uses "significantly fewer tokens" than GPT-5.4 for the same Codex tasks while achieving higher-quality results. Served on NVIDIA GB200 and GB300 NVL72 systems. Matches GPT-5.4 per-token latency despite being more capable. OpenAI claims it delivers "state-of-the-art intelligence at half the cost of competitive frontier coding models" on the Artificial Analysis Coding Index.

Cybersecurity capabilities rated "High" under OpenAI's Preparedness Framework. Stricter classifiers deployed for cyber risk, which "some users may find annoying initially." Trusted Access for Cyber program launched with expanded GPT-5.5 access for verified security researchers.

GPT-5.5 Pro (the higher-tier variant) is available to Pro, Business, and Enterprise users in ChatGPT. On BrowseComp: 90.1% (vs GPT-5.4 Pro at 89.3%). Priced at $30/$180 per MTok via API.

Over 85% of OpenAI's own employees reportedly use Codex every week across engineering, finance, comms, marketing, data science, and product management.

Source: OpenAI – Introducing GPT-5.5

Microsoft Partnership Amendment (April 27, 2026)

OpenAI and Microsoft announced an amended partnership agreement. Key terms:

  • Microsoft remains OpenAI's primary cloud partner; OpenAI products ship first on Azure unless Microsoft chooses not to support necessary capabilities
  • OpenAI can now serve products across any cloud provider
  • Microsoft's license to OpenAI IP is now non-exclusive, extending through 2032
  • Microsoft no longer pays a revenue share to OpenAI
  • Revenue share payments from OpenAI to Microsoft continue through 2030 at the same percentage but subject to a total cap
  • Microsoft remains a major shareholder

This restructuring enabled the same-day AWS partnership announcement.

Source: OpenAI – Next Phase Of Microsoft Partnership

OpenAI Models, Codex, and Managed Agents on AWS (April 28, 2026)

OpenAI and AWS launched a strategic partnership with three components in limited preview:

  1. OpenAI models (including GPT-5.5) on Amazon Bedrock
  2. Codex on AWS (configure Codex to use Bedrock as provider; supports CLI, desktop app, VS Code extension)
  3. Amazon Bedrock Managed Agents, powered by OpenAI

Customers can apply Codex usage toward their AWS cloud commitments. All customer data is processed within Amazon Bedrock.

Source: OpenAI – OpenAI on AWS

SWE-bench Verified Critique (April 25, 2026)

OpenAI published a blog post arguing that SWE-bench Verified no longer measures frontier coding capabilities. Key findings from OpenAI's audit:

  • At least 59.4% of the 27.6% subset of problems OpenAI audited (so at least 16.4% of the total) have flawed test cases that reject functionally correct submissions
  • Some tests are "too narrow" (require exact function signatures not mentioned in the problem) or "too wide" (accept empty solutions)
  • Frontier models have memorized the underlying PRs from training data, making it impossible to distinguish reasoning from recall

OpenAI is instead evaluating on SWE-Bench Pro, Terminal-Bench 2.0, and Expert-SWE (internal).

SWE-bench co-creator Ofir Press responded on HN: "SWE-bench Verified is now saturated at 93.9% (congrats Anthropic), but anyone who hasn't reached that number yet still has more room for growth." New benchmarks in development include SWE-bench Multilingual, SWE-bench Multimodal, CodeClash.ai, and AlgoTune.io.

Source: OpenAI – Why We No Longer Evaluate SWE-bench Verified

Symphony: Open-Source Orchestration Spec (April 27, 2026)

OpenAI released Symphony, an open-source specification for Codex orchestration that defines how multiple Codex agents coordinate and manage tasks. HN reception: 22 points, 2 comments.

Source: OpenAI – Open Source Codex Orchestration Symphony

OpenAI Principles (April 26, 2026)

OpenAI published "Our Principles," a company-wide statement of values and operational philosophy. HN reception: 88 points, 104 comments, with mixed reactions about whether this signals a shift in organizational culture.

Source: OpenAI – Our Principles

Workspace Agents in ChatGPT (April 22, 2026)

New "workspace agents" feature allowing teams to deploy agents within ChatGPT that can use tools, access company data, and operate across business applications. HN reception: 161 points, 64 comments.

Source: OpenAI – Introducing Workspace Agents in ChatGPT

Codex for Almost Everything (April 17, 2026)

Blog post expanding Codex beyond coding into document generation, spreadsheets, slide presentations, financial modeling, and knowledge work. Claims 4+ million weekly Codex users. HN reception: 1001 points, 559 comments.

Source: OpenAI – Codex For Almost Everything

OpenAI Privacy Filter (April 17, 2026)

Introduction of a privacy filter for ChatGPT. HN reception: 294 points, 66 comments, with discussion about data handling practices.

Source: OpenAI – Introducing OpenAI Privacy Filter

Axios Developer Tool Compromise Response (April 17, 2026)

OpenAI's security response to a compromise of the Axios developer tool. HN reception: 102 points, 60 comments.

Source: OpenAI – Axios Developer Tool Compromise

ChatGPT Images 2.0 (April 21, 2026)

Major update to ChatGPT's image generation capabilities. HN reception: 1046 points, 974 comments (highest-engagement OpenAI post this month after GPT-5.5).

Source: OpenAI – Introducing ChatGPT Images 2.0

Community Signals

GPT-5.5 Reception on Hacker News

The GPT-5.5 announcement thread is the largest OpenAI thread this month: 1,575 points, 1,052 comments.

Positive signals:

  • Early testers praise coding improvements. Dan Shipper (CEO of Every) described GPT-5.5 as "the first coding model I've used that has serious conceptual clarity."
  • Pietro Schirano (CEO of MagicPath) reported GPT-5.5 successfully merging a branch with hundreds of frontend changes in one shot in about 20 minutes.
  • NVIDIA engineer quoted: "Losing access to GPT-5.5 feels like I've had a limb amputated."
  • Michael Truell (Cursor CEO): "GPT-5.5 is noticeably smarter and more persistent than GPT-5.4, with stronger coding performance and more reliable tool use."

Negative signals:

  • User endymi0n reported severe motivation/laziness issues with GPT-5.4 API at xhigh reasoning effort: "I literally wasn't able to convince the model to WORK, on a quick, safe and benign subtask that later GLM, Kimi and Minimax succeeded on without issues."
  • Multiple commenters report models stopping mid-task and apologizing instead of continuing work. User jurgenburgen noted: "I've noticed that cursing and being rude makes the models stop being lazy."
  • Cyber safeguards causing over-refusals for legitimate security research tasks.

Source: Hacker News – Item

SWE-bench Debate

OpenAI's SWE-bench critique generated 340 points and 180 comments on HN. Key community reactions:

  • SWE-bench co-creator Ofir Press confirmed saturation at 93.9% and announced new benchmarks in development.
  • User stingraycharles: "You can trust that a model that scores 40% vs a model that scores 90% is indeed worse. You can't trust it that a model that scores 93% is better at software engineering than a model that scores 90%."
  • User dannyw: "It's honestly far better to just ignore SWEBench Verified in 2026. Multiple labs have noted issues with contamination."
  • Community consensus: public benchmarks are contaminated at the frontier. Private or semi-private benchmarks (ARC-AGI-3, CodeClash) are more reliable but less accessible.

Source: Hacker News – Item

Codex Expansion Reception

"Codex for almost everything" thread: 1,001 points, 559 comments.

  • Strong interest in non-coding use cases (document generation, spreadsheets, research)
  • Security concerns about granting agents full computer access
  • User cjbarber predicted "professional agents for non-technical users will be one of the most important and fastest growing product categories of all time"
  • Skepticism about corporate IT departments allowing arbitrary code execution
  • Discussion about agent security: "Any data now becomes effectively an executable" (teiferer)

Source: Hacker News – Item

Microsoft Partnership Restructuring

70 points, minimal discussion on HN. The revenue share elimination and non-exclusive licensing terms received more attention in financial media than developer communities. The practical impact: OpenAI can now serve customers through AWS and other cloud providers, breaking Azure exclusivity.

Source: Hacker News – Item

Pricing and Value Concerns

  • GPT-5.5 API pricing at $5/$30 per MTok (standard) is double GPT-5.4's $2.50/$15.00 on both input and output. OpenAI argues token efficiency gains offset the higher per-token price.
  • GPT-5.5 Pro at $30/$180 per MTok is 6x the cost of GPT-5.5 standard for higher accuracy.
  • Business Codex with no fixed seat fee and per-token billing is a new model that competes directly with per-seat coding agent subscriptions.
  • The promotional doubling of Codex usage on Plus (until May 31, 2026) and the 5x-to-10x boost on Pro $100 suggest OpenAI is subsidizing usage to drive adoption before potentially tightening limits.
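The token-efficiency question above can be made concrete: because GPT-5.5 doubled both the input and output rate relative to GPT-5.4, it must complete the same task in at most half the tokens to break even on cost. A sketch of that arithmetic (the task footprint is invented for illustration):

```python
# Break-even token efficiency for GPT-5.5 vs GPT-5.4 at standard API rates.
# Both input and output rates doubled ($5/$30 vs $2.50/$15), so the
# break-even is a uniform 50% token reduction; this makes it explicit.
GPT55 = (5.00, 30.00)   # ($/MTok input, $/MTok output)
GPT54 = (2.50, 15.00)

def task_cost(rates, tok_in, tok_out):
    """Dollar cost of a task at the given per-MTok rates."""
    return (tok_in * rates[0] + tok_out * rates[1]) / 1e6

base_in, base_out = 2_000_000, 400_000  # hypothetical GPT-5.4 task footprint
c54 = task_cost(GPT54, base_in, base_out)       # cost on GPT-5.4
c55_same = task_cost(GPT55, base_in, base_out)  # same tokens on GPT-5.5
# Fraction of GPT-5.4's tokens that GPT-5.5 may use and still cost the same
# (cost scales linearly with a uniform token-scale factor):
breakeven = c54 / c55_same
print(breakeven)  # 0.5
```

Any claimed efficiency gain smaller than 50% leaves GPT-5.5 more expensive per task at standard rates.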

Enterprise Readiness

| Feature | Available? | Details |
|---|---|---|
| SSO (SAML) | Yes | Business and Enterprise plans. Source: ChatGPT – Pricing |
| SSO (OIDC) | Undisclosed | Not explicitly mentioned; SAML SSO is listed. |
| SCIM | Yes | Enterprise plan. Source: ChatGPT – Pricing |
| Audit logs | Yes | Enterprise plan. Source: ChatGPT – Pricing |
| IP indemnity | No | Not mentioned on pricing or product pages. |
| Data residency | Yes | 10 regions on Enterprise plan. Source: ChatGPT – Pricing |
| HIPAA | No | Not mentioned on pricing pages. |
| Air-gapped / on-prem | No | Not available; OpenAI models can be consumed via AWS Bedrock or Azure, but on partner infrastructure. |
| SLA | Yes | 24/7 priority support and SLAs on Enterprise plan. Specific uptime percentage not published. Source: ChatGPT – Pricing |
| Admin controls (RBAC) | Yes | RBAC, compliance API, IP allowlisting, spend controls on Enterprise plan. Source: ChatGPT – Pricing |

Transparency Gaps

| Gap | Details | Severity |
|---|---|---|
| Usage limits are ranges, not fixed numbers | Codex limits are stated as wide ranges (e.g., "15-80" GPT-5.5 local messages per 5h for Plus). Actual limits depend on task complexity, which users cannot predict. The five-hour rolling window is not visible during a session without checking /status. | High |
| GPT-5.5 Pro cached input pricing | Not listed on the pricing page for gpt-5.5-pro. Standard input is $30/MTok but cached input is undisclosed. Given caching provides 90% savings on other models, this is a significant unknown for high-volume users. | Medium |
| GPT-5.4-mini and GPT-5.4-nano long context pricing | Long context pricing for the mini and nano models is not listed, and whether these models support long context at all is not documented. | Medium |
| GPT-5.3-Codex Batch and Flex pricing | Not listed in the specialized models pricing table. Whether Codex-specific models support batch and flex processing is unclear. | Medium |
| Pro plan 10x/25x multipliers are promotional | The Pro $100 plan's 10x usage (vs standard 5x) is promotional until May 31, 2026; after that, users drop to 5x with no announced pricing for maintaining 10x. The Pro $200 plan's 25x 5-hour limit is also promotional. | High |
| "Additional weekly limits may apply" | Codex documentation mentions weekly limits on top of the 5-hour windows but provides no numbers; users discover them by hitting walls. | High |
| SWE-bench Pro methodology | OpenAI replaced SWE-bench Verified with SWE-bench Pro for its own evaluations but has not published the full SWE-bench Pro dataset or methodology for independent verification. | Medium |
| GPT-5.5 token efficiency claims | OpenAI claims GPT-5.5 uses "significantly fewer tokens" than GPT-5.4 but does not quantify the savings across task types. This matters because the higher per-token cost could negate efficiency gains. | Medium |
| Enterprise custom pricing | Enterprise pricing is entirely custom; no published per-seat or per-token ranges, and volume discount tiers are not documented. | Medium |
| gpt-5.3-chat-latest model identity | Listed in specialized models at $1.75/$14.00 per MTok but not explained; its pricing is identical to gpt-5.3-codex, and the distinction between the two is not documented. | Low |
| Business ChatGPT & Codex per-seat price | The per-user monthly price for the combined Business plan is not clearly listed on the pricing page (the dollar amount is rendered but obscured in the page layout). | Medium |
| Cyber safeguard false positive rate | OpenAI deployed "stricter classifiers for potential cyber risk" with GPT-5.5 but has not published false positive rates or an appeals process for legitimate researchers blocked by the safeguards. | Medium |