Market Overview
June 2026 marks a turning point in AI API pricing. Budget models now start under $0.10 per million tokens, two legacy Anthropic models were retired mid-month, and the quality gap between mid-tier and premium models has narrowed dramatically. Teams can save 60–80% by choosing the right model for each task.
⚠️ Deprecation Alert: Claude 4 Opus ($15/$75 per 1M tokens) and Claude Sonnet 4 ($3/$15 per 1M tokens) were retired on June 15, 2026. Migrate to Claude Opus 4.7/4.8 ($5/$25) — 67% cheaper with better quality and 1M context.
Budget Tier — Under $0.60/1M Input
These models handle most everyday tasks at rock-bottom prices:
| Model | Provider | Input / 1M | Output / 1M | Context |
|---|---|---|---|---|
| Gemini 2.0 Flash Lite | $0.075 | $0.30 | 1M | |
| GPT-oss 20B | OpenAI | $0.08 | $0.35 | 128K |
| Llama 3.1 8B | Meta (Together.ai) | $0.10 | $0.10 | 128K |
| Gemini 2.0 Flash | $0.10 | $0.40 | 1M | |
| DeepSeek V4 Flash | DeepSeek | $0.14 | $0.28 | 1M |
| GPT-4o mini | OpenAI | $0.15 | $0.60 | 128K |
| Llama 4 Maverick | Meta (Together.ai) | $0.20 | $0.60 | 10M |
| DeepSeek V3 | DeepSeek | $0.27 | $1.10 | 128K |
OpenAI GPT-5.5 Family
The biggest story from OpenAI is the GPT-5.5 family launch:
- GPT-5.5 ($5.00 input / $30.00 output, $0.50 cached) — new flagship with 1,050,000-token context
- GPT-5.5 Pro ($30.00 input / $180.00 output) — highest-precision variant, no cached discount
- GPT-5.4 ($2.50 input / $15.00 output) — cheaper frontier alternative
- GPT-5.4 mini ($0.75 input / $4.50 output) — best price/performance for production
- GPT-5.4 nano ($0.20 input / $1.25 output) — budget option for high-volume tasks
OpenAI still offers $5 free credits for new accounts.
Anthropic Claude Pricing
With the retirement of Claude 4 Opus and Sonnet 4, Anthropic's current lineup is:
- Claude Opus 4.8 ($5/$25 per 1M) — strongest generally available model
- Claude Sonnet 4.6 ($3/$15 per 1M) — same price as retired Sonnet 4, better quality
- Claude Fable 5 ($10/$50 per 1M) — Mythos-class, US-only, premium reasoning
- Claude Haiku 4.5 ($0.25/$1.25 per 1M) — fastest/cheapest
New API users get $5 free credits.
Google Gemini
Google's free tier is unmatched: unlimited rate-limited access to Gemini 2.0 Flash and 2.5 Pro models. No credit card required. For paid API: Gemini 2.0 Flash at $0.10/$0.40 is the most price-efficient option in the budget tier.
Mixed Provider Landscape
Other notable pricing: xAI's Grok Build 0.1 at $0.30/$0.50 with 256K context, Together.ai offering Llama 4 Scout at $0.11/$0.34 with 10M context, and DeepSeek V4 Pro at $0.44/$0.87.
Key Takeaway
The June 2026 pricing environment is the most competitive ever. With models from $0.075/M tokens, teams should route tasks to the appropriate tier: budget models for classification/routing, mid-tier for generation, and premium for high-stakes reasoning. The days of defaulting to the most expensive model are over.