Google Gemini 3.5 Flash: Pricing, Availability, and First Look at Google's New Agentic Model

· google-gemini-3-5-flash-pricing-agentic-model

Google released Gemini 3.5 Flash at I/O 2026 — a model built for AI agents with pricing starting at $1.50/1M input tokens. Here's what it costs, where it runs, and what it can actually do.

Google Gemini 3.5 Flash: Pricing, Availability, and First Look at Google's New Agentic Model

At Google I/O 2026 (May 19), Google DeepMind introduced Gemini 3.5 Flash — the first model in what they're calling their "frontier intelligence with action" family. It's not just another language model; it's explicitly built for agentic workflows: long-horizon tasks, computer use, tool calling, and complex multi-step operations.

If you've been comparing AI models for building agents or coding assistants, here's everything that matters about Gemini 3.5 Flash — pricing, availability, and how it stacks up.

Gemini 3.5 Flash cover art
Gemini 3.5 Flash cover art

What Makes Gemini 3.5 Flash Different

The key distinction Google is drawing is between models that *answer questions* and models that *execute tasks*. Gemini 3.5 Flash is squarely in the latter camp.

Key specs from the official launch:

Context window: 1M tokens (standard), with >200K long-context pricing tier

Max output: 64K tokens

Multimodal input: Text, image, video, and audio

Knowledge cutoff: January 2026 (training data cutoff)

Computer Use Tool support: Yes — can interact with GUIs programmatically

Function calling: Built-in with function declarations counted in input tokens

The model also supports Google's new Priority service tier for higher throughput and reliability, plus Flex/Batch modes for cost-sensitive workloads.

Gemini 3.5 Flash Pricing (Standard Tier)

All prices are per million tokens (USD):

| Direction | Global Endpoint | Non-Global Endpoint | |-----------|----------------|---------------------| | Input (text, image, video, audio) | $1.50 | $1.65 | | Input (>200K tokens) | $1.50 | $1.65 | | Cached input | $0.15 | $0.165 | | Cached input (>200K) | $0.15 | $0.165 | | Text output (response + reasoning) | $9.00 | $9.90 |

Priority Tier pricing is roughly 1.8x the standard rates ($2.70/$16.20 per 1M input/output tokens globally).

Flex/Batch pricing cuts costs significantly: $0.75 input and $4.50 output per 1M tokens globally. Batch API also offers cached input at just $0.075/1M tokens — among the cheapest rates for any frontier-class model.

Comparison with Competitors

How does Gemini 3.5 Flash compare to other current AI models in the same tier?

Gemini 3.5 Flash → $1.50/$9.00 per 1M tokens (input/output)

Claude Sonnet 4.6 → $3.00/$15.00 per 1M tokens

Claude Haiku 4.5 → $1.00/$5.00 per 1M tokens

GPT-5.4 mini → $0.75/$4.50 per 1M tokens

GPT-5.4 → $2.50/$15.00 per 1M tokens

At $1.50 input / $9.00 output, Gemini 3.5 Flash sits right between the GPT-5.4 mini and Claude Sonnet 4.6 in pricing. But it offers features neither has natively — native Computer Use tool support and 1M token context at standard pricing without surcharge below 200K.

Agentic Capabilities

The "action" in Google's "frontier intelligence with action" tagline comes primarily from three areas:

1. Computer Use Tool. Gemini 3.5 Flash can control GUI applications through the Computer Use API on Vertex AI. This allows agents to navigate web browsers, desktop apps, and cloud consoles. Pricing is based on token consumption of the underlying model — no separate per-action fee.

2. Long-horizon agentic tasks. Google claims the model excels at multi-step workflows that require maintaining context over hundreds of tool calls. The 1M token window supports this directly.

3. Function calling as first-class. Starting with Gemini 3.5, function declarations are included in the input token count, which changes how you calculate costs for tool-heavy agent applications.

Where It's Available

Gemini 3.5 Flash is available through:

Gemini API (ai.google.dev) — standard and free tier

Google Vertex AI — with Priority and Flex/Batch tiers

Global endpoints for most use cases

Non-global endpoints for data residency requirements (pricing slightly higher until July 1, 2026)

Should You Use It?

For developers building agentic applications — especially those already on Google Cloud or using Vertex AI — Gemini 3.5 Flash is a compelling option. The pricing is competitive, the context window is generous, and the Computer Use support is something neither OpenAI nor Anthropic offers at comparable price points.

If you're doing high-volume, cost-sensitive work, the Batch/Flex tier pricing ($0.75/$4.50) is hard to beat at this capability level.

Sources

Google I/O 2026 Keynote — Sundar Pichai

Gemini 3.5: frontier intelligence with action — Google Blog

Google Cloud Vertex AI Pricing

Google I/O 2026 Collection — All announcements

I/O 2026: Welcome to the agentic Gemini era