Gemini 2.0 Flash
1M-token context for pennies — the best $/token deal on the market.
Intelligence index
64 / 100
vs all models: 27th percentile
Composite of MMLU, GPQA, MATH & HumanEval
Speed
220 tok/s
vs all models: 91st percentile
Median across providers, steady state
Blended price
$0.18 / 1M tokens
vs all models: 95th percentile
3:1 input:output blend
At a glance
- Context window: 1M tokens
- Max output: 8k tokens
- Input price: $0.10 / 1M tokens
- Output price: $0.40 / 1M tokens
- Time to first token: 0.3s
- Input modalities: text, image, audio, video
- Output modalities: text
- License: Proprietary
- Provider: Google
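The blended price can be checked against the listed input and output prices. The sketch below assumes the 3:1 blend is a token-weighted average (three input tokens for every output token), and that "1M" and "8k" mean 1,000,000 and 8,000 tokens; the page does not state either explicitly. `blended_price` and `request_cost` are hypothetical helper names used only for illustration.

```python
# Prices copied from the "At a glance" list (USD per 1M tokens).
INPUT_PRICE = 0.10
OUTPUT_PRICE = 0.40


def blended_price(input_price, output_price, input_parts=3, output_parts=1):
    """Weighted average price per 1M tokens for an input:output token mix."""
    total_parts = input_parts + output_parts
    return (input_parts * input_price + output_parts * output_price) / total_parts


def request_cost(input_tokens, output_tokens):
    """Cost in USD of one request at the listed prices."""
    return (input_tokens * INPUT_PRICE + output_tokens * OUTPUT_PRICE) / 1_000_000


blended = blended_price(INPUT_PRICE, OUTPUT_PRICE)
print(f"blended price: ${blended:.3f} per 1M tokens")

# Assumed worst case: a full 1M-token prompt with the maximum 8k-token output.
print(f"full-context request: ${request_cost(1_000_000, 8_000):.4f}")
```

Under these assumptions the blend works out to about $0.175 per 1M tokens, consistent with the listed $0.18 after rounding, and a single full-context request costs roughly $0.10.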
Benchmark scores
Public scores from each provider; bars compare this model against the leader in each benchmark.
- MMLU (general knowledge across 57 subjects): 85.0, leader: 91.8
- MMLU Pro (harder MMLU successor with more reasoning): 70.0, leader: 80.0
- GPQA (graduate-level science Q&A): 49.5, leader: 78.0
- MATH (competition mathematics): 84.0, leader: 94.8
- HumanEval (Python code generation, pass@1): 86.0, leader: 95.8
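To make the comparison against each leader concrete, the scores above can be expressed as a fraction of the leading score. This is a minimal sketch that assumes a simple score-to-leader ratio; the page does not say how its bars are actually scaled, and `relative_to_leader` is a hypothetical helper.

```python
# (model score, leader score) pairs copied from the benchmark scores above.
SCORES = {
    "MMLU": (85.0, 91.8),
    "MMLU Pro": (70.0, 80.0),
    "GPQA": (49.5, 78.0),
    "MATH": (84.0, 94.8),
    "HumanEval": (86.0, 95.8),
}


def relative_to_leader(score, leader):
    """Fraction of the leading score that this model reaches."""
    return score / leader


for name, (score, leader) in SCORES.items():
    ratio = relative_to_leader(score, leader)
    gap = leader - score
    print(f"{name:<10} {score:5.1f} vs {leader:5.1f}  {ratio:.0%} of leader, gap {gap:.1f}")

# The benchmark where the model trails the leader by the largest relative margin.
weakest = min(SCORES, key=lambda n: relative_to_leader(*SCORES[n]))
print("largest relative gap:", weakest)
```

By this measure the largest relative gap is on GPQA, which fits the weaker-reasoning caveat listed under Weaknesses.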
Strengths
- Cheapest 1M-context model
- Very fast
- Multimodal
Weaknesses
- Weaker reasoning than 2.5 Pro
Best for
- High-throughput pipelines
- RAG
- Bulk processing
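For pipeline sizing, the listed speed and time to first token give a rough per-request latency. The sketch below assumes latency is simply time to first token plus output tokens divided by throughput. It ignores rate limits, network overhead, and any growth in first-token latency with long prompts, none of which the page quantifies. `estimated_latency`, `requests_per_hour`, and the `concurrency` parameter are hypothetical.

```python
# Figures copied from the page: median speed and time to first token.
TOKENS_PER_SECOND = 220
TTFT_SECONDS = 0.3


def estimated_latency(output_tokens):
    """Rough seconds per request: time to first token plus generation time."""
    return TTFT_SECONDS + output_tokens / TOKENS_PER_SECOND


def requests_per_hour(output_tokens, concurrency=1):
    """Rough hourly throughput for a given number of parallel requests."""
    return concurrency * 3600 / estimated_latency(output_tokens)


for tokens in (200, 1_000, 8_000):
    print(f"{tokens:>5} output tokens: ~{estimated_latency(tokens):.1f}s per request, "
          f"~{requests_per_hour(tokens):.0f} requests/hour per worker")
```

Treat these numbers as an upper bound for planning: real throughput depends on the provider and on quota limits that are not listed here.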
Models you should also evaluate
Google
Gemini 2.5 Pro
2M-token context + native multimodality — unbeatable for huge docs.
78 intel · 110 tok/s · $2.19 / 1M
OpenAI
GPT-5.5 mini
Production workhorse — GPT-5.5 quality reasoning at fast-tier prices.
68 intel · 180 tok/s · $0.44 / 1M
Google
Gemini 1.5 Pro
The original 2M-context model — still useful for legacy pipelines.
67 intel · 60 tok/s · $2.19 / 1M
Gemini 2.0 Flash — frequently asked questions
Gemini 2.0 Flash is a large language model from Google, released on 5 February 2025.