Mistral Small 3
Excellent latency / cost ratio — open weights, production-ready.
Intelligence index
52 / 100
vs all models: 5th percentile
Composite of MMLU, GPQA, MATH & HumanEval
Speed
150 tok/s
vs all models: 77th percentile
Median across providers, steady state
Blended price
$0.30 / 1M tokens
vs all models: 82nd percentile
3:1 input:output blend
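The blended figure is consistent with a weighted average of the input and output prices listed under "At a glance", with input counted three times as heavily as output. A minimal sketch of that arithmetic, assuming the 3:1 blend is a simple weighted mean (the function name `blended_price` is illustrative and not part of any provider API):

```python
def blended_price(input_price: float, output_price: float,
                  input_weight: float = 3.0, output_weight: float = 1.0) -> float:
    """Weighted mean of per-1M-token prices (assumed definition of the blend)."""
    total_weight = input_weight + output_weight
    return (input_weight * input_price + output_weight * output_price) / total_weight


# Prices taken from this page, in USD per 1M tokens.
price = blended_price(0.20, 0.60)
print(f"${price:.2f} / 1M tokens")
```

Workloads with a different input-to-output ratio can pass their own weights; an output-heavy workload will see a higher effective price than the headline blend.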
At a glance
- Context window: 32k tokens
- Max output: 8k tokens
- Input price: $0.20 / 1M tokens
- Output price: $0.60 / 1M tokens
- Time to first token: 0.3 s
- Input modalities: text
- Output modalities: text
- License: Open source
- Provider: Mistral
Benchmark scores
Public scores from each provider; bars compare this model against the leader in each benchmark.
- MMLU (general knowledge across 57 subjects): 81.0, leader: 91.8
- MMLU Pro (harder MMLU successor with more reasoning): 66.0, leader: 80.0
- GPQA (graduate-level science Q&A): 45.0, leader: 78.0
- MATH (competition mathematics): 70.6, leader: 94.8
- HumanEval (Python code generation, pass@1): 84.5, leader: 95.8
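One way to read the scores against the leader is as a fraction of the leading score on each benchmark. The sketch below assumes that simple ratio; the page does not state how its bars are scaled, so `share_of_leader` is an illustrative helper rather than the site's method.

```python
# (this model's score, leader's score), copied from the list above.
SCORES = {
    "MMLU": (81.0, 91.8),
    "MMLU Pro": (66.0, 80.0),
    "GPQA": (45.0, 78.0),
    "MATH": (70.6, 94.8),
    "HumanEval": (84.5, 95.8),
}


def share_of_leader(score: float, leader: float) -> float:
    """Fraction of the leading score that this model reaches."""
    return score / leader


for name, (score, leader) in SCORES.items():
    print(f"{name:<10} {share_of_leader(score, leader):6.1%}")

# Benchmark where the gap to the leader is largest in relative terms.
weakest = min(SCORES, key=lambda name: share_of_leader(*SCORES[name]))
print("largest relative gap:", weakest)
```

On these numbers the largest relative gap is on GPQA, while MMLU and HumanEval sit closest to the leader.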
Strengths
- Apache 2.0 license
- Fast
- Cheap
Weaknesses
- Smaller context
Best for
- Self-hosting
- Edge deployment
- Real-time apps
Models you should also evaluate
Meta
Llama 3.1 8B
Tiny but capable; serves at >500 tok/s on Groq.
44 intel, 750 tok/s, $0.06 / 1M
DeepSeek
DeepSeek R1
Open-weights reasoning model that matches o1 at 1/25 the price.
73 intel, 60 tok/s, $0.96 / 1M
OpenAI
GPT-5.5 mini
Production workhorse — GPT-5.5 quality reasoning at fast-tier prices.
68 intel, 180 tok/s, $0.44 / 1M
Mistral Small 3 — frequently asked questions
Mistral Small 3 is a large language model from Mistral, released on 30 January 2025. It is an open-weights, production-ready model with an excellent latency-to-cost ratio.