Llama 3.1 405B
The original "open-source GPT-4" — largest publicly-released weights.
Intelligence index
64 / 100
vs all models: 27th percentile
Composite of MMLU, GPQA, MATH & HumanEval
Speed
32 tok/s
vs all models: 0th percentile
Median across providers, steady state
Blended price
$2.70 / 1M tokens
vs all models: 41st percentile
3:1 input:output blend
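The blended figure can be reproduced from the input and output prices listed under "At a glance". The sketch below assumes that a "3:1 input:output blend" means a token-weighted average of three input tokens for every output token; the function name `blended_price` is illustrative and not part of any provider API.

```python
def blended_price(input_price: float, output_price: float,
                  input_weight: float = 3.0, output_weight: float = 1.0) -> float:
    """Token-weighted average price per 1M tokens.

    Assumption: a 3:1 blend weights the input price three times as
    heavily as the output price.
    """
    total_weight = input_weight + output_weight
    return (input_weight * input_price + output_weight * output_price) / total_weight


# Prices per 1M tokens taken from this page.
price = blended_price(input_price=2.70, output_price=2.70)
print(f"${price:.2f} / 1M tokens")
```

Because input and output are priced identically here, the blend equals either price; the weighting only matters for models whose output tokens cost more than their input tokens.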
At a glance
- Context window: 128k tokens
- Max output: 4k tokens
- Input price: $2.70 / 1M tokens
- Output price: $2.70 / 1M tokens
- Time to first token: 0.7s
- Input modalities: text
- Output modalities: text
- License: Open source
- Provider: Meta
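The speed, time-to-first-token and max-output figures together give a rough idea of how long a single response takes. This is a minimal sketch, assuming throughput stays constant at the median 32 tok/s after the first token and treating "4k tokens" as 4,000; `estimate_latency_seconds` is a hypothetical helper, not a measured result.

```python
def estimate_latency_seconds(output_tokens: int,
                             tokens_per_second: float = 32.0,
                             time_to_first_token: float = 0.7) -> float:
    """Rough wall-clock time for one response.

    Assumption: constant generation speed once the first token arrives.
    """
    if tokens_per_second <= 0:
        raise ValueError("tokens_per_second must be positive")
    return time_to_first_token + output_tokens / tokens_per_second


# Assumption: "4k tokens" is treated as 4,000 tokens.
full_response = estimate_latency_seconds(4000)
print(f"{full_response:.1f} s")
```

Under these assumptions a maximum-length response takes roughly two minutes, which is worth keeping in mind for interactive use.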
Benchmark scores
Public scores from each provider; bars compare this model against the leader in each benchmark.
MMLU
General knowledge across 57 subjects
88.6
leader: 91.8
MMLU Pro
Harder MMLU successor with more reasoning
73.3
leader: 80.0
GPQA
Graduate-level science Q&A
51.1
leader: 78.0
MATH
Competition mathematics
73.8
leader: 94.8
HumanEval
Python code generation pass@1
89.0
leader: 95.8
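Since the bars compare each score against the benchmark leader, the gaps can be expressed as a fraction of the leader's score. The sketch below assumes the comparison is a simple ratio of score to leader score; the page does not state how the bars are scaled, so treat that as an assumption.

```python
# (model score, leader score) copied from the benchmark list above.
SCORES = {
    "MMLU": (88.6, 91.8),
    "MMLU Pro": (73.3, 80.0),
    "GPQA": (51.1, 78.0),
    "MATH": (73.8, 94.8),
    "HumanEval": (89.0, 95.8),
}


def fraction_of_leader(score: float, leader: float) -> float:
    """Score as a fraction of the leader's score (assumed bar scaling)."""
    return score / leader


ratios = {name: fraction_of_leader(score, leader)
          for name, (score, leader) in SCORES.items()}

# Print benchmarks from widest to narrowest gap.
for name, ratio in sorted(ratios.items(), key=lambda item: item[1]):
    print(f"{name:<10} {ratio:6.1%}")

weakest = min(ratios, key=ratios.get)
```

On this measure GPQA shows the widest gap, at about two thirds of the leader's score, while MMLU and HumanEval are within a few points of the leader.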
Strengths
- Open weights at frontier scale
- No usage limits
Weaknesses
- Slow
- Expensive to host (needs 8×H100)
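The hosting requirement follows from the size of the weights. The sketch below is a back-of-the-envelope estimate that counts weights only (no KV cache or activations) and assumes the 80 GB H100 variant; the helper names are illustrative.

```python
PARAMS = 405e9          # parameter count implied by the model name
H100_MEMORY_GB = 80     # assumption: 80 GB H100 variant
GPUS_PER_NODE = 8


def weights_gb(params: float, bytes_per_param: float) -> float:
    """Memory needed for the weights alone, in GB (1 GB = 1e9 bytes)."""
    return params * bytes_per_param / 1e9


def fits_on_node(required_gb: float,
                 gpus: int = GPUS_PER_NODE,
                 gpu_memory_gb: float = H100_MEMORY_GB) -> bool:
    """True if the weights fit in the node's combined GPU memory."""
    return required_gb <= gpus * gpu_memory_gb


for label, bytes_per_param in [("16-bit", 2), ("8-bit", 1)]:
    required = weights_gb(PARAMS, bytes_per_param)
    print(f"{label}: {required:.0f} GB, fits on 8xH100: {fits_on_node(required)}")
```

Under these assumptions the 16-bit weights alone (810 GB) exceed the 640 GB of a single 8×H100 node, while 8-bit weights (405 GB) fit with memory left over for the KV cache.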
Best for
- Research
- Distillation
- Custom fine-tunes
Models you should also evaluate
Meta
Llama 3.3 70B
Open-weights 70B that matches GPT-4o on most benchmarks.
66 intel | 200 tok/s | $0.27 / 1M tokens
DeepSeek
DeepSeek R1
Open-weights reasoning model that matches o1 at 1/25 the price.
73 intel | 60 tok/s | $0.96 / 1M tokens
DeepSeek
DeepSeek V3
Frontier-class quality at fast-tier prices — and open weights.
67 intel | 90 tok/s | $0.48 / 1M tokens
Llama 3.1 405B — frequently asked questions
Llama 3.1 405B is a large language model from Meta, released on 23 July 2024. It is described as the original "open-source GPT-4", with the largest publicly released weights.