
Meta · Open source · Jul 2024

Llama 3.1 8B

Tiny but capable; serves at >500 tok/s on Groq.

Intelligence index: 44 / 100 (0th percentile vs all models)
Composite of MMLU, GPQA, MATH & HumanEval

Speed: 750 tok/s (95th percentile vs all models)
Median across providers, steady state

Blended price: $0.06 / 1M tokens (100th percentile vs all models)
3:1 input:output blend
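
A minimal sketch of how a 3:1 input:output blend could be computed from the listed input and output prices. The function name `blended_price` and its weight parameters are illustrative assumptions, not something the page defines; the page states only the ratio and the resulting figure.

```python
def blended_price(input_price: float, output_price: float,
                  input_weight: float = 3.0, output_weight: float = 1.0) -> float:
    """Weighted average price per 1M tokens for a given input:output ratio."""
    total_weight = input_weight + output_weight
    return (input_price * input_weight + output_price * output_weight) / total_weight


# Listed prices for Llama 3.1 8B, in USD per 1M tokens.
price = blended_price(input_price=0.06, output_price=0.06)
print(f"${price:.2f} / 1M tokens")
```

Because input and output are priced identically here, the blend equals the per-token price regardless of the ratio; the weighting only matters for models whose output tokens cost more than input tokens.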

At a glance

Context window: 128k tokens
Max output: 4k tokens
Input price: $0.06 / 1M tokens
Output price: $0.06 / 1M tokens
Time to first token: 0.2s
Input modalities: text
Output modalities: text
License: Open source
Provider: Meta
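
The time-to-first-token and speed figures can be combined into a rough estimate of how long a full response takes. The sketch below assumes a simple linear model (first-token delay plus output tokens divided by steady-state speed) and ignores queueing, prompt length and provider variation; the function name and defaults are hypothetical, with the defaults taken from the figures listed on this page.

```python
def estimate_response_seconds(output_tokens: int,
                              time_to_first_token: float = 0.2,
                              tokens_per_second: float = 750.0) -> float:
    """Rough end-to-end latency: first-token delay plus streaming time."""
    if tokens_per_second <= 0:
        raise ValueError("tokens_per_second must be positive")
    return time_to_first_token + output_tokens / tokens_per_second


# Assumption: "4k tokens" max output is treated as 4000 tokens.
for n in (100, 1000, 4000):
    print(f"{n:>5} tokens: ~{estimate_response_seconds(n):.2f} s")
```

Real latency depends on the provider and load, so treat the result as an order-of-magnitude guide rather than a guarantee.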

Benchmark scores

Public scores from each provider; bars compare this model against the leader in each benchmark.

MMLU: 73.0 (leader: 91.8)
General knowledge across 57 subjects

MMLU Pro: 48.3 (leader: 80.0)
Harder MMLU successor with more reasoning

GPQA: 30.4 (leader: 78.0)
Graduate-level science Q&A

MATH: 51.9 (leader: 94.8)
Competition mathematics

HumanEval: 72.6 (leader: 95.8)
Python code generation pass@1
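
One way to read the comparison against the leader is as a simple ratio of this model's score to the leading score. The sketch below makes that assumption; the page does not say how its bars are scaled, and the helper name `share_of_leader` is illustrative.

```python
def share_of_leader(score: float, leader: float) -> float:
    """Fraction of the leading score that this model reaches."""
    if leader <= 0:
        raise ValueError("leader score must be positive")
    return score / leader


# (model score, leader score) pairs as listed for Llama 3.1 8B.
benchmarks = {
    "MMLU": (73.0, 91.8),
    "MMLU Pro": (48.3, 80.0),
    "GPQA": (30.4, 78.0),
    "MATH": (51.9, 94.8),
    "HumanEval": (72.6, 95.8),
}

for name, (score, leader) in benchmarks.items():
    print(f"{name:<10} {score:5.1f} vs {leader:5.1f}  {share_of_leader(score, leader):.0%} of leader")
```

On this reading the gap is smallest on MMLU and HumanEval and largest on GPQA, in line with the "weak on hard reasoning" note.
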
Strengths
  • Extremely fast
  • Cheapest production-grade model
  • Open weights
Weaknesses
  • Weak on hard reasoning
Best for
  • Real-time UX
  • Classification
  • Edge deployment

Llama 3.1 8B — frequently asked questions

Llama 3.1 8B is a large language model from Meta, released on 23 July 2024.
