Intelligence index
74 / 100
vs. all models: 73rd percentile
Composite of MMLU, GPQA, MATH & HumanEval
Speed
75 tok/s
vs. all models: 36th percentile
Median across providers, steady state
Blended price
$6.00 / 1M tokens
vs. all models: 27th percentile
3:1 input:output blend
At a glance
- Context window: 1M tokens
- Max output: 16k tokens
- Input price: $3.00 / 1M tokens
- Output price: $15.00 / 1M tokens
- Time to first token: 0.6s
- Input modalities: text, image
- Output modalities: text
- License: Proprietary
- Provider: xAI
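The blended price of $6.00 quoted above is consistent with a token-weighted average of the listed input and output prices at a 3:1 ratio. The sketch below shows that arithmetic; the function name `blended_price` and the weighted-mean formula are assumptions made for illustration, since the page does not spell out how the blend is computed.

```python
def blended_price(input_price, output_price, input_parts=3, output_parts=1):
    """Weighted mean price per 1M tokens for a given input:output token ratio.

    Assumption: "3:1 input:output blend" means 3 input tokens for every
    1 output token, averaged by token count.
    """
    total_parts = input_parts + output_parts
    return (input_price * input_parts + output_price * output_parts) / total_parts


# Prices per 1M tokens as listed under "At a glance".
price = blended_price(3.00, 15.00)
print(f"${price:.2f} / 1M tokens")  # (3 * 3.00 + 1 * 15.00) / 4 = 6.00
```

A workload that generates more output than this ratio assumes will pay more per token than the blended figure suggests, so it may be worth re-running the calculation with your own input:output mix.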
Benchmark scores
Public scores from each provider; bars compare this model against the leader in each benchmark.
- MMLU (general knowledge across 57 subjects): 88.0 (leader: 91.8)
- MMLU Pro (harder MMLU successor with more reasoning): 76.0 (leader: 80.0)
- GPQA (graduate-level science Q&A): 62.0 (leader: 78.0)
- MATH (competition mathematics): 88.5 (leader: 94.8)
- HumanEval (Python code generation, pass@1): 90.0 (leader: 95.8)
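One way to read the model-versus-leader comparison is as a simple ratio of this model's score to the leader's score on each benchmark. The sketch below computes that ratio and the absolute gap from the scores listed above; treating the comparison as score divided by leader is an assumption, and the helper name `relative_to_leader` is hypothetical.

```python
# Scores copied from the benchmark list above: (this model, benchmark leader).
scores = {
    "MMLU": (88.0, 91.8),
    "MMLU Pro": (76.0, 80.0),
    "GPQA": (62.0, 78.0),
    "MATH": (88.5, 94.8),
    "HumanEval": (90.0, 95.8),
}


def relative_to_leader(model_score, leader_score):
    """Fraction of the leader's score that the model reaches (assumed metric)."""
    return model_score / leader_score


relative = {name: relative_to_leader(m, l) for name, (m, l) in scores.items()}
gaps = {name: l - m for name, (m, l) in scores.items()}
widest_gap = max(gaps, key=gaps.get)

for name, share in relative.items():
    print(f"{name:10s} {share:6.1%} of leader, gap {gaps[name]:.1f} points")
print("Widest gap:", widest_gap)
```

On these numbers the widest gap is on GPQA, at 16.0 points behind the leader.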
Strengths
- Live X data
- 1M context
- Strong reasoning mode
Weaknesses
- Smaller ecosystem
- Less tool-use tooling
Best for
- Real-time research
- Social-aware apps
Models you should also evaluate
- GPT-5.5 (OpenAI): OpenAI’s 2026 flagship, strongest at reasoning, coding and tool use. Intelligence 82, 95 tok/s, $7.50 / 1M tokens.
- Claude 4 Opus (Anthropic): Anthropic’s 2026 flagship, best-in-class on code and long-horizon agents. Intelligence 81, 50 tok/s, $16.00 / 1M tokens.
- Gemini 2.5 Pro (Google): 2M-token context plus native multimodality, unbeatable for huge docs. Intelligence 78, 110 tok/s, $2.19 / 1M tokens.
Grok 3 — frequently asked questions
Grok 3 is a large language model from xAI, released on 17 February 2025. It offers real-time data via X, competitive reasoning performance, and a 1M-token context window.