

Best Multimodal AI Models

Models that natively understand text, images, and beyond.

Models: 11
Providers: 4
Categories: 17
Updated: 2026-06

Multimodal models

11 models matched.

Model              Provider   Intelligence  Speed      Latency  Context  Input   Output  Blended
GPT-5.5            OpenAI     82            95 tok/s   0.42s    400k     $5.00   $15.00  $7.50
Claude 4 Opus      Anthropic  81            50 tok/s   1.4s     500k     $8.00   $40.00  $16.00
Gemini 2.5 Pro     Google     78            110 tok/s  0.7s     2M       $1.25   $5.00   $2.19
Claude 4 Sonnet    Anthropic  75            95 tok/s   0.85s    500k     $3.00   $15.00  $6.00
Grok 3             xAI        74            75 tok/s   0.6s     1M       $3.00   $15.00  $6.00
GPT-4o             OpenAI     72            110 tok/s  0.4s     128k     $2.50   $10.00  $4.38
Claude 3.5 Sonnet  Anthropic  71            85 tok/s   0.9s     200k     $3.00   $15.00  $6.00
GPT-5.5 mini       OpenAI     68            180 tok/s  0.28s    400k     $0.25   $1.00   $0.44
Gemini 1.5 Pro     Google     67            60 tok/s   0.9s     2M       $1.25   $5.00   $2.19
Gemini 2.0 Flash   Google     64            220 tok/s  0.3s     1M       $0.10   $0.40   $0.18
GPT-4o mini        OpenAI     56            145 tok/s  0.32s    128k     $0.15   $0.60   $0.26

Prices are USD per 1M tokens unless noted otherwise. Estimates are marked with *.
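The page does not say how the third dollar figure in each row is derived. In every row, however, it matches a 3:1 weighted average of the input and output prices (three input tokens for each output token), rounded to the cent. The sketch below reproduces that arithmetic and also shows how a per-1M-token price translates into the cost of a single request. The 3:1 weighting is inferred from the numbers rather than documented, and the function names and example token counts are hypothetical.

```python
from decimal import Decimal, ROUND_HALF_UP

CENT = Decimal("0.01")
TOKENS_PER_UNIT = Decimal(1_000_000)  # prices are quoted per 1M tokens


def blended_price(input_price: str, output_price: str,
                  input_weight: int = 3, output_weight: int = 1) -> Decimal:
    """Weighted average of input and output prices (USD per 1M tokens).

    The default 3:1 weighting is an assumption inferred from the table,
    not a formula stated by the page.
    """
    total = Decimal(input_price) * input_weight + Decimal(output_price) * output_weight
    blended = total / (input_weight + output_weight)
    # Round half up so that 0.175 becomes 0.18, as in the table.
    return blended.quantize(CENT, rounding=ROUND_HALF_UP)


def request_cost(input_tokens: int, output_tokens: int,
                 input_price: str, output_price: str) -> Decimal:
    """Unrounded cost in USD of one request, given per-1M-token prices."""
    cost_in = Decimal(input_tokens) / TOKENS_PER_UNIT * Decimal(input_price)
    cost_out = Decimal(output_tokens) / TOKENS_PER_UNIT * Decimal(output_price)
    return cost_in + cost_out


if __name__ == "__main__":
    # GPT-5.5 row: $5.00 input, $15.00 output
    print(blended_price("5.00", "15.00"))
    # Hypothetical request: 12,000 input tokens and 800 output tokens
    print(request_cost(12_000, 800, "5.00", "15.00"))
```

Decimal arithmetic is used instead of floats so that values falling exactly on a half cent, such as the Gemini 2.0 Flash row, round the same way the table does.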

Frequently asked questions

The intelligence score is a 0–100 composite of public benchmarks (MMLU, MMLU Pro, GPQA, MATH, HumanEval). Higher is better.
