OpenClaw Model Rankings
Top 10 of 256 qualifying models
| Rank | Model | Provider | OpenClaw Score | Intelligence | Coding | Capability | Terminal-Bench Hard | Terminal-Bench Score | Latency (TTFT) | Speed Score | Run Cost | Input ($/1M) | Output ($/1M) | Value |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 1 | GPT-5.4 mini (xhigh) | OpenAI | 77.3 | 48.9 | 51.5 | 49.2 | 52.3% | 100.0 | 12.02 s | 28.5 | $0.027 | $0.750 | $4.500 | 25.9 |
| 2 | KAT Coder Pro V2 | KwaiKAT | 76.9 | 43.8 | 45.6 | 44.0 | 49.2% | 100.0 | 1.45 s | 81.5 | $0.0084 | $0.300 | $1.200 | 54.1 |
| 3 | Gemini 3.1 Pro Preview | | 76.4 | 57.2 | 55.5 | 57.0 | 53.8% | 100.0 | 24.34 s | 7.3 | $0.072 | $2.000 | $12.000 | 15.9 |
| 4 | DeepSeek V4 Pro (Reasoning, Max Effort) | DeepSeek | 75.1 | 51.5 | 47.5 | 51.1 | 46.2% | 94.9 | 1.13 s | 86.0 | $0.035 | $1.740 | $3.480 | 22.6 |
| 5 | GPT-5.4 (xhigh) | OpenAI | 74.8 | 56.8 | 57.2 | 56.8 | 57.6% | 100.0 | 209.59 s | 1.0 | $0.090 | $2.500 | $15.000 | 13.5 |
| 6 | GPT-5.3 Codex (xhigh) | OpenAI | 74.6 | 53.6 | 53.1 | 53.6 | 53.0% | 100.0 | 76.09 s | 1.0 | $0.077 | $1.750 | $14.000 | 13.6 |
| 7 | Kimi K2.6 | Kimi | 74.5 | 53.9 | 47.1 | 53.2 | 43.9% | 90.3 | 1.22 s | 84.7 | $0.027 | $0.950 | $4.000 | 28.4 |
| 8 | MiMo-V2.5-Pro | Xiaomi | 74.0 | 53.8 | 45.5 | 53.0 | 43.2% | 88.7 | 2.05 s | 74.5 | $0.024 | $1.000 | $3.000 | 31.0 |
| 9 | Qwen3.6 Plus | Alibaba | 73.8 | 50.0 | 42.9 | 49.3 | 43.9% | 90.3 | 1.72 s | 78.2 | $0.018 | $0.500 | $3.000 | 35.0 |
| 10 | GLM-5 (Reasoning) | Z AI | 72.5 | 49.8 | 44.2 | 49.2 | 43.2% | 88.7 | 0.761 s | 92.0 | $0.025 | $1.000 | $3.200 | 27.6 |
Scores (0–100) are percentile-normalized across all qualifying models; they are not raw benchmark percentages. Run Cost is the price of one standard run: 12,000 input + 4,000 output tokens at the listed per-1M-token prices. Data via Artificial Analysis.
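
The Run Cost figures appear to follow directly from the standard run size and the per-1M token prices listed with each model. The sketch below reproduces that arithmetic, assuming cost is simply input tokens times input price plus output tokens times output price; the function name `run_cost` and the constants are illustrative, not part of the source.

```python
# Standard run size stated in the footnote: 12,000 input + 4,000 output tokens.
INPUT_TOKENS = 12_000
OUTPUT_TOKENS = 4_000


def run_cost(input_price_per_m: float, output_price_per_m: float) -> float:
    """Cost in USD of one standard run, given prices in USD per 1M tokens.

    Assumption: cost is linear in tokens, with no caching discounts or
    extra reasoning-token charges.
    """
    total = INPUT_TOKENS * input_price_per_m + OUTPUT_TOKENS * output_price_per_m
    return total / 1_000_000


if __name__ == "__main__":
    # (model, input $/1M, output $/1M) taken from the rankings above.
    examples = [
        ("GPT-5.4 mini (xhigh)", 0.750, 4.500),
        ("KAT Coder Pro V2", 0.300, 1.200),
        ("DeepSeek V4 Pro (Reasoning, Max Effort)", 1.740, 3.480),
    ]
    for name, price_in, price_out in examples:
        print(f"{name}: ${run_cost(price_in, price_out):.4f}")
```

Under this assumption the three examples come out to about $0.0270, $0.0084, and $0.0348, which match the listed $0.027, $0.0084, and $0.035 after rounding.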