
AIL Player Card #002 — GPT-4o: The Veteran Anchor
90 OVR. OF. Still starting for OpenAI United. Arena ELO #1 among peers, but a context ceiling, a value squeeze from DeepSeek, and GPT-5 on the bench tell the full story. #AILeague
The stat sheet
| Dimension | Score | Source |
|---|---|---|
| OVR (Overall) | 90 | Composite, weighted |
| RZN (Reasoning) | 86 | MMLU 88.7%, GPQA mid-tier at launch |
| CRE (Creativity) | 88 | Arena ELO ~1287 (human preference, May 2026) |
| SPD (Speed) | 82 | 128K context, 48 t/s, 1.25s latency |
| MLT (Multimodal) | 94 | Native omni model — text/image/audio/vision in one net |
| SAF (Safety) | 83 | Preparedness Framework: Medium post-mitigation (Persuasion), Low for Cybersecurity/CBRN/Autonomy |
| VAL (Value) | 79 | $2.50/$10.00 per 1M tokens — competitive but no longer cheap |
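
The Source column describes OVR as a weighted composite, but the card does not publish the weights. As a rough sanity check, the sketch below computes a plain unweighted mean of the six dimension scores. This is an assumption for comparison only, not the card's actual formula, and the helper name `equal_weight_mean` is hypothetical.

```python
# Dimension scores copied from the stat sheet above.
scores = {"RZN": 86, "CRE": 88, "SPD": 82, "MLT": 94, "SAF": 83, "VAL": 79}
PUBLISHED_OVR = 90  # the OVR value shown on the card


def equal_weight_mean(values):
    """Unweighted mean. NOT the card's real formula: its weights are unpublished."""
    values = list(values)
    return sum(values) / len(values)


baseline = equal_weight_mean(scores.values())
gap = PUBLISHED_OVR - baseline
print(f"equal-weight mean: {baseline:.1f}, gap to published OVR: {gap:.1f}")
# prints "equal-weight mean: 85.3, gap to published OVR: 4.7"
```

The gap of roughly 4.7 points suggests the composite either weights the stronger dimensions (MLT, CRE) more heavily or draws on inputs not listed in the table.
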
Scouting report
The multimodal thesis that actually landed
Instruction following is the real position
Context wall and value squeeze
Head-to-head: OF position class
| Stat | GPT-4o (OpenAI United) | Claude 3.5 Sonnet (Anthropic FC) | DeepSeek V3 (DeepSeek Athletic) |
|---|---|---|---|
| OVR | 90 | 89 | 87 |
| Arena ELO | ~1287 | ~1264 | ~1243 |
| HumanEval | 90.2% | 92.0% | 91.3% |
| MMLU | 88.7% | 88.3% | 88.5% |
| Context window | 128K | 200K | 128K |
| API input ($/1M) | $2.50 | $3.00 | $0.27 |
| API output ($/1M) | $10.00 | $15.00 | $1.10 |
| Best at | Tool use, structured output | Long docs, honest uncertainty | Cost efficiency, code at scale |
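
To make the price rows concrete, here is a minimal sketch that turns the per-1M-token rates from the table into a cost for one workload. The workload mix (10M input tokens, 2M output tokens) and the helper name `workload_cost` are assumptions chosen for illustration; real usage will vary.

```python
# Prices in USD per 1M tokens, copied from the head-to-head table above.
PRICES = {
    "GPT-4o": {"input": 2.50, "output": 10.00},
    "Claude 3.5 Sonnet": {"input": 3.00, "output": 15.00},
    "DeepSeek V3": {"input": 0.27, "output": 1.10},
}


def workload_cost(model, input_mtok, output_mtok):
    """Cost in USD for a workload measured in millions of tokens."""
    price = PRICES[model]
    return input_mtok * price["input"] + output_mtok * price["output"]


# Hypothetical workload (an assumption, not from the article).
INPUT_MTOK, OUTPUT_MTOK = 10, 2

costs = {m: round(workload_cost(m, INPUT_MTOK, OUTPUT_MTOK), 2) for m in PRICES}
for model, cost in costs.items():
    print(f"{model}: ${cost:.2f}")
# prints:
# GPT-4o: $45.00
# Claude 3.5 Sonnet: $60.00
# DeepSeek V3: $4.90
```

On this assumed mix, GPT-4o costs three quarters of what Claude 3.5 Sonnet does but about nine times as much as DeepSeek V3.
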
Season highlights
Coach's verdict