How well do our ELO ratings predict match outcomes between established athletes?
We measure accuracy on established players only — both athletes must have 10+ career matches. This excludes first-time competitors whose ratings haven't stabilized yet, giving an honest picture of prediction quality.
For each ELO gap range, we show how often the higher-rated player actually wins. "Expected" is what ELO predicts; "Actual" is what happened.
| ELO Gap | Matches | Expected | Actual | Actual − Expected |
|---|---|---|---|---|
| 0 | 941 | 53.7% | 50.6% | -3.1% |
| 50 | 892 | 60.5% | 61.3% | +0.8% |
| 100 | 706 | 67.1% | 66.3% | -0.8% |
| 150 | 469 | 73.0% | 74.8% | +1.8% |
| 200 | 379 | 78.4% | 76.0% | -2.4% |
| 250 | 233 | 82.8% | 79.8% | -3.0% |
| 300 | 151 | 86.5% | 84.8% | -1.8% |
| 350 | 78 | 89.5% | 91.0% | +1.5% |
| 400 | 41 | 91.9% | 90.2% | -1.6% |
| 450 | 22 | 93.9% | 90.9% | -3.0% |
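The "Expected" column can be approximated with the conventional Elo expected-score formula. The sketch below assumes the usual base-10 logistic curve with a 400-point scale, and that each row label is the lower bound of a 50-point bucket. Neither detail is stated on this page, and `elo_expected` and `BUCKET_WIDTH` are illustrative names.

```python
def elo_expected(gap: float, scale: float = 400.0) -> float:
    """Expected score for the higher-rated player at a given rating gap.

    Conventional Elo curve; the 400-point scale is an assumption here.
    """
    return 1.0 / (1.0 + 10.0 ** (-gap / scale))


# Assumption: rows are 50-point buckets labeled by their lower bound,
# so the bucket midpoint stands in for the average gap in that bucket.
BUCKET_WIDTH = 50

for lower_bound in (0, 100, 200, 400):
    midpoint = lower_bound + BUCKET_WIDTH / 2
    print(f"gap {lower_bound:>3}: expected {elo_expected(midpoint):.1%}")
```

Under these assumptions the bucket midpoints land within a few tenths of a percentage point of the table, for example roughly 53.6% for the first bucket against the listed 53.7%. The small differences are consistent with the table averaging the expectation over the actual gaps in each bucket rather than using the midpoint.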
Every match falls into one of 7 categories, counted from the favored player's perspective: a win by submission, points, or decision for the favorite (Fav), a draw, or a win by decision, points, or submission for the underdog (Und). The "Score" column factors in partial credit (submission=1.0, points=0.9, decision=0.8, draw=0.5) and stays within 2.6 percentage points of ELO's expected win rate in every bucket, which suggests the ratings are well calibrated across all match outcomes, not just wins and losses.
| ELO Gap | Matches | Fav Sub | Fav Pts | Fav Dec | Draw | Und Dec | Und Pts | Und Sub | Score | Expected | Score − Expected |
|---|---|---|---|---|---|---|---|---|---|---|---|
| 0 | 941 | 214 | 155 | 107 | 62 | 85 | 147 | 171 | 53.3% | 53.7% | -0.4% |
| 50 | 892 | 239 | 192 | 116 | 53 | 80 | 103 | 109 | 62.5% | 60.5% | +2.0% |
| 100 | 706 | 241 | 146 | 81 | 49 | 52 | 66 | 71 | 67.8% | 67.1% | +0.7% |
| 150 | 469 | 180 | 108 | 63 | 37 | 25 | 25 | 31 | 75.4% | 73.0% | +2.4% |
| 200 | 379 | 158 | 96 | 34 | 27 | 22 | 21 | 21 | 76.9% | 78.4% | -1.5% |
| 250 | 233 | 108 | 54 | 24 | 14 | 17 | 6 | 10 | 80.2% | 82.8% | -2.6% |
| 300 | 151 | 80 | 35 | 13 | 13 | 4 | 2 | 4 | 85.7% | 86.5% | -0.8% |
| 350 | 78 | 45 | 16 | 10 | 3 | 0 | 2 | 2 | 88.6% | 89.5% | -0.9% |
| 400 | 41 | 27 | 7 | 3 | 4 | 0 | 0 | 0 | 92.0% | 91.9% | +0.1% |
| 450 | 22 | 16 | 3 | 1 | 2 | 0 | 0 | 0 | 93.2% | 93.9% | -0.7% |
How to read: "Score" is the favored player's average result under our margin scoring (submission=1.0, points=0.9, decision=0.8, draw=0.5). This margin-weighted result is what actually updates ratings. When Score ≈ Expected, the rating system is well calibrated.
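As a worked example, the "Score" and "Actual" figures for the first bucket can be recomputed from the outcome counts. The win and draw credits below come from the text; the loss credits (0.2, 0.1, 0.0, each one minus the matching win credit) are an inference, not something stated on this page, though they reproduce the Score column for the rows checked. The names `CREDIT`, `margin_score`, and `win_rate` are illustrative.

```python
from typing import Mapping

# Credit per outcome from the favored player's side.
# Win and draw values are from the text; loss values are inferred.
CREDIT = {
    "fav_sub": 1.0, "fav_pts": 0.9, "fav_dec": 0.8,
    "draw": 0.5,
    "und_dec": 0.2, "und_pts": 0.1, "und_sub": 0.0,
}


def margin_score(counts: Mapping[str, int]) -> float:
    """Average margin-weighted result for the favorite in one bucket."""
    total = sum(counts.values())
    return sum(CREDIT[outcome] * n for outcome, n in counts.items()) / total


def win_rate(counts: Mapping[str, int]) -> float:
    """Share of matches the favorite won outright (draws count as non-wins)."""
    total = sum(counts.values())
    wins = counts["fav_sub"] + counts["fav_pts"] + counts["fav_dec"]
    return wins / total


# Outcome counts for the first row (ELO gap 0, 941 matches).
bucket_0 = {
    "fav_sub": 214, "fav_pts": 155, "fav_dec": 107,
    "draw": 62,
    "und_dec": 85, "und_pts": 147, "und_sub": 171,
}

print(f"Score:  {margin_score(bucket_0):.1%}")  # 53.3%, as in the Score column
print(f"Actual: {win_rate(bucket_0):.1%}")      # 50.6%, as in the first table
```

The second figure also shows why "Actual" sits below "Score" in the closest bucket: draws count as non-wins in the win rate but earn half credit in the margin score.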
Data from 30,054 matches across 885 events (1998-2026). Ratings use standard ELO with margin-of-victory scoring.
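For readers unfamiliar with how a margin-of-victory score feeds into a rating update, here is a minimal sketch of one Elo update step. It assumes the conventional update rule (new rating = old rating + K × (score − expected)) with a 400-point scale. The K-factor of 32 is a placeholder, not the value used for these ratings, and `expected_score` and `update_ratings` are hypothetical names.

```python
from typing import Tuple


def expected_score(rating_a: float, rating_b: float, scale: float = 400.0) -> float:
    """Conventional Elo expected score for player A against player B."""
    return 1.0 / (1.0 + 10.0 ** ((rating_b - rating_a) / scale))


def update_ratings(
    rating_a: float,
    rating_b: float,
    score_a: float,
    k: float = 32.0,  # placeholder K-factor, assumed for illustration
) -> Tuple[float, float]:
    """Apply one zero-sum Elo update.

    score_a is A's margin score, e.g. 1.0 for a submission win,
    0.9 for a points win, 0.8 for a decision win, 0.5 for a draw.
    """
    delta = k * (score_a - expected_score(rating_a, rating_b))
    return rating_a + delta, rating_b - delta


# Two equally rated players; A wins on points (margin score 0.9).
new_a, new_b = update_ratings(1500.0, 1500.0, score_a=0.9)
print(f"{new_a:.1f} {new_b:.1f}")
```

With equal ratings the expected score is 0.5, so a points win moves each rating by 32 × (0.9 − 0.5) = 12.8 points under the assumed K-factor, slightly less than the 16 points a submission win would earn.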