Rating Accuracy & Calibration

How well do our ELO ratings predict match outcomes between established athletes?

Prediction Accuracy: 65.7%

Established Matches: 3,912

Min Career Matches: 10+

We measure accuracy on established players only — both athletes must have 10+ career matches. This excludes first-time competitors whose ratings haven't stabilized yet, giving an honest picture of prediction quality.
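The filter described above can be sketched as follows. This is a minimal illustration, not the site's actual pipeline: the `Match` record and all of its field names are hypothetical, and the handling of draws (counted as misses) and of equal ratings (skipped) is an assumption.

```python
from dataclasses import dataclass
from typing import Iterable, Optional

MIN_CAREER_MATCHES = 10  # the "10+" threshold stated above


@dataclass
class Match:
    # Hypothetical record layout, chosen only for illustration.
    rating_a: float
    rating_b: float
    career_matches_a: int
    career_matches_b: int
    winner: Optional[str]  # "a", "b", or None for a draw


def is_established(m: Match) -> bool:
    """True when both athletes meet the minimum career-match threshold."""
    return (m.career_matches_a >= MIN_CAREER_MATCHES
            and m.career_matches_b >= MIN_CAREER_MATCHES)


def prediction_accuracy(matches: Iterable[Match]) -> float:
    """Share of established matches won by the higher-rated athlete.

    Assumptions: a draw counts as a miss, and matches between
    equally rated athletes are skipped because there is no favorite.
    """
    correct = 0
    total = 0
    for m in matches:
        if not is_established(m) or m.rating_a == m.rating_b:
            continue
        favorite = "a" if m.rating_a > m.rating_b else "b"
        total += 1
        if m.winner == favorite:
            correct += 1
    return correct / total if total else float("nan")
```

Excluding a match when either athlete is below the threshold keeps the comparison limited to ratings that have had time to settle.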

Predicted vs Actual Win Rate

For each ELO gap range, we show how often the higher-rated player actually wins. "Expected" is what ELO predicts; "Actual" is what happened.

ELO Gap	Matches	Expected	Actual	Actual - Expected
0	941	53.7%	50.6%	-3.1%
50	892	60.5%	61.3%	+0.8%
100	706	67.1%	66.3%	-0.8%
150	469	73.0%	74.8%	+1.8%
200	379	78.4%	76.0%	-2.4%
250	233	82.8%	79.8%	-3.0%
300	151	86.5%	84.8%	-1.8%
350	78	89.5%	91.0%	+1.5%
400	41	91.9%	90.2%	-1.6%
450	22	93.9%	90.9%	-3.0%
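The "Expected" column can be approximated with the standard ELO logistic curve. The sketch below assumes the conventional 400-point scale and that each row is a 50-point bin labeled by its lower bound, so the row labeled 0 has a midpoint of 25. Neither assumption is stated on this page, and `expected_win_rate` is a hypothetical helper name.

```python
def expected_win_rate(elo_gap: float, scale: float = 400.0) -> float:
    """Standard ELO expected score for the higher-rated player.

    `scale=400` is the conventional ELO constant (an assumption here).
    """
    return 1.0 / (1.0 + 10.0 ** (-elo_gap / scale))


# Evaluate the curve at the assumed midpoint of each 50-point bin.
for lower_bound in range(0, 500, 50):
    midpoint = lower_bound + 25
    print(lower_bound, f"{expected_win_rate(midpoint):.1%}")
```

Under these assumptions the midpoint values land within a few tenths of a percentage point of the table. A small difference is expected, because the table presumably averages the expectation over the actual gaps in each bin rather than evaluating it at the midpoint.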

Full Outcome Distribution

Every match falls into one of seven outcome categories, shown here from the favored player's perspective. The "Score" column factors in partial credit (submission=1.0, points=0.9, decision=0.8, draw=0.5) and tracks closely with ELO's expected win rate, showing that the ratings are well calibrated across all match outcomes, not just wins and losses.

ELO Gap	Matches	Fav Sub	Fav Pts	Fav Dec	Draw	Und Dec	Und Pts	Und Sub	Score	Expected	Score - Expected
0	941	214	155	107	62	85	147	171	53.3%	53.7%	-0.4%
50	892	239	192	116	53	80	103	109	62.5%	60.5%	+2.0%
100	706	241	146	81	49	52	66	71	67.8%	67.1%	+0.7%
150	469	180	108	63	37	25	25	31	75.4%	73.0%	+2.4%
200	379	158	96	34	27	22	21	21	76.9%	78.4%	-1.5%
250	233	108	54	24	14	17	6	10	80.2%	82.8%	-2.6%
300	151	80	35	13	13	4	2	4	85.7%	86.5%	-0.8%
350	78	45	16	10	3	0	2	2	88.6%	89.5%	-0.9%
400	41	27	7	3	4	0	0	0	92.0%	91.9%	+0.1%
450	22	16	3	1	2	0	0	0	93.2%	93.9%	-0.7%

How to read: "Score" is the favored player's average result under our margin scoring (submission=1.0, points=0.9, decision=0.8, draw=0.5). This is the value that actually updates ratings. When Score ≈ Expected, the rating system is well calibrated.
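As a worked example, the "Score" for one row can be recomputed from its outcome counts. The favorite-win and draw weights are the ones given above. The underdog-win weights (0.2, 0.1, 0.0) are an assumption: the complements of the favorite-win weights, which reproduce the published figures. The function and key names are hypothetical.

```python
from typing import Mapping

# Credit earned by the favored player for each outcome.
# Underdog-win weights are assumed to be 1 minus the matching
# favorite-win weight; the page only lists the first four values.
WEIGHTS = {
    "fav_sub": 1.0, "fav_pts": 0.9, "fav_dec": 0.8,
    "draw": 0.5,
    "und_dec": 0.2, "und_pts": 0.1, "und_sub": 0.0,
}


def margin_score(counts: Mapping[str, int]) -> float:
    """Average margin-weighted result for the favored player."""
    total = sum(counts.values())
    return sum(WEIGHTS[k] * n for k, n in counts.items()) / total


def win_rate(counts: Mapping[str, int]) -> float:
    """Share of matches the favored player won outright (draws excluded)."""
    wins = counts["fav_sub"] + counts["fav_pts"] + counts["fav_dec"]
    return wins / sum(counts.values())


# Outcome counts for the ELO gap 0 row of the table above (941 matches).
row_0 = {"fav_sub": 214, "fav_pts": 155, "fav_dec": 107, "draw": 62,
         "und_dec": 85, "und_pts": 147, "und_sub": 171}

print(f"{margin_score(row_0):.1%}")  # 53.3%
print(f"{win_rate(row_0):.1%}")      # 50.6%
```

The first value matches the "Score" entry for that row. The second matches the "Actual" entry in the win-rate table, which suggests that draws count against the favored player in that table.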

Data from 30,054 matches across 885 events (1998-2026). Ratings use standard ELO with margin-of-victory scoring.