AIIDE 2016 Bayesian Elo ratings
As before, I calculated Elo ratings with Remi Coulom’s bayeselo program. The # column gives the official ranking, so you can see where it differs from the ranking by Elo. The bayeselo ranking is slightly more accurate because it takes into account all the information in the tournament results, not only the raw win rate. I left out the 95% confidence interval column as relatively uninteresting, since the “better?” column tells us how likely each bot is to be superior to the one below it.
| # | bot | score | Elo | better? |
|---|---|---|---|---|
| 1 | Iron | 87% | 2016 | 99.4% |
| 2 | ZZZKBot | 85% | 1974 | 99.6% |
| 3 | tscmoo zerg | 83% | 1932 | 99.9% |
| 4 | LetaBot | 74% | 1815 | 99.8% |
| 5 | UAlbertaBot | 70% | 1774 | 99.9% |
| 6 | Ximp | 65% | 1699 | 99.6% |
| 8 | Aiur | 61% | 1663 | 51.6% |
| 7 | Overkill | 62% | 1663 | 99.6% |
| 9 | MegaBot | 58% | 1627 | 88.5% |
| 10 | IceBot | 57% | 1611 | 57.5% |
| 12 | Xelnaga | 57% | 1608 | 50.0% |
| 11 | JiaBot | 57% | 1608 | 98.1% |
| 13 | Skynet | 55% | 1581 | 100% |
| 14 | GarmBot | 43% | 1441 | 100% |
| 16 | TerranUAB | 27% | 1250 | 74.7% |
| 15 | NUSBot | 27% | 1240 | 99.9% |
| 17 | SRbotOne | 22% | 1167 | 99.0% |
| 18 | Cimex | 21% | 1130 | 92.6% |
| 19 | Oritaka | 20% | 1106 | 99.3% |
| 20 | CruzBot | 17% | 1064 | 100% |
| 21 | Tyr | 1% | 533 | - |
A few positions are switched from the official ranking because the bots involved are statistically indistinguishable. Overkill and Aiur are in a dead heat: they have the same Elo, and Aiur has only a 51.6% chance of being the better of the two. IceBot (terran), Xelnaga (protoss) and JiaBot (zerg) are also virtually even. bayeselo gives IceBot a 57.6% chance of being better than JiaBot two ranks down, essentially the same as its 57.5% chance of being better than Xelnaga one rank down.
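To get a feel for how small these Elo gaps are, here is a minimal sketch of the standard logistic Elo expected-score formula applied to ratings from the table. This is an assumption for illustration: bayeselo's own model may include extra parameters, and the function name `expected_score` is mine, not part of bayeselo. Note that this is the expected win rate in a head-to-head game, which is a different quantity from the “better?” column (the likelihood that one bot's true rating is higher).

```python
def expected_score(elo_a: float, elo_b: float) -> float:
    """Expected score of A against B under the plain logistic Elo model.

    Assumption: the textbook 400-point scale, with no draw or
    first-move-advantage terms.
    """
    return 1.0 / (1.0 + 10.0 ** ((elo_b - elo_a) / 400.0))


# Ratings copied from the table above.
ratings = {
    "Iron": 2016,
    "ZZZKBot": 1974,
    "IceBot": 1611,
    "Xelnaga": 1608,
    "JiaBot": 1608,
}

if __name__ == "__main__":
    pairs = [("Iron", "ZZZKBot"), ("IceBot", "Xelnaga"), ("Xelnaga", "JiaBot")]
    for a, b in pairs:
        p = expected_score(ratings[a], ratings[b])
        print(f"{a} vs {b}: expected score {p:.3f}")
```

Under this simple model, a 3-point gap such as IceBot over Xelnaga is worth well under one percentage point of expected win rate, which fits the reading that those bots are virtually even.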
Tomorrow: The per-map crosstables.