SSCAIT initial and current Elo ratings

I’m still working on Elo curves over time, but today I have Elo ratings for each bot in the SSCAIT data at the beginning and end of its career. Here is yesterday’s table plus the new info, now sorted by decreasing current rating—the bot’s real strength yesterday as best we can measure. The topmost ratings are, to my surprise, exactly in the order I expected!

To make the ratings easier to interpret, I added two columns labeled “expect”. Each gives the bot’s expected winning rate against an average opponent. The rating system is designed so that the average Elo rating stays constant at 1500, and it’s easy to compute the expected winning rate against an opponent rated 1500. The constant average rating, by the way, means that a bot which stays the same can see its rating decline over time if its opponents improve.
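For readers who want to check an “expect” value, here is a minimal sketch. It assumes the standard logistic Elo formula on a 400-point scale; the post does not spell out the formula, and the function name `expected_vs_average` is made up for this illustration.

```python
def expected_vs_average(rating, average=1500.0):
    """Expected winning rate of a bot rated `rating` against an opponent
    rated `average`, using the standard logistic Elo formula (an assumption;
    the post does not state which formula was used)."""
    return 1.0 / (1.0 + 10 ** ((average - rating) / 400.0))


if __name__ == "__main__":
    # Ratings taken from the table: krasi0 initial and current, Matyas Novy current.
    for elo in (1593, 2163, 885):
        print(elo, f"{100 * expected_vs_average(elo):.2f}%")
```

Under that assumption the function agrees with the table: a rating of 1593 comes out at about 63.07%, and 2163 at about 97.85%.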

Ratings are not accurate for bots with a very small number of games. I plan to exclude those bots from the curves over time.

| bot | win % | initial Elo | initial expect | current Elo | current expect | games | earliest | latest |
|---|---|---|---|---|---|---|---|---|
| krasi0 | 68.77% | 1593 | 63.07% | 2163 | 97.85% | 2142 | 2015 Nov 30 | 2016 Sep 27 |
| Iron bot | 77.74% | 1580 | 61.31% | 2081 | 96.59% | 1999 | 2015 Nov 27 | 2016 Sep 26 |
| Marian Devecka | 58.66% | 1790 | 84.15% | 2065 | 96.28% | 6289 | 2013 Dec 25 | 2016 Sep 27 |
| Martin Rooijackers | 68.50% | 1840 | 87.62% | 2011 | 94.99% | 7290 | 2014 Jul 28 | 2016 Sep 27 |
| tscmooz | 79.80% | 1823 | 86.52% | 1991 | 94.41% | 5006 | 2015 Feb 27 | 2016 Sep 27 |
| tscmoo | 72.06% | 1838 | 87.50% | 1978 | 94.00% | 5719 | 2015 Jan 22 | 2016 Sep 27 |
| LetaBot CIG 2016 | 75.68% | 1748 | 80.65% | 1932 | 92.32% | 444 | 2016 Aug 01 | 2016 Sep 27 |
| WuliBot | 72.76% | 1773 | 82.80% | 1871 | 89.43% | 984 | 2016 Apr 19 | 2016 Sep 26 |
| Simon Prins | 55.48% | 1513 | 51.87% | 1867 | 89.21% | 5431 | 2015 Jan 25 | 2016 Sep 27 |
| ICELab | 81.12% | 2189 | 98.14% | 1865 | 89.10% | 8344 | 2013 Dec 25 | 2016 Sep 27 |
| FlashTest | 69.44% | 1744 | 80.29% | 1863 | 88.99% | 216 | 2016 Mar 22 | 2016 Jul 27 |
| Sijia Xu | 71.65% | 1850 | 88.23% | 1849 | 88.17% | 2328 | 2015 Oct 10 | 2016 Sep 27 |
| LetaBot SSCAI 2015 Final | 65.87% | 1710 | 77.01% | 1813 | 85.84% | 416 | 2016 Aug 04 | 2016 Sep 27 |
| Dave Churchill | 75.48% | 1985 | 94.22% | 1804 | 85.19% | 8275 | 2013 Dec 25 | 2016 Sep 27 |
| Chris Coxe | 73.10% | 1754 | 81.19% | 1800 | 84.90% | 2201 | 2015 Sep 03 | 2016 Sep 27 |
| Tomas Vajda | 79.37% | 2169 | 97.92% | 1790 | 84.15% | 8372 | 2013 Dec 25 | 2016 Sep 27 |
| Flash | 65.69% | 1458 | 43.98% | 1777 | 83.13% | 991 | 2016 Apr 18 | 2016 Sep 27 |
| LetaBot IM noMCTS | 60.93% | 1645 | 69.73% | 1766 | 82.22% | 1226 | 2016 May 18 | 2016 Aug 01 |
| Zia bot | 52.24% | 1568 | 59.66% | 1757 | 81.45% | 536 | 2016 Jul 07 | 2016 Sep 27 |
| A Jarocki | 62.77% | 1711 | 77.11% | 1741 | 80.02% | 932 | 2015 Oct 04 | 2016 Jan 26 |
| PeregrineBot | 57.29% | 1692 | 75.12% | 1728 | 78.79% | 1276 | 2016 Feb 09 | 2016 Sep 10 |
| tscmoop | 78.16% | 1895 | 90.67% | 1721 | 78.11% | 1992 | 2015 Nov 11 | 2016 Sep 26 |
| Andrew Smith | 65.00% | 1705 | 76.50% | 1718 | 77.81% | 8391 | 2013 Dec 25 | 2016 Sep 27 |
| Florian Richoux | 62.11% | 1770 | 82.55% | 1716 | 77.62% | 8203 | 2013 Dec 25 | 2016 Sep 27 |
| Carsten Nielsen | 66.08% | 1708 | 76.81% | 1695 | 75.45% | 4711 | 2015 Mar 17 | 2016 Sep 27 |
| Soeren Klett | 63.62% | 2068 | 96.34% | 1687 | 74.58% | 8277 | 2013 Dec 25 | 2016 Sep 27 |
| Vaclav Horazny | 37.35% | 1066 | 7.60% | 1686 | 74.47% | 6455 | 2013 Dec 25 | 2015 Nov 18 |
| La Nuee | 51.61% | 1499 | 49.86% | 1662 | 71.76% | 558 | 2015 Dec 13 | 2016 Mar 18 |
| Jakub Trancik | 45.08% | 1755 | 81.27% | 1657 | 71.17% | 8416 | 2013 Dec 25 | 2016 Sep 27 |
| Marek Suppa | 51.85% | 1746 | 80.47% | 1655 | 70.94% | 4413 | 2015 Jan 05 | 2016 Mar 18 |
| Krasimir Krystev | 70.52% | 2033 | 95.56% | 1653 | 70.70% | 6510 | 2013 Dec 25 | 2016 Mar 10 |
| ASPbot2011 | 49.78% | 1671 | 72.80% | 1652 | 70.58% | 227 | 2015 Jan 29 | 2016 Feb 25 |
| Marcin Bartnicki | 60.42% | 1855 | 88.53% | 1633 | 68.26% | 1435 | 2014 Nov 28 | 2016 Mar 18 |
| Tomas Cere | 61.11% | 1888 | 90.32% | 1631 | 68.01% | 8373 | 2013 Dec 25 | 2016 Sep 27 |
| MegaBot | 49.40% | 1576 | 60.77% | 1630 | 67.88% | 419 | 2016 Aug 01 | 2016 Sep 27 |
| Aurelien Lermant | 58.26% | 1688 | 74.69% | 1622 | 66.87% | 3687 | 2015 Jun 22 | 2016 Sep 27 |
| Matej Kravjar | 49.57% | 1723 | 78.31% | 1619 | 66.49% | 3234 | 2013 Dec 25 | 2015 Feb 18 |
| Daniel Blackburn | 43.79% | 1651 | 70.46% | 1605 | 64.67% | 6883 | 2013 Dec 25 | 2016 Jan 26 |
| Gabriel Synnaeve | 45.96% | 1737 | 79.65% | 1584 | 61.86% | 1658 | 2013 Dec 25 | 2015 Nov 24 |
| David Milec | 49.09% | 1552 | 57.43% | 1566 | 59.39% | 55 | 2015 Jan 13 | 2015 Jan 20 |
| Odin2014 | 55.65% | 1659 | 71.41% | 1565 | 59.25% | 5648 | 2014 Dec 21 | 2016 Sep 11 |
| Gaoyuan Chen | 48.05% | 1582 | 61.59% | 1559 | 58.41% | 5118 | 2015 Feb 10 | 2016 Sep 27 |
| Henri Kumpulainen | 38.81% | 1447 | 42.43% | 1553 | 57.57% | 894 | 2016 Jan 13 | 2016 May 31 |
| Martin Dekar | 33.14% | 1429 | 39.92% | 1533 | 54.73% | 4910 | 2013 Dec 25 | 2016 Jan 25 |
| Serega | 48.20% | 1771 | 82.64% | 1505 | 50.72% | 3803 | 2015 Jan 31 | 2016 Jan 26 |
| Chris Ayers | 35.53% | 1610 | 65.32% | 1481 | 47.27% | 1520 | 2015 Aug 10 | 2016 Jan 26 |
| Nathan a David | 39.34% | 1446 | 42.29% | 1481 | 47.27% | 1004 | 2016 Feb 23 | 2016 Aug 08 |
| DAIDOES | 34.02% | 1370 | 32.12% | 1471 | 45.84% | 485 | 2016 Jun 13 | 2016 Sep 08 |
| FlashZerg | 0.00% | 1474 | 46.27% | 1459 | 44.13% | 7 | 2016 Apr 24 | 2016 May 12 |
| Igor Lacik | 39.32% | 1608 | 65.06% | 1454 | 43.42% | 8073 | 2013 Dec 25 | 2016 Sep 08 |
| Matej Istenik | 44.74% | 1709 | 76.91% | 1449 | 42.71% | 8297 | 2013 Dec 25 | 2016 Sep 27 |
| EradicatumXVR | 40.88% | 1537 | 55.30% | 1443 | 41.87% | 4687 | 2013 Dec 25 | 2016 Jan 23 |
| Ibrahim Awwal | 30.57% | 1510 | 51.44% | 1437 | 41.03% | 530 | 2013 Dec 25 | 2014 Mar 24 |
| Tomasz Michalski | 27.02% | 1314 | 25.53% | 1432 | 40.34% | 433 | 2015 Dec 22 | 2016 Mar 18 |
| Oleg Ostroumov | 48.75% | 1714 | 77.41% | 1431 | 40.20% | 3641 | 2013 Dec 25 | 2016 Jan 26 |
| NUS Bot | 35.72% | 1482 | 47.41% | 1426 | 39.51% | 3337 | 2015 May 19 | 2016 Sep 06 |
| Martin Pinter | 28.98% | 1409 | 37.20% | 1425 | 39.37% | 3740 | 2013 Dec 25 | 2015 Dec 11 |
| Roman Danielis | 45.63% | 1688 | 74.69% | 1417 | 38.28% | 5155 | 2013 Dec 25 | 2016 Sep 26 |
| ZerGreenBot | 22.22% | 1404 | 36.53% | 1416 | 38.14% | 36 | 2016 Sep 22 | 2016 Sep 27 |
| Rafael Bocquet | 0.00% | 1450 | 42.85% | 1415 | 38.01% | 10 | 2015 Jun 23 | 2015 Jun 26 |
| Flashrelease | 0.00% | 1449 | 42.71% | 1413 | 37.73% | 8 | 2016 Apr 24 | 2016 Apr 24 |
| Marek Kadek | 37.29% | 1557 | 58.13% | 1413 | 37.73% | 7641 | 2013 Dec 25 | 2016 May 22 |
| Ian Nicholas DaCosta | 37.12% | 1394 | 35.20% | 1404 | 36.53% | 2928 | 2015 Apr 27 | 2016 Sep 08 |
| AwesomeBot | 29.81% | 1326 | 26.86% | 1403 | 36.39% | 473 | 2016 Jun 16 | 2016 Sep 08 |
| Radim Bobek | 23.37% | 1315 | 25.64% | 1390 | 34.68% | 1151 | 2015 Oct 01 | 2016 Mar 06 |
| Adrian Sternmuller | 26.89% | 1436 | 40.89% | 1375 | 32.75% | 4529 | 2013 Dec 25 | 2016 Jul 22 |
| Martin Strapko | 19.76% | 1388 | 34.42% | 1366 | 31.62% | 3386 | 2013 Dec 25 | 2016 Jan 26 |
| Maja Nemsilajova | 23.81% | 1365 | 31.49% | 1363 | 31.25% | 4246 | 2013 Dec 25 | 2015 Nov 29 |
| Johan Kayser | 24.46% | 1294 | 23.40% | 1361 | 31.00% | 413 | 2016 Jul 29 | 2016 Sep 27 |
| UPStarcraftAI | 24.75% | 1346 | 29.18% | 1360 | 30.88% | 610 | 2015 Dec 24 | 2016 Apr 13 |
| Martin Vlcak | 28.92% | 1370 | 32.12% | 1353 | 30.02% | 1224 | 2016 Feb 16 | 2016 Sep 07 |
| Johannes Holzfuss | 35.04% | 1531 | 54.45% | 1351 | 29.78% | 685 | 2016 Mar 05 | 2016 Jun 15 |
| Vojtech Jirsa | 14.14% | 1186 | 14.09% | 1350 | 29.66% | 2786 | 2015 Jan 12 | 2015 Sep 05 |
| JompaBot | 21.99% | 1316 | 25.75% | 1349 | 29.54% | 1055 | 2016 Feb 04 | 2016 Aug 13 |
| Rob Bogie | 31.34% | 1335 | 27.89% | 1346 | 29.18% | 651 | 2016 May 14 | 2016 Sep 06 |
| Christoffer Artmann | 20.51% | 1289 | 22.89% | 1344 | 28.95% | 395 | 2016 Aug 07 | 2016 Sep 27 |
| Marek Gajdos | 22.69% | 1251 | 19.26% | 1331 | 27.43% | 1384 | 2016 Jan 30 | 2016 Sep 11 |
| Travis Shelton | 23.59% | 1390 | 34.68% | 1314 | 25.53% | 1221 | 2016 Feb 28 | 2016 Sep 06 |
| Peter Dobsa | 13.25% | 1227 | 17.20% | 1307 | 24.77% | 3027 | 2015 Jan 11 | 2015 Oct 02 |
| VeRLab | 17.06% | 1241 | 18.38% | 1304 | 24.45% | 897 | 2016 Feb 28 | 2016 Aug 01 |
| Andrej Sekac | 11.76% | 1359 | 30.75% | 1296 | 23.61% | 68 | 2013 Dec 25 | 2014 Jan 04 |
| Bjorn P Mattsson | 22.22% | 1351 | 29.78% | 1295 | 23.50% | 4442 | 2015 Apr 05 | 2016 Sep 27 |
| Lukas Sedlacek | 22.86% | 1344 | 28.95% | 1293 | 23.30% | 70 | 2015 Jan 12 | 2015 Jan 20 |
| Sergei Lebedinskij | 13.30% | 1178 | 13.55% | 1293 | 23.30% | 1083 | 2015 May 28 | 2015 Sep 03 |
| Vladimir Jurenka | 38.45% | 1635 | 68.51% | 1278 | 21.79% | 6167 | 2013 Dec 25 | 2016 Sep 27 |
| neverdieTRX | 20.66% | 1265 | 20.54% | 1272 | 21.21% | 334 | 2016 Jul 19 | 2016 Sep 10 |
| OpprimoBot | 21.85% | 1321 | 26.30% | 1256 | 19.71% | 2009 | 2015 Nov 18 | 2016 Sep 27 |
| Marek Kruzliak | 14.45% | 1151 | 11.83% | 1255 | 19.62% | 934 | 2013 Dec 25 | 2015 Jan 20 |
| Sungguk Cha | 18.65% | 1207 | 15.62% | 1250 | 19.17% | 697 | 2016 Jun 05 | 2016 Sep 27 |
| Jacob Knudsen | 20.53% | 1083 | 8.31% | 1247 | 18.90% | 1257 | 2016 Feb 23 | 2016 Sep 10 |
| Ludmila Nemsilajova | 16.04% | 1133 | 10.79% | 1228 | 17.28% | 505 | 2013 Dec 25 | 2015 Jan 21 |
| Karin Valisova | 17.68% | 1238 | 18.12% | 1226 | 17.12% | 1171 | 2013 Dec 25 | 2016 Jan 26 |
| HoangPhuc | 15.67% | 1132 | 10.73% | 1209 | 15.77% | 300 | 2016 Jul 18 | 2016 Sep 07 |
| Sebastian Mahr | 15.06% | 1205 | 15.47% | 1182 | 13.82% | 1202 | 2016 Jan 13 | 2016 Aug 08 |
| Jan Pajan | 14.48% | 1210 | 15.85% | 1179 | 13.61% | 1119 | 2013 Dec 25 | 2016 Jan 05 |
| Pablo Garcia Sanchez | 12.20% | 1123 | 10.25% | 1174 | 13.28% | 590 | 2015 Dec 24 | 2016 Apr 13 |
| Ivana Kellyerova | 11.47% | 1129 | 10.57% | 1131 | 10.68% | 1630 | 2013 Dec 25 | 2015 Apr 01 |
| Lucia Pivackova | 13.29% | 1111 | 9.63% | 1090 | 8.63% | 835 | 2013 Dec 25 | 2015 Jan 20 |
| Tae Jun Oh | 4.55% | 1069 | 7.72% | 1036 | 6.47% | 154 | 2016 Mar 22 | 2016 Apr 11 |
| Denis Ivancik | 10.76% | 1102 | 9.19% | 1022 | 6.00% | 502 | 2013 Dec 25 | 2015 Jan 20 |
| ButcherBoy | 4.74% | 921 | 3.45% | 970 | 4.52% | 422 | 2016 Jun 21 | 2016 Sep 06 |
| Jon W | 5.06% | 920 | 3.43% | 964 | 4.37% | 790 | 2015 Apr 30 | 2015 Jul 09 |
| Matyas Novy | 6.32% | 1130 | 10.62% | 885 | 2.82% | 1693 | 2015 Feb 04 | 2015 Jul 09 |

How did I get the initial ratings? I had a cute idea. One of the issues with computing Elo ratings over time is: How do you initialize the ratings? Most systems either start everybody with the same rating, which makes an ugly graph, or use a different and less accurate method to estimate the rating in early games. But in this case I have the whole data set in hand. I set the final rating of every bot to the same rating and computed ratings backwards in time to find an initial rating. Then I threw away everything except the initial rating, and calculated the real ratings forward in time to find the ratings over time and the final ratings. That way every data point is equally good, from beginning to end. I doubt I’m the first to think of it, but it’s a cute idea and I’m pleased.
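The backward-then-forward trick can be sketched roughly as follows. This is one reading of the description, not the actual code behind the table: it assumes a plain Elo update with a fixed K-factor (the value 16 is a guess), takes “backwards in time” to mean replaying the same games in reverse order with the usual update, and uses made-up names (`run_elo`, `two_pass_ratings`) and a made-up game format of `(winner, loser)` pairs.

```python
K = 16.0        # assumed K-factor; the post does not give one
START = 1500.0  # common rating every bot is set to before the backward pass


def expected(ra, rb):
    """Expected score of a player rated ra against one rated rb (standard Elo)."""
    return 1.0 / (1.0 + 10 ** ((rb - ra) / 400.0))


def run_elo(games, ratings):
    """Apply the usual Elo update to each (winner, loser) pair, in order.
    Returns the resulting ratings and a per-game history of both ratings."""
    ratings = dict(ratings)  # work on a copy so the caller's dict is untouched
    history = []
    for winner, loser in games:
        delta = K * (1.0 - expected(ratings[winner], ratings[loser]))
        ratings[winner] += delta  # the same number of points moves from loser
        ratings[loser] -= delta   # to winner, so the average stays constant
        history.append((winner, ratings[winner], loser, ratings[loser]))
    return ratings, history


def two_pass_ratings(games):
    """Backward pass to estimate initial ratings, then the real forward pass."""
    bots = {bot for game in games for bot in game}
    flat = {bot: START for bot in bots}
    # Pass 1: replay the games newest-first; keep only where each bot ends up.
    initial, _ = run_elo(list(reversed(games)), flat)
    # Pass 2: start from those estimates and rate the games oldest-first.
    final, history = run_elo(games, initial)
    return initial, final, history


if __name__ == "__main__":
    toy_games = [("A", "B")] * 30 + [("B", "C")] * 30 + [("A", "C")] * 10
    initial, final, _ = two_pass_ratings(toy_games)
    for bot in sorted(initial):
        print(bot, round(initial[bot]), round(final[bot]))
```

Because every update moves the same number of points from the loser to the winner, the average rating stays at 1500 through both passes, which matches the constant-average property described earlier.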

Next: I’ll find some sensible way to plot the curves. Stand by!

Comments

imp on :

"The topmost ratings are, to my surprise, exactly in the order I expected!"
- that statement just made it into my quote library ;)

krasi0 on :

Calculating the initial ratings "backwards" is a neat idea! :) Still do you think that the rankings would change if you start with a constant of, say, 1500 for each bot like in the typical scenario?

Jay Scott on :

The final ratings should only change a little if the initial ratings are 1500. The point of finding a good initial rating is to have a useful graph from beginning to end. That’s the theory, anyway. But it was only a one-line change to test it in my code, and... I hit a mystery bug, the kind that Should Never Happen. Not sure whether it affects the original results. Stand by while I solve it.

Jay Scott on :

Whew, it wasn’t a real bug, I was only confused. Setting the initial rating to 1500 causes the final ratings to be slightly compressed, so the top rating and bottom rating are around 15 points closer to the average. It’s a bigger effect than I expected, but still small. It could cause a prediction error of 2%.

Jay Scott on :

Oh, and the rankings do change some, but not in the top 10 places.

krasi0 on :

ELO ratings should also be able to provide insights into the expected chance of winning against any opponent given both players' ratings. When you have time, you could compile a (huge) cross-table with those :)

Additionally, we could investigate alternative rating schemes like TrueSkill and others. Here is a link with a comparison of some of them: https://rankade.com/ree#ranking-system-comparison

Jay Scott on :

All the mathematically sound rating systems that I know of are variations of Elo’s original. The differences amount to trying to squeeze a little more information out of each game by modeling more closely how playing strength varies in the real world, both between players and for a given player over time. We can definitely do better than straight Elo, because we have some prior knowledge. We know that IceBot doesn’t learn and it hasn’t been updated since its last upload in 2013, so its playing strength is constant over that time. We even have theoretical limits on how fast some learning bots can learn, because we know how they work. My intuition is that if we use our prior knowledge, we will surely get more accurate ratings... but only modestly more accurate. I think the major limit on the accuracy of Elo family systems is that they assume that playing strength is a single value, so they inherently can’t cope with intransitivity where A>B, B>C and C>A, which happens A LOT. Intransitivity distorts ratings because then ratings depend on the number of games. For example in the ABC example if A vs B happens to be more frequent (C was uploaded later), then the rating system finds more evidence that A>B than the other two, increasing A’s rating and decreasing B’s and C’s. For the same reason, adding a new player to the pool may change the relative ratings of existing players. Rating systems are great and Elo is accurate on average, but it’s not a cure-all, for fundamental reasons. And my feeling is that we can do a little better than straight Elo, but not much better.

krasi0 on :

Yes, I completely agree with your analysis. Still if you can come up with any other useful insights that we could capture from the vast games history, please share :)

Jay Scott on :

The game scores are virgin territory. Later I’ll explore there.
