a few preliminary Elo charts

The SSCAIT data includes 103 bots, and 3 of them have 10 or fewer games, leaving exactly 100 with useful rating curves. I’ve crunched and formatted the data, and now all I have to do is draw it. I hope to create a humongalicious zoomable graph of daily rating data for all 100 bots—if I can find a way to draw that many lines on a graph in a way that’s usable. Well, I’ll think of something. I chose powerful graphing software that’s fully capable of doing the job, but it’s complicated and my skill and patience may be less than fully capable....
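For readers less used to Elo, here is a minimal sketch of how a rating curve could be built from a chronological list of game results. It assumes the standard Elo update with a K-factor of 32 and a starting rating of 1500. Those values, the function names, and the bot names in the usage note are illustrative assumptions, not necessarily what was used to process the SSCAIT data.

```python
def expected_score(rating_a, rating_b):
    """Standard Elo expected score of player A against player B."""
    return 1.0 / (1.0 + 10 ** ((rating_b - rating_a) / 400.0))


def update(rating_a, rating_b, score_a, k=32.0):
    """Return the new (rating_a, rating_b) after one game.

    score_a is 1.0 if A won, 0.0 if A lost, 0.5 for a draw.
    k=32 is an assumed K-factor, not taken from the original post.
    """
    delta = k * (score_a - expected_score(rating_a, rating_b))
    return rating_a + delta, rating_b - delta


def rating_curves(games, start=1500.0, k=32.0):
    """Build one rating curve per bot.

    games: iterable of (winner, loser) names in chronological order.
    Returns a dict mapping each bot name to the list of its ratings,
    one entry after each game that bot played.
    """
    ratings = {}
    curves = {}
    for winner, loser in games:
        rw = ratings.get(winner, start)
        rl = ratings.get(loser, start)
        rw, rl = update(rw, rl, 1.0, k)
        ratings[winner] = rw
        ratings[loser] = rl
        curves.setdefault(winner, []).append(rw)
        curves.setdefault(loser, []).append(rl)
    return curves
```

Calling `rating_curves([("BotA", "BotB"), ("BotB", "BotA")])` with these hypothetical bot names yields two short curves. Because each update depends on the opponent's current rating, a bot's curve reflects its strength relative to the rest of the field at that time rather than any absolute skill.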

Anyway, another appetizer. Here are static rating graphs for 2016 for the top 3 CIG finishers, all of which had many updates this year. The graphs run from 1 January 2016 to 27 September 2016. The authors may be interested in comparing their updates with movements in their graph. Krasi0 shows steady improvement since April, while the other two look more irregular.

graph of Krasi0’s rating in 2016

graph of Iron’s rating in 2016

graph of Tscmoo terran’s rating in 2016

Comments

Jay Scott on Saturday, October 1. 2016:

What are the dates when Tscmoo terran was playing its nuke strategy? I remember it was doing that for a while.

tscmoo on Saturday, October 1. 2016:

Wasn't that in 2015? I can't remember very well when I've made updates >.<

Jay Scott on Saturday, October 1. 2016:

My recollection is that it played the nuke strategy for two stretches, one in 2015 when the strat was developed and one later when it was restored by request because it was fun. But I don’t remember dates....

Jay Scott on Saturday, October 1. 2016:

Do the dips in Iron’s graph correspond to the times when it introduced new units?

Igor Dimitrijevic on Saturday, October 1. 2016:

That may well be. It seems that every single new feature involves a dip at first. That's quite frustrating, but it means every feature needs a lot of tuning to work well in most situations. Moreover, every new feature (including new units) is likely to modify the previous overall behavior of the bot and its overall tuning. Also, we can see that Iron barely reached the Elo it started with, when it just knew about vultures. But since then, the level of the top bots has increased a lot.

krasi0 on Saturday, October 1. 2016:

Well, isn't Elo supposed to account for the changes in strength of other bots? But I know the feeling when you introduce something new and there appears to be a hidden regression somewhere else in the behavior.

Igor Dimitrijevic on Sunday, October 2. 2016:

It is! But some readers might not be that used to elos. These posts by Jay Scott come in handy for them...

krasi0 on Saturday, October 1. 2016:

Igor, it appears that Iron crashes quite often these days. Is this a newly introduced regression? You might want to look into it...

Igor Dimitrijevic on Sunday, October 2. 2016:

Indeed! That's the new combat simulator. I am now testing it with the Protoss race, and some of the last games revealed a bug with one of the special Protoss units. It should be fixed now! But bugs like that are easy to spot and to fix. What I meant about regression is more about tuning, which is more difficult to achieve and easy to lose. When you have reached a satisfying state, with all parameters very well tuned (some kind of near optimal state), it is likely that a perturbation will first decrease the global quality...
