AIIDE 2017 results discussion
The AIIDE 2017 results include the usual thorough tables, graphs, and data files. I don’t see replays, though, so apart from what I can see in the 7 videos of sample games, I will only be guessing about play styles.
The finishing order was #1 ZZZKBot, #2 PurpleWave, and #3 Iron, all with extremely close winning percentages of 82% - 83%. If you look at the winning percentages over time, Iron started well ahead early in the tournament and the other 2 caught up over time due to learning, finally edging just in front. This year, learning is winning. Iron, we know, has superior micro and an adaptive play style that can cope with most challenges—but it does not learn about its opponents. Opponent modeling has become a feature which helps you win tournaments (and I’m still disappointed I didn’t finish it in time). Steamhammer’s random opening choice, an anti-learning feature, held Steamhammer’s results steady over time as many other non-learning bots lost ground to the learners. #4 cpac in particular lost a lot of ground and came within a hair’s breadth of falling behind #5 Microwave.
#1 ZZZKBot supposedly added new cheese builds and learned which ones to play when. I will dig into it later and see what’s going on.
ZZZKBot, cpac, and CherryPi are notable for the short durations of their games. Cpac and CherryPi apparently like cheese or at least low-econ pressure builds (we saw some of that in the videos). The average duration of games was about the same as last year, even though with many zergs this year there should have been more fast ZvZ games. That may reflect a higher level of play—or that bots don’t know how to play ZvZ.
I get a crash rate this year of 3.8% of games, compared to 6.7% last year. Apparently bots are more reliable (though some crashes may be due to the infrastructure, which could have improved too). The crash column shows 2 orders of magnitude difference between the most and least reliable bots, so no doubt a few bad apples can spoil the number. According to Dave Churchill’s comment yesterday, this year’s crash column includes games in which a bot went idle for 60 seconds, while last year’s presumably did not, so the improvement is bigger than it may appear.
the re-entrants
Even though some new entrants were weak bots which lost most of their games, the overall level of play was higher. 6 bots played in identical versions in 2016 and 2017. In 2016, 5 of the 6 scored above 50%; in 2017, only XIMP and Aiur were able to hold a little above 50%, and all the re-entrants found themselves in a tougher field.
| bot | 2016 | 2017 |
|---|---|---|
| AIUR | 61.22% | 50.46% |
| Garmbot | 42.52% | 27.09% |
| ICEbot | 57.43% | 45.62% |
| Skynet | 55.03% | 43.78% |
| Xelnaga | 56.98% | 37.10% |
| XIMP | 64.54% | 54.19% |
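To put a number on "a tougher field", here is a minimal sketch that computes each unchanged bot's drop in percentage points from the table above. The variable names are my own, and the mean is only an illustrative summary, not an official tournament statistic.

```python
# Final scores (percent of games won) copied from the table above:
# bot -> (2016 score, 2017 score).
scores = {
    "AIUR": (61.22, 50.46),
    "Garmbot": (42.52, 27.09),
    "ICEbot": (57.43, 45.62),
    "Skynet": (55.03, 43.78),
    "Xelnaga": (56.98, 37.10),
    "XIMP": (64.54, 54.19),
}

# Percentage-point drop from 2016 to 2017 for each unchanged bot.
drops = {bot: round(y2016 - y2017, 2) for bot, (y2016, y2017) in scores.items()}
mean_drop = sum(drops.values()) / len(drops)

# Which of the six scored above 50% in each year.
above_half_2016 = [bot for bot, (y2016, _) in scores.items() if y2016 > 50]
above_half_2017 = [bot for bot, (_, y2017) in scores.items() if y2017 > 50]

# Print the bots from largest drop to smallest.
for bot, drop in sorted(drops.items(), key=lambda kv: kv[1], reverse=True):
    print(f"{bot:8s} -{drop:.2f} points")
print(f"mean drop: {mean_drop:.2f} points")
```

Every one of the six lost at least 10 points, and Xelnaga lost the most, nearly 20.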
race distribution
Nearly half of the entrants this year were zerg.
| race | # |
|---|---|
| terran | 4 |
| protoss | 10 |
| zerg | 13 |
| random | 1 |
| total | 28 |
If we look only at new entrants, excluding the 6 re-entries from 2016, the skew is even more extreme: more than half are zerg. Zerg is the popular race. How much is that my fault for making Steamhammer a zerg?
| race | # |
|---|---|
| terran | 3 |
| protoss | 6 |
| zerg | 12 |
| random | 1 |
| total | 22 |
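The two race tables can be checked against each other with a few lines of arithmetic. This is only a sketch using the counts above; the dictionary and function names are my own.

```python
# Entrant counts by race, copied from the two tables above.
all_entrants = {"terran": 4, "protoss": 10, "zerg": 13, "random": 1}
new_entrants = {"terran": 3, "protoss": 6, "zerg": 12, "random": 1}


def share(counts, race):
    """Fraction of the entrants in `counts` that play `race`."""
    return counts[race] / sum(counts.values())


# The difference between the tables is the 2016 re-entrants.
reentrants = {race: all_entrants[race] - new_entrants[race] for race in all_entrants}

print(f"zerg share, all entrants: {share(all_entrants, 'zerg'):.1%}")
print(f"zerg share, new entrants: {share(new_entrants, 'zerg'):.1%}")
print(f"re-entrants by race: {reentrants}")
```

Zerg makes up about 46% of all entrants but about 55% of the new ones, and only 1 of the 6 re-entrants is zerg.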
Looking at the results, zergs clustered toward the top and bottom of the rankings. Terran and protoss are more evenly spread.
heritage
This is based on a quick troll through the source code, and I probably missed things. #4 cpac, #5 Microwave, #8 Arrakhammer, #10 Steamhammer, and #18 KillAll are Steamhammer and its forks. Steamhammer itself finished behind most of them, which may represent the cost of open source. All the successful forks, of course, represent the benefit of open source. The direct UAlbertaBot forks—leaving aside Steamhammer and its family—seem to be #11 Ailien, #12 LetaBot, #14 UAlbertaBot itself, #21 Overkill, and #24 Myscbot, and possibly #26 Sling (it seems to borrow some code, at least).
Lesson 1: This is a lot of bots. UAlbertaBot is extremely influential, and its influence is still growing.
Lesson 2: The Steamhammer Branch was mostly more successful than the Old UAlbertaBot Family that it stems from. Apparently I did some things right.
Next: The massive per-map crosstables. Race balance.