AIIDE 2017 results discussion
The AIIDE 2017 results include the usual thorough tables and graphs and data files. I don’t see replays, though, so I will only be guessing about play styles, except of course for what I can see in the 7 videos of sample games.
The finishing order was #1 ZZZKBot, #2 PurpleWave, and #3 Iron, all with extremely close winning percentages of 82% to 83%. If you look at the winning percentages over time, Iron started well ahead early in the tournament and the other 2 caught up over time due to learning, finally edging just in front. This year, learning is winning. Iron, we know, has superior micro and an adaptive play style that can cope with most challenges, but it does not learn about its opponents. Opponent modeling has become a feature which helps you win tournaments (and I’m still disappointed I didn’t finish Steamhammer’s in time). Steamhammer’s random opening choice, an anti-learning feature, held Steamhammer’s results steady over time as many other non-learning bots lost ground to the learners. #4 cpac in particular lost a lot of ground and came within a hair’s breadth of falling behind #5 Microwave.
#1 ZZZKBot supposedly added new cheese builds and learned which ones to play when. I will dig into it later and see what’s going on.
ZZZKBot, cpac, and CherryPi are notable for the short durations of their games. Cpac and CherryPi apparently like cheese or at least low-econ pressure builds (we saw some of that in the videos). The average duration of games was about the same as last year, even though with many zergs this year there should have been more fast ZvZ games. That may reflect a higher level of play—or that bots don’t know how to play ZvZ.
I get a crash rate this year of 3.8% of games, compared to 6.7% of games last year. Apparently bots are more reliable (though some crashes may be due to the infrastructure, which could have improved too). The crash column does show 2 orders of magnitude difference between the most and least reliable bots, so no doubt a few bad apples can spoil the number. According to Dave Churchill’s comment yesterday, the crash column this year includes games in which a bot went idle for 60 seconds, while last year’s presumably didn’t, so the improvement is bigger than it may appear.
the re-entrants
Even though some new entrants were weak bots which lost most of their games, the overall level of play was higher. 6 bots played in identical versions in 2016 and 2017. In 2016, 5 of the 6 scored above 50%; in 2017, only XIMP and AIUR were able to hold a little above 50%, and all the re-entrants found themselves in a tougher field.
| bot | 2016 | 2017 |
|---|---|---|
| AIUR | 61.22% | 50.46% |
| Garmbot | 42.52% | 27.09% |
| ICEbot | 57.43% | 45.62% |
| Skynet | 55.03% | 43.78% |
| Xelnaga | 56.98% | 37.10% |
| XIMP | 64.54% | 54.19% |
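To put a number on "a tougher field", here is a minimal sketch that computes each re-entrant's drop in percentage points, using only the figures from the table above. The names `scores` and `score_drops` are made up for this illustration.

```python
# Winning percentages copied from the re-entrants table: (2016, 2017).
scores = {
    "AIUR": (61.22, 50.46),
    "Garmbot": (42.52, 27.09),
    "ICEbot": (57.43, 45.62),
    "Skynet": (55.03, 43.78),
    "Xelnaga": (56.98, 37.10),
    "XIMP": (64.54, 54.19),
}


def score_drops(table):
    """Return {bot: 2016 score minus 2017 score}, in percentage points."""
    return {bot: round(y2016 - y2017, 2) for bot, (y2016, y2017) in table.items()}


drops = score_drops(scores)

# Which bots finished above 50% in each year.
above_half_2016 = [bot for bot, (y2016, _) in scores.items() if y2016 > 50]
above_half_2017 = [bot for bot, (_, y2017) in scores.items() if y2017 > 50]

# Print the biggest losers first.
for bot, drop in sorted(drops.items(), key=lambda kv: kv[1], reverse=True):
    print(f"{bot:8s} down {drop:.2f} points")
```

Every one of the 6 unchanged bots lost more than 10 points, with Xelnaga falling the furthest.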
race distribution
Nearly half of the entrants this year were zerg.
| race | # |
|---|---|
| terran | 4 |
| protoss | 10 |
| zerg | 13 |
| random | 1 |
| total | 28 |
If we look at new entrants, excluding the 2016 re-entries, it’s even more extreme. Zerg is the popular race. How much is that my fault for making Steamhammer a zerg?
| race | # |
|---|---|
| terran | 3 |
| protoss | 6 |
| zerg | 12 |
| random | 1 |
| total | 22 |
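For the curious, a small sketch of the arithmetic behind the two race tables, using only the counts given above. The helper name `share` is invented for this example, and the re-entrant counts are derived by subtraction rather than taken from the results.

```python
# Entrant counts copied from the two tables above.
all_entrants = {"terran": 4, "protoss": 10, "zerg": 13, "random": 1}
new_entrants = {"terran": 3, "protoss": 6, "zerg": 12, "random": 1}


def share(counts, race):
    """Percentage of entrants in `counts` that play `race`."""
    return 100.0 * counts[race] / sum(counts.values())


# Re-entrants per race are whatever is left after removing the new entrants.
re_entrants = {race: all_entrants[race] - new_entrants[race] for race in all_entrants}

print(f"zerg share, all entrants: {share(all_entrants, 'zerg'):.1f}%")
print(f"zerg share, new entrants: {share(new_entrants, 'zerg'):.1f}%")
print(f"re-entrants by race: {re_entrants}")
```

Zerg goes from a bit under half of all entrants to a bit over half of the new ones, because only 1 of the 6 re-entrants is zerg.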
Looking at the results, zergs clustered toward the top and bottom of the rankings. Terran and protoss are more evenly spread.
heritage
This is based on a quick troll through the source code, and I probably missed things. #4 cpac, #5 Microwave, #8 Arrakhammer, #10 Steamhammer, and #18 KillAll are Steamhammer and its forks. Steamhammer itself finished behind most of them, which may represent the cost of open source. All the successful forks, of course, represent the benefit of open source. The direct UAlbertaBot forks—leaving aside Steamhammer and its family—seem to be #11 Ailien, #12 LetaBot, #14 UAlbertaBot itself, #21 Overkill, and #24 Myscbot, and possibly #26 Sling (it seems to borrow some code, at least).
Lesson 1: This is a lot of bots. UAlbertaBot is extremely influential, and its influence is still growing.
Lesson 2: The Steamhammer Branch was mostly more successful than the Old UAlbertaBot Family that it stems from. Apparently I did some things right.
Next: The massive per-map crosstables. Race balance.
Comments
Nathan Gamble on :
ZZZKBot had only 2 zerg opponents that it lost more than 20 games to.
PurpleWave had only 4.
Iron and cpac had 5.
Microwave had 6.
CherryPi had 7.
McRave had 9.
All of these bots had different zerg bots that caused them upsets, but being able to beat weaker zergs consistently seems to have played a major part in determining the rankings.
The next few bots (Arrakhammer, Tyr, Steamhammer, Ailien, LetaBot) are bots that have managed 2-3 upsets against some of the top zergs.
The rest of the bots apart from Hannes Bredberg all failed to beat more than 1 zerg that finished with a higher ranking than they did.
Dave Churchill on :
- There have been no strategic enhancements to the bot since 2014, only a few more hard-coded opening builds since it went Random
- I removed the BWTA requirement from the bot to make it easier for people to compile and use, which means I no longer have access to base polygons, so my scouting worker just moves in and out of the base vs. roaming around inside the base
- I updated SparCraft and BOSS without a ton of testing, and I think it may have led to a few bugs
- The build orders were not as fully tested as the 2016 and 2015 versions, with some ad-hoc changes that probably led to a few crashes
It's really great that so many bots have been based on the architecture. I feel that the strategic fixes you made to the bot really make it much stronger. Looking forward to seeing what you did in Steamhammer for this competition.