AIIDE 2017 results discussion

The AIIDE 2017 results include the usual thorough tables, graphs, and data files. I don’t see replays, though, so I can only guess about play styles, except for what I can see in the 7 videos of sample games.

The finishing order was #1 ZZZKBot, #2 PurpleWave, and #3 Iron, all with extremely close winning percentages of 82% to 83%. If you look at the winning percentages over time, Iron started well ahead early in the tournament, and the other 2 caught up through learning, finally edging just in front. This year, learning is winning. Iron, we know, has superior micro and an adaptive play style that can cope with most challenges, but it does not learn about its opponents. Opponent modeling has become a feature that helps you win tournaments (and I’m still disappointed I didn’t finish it in time). Steamhammer’s random opening choice, an anti-learning feature, held Steamhammer’s results steady over time while many other non-learning bots lost ground to the learners. #4 cpac in particular lost a lot of ground and came within a hair’s breadth of falling behind #5 Microwave.

#1 ZZZKBot supposedly added new cheese builds and learned which ones to play when. I will dig into it later and see what’s going on.

ZZZKBot, cpac, and CherryPi are notable for the short durations of their games. Cpac and CherryPi apparently like cheese or at least low-econ pressure builds (we saw some of that in the videos). The average duration of games was about the same as last year, even though with so many zergs this year there should have been more fast ZvZ games. That may reflect a higher level of play, or it may mean that bots don’t know how to play ZvZ.

I calculate a crash rate of 3.8% of games this year, compared to 6.7% of games last year. Apparently bots are more reliable (though some crashes may be due to the infrastructure, which could have improved too). The crash column does show 2 orders of magnitude difference between the most and least reliable bots, so no doubt a few bad apples can spoil the number. According to Dave Churchill’s comment yesterday, the crash column this year includes games in which a bot went idle for 60 seconds, while last year’s presumably didn’t, so the improvement is bigger than it may appear.

the re-entrants

Even though some new entrants were weak bots which lost most of their games, the overall level of play was higher. 6 bots played in identical versions in 2016 and 2017. In 2016, 5 of the 6 scored above 50%; in 2017, only XIMP and AIUR were able to hold a little above 50%, and all the re-entrants found themselves in a tougher field.

bot	2016	2017
AIUR	61.22%	50.46%
Garmbot	42.52%	27.09%
ICEbot	57.43%	45.62%
Skynet	55.03%	43.78%
Xelnaga	56.98%	37.10%
XIMP	64.54%	54.19%
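
The size of the drop can be checked with a little arithmetic. What follows is a minimal Python sketch that uses only the percentages from the table above; the variable names are illustrative and are not taken from the tournament data files.

```python
# Win percentages for the 6 re-entrants, copied from the table above.
# Each value is (2016 score, 2017 score).
scores = {
    "AIUR": (61.22, 50.46),
    "Garmbot": (42.52, 27.09),
    "ICEbot": (57.43, 45.62),
    "Skynet": (55.03, 43.78),
    "Xelnaga": (56.98, 37.10),
    "XIMP": (64.54, 54.19),
}

# Change in win percentage from 2016 to 2017 (negative means a drop).
change = {bot: round(y2017 - y2016, 2) for bot, (y2016, y2017) in scores.items()}

# How many re-entrants finished above 50% in each year.
above_50_2016 = sum(1 for y2016, _ in scores.values() if y2016 > 50)
above_50_2017 = sum(1 for _, y2017 in scores.values() if y2017 > 50)

mean_change = sum(change.values()) / len(change)
biggest_drop = min(change, key=change.get)

# Print the bots from largest drop to smallest.
for bot, delta in sorted(change.items(), key=lambda kv: kv[1]):
    print(f"{bot:8s} {delta:+.2f}")
print(f"above 50%: {above_50_2016} in 2016, {above_50_2017} in 2017")
print(f"mean change: {mean_change:+.2f} points ({biggest_drop} fell furthest)")
```

By this calculation every re-entrant lost more than 10 percentage points, and the average loss was about 13 points.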

race distribution

Nearly half of the entrants this year were zerg.

race	#
terran	4
protoss	10
zerg	13
random	1
total	28

If we look at new entrants, excluding the 2016 re-entries, it’s even more extreme. Zerg is the popular race. How much of that is my fault for making Steamhammer zerg?

race	#
terran	3
protoss	6
zerg	12
random	1
total	22
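
To put numbers on "nearly half" and "even more extreme", here is a small Python sketch that computes each race's share from the two tables above. The dictionary and function names are illustrative; the only data is the counts already given.

```python
# Entrant counts by race, copied from the two tables above.
all_entrants = {"terran": 4, "protoss": 10, "zerg": 13, "random": 1}
new_entrants = {"terran": 3, "protoss": 6, "zerg": 12, "random": 1}


def shares(counts):
    """Return each race's share of the field as a percentage."""
    total = sum(counts.values())
    return {race: 100.0 * n / total for race, n in counts.items()}


all_shares = shares(all_entrants)
new_shares = shares(new_entrants)

# The difference between the tables is the set of 2016 re-entrants.
re_entrants = {race: all_entrants[race] - new_entrants[race] for race in all_entrants}

for race in all_entrants:
    print(f"{race:8s} all: {all_shares[race]:5.1f}%   new: {new_shares[race]:5.1f}%")
print(f"re-entrants by race: {re_entrants}")
```

Zerg works out to about 46% of all entrants and about 55% of the new entrants. The difference between the two tables sums to 6, which matches the 6 re-entrants.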

Looking at the results, zergs clustered toward the top and bottom of the rankings. Terran and protoss are more evenly spread.

heritage

This is based on a quick troll through the source code, and I probably missed things. #4 cpac, #5 Microwave, #8 Arrakhammer, #10 Steamhammer, and #18 KillAll are Steamhammer and its forks. Steamhammer itself finished behind most of them, which may represent the cost of open source. All the successful forks, of course, represent the benefit of open source. The direct UAlbertaBot forks—leaving aside Steamhammer and its family—seem to be #11 Ailien, #12 LetaBot, #14 UAlbertaBot itself, #21 Overkill, and #24 Myscbot, and possibly #26 Sling (it seems to borrow some code, at least).

Lesson 1: This is a lot of bots. UAlbertaBot is extremely influential, and its influence is still growing.

Lesson 2: The Steamhammer Branch was mostly more successful than the Old UAlbertaBot Family that it stems from. Apparently I did some things right.

Next: ~~The massive per-map crosstables.~~ Race balance.

Comments

krasi0 on Monday, October 9. 2017:

It's refreshing to see more (and stronger) Zerg entries having been submitted this year. Next, we really need to see those in the meatgrinder called SSCAIT ladder :)

PurpleWaveJadien on Monday, October 9. 2017:

I didn't envy anyone trying to win with Zerg in this tournament. ZvZ is such a volatile matchup and the round robin format rewards consistency. Also crazy how fast development has been in the past couple of years. Seems like it used to be possible to be a top entry in consecutive years without updates. No longer.

Jay Scott on Monday, October 9. 2017:

No? And yet the winner is a zerg. I happened to just run race balance numbers, and it turns out that #1 ZZZKBot’s best matchup was ZvZ. Steamhammer’s best matchup is also ZvZ, by a large margin.

Antiga / Iruian on Monday, October 9. 2017:

Amazingly close tournament! I was doing some napkin math for my bot Antiga on the rewards that learning could give. I'm betting Steamhammer / Randomhammer will see a 50-75 Elo jump on the SSCAIT ladder within a few weeks of implementation. I think the rewards of having it will be well worth it.

Nathan Gamble on Monday, October 9. 2017:

I think that the consistency against zerg shows a pretty accurate picture of the top of the table.

ZZZKBot had only 2 zerg opponents that it lost more than 20 games to.
PurpleWave had only 4.
Iron and cpac had 5.
Microwave had 6.
CherryPi had 7.
McRave had 9.

All of these bots had different zerg bots that caused them upsets, but being able to beat weaker zergs consistently seems to have played a major part in determining the rankings.

The next few bots (Arrakhammer, Tyr, Steamhammer, Ailien, LetaBot) are bots that have managed 2-3 upsets against some of the top zergs.

The rest of the bots apart from Hannes Bredberg all failed to beat more than 1 zerg that finished with a higher ranking than they did.

Dave Churchill on Tuesday, October 10. 2017:

For another data point: UAlbertaBot 2017 is strictly worse than 2016, for several reasons:

- There have been no strategic enhancements to the bot since 2014, only a few more hard-coded opening builds since it went Random

- I removed the BWTA requirement from the bot to make it easier for people to compile and use, which means I no longer have access to base polygons, so my scouting worker just moves in and out of the base vs. roaming around inside the base

- I updated SparCraft and BOSS without a ton of testing, and I think it may have led to a few bugs

- The build orders were not as fully tested as the 2016 and 2015 versions, with some ad-hoc changes that probably led to a few crashes

It's really great that so many bots have been based on the architecture. I feel that the strategic fixes you made to the bot really make it much stronger. Looking forward to seeing what you did in Steamhammer for this competition.
