AIIDE 2019 results first look
Important update on Friday 11 October: The results are invalid due to an error, and the tournament will be repeated from scratch. See Dave Churchill’s tweet: “The 2019 AIIDE StarCraft AI Competition will have to be re-run due to an error on our part causing a corrupted file which caused McRave to crash a lot of games.” The same error might have caused other problems. Even if McRave was the only bot directly affected, the competition was a round robin, so every bot’s score was potentially affected.
The AIIDE 2019 results were announced today at the conference. The AIIDE conference stream includes Dave Churchill’s presentation, starting at about 52:30. The results come with a video of Locutus versus PurpleWave, with commentary by Dan Gant focusing not on the game itself but on the AI techniques.
The standings: #1 Locutus edged out #2 PurpleWave. #3 DaQin and #4 BananaBrain were far behind, but finished out the dominant protoss bloc. (The win rate over time graph strangely omits #4 BananaBrain.) #5 Iron, #6 Microwave, #7 XiaoYi, and #8 Steamhammer were closely grouped around a 50% win rate. As in CoG, Iron was the top terran and the top returning bot, and Microwave was the top zerg.
#10 McRave did surprisingly poorly. It must be suffering from new bugs. I notice that McRave’s army has become strangely passive; it sometimes seems unwilling to fight even with a large advantage. That seems like a symptom of an important bug.
#8 Steamhammer did about as I expected, or at least as I expected after I noticed the combat sim bug that I had just added. Without that bug I think it would have finished slightly ahead of Microwave. I’m bothered by the 59% win rate against Iron, though; I expected over 90%. I tested on every map with the correct version of Iron, but must have made a mistake somewhere.
Last year, Bruce Nielsen provided diffs from Locutus for bots derived from it. This year, Dan Gant has provided diffs of a few other bots.
• Stormbreaker derived from SAIDA - Stormbreaker was disqualified because its behavior was nearly identical to SAIDA’s, though there are big code differences. According to the presentation, Stormbreaker adds a neural network but does not use it.
• XiaoYi derived from SAIDA - According to the presentation, SAIDA would likely have finished 3rd if it had played. XiaoYi placed 7th behind Microwave.
• DaQin this year versus last year. I see a great many detailed changes.
We were promised a second competition on “unknown” maps, for those bots which did not opt out. I count 8 participants for the second competition. I don’t see a sign of its results. Perhaps it has not been run yet.
As always, I will analyze both CoG and AIIDE. But CoG is showing evidence of sloppiness, so AIIDE deserves more attention. With fewer entrants in AIIDE this year, it won’t take as long to dig into them. But I think I have almost managed to interpret the CoG result file, so I’ll start there.
Comments
Dave Churchill:
- Inspection showed that most of the code changes in Stormbreaker compared to SAIDA are code re-ordering, weird large commented-out sections, and mostly cosmetic things. They call their neural network in only one location, but the result immediately goes out of scope and is never used. I spoke to the author, and after I presented these facts they agreed with the DQ.
- We are investigating why McRave failed to start in nearly every game vs. Locutus. If we find it's on our side of things, then we will re-run the tournament. We made updates to the software for more accurate reporting of crashes, and it is possible we made a mistake. But it's very strange that it would only affect one bot vs. another bot. They both use BWEB, according to McRave, so maybe that is part of the culprit? Note that BASIL doesn't check for start-up times or frame timeouts.
- Due to an earlier AIIDE, we haven't run the 2nd unknown-maps competition yet; that will come later this month. Apologies for the delays.
Jay Scott:
Looking forward to the second competition.
Bruce:
I’ve uploaded a branch for 2019 DaQin on my github as well, so a diff with my AIIDE 2018 tag is probably possible there to see the full divergence from Locutus.
McRave:
I was opposed to re-running the tournament but I understand that it's the right thing to do.
My bot did poorly mostly from timeouts, which I did not find in testing but which are obvious from some videos posted on Discord of McRave running on the AIIDE hardware / TM.
It's disappointing, but it is what it is. The bot is strong but did not constrain itself to the rules of the tournament.
Dave Churchill:
Now that we've identified this, we will be re-running the entire tournament this week, so you may want to hold off on analysis until then.
Jay Scott:
It will be interesting to compare the “unofficial preliminary” results with the final results.