a first look at the AIIDE 2021 results
AIIDE 2021 results are out.
Ever since the unfortunate withdrawal of PurpleWave due to frame time issues, it was certain that protoss Stardust and BananaBrain would finish first and second. It seemed likely from the start, but without the other strong protoss it was inescapable. As it turned out, #1 Stardust was in a class by itself, scoring 96%, and #2 BananaBrain was in the following class by itself at 80%. #3 Dragon was only the best of the rest, the leader of the trailers, barely above breakeven with 51%. All others scored below even. I didn’t expect Dragon to place so high, because it was a holdover from last year and bots should have been prepared for it. I knew that #4 Steamhammer would outscore it head-to-head.
#4 Steamhammer did great at 49%. I met my goals of finishing above the middle and of murderfying #7 Microwave (73% score head-to-head). I had hoped to make third, but missed by about 1.4%. I expected to and did beat #3 Dragon, #5 McRave, and #7 Microwave, so I had some reason for that hope. I knew that Steamhammer risked a zero score against #1 Stardust, and it did happen, but the win count was going to be tiny no matter what, so it wasn’t a big concern. I was worried about #6 WillyT because its big tank-infantry attacks are effective, but Steamhammer scored OK there too with 56%. Like last year, the trouble was a huge upset by carryover #8 DaQin. Steamhammer scored 22% against it in 2020 and 27% in 2021: an improvement, but not by much. I had expected better.
#5 McRave scored better than the other zergs versus #1 Stardust and #2 BananaBrain, but it was not enough to move the needle. It was upset by #6 WillyT and, strangely, by #10 UAlbertaBot (last year it scored 89% against UAlbertaBot). #6 WillyT could not cope at all with #3 Dragon, and was upset by #8 DaQin too. #7 Microwave was little updated, according to the author. #9 FreshMeat, the new zerg by Hao Pan, scored 34% and was the tail ender of the submitted bots (those other than the holdovers). #10 UAlbertaBot’s upset of #5 McRave and stubborn ability to score some wins against every opponent kept it up at 27%, higher than I had anticipated. I guess UAlbertaBot will remain a usable benchmark for at least one more year.
The tournament ranks are similar to the BASIL ranks. BASIL has Stardust as the top among the AIIDE participants and BananaBrain as next. Microwave’s higher placement on BASIL is the biggest discrepancy. FreshMeat may be class B on BASIL and ranked 18 out of 86, but its BASIL rank still predicts its second-to-last finish.
This highlights that AIIDE 2021 was an elite tournament. There were few participants, and every submitted bot was already known to be highly ranked. Three newcomer bots registered, and none was submitted. To me, it smells as though authors only want to submit if they believe they can do well. I see that as a mistake. From the author’s point of view, a tournament is a chance to gain experience, to learn about your own bot and others, and to show off your good ideas. From the community’s point of view, a tournament is an opportunity to invite new members in and to trade insights. In my experience, virtually every bot has good ideas that we can learn from. Many bots that perform poorly in games still have impressive skills in specific circumstances, not to mention other clever ideas. See for example my analysis of AITP, which scored 12% in AIIDE 2019.
Next: New bot Broken Horn. After that, stand by for more analysis of AIIDE.
Comments
MicroDK:
McRave, WillyT, Microwave, DaQin all were very close, though with 30 games lost on frame timeout it is hard to get a better position. Microwave only had 3 in COG... I wonder why?
nklausner:
Looking at the BASIL scores and test games I was pretty confident that WillyT would beat Steamhammer. I was more worried about McRaveZ and FreshMeat. So congrats on beating WillyT as well! It's nice that the pimped UAB is still good for something. And yes, my bot is still fundamentally unable to play good TvT, because it doesn't know siege lines. We will see whether that changes. With Terran being slightly underrepresented it wasn't pressing yet.