AIIDE 2020 - a first look at the results
I enjoyed lurking in the AIIDE 2020 stream last night. The top winners were easily predicted: #1 Stardust, #2 PurpleWave, #3 BananaBrain. #4 terran Dragon did well, and #5 McRave playing zerg did great to finish as well as it did in the era of protoss domination. The race pattern continues: Protoss at the top and otherwise scattered randomly down the table, terran split between strong bots near the top and weaker bots near the bottom, and zerg clumped in the middle. Of course that is only a general pattern; every bot is on its own.
I will be doing my usual results analysis, and I hope more than my usual. I’m curious about how some of the specific results came about.
Here is my version of the crosstable, computed from the detailed results. It exactly matches the official crosstable; only the presentation is different. A total of 5 games went uncounted due to GAME_STATE_NOT_UPDATED_60S_BOTH_BOTS, all of them involving UAlbertaBot.
# | bot | overall | star | purp | bana | drag | mcra | micr | stea | daqi | zzzk | ualb | will | ecgb | eggb |
---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
1 | stardust | 93.22% | - | 83% | 62% | 93% | 98% | 99% | 98% | 93% | 99% | 98% | 99% | 99% | 97% |
2 | purplewave | 79.44% | 17% | - | 63% | 45% | 95% | 71% | 95% | 87% | 91% | 93% | 98% | 99% | 100% |
3 | bananabrain | 69.61% | 38% | 37% | - | 45% | 55% | 61% | 71% | 66% | 94% | 88% | 84% | 99% | 98% |
4 | dragon | 62.38% | 7% | 55% | 55% | - | 79% | 57% | 30% | 53% | 47% | 80% | 93% | 94% | 97% |
5 | mcrave | 57.22% | 2% | 5% | 45% | 21% | - | 77% | 57% | 65% | 83% | 89% | 63% | 80% | 99% |
6 | microwave | 54.47% | 1% | 29% | 39% | 43% | 23% | - | 29% | 83% | 93% | 61% | 65% | 88% | 100% |
7 | steamhammer | 54.00% | 2% | 5% | 29% | 70% | 43% | 71% | - | 22% | 75% | 95% | 55% | 83% | 97% |
8 | daqin | 50.14% | 7% | 13% | 34% | 47% | 35% | 17% | 78% | - | 9% | 69% | 96% | 99% | 97% |
9 | zzzkbot | 39.89% | 1% | 9% | 6% | 53% | 17% | 7% | 25% | 91% | - | 49% | 92% | 29% | 100% |
10 | ualbertabot | 31.14% | 2% | 7% | 12% | 20% | 11% | 39% | 5% | 31% | 51% | - | 45% | 61% | 90% |
11 | willyt | 29.44% | 1% | 2% | 16% | 7% | 37% | 35% | 45% | 4% | 8% | 55% | - | 74% | 69% |
12 | ecgberht | 24.28% | 1% | 1% | 1% | 6% | 20% | 12% | 17% | 1% | 71% | 39% | 26% | - | 97% |
13 | eggbot | 4.72% | 3% | 0% | 2% | 3% | 1% | 0% | 3% | 3% | 0% | 10% | 31% | 3% | - |
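For readers who want to build a similar crosstable from a detailed results file, here is a minimal sketch. It is not the script behind the table above. It assumes the results have already been reduced to `(winner, loser)` pairs with uncounted games filtered out, and the bot names `alpha`, `beta`, and `gamma` are hypothetical toy data.

```python
from collections import defaultdict


def crosstable(games):
    """Return {bot: {opponent: win_rate}} from (winner, loser) pairs."""
    wins = defaultdict(lambda: defaultdict(int))
    played = defaultdict(lambda: defaultdict(int))
    for winner, loser in games:
        wins[winner][loser] += 1
        played[winner][loser] += 1
        played[loser][winner] += 1
    # A pairing with no wins reads as 0 from the defaultdict.
    return {
        bot: {opp: wins[bot][opp] / n for opp, n in opponents.items()}
        for bot, opponents in played.items()
    }


def overall(games):
    """Return {bot: total wins / total games played}."""
    wins = defaultdict(int)
    played = defaultdict(int)
    for winner, loser in games:
        wins[winner] += 1
        played[winner] += 1
        played[loser] += 1
    return {bot: wins[bot] / n for bot, n in played.items()}


# Hypothetical toy results, not tournament data.
games = (
    [("alpha", "beta")] * 3
    + [("beta", "alpha")] * 1
    + [("alpha", "gamma")] * 2
    + [("gamma", "beta")] * 1
    + [("beta", "gamma")] * 1
)

table = crosstable(games)
rates = overall(games)

# Rank bots by overall win rate, best first, as in the crosstable rows.
ranking = sorted(rates, key=rates.get, reverse=True)
```

Counting games rather than averaging percentages keeps the overall figure correct even when a few games in one pairing go uncounted.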
It’s curious that Stardust towered over every opponent (only BananaBrain was able to put up a serious fight) but did not score 100% against any. #4 Dragon upset protoss #2 PurpleWave and #3 BananaBrain, but was upset in turn by zergs #7 Steamhammer and #9 ZZZKBot. That is typical of tscmoo-authored bots: they are tuned to do well against the best, and show some weakness against the rest. I’m surprised by UAlbertaBot’s relatively high finish; I expected it to be second to last.
In my original post on the bots registered for AIIDE I separated out 3 new bots, DanDanBot, Randofoo, and Taij. I didn’t mention the other new entrant, EggBot, which hadn’t appeared on the list yet. Of the 4 new bots, only EggBot ended up playing. None of the familiar old names dropped out. Way to go EggBot! In my book it did not finish last, it finished ahead of 3 no-shows, and ahead of everyone who was afraid to sign up at all. You don’t have to have a serious chance in the competition to take the competition seriously; opportunities are to be taken. The only downside is that I can no longer say “Eggie” to mean Ecgberht.
I am of course especially interested in Steamhammer’s results. Its rival Microwave squeaked ahead with 8 extra wins out of the 1800 games. #7 Steamhammer upset #4 Dragon and #6 Microwave by about 70% each, but it was crushed by #8 DaQin, scoring only 22% (where Microwave scored 83%). In my post “Steamhammer’s prepared learning data for AIIDE 2020” I said, “I also didn’t prepare against DaQin because I didn’t have recent data handy; I could have tried harder, but time was short.” That one omission was my downfall!
The race balance tables are not very interesting, since protoss dominates. And of course there was only one random player, UAlbertaBot. Nevertheless, here they are. The overall race balance:
race | overall | vT | vP | vZ | vR |
---|---|---|---|---|---|
terran | 39% | - | 31% | 38% | 58% |
protoss | 59% | 69% | - | 58% | 72% |
zerg | 51% | 62% | 42% | - | 73% |
random | 31% | 42% | 28% | 27% | - |
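A race balance table like this can be produced by pooling games by the race of each side. The sketch below is one way to do it, not necessarily how the numbers above were computed. It assumes `(winner, loser)` game pairs and a `race_of` mapping; the bot names and results are hypothetical. Mirror matchups are skipped, since they always pool to 50%.

```python
from collections import defaultdict


def race_balance(games, race_of):
    """Return {(race, opponent_race): win_rate}, pooling games by race."""
    wins = defaultdict(int)
    played = defaultdict(int)
    for winner, loser in games:
        rw, rl = race_of[winner], race_of[loser]
        if rw == rl:
            continue  # mirror matchup: one side of the same race always wins
        wins[(rw, rl)] += 1
        played[(rw, rl)] += 1
        played[(rl, rw)] += 1
    # A matchup with no wins reads as 0 from the defaultdict.
    return {pair: wins[pair] / n for pair, n in played.items()}


# Hypothetical toy data, not tournament results.
race_of = {"alpha": "protoss", "beta": "zerg", "gamma": "terran", "delta": "zerg"}
games = (
    [("alpha", "beta")] * 3
    + [("beta", "alpha")] * 1
    + [("alpha", "delta")] * 2
    + [("gamma", "delta")] * 1
    + [("delta", "gamma")] * 3
    + [("beta", "delta")] * 5  # zerg mirror, ignored
)

balance = race_balance(games, race_of)
```

Each matchup and its reverse sum to 100%, which is a quick sanity check on any such table.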
Each bot’s results by opponent race. I think the table tells more about the opponents grouped by race than about the bots listed on the left.
# | bot | race | overall | vT | vP | vZ | vR |
---|---|---|---|---|---|---|---|
1 | stardust | protoss | 93.22% | 97% | 84% | 99% | 98% |
2 | purplewave | protoss | 79.44% | 81% | 67% | 88% | 93% |
3 | bananabrain | protoss | 69.61% | 76% | 60% | 70% | 88% |
4 | dragon | terran | 62.38% | 94% | 54% | 53% | 80% |
5 | mcrave | zerg | 57.22% | 55% | 43% | 72% | 89% |
6 | microwave | zerg | 54.47% | 65% | 50% | 48% | 61% |
7 | steamhammer | zerg | 54.00% | 70% | 31% | 63% | 95% |
8 | daqin | protoss | 50.14% | 81% | 38% | 35% | 69% |
9 | zzzkbot | zerg | 39.89% | 58% | 41% | 16% | 49% |
10 | ualbertabot | random | 31.14% | 42% | 28% | 27% | - |
11 | willyt | terran | 29.44% | 40% | 18% | 31% | 55% |
12 | ecgberht | terran | 24.28% | 16% | 20% | 30% | 39% |
13 | eggbot | protoss | 4.72% | 12% | 2% | 1% | 10% |
Next: Results by map.
Comments
Dan on :
Very well put. And to that I'd add everyone who thought making a bot could be fun but didn't (which included me for seven years). Proud of EggBot's author and hope to see more.
Tully Elliston on :
Going in with learning data (or lack of) often seems to be a Steamhammer downfall in these tourneys; the tradition is upheld!
Congrats on getting the upper hand over Microwave over the set.
I feel that a focus on defensive skills increases consistency against weaker opponents, while a focus on offensive skills increases the ability of a bot to upset stronger opponents when they make errors. Steamhammer's focus on aggression is costing it more games against weaker opponents compared to Microwave, which fares better against these across the board. But Steamhammer fares better against the stronger opponents.
The protoss dominance at the moment I think is a product of the attributes of the faction vs the level of gameplay bots are capable of at the moment. Protoss thrives in situations where armies engage over low surface areas. Most bots at the moment are incapable of concaves, line moves and flanking (Steamhammer likes to attack in a single file line), which turbocharges Protoss. When Steamhammer and other zerg bots learn to attack across a broad front (and to flank!) I suspect this trend will drastically reverse.
Jay Scott on :
If the opponent varies, then Steamhammer varies too to avoid being exploited in just that way.