AIIDE 2020 - a first look at the results

I enjoyed lurking in the AIIDE 2020 stream last night. The top finishers were easy to predict: #1 Stardust, #2 PurpleWave, #3 BananaBrain. #4 terran Dragon did well, and #5 McRave playing zerg did great to finish as high as it did in the era of protoss domination. The race pattern continues: Protoss at the top and otherwise scattered randomly down the table, terran split between strong bots near the top and weaker bots near the bottom, and zerg clumped in the middle. Of course that is only a general pattern; every bot is on its own.

I will be doing my usual results analysis, and I hope more than my usual. I’m curious about how some of the specific results came about.

Here is my version of the crosstable, computed from the detailed results. It exactly matches the official crosstable; only the presentation is different. A total of 5 games went uncounted due to GAME_STATE_NOT_UPDATED_60S_BOTH_BOTS, all of them involving UAlbertaBot.

#	bot	overall	star	purp	bana	drag	mcra	micr	stea	daqi	zzzk	ualb	will	ecgb	eggb
1	stardust	93.22%		83%	62%	93%	98%	99%	98%	93%	99%	98%	99%	99%	97%
2	purplewave	79.44%	17%		63%	45%	95%	71%	95%	87%	91%	93%	98%	99%	100%
3	bananabrain	69.61%	38%	37%		45%	55%	61%	71%	66%	94%	88%	84%	99%	98%
4	dragon	62.38%	7%	55%	55%		79%	57%	30%	53%	47%	80%	93%	94%	97%
5	mcrave	57.22%	2%	5%	45%	21%		77%	57%	65%	83%	89%	63%	80%	99%
6	microwave	54.47%	1%	29%	39%	43%	23%		29%	83%	93%	61%	65%	88%	100%
7	steamhammer	54.00%	2%	5%	29%	70%	43%	71%		22%	75%	95%	55%	83%	97%
8	daqin	50.14%	7%	13%	34%	47%	35%	17%	78%		9%	69%	96%	99%	97%
9	zzzkbot	39.89%	1%	9%	6%	53%	17%	7%	25%	91%		49%	92%	29%	100%
10	ualbertabot	31.14%	2%	7%	12%	20%	11%	39%	5%	31%	51%		45%	61%	90%
11	willyt	29.44%	1%	2%	16%	7%	37%	35%	45%	4%	8%	55%		74%	69%
12	ecgberht	24.28%	1%	1%	1%	6%	20%	12%	17%	1%	71%	39%	26%		97%
13	eggbot	4.72%	3%	0%	2%	3%	1%	0%	3%	3%	0%	10%	31%	3%
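
For anyone who wants to build a crosstable like this from their own game records, here is a minimal Python sketch. It is not the script used for this post: the (bot_a, bot_b, winner) record format and the function names are assumptions for illustration, with winner set to None for a game that went uncounted.

```python
from collections import defaultdict

def crosstable(games):
    """Tally pairwise wins and counted games.

    games: iterable of (bot_a, bot_b, winner) tuples (a hypothetical format).
    winner is bot_a, bot_b, or None for a game that should not be counted.
    """
    wins = defaultdict(int)    # (winner, loser) -> number of wins
    played = defaultdict(int)  # (bot, opponent) -> counted games
    for bot_a, bot_b, winner in games:
        if winner is None:
            continue  # uncounted game, e.g. both bots timed out
        if winner not in (bot_a, bot_b):
            raise ValueError(f"winner {winner!r} did not play in this game")
        loser = bot_b if winner == bot_a else bot_a
        wins[(winner, loser)] += 1
        played[(bot_a, bot_b)] += 1
        played[(bot_b, bot_a)] += 1
    return wins, played

def win_rate(wins, played, bot, opponent):
    """One cell of the crosstable, or None if the pair never played."""
    n = played.get((bot, opponent), 0)
    return wins.get((bot, opponent), 0) / n if n else None

def overall(wins, played, bot):
    """The 'overall' column: total wins over total counted games."""
    total_wins = sum(w for (a, _), w in wins.items() if a == bot)
    total_games = sum(n for (a, _), n in played.items() if a == bot)
    return total_wins / total_games if total_games else None
```

Skipping the uncounted games before tallying means they drop out of both the numerator and the denominator, so a pairing with uncounted games is scored over fewer games than the others.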

It’s curious that Stardust towered over every opponent (only BananaBrain was able to put up a serious fight) but did not score 100% against any. #4 Dragon upset protoss #2 PurpleWave and #3 BananaBrain, scoring 55% against each, but was upset in turn by zergs #7 Steamhammer and #9 ZZZKBot, scoring 30% and 47%. That is typical of tscmoo-authored bots: They are tuned to do well against the best, and show some weakness against the rest. I’m surprised by UAlbertaBot’s relatively high finish; I expected it to be second to last.

In my original post on the bots registered for AIIDE I separated out 3 new bots, DanDanBot, Randofoo, and Taij. I didn’t mention the other new entrant, EggBot, which hadn’t appeared on the list yet. Of the 4 new bots, only EggBot ended up playing. None of the familiar old names dropped out. Way to go EggBot! In my book it did not finish last, it finished ahead of 3 no-shows, and ahead of everyone who was afraid to sign up at all. You don’t have to have a serious chance in the competition to take the competition seriously; opportunities are to be taken. The only downside is that I can no longer say “Eggie” to mean Ecgberht.

I am of course especially interested in Steamhammer’s results. Its rival Microwave squeaked ahead with 8 extra wins out of the 1800 games. #7 Steamhammer upset #4 Dragon and #6 Microwave by about 70% each, but it was crushed by #8 DaQin, scoring only 22% (where Microwave scored 83%). In my post “Steamhammer’s prepared learning data for AIIDE 2020” I said, “I also didn’t prepare against DaQin because I didn’t have recent data handy; I could have tried harder, but time was short.” That one omission was my downfall!

The race balance tables are not very interesting, since protoss dominates. And of course there was only one random player, UAlbertaBot. Nevertheless, here they are. The overall race balance:

	overall	vT	vP	vZ	vR
terran	39%		31%	38%	58%
protoss	59%	69%		58%	72%
zerg	51%	62%	42%		73%
random	31%	42%	28%	27%
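
A race balance table of this kind can be built by relabeling each game with the races of the two bots. The Python sketch below is only an illustration under assumptions, not the code behind the numbers above: it takes a hypothetical list of (winner, loser) pairs for counted games and a bot-to-race mapping, and it skips mirror matchups, which is one way to leave the diagonal blank.

```python
from collections import defaultdict

def race_balance(games, race_of):
    """Win rate of each race against each other race.

    games: iterable of (winner, loser) bot names, counted games only
           (a hypothetical format).
    race_of: dict mapping bot name -> race, e.g. "protoss" or "random".
    Returns a dict mapping (race, opponent_race) -> win rate.
    """
    wins = defaultdict(int)
    played = defaultdict(int)
    for winner, loser in games:
        race_w, race_l = race_of[winner], race_of[loser]
        if race_w == race_l:
            continue  # mirror matchup: one side of the race always wins
        wins[(race_w, race_l)] += 1
        played[(race_w, race_l)] += 1
        played[(race_l, race_w)] += 1
    return {pair: wins.get(pair, 0) / n for pair, n in played.items()}
```

Because every game is pooled by race, a race with one dominant bot and several weak ones can still look mediocre; the table describes the field, not any single bot.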

Each bot’s results by opponent race. I think the table tells more about the opponents grouped by race than about the bots listed on the left.

#	bot	race	overall	vT	vP	vZ	vR
1	stardust	protoss	93.22%	97%	84%	99%	98%
2	purplewave	protoss	79.44%	81%	67%	88%	93%
3	bananabrain	protoss	69.61%	76%	60%	70%	88%
4	dragon	terran	62.38%	94%	54%	53%	80%
5	mcrave	zerg	57.22%	55%	43%	72%	89%
6	microwave	zerg	54.47%	65%	50%	48%	61%
7	steamhammer	zerg	54.00%	70%	31%	63%	95%
8	daqin	protoss	50.14%	81%	38%	35%	69%
9	zzzkbot	zerg	39.89%	58%	41%	16%	49%
10	ualbertabot	random	31.14%	42%	28%	27%	-
11	willyt	terran	29.44%	40%	18%	31%	55%
12	ecgberht	terran	24.28%	16%	20%	30%	39%
13	eggbot	protoss	4.72%	12%	2%	1%	10%

Next: Results by map.

Comments

Dan on Monday, October 19. 2020:

"In my book it did not finish last, it finished ahead of 3 no-shows, and ahead of everyone who was afraid to sign up at all."

Very well put. And to that I'd add everyone who thought making a bot could be fun but didn't (which included me for seven years). Proud of EggBot's author and hope to see more.

Tully Elliston on Tuesday, October 20. 2020:

Nice write up, thanks.

Going in with learning data (or lack of) often seems to be a Steamhammer downfall in these tourneys, the tradition is upheld!

Congrats on getting the upper hand over Microwave over the set.

I feel that a focus on defensive skills increases consistency against weaker opponents, while a focus on offensive skills increases the ability of a bot to upset stronger opponents when they make errors. Steamhammer's focus on aggression is costing it more games against weaker opponents compared to Microwave, which fares better against these across the board. But Steamhammer fares better against the stronger opponents.

The protoss dominance at the moment I think is a product of the attributes of the faction vs the level of gameplay bots are capable of at the moment. Protoss thrives in situations where armies engage over low surface areas. Most bots at the moment are incapable of concaves, line moves and flanking (Steamhammer likes to attack in a single file line), which turbo charges Protoss. When Steamhammer and other zerg bots learn to attack across a broad front (and to flank!) I suspect this trend will drastically reverse.

Dilyan on Tuesday, October 20. 2020:

Well said, zerg power is in surrounding enemy units or spreading army... I don't see it from SH :/ or other zergs except CherryPi

Tully Elliston on Wednesday, October 21. 2020:

Yes, CherryPi has cool zergling control

Jay Scott on Wednesday, October 21. 2020:

Agreed, and necessary to have any kind of chance versus Stardust. But not in fact the most urgent improvement needed. :-/

Tully Elliston on Wednesday, October 21. 2020:

Your loss rate versus zzzkbot (25%) compared to microwave (7%) suggests there are some low hanging gains to be had in tightening up SH play against dawn rushes.

Jay Scott on Wednesday, October 21. 2020:

The tournament version does tighten up recognition of fast rushes—which helps substantially. But play against them is not much changed.

Tully Elliston on Thursday, October 22. 2020:

Perhaps you need something in your learning to the effect of "if this opponent has used the same plan every single game for more than x games, don't vary build, just use the counter build every time"

Jay Scott on Thursday, October 22. 2020:

Steamhammer does have that! If the opponent plays a constant plan, Steamhammer seeks the single most effective counter, which is all that many bots ever do. But it still needs to explore different counter builds to see which one works best.

If the opponent varies, then Steamhammer varies too to avoid being exploited in just that way.

Jay Scott on Thursday, October 22. 2020:

By the way, ZZZKBot does not play the same every game. It tries its 4 pool first, and varies if that does not work. Its turtle into 2-hatch muta is effective versus zerg bots that are trying to survive 4 pool.
