tournaments - 7 | Starcraft AI blog

Steamhammer games and status

Steamhammer played an excellent game versus Monster today. The game is kind of long and boring to watch, with repetitive action, but I’m pleased by the good play against stubborn defense. Steamhammer wasted some resources and missed some opportunities, but made no severe mistake at any point. It even expanded at a good time, which is depressingly rare in its ZvZs. Near the end, Steamhammer tried to put the cherry on top by ensnaring Monster’s mutalisks, but the mutas zoomed by too fast, the ensnare missed, and the queen was shot down. Oh well, dropping the cherry didn’t change the rest!

For a game that is not in the least excellent but is interesting for its mistakes, I like yesterday’s Steamhammer-Slater game. I watched the game live, and when Steamhammer bumbled the defense of its natural I steeled myself for a quick upset. But it was not so quick after all. The game is a showcase of ways to go wrong on both sides. Some of Steamhammer’s mistakes remain unresolved because my planned fixes are complicated and need to be implemented as projects.

The latency compensation bug is still making me scratch my head. The easiest way to work around it is to use the Micro module’s order tracking; Steamhammer already keeps track of what orders it has given to units, including larvas, so it doesn’t need to rely on BWAPI to keep it straight. I traced the backbone of the production code and added the minimal workaround, a two-line addition to the code that decides whether a unit should be added to the set of candidate producers. And... it didn’t work. In order to control where zerg units are made, to do things like make drones at bases that don’t have enough drones, there is a special-case low-level routine, and it ignores the set of candidate producers and does its own calculations from scratch—slightly complicated calculations that the candidates don’t make easier. I’m still thinking about the right fix. Maybe I can find a way to make it simple and powerful at the same time.

It is, by the way, a serious bug. In Steamhammer, the effect is to sometimes—at predictable times—drop a unit that was queued for production. Among other things, it turns 12 hatch openings into 11 hatch. I had noticed that Steamhammer was playing 11 hatch surprisingly often, but it does have a full suite of intentional 11 hatch openings, so I didn’t realize that it was due to a bug.
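The order-tracking idea can be sketched as a filter on candidate producers. This is a minimal sketch and not Steamhammer's actual code: `Unit`, `OrderTracker`, `candidateProducers`, and the latency window are all hypothetical names and assumptions made for illustration. It only shows the shape of the workaround, which is to trust the bot's own record of issued orders instead of what the API reports about a larva.

```cpp
#include <cassert>
#include <unordered_map>
#include <vector>

// Hypothetical stand-in for a game unit; not a BWAPI or Steamhammer type.
struct Unit {
    int id;
    bool isLarva;
};

// Hypothetical tracker: remembers the frame on which each unit was last ordered.
class OrderTracker {
public:
    void recordOrder(int unitId, int frame) { lastOrderFrame_[unitId] = frame; }

    // True if the unit was given an order within the last `latencyFrames` frames,
    // so the order may not be visible in the game state yet.
    bool hasPendingOrder(int unitId, int now, int latencyFrames) const {
        assert(latencyFrames >= 0);
        auto it = lastOrderFrame_.find(unitId);
        return it != lastOrderFrame_.end() && now - it->second <= latencyFrames;
    }

private:
    std::unordered_map<int, int> lastOrderFrame_;
};

// Collect larvae that are free to produce a unit, skipping any larva
// that the tracker says was ordered too recently to trust.
std::vector<Unit> candidateProducers(const std::vector<Unit>& units,
                                     const OrderTracker& tracker,
                                     int now, int latencyFrames) {
    std::vector<Unit> candidates;
    for (const Unit& u : units) {
        if (!u.isLarva) continue;
        if (tracker.hasPendingOrder(u.id, now, latencyFrames)) continue;  // the workaround
        candidates.push_back(u);
    }
    return candidates;
}
```

As the post explains, a filter like this only helps if every production path goes through the candidate set; a special-case routine that does its own calculations would need the same check.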

SSCAIT 2020 halfway point

The annual SSCAIT is past the halfway point of the round robin phase, and it’s time to take stock. The numbers keep changing, but here’s a snapshot.

Stardust has slowly climbed to #1 after its weak start, with only 4 losses after 50 games, as compared to #2 Monster with 7 losses after 67 games. Stardust has strong chances to hold its #1 position, though it has played fewer games. Stardust’s worst upset was against #29 ICEbot, while Monster’s was against #47 Junkbot. #5 PurpleWave is unexpectedly low, below #4 BananaBrain; from games I’ve seen, I suspect it did not get its usual thorough preparation, or perhaps the prep was concentrated on top opponents so that it can succeed in the elimination phase. #6 Iron is doing better than I expected, and is ahead of #7 Hao Pan, though they are ranked close and the edge may not stick. #9 Xiao Yi is also higher than I expected.

#14 Skynet by Andrew Smith is the only classic unupdated bot to hang on in the top 16. Other classics #17 UAlbertaBot by Dave Churchill and #18 XIMP by Tomas Vajda are just outside, and there is a gap with #16 Proxy so they may remain outside at the end of the round robin. #19 McRaveZ I had hoped to do better; its muta micro is good but its muta decision making (which target to seek, when to attack and when to run away) is not nearly as good as Monster’s. #20 Microwave has been slowly upping its win rate and has an outside chance of making it into the top 16 by the end; I imagine its learning is figuring out how to compensate for the bugs in this version.

Steamhammer is at #13 at the moment after a few losses, but I’m still forecasting that its most likely finish is #9 or #10. It has played more of its tough games than its easy games.

Some bots get special icons on the unofficial crosstable by Lines Prower. It’s a cute touch, though for me it makes the table harder to read. The funniest is Krasi0P’s Linux penguin for 2 wins and Windows logo for 2 losses. I don’t understand McRaveZ’s icons. A salt shaker for losses, OK, but a secret agent for wins? I may be missing some background. PurpleWave gets a purple heart for wins. Maybe Lines Prower doesn’t know what a purple heart means to Americans?

Only one horrible game

Last year, Steamhammer finished SSCAIT for the first time with no losses due to crippling bugs and only 2 close calls. So far, it is on track to repeat in this year’s SSCAIT. I have seen all its games up to now, and there are no losses due to egregious bugs (only the standard issue flaws) and only one near miss. That’s great compared to Steamhammer’s early years, but I still want to fix the bugs.

The bad game is Steamhammer vs legacy (random zerg). Steamhammer made a number of mistakes in the game and suffered at least 2 bugs. The bug I could not accept is that it built spore colonies to defend against air attack—very early, immediately after scouting legacy’s base and seeing that it had not yet taken gas. It’s not possible to get mutalisks that fast, and without gas there was not even a hint of future risk. In fact, legacy never took its gas and played the whole game with a mass slow zergling plan. If Steamhammer had held on to the drones instead of wasting them on static defense due to a bug, I doubt the attack would have troubled it at all.

I traced the bug to, of all things, an integer overflow. The routine that figures out the time the enemy’s spire will complete returns INT_MAX for “never” if there is no evidence of an enemy spire... and I brilliantly added a margin for the mutas to hatch and fly across the map. In C++, signed integer overflow is officially undefined behavior, so the compiler retreats to its room and laughs its head off before generating the code that will cause the most possible confusion, because “undefined” means it can do that. I don’t know what it did this time, but it was not as simple as wrapping around from an extreme positive value to an extreme negative value, because that would have caused the bug to show up in half of ZvZ games. No, it’s better if it shows up only when it will cause a disgusting blunder out of nowhere.

Anyway, it was easy to fix. I also fixed a bug that caused multiple commanding of overlords. And I’m writing code to collect data for my main current project. Progress is underway.
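For readers who want to see the overflow pattern concretely, here is a minimal sketch of one way to add a margin to a time that may be the INT_MAX sentinel. It is not the fix that went into Steamhammer; `kNever`, `addFramesSaturating`, and `airThreatSoon` are hypothetical names, and the frame numbers in any call are made up. The point is only that the sentinel has to be checked before the addition, because the addition itself is where the undefined behavior happens.

```cpp
#include <cassert>
#include <climits>

// Hypothetical sentinel meaning "no evidence of an enemy spire".
const int kNever = INT_MAX;

// Add a non-negative margin to a frame count without overflowing.
// "Never" stays "never", and a sum that would pass INT_MAX saturates to it.
inline int addFramesSaturating(int frame, int margin) {
    assert(margin >= 0);
    if (frame == kNever || frame > INT_MAX - margin) {
        return kNever;
    }
    return frame + margin;
}

// Hypothetical decision helper: could enemy mutalisks arrive within
// `lookahead` frames of `now`, given when the spire is expected to finish?
inline bool airThreatSoon(int spireDoneFrame, int hatchAndFlyFrames,
                          int now, int lookahead) {
    const int arrival = addFramesSaturating(spireDoneFrame, hatchAndFlyFrames);
    return arrival != kNever && arrival <= addFramesSaturating(now, lookahead);
}
```

With this shape, a spire time of "never" can no longer turn into a threat that triggers static defense.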

SSCAIT and performance over time

Yesterday I claimed that the cannon bot Jakub Trancik “has been falling slowly in the rankings year by year, even as bots that began above it fall further.” Is it true? I see room for argument, but there is something to it.

graph of SSCAIT finishes for 5 bots over 6 years

Here is a graph of the SSCAIT finishing ranks of 5 bots over 6 years, from the 2014 through 2019 editions of the round robin phase of the annual tournament. The bots were selected to have no updates over the time period; it is the same code every year, according to the info on SSCAIT’s website. (UAlbertaBot by Dave Churchill was updated in 2015, so I didn’t include its 2014 finish.) The finishing ranks are normalized so that finishing first is 100 and finishing last is 0, so that the ranks can be compared over time even though each year had a different number of participants. The graph shows old bots falling in relative performance as new and updated bots grew stronger over the years.
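The normalization can be written as a one-line formula. The sketch below assumes the scale is linear between the two endpoints, which the text does not state outright; `normalizedFinish` is a name invented for this example.

```cpp
#include <cassert>

// Map a finishing rank to a 0..100 scale: rank 1 -> 100, last place -> 0.
// Assumes linear spacing between the endpoints.
inline double normalizedFinish(int rank, int participants) {
    assert(participants > 1);
    assert(rank >= 1 && rank <= participants);
    return 100.0 * (participants - rank) / (participants - 1);
}
```

For example, third place in a five-bot field maps to 50, the middle of the scale, whatever the field size was in other years.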

Jakub Trancik’s finishes were nearly flat from 2014 through 2017, and it fell in 2018. It did not participate in 2019, though it has been allowed back this year. The other non-updated bots showed declines over the period, but not always steep declines. Each bot has a visible knee in the curve, where it bent more sharply down. The year of the knee, the last year of relatively stable performance, ranges from 2016 for Tomas Cere to 2018 for Skynet by Andrew Smith. That might be because performance gains have accelerated in the last few years, or it might be because it takes that long for enough new and updated bots to be tuned against the unchanging old ones. Maybe the knees occur when flashy newcomers start to exploit specific weaknesses of the old guard.

Of these 5 bots, Jakub Trancik has the flattest curve, though it doesn’t look exceptional. It is not a statistical outlier, and Skynet’s curve is almost as level. Jakub Trancik is also the least sophisticated bot, and it has the most extreme and unconventional strategy. The facts might be related.

For comparison, here’s another chart with 4 more non-updated bots. Roman Danielis missed 2016. These curves also seem to have knees, though less sharp, and the shape of Roman Danielis’s curve is not clear to the eye.

graph of SSCAIT finishes for 4 more bots

SSCAIT 2020 so far

The annual SSCAIT has progressed far enough that the competitors have roughly sorted themselves into groups. It’s about 1/4 complete, and we can get an idea of how things are going. Currently we have #1 Monster, which may in fact be the favorite to finish first, but it’s too early to talk about detailed finishing order.

Iron is doing better than I expected, though I guess it’s within the statistical margin of error. I have always been bemused by the consistency of cannonbot #38 Jakub Trancik, 11-16 for 41%, last updated in 2013. It has been falling slowly in the rankings year by year, even as bots that began above it fall further; apparently improvements that help against usual play do not help as much against the cannons. What stands out more to me are the bots that collapsed. Styx is failing to start and losing every game. Microwave, which should be in the top 16, is currently #31 of 56 with 16-15; maybe the latest update introduced a bug.

The biggest upset is #52 Marine Hell > #8 Steamhammer; Steamhammer went from lifetime 67-2 to 67-3 (since opponent modeling was added) against this opponent after failing to scout Marine Hell’s unit mix and making the wrong choice of counter units, among other mistakes. A more interesting upset is #48 Garmbot by Aurelien Lermant > #9 Dragon, where Dragon tried its usual harassing game plan but ended up defending all game instead, and could not hold it together. I was also pleased with #16 Skynet by Andrew Smith > #2 BetaStar after BetaStar chose a risky build, and this time didn’t get away with it. Don’t underestimate your foes: “It is not enough to be a good player, you must also play well” — Siegbert Tarrasch.

Steamhammer is currently at #8 with 17-5, having played only 22 games, fewer games than any other bot in the top 16 except Stardust, which has played only 20. I think Steamhammer’s most likely finish is #9 or #10, but we’ll see. Last year I was slightly pessimistic, and if the same is true this year then it may hold its position.

SSCAIT is popular today

I see 15 viewers on the SSCAIT stream as I write. More usually I see 1 to 4 of late, which presumably includes me when I’m watching. It’s a good sign; the annual tournament is driving interest.

SSCAIT tournament soon

I’ve just uploaded Steamhammer 3.3.5, which will be the SSCAIT tournament version unless it hits a last-minute bug. If you dare to rush through your opponent prep, now’s the time! Expect the change list after the deadline. Optically, this version fixes all the most visible bugs introduced in and since the AIIDE version; the games look cleaner, overlords live longer, bizarre expansion behavior does not happen. Results are only slightly improved, though, in part because of the “learning hides bugs” issue. I expected better.

Starting on 19 December, there’s been a rush of updates. In fact, every bot updated after 27 November was updated (or re-updated) on 19 December or later, so there’s a gap in the dates.

There is not much to predict about the tournament. I think everyone can foresee that the top finishers of the round robin phase will include Stardust, Krasi0 (if it competes as terran this year), Monster, and PurpleWave, and likely BananaBrain which has been doing well. Halo by Hao Pan is significantly weaker, and there is a gap below Hao Pan and adias (aka SAIDA) of nearly 100 elo before the remaining strong bots. Steamhammer is likely to finish near the middle of the top 16, and then survive not very long in the elimination phase, as in past years.

Steamhammer tournament plans

For the upcoming SSCAIT annual tournament, I’ll follow my usual plan. I’ve just uploaded a new test version Steamhammer 3.3.1, which fixes one of the critical bugs (and has another surprise change). I’ll drop frequent test versions until tournament time, and after the deadline I’ll release the tournament version. Time is short, so the changes will be mostly bug fixes and low-risk improvements that are unlikely to break stuff.

I expect the standard long no-upload period while the tournament runs. I will either turn to SCHNAIL, or else I’ll work on one of my machine learning ideas. Just after tournament season is the ideal time to add bugs and their associated major new features, so that the rest of the year can work desperately to fix—I mean, to tune them.

AIIDE 2020 - various versus DaQin

I added parsing for DaQin’s files, which was little effort. I decided to dump all of DaQin’s analysis into a single post, because the tables aren’t that rich in information. Now I’m able to move on to other topics. I put the opponents on the left, so that in all cases, blue is good for the opponent and red is good for DaQin.

bananabrain strategies versus daqin strategies

	overall	2GateDT	3GateDT	4GateGoon
overall	99/150 66%	9/14 64%	53/89 60%	37/47 79%
PvP_10/12gate	7/10 70%	-	2/5 40%	5/5 100%
PvP_12nexus	4/7 57%	1/1 100%	2/4 50%	1/2 50%
PvP_2gatedt	14/16 88%	1/1 100%	7/9 78%	6/6 100%
PvP_2gatedtexpo	10/14 71%	1/2 50%	5/7 71%	4/5 80%
PvP_2gatereaver	13/16 81%	1/1 100%	5/7 71%	7/8 88%
PvP_3gaterobo	8/13 62%	2/2 100%	5/7 71%	1/4 25%
PvP_3gatespeedzeal	2/7 29%	0/1 0%	1/5 20%	1/1 100%
PvP_4gategoon	4/8 50%	1/1 100%	2/6 33%	1/1 100%
PvP_9/9gate	15/16 94%	-	11/11 100%	4/5 80%
PvP_9/9proxygate	6/10 60%	1/1 100%	2/6 33%	3/3 100%
PvP_nzcore	7/11 64%	1/1 100%	4/8 50%	2/2 100%
PvP_zcore	3/7 43%	0/2 0%	3/5 60%	-
PvP_zcorez	2/7 29%	-	2/4 50%	0/3 0%
PvP_zzcore	4/8 50%	0/1 0%	2/5 40%	2/2 100%

Reading DaQin’s openings out of its configuration file, I see that 2GateDT makes 2 dark templar out of the promised 2 gateways, adds 3 cannons in front of its natural, then expands. As a PvP build, that strikes me as illogical (you might want one cannon if the enemy also has dark templar). 3GateDT makes one gate, gets dragoons and dragoon range, adds a second gate and a citadel, and then the predefined build order ends—the rest is left to the strategy manager. That seems sensible as far as it goes, but does the strategy manager regularly add a third gate and make DTs as promised, or is the name of the opening a lie? See below for BananaBrain’s opinion on the question. In any case, 3GateDT is the opening that gave BananaBrain the most trouble.

bananabrain as seen by daqin

bananabrain played	#	daqin recognized
PvP_10/12gate	10	10 Fast rush
PvP_12nexus	7	5 Fast rush \| 1 Safe expand \| 1 Naked expand
PvP_2gatedt	16	16 Fast rush
PvP_2gatedtexpo	14	13 DarkTemplar rush \| 1 Unknown
PvP_2gatereaver	16	16 DarkTemplar rush
PvP_3gaterobo	13	13 DarkTemplar rush
PvP_3gatespeedzeal	7	6 Fast rush \| 1 Unknown
PvP_4gategoon	8	5 DarkTemplar rush \| 1 Naked expand \| 1 Unknown \| 1 Fast rush
PvP_9/9gate	16	16 Fast rush
PvP_9/9proxygate	10	9 Fast rush \| 1 Proxy
PvP_nzcore	11	8 DarkTemplar rush \| 1 Not fast rush \| 1 Naked expand \| 1 Unknown
PvP_zcore	7	7 DarkTemplar rush
PvP_zcorez	7	5 DarkTemplar rush \| 2 Not fast rush
PvP_zzcore	8	5 DarkTemplar rush \| 2 Proxy \| 1 Not fast rush

DaQin recognizes 9-9 gate as Fast rush, but also the economy-first 10-12 gate and even the fast expand 12 nexus. What BananaBrain calls a reaver build, DaQin sees as a dark templar rush. Strategy recognition has some odd results.

daqin as seen by bananabrain

daqin played	#	bananabrain recognized
2GateDT	14	12 P_1gatecore \| 2 P_unknown
3GateDT	89	45 P_1gatecore \| 32 P_4gategoon \| 11 P_unknown \| 1 P_ffe
4GateGoon	47	36 P_4gategoon \| 9 P_1gatecore \| 2 P_unknown

This suggests that DaQin’s 3GateDT was often not a dark templar build at all.

mcrave strategies versus daqin strategies

	overall	ForgeExpand5GateGoon	ForgeExpandSpeedlots
overall	97/150 65%	3/3 100%	94/147 64%
PoolHatch,Overpool,2HatchMuta	97/150 65%	3/3 100%	94/147 64%

Not a lot of strategic variety here.

mcrave as seen by daqin

mcrave played	#	daqin recognized
PoolHatch,Overpool,2HatchMuta	150	117 Not fast rush \| 28 Heavy rush \| 5 Unknown

daqin as seen by mcrave

daqin played	#	mcrave recognized
ForgeExpand5GateGoon	3	3 FFE,Forge,5GateGoon
ForgeExpandSpeedlots	147	121 FFE,Forge,Speedlot \| 21 FFE,Nexus,Speedlot \| 2 FFE,Forge,5GateGoon \| 2 FFE,Forge,ZealotArchon \| 1 FFE,Gateway,Speedlot

microwave strategies versus daqin strategies

	overall	4GateGoon	ForgeExpand5GateGoon	ForgeExpandSpeedlots
overall	125/150 83%	3/11 27%	3/3 100%	119/136 88%
1HatchMuta_Sparkle	56/62 90%	0/1 0%	-	56/61 92%
3HatchLingBust	11/17 65%	2/4 50%	1/1 100%	8/12 67%
3HatchMuta	53/59 90%	0/2 0%	2/2 100%	51/55 93%
3HatchMutaExpo	5/9 56%	1/4 25%	-	4/5 80%
3HatchPoolHydraExpo	0/1 0%	-	-	0/1 0%
9Pool	0/1 0%	-	-	0/1 0%
OverpoolLurker	0/1 0%	-	-	0/1 0%

Why did DaQin play its most successful opening by far, 4GateGoon, in only 11 of 150 games? It is not that it discovered the opening late; it played it first in game 10 of 150, and won that game. It immediately played it again and lost, but soon played it a third time and won again. It surely wasn’t confused by too many choices. Either there was a bug, or some built-in bias in DaQin’s decisions led it astray.

microwave as seen by daqin

microwave played	#	daqin recognized
1HatchMuta_Sparkle	62	34 Not fast rush \| 19 Heavy rush \| 7 Unknown \| 2 Proxy
3HatchLingBust	17	12 Not fast rush \| 4 Heavy rush \| 1 Proxy
3HatchMuta	59	48 Not fast rush \| 7 Heavy rush \| 4 Proxy
3HatchMutaExpo	9	8 Not fast rush \| 1 Proxy
3HatchPoolHydraExpo	1	1 Not fast rush
9Pool	1	1 Fast rush
OverpoolLurker	1	1 Unknown

daqin as seen by microwave

daqin played	#	microwave recognized
4GateGoon	11	8 Unknown \| 2 HeavyRush \| 1 Proxy
ForgeExpand5GateGoon	3	1 SafeExpand \| 1 Turtle \| 1 NakedExpand
ForgeExpandSpeedlots	136	68 Turtle \| 22 HeavyRush \| 20 SafeExpand \| 16 NakedExpand \| 10 Unknown

steamhammer strategies versus daqin strategies

	overall	ForgeExpand5GateGoon	ForgeExpandSpeedlots
overall	33/150 22%	29/136 21%	4/14 29%
10HatchLing	0/1 0%	0/1 0%	-
11Gas10PoolLurker	0/1 0%	0/1 0%	-
12-12Hatch	0/1 0%	0/1 0%	-
12Hatch_4HatchLing	0/2 0%	0/2 0%	-
2.5HatchMuta	0/1 0%	0/1 0%	-
2HatchHydraBust	0/2 0%	0/1 0%	0/1 0%
3HatchHydra	0/2 0%	0/2 0%	-
3HatchHydraBust	0/3 0%	0/2 0%	0/1 0%
3HatchHydraExpo	0/1 0%	0/1 0%	-
3HatchLateHydras+1	0/1 0%	0/1 0%	-
3HatchLing	26/59 44%	24/52 46%	2/7 29%
3HatchLingBust2	0/2 0%	0/2 0%	-
4HatchBeforeGas	5/25 20%	3/23 13%	2/2 100%
4HatchBeforeLair	0/1 0%	0/1 0%	-
5HatchBeforeGas	0/2 0%	-	0/2 0%
5HatchPool	0/1 0%	0/1 0%	-
5PoolHard2Player	0/1 0%	0/1 0%	-
5Scout	0/1 0%	0/1 0%	-
973HydraBust	0/4 0%	0/3 0%	0/1 0%
9Pool8GasLurker	0/1 0%	0/1 0%	-
9PoolHatchSpeed	0/1 0%	0/1 0%	-
9PoolHatchSpeedSpire2	0/1 0%	0/1 0%	-
9PoolHatchSpire	0/1 0%	0/1 0%	-
9PoolSpireSlowlings	0/1 0%	0/1 0%	-
9PoolSunkHatch	0/1 0%	0/1 0%	-
AntiFact_2Hatch	0/1 0%	0/1 0%	-
AntiFact_Overpool9Gas	0/1 0%	0/1 0%	-
AntiFactory2	0/1 0%	0/1 0%	-
Over10Hatch1Sunk	0/1 0%	0/1 0%	-
OverhatchExpoMuta	0/3 0%	0/3 0%	-
OverpoolSpeed	0/1 0%	0/1 0%	-
OverpoolTurtle 0	0/2 0%	0/2 0%	-
Proxy8HatchNatural	0/1 0%	0/1 0%	-
Sparkle 3HatchMuta	1/6 17%	1/6 17%	-
ZvP_2HatchMuta	0/1 0%	0/1 0%	-
ZvP_3BaseSpire+Den	0/1 0%	0/1 0%	-
ZvP_3HatchPoolHydra	1/7 14%	1/7 14%	-
ZvT_2HatchMuta	0/1 0%	0/1 0%	-
ZvT_3HatchMuta	0/1 0%	0/1 0%	-
ZvT_7Pool	0/1 0%	0/1 0%	-
ZvZ_12PoolLing	0/1 0%	0/1 0%	-
ZvZ_12PoolLingB	0/2 0%	0/2 0%	-
ZvZ_Overpool11Gas	0/1 0%	0/1 0%	-

steamhammer as seen by daqin

steamhammer played	#	daqin recognized
10HatchLing	1	1 Unknown
11Gas10PoolLurker	1	1 Heavy rush
12-12Hatch	1	1 Not fast rush
12Hatch_4HatchLing	2	2 Heavy rush
2.5HatchMuta	1	1 Not fast rush
2HatchHydraBust	2	1 Hydra bust \| 1 Not fast rush
3HatchHydra	2	2 Not fast rush
3HatchHydraBust	3	2 Not fast rush \| 1 Heavy rush
3HatchHydraExpo	1	1 Not fast rush
3HatchLateHydras+1	1	1 Not fast rush
3HatchLing	59	40 Not fast rush \| 16 Heavy rush \| 3 Unknown
3HatchLingBust2	2	1 Not fast rush \| 1 Unknown
4HatchBeforeGas	25	24 Not fast rush \| 1 Unknown
4HatchBeforeLair	1	1 Not fast rush
5HatchBeforeGas	2	2 Not fast rush
5HatchPool	1	1 Not fast rush
5PoolHard2Player	1	1 Fast rush
5Scout	1	1 Not fast rush
973HydraBust	4	4 Not fast rush
9Pool8GasLurker	1	1 Heavy rush
9PoolHatchSpeed	1	1 Heavy rush
9PoolHatchSpeedSpire2	1	1 Fast rush
9PoolHatchSpire	1	1 Heavy rush
9PoolSpireSlowlings	1	1 Heavy rush
9PoolSunkHatch	1	1 Fast rush
AntiFact_2Hatch	1	1 Not fast rush
AntiFact_Overpool9Gas	1	1 Not fast rush
AntiFactory2	1	1 Heavy rush
Over10Hatch1Sunk	1	1 Heavy rush
OverhatchExpoMuta	3	3 Not fast rush
OverpoolSpeed	1	1 Heavy rush
OverpoolTurtle 0	2	2 Heavy rush
Proxy8HatchNatural	1	1 Heavy rush
Sparkle 3HatchMuta	6	6 Not fast rush
ZvP_2HatchMuta	1	1 Not fast rush
ZvP_3BaseSpire+Den	1	1 Heavy rush
ZvP_3HatchPoolHydra	7	4 Not fast rush \| 1 Heavy rush \| 1 Hydra bust \| 1 Unknown
ZvT_2HatchMuta	1	1 Not fast rush
ZvT_3HatchMuta	1	1 Not fast rush
ZvT_7Pool	1	1 Fast rush
ZvZ_12PoolLing	1	1 Not fast rush
ZvZ_12PoolLingB	2	2 Not fast rush
ZvZ_Overpool11Gas	1	1 Heavy rush

daqin as seen by steamhammer

daqin played	#	steamhammer recognized
ForgeExpand5GateGoon	136	79 Turtle \| 41 Safe expand \| 10 Heavy rush \| 5 Naked expand \| 1 Unknown
ForgeExpandSpeedlots	14	7 Safe expand \| 6 Turtle \| 1 Unknown

ecgberht strategies versus daqin strategies

	overall	12NexusCarriers	3GateDT
overall	1/150 1%	1/2 50%	0/148 0%
14CC	0/32 0%	-	0/32 0%
FullMech	0/29 0%	0/1 0%	0/28 0%
JoyORush	0/28 0%	-	0/28 0%
MechGreedyFE	0/28 0%	-	0/28 0%
ProxyEightRax	1/33 3%	1/1 100%	0/32 0%

ecgberht as seen by daqin

ecgberht played	#	daqin recognized
14CC	32	28 Safe expand \| 2 Naked expand \| 2 Unknown
FullMech	29	27 Factory \| 2 Not fast rush
JoyORush	28	27 Factory \| 1 Unknown
MechGreedyFE	28	12 Unknown \| 9 Safe expand \| 7 Not fast rush
ProxyEightRax	33	27 Fast rush \| 5 Not fast rush \| 1 Proxy

daqin as seen by ecgberht

daqin played	#	ecgberht recognized
12NexusCarriers	2	2 Unknown
3GateDT	148	148 Unknown

AIIDE 2020 - Microwave versus Steamhammer

Microwave played more different openings than Steamhammer (no doubt seeking a winning choice), so I put it on the left. Blue is good for Microwave, red is good for Steamhammer.

microwave strategies versus steamhammer strategies

	overall	6PoolBurrow	8-8HydraRush	9Hatch8Pool	9PoolHatchSpeedSpire	OverhatchLing	OverpoolBurrow	ZvZ_12HatchExpo	ZvZ_12PoolLing	ZvZ_12PoolMain	ZvZ_Overpool11Gas	ZvZ_Overpool9Gas	ZvZ_OverpoolTurtle
overall	43/150 29%	1/1 100%	1/1 100%	4/5 80%	1/1 100%	1/1 100%	1/1 100%	3/5 60%	4/11 36%	2/3 67%	12/44 27%	7/64 11%	6/13 46%
10Hatch9Pool9gas	4/9 44%	-	-	0/1 0%	-	-	-	-	-	-	2/2 100%	1/5 20%	1/1 100%
10HatchMain9Pool9Gas	1/4 25%	-	-	-	-	-	-	-	-	-	0/1 0%	1/2 50%	0/1 0%
10HatchTurtleHydra	0/1 0%	-	-	-	-	-	-	-	-	-	-	0/1 0%	-
11HatchTurtleMuta	0/1 0%	-	-	-	-	-	-	-	-	-	-	-	0/1 0%
12HatchMain	0/1 0%	-	-	-	-	-	-	-	-	-	0/1 0%	-	-
12Pool	5/25 20%	-	-	-	-	1/1 100%	-	-	0/3 0%	2/2 100%	2/7 29%	0/10 0%	0/2 0%
12PoolMain	1/5 20%	-	-	1/1 100%	-	-	-	-	-	-	0/2 0%	0/2 0%	-
2HatchLurker	0/2 0%	-	-	-	-	-	-	-	-	-	0/1 0%	0/1 0%	-
3HatchHydraBust	0/1 0%	-	-	-	-	-	-	-	-	-	-	0/1 0%	-
3HatchHydraExpo	0/2 0%	-	-	-	-	-	-	-	-	-	0/1 0%	0/1 0%	-
3HatchPoolHydra	0/2 0%	-	-	-	-	-	-	-	0/1 0%	-	0/1 0%	-	-
4HatchPoolHydra	0/1 0%	-	-	-	-	-	-	-	-	-	-	0/1 0%	-
5Pool	0/4 0%	-	-	-	-	-	-	-	0/1 0%	-	0/1 0%	0/2 0%	-
5PoolSpeed	1/3 33%	-	1/1 100%	-	-	-	-	-	0/1 0%	-	0/1 0%	-	-
7Pool	0/1 0%	-	-	-	-	-	-	-	-	-	-	0/1 0%	-
7PoolHydraLingRush7D	0/1 0%	-	-	-	-	-	-	-	-	-	-	0/1 0%	-
9Hatch9Pool9Gas	0/1 0%	-	-	-	-	-	-	-	-	-	-	0/1 0%	-
9HatchTurtleHydra	0/1 0%	-	-	-	-	-	-	0/1 0%	-	-	-	-	-
9PoolGasHatchSpeed8D	0/1 0%	-	-	-	-	-	-	-	-	-	0/1 0%	-	-
9PoolHatch	0/2 0%	-	-	-	-	-	-	-	-	-	-	0/2 0%	-
9PoolSpeed	17/31 55%	1/1 100%	-	3/3 100%	1/1 100%	-	-	1/1 100%	1/1 100%	-	4/6 67%	4/13 31%	2/5 40%
9PoolSpeedLing	0/1 0%	-	-	-	-	-	-	0/1 0%	-	-	-	-	-
9PoolSunken	0/7 0%	-	-	-	-	-	-	-	-	0/1 0%	0/3 0%	0/3 0%	-
OverpoolSpeed	1/3 33%	-	-	-	-	-	1/1 100%	-	-	-	0/1 0%	0/1 0%	-
ZvP_11Hatch10Pool	2/4 50%	-	-	-	-	-	-	1/1 100%	-	-	0/1 0%	1/2 50%	-
ZvP_2HatchHydra	0/9 0%	-	-	-	-	-	-	-	-	-	0/4 0%	0/5 0%	-
ZvP_9Hatch9Pool	0/1 0%	-	-	-	-	-	-	-	-	-	0/1 0%	-	-
ZvZ_Overgas11Pool	10/20 50%	-	-	-	-	-	-	1/1 100%	3/4 75%	-	4/9 44%	0/4 0%	2/2 100%
ZvZ_Overpool11Gas	0/2 0%	-	-	-	-	-	-	-	-	-	-	0/2 0%	-
ZvZ_Overpool9Gas	1/4 25%	-	-	-	-	-	-	-	-	-	-	0/3 0%	1/1 100%

Steamhammer’s ZvZ_Overpool9Gas opening was successful against all Microwave tries, but notice that it was the only one: Flecks of blue, or entire streaks, crept into every other Steamhammer attempt. The end result does not look close, but in fact Microwave would have needed only a small increment of skill to turn it around; there was only one strategy it was unprepared to face.

microwave as seen by steamhammer

microwave played	#	steamhammer recognized
10Hatch9Pool9gas	9	5 Naked expand \| 3 Heavy rush \| 1 Unknown
10HatchMain9Pool9Gas	4	3 Unknown \| 1 Turtle
10HatchTurtleHydra	1	1 Naked expand
11HatchTurtleMuta	1	1 Heavy rush
12HatchMain	1	1 Unknown
12Pool	25	17 Naked expand \| 5 Heavy rush \| 3 Unknown
12PoolMain	5	3 Heavy rush \| 2 Unknown
2HatchLurker	2	2 Naked expand
3HatchHydraBust	1	1 Naked expand
3HatchHydraExpo	2	1 Naked expand \| 1 Heavy rush
3HatchPoolHydra	2	1 Naked expand \| 1 Heavy rush
4HatchPoolHydra	1	1 Heavy rush
5Pool	4	4 Fast rush
5PoolSpeed	3	3 Fast rush
7Pool	1	1 Fast rush
7PoolHydraLingRush7D	1	1 Unknown
9Hatch9Pool9Gas	1	1 Naked expand
9HatchTurtleHydra	1	1 Heavy rush
9PoolGasHatchSpeed8D	1	1 Heavy rush
9PoolHatch	2	1 Unknown \| 1 Heavy rush
9PoolSpeed	31	21 Unknown \| 7 Naked expand \| 3 Heavy rush
9PoolSpeedLing	1	1 Naked expand
9PoolSunken	7	4 Unknown \| 3 Heavy rush
OverpoolSpeed	3	2 Unknown \| 1 Heavy rush
ZvP_11Hatch10Pool	4	3 Naked expand \| 1 Heavy rush
ZvP_2HatchHydra	9	6 Heavy rush \| 2 Turtle \| 1 Naked expand
ZvP_9Hatch9Pool	1	1 Naked expand
ZvZ_Overgas11Pool	20	19 Unknown \| 1 Turtle
ZvZ_Overpool11Gas	2	2 Unknown
ZvZ_Overpool9Gas	4	4 Unknown

To play ZvZ truly well, Steamhammer needs a more detailed understanding of enemy builds. But even with this crude breakdown, I notice that most of the blue spots are associated with misunderstanding the main idea of Microwave’s play. On the other hand, many misunderstandings also show as red.

steamhammer as seen by microwave

steamhammer played	#	microwave recognized
6PoolBurrow	1	1 FastRush
8-8HydraRush	1	1 Unknown
9Hatch8Pool	5	4 HeavyRush \| 1 Unknown
9PoolHatchSpeedSpire	1	1 NakedExpand
OverhatchLing	1	1 HeavyRush
OverpoolBurrow	1	1 NakedExpand
ZvZ_12HatchExpo	5	5 NakedExpand
ZvZ_12PoolLing	11	8 HeavyRush \| 2 Unknown \| 1 NakedExpand
ZvZ_12PoolMain	3	3 HeavyRush
ZvZ_Overpool11Gas	44	36 Turtle \| 5 Unknown \| 3 NakedExpand
ZvZ_Overpool9Gas	64	51 Turtle \| 8 Unknown \| 5 NakedExpand
ZvZ_OverpoolTurtle	13	13 Turtle

The builds recognized as Turtle genuinely are turtle builds. They get mutalisks fast at the expense of weakness to zergling attack, which they compensate for with sunkens instead of a second hatchery. From the meta-strategy point of view, Steamhammer usually defeats Microwave in games where Steamhammer gains air superiority early, so Steamhammer’s choices make sense.

AIIDE 2020 - Steamhammer versus McRave

I added parsing for Steamhammer. DaQin is nearly the same. The only remaining bot which records data that can be analyzed this way is ZZZKBot, which has a difficult file format, does not keep a recognized enemy strategy, and doesn’t bother to write a newline at the end of its file. I may skip ZZZKBot.

The Steamhammer-McRave strategy crosstable is the most interesting one yet.

steamhammer strategies versus mcrave strategies

	overall	PoolHatch,12Pool,2HatchMuta	PoolHatch,12Pool,2HatchSpeedling	PoolLair,9Pool,1HatchMuta
overall	64/150 43%	17/33 52%	10/22 45%	37/95 39%
12PoolLurker	0/1 0%	-	-	0/1 0%
3HatchLingBurrow	1/5 20%	1/2 50%	-	0/3 0%
8DroneGas	7/11 64%	-	1/1 100%	6/10 60%
9HatchMain9Pool9Gas	0/2 0%	-	-	0/2 0%
9PoolHatchSpeedAllInB	0/1 0%	-	-	0/1 0%
9PoolSpire	0/2 0%	0/2 0%	-	-
Over10HatchBust	8/19 42%	7/7 100%	-	1/12 8%
Over10PoolLing	0/1 0%	-	-	0/1 0%
OverpoolSpeed	3/15 20%	1/5 20%	0/3 0%	2/7 29%
OverpoolSunk	8/21 38%	0/1 0%	2/8 25%	6/12 50%
OverpoolTurtle	11/23 48%	2/6 33%	1/1 100%	8/16 50%
ZvP_3HatchMuta	0/1 0%	-	-	0/1 0%
ZvZ_12HatchExpo	0/1 0%	-	0/1 0%	-
ZvZ_Overgas9Pool	0/2 0%	-	0/1 0%	0/1 0%
ZvZ_OverpoolTurtle	26/45 58%	6/10 60%	6/7 86%	14/28 50%

For Steamhammer, either 8DroneGas (a zergling build despite the name) or else ZvZ_OverpoolTurtle (a mutalisk build) may dominate among the openings tried, while McRave’s best was the 1 hatch muta play: only 8DroneGas scored better than even against it (6/10), and overall Steamhammer won just 39% of those games. It’s possible that switching between different kinds of builds was important, though, because the table suggests that the other counters are likely imbalanced, with no game-theoretic saddle point (no single pair of openings that neither side could improve on by switching).

Both sides had trouble identifying the best strategies. If both had played their best strategies then the match would have come out close to 50%, while in fact Steamhammer came out behind, so Steamhammer had more trouble selecting from its excessive range of possibilities. I get the impression of a back-and-forth learning struggle.
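The saddle point question can be checked mechanically: a pure-strategy saddle point exists when the row player's best worst case (maximin) equals the column player's best worst case (minimax). The sketch below is an illustration under assumptions, not an analysis the post performed. It uses only the four Steamhammer openings that have games against all three McRave builds, takes the percentages from the table at face value, and uses the made-up names `findPureSaddle` and `steamhammerVsMcRave`.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <limits>
#include <vector>

struct SaddleResult {
    bool exists;  // true when maximin == minimax
    int maximin;  // best guaranteed value for the row player
    int minimax;  // lowest ceiling the column player can enforce
};

// Rows are the maximizing player's options, columns the minimizing player's.
SaddleResult findPureSaddle(const std::vector<std::vector<int>>& payoff) {
    assert(!payoff.empty() && !payoff.front().empty());
    const std::size_t cols = payoff.front().size();
    std::vector<int> colMax(cols, std::numeric_limits<int>::min());
    int maximin = std::numeric_limits<int>::min();
    for (const auto& row : payoff) {
        assert(row.size() == cols);
        maximin = std::max(maximin, *std::min_element(row.begin(), row.end()));
        for (std::size_t c = 0; c < cols; ++c) {
            colMax[c] = std::max(colMax[c], row[c]);
        }
    }
    const int minimax = *std::min_element(colMax.begin(), colMax.end());
    return {maximin == minimax, maximin, minimax};
}

// Steamhammer win percentages from the crosstable, columns in table order:
// 2HatchMuta, 2HatchSpeedling, 1HatchMuta.
std::vector<std::vector<int>> steamhammerVsMcRave() {
    return {
        {20,   0, 29},  // OverpoolSpeed
        { 0,  25, 50},  // OverpoolSunk
        {33, 100, 50},  // OverpoolTurtle
        {60,  86, 50},  // ZvZ_OverpoolTurtle
    };
}
```

On this restricted sub-table the check finds maximin and minimax both at 50, at ZvZ_OverpoolTurtle against the 1 hatch muta build, which fits the remark that best play on both sides would have come out near 50%. It leaves out 8DroneGas, which has no games against the 2 hatch muta build, and several cells rest on a single game, so it illustrates the test and settles nothing.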

steamhammer as seen by mcrave

steamhammer played	#	mcrave recognized
12PoolLurker	1	1 HatchPool,12Pool,1HatchMuta
3HatchLingBurrow	5	3 HatchPool,Unknown,2HatchLing \| 1 HatchPool,Unknown,Unknown \| 1 Unknown,Unknown,3HatchMuta
8DroneGas	11	6 HatchPool,9Pool,2HatchLing \| 2 PoolHatch,9Pool,2HatchLing \| 1 PoolHatch,Unknown,2HatchLing \| 1 HatchPool,Unknown,2HatchLing \| 1 PoolHatch,Unknown,Unknown
9HatchMain9Pool9Gas	2	1 PoolHatch,12Pool,2HatchLing \| 1 HatchPool,Unknown,2HatchLing
9PoolHatchSpeedAllInB	1	1 PoolHatch,9Pool,LingRush
9PoolSpire	2	2 Unknown,Unknown,Unknown
Over10HatchBust	19	7 HatchPool,12Pool,Unknown \| 4 HatchPool,12Pool,2HatchLing \| 3 Unknown,12Pool,Unknown \| 2 HatchPool,Unknown,2HatchLing \| 2 HatchPool,Unknown,Unknown \| 1 PoolHatch,12Pool,Unknown
Over10PoolLing	1	1 HatchPool,12Pool,Unknown
OverpoolSpeed	15	5 HatchPool,9Pool,LingRush \| 4 PoolHatch,12Pool,Unknown \| 3 Unknown,12Pool,Unknown \| 1 Unknown,9Pool,LingRush \| 1 PoolHatch,9Pool,LingRush \| 1 HatchPool,12Pool,3HatchMuta
OverpoolSunk	21	8 HatchPool,9Pool,Unknown \| 5 PoolHatch,9Pool,LingRush \| 2 HatchPool,9Pool,3HatchMuta \| 1 PoolHatch,9Pool,Unknown \| 1 Unknown,Unknown,Unknown \| 1 Unknown,12Pool,3HatchMuta \| 1 HatchPool,Unknown,Unknown \| 1 HatchPool,12Pool,3HatchMuta \| 1 PoolHatch,Unknown,Unknown
OverpoolTurtle	23	7 HatchPool,9Pool,LingRush \| 5 Unknown,12Pool,1HatchHydra \| 3 Unknown,Unknown,1HatchHydra \| 2 HatchPool,Unknown,1HatchHydra \| 2 Unknown,9Pool,1HatchHydra \| 2 HatchPool,12Pool,1HatchLurker \| 1 PoolHatch,12Pool,1HatchLurker \| 1 HatchPool,12Pool,1HatchHydra
ZvP_3HatchMuta	1	1 HatchPool,Unknown,2HatchLing
ZvZ_12HatchExpo	1	1 HatchPool,Unknown,2HatchMuta
ZvZ_Overgas9Pool	2	1 PoolLair,Unknown,1HatchMuta \| 1 PoolLair,Unknown,Unknown
ZvZ_OverpoolTurtle	45	15 Unknown,Unknown,Unknown \| 10 PoolLair,9Pool,1HatchMuta \| 6 PoolLair,12Pool,1HatchMuta \| 6 PoolLair,Unknown,1HatchMuta \| 2 PoolLair,Unknown,Unknown \| 2 Unknown,12Pool,Unknown \| 2 Unknown,Unknown,3HatchMuta \| 1 PoolLair,9Pool,Unknown \| 1 Unknown,12Pool,3HatchMuta

Some curious stuff here. None of Steamhammer’s openings here is 3 hatch mutalisk, so those that are recognized that way may have added a third hatchery later in the game. Steamhammer does have an unfortunate love of laying down an unnecessary hatchery before its spire in ZvZ (3 hatcheries with zerglings is good, 2 hatcheries with mutalisks is good, 3 hatcheries with mutalisks is hard to justify in ZvZ). Looking at the Steamhammer openings tried more often, OverpoolSunk should be recognized as PoolHatch usually (maybe sometimes PoolLair). McRave got it wrong over half the time, without any big effect on its win rate. OverpoolTurtle should be PoolHatch with a hydra followup (this opening is not intended for ZvZ). For ZvZ_OverpoolTurtle, the closest match is PoolLair,9Pool,1HatchMuta. McRave got it right 10 times out of 45 and was close some other times. Failing to recognize anything (likely the scout was denied) was bad.
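The 10-out-of-45 count can be checked mechanically from the ZvZ_OverpoolTurtle row. This is a minimal sketch, assuming the recognized column is a list of `count label` pairs separated by `|` (the backslash escapes from the table are dropped here); the helper name `parse_recognized` is made up for illustration and is not from either bot.

```python
def parse_recognized(cell: str) -> dict:
    """Split a 'count label | count label' cell into {label: count}."""
    counts = {}
    for part in cell.split("|"):
        count, label = part.strip().split(" ", 1)
        counts[label] = counts.get(label, 0) + int(count)
    return counts

# The ZvZ_OverpoolTurtle row, copied from the table above.
row = ("15 Unknown,Unknown,Unknown | 10 PoolLair,9Pool,1HatchMuta | "
       "6 PoolLair,12Pool,1HatchMuta | 6 PoolLair,Unknown,1HatchMuta | "
       "2 PoolLair,Unknown,Unknown | 2 Unknown,12Pool,Unknown | "
       "2 Unknown,Unknown,3HatchMuta | 1 PoolLair,9Pool,Unknown | "
       "1 Unknown,12Pool,3HatchMuta")

counts = parse_recognized(row)
total = sum(counts.values())
exact = counts["PoolLair,9Pool,1HatchMuta"]
unknown = counts["Unknown,Unknown,Unknown"]
print(f"exact {exact}/{total}, nothing recognized {unknown}/{total}")
```

The same helper works on any row of these recognition tables, so near misses (say, every label starting with PoolLair) can be counted the same way.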

mcrave as seen by steamhammer

mcrave played	#	steamhammer recognized
PoolHatch,12Pool,2HatchMuta	33	22 Naked expand \| 6 Unknown \| 5 Heavy rush
PoolHatch,12Pool,2HatchSpeedling	22	9 Naked expand \| 9 Unknown \| 3 Heavy rush \| 1 Worker rush
PoolLair,9Pool,1HatchMuta	95	89 Unknown \| 4 Turtle \| 2 Naked expand

Worker rush? That is likely a bug. The other choices capture information about the game that is probably true but not particularly useful.

AIIDE 2020 - Dragon versus Ecgberht

Two posts today, to cover the newly available Ecgberht pairings. Neither post has much meat to it.

dragon strategies versus ecgberht strategies

	overall	14CC	BioMechGreedyFE	FullMech	ProxyBBS	ProxyEightRax
overall	141/150 94%	28/28 100%	27/28 96%	25/25 100%	36/44 82%	25/25 100%
1rax fe	5/6 83%	1/1 100%	1/1 100%	1/1 100%	2/3 67%	-
bio	136/144 94%	27/27 100%	26/27 96%	24/24 100%	34/41 83%	25/25 100%

I was curious about Dragon’s pattern of seemingly giving up on “1rax fe” (barracks expand) after a single loss, so I looked at the file. In fact Dragon played “bio” as the regular build the whole time, throwing in “1rax fe” occasionally for spice. The “1rax fe” loss was not the last “1rax fe” game, but the second to last.

For Ecgberht, when one build is producing nearly all the wins, probably you should play it more often than 30% of the time. You may not want to play it every game, because that makes it easy for the opponent to adapt—mixing it up is good. Maybe 50% of the time would be better, given this number of alternatives? To know for sure, I guess we’d have to test against a range of bots to see the overall effectiveness of learning.
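The arithmetic behind that 30% figure can be read off the strategy table. The sketch below uses only numbers from that table; since the table is from Dragon's side, Ecgberht's wins per build are games minus Dragon's wins. The variable names are illustrative.

```python
# Dragon's record against each Ecgberht build as (Dragon wins, games),
# copied from the "overall" row of the table above.
dragon_vs = {
    "14CC": (28, 28),
    "BioMechGreedyFE": (27, 28),
    "FullMech": (25, 25),
    "ProxyBBS": (36, 44),
    "ProxyEightRax": (25, 25),
}

total_games = sum(games for _, games in dragon_vs.values())
total_wins = sum(games - wins for wins, games in dragon_vs.values())

for build, (dragon_wins, games) in dragon_vs.items():
    ecg_wins = games - dragon_wins
    print(f"{build:16} played {games / total_games:4.0%}  "
          f"Ecgberht wins {ecg_wins} of {total_wins}")
```

ProxyBBS comes out at 29% of the games and 8 of Ecgberht's 9 wins, which is the imbalance the paragraph above is pointing at.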

dragon as seen by ecgberht

dragon played	#	ecgberht recognized
1rax fe	6	6 Unknown
bio	144	144 Unknown

Nothing to see here. Move along.

ecgberht as seen by dragon

Dragon does not record its idea of the opponent’s build. If it has one.

AIIDE 2020 - BananaBrain versus Ecgberht

bananabrain strategies versus ecgberht strategies

	overall	14CC	FullMech	JoyORush	MechGreedyFE	ProxyEightRax
overall	148/150 99%	31/31 100%	28/28 100%	28/28 100%	28/28 100%	33/35 94%
PvT_10/12gate	10/10 100%	3/3 100%	-	1/1 100%	3/3 100%	3/3 100%
PvT_10/15gate	10/10 100%	2/2 100%	3/3 100%	1/1 100%	1/1 100%	3/3 100%
PvT_12nexus	10/10 100%	3/3 100%	3/3 100%	2/2 100%	1/1 100%	1/1 100%
PvT_1gatedtexpo	10/10 100%	1/1 100%	3/3 100%	2/2 100%	4/4 100%	-
PvT_1gatereaver	10/10 100%	2/2 100%	3/3 100%	2/2 100%	1/1 100%	2/2 100%
PvT_28nexus	10/10 100%	3/3 100%	2/2 100%	1/1 100%	2/2 100%	2/2 100%
PvT_2gatedt	11/11 100%	2/2 100%	4/4 100%	1/1 100%	2/2 100%	2/2 100%
PvT_2gaterngexpo	10/10 100%	3/3 100%	3/3 100%	-	1/1 100%	3/3 100%
PvT_32nexus	10/10 100%	1/1 100%	1/1 100%	2/2 100%	3/3 100%	3/3 100%
PvT_9/9gate	10/10 100%	2/2 100%	-	3/3 100%	-	5/5 100%
PvT_9/9proxygate	10/10 100%	4/4 100%	3/3 100%	1/1 100%	1/1 100%	1/1 100%
PvT_bulldog	10/10 100%	1/1 100%	1/1 100%	5/5 100%	1/1 100%	2/2 100%
PvT_dtdrop	10/10 100%	1/1 100%	-	3/3 100%	3/3 100%	3/3 100%
PvT_proxydt	7/9 78%	1/1 100%	1/1 100%	3/3 100%	2/2 100%	0/2 0%
PvT_stove	10/10 100%	2/2 100%	1/1 100%	1/1 100%	3/3 100%	3/3 100%

We can see exactly how Ecgberht scored its total of 2 wins: It happened to play a fast proxy when BananaBrain played a slow proxy. For BananaBrain, maybe the lesson is to avoid risky openings versus much weaker opponents. As a general principle, I suggest saving risky builds for games where you have a high risk of losing with safe play—in that case, why not?

bananabrain as seen by ecgberht

bananabrain played	#	ecgberht recognized
PvT_10/12gate	10	7 ZealotRush \| 3 Unknown
PvT_10/15gate	10	10 Unknown
PvT_12nexus	10	9 ProtossFE \| 1 Unknown
PvT_1gatedtexpo	10	10 Unknown
PvT_1gatereaver	10	10 Unknown
PvT_28nexus	10	10 Unknown
PvT_2gatedt	11	11 Unknown
PvT_2gaterngexpo	10	10 Unknown
PvT_32nexus	10	10 Unknown
PvT_9/9gate	10	7 ZealotRush \| 3 Unknown
PvT_9/9proxygate	10	8 Unknown \| 2 CannonRush
PvT_bulldog	10	10 Unknown
PvT_dtdrop	10	10 Unknown
PvT_proxydt	9	9 Unknown
PvT_stove	10	10 Unknown

Except for a couple cases of CannonRush, the builds that Ecgberht recognized were named correctly. I imagine that it interpreted CannonRush as “something proxied.”

ecgberht as seen by bananabrain

ecgberht played	#	bananabrain recognized
14CC	31	21 T_fastexpand \| 6 T_unknown \| 4 T_2rax
FullMech	28	21 T_unknown \| 6 T_1fac \| 1 T_2fac
JoyORush	28	23 T_2fac \| 3 T_unknown \| 2 T_1fac
MechGreedyFE	28	25 T_unknown \| 3 T_2rax
ProxyEightRax	35	35 T_unknown

As we’ve seen before, BananaBrain has little skill in recognizing terran builds.

AIIDE 2020 - what Ecgberht learned

I added parsing code for Ecgberht’s JSON format learning files. I had to refactor for generality, and it added complexity, but I can use the parser for more than one purpose. Today I summarize the contents of its history files.
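As a rough illustration of what such a summary involves, here is a minimal sketch that turns a JSON list of game records into the opening/games/wins/first/last columns used in the tables below. The record shape (`opening`, `won`) is hypothetical and is not Ecgberht's actual schema, and reading `first` and `last` as zero-based game indices is my interpretation of those columns.

```python
import json

def summarize(history_json: str) -> dict:
    """Return {opening: (games, win_pct, first, last)} from a JSON list of games.

    Assumed (hypothetical) record shape: {"opening": str, "won": bool}.
    """
    games = json.loads(history_json)
    rows = {}
    for index, game in enumerate(games):
        row = rows.setdefault(
            game["opening"],
            {"games": 0, "wins": 0, "first": index, "last": index},
        )
        row["games"] += 1
        row["wins"] += int(game["won"])
        row["last"] = index  # overwritten until the final appearance
    return {
        name: (r["games"], round(100 * r["wins"] / r["games"]), r["first"], r["last"])
        for name, r in rows.items()
    }

# Made-up sample data, only to exercise the function.
sample = json.dumps([
    {"opening": "FullMech", "won": False},
    {"opening": "ProxyEightRax", "won": True},
    {"opening": "FullMech", "won": False},
    {"opening": "ProxyEightRax", "won": False},
])
for name, row in summarize(sample).items():
    print(name, *row, sep="\t")
```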

Ecgberht, I think, is a complex and interesting bot. It played up to 5 different strategies in each matchup, though the selection of the 5 varied by matchup. Sometimes it played fewer. Against most opponents Ecgberht played its strategies at roughly equal rates, except for the strategies it didn’t play at all. Ecgberht uses UCB with a high exploration rate. The strategy manager in the source lists 15 strategies (plus one more played only on the map Plasma and named PlasmaWraithHell), so it did not play everything it knows. I made a quick scan through the source for opponent-specific preparation, and did find some, but for bots in the tournament only ZZZKBot is affected (it is flagged by a zergling rush check; some bots that always zealot rush are flagged for that). I didn’t dig deep enough to find out why Ecgberht ignores so many of its available strategies.
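To show why a high exploration rate produces roughly equal play rates, here is a minimal sketch of textbook UCB1 selection. It is not Ecgberht's code: the exact formula it uses, the constant `c`, and the function name `ucb_choice` are all assumptions for illustration.

```python
import math

def ucb_choice(stats: dict, c: float = 2.0) -> str:
    """Pick an opening by UCB1. stats maps name -> (wins, games).

    c is a hypothetical exploration constant: larger values flatten the
    play rates, smaller values favor the opening with the best win rate.
    """
    total = sum(games for _, games in stats.values())
    best_name, best_score = None, -math.inf
    for name, (wins, games) in stats.items():
        if games == 0:
            return name  # try every opening once before scoring
        # Mean win rate plus an uncertainty bonus that shrinks with games played.
        score = wins / games + c * math.sqrt(math.log(total) / games)
        if score > best_score:
            best_name, best_score = name, score
    return best_name

# Made-up numbers: "rush" has the better record, "expand" is less explored.
stats = {"rush": (8, 20), "expand": (1, 5)}
print(ucb_choice(stats, c=0.0))  # pure exploitation
print(ucb_choice(stats, c=2.0))  # heavy exploration
```

With `c=0.0` the choice is `rush`, the better win rate. With `c=2.0` the bonus for the rarely played `expand` outweighs its worse record, so losing openings keep getting picked, which matches the near-uniform play rates in the tables.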

Ecgberht tries to recognize the opponent’s strategy, but often finds itself unsure. It recorded a high rate of Unknown enemy plans. The ones it does recognize are drawn from a small set that seems to me well-chosen.

Ecgberht recorded fewer than 150 games for 5 of its 12 opponents, although it completed all games with no crashes. In total, 7 games do not appear in the game records of the history files. Maybe it has a cleanup bug that bites occasionally?
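The missing-game count follows directly from the per-opponent totals in the tables below. A short check, using only those totals and the 150 games expected per pairing:

```python
# Games recorded per opponent, copied from the summary tables in this post.
recorded = {
    "stardust": 149, "purplewave": 150, "bananabrain": 150, "dragon": 150,
    "mcrave": 148, "microwave": 149, "steamhammer": 149, "daqin": 150,
    "zzzkbot": 150, "ualbertabot": 150, "willyt": 150, "eggbot": 148,
}
EXPECTED = 150  # games per pairing

# Keep only the opponents with a shortfall, mapped to how many games are missing.
short = {name: EXPECTED - n for name, n in recorded.items() if n < EXPECTED}
print(short)
print(len(short), "opponents short,", sum(short.values()), "games missing")
```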

#1 stardust

opening	games	wins	first	last
14CC	31	0%	3	147
FullMech	28	0%	0	148
JoyORush	27	0%	2	143
MechGreedyFE	27	0%	4	146
ProxyEightRax	36	6%	1	141
5 openings	149	1%

enemy	games	wins
Unknown	149	1%
1 opening	149	1%

A couple wins against the top player is not bad.

#2 purplewave

opening	games	wins	first	last
14CC	35	3%	3	148
FullMech	29	0%	0	149
JoyORush	28	0%	2	146
MechGreedyFE	28	0%	4	147
ProxyEightRax	30	0%	1	142
5 openings	150	1%

enemy	games	wins
ProtossFE	7	0%
Unknown	143	1%
2 openings	150	1%

#3 bananabrain

opening	games	wins	first	last
14CC	31	0%	3	146
FullMech	28	0%	0	144
JoyORush	28	0%	2	147
MechGreedyFE	28	0%	4	148
ProxyEightRax	35	6%	1	149
5 openings	150	1%

enemy	games	wins
CannonRush	2	0%
ProtossFE	9	0%
Unknown	125	2%
ZealotRush	14	0%
4 openings	150	1%

#4 dragon

opening	games	wins	first	last
14CC	28	0%	3	148
BioMechGreedyFE	28	4%	4	144
FullMech	25	0%	0	146
ProxyBBS	44	18%	2	149
ProxyEightRax	25	0%	1	147
5 openings	150	6%

enemy	games	wins
Unknown	150	6%
1 opening	150	6%

#5 mcrave

opening	games	wins	first	last
14CC	28	7%	7	147
BioGreedyFE	51	29%	0	145
ProxyEightRax	47	26%	21	140
TwoPortWraith	22	5%	3	146
4 openings	148	20%

enemy	games	wins
FastHatch	61	16%
NinePool	13	31%
Unknown	74	22%
3 openings	148	20%

Ecgberht put up its strongest fight against zerg.

#6 microwave

opening	games	wins	first	last
14CC	32	9%	4	145
BioGreedyFE	21	0%	0	148
FullBioFE	24	4%	3	146
ProxyEightRax	52	27%	1	147
TwoPortWraith	20	0%	2	138
5 openings	149	12%

enemy	games	wins
FastHatch	99	4%
NinePool	5	40%
Unknown	45	27%
3 openings	149	12%

#7 steamhammer

opening	games	wins	first	last
14CC	34	12%	8	147
BioGreedyFE	36	17%	0	142
ProxyEightRax	36	14%	1	141
TwoPortWraith	43	23%	4	148
4 openings	149	17%

enemy	games	wins
EarlyPool	4	0%
FastHatch	22	32%
NinePool	81	14%
Unknown	42	17%
4 openings	149	17%

#8 daqin

opening	games	wins	first	last
14CC	32	0%	8	148
FullMech	29	0%	0	149
JoyORush	28	0%	4	144
MechGreedyFE	28	0%	43	147
ProxyEightRax	33	3%	1	141
5 openings	150	1%

enemy	games	wins
Unknown	150	1%
1 opening	150	1%

#9 zzzkbot

opening	games	wins	first	last
FullBio	150	71%	0	149
1 opening	150	71%

enemy	games	wins
EarlyPool	150	71%
1 opening	150	71%

Ecgberht upset ZZZKBot, possibly aided by its hardcoded knowledge of how ZZZKBot plays.

#10 ualbertabot

opening	games	wins	first	last
FullBio	58	43%	0	144
FullMech	52	38%	2	145
ProxyBBS	40	32%	1	149
3 openings	150	39%

enemy	games	wins
BioPush	11	91%
EarlyPool	12	50%
MechRush	9	33%
Unknown	104	24%
ZealotRush	14	100%
5 openings	150	39%

#11 willyt

opening	games	wins	first	last
14CC	31	3%	68	148
FullMech	34	9%	0	147
ProxyEightRax	85	41%	2	149
3 openings	150	26%

enemy	games	wins
BioPush	34	15%
Unknown	116	29%
2 openings	150	26%

#13 eggbot

opening	games	wins	first	last
FullMech	148	94%	0	147
1 opening	148	94%

enemy	games	wins
CannonRush	94	95%
Unknown	54	93%
2 openings	148	94%

AIIDE 2020 - Microwave versus BananaBrain

This is the last matchup I can analyze this way without writing more parsing code. McRave did ask for more in a comment, though, so I may do that. All the matchups have featured BananaBrain.

Microwave plays a large number of strategies, so I put it on the left side. Blue is good for Microwave, red is good for BananaBrain.

microwave strategies versus bananabrain strategies

	overall	PvZ_10/12gate	PvZ_1basespeedzeal	PvZ_2basespeedzeal	PvZ_4gate2archon	PvZ_5gategoon	PvZ_9/9gate	PvZ_9/9proxygate	PvZ_bisu	PvZ_neobisu	PvZ_sairdt	PvZ_sairgoon	PvZ_sairreaver	PvZ_stove
overall	58/150 39%	5/17 29%	3/19 16%	4/11 36%	4/9 44%	4/7 57%	5/11 45%	5/12 42%	4/14 29%	4/10 40%	5/10 50%	6/11 55%	4/9 44%	5/10 50%
10Hatch9Pool9gas	0/2 0%	-	-	-	0/1 0%	0/1 0%	-	-	-	-	-	-	-	-
10HatchMain9Pool9Gas	0/1 0%	-	-	-	-	-	-	-	0/1 0%	-	-	-	-	-
11HatchTurtleHydra	0/1 0%	-	-	-	-	-	-	-	-	0/1 0%	-	-	-	-
12Hatch	0/1 0%	0/1 0%	-	-	-	-	-	-	-	-	-	-	-	-
12PoolMain	22/43 51%	0/5 0%	0/9 0%	2/2 100%	3/3 100%	3/3 100%	0/1 0%	1/3 33%	2/2 100%	3/3 100%	0/3 0%	2/3 67%	4/4 100%	2/2 100%
12PoolMuta	0/1 0%	0/1 0%	-	-	-	-	-	-	-	-	-	-	-	-
1HatchMuta_Sparkle	0/1 0%	-	-	-	-	-	-	0/1 0%	-	-	-	-	-	-
2HatchMuta	1/5 20%	-	-	1/1 100%	-	-	0/1 0%	-	0/1 0%	-	-	-	0/1 0%	0/1 0%
3HatchHydraBust	0/1 0%	-	-	-	-	-	-	-	0/1 0%	-	-	-	-	-
3HatchHydra_BHG	0/1 0%	-	-	0/1 0%	-	-	-	-	-	-	-	-	-	-
3HatchLingBust	2/6 33%	-	0/1 0%	0/1 0%	-	-	1/1 100%	0/1 0%	-	-	-	1/1 100%	-	0/1 0%
3HatchMuta	0/1 0%	-	-	-	-	-	-	-	-	0/1 0%	-	-	-	-
3HatchPoolHydraExpo	0/1 0%	0/1 0%	-	-	-	-	-	-	-	-	-	-	-	-
4HatchBeforeGas	0/1 0%	-	-	-	-	-	-	-	-	-	-	0/1 0%	-	-
4HatchPoolHydra	0/2 0%	-	0/1 0%	0/1 0%	-	-	-	-	-	-	-	-	-	-
4PoolHard	2/6 33%	-	1/1 100%	0/1 0%	-	-	1/1 100%	-	0/1 0%	-	-	-	-	0/2 0%
4PoolSoft	0/1 0%	-	0/1 0%	-	-	-	-	-	-	-	-	-	-	-
6Pool	0/1 0%	-	0/1 0%	-	-	-	-	-	-	-	-	-	-	-
7Pool	0/1 0%	-	-	-	-	-	-	-	-	-	0/1 0%	-	-	-
8Pool	0/1 0%	-	-	-	-	-	-	-	-	0/1 0%	-	-	-	-
8PoolHydraRush8D	0/1 0%	0/1 0%	-	-	-	-	-	-	-	-	-	-	-	-
9PoolGasHatchSpeed8D	12/18 67%	2/2 100%	2/2 100%	-	1/2 50%	0/1 0%	1/1 100%	0/2 0%	1/1 100%	1/1 100%	1/1 100%	1/2 50%	0/1 0%	2/2 100%
9PoolHatchGasSpeed7D	0/1 0%	-	-	-	0/1 0%	-	-	-	-	-	-	-	-	-
9PoolHatchGasSpeed8D	17/32 53%	3/4 75%	0/1 0%	1/1 100%	0/1 0%	0/1 0%	2/4 50%	4/5 80%	1/5 20%	0/1 0%	4/4 100%	2/2 100%	0/2 0%	0/1 0%
9PoolSpeed	0/3 0%	0/1 0%	-	-	0/1 0%	-	-	-	-	-	-	0/1 0%	-	-
9PoolSpeedLing	1/5 20%	-	-	-	-	-	0/1 0%	-	0/1 0%	-	-	0/1 0%	0/1 0%	1/1 100%
9PoolSunkHatch	0/1 0%	-	-	0/1 0%	-	-	-	-	-	-	-	-	-	-
Overpool	0/1 0%	0/1 0%	-	-	-	-	-	-	-	-	-	-	-	-
OverpoolSpeed	0/3 0%	-	0/1 0%	0/1 0%	-	-	-	-	0/1 0%	-	-	-	-	-
ZvP_10Hatch9Pool	1/3 33%	-	0/1 0%	0/1 0%	-	1/1 100%	-	-	-	-	-	-	-	-
ZvP_11Hatch10Pool	0/1 0%	-	-	-	-	-	-	-	-	0/1 0%	-	-	-	-
ZvZ_Overgas9Pool	0/1 0%	-	-	-	-	-	-	-	-	0/1 0%	-	-	-	-
ZvZ_Overpool11Gas	0/2 0%	-	-	-	-	-	0/1 0%	-	-	-	0/1 0%	-	-	-

This table looks even more scattered than yesterday’s BananaBrain-Dragon table, but to me it tells a story of duelling learning algorithms. Microwave found a few builds that countered BananaBrain’s preferred play, and BananaBrain did not shift its responses far enough to entirely squelch them.

microwave as seen by bananabrain

microwave played	#	bananabrain recognized
10Hatch9Pool9gas	2	2 Z_10hatch
10HatchMain9Pool9Gas	1	1 Z_10hatch
11HatchTurtleHydra	1	1 Z_12hatch
12Hatch	1	1 Z_12hatch
12PoolMain	43	36 Z_12pool \| 5 Z_10hatch \| 2 Z_unknown
12PoolMuta	1	1 Z_10hatch
1HatchMuta_Sparkle	1	1 Z_unknown
2HatchMuta	5	5 Z_12hatch
3HatchHydraBust	1	1 Z_12hatch
3HatchHydra_BHG	1	1 Z_10hatch
3HatchLingBust	6	6 Z_12hatch
3HatchMuta	1	1 Z_12hatch
3HatchPoolHydraExpo	1	1 Z_12hatch
4HatchBeforeGas	1	1 Z_12hatch
4HatchPoolHydra	2	2 Z_12hatch
4PoolHard	6	6 Z_4/5pool
4PoolSoft	1	1 Z_4/5pool
6Pool	1	1 Z_4/5pool
7Pool	1	1 Z_9pool
8Pool	1	1 Z_9pool
8PoolHydraRush8D	1	1 Z_9pool
9PoolGasHatchSpeed8D	18	15 Z_9pool \| 3 Z_overpool
9PoolHatchGasSpeed7D	1	1 Z_9pool
9PoolHatchGasSpeed8D	32	29 Z_9pool \| 3 Z_overpool
9PoolSpeed	3	2 Z_9poolspeed \| 1 Z_9pool
9PoolSpeedLing	5	5 Z_9poolspeed
9PoolSunkHatch	1	1 Z_9pool
Overpool	1	1 Z_overpool
OverpoolSpeed	3	3 Z_overpool
ZvP_10Hatch9Pool	3	3 Z_10hatch
ZvP_11Hatch10Pool	1	1 Z_12hatch
ZvZ_Overgas9Pool	1	1 Z_12pool
ZvZ_Overpool11Gas	2	2 Z_overpool

BananaBrain was accurate at reading Microwave’s initial build. Lumping 11 hatch with 12 hatch is fine, they’re very similar. 12 pool can be difficult to distinguish from 10 hatch, if you scout it late after the second hatchery finishes. It would be useful to better separate 9 pool from overpool, which are significantly different in effect, but it requires close attention to detail. Overall, highly accurate readings with only one wide miss, seeing the overgas 9 pool as 12 pool—and that is a ZvZ build that is extremely rare in ZvP.

It makes quite a contrast with yesterday’s BananaBrain-Dragon analysis, where BananaBrain barely recognized terran builds.

bananabrain as seen by microwave

bananabrain played	#	microwave recognized
PvZ_10/12gate	17	13 HeavyRush \| 3 Unknown \| 1 NakedExpand
PvZ_1basespeedzeal	19	14 Unknown \| 5 HeavyRush
PvZ_2basespeedzeal	11	4 NakedExpand \| 3 Turtle \| 3 SafeExpand \| 1 HeavyRush
PvZ_4gate2archon	9	4 NakedExpand \| 4 SafeExpand \| 1 HeavyRush
PvZ_5gategoon	7	6 NakedExpand \| 1 HeavyRush
PvZ_9/9gate	11	9 HeavyRush \| 2 Unknown
PvZ_9/9proxygate	12	6 HeavyRush \| 6 Unknown
PvZ_bisu	14	6 SafeExpand \| 4 NakedExpand \| 2 Turtle \| 1 HeavyRush \| 1 Unknown
PvZ_neobisu	10	4 NakedExpand \| 3 SafeExpand \| 2 Turtle \| 1 HeavyRush
PvZ_sairdt	10	8 Unknown \| 2 HeavyRush
PvZ_sairgoon	11	7 NakedExpand \| 1 SafeExpand \| 1 Turtle \| 1 Unknown \| 1 HeavyRush
PvZ_sairreaver	9	4 SafeExpand \| 3 NakedExpand \| 2 Turtle
PvZ_stove	10	7 Unknown \| 3 HeavyRush

Microwave borrowed Steamhammer’s rather crude classification of enemy plans (which was still far in the future when Microwave forked from Steamhammer). It was intended to be minimal, just enough to allow for basic reactions, to hold the fort until I could raise enough troops to make a sally. Microwave’s recognitions look similar to Steamhammer’s, with the right general tendency but many sloppy variations (which I think are due mostly to weak scouting, with a contribution from overlapping recognition rules).

It’s striking that some recognitions—of dubious accuracy—are dark blue in stark contrast to their neighbors. It gives me the impression that Microwave makes use of the recognized enemy plan, in some cases to good effect. It suggests that more accurate recognition, if the reactions are also good, could be a major improvement.

AIIDE 2020 - BananaBrain versus Dragon

Of the 4 bots I’m prepared to run this analysis on, this is the only pairing involving Dragon. Dragon did not record all 150 games against either McRave or Microwave. Like yesterday, all win rates and coloring are from the point of view of BananaBrain: Blue is good for BananaBrain, red is good for Dragon.

bananabrain strategies versus dragon strategies

	overall	1rax fe	2rax bio	2rax mech	bio	dirty worker rush	mass vulture	siege expand
overall	67/150 45%	6/14 43%	6/11 55%	8/15 53%	15/37 41%	3/3 100%	22/56 39%	7/14 50%
PvT_10/12gate	12/17 71%	2/3 67%	-	2/3 67%	4/4 100%	-	3/6 50%	1/1 100%
PvT_10/15gate	5/12 42%	-	2/2 100%	1/5 20%	1/3 33%	-	1/2 50%	-
PvT_12nexus	1/8 12%	1/2 50%	-	-	0/1 0%	-	0/3 0%	0/2 0%
PvT_1gatedtexpo	3/7 43%	1/2 50%	-	-	0/1 0%	-	2/4 50%	-
PvT_1gatereaver	0/5 0%	-	0/1 0%	-	0/2 0%	-	0/2 0%	-
PvT_28nexus	5/11 45%	0/2 0%	0/1 0%	0/2 0%	1/1 100%	-	4/5 80%	-
PvT_2gatedt	3/9 33%	0/1 0%	-	1/1 100%	0/2 0%	-	0/3 0%	2/2 100%
PvT_2gaterngexpo	2/7 29%	-	0/1 0%	-	1/1 100%	1/1 100%	0/4 0%	-
PvT_32nexus	2/8 25%	-	-	-	1/4 25%	1/1 100%	0/2 0%	0/1 0%
PvT_9/9gate	14/18 78%	-	2/3 67%	-	4/4 100%	1/1 100%	7/9 78%	0/1 0%
PvT_9/9proxygate	8/14 57%	1/1 100%	1/1 100%	3/3 100%	0/2 0%	-	2/6 33%	1/1 100%
PvT_bulldog	0/6 0%	0/1 0%	-	-	0/3 0%	-	0/1 0%	0/1 0%
PvT_dtdrop	2/8 25%	-	1/1 100%	-	0/4 0%	-	1/2 50%	0/1 0%
PvT_proxydt	10/14 71%	1/1 100%	-	1/1 100%	3/3 100%	-	2/5 40%	3/4 75%
PvT_stove	0/6 0%	0/1 0%	0/1 0%	-	0/2 0%	-	0/2 0%	-

Not one table cell has more than 9 games in it. Neither bot successfully predicted what the other would play, if it even tried: BananaBrain is unpredictable and Dragon changes its choice frequently when losing, and besides BananaBrain is poor at recognizing terran plans. So the strategy x strategy cross is a hash. To me the table means that, at least for this pairing, reactions during the game were more important than the initial choice of strategy. Neither side had a way to choose a counter beforehand.
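For readers who want to reproduce this kind of cross, here is a minimal sketch of tallying a strategy x strategy table and finding its largest cell. The input shape, a list of `(own, enemy, won)` tuples, is an assumption, and the sample games are made up rather than taken from the history files.

```python
from collections import defaultdict

def cross_table(games):
    """Tally (wins, games) for each (own strategy, enemy strategy) pair.

    games: iterable of (own, enemy, won) tuples; this shape is an assumption.
    """
    cells = defaultdict(lambda: [0, 0])
    for own, enemy, won in games:
        cell = cells[(own, enemy)]
        cell[0] += int(won)
        cell[1] += 1
    return {pair: tuple(cell) for pair, cell in cells.items()}

# Made-up sample games, only to exercise the function.
sample = [
    ("PvT_9/9gate", "mass vulture", True),
    ("PvT_9/9gate", "mass vulture", False),
    ("PvT_9/9gate", "bio", True),
    ("PvT_bulldog", "bio", False),
]
table = cross_table(sample)
largest = max(n for _, n in table.values())
for (own, enemy), (wins, n) in sorted(table.items()):
    print(f"{own}\t{enemy}\t{wins}/{n} {wins / n:.0%}")
print("largest cell:", largest, "games")
```

With 15 BananaBrain openings and 7 Dragon builds there are 105 cells for 150 games, about 1.4 games per cell on average, so thin cells are expected even before either bot starts adapting.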

bananabrain as seen by dragon

Dragon does not record a recognized opponent strategy. Its history files have only its own strategy and whether it won.

dragon as seen by bananabrain

dragon played	#	bananabrain recognized
1rax fe	14	13 T_unknown \| 1 T_fastexpand
2rax bio	11	8 T_unknown \| 2 T_fastexpand \| 1 T_1fac
2rax mech	15	14 T_unknown \| 1 T_1fac
bio	37	35 T_unknown \| 1 T_1fac \| 1 T_fastexpand
dirty worker rush	3	3 T_unknown
mass vulture	56	30 T_1fac \| 26 T_unknown
siege expand	14	9 T_unknown \| 5 T_1fac

We knew that BananaBrain struggles to recognize terran strategies. Maybe the author has not spent effort on it because it doesn’t affect results much? In any case, given how Dragon plays, with its love of fast expansions and mixed tech, the terran builds that are recognized probably represent truths about the games. It’s not clear that they are helpful truths, though, because they say so little about what happened.

From the coloring, it looks as though there was little relationship between whether BananaBrain recognized Dragon’s build and whether BananaBrain won. That is consistent with the theory that the author decided it didn’t matter.