
Steamhammer games and status

Steamhammer played an excellent game versus Monster today. The game is kind of long and boring to watch, with repetitive action, but I’m pleased by the good play against stubborn defense. Steamhammer wasted some resources and missed some opportunities, but made no severe mistake at any point. It even expanded at a good time, which is depressingly rare in its ZvZs. Near the end, Steamhammer tried to put the cherry on top by ensnaring Monster’s mutalisks, but the mutas zoomed by too fast, the ensnare missed, and the queen was shot down. Oh well, dropping the cherry didn’t change the rest!

For a game that is not in the least excellent but is interesting for its mistakes, I like yesterday’s Steamhammer-Slater game. I watched the game live, and when Steamhammer bumbled the defense of its natural I steeled myself for a quick upset. But it was not so quick after all. The game is a showcase of ways to go wrong on both sides. Some of Steamhammer’s mistakes remain unresolved because my planned fixes are complicated and need to be implemented as projects.

The latency compensation bug is still making me scratch my head. The easiest way to work around it is to use the Micro module’s order tracking; Steamhammer already keeps track of what orders it has given to units, including larvas, so it doesn’t need to rely on BWAPI to keep it straight. I traced the backbone of the production code and added the minimal workaround, a two-line addition to the code that decides whether a unit should be added to the set of candidate producers. And... it didn’t work. In order to control where zerg units are made, to do things like make drones at bases that don’t have enough drones, there is a special-case low-level routine, and it ignores the set of candidate producers and does its own calculations from scratch—slightly complicated calculations that the candidates don’t make easier. I’m still thinking about the right fix. Maybe I can find a way to make it simple and powerful at the same time.

It is, by the way, a serious bug. In Steamhammer, the effect is to sometimes—at predictable times—drop a unit that was queued for production. Among other things, it turns 12 hatch openings into 11 hatch. I had noticed that Steamhammer was playing 11 hatch surprisingly often, but it does have a full suite of intentional 11 hatch openings, so I didn’t realize that it was due to a bug.

SSCAIT 2020 halfway point

The annual SSCAIT is past the halfway point of the round robin phase, and it’s time to take stock. The numbers keep changing, but here’s a snapshot.

Stardust has slowly climbed to #1 after its weak start, with only 4 losses after 50 games, as compared to #2 Monster with 7 losses after 67 games. Stardust has strong chances to hold its #1 position, though it has played fewer games. Stardust’s worst upset was against #29 ICEbot, while Monster’s was against #47 Junkbot. #5 PurpleWave is unexpectedly low, below #4 BananaBrain; from games I’ve seen, I suspect it did not get its usual thorough preparation, or perhaps the prep was concentrated on top opponents so that it can succeed in the elimination phase. #6 Iron is doing better than I expected, and is ahead of #7 Hao Pan, though they are ranked close and the edge may not stick. #9 Xiao Yi is also higher than I expected.

#14 Skynet by Andrew Smith is the only classic unupdated bot to hang on in the top 16. Other classics #17 UAlbertaBot by Dave Churchill and #18 XIMP by Tomas Vajda are just outside, and there is a gap with #16 Proxy so they may remain outside at the end of the round robin. #19 McRaveZ I had hoped to do better; its muta micro is good but its muta decision making (which target to seek, when to attack and when to run away) is not nearly as good as Monster’s. #20 Microwave has been slowly upping its win rate and has an outside chance of making it into the top 16 by the end; I imagine its learning is figuring out how to compensate for the bugs in this version.

Steamhammer is at #13 at the moment after a few losses, but I’m still forecasting that its most likely finish is #9 or #10. It has played more of its tough games than its easy games.

Some bots get special icons on the unofficial crosstable by Lines Prower. It’s a cute touch, though for me it makes the table harder to read. The funniest is Krasi0P’s linux penguin for 2 wins and Windows logo for 2 losses. I don’t understand McRaveZ’s icons. A salt shaker for losses, OK, but a secret agent for wins? I may be missing some background. PurpleWave gets a purple heart for wins. Maybe Lines Prower doesn’t know what a purple heart means to Americans?

only one horrible game

Last year, Steamhammer finished SSCAIT for the first time with no losses due to crippling bugs and only 2 close calls. So far, it is on track to repeat in this year’s SSCAIT. I have seen all its games up to now, and there are no losses due to egregious bugs (only the standard issue flaws) and only one near miss. That’s great compared to Steamhammer’s early years, but I still want to fix the bugs.

The bad game is Steamhammer vs legacy (random zerg). Steamhammer made a number of mistakes in the game and suffered at least 2 bugs. The bug I could not accept is that it built spore colonies to defend against air attack—very early, immediately after scouting legacy’s base and seeing that it had not yet taken gas. It’s not possible to get mutalisks that fast, and without gas there was not even a hint of future risk. In fact, legacy never took its gas and played the whole game with a mass slow zergling plan. If Steamhammer had held on to the drones instead of wasting them on static defense due to a bug, I doubt the attack would have troubled it at all.

I traced the bug to, of all things, an integer overflow. The routine that figures out the time the enemy’s spire will complete returns INT_MAX for “never” if there is no evidence of an enemy spire... and I brilliantly added a margin for the mutas to hatch and fly across the map. In C++, signed integer overflow is officially undefined behavior, so the compiler retreats to its room and laughs its head off before generating the code that will cause the most possible confusion, because “undefined” means it can do that. I don’t know what it did this time, but it was not as simple as wrapping around from an extreme positive value to an extreme negative value, because that would have caused the bug to show up in half of ZvZ games. No, it’s better if it shows up only when it will cause a disgusting blunder out of nowhere.
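The post does not say what the fix was, so the following is only a sketch of one common way to keep a “never” sentinel out of arithmetic: saturate instead of adding blindly. It is written in Python, where integers do not overflow, so `INT_MAX` here merely stands in for the C++ constant. The names `add_margin` and `wrap32` are hypothetical and are not Steamhammer code.

```python
INT_MAX = 2**31 - 1  # stand-in for C++ INT_MAX with a 32-bit int (assumption)


def add_margin(frame: int, margin: int) -> int:
    """Add a non-negative margin to a frame count, saturating at INT_MAX.

    If frame is the "never" sentinel (INT_MAX), or close enough that the
    sum would exceed INT_MAX, the result stays at INT_MAX.
    """
    if margin < 0:
        raise ValueError("margin must be non-negative")
    if frame > INT_MAX - margin:
        return INT_MAX
    return frame + margin


def wrap32(value: int) -> int:
    """Show what plain two's-complement wraparound would produce.

    This is only for illustration. In C++ the overflow is undefined
    behavior, so the real result need not match this.
    """
    return (value + 2**31) % 2**32 - 2**31
```

With this guard, `add_margin(INT_MAX, 500)` is still “never”, while a naive wrapped sum such as `wrap32(INT_MAX + 500)` would be a large negative frame number, that is, a spire that supposedly finished long ago.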

Anyway, it was easy to fix. I also fixed a bug that caused multiple commanding of overlords. And I’m writing code to collect data for my main current project. Progress is underway.

SSCAIT and performance over time

Yesterday I claimed that the cannon bot Jakub Trancik “has been falling slowly in the rankings year by year, even as bots that began above it fall further.” Is it true? I see room for argument, but there is something to it.

[Figure: graph of SSCAIT finishes for 5 bots over 6 years]

Here is a graph of the SSCAIT finishing ranks of 5 bots over 6 years, from the 2014 through 2019 editions of the round robin phase of the annual tournament. The bots were selected to have no updates over the time period; it is the same code every year, according to the info on SSCAIT’s website. (UAlbertaBot by Dave Churchill was updated in 2015, so I didn’t include its 2014 finish.) The finishing ranks are normalized so that finishing first is 100 and finishing last is 0, so that the ranks can be compared over time even though each year had a different number of participants. The graph shows old bots falling in relative performance as new and updated bots grew stronger over the years.
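The normalization can be sketched as follows. The post only states the endpoints (first place is 100, last place is 0). Linear interpolation between them is my assumption, and `normalized_finish` is a hypothetical name.

```python
def normalized_finish(rank: int, participants: int) -> float:
    """Map a finishing rank to a 0..100 scale.

    rank 1 (first place) maps to 100, rank == participants (last place)
    maps to 0, and ranks in between are spaced evenly (an assumption).
    """
    if participants < 2:
        raise ValueError("need at least two participants")
    if not 1 <= rank <= participants:
        raise ValueError("rank out of range")
    return 100.0 * (participants - rank) / (participants - 1)
```

Under this scheme a bot that finishes in the exact middle of an odd-sized field scores 50 regardless of how many bots took part, which is what lets finishes from different years share one axis.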

Jakub Trancik’s finishes were nearly flat from 2014 through 2017, and it fell in 2018. It did not participate in 2019, though it has been allowed back this year. The other non-updated bots showed declines over the period, but not always steep declines. Each bot has a visible knee in the curve, where it bent more sharply down. The year of the knee, the last year of relatively stable performance, ranges from 2016 for Tomas Cere to 2018 for Skynet by Andrew Smith. That might be because performance gains have accelerated in the last few years, or it might be because it takes that long for enough new and updated bots to be tuned against the unchanging old ones. Maybe the knees occur when flashy newcomers start to exploit specific weaknesses of the old guard.

Of these 5 bots, Jakub Trancik has the flattest curve, though it doesn’t look exceptional. It is not a statistical outlier, and Skynet’s curve is almost as level. Jakub Trancik is also the least sophisticated bot, and it has the most extreme and unconventional strategy. The facts might be related.

For comparison, here’s another chart with 4 more non-updated bots. Roman Danielis missed 2016. These curves also seem to have knees, though less sharp, and the shape of Roman Danielis’s curve is not clear to the eye.

[Figure: graph of SSCAIT finishes for 4 more bots]

SSCAIT 2020 so far

The annual SSCAIT has progressed far enough that the competitors have roughly sorted themselves into groups. It’s about 1/4 complete, and we can get an idea of how things are going. Currently we have #1 Monster, which may in fact be the favorite to finish first, but it’s too early to talk about detailed finishing order.

Iron is doing better than I expected, though I guess it’s within the statistical margin of error. I have always been bemused by the consistency of cannonbot #38 Jakub Trancik, 11-16 for 41%, last updated in 2013. It has been falling slowly in the rankings year by year, even as bots that began above it fall further; apparently improvements that help against usual play do not help as much against the cannons. What stands out more to me are the bots that collapsed. Styx is failing to start and losing every game. Microwave, which should be in the top 16, is currently #31 of 56 with 16-15; maybe the latest update introduced a bug.

The biggest upset is #52 Marine Hell > #8 Steamhammer; Steamhammer went from lifetime 67-2 to 67-3 (since opponent modeling was added) against this opponent after failing to scout Marine Hell’s unit mix and making the wrong choice of counter units, among other mistakes. A more interesting upset is #48 Garmbot by Aurelien Lermant > #9 Dragon, where Dragon tried its usual harassing game plan but ended up defending all game instead, and could not hold it together. I was also pleased with #16 Skynet by Andrew Smith > #2 BetaStar after BetaStar chose a risky build, and this time didn’t get away with it. Don’t underestimate your foes: “It is not enough to be a good player, you must also play well” — Siegbert Tarrasch.

Steamhammer is currently at #8 with 17-5, having played only 22 games, fewer games than any other bot in the top 16 except Stardust, which has played only 20. I think Steamhammer’s most likely finish is #9 or #10, but we’ll see. Last year I was slightly pessimistic, and if the same is true this year then it may hold its position.

SSCAIT is popular today

I see 15 viewers on the SSCAIT stream as I write. More usually I see 1 to 4 of late, which presumably includes me when I’m watching. It’s a good sign; the annual tournament is driving interest.

SSCAIT tournament soon

I’ve just uploaded Steamhammer 3.3.5, which will be the SSCAIT tournament version unless it hits a last-minute bug. If you dare to rush through your opponent prep, now’s the time! Expect the change list after the deadline. On the surface, this version fixes all the most visible bugs introduced in and since the AIIDE version; the games look cleaner, overlords live longer, bizarre expansion behavior does not happen. Results are only slightly improved, though, in part because of the “learning hides bugs” issue. I expected better.

Starting on 19 December, there’s been a rush of updates. In fact, every bot updated after 27 November was updated (or re-updated) on 19 December or later, so there’s a gap in the dates.

There is not much to predict about the tournament. I think everyone can foresee that the top finishers of the round robin phase will include Stardust, Krasi0 (if it competes as terran this year), Monster, and PurpleWave, and likely BananaBrain which has been doing well. Halo by Hao Pan is significantly weaker, and there is a gap below Hao Pan and adias (aka SAIDA) of nearly 100 elo before the remaining strong bots. Steamhammer is likely to finish near the middle of the top 16, and then survive not very long in the elimination phase, as in past years.

Steamhammer tournament plans

For the upcoming SSCAIT annual tournament, I’ll follow my usual plan. I’ve just uploaded a new test version Steamhammer 3.3.1, which fixes one of the critical bugs (and has another surprise change). I’ll drop frequent test versions until tournament time, and after the deadline I’ll release the tournament version. Time is short, so the changes will be mostly bug fixes and low-risk improvements that are unlikely to break stuff.

I expect the standard long no-upload period while the tournament runs. I will either turn to SCHNAIL, or else I’ll work on one of my machine learning ideas. Just after tournament season is the ideal time to add bugs and their associated major new features, so that I can spend the rest of the year working desperately to fix... I mean, to tune them.

AIIDE 2020 - various versus DaQin

I added parsing for DaQin’s files, which was little effort. I decided to dump all of DaQin’s analysis into a single post, because the tables aren’t that rich in information. Now I’m able to move on to other topics. I put the opponents on the left, so that in all cases, blue is good for the opponent and red is good for DaQin.
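For readers who want to build the same kind of table from their own logs, here is a minimal sketch of the tallying step. The record format (row strategy, column strategy, whether the row player won) is an assumption, not the layout of any bot’s actual learning files, and the function names are hypothetical.

```python
from collections import defaultdict


def crosstable(games):
    """Tally wins and games for every strategy pairing.

    games: iterable of (row_strategy, col_strategy, row_player_won).
    Returns a dict mapping (row, col) to [wins, games], including
    "overall" margins for each row, each column, and the whole match.
    """
    cells = defaultdict(lambda: [0, 0])
    for row, col, won in games:
        keys = (
            (row, col),
            (row, "overall"),
            ("overall", col),
            ("overall", "overall"),
        )
        for key in keys:
            cells[key][1] += 1
            if won:
                cells[key][0] += 1
    return cells


def format_cell(cell):
    """Render [wins, games] as 'wins/games pct%', or '-' if never played."""
    wins, games = cell
    if games == 0:
        return "-"
    # Integer arithmetic that rounds halves up, avoiding float surprises.
    percent = (200 * wins + games) // (2 * games)
    return f"{wins}/{games} {percent}%"
```

Each cell in the tables below has this shape: wins for the bot on the left, games played with that pairing, and the win percentage.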

bananabrain strategies versus daqin strategies

| | overall | 2GateDT | 3GateDT | 4GateGoon |
|---|---|---|---|---|
| overall | 99/150 66% | 9/14 64% | 53/89 60% | 37/47 79% |
| PvP_10/12gate | 7/10 70% | - | 2/5 40% | 5/5 100% |
| PvP_12nexus | 4/7 57% | 1/1 100% | 2/4 50% | 1/2 50% |
| PvP_2gatedt | 14/16 88% | 1/1 100% | 7/9 78% | 6/6 100% |
| PvP_2gatedtexpo | 10/14 71% | 1/2 50% | 5/7 71% | 4/5 80% |
| PvP_2gatereaver | 13/16 81% | 1/1 100% | 5/7 71% | 7/8 88% |
| PvP_3gaterobo | 8/13 62% | 2/2 100% | 5/7 71% | 1/4 25% |
| PvP_3gatespeedzeal | 2/7 29% | 0/1 0% | 1/5 20% | 1/1 100% |
| PvP_4gategoon | 4/8 50% | 1/1 100% | 2/6 33% | 1/1 100% |
| PvP_9/9gate | 15/16 94% | - | 11/11 100% | 4/5 80% |
| PvP_9/9proxygate | 6/10 60% | 1/1 100% | 2/6 33% | 3/3 100% |
| PvP_nzcore | 7/11 64% | 1/1 100% | 4/8 50% | 2/2 100% |
| PvP_zcore | 3/7 43% | 0/2 0% | 3/5 60% | - |
| PvP_zcorez | 2/7 29% | - | 2/4 50% | 0/3 0% |
| PvP_zzcore | 4/8 50% | 0/1 0% | 2/5 40% | 2/2 100% |

Reading DaQin’s openings out of its configuration file, I see that 2GateDT makes 2 dark templar out of the promised 2 gateways, adds 3 cannons in front of its natural, then expands. As a PvP build, that strikes me as illogical (you might want one cannon if the enemy also has dark templar). 3GateDT makes one gate, gets dragoons and dragoon range, adds a second gate and a citadel, and then the predefined build order ends—the rest is left to the strategy manager. That seems sensible as far as it goes, but does the strategy manager regularly add a third gate and make DTs as promised, or is the name of the opening a lie? See below for BananaBrain’s opinion on the question. In any case, 3GateDT is the opening that gave BananaBrain the most trouble.

bananabrain as seen by daqin

| bananabrain played | # | daqin recognized |
|---|---|---|
| PvP_10/12gate | 10 | 10 Fast rush |
| PvP_12nexus | 7 | 5 Fast rush; 1 Safe expand; 1 Naked expand |
| PvP_2gatedt | 16 | 16 Fast rush |
| PvP_2gatedtexpo | 14 | 13 DarkTemplar rush; 1 Unknown |
| PvP_2gatereaver | 16 | 16 DarkTemplar rush |
| PvP_3gaterobo | 13 | 13 DarkTemplar rush |
| PvP_3gatespeedzeal | 7 | 6 Fast rush; 1 Unknown |
| PvP_4gategoon | 8 | 5 DarkTemplar rush; 1 Naked expand; 1 Unknown; 1 Fast rush |
| PvP_9/9gate | 16 | 16 Fast rush |
| PvP_9/9proxygate | 10 | 9 Fast rush; 1 Proxy |
| PvP_nzcore | 11 | 8 DarkTemplar rush; 1 Not fast rush; 1 Naked expand; 1 Unknown |
| PvP_zcore | 7 | 7 DarkTemplar rush |
| PvP_zcorez | 7 | 5 DarkTemplar rush; 2 Not fast rush |
| PvP_zzcore | 8 | 5 DarkTemplar rush; 2 Proxy; 1 Not fast rush |

DaQin recognizes 9-9 gate as Fast rush, but also the economy-first 10-12 gate and even the fast expand 12 nexus. What BananaBrain calls a reaver build, DaQin sees as a dark templar rush. Strategy recognition has some odd results.

daqin as seen by bananabrain

| daqin played | # | bananabrain recognized |
|---|---|---|
| 2GateDT | 14 | 12 P_1gatecore; 2 P_unknown |
| 3GateDT | 89 | 45 P_1gatecore; 32 P_4gategoon; 11 P_unknown; 1 P_ffe |
| 4GateGoon | 47 | 36 P_4gategoon; 9 P_1gatecore; 2 P_unknown |

This suggests that DaQin’s 3GateDT was often not a dark templar build at all.

mcrave strategies versus daqin strategies

| | overall | ForgeExpand5GateGoon | ForgeExpandSpeedlots |
|---|---|---|---|
| overall | 97/150 65% | 3/3 100% | 94/147 64% |
| PoolHatch,Overpool,2HatchMuta | 97/150 65% | 3/3 100% | 94/147 64% |

Not a lot of strategic variety here.

mcrave as seen by daqin

| mcrave played | # | daqin recognized |
|---|---|---|
| PoolHatch,Overpool,2HatchMuta | 150 | 117 Not fast rush; 28 Heavy rush; 5 Unknown |

daqin as seen by mcrave

| daqin played | # | mcrave recognized |
|---|---|---|
| ForgeExpand5GateGoon | 3 | 3 FFE,Forge,5GateGoon |
| ForgeExpandSpeedlots | 147 | 121 FFE,Forge,Speedlot; 21 FFE,Nexus,Speedlot; 2 FFE,Forge,5GateGoon; 2 FFE,Forge,ZealotArchon; 1 FFE,Gateway,Speedlot |

microwave strategies versus daqin strategies

| | overall | 4GateGoon | ForgeExpand5GateGoon | ForgeExpandSpeedlots |
|---|---|---|---|---|
| overall | 125/150 83% | 3/11 27% | 3/3 100% | 119/136 88% |
| 1HatchMuta_Sparkle | 56/62 90% | 0/1 0% | - | 56/61 92% |
| 3HatchLingBust | 11/17 65% | 2/4 50% | 1/1 100% | 8/12 67% |
| 3HatchMuta | 53/59 90% | 0/2 0% | 2/2 100% | 51/55 93% |
| 3HatchMutaExpo | 5/9 56% | 1/4 25% | - | 4/5 80% |
| 3HatchPoolHydraExpo | 0/1 0% | - | - | 0/1 0% |
| 9Pool | 0/1 0% | - | - | 0/1 0% |
| OverpoolLurker | 0/1 0% | - | - | 0/1 0% |

Why did DaQin play its most successful opening by far, 4GateGoon, in only 11 of 150 games? It is not that it discovered the opening late; it played it first in game 10 of 150, and won that game. It immediately played it again and lost, but soon played it a third time and won again. It surely wasn’t confused by too many choices. Either there was a bug, or some built-in bias in DaQin’s decisions led it astray.

microwave as seen by daqin

| microwave played | # | daqin recognized |
|---|---|---|
| 1HatchMuta_Sparkle | 62 | 34 Not fast rush; 19 Heavy rush; 7 Unknown; 2 Proxy |
| 3HatchLingBust | 17 | 12 Not fast rush; 4 Heavy rush; 1 Proxy |
| 3HatchMuta | 59 | 48 Not fast rush; 7 Heavy rush; 4 Proxy |
| 3HatchMutaExpo | 9 | 8 Not fast rush; 1 Proxy |
| 3HatchPoolHydraExpo | 1 | 1 Not fast rush |
| 9Pool | 1 | 1 Fast rush |
| OverpoolLurker | 1 | 1 Unknown |

daqin as seen by microwave

| daqin played | # | microwave recognized |
|---|---|---|
| 4GateGoon | 11 | 8 Unknown; 2 HeavyRush; 1 Proxy |
| ForgeExpand5GateGoon | 3 | 1 SafeExpand; 1 Turtle; 1 NakedExpand |
| ForgeExpandSpeedlots | 136 | 68 Turtle; 22 HeavyRush; 20 SafeExpand; 16 NakedExpand; 10 Unknown |

steamhammer strategies versus daqin strategies

| | overall | ForgeExpand5GateGoon | ForgeExpandSpeedlots |
|---|---|---|---|
| overall | 33/150 22% | 29/136 21% | 4/14 29% |
| 10HatchLing | 0/1 0% | 0/1 0% | - |
| 11Gas10PoolLurker | 0/1 0% | 0/1 0% | - |
| 12-12Hatch | 0/1 0% | 0/1 0% | - |
| 12Hatch_4HatchLing | 0/2 0% | 0/2 0% | - |
| 2.5HatchMuta | 0/1 0% | 0/1 0% | - |
| 2HatchHydraBust | 0/2 0% | 0/1 0% | 0/1 0% |
| 3HatchHydra | 0/2 0% | 0/2 0% | - |
| 3HatchHydraBust | 0/3 0% | 0/2 0% | 0/1 0% |
| 3HatchHydraExpo | 0/1 0% | 0/1 0% | - |
| 3HatchLateHydras+1 | 0/1 0% | 0/1 0% | - |
| 3HatchLing | 26/59 44% | 24/52 46% | 2/7 29% |
| 3HatchLingBust2 | 0/2 0% | 0/2 0% | - |
| 4HatchBeforeGas | 5/25 20% | 3/23 13% | 2/2 100% |
| 4HatchBeforeLair | 0/1 0% | 0/1 0% | - |
| 5HatchBeforeGas | 0/2 0% | - | 0/2 0% |
| 5HatchPool | 0/1 0% | 0/1 0% | - |
| 5PoolHard2Player | 0/1 0% | 0/1 0% | - |
| 5Scout | 0/1 0% | 0/1 0% | - |
| 973HydraBust | 0/4 0% | 0/3 0% | 0/1 0% |
| 9Pool8GasLurker | 0/1 0% | 0/1 0% | - |
| 9PoolHatchSpeed | 0/1 0% | 0/1 0% | - |
| 9PoolHatchSpeedSpire2 | 0/1 0% | 0/1 0% | - |
| 9PoolHatchSpire | 0/1 0% | 0/1 0% | - |
| 9PoolSpireSlowlings | 0/1 0% | 0/1 0% | - |
| 9PoolSunkHatch | 0/1 0% | 0/1 0% | - |
| AntiFact_2Hatch | 0/1 0% | 0/1 0% | - |
| AntiFact_Overpool9Gas | 0/1 0% | 0/1 0% | - |
| AntiFactory2 | 0/1 0% | 0/1 0% | - |
| Over10Hatch1Sunk | 0/1 0% | 0/1 0% | - |
| OverhatchExpoMuta | 0/3 0% | 0/3 0% | - |
| OverpoolSpeed | 0/1 0% | 0/1 0% | - |
| OverpoolTurtle 0 | 0/2 0% | 0/2 0% | - |
| Proxy8HatchNatural | 0/1 0% | 0/1 0% | - |
| Sparkle 3HatchMuta | 1/6 17% | 1/6 17% | - |
| ZvP_2HatchMuta | 0/1 0% | 0/1 0% | - |
| ZvP_3BaseSpire+Den | 0/1 0% | 0/1 0% | - |
| ZvP_3HatchPoolHydra | 1/7 14% | 1/7 14% | - |
| ZvT_2HatchMuta | 0/1 0% | 0/1 0% | - |
| ZvT_3HatchMuta | 0/1 0% | 0/1 0% | - |
| ZvT_7Pool | 0/1 0% | 0/1 0% | - |
| ZvZ_12PoolLing | 0/1 0% | 0/1 0% | - |
| ZvZ_12PoolLingB | 0/2 0% | 0/2 0% | - |
| ZvZ_Overpool11Gas | 0/1 0% | 0/1 0% | - |

steamhammer as seen by daqin

| steamhammer played | # | daqin recognized |
|---|---|---|
| 10HatchLing | 1 | 1 Unknown |
| 11Gas10PoolLurker | 1 | 1 Heavy rush |
| 12-12Hatch | 1 | 1 Not fast rush |
| 12Hatch_4HatchLing | 2 | 2 Heavy rush |
| 2.5HatchMuta | 1 | 1 Not fast rush |
| 2HatchHydraBust | 2 | 1 Hydra bust; 1 Not fast rush |
| 3HatchHydra | 2 | 2 Not fast rush |
| 3HatchHydraBust | 3 | 2 Not fast rush; 1 Heavy rush |
| 3HatchHydraExpo | 1 | 1 Not fast rush |
| 3HatchLateHydras+1 | 1 | 1 Not fast rush |
| 3HatchLing | 59 | 40 Not fast rush; 16 Heavy rush; 3 Unknown |
| 3HatchLingBust2 | 2 | 1 Not fast rush; 1 Unknown |
| 4HatchBeforeGas | 25 | 24 Not fast rush; 1 Unknown |
| 4HatchBeforeLair | 1 | 1 Not fast rush |
| 5HatchBeforeGas | 2 | 2 Not fast rush |
| 5HatchPool | 1 | 1 Not fast rush |
| 5PoolHard2Player | 1 | 1 Fast rush |
| 5Scout | 1 | 1 Not fast rush |
| 973HydraBust | 4 | 4 Not fast rush |
| 9Pool8GasLurker | 1 | 1 Heavy rush |
| 9PoolHatchSpeed | 1 | 1 Heavy rush |
| 9PoolHatchSpeedSpire2 | 1 | 1 Fast rush |
| 9PoolHatchSpire | 1 | 1 Heavy rush |
| 9PoolSpireSlowlings | 1 | 1 Heavy rush |
| 9PoolSunkHatch | 1 | 1 Fast rush |
| AntiFact_2Hatch | 1 | 1 Not fast rush |
| AntiFact_Overpool9Gas | 1 | 1 Not fast rush |
| AntiFactory2 | 1 | 1 Heavy rush |
| Over10Hatch1Sunk | 1 | 1 Heavy rush |
| OverhatchExpoMuta | 3 | 3 Not fast rush |
| OverpoolSpeed | 1 | 1 Heavy rush |
| OverpoolTurtle 0 | 2 | 2 Heavy rush |
| Proxy8HatchNatural | 1 | 1 Heavy rush |
| Sparkle 3HatchMuta | 6 | 6 Not fast rush |
| ZvP_2HatchMuta | 1 | 1 Not fast rush |
| ZvP_3BaseSpire+Den | 1 | 1 Heavy rush |
| ZvP_3HatchPoolHydra | 7 | 4 Not fast rush; 1 Heavy rush; 1 Hydra bust; 1 Unknown |
| ZvT_2HatchMuta | 1 | 1 Not fast rush |
| ZvT_3HatchMuta | 1 | 1 Not fast rush |
| ZvT_7Pool | 1 | 1 Fast rush |
| ZvZ_12PoolLing | 1 | 1 Not fast rush |
| ZvZ_12PoolLingB | 2 | 2 Not fast rush |
| ZvZ_Overpool11Gas | 1 | 1 Heavy rush |

daqin as seen by steamhammer

| daqin played | # | steamhammer recognized |
|---|---|---|
| ForgeExpand5GateGoon | 136 | 79 Turtle; 41 Safe expand; 10 Heavy rush; 5 Naked expand; 1 Unknown |
| ForgeExpandSpeedlots | 14 | 7 Safe expand; 6 Turtle; 1 Unknown |

ecgberht strategies versus daqin strategies

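| | overall | 12NexusCarriers | 3GateDT |
|---|---|---|---|
| overall | 1/150 1% | 1/2 50% | 0/148 0% |
| 14CC | 0/32 0% | - | 0/32 0% |
| FullMech | 0/29 0% | 0/1 0% | 0/28 0% |
| JoyORush | 0/28 0% | - | 0/28 0% |
| MechGreedyFE | 0/28 0% | - | 0/28 0% |
| ProxyEightRax | 1/33 3% | 1/1 100% | 0/32 0% |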

ecgberht as seen by daqin

| ecgberht played | # | daqin recognized |
|---|---|---|
| 14CC | 32 | 28 Safe expand; 2 Naked expand; 2 Unknown |
| FullMech | 29 | 27 Factory; 2 Not fast rush |
| JoyORush | 28 | 27 Factory; 1 Unknown |
| MechGreedyFE | 28 | 12 Unknown; 9 Safe expand; 7 Not fast rush |
| ProxyEightRax | 33 | 27 Fast rush; 5 Not fast rush; 1 Proxy |

daqin as seen by ecgberht

| daqin played | # | ecgberht recognized |
|---|---|---|
| 12NexusCarriers | 2 | 2 Unknown |
| 3GateDT | 148 | 148 Unknown |

AIIDE 2020 - Microwave versus Steamhammer

Microwave played more different openings than Steamhammer (no doubt seeking a winning choice), so I put it on the left. Blue is good for Microwave, red is good for Steamhammer.

microwave strategies versus steamhammer strategies

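| | overall | 6PoolBurrow | 8-8HydraRush | 9Hatch8Pool | 9PoolHatchSpeedSpire | OverhatchLing | OverpoolBurrow | ZvZ_12HatchExpo | ZvZ_12PoolLing | ZvZ_12PoolMain | ZvZ_Overpool11Gas | ZvZ_Overpool9Gas | ZvZ_OverpoolTurtle |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| overall | 43/150 29% | 1/1 100% | 1/1 100% | 4/5 80% | 1/1 100% | 1/1 100% | 1/1 100% | 3/5 60% | 4/11 36% | 2/3 67% | 12/44 27% | 7/64 11% | 6/13 46% |
| 10Hatch9Pool9gas | 4/9 44% | - | - | 0/1 0% | - | - | - | - | - | - | 2/2 100% | 1/5 20% | 1/1 100% |
| 10HatchMain9Pool9Gas | 1/4 25% | - | - | - | - | - | - | - | - | - | 0/1 0% | 1/2 50% | 0/1 0% |
| 10HatchTurtleHydra | 0/1 0% | - | - | - | - | - | - | - | - | - | - | 0/1 0% | - |
| 11HatchTurtleMuta | 0/1 0% | - | - | - | - | - | - | - | - | - | - | - | 0/1 0% |
| 12HatchMain | 0/1 0% | - | - | - | - | - | - | - | - | - | 0/1 0% | - | - |
| 12Pool | 5/25 20% | - | - | - | - | 1/1 100% | - | - | 0/3 0% | 2/2 100% | 2/7 29% | 0/10 0% | 0/2 0% |
| 12PoolMain | 1/5 20% | - | - | 1/1 100% | - | - | - | - | - | - | 0/2 0% | 0/2 0% | - |
| 2HatchLurker | 0/2 0% | - | - | - | - | - | - | - | - | - | 0/1 0% | 0/1 0% | - |
| 3HatchHydraBust | 0/1 0% | - | - | - | - | - | - | - | - | - | - | 0/1 0% | - |
| 3HatchHydraExpo | 0/2 0% | - | - | - | - | - | - | - | - | - | 0/1 0% | 0/1 0% | - |
| 3HatchPoolHydra | 0/2 0% | - | - | - | - | - | - | - | 0/1 0% | - | 0/1 0% | - | - |
| 4HatchPoolHydra | 0/1 0% | - | - | - | - | - | - | - | - | - | - | 0/1 0% | - |
| 5Pool | 0/4 0% | - | - | - | - | - | - | - | 0/1 0% | - | 0/1 0% | 0/2 0% | - |
| 5PoolSpeed | 1/3 33% | - | 1/1 100% | - | - | - | - | - | 0/1 0% | - | 0/1 0% | - | - |
| 7Pool | 0/1 0% | - | - | - | - | - | - | - | - | - | - | 0/1 0% | - |
| 7PoolHydraLingRush7D | 0/1 0% | - | - | - | - | - | - | - | - | - | - | 0/1 0% | - |
| 9Hatch9Pool9Gas | 0/1 0% | - | - | - | - | - | - | - | - | - | - | 0/1 0% | - |
| 9HatchTurtleHydra | 0/1 0% | - | - | - | - | - | - | 0/1 0% | - | - | - | - | - |
| 9PoolGasHatchSpeed8D | 0/1 0% | - | - | - | - | - | - | - | - | - | 0/1 0% | - | - |
| 9PoolHatch | 0/2 0% | - | - | - | - | - | - | - | - | - | - | 0/2 0% | - |
| 9PoolSpeed | 17/31 55% | 1/1 100% | - | 3/3 100% | 1/1 100% | - | - | 1/1 100% | 1/1 100% | - | 4/6 67% | 4/13 31% | 2/5 40% |
| 9PoolSpeedLing | 0/1 0% | - | - | - | - | - | - | 0/1 0% | - | - | - | - | - |
| 9PoolSunken | 0/7 0% | - | - | - | - | - | - | - | - | 0/1 0% | 0/3 0% | 0/3 0% | - |
| OverpoolSpeed | 1/3 33% | - | - | - | - | - | 1/1 100% | - | - | - | 0/1 0% | 0/1 0% | - |
| ZvP_11Hatch10Pool | 2/4 50% | - | - | - | - | - | - | 1/1 100% | - | - | 0/1 0% | 1/2 50% | - |
| ZvP_2HatchHydra | 0/9 0% | - | - | - | - | - | - | - | - | - | 0/4 0% | 0/5 0% | - |
| ZvP_9Hatch9Pool | 0/1 0% | - | - | - | - | - | - | - | - | - | 0/1 0% | - | - |
| ZvZ_Overgas11Pool | 10/20 50% | - | - | - | - | - | - | 1/1 100% | 3/4 75% | - | 4/9 44% | 0/4 0% | 2/2 100% |
| ZvZ_Overpool11Gas | 0/2 0% | - | - | - | - | - | - | - | - | - | - | 0/2 0% | - |
| ZvZ_Overpool9Gas | 1/4 25% | - | - | - | - | - | - | - | - | - | - | 0/3 0% | 1/1 100% |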

Steamhammer’s ZvZ_Overpool9Gas opening was successful against all Microwave tries, but notice that it was the only one: Flecks of blue, or entire streaks, crept into every other Steamhammer attempt. The end result does not look close, but in fact Microwave would have needed only a small increment of skill to turn it around; there was only one strategy it was unprepared to face.

microwave as seen by steamhammer

| microwave played | # | steamhammer recognized |
|---|---|---|
| 10Hatch9Pool9gas | 9 | 5 Naked expand; 3 Heavy rush; 1 Unknown |
| 10HatchMain9Pool9Gas | 4 | 3 Unknown; 1 Turtle |
| 10HatchTurtleHydra | 1 | 1 Naked expand |
| 11HatchTurtleMuta | 1 | 1 Heavy rush |
| 12HatchMain | 1 | 1 Unknown |
| 12Pool | 25 | 17 Naked expand; 5 Heavy rush; 3 Unknown |
| 12PoolMain | 5 | 3 Heavy rush; 2 Unknown |
| 2HatchLurker | 2 | 2 Naked expand |
| 3HatchHydraBust | 1 | 1 Naked expand |
| 3HatchHydraExpo | 2 | 1 Naked expand; 1 Heavy rush |
| 3HatchPoolHydra | 2 | 1 Naked expand; 1 Heavy rush |
| 4HatchPoolHydra | 1 | 1 Heavy rush |
| 5Pool | 4 | 4 Fast rush |
| 5PoolSpeed | 3 | 3 Fast rush |
| 7Pool | 1 | 1 Fast rush |
| 7PoolHydraLingRush7D | 1 | 1 Unknown |
| 9Hatch9Pool9Gas | 1 | 1 Naked expand |
| 9HatchTurtleHydra | 1 | 1 Heavy rush |
| 9PoolGasHatchSpeed8D | 1 | 1 Heavy rush |
| 9PoolHatch | 2 | 1 Unknown; 1 Heavy rush |
| 9PoolSpeed | 31 | 21 Unknown; 7 Naked expand; 3 Heavy rush |
| 9PoolSpeedLing | 1 | 1 Naked expand |
| 9PoolSunken | 7 | 4 Unknown; 3 Heavy rush |
| OverpoolSpeed | 3 | 2 Unknown; 1 Heavy rush |
| ZvP_11Hatch10Pool | 4 | 3 Naked expand; 1 Heavy rush |
| ZvP_2HatchHydra | 9 | 6 Heavy rush; 2 Turtle; 1 Naked expand |
| ZvP_9Hatch9Pool | 1 | 1 Naked expand |
| ZvZ_Overgas11Pool | 20 | 19 Unknown; 1 Turtle |
| ZvZ_Overpool11Gas | 2 | 2 Unknown |
| ZvZ_Overpool9Gas | 4 | 4 Unknown |

To play ZvZ truly well, Steamhammer needs a more detailed understanding of enemy builds. But even with this crude breakdown, I notice that most of the blue spots are associated with misunderstanding the main idea of Microwave’s play. On the other hand, many misunderstandings also show as red.

steamhammer as seen by microwave

| steamhammer played | # | microwave recognized |
|---|---|---|
| 6PoolBurrow | 1 | 1 FastRush |
| 8-8HydraRush | 1 | 1 Unknown |
| 9Hatch8Pool | 5 | 4 HeavyRush; 1 Unknown |
| 9PoolHatchSpeedSpire | 1 | 1 NakedExpand |
| OverhatchLing | 1 | 1 HeavyRush |
| OverpoolBurrow | 1 | 1 NakedExpand |
| ZvZ_12HatchExpo | 5 | 5 NakedExpand |
| ZvZ_12PoolLing | 11 | 8 HeavyRush; 2 Unknown; 1 NakedExpand |
| ZvZ_12PoolMain | 3 | 3 HeavyRush |
| ZvZ_Overpool11Gas | 44 | 36 Turtle; 5 Unknown; 3 NakedExpand |
| ZvZ_Overpool9Gas | 64 | 51 Turtle; 8 Unknown; 5 NakedExpand |
| ZvZ_OverpoolTurtle | 13 | 13 Turtle |

The builds recognized as Turtle genuinely are turtle builds. They get mutalisks fast at the expense of weakness to zergling attack, which they compensate for with sunkens instead of a second hatchery. From the meta-strategy point of view, Steamhammer usually defeats Microwave in games where Steamhammer gains air superiority early, so Steamhammer’s choices make sense.

AIIDE 2020 - Steamhammer versus McRave

I added parsing for Steamhammer. DaQin is nearly the same. The only remaining bot which records data that can be analyzed this way is ZZZKBot, which has a difficult file format, does not keep a recognized enemy strategy, and doesn’t bother to write a newline at the end of its file. I may skip ZZZKBot.

The Steamhammer-McRave strategy crosstable is the most interesting one yet.

steamhammer strategies versus mcrave strategies

| | overall | PoolHatch,12Pool,2HatchMuta | PoolHatch,12Pool,2HatchSpeedling | PoolLair,9Pool,1HatchMuta |
|---|---|---|---|---|
| overall | 64/150 43% | 17/33 52% | 10/22 45% | 37/95 39% |
| 12PoolLurker | 0/1 0% | - | - | 0/1 0% |
| 3HatchLingBurrow | 1/5 20% | 1/2 50% | - | 0/3 0% |
| 8DroneGas | 7/11 64% | - | 1/1 100% | 6/10 60% |
| 9HatchMain9Pool9Gas | 0/2 0% | - | - | 0/2 0% |
| 9PoolHatchSpeedAllInB | 0/1 0% | - | - | 0/1 0% |
| 9PoolSpire | 0/2 0% | 0/2 0% | - | - |
| Over10HatchBust | 8/19 42% | 7/7 100% | - | 1/12 8% |
| Over10PoolLing | 0/1 0% | - | - | 0/1 0% |
| OverpoolSpeed | 3/15 20% | 1/5 20% | 0/3 0% | 2/7 29% |
| OverpoolSunk | 8/21 38% | 0/1 0% | 2/8 25% | 6/12 50% |
| OverpoolTurtle | 11/23 48% | 2/6 33% | 1/1 100% | 8/16 50% |
| ZvP_3HatchMuta | 0/1 0% | - | - | 0/1 0% |
| ZvZ_12HatchExpo | 0/1 0% | - | 0/1 0% | - |
| ZvZ_Overgas9Pool | 0/2 0% | - | 0/1 0% | 0/1 0% |
| ZvZ_OverpoolTurtle | 26/45 58% | 6/10 60% | 6/7 86% | 14/28 50% |

For Steamhammer, either 8DroneGas (a zergling build despite the name) or else ZvZ_OverpoolTurtle (a mutalisk build) may dominate among the openings tried. McRave’s best was the 1 hatch muta play: Steamhammer scored only 39% against it overall, and 8DroneGas (6/10) was the only Steamhammer opening that did better than even against it. It’s possible that switching between different kinds of builds was important, though, because the table suggests that the counters are imbalanced, with no game-theoretic saddle point (no pairing of openings where neither side gains by switching).
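To make the saddle point idea concrete, here is a small sketch that checks a win-rate matrix for a pure-strategy saddle point. Rows are one player’s openings, columns are the opponent’s, and each entry is the row player’s win rate. The example matrices in the usage note are made up for illustration. They are not taken from the table above, which has too many empty cells to analyze this way. `pure_saddle_points` is a hypothetical helper name.

```python
def pure_saddle_points(payoff):
    """Return all (row, col) index pairs that are pure saddle points.

    payoff[i][j] is the row player's win rate when row opening i meets
    column opening j. The row player wants high values, the column
    player wants low values. An entry is a saddle point when it is the
    minimum of its row and the maximum of its column, so neither side
    gains by switching openings unilaterally.
    """
    row_minima = [min(row) for row in payoff]
    col_maxima = [max(col) for col in zip(*payoff)]
    return [
        (i, j)
        for i, row in enumerate(payoff)
        for j, value in enumerate(row)
        if value == row_minima[i] and value == col_maxima[j]
    ]
```

A cyclic matrix like `[[0.5, 0.2, 0.8], [0.8, 0.5, 0.2], [0.2, 0.8, 0.5]]` has no saddle point, so both sides do best by mixing their openings. A matrix like `[[0.6, 0.5], [0.4, 0.3]]` has one at row 0, column 1, so each side can simply repeat its single best opening.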

Both sides had trouble identifying the best strategies. If both had played their best strategies then the match would have come out close to 50%, while in fact Steamhammer came out behind, so Steamhammer had more trouble selecting from its excessive range of possibilities. I get the impression of a back-and-forth learning struggle.

steamhammer as seen by mcrave

| steamhammer played | # | mcrave recognized |
|---|---|---|
| 12PoolLurker | 1 | 1 HatchPool,12Pool,1HatchMuta |
| 3HatchLingBurrow | 5 | 3 HatchPool,Unknown,2HatchLing; 1 HatchPool,Unknown,Unknown; 1 Unknown,Unknown,3HatchMuta |
| 8DroneGas | 11 | 6 HatchPool,9Pool,2HatchLing; 2 PoolHatch,9Pool,2HatchLing; 1 PoolHatch,Unknown,2HatchLing; 1 HatchPool,Unknown,2HatchLing; 1 PoolHatch,Unknown,Unknown |
| 9HatchMain9Pool9Gas | 2 | 1 PoolHatch,12Pool,2HatchLing; 1 HatchPool,Unknown,2HatchLing |
| 9PoolHatchSpeedAllInB | 1 | 1 PoolHatch,9Pool,LingRush |
| 9PoolSpire | 2 | 2 Unknown,Unknown,Unknown |
| Over10HatchBust | 19 | 7 HatchPool,12Pool,Unknown; 4 HatchPool,12Pool,2HatchLing; 3 Unknown,12Pool,Unknown; 2 HatchPool,Unknown,2HatchLing; 2 HatchPool,Unknown,Unknown; 1 PoolHatch,12Pool,Unknown |
| Over10PoolLing | 1 | 1 HatchPool,12Pool,Unknown |
| OverpoolSpeed | 15 | 5 HatchPool,9Pool,LingRush; 4 PoolHatch,12Pool,Unknown; 3 Unknown,12Pool,Unknown; 1 Unknown,9Pool,LingRush; 1 PoolHatch,9Pool,LingRush; 1 HatchPool,12Pool,3HatchMuta |
| OverpoolSunk | 21 | 8 HatchPool,9Pool,Unknown; 5 PoolHatch,9Pool,LingRush; 2 HatchPool,9Pool,3HatchMuta; 1 PoolHatch,9Pool,Unknown; 1 Unknown,Unknown,Unknown; 1 Unknown,12Pool,3HatchMuta; 1 HatchPool,Unknown,Unknown; 1 HatchPool,12Pool,3HatchMuta; 1 PoolHatch,Unknown,Unknown |
| OverpoolTurtle | 23 | 7 HatchPool,9Pool,LingRush; 5 Unknown,12Pool,1HatchHydra; 3 Unknown,Unknown,1HatchHydra; 2 HatchPool,Unknown,1HatchHydra; 2 Unknown,9Pool,1HatchHydra; 2 HatchPool,12Pool,1HatchLurker; 1 PoolHatch,12Pool,1HatchLurker; 1 HatchPool,12Pool,1HatchHydra |
| ZvP_3HatchMuta | 1 | 1 HatchPool,Unknown,2HatchLing |
| ZvZ_12HatchExpo | 1 | 1 HatchPool,Unknown,2HatchMuta |
| ZvZ_Overgas9Pool | 2 | 1 PoolLair,Unknown,1HatchMuta; 1 PoolLair,Unknown,Unknown |
| ZvZ_OverpoolTurtle | 45 | 15 Unknown,Unknown,Unknown; 10 PoolLair,9Pool,1HatchMuta; 6 PoolLair,12Pool,1HatchMuta; 6 PoolLair,Unknown,1HatchMuta; 2 PoolLair,Unknown,Unknown; 2 Unknown,12Pool,Unknown; 2 Unknown,Unknown,3HatchMuta; 1 PoolLair,9Pool,Unknown; 1 Unknown,12Pool,3HatchMuta |

Some curious stuff here. None of Steamhammer’s openings here is 3 hatch mutalisk, so those that are recognized that way may have added a third hatchery later in the game. Steamhammer does have an unfortunate love of laying down an unnecessary hatchery before its spire in ZvZ (3 hatcheries with zerglings is good, 2 hatcheries with mutalisks is good, 3 hatcheries with mutalisks is hard to justify in ZvZ). Looking at the Steamhammer openings tried more often, OverpoolSunk should be recognized as PoolHatch usually (maybe sometimes PoolLair). McRave got it wrong over half the time, without any big effect on its win rate. OverpoolTurtle should be PoolHatch with a hydra followup (this opening is not intended for ZvZ). For ZvZ_OverpoolTurtle, the closest match is PoolLair,9Pool,1HatchMuta. McRave got it right 10 times out of 45 and was close some other times. Failing to recognize anything (likely the scout was denied) was bad.

mcrave as seen by steamhammer

| mcrave played | # | steamhammer recognized |
|---|---|---|
| PoolHatch,12Pool,2HatchMuta | 33 | 22 Naked expand; 6 Unknown; 5 Heavy rush |
| PoolHatch,12Pool,2HatchSpeedling | 22 | 9 Naked expand; 9 Unknown; 3 Heavy rush; 1 Worker rush |
| PoolLair,9Pool,1HatchMuta | 95 | 89 Unknown; 4 Turtle; 2 Naked expand |

Worker rush? That is likely a bug. The other choices capture information about the game that is probably true and not particularly useful.

AIIDE 2020 - Dragon versus Ecgberht

Two posts today, to cover the newly available Ecgberht pairings. Neither post has much meat to it.

dragon strategies versus ecgberht strategies

dragon   overall      14CC        BioMechGreedyFE  FullMech    ProxyBBS    ProxyEightRax
overall  141/150 94%  28/28 100%  27/28 96%        25/25 100%  36/44 82%   25/25 100%
1rax fe  5/6 83%      1/1 100%    1/1 100%         1/1 100%    2/3 67%     -
bio      136/144 94%  27/27 100%  26/27 96%        24/24 100%  34/41 83%   25/25 100%

I was curious about Dragon’s pattern of seemingly giving up on “1rax fe” (barracks expand) after a single loss, so I looked at the file. In fact Dragon played “bio” as the regular build the whole time, throwing in “1rax fe” occasionally for spice. The “1rax fe” loss was not the last “1rax fe” game, but the second to last.

For Ecgberht, when one build is producing nearly all the wins, probably you should play it more often than 30% of the time. You may not want to play it every game, because that makes it easy for the opponent to adapt—mixing it up is good. Maybe 50% of the time would be better, given this number of alternatives? To know for sure, I guess we’d have to test against a range of bots to see the overall effectiveness of learning.

dragon as seen by ecgberht

dragon played  #    ecgberht recognized
1rax fe        6    6 Unknown
bio            144  144 Unknown

Nothing to see here. Move along.

ecgberht as seen by dragon

Dragon does not record its idea of the opponent’s build. If it has one.

AIIDE 2020 - BananaBrain versus Ecgberht

bananabrain strategies versus ecgberht strategies

bananabrain       overall      14CC        FullMech    JoyORush    MechGreedyFE  ProxyEightRax
overall           148/150 99%  31/31 100%  28/28 100%  28/28 100%  28/28 100%    33/35 94%
PvT_10/12gate     10/10 100%   3/3 100%    -           1/1 100%    3/3 100%      3/3 100%
PvT_10/15gate     10/10 100%   2/2 100%    3/3 100%    1/1 100%    1/1 100%      3/3 100%
PvT_12nexus       10/10 100%   3/3 100%    3/3 100%    2/2 100%    1/1 100%      1/1 100%
PvT_1gatedtexpo   10/10 100%   1/1 100%    3/3 100%    2/2 100%    4/4 100%      -
PvT_1gatereaver   10/10 100%   2/2 100%    3/3 100%    2/2 100%    1/1 100%      2/2 100%
PvT_28nexus       10/10 100%   3/3 100%    2/2 100%    1/1 100%    2/2 100%      2/2 100%
PvT_2gatedt       11/11 100%   2/2 100%    4/4 100%    1/1 100%    2/2 100%      2/2 100%
PvT_2gaterngexpo  10/10 100%   3/3 100%    3/3 100%    -           1/1 100%      3/3 100%
PvT_32nexus       10/10 100%   1/1 100%    1/1 100%    2/2 100%    3/3 100%      3/3 100%
PvT_9/9gate       10/10 100%   2/2 100%    -           3/3 100%    -             5/5 100%
PvT_9/9proxygate  10/10 100%   4/4 100%    3/3 100%    1/1 100%    1/1 100%      1/1 100%
PvT_bulldog       10/10 100%   1/1 100%    1/1 100%    5/5 100%    1/1 100%      2/2 100%
PvT_dtdrop        10/10 100%   1/1 100%    -           3/3 100%    3/3 100%      3/3 100%
PvT_proxydt       7/9 78%      1/1 100%    1/1 100%    3/3 100%    2/2 100%      0/2 0%
PvT_stove         10/10 100%   2/2 100%    1/1 100%    1/1 100%    3/3 100%      3/3 100%

We can see exactly how Ecgberht scored its total of 2 wins: It happened to play a fast proxy when BananaBrain played a slow proxy. For BananaBrain, maybe the lesson is to avoid risky openings versus much weaker opponents. As a general principle, I suggest saving risky builds for games where you have a high risk of losing with safe play—in that case, why not?

bananabrain as seen by ecgberht

bananabrain played  #   ecgberht recognized
PvT_10/12gate       10  7 ZealotRush | 3 Unknown
PvT_10/15gate       10  10 Unknown
PvT_12nexus         10  9 ProtossFE | 1 Unknown
PvT_1gatedtexpo     10  10 Unknown
PvT_1gatereaver     10  10 Unknown
PvT_28nexus         10  10 Unknown
PvT_2gatedt         11  11 Unknown
PvT_2gaterngexpo    10  10 Unknown
PvT_32nexus         10  10 Unknown
PvT_9/9gate         10  7 ZealotRush | 3 Unknown
PvT_9/9proxygate    10  8 Unknown | 2 CannonRush
PvT_bulldog         10  10 Unknown
PvT_dtdrop          10  10 Unknown
PvT_proxydt         9   9 Unknown
PvT_stove           10  10 Unknown

Except for a couple cases of CannonRush, the builds that Ecgberht recognized were named correctly. I imagine that it interpreted CannonRush as “something proxied.”

ecgberht as seen by bananabrain

ecgberht played  #   bananabrain recognized
14CC             31  21 T_fastexpand | 6 T_unknown | 4 T_2rax
FullMech         28  21 T_unknown | 6 T_1fac | 1 T_2fac
JoyORush         28  23 T_2fac | 3 T_unknown | 2 T_1fac
MechGreedyFE     28  25 T_unknown | 3 T_2rax
ProxyEightRax    35  35 T_unknown

As we’ve seen before, BananaBrain has little skill in recognizing terran builds.

AIIDE 2020 - what Ecgberht learned

I added parsing code for Ecgberht’s JSON format learning files. I had to refactor for generality, and it added complexity, but I can use the parser for more than one purpose. Today I summarize the contents of its history files.

Ecgberht I think is a complex and interesting bot. It played up to 5 different strategies in each matchup, though the selection of the 5 varied by matchup. Sometimes it played fewer. Against most opponents Ecgberht played its strategies at roughly equal rates—except for the strategies it didn’t play at all. Ecgberht uses UCB with a high exploration rate. The strategy manager in the source lists 15 strategies (plus one more played only on the map Plasma and named PlasmaWraithHell), so it did not play everything it knows. I made a quick scan through the source for opponent-specific preparation, and did find some, but for bots in the tournament only ZZZKBot is affected (it is flagged by a zergling rush check; some bots that always zealot rush are flagged for that). I didn’t dig deep enough to find out why Ecgberht ignores so many of its available strategies.
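
To show what a high exploration rate does to the play counts, here is a minimal sketch of UCB1-style selection. It is the generic textbook form, not Ecgberht's actual code: the function name `ucb_pick`, the exploration constant `c`, and the example counts are all assumptions made up for illustration.

```python
import math

def ucb_pick(stats, c):
    """Return the opening with the highest UCB1-style score.

    stats maps opening name -> (games, wins).
    c is a hypothetical exploration constant: larger values favor
    openings that have been played less, whatever their win rate.
    """
    total = sum(games for games, _ in stats.values())
    best_name, best_score = None, float("-inf")
    for name, (games, wins) in stats.items():
        if games == 0:
            return name  # try every opening at least once
        score = wins / games + c * math.sqrt(math.log(total) / games)
        if score > best_score:
            best_name, best_score = name, score
    return best_name

# Made-up counts: one build wins sometimes, the other never has.
stats = {"ProxyBBS": (10, 3), "FullMech": (5, 0)}
print(ucb_pick(stats, c=0.1))  # low exploration keeps the winning build
print(ucb_pick(stats, c=2.0))  # high exploration retries the losing build
```

With a large constant, the exploration bonus outweighs the difference in win rate, which would produce the roughly equal play rates seen in the tables.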

Ecgberht tries to recognize the opponent’s strategy, but often finds itself unsure. It recorded a high rate of Unknown enemy plans. The ones it does recognize are drawn from a small set that seems to me well-chosen.

Ecgberht recorded fewer than 150 games for 5 of its 12 opponents, although it completed all games with no crashes. In total, 7 games do not appear in the game records of the history files. Maybe it has a cleanup bug that bites occasionally?
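
The per-opponent tables that follow can be computed from a plain list of game records. This is a sketch only: it assumes each record has already been reduced to a hypothetical `(opening, won)` pair in play order (Ecgberht's real JSON layout is not reproduced here), and it assumes "first" and "last" are 0-based game indexes, as the numbers suggest.

```python
def summarize(records):
    """records: list of (opening, won) pairs in the order the games were played."""
    table = {}
    for index, (opening, won) in enumerate(records):
        row = table.setdefault(
            opening, {"games": 0, "wins": 0, "first": index, "last": index}
        )
        row["games"] += 1
        row["wins"] += int(won)
        row["last"] = index  # overwritten until the final game with this opening
    return table

def missing_games(records, expected=150):
    """Number of scheduled games that never made it into the history file."""
    return expected - len(records)

# Tiny made-up history, not tournament data.
history = [("14CC", False), ("FullMech", True), ("14CC", True)]
for opening, row in summarize(history).items():
    rate = round(100 * row["wins"] / row["games"])
    print(opening, row["games"], f"{rate}%", row["first"], row["last"])
print("missing:", missing_games(history, expected=4))
```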


#1 stardust

opening          games  wins  first  last
14CC             31     0%    3      147
FullMech         28     0%    0      148
JoyORush         27     0%    2      143
MechGreedyFE     27     0%    4      146
ProxyEightRax    36     6%    1      141
5 openings       149    1%

enemy            games  wins
Unknown          149    1%
1 opening        149    1%


A couple wins against the top player is not bad.


#2 purplewave

opening          games  wins  first  last
14CC             35     3%    3      148
FullMech         29     0%    0      149
JoyORush         28     0%    2      146
MechGreedyFE     28     0%    4      147
ProxyEightRax    30     0%    1      142
5 openings       150    1%

enemy            games  wins
ProtossFE        7      0%
Unknown          143    1%
2 openings       150    1%

#3 bananabrain

opening          games  wins  first  last
14CC             31     0%    3      146
FullMech         28     0%    0      144
JoyORush         28     0%    2      147
MechGreedyFE     28     0%    4      148
ProxyEightRax    35     6%    1      149
5 openings       150    1%

enemy            games  wins
CannonRush       2      0%
ProtossFE        9      0%
Unknown          125    2%
ZealotRush       14     0%
4 openings       150    1%

#4 dragon

opening          games  wins  first  last
14CC             28     0%    3      148
BioMechGreedyFE  28     4%    4      144
FullMech         25     0%    0      146
ProxyBBS         44     18%   2      149
ProxyEightRax    25     0%    1      147
5 openings       150    6%

enemy            games  wins
Unknown          150    6%
1 opening        150    6%

#5 mcrave

opening          games  wins  first  last
14CC             28     7%    7      147
BioGreedyFE      51     29%   0      145
ProxyEightRax    47     26%   21     140
TwoPortWraith    22     5%    3      146
4 openings       148    20%

enemy            games  wins
FastHatch        61     16%
NinePool         13     31%
Unknown          74     22%
3 openings       148    20%


Ecgberht put up its strongest fight against zerg.


#6 microwave

opening          games  wins  first  last
14CC             32     9%    4      145
BioGreedyFE      21     0%    0      148
FullBioFE        24     4%    3      146
ProxyEightRax    52     27%   1      147
TwoPortWraith    20     0%    2      138
5 openings       149    12%

enemy            games  wins
FastHatch        99     4%
NinePool         5      40%
Unknown          45     27%
3 openings       149    12%

#7 steamhammer

opening          games  wins  first  last
14CC             34     12%   8      147
BioGreedyFE      36     17%   0      142
ProxyEightRax    36     14%   1      141
TwoPortWraith    43     23%   4      148
4 openings       149    17%

enemy            games  wins
EarlyPool        4      0%
FastHatch        22     32%
NinePool         81     14%
Unknown          42     17%
4 openings       149    17%

#8 daqin

opening          games  wins  first  last
14CC             32     0%    8      148
FullMech         29     0%    0      149
JoyORush         28     0%    4      144
MechGreedyFE     28     0%    43     147
ProxyEightRax    33     3%    1      141
5 openings       150    1%

enemy            games  wins
Unknown          150    1%
1 opening        150    1%

#9 zzzkbot

opening          games  wins  first  last
FullBio          150    71%   0      149
1 opening        150    71%

enemy            games  wins
EarlyPool        150    71%
1 opening        150    71%


Ecgberht upset ZZZKBot, possibly aided by its hardcoded knowledge of how ZZZKBot plays.


#10 ualbertabot

opening          games  wins  first  last
FullBio          58     43%   0      144
FullMech         52     38%   2      145
ProxyBBS         40     32%   1      149
3 openings       150    39%

enemy            games  wins
BioPush          11     91%
EarlyPool        12     50%
MechRush         9      33%
Unknown          104    24%
ZealotRush       14     100%
5 openings       150    39%

#11 willyt

opening          games  wins  first  last
14CC             31     3%    68     148
FullMech         34     9%    0      147
ProxyEightRax    85     41%   2      149
3 openings       150    26%

enemy            games  wins
BioPush          34     15%
Unknown          116    29%
2 openings       150    26%

#13 eggbot

opening          games  wins  first  last
FullMech         148    94%   0      147
1 opening        148    94%

enemy            games  wins
CannonRush       94     95%
Unknown          54     93%
2 openings       148    94%

AIIDE 2020 - Microwave versus BananaBrain

This is the last matchup I can analyze this way without writing more parsing code. McRave did ask for more in a comment, though, so I may do that. All the matchups have featured BananaBrain.

Microwave plays a large number of strategies, so I put it on the left side. Blue is good for Microwave, red is good for BananaBrain.

microwave strategies versus bananabrain strategies

microwave             overall     PvZ_10/12gate  PvZ_1basespeedzeal  PvZ_2basespeedzeal  PvZ_4gate2archon  PvZ_5gategoon  PvZ_9/9gate  PvZ_9/9proxygate  PvZ_bisu  PvZ_neobisu  PvZ_sairdt  PvZ_sairgoon  PvZ_sairreaver  PvZ_stove
overall               58/150 39%  5/17 29%       3/19 16%            4/11 36%            4/9 44%           4/7 57%        5/11 45%     5/12 42%          4/14 29%  4/10 40%     5/10 50%    6/11 55%      4/9 44%         5/10 50%
10Hatch9Pool9gas      0/2 0%      -              -                   -                   0/1 0%            0/1 0%         -            -                 -         -            -           -             -               -
10HatchMain9Pool9Gas  0/1 0%      -              -                   -                   -                 -              -            -                 0/1 0%    -            -           -             -               -
11HatchTurtleHydra    0/1 0%      -              -                   -                   -                 -              -            -                 -         0/1 0%       -           -             -               -
12Hatch               0/1 0%      0/1 0%         -                   -                   -                 -              -            -                 -         -            -           -             -               -
12PoolMain            22/43 51%   0/5 0%         0/9 0%              2/2 100%            3/3 100%          3/3 100%       0/1 0%       1/3 33%           2/2 100%  3/3 100%     0/3 0%      2/3 67%       4/4 100%        2/2 100%
12PoolMuta            0/1 0%      0/1 0%         -                   -                   -                 -              -            -                 -         -            -           -             -               -
1HatchMuta_Sparkle    0/1 0%      -              -                   -                   -                 -              -            0/1 0%            -         -            -           -             -               -
2HatchMuta            1/5 20%     -              -                   1/1 100%            -                 -              0/1 0%       -                 0/1 0%    -            -           -             0/1 0%          0/1 0%
3HatchHydraBust       0/1 0%      -              -                   -                   -                 -              -            -                 0/1 0%    -            -           -             -               -
3HatchHydra_BHG       0/1 0%      -              -                   0/1 0%              -                 -              -            -                 -         -            -           -             -               -
3HatchLingBust        2/6 33%     -              0/1 0%              0/1 0%              -                 -              1/1 100%     0/1 0%            -         -            -           1/1 100%      -               0/1 0%
3HatchMuta            0/1 0%      -              -                   -                   -                 -              -            -                 -         0/1 0%       -           -             -               -
3HatchPoolHydraExpo   0/1 0%      0/1 0%         -                   -                   -                 -              -            -                 -         -            -           -             -               -
4HatchBeforeGas       0/1 0%      -              -                   -                   -                 -              -            -                 -         -            -           0/1 0%        -               -
4HatchPoolHydra       0/2 0%      -              0/1 0%              0/1 0%              -                 -              -            -                 -         -            -           -             -               -
4PoolHard             2/6 33%     -              1/1 100%            0/1 0%              -                 -              1/1 100%     -                 0/1 0%    -            -           -             -               0/2 0%
4PoolSoft             0/1 0%      -              0/1 0%              -                   -                 -              -            -                 -         -            -           -             -               -
6Pool                 0/1 0%      -              0/1 0%              -                   -                 -              -            -                 -         -            -           -             -               -
7Pool                 0/1 0%      -              -                   -                   -                 -              -            -                 -         -            0/1 0%      -             -               -
8Pool                 0/1 0%      -              -                   -                   -                 -              -            -                 -         0/1 0%       -           -             -               -
8PoolHydraRush8D      0/1 0%      0/1 0%         -                   -                   -                 -              -            -                 -         -            -           -             -               -
9PoolGasHatchSpeed8D  12/18 67%   2/2 100%       2/2 100%            -                   1/2 50%           0/1 0%         1/1 100%     0/2 0%            1/1 100%  1/1 100%     1/1 100%    1/2 50%       0/1 0%          2/2 100%
9PoolHatchGasSpeed7D  0/1 0%      -              -                   -                   0/1 0%            -              -            -                 -         -            -           -             -               -
9PoolHatchGasSpeed8D  17/32 53%   3/4 75%        0/1 0%              1/1 100%            0/1 0%            0/1 0%         2/4 50%      4/5 80%           1/5 20%   0/1 0%       4/4 100%    2/2 100%      0/2 0%          0/1 0%
9PoolSpeed            0/3 0%      0/1 0%         -                   -                   0/1 0%            -              -            -                 -         -            -           0/1 0%        -               -
9PoolSpeedLing        1/5 20%     -              -                   -                   -                 -              0/1 0%       -                 0/1 0%    -            -           0/1 0%        0/1 0%          1/1 100%
9PoolSunkHatch        0/1 0%      -              -                   0/1 0%              -                 -              -            -                 -         -            -           -             -               -
Overpool              0/1 0%      0/1 0%         -                   -                   -                 -              -            -                 -         -            -           -             -               -
OverpoolSpeed         0/3 0%      -              0/1 0%              0/1 0%              -                 -              -            -                 0/1 0%    -            -           -             -               -
ZvP_10Hatch9Pool      1/3 33%     -              0/1 0%              0/1 0%              -                 1/1 100%       -            -                 -         -            -           -             -               -
ZvP_11Hatch10Pool     0/1 0%      -              -                   -                   -                 -              -            -                 -         0/1 0%       -           -             -               -
ZvZ_Overgas9Pool      0/1 0%      -              -                   -                   -                 -              -            -                 -         0/1 0%       -           -             -               -
ZvZ_Overpool11Gas     0/2 0%      -              -                   -                   -                 -              0/1 0%       -                 -         -            0/1 0%      -             -               -

This table looks even more scattered than yesterday’s BananaBrain-Dragon table, but to me it tells a story of duelling learning algorithms. Microwave found a few builds that countered BananaBrain’s preferred play, and BananaBrain did not shift its responses far enough to entirely squelch them.

microwave as seen by bananabrain

microwave played      #   bananabrain recognized
10Hatch9Pool9gas      2   2 Z_10hatch
10HatchMain9Pool9Gas  1   1 Z_10hatch
11HatchTurtleHydra    1   1 Z_12hatch
12Hatch               1   1 Z_12hatch
12PoolMain            43  36 Z_12pool | 5 Z_10hatch | 2 Z_unknown
12PoolMuta            1   1 Z_10hatch
1HatchMuta_Sparkle    1   1 Z_unknown
2HatchMuta            5   5 Z_12hatch
3HatchHydraBust       1   1 Z_12hatch
3HatchHydra_BHG       1   1 Z_10hatch
3HatchLingBust        6   6 Z_12hatch
3HatchMuta            1   1 Z_12hatch
3HatchPoolHydraExpo   1   1 Z_12hatch
4HatchBeforeGas       1   1 Z_12hatch
4HatchPoolHydra       2   2 Z_12hatch
4PoolHard             6   6 Z_4/5pool
4PoolSoft             1   1 Z_4/5pool
6Pool                 1   1 Z_4/5pool
7Pool                 1   1 Z_9pool
8Pool                 1   1 Z_9pool
8PoolHydraRush8D      1   1 Z_9pool
9PoolGasHatchSpeed8D  18  15 Z_9pool | 3 Z_overpool
9PoolHatchGasSpeed7D  1   1 Z_9pool
9PoolHatchGasSpeed8D  32  29 Z_9pool | 3 Z_overpool
9PoolSpeed            3   2 Z_9poolspeed | 1 Z_9pool
9PoolSpeedLing        5   5 Z_9poolspeed
9PoolSunkHatch        1   1 Z_9pool
Overpool              1   1 Z_overpool
OverpoolSpeed         3   3 Z_overpool
ZvP_10Hatch9Pool      3   3 Z_10hatch
ZvP_11Hatch10Pool     1   1 Z_12hatch
ZvZ_Overgas9Pool      1   1 Z_12pool
ZvZ_Overpool11Gas     2   2 Z_overpool

BananaBrain was accurate at reading Microwave’s initial build. Lumping 11 hatch with 12 hatch is fine, they’re very similar. 12 pool can be difficult to distinguish from 10 hatch, if you scout it late after the second hatchery finishes. It would be useful to better separate 9 pool from overpool, which are significantly different in effect, but it requires close attention to detail. Overall, highly accurate readings with only one wide miss, seeing the overgas 9 pool as 12 pool—and that is a ZvZ build that is extremely rare in ZvP.
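
A "played versus recognized" table like the one above is a simple tally. The sketch below assumes the learning data has already been reduced to hypothetical `(played, recognized)` pairs, one per game; neither bot's real file format is shown, and the function names are made up for illustration.

```python
from collections import Counter, defaultdict

def tally_recognized(games):
    """games: iterable of (played, recognized) pairs, one per game."""
    table = defaultdict(Counter)
    for played, recognized in games:
        table[played][recognized] += 1
    return table

def format_row(played, counts):
    """Render one row: build name, game count, then recognitions by frequency."""
    total = sum(counts.values())
    cells = " | ".join(f"{n} {name}" for name, n in counts.most_common())
    return f"{played}  {total}  {cells}"

# Made-up sample in the shape of the 9PoolSpeed row.
games = [
    ("9PoolSpeed", "Z_9poolspeed"),
    ("9PoolSpeed", "Z_9pool"),
    ("9PoolSpeed", "Z_9poolspeed"),
]
for played, counts in tally_recognized(games).items():
    print(format_row(played, counts))
```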

It makes quite a contrast with yesterday’s BananaBrain-Dragon analysis, where BananaBrain barely recognized terran builds.

bananabrain as seen by microwave

bananabrain played  #   microwave recognized
PvZ_10/12gate       17  13 HeavyRush | 3 Unknown | 1 NakedExpand
PvZ_1basespeedzeal  19  14 Unknown | 5 HeavyRush
PvZ_2basespeedzeal  11  4 NakedExpand | 3 Turtle | 3 SafeExpand | 1 HeavyRush
PvZ_4gate2archon    9   4 NakedExpand | 4 SafeExpand | 1 HeavyRush
PvZ_5gategoon       7   6 NakedExpand | 1 HeavyRush
PvZ_9/9gate         11  9 HeavyRush | 2 Unknown
PvZ_9/9proxygate    12  6 HeavyRush | 6 Unknown
PvZ_bisu            14  6 SafeExpand | 4 NakedExpand | 2 Turtle | 1 HeavyRush | 1 Unknown
PvZ_neobisu         10  4 NakedExpand | 3 SafeExpand | 2 Turtle | 1 HeavyRush
PvZ_sairdt          10  8 Unknown | 2 HeavyRush
PvZ_sairgoon        11  7 NakedExpand | 1 SafeExpand | 1 Turtle | 1 Unknown | 1 HeavyRush
PvZ_sairreaver      9   4 SafeExpand | 3 NakedExpand | 2 Turtle
PvZ_stove           10  7 Unknown | 3 HeavyRush

Microwave borrowed Steamhammer’s rather crude classification of enemy plans (which was still far in the future when Microwave forked from Steamhammer). It was intended to be minimal, just enough to allow for basic reactions, to hold the fort until I could raise enough troops to make a sally. Microwave’s recognitions look similar to Steamhammer’s, with the right general tendency but many sloppy variations (which I think are due mostly to weak scouting, with a contribution from overlapping recognition rules).

It’s striking that some recognitions—of dubious accuracy—are dark blue in stark contrast to their neighbors. It gives me the impression that Microwave makes use of the recognized enemy plan, in some cases to good effect. It suggests that more accurate recognition, if the reactions are also good, could be a major improvement.

AIIDE 2020 - BananaBrain versus Dragon

Of the 4 bots I’m prepared to run this analysis on, this is the only pairing involving Dragon. Dragon did not record all 150 games against either McRave or Microwave. Like yesterday, all win rates and coloring are from the point of view of BananaBrain: Blue is good for BananaBrain, red is good for Dragon.

bananabrain strategies versus dragon strategies

bananabrain       overall     1rax fe   2rax bio  2rax mech  bio        dirty worker rush  mass vulture  siege expand
overall           67/150 45%  6/14 43%  6/11 55%  8/15 53%   15/37 41%  3/3 100%           22/56 39%     7/14 50%
PvT_10/12gate     12/17 71%   2/3 67%   -         2/3 67%    4/4 100%   -                  3/6 50%       1/1 100%
PvT_10/15gate     5/12 42%    -         2/2 100%  1/5 20%    1/3 33%    -                  1/2 50%       -
PvT_12nexus       1/8 12%     1/2 50%   -         -          0/1 0%     -                  0/3 0%        0/2 0%
PvT_1gatedtexpo   3/7 43%     1/2 50%   -         -          0/1 0%     -                  2/4 50%       -
PvT_1gatereaver   0/5 0%      -         0/1 0%    -          0/2 0%     -                  0/2 0%        -
PvT_28nexus       5/11 45%    0/2 0%    0/1 0%    0/2 0%     1/1 100%   -                  4/5 80%       -
PvT_2gatedt       3/9 33%     0/1 0%    -         1/1 100%   0/2 0%     -                  0/3 0%        2/2 100%
PvT_2gaterngexpo  2/7 29%     -         0/1 0%    -          1/1 100%   1/1 100%           0/4 0%        -
PvT_32nexus       2/8 25%     -         -         -          1/4 25%    1/1 100%           0/2 0%        0/1 0%
PvT_9/9gate       14/18 78%   -         2/3 67%   -          4/4 100%   1/1 100%           7/9 78%       0/1 0%
PvT_9/9proxygate  8/14 57%    1/1 100%  1/1 100%  3/3 100%   0/2 0%     -                  2/6 33%       1/1 100%
PvT_bulldog       0/6 0%      0/1 0%    -         -          0/3 0%     -                  0/1 0%        0/1 0%
PvT_dtdrop        2/8 25%     -         1/1 100%  -          0/4 0%     -                  1/2 50%       0/1 0%
PvT_proxydt       10/14 71%   1/1 100%  -         1/1 100%   3/3 100%   -                  2/5 40%       3/4 75%
PvT_stove         0/6 0%      0/1 0%    0/1 0%    -          0/2 0%     -                  0/2 0%        -

Not one table cell has more than 9 games in it. Neither bot successfully predicted what the other would play, if it even tried: BananaBrain is unpredictable and Dragon changes its choice frequently when losing, and besides BananaBrain is poor at recognizing terran plans. So the strategy x strategy cross is a hash. To me the table means that, at least for this pairing, reactions during the game were more important than the initial choice of strategy. Neither side had a way to choose a counter beforehand.
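
For readers who want to build a strategy x strategy cross of their own, here is a rough sketch. It assumes each game is a hypothetical `(row_strategy, column_strategy, row_player_won)` triple; the record shape and function names are illustrative, not taken from any bot's files.

```python
from collections import defaultdict

def cross_table(games):
    """games: iterable of (row_strategy, col_strategy, row_won) triples."""
    cells = defaultdict(lambda: [0, 0])  # (row, col) -> [wins, games]
    for row, col, won in games:
        cells[(row, col)][0] += int(won)
        cells[(row, col)][1] += 1
    return dict(cells)

def cell_text(wins, games):
    """Format a cell the way the tables do, e.g. '2/3 67%'."""
    return f"{wins}/{games} {round(100 * wins / games)}%"

def largest_cell(cells):
    """Game count of the fullest cell, a quick check on how thin the data is."""
    return max(games for _, games in cells.values())

# Made-up sample, not tournament data.
games = [
    ("PvT_9/9gate", "mass vulture", True),
    ("PvT_9/9gate", "mass vulture", True),
    ("PvT_9/9gate", "mass vulture", False),
    ("PvT_stove", "bio", False),
]
cells = cross_table(games)
print(cell_text(*cells[("PvT_9/9gate", "mass vulture")]))
print(largest_cell(cells))
```

When the largest cell holds only a handful of games, as here, per-cell win rates are too noisy to read much into.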

bananabrain as seen by dragon

Dragon does not record a recognized opponent strategy. Its history files have only its own strategy and whether it won.

dragon as seen by bananabrain

dragon played      #   bananabrain recognized
1rax fe            14  13 T_unknown | 1 T_fastexpand
2rax bio           11  8 T_unknown | 2 T_fastexpand | 1 T_1fac
2rax mech          15  14 T_unknown | 1 T_1fac
bio                37  35 T_unknown | 1 T_1fac | 1 T_fastexpand
dirty worker rush  3   3 T_unknown
mass vulture       56  30 T_1fac | 26 T_unknown
siege expand       14  9 T_unknown | 5 T_1fac

We knew that BananaBrain struggles to recognize terran strategies. Maybe the author has not spent effort on it because it doesn’t affect results much? In any case, given how Dragon plays, with its love of fast expansions and mixed tech, the terran builds that are recognized probably represent truths about the games. It’s not clear that they are helpful truths, though, because they say so little about what happened.

From the coloring, it looks as though there was little relationship between whether BananaBrain recognized Dragon’s build and whether BananaBrain won. That is consistent with the theory that the author decided it didn’t matter.