Starcraft AI blog | Entries from November 2020

Steamhammer on BWAPI 4.4.0

I finally have Steamhammer working locally with BWAPI 4.4.0. I didn’t have much trouble with it this time, on my second try, though I worked slowly.

I want to make a few minor code changes and run tests. I expect it won’t take long. If everything continues to look good, I’ll release Steamhammer 3.3 before long. Its play will be little changed.

Then it will be time to prepare for the annual SSCAIT tournament. (I write it that way even though the T already stands for Tournament. It seems clearer than “SSCAI tournament”.) The current version is essentially the AIIDE version, and the updates seem to have been worth 50 to 100 elo points, a substantial jump. But I did introduce a serious new bug: Steamhammer now often builds hatcheries beside the base it wants to take next, rather than at the base—sometimes several hatcheries. Then, since it hasn’t taken the base, it doesn’t mine there. It’s a terrible weakness, and fixing it has to be the first step.

how to beat Monster

Monster’s elo is settling close to 2900, making it the top zerg with a margin. It has been updated at least once. One of the changes was to place outward-facing sunkens in a neat line, rather than a somewhat ragged formation, a clear improvement. I expect some authors will be thinking about how to beat Monster in the upcoming annual SSCAIT tournament, so here are a few thoughts. Though Monster’s author could read this too and make changes....

Monster does not want its sunkens to be broken, and often builds too many too early near the start of the game. That delays its economic growth. You can take advantage by delaying your own expansion a little to look threatening, and perhaps by killing Monster’s scout drone quickly. If Monster makes 2 or 3 sunkens in response to only a few units of yours, you are ahead. Don’t try for a bust at any early timing (even Stardust has to build up forces first). If you do make a lot of units early, you can try to contain Monster, to prevent it from expanding until you can get ahead, as in PurpleWave-Monster on La Mancha (an exciting game; the plan barely worked).

If you can run by the sunkens in the natural and get into Monster’s main, do it. Sometimes Monster walls tightly enough that it’s impossible, though. If you get in, Monster will run drones from its main to its natural, so you won’t kill many, but that is OK. There’s no need to destroy anything to get ahead. Stay alive in the main as long as possible to prevent drones from returning. Monster will be mining only one geyser and fewer than half as many mineral patches, and its economy will fall behind with every second. If mutalisks come out, they will come on half as strong at first, and they will need to clear Monster’s main before they head across the map to attack, so you have more time to prepare.

Against protoss, Monster most often plays with hydralisks, but may switch to mutalisks if you skimp too much on air defense. Corsairs are good for air defense, but I have yet to see a protoss bot which understands how to get the most value out of them. Monster is skilled with scourge, so you have to be careful with corsairs. You can see the general idea in pro games: As long as zerg has no spire, corsairs can scout and can clear overlords away. When scourge appears, protoss makes a cannon at the main nexus (initially only one, maybe more later) for short-term air defense and parks corsairs there until there are enough corsairs that they can shoot down scourge before it strikes. That’s about 5 corsairs. +1 air attack helps. Keep the corsairs in a tight group and they can go out on the map again, vulnerable only to determined scourge attacks with spread scourge, and even then the corsairs can shoot down some of it, raising zerg’s gas cost. If zerg has a lot of scourge, put dragoons or archons in your army and keep corsairs nearby; when in danger, fly over your anti-air units.

Against protoss and zerg, Monster opens with safe overpool builds that are not easy to exploit. Against terran, Monster plays a greedy three-hatchery build that leaves openings. Terran can try a cheese rush, or can take advantage of the slow zerg start to go for heavy macro itself. Hao Pan and Krasi0 have both tried proxy barracks, with partial success. Monster does have cheese reactions and defends itself fairly well if it sees the rush coming, though even then it may fall behind and lose. See Hao Pan-Monster, bunker rush not scouted and Hao Pan-Monster, bunker rush scouted (the scout overlord saw the first marine). In the second game Monster skipped its natural hatchery (it will also cancel a natural already started), made the hatch in its main instead, and got a sunken for defense. If Monster breaks the bunker early (which it often does) it typically wins; if it stays contained too long it will lose.

If you as terran go the macro route, you need good timing. Monster plays spire with +1 (usually attack but I have seen carapace too; is that due to an update?), then a second spire and double upgrades from then on. Terran has longer ranged units and low mobility, so terran usually wants to build up before moving out, especially if playing mech. But ground units do not stack and air units do, so if you wait too long then the mutalisks will defeat goliaths—all mutas can fire at once to pick off an edge goliath while not all of its comrades can get into range. Here is Krasi0 fast expanding to outmacro Monster. I think that this one is more instructive, though: XiaoYi-Monster on Empire of the Sun shows good terran unit mix and timing to beat the mutas. Ecgberht has also defeated Monster with the same kind of plan using bio units.

As terran, you also need good defense to survive until it is time to move out. Bots usually space turrets apart to cover ground. Pros don’t position turrets that way against mutalisks: They build turrets in tight groups so that a muta flock killing one turret takes as much damage as possible from others. The turret groups are located to cover vital areas: The mineral lines and production buildings. All buildings need to be placed compactly so that they are easy to defend, and they have to leave movement corridors so that mobile anti-air units can move quickly to counter the mutas. It’s not easy. Defense against strong muta harass needs skills.

Valkyries are potentially a sound defense against the mutas, but bots as far as I have seen do not use valkyries well. The valks act independently rather than together and move carelessly away from air defense, and Monster’s aggressive scourge shoots down too many; I haven’t seen an exception yet. Also, apparently terran bots select a single mutalisk as their valk target (this is my interpretation from watching replays), and Monster instantly separates the targeted muta from the flock so that only it takes damage. I suggest keeping valks in range of anti-air (whether static or mobile) for safety from scourge and trying Valkyrie patrol micro to in effect attack a location rather than a unit while staying safe. If that works, it should force the whole flock to scatter, rather than only forcing a single muta to split off. But I haven’t seen it tried, so I’m not sure it will work.

representing the strategic situation

It’s possible to reason about Starcraft games straight from the details, without explicit abstractions. One of the promises of deep learning is that it allows you to get away with that, to feed concrete facts into a black box and get back concrete actions (so that all the abstraction is hidden inside the black box), and deep learning is not the only way. I believe that for a complex game, it’s probably better to introduce abstractions and do much of your reasoning at an abstract level. For Steamhammer, I selected four levels of abstraction, which I call strategy, operations, tactics, and micro (the breakdown is not perfectly cleanly implemented right now, but I keep moving in that direction). This post is about strategy and how to represent it as a data structure.

The three classic elements of a Starcraft strategic situation are economy, technology, and army. I would like to add a fourth element (less emphasized in abstract theory but never overlooked in practical discussions), production facilities. These four elements are the things that you can spend resources to buy, and that you have to trade off as you make each spending decision. (Do I spend this on another dragoon, or a second gateway so I can later make dragoons faster, or do I get dragoon range?) I’ll call them the elements of strategy—that simplifies the meaning of the word “strategy,” but let’s go with it. If you can fully represent the four elements, then you can fully represent the strategic situation of a game.

You have only partial information about what the opponent is doing, but you know all about your own state. You may not want to store both using the same datatype, though you could. It’s elegant if both have the same underlying model, at least. You might store exact values for your own strategic situation and probability distributions over the same values for the enemy’s situation, for example. It’s ideal to use all available information to estimate the enemy values: Scouting info, feasibility calculations (“at this frame it’s not possible to have more than n workers”), past behavior saved in learning files, hand-coded constraints from reading the enemy bot’s code....
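
A minimal Python sketch of the shared model described above: the same quantity (here, a worker count) is an exact number for yourself and a discrete probability distribution for the enemy, narrowed by a scouting lower bound and a feasibility upper bound. The names `Distribution`, `constrain`, and `expected`, and all the numbers in the prior, are invented for illustration; this is not Steamhammer code.

```python
from typing import Dict

# Maps a possible value (e.g. an enemy worker count) to its probability.
Distribution = Dict[int, float]


def constrain(dist: Distribution, lo: int, hi: int) -> Distribution:
    """Keep only values in [lo, hi] and renormalize.

    lo: the most you have actually scouted (the enemy has at least this many).
    hi: a feasibility bound supplied by the caller.
    """
    if lo > hi:
        raise ValueError("lower bound exceeds upper bound")
    kept = {v: p for v, p in dist.items() if lo <= v <= hi and p > 0}
    total = sum(kept.values())
    if total == 0:
        # The prior put no weight on the feasible range: fall back to uniform.
        n = hi - lo + 1
        return {v: 1.0 / n for v in range(lo, hi + 1)}
    return {v: p / total for v, p in kept.items()}


def expected(dist: Distribution) -> float:
    """Mean of the distribution, a convenient point estimate."""
    return sum(v * p for v, p in dist.items())


# Your own state is exact; the enemy's is a distribution over the same value.
my_workers = 14
enemy_prior = {8: 0.1, 10: 0.2, 12: 0.4, 14: 0.2, 16: 0.1}  # illustrative prior
enemy_workers = constrain(enemy_prior, lo=10, hi=14)
```

Learning files and hand-coded constraints would enter the same way, as a better prior or as tighter bounds.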

If you know how the strategy elements varied throughout a game, then you know a lot about what happened in the game. You don’t know the maneuvers, but you know who won and you know the ebb and flow of the battles.

economy

I think there are 3 things you’d like to know about a player’s Starcraft economy. 1. Total minerals and gas mined so far. 2. Current rate of mining minerals and gas. 3. Predictions of the future rate of mining minerals and gas, perhaps under varying assumptions so you can choose a course of action. For example, if the map is running low on minerals after a long game, it’s likely wasteful to make more workers.

For 1 and 2, it’s easy to keep track of your own current state. For the enemy, there are occasions when you can get exact values for 1 and 2, but usually you’ll have to estimate. And 3 is always an estimation problem. Data useful for making the estimates includes number of workers, number of mineral patches, and so on. For the enemy, you have to estimate those too. (If you don’t have an estimate of the number of enemy workers then you won’t be able to decide on the priority of killing a worker versus another unit, a question that falls under 3.)
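
To make item 3 concrete, here is a toy Python sketch of a prediction under a what-if assumption: how much does the mineral rate change if one more worker is added? The saturation model and the constants `per_worker` and `workers_per_patch` are placeholders invented for illustration, not measured Brood War values.

```python
def mineral_rate(workers: int, patches: int,
                 per_worker: float = 1.0,
                 workers_per_patch: float = 2.0) -> float:
    """Toy model: income grows linearly with workers until the patches saturate."""
    useful_workers = min(workers, patches * workers_per_patch)
    return useful_workers * per_worker


def gain_from_one_more_worker(workers: int, patches: int) -> float:
    """Predicted change in mineral rate from adding a single worker."""
    return mineral_rate(workers + 1, patches) - mineral_rate(workers, patches)
```

With few patches left, the predicted gain drops to zero, which is the "wasteful to make more workers" case.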

technology

A command center is a tech building, because it allows you to build SCVs. (It’s also a production building.) A tech state is simply a set of tech buildings, research, and upgrades: These things are the tech that I have. I can make (say) SCVs and marines and turrets because I have a command center and a barracks and an ebay. I’ll call the set that represents a given tech state its tech set.

The set of tech states is partially ordered by the relation of teching: If you can tech from A (I can make marines) to B (I can make marines and medics and firebats because I added an academy), then you can say B > A. The partial order gives you a bit of mathematical leverage to reason about tech states (I see a medic, the enemy must have an academy—Steamhammer does this reasoning). The enemy can destroy buildings, though. Destroy the barracks and you’re in a state C with academy but no barracks, which cannot be reached by teching from the start state. C is neither greater than nor less than A, but B > C because you can tech from C back to B by rebuilding the barracks.
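
The partial order can be sketched in Python with plain sets. This assumes teching only ever adds items, so that "can tech from A to B" is the same as "A is a subset of B"; it ignores cases such as zerg morphs where one building replaces another. The names `UNIT_REQUIRES`, `compare`, `can_make`, and `implied_tech` are hypothetical, and the prerequisite table is only a small illustrative subset.

```python
from typing import FrozenSet, Iterable, Optional

TechSet = FrozenSet[str]

# What each unit needs before it can be made (a small illustrative subset).
UNIT_REQUIRES = {
    "SCV": frozenset({"Command Center"}),
    "Marine": frozenset({"Barracks"}),
    "Medic": frozenset({"Barracks", "Academy"}),
    "Firebat": frozenset({"Barracks", "Academy"}),
}


def compare(a: TechSet, b: TechSet) -> Optional[str]:
    """Return '>', '<' or '=' if a and b are ordered, None if incomparable."""
    if a == b:
        return "="
    if a > b:  # proper superset: a can be reached from b by teching
        return ">"
    if a < b:
        return "<"
    return None


def can_make(tech: TechSet, unit: str) -> bool:
    """True if the tech set contains everything the unit requires."""
    return UNIT_REQUIRES[unit] <= tech


def implied_tech(seen_units: Iterable[str]) -> TechSet:
    """Tech the enemy must have (or have had) to make the units you saw."""
    out = set()
    for unit in seen_units:
        out |= UNIT_REQUIRES.get(unit, frozenset())
    return frozenset(out)


A = frozenset({"Command Center", "Barracks"})
B = frozenset({"Command Center", "Barracks", "Academy"})
C = frozenset({"Command Center", "Academy"})  # the barracks was destroyed
```

Here `compare(B, A)` and `compare(B, C)` both give `'>'`, while `compare(C, A)` gives `None`, matching the example in the text.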

production

The production buildings are those that make units. The terran production buildings are command center, barracks, factory, starport. The more copies you have of a production building, the faster you can make units—provided you have the tech and the economy for them. There are complications for every race. Terran can spend resources to repair. Protoss reavers and carriers rely on scarabs and interceptors, which cost minerals. The only zerg production building is the hatchery (and lair and hive), but hydras may be able to morph into lurkers and mutas may be able to morph into guardians or devourers. These are all spending decisions, so they count under my definition of strategy.

You can represent your terran production state as a count of each of CCs, barracks, factories, starports. For protoss, you might want to include reavers and carriers as production facilities, or you might ignore them as irrelevant for the strategic reasoning you care about. For zerg, you may want to count hydras as production facilities if you have lurker tech and mutas if you have a greater spire, and if you’re reasoning about details you should count larvas too.

army

Your army state is simply your count of units of each type. If you care, you might also keep health information. The hard thing about armies is not knowing what they are, it is comparing their strengths and knowing what they can do.

The four strategy elements are all interrelated. Production facilities you see can help you predict the enemy’s unit mix, and enemy units you see can help you predict their production state. Economy and production tell you how fast an army can grow. You get the idea.
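
Put together, the four elements could be held in one plain record. This is only a sketch of the idea in Python; the class and field names are assumptions made for illustration, not Steamhammer's actual data structures.

```python
from collections import Counter
from dataclasses import dataclass, field


@dataclass
class Economy:
    minerals_mined: int = 0      # total gathered so far
    gas_mined: int = 0
    mineral_rate: float = 0.0    # current income per unit of time
    gas_rate: float = 0.0


@dataclass
class StrategicState:
    economy: Economy = field(default_factory=Economy)
    tech: frozenset = frozenset()                         # the tech set
    production: Counter = field(default_factory=Counter)  # building type -> count
    army: Counter = field(default_factory=Counter)        # unit type -> count


# Example: a terran player with two barracks and a handful of marines.
me = StrategicState(tech=frozenset({"Command Center", "Barracks"}))
me.production["Barracks"] += 2
me.army["Marine"] += 6
```

A `Counter` returns 0 for types that have never been added, which suits "count of units of each type". For the enemy, the same fields would hold estimates rather than exact values.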

new bot Monster

New zerg bot Monster has been going Godzilla on the opposition. As I write, it has 58 wins and 4 losses on BASIL (since it is unranked, it is facing opponents of all levels). Its wins include tough enemies like Iron (on Circuit Breaker). Its record on SSCAIT is “only” 31-8, as it is being voted tougher opponents on average. The losses on BASIL are to PurpleWave, Krasi0 (twice), and the expert zealot rusher Wuli.

I have yet to see Monster vary its early game build orders, though perhaps it simply hasn’t lost enough games to feel the need to. Versus terran, Monster likes three hatch mutalisk, making only one pair of zerglings at first. Krasi0 earned its 2 wins with proxy rax, beating the greedy build with fast aggression. Against protoss, Monster likes overpool followed by 11 hatchery, a standard build. It gets a hydra den early but does not always make use of it. Here is a win over Locutus where it does make hydras. Versus zerg, Monster likes overpool 9 gas, also one of Steamhammer’s favorite starts and difficult for bots to counter.

Monster is a complex bot with many skills. It appears to adapt its army size, unit mix, and static defense to the game situation. It has nice micro with zerglings, hydralisks, mutalisks, and scourge. (Though I judge McRaveZ’s muta micro is better. See this loss vs McRaveZ from SSCAIT; Monster won the rematch.) It can make queens with broodling, though I haven’t seen it make defilers. It knows how to position a sunken and block its ramp with zerglings to stop a vulture runby cold; see the Iron game above. Like ZZZKBot, when scouting for the location of a zerg opponent, it knows to discount a base when it does not see the creep; it does not have to scout farther to see that there is no hatchery. (See this win over Microwave for an example; watch how early the overlord turns away from its first scouting destination. Also notice Monster’s zergling formations.)

Monster still has a lot of headroom. I immediately saw inefficiencies in its build orders and weaknesses in its play. Its results say that its strengths are bigger than its weaknesses, though. I imagine it must have been thoroughly tested against a range of opponents to gain so many skills with such small loopholes.

Monster gg’s early when losing. I haven’t seen another bot surrender as quickly. There is an advantage to giving up early in testing: You can get more games in, iterate faster, and end up with a stronger bot. Of course the advantage doesn’t show in serious games, but if the gg is accurate then it doesn’t hurt.

Peering into the binary, I am impressed with Monster’s scope. The file I downloaded from SSCAIT is a 2.8MB .exe, pointing to a complex project that must have taken a long time to develop. It uses BWEB. I see a JSON parsing library and signs of a config file that is not included in the SSCAIT download. I see strings suggesting many skills that I have not yet noticed in games, such as scarab dodging.

AIIDE 2020 - various versus DaQin

I added parsing for DaQin’s files, which was little effort. I decided to dump all of DaQin’s analysis into a single post, because the tables aren’t that rich in information. Now I’m able to move on to other topics. I put the opponents on the left, so that in all cases, blue is good for the opponent and red is good for DaQin.

bananabrain strategies versus daqin strategies

	overall	2GateDT	3GateDT	4GateGoon
overall	99/150 66%	9/14 64%	53/89 60%	37/47 79%
PvP_10/12gate	7/10 70%	-	2/5 40%	5/5 100%
PvP_12nexus	4/7 57%	1/1 100%	2/4 50%	1/2 50%
PvP_2gatedt	14/16 88%	1/1 100%	7/9 78%	6/6 100%
PvP_2gatedtexpo	10/14 71%	1/2 50%	5/7 71%	4/5 80%
PvP_2gatereaver	13/16 81%	1/1 100%	5/7 71%	7/8 88%
PvP_3gaterobo	8/13 62%	2/2 100%	5/7 71%	1/4 25%
PvP_3gatespeedzeal	2/7 29%	0/1 0%	1/5 20%	1/1 100%
PvP_4gategoon	4/8 50%	1/1 100%	2/6 33%	1/1 100%
PvP_9/9gate	15/16 94%	-	11/11 100%	4/5 80%
PvP_9/9proxygate	6/10 60%	1/1 100%	2/6 33%	3/3 100%
PvP_nzcore	7/11 64%	1/1 100%	4/8 50%	2/2 100%
PvP_zcore	3/7 43%	0/2 0%	3/5 60%	-
PvP_zcorez	2/7 29%	-	2/4 50%	0/3 0%
PvP_zzcore	4/8 50%	0/1 0%	2/5 40%	2/2 100%

Reading DaQin’s openings out of its configuration file, I see that 2GateDT makes 2 dark templar out of the promised 2 gateways, adds 3 cannons in front of its natural, then expands. As a PvP build, that strikes me as illogical (you might want one cannon if the enemy also has dark templar). 3GateDT makes one gate, gets dragoons and dragoon range, adds a second gate and a citadel, and then the predefined build order ends—the rest is left to the strategy manager. That seems sensible as far as it goes, but does the strategy manager regularly add a third gate and make DTs as promised, or is the name of the opening a lie? See below for BananaBrain’s opinion on the question. In any case, 3GateDT is the opening that gave BananaBrain the most trouble.

bananabrain as seen by daqin

bananabrain played	#	daqin recognized
PvP_10/12gate	10	10 Fast rush
PvP_12nexus	7	5 Fast rush | 1 Safe expand | 1 Naked expand
PvP_2gatedt	16	16 Fast rush
PvP_2gatedtexpo	14	13 DarkTemplar rush | 1 Unknown
PvP_2gatereaver	16	16 DarkTemplar rush
PvP_3gaterobo	13	13 DarkTemplar rush
PvP_3gatespeedzeal	7	6 Fast rush | 1 Unknown
PvP_4gategoon	8	5 DarkTemplar rush | 1 Naked expand | 1 Unknown | 1 Fast rush
PvP_9/9gate	16	16 Fast rush
PvP_9/9proxygate	10	9 Fast rush | 1 Proxy
PvP_nzcore	11	8 DarkTemplar rush | 1 Not fast rush | 1 Naked expand | 1 Unknown
PvP_zcore	7	7 DarkTemplar rush
PvP_zcorez	7	5 DarkTemplar rush | 2 Not fast rush
PvP_zzcore	8	5 DarkTemplar rush | 2 Proxy | 1 Not fast rush

DaQin recognizes 9-9 gate as Fast rush, but also the economy-first 10-12 gate and even the fast expand 12 nexus. What BananaBrain calls a reaver build, DaQin sees as a dark templar rush. Strategy recognition has some odd results.

daqin as seen by bananabrain

daqin played	#	bananabrain recognized
2GateDT	14	12 P_1gatecore | 2 P_unknown
3GateDT	89	45 P_1gatecore | 32 P_4gategoon | 11 P_unknown | 1 P_ffe
4GateGoon	47	36 P_4gategoon | 9 P_1gatecore | 2 P_unknown

This suggests that DaQin’s 3GateDT was often not a dark templar build at all.

mcrave strategies versus daqin strategies

	overall	ForgeExpand5GateGoon	ForgeExpandSpeedlots
overall	97/150 65%	3/3 100%	94/147 64%
PoolHatch,Overpool,2HatchMuta	97/150 65%	3/3 100%	94/147 64%

Not a lot of strategic variety here.

mcrave as seen by daqin

mcrave played	#	daqin recognized
PoolHatch,Overpool,2HatchMuta	150	117 Not fast rush | 28 Heavy rush | 5 Unknown

daqin as seen by mcrave

daqin played	#	mcrave recognized
ForgeExpand5GateGoon	3	3 FFE,Forge,5GateGoon
ForgeExpandSpeedlots	147	121 FFE,Forge,Speedlot | 21 FFE,Nexus,Speedlot | 2 FFE,Forge,5GateGoon | 2 FFE,Forge,ZealotArchon | 1 FFE,Gateway,Speedlot

microwave strategies versus daqin strategies

	overall	4GateGoon	ForgeExpand5GateGoon	ForgeExpandSpeedlots
overall	125/150 83%	3/11 27%	3/3 100%	119/136 88%
1HatchMuta_Sparkle	56/62 90%	0/1 0%	-	56/61 92%
3HatchLingBust	11/17 65%	2/4 50%	1/1 100%	8/12 67%
3HatchMuta	53/59 90%	0/2 0%	2/2 100%	51/55 93%
3HatchMutaExpo	5/9 56%	1/4 25%	-	4/5 80%
3HatchPoolHydraExpo	0/1 0%	-	-	0/1 0%
9Pool	0/1 0%	-	-	0/1 0%
OverpoolLurker	0/1 0%	-	-	0/1 0%

Why did DaQin play its most successful opening by far, 4GateGoon, only 11 times in 150 games? It is not that it discovered the opening late; it played it first in game 10 of 150, and won that game. It immediately played it again and lost, but soon played it a third time and won again. It surely wasn’t confused by too many choices. Either there was a bug, or some built-in bias in DaQin’s decisions led it astray.

microwave as seen by daqin

microwave played	#	daqin recognized
1HatchMuta_Sparkle	62	34 Not fast rush | 19 Heavy rush | 7 Unknown | 2 Proxy
3HatchLingBust	17	12 Not fast rush | 4 Heavy rush | 1 Proxy
3HatchMuta	59	48 Not fast rush | 7 Heavy rush | 4 Proxy
3HatchMutaExpo	9	8 Not fast rush | 1 Proxy
3HatchPoolHydraExpo	1	1 Not fast rush
9Pool	1	1 Fast rush
OverpoolLurker	1	1 Unknown

daqin as seen by microwave

daqin played	#	microwave recognized
4GateGoon	11	8 Unknown | 2 HeavyRush | 1 Proxy
ForgeExpand5GateGoon	3	1 SafeExpand | 1 Turtle | 1 NakedExpand
ForgeExpandSpeedlots	136	68 Turtle | 22 HeavyRush | 20 SafeExpand | 16 NakedExpand | 10 Unknown

steamhammer strategies versus daqin strategies

	overall	ForgeExpand5GateGoon	ForgeExpandSpeedlots
overall	33/150 22%	29/136 21%	4/14 29%
10HatchLing	0/1 0%	0/1 0%	-
11Gas10PoolLurker	0/1 0%	0/1 0%	-
12-12Hatch	0/1 0%	0/1 0%	-
12Hatch_4HatchLing	0/2 0%	0/2 0%	-
2.5HatchMuta	0/1 0%	0/1 0%	-
2HatchHydraBust	0/2 0%	0/1 0%	0/1 0%
3HatchHydra	0/2 0%	0/2 0%	-
3HatchHydraBust	0/3 0%	0/2 0%	0/1 0%
3HatchHydraExpo	0/1 0%	0/1 0%	-
3HatchLateHydras+1	0/1 0%	0/1 0%	-
3HatchLing	26/59 44%	24/52 46%	2/7 29%
3HatchLingBust2	0/2 0%	0/2 0%	-
4HatchBeforeGas	5/25 20%	3/23 13%	2/2 100%
4HatchBeforeLair	0/1 0%	0/1 0%	-
5HatchBeforeGas	0/2 0%	-	0/2 0%
5HatchPool	0/1 0%	0/1 0%	-
5PoolHard2Player	0/1 0%	0/1 0%	-
5Scout	0/1 0%	0/1 0%	-
973HydraBust	0/4 0%	0/3 0%	0/1 0%
9Pool8GasLurker	0/1 0%	0/1 0%	-
9PoolHatchSpeed	0/1 0%	0/1 0%	-
9PoolHatchSpeedSpire2	0/1 0%	0/1 0%	-
9PoolHatchSpire	0/1 0%	0/1 0%	-
9PoolSpireSlowlings	0/1 0%	0/1 0%	-
9PoolSunkHatch	0/1 0%	0/1 0%	-
AntiFact_2Hatch	0/1 0%	0/1 0%	-
AntiFact_Overpool9Gas	0/1 0%	0/1 0%	-
AntiFactory2	0/1 0%	0/1 0%	-
Over10Hatch1Sunk	0/1 0%	0/1 0%	-
OverhatchExpoMuta	0/3 0%	0/3 0%	-
OverpoolSpeed	0/1 0%	0/1 0%	-
OverpoolTurtle 0	0/2 0%	0/2 0%	-
Proxy8HatchNatural	0/1 0%	0/1 0%	-
Sparkle 3HatchMuta	1/6 17%	1/6 17%	-
ZvP_2HatchMuta	0/1 0%	0/1 0%	-
ZvP_3BaseSpire+Den	0/1 0%	0/1 0%	-
ZvP_3HatchPoolHydra	1/7 14%	1/7 14%	-
ZvT_2HatchMuta	0/1 0%	0/1 0%	-
ZvT_3HatchMuta	0/1 0%	0/1 0%	-
ZvT_7Pool	0/1 0%	0/1 0%	-
ZvZ_12PoolLing	0/1 0%	0/1 0%	-
ZvZ_12PoolLingB	0/2 0%	0/2 0%	-
ZvZ_Overpool11Gas	0/1 0%	0/1 0%	-

steamhammer as seen by daqin

steamhammer played	#	daqin recognized
10HatchLing	1	1 Unknown
11Gas10PoolLurker	1	1 Heavy rush
12-12Hatch	1	1 Not fast rush
12Hatch_4HatchLing	2	2 Heavy rush
2.5HatchMuta	1	1 Not fast rush
2HatchHydraBust	2	1 Hydra bust | 1 Not fast rush
3HatchHydra	2	2 Not fast rush
3HatchHydraBust	3	2 Not fast rush | 1 Heavy rush
3HatchHydraExpo	1	1 Not fast rush
3HatchLateHydras+1	1	1 Not fast rush
3HatchLing	59	40 Not fast rush | 16 Heavy rush | 3 Unknown
3HatchLingBust2	2	1 Not fast rush | 1 Unknown
4HatchBeforeGas	25	24 Not fast rush | 1 Unknown
4HatchBeforeLair	1	1 Not fast rush
5HatchBeforeGas	2	2 Not fast rush
5HatchPool	1	1 Not fast rush
5PoolHard2Player	1	1 Fast rush
5Scout	1	1 Not fast rush
973HydraBust	4	4 Not fast rush
9Pool8GasLurker	1	1 Heavy rush
9PoolHatchSpeed	1	1 Heavy rush
9PoolHatchSpeedSpire2	1	1 Fast rush
9PoolHatchSpire	1	1 Heavy rush
9PoolSpireSlowlings	1	1 Heavy rush
9PoolSunkHatch	1	1 Fast rush
AntiFact_2Hatch	1	1 Not fast rush
AntiFact_Overpool9Gas	1	1 Not fast rush
AntiFactory2	1	1 Heavy rush
Over10Hatch1Sunk	1	1 Heavy rush
OverhatchExpoMuta	3	3 Not fast rush
OverpoolSpeed	1	1 Heavy rush
OverpoolTurtle 0	2	2 Heavy rush
Proxy8HatchNatural	1	1 Heavy rush
Sparkle 3HatchMuta	6	6 Not fast rush
ZvP_2HatchMuta	1	1 Not fast rush
ZvP_3BaseSpire+Den	1	1 Heavy rush
ZvP_3HatchPoolHydra	7	4 Not fast rush | 1 Heavy rush | 1 Hydra bust | 1 Unknown
ZvT_2HatchMuta	1	1 Not fast rush
ZvT_3HatchMuta	1	1 Not fast rush
ZvT_7Pool	1	1 Fast rush
ZvZ_12PoolLing	1	1 Not fast rush
ZvZ_12PoolLingB	2	2 Not fast rush
ZvZ_Overpool11Gas	1	1 Heavy rush

daqin as seen by steamhammer

daqin played	#	steamhammer recognized
ForgeExpand5GateGoon	136	79 Turtle | 41 Safe expand | 10 Heavy rush | 5 Naked expand | 1 Unknown
ForgeExpandSpeedlots	14	7 Safe expand | 6 Turtle | 1 Unknown

ecgberht strategies versus daqin strategies

	overall	12NexusCarriers	3GateDT
overall	1/150 1%	1/2 50%	0/148 0%
14CC	0/32 0%	-	0/32 0%
FullMech	0/29 0%	0/1 0%	0/28 0%
JoyORush	0/28 0%	-	0/28 0%
MechGreedyFE	0/28 0%	-	0/28 0%
ProxyEightRax	1/33 3%	1/1 100%	0/32 0%

ecgberht as seen by daqin

ecgberht played	#	daqin recognized
14CC	32	28 Safe expand | 2 Naked expand | 2 Unknown
FullMech	29	27 Factory | 2 Not fast rush
JoyORush	28	27 Factory | 1 Unknown
MechGreedyFE	28	12 Unknown | 9 Safe expand | 7 Not fast rush
ProxyEightRax	33	27 Fast rush | 5 Not fast rush | 1 Proxy

daqin as seen by ecgberht

daqin played	#	ecgberht recognized
12NexusCarriers	2	2 Unknown
3GateDT	148	148 Unknown

the final game of the ASL 10 finals

Yesterday’s ASL 10 finals was ZvZ with Zero (aka Queen) versus Soma in a best of 7. It’s worth seeing for the highly entertaining last game.

The match was in a tense situation. The game started... then hit technical difficulties, and after a long delay had to be restarted, ratcheting up the tension. Then the game itself was a spectacular knife fight with an exciting finish. Recommended.

AIIDE 2020 - Microwave versus Steamhammer

Microwave played more different openings than Steamhammer (no doubt seeking a winning choice), so I put it on the left. Blue is good for Microwave, red is good for Steamhammer.

microwave strategies versus steamhammer strategies

	overall	6PoolBurrow	8-8HydraRush	9Hatch8Pool	9PoolHatchSpeedSpire	OverhatchLing	OverpoolBurrow	ZvZ_12HatchExpo	ZvZ_12PoolLing	ZvZ_12PoolMain	ZvZ_Overpool11Gas	ZvZ_Overpool9Gas	ZvZ_OverpoolTurtle
overall	43/150 29%	1/1 100%	1/1 100%	4/5 80%	1/1 100%	1/1 100%	1/1 100%	3/5 60%	4/11 36%	2/3 67%	12/44 27%	7/64 11%	6/13 46%
10Hatch9Pool9gas	4/9 44%	-	-	0/1 0%	-	-	-	-	-	-	2/2 100%	1/5 20%	1/1 100%
10HatchMain9Pool9Gas	1/4 25%	-	-	-	-	-	-	-	-	-	0/1 0%	1/2 50%	0/1 0%
10HatchTurtleHydra	0/1 0%	-	-	-	-	-	-	-	-	-	-	0/1 0%	-
11HatchTurtleMuta	0/1 0%	-	-	-	-	-	-	-	-	-	-	-	0/1 0%
12HatchMain	0/1 0%	-	-	-	-	-	-	-	-	-	0/1 0%	-	-
12Pool	5/25 20%	-	-	-	-	1/1 100%	-	-	0/3 0%	2/2 100%	2/7 29%	0/10 0%	0/2 0%
12PoolMain	1/5 20%	-	-	1/1 100%	-	-	-	-	-	-	0/2 0%	0/2 0%	-
2HatchLurker	0/2 0%	-	-	-	-	-	-	-	-	-	0/1 0%	0/1 0%	-
3HatchHydraBust	0/1 0%	-	-	-	-	-	-	-	-	-	-	0/1 0%	-
3HatchHydraExpo	0/2 0%	-	-	-	-	-	-	-	-	-	0/1 0%	0/1 0%	-
3HatchPoolHydra	0/2 0%	-	-	-	-	-	-	-	0/1 0%	-	0/1 0%	-	-
4HatchPoolHydra	0/1 0%	-	-	-	-	-	-	-	-	-	-	0/1 0%	-
5Pool	0/4 0%	-	-	-	-	-	-	-	0/1 0%	-	0/1 0%	0/2 0%	-
5PoolSpeed	1/3 33%	-	1/1 100%	-	-	-	-	-	0/1 0%	-	0/1 0%	-	-
7Pool	0/1 0%	-	-	-	-	-	-	-	-	-	-	0/1 0%	-
7PoolHydraLingRush7D	0/1 0%	-	-	-	-	-	-	-	-	-	-	0/1 0%	-
9Hatch9Pool9Gas	0/1 0%	-	-	-	-	-	-	-	-	-	-	0/1 0%	-
9HatchTurtleHydra	0/1 0%	-	-	-	-	-	-	0/1 0%	-	-	-	-	-
9PoolGasHatchSpeed8D	0/1 0%	-	-	-	-	-	-	-	-	-	0/1 0%	-	-
9PoolHatch	0/2 0%	-	-	-	-	-	-	-	-	-	-	0/2 0%	-
9PoolSpeed	17/31 55%	1/1 100%	-	3/3 100%	1/1 100%	-	-	1/1 100%	1/1 100%	-	4/6 67%	4/13 31%	2/5 40%
9PoolSpeedLing	0/1 0%	-	-	-	-	-	-	0/1 0%	-	-	-	-	-
9PoolSunken	0/7 0%	-	-	-	-	-	-	-	-	0/1 0%	0/3 0%	0/3 0%	-
OverpoolSpeed	1/3 33%	-	-	-	-	-	1/1 100%	-	-	-	0/1 0%	0/1 0%	-
ZvP_11Hatch10Pool	2/4 50%	-	-	-	-	-	-	1/1 100%	-	-	0/1 0%	1/2 50%	-
ZvP_2HatchHydra	0/9 0%	-	-	-	-	-	-	-	-	-	0/4 0%	0/5 0%	-
ZvP_9Hatch9Pool	0/1 0%	-	-	-	-	-	-	-	-	-	0/1 0%	-	-
ZvZ_Overgas11Pool	10/20 50%	-	-	-	-	-	-	1/1 100%	3/4 75%	-	4/9 44%	0/4 0%	2/2 100%
ZvZ_Overpool11Gas	0/2 0%	-	-	-	-	-	-	-	-	-	-	0/2 0%	-
ZvZ_Overpool9Gas	1/4 25%	-	-	-	-	-	-	-	-	-	-	0/3 0%	1/1 100%

Steamhammer’s ZvZ_Overpool9Gas opening was successful against all Microwave tries, but notice that it was the only one: Flecks of blue, or entire streaks, crept into every other Steamhammer attempt. The end result does not look close, but in fact Microwave would have needed only a small increment of skill to turn it around; there was only one strategy it was unprepared to face.

microwave as seen by steamhammer

microwave played	#	steamhammer recognized
10Hatch9Pool9gas	9	5 Naked expand | 3 Heavy rush | 1 Unknown
10HatchMain9Pool9Gas	4	3 Unknown | 1 Turtle
10HatchTurtleHydra	1	1 Naked expand
11HatchTurtleMuta	1	1 Heavy rush
12HatchMain	1	1 Unknown
12Pool	25	17 Naked expand | 5 Heavy rush | 3 Unknown
12PoolMain	5	3 Heavy rush | 2 Unknown
2HatchLurker	2	2 Naked expand
3HatchHydraBust	1	1 Naked expand
3HatchHydraExpo	2	1 Naked expand | 1 Heavy rush
3HatchPoolHydra	2	1 Naked expand | 1 Heavy rush
4HatchPoolHydra	1	1 Heavy rush
5Pool	4	4 Fast rush
5PoolSpeed	3	3 Fast rush
7Pool	1	1 Fast rush
7PoolHydraLingRush7D	1	1 Unknown
9Hatch9Pool9Gas	1	1 Naked expand
9HatchTurtleHydra	1	1 Heavy rush
9PoolGasHatchSpeed8D	1	1 Heavy rush
9PoolHatch	2	1 Unknown | 1 Heavy rush
9PoolSpeed	31	21 Unknown | 7 Naked expand | 3 Heavy rush
9PoolSpeedLing	1	1 Naked expand
9PoolSunken	7	4 Unknown | 3 Heavy rush
OverpoolSpeed	3	2 Unknown | 1 Heavy rush
ZvP_11Hatch10Pool	4	3 Naked expand | 1 Heavy rush
ZvP_2HatchHydra	9	6 Heavy rush | 2 Turtle | 1 Naked expand
ZvP_9Hatch9Pool	1	1 Naked expand
ZvZ_Overgas11Pool	20	19 Unknown | 1 Turtle
ZvZ_Overpool11Gas	2	2 Unknown
ZvZ_Overpool9Gas	4	4 Unknown

To play ZvZ truly well, Steamhammer needs a more detailed understanding of enemy builds. But even with this crude breakdown, I notice that most of the blue spots are associated with misunderstanding the main idea of Microwave’s play. On the other hand, many misunderstandings also show as red.

steamhammer as seen by microwave

steamhammer played	#	microwave recognized
6PoolBurrow	1	1 FastRush
8-8HydraRush	1	1 Unknown
9Hatch8Pool	5	4 HeavyRush | 1 Unknown
9PoolHatchSpeedSpire	1	1 NakedExpand
OverhatchLing	1	1 HeavyRush
OverpoolBurrow	1	1 NakedExpand
ZvZ_12HatchExpo	5	5 NakedExpand
ZvZ_12PoolLing	11	8 HeavyRush | 2 Unknown | 1 NakedExpand
ZvZ_12PoolMain	3	3 HeavyRush
ZvZ_Overpool11Gas	44	36 Turtle | 5 Unknown | 3 NakedExpand
ZvZ_Overpool9Gas	64	51 Turtle | 8 Unknown | 5 NakedExpand
ZvZ_OverpoolTurtle	13	13 Turtle

The builds recognized as Turtle genuinely are turtle builds. They get mutalisks fast at the expense of weakness to zergling attack, which they compensate for with sunkens instead of a second hatchery. From the meta-strategy point of view, Steamhammer usually defeats Microwave in games where Steamhammer gains air superiority early, so Steamhammer’s choices make sense.

risky openings from data

I want to take a day off from analyzing AIIDE data to join in a conversation. From comments to Dragon vs Ecgberht:

Tully Elliston: Using learning to track win:loss %, and having a risk rating for each build (if win rate is above this level, don’t select this build unless it has won at least 1 game against this opponent) could actually be a very useful tool.

You can throw in lots of ridiculous polarised builds, and still ensure they won’t get accidentally selected when beating down a 4pool bot 100 times in a row.

MarcaDBAA: Yes, or give these builds, you don’t want to use at first, some default pseudo-losses, so that they will only be selected after other builds fail to win.

In BananaBrain versus Ecgberht I had mentioned that BananaBrain’s few losses to Ecgberht were due to unnecessarily playing a risky build. The commenters suggest two ways of marking builds as risky.
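Both suggestions are small enough to sketch. This is a minimal illustration, not code from any bot: the function names, the 50% prior for untried builds, and the choice of 3 pseudo-losses are all assumptions made for the example.

```python
def risk_gate_allows(rate_vs_opponent, risk_level, wins_with_build):
    """Risk rating idea: when we already beat this opponent more often than
    risk_level, allow the risky build only if it has a win on record."""
    return rate_vs_opponent <= risk_level or wins_with_build >= 1


def pseudo_loss_score(wins, games, pseudo_losses=0):
    """Pseudo-loss idea: a smoothed win rate in which risky builds start
    with some losses already counted against them."""
    # One prior win and one prior loss put an untried normal build at 50%.
    return (wins + 1) / (games + 2 + pseudo_losses)


def pick(records, pseudo):
    """records: {build: (wins, games)}; pseudo: {build: pseudo-loss count}."""
    return max(records,
               key=lambda b: pseudo_loss_score(*records[b], pseudo.get(b, 0)))


# Hypothetical build names and counts.
fresh = {"safe": (0, 0), "polarised": (0, 0)}
failing = {"safe": (0, 10), "polarised": (0, 0)}
handicap = {"polarised": 3}
print(pick(fresh, handicap), pick(failing, handicap))
```

With these numbers the safe build is picked first, and the polarised build only comes up once the safe build has lost repeatedly.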

It can be done automatically from data, in the same style as opening timing data. In fact, it’s on my to-do list. The first step is to keep track of how good each opening is on average: For each matchup, store each opening’s average win rate across all opponents. It can be done offline by adding up the numbers from the individual learning files of each opponent, or you could keep separate records. That already gives you an automatic way to select openings against opponents you have not yet learned about; there’s no more need for hand configuration.
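As a rough sketch of that first step, assuming the per-opponent learning files have already been read into a dictionary (the layout and all names here are hypothetical, and the numbers are made up):

```python
from collections import defaultdict


def pool_by_matchup(learning):
    """learning: {opponent: (enemy_race, {opening: (wins, games)})}.
    Returns {enemy_race: {opening: pooled win rate}}."""
    totals = defaultdict(lambda: defaultdict(lambda: [0, 0]))
    for enemy_race, openings in learning.values():
        for opening, (wins, games) in openings.items():
            totals[enemy_race][opening][0] += wins
            totals[enemy_race][opening][1] += games
    return {race: {o: w / g for o, (w, g) in per.items() if g > 0}
            for race, per in totals.items()}


def default_opening(rates, enemy_race):
    """Best pooled opening for a race, for use against an unknown opponent."""
    return max(rates[enemy_race], key=rates[enemy_race].get)


# Made-up records for two zerg opponents and one terran opponent.
learning = {
    "zerg_bot_a": ("Zerg", {"OverpoolTurtle": (6, 10), "12Pool": (2, 10)}),
    "zerg_bot_b": ("Zerg", {"OverpoolTurtle": (3, 10), "12Pool": (5, 10)}),
    "terran_bot": ("Terran", {"3HatchMuta": (7, 10)}),
}
rates = pool_by_matchup(learning)
print(default_opening(rates, "Zerg"))
```

Pooling wins and games weights each opponent by how often the opening was tried against it. Averaging the per-opponent rates instead would weight every opponent equally, which is a different but also defensible choice.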

The next step is to compare how well each opening does against strong opponents versus weak opponents. If it reaches its average by beating expectations against strong opponents and falling below against weak opponents, taking into account the opponent’s strength, then it is a risky opening. If the reverse, it is a solid opening and is to be preferred against weak opponents (and if you’re ranked high, also against unknown opponents). One natural way to determine riskiness is to fit a line to the dataset this-opening’s-wins versus opponent-strength as measured by your win rates. The slope of the line tells how risky or solid the opening is. (If you have a lot of data you could fit a more complicated curve. Just make sure it strongly smooths the data.)
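One way to turn the line fit into a number, sketched under assumptions: opponent strength is measured by your overall win rate against that opponent (high means weak), and the line is fitted to the opening's win rate minus that overall rate, so the fit measures how the opening departs from expectations. The sign convention and the unweighted fit are choices made for this example; with real data you would probably weight each point by its game count.

```python
def slope(xs, ys):
    """Ordinary least-squares slope of ys against xs."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    return sxy / sxx


def riskiness(points):
    """points: [(overall win rate vs opponent, this opening's win rate vs opponent)].
    Positive: beats expectations against strong opponents and falls short
    against weak ones (risky). Negative: the reverse (solid)."""
    xs = [overall for overall, _ in points]
    residuals = [opening - overall for overall, opening in points]
    return -slope(xs, residuals)


# Made-up data: three opponents, from strong (0.2) to weak (0.9).
risky_opening = [(0.2, 0.4), (0.5, 0.5), (0.9, 0.7)]
solid_opening = [(0.2, 0.1), (0.5, 0.5), (0.9, 1.0)]
print(riskiness(risky_opening) > 0, riskiness(solid_opening) < 0)
```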

The same goes for other data about openings. For example, you can track how well each opening does on each map, and at given starting positions, and against different opponent strategies that you recognize. All the data can fold into your opening selection, without any hand configuration.

BASIL was formerly an excellent forum to collect this kind of data. But now the BASIL pairings are strongly biased toward opponents close in elo, so it is no longer a good option. Look at the crosstable for the last 30 days and notice how the white cells are laid out; unless you rank right in the middle, you can’t get a full cross-section of opponents without a long run.

AIIDE 2020 - Steamhammer versus McRave

I added parsing for Steamhammer. DaQin is nearly the same. The only remaining bot which records data that can be analyzed this way is ZZZKBot, which has a difficult file format, does not keep a recognized enemy strategy, and doesn’t bother to write a newline at the end of its file. I may skip ZZZKBot.

The Steamhammer-McRave strategy crosstable is the most interesting one yet.

steamhammer strategies versus mcrave strategies

	overall	PoolHatch,12Pool,2HatchMuta	PoolHatch,12Pool,2HatchSpeedling	PoolLair,9Pool,1HatchMuta
overall	64/150 43%	17/33 52%	10/22 45%	37/95 39%
12PoolLurker	0/1 0%	-	-	0/1 0%
3HatchLingBurrow	1/5 20%	1/2 50%	-	0/3 0%
8DroneGas	7/11 64%	-	1/1 100%	6/10 60%
9HatchMain9Pool9Gas	0/2 0%	-	-	0/2 0%
9PoolHatchSpeedAllInB	0/1 0%	-	-	0/1 0%
9PoolSpire	0/2 0%	0/2 0%	-	-
Over10HatchBust	8/19 42%	7/7 100%	-	1/12 8%
Over10PoolLing	0/1 0%	-	-	0/1 0%
OverpoolSpeed	3/15 20%	1/5 20%	0/3 0%	2/7 29%
OverpoolSunk	8/21 38%	0/1 0%	2/8 25%	6/12 50%
OverpoolTurtle	11/23 48%	2/6 33%	1/1 100%	8/16 50%
ZvP_3HatchMuta	0/1 0%	-	-	0/1 0%
ZvZ_12HatchExpo	0/1 0%	-	0/1 0%	-
ZvZ_Overgas9Pool	0/2 0%	-	0/1 0%	0/1 0%
ZvZ_OverpoolTurtle	26/45 58%	6/10 60%	6/7 86%	14/28 50%

For Steamhammer, either 8DroneGas (a zergling build despite the name) or else ZvZ_OverpoolTurtle (a mutalisk build) may dominate among the openings tried, while McRave’s best was the 1 hatch muta play because no Steamhammer try was better than even against it. It’s possible that switching between different kinds of builds was important, though, because the table suggests that the other counters are likely imbalanced (without a game-theoretic saddle point).
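For readers who want the saddle point idea made concrete: a table of win rates has a pure-strategy saddle point when the best of the row minimums equals the worst of the column maximums. The sketch below checks that on toy matrices. The numbers are invented and are not taken from the table above, whose many empty and small-sample cells would need filling in or smoothing first.

```python
def pure_saddle(payoff):
    """payoff[i][j] is the row player's win rate for row strategy i
    against column strategy j. Returns (i, j) of a pure saddle point, or None."""
    row_mins = [min(row) for row in payoff]
    col_maxes = [max(col) for col in zip(*payoff)]
    maximin = max(row_mins)   # best guaranteed result for the row player
    minimax = min(col_maxes)  # best guaranteed result for the column player
    if maximin != minimax:
        return None
    return row_mins.index(maximin), col_maxes.index(minimax)


# Toy matrix with cyclic counters: no pure saddle point, so mixing matters.
cyclic = [[0.5, 0.8, 0.2],
          [0.2, 0.5, 0.8],
          [0.8, 0.2, 0.5]]
# Toy matrix where one pair of strategies is stable for both sides.
settled = [[0.6, 0.5],
           [0.4, 0.3]]
print(pure_saddle(cyclic), pure_saddle(settled))
```

When there is no saddle point, neither side has a single best strategy, and switching between builds is part of playing well.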

Both sides had trouble identifying the best strategies. If both had played their best strategies then the match would have come out close to 50%, while in fact Steamhammer came out behind, so Steamhammer had more trouble selecting from its excessive range of possibilities. I get the impression of a back-and-forth learning struggle.

steamhammer as seen by mcrave

steamhammer played	#	mcrave recognized
12PoolLurker	1	1 HatchPool,12Pool,1HatchMuta
3HatchLingBurrow	5	3 HatchPool,Unknown,2HatchLing \| 1 HatchPool,Unknown,Unknown \| 1 Unknown,Unknown,3HatchMuta
8DroneGas	11	6 HatchPool,9Pool,2HatchLing \| 2 PoolHatch,9Pool,2HatchLing \| 1 PoolHatch,Unknown,2HatchLing \| 1 HatchPool,Unknown,2HatchLing \| 1 PoolHatch,Unknown,Unknown
9HatchMain9Pool9Gas	2	1 PoolHatch,12Pool,2HatchLing \| 1 HatchPool,Unknown,2HatchLing
9PoolHatchSpeedAllInB	1	1 PoolHatch,9Pool,LingRush
9PoolSpire	2	2 Unknown,Unknown,Unknown
Over10HatchBust	19	7 HatchPool,12Pool,Unknown \| 4 HatchPool,12Pool,2HatchLing \| 3 Unknown,12Pool,Unknown \| 2 HatchPool,Unknown,2HatchLing \| 2 HatchPool,Unknown,Unknown \| 1 PoolHatch,12Pool,Unknown
Over10PoolLing	1	1 HatchPool,12Pool,Unknown
OverpoolSpeed	15	5 HatchPool,9Pool,LingRush \| 4 PoolHatch,12Pool,Unknown \| 3 Unknown,12Pool,Unknown \| 1 Unknown,9Pool,LingRush \| 1 PoolHatch,9Pool,LingRush \| 1 HatchPool,12Pool,3HatchMuta
OverpoolSunk	21	8 HatchPool,9Pool,Unknown \| 5 PoolHatch,9Pool,LingRush \| 2 HatchPool,9Pool,3HatchMuta \| 1 PoolHatch,9Pool,Unknown \| 1 Unknown,Unknown,Unknown \| 1 Unknown,12Pool,3HatchMuta \| 1 HatchPool,Unknown,Unknown \| 1 HatchPool,12Pool,3HatchMuta \| 1 PoolHatch,Unknown,Unknown
OverpoolTurtle	23	7 HatchPool,9Pool,LingRush \| 5 Unknown,12Pool,1HatchHydra \| 3 Unknown,Unknown,1HatchHydra \| 2 HatchPool,Unknown,1HatchHydra \| 2 Unknown,9Pool,1HatchHydra \| 2 HatchPool,12Pool,1HatchLurker \| 1 PoolHatch,12Pool,1HatchLurker \| 1 HatchPool,12Pool,1HatchHydra
ZvP_3HatchMuta	1	1 HatchPool,Unknown,2HatchLing
ZvZ_12HatchExpo	1	1 HatchPool,Unknown,2HatchMuta
ZvZ_Overgas9Pool	2	1 PoolLair,Unknown,1HatchMuta \| 1 PoolLair,Unknown,Unknown
ZvZ_OverpoolTurtle	45	15 Unknown,Unknown,Unknown \| 10 PoolLair,9Pool,1HatchMuta \| 6 PoolLair,12Pool,1HatchMuta \| 6 PoolLair,Unknown,1HatchMuta \| 2 PoolLair,Unknown,Unknown \| 2 Unknown,12Pool,Unknown \| 2 Unknown,Unknown,3HatchMuta \| 1 PoolLair,9Pool,Unknown \| 1 Unknown,12Pool,3HatchMuta

Some curious stuff here. None of Steamhammer’s openings here is 3 hatch mutalisk, so those that are recognized that way may have added a third hatchery later in the game. Steamhammer does have an unfortunate love of laying down an unnecessary hatchery before its spire in ZvZ (3 hatcheries with zerglings is good, 2 hatcheries with mutalisks is good, 3 hatcheries with mutalisks is hard to justify in ZvZ). Looking at the Steamhammer openings tried more often, OverpoolSunk should be recognized as PoolHatch usually (maybe sometimes PoolLair). McRave got it wrong over half the time, without any big effect on its win rate. OverpoolTurtle should be PoolHatch with a hydra followup (this opening is not intended for ZvZ). For ZvZ_OverpoolTurtle, the closest match is PoolLair,9Pool,1HatchMuta. McRave got it right 10 times out of 45 and was close some other times. Failing to recognize anything (likely the scout was denied) was bad.

mcrave as seen by steamhammer

mcrave played	#	steamhammer recognized
PoolHatch,12Pool,2HatchMuta	33	22 Naked expand \| 6 Unknown \| 5 Heavy rush
PoolHatch,12Pool,2HatchSpeedling	22	9 Naked expand \| 9 Unknown \| 3 Heavy rush \| 1 Worker rush
PoolLair,9Pool,1HatchMuta	95	89 Unknown \| 4 Turtle \| 2 Naked expand

Worker rush? That is likely a bug. The other choices capture information about the game that is probably true and not particularly useful.

AIIDE 2020 - Dragon versus Ecgberht

Two posts today, to cover the newly available Ecgberht pairings. Neither post has much meat to it.

dragon strategies versus ecgberht strategies

	overall	14CC	BioMechGreedyFE	FullMech	ProxyBBS	ProxyEightRax
overall	141/150 94%	28/28 100%	27/28 96%	25/25 100%	36/44 82%	25/25 100%
1rax fe	5/6 83%	1/1 100%	1/1 100%	1/1 100%	2/3 67%	-
bio	136/144 94%	27/27 100%	26/27 96%	24/24 100%	34/41 83%	25/25 100%

I was curious about Dragon’s pattern of seemingly giving up on “1rax fe” (barracks expand) after a single loss, so I looked at the file. In fact Dragon played “bio” as the regular build the whole time, throwing in “1rax fe” occasionally for spice. The “1rax fe” loss was not the last “1rax fe” game, but the second to last.

For Ecgberht, when one build is producing nearly all the wins, probably you should play it more often than 30% of the time. You may not want to play it every game, because that makes it easy for the opponent to adapt—mixing it up is good. Maybe 50% of the time would be better, given this number of alternatives? To know for sure, I guess we’d have to test against a range of bots to see the overall effectiveness of learning.

dragon as seen by ecgberht

dragon played	#	ecgberht recognized
1rax fe	6	6 Unknown
bio	144	144 Unknown

Nothing to see here. Move along.

ecgberht as seen by dragon

Dragon does not record its idea of the opponent’s build. If it has one.

AIIDE 2020 - BananaBrain versus Ecgberht

bananabrain strategies versus ecgberht strategies

	overall	14CC	FullMech	JoyORush	MechGreedyFE	ProxyEightRax
overall	148/150 99%	31/31 100%	28/28 100%	28/28 100%	28/28 100%	33/35 94%
PvT_10/12gate	10/10 100%	3/3 100%	-	1/1 100%	3/3 100%	3/3 100%
PvT_10/15gate	10/10 100%	2/2 100%	3/3 100%	1/1 100%	1/1 100%	3/3 100%
PvT_12nexus	10/10 100%	3/3 100%	3/3 100%	2/2 100%	1/1 100%	1/1 100%
PvT_1gatedtexpo	10/10 100%	1/1 100%	3/3 100%	2/2 100%	4/4 100%	-
PvT_1gatereaver	10/10 100%	2/2 100%	3/3 100%	2/2 100%	1/1 100%	2/2 100%
PvT_28nexus	10/10 100%	3/3 100%	2/2 100%	1/1 100%	2/2 100%	2/2 100%
PvT_2gatedt	11/11 100%	2/2 100%	4/4 100%	1/1 100%	2/2 100%	2/2 100%
PvT_2gaterngexpo	10/10 100%	3/3 100%	3/3 100%	-	1/1 100%	3/3 100%
PvT_32nexus	10/10 100%	1/1 100%	1/1 100%	2/2 100%	3/3 100%	3/3 100%
PvT_9/9gate	10/10 100%	2/2 100%	-	3/3 100%	-	5/5 100%
PvT_9/9proxygate	10/10 100%	4/4 100%	3/3 100%	1/1 100%	1/1 100%	1/1 100%
PvT_bulldog	10/10 100%	1/1 100%	1/1 100%	5/5 100%	1/1 100%	2/2 100%
PvT_dtdrop	10/10 100%	1/1 100%	-	3/3 100%	3/3 100%	3/3 100%
PvT_proxydt	7/9 78%	1/1 100%	1/1 100%	3/3 100%	2/2 100%	0/2 0%
PvT_stove	10/10 100%	2/2 100%	1/1 100%	1/1 100%	3/3 100%	3/3 100%

We can see exactly how Ecgberht scored its total of 2 wins: It happened to play a fast proxy when BananaBrain played a slow proxy. For BananaBrain, maybe the lesson is to avoid risky openings versus much weaker opponents. As a general principle, I suggest saving risky builds for games where you have a high risk of losing with safe play—in that case, why not?

bananabrain as seen by ecgberht

bananabrain played	#	ecgberht recognized
PvT_10/12gate	10	7 ZealotRush \| 3 Unknown
PvT_10/15gate	10	10 Unknown
PvT_12nexus	10	9 ProtossFE \| 1 Unknown
PvT_1gatedtexpo	10	10 Unknown
PvT_1gatereaver	10	10 Unknown
PvT_28nexus	10	10 Unknown
PvT_2gatedt	11	11 Unknown
PvT_2gaterngexpo	10	10 Unknown
PvT_32nexus	10	10 Unknown
PvT_9/9gate	10	7 ZealotRush \| 3 Unknown
PvT_9/9proxygate	10	8 Unknown \| 2 CannonRush
PvT_bulldog	10	10 Unknown
PvT_dtdrop	10	10 Unknown
PvT_proxydt	9	9 Unknown
PvT_stove	10	10 Unknown

Except for a couple cases of CannonRush, the builds that Ecgberht recognized were named correctly. I imagine that it interpreted CannonRush as “something proxied.”

ecgberht as seen by bananabrain

ecgberht played	#	bananabrain recognized
14CC	31	21 T_fastexpand \| 6 T_unknown \| 4 T_2rax
FullMech	28	21 T_unknown \| 6 T_1fac \| 1 T_2fac
JoyORush	28	23 T_2fac \| 3 T_unknown \| 2 T_1fac
MechGreedyFE	28	25 T_unknown \| 3 T_2rax
ProxyEightRax	35	35 T_unknown

As we’ve seen before, BananaBrain has little skill in recognizing terran builds.

AIIDE 2020 - what Ecgberht learned

I added parsing code for Ecgberht’s JSON format learning files. I had to refactor for generality, and it added complexity, but I can use the parser for more than one purpose. Today I summarize the contents of its history files.

Ecgberht I think is a complex and interesting bot. It played up to 5 different strategies in each matchup, though the selection of the 5 varied by matchup. Sometimes it played fewer. Against most opponents Ecgberht played its strategies at roughly equal rates—except for the strategies it didn’t play at all. Ecgberht uses UCB with a high exploration rate. The strategy manager in the source lists 15 strategies (plus one more played only on the map Plasma and named PlasmaWraithHell), so it did not play everything it knows. I made a quick scan through the source for opponent-specific preparation, and did find some, but for bots in the tournament only ZZZKBot is affected (it is flagged by a zergling rush check; some bots that always zealot rush are flagged for that). I didn’t dig deep enough to find out why Ecgberht ignores so many of its available strategies.
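For reference, a generic UCB selection rule looks roughly like this. It is a textbook-style sketch with an adjustable exploration constant `c`, not Ecgberht's actual code, and the strategy names and counts in the example are placeholders.

```python
import math


def ucb_pick(stats, c=2.0):
    """stats: {strategy: (wins, games)}. Returns the strategy with the
    highest upper confidence bound; untried strategies go first."""
    total = sum(games for _, games in stats.values())

    def score(name):
        wins, games = stats[name]
        if games == 0:
            return math.inf
        # Mean win rate plus an exploration bonus that shrinks with play.
        return wins / games + c * math.sqrt(math.log(total) / games)

    return max(stats, key=score)


stats = {"A": (9, 10), "B": (0, 2)}
print(ucb_pick(stats, c=2.0), ucb_pick(stats, c=0.0))
```

With a large `c` the bonus dominates, and the rarely played strategy B is chosen even though A has won 9 of 10. That is the mechanism by which a high exploration rate produces roughly equal play rates across strategies.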

Ecgberht tries to recognize the opponent’s strategy, but often finds itself unsure. It recorded a high rate of Unknown enemy plans. The ones it does recognize are drawn from a small set that seems to me well-chosen.

Ecgberht recorded fewer than 150 games for 5 of its 12 opponents, although it completed all games with no crashes. In total, 7 games do not appear in the game records of the history files. Maybe it has a cleanup bug that bites occasionally?
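The missing-game count can be checked from the per-opponent game totals in the tables that follow. A small sketch, with the totals copied from those tables:

```python
GAMES_PER_PAIRING = 150

# Games recorded per opponent, from the "openings" total line of each table.
recorded = {
    "stardust": 149, "purplewave": 150, "bananabrain": 150, "dragon": 150,
    "mcrave": 148, "microwave": 149, "steamhammer": 149, "daqin": 150,
    "zzzkbot": 150, "ualbertabot": 150, "willyt": 150, "eggbot": 148,
}

missing = {name: GAMES_PER_PAIRING - n
           for name, n in recorded.items() if n < GAMES_PER_PAIRING}
print(len(missing), sum(missing.values()))
```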

#1 stardust

opening	games	wins	first	last
14CC	31	0%	3	147
FullMech	28	0%	0	148
JoyORush	27	0%	2	143
MechGreedyFE	27	0%	4	146
ProxyEightRax	36	6%	1	141
5 openings	149	1%

enemy	games	wins
Unknown	149	1%
1 opening	149	1%

A couple wins against the top player is not bad.

#2 purplewave

opening	games	wins	first	last
14CC	35	3%	3	148
FullMech	29	0%	0	149
JoyORush	28	0%	2	146
MechGreedyFE	28	0%	4	147
ProxyEightRax	30	0%	1	142
5 openings	150	1%

enemy	games	wins
ProtossFE	7	0%
Unknown	143	1%
2 openings	150	1%

#3 bananabrain

opening	games	wins	first	last
14CC	31	0%	3	146
FullMech	28	0%	0	144
JoyORush	28	0%	2	147
MechGreedyFE	28	0%	4	148
ProxyEightRax	35	6%	1	149
5 openings	150	1%

enemy	games	wins
CannonRush	2	0%
ProtossFE	9	0%
Unknown	125	2%
ZealotRush	14	0%
4 openings	150	1%

#4 dragon

opening	games	wins	first	last
14CC	28	0%	3	148
BioMechGreedyFE	28	4%	4	144
FullMech	25	0%	0	146
ProxyBBS	44	18%	2	149
ProxyEightRax	25	0%	1	147
5 openings	150	6%

enemy	games	wins
Unknown	150	6%
1 opening	150	6%

#5 mcrave

opening	games	wins	first	last
14CC	28	7%	7	147
BioGreedyFE	51	29%	0	145
ProxyEightRax	47	26%	21	140
TwoPortWraith	22	5%	3	146
4 openings	148	20%

enemy	games	wins
FastHatch	61	16%
NinePool	13	31%
Unknown	74	22%
3 openings	148	20%

Ecgberht put up its strongest fight against zerg.

#6 microwave

opening	games	wins	first	last
14CC	32	9%	4	145
BioGreedyFE	21	0%	0	148
FullBioFE	24	4%	3	146
ProxyEightRax	52	27%	1	147
TwoPortWraith	20	0%	2	138
5 openings	149	12%

enemy	games	wins
FastHatch	99	4%
NinePool	5	40%
Unknown	45	27%
3 openings	149	12%

#7 steamhammer

opening	games	wins	first	last
14CC	34	12%	8	147
BioGreedyFE	36	17%	0	142
ProxyEightRax	36	14%	1	141
TwoPortWraith	43	23%	4	148
4 openings	149	17%

enemy	games	wins
EarlyPool	4	0%
FastHatch	22	32%
NinePool	81	14%
Unknown	42	17%
4 openings	149	17%

#8 daqin

opening	games	wins	first	last
14CC	32	0%	8	148
FullMech	29	0%	0	149
JoyORush	28	0%	4	144
MechGreedyFE	28	0%	43	147
ProxyEightRax	33	3%	1	141
5 openings	150	1%

enemy	games	wins
Unknown	150	1%
1 opening	150	1%

#9 zzzkbot

opening	games	wins	first	last
FullBio	150	71%	0	149
1 opening	150	71%

enemy	games	wins
EarlyPool	150	71%
1 opening	150	71%

Ecgberht upset ZZZKBot, possibly aided by its hardcoded knowledge of how ZZZKBot plays.

#10 ualbertabot

opening	games	wins	first	last
FullBio	58	43%	0	144
FullMech	52	38%	2	145
ProxyBBS	40	32%	1	149
3 openings	150	39%

enemy	games	wins
BioPush	11	91%
EarlyPool	12	50%
MechRush	9	33%
Unknown	104	24%
ZealotRush	14	100%
5 openings	150	39%

#11 willyt

opening	games	wins	first	last
14CC	31	3%	68	148
FullMech	34	9%	0	147
ProxyEightRax	85	41%	2	149
3 openings	150	26%

enemy	games	wins
BioPush	34	15%
Unknown	116	29%
2 openings	150	26%

#13 eggbot

opening	games	wins	first	last
FullMech	148	94%	0	147
1 opening	148	94%

enemy	games	wins
CannonRush	94	95%
Unknown	54	93%
2 openings	148	94%

AIIDE 2020 - Microwave versus BananaBrain

This is the last matchup I can analyze this way without writing more parsing code. McRave did ask for more in a comment, though, so I may do that. All the matchups have featured BananaBrain.

Microwave plays a large number of strategies, so I put it on the left side. Blue is good for Microwave, red is good for BananaBrain.

microwave strategies versus bananabrain strategies

	overall	PvZ_10/12gate	PvZ_1basespeedzeal	PvZ_2basespeedzeal	PvZ_4gate2archon	PvZ_5gategoon	PvZ_9/9gate	PvZ_9/9proxygate	PvZ_bisu	PvZ_neobisu	PvZ_sairdt	PvZ_sairgoon	PvZ_sairreaver	PvZ_stove
overall	58/150 39%	5/17 29%	3/19 16%	4/11 36%	4/9 44%	4/7 57%	5/11 45%	5/12 42%	4/14 29%	4/10 40%	5/10 50%	6/11 55%	4/9 44%	5/10 50%
10Hatch9Pool9gas	0/2 0%	-	-	-	0/1 0%	0/1 0%	-	-	-	-	-	-	-	-
10HatchMain9Pool9Gas	0/1 0%	-	-	-	-	-	-	-	0/1 0%	-	-	-	-	-
11HatchTurtleHydra	0/1 0%	-	-	-	-	-	-	-	-	0/1 0%	-	-	-	-
12Hatch	0/1 0%	0/1 0%	-	-	-	-	-	-	-	-	-	-	-	-
12PoolMain	22/43 51%	0/5 0%	0/9 0%	2/2 100%	3/3 100%	3/3 100%	0/1 0%	1/3 33%	2/2 100%	3/3 100%	0/3 0%	2/3 67%	4/4 100%	2/2 100%
12PoolMuta	0/1 0%	0/1 0%	-	-	-	-	-	-	-	-	-	-	-	-
1HatchMuta_Sparkle	0/1 0%	-	-	-	-	-	-	0/1 0%	-	-	-	-	-	-
2HatchMuta	1/5 20%	-	-	1/1 100%	-	-	0/1 0%	-	0/1 0%	-	-	-	0/1 0%	0/1 0%
3HatchHydraBust	0/1 0%	-	-	-	-	-	-	-	0/1 0%	-	-	-	-	-
3HatchHydra_BHG	0/1 0%	-	-	0/1 0%	-	-	-	-	-	-	-	-	-	-
3HatchLingBust	2/6 33%	-	0/1 0%	0/1 0%	-	-	1/1 100%	0/1 0%	-	-	-	1/1 100%	-	0/1 0%
3HatchMuta	0/1 0%	-	-	-	-	-	-	-	-	0/1 0%	-	-	-	-
3HatchPoolHydraExpo	0/1 0%	0/1 0%	-	-	-	-	-	-	-	-	-	-	-	-
4HatchBeforeGas	0/1 0%	-	-	-	-	-	-	-	-	-	-	0/1 0%	-	-
4HatchPoolHydra	0/2 0%	-	0/1 0%	0/1 0%	-	-	-	-	-	-	-	-	-	-
4PoolHard	2/6 33%	-	1/1 100%	0/1 0%	-	-	1/1 100%	-	0/1 0%	-	-	-	-	0/2 0%
4PoolSoft	0/1 0%	-	0/1 0%	-	-	-	-	-	-	-	-	-	-	-
6Pool	0/1 0%	-	0/1 0%	-	-	-	-	-	-	-	-	-	-	-
7Pool	0/1 0%	-	-	-	-	-	-	-	-	-	0/1 0%	-	-	-
8Pool	0/1 0%	-	-	-	-	-	-	-	-	0/1 0%	-	-	-	-
8PoolHydraRush8D	0/1 0%	0/1 0%	-	-	-	-	-	-	-	-	-	-	-	-
9PoolGasHatchSpeed8D	12/18 67%	2/2 100%	2/2 100%	-	1/2 50%	0/1 0%	1/1 100%	0/2 0%	1/1 100%	1/1 100%	1/1 100%	1/2 50%	0/1 0%	2/2 100%
9PoolHatchGasSpeed7D	0/1 0%	-	-	-	0/1 0%	-	-	-	-	-	-	-	-	-
9PoolHatchGasSpeed8D	17/32 53%	3/4 75%	0/1 0%	1/1 100%	0/1 0%	0/1 0%	2/4 50%	4/5 80%	1/5 20%	0/1 0%	4/4 100%	2/2 100%	0/2 0%	0/1 0%
9PoolSpeed	0/3 0%	0/1 0%	-	-	0/1 0%	-	-	-	-	-	-	0/1 0%	-	-
9PoolSpeedLing	1/5 20%	-	-	-	-	-	0/1 0%	-	0/1 0%	-	-	0/1 0%	0/1 0%	1/1 100%
9PoolSunkHatch	0/1 0%	-	-	0/1 0%	-	-	-	-	-	-	-	-	-	-
Overpool	0/1 0%	0/1 0%	-	-	-	-	-	-	-	-	-	-	-	-
OverpoolSpeed	0/3 0%	-	0/1 0%	0/1 0%	-	-	-	-	0/1 0%	-	-	-	-	-
ZvP_10Hatch9Pool	1/3 33%	-	0/1 0%	0/1 0%	-	1/1 100%	-	-	-	-	-	-	-	-
ZvP_11Hatch10Pool	0/1 0%	-	-	-	-	-	-	-	-	0/1 0%	-	-	-	-
ZvZ_Overgas9Pool	0/1 0%	-	-	-	-	-	-	-	-	0/1 0%	-	-	-	-
ZvZ_Overpool11Gas	0/2 0%	-	-	-	-	-	0/1 0%	-	-	-	0/1 0%	-	-	-

This table looks even more scattered than yesterday’s BananaBrain-Dragon table, but to me it tells a story of duelling learning algorithms. Microwave found a few builds that countered BananaBrain’s preferred play, and BananaBrain did not shift its responses far enough to entirely squelch them.

microwave as seen by bananabrain

microwave played	#	bananabrain recognized
10Hatch9Pool9gas	2	2 Z_10hatch
10HatchMain9Pool9Gas	1	1 Z_10hatch
11HatchTurtleHydra	1	1 Z_12hatch
12Hatch	1	1 Z_12hatch
12PoolMain	43	36 Z_12pool \| 5 Z_10hatch \| 2 Z_unknown
12PoolMuta	1	1 Z_10hatch
1HatchMuta_Sparkle	1	1 Z_unknown
2HatchMuta	5	5 Z_12hatch
3HatchHydraBust	1	1 Z_12hatch
3HatchHydra_BHG	1	1 Z_10hatch
3HatchLingBust	6	6 Z_12hatch
3HatchMuta	1	1 Z_12hatch
3HatchPoolHydraExpo	1	1 Z_12hatch
4HatchBeforeGas	1	1 Z_12hatch
4HatchPoolHydra	2	2 Z_12hatch
4PoolHard	6	6 Z_4/5pool
4PoolSoft	1	1 Z_4/5pool
6Pool	1	1 Z_4/5pool
7Pool	1	1 Z_9pool
8Pool	1	1 Z_9pool
8PoolHydraRush8D	1	1 Z_9pool
9PoolGasHatchSpeed8D	18	15 Z_9pool \| 3 Z_overpool
9PoolHatchGasSpeed7D	1	1 Z_9pool
9PoolHatchGasSpeed8D	32	29 Z_9pool \| 3 Z_overpool
9PoolSpeed	3	2 Z_9poolspeed \| 1 Z_9pool
9PoolSpeedLing	5	5 Z_9poolspeed
9PoolSunkHatch	1	1 Z_9pool
Overpool	1	1 Z_overpool
OverpoolSpeed	3	3 Z_overpool
ZvP_10Hatch9Pool	3	3 Z_10hatch
ZvP_11Hatch10Pool	1	1 Z_12hatch
ZvZ_Overgas9Pool	1	1 Z_12pool
ZvZ_Overpool11Gas	2	2 Z_overpool

BananaBrain was accurate at reading Microwave’s initial build. Lumping 11 hatch with 12 hatch is fine, they’re very similar. 12 pool can be difficult to distinguish from 10 hatch, if you scout it late after the second hatchery finishes. It would be useful to better separate 9 pool from overpool, which are significantly different in effect, but it requires close attention to detail. Overall, highly accurate readings with only one wide miss, seeing the overgas 9 pool as 12 pool—and that is a ZvZ build that is extremely rare in ZvP.

It makes quite a contrast with yesterday’s BananaBrain-Dragon analysis, where BananaBrain barely recognized terran builds.

bananabrain as seen by microwave

bananabrain played	#	microwave recognized
PvZ_10/12gate	17	13 HeavyRush \| 3 Unknown \| 1 NakedExpand
PvZ_1basespeedzeal	19	14 Unknown \| 5 HeavyRush
PvZ_2basespeedzeal	11	4 NakedExpand \| 3 Turtle \| 3 SafeExpand \| 1 HeavyRush
PvZ_4gate2archon	9	4 NakedExpand \| 4 SafeExpand \| 1 HeavyRush
PvZ_5gategoon	7	6 NakedExpand \| 1 HeavyRush
PvZ_9/9gate	11	9 HeavyRush \| 2 Unknown
PvZ_9/9proxygate	12	6 HeavyRush \| 6 Unknown
PvZ_bisu	14	6 SafeExpand \| 4 NakedExpand \| 2 Turtle \| 1 HeavyRush \| 1 Unknown
PvZ_neobisu	10	4 NakedExpand \| 3 SafeExpand \| 2 Turtle \| 1 HeavyRush
PvZ_sairdt	10	8 Unknown \| 2 HeavyRush
PvZ_sairgoon	11	7 NakedExpand \| 1 SafeExpand \| 1 Turtle \| 1 Unknown \| 1 HeavyRush
PvZ_sairreaver	9	4 SafeExpand \| 3 NakedExpand \| 2 Turtle
PvZ_stove	10	7 Unknown \| 3 HeavyRush

Microwave borrowed Steamhammer’s rather crude classification of enemy plans (which was still far in the future when Microwave forked from Steamhammer). It was intended to be minimal, just enough to allow for basic reactions, to hold the fort until I could raise enough troops to make a sally. Microwave’s recognitions look similar to Steamhammer’s, with the right general tendency but many sloppy variations (which I think are due mostly to weak scouting, with a contribution from overlapping recognition rules).

It’s striking that some recognitions—of dubious accuracy—are dark blue in stark contrast to their neighbors. It gives me the impression that Microwave makes use of the recognized enemy plan, in some cases to good effect. It suggests that more accurate recognition, if the reactions are also good, could be a major improvement.

AIIDE 2020 - BananaBrain versus Dragon

Of the 4 bots I’m prepared to run this analysis on, this is the only pairing involving Dragon. Dragon did not record all 150 games against either McRave or Microwave. Like yesterday, all win rates and coloring are from the point of view of BananaBrain: Blue is good for BananaBrain, red is good for Dragon.

bananabrain strategies versus dragon strategies

	overall	1rax fe	2rax bio	2rax mech	bio	dirty worker rush	mass vulture	siege expand
overall	67/150 45%	6/14 43%	6/11 55%	8/15 53%	15/37 41%	3/3 100%	22/56 39%	7/14 50%
PvT_10/12gate	12/17 71%	2/3 67%	-	2/3 67%	4/4 100%	-	3/6 50%	1/1 100%
PvT_10/15gate	5/12 42%	-	2/2 100%	1/5 20%	1/3 33%	-	1/2 50%	-
PvT_12nexus	1/8 12%	1/2 50%	-	-	0/1 0%	-	0/3 0%	0/2 0%
PvT_1gatedtexpo	3/7 43%	1/2 50%	-	-	0/1 0%	-	2/4 50%	-
PvT_1gatereaver	0/5 0%	-	0/1 0%	-	0/2 0%	-	0/2 0%	-
PvT_28nexus	5/11 45%	0/2 0%	0/1 0%	0/2 0%	1/1 100%	-	4/5 80%	-
PvT_2gatedt	3/9 33%	0/1 0%	-	1/1 100%	0/2 0%	-	0/3 0%	2/2 100%
PvT_2gaterngexpo	2/7 29%	-	0/1 0%	-	1/1 100%	1/1 100%	0/4 0%	-
PvT_32nexus	2/8 25%	-	-	-	1/4 25%	1/1 100%	0/2 0%	0/1 0%
PvT_9/9gate	14/18 78%	-	2/3 67%	-	4/4 100%	1/1 100%	7/9 78%	0/1 0%
PvT_9/9proxygate	8/14 57%	1/1 100%	1/1 100%	3/3 100%	0/2 0%	-	2/6 33%	1/1 100%
PvT_bulldog	0/6 0%	0/1 0%	-	-	0/3 0%	-	0/1 0%	0/1 0%
PvT_dtdrop	2/8 25%	-	1/1 100%	-	0/4 0%	-	1/2 50%	0/1 0%
PvT_proxydt	10/14 71%	1/1 100%	-	1/1 100%	3/3 100%	-	2/5 40%	3/4 75%
PvT_stove	0/6 0%	0/1 0%	0/1 0%	-	0/2 0%	-	0/2 0%	-

Not one table cell has more than 9 games in it. Neither bot successfully predicted what the other would play, if it even tried: BananaBrain is unpredictable and Dragon changes its choice frequently when losing, and besides BananaBrain is poor at recognizing terran plans. So the strategy x strategy cross is a hash. To me the table means that, at least for this pairing, reactions during the game were more important than the initial choice of strategy. Neither side had a way to choose a counter beforehand.
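To put a number on how little a 9-game cell says, one option is a confidence interval on the cell's win rate. The sketch below uses the Wilson score interval, which is a choice made here for illustration and not something the analysis above relies on. The example cell is PvT_9/9gate against mass vulture, 7 wins in 9 games.

```python
import math


def wilson_interval(wins, games, z=1.96):
    """Approximate 95% Wilson score interval for a win rate."""
    p = wins / games
    denom = 1 + z * z / games
    center = (p + z * z / (2 * games)) / denom
    half = z * math.sqrt(p * (1 - p) / games
                         + z * z / (4 * games * games)) / denom
    return center - half, center + half


low, high = wilson_interval(7, 9)
print(f"{low:.2f} to {high:.2f}")
```

Even for the best-populated cell in the table, the interval stretches from below 50% to above 90%, so cell-by-cell comparisons are mostly noise.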

bananabrain as seen by dragon

Dragon does not record a recognized opponent strategy. Its history files have only its own strategy and whether it won.

dragon as seen by bananabrain

dragon played	#	bananabrain recognized
1rax fe	14	13 T_unknown \| 1 T_fastexpand
2rax bio	11	8 T_unknown \| 2 T_fastexpand \| 1 T_1fac
2rax mech	15	14 T_unknown \| 1 T_1fac
bio	37	35 T_unknown \| 1 T_1fac \| 1 T_fastexpand
dirty worker rush	3	3 T_unknown
mass vulture	56	30 T_1fac \| 26 T_unknown
siege expand	14	9 T_unknown \| 5 T_1fac

We knew that BananaBrain struggles to recognize terran strategies. Maybe the author has not spent effort on it because it doesn’t affect results much? In any case, given how Dragon plays, with its love of fast expansions and mixed tech, the terran builds that are recognized probably represent truths about the games. It’s not clear that they are helpful truths, though, because they say so little about what happened.

From the coloring, it looks as though there was little relationship between whether BananaBrain recognized Dragon’s build and whether BananaBrain won. That is consistent with the theory that the author decided it didn’t matter.

AIIDE 2020 - BananaBrain versus McRave

If both bots in a pairing write history files, and both record all 150 games of the tournament, then the history files can be aligned and we can compare what the bots were thinking in each game. So far, between the limitations of the data and the limitations of my script, I’m only ready to do that for a few pairings. Dragon in particular often did not record all 150 games, and I’d rather not try to align game records when there are gaps in the histories (there is enough data to do it programmatically, but it’s a pain and risks errors). Also my script depends on parsing out data into a specific format, and it is only implemented for 4 bots so far (#3 BananaBrain, #4 Dragon, #5 McRave, #6 Microwave—alphabetical order and their finishing order were the same).

Today is BananaBrain versus McRave. The first BananaBrain line in its file about McRave:

2020-10-09 20:56:04,2,(2)Destination.scx,PvZ_9/9proxygate,Z_overpool,7.6,1

The first McRave line in its history file about BananaBrain (we’re told it doesn’t use this data in games, but it’s there and we can analyze it):

Lost,Destination,7:30,2Gate,Proxy,ZealotRush,PoolHatch,Overpool,2HatchSpeedling,1:21,1:21,1:21,5,Zerg_Larva,30,Zerg_Zergling,15,Zerg_Drone,3,Zerg_Overlord,24,Protoss_Probe,16,Protoss_Zealot,1,Protoss_Corsair
Lost,HeartbreakRidge,17:40,2Gate,Main,Corsair,PoolHatch,Overpool,2HatchMuta,2:01,2:01,5:10,2,Zerg_Larva,16,Zerg_Zergling,42,Zerg_Drone,10,Zerg_Overlord,28,Zerg_Mutalisk,18,Zerg_Scourge,89,Protoss_Probe,54,Protoss_Zealot,23,Protoss_Dragoon,1,Protoss_High_Templar,1,Protoss_Shuttle,33,Protoss_Corsair,5,Protoss_Dark_Templar,1,Protoss_Reaver,8,Protoss_Scarab

My script extracts key info from each line so we can compare. BananaBrain played PvZ_9/9proxygate and concluded that McRave answered with Z_overpool, while McRave played PoolHatch,Overpool,2HatchSpeedling and classified what it saw from BananaBrain as 2Gate,Proxy,ZealotRush. In this game, both sides agreed pretty well about what was going on.
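The extraction can be sketched as follows. This is not the actual script. The field positions are read off the two sample records above. Reading the last BananaBrain field as a win flag and the win marker in McRave's file as the literal string "Won" are both guesses, since only a "Lost" record is shown.

```python
BB_LINE = "2020-10-09 20:56:04,2,(2)Destination.scx,PvZ_9/9proxygate,Z_overpool,7.6,1"
MCRAVE_LINE = ("Lost,Destination,7:30,2Gate,Proxy,ZealotRush,PoolHatch,Overpool,"
               "2HatchSpeedling,1:21,1:21,1:21,5,Zerg_Larva,30,Zerg_Zergling,"
               "15,Zerg_Drone,3,Zerg_Overlord,24,Protoss_Probe,16,Protoss_Zealot,"
               "1,Protoss_Corsair")


def parse_bananabrain(line):
    f = line.strip().split(",")
    return {
        "map": f[2],
        "own": f[3],            # strategy BananaBrain played
        "enemy": f[4],          # strategy it recognized
        "won": f[-1] == "1",    # assumption: final field is a win flag
    }


def parse_mcrave(line):
    f = line.strip().split(",")
    return {
        "map": f[1],
        "won": f[0] == "Won",         # assumption: only "Lost" appears in the sample
        "enemy": ",".join(f[3:6]),    # three-part recognized enemy strategy
        "own": ",".join(f[6:9]),      # three-part own strategy
    }


bb = parse_bananabrain(BB_LINE)
mc = parse_mcrave(MCRAVE_LINE)
print(bb["own"], "seen as", mc["enemy"])
print(mc["own"], "seen as", bb["enemy"])
```

Once both files are parsed, games can be paired by position in the file, which is why the alignment only works when both bots recorded all 150 games.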

bananabrain strategies versus mcrave strategies

This table shows which BananaBrain strategies were successful against which three-part McRave strategies. All the winning rates are from BananaBrain’s point of view. The intersection of the overall row and the overall column says that BananaBrain won 82 out of 150 games throughout the tournament, which can be checked against the official crosstable. The overall row tells how BananaBrain fared against each of McRave’s strategies, which can be checked against my tables of what McRave learned. The overall column tells how each BananaBrain strategy performed, which can be checked against what BananaBrain learned. (Spoiler: All the numbers match.) The center cells are the meat, and show what countered what.

	overall	PoolHatch,Overpool,2HatchMuta	PoolHatch,Overpool,2HatchSpeedling	PoolHatch,Overpool,3HatchSpeedling
overall	82/150 55%	72/131 55%	5/10 50%	5/9 56%
PvZ_10/12gate	9/13 69%	4/4 100%	-	5/9 56%
PvZ_1basespeedzeal	6/12 50%	6/12 50%	-	-
PvZ_2basespeedzeal	3/9 33%	3/9 33%	-	-
PvZ_4gate2archon	1/6 17%	1/6 17%	-	-
PvZ_5gategoon	9/16 56%	9/16 56%	-	-
PvZ_9/9gate	26/27 96%	26/27 96%	-	-
PvZ_9/9proxygate	5/10 50%	-	5/10 50%	-
PvZ_bisu	1/5 20%	1/5 20%	-	-
PvZ_neobisu	5/11 45%	5/11 45%	-	-
PvZ_sairdt	3/10 30%	3/10 30%	-	-
PvZ_sairgoon	11/17 65%	11/17 65%	-	-
PvZ_sairreaver	1/5 20%	1/5 20%	-	-
PvZ_stove	2/9 22%	2/9 22%	-	-

The table makes it plain that 2HatchSpeedling and 3HatchSpeedling were reactions to specific protoss builds, as the author pointed out in a comment. The counter to 10/12 gate at least seems to have been valuable, because McRave lost all 4 games where the 10/12 gate was played but not countered. The 9/9 gate crushed because no counter was played against it; the zealots are a McRave weakness.
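A crosstable like the one above can be built with a single tally over the aligned games. This is a sketch with hypothetical names, not the script that generated the table. The sample input reproduces only the PvZ_10/12gate row.

```python
from collections import defaultdict


def crosstab(games):
    """games: iterable of (row_strategy, col_strategy, row_player_won).
    Returns {(row, col): [wins, games]}, including the 'overall' margins."""
    cells = defaultdict(lambda: [0, 0])
    for row, col, won in games:
        for key in ((row, col), (row, "overall"),
                    ("overall", col), ("overall", "overall")):
            cells[key][0] += int(won)
            cells[key][1] += 1
    return cells


def fmt(cell):
    wins, n = cell
    return f"{wins}/{n} {round(100 * wins / n)}%"


# The PvZ_10/12gate games: 4 wins against 2HatchMuta, 5 wins and
# 4 losses against 3HatchSpeedling.
muta = "PoolHatch,Overpool,2HatchMuta"
ling = "PoolHatch,Overpool,3HatchSpeedling"
games = ([("PvZ_10/12gate", muta, True)] * 4
         + [("PvZ_10/12gate", ling, True)] * 5
         + [("PvZ_10/12gate", ling, False)] * 4)
cells = crosstab(games)
print(fmt(cells[("PvZ_10/12gate", "overall")]))
```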

bananabrain as seen by mcrave

But wait, there’s more. Both bots recorded not only their own strategy, but the recognized opponent strategy, so we can compare the known strategy of one bot with how the other bot recognized it. Note well: If the recognized strategy looks different than the actual strategy, it is not necessarily a mistake or a scouting miss. The bots may simply be noting different aspects of the game. Only some differences indicate mistakes.

The coloring is from the point of view of BananaBrain. For McRave, red is good and blue is bad.

bananabrain played	#	mcrave recognized
PvZ_10/12gate	13	12 2Gate,Main,Corsair \| 1 2Gate,Main,DT
PvZ_1basespeedzeal	12	8 1GateCore,2Zealot,DT \| 2 2Gate,Main,DT \| 1 2Gate,Main,4Gate \| 1 1GateCore,2Zealot,Corsair
PvZ_2basespeedzeal	9	8 FFE,Forge,Speedlot \| 1 FFE,Gateway,Speedlot
PvZ_4gate2archon	6	2 FFE,Forge,NeoBisu \| 2 FFE,Forge,5GateGoon \| 1 FFE,Forge,ZealotArchon \| 1 FFE,Nexus,NeoBisu
PvZ_5gategoon	16	14 FFE,Forge,5GateGoon \| 2 FFE,Nexus,5GateGoon
PvZ_9/9gate	27	26 2Gate,Main,Corsair \| 1 2Gate,Main,DT
PvZ_9/9proxygate	10	10 2Gate,Proxy,ZealotRush
PvZ_bisu	5	3 FFE,Forge,NeoBisu \| 2 FFE,Nexus,NeoBisu
PvZ_neobisu	11	5 FFE,Forge,NeoBisu \| 4 FFE,Forge,Speedlot \| 2 FFE,Nexus,NeoBisu
PvZ_sairdt	10	10 1GateCore,2Zealot,Corsair
PvZ_sairgoon	17	9 FFE,Forge,NeoBisu \| 2 FFE,Nexus,5GateGoon \| 2 FFE,Nexus,NeoBisu \| 2 FFE,Forge,Unknown \| 1 FFE,Forge,Speedlot \| 1 FFE,Forge,5GateGoon
PvZ_sairreaver	5	5 FFE,Forge,NeoBisu
PvZ_stove	9	9 1GateCore,2Zealot,Corsair

Only 2 games have an Unknown element. Without watching replays, I can’t say that any of McRave’s recognitions are wrong. Seeing PvZ_sairgoon as FFE,Forge,Speedlot could be correct if BananaBrain followed up with zealots in that one game.

I’m not sure what the difference is between FFE,Forge and FFE,Gateway and FFE,Nexus. FFE stands for forge fast expand, which means a forge and a nexus, and then you need a gateway if you’re ever going to make a mobile army, so all three buildings are required. Maybe it’s whatever building McRave saw first.
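The “as seen by” tables are a count of recognized strategies grouped by the strategy actually played. A minimal sketch with hypothetical names, fed with the PvZ_10/12gate row as sample input:

```python
from collections import Counter, defaultdict


def seen_as(pairs):
    """pairs: iterable of (played, recognized), one per aligned game."""
    table = defaultdict(Counter)
    for played, recognized in pairs:
        table[played][recognized] += 1
    return table


def fmt_row(played, counts):
    """One tab-separated row: played strategy, game count, recognitions."""
    parts = " | ".join(f"{n} {name}" for name, n in counts.most_common())
    return f"{played}\t{sum(counts.values())}\t{parts}"


pairs = ([("PvZ_10/12gate", "2Gate,Main,Corsair")] * 12
         + [("PvZ_10/12gate", "2Gate,Main,DT")])
table = seen_as(pairs)
print(fmt_row("PvZ_10/12gate", table["PvZ_10/12gate"]))
```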

mcrave as seen by bananabrain

Again, the coloring is from the point of view of BananaBrain.

mcrave played	#	bananabrain recognized
PoolHatch,Overpool,2HatchMuta	131	101 Z_overpool \| 27 Z_9pool \| 3 Z_unknown
PoolHatch,Overpool,2HatchSpeedling	10	9 Z_overpool \| 1 Z_unknown
PoolHatch,Overpool,3HatchSpeedling	9	8 Z_overpool \| 1 Z_9pool

BananaBrain remembered far less detail about the game than McRave. Overpool is only an initial build order which reaches its end at 9 supply and can be followed up with any tech or unit mix whatsoever. If all you know is that the opponent will start with overpool, the only conclusions you can draw are limits on the opponent’s tech timings and economy. On the other hand, if you do know more about the opponent’s play, can you use the information productively?

more?

I could generate more tables. Various tables showing recognized strategies might make sense. If at least one bot of the pair records the map for each game, it would be easy to break down strategies by map. Is there any particular breakdown you’d like to see?

Update: I added coloring to the “as seen by” tables, to show how win rates vary depending on what the bots recognized.

AIIDE 2020 - summary so far

I’ve analyzed the learning files of most of the bots that wrote them. Stardust and EggBot wrote nothing. WillyT recorded the results of only a small fraction of games. ZZZKBot and Ecgberht wrote files that are more troublesome for me to parse (starting from what I’ve already coded), though I will analyze them too if people want. PurpleWave doesn’t have a single strategy-I-played and enemy-strategy-I-recognized, but tags history games with an irregular set including a combination of its multiple strategy choices and multiple “fingerprints” of enemy choices; this data should be summarized in an entirely different way.

Tomorrow I’ll start a new data analysis of a kind I haven’t tried before. It will bring out new information, and I think people will like it.