
Steamhammer on BWAPI 4.4.0

I finally have Steamhammer working locally with BWAPI 4.4.0. I didn’t have much trouble with it this time, on my second try, though I worked slowly.

I want to make a few minor code changes and run tests. I expect it won’t take long. If everything continues to look good, I’ll release Steamhammer 3.3 before long. Its play will be little changed.

Then it will be time to prepare for the annual SSCAIT tournament. (I write it that way even though the T already stands for Tournament. It seems clearer than “SSCAI tournament”.) The current version is essentially the AIIDE version, and the updates seem to have been worth 50 to 100 elo points, a substantial jump. But I did introduce a serious new bug: Steamhammer now often builds hatcheries beside the base it wants to take next, rather than at the base—sometimes several hatcheries. Then, since it hasn’t taken the base, it doesn’t mine there. It’s a terrible weakness, and fixing it has to be the first step.

how to beat Monster

Monster’s elo is settling close to 2900, making it the top zerg by a margin. It has been updated at least once. One of the changes was to place outward-facing sunkens in a neat line, rather than a somewhat ragged formation, a clear improvement. I expect some authors will be thinking about how to beat Monster in the upcoming annual SSCAIT tournament, so here are a few thoughts. Though Monster’s author could read this too and make changes....

Monster does not want its sunkens to be broken, and often builds too many too early near the start of the game. That delays its economic growth. You can take advantage by delaying your own expansion a little to look threatening, and perhaps by killing Monster’s scout drone quickly. If Monster makes 2 or 3 sunkens in response to only a few units of yours, you are ahead. Don’t try for a bust at any early timing (even Stardust has to build up forces first). If you do make a lot of units early, you can try to contain Monster, to prevent it from expanding until you can get ahead, as in PurpleWave-Monster on La Mancha (an exciting game; the plan barely worked).

If you can run by the sunkens in the natural and get into Monster’s main, do it. Sometimes Monster walls tightly enough that it’s impossible, though. If you get in, Monster will run drones from its main to its natural, so you won’t kill many, but that is OK. There’s no need to destroy anything to get ahead. Stay alive in the main as long as possible to prevent drones from returning. Monster will be mining only one geyser and fewer than half as many mineral patches, and its economy will fall behind with every second. If mutalisks come out, they will come on half as strong at first, and they will need to clear Monster’s main before they head across the map to attack, so you have more time to prepare.

Against protoss, Monster most often plays with hydralisks, but may switch to mutalisks if you skimp too much on air defense. Corsairs are good for air defense, but I have yet to see a protoss bot which understands how to get the most value out of them. Monster is skilled with scourge, so you have to be careful with corsairs. You can see the general idea in pro games: As long as zerg has no spire, corsairs can scout and can clear overlords away. When scourge appears, protoss makes a cannon at the main nexus (initially only one, maybe more later) for short-term air defense and parks corsairs there until there are enough corsairs that they can shoot down scourge before it strikes. That’s about 5 corsairs. +1 air attack helps. Keep the corsairs in a tight group and they can go out on the map again, vulnerable only to determined scourge attacks with spread scourge, and even then the corsairs can shoot down some of it, raising zerg’s gas cost. If zerg has a lot of scourge, put dragoons or archons in your army and keep corsairs nearby; when in danger, fly over your anti-air units.

Against protoss and zerg, Monster opens with safe overpool builds that are not easy to exploit. Against terran, Monster plays a greedy three-hatchery build that leaves openings. Terran can try a cheese rush, or can take advantage of the slow zerg start to go for heavy macro itself. Hao Pan and Krasi0 have both tried proxy barracks, with partial success. Monster does have cheese reactions and defends itself fairly well if it sees the rush coming, though even then it may fall behind and lose. See Hao Pan-Monster, bunker rush not scouted and Hao Pan-Monster, bunker rush scouted (the scout overlord saw the first marine). In the second game Monster skipped its natural hatchery (it will also cancel a natural already started), made the hatch in its main instead, and got a sunken for defense. If Monster breaks the bunker early (which it often does) it typically wins; if it stays contained too long it will lose.

If you as terran go the macro route, you need good timing. Monster plays spire with +1 (usually attack but I have seen carapace too; is that due to an update?), then a second spire and double upgrades from then on. Terran has longer ranged units and low mobility, so terran usually wants to build up before moving out, especially if playing mech. But ground units do not stack and air units do, so if you wait too long then the mutalisks will defeat goliaths—all mutas can fire at once to pick off an edge goliath while not all of its comrades can get into range. Here is Krasi0 fast expanding to outmacro Monster. I think that this one is more instructive, though: XiaoYi-Monster on Empire of the Sun shows good terran unit mix and timing to beat the mutas. Ecgberht has also defeated Monster with the same kind of plan using bio units.

As terran, you also need good defense to survive until it is time to move out. Bots usually space turrets apart to cover ground. Pros don’t position turrets that way against mutalisks: They build turrets in tight groups so that a muta flock killing one turret takes as much damage as possible from others. The turret groups are located to cover vital areas: The mineral lines and production buildings. All buildings need to be placed compactly so that they are easy to defend, and they have to leave movement corridors so that mobile anti-air units can move quickly to counter the mutas. It’s not easy. Defense against strong muta harass needs skills.

Valkyries are potentially a sound defense against the mutas, but bots as far as I have seen do not use valkyries well. The valks act independently rather than together and move carelessly away from air defense, and Monster’s aggressive scourge shoots down too many; I haven’t seen an exception yet. Also, apparently terran bots select a single mutalisk as their valk target (this is my interpretation from watching replays), and Monster instantly separates the targeted muta from the flock so that only it takes damage. I suggest keeping valks in range of anti-air (whether static or mobile) for safety from scourge and trying Valkyrie patrol micro to in effect attack a location rather than a unit while staying safe. If that works, it should force the whole flock to scatter, rather than only forcing a single muta to split off. But I haven’t seen it tried, so I’m not sure it will work.

representing the strategic situation

It’s possible to reason about Starcraft games straight from the details, without explicit abstractions. One of the promises of deep learning is that it allows you to get away with that, to feed concrete facts into a black box and get back concrete actions (so that all the abstraction is hidden inside the black box), and deep learning is not the only way. I believe that for a complex game, it’s probably better to introduce abstractions and do much of your reasoning at an abstract level. For Steamhammer, I selected four levels of abstraction, which I call strategy, operations, tactics, and micro (the breakdown is not perfectly cleanly implemented right now, but I keep moving in that direction). This post is about strategy and how to represent it as a data structure.

The three classic elements of a Starcraft strategic situation are economy, technology, and army. I would like to add a fourth element (less emphasized in abstract theory but never overlooked in practical discussions), production facilities. These four elements are the things that you can spend resources to buy, and that you have to trade off as you make each spending decision. (Do I spend this on another dragoon, or a second gateway so I can later make dragoons faster, or do I get dragoon range?) I’ll call them the elements of strategy—that simplifies the meaning of the word “strategy,” but let’s go with it. If you can fully represent the four elements, then you can fully represent the strategic situation of a game.

You have only partial information about what the opponent is doing, but you know all about your own state. You may not want to store both using the same datatype, though you could. It’s elegant if both have the same underlying model, at least. You might store exact values for your own strategic situation and probability distributions over the same values for the enemy’s situation, for example. It’s ideal to use all available information to estimate the enemy values: Scouting info, feasibility calculations (“at this frame it’s not possible to have more than n workers”), past behavior saved in learning files, hand-coded constraints from reading the enemy bot’s code....
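
Here is a minimal sketch of the shared-model idea, assuming a discrete distribution stored as a Python dict from value to probability. Your own value is a distribution with all its weight on one point, and a feasibility calculation trims the enemy estimate. The function names, the worker counts, and the cap of 12 are invented for illustration; they are not taken from Steamhammer or any other bot.

```python
# Hypothetical sketch: one underlying model for both players.
# A known value puts all probability on one point; an estimate spreads it.

def exact(value):
    """Your own state: a single value known with certainty."""
    return {value: 1.0}

def apply_upper_bound(dist, max_feasible):
    """Feasibility constraint: drop values above the bound, then renormalize."""
    kept = {v: p for v, p in dist.items() if v <= max_feasible}
    total = sum(kept.values())
    if total == 0:
        raise ValueError("the bound rules out every value in the estimate")
    return {v: p / total for v, p in kept.items()}

def mean(dist):
    """Expected value of a distribution."""
    return sum(v * p for v, p in dist.items())

# Illustrative numbers only.
my_workers = exact(12)
enemy_workers = {8: 0.25, 10: 0.25, 12: 0.25, 14: 0.25}
# Suppose a feasibility calculation says no more than 12 workers are possible.
enemy_workers = apply_upper_bound(enemy_workers, 12)
```

Scouting info, learning files, and hand-coded constraints could each be folded in the same way, as further updates to the enemy distribution.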

If you know how the strategy elements varied throughout a game, then you know a lot about what happened in the game. You don’t know the maneuvers, but you know who won and you know the ebb and flow of the battles.

economy

I think there are 3 things you’d like to know about a player’s Starcraft economy. 1. Total minerals and gas mined so far. 2. Current rate of mining minerals and gas. 3. Predictions of the future rate of mining minerals and gas, perhaps under varying assumptions so you can choose a course of action. For example, if the map is running low on minerals after a long game, it’s likely wasteful to make more workers.

For 1 and 2, it’s easy to keep track of your own current state. For the enemy, there are occasions when you can get exact values for 1 and 2, but usually you’ll have to estimate. And 3 is always an estimation problem. Data useful for making the estimates includes number of workers, number of mineral patches, and so on. For the enemy, you have to estimate those too. (If you don’t have an estimate of the number of enemy workers then you won’t be able to decide on the priority of killing a worker versus another unit, a question that falls under 3.)
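
Items 1 and 2 can be kept with very little state. The sketch below assumes the bot records (frame, cumulative total mined) samples for one resource and derives the current rate from a sliding window. The class name, the window length, and the sample values are hypothetical. For the enemy, the samples would themselves be estimates.

```python
from collections import deque

class MiningRate:
    """Track cumulative mining and the recent mining rate for one resource."""

    def __init__(self, window_frames):
        self.window = window_frames
        self.samples = deque()  # (frame, cumulative total mined)

    def record(self, frame, total_mined):
        """Add a sample and forget samples older than the window."""
        self.samples.append((frame, total_mined))
        while frame - self.samples[0][0] > self.window:
            self.samples.popleft()

    def total(self):
        """Item 1: total mined so far."""
        return self.samples[-1][1] if self.samples else 0

    def per_frame(self):
        """Item 2: average amount mined per frame over the window."""
        if len(self.samples) < 2:
            return 0.0
        (f0, m0), (f1, m1) = self.samples[0], self.samples[-1]
        if f1 == f0:
            return 0.0
        return (m1 - m0) / (f1 - f0)
```

Item 3, prediction, would need a model of workers and mineral patches on top of this, which the sketch does not attempt.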

technology

A command center is a tech building, because it allows you to build SCVs. (It’s also a production building.) A tech state is simply a set of tech buildings, research, and upgrades: These things are the tech that I have; I can make (say) SCVs and marines and turrets because I have a command center and a barracks and an ebay (engineering bay). I’ll call the set that represents a given tech state the tech set.

The set of tech states is partially ordered by the relation of teching: If you can tech from A (I can make marines) to B (I can make marines and medics and firebats because I added an academy), then you can say B > A. The partial order gives you a bit of mathematical leverage to reason about tech states (I see a medic, the enemy must have an academy—Steamhammer does this reasoning). The enemy can destroy buildings, though. Destroy the barracks and you’re in a state C with academy but no barracks, which cannot be reached by teching from the start state. C is neither greater than nor less than A, but B > C because you can tech from C back to B by rebuilding the barracks.
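
A small sketch of both ideas, using a handful of terran prerequisites. The function names and the table layout are my own for this example and are not Steamhammer’s code. The first function does the "I see a medic" inference, bearing in mind that an implied building may since have been destroyed. The second tests whether one tech set can be reached from another by teching. Note that it also returns True when the two sets are equal, so it is the "greater than or equal" version of the relation.

```python
# A few terran prerequisites (buildings only; the full table is much larger).
BUILDING_REQUIRES = {
    "Command Center": set(),
    "Barracks": {"Command Center"},
    "Engineering Bay": {"Command Center"},
    "Academy": {"Barracks"},
}
UNIT_REQUIRES = {
    "SCV": {"Command Center"},
    "Marine": {"Barracks"},
    "Medic": {"Barracks", "Academy"},
    "Firebat": {"Barracks", "Academy"},
}

def implied_buildings(seen_units):
    """Buildings the enemy must have built at some point to own these units."""
    implied = set()
    stack = [b for unit in seen_units for b in UNIT_REQUIRES[unit]]
    while stack:
        building = stack.pop()
        if building not in implied:
            implied.add(building)
            stack.extend(BUILDING_REQUIRES[building])
    return implied

def can_tech(start, goal):
    """True if goal is reachable from start by only adding buildings."""
    start, goal = set(start), set(goal)
    if not start <= goal:
        return False  # teching never removes anything
    have = set(start)
    missing = goal - have
    while missing:
        buildable = {b for b in missing if BUILDING_REQUIRES[b] <= have}
        if not buildable:
            return False  # a prerequisite is absent from the goal state
        have |= buildable
        missing -= buildable
    return True

A = {"Command Center", "Barracks"}
B = {"Command Center", "Barracks", "Academy"}
C = {"Command Center", "Academy"}  # the barracks was destroyed
```

With these sets, can_tech(A, B) and can_tech(C, B) hold, while neither can_tech(A, C) nor can_tech(C, A) does, matching the ordering described above.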

production

The production buildings are those that make units. The terran production buildings are command center, barracks, factory, starport. The more copies you have of a production building, the faster you can make units—provided you have the tech and the economy for them. There are complications for every race. Terran can spend resources to repair. Protoss reavers and carriers rely on scarabs and interceptors, which cost minerals. The only zerg production building is the hatchery (and lair and hive), but hydras may be able to morph into lurkers and mutas may be able to morph into guardians or devourers. These are all spending decisions, so they count under my definition of strategy.

You can represent your terran production state as a count of each of CCs, barracks, factories, starports. For protoss, you might want to include reavers and carriers as production facilities, or you might ignore them as irrelevant for the strategic reasoning you care about. For zerg, you may want to count hydras as production facilities if you have lurker tech and mutas if you have a greater spire, and if you’re reasoning about details you should count larvas too.

army

Your army state is simply your count of units of each type. If you care, you might also keep health information. The hard thing about armies is not knowing what they are, it is comparing their strengths and knowing what they can do.

The four strategy elements are all interrelated. Production facilities you see can help you predict the enemy’s unit mix, and enemy units you see can help you predict their production state. Economy and production tell you how fast an army can grow. You get the idea.
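
To make the representation concrete, here is one possible shape for the four elements as a single record, with the zerg production rule from the production section written out. It is only a sketch: the class, the field names, and the choice of strings as keys are assumptions for illustration, and larvas are left out.

```python
from collections import Counter
from dataclasses import dataclass, field

@dataclass
class StrategicState:
    # economy (totals only; rates and predictions are omitted here)
    minerals_mined: int = 0
    gas_mined: int = 0
    # technology: the tech set of buildings, research, and upgrades
    tech: frozenset = frozenset()
    # production: count of each production building
    production: Counter = field(default_factory=Counter)
    # army: count of each unit type
    army: Counter = field(default_factory=Counter)

def zerg_production(state):
    """Count zerg production facilities, following the rule in the text."""
    facilities = Counter()
    # Hatchery, lair, and hive all produce larvas.
    facilities["Hatchery"] = sum(
        state.production[b] for b in ("Hatchery", "Lair", "Hive"))
    # Hydras count only with lurker tech, mutas only with a greater spire.
    if "Lurker Aspect" in state.tech:
        facilities["Hydralisk"] = state.army["Hydralisk"]
    if "Greater Spire" in state.tech:
        facilities["Mutalisk"] = state.army["Mutalisk"]
    return facilities
```

An enemy state could use the same fields with estimates in place of exact counts.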

new bot Monster

New zerg bot Monster has been going Godzilla on the opposition. As I write, it has 58 wins and 4 losses on BASIL (since it is unranked, it is facing opponents of all levels). Its wins include tough enemies like Iron (on Circuit Breaker). Its win rate on SSCAIT is “only” 31-8 as it is being voted tougher opponents on average. The losses on BASIL are to PurpleWave, Krasi0 (twice), and the expert zealot rusher Wuli.

I have yet to see Monster vary its early game build orders, though perhaps it simply hasn’t lost enough games to feel the need to. Versus terran, Monster likes three hatch mutalisk, making only one pair of zerglings at first. Krasi0 earned its 2 wins with proxy rax, beating the greedy build with fast aggression. Against protoss, Monster likes overpool followed by 11 hatchery, a standard build. It gets a hydra den early but does not always make use of it. Here is a win over Locutus where it does make hydras. Versus zerg, Monster likes overpool 9 gas, also one of Steamhammer’s favorite starts and difficult for bots to counter.

Monster is a complex bot with many skills. It appears to adapt its army size, unit mix, and static defense to the game situation. It has nice micro with zerglings, hydralisks, mutalisks, and scourge. (Though I judge McRaveZ’s muta micro is better. See this loss vs McRaveZ from SSCAIT; Monster won the rematch.) It can make queens with broodling, though I haven’t seen it make defilers. It knows how to position a sunken and block its ramp with zerglings to stop a vulture runby cold; see the Iron game above. Like ZZZKBot, when scouting for the location of a zerg opponent, it knows to discount a base when it does not see the creep; it does not have to scout farther to see that there is no hatchery. (See this win over Microwave for an example; watch how early the overlord turns away from its first scouting destination. Also notice Monster’s zergling formations.)

Monster still has a lot of headroom. I immediately saw inefficiencies in its build orders and weaknesses in its play. Its results say that its strengths are bigger than its weaknesses, though. I imagine it must have been thoroughly tested against a range of opponents to gain so many skills with such small loopholes.

Monster gg’s early when losing. I haven’t seen another bot surrender as quickly. There is an advantage to giving up early in testing: You can get more games in, iterate faster, and end up with a stronger bot. Of course the advantage doesn’t show in serious games, but if the gg is accurate then it doesn’t hurt.

Peering into the binary, I am impressed with Monster’s scope. The file I downloaded from SSCAIT is a 2.8MB .exe, pointing to a complex project that must have taken a long time to develop. It uses BWEB. I see a JSON parsing library and signs of a config file that is not included in the SSCAIT download. I see strings suggesting many skills that I have not yet noticed in games, such as scarab dodging.

AIIDE 2020 - various versus DaQin

I added parsing for DaQin’s files, which was little effort. I decided to dump all of DaQin’s analysis into a single post, because the tables aren’t that rich in information. Now I’m able to move on to other topics. I put the opponents on the left, so that in all cases, blue is good for the opponent and red is good for DaQin.

bananabrain strategies versus daqin strategies

bananabrain         overall     2GateDT     3GateDT     4GateGoon
overall             99/150 66%  9/14 64%    53/89 60%   37/47 79%
PvP_10/12gate       7/10 70%    -           2/5 40%     5/5 100%
PvP_12nexus         4/7 57%     1/1 100%    2/4 50%     1/2 50%
PvP_2gatedt         14/16 88%   1/1 100%    7/9 78%     6/6 100%
PvP_2gatedtexpo     10/14 71%   1/2 50%     5/7 71%     4/5 80%
PvP_2gatereaver     13/16 81%   1/1 100%    5/7 71%     7/8 88%
PvP_3gaterobo       8/13 62%    2/2 100%    5/7 71%     1/4 25%
PvP_3gatespeedzeal  2/7 29%     0/1 0%      1/5 20%     1/1 100%
PvP_4gategoon       4/8 50%     1/1 100%    2/6 33%     1/1 100%
PvP_9/9gate         15/16 94%   -           11/11 100%  4/5 80%
PvP_9/9proxygate    6/10 60%    1/1 100%    2/6 33%     3/3 100%
PvP_nzcore          7/11 64%    1/1 100%    4/8 50%     2/2 100%
PvP_zcore           3/7 43%     0/2 0%      3/5 60%     -
PvP_zcorez          2/7 29%     -           2/4 50%     0/3 0%
PvP_zzcore          4/8 50%     0/1 0%      2/5 40%     2/2 100%

Reading DaQin’s openings out of its configuration file, I see that 2GateDT makes 2 dark templar out of the promised 2 gateways, adds 3 cannons in front of its natural, then expands. As a PvP build, that strikes me as illogical (you might want one cannon if the enemy also has dark templar). 3GateDT makes one gate, gets dragoons and dragoon range, adds a second gate and a citadel, and then the predefined build order ends—the rest is left to the strategy manager. That seems sensible as far as it goes, but does the strategy manager regularly add a third gate and make DTs as promised, or is the name of the opening a lie? See below for BananaBrain’s opinion on the question. In any case, 3GateDT is the opening that gave BananaBrain the most trouble.

bananabrain as seen by daqin

bananabrain played  #    daqin recognized
PvP_10/12gate       10   10 Fast rush
PvP_12nexus         7    5 Fast rush | 1 Safe expand | 1 Naked expand
PvP_2gatedt         16   16 Fast rush
PvP_2gatedtexpo     14   13 DarkTemplar rush | 1 Unknown
PvP_2gatereaver     16   16 DarkTemplar rush
PvP_3gaterobo       13   13 DarkTemplar rush
PvP_3gatespeedzeal  7    6 Fast rush | 1 Unknown
PvP_4gategoon       8    5 DarkTemplar rush | 1 Naked expand | 1 Unknown | 1 Fast rush
PvP_9/9gate         16   16 Fast rush
PvP_9/9proxygate    10   9 Fast rush | 1 Proxy
PvP_nzcore          11   8 DarkTemplar rush | 1 Not fast rush | 1 Naked expand | 1 Unknown
PvP_zcore           7    7 DarkTemplar rush
PvP_zcorez          7    5 DarkTemplar rush | 2 Not fast rush
PvP_zzcore          8    5 DarkTemplar rush | 2 Proxy | 1 Not fast rush

DaQin recognizes 9-9 gate as Fast rush, but also the economy-first 10-12 gate and even the fast expand 12 nexus. What BananaBrain calls a reaver build, DaQin sees as a dark templar rush. Strategy recognition has some odd results.

daqin as seen by bananabrain

daqin played  #    bananabrain recognized
2GateDT       14   12 P_1gatecore | 2 P_unknown
3GateDT       89   45 P_1gatecore | 32 P_4gategoon | 11 P_unknown | 1 P_ffe
4GateGoon     47   36 P_4gategoon | 9 P_1gatecore | 2 P_unknown

This suggests that DaQin’s 3GateDT was often not a dark templar build at all.

mcrave strategies versus daqin strategies

mcrave                         overall     ForgeExpand5GateGoon  ForgeExpandSpeedlots
overall                        97/150 65%  3/3 100%              94/147 64%
PoolHatch,Overpool,2HatchMuta  97/150 65%  3/3 100%              94/147 64%

Not a lot of strategic variety here.

mcrave as seen by daqin

mcrave played                  #    daqin recognized
PoolHatch,Overpool,2HatchMuta  150  117 Not fast rush | 28 Heavy rush | 5 Unknown

daqin as seen by mcrave

daqin played          #    mcrave recognized
ForgeExpand5GateGoon  3    3 FFE,Forge,5GateGoon
ForgeExpandSpeedlots  147  121 FFE,Forge,Speedlot | 21 FFE,Nexus,Speedlot | 2 FFE,Forge,5GateGoon | 2 FFE,Forge,ZealotArchon | 1 FFE,Gateway,Speedlot

microwave strategies versus daqin strategies

microwave            overall      4GateGoon   ForgeExpand5GateGoon  ForgeExpandSpeedlots
overall              125/150 83%  3/11 27%    3/3 100%              119/136 88%
1HatchMuta_Sparkle   56/62 90%    0/1 0%      -                     56/61 92%
3HatchLingBust       11/17 65%    2/4 50%     1/1 100%              8/12 67%
3HatchMuta           53/59 90%    0/2 0%      2/2 100%              51/55 93%
3HatchMutaExpo       5/9 56%      1/4 25%     -                     4/5 80%
3HatchPoolHydraExpo  0/1 0%       -           -                     0/1 0%
9Pool                0/1 0%       -           -                     0/1 0%
OverpoolLurker       0/1 0%       -           -                     0/1 0%

Why did DaQin play its most successful opening by far, 4GateGoon, in only 11 of 150 games? It is not that it discovered the opening late; it played it first in game 10 of 150, and won that game. It immediately played it again and lost, but soon played it a third time and won again. It surely wasn’t confused by too many choices. Either there was a bug, or some built-in bias in DaQin’s decisions led it astray.

microwave as seen by daqin

microwave played     #    daqin recognized
1HatchMuta_Sparkle   62   34 Not fast rush | 19 Heavy rush | 7 Unknown | 2 Proxy
3HatchLingBust       17   12 Not fast rush | 4 Heavy rush | 1 Proxy
3HatchMuta           59   48 Not fast rush | 7 Heavy rush | 4 Proxy
3HatchMutaExpo       9    8 Not fast rush | 1 Proxy
3HatchPoolHydraExpo  1    1 Not fast rush
9Pool                1    1 Fast rush
OverpoolLurker       1    1 Unknown

daqin as seen by microwave

daqin played          #    microwave recognized
4GateGoon             11   8 Unknown | 2 HeavyRush | 1 Proxy
ForgeExpand5GateGoon  3    1 SafeExpand | 1 Turtle | 1 NakedExpand
ForgeExpandSpeedlots  136  68 Turtle | 22 HeavyRush | 20 SafeExpand | 16 NakedExpand | 10 Unknown

steamhammer strategies versus daqin strategies

steamhammer            overall     ForgeExpand5GateGoon  ForgeExpandSpeedlots
overall                33/150 22%  29/136 21%            4/14 29%
10HatchLing            0/1 0%      0/1 0%                -
11Gas10PoolLurker      0/1 0%      0/1 0%                -
12-12Hatch             0/1 0%      0/1 0%                -
12Hatch_4HatchLing     0/2 0%      0/2 0%                -
2.5HatchMuta           0/1 0%      0/1 0%                -
2HatchHydraBust        0/2 0%      0/1 0%                0/1 0%
3HatchHydra            0/2 0%      0/2 0%                -
3HatchHydraBust        0/3 0%      0/2 0%                0/1 0%
3HatchHydraExpo        0/1 0%      0/1 0%                -
3HatchLateHydras+1     0/1 0%      0/1 0%                -
3HatchLing             26/59 44%   24/52 46%             2/7 29%
3HatchLingBust2        0/2 0%      0/2 0%                -
4HatchBeforeGas        5/25 20%    3/23 13%              2/2 100%
4HatchBeforeLair       0/1 0%      0/1 0%                -
5HatchBeforeGas        0/2 0%      -                     0/2 0%
5HatchPool             0/1 0%      0/1 0%                -
5PoolHard2Player       0/1 0%      0/1 0%                -
5Scout                 0/1 0%      0/1 0%                -
973HydraBust           0/4 0%      0/3 0%                0/1 0%
9Pool8GasLurker        0/1 0%      0/1 0%                -
9PoolHatchSpeed        0/1 0%      0/1 0%                -
9PoolHatchSpeedSpire2  0/1 0%      0/1 0%                -
9PoolHatchSpire        0/1 0%      0/1 0%                -
9PoolSpireSlowlings    0/1 0%      0/1 0%                -
9PoolSunkHatch         0/1 0%      0/1 0%                -
AntiFact_2Hatch        0/1 0%      0/1 0%                -
AntiFact_Overpool9Gas  0/1 0%      0/1 0%                -
AntiFactory2           0/1 0%      0/1 0%                -
Over10Hatch1Sunk       0/1 0%      0/1 0%                -
OverhatchExpoMuta      0/3 0%      0/3 0%                -
OverpoolSpeed          0/1 0%      0/1 0%                -
OverpoolTurtle 0       0/2 0%      0/2 0%                -
Proxy8HatchNatural     0/1 0%      0/1 0%                -
Sparkle 3HatchMuta     1/6 17%     1/6 17%               -
ZvP_2HatchMuta         0/1 0%      0/1 0%                -
ZvP_3BaseSpire+Den     0/1 0%      0/1 0%                -
ZvP_3HatchPoolHydra    1/7 14%     1/7 14%               -
ZvT_2HatchMuta         0/1 0%      0/1 0%                -
ZvT_3HatchMuta         0/1 0%      0/1 0%                -
ZvT_7Pool              0/1 0%      0/1 0%                -
ZvZ_12PoolLing         0/1 0%      0/1 0%                -
ZvZ_12PoolLingB        0/2 0%      0/2 0%                -
ZvZ_Overpool11Gas      0/1 0%      0/1 0%                -

steamhammer as seen by daqin

steamhammer played     #    daqin recognized
10HatchLing            1    1 Unknown
11Gas10PoolLurker      1    1 Heavy rush
12-12Hatch             1    1 Not fast rush
12Hatch_4HatchLing     2    2 Heavy rush
2.5HatchMuta           1    1 Not fast rush
2HatchHydraBust        2    1 Hydra bust | 1 Not fast rush
3HatchHydra            2    2 Not fast rush
3HatchHydraBust        3    2 Not fast rush | 1 Heavy rush
3HatchHydraExpo        1    1 Not fast rush
3HatchLateHydras+1     1    1 Not fast rush
3HatchLing             59   40 Not fast rush | 16 Heavy rush | 3 Unknown
3HatchLingBust2        2    1 Not fast rush | 1 Unknown
4HatchBeforeGas        25   24 Not fast rush | 1 Unknown
4HatchBeforeLair       1    1 Not fast rush
5HatchBeforeGas        2    2 Not fast rush
5HatchPool             1    1 Not fast rush
5PoolHard2Player       1    1 Fast rush
5Scout                 1    1 Not fast rush
973HydraBust           4    4 Not fast rush
9Pool8GasLurker        1    1 Heavy rush
9PoolHatchSpeed        1    1 Heavy rush
9PoolHatchSpeedSpire2  1    1 Fast rush
9PoolHatchSpire        1    1 Heavy rush
9PoolSpireSlowlings    1    1 Heavy rush
9PoolSunkHatch         1    1 Fast rush
AntiFact_2Hatch        1    1 Not fast rush
AntiFact_Overpool9Gas  1    1 Not fast rush
AntiFactory2           1    1 Heavy rush
Over10Hatch1Sunk       1    1 Heavy rush
OverhatchExpoMuta      3    3 Not fast rush
OverpoolSpeed          1    1 Heavy rush
OverpoolTurtle 0       2    2 Heavy rush
Proxy8HatchNatural     1    1 Heavy rush
Sparkle 3HatchMuta     6    6 Not fast rush
ZvP_2HatchMuta         1    1 Not fast rush
ZvP_3BaseSpire+Den     1    1 Heavy rush
ZvP_3HatchPoolHydra    7    4 Not fast rush | 1 Heavy rush | 1 Hydra bust | 1 Unknown
ZvT_2HatchMuta         1    1 Not fast rush
ZvT_3HatchMuta         1    1 Not fast rush
ZvT_7Pool              1    1 Fast rush
ZvZ_12PoolLing         1    1 Not fast rush
ZvZ_12PoolLingB        2    2 Not fast rush
ZvZ_Overpool11Gas      1    1 Heavy rush

daqin as seen by steamhammer

daqin played          #    steamhammer recognized
ForgeExpand5GateGoon  136  79 Turtle | 41 Safe expand | 10 Heavy rush | 5 Naked expand | 1 Unknown
ForgeExpandSpeedlots  14   7 Safe expand | 6 Turtle | 1 Unknown

ecgberht strategies versus daqin strategies

ecgberht       overall   12NexusCarriers  3GateDT
overall        1/150 1%  1/2 50%          0/148 0%
14CC           0/32 0%   -                0/32 0%
FullMech       0/29 0%   0/1 0%           0/28 0%
JoyORush       0/28 0%   -                0/28 0%
MechGreedyFE   0/28 0%   -                0/28 0%
ProxyEightRax  1/33 3%   1/1 100%         0/32 0%

ecgberht as seen by daqin

ecgberht played  #    daqin recognized
14CC             32   28 Safe expand | 2 Naked expand | 2 Unknown
FullMech         29   27 Factory | 2 Not fast rush
JoyORush         28   27 Factory | 1 Unknown
MechGreedyFE     28   12 Unknown | 9 Safe expand | 7 Not fast rush
ProxyEightRax    33   27 Fast rush | 5 Not fast rush | 1 Proxy

daqin as seen by ecgberht

daqin played     #    ecgberht recognized
12NexusCarriers  2    2 Unknown
3GateDT          148  148 Unknown

the final game of the ASL 10 finals

Yesterday’s ASL 10 finals was ZvZ with Zero (aka Queen) versus Soma in a best of 7. It’s worth seeing for the highly entertaining last game.

The match was in a tense situation. The game started... then hit technical difficulties, and after a long delay had to be restarted, ratcheting up the tension. Then the game itself was a spectacular knife fight with an exciting finish. Recommended.

AIIDE 2020 - Microwave versus Steamhammer

Microwave played more different openings than Steamhammer (no doubt seeking a winning choice), so I put it on the left. Blue is good for Microwave, red is good for Steamhammer.

microwave strategies versus steamhammer strategies

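microwave             overall     versus each steamhammer opening faced
overall               43/150 29%  6PoolBurrow 1/1 100% | 8-8HydraRush 1/1 100% | 9Hatch8Pool 4/5 80% | 9PoolHatchSpeedSpire 1/1 100% | OverhatchLing 1/1 100% | OverpoolBurrow 1/1 100% | ZvZ_12HatchExpo 3/5 60% | ZvZ_12PoolLing 4/11 36% | ZvZ_12PoolMain 2/3 67% | ZvZ_Overpool11Gas 12/44 27% | ZvZ_Overpool9Gas 7/64 11% | ZvZ_OverpoolTurtle 6/13 46%
10Hatch9Pool9gas      4/9 44%     9Hatch8Pool 0/1 0% | ZvZ_Overpool11Gas 2/2 100% | ZvZ_Overpool9Gas 1/5 20% | ZvZ_OverpoolTurtle 1/1 100%
10HatchMain9Pool9Gas  1/4 25%     ZvZ_Overpool11Gas 0/1 0% | ZvZ_Overpool9Gas 1/2 50% | ZvZ_OverpoolTurtle 0/1 0%
10HatchTurtleHydra    0/1 0%      ZvZ_Overpool9Gas 0/1 0%
11HatchTurtleMuta     0/1 0%      ZvZ_OverpoolTurtle 0/1 0%
12HatchMain           0/1 0%      ZvZ_Overpool11Gas 0/1 0%
12Pool                5/25 20%    OverhatchLing 1/1 100% | ZvZ_12PoolLing 0/3 0% | ZvZ_12PoolMain 2/2 100% | ZvZ_Overpool11Gas 2/7 29% | ZvZ_Overpool9Gas 0/10 0% | ZvZ_OverpoolTurtle 0/2 0%
12PoolMain            1/5 20%     9Hatch8Pool 1/1 100% | ZvZ_Overpool11Gas 0/2 0% | ZvZ_Overpool9Gas 0/2 0%
2HatchLurker          0/2 0%      ZvZ_Overpool11Gas 0/1 0% | ZvZ_Overpool9Gas 0/1 0%
3HatchHydraBust       0/1 0%      ZvZ_Overpool9Gas 0/1 0%
3HatchHydraExpo       0/2 0%      ZvZ_Overpool11Gas 0/1 0% | ZvZ_Overpool9Gas 0/1 0%
3HatchPoolHydra       0/2 0%      ZvZ_12PoolLing 0/1 0% | ZvZ_Overpool11Gas 0/1 0%
4HatchPoolHydra       0/1 0%      ZvZ_Overpool9Gas 0/1 0%
5Pool                 0/4 0%      ZvZ_12PoolLing 0/1 0% | ZvZ_Overpool11Gas 0/1 0% | ZvZ_Overpool9Gas 0/2 0%
5PoolSpeed            1/3 33%     8-8HydraRush 1/1 100% | ZvZ_12PoolLing 0/1 0% | ZvZ_Overpool11Gas 0/1 0%
7Pool                 0/1 0%      ZvZ_Overpool9Gas 0/1 0%
7PoolHydraLingRush7D  0/1 0%      ZvZ_Overpool9Gas 0/1 0%
9Hatch9Pool9Gas       0/1 0%      ZvZ_Overpool9Gas 0/1 0%
9HatchTurtleHydra     0/1 0%      ZvZ_12HatchExpo 0/1 0%
9PoolGasHatchSpeed8D  0/1 0%      ZvZ_Overpool11Gas 0/1 0%
9PoolHatch            0/2 0%      ZvZ_Overpool9Gas 0/2 0%
9PoolSpeed            17/31 55%   6PoolBurrow 1/1 100% | 9Hatch8Pool 3/3 100% | 9PoolHatchSpeedSpire 1/1 100% | ZvZ_12HatchExpo 1/1 100% | ZvZ_12PoolLing 1/1 100% | ZvZ_Overpool11Gas 4/6 67% | ZvZ_Overpool9Gas 4/13 31% | ZvZ_OverpoolTurtle 2/5 40%
9PoolSpeedLing        0/1 0%      ZvZ_12HatchExpo 0/1 0%
9PoolSunken           0/7 0%      ZvZ_12PoolMain 0/1 0% | ZvZ_Overpool11Gas 0/3 0% | ZvZ_Overpool9Gas 0/3 0%
OverpoolSpeed         1/3 33%     OverpoolBurrow 1/1 100% | ZvZ_Overpool11Gas 0/1 0% | ZvZ_Overpool9Gas 0/1 0%
ZvP_11Hatch10Pool     2/4 50%     ZvZ_12HatchExpo 1/1 100% | ZvZ_Overpool11Gas 0/1 0% | ZvZ_Overpool9Gas 1/2 50%
ZvP_2HatchHydra       0/9 0%      ZvZ_Overpool11Gas 0/4 0% | ZvZ_Overpool9Gas 0/5 0%
ZvP_9Hatch9Pool       0/1 0%      ZvZ_Overpool11Gas 0/1 0%
ZvZ_Overgas11Pool     10/20 50%   ZvZ_12HatchExpo 1/1 100% | ZvZ_12PoolLing 3/4 75% | ZvZ_Overpool11Gas 4/9 44% | ZvZ_Overpool9Gas 0/4 0% | ZvZ_OverpoolTurtle 2/2 100%
ZvZ_Overpool11Gas     0/2 0%      ZvZ_Overpool9Gas 0/2 0%
ZvZ_Overpool9Gas      1/4 25%     ZvZ_Overpool9Gas 0/3 0% | ZvZ_OverpoolTurtle 1/1 100%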

Steamhammer’s ZvZ_Overpool9Gas opening was successful against all Microwave tries, but notice that it was the only one: Flecks of blue, or entire streaks, crept into every other Steamhammer attempt. The end result does not look close, but in fact Microwave would have needed only a small increment of skill to turn it around; there was only one strategy it was unprepared to face.

microwave as seen by steamhammer

microwave played      #    steamhammer recognized
10Hatch9Pool9gas      9    5 Naked expand | 3 Heavy rush | 1 Unknown
10HatchMain9Pool9Gas  4    3 Unknown | 1 Turtle
10HatchTurtleHydra    1    1 Naked expand
11HatchTurtleMuta     1    1 Heavy rush
12HatchMain           1    1 Unknown
12Pool                25   17 Naked expand | 5 Heavy rush | 3 Unknown
12PoolMain            5    3 Heavy rush | 2 Unknown
2HatchLurker          2    2 Naked expand
3HatchHydraBust       1    1 Naked expand
3HatchHydraExpo       2    1 Naked expand | 1 Heavy rush
3HatchPoolHydra       2    1 Naked expand | 1 Heavy rush
4HatchPoolHydra       1    1 Heavy rush
5Pool                 4    4 Fast rush
5PoolSpeed            3    3 Fast rush
7Pool                 1    1 Fast rush
7PoolHydraLingRush7D  1    1 Unknown
9Hatch9Pool9Gas       1    1 Naked expand
9HatchTurtleHydra     1    1 Heavy rush
9PoolGasHatchSpeed8D  1    1 Heavy rush
9PoolHatch            2    1 Unknown | 1 Heavy rush
9PoolSpeed            31   21 Unknown | 7 Naked expand | 3 Heavy rush
9PoolSpeedLing        1    1 Naked expand
9PoolSunken           7    4 Unknown | 3 Heavy rush
OverpoolSpeed         3    2 Unknown | 1 Heavy rush
ZvP_11Hatch10Pool     4    3 Naked expand | 1 Heavy rush
ZvP_2HatchHydra       9    6 Heavy rush | 2 Turtle | 1 Naked expand
ZvP_9Hatch9Pool       1    1 Naked expand
ZvZ_Overgas11Pool     20   19 Unknown | 1 Turtle
ZvZ_Overpool11Gas     2    2 Unknown
ZvZ_Overpool9Gas      4    4 Unknown

To play ZvZ truly well, Steamhammer needs a more detailed understanding of enemy builds. But even with this crude breakdown, I notice that most of the blue spots are associated with misunderstanding the main idea of Microwave’s play. On the other hand, many misunderstandings also show as red.

steamhammer as seen by microwave

steamhammer played    #    microwave recognized
6PoolBurrow           1    1 FastRush
8-8HydraRush          1    1 Unknown
9Hatch8Pool           5    4 HeavyRush | 1 Unknown
9PoolHatchSpeedSpire  1    1 NakedExpand
OverhatchLing         1    1 HeavyRush
OverpoolBurrow        1    1 NakedExpand
ZvZ_12HatchExpo       5    5 NakedExpand
ZvZ_12PoolLing        11   8 HeavyRush | 2 Unknown | 1 NakedExpand
ZvZ_12PoolMain        3    3 HeavyRush
ZvZ_Overpool11Gas     44   36 Turtle | 5 Unknown | 3 NakedExpand
ZvZ_Overpool9Gas      64   51 Turtle | 8 Unknown | 5 NakedExpand
ZvZ_OverpoolTurtle    13   13 Turtle

The builds recognized as Turtle genuinely are turtle builds. They get mutalisks fast at the expense of weakness to zergling attack, which they compensate for with sunkens instead of a second hatchery. From the meta-strategy point of view, Steamhammer usually defeats Microwave in games where Steamhammer gains air superiority early, so Steamhammer’s choices make sense.

risky openings from data

I want to take a day off from analyzing AIIDE data to join in a conversation. From comments to Dragon vs Ecgberht:

Tully Elliston: Using learning to track win:loss %, and having a risk rating for each build (if win rate is above this level, don’t select this build unless it has won at least 1 game against this opponent) could actually be a very useful tool.

You can throw in lots of ridiculous polarised builds, and still ensure they won’t get accidentally selected when beating down a 4pool bot 100 times in a row.

MarcaDBAA: Yes, or give these builds, you don´t want to use at first, some default pseudo-losses, so that they will only be selected after other builds fail to win.

In BananaBrain versus Ecgberht I had mentioned that BananaBrain’s few losses to Ecgberht were due to unnecessarily playing a risky build. The commenters suggest two ways of marking builds as risky.
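
The two suggestions can be sketched side by side. Everything here is illustrative: the record layout, the field names risk_rating and pseudo_losses, and the add-one smoothing that starts an untried opening at 50% are my assumptions, not anything taken from the commenters or from a real bot.

```python
def risk_rating_allows(record, overall_win_rate):
    """First suggestion: hold back a rated build while the overall win rate
    against this opponent is above its rating, unless the build has already
    won at least one game against this opponent."""
    rating = record.get("risk_rating")
    if rating is None:
        return True
    return overall_win_rate <= rating or record["wins"] >= 1

def smoothed_rate(record):
    """Second suggestion: pseudo-losses count as games already lost."""
    games = record["games"] + record.get("pseudo_losses", 0)
    # Add-one smoothing (an assumption) so an untried opening starts at 50%.
    return (record["wins"] + 1) / (games + 2)

def pick(records, overall_win_rate):
    """Greedy choice among the openings the risk ratings allow."""
    names = sorted(records)
    allowed = [n for n in names
               if risk_rating_allows(records[n], overall_win_rate)] or names
    return max(allowed, key=lambda n: smoothed_rate(records[n]))

# Invented records for one opponent.
fresh = {
    "solid": {"wins": 0, "games": 0},
    "polarised": {"wins": 0, "games": 0, "pseudo_losses": 3},
}
struggling = {
    "solid": {"wins": 0, "games": 6},
    "polarised": {"wins": 0, "games": 0, "pseudo_losses": 3},
}
```

With no games played, pick(fresh, 0.5) chooses "solid"; once the solid opening has lost six in a row, pick(struggling, 0.0) falls through to "polarised". Either mechanism keeps the polarised build out of the way while the bot is winning.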

It can be done automatically from data, in the same style as opening timing data. In fact, it’s on my to-do list. The first step is to keep track of how good each opening is on average: For each matchup, store each opening’s average win rate across all opponents. It can be done offline by adding up the numbers from the individual learning files of each opponent, or you could keep separate records. That already gives you an automatic way to select openings against opponents you have not yet learned about; there’s no more need for hand configuration.
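
The offline version of the first step might look like the sketch below, which pools wins and games per opening across opponents for one matchup. The in-memory layout (opponent name to opening to a wins/games pair) and the numbers are made up; real learning files would need their own parser.

```python
from collections import defaultdict

def pooled_win_rates(per_opponent):
    """Add up wins and games per opening over all opponents' records."""
    totals = defaultdict(lambda: [0, 0])
    for record in per_opponent.values():
        for opening, (wins, games) in record.items():
            totals[opening][0] += wins
            totals[opening][1] += games
    return {name: w / g for name, (w, g) in totals.items() if g > 0}

def default_opening(per_opponent):
    """Opening to try against an opponent with no learning file yet."""
    rates = pooled_win_rates(per_opponent)
    return max(sorted(rates), key=rates.get)

# Invented data: opponent -> opening -> (wins, games).
learned = {
    "BotA": {"9PoolSpeed": (3, 4), "OverpoolTurtle": (1, 4)},
    "BotB": {"9PoolSpeed": (1, 6), "OverpoolTurtle": (5, 6)},
}
```

Pooling weights each opponent by the number of games played against it. Averaging the per-opponent rates instead would weight every opponent equally, which may be the better choice when a few opponents dominate the game count.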

The next step is to compare how well each opening does against strong opponents versus weak opponents. If it reaches its average by beating expectations against strong opponents and falling below against weak opponents, taking into account the opponent’s strength, then it is a risky opening. If the reverse, it is a solid opening and is to be preferred against weak opponents (and if you’re ranked high, also against unknown opponents). One natural way to determine riskiness is to fit a line to the dataset this-opening’s-wins versus opponent-strength as measured by your win rates. The slope of the line tells how risky or solid the opening is. (If you have a lot of data you could fit a more complicated curve. Just make sure it strongly smooths the data.)
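
As a sketch of the line fit, assume each opponent contributes one point: x is your overall win rate against that opponent (so a low x means a strong opponent) and y is this opening’s win rate against the same opponent. Under that convention, which is my reading and not the only possible one, an opening that behaves exactly like your average has slope 1, a slope below 1 marks a risky opening, and a slope above 1 marks a solid one. The data points are invented.

```python
def fit_line(xs, ys):
    """Ordinary least squares fit of y = slope * x + intercept."""
    n = len(xs)
    if n < 2 or n != len(ys):
        raise ValueError("need two or more matching points")
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    if sxx == 0:
        raise ValueError("all opponents have the same strength")
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x

# x: overall win rate against each opponent (strong, middling, weak).
overall = [0.2, 0.5, 0.8]
# y: one opening's win rate against the same three opponents.
risky_opening = [0.4, 0.5, 0.6]    # beats expectations only against the strong
solid_opening = [0.05, 0.5, 0.95]  # crushes the weak, collapses against the strong

risky_slope, _ = fit_line(overall, risky_opening)
solid_slope, _ = fit_line(overall, solid_opening)
```

With these made-up points the risky opening has a slope of about 0.33 and the solid one a slope of 1.5. With few games per opponent the individual points are noisy, which is why the fit should smooth strongly.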

The same goes for other data about openings. For example, you can track how well each opening does on each map, and at given starting positions, and against different opponent strategies that you recognize. All the data can fold into your opening selection, without any hand configuration.

BASIL was formerly an excellent forum to collect this kind of data. But now the BASIL pairings are strongly biased toward opponents close in elo, so it is no longer a good option. Look at the crosstable for the last 30 days and notice how the white cells are laid out; unless you rank right in the middle, you can’t get a full cross-section of opponents without a long run.

AIIDE 2020 - Steamhammer versus McRave

I added parsing for Steamhammer. DaQin is nearly the same. The only remaining bot which records data that can be analyzed this way is ZZZKBot, which has a difficult file format, does not keep a recognized enemy strategy, and doesn’t bother to write a newline at the end of its file. I may skip ZZZKBot.

The Steamhammer-McRave strategy crosstable is the most interesting one yet.

steamhammer strategies versus mcrave strategies

strategy  overall  PoolHatch,12Pool,2HatchMuta  PoolHatch,12Pool,2HatchSpeedling  PoolLair,9Pool,1HatchMuta
overall  64/150 43%  17/33 52%  10/22 45%  37/95 39%
12PoolLurker  0/1 0%  -  -  0/1 0%
3HatchLingBurrow  1/5 20%  1/2 50%  -  0/3 0%
8DroneGas  7/11 64%  -  1/1 100%  6/10 60%
9HatchMain9Pool9Gas  0/2 0%  -  -  0/2 0%
9PoolHatchSpeedAllInB  0/1 0%  -  -  0/1 0%
9PoolSpire  0/2 0%  0/2 0%  -  -
Over10HatchBust  8/19 42%  7/7 100%  -  1/12 8%
Over10PoolLing  0/1 0%  -  -  0/1 0%
OverpoolSpeed  3/15 20%  1/5 20%  0/3 0%  2/7 29%
OverpoolSunk  8/21 38%  0/1 0%  2/8 25%  6/12 50%
OverpoolTurtle  11/23 48%  2/6 33%  1/1 100%  8/16 50%
ZvP_3HatchMuta  0/1 0%  -  -  0/1 0%
ZvZ_12HatchExpo  0/1 0%  -  0/1 0%  -
ZvZ_Overgas9Pool  0/2 0%  -  0/1 0%  0/1 0%
ZvZ_OverpoolTurtle  26/45 58%  6/10 60%  6/7 86%  14/28 50%

For Steamhammer, either 8DroneGas (a zergling build despite the name) or else ZvZ_OverpoolTurtle (a mutalisk build) was the best among the openings tried, while McRave’s best was the 1 hatch muta play, because only 8DroneGas (6 of 10) did better than even against it. It’s possible that switching between different kinds of builds was important, though, because the table suggests there is no game-theoretic saddle point: each side’s best choice depends on what the other side plays.

Both sides had trouble identifying the best strategies. If both had played their best strategies then the match would have come out close to 50%, while in fact Steamhammer came out behind, so Steamhammer had more trouble selecting from its excessive range of possibilities. I get the impression of a back-and-forth learning struggle.

steamhammer as seen by mcrave

steamhammer played  #  mcrave recognized
12PoolLurker  1  1 HatchPool,12Pool,1HatchMuta
3HatchLingBurrow  5  3 HatchPool,Unknown,2HatchLing | 1 HatchPool,Unknown,Unknown | 1 Unknown,Unknown,3HatchMuta
8DroneGas  11  6 HatchPool,9Pool,2HatchLing | 2 PoolHatch,9Pool,2HatchLing | 1 PoolHatch,Unknown,2HatchLing | 1 HatchPool,Unknown,2HatchLing | 1 PoolHatch,Unknown,Unknown
9HatchMain9Pool9Gas  2  1 PoolHatch,12Pool,2HatchLing | 1 HatchPool,Unknown,2HatchLing
9PoolHatchSpeedAllInB  1  1 PoolHatch,9Pool,LingRush
9PoolSpire  2  2 Unknown,Unknown,Unknown
Over10HatchBust  19  7 HatchPool,12Pool,Unknown | 4 HatchPool,12Pool,2HatchLing | 3 Unknown,12Pool,Unknown | 2 HatchPool,Unknown,2HatchLing | 2 HatchPool,Unknown,Unknown | 1 PoolHatch,12Pool,Unknown
Over10PoolLing  1  1 HatchPool,12Pool,Unknown
OverpoolSpeed  15  5 HatchPool,9Pool,LingRush | 4 PoolHatch,12Pool,Unknown | 3 Unknown,12Pool,Unknown | 1 Unknown,9Pool,LingRush | 1 PoolHatch,9Pool,LingRush | 1 HatchPool,12Pool,3HatchMuta
OverpoolSunk  21  8 HatchPool,9Pool,Unknown | 5 PoolHatch,9Pool,LingRush | 2 HatchPool,9Pool,3HatchMuta | 1 PoolHatch,9Pool,Unknown | 1 Unknown,Unknown,Unknown | 1 Unknown,12Pool,3HatchMuta | 1 HatchPool,Unknown,Unknown | 1 HatchPool,12Pool,3HatchMuta | 1 PoolHatch,Unknown,Unknown
OverpoolTurtle  23  7 HatchPool,9Pool,LingRush | 5 Unknown,12Pool,1HatchHydra | 3 Unknown,Unknown,1HatchHydra | 2 HatchPool,Unknown,1HatchHydra | 2 Unknown,9Pool,1HatchHydra | 2 HatchPool,12Pool,1HatchLurker | 1 PoolHatch,12Pool,1HatchLurker | 1 HatchPool,12Pool,1HatchHydra
ZvP_3HatchMuta  1  1 HatchPool,Unknown,2HatchLing
ZvZ_12HatchExpo  1  1 HatchPool,Unknown,2HatchMuta
ZvZ_Overgas9Pool  2  1 PoolLair,Unknown,1HatchMuta | 1 PoolLair,Unknown,Unknown
ZvZ_OverpoolTurtle  45  15 Unknown,Unknown,Unknown | 10 PoolLair,9Pool,1HatchMuta | 6 PoolLair,12Pool,1HatchMuta | 6 PoolLair,Unknown,1HatchMuta | 2 PoolLair,Unknown,Unknown | 2 Unknown,12Pool,Unknown | 2 Unknown,Unknown,3HatchMuta | 1 PoolLair,9Pool,Unknown | 1 Unknown,12Pool,3HatchMuta

Some curious stuff here. Of the Steamhammer openings that McRave recognized as 3HatchMuta, none is a 3 hatch mutalisk build, so those games may have added a third hatchery later on. Steamhammer does have an unfortunate love of laying down an unnecessary hatchery before its spire in ZvZ (3 hatcheries with zerglings is good, 2 hatcheries with mutalisks is good, 3 hatcheries with mutalisks is hard to justify in ZvZ). Looking at the Steamhammer openings tried more often, OverpoolSunk should be recognized as PoolHatch usually (maybe sometimes PoolLair). McRave got it wrong over half the time, without any big effect on its win rate. OverpoolTurtle should be PoolHatch with a hydra followup (this opening is not intended for ZvZ). For ZvZ_OverpoolTurtle, the closest match is PoolLair,9Pool,1HatchMuta. McRave got it right 10 times out of 45 and was close some other times. Failing to recognize anything (likely the scout was denied) was bad.

mcrave as seen by steamhammer

mcrave played  #  steamhammer recognized
PoolHatch,12Pool,2HatchMuta  33  22 Naked expand | 6 Unknown | 5 Heavy rush
PoolHatch,12Pool,2HatchSpeedling  22  9 Naked expand | 9 Unknown | 3 Heavy rush | 1 Worker rush
PoolLair,9Pool,1HatchMuta  95  89 Unknown | 4 Turtle | 2 Naked expand

Worker rush? That is likely a bug. The other choices capture information about the game that is probably true and not particularly useful.

AIIDE 2020 - Dragon versus Ecgberht

Two posts today, to cover the newly available Ecgberht pairings. Neither post has much meat to it.

dragon strategies versus ecgberht strategies

strategy  overall  14CC  BioMechGreedyFE  FullMech  ProxyBBS  ProxyEightRax
overall  141/150 94%  28/28 100%  27/28 96%  25/25 100%  36/44 82%  25/25 100%
1rax fe  5/6 83%  1/1 100%  1/1 100%  1/1 100%  2/3 67%  -
bio  136/144 94%  27/27 100%  26/27 96%  24/24 100%  34/41 83%  25/25 100%

I was curious about Dragon’s pattern of seemingly giving up on “1rax fe” (barracks expand) after a single loss, so I looked at the file. In fact Dragon played “bio” as the regular build the whole time, throwing in “1rax fe” occasionally for spice. The “1rax fe” loss was not the last “1rax fe” game, but the second to last.

For Ecgberht, when one build is producing nearly all the wins, probably you should play it more often than 30% of the time. You may not want to play it every game, because that makes it easy for the opponent to adapt—mixing it up is good. Maybe 50% of the time would be better, given this number of alternatives? To know for sure, I guess we’d have to test against a range of bots to see the overall effectiveness of learning.

dragon as seen by ecgberht

dragon played  #  ecgberht recognized
1rax fe  6  6 Unknown
bio  144  144 Unknown

Nothing to see here. Move along.

ecgberht as seen by dragon

Dragon does not record its idea of the opponent’s build. If it has one.

AIIDE 2020 - BananaBrain versus Ecgberht

bananabrain strategies versus ecgberht strategies

strategy  overall  14CC  FullMech  JoyORush  MechGreedyFE  ProxyEightRax
overall  148/150 99%  31/31 100%  28/28 100%  28/28 100%  28/28 100%  33/35 94%
PvT_10/12gate  10/10 100%  3/3 100%  -  1/1 100%  3/3 100%  3/3 100%
PvT_10/15gate  10/10 100%  2/2 100%  3/3 100%  1/1 100%  1/1 100%  3/3 100%
PvT_12nexus  10/10 100%  3/3 100%  3/3 100%  2/2 100%  1/1 100%  1/1 100%
PvT_1gatedtexpo  10/10 100%  1/1 100%  3/3 100%  2/2 100%  4/4 100%  -
PvT_1gatereaver  10/10 100%  2/2 100%  3/3 100%  2/2 100%  1/1 100%  2/2 100%
PvT_28nexus  10/10 100%  3/3 100%  2/2 100%  1/1 100%  2/2 100%  2/2 100%
PvT_2gatedt  11/11 100%  2/2 100%  4/4 100%  1/1 100%  2/2 100%  2/2 100%
PvT_2gaterngexpo  10/10 100%  3/3 100%  3/3 100%  -  1/1 100%  3/3 100%
PvT_32nexus  10/10 100%  1/1 100%  1/1 100%  2/2 100%  3/3 100%  3/3 100%
PvT_9/9gate  10/10 100%  2/2 100%  -  3/3 100%  -  5/5 100%
PvT_9/9proxygate  10/10 100%  4/4 100%  3/3 100%  1/1 100%  1/1 100%  1/1 100%
PvT_bulldog  10/10 100%  1/1 100%  1/1 100%  5/5 100%  1/1 100%  2/2 100%
PvT_dtdrop  10/10 100%  1/1 100%  -  3/3 100%  3/3 100%  3/3 100%
PvT_proxydt  7/9 78%  1/1 100%  1/1 100%  3/3 100%  2/2 100%  0/2 0%
PvT_stove  10/10 100%  2/2 100%  1/1 100%  1/1 100%  3/3 100%  3/3 100%

We can see exactly how Ecgberht scored its total of 2 wins: It happened to play a fast proxy when BananaBrain played a slow proxy. For BananaBrain, maybe the lesson is to avoid risky openings versus much weaker opponents. As a general principle, I suggest saving risky builds for games where you have a high risk of losing with safe play—in that case, why not?

bananabrain as seen by ecgberht

bananabrain played  #  ecgberht recognized
PvT_10/12gate  10  7 ZealotRush | 3 Unknown
PvT_10/15gate  10  10 Unknown
PvT_12nexus  10  9 ProtossFE | 1 Unknown
PvT_1gatedtexpo  10  10 Unknown
PvT_1gatereaver  10  10 Unknown
PvT_28nexus  10  10 Unknown
PvT_2gatedt  11  11 Unknown
PvT_2gaterngexpo  10  10 Unknown
PvT_32nexus  10  10 Unknown
PvT_9/9gate  10  7 ZealotRush | 3 Unknown
PvT_9/9proxygate  10  8 Unknown | 2 CannonRush
PvT_bulldog  10  10 Unknown
PvT_dtdrop  10  10 Unknown
PvT_proxydt  9  9 Unknown
PvT_stove  10  10 Unknown

Except for a couple cases of CannonRush, the builds that Ecgberht recognized were named correctly. I imagine that it interpreted CannonRush as “something proxied.”

ecgberht as seen by bananabrain

ecgberht played  #  bananabrain recognized
14CC  31  21 T_fastexpand | 6 T_unknown | 4 T_2rax
FullMech  28  21 T_unknown | 6 T_1fac | 1 T_2fac
JoyORush  28  23 T_2fac | 3 T_unknown | 2 T_1fac
MechGreedyFE  28  25 T_unknown | 3 T_2rax
ProxyEightRax  35  35 T_unknown

As we’ve seen before, BananaBrain has little skill in recognizing terran builds.

AIIDE 2020 - what Ecgberht learned

I added parsing code for Ecgberht’s JSON format learning files. I had to refactor for generality, and it added complexity, but I can use the parser for more than one purpose. Today I summarize the contents of its history files.

Ecgberht I think is a complex and interesting bot. It played up to 5 different strategies in each matchup, though the selection of the 5 varied by matchup. Sometimes it played fewer. Against most opponents Ecgberht played its strategies at roughly equal rates—except for the strategies it didn’t play at all. Ecgberht uses UCB with a high exploration rate. The strategy manager in the source lists 15 strategies (plus one more played only on the map Plasma and named PlasmaWraithHell), so it did not play everything it knows. I made a quick scan through the source for opponent-specific preparation, and did find some, but for bots in the tournament only ZZZKBot is affected (it is flagged by a zergling rush check; some bots that always zealot rush are flagged for that). I didn’t dig deep enough to find out why Ecgberht ignores so many of its available strategies.
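
To show why a high exploration rate leads to near-equal play, here is a sketch of the textbook UCB1 rule. It is not Ecgberht's implementation; the function name and the exploration constant `c` are assumptions. The sample numbers are Ecgberht's records against Dragon, with wins read off the Dragon versus Ecgberht crosstable.

```python
import math


def ucb_choice(stats, c=1.0):
    """stats: {strategy: (wins, games)}. Returns the strategy with the highest
    UCB1 score, mean win rate plus c * sqrt(ln(total games) / games)."""
    for name, (_, games) in stats.items():
        if games == 0:  # try everything once before comparing scores
            return name
    total = sum(games for _, games in stats.values())

    def score(name):
        wins, games = stats[name]
        return wins / games + c * math.sqrt(math.log(total) / games)

    return max(stats, key=score)


versus_dragon = {
    "14CC": (0, 28),
    "BioMechGreedyFE": (1, 28),
    "FullMech": (0, 25),
    "ProxyBBS": (8, 44),
    "ProxyEightRax": (0, 25),
}
```

With a small constant such as `c=0.1` the rule keeps choosing ProxyBBS, the only strategy that wins. With `c=2.0` the exploration bonus dominates and the next pick is one of the least-played strategies, which pulls the play counts back toward equal.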

Ecgberht tries to recognize the opponent’s strategy, but often finds itself unsure. It recorded a high rate of Unknown enemy plans. The ones it does recognize are drawn from a small set that seems to me well-chosen.

Ecgberht recorded fewer than 150 games for 5 of its 12 opponents, although it completed all games with no crashes. In total, 7 games do not appear in the game records of the history files. Maybe it has a cleanup bug that bites occasionally?


#1 stardust

opening  games  wins  first  last
14CC  31  0%  3  147
FullMech  28  0%  0  148
JoyORush  27  0%  2  143
MechGreedyFE  27  0%  4  146
ProxyEightRax  36  6%  1  141
5 openings  149  1%
enemy  games  wins
Unknown  149  1%
1 opening  149  1%


A couple wins against the top player is not bad.


#2 purplewave

opening  games  wins  first  last
14CC  35  3%  3  148
FullMech  29  0%  0  149
JoyORush  28  0%  2  146
MechGreedyFE  28  0%  4  147
ProxyEightRax  30  0%  1  142
5 openings  150  1%
enemy  games  wins
ProtossFE  7  0%
Unknown  143  1%
2 openings  150  1%

#3 bananabrain

opening  games  wins  first  last
14CC  31  0%  3  146
FullMech  28  0%  0  144
JoyORush  28  0%  2  147
MechGreedyFE  28  0%  4  148
ProxyEightRax  35  6%  1  149
5 openings  150  1%
enemy  games  wins
CannonRush  2  0%
ProtossFE  9  0%
Unknown  125  2%
ZealotRush  14  0%
4 openings  150  1%

#4 dragon

opening  games  wins  first  last
14CC  28  0%  3  148
BioMechGreedyFE  28  4%  4  144
FullMech  25  0%  0  146
ProxyBBS  44  18%  2  149
ProxyEightRax  25  0%  1  147
5 openings  150  6%
enemy  games  wins
Unknown  150  6%
1 opening  150  6%

#5 mcrave

opening  games  wins  first  last
14CC  28  7%  7  147
BioGreedyFE  51  29%  0  145
ProxyEightRax  47  26%  21  140
TwoPortWraith  22  5%  3  146
4 openings  148  20%
enemy  games  wins
FastHatch  61  16%
NinePool  13  31%
Unknown  74  22%
3 openings  148  20%


Ecgberht put up its strongest fight against zerg.


#6 microwave

opening  games  wins  first  last
14CC  32  9%  4  145
BioGreedyFE  21  0%  0  148
FullBioFE  24  4%  3  146
ProxyEightRax  52  27%  1  147
TwoPortWraith  20  0%  2  138
5 openings  149  12%
enemy  games  wins
FastHatch  99  4%
NinePool  5  40%
Unknown  45  27%
3 openings  149  12%

#7 steamhammer

opening  games  wins  first  last
14CC  34  12%  8  147
BioGreedyFE  36  17%  0  142
ProxyEightRax  36  14%  1  141
TwoPortWraith  43  23%  4  148
4 openings  149  17%
enemy  games  wins
EarlyPool  4  0%
FastHatch  22  32%
NinePool  81  14%
Unknown  42  17%
4 openings  149  17%

#8 daqin

opening  games  wins  first  last
14CC  32  0%  8  148
FullMech  29  0%  0  149
JoyORush  28  0%  4  144
MechGreedyFE  28  0%  43  147
ProxyEightRax  33  3%  1  141
5 openings  150  1%
enemy  games  wins
Unknown  150  1%
1 opening  150  1%

#9 zzzkbot

opening  games  wins  first  last
FullBio  150  71%  0  149
1 opening  150  71%
enemy  games  wins
EarlyPool  150  71%
1 opening  150  71%


Ecgberht upset ZZZKBot, possibly aided by its hardcoded knowledge of how ZZZKBot plays.


#10 ualbertabot

opening  games  wins  first  last
FullBio  58  43%  0  144
FullMech  52  38%  2  145
ProxyBBS  40  32%  1  149
3 openings  150  39%
enemy  games  wins
BioPush  11  91%
EarlyPool  12  50%
MechRush  9  33%
Unknown  104  24%
ZealotRush  14  100%
5 openings  150  39%

#11 willyt

opening  games  wins  first  last
14CC  31  3%  68  148
FullMech  34  9%  0  147
ProxyEightRax  85  41%  2  149
3 openings  150  26%
enemy  games  wins
BioPush  34  15%
Unknown  116  29%
2 openings  150  26%

#13 eggbot

opening  games  wins  first  last
FullMech  148  94%  0  147
1 opening  148  94%
enemy  games  wins
CannonRush  94  95%
Unknown  54  93%
2 openings  148  94%

AIIDE 2020 - Microwave versus BananaBrain

This is the last matchup I can analyze this way without writing more parsing code. McRave did ask for more in a comment, though, so I may do that. All the matchups have featured BananaBrain.

Microwave plays a large number of strategies, so I put it on the left side. Blue is good for Microwave, red is good for BananaBrain.

microwave strategies versus bananabrain strategies

strategy  overall  PvZ_10/12gate  PvZ_1basespeedzeal  PvZ_2basespeedzeal  PvZ_4gate2archon  PvZ_5gategoon  PvZ_9/9gate  PvZ_9/9proxygate  PvZ_bisu  PvZ_neobisu  PvZ_sairdt  PvZ_sairgoon  PvZ_sairreaver  PvZ_stove
overall  58/150 39%  5/17 29%  3/19 16%  4/11 36%  4/9 44%  4/7 57%  5/11 45%  5/12 42%  4/14 29%  4/10 40%  5/10 50%  6/11 55%  4/9 44%  5/10 50%
10Hatch9Pool9gas  0/2 0%  -  -  -  0/1 0%  0/1 0%  -  -  -  -  -  -  -  -
10HatchMain9Pool9Gas  0/1 0%  -  -  -  -  -  -  -  0/1 0%  -  -  -  -  -
11HatchTurtleHydra  0/1 0%  -  -  -  -  -  -  -  -  0/1 0%  -  -  -  -
12Hatch  0/1 0%  0/1 0%  -  -  -  -  -  -  -  -  -  -  -  -
12PoolMain  22/43 51%  0/5 0%  0/9 0%  2/2 100%  3/3 100%  3/3 100%  0/1 0%  1/3 33%  2/2 100%  3/3 100%  0/3 0%  2/3 67%  4/4 100%  2/2 100%
12PoolMuta  0/1 0%  0/1 0%  -  -  -  -  -  -  -  -  -  -  -  -
1HatchMuta_Sparkle  0/1 0%  -  -  -  -  -  -  0/1 0%  -  -  -  -  -  -
2HatchMuta  1/5 20%  -  -  1/1 100%  -  -  0/1 0%  -  0/1 0%  -  -  -  0/1 0%  0/1 0%
3HatchHydraBust  0/1 0%  -  -  -  -  -  -  -  0/1 0%  -  -  -  -  -
3HatchHydra_BHG  0/1 0%  -  -  0/1 0%  -  -  -  -  -  -  -  -  -  -
3HatchLingBust  2/6 33%  -  0/1 0%  0/1 0%  -  -  1/1 100%  0/1 0%  -  -  -  1/1 100%  -  0/1 0%
3HatchMuta  0/1 0%  -  -  -  -  -  -  -  -  0/1 0%  -  -  -  -
3HatchPoolHydraExpo  0/1 0%  0/1 0%  -  -  -  -  -  -  -  -  -  -  -  -
4HatchBeforeGas  0/1 0%  -  -  -  -  -  -  -  -  -  -  0/1 0%  -  -
4HatchPoolHydra  0/2 0%  -  0/1 0%  0/1 0%  -  -  -  -  -  -  -  -  -  -
4PoolHard  2/6 33%  -  1/1 100%  0/1 0%  -  -  1/1 100%  -  0/1 0%  -  -  -  -  0/2 0%
4PoolSoft  0/1 0%  -  0/1 0%  -  -  -  -  -  -  -  -  -  -  -
6Pool  0/1 0%  -  0/1 0%  -  -  -  -  -  -  -  -  -  -  -
7Pool  0/1 0%  -  -  -  -  -  -  -  -  -  0/1 0%  -  -  -
8Pool  0/1 0%  -  -  -  -  -  -  -  -  0/1 0%  -  -  -  -
8PoolHydraRush8D  0/1 0%  0/1 0%  -  -  -  -  -  -  -  -  -  -  -  -
9PoolGasHatchSpeed8D  12/18 67%  2/2 100%  2/2 100%  -  1/2 50%  0/1 0%  1/1 100%  0/2 0%  1/1 100%  1/1 100%  1/1 100%  1/2 50%  0/1 0%  2/2 100%
9PoolHatchGasSpeed7D  0/1 0%  -  -  -  0/1 0%  -  -  -  -  -  -  -  -  -
9PoolHatchGasSpeed8D  17/32 53%  3/4 75%  0/1 0%  1/1 100%  0/1 0%  0/1 0%  2/4 50%  4/5 80%  1/5 20%  0/1 0%  4/4 100%  2/2 100%  0/2 0%  0/1 0%
9PoolSpeed  0/3 0%  0/1 0%  -  -  0/1 0%  -  -  -  -  -  -  0/1 0%  -  -
9PoolSpeedLing  1/5 20%  -  -  -  -  -  0/1 0%  -  0/1 0%  -  -  0/1 0%  0/1 0%  1/1 100%
9PoolSunkHatch  0/1 0%  -  -  0/1 0%  -  -  -  -  -  -  -  -  -  -
Overpool  0/1 0%  0/1 0%  -  -  -  -  -  -  -  -  -  -  -  -
OverpoolSpeed  0/3 0%  -  0/1 0%  0/1 0%  -  -  -  -  0/1 0%  -  -  -  -  -
ZvP_10Hatch9Pool  1/3 33%  -  0/1 0%  0/1 0%  -  1/1 100%  -  -  -  -  -  -  -  -
ZvP_11Hatch10Pool  0/1 0%  -  -  -  -  -  -  -  -  0/1 0%  -  -  -  -
ZvZ_Overgas9Pool  0/1 0%  -  -  -  -  -  -  -  -  0/1 0%  -  -  -  -
ZvZ_Overpool11Gas  0/2 0%  -  -  -  -  -  0/1 0%  -  -  -  0/1 0%  -  -  -

This table looks even more scattered than yesterday’s BananaBrain-Dragon table, but to me it tells a story of duelling learning algorithms. Microwave found a few builds that countered BananaBrain’s preferred play, and BananaBrain did not shift its responses far enough to entirely squelch them.

microwave as seen by bananabrain

microwave played  #  bananabrain recognized
10Hatch9Pool9gas  2  2 Z_10hatch
10HatchMain9Pool9Gas  1  1 Z_10hatch
11HatchTurtleHydra  1  1 Z_12hatch
12Hatch  1  1 Z_12hatch
12PoolMain  43  36 Z_12pool | 5 Z_10hatch | 2 Z_unknown
12PoolMuta  1  1 Z_10hatch
1HatchMuta_Sparkle  1  1 Z_unknown
2HatchMuta  5  5 Z_12hatch
3HatchHydraBust  1  1 Z_12hatch
3HatchHydra_BHG  1  1 Z_10hatch
3HatchLingBust  6  6 Z_12hatch
3HatchMuta  1  1 Z_12hatch
3HatchPoolHydraExpo  1  1 Z_12hatch
4HatchBeforeGas  1  1 Z_12hatch
4HatchPoolHydra  2  2 Z_12hatch
4PoolHard  6  6 Z_4/5pool
4PoolSoft  1  1 Z_4/5pool
6Pool  1  1 Z_4/5pool
7Pool  1  1 Z_9pool
8Pool  1  1 Z_9pool
8PoolHydraRush8D  1  1 Z_9pool
9PoolGasHatchSpeed8D  18  15 Z_9pool | 3 Z_overpool
9PoolHatchGasSpeed7D  1  1 Z_9pool
9PoolHatchGasSpeed8D  32  29 Z_9pool | 3 Z_overpool
9PoolSpeed  3  2 Z_9poolspeed | 1 Z_9pool
9PoolSpeedLing  5  5 Z_9poolspeed
9PoolSunkHatch  1  1 Z_9pool
Overpool  1  1 Z_overpool
OverpoolSpeed  3  3 Z_overpool
ZvP_10Hatch9Pool  3  3 Z_10hatch
ZvP_11Hatch10Pool  1  1 Z_12hatch
ZvZ_Overgas9Pool  1  1 Z_12pool
ZvZ_Overpool11Gas  2  2 Z_overpool

BananaBrain was accurate at reading Microwave’s initial build. Lumping 11 hatch with 12 hatch is fine, they’re very similar. 12 pool can be difficult to distinguish from 10 hatch, if you scout it late after the second hatchery finishes. It would be useful to better separate 9 pool from overpool, which are significantly different in effect, but it requires close attention to detail. Overall, highly accurate readings with only one wide miss, seeing the overgas 9 pool as 12 pool—and that is a ZvZ build that is extremely rare in ZvP.

It makes quite a contrast with yesterday’s BananaBrain-Dragon analysis, where BananaBrain barely recognized terran builds.

bananabrain as seen by microwave

bananabrain played  #  microwave recognized
PvZ_10/12gate  17  13 HeavyRush | 3 Unknown | 1 NakedExpand
PvZ_1basespeedzeal  19  14 Unknown | 5 HeavyRush
PvZ_2basespeedzeal  11  4 NakedExpand | 3 Turtle | 3 SafeExpand | 1 HeavyRush
PvZ_4gate2archon  9  4 NakedExpand | 4 SafeExpand | 1 HeavyRush
PvZ_5gategoon  7  6 NakedExpand | 1 HeavyRush
PvZ_9/9gate  11  9 HeavyRush | 2 Unknown
PvZ_9/9proxygate  12  6 HeavyRush | 6 Unknown
PvZ_bisu  14  6 SafeExpand | 4 NakedExpand | 2 Turtle | 1 HeavyRush | 1 Unknown
PvZ_neobisu  10  4 NakedExpand | 3 SafeExpand | 2 Turtle | 1 HeavyRush
PvZ_sairdt  10  8 Unknown | 2 HeavyRush
PvZ_sairgoon  11  7 NakedExpand | 1 SafeExpand | 1 Turtle | 1 Unknown | 1 HeavyRush
PvZ_sairreaver  9  4 SafeExpand | 3 NakedExpand | 2 Turtle
PvZ_stove  10  7 Unknown | 3 HeavyRush

Microwave borrowed Steamhammer’s rather crude classification of enemy plans (which was still far in the future when Microwave was forked from Steamhammer). It was intended to be minimal, just enough to allow for basic reactions, to hold the fort until I could raise enough troops to make a sally. Microwave’s recognitions look similar to Steamhammer’s, with the right general tendency but many sloppy variations (which I think are due mostly to weak scouting, with a contribution from overlapping recognition rules).

It’s striking that some recognitions—of dubious accuracy—are dark blue in stark contrast to their neighbors. It gives me the impression that Microwave makes use of the recognized enemy plan, in some cases to good effect. It suggests that more accurate recognition, if the reactions are also good, could be a major improvement.

AIIDE 2020 - BananaBrain versus Dragon

Of the 4 bots I’m prepared to run this analysis on, this is the only pairing involving Dragon. Dragon did not record all 150 games against either McRave or Microwave. Like yesterday, all win rates and coloring are from the point of view of BananaBrain: Blue is good for BananaBrain, red is good for Dragon.

bananabrain strategies versus dragon strategies

strategy  overall  1rax fe  2rax bio  2rax mech  bio  dirty worker rush  mass vulture  siege expand
overall  67/150 45%  6/14 43%  6/11 55%  8/15 53%  15/37 41%  3/3 100%  22/56 39%  7/14 50%
PvT_10/12gate  12/17 71%  2/3 67%  -  2/3 67%  4/4 100%  -  3/6 50%  1/1 100%
PvT_10/15gate  5/12 42%  -  2/2 100%  1/5 20%  1/3 33%  -  1/2 50%  -
PvT_12nexus  1/8 12%  1/2 50%  -  -  0/1 0%  -  0/3 0%  0/2 0%
PvT_1gatedtexpo  3/7 43%  1/2 50%  -  -  0/1 0%  -  2/4 50%  -
PvT_1gatereaver  0/5 0%  -  0/1 0%  -  0/2 0%  -  0/2 0%  -
PvT_28nexus  5/11 45%  0/2 0%  0/1 0%  0/2 0%  1/1 100%  -  4/5 80%  -
PvT_2gatedt  3/9 33%  0/1 0%  -  1/1 100%  0/2 0%  -  0/3 0%  2/2 100%
PvT_2gaterngexpo  2/7 29%  -  0/1 0%  -  1/1 100%  1/1 100%  0/4 0%  -
PvT_32nexus  2/8 25%  -  -  -  1/4 25%  1/1 100%  0/2 0%  0/1 0%
PvT_9/9gate  14/18 78%  -  2/3 67%  -  4/4 100%  1/1 100%  7/9 78%  0/1 0%
PvT_9/9proxygate  8/14 57%  1/1 100%  1/1 100%  3/3 100%  0/2 0%  -  2/6 33%  1/1 100%
PvT_bulldog  0/6 0%  0/1 0%  -  -  0/3 0%  -  0/1 0%  0/1 0%
PvT_dtdrop  2/8 25%  -  1/1 100%  -  0/4 0%  -  1/2 50%  0/1 0%
PvT_proxydt  10/14 71%  1/1 100%  -  1/1 100%  3/3 100%  -  2/5 40%  3/4 75%
PvT_stove  0/6 0%  0/1 0%  0/1 0%  -  0/2 0%  -  0/2 0%  -

Not one table cell has more than 9 games in it. Neither bot successfully predicted what the other would play, if it even tried: BananaBrain is unpredictable and Dragon changes its choice frequently when losing, and besides BananaBrain is poor at recognizing terran plans. So the strategy x strategy cross is a hash. To me the table means that, at least for this pairing, reactions during the game were more important than the initial choice of strategy. Neither side had a way to choose a counter beforehand.

bananabrain as seen by dragon

Dragon does not record a recognized opponent strategy. Its history files have only its own strategy and whether it won.

dragon as seen by bananabrain

dragon played  #  bananabrain recognized
1rax fe  14  13 T_unknown | 1 T_fastexpand
2rax bio  11  8 T_unknown | 2 T_fastexpand | 1 T_1fac
2rax mech  15  14 T_unknown | 1 T_1fac
bio  37  35 T_unknown | 1 T_1fac | 1 T_fastexpand
dirty worker rush  3  3 T_unknown
mass vulture  56  30 T_1fac | 26 T_unknown
siege expand  14  9 T_unknown | 5 T_1fac

We knew that BananaBrain struggles to recognize terran strategies. Maybe the author has not spent effort on it because it doesn’t affect results much? In any case, given how Dragon plays, with its love of fast expansions and mixed tech, the terran builds that are recognized probably represent truths about the games. It’s not clear that they are helpful truths, though, because they say so little about what happened.

From the coloring, it looks as though there was little relationship between whether BananaBrain recognized Dragon’s build and whether BananaBrain won. That is consistent with the theory that the author decided it didn’t matter.

AIIDE 2020 - BananaBrain versus McRave

If both bots in a pairing write history files, and both record all 150 games of the tournament, then the history files can be aligned and we can compare what the bots were thinking in each game. So far, between the limitations of the data and the limitations of my script, I’m only ready to do that for a few pairings. Dragon in particular often did not record all 150 games, and I’d rather not try to align game records when there are gaps in the histories (there is enough data to do it programmatically, but it’s a pain and risks errors). Also my script depends on parsing out data into a specific format, and it is only implemented for 4 bots so far (#3 BananaBrain, #4 Dragon, #5 McRave, #6 Microwave—alphabetical order and their finishing order were the same).

Today is BananaBrain versus McRave. The first BananaBrain line in its file about McRave:

2020-10-09 20:56:04,2,(2)Destination.scx,PvZ_9/9proxygate,Z_overpool,7.6,1

The first McRave line in its history file about BananaBrain (we’re told it doesn’t use this data in games, but it’s there and we can analyze it):

Lost,Destination,7:30,2Gate,Proxy,ZealotRush,PoolHatch,Overpool,2HatchSpeedling,1:21,1:21,1:21,5,Zerg_Larva,30,Zerg_Zergling,15,Zerg_Drone,3,Zerg_Overlord,24,Protoss_Probe,16,Protoss_Zealot,1,Protoss_Corsair
Lost,HeartbreakRidge,17:40,2Gate,Main,Corsair,PoolHatch,Overpool,2HatchMuta,2:01,2:01,5:10,2,Zerg_Larva,16,Zerg_Zergling,42,Zerg_Drone,10,Zerg_Overlord,28,Zerg_Mutalisk,18,Zerg_Scourge,89,Protoss_Probe,54,Protoss_Zealot,23,Protoss_Dragoon,1,Protoss_High_Templar,1,Protoss_Shuttle,33,Protoss_Corsair,5,Protoss_Dark_Templar,1,Protoss_Reaver,8,Protoss_Scarab

My script extracts key info from each line so we can compare. BananaBrain played PvZ_9/9proxygate and concluded that McRave answered with Z_overpool, while McRave played PoolHatch,Overpool,2HatchSpeedling and classified what it saw from BananaBrain as 2Gate,Proxy,ZealotRush. In this game, both sides agreed pretty well about what was going on.
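
The extraction step can be sketched as follows. This is not my actual script, and the function names are made up. The field positions are inferred from the two sample records, and two readings are assumptions: that the last BananaBrain field is 1 for a BananaBrain win, and that the first McRave field reads Lost when McRave loses. The McRave sample is cut off after the timing fields, since the unit counts are not needed here.

```python
import re

BANANABRAIN_LINE = "2020-10-09 20:56:04,2,(2)Destination.scx,PvZ_9/9proxygate,Z_overpool,7.6,1"
# First McRave record, truncated after the three timing fields.
MCRAVE_LINE = ("Lost,Destination,7:30,2Gate,Proxy,ZealotRush,"
               "PoolHatch,Overpool,2HatchSpeedling,1:21,1:21,1:21")


def map_key(name):
    """Normalize map names: "(2)Destination.scx" -> "Destination"."""
    return re.sub(r"^\(\d+\)|\.sc[xm]$", "", name)


def parse_bananabrain(line):
    f = line.strip().split(",")
    return {
        "map": map_key(f[2]),
        "played": f[3],
        "recognized": f[4],
        "won": f[6] == "1",  # assumption: final field is 1 for a BananaBrain win
    }


def parse_mcrave(line):
    f = line.strip().split(",")
    return {
        "lost": f[0] == "Lost",
        "map": map_key(f[1]),
        "recognized": ",".join(f[3:6]),  # three-part view of the opponent
        "played": ",".join(f[6:9]),      # three-part own strategy
    }


def same_game(bb, mc):
    """Cheap alignment check: same map, and the results agree."""
    return bb["map"] == mc["map"] and bb["won"] == mc["lost"]
```

Walking the two files in order and checking `same_game` on each pair is a simple way to catch a gap in either history before the tables are built.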

bananabrain strategies versus mcrave strategies

This table shows which BananaBrain strategies were successful against which three-part McRave strategies. All the winning rates are from BananaBrain’s point of view. The intersection of the overall row and the overall column says that BananaBrain won 82 out of 150 games throughout the tournament, which can be checked against the official crosstable. The overall row tells how BananaBrain fared against each of McRave’s strategies, which can be checked against my tables of what McRave learned. The overall column tells how each BananaBrain strategy performed, which can be checked against what BananaBrain learned. (Spoiler: All the numbers match.) The center cells are the meat, and show what countered what.

strategy  overall  PoolHatch,Overpool,2HatchMuta  PoolHatch,Overpool,2HatchSpeedling  PoolHatch,Overpool,3HatchSpeedling
overall  82/150 55%  72/131 55%  5/10 50%  5/9 56%
PvZ_10/12gate  9/13 69%  4/4 100%  -  5/9 56%
PvZ_1basespeedzeal  6/12 50%  6/12 50%  -  -
PvZ_2basespeedzeal  3/9 33%  3/9 33%  -  -
PvZ_4gate2archon  1/6 17%  1/6 17%  -  -
PvZ_5gategoon  9/16 56%  9/16 56%  -  -
PvZ_9/9gate  26/27 96%  26/27 96%  -  -
PvZ_9/9proxygate  5/10 50%  -  5/10 50%  -
PvZ_bisu  1/5 20%  1/5 20%  -  -
PvZ_neobisu  5/11 45%  5/11 45%  -  -
PvZ_sairdt  3/10 30%  3/10 30%  -  -
PvZ_sairgoon  11/17 65%  11/17 65%  -  -
PvZ_sairreaver  1/5 20%  1/5 20%  -  -
PvZ_stove  2/9 22%  2/9 22%  -  -

The table makes it plain that 2HatchSpeedling and 3HatchSpeedling were reactions to specific protoss builds, as the author pointed out in a comment. The counter to 10/12 gate at least seems to have been valuable, because McRave lost all 4 games where the 10/12 gate was played but not countered. The 9/9 gate crushed because no counter was played against it; the zealots are a McRave weakness.
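
Once the games are aligned, tallying a crosstable like this one takes little code. The sketch below is illustrative rather than my actual script: it assumes each aligned game has been reduced to a hypothetical `(row strategy, column strategy, row side won)` triple, and it counts every game into its cell, its row total, its column total, and the grand total.

```python
from collections import defaultdict


def crosstable(games):
    """games: iterable of (row_strategy, col_strategy, won) triples.
    Returns {(row, col): [wins, games]} including the "overall" margins."""
    cells = defaultdict(lambda: [0, 0])
    for row, col, won in games:
        for key in ((row, col), (row, "overall"),
                    ("overall", col), ("overall", "overall")):
            cells[key][0] += int(won)
            cells[key][1] += 1
    return cells


def fmt(cell):
    """Render a cell the way the tables do, e.g. "5/9 56%", or "-" if empty."""
    wins, games = cell
    return f"{wins}/{games} {100 * wins / games:.0f}%" if games else "-"
```

Feeding in the 13 games of the PvZ_10/12gate row (4 wins in 4 games against 2HatchMuta, 5 wins in 9 against 3HatchSpeedling) gives a row total that formats as `9/13 69%`.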

bananabrain as seen by mcrave

But wait, there’s more. Both bots recorded not only their own strategy, but the recognized opponent strategy, so we can compare the known strategy of one bot with how the other bot recognized it. Note well: If the recognized strategy looks different than the actual strategy, it is not necessarily a mistake or a scouting miss. The bots may simply be noting different aspects of the game. Only some differences indicate mistakes.

The coloring is from the point of view of BananaBrain. For McRave, red is good and blue is bad.

bananabrain played  #  mcrave recognized
PvZ_10/12gate  13  12 2Gate,Main,Corsair | 1 2Gate,Main,DT
PvZ_1basespeedzeal  12  8 1GateCore,2Zealot,DT | 2 2Gate,Main,DT | 1 2Gate,Main,4Gate | 1 1GateCore,2Zealot,Corsair
PvZ_2basespeedzeal  9  8 FFE,Forge,Speedlot | 1 FFE,Gateway,Speedlot
PvZ_4gate2archon  6  2 FFE,Forge,NeoBisu | 2 FFE,Forge,5GateGoon | 1 FFE,Forge,ZealotArchon | 1 FFE,Nexus,NeoBisu
PvZ_5gategoon  16  14 FFE,Forge,5GateGoon | 2 FFE,Nexus,5GateGoon
PvZ_9/9gate  27  26 2Gate,Main,Corsair | 1 2Gate,Main,DT
PvZ_9/9proxygate  10  10 2Gate,Proxy,ZealotRush
PvZ_bisu  5  3 FFE,Forge,NeoBisu | 2 FFE,Nexus,NeoBisu
PvZ_neobisu  11  5 FFE,Forge,NeoBisu | 4 FFE,Forge,Speedlot | 2 FFE,Nexus,NeoBisu
PvZ_sairdt  10  10 1GateCore,2Zealot,Corsair
PvZ_sairgoon  17  9 FFE,Forge,NeoBisu | 2 FFE,Nexus,5GateGoon | 2 FFE,Nexus,NeoBisu | 2 FFE,Forge,Unknown | 1 FFE,Forge,Speedlot | 1 FFE,Forge,5GateGoon
PvZ_sairreaver  5  5 FFE,Forge,NeoBisu
PvZ_stove  9  9 1GateCore,2Zealot,Corsair

Only 2 games have an Unknown element. Without watching replays, I can’t say that any of McRave’s recognitions are wrong. Seeing PvZ_sairgoon as FFE,Forge,Speedlot could be correct if BananaBrain followed up with zealots in that one game.

I’m not sure what the difference is between FFE,Forge and FFE,Gateway and FFE,Nexus. FFE stands for forge fast expand, which means a forge and a nexus, and then you need a gateway if you’re ever going to make a mobile army, so all three buildings are required. Maybe it’s whatever building McRave saw first.

mcrave as seen by bananabrain

Again, the coloring is from the point of view of BananaBrain.

mcrave played  #  bananabrain recognized
PoolHatch,Overpool,2HatchMuta  131  101 Z_overpool | 27 Z_9pool | 3 Z_unknown
PoolHatch,Overpool,2HatchSpeedling  10  9 Z_overpool | 1 Z_unknown
PoolHatch,Overpool,3HatchSpeedling  9  8 Z_overpool | 1 Z_9pool

BananaBrain remembered far less detail about the game than McRave. Overpool is only an initial build order which reaches its end at 9 supply and can be followed up with any tech or unit mix whatsoever. If all you know is that the opponent will start with overpool, the only conclusions you can draw are limits on the opponent’s tech timings and economy. On the other hand, if you do know more about the opponent’s play, can you use the information productively?

more?

I could generate more tables. Various tables showing recognized strategies might make sense. If at least one bot of the pair records the map for each game, it would be easy to break down strategies by map. Is there any particular breakdown you’d like to see?

Update: I added coloring to the “as seen by” tables, to show how win rates vary depending on what the bots recognized.

AIIDE 2020 - summary so far

I’ve analyzed the learning files of most of the bots that wrote them. Stardust and EggBot wrote nothing. WillyT recorded the results of only a small fraction of games. ZZZKBot and Ecgberht wrote files that are more troublesome for me to parse (starting from what I’ve already coded), though I will analyze them too if people want. PurpleWave doesn’t have a single strategy-I-played and enemy-strategy-I-recognized, but tags history games with an irregular set including a combination of its multiple strategy choices and multiple “fingerprints” of enemy choices; this data should be summarized in an entirely different way.

Tomorrow I’ll start a new data analysis of a kind I haven’t tried before. It will bring out new information, and I think people will like it.