SSCAIT round of 8; AlphaStar

In between ASL 7, the SSCAIT round of 8, about 4 hours of video on DeepMind’s AlphaStar, and keeping up with Steamhammer’s games, I was watching Starcraft today nearly from dawn to dusk. Coding progress: Zero. Time to do something else for a while and write about it!

In SSCAIT, the hard-to-predict matches were PurpleWave-BananaBrain and Iron-Steamhammer. PurpleWave 3-2 BananaBrain was the expected close match. In the next round we can expect PurpleWave > SAIDA and Locutus > Iron, so PurpleWave and Locutus will fight it out in the semifinal. I think Locutus has an edge, but PurpleWave retains chances.

I was not surprised by Iron > Steamhammer, but I was surprised by Iron 3-0 Steamhammer. It was unlucky that it was so one-sided. When Steamhammer has ample learning data, it has a small advantage over Iron. On a 2 player map, the 2 hatch muta variant usually wins, but here Steamhammer played it on a 4 player map where it usually loses due to poor execution—Steamhammer didn’t have enough data to connect its past wins with 2 hatch muta to the map size. On other maps, the AntiFactory build wins 50% or a little more, and in this match Steamhammer was still casting about for ways to win and didn’t try AntiFactory. It’s because I knew that Steamhammer didn’t have enough data that I gave the edge to Iron. Steamhammer is likely to win in the next round and, as others have also predicted, drop out in loser’s round 4 to SAIDA.

Interestingly, commenters who predicted Proxy > Hao Pan were wrong. All the information I can find seems to indicate that Proxy has the upper hand over Hao Pan. Did somebody hit a learning transient?

AlphaStar in Starcraft 2 gives us a foretaste of what to expect from advanced neural network learning. On the one hand, they spent huge computing resources—weeks at a time of “many thousands” of simultaneous games with 16 of Google’s TPUs per player—to learn to play protoss versus protoss on a single map. On the other hand, AlphaStar came out of that work with exceptional micro and strong judgment, areas in which all Brood War bots are currently weak. Machine learning is the way to get strong judgment. But it’s not easy.

They say that AlphaStar plays with average APM around 280 and latency around 350 ms, both somewhat slower than human. That makes its strength more impressive. They didn’t say so clearly, but I got the idea that the 350 ms latency is for free: It takes that long to evaluate their deep and complex network, so they can’t react faster! They did not talk as much about how AlphaStar’s real advantage is not in speed, but in precision: It does not misclick (at least not harmfully). Humans have a tradeoff of speed versus precision; if you do something faster, you do it with more slop. AlphaStar is a little slower, but far more precise than a human, so in fact it stands higher on the speed-precision tradeoff. It should play better, given equal knowledge. Still, it certainly takes fewer liberties than a BWAPI bot.

AIIDE 2018 - what CherryPi learned

Here is a table of how each CherryPi opening fared against each opponent, like the tables I made for other bots. Reading the code confirmed my inference that the learning files recorded opening build orders, not build orders switched to later in the game; see how CherryPi played.

#bottotal10hatchling2hatchmuta3basepoollings9poolspeedlingmutahydracheesezve9poolspeedzvp10hatchzvp3hatchhydrazvp6hatchhydrazvpohydraszvpomutaszvt2baseguardianzvt2baseultrazvt3hatchlurkerzvtmacrozvz12poolhydraszvz9gas10poolzvz9poolspeedzvzoverpool
#1saida13-90  13%-----1-19 5%------1-15 6%9-37 20%2-19 10%----
#3cse73-30  71%-----0-2 0%24-5 83%--16-8 67%----33-15 69%----
#4bluebluesky89-14  86%-----0-1 0%29-8 78%-------60-5 92%----
#5locutus84-19  82%--63-11 85%-----14-3 82%-2-2 50%---5-3 62%----
#6isamind99-4  96%--1-0 100%-----98-4 96%----------
#7daqin103-0  100%--------------103-0 100%----
#8mcrave87-16  84%--9-2 82%-----31-4 89%-14-4 78%---33-6 85%----
#9iron97-6  94%----97-6 94%--------------
#10zzzkbot93-10  90%58-4 94%--0-1 0%-------------35-4 90%0-1 0%
#11steamhammer81-21  79%22-7 76%----16-5 76%---------0-1 0%-43-8 84%-
#12microwave94-9  91%----------------0-1 0%4-2 67%90-6 94%
#13lastorder85-18  83%45-7 87%----0-1 0%------------40-10 80%
#14tyr98-5  95%------98-5 95%------------
#15metabot94-2  98%---------94-2 98%---------
#16letabot101-2  98%0-1 0%-97-0 100%--1-1 50%-----3-0 100%-------
#17arrakhammer92-11  89%-----------------92-11 89%-
#18ecgberht102-1  99%--------------102-1 99%----
#19ualbertabot99-4  96%---96-2 98%-3-2 60%-------------
#20ximp98-5  95%-------1-0 100%-97-5 95%---------
#21cdbot103-0  100%-----96-0 100%-----------7-0 100%-
#22aiur100-3  97%---------100-3 97%---------
#23killall103-0  100%102-0 100%-----------------1-0 100%
#24willyt103-0  100%-103-0 100%-----------------
#25ailien103-0  100%-----------------103-0 100%-
#26cunybot100-3  97%-----------------100-3 97%-
#27hellbot103-0  100%------31-0 100%--72-0 100%---------
overall-  90%227-19 92%103-0 100%170-13 93%96-3 97%97-6 94%117-31 79%182-18 91%1-0 100%143-11 93%379-18 95%16-6 73%3-0 100%1-15 6%9-37 20%338-49 87%0-1 0%0-1 0%384-28 93%131-17 89%

Look how sparse the chart is—CherryPi was highly selective about its choices. It did not try more than 4 different builds against any opponent. It makes sense to minimize the number of choices so that you don’t lose games exploring bad ones, but you have to be pretty sure that one of the choices you do try is good. Where did the selectivity come from?

The opening “hydracheese” was played only against Iron, and was the only opening played against Iron. It smelled like a hand-coded choice. Sure enough, the file source/src/models/banditconfigurations.cpp configures builds by name for 18 of the 27 entrants. A comment says that the build order switcher is turned off for the hydracheese opening only: “BOS disabled for this specific build because the model hasn’t seen it.” Here is the full set of builds configured, including defaults for those that were not hand-configured. CherryPi played only builds that were configured, but did not play all the builds that were configured; presumably it stopped when it hit a good one.

bot | builds | note
AILien | zve9poolspeed zvz9poolspeed | returning opponents from last year
AIUR | zvtmacro zvpohydras zvp10hatch
Arrakhammer | 10hatchling zvz9poolspeed
Iron | hydracheese
UAlbertaBot | zve9poolspeed 9poolspeedlingmuta
Ximp | zvpohydras zvtmacro zvp3hatchhydra
Microwave | zvzoverpool zvz9poolspeed zvz9gas10pool | “we have some expectations”
Steamhammer | zve9poolspeed zvz9poolspeed zvz12poolhydras 10hatchling
ZZZKBot | 9poolspeedlingmuta 10hatchling zvz9poolspeed zvzoverpool
ISAMind, Locutus, McRave, DaQin | zvtmacro zvp6hatchhydra 3basepoollings zvpomutas
CUNYBot | zvzoverpoolplus1 zvz9gas10pool zvz9poolspeed
HannesBredberg | zvtp1hatchlurker zvt2baseultra zvt3hatchlurker zvp10hatch
LetaBot | zvtmacro 3basepoollings zvt2baseguardian zve9poolspeed 10hatchling
MetaBot | zvtmacro zvpohydras zvpomutas zve9poolspeed
WillyT | zvt2baseultra 12poolmuta 2hatchmuta
ZvT | zvt2baseultra zvtmacro zvt3hatchlurker zve9poolspeed | defaults
ZvP | zve9poolspeed zvtmacro zvp10hatch zvpohydras
ZvZ | 10hatchling zve9poolspeed zvz9poolspeed zvzoverpool
ZvR | 10hatchling zve9poolspeed 9poolspeedlingmuta

I read this as pulling out all the stops to reach #1. They would have succeeded if not for SAIDA.

banditconfigurations.cpp continues and declares some properties for builds including non-opening builds. It looks like .validOpening() tells whether it can be played as an opening build, .validSwitch() tells whether the build order switcher is allowed to switch to it during the game, and .switchEnabled() tells whether the build order switcher is enabled at all.

The build orders themselves are defined in source/src/buildorders/. I found them a little hard to read, partly because they are written in reverse order: Actions to happen first are posted last to the blackboard.

The opening zve9poolspeed (I read “zve” as zerg versus everything) has the most red boxes in the chart—it did poorly against more opponents than any other. It may have been a poor choice to configure for use in so many cases. In contrast, zvz9poolspeed specialized for ZvZ was successful. It gets fast mutalisks and in general has a lot more strategic detail coded into the build.

They seem to have had expectations of the zvt2baseultra build against terran. It is configured for HannesBredberg, WillyT, and the default ZvT. It was in fact only tried against SAIDA. I didn’t notice anything that tells CherryPi what order to try opening builds in. Maybe the build order switcher itself contributes, helping to choose the more likely openings first?

LastOrder and its macro model - technical info

Time to dig into the details! I read the paper and some of the code to find stuff out.

LastOrder’s “macro” decisions are made by a neural network whose data size is close to 8MB—much larger than LastOrder’s code (but much smaller than CherryPi’s model data). There is room for a lot of smarts in that many bytes. The network takes in a summary description of the game state as a vector of feature values, and returns a macro action, what to build or upgrade or whatever next. The code to marshal data to and from the network is in StrategyManager.cpp.

network input

The list of network input features is initialized in the StrategyManager constructor and populated in StrategyManager::triggerModel(). There are a lot of features. I didn’t dig into the details, but it looks as though some of the features are binary, some are counts, some are X or Y values that together give a position on the map, and a few are other numbers. They fall into these groups:

• State features. Basic information about the map and the opponent, our upgrades and economy, our own and enemy tech buildings.

• Waiting to build features. I’m not sure what these mean, but it’s something to do with production.

• “Our battle basic features” and “enemy battle basic features.” Combat units.

• “Our defend building features” and “enemy defend building features.” Static defense.

• “Killed” and “kill” features, what units of ours or the enemy’s are destroyed.

• A mass of features related to our current attack action and what the enemy has available to defend against it.

• “Our defend info” looks like places we are defending and what the enemy is attacking with.

• “Enemy defend info” looks like it locates the enemy’s static defense relative to the places we are interested in attacking.

• “Visible” gives the locations of the currently visible enemy unit types. I’m not quite sure what this means. A unit type doesn’t have an (x,y) position, and it seems as though LastOrder is making one up. It could be the location of the largest group of each unit type, or the closest unit of each type, or something. Have to read more code.

With this much information available, sophisticated strategies are possible in principle. It’s not clear how much of this the network successfully understands and makes use of. The games I watched did not give the impression of deep understanding, but then again, we have to remember that LastOrder learned to play against 20 specific opponents. Its results against those opponents suggest that it does understand them deeply.

network output

It looks like the network output is a single macro action. Code checks whether the action is valid in the current situation and, if so, calls on the appropriate manager to carry it out. The code is full of I/O details and validation and error handling, so I might have missed something in the clutter. Also the code shows signs of having been modified over time without tying up loose ends. I imagine they experimented actively.
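
For illustration only (this is not LastOrder’s code; the manager names and legality check are made up), the shape of the output handling is roughly this:

```python
# A minimal sketch, not LastOrder's actual code: take the macro action the
# network chose, check that it is legal right now, and hand it to the right
# manager. Action names and manager methods here are hypothetical.

def is_valid(action, state):
    """Placeholder legality check; the real bot verifies tech, resources, etc."""
    return action in state.get("legal_actions", [])

def execute_macro_action(action, state, production, upgrades, tactics):
    if not is_valid(action, state):
        return False                              # invalid: do nothing this step
    if action.startswith("Build") or action.startswith("Expand"):
        production.queue(action)                  # units, buildings, new bases
    elif action.startswith("Upgrade"):
        upgrades.start(action)                    # tech and upgrades
    elif action.startswith("Attack") or action.endswith("AddArmy"):
        tactics.order(action)                     # attack and add-army actions
    return True
```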

By the way, the 9 pool/10 hatch muta/12 hatch muta opening choices and learning code from Overkill are still there, though Overkill’s opening learning is not used.

learning setup

The learning setup uses Ape-X DQN. The term is as dense as a neutron star! Ape-X is a way to organize deep reinforcement learning; see the paper Distributed Prioritized Experience Replay by Horgan et al of Google’s DeepMind. In “DQN”, D stands for deep and as far as I’m concerned is a term of hype and means “we’re doing the cool stuff.” Q is for Q-learning, the form of reinforcement learning you use when you know what’s good (winning the game) and you have to figure out from experience a policy (that’s a technical term) to achieve it in a series of steps over time. The policy is in effect a box where you feed in the situation and it tells you what to do in that situation. What’s good is represented by a reward (given as a number) that you may receive long after the actions that earned it; that can make it hard to figure out a good policy, which is why you end up training on a cluster of 1000 machines. Finally, “N” is for the neural network that acts as the box that knows the policy.
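
If you haven’t seen Q-learning before, here is the whole idea in a few lines of toy Python. The table maps a (state, action) pair to an estimate of long-term reward, and each observed step nudges the estimate toward “reward now plus the best we expect afterward.” DQN replaces the table with a neural network but keeps the same target.

```python
# Tabular Q-learning in miniature. In DQN the table is replaced by a neural
# network, but the update target is the same idea.
from collections import defaultdict

Q = defaultdict(float)          # Q[(state, action)] -> estimated long-term reward
ALPHA, GAMMA = 0.1, 0.99        # learning rate, discount factor

def q_update(state, action, reward, next_state, next_actions):
    best_next = max((Q[(next_state, a)] for a in next_actions), default=0.0)
    target = reward + GAMMA * best_next          # reward now plus best future
    Q[(state, action)] += ALPHA * (target - Q[(state, action)])
```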

In Ape-X, the learning system consists of a set of Actors that (in the case of LastOrder) play Brood War and record the input features and reward for each time step, plus a Learner (the paper suggests that 1 learner is enough, though you could have more) that feeds the data to a reinforcement learning algorithm. The Actors are responsible for exploring, that is, trying out variations from the current best known policy to see if any of them are improvements. The Ape-X paper suggests having different Actors explore differently so you don’t get stuck in a rut. In the case of LastOrder, the Actors play against a range of opponents. The Learner keeps track of which data points are more important to learn and feeds those in more often to speed up learning. If you hit a surprise, meaning the reward is very different from what you expected (“I thought I was winning, then a nuke hit”), that’s something important to learn.
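
Here is a toy sketch of the prioritized replay idea, leaving out the priority exponents and importance weights that the Ape-X paper uses:

```python
import random

class PrioritizedReplay:
    """Toy prioritized replay: sample transitions in proportion to their
    'surprise' (absolute TD error), so surprising experiences are replayed
    more often. Real Ape-X adds priority exponents and importance weights."""
    def __init__(self):
        self.items = []          # list of (priority, transition)

    def add(self, transition, td_error):
        self.items.append((abs(td_error) + 1e-3, transition))

    def sample(self):
        total = sum(p for p, _ in self.items)
        r = random.uniform(0, total)
        for p, transition in self.items:
            r -= p
            if r <= 0:
                return transition
        return self.items[-1][1]
```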

LastOrder seems to have closely followed the Ape-X DQN formula from the Ape-X paper. They name the exact same set of techniques, although many other choices are possible. Presumably DeepMind knows what they’re doing.

LastOrder does not train with a reward “I won/I lost.” That’s very little information and appears long after the actions that cause it, and it would leave learning glacially slow. They use reward shaping, which means giving a more informative reward number that offers the learning system more clues about whether it is going in the right direction. They use a reward based on the current game score.
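
The paper doesn’t spell out the exact formula, but the shape of score-based reward shaping is simple. A hedged sketch, with made-up scaling:

```python
# Sketch of reward shaping: reward each step by the change in the score
# differential (our score minus the enemy's) rather than only win/loss at the
# end. The exact formula LastOrder uses is not given in the paper.

def shaped_reward(prev_scores, cur_scores, game_over=False, won=False):
    prev_diff = prev_scores["ours"] - prev_scores["theirs"]
    cur_diff = cur_scores["ours"] - cur_scores["theirs"]
    reward = (cur_diff - prev_diff) * 1e-4      # small per-step shaping signal
    if game_over:
        reward += 1.0 if won else -1.0          # the final outcome still dominates
    return reward
```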

the network itself

Following an idea from the 2015 paper Deep Recurrent Q-Learning for Partially Observable MDPs by Hausknecht and Stone, the LastOrder team layered a Long Short-Term Memory network in front of the DQN. We’ve seen LSTM before in Tscmoo (at least at one point; is it still there?). The point of the LSTM network is to remember what’s going on and more fully represent the game state, because in Brood War there is fog of war. So inputs go through the LSTM to expand the currently observed game state into some encoded approximation of all the game state that has been seen so far, then through the DQN to turn that into an action.
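
To make the shape concrete, here is a minimal PyTorch sketch of an LSTM feeding a Q-network. The sizes and layer structure are my guesses, not LastOrder’s actual design:

```python
import torch.nn as nn

class RecurrentQNet(nn.Module):
    """Sketch of an LSTM feeding a Q-network: the LSTM summarizes the history
    of partial observations, the head turns that summary into one Q-value per
    macro action. Sizes are arbitrary, not LastOrder's."""
    def __init__(self, n_features, n_actions, hidden=256):
        super().__init__()
        self.lstm = nn.LSTM(n_features, hidden, batch_first=True)
        self.head = nn.Sequential(
            nn.Linear(hidden, hidden), nn.ReLU(),
            nn.Linear(hidden, n_actions),
        )

    def forward(self, obs_seq, hidden_state=None):
        # obs_seq: (batch, time, n_features)
        out, hidden_state = self.lstm(obs_seq, hidden_state)
        q_values = self.head(out[:, -1])        # Q-values for the latest step
        return q_values, hidden_state
```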

The LastOrder paper does not go into detail. There is not enough information in it to reproduce their network design. The Actor and Learner code is in the repo. I haven’t read it to see if it tells us everything.

Taken together it’s a little complicated, isn’t it? Not something for one hobbyist to try on their own. I think you need a team and a budget to put together something like this.

LastOrder and its macro model - general info

LastOrder (github) now has a 15-page academic paper out, Macro action selection with deep reinforcement learning in StarCraft by 6 authors including Sijia Xu as lead author. The paper does not go into great detail, but it reveals new information. It also uses a lot of technical terms without explanation, so it may be hard to follow if you don’t have the background. Also see my recent post how LastOrder played for a concrete look at its games.

I want to break my discussion into 2 parts. Today I’ll go over general information, tomorrow I’ll work through technical stuff, the network input and output and training and so on.

The name LastOrder turns out to be an ingenious reference to the character Last Order from the A Certain Magical Index fictional universe, the final clone sister. The machine learning process produces a long string of similar models which go into combat for experimental purposes, and you keep the last one. Good name!

LastOrder divides its work into 2 halves, “macro” handled by the machine learning model and “micro” handled by the rule-based code derived from Overkill. It’s a broad distinction; in Steamhammer’s 4-level abstraction terms, I would say that “macro” more or less covers strategy and operations, while “micro” covers tactics and micro. The macro model has a set of actions to build stuff, research tech, and expand to a new base, plus a set of 18 attack actions: 3 different types of attack in each of 5 different places, and 3 “add army” actions which apparently assign units to the 3 types of attack. (The paper says 17 though it lists 18. It looks as though the mutalisk add army action is unused, maybe because mutas are added automatically.) There is also an action to do nothing.

The paper includes a table on the last page, results of a test tournament where each of the 28 AIIDE 2017 participants played 303 games against LastOrder. We get to see how LastOrder scored its advertised 83% win rate: #2 PurpleWave and #3 Iron (rankings from AIIDE 2017) won nearly all games, no doubt overwhelming the rule-based part of LastOrder so that the macro model could not help. Next Microwave scored just under 50%, XIMP scored about 32%, and all others performed worse, including #1 ZZZKBot at 1.64% win rate—9 bots scored under 2%. When LastOrder’s micro part is good enough, the macro part is highly effective.

In AIIDE 2018, #13 LastOrder scored 49%, ranking in the top half. The paper has a brief discussion on page 10. LastOrder was rolled by top finishers because the micro part could not keep up with #9 Iron and above (according to me) or #8 McRave and above (according to the authors, who know things I don’t). Learning can’t help if you’re too burned to learn. LastOrder was also put down by terrans Ecgberht and WillyT, whose play styles are not represented in the 2017 training group, which has only 4 terrans (one of which is Iron that LastOrder cannot touch).

In the discussion of future work (a mandatory part of an academic paper; the work is required to be unending), they talk briefly about how to fix the weaknesses that showed in AIIDE 2018. They mention improving the rule-based part and learning unit-level micro to address the too-burned-to-learn problem, and self-play training to address the limitations of the training opponents. Self-play is the right idea in principle, but it’s not easy. You have to play all 3 races and support all the behaviors you might face, and that’s only the starting point before making it work.

I’d like to suggest another simple idea for future work: Train each matchup separately. You lose generalization, but how much do production and attack decisions generalize between matchups? I could be wrong, but I think not much. Instead, a zerg player could train 3 models, ZvT ZvP and ZvZ, each of which takes fewer inputs and is solving an easier problem. A disadvantage is that protoss becomes relatively more difficult if you allow for mind control.

LastOrder has skills that I did not see in the games I watched. There is code for them, at least; whether it can carry out the skills successfully is a separate question. It can use hydralisks and lurkers. Most interestingly, it knows how to drop. The macro model includes an action to research drop (UpgradeOverlordLoad), an action to assign units and presumably load up for a drop (AirDropAddArmy), and actions to carry out drops in different places (AttackAirDropStart for the enemy starting base, AttackAirDropNatural, AttackAirDropOther1, AttackAirDropOther2, AttackAirDropOther3). The code to carry out drops is AirdropTactic.cpp; it seems to expect to drop either all zerglings, all hydralisks, or all lurkers, no mixed unit types. Does LastOrder use these skills at all? If anybody can point out a game, I’m interested.

Learning when to make hydras and lurkers should not be too hard. If LastOrder rarely or never uses hydras, it must be because it found another plan more effective—in games you make hydras first and then get the upgrades, so it’s easy to figure out. If it doesn’t use lurkers, maybe they didn’t help, or maybe it didn’t have any hydras around to morph after it tried researching the upgrade, because hydras were seen as useless. But still, it’s only 2 steps, it should be doable. Learning to drop is not as easy, though. To earn a reward, the agent has to select the upgrade action, the load action, and the drop action in order, each at a time when it makes sense. Doing only part of the sequence sets you back, and so does doing the whole sequence if you leave too much time between the steps, or drop straight into the enemy army, or make any severe mistake. You have to carry through accurately to get the payoff. It should be learnable, but it may take a long time and trainloads of data. I would be tempted to explicitly represent dependencies like this in some way or another, to tell the model up front the required order of events.
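
For instance (my own idea, not anything in LastOrder), the prerequisites could be written down explicitly and used to mask the actions offered to the model, so it never has to discover the ordering by trial and error:

```python
# My own sketch of making prerequisites explicit: an action is only offered to
# the model once its prerequisites have been completed. The action names follow
# LastOrder's, but the mechanism is hypothetical.

PREREQS = {
    "AirDropAddArmy": {"UpgradeOverlordLoad"},
    "AttackAirDropStart": {"AirDropAddArmy"},
    "AttackAirDropNatural": {"AirDropAddArmy"},
}

def legal_actions(all_actions, completed):
    """Return only the actions whose prerequisites are all completed."""
    return [a for a in all_actions if PREREQS.get(a, set()) <= completed]
```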

AIIDE 2018 - what McRave learned

McRave, like Microwave and no doubt most bots that follow more than one plan, plays different openings against different races. In each opponent’s learning file, it writes win/loss numbers for 15 strategies. Their names all start with “P” for protoss, but I have stripped out the P to make the table more readable. 4 of the strategies are unused: ZealotDrop, NZCore (sounds like no zealot core), Proxy99, and Proxy6. That leaves 11 active openings. The races they were used against are shown in the table. ZZCore (2 zealots before core) was played only against random.

#bottotal12Nexus1GateCorsair1GateRobo21Nexus2GateDragoon2GateExpand4GateDTExpandFFEZCoreZZCore
#1saida16-55  23%1-12 8%--7-17 29%1-12 8%--7-14 33%---
#2cherrypi15-88  15%-6-25 19%---6-25 19%2-20 9%-1-18 5%--
#3cse27-75  26%--7-19 27%--5-17 23%2-15 12%--13-24 35%-
#4bluebluesky29-74  28%--1-14 7%--2-15 12%7-18 28%--19-27 41%-
#5locutus46-56  45%--5-12 29%--15-15 50%14-15 48%--12-14 46%-
#6isamind54-49  52%--7-11 39%--4-10 29%15-14 52%--28-14 67%-
#7daqin60-43  58%--13-11 54%--4-9 31%8-10 44%--35-13 73%-
#9iron56-32  64%27-8 77%--2-7 22%18-9 67%--9-8 53%---
#10zzzkbot75-28  73%-8-7 53%---17-7 71%21-7 75%-29-7 81%--
#11steamhammer64-38  63%-9-9 50%---27-10 73%15-10 60%-13-9 59%--
#12microwave82-21  80%-0-5 0%---39-4 91%30-5 86%-13-7 65%--
#13lastorder97-6  94%-10-2 83%---17-1 94%10-2 83%-60-1 98%--
#14tyr91-10  90%--23-3 88%--7-5 58%31-1 97%--30-1 97%-
#15metabot49-46  52%--8-11 42%--16-12 57%23-14 62%--2-9 18%-
#16letabot77-15  84%12-5 71%--5-5 50%20-4 83%--40-1 98%---
#17arrakhammer102-1  99%------94-1 99%-8-0 100%--
#18ecgberht99-2  98%95-0 100%---3-1 75%--1-1 50%---
#19ualbertabot73-29  72%-----12-8 60%38-6 86%-7-7 50%-16-8 67%
#20ximp41-59  41%--8-14 36%--15-17 47%18-18 50%--0-10 0%-
#21cdbot103-0  100%------103-0 100%----
#22aiur80-21  79%--11-6 65%--13-6 68%41-3 93%--15-6 71%-
#23killall60-43  58%-3-9 25%---6-9 40%19-12 61%-32-13 71%--
#24willyt77-17  82%37-2 95%--3-6 33%23-4 85%--14-5 74%---
#25ailien86-17  83%-31-3 91%---20-5 80%5-6 45%-30-3 91%--
#26cunybot91-8  92%-26-1 96%---36-1 97%14-3 82%-15-3 83%--
#27hellbot103-0  100%---------103-0 100%-
overall-  68%172-27 86%93-61 60%83-101 45%17-35 33%65-30 68%261-176 60%510-180 74%71-29 71%208-68 75%257-118 69%16-8 67%

Unlike other bots that scored comparatively well against SAIDA—meaning they weren’t always wiped summarily from the map—McRave did not rely solely on cloaked units. The DTExpand opening scored best, but 21Nexus was nearly as successful. (McRave scored inconsistently against lower-ranked bots, though, as its author has commented.)

Every strategy came out with some good scores. But here is another analysis: Suppose the goal of the learning algorithm is to find the single most successful strategy (which is not always true—you might want to find the best mix to confuse the opponent’s learning). Leaving aside CDBot and HellBot, which McRave scored 100% against, against how many opponents was each opening the best choice? I made this table by hand, so there might be mistakes. I counted equal best as also best. The “versus” column tells which races the opening was used against.

opening | best | versus
12Nexus | 3 | T
1GateCorsair | 2 | Z
1GateRobo | 0 | P
21Nexus | 0 | T
2GateDragoon | 0 | T
2GateExpand | 6 | P, Z, R
4Gate | 5 | P, Z, R
DTExpand | 2 | T
FFE | 5 | Z, R
ZCore | 4 | P
ZZCore | 0 | R

The counts do not match up well with the overall winning rates. There were 4 never-best openings. This analysis does not say that they are bad openings that dragged down the score. Consider what would have happened if they had not been enabled: Their games would have been distributed among the other openings; there would have been some extra wins and some extra losses, and the ratio would depend on the distribution. 21Nexus was never best, but scored second best against SAIDA and contributed as many wins. On the other hand, the openings which were often best were definitely worth having; they were well-chosen for McRave versus this set of opponents. It could make sense to try those openings first, or more often. On the third hand, notice that the openings with the highest counts were played against the largest number of opponents. There were more bests to count! Openings versus terran scored 5 bests because there were 5 terran opponents.

Plenty of similar analyses could be done. For example, you could count how often or how widely an opening scored above/below the average for each opponent: Did it make a net contribution, or the opposite? It would be another way of seeing whether the openings were well chosen for the opponents they faced.
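
As a sketch of how the above/below-average analysis might go, assuming the learning files have been parsed into per-opponent win/loss counts per opening (that data layout is my assumption, not McRave’s format):

```python
# Sketch of the "above or below average" analysis: for each opponent, compare
# each opening's win rate against that opponent's overall win rate. Assumes
# results[opponent][opening] = (wins, losses) has already been parsed.

def contribution_report(results):
    report = {}
    for opponent, openings in results.items():
        games = sum(w + l for w, l in openings.values())
        wins = sum(w for w, _ in openings.values())
        overall = wins / games if games else 0.0
        report[opponent] = {
            opening: (w / (w + l)) - overall      # positive = pulled the score up
            for opening, (w, l) in openings.items() if w + l > 0
        }
    return report
```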

Next I want to start watching some replays. I think I will start with LastOrder, which did all its learning offline yet held its win rate steady against the onslaught of learning bots. I’m expecting it to be interesting and sophisticated in some way.

AIIDE 2018 - what UAlbertaBot learned

UAlbertaBot played random, and its openings are chosen, not according to the opponent’s race, but according to its own once the game starts. It has 3 protoss, 4 terran, and 4 zerg openings. Playing random gives the disadvantage of having about 1/3 as many games to figure out how to counter the opponent with each race. The countervailing advantage, of course, is that the opponent can’t predict what is coming its way.

103 rounds were played and UAlbertaBot does not deliberately drop data, so some of the totals add up to more than the 100 official rounds. UAlbertaBot also had 46 crashes, so some totals add up to less. For example, it recorded 96 games against LastOrder.

The official site doesn’t offer binaries for the bots which were carried over from last year, but this should be the 2017 version of UAlbertaBot. It has enemy-specific strategies configured for 13 opponents, of which 5 are also in this tournament: #9 Iron, #10 ZZZKBot, #16 LetaBot, #20 Ximp, and #22 Aiur. For ZZZKBot, only the protoss opening is set; for the others, all 3 races have openings set. Looking at the table, we see that UAlbertaBot did not always try all of its openings, and the blanks in the table do not always correspond to enemy-specific openings. Apparently in this UAlbertaBot version, the enemy-specific strategies act as hints rather than requirements: When available they are tried first, and when not, the default opening is tried first (ZealotRush, MarineRush, or ZerglingRush). If the first opening tried performs well enough, UAlbertaBot sticks with it.

#bottotalProtossTerranZerg
DTRushDragoonRushZealotRush4RaxMarinesMarineRushTankPushVultureRush2HatchHydra3HatchMuta3HatchScourgeZerglingRush
#1saida13-88  13%12-7 63%0-2 0%0-5 0%0-9 0%0-9 0%1-13 7%0-9 0%0-9 0%0-9 0%0-8 0%0-8 0%
#2cherrypi1-99  1%0-8 0%0-7 0%0-7 0%0-8 0%1-11 8%0-8 0%0-8 0%0-11 0%0-11 0%0-10 0%0-10 0%
#3cse2-99  2%0-7 0%2-14 12%0-7 0%0-11 0%0-10 0%0-10 0%0-10 0%0-8 0%0-8 0%0-7 0%0-7 0%
#4bluebluesky11-92  11%0-4 0%3-10 23%4-11 27%0-5 0%0-5 0%2-11 15%0-5 0%0-9 0%0-8 0%0-8 0%2-16 11%
#5locutus6-97  6%0-7 0%4-17 19%0-7 0%0-8 0%0-8 0%1-11 8%0-8 0%1-10 9%0-7 0%0-7 0%0-7 0%
#6isamind5-96  5%0-7 0%4-17 19%0-7 0%0-9 0%0-8 0%0-8 0%0-8 0%0-7 0%0-7 0%0-7 0%1-11 8%
#7daqin12-90  12%4-12 25%0-4 0%2-9 18%0-6 0%0-6 0%1-6 14%0-5 0%2-13 13%0-7 0%0-7 0%3-15 17%
#8mcrave29-71  29%5-12 29%1-6 14%0-5 0%0-3 0%10-13 43%1-5 17%0-3 0%2-6 25%0-3 0%0-3 0%10-12 45%
#9iron9-94  9%0-10 0%1-14 7%0-9 0%0-8 0%0-8 0%0-8 0%1-12 8%1-6 14%1-6 14%0-4 0%5-9 36%
#10zzzkbot13-87  13%0-3 0%0-3 0%13-20 39%0-9 0%0-9 0%0-9 0%0-9 0%0-7 0%0-6 0%0-6 0%0-6 0%
#11steamhammer11-92  11%0-5 0%0-5 0%8-19 30%1-10 9%0-6 0%0-6 0%0-6 0%0-7 0%0-7 0%0-7 0%2-14 12%
#12microwave20-81  20%--18-7 72%0-7 0%2-14 12%0-7 0%0-7 0%0-10 0%0-10 0%0-10 0%0-9 0%
#13lastorder4-92  4%0-6 0%0-6 0%2-12 14%2-10 17%0-5 0%0-5 0%0-5 0%0-11 0%0-11 0%0-11 0%0-10 0%
#14tyr36-61  37%5-12 29%0-4 0%0-5 0%0-2 0%3-4 43%13-7 65%1-2 33%13-15 46%0-3 0%0-3 0%1-4 20%
#15metabot35-56  38%4-5 44%6-5 55%2-4 33%1-6 14%3-9 25%1-6 14%0-3 0%0-2 0%6-3 67%3-3 50%9-10 47%
#16letabot48-44  52%11-14 44%0-3 0%2-6 25%0-2 0%1-4 20%0-2 0%4-7 36%30-6 83%---
#17arrakhammer56-41  58%--23-6 79%0-6 0%0-6 0%0-6 0%0-6 0%---33-11 75%
#18ecgberht40-56  42%9-7 56%9-8 53%1-4 20%0-2 0%0-5 0%0-2 0%6-7 46%0-3 0%0-3 0%0-3 0%15-12 56%
#20ximp38-56  40%0-2 0%7-7 50%4-5 44%0-4 0%0-4 0%9-19 32%1-6 14%--17-9 65%-
#21cdbot44-54  45%--23-4 85%0-2 0%19-15 56%0-2 0%0-2 0%0-6 0%1-9 10%0-5 0%1-9 10%
#22aiur57-45  56%35-1 97%--0-2 0%0-2 0%0-2 0%11-10 52%1-5 17%9-15 38%0-3 0%1-5 17%
#23killall73-27  73%--30-8 79%0-2 0%12-6 67%0-2 0%0-2 0%---31-7 82%
#24willyt36-55  40%3-12 20%1-8 11%0-5 0%0-4 0%0-5 0%0-4 0%10-11 48%---22-6 79%
#25ailien71-30  70%--18-11 62%16-10 62%2-4 33%0-2 0%0-2 0%---35-1 97%
#26cunybot75-15  83%--23-1 96%-30-7 81%-----22-7 76%
#27hellbot100-2  98%--33-0 100%-41-2 95%-----26-0 100%
overall-  33%88-141 38%38-140 21%206-184 53%20-145 12%124-185 40%29-161 15%34-153 18%50-151 25%17-133 11%20-121 14%219-206 52%

The DT rush caused surprising problems for SAIDA, but terran and zerg had nothing. Did playing random contribute? Does the updated current SAIDA, flame-hardened on SSCAIT, react better? The hand-chosen 2 hatch hydra also did strikingly well against LetaBot, not an obvious choice. Every opening had a plus score against some opponent, though VultureRush barely made it over. Looking across the bottom row, the default openings had the best overall results for each race—they were chosen correctly. Also, we can see that protoss was UAlbertaBot’s best race, and terran the worst; we already knew that, but here we see it in the numbers.

AIIDE 2018 - what Microwave learned

Microwave uses UCB and keeps its learning data in the same file format as UAlbertaBot, one file per opponent listing on each line an opening, a count of wins, and a count of losses. It’s a simple format that is also used outside the UAlbertaBot family. Microwave adds a twist: It does not allow the count of wins or the count of losses to exceed 10. I’m not sure what the exact update rule is without reading the code, but the effect is that only the more recent game results are remembered. It’s appropriate if the enemy is expected to be learning too, and to change its strategy rapidly so that Microwave has to keep adapting.
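
For the curious, here is roughly what UCB selection with capped counts looks like. The exact update rule in Microwave may differ, so treat this as a sketch of the idea rather than its code:

```python
import math

# Sketch of UCB1 opening selection with Microwave-style capped counts.
# stats maps opening -> [wins, losses]; the real update rule may differ.

CAP = 10

def record_result(stats, opening, won):
    wins, losses = stats[opening]
    if won:
        wins = min(wins + 1, CAP)
    else:
        losses = min(losses + 1, CAP)
    stats[opening] = [wins, losses]

def pick_opening(stats, c=1.4):
    total = sum(w + l for w, l in stats.values())
    def ucb(opening):
        w, l = stats[opening]
        n = w + l
        if n == 0:
            return float("inf")                  # try untested openings first
        return w / n + c * math.sqrt(math.log(total + 1) / n)
    return max(stats, key=ucb)
```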

Microwave plays different strategies against each race. Against Terran it has 7, against Protoss and Zerg 8, and against random 6. UAlbertaBot was the only random opponent. The strategies partly overlap. For example, 10Hatch9Pool9gas is played against both terran and protoss, while 9HatchMain8Pool8Gas is played only against zerg. The table has big blank spaces full of unplayed strategies. Maybe I should have sorted it by race, instead of by rank?

#bottotal10Hatch9Pool9gas12Pool3HatchPoolHydra5HatchGasHydra5Pool9HatchMain8Pool8Gas9Pool9PoolExpo9PoolHatch9PoolLurker9PoolSpeed9PoolSpeedLing9PoolSunkenOverpoolOverpoolSpeedZvT_12HatchHydraZvT_12HatchLurkerZvT_12HatchMutaZvZ_Overpool11Gas
#1saida0-70  0%0-10 0%---0-10 0%-0-10 0%--0-10 0%-----0-10 0%0-10 0%0-10 0%-
#2cherrypi0-80  0%-0-10 0%--0-10 0%0-10 0%--0-10 0%-0-10 0%-0-10 0%-0-10 0%---0-10 0%
#3cse0-80  0%0-10 0%-0-10 0%0-10 0%0-10 0%------0-10 0%---0-10 0%0-10 0%0-10 0%-
#4bluebluesky0-80  0%0-10 0%-0-10 0%0-10 0%0-10 0%------0-10 0%---0-10 0%0-10 0%0-10 0%-
#5locutus1-80  1%1-10 9%-0-10 0%0-10 0%0-10 0%------0-10 0%---0-10 0%0-10 0%0-10 0%-
#6isamind0-80  0%0-10 0%-0-10 0%0-10 0%0-10 0%------0-10 0%---0-10 0%0-10 0%0-10 0%-
#7daqin0-80  0%0-10 0%-0-10 0%0-10 0%0-10 0%------0-10 0%---0-10 0%0-10 0%0-10 0%-
#8mcrave7-68  9%1-10 9%-1-10 9%0-5 0%1-8 11%------1-10 9%---1-10 9%0-5 0%2-10 17%-
#9iron0-70  0%0-10 0%---0-10 0%-0-10 0%--0-10 0%-----0-10 0%0-10 0%0-10 0%-
#10zzzkbot24-37  39%-5-8 38%--0-2 0%9-10 47%--9-10 47%-0-1 0%-0-1 0%-0-1 0%---1-4 20%
#11steamhammer57-15  79%-10-2 83%--6-7 46%1-2 33%--10-2 83%-10-0 100%-0-1 0%-10-1 91%---10-0 100%
#13lastorder24-21  53%-0-1 0%--10-2 83%0-1 0%--2-4 33%-0-1 0%-10-6 62%-1-3 25%---1-3 25%
#14tyr15-13  54%2-3 40%-0-1 0%3-4 43%10-1 91%------0-1 0%---0-1 0%0-1 0%0-1 0%-
#15metabot41-13  76%10-2 83%-8-3 73%0-1 0%0-1 0%------10-1 91%---1-2 33%2-3 40%10-0 100%-
#16letabot26-18  59%4-5 44%---1-2 33%-10-0 100%--8-5 62%-----0-1 0%0-1 0%3-4 43%-
#17arrakhammer27-22  55%-7-8 47%--10-0 100%0-1 0%--0-1 0%-3-4 43%-5-4 56%-2-3 40%---0-1 0%
#18ecgberht38-18  68%0-1 0%---10-0 100%-0-1 0%--1-2 33%-----10-7 59%10-0 100%7-7 50%-
#19ualbertabot50-10  83%----10-1 91%-0-1 0%10-0 100%10-4 71%---10-4 71%10-0 100%-----
#20ximp27-15  64%2-3 40%-0-1 0%0-1 0%0-1 0%------10-0 100%---5-6 45%0-1 0%10-2 83%-
#21cdbot46-13  78%-10-0 100%--0-1 0%1-2 33%--4-5 44%-10-3 77%-1-2 33%-10-0 100%---10-0 100%
#22aiur48-15  76%1-2 33%-10-1 91%7-5 58%0-1 0%------9-3 75%---1-2 33%10-1 91%10-0 100%-
#23killall40-5  89%-10-0 100%--0-1 0%10-0 100%--0-1 0%-10-0 100%-10-1 91%-0-1 0%---0-1 0%
#24willyt34-10  77%4-5 44%---0-1 0%-0-1 0%--0-1 0%-----10-2 83%10-0 100%10-0 100%-
#25ailien28-32  47%-9-10 47%--1-4 20%0-1 0%--3-6 33%-0-1 0%-10-2 83%-5-7 42%---0-1 0%
#26cunybot67-1  99%-10-0 100%--10-0 100%0-1 0%--10-0 100%-10-0 100%-7-0 100%-10-0 100%---10-0 100%
#27hellbot74-0  100%10-0 100%-10-0 100%6-0 100%10-0 100%------8-0 100%---10-0 100%10-0 100%10-0 100%-
overall-  42%35-101 26%61-39 61%29-66 31%16-66 20%79-113 41%21-28 43%10-23 30%10-0 100%48-43 53%9-28 24%43-20 68%38-65 37%53-31 63%10-0 100%38-26 59%38-101 27%42-82 34%62-94 40%32-20 62%

The total column tells how successful Microwave was in recent games against each opponent. You might want to compare the percentages against the overall win rates from the official crosstable; they sometimes vary curiously. When the recorded results were less successful than the total results, it suggests that Microwave may have forgotten too much (though it could be random fluctuation). For example, Microwave scored 80% against LetaBot overall, but 59% in the recent games in this table.

The overall row tells how successful each opening was in recent games. Every opening was successful against some opponents, so there were no useless strategies. The body of the table, from #10 ZZZKBot and down, is full of strong contrasts, meaning that there was a big difference between the successful and unsuccessful openings against each opponent. That suggests that learning must have been useful.

AIIDE 2018 - what Locutus learned

The Locutusoids have learning data only slightly different from Steamhammer’s. I have run my summarizer code for CSE, BlueBlueSky, Locutus, and ISAMind, skipping DaQin because it recorded only 1 game per opponent (which tickles a bug in my code). I am thinking of posting only the Locutus results, because the others don’t hold much extra interest. Locutus plays a wider range of openings than the others (perhaps because newer bots have to restrict their scope). CSE in particular is more in the do-one-thing-well camp. Besides, all of them had high win rates against lower-ranked opponents; they did not have much to learn. I don’t see a point in piling up data about similar players.

But if people want, I can post them all. Any requests?

Locutus is the only Locutusoid to use pre-learned data. Some of the others had their own ways of preparing for known opponents. For example, CSE is configured with several enemy-specific strategies, such as DT drop against #9 Iron.

Here is a summary of the pre-learned data used by Locutus. Locutus is configured to retain at most 200 game records per opponent, so that’s as much pre-learned data as it makes sense to give it. When you give it that much, each tournament game record added at the end causes one pre-learned record to scroll off the beginning. At the end of a 100 round tournament, half the game records are retained from the pre-learned data and half are tournament games—the pre-learned data more or less dominated tournament data for decisions during the tournament.

# | opponent | games | wins
7 | DaQin | 35 | 91%
9 | Iron | 200 | 93%
10 | ZZZKBot | 200 | 76%
14 | Tyr | 200 | 96%
17 | Arrakhammer | 200 | 88%
19 | UAlbertaBot | 71 | 100%
22 | AIUR | 51 | 96%
25 | AILien | 200 | 96%
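
Mechanically, the 200-record cap is just a sliding window. A minimal sketch of the effect (file format details aside):

```python
from collections import deque

# Sketch of the sliding-window effect: seed the per-opponent history with
# pre-learned game records, cap it at 200, and each new tournament game pushes
# the oldest record out. Details of the real file format are omitted.

def make_history(prelearned_records, max_records=200):
    """Seed the history with pre-learned games, capped at max_records."""
    return deque(prelearned_records, maxlen=max_records)

def record_game(history, record):
    history.append(record)       # when full, the oldest record scrolls off
```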


Here is the final data. For every opponent that has pre-learned data, much or all of the pre-learned data is retained until the end.

#1 saida

openinggameswins
10-15GateGoon220%
10Gate25NexusFE297%
DTDrop326%
Proxy4GateGoon70%
Proxy4GateGoon2p30%
Proxy9-9Gate100%
6 openings1034%
planpredictedrecognizedaccuracy
countgameswinscountgameswinsgood?
Not fast rush10299%4%10299%3%99%0%
Proxy0%0%11%100%0%0%
Unknown11%0%0%0%0%0%


Locutus and the Locutusoids use “Not fast rush” as a catch-all: The enemy’s opening is not a fast rush, and it is not more precisely recognized than that.

#2 cherrypi

openinggameswins
ForgeExpand4Gate2Archon1916%
ForgeExpand5GateGoon555%
ForgeExpandSpeedlots166%
ProxyHeavyZealotRush617%
ProxyHeavyZealotRush2p757%
5 openings10312%
planpredictedrecognizedaccuracy
countgameswinscountgameswinsgood?
Heavy rush1313%23%3534%20%23%0%
Not fast rush8986%10%6866%7%64%0%
Unknown11%0%0%0%0%0%


Why are the successful proxy openings so little played? The “2p” version is played only on 2-player maps; the other version only on 3- and 4-player maps. Looking into the file by hand, I see that they were both successful from early in the tournament, so it’s not a matter of discovering them late. Perhaps the map size specialization interferes with the learning process? Perhaps they are deliberately little played to prevent the opponent from adapting? Have to read the code for this one. The proxy openings show similar numbers across other opponents, so it's not a one-off. Locutus’s learning in general does not look like it concentrates hard on playing the best-performing openings.

#3 cse

openinggameswins
2GateDTExpo30%
2GateDTRush2438%
4GateGoon4630%
Proxy4GateGoon450%
Proxy4GateGoon2p862%
Proxy9-9Gate60%
ProxyHeavyZealotRush40%
ProxyHeavyZealotRush2p250%
Turtle650%
9 openings10333%
planpredictedrecognizedaccuracy
countgameswinscountgameswinsgood?
Dark templar1010%40%2827%43%10%0%
Fast rush0%0%66%0%0%0%
Heavy rush0%0%33%100%0%0%
Not fast rush9289%33%6664%29%63%0%
Unknown11%0%0%0%0%0%

#4 bluebluesky

openinggameswins
2GateDTExpo1331%
2GateDTRush743%
4GateGoon5843%
9-9GateDefensive30%
Proxy4GateGoon1100%
Proxy4GateGoon2p2100%
Proxy9-9Gate20%
ProxyHeavyZealotRush20%
ProxyHeavyZealotRush2p10%
Turtle1429%
10 openings10338%
planpredictedrecognizedaccuracy
countgameswinscountgameswinsgood?
Dark templar6058%32%5553%31%77%0%
Not fast rush3938%51%4544%49%82%0%
Proxy33%0%33%0%67%0%
Unknown11%0%0%0%0%0%

#6 isamind

openinggameswins
2GateDTRush1771%
4GateGoon6058%
9-9GateDefensive633%
Proxy4GateGoon2100%
Proxy4GateGoon2p367%
Proxy9-9Gate10%
ProxyHeavyZealotRush20%
ProxyHeavyZealotRush2p10%
Turtle1155%
9 openings10357%
planpredictedrecognizedaccuracy
countgameswinscountgameswinsgood?
Dark templar0%0%11%100%0%0%
Fast rush55%60%77%100%20%0%
Heavy rush1313%54%77%71%15%0%
Not fast rush7876%59%8583%51%85%0%
Proxy66%33%33%100%0%0%
Unknown11%100%0%0%0%0%

#7 daqin

openinggameswins
2GateDTExpo4100%
2GateDTRush25100%
4GateGoon4498%
9-9GateDefensive1968%
Proxy4GateGoon683%
Proxy4GateGoon2p1100%
Proxy9-9Gate475%
ProxyHeavyZealotRush2100%
ProxyHeavyZealotRush2p1100%
Turtle3238%
10 openings13879%
planpredictedrecognizedaccuracy
countgameswinscountgameswinsgood?
Fast rush5137%49%4130%78%31%0%
Not fast rush8662%97%9770%79%71%0%
Unknown11%100%0%0%0%0%


Locutus scored lower versus DaQin in the tournament than in the pre-learning data. It may mean that DaQin was updated in private before the tournament. You have to expect that; I assume it is why there were only 35 games in the pre-learning data.

#8 mcrave

openinggameswins
2GateDTExpo10%
2GateDTRush2767%
4GateGoon4955%
9-9GateDefensive633%
Proxy4GateGoon367%
Proxy4GateGoon2p367%
Proxy9-9Gate10%
ProxyHeavyZealotRush450%
ProxyHeavyZealotRush2p10%
Turtle825%
10 openings10353%
planpredictedrecognizedaccuracy
countgameswinscountgameswinsgood?
Dark templar22%50%22%0%0%0%
Fast rush1313%31%1212%25%8%0%
Heavy rush1515%40%66%83%7%0%
Not fast rush7270%61%8381%57%81%0%
Unknown11%0%0%0%0%0%

#9 iron

openinggameswins
10-15GateGoon580%
10Gate25NexusFE10591%
DTDrop8991%
Proxy4GateGoon1100%
4 openings20091%
planpredictedrecognizedaccuracy
countgameswinscountgameswinsgood?
Not fast rush15276%91%7437%97%39%14%
Unknown10%100%2211%91%0%0%
Wall-in4724%91%10452%87%70%0%

#10 zzzkbot

openinggameswins
ForgeExpand4Gate2Archon786%
ForgeExpand5GateGoon9794%
ForgeExpandSpeedlots8695%
ProxyHeavyZealotRush580%
ProxyHeavyZealotRush2p540%
5 openings20092%
planpredictedrecognizedaccuracy
countgameswinscountgameswinsgood?
Fast rush6332%95%10754%91%54%0%
Heavy rush8140%90%7437%93%40%0%
Not fast rush5628%93%1910%100%9%0%

#11 steamhammer

openinggameswins
ForgeExpand4Gate2Archon1100%
ForgeExpand5GateGoon10296%
2 openings10396%
planpredictedrecognizedaccuracy
countgameswinscountgameswinsgood?
Fast rush22%100%77%100%0%0%
Heavy rush3736%100%2221%100%19%0%
Hydra bust66%67%1414%93%17%0%
Not fast rush5755%96%6058%95%61%0%
Unknown11%100%0%0%0%0%

#12 microwave

openinggameswins
ForgeExpand4Gate2Archon5100%
ForgeExpand5GateGoon8394%
ForgeExpandSpeedlots1593%
3 openings10394%
planpredictedrecognizedaccuracy
countgameswinscountgameswinsgood?
Fast rush22%100%1212%100%0%0%
Heavy rush3837%95%2322%100%21%0%
Hydra bust1817%94%1616%81%11%0%
Not fast rush4443%93%5250%94%43%0%
Unknown11%100%0%0%0%0%

#13 lastorder

openinggameswins
ForgeExpand5GateGoon10398%
1 openings10398%
planpredictedrecognizedaccuracy
countgameswinscountgameswinsgood?
Heavy rush4948%100%5856%97%55%0%
Not fast rush5351%96%4544%100%43%0%
Unknown11%100%0%0%0%0%

#14 tyr

openinggameswins
12Nexus5ZealotFECannons57100%
2GateDTExpo250%
4GateGoon103100%
9-9GateDefensive667%
Proxy9-9Gate333%
ProxyHeavyZealotRush10%
Turtle2889%
7 openings20096%
planpredictedrecognizedaccuracy
countgameswinscountgameswinsgood?
Fast rush2110%86%10%100%0%0%
Heavy rush8944%100%189%89%10%0%
Not fast rush8040%95%15075%97%54%38%
Proxy63%67%10%100%0%0%
Unknown42%100%3015%90%0%0%

#15 metabot

openinggameswins
2GateDTRush35100%
4GateGoon4789%
ProxyHeavyZealotRush2100%
Turtle14100%
4 openings9895%
planpredictedrecognizedaccuracy
countgameswinscountgameswinsgood?
Dark templar1717%88%5051%90%71%0%
Fast rush1010%100%11%100%0%0%
Heavy rush22%100%77%100%50%0%
Not fast rush6869%96%4041%100%49%0%
Unknown11%100%0%0%0%0%

#16 letabot

openinggameswins
10-15GateGoon10%
10Gate25NexusFE250%
4GateGoon475%
DTDrop9696%
4 openings10393%
planpredictedrecognizedaccuracy
countgameswinscountgameswinsgood?
Heavy rush44%75%11%100%0%0%
Not fast rush4039%98%1010%90%10%0%
Unknown22%50%0%0%0%0%
Wall-in5755%93%9289%93%89%0%

#17 arrakhammer

openinggameswins
ForgeExpand4Gate2Archon1369%
ForgeExpand5GateGoon14698%
ForgeExpandSpeedlots2580%
ProxyHeavyZealotRush1155%
ProxyHeavyZealotRush2p560%
5 openings20090%
planpredictedrecognizedaccuracy
countgameswinscountgameswinsgood?
Fast rush3718%92%2814%100%3%0%
Heavy rush8241%88%9648%89%46%0%
Naked expand126%92%63%83%25%8%
Not fast rush6934%93%6934%90%38%0%
Unknown0%0%10%100%0%0%

#18 ecgberht

openinggameswins
4GateGoon53100%
DTDrop50100%
2 openings103100%
planpredictedrecognizedaccuracy
countgameswinscountgameswinsgood?
Heavy rush5351%100%8885%100%81%0%
Not fast rush4342%100%1515%100%9%0%
Unknown77%100%0%0%0%0%

#19 ualbertabot

openinggameswins
4GateGoon63100%
9-9GateDefensive5100%
ForgeExpand5GateGoon9493%
Proxy9-9Gate12100%
4 openings17496%
planpredictedrecognizedaccuracy
countgameswinscountgameswinsgood?
Dark templar63%100%63%100%17%0%
Fast rush3420%88%2011%100%18%0%
Heavy rush5532%96%3721%100%31%9%
Hydra bust106%100%95%89%30%0%
Not fast rush6839%99%9253%93%46%6%
Proxy0%0%11%100%0%0%
Unknown11%100%95%100%0%0%

#20 ximp

openinggameswins
2GateDTRush250%
4GateGoon10195%
2 openings10394%
planpredictedrecognizedaccuracy
countgameswinscountgameswinsgood?
Not fast rush5351%96%103100%94%100%0%
Unknown5049%92%0%0%0%0%

#21 cdbot

openinggameswins
9-9GateDefensive1100%
ForgeExpand5GateGoon102100%
2 openings103100%
planpredictedrecognizedaccuracy
countgameswinscountgameswinsgood?
Fast rush55%100%1010%100%0%0%
Heavy rush4342%100%3635%100%40%5%
Hydra bust0%0%22%100%0%0%
Not fast rush5351%100%4645%100%43%8%
Proxy11%100%33%100%0%0%
Unknown11%100%66%100%0%0%

#22 aiur

openinggameswins
10-15GateGoon367%
12Nexus5ZealotFE5100%
2GateDTExpo1100%
2GateDTRush4100%
4GateGoon11496%
Proxy4GateGoon3100%
Proxy9-9Gate683%
ProxyHeavyZealotRush3100%
ProxyHeavyZealotRush2p1100%
Turtle1493%
10 openings15495%
planpredictedrecognizedaccuracy
countgameswinscountgameswinsgood?
Dark templar3019%97%3120%94%33%0%
Heavy rush3925%92%5334%98%28%0%
Naked expand138%85%32%67%23%38%
Not fast rush7247%97%5536%93%44%1%
Proxy0%0%64%100%0%0%
Unknown0%0%64%100%0%0%

#23 killall

openinggameswins
ForgeExpand5GateGoon10398%
1 openings10398%
planpredictedrecognizedaccuracy
countgameswinscountgameswinsgood?
Fast rush33%100%88%100%0%0%
Heavy rush4544%98%3837%97%22%0%
Hydra bust0%0%11%100%0%0%
Not fast rush5452%98%5654%98%41%0%
Unknown11%100%0%0%0%0%

#24 willyt

openinggameswins
10-15GateGoon8100%
10Gate25NexusFE7100%
4GateGoon64100%
DTDrop21100%
Turtle3100%
5 openings103100%
planpredictedrecognizedaccuracy
countgameswinscountgameswinsgood?
Heavy rush6765%100%6462%100%69%0%
Not fast rush3534%100%3635%100%46%0%
Proxy0%0%33%100%0%0%
Unknown11%100%0%0%0%0%

#25 ailien

openinggameswins
ForgeExpand4Gate2Archon2496%
ForgeExpand5GateGoon3397%
ForgeExpandSpeedlots12898%
ProxyHeavyZealotRush1283%
ProxyHeavyZealotRush2p3100%
5 openings20097%
planpredictedrecognizedaccuracy
countgameswinscountgameswinsgood?
Heavy rush13266%98%10150%96%57%2%
Naked expand0%0%21%100%0%0%
Not fast rush6834%96%9548%98%62%0%
Unknown0%0%21%100%0%0%

#26 cunybot

openinggameswins
ForgeExpand5GateGoon93100%
1 openings93100%
planpredictedrecognizedaccuracy
countgameswinscountgameswinsgood?
Fast rush11%100%22%100%0%0%
Heavy rush4447%100%2325%100%25%2%
Not fast rush4751%100%6570%100%72%4%
Unknown11%100%33%100%0%0%

#27 hellbot

openinggameswins
2GateDTRush20100%
4GateGoon83100%
2 openings103100%
planpredictedrecognizedaccuracy
countgameswinscountgameswinsgood?
Not fast rush4948%100%103100%100%100%0%
Unknown5452%100%0%0%0%0%

overall

totalPvTPvPPvZPvR
openinggameswinsgameswinsgameswinsgameswinsgameswins
10-15GateGoon3936% 3633% 367%
10Gate25NexusFE14374% 14374%
12Nexus5ZealotFE5100% 5100%
12Nexus5ZealotFECannons57100% 57100%
2GateDTExpo2442% 2442%
2GateDTRush16179% 16179%
4GateGoon88985% 12199% 70582% 63100%
9-9GateDefensive4659% 4052% 1100% 5100%
DTDrop28885% 28885%
ForgeExpand4Gate2Archon6968% 6968%
ForgeExpand5GateGoon101192% 91792% 9493%
ForgeExpandSpeedlots27090% 27090%
Proxy4GateGoon2759% 812% 1979%
Proxy4GateGoon2p2060% 30% 1771%
Proxy9-9Gate4547% 100% 2339% 12100%
ProxyHeavyZealotRush5456% 2045% 3462%
ProxyHeavyZealotRush2p2756% 743% 2060%
Turtle13063% 3100% 12762%
total330583%61280%120877%131189%17496%
openings played1881364

AIIDE 2018 - what Steamhammer learned

In CIG, Steamhammer was broken. My findings on what Steamhammer learned in CIG 2018 are not valid, because Steamhammer rarely played the opening it thought it was playing; it played a broken version of the opening that left out drones and buildings. That is likely why the zergling rushes were successful in CIG: There was little in the build to leave out, so the build played more nearly as written. In this tournament, Steamhammer seems to have been working fine (though we’ll see when the replays come out)—well, working fine except for the usual bugs, some of which are fixed in version 2.1. Also, Steamhammer’s learning was revamped to better bamboozle opponents that tried to learn its patterns; the result is that its learning behavior is richer. I think these tables are full of interesting data.

103 rounds were played, of which 100 were official. Steamhammer is set to record at most 100 game records per opponent, so games from the first 3 rounds may have been dropped. That’s why the numbers don’t exactly match the official crosstable, even though the game totals look correct.

Steamhammer’s game records contain much more information than I can summarize in tidy little tables. This time I captured a little more of it, adding a table about the plan recognizer. For each plan that was recognized during a game, the table shows how often the plan was predicted before the game, how often it was recognized during the game, and the win rate in each of those cases. It also tries to measure the accuracy of the prediction. The plan recognizer itself is not very accurate; it often fails to recognize what is in front of it, calling the plan Unknown. The “?” column shows how often the plan was predicted and then no plan was recognized. The plan recognizer can also blow it completely and recognize the wrong plan. When the opponent plays predictably, the plan predictor is generally more accurate than the plan recognizer. When the opponent plays unpredictably, I don’t know which is more accurate! Either way, the plan prediction is more important early in the tournament; once Steamhammer has accumulated enough experience, it pays more attention to its learning data, and it doesn’t matter whether the predicted plan is good.
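
The tallying behind these tables amounts to roughly this (a simplified sketch, assuming each game record carries the predicted plan, the plan recognized during the game, and the result):

```python
from collections import Counter, defaultdict

# Simplified sketch of the tallying behind the plan tables. Assumes each game
# record carries the predicted plan, the recognized plan, and whether the game
# was won; the real game records hold much more than this.

def plan_summary(records):
    predicted = defaultdict(lambda: [0, 0])     # plan -> [games predicted, wins]
    recognized = defaultdict(lambda: [0, 0])    # plan -> [games recognized, wins]
    good = Counter()                            # prediction matched recognition
    unrecognized = Counter()                    # predicted, but nothing recognized
    for rec in records:
        predicted[rec["predicted"]][0] += 1
        predicted[rec["predicted"]][1] += rec["won"]
        recognized[rec["recognized"]][0] += 1
        recognized[rec["recognized"]][1] += rec["won"]
        if rec["recognized"] == rec["predicted"]:
            good[rec["predicted"]] += 1
        elif rec["recognized"] == "Unknown":
            unrecognized[rec["predicted"]] += 1
    return predicted, recognized, good, unrecognized
```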

#1 saida

openinggameswins
11Gas10PoolLurker30%
11Gas10PoolMuta10%
11HatchTurtleHydra10%
2HatchHydraBust10%
3HatchHydraExpo10%
3HatchLurker10%
4HatchBeforeGas20%
4PoolHard30%
5PoolHard10%
5PoolSoft10%
6Pool10%
7PoolSoft20%
9Hatch8Pool20%
9HatchExpo9Pool9Gas10%
9Pool10%
9PoolExpo10%
9PoolLurker812%
9PoolSpeedAllIn10%
9PoolSunkSpeed10%
AntiFact_13Pool80%
AntiFact_2Hatch120%
AntiFactory160%
AntiZeal_12Hatch20%
Over10Hatch2SunkHard10%
OverhatchLateGas10%
Overpool+110%
OverpoolHatch10%
PurpleSwarmBuild10%
Sparkle 2HatchMuta20%
ZvP_3HatchPoolHydra10%
ZvT_12PoolMuta20%
ZvT_2HatchMuta10%
ZvT_3HatchMuta10%
ZvZ_12PoolLing10%
ZvZ_Overgas9Pool10%
ZvZ_Overpool11Gas1315%
ZvZ_Overpool9Gas10%
ZvZ_OverpoolTurtle10%
38 openings1003%
planpredictedrecognizedaccuracy
countgameswinscountgameswinsgood?
Factory100100%3%9191%3%91%2%
Naked expand0%0%77%0%0%0%
Unknown0%0%22%0%0%0%


SAIDA is a good example of how Steamhammer reacts to a predictable opponent. First, it repeatedly tried its counters to the opponent’s Factory plan, the 3 “AntiFact” openings (you may call them fake news openings if you like). In this case the counters did not work; SAIDA is too strong. Then it explored more widely. Steamhammer scored 1 win with a fast lurker opening, and repeated the opening to no avail (maybe Steamhammer got lucky once, or maybe SAIDA learned the timing). It also scored a win with a ZvZ fast mutalisk opening, and repeating that did bring a second win for a total of 3 in 100 rounds. The smaller second table shows that the plan predictor was 100% accurate over the last 100 rounds in predicting SAIDA’s factory-first play, while the plan recognizer was 91% accurate and actually saw a command center first in 7 games.

#2 cherrypi

openinggameswins
2.5HatchMuta10%
3HatchPoolMuta10%
4HatchBeforeGas10%
4PoolSoft10%
6PoolSpeed20%
7PoolHard10%
8Hatch7Pool10%
9Hatch8Pool10%
9PoolSunkSpeed10%
OverhatchLing10%
OverhatchMuta10%
OverpoolSpeed10%
OverpoolSunk10%
ZvP_2HatchMuta10%
ZvP_3BaseSpire+Den10%
ZvT_12PoolMuta10%
ZvT_3HatchMuta10%
ZvT_3HatchMutaExpo10%
ZvZ_12HatchMain2114%
ZvZ_12PoolLing10%
ZvZ_12PoolMain30%
ZvZ_Overgas9Pool10%
ZvZ_Overpool9Gas3030%
ZvZ_OverpoolTurtle2532%
24 openings10020%
planpredictedrecognizedaccuracy
countgameswinscountgameswinsgood?
Fast rush2222%14%11%0%0%100%
Heavy rush7777%22%2828%25%35%61%
Naked expand11%0%22%0%0%0%
Unknown0%0%6969%19%0%0%


Steamhammer sees CherryPi as a strategy switcher. I suspect that CherryPi did not actually play any fast zergling rushes, because they said they avoided risky openings, but I can’t be sure without a closer look. In any case, Steamhammer found answers and scored a respectable 20% against a much higher ranked opponent.

#3 cse

openinggameswins
11Gas10PoolLurker10%
11Gas10PoolMuta1020%
11HatchTurtleHydra20%
11HatchTurtleLurker10%
12HatchTurtle10%
2.5HatchMuta10%
2HatchHydra10%
2HatchHydraBust50%
2HatchLurkerAllIn10%
3HatchHydraBust90%
3HatchHydraExpo10%
3HatchLingBust30%
3HatchLingExpo10%
3HatchLurker20%
3HatchPoolMuta10%
4HatchBeforeGas60%
4PoolHard20%
5PoolHard2Player20%
5PoolSoft10%
7PoolHard20%
7PoolSoft10%
8Pool30%
9HatchExpo9Pool9Gas10%
9PoolExpo10%
9PoolHatch10%
9PoolSpeedAllIn20%
9PoolSpire20%
AntiFact_2Hatch10%
AntiZeal_12Hatch10%
Over10Hatch2SunkHard10%
Over10HatchBust20%
Over10HatchSlowLings20%
OverhatchExpoLing30%
OverhatchExpoMuta10%
OverhatchMuta10%
Overpool+110%
OverpoolHydra10%
OverpoolLurker10%
OverpoolSpeed20%
PurpleSwarmBuild10%
Sparkle 1HatchMuta10%
ZvP_2HatchMuta50%
ZvP_3BaseSpire+Den30%
ZvP_3HatchPoolHydra40%
ZvP_4HatchPoolHydra10%
ZvZ_12Pool20%
ZvZ_Overpool11Gas10%
ZvZ_Overpool9Gas10%
48 openings1002%
planpredictedrecognizedaccuracy
countgameswinscountgameswinsgood?
Heavy rush0%0%44%0%0%0%
Safe expand1919%0%3333%0%32%5%
Turtle8181%2%6060%3%60%2%
Unknown0%0%33%0%0%0%


Steamhammer has trouble telling the difference between Safe Expand (in the protoss case, forge expand with cannons) and Turtle (hide behind cannons), because it does not scout well enough to see the natural nexus reliably. It compensates by reacting similarly in both cases. But the opponent is still seen as an unpredictable strategy switcher, so Steamhammer switches up its openings too. In this case it has more counter openings and tries each fewer times, so they are not as obvious in the table, but they do have higher counts: See 2HatchHydraBust, 3HatchHydraBust, 3HatchLingBust, 4HatchBeforeGas, ZvP_2HatchMuta, and ZvP_3BaseSpire+Den. As against SAIDA, Steamhammer scored 2 wins with a ZvZ fast mutalisk opening. I have an idea to add another exploration phase which experiments with all-in attacks like the fast mutas.

#4 bluebluesky

openinggameswins
11Gas10PoolLurker20%
11Gas10PoolMuta10%
11HatchTurtleHydra20%
2.5HatchMuta10%
2HatchHydraBust50%
2HatchLurker10%
2HatchLurkerAllIn10%
3HatchHydraBust10%
3HatchLingBust10%
3HatchLingExpo10%
4HatchBeforeGas30%
4PoolSoft10%
5PoolHard10%
7PoolHard1010%
8Pool10%
9HatchExpo9Pool9Gas1811%
9HatchMain9Pool9Gas10%
9PoolSpeed30%
9PoolSpeedAllIn30%
AntiFact_2Hatch10%
Over10Hatch20%
Over10Hatch1Sunk10%
Over10Hatch2Sunk20%
Over10Hatch2SunkHard10%
OverhatchExpoLing20%
Overpool+110%
OverpoolHatch10%
OverpoolHydra10%
OverpoolSpeed10%
OverpoolTurtle10%
PurpleSwarmBuild10%
Sparkle 1HatchMuta10%
Sparkle 2HatchMuta10%
Sparkle 3HatchMuta10%
ZvP_2HatchMuta40%
ZvP_3BaseSpire+Den70%
ZvP_3HatchPoolHydra60%
ZvT_13Pool10%
ZvZ_Overgas11Pool10%
ZvZ_Overgas9Pool30%
ZvZ_Overpool11Gas20%
ZvZ_Overpool9Gas10%
42 openings1003%
planpredictedrecognizedaccuracy
countgameswinscountgameswinsgood?
Heavy rush77%0%2020%5%29%0%
Naked expand0%0%11%100%0%0%
Safe expand5353%2%4545%0%58%2%
Turtle4040%5%3333%3%45%0%
Unknown0%0%11%0%0%0%


Different all-ins took a few wins from BlueBlueSky.

#5 locutus

openinggameswins
11Gas10PoolLurker20%
11HatchTurtleLurker10%
12HatchTurtle10%
2HatchHydra10%
2HatchHydraBust50%
2HatchLurker20%
2HatchLurkerAllIn20%
3HatchHydra10%
3HatchHydraBust30%
3HatchHydraExpo10%
3HatchLingBust2512%
3HatchLingExpo20%
4PoolSoft10%
5PoolHard20%
6PoolSpeed10%
8Hatch7Pool10%
8Pool10%
9HatchExpo9Pool9Gas10%
9HatchMain9Pool9Gas10%
9PoolSpeed10%
9PoolSpeedAllIn10%
AntiFact_13Pool10%
AntiFact_2Hatch10%
AntiFactory10%
AntiZeal_12Hatch10%
Over10Hatch10%
Over10Hatch2SunkHard10%
OverhatchExpoMuta20%
OverhatchLateGas10%
OverpoolHydra10%
OverpoolSpeed10%
OverpoolSunk10%
OverpoolTurtle10%
PurpleSwarmBuild20%
Sparkle 2HatchMuta10%
Sparkle 3HatchMuta10%
ZvP_2HatchMuta50%
ZvP_3BaseSpire+Den40%
ZvP_3HatchPoolHydra50%
ZvP_Overpool3Hatch10%
ZvT_12PoolMuta40%
ZvT_13Pool10%
ZvT_2HatchMuta10%
ZvT_3HatchMuta10%
ZvZ_12Pool10%
ZvZ_12PoolLing10%
ZvZ_12PoolMain10%
ZvZ_Overgas9Pool10%
ZvZ_Overpool9Gas10%
49 openings1003%
planpredictedrecognizedaccuracy
countgameswinscountgameswinsgood?
Heavy rush0%0%44%25%0%0%
Safe expand6262%3%5555%0%60%0%
Turtle3838%3%4141%5%50%0%

#6 isamind

openinggameswins
11Gas10PoolLurker10%
11Gas10PoolMuta10%
2.5HatchMuta10%
2HatchHydra10%
2HatchHydraBust60%
2HatchLurker10%
3HatchHydra10%
3HatchHydraBust50%
3HatchLingBust50%
4HatchBeforeGas30%
4PoolHard10%
4PoolSoft20%
5PoolHard2Player10%
5PoolSoft10%
7PoolHard1118%
7PoolMid10%
7PoolSoft10%
8Hatch7Pool10%
8Pool10%
9HatchExpo9Pool9Gas30%
9HatchMain9Pool9Gas10%
9PoolSpeed10%
9PoolSunkHatch10%
AntiFact_13Pool10%
AntiZeal_12Hatch10%
Over10Hatch10%
Over10Hatch1Sunk20%
Over10Hatch2Sunk10%
Over10Hatch2SunkHard10%
Over10HatchSlowLings10%
OverhatchExpoLing30%
OverpoolHatch812%
OverpoolHydra10%
OverpoolLurker20%
OverpoolSpeed20%
PurpleSwarmBuild10%
ZvP_2HatchMuta20%
ZvP_3BaseSpire+Den40%
ZvP_3HatchPoolHydra617%
ZvP_Overpool3Hatch30%
ZvT_2HatchMuta40%
ZvT_3HatchMutaExpo10%
ZvZ_12HatchMain10%
ZvZ_12PoolMain10%
ZvZ_Overpool11Gas10%
ZvZ_OverpoolTurtle10%
46 openings1004%
planpredictedrecognizedaccuracy
countgameswinscountgameswinsgood?
Heavy rush1717%12%1414%14%65%6%
Proxy22%0%22%0%0%0%
Safe expand6262%3%4747%2%47%5%
Turtle1919%0%3333%3%26%0%
Unknown0%0%44%0%0%0%

#7 daqin

openinggameswins
11Gas10PoolMuta812%
2HatchHydra20%
2HatchHydraBust50%
2HatchLurkerAllIn50%
3HatchHydra20%
3HatchHydraBust30%
3HatchHydraExpo20%
3HatchLing10%
3HatchLingBust40%
3HatchLingExpo10%
4HatchBeforeGas40%
4PoolSoft10%
5PoolHard2Player20%
6PoolSpeed30%
8Hatch7Pool10%
9HatchExpo9Pool9Gas10%
9PoolHatch20%
9PoolSpeedAllIn30%
9PoolSpire10%
9PoolSunkHatch30%
9PoolSunkSpeed20%
AntiFact_13Pool10%
AntiFact_2Hatch20%
AntiZeal_12Hatch10%
Over10Hatch1Sunk20%
Over10Hatch2Sunk30%
OverhatchExpoLing10%
OverhatchExpoMuta40%
OverhatchLateGas10%
OverhatchLing10%
OverpoolHatch10%
OverpoolHydra20%
OverpoolLurker10%
OverpoolSpeed40%
OverpoolSunk10%
OverpoolTurtle10%
Sparkle 1HatchMuta20%
ZvP_2HatchMuta20%
ZvP_3BaseSpire+Den30%
ZvP_3HatchPoolHydra20%
ZvP_4HatchPoolHydra10%
ZvT_12PoolMuta10%
ZvT_3HatchMutaExpo10%
ZvZ_12HatchExpo10%
ZvZ_12HatchMain10%
ZvZ_12PoolLing10%
ZvZ_Overgas11Pool10%
ZvZ_OverpoolTurtle20%
48 openings1001%
planpredictedrecognizedaccuracy
countgameswinscountgameswinsgood?
Heavy rush0%0%33%0%0%0%
Proxy1010%0%1616%0%0%0%
Safe expand3535%0%3434%0%29%6%
Turtle5555%2%4141%2%40%7%
Unknown0%0%66%0%0%0%

#8 mcrave

openinggameswins
11HatchTurtleHydra1250%
2HatchHydra1136%
2HatchLurker250%
2HatchLurkerAllIn10%
3HatchHydraBust743%
3HatchLing20%
3HatchLingBust10%
AntiZeal_12Hatch20%
Over10Hatch2Hard10%
Over10HatchBust10%
OverhatchLateGas2330%
ZvP_3HatchPoolHydra1323%
ZvP_Overpool3Hatch10%
ZvT_12PoolMuta10%
ZvZ_OverpoolTurtle2264%
15 openings10038%
planpredictedrecognizedaccuracy
countgameswinscountgameswinsgood?
Heavy rush9191%37%5151%25%54%31%
Safe expand88%38%1111%45%0%62%
Turtle11%100%55%20%0%0%
Unknown0%0%3333%58%0%0%


The plan predictor struggled to predict what McRave was going to do next, but learning worked well anyway—eventually. The ZvZ_OverpoolTurtle choice is a big surprise, an opening that builds 3 sunkens and gets fast mutalisks on one base. The opening is sound only against certain all-in zerg strategies; protoss really ought to smash it. I’m guessing it worked against a zealot rush where McRave was slow to switch tech when the mutas showed up.

#9 iron

openinggameswins
12HatchTurtle10%
2.5HatchMuta10%
3HatchPoolMuta911%
9PoolExpo825%
9PoolSunkHatch10%
AntiFact_13Pool3523%
AntiFact_2Hatch20%
AntiFactory10%
AntiZeal_12Hatch10%
OverpoolLurker10%
OverpoolSpeed10%
OverpoolSunk10%
ZvP_4HatchPoolHydra10%
ZvZ_12PoolMain10%
ZvZ_Overgas11Pool1450%
ZvZ_Overpool9Gas2245%
16 openings10028%
planpredictedrecognizedaccuracy
countgameswinscountgameswinsgood?
Factory100100%28%9191%29%91%7%
Turtle0%0%22%0%0%0%
Unknown0%0%77%29%0%0%


When I run matches locally against Iron, Steamhammer soon settles on AntiFactory as the most reliable answer, and that does seem best. For some reason, Steamhammer behaved differently in both CIG and AIIDE than it does at home. It is astonishing that ZvZ fast mutalisk openings came out on top again. Exactly as against SAIDA, the plan predictor was 100% accurate while the plan recognizer was 91% accurate.

#10 zzzkbot

openinggameswins
3HatchHydraBust10%
4PoolHard10%
9PoolSpeedAllIn1479%
9PoolSunkHatch2232%
OverhatchExpoLing10%
OverhatchLing10%
OverpoolSunk2138%
ZvP_3HatchPoolHydra10%
ZvP_4HatchPoolHydra10%
ZvZ_Overgas9Pool2544%
ZvZ_Overpool11Gas520%
ZvZ_Overpool9Gas10%
ZvZ_OverpoolTurtle617%
13 openings10039%
planpredictedrecognizedaccuracy
countgameswinscountgameswinsgood?
Fast rush7777%42%2121%57%22%75%
Heavy rush1414%21%22%0%0%86%
Turtle99%44%22%100%22%56%
Unknown0%0%7575%33%0%0%


9PoolSunkHatch and OverpoolSunk are anti-rush openings, and 9PoolSpeedAllIn is general-purpose but good against rushes. In contrast, ZvZ_Overgas9Pool is a fast mutalisk opening and can be overrun by too many zerglings. I don’t know how accurate the plan predictions are, but they agree fairly well with the selected openings.

#12 microwave

openinggameswins
11Gas10PoolMuta2832%
3HatchHydraBust10%
3HatchLing10%
3HatchLingExpo10%
3HatchLurker10%
4PoolSoft1217%
5PoolHard2Player10%
9HatchMain9Pool9Gas20%
9PoolSpeed10%
9PoolSpeedAllIn10%
9PoolSunkSpeed20%
AntiFact_2Hatch10%
OverhatchLing20%
OverpoolSunk425%
ZvZ_12HatchMain20%
ZvZ_12PoolLing10%
ZvZ_12PoolMain20%
ZvZ_Overgas9Pool20%
ZvZ_Overpool11Gas1020%
ZvZ_Overpool9Gas2339%
ZvZ_OverpoolTurtle20%
21 openings10023%
planpredictedrecognizedaccuracy
countgameswinscountgameswinsgood?
Fast rush1515%27%1010%50%13%53%
Heavy rush4242%17%2020%40%14%45%
Naked expand4343%28%2121%5%21%49%
Turtle0%0%11%0%0%0%
Unknown0%0%4848%19%0%0%


Microwave really mixed things up, and it was successful! Steamhammer could not predict the opening switches. It’s interesting that when Steamhammer predicted a fast rush, it won a quarter of the time, and when it actually recognized a fast rush, it won half the time. That doesn’t tell us what actually happened in the games. When Steamhammer recognizes a fast rush, it can react no matter what opening it is playing, and often save itself. When it is rushed and doesn’t recognize it, it will lose unless it is playing a safe opening.

#13 lastorder

openinggameswins
3HatchLingBust1233%
4PoolHard10%
4PoolSoft2129%
6PoolSpeed10%
AntiFactory10%
Over10Hatch10%
Over10Hatch1Sunk425%
OverhatchLing20%
OverhatchMuta729%
PurpleSwarmBuild10%
ZvP_3HatchPoolHydra10%
ZvT_3HatchMutaExpo633%
ZvZ_12HatchMain1331%
ZvZ_12PoolLing520%
ZvZ_12PoolMain50%
ZvZ_Overpool11Gas1735%
ZvZ_OverpoolTurtle20%
17 openings10026%
planpredictedrecognizedaccuracy
countgameswinscountgameswinsgood?
Heavy rush100100%26%7777%25%77%14%
Naked expand0%0%33%0%0%0%
Turtle0%0%66%17%0%0%
Unknown0%0%1414%43%0%0%


LastOrder did not learn during the tournament and played predictably, yet Steamhammer struggled to find an answer. We also know that LastOrder learned extensively offline before the tournament. Knowing that, and looking at these tables (check out the variety of recognized plans and the variety of Steamhammer’s more successful openings), I get the impression that LastOrder is highly adaptive and knows how to react in a wide variety of situations. I guess we’ll see when the replays come out.

#14 tyr

openinggameswins
2HatchHydraBust1338%
2HatchLurkerAllIn1443%
3HatchHydraExpo3876%
4HatchBeforeGas20%
4PoolHard425%
9PoolSunkSpeed10%
Over10Hatch2Hard10%
Over10HatchBust10%
OverpoolLurker729%
OverpoolSpeed5100%
ZvP_3BaseSpire+Den1362%
ZvP_3HatchPoolHydra10%
12 openings10056%
planpredictedrecognizedaccuracy
countgameswinscountgameswinsgood?
Heavy rush3939%56%4545%78%41%3%
Naked expand0%0%11%100%0%0%
Turtle6161%56%5050%32%48%5%
Unknown0%0%44%100%0%0%


These numbers say that anything which helps Steamhammer find the right answers early, without having to do so much random exploration, would be a big win in a long tournament. The plan recognizer is not good enough.

#15 metabot

openinggameswins
11Gas10PoolLurker250%
11HatchTurtleHydra683%
12HatchTurtle367%
2HatchLurkerAllIn367%
3HatchHydraExpo10%
3HatchLing1182%
3HatchLingExpo1060%
4PoolHard10%
6PoolSpeed2100%
9HatchExpo9Pool9Gas850%
9PoolHatch367%
9PoolSpeedAllIn250%
AntiZeal_12Hatch10%
Over10Hatch250%
Over10Hatch2Hard1100%
Over10Hatch2Sunk30%
OverhatchExpoLing862%
OverhatchExpoMuta1443%
OverhatchLateGas425%
OverpoolSpeed475%
ZvP_2HatchMuta250%
21 openings9157%
planpredictedrecognizedaccuracy
countgameswinscountgameswinsgood?
Heavy rush3437%65%1921%68%21%41%
Naked expand33%33%33%100%0%33%
Safe expand3437%56%2022%45%21%38%
Turtle1921%47%1314%46%11%42%
Unknown11%100%3640%58%0%0%


It must have been a crazy learning duel! Later I’ll try to figure out what MetaBot learned, and we can check the two bots’ learned data against each other.

#16 letabot

openinggameswins
12HatchTurtle20%
3HatchLing10%
6PoolSpeed1164%
9HatchExpo9Pool9Gas633%
9PoolLurker4582%
OverpoolHatch771%
OverpoolLurker2882%
7 openings10074%
planpredictedrecognizedaccuracy
countgameswinscountgameswinsgood?
Heavy rush9999%74%5959%78%59%20%
Safe expand0%0%44%50%0%0%
Turtle11%100%1717%76%0%0%
Unknown0%0%2020%65%0%0%

#17 arrakhammer

openinggameswins
2HatchLurkerAllIn10%
4PoolHard2268%
6PoolSpeed5275%
7Pool12Hatch10%
9HatchMain9Pool9Gas10%
9PoolSpeedAllIn10%
AntiFactory10%
Over10Hatch2SunkHard10%
Over10HatchBust10%
Over10HatchSlowLings10%
OverhatchExpoMuta10%
OverhatchLing10%
OverpoolHydra10%
ZvZ_12HatchMain10%
ZvZ_12PoolLing10%
ZvZ_12PoolMain20%
ZvZ_Overpool11Gas1136%
17 openings10058%
planpredictedrecognizedaccuracy
countgameswinscountgameswinsgood?
Heavy rush9999%58%7878%65%78%1%
Naked expand11%100%2121%29%0%0%
Unknown0%0%11%100%0%0%


This old version of Arrakhammer has a fixed anti-Steamhammer opening configured. It was written before Steamhammer had learning. Modern Steamhammer can exploit the fixed opening. You can’t get away with that any more.

#18 ecgberht

openinggameswins
11Gas10PoolLurker1191%
11HatchTurtleLurker51100%
9PoolLurker3797%
OverpoolLurker10%
4 openings10097%
planpredictedrecognizedaccuracy
countgameswinscountgameswinsgood?
Heavy rush100100%97%6767%96%67%33%
Unknown0%0%3333%100%0%0%

#19 ualbertabot

openinggameswins
3HatchLurker10%
7PoolHard1182%
AntiZeal_12Hatch757%
OverhatchExpoMuta10%
OverpoolTurtle8098%
5 openings10091%
planpredictedrecognizedaccuracy
countgameswinscountgameswinsgood?
Factory22%100%1111%100%0%0%
Fast rush1212%92%1515%80%33%25%
Heavy rush8585%91%4545%89%45%22%
Naked expand11%100%77%100%0%0%
Unknown0%0%2222%95%0%0%


Getting that 98% win rate is one of the reasons I added the seemingly nonsensical overpool turtle opening, which makes an absurd 6 sunkens on one base. It works against all kinds of rushes, fast or slow, when the rusher does not know how to adapt.

#20 ximp

openinggameswins
3HatchHydraExpo1782%
4HatchBeforeGas3683%
9Hatch8Pool10%
AntiFactory10%
ZvP_2HatchMuta978%
ZvP_3BaseSpire+Den3678%
6 openings10079%
planpredictedrecognizedaccuracy
countgameswinscountgameswinsgood?
Safe expand33%100%1818%94%0%0%
Turtle9797%78%7878%76%77%4%
Unknown0%0%44%75%0%0%


Why didn’t Steamhammer try the 3 hatch before pool opening even once in 100 rounds? I expect it would have scored higher. Well, I know why; when the win rate is so convincing, Steamhammer doesn’t explore much.

#21 cdbot

openinggameswins
11HatchTurtleHydra10%
9PoolSunkSpeed1547%
OverpoolSunk8296%
ZvP_Overpool3Hatch10%
ZvZ_12PoolLing10%
5 openings10086%
planpredictedrecognizedaccuracy
countgameswinscountgameswinsgood?
Fast rush9696%85%3131%71%29%57%
Heavy rush44%100%1313%100%0%25%
Unknown0%0%5656%91%0%0%

#22 aiur

openinggameswins
11Gas10PoolLurker10%
3HatchHydraExpo2889%
5PoolHard2Player10%
AntiZeal_12Hatch4691%
Over10Hatch2492%
5 openings10089%
planpredictedrecognizedaccuracy
countgameswinscountgameswinsgood?
Heavy rush9595%89%6565%91%64%18%
Naked expand44%75%1515%73%0%25%
Proxy0%0%22%50%0%0%
Turtle11%100%0%0%0%0%
Unknown0%0%1818%100%0%0%


Turtle was predicted once but never recognized in the last 100 games. That implies that Steamhammer recognized a turtle opening in the first 3 rounds—and it was wrong, since AIUR doesn’t do that; it must have been a misrecognized cannon rush, a bug that has crept in. Comparing against what AIUR learned, I see that AIUR cannon rushed Steamhammer 3 times total, all failures, and favored its defensive strategy.

#23 killall

openinggameswins
6PoolSpeed10%
9PoolSpeed37100%
ZvZ_12PoolMain10%
ZvZ_OverpoolTurtle6193%
4 openings10094%
planpredictedrecognizedaccuracy
countgameswinscountgameswinsgood?
Heavy rush7575%93%4343%91%49%36%
Naked expand55%80%1212%100%20%20%
Turtle2020%100%1010%100%45%35%
Unknown0%0%3535%94%0%0%

#24 willyt

openinggameswins
11Gas10PoolLurker3097%
11HatchTurtleLurker786%
12HatchTurtle20%
2HatchLurkerAllIn2496%
6PoolSpeed10%
9PoolLurker10%
OverpoolLurker35100%
7 openings10093%
planpredictedrecognizedaccuracy
countgameswinscountgameswinsgood?
Heavy rush100100%93%8585%96%85%15%
Unknown0%0%1515%73%0%0%

#25 ailien

openinggameswins
3HatchLurker10%
6PoolSpeed10%
9PoolSpeedAllIn10%
OverhatchLing10%
ZvT_3HatchMuta10%
ZvZ_Overgas9Pool743%
ZvZ_Overpool9Gas2085%
ZvZ_OverpoolTurtle6893%
8 openings10083%
planpredictedrecognizedaccuracy
countgameswinscountgameswinsgood?
Naked expand9898%85%33%0%2%98%
Unknown22%0%9797%86%0%50%

#26 cunybot

openinggameswins
11Gas10PoolMuta10%
5PoolHard2Player367%
OverhatchLing1593%
OverpoolSpeed10%
ZvZ_12HatchExpo250%
ZvZ_OverpoolTurtle77100%
6 openings9995%
planpredictedrecognizedaccuracy
countgameswinscountgameswinsgood?
Fast rush44%100%33%100%0%75%
Heavy rush1313%100%66%83%0%62%
Naked expand6263%94%2020%90%19%61%
Turtle1919%100%1010%100%11%58%
Unknown11%0%6061%97%0%0%

#27 hellbot

openinggameswins
2HatchHydraBust580%
3HatchHydra7100%
3HatchHydraBust12100%
3HatchHydraExpo14100%
3HatchLingBust8100%
4HatchBeforeGas16100%
Over10Hatch1Sunk3100%
ZvP_2HatchMuta11100%
ZvP_3BaseSpire+Den15100%
ZvP_3HatchPoolHydra9100%
10 openings10099%
planpredictedrecognizedaccuracy
countgameswinscountgameswinsgood?
Turtle100100%99%7676%99%76%24%
Unknown0%0%2424%100%0%0%

overall

totalZvTZvPZvZZvR
openinggameswinsgameswinsgameswinsgameswinsgameswins
11Gas10PoolLurker5375% 4489% 911%
11Gas10PoolMuta5024% 10% 2015% 2931%
11HatchTurtleHydra2446% 10% 2250% 10%
11HatchTurtleLurker6095% 5898% 20%
12HatchTurtle1020% 50% 540%
2.5HatchMuta50% 10% 30% 10%
2HatchHydra1625% 1625%
2HatchHydraBust4520% 10% 4420%
2HatchLurker617% 617%
2HatchLurkerAllIn5260% 2496% 2730% 10%
3HatchHydra1164% 1164%
3HatchHydraBust4236% 4038% 20%
3HatchHydraExpo10380% 10% 10280%
3HatchLing1656% 10% 1464% 10%
3HatchLingBust5925% 4723% 1233%
3HatchLingExpo1638% 1540% 10%
3HatchLurker60% 10% 20% 20% 10%
3HatchPoolMuta119% 911% 10% 10%
4HatchBeforeGas7363% 20% 7066% 10%
4PoolHard3546% 30% 812% 2462%
4PoolSoft3921% 50% 3424%
5PoolHard40% 10% 30%
5PoolHard2Player1020% 60% 450%
5PoolSoft30% 10% 20%
6Pool10% 10%
6PoolSpeed7564% 1258% 633% 5768%
7Pool12Hatch10% 10%
7PoolHard3534% 2313% 10% 1182%
7PoolMid10% 10%
7PoolSoft40% 20% 20%
8Hatch7Pool40% 30% 10%
8Pool60% 60%
9Hatch8Pool40% 20% 10% 10%
9HatchExpo9Pool9Gas3921% 729% 3219%
9HatchMain9Pool9Gas60% 30% 30%
9Pool10% 10%
9PoolExpo1020% 922% 10%
9PoolHatch633% 633%
9PoolLurker9181% 9181%
9PoolSpeed4386% 50% 3897%
9PoolSpeedAllIn2941% 10% 119% 1765%
9PoolSpire30% 30%
9PoolSunkHatch2726% 10% 40% 2232%
9PoolSunkSpeed2232% 10% 30% 1839%
AntiFact_13Pool4617% 4319% 30%
AntiFact_2Hatch200% 140% 50% 10%
AntiFactory210% 170% 20% 20%
AntiZeal_12Hatch6373% 30% 5379% 757%
Over10Hatch3174% 3077% 10%
Over10Hatch1Sunk1233% 838% 425%
Over10Hatch2Hard333% 333%
Over10Hatch2Sunk90% 90%
Over10Hatch2SunkHard60% 10% 40% 10%
Over10HatchBust50% 40% 10%
Over10HatchSlowLings40% 30% 10%
OverhatchExpoLing1828% 1729% 10%
OverhatchExpoMuta2326% 2129% 10% 10%
OverhatchLateGas3027% 10% 2928%
OverhatchLing2458% 10% 2361%
OverhatchMuta922% 10% 825%
Overpool+130% 10% 20%
OverpoolHatch1833% 862% 1010%
OverpoolHydra70% 60% 10%
OverpoolLurker7679% 6589% 1118%
OverpoolSpeed2236% 10% 1942% 20%
OverpoolSunk11179% 10% 20% 10881%
OverpoolTurtle8394% 30% 8098%
PurpleSwarmBuild70% 10% 50% 10%
Sparkle 1HatchMuta40% 40%
Sparkle 2HatchMuta40% 20% 20%
Sparkle 3HatchMuta20% 20%
ZvP_2HatchMuta4146% 4048% 10%
ZvP_3BaseSpire+Den8659% 8560% 10%
ZvP_3HatchPoolHydra4927% 10% 4628% 20%
ZvP_4HatchPoolHydra40% 10% 20% 10%
ZvP_Overpool3Hatch60% 50% 10%
ZvT_12PoolMuta90% 20% 60% 10%
ZvT_13Pool20% 20%
ZvT_2HatchMuta60% 10% 50%
ZvT_3HatchMuta40% 10% 10% 20%
ZvT_3HatchMutaExpo922% 20% 729%
ZvZ_12HatchExpo333% 10% 250%
ZvZ_12HatchMain3918% 20% 3719%
ZvZ_12Pool30% 30%
ZvZ_12PoolLing128% 10% 20% 911%
ZvZ_12PoolMain160% 10% 20% 130%
ZvZ_Overgas11Pool1644% 1450% 20%
ZvZ_Overgas9Pool4035% 10% 40% 3540%
ZvZ_Overpool11Gas6025% 1315% 40% 4330%
ZvZ_Overpool9Gas10045% 2343% 30% 7447%
ZvZ_OverpoolTurtle26782% 10% 2556% 24185%
total259052%50059%109139%89958%10091%
openings played915287555

Steamhammer played all of its openings during the tournament, almost all of them multiple times. It even tried the 3 specialized openings for the island map Sparkle. Nearly as many were played in ZvP alone, since it spent so much time desperately seeking an answer to the Locutusoids (or possibly Susan). Some openings were highly successful in given matchups, which generally means that the opening defeated one opponent reliably and so was played many times. For example, OverpoolSunk wiped out CDBot, which makes it look in this table as though it wiped out all zergs. If only it were so simple! The opening with the best success across matchups is 6PoolSpeed, an opening that I have never seen in human play.

AIIDE 2018 - what AIUR learned

Here is what the protoss AIUR learned about each opponent over the course of AIIDE 2018. Seeing AIUR’s counters for each opponent tells us something about how the opponent played. For the recent CIG edition, see CIG 2018 - what AIUR learned.

This is generated from data in AIUR’s final write directory. There were 103 rounds played (100 of which were official) and 10 maps, three 2-player, two 3-player, and five 4-player maps. For some opponents, all games were recorded; for the supernumerary 3 rounds at the end, the extra games were on the 2-player maps (they’re taken in rotation). For many opponents, fewer than 103 games were recorded. AIUR recorded 2606 games in 103 rounds, and officially played 2570 in 100 rounds. 2570 plus the 3 extra rounds times 26 opponents per round gives a total of 2648, which is 42 more than AIUR recorded. There were 37 official crashes in 100 rounds, leaving 5 games unaccounted for. They might be crashes in the extra 3 rounds. It’s also possible that the last round was not finished.
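To make the accounting explicit, here is the arithmetic as a tiny Python sketch; all the numbers come from the paragraph above.

recorded_games   = 2606   # games in AIUR's write directory, 103 rounds
official_games   = 2570   # games officially played in 100 rounds
extra_rounds     = 3
opponents        = 26     # opponents per round
official_crashes = 37

expected    = official_games + extra_rounds * opponents  # 2648
missing     = expected - recorded_games                   # 42
unaccounted = missing - official_crashes                  # 5 games unexplained
print(expected, missing, unaccounted)                     # 2648 42 5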

It would be nice if we had the data after round 100 instead of round 103; then we could do the accounting and get exact answers.

First, the totals across all opponents.

overall234total
 nwinsnwinsnwinsnwins
cheese22127%7125%13718%42924%
rush10146%8325%21032%39434%
aggressive577%8216%17317%31215%
fast expo8641%7934%23331%39834%
macro8026%6733%13632%28331%
defensive26134%13441%39537%79037%
total80632%51630%128430%260631%
  • 2, 3, 4 - map size, the number of starting positions
  • n - games recorded
  • wins - winning percentage over those games
  • cheese - cannon rush
  • rush - dark templar rush
  • aggressive - fast 4 zealot drop
  • fast expo - nexus first
  • macro - aim for a strong middle game army
  • defensive - try to be safe against rushes

AIUR struggled in this tournament; it has not been updated since 2014. As in CIG, AIUR did about equally well on the different map sizes, but relied on a different mix of strategies on each. On all map sizes, the defensive strategy was most often used. On 2-player maps, the cannon rush was also a popular solution, and on 4-player maps (where cannon rush is harder to pull off), the dark templar rush and the nexus first fast expansion were popular.

#1 saida234total
 nwinsnwinsnwinsnwins
cheese10%30%70%110%
rush10%20%60%90%
aggressive186%70%2711%528%
fast expo10%30%50%90%
macro10%40%20%70%
defensive20%10%20%50%
total244%200%496%934%

As in CIG, AIUR’s learning is able to squeeze a little extra from the toughest opponents. Against #1 SAIDA, it found that the fast 4 zealot drop occasionally worked, and was able to get a couple extra wins on 4-player maps. The same plan scored a single win on a 2-player map, but repeating the strategy did not help. Nothing else it tried made a dent.

#2 cherrypi234total
 nwinsnwinsnwinsnwins
cheese119%20%60%195%
rush50%30%50%130%
aggressive20%40%60%120%
fast expo10%60%70%140%
macro50%20%205%274%
defensive60%30%60%150%
total303%200%502%1002%

Oops, I lied already. AIUR was not able to squeeze an extra win against CherryPi. It won a total of 2 times with different strategies, and repeating the strategies did not win again. This is the first time I have seen AIUR’s diverse strategies unable to make any impression.

#3 cse234total
 nwinsnwinsnwinsnwins
cheese2612%1414%50%4511%
rush10%10%90%110%
aggressive20%10%80%110%
fast expo10%10%100%120%
macro10%10%80%100%
defensive10%20%100%130%
total329%2010%500%1025%

CSE was apparently not fully prepared for cannon rushes. AIUR plays the best cannon rush of all bots, in my opinion. But even the best is harder to pull off on a 4-player map.

#4 bluebluesky234total
 nwinsnwinsnwinsnwins
cheese1060%1164%3030%5143%
rush50%20%30%100%
aggressive30%20%70%120%
fast expo30%20%40%90%
macro50%20%50%120%
defensive50%10%10%70%
total3119%2035%5018%10122%

The Locutusoids showed somewhat similar patterns. BlueBlueSky was surprisingly weak against the cannon rush.

#5 locutus234total
 nwinsnwinsnwinsnwins
cheese1817%20%110%3110%
rush812%50%119%248%
aggressive10%20%70%100%
fast expo10%20%50%80%
macro20%30%80%130%
defensive10%50%80%140%
total3113%190%502%1005%

The other part of the pattern is some weakness against dark templar rush. Interestingly, the earlier version of Locutus survived AIUR’s DTs perfectly in CIG, despite a fair number of tries.

#6 isamind234total
 nwinsnwinsnwinsnwins
cheese2711%10%10%2910%
rush10%30%20%60%
aggressive10%40%40%90%
fast expo20%60%408%486%
macro10%20%10%40%
defensive10%425%20%714%
total339%205%506%1037%

#7 daqin234total
 nwinsnwinsnwinsnwins
cheese2015%10%20%2313%
rush540%1520%4319%6321%
aggressive10%10%20%40%
fast expo20%10%10%40%
macro20%10%10%40%
defensive10%10%10%30%
total3116%2015%5016%10116%

#8 mcrave234total
 nwinsnwinsnwinsnwins
cheese1735%1127%425%3231%
rush40%10%30%80%
aggressive10%425%40%911%
fast expo425%20%3330%3928%
macro30%10%10%50%
defensive20%10%50%80%
total3123%2020%5022%10122%

McRave shows a different pattern. Its weaknesses were against the cannon rush on smaller maps and nexus first on 4-player maps—a fast rush versus a macro opening. The tournament manager cycles through the maps in order, which makes a difference for bots which are sensitive to which map is being played. It’s possible that the sequence of strategies that AIUR played as the maps cycled through helped confuse McRave’s learning.

#9 iron234total
 nwinsnwinsnwinsnwins
cheese40%10%50%100%
rush50%10%20%80%
aggressive50%60%40%150%
fast expo70%20%284%373%
macro50%729%40%1612%
defensive60%30%50%140%
total320%2010%482%1003%

#10 zzzkbot234total
 nwinsnwinsnwinsnwins
cheese10%20%20%50%
rush10%10%60%80%
aggressive20%30%40%90%
fast expo10%10%20%40%
macro10%10%20%40%
defensive2715%1225%3412%7315%
total3312%2015%508%10311%

AIUR of course settled on the defensive opening against ZZZKBot, which prefers 4 pool.

#11 steamhammer234total
 nwinsnwinsnwinsnwins
cheese10%10%10%30%
rush10%60%40%110%
aggressive10%30%30%70%
fast expo520%20%1612%2313%
macro00%40%30%70%
defensive2520%40%2313%5215%
total3318%200%5010%10311%

The fast expo (“big army later”) and the defensive opening (“some army fast”) play out similarly when Steamhammer does not go with an early pressure opening. Maybe that’s why they both found some success.

#12 microwave234total
 nwinsnwinsnwinsnwins
cheese10%20%1118%1414%
rush20%10%617%911%
aggressive10%333%1118%1520%
fast expo10%10%20%40%
macro10%540%10%729%
defensive2711%838%1926%5420%
total339%2030%5020%10318%

That is quite a variety of tries against Microwave!

#13 lastorder234total
 nwinsnwinsnwinsnwins
cheese50%10%239%297%
rush40%40%10%90%
aggressive50%40%10%100%
fast expo50%30%10%90%
macro70%10%00%80%
defensive60%714%248%378%
total320%205%508%1025%

LastOrder may have been trained offline against AIUR (that would fit with how LastOrder is supposed to work).

#14 tyr234total
 nwinsnwinsnwinsnwins
cheese875%333%10%1258%
rush1989%1040%4353%7261%
aggressive10%20%250%520%
fast expo250%10%10%425%
macro20%30%20%70%
defensive10%10%10%30%
total3373%2025%5048%10351%

#15 metabot234total
 nwinsnwinsnwinsnwins
cheese2148%425%20%2741%
rush10%10%1520%1718%
aggressive10%944%2227%3231%
fast expo10%10%10%30%
macro10%20%10%40%
defensive10%10%10%30%
total2638%1828%4221%8628%

MetaBot includes AIUR as one of its heads. AIUR also struggles against both of the other heads, Skynet and XIMP. Still, aggressive tries had some success.

#16 letabot234total
 nwinsnwinsnwinsnwins
cheese20%10%10%40%
rush333%20%10%617%
aggressive10%10%10%30%
fast expo2167%425%4477%6971%
macro250%10%10%425%
defensive450%1127%10%1631%
total3355%2020%4969%10255%

#17 arrakhammer234total
 nwinsnwinsnwinsnwins
cheese10%10%333%520%
rush10%10%10%30%
aggressive00%10%10%20%
fast expo20%922%10%1217%
macro00%10%10%20%
defensive2972%743%4347%7956%
total3364%2025%5042%10346%

#18 ecgberht234total
 nwinsnwinsnwinsnwins
cheese10%10%20%40%
rush10%450%10%633%
aggressive30%20%10%60%
fast expo667%250%10%956%
macro10%1060%4360%5459%
defensive2124%10%20%2421%
total3327%2045%5052%10343%

#19 ualbertabot234total
 nwinsnwinsnwinsnwins
cheese00%00%00%00%
rush00%00%2100%2100%
aggressive00%1100%10%250%
fast expo1100%00%00%1100%
macro00%00%00%00%
defensive3135%1942%4723%9731%
total3238%2045%5026%10233%

UAlbertaBot is one of the opponents that AIUR has pre-learned data against. The pre-learned data is not included in this table. That’s why so many cells are 0.

#20 ximp234total
 nwinsnwinsnwinsnwins
cheese3333%00%10%3432%
rush00%00%250%250%
aggressive00%1225%4120%5321%
fast expo00%850%3100%1164%
macro00%00%00%00%
defensive00%00%2100%2100%
total3333%2035%4929%10231%

XIMP is the other competitor that AIUR has pre-learned data about.

#21 cdbot234total
 nwinsnwinsnwinsnwins
cheese10%10%10%30%
rush10%10%10%30%
aggressive10%10%10%30%
fast expo20%10%10%40%
macro10%10%10%30%
defensive2796%15100%4587%8792%
total3379%2075%5078%10378%

It smells like CDBot played a rush every game, and not a strong one.

#23 killall234total
 nwinsnwinsnwinsnwins
cheese10%10%20%40%
rush10%10%10%30%
aggressive10%30%10%50%
fast expo10%10%10%30%
macro10%10%10%30%
defensive2818%1346%4436%8532%
total3315%2030%5032%10326%

#24 willyt234total
 nwinsnwinsnwinsnwins
cheese10%10%956%1145%
rush2685%1369%3067%6974%
aggressive10%10%10%30%
fast expo250%250%650%1050%
macro10%10%20%40%
defensive10%250%250%540%
total3272%2055%5058%10262%

#25 ailien234total
 nwinsnwinsnwinsnwins
cheese10%10%10%30%
rush10%10%20%40%
aggressive20%10%20%50%
fast expo10%10100%10%1283%
macro2741%683%1323%4641%
defensive10%10%3037%3234%
total3333%2075%4929%10239%

#26 cunybot234total
 nwinsnwinsnwinsnwins
cheese250%10%10%425%
rush10%250%250%540%
aggressive2100%10%367%667%
fast expo475%2100%875%1479%
macro250%4100%475%1080%
defensive5100%9100%3087%4491%
total1675%1984%4879%8380%

#27 hellbot234total
 nwinsnwinsnwinsnwins
cheese7100%4100%580%1694%
rush3100%2100%8100%13100%
aggressive1100%3100%8100%12100%
fast expo9100%6100%11100%26100%
macro8100%3100%11100%22100%
defensive2100%2100%7100%11100%
total30100%20100%5098%10099%

Looking across all the tables, each of AIUR’s 6 strategies was sometimes found to be the best. Even today, the variety remains valuable.

AIIDE 2018 - the performance curves

I decided to look more closely at the Win Percentage Over Time curves. For this post, “learning” means online learning during the tournament; bots which only learned offline at home are “non-learning” bots for the moment.

To start off, here are the bots whose curves are more or less flat over time. Of these, #1 SAIDA is the only learning bot. Its learning apparently enabled it to hold its ground at a high level, but not to rise further. The other 3 are #13 LastOrder, #26 CUNYBot, and tail-ender #27 Hellbot. Hellbot gradually lost win rate over time despite its low starting point. The other 2 are very nearly level over time, despite being non-learners in a field of enemies eagerly seeking weaknesses to exploit. I suppose that their play is in some way difficult to exploit by learning, whether highly adaptive, or random and unpredictable, or simply not exposing weaknesses that other bots were able to catch.

pretty much flat and level

Here are all the non-learning bots, as best I could identify from yesterday’s findings. I also included #1 SAIDA to maintain the scale, which usefully goes to 1.1 to accommodate any bots which won more games than they played.

all bots that didn’t learn

Most of the curves trend down over time. The exceptions are #13 LastOrder and #26 CUNYBot from the first graph. Here’s a rescaled graph to tease apart the dense clump from #16 LetaBot to #24 WillyT. It’s easier to see the downward trend. Of these, #17 Arrakhammer, which has sophisticated play, and #20 XIMP, whose weaknesses may be difficult for many opponents to exploit, leveled out after the early losses. (So did #9 Iron with its numerous adaptive reactions, from the chart above.) The others continued downward for the entire tournament. Apparently if your play is in some way good enough, you can avoid exploitation by other bots to an extent. But most non-learning bots seem doomed to keep losing win rate even over a long tournament.

a clump of closely-spaced bots that didn’t learn

Here are the learning bots which fell at first, then leveled out. The early fall is due to some combination of statistical fluctuation, learning by their opponents, and no doubt bugs and whatever other random stuff. There are only 3 of them.

bots that fell then leveled out

#15 MetaBot might belong in the graph above, but I gave it its own picture because it is in a class by itself when it comes to struggling at the start then recovering strongly. It fell hard (on the left its curve drops below #21 CDBot) and came back, but it did not level off! MetaBot rivals Steamhammer and AIUR for performance gains over time. I imagine it’s due to MetaBot’s 2-level learning ability, where it learns which of 3 heads is best against each opponent, and then 2 of the heads (AIUR and Skynet), when chosen, learn how best to play against that opponent. Like Steamhammer and AIUR, it has more scope to learn, and it learns more. The graph shows how many rivals MetaBot left in the dust—it came within an ace of surpassing #14 Tyr, and likely would have, given 10 more rounds.

metabot fell hard then rose strongly

Here are the bots which gained win rate early, then largely leveled out—most continued to gain or lose a little for the duration. This is partly because the curves are cumulative. Only the left part of the curve can change quickly; each data point is the average of all the per round win rates to the left. The non-learning #13 LastOrder is included; the others are learning bots. #14 Tyr, which learns less because it only remembers one previous game, had the biggest decline from its peak. That’s interesting: The extremely simple method of learning from only one game is already a powerful form of learning, but it is not as powerful as, say, the UCB learning of #12 Microwave, which remembers summary statistics from many games. All these bots arguably could have done better if they had scope to learn more; their learning ceilings may not be high enough for a long tournament. Perhaps some are tuned for SSCAIT, where fast learning with limited scope helps performance.
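As a reminder of why the curves behave this way, here is a minimal sketch of how a cumulative win-percentage curve is computed. This is my own illustration, not the tournament software.

def cumulative_curve(per_round_win_rates):
    # Each point is the mean of all per-round win rates so far, which is why
    # only the left end of the curve can move quickly.
    curve, total = [], 0.0
    for i, rate in enumerate(per_round_win_rates, start=1):
        total += rate
        curve.append(total / i)
    return curve

# Example: a bot that starts at 20% and learns its way up to 80%.
rates = [0.2] * 10 + [0.8] * 90
curve = cumulative_curve(rates)
print(curve[9], curve[99])  # 0.2 after 10 rounds, 0.74 after 100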

bots that rose then leveled out

Finally, here are the learning bots which kept learning for a long time. (#15 MetaBot has its own graph above and is left out.) #2 CherryPi started strongly and reduced its loss rate by 1/3 over the course of the tournament, which is impressive. #10 ZZZKBot started poorly, then produced a clean, smooth curve which approaches an asymptote after about 10 rounds. #11 Steamhammer also started poorly, and its slower improvement seems to approach an asymptote after around 30 games, but in fact Steamhammer kept on learning throughout, left Tyr, LastOrder, and Microwave behind, and came close to surpassing ZZZKBot. In a longer tournament, it likely would have; Steamhammer’s big repertoire of openings means it still has fresh ideas to try after 100 rounds. #22 AIUR struggled at first, then recovered and showed its usual strong learning gains.

bots that kept on learning

I find that these performance curves are rich with insight. The top finishers have strong basics, and use learning to avoid being exploited (that seems to be the only purpose of learning in SAIDA), or to exploit the weaknesses of other bots. Most bots that did not learn suffered for it, but some were difficult to exploit and could hold their ground—LastOrder was chief among these. Bots that did learn sometimes learned too little and could not keep up with their rivals. Steamhammer and MetaBot were remarkable for their comparatively weak foundations and slow but strong learning skills.

Next I’ll look into what specific bots learned about their opponents. Following tradition, I’ll start with AIUR.

AIIDE 2018 - what bots wrote data

As usual, here is my examination of what each bot kept in its AI directory to read at startup, and what it wrote into its write directory for learning and/or debugging. The AI directory is not the only place a bot might keep prepared data; some bots have configuration files, and the binary might contain anything. This time I left out the up/down arrows. The performance curves seem more complicated than in CIG, and I want to look at them separately. Having files doesn’t mean that the files are used; they might be sitting there unread.

#botinfo
1SAIDASAIDA stored three classes of files, 131 DefeatResult files (though officially it lost 106 games and timed out 8 times), 18 Error files, and 229 Timeout files. The DefeatResult files are 33 to 80 lines long and have nicely-formatted readable information including the enemy’s build order history with timings, and unit counts and unit loss counts for both sides. I expect that the enemy build timings are key information for the learning mechanism. The error files range from 2 to 2500 lines long and report internal errors that the bot presumably was able to ignore or recover from. The timeout files report when specific managers ran over.
2CherryPiCherryPi has a couple of larger files in AI, 77MB and 3MB, which are likely offline machine learning data. CherryPi’s survey answers mention offline learning. In the write directory it wrote a JSON file for each opponent. The JSON file gives a list of the build orders CherryPi played, and for each build order, a list of booleans under the name “wins_” that look like the win/loss history. It’s interesting that they give the sequence of wins and losses, not simply the counts. It suggests that their learning method is watching for when the opponent figures something out and starts to perform better. It’s also interesting that the build given as having been played most often versus SAIDA is “zvt3hatchlurker”, which does not seem appropriate versus SAIDA’s mech play—but does claim more wins than the alternatives tried. In the files I checked, the total number of win/loss booleans is slightly over 100, the official number of games played. It looks like the tournament manager played 103 rounds before time ran out, then its results were pruned back to 100 rounds so the maps were equally used.
3CSELog file and learning data that looks like that of Locutus.
4BlueBlueSkyLog file and learning data that looks like that of Locutus.
5LocutusLog file and learning data that... is that of Locutus, not very different from Steamhammer data. Locutus also has pre-learned data for 11 opponents, 2 of which have 2 names.
6ISAMindLog file and learning data that looks like that of Locutus. Also ISAMind’s machine learning data.
7DaQinLog file and learning data that looks like that of Locutus, except that DaQin stores data about only one game per opponent, although the survey answers say differently. Was something broken for this tournament? If so, it doesn’t show in DaQin’s win rate, which is about as expected.
8McRaveFor each opponent, a file listing the 15 protoss strategies that McRave could play, with 2 numbers that look like wins/losses. The numbers sometimes add up to 100 or so, but some are lower. McRave is listed with 83 crashes and 120 frame timeouts, which is likely why.
9IronNothing. #9 Iron is the highest-ranked bot which wrote no learning data.
10ZZZKBotLooks about the same as last year’s format. Even the timestamps say 2017.
11SteamhammerSteamhammer’s familiar data, game records with obscure timing numbers.
12MicrowaveAs before, a file listing 7 or 8 strategies and win/loss counts for each, limited to a max count of 10.
13LastOrderMachine learning data in AI, but no online learning data, only a 2 byte file log_detail_file.
14TyrFor each opponent, a 1 to 4 line file apparently telling whether the previous game was a win or a loss, a small integer, and the strategy Tyr followed, possibly with a few following items named “flags”.
15MetaBotIn AI/learning, a file for each of Skynet, UAlbertaBot, and XIMP, with 91 numbers in each file. 91 is the count of parameters that AIUR learns, and AIUR itself has the same 3 files, so this is AIUR's old pre-learned data about these 3 opponents. In write, a mess of mostly log files, but also with apparent learning data per opponent. states_* files list which head was played for some games against each opponent; this is probably log data, but could also be used for learning. skynet_* files per opponent look like Skynet learning data, no doubt for games where Skynet played. [opponent].txt files are the 91 numbers, likely learning data from when AIUR played. So there are 2 levels of learning here: Learning which head should play, and learning inside that head.
16LetaBotA 619-line file battlescore.txt with 103 game records of 6 lines each, which I think is one record for each round played (though only 100 rounds were official). It could be a log file or learning data.
17ArrakhammerNothing.
18EcgberhtNothing. The author has explained that learning did not work due to an incorrect run_proxy.bat file.
19UAlbertaBotThe familiar UAlbertaBot format. For each opponent, a file listing 11 opening strategies with a win/loss count for each.
20XIMPNothing.
21CDBotNothing.
22AIURA carryover from past years. Pre-learned data against 3 old opponents (as already mentioned under MetaBot), plus for each opponent, the familiar 91 lines of numbers.
23KillAllKillAll is a Steamhammer fork, but it uses a different learning file format. There is a file for each opponent+map combination. It looks like each file gives a game count (usually 10), a chosen opening or “None”, and a list of 8 openings with 3 numbers for each; the last number is floating point. I guess I have to read the code to find out what the numbers mean.
24WillyTA log file with 103 lines, presumably 1 per round played.
25AILienAILien's idiosyncratic learning file format. One file per opponent, with numbers saying what units are preferred and a few odds and ends. It looks as though AILien saved data for only 1 game per opponent. If this is the same version of AILien that I looked at earlier, then I expect learning was turned off and the recorded data was not used.
26CUNYBotIn AI, a file output.txt with a list of build orders and some data on each one. In write, 487 files in these groups: output.txt an apparent log file with 103 lines, [map]_v_[opponent]_status.txt which looks like detailed information per game with a variety of hard-to-understand values, 226 files [map]Veins([x],[y]) with mostly over 200K lines per file where the (x,y) values are too large to be tile positions and too small to be pixel positions (so I guess they are "Veins"). It looks complex.
27HellbotNothing.

Lesson: Learn about your opponent! All the winning kids are doing it!

Some interesting and some complicated stuff here. As for CIG, I’ll be looking at what different bots learned. This time it should be more informative.

SAIDA’s learning and SAIDA’s weaknesses

SAIDA is holding its position as #1 on SSCAIT, but it is under constant attack from other bots and loses some games. On the one hand, SAIDA has weaknesses against early harassment and timing attacks, especially if the opponent denies scouting. On the other hand, SAIDA appears to have a learning mechanism that recognizes rush timing and figures out a defense. The SAIDA page describes it as “He also catches perfect rush timing by using information he collected.” That’s a vague description, but the behavior does appear to involve learning from experience. MicroDK noted that SAIDA writes data only after it loses; this must be why. For example, BananaBrain tried a dark templar rush and won a series of games, but finally the learning kicked in and SAIDA figured out how to get turrets in time to stop it (SAIDA’s code was not updated). Since then, BananaBrain has mostly lost games, defeating SAIDA only once, in this game where the turret was seconds late.

Other examples include PurpleSpirit winning one game with BBS and then being unable to win with it again, and Krasi0’s fast barracks marine cheese meeting the same fate.

In the latest attacks, Locutus won with center gates, making only 2 zealots before switching into dragoons, and Krasi0 added a bunker to its marine cheese to overcome SAIDA’s vulture counter to the marines (SAIDA crashed this game). Will SAIDA learn to defeat these tricks too? I don’t know, let’s find out!

How powerful is this learning mechanism? Surely there must be attacks that it cannot figure out how to forestall—or can’t figure out in reasonable time. If you find 2 winning tricks and switch between them, can it learn to defend against both? If you DT rush once so that it learns to get early turrets, does it get early turrets for the rest of time after you switch back to regular play? The unnecessary turrets give you a small advantage, and at a high level of play, small advantages are big.
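Here is a toy sketch of what such a loss-driven timing defense might look like. This is purely my guess at the shape of the mechanism, written in Python; nothing here is SAIDA’s actual code or file format.

FRAMES_PER_SECOND = 24
SAFETY_MARGIN = 15 * FRAMES_PER_SECOND   # aim to be ready ~15 seconds early

def on_defeat(opponent_model, attacker_type, first_attack_frame):
    # Matches the observation that data is written only after a loss.
    opponent_model.setdefault(attacker_type, []).append(first_attack_frame)

def defense_deadline(opponent_model, attacker_type):
    """Frame by which the counter (e.g. turrets vs. DTs) should be ready
    next game, or None if we have never lost to this kind of attack."""
    timings = opponent_model.get(attacker_type)
    if not timings:
        return None
    return min(timings) - SAFETY_MARGIN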

Here are some of the weaknesses I see in SAIDA’s play.

  • Poor defense against unscouted early attacks, mitigated by the learning mechanism. SAIDA loses more SCVs than it should.
  • SAIDA recovers poorly from economic setbacks. It does not replenish lost SCVs as well as it should, and stops expanding after a while. If you gain an early lead, you can win by holding on and waiting for SAIDA to mine out.
  • SAIDA is vulnerable to mine drags. It sees no danger in having its spider mines and its forces next to each other. It will even place mines in its mineral line, begging you to blow up its SCVs.
  • SAIDA does not know how to build in safe locations. On some maps, like Moon Glaive, parts of the main base are easily sieged from outside. Krasi0 has won games by blasting down factories that are in range, and SAIDA keeps trying to rebuild in places that are also in range.
  • SAIDA is consistent and predictable. It varies to counter the opponent, but at heart always plays the same strategy and the same tactics. The dropships always fly along the edge.

SAIDA also has great strengths. The greatest may be the big red animated arrow that points out the main attack position. As long as SAIDA has a monopoly on big animated arrows, I think it will remain #1.

CIG 2018 - what Locutus learned

Locutus only recorded 8 games. It is configured to retain 200 game records, and I read the source code and verified that Locutus does not intentionally drop game records before the limit of 200. Recording exactly 8 games is the same problem that McRave suffered, and must be due to CIG problems. I don't know what the underlying problem was. My suspicion is that CIG organizers or tournament software may have accidentally or mistakenly cleared learning data for some bots. If that is what happened, and it happened once 8 games before the end of the tournament, it seems likely that it happened more than once. Who knows, though? The error might be somewhere else. Maybe they mistakenly shipped us data from after round 8 instead of round 125—in that case the tournament may have run normally, and only the data about it is wrong.

Locutus has prepared data for some opponents, stored in the AI directory. When Locutus finds it has no game records for a given opponent, it looks in AI to see if it has prepared data, and if so, it reads in those game records. At the end of the game, it writes out the prepared game records along with the record for the newly played game, and from then on the prepared records are treated like any others and retained unless and until the 200 record limit is passed.
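In rough Python, the seeding behavior amounts to the sketch below. The directory names follow the standard bwapi-data layout, but the file name and record format here are made-up stand-ins, not Locutus’s real format.

import os

RECORD_LIMIT = 200

def load_records(opponent, write_dir="bwapi-data/write", ai_dir="bwapi-data/AI"):
    # Prefer learned records; fall back to prepared records shipped in AI.
    for directory in (write_dir, ai_dir):
        path = os.path.join(directory, opponent + ".txt")
        if os.path.exists(path):
            with open(path) as f:
                return [line.rstrip("\n") for line in f]
    return []

def save_records(opponent, records, new_record, write_dir="bwapi-data/write"):
    # The prepared records are written back out together with the new game,
    # so from then on they are treated like any other records.
    records = (records + [new_record])[-RECORD_LIMIT:]
    with open(os.path.join(write_dir, opponent + ".txt"), "w") as f:
        f.write("\n".join(records) + "\n")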

How many other bots were affected by the 8 game problem?


Here is Locutus’s prepared data. Against some opponents, like McRave, Locutus picks out openings to avoid at first. If other openings don’t win either, I’m sure Locutus will come back and try these anyway. Against others, it picks out winners to try first. For some, it simply provides data. Most but not all of the prepared data is for opponents which were carried over from last year, for which pre-learning is sure to be helpful... if it is done on the same maps.

#3 mcrave

openinggameswins
12Nexus5ZealotFECannons10%
Turtle10%
2 openings20%

#6 iron

openinggameswins
DTDrop14100%
1 openings14100%

#7 zzzkbot

openinggameswins
ForgeExpand5GateGoon2100%
1 openings2100%

#11 ualbertabot

openinggameswins
4GateGoon10%
9-9GateDefensive250%
ForgeExpand5GateGoon1593%
3 openings1883%

#14 aiur

openinggameswins
4GateGoon3100%
9-9GateDefensive1100%
2 openings4100%

#16 ziabot

openinggameswins
9-9GateDefensive10%
ForgeExpand5GateGoon1100%
2 openings250%

#19 terranuab

openinggameswins
DTDrop10100%
1 openings10100%

#21 opprimobot

openinggameswins
DTDrop11100%
1 openings11100%

#22 sling

openinggameswins
ForgeExpand5GateGoon2100%
1 openings2100%

#23 srbotone

openinggameswins
DTDrop7100%
PlasmaProxy2Gate1100%
2 openings8100%

#24 bonjwa

openinggameswins
DTDrop6100%
PlasmaProxy2Gate1100%
2 openings7100%

overall

totalPvTPvPPvZPvR
openinggameswinsgameswinsgameswinsgameswinsgameswins
12Nexus5ZealotFECannons10% 10%
4GateGoon475% 3100% 10%
9-9GateDefensive450% 1100% 10% 250%
DTDrop48100% 48100%
ForgeExpand5GateGoon2095% 5100% 1593%
PlasmaProxy2Gate2100% 2100%
Turtle10% 10%
total8092%50100%667%683%1883%
openings played72423

Here is Locutus’s learned data. In every case, the number of games recorded is 8 plus the number of games in the prepared data. With only 8 games there is not much to go on, but the prepared data does seem to have helped Locutus choose successful openings.

#2 purplewave

openinggameswins
12Nexus5ZealotFECannons10%
4GateGoon10%
9-9GateDefensive580%
Proxy9-9Gate10%
4 openings850%

#3 mcrave

openinggameswins
12Nexus5ZealotFECannons10%
4GateGoon367%
Proxy9-9Gate5100%
Turtle10%
4 openings1070%

#4 tscmoo

openinggameswins
4GateGoon10%
9-9GateDefensive10%
ForgeExpand5GateGoon425%
Proxy9-9Gate250%
4 openings825%

#5 isamind

openinggameswins
4GateGoon683%
9-9GateDefensive1100%
Proxy9-9Gate1100%
3 openings888%

#6 iron

openinggameswins
DTDrop2295%
1 openings2295%

#7 zzzkbot

openinggameswins
ForgeExpand5GateGoon786%
ForgeExpandSpeedlots250%
Proxy9-9Gate10%
3 openings1070%

#8 microwave

openinggameswins
ForgeExpand5GateGoon8100%
1 openings8100%

#9 letabot

openinggameswins
DTDrop888%
1 openings888%

#10 megabot

openinggameswins
4GateGoon8100%
1 openings8100%

#11 ualbertabot

openinggameswins
4GateGoon10%
9-9GateDefensive250%
ForgeExpand5GateGoon2391%
3 openings2685%

#12 tyr

openinggameswins
4GateGoon8100%
1 openings8100%

#13 ecgberht

openinggameswins
DTDrop888%
1 openings888%

#14 aiur

openinggameswins
12Nexus5ZealotFECannons10%
2GateDTExpo1100%
4GateGoon580%
9-9GateDefensive1100%
Proxy9-9Gate475%
5 openings1275%

#15 titaniron

openinggameswins
DTDrop8100%
1 openings8100%

#16 ziabot

openinggameswins
9-9GateDefensive10%
ForgeExpand5GateGoon683%
ForgeExpandSpeedlots250%
Proxy9-9Gate1100%
4 openings1070%

#17 steamhammer

openinggameswins
ForgeExpand5GateGoon8100%
1 openings8100%

#18 overkill

openinggameswins
ForgeExpand5GateGoon8100%
1 openings8100%

#19 terranuab

openinggameswins
DTDrop18100%
1 openings18100%

#20 cunybot

openinggameswins
ForgeExpand5GateGoon8100%
1 openings8100%

#21 opprimobot

openinggameswins
DTDrop19100%
1 openings19100%

#22 sling

openinggameswins
ForgeExpand5GateGoon10100%
1 openings10100%

#23 srbotone

openinggameswins
DTDrop15100%
PlasmaProxy2Gate1100%
2 openings16100%

#24 bonjwa

openinggameswins
DTDrop14100%
PlasmaProxy2Gate1100%
2 openings15100%

#25 stormbreaker

openinggameswins
ForgeExpand5GateGoon8100%
1 openings8100%

#26 korean

openinggameswins
ForgeExpand5GateGoon8100%
1 openings8100%

#27 salsa

openinggameswins
ForgeExpand5GateGoon8100%
1 openings8100%

overall

totalPvTPvPPvZPvR
openinggameswinsgameswinsgameswinsgameswinsgameswins
12Nexus5ZealotFECannons30% 30%
2GateDTExpo1100% 1100%
4GateGoon3382% 3187% 20%
9-9GateDefensive1164% 786% 10% 333%
DTDrop11297% 11297%
ForgeExpand5GateGoon10693% 7997% 2781%
ForgeExpandSpeedlots450% 450%
PlasmaProxy2Gate2100% 2100%
Proxy9-9Gate1573% 1182% 250% 250%
Turtle10% 10%
total28890%11497%5480%8693%3471%
openings played102644

CIG 2018 - what Steamhammer learned

I wrote a new script to analyze Steamhammer’s learning data. A couple points:

  • Steamhammer crashed in nearly half of its games in CIG 2018. It can’t save learning data after a crash, so against some opponents Steamhammer had few opportunities to experiment. The number of crashes varied strongly depending on the opponent.
  • Steamhammer was set to remember the previous 100 games, since I figure there’s no play advantage to remembering more. The tournament was 125 rounds long. So in the tables below, “100 games” means that Steamhammer played at least 100 games without crashing, and up to 25 games may have been dropped, the early games. Against some weak opponents, Steamhammer learned, within 25 games, how to win 100% of the remaining games, and those tables give a 100% win rate for remembered games. Steamhammer did not score 100% against any opponent overall; it always had some losses in early games.

I should be able to run the same analysis for Steamhammer forks which retain Steamhammer’s opponent model file format.
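The script itself boils down to a tally like the following sketch, assuming the game records have already been reduced to (opening, won) pairs; the real records hold more detail.

from collections import defaultdict

def opening_table(game_records, window=100):
    # Only the most recent `window` games are remembered; earlier games drop off.
    recent = game_records[-window:]
    stats = defaultdict(lambda: [0, 0])   # opening -> [games, wins]
    for opening, won in recent:
        stats[opening][0] += 1
        stats[opening][1] += int(won)
    for opening in sorted(stats):
        games, wins = stats[opening]
        print(f"{opening:24} {games:4} {100 * wins // games:4}%")
    total_wins = sum(wins for _, wins in stats.values())
    print(f"{len(stats)} openings  {len(recent)} games  "
          f"{100 * total_wins // max(len(recent), 1)}%")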

#1 Locutus

openinggameswins
2HatchHydraBust10%
3HatchHydraExpo20%
3HatchLingBust10%
3HatchLingExpo10%
4HatchBeforeGas10%
OverpoolSpeed956%
6 openings1533%

A mystery is solved. Why was Steamhammer’s crash rate higher than I expected? Because many opponents learned to make Steamhammer crash. A crash for the opponent is a win, and the bot doesn’t care how it wins, so if it can learn a plan that makes the opponent crash reliably, it will. The stronger opponents tend to be learning bots, so Steamhammer crashed more often on average against strong opponents. This also means that my glib conclusion that “Steamhammer won 66% of non-crash games, so it seems to have kept up with general progress” is not sound. The non-crash games were mostly against weak opponents.

Locutus was lucky that it could figure out how to break Steamhammer. As Bruce mentioned in a comment, this Locutus version had a bug when facing certain zergling timings, and Steamhammer quickly figured out how to exploit the bug. It’s possible that Steamhammer minus the crash would have upset Locutus.

#2 PurpleWave

openinggameswins
11Gas10PoolMuta10%
3HatchHydra30%
3HatchLurker10%
4PoolSoft10%
7Pool12Hatch10%
7PoolSoft10%
9Hatch8Pool10%
9HatchExpo9Pool9Gas10%
9PoolSpeed10%
AntiFactory10%
Over10Hatch60%
Over10Hatch1Sunk70%
Over10Hatch2Sunk180%
Over10HatchBust10%
Over10HatchSlowLings40%
OverhatchMuta10%
OverpoolHatch10%
OverpoolTurtle30%
ZvP_3HatchPoolHydra20%
ZvP_4HatchPoolHydra10%
ZvT_12PoolMuta10%
ZvZ_Overpool11Gas10%
22 openings580%

PurpleWave shut out Steamhammer. It didn’t learn to make Steamhammer crash because every game was a win for it anyway. Steamhammer desperately tried alternatives all over the map, including crazy all-ins and openings intended for ZvT and ZvZ, and nothing worked.

#3 McRave

openinggameswins
11Gas10PoolLurker10%
4HatchBeforeGas10%
9HatchExpo9Pool9Gas10%
9PoolSpeed5100%
ZvP_3HatchPoolHydra20%
5 openings1050%

#4 tscmoo

openinggameswins
9PoolExpo10%
9PoolHatch10%
9PoolSunkHatch10%
AntiFact_2Hatch10%
Over10Hatch2Sunk10%
OverhatchExpoLing1315%
OverpoolSpeed2223%
7 openings4018%

#5 ISAMind

openinggameswins
3HatchHydraExpo10%
4HatchBeforeGas10%
OverpoolSpeed4100%
ZvP_2HatchMuta70%
ZvP_3HatchPoolHydra60%
5 openings1921%

#6 Iron

openinggameswins
2HatchHydra10%
3HatchLingExpo20%
4PoolHard10%
6PoolSpeed10%
9Hatch8Pool10%
9HatchMain9Pool9Gas10%
9PoolSunkSpeed10%
AntiFact_13Pool40%
AntiFact_2Hatch8312%
AntiFactory10%
Over10Hatch10%
PurpleSwarmBuild10%
ZvP_2HatchMuta10%
ZvT_12PoolMuta10%
14 openings10010%

Iron is not a learning bot, so it did not learn to crash Steamhammer. Still, these results show a weakness in Steamhammer: Its best opening against Iron is AntiFactory, which it tried only once in these 100 games. Steamhammer did not explore enough. I tried to fix the weakness in Steamhammer 2.0.

#7 ZZZKBot

openinggameswins
11Gas10PoolMuta10%
8Pool729%
9HatchMain9Pool9Gas10%
9PoolSpeed10%
OverhatchMuta10%
Overpool+110%
OverpoolSpeed10%
ZvZ_12HatchMain20%
ZvZ_12Pool10%
ZvZ_12PoolLing4858%
ZvZ_Overgas9Pool20%
ZvZ_Overpool9Gas20%
12 openings6844%

#8 Microwave

openinggameswins
9PoolSunkHatch580%
9PoolSunkSpeed2767%
OverpoolSunk10%
OverpoolTurtle333%
ZvZ_12PoolLing10%
5 openings3762%

This looks like successful learning. Too bad Steamhammer only successfully played 37 of the 125 games.

#9 LetaBot

openinggameswins
11Gas10PoolLurker10%
2HatchLurkerAllIn40%
3HatchHydraExpo10%
3HatchLurker1338%
9HatchExpo9Pool9Gas4536%
OverpoolLurker1331%
ZvP_2HatchMuta10%
ZvT_12PoolMuta10%
ZvT_13Pool10%
ZvT_3HatchMuta10%
10 openings8131%

#10 MegaBot

openinggameswins
11Gas10PoolLurker10%
3HatchHydra10%
3HatchHydraExpo10%
3HatchLingExpo2143%
Over10Hatch10%
OverhatchExpoLing1100%
ZvP_3HatchPoolHydra20%
7 openings2836%

#11 UAlbertaBot

openinggameswins
3HatchLingExpo10%
5PoolHard2Player10%
9PoolExpo10%
9PoolSpeed10%
9PoolSunkHatch4633%
9PoolSunkSpeed2948%
Over10Hatch1Sunk20%
OverpoolSpeed10%
ZvZ_Overpool9Gas10%
9 openings8335%

#12 Tyr

openinggameswins
9PoolHatch5100%
ZvP_3HatchPoolHydra50%
2 openings1050%

#13 Ecgberht

openinggameswins
11Gas10PoolLurker1050%
2HatchLurker2361%
2HatchLurkerAllIn4475%
Over10HatchBust333%
OverpoolLurker875%
OverpoolSpeed333%
ZvT_13Pool10%
7 openings9265%

#14 Aiur

openinggameswins
11Gas10PoolLurker1100%
5PoolHard2Player1100%
9PoolSunkHatch1100%
9PoolSunkSpeed2100%
Over10Hatch10%
Over10Hatch1Sunk250%
Over10Hatch2Hard1100%
Over10HatchSlowLings1100%
OverpoolSpeed2100%
OverpoolTurtle367%
10 openings1580%

#15 TitanIron

openinggameswins
3HatchLingBust10%
AntiFact_13Pool650%
AntiFact_2Hatch10%
AntiFactory7442%
Over10Hatch2Sunk10%
OverhatchExpoMuta10%
OverpoolLurker10%
ZvZ_Overgas9Pool1421%
ZvZ_Overpool9Gas10%
9 openings10037%

This selection of openings implies that TitanIron plays a factory-first build against zerg, like Iron, and is a non-learning bot, like Iron. Later I’ll look into the source and find out for sure.

#16 Ziabot

openinggameswins
11Gas10PoolMuta425%
2.5HatchMuta10%
3HatchHydraBust10%
6PoolSpeed10%
8Pool771%
9Hatch8Pool10%
9PoolHatch450%
ZvP_2HatchTurtle10%
ZvZ_12Pool10%
ZvZ_12PoolMain1625%
ZvZ_Overpool11Gas1050%
ZvZ_Overpool9Gas5374%
12 openings10056%

Low win rates against Zia and some other opponents suggest to me that Steamhammer had other new weaknesses besides crashing. I think Steamhammer should score over 80% against Zia.

#18 Overkill

openinggameswins
11Gas10PoolMuta1090%
4PoolHard2396%
6PoolSpeed28100%
9Hatch8Pool10%
OverhatchLing250%
OverpoolSpeed1392%
ZvZ_12HatchExpo250%
ZvZ_12PoolMain10%
8 openings8091%

#19 TerranUAB

openinggameswins
2HatchLurker5290%
AntiFact_13Pool888%
AntiFact_2Hatch978%
AntiFactory3190%
4 openings10089%

#20 CUNYbot

openinggameswins
11Gas10PoolMuta978%
OverhatchLing3497%
ZvZ_12PoolLing2796%
ZvZ_Overgas9Pool10%
ZvZ_Overpool9Gas1989%
5 openings9092%

#21 OpprimoBot

opening                 games   wins
11Gas10PoolLurker           3    67%
2HatchLurker                2    50%
2HatchLurkerAllIn           6    83%
6PoolSpeed                 19   100%
OverpoolLurker              1     0%
OverpoolSpeed               5    80%
ZvT_12PoolMuta             20    95%
ZvT_3HatchMuta             20   100%
ZvT_3HatchMutaExpo         24   100%
9 openings                100    94%

#22 Sling

opening                 games   wins
4PoolHard                   4    75%
4PoolSoft                   6   100%
5PoolHard2Player            3   100%
ZvZ_12HatchMain             1     0%
ZvZ_Overgas9Pool            1     0%
5 openings                 15    80%

The selection of fast rush openings suggests that Sling played a macro strategy which was countered by fast rushes. But I don’t want to draw strong conclusions based on 15 non-crash games out of 125.

#23 SRbotOne

opening                 games   wins
11Gas10PoolLurker          14    93%
2HatchLurker               10    90%
2HatchLurkerAllIn          10    90%
3HatchLurker               17   100%
4PoolSoft                  17   100%
5PoolHard                   7   100%
9HatchExpo9Pool9Gas         4    75%
9PoolLurker                 3   100%
OverpoolLurker              5   100%
9 openings                 87    95%

The wide range of lurker openings means that SRbotOne by Johan Kayser fought with mostly barracks units. Well, we already knew that.

#24 Bonjwa

opening                 games   wins
9PoolExpo                   6   100%
9PoolSunkHatch              5   100%
9PoolSunkSpeed              5   100%
AntiFact_2Hatch             3   100%
AntiFactory                 5   100%
ZvT_2HatchMuta              1   100%
6 openings                 25   100%

#25 Stormbreaker

opening                 games   wins
11Gas10PoolMuta             1   100%
4PoolHard                   1   100%
9PoolSunkHatch              8   100%
9PoolSunkSpeed              8   100%
OverhatchLing               1   100%
OverhatchMuta               7   100%
OverpoolSpeed               1   100%
OverpoolSunk                7   100%
ZvZ_12HatchExpo             2   100%
ZvZ_12HatchMain             3   100%
ZvZ_12PoolLing              1   100%
ZvZ_12PoolMain              3   100%
12 openings                43   100%

#26 Korean

opening                 games   wins
4PoolHard                   1   100%
4PoolSoft                   3   100%
5PoolHard                   5   100%
5PoolHard2Player            3   100%
5PoolSoft                   1   100%
6PoolSpeed                  6   100%
OverhatchLing               9   100%
OverhatchMuta              12   100%
ZvZ_12HatchExpo            13   100%
ZvZ_12HatchMain            16   100%
ZvZ_12PoolLing             14   100%
ZvZ_12PoolMain             17   100%
12 openings               100   100%

#27 Salsa

opening                 games   wins
4PoolHard                   2   100%
4PoolSoft                   4   100%
5PoolHard                   7   100%
5PoolHard2Player            1   100%
5PoolSoft                   1   100%
6PoolSpeed                  8   100%
OverhatchLing              11   100%
OverhatchMuta               8   100%
ZvZ_12HatchExpo            12   100%
ZvZ_12HatchMain            20   100%
ZvZ_12PoolLing             13   100%
ZvZ_12PoolMain             12   100%
ZvZ_Overgas9Pool            1   100%
13 openings               100   100%

overall

totalZvTZvPZvZZvR
openinggameswinsgameswinsgameswinsgameswinsgameswins
11Gas10PoolLurker3168% 2871% 333%
11Gas10PoolMuta2669% 10% 2572%
2.5HatchMuta10% 10%
2HatchHydra10% 10%
2HatchHydraBust10% 10%
2HatchLurker8782% 8782%
2HatchLurkerAllIn6473% 6473%
3HatchHydra40% 40%
3HatchHydraBust10% 10%
3HatchHydraExpo50% 10% 40%
3HatchLingBust20% 10% 10%
3HatchLingExpo2536% 20% 2241% 10%
3HatchLurker3171% 3073% 10%
4HatchBeforeGas30% 30%
4PoolHard3291% 10% 3194%
4PoolSoft3197% 17100% 10% 13100%
5PoolHard19100% 7100% 12100%
5PoolHard2Player989% 1100% 7100% 10%
5PoolSoft2100% 2100%
6PoolSpeed6397% 2095% 4398%
7Pool12Hatch10% 10%
7PoolSoft10% 10%
8Pool1450% 1450%
9Hatch8Pool40% 10% 10% 20%
9HatchExpo9Pool9Gas5137% 4939% 20%
9HatchMain9Pool9Gas20% 10% 10%
9PoolExpo875% 6100% 20%
9PoolHatch1070% 5100% 450% 10%
9PoolLurker3100% 3100%
9PoolSpeed862% 683% 10% 10%
9PoolSunkHatch6650% 5100% 1100% 1392% 4732%
9PoolSunkSpeed7265% 683% 2100% 3574% 2948%
AntiFact_13Pool1856% 1856%
AntiFact_2Hatch9721% 9621% 10%
AntiFactory11257% 11158% 10%
Over10Hatch90% 10% 80%
Over10Hatch1Sunk119% 911% 20%
Over10Hatch2Hard1100% 1100%
Over10Hatch2Sunk200% 10% 180% 10%
Over10HatchBust425% 333% 10%
Over10HatchSlowLings520% 520%
OverhatchExpoLing1421% 1100% 1315%
OverhatchExpoMuta10% 10%
OverhatchLing5796% 5796%
OverhatchMuta2993% 10% 2896%
Overpool+110% 10%
OverpoolHatch10% 10%
OverpoolLurker2854% 2854%
OverpoolSpeed6156% 862% 1573% 1587% 2322%
OverpoolSunk888% 888%
OverpoolTurtle933% 633% 333%
PurpleSwarmBuild10% 10%
ZvP_2HatchMuta90% 20% 70%
ZvP_2HatchTurtle10% 10%
ZvP_3HatchPoolHydra170% 170%
ZvP_4HatchPoolHydra10% 10%
ZvT_12PoolMuta2383% 2286% 10%
ZvT_13Pool20% 20%
ZvT_2HatchMuta1100% 1100%
ZvT_3HatchMuta2195% 2195%
ZvT_3HatchMutaExpo24100% 24100%
ZvZ_12HatchExpo2997% 2997%
ZvZ_12HatchMain4293% 4293%
ZvZ_12Pool20% 20%
ZvZ_12PoolLing10479% 10479%
ZvZ_12PoolMain4973% 4973%
ZvZ_Overgas9Pool1921% 1421% 520%
ZvZ_Overpool11Gas1145% 10% 1050%
ZvZ_Overpool9Gas7674% 10% 7476% 10%
total159664%68562%15526%63382%12329%
openings played6937363113

This summary table took me hours to get right, so I hope it's useful.
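For anyone who wants to build the same kind of summary from their own learning data, here is a rough aggregation sketch in Python. It assumes the games have already been parsed into (opponent race, opening, won) records; that record format and the field layout are my invention for the example, not Steamhammer's file format.

    from collections import defaultdict

    # Hypothetical parsed records: (opponent race, opening, won), one per non-crash game.
    games = [
        ("T", "2HatchLurker", True),
        ("Z", "ZvZ_12PoolLing", True),
        ("P", "Over10Hatch", False),
        # ... and so on for every game
    ]

    # counts[opening][race] = [games, wins]; race "ALL" holds the total column.
    counts = defaultdict(lambda: defaultdict(lambda: [0, 0]))
    for race, opening, won in games:
        for key in (race, "ALL"):
            counts[opening][key][0] += 1
            counts[opening][key][1] += int(won)

    # Print one row per opening: total, then ZvT, ZvP, ZvZ, ZvR.
    for opening in sorted(counts):
        row = ["%-22s" % opening]
        for key in ("ALL", "T", "P", "Z", "R"):
            n, w = counts[opening][key]
            row.append("%5d %4.0f%%" % (n, 100.0 * w / n) if n else "           ")
        print("".join(row))

The fiddly part is exactly what makes the table hard to do by hand: keeping the per-matchup and total columns in sync while cells stay blank for matchups where an opening was never played.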

Steamhammer played 69 openings in 1596 non-crash games, which is around two thirds of the openings it knows. No single matchup had more than 37 different openings. There were far more games against terran and zerg than against protoss and random, partly due to the crashing pattern. Against the random opponents (Tscmoo and UAlbertaBot), it settled on mostly general-purpose openings, as you might expect. Its best matchup was ZvZ, with a Jaedong-like 82% win rate (and lately, Jaedong crashes half the time too, so they’re just alike).

Openings that were both popular and successful include 2HatchLurker and 2HatchLurkerAllIn versus terran, 6PoolSpeed with a 97% win rate against mostly weak opponents, 9PoolSunkSpeed used across all matchups, and ZvZ specialties OverhatchLing, ZvZ_12PoolLing, and ZvZ_Overpool9Gas. None of the opening choices surprises me, though some of the win rates do.

CIG 2018 - what Overkill learned

After analyzing AIUR yesterday, I ran a similar (but much simpler) analysis for the classic zerg #18 Overkill. The version in CIG 2018 has not been updated since 2015 and is the same version that still plays on SSCAIT. In 2015 it was a sensation, placing 3rd in both CIG and AIIDE. Its 18th place in this tournament, with about a 35% win rate, shows how far the rest of the field has advanced in the past 3 years. But keep reading; Overkill appears to have been broken in this tournament. I did this analysis once before: See what Overkill learned in AIIDE 2015.

Classic Overkill knows 3 openings: a 9 pool opening which stays on one base for a long time, and 10 hatch and 12 hatch openings which aim for mutalisks first. When it chooses 9 pool, that means that the opponent is either rushing (so the 9 pool is necessary to defend) or being too greedy (which the 9 pool can exploit). Overkill counts some games twice in an attempt to learn faster, so its total game count is sometimes larger than the number of rounds in the tournament (125).
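As a rough illustration of how double-counting speeds up learning (this is a sketch of the idea, not Overkill's actual learning code, and the condition for emphasizing a game is invented), recording selected games with weight 2 moves the win-rate estimate faster and also inflates the game count past the number of rounds played:

    # opening -> [games, wins]
    record = {"NinePoolling": [0, 0]}

    def record_game(opening, won, emphasize=False):
        """Record a result; an 'emphasized' game is written twice, so the
        win-rate estimate reacts faster but the stored game count can
        exceed the number of games actually played."""
        weight = 2 if emphasize else 1
        record[opening][0] += weight
        record[opening][1] += weight * int(won)

    record_game("NinePoolling", won=True)
    record_game("NinePoolling", won=False, emphasize=True)
    games, wins = record["NinePoolling"]
    print(games, wins / games)   # 3 games on record after only 2 real games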

opponent            NinePoolling     TenHatchMuta     TwelveHatchMuta    total
                    games  wins      games  wins      games  wins        games  wins
#1 Locutus             42    0%         42    0%         41    0%          125    0%
#2 PurpleWave          43    0%         43    0%         42    0%          128    0%
#3 McRave              44    0%         44    0%         43    0%          131    0%
#4 tscmoo              40    0%         40    0%         47    2%          127    1%
#5 ISAMind             42    0%         42    0%         41    0%          125    0%
#6 Iron                54    7%         32    0%         39    3%          125    4%
#7 ZZZKBot             47    2%         39    0%         47    2%          133    2%
#8 Microwave           54    6%         35    0%         42    2%          131    3%
#9 LetaBot             52    6%         33    0%         40    2%          125    3%
#10 MegaBot            60   12%         24    0%         41    7%          125    8%
#11 UAlbertaBot        41    0%         41    0%         48    2%          130    1%
#12 Tyr                40    0%         39    0%         47    2%          126    1%
#13 Ecgberht           57   16%         24    4%         42   12%          123   12%
#14 Aiur               94   34%         14    7%         17   12%          125   28%
#15 TitanIron          36   11%         20    0%         69   16%          125   12%
#16 Ziabot             16    0%         16    0%         93   23%          125   17%
#17 Steamhammer       107   48%          7    0%         10   10%          124   42%
#19 TerranUAB          24   67%          3    0%         98   83%          125   78%
#20 CUNYbot            18   44%          6   17%        101   66%          125   61%
#21 OpprimoBot         36   67%          3    0%         86   76%          125   71%
#22 Sling              67   46%          6    0%         52   42%          125   42%
#23 SRbotOne           23   74%          4   25%         95   89%          122   84%
#24 Bonjwa             75   92%          4   25%         46   87%          125   88%
#25 Stormbreaker       70   91%          2    0%         53   87%          125   88%
#26 Korean             77   99%          2    0%         46   93%          125   95%
#27 Salsa              46  100%         32   94%         46  100%          124   98%
total                1305   36%        597    6%       1372   40%         3274   32%

The 10 hatch opening was useless in this tournament. Against every opponent it was the worst of the three choices, at best tying for worst at 0%. In 2015, 10 hatch was about as successful as the other openings.

Signs are that something was wrong with Overkill in this tournament. In AIIDE 2015, then #3 Overkill scored 23% against then #4 UAlbertaBot, 68% against #5 AIUR, and 99% against #17 OpprimoBot. In CIG 2018, it scored 1.6% against UAlbertaBot, 28% against AIUR, and 71% against OpprimoBot. The version appears to be the same in both tournaments. I didn’t look closely, but I did unpack the sources and check dates (in particular, Overkill has file change dates up to 8 October 2015 in both tournaments). Overkill had 14 crash games in CIG 2018, not enough to account for the difference. It’s hard to believe that the maps could have shifted results that much.

Tomorrow: What went wrong with Overkill?