
CherryPi - Stardust game

As you might guess from the CoG results, Stardust has gained strong new skills since last year. Against zerg in particular, one skill is that it uses corsairs heavily. Another is that it learned the forge expand opening. And yet it can still be defeated.

CherryPi is from 2017, but it shows how it can be done. The SSCAIT game Stardust v CherryPi is instructive (I kept the replay for when SSCAIT recycles it).

The recipe is:

1. Facing forge expand, play a greedy opening. CherryPi opened pool first (with a rare 13 pool) so that it could make zerglings if it needed to, but cannons were not going to walk across the map to attack, so it followed up with drones and hatcheries.

2. Fight efficiently in the middle game. CherryPi could not take any one-sided victories against such a tough opponent, but it traded well and cut the protoss army down to a safe size, where zerg could spawn enough defensive units in the time the protoss would take to cross the map.

3. With that breathing room, zerg could safely pull ahead in workers. Then it was just mass and smash.

See how easy it is? It’s a simple matter of being good at everything!

Of course it was only possible because Stardust showed weaknesses. One is that it was cautious and clumsy with its corsairs, and put on less counter-air pressure than it could have. First it kept them with the army, then it flew them over hydras.

AIIDE 2018 - what CherryPi learned

Here is a table of how each CherryPi opening fared against each opponent, like the tables I made for other bots. Reading the code confirmed my inference that the learning files recorded opening build orders, not build orders switched to later in the game; see how CherryPi played.

| # | bot | total | 10hatchling | 2hatchmuta | 3basepoollings | 9poolspeedlingmuta | hydracheese | zve9poolspeed | zvp10hatch | zvp3hatchhydra | zvp6hatchhydra | zvpohydras | zvpomutas | zvt2baseguardian | zvt2baseultra | zvt3hatchlurker | zvtmacro | zvz12poolhydras | zvz9gas10pool | zvz9poolspeed | zvzoverpool |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 1 | SAIDA | 13-90 13% | - | - | - | - | - | 1-19 5% | - | - | - | - | - | - | 1-15 6% | 9-37 20% | 2-19 10% | - | - | - | - |
| 3 | CSE | 73-30 71% | - | - | - | - | - | 0-2 0% | 24-5 83% | - | - | 16-8 67% | - | - | - | - | 33-15 69% | - | - | - | - |
| 4 | BlueBlueSky | 89-14 86% | - | - | - | - | - | 0-1 0% | 29-8 78% | - | - | - | - | - | - | - | 60-5 92% | - | - | - | - |
| 5 | Locutus | 84-19 82% | - | - | 63-11 85% | - | - | - | - | - | 14-3 82% | - | 2-2 50% | - | - | - | 5-3 62% | - | - | - | - |
| 6 | ISAMind | 99-4 96% | - | - | 1-0 100% | - | - | - | - | - | 98-4 96% | - | - | - | - | - | - | - | - | - | - |
| 7 | DaQin | 103-0 100% | - | - | - | - | - | - | - | - | - | - | - | - | - | - | 103-0 100% | - | - | - | - |
| 8 | McRave | 87-16 84% | - | - | 9-2 82% | - | - | - | - | - | 31-4 89% | - | 14-4 78% | - | - | - | 33-6 85% | - | - | - | - |
| 9 | Iron | 97-6 94% | - | - | - | - | 97-6 94% | - | - | - | - | - | - | - | - | - | - | - | - | - | - |
| 10 | ZZZKBot | 93-10 90% | 58-4 94% | - | - | 0-1 0% | - | - | - | - | - | - | - | - | - | - | - | - | - | 35-4 90% | 0-1 0% |
| 11 | Steamhammer | 81-21 79% | 22-7 76% | - | - | - | - | 16-5 76% | - | - | - | - | - | - | - | - | - | 0-1 0% | - | 43-8 84% | - |
| 12 | Microwave | 94-9 91% | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | 0-1 0% | 4-2 67% | 90-6 94% |
| 13 | LastOrder | 85-18 83% | 45-7 87% | - | - | - | - | 0-1 0% | - | - | - | - | - | - | - | - | - | - | - | - | 40-10 80% |
| 14 | Tyr | 98-5 95% | - | - | - | - | - | - | 98-5 95% | - | - | - | - | - | - | - | - | - | - | - | - |
| 15 | MetaBot | 94-2 98% | - | - | - | - | - | - | - | - | - | 94-2 98% | - | - | - | - | - | - | - | - | - |
| 16 | LetaBot | 101-2 98% | 0-1 0% | - | 97-0 100% | - | - | 1-1 50% | - | - | - | - | - | 3-0 100% | - | - | - | - | - | - | - |
| 17 | Arrakhammer | 92-11 89% | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | 92-11 89% | - |
| 18 | Ecgberht | 102-1 99% | - | - | - | - | - | - | - | - | - | - | - | - | - | - | 102-1 99% | - | - | - | - |
| 19 | UAlbertaBot | 99-4 96% | - | - | - | 96-2 98% | - | 3-2 60% | - | - | - | - | - | - | - | - | - | - | - | - | - |
| 20 | Ximp | 98-5 95% | - | - | - | - | - | - | - | 1-0 100% | - | 97-5 95% | - | - | - | - | - | - | - | - | - |
| 21 | CDBot | 103-0 100% | - | - | - | - | - | 96-0 100% | - | - | - | - | - | - | - | - | - | - | - | 7-0 100% | - |
| 22 | AIUR | 100-3 97% | - | - | - | - | - | - | - | - | - | 100-3 97% | - | - | - | - | - | - | - | - | - |
| 23 | KillAll | 103-0 100% | 102-0 100% | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | 1-0 100% |
| 24 | WillyT | 103-0 100% | - | 103-0 100% | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - |
| 25 | AILien | 103-0 100% | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | 103-0 100% | - |
| 26 | CUNYBot | 100-3 97% | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | 100-3 97% | - |
| 27 | Hellbot | 103-0 100% | - | - | - | - | - | - | 31-0 100% | - | - | 72-0 100% | - | - | - | - | - | - | - | - | - |
|  | overall | - 90% | 227-19 92% | 103-0 100% | 170-13 93% | 96-3 97% | 97-6 94% | 117-31 79% | 182-18 91% | 1-0 100% | 143-11 93% | 379-18 95% | 16-6 73% | 3-0 100% | 1-15 6% | 9-37 20% | 338-49 87% | 0-1 0% | 0-1 0% | 384-28 93% | 131-17 89% |

Look how sparse the chart is—CherryPi was highly selective about its choices. It did not try more than 4 different builds against any opponent. It makes sense to minimize the number of choices so that you don’t lose games exploring bad ones, but you have to be pretty sure that one of the choices you do try is good. Where did the selectivity come from?

The opening “hydracheese” was played only against Iron, and was the only opening played against Iron. It smelled like a hand-coded choice. Sure enough, the file source/src/models/banditconfigurations.cpp configures builds by name for 18 of the 27 entrants. A comment says that the build order switcher is turned off for the hydracheese opening only: “BOS disabled for this specific build because the model hasn’t seen it.” Here is the full set of builds configured, including defaults for those that were not hand-configured. CherryPi played only builds that were configured, but did not play all the builds that were configured; presumably it stopped when it hit a good one.

| bots | builds | note |
|---|---|---|
| AILien | zve9poolspeed zvz9poolspeed | returning opponents from last year |
| AIUR | zvtmacro zvpohydras zvp10hatch | |
| Arrakhammer | 10hatchling zvz9poolspeed | |
| Iron | hydracheese | |
| UAlbertaBot | zve9poolspeed 9poolspeedlingmuta | |
| Ximp | zvpohydras zvtmacro zvp3hatchhydra | |
| Microwave | zvzoverpool zvz9poolspeed zvz9gas10pool | “we have some expectations” |
| Steamhammer | zve9poolspeed zvz9poolspeed zvz12poolhydras 10hatchling | |
| ZZZKBot | 9poolspeedlingmuta 10hatchling zvz9poolspeed zvzoverpool | |
| ISAMind, Locutus, McRave, DaQin | zvtmacro zvp6hatchhydra 3basepoollings zvpomutas | |
| CUNYBot | zvzoverpoolplus1 zvz9gas10pool zvz9poolspeed | |
| HannesBredberg | zvtp1hatchlurker zvt2baseultra zvt3hatchlurker zvp10hatch | |
| LetaBot | zvtmacro 3basepoollings zvt2baseguardian zve9poolspeed 10hatchling | |
| MetaBot | zvtmacro zvpohydras zvpomutas zve9poolspeed | |
| WillyT | zvt2baseultra 12poolmuta 2hatchmuta | |
| ZvT | zvt2baseultra zvtmacro zvt3hatchlurker zve9poolspeed | defaults |
| ZvP | zve9poolspeed zvtmacro zvp10hatch zvpohydras | |
| ZvZ | 10hatchling zve9poolspeed zvz9poolspeed zvzoverpool | |
| ZvR | 10hatchling zve9poolspeed 9poolspeedlingmuta | |

I read this as pulling out all the stops to reach #1. They would have succeeded if not for SAIDA.

banditconfigurations.cpp continues and declares some properties for builds, including non-opening builds. It looks like .validOpening() tells whether a build can be played as an opening, .validSwitch() tells whether the build order switcher is allowed to switch to it during the game, and .switchEnabled() tells whether the build order switcher is enabled at all for it.
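
The method names suggest a fluent builder interface. As a purely hypothetical sketch (not CherryPi’s actual code; only the three method names come from the file), a configuration table in that style might read:

    // Hypothetical sketch, not CherryPi's actual code: only the three
    // method names come from banditconfigurations.cpp.
    #include <map>
    #include <string>

    struct BuildConfig {
      bool validOpening_ = false;  // may the bandit pick it as an opening?
      bool validSwitch_ = true;    // may the switcher switch into it?
      bool switchEnabled_ = true;  // is the switcher active for it at all?

      BuildConfig& validOpening(bool v) { validOpening_ = v; return *this; }
      BuildConfig& validSwitch(bool v) { validSwitch_ = v; return *this; }
      BuildConfig& switchEnabled(bool v) { switchEnabled_ = v; return *this; }
    };

    std::map<std::string, BuildConfig> configureBuilds() {
      std::map<std::string, BuildConfig> builds;
      // An ordinary opening, eligible for the bandit and the switcher:
      builds["zve9poolspeed"] = BuildConfig().validOpening(true);
      // The anti-Iron special: playable as an opening, but with the
      // build order switcher turned off because the model hasn't seen it.
      builds["hydracheese"] = BuildConfig().validOpening(true).switchEnabled(false);
      // A hypothetical midgame-only build; validOpening stays false.
      builds["somelategamebuild"] = BuildConfig().validSwitch(true);
      return builds;
    }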

The build orders themselves are defined in source/src/buildorders/. I found them a little hard to read, partly because they are written in reverse order: Actions to happen first are posted last to the blackboard.

The opening zve9poolspeed (I read “zve” as zerg versus everything) has the most losing records in the chart—it did poorly against more opponents than any other. It may have been a poor choice to configure for use in so many cases. In contrast, zvz9poolspeed, specialized for ZvZ, was successful. It gets fast mutalisks and in general has a lot more strategic detail coded into the build.

They seem to have had expectations of the zvt2baseultra build against terran. It is configured for HannesBredberg, WillyT, and the default ZvT. It was in fact only tried against SAIDA. I didn’t notice anything that tells CherryPi what order to try opening builds in. Maybe the build order switcher itself contributes, helping to choose the more likely openings first?

AIIDE 2018 - how CherryPi played

Overall, the play of the AIIDE 2018 CherryPi version looks similar to last year’s CherryPi which is still playing on SSCAIT. It still has the devastating ling micro, and it still prefers to win games with a flood of low-level units. It still gets melee +1 attack even when +1 carapace seems better. (Do CherryPi’s micro skills make +1 attack better, and if so, how?) Mutalisk micro looks very similar to Tscmoo’s, with mutas individually cautious and clever and collectively lazy and uncoordinated. It can use lurkers, guardians, and ultralisks. I didn’t see defilers, even when they would have been useful.

CherryPi scouts extremely aggressively with its first 2 overlords. They stick near the enemy base and try to poke into every corner, even if the enemy is terran and can shoot them down early. It gets a clear view, which must be useful for its build order switcher. The drawback is that the overlords often die young.

I think this CherryPi looks beatable. It doesn’t have SAIDA’s wide knowledge of action and reaction. It doesn’t have Steamhammer’s knowledge of how to react to LastOrder’s excessive static defense (but usually wins anyway with a zergling flood). It sometimes ignores undefended enemy bases, preferring to attack into the enemy’s strength—or even to wait idly. Game 31245 versus Iron shows it sticking with gas units and failing at macro; it forgot its love of zerglings. It doesn’t know whether it is ahead or behind, and it doesn’t realize that when it is maxed and owns the map, it ought to attack regardless of losses. It’s strong and tricky, but it also makes mistakes. I think next year’s version had better be improved if they don’t want to be overtaken.

Here are the names of the build orders that CherryPi recorded itself as playing in its opponent learning files. One of CherryPi’s major advertised features is a learned build order switcher that can switch to a new build order on the fly. It recorded 103 build order wins/losses for each opponent (except a couple with fewer), and 103 rounds were played, so these appear to be opening build orders only rather than all build orders tried throughout each game. Presumably the openings reflect CherryPi’s intentions when it started the game. It may not have followed the initial build order to its end.

  • 10hatchling
  • 2hatchmuta
  • 3basepoollings
  • 9poolspeedlingmuta
  • hydracheese
  • zve9poolspeed
  • zvp10hatch
  • zvp3hatchhydra
  • zvp6hatchhydra
  • zvpohydras
  • zvpomutas
  • zvt2baseguardian
  • zvt2baseultra
  • zvt3hatchlurker
  • zvtmacro
  • zvz12poolhydras
  • zvz9gas10pool
  • zvz9poolspeed
  • zvzoverpool

CherryPi tried between 1 and 4 openings against each opponent. CherryPi sometimes switched away from its initial try even if it won all games (for example, against CDBot and Hellbot), so I’m not sure what the switching criterion is. But opponents that it tried 4 openings against are all ones that gave it a touch of trouble.

grep -c key *.json

AILien.json:1
Aiur.json:1
Arrakhammer.json:1
BlueBlueSky.json:3
CDBot.json:2
CSE.json:4
CUNYBot.json:1
DaQin.json:1
Ecgberht.json:1
Hellbot.json:2
ISAMind.json:2
Iron.json:1
KillAll.json:2
LastOrder.json:3
LetaBot.json:4
Locutus.json:4
McRave.json:4
MetaBot.json:1
Microwave.json:3
SAIDA.json:4
Steamhammer.json:4
Tyr.json:1
UAlbertaBot.json:2
WillyT.json:1
Ximp.json:2
ZZZKBot.json:4

The other machine learning feature advertised for CherryPi is a building placer. It was trained on human building placements and apparently takes into account some of the bot’s intentions. I recommend against training on human play (or at least exclusively on human play), because machines play differently. Teaching a bot to blindly imitate human decisions that it doesn’t understand will lead to mistakes. It’s worse than teaching a human to imitate without understanding, because the bot won’t figure things out on its own. Nevertheless, CherryPi’s building placement does seem cleaner than other bots’. To me the building placement looks simple and logical, but not sophisticated like a strong human player’s.

Here’s an example from a ZvZ game, game 1755. The sunken colony does not interfere with gas mining, and it is somewhat protected from zergling surrounds by the geyser, the spawning pool, and the lair itself, while remaining open for drone drills on the drone side. The spire is curiously far away; I would have fit it into the gap next to the sunken. It looks OK but a little loose, not quite optimized. (By the way, game 14742 against the same opponent has the same building layout, except that the spire is placed close.)

ZvZ building placement in game 1755

CherryPi has gained new tactical tricks. I mentioned the burrow trick where it burrows zerglings at expansion locations. So far, I haven’t seen a game where the opponent was ready for the trick; I imagine it contributed to a lot of wins, even though CherryPi sometimes researches burrow and then never uses it. (And I’m disappointed. I thought of using this trick in Steamhammer, and didn’t because I expected that bots which knew how to clear spider mines would also know how to clear burrowed zerglings. I think I was wrong!) As far as I’ve seen, CherryPi doesn’t use burrow for any other purpose (though I wouldn’t be surprised if it did, since there are so many possible uses). CherryPi also does zergling runbys; an example is game 1406 versus SAIDA, where CherryPi played an unusual and not entirely efficient gas-first 3 hatch zergling build.

CherryPi doesn’t have as many complex skills as SAIDA, but it has a good number. I doubt I saw everything it can do.

note on CherryPi

I’ve been watching CherryPi’s AIIDE games. No conclusions yet, but I noticed that CherryPi likes to research burrow (not the first bot to do so) and burrow scouting lings at expansions to watch for the enemy (I think it’s the first to do that). SAIDA appeared unready for the trick. When an SCV showed up, CherryPi did not unburrow the ling, but sent another to prevent the expansion.

more thoughts on CherryPi

Observation 1: CherryPi makes only zerglings, hydralisks, and mutalisks. That is it for combat units. Also it prefers the most basic units: Zerglings most, mutalisks least.

Observation 2: CherryPi seems to limit itself to a small number of opening build orders. For example, it seems to have 2 ZvZ builds, one nine pool into zergling pressure and one turtle into spire. The opening builds are varied to adapt to the situation in some way; I haven’t been able to discern how much is due to hand-coded reactions and how much to the learning system. Even so, neither build looks impressive in itself.

Observation 3: CherryPi has inconsistent micro skills. Some skills are outstanding, like storm dodging. Some are inferior, like zergling targeting. Micro doesn’t seem to have been a point of heavy emphasis.

And yet CherryPi is doing extremely well, keeping near the top of the rankings. I imagine that that is partly because (as has been pointed out) it is tuned to beat the strongest opponents, and voters like to match it against exactly those opponents. I don’t think that’s the whole story. I think the learning system deserves much of the credit.

All the observations support a story that the learning system is where the heavy development effort has gone. 1. Empirical learners need data. When the situation is simpler, they can get by with less data. CherryPi might support few combat units because the effort went elsewhere, or it might be a deliberate choice to make life easier for the learning system (at least for now). 2. Similarly for the small range of opening choices. Maybe they didn’t have time to polish more choices; maybe the learning system learns how to use each opening, and they need few enough choices that it is forced to learn a decent amount. 3. Poor micro in some situations probably means that they haven’t had time to work on it. The effort went elsewhere.

Of course this is speculation. You could argue the opposite, because the learning system is not mature either. We can tell because it has visible holes. For example, it struggles with Juno by Yuanheng Zhu, the cannon contain bot, and it has some trouble with heavy rushes like those by Wuli or Black Crow, which play easy-to-understand strategies that you might expect to be easy to learn to counter. In my story, that is of course because learning is hard.

My read on the project’s style is that they are pushing ahead hard and in consequence allowing bugs and sloppiness to creep in to a degree. If I am reading it right, then when the next round of tournaments comes up after the middle of the year, we can expect CherryPi to be formidably capable. It won’t matter if they lose 5% of their games to bugs, provided the long tournament allows time for the bot to learn how to win almost all the rest. And, you know, a sufficiently smart learning system might be able to learn how to work around bugs....

what opponent modeling skills does CherryPi have?

CherryPi has a striking habit of barely scouting before concluding that it knows what the opponent is going to do, and then seeming reluctant to ever change its mind. A pro player will often also barely scout, but pros probe for new information later in the game and are ready to draw new conclusions. I thought the CherryPi-TyrProtoss match from the SSCAIT 2017 round of 16 was a tantalizing example (see the video). It doesn’t tell us how CherryPi works, but it offers a small hint.

In the first game Tyr opened with 2 gateways for early pressure. CherryPi scouted the 2 gateways with a drone at about 1:50 into the game and reacted somewhat logically with an array of sunkens, behind which it built up a strong economy. Tyr saw the sunkens and did not seem to react to them at all, strategically. Protoss should have expanded immediately and taken steps to prevent zerg from expanding further. Tyr did neither, and it lost.

Seeing the opponent’s build and reacting is nice, but it’s not a special skill. In this game, CherryPi did well but didn’t show anything special in terms of strategy.

In the second game we saw something that might be more interesting, but it’s still unclear. At about 2:10 the scouting drone saw a forward pylon and immediately returned home without looking further. When I first watched the replay in OpenBW I thought that CherryPi had left before seeing the forge warping in behind the drone, but when I watched the replay in StarCraft with only CherryPi’s vision turned on I saw that it did just catch the forge starting. Still, the fact that the drone turned around immediately suggests that CherryPi had seen enough; the forward pylon was all it wanted to see to understand the opponent’s build.

CherryPi reacted with 3 hatcheries before pool, which is safe versus forge expand but allows good responses from protoss too. I imagine the CherryPi team chose the build because it’s unusual in that situation and a protoss bot might not know a good reaction (though any human player would have an idea). In this case they were right, and CherryPi got ahead and won.

What do you think happened? Did CherryPi see the forge and know there couldn’t be a gateway yet, so it could safely play a slow build? Or did CherryPi take a leap in the dark and make its decision after seeing only the pylon? Nothing stops protoss from building 2 gateways at the forward pylon and making an aggressive rush—a pro is more likely to do that than to build the 2 gateways in the main. The advantages are that the rush distance is shorter and it protects the natural.

opponent modeling skills

I don’t know what CherryPi is doing. Maybe there’s discussion about it somewhere, which I haven’t seen. But I can’t help comparing it to Steamhammer’s opponent model. When all the intended features of the opponent model are implemented, Steamhammer will be able to see the pylon and immediately conclude “I’ve seen you do that before, it was the start of a forge expand opening and you’re probably playing it again. Let’s counter the forge expand.” I suspect that may be what CherryPi did—probably not in exactly the same way, but maybe in a way that’s broadly similar.

Bots tend to be predictable, and their opponents can take advantage. It’s one of the ideas behind Steamhammer’s opponent model. Seeing a forward pylon narrows down what the opponent is doing, but doesn’t zero in on one possibility. But if the opponent tends to continue the same way as in the past, you can act as though there were only one possibility and start to counter it earlier, gaining an advantage. (Against a forge you make drones, against gateways you make zerglings and a sunken. If you make unnecessary zerglings, you set yourself back.)

The development Steamhammer version can already do this, in a limited way against a random opponent. For example, against UAlbertaBot, Steamhammer says “this is probably a heavy rush (with zealots or marines), so I’ll prepare for that.” If it finds out that UAlbertaBot rolled zerg, it immediately (well, within 8 frames) realizes “uh oh, I was wrong, it is going to be a fast rush which has a different counter.” It doesn’t wait to see early zerglings, or a spawning pool, or a drone count, it immediately starts to adapt its build. The exact reaction depends on the timing, but there is code that says, for example, to cancel the second hatchery if it will allow a spawning pool to get up faster. By reacting immediately, Steamhammer has a better chance to survive despite its weak defensive skills.
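
To illustrate, here is a minimal sketch of that kind of immediate reaction. The names and structure are invented for the example, not taken from Steamhammer’s code:

    // Minimal sketch of an immediate opponent-model reaction; the names
    // and structure are invented, not Steamhammer's actual code.
    enum class Race { Terran, Protoss, Zerg, Unknown };
    enum class ExpectedPlan { HeavyRush, FastZerglingRush };

    struct OpponentModel {
      // Prior learned from past games: a heavy zealot/marine rush.
      ExpectedPlan expected = ExpectedPlan::HeavyRush;

      // Called as soon as the random opponent's race is revealed. The
      // model flips its expectation at once, without waiting to scout a
      // spawning pool, early zerglings, or a drone count.
      void onEnemyRaceKnown(Race enemyRace) {
        if (enemyRace == Race::Zerg && expected == ExpectedPlan::HeavyRush) {
          expected = ExpectedPlan::FastZerglingRush;
        }
      }
    };

    // The build reaction keys off the expectation, not off scouting:
    //   if (model.expected == ExpectedPlan::FastZerglingRush &&
    //       secondHatcheryStillMorphing && cancelWouldSpeedUpPool) {
    //     cancelSecondHatchery();  // get the spawning pool up faster
    //   }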

The opponent can thwart the opponent model, at least to an extent, by being genuinely unpredictable. That’s a countermeasure. Steamhammer’s opponent model should force top bots to vary their play more. That’s another idea behind the opponent model.

two McRave games

Here are 2 McRave games. The first is what will probably turn out to be the biggest upset of SSCAIT 2017, and for journalistic balance (look at me! I can pretend to be objective, just like a reporter!) the second is a win over a tough opponent that has given McRave trouble.

McRave is currently at #3, and it will probably finish there. So I find it striking that both games show easy-to-notice weaknesses on both sides. All bots have a long way to go to become truly strong.

McRave-FTTankTER

As I write, McRave is #3 and FTTankTER is #69 out of 78 entrants, with fewer than 50 games remaining to play in the tournament. There are a couple of unplayed games that theoretically could unseat this one as the biggest upset, but it’s unlikely. What I find most remarkable about the game is not that the result was such a reversal, but that it came about because FTTankTER played better. McRave didn’t lose because of a bug (at least not one that I can detect) or by playing a risky strategy and getting unlucky, but because of missing skills.

McRave-FTTankTER started with McRave fast expanding behind a single gateway and FTTankTER rushing with marines.

marines arrive at the front

McRave did not make an initial zealot, but waited for its cyber core to finish so it could get straight to dragoons, the key unit at the start in PvT. Making 1 zealot slows down dragoons a trifle but adds safety against all kinds of cheeses and fast rushes, so it’s probably smart. But even without it, McRave could have held. When a small number of marines show up at your front, they are weak. Marines gain strength in numbers because they are ranged units, but workers are faster and tougher than marines that have no medics or stim. Just pull workers and defend until your gateways produce. Workers can easily win fights against small numbers.

Instead, this happened:

Protoss pulled probes only after losing its first gateway, when the marine numbers had grown. The probes did not try to surround marines, but mostly milled around in front of the marines as if playing dodgeball. Nearly every probe was lost before the dragoon entered the fight. McRave was too optimistic, first in ignoring the attack, and then in continuing to throw away probes. A fallback plan would be: Abandon the natural, retreat the surviving probes, wait for the dragoon, and try a coordinated probe-dragoon defense of the main.

FTTankTER is clumsy and wasn’t able to finish off its helpless opponent, but the no-kill time limit ran out and terran won on points.

I think McRave shows some wider vulnerability to marine all-in attacks. McRave-Oleg Ostroumov is an example. Since McRave has lost fewer than 10% of its games, its weaknesses are apparently not easy to exploit.

McRave-CherryPi

CherryPi won its first game over McRave when McRave played a standard forge expand. In the second game, McRave played differently and CherryPi never seemed to notice. It was still a fight, though.

When both players learn, it becomes a race to see who can learn more and faster. With only 2 games, we can’t tell how the race would have turned out.

The game McRave-CherryPi on Benzene opened with McRave building 2 gates and CherryPi playing overpool into second and third hatcheries at the natural. CherryPi droned up as if McRave had fast expanded, which it should have known didn’t happen because its zerglings made it to the protoss natural. Zerg was underdefended, and McRave’s zealots killed a couple drones in the zerg natural and started hitting buildings.

Then a sunken started and the zealots retreated for no apparent reason. Protoss should at least take swipes at the morphing sunken until zerglings appear. The protoss scout probe in the main saw the zergling count and location, so McRave could have known it was safe. In the game above, McRave was overconfident; here it is overcautious. It is a sign of not truly understanding the situation (so far, no bot does). In the picture, the zealots have just retreated.

protoss retreats unnecessarily

Wuli beat CherryPi 2-0 with its heavy rush, but McRave likes to tech faster. CherryPi added to 3 sunkens and continued drone production, still seeming to assume that McRave had fast expanded. McRave poked repeatedly at the front without committing much or achieving much; at least it impelled zerg to spend on fighting units instead of drones. McRave often had a vanguard of units doing the combat and a rear guard that stayed out of the fight. I got the impression that McRave was not hiding its strength, but was just confused.

CherryPi had mismanaged the opening and was contained. Lurkers or mutalisks might have forced protoss back, but CherryPi got the lair late and did not make either; it wants to win with low-tier units. Sticking with zerglings and hydralisks and making many drones, zerg soon needed to expand more than it safely could, and put a hatchery at the nearby mineral-only base, barely outside the containment. McRave soon scouted it—and did nothing. Protoss continued to poke at the front and ignored the third base. It could have detached a couple of rear guard zealots to take it down; zerg could have done nothing. The picture shows protoss defeating an inadvisable zerg foray near the mineral-only third. After this, McRave ignored the third and made another poke at the front (even if the bot doesn’t notice creep, protoss had seen the hatchery with a probe). In the minimap, McRave has just started its natural nexus.

smashing a zerg escape attempt

Finally McRave felt confident enough to split its forces and kill the expansion. Before it died, CherryPi started a fourth base in the lower right corner. CherryPi was ahead in workers but had only 2 mining bases, while McRave had a far stronger army and a mostly successful containment (it only leaked a few drones).

After finishing the zerg third, McRave seemed to realize how far ahead it was and broke into the natural. With drones killed and a second nexus to make more probes, McRave had effectively caught up in economy and its army was more than zerg could face. In the picture, a high templar is storming drones that decided to fight instead of running away. The drones might as well do that; the only place they could safely run away to was the main, which was already saturated.

storming drones

CherryPi did not go down easy, but protoss was too far ahead. Oddly, though McRave made many templar and they accumulated plenty of energy, that one storm was the only one in the game. The high templar stayed in the rear guard where they were too far away to contribute. Also, both bots seemed confused by the neutral building block on the map, and got units stuck behind the block. I expect that from rough bots like Steamhammer, not from polished competitors.

CherryPi showed its curious strategic rigidity, where it believes without scouting that it knows what the opponent is doing—in this case, it even scouted that the opponent was not doing what the zerg opening assumed. To me it seems strange, because in Steamhammer the first major feature I wrote was the strategy boss which solves this exact problem, and it greatly boosted zerg’s strength. McRave showed surprising caution and slowness in taking advantage of opportunities.

thoughts about CherryPi

CherryPi remains interesting, although for now it still looks like Just Another Bot.

CherryPi wants to win with masses of zerglings, or occasionally hydralisks against protoss. It has some reactions, but overall tends to show a lack of strategic flexibility. It has a plan, but if the plan doesn’t work the followup tends to be slow and inadequate (compare Killerbot by Marian Devecka, which can completely switch its unit mix when a plan doesn’t work). Today’s game against LetaBot by Martin Rooijackers is an example. CherryPi opened with a fast second hatchery and no gas, to put on early pressure with masses of slow zerglings. LetaBot saw it coming and easily repelled the lings. CherryPi kept making masses of lings with few drones even after terran had medics and stim, when no quantity of slow zerglings could pose a threat. Terran played slowly and overcautiously, making easy-to-see mistakes, but it didn’t matter because the zerg strategy was inconsequent. By the time CherryPi started slowly adding mutalisks, mutalisks were also no threat. LetaBot eventually moved out and swept aside everything in its path. (Then LetaBot got stuck on the enemy ramp and crashed, or overstepped the time limit, but that’s a different lesson.)

CherryPi seems to make a lot of decisions without scouting. For example, it makes scourge (sometimes more than a little) when the enemy has no air tech. It moves its overlords, but does not send one to the enemy base. I think it is making choices based on units that it sees, especially versus protoss. But when it feels overmatched it holds its army well back from the enemy, meaning that it can’t see. Compare Steamhammer, which is aggressive and keeps its units forward even when it’s a big risk; the countervailing advantage is that it gets to see what the enemy has and what the enemy is doing.

Against zerg, CherryPi has different openings. If one loses, it tries the next. From what I’ve seen, the first opening is a 9 pool without gas, followed by a hatchery for mass slow zerglings. It’s a safe middle-of-the-road opening, or in other words halfhearted, but CherryPi loves its favorite. If that loses, the second try is sunken turtle into mutalisks, which is successful against many zergs (I think it is likely to work against Steamhammer too). There may be fancy learning going on behind the scenes, but if so we can’t see it because the tournament doesn’t have enough games.

Against protoss, I can’t detect any such progression in CherryPi’s openings. I think it’s always playing the same opening, and adapting it somewhat to the situation. It expands with 12 hatch, sunkens up its front, and makes massive numbers of drones. It’s similar to Killerbot’s plan, and a good one in general though the early sunkens are often unnecessary (and occasionally insufficient). The games against Bereaver make a good example; the first is a loss and the second a win, but zerg plays similarly in both. In the first game, notice how zerg keeps making mass drones and maintains a strong economy even as it is losing every fight, including losing its drones at a high rate. In the second game, the difference was that the players were at cross positions on a large map, and Bereaver’s corsair play and reaver drop were less effective.

CherryPi is safe against fast rushes; it has a perfect record so far against the zerg rushbots. CherryPi is vulnerable to hard rushes. Wuli won 2-0, and so did Flash which also does heavy early zealot pressure. Black Crow’s relentless zergling waves beat both the zergling opening and the turtle opening.

Overall, CherryPi has glaring weaknesses just like other bots do. But as I write, it is ranked #11, so it is strong by the standards of this tournament. I think the main source of its strength compared to other bots is the same as the main source of Steamhammer’s strength, the pressure style of play, which works because bots are better at attack than at defense. Steamhammer is ahead of CherryPi for now, because I invested effort in stability and resilience and lose fewer games to bugs and basic goofs. The CherryPi team is presumably investing in smarts instead, which should pay off in the long run. I haven’t seen any sign that CherryPi has particular smarts in opponent modeling—as far as I can see its opening learning is a simple algorithm, and I can’t detect anything else it might be doing—but if it does we might not be able to tell, because the tournament is not long enough.

The next AIIDE tournament may be interesting.

CherryPi’s games

I watched some CherryPi games to see how it plays.

versus terran

CherryPi scored 92% against the 4 terrans. How did that happen?

Well, only 2 of the terrans are strong by CherryPi’s standard, and it pre-learned openings against both of them. Against #3 Iron, CherryPi played a mass zergling opening. In games where Iron did not wall, game over. What made it work is that, if Iron did build a wall, CherryPi understood when the wall was open or closed. That’s good cleverness. The zerglings formed a concave so they themselves were safe from the defenders behind the wall and nothing of Iron’s could sneak through, then when the barracks lifted to open the wall, rushed in and rampaged. Against #12 LetaBot, CherryPi played the same zergling opening and won by sheer persistence in attacking.

Both Iron and LetaBot could have won with little risk by playing more cautiously.

Against the weaker terrans, #16 IceBot and #25 HannesBredberg, CherryPi’s learning quickly found winning builds. Against IceBot it settled on a slow-moving mutalisk build with too many hatcheries, powering drones while IceBot delayed moving out. It’s the only case I found where CherryPi tries to win eventually instead of quickly. The games are consistent wins but are not impressive. Against HannesBredberg it was the mass lings again.

the learning sequence

Except against the 5 opponents for which it pre-learned openings, CherryPi opened the first game versus each opponent with 4 pool. It succeeded against refreshingly few of them. If that didn’t work, it tried another opening, and another, and so on. Most of the openings are of the low econ “I’ll just run you over in 6 minutes” kind.

CherryPi apparently tries all its openings without worrying whether they are appropriate to the matchup. It tried a lurker opening in this game versus Microwave. It made no sense to me, but maybe some zerg bots are vulnerable to lurkers.

CherryPi’s favorite unit is the zergling. Occasionally mutalisks or lurkers do something essential, but most of the time, early game or late, it wants to win with lings.

the Steamhammer family

#10 Steamhammer upset #6 CherryPi 72-32. The random openings worked as intended and baffled CherryPi’s learning. CherryPi has openings that could have won the majority of games (it won 7 of the first 10), but it could not find them (it won 3 of the last 10). It could not hear the learning signal over the noise.

Steamhammer forks cpac, Microwave, and Arrakhammer also had plus scores versus CherryPi. I doubt they were all for the same reason.

Next: Cpac.

looking at CherryPi

There’s a ton of code in CherryPi, more than I can read in a day. I tried to pick out key parts to look at.

blackboard architecture

This comment from the file CherryPi/src/upc.h explains an important part of CherryPi’s high-level architecture.

/**
 * Who, where, what action tuple (unit, position, command).
 *
 * UPCTuples are used for module-to-module communication (via the Blackboard).
 * Posting a UPCTuple to the Blackboard is akin to requesting a specific action
 * from another module, and consuming a UPCTuple from the Blackboard implies
 * that the consumer implements this action. The implementation can also consist
 * of refining the UPCTuple so that a more lower-level module can actually
 * execute it.
 *
 * For example, a build order module might post a UPCTuple to create a certain
 * unit type, a builder module might wait until their are sufficient resources
 * before consuming it and then select a worker and a location. The
 * UPCToCommandModule takes care of translating executable UPCTuples (with sharp
 * unit, position and command entries) to actual game commands.
 */

I think it’s an excellent architectural choice, especially for a project carried out by a team rather than an individual. Communication between modules is managed in large part automatically by software rather than manually through the calling conventions of each module. It’s flexible and easy to modify and extend.
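
To make the pattern concrete, here is a toy version of the post/consume cycle. It is far simpler than CherryPi’s real Blackboard and UPCTuple, and everything beyond those two names is invented:

    // Toy version of the blackboard post/consume pattern; far simpler
    // than CherryPi's real Blackboard and UPCTuple.
    #include <optional>
    #include <string>
    #include <vector>

    struct UPC {
      std::string command;  // what, e.g. "create Zerg_Hatchery"
      int unit = -1;        // who: unset until a module picks a worker
      int x = -1, y = -1;   // where: unset until a module picks a spot
    };

    class Blackboard {
      std::vector<UPC> upcs_;
     public:
      void post(UPC u) { upcs_.push_back(std::move(u)); }

      // A module takes a UPC it knows how to refine or execute.
      std::optional<UPC> consume(const std::string& command) {
        for (auto it = upcs_.begin(); it != upcs_.end(); ++it) {
          if (it->command == command) {
            UPC u = *it;
            upcs_.erase(it);
            return u;
          }
        }
        return std::nullopt;
      }
    };

    // A build order module posts an abstract request:
    //   board.post({"create Zerg_Hatchery"});
    // A builder module later consumes it, fills in the sharp worker and
    // location, and re-posts; a command module executes sharp UPCs.

The point is the decoupling: the build order module never needs to know which module will pick the worker or the location.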

relation to Tscmoo

People have speculated that CherryPi may borrow a lot from Tscmoo the bot, since Tscmoo the person is on the team. The speculation even made it into ZZZKBot’s source code, as we saw a couple days ago. I compared 2 slices of code that do similar jobs in CherryPi and the CIG 2017 version of Tscmoo. I looked at the combat simulator in both, and code implementing the 1 hatch lurker opening in both.

Note well: If I had looked at different code, I might have drawn different conclusions. I deliberately selected code with related purposes that might be connected. In some places, CherryPi uses ideas from the old BroodwarBotQ that was written up in Gabriel Synnaeve’s PhD thesis.

1. I think CherryPi directly copied nothing from Tscmoo. I didn’t expect it to. The overall architecture was likely decided before Tscmoo the person joined the team. Besides, an academic usually wants credit to be clear, and a corporation usually wants ownership to be clear. The code in detail looks quite different.

2. In the parts I looked at for this comparison, some structure and ideas in Tscmoo were carried over and seemingly reimplemented in CherryPi, with (I should repeat) great differences in detail. It’s clear that somebody familiar with Tscmoo wrote this CherryPi code. For example, in the combat simulator one has addUnit() and run() in that order, and the other add_unit() and run() in that order. They both refer to “teams”, both count up frames from 0 (I would have counted up from the current frame, some would have counted down to 0), and share other shallow similarities; see the sketch after this list.

3. CherryPi, in the parts I compared, seems to be simpler and more cleanly written. In the lurker opening in particular, I think CherryPi encodes the opening a little more abstractly. Sometimes Tscmoo has more features. Tscmoo’s combat simulator simulates splash damage, and CherryPi’s does not.

4. OpenBW is another source of ideas, and it is of course also connected with Tscmoo the person. For example, the FogOfWar class says it is based on OpenBW. It calculates visibility depending on ground height and so on.
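
As I read it, the skeleton the two simulators share amounts to something like this paraphrase. It is not either bot’s code, and the damage model here is deliberately crude (no range, movement, or splash):

    // Paraphrase of the shared skeleton of the two combat simulators,
    // not either bot's code. Units go onto two teams via an add-unit
    // call; run() then steps frames up from 0.
    #include <algorithm>
    #include <array>
    #include <vector>

    struct SimUnit {
      double hp;
      double dpf;  // damage per frame
    };

    class CombatSim {
      std::array<std::vector<SimUnit>, 2> teams_;
     public:
      void addUnit(int team, SimUnit u) { teams_[team].push_back(u); }

      // Returns the winning team (0 or 1), or -1 if time runs out.
      int run(int maxFrames) {
        for (int frame = 0; frame < maxFrames; ++frame) {
          for (int t = 0; t < 2; ++t) {
            double damage = 0;
            for (const SimUnit& u : teams_[t]) damage += u.dpf;
            std::vector<SimUnit>& enemy = teams_[1 - t];
            // Crude focus fire: pour all damage into the front unit.
            while (damage > 0 && !enemy.empty()) {
              double d = std::min(damage, enemy.front().hp);
              enemy.front().hp -= d;
              damage -= d;
              if (enemy.front().hp <= 0) enemy.erase(enemy.begin());
            }
          }
          if (teams_[0].empty() || teams_[1].empty()) break;
        }
        if (teams_[0].empty()) return teams_[1].empty() ? -1 : 1;
        return teams_[1].empty() ? 0 : -1;
      }
    };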

the openings

I always want to know, “what openings does it play?” In the directory CherryPi/src/buildorders I see 16 classes that look like they could be build orders. The opening learning files include 15 build orders. The in_use.txt file lists these 8 build orders as active or possibly active:

  • 12hatchhydras
  • zvp10hatch
  • 5pool
  • 2basemutas
  • 3basepoollings
  • 1hatchlurker
  • meanlingrush (9 pool speed)
  • ximptest (it says this one is “unknown status”)

I will watch games and find out what openings it plays in practice. Come back tomorrow!

As a sample of how openings are defined, here is a snip from the file CherryPi/src/buildorders/meanlingrush.cpp showing the basic definition of 9 pool speed:

    buildN(Zerg_Drone, 9);
    buildN(Zerg_Extractor, 1);
    buildN(Zerg_Spawning_Pool, 1);
    if (countPlusProduction(st, Zerg_Hatchery) == 1) {
      build(Zerg_Hatchery, nextBase);
      buildN(Zerg_Drone, 9);
    }

It writes on the blackboard: Make drones until you have 9, extractor and spawning pool, then add a second hatchery at an expansion and rebuild the drones to 9. Simple and concise. Details like spawning the overlord and figuring out exactly when to start the second hatchery are left for other code to fill in (in Steamhammer, you have to specify it explicitly). On the other hand, here is how it says to collect only 100 gas to research zergling speed:

    if (hasOrInProduction(st, Metabolic_Boost) || st.gas >= 100.0) {
      state->board()->post("GathererMinGasGatherers", 0);
      state->board()->post("GathererMaxGasGatherers", 0);
    } else {
      state->board()->post("GathererMinGasGatherers", 3);
      state->board()->post("GathererMaxGasGatherers", 3);
    }

More writing on the blackboard. That’s a complicated test, where in Steamhammer you’d simply specify "go gas until 100". It’s fixable. They could, for example, write goals to the blackboard like “collect 100 gas for zergling speed” and have another module collect only enough gas to meet the goals.
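
A sketch of what that might look like, with invented names, following the suggestion rather than any real CherryPi module:

    // Hypothetical goal-based gas collection, following the suggestion
    // above; these names are invented, not a real CherryPi module.
    #include <string>
    #include <vector>

    struct GasGoal {
      std::string reason;  // e.g. "zergling speed"
      double amount;       // gas still needed for this goal
    };

    // One gatherer module sums the outstanding goals and staffs geysers.
    int gasGatherersWanted(const std::vector<GasGoal>& goals,
                           double gasBanked) {
      double needed = 0;
      for (const GasGoal& g : goals) needed += g.amount;
      if (gasBanked >= needed) return 0;  // goals met: stop mining gas
      return 3;                           // otherwise staff one geyser
    }

    // A build order would post its goal once:
    //   goals.push_back({"zergling speed", 100.0});
    // instead of re-testing research state and st.gas every frame.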

machine learning

I’ll take two cases: online learning during the tournament, and offline learning before the tournament starts, which produces data that can be fed to or compiled into the bot.

For online learning, the win rate over time graph for CherryPi shows a rapid increase in win rate from .4 to .7 within the first 10 rounds, then a gradual slight decline to the end of the tournament. It looks as though CherryPi rapidly learned how to play against each opponent, then more or less froze its decisions and allowed slower-learning bots to catch up a tiny bit. (Though swings in score in early rounds can also be due to statistical noise.) The readme file says:

CherryPi is a TorchCraft Zerg bot developed by Facebook AI Research.
It uses bandits to select learn strategies that work against a given
opponent.

“Bandits” refers to the n-armed bandit problem, which is behind most bots with opening learning. Looking at the file CherryPi/src/models/bandit.cpp, I see that that is exactly what CherryPi is doing too. It uses the classic UCB1 algorithm to learn which opening to play against each opponent, just like many other bots.
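
For reference, the heart of UCB1 fits in a few lines. This is my own paraphrase of the textbook algorithm, not CherryPi’s code:

    // UCB1 opening selection -- my paraphrase of the classic algorithm,
    // not CherryPi's code. Pick the arm (opening) maximizing
    //   mean reward + sqrt(2 * ln(total plays) / plays of this arm),
    // after trying each arm once.
    #include <cmath>
    #include <vector>

    struct Arm {
      int plays = 0;
      double wins = 0;
    };

    int ucb1Select(const std::vector<Arm>& arms) {
      int total = 0;
      for (const Arm& a : arms) total += a.plays;
      int best = 0;
      double bestScore = -1;
      for (int i = 0; i < (int)arms.size(); ++i) {
        if (arms[i].plays == 0) return i;  // play every opening once first
        double mean = arms[i].wins / arms[i].plays;
        double bonus = std::sqrt(2.0 * std::log(total) / arms[i].plays);
        if (mean + bonus > bestScore) { bestScore = mean + bonus; best = i; }
      }
      return best;
    }

The exploration term may also explain the puzzle I noted earlier: an opening that has been played less keeps a larger bonus, so the bandit will occasionally switch away from a build that has only won.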

I looked at the opening learning files, one for each opponent. They are in JSON format and are written by a general-purpose serializer that leaves the data a little hard to interpret by eye. It looks like value2 maps between the 15 opening names and 15 opening index numbers. value3 is 15 zeroes, and value4 and value5 are the learned data for the 15 indexes 0 through 14.

The only offline learning that I found is the same opening learning, performed for certain opponents ahead of time.

  • Iron
  • LetaBot
  • Skynet
  • Xelnaga
  • ZZZKBot

I can’t guess how they came up with that set of 5 opponents to pre-learn openings against. For these opponents, CherryPi relied on its offline learning exclusively; it did not write new learning data for these opponents. It’s such a strange decision that I have to wonder whether it’s a bug. In any case, we saw yesterday that it backfired against ZZZKBot, which did not play as expected: Unable to learn, CherryPi played the same unsuccessful opening every time, and lost over and over. Both ZZZKBot and CherryPi had incorrect prior knowledge about each other, and only ZZZKBot adapted.

conclusion

It is clear to me that CherryPi the project is not far along compared to where they are aiming. There are plenty of TODOs in the code. The big machine learning ideas that (if successful) could make CherryPi superhuman are not there yet; only some foundations are laid. CherryPi is still a scripted bot like others, not a machine learning bot. Even so, with (as I understand it) 8 people on the team, they have done a tremendous amount of work. They implemented ideas—most of which I didn’t write about—that I wish I had time to do myself. If they can maintain the rate of progress, then within a few years individual coders won’t be able to keep up. On the other hand, progress may slow when they get to the hard part. We’ll have to stay tuned!

Next: Games by CherryPi.