Entries by Jay Scott | Starcraft AI blog

early experience with Steamhammer’s new opening selection

I’ve been testing out Steamhammer’s new opening selection algorithm, part of the opponent model.

CasiaBot plays a hand-made anti-Steamhammer build, sunken turtle into fast spire. (It does this 80% of the time, and 9 pool speed the other 20%. It’s hard to counter both at once.) Steamhammer’s strategy reactions to it are in the right direction, but are too slow and sloppy. For SSCAIT 2017, I hand-made an anti-CasiaBot build that holds off the mutalisks with a slightly faster spire and gets a second gas to win with numbers, while preventing CasiaBot from ever expanding. CasiaBot’s build is hard for Steamhammer to beat, much tougher than CherryPi’s similar turtle build. But the counter usually works, and Steamhammer scored 2-0 in the round robin.

CasiaBot was a simple first test. In the first game, Steamhammer made its usual random opening selection and lost to the turtle build. But the plan recognizer recognized the build. In the second game, Steamhammer played its counter and won. The system worked.

I didn’t test against CherryPi, but I would expect this sequence as the most likely: Steamhammer wins the first game, because its random opening selection usually beats CherryPi’s first ZvZ opening. CherryPi switches to the turtle build for the second game and evens it at 1-1. (That’s what happened in the round robin.) Steamhammer switches to the counter-turtle build and pulls ahead again to 2-1. I don’t know what happens after that. CherryPi might have another winning build and even it again at 2-2, which would set up a learning race.

UAlbertaBot is a much tougher test because it plays random, and Steamhammer doesn’t have the smarts to fully adapt to each of the strategies UAlbertaBot follows. My hand-made counter scores about 2/3, almost always winning against zerg, mostly winning against terran, and usually losing against the protoss zealot rush.

I turned off the hand-made answer and let Steamhammer learn. In the first game, UAlbertaBot rolled terran and lost to Steamhammer’s random choice. In the second game, UAlbertaBot was protoss and Steamhammer played a counter to the terran opening it had seen: Steamhammer doesn’t need to lose to learn. Stopping the marines and stopping the zealots can be done in a similar way, and Steamhammer won again. Then UAlbertaBot rolled zerg twice in a row, and Steamhammer mis-countered both times. In game 5 UAlbertaBot was protoss again, while Steamhammer (having been hit with 5 pool twice in a row) thought its opponent had switched to fast rushes and played to stop the early zerglings, losing again. But at the end of 15 games the score was 9-6, not distinguishable from the 2/3 winning rate of my hand-made counter. The openings played were completely different, most wins were against protoss and terran instead of zerg and terran, and the reasoning behind the choices was the practical “this is what I saw, now beat it” instead of the theoretical “this is what ought to work” of my hand-made counter, but the end result was the same. Good enough!

Only longer experience will show how well the system works in the wider world of the SSCAIT ladder, and whether it can hold up when bot authors look for and exploit its weaknesses. I have thought of a lot of ways it might break down. I know it doesn’t work against Juno by Yuanheng Zhu. But I have also thought of a lot of ways to improve it. I have ideas in mind for the plan recognizer, the opening selector itself, and the reactions to recognized plans. A lot of information is not exploited yet. Some ideas I will get to before release, some will wait for later versions in the 1.4.x series, and some will wait a long time.

Near the 1.4 release I’ll write a description of how it works, up to date with the release. There are several working parts, but for what it does, it’s not complex. It acts for all 3 races, retains configurability, allows for random choice of openings in every situation if the author wants, and expects that some opponents will change their behavior over time. I imagine that getting all these features is simpler than you expect.

solid versus daring

A game player of a given strength is solid if it wins reliably against weaker opponents, and daring if it loses more games to weaker opponents and makes up for it by winning some against the stronger. I think the term solid is common. I decided for myself that its opposite should be daring.

The idea applies to all games of skill with winners and losers. You can always find more solid and more daring players, unless the game is so constraining that it leaves no room for stylistic differences. From the point of view of a player with a fixed level of skill, you could say that being solid means that your style of play aims to reduce your risk of losing, while playing daringly means you try to increase your chance of winning. From the point of view of an author, you could say that trying to make your bot more solid means working to reduce exploitable weaknesses that cause losses, while trying to be more daring means creating strengths that will catch out some opponents (like timing attacks or unusual rushes or tech switches). It makes sense for authors of weak bots to focus on daringly beating the stronger, and authors of strong bots to solidly beat the weaker. (Of course it also makes sense to do whatever is more fun.)

I’ve never seen a statistical measure of solidness, in the same way the elo is a statistical measure of strength. It seems widely useful, so I hope somebody has worked one out, or will work one out now that they know about it. A good one seems complicated, though. You could do something like estimate the winning chances each player has against each opponent with a method like that of bayeselo, then try to fit a measure of deviation from flatness over the range for each player. Does the difference between predicted and measured winning chance vary systematically depending on the predicted winning chance?

Here’s one simple measure for the top finishers in the SSCAIT round robin: What proportion of a bot’s losses came against the top 16? If most losses are against strong opponents, the bot is solid. The measure is approximately statistically fair only for the top few bots. We can see that Iron is solid and Tscmoo and McRave much more daring, while Killerbot and Bereaver are more solid than Tscmoo and McRave. I don’t think this number gives us much insight into whether Iron is more solid than Bereaver.

#	bot	top16 loss rate	%
1	Iron	7/10	70%
2	Tscmoo	4/14	28%
3	McRave	5/15	33%
4	Killerbot	9/19	47%
5	Bereaver	11/22	50%

Another simple measure for the stronger bots is: What’s the weakest opponent that you lost to in the SSCAIT round robin? The measure will be noisy, and comparisons only work for players that are close in strength. Also extremely daring lower-rank players like Oleg Ostroumov can distort it. But it’s quick to figure out and that counts for a blog post. I read the results from the unofficial crosstable.

#	bot	worst loss
1	Iron	#31 PurpleCheese
2	Tscmoo	#56 NUS Bot
3	McRave	#69 FTTankTER
4	Killerbot	#60 Oleg Ostroumov
5	Bereaver	#35 Dawid Loranc
6	Steamhammer	#44 Lukas Moravec
7	Wuli	#61 Marine Hell
8	CherryPi	#60 Oleg Ostroumov

My feeling is that Killerbot and Wuli are more solid than this noisy measure gives them credit for, and otherwise the numbers give a rough but fair idea. Iron is more solid than Tscmoo or McRave. Bereaver and Steamhammer are more solid than, say, McRave and CherryPi. In Steamhammer I’ve worked toward solidness, so I’m pleased to have it.

close game Steamhammer-LetaBot

I don’t want to write up any games that might be from the SSCAIT elimination phase, since it wouldn’t be polite to scoop the official announcement. But there are some good ones. In particular, CherryPi is starting to show its full opponent modeling skills, which are more sophisticated than we could see in the round robin phase with only 2 games against each opponent.

So here is a close game from the round robin, Steamhammer versus LetaBot by Martin Rooijackers. Steamhammer randomly chose a 1-hatchery lurker rush, a risky opening which often beats LetaBot quickly. This time, Steamhammer got distracted chasing the scouting SCV and put on no pressure with its early zerglings, allowing LetaBot to bunker safely at the front of its natural instead of in its main. Often LetaBot overreacts to the threat of the early zerglings, which is actually slight, and leaves itself weak to the lurkers. Here it defended nicely.

Well, not quite nicely. A bunch of marines stood in front of the defenses and were slaughtered by lurkers outside detection range of the turret. But Steamhammer is no smarter. As soon as the way to the bunkers was clear, zerg became overaggressive with the lurkers and lost them quickly. One lurker stayed outside bunker range and drained terran minerals into repair for a while, but as soon as marine range research finished in the academy, it died too. Steamhammer had rushed to lurkers with a weak economy (see the worker counts in the picture), so after an even-ish combat outcome, terran was ahead.

Followup lurkers behaved the same, killing infantry that placed itself needlessly in danger, then placing themselves needlessly in danger and dying. LetaBot expanded much later than it should have, letting zerg catch up in economy. But Steamhammer had been frittering its army away while LetaBot continued to build up despite losses, so terran was still ahead.

Finally the terran push came. The 4 sunkens and small zerg force can only delay the inevitable; the natural will definitely be lost. Zerg has 4 bases and an adequate economy, so it can attack the terran ball from all sides, but chances to save the game seen small.

The rear-placed sunkens were highly effective, because LetaBot assaulted them without regard to losses. Marines funneled between the buildings, suffering both sunken hits and splash damage from tank fire. Scattered zerglings ran in from every direction, but SCVs kept the tanks repaired. Steamhammer kept sending more drones to mine the natural gas, while LetaBot had expanded and added to its SCV count, so zerg was falling behind in economy again.

A small number of terran reinforcements were intercepted by zerglings taking a strange path across the map. The terran attack started to become disorganized, with some units running into the zerg main before the natural was reduced. In the picture, the spread-out tanks are under attack from both ends, and in the minimap is an engagement between zerglings and reinforcing marines, preventing the marines from joining up promptly.

The zerg army remained tiny, but it was LetaBot’s turn to rush in pell-mell without organizing its forces. Without marines supporting the tanks, and with adrenal gland research finished to increase the zergling attack rate, the small number of zerglings finished them off easily.

With the slow-to-replace tank mass destroyed, Steamhammer had enough forces to stop any followup attacks. The turnabout was sudden. Zerglings broke into the terran natural while the double bunkers were unoccupied, and simultaneously 1 lurker and a handful of zerglings erased the terran third. There was a little more drama with a last-ditch battlecruiser as Steamhammer was slow to deliver the finishing blow, but in the end an unnecessarily massive zerg force battered down the undefended terran buildings.

It was another narrow comeback. Those make fun games.

Next: Analysis of solid versus daring bots.

Steamhammer opponent model status

The opponent model can (sometimes) select openings in the development Steamhammer. That’s the key feature I wanted, so I should be able to release Steamhammer 1.4 shortly after SSCAIT ends, as I hoped. After some thought, I found a simple way to avoid the big refactoring I had expected.

It’s rather nice to see Steamhammer lose the first game, or sometimes the first several games, then realize what’s going on and abruptly counter the opponent’s play. It tries to predict the opponent’s plan. So far the prediction is made in an extremely simpleminded way, but I may improve it before release. Of course it is easier to predict an opponent that plays a fixed strategy than a learning bot, so I expect the skill to be less effective against the top bots. Predicting the opponent’s plan lets Steamhammer choose more extreme openings that are risky against an unknown opponent. Today I added a ZvZ_12Hatch opening, which is unsafe against many bots but is a strong counter to the favorite strategies of AILien and a few other zergs.

I expect to remove most of the enemy-specific strategies that are configured for various opponents. There are way too many of them. Steamhammer will finally have the ability to adjust its play when an opponent starts playing differently. Some enemy-specific configuration will remain for this version because the opponent model is not smart enough yet. I think I will let Steamhammer learn from scratch rather than feeding it artificial “learned” data to replace the manual configuration. That will depress its results against the rushbots until it learns about them, but it will be a good test of how well the opponent model works. Steamhammer will be smarter, but at first it will look stupider.

As always, my list of bugs to fix and improvements to make is longer than I can possibly work through. CherryPi has a whole team that still can’t fix all its bugs. I should be able to make some progress before the end of the tournament, though.

One of Brood War’s permanent irritations is that updates to the game engine tend to break saved replays from previous versions. There is no backward compatibility. I have decided to add a version number to Steamhammer’s saved opponent data files. I hope that I’ll be able to release future versions with different file formats that can still make use of the learning data from older versions. It would be no fun to have to relearn the opponents from scratch because an improved version came out.

aside

Speaking of SSCAIT, mixed in the usual random game list since the round robin phase ended are games that look like they’re from the round of 16. It worked the same last year. I watched some of these games from the replay page, but Steamhammer’s apparent round of 16 games are omitted from the page, no doubt due to some glitch. I save all of Steamhammer’s replays. Tournament replays are especially interesting, so it will be a shame if some of them aren’t available.

SSCAIT 2017 round robin results

The SSCAIT 2017 round robin phase has finished. See the official results and the unofficial crosstable. The unofficial crosstable seems to include a few extra games; I guess there’s a small leak in the pipeline. I have a few thoughts about the results.

Of the top 5, McRave is a newcomer this year and the rest are the old guard: #1 Iron, #2 Tscmoo random, #3 McRave, #4 Killerbot by Marian Devecka, #5 Bereaver. Killerbot and Bereaver weren’t updated this year and couldn’t quite keep up with the best, but remain tough opponents. It still takes a long time to produce a strong bot.

The results were influenced by the long tail of weaker bots which brought the tournament up to 78 participants. With many weaker opponents, the top players benefit from solid play, avoiding the risk of losing. Bots with daring play, which score well against strong opponents but lose to some that are weaker, were at a disadvantage. #1 Iron is the most solid bot: Look at the crosstable and see its row of 1-1 results against its strongest opposition; it more than made up for those losses with extreme consistency in defeating the lower ranks (the weakest bot it lost a game to was #31 PurpleCheese). Tscmoo in contrast scored well against top opposition, but had more losses to the long tail. I will try a little more analysis of the solid/daring tradeoff in another post.

#7 Wuli is hanging in there. The hard zealot rush is still a successful strategy, and it executes well.

#8 CherryPi remains an interesting case. It also suffered from its daring play. To my eye, it seemed to be learning something about each opponent from the first game, and applying it in the second. As the tournament continued, it surged higher in the ranking. How high might it have finished in a very long tournament? It would be interesting to count how many times it scored a loss then a win versus win then loss in the 2 games against each opponent: A high ratio of loss-win over win-loss indicates the ability to learn from a single game. But it might not be so clear; against an opponent that also learns like McRave, or that changes its play up like Steamhammer, what CherryPi figures out from its first game might lead it astray in the second (I think that happened in the second McRave-CherryPi game).

Microwave, Neo Edmund Zerg, and TyrProtoss tied for places #9-#11, each with 31 losses. I had expected Microwave to do a little better, but I think it relies on its opening learning, and it hadn’t played all the opponents before so it didn’t know enough. I had expected the rushbot Neo Edmund Zerg to do a little worse, but the many newcomers of course all fell to its rush.

My predictions for the tournament are reasonably good (except for the glaring mistake that the tournament was actually a double round robin). I did not expect Tscmoo to finish so high. Steamhammer I boldly forecast to finish in the narrow range from #4 to #8, and it ended up squarely in the middle of that range at #6. I’m pleased that I understand the performance of my own bot.

the elimination phase

According to the rules, random bots will not play in the elimination phase. So Tscmoo random and Andrey Kurdiumov are excluded, and the 16 continuing to the elimination phase should be:

Iron
McRave
Killerbot by Marian Devecka
Bereaver
Steamhammer
Wuli
CherryPi
Microwave
Neo Edmund Zerg
TyrProtoss
XIMP by Tomas Vajda
Arrakhammer
Skynet by Andrew Smith
LetaBot by Martin Rooijackers
AILien
ZurZurZur or Black Crow

Last year, Steamhammer and Zia tied for places #16-#17, and played a best-of-3 tiebreaker to decide who continued. This year ZurZurZur and Black Crow are tied for #16-#17 (excluding random bots) with 108 wins and 46 losses. I hope for another tiebreaker!

Last year the pairings were #1-#16, #2-#15, and so on. It gives the top finishers an advantage over middle finishers; #8 is paired with #9 and must play a close rival. The official pairings were tweeted while I was in the process of writing the post; here they are:

This is close to what I expected, but not quite the same. The tied bots Arrakhammer and Skynet were taken in reverse order from the order listed in the official results, so Steamhammer is paired against Skynet and Bereaver against Arrakhammer. Maybe the idea is to avoid Steamhammer playing against its fork Arrakhammer? Or maybe the idea is to avoid 2 mirror matchups, ZvZ and PvP? Anyway, these are acceptable pairings by the same rules followed last year, except for the unannounced tiebreaker. Maybe ZurZurZur’s 2-0 win over Black Crow in the round robin is taken to break the tie.

New this year is a loser’s bracket. This is now a double elimination design, where you have to lose twice to be out, and no longer single elimination. If you lose 1 match, you fall to the loser’s bracket, where you remain until you either lose a second match or win every match and win the loser’s bracket. The final is between the winner of the winner’s bracket, which lost 0 times, and the winner of the loser’s bracket, which lost 1 time. Every other bot lost twice and is out. Giving participants a second chance makes the tournament a little more fair. On the other hand, last year the elimination phase included best-of matches for the round of 4 and later, whereas this year I’m guessing that they may be single games.

I think the rules should explain the format of the tournament. There is no clear explanation that I know of.

two McRave games

Here are 2 McRave games. The first is what will probably turn out to be the biggest upset of SSCAIT 2017, and for journalistic balance (look at me! I can pretend to be objective, just like a reporter!) the second is a win over a tough opponent that has given McRave trouble.

McRave is currently at #3, and it will probably finish there. So I find it striking that both games show easy to notice weaknesses on both sides. All bots have a long way to go to become truly strong.

McRave-FTTankTER

As I write, McRave is #3 and FTTankTER is #69 out of 78 entrants, with fewer than 50 games remaining to play in the tournament. There are a couple of unplayed games that theoretically could unseat this one as the biggest upset, but it’s unlikely. What I find most remarkable about the game is not that the result was such a reversal, but that it came about because FTTankTER played better. McRave didn’t lose because of a bug (at least not one that I can detect) or by playing a risky strategy and getting unlucky, but because of missing skills.

McRave-FTTankTER started with McRave fast expanding behind a single gateway and FTTankTER rushing with marines.

McRave did not make an initial zealot, but waited for its cyber core to finish so it could get straight to dragoons, the key unit at the start in PvT. Making 1 zealot slows down dragoons a trifle but adds safety against all kinds of cheeses and fast rushes, so it’s probably smart. But even without, McRave could have held. When a small number of marines show up at your front, they are weak. Marines gain strength in numbers because they are ranged units, but workers are faster and tougher than marines without medics or stim. Just pull workers and defend until your gateways produce. Workers can easily win fights against small numbers.

Instead, this happened:

Protoss pulled probes only after losing its first gateway, when the marine numbers had grown. The probes did not try to surround marines, but mostly milled around in front of the marines as if playing dodgeball. Nearly every probe was lost before the dragoon entered the fight. McRave was too optimistic, first in ignoring the attack, and then in continuing to throw away probes. A fallback plan would be: Abandon the natural, retreat the surviving probes, wait for the dragoon, and try a coordinated probe-dragoon defense of the main.

FTTankTER is clumsy and wasn’t able to finish off its helpless opponent, but the no-kill time limit ran out and terran won on points.

I think McRave shows some wider vulnerability to marine all-in attacks. McRave-Oleg Ostroumov is an example. Since McRave has lost fewer than 10% of its games, its weaknesses are apparently not easy to exploit.

McRave-CherryPi

CherryPi won its first game over McRave when McRave played a standard forge expand. In the second game, McRave played differently and CherryPi never seemed to notice. It was still a fight, though.

When both players learn, it becomes a race to see who can learn more and faster. With only 2 games, we can’t tell how the race would have turned out.

The game McRave-CherryPi on Benzene opened with McRave building 2 gates and CherryPi playing overpool into second and third hatcheries at the natural. CherryPi droned up as if McRave had fast expanded, which it should have known didn’t happen because its zerglings made it to the protoss natural. Zerg was underdefended, and McRave’s zealots killed a couple drones in the zerg natural and started hitting buildings.

Then a sunken started and the zealots retreated for no apparent reason. Protoss should at least take swipes at the morphing sunken until zerglings appear. The protoss scout probe in the main saw the zergling count and location, so McRave could have known it was safe. In the game above, McRave was overconfident; here it is overcautious. It is a sign of not truly understanding the situation (so far, no bot does). In the picture, the zealots have just retreated.

Wuli beat CherryPi 2-0 with its heavy rush, but McRave likes to tech faster. CherryPi added to 3 sunkens and continued drone production, still seeming to assume that McRave had fast expanded. McRave poked repeatedly at the front without committing much or achieving much; at least it impelled zerg to spend on fighting units instead of drones. McRave often had a vanguard of units doing the combat and a rear guard that stayed out of the fight. I got the impression that McRave was not hiding its strength, but was just confused.

CherryPi had mismanaged the opening and was contained. Lurkers or mutalisks might have forced protoss back, but CherryPi got the lair late and did not make either; it wants to win with low-tier units. Sticking with zerglings and hydralisks and making many drones, zerg soon needed to expand more than it safely could, and put a hatchery at the nearby mineral only base, barely outside the containment. McRave soon scouted it—and did nothing. Protoss continued to poke at the front and ignored the third base. It could have detached a couple of rear guard zealots to take it down; zerg could have done nothing. The picture shows protoss defeating an inadvisable zerg foray near the mineral only third. After this, McRave ignored the third and made another poke at the front (even if the bot doesn’t notice creep, protoss had seen the hatchery with a probe). In the minimap, McRave has just started its natural nexus.

Finally McRave felt confident enough to split its forces and kill the expansion. Before it died, CherryPi started a fourth base in the lower right corner. CherryPi was ahead in workers but had only 2 mining bases, while McRave had a far stronger army and a mostly successful containment (it only leaked a few drones).

After finishing the zerg third, McRave seemed to realize how far ahead it was and broke into the natural. With drones killed and a second nexus to make more probes, McRave had effectively caught up in economy and its army was more than zerg could face. In the picture, a high templar is storming drones that decided to fight instead of running away. The drones might as well do that; the only place they could safely run away to was the main, which was already saturated.

CherryPi did not go down easy, but protoss was too far ahead. Oddly, though McRave made many templar and they accumulated plenty of energy, that one storm was the only one in the game. The high templar stayed in the rear guard where they were too far away to contribute. Also, both bots seemed confused by the neutral building block on the map, and got units stuck behind the block. I expect that from rough bots like Steamhammer, not from polished competitors.

CherryPi showed its curious strategic rigidity, where it believes without scouting that it knows what the opponent is doing—in this case, it even scouted that the opponent was not doing what the zerg opening assumed. To me it seems strange, because in Steamhammer the first major feature I wrote was the strategy boss which solves this exact problem, and it greatly boosted zerg’s strength. McRave showed surprising caution and slowness in taking advantage of opportunities.

Steamhammer 1.4 plans

My hope for Steamhammer 1.4 is to release it shortly after SSCAIT ends. Whether it is then or later, 1.4 will come out when it is “done enough.” I have finally decided what “done enough” means.

The headline feature is of course the opponent model. I turned off the extensive game record for this version, and kept only the brief summary of each game against each opponent, largely a list of numbers like when air units came out. The game summaries are already so rich with information that it’s difficult to exploit it all.

The opponent model will initially support 3 features: 1. The plan recognizer to figure out what the enemy is doing. This is in the tournament version and is being improved. 2. Specific strategy reactions to recorded information about this opponent. 3. The largest addition over the tournament version will be opening selection based on the opponent model. My first cut at this will be simple, but it will be able to hard-counter some opponents in the second game played while still choosing openings randomly. The opponent model will allow Steamhammer to play specialized openings that aren’t safe against the average opponent.

Choosing openings according to the opponent model is turning out harder than I expected. The decisions are not hard; I wrote a very simple first try. Refactoring so that the right information is available at the right time for each decision, that’s the tricky part. The codebase remains on a different model. I’ll have to take care to avoid bugs.

I don’t want to ignore terran and protoss. They have been falling behind in skills. Well, they’re going to keep falling behind, but more slowly. All the opponent model features are supported to some extent by every race. I have added new strategy adaptations and emergency reactions so that terran and protoss should be a little less fragile. Today I am trying to get shield batteries working and fix protoss emergency zealot production.

As usual, bug fixes are making a big contribution. The Steamhammer development version is noticeably stronger than the tournament version, to my eyes. The BuildingManager fix helps all races with macro, and the chooseTechTarget() fix (briefly mentioned here) helps zerg avoid strategy blunders.

Steamhammer 1.4 will be the first of the 1.4.x series. I want to make more progress before I move on to the next major feature.

thoughts about CherryPi

CherryPi remains interesting, although for now it still looks like Just Another Bot.

CherryPi wants to win with masses of zerglings, or occasionally hydralisks against protoss. It has some reactions, but overall tends to show a lack of strategic flexibility. It has a plan, but if the plan doesn’t work the followup tends to be slow and inadequate (compare Killerbot by Marian Devecka, which can completely switch its unit mix when a plan doesn’t work). Today’s game against LetaBot by Martin Rooijackers is an example. CherryPi opened with a fast second hatchery and no gas, to put on early pressure with masses of slow zerglings. LetaBot saw it coming and easily repelled the lings. CherryPi kept making masses of lings with few drones even after terran had medics and stim, when no quantity of slow zerglings could pose a threat. Terran played slowly and overcautiously, making easy-to-see mistakes, but it didn’t matter because the zerg strategy was inconsequent. By the time CherryPi started slowly adding mutalisks, mutalisks were also no threat. LetaBot eventually moved out and swept aside everything in its path. (Then LetaBot got stuck on the enemy ramp and crashed, or overstepped the time limit, but that’s a different lesson.)

CherryPi seems to make a lot of decisions without scouting. For example, it makes scourge (sometimes more than a little) when the enemy has no air tech. It moves its overlords, but does not send one to the enemy base. I think it is making choices based on units that it sees, especially versus protoss. But when it feels overmatched it holds its army well back from the enemy, meaning that it can’t see. Compare Steamhammer, which is aggressive and keeps its units forward even when it’s a big risk; the countervailing advantage is that it gets to see what the enemy has and what the enemy is doing.

Against zerg, CherryPi has different openings. If one loses, it tries the next. From what I’ve seen, the first opening is a 9 pool without gas, followed by a hatchery for mass slow zerglings. It’s a safe middle-of-the-road opening, or in other words halfhearted, but CherryPi loves its favorite. If that loses, the second try is sunken turtle into mutalisks, which is successful against many zergs (I think it is likely to work against Steamhammer too). There may be fancy learning going on behind the scenes, but if so we can’t see it because the tournament doesn’t have enough games.

Against protoss, I can’t detect any such progression in CherryPi’s openings. I think it’s always playing the same opening, and adapting it somewhat to the situation. It expands with 12 hatch, sunkens up its front, and makes massive numbers of drones. It’s similar to Killerbot’s plan, and a good one in general though the early sunkens are often unnecessary (and occasionally insufficient). The games against Bereaver make a good example; the first is a loss and the second a win, but zerg plays similarly in both. In the first game, notice how zerg keeps making mass drones and maintains a strong economy even as it is losing every fight, including losing its drones at a high rate. In the second game, the difference was that the players were at cross positions on a large map, and Bereaver’s corsair play and reaver drop were less effective.

CherryPi is safe against fast rushes; it has a perfect record so far against the zerg rushbots. CherryPi is vulnerable to hard rushes. Wuli won 2-0, and so did Flash which also does heavy early zealot pressure. Black Crow’s relentless zergling waves beat both the zergling opening and the turtle opening.

Overall, CherryPi has glaring weaknesses just like other bots do. But as I write, it is ranked #11, so it is strong by the standards of this tournament. I think the main source of its strength compared to other bots is the same as the main source of Steamhammer’s strength, the pressure style of play, which works because bots are better at attack than at defense. Steamhammer is ahead of CherryPi for now, because I invested effort in stability and resilience and lose fewer games to bugs and basic goofs. The CherryPi team is presumably investing in smarts instead, which should pay off in the long run. I haven’t seen any sign that CherryPi has particular smarts in opponent modeling—as far as I can see its opening learning is a simple algorithm, and I can’t detect anything else it might be doing—but if it does we might not be able to tell, because the tournament is not long enough.

The next AIIDE tournament may be interesting.

cheese game McRave-MadMix

Yesterday an epic, today a short sharp shock. The game McRave vs. MadMix is a brazen cheese. Why shouldn’t I build my first pylon next to your nexus? Maybe because it couldn’t possibly succeed?

MadMix placed its first pylon in McRave’s mineral line, at the corner of the nexus. I can suggest an improvement: Place the pylon slightly to the left so it blocks access to the indented mineral patch. That is called a manner pylon because it is ever so polite. A probe sent to mine there will run behind the mineral line, slowing down the opponent’s mining. If you’re going to push a pylon into the opponent’s face, you might as well make it a manner pylon, as in the famous game Bisu-Pokju from 2007.

A manner pylon is commonly worth it, given that you have an early probe in the opponent’s base, especially if (as in Bisu-Pokju) it blocks 2 mineral patches: In between trapped workers that have to escape (which I doubt bots have the knowledge to do), workers devoted to attacking the pylon, and workers sent behind the mineral line, it can slow down the opponent’s mining more than enough to make up for its cost. It occurs to me that a bot with mineral locking could avoid some of the cost provided it has the special case knowledge to avoid locking workers to the blocked mineral patch or patches. I doubt that any bot has that knowledge yet, since I’ve never seen a bot place a manner pylon! I would be interested to see how a manner pylon interacts with LetaBot’s path smoothing. If the smoothing is not smart enough, some SCVs might be unable to mine at all—I am imagining SCVs bumping against the pylon trying to follow the shortest path.

McRave assigned 2 probes to tear down the offending pylon. MadMix calmly continued the cheesemaking process, building a gateway, then replacing the destroyed pylon with 2 fresh pylons, then laying down a second gateway, all while sending fresh probes to make sure one was always on hand. McRave seemed unimpressed and carried on with its 1 gateway build.

McRave ought to have been impressed. McRave’s first zealot was out earlier, and it could have held easily with good play. But McRave didn’t seem to know how to react; as it was attacking proxy buildings and mining gas and starting its cyber core, MadMix was killing probes and pulling ahead. The proxy won. Correct play when you get proxied like this is to delay your tech until you have the situation in hand. Your opponent set itself back to perform the proxy, and you can always stay ahead in units.

Don’t blame McRave for missing knowledge. All the top bots, Iron and Tscmoo included, have knowledge gaps wide enough to drive a government cheese truck through. It was only shortly before the tournament that I added smarts to Steamhammer to react on the fly to this kind of in-base cheese (Steamhammer makes a spawning pool if it has 9 or more drones, and will cancel gas or a second hatchery if that helps it get the pool up faster). And Steamhammer doesn’t understand how to react to other proxies like Juno’s (by Yuanheng Zhu) cannon contain (it still relies on a hand-coded counter for that). Bots need a lot of knowledge and it takes a long time to acquire.

epic game Steamhammer-WOPR

Today’s game is Steamhammer vs. WOPR by Soeren Klett on Andromeda. The game is not only epically long, it is epically clumsy on both sides.

WOPR plays a mech strategy that is slow with air defense. When Steamhammer opens with mutalisks, it flicks WOPR aside. This game, Steamhammer randomly chose to open with a low-economy lurker rush, and momentum went the opposite direction. WOPR had just enough scans to stop the initial lurker attack (a cleverer or more persistent attack might have succeeded). Then zerg strangely decided that its next tech choice should be hydralisks. Steamhammer had seen enough tanks that it should have known it was making a bad decision, so I think the error was not in analyzing the unit mix, but in choosing the tech target based on the analysis. Yesterday I rewrote StrategyBossZerg::chooseTechTarget() in a way that fixes 2 subtle bugs and improves the intended behavior, and I hope that will solve the problem. Anyway, WOPR eventually went on campaign and wiped out the zerg natural and main and the in-base mineral only, while cementing its huge lead by expanding across half the map. All that was left was to clean up a few underpopulated zerg expansions.

In the picture, the zerg economy remains weak and the army is hopelessly outmatched because it has suffered continual losses due to poor unit mix and poor tactics. In a human game, zerg could give up. 2 new terran expansions are visible on the minimap. Also notice that there are 3 evo chambers even though Steamhammer only knows how to upgrade with 2—it’s another bug that I fixed recently.

Like many old-school bots, WOPR believes it is finished when it has destroyed the enemy main. It does not seek out remaining bases, but sends its army toward (0,0) in the upper left. But WOPR does move its army to react to enemy units that it sees, and it continues to take bases and defend itself. Terran was maxed and zerg was extremely weak, with 8 surviving drones and the only tech a spire built at the last minute before the hive was destroyed. In the minimap, red is everywhere and blue survives in a corner. (Light blue on the minimap represents neutral buildings.)

Steamhammer knows how to recover. It slowly rebuilt its economy and tech and made mostly air units, especially scourge since it had seen battlecruisers. When the few zerg units showed themselves, an overwhelmingly strong column of terran units appeared and chased them away. Even constrained to ignore the zerg bases, terran had a big advantage. If zerg built up fully and also maxed its army, terran still had more bases and had been banking resources for a long time. WOPR only had to hold firm until the time limit was reached, and it would be awarded victory on points.

After the above picture, the battle started to get interesting. The scourge scored more victories than losses against the battlecruisers and cloaked wraiths. The first ultralisks were annihilated by tanks with +3 attack, but zerg was starting to do some damage. Of course WOPR had a huge reserve of minerals and gas, and it instantly replaced lost units. But then zerglings and one ultralisk found their way into the terran 9 o’clock base and destroyed it before it had quite mined out. Terran forces had been distracted by air activity at the center base, and most of the terran army remained guarding the unimportant upper left. It looked as though zerg had a chance after all; WOPR’s tactical reactions were not accurate enough. Notice the zerg supply continuing to increase. WOPR replaced many of its lost workers with combat units, which was correct play because terran had a big bank and didn’t need the income.

Steamhammer attacked the upper left natural and didn’t quite finish it off. Then the ray of hope started to dim. Steamhammer made too many guardians and was in danger of losing them all to the battlecruisers. Then hope brightened: WOPR played poorly and Steamhammer had the sense to follow up with hydralisks and scourge, and the battlecruisers exploded. WOPR targeted hydralisks though battlecruisers can one-shot the more dangerous scourge. Zerg destroyed the upper left natural. Then hope dimmed again: Steamhammer decided that the next target had to be, not an undefended base, but the upper left main where the giant terran army was hanging out. In the picture, the zerg army could not even get close before it was forced back by terran reinforcements appearing to its rear. Still, notice that zerg has retaken its main and mineral only and is now taking the formerly terran upper left natural. WOPR retook 9 o’clock and finished mining it out, but was otherwise not as economically active as it should have been. The zerg economy was in high gear and zerg was ahead in minerals but hurting for gas.

Both sides showed messy tactics and micro. Steamhammer made a few devourers for help against the battlecruisers and wraiths, but used them poorly. Zerg lost workers after taking the indefensible center base (which it had earlier kicked out of terran hands). Steamhammer also made many indecisive movements and often chose the wrong targets. WOPR played worse, though. It kept the goliaths in the corner surrounded by siege tanks, so the massive air defense was out of range of attacking guardians. Guardians slowly ate through the rows of tanks and finally killed the command center. Hydralisks, despite some confused behavior, kept terran reinforcements at bay though they couldn’t approach the mass of tanks.

Having killed the upper left command center, Steamhammer felt no need to engage the remaining terran army but set about the task of clearing the other terran bases. At least that part worked well. WOPR was less efficient and burned through its entire bank of minerals. Steamhammer won without ever maxing its army.

It was 46 minutes and full of events, many of which I didn’t mention. As one example, you can see on the minimap in the last couple pictures that Steamhammer built 2 macro hatcheries in a strange position outside its original base, where WOPR had taken the natural. I’m not sure what caused that.

By the way, this is the kind of game where either player would have benefited from being able to take island bases. If Steamhammer knew how to take islands, it could have made 2 more bases and would have recovered faster. If WOPR took islands and turreted them up, they could have been defended indefinitely by air units. To destroy a strongly defended island base you need a coordinated large-scale attack which no bot has the skills to pull off.

luck

In AIIDE this year, Steamhammer scored much lower in the first 5 round robins (55%) than over the entire tournament (64% over 110 rounds). The difference amounts to finishing #10 instead of #13. It could be due to other bots getting confused by Steamhammer’s random openings and mislearning, but I think it is more likely to be statistical noise. Steamhammer happened to be unlucky early on, and the bad luck washed away over the long tournament. Data is cleansing.

SSCAIT has only 2 round robins. The 5 rounds of Steamhammer’s bad luck comprised 135 games compared to the 154 games each bot plays in the 2 rounds of SSCAIT, similar numbers. Some bots will be lucky and place a little higher than they would have in a very long tournament, and some bots will be unlucky. SSCAIT has a different purpose than AIIDE, so I think that’s OK. But it is a point to remember.

It’s difficult to judge by intuition whether a bot is getting lucky or unlucky. The majority of Steamhammer’s losses so far are “unlucky” losses against opponents that Steamhammer usually defeats. That is exactly what we should expect. The bots in the highest places (Steamhammer is currently #6 out of 78) don’t have many opportunities to lose to stronger opponents. Look at the crosstable and you’ll see that all the top bots have the majority of their losses on the right-hand side, against weaker opponents. No player is perfectly solid, we all lose occasionally due to our own mistakes; it’s a hard game.

That said, I have a clear idea of which games I see as unlucky results. For example, Steamhammer lost 0-2 to Flash this tournament, while in my test at home, Steamhammer beat Flash at a ratio of 4:1. In its losses, Steamhammer happens to randomly choose openings that don’t work against this opponent, or gets into less common situations where weaknesses pop up. Steamhammer’s first game against Flash is its worst game of the tournament so far; Steamhammer barely seems to be in the game at all, but simply falls down when poked. I should repeat that an unlucky result against one opponent doesn’t mean that Steamhammer’s overall result is unlucky. With 2 games against each opponent, lucky and unlucky results against given opponents are virtually inevitable. And it’s hard to judge by intuition whether the good and bad luck balance out.

Steamhammer does have one clear lucky win, Steamhammer > Microwave. Microwave learned that 5 pool on average beats Steamhammer’s ZvZ opening mix, and played it this game too. Steamhammer got lucky and randomly chose 9 pool speed, which counters 5 pool, and won after a long game. Steamhammer maintained its lead the whole game, but Microwave defended stubbornly and had to be ground down (it’s a good game if you like that kind of thing). How will the second Steamhammer-Microwave game go? I can’t predict! Microwave will have an edge if it keeps its opening, but after losing it may switch.

For the rest of the tournament, I predict 2 losses to Iron and 1 more loss to McRave. TyrProtoss is also likely to take its game, and if CherryPi adapts against Steamhammer in the same way it has adapted against other zergs that defeated it, then CherryPi will have an edge in its remaining game—and those are all the likely losses. If Steamhammer wins any of those 5 games, they will be lucky wins. I can’t predict the Microwave or Tscmoo games. Any other losses will be unlucky losses. It seems plain that the majority of Steamhammer’s losses for the rest of the tournament will be unlucky losses, losses against opponents that Steamhammer usually beats, and that is how it should be. Frequent unlikely chances outweigh scarce likely chances.

Next: An epic game.

the weird life history of the sunken colony

A zerg creep colony has a base health of 400 hp. A zerg sunken colony has 300 hp. The way it works is, just when a creep colony finishes morphing into a sunken, it suddenly loses 100 hp. If it falls below 0 it doesn’t die, though; there is a floor of 1, and zerg healing has instant effect so the sunken colony completes with 2 hp (I think it takes 1 frame for the second hit point to be granted). When you start morphing a sunken, the UnitType immediately changes from creep colony to sunken, but the hit points are not removed until the morph completes.

Good players exploit this. I don’t know of any bot that takes advantage. Steamhammer certainly doesn’t. It could be hard to notice in a bot’s play, though. Does anybody know of one?

If you’re attacking morphing sunkens, then you should avoid doing unnecessary damage. If a morphing sunken will heal to no more than 100 hit points before it completes, and your units will still be in range then, and there is something else in the area worth attacking, then you should probably switch to attacking the something else. You will normally be able to kill the brand new sunken with one hit before it can get a shot off, saving the time it would have taken to inflict nearly 100 hp of damage. Depending on the situation, you may also be able to stop early if for some reason you’re pre-emptively hitting creep colonies before they morph. (I think most bots don’t waste their time on that, though.)

If you’re considering morphing a sunken, you may want to hold off if the creep colony has less than 100 hp left, or any low number that would leave the sunken helplessly weak. That may be best if an enemy attack is near; don’t waste 50 minerals on a defense building that will die instantly, get a pair of zerglings instead (if you have a larva and supply). On the other hand, if you don’t expect enemy action for a long time, you may want to morph the sunken immediately. By losing less than 100 hp in the morph, you are effectively speeding up zerg healing compared to the case of waiting to morph the sunken later.

It’s a strange behavior, and the considerations it induces are strangely complicated. You can only make the best choices if you have skill in seeing the unclear future. Taking all considerations into account is too hard, but I think the top bots have become strong enough that they could gain from taking into account the basic considerations.

Spore colonies have 400 hp and don’t show the same strange behavior.

BWEB building placement library

Christian McCrave, author of McRave, is writing a building placement library named BWEB. BWEB is C++ and depends on the BWEM map analysis library by Igor Dimitrijevic, and within those constraints should be easy to add to a Brood War bot. It’s currently in beta at version 0.8, finished enough to try out but not yet thoroughly tested.

The idea behind BWEB is to place buildings not one at a time like most bots, but in compact predefined “blocks” of several buildings of different sizes. A protoss block will include at least 1 space for a pylon plus possibly large spaces for gateways and other large buildings and medium spaces for buildings the size of a Citadel of Adun. From McRave Blocks v1.2 (mentioned in the code):

You can think of a block as a sort of macro for buildings. There are 2 ideas behind it: First, building placement can be compact. Bots that place buildings one at a time often leave space around each building for units to pass through, wasting room and filling up the main base. (ICEbot is a notorious example; its buildings tend to spill out of the main.) BWEB only has to leave space around each block. Second, the library itself can be faster, because it has to choose the locations of a small number of larger blocks rather than a large number of smaller buildings.

Here is the key part of the API, how to get a location for a building:

	// Returns the closest build position possible for a building designed for anything except defenses, with optional parameters of what tiles are used already and where you want to build closest to
	TilePosition getBuildPosition(UnitType, const set* = nullptr, TilePosition = Broodwar->self()->getStartLocation());

	// Returns the closest build position possible for a building designed for defenses, with optional parameters of what tiles are used already and where you want to build closest to
	TilePosition getDefBuildPosition(UnitType, const set* = nullptr, TilePosition = Broodwar->self()->getStartLocation());

	// Returns the closest build position possible, with optional parameters of what tiles are used already and where you want to build closest to
	TilePosition getAnyBuildPosition(UnitType, const set* = nullptr, TilePosition = Broodwar->self()->getStartLocation());

It’s simple enough; ask for a building location, get one. BWEB also has a method to fetch a wall to place in the natural choke, and various utility and debugging methods.

A comment in the code may be a to-do list:

	// Currently missing features:	
	// - Counting of how many of each type of block
	// - Defensive blocks - cannons/turrets
	// - Blocks for areas other than main
	// - Variations based on build order (bio build or mech build)
	// - Smooth density
	// - Optimize starting blocks

thoughts

Gaoyuan Chen’s grid of protoss buildings is similar in idea to BWEB’s blocks of buildings. Unlike BWEB, Gaoyuan Chen places medium buildings in large spaces (and it doesn’t seem to be a problem). In the picture, buildings off the grid are enemy proxies:

BWEB seems suitable for protoss and terran building placement. Zerg building placement has different requirements and doesn’t seem like a natural fit with the block technique. Sure enough, in the BWEB code I see plenty of special cases for terran and protoss, and no mention of zerg. Of course, terran and protoss benefit from compact building placement. Zerg has less space to place buildings, since they have to go on creep, but also has fewer buildings to place.

How does BWEB ensure that the mix of small, medium, and large building spaces fits the mix of buildings you will create? I got the impression that if, say, you want to (or later in the game end up wanting to) create many gateways (large) and fewer than usual tech buildings (mostly medium), then BWEB will pre-allocate blocks with more medium spaces than you finally use. That doesn’t sound like much of a problem, though.

It’s new software, and many features you might want are not there, at least not yet. You may want tech buildings in remote parts of your base so they are safer and harder to scout, and production buildings near the ramp. You can do that with the “build closest to here” feature, but you may have to take extra care to place pylons ahead of time. The wall feature is very limited so far. The game gives us many uses for complete or partial pylon walls and for supply depots as obstacles. There’s no apparent support for proxies or gas steals; you’ll have to use your own code. The same for spotting pylons or blocking the opponent’s natural with an incomplete engineering bay. And of course there are inherent limitations to the idea of predefined blocks of buildings. A predefined block is never quite optimized for the exact terrain and matchup and build order and so on.

Still, it seems fast and easy, and it’s an improvement over what most bots do today. It should be well worth trying out for a protoss or terran bot, especially if you already use BWEM.

Thanks to the author Christian McCrave for letting me know about it!

Steamhammer reserved resource bug

Today I fixed the Steamhammer bug that sometimes had the building manager reserve minerals and gas and not release them. When I put in debugging code, the error turned out to be much more common than I suspected. In the first test game after fixing the bug, late game macro went from sometimes spotty to smooth and strong, and Steamhammer scored a win over McRave from a situation where it never had before. The bug deserved earlier attention than it got.

The error was that one caller of undoBuildings() skipped out on its responsibility to release resources when buildings were canceled or unable to be started. (undoBuildings() is a Steamhammer addition, so the bug does not affect UAlbertaBot or early Steamhammer forks, at least not in the same form.) I fixed it by moving the responsibility of releasing resources from the caller into undoBuildings() itself, adding these lines:

		// If the building is not yet under construction, release its resources.
		if (b.status == BuildingStatus::Unassigned || b.status == BuildingStatus::Assigned)
		{
			_reservedMinerals -= b.type.mineralPrice();
			_reservedGas -= b.type.gasPrice();
		}

Of course those callers that release resources have to be changed so they don’t. Not all of them are supposed to—sometimes the building is under construction and the resources are already released.

Don’t trust this bug fix too much. There could be more bugs. I left the debugging code in for now, so if there are more building resource issues, or mistakes in the fix, then I should find them before the 1.4 release.

Bryan Weber is fearless

Bryan Weber has no fear of building a hatchery in the face of the enemy. Bryan Weber - CerkoBot by Thomas Cere has an example that I couldn’t resist. Notice the one drone calmly mining minerals in the midst of a crowd of probes and surrounded by battle.

Bryan Weber - NUS Bot has a less extreme example, merely building next to the enemy main. This hatchery did not live long, despite the lurker that hatched and defended it.

There are more.

Steamhammer is getting auto gas stealing

I’ve just finished implementing AutoGasSteal, which means that Steamhammer can make its own decisions about when to steal gas. I also added random gas stealing, just because it’s easy. As I wrote in November, my main purpose is to stop the gas steal code from bitrotting like it did before; regular gas stealing in regular games will be my test suite. I doubt that stealing gas will make much difference in Steamhammer’s strength, but it’s easy to imagine that a few opponents may find themselves confounded.

There are now 4 ways to order a gas steal:

Write it into the build order with "go steal gas". If Steamhammer hasn’t started scouting yet, this also sends out the worker scout to find the enemy base.
Write your own C++ code to order a gas steal (it amounts to setting a flag). For example, StrategyManager could decide based on scouting information.
New: Configure a value for RandomGasStealRate, a probability from 0 to 1. Steamhammer will randomly choose to steal gas in that proportion of games.
New: Configure AutoGasSteal as true. Steamhammer will decide based on the opponent model whether to steal gas.

If any of these methods orders a gas steal, Steamhammer will go steal the opponent’s gas if it can. For now, the scouting worker that will steal gas heads out at the usual time (whenever "go scout" or one of its variations comes up in the build order). I added some data so that later Steamhammer can decide based on experience when to send the gas steal worker. I would like the bot to decide gas steal timing and scout timing on its own. I’m not sure whether I’ll get to that in this version.

For the first pass, I implemented auto gas stealing in a simple way, with the UCB1 algorithm that many bots use for opening learning. It keeps data for each opponent to remember how often it has tried stealing gas and how often it has won with and without, and chooses one or the other based on the UCB values. A small amount of fictitious data whispers the lie in its ear that it has tried stealing gas a couple times and lost. That makes it reluctant to pay the cost of stealing gas unless it sees a benefit. In a later version I may switch it to a contextual bandit algorithm so it can take more information into account—the opponent model has plenty more information available.

It works for all races. Just turn it on in the configuration file. I did find one new bug in the gas steal, a rare bug when emergency reactions interfere with starting the refinery building, but that will be fixed before long.

Now I need to go run a lot of tests!