SSCAIT Report 58
I just realized: The SSCAIT Report 57 broadcast came out on Sunday as usual, then Report 58 came out only 2 days later, not waiting until the next week. It’s about terran versus terran.
I collected data to calibrate openings so that Steamhammer’s opponent model can make more sense of the opponent’s opening. I thought some of the data might be of interest.
Here are the frame times at which Steamhammer’s first zerglings hatched with different spawning pool timings. These were all measured on Heartbreak Ridge, since the numbers vary depending on the layout of minerals on each map. Also the numbers vary somewhat from game to game, but it’s close enough. Steamhammer has wasted movement, so optimal timing is probably a touch faster.
pool | frames |
---|---|
4 pool | 2630 |
5 pool | 2750 |
6 pool | 2920 |
7 pool cutting drones | 2990 |
7 pool going to 9 drones | 3120 |
8 pool | 3160 |
9 pool | 3230 |
Making a spawning pool on 7 leaves you with 6 drones. It’s possible to make 3 drones and an overlord before the zerglings, but the lings have to wait for the overlord and are slightly delayed. Instead, you can cut 1, 2, or all 3 drones and get that many pairs of zerglings before the overlord, saving a sliver of time. It’s an interesting tradeoff.
I like that there is a nice spread in the timings. It seems as though each of these builds might have some use, depending on what you think your opponent’s timings are and on how you plan to follow up.
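As a rough illustration of how an opponent model might use these numbers, here is a minimal sketch that maps an observed hatch frame to the nearest measured build. The function name and the nearest-neighbor rule are assumptions for illustration, not Steamhammer's actual code, and the sketch assumes you already know the frame at which the zerglings hatched (in practice you usually see them later and must subtract travel time).

```python
# Frame at which the first zerglings hatched, measured on Heartbreak Ridge
# (values copied from the table above).
POOL_TIMINGS = {
    "4 pool": 2630,
    "5 pool": 2750,
    "6 pool": 2920,
    "7 pool cutting drones": 2990,
    "7 pool going to 9 drones": 3120,
    "8 pool": 3160,
    "9 pool": 3230,
}


def classify_pool(first_zergling_frame):
    """Return the build whose measured hatch frame is closest to the observation.

    Hypothetical helper: a real opponent model would also allow for map
    differences and game-to-game variation.
    """
    return min(
        POOL_TIMINGS,
        key=lambda build: abs(POOL_TIMINGS[build] - first_zergling_frame),
    )


print(classify_pool(2700))
```

The 7 pool with drones and the 8 pool are only 40 frames apart, so a lookup like this cannot reliably tell those two apart.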
Steamhammer has had its first crash loss on SSCAIT since May. Or it may have overstepped the time limit. Randomhammer protoss has sometimes broken the time limit, but this is the first for zerg in a long time.
The game is Steamhammer-UPStarcraftAI 2016 (a zerg rushbot), and it looked like a routine Steamhammer win until 5:19 in, when suddenly bam, the 0/0 supply that means crash. Steamhammer has won dozens of these rushbot games without incident.
It’s a mystery, and there doesn’t seem to be any information to start from.
Steamhammer does still have crashing bugs despite my rigorous eradication campaign. AIIDE 2017 reported 4 crashes out of 2964 games, a rate of about one per 740 games. The current SSCAIT version now has 1 crash in 385 games, which is not statistically different. I found a crash yesterday with a stress test where I forced Steamhammer to play an opening that caused it to lose bases and struggle to recover. After many games, I got one crash. I think I fixed the cause, but the crash is rare and difficult to reproduce, so....
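One way to check the "not statistically different" claim is a two-sided Fisher exact test on the two crash counts. This is a sketch using only the standard library; the function name is made up for illustration, and it is not necessarily the test behind the statement above.

```python
from math import comb


def fisher_exact_two_sided(a, b, c, d):
    """Two-sided Fisher exact p-value for the 2x2 table [[a, b], [c, d]]."""
    row1, row2, col1 = a + b, c + d, a + c
    denom = comb(row1 + row2, col1)

    def prob(x):
        # Hypergeometric probability of x crashes landing in the first row.
        return comb(row1, x) * comb(row2, col1 - x) / denom

    lowest = max(0, col1 - row2)
    highest = min(col1, row1)
    observed = prob(a)
    # Sum the probabilities of all tables at least as unlikely as the observed one.
    return sum(
        p
        for p in (prob(x) for x in range(lowest, highest + 1))
        if p <= observed * (1 + 1e-9)
    )


# AIIDE 2017: 4 crashes in 2964 games. Current SSCAIT version: 1 crash in 385 games.
p_value = fisher_exact_two_sided(4, 2964 - 4, 1, 385 - 1)
print(p_value)
```

The p-value comes out well above the usual 0.05 threshold, which is consistent with treating the two crash rates as the same.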
The new bot Pineapple Cactus is named after a nice kind of cactus that grows in the Mojave Desert. If you think any cactus is nice, that is. For a brand new bot, it has a surprising variety of skills.
It also scouts the map with overlords in a careless way, so that the overlords die at a prodigious rate. The 45 minute long first game against terran turtle bot Johan Kayser shows all of these skills. Pineapple Cactus’s second game against Johan Kayser is its most successful game so far.
As I write, Pineapple Cactus has a score of 2 wins and 9 losses, and one of the wins is by crash. It has all those skills and it can barely make a dent in the opposition.
1. The bar is high: It’s tough to jump in with a new bot.
2. If you have weaknesses in basic skills, it doesn’t matter how many other skills you have.
What is a good framework for laying spider mines?
Steamhammer’s terran is much better with barracks units than with factory units. To catch up, it needs 2 skills. One is the ability to siege tanks non-ridiculously, and the other is the ability to lay spider mines in reasonable places. Right now it can’t lay mines at all.
It would be easy enough to code up a simple behavior like “lay mines along the path between the friendly and enemy main.” But I’m a little more ambitious. For terran and protoss, Steamhammer is primarily a starting point for authors to build their own bots. I want there to be enough structure that authors feel they can plug in their own mine-laying skills without having to figure out the details from scratch. I don’t think it would take much structure.
A brief reminder of some of the many uses of spider mines:
So what is a good framework for laying mines? I’m imagining something like a list of available behaviors and some way of telling when to use each, but I haven’t come up with a satisfying plan.
What would you do?
A funny game, a comedy of errors: Steamhammer versus Tyr terran by Simon Prins, on the map Tau Cross. Steamhammer opened with 2 hatch mutalisks, and Tyr bunkered itself in and went with infantry on one base.
Phase 1: Steamhammer fiddled around with its mutalisk force, picking off several building SCVs but mostly wasting time.
Phase 2: Steamhammer happened to notice that the terran mineral line was also a possible target. “Oh, look, terran has SCVs mining. I didn’t know that!” Tyr didn’t defend but moved out to counter instead and lost all SCVs. Steamhammer had 3 mining bases to 0, and only had to survive Tyr’s attack to win. Terran could not reinforce, so that was easy, right?
Phase 3: Instead of defending itself, Steamhammer decided to sacrifice every mutalisk against the meaningless bunker. It also wasted units piecemeal against the terran ball. Tyr destroyed the zerg main and natural. Steamhammer lost many drones by sending them to mine gas at bases that were under attack. That makes 3 big mistakes, more than enough to lose.
Phase 4: Tyr spread out to find the zerg 3rd base, found it after a few tries, and then—decided to send its army home to rest. “No zerg anywhere! Not that I’ll admit, anyway.” The marines in the picture saw the base, and that is when Tyr started to send its troops home.
Phase 5: Steamhammer slowly recovered from 7 drones at its third base. It restarted its tech from zero, put down 4 sunkens because it knew there was a scary army out there, and belatedly switched to lurkers. Steamhammer made far too many drones before moving out, but finally sent a lurker and won easily because Tyr had no detection. The mutalisks had also killed the comsat.
In the picture, nothing can stop the lurker, and more lurkers will come. You can see in the minimap that Steamhammer has retaken its main and natural and a mineral only base as well.
Both sides showed a lack of resilience. This time, Tyr turned out to be lacking a little more. Besides its opportunity to attack and win, Tyr had an unfinished science facility that it could have canceled (it’s in the upper right of the last picture). Then it would have had the resources to make an SCV and get back to mining.
Being unpredictable to your enemies has value. How can you do strategy learning and still remain unpredictable when you should? You can’t simply require randomness, because if one strategy dominates, then you should play it every time. At other times, you may benefit from playing 2 strategies equally, or from playing a normal strategy 95% of the time and otherwise rushing. It depends on what opponent strategies counter each of yours, and the n-armed bandit methods that bots use now don’t understand that. Here’s one way to do it. It’s a step up in complexity from UCB, but not a tall step.
You can record the results of your strategies and the enemy strategies in a zero-sum game matrix, and solve the strategy game (which is the subgame of Starcraft that involves choosing your strategy). In the first cut version, each cell of the matrix is “n times it happened that I played this and the enemy played that, and k of those were wins.” Take the observed probability of win for each cell of the game matrix as the payoff for that cell, and solve the game. The solution tells you how often you should play each of your strategies, assuming that the opponent chooses optimally.
There are a couple of different algorithms to solve zero-sum game matrixes fast. I personally prefer the iterative approximate algorithm (here is a simple Python implementation), but it doesn’t make much difference.
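As an illustration, here is a minimal sketch of fictitious play, one well-known iterative approximation for zero-sum matrix games. It is not necessarily the same algorithm as the linked implementation, and the function name and the iteration count are arbitrary choices.

```python
def solve_zero_sum(payoff, iterations=10000):
    """Approximate the row player's optimal mix by fictitious play.

    payoff[i][j] is my observed win probability when I play strategy i
    and the enemy plays strategy j. Returns (row_mix, value_estimate).
    """
    n_rows, n_cols = len(payoff), len(payoff[0])
    row_counts = [0] * n_rows
    # Cumulative payoff of each pure strategy against the other side's history.
    row_totals = [0.0] * n_rows  # I want the largest of these
    col_totals = [0.0] * n_cols  # the enemy wants the smallest of these
    row, col = 0, 0
    for _ in range(iterations):
        row_counts[row] += 1
        for i in range(n_rows):
            row_totals[i] += payoff[i][col]
        for j in range(n_cols):
            col_totals[j] += payoff[row][j]
        # Each side best-responds to the other's play so far.
        row = max(range(n_rows), key=lambda i: row_totals[i])
        col = min(range(n_cols), key=lambda j: col_totals[j])
    row_mix = [count / iterations for count in row_counts]
    # The true game value lies between these two bounds; report the midpoint.
    upper = max(row_totals) / iterations
    lower = min(col_totals) / iterations
    return row_mix, (upper + lower) / 2


# Two strategies per side, each winning only against one enemy choice.
mix, value = solve_zero_sum([[1.0, 0.0], [0.0, 1.0]])
print(mix, value)
```

When one of your strategies dominates, the mix collapses onto it, which is the "play it every time" case. When strategies counter each other in a cycle, the mix spreads out.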
If you recognize a lot of strategies on both sides, you’ll have many matrix cells to fill in, each of which requires some number of game results to produce a useful probability. 10 strategies for each side already means that a big AIIDE length tournament won’t produce enough data. For a first cut, I recommend recognizing only 2 or 3 categories of enemy strategies, such as (example 1) rush, 1 base play, 2 base play, or (example 2 for zerg) lings and/or hydras, mutalisks, lurkers. Since you’re grouping enemy strategies into broad categories, you don’t need much smarts to recognize them.
You can group your own strategies in a completely different way, if you like. There’s no reason to stick to the same categories. Also, your bot presumably knows what it is doing and doesn’t need to recognize game events as signifying that it is following a given class of strategy.
In this method, you are assumed to choose your strategy before you scout, or at least ignoring scouting information. You can take your time to recognize the enemy strategy, and base the recognition decision on anything you see during the entire game.
How do you get started learning? You might want to start with a matrix of all zeroes and only use the game matrix for decisions after you’ve gathered enough data. Instead, I suggest keeping a global matrix alongside the ones for each opponent, with floating point game counts and win counts in each cell. The global matrix has the totals for all opponents. (Or maybe there’s a global matrix for each opponent race.) When you face a new opponent, initialize the new opponent’s matrix with scaled down game counts and win counts from the global matrix, as if only a small number of games had been played in total (I suggest 1 to 3 times the number of cells in the matrix as a try). You’ll start out playing a strategy mix that is good against the average opponent, and as you accumulate data the mix will shift to specifically counter this opponent.
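The seeding step might look something like the sketch below. The function names, the nested-list representation, and the neutral 0.5 payoff for cells with no data are all assumptions for illustration; the default of 2 games per cell sits inside the suggested range of 1 to 3.

```python
def seed_opponent_matrix(global_games, global_wins, games_per_cell=2.0):
    """Scale the global counts down so they total games_per_cell * number of cells.

    Both arguments are nested lists indexed [my_strategy][enemy_strategy].
    Returns (games, wins) as new nested lists of floats.
    """
    n_cells = len(global_games) * len(global_games[0])
    total = sum(sum(row) for row in global_games)
    if total == 0:
        scale = 0.0
    else:
        # Never scale up: with little global data, use the counts as they are.
        scale = min(1.0, games_per_cell * n_cells / total)
    games = [[g * scale for g in row] for row in global_games]
    wins = [[w * scale for w in row] for row in global_wins]
    return games, wins


def payoff_matrix(games, wins, default=0.5):
    """Observed win rate per cell; empty cells get a neutral default."""
    return [
        [w / g if g > 0 else default for g, w in zip(game_row, win_row)]
        for game_row, win_row in zip(games, wins)
    ]


games, wins = seed_opponent_matrix([[40, 60], [100, 0]], [[30, 15], [50, 0]])
print(payoff_matrix(games, wins))
```

Scaling leaves the win rates unchanged and only shrinks the weight of the prior, so each real game against the new opponent moves the numbers noticeably.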
There are tons of ways to fancy it up if you want to try harder. You could try a variant where you estimate the enemy’s choice probabilities instead of assuming the enemy plays optimally (you’ll need a different solution algorithm). You can keep a larger game matrix in parallel with the small one, and switch to it when you’ve accumulated enough data. Or use a hierarchical method that unfolds categories when there is enough data to distinguish them. You can try a more complicated Bayesian game solution algorithm, which realizes that the numbers in each cell are empirical approximations and takes that into account (“oh, this cell doesn’t have many games, better not rely too strongly on its value”). You can include scouting information in the strategy decision (“well, I can see it’s not a rush, so strike out that option for the opponent”). You can divide your notion of strategy into any fixed number of aspects, and keep independent matrixes for each aspect, so that your strategy choices are potentially random in many different dimensions. The sky is the limit.
In my post about independent control for unit micro I mentioned that Tscmoo can “lose squad coherence” without explaining what that is. I meant that Tscmoo likes to spread its units out instead of keeping them in a bunch, and sometimes units that start out working together end up separated and unable to cooperate. Here are a couple games from today to illustrate. I think the behavior is especially clear with zerg, so these are games of Tscmoo zerg.
First a game to show what Tscmoo gains by scattering units over the map, Tscmoo zerg versus Wuli on Heartbreak Ridge. Tscmoo opened in a way that left itself vulnerable to Wuli’s zealot rush. When the first 4 zealots arrived, Tscmoo had 4 zerglings to try to keep its morphing natural alive. Wuli already had 2 more zealots on the way, so it was army supply 12 versus 2. It looked like the natural would fall for sure.
Well, the fight went on for a while, but Tscmoo’s basic method was to run away and lure zealots out of position, then mob any zealots that strayed and could be locally outnumbered. Wuli didn’t reinforce properly and let its units get distracted, so that its powerful army went on goose chases and accomplished little. In the picture, zealots are chasing a zergling away from the natural. Notice the yellow dots on the minimap; Tscmoo is scattered around.
Eventually mutalisks came out and zerg won. Tscmoo prevailed because its willingness to run away in different directions confused Wuli, which did not know how to concentrate its forces on a vulnerable point like the natural hatchery. Goose chases confuse a lot of bots, including Steamhammer (so far), and they are responsible for a lot of Tscmoo’s resilience when facing defeat.
The image I had in mind when I wrote “loses squad coherence” was a Tscmoo group coming under pressure and forced to retreat. Human players like to have lines of retreat if they are pressed back, so they can keep their units together. Tscmoo likes to form a giant concave and doesn’t seem to pay attention to lines of retreat. As the concave is forced farther back, different units may retreat through different exits so that what was once a group fighting together breaks up into subgroups that the enemy may be able to defeat in detail, or ignore and bypass. There is no longer one coherent squad.
The game Tscmoo zerg versus Krasi0 is not as clear an example as the last game, but it shows what I mean. Tscmoo went for a hydralisk all-in and laid on a punishing attack. The game was decided when Krasi0 held with strong tank placement. In the picture, Tscmoo’s retreating hydralisk group is forced off the central plateau on Jade and breaks up as I described above. The hydras took different exits and did not join up again.
Goose chases do not confuse Krasi0. Krasi0 goes for the throat.
• A medic can blind a zerg egg, and the unit that hatches will be blind. You might have thought that any eyes were hidden away inside the egg.... If 2 zerglings or 2 scourge hatch, they will both be blind. It’s a cute detail that has no practical use whatsoever. Ensnare and other effects work similarly with eggs, but it’s not as funny.
• If a science vessel irradiates a high templar, does the high templar have any chance to live? It could merge into an archon; that saves it. There is another way. If the high templar is next to a full shield battery, the battery has enough energy to save the templar by repeatedly charging its shields (the battery is left nearly empty). I don’t think I’ve ever seen a high templar get irradiated in a game. Of course it’s the same for other biological protoss units besides the high templar, but why would you save another unit?
• A medic can heal biological units of any race. Zealots + medics make a fearsome combination. Ultraling + medics is nasty too, but harder to use well; healing close to the 400 hit points of one ultralisk will drain a medic, and the zerglings are too many and don’t stay put. These things come up in some kinds of nonstandard games. One day I wondered, “Wouldn’t medics go best of all with mutalisks?” But it turns out that medics can only heal ground units.
What’s your favorite curious and useless trivia about the game mechanics?
I notice a lot of bot authors want to control each unit independently for micro decisions, to make it an agent with a mind of its own. There are good reasons to write a bot that way.
It doesn’t make sense to me. My micro plans in Steamhammer call for reducing the independence of units, calculating more things centrally at the squad level and leaving less for individual units to decide. For focus fire and avoiding overkill, I want a data structure in the squad to keep track of assignments of shooters to targets. For kiting and fleeing, I want a near-term prediction of when each of my units is likely to die, based on the rate it is taking damage or on the approach of enemy melee units. A planning algorithm can look at all the data to decide, “You, shoot at that” and “You, pull back.” It’s not as simple, but I expect it will have a higher ceiling in practice, because it is easy to change out the planner. And I think that, done right, it can execute fast. Units act frame to frame, but the squad-level planner needs to execute only about once per cooldown period, and it can be out of synchrony with unit actions.
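To make the focus fire idea concrete, here is a minimal sketch of a squad-level table that assigns shooters to targets and stops assigning once a target is expected to die. It is a greedy toy, not Steamhammer code: the names are hypothetical, and it ignores weapon range, cooldowns, armor, and damage types.

```python
def assign_targets(shooters, targets):
    """Greedy squad-level assignment of shooters to targets without overkill.

    shooters: dict of shooter id -> damage per shot
    targets:  dict of target id -> remaining hit points
    Returns a dict of shooter id -> target id.
    """
    if not targets:
        return {}
    remaining = dict(targets)
    # Weakest targets first, so that kills come as early as possible.
    order = sorted(remaining, key=remaining.get)
    assignment = {}
    for shooter, damage in shooters.items():
        # Targets not yet expected to die from shots already assigned.
        live = [t for t in order if remaining[t] > 0]
        # If everything is covered, extra shooters pile onto the toughest target.
        target = live[0] if live else order[-1]
        assignment[shooter] = target
        remaining[target] -= damage
    return assignment


hydras = {"h1": 10, "h2": 10, "h3": 10, "h4": 10}
zealots = {"z1": 20, "z2": 100}
print(assign_targets(hydras, zealots))
```

Because the table lives in the squad, swapping the greedy rule for a smarter planner does not touch the per-unit code.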
That’s general micro, but there are also important micro skills that bots don’t have and would benefit from. I’ve never seen a bot do any of these effectively.
Zergling surrounds. Watch a bot attack zealots with zerglings: Each zergling heads for where the zealot is now, and the lings pile up. With fast zerglings and slow zealots, if the zealots flee then the zerglings end up trailing behind and getting a hit in now and then, and mostly waste their time getting in each other’s way. Arrakhammer does modestly better by predicting the zealot’s future location, in the react-to-the-future style. Then watch a human do it: The zerglings move into or around the formation of zealots, then attack all at once. If the zealots are fleeing, they get surrounded and have to find or make a way out before they can escape. Any intermediate zerg player knows this micro.
I’m seriously considering implementing zergling surrounds in Steamhammer with human-like control, at least as a first cut. It’s 2 commands total for small fights with up to one control group of zerglings: Command lings to move as a group, command lings to attack as a group. Choosing the destination for the move command is the tricky part.
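One naive way to pick the move destination, sketched here as a starting point only: predict where the enemy will be, then aim a little past that point along the line from the zerglings. The function, its parameters, and the default numbers are all guesses for illustration, and the sketch ignores terrain entirely.

```python
import math


def surround_destination(ling_center, enemy_pos, enemy_velocity,
                         lookahead_frames=24, overshoot=64.0):
    """Pick a group move target beyond the enemy's predicted position.

    Positions are (x, y) in pixels and velocity is in pixels per frame.
    The lings are meant to run through or around the enemy before attacking.
    """
    # Where the enemy will be if it keeps moving as it is now.
    px = enemy_pos[0] + enemy_velocity[0] * lookahead_frames
    py = enemy_pos[1] + enemy_velocity[1] * lookahead_frames
    dx, dy = px - ling_center[0], py - ling_center[1]
    dist = math.hypot(dx, dy)
    if dist == 0:
        return (px, py)
    # Continue past the predicted point in the direction of travel.
    return (px + dx / dist * overshoot, py + dy / dist * overshoot)


print(surround_destination((0, 0), (100, 0), (2, 0)))
```

The attack command would follow once enough zerglings have passed the enemy. Deciding that moment is a separate problem that this sketch does not address.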
Dragoon backstepping. Like most bots with a combat simulator, Steamhammer mostly attacks or retreats, and doesn’t know anything in between. But watch what happens in a human game when a strong terran early timing attack moves out and faces forward dragoons: The dragoons will lose if they stand and fight, they must step back... and back.... But if vultures pull too far in front of the tanks, or if any sloppy movement isolates a few terran units, the dragoons may suddenly stop and fire. The terran force must move slowly or be whittled down—usually some of both. Protoss gains time for more dragoons to take the field.
Related situations are common. FAP doesn’t understand vultures versus zealots, so if there are too many zealots, Steamhammer’s vultures will retreat in terror instead of pulling hit-and-run attacks. If an opportunity comes up, take it even if you weren’t intending to fight. I think the decision is best made at the squad level, so that the squad behavior stays coherent. I often see Tscmoo lose squad coherence and disintegrate; I want to avoid that.
Getting out of each other’s way. A large group of ranged units arrives to fight—hydralisks, say. The closest hydras get into range and stop to fire, and the ones behind maneuver around the sides to get into range (that is inefficient already). The ones behind those... push and shove, but can’t get into range and achieve nothing. Any human above beginner level will move the front hydras forward so that all can get into range, but bots don’t.
Climbing a ramp is similar. Bots see targets at the top of the ramp and stop to hit them, blocking the way so that the rest of the force is stuck below. Steamhammer is clever enough not to siege tanks on the ramp, but sieges them above the ramp and blocks the path anyway. Humans keep the front units moving until the whole force can get up. The difference in DPS is huge.
These decisions can be made by individual units, swarm intelligence style, but I find it hard to design good emergent behaviors. I think it is better to make the decisions at the squad level. It should be more efficient and more effective.
As its opponents have long since noticed, ForceBot has become a 5 pooler. It does an economic followup, so there is some interest.
The picture shows Oyvind Johannessen’s base from ForceBot versus Oyvind Johannessen. Random Oyvind Johannessen ended up zerg, and was ready for the rush. Unfortunately, the defending zerglings chose to follow a circling overlord, so Oyvind lost its opportunity to return the pressure.
ForceBot is trying to catch up in workers. As you can see in the minimap, it is 3 bases versus 2. Although it was ZvZ, neither side played aggressively, and the game went on for 45 minutes. It was fun, for a low-level bot game.
Oyvind barely defended itself while the zerglings were suffering from overlord fascination, and ForceBot started to dominate. ForceBot went hydra-lurker and took the map, rather a change from the 5 pool opener.
Oyvind understood how to hold on in a desperate situation. See all the blood? ForceBot could not coordinate an attack to climb the ramp. Oyvind is long-distance mining its natural and eventually retook it, while ForceBot went passive and sat around with its maxed army. The game ended when Oyvind mined out its natural and crashed. I expect that ForceBot would have won on points after the game time ran out.
This version of ForceBot seems to play similarly against all races: 5 pool, try to build an economy, lair, hydra-lurker. It’s quite different from the AIIDE 2017 ForceBot. It seems to have some added defensive smarts; it doesn’t appear to make pre-emptive sunkens willy-nilly any more, at least not often.
Stealing gas is irritatingly complicated in Steamhammer. Today I discovered another bug: If the scout worker has been ordered to circle the enemy base forever, and while it’s doing that you decide (this late in the day) to steal gas, then the building manager goes haywire and moves the worker into a corner or some other useless location. Now what??? It worked when I told it to scout once around the enemy base and interrupted the circle in the middle by deciding to steal gas, but circling more than once somehow breaks it... in a different module.
Of course everything works perfectly if you decide up front to steal gas, so the worker arrives at the enemy base knowing what to do. That’s the usual case. It sure would be nice to support late decisions, but there are interactions with the scouting command.
I think I should write a general purpose plan sequencer and reduce the gas steal to a plan represented as data, not code. It will be more complicated, but I only have to debug it once and it will have tons of uses.
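To show what "a plan represented as data" could mean, here is a minimal sketch of a sequencer that walks a list of steps, each an action name paired with a completion test. Everything in it is hypothetical, including the class, the step names, and the state keys; it is not a design that exists in Steamhammer.

```python
class PlanSequencer:
    """Runs a plan given as a list of (action_name, is_complete) steps."""

    def __init__(self, steps):
        self.steps = steps
        self.index = 0

    def done(self):
        return self.index >= len(self.steps)

    def update(self, state):
        """Call once per frame; returns the action to carry out, or None when done."""
        # Skip past every step whose completion test already passes.
        while not self.done() and self.steps[self.index][1](state):
            self.index += 1
        if self.done():
            return None
        return self.steps[self.index][0]


# The gas steal written as data. Step names and state keys are made up.
gas_steal = PlanSequencer([
    ("move_to_enemy_geyser", lambda s: s["at_geyser"]),
    ("build_extractor", lambda s: s["extractor_started"]),
])

state = {"at_geyser": False, "extractor_started": False}
print(gas_steal.update(state))
state["at_geyser"] = True
print(gas_steal.update(state))
```

Because each step carries its own completion test, a plan started late simply skips the steps that are already satisfied, which is the late-decision case that is causing trouble now.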
Last February, I made a few predictions about skills that bots would gain in 2017.
In 2017 I predict multiple bots with new skills of taking island expansions and carrying out doom drops. For Steamhammer I also plan deeper map analysis so it can do things like backdoor attacks when the map allows it. The tactics and strategy in bot play will grow more complex, and everybody will be scrambling to keep up.
Well, the vague forecast that tactics and strategy will improve came true. None of the specific skills has been implemented, as far as I have seen: No island expansions, no doom drops, no backdoor attacks. Steamhammer progress is behind my expectation (what else is new?); I am still planning the map analysis, but it won’t happen this year.
There’s still time. Help make my predictions come true!
In AIIDE 2017, the tournament manager launched some games that did not start. These games were recorded with duration 0 and score 0 for both sides, and were ignored in the official tally. In the detailed results HTML page, the games are listed as crashes with the crashed player being “unknown”. I think of these games as unattributed crashes: If one bot identifiably crashed, then that bot lost the game. But some games failed without either bot crashing in a way that the tournament manager recognized and attributed to the bot, and those games had to be skipped.
And yet, looking at how often bots appeared in “unknown” crash games, there is one obvious conclusion. The % column here is the percentage of unattributed crash games that the bot participated in. Each unattributed crash game has 2 participants, so the percentages add up to 200% before rounding (even though the column total says 100%).
bot | crashes | % |
---|---|---|
ZZZKBot | 4 | 2.20% |
PurpleWave | 7 | 3.85% |
Iron | 5 | 2.75% |
cpac | 7 | 3.85% |
Microwave | 8 | 4.40% |
CherryPi | 4 | 2.20% |
McRave | 6 | 3.30% |
Arrakhammer | 7 | 3.85% |
Tyr | 4 | 2.20% |
Steamhammer | 6 | 3.30% |
AILien | 4 | 2.20% |
LetaBot | 15 | 8.24% |
Ximp | 8 | 4.40% |
UAlbertaBot | 2 | 1.10% |
Aiur | 5 | 2.75% |
IceBot | 15 | 8.24% |
Skynet | 12 | 6.59% |
KillAll | 5 | 2.75% |
MegaBot | 168 | 92.31% |
Xelnaga | 8 | 4.40% |
Overkill | 12 | 6.59% |
Juno | 8 | 4.40% |
GarmBot | 9 | 4.95% |
Myscbot | 6 | 3.30% |
HannesBredberg | 6 | 3.30% |
Sling | 7 | 3.85% |
ForceBot | 10 | 5.49% |
Ziabot | 6 | 3.30% |
total | 182 | 100% |
With these numbers in hand, the great majority of unattributed crashes can be attributed after the fact to MegaBot. MegaBot may have a bug that sometimes breaks the tournament infrastructure. Likely the bug is in the infrastructure itself, and MegaBot happens to tickle it—and other bots do too, though less often.
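For anyone who wants to recompute the % column, here is a short sketch that uses the counts from the table. Only the variable names are new.

```python
# Unattributed crash games each bot appeared in (copied from the table above).
crashes = {
    "ZZZKBot": 4, "PurpleWave": 7, "Iron": 5, "cpac": 7, "Microwave": 8,
    "CherryPi": 4, "McRave": 6, "Arrakhammer": 7, "Tyr": 4, "Steamhammer": 6,
    "AILien": 4, "LetaBot": 15, "Ximp": 8, "UAlbertaBot": 2, "Aiur": 5,
    "IceBot": 15, "Skynet": 12, "KillAll": 5, "MegaBot": 168, "Xelnaga": 8,
    "Overkill": 12, "Juno": 8, "GarmBot": 9, "Myscbot": 6,
    "HannesBredberg": 6, "Sling": 7, "ForceBot": 10, "Ziabot": 6,
}
total_games = 182

# Each game has 2 participants, so appearances sum to twice the game count.
assert sum(crashes.values()) == 2 * total_games

share = {bot: 100.0 * n / total_games for bot, n in crashes.items()}
games_without_megabot = total_games - crashes["MegaBot"]

print(round(share["MegaBot"], 2), games_without_megabot)
```

The subtraction leaves 14 unattributed crash games in which MegaBot did not play, so MegaBot cannot account for all of them.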
As a side effect, MegaBot’s official score could be considered too high. If we see the unattributed crashes with MegaBot as “MegaBot’s fault,” then the games should not be skipped in the results, but counted as wins for the opponent and losses for MegaBot. The change is unfair, though: Even if the bug is in MegaBot, which we do not know, then surely not all of the unattributed crashes are due to MegaBot. Other bots or the infrastructure must be responsible for some.
Running a big tournament is hard....
I’m still going through AIIDE 2017 games in which Steamhammer lost to weaker opponents. Steamhammer scored 98-12 against ICEbot. I thought this game on Tau Cross was full of depressing and/or informative mistakes.
The biggest mistake is that Steamhammer didn’t usually scout or react to enemy expansions. That is fixed in the development version. I think the new version would have won this game easily, despite the second mistake.
The second and interesting mistake came when ICEbot tried one of its 4 goliath drops.
Steamhammer’s notion of “this base is under attack” is not accurate. One goliath wandered outside Steamhammer’s danger recognition zone and found itself still in weapons range. It killed dozens of drones and eventually one of the hatcheries before dying. In the picture we see the totally non-dangerous goliath killing drones while Steamhammer does nothing because “the base is safe now.”
This weakness is not fixed. I’ve seen it before. The base defense code is inherited from UAlbertaBot and needs a bigger rewrite than I’ve given it.
ICEbot was playing weakly too, though, or else it would have scored more than 12 wins in 110 games. Zerg recovered from the huge deficit and made a fight of it. The game timed out after an hour with the map mined out and the winner still undecided. ICEbot was given the victory on points.
Steamhammer’s losses against ICEbot followed a curious pattern, by the way. Steamhammer lost in rounds 0, 2, 4, and 6. Then in the rest of the tournament, which ran to round 109, it lost 8 more games. Neither bot was learning, so losing every second game early on was pure coincidence.
I thought this picture was funny. In a game yesterday against Iron, McRave fast expanded and walled off one of the bridges to its natural, but left the other bridge open. Vultures cannot pass the wall, but they happily took the other bridge and Iron won easily.
McRave seemed to mess up its build, getting dragoons too late. (The underlying mistake may have been taking gas at the natural instead of the main.) But the idea is sound. McRave is narrowing the entrance to its natural, making it easier to defend against vultures. A small number of dragoons can block a narrow entrance, either early in the game when there are only a few dragoons on the map, or later when the main army is away and the only dragoons around are those rallied from the gateways.
Its next game against Iron was on Roadrunner. McRave won using the same idea of narrowing the entrance.
McRave’s dragoons beat Iron, so they played well by bot standards. But I see room for improvement. Vultures can pass either left or right of the upper gateway. I think the first 2 dragoons should plug the gaps so that vultures are physically blocked. Iron will lay mines to force its way in, and then the dragoons can retreat to the defense line McRave actually held, the gap between the lower gateway and the nexus. The farther away the vultures and their mines are kept from the probes, the better.