Entries by Jay Scott | Starcraft AI blog

what major feature should be next?

After map analysis in the upcoming Steamhammer 2.2, what major feature should I work on next? I haven’t decided. I need to make progress on all fronts, but the largest features should be done one at a time so that they don’t step on each other’s toes. My hope is to have a powerful major feature ready in time for AIIDE 2019 in the fall, a version that I can call Steamhammer 3. As always, I’ll also do many minor features and fixes and stuff. Before I start in, I expect at least a Steamhammer 2.2.1 or 2.3 version with more map analysis features, such as support for finding paths through nydus canals, or at least better static defense placement.

1. Strategy adaptation. This has a lot of parts, and calls for adding judgment skills that don’t exist yet. As phase 1, I would create abstract openings in a general format, and allow the current concrete opening lines (which specify an exact build order) as implementations of the abstract openings. If no concrete opening is known for a given abstract opening, or if the situation changes and the intended opening has to be adapted or abandoned, Steamhammer will have to make up a new concrete build order for the situation. As phase 2, I’ll have Steamhammer collect data on what works. For making the initial opening choice (and for later decisions), I’ll drop the current hand-coded probabilities and rely on the data, so that Steamhammer’s choices will be empirically grounded. At this point, Steamhammer will have far more flexible reactions during the opening builds. As phase 3, I’ll extend the abstract openings to choices of abstract strategy over the whole game. At this point, instead of following hand-written opening lines or hand-coded rules, Steamhammer will weigh decisions: I see the cannons. I can bust them with units, or fly around them, or take the opportunity to grow my economy. Which is better this time, hydras or mutas or drones? Steamhammer currently knows openings for these 3 possibilities, but if it is following its strategy rules, the rules always say to make drones.

2. Operations boss. I want to dump CombatCommander and replace its functions with OpsBoss, which currently exists mainly as a stub. The combat commander has a largely fixed set of squads, and its main job is to assign units and give orders to each squad. The ops boss will have goals and plans to achieve the goals. It will make up squads more dynamically, and assigning units will be the tail of its work. The ops boss will be able to carry out multi-step plans like taking island bases (transport a worker, mine out the blocking mineral patch, take the base, transfer workers, add defenses, return workers when the base is mined out) or complex drops, and in general will do more varied and interesting stuff.

3. Squad structure—effectively, the tactics boss. I think the Squad class needs to be completely rewritten; it is not powerful enough to represent all the behaviors that a squad needs. See awkward points and design ideas. As part of this, I would implement formations including large-scale surrounds and some level of support for different kinds of unit coordination: Vultures screen the tanks, dragoons leave a gap for the reaver to shoot through, flying detectors and arbiters maneuver to do their jobs better, etc.

what to do?

Micro still needs tons of work, but I think I can treat micro improvements as a sequence of minor features that I can tackle one at a time. Building and unit placement is important. Another vital subsystem that needs rework is production. All aspects need to be moved under the production goal system, which will improve macro for all races and let me dump BOSS for terran and protoss—and also ties in with strategy adaptation. It’s a necessary change, but ideas 1, 2, and 3 above are also necessary and seem more likely to win tournaments this year.

To me, strategy adaptation and the ops boss are the most tempting. Both make play more fun and interesting, and potentially much stronger. Strategy adaptation would stick to my original strategy-first development plan. From a development efficiency point of view, it is logical to work on squad structure before the ops boss, or at the same time because they interact and their needs are interrelated. But if I work on them together, I might not finish them this year. It’s possible that I could decide to do selected parts of more than one idea.

What do you think? One of these, or something else entirely?

funny map analysis picture

It turns out there are a lot of ways to calculate regions and chokes. In the course of putting one together, I’ve been doing some other map analysis that should be useful for micro and pathfinding. Here is one that doesn’t work yet, a color-coded debug drawing which is supposed to show the room available around each walk tile: How much space is there for a unit or army to fit into? If you know a path, you can check the tiles to find out which units are small enough to travel the path. Or you can figure out how many of your units fit behind the enemy mineral line—should you go there?

Unfortunately, it’s no good as it stands. Among other mistakes, it claims there is no room in places where there obviously is. It makes a funny picture, though.

Like BWEM, I also calculate the distance from the nearest unwalkable tile. Iron makes good use of that information. That code worked correctly on the first try....
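For readers who want to see the idea concretely, here is a minimal sketch of a "distance from the nearest unwalkable tile" calculation. It is not Steamhammer's or BWEM's code. It assumes the map is given as a plain grid of booleans (a hypothetical `walkable[y][x]` layout) and uses a multi-source breadth-first search with 8-way neighbors, so the result is a distance in tile steps rather than a true Euclidean distance.

```cpp
#include <cassert>
#include <queue>
#include <utility>
#include <vector>

// Hypothetical grid layout: walkable[y][x] is true when the walk tile is walkable.
// Returns, for every tile, the number of 8-connected steps to the nearest
// unwalkable tile (0 for unwalkable tiles, -1 if the map has none at all).
std::vector<std::vector<int>> distanceToUnwalkable(
    const std::vector<std::vector<bool>>& walkable)
{
    const int h = static_cast<int>(walkable.size());
    const int w = h > 0 ? static_cast<int>(walkable[0].size()) : 0;
    std::vector<std::vector<int>> dist(h, std::vector<int>(w, -1));
    std::queue<std::pair<int, int>> frontier;

    // Seed the search with every unwalkable tile at distance 0.
    for (int y = 0; y < h; ++y) {
        assert(static_cast<int>(walkable[y].size()) == w);  // rectangular grid
        for (int x = 0; x < w; ++x) {
            if (!walkable[y][x]) {
                dist[y][x] = 0;
                frontier.push({x, y});
            }
        }
    }

    const int dx[8] = {1, -1, 0, 0, 1, 1, -1, -1};
    const int dy[8] = {0, 0, 1, -1, 1, -1, 1, -1};

    // Expand outward one ring at a time; first visit gives the shortest distance.
    while (!frontier.empty()) {
        const std::pair<int, int> cur = frontier.front();
        frontier.pop();
        for (int i = 0; i < 8; ++i) {
            const int nx = cur.first + dx[i];
            const int ny = cur.second + dy[i];
            if (nx < 0 || ny < 0 || nx >= w || ny >= h) continue;
            if (dist[ny][nx] != -1) continue;
            dist[ny][nx] = dist[cur.second][cur.first] + 1;
            frontier.push({nx, ny});
        }
    }
    return dist;
}
```

A tile with a larger value has more open space around it, which is the kind of information a bot could use to judge whether a path is wide enough for a given unit.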

which weaknesses are critical?

The tournament version Steamhammer 2.1.4 suffers from a command jam bug which reissues commands far too often, causing many to be dropped. The effects are devastating: units ignore their orders, freezing in place, wandering past the enemy while taking fire without noticing, and so on. The effect often starts before the zerg supply reaches 50, and it grows worse as supply increases. By the late game, large groups of units are sitting uselessly around the map. It’s a critical bug and intolerably severe.

But how critical is it really? I look at every game that Steamhammer plays. Based on tournament losses, I estimate that if I had fixed the bug before SSCAIT started, Steamhammer’s rank would not be #10 as now, but #7—not much gain considering how closely the ranks are spaced, only a few percentage points up in win rate.

How can such a calamitous bug have so little practical effect? In Steamhammer’s early days, one version had a bug that subtly caused building construction to be delayed. I doubt any stream viewer noticed; I didn’t notice either, until surprised by unexpected losses. Experience and test games proved that it was a critical bug that caused a high rate of losses against early aggression. By comparison, the command jam bug is identifiable as the cause of loss in only a few games, like the one loss against XIMP by Tomas Vajda. In other games where the bug struck hard, as against ICEbot and MadMix, Steamhammer struggled more than it should have but won regardless.

Apparently the bug causes losses only against a narrow range of opponents which play macro games and are strong enough to exploit the weak play that the bug causes. There is no effect against a strong opponent like SAIDA, or against the weakest opponents which lose to Steamhammer’s first 6 zerglings. One explanation is that most opponents either prefer early aggression, or else fall to Steamhammer’s early aggression. Another explanation is that I may underestimate the damage the bug causes; maybe it leads to losses that are not clearly attributable.

forge expand reaction

Why is it still called “forge fast expand”? It was a fast expansion when invented, but by today’s standards it’s not fast at all. That’s why I say “forge expand.” (It has the same number of syllables as FFE.)

Though I still have region work to do, today I decided to make an important improvement to how Steamhammer reacts to forge expand and other safe macro openings. The tournament Steamhammer makes 3 attempts to adapt: 1. If the enemy’s opening plan was predicted, it tries to select a good counter opening. 2. Otherwise, having missed the prediction and gone down a poor path, it cancels any planned static defense which is now unnecessary, and 3. makes extra drones to catch up in economy. If it’s still in its opening book, it stays the course and tries to minimize the disruption by changing planned zerglings into drones, which cost the same.

Its plans are still disrupted, though, because the extra drones and the omitted static defense cause minerals to build up. Steamhammer waits until the opening is over before it makes macro hatcheries and otherwise spends down its excess resources, and that is often too late. Zerg can’t keep up with the enemy’s economy and falls far behind.

Today I added 2 new reactions that happen when we want extra drones so that resources threaten to build up: 4. If possible, take gas early (or take another gas early). Putting drones on gas slows down mineral accumulation and may speed up tech openings, so that mutas or lurkers come out sooner. If gas accumulates too much, Steamhammer will stop gas collection, so there’s little downside. 5. Make extra hatcheries as conditions seem ripe. One or all of the extra hatcheries may be placed at expansions, depending on the situation. The rules are more cautious than the macro hatchery rules that apply once the opening is over, because they’re still trying not to disrupt the opening line. The overall effect of the new reactions is that Steamhammer pursues the tech of its opening line, sometimes faster since it has more income, gas, and larvas than expected, and transitions into the middle game in a stronger position. It’s making a big difference in test games, including wins from positions that were sure losses otherwise.

The fix is inspired by recent losses, especially the 2 losses to Skynet by Andrew Smith when it unexpectedly (to Steamhammer) switched from zealot rushes and DT rushes to forge expand. To my intuition, the forge expand reaction seems much less important than the command jam fix, which is a critical bug fix that affects far more games. And yet, taking into account test games and the rate at which Steamhammer was surprised by macro openings in the tournament, I estimate that it will save about 2/3 as many losses—in terms of improving elo, both seem almost equal. How does that happen?

Apparently you have to measure the severity of weaknesses, because intuition does not seem accurate. Unfortunately, to measure with an A/B test, first you need to fix the weakness. Maybe that is an advantage of machine learning, which does its entire job by measuring weaknesses and correcting them.

comparing strength across time

We don’t get many tournaments of bots versus humans. I don’t think there have been any with conditions controlled well enough that we can judge how strong bots are and how they are improving: Enough human participants, of known strength, with known levels of familiarity with computer play, finishing enough games. Then hold events across years so we can compare. We have to make do with seeing how bots are improving against other bots. Here is my best idea so far for comparing strength across tournaments.

1. We need 2 tournaments, preferably round robin, that share some participants—exactly identical bots, the more the better. We can’t do it with humans, because we can’t get exactly identical people across time. Ideally the maps should be the same too. AIIDE has more games, and SSCAIT has more shared participants; either should work, but I think SSCAIT may work better for this purpose despite being short by comparison. You could also compare between AIIDE and SSCAIT, but it would not work as well. It would take extra effort to make sure you know which players are exactly identical, and the different lengths of the tournaments means each provides a different amount of evidence to support the ratings, plus you could get confusing results for learning bots.

2. Pool all the games from both tournaments and compute elo ratings. If some participants which are not identical have the same names, distinguish them somehow—Steamhammer 2017 versus Steamhammer 2018, or whatever.

3. The identical players have identical strength in both tournaments, so consider their elo ratings as fixed. For each tournament separately, compute the elo ratings of the remaining players while keeping the ratings of the identical players fixed. The fixed ratings are benchmarks that keep the elo comparison stable for the remaining players (the idea has been used before).
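The three steps above can be sketched in code. This is only an illustration under stated assumptions, not a reference implementation: the post does not say how the elo ratings are computed, so the sketch fits ratings by matching each free player's expected total score to its actual score (a maximum-likelihood style fit), and the names `Player`, `Pairing`, and `fitRatings` are hypothetical. Players marked `fixed` keep their ratings, which is how the benchmark bots would be held constant in step 3.

```cpp
#include <cassert>
#include <cmath>
#include <cstddef>
#include <vector>

// Hypothetical input: player `a` won `winsA` of `games` games against player `b`.
struct Pairing { std::size_t a; std::size_t b; double winsA; double games; };

// A rating plus a flag saying whether it is a fixed benchmark.
struct Player { double rating; bool fixed; };

// Standard Elo expected score of a player rated ra against one rated rb.
double expectedScore(double ra, double rb)
{
    return 1.0 / (1.0 + std::pow(10.0, (rb - ra) / 400.0));
}

// Expected minus actual total score of player p if its rating were r.
double scoreGap(const std::vector<Player>& players,
                const std::vector<Pairing>& pairings, std::size_t p, double r)
{
    double gap = 0.0;
    for (const Pairing& g : pairings) {
        assert(g.games > 0.0 && g.winsA >= 0.0 && g.winsA <= g.games);
        if (g.a == p)
            gap += g.games * expectedScore(r, players[g.b].rating) - g.winsA;
        else if (g.b == p)
            gap += g.games * expectedScore(r, players[g.a].rating) - (g.games - g.winsA);
    }
    return gap;
}

// Repeatedly re-fit each non-fixed player by bisection while holding the
// others constant. Players with no games, or with only wins or only losses,
// drift to the search bounds, so real data needs extra care for those cases.
void fitRatings(std::vector<Player>& players,
                const std::vector<Pairing>& pairings, int sweeps = 200)
{
    for (int s = 0; s < sweeps; ++s) {
        for (std::size_t p = 0; p < players.size(); ++p) {
            if (players[p].fixed) continue;
            double lo = -2000.0, hi = 6000.0;
            for (int i = 0; i < 60; ++i) {
                const double mid = 0.5 * (lo + hi);
                // Expected score rises with rating, so a positive gap means "too high".
                if (scoreGap(players, pairings, p, mid) > 0.0) hi = mid; else lo = mid;
            }
            players[p].rating = 0.5 * (lo + hi);
        }
    }
}
```

One way to use it: run `fitRatings` once on the pooled games with a single arbitrary anchor, mark the shared identical bots as `fixed` at the ratings that come out, then run it again on each tournament's games separately to rate the remaining players.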

It’s the best way I’ve thought of to get strength comparisons across time. We can get a pretty accurate measure of how individual bots have improved: Steamhammer 2018 is this much above Steamhammer 2017. We can treat elo as a linear measure of strength (a given elo difference between two players always predicts the same win rate between them), so we can simply average together the ratings of any set of bots to compare: The top 16 are x points stronger this year, the protoss are y points stronger, the spread between best and worst has widened to....

I may do this analysis for SSCAIT once it finishes. It’s a bit elaborate, but I’m interested.

random Steamhammer notes

A few unrelated notes about Steamhammer:

The bug that causes Steamhammer to drop commands is due to a missing & in an inconspicuous declaration, causing a data structure to be copied instead of referred to. Updates are made to the copy instead of the correct data, then the copy is thrown away. Even after I deduced that something was being copied behind the scenes, it was tricky to nail the exact mistake.
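To show how small this kind of mistake is, here is a minimal sketch of the general pattern. It is not the actual Steamhammer code: the names `Tracker`, `OrderFrames`, `recordBuggy`, and `recordFixed` are made up for illustration. The only difference between the two functions is one `&` in a declaration.

```cpp
#include <cassert>
#include <map>

// Hypothetical bookkeeping: unit id -> frame on which its last order was issued.
using OrderFrames = std::map<int, int>;

struct Tracker {
    OrderFrames frames;
    OrderFrames& get() { return frames; }
};

void recordBuggy(Tracker& t, int unitId, int frame)
{
    assert(frame >= 0);
    OrderFrames copy = t.get();   // missing '&': this declares a copy of the map
    copy[unitId] = frame;         // the update lands in the copy...
}                                 // ...which is thrown away here

void recordFixed(Tracker& t, int unitId, int frame)
{
    assert(frame >= 0);
    OrderFrames& ref = t.get();   // '&' binds to the tracker's own map
    ref[unitId] = frame;          // the update is visible to later callers
}
```

Both versions compile without warnings and look nearly identical on the page, which fits the description of a mistake that is hard to pin down even once you suspect a hidden copy.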

I developed an opening that I feel I can properly call Fried Liverpool. Like the Fried Liver Attack from chess, mentioned in a comment, it’s crazy sharp and can put on tremendous pressure. Steamhammer can’t play it yet; it needs a couple new features. I tried it by hand and found it is effective against unprepared opponents. Maybe I’ll get it working in time for version 2.2.1 or thereabouts (the one after the upcoming 2.2).

Steamhammer just lost a game against the cannon bot Jakub Trancik. I don’t remember another loss against Jakub Trancik since early last year (maybe I have a bad memory). During the tournament I can’t log in to fetch the game records, but I assume it is the first game against this opponent since version 2.0. It takes a long time to collect enough game records. I’m glad I fixed the proxy recognition bug, or Steamhammer might have lost the next game too.

SSCAIT halfway mark

The SSCAIT round robin phase is halfway over, a good time to take stock. Plenty of places will yet change hands, but the ranks are starting to firm up. #1 SAIDA is on top with less than half the loss rate of #2 Locutus. #3 Iron and #4 PurpleWave follow, close to each other. Then comes a gap, and the rest of the possible top 16 are more closely spaced.

#3 Iron, #7 Krasi0P, and #8 Skynet by Andrew Smith are performing better than I expected. Iron is still reliable at rolling up the lower end. Krasi0P’s game plan of cannon push into dark templar into carriers is more effective than I anticipated: each step sets the opponent a different problem. Overall, it’s striking how much stronger the field is this year than last. #13 Bereaver and #14 Tscmoo protoss have been pushed well down the rankings. When the round robin is over, I want to do an analysis to compare the win rates of unchanged bots; SSCAIT has many more unchanged entrants than other tournaments.

#9 Steamhammer looks likely to finish at 9 or 10 (the middle of my forecast range of 7 to 13 or one slot above). Most of Steamhammer’s losses are unexpected; it’s annoying, but the same happened last year so it’s not a surprise. In upsets, #2 Locutus lost a game to #36 Ecgberht, and #3 Iron lost a game to #49 AIUR by Florian Richoux—everybody has surprise results, except for SAIDA whose only 2 losses are to high-ranked opponents. Even tail-ender #72 FergalOGrady, which fails to start most games and plays awkwardly when it does start, has a win over #25 ICEbot (though it’s a win by crash as its last buildings were being destroyed).

Getting into the top 16 of 72 bots is not easy. Most of the finalists can be predicted now, though not with certainty. Place 16, just below #15 Microwave, has high odds of changing hands; those above range from likely to nearly sure to be in the finals. Pairings in the finals have historically pitted the top finisher against the last finalist and so on toward the middle. Even once you’ve made it into the top 16, you can benefit from climbing up a place (except that climbing from 9 to 8 makes no difference).
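The pairing arithmetic is simple enough to write down. The sketch below assumes the traditional "top versus bottom" scheme described above holds for a 16-player final; the function name `firstRoundOpponent` is made up for illustration.

```cpp
#include <cassert>

// First-round opponent of a given seed, assuming seed 1 plays the last
// finalist, seed 2 plays the second to last, and so on toward the middle.
int firstRoundOpponent(int seed, int finalists = 16)
{
    assert(seed >= 1 && seed <= finalists);
    return finalists + 1 - seed;
}
```

Under that assumption seeds 8 and 9 are paired with each other, which is why swapping those two places changes nothing, while any other one-place climb earns a lower-seeded first opponent.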

Steamhammer status

As I write, in SSCAIT Steamhammer is tied with BananaBrain for #8-#9, at just under 80% win rate. I’m happy with it. But Steamhammer is one of only 2 bots (the other is #26 NLPRbot) which has not yet played any games against the current top 5. I expect to take losses and slip down in the ranking. A finish near the middle of my predicted range of #7-#13 seems likely.

For some reason I don’t feel like analyzing CherryPi and SAIDA yet, though I’m sure I’ll get around to it. I’ve been working on the upcoming Steamhammer 2.2 instead. I’ve removed all references to BWTA other than regions, and I’m making good progress in implementing my own regions; the outline of the code is there, and supporting data is calculated and looks correct. Finish that and replace references to BWTA regions, and only a few small odds and ends remain before I can drop BWTA. Debugging time is unpredictable, because my regions will be different and may affect play. But I’m likely to be done in time for the end of the tournament.

Today I decided to take a break from that and fix a critical bug instead. I verified what I have been suspecting: There is a deadly one in the new micro system, causing it to reissue commands far too often. That makes the APM shoot over 90K and probably saturates Starcraft’s queues so that commands are lost in transit. That explains unit freezing and misbehavior once the zerg army grows large, and I hope it explains the production freezes too. The bug is resisting me so far; something behind the scenes is mysteriously resetting the “yeah yeah, I already did that” marker. But the code is simple and I’ll nail it before long. The bug is responsible for a lot of bad play during the tournament, but strangely for only a few losses.
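For context, here is a rough sketch of what an "already did that" marker can look like. It is an illustration only, not Steamhammer's micro code: the types `Command` and `CommandFilter` and the refresh interval are assumptions. The filter remembers the last command sent to each unit and suppresses an identical command unless enough frames have passed.

```cpp
#include <cassert>
#include <map>

enum class CommandType { None, Move, Attack };

// Hypothetical minimal command: an action plus a target position.
struct Command {
    CommandType type;
    int x;
    int y;
};

bool sameCommand(const Command& a, const Command& b)
{
    return a.type == b.type && a.x == b.x && a.y == b.y;
}

class CommandFilter {
public:
    // Returns true if the command should actually be sent to the game.
    bool shouldIssue(int unitId, const Command& cmd, int frame)
    {
        assert(frame >= 0);
        Record& rec = _last[unitId];   // reference, so the marker update persists
        if (sameCommand(rec.cmd, cmd) && frame - rec.frame < _refreshFrames) {
            return false;              // same order, issued recently: skip it
        }
        rec.cmd = cmd;
        rec.frame = frame;
        return true;
    }

private:
    struct Record {
        Command cmd{CommandType::None, 0, 0};
        int frame = -1000000;          // "never issued"
    };
    std::map<int, Record> _last;
    int _refreshFrames = 48;           // assumed interval; a tuning choice
};
```

If the stored record is never updated, for whatever reason, every call returns true and the same order goes out frame after frame, which matches the runaway APM symptom.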

As the tournament approached I was trying to make low-risk changes. The code is simple, but I still misstepped. I think my first cut of the micro system was OK, since it passed tests then, and the bug was introduced in later changes as I “improved” it.

the “no shallow forks” rule in SSCAIT

The SSCAIT rules include 2 items about copying other bots. Here’s the first item:

The SSCAIT admins reserve the right to disqualify any bots that are implementation-wise too similar to previous entries (e.g. clones and forks) without adding anything novel / original to the table.

On the upside, it openly admits that the decision is a judgment call, and otherwise it is simple. On the downside, it offers little guidance on what counts as “too similar.” On balance, I like this rule.

The second item is listed separately and seems to have been inserted independently:

If you copy other bots or use IP/files/source code/logic/techniques etc from other bots, you must familiarize yourself with their licenses and ensure that you are not infringing their licenses. Copying other bot(s) is allowed, so long as it does not infringe their licenses and so long as you modify their logic or if you use/wrap it without modifying it and add some of your own logic on top of it, similar to how MegaBot used Skynet/Xelnaga/NUSBot in year 2017, or wrapping/modifying Randomhammer/UAlbertaBot/CommandCenter etc. If you do something like this, you must provide the source code and compilation instructions etc of the bots that you use, so that we can compile them. We decided that to foster research it is best to have the next generation of programmers stand on the shoulders of giants, rather than re-invent the wheel. We encourage authors to take code from old years of this competition and improve it. If you copy a bot, please uphold the spirit of competition and ensure you make significant modification or addition before you submit it. We don’t want multiple apparently near-identical copies of the same bot competing! Additionally, please contact the original bot (in case it has been updated at least once during the past year) author and ask them for permission to upload it.

This paragraph looks like it was partly copied from CIG rules, which were themselves partly copied from old AIIDE rules. Not all of it makes sense for SSCAIT. It gives the impression of having been dropped in without careful thought. It declares multiple rules. In more detail:

The rule about obeying licenses is good, except that borrowing a “technique” from another bot is universal. If you use MCRS, do you have to obey the license of UAlbertaBot because UAlbertaBot with SparCraft invented the technique of making Starcraft decisions by combat simulation? The technique that has been accepted as standard and used nearly everywhere? That makes no sense at all. The word “technique” without qualification is too vague to interpret; leave it out.

Next is some wording about when it’s OK to copy other bots. It’s unclear how much this is establishing a rule and how much it is providing guidelines to interpret the rule already established that something must be “original / novel.” In any case it overlaps with the first item, and if it’s worth keeping at all, part at least should be combined into that paragraph.

Then there’s a strange and awkwardly-worded rule that seems to require source code of the original bot that you wrap or modify, “so that we can compile” it. I don’t think SSCAIT does this. Am I wrong? Is anything similar to this rule enforced at all, or even understood by the organizers? It has the smell of a copy-paste error.

Finally, there is a rule about asking the author for permission. I don’t like that rule. I deliberately chose a license that does not require forks to ask permission, or even to let me know. As far as I am concerned, the license grants permission; to require anything from me is a waste of my time. Instead of this rule, I propose a suggestion to authors that they may want to consider whether to add a permission requirement to their license. Then the obey-the-license rule is all that’s needed.

ambiguous cases

Here are 3 edge cases related to Steamhammer. Maybe they can get people discussing what guidance to add to the rules.

Steamhammer 0.2 was forked from UAlbertaBot and played in SSCAIT 2016 after about 3 weeks of development. I made bug fixes and macro changes and stuff, but 3 weeks is not long enough for substantial code changes. Nevertheless, from the start Steamhammer’s strategy and game plan were very different from UAlbertaBot’s. Nobody suggested that it might be too similar, even though the tactics and micro were nearly identical.

NLPRBot is a 2017 fork of Steamhammer, and has not been updated. The link is to a blog post where the comments discuss what to do about shallow forks. It was accepted in tournaments in 2017 under the rules of the time. SSCAIT chose to accept it in 2018 too, though if today’s rules had been in effect in 2017 it probably would have been rejected then. I think that allowing it in 2018 is fair; NLPRBot plays like an old Steamhammer, differently from the current one, so it still adds variety. But it is an edge case, and there is an argument for rejecting it.

insanitybot is a recent fork of Steamhammer. It adds wraiths and minelaying skills and has other changes, but you could argue that it is a shallow fork; its play is in many ways closely similar to how Randomhammer plays terran. I think it is right to accept it into the tournament, because of the improvements and because Steamhammer doesn’t play terran except as random. But again, it is an edge case.

SSCAIT runs of upsets

I am amused: As I write, Steamhammer is ranked #11 in win rate in SSCAIT. It has played 4 games against players ranked higher—and won all of them. The games are 1-0 Krasi0P (I expect the second game to also be a win), 2-0 Killerbot by Marian Devecka, and 1-0 Hao Pan (the second game could go either way). All the losses that pushed Steamhammer down to #11 were against weaker players. That’s a change from last year, when Steamhammer performed consistently and had comparatively few upsets in either direction. Some losses are caused by bugs (0-1 XIMP by Tomas Vajda) or by Steamhammer’s standard weak play (0-1 AILien), but I think the lion’s share are due to poor opening choices by the opponent model: Steamhammer is performing inconsistently because it is thinking on its own.

Current #14 LetaBot by Martin Rooijackers has a similar record: It has played 6 games against higher-ranked players and won 5 of them, including its game against Steamhammer (it happens sometimes).

Overall, I think the biggest upset so far is #50 KillAll 1-0 #2 PurpleWave. Of course the tournament is only about 1/4 over.

another building delay bug

Constructing buildings efficiently is one of Steamhammer’s most fragile behaviors, because it involves coordinating across modules in an ad hoc way. In my change list for the latest version 2.1.4, I described a bug that delayed the start of buildings. The bug was an interaction between the production manager, the building manager, and the building placer.

Now I’ve discovered another bug with the same symptoms—an already placed building is re-placed elsewhere, causing a delay—that also involves 3 modules. This time it’s the building manager, the production manager, and the worker manager. This bug bites when a pre-positioned worker arrives too early at its designated location and has to wait for resources or tech to be ready. How clever of the first bug, to hide the second one from my view. I think computer bugs evolve camouflage, just like living bugs; hide or die.

Now I’m trying to restructure the interaction to be more robust. My preliminary plan is to pass control to the building manager as early as possible and put it in charge of the rest, to try to keep the module interactions simple and organized. It will involve storing a little more state.

Meanwhile, in SSCAIT I’m not seeing many surprises. Not enough games have been played to firm up the rankings, so my range of expectations is still wide—I guess it’s normal that nothing much is unexpected. Steamhammer has about the right mix of wins and losses in games that could have gone either way. It is entertaining that the score table only gives a rank to bots that have played 30 games, while the games are doled out randomly. At the moment, #1 is Soeren Klett with 15-15, and #2 is Jakub Trancik with 11-19. Visiting space aliens will not understand how the competition works.

AIIDE 2018 - 2 locutusoids dropped

Dave Churchill decided to drop the locutusoid bots BlueBlueSky and ISAMind from the AIIDE 2018 tournament results. They were, he must have concluded, too similar to Locutus. The locutusoids #3 CSE and #5 DaQin were kept. The change switches the order of 2 pairs of participants who had close scores: Iron moved ahead of McRave, and Steamhammer moved ahead of ZZZKBot. Steamhammer is now officially #8 out of 25 (no longer #11 out of 27). #12 Tyr, with a 48% win rate, is now in the upper half of the rankings, which reinforces how much stronger the top bots were.

I only noticed today. No doubt I’m behind the times.

The change puts my analysis out of sync with the official results. You may have to refer to my result tables instead of the official tables, to see the ranking numbers I used.

Steamhammer 2.1.4 source available

Steamhammer’s web page is updated. You can get source for the latest version 2.1.4 there.

SSCAIT 2018 prospects

The tournament starts with the usual round robin. There are 72 players, down from 78 last year due to stricter rules and a trend to disable weaker bots. The top 16 round robin finishers will go on to a final in a knockout format. Last year I predicted—not with full confidence—that Steamhammer would finish between #4 and #8. It ended up dead in the middle at #6. My predictions are not usually that accurate, but I won’t let that stop me.

First, let me get a handle on the top bots. Some familiar players are out, either by the author’s choice or by the rules.

Krasi0	Wacky Krasi0P, not powerful Krasi0 terran.
SAIDA	Updated with scary new skills.
Locutus	Updated.
BananaBrain	Updated.
PurpleWave	Switched in on the last day as a “surprise”.
Killerbot	(Marian Devecka) Updated.
CherryPi	Out, due to shared author rules.
Tscmoo family	Tscmoo protoss.
DaQin	Out, presumably because it is too like Locutus.

SAIDA is of course the favorite for #1. Obviously the protoss trio Locutus, BananaBrain, and PurpleWave should finish high. PurpleWave is a black box. It was left disabled in the run-up to the tournament, no doubt while the author prepared, then switched in on the last day: It is prepared for us, we are not prepared for it. Killerbot by Marian Devecka is the other top contender. These 5 will fill in the top 3 almost surely, and likely the next 2 places as well.

Iron should do well too, though it has fallen behind. Hao Pan has improved greatly. McRave seems to remain inconsistent against weaker opponents, but should perform well. I find it hard to foresee how the lower end of the top 16 will fill out. For the Tscmoo family, the author did not disable unwanted variants, so the organizers did it instead, choosing Tscmoo protoss as the representative. (Last year they said they’d choose randomly; this year I don’t see a decision criterion stated.) Tscmoo can surprise, but this version doesn’t look like a contender. XIMP by Tomas Vajda is no longer likely to finish in the top 16, since its rivals have learned to cope—that is a change, last year it finished #12.

For Steamhammer, I am disheartened by a couple of previously undiscovered serious bugs that showed up in its very last game before the tournament began. Still, I did fix other bugs, and the buggy test versions performed fairly well. I am predicting Steamhammer to finish from #7 to #13 and safely make it into the final (again, not with full confidence).

“Good luck to everyone” is a wimpy wish. Absolute victory to everyone! Crush all in your path!

Steamhammer 2.1.4 change list, SSCAIT 2018 version

The version 2.1.4 change list also includes the changes in the unreleased test versions 2.1.1 through 2.1.3. I rolled them all up. A lot of the work in 2.1.4 was fixing bugs that I introduced in the test versions. Expect source release tomorrow, if I find the energy.

The most important improvements are in bold.

UnitInfo

UnitInfo was inherited from UAlbertaBot. It is responsible for keeping track of enemy units that may be out of sight.

• Keep track of burrowed units, both zerg burrowed units and spider mines. Formerly, when a unit burrowed, Steamhammer thought “oh, it disappeared from its last seen position” and lost track of where the unit was. Now the units are tracked and get passed to the combat simulator, and could be used for other purposes, for example “mines are ahead, send an overlord now!” It’s a little tricky, by the way. If you detect a burrowed unit then you detected it, but if detection is not available and a unit disappears, then why? When a zerg unit burrows, it has the order Orders::Burrowing, so you can distinguish a burrowing unit from a unit that merely walks out of sight. But a spider mine has the order Orders::VultureMine no matter what it is doing. So I simply marked spider mines down at the position where they were last seen, and it works accurately so far though it could be wrong in rare cases. BWAPI doesn’t provide “a cloaked unit in this position would be detected if it were there, therefore it’s not there,” so for now I ignore the case of a burrowed unit which moved away or was destroyed while out of sight. At some point I’ll add a feature to cover BWAPI’s lack.

• I renamed the field lastHealth to lastHP, since that’s what it is. I decided to use “health” to mean HP + shields.

• Steamhammer 2.0 added UnitInfo::estimateHealth() to estimate the health of an unseen unit, accounting for protoss shield regeneration and zerg HP regeneration. Terran repair and medic healing are not so easy to predict. This version adds separate estimateHP() and estimateShields() for use by the combat simulator.

• The HP and shields of an undetected enemy are 0 because the enemy unit is not detected (easy to understand, right?). Steamhammer formerly took it literally and did not pass an undetected enemy unit in sight range into the combat simulator, because a unit with 0 HP is paper. I had assumed that Steamhammer was weak against dark templar because FAP doesn’t support cloaking and detection, but it was deeper than that. I fixed UnitInfo and the estimators to assume that a visible but undetected enemy has full HP and shields. Hmm, maybe a better fix is possible?
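As a rough illustration of the estimators described above (not Steamhammer’s actual code), the sketch below treats an undetected unit as having full HP and shields and otherwise adds regeneration since the last sighting, capped at the maximum. The `SeenUnit` struct, its field names, and the regeneration rates passed in by the caller are all assumptions made for the example.

```cpp
#include <algorithm>
#include <cassert>

// Hypothetical snapshot of an enemy unit as last observed (a stand-in, not UnitInfo).
struct SeenUnit {
    int lastHP;         // HP when last seen (reported as 0 if undetected)
    int lastShields;    // shields when last seen (reported as 0 if undetected)
    int maxHP;
    int maxShields;
    int lastSeenFrame;  // frame of the last observation
    bool undetected;    // seen, but cloaked or burrowed without detection
};

// Add regeneration over the unseen frames, never exceeding the maximum.
// regenPerFrame is supplied by the caller; use 0.0 for units that do not regenerate.
int estimateRegen(int last, int max, int framesUnseen, double regenPerFrame)
{
    assert(framesUnseen >= 0 && regenPerFrame >= 0.0);
    const int regenerated = static_cast<int>(framesUnseen * regenPerFrame);
    return std::min(max, last + regenerated);
}

int estimateHP(const SeenUnit & u, int now, double hpRegenPerFrame)
{
    if (u.undetected) return u.maxHP;   // the reported 0 is meaningless; assume full
    return estimateRegen(u.lastHP, u.maxHP, now - u.lastSeenFrame, hpRegenPerFrame);
}

int estimateShields(const SeenUnit & u, int now, double shieldRegenPerFrame)
{
    if (u.undetected) return u.maxShields;
    return estimateRegen(u.lastShields, u.maxShields, now - u.lastSeenFrame, shieldRegenPerFrame);
}

// "Health" in the sense used above: HP plus shields.
int estimateHealth(const SeenUnit & u, int now, double hpRegen, double shieldRegen)
{
    return estimateHP(u, now, hpRegen) + estimateShields(u, now, shieldRegen);
}
```

Keeping the HP and shield estimates separate matters to a combat simulator that models the two pools differently; the summed health is only a convenience.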

combat sim

• Use estimateHP() and estimateShields() for a more accurate representation of the starting situation. Formerly, for unseen enemies, the combat sim used the last known HP and shields (if the HP were not incorrectly 0 for an undetected enemy, as mentioned in the previous bullet). The estimates take into account regeneration since the units were last seen. Cloaked units are understood much better, though FAP still doesn’t understand that it can’t hit them without detection. Otherwise, the estimators rarely make a big difference.

• Mutalisk versus spore colony situations play adequately again. This fixes a bug that was introduced in version 2.1.

• Combat sim is centered on the nearest enemy rather than the frontmost friendly unit. UAlbertaBot provided a system where it picks one of its own units as the vanguard of its force, draws a circle around the vanguard, and includes everything that can fire into the circle in the combat sim. When our force moves forward, more enemies are included, so we may retreat, causing fewer enemies to be included, so we advance, etc. Steamhammer 2.0 changed this to include one cluster of friendly units (based on the unit clustering algorithm) in the combat sim, plus the enemies in the circle. Steamhammer 2.1.4 now centers the circle on the nearest enemy instead of the vanguard friendly unit. As friendly forces move back and forth, the set of enemies often stays the same. It greatly reduces vacillation and unit suicides. The circle is also smaller, to encourage aggression.

• Bruce @ Locutus pointed out a FAP bug confusing ground and air units in unitsim(). I fixed it, and it helps... to a limited extent. It’s a severe bug, and I expected a bigger difference.

• The FAP unit field airMinRange is always 0. I removed it and all its uses. Since it was tested in an inner loop, all sims involving air units run a trifle faster. The groundMinRange field affects sieged tanks, so it has a use.

• Units that are under maelstrom (detail stolen from MCRS) or under disruption web are excluded from the combat sim by UnitUtil::IsCombatSimUnit(). A dwebbed unit could move out of dweb, but will it? The combat sim doesn’t understand it.

• The whichEnemies is reworked and completed. It specifies which types of enemies should be included in the combat sim, which makes a difference because Steamhammer decides the result by the unusual but successful criterion “who has more stuff left over at the end?” I renamed AntigroundEnemies to ZerglingEnemies because, believe it or not, it’s clearer, even though it doesn’t apply only to zerglings. You’re a zergling enemy if you’re on the ground (a zergling can hit you) or you’re in the air and you can shoot down (you can hit a zergling). A corsair is not a zergling enemy and is excluded. I added GuardianEnemies (which is unused) and DevourerEnemies (used) to complete the set (AllEnemies was already there). The CombatSimulation setup class calculates all the exclusions more accurately than before to pass the right units to the combat sim.

• Steamhammer scores the combat sim by unit prices. Some units have deceptive prices that don’t represent their value. For example, BWAPI arbitrarily says that a spider mine has mineral cost 1. I made special cases for deceptive prices.
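The ZerglingEnemies rule from the whichEnemies bullet above can be written as a small predicate. This is only a sketch of the stated rule, not the CombatSimulation class itself; the `EnemyTraits` struct and its field names are invented for the example.

```cpp
// Hypothetical summary of the enemy properties the rule needs.
struct EnemyTraits {
    bool isFlyer;          // true if the unit is in the air
    bool canAttackGround;  // true if the unit can shoot down at ground units
};

// An enemy counts as a "zergling enemy" if a zergling can hit it (it is on
// the ground) or it can hit a zergling (it is in the air but attacks ground).
bool isZerglingEnemy(const EnemyTraits & e)
{
    return !e.isFlyer || e.canAttackGround;
}
```

With this rule a corsair (a flyer that cannot attack ground) is excluded, while a mutalisk or any ground unit is included.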

micro

• Nearly all micro actions now have bookkeeping in MicroInfo, and the action of moving (which is when units get stuck) is handled completely by the MicroInfo system. Any move commands are now carried out at the end of the frame, after additional checks are done. I think units get stuck less often, though with more experience it becomes harder to see; maybe I’m fooling myself. At worst, the extra bookkeeping will make it easier to get units to follow their orders. On the downside, some bugs in the previous test versions were caused by failing to record changed orders in the MicroInfo system. I believe that this includes the recent production freeze bug, which I hope is fixed now. I should figure out how to avoid the risk of this kind of bug; I will have to change something.

• DistanceAndDirection() is corrected and simplified. It’s a utility in Common.cpp which takes a base point, a direction point, and a distance. The direction point specifies a direction from the base point, and the routine returns a point at the given distance from the base point in that direction (it’s scaling a vector). The distance can be negative. There was a basic error: It calculated the (x, y) offset from the base point correctly, but forgot to add the base point to the offset, so the code looked right when I read it but the result was completely wrong! Since I was touching it anyway, I also simplified the code.

• A ground unit which finds itself directly next to an undetected enemy dark templar will try to flee away from the DT (using DistanceAndDirection()). The DT has to work harder and some units escape danger, but many units still get hit, especially if there is more than one DT. A disadvantage is that the fleeing units get disorganized and work together even less well than usual.

• Guardians are kited by the same code as mutalisks. It reduces cases where they marry a target and refuse to switch to a better one.
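For readers who want to see the vector scaling behind DistanceAndDirection(), here is a minimal sketch under my own assumptions. It is not the code from Common.cpp: the `Point` struct stands in for a BWAPI position, and the rounding and zero-length handling are choices made for the example. The commented line marks the step that the change list says was formerly missing.

```cpp
#include <cmath>

// Stand-in for a pixel position (hypothetical; the real code uses BWAPI types).
struct Point { int x; int y; };

// Return the point `distance` away from `base` in the direction of `toward`.
// A negative distance gives a point in the opposite direction.
Point distanceAndDirection(Point base, Point toward, int distance)
{
    const double dx = toward.x - base.x;
    const double dy = toward.y - base.y;
    const double length = std::hypot(dx, dy);
    if (length == 0.0) return base;          // no direction is defined

    const double scale = distance / length;  // scale the direction vector
    const int offsetX = static_cast<int>(std::lround(dx * scale));
    const int offsetY = static_cast<int>(std::lround(dy * scale));

    // The offset alone is relative to the origin; adding the base point is
    // what turns it into a position on the map.
    return Point{ base.x + offsetX, base.y + offsetY };
}
```

Fleeing from a dark templar then amounts to calling it with the unit as the base point, the DT as the direction point, and a negative distance.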

operations and tactics

• Attempt to retreat behind static defense, instead of stopping in front of it. It’s not entirely successful, but it seems to help some.

• There are a number of changes to base defense. There is a minor rewrite to simplify one part and improve efficiency. The enemy scout worker is no longer ignored; any enemy in the base is now reason to form a Base squad and clear the base; it helps deny enemy scouting. When deciding on how many defenders to assign, Steamhammer now weights enemy workers more lightly and certain tough enemy units more heavily, so the squad size should be more appropriate to the threat.

• The base defense squad also assigns a detector under narrower conditions, which ameliorates one major cause of mass overlord suicide. Unfortunately there are other causes.

• Don’t assign a detector to an otherwise empty squad. There was a loophole.

• Fixed a minor bug in dropping the empty Base squad of a destroyed base. It had no important effect. I think there is still one more case where dropping an empty squad does not happen as intended.

• For most of Steamhammer’s life, it has been the case that a melee unit next to a sieged tank does not retreat, but attacks the tank instead. I extended it slightly: If the tank is in the process of sieging or unsieging, the unit also does not retreat. It’s a little more insistent about hitting the tank while it can.

• For purposes of operations targeting, a refinery building is considered always reachable by ground. It’s one of the 4 cooperating bugs that I hit recently. This is, of course, a workaround and not a fix. If Steamhammer becomes aware of a refinery on an island....

early-game scouting

• In the scouting code that Steamhammer inherited from UAlbertaBot, a potential enemy starting base is considered scouted and unoccupied if we have explored the TilePosition where the enemy resource depot would be, and nothing is there. But that’s a little inefficient; it’s the position of the upper left corner tile of a building which is 4x3 tiles in size. If we are approaching from the right, say, then it’s one of the last tiles of the building location that we see. So now Steamhammer can recognize a base as unoccupied if we have scouted any of the 4 corners of the building location. The early game worker scout sometimes saves giant fractions of a second in finding the enemy. Someday I’ll add creep recognition too, which will save a useful amount of time when scouting a zerg enemy.
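The four-corner check can be sketched as below. It covers only the “have we explored the spot” half of the test (the real check also needs to see that nothing is there), and the `Tile` struct and the `isExplored` callback are hypothetical stand-ins for whatever tile type and exploration query the bot uses.

```cpp
#include <array>
#include <functional>

// Stand-in for a build-tile coordinate (hypothetical).
struct Tile { int x; int y; };

// A resource depot is 4x3 tiles, and its position is its upper left tile.
std::array<Tile, 4> depotCorners(Tile topLeft)
{
    const int width = 4;
    const int height = 3;
    return {{
        topLeft,
        Tile{ topLeft.x + width - 1, topLeft.y },
        Tile{ topLeft.x,             topLeft.y + height - 1 },
        Tile{ topLeft.x + width - 1, topLeft.y + height - 1 }
    }};
}

// True if any corner of the depot location has been explored.
bool anyCornerExplored(Tile topLeft, const std::function<bool(Tile)> & isExplored)
{
    for (const Tile & corner : depotCorners(topLeft))
    {
        if (isExplored(corner)) return true;
    }
    return false;
}
```

A scout approaching from the right sees the right-hand corners first, so this test succeeds a little sooner than checking the upper left tile alone.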

building construction

• Expansion hatcheries could be mistakenly reassigned as macro hatcheries due to a bug in the building manager. The advent of this bug is what caused Steamhammer to so often expand slowly in the early middle game. Steamhammer’s traditional damn-the-torpedoes-take-the-map attitude is restored.

• Buildings were often re-placed after their initial placement due to a subtle interaction bug that has been around since the beginning. This is what caused the spawning pool drone to move into position, stop, then move to another position before starting the pool. The bad behavior: The production manager notices that the pool is coming up, asks where it should be placed, and moves a drone there to pre-position it, trying to arrive just as 200 minerals become available. Then the building manager runs the placement code again, rejects that position because it is blocked by a drone, chooses a new position, then assigns the nearest drone. Now the building manager assigns the drone first, then places the building, usually keeping the same location because a drone does not block its own building—that’s the sneaky interaction. In most cases, but not all, the closest drone can be assigned to the building because the building was already placed by the production manager. The bug was hard to understand because its 3 parts are in 3 different files. At some point I’ll figure out a way to place the building only once unless a problem occurs; that should avoid the remaining problems. Anyway, the effect of fixing this is that buildings commonly start a little sooner with less wasted drone motion, which can help a lot if one side or the other is rushing.

opponent model

• Checking for enemy proxies had a serious bug. If the location of the enemy base was known, the check did not run at all! On a 2-player map, the enemy base location is always known. That is why Steamhammer lost so many games against Juno by Yuanheng Zhu under the false belief that its opponent was playing a Turtle strategy rather than a cannon rush Proxy strategy. In fixing this, I moved proxy checking from the information manager to the Bases class, and (with the extra info that class makes available) extended it so that it now checks both the main and natural for enemy proxies. When I first wrote the code, no bots proxied to the enemy natural and checking your main was enough.

terran

• I had to touch the tank code to update it for MicroInfo, and I couldn’t resist a tweak. Tanks siege and unsiege too often, and I cut away a tiny bit of the stupidity: An unsieged tank does not siege if faced with a single enemy melee unit.

zerg

• A serious bug in defiler control caused defilers to jitter back and forth seemingly randomly instead of moving to the front where they are wanted. Fixing it also makes the defiler code run substantially faster; there is little risk of overstepping the frame time in the late game due to defilers. This is the second serious defiler control bug I fixed; version 2.1 had the first fix. Defilers are still not active enough—is there a third serious bug?

• A minor bug could prevent a dying defiler from casting one last plague.

• Try harder to avoid making a duplicate or unnecessary defiler mound. It still happened in 1 test game; I don’t know how.

• Even in an emergency, research consume for defilers. That’s when we need it most!

• After an emergency spawning pool because the enemy is rushing or proxying, save larvas so we can get as many zerglings as possible right when the pool finishes. Steamhammer has lost games by making drones while waiting for the pool to finish.

• If the enemy played a fast rush, be more cautious about expanding to the natural. It might not be safe.

• Make emergency zerglings in response to the enemy ground army, not the enemy total army. I saw games where Steamhammer made emergency zerglings to defend against mutalisks, and found itself surprised that it didn’t help.

• I saw Steamhammer lose a number of ZvZ games by making all zerglings versus mutalisks. It’s not always a mistake, but if there are enough mutas and they don’t let themselves get too far out of position, then the lings will do nothing but die (since they don’t know how to scatter or hide). I fixed it to do that less often.

• Emergency sunkens are allowed even if they may not finish before the enemy attack arrives. Steamhammer was trying too hard to avoid overdefending. They are still not allowed if the enemy is rampaging in the natural—most bots will build sunkens then, but to me Steamhammer looks too awkward when it tries, and loses too much. This change involved adding to the “emergency” state a separate “EMERGENCY NOW!” state, which is updated independently.

• Make macro hatcheries faster when there are more drones. It’s a crude rule, but it should improve macro a modest amount.

• Steamhammer has continued to have the problem of overproducing scourge, using up all its gas and delaying other production that it needs. I keep tightening limits, and they seem never tight enough. I added another limit: A total of 12 scourge are allowed to be alive at any one time.

• If we’re maxed, trim the production queue to keep it short. In the late game, Steamhammer likes to put a lot of items in the queue at the same time, because it has the larvas and the resources and it can make them all nearly simultaneously. But when it reaches supply max, it can’t make things quickly any more. The long queue prevents Steamhammer from reacting to changes; it has to work through the whole queue, losing one unit before it can produce the next one, before it empties the queue and reconsiders the situation. Now Steamhammer ruthlessly prunes the queue to a couple items, so it can react and do the most important things first.

• Recognize that zerglings are weak versus dark templar. For some reason, this tidbit of knowledge never made it into the unit mix scores.

• Be more willing to add a second spore colony if the enemy makes multiple scouts. Scouts are not that dangerous... if you actually react to them.

• Don’t try to make a spore colony in the natural base when we don’t own the natural base. This generally causes the spore to be built at the edge of the main base closest to the natural, which is rarely helpful.

• The preferred army size in ZvP is tweaked upward (I adjusted a parameter from 0.60 to 0.65). I concluded that sometimes Steamhammer simply does not make enough fighting units.
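The queue pruning described above, for the case when supply is maxed, could look something like this. It is a sketch under assumptions, not the production queue code: the queue is modeled as a deque of item names with the next item to build at the front, and the supply numbers and the number of items to keep are parameters chosen by the caller.

```cpp
#include <cstddef>
#include <deque>
#include <string>

// If supply is maxed, keep only the first `keep` items so that the bot can
// reconsider the situation soon. Assumes the front of the queue is built next.
void trimQueueIfMaxed(std::deque<std::string> & queue,
                      int supplyUsed, int supplyMax, std::size_t keep)
{
    if (supplyUsed >= supplyMax && queue.size() > keep)
    {
        queue.resize(keep);   // drops everything after the first `keep` items
    }
}
```

Below supply max the queue is left alone, since a long queue is harmless when everything in it can be produced almost at once.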

openings

• I added 5HatchBeforeGas. It cheats and makes 1 extractor just before the 5th hatchery and 1 after, to avoid a bug that comes up when trying to build 2 extractors simultaneously (the building reservation system is tile-based and doesn’t work in that case, because all geysers are non-buildable for buildings other than a refinery, so it tries to place both extractors on the same geyser).

• I added 8Hatch7PoolSpeed, another rush opening. It’s not configured to be played, and it takes Steamhammer many games against an opponent before it experiments with unconfigured openings, so you won’t see it during the tournament.

• I added a strategy combo AllIn which collects the all-in attacks, such as 8Hatch7PoolSpeed above. It’s not used for now. I’m thinking that exploring all-in openings should be a phase in the exploration program that happens before exploring all openings without restriction. In the meantime, those who wish to play with their own copies can configure AllIn openings for use if they like.

• The 3 hatch mutalisk openings now build a 4th hatchery before mutalisk production starts. The hatchery finishes around the same time as the opening’s last mutalisks. The opening was tuned when I first wrote it, but with mineral locking introduced in version 2.0, it accumulates excess minerals. Steamhammer struggled to recover in the middle game from the macro imbalance of the opening. Mineral locking can make a huge difference in macro openings.

• The usual minor tweaks to openings and probabilities.

configuration file

• If the strategy boss debug option is turned on, you get separate red “emergency” and “EMERGENCY NOW!” indicators. They are independent; either can occur without the other, or you can get both at once. “PANIC” didn’t seem like the right word. Maybe I should call it “DOWN IN FLAMES”?

• A new debug option DrawHiddenEnemies draws remembered positions for out-of-sight enemy units. It was described in this post on Steamhammer 2.1.1.

• There was an accidental duplicate of “Counter Naked expand vT”. Dropped.

The next release should be Steamhammer 2.2, according to my plans as of today (ask again tomorrow). I’m low on energy and feel like skipping an interim 2.1.5 release that fixes the terran and protoss bugs. The headline feature of 2.2 will be that BWTA is dropped and Steamhammer does all its own map analysis. I hope it will be done in January around the time SSCAIT finishes. I want to drop BWTA and upgrade to BWAPI 4.2.0 as separate steps, so that I know what to attribute bugs to. Moving to BWAPI 4.2.0 might be version 2.2.1—I am looking forward to being free of the zerg bugs of BWAPI 4.1.2.

Next: Tournament prospects. Some time after that: CherryPi and SAIDA analysis from AIIDE 2018.

tournament-ready Steamhammer 2.1.4

Steamhammer 2.1.4 is uploaded. It’s ready for the tournament, and it will be the tournament version unless I slip in a few last minute changes. Source release and change list tomorrow, or thereabouts (I have stuff to do that might cause a delay).

This version does have a known bug with transport loading that affects terran and protoss, so the terran and protoss drop openings are not working properly. Also, unit clustering interacts badly with behaviors for a few protoss units: Reavers move far from home before they start to build scarabs; carriers do the same before they start to build interceptors; high templar do the same before they merge into archons (which is all they can do for now). I may make an interim release in the 2.1.x series before the tournament is over to fix these for anybody who wants to download from my site. Otherwise, I expect the next version to be 2.2 after the tournament; headline feature: Dropping BWTA.

the 4 cooperating bugs

Yesterday I pointed out a game versus CUBOT where Steamhammer failed to finish off the enemy and had to win on points when the game timed out. It was due to 4 bugs, and I listed the bugs. Today I want to expand a little: The bugs were all necessary for the bad result. If any one of the bugs were not there, Steamhammer would have destroyed the last building, the enemy extractor, and finished the game on its own.

These are important bugs, because it is important to kill CUBOT fast so it stops spamming how much it needs gas.

1. The extractor was considered inaccessible by ground, so ground units did not try to attack it. The tactical targeting does not send a squad to attack a target that the squad can’t reach. Steamhammer decides ground accessibility like this: If the map were empty, could a unit walk to the position? You can’t walk on a geyser, so the check is incorrect in that case.

If ground units happen to wander close enough to the extractor, they will attack it anyway. The micro targeting takes over. The can’t-reach-a-refinery bug is a known problem, and I never fixed it because, until this game, Steamhammer’s other behaviors had always been robust enough to eventually find and kill any leftover refinery buildings.

2. The Ground squad knows there’s an enemy building left but believes it can’t reach the building, so its job is to explore the map for other enemies. It decided to check the enemy’s mineral-only expansion. Unfortunately, the shortest path there is blocked by a mineral block, and Steamhammer does not know how to path around the block or remove the block. So the Ground squad could make no progress.

3. The Recon squad cannot get jammed in the same way, because if it fails to reach one target, it times out and switches to another. It should eventually check everywhere and kill the extractor. But in this game, a single zergling froze in place at the top of the enemy upramp, and the other zerglings of the Ground squad were positioned so that they could not pass it to reach the downramp that they could not pass because of the mineral block. Stuck units are less common, I think, but still happen. The Ground squad was effectively frozen into a position where it blocked the Recon squad from ever reaching the enemy extractor.

4. Mutalisks don’t care about ground reachability. Despite all the above problems, Steamhammer still would have finished the game on its own if it had continued normally, teching up to mutalisks. But a never-before-seen bug froze production, and Steamhammer stopped making units altogether. I think I’ve diagnosed the bug, and I will fix it.
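To make bug 1 concrete, here is a toy version of an “empty map” ground reachability check, together with the refinery workaround mentioned in the 2.1.4 change list. It is a sketch on a character grid, not Steamhammer’s map code: the grid format, the function names, and the flood fill are assumptions made for the example.

```cpp
#include <queue>
#include <string>
#include <utility>
#include <vector>

// Grid rows of equal length: '.' is walkable, anything else (for example 'G'
// for a geyser) is not. Returns true if a unit could walk from start to target.
bool walkReachable(const std::vector<std::string> & grid, int sx, int sy, int tx, int ty)
{
    const int h = static_cast<int>(grid.size());
    const int w = h > 0 ? static_cast<int>(grid[0].size()) : 0;
    auto walkable = [&](int x, int y) {
        return x >= 0 && y >= 0 && x < w && y < h && grid[y][x] == '.';
    };
    if (!walkable(sx, sy) || !walkable(tx, ty)) return false;

    std::vector<std::vector<bool>> seen(h, std::vector<bool>(w, false));
    std::queue<std::pair<int, int>> frontier;
    frontier.push({sx, sy});
    seen[sy][sx] = true;
    const int dx[4] = { 1, -1, 0, 0 };
    const int dy[4] = { 0, 0, 1, -1 };
    while (!frontier.empty())
    {
        const std::pair<int, int> here = frontier.front();
        frontier.pop();
        if (here.first == tx && here.second == ty) return true;
        for (int i = 0; i < 4; ++i)
        {
            const int nx = here.first + dx[i];
            const int ny = here.second + dy[i];
            if (walkable(nx, ny) && !seen[ny][nx])
            {
                seen[ny][nx] = true;
                frontier.push({nx, ny});
            }
        }
    }
    return false;
}

// The workaround: a refinery sits on an unwalkable geyser, so treat it as
// always reachable instead of asking whether a unit could stand on it.
bool targetReachableByGround(const std::vector<std::string> & grid,
                             int sx, int sy, int tx, int ty, bool targetIsRefinery)
{
    if (targetIsRefinery) return true;
    return walkReachable(grid, sx, sy, tx, ty);
}
```

The special case shows why this is a workaround and not a fix: a refinery on an island would also be reported as reachable.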

None of the 4 bugs is critical on its own. They had to cooperate closely, even to the point of freezing one zergling in the right position and placing others around it carefully so they could not move forward or back. Maybe instead of fixing the bugs, I should find the coordinating committee and disband it.