software development - 3 | Starcraft AI blog

production freezes

Production freezes are one of the most serious classes of bugs in UAlbertaBot. Its descendants like Steamhammer and Arrakhammer have mitigated the problem by solving various subclasses of production freeze item by item, but have never solved all of them.

There are two main kinds of production freeze: permanent freezes caused by deadlocks, and temporary freezes caused by waiting for a slow prerequisite to finish.

There are many ways for production to deadlock. You want hydra speed, so you order it up. Then the hydra den is destroyed and the queue hangs. Steamhammer solves that for zerg by canceling items whose prerequisites are missing, which is itself complicated and bug-prone. I think most remaining permanent deadlocks are caused by bugs in the strategy boss, the information manager, the production manager, or the building manager, or in their interactions with each other. Steamhammer tries to mitigate these by timing out and clearing the queue if nothing has been produced for too long, but that is not enough to save the game against a strong opponent.
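The timeout fallback mentioned above can be sketched roughly as follows. This is a minimal illustration, not Steamhammer's actual code: the class name `ProductionWatchdog`, its methods, and the use of a plain string queue are all assumptions made for the example.

```cpp
#include <cassert>
#include <deque>
#include <string>

// Hypothetical watchdog: if nothing has been produced for too many frames,
// assume the queue is deadlocked and clear it so planning can start over.
class ProductionWatchdog
{
public:
    explicit ProductionWatchdog(int timeoutFrames)
        : _timeoutFrames(timeoutFrames)
    {
        assert(timeoutFrames > 0);
    }

    // Call whenever a queued item is successfully started.
    void noteProduction(int frame) { _lastProductionFrame = frame; }

    // Call once per frame. Returns true if the queue was cleared.
    bool update(int frame, std::deque<std::string> & queue)
    {
        if (queue.empty())
        {
            // Nothing is waiting, so nothing can be stuck.
            _lastProductionFrame = frame;
            return false;
        }
        if (frame - _lastProductionFrame > _timeoutFrames)
        {
            queue.clear();
            _lastProductionFrame = frame;
            return true;
        }
        return false;
    }

private:
    int _timeoutFrames;
    int _lastProductionFrame = 0;
};
```

A watchdog like this only limits the damage. The frames lost before the timeout fires are still lost, which is why it does not save the game against a strong opponent.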

A temporary production freeze is caused by waiting unnecessarily for a prerequisite to finish. For example, an old version of Steamhammer once froze frequently when it wanted a hive next, because it didn’t realize that it was still researching overlord speed. It had to wait until research finished before it could morph the hive, and in the meantime it was making no units and probably losing the game. The live version solves most issues like that, but it does have an unsolved production freeze that occurs while waiting for the spire (it sounds easy to solve, but I’m not finding it). Another example is: Suppose you want vulture speed and spider mines, and you made 2 machine shops to do the work, but 1 shop has been destroyed. No prerequisites are missing, so there is no deadlock, but you’ll end up researching the upgrades one after another instead of simultaneously, in the meantime making no units. Or even: You want something that requires gas next, so you wait for the gas to accumulate and produce nothing else even though you could make mineral units while waiting. I may add queue reordering to mitigate this, but the underlying issue remains.
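One possible shape for the queue reordering idea, as a sketch under stated assumptions: when the head of the queue is blocked on gas, look a short way down the queue for a gas-free item that can be paid for while keeping the head's minerals in reserve. `QueueItem`, `chooseNext`, and the lookahead parameter are hypothetical names, and real code would also have to consider supply, larvae, and producer availability.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <string>
#include <vector>

// Hypothetical queue entry: a name and its resource cost.
struct QueueItem
{
    std::string name;
    int minerals;
    int gas;
};

// Return the index of the item to produce now, or -1 to wait.
// The head is preferred. If the head is unaffordable, an item at most
// `lookahead` places behind it may jump ahead, but only if it needs no gas
// and leaves enough minerals in the bank to pay for the head.
int chooseNext(const std::vector<QueueItem> & queue, int minerals, int gas, std::size_t lookahead)
{
    assert(minerals >= 0 && gas >= 0);
    if (queue.empty())
    {
        return -1;
    }
    const QueueItem & head = queue.front();
    if (head.minerals <= minerals && head.gas <= gas)
    {
        return 0;
    }
    const std::size_t limit = std::min(queue.size(), lookahead + 1);
    for (std::size_t i = 1; i < limit; ++i)
    {
        const QueueItem & item = queue[i];
        if (item.gas == 0 && item.minerals + head.minerals <= minerals)
        {
            return static_cast<int>(i);
        }
    }
    return -1;
}
```

Reserving the head's minerals keeps the reordering from delaying the gas item itself. It does not help when the head is short of minerals, which is the intended behavior here.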

Production freezes happen in other bots too. Watch the production tab in OpenBW. I’ve seen tscmoo suffer badly from temporary production freezes.

Software engineering is the solution. So far I’ve been fixing issues one by one. The underlying problem is that the software architecture is fragile. The right kind of fix is to redesign the production system in a way that is not prone to freezes. I don’t know what the redesign will look like, but some of its features are already visible from a distance. For example, I expect it will take into account low-level details like “which building will do this research?” before ordering the research; currently, the details are left to ProductionManager after the order is made, leaving room for slippage.

Anyway, my plans are always changing. This month I am on opponent modeling. After that, I plan to work on the mutalisk control that I’ve always promised. A new macro manager and re-architected production would be a logical task after that, especially since it would improve play for all races. At the rate I’ve been going, I may or may not get to it this year.

the economics of bot development

In yesterday’s post, LetaBot commented “Pretty easy to hold as long as you pull back [hurt workers].... After all, you can outproduce the opponent.” Of course the details are more complicated than that; you have to get your workers to coordinate to some degree, at least pulling the right number to fight. But in substance, it’s true. The worker rush is not a sound strategy and can be defeated every time. Worker rush bots so far (Stone, then a LetaBot version, now PurpleTickles and Yuanheng Zhu) have found new wrinkles and scored wins against top veterans, but actively developed veterans quickly patch their defense skills and recover.

Even so, it can make sense to play a worker rush, as an option against some opponents. As PurpleWaveJadien also commented, in a round robin tournament “getting an additional win versus weaker opponents is as good as getting an additional win versus stronger opponents (and easier).” The worker rush makes sense because of the economics of bot development.

Think of it from the point of view of a new bot author. Your creation is freshly uploaded to SSCAIT and it doesn’t have many skills yet, but it does win sometimes. What should you work on next? It depends, but probably basic macro and micro skills will lift your Elo rating the most. It’s a cost-benefit analysis; for a given development investment, seek the bigger benefits first. You can put effort into worker defense later, when it becomes a bottleneck skill.

Worker defense in Steamhammer (such as it is) had a bigger benefit than I expected, which means I likely put it in later than I should have. But Steamhammer was already an above-average bot by then, in the top third or quarter of the rankings. For a below-average bot, the benefit is probably small; you will lose most games to stronger opponents by being overrun with too many units because you didn’t keep up in macro, or by mismicro or misreaction due to missing skills. The basics are a better investment.

But as long as many bots are missing defense skills, other bots will exploit the missing skills. They’ll play worker rushes or whatever else is easy for them to code and more difficult, or not yet worth it, for their opponents to respond to.

As long as we have a steady stream of new bots, rushbots and other cheap exploits will make economic sense.

tricky bugs

Bugs can be deeply interconnected in obscure ways. Sometimes one appears after changes that seem to have no relation.

If you watch the latest Steamhammer, you’ll sometimes see idle drones in its base, sitting on the creep doing nothing. It happens especially when the bot has been holding off heavy pressure for a long time, as if its APM were not enough to keep up with managing its base. And I haven’t seen the bug in older versions.

It’s actually a primordial UAlbertaBot design flaw that happens to manifest now because of changes in Steamhammer that have nothing to do with drones. When ProductionManager sees that a building is coming up next, it checks whether it can save time by moving a worker to the building location immediately, so that construction can start as soon as resources are available. If you then insert something into the production queue ahead of the building—which parent UAlbertaBot will do when it realizes a sudden need for detection—then the building will come up again in the queue and the bot may send another drone, leaving the previous one idle. There is no tracking except the order of items in the production queue, which is unstable. The newest Steamhammer triggers it more often because its urgent reactions, the things it does when under heavy pressure, more often insert stuff into the queue ahead of buildings. Then of course the bug causes mining to slow down, so that the pressure breaks through, and the reactions end up backfiring.
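To illustrate why tracking by queue position is fragile, here is a minimal sketch of the alternative: remember the pre-positioned worker under a stable request ID, so that a building that comes up in the queue a second time reuses the worker it already has. The stable ID is an assumption (the code described above has no such tracking), and `PrepositionTracker` and the integer worker IDs are hypothetical.

```cpp
#include <cassert>
#include <map>

// Hypothetical tracker keyed by a stable building request ID rather than by
// the item's position in the production queue.
class PrepositionTracker
{
public:
    // Returns the worker responsible for this request. The candidate is used
    // only if no worker has been assigned to the request yet.
    int assign(int requestId, int candidateWorker)
    {
        assert(candidateWorker >= 0);
        return _workerFor.emplace(requestId, candidateWorker).first->second;
    }

    // The building was started or dropped: forget the assignment.
    // Returns the released worker, or -1 if none was assigned.
    int release(int requestId)
    {
        auto it = _workerFor.find(requestId);
        if (it == _workerFor.end())
        {
            return -1;
        }
        const int worker = it->second;
        _workerFor.erase(it);
        return worker;
    }

private:
    std::map<int, int> _workerFor;
};
```

With this in place, inserting an urgent item ahead of the building changes the queue order but not the request ID, so no second drone is sent.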

By the way, ProductionManager ought to also check when tech for the building will be available. If you watch Steamhammer build its spire, sometimes you’ll see the building drone waiting in place, twiddling its zergy thumbs instead of working, well before the lair is finished. ProductionManager only checked the resources, so it thought the spire could start earlier and sent the drone too soon. I’ll fix it eventually, but it probably doesn’t lose more than 24 minerals.
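The missing check amounts to taking the later of two times before subtracting travel time. A minimal sketch, assuming the caller can estimate when resources and tech will be ready (the function name and parameters are hypothetical):

```cpp
#include <algorithm>
#include <cassert>

// Frame at which the worker should leave for the building site.
// The building can start only when both the resources and the prerequisite
// tech are ready, so the later of the two is what matters.
int departureFrame(int now, int resourcesReadyFrame, int techReadyFrame, int travelFrames)
{
    assert(travelFrames >= 0);
    const int earliestStart = std::max(resourcesReadyFrame, techReadyFrame);
    return std::max(now, earliestStart - travelFrames);
}
```

For example, with resources ready at frame 1200, the lair finishing at frame 2000, and 150 frames of travel, the drone should leave at frame 1850 rather than 1050.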

I have seen the same bug in Arrakhammer. Microwave solves the bug by catching idle drones and putting them back to work. One older Steamhammer version did the same. Unfortunately, a drone that was about to start a building may be put back to work instead, causing a construction delay—that’s why I undid the change in Steamhammer.

Another solution would be to return drones when the queue is messed with. Steamhammer sometimes decides that an upcoming production item is useless and should be dropped; if it drops a building from the queue, which it sometimes does, that could cause the same bug. Messing with the queue behind the scenes needs to notify ProductionManager.

A more thorough fix would be to delegate all the work to BuildingManager. It’s awkward for drone pre-positioning to be in ProductionManager while the rest is in BuildingManager; better to keep the related parts together. Suppose buildings are sent to BuildingManager before they can be constructed, and BuildingManager is responsible for positioning workers. Then the existing BuildingManager state, with the addition of an “in preparation” label for buildings that can’t be started yet, can keep track of which worker has been assigned to each building. That would also avoid some construction delays that happen now when workers move around unnecessarily before starting the building. It’s more complicated, because messing with the queue then needs to notify BuildingManager. So I might go with the simpler solution.

Getting all the errors out of the infrastructure is hard. I have fixed about 10 bugs for the next version, but there are other basic infrastructure bugs that are as bad as this one, and I have fixed zero of them. They’re tricky. Meanwhile, the zerg emergency reactions also indirectly cause other errors in the strategy boss, sometimes preventing necessary tech switches....

Steamhammer configuration parsing bug

Why doesn’t this version of Steamhammer ever play its 5 pool on SSCAIT? I found a bug in parsing the configuration file.

There’s a bug in parsing a strategy combo which is nested inside another strategy mix. The syntax as I intended to implement it is inconsistent, and the parsing code only partially takes that into account.

In Steamhammer as configured, the effect of the bug is that whenever it should play 4 pool or 5 pool, it plays its default 9 pool speed instead. In other words, there’s not much difference, but it does reduce the bot’s unpredictability a little. In a bot with a different configuration, the effect might be worse.

the fix

If you are using the code, the fix is to make the syntax consistent.

1. In ParseUtils::ParseConfigFile(), make this snippet of code look like this. All you have to do is change "StrategyMix" to ourRaceStr in 2 places.

			// If we have set a strategy for the current matchup, set it.
			if (strategy.HasMember(matchup) && _ParseStrategy(strategy[matchup], strategyName, mapWeightString, ourRaceStr, strategyCombos))
			{
				Config::Strategy::StrategyName = strategyName;
			}
			// Failing that, look for a strategy for the current race.
			else if (strategy.HasMember(ourRace) && _ParseStrategy(strategy[ourRace], strategyName, mapWeightString, ourRaceStr, strategyCombos))
			{
				Config::Strategy::StrategyName = strategyName;
			}

2. Make the matching change in the configuration file. Everywhere it says "StrategyMix", change it to the race of the bot at that point, like "Zerg". It’s redundant, but it’s the same syntax for strategy mixes no matter the context.

implementing spells

Some people would spend no more than 10 minutes to bring up a public repository, but I’m taking my time. I am a long-time darcs user for my personal projects and don’t know git. A public repo today should be a git repo, today’s standard, not a version system with beautiful theory and ugly popularity. I want to get up to speed.

On the side, I’m thinking about the large number of skills that bots want to know and I’m trying to figure out how to fit them into the smallest amount of code. I noticed that a lot of spells share similar structures and should share code.

All the spells I list below can share backbone code which is at least partially data driven. I’m imagining something like this for each spellcasting unit:

Does it have the energy and tech to cast?
Is some suitable target nearby?
Find the best target, taking into account the whole situation including any maneuvering needed to get into position first.
If the best target is not good enough, no spellcasting action. Carry on as usual.
If we’re too far away, move toward the position.
We’re in position. Cast.

Each spell would need its own evaluator code “how good is this potential target?” It looks to me as though everything else that differs between them can be kept in a table of data and the rest of the code can be shared. Unless I get a better idea, I expect I’ll implement this eventually and suddenly support a large number of spells.
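The shared backbone might look something like the sketch below. It is only one way to read the steps above: all type and function names are hypothetical, the per-spell evaluator is assumed to have already scored the candidates, and maneuvering is reduced to a straight-line range check.

```cpp
#include <cassert>
#include <cmath>
#include <string>
#include <vector>

struct Position { double x; double y; };

// A potential target with the score given by the spell's own evaluator.
struct Candidate { Position pos; double score; };

// The data-driven part: one table row per spell.
struct SpellInfo
{
    std::string name;
    int energyCost;
    double range;       // casting range, in the same units as Position
    double threshold;   // minimum score that counts as "good enough"
};

enum class SpellAction { None, Move, Cast };

struct Decision { SpellAction action; Position target; };

// Shared decision code for every spell.
Decision decideSpell(const SpellInfo & spell, int energy, bool hasTech,
                     Position caster, const std::vector<Candidate> & candidates)
{
    assert(spell.range >= 0.0);
    const Decision none{ SpellAction::None, caster };

    // Energy and tech to cast? Any suitable target nearby?
    if (!hasTech || energy < spell.energyCost || candidates.empty())
    {
        return none;
    }

    // Find the best target.
    const Candidate * best = &candidates.front();
    for (const Candidate & c : candidates)
    {
        if (c.score > best->score)
        {
            best = &c;
        }
    }

    // Not good enough: no spellcasting action, carry on as usual.
    if (best->score < spell.threshold)
    {
        return none;
    }

    // Too far away: move toward the position. Otherwise cast.
    const double dist = std::hypot(best->pos.x - caster.x, best->pos.y - caster.y);
    return Decision{ dist > spell.range ? SpellAction::Move : SpellAction::Cast, best->pos };
}
```

Keeping the threshold in the table row means other code can raise or lower it without touching the shared logic.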

All of the spells below, but I’m thinking especially of hallucinate and dark swarm, benefit strongly from tactical coordination. That can be made to fit the framework. The tactics boss could decree things like “broodling if it’s an emergency, otherwise hold on until we can make a coordinated attack” simply by adjusting the “it’s good enough” threshold value. But even uncoordinated spells, which is all that most bots ever implement, would be fun and would pose problems to the opponent.

Consume and recall may fit a little awkwardly into the framework. They both depend on prepared units standing ready. But even they seem doable.

target is a unit

optical flare
restoration
defensive matrix
irradiate
lockdown
yamato cannon
hallucinate
feedback
mind control
parasite
broodling
consume

target is an area

EMP
nuke
psionic storm
maelstrom
disruption web
stasis
recall
ensnare
dark swarm
plague

release cycles

I’ve been thinking about release policies.

Eh, let’s try it. I think most people tinker with their bot for a while, then when they want to see how they’re doing they upload it and—see how they’re doing. It’s immediate and low-overhead. Good stuff.

Stamp a version number on it and throw it over the wall. Without thinking hard, I have fallen into an old-fashioned release cycle where I complete a bunch of work and hold a big release party. Releasing and documenting each version is overhead work, so I have incentive to stretch out the cycle to reduce overhead. It is not exactly Ideal Software Development Methodology.

My idea was that people would be unhappy if they saw advances on the server that they couldn’t get their hands on yet. I imagined, “Yo, hurry up and release already, I want that feature.” But is that true? I could switch to a compromise release cycle.

Kick development versions over to the server freely, and then when it looks done when poked with a fork, give it a version number and do the release work. On the downside, it seems more complicated for users. Official “stable” versions with updated documentation would only occasionally be running on the SSCAIT server, so what you see is not what you get. On the upside, there would be more variety for viewers (“look, it’s the bug of the week!”) and stable versions would be better tested.

It would affect how bots are updated in reaction to each other. My reaction cycle would be shorter, so new ways to beat Steamhammer might not survive as long. Equally, new Steamhammer behaviors would appear earlier on the server, so they might be countered sooner. Theoretically, it might speed up bot progress overall by some tiny amount.

The whole process would be more freeform and unpredictable, or in other words, modern. Even freer policies are possible, but I want to keep stable reference versions so that people aren’t tempted to fork broken unfinished code. And I have to minimize work; updating documentation as I go can cause extra work if I end up redesigning a feature that I document before it is tested enough.

What do you think?

bugs that do not go quietly

I recently fixed 2 bugs that then turned around and bit me.

I fixed a bug that caused Steamhammer to make double queen’s nests in around 1 game out of 150. It was a simple missing condition. But I made a typo in the fix: Parentheses in a function call swallowed up the next condition in the if. In any safe language it would have failed to typecheck, but in any case, the result was that Steamhammer made 3 or more queen’s nests in every game. I think C++ was laughing at me: “Nyah nyah! I’m too sharp a knife for you!”
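For readers who have not been bitten by this class of typo, here is a hypothetical reconstruction. It is not the actual Steamhammer code: `UnitType` is a made-up stand-in for a unit type class that converts implicitly to and from `int`, and the function names and IDs are invented.

```cpp
#include <cassert>
#include <map>

// Hypothetical stand-in for a unit type that converts implicitly both ways.
struct UnitType
{
    int id;
    UnitType(int i) : id(i) { assert(i >= 0); }   // not explicit: a bool converts too
    operator int() const { return id; }
};

int countUnits(const std::map<int, int> & counts, UnitType type)
{
    auto it = counts.find(type.id);
    return it == counts.end() ? 0 : it->second;
}

// Intended: build a nest only if we have none AND none is already queued.
bool shouldBuildNest(const std::map<int, int> & counts, UnitType nest, bool nestQueued)
{
    return countUnits(counts, nest) == 0 && !nestQueued;
}

// Typo: the closing parenthesis comes too late and swallows the next condition.
// (nest && !nestQueued) is a bool, which converts silently to UnitType 0 or 1,
// so this compiles and counts units of the wrong type.
bool shouldBuildNestTypo(const std::map<int, int> & counts, UnitType nest, bool nestQueued)
{
    return countUnits(counts, nest && !nestQueued) == 0;
}
```

In this sketch the mistyped version asks for another nest whenever unit types 0 and 1 happen to have a zero count, no matter how many nests already exist. Marking the constructor `explicit` would turn the typo into a compile error.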

I also tracked down and fixed the last known crashing bug, a subtle oversight in the strategy boss that caused crashes in around 1% of games. I brought up a game for a first test of the fix and... crash! Somehow I replaced a rare crash with a frequent crash, and even though I only touched a little code, I haven’t found the cause yet.

Well, it probably won’t take long—though who knows? After that I have 2 other critical bugs, and then 1.2.1 will be ready.

By the way, Steamhammer does suffer from one other crash besides the “last known” crashing bug. In rare cases, it crashes on startup. But the crash happens before any Steamhammer code executes, so there’s not much I can do about it without taking a long detour.

spiral development

When I started Steamhammer, one of my first additions was macro code so the bot could build units appropriate to the openings I had coded in.

Now I have tossed that macro code and completely replaced it. The new macro code is better in many ways: It reacts faster to changes, adapts to a wider range of circumstances, survives more emergencies, knows how to transition to a new unit mix and how to pick a new tech goal, and is generally more capable and less buggy.

And I am planning to toss it and replace it again. Its transitions feel mushy and imprecise, as if it weren’t quite sure what it was aiming for. It’s similar to other bots: It is making decisions based on low-level heuristics rather than on an understanding of the situation.

The central zerg macro skill is knowing when to make drones. Steamhammer 1.1 makes drones because it vaguely believes that more drones are better. Strong players don’t do that. A player at ICCup B level understands that the right drone count depends on what you want them for, and makes drones with specific purposes in mind. Making a purposeless drone means that your army is smaller and you are putting less pressure on the enemy; it is a mistake.

For another example, watch AILien tech up (Steamhammer 1.1’s play is not altogether different). When it has teched up to a new unit, you commonly see a few mutalisks or a few ultralisks join the army. If those are successful, more may join in. Pro games don’t go that way; at some point the observer will center on the zerg rally zone and you’ll see a dozen ultralisks as if out of nowhere. The pro is reacting to the future, not the past. No pro would spend on ultra tech to make a few and see how they turned out.

abstraction saves your bacon

So at some point I will rewrite the macro code again, to give it a deeper understanding of goals and constraints. How can I do it?

I keep seeing people complain, or point out by way of excuse, that the state space is too big for machine learning to work. I don’t accept it.

Take the zerg macro problem. You start out knowing part of the game state and part of its past: Where your units are, where enemy units were last seen, and all the other details. Your problem is to use this information to decide what to make next: a drone, a zergling, a hatchery, a creep colony....

If you frame the problem that way, the state space may truly be intractable. “I have a hydralisk with n hit points at position (x, y)” informs the decision, but not much! If you try to train a neural network (or whatever) with that input and output, you may struggle to make progress. And you absolutely will need a huge quantity of data that will be difficult to gather (even with OpenBW tools, which aim to make the data easier to gather).

You need an understanding of the situation, not a list of the details of the situation. I frame the problem as the interaction of goals and constraints. Goals are “I want this current unit mix,” “I want to aim for this set of upgrades,” “I want to add this tech.” Constraints are rooted in the game rules, “it takes x frames to make a y” and “my income is x minerals per second.” Constraints let you calculate or estimate useful information: “How long will it take me to make upgraded ultras?” and “How many ultras will I be able to include in my unit mix by then?” and “How many if I expand first to get more gas?”
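As a small worked example of calculating from constraints, the sketch below estimates how many frames it takes to afford something at a steady income. It is deliberately simplified: the function names are hypothetical, income is assumed constant, and build times, larvae, and supply are ignored.

```cpp
#include <algorithm>
#include <cassert>
#include <cmath>

// Frames until `cost` of one resource is in the bank, given the current stock
// and a constant income per frame. Returns -1 if it will never be affordable.
int framesUntilAffordable(int cost, int have, double incomePerFrame)
{
    assert(cost >= 0);
    if (have >= cost)
    {
        return 0;
    }
    if (incomePerFrame <= 0.0)
    {
        return -1;
    }
    return static_cast<int>(std::ceil((cost - have) / incomePerFrame));
}

// Frames until both the mineral and the gas cost are covered.
int framesUntilBothAffordable(int mineralCost, int gasCost, int minerals, int gas,
                              double mineralIncome, double gasIncome)
{
    const int m = framesUntilAffordable(mineralCost, minerals, mineralIncome);
    const int g = framesUntilAffordable(gasCost, gas, gasIncome);
    if (m < 0 || g < 0)
    {
        return -1;
    }
    return std::max(m, g);
}
```

Comparing the answer for the current gas income against the answer for the income expected after expanding is one way to frame the “expand first?” question as a calculation.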

“What should my current unit mix be?” is a more tractable problem to solve than “what unit should I make next?” and yet in practice it answers the same question. It abstracts away the detail of “what next?” leaving only “what?” The detail of exactly which is next may not matter, and if it does, it can be solved as a separate problem. Also the unit mix question can be answered partly by constraints: Knowing your mineral and gas income and rate of larva production constrains what your unit mix can be, even when it doesn’t constrain what your next unit is. Constraints can be calculated; they should be included in or reflected in the input to the learning algorithm, not treated as something to learn. Don’t do everything bottom up; top down has its place.
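To make the constraint concrete, here is a minimal sketch of the sustainable production rate for one unit type, limited by whichever of minerals, gas, or larvae runs out first. The names are hypothetical, and the incomes and larva rate are assumed to be steady rates over the same time period.

```cpp
#include <algorithm>
#include <cassert>
#include <limits>

// Hypothetical per-unit cost. `larvae` is 0 for units that need no larva.
struct UnitCost
{
    int minerals;
    int gas;
    int larvae;
};

// Units per time period that the economy can sustain if it makes only this
// unit. Returns infinity if the unit costs nothing at all.
double sustainableRate(const UnitCost & cost, double mineralIncome, double gasIncome, double larvaRate)
{
    assert(mineralIncome >= 0.0 && gasIncome >= 0.0 && larvaRate >= 0.0);
    double rate = std::numeric_limits<double>::infinity();
    if (cost.minerals > 0)
    {
        rate = std::min(rate, mineralIncome / cost.minerals);
    }
    if (cost.gas > 0)
    {
        rate = std::min(rate, gasIncome / cost.gas);
    }
    if (cost.larvae > 0)
    {
        rate = std::min(rate, larvaRate / cost.larvae);
    }
    return rate;
}
```

With 600 minerals, 300 gas, and 5 larvae per period, a unit costing 100 minerals and 100 gas is capped at 3 per period by gas, leaving minerals and larvae free for something gas-free. That leftover is the kind of information that narrows the unit mix without deciding the next unit.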

Similarly for other goals like “what tech should I seek?”

Abstraction also helps on the input side. I want the input to be not “here is what is known of the game state” directly from InformationManager, but “here is the abstract tactical situation,” the tactical sitrep, maintained by a tactics boss which does not exist yet. I want the strategy boss to realize “it’s too dangerous to expand in this situation, and I have enough drones to support this unit mix, so no more drones for now.” Further in the future, I want it to be able to say “this enemy mineral line has a back door, so morph these nearby hydras to lurkers.” Or something with the same effect.

With an abstract output and a much smaller abstract input, the learning problem should be fully tractable. “The state space is too big” is a flag, and the flag flies over the territory of abstraction. Deep learning copes with large state spaces by learning abstraction using huge amounts of data; I don’t expect to be able to get huge amounts, so I must supply abstraction from outside.

spiral development

Steamhammer’s original macro code was needed so that the bot could play a complete game—otherwise I couldn’t make progress on the basic features. In 1.0 the basic features were finished enough, so next I replaced the macro code to add flexibility; the bot has greater ability to react and adapt. That flexibility is needed for the next step, a smarter tactical boss that keeps track of tactical goals and maintains a picture of the tactical situation. And that tactical picture is the foundation I need to throw away the macro code again and rebuild it.

I’m still concentrating on strategy first. But strategy needs tactical understanding; the levels are not independent.

don’t miss the invisible bug fix

I fixed a sneaky, subtle, in fact nearly invisible bug in the “how to beat Stone” code and updated the post with a correction and analysis.

The details of operating Brood War via BWAPI are deeply intricate. How much mysterious bot behavior is due to other unnoticed, and perhaps nearly unnoticeable, minute particularities?

thou shalt not divide by zero

Steamhammer is mostly reliable, but does crash occasionally. When SSCAIT catches a crash, it saves a couple of low-level dump files that offer a hope of tracing the cause.

Today I got a game with a division by zero crash, which was infuriating. Division by zero must die! Zero tolerance of zero! I decided to eliminate all bad divisions by inspecting every division “/” and mod “%” in the code, though it was a little tricky when the comment delimiter is “//” and “%” is used in strings.

	// don't even enter the same city as division by zero

I did it, and shored up a few weak spots. No direct division by zero should be possible in Steamhammer. I also added error checks in the hope of catching bad parameters before they are passed to BWAPI, which looks like the cause of other rare crashes that I have never been able to trace.
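One common way to shore up a weak spot is to route risky divisions through a small guarded helper. The sketch below is an illustration of that idea, not the code actually used in Steamhammer; `safeDiv` and `safeMod` are hypothetical names, and the choice of fallback value is up to each call site.

```cpp
#include <cassert>
#include <limits>

// Integer division that returns `fallback` instead of crashing when the
// divisor is zero. It also guards the one other undefined case for ints,
// the most negative value divided by -1.
int safeDiv(int numerator, int denominator, int fallback = 0)
{
    if (denominator == 0)
    {
        return fallback;
    }
    if (denominator == -1 && numerator == std::numeric_limits<int>::min())
    {
        return fallback;
    }
    return numerator / denominator;
}

// The matching guard for the remainder operator.
int safeMod(int numerator, int denominator, int fallback = 0)
{
    if (denominator == 0)
    {
        return fallback;
    }
    if (denominator == -1)
    {
        return 0;   // the remainder is always 0, and this avoids the overflow case
    }
    return numerator % denominator;
}
```

A helper like this hides the symptom rather than the cause, so it suits places where a sensible default exists, such as an average over an empty set.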

I tell the story as an example. If you want to write reliable software, you have to be thorough. I fix all reproducible crashes immediately, so that (so far) crashes on the server are unreproducible and difficult to solve, but I don’t let that stop me.

Steamhammer and software development

I’m going to make a post about bugs and flaws in Steamhammer, with pictures. Some are hilarious, some are to weep over. But there are so, so many that I’m too busy fixing them to write about them! So far, my change log for the next version shows 22 new features and bug fixes worth mentioning, and yet the issues marked DONE are thin on the ground.

On my first day of development on Steamhammer, December 2 or 3 I think, I started tracking issues to deal with. Within a day the list was large enough to need organizing. I divided it into topics: Macro issues, build order issues, targeting issues, code issues.... After a while that grew to become unwieldy too, so I also prioritized the issues to concentrate on the critical absolutely-must-fix ones. Today I ran test matches against various opponents and found serious bugs, and now the absolutely-must-fix issues are more than I can keep an easy overview of. I’m overwhelmed, and it’s harming my ability to solve problems in the right order.

This is why it takes a long development period before a new bot can threaten the top ranks. Well, I knew I was in for it. One step at a time. I will eventually get to the good part.

Not that the test matches went poorly. The development version of Steamhammer with new skills and a new ZvZ opening scored 26-4 against Zia, a huge improvement. Zia tried all its openings and could not find an answer. The 4 losses were all due to play bugs that I hope to fix soon—if they don’t get swamped by other bugs. Steamhammer also finally has a non-awful ZvRandom opening.