When I started Steamhammer, one of my first additions was macro code so the bot could build units appropriate to the openings I had coded in.
Now I have tossed that macro code and completely replaced it. The new macro code is better in many ways: It reacts faster to changes, adapts to a wider range of circumstances, survives more emergencies, knows how to transition to a new unit mix and how to pick a new tech goal, and is generally more capable and less buggy.
And I am planning to toss it and replace it again. Its transitions feel mushy and imprecise, as if it weren’t quite sure what it was aiming for. It’s similar to other bots: It is making decisions based on low-level heuristics rather than on an understanding of the situation.
The central zerg macro skill is knowing when to make drones. Steamhammer 1.1 makes drones because it vaguely believes that more drones are better. Strong players don’t do that. A player at ICCup B level understands that the right drone count depends on what you want them for, and makes drones with specific purposes in mind. Making a purposeless drone means that your army is smaller and you are putting less pressure on the enemy; it is a mistake.
For another example, watch AILien tech up (Steamhammer 1.1’s play is not altogether different). When it has teched up to a new unit, you commonly see a few mutalisks or a few ultralisks join the army. If those are successful, more may join in. Pro games don’t go that way; at some point the observer will center on the zerg rally zone and you’ll see a dozen ultralisks as if out of nowhere. The pro is reacting to the future, not the past. No pro would spend on ultra tech to make a few and see how they turned out.
abstraction saves your bacon
So at some point I will rewrite the macro code again, to give it a deeper understanding of goals and constraints. How can I do it?
I keep seeing people complain, or point out by way of excuse, that the state space is too big for machine learning to work. I don’t accept it.
Take the zerg macro problem. You start out knowing part of the game state and part of its past: Where your units are, where enemy units were last seen, and all the other details. Your problem is to use this information to decide what to make next: a drone, a zergling, a hatchery, a creep colony....
If you frame the problem that way, the state space may truly be intractable. “I have a hydralisk with n hit points at position (x, y)” informs the decision, but not much! If you try to train a neural network (or whatever) with that input and output, you may struggle to make progress. And you absolutely will need a huge quantity of data that will be difficult to gather (even with OpenBW tools, which aim to make the data easier to gather).
You need an understanding of the situation, not a list of the details of the situation. I frame the problem as the interaction of goals and constraints. Goals are “I want this current unit mix,” “I want to aim for this set of upgrades,” “I want to add this tech.” Constraints are rooted in the game rules, “it takes x frames to make a y” and “my income is x minerals per second.” Constraints let you calculate or estimate useful information: “How long will it take me to make upgraded ultras?” and “How many ultras will I be able to include in my unit mix by then?” and “How many if I expand first to get more gas?”
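Constraints of this kind are just arithmetic over known quantities. A minimal sketch of one such calculation, with hypothetical function names and everything passed in as parameters (this is illustrative, not Steamhammer code):

```cpp
#include <algorithm>

// Earliest frame (counting from now) at which we can pay for something,
// given current resource banks and per-frame income rates.
// Returns -1 if income is zero for a resource we still need.
int framesUntilAffordable(double minerals, double gas,
                          double mineralCost, double gasCost,
                          double mineralsPerFrame, double gasPerFrame)
{
    const double needM = std::max(0.0, mineralCost - minerals);
    const double needG = std::max(0.0, gasCost - gas);
    if (needM > 0.0 && mineralsPerFrame <= 0.0) return -1;   // no mineral income
    if (needG > 0.0 && gasPerFrame <= 0.0) return -1;        // no gas income
    const double waitM = needM > 0.0 ? needM / mineralsPerFrame : 0.0;
    const double waitG = needG > 0.0 ? needG / gasPerFrame : 0.0;
    // The binding constraint is whichever resource arrives last.
    return static_cast<int>(std::max(waitM, waitG) + 0.5);
}
```

Answering “how long until upgraded ultras?” is then a matter of chaining a few of these estimates with the relevant build and research times.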
“What should my current unit mix be?” is a more tractable problem to solve than “what unit should I make next?” and yet in practice it answers the same question. It abstracts away the detail of “what next?” leaving only “what?” The detail of exactly which is next may not matter, and if it does, it can be solved as a separate problem. Also the unit mix question can be answered partly by constraints: Knowing your mineral and gas income and rate of larva production constrains what your unit mix can be, even when it doesn’t constrain what your next unit is. Constraints can be calculated; they should be included in or reflected in the input to the learning algorithm, not treated as something to learn. Don’t do everything bottom up; top down has its place.
Similarly for other goals like “what tech should I seek?”
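The income-and-larva constraint on the unit mix can be stated directly as code. A sketch under assumed names (the types and costs are parameters; the check ignores supply, build times, and simultaneity):

```cpp
#include <vector>
#include <cstddef>

struct UnitType {
    double mineralCost;
    double gasCost;
    int larvaCost;   // 1 for most zerg units; 0 for morphs like lurkers
};

// Can we sustain making countPerMinute[i] of each type per minute, given
// per-minute income and larva production? A pure constraint check: it
// narrows the space of candidate unit mixes before any learning happens.
bool mixIsSustainable(const std::vector<UnitType>& types,
                      const std::vector<int>& countPerMinute,
                      double mineralsPerMinute, double gasPerMinute,
                      double larvaePerMinute)
{
    double m = 0.0, g = 0.0, l = 0.0;
    for (std::size_t i = 0; i < types.size(); ++i) {
        m += types[i].mineralCost * countPerMinute[i];
        g += types[i].gasCost * countPerMinute[i];
        l += types[i].larvaCost * countPerMinute[i];
    }
    return m <= mineralsPerMinute && g <= gasPerMinute && l <= larvaePerMinute;
}
```

A learner that proposes unit mixes only needs to search among mixes that pass this kind of test; the constraint is calculated, not learned.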
Abstraction also helps on the input side. I want the input to be not “here is what is known of the game state” directly from InformationManager, but “here is the abstract tactical situation,” the tactical sitrep, maintained by a tactics boss which does not exist yet. I want the strategy boss to realize “it’s too dangerous to expand in this situation, and I have enough drones to support this unit mix, so no more drones for now.” Further in the future, I want it to be able to say “this enemy mineral line has a back door, so morph these nearby hydras to lurkers.” Or something with the same effect.
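One way to picture the sitrep: a handful of judgments instead of thousands of raw unit records. A hypothetical sketch; none of these types exist in Steamhammer, and the real tactics boss, as noted, does not exist yet:

```cpp
// A compressed tactical summary. Each field is a judgment computed
// from the raw game state, not a raw detail of it. All names are
// hypothetical, for illustration only.
enum class Danger { Safe, Contested, Threatened };

struct TacticalSitrep {
    Danger expansionDanger;    // can we safely take another base?
    bool dronesSaturated;      // enough drones for the current unit mix?
    int enemyArmySupplySeen;   // rough size of the known enemy army
};

// The strategy boss consumes judgments: "too dangerous to expand, and
// drones are saturated, so no more drones for now."
bool shouldMakeDrone(const TacticalSitrep& s)
{
    return s.expansionDanger == Danger::Safe && !s.dronesSaturated;
}
```

An input of this shape is a few dimensions wide, which is what makes the learning problem tractable without huge amounts of data.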
With an abstract output and a much smaller abstract input, the learning problem should be fully tractable. “The state space is too big” is a flag, and the flag flies over the territory of abstraction. Deep learning copes with large state spaces by learning abstraction using huge amounts of data; I don’t expect to be able to get huge amounts, so I must supply abstraction from outside.
spiral development
Steamhammer’s original macro code was needed so that the bot could play a complete game—otherwise I couldn’t make progress on the basic features. In 1.0 the basic features were finished enough, so next I replaced the macro code to add flexibility; the bot has greater ability to react and adapt. That flexibility is needed for the next step, a smarter tactical boss that keeps track of tactical goals and maintains a picture of the tactical situation. And that tactical picture is the foundation I need to throw away the macro code again and rebuild it.
I’m still concentrating on strategy first. But strategy needs tactical understanding; the levels are not independent.