My goals and plans for Steamhammer development. What, I told you it was next, didn’t you believe me?
Start with the usual 3 levels of abstraction: Strategy, tactics, unit control. My eventual goal is to conquer the levels one after another with machine learning, starting at the top with strategy—it’s more fun that way. I think I have to tackle each level separately, because any learning method that can learn all levels at once will devour more data than I can feed it—it will be beyond my resources.
My vague plan for each level is to start with a hand-made approximation of good play and learn the differences between the approximation and actual good play. That’s also because it’s more fun that way. First, play will never be too awful, so it won’t be discouraging. Second, learning will be faster because there’s less to learn, which is important because I expect to have to relearn each model repeatedly. For example, every time the tactics change too much, the strategy model should relearn what strategy is good with this set of tactical skills.
One goal is to learn opponent models, at least at the strategy level. I’d love to learn models at each level. We’ll see if I get that far.
Another goal for the strategy level is to be capable of every opening. If an absurd build like 8 gas 14 pool is good against some weird opponent, I want Steamhammer to be able to figure that out, at least in principle. I want to be able to say: If Steamhammer played enough games, it could eventually figure it out, no matter how unusual. Humans can develop new builds, bots should do it too.
For the moment I’m working on infrastructure. I renamed UAlbertaBot’s MetaType
data structure which represents “a thing to build/research/etc.” to MacroAct
, since I like names I can understand. I added “command” actions to MacroAct
, and so far have implemented commands for scouting and taking drones off gas (both of which affect macro—they are macro actions). You can now write these into build orders in the config file, which makes the build orders more powerful; the bot’s behavior is no longer hardcoded. (Dave Churchill was apparently planning something similar, since the original MetaType
already had an unused “command” variant.) I’ll use the same system for communication from the strategy boss to the production manager.
After a few more commands, I’ll add positioning information to MacroAct
, so that it can distinguish between an expansion hatchery and a macro hatchery, specify which base gets the static defense building, and stuff like that.
That covers the most vital missing skills. Then I’ll start on the strategy boss. Goals are versatility, good macro, fast adaptation, and the ability to recover from upsets, all of which Steamhammer 0.2 is weak at.
A first working version should be ready to release some time in January. Depending on how fast I can make progress, I would like to add one more feature before the release, random choice of openings. Learning bots quickly learn to exploit Steamhammer 0.2’s fixed openings, and that’s no good. I’ll start with fixed probabilities for each opening, specified in the config file. Since the strategy boss can adapt to the game situation, the opening build orders will be shorter than now. A random mix of aggressive openings and macro openings with different tech choices should be difficult for the learners to exploit, as long as Steamhammer can play all the openings well enough. It should also be more fun for humans to play against.
Then it will finally be machine learning time. I’ll decide whether to learn openings or strategic play (after the opening) as the next step. The first target of opening learning will not be “learn the best opening,” it will be “learn the best mix of openings.” The second will be to develop new openings, of course.
Along the way I’ll be fixing bugs, improving micro decisions, and whatever else seems essential. But since I plan to completely replace the tactics and micro code, I’d like to minimize the work I put in and only do the essential.