goal monitoring as a structural principle
Humans naturally keep track of progress toward their goals. It doesn’t matter what you’re trying to do: If you reach out for something but don’t get a good grasp, you notice. If you’re making a point in conversation, you try to tell whether the other person understood. In Starcraft, people notice (unless they’re swamped with multitasking) when a unit gets stuck, or an army takes a bad path, or when anything at all interferes with a plan. And having noticed a problem, they can try to solve it.
Isn’t it obviously a good idea? And yet it seems rare in bots. I think nearly all bots only notice problems that they are explicitly coded to look for. They don’t notice when their units run into Iron’s wall and start to move aimlessly, seeking a firing position that they can’t reach. Even a novice human will realize that something is wrong, but bots don’t register that progress is stalled and keep trying to execute the failing plan.
I’ve been thinking about adding goal progress monitoring throughout Steamhammer, at every level: Strategy, operations, tactics, micro. First, I want to rewrite everything with explicit goals anyway, because I think it is clearer and more flexible. Carrying out a goal consists of choosing a plan (either ahead of time, or piece by piece on the fly) and executing the plan. Then goal monitoring means being able to tell whether the plan is executing as intended. Firing at Iron’s marine behind the wall is a two-step plan: get into range, then fire a shot. Getting into range is itself a subgoal, with a plan to move to a position that is in range. And we can tell whether a movement plan is executing as intended: Is the range closing with time? Does the movement take about as long as predicted? If not, then the plan is going wrong, and we may want to patch the plan, or try a different plan, or back up further and try a different goal.
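To make the movement monitor concrete, here is a minimal C++ sketch. The names (MovePlan, monitorMove) and the thresholds are hypothetical, not Steamhammer’s actual code; it only illustrates the two checks above: is the range closing, and is the plan taking about as long as predicted?

```cpp
struct MovePlan
{
    double lastDistance;     // distance to the target at the previous check
    int    framesElapsed;    // frames since the plan began
    int    framesPredicted;  // predicted frames to reach firing range
    int    stalledChecks;    // consecutive checks with no progress
};

enum class PlanStatus { OnTrack, Stalled, Overdue };

// Call once per frame with the current distance to the target.
PlanStatus monitorMove(MovePlan & plan, double currentDistance)
{
    const double tolerance = 1.0;          // ignore jitter smaller than this

    if (currentDistance < plan.lastDistance - tolerance)
        plan.stalledChecks = 0;            // range is closing
    else
        ++plan.stalledChecks;              // no progress this frame

    plan.lastDistance = currentDistance;
    ++plan.framesElapsed;

    if (plan.framesElapsed > 2 * plan.framesPredicted)
        return PlanStatus::Overdue;        // taking far longer than predicted
    if (plan.stalledChecks > 48)           // roughly 2 seconds of game time
        return PlanStatus::Stalled;        // range is not closing
    return PlanStatus::OnTrack;
}
```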
It seems like a lot of detail work if done by hand, and I’m sure I will do part of it by hand. But it means that the bot will always react to problems. If Steamhammer is beating its head against Iron’s wall, it will notice. Even if it doesn’t have a wall recognizer and doesn’t know how to react, it will know that its plan is failing and that it should try something different: choose another target to shoot at, and maybe that will work. After several tries, it is sure to find that shooting the wall itself succeeds. Even without specific knowledge, general adaptivity seems valuable.
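That blind retry behavior can be sketched too (again with invented names): when the plan against the current target is judged to be failing, simply move on to the next candidate. Once the units behind the wall are exhausted, the wall buildings themselves come up as targets, and the attack finally lands.

```cpp
#include <cstddef>
#include <utility>
#include <vector>

struct Target { int unitID; };

// Cycles through candidate targets in preference order, advancing
// whenever the plan against the current target is failing.
class TargetChooser
{
    std::vector<Target> candidates;   // e.g. units behind the wall first,
                                      // then the wall buildings themselves
    std::size_t next = 0;

public:
    explicit TargetChooser(std::vector<Target> t)
        : candidates(std::move(t)) {}

    // Returns the next target to try, or nullptr when we are out of ideas
    // and should back up to a different goal entirely.
    const Target * tryAnother()
    {
        return next < candidates.size() ? &candidates[next++] : nullptr;
    }
};
```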
It also provides a clear task structure. Today, Steamhammer’s structure is ad hoc; the underlying principle might as well be “let’s code up some behavior!” With a structure of goals and plans, that amorphous behavior becomes a programming pattern to be reused over and over in the code. Each behavior is made up of a fixed set of parts: Choose goals, plan how to meet each goal, monitor the progress of each plan, back up and try something else if the plan is failing.
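Here is what that reusable pattern might look like in C++. These interfaces are my own sketch, not Steamhammer’s real classes; they only show the fixed shape that every behavior shares.

```cpp
#include <memory>

class Plan
{
public:
    virtual ~Plan() = default;
    virtual void advance() = 0;           // execute one step of the plan
    virtual bool isDone() const = 0;      // goal met?
    virtual bool isFailing() const = 0;   // progress monitoring
};

class Goal
{
public:
    virtual ~Goal() = default;

    // Choose the next plan to try, or return nullptr when out of plans.
    virtual std::unique_ptr<Plan> choosePlan() = 0;

    // The fixed control structure, shared by every goal at every level.
    // (In a real bot this would be spread across game frames rather than
    // run as a blocking loop.)
    bool pursue()
    {
        while (auto plan = choosePlan())
        {
            while (!plan->isDone() && !plan->isFailing())
                plan->advance();

            if (plan->isDone())
                return true;       // the plan worked
            // Otherwise the plan is failing: loop and try another plan.
        }
        return false;   // out of plans; the caller backs up to another goal
    }
};
```

A plan at one level can pursue subgoals at the level below, so the same shape covers strategy, operations, tactics, and micro.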
The clear structure also helps with machine learning. “OK, system, now learn to behave!” is a hard problem. “Subsystems, go learn to choose goals, plan, monitor, and retry” is an easier problem, because the relationships between the parts are fixed. There is less to learn, and better information about which parts are working well.
Well, that’s my long-term vision. I expect I will get there in a slow and messy way, as always.
Comments
Stanislaw Halik:
Breaking learning stuff into a hierarchy is a great idea. Curse of dimensionality, input count, all that.
How solid a background in probability theory does one need for learning systems? Can one start using them with only cursory theoretical knowledge? I can personally understand formulae in a paper but at only a shallow level, without knowing the implications the research carries.
Tyr:
I think the main reason so few bots do this is that figuring out a reasonable alternative is very hard when you as the programmer don't know what problem the bot will run into. It's much easier to look at specific problems and figure out a fix for each than to create a generic problem solver.