software maintenance and the decision cycle

Whether in your brain or in a Starcraft bot, to act in the world you first collect information, evaluate the information to make decisions, and execute your decisions. The steps may not be as neatly separated as the words that describe them, but they are always there. Think of the psychology concepts of perception, cognition, and motor control, or the military OODA loop (observe, orient, decide, act), and other decision cycles.

When you write a big piece of software, it matters how you organize the steps. In general terms, Steamhammer follows its parent UAlbertaBot, and many other bots, in the way it organizes them: By the decisions. The code that makes a decision is responsible for collecting whatever information it needs, by whatever combination of calling BWAPI directly and calling on the rest of the program, and responsible for executing its decisions, again sometimes calling BWAPI directly to issue orders and sometimes passing internal orders to the rest of the program. So one module makes spending decisions (“a hydralisk next”), one module controls mining workers (“send it to that patch”), and so on.

To a certain extent, that organization is inevitable. Decisions of different kinds have to be made by different code (absent super-powerful machine learning or some other extreme abstraction technique), and the code has to have inputs and outputs. But the haphazard way of collecting inputs, and of passing along outputs, is not so good. I noticed long ago, and over time I’ve seen more clearly, that it is error prone.

On the input side, the data a module sees depends on the order that modules run in: They are not independent. I sorted modules so that, on each frame, information-gathering ones like InformationManager run before decision-making ones like CombatCommander, but in the full program the dependencies are not that simple. Read closely and you’ll find comments like “this must happen before that,” and comments like “eh, the data is one frame out of date but in this case it doesn’t matter,” and special cases to work around backward dependencies. I have fixed bugs, and I feel 100% certain that there are undiscovered bugs due to computing information only after it is needed.

On the output side, it’s difficult to coordinate decisions. A common error is double commanding, where a unit is given contradictory orders: One bit says “Look out, drone, the enemy is near, run away,” then the rest of the code doesn’t remember that the decision is made and says “Hey drone, you’re not mining, get back to work.” Most orders (not all) go through the Micro module for execution, and Micro knows not to issue two BWAPI commands for a unit on the same frame, so a frequent result is that the drone is told to run away one frame, then to mine the next frame, and so on back and forth. It’s a common cause of bugs where units vibrate in place instead of doing anything useful, and the worker manager (which makes a lot of special case decisions) has a particularly elaborate internal system to try to prevent it. Literal double commanding at the BWAPI level is only one issue; the same kind of thing can also happen at higher levels of abstraction, causing problems like indecisive squads.
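Here is a toy model of how a one-command-per-frame guard can turn a conflict into alternation instead of removing it. It is a sketch in Python rather than Steamhammer's C++, and every name in it (`Micro`, `safety_pass`, `worker_pass`) is hypothetical. It also assumes that each decider only speaks up when the unit's current order is not the one it wants, which is one plausible mechanism, not a description of the real code.

```python
class Micro:
    """Hypothetical execution layer: at most one command per unit per frame."""

    def __init__(self):
        self.last_frame = {}   # unit -> frame of the last accepted command
        self.current = {}      # unit -> command the unit is now following
        self.log = []          # (frame, command) pairs actually issued

    def issue(self, frame, unit, command):
        if self.last_frame.get(unit) == frame:
            return False       # already commanded this frame: drop it
        self.last_frame[unit] = frame
        self.current[unit] = command
        self.log.append((frame, command))
        return True


def safety_pass(micro, frame, unit, enemy_near):
    # Decider 1: flee if threatened and not already fleeing.
    if enemy_near and micro.current.get(unit) != "flee":
        micro.issue(frame, unit, "flee")


def worker_pass(micro, frame, unit):
    # Decider 2: knows nothing about decider 1; puts non-miners to work.
    if micro.current.get(unit) != "mine":
        micro.issue(frame, unit, "mine")


def simulate(frames):
    micro = Micro()
    for frame in range(frames):
        safety_pass(micro, frame, "drone", enemy_near=True)
        worker_pass(micro, frame, "drone")
    return [command for _, command in micro.log]
```

In this model `simulate(4)` yields flee, mine, flee, mine: the guard blocks the second command on each frame, but the losing decider simply wins the next frame.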

The logical fix is to add architectural barriers between input, decision, and output. In principle, each module collects all its inputs and puts them into a data structure, then draws a line under it, done. Then it makes its decisions on that basis, records the decisions in another data structure (with the idea of forcing it to resolve any conflicting decisions up front), and draws a line under that. Then it executes the recorded decisions. Input, decision, and output become separate phases of execution.
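The three phases could be sketched roughly like this. The names (`Inputs`, `Decisions`, `collect`, `decide`, `execute`) are made up for illustration, the `world` dictionary stands in for whatever the game interface provides, and resolving conflicts by numeric priority is my assumption here; any rule that settles conflicts before execution would fit the idea.

```python
from dataclasses import dataclass, field


@dataclass(frozen=True)
class Inputs:
    """Phase 1 result: frozen so later phases cannot quietly add to it."""
    threatened: frozenset
    idle: frozenset


@dataclass
class Decisions:
    """Phase 2 result: at most one order per unit."""
    orders: dict = field(default_factory=dict)   # unit -> (order, priority)

    def propose(self, unit, order, priority):
        # Conflicts are resolved here, before anything is executed.
        current = self.orders.get(unit)
        if current is None or priority > current[1]:
            self.orders[unit] = (order, priority)


def collect(world):
    return Inputs(threatened=frozenset(world["threatened"]),
                  idle=frozenset(world["idle"]))


def decide(inputs):
    decisions = Decisions()
    for unit in inputs.idle:
        decisions.propose(unit, "mine", priority=1)
    for unit in inputs.threatened:
        decisions.propose(unit, "flee", priority=2)
    return decisions


def execute(decisions, issue):
    # Phase 3: the only place that talks to the game.
    for unit, (order, _priority) in sorted(decisions.orders.items()):
        issue(unit, order)


def on_frame(world, issue):
    inputs = collect(world)       # line drawn under the inputs
    decisions = decide(inputs)    # line drawn under the decisions
    execute(decisions, issue)
```

A drone that is both idle and threatened gets exactly one order, because the conflict is settled inside `Decisions` and never reaches the execution phase.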

In real life the dependencies are complicated and it’s not that simple. I’m thinking that the ideal architecture for input data is a fixed declarative representation of everything that might be wanted during a given frame, which is evaluated on demand, in the style of lazy functional programming. That way dependencies are explicit, dependency loops will make themselves evident, and only the information you need is computed each frame.
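A minimal sketch of the on-demand idea, assuming a table of named rules where each rule asks for its own inputs by name. `FrameData`, `CycleError`, and the rule names are hypothetical, and the rules return canned values where a real bot would query the game.

```python
class CycleError(Exception):
    """Raised when a value ends up depending on itself."""


class FrameData:
    """One instance per frame: values are computed on first request, then cached."""

    def __init__(self, rules):
        self.rules = rules        # name -> function(frame_data) -> value
        self.cache = {}
        self.in_progress = []     # names currently being evaluated
        self.evaluated = []       # order in which values were computed

    def get(self, name):
        if name in self.cache:
            return self.cache[name]
        if name in self.in_progress:
            raise CycleError(" -> ".join(self.in_progress + [name]))
        self.in_progress.append(name)
        try:
            value = self.rules[name](self)
        finally:
            self.in_progress.pop()
        self.cache[name] = value
        self.evaluated.append(name)
        return value


# Illustrative rules only.
RULES = {
    "enemy_units": lambda d: ["zealot", "probe"],
    "enemy_army": lambda d: [u for u in d.get("enemy_units") if u != "probe"],
    "under_threat": lambda d: len(d.get("enemy_army")) > 0,
    "scouting_report": lambda d: "expensive, and not needed every frame",
}

frame_data = FrameData(RULES)
threat = frame_data.get("under_threat")
```

Asking for `under_threat` computes `enemy_units` and `enemy_army` first and never touches `scouting_report`, so run order follows from the requests instead of being arranged by hand. A rule that depends on itself, directly or indirectly, raises `CycleError` with the loop spelled out.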

I don’t have such a beautiful solution for output. The Micro module is a partially implemented attempt to separate some decisions from their execution. It does help, but as we’ve seen above, even if it were a complete implementation it would not solve the problem. The decisions themselves have to be good, and though architecture can aid good decisions it can’t require them. Maybe there’s nothing for it but to be clear about exactly what you’re deciding, at what level of abstraction, and be careful to do it right.

Comments

Tully Elliston on Monday, December 14. 2020:

I would approach this problem by tracking the latest valid order given to each unit;
tracking the priority of that order;
and tracking a timeout associated with the order, which must complete before a new order is accepted - UNLESS the new order is of a higher priority, in which case it will overwrite the old order and reset the timeout appropriately.

This would let you have orders which always counter other orders (scatter to avoid reaver projectile), and hopefully still reduce indecisiveness around low priority tasks (since unless a high priority order comes in, the unit will keep its current order until the timeout completes).

The key then would be assembling appropriate priorities and timeouts for each command. Perhaps something that could be farmed out to a config file or brute forced with machine learning for best results.
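A minimal sketch of that scheme, with hypothetical names (`OrderTracker`, `ActiveOrder`) and timeouts measured in frames, which is an assumption. The priorities and timeouts passed in are placeholders for whatever a config file or tuning process would supply.

```python
from dataclasses import dataclass


@dataclass
class ActiveOrder:
    command: str
    priority: int
    expires: int   # first frame at which an equal or lower priority order is accepted


class OrderTracker:
    def __init__(self):
        self.active = {}   # unit -> ActiveOrder

    def request(self, frame, unit, command, priority, timeout):
        """Return True if the order is accepted and should be issued."""
        current = self.active.get(unit)
        if (current is not None
                and frame < current.expires
                and priority <= current.priority):
            return False   # the old order is still protected by its timeout
        # Accepted: either nothing was active, the timeout ran out,
        # or the new order outranks the old one. Reset the timeout.
        self.active[unit] = ActiveOrder(command, priority, frame + timeout)
        return True
```

With this rule a low-priority "mine" request is refused while a "flee" order is still inside its timeout, and a higher-priority "scatter" replaces "flee" immediately.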

Jay Scott on Tuesday, December 15. 2020:

Priority does seem like a good organizing principle.

Bytekeeper on Wednesday, December 16. 2020:

While my bot is not very good, I do like the approach which seems to be similar to what you describe:
Behaviour trees - things that have to happen in order can be described. Things that might need alternative solutions as well.
Even "in-order" "but at the same time" is possible.

Instead of a Micro module, I use a reservation system. For one, it is used to reserve resources like many bots do. I also reserve units.
I.e., the tree for "building" something will lock a worker, and no other follow-up node can use it.

I extended my behaviour trees with a utility value per frame: If "run away from enemy" has a higher utility, it will be processed first (depending on where it is in the tree) - and thus cannot be used by the "gather minerals" node.

Also, instead of having separate trees for units, I actually have one large tree governing everything, including individual unit control.

Evaluation is basically lazy - only parts that "may run" are triggered.

Dependencies are *not* explicit unless I recursively call other trees (which I do) - but they are usually very close to each other. And a "sequence" node more or less defines an exact order of things, although it is not clear which node depends on which predecessor exactly.

And having a utility value for nodes allows me to avoid renegotiation of assigned resources. It's first come first serve; but the first had the highest priority/utility anyways.

Dan on Thursday, December 17. 2020:

I also use and recommend a reservation system for resources. PurpleWave's game plan is a fixed graph of strictly-prioritized tasks, each of which gets to reserve any unallocated resources (units, minerals, gas, buildable tiles) ahead of any lower-priority tasks. That also allows strategy-specific considerations like "what are sound conditions to attack?" to be customized side-by-side with macro decisions.
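Strictly prioritized reservation might be sketched like this. It is not PurpleWave's code: the function name, the task names, and the rule that a task which cannot yet be fully funded still holds its partial reservation are all assumptions made for illustration.

```python
def reserve_in_priority_order(tasks, available):
    """tasks: list of (name, wants) with the highest priority first.
    wants and available map a resource name to an amount.
    Returns ({name: (held, ready)}, remaining)."""
    remaining = dict(available)
    reservations = {}
    for name, wants in tasks:
        held = {}
        for resource, amount in wants.items():
            # Take as much as is still unallocated, up to what is wanted.
            held[resource] = min(amount, remaining.get(resource, 0))
            remaining[resource] = remaining.get(resource, 0) - held[resource]
        # A task is ready only when everything it wants is reserved.
        reservations[name] = (held, held == wants)
    return reservations, remaining


tasks = [
    ("lair", {"minerals": 150, "gas": 100}),
    ("zerglings", {"minerals": 50}),
]
reservations, remaining = reserve_in_priority_order(
    tasks, {"minerals": 120, "gas": 100})
```

With only 120 minerals available, the lair holds all of them and is not yet ready, and the zerglings get nothing, so the lower-priority task cannot spend what the higher-priority one is saving for.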

PW controls one unit at a time through a decision pipeline that aborts once it has issued a command to that unit. A delegation layer between the micro pipeline and BWAPI ensures those commands are executed smoothly, for example by clipping position-targeted commands to map boundaries, or blocking control of a Dragoon after it starts firing.

Jay Scott on Sunday, December 20. 2020:

Steamhammer assigns units top-down to squads with tasks, and bottom-up with reservations for minerals, gas, and building locations. But there are exceptions. Workers are all in one squad (except for workers pulled for combat), and the worker manager assigns tasks with its own system. Steamhammer doesn’t reserve minerals and gas for terran repair—that seems tricky for cases like repairing a bunker that is under attack.
