strategy selection in Overkill
Zerg bot Overkill by Sijia Xu placed 3rd in the CIG2015 and 3rd in the AIIDE2015 tournaments. It’s not the very best, but it’s a top bot. Its games are entertaining to watch because it has more than one strategy.
Today I’m going to poke into Overkill’s C++ source code for strategy selection. There’s not much to it.
Overkill knows 3 strategies, 9-pool zergling, mutalisks, and hydras. For mutas it knows 2 different build orders, a faster one (10 hatch) and a slower one (12 hatch) with a stronger economy. It has an idea of when to switch between strategies within a game. It also learns from experience which strategies work against each opponent. If that sounds fancy, then the code is probably simpler than you imagine. It ain’t long and it ain’t tricky.
Overkill keeps a history file for each opponent so it can remember which strategies work. Code from the main program Overkill.cpp
shows that if it doesn’t know the opponent yet, it goes 9-pool on the first game. (I stripped out a few irrelevant comments.)
string enemyName = BWAPI::Broodwar->enemy()->getName(); string filePath = "./bwapi-data/read/"; filePath += enemyName; //for each enemy, create a file historyFile.open(filePath.c_str(), ios::in); //file do not exist, first match to the enemy if (!historyFile.is_open()) { BWAPI::Broodwar->printf("first match to: %s", enemyName.c_str()); historyFile.close(); //default opening strategy chooseOpeningStrategy = NinePoolling; StrategyManager::Instance().setOpeningStrategy(chooseOpeningStrategy); } else {
I’ll skip over reading the history file (it’s just reading this-strategy-got-that-game-result records and adding up) and go straight to the decision. Strategy selection is a multi-armed bandit problem, and Overkill solves it with a UCB algorithm that makes a tradeoff between exploration (“I’m not really sure how well this strategy works against this opponent, better try it again”) and exploitation (“this one usually wins, I’ll do it”). This is about the same method as other strategy-learning bots.
double experimentCount = historyInfo.size(); std::map<std::string, double> strategyUCB; for(auto opening : StrategyManager::Instance().getStrategyNameArray()) { strategyUCB[opening] = 99999; } for (auto it : strategyResults) { double strategyExpectation = double(it.second.first) / (it.second.first + it.second.second); double uncertainty = 0.7 * std::sqrt(std::log(experimentCount) / (it.second.first + it.second.second)); strategyUCB[it.first] = strategyExpectation + uncertainty; } std::string maxUCBStrategy; double maxUCB = 0; for (auto it : strategyUCB) { if (it.second > maxUCB) { maxUCBStrategy = it.first; maxUCB = it.second; } BWAPI::Broodwar->printf("%s , UCB: %.4f", it.first.c_str(), it.second); } BWAPI::Broodwar->printf("choose %s opening", maxUCBStrategy.c_str()); int openingId = StrategyManager::Instance().getStrategyByName(maxUCBStrategy); if (openingId != -1) { chooseOpeningStrategy = openingStrategy(openingId); } StrategyManager::Instance().setOpeningStrategy(chooseOpeningStrategy); historyFile.close(); }
If you haven’t seen something like this before it might be a little hard to understand what it’s doing, but you can’t say that the code is complicated. strategyExpectation
is the strategy’s winning rate so far: You want to pick winners. uncertainty
is high for strategies that have not yet been played and decreases as the strategy is tried more times: You want to experiment with strategies that you haven’t tried enough yet. Add them up and pick the one that’s best overall.
The initial strategy is one of 9-pool lings, 10-hatch mutas, or 12-hatch mutas. It never starts the game intending to go hydras. This is declared in StrategyManager.h
:
enum openingStrategy { TwelveHatchMuta, NinePoolling, TenHatchMuta };
So how does Overkill decide to switch between strategies? That’s in the StrategyManager::update()
method.
switch (currentStrategy) { case HydraPush: { ProductionManager::Instance().setExtractorBuildSpeed(0); ProductionManager::Instance().setDroneProductionSpeed(250); if (hydraUpgradeTrigger && BWAPI::Broodwar->self()->allUnitCount(BWAPI::UnitTypes::Zerg_Hydralisk) >= 12) { hydraUpgradeTrigger = false; ProductionManager::Instance().triggerUpgrade(BWAPI::UpgradeTypes::Zerg_Missile_Attacks); ProductionManager::Instance().triggerUpgrade(BWAPI::UpgradeTypes::Zerg_Carapace); ProductionManager::Instance().triggerUpgrade(BWAPI::UpgradeTypes::Zerg_Missile_Attacks); ProductionManager::Instance().triggerUpgrade(BWAPI::UpgradeTypes::Zerg_Carapace); } } break;
Well, this isn’t switching strategy at all, it’s implementing the strategy by sending orders to the ProductionManager. Once Overkill goes hydra, it sticks with hydras. What if it goes muta? Here I stripped out the instructions to ProductionManager and left the decision logic.
case MutaPush: { std::map<BWAPI::UnitType, std::set<BWAPI::Unit>>& enemyBattle = InformationManager::Instance().getEnemyAllBattleUnit(); int enemyAntiAirSupply = 0; for (std::map<BWAPI::UnitType, std::set<BWAPI::Unit>>::iterator it = enemyBattle.begin(); it != enemyBattle.end(); it++) { if (it->first.airWeapon() != BWAPI::WeaponTypes::None) { enemyAntiAirSupply += it->first.supplyRequired() * it->second.size(); } } // if lost too many mutalisk , and enemy still have many anti-air army, change to hydrisk if (enemyAntiAirSupply >= 2 * 1.5 * 12 && BWAPI::Broodwar->self()->allUnitCount(BWAPI::UnitTypes::Zerg_Mutalisk) * 3 < enemyAntiAirSupply) { goalChange(HydraPush); } } break;
This is Overkill’s most complex strategy logic, and it’s not very complex. If the enemy has too many units that can shoot down mutalisks, according to a short formula, then stop making mutalisks and switch to hydras.
What if Overkill is still on zerglings? This time I left in the orders to the ProductionManager.
case ZerglingPush: { int hatchCompleteCount = 0; BOOST_FOREACH(BWAPI::Unit u, BWAPI::Broodwar->self()->getUnits()) { if (u->getType() == BWAPI::UnitTypes::Zerg_Hatchery && u->isCompleted()) { hatchCompleteCount++; } else if (u->getType() == BWAPI::UnitTypes::Zerg_Lair) { hatchCompleteCount++; } else continue; } // slow the drone production speed when only one hatch if (hatchCompleteCount == 1) { ProductionManager::Instance().setDroneProductionSpeed(750); } else { ProductionManager::Instance().setDroneProductionSpeed(250); } ProductionManager::Instance().setExtractorBuildSpeed(15); if (BWAPI::Broodwar->self()->gas() >= 100) { goalChange(MutaPush); } } break;
It counts the hatcheries and, if there are 2 or more, makes drones faster (I think that’s what it means). In other words, make a bunch of lings early on to see what you can do, then focus on drones. This is again implementing the strategy, not switching it. The strategy switch is: As soon as we’ve collected 100 gas (enough to start a lair), switch to mutas. In other words, the strategy path is hardcoded zerglings to mutas to hydras, and Overkill can start with either zerglings or mutas (and for mutas, with either a 10- or a 12-hatchery build). Dead simple.
You might draw the lesson, “If this is all the smarts a top-3 bot has, no wonder humans play so much better!” You won’t be wrong. But that’s where we are today, and we have to make progress one step at a time. The stronger Tscmoo bot knows more strategies than Overkill, but its strategy selection and strategy switching are hardly any smarter.
A more fun lesson to draw is “Hey, this is easy! I can make a top bot without worrying much about strategy selection, even though it seems like such a hard problem.” That’s true too.
There are plenty of other lessons to draw. When Overkill doesn’t expand early enough to its natural, it often gets stuck on one base for the whole game and (since zerg have a lust to expand) gets smooshed. Also, Overkill’s zergling and mutalisk strategies work okay, but the hydralisk strategy can backfire, especially against terran. A terran ground army can roll up an equal-size hydra army like a rug. The first thing you should worry about is not picking a good strategy, but making the strategy you pick good. Any decent strategy can beat other bots if it’s carried out crisply.
Comments