maps - 4 | Starcraft AI blog

map balance - bot balance in AIIDE 2015

I wrote Ye Usualle Little Perl Script to calculate map balance in AIIDE 2015, based on the detailed game results (the “plaintext” link on that page). The results do not tell us what race random UAlbertaBot got each game, so its results don’t count in the analysis. UAlbertaBot was the only random bot.

map	TvZ		ZvP		PvT
	wins	n	wins	n	wins	n
(2)Benzene.scx	18%	405	72%	315	64%	567
(2)Destination.scx	19%	405	73%	315	64%	567
(2)HeartbreakRidge.scx	24%	405	70%	315	61%	567
(3)Aztec.scx	19%	405	72%	315	64%	567
(3)TauCross.scx	20%	405	70%	315	64%	567
(4)Andromeda.scx	22%	405	69%	315	60%	567
(4)CircuitBreaker.scx	20%	405	69%	315	67%	567
(4)EmpireoftheSun.scm	16%	405	69%	315	65%	567
(4)Fortress.scx	19%	405	70%	315	69%	567
(4)Python.scx	22%	405	73%	315	66%	567
overall	20%	4050	71%	3150	64%	5670

In the table, n is the total number of games in the matchup, one of several crosschecks to make sure the analysis is right. The tournament had 5 zerg, 7 protoss, and 9 terran bots (plus random UAlbertaBot, which was not counted, making 22 participants). There were 90 rounds, each on one map (which over 10 maps means 9 times through the map pool). So for TvZ there should be 5*9*90 = 4050 games; for ZvP 5*7*90 = 3150 games; for PvT 7*9*90 = 5670 games. Good.
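The crosscheck arithmetic is easy to reproduce. Here is a small Python sketch using only the bot counts and round count given above; the function and variable names are made up for illustration.

```python
# Bot counts per race, as stated in the text (random UAlbertaBot excluded).
BOTS = {"Z": 5, "P": 7, "T": 9}
ROUNDS = 90   # each round is played on one map
MAPS = 10     # so each map is used ROUNDS / MAPS = 9 times

def expected_games(race_a, race_b, rounds=ROUNDS):
    """Every bot of race_a meets every bot of race_b once per round."""
    return BOTS[race_a] * BOTS[race_b] * rounds

for a, b in [("T", "Z"), ("Z", "P"), ("P", "T")]:
    total = expected_games(a, b)
    print(f"{a}v{b}: {total} games overall, {total // MAPS} per map")
```

The per-map figures it prints should match the n columns in the table.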

OK, from this exercise I learned more about race balance in this tournament than about map balance. Zerg came out on top because zerg bots won. Meanwhile terran bots were concentrated toward the bottom of the crosstable, while protoss were scattered throughout. Zerg crushed protoss better than 2:1 and annihilated terran 4:1 (terran won only 20% of TvZ games). I had not realized that it was so extreme. The maps made small differences; the bots made big differences.

Bots analyze maps shallowly and try to play about the same on different maps. I had expected this lack of adaptivity to make maps affect results strongly: Adapting means that the bot matters more; failing to adapt means that the map matters more. But if so, it’s not visible in this table. Maybe the maps are standardized enough that adaptation doesn’t matter at this level of play. Or maybe my original thinking is wrong, and adaptation is what allows the map to matter. Heartbreak Ridge, for example, has a narrow base entrance, so that you can easily block your enemy in or out, and high ground over the natural to proxy on, and I haven’t seen any bot take advantage of those features.

You can download the AIIDE 2015 map balance analysis script in a .zip file. I ran it on a *nix but it can probably be adapted to run under Windows with no more than a tweak or two.
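If you would rather not unpack the Perl, here is a rough Python sketch of the same kind of tally. It is not the script in the download. The record format (map name, winner race, loser race) is an assumption for illustration, since the real results file has its own layout, and the sample games are invented just to exercise the function.

```python
from collections import defaultdict

# Matchups are reported from the first race's point of view, as in the table.
MATCHUPS = {
    frozenset("TZ"): "TvZ",
    frozenset("ZP"): "ZvP",
    frozenset("PT"): "PvT",
}

def tally(games):
    """games: iterable of (map_name, winner_race, loser_race) tuples.

    Returns {(map_name, matchup): [wins_for_first_race, n_games]}.
    Mirror matchups and games with an unknown race (e.g. "R") are skipped.
    """
    table = defaultdict(lambda: [0, 0])
    for map_name, winner, loser in games:
        matchup = MATCHUPS.get(frozenset((winner, loser)))
        if matchup is None:        # mirror, or a race we do not count
            continue
        cell = table[(map_name, matchup)]
        cell[1] += 1
        if winner == matchup[0]:   # first letter names the first race
            cell[0] += 1
    return dict(table)

# Invented sample records, only to show the shape of the input and output.
sample = [
    ("Python", "Z", "T"),
    ("Python", "T", "Z"),
    ("Python", "Z", "T"),
    ("Python", "Z", "Z"),   # mirror, skipped
    ("Python", "R", "T"),   # random, skipped
]
result = tally(sample)
```

Dividing wins by n for each cell gives the percentages, and summing n over maps gives the crosscheck totals.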

Next: I’ll try to normalize the results and compare human map balance to bot map balance in relative terms. Though you can get an idea already by eyeballing the tables.

map balance - AIIDE 2015

Here’s the map balance table for AIIDE 2015. As yesterday, these per-matchup statistics are for pro players and are copied from TLPD.

map	TvZ	ZvP	PvT
Benzene	64.1%	49.1%	48.7%
Destination	52.3%	57%	54.5%
Heartbreak Ridge	48.6%	56.6%	59.1%
Aztec	39%	50%	65.4%
Tau Cross	50%	50%	52%
Andromeda	42.7%	58.8%	57.6%
Circuit Breaker	52.9%	51.8%	53%
Empire of the Sun	64.2%	50%	51.1%
Fortress	64.3%	66.7%	51.2%
Python	55.2%	53.9%	45.8%
overall	53.3%	54.4%	53.8%

With 10 maps to average over, the balance looks close enough to be fair. Some individual maps have large imbalances, but they mostly even out over the map pool. They don’t completely even out, though, because imbalances are too consistent across maps; there aren’t enough counterbalancing maps.
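The overall row is a plain unweighted average over the maps, which a few lines of Python can confirm. The numbers are copied from the table above; the function names are made up, and counting maps above 50% is just one simple way to look at how consistent the imbalances are.

```python
# Pro win rates (%) for the first race in TvZ, ZvP, PvT, from the table above.
PRO_BALANCE = {
    "Benzene":           (64.1, 49.1, 48.7),
    "Destination":       (52.3, 57.0, 54.5),
    "Heartbreak Ridge":  (48.6, 56.6, 59.1),
    "Aztec":             (39.0, 50.0, 65.4),
    "Tau Cross":         (50.0, 50.0, 52.0),
    "Andromeda":         (42.7, 58.8, 57.6),
    "Circuit Breaker":   (52.9, 51.8, 53.0),
    "Empire of the Sun": (64.2, 50.0, 51.1),
    "Fortress":          (64.3, 66.7, 51.2),
    "Python":            (55.2, 53.9, 45.8),
}

def pool_average(balance):
    """Unweighted mean win rate per matchup, rounded to one decimal."""
    return tuple(round(sum(col) / len(col), 1) for col in zip(*balance.values()))

def favoring_first(balance):
    """How many maps favor the first race (win rate above 50%) per matchup."""
    return tuple(sum(1 for x in col if x > 50) for col in zip(*balance.values()))

print(pool_average(PRO_BALANCE))    # (53.3, 54.4, 53.8)
print(favoring_first(PRO_BALANCE))  # (6, 6, 8)
```

In each matchup most of the maps lean toward the first race, so the averages cannot come back to 50%.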

Of these 10 maps, only 2 (Tau Cross and Python) overlap with the 5 CIG 2016 maps.

Human balance and bot balance should be different. Next: I’ll try to investigate the bot balance in practice, using the AIIDE 2015 game results. Per-matchup numbers can’t be deduced from any of the summary tables, so I’ll have to go back to the raw game results. Will human and bot balance be somewhat similar, or all different?

map balance - CIG 2016

Map balance is hard.

Only about 5 competition maps have stats showing balance within a few percent of equal for all matchups. Seriously! That’s less than 2% of maps ever used in pro play! (Though to be fair, the total includes maps without enough games for us to know the balance.) The closest are the popular Fighting Spirit, Circuit Breaker, and Tau Cross, and the less-popular Arcadia 2 and Neo Aztec. If you want a balanced map pool beyond these 5 maps, you have to balance the maps against each other: “This one is T>P by 10%, so the rest should add up to P>T by 10%.” Of course those are human stats, and bot balance should be different, so you might want to balance using bot data.
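To make “balance the maps against each other” concrete, here is a brute-force Python sketch that picks the 5-map pool whose averages land closest to 50% in all three matchups. It uses the TLPD pro numbers quoted in the tables elsewhere on this page. The scoring rule (sum over matchups of the distance of the pool average from 50%) is an assumption chosen for simplicity, not an established measure, and these are human stats, so treat it as a sketch of the idea only.

```python
from itertools import combinations

# TLPD pro win rates (%) for the first race in TvZ, ZvP, PvT.
STATS = {
    "Benzene":           (64.1, 49.1, 48.7),
    "Destination":       (52.3, 57.0, 54.5),
    "Heartbreak Ridge":  (48.6, 56.6, 59.1),
    "Aztec":             (39.0, 50.0, 65.4),
    "Tau Cross":         (50.0, 50.0, 52.0),
    "Andromeda":         (42.7, 58.8, 57.6),
    "Circuit Breaker":   (52.9, 51.8, 53.0),
    "Empire of the Sun": (64.2, 50.0, 51.1),
    "Fortress":          (64.3, 66.7, 51.2),
    "Python":            (55.2, 53.9, 45.8),
    "Ride of Valkyries": (48.5, 67.1, 54.4),
    "Alchemist":         (55.6, 80.0, 62.5),
    "Luna the Final":    (53.2, 60.2, 60.0),
}

def imbalance(pool, stats=STATS):
    """Sum over matchups of |pool average - 50|, in percentage points."""
    cols = zip(*(stats[name] for name in pool))
    return sum(abs(sum(col) / len(pool) - 50.0) for col in cols)

def best_pool(stats=STATS, size=5):
    """Try every pool of the given size and keep the least imbalanced one."""
    return min(combinations(sorted(stats), size),
               key=lambda pool: imbalance(pool, stats))

CIG_2016 = ("Ride of Valkyries", "Alchemist", "Tau Cross", "Luna the Final", "Python")

pool = best_pool()
print("CIG 2016 pool imbalance:", round(imbalance(CIG_2016), 2))
print("best 5-map pool:", pool, round(imbalance(pool), 2))
```

With only 13 candidate maps there are 1287 pools to try, so brute force is instant. A larger collection would want something smarter, and bot data instead of pro data.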

The AIIDE and CIG rules both say that maps will be chosen at random from a larger pool. SSCAIT says its maps are selected from popular recent pro maps, and doesn’t mention balance. So I decided to look into it.

For today I calculated the balance of the CIG 2016 map pool, 5 maps randomly selected from a larger collection. Think of this as a first check to see how balance may come out when you’re not paying attention.

(2)RideofValkyries1.0
(3)Alchemist1.0
(3)TauCross1.1
(4)LunaTheFinal2.3
(4)Python1.3

I used balance numbers from the TLPD map database, which gives statistics for pro games played from 1999 to 2012. It’s not a definitive current pro balance, but it should be pretty good and it was complete and easy to use. Alchemist is not often played (presumably because it is grossly Z>P; also, according to Liquipedia “Alchemist is mostly noted for being a poor attempt at an asymmetrical three-player map”) and its stats are based on only 53 games. The percentage in each cell is the win rate of the first race named in that column’s matchup.

map	TvZ	ZvP	PvT
Ride of Valkyries	48.5%	67.1%	54.4%
Alchemist	55.6%	80%	62.5%
Tau Cross	50%	50%	52%
Luna the Final	53.2%	60.2%	60%
Python	55.2%	53.9%	45.8%
overall	52.5%	62.2%	54.9%

I’d say that’s a substantial Z>P imbalance.

The numbers from TLPD are raw outcomes, with no attempt to adjust for the strength of the players. That’s likely good enough; it should average out over the large number of games played on most of these maps. But if we want to compare the pro balance with the bot balance after the tournament is over, we may want to do some normalization of both data sets. I’m predicting that this tournament will be dominated by terran bots. A comparison might give the impression that the maps are T>P and T>Z for bots, when in fact the terran bots were playing better.

Tomorrow: AIIDE 2015 map balance.

tournament map selection as a prod

I will never run a tournament. I don’t have the stomach for that much administrative work (and hats off to those who do!). So it’s perfectly safe for me to offer advice—I know I’ll never have to listen to it myself.

The way I see it, one goal of tournaments is to prod bots to improve; tournaments motivate. Another goal is to measure progress; tournament organizers are happy to include older bots that have competed in past tournaments, to see how they do against newer competition. There’s some tension between the two goals, but you don’t want to compromise either of them too much.

Earlier I suggested changing timeout rules to prod the winner to finish the game. Another way to prod bots is to make them play on new maps that present different challenges. Unfortunately, most of the concept maps that I talked about seem too hard for current bots (and the novelty maps are not suitable for competitions). Exception: The map Fantasy is not too hard, but it’s too subtle. Stepping down a level, I don’t know any current bot that can play on an island map. Even ignoring balance issues, a tournament would not want to include an island map like Charity, or even a semi-island map like Indian Lament, because it would break the goal of measuring progress. Bots that were made able to play the maps would likely score 100% against bots that could not.

There is a compromise. I suggest the map Namja Iyagi, a land map with 4 mineral-only islands (one in the corner behind each main base) and 2 mineral-and-gas islands. A bot with island skills would have a large advantage over a bot without island skills (the prod)—but not necessarily a decisive advantage. Two bots with no island skills could still play sound games against each other. If Namja Iyagi is only one map out of several, the tournament results remain a fair measure of progress.

The map Return of the King has 4 islands, so it might be a gentler prod.

Another prod that would be good is a map that promotes (but does not require) pushing through minerals or mineral-walking through obstacles, as in some of the concept maps. I’m not sure what a good choice would be, though.

A Team Liquid thread RFC: BW AI Bot Ladder proposes a much fancier attempt to encourage progress.

Tomorrow: Map balance.

novelty maps

Humans can play on crazy novelty maps where normal play does not work. We don’t have much trouble inventing special strategies for special maps. It’s a more extreme example of the human adaptability that we see in normal play. Bots have too much scripted behavior and can’t adapt at all to extreme novelty maps.

On the Blizzard map (2)Crystallis distributed with Brood War, the players start out separated by deep maze-like formations of 48-mineral blocks. Gas geysers exist, but they are also behind minerals, so before you can tech you have to mine a path through to a geyser. It’s a playable map, and maybe fun once in a while, but the strategies are vastly different from those on a competition map (and terran would seem to have a big advantage). Crystallis seems to be well-known for crashing the BWTA terrain analyzer, so I expect many bots can’t play it at all. In this picture, look at the minimap to see how far the SCVs have come from the original command center.

On (6)Crazy Critters, also included with Brood War, the map is so full of critters that it is difficult to place buildings. Units face big delays in moving as the pathfinder struggles with shifting critters. Here I opened with a forge and cannon to kill enough critters to make space for a gateway—I couldn’t find a way to place a gateway otherwise, but the random shifting of critters sometimes made room for smaller buildings, when the probe could arrive in time. The terran opponent is a built-in AI, which built a barracks in my base not because it wanted to proxy, but because that was the first open space it found. Also notice my soaring mineral count; I found myself unable to place enough buildings to spend my income. The map is frustrating to play on, but people can do it.

Crazy Critters play, struggling to build

Less extreme concept maps from Blizzard include Blood Bath and Big Game Hunters. Both have been popular in their communities and people have developed specialized strategies.

Concept maps are rare in competition today, because they are difficult to balance, but they went through a period of popularity in pro tournaments around 2006-2008. Examples are Arkanoid, Demon’s Forest, Monty Hall, Plasma, Triathlon, Troy. Another interesting concept map is Fantasy, in which each quarter of the map has a different design, so that the map is not symmetrical and the game balance and strategy depend on the random starting positions. A lot of fun games with surprising strategies have been played on concept maps, and it would be cool if new ones were invented to meet today’s standards.

Here is Demon’s Forest as an example. Much of the 3-player map is covered by an array of doodads that block vision and sometimes bug out the movement of large units. Here an overlord off the top of the screen (visible on the minimap) provides vision of part of the array, and below two hydralisks are barely able to see beyond their snouts. I set up three other hydras in a triangle to show their lack of vision on the minimap.

Island maps were abandoned in competitive play after the early years, because they were imbalanced against zerg. But I wonder—today we know a lot more about how to balance maps. I don’t have the expertise to try it myself, but I would be interested to find out whether an island or semi-island map could be balanced today, using some variety of pro-zerg tricks: Smaller buildable areas, so that protoss and terran are forced to spread buildings across different areas; gas-only or low-mineral expansions, which zerg gets “for free” because zerg needs the hatcheries anyway; expansions or other areas with neutral creep colonies that zerg can use right off, but where other races need to kill the colonies first. The terran late-game information advantage could be reduced by putting a map doodad where the comsat would go in some expansion spots. There are more, you get the idea.

How do humans adapt their play to unfamiliar map features? I don’t know, and it seems like it must be complicated. I picked novelty maps as an extreme example, but humans (given time to learn) adapt their play to all map differences, and in fact to all aspects of the situation. Circuit Breaker and Fighting Spirit are both standard maps and play similarly, but features of the maps—like the mineral-only on Circuit Breaker and its position next to a low-ground expansion—make for important strategy differences. In an example of a different kind of adaptability, Last gained an advantage over Flash in their recent ASL match by recognizing Flash’s habit of building his barracks forward, to lift off and scout sooner. Last scouted for the forward rax and harassed the building SCV. I’m very interested in understanding that kind of human adaptability so that bots can eventually reproduce it. I think bots won’t catch up with humans strategically until bots can adapt broadly and deeply by learning over time.

How can a bot even get started on a map like Crystallis? Without being told, how could it figure out what to do to get gas? Next: Means-end analysis.

drop idea 1: take the islands

Drop has countless uses. A lot of the uses are complex and tough to code, though (I dare you to do reaver drop with all the trimmings). Right now, a few bots do simple harassment drops. No bot has much fun with other drops. I’m going to spend a few posts on drop ideas that seem to me like cool next steps. Who knows, I might even be right!

If you have a macro bot, why not take island expansions?

The SSCAIT map pack, as distributed, includes 3 maps with island expansions, out of 15 maps total (20% of the maps have islands). In all 3 cases, to take an island, you have to drop a worker, mine out the blocking minerals, and then build or float in the expansion. It makes a pretty small state machine. Maps without blocking minerals are thought to favor terran because command centers can float, so they are little used nowadays.
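Here is roughly how small that state machine could be, as a Python sketch. The state names and the observation strings (“worker_loaded” and so on) are hypothetical placeholders. A real bot would derive them from its own unit tracking, and nothing here is actual BWAPI.

```python
from enum import Enum, auto

class IslandState(Enum):
    LOAD_WORKER = auto()      # put a worker into a transport
    DROP_WORKER = auto()      # fly over and unload on the island
    MINE_BLOCKERS = auto()    # mine out the blocking minerals
    BUILD_OR_FLOAT = auto()   # build the base, or float one in
    DONE = auto()

# state -> (hypothetical observation that completes it, next state)
TRANSITIONS = {
    IslandState.LOAD_WORKER:    ("worker_loaded",    IslandState.DROP_WORKER),
    IslandState.DROP_WORKER:    ("worker_on_island", IslandState.MINE_BLOCKERS),
    IslandState.MINE_BLOCKERS:  ("blockers_gone",    IslandState.BUILD_OR_FLOAT),
    IslandState.BUILD_OR_FLOAT: ("base_finished",    IslandState.DONE),
}

def step(state, observations):
    """Advance at most one state per call, based on a set of observations."""
    if state is IslandState.DONE:
        return state
    if "worker_lost" in observations:   # shot down or killed: start over
        return IslandState.LOAD_WORKER
    condition, next_state = TRANSITIONS[state]
    return next_state if condition in observations else state

# Walk through the happy path.
state = IslandState.LOAD_WORKER
for seen in ({"worker_loaded"}, {"worker_on_island"}, {"blockers_gone"}, {"base_finished"}):
    state = step(state, seen)
print(state)
```

Calling step once per frame (or every few frames) with whatever the bot currently observes is enough to drive it; the single “worker_lost” reset is the only failure handling in this sketch.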

The SSCAIT maps with islands:

Andromeda
Empire of the Sun
Python

Probably the hard part is not taking the island, but populating it efficiently. You want to be able to schedule a task to transfer workers when the expansion is ready to mine—whether by drop, nydus, or recall. (And if you’re thorough, you might want to be able to transfer the workers elsewhere when the island mines out. Nydus and islands are in love and belong together.)

Since no bots take islands yet, surely few bots are capable of attacking islands—probably only bots that already go air. Against many opponents you can get one or two invulnerable expansions, which ought to be a winning edge.

Of course most maps don’t include islands, and if opponents do attack then islands are harder to defend. But a chance at a big macro edge on 20% of maps is nothing to sneeze at. How many opponents will even scout the island?

game balance and the maps

Starcraft Brood War is supposed to be a balanced game. The three races, even though they have different units and abilities and strategies, are equally strong. To be sure, one race has the advantage in these situations and another in those, but that’s what makes it a strategy game.

Well, but we know better, or if you don’t then you will as soon as you think about it. Are Blizzard’s original maps from Brood War’s release balanced today? Not remotely. When a groundbreaking new idea or strategy is developed for one race, from the terran wall-in to the Bisu build, does the game’s balance stay the same? No, it shifts because groundbreaking means groundbreaking.

Competitive maps in any era, unless at the very start of professional play, are carefully designed to be balanced for the play of that era (or at least of the previous era!). Every mapmaker checks exactly how far a tank can shoot into each base from outside, weighs the situations in which zerg can get the crucial third gas, considers the positioning of obstacles to keep games fair in each matchup, and on and on in painstaking detail. Maps provide the degree of freedom that keeps Brood War balanced even as the game balance shifts when new ideas arise.

The same for bots. Bots have different skills than humans, so game balance for bots is and should be different. In other words, for fairness, ideally bots should play on different maps than humans do.

Well, it’s not a problem yet. We’re still at an exploratory stage and all balance issues can be blamed on “bots are dumb.” Because they are dumb, right? If something is imba against you, you need to add smarts to fix it—so far nobody’s ranting otherwise. But someday, if we care about balance (which we may or may not), then we’ll want our own maps that are balanced for bots.

Or automatically design bot-balanced maps. It would be a fun research project.

bots on Big Game Hunters

Today is April Fools’ Day, and the SSCAIT live stream has bots playing on the map BGH.

Some bots are struggling on the highly non-standard map, which has 8 starting locations, huge resources so that expanding has a different dynamic, and strangely laid-out bases and paths. For example XIMP, in the game I saw, misplaced its cannon wall and got run over since there is no standard natural expansion. Other bots make other uncharacteristic blunders, such as pathing errors or self-blocking in the narrow walkways.

It’s funny, but the serious lesson is that scripted behaviors are fragile. And also that a lot of Starcraft strategy is specific to given maps, and some commonalities that hold across maps come about because the maps have standardized aspects.