map balance - bot balance in CIG 2016

CIG 2016 reported its results in the same format as AIIDE 2015 (I’m sure they used the same software), so I was able to compute the map balance with a few adjustments to my script. The tournament was run in two halves, qualifiers and finals, each with 100 rounds. With 5 maps, that makes 20 times through the map pool. They could have used twice as many maps without any disadvantage that I see.
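The round and game counts can be checked with a little arithmetic. This is only a sketch, and it assumes each round is a full round robin played on a single map, which the post does not state outright; the function names are made up for illustration, and the bot counts are the ones the post gives for each stage (16 in the qualifiers, 8 in the final).

```python
from math import comb

def passes_through_pool(rounds, maps):
    # Assumption: one map per round, so the pool repeats every `maps` rounds.
    return rounds // maps

def round_robin_games(bots, rounds):
    # Assumption: every pair of bots plays exactly one game per round.
    return comb(bots, 2) * rounds

print(passes_through_pool(100, 5))   # the 5-map pool actually used
print(passes_through_pool(100, 10))  # a hypothetical pool twice as large
print(round_robin_games(16, 100))    # qualifiers
print(round_robin_games(8, 100))     # final
```

Under those assumptions, a 10-map pool would still be played through 10 times per stage.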

The qualifiers, with 16 bots playing 12,000 games total (minus a few lost to errors):

map	TvZ		ZvP		PvT
	wins	n	wins	n	wins	n
(2)RideofValkyries.scx	49%	640	61%	240	57%	480
(3)Alchemist.scm	50%	640	45%	240	60%	479
(3)TauCross.scx	56%	640	43%	240	53%	479
(4)LunaTheFinal.scx	53%	637	47%	240	53%	480
(4)Python.scx	49%	638	45%	240	50%	478
overall	51%	3195	48%	1200	55%	2396

The 3 races came out remarkably even! We already know that’s more due to the strength distribution of bots in the tournament than to the fairness of the game. The low-high spread in TvZ was 56% - 49% = 7 percentage points; in ZvP it was 61% - 43% = 18, and in PvT 60% - 50% = 10. Ride of Valkyries had strikingly different ZvP results from the other maps. I don’t know why. Can anybody guess? The human balance also showed one map standing out in ZvP, but it was Alchemist.
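For anyone who wants to recompute the low-high spread, here is a minimal sketch. The percentages are copied by hand from the qualifier table above; the `qualifier` dictionary and the `spread` helper are illustrative names, not part of the script used for this post.

```python
# Win percentages copied from the qualifier table, keyed by map.
qualifier = {
    "RideofValkyries": {"TvZ": 49, "ZvP": 61, "PvT": 57},
    "Alchemist": {"TvZ": 50, "ZvP": 45, "PvT": 60},
    "TauCross": {"TvZ": 56, "ZvP": 43, "PvT": 53},
    "LunaTheFinal": {"TvZ": 53, "ZvP": 47, "PvT": 53},
    "Python": {"TvZ": 49, "ZvP": 45, "PvT": 50},
}

def spread(table, matchup):
    # Difference, in percentage points, between the best and worst map.
    rates = [row[matchup] for row in table.values()]
    return max(rates) - min(rates)

spreads = {m: spread(qualifier, m) for m in ("TvZ", "ZvP", "PvT")}
for matchup, points in spreads.items():
    print(matchup, points)
```

Because the table is already rounded to whole percents, each spread can be off by up to a point compared with the unrounded results.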

The final, with 8 bots playing 2800 games, looks considerably different:

map	TvZ		ZvP		PvT
	wins	n	wins	n	wins	n
(2)RideofValkyries.scx	54%	120	92%	80	45%	120
(3)Alchemist.scm	52%	120	79%	80	63%	120
(3)TauCross.scx	76%	120	65%	80	49%	120
(4)LunaTheFinal.scx	67%	120	84%	80	46%	120
(4)Python.scx	66%	120	94%	80	34%	120
overall	63%	600	83%	400	48%	600

Here, protoss did poorly because the protoss bots came out on the bottom this time. It’s interesting that the middle-of-the-table zergs did more to hold down the protoss than the winning terrans (but it fits with the game storyline :-). Beyond that, I’m reluctant to draw conclusions from this smaller number of games with fewer players.
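To put a rough number on how much a cell of 80 or 120 games can wobble, one option is a Wilson score interval. This is a sketch that was not part of the original analysis: it treats every game as an independent coin flip, which is not true when the same few bots play each other repeatedly, so the real uncertainty is probably larger. The function name `wilson_interval` is hypothetical.

```python
from math import sqrt

def wilson_interval(win_rate, n, z=1.96):
    """Approximate 95% Wilson score interval for a win rate seen over n games."""
    denom = 1 + z * z / n
    center = (win_rate + z * z / (2 * n)) / denom
    half = z * sqrt(win_rate * (1 - win_rate) / n + z * z / (4 * n * n)) / denom
    return center - half, center + half

# A few cells from the final table above: (label, win rate, games).
cells = [
    ("TvZ TauCross", 0.76, 120),
    ("ZvP Python", 0.94, 80),
    ("PvT Python", 0.34, 120),
]
for label, rate, n in cells:
    low, high = wilson_interval(rate, n)
    print(f"{label}: {low:.2f} to {high:.2f}")
```

With the same win rate, the interval for 80 games is noticeably wider than for the 240 games per ZvP cell in the qualifiers, which is one reason to be cautious about the final.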

I feel vindicated: Map balance can make a difference, even though we don’t understand what the difference is!

Comments

LetaBot on Friday, September 23. 2016:

I gave some more details about this in the mail I sent you. You can ofc take those notes and write a more general post about the CIG results and what they mean for BW AI.

Anyway, I can explain the high win % for TvZ on Tau Cross for the top 8 bots. All the zergs in the top 8 (including UAlberta bot when it randoms zerg) have a tendency to do an early rush. On Tau Cross the ground distance to the other mains is usually much larger than on the other maps, because you have to go all the way around a wall.

imp on Saturday, September 24. 2016:

This supports my comment on a previous post on map balance. What is measured is not the balance of races but the balance of strategies.
This will only change once bots cover an evenly distributed wide range of strategies or they select their optimal strategy based on the map. Causality and correlation: Zerg does poorly on large maps not because it is zerg, but because rush strategies are easier to implement than more long-term strategies and zerg is the best race to support such a rush strategy. Therefore, newer developers are more likely to rush -> are more likely to use zerg -> zerg is more likely to lose on big maps.

Jay Scott on Saturday, September 24. 2016:

I agree. Bot-and-map statistics are weak, partly because there aren’t many bots and partly because so many bots play alike. I’m still interested and I’ll be looking more into it.

imp on Saturday, September 24. 2016:

my statement is an observation, not a critique. Balance is not a matter of "yes/no" but is multi-dimensional. For example, it is different for lower level players than for pros because of e.g. the APM restriction lower level players have.
We don't know yet how balance will evolve at above-human APM levels. Just one example: There is a video on YouTube demonstrating perfect micro of 4 marines vs. a lurker. The lurker never hits any of the marines. Another aspect: Bots will not be distracted by multi-pronged attacks. In theory they can perfectly allocate CPU cycles to defend against each attack. So I agree, it is definitely still interesting.

Jay Scott on Saturday, September 24. 2016:

Yep. Balance varies with skill: That is the theme of my 6 July post “game balance and the maps”.

Jay Scott on Saturday, September 24. 2016:

I’ll add that rush distance is an interesting (and well-known) way to categorize maps. I’ve often seen it mentioned, but never systematically cataloged.
