Entries by Jay Scott | Starcraft AI blog

AIIDE 2019 - unknown maps per-player

Here are the per-player map tables for the AIIDE 2019 bonus tournament on “unknown” maps, meaning standard and well-known maps that weren’t announced beforehand. Since the competition is about maps, I want to look at the maps first.

With 10 participants and 100 rounds, there are 10 * 9 * 100 / 2 = 4500 games total if all are played, meaning 900 games per player and 900 / 5 = 180 per player per map. Each player has 9 opponents, so there are 180 / 9 = 20 games per cell in these tables. The official results say that 4491 of the 4500 were successfully completed, so some cells have slightly fewer. Tail-ender UAlbertaBot participated in every game that did not complete (CoG has the same pattern).
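The game-count arithmetic can be checked with a few lines of Python. This is only a sketch of the counting; the function name and its parameters are my own, not anything from the tournament software.

```python
def round_robin_counts(players: int, rounds: int, maps: int) -> dict:
    """Count games in a full round robin where every round pairs all players once.

    Assumes (as in the text) that the rounds are spread evenly over the maps.
    """
    pairings = players * (players - 1) // 2          # distinct pairs of players
    total = pairings * rounds                        # games if all are played
    per_player = (players - 1) * rounds              # each player meets every opponent once per round
    per_player_per_map = per_player // maps
    per_cell = per_player_per_map // (players - 1)   # one opponent on one map
    return {
        "total": total,
        "per_player": per_player,
        "per_player_per_map": per_player_per_map,
        "per_cell": per_cell,
    }


counts = round_robin_counts(players=10, rounds=100, maps=5)
print(counts)
```

With 10 players, 100 rounds, and 5 maps this reproduces the figures above: 4500 games in total, 900 per player, 180 per player per map, and 20 per cell.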

Locutus	overall	Polari	Longin	Arcadi	Fighti	Roadki
PurpleWave	48%	75%	40%	50%	35%	40%
DaQin	73%	75%	75%	85%	75%	55%
BananaBrain	66%	55%	75%	70%	75%	55%
Microwave	98%	100%	90%	100%	100%	100%
Steamhammer	99%	100%	100%	95%	100%	100%
XiaoYi	97%	100%	90%	100%	95%	100%
McRave	96%	100%	95%	100%	100%	85%
Iron	100%	100%	100%	100%	100%	100%
UAlbertaBot	98%	100%	95%	95%	100%	100%
overall	86.11%	89%	84%	88%	87%	82%

PurpleWave	overall	Polari	Longin	Arcadi	Fighti	Roadki
Locutus	52%	25%	60%	50%	65%	60%
DaQin	76%	60%	90%	70%	95%	65%
BananaBrain	68%	80%	60%	70%	60%	70%
Microwave	96%	95%	100%	95%	95%	95%
Steamhammer	86%	70%	95%	80%	90%	95%
XiaoYi	90%	90%	80%	90%	90%	100%
McRave	93%	95%	95%	90%	95%	90%
Iron	98%	100%	100%	100%	90%	100%
UAlbertaBot	100%	100%	100%	100%	100%	100%
overall	84.28%	79%	87%	83%	87%	86%

DaQin	overall	Polari	Longin	Arcadi	Fighti	Roadki
Locutus	27%	25%	25%	15%	25%	45%
PurpleWave	24%	40%	10%	30%	5%	35%
BananaBrain	41%	60%	30%	35%	30%	50%
Microwave	89%	95%	95%	90%	80%	85%
Steamhammer	97%	95%	95%	100%	100%	95%
XiaoYi	74%	100%	65%	70%	60%	75%
McRave	37%	50%	45%	30%	30%	30%
Iron	98%	100%	95%	100%	95%	100%
UAlbertaBot	77%	75%	70%	89%	68%	80%
overall	62.58%	71%	59%	62%	55%	66%

BananaBrain	overall	Polari	Longin	Arcadi	Fighti	Roadki
Locutus	34%	45%	25%	30%	25%	45%
PurpleWave	32%	20%	40%	30%	40%	30%
DaQin	59%	40%	70%	65%	70%	50%
Microwave	74%	75%	70%	75%	75%	75%
Steamhammer	81%	70%	80%	80%	85%	90%
XiaoYi	62%	50%	70%	50%	70%	70%
McRave	69%	80%	70%	55%	65%	75%
Iron	56%	30%	55%	65%	45%	85%
UAlbertaBot	87%	100%	85%	95%	80%	75%
overall	61.56%	57%	63%	61%	62%	66%

Microwave	overall	Polari	Longin	Arcadi	Fighti	Roadki
Locutus	2%	0%	10%	0%	0%	0%
PurpleWave	4%	5%	0%	5%	5%	5%
DaQin	11%	5%	5%	10%	20%	15%
BananaBrain	26%	25%	30%	25%	25%	25%
Steamhammer	84%	90%	95%	75%	80%	80%
XiaoYi	82%	85%	90%	70%	75%	90%
McRave	50%	45%	35%	45%	55%	70%
Iron	33%	30%	15%	40%	15%	65%
UAlbertaBot	65%	70%	60%	79%	60%	55%
overall	39.60%	39%	38%	39%	37%	45%

Steamhammer	overall	Polari	Longin	Arcadi	Fighti	Roadki
Locutus	1%	0%	0%	5%	0%	0%
PurpleWave	14%	30%	5%	20%	10%	5%
DaQin	3%	5%	5%	0%	0%	5%
BananaBrain	19%	30%	20%	20%	15%	10%
Microwave	16%	10%	5%	25%	20%	20%
XiaoYi	50%	25%	80%	70%	45%	30%
McRave	89%	85%	90%	95%	90%	85%
Iron	59%	85%	40%	60%	55%	55%
UAlbertaBot	90%	95%	90%	90%	84%	90%
overall	37.75%	40%	37%	43%	35%	33%

XiaoYi	overall	Polari	Longin	Arcadi	Fighti	Roadki
Locutus	3%	0%	10%	0%	5%	0%
PurpleWave	10%	10%	20%	10%	10%	0%
DaQin	26%	0%	35%	30%	40%	25%
BananaBrain	38%	50%	30%	50%	30%	30%
Microwave	18%	15%	10%	30%	25%	10%
Steamhammer	50%	75%	20%	30%	55%	70%
McRave	29%	35%	40%	25%	20%	25%
Iron	80%	80%	85%	70%	90%	75%
UAlbertaBot	79%	60%	85%	85%	100%	65%
overall	37.00%	36%	37%	37%	42%	33%

McRave	overall	Polari	Longin	Arcadi	Fighti	Roadki
Locutus	4%	0%	5%	0%	0%	15%
PurpleWave	7%	5%	5%	10%	5%	10%
DaQin	63%	50%	55%	70%	70%	70%
BananaBrain	31%	20%	30%	45%	35%	25%
Microwave	50%	55%	65%	55%	45%	30%
Steamhammer	11%	15%	10%	5%	10%	15%
XiaoYi	71%	65%	60%	75%	80%	75%
Iron	41%	25%	30%	45%	40%	65%
UAlbertaBot	46%	20%	63%	50%	55%	45%
overall	36.04%	28%	36%	39%	38%	39%

Iron	overall	Polari	Longin	Arcadi	Fighti	Roadki
Locutus	0%	0%	0%	0%	0%	0%
PurpleWave	2%	0%	0%	0%	10%	0%
DaQin	2%	0%	5%	0%	5%	0%
BananaBrain	44%	70%	45%	35%	55%	15%
Microwave	67%	70%	85%	60%	85%	35%
Steamhammer	41%	15%	60%	40%	45%	45%
XiaoYi	20%	20%	15%	30%	10%	25%
McRave	59%	75%	70%	55%	60%	35%
UAlbertaBot	75%	90%	80%	75%	80%	50%
overall	34.44%	38%	40%	33%	39%	23%

UAlbertaBot	overall	Polari	Longin	Arcadi	Fighti	Roadki
Locutus	2%	0%	5%	5%	0%	0%
PurpleWave	0%	0%	0%	0%	0%	0%
DaQin	23%	25%	30%	11%	32%	20%
BananaBrain	13%	0%	15%	5%	20%	25%
Microwave	35%	30%	40%	21%	40%	45%
Steamhammer	10%	5%	10%	10%	16%	10%
XiaoYi	21%	40%	15%	15%	0%	35%
McRave	54%	80%	37%	50%	45%	55%
Iron	25%	10%	20%	25%	20%	50%
overall	20.43%	21%	19%	16%	19%	27%

Against its strong opponents, Locutus had trouble on the map Roadkill, possibly because of the low-ground main. If Locutus stuck with its cannons-at-the-ramp strategy, the cannons were weak on low ground. Iron also struggled on Roadkill. Polaris Rhapsody, the only 2-player map, also showed some extreme results—see BananaBrain versus Iron and McRave versus UAlbertaBot.

There are plenty more details in the tables.

fun game Simplicity-Locutus

Yesterday’s game Simplicity vs Locutus on Andromeda on BASIL starts out as one of the most entertaining bot games I have seen. The pictures show some of the cool stuff that Simplicity tried—with success. Then, after a tremendous fight where each side pressed temporary advantages and maxed its army, the replay loses sync and OpenBW cannot show the last half of the game.

Queens with broodling. The Research tab shows broodling research, and there are broodlings on the ground. Simplicity made 8 queens early and even researched queen energy, and the queens paid for themselves with interest. When Locutus attacked, zerg sent out as many queens as had energy to simultaneously spawn broodlings, helping to break the attacks. It’s a simple way to coordinate the queens to get tactical results, and is more effective than the common bot approach of using the queens as attrition weapons. Simplicity’s queens eventually died to corsairs; with more careful play, they could have lived to the end of the game, because Locutus was not skilled with its corsairs.

Island base with static defense. The overlord on the left has just dropped off another drone to join the miners.

Drops. Simplicity repeatedly dropped small numbers of units into the far end of Locutus’s base, and Locutus did not react properly. The drops were not decisive, but were cost-effective.

Both sides maxed their supply, or nearly so. At that point, Locutus had better upgrades but Simplicity had a larger army. Locutus could not keep its natural safe. Luckily for protoss, it was already mined out.

But Locutus had a stronger economy with more bases and a large bank of resources, and Simplicity ran out of resources. The desync hides the end of the game, which timed out after an hour. I believe that Locutus wiped out all zerg it could reach on the ground, and then had no answer for the island base. When the game timed out, BASIL gave the win to Simplicity on points. Moral: You need at least enough island skills to make air units to attack inaccessible bases.

new bot ZNZZBot

New protoss ZNZZBot was uploaded to SSCAIT today. It has no apparent connection with ZZZKBot. Quick analysis shows that it is descended from Locutus, but its binary is twice the size of the current Locutus because ZNZZBot is still linked against the BWTA terrain library. I conclude that ZNZZBot was forked from an earlier Locutus, or from an earlier descendant of Locutus.

The bot’s author is given on SSCAIT as “znzz” and in its config file as “zzy”. I’m not sure whether that indicates an attachment to zealots, or fitful sleep.

In its first 5 games, all it has played so far on SSCAIT and BASIL, ZNZZBot scored a win over Krasi0 with the Locutus dragoon runby strategy, but also a loss against the much weaker opponent ICEbot where protoss showed poor macro and poor decisions. I glanced over a couple new openings defined in the config file, and found them a bit odd, perhaps not expertly designed.

That’s all I know so far. I expect we’ll find out more with time.

AIIDE 2019 - unknown maps tournament

The AIIDE 2019 unknown maps competition results are up. At first glance, the biggest surprise is that the ranking is extremely similar to the ranking in the main tournament. 10 bots chose to compete. The weakest players did not participate, so the winning rates for all bots are lower than in the main tournament.
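One way to put a number on "extremely similar" is a rank correlation. The sketch below assumes the main tournament order is the official crosstable order with the three non-participants (ZZZKBot, AITP, BunkerBoxeR) removed, and it takes the unknown-maps overall win rates from the per-player tables. Spearman's rho is my choice of measure here, not something the official results report.

```python
# Main tournament order, restricted to the 10 bots that entered the bonus tournament.
main_order = [
    "Locutus", "PurpleWave", "BananaBrain", "DaQin", "Steamhammer",
    "Microwave", "Iron", "XiaoYi", "McRave", "UAlbertaBot",
]

# Overall win rates (percent) in the unknown maps tournament.
unknown_overall = {
    "Locutus": 86.11, "PurpleWave": 84.28, "DaQin": 62.58, "BananaBrain": 61.56,
    "Microwave": 39.60, "Steamhammer": 37.75, "XiaoYi": 37.00, "McRave": 36.04,
    "Iron": 34.44, "UAlbertaBot": 20.43,
}


def spearman_rho(order_a, order_b):
    """Spearman rank correlation between two orderings of the same names (no ties)."""
    rank_b = {name: i for i, name in enumerate(order_b)}
    n = len(order_a)
    d_squared = sum((i - rank_b[name]) ** 2 for i, name in enumerate(order_a))
    return 1 - 6 * d_squared / (n * (n * n - 1))


unknown_order = sorted(unknown_overall, key=unknown_overall.get, reverse=True)
rho = spearman_rho(main_order, unknown_order)
print(unknown_order)
print(round(rho, 4))
```

A rho of 1 would mean an identical ranking. The only changes here are swaps between neighbors, plus Iron dropping two places.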

The results for some reason don’t include a straight listing of the 5 maps used. They are (2) Polaris Rhapsody, (3) Longinus 2, (4) Arcadia 2, (4) Fighting Spirit, and (4) Roadkill. Fighting Spirit is of course familiar to SSCAIT participants. The first 4 maps are classics from the KESPA era (which ended in 2012) and Roadkill is a more recent design.

I know from test games that Steamhammer plays well on Arcadia. (I test Steamhammer on all kinds of maps as a regular thing.) I’m pleased to see that reflected in the map statistics. Other map preferences that stand out are that DaQin likes Polaris Rhapsody and dislikes Fighting Spirit, Microwave prefers Roadkill, and Iron had trouble on Roadkill.

I will analyze the unknown maps tournament, at least to some extent. I’m not sure exactly how. There’s a bit of an embarrassment of riches at the moment.

AIIDE 2019 - maps per player

I’m pleased with this one. This is the same data as yesterday, how each bot did against each other on each map, but organized by player rather than by map. If you’re a bot author, I think this is a better way to find out about strengths and weaknesses.
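The reorganization is just a regrouping of the same cells. Here is a minimal sketch, assuming the per-map data is held as nested dicts of win percentages; the layout and names are hypothetical, not what my script actually uses. The sample values come from the Benzene and Aztec crosstables.

```python
from collections import defaultdict

# per_map[map_name][bot][opponent] = bot's win rate in percent (sample cells only).
per_map = {
    "Benzene": {
        "Locutus": {"PurpleWave": 50, "DaQin": 100},
        "DaQin": {"Locutus": 0},
    },
    "Aztec": {
        "Locutus": {"PurpleWave": 40, "DaQin": 20},
        "DaQin": {"Locutus": 80},
    },
}


def by_player(per_map_tables):
    """Regroup map -> bot -> opponent into bot -> opponent -> map."""
    result = defaultdict(lambda: defaultdict(dict))
    for map_name, table in per_map_tables.items():
        for bot, row in table.items():
            for opponent, pct in row.items():
                result[bot][opponent][map_name] = pct
    return result


per_player = by_player(per_map)
print(dict(per_player["Locutus"]["DaQin"]))
```

Each row of a per-player table is then `per_player[bot][opponent]`, with one entry per map.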

For example, the first table is from the point of view of Locutus. The percentages are Locutus’s win rates. The upset by DaQin on Aztec immediately stands out amid Locutus’s otherwise consistent results. I imagine that Bruce @ Locutus will examine those 10 games and perhaps find a bug that DaQin exploited. (Locutus played its cannons at ramp into zealot drop strategy in these games. It lost because cannons at the ramp are a poor defense when the outside is on higher ground—Aztec has low-ground main bases. Maybe a weakness in learning or preparation?)

Locutus	overall	Benzen	Destin	Heartb	Aztec	TauCro	Androm	Circui	Empire	Fortre	Python
PurpleWave	45%	50%	50%	40%	40%	40%	60%	40%	30%	40%	60%
BananaBrain	89%	100%	100%	80%	100%	100%	80%	70%	100%	70%	90%
DaQin	83%	100%	80%	100%	20%	80%	100%	90%	90%	80%	90%
Steamhammer	97%	90%	100%	100%	90%	100%	90%	100%	100%	100%	100%
ZZZKBot	99%	100%	100%	100%	100%	100%	100%	100%	90%	100%	100%
Microwave	92%	100%	100%	80%	80%	100%	90%	90%	90%	90%	100%
Iron	99%	100%	100%	100%	100%	100%	90%	100%	100%	100%	100%
XiaoYi	96%	100%	100%	100%	90%	70%	100%	100%	100%	100%	100%
McRave	99%	100%	100%	100%	100%	100%	100%	100%	90%	100%	100%
UAlbertaBot	99%	100%	100%	100%	100%	100%	100%	100%	90%	100%	100%
AITP	100%	100%	100%	100%	100%	100%	100%	100%	100%	100%	100%
BunkerBoxeR	99%	100%	100%	100%	100%	100%	100%	100%	90%	100%	100%
overall	91.42%	95%	94%	92%	85%	91%	92%	91%	89%	90%	95%

PurpleWave	overall	Benzen	Destin	Heartb	Aztec	TauCro	Androm	Circui	Empire	Fortre	Python
Locutus	55%	50%	50%	60%	60%	60%	40%	60%	70%	60%	40%
BananaBrain	44%	80%	70%	80%	50%	60%	10%	40%	20%	0%	30%
DaQin	85%	80%	70%	100%	100%	70%	80%	100%	100%	70%	80%
Steamhammer	71%	90%	100%	70%	40%	60%	80%	60%	70%	90%	50%
ZZZKBot	100%	100%	100%	100%	100%	100%	100%	100%	100%	100%	100%
Microwave	93%	100%	90%	90%	100%	90%	90%	90%	90%	90%	100%
Iron	98%	100%	100%	90%	100%	100%	100%	100%	100%	100%	90%
XiaoYi	97%	90%	100%	100%	100%	100%	90%	100%	90%	100%	100%
McRave	89%	100%	90%	100%	100%	90%	80%	100%	70%	70%	90%
UAlbertaBot	98%	100%	100%	100%	100%	100%	100%	78%	100%	100%	100%
AITP	100%	100%	100%	100%	100%	100%	100%	100%	100%	100%	100%
BunkerBoxeR	97%	100%	90%	90%	100%	100%	100%	100%	90%	100%	100%
overall	85.54%	91%	88%	90%	88%	86%	81%	86%	83%	82%	82%

BananaBrain	overall	Benzen	Destin	Heartb	Aztec	TauCro	Androm	Circui	Empire	Fortre	Python
Locutus	11%	0%	0%	20%	0%	0%	20%	30%	0%	30%	10%
PurpleWave	56%	20%	30%	20%	50%	40%	90%	60%	80%	100%	70%
DaQin	51%	30%	40%	60%	80%	50%	30%	60%	60%	60%	40%
Steamhammer	85%	100%	70%	90%	80%	80%	70%	100%	100%	70%	90%
ZZZKBot	83%	90%	100%	70%	80%	70%	90%	100%	70%	90%	70%
Microwave	71%	90%	70%	90%	70%	60%	70%	60%	80%	60%	60%
Iron	59%	40%	70%	50%	60%	60%	70%	80%	50%	70%	40%
XiaoYi	57%	40%	80%	40%	50%	50%	40%	80%	40%	80%	70%
McRave	69%	70%	60%	90%	70%	50%	80%	70%	70%	70%	60%
UAlbertaBot	84%	80%	90%	80%	90%	80%	90%	80%	89%	90%	70%
AITP	100%	100%	100%	100%	100%	100%	100%	100%	100%	100%	100%
BunkerBoxeR	100%	100%	100%	100%	100%	100%	100%	100%	100%	100%	100%
overall	68.81%	63%	68%	68%	69%	62%	71%	77%	70%	77%	65%

DaQin	overall	Benzen	Destin	Heartb	Aztec	TauCro	Androm	Circui	Empire	Fortre	Python
Locutus	17%	0%	20%	0%	80%	20%	0%	10%	10%	20%	10%
PurpleWave	15%	20%	30%	0%	0%	30%	20%	0%	0%	30%	20%
BananaBrain	49%	70%	60%	40%	20%	50%	70%	40%	40%	40%	60%
Steamhammer	94%	100%	100%	90%	100%	100%	100%	100%	70%	90%	90%
ZZZKBot	10%	10%	20%	10%	10%	10%	20%	10%	0%	10%	0%
Microwave	83%	90%	60%	60%	90%	100%	80%	90%	100%	80%	80%
Iron	92%	100%	90%	100%	90%	100%	90%	100%	80%	70%	100%
XiaoYi	82%	100%	100%	90%	100%	90%	60%	70%	60%	60%	90%
McRave	41%	40%	60%	40%	40%	20%	50%	30%	40%	30%	60%
UAlbertaBot	78%	70%	100%	80%	90%	70%	80%	80%	70%	60%	80%
AITP	100%	100%	100%	100%	100%	100%	100%	100%	100%	100%	100%
BunkerBoxeR	99%	100%	100%	100%	90%	100%	100%	100%	100%	100%	100%
overall	63.33%	67%	70%	59%	68%	66%	64%	61%	56%	57%	66%

Steamhammer	overall	Benzen	Destin	Heartb	Aztec	TauCro	Androm	Circui	Empire	Fortre	Python
Locutus	3%	10%	0%	0%	10%	0%	10%	0%	0%	0%	0%
PurpleWave	29%	10%	0%	30%	60%	40%	20%	40%	30%	10%	50%
BananaBrain	15%	0%	30%	10%	20%	20%	30%	0%	0%	30%	10%
DaQin	6%	0%	0%	10%	0%	0%	0%	0%	30%	10%	10%
ZZZKBot	59%	40%	60%	80%	80%	40%	50%	40%	60%	50%	90%
Microwave	25%	10%	30%	50%	30%	30%	20%	40%	20%	10%	10%
Iron	67%	70%	90%	100%	40%	40%	90%	60%	60%	100%	20%
XiaoYi	50%	30%	60%	50%	20%	10%	80%	50%	80%	60%	60%
McRave	86%	100%	90%	100%	100%	70%	100%	90%	90%	40%	80%
UAlbertaBot	91%	90%	90%	90%	89%	80%	90%	89%	90%	100%	100%
AITP	97%	90%	100%	90%	100%	100%	90%	100%	100%	100%	100%
BunkerBoxeR	100%	100%	100%	100%	100%	100%	100%	100%	100%	100%	100%
overall	52.25%	46%	54%	59%	54%	44%	57%	50%	55%	51%	52%

ZZZKBot	overall	Benzen	Destin	Heartb	Aztec	TauCro	Androm	Circui	Empire	Fortre	Python
Locutus	1%	0%	0%	0%	0%	0%	0%	0%	10%	0%	0%
PurpleWave	0%	0%	0%	0%	0%	0%	0%	0%	0%	0%	0%
BananaBrain	17%	10%	0%	30%	20%	30%	10%	0%	30%	10%	30%
DaQin	90%	90%	80%	90%	90%	90%	80%	90%	100%	90%	100%
Steamhammer	41%	60%	40%	20%	20%	60%	50%	60%	40%	50%	10%
Microwave	44%	50%	0%	60%	40%	50%	50%	50%	60%	30%	50%
Iron	55%	0%	50%	60%	30%	50%	100%	50%	80%	50%	80%
XiaoYi	49%	60%	90%	70%	10%	20%	60%	40%	20%	70%	50%
McRave	67%	70%	70%	50%	80%	40%	30%	80%	80%	90%	80%
UAlbertaBot	90%	90%	80%	90%	100%	90%	90%	70%	100%	90%	100%
AITP	72%	70%	60%	60%	90%	80%	20%	70%	90%	90%	90%
BunkerBoxeR	99%	100%	100%	100%	100%	100%	90%	100%	100%	100%	100%
overall	52.08%	50%	48%	52%	48%	51%	48%	51%	59%	56%	57%

Microwave	overall	Benzen	Destin	Heartb	Aztec	TauCro	Androm	Circui	Empire	Fortre	Python
Locutus	8%	0%	0%	20%	20%	0%	10%	10%	10%	10%	0%
PurpleWave	7%	0%	10%	10%	0%	10%	10%	10%	10%	10%	0%
BananaBrain	29%	10%	30%	10%	30%	40%	30%	40%	20%	40%	40%
DaQin	17%	10%	40%	40%	10%	0%	20%	10%	0%	20%	20%
Steamhammer	75%	90%	70%	50%	70%	70%	80%	60%	80%	90%	90%
ZZZKBot	56%	50%	100%	40%	60%	50%	50%	50%	40%	70%	50%
Iron	13%	0%	0%	10%	10%	0%	20%	50%	20%	10%	10%
XiaoYi	65%	60%	80%	70%	60%	80%	40%	80%	60%	30%	90%
McRave	64%	60%	80%	100%	70%	50%	60%	70%	60%	30%	60%
UAlbertaBot	82%	60%	100%	90%	60%	80%	60%	80%	90%	100%	100%
AITP	93%	90%	90%	70%	100%	100%	90%	100%	100%	100%	90%
BunkerBoxeR	99%	100%	100%	100%	90%	100%	100%	100%	100%	100%	100%
overall	50.67%	44%	58%	51%	48%	48%	48%	55%	49%	51%	54%

Iron	overall	Benzen	Destin	Heartb	Aztec	TauCro	Androm	Circui	Empire	Fortre	Python
Locutus	1%	0%	0%	0%	0%	0%	10%	0%	0%	0%	0%
PurpleWave	2%	0%	0%	10%	0%	0%	0%	0%	0%	0%	10%
BananaBrain	41%	60%	30%	50%	40%	40%	30%	20%	50%	30%	60%
DaQin	8%	0%	10%	0%	10%	0%	10%	0%	20%	30%	0%
Steamhammer	33%	30%	10%	0%	60%	60%	10%	40%	40%	0%	80%
ZZZKBot	45%	100%	50%	40%	70%	50%	0%	50%	20%	50%	20%
Microwave	87%	100%	100%	90%	90%	100%	80%	50%	80%	90%	90%
XiaoYi	26%	10%	50%	0%	50%	20%	40%	20%	30%	40%	0%
McRave	65%	70%	80%	80%	80%	80%	70%	50%	40%	40%	60%
UAlbertaBot	90%	100%	90%	100%	90%	90%	90%	60%	90%	90%	100%
AITP	100%	100%	100%	100%	100%	100%	100%	100%	100%	100%	100%
BunkerBoxeR	93%	100%	100%	80%	80%	100%	90%	100%	90%	90%	100%
overall	49.25%	56%	52%	46%	56%	53%	44%	41%	47%	47%	52%

XiaoYi	overall	Benzen	Destin	Heartb	Aztec	TauCro	Androm	Circui	Empire	Fortre	Python
Locutus	4%	0%	0%	0%	10%	30%	0%	0%	0%	0%	0%
PurpleWave	3%	10%	0%	0%	0%	0%	10%	0%	10%	0%	0%
BananaBrain	43%	60%	20%	60%	50%	50%	60%	20%	60%	20%	30%
DaQin	18%	0%	0%	10%	0%	10%	40%	30%	40%	40%	10%
Steamhammer	50%	70%	40%	50%	80%	90%	20%	50%	20%	40%	40%
ZZZKBot	51%	40%	10%	30%	90%	80%	40%	60%	80%	30%	50%
Microwave	35%	40%	20%	30%	40%	20%	60%	20%	40%	70%	10%
Iron	74%	90%	50%	100%	50%	80%	60%	80%	70%	60%	100%
McRave	36%	40%	10%	50%	30%	40%	30%	40%	30%	60%	30%
UAlbertaBot	73%	44%	60%	90%	80%	80%	75%	60%	89%	89%	60%
AITP	100%	100%	100%	100%	100%	100%	100%	100%	100%	100%	100%
BunkerBoxeR	98%	100%	100%	100%	90%	100%	100%	100%	90%	100%	100%
overall	48.62%	50%	34%	52%	52%	57%	49%	47%	52%	50%	44%

McRave	overall	Benzen	Destin	Heartb	Aztec	TauCro	Androm	Circui	Empire	Fortre	Python
Locutus	1%	0%	0%	0%	0%	0%	0%	0%	10%	0%	0%
PurpleWave	11%	0%	10%	0%	0%	10%	20%	0%	30%	30%	10%
BananaBrain	31%	30%	40%	10%	30%	50%	20%	30%	30%	30%	40%
DaQin	59%	60%	40%	60%	60%	80%	50%	70%	60%	70%	40%
Steamhammer	14%	0%	10%	0%	0%	30%	0%	10%	10%	60%	20%
ZZZKBot	33%	30%	30%	50%	20%	60%	70%	20%	20%	10%	20%
Microwave	36%	40%	20%	0%	30%	50%	40%	30%	40%	70%	40%
Iron	35%	30%	20%	20%	20%	20%	30%	50%	60%	60%	40%
XiaoYi	64%	60%	90%	50%	70%	60%	70%	60%	70%	40%	70%
UAlbertaBot	43%	40%	60%	50%	10%	40%	30%	20%	70%	40%	70%
AITP	82%	80%	70%	60%	100%	90%	80%	80%	100%	100%	60%
BunkerBoxeR	71%	70%	80%	70%	50%	80%	70%	70%	80%	70%	70%
overall	40.00%	37%	39%	31%	32%	48%	40%	37%	48%	48%	40%

UAlbertaBot	overall	Benzen	Destin	Heartb	Aztec	TauCro	Androm	Circui	Empire	Fortre	Python
Locutus	1%	0%	0%	0%	0%	0%	0%	0%	10%	0%	0%
PurpleWave	2%	0%	0%	0%	0%	0%	0%	22%	0%	0%	0%
BananaBrain	16%	20%	10%	20%	10%	20%	10%	20%	11%	10%	30%
DaQin	22%	30%	0%	20%	10%	30%	20%	20%	30%	40%	20%
Steamhammer	9%	10%	10%	10%	11%	20%	10%	11%	10%	0%	0%
ZZZKBot	10%	10%	20%	10%	0%	10%	10%	30%	0%	10%	0%
Microwave	18%	40%	0%	10%	40%	20%	40%	20%	10%	0%	0%
Iron	10%	0%	10%	0%	10%	10%	10%	40%	10%	10%	0%
XiaoYi	27%	56%	40%	10%	20%	20%	25%	40%	11%	11%	40%
McRave	57%	60%	40%	50%	90%	60%	70%	80%	30%	60%	30%
AITP	75%	89%	80%	50%	78%	90%	70%	33%	90%	70%	100%
BunkerBoxeR	89%	70%	90%	90%	80%	90%	100%	80%	100%	100%	90%
overall	28.04%	31%	25%	22%	29%	31%	31%	33%	27%	26%	25%

AITP	overall	Benzen	Destin	Heartb	Aztec	TauCro	Androm	Circui	Empire	Fortre	Python
Locutus	0%	0%	0%	0%	0%	0%	0%	0%	0%	0%	0%
PurpleWave	0%	0%	0%	0%	0%	0%	0%	0%	0%	0%	0%
BananaBrain	0%	0%	0%	0%	0%	0%	0%	0%	0%	0%	0%
DaQin	0%	0%	0%	0%	0%	0%	0%	0%	0%	0%	0%
Steamhammer	3%	10%	0%	10%	0%	0%	10%	0%	0%	0%	0%
ZZZKBot	28%	30%	40%	40%	10%	20%	80%	30%	10%	10%	10%
Microwave	7%	10%	10%	30%	0%	0%	10%	0%	0%	0%	10%
Iron	0%	0%	0%	0%	0%	0%	0%	0%	0%	0%	0%
XiaoYi	0%	0%	0%	0%	0%	0%	0%	0%	0%	0%	0%
McRave	18%	20%	30%	40%	0%	10%	20%	20%	0%	0%	40%
UAlbertaBot	25%	11%	20%	50%	22%	10%	30%	67%	10%	30%	0%
BunkerBoxeR	59%	30%	20%	70%	40%	70%	100%	80%	60%	50%	70%
overall	11.62%	9%	10%	20%	6%	9%	21%	16%	7%	8%	11%

BunkerBoxeR	overall	Benzen	Destin	Heartb	Aztec	TauCro	Androm	Circui	Empire	Fortre	Python
Locutus	1%	0%	0%	0%	0%	0%	0%	0%	10%	0%	0%
PurpleWave	3%	0%	10%	10%	0%	0%	0%	0%	10%	0%	0%
BananaBrain	0%	0%	0%	0%	0%	0%	0%	0%	0%	0%	0%
DaQin	1%	0%	0%	0%	10%	0%	0%	0%	0%	0%	0%
Steamhammer	0%	0%	0%	0%	0%	0%	0%	0%	0%	0%	0%
ZZZKBot	1%	0%	0%	0%	0%	0%	10%	0%	0%	0%	0%
Microwave	1%	0%	0%	0%	10%	0%	0%	0%	0%	0%	0%
Iron	7%	0%	0%	20%	20%	0%	10%	0%	10%	10%	0%
XiaoYi	2%	0%	0%	0%	10%	0%	0%	0%	10%	0%	0%
McRave	29%	30%	20%	30%	50%	20%	30%	30%	20%	30%	30%
UAlbertaBot	11%	30%	10%	10%	20%	10%	0%	20%	0%	0%	10%
AITP	41%	70%	80%	30%	60%	30%	0%	20%	40%	50%	30%
overall	8.08%	11%	10%	8%	15%	5%	4%	6%	8%	8%	6%

The zergs and terrans seem more sensitive to the map than protoss. For example, Locutus vs PurpleWave win rates vary from 30% to 60%, which could be entirely due to statistical fluctuation, while ZZZKBot vs Iron ranges from 0% to 100%, which is not random. I imagine that ZZZKBot’s map selectivity has to do with its learning algorithm. But overall, there are many cases where the map seems to make a difference against opponents of similar strength. I think bots will benefit from more sensitivity to the map design.

AIIDE 2019 - per-map crosstables

A separate crosstable for each of the 10 maps. Most cells have only 10 games in them (some have fewer because of unsuccessful games), so the numbers are noisy. Nevertheless, I think the tables are full of insights—so full that it’s easy to be overwhelmed. I left the game counts out of the cells to make the tables more compact, so they are easier to compare.
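To get a feel for how noisy a 10-game cell is, a confidence interval helps. This sketch uses the Wilson score interval at roughly 95% (z = 1.96). Both the interval and the function name are my choices for illustration; I did not use them to build the tables.

```python
import math


def wilson_interval(wins, games, z=1.96):
    """Wilson score interval for a win rate; z=1.96 gives roughly 95% coverage."""
    if games <= 0:
        raise ValueError("games must be positive")
    p = wins / games
    denom = 1 + z * z / games
    center = (p + z * z / (2 * games)) / denom
    half = z * math.sqrt(p * (1 - p) / games + z * z / (4 * games * games)) / denom
    return center - half, center + half


low_5, high_5 = wilson_interval(5, 10)     # a 50% cell
low_10, high_10 = wilson_interval(10, 10)  # a 100% cell
print(round(low_5, 3), round(high_5, 3))
print(round(low_10, 3), round(high_10, 3))
```

A 50% cell is consistent with a true win rate anywhere from about 24% to 76%, and even a 10-0 sweep only pins the rate down to somewhere above about 72%.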

Watch how ZZZKBot versus Iron varies strongly depending on the map.

(2) Benzene

PurpleWave’s favorite map. Is that because of Purple pathfinding skills? But Locutus liked the map too.

#	bot	overall	Locu	Purp	Bana	DaQi	Stea	ZZZK	Micr	Iron	Xiao	McRa	UAlb	AITP	Bunk
1	Locutus	95.00%		50%	100%	100%	90%	100%	100%	100%	100%	100%	100%	100%	100%
2	PurpleWave	90.83%	50%		80%	80%	90%	100%	100%	100%	90%	100%	100%	100%	100%
3	BananaBrain	63.33%	0%	20%		30%	100%	90%	90%	40%	40%	70%	80%	100%	100%
4	DaQin	66.67%	0%	20%	70%		100%	10%	90%	100%	100%	40%	70%	100%	100%
5	Steamhammer	45.83%	10%	10%	0%	0%		40%	10%	70%	30%	100%	90%	90%	100%
6	ZZZKBot	50.00%	0%	0%	10%	90%	60%		50%	0%	60%	70%	90%	70%	100%
7	Microwave	44.17%	0%	0%	10%	10%	90%	50%		0%	60%	60%	60%	90%	100%
8	Iron	55.83%	0%	0%	60%	0%	30%	100%	100%		10%	70%	100%	100%	100%
9	XiaoYi	49.58%	0%	10%	60%	0%	70%	40%	40%	90%		40%	44%	100%	100%
10	McRave	36.67%	0%	0%	30%	60%	0%	30%	40%	30%	60%		40%	80%	70%
11	UAlbertaBot	31.36%	0%	0%	20%	30%	10%	10%	40%	0%	56%	60%		89%	70%
12	AITP	9.24%	0%	0%	0%	0%	10%	30%	10%	0%	0%	20%	11%		30%
13	BunkerBoxeR	10.83%	0%	0%	0%	0%	0%	0%	0%	0%	0%	30%	30%	70%

(2) Destination

Key area on the map: Over the wall behind the enemy natural. Build an e-bay there and send tanks. Float the e-bay and siege the tanks, maybe add turrets for air defense. How many defending bots would survive? XiaoYi suffered badly on this map, but that’s not why.

#	bot	overall	Locu	Purp	Bana	DaQi	Stea	ZZZK	Micr	Iron	Xiao	McRa	UAlb	AITP	Bunk
1	Locutus	94.17%		50%	100%	80%	100%	100%	100%	100%	100%	100%	100%	100%	100%
2	PurpleWave	88.33%	50%		70%	70%	100%	100%	90%	100%	100%	90%	100%	100%	90%
3	BananaBrain	67.50%	0%	30%		40%	70%	100%	70%	70%	80%	60%	90%	100%	100%
4	DaQin	70.00%	20%	30%	60%		100%	20%	60%	90%	100%	60%	100%	100%	100%
5	Steamhammer	54.17%	0%	0%	30%	0%		60%	30%	90%	60%	90%	90%	100%	100%
6	ZZZKBot	47.50%	0%	0%	0%	80%	40%		0%	50%	90%	70%	80%	60%	100%
7	Microwave	58.33%	0%	10%	30%	40%	70%	100%		0%	80%	80%	100%	90%	100%
8	Iron	51.67%	0%	0%	30%	10%	10%	50%	100%		50%	80%	90%	100%	100%
9	XiaoYi	34.17%	0%	0%	20%	0%	40%	10%	20%	50%		10%	60%	100%	100%
10	McRave	39.17%	0%	10%	40%	40%	10%	30%	20%	20%	90%		60%	70%	80%
11	UAlbertaBot	25.00%	0%	0%	10%	0%	10%	20%	0%	10%	40%	40%		80%	90%
12	AITP	10.00%	0%	0%	0%	0%	0%	40%	10%	0%	0%	30%	20%		20%
13	BunkerBoxeR	10.00%	0%	10%	0%	0%	0%	0%	0%	0%	0%	20%	10%	80%

(2) Heartbreak Ridge

It’s a tricky map, but bots don’t seem to realize that. Steamhammer liked it a bit and Iron disliked it, but no bot stood out as particularly loving or hating it.

#	bot	overall	Locu	Purp	Bana	DaQi	Stea	ZZZK	Micr	Iron	Xiao	McRa	UAlb	AITP	Bunk
1	Locutus	91.67%		40%	80%	100%	100%	100%	80%	100%	100%	100%	100%	100%	100%
2	PurpleWave	90.00%	60%		80%	100%	70%	100%	90%	90%	100%	100%	100%	100%	90%
3	BananaBrain	67.50%	20%	20%		60%	90%	70%	90%	50%	40%	90%	80%	100%	100%
4	DaQin	59.17%	0%	0%	40%		90%	10%	60%	100%	90%	40%	80%	100%	100%
5	Steamhammer	59.17%	0%	30%	10%	10%		80%	50%	100%	50%	100%	90%	90%	100%
6	ZZZKBot	52.50%	0%	0%	30%	90%	20%		60%	60%	70%	50%	90%	60%	100%
7	Microwave	50.83%	20%	10%	10%	40%	50%	40%		10%	70%	100%	90%	70%	100%
8	Iron	45.83%	0%	10%	50%	0%	0%	40%	90%		0%	80%	100%	100%	80%
9	XiaoYi	51.67%	0%	0%	60%	10%	50%	30%	30%	100%		50%	90%	100%	100%
10	McRave	30.83%	0%	0%	10%	60%	0%	50%	0%	20%	50%		50%	60%	70%
11	UAlbertaBot	22.50%	0%	0%	20%	20%	10%	10%	10%	0%	10%	50%		50%	90%
12	AITP	20.00%	0%	0%	0%	0%	10%	40%	30%	0%	0%	40%	50%		70%
13	BunkerBoxeR	8.33%	0%	10%	0%	0%	0%	0%	0%	20%	0%	30%	10%	30%

(3) Aztec

A 3-player map with low-ground main bases. I like this map. DaQin liked it too, since it upset Locutus, unlike on any other map.

#	bot	overall	Locu	Purp	Bana	DaQi	Stea	ZZZK	Micr	Iron	Xiao	McRa	UAlb	AITP	Bunk
1	Locutus	85.00%		40%	100%	20%	90%	100%	80%	100%	90%	100%	100%	100%	100%
2	PurpleWave	87.50%	60%		50%	100%	40%	100%	100%	100%	100%	100%	100%	100%	100%
3	BananaBrain	69.17%	0%	50%		80%	80%	80%	70%	60%	50%	70%	90%	100%	100%
4	DaQin	67.50%	80%	0%	20%		100%	10%	90%	90%	100%	40%	90%	100%	90%
5	Steamhammer	53.78%	10%	60%	20%	0%		80%	30%	40%	20%	100%	89%	100%	100%
6	ZZZKBot	48.33%	0%	0%	20%	90%	20%		40%	30%	10%	80%	100%	90%	100%
7	Microwave	48.33%	20%	0%	30%	10%	70%	60%		10%	60%	70%	60%	100%	90%
8	Iron	55.83%	0%	0%	40%	10%	60%	70%	90%		50%	80%	90%	100%	80%
9	XiaoYi	51.67%	10%	0%	50%	0%	80%	90%	40%	50%		30%	80%	100%	90%
10	McRave	32.50%	0%	0%	30%	60%	0%	20%	30%	20%	70%		10%	100%	50%
11	UAlbertaBot	28.81%	0%	0%	10%	10%	11%	0%	40%	10%	20%	90%		78%	80%
12	AITP	5.88%	0%	0%	0%	0%	0%	10%	0%	0%	0%	0%	22%		40%
13	BunkerBoxeR	15.00%	0%	0%	0%	10%	0%	0%	10%	20%	10%	50%	20%	60%

(3) Tau Cross

Bases beyond the natural are open to attack. I think that is why Steamhammer had trouble. It was XiaoYi’s best map.

#	bot	overall	Locu	Purp	Bana	DaQi	Stea	ZZZK	Micr	Iron	Xiao	McRa	UAlb	AITP	Bunk
1	Locutus	90.83%		40%	100%	80%	100%	100%	100%	100%	70%	100%	100%	100%	100%
2	PurpleWave	85.83%	60%		60%	70%	60%	100%	90%	100%	100%	90%	100%	100%	100%
3	BananaBrain	61.67%	0%	40%		50%	80%	70%	60%	60%	50%	50%	80%	100%	100%
4	DaQin	65.83%	20%	30%	50%		100%	10%	100%	100%	90%	20%	70%	100%	100%
5	Steamhammer	44.17%	0%	40%	20%	0%		40%	30%	40%	10%	70%	80%	100%	100%
6	ZZZKBot	50.83%	0%	0%	30%	90%	60%		50%	50%	20%	40%	90%	80%	100%
7	Microwave	48.33%	0%	10%	40%	0%	70%	50%		0%	80%	50%	80%	100%	100%
8	Iron	53.33%	0%	0%	40%	0%	60%	50%	100%		20%	80%	90%	100%	100%
9	XiaoYi	56.67%	30%	0%	50%	10%	90%	80%	20%	80%		40%	80%	100%	100%
10	McRave	47.50%	0%	10%	50%	80%	30%	60%	50%	20%	60%		40%	90%	80%
11	UAlbertaBot	30.83%	0%	0%	20%	30%	20%	10%	20%	10%	20%	60%		90%	90%
12	AITP	9.17%	0%	0%	0%	0%	0%	20%	0%	0%	0%	10%	10%		70%
13	BunkerBoxeR	5.00%	0%	0%	0%	0%	0%	0%	0%	0%	0%	20%	10%	30%

(4) Andromeda

A bot that understands when to take the in-base mineral-only gains an edge. But so would a bot which understands how to attack it from outside, not as common a skill.

#	bot	overall	Locu	Purp	Bana	DaQi	Stea	ZZZK	Micr	Iron	Xiao	McRa	UAlb	AITP	Bunk
1	Locutus	92.50%		60%	80%	100%	90%	100%	90%	90%	100%	100%	100%	100%	100%
2	PurpleWave	80.83%	40%		10%	80%	80%	100%	90%	100%	90%	80%	100%	100%	100%
3	BananaBrain	70.83%	20%	90%		30%	70%	90%	70%	70%	40%	80%	90%	100%	100%
4	DaQin	64.17%	0%	20%	70%		100%	20%	80%	90%	60%	50%	80%	100%	100%
5	Steamhammer	56.67%	10%	20%	30%	0%		50%	20%	90%	80%	100%	90%	90%	100%
6	ZZZKBot	48.33%	0%	0%	10%	80%	50%		50%	100%	60%	30%	90%	20%	90%
7	Microwave	47.50%	10%	10%	30%	20%	80%	50%		20%	40%	60%	60%	90%	100%
8	Iron	44.17%	10%	0%	30%	10%	10%	0%	80%		40%	70%	90%	100%	90%
9	XiaoYi	49.15%	0%	10%	60%	40%	20%	40%	60%	60%		30%	75%	100%	100%
10	McRave	40.00%	0%	20%	20%	50%	0%	70%	40%	30%	70%		30%	80%	70%
11	UAlbertaBot	30.51%	0%	0%	10%	20%	10%	10%	40%	10%	25%	70%		70%	100%
12	AITP	20.83%	0%	0%	0%	0%	10%	80%	10%	0%	0%	20%	30%		100%
13	BunkerBoxeR	4.17%	0%	0%	0%	0%	0%	10%	0%	10%	0%	30%	0%	0%

(4) Circuit Breaker

Iron was unhappy with this map, though to me it seems like a good map for Iron’s skills.

#	bot	overall	Locu	Purp	Bana	DaQi	Stea	ZZZK	Micr	Iron	Xiao	McRa	UAlb	AITP	Bunk
1	Locutus	90.83%		40%	70%	90%	100%	100%	90%	100%	100%	100%	100%	100%	100%
2	PurpleWave	85.71%	60%		40%	100%	60%	100%	90%	100%	100%	100%	78%	100%	100%
3	BananaBrain	76.67%	30%	60%		60%	100%	100%	60%	80%	80%	70%	80%	100%	100%
4	DaQin	60.83%	10%	0%	40%		100%	10%	90%	100%	70%	30%	80%	100%	100%
5	Steamhammer	50.42%	0%	40%	0%	0%		40%	40%	60%	50%	90%	89%	100%	100%
6	ZZZKBot	50.83%	0%	0%	0%	90%	60%		50%	50%	40%	80%	70%	70%	100%
7	Microwave	55.00%	10%	10%	40%	10%	60%	50%		50%	80%	70%	80%	100%	100%
8	Iron	40.83%	0%	0%	20%	0%	40%	50%	50%		20%	50%	60%	100%	100%
9	XiaoYi	46.67%	0%	0%	20%	30%	50%	60%	20%	80%		40%	60%	100%	100%
10	McRave	36.67%	0%	0%	30%	70%	10%	20%	30%	50%	60%		20%	80%	70%
11	UAlbertaBot	33.33%	0%	22%	20%	20%	11%	30%	20%	40%	40%	80%		33%	80%
12	AITP	15.97%	0%	0%	0%	0%	0%	30%	0%	0%	0%	20%	67%		80%
13	BunkerBoxeR	5.83%	0%	0%	0%	0%	0%	0%	0%	0%	0%	30%	20%	20%

(4) Empire of the Sun

ZZZKBot’s favorite map.

#	bot	overall	Locu	Purp	Bana	DaQi	Stea	ZZZK	Micr	Iron	Xiao	McRa	UAlb	AITP	Bunk
1	Locutus	89.17%		30%	100%	90%	100%	90%	90%	100%	100%	90%	90%	100%	90%
2	PurpleWave	83.05%	70%		20%	100%	70%	100%	90%	100%	90%	70%	100%	100%	90%
3	BananaBrain	69.75%	0%	80%		60%	100%	70%	80%	50%	40%	70%	89%	100%	100%
4	DaQin	55.83%	10%	0%	40%		70%	0%	100%	80%	60%	40%	70%	100%	100%
5	Steamhammer	55.00%	0%	30%	0%	30%		60%	20%	60%	80%	90%	90%	100%	100%
6	ZZZKBot	59.17%	10%	0%	30%	100%	40%		60%	80%	20%	80%	100%	90%	100%
7	Microwave	49.17%	10%	10%	20%	0%	80%	40%		20%	60%	60%	90%	100%	100%
8	Iron	46.67%	0%	0%	50%	20%	40%	20%	80%		30%	40%	90%	100%	90%
9	XiaoYi	52.10%	0%	10%	60%	40%	20%	80%	40%	70%		30%	89%	100%	90%
10	McRave	48.33%	10%	30%	30%	60%	10%	20%	40%	60%	70%		70%	100%	80%
11	UAlbertaBot	26.72%	10%	0%	11%	30%	10%	0%	10%	10%	11%	30%		90%	100%
12	AITP	6.67%	0%	0%	0%	0%	0%	10%	0%	0%	0%	0%	10%		60%
13	BunkerBoxeR	8.33%	10%	10%	0%	0%	0%	0%	0%	10%	10%	20%	0%	40%

(4) Fortress

Fortress has corner bases that can be reached either by air, or by workers which mineral-walk through a gate. I have yet to see a bot that can take advantage of the corner bases. I’ll be watching replays to see if I can find one. BananaBrain did well on this map, so it’s a candidate.

#	bot	overall	Locu	Purp	Bana	DaQi	Stea	ZZZK	Micr	Iron	Xiao	McRa	UAlb	AITP	Bunk
1	Locutus	90.00%		40%	70%	80%	100%	100%	90%	100%	100%	100%	100%	100%	100%
2	PurpleWave	81.67%	60%		0%	70%	90%	100%	90%	100%	100%	70%	100%	100%	100%
3	BananaBrain	76.67%	30%	100%		60%	70%	90%	60%	70%	80%	70%	90%	100%	100%
4	DaQin	57.50%	20%	30%	40%		90%	10%	80%	70%	60%	30%	60%	100%	100%
5	Steamhammer	50.83%	0%	10%	30%	10%		50%	10%	100%	60%	40%	100%	100%	100%
6	ZZZKBot	55.83%	0%	0%	10%	90%	50%		30%	50%	70%	90%	90%	90%	100%
7	Microwave	50.83%	10%	10%	40%	20%	90%	70%		10%	30%	30%	100%	100%	100%
8	Iron	46.67%	0%	0%	30%	30%	0%	50%	90%		40%	40%	90%	100%	90%
9	XiaoYi	50.42%	0%	0%	20%	40%	40%	30%	70%	60%		60%	89%	100%	100%
10	McRave	48.33%	0%	30%	30%	70%	60%	10%	70%	60%	40%		40%	100%	70%
11	UAlbertaBot	26.05%	0%	0%	10%	40%	0%	10%	0%	10%	11%	60%		70%	100%
12	AITP	7.50%	0%	0%	0%	0%	0%	10%	0%	0%	0%	0%	30%		50%
13	BunkerBoxeR	7.50%	0%	0%	0%	0%	0%	0%	0%	10%	0%	30%	0%	50%

(4) Python

Python is a grand old classic, a largely successful attempt to redesign Lost Temple without all the imbalances. It has 2 island bases.

#	bot	overall	Locu	Purp	Bana	DaQi	Stea	ZZZK	Micr	Iron	Xiao	McRa	UAlb	AITP	Bunk
1	Locutus	95.00%		60%	90%	90%	100%	100%	100%	100%	100%	100%	100%	100%	100%
2	PurpleWave	81.51%	40%		30%	80%	50%	100%	100%	90%	100%	90%	100%	100%	100%
3	BananaBrain	65.00%	10%	70%		40%	90%	70%	60%	40%	70%	60%	70%	100%	100%
4	DaQin	65.83%	10%	20%	60%		90%	0%	80%	100%	90%	60%	80%	100%	100%
5	Steamhammer	52.50%	0%	50%	10%	10%		90%	10%	20%	60%	80%	100%	100%	100%
6	ZZZKBot	57.50%	0%	0%	30%	100%	10%		50%	80%	50%	80%	100%	90%	100%
7	Microwave	54.17%	0%	0%	40%	20%	90%	50%		10%	90%	60%	100%	90%	100%
8	Iron	51.67%	0%	10%	60%	0%	80%	20%	90%		0%	60%	100%	100%	100%
9	XiaoYi	44.17%	0%	0%	30%	10%	40%	50%	10%	100%		30%	60%	100%	100%
10	McRave	40.00%	0%	10%	40%	40%	20%	20%	40%	40%	70%		70%	60%	70%
11	UAlbertaBot	25.42%	0%	0%	30%	20%	0%	0%	0%	0%	40%	30%		100%	90%
12	AITP	10.92%	0%	0%	0%	0%	0%	10%	10%	0%	0%	40%	0%		70%
13	BunkerBoxeR	5.83%	0%	0%	0%	0%	0%	0%	0%	0%	0%	30%	10%	30%

Next: I want to present the same information in a different format that I hope will be easier to draw lessons from.

AIIDE 2019 - race balance

The CoG results file is troublesome, so I’m analyzing AIIDE first instead. The purpose of a plan, after all, is not to be executed, but to be changed; contact with the enemy and all that. This year’s AIIDE detailed_results.txt file was easy to read and interpret. I only needed a couple of small changes from last year’s script.

Here is my version of the crosstable. It is identical to the official crosstable, so it doesn’t include any new information. If somebody wants it, I can also post my version of the per-map results, but that doesn’t include any new information either.

#	bot	overall	Locu	Purp	Bana	DaQi	Stea	ZZZK	Micr	Iron	Xiao	McRa	UAlb	AITP	Bunk
1	Locutus	91.42%		45% 45/100	89% 89/100	83% 83/100	97% 97/100	99% 99/100	92% 92/100	99% 99/100	96% 96/100	99% 99/100	99% 99/100	100% 100/100	99% 99/100
2	PurpleWave	85.54%	55% 55/100		44% 44/100	85% 85/100	71% 71/100	100% 100/100	93% 93/100	98% 98/100	97% 97/100	89% 89/100	98% 94/96	100% 100/100	97% 97/100
3	BananaBrain	68.81%	11% 11/100	56% 56/100		51% 51/100	85% 85/100	83% 83/100	71% 71/100	59% 59/100	57% 57/100	69% 69/100	84% 83/99	100% 100/100	100% 100/100
4	DaQin	63.33%	17% 17/100	15% 15/100	49% 49/100		94% 94/100	10% 10/100	83% 83/100	92% 92/100	82% 82/100	41% 41/100	78% 78/100	100% 100/100	99% 99/100
5	Steamhammer	52.25%	3% 3/100	29% 29/100	15% 15/100	6% 6/100		59% 59/100	25% 25/100	67% 67/100	50% 50/100	86% 86/100	91% 89/98	97% 97/100	100% 100/100
6	ZZZKBot	52.08%	1% 1/100	0% 0/100	17% 17/100	90% 90/100	41% 41/100		44% 44/100	55% 55/100	49% 49/100	67% 67/100	90% 90/100	72% 72/100	99% 99/100
7	Microwave	50.67%	8% 8/100	7% 7/100	29% 29/100	17% 17/100	75% 75/100	56% 56/100		13% 13/100	65% 65/100	64% 64/100	82% 82/100	93% 93/100	99% 99/100
8	Iron	49.25%	1% 1/100	2% 2/100	41% 41/100	8% 8/100	33% 33/100	45% 45/100	87% 87/100		26% 26/100	65% 65/100	90% 90/100	100% 100/100	93% 93/100
9	XiaoYi	48.62%	4% 4/100	3% 3/100	43% 43/100	18% 18/100	50% 50/100	51% 51/100	35% 35/100	74% 74/100		36% 36/100	73% 69/95	100% 100/100	98% 98/100
10	McRave	40.00%	1% 1/100	11% 11/100	31% 31/100	59% 59/100	14% 14/100	33% 33/100	36% 36/100	35% 35/100	64% 64/100		43% 43/100	82% 82/100	71% 71/100
11	UAlbertaBot	28.04%	1% 1/100	2% 2/96	16% 16/99	22% 22/100	9% 9/98	10% 10/100	18% 18/100	10% 10/100	27% 26/95	57% 57/100		75% 72/96	89% 89/100
12	AITP	11.62%	0% 0/100	0% 0/100	0% 0/100	0% 0/100	3% 3/100	28% 28/100	7% 7/100	0% 0/100	0% 0/100	18% 18/100	25% 24/96		59% 59/100
13	BunkerBoxeR	8.08%	1% 1/100	3% 3/100	0% 0/100	1% 1/100	0% 0/100	1% 1/100	1% 1/100	7% 7/100	2% 2/100	29% 29/100	11% 11/100	41% 41/100

The race balance is not too interesting this year. In the crosstable, we see protoss at the top, zerg grouped in the middle, and mostly terran on the bottom, so we don’t need numbers to judge the race balance. But here’s the table anyway. The random row and the versus-random column are the least interesting of all, because UAlbertaBot was the only random player.

	overall	vT	vP	vZ	vR
terran	29%		14%	28%	50%
protoss	70%	86%		71%	80%
zerg	52%	72%	29%		88%
random	28%	50%	20%	12%
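As a rough check on how a cell of the race table can be derived, here is a minimal sketch that pools the terran-versus-protoss games from the crosstable above. The win counts are copied from the crosstable; the function name and the choice to pool all games equally are assumptions, not necessarily how the original script works.

```python
# Wins out of 100 games each, copied from the crosstable above:
# terran bots (rows) against the protoss bots (columns, in PROTOSS order).
PROTOSS = ["Locutus", "PurpleWave", "BananaBrain", "DaQin", "McRave"]
TERRAN_VS_PROTOSS_WINS = {
    "Iron":        [1, 2, 41, 8, 65],
    "XiaoYi":      [4, 3, 43, 18, 36],
    "AITP":        [0, 0, 0, 0, 18],
    "BunkerBoxeR": [1, 3, 0, 1, 29],
}
GAMES_PER_PAIRING = 100  # every terran-protoss pairing completed all 100 games


def pooled_win_rate(wins_by_bot, games_per_pairing):
    """Pool every game of one race against another into a single win rate.

    Hypothetical helper: assumes each game counts equally.
    """
    wins = sum(sum(row) for row in wins_by_bot.values())
    games = sum(len(row) for row in wins_by_bot.values()) * games_per_pairing
    return wins, games, wins / games


wins, games, rate = pooled_win_rate(TERRAN_VS_PROTOSS_WINS, GAMES_PER_PAIRING)
print(f"terran vP: {wins}/{games} = {rate:.0%}")
```

Under that pooling assumption, the result is 273 wins in 2000 games, which rounds to the 14% shown in the terran vP cell.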

And here’s the breakdown of how each bot performed against each race. The most surprising results are that Steamhammer did poorly against these specific 2 zerg opponents, although ZvZ is in general its strongest matchup, and that weaker participants UAlbertaBot and BunkerBoxeR somehow scored higher against mighty protoss than against middling zerg. In the crosstable above, we can see the matchups which were responsible for the surprises.

#	bot	race	overall	vT	vP	vZ	vR
1	Locutus	protoss	91.42%	98%	79%	96%	99%
2	PurpleWave	protoss	85.54%	98%	68%	88%	98%
3	BananaBrain	protoss	68.81%	79%	47%	80%	84%
4	DaQin	protoss	63.33%	93%	30%	62%	78%
5	Steamhammer	zerg	52.25%	78%	28%	42%	91%
6	ZZZKBot	zerg	52.08%	69%	35%	42%	90%
7	Microwave	zerg	50.67%	68%	25%	66%	82%
8	Iron	terran	49.25%	73%	23%	55%	90%
9	XiaoYi	terran	48.62%	91%	21%	45%	73%
10	McRave	protoss	40.00%	63%	26%	28%	43%
11	UAlbertaBot	random	28.04%	50%	20%	12%	-
12	AITP	terran	11.62%	20%	4%	13%	25%
13	BunkerBoxeR	terran	8.08%	17%	7%	1%	11%

We also see that XiaoYi annihilated terran opponents, even though it came in a little below Iron overall. For comparison, in last year’s results XiaoYi’s parent bot SAIDA wiped the floor with terrans even harder.

Next: The voluminous per-map crosstables.

AIIDE 2019 second first look

The AIIDE 2019 tournament has been rerun to correct an error. The results are official, different from before, and hopefully final. In the original run of the tournament, we’re told, a hardware error corrupted a file and caused McRave to crash every game against Locutus. In the corrected rerun, McRave was able to score 1 win against Locutus in 100 games, but ironically ended up with a slightly lower overall winning rate. Bugs in McRave were more important for its result than bugs in the tournament.

#1 Locutus and #2 PurpleWave maintain their positions, but Locutus no longer had plus results against every opponent: PurpleWave edged it out 55-45 in their matchup. #3 BananaBrain gained a rank, and #4 DaQin lost one. From my point of view, the most important result is that #5 Steamhammer moved ahead of #7 Microwave and #8 Iron—these competitors were tightly grouped, and it only took small changes in the results to switch their finishing order around thoroughly.

shifts in the results

The order of finishers looks different, but most winning rates in the final results are within a few percent of the deprecated original results. Exceptions are #4 DaQin at 63.33% which was formerly #3 DaQin at 69.39%, a shift of 6% down, and #6 ZZZKBot at 52.08%, formerly #9 ZZZKBot at 43.04%, a shift of 9% up. What accounts for these two bots having such different results? To my eye, it doesn’t look like typical statistical variation.

I looked at the scores of specific matchups. Surprise result one: Formerly ZZZKBot scored 18-82 versus DaQin, but this time ZZZKBot 90-10 DaQin. This one difference accounts for the entire shift in DaQin’s winning rate, moving it down a rank, and much of ZZZKBot’s shift. Surprise result two: Formerly ZZZKBot 34-66 McRave, but this time ZZZKBot 67-33. That accounts for McRave performing worse overall, and for ZZZKBot jumping up the ranks. In other matchups, ZZZKBot performed similarly in both runs of the tournament.
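The claim that single matchups explain the shifts can be checked with a little arithmetic. This is a sketch assuming each bot played about 1200 games (12 opponents, 100 games each) in both runs; the function name is made up for illustration.

```python
GAMES_PER_BOT = 12 * 100  # assumption: 12 opponents, 100 games each, all completed


def shift_in_points(old_wins, new_wins, games=GAMES_PER_BOT):
    """Change in overall win rate, in percentage points, due to one matchup."""
    return 100.0 * (new_wins - old_wins) / games


# DaQin's wins against ZZZKBot went from 82 to 10.
daqin_shift = shift_in_points(82, 10)

# ZZZKBot's wins went from 18 to 90 against DaQin, and from 34 to 67 against McRave.
zzzk_shift = shift_in_points(18, 90) + shift_in_points(34, 67)

print(f"DaQin:   {daqin_shift:+.2f} points")
print(f"ZZZKBot: {zzzk_shift:+.2f} points")
```

The sketch gives about -6.0 points for DaQin, against an observed change of 69.39% to 63.33%, and about +8.75 points for ZZZKBot, against an observed change of 43.04% to 52.08%. The small remainders are what is left for every other matchup combined.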

Why did ZZZKBot perform so differently in these 2 matchups alone? I’ll dig in later, but I can speculate; here are 3 possible reasons, and it could be something else. There is some smell of software error: 18-82 -> 90-10 and 34-66 -> 67-33 look as though the results for the players were swapped. Or perhaps ZZZKBot was affected by the hardware error in these 2 matchups. Another possibility is unstable learning. I know that Steamhammer can perform very differently in two runs of the same matchup depending on what openings it happens to randomly try (does it hit on a winner early?). ZZZKBot’s learning is complicated and hard to analyze, but maybe it is susceptible to some effect like that.

AIIDE 2019 results first look

Important update on Friday 11 October: The results are invalid due to an error and the tournament will be repeated from scratch. See Dave Churchill’s tweet “The 2019 AIIDE StarCraft AI Competition will have to be re-run due to an error on our part causing a corrupted file which caused McRave to crash a lot of games.” The same error might have caused other problems. Even if McRave was the only bot directly affected, the competition was round robin so every bot’s score was potentially affected.

The AIIDE 2019 results were announced today at the conference. The AIIDE conference stream includes Dave Churchill’s presentation starting at about 52:30. They come with a video of Locutus versus PurpleWave, with commentary by Dan Gant focusing not on the game, but on the AI techniques.

The standings: #1 Locutus edged out #2 PurpleWave. #3 DaQin and #4 BananaBrain were far behind, but finished out the dominant protoss bloc. (The win rate over time graph strangely omits #4 BananaBrain.) #5 Iron, #6 Microwave, #7 XiaoYi, and #8 Steamhammer were closely grouped around 50% win rate. As in CoG, Iron is the top terran and the top returning bot, and Microwave was the top zerg.

#10 McRave did surprisingly poorly. It must be suffering from new bugs. I notice that McRave’s army has become strangely passive; it sometimes seems unwilling to fight even with a large advantage. That seems like a symptom of an important bug.

#8 Steamhammer did about as I expected, or at least as I expected after I noticed the combat sim bug that I had just added. Without that bug I think it would have finished slightly ahead of Microwave. I’m bothered by the 59% win rate against Iron, though; I expected over 90%. I tested on every map with the correct version of Iron, but must have made a mistake somewhere.

Last year, Bruce Nielsen provided diffs from Locutus for bots derived from it. This year, Dan Gant has provided diffs of a few other bots.

• Stormbreaker derived from SAIDA - Stormbreaker was disqualified because its behavior was nearly identical to SAIDA’s, though there are big code differences. According to the presentation, Stormbreaker adds a neural network but does not use it.

• XiaoYi derived from SAIDA - According to the presentation, SAIDA would likely have finished 3rd if it had played. XiaoYi placed 7th behind Microwave.

• DaQin this year versus last year. I see a great many detailed changes.

We were promised a second competition on “unknown” maps, for those bots which did not opt out. I count 8 participants for the second competition. I don’t see a sign of its results. Perhaps it has not been run yet.

As always, I will analyze both CoG and AIIDE. But CoG is showing evidence of sloppiness, so AIIDE deserves more attention. With fewer entrants in AIIDE this year, it won’t take as long to dig into them. But I think I have almost managed to interpret the CoG result file, so I’ll start there.

Steamhammer can’t finish the game

Finishing off the enemy just means destroying all their buildings. It sounds simple, but it is a sophisticated skill, and there are a lot of ways to go wrong. Steamhammer has a number of special provisions for quickly finding the last enemy remnants, but small loopholes persist and occasionally a game slips through one.

PurpleSpirit-Steamhammer on La Mancha is an example. It’s an entertaining game, thanks to the purple habit of playing all over the map, but I want to focus on the end, after PurpleSpirit has lost, when Steamhammer fails to destroy the floating terran buildings that are right over its head. That part is entertaining too, but not for the same reason.

Everything terran on the ground is destroyed, except one command center which was infested instead. The remaining terran force is 2 full-strength battlecruisers, and the remaining terran infrastructure is 2 floating ebays. Steamhammer is maxed and banking resources, but its only anti-air units are 8 scourge, plus a defiler with plague. Notice how much game time is left?

One of Steamhammer’s special game-finishing skills is that it makes mutalisks to chase down the residue of the enemy. The condition is, if the enemy has no known bases and no known anti-air units, then Steamhammer will tech to mutalisks and make mutas its primary unit. The mutas scout faster than ground units, and can find floating buildings and island bases that ground units can’t reach. But here terran still has battlecruisers, so the mutalisk rule does not kick in. First the terran army, such as it is, must be defeated.
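The condition of the mutalisk rule is simple enough to write down. The following is only an illustrative sketch in Python, not Steamhammer's actual code; the `EnemyUnit` type and the function name are hypothetical.

```python
from dataclasses import dataclass


@dataclass
class EnemyUnit:
    """Hypothetical record of a known enemy unit or building."""
    name: str
    is_building: bool
    can_attack_air: bool


def mutalisk_rule_applies(known_enemy_bases, known_enemy_units):
    """True when the enemy has no known bases and no known anti-air units."""
    if known_enemy_bases:
        return False
    return not any(unit.can_attack_air for unit in known_enemy_units)


# The situation in this game: no terran bases left, 2 battlecruisers, 2 floating ebays.
remnants = [
    EnemyUnit("Battlecruiser", is_building=False, can_attack_air=True),
    EnemyUnit("Battlecruiser", is_building=False, can_attack_air=True),
    EnemyUnit("Engineering Bay", is_building=True, can_attack_air=False),
    EnemyUnit("Engineering Bay", is_building=True, can_attack_air=False),
]
ebays_only = [unit for unit in remnants if unit.is_building]

print(mutalisk_rule_applies([], remnants))    # battlecruisers can shoot air
print(mutalisk_rule_applies([], ebays_only))  # only floating buildings remain
```

With the battlecruisers alive the check fails, and once only the floating ebays remain it passes.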

Some scourge have hit, and the battlecruisers are no longer at full HP. But Steamhammer has been replacing losses primarily with more zerglings and ultralisks, which are of no use. Oops, the unit mix is wrong. Now there are only 2 scourge, and a battlecruiser can kill a scourge in one shot—2 battlecruisers, if not distracted by other targets, are safe from 2 scourge. The ebays choose to park over the terran natural, and zerg units have congregated there. The battlecruisers seek zerg stuff to shoot, and the defiler responds by blanketing the area in defensive swarm, consuming zerglings like crazy.

Well, the 2 scourge hit one of the battlecruisers, which was distracted seeking zerg units that strayed from under dark swarm. And Steamhammer is now making 3 new pairs of scourge to replace various losses; if it can keep this up, the battlecruisers will eventually fall. Best of all, the defiler has plagued the terran buildings, despite the zerg units underneath. That will put the ebays into the yellow. One more plague should put them into the red, after which they will burn down.

Whew, finally the battlecruisers are shot down by scourge.

But that’s all she wrote. The swarms wore off and there was no need to renew them. The defiler did not plague again because it thought the zerg units underneath were more valuable. Scourge are coded to avoid floating buildings, because it is usually wasteful to spend gas destroying them. The mutalisk rule is engaged, and there is supply to build 1 mutalisk, but Steamhammer happened to choose to spawn zerglings first, and after that there was no supply to make a mutalisk. The game timed out with no more progress.

Finishing off the enemy can be hard. In this case, Steamhammer had the wrong unit mix; to make zerglings and ultralisks when all enemies were in the air was no good. The mutalisk rule should make mutalisks only, not mix them with other units. The scourge might have understood that when only floating buildings are left, they are good targets. Also the zerg ground units that can’t shoot up might have known better than to chase floating buildings (though it can be useful when they track a building trying to escape), and the defiler might have realized that damage to its own units was irrelevant when it could eliminate the enemy. That is a lot of flaws, and yet Steamhammer rarely fails to finish a game!

And fixing all the problems would only narrow the loophole, not eliminate it. In the worst case, Steamhammer would need to be able to destroy some of its own units to clear supply to make mutalisks to finish off the enemy. And that’s a high-end skill that I am in no hurry to add.

Next: The start of CoG 2019 analysis.

CoG 2019 results first look

Dan Gant let me know yesterday that the CoG 2019 (formerly CIG 2019) full results are out. They finally got a new web site up. I grabbed everything, but found that replays_04.zip is corrupt, so we are missing replays from the final 10 rounds. The SOURCE_CODE download does not contain source code.

There were 27 participants, a good number, but only 9 were new entrants, not such a good number. The remaining 18 were holdovers from previous years (this assumes that LetaBot was a new submission as registered, not a holdover as stated in the slide show; I don’t know which is correct). 40 rounds were played, numbered 0 to 39. 27*26*40/2 gives 14,040 games ideally, and they claim that 14,027 successfully made it into the results. The five maps were a version of Heartbreak Ridge (2 starting locations), Alchemist and Great Barrier Reef (a version of El Niño) (3 starts), and Neo Sniper Ridge and Python (4 starts). Alchemist is badly designed, but the others are good choices. Heartbreak Ridge and Sniper Ridge have layout similarities; I would not have included both with only 5 maps total (of course, they chose randomly). I still maintain that 5 maps are not enough to smooth out differences; if a bot does particularly well or poorly on one map, it introduces an element of luck into the results.
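The game count is the usual round-robin formula, with every pair of bots meeting once per round. A small sketch, with a hypothetical function name:

```python
def round_robin_games(players, rounds):
    """Number of games when every pair of players meets once per round."""
    return players * (players - 1) // 2 * rounds


ideal = round_robin_games(27, 40)
completed = 14_027  # the figure claimed in the CoG results
print(ideal, ideal - completed)
```

By this count the ideal total is 14,040, so 13 games are missing from the results.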

The result chart in the slide show does not agree with the result crosstable. The table gives #4 Iron 75.96%, #5 BananaBrain 74.81%, #6 XiaoYi 72.21%. The slide show gives #5 BananaBrain 72.21% from the next entry down in the table, #6 XiaoYi 70.38% from its next entry down in the table, and so on. The error seems to be in the slide show; all the values are assigned to bots which are off by one from #5 BananaBrain until #21 Ziabot, which in the slide show shares the same 29.33% win rate as #20 Bonjwa. I’ll be careful to use the win rates from the crosstable.

I get the impression that the organizers are overburdened. Running a tournament is a ton of work. They do not seem to have the resources to verify details and get everything right. I hope they have time to go back and clean up the errors.

The participants fall into fairly neat score groups. The protoss leaders #1 PurpleWave, #2 McRave, and #3 Locutus (all independently written, by the way, with no shared code history) are at 88.56-84.9%. Then a gap, and the next group is #4 Iron to #9 BetaStar at 75.96-67.41%. Another gap, and the next group is #10 MetaBot to #13 TitanIron at 59.04-56.35%.

I can verify from its learning files that terran #6 XiaoYi at 72.21% wins is a fork of SAIDA. By the way, it is given as registering under the name XiaoYiAI, but played under the name XIAOYI. It was brand new, so probably no bot had special preparation for it by name. Nevertheless, other bots seem to have played under their names as registered, so XiaoYi was potentially given an advantage of anonymity.

#4 Iron at 75.96% win rate is the top terran and the top holdover bot from the previous year. #7 Microwave did well at 70.38%, making it the top zerg. (#22 Steamhammer is the buggy holdover from the previous year and performed miserably, as expected.)

The biggest upset by far is #24 OpprimoBot, 23.22% win rate, 28-12 versus #1 PurpleWave with 88.56% win rate. I watched a few replays and found that, in those games, PurpleWave made one probe and then stopped all production. I can only guess that OpprimoBot tickled a bug, and the bug must be triggered by the name “OpprimoBot”, since PurpleWave went wrong before learning anything else about its opponent. I imagine that the famously thorough Purple tournament preparation hit a glitch in this one case. Maybe it is related to the fact that OpprimoBot plays random on SSCAIT, but played terran here. The object file Opponents.class does mention OpprimoBot by name, along with many other potential opponents.

I will analyze the results as usual, with the colorful crosstables and stuff, but may be slower than in the past.

Steamhammer’s defiler play

Earlier I claimed that if Steamhammer has defilers out, it is probably winning. It’s true but misleading. Steamhammer’s game plan is such that it doesn’t try to win in the middle game (though it may win accidentally); the midgame is about holding the enemy off, getting upgrades, and growing the economy. When its drone count reaches 75, Steamhammer switches to all army production and its military strength grows rapidly—and that is the same time that defilers come into their own. When it reaches the late game, whether there are defilers or not, Steamhammer will be hitting hard and very likely winning, because if it weren’t winning it would probably have lost already. The late game is Steamhammer’s strongest phase.

Nevertheless, Steamhammer’s defiler usage has grown skillful enough that it definitely contributes to the bot’s strength. It’s an important milestone, because good defiler usage is critical to strong zerg play. Defilers are complicated, and zerg cannot reach its potential without mastering them. Steamhammer is far from mastering defilers, but I think it has become more adept with them than any other bot.

The original description of the defiler implementation is still accurate. Bugs have been fixed and refinements made, but the structure is unchanged.

that laser precision

I picked this game TyrProtoss-Randomhammer because it shows off Steamhammer’s precision and fluidity in defiler spellcasting. The precision was always there; the fluidity was reached by a long road with a year’s worth of bug fixes and other improvements. In the first picture, Steamhammer is about to start its decisive final attack. The attack would have won with or without defiler support. Accurate defiler spells made it effortless.

Steamhammer chooses plague because it looks like the fight will feature zealots versus hydras. The plague is actually a little weak. The defiler (selected and a little hard to see) is hemmed in by friendly units and cannot move forward, and Steamhammer hasn’t seen all the approaching enemies yet, so it plagues only 3 zealots. With a short delay for the opposing armies to close, it could have plagued more enemies, but it is not able to figure that out. Still, the plague helped in the battle.

Once the front zealots have died, Steamhammer has a fuller view of the situation, and realizes that the enemy has mostly dragoons left. The defiler (which has the energy upgrade) casts swarm, consume, consume, then a second swarm. The swarms are accurately placed to nullify the dragoons and drive the enemy back, bringing the fight into the natural. In the picture the defiler has just cast the second swarm, and it instantly consumes again.

The defiler is again stuck behind friendly units for a time, but protoss is forced back to the ramp. As zerg units spread out to raze the enemy natural, the defiler becomes free to move and drops a perfect plague on the ramp. Every protoss unit on the ramp is hit, and the zerg melee units that are in contact are untouched. No better plague is possible.

Steamhammer is awkward in maneuvering its defiler, and it is unable to foresee better opportunities that will arise in the near future. But when a spell goes down, it is often close to the best that is possible at that moment. It’s plenty good for now.

potential to turn the game

One defiler spell can make the difference between losing and winning. In practice, Steamhammer needs more skills before that will happen often in its games, but the examples here offer a foretaste. I picked the game Steamhammer-McRave to show the brute power of plague.

The first defiler of the game wandered carelessly into the front lines, stood on top of an isolated lurker, and got stormed to death. The defiler didn’t know the best place to go, which is one weakness, and Steamhammer only makes 1 defiler at a time, which is another. An alert opponent can pick off the defiler and earn a respite. When the second defiler of the game joined the fray, this happened:

It’s a fun game, worth watching through. At the time of the picture, Steamhammer had been slowly twisting the situation into its favor from far behind, but McRave was still winning. The game could have gone either way. After this massive plague, turning a whole phalanx of zealots into straw, McRave’s army was broken and it quickly collapsed.

Here’s a picture from a different Steamhammer-McRave game. The red stuff all over the protoss natural is from 3 active plagues cast one after another by one defiler. The defiler just died; it sacrificed itself amidst the zealots (at the white spot of a dragoon hit) to get the third plague off, the one currently spreading over dragoons in the rear. The sacrifice was worth it. The protoss army was hollowed out, giving zerg time to consolidate an economic advantage and win. Without those plagues, McRave would have been able to move out and take the game with its bigger army (though I’m not sure it would have chosen to).

Here is a third example of a turnaround plague, from Steamhammer-MadMixT. After playing well much of the game, Steamhammer went astray and collapsed under a terran attack. It was about to lose its main and the game when a defiler risked its life to throw a plague over the core of the terran force, reducing the units to eggshells. Plague hurts terran more than protoss, which has shields. It then took only a few zerg units to clear the attack, partly due to terran’s disorganized movement, and with a return to superior play, Steamhammer slowly clawed its way to a win.

keeping active

Steamhammer is weak at using defilers in defense. I think it’s the widest remaining gap in defiler play. But it’s pretty good at using defilers in attack. I chose the game Steamhammer-MadMixP to show how, in the best case, Steamhammer can keep its defiler active through a long attacking sequence. The game itself is not very interesting, but watch the finishing attack which starts at around 17:30 from across the bridge to the terran natural.

It starts with a plague. If I didn’t miss anything, the sequence that follows is consume, consume, swarm, consume, consume, consume, plague, consume, consume, swarm, consume, consume, swarm, consume, consume, consume, swarm, consume, plague, consume, consume, consume, plague, consume, consume, consume, plague, consume, consume—and the last enemy building is destroyed. The defiler was stuck behind friendly units for short stretches, and otherwise busy every moment supporting the raging attack. Even though not all the activity was useful, the ability to keep constantly active is valuable.

blunders

Steamhammer does make mistakes in casting defiler spells, though rarely serious ones. Plague doesn’t account for moving units. It can miss fast-moving enemies altogether, for example trying to plague a group of corsairs and smearing only bare ground. In an active fight it can unintentionally plague its own units, which do not realize that they are moving into a danger zone. (It’s also perfectly willing to intentionally plague its own units, if that lets it splash more enemies. The calculation simply adds damage done to enemies and subtracts damage done to self.)
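The add-and-subtract calculation can be sketched as follows. This is an illustration in Python, not Steamhammer's code. The damage model is an assumption: plague is taken to do at most 295 damage to a unit's hit points, to ignore shields, and never to reduce a unit below 1 HP. All names here are hypothetical.

```python
from dataclasses import dataclass

PLAGUE_MAX_DAMAGE = 295  # assumed cap on total plague damage per unit


@dataclass
class Unit:
    """Hypothetical snapshot of a unit inside a candidate plague area."""
    name: str
    hit_points: int  # current HP, not counting protoss shields
    is_enemy: bool


def plague_damage(unit):
    """Assumed damage to one unit: capped, and it cannot take the unit below 1 HP."""
    return min(PLAGUE_MAX_DAMAGE, max(0, unit.hit_points - 1))


def plague_score(units_in_area):
    """Damage done to enemies minus damage done to our own units."""
    score = 0
    for unit in units_in_area:
        damage = plague_damage(unit)
        score += damage if unit.is_enemy else -damage
    return score


def best_plague_area(candidate_areas):
    """Pick the highest-scoring candidate area, or None if no area scores above zero."""
    best = max(candidate_areas, key=plague_score, default=None)
    if best is None or plague_score(best) <= 0:
        return None
    return best


# Three full-HP zealots with one friendly zergling caught in the splash.
mixed = [Unit("Zealot", 100, True)] * 3 + [Unit("Zergling", 35, False)]
own_only = [Unit("Zergling", 35, False)]
print(plague_score(mixed))
```

A scoring rule like this explains why the defiler will happily splash its own units when more enemy damage outweighs the loss. It says nothing about where units will be when the spell lands, which matches the blunders with moving targets.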

Dark swarm is thorny. Steamhammer only partially understands it. In Randomhammer-tscmoop2 the defiler cast a swarm that helped the enemy—with hydras on hand versus zealots and a reaver as well as the cannons and dragoons, swarm was a poor idea in the first place, and this swarm placement defends the enemy and does not open a path to attack. Oops. The research tab shows that plague was almost but not quite finished.

improvements needed

Why is Steamhammer weak with defensive dark swarm? To lay down swarm over lurkers and hold a position, or to force marines back with zerglings under swarm, you have to coordinate the swarm with the combat units. The defiler knows that swarm over lurkers is a great idea, and the combat simulator knows it too, in an approximate way. The lurkers themselves have no inkling; when their targets retreat, the lurkers will pick themselves up and obliviously step out of the swarm and die. Even something as simple as rendering hydralisks invulnerable to air attack is beyond Steamhammer. The hydras pay no attention.

Steamhammer’s squad code is already fragile from having too many skills tacked on, and needs to be rethought and rewritten. I’m reluctant to add complicated coordination skills before tackling that, so this weakness will likely be around for a while. It is a severe weakness, though, because defense is a critical use of defilers. Darn it, but there are times when only invulnerability can save you.

Besides the lack of coordination, there is no planning ahead. Defilers live in the moment and know nothing beyond. With prediction and planning, greater things could be accomplished with fewer resources.

More basically, the tendency of the defiler to get stuck behind friendly units reminds me that Steamhammer needs another fundamental skill: Smooth unit movement. Human players know that the defiler’s position is important, and move other units so that the defiler gets where it needs to be. And more generally, the missing skill shows in things like awkward movement through choke points, and clumsy collisions when squads cross paths.

I have a few simple ideas for how to control multiple defilers at once without blowing out cpu usage or seeing them simultaneously plague the same enemies. I expect I’ll implement that before SSCAIT this year. With 2 or 3 defilers in the Ground squad, defenders won’t have time to draw breath between spells.

Someday I’ll implement defiler drops. Zerglings with adrenal glands upgrade might as well be wrecking balls. Drop a defiler and some zerglings off to one side, swarm, swarm, and pick up the defiler for another go later while the lings rip down buildings. It’s cheap to do and expensive to defend against.

Steamhammer 2.3.5 source released

I finally uploaded the source for Steamhammer 2.3.5; see Steamhammer’s web page. Sorry about the delay. As I’ve mentioned, this is the same code as the AIIDE 2019 competition version, but configured to play all races rather than zerg only.

One of my steps in uploading a new version is to calculate the performance numbers of the previous version. By SSCAIT win rate, Steamhammer 2.3 (from last April) with 69% wins is the most successful version since Steamhammer 1.2 with 70% wins, way back in March 2017. And I’ve seen that the following test versions are stronger yet. This version 2.3.5 (and the tournament version 2.3.4 since it’s identical) has a new combat sim bug which makes it slightly weaker than the previous test version, 2.3.3, but I think it is still stronger than version 2.3. Steamhammer is doing well at the moment.

Before long I plan to release a new point version which fixes the combat sim bug and (just for fun) adds new queen skills. After that I’ll dig into strategy adaptation, which I expect will cause strength to plummet, because that is what major new features do at first. When strength recovers, though, it will recover to a higher level.

Next: A long post about defiler play, with examples and explanations. I also found a few research papers that I want to write up.

Steamhammer plans update

After every new Steamhammer version it’s time to come out with new plans for the following versions—plans that are always updated and sometimes changed beyond recognition. No plan survives contact with further consideration.

Despite my intention to stick to low-risk changes in the latest Steamhammer, I seem to have introduced a severe bug into combat simulation. I’m seeing cases of pointless fear, like ultralisks fearful to attack an assimilator, and Games Are Being Lost, which must not be. I need to make a point release to fix this bug, and possibly a few minor bugs besides.

But I don’t have the energy to fix bugs right now. I want to do something fun instead. I’m adding broodling for queens, and thinking about how to adjust the strategy boss so it can make more than 1 queen at a time without ever going overboard. Hmm. The rest of the code already supports any number of queens, at least in principle, but not the strategy boss. This is, by the way, another change of plans; I was originally intending to implement ensnare before broodling.

Related stuff that’s high on my list: Simple sample configuration file, for people who prefer to ignore the complex default config file with features piled over their heads. Up-to-date documentation. I promised a post on defilers. Oh, and I forgot to release source, I’d better do that next. I’d forget the period at the end of the sentence if it weren’t there on the keyboard to remind me

In the longer term, I need to go to BWAPI 4.4.0 and make progress on strategy adaptation. My thinking at the moment is that I’ll make the most progress by working with timings: The opponent model should record the timings and unit mix of opponent attacks, and compare them against the measured timings of Steamhammer’s openings to figure out how to counter an opponent. That’s only a small part of the full strategy adaptation suite, but it’s critical and it seems like a good piece to do early.

If you look closely at the source—you know, the source I forgot to release—you’ll find an unfinished and unused class BOSimulator which is a start on another essential part of strategy adaptation. I need to rename it, the name is confusingly similar to Dave Churchill’s BOSS.

Steamhammer’s strategy boss debug display

For this upload of Steamhammer to SSCAIT, I turned on display of the strategy boss information. It’s kind of cryptic, so I thought I’d explain.

[Screenshot: strategy boss debug info drawn on the screen]

The top section, with all yellow labels from “bases” to “build”, describes the game situation and some automatic reactions that can happen at any time, including during the opening. The top section displays all game long. Then comes a slight break, and the bottom section, with multi-color labels from “eco” to the end, describes the strategy boss’s plans and conclusions after the opening is over. The bottom section only appears after exit from the opening book.

The strategy boss also cares about the game’s mineral count, gas count, and supply count. That’s why I draw the strategy boss info directly underneath Starcraft’s info.

bases 3/12 Steamhammer owns 3 of the 12 bases on the map.

patches 20 These 3 bases have 20 mineral patches total. This determines the upper limit of how fast it is possible to mine minerals.

geysers 2+0 Steamhammer has taken 2 gas geysers, and no more are available. If a geyser is in the process of being taken—the extractor is morphing—then it doesn’t count on either side. With 3 bases and 2 geysers, it could be that Steamhammer has 2 gas bases and a mineral only, or 3 gas bases and the final geyser is in the process of being taken.

drones 27/39 Steamhammer has 27 drones, including drones that are in the egg and haven’t hatched yet. It will spawn up to 39 drones; it believes that, taking into account the mineral patches and geysers, 39 drones are as many as it is reasonable to make.

mins 23 23 of the drones are mining minerals.

gas 0 None of the drones are mining gas. That leaves 4 drones on neither minerals nor gas; they may still be in the egg, or may be carrying out other business, like building or scouting. But in any case, gas collection is turned off at the moment so that Steamhammer can collect minerals faster.
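The drone arithmetic in the last three items can be written out as a tiny sketch. This is not Steamhammer code; the function name is made up for illustration, and the numbers are the ones from the display described above.

```python
def drones_not_mining(total_drones: int, on_minerals: int, on_gas: int) -> int:
    """Drones counted in the total that are on neither minerals nor gas.

    Hypothetical helper for illustration only. The remainder covers drones
    still in the egg and drones busy with building, scouting, and so on.
    """
    return total_drones - on_minerals - on_gas


# "drones 27/39", "mins 23", "gas 0" from the debug display
print(drones_not_mining(27, 23, 0))  # 4
```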

react +16 Based on what we’ve seen the opponent do, aim for 16 drones more than what we would normally build in this situation; they are “reactive drones.” A number as large as +16 usually means that the opponent is playing very defensively with many cannons, so that we can safely make extra drones to get ahead in economy.

larvas 0 There are no larvas available to spawn more units. Steamhammer takes a shortage of larvas as evidence that it may need more hatcheries (other factors count too). That might be why gas is turned off: To collect minerals faster to make a macro hatchery ASAP. (Though it might have concluded only that 911 gas is more than it needs for now.)

build +0g +0h This describes a complex strategy reaction that can occur in the opening book, and cannot occur later in the game. It connects with the “react” item above, the count of reactive drones. If there are too many reactive drones in the opening, they will bring in excess minerals, so Steamhammer takes actions to use resources more efficiently and prevent the excess minerals from building up unused. The actions are that it can take extra geysers—“g”—or make extra hatcheries—“h”. The extra geysers bring in extra gas so that income is balanced, and the extra hatcheries bring extra larvas for faster production. The result is that the unplanned resources can be spent efficiently and Steamhammer runs through its opening line faster, entering the middle game in a stronger position. In this game, it did not add anything extra in the opening; most of the +16 reactive drones must have been decided after the book line was over.

eco 0.35 5/43 describes plans for economic growth. 0.35 is the proportion of larvas planned to be made into drones on average over the long run. Actual moment-to-moment drone production can vary from all drones to all combat units, depending on circumstances, but Steamhammer aims for a given ratio. 5/43 is the actual ratio of drone production: It made 5 drones out of the last 43 larvas morphed, a ratio of about 0.12, so Steamhammer is behind on planned drone production. Reactive drones are implemented behind the scenes by manipulating the ratio.
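To make the “eco” comparison concrete, here is a minimal sketch of checking actual drone production against the planned ratio. The function and parameter names are assumptions for illustration, not Steamhammer’s real code, and the real strategy boss surely weighs more than this one comparison.

```python
def drone_production_status(target_ratio: float, drones_made: int, larvas_morphed: int):
    """Return (actual_ratio, behind_plan) for recent larva use.

    Hypothetical helper: compares the share of recent larvas that became
    drones with the planned long-run share.
    """
    actual = drones_made / larvas_morphed if larvas_morphed else 0.0
    return actual, actual < target_ratio


# "eco 0.35 5/43" from the debug display
actual, behind = drone_production_status(0.35, 5, 43)
print(f"{actual:.3f}", behind)  # 0.116 True
```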

army 24 40 bad gives army sizes: Steamhammer’s army size is 24 (in an arbitrary measure), including static defense, and the enemy’s army size is 40, counting only mobile units. This is “bad” in red because it means that Steamhammer may be overrun. When the army size is “bad” Steamhammer makes combat units to defend itself; that explains why it is behind on drone production.

The 7 orange unit types (lings, hydras, lurkers, mutas, guardians, devourers, ultras) are the candidate units for Steamhammer’s main unit mix. Overlords, queens, and defilers are support units that are made on the side and do not count toward the main unit mix. The number after each unit type is the strategy boss’s score for how important the unit type is in the current situation. The scores are used for two purposes: First, to decide on the combat unit mix. Second, to decide on the tech target, the next unit type that the bot should aim for.

The starred unit types are those that Steamhammer has the tech for. They are candidates for the unit mix. The starred unit with the highest score will be part of the unit mix. The rest of the unit types are candidates for the tech target.

Zergling Hydralisk 1 Lurker is the combat unit mix. Hydras had the highest score, so they are part of the unit mix. The full rules for choosing the unit mix are complex. Lurkers had the next highest score, but the strategy boss decided that it is enough to have only 1 lurker for now (which really means at least 1 lurker; if there happen to already be more, that is OK). Minerals not spent on hydras or the 1 lurker (or on drones or whatever else) will be spent on zerglings.

plan Mutas gives the tech target: A tech switch to mutalisks is planned. Mutas are a candidate for the tech target because they have a higher score than any of the starred units that Steamhammer already has the tech for. Guardians and ultras have higher scores yet, but Steamhammer doesn’t have a hive so mutalisks can be gotten much faster. If mutas did not score higher than hydras, the strategy boss would have picked guardians as the tech target.
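Here is a rough sketch of the two decisions described above, picking the top starred unit for the unit mix and picking a tech target. It is a simplification under stated assumptions: the scores are invented for illustration (only their ordering follows the text), the names are hypothetical, and the rule “prefer targets that don’t need a hive when there is no hive” is my shorthand for “can be gotten much faster.” The real rules are more complex.

```python
# Invented scores; only the ordering matches the description above.
SCORES = {"lings": 10, "hydras": 40, "lurkers": 30, "mutas": 45,
          "guardians": 60, "devourers": 5, "ultras": 55}
HAS_TECH = {"lings", "hydras", "lurkers"}          # the starred unit types
NEEDS_HIVE = {"guardians", "devourers", "ultras"}  # hive-tech units


def best_available(scores, has_tech):
    """The starred unit type with the highest score."""
    return max(has_tech, key=lambda unit: scores[unit])


def tech_target(scores, has_tech, has_hive):
    """Pick an unstarred unit type that outscores every starred one.

    Assumed simplification: without a hive, prefer candidates that do not
    need one; otherwise take the highest-scoring candidate.
    """
    bar = scores[best_available(scores, has_tech)]
    candidates = [u for u in scores if u not in has_tech and scores[u] > bar]
    if not candidates:
        return None
    if not has_hive:
        quick = [u for u in candidates if u not in NEEDS_HIVE]
        if quick:
            candidates = quick
    return max(candidates, key=lambda unit: scores[unit])


print(best_available(SCORES, HAS_TECH))              # hydras
print(tech_target(SCORES, HAS_TECH, has_hive=False))  # mutas
```

With these made-up numbers, lowering the muta score below the hydra score makes the sketch fall back to guardians, matching the last sentence of the description.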

half of AIIDE dropped out

This is too much. When I wrote about the 26 AIIDE 2019 registrants, I expected that not all of them would end up competing. It would have been surprising if a bot that looked as unfinished as Ophelia were ready in time. But this is too much.

See the list of participants. One extra holdover from last year was added, #11 LastOrder. Of the now 27 entrants, 13 dropped out, including the added LastOrder, so that 14 competitors remain. 4 are listed as withdrawn, 9 as not submitted.

The withdrawals are MetaBot and the holdovers CSE, LastOrder, and SAIDA. MetaBot is supposedly unchanged, so in practice it is a holdover too. I suspect that LastOrder was added and marked withdrawn at the same time; the order of events suggests that it had already withdrawn before my write-up of the registrants. I especially miss LastOrder: last year, Steamhammer scored 25% against it despite finishing higher in the rankings, and I would have enjoyed revenge.

Of the 9 unsubmitted bots, I will especially miss Dragon and Murph, which are both apparently protoss derivatives of CherryPi.

I’m sure circumstances are different in each case, but the large number of dropouts suggests a common underlying factor. What could it be?