
AIIDE 2019 - unknown maps per-player

Here are the per-player map tables for the AIIDE 2019 bonus tournament on “unknown” maps, meaning standard and well-known maps that weren’t announced beforehand. Since the competition is about maps, I want to look at the maps first.

With 10 participants and 100 rounds, there are 10 * 9 * 100 / 2 = 4500 games total if all are played, meaning 900 games per player and 900 / 5 = 180 per player per map. Each player has 9 opponents, so there are 180 / 9 = 20 games per cell in these tables. The official results say that 4491 of the 4500 were successfully completed, so some cells have slightly fewer. Tail-ender UAlbertaBot participated in every game that did not complete (CoG has the same pattern).
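The game counts are plain round-robin arithmetic. Here is a minimal sketch of it; the variable names are my own and purely illustrative.

```python
# Round-robin arithmetic for the unknown-maps tournament.
players = 10
rounds = 100
maps = 5

pairings = players * (players - 1) // 2               # distinct pairings of bots
total_games = pairings * rounds                       # every pairing plays each round
games_per_player = (players - 1) * rounds             # each bot meets 9 opponents
per_player_per_map = games_per_player // maps         # rounds are spread evenly over maps
games_per_cell = per_player_per_map // (players - 1)  # one opponent on one map

print(total_games, games_per_player, per_player_per_map, games_per_cell)
```

The four printed numbers should match the 4500, 900, 180, and 20 worked out in the text.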

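| Locutus | overall | Polari | Longin | Arcadi | Fighti | Roadki |
|---|---|---|---|---|---|---|
| PurpleWave | 48% | 75% | 40% | 50% | 35% | 40% |
| DaQin | 73% | 75% | 75% | 85% | 75% | 55% |
| BananaBrain | 66% | 55% | 75% | 70% | 75% | 55% |
| Microwave | 98% | 100% | 90% | 100% | 100% | 100% |
| Steamhammer | 99% | 100% | 100% | 95% | 100% | 100% |
| XiaoYi | 97% | 100% | 90% | 100% | 95% | 100% |
| McRave | 96% | 100% | 95% | 100% | 100% | 85% |
| Iron | 100% | 100% | 100% | 100% | 100% | 100% |
| UAlbertaBot | 98% | 100% | 95% | 95% | 100% | 100% |
| overall | 86.11% | 89% | 84% | 88% | 87% | 82% |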


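| PurpleWave | overall | Polari | Longin | Arcadi | Fighti | Roadki |
|---|---|---|---|---|---|---|
| Locutus | 52% | 25% | 60% | 50% | 65% | 60% |
| DaQin | 76% | 60% | 90% | 70% | 95% | 65% |
| BananaBrain | 68% | 80% | 60% | 70% | 60% | 70% |
| Microwave | 96% | 95% | 100% | 95% | 95% | 95% |
| Steamhammer | 86% | 70% | 95% | 80% | 90% | 95% |
| XiaoYi | 90% | 90% | 80% | 90% | 90% | 100% |
| McRave | 93% | 95% | 95% | 90% | 95% | 90% |
| Iron | 98% | 100% | 100% | 100% | 90% | 100% |
| UAlbertaBot | 100% | 100% | 100% | 100% | 100% | 100% |
| overall | 84.28% | 79% | 87% | 83% | 87% | 86% |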


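| DaQin | overall | Polari | Longin | Arcadi | Fighti | Roadki |
|---|---|---|---|---|---|---|
| Locutus | 27% | 25% | 25% | 15% | 25% | 45% |
| PurpleWave | 24% | 40% | 10% | 30% | 5% | 35% |
| BananaBrain | 41% | 60% | 30% | 35% | 30% | 50% |
| Microwave | 89% | 95% | 95% | 90% | 80% | 85% |
| Steamhammer | 97% | 95% | 95% | 100% | 100% | 95% |
| XiaoYi | 74% | 100% | 65% | 70% | 60% | 75% |
| McRave | 37% | 50% | 45% | 30% | 30% | 30% |
| Iron | 98% | 100% | 95% | 100% | 95% | 100% |
| UAlbertaBot | 77% | 75% | 70% | 89% | 68% | 80% |
| overall | 62.58% | 71% | 59% | 62% | 55% | 66% |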


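| BananaBrain | overall | Polari | Longin | Arcadi | Fighti | Roadki |
|---|---|---|---|---|---|---|
| Locutus | 34% | 45% | 25% | 30% | 25% | 45% |
| PurpleWave | 32% | 20% | 40% | 30% | 40% | 30% |
| DaQin | 59% | 40% | 70% | 65% | 70% | 50% |
| Microwave | 74% | 75% | 70% | 75% | 75% | 75% |
| Steamhammer | 81% | 70% | 80% | 80% | 85% | 90% |
| XiaoYi | 62% | 50% | 70% | 50% | 70% | 70% |
| McRave | 69% | 80% | 70% | 55% | 65% | 75% |
| Iron | 56% | 30% | 55% | 65% | 45% | 85% |
| UAlbertaBot | 87% | 100% | 85% | 95% | 80% | 75% |
| overall | 61.56% | 57% | 63% | 61% | 62% | 66% |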


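| Microwave | overall | Polari | Longin | Arcadi | Fighti | Roadki |
|---|---|---|---|---|---|---|
| Locutus | 2% | 0% | 10% | 0% | 0% | 0% |
| PurpleWave | 4% | 5% | 0% | 5% | 5% | 5% |
| DaQin | 11% | 5% | 5% | 10% | 20% | 15% |
| BananaBrain | 26% | 25% | 30% | 25% | 25% | 25% |
| Steamhammer | 84% | 90% | 95% | 75% | 80% | 80% |
| XiaoYi | 82% | 85% | 90% | 70% | 75% | 90% |
| McRave | 50% | 45% | 35% | 45% | 55% | 70% |
| Iron | 33% | 30% | 15% | 40% | 15% | 65% |
| UAlbertaBot | 65% | 70% | 60% | 79% | 60% | 55% |
| overall | 39.60% | 39% | 38% | 39% | 37% | 45% |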


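| Steamhammer | overall | Polari | Longin | Arcadi | Fighti | Roadki |
|---|---|---|---|---|---|---|
| Locutus | 1% | 0% | 0% | 5% | 0% | 0% |
| PurpleWave | 14% | 30% | 5% | 20% | 10% | 5% |
| DaQin | 3% | 5% | 5% | 0% | 0% | 5% |
| BananaBrain | 19% | 30% | 20% | 20% | 15% | 10% |
| Microwave | 16% | 10% | 5% | 25% | 20% | 20% |
| XiaoYi | 50% | 25% | 80% | 70% | 45% | 30% |
| McRave | 89% | 85% | 90% | 95% | 90% | 85% |
| Iron | 59% | 85% | 40% | 60% | 55% | 55% |
| UAlbertaBot | 90% | 95% | 90% | 90% | 84% | 90% |
| overall | 37.75% | 40% | 37% | 43% | 35% | 33% |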


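| XiaoYi | overall | Polari | Longin | Arcadi | Fighti | Roadki |
|---|---|---|---|---|---|---|
| Locutus | 3% | 0% | 10% | 0% | 5% | 0% |
| PurpleWave | 10% | 10% | 20% | 10% | 10% | 0% |
| DaQin | 26% | 0% | 35% | 30% | 40% | 25% |
| BananaBrain | 38% | 50% | 30% | 50% | 30% | 30% |
| Microwave | 18% | 15% | 10% | 30% | 25% | 10% |
| Steamhammer | 50% | 75% | 20% | 30% | 55% | 70% |
| McRave | 29% | 35% | 40% | 25% | 20% | 25% |
| Iron | 80% | 80% | 85% | 70% | 90% | 75% |
| UAlbertaBot | 79% | 60% | 85% | 85% | 100% | 65% |
| overall | 37.00% | 36% | 37% | 37% | 42% | 33% |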


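| McRave | overall | Polari | Longin | Arcadi | Fighti | Roadki |
|---|---|---|---|---|---|---|
| Locutus | 4% | 0% | 5% | 0% | 0% | 15% |
| PurpleWave | 7% | 5% | 5% | 10% | 5% | 10% |
| DaQin | 63% | 50% | 55% | 70% | 70% | 70% |
| BananaBrain | 31% | 20% | 30% | 45% | 35% | 25% |
| Microwave | 50% | 55% | 65% | 55% | 45% | 30% |
| Steamhammer | 11% | 15% | 10% | 5% | 10% | 15% |
| XiaoYi | 71% | 65% | 60% | 75% | 80% | 75% |
| Iron | 41% | 25% | 30% | 45% | 40% | 65% |
| UAlbertaBot | 46% | 20% | 63% | 50% | 55% | 45% |
| overall | 36.04% | 28% | 36% | 39% | 38% | 39% |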


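| Iron | overall | Polari | Longin | Arcadi | Fighti | Roadki |
|---|---|---|---|---|---|---|
| Locutus | 0% | 0% | 0% | 0% | 0% | 0% |
| PurpleWave | 2% | 0% | 0% | 0% | 10% | 0% |
| DaQin | 2% | 0% | 5% | 0% | 5% | 0% |
| BananaBrain | 44% | 70% | 45% | 35% | 55% | 15% |
| Microwave | 67% | 70% | 85% | 60% | 85% | 35% |
| Steamhammer | 41% | 15% | 60% | 40% | 45% | 45% |
| XiaoYi | 20% | 20% | 15% | 30% | 10% | 25% |
| McRave | 59% | 75% | 70% | 55% | 60% | 35% |
| UAlbertaBot | 75% | 90% | 80% | 75% | 80% | 50% |
| overall | 34.44% | 38% | 40% | 33% | 39% | 23% |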


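| UAlbertaBot | overall | Polari | Longin | Arcadi | Fighti | Roadki |
|---|---|---|---|---|---|---|
| Locutus | 2% | 0% | 5% | 5% | 0% | 0% |
| PurpleWave | 0% | 0% | 0% | 0% | 0% | 0% |
| DaQin | 23% | 25% | 30% | 11% | 32% | 20% |
| BananaBrain | 13% | 0% | 15% | 5% | 20% | 25% |
| Microwave | 35% | 30% | 40% | 21% | 40% | 45% |
| Steamhammer | 10% | 5% | 10% | 10% | 16% | 10% |
| XiaoYi | 21% | 40% | 15% | 15% | 0% | 35% |
| McRave | 54% | 80% | 37% | 50% | 45% | 55% |
| Iron | 25% | 10% | 20% | 25% | 20% | 50% |
| overall | 20.43% | 21% | 19% | 16% | 19% | 27% |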

Against its strong opponents, Locutus had trouble on the map Roadkill, possibly because of the low-ground main. If Locutus stuck with its cannons-at-the-ramp strategy, the cannons were weak on low ground. Iron also struggled on Roadkill. Polaris Rhapsody, the only 2-player map, also showed some extreme results—see BananaBrain versus Iron and McRave versus UAlbertaBot.

There are plenty more details in the tables.

fun game Simplicity-Locutus

Yesterday’s game Simplicity vs Locutus on Andromeda on BASIL starts out as one of the most entertaining bot games I have seen. The pictures show some of the cool stuff that Simplicity tried—with success. Then, after a tremendous fight where each side pressed temporary advantages and maxed its army, the replay loses sync and OpenBW cannot show the last half of the game.

Queens with broodling. The Research tab shows broodling research, and there are broodlings on the ground. Simplicity made 8 queens early and even researched queen energy, and the queens paid for themselves with interest. When Locutus attacked, zerg sent out as many queens as had energy to simultaneously spawn broodlings, helping to break the attacks. It’s a simple way to coordinate the queens to get tactical results, and is more effective than the common bot approach of using the queens as attrition weapons. Simplicity’s queens eventually died to corsairs; with more careful play, they could have lived to the end of the game, because Locutus was not skilled with its corsairs.

Broodlings!

Island base with static defense. The overlord on the left has just dropped off another drone to join the miners.

Island!

Drops. Simplicity repeatedly dropped small numbers of units into the far end of Locutus’s base, and Locutus did not react properly. The drops were not decisive, but were cost-effective.

Drops!

Both sides maxed their supply, or nearly so. At that point, Locutus had better upgrades but Simplicity had a larger army. Locutus could not keep its natural safe. Luckily for protoss, it was already mined out.

Mass battles!

But Locutus had a stronger economy with more bases and a large bank of resources, and Simplicity ran out of resources. The desync hides the end of the game, which timed out after an hour. I believe that Locutus wiped out all zerg it could reach on the ground, and then had no answer for the island base. When the game timed out, BASIL gave the win to Simplicity on points. Moral: You need at least enough island skills to make air units to attack inaccessible bases.

new bot ZNZZBot

New protoss ZNZZBot was uploaded to SSCAIT today. It has no apparent connection with ZZZKBot. Quick analysis shows that it is descended from Locutus, but its binary is twice the size of the current Locutus because ZNZZBot is still linked against the BWTA terrain library. I conclude that ZNZZBot was forked from an earlier Locutus, or from an earlier descendant of Locutus.

The bot’s author is given on SSCAIT as “znzz” and in its config file as “zzy”. I’m not sure whether that indicates an attachment to zealots, or fitful sleep.

In its first 5 games, all it has played so far on SSCAIT and BASIL, ZNZZBot scored a win over Krasi0 with the Locutus dragoon runby strategy, but also a loss against the much weaker opponent ICEbot where protoss showed poor macro and poor decisions. I glanced over a couple new openings defined in the config file, and found them a bit odd, perhaps not expertly designed.

That’s all I know so far. I expect we’ll find out more with time.

AIIDE 2019 - unknown maps tournament

The AIIDE 2019 unknown maps competition results are up. At first glance, the biggest surprise is that the ranking is extremely similar to the ranking in the main tournament. 10 bots chose to compete. The weakest players did not participate, so the winning rates for all bots are lower than in the main tournament.

The results for some reason don’t include a straight listing of the 5 maps used. They are (2) Polaris Rhapsody, (3) Longinus 2, (4) Arcadia 2, (4) Fighting Spirit, and (4) Roadkill. Fighting Spirit is of course familiar to SSCAIT participants. The first 4 maps are classics from the KESPA era (which ended in 2012) and Roadkill is a more recent design.

I know from test games that Steamhammer plays well on Arcadia. (I test Steamhammer on all kinds of maps as a regular thing.) I’m pleased to see that reflected in the map statistics. Other map preferences that stand out are that DaQin likes Polaris Rhapsody and dislikes Fighting Spirit, Microwave prefers Roadkill, and Iron had trouble on Roadkill.

I will analyze the unknown maps tournament, at least to some extent. I’m not sure exactly how. There’s a bit of an embarrassment of riches at the moment.

AIIDE 2019 - maps per player

I’m pleased with this one. This is the same data as yesterday, how each bot did against each other on each map, but organized by player rather than by map. If you’re a bot author, I think this is a better way to find out about strengths and weaknesses.

For example, the first table is from the point of view of Locutus. The percentages are Locutus’s win rates. The upset by DaQin on Aztec immediately stands out amid Locutus’s otherwise consistent results. I imagine that Bruce @ Locutus will examine those 10 games and perhaps find a bug that DaQin exploited. (Locutus played its cannons at ramp into zealot drop strategy in these games. It lost because cannons at the ramp are a poor defense when the outside is on higher ground—Aztec has low-ground main bases. Maybe a weakness in learning or preparation?)

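| Locutus | overall | Benzen | Destin | Heartb | Aztec | TauCro | Androm | Circui | Empire | Fortre | Python |
|---|---|---|---|---|---|---|---|---|---|---|---|
| PurpleWave | 45% | 50% | 50% | 40% | 40% | 40% | 60% | 40% | 30% | 40% | 60% |
| BananaBrain | 89% | 100% | 100% | 80% | 100% | 100% | 80% | 70% | 100% | 70% | 90% |
| DaQin | 83% | 100% | 80% | 100% | 20% | 80% | 100% | 90% | 90% | 80% | 90% |
| Steamhammer | 97% | 90% | 100% | 100% | 90% | 100% | 90% | 100% | 100% | 100% | 100% |
| ZZZKBot | 99% | 100% | 100% | 100% | 100% | 100% | 100% | 100% | 90% | 100% | 100% |
| Microwave | 92% | 100% | 100% | 80% | 80% | 100% | 90% | 90% | 90% | 90% | 100% |
| Iron | 99% | 100% | 100% | 100% | 100% | 100% | 90% | 100% | 100% | 100% | 100% |
| XiaoYi | 96% | 100% | 100% | 100% | 90% | 70% | 100% | 100% | 100% | 100% | 100% |
| McRave | 99% | 100% | 100% | 100% | 100% | 100% | 100% | 100% | 90% | 100% | 100% |
| UAlbertaBot | 99% | 100% | 100% | 100% | 100% | 100% | 100% | 100% | 90% | 100% | 100% |
| AITP | 100% | 100% | 100% | 100% | 100% | 100% | 100% | 100% | 100% | 100% | 100% |
| BunkerBoxeR | 99% | 100% | 100% | 100% | 100% | 100% | 100% | 100% | 90% | 100% | 100% |
| overall | 91.42% | 95% | 94% | 92% | 85% | 91% | 92% | 91% | 89% | 90% | 95% |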


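| PurpleWave | overall | Benzen | Destin | Heartb | Aztec | TauCro | Androm | Circui | Empire | Fortre | Python |
|---|---|---|---|---|---|---|---|---|---|---|---|
| Locutus | 55% | 50% | 50% | 60% | 60% | 60% | 40% | 60% | 70% | 60% | 40% |
| BananaBrain | 44% | 80% | 70% | 80% | 50% | 60% | 10% | 40% | 20% | 0% | 30% |
| DaQin | 85% | 80% | 70% | 100% | 100% | 70% | 80% | 100% | 100% | 70% | 80% |
| Steamhammer | 71% | 90% | 100% | 70% | 40% | 60% | 80% | 60% | 70% | 90% | 50% |
| ZZZKBot | 100% | 100% | 100% | 100% | 100% | 100% | 100% | 100% | 100% | 100% | 100% |
| Microwave | 93% | 100% | 90% | 90% | 100% | 90% | 90% | 90% | 90% | 90% | 100% |
| Iron | 98% | 100% | 100% | 90% | 100% | 100% | 100% | 100% | 100% | 100% | 90% |
| XiaoYi | 97% | 90% | 100% | 100% | 100% | 100% | 90% | 100% | 90% | 100% | 100% |
| McRave | 89% | 100% | 90% | 100% | 100% | 90% | 80% | 100% | 70% | 70% | 90% |
| UAlbertaBot | 98% | 100% | 100% | 100% | 100% | 100% | 100% | 78% | 100% | 100% | 100% |
| AITP | 100% | 100% | 100% | 100% | 100% | 100% | 100% | 100% | 100% | 100% | 100% |
| BunkerBoxeR | 97% | 100% | 90% | 90% | 100% | 100% | 100% | 100% | 90% | 100% | 100% |
| overall | 85.54% | 91% | 88% | 90% | 88% | 86% | 81% | 86% | 83% | 82% | 82% |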


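| BananaBrain | overall | Benzen | Destin | Heartb | Aztec | TauCro | Androm | Circui | Empire | Fortre | Python |
|---|---|---|---|---|---|---|---|---|---|---|---|
| Locutus | 11% | 0% | 0% | 20% | 0% | 0% | 20% | 30% | 0% | 30% | 10% |
| PurpleWave | 56% | 20% | 30% | 20% | 50% | 40% | 90% | 60% | 80% | 100% | 70% |
| DaQin | 51% | 30% | 40% | 60% | 80% | 50% | 30% | 60% | 60% | 60% | 40% |
| Steamhammer | 85% | 100% | 70% | 90% | 80% | 80% | 70% | 100% | 100% | 70% | 90% |
| ZZZKBot | 83% | 90% | 100% | 70% | 80% | 70% | 90% | 100% | 70% | 90% | 70% |
| Microwave | 71% | 90% | 70% | 90% | 70% | 60% | 70% | 60% | 80% | 60% | 60% |
| Iron | 59% | 40% | 70% | 50% | 60% | 60% | 70% | 80% | 50% | 70% | 40% |
| XiaoYi | 57% | 40% | 80% | 40% | 50% | 50% | 40% | 80% | 40% | 80% | 70% |
| McRave | 69% | 70% | 60% | 90% | 70% | 50% | 80% | 70% | 70% | 70% | 60% |
| UAlbertaBot | 84% | 80% | 90% | 80% | 90% | 80% | 90% | 80% | 89% | 90% | 70% |
| AITP | 100% | 100% | 100% | 100% | 100% | 100% | 100% | 100% | 100% | 100% | 100% |
| BunkerBoxeR | 100% | 100% | 100% | 100% | 100% | 100% | 100% | 100% | 100% | 100% | 100% |
| overall | 68.81% | 63% | 68% | 68% | 69% | 62% | 71% | 77% | 70% | 77% | 65% |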


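| DaQin | overall | Benzen | Destin | Heartb | Aztec | TauCro | Androm | Circui | Empire | Fortre | Python |
|---|---|---|---|---|---|---|---|---|---|---|---|
| Locutus | 17% | 0% | 20% | 0% | 80% | 20% | 0% | 10% | 10% | 20% | 10% |
| PurpleWave | 15% | 20% | 30% | 0% | 0% | 30% | 20% | 0% | 0% | 30% | 20% |
| BananaBrain | 49% | 70% | 60% | 40% | 20% | 50% | 70% | 40% | 40% | 40% | 60% |
| Steamhammer | 94% | 100% | 100% | 90% | 100% | 100% | 100% | 100% | 70% | 90% | 90% |
| ZZZKBot | 10% | 10% | 20% | 10% | 10% | 10% | 20% | 10% | 0% | 10% | 0% |
| Microwave | 83% | 90% | 60% | 60% | 90% | 100% | 80% | 90% | 100% | 80% | 80% |
| Iron | 92% | 100% | 90% | 100% | 90% | 100% | 90% | 100% | 80% | 70% | 100% |
| XiaoYi | 82% | 100% | 100% | 90% | 100% | 90% | 60% | 70% | 60% | 60% | 90% |
| McRave | 41% | 40% | 60% | 40% | 40% | 20% | 50% | 30% | 40% | 30% | 60% |
| UAlbertaBot | 78% | 70% | 100% | 80% | 90% | 70% | 80% | 80% | 70% | 60% | 80% |
| AITP | 100% | 100% | 100% | 100% | 100% | 100% | 100% | 100% | 100% | 100% | 100% |
| BunkerBoxeR | 99% | 100% | 100% | 100% | 90% | 100% | 100% | 100% | 100% | 100% | 100% |
| overall | 63.33% | 67% | 70% | 59% | 68% | 66% | 64% | 61% | 56% | 57% | 66% |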


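| Steamhammer | overall | Benzen | Destin | Heartb | Aztec | TauCro | Androm | Circui | Empire | Fortre | Python |
|---|---|---|---|---|---|---|---|---|---|---|---|
| Locutus | 3% | 10% | 0% | 0% | 10% | 0% | 10% | 0% | 0% | 0% | 0% |
| PurpleWave | 29% | 10% | 0% | 30% | 60% | 40% | 20% | 40% | 30% | 10% | 50% |
| BananaBrain | 15% | 0% | 30% | 10% | 20% | 20% | 30% | 0% | 0% | 30% | 10% |
| DaQin | 6% | 0% | 0% | 10% | 0% | 0% | 0% | 0% | 30% | 10% | 10% |
| ZZZKBot | 59% | 40% | 60% | 80% | 80% | 40% | 50% | 40% | 60% | 50% | 90% |
| Microwave | 25% | 10% | 30% | 50% | 30% | 30% | 20% | 40% | 20% | 10% | 10% |
| Iron | 67% | 70% | 90% | 100% | 40% | 40% | 90% | 60% | 60% | 100% | 20% |
| XiaoYi | 50% | 30% | 60% | 50% | 20% | 10% | 80% | 50% | 80% | 60% | 60% |
| McRave | 86% | 100% | 90% | 100% | 100% | 70% | 100% | 90% | 90% | 40% | 80% |
| UAlbertaBot | 91% | 90% | 90% | 90% | 89% | 80% | 90% | 89% | 90% | 100% | 100% |
| AITP | 97% | 90% | 100% | 90% | 100% | 100% | 90% | 100% | 100% | 100% | 100% |
| BunkerBoxeR | 100% | 100% | 100% | 100% | 100% | 100% | 100% | 100% | 100% | 100% | 100% |
| overall | 52.25% | 46% | 54% | 59% | 54% | 44% | 57% | 50% | 55% | 51% | 52% |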


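| ZZZKBot | overall | Benzen | Destin | Heartb | Aztec | TauCro | Androm | Circui | Empire | Fortre | Python |
|---|---|---|---|---|---|---|---|---|---|---|---|
| Locutus | 1% | 0% | 0% | 0% | 0% | 0% | 0% | 0% | 10% | 0% | 0% |
| PurpleWave | 0% | 0% | 0% | 0% | 0% | 0% | 0% | 0% | 0% | 0% | 0% |
| BananaBrain | 17% | 10% | 0% | 30% | 20% | 30% | 10% | 0% | 30% | 10% | 30% |
| DaQin | 90% | 90% | 80% | 90% | 90% | 90% | 80% | 90% | 100% | 90% | 100% |
| Steamhammer | 41% | 60% | 40% | 20% | 20% | 60% | 50% | 60% | 40% | 50% | 10% |
| Microwave | 44% | 50% | 0% | 60% | 40% | 50% | 50% | 50% | 60% | 30% | 50% |
| Iron | 55% | 0% | 50% | 60% | 30% | 50% | 100% | 50% | 80% | 50% | 80% |
| XiaoYi | 49% | 60% | 90% | 70% | 10% | 20% | 60% | 40% | 20% | 70% | 50% |
| McRave | 67% | 70% | 70% | 50% | 80% | 40% | 30% | 80% | 80% | 90% | 80% |
| UAlbertaBot | 90% | 90% | 80% | 90% | 100% | 90% | 90% | 70% | 100% | 90% | 100% |
| AITP | 72% | 70% | 60% | 60% | 90% | 80% | 20% | 70% | 90% | 90% | 90% |
| BunkerBoxeR | 99% | 100% | 100% | 100% | 100% | 100% | 90% | 100% | 100% | 100% | 100% |
| overall | 52.08% | 50% | 48% | 52% | 48% | 51% | 48% | 51% | 59% | 56% | 57% |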


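| Microwave | overall | Benzen | Destin | Heartb | Aztec | TauCro | Androm | Circui | Empire | Fortre | Python |
|---|---|---|---|---|---|---|---|---|---|---|---|
| Locutus | 8% | 0% | 0% | 20% | 20% | 0% | 10% | 10% | 10% | 10% | 0% |
| PurpleWave | 7% | 0% | 10% | 10% | 0% | 10% | 10% | 10% | 10% | 10% | 0% |
| BananaBrain | 29% | 10% | 30% | 10% | 30% | 40% | 30% | 40% | 20% | 40% | 40% |
| DaQin | 17% | 10% | 40% | 40% | 10% | 0% | 20% | 10% | 0% | 20% | 20% |
| Steamhammer | 75% | 90% | 70% | 50% | 70% | 70% | 80% | 60% | 80% | 90% | 90% |
| ZZZKBot | 56% | 50% | 100% | 40% | 60% | 50% | 50% | 50% | 40% | 70% | 50% |
| Iron | 13% | 0% | 0% | 10% | 10% | 0% | 20% | 50% | 20% | 10% | 10% |
| XiaoYi | 65% | 60% | 80% | 70% | 60% | 80% | 40% | 80% | 60% | 30% | 90% |
| McRave | 64% | 60% | 80% | 100% | 70% | 50% | 60% | 70% | 60% | 30% | 60% |
| UAlbertaBot | 82% | 60% | 100% | 90% | 60% | 80% | 60% | 80% | 90% | 100% | 100% |
| AITP | 93% | 90% | 90% | 70% | 100% | 100% | 90% | 100% | 100% | 100% | 90% |
| BunkerBoxeR | 99% | 100% | 100% | 100% | 90% | 100% | 100% | 100% | 100% | 100% | 100% |
| overall | 50.67% | 44% | 58% | 51% | 48% | 48% | 48% | 55% | 49% | 51% | 54% |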


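| Iron | overall | Benzen | Destin | Heartb | Aztec | TauCro | Androm | Circui | Empire | Fortre | Python |
|---|---|---|---|---|---|---|---|---|---|---|---|
| Locutus | 1% | 0% | 0% | 0% | 0% | 0% | 10% | 0% | 0% | 0% | 0% |
| PurpleWave | 2% | 0% | 0% | 10% | 0% | 0% | 0% | 0% | 0% | 0% | 10% |
| BananaBrain | 41% | 60% | 30% | 50% | 40% | 40% | 30% | 20% | 50% | 30% | 60% |
| DaQin | 8% | 0% | 10% | 0% | 10% | 0% | 10% | 0% | 20% | 30% | 0% |
| Steamhammer | 33% | 30% | 10% | 0% | 60% | 60% | 10% | 40% | 40% | 0% | 80% |
| ZZZKBot | 45% | 100% | 50% | 40% | 70% | 50% | 0% | 50% | 20% | 50% | 20% |
| Microwave | 87% | 100% | 100% | 90% | 90% | 100% | 80% | 50% | 80% | 90% | 90% |
| XiaoYi | 26% | 10% | 50% | 0% | 50% | 20% | 40% | 20% | 30% | 40% | 0% |
| McRave | 65% | 70% | 80% | 80% | 80% | 80% | 70% | 50% | 40% | 40% | 60% |
| UAlbertaBot | 90% | 100% | 90% | 100% | 90% | 90% | 90% | 60% | 90% | 90% | 100% |
| AITP | 100% | 100% | 100% | 100% | 100% | 100% | 100% | 100% | 100% | 100% | 100% |
| BunkerBoxeR | 93% | 100% | 100% | 80% | 80% | 100% | 90% | 100% | 90% | 90% | 100% |
| overall | 49.25% | 56% | 52% | 46% | 56% | 53% | 44% | 41% | 47% | 47% | 52% |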


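| XiaoYi | overall | Benzen | Destin | Heartb | Aztec | TauCro | Androm | Circui | Empire | Fortre | Python |
|---|---|---|---|---|---|---|---|---|---|---|---|
| Locutus | 4% | 0% | 0% | 0% | 10% | 30% | 0% | 0% | 0% | 0% | 0% |
| PurpleWave | 3% | 10% | 0% | 0% | 0% | 0% | 10% | 0% | 10% | 0% | 0% |
| BananaBrain | 43% | 60% | 20% | 60% | 50% | 50% | 60% | 20% | 60% | 20% | 30% |
| DaQin | 18% | 0% | 0% | 10% | 0% | 10% | 40% | 30% | 40% | 40% | 10% |
| Steamhammer | 50% | 70% | 40% | 50% | 80% | 90% | 20% | 50% | 20% | 40% | 40% |
| ZZZKBot | 51% | 40% | 10% | 30% | 90% | 80% | 40% | 60% | 80% | 30% | 50% |
| Microwave | 35% | 40% | 20% | 30% | 40% | 20% | 60% | 20% | 40% | 70% | 10% |
| Iron | 74% | 90% | 50% | 100% | 50% | 80% | 60% | 80% | 70% | 60% | 100% |
| McRave | 36% | 40% | 10% | 50% | 30% | 40% | 30% | 40% | 30% | 60% | 30% |
| UAlbertaBot | 73% | 44% | 60% | 90% | 80% | 80% | 75% | 60% | 89% | 89% | 60% |
| AITP | 100% | 100% | 100% | 100% | 100% | 100% | 100% | 100% | 100% | 100% | 100% |
| BunkerBoxeR | 98% | 100% | 100% | 100% | 90% | 100% | 100% | 100% | 90% | 100% | 100% |
| overall | 48.62% | 50% | 34% | 52% | 52% | 57% | 49% | 47% | 52% | 50% | 44% |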


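| McRave | overall | Benzen | Destin | Heartb | Aztec | TauCro | Androm | Circui | Empire | Fortre | Python |
|---|---|---|---|---|---|---|---|---|---|---|---|
| Locutus | 1% | 0% | 0% | 0% | 0% | 0% | 0% | 0% | 10% | 0% | 0% |
| PurpleWave | 11% | 0% | 10% | 0% | 0% | 10% | 20% | 0% | 30% | 30% | 10% |
| BananaBrain | 31% | 30% | 40% | 10% | 30% | 50% | 20% | 30% | 30% | 30% | 40% |
| DaQin | 59% | 60% | 40% | 60% | 60% | 80% | 50% | 70% | 60% | 70% | 40% |
| Steamhammer | 14% | 0% | 10% | 0% | 0% | 30% | 0% | 10% | 10% | 60% | 20% |
| ZZZKBot | 33% | 30% | 30% | 50% | 20% | 60% | 70% | 20% | 20% | 10% | 20% |
| Microwave | 36% | 40% | 20% | 0% | 30% | 50% | 40% | 30% | 40% | 70% | 40% |
| Iron | 35% | 30% | 20% | 20% | 20% | 20% | 30% | 50% | 60% | 60% | 40% |
| XiaoYi | 64% | 60% | 90% | 50% | 70% | 60% | 70% | 60% | 70% | 40% | 70% |
| UAlbertaBot | 43% | 40% | 60% | 50% | 10% | 40% | 30% | 20% | 70% | 40% | 70% |
| AITP | 82% | 80% | 70% | 60% | 100% | 90% | 80% | 80% | 100% | 100% | 60% |
| BunkerBoxeR | 71% | 70% | 80% | 70% | 50% | 80% | 70% | 70% | 80% | 70% | 70% |
| overall | 40.00% | 37% | 39% | 31% | 32% | 48% | 40% | 37% | 48% | 48% | 40% |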


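| UAlbertaBot | overall | Benzen | Destin | Heartb | Aztec | TauCro | Androm | Circui | Empire | Fortre | Python |
|---|---|---|---|---|---|---|---|---|---|---|---|
| Locutus | 1% | 0% | 0% | 0% | 0% | 0% | 0% | 0% | 10% | 0% | 0% |
| PurpleWave | 2% | 0% | 0% | 0% | 0% | 0% | 0% | 22% | 0% | 0% | 0% |
| BananaBrain | 16% | 20% | 10% | 20% | 10% | 20% | 10% | 20% | 11% | 10% | 30% |
| DaQin | 22% | 30% | 0% | 20% | 10% | 30% | 20% | 20% | 30% | 40% | 20% |
| Steamhammer | 9% | 10% | 10% | 10% | 11% | 20% | 10% | 11% | 10% | 0% | 0% |
| ZZZKBot | 10% | 10% | 20% | 10% | 0% | 10% | 10% | 30% | 0% | 10% | 0% |
| Microwave | 18% | 40% | 0% | 10% | 40% | 20% | 40% | 20% | 10% | 0% | 0% |
| Iron | 10% | 0% | 10% | 0% | 10% | 10% | 10% | 40% | 10% | 10% | 0% |
| XiaoYi | 27% | 56% | 40% | 10% | 20% | 20% | 25% | 40% | 11% | 11% | 40% |
| McRave | 57% | 60% | 40% | 50% | 90% | 60% | 70% | 80% | 30% | 60% | 30% |
| AITP | 75% | 89% | 80% | 50% | 78% | 90% | 70% | 33% | 90% | 70% | 100% |
| BunkerBoxeR | 89% | 70% | 90% | 90% | 80% | 90% | 100% | 80% | 100% | 100% | 90% |
| overall | 28.04% | 31% | 25% | 22% | 29% | 31% | 31% | 33% | 27% | 26% | 25% |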


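| AITP | overall | Benzen | Destin | Heartb | Aztec | TauCro | Androm | Circui | Empire | Fortre | Python |
|---|---|---|---|---|---|---|---|---|---|---|---|
| Locutus | 0% | 0% | 0% | 0% | 0% | 0% | 0% | 0% | 0% | 0% | 0% |
| PurpleWave | 0% | 0% | 0% | 0% | 0% | 0% | 0% | 0% | 0% | 0% | 0% |
| BananaBrain | 0% | 0% | 0% | 0% | 0% | 0% | 0% | 0% | 0% | 0% | 0% |
| DaQin | 0% | 0% | 0% | 0% | 0% | 0% | 0% | 0% | 0% | 0% | 0% |
| Steamhammer | 3% | 10% | 0% | 10% | 0% | 0% | 10% | 0% | 0% | 0% | 0% |
| ZZZKBot | 28% | 30% | 40% | 40% | 10% | 20% | 80% | 30% | 10% | 10% | 10% |
| Microwave | 7% | 10% | 10% | 30% | 0% | 0% | 10% | 0% | 0% | 0% | 10% |
| Iron | 0% | 0% | 0% | 0% | 0% | 0% | 0% | 0% | 0% | 0% | 0% |
| XiaoYi | 0% | 0% | 0% | 0% | 0% | 0% | 0% | 0% | 0% | 0% | 0% |
| McRave | 18% | 20% | 30% | 40% | 0% | 10% | 20% | 20% | 0% | 0% | 40% |
| UAlbertaBot | 25% | 11% | 20% | 50% | 22% | 10% | 30% | 67% | 10% | 30% | 0% |
| BunkerBoxeR | 59% | 30% | 20% | 70% | 40% | 70% | 100% | 80% | 60% | 50% | 70% |
| overall | 11.62% | 9% | 10% | 20% | 6% | 9% | 21% | 16% | 7% | 8% | 11% |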


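| BunkerBoxeR | overall | Benzen | Destin | Heartb | Aztec | TauCro | Androm | Circui | Empire | Fortre | Python |
|---|---|---|---|---|---|---|---|---|---|---|---|
| Locutus | 1% | 0% | 0% | 0% | 0% | 0% | 0% | 0% | 10% | 0% | 0% |
| PurpleWave | 3% | 0% | 10% | 10% | 0% | 0% | 0% | 0% | 10% | 0% | 0% |
| BananaBrain | 0% | 0% | 0% | 0% | 0% | 0% | 0% | 0% | 0% | 0% | 0% |
| DaQin | 1% | 0% | 0% | 0% | 10% | 0% | 0% | 0% | 0% | 0% | 0% |
| Steamhammer | 0% | 0% | 0% | 0% | 0% | 0% | 0% | 0% | 0% | 0% | 0% |
| ZZZKBot | 1% | 0% | 0% | 0% | 0% | 0% | 10% | 0% | 0% | 0% | 0% |
| Microwave | 1% | 0% | 0% | 0% | 10% | 0% | 0% | 0% | 0% | 0% | 0% |
| Iron | 7% | 0% | 0% | 20% | 20% | 0% | 10% | 0% | 10% | 10% | 0% |
| XiaoYi | 2% | 0% | 0% | 0% | 10% | 0% | 0% | 0% | 10% | 0% | 0% |
| McRave | 29% | 30% | 20% | 30% | 50% | 20% | 30% | 30% | 20% | 30% | 30% |
| UAlbertaBot | 11% | 30% | 10% | 10% | 20% | 10% | 0% | 20% | 0% | 0% | 10% |
| AITP | 41% | 70% | 80% | 30% | 60% | 30% | 0% | 20% | 40% | 50% | 30% |
| overall | 8.08% | 11% | 10% | 8% | 15% | 5% | 4% | 6% | 8% | 8% | 6% |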

The zergs and terrans seem more sensitive to the map than protoss. For example, Locutus vs PurpleWave win rates vary from 30% to 60%, which could be entirely due to statistical fluctuation, while ZZZKBot vs Iron ranges from 0% to 100%, which is not random. I imagine that ZZZKBot’s map selectivity has to do with its learning algorithm. But overall, there are many cases where the map seems to make a difference against opponents of similar strength. I think bots will benefit from more sensitivity to the map design.
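One rough way to check the "statistical fluctuation" claim is a chi-square test of whether the win rate is the same on every map. This is a sketch under my own assumptions: exactly 10 completed games per map in both matchups, and the win counts read off the tables above. With so few games per map the chi-square approximation is crude, so treat the result as indicative only.

```python
# Pearson chi-square test of homogeneity: is the win rate the same on every map?
# Assumes 10 completed games per map (an assumption; a few cells have fewer).

def chi_square_across_maps(wins, games_per_map=10):
    """Chi-square statistic for per-map win counts against the pooled win rate."""
    total_games = games_per_map * len(wins)
    p = sum(wins) / total_games                # pooled win rate over all maps
    expected = games_per_map * p               # expected wins on each map
    variance = games_per_map * p * (1 - p)     # folds the win and loss terms together
    return sum((w - expected) ** 2 for w in wins) / variance

# Wins per map, in table order from Benzene to Python.
locutus_vs_purplewave = [5, 5, 4, 4, 4, 6, 4, 3, 4, 6]
zzzkbot_vs_iron = [0, 5, 6, 3, 5, 10, 5, 8, 5, 8]

# Standard 5% critical value of chi-square with 10 - 1 = 9 degrees of freedom.
CRITICAL_5_PERCENT_DF9 = 16.919

stat_lp = chi_square_across_maps(locutus_vs_purplewave)
stat_zi = chi_square_across_maps(zzzkbot_vs_iron)
print(round(stat_lp, 2), round(stat_zi, 2))
```

Locutus vs PurpleWave comes out around 3.4, far below the critical value, so its map-to-map variation is consistent with chance. ZZZKBot vs Iron comes out around 28.5, well above it.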

AIIDE 2019 - per-map crosstables

A separate crosstable for each of the 10 maps. Most cells have only 10 games in them (some have fewer because of unsuccessful games), so the numbers are noisy. Nevertheless, I think the tables are full of insights—so full that it’s easy to be overwhelmed. I left the game counts out of the cells to make the tables more compact, so they are easier to compare.
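To put a number on "noisy", here is a sketch of a 95% Wilson score interval for one cell. The function name is my own, and the example cell of 7 wins in 10 games is illustrative, not a specific result.

```python
import math

def wilson_interval(wins, games, z=1.96):
    """Wilson score confidence interval for a win rate (z=1.96 gives about 95%)."""
    p = wins / games
    denom = 1 + z * z / games
    center = (p + z * z / (2 * games)) / denom
    half = z * math.sqrt(p * (1 - p) / games + z * z / (4 * games * games)) / denom
    return center - half, center + half

low, high = wilson_interval(7, 10)
print(f"{low:.2f} to {high:.2f}")
```

A 70% cell from 10 games is compatible with a true win rate anywhere from roughly 40% to 89%, so a single cell supports only coarse conclusions.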

Watch how ZZZKBot versus Iron varies strongly depending on the map.

(2) Benzene

PurpleWave’s favorite map. Is that because of Purple pathfinding skills? But Locutus liked the map too.

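| # | bot | overall | Locu | Purp | Bana | DaQi | Stea | ZZZK | Micr | Iron | Xiao | McRa | UAlb | AITP | Bunk |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 1 | Locutus | 95.00% | - | 50% | 100% | 100% | 90% | 100% | 100% | 100% | 100% | 100% | 100% | 100% | 100% |
| 2 | PurpleWave | 90.83% | 50% | - | 80% | 80% | 90% | 100% | 100% | 100% | 90% | 100% | 100% | 100% | 100% |
| 3 | BananaBrain | 63.33% | 0% | 20% | - | 30% | 100% | 90% | 90% | 40% | 40% | 70% | 80% | 100% | 100% |
| 4 | DaQin | 66.67% | 0% | 20% | 70% | - | 100% | 10% | 90% | 100% | 100% | 40% | 70% | 100% | 100% |
| 5 | Steamhammer | 45.83% | 10% | 10% | 0% | 0% | - | 40% | 10% | 70% | 30% | 100% | 90% | 90% | 100% |
| 6 | ZZZKBot | 50.00% | 0% | 0% | 10% | 90% | 60% | - | 50% | 0% | 60% | 70% | 90% | 70% | 100% |
| 7 | Microwave | 44.17% | 0% | 0% | 10% | 10% | 90% | 50% | - | 0% | 60% | 60% | 60% | 90% | 100% |
| 8 | Iron | 55.83% | 0% | 0% | 60% | 0% | 30% | 100% | 100% | - | 10% | 70% | 100% | 100% | 100% |
| 9 | XiaoYi | 49.58% | 0% | 10% | 60% | 0% | 70% | 40% | 40% | 90% | - | 40% | 44% | 100% | 100% |
| 10 | McRave | 36.67% | 0% | 0% | 30% | 60% | 0% | 30% | 40% | 30% | 60% | - | 40% | 80% | 70% |
| 11 | UAlbertaBot | 31.36% | 0% | 0% | 20% | 30% | 10% | 10% | 40% | 0% | 56% | 60% | - | 89% | 70% |
| 12 | AITP | 9.24% | 0% | 0% | 0% | 0% | 10% | 30% | 10% | 0% | 0% | 20% | 11% | - | 30% |
| 13 | BunkerBoxeR | 10.83% | 0% | 0% | 0% | 0% | 0% | 0% | 0% | 0% | 0% | 30% | 30% | 70% | - |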

(2) Destination

Key area on the map: Over the wall behind the enemy natural. Build an e-bay there and send tanks. Float the e-bay and siege the tanks, maybe add turrets for air defense. How many defending bots would survive? XiaoYi suffered badly on this map, but that’s not why.

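| # | bot | overall | Locu | Purp | Bana | DaQi | Stea | ZZZK | Micr | Iron | Xiao | McRa | UAlb | AITP | Bunk |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 1 | Locutus | 94.17% | - | 50% | 100% | 80% | 100% | 100% | 100% | 100% | 100% | 100% | 100% | 100% | 100% |
| 2 | PurpleWave | 88.33% | 50% | - | 70% | 70% | 100% | 100% | 90% | 100% | 100% | 90% | 100% | 100% | 90% |
| 3 | BananaBrain | 67.50% | 0% | 30% | - | 40% | 70% | 100% | 70% | 70% | 80% | 60% | 90% | 100% | 100% |
| 4 | DaQin | 70.00% | 20% | 30% | 60% | - | 100% | 20% | 60% | 90% | 100% | 60% | 100% | 100% | 100% |
| 5 | Steamhammer | 54.17% | 0% | 0% | 30% | 0% | - | 60% | 30% | 90% | 60% | 90% | 90% | 100% | 100% |
| 6 | ZZZKBot | 47.50% | 0% | 0% | 0% | 80% | 40% | - | 0% | 50% | 90% | 70% | 80% | 60% | 100% |
| 7 | Microwave | 58.33% | 0% | 10% | 30% | 40% | 70% | 100% | - | 0% | 80% | 80% | 100% | 90% | 100% |
| 8 | Iron | 51.67% | 0% | 0% | 30% | 10% | 10% | 50% | 100% | - | 50% | 80% | 90% | 100% | 100% |
| 9 | XiaoYi | 34.17% | 0% | 0% | 20% | 0% | 40% | 10% | 20% | 50% | - | 10% | 60% | 100% | 100% |
| 10 | McRave | 39.17% | 0% | 10% | 40% | 40% | 10% | 30% | 20% | 20% | 90% | - | 60% | 70% | 80% |
| 11 | UAlbertaBot | 25.00% | 0% | 0% | 10% | 0% | 10% | 20% | 0% | 10% | 40% | 40% | - | 80% | 90% |
| 12 | AITP | 10.00% | 0% | 0% | 0% | 0% | 0% | 40% | 10% | 0% | 0% | 30% | 20% | - | 20% |
| 13 | BunkerBoxeR | 10.00% | 0% | 10% | 0% | 0% | 0% | 0% | 0% | 0% | 0% | 20% | 10% | 80% | - |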

(2) Heartbreak Ridge

It’s a tricky map, but bots don’t seem to realize that. Steamhammer liked it a bit and Iron disliked it, but no bot stood out as particularly loving or hating it.

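| # | bot | overall | Locu | Purp | Bana | DaQi | Stea | ZZZK | Micr | Iron | Xiao | McRa | UAlb | AITP | Bunk |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 1 | Locutus | 91.67% | - | 40% | 80% | 100% | 100% | 100% | 80% | 100% | 100% | 100% | 100% | 100% | 100% |
| 2 | PurpleWave | 90.00% | 60% | - | 80% | 100% | 70% | 100% | 90% | 90% | 100% | 100% | 100% | 100% | 90% |
| 3 | BananaBrain | 67.50% | 20% | 20% | - | 60% | 90% | 70% | 90% | 50% | 40% | 90% | 80% | 100% | 100% |
| 4 | DaQin | 59.17% | 0% | 0% | 40% | - | 90% | 10% | 60% | 100% | 90% | 40% | 80% | 100% | 100% |
| 5 | Steamhammer | 59.17% | 0% | 30% | 10% | 10% | - | 80% | 50% | 100% | 50% | 100% | 90% | 90% | 100% |
| 6 | ZZZKBot | 52.50% | 0% | 0% | 30% | 90% | 20% | - | 60% | 60% | 70% | 50% | 90% | 60% | 100% |
| 7 | Microwave | 50.83% | 20% | 10% | 10% | 40% | 50% | 40% | - | 10% | 70% | 100% | 90% | 70% | 100% |
| 8 | Iron | 45.83% | 0% | 10% | 50% | 0% | 0% | 40% | 90% | - | 0% | 80% | 100% | 100% | 80% |
| 9 | XiaoYi | 51.67% | 0% | 0% | 60% | 10% | 50% | 30% | 30% | 100% | - | 50% | 90% | 100% | 100% |
| 10 | McRave | 30.83% | 0% | 0% | 10% | 60% | 0% | 50% | 0% | 20% | 50% | - | 50% | 60% | 70% |
| 11 | UAlbertaBot | 22.50% | 0% | 0% | 20% | 20% | 10% | 10% | 10% | 0% | 10% | 50% | - | 50% | 90% |
| 12 | AITP | 20.00% | 0% | 0% | 0% | 0% | 10% | 40% | 30% | 0% | 0% | 40% | 50% | - | 70% |
| 13 | BunkerBoxeR | 8.33% | 0% | 10% | 0% | 0% | 0% | 0% | 0% | 20% | 0% | 30% | 10% | 30% | - |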

(3) Aztec

A 3-player map with low-ground main bases. I like this map. DaQin liked it too, since it upset Locutus, unlike on any other map.

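| # | bot | overall | Locu | Purp | Bana | DaQi | Stea | ZZZK | Micr | Iron | Xiao | McRa | UAlb | AITP | Bunk |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 1 | Locutus | 85.00% | - | 40% | 100% | 20% | 90% | 100% | 80% | 100% | 90% | 100% | 100% | 100% | 100% |
| 2 | PurpleWave | 87.50% | 60% | - | 50% | 100% | 40% | 100% | 100% | 100% | 100% | 100% | 100% | 100% | 100% |
| 3 | BananaBrain | 69.17% | 0% | 50% | - | 80% | 80% | 80% | 70% | 60% | 50% | 70% | 90% | 100% | 100% |
| 4 | DaQin | 67.50% | 80% | 0% | 20% | - | 100% | 10% | 90% | 90% | 100% | 40% | 90% | 100% | 90% |
| 5 | Steamhammer | 53.78% | 10% | 60% | 20% | 0% | - | 80% | 30% | 40% | 20% | 100% | 89% | 100% | 100% |
| 6 | ZZZKBot | 48.33% | 0% | 0% | 20% | 90% | 20% | - | 40% | 30% | 10% | 80% | 100% | 90% | 100% |
| 7 | Microwave | 48.33% | 20% | 0% | 30% | 10% | 70% | 60% | - | 10% | 60% | 70% | 60% | 100% | 90% |
| 8 | Iron | 55.83% | 0% | 0% | 40% | 10% | 60% | 70% | 90% | - | 50% | 80% | 90% | 100% | 80% |
| 9 | XiaoYi | 51.67% | 10% | 0% | 50% | 0% | 80% | 90% | 40% | 50% | - | 30% | 80% | 100% | 90% |
| 10 | McRave | 32.50% | 0% | 0% | 30% | 60% | 0% | 20% | 30% | 20% | 70% | - | 10% | 100% | 50% |
| 11 | UAlbertaBot | 28.81% | 0% | 0% | 10% | 10% | 11% | 0% | 40% | 10% | 20% | 90% | - | 78% | 80% |
| 12 | AITP | 5.88% | 0% | 0% | 0% | 0% | 0% | 10% | 0% | 0% | 0% | 0% | 22% | - | 40% |
| 13 | BunkerBoxeR | 15.00% | 0% | 0% | 0% | 10% | 0% | 0% | 10% | 20% | 10% | 50% | 20% | 60% | - |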

(3) Tau Cross

Bases beyond the natural are open to attack. I think that is why Steamhammer had trouble. It was XiaoYi’s best map.

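| # | bot | overall | Locu | Purp | Bana | DaQi | Stea | ZZZK | Micr | Iron | Xiao | McRa | UAlb | AITP | Bunk |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 1 | Locutus | 90.83% | - | 40% | 100% | 80% | 100% | 100% | 100% | 100% | 70% | 100% | 100% | 100% | 100% |
| 2 | PurpleWave | 85.83% | 60% | - | 60% | 70% | 60% | 100% | 90% | 100% | 100% | 90% | 100% | 100% | 100% |
| 3 | BananaBrain | 61.67% | 0% | 40% | - | 50% | 80% | 70% | 60% | 60% | 50% | 50% | 80% | 100% | 100% |
| 4 | DaQin | 65.83% | 20% | 30% | 50% | - | 100% | 10% | 100% | 100% | 90% | 20% | 70% | 100% | 100% |
| 5 | Steamhammer | 44.17% | 0% | 40% | 20% | 0% | - | 40% | 30% | 40% | 10% | 70% | 80% | 100% | 100% |
| 6 | ZZZKBot | 50.83% | 0% | 0% | 30% | 90% | 60% | - | 50% | 50% | 20% | 40% | 90% | 80% | 100% |
| 7 | Microwave | 48.33% | 0% | 10% | 40% | 0% | 70% | 50% | - | 0% | 80% | 50% | 80% | 100% | 100% |
| 8 | Iron | 53.33% | 0% | 0% | 40% | 0% | 60% | 50% | 100% | - | 20% | 80% | 90% | 100% | 100% |
| 9 | XiaoYi | 56.67% | 30% | 0% | 50% | 10% | 90% | 80% | 20% | 80% | - | 40% | 80% | 100% | 100% |
| 10 | McRave | 47.50% | 0% | 10% | 50% | 80% | 30% | 60% | 50% | 20% | 60% | - | 40% | 90% | 80% |
| 11 | UAlbertaBot | 30.83% | 0% | 0% | 20% | 30% | 20% | 10% | 20% | 10% | 20% | 60% | - | 90% | 90% |
| 12 | AITP | 9.17% | 0% | 0% | 0% | 0% | 0% | 20% | 0% | 0% | 0% | 10% | 10% | - | 70% |
| 13 | BunkerBoxeR | 5.00% | 0% | 0% | 0% | 0% | 0% | 0% | 0% | 0% | 0% | 20% | 10% | 30% | - |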

(4) Andromeda

A bot that understands when to take the in-base mineral-only gains an edge. But so would a bot which understands how to attack it from outside, not as common a skill.

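| # | bot | overall | Locu | Purp | Bana | DaQi | Stea | ZZZK | Micr | Iron | Xiao | McRa | UAlb | AITP | Bunk |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 1 | Locutus | 92.50% | - | 60% | 80% | 100% | 90% | 100% | 90% | 90% | 100% | 100% | 100% | 100% | 100% |
| 2 | PurpleWave | 80.83% | 40% | - | 10% | 80% | 80% | 100% | 90% | 100% | 90% | 80% | 100% | 100% | 100% |
| 3 | BananaBrain | 70.83% | 20% | 90% | - | 30% | 70% | 90% | 70% | 70% | 40% | 80% | 90% | 100% | 100% |
| 4 | DaQin | 64.17% | 0% | 20% | 70% | - | 100% | 20% | 80% | 90% | 60% | 50% | 80% | 100% | 100% |
| 5 | Steamhammer | 56.67% | 10% | 20% | 30% | 0% | - | 50% | 20% | 90% | 80% | 100% | 90% | 90% | 100% |
| 6 | ZZZKBot | 48.33% | 0% | 0% | 10% | 80% | 50% | - | 50% | 100% | 60% | 30% | 90% | 20% | 90% |
| 7 | Microwave | 47.50% | 10% | 10% | 30% | 20% | 80% | 50% | - | 20% | 40% | 60% | 60% | 90% | 100% |
| 8 | Iron | 44.17% | 10% | 0% | 30% | 10% | 10% | 0% | 80% | - | 40% | 70% | 90% | 100% | 90% |
| 9 | XiaoYi | 49.15% | 0% | 10% | 60% | 40% | 20% | 40% | 60% | 60% | - | 30% | 75% | 100% | 100% |
| 10 | McRave | 40.00% | 0% | 20% | 20% | 50% | 0% | 70% | 40% | 30% | 70% | - | 30% | 80% | 70% |
| 11 | UAlbertaBot | 30.51% | 0% | 0% | 10% | 20% | 10% | 10% | 40% | 10% | 25% | 70% | - | 70% | 100% |
| 12 | AITP | 20.83% | 0% | 0% | 0% | 0% | 10% | 80% | 10% | 0% | 0% | 20% | 30% | - | 100% |
| 13 | BunkerBoxeR | 4.17% | 0% | 0% | 0% | 0% | 0% | 10% | 0% | 10% | 0% | 30% | 0% | 0% | - |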

(4) Circuit Breaker

Iron was unhappy with this map, though to me it seems like a good map for Iron’s skills.

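| # | bot | overall | Locu | Purp | Bana | DaQi | Stea | ZZZK | Micr | Iron | Xiao | McRa | UAlb | AITP | Bunk |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 1 | Locutus | 90.83% | - | 40% | 70% | 90% | 100% | 100% | 90% | 100% | 100% | 100% | 100% | 100% | 100% |
| 2 | PurpleWave | 85.71% | 60% | - | 40% | 100% | 60% | 100% | 90% | 100% | 100% | 100% | 78% | 100% | 100% |
| 3 | BananaBrain | 76.67% | 30% | 60% | - | 60% | 100% | 100% | 60% | 80% | 80% | 70% | 80% | 100% | 100% |
| 4 | DaQin | 60.83% | 10% | 0% | 40% | - | 100% | 10% | 90% | 100% | 70% | 30% | 80% | 100% | 100% |
| 5 | Steamhammer | 50.42% | 0% | 40% | 0% | 0% | - | 40% | 40% | 60% | 50% | 90% | 89% | 100% | 100% |
| 6 | ZZZKBot | 50.83% | 0% | 0% | 0% | 90% | 60% | - | 50% | 50% | 40% | 80% | 70% | 70% | 100% |
| 7 | Microwave | 55.00% | 10% | 10% | 40% | 10% | 60% | 50% | - | 50% | 80% | 70% | 80% | 100% | 100% |
| 8 | Iron | 40.83% | 0% | 0% | 20% | 0% | 40% | 50% | 50% | - | 20% | 50% | 60% | 100% | 100% |
| 9 | XiaoYi | 46.67% | 0% | 0% | 20% | 30% | 50% | 60% | 20% | 80% | - | 40% | 60% | 100% | 100% |
| 10 | McRave | 36.67% | 0% | 0% | 30% | 70% | 10% | 20% | 30% | 50% | 60% | - | 20% | 80% | 70% |
| 11 | UAlbertaBot | 33.33% | 0% | 22% | 20% | 20% | 11% | 30% | 20% | 40% | 40% | 80% | - | 33% | 80% |
| 12 | AITP | 15.97% | 0% | 0% | 0% | 0% | 0% | 30% | 0% | 0% | 0% | 20% | 67% | - | 80% |
| 13 | BunkerBoxeR | 5.83% | 0% | 0% | 0% | 0% | 0% | 0% | 0% | 0% | 0% | 30% | 20% | 20% | - |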

(4) Empire of the Sun

ZZZKBot’s favorite map.

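| # | bot | overall | Locu | Purp | Bana | DaQi | Stea | ZZZK | Micr | Iron | Xiao | McRa | UAlb | AITP | Bunk |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 1 | Locutus | 89.17% | - | 30% | 100% | 90% | 100% | 90% | 90% | 100% | 100% | 90% | 90% | 100% | 90% |
| 2 | PurpleWave | 83.05% | 70% | - | 20% | 100% | 70% | 100% | 90% | 100% | 90% | 70% | 100% | 100% | 90% |
| 3 | BananaBrain | 69.75% | 0% | 80% | - | 60% | 100% | 70% | 80% | 50% | 40% | 70% | 89% | 100% | 100% |
| 4 | DaQin | 55.83% | 10% | 0% | 40% | - | 70% | 0% | 100% | 80% | 60% | 40% | 70% | 100% | 100% |
| 5 | Steamhammer | 55.00% | 0% | 30% | 0% | 30% | - | 60% | 20% | 60% | 80% | 90% | 90% | 100% | 100% |
| 6 | ZZZKBot | 59.17% | 10% | 0% | 30% | 100% | 40% | - | 60% | 80% | 20% | 80% | 100% | 90% | 100% |
| 7 | Microwave | 49.17% | 10% | 10% | 20% | 0% | 80% | 40% | - | 20% | 60% | 60% | 90% | 100% | 100% |
| 8 | Iron | 46.67% | 0% | 0% | 50% | 20% | 40% | 20% | 80% | - | 30% | 40% | 90% | 100% | 90% |
| 9 | XiaoYi | 52.10% | 0% | 10% | 60% | 40% | 20% | 80% | 40% | 70% | - | 30% | 89% | 100% | 90% |
| 10 | McRave | 48.33% | 10% | 30% | 30% | 60% | 10% | 20% | 40% | 60% | 70% | - | 70% | 100% | 80% |
| 11 | UAlbertaBot | 26.72% | 10% | 0% | 11% | 30% | 10% | 0% | 10% | 10% | 11% | 30% | - | 90% | 100% |
| 12 | AITP | 6.67% | 0% | 0% | 0% | 0% | 0% | 10% | 0% | 0% | 0% | 0% | 10% | - | 60% |
| 13 | BunkerBoxeR | 8.33% | 10% | 10% | 0% | 0% | 0% | 0% | 0% | 10% | 10% | 20% | 0% | 40% | - |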

(4) Fortress

Fortress has corner bases that can be reached either by air, or by workers which mineral-walk through a gate. I have yet to see a bot that can take advantage of the corner bases. I'll be watching replays to see if I can find one. BananaBrain did well on this map, so it's a candidate.

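| # | bot | overall | Locu | Purp | Bana | DaQi | Stea | ZZZK | Micr | Iron | Xiao | McRa | UAlb | AITP | Bunk |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 1 | Locutus | 90.00% | - | 40% | 70% | 80% | 100% | 100% | 90% | 100% | 100% | 100% | 100% | 100% | 100% |
| 2 | PurpleWave | 81.67% | 60% | - | 0% | 70% | 90% | 100% | 90% | 100% | 100% | 70% | 100% | 100% | 100% |
| 3 | BananaBrain | 76.67% | 30% | 100% | - | 60% | 70% | 90% | 60% | 70% | 80% | 70% | 90% | 100% | 100% |
| 4 | DaQin | 57.50% | 20% | 30% | 40% | - | 90% | 10% | 80% | 70% | 60% | 30% | 60% | 100% | 100% |
| 5 | Steamhammer | 50.83% | 0% | 10% | 30% | 10% | - | 50% | 10% | 100% | 60% | 40% | 100% | 100% | 100% |
| 6 | ZZZKBot | 55.83% | 0% | 0% | 10% | 90% | 50% | - | 30% | 50% | 70% | 90% | 90% | 90% | 100% |
| 7 | Microwave | 50.83% | 10% | 10% | 40% | 20% | 90% | 70% | - | 10% | 30% | 30% | 100% | 100% | 100% |
| 8 | Iron | 46.67% | 0% | 0% | 30% | 30% | 0% | 50% | 90% | - | 40% | 40% | 90% | 100% | 90% |
| 9 | XiaoYi | 50.42% | 0% | 0% | 20% | 40% | 40% | 30% | 70% | 60% | - | 60% | 89% | 100% | 100% |
| 10 | McRave | 48.33% | 0% | 30% | 30% | 70% | 60% | 10% | 70% | 60% | 40% | - | 40% | 100% | 70% |
| 11 | UAlbertaBot | 26.05% | 0% | 0% | 10% | 40% | 0% | 10% | 0% | 10% | 11% | 60% | - | 70% | 100% |
| 12 | AITP | 7.50% | 0% | 0% | 0% | 0% | 0% | 10% | 0% | 0% | 0% | 0% | 30% | - | 50% |
| 13 | BunkerBoxeR | 7.50% | 0% | 0% | 0% | 0% | 0% | 0% | 0% | 10% | 0% | 30% | 0% | 50% | - |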

(4) Python

Python is a grand old classic, a largely successful attempt to redesign Lost Temple without all the imbalances. It has 2 island bases.

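| # | bot | overall | Locu | Purp | Bana | DaQi | Stea | ZZZK | Micr | Iron | Xiao | McRa | UAlb | AITP | Bunk |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 1 | Locutus | 95.00% | - | 60% | 90% | 90% | 100% | 100% | 100% | 100% | 100% | 100% | 100% | 100% | 100% |
| 2 | PurpleWave | 81.51% | 40% | - | 30% | 80% | 50% | 100% | 100% | 90% | 100% | 90% | 100% | 100% | 100% |
| 3 | BananaBrain | 65.00% | 10% | 70% | - | 40% | 90% | 70% | 60% | 40% | 70% | 60% | 70% | 100% | 100% |
| 4 | DaQin | 65.83% | 10% | 20% | 60% | - | 90% | 0% | 80% | 100% | 90% | 60% | 80% | 100% | 100% |
| 5 | Steamhammer | 52.50% | 0% | 50% | 10% | 10% | - | 90% | 10% | 20% | 60% | 80% | 100% | 100% | 100% |
| 6 | ZZZKBot | 57.50% | 0% | 0% | 30% | 100% | 10% | - | 50% | 80% | 50% | 80% | 100% | 90% | 100% |
| 7 | Microwave | 54.17% | 0% | 0% | 40% | 20% | 90% | 50% | - | 10% | 90% | 60% | 100% | 90% | 100% |
| 8 | Iron | 51.67% | 0% | 10% | 60% | 0% | 80% | 20% | 90% | - | 0% | 60% | 100% | 100% | 100% |
| 9 | XiaoYi | 44.17% | 0% | 0% | 30% | 10% | 40% | 50% | 10% | 100% | - | 30% | 60% | 100% | 100% |
| 10 | McRave | 40.00% | 0% | 10% | 40% | 40% | 20% | 20% | 40% | 40% | 70% | - | 70% | 60% | 70% |
| 11 | UAlbertaBot | 25.42% | 0% | 0% | 30% | 20% | 0% | 0% | 0% | 0% | 40% | 30% | - | 100% | 90% |
| 12 | AITP | 10.92% | 0% | 0% | 0% | 0% | 0% | 10% | 10% | 0% | 0% | 40% | 0% | - | 70% |
| 13 | BunkerBoxeR | 5.83% | 0% | 0% | 0% | 0% | 0% | 0% | 0% | 0% | 0% | 30% | 10% | 30% | - |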

Next: I want to present the same information in a different format that I hope will be easier to draw lessons from.

AIIDE 2019 - race balance

The CoG results file is troublesome, so I’m analyzing AIIDE first after all. The purpose of a plan, after all, is not to be executed, but to be changed; contact with the enemy and all that. This year’s AIIDE detailed_results.txt file was easy to read and interpret. I only needed a couple small changes from last year’s script.

Here is my version of the crosstable. It is identical to the official crosstable, so it doesn’t include any new information. If somebody wants it, I can also post my version of the per-map results, but that doesn’t include any new information either.

| # | bot | overall | Locu | Purp | Bana | DaQi | Stea | ZZZK | Micr | Iron | Xiao | McRa | UAlb | AITP | Bunk |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 1 | Locutus | 91.42% | - | 45% (45/100) | 89% (89/100) | 83% (83/100) | 97% (97/100) | 99% (99/100) | 92% (92/100) | 99% (99/100) | 96% (96/100) | 99% (99/100) | 99% (99/100) | 100% (100/100) | 99% (99/100) |
| 2 | PurpleWave | 85.54% | 55% (55/100) | - | 44% (44/100) | 85% (85/100) | 71% (71/100) | 100% (100/100) | 93% (93/100) | 98% (98/100) | 97% (97/100) | 89% (89/100) | 98% (94/96) | 100% (100/100) | 97% (97/100) |
| 3 | BananaBrain | 68.81% | 11% (11/100) | 56% (56/100) | - | 51% (51/100) | 85% (85/100) | 83% (83/100) | 71% (71/100) | 59% (59/100) | 57% (57/100) | 69% (69/100) | 84% (83/99) | 100% (100/100) | 100% (100/100) |
| 4 | DaQin | 63.33% | 17% (17/100) | 15% (15/100) | 49% (49/100) | - | 94% (94/100) | 10% (10/100) | 83% (83/100) | 92% (92/100) | 82% (82/100) | 41% (41/100) | 78% (78/100) | 100% (100/100) | 99% (99/100) |
| 5 | Steamhammer | 52.25% | 3% (3/100) | 29% (29/100) | 15% (15/100) | 6% (6/100) | - | 59% (59/100) | 25% (25/100) | 67% (67/100) | 50% (50/100) | 86% (86/100) | 91% (89/98) | 97% (97/100) | 100% (100/100) |
| 6 | ZZZKBot | 52.08% | 1% (1/100) | 0% (0/100) | 17% (17/100) | 90% (90/100) | 41% (41/100) | - | 44% (44/100) | 55% (55/100) | 49% (49/100) | 67% (67/100) | 90% (90/100) | 72% (72/100) | 99% (99/100) |
| 7 | Microwave | 50.67% | 8% (8/100) | 7% (7/100) | 29% (29/100) | 17% (17/100) | 75% (75/100) | 56% (56/100) | - | 13% (13/100) | 65% (65/100) | 64% (64/100) | 82% (82/100) | 93% (93/100) | 99% (99/100) |
| 8 | Iron | 49.25% | 1% (1/100) | 2% (2/100) | 41% (41/100) | 8% (8/100) | 33% (33/100) | 45% (45/100) | 87% (87/100) | - | 26% (26/100) | 65% (65/100) | 90% (90/100) | 100% (100/100) | 93% (93/100) |
| 9 | XiaoYi | 48.62% | 4% (4/100) | 3% (3/100) | 43% (43/100) | 18% (18/100) | 50% (50/100) | 51% (51/100) | 35% (35/100) | 74% (74/100) | - | 36% (36/100) | 73% (69/95) | 100% (100/100) | 98% (98/100) |
| 10 | McRave | 40.00% | 1% (1/100) | 11% (11/100) | 31% (31/100) | 59% (59/100) | 14% (14/100) | 33% (33/100) | 36% (36/100) | 35% (35/100) | 64% (64/100) | - | 43% (43/100) | 82% (82/100) | 71% (71/100) |
| 11 | UAlbertaBot | 28.04% | 1% (1/100) | 2% (2/96) | 16% (16/99) | 22% (22/100) | 9% (9/98) | 10% (10/100) | 18% (18/100) | 10% (10/100) | 27% (26/95) | 57% (57/100) | - | 75% (72/96) | 89% (89/100) |
| 12 | AITP | 11.62% | 0% (0/100) | 0% (0/100) | 0% (0/100) | 0% (0/100) | 3% (3/100) | 28% (28/100) | 7% (7/100) | 0% (0/100) | 0% (0/100) | 18% (18/100) | 25% (24/96) | - | 59% (59/100) |
| 13 | BunkerBoxeR | 8.08% | 1% (1/100) | 3% (3/100) | 0% (0/100) | 1% (1/100) | 0% (0/100) | 1% (1/100) | 1% (1/100) | 7% (7/100) | 2% (2/100) | 29% (29/100) | 11% (11/100) | 41% (41/100) | - |

The race balance is not too interesting this year. In the crosstable, we see protoss at the top, zerg grouped in the middle, and mostly terran on the bottom, so we don’t need numbers to judge the race balance. But here’s the table anyway. The random row and the versus-random column are the least interesting of all, because UAlbertaBot was the only random player.

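| | overall | vT | vP | vZ | vR |
|---|---|---|---|---|---|
| terran | 29% | - | 14% | 28% | 50% |
| protoss | 70% | 86% | - | 71% | 80% |
| zerg | 52% | 72% | 29% | - | 88% |
| random | 28% | 50% | 20% | 12% | - |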

And here’s the breakdown of how each bot performed against each race. The most surprising results are that Steamhammer did poorly against these specific 2 zerg opponents, although ZvZ is in general its strongest matchup, and that weaker participants UAlbertaBot and BunkerBoxeR somehow scored higher against mighty protoss than against middling zerg. In the crosstable above, we can see the matchups which were responsible for the surprises.

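| # | bot | race | overall | vT | vP | vZ | vR |
|---|---|---|---|---|---|---|---|
| 1 | Locutus | protoss | 91.42% | 98% | 79% | 96% | 99% |
| 2 | PurpleWave | protoss | 85.54% | 98% | 68% | 88% | 98% |
| 3 | BananaBrain | protoss | 68.81% | 79% | 47% | 80% | 84% |
| 4 | DaQin | protoss | 63.33% | 93% | 30% | 62% | 78% |
| 5 | Steamhammer | zerg | 52.25% | 78% | 28% | 42% | 91% |
| 6 | ZZZKBot | zerg | 52.08% | 69% | 35% | 42% | 90% |
| 7 | Microwave | zerg | 50.67% | 68% | 25% | 66% | 82% |
| 8 | Iron | terran | 49.25% | 73% | 23% | 55% | 90% |
| 9 | XiaoYi | terran | 48.62% | 91% | 21% | 45% | 73% |
| 10 | McRave | protoss | 40.00% | 63% | 26% | 28% | 43% |
| 11 | UAlbertaBot | random | 28.04% | 50% | 20% | 12% | - |
| 12 | AITP | terran | 11.62% | 20% | 4% | 13% | 25% |
| 13 | BunkerBoxeR | terran | 8.08% | 17% | 7% | 1% | 11% |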
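The per-race columns can be reproduced by pooling a bot's wins and games over all opponents of one race. This sketch uses Locutus's row from the crosstable; the pooling rule is my assumption about how the numbers are derived, not a copy of the actual script.

```python
# Locutus's wins against each opponent (100 games each), from the crosstable.
locutus_wins = {
    "PurpleWave": 45, "BananaBrain": 89, "DaQin": 83, "Steamhammer": 97,
    "ZZZKBot": 99, "Microwave": 92, "Iron": 99, "XiaoYi": 96,
    "McRave": 99, "UAlbertaBot": 99, "AITP": 100, "BunkerBoxeR": 99,
}
race_of = {
    "PurpleWave": "protoss", "BananaBrain": "protoss", "DaQin": "protoss",
    "McRave": "protoss", "Steamhammer": "zerg", "ZZZKBot": "zerg",
    "Microwave": "zerg", "Iron": "terran", "XiaoYi": "terran",
    "AITP": "terran", "BunkerBoxeR": "terran", "UAlbertaBot": "random",
}

def rate_by_race(wins, races, games_per_opponent=100):
    """Pooled win rate against each race: total wins divided by total games."""
    totals = {}
    for opponent, won in wins.items():
        w, g = totals.get(races[opponent], (0, 0))
        totals[races[opponent]] = (w + won, g + games_per_opponent)
    return {race: w / g for race, (w, g) in totals.items()}

locutus_by_race = rate_by_race(locutus_wins, race_of)
for race, rate in sorted(locutus_by_race.items()):
    print(f"{race}: {rate:.1%}")
```

This gives 79% versus protoss and 96% versus zerg, matching the table. Versus terran it gives 98.5%, which the table shows as 98%.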

We also see that XiaoYi annihilated terran opponents, even though it came in a little below Iron overall. Comparing to last year’s results, XiaoYi’s parent bot SAIDA wiped the floor with terrans even harder.

Next: The voluminous per-map crosstables.

AIIDE 2019 second first look

The AIIDE 2019 tournament has been rerun to correct an error. The results are official, different from before, and hopefully final. In the original run of the tournament, we’re told, a hardware error corrupted a file and caused McRave to crash every game against Locutus. In the corrected rerun, McRave was able to score 1 win against Locutus in 100 games, but ironically ended up with a slightly lower overall winning rate. Bugs in McRave were more important for its result than bugs in the tournament.

#1 Locutus and #2 PurpleWave maintain their positions, but Locutus no longer had plus results against every opponent: PurpleWave edged it out 55-45 in their matchup. #3 BananaBrain gained a rank, and #4 DaQin lost one. From my point of view, the most important result is that #5 Steamhammer moved ahead of #7 Microwave and #8 Iron—these competitors were tightly grouped, and it only took small changes in the results to switch their finishing order around thoroughly.

shifts in the results

The order of finishers looks different, but most winning rates in the final results are within a few percent of the deprecated original results. Exceptions are #4 DaQin at 63.33% which was formerly #3 DaQin at 69.39%, a shift of 6% down, and #6 ZZZKBot at 52.08%, formerly #9 ZZZKBot at 43.04%, a shift of 9% up. What accounts for these two bots having such different results? To my eye, it doesn’t look like typical statistical variation.

I looked at the scores of specific matchups. Surprise result one: Formerly ZZZKBot scored 18-82 versus DaQin, but this time ZZZKBot 90-10 DaQin. This one difference accounts for the entire shift in DaQin’s winning rate, moving it down a rank, and much of ZZZKBot’s shift. Surprise result two: Formerly ZZZKBot 34-66 McRave, but this time ZZZKBot 67-33. That accounts for McRave performing worse overall, and for ZZZKBot jumping up the ranks. In other matchups, ZZZKBot performed similarly in both runs of the tournament.
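The arithmetic behind that claim is easy to check. This is only a sketch: it assumes 13 bots (so 12 opponents each), 100 games per pairing, and that every game completed, which is not exactly true of any real tournament run.

```python
# Assumption: 13 bots, 100 games per pairing, every game completed.
GAMES_PER_BOT = 12 * 100


def rate_shift(win_delta, games=GAMES_PER_BOT):
    """Change in overall win rate, in percentage points, for a change in wins."""
    return 100.0 * win_delta / games


# ZZZKBot vs DaQin went from 18-82 to 90-10; vs McRave from 34-66 to 67-33.
gain_vs_daqin = 90 - 18    # 72 more wins for ZZZKBot, 72 fewer for DaQin
gain_vs_mcrave = 67 - 34   # 33 more wins for ZZZKBot, 33 fewer for McRave

daqin_shift = rate_shift(-gain_vs_daqin)
zzzk_shift = rate_shift(gain_vs_daqin + gain_vs_mcrave)

print(f"DaQin   predicted {daqin_shift:+.2f}, reported {63.33 - 69.39:+.2f}")
print(f"ZZZKBot predicted {zzzk_shift:+.2f}, reported {52.08 - 43.04:+.2f}")
```

The two matchups predict -6.00 points for DaQin and +8.75 for ZZZKBot, against reported shifts of -6.06 and +9.04, so nearly all of the movement comes from these two pairings.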

Why did ZZZKBot perform so differently in these 2 matchups alone? I’ll dig in later, but I can speculate; here are 3 possible reasons, and it could be something else. There is some smell of software error: 18-82 -> 90-10 and 34-66 -> 67-33 look as though the results for the players were swapped. Or perhaps ZZZKBot was affected by the hardware error in these 2 matchups. Another possibility is unstable learning. I know that Steamhammer can perform very differently in two runs of the same matchup depending on what openings it happens to randomly try (does it hit on a winner early?). ZZZKBot’s learning is complicated and hard to analyze, but maybe it is susceptible to some effect like that.
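A rough way to put numbers on the "smell" is a two-proportion z statistic, computed once for the scores as reported and once under the guess that the first run's scores were swapped. This is a sketch only: it treats the 100 games of a matchup as independent, which is doubtful for learning bots, and a small z for the swapped case is merely consistent with a swap. It cannot rule out the other explanations.

```python
import math


def two_proportion_z(wins_a, wins_b, n=100):
    """z statistic comparing two win counts, each out of n games."""
    pooled = (wins_a + wins_b) / (2 * n)
    std_err = math.sqrt(pooled * (1 - pooled) * 2 / n)
    return (wins_a / n - wins_b / n) / std_err


# ZZZKBot's wins in (first run, rerun) against each opponent.
matchups = {"DaQin": (18, 90), "McRave": (34, 67)}

results = {}
for opponent, (first, rerun) in matchups.items():
    as_reported = two_proportion_z(first, rerun)
    if_swapped = two_proportion_z(100 - first, rerun)
    results[opponent] = (as_reported, if_swapped)
    print(f"{opponent}: as reported z={as_reported:.2f}, if swapped z={if_swapped:.2f}")
```

As reported, the z values are about -10 and -4.7, far outside ordinary run-to-run noise. Under the swap guess they shrink to about -1.6 and -0.15.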

AIIDE 2019 results first look

Important update on Friday 11 October: The results are invalid due to an error and the tournament will be repeated from scratch. See Dave Churchill’s tweet “The 2019 AIIDE StarCraft AI Competition will have to be re-run due to an error on our part causing a corrupted file which caused McRave to crash a lot of games.” The same error might have caused other problems. Even if McRave was the only bot directly affected, the competition was round robin so every bot’s score was potentially affected.

The AIIDE 2019 results were announced today at the conference. The AIIDE conference stream includes Dave Churchill’s presentation starting at about 52:30. The results come with a video of Locutus versus PurpleWave, with commentary by Dan Gant focusing not on the game, but on the AI techniques.

The standings: #1 Locutus edged out #2 PurpleWave. #3 DaQin and #4 BananaBrain were far behind, but finished out the dominant protoss bloc. (The win rate over time graph strangely omits #4 BananaBrain.) #5 Iron, #6 Microwave, #7 XiaoYi, and #8 Steamhammer were closely grouped around 50% win rate. As in CoG, Iron is the top terran and the top returning bot, and Microwave was the top zerg.

#10 McRave did surprisingly poorly. It must be suffering from new bugs. I notice that McRave’s army has become strangely passive; it sometimes seems unwilling to fight even with a large advantage. That seems like a symptom of an important bug.

#8 Steamhammer did about as I expected, or at least as I expected after I noticed the combat sim bug that I had just added. Without that bug I think it would have finished slightly ahead of Microwave. I’m bothered by the 59% win rate against Iron, though; I expected over 90%. I tested on every map with the correct version of Iron, but must have made a mistake somewhere.

Last year, Bruce Nielsen provided diffs from Locutus for bots derived from it. This year, Dan Gant has provided diffs of a few other bots.

Stormbreaker derived from SAIDA - Stormbreaker was disqualified because its behavior was nearly identical to SAIDA’s, though there are big code differences. According to the presentation, Stormbreaker adds a neural network but does not use it.

XiaoYi derived from SAIDA - According to the presentation, SAIDA would likely have finished 3rd if it had played. XiaoYi placed 7th behind Microwave.

DaQin this year versus last year. I see a great many detailed changes.

We were promised a second competition on “unknown” maps, for those bots which did not opt out. I count 8 participants for the second competition. I don’t see a sign of its results. Perhaps it has not been run yet.

As always, I will analyze both CoG and AIIDE. But CoG is showing evidence of sloppiness, so AIIDE deserves more attention. With fewer entrants in AIIDE this year, it won’t take as long to dig into them. But I think I have almost managed to interpret the CoG result file, so I’ll start there.

Steamhammer can’t finish the game

Finishing off the enemy just means destroying all their buildings. It sounds simple, but it is a sophisticated skill, and there are a lot of ways to go wrong. Steamhammer has a number of special provisions for quickly finding the last enemy remnants, but small loopholes persist and occasionally a game slips through one.

PurpleSpirit-Steamhammer on La Mancha is an example. It’s an entertaining game, thanks to the purple habit of playing all over the map, but I want to focus on the end, after PurpleSpirit has lost, when Steamhammer fails to destroy the floating terran buildings that are right over its head. That part is entertaining too, but not for the same reason.

the beginning of the end

Everything terran on the ground is destroyed, except one command center which was infested instead. The remaining terran force is 2 full-strength battlecruisers, and the remaining terran infrastructure is 2 floating ebays. Steamhammer is maxed and banking resources, but its only anti-air units are 8 scourge, plus a defiler with plague. Notice how much game time is left?

One of Steamhammer’s special game-finishing skills is that it makes mutalisks to chase down the residue of the enemy. The condition is, if the enemy has no known bases and no known anti-air units, then Steamhammer will tech to mutalisks and make mutas its primary unit. The mutas scout faster than ground units, and can find floating buildings and island bases that ground units can’t reach. But here terran still has battlecruisers, so the mutalisk rule does not kick in. First the terran army, such as it is, must be defeated.
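The condition can be sketched in a few lines. The class and field names here are invented for illustration and are not Steamhammer's actual code, which is C++ and surely checks more than this.

```python
from dataclasses import dataclass


@dataclass
class EnemyUnit:
    """Hypothetical record of one known enemy unit or building."""
    name: str
    can_attack_air: bool
    is_base: bool = False  # a resource depot such as a command center


def mutalisk_rule_applies(known_enemies):
    """True when no known enemy base and no known anti-air unit remains."""
    no_bases = not any(u.is_base for u in known_enemies)
    no_anti_air = not any(u.can_attack_air for u in known_enemies)
    return no_bases and no_anti_air


battlecruisers = [EnemyUnit("battlecruiser", can_attack_air=True) for _ in range(2)]
ebays = [EnemyUnit("engineering bay", can_attack_air=False) for _ in range(2)]

print(mutalisk_rule_applies(battlecruisers + ebays))  # battlecruisers block the rule
print(mutalisk_rule_applies(ebays))                   # only floating buildings left
```

With the battlecruisers alive the rule stays off. Once only the floating ebays remain, it turns on.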

swarm all over

Some scourge have hit, and the battlecruisers are no longer at full HP. But Steamhammer has been replacing losses primarily with more zerglings and ultralisks, which are of no use. Oops, the unit mix is wrong. Now there are only 2 scourge, and a battlecruiser can kill a scourge in one shot—2 battlecruisers, if not distracted by other targets, are safe from 2 scourge. The ebays choose to park over the terran natural, and zerg units have congregated there. The battlecruisers seek zerg stuff to shoot, and the defiler responds by blanketing the area in defensive swarm, consuming zerglings like crazy.

some damage has been done

Well, the 2 scourge hit one of the battlecruisers, which was distracted seeking zerg units that strayed from under dark swarm. And Steamhammer is now making 3 new pairs of scourge to replace various losses; if it can keep this up, the battlecruisers will eventually fall. Best of all, the defiler has plagued the terran buildings, despite the zerg units underneath. That will put the ebays into the yellow. One more plague should put them into the red, after which they will burn down.

nothing happens after this

Whew, finally the battlecruisers are shot down by scourge.

But that’s all she wrote. The swarms wore off and there was no need to renew them. The defiler did not plague again because it thought the zerg units underneath were more valuable. Scourge are coded to avoid floating buildings, because it is usually wasteful to spend gas destroying them. The mutalisk rule is engaged, and there is supply to build 1 mutalisk, but Steamhammer happened to choose to spawn zerglings first, and after that there was no supply to make a mutalisk. The game timed out with no more progress.

Finishing off the enemy can be hard. In this case, Steamhammer had the wrong unit mix; to make zerglings and ultralisks when all enemies were in the air was no good. The mutalisk rule should make mutalisks only, not mix them with other units. The scourge might have understood that when only floating buildings are left, they are good targets. Also the zerg ground units that can’t shoot up might have known better than to chase floating buildings (though it can be useful when they track a building trying to escape), and the defiler might have realized that damage to its own units was irrelevant when it could eliminate the enemy. That is a lot of flaws, and yet Steamhammer rarely fails to finish a game!

And fixing all the problems would only narrow the loophole, not eliminate it. In the worst case, Steamhammer would need to be able to destroy some of its own units to clear supply to make mutalisks to finish off the enemy. And that’s a high-end skill that I am in no hurry to add.

Next: The start of CoG 2019 analysis.

CoG 2019 results first look

Dan Gant let me know yesterday that the CoG 2019 (formerly CIG 2019) full results are out. They finally got a new web site up. I grabbed everything, but found that replays_04.zip is corrupt, so we are missing replays from the final 10 rounds. The SOURCE_CODE download does not contain source code.

There were 27 participants, a good number, but only 9 were new entrants, not such a good number. The remaining 18 were holdovers from previous years (this assumes that LetaBot was a new submission as registered, not a holdover as stated in the slide show; I don’t know which is correct). 40 rounds were played, numbered 0 to 39. 27*26*40/2 gives 14,040 games ideally, and they claim that 14,027 successfully made it into the results. The five maps were a version of Heartbreak Ridge (2 starting locations), Alchemist and Great Barrier Reef (a version of El Niño) (3 starts), and Neo Sniper Ridge and Python (4 starts). Alchemist is badly designed, but the others are good choices. Heartbreak Ridge and Sniper Ridge have layout similarities; I would not have included both with only 5 maps total (of course, they chose randomly). I still maintain that 5 maps are not enough to smooth out differences; if a bot does particularly well or poorly on one map, it introduces an element of luck into the results.
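The game count is the usual round-robin formula: every pair of bots meets once per round. A small check, with a function name made up for the purpose:

```python
def round_robin_games(players, rounds):
    """Games in a round robin where each pair plays once per round."""
    return players * (players - 1) // 2 * rounds


ideal = round_robin_games(27, 40)
completed = 14027  # the number the organizers report
missing = ideal - completed

print(ideal, missing, f"{100.0 * completed / ideal:.2f}% completed")
```

That gives 14,040 ideal games, so 13 games are missing from the results.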

The result chart in the slide show does not agree with the result crosstable. The table gives #4 Iron 75.96%, #5 BananaBrain 74.81%, #6 XiaoYi 72.21%. The slide show gives #5 BananaBrain 72.21% from the next entry down in the table, #6 XiaoYi 70.38% from its next entry down in the table, and so on. The error seems to be in the slide show; the values are off by one row from #5 BananaBrain until #21 Ziabot, which in the slide show shares the same 29.33% win rate as #20 Bonjwa. I’ll be careful to use the win rates from the crosstable.
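The off-by-one pattern can be checked mechanically. The sketch below uses only the handful of values quoted in this post, not the full crosstable, and the function name is invented.

```python
# Crosstable rows #4 to #7, as quoted in this post.
table = [
    ("Iron", 75.96),
    ("BananaBrain", 74.81),
    ("XiaoYi", 72.21),
    ("Microwave", 70.38),
]

# Values the slide show assigns to the same bots.
slide = {"BananaBrain": 72.21, "XiaoYi": 70.38}


def slide_offsets(table, slide):
    """For each bot, how many rows below its own row the slide value sits."""
    names = [name for name, _ in table]
    rates = [rate for _, rate in table]
    offsets = {}
    for bot, shown in slide.items():
        own_row = names.index(bot)
        matches = [i for i, rate in enumerate(rates) if rate == shown]
        offsets[bot] = matches[0] - own_row if matches else None
    return offsets


offsets = slide_offsets(table, slide)
print(offsets)
```

Both bots come out with an offset of 1: each slide value belongs to the bot one row further down the crosstable.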

I get the impression that the organizers are overburdened. Running a tournament is a ton of work. They do not seem to have the resources to verify details and get everything right. I hope they have time to go back and clean up the errors.

The participants fall into fairly neat score groups. The protoss leaders #1 PurpleWave, #2 McRave, and #3 Locutus (all independently written, by the way, with no shared code history) are at 88.56-84.9%. Then a gap, and the next group is #4 Iron to #9 BetaStar at 75.96-67.41%. Another gap, and the next group is #10 MetaBot to #13 TitanIron at 59.04-56.35%.

I can verify from its learning files that terran #6 XiaoYi at 72.21% wins is a fork of SAIDA. By the way, it is given as registering under the name XiaoYiAI, but played under the name XIAOYI. It was brand new, so probably no bot had special preparation for it by name. Nevertheless, other bots seem to have played under their names as registered, so XiaoYi was potentially given an advantage of anonymity.

#4 Iron at 75.96% win rate is the top terran and the top holdover bot from the previous year. #7 Microwave did well at 70.38%, making it the top zerg. (#22 Steamhammer is the buggy holdover from the previous year and performed miserably, as expected.)

The biggest upset by far is #24 OpprimoBot, 23.22% win rate, 28-12 versus #1 PurpleWave with 88.56% win rate. I watched a few replays and found that, in those games, PurpleWave made one probe and then stopped all production. I can only guess that OpprimoBot tickled a bug, and the bug must be triggered by the name “OpprimoBot”, since PurpleWave went wrong before learning anything else about its opponent. I imagine that the famously thorough Purple tournament preparation hit a glitch in this one case. Maybe it is related to the fact that OpprimoBot plays random on SSCAIT, but played terran here. The object file Opponents.class does mention OpprimoBot by name, along with many other potential opponents.

I will analyze the results as usual, with the colorful crosstables and stuff, but may be slower than in the past.

Steamhammer’s defiler play

Earlier I claimed that if Steamhammer has defilers out, it is probably winning. It’s true but misleading. Steamhammer’s game plan is such that it doesn’t try to win in the middle game (though it may win accidentally); the midgame is about holding the enemy off, getting upgrades, and growing the economy. When its drone count reaches 75, Steamhammer switches to all army production and its military strength grows rapidly—and that is the same time that defilers come into their own. When it reaches the late game, whether there are defilers or not, Steamhammer will be hitting hard and very likely winning, because if it weren’t winning it would probably have lost already. The late game is Steamhammer’s strongest phase.

Nevertheless, Steamhammer’s defiler usage has grown skillful enough that it definitely contributes to the bot’s strength. It’s an important milestone, because good defiler usage is critical to strong zerg play. Defilers are complicated, and zerg cannot reach its potential without mastering them. Steamhammer is far from mastering defilers, but I think it has become more adept with them than any other bot.

The original description of the defiler implementation is still accurate. Bugs have been fixed and refinements made, but the structure is unchanged.

that laser precision

I picked this game TyrProtoss-Randomhammer because it shows off Steamhammer’s precision and fluidity in defiler spellcasting. The precision was always there; the fluidity was reached by a long road with a year’s worth of bug fixes and other improvements. In the first picture, Steamhammer is about to start its decisive final attack. The attack would have won with or without defiler support. Accurate defiler spells made it effortless.

a slightly weak plague

Steamhammer chooses plague because it looks like the fight will feature zealots versus hydras. The plague is actually a little weak. The defiler (selected and a little hard to see) is hemmed in by friendly units and cannot move forward, and Steamhammer hasn’t seen all the approaching enemies yet, so it plagues only 3 zealots. With a short delay for the opposing armies to close, it could have plagued more enemies, but it is not able to figure that out. Still, the plague helped in the battle.

swarm neutralizes the dragoons

Once the front zealots have died, Steamhammer has a fuller view of the situation, and realizes that the enemy has mostly dragoons left. The defiler (which has the energy upgrade) casts swarm, consume, consume, then a second swarm. The swarms are accurately placed to nullify the dragoons and drive the enemy back, bringing the fight into the natural. In the picture the defiler has just cast the second swarm, and it instantly consumes again.

plague hammers in the final nail

The defiler is again stuck behind friendly units for a time, but protoss is forced back to the ramp. As zerg units spread out to raze the enemy natural, the defiler becomes free to move and drops a perfect plague on the ramp. Every protoss unit on the ramp is hit, and the zerg melee units that are in contact are untouched. No better plague is possible.

Steamhammer is awkward in maneuvering its defiler, and it is unable to foresee better opportunities that will arise in the near future. But when a spell goes down, it is often close to the best that is possible at that moment. It’s plenty good for now.

potential to turn the game

One defiler spell can make the difference between losing and winning. In practice, Steamhammer needs more skills before that will happen often in its games, but the examples here offer a foretaste. The game Steamhammer-McRave I picked to show the brute power of plague.

The first defiler of the game wandered carelessly into the front lines, stood on top of an isolated lurker, and got stormed to death. The defiler didn’t know the best place to go, that’s one weakness, and Steamhammer only makes 1 defiler at a time, that’s another weakness. An alert opponent can pick off the defiler and earn a respite. When the second defiler of the game joined the fray, this happened:

a plague on all your zealots

It’s a fun game, worth watching through. At the time of the picture, Steamhammer had been slowly twisting the situation into its favor from far behind, but McRave was still winning. The game could have gone either way. After this massive plague, turning a whole phalanx of zealots into straw, McRave’s army was broken and it quickly collapsed.

Here’s a picture from a different Steamhammer-McRave game. The red stuff all over the protoss natural is from 3 active plagues cast one after another by one defiler. The defiler just died; it sacrificed itself amidst the zealots (at the white spot of a dragoon hit) to get the third plague off, the one currently spreading over dragoons in the rear. The sacrifice was worth it. The protoss army was hollowed out, giving zerg time to consolidate an economic advantage and win. Without those plagues, McRave would have been able to move out and take the game with its bigger army (though I’m not sure it would have chosen to).

plague of plagues

Here is a third example of a turnaround plague, from Steamhammer-MadMixT. After playing well much of the game, Steamhammer went astray and collapsed under a terran attack. It was about to lose its main and the game when a defiler risked its life to throw a plague over the core of the terran force, reducing the units to eggshells. Plague hurts terran more than protoss, which has shields. It then took only a few zerg units to clear the attack, partly due to terran’s disorganized movement, and with a return to superior play, Steamhammer slowly clawed its way to a win.

eggshell terrans

keeping active

Steamhammer is weak at using defilers in defense. I think it’s the widest remaining gap in defiler play. But it’s pretty good at using defilers in attack. I chose the game Steamhammer-MadMixP to show how, in the best case, Steamhammer can keep its defiler active through a long attacking sequence. The game itself is not very interesting, but watch the finishing attack which starts at around 17:30 from across the bridge to the terran natural.

the finishing attack begins with a plague

It starts with a plague. If I didn’t miss anything, the sequence that follows is consume, consume, swarm, consume, consume, consume, plague, consume, consume, swarm, consume, consume, swarm, consume, consume, consume, swarm, consume, plague, consume, consume, consume, plague, consume, consume, consume, plague, consume, consume—and the last enemy building is destroyed. The defiler was stuck behind friendly units for short stretches, and otherwise busy every moment supporting the raging attack. Even though not all the activity was useful, the ability to keep constantly active is valuable.

blunders

Steamhammer does make mistakes in casting defiler spells, though rarely serious ones. Plague doesn’t account for moving units. It can miss fast-moving enemies altogether, for example trying to plague a group of corsairs and smearing only bare ground. In an active fight it can unintentionally plague its own units, which do not realize that they are moving into a danger zone. (It’s also perfectly willing to intentionally plague its own units, if that lets it splash more enemies. The calculation simply adds damage done to enemies and subtracts damage done to self.)
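Here is a minimal sketch of that add-and-subtract calculation. Everything specific in it is an assumption for illustration: the damage constant, the circular area of effect and its radius, the rule that plague leaves a unit at 1 hit point, and the choice to try only spots centered on enemy units. It is not Steamhammer's code.

```python
import math
from dataclasses import dataclass

PLAGUE_DAMAGE = 295    # assumption: maximum hit points removed by one plague
PLAGUE_RADIUS = 64.0   # assumption: area of effect simplified to a circle, in pixels


@dataclass
class Unit:
    x: float
    y: float
    hp: int
    is_enemy: bool


def plague_damage(unit):
    """Damage one plague would do; assumed never to take a unit below 1 hp."""
    return min(PLAGUE_DAMAGE, max(unit.hp - 1, 0))


def plague_score(cx, cy, units):
    """Damage to enemies minus damage to our own units for a plague at (cx, cy)."""
    score = 0
    for u in units:
        if math.hypot(u.x - cx, u.y - cy) <= PLAGUE_RADIUS:
            damage = plague_damage(u)
            score += damage if u.is_enemy else -damage
    return score


def best_plague_spot(units):
    """Try a plague centered on each enemy unit and keep the best-scoring spot."""
    spots = [(u.x, u.y) for u in units if u.is_enemy]
    if not spots:
        return None
    return max(spots, key=lambda spot: plague_score(spot[0], spot[1], units))


units = [
    Unit(0, 0, 100, True),
    Unit(10, 0, 100, True),
    Unit(5, 0, 35, False),   # a friendly zergling caught in the area
]
print(plague_score(0, 0, units), best_plague_spot(units))
```

Because the score is a plain sum, a plague that catches a friendly zergling is still cast whenever the enemy damage outweighs it, which is the intentional self-plague behavior described above.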

Dark swarm is thorny. Steamhammer only partially understands it. In Randomhammer-tscmoop2 the defiler cast a swarm that helped the enemy—with hydras on hand versus zealots and a reaver as well as the cannons and dragoons, swarm was a poor idea in the first place, and this swarm placement defends the enemy and does not open a path to attack. Oops. The research tab shows that plague was almost but not quite finished.

swarm protects enemy units

improvements needed

Why is Steamhammer weak with defensive dark swarm? To lay down swarm over lurkers and hold a position, or to force marines back with zerglings under swarm, you have to coordinate the swarm with the combat units. The defiler knows that swarm over lurkers is a great idea, and the combat simulator knows it too, in an approximate way. The lurkers themselves have no inkling; when their targets retreat, the lurkers will pick themselves up and obliviously step out of the swarm and die. Even something as simple as rendering hydralisks invulnerable to air attack is beyond Steamhammer. The hydras pay no attention.

Steamhammer’s squad code is already fragile from having too many skills tacked on, and needs to be rethought and rewritten. I’m reluctant to add complicated coordination skills before tackling that, so this weakness will likely be around for a while. It is a severe weakness, though, because defense is a critical use of defilers. Darn it, but there are times when only invulnerability can save you.

Besides the lack of coordination, there is no planning ahead. Defilers live in the moment and know nothing beyond. With prediction and planning, greater things could be accomplished with fewer resources.

More basically, the tendency of the defiler to get stuck behind friendly units reminds me that Steamhammer needs another fundamental skill: Smooth unit movement. Human players know that the defiler’s position is important, and move other units so that the defiler gets where it needs to be. And more generally, the missing skill shows in things like awkward movement through choke points, and clumsy collisions when squads cross paths.

I have a few simple ideas for how to control multiple defilers at once without blowing out cpu usage or seeing them simultaneously plague the same enemies. I expect I’ll implement that before SSCAIT this year. With 2 or 3 defilers in the Ground squad, defenders won’t have time to draw breath between spells.

Someday I’ll implement defiler drops. Zerglings with adrenal glands upgrade might as well be wrecking balls. Drop a defiler and some zerglings off to one side, swarm, swarm, and pick up the defiler for another go later while the lings rip down buildings. It’s cheap to do and expensive to defend against.

Steamhammer 2.3.5 source released

I finally uploaded the source for Steamhammer 2.3.5; see Steamhammer’s web page. Sorry about the delay. As I’ve mentioned, this is the same code as the AIIDE 2019 competition version, but configured to play all races rather than zerg only.

One of my steps in uploading a new version is to calculate the performance numbers of the previous version. By SSCAIT win rate, Steamhammer 2.3 (from last April) with 69% wins is the most successful version since Steamhammer 1.2 with 70% wins, way back in March 2017. And I’ve seen that the following test versions are stronger yet. This version 2.3.5 (and the tournament version 2.3.4 since it’s identical) has a new combat sim bug which makes it slightly weaker than the previous test version, 2.3.3, but I think it is still stronger than version 2.3. Steamhammer is doing well at the moment.

Before long I plan to release a new point version which fixes the combat sim bug and (just for fun) adds new queen skills. After that I’ll dig into strategy adaptation, which I expect will cause strength to plummet, because that is what major new features do at first. When strength recovers, though, it will recover to a higher level.

Next: A long post about defiler play, with examples and explanations. I also found a few research papers that I want to write up.

Steamhammer plans update

After every new Steamhammer version it’s time to come out with new plans for the following versions—plans that are always updated and sometimes changed beyond recognition. No plan survives contact with further consideration.

Despite my intention to stick to low-risk changes in the latest Steamhammer, I seem to have introduced a severe bug into combat simulation. I’m seeing cases of pointless fear, like ultralisks fearful to attack an assimilator, and Games Are Being Lost, which must not be. I need to make a point release to fix this bug, and possibly a few minor bugs besides.

But I don’t have the energy to fix bugs right now. I want to do something fun instead. I’m adding broodling for queens, and thinking about how to adjust the strategy boss so it can make more than 1 queen at a time without ever going overboard. Hmm. The rest of the code already supports any number of queens, at least in principle, but not the strategy boss. This is, by the way, another change of plans; I was originally intending to implement ensnare before broodling.

Related stuff that’s high on my list: Simple sample configuration file, for people who prefer to ignore the complex default config file with features piled over their heads. Up-to-date documentation. I promised a post on defilers. Oh, and I forgot to release source, I’d better do that next. I’d forget the period at the end of the sentence if it weren’t there on the keyboard to remind me

In the longer term, I need to go to BWAPI 4.4.0 and make progress on strategy adaptation. My thinking at the moment is that I’ll make the most progress by working with timings: The opponent model should record the timings and unit mix of opponent attacks, and compare them against the measured timings of Steamhammer’s openings to figure out how to counter an opponent. That’s only a small part of the full strategy adaptation suite, but it’s critical and it seems like a good piece to do early.
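To make the timing idea concrete, here is one very simplified way such a comparison might look. It is purely illustrative: the opening names, frame counts, economy scores, and the selection rule itself are all invented, and the real design may end up nothing like it.

```python
def pick_opening(attack_frames, openings, margin=0):
    """Pick the most economical opening that is ready before the earliest attack.

    attack_frames: frames at which this opponent's attacks arrived in past games.
    openings: list of (name, ready_frame, economy_score) tuples, where ready_frame
        is the measured frame by which the opening can defend itself.
    """
    if not attack_frames:
        # Nothing recorded about this opponent: take the greediest opening.
        return max(openings, key=lambda o: o[2])[0]
    deadline = min(attack_frames) - margin
    safe = [o for o in openings if o[1] <= deadline]
    if not safe:
        # No opening is ready in time: fall back to the fastest one.
        return min(openings, key=lambda o: o[1])[0]
    return max(safe, key=lambda o: o[2])[0]


# Hypothetical measured timings, in game frames.
openings = [("opening_a", 4000, 1), ("opening_b", 6000, 2), ("opening_c", 9000, 3)]

print(pick_opening([7000, 8500], openings))  # opening_b
```

A real opponent model would also have to weigh the recorded unit mix, which this sketch ignores.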

If you look closely at the source—you know, the source I forgot to release—you’ll find an unfinished and unused class BOSimulator which is a start on another essential part of strategy adaptation. I need to rename it, the name is confusingly similar to Dave Churchill’s BOSS.

Steamhammer’s strategy boss debug display

For this upload of Steamhammer to SSCAIT, I turned on display of the strategy boss information. It’s kind of cryptic, so I thought I’d explain.

strategy boss debug info drawn on the screen

The top section, with all yellow labels from “bases” to “build”, describes the game situation and some automatic reactions that can happen at any time, including during the opening. The top section displays all game long. Then comes a slight break, and the bottom section, with multi-color labels from “eco” to the end, describes the strategy boss’s plans and conclusions after the opening is over. The bottom section only appears after exit from the opening book.

The strategy boss also cares about the game’s mineral count, gas count, and supply count. That’s why I draw the strategy boss info directly underneath Starcraft’s info.

top section of the info

bases 3/12: Steamhammer owns 3 of the 12 bases on the map.

patches 20: These 3 bases have 20 mineral patches total. This determines the upper limit of how fast it is possible to mine minerals.

geysers 2+0: Steamhammer has taken 2 gas geysers, and no more are available. If a geyser is in the process of being taken (the extractor is morphing), then it doesn’t count on either side. With 3 bases and 2 geysers, it could be that Steamhammer has 2 gas bases and a mineral only, or 3 gas bases and the final geyser is in the process of being taken.

drones 27/39: Steamhammer has 27 drones, including drones that are in the egg and haven’t hatched yet. It will spawn up to 39 drones; it believes that, taking into account the mineral patches and geysers, 39 drones are as many as it is reasonable to make.

mins 23: 23 of the drones are mining minerals.

gas 0: None of the drones are mining gas. That leaves 4 drones on neither minerals nor gas; they may be still in the egg, or may be carrying out other business, like building or scouting. But in any case, gas collection is turned off at the moment so that Steamhammer can collect minerals faster.

react +16: Based on what we’ve seen the opponent do, aim for 16 drones more than what we would normally build in this situation; they are “reactive drones.” A number as large as +16 usually means that the opponent is playing very defensively with many cannons, so that we can safely make extra drones to get ahead in economy.

larvas 0: There are no larvas available to spawn more units. Steamhammer takes a shortage of larvas as evidence that it may need more hatcheries (other factors count too). That might be why gas is turned off: To collect minerals faster to make a macro hatchery ASAP. (Though it might have concluded only that 911 gas is more than it needs for now.)

build +0g +0h: This describes a complex strategy reaction that can occur in the opening book, and cannot occur later in the game. It connects with the “react” item above, the count of reactive drones. If there are too many reactive drones in the opening, they will bring in excess minerals, so Steamhammer takes actions to use resources more efficiently and prevent the excess minerals from building up unused. The actions are that it can take extra geysers (“g”) or make extra hatcheries (“h”). The extra geysers bring in extra gas so that income is balanced, and the extra hatcheries bring extra larvas for faster production. The result is that the unplanned resources can be spent efficiently and Steamhammer runs through its opening line faster, entering the middle game in a stronger position. In this game, it did not add anything extra in the opening; most of the +16 reactive drones must have been decided after the book line was over.
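The drone numbers in the top section fit together by simple subtraction. A tiny sketch, with a made-up function name, using the values from the "drones", "mins", and "gas" lines:

```python
def drone_summary(drones, drone_cap, on_minerals, on_gas):
    """Drones doing something other than mining, and room left under the cap."""
    return {
        "other": drones - on_minerals - on_gas,
        "room": drone_cap - drones,
    }


summary = drone_summary(drones=27, drone_cap=39, on_minerals=23, on_gas=0)
print(summary)
```

That accounts for the 4 drones that are in the egg, building, or scouting, and shows room for 12 more drones before the limit of 39 is reached.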

bottom section of the info

eco 0.35 5/43 describes plans for economic growth. 0.35 is the proportion of larvas planned to be made into drones on average over the long run. Actual moment-to-moment drone production can vary from all drones to all combat units, depending on circumstances, but Steamhammer aims for a given ratio. 5/43 is the actual ratio of drone production: It made 5 drones out of the last 43 larvas morphed, a ratio of about 0.12, so Steamhammer is behind on planned drone production. Reactive drones are implemented behind the scenes by manipulating the ratio.

army 24 40 bad gives army sizes: Steamhammer’s army size is 24 (in an arbitrary measure), including static defense, and the enemy’s army size is 40, counting only mobile units. This is “bad” in red because it means that Steamhammer may be overrun. When the army size is “bad” Steamhammer makes combat units to defend itself; that explains why it is behind on drone production.
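How the "eco" and "army" lines interact can be sketched as a single decision per larva. This is a simplification under assumptions: the function and its inputs are invented, and the real strategy boss weighs many more factors.

```python
def next_larva_use(planned_ratio, drones_made, larvas_morphed, army_status):
    """Decide what the next larva becomes, in a much simplified form.

    planned_ratio: target share of larvas that become drones (the 0.35).
    drones_made, larvas_morphed: the recent record (the 5/43).
    army_status: "bad" when the enemy army looks dangerous, otherwise "ok".
    """
    if army_status == "bad":
        return "combat unit"   # defense comes first, even when behind on drones
    actual_ratio = drones_made / larvas_morphed if larvas_morphed else 0.0
    return "drone" if actual_ratio < planned_ratio else "combat unit"


print(next_larva_use(0.35, 5, 43, "bad"))  # combat unit
print(next_larva_use(0.35, 5, 43, "ok"))   # drone
```

In the pictured game the army status is "bad", so larvas go to combat units even though drone production is well under the planned 0.35.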

The 7 orange unit types lings, hydras, lurkers, mutas, guardians, devourers, ultras, are the candidate units for Steamhammer’s main unit mix. Overlords, queens, and defilers are support units that are made on the side and do not count toward the main unit mix. The number after each unit type is the strategy boss’s score for how important the unit type is in the current situation. The scores are used for two purposes: First, to decide on the combat unit mix. Second, to decide on the tech target, the next unit type that the bot should aim for.

The starred unit types are those that Steamhammer has the tech for. They are candidates for the unit mix. The starred unit with the highest score will be part of the unit mix. The rest of the unit types are candidates for the tech target.

Zergling Hydralisk 1 Lurker is the combat unit mix. Hydras had the highest score, so they are part of the unit mix. The full rules for choosing the unit mix are complex. Lurkers had the next highest score, but the strategy boss decided that it is enough to have only 1 lurker for now (which really means at least 1 lurker; if there happen to already be more, that is OK). Minerals not spent on hydras or the 1 lurker (or on drones or whatever else) will be spent on zerglings.

plan Mutas gives the tech target: A tech switch to mutalisks is planned. Mutas are a candidate for the tech target because they have a higher score than any of the starred units that Steamhammer already has the tech for. Guardians and ultras have higher scores yet, but Steamhammer doesn’t have a hive so mutalisks can be gotten much faster. If mutas did not score higher than hydras, the strategy boss would have picked guardians as the tech target.
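The selection described above can be sketched roughly as follows. This is a simplification under stated assumptions, not the strategy boss's actual rules, which are more complex. The scores are invented for illustration, the function names are hypothetical, and the tie-break "prefer a target that needs no hive, otherwise take the highest score" is my guess at a rule consistent with the two outcomes mentioned in the text.

```python
def pick_main_unit(scores, have_tech):
    """The starred unit (tech already available) with the highest score."""
    return max(have_tech, key=lambda u: scores[u])


def tech_candidates(scores, have_tech):
    """Unstarred units that outscore every starred unit."""
    best_starred = max(scores[u] for u in have_tech)
    return [u for u in scores
            if u not in have_tech and scores[u] > best_starred]


def pick_tech_target(scores, have_tech, needs_hive, have_hive):
    """Assumed rule: prefer candidates reachable without a new hive."""
    cands = tech_candidates(scores, have_tech)
    if not cands:
        return None  # assumption: no tech switch is planned
    quick = [u for u in cands if have_hive or u not in needs_hive]
    pool = quick or cands  # fall back to slow targets if nothing is quick
    return max(pool, key=lambda u: scores[u])


# Invented scores; only their ordering mirrors the situation in the text.
scores = {"lings": 10, "hydras": 30, "lurkers": 25, "mutas": 35,
          "guardians": 50, "devourers": 5, "ultras": 45}
have_tech = {"lings", "hydras", "lurkers"}         # the starred units
needs_hive = {"guardians", "devourers", "ultras"}  # hive-tech units

print(pick_main_unit(scores, have_tech))
print(pick_tech_target(scores, have_tech, needs_hive, have_hive=False))
```

With these made-up numbers the sketch picks hydras for the unit mix and mutas as the tech target. Lowering the muta score below the hydra score removes mutas from the candidates, and the sketch then falls back to guardians, matching the case described in the text.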

half of AIIDE dropped out

This is too much. When I wrote about the 26 AIIDE 2019 registrants, I expected that not all of them would end up competing. It would have been surprising if a bot that looked as unfinished as Ophelia were ready in time. But this is too much.

See the list of participants. One extra holdover from last year was added, #11 LastOrder. Of the now 27 entrants, 13 dropped out, including the added LastOrder, so that 14 competitors remain. 4 are listed as withdrawn, 9 as not submitted.

The withdrawals are MetaBot, and the holdovers CSE, LastOrder, and SAIDA. MetaBot is supposedly unchanged, so in practice it is a holdover too. I suspect that LastOrder was added and marked withdrawn at the same time; the order of events suggests that it had already withdrawn before my write-up of the registrants. I especially miss LastOrder—last year, Steamhammer scored 25% against LastOrder, despite finishing higher in the rankings, and I would have enjoyed revenge.

Of the 9 unsubmitted bots, I will especially miss Dragon and Murph, which are both apparently protoss derivatives of CherryPi.

I’m sure circumstances are different in each case, but the large number of dropouts suggests a common underlying factor. What could it be?