
Steamhammer 3.6 uploaded

I uploaded Steamhammer 3.6 to SSCAIT. It turns out that Steamhammer terran needs a bug fix, so no Randomhammer for now. It will have to wait for my traditional post-AIIDE bugfix version.

As I mentioned earlier, I’ll post the change list after I’ve fixed up the blog. First things first.

AIIDE 2023 running results

I was excited to learn that the AIIDE 2023 results are coming in live this year, though the detailed results and win percentage over time are not updated as often as the overall results. My first impression is that the ranking so far is exactly what I expected it to be. Looking a little closer, McRave is doing substantially better versus Dragon than last year. (That is last year’s physical-hardware version of Dragon, to match this year’s.) Both are carryovers, so the difference may be due to the maps, or to the luck of learning. Well, the tournament is not far along. We’ll see how it goes with more data.

It looks like I was wrong about Steamhammer versus BananaBrain. Steamhammer is not as successful there as I expected. Either my tests were unrealistic in some way, or else I introduced a late bug. Or maybe BananaBrain was improved in a way that makes it harder for Steamhammer to do early damage. Steamhammer should put up a good fight in any game where it can kill some probes early on.

I will upload Steamhammer 3.6 to SSCAIT shortly.

AIIDE 2023 - Steamhammer is submitted

Steamhammer is submitted for AIIDE 2023. You bring the futile hopes, I’ll bring the lurker spines.

It was kind of a scramble this year, though. Yesterday when I did what was supposed to be the final round of major tests, I turned up 3 critical bugs, 2 in new features and 1 left over from who knows when, hidden from view because the opponent model learned to avoid playing into it. As I stayed up late last night frantically debugging, a cricket got into the house—into the same room—and started talking loudly with its friends outside. Not entirely relaxing! But I got up early in the morning and finished fixing everything, and ran more tests, and packed it up, and ran the did-I-pack-it-up-correctly checks and tests, and now it’s on its way.

Last year, Steamhammer scored about 8% against BananaBrain, which finished first. BananaBrain has improved since then, but I expect to score substantially higher against it this year. Well, I could have missed further bugs, but there was no sign of it in what I had time for. My very last test game was against BananaBrain. It was hard-fought and went to late game, and zerg lost narrowly. Steamhammer managed to parasite every shuttle, and plague a lot of armies.

Next: Fixing the blog comments is my first priority. Then I’ll release Steamhammer and write up its changes.

AIIDE 2023 participants

The AIIDE 2023 participants have been announced on a Google spreadsheet. They are the usual suspects and the carryovers plus Infested Artosis.

Infested Artosis by Brad Ewing is not high ranked, but I like it. It’s simple: you can read through the code on GitHub in a short time, a far cry from the complex code of Steamhammer, which must be baffling to those who first look at it. At the same time, Infested Artosis has the nice feature that it independently learns what build to play and what unit mix to make. Both are straightforward picks from short lists. There’s provision for learning whether to make static defense too.
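
To illustrate the general technique (this is a hedged sketch of one common approach, epsilon-greedy selection, not Infested Artosis’s actual code; the build names and helper are hypothetical):

```python
# A minimal sketch of "learn a pick from a short list": epsilon-greedy
# selection over per-build win rates. Not Infested Artosis's real method.
import random

def pick_build(stats, epsilon=0.1, rng=random):
    """stats: {build_name: (wins, games)}. Mostly exploit the best
    observed win rate, but explore a random build epsilon of the time."""
    if rng.random() < epsilon:
        return rng.choice(list(stats))
    # Untried builds get an optimistic rate of 1.0 so they get tried first.
    def rate(build):
        wins, games = stats[build]
        return wins / games if games else 1.0
    return max(stats, key=rate)

# Hypothetical learned record: the untried build wins the exploit step.
stats = {"9 Pool": (6, 10), "12 Hatch": (2, 10), "Overpool": (0, 0)}
print(pick_build(stats, epsilon=0.0))  # "Overpool": untried, optimistic
```

After each game, the bot would update the chosen build’s (wins, games) pair and persist it per opponent, which is all the bookkeeping this scheme needs.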

Enjoy the competition, Infested Artosis!

In other news, JyJ versus Soulkey in ASL 16 today was a dramatic TvZ best-of-five. The last two games were particularly hard-fought.

Dan’s AIIDE map pool analysis

Purple Dan Gant did some analysis of the AIIDE 2023 map pool and kindly e-mailed info to me. Here it is; I edited it slightly for clarity.

  • Ran games between PurpleWave and BananaBrain across all the AIIDE maps (based on what I think the legal maps are)
  • All the maps ran (they were 1.16.1-compatible)
  • Some of the maps in my zip appear to be observer maps which are unusable for bot play because they have extra start locations. All but Crossing Field have acceptable alternative versions in the pool
  • PurpleWave worked correctly on all maps. Its behavior on Katrina was dumb but explicable due to the backyard natural

He offers results on a Google spreadsheet: AIIDE 2023 Map Info (you have to switch between the About and Data sheets). And he created a new version of the map pool which may better represent what we should actually get: Dan’s unofficial copy of AIIDE 2023 map pool version 2.

Thanks, Dan! You have done a public service.

Steamhammer can play on all the maps, but on the ICCup observer maps it feels a need to scout the observer “bases” to make sure nobody’s there. It knows better than to try to expand there. There are maps where it can correctly reject observer slots as non-bases, but it depends on how they are set up. Steamhammer also plays stupidly on Katrina, and I don’t intend to fix that yet, because it costs too much time to pay down a small risk. At some point I’ll teach it to do the terrain connectivity analysis from first principles and create a graph that it can reason about. Playing on Outsider is a goal, but one step at a time.
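
For the curious, the core of such a connectivity analysis can be sketched in a few lines (a hypothetical illustration, not Steamhammer’s code; the grid and coordinates are made up):

```python
# A minimal sketch of terrain connectivity analysis: flood-fill the
# walkable tile grid into connected components, so a bot can tell which
# locations are ground-reachable from each other.
from collections import deque

def connected_components(walkable):
    """walkable: 2D list of bools. Returns a same-shaped grid of
    component labels (-1 for unwalkable tiles)."""
    h, w = len(walkable), len(walkable[0])
    label = [[-1] * w for _ in range(h)]
    next_label = 0
    for sy in range(h):
        for sx in range(w):
            if walkable[sy][sx] and label[sy][sx] == -1:
                # Breadth-first flood fill from this seed tile.
                queue = deque([(sy, sx)])
                label[sy][sx] = next_label
                while queue:
                    y, x = queue.popleft()
                    for dy, dx in ((1, 0), (-1, 0), (0, 1), (0, -1)):
                        ny, nx = y + dy, x + dx
                        if (0 <= ny < h and 0 <= nx < w
                                and walkable[ny][nx] and label[ny][nx] == -1):
                            label[ny][nx] = next_label
                            queue.append((ny, nx))
                next_label += 1
    return label

# Two walkable regions separated by an unwalkable column, like a
# mineral-blocked base: they get different labels.
grid = [
    [True, True, False, True],
    [True, True, False, True],
]
labels = connected_components(grid)
print(labels[0][0] == labels[1][1])  # True: same region
print(labels[0][0] == labels[0][3])  # False: blocked off
```

A graph over the resulting components (with edges where mineral walls or blocking neutrals could be cleared) is one way to get reasoning about maps like Outsider.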

Steamhammer work is coming along nicely. The new feature I’m adding is coming up to the finish line, and it is passing its tests so far. It won’t be fully refined yet, but still a good improvement. I should have time to add one more important feature before the tournament. Between them they will make the bot sharper at taking advantage of opportunities and more resilient when under pressure. Can’t ask more than that in a short timeframe.

CherryPi - Stardust game

As you might guess from the CoG results, Stardust has gained strong new skills since last year. Against zerg in particular, one skill is that it uses corsairs heavily. Another is that it learned the forge expand opening. And yet it can still be defeated.

CherryPi is from 2017, but it shows how. The SSCAIT game Stardust v CherryPi is instructive (I kept the replay for when SSCAIT recycles it).

The recipe is:

1. Facing forge expand, play a greedy opening. CherryPi opened pool first (with a rare 13 pool) so that it could make zerglings if it needed to, but cannons were not going to walk across the map to attack, so it followed up with drones and hatcheries.

2. Fight efficiently in the middle game. CherryPi could not take any one-sided victories against such a tough opponent, but it traded well and cut the protoss army down to a safe size, where zerg could spawn enough defensive units in the time the protoss would take to cross the map.

3. With that breathing room, zerg could safely pull ahead in workers. Then it was just mass and smash.

See how easy it is? It’s a simple matter of being good at everything!

Of course it was only possible because Stardust showed weaknesses. One is that it was cautious and clumsy with its corsairs, and put on less counter-air pressure than it could have. First it kept them with the army, then it flew them over hydras.

CoG 2023 results by map

The map pool is the same as last year: (2)Benzene, (2)Eclipse, (2)Match Point, (3)Neo Aztec, (3)Neo Sylphid, (3)Outsider, (4)Circuit Breakers, (4)Fighting Spirit, (4)Polypoid. It’s three maps of each size.

Outsider is difficult for bots to play well, because it has a ring of bases around the outside, blocked by mineral lines. It takes extra smarts to exploit the map features. Outsider was the weakest map overall for the top protoss bots—even they don’t seem to have the extra smarts. It was the strongest map for XiaoYi, the only terran. Can its tanks and drops harass across the impassable mineral lines?

The stronger bots, from Microwave on up, have relatively stable performance across maps. The lower bots show some dramatic swings. CUNYBot versus ProtossPleb sees its win rates vary from 22% to 92% depending on the map. It’s no doubt due to specific strengths and weaknesses of bots that don’t have well-rounded skills.
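
To put a number on the swing, here is a quick check over CUNYBot’s per-map wins against ProtossPleb, taken from the detailed table below (each map is 40 games):

```python
# Per-map wins for CUNYBot2021 vs ProtossPleb, from the CoG 2023
# detailed results; a one-liner shows the map-to-map swing.
maps = ["Benzene", "Eclipse", "Match Point", "Neo Aztec", "Neo Sylphid",
        "Outsider", "Circuit Breakers", "Fighting Spirit", "Polypoid"]
wins = [19, 9, 37, 24, 22, 10, 31, 30, 30]
rates = {m: w / 40 * 100 for m, w in zip(maps, wins)}
print(min(rates, key=rates.get), min(rates.values()))  # Eclipse 22.5
print(max(rates, key=rates.get), max(rates.values()))  # Match Point 92.5
```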

               overall   Benzene   Eclipse   MatchPt  NeoAztec  NeoSylph  Outsider  CircBrkr  FightSpt  Polypoid
#1 Stardust     91.31%       94%       92%       89%       92%       90%       88%       90%       90%       95%
#2 PurpleWave   80.87%       79%       83%       83%       80%       80%       79%       81%       81%       81%
#3 BananaBrain  75.52%       75%       75%       78%       78%       77%       73%       76%       74%       74%
#4 McRave2022   56.23%       60%       55%       58%       55%       56%       52%       55%       60%       55%
#5 MicroWave    42.30%       44%       44%       45%       41%       42%       44%       41%       39%       40%
#6 XIAOYI2019   35.75%       31%       32%       31%       35%       35%       41%       39%       37%       40%
#7 CUNYBot2021   9.56%        8%        5%       14%       10%        9%        6%       12%       12%       11%
#8 ProtossPleb   8.45%        9%       14%        2%        8%       10%       16%        5%        7%        5%

For reference, the overall map table from yesterday.

Stardust      overall   Benzene   Eclipse   MatchPt  NeoAztec  NeoSylph  Outsider  CircBrkr  FightSpt  Polypoid
PurpleWave    291/360     35/40     29/40     28/40     32/40     32/40     36/40     30/40     31/40     38/40
                  81%       88%       72%       70%       80%       80%       90%       75%       78%       95%
BananaBrain   278/360     31/40     33/40     27/40     31/40     29/40     29/40     32/40     31/40     35/40
                  77%       78%       82%       68%       78%       72%       72%       80%       78%       88%
McRave2022    348/360     38/40     40/40     40/40     38/40     39/40     40/40     36/40     37/40     40/40
                  97%       95%      100%      100%       95%       98%      100%       90%       92%      100%
MicroWave     328/360     39/40     37/40     35/40     39/40     38/40     32/40     34/40     37/40     37/40
                  91%       98%       92%       88%       98%       95%       80%       85%       92%       92%
XIAOYI2019    352/360     40/40     40/40     40/40     40/40     40/40     33/40     40/40     40/40     39/40
                  98%      100%      100%      100%      100%      100%       82%      100%      100%       98%
CUNYBot2021   358/360     40/40     40/40     40/40     39/40     39/40     40/40     40/40     40/40     40/40
                  99%      100%      100%      100%       98%       98%      100%      100%      100%      100%
ProtossPleb   346/360     40/40     39/40     40/40     40/40     36/40     37/40     40/40     37/40     37/40
                  96%      100%       98%      100%      100%       90%       92%      100%       92%       92%
overall        91.31%       94%       92%       89%       92%       90%       88%       90%       90%       95%

PurpleWave    overall   Benzene   Eclipse   MatchPt  NeoAztec  NeoSylph  Outsider  CircBrkr  FightSpt  Polypoid
Stardust       69/360      5/40     11/40     12/40      8/40      8/40      4/40     10/40      9/40      2/40
                  19%       12%       28%       30%       20%       20%       10%       25%       22%        5%
BananaBrain   245/360     30/40     31/40     27/40     26/40     23/40     28/40     25/40     25/40     30/40
                  68%       75%       78%       68%       65%       57%       70%       62%       62%       75%
McRave2022    348/360     38/40     38/40     39/40     38/40     38/40     40/40     38/40     39/40     40/40
                  97%       95%       95%       98%       95%       95%      100%       95%       98%      100%
MicroWave     319/360     30/40     34/40     36/40     34/40     39/40     36/40     37/40     37/40     36/40
                  89%       75%       85%       90%       85%       98%       90%       92%       92%       90%
XIAOYI2019    353/360     39/40     40/40     40/40     40/40     40/40     36/40     39/40     39/40     40/40
                  98%       98%      100%      100%      100%      100%       90%       98%       98%      100%
CUNYBot2021   347/360     39/40     39/40     40/40     39/40     38/40     37/40     39/40     38/40     38/40
                  96%       98%       98%      100%       98%       95%       92%       98%       95%       95%
ProtossPleb   357/360     40/40     40/40     39/40     39/40     39/40     40/40     40/40     40/40     40/40
                  99%      100%      100%       98%       98%       98%      100%      100%      100%      100%
overall        80.87%       79%       83%       83%       80%       80%       79%       81%       81%       81%

BananaBrain   overall   Benzene   Eclipse   MatchPt  NeoAztec  NeoSylph  Outsider  CircBrkr  FightSpt  Polypoid
Stardust       82/360      9/40      7/40     13/40      9/40     11/40     11/40      8/40      9/40      5/40
                  23%       22%       18%       32%       22%       28%       28%       20%       22%       12%
PurpleWave    115/360     10/40      9/40     13/40     14/40     17/40     12/40     15/40     15/40     10/40
                  32%       25%       22%       32%       35%       42%       30%       38%       38%       25%
McRave2022    293/360     32/40     36/40     36/40     37/40     34/40     24/40     32/40     28/40     34/40
                  81%       80%       90%       90%       92%       85%       60%       80%       70%       85%
MicroWave     333/360     39/40     37/40     37/40     39/40     34/40     37/40     38/40     35/40     37/40
                  92%       98%       92%       92%       98%       85%       92%       95%       88%       92%
XIAOYI2019    360/360     40/40     40/40     40/40     40/40     40/40     40/40     40/40     40/40     40/40
                 100%      100%      100%      100%      100%      100%      100%      100%      100%      100%
CUNYBot2021   360/360     40/40     40/40     40/40     40/40     40/40     40/40     40/40     40/40     40/40
                 100%      100%      100%      100%      100%      100%      100%      100%      100%      100%
ProtossPleb   360/360     40/40     40/40     40/40     40/40     40/40     40/40     40/40     40/40     40/40
                 100%      100%      100%      100%      100%      100%      100%      100%      100%      100%
overall        75.52%       75%       75%       78%       78%       77%       73%       76%       74%       74%

McRave2022    overall   Benzene   Eclipse   MatchPt  NeoAztec  NeoSylph  Outsider  CircBrkr  FightSpt  Polypoid
Stardust       12/360      2/40      0/40      0/40      2/40      1/40      0/40      4/40      3/40      0/40
                   3%        5%        0%        0%        5%        2%        0%       10%        8%        0%
PurpleWave     12/360      2/40      2/40      1/40      2/40      2/40      0/40      2/40      1/40      0/40
                   3%        5%        5%        2%        5%        5%        0%        5%        2%        0%
BananaBrain    67/360      8/40      4/40      4/40      3/40      6/40     16/40      8/40     12/40      6/40
                  19%       20%       10%       10%        8%       15%       40%       20%       30%       15%
MicroWave     351/360     39/40     40/40     39/40     39/40     40/40     40/40     38/40     40/40     36/40
                  98%       98%      100%       98%       98%      100%      100%       95%      100%       90%
XIAOYI2019    295/360     38/40     33/40     39/40     34/40     33/40     21/40     28/40     37/40     32/40
                  82%       95%       82%       98%       85%       82%       52%       70%       92%       80%
CUNYBot2021   360/360     40/40     40/40     40/40     40/40     40/40     40/40     40/40     40/40     40/40
                 100%      100%      100%      100%      100%      100%      100%      100%      100%      100%
ProtossPleb   320/360     39/40     35/40     39/40     34/40     36/40     29/40     34/40     34/40     40/40
                  89%       98%       88%       98%       85%       90%       72%       85%       85%      100%
overall        56.23%       60%       55%       58%       55%       56%       52%       55%       60%       55%

McRave loves Outsider against BananaBrain and hates it against XiaoYi and ProtossPleb. It averages out to mild dislike.

MicroWave     overall   Benzene   Eclipse   MatchPt  NeoAztec  NeoSylph  Outsider  CircBrkr  FightSpt  Polypoid
Stardust       32/360      1/40      3/40      5/40      1/40      2/40      8/40      6/40      3/40      3/40
                   9%        2%        8%       12%        2%        5%       20%       15%        8%        8%
PurpleWave     41/360     10/40      6/40      4/40      6/40      1/40      4/40      3/40      3/40      4/40
                  11%       25%       15%       10%       15%        2%       10%        8%        8%       10%
BananaBrain    27/360      1/40      3/40      3/40      1/40      6/40      3/40      2/40      5/40      3/40
                   8%        2%        8%        8%        2%       15%        8%        5%       12%        8%
McRave2022      9/360      1/40      0/40      1/40      1/40      0/40      0/40      2/40      0/40      4/40
                   2%        2%        0%        2%        2%        0%        0%        5%        0%       10%
XIAOYI2019    248/360     31/40     33/40     34/40     28/40     28/40     33/40     24/40     20/40     17/40
                  69%       78%       82%       85%       70%       70%       82%       60%       50%       42%
CUNYBot2021   352/360     38/40     40/40     40/40     39/40     40/40     37/40     39/40     39/40     40/40
                  98%       95%      100%      100%       98%      100%       92%       98%       98%      100%
ProtossPleb   357/360     40/40     39/40     40/40     40/40     40/40     39/40     40/40     39/40     40/40
                  99%      100%       98%      100%      100%      100%       98%      100%       98%      100%
overall        42.30%       44%       44%       45%       41%       42%       44%       41%       39%       40%

XIAOYI2019    overall   Benzene   Eclipse   MatchPt  NeoAztec  NeoSylph  Outsider  CircBrkr  FightSpt  Polypoid
Stardust        8/360      0/40      0/40      0/40      0/40      0/40      7/40      0/40      0/40      1/40
                   2%        0%        0%        0%        0%        0%       18%        0%        0%        2%
PurpleWave      7/360      1/40      0/40      0/40      0/40      0/40      4/40      1/40      1/40      0/40
                   2%        2%        0%        0%        0%        0%       10%        2%        2%        0%
BananaBrain     0/360      0/40      0/40      0/40      0/40      0/40      0/40      0/40      0/40      0/40
                   0%        0%        0%        0%        0%        0%        0%        0%        0%        0%
McRave2022     65/360      2/40      7/40      1/40      6/40      7/40     19/40     12/40      3/40      8/40
                  18%        5%       18%        2%       15%       18%       48%       30%        8%       20%
MicroWave     112/360      9/40      7/40      6/40     12/40     12/40      7/40     16/40     20/40     23/40
                  31%       22%       18%       15%       30%       30%       18%       40%       50%       57%
CUNYBot2021   354/360     40/40     36/40     39/40     39/40     40/40     40/40     40/40     40/40     40/40
                  98%      100%       90%       98%       98%      100%      100%      100%      100%      100%
ProtossPleb   355/360     36/40     40/40     40/40     40/40     40/40     39/40     40/40     40/40     40/40
                  99%       90%      100%      100%      100%      100%       98%      100%      100%      100%
overall        35.75%       31%       32%       31%       35%       35%       41%       39%       37%       40%

CUNYBot2021   overall   Benzene   Eclipse   MatchPt  NeoAztec  NeoSylph  Outsider  CircBrkr  FightSpt  Polypoid
Stardust        2/360      0/40      0/40      0/40      1/40      1/40      0/40      0/40      0/40      0/40
                   1%        0%        0%        0%        2%        2%        0%        0%        0%        0%
PurpleWave     13/360      1/40      1/40      0/40      1/40      2/40      3/40      1/40      2/40      2/40
                   4%        2%        2%        0%        2%        5%        8%        2%        5%        5%
BananaBrain     0/360      0/40      0/40      0/40      0/40      0/40      0/40      0/40      0/40      0/40
                   0%        0%        0%        0%        0%        0%        0%        0%        0%        0%
McRave2022      0/360      0/40      0/40      0/40      0/40      0/40      0/40      0/40      0/40      0/40
                   0%        0%        0%        0%        0%        0%        0%        0%        0%        0%
MicroWave       8/360      2/40      0/40      0/40      1/40      0/40      3/40      1/40      1/40      0/40
                   2%        5%        0%        0%        2%        0%        8%        2%        2%        0%
XIAOYI2019      6/360      0/40      4/40      1/40      1/40      0/40      0/40      0/40      0/40      0/40
                   2%        0%       10%        2%        2%        0%        0%        0%        0%        0%
ProtossPleb   212/360     19/40      9/40     37/40     24/40     22/40     10/40     31/40     30/40     30/40
                  59%       48%       22%       92%       60%       55%       25%       78%       75%       75%
overall         9.56%        8%        5%       14%       10%        9%        6%       12%       12%       11%

ProtossPleb   overall   Benzene   Eclipse   MatchPt  NeoAztec  NeoSylph  Outsider  CircBrkr  FightSpt  Polypoid
Stardust       14/360      0/40      1/40      0/40      0/40      4/40      3/40      0/40      3/40      3/40
                   4%        0%        2%        0%        0%       10%        8%        0%        8%        8%
PurpleWave      3/360      0/40      0/40      1/40      1/40      1/40      0/40      0/40      0/40      0/40
                   1%        0%        0%        2%        2%        2%        0%        0%        0%        0%
BananaBrain     0/360      0/40      0/40      0/40      0/40      0/40      0/40      0/40      0/40      0/40
                   0%        0%        0%        0%        0%        0%        0%        0%        0%        0%
McRave2022     40/360      1/40      5/40      1/40      6/40      4/40     11/40      6/40      6/40      0/40
                  11%        2%       12%        2%       15%       10%       28%       15%       15%        0%
MicroWave       3/360      0/40      1/40      0/40      0/40      0/40      1/40      0/40      1/40      0/40
                   1%        0%        2%        0%        0%        0%        2%        0%        2%        0%
XIAOYI2019      5/360      4/40      0/40      0/40      0/40      0/40      1/40      0/40      0/40      0/40
                   1%       10%        0%        0%        0%        0%        2%        0%        0%        0%
CUNYBot2021   148/360     21/40     31/40      3/40     16/40     18/40     30/40      9/40     10/40     10/40
                  41%       52%       78%        8%       40%       45%       75%       22%       25%       25%
overall         8.45%        9%       14%        2%        8%       10%       16%        5%        7%        5%

CoG 2023 first look

In an astonishing break with tradition, my analysis program’s results exactly matched the official results on the first try. There was no need to cope with arbitrary changes to the format of the results file, or search for special cases that were handled differently this year, or exclude broken bots. How did they do it?????

Except for the naming issue, everything seems to have run cleanly, as far as I can tell from here. New bot ProtossPleb crashed 30% of the time, but the crash rate for the others was flat zero. There were 22 frame timeouts total, a negligible number. All bots played all games, which suggests that they silently replayed games which did not start. 360 rounds were played, fewer than the 450 last year.

Well, there is one error. The HTML version of the detailed results links to individual game replay files, but I can’t download them. The web server gives a 400 Bad Request error, not what you expect if the file is simply missing. I imagine the replay files are meant to be served through a relay behind the scenes, and it’s not set up properly.

             overall  Star  Purp  Bana  McRa  Micr  XIAO  CUNY  Prot
Stardust      91.31%     -   81%   77%   97%   91%   98%   99%   96%
PurpleWave    80.87%   19%     -   68%   97%   89%   98%   96%   99%
BananaBrain   75.52%   23%   32%     -   81%   92%  100%  100%  100%
McRave2022    56.23%    3%    3%   19%     -   98%   82%  100%   89%
MicroWave     42.30%    9%   11%    8%    2%     -   69%   98%   99%
XIAOYI2019    35.75%    2%    2%    0%   18%   31%     -   98%   99%
CUNYBot2021    9.56%    1%    4%    0%    0%    2%    2%     -   59%
ProtossPleb    8.45%    4%    1%    0%   11%    1%    1%   41%     -

Like last year, the crosstable is very smooth, with no upsets. In fact, it’s even smoother; the palest cell off the diagonal is BananaBrain-Stardust, where BananaBrain scored only 23%. The ranking is as clean as can be. The closest thing to an upset was that ProtossPleb was able to fight CUNYBot nearly even.
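
One way to make “no upsets” precise: call it an upset when a lower-ranked bot scores over 50% against a higher-ranked one. A quick check over the crosstable’s lower triangle (percentages as rounded above) finds none:

```python
# Count upsets in the CoG 2023 crosstable: lower_triangle[(lo, hi)] is
# the win percentage of lower-ranked bot `lo` against higher-ranked `hi`.
lower_triangle = {
    ("PurpleWave", "Stardust"): 19,
    ("BananaBrain", "Stardust"): 23, ("BananaBrain", "PurpleWave"): 32,
    ("McRave2022", "Stardust"): 3, ("McRave2022", "PurpleWave"): 3,
    ("McRave2022", "BananaBrain"): 19,
    ("MicroWave", "Stardust"): 9, ("MicroWave", "PurpleWave"): 11,
    ("MicroWave", "BananaBrain"): 8, ("MicroWave", "McRave2022"): 2,
    ("XIAOYI2019", "Stardust"): 2, ("XIAOYI2019", "PurpleWave"): 2,
    ("XIAOYI2019", "BananaBrain"): 0, ("XIAOYI2019", "McRave2022"): 18,
    ("XIAOYI2019", "MicroWave"): 31,
    ("CUNYBot2021", "Stardust"): 1, ("CUNYBot2021", "PurpleWave"): 4,
    ("CUNYBot2021", "BananaBrain"): 0, ("CUNYBot2021", "McRave2022"): 0,
    ("CUNYBot2021", "MicroWave"): 2, ("CUNYBot2021", "XIAOYI2019"): 2,
    ("ProtossPleb", "Stardust"): 4, ("ProtossPleb", "PurpleWave"): 1,
    ("ProtossPleb", "BananaBrain"): 0, ("ProtossPleb", "McRave2022"): 11,
    ("ProtossPleb", "MicroWave"): 1, ("ProtossPleb", "XIAOYI2019"): 1,
    ("ProtossPleb", "CUNYBot2021"): 41,
}

upsets = [(lo, hi, pct) for (lo, hi), pct in lower_triangle.items() if pct > 50]
closest = max(lower_triangle.items(), key=lambda kv: kv[1])
print(len(upsets))  # 0: no lower-ranked bot won a matchup
print(closest)      # (('ProtossPleb', 'CUNYBot2021'), 41)
```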

               overall   Benzene   Eclipse   MatchPt  NeoAztec  NeoSylph  Outsider  CircBrkr  FightSpt  Polypoid
#1 Stardust     91.31%       94%       92%       89%       92%       90%       88%       90%       90%       95%
#2 PurpleWave   80.87%       79%       83%       83%       80%       80%       79%       81%       81%       81%
#3 BananaBrain  75.52%       75%       75%       78%       78%       77%       73%       76%       74%       74%
#4 McRave2022   56.23%       60%       55%       58%       55%       56%       52%       55%       60%       55%
#5 MicroWave    42.30%       44%       44%       45%       41%       42%       44%       41%       39%       40%
#6 XIAOYI2019   35.75%       31%       32%       31%       35%       35%       41%       39%       37%       40%
#7 CUNYBot2021   9.56%        8%        5%       14%       10%        9%        6%       12%       12%       11%
#8 ProtossPleb   8.45%        9%       14%        2%        8%       10%       16%        5%        7%        5%

The map table is smooth too. Averaged across opponents, the maps made little difference. That’s never true against specific opponents; I’ll post that data tomorrow.

race      score
terran      36%
protoss     64%
zerg        36%

I like to include the race statistics, though all they say is “Yeah, protoss is still ahead, and there is still only one terran.”

bot          race     overall    vT   vP   vZ
Stardust     protoss   91.31%   98%  85%  96%
PurpleWave   protoss   80.87%   98%  62%  94%
BananaBrain  protoss   75.52%  100%  52%  91%
McRave2022   zerg      56.23%   82%  29%  99%
MicroWave    zerg      42.30%   69%  32%  50%
XIAOYI2019   terran    35.75%     -  26%  49%
CUNYBot2021  zerg       9.56%    2%  16%   1%
ProtossPleb  protoss    8.45%    1%   2%  18%

Next: The map tables.

CoG first-try results posted

Purple Dan reported to me by e-mail. Thanks!

1. COG operators presented results at the conference and posted the summary results afterwards
2. Folks noticed that bots were run with incorrect names, which would break pretraining/preparation
3. COG silently removed the results, hoping to re-run it
4. After discussing it with bot authors, they have posted the full detailed results while authors decide if there was any likely impact of the error on the results

Microwave and BananaBrain seem to be the possibly-injured bots here, though it seems unlikely to me that anything would materially change.

Sure enough, the first-try detailed results are posted and I have grabbed ’em.

They ran the carryover bots with the year that they were carried over from appended to their names. For example, McRave was carried over from last year, and they ran it under the name McRave2022. In the posted registration list, it was listed simply as McRave. So any bot that was specially prepared to face McRave by name wasn’t prepared after all. That can potentially make a big difference, but since only carryover bots were renamed, it may not.

I think it’s defensible as a policy. The carryovers are by definition not updated, and an updated bot’s ability to prepare for them can be seen as unfair—it’s much easier to hit a target that does not move. Special prep against one opponent may have no effect on play against others, so that the tournament measures progress where there is none. But it’s not a policy to spring on participants by surprise. If you’re going to do it that way, announce it in advance. Don’t make authors waste time.

Anyway, we’ll have to wait and see if there’s a decision to re-run.

Next: I’ll do a quick-look analysis of what we’ve got.

Update: Dan Gant points out to me that Microwave was also misspelled as MicroWave. Ouch.