
CoG 2022 results by map

I excluded BetaStar from the analysis, since it contributed no information. Everybody else’s winning rate is reduced compared to the official results, because they don’t have all the meaningless free wins.
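As a quick check on how the exclusion shifts a win rate, here is a minimal sketch using BananaBrain's numbers. The first six win counts are taken from BananaBrain's per-map table in this post; the 450 wins against BetaStar are an assumption inferred from the 100% entry in the official crosstable. The names `bananabrain_wins` and `win_rate` are made up for illustration.

```python
# Wins out of 450 games for BananaBrain against each opponent.
# The BetaStar count is an assumption inferred from the 100% crosstable entry.
GAMES_PER_PAIRING = 450
bananabrain_wins = {
    "PurpleWave": 357,
    "Stardust": 269,
    "McRave": 312,
    "Microwave": 404,
    "XIAOYI": 450,
    "CUNYBot": 448,
    "BetaStar": 450,
}


def win_rate(wins_by_opponent, exclude=()):
    """Pooled win rate over all opponents not listed in `exclude`."""
    kept = [w for opp, w in wins_by_opponent.items() if opp not in exclude]
    return sum(kept) / (GAMES_PER_PAIRING * len(kept))


official = win_rate(bananabrain_wins)
without_betastar = win_rate(bananabrain_wins, exclude=("BetaStar",))
print(f"{official:.2%} {without_betastar:.2%}")  # 85.40% 82.96%
```

Dropping 450 guaranteed wins removes them from both the numerator and the denominator, which is why every bot's rate falls.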

The map pool is: (2)Benzene, (2)Eclipse, (2)Match Point, (3)Neo Aztec, (3)Neo Sylphid, (3)Outsider, (4)Circuit Breakers, (4)Fighting Spirit, (4)Polypoid. That’s three maps of each size.

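| bot | overall | Benzene | Eclipse | Match Point | Neo Aztec | Neo Sylphid | Outsider | Circuit Breakers | Fighting Spirit | Polypoid |
|---|---|---|---|---|---|---|---|---|---|---|
| #1 BananaBrain | 82.96% | 77% | 87% | 86% | 88% | 86% | 83% | 81% | 77% | 80% |
| #2 PurpleWave | 71.11% | 69% | 73% | 70% | 65% | 72% | 72% | 71% | 75% | 73% |
| #3 Stardust | 68.89% | 76% | 65% | 65% | 65% | 65% | 70% | 74% | 67% | 72% |
| #4 McRave | 63.37% | 68% | 64% | 68% | 64% | 62% | 57% | 58% | 67% | 61% |
| #5 Microwave | 37.63% | 38% | 39% | 40% | 40% | 38% | 40% | 37% | 32% | 34% |
| #6 XIAOYI | 24.63% | 21% | 20% | 21% | 26% | 25% | 27% | 28% | 28% | 26% |
| #7 CUNYBot | 1.41% | 1% | 3% | 0% | 2% | 1% | 1% | 1% | 3% | 2% |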

Results are nice and even for every bot. Averaged out over all opponents, the map did not make a great difference for any bot.

Against specific opponents, the map did matter. I included the count of wins/games in each cell so you can see how much data there is. CoG ran far more games this year than last, so it’s much easier to trust the results at the level of individual cells in the tables.
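To get a feel for how far a 50-game cell can be trusted, one option is a Wilson score interval on the win rate. This is a sketch of a standard statistical technique, not something the original analysis used; the function name `wilson_interval` and the 95% level (`z=1.96`) are choices made here for illustration. The two cells compared are BananaBrain versus Stardust on Benzene (16/50) and on Eclipse (44/50).

```python
import math


def wilson_interval(wins, games, z=1.96):
    """Wilson score interval for a win rate; z=1.96 gives roughly 95% coverage."""
    p = wins / games
    denom = 1 + z * z / games
    center = (p + z * z / (2 * games)) / denom
    half_width = z * math.sqrt(p * (1 - p) / games + z * z / (4 * games * games)) / denom
    return center - half_width, center + half_width


benzene = wilson_interval(16, 50)  # BananaBrain vs Stardust on Benzene
eclipse = wilson_interval(44, 50)  # BananaBrain vs Stardust on Eclipse
print(benzene, eclipse)
# The intervals do not overlap, so this map difference is unlikely to be noise.
print(benzene[1] < eclipse[0])
```

With only 50 games the interval is still wide, more than ten percentage points on either side of a mid-range rate, so small differences between cells should not be over-read.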

| BananaBrain | overall | Benzene | Eclipse | Match Point | Neo Aztec | Neo Sylphid | Outsider | Circuit Breakers | Fighting Spirit | Polypoid |
|---|---|---|---|---|---|---|---|---|---|---|
| PurpleWave | 357/450 79% | 35/50 70% | 42/50 84% | 43/50 86% | 47/50 94% | 44/50 88% | 44/50 88% | 31/50 62% | 35/50 70% | 36/50 72% |
| Stardust | 269/450 60% | 16/50 32% | 44/50 88% | 34/50 68% | 33/50 66% | 32/50 64% | 27/50 54% | 37/50 74% | 22/50 44% | 24/50 48% |
| McRave | 312/450 69% | 37/50 74% | 33/50 66% | 35/50 70% | 39/50 78% | 41/50 82% | 29/50 58% | 34/50 68% | 29/50 58% | 35/50 70% |
| Microwave | 404/450 90% | 43/50 86% | 42/50 84% | 47/50 94% | 45/50 90% | 42/50 84% | 49/50 98% | 42/50 84% | 48/50 96% | 46/50 92% |
| XIAOYI | 450/450 100% | 50/50 100% | 50/50 100% | 50/50 100% | 50/50 100% | 50/50 100% | 50/50 100% | 50/50 100% | 50/50 100% | 50/50 100% |
| CUNYBot | 448/450 100% | 50/50 100% | 50/50 100% | 50/50 100% | 50/50 100% | 50/50 100% | 50/50 100% | 50/50 100% | 48/50 96% | 50/50 100% |
| overall | 82.96% | 77% | 87% | 86% | 88% | 86% | 83% | 81% | 77% | 80% |

BananaBrain’s results against Stardust depend strongly on the map, ranging from 32% on Benzene to 88% on Eclipse.

| PurpleWave | overall | Benzene | Eclipse | Match Point | Neo Aztec | Neo Sylphid | Outsider | Circuit Breakers | Fighting Spirit | Polypoid |
|---|---|---|---|---|---|---|---|---|---|---|
| BananaBrain | 93/450 21% | 15/50 30% | 8/50 16% | 7/50 14% | 3/50 6% | 6/50 12% | 6/50 12% | 19/50 38% | 15/50 30% | 14/50 28% |
| Stardust | 376/450 84% | 31/50 62% | 47/50 94% | 40/50 80% | 47/50 94% | 46/50 92% | 48/50 96% | 31/50 62% | 44/50 88% | 42/50 84% |
| McRave | 247/450 55% | 20/50 40% | 23/50 46% | 26/50 52% | 26/50 52% | 34/50 68% | 35/50 70% | 25/50 50% | 31/50 62% | 27/50 54% |
| Microwave | 331/450 74% | 42/50 84% | 44/50 88% | 41/50 82% | 24/50 48% | 31/50 62% | 29/50 58% | 40/50 80% | 40/50 80% | 40/50 80% |
| XIAOYI | 438/450 97% | 49/50 98% | 50/50 100% | 45/50 90% | 49/50 98% | 50/50 100% | 49/50 98% | 48/50 96% | 49/50 98% | 49/50 98% |
| CUNYBot | 435/450 97% | 49/50 98% | 47/50 94% | 50/50 100% | 47/50 94% | 49/50 98% | 49/50 98% | 49/50 98% | 47/50 94% | 48/50 96% |
| overall | 71.11% | 69% | 73% | 70% | 65% | 72% | 72% | 71% | 75% | 73% |

PurpleWave’s results against Stardust also vary strongly by map. The cause is presumably Stardust: In PvP games, its strategy is much better suited for some maps than others. But BananaBrain and PurpleWave differ in their ability to exploit the weaknesses.

| Stardust | overall | Benzene | Eclipse | Match Point | Neo Aztec | Neo Sylphid | Outsider | Circuit Breakers | Fighting Spirit | Polypoid |
|---|---|---|---|---|---|---|---|---|---|---|
| BananaBrain | 181/450 40% | 34/50 68% | 6/50 12% | 16/50 32% | 17/50 34% | 18/50 36% | 23/50 46% | 13/50 26% | 28/50 56% | 26/50 52% |
| PurpleWave | 74/450 16% | 19/50 38% | 3/50 6% | 10/50 20% | 3/50 6% | 4/50 8% | 2/50 4% | 19/50 38% | 6/50 12% | 8/50 16% |
| McRave | 311/450 69% | 25/50 50% | 43/50 86% | 29/50 58% | 35/50 70% | 29/50 58% | 45/50 90% | 41/50 82% | 27/50 54% | 37/50 74% |
| Microwave | 404/450 90% | 50/50 100% | 43/50 86% | 41/50 82% | 44/50 88% | 47/50 94% | 41/50 82% | 48/50 96% | 44/50 88% | 46/50 92% |
| XIAOYI | 442/450 98% | 50/50 100% | 50/50 100% | 50/50 100% | 47/50 94% | 48/50 96% | 50/50 100% | 50/50 100% | 47/50 94% | 50/50 100% |
| CUNYBot | 448/450 100% | 50/50 100% | 50/50 100% | 50/50 100% | 49/50 98% | 49/50 98% | 50/50 100% | 50/50 100% | 50/50 100% | 50/50 100% |
| overall | 68.89% | 76% | 65% | 65% | 65% | 65% | 70% | 74% | 67% | 72% |

And again, Stardust varies strongly versus McRave with its mutalisks. It may have to do with hard-to-see details like Stardust’s building placement on each map, or the timing of Stardust’s fixed strategy. The remaining opponents are not strong enough to stress Stardust, so results don’t vary as much.

| McRave | overall | Benzene | Eclipse | Match Point | Neo Aztec | Neo Sylphid | Outsider | Circuit Breakers | Fighting Spirit | Polypoid |
|---|---|---|---|---|---|---|---|---|---|---|
| BananaBrain | 138/450 31% | 13/50 26% | 17/50 34% | 15/50 30% | 11/50 22% | 9/50 18% | 21/50 42% | 16/50 32% | 21/50 42% | 15/50 30% |
| PurpleWave | 203/450 45% | 30/50 60% | 27/50 54% | 24/50 48% | 24/50 48% | 16/50 32% | 15/50 30% | 25/50 50% | 19/50 38% | 23/50 46% |
| Stardust | 139/450 31% | 25/50 50% | 7/50 14% | 21/50 42% | 15/50 30% | 21/50 42% | 5/50 10% | 9/50 18% | 23/50 46% | 13/50 26% |
| Microwave | 418/450 93% | 45/50 90% | 49/50 98% | 45/50 90% | 48/50 96% | 48/50 96% | 50/50 100% | 44/50 88% | 45/50 90% | 44/50 88% |
| XIAOYI | 364/450 81% | 41/50 82% | 41/50 82% | 49/50 98% | 45/50 90% | 43/50 86% | 31/50 62% | 32/50 64% | 44/50 88% | 38/50 76% |
| CUNYBot | 449/450 100% | 50/50 100% | 50/50 100% | 50/50 100% | 50/50 100% | 50/50 100% | 50/50 100% | 49/50 98% | 50/50 100% | 50/50 100% |
| overall | 63.37% | 68% | 64% | 68% | 64% | 62% | 57% | 58% | 67% | 61% |

| Microwave | overall | Benzene | Eclipse | Match Point | Neo Aztec | Neo Sylphid | Outsider | Circuit Breakers | Fighting Spirit | Polypoid |
|---|---|---|---|---|---|---|---|---|---|---|
| BananaBrain | 46/450 10% | 7/50 14% | 8/50 16% | 3/50 6% | 5/50 10% | 8/50 16% | 1/50 2% | 8/50 16% | 2/50 4% | 4/50 8% |
| PurpleWave | 119/450 26% | 8/50 16% | 6/50 12% | 9/50 18% | 26/50 52% | 19/50 38% | 21/50 42% | 10/50 20% | 10/50 20% | 10/50 20% |
| Stardust | 46/450 10% | 0/50 0% | 7/50 14% | 9/50 18% | 6/50 12% | 3/50 6% | 9/50 18% | 2/50 4% | 6/50 12% | 4/50 8% |
| McRave | 32/450 7% | 5/50 10% | 1/50 2% | 5/50 10% | 2/50 4% | 2/50 4% | 0/50 0% | 6/50 12% | 5/50 10% | 6/50 12% |
| XIAOYI | 332/450 74% | 45/50 90% | 45/50 90% | 44/50 88% | 31/50 62% | 33/50 66% | 39/50 78% | 36/50 72% | 26/50 52% | 33/50 66% |
| CUNYBot | 441/450 98% | 50/50 100% | 49/50 98% | 50/50 100% | 50/50 100% | 50/50 100% | 49/50 98% | 50/50 100% | 47/50 94% | 46/50 92% |
| overall | 37.63% | 38% | 39% | 40% | 40% | 38% | 40% | 37% | 32% | 34% |

Ooh, Microwave upset PurpleWave only on the three-player map Neo Aztec. Compare the Match Point column with the Neo Aztec column: They have the same overall win rate, but Microwave played Match Point consistently against each opponent, and played Neo Aztec sometimes better and sometimes worse. It must mean something!

| XIAOYI | overall | Benzene | Eclipse | Match Point | Neo Aztec | Neo Sylphid | Outsider | Circuit Breakers | Fighting Spirit | Polypoid |
|---|---|---|---|---|---|---|---|---|---|---|
| BananaBrain | 0/450 0% | 0/50 0% | 0/50 0% | 0/50 0% | 0/50 0% | 0/50 0% | 0/50 0% | 0/50 0% | 0/50 0% | 0/50 0% |
| PurpleWave | 12/450 3% | 1/50 2% | 0/50 0% | 5/50 10% | 1/50 2% | 0/50 0% | 1/50 2% | 2/50 4% | 1/50 2% | 1/50 2% |
| Stardust | 8/450 2% | 0/50 0% | 0/50 0% | 0/50 0% | 3/50 6% | 2/50 4% | 0/50 0% | 0/50 0% | 3/50 6% | 0/50 0% |
| McRave | 86/450 19% | 9/50 18% | 9/50 18% | 1/50 2% | 5/50 10% | 7/50 14% | 19/50 38% | 18/50 36% | 6/50 12% | 12/50 24% |
| Microwave | 118/450 26% | 5/50 10% | 5/50 10% | 6/50 12% | 19/50 38% | 17/50 34% | 11/50 22% | 14/50 28% | 24/50 48% | 17/50 34% |
| CUNYBot | 441/450 98% | 49/50 98% | 45/50 90% | 50/50 100% | 49/50 98% | 50/50 100% | 50/50 100% | 50/50 100% | 49/50 98% | 49/50 98% |
| overall | 24.63% | 21% | 20% | 21% | 26% | 25% | 27% | 28% | 28% | 26% |

| CUNYBot | overall | Benzene | Eclipse | Match Point | Neo Aztec | Neo Sylphid | Outsider | Circuit Breakers | Fighting Spirit | Polypoid |
|---|---|---|---|---|---|---|---|---|---|---|
| BananaBrain | 2/450 0% | 0/50 0% | 0/50 0% | 0/50 0% | 0/50 0% | 0/50 0% | 0/50 0% | 0/50 0% | 2/50 4% | 0/50 0% |
| PurpleWave | 15/450 3% | 1/50 2% | 3/50 6% | 0/50 0% | 3/50 6% | 1/50 2% | 1/50 2% | 1/50 2% | 3/50 6% | 2/50 4% |
| Stardust | 2/450 0% | 0/50 0% | 0/50 0% | 0/50 0% | 1/50 2% | 1/50 2% | 0/50 0% | 0/50 0% | 0/50 0% | 0/50 0% |
| McRave | 1/450 0% | 0/50 0% | 0/50 0% | 0/50 0% | 0/50 0% | 0/50 0% | 0/50 0% | 1/50 2% | 0/50 0% | 0/50 0% |
| Microwave | 9/450 2% | 0/50 0% | 1/50 2% | 0/50 0% | 0/50 0% | 0/50 0% | 1/50 2% | 0/50 0% | 3/50 6% | 4/50 8% |
| XIAOYI | 9/450 2% | 1/50 2% | 5/50 10% | 0/50 0% | 1/50 2% | 0/50 0% | 0/50 0% | 0/50 0% | 1/50 2% | 1/50 2% |
| overall | 1.41% | 1% | 3% | 0% | 2% | 1% | 1% | 1% | 3% | 2% |

CoG 2022 results first look

As Dan Gant let me know, CoG 2022 results are out today, complete with the detailed results file. The participants are the same as last year, except that MetaBot was dropped for unreliability that affected the running of the tournament. The carryovers from last year are #6 XiaoYi, #7 CUNYBot, and #8 BetaStar. The others are updated for this year.

My version of the crosstable.

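| bot | overall | BananaBrain | PurpleWave | Stardust | McRave | Microwave | XIAOYI | CUNYBot | BetaStar |
|---|---|---|---|---|---|---|---|---|---|
| #1 BananaBrain | 85.40% | - | 79% | 60% | 69% | 90% | 100% | 100% | 100% |
| #2 PurpleWave | 75.24% | 21% | - | 84% | 55% | 74% | 97% | 97% | 100% |
| #3 Stardust | 73.02% | 40% | 16% | - | 69% | 90% | 98% | 100% | 98% |
| #4 McRave | 68.60% | 31% | 45% | 31% | - | 93% | 81% | 100% | 100% |
| #5 Microwave | 46.54% | 10% | 26% | 10% | 7% | - | 74% | 98% | 100% |
| #6 XIAOYI | 35.40% | 0% | 3% | 2% | 19% | 26% | - | 98% | 100% |
| #7 CUNYBot | 15.49% | 0% | 3% | 0% | 0% | 2% | 2% | - | 100% |
| #8 BetaStar | 0.32% | 0% | 0% | 2% | 0% | 0% | 0% | 0% | - |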

There are surprises throughout, from top to bottom.

Stardust’s reign is over for the moment. Last year, Stardust scored over 90% in CoG and over 95% in AIIDE, crushing the competition. This time, #1 BananaBrain dominated with 85%, and #2 PurpleWave edged out #3 Stardust. The official results show that Stardust had 67 crashes and 7 frame timeouts in 3150 games. If Stardust had the same number of crashes (zero) and frame timeouts (1) as the two bots above it, it would have finished second by a razor-thin margin.
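The "razor-thin margin" can be reproduced with a rough calculation. This is only a sketch under two loud assumptions that the post does not state: every game Stardust lost to an excess crash or timeout would instead have been a Stardust win, and PurpleWave's own total would not have changed. The win totals are back-computed from the overall percentages over 3150 games, and the variable names are illustrative.

```python
TOTAL_GAMES = 3150

# Win totals back-computed from the overall percentages (75.24% and 73.02%).
purplewave_wins = round(0.7524 * TOTAL_GAMES)  # 2370
stardust_wins = round(0.7302 * TOTAL_GAMES)    # 2300

stardust_crashes, stardust_timeouts = 67, 7
baseline_crashes, baseline_timeouts = 0, 1  # the two bots above it

# ASSUMPTION: each excess crash or timeout flips from a loss to a win.
recovered = (stardust_crashes - baseline_crashes) + (stardust_timeouts - baseline_timeouts)
adjusted_stardust_wins = stardust_wins + recovered

print(recovered, adjusted_stardust_wins, purplewave_wins)  # 73 2373 2370
print(f"{adjusted_stardust_wins / TOTAL_GAMES:.2%}")       # 75.33%
```

Under these assumptions the gap is three games out of 3150. If any of the recovered games were against PurpleWave, PurpleWave's total would drop too, so the real margin could be somewhat larger.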

There is not a single upset, where a lower-ranked bot defeated a higher-ranked bot. The crosstable is very orderly. The lowest winning rate of a higher-ranked bot is 55% for #2 PurpleWave over #4 McRave.

Something went wrong with BetaStar. It is a strong bot and finished well ahead of CUNYBot last year, when it scored 40 wins out of 50 games head to head against CUNYBot. This year it scored 10 wins total against all opposition, and all of those wins were against Stardust and likely due to Stardust’s crashes. What went wrong? Did the new and improved map pool break it? Was there a rule change that it could not cope with?

race results

I made two versions of each table. The first includes all results, the second excludes BetaStar.

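All results:

| race | score |
|---|---|
| terran | 35% |
| protoss | 58% |
| zerg | 44% |

Excluding BetaStar:

| race | score |
|---|---|
| terran | 25% |
| protoss | 74% |
| zerg | 34% |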

It’s not very informative, but I like to include it anyway. There was only one terran; we need more. Protoss dominated, as usual in recent years, even when including BetaStar’s debacle.
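The race scores are consistent with simply averaging the overall win rates of the bots of each race, mirror matches included. That is an observation about the numbers, not a statement of how the tables were actually produced. A minimal sketch, with `bots` and `race_scores` as illustrative names:

```python
from collections import defaultdict

# bot: (race, overall % with all results, overall % excluding BetaStar)
# None marks BetaStar, which has no entry in the second version.
bots = {
    "BananaBrain": ("protoss", 85.40, 82.96),
    "PurpleWave": ("protoss", 75.24, 71.11),
    "Stardust": ("protoss", 73.02, 68.89),
    "McRave": ("zerg", 68.60, 63.37),
    "Microwave": ("zerg", 46.54, 37.63),
    "XIAOYI": ("terran", 35.40, 24.63),
    "CUNYBot": ("zerg", 15.49, 1.41),
    "BetaStar": ("protoss", 0.32, None),
}


def race_scores(column):
    """Average overall win rate per race; column 0 = all results, 1 = no BetaStar."""
    by_race = defaultdict(list)
    for race, *rates in bots.values():
        if rates[column] is not None:
            by_race[race].append(rates[column])
    return {race: round(sum(v) / len(v)) for race, v in by_race.items()}


print(race_scores(0))  # {'protoss': 58, 'zerg': 44, 'terran': 35}
print(race_scores(1))  # {'protoss': 74, 'zerg': 34, 'terran': 25}
```

Because every bot played the same number of games, averaging the percentages is the same as pooling all the games of a race.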

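All results:

| bot | race | overall | vT | vP | vZ |
|---|---|---|---|---|---|
| BananaBrain | protoss | 85.40% | 100% | 80% | 86% |
| PurpleWave | protoss | 75.24% | 97% | 68% | 75% |
| Stardust | protoss | 73.02% | 98% | 51% | 86% |
| McRave | zerg | 68.60% | 81% | 52% | 96% |
| Microwave | zerg | 46.54% | 74% | 37% | 53% |
| XIAOYI | terran | 35.40% | - | 26% | 48% |
| CUNYBot | zerg | 15.49% | 2% | 26% | 1% |
| BetaStar | protoss | 0.32% | 0% | 1% | 0% |

Excluding BetaStar:

| bot | race | overall | vT | vP | vZ |
|---|---|---|---|---|---|
| BananaBrain | protoss | 82.96% | 100% | 70% | 86% |
| PurpleWave | protoss | 71.11% | 97% | 52% | 75% |
| Stardust | protoss | 68.89% | 98% | 28% | 86% |
| McRave | zerg | 63.37% | 81% | 36% | 96% |
| Microwave | zerg | 37.63% | 74% | 16% | 53% |
| XIAOYI | terran | 24.63% | - | 1% | 48% |
| CUNYBot | zerg | 1.41% | 2% | 1% | 1% |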

Again, not very informative with so few participants. Excluding BetaStar clarifies that CUNYbot was outclassed. XiaoYi was also outclassed by the remaining protoss, and was only able to fight against the zergs.

the surprisingly poor results

Stardust’s crash rate surprises me. It does not have a crashing problem on BASIL. There was something in the tournament environment that it was not ready for. I can’t guess whether that’s more due to Stardust, or more due to the tournament.

BetaStar essentially scored zero and added no information to the tournament results. To me it suggests that the tournament environment changed somehow (we know that at least the map pool changed), and the organizers did not test the carryover bots to make sure they still worked.

losing with valkyries

A number of stronger terran bots make valkyries against Steamhammer. Not one of the terrans knows how to use them well. Steamhammer trades better against valkyries in nearly every game where they show up.

Valkyries are specialist units. Valkyries in small groups, when kept safe, can devastate mutalisks (or wraiths). A valkyrie wandering around on its own can’t devastate anything except one pair of scourge. Zerg will happily trade a pair of scourge costing 25/75 for a 250/125 valk. The valkyrie can run away from scourge, but only if it does not shoot. When the valk fires, it stops dead in the air!
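The trade arithmetic is easy to spell out. This sketch only restates the costs quoted above (25/75 for a pair of scourge, 250/125 for a valkyrie); the `Cost` type and `trade_ratio` helper are hypothetical names for illustration.

```python
from typing import NamedTuple


class Cost(NamedTuple):
    minerals: int
    gas: int


SCOURGE_PAIR = Cost(minerals=25, gas=75)
VALKYRIE = Cost(minerals=250, gas=125)


def trade_ratio(spent: Cost, destroyed: Cost):
    """How many times more minerals and gas the victim lost than the attacker spent."""
    return destroyed.minerals / spent.minerals, destroyed.gas / spent.gas


mineral_ratio, gas_ratio = trade_ratio(SCOURGE_PAIR, VALKYRIE)
print(mineral_ratio, round(gas_ratio, 2))  # 10.0 1.67
```

Zerg comes out ahead in both resources: ten times over in minerals and by about two thirds in gas.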

In TvZ, defeating mutalisks is their only important role. In most situations they are not good for hunting overlords, because they are too vulnerable to scourge. To fight guardians, wraiths are better, because guardians have armor. Devourers have more armor, and goliaths and vessels are better.

Advice to terran bots: If you are facing mutalisks, consider keeping up to 3 valkyries in the air against a dozen mutas, maybe slightly more if the mutas mass up beyond that or pull ahead in armor upgrades. Valkyries are expensive; making too many is wasteful. Keep them together so they fire at nearly the same time. It makes them far more effective—seriously, mutas will melt away like light snow; zerg will be forced to spread them so that they cannot focus fire. Keep them near marines, goliaths, or turrets for protection from scourge. Make sure your anti-air units prioritize shooting down scourge; scourge are low in HP and expensive in gas, so they’re good targets whether you have air units nearby or not. Even with all that, valkyries may not pay for themselves unless you also implement valkyrie patrol micro as described in Liquipedia. It’s critical, and I’ve never seen a bot use it.

Valkyries can be worth it, but only for bots that know how to use them.

new bot Pinfel 2

The bot Pinfel has been inactive for a long time. I last wrote it up in 2017. Pinfel played an entertaining zealot-probe all-in strategy, bringing every probe. Now a new bot named Pinfel 2 (BASIL link) has appeared. Its SSCAIT description says it uses STARTcraft.

Pinfel 2 makes 5 or 6 gates, accumulates a lot of zealots, and eventually attacks—if it is not attacked first. It’s rudimentary, but it’s a start. The zealots are many, but until Pinfel 2 starts its very late attack, the zealots only wait by their gates and only react if an enemy comes close.

Pinfel 2 is brand new. I approve of uploading it in a rudimentary state and watching how it does before getting down to the real work. Let’s see what becomes of it!

new bot Pylon Puller

New protoss bot Pylon Puller is running on BASIL only, not SSCAIT. It was uploaded yesterday, and already has one update.

Pylon Puller plays at least three strategies. Its favorite is three gate zealot. It also knows builds for cannon defense followed by dark templar, and cannon defense followed by dragoon-reaver.

Pylon Puller tries to win in the opening. Once its unit mix is set, it does not seem to change. It is able to expand, but rarely does. It can do long distance mining, which I have seen more often than an expansion. Its micro is not strong. It sometimes leaves units on hold position while they’re under attack.

Pylon Puller has not always played the same build against the same opponent. I have only seen it play its dragoon-reaver build against Yuanheng Zhu aka Juno, a cannon bot: The reavers are theoretically effective versus cannons. Without more games, it’s impossible to tell whether it’s a strategy adaptation or a hardcoded build for this opponent. Against IceBot, it played zealots the first game, and Ice defended effortlessly and won. The second game, Pylon Puller went DT and won. Since there was an update, it’s impossible to know whether the switch was due to the bot learning, or the author coding. Similarly for two losses in a row against MadMixT, where Pylon Puller played different builds. But I doubt that a bot at Pylon Puller’s skill level has both strategy adaptation and learning. Learning does seem moderately likely.

Running on BASIL only, frequent updates, focus on winning early with little attention to the middle game, and a restricted but hand-adapted set of opening builds all strike me as characteristic of a Newbie Zerg production—though unlike many past Newbie Zerg bots, it does not seem to be based on Steamhammer. I expect that Pylon Puller will be unable to threaten top bots, but may become a danger to some opponents one tier lower if Pylon Puller’s hand-crafted builds target their weaknesses. Eh, maybe two tiers lower. So far, it does not look impressive.

But whatever. Every new bot is good! Weaknesses exposed by manual adaptation are still weaknesses that need fixing. Every bot that brings something new will teach us something new.