CoG 2022 results first look
As Dan Gant let me know, CoG 2022 results are out today, complete with the detailed results file. The participants are the same as last year, except that MetaBot was dropped for unreliability that affected the running of the tournament. The carryovers from last year are #6 XiaoYi, #7 CUNYBot, and #8 BetaStar. The others are updated for this year.
My version of the crosstable.
bot | overall | Bana | Purp | Star | McRa | Micr | XIAO | CUNY | Beta |
---|---|---|---|---|---|---|---|---|---|
#1 BananaBrain | 85.40% | - | 79% | 60% | 69% | 90% | 100% | 100% | 100% |
#2 PurpleWave | 75.24% | 21% | - | 84% | 55% | 74% | 97% | 97% | 100% |
#3 Stardust | 73.02% | 40% | 16% | - | 69% | 90% | 98% | 100% | 98% |
#4 McRave | 68.60% | 31% | 45% | 31% | - | 93% | 81% | 100% | 100% |
#5 Microwave | 46.54% | 10% | 26% | 10% | 7% | - | 74% | 98% | 100% |
#6 XIAOYI | 35.40% | 0% | 3% | 2% | 19% | 26% | - | 98% | 100% |
#7 CUNYBot | 15.49% | 0% | 3% | 0% | 0% | 2% | 2% | - | 100% |
#8 BetaStar | 0.32% | 0% | 0% | 2% | 0% | 0% | 0% | 0% | - |
There are surprises throughout, from top to bottom.
Stardust’s reign is over for the moment. Last year, Stardust scored over 90% in CoG and over 95% in AIIDE, crushing the competition. This time, #1 BananaBrain dominated with 85%, and #2 PurpleWave edged out #3 Stardust. The official results show that Stardust had 67 crashes and 7 frame timeouts in 3150 games. If Stardust had the same number of crashes (zero) and frame timeouts (1) as the two bots above it, it would have finished second by a razor-thin margin.
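The "razor-thin margin" can be checked with a little arithmetic. This is a rough sketch of one way to read the claim, not necessarily how it was originally worked out. It assumes that each bot played 3150 games, that win counts can be recovered by rounding the published percentages, and (the best case for Stardust) that every crashed or extra-timeout game would have been a win while PurpleWave's total stayed the same. The variable names are mine.

```python
GAMES = 3150  # games played by each bot, per the official results

# Win counts recovered from the published overall percentages.
stardust_wins = round(0.7302 * GAMES)
purplewave_wins = round(0.7524 * GAMES)

# Games Stardust lost to crashes, plus frame timeouts beyond the single
# timeout that the post allows the top two bots.
crashes = 67
extra_timeouts = 7 - 1

# Assumption: every one of those games turns into a Stardust win,
# and PurpleWave's own win count is left unchanged.
hypothetical_wins = stardust_wins + crashes + extra_timeouts
margin = hypothetical_wins - purplewave_wins

print(f"Stardust, hypothetical: {hypothetical_wins} wins "
      f"({100 * hypothetical_wins / GAMES:.2f}%)")
print(f"PurpleWave, actual:     {purplewave_wins} wins "
      f"({100 * purplewave_wins / GAMES:.2f}%)")
print(f"margin: {margin} games")
```

Under those assumptions the gap is only a handful of games out of 3150, so even a few of the recovered games going the other way would leave Stardust in third.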
There is not a single upset, meaning a pairing in which a lower-ranked bot outscored a higher-ranked one. The crosstable is very orderly. The lowest winning rate of a higher-ranked bot is 55% for #2 PurpleWave over #4 McRave.
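The no-upsets claim is easy to verify mechanically. The sketch below copies the head-to-head percentages of each bot against every lower-ranked bot from the crosstable, then looks for any pairing under 50% and for the closest pairing. The data layout and names are my own choice for illustration.

```python
bots = ["BananaBrain", "PurpleWave", "Stardust", "McRave",
        "Microwave", "XIAOYI", "CUNYBot", "BetaStar"]

# Win percentage of each bot (one row per bot, in rank order)
# against each lower-ranked bot, also in rank order.
upper = [
    [79, 60, 69, 90, 100, 100, 100],  # BananaBrain
    [84, 55, 74, 97, 97, 100],        # PurpleWave
    [69, 90, 98, 100, 98],            # Stardust
    [93, 81, 100, 100],               # McRave
    [74, 98, 100],                    # Microwave
    [98, 100],                        # XIAOYI
    [100],                            # CUNYBot
]

# Flatten into (percentage, higher-ranked bot, lower-ranked bot) tuples.
pairings = [
    (pct, bots[i], bots[i + 1 + j])
    for i, row in enumerate(upper)
    for j, pct in enumerate(row)
]

# An upset is a pairing where the higher-ranked bot scored under 50%.
upsets = [p for p in pairings if p[0] < 50]
closest = min(pairings)

print("upsets:", upsets)
print("closest pairing:", closest)
```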
Something went wrong with BetaStar. It is a strong bot and finished well ahead of CUNYBot last year; head to head versus CUNYBot, it scored 40 wins out of 50 games. This year it scored 10 wins total against all opposition, all of them against Stardust and likely due to crashes. What went wrong? Did the new and improved map pool break it? Was there a rule change that it could not cope with?
race results
I made two versions of each table. The first includes all results, the second excludes BetaStar.
race | score |
---|---|
terran | 35% |
protoss | 58% |
zerg | 44% |
race | score |
---|---|
terran | 25% |
protoss | 74% |
zerg | 34% |
It’s not very informative, but I like to include it anyway. There was only one terran; we need more. Protoss dominated, as usual in recent years, even when including BetaStar’s debacle.
bot | race | overall | vT | vP | vZ |
---|---|---|---|---|---|
BananaBrain | protoss | 85.40% | 100% | 80% | 86% |
PurpleWave | protoss | 75.24% | 97% | 68% | 75% |
Stardust | protoss | 73.02% | 98% | 51% | 86% |
McRave | zerg | 68.60% | 81% | 52% | 96% |
Microwave | zerg | 46.54% | 74% | 37% | 53% |
XIAOYI | terran | 35.40% | - | 26% | 48% |
CUNYBot | zerg | 15.49% | 2% | 26% | 1% |
BetaStar | protoss | 0.32% | 0% | 1% | 0% |
bot | race | overall | vT | vP | vZ |
---|---|---|---|---|---|
BananaBrain | protoss | 82.96% | 100% | 70% | 86% |
PurpleWave | protoss | 71.11% | 97% | 52% | 75% |
Stardust | protoss | 68.89% | 98% | 28% | 86% |
McRave | zerg | 63.37% | 81% | 36% | 96% |
Microwave | zerg | 37.63% | 74% | 16% | 53% |
XIAOYI | terran | 24.63% | - | 1% | 48% |
CUNYBot | zerg | 1.41% | 2% | 1% | 1% |
Again, not very informative with so few participants. Excluding BetaStar clarifies that CUNYBot was outclassed. XiaoYi was also outclassed by the remaining protoss, and could put up a fight only against the zergs.
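For anyone who wants to reproduce the "excluding BetaStar" overall scores from the published numbers alone, here is a small sketch. It assumes every bot played the same number of games against each of its 7 opponents, so that the overall score is the plain average of the 7 head-to-head percentages. The function name is hypothetical, and because the head-to-head figures are rounded to whole percents, the results can differ from the table in the second decimal place.

```python
# Published figures: (overall %, head-to-head % versus BetaStar).
published = {
    "BananaBrain": (85.40, 100),
    "PurpleWave":  (75.24, 100),
    "Stardust":    (73.02, 98),
    "McRave":      (68.60, 100),
    "Microwave":   (46.54, 100),
    "XIAOYI":      (35.40, 100),
    "CUNYBot":     (15.49, 100),
}

OPPONENTS = 7  # each bot faced the 7 other participants


def overall_without(overall_pct, vs_excluded_pct, opponents=OPPONENTS):
    """Remove one opponent's share from an average over equal-sized pairings."""
    return (overall_pct * opponents - vs_excluded_pct) / (opponents - 1)


recomputed = {
    bot: overall_without(overall, vs_betastar)
    for bot, (overall, vs_betastar) in published.items()
}

for bot, pct in recomputed.items():
    print(f"{bot:12s} {pct:6.2f}%")
```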
the surprisingly poor results
Stardust’s crash rate surprises me. It does not have a crashing problem on BASIL. There was something in the tournament environment that it was not ready for. I can’t guess whether that’s more due to Stardust, or more due to the tournament.
BetaStar essentially scored zero and added no information to the tournament results. To me it suggests that the tournament environment changed somehow (we know that at least the map pool changed), and the organizers did not test the carryover bots to make sure they still worked.