CoG 2022 results first look
As Dan Gant let me know, CoG 2022 results are out today, complete with the detailed results file. The participants are the same as last year, except that MetaBot was dropped for unreliability that affected the running of the tournament. The carryovers from last year are #6 XiaoYi, #7 CUNYbot, and #8 BetaStar. The others are updated for this year.
My version of the crosstable.
bot | overall | Bana | Purp | Star | McRa | Micr | XIAO | CUNY | Beta
---|---|---|---|---|---|---|---|---|---
#1 BananaBrain | 85.40% | - | 79% | 60% | 69% | 90% | 100% | 100% | 100%
#2 PurpleWave | 75.24% | 21% | - | 84% | 55% | 74% | 97% | 97% | 100%
#3 Stardust | 73.02% | 40% | 16% | - | 69% | 90% | 98% | 100% | 98%
#4 McRave | 68.60% | 31% | 45% | 31% | - | 93% | 81% | 100% | 100%
#5 Microwave | 46.54% | 10% | 26% | 10% | 7% | - | 74% | 98% | 100%
#6 XIAOYI | 35.40% | 0% | 3% | 2% | 19% | 26% | - | 98% | 100%
#7 CUNYBot | 15.49% | 0% | 3% | 0% | 0% | 2% | 2% | - | 100%
#8 BetaStar | 0.32% | 0% | 0% | 2% | 0% | 0% | 0% | 0% | -
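As a sanity check on my transcription, the overall column appears to be close to the simple mean of each row's per-opponent win rates, which is what you would expect if every pairing played roughly the same number of games (an assumption on my part; the displayed percentages are also rounded, so small discrepancies remain). A quick sketch for the top four rows:

```python
# Per-opponent win percentages transcribed from the crosstable above
# (self-matches omitted from each row).
crosstable = {
    "BananaBrain": [79, 60, 69, 90, 100, 100, 100],
    "PurpleWave":  [21, 84, 55, 74, 97, 97, 100],
    "Stardust":    [40, 16, 69, 90, 98, 100, 98],
    "McRave":      [31, 45, 31, 93, 81, 100, 100],
}
reported = {"BananaBrain": 85.40, "PurpleWave": 75.24,
            "Stardust": 73.02, "McRave": 68.60}

for bot, rates in crosstable.items():
    mean = sum(rates) / len(rates)
    # Rounding of the displayed percentages accounts for the small gaps.
    print(f"{bot}: row mean {mean:.2f}% vs reported {reported[bot]:.2f}%")
```

The row means land within about 0.2 percentage points of the reported overall figures.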
There are surprises throughout, from top to bottom.
Stardust’s reign is over for the moment. Last year, Stardust scored over 90% in CoG and over 95% in AIIDE, crushing the competition. This time, #1 BananaBrain dominated with 85%, and #2 PurpleWave edged out #3 Stardust. The official results show that Stardust had 67 crashes and 7 frame timeouts in 3150 games. If Stardust had the same number of crashes (zero) and frame timeouts (1) as the two bots above it, it would have finished second by a razor-thin margin.
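The arithmetic behind that claim, as a rough sketch: it assumes every crash, and every frame timeout beyond the single one PurpleWave had, turned a would-be win into a loss, which is the most favorable possible reading for Stardust.

```python
games = 3150
stardust_rate = 0.7302     # Stardust's reported overall win rate
purplewave_rate = 0.7524   # PurpleWave's reported overall win rate
crashes, frame_timeouts = 67, 7

# Estimated win count implied by the reported rate.
wins = round(stardust_rate * games)
# Credit back the crashes and all but one frame timeout as wins.
adjusted = (wins + crashes + (frame_timeouts - 1)) / games
print(f"adjusted Stardust: {adjusted:.2%} vs PurpleWave: {purplewave_rate:.2%}")
```

That puts adjusted Stardust at about 75.33%, a hair above PurpleWave's 75.24%.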
There is not a single upset in which a lower-ranked bot won its series against a higher-ranked bot. The crosstable is very orderly. The lowest winning rate for a higher-ranked bot is 55%, for #2 PurpleWave over #4 McRave.
Something went wrong with BetaStar. It is a strong bot and finished well ahead of CUNYbot last year, scoring 40 wins out of 50 games in their head-to-head. This year it scored 10 wins in total against all opposition, all of them against Stardust and likely due to Stardust's crashes. What went wrong? Did the new and improved map pool break it? Was there a rule change that it could not cope with?
race results
I made two versions of each table: one including all results, and one excluding BetaStar.

All results:

race | score |
---|---|
terran | 35% |
protoss | 58% |
zerg | 44% |

Excluding BetaStar:

race | score |
---|---|
terran | 25% |
protoss | 74% |
zerg | 34% |
It’s not very informative, but I like to include it anyway. There was only one terran; we need more. Protoss dominated, as usual in recent years, even when including BetaStar’s debacle.
All results:

bot | race | overall | vT | vP | vZ |
---|---|---|---|---|---|
BananaBrain | protoss | 85.40% | 100% | 80% | 86% |
PurpleWave | protoss | 75.24% | 97% | 68% | 75% |
Stardust | protoss | 73.02% | 98% | 51% | 86% |
McRave | zerg | 68.60% | 81% | 52% | 96% |
Microwave | zerg | 46.54% | 74% | 37% | 53% |
XIAOYI | terran | 35.40% | - | 26% | 48% |
CUNYBot | zerg | 15.49% | 2% | 26% | 1% |
BetaStar | protoss | 0.32% | 0% | 1% | 0% |
Excluding BetaStar:

bot | race | overall | vT | vP | vZ |
---|---|---|---|---|---|
BananaBrain | protoss | 82.96% | 100% | 70% | 86% |
PurpleWave | protoss | 71.11% | 97% | 52% | 75% |
Stardust | protoss | 68.89% | 98% | 28% | 86% |
McRave | zerg | 63.37% | 81% | 36% | 96% |
Microwave | zerg | 37.63% | 74% | 16% | 53% |
XIAOYI | terran | 24.63% | - | 1% | 48% |
CUNYBot | zerg | 1.41% | 2% | 1% | 1% |
Again, not very informative with so few participants. Excluding BetaStar makes it clear that CUNYbot was outclassed. XiaoYi was also outclassed by the remaining protoss bots, and could only put up a fight against the zergs.
the surprisingly poor results
Stardust’s crash rate surprises me. It does not have a crashing problem on BASIL. There was something in the tournament environment that it was not ready for. I can’t guess whether that’s more due to Stardust or more due to the tournament.
BetaStar essentially scored zero and added no information to the tournament results. To me it suggests that the tournament environment changed somehow (we know that at least the map pool changed), and the organizers did not test the carryover bots to make sure they still worked.
Comments
Dan on :
BetaStar has a tendency to fail in some environments for reasons I don't understand. It's disabled on BASIL for having a very high crash rate, and a lot of its games fail on my local SC-Docker, but it historically worked fine on SSCAIT. The map pool doesn't explain it either; the first map in the rotation was Benzene so presumably BetaStar would've played fine.
The most likely reason for a UAB-based module bot to play all the games without crashing but do nothing in them is failure to find its config file. The config files are present in BetaStar's AI directory, though. And PurpleWave's logs confirm that it finds its own config files (though it also scans directories upwards/downwards to find them if they're missing, so it's somewhat more robust against unexpected working directories than BetaStar might be).
Amusing stat: PurpleWave averaged ~10:30 to kill an AFK BetaStar, which is longer than both Microwave and CUNYBot's average game length across all opponents.
Bruce on :
Don't know what happened to BetaStar either - it worked locally on all of the maps in my sc-docker installation.