My version of the CIG 2017 crosstable. I have small differences from the official results—see the explanation below. My results reverse the finishing places of #8 CasiaBot and #9 Ziabot as well as #17 Bigeyes and #18 OpprimoBot, because even small differences affect ranking.
The format of the results file has changed since last year. There is no documentation, so I don’t know what all the columns mean, but I only needed a few of them and I was able to pick them out. It turns out that column 6 is true if the first player won, otherwise false. It looks like each game is recorded twice, with the same winner, loser, and map but some differences in other data. I imagine that each player is running in its own instance, and each instance records its own data. The games are numbered so the duplicates can be recognized, and sometimes the games are recorded out of order. I had to rewrite parsing code, but only a handful of lines.
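The deduplication described above can be sketched roughly as follows. This is not the actual parsing code. Since the file format is undocumented, the column layout here is an assumption made for illustration: whitespace-separated fields, the game number in column 0, the two players in columns 1 and 2, and the win flag in column 6 counting from 0. The names `read_games` and `crosstable` are hypothetical.

```python
from collections import defaultdict

# ASSUMED layout (the real file is undocumented): whitespace-separated fields,
# column 0 = game number, columns 1 and 2 = first and second player,
# column 6 (counting from 0) = "true" if the first player won, else "false".
GAME_ID, FIRST, SECOND, FIRST_WON = 0, 1, 2, 6

def read_games(lines):
    """Return {game_number: (winner, loser)}, keeping one record per game."""
    games = {}
    for line in lines:
        cols = line.split()
        if len(cols) <= FIRST_WON:
            continue  # too short to be a complete record
        flag = cols[FIRST_WON].lower()
        if flag not in ("true", "false"):
            continue  # corrupted win flag
        try:
            game_id = int(cols[GAME_ID])
        except ValueError:
            continue  # corrupted game number
        first, second = cols[FIRST], cols[SECOND]
        result = (first, second) if flag == "true" else (second, first)
        # Each game is recorded twice, possibly out of order; keying on the
        # game number keeps the first copy and ignores the duplicate.
        games.setdefault(game_id, result)
    return games

def crosstable(games):
    """Count wins for each ordered (winner, loser) pair."""
    wins = defaultdict(int)
    for winner, loser in games.values():
        wins[(winner, loser)] += 1
    return wins
```

Keying on the game number means the order of lines in the file does not matter, and a corrupted line is simply skipped rather than stopping the run.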
The results file turned out to have a section of corrupted data in the middle. Information about a small number of games is missing or corrupted, and I had to delete it from my input. The tournament format of 125 rounds with 20 participants called for 125 * 20 * 19 / 2 = 23750 games. Each game was recorded twice, so there should be 47500 lines in the results file. One game was expected to be missed because the tournament manager software has an off-by-one bug and doesn’t play one game. The last and highest-numbered game recorded in the results file is 23721, and numbering starts from 0 so it looks as though 3 games in fact went unplayed, or at least uncounted. There are 47487 lines remaining in the input file, accounting for 23722 games or 99.88% of the ideal 23750, or 99.89% of the expected and claimed 23749 games.
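The game-count arithmetic above is easy to reproduce. The snippet below only restates the figures given in the text (125 rounds, 20 participants, 23722 recorded games); the variable names and the `coverage` helper are made up for the illustration.

```python
rounds, bots = 125, 20

pairings = bots * (bots - 1) // 2   # each pair of bots meets once per round
ideal_games = rounds * pairings     # 23750
ideal_lines = 2 * ideal_games       # each game is recorded twice: 47500
expected_games = ideal_games - 1    # the off-by-one bug skips one game: 23749

recorded_games = 23722              # count given in the text

def coverage(recorded, total):
    """Percentage of games present, rounded to two decimals."""
    return round(100 * recorded / total, 2)

print(ideal_games, ideal_lines, expected_games)
print(coverage(recorded_games, ideal_games))     # 99.88
print(coverage(recorded_games, expected_games))  # 99.89
```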
Anyway, my winning percentages differ from the official numbers mostly from the third decimal place on, which is what you would expect when the game counts differ by about a tenth of a percent. Apparently the official numbers don’t suffer from corrupted data. I have written to the organizers to see if they can provide a clean result file.
| bot | overall | ZZZK | tscm | Purp | Leta | UAlb | Mega | Over | Casi | Ziab | Iron | AIUR | McRa | Tyr | SRbo | Terr | Bonj | Bige | Oppr | Slin | Sals |
| --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- |
| ZZZKBot | 75.43% | | 52% | 37% | 53% | 82% | 77% | 90% | 31% | 67% | 66% | 93% | 83% | 84% | 82% | 88% | 89% | 86% | 94% | 90% | 90% |
| tscmoo | 73.50% | 48% | | 90% | 82% | 54% | 67% | 65% | 46% | 66% | 58% | 69% | 74% | 69% | 75% | 92% | 89% | 79% | 94% | 85% | 94% |
| PurpleWave | 66.51% | 63% | 10% | | 42% | 62% | 74% | 89% | 54% | 41% | 55% | 82% | 51% | 89% | 74% | 86% | 61% | 90% | 74% | 73% | 94% |
| LetaBot | 62.75% | 47% | 18% | 58% | | 62% | 49% | 76% | 82% | 70% | 75% | 51% | 66% | 54% | 42% | 70% | 72% | 78% | 41% | 87% | 95% |
| UAlbertaBot | 61.67% | 18% | 46% | 38% | 38% | | 30% | 62% | 60% | 70% | 42% | 60% | 79% | 46% | 67% | 88% | 90% | 73% | 87% | 86% | 92% |
| MegaBot | 61.06% | 23% | 33% | 26% | 51% | 70% | | 55% | 37% | 61% | 44% | 44% | 60% | 50% | 81% | 90% | 82% | 77% | 97% | 82% | 96% |
| Overkill | 59.65% | 10% | 35% | 11% | 24% | 38% | 45% | | 55% | 76% | 43% | 65% | 62% | 42% | 84% | 94% | 91% | 93% | 91% | 83% | 92% |
| CasiaBot | 58.32% | 69% | 54% | 46% | 18% | 40% | 63% | 45% | | 34% | 46% | 62% | 59% | 70% | 29% | 47% | 82% | 93% | 75% | 84% | 94% |
| Ziabot | 58.49% | 33% | 34% | 59% | 30% | 30% | 39% | 24% | 66% | | 47% | 58% | 39% | 60% | 73% | 79% | 89% | 82% | 90% | 81% | 96% |
| Iron | 58.11% | 34% | 42% | 45% | 25% | 58% | 56% | 57% | 54% | 53% | | 69% | 56% | 45% | 69% | 74% | 70% | 76% | 76% | 73% | 74% |
| AIUR | 56.73% | 7% | 31% | 18% | 49% | 40% | 56% | 35% | 38% | 42% | 31% | | 64% | 74% | 81% | 82% | 92% | 72% | 90% | 83% | 93% |
| McRave | 47.20% | 17% | 26% | 49% | 34% | 21% | 40% | 38% | 41% | 61% | 44% | 36% | | 44% | 65% | 61% | 59% | 52% | 56% | 77% | 76% |
| Tyr | 45.32% | 16% | 31% | 11% | 46% | 54% | 50% | 58% | 30% | 40% | 55% | 26% | 56% | | 24% | 45% | 58% | 49% | 63% | 58% | 91% |
| SRbotOne | 45.24% | 18% | 25% | 26% | 58% | 33% | 19% | 16% | 71% | 27% | 31% | 19% | 35% | 76% | | 35% | 61% | 97% | 43% | 75% | 94% |
| TerranUAB | 38.58% | 12% | 8% | 14% | 30% | 12% | 10% | 6% | 53% | 21% | 26% | 18% | 39% | 55% | 65% | | 77% | 63% | 67% | 65% | 93% |
| Bonjwa | 33.04% | 11% | 11% | 39% | 28% | 10% | 18% | 9% | 18% | 11% | 30% | 8% | 41% | 42% | 39% | 23% | | 61% | 67% | 65% | 95% |
| Bigeyes | 30.90% | 14% | 21% | 10% | 22% | 27% | 23% | 7% | 7% | 18% | 24% | 28% | 48% | 51% | 3% | 37% | 39% | | 60% | 58% | 91% |
| OpprimoBot | 31.90% | 6% | 6% | 26% | 59% | 13% | 3% | 9% | 25% | 10% | 24% | 10% | 44% | 37% | 57% | 33% | 33% | 40% | | 79% | 93% |
| Sling | 26.07% | 10% | 15% | 27% | 13% | 14% | 18% | 17% | 16% | 19% | 27% | 17% | 23% | 42% | 25% | 35% | 35% | 42% | 21% | | 77% |
| Salsa | 9.52% | 10% | 6% | 6% | 5% | 8% | 4% | 8% | 6% | 4% | 26% | 7% | 24% | 9% | 6% | 7% | 5% | 9% | 7% | 23% | |
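The two ranking reversals mentioned at the top can be checked directly from the overall column. The sketch below copies the overall percentages from the table and assumes that the row order follows the official finishing places; it then re-sorts by my percentages and lists the places that change.

```python
# Overall winning percentages copied from the crosstable, in table row order
# (ASSUMED to be the official finishing order).
overall = {
    "ZZZKBot": 75.43, "tscmoo": 73.50, "PurpleWave": 66.51, "LetaBot": 62.75,
    "UAlbertaBot": 61.67, "MegaBot": 61.06, "Overkill": 59.65,
    "CasiaBot": 58.32, "Ziabot": 58.49, "Iron": 58.11, "AIUR": 56.73,
    "McRave": 47.20, "Tyr": 45.32, "SRbotOne": 45.24, "TerranUAB": 38.58,
    "Bonjwa": 33.04, "Bigeyes": 30.90, "OpprimoBot": 31.90, "Sling": 26.07,
    "Salsa": 9.52,
}

official_order = list(overall)  # dicts preserve insertion order
my_order = sorted(overall, key=overall.get, reverse=True)

# Places where the bot in the official order differs from the re-sorted order
changed = [
    (place, official, mine)
    for place, (official, mine) in enumerate(zip(official_order, my_order), start=1)
    if official != mine
]

for place, official, mine in changed:
    print(f"place {place}: official {official}, recomputed {mine}")
```

Only places 8, 9, 17, and 18 come out different, and the margins involved are 0.17 and 1.00 percentage points.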
observations
Newcomers #3 PurpleWave and #8 CasiaBot were the only players with positive scores versus #1 ZZZKBot. When I have time I’ll look into the replays and see if ZZZKBot was ready for the old-timers with special builds, or if it was the old-timers who weren’t ready for ZZZKBot. Perhaps ZZZKBot now has a learning feature and switches to a backup build if its 4 pool doesn’t work? I’m interested to find out.
#2 Tscmoo was edged out by #1 ZZZKBot and #8 CasiaBot, but otherwise had plus scores across the board. It showed stable performance across opponents—except for its crushing 90% win over #3 PurpleWave.
#3 PurpleWave did reasonably well against all except #2 tscmoo (10% wins) and #9 Ziabot (41%). (I count 42% against #4 LetaBot as reasonably good, though it’s technically an upset.) But there were several weaker opponents that it edged out more narrowly than you might expect. My conclusion: Strong, but a little lacking in the solidity needed to defeat weaker opponents consistently. With more maturity it will likely become even stronger.
#8 CasiaBot seems to have the most uneven results, with severe upsets in both directions—69% versus #1 ZZZKBot and 29% against #14 SRbotOne.
The biggest upset is #18 OpprimoBot at 59% against #4 LetaBot.
The winning rates of #10 Iron and #12 McRave versus tail-ender #20 Salsa, which lost nearly all games against other opponents, are 74% and 76%. That backs up the claim that the two played only about 75% of their games due to map problems with BWEM.
What do you notice in the crosstable?
Next: Per-map crosstables. We can expect dramatic numbers in some table cells thanks to Iron and McRave.
Update: I heard back from the organizers. They say that they had to alter the results file to compensate for errors in the tournament manager. And they think that all the problems are slight given the large number of games played. I think that’s true as far as it goes, but it leaves me feeling a little uneasy about the official results.