the fast success of Steamhammer

SSCAIT 2016 is beyond the halfway point, and Steamhammer’s score ratio has been holding steady near 2:1, good enough to place it in the bottom half of the top 16. It will probably make it into the finals (no guarantee; some tough games are still ahead). Its rate of upsets suggests a 30% chance that it could win in the round of 16 and make it through to the round of 8. It is unlikely to get further than that.

I’ve seen a few people wonder how a brand-new bot could do so well. Steamhammer 0.2 has less than 3 weeks of my development effort in it. I can answer that!

1. Steamhammer builds on a strong foundation in UAlbertaBot. I changed the openings and the strategy followup, and fixed bad play where I had time to, but most of Steamhammer’s play comes straight from UAlbertaBot. Tactics are only slightly changed; micro has modestly improved targeting but is otherwise nearly identical.

2. I know the game and I’ve been following the bot scene for a long time. I was able to make choices that both fit within UAlbertaBot’s limitations and pose challenges to other bots. Knowledge is power. Most new bots play inefficient builds; Steamhammer builds inefficiently only to work around other problems.

3. Strong new bots usually show good results at first and then decline as other bots adapt. I’ve seen it before and I see it now with Steamhammer. The opening learning bots find the right openings to play; I played a test match versus Zia in which Steamhammer scored 1 1 1 1 1 0 0 0 0, switching from wins to losses when Zia hit on the right opening. The actively developed bots get fixes to any weaknesses that the new bot reveals; both Krasi0 and LetaBot have mentioned improvements to mutalisk defense.

Steamhammer versus Killerbot and XIMP

I got my SSCAIT wish and saw the first Steamhammer versus Killerbot and Steamhammer versus XIMP games today.

Steamhammer versus Killerbot by Marian Devecka: Would my counter-build work? In my tests Steamhammer wins with it about half the time. Killerbot went with its +1 speed lings, and Steamhammer opened 11 pool and held with its own slow lings while teching to lair. Timing is critical: 12 pool is too slow on some maps! In this game Steamhammer managed to sneak in a drone kill while Killerbot’s lings were busy chasing the scouting drone (UAlbertaBot doesn’t do that natively; I had to improve the decision making). Steamhammer added 2 sunkens for protection while getting mutalisks. It still had only 1 hatchery while Killerbot was headed for 3 hatcheries, and it is impossible to hold without static defense. Steamhammer’s first 3 mutalisks hatched and headed for the enemy base, and as soon as the flyers were too far away to intervene, Killerbot’s zerglings attacked! Steamhammer pulled drones to defend and lost every single drone, but the sunkens held and the base was left barren but secure for the moment. The 3 mutalisks killed all of Killerbot’s drones in return and then tore down the enemy zerg base with excruciating slowness. Steamhammer won a close game. Steamhammer-Killerbot games often go down to the wire like this in my testing; I’ve seen one side or the other win with only 1 or 2 buildings left bleeding onto the map. But without the special-purpose counter-build, Steamhammer loses every game; Killerbot is far superior in tactics and robustness.

Steamhammer versus XIMP by Tomas Vajda: Steamhammer opened with 3 hatcheries before pool, the only bot I’ve seen play this logical response to forge-expand. (Though I saw auxanic open unsoundly with 5 hatcheries before pool!) XIMP cannoned up for carriers, as always. I hadn’t tested this matchup, so I wasn’t sure what would happen. Steamhammer went hydras with +1 attack, and soon had enough to tear down the cannon wall (slightly before +1 finished, annoyingly), wipe out XIMP’s natural, and start on the main. That’s what I intended, and I’m pleased it worked. XIMP seemed about to fall, but suddenly the first carriers started to make a difference—they finally had enough interceptors. The hydras did not have critical mass to fight carriers and were cleared from the main. At this point, Steamhammer had 6 bases though not enough drones to saturate them, and XIMP was mining minerals with 6 probes and gas with 3. Steamhammer was still winning by a mile.

As the game went on, the carriers moved to the middle of the map and Steamhammer trickled hydras toward them, losing them a few at a time for no apparent gain. There were 2 obvious problems. First, Steamhammer only attacked interceptors; as I watched, I realized “carriers don’t attack, so they’re at the bottom in the target priority list.” The code is easy to rewrite, though I’m not sure what solution is best. And second, trickling the hydras in was suicidal. I concluded while watching that SparCraft must not know about carriers. It ignores units that it does not know about, so when asked to predict the combat outcome, SparCraft always reported “zerg wins,” and Steamhammer moved its hydras confidently to their deaths. That may be harder to fix.
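To make the first problem concrete, here is a minimal sketch of how a weapon-based priority rule ends up ignoring carriers. It is not UAlbertaBot’s or Steamhammer’s actual code: the `Unit` record, both priority functions, and the numbers are hypothetical, chosen only to show the failure and one possible patch.

```python
from dataclasses import dataclass

@dataclass
class Unit:
    name: str
    has_weapon: bool  # a carrier has no weapon of its own; its interceptors do

def naive_priority(unit: Unit) -> int:
    """Hypothetical rule: armed units first, unarmed units last."""
    return 10 if unit.has_weapon else 1

def patched_priority(unit: Unit) -> int:
    """Same rule, with carriers special-cased above their interceptors."""
    if unit.name == "Carrier":
        return 11
    return naive_priority(unit)

def pick_target(enemies, priority):
    # Highest priority wins; ties resolve to the earliest unit in the list.
    return max(enemies, key=priority)

enemies = [Unit("Interceptor", True), Unit("Carrier", False)]
print(pick_target(enemies, naive_priority).name)    # Interceptor
print(pick_target(enemies, patched_priority).name)  # Carrier
```

Special-casing the carrier is only one option; which fix is best is exactly the open question.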

Steamhammer then won when XIMP overstepped a time limit, taking too long on a frame. That was disappointing. :-( I’m not sure how the game would have turned out. Steamhammer was still ahead but was playing like an idiot, and XIMP did finally re-expand to its natural and may have been able to add carriers before Steamhammer could take the rest of the map and start to make progress.

After these two uncertain wins, Steamhammer played against protoss Ian Nicholas DaCosta, a weaker bot which it had beaten in their previous game. Steamhammer got unlucky with base positions and scouting, placed its 3rd hatchery directly on the path that zealots were taking toward its main, and lost. Luck runs both ways!

SSCAIT 2016 links

For convenience, links related to the SSCAIT 2016 tournament that’s going on now.

LetaBot posted the unofficial crosstable link in a comment.

I haven’t seen a plain explanation of the tournament format, although many people seem to understand it. There are 45 players. The tournament will be in 2 stages. The first stage or “student division” is 1980 games in a double round robin: Each bot plays every other bot twice. Top students in the student division can win prizes. Then the top 16 go into a second stage, the “mixed division,” played in single elimination style.
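The 1980 figure follows directly from the player count. A quick check, with a helper name of my own choosing:

```python
def round_robin_games(players: int, rounds: int = 2) -> int:
    """Games in a round robin where every pair of players meets `rounds` times."""
    pairings = players * (players - 1) // 2
    return pairings * rounds

print(round_robin_games(45))  # 1980
```

With 45 players there are 990 distinct pairings, so each bot plays 88 games in the first stage.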

Iron’s surprise losses

Among SSCAIT tournament watchers, the talk of the moment is of Iron’s two surprise defeats against zergs Steamhammer by me and GarmBot by Aurelien Lermant. Iron got unlucky 2 games in a row.

In the Steamhammer game, Iron became overcautious against the few early zerglings, bunkered its ramp, and was late with its vulture attack. Provoking reactions like that is one of the reasons Steamhammer makes zerglings. When the mutalisks came out, they soon had enough numbers to safely break the bunker and then were free to ravage Iron’s base (they could have gone ravaging from the start, but they were obsessed with the bunker). Iron’s vultures moved in and quickly wiped out every drone, but the mutalisk swarm split up, some cleaning vultures and the rest cleaning the terran base. Steamhammer won with empty bases but unchallenged air supremacy. In my testing, Steamhammer has never won against Iron with this build order—Iron got unlucky, or perhaps a late change weakened its play in this situation.

In the GarmBot game, GarmBot randomly expanded to Iron’s natural and built a sunken that covered the ramp. Iron was again overcautious: It double bunkered its ramp and blocked with vultures and many SCVs, and sent no vultures past the sunken. It seemed in no hurry to get tanks and blow away the sunken. Mutalisks showed up and started picking at a supply depot; nothing stopped them. But the lethal blow did not come until lurkers arrived and burrowed in good position on the ramp. There was no detection. Lurkers defeated the blocking units and cleared the bunkers while mutalisks killed the engineering bay, and then there was no hope. Iron reacted poorly to GarmBot’s bizarre expansion and GarmBot followed up well.

It’s a hard game! Even a bot as well-rounded as Iron is not prepared for everything.

SSCAIT 2016 surprise entries

The deadline is past. Here are the surprise eleventh-hour SSCAIT entrants this year:

  • Aman Zargarpur - terran
  • auxanic - zerg
  • BeeBot - terran
  • Tommy Fang - terran
  • XelnagaII - protoss

Of these, XelnagaII is the only name I recognize. It finished in the middle of the pack in AIIDE 2016. As LetaBot says, BeeBot had a bug in its first upload which was fixed before the deadline, explaining its inconsistent results. It may be a dangerous mech bot. Auxanic has been around long enough for me to get an impression of its style: It is a single-minded macro zerg which doesn’t believe in making units for early defense. Auxanic can recover and win after having its main and tech destroyed, so at least it’s robust. Aman Zargarpur has played 1 game so far and Tommy Fang none yet.

I think we’re in for some variety, and that is good.

As LetaBot notes in the comments, they have returned to calling this the 2016 tournament, so that the sequence of numbers is unbroken. They called it 2017 until lately.

SSCAIT 2017 final rush

The SSCAIT tournament deadline this year is 18 December. Like the 2015 edition last year, it will start in December and I expect final results in January. Curiously, they seem to have decided to skip a year in the numbering scheme, so this is the 2017 edition.

The final rush is on. The Usual Suspects are updating their bots. Marian Devecka, I’ve already noted, is back in the game. Tyr has been reuploaded for the first time since the middle of the year. Possibly a few bot authors are waiting until the last moment, hoping to gain an advantage with surprise updates.

The admins have been at work too. Some previously disabled bots are alive again: Awesomebot, DAIDOES, Ian Nicholas DaCosta’s bot, JompaBot, the 2013 bot by Oleg Ostroumov, PeregrineBot. The number of games used to track the current winning rate has been bumped from 20 to 50, so that the ranking changes more slowly but ends up more accurate.
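A rough sketch of why the bigger window is steadier. It assumes each game is an independent trial with a fixed win probability, which is not really true of bots that learn or get updated, and it uses the ordinary binomial standard error. The function name is mine.

```python
import math

def win_rate_std_error(p: float, games: int) -> float:
    """Standard error of an observed win rate over `games` independent games."""
    return math.sqrt(p * (1.0 - p) / games)

for window in (20, 50):
    print(window, round(win_rate_std_error(0.5, window), 3))
# 20 0.112
# 50 0.071
```

For an evenly matched bot, the noise in the displayed winning rate drops from about 11 percentage points to about 7.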

I will be jumping in too. I got work underway a week ago and I’m rushing to finish a first primitive version. Stand by for the announcement.

SSCAIT scores - raw data for yesterday’s graph

Here is a csv file of rating differences and score ratios. It contains all the data that went into yesterday’s graph, plus the names of the opponents so we can see which bots are behind each dot. It is over 4MB for 80,798 games, hardly Big Data territory but not something you want to look through by eye either.

Krasi0 asked about the outliers in the upper left, where the weaker bot upset the stronger with a huge score ratio. I pulled out the 44 games where the weaker bot was at least 50 Elo behind and won with a score ratio of 100 or more. There’s quite a variety of bots and it looks like most of the losers are genuine strong opponents, but we can’t tell from this why they lost so badly.

| winner | loser | rating_diff | score_ratio |
| --- | --- | --- | --- |
| Odin2014 | Soeren Klett | -121.241105546074 | 102 |
| Maja Nemsilajova | Gaoyuan Chen | -87.5441567081255 | 133.75 |
| EradicatumXVR | Martin Rooijackers | -176.159988871814 | 134.75 |
| Florian Richoux | Andrew Smith | -75.4167275722925 | 139.75 |
| EradicatumXVR | Martin Rooijackers | -160.787872111647 | 123.6 |
| Jakub Trancik | tscmooz | -234.397880730261 | 107.85 |
| EradicatumXVR | Tomas Vajda | -310.393156864715 | 238.75 |
| Odin2014 | tscmoo | -341.595996071025 | 111.2 |
| Jakub Trancik | Krasimir Krystev | -206.879443535575 | 118.5 |
| Martin Rooijackers | Tomas Vajda | -185.169798041552 | 178 |
| Martin Rooijackers | Tomas Vajda | -196.771893565481 | 196.5 |
| Jakub Trancik | Florian Richoux | -201.533152887389 | 146 |
| Martin Rooijackers | Tomas Vajda | -214.947508799781 | 194.5 |
| EradicatumXVR | Soeren Klett | -59.609552771464 | 121.625 |
| Martin Rooijackers | Tomas Vajda | -197.317290128813 | 173.75 |
| EradicatumXVR | Martin Rooijackers | -153.322771550638 | 134 |
| NUS Bot | Tomas Cere | -217.342932814072 | 102.25 |
| EradicatumXVR | Martin Rooijackers | -299.534513717939 | 140 |
| Martin Rooijackers | Tomas Vajda | -177.764917764824 | 171.5 |
| Jakub Trancik | tscmoo | -114.300240687596 | 113.25 |
| Martin Rooijackers | Tomas Vajda | -261.726998366137 | 185.5 |
| Tomas Vajda | ICELab | -61.7649472971907 | 110.3 |
| Maja Nemsilajova | Gaoyuan Chen | -444.702632047678 | 148.5 |
| Jakub Trancik | Aurelien Lermant | -96.3561952340121 | 144 |
| Adrian Sternmuller | Serega | -165.832601805773 | 127.6 |
| Ian Nicholas DaCosta | Tomas Cere | -180.057678684451 | 109.25 |
| OpprimoBot | Radim Bobek | -107.746261964435 | 187.75 |
| Martin Rooijackers | Tomas Vajda | -276.603234707109 | 108.2 |
| Florian Richoux | Martin Rooijackers | -117.736046527607 | 126 |
| Igor Lacik | Gaoyuan Chen | -94.2338975266132 | 121.75 |
| Florian Richoux | Andrew Smith | -186.917595581173 | 123.25 |
| Radim Bobek | UPStarcraftAI | -62.9693177806437 | 136.5 |
| Tomas Vajda | Andrew Smith | -79.8133863519054 | 152.771739130435 |
| Jakub Trancik | Florian Richoux | -84.9235344515139 | 128 |
| Martin Rooijackers | Tomas Vajda | -78.9815583248117 | 207 |
| Jakub Trancik | PeregrineBot | -69.7933587952098 | 121.25 |
| Dave Churchill | Andrew Smith | -94.5789654176522 | 180.75 |
| Tomas Vajda | WuliBot | -52.5760915924302 | 182.5 |
| Soeren Klett | Dave Churchill | -155.311160590808 | 361.25 |
| Odin2014 | PeregrineBot | -94.3129452431417 | 124.55 |
| DAIDOES | MegaBot | -178.609296376345 | 115.875 |
| OpprimoBot | Martin Rooijackers | -786.274344230815 | 307.75 |
| Soeren Klett | Dave Churchill | -68.2015666358432 | 109 |
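The selection rule for this table is simple enough to sketch. This is not the script I actually ran; the tuple layout and the function name are assumptions, and two of the three sample rows are made up to show what gets excluded.

```python
def is_big_upset(rating_diff: float, score_ratio: float,
                 min_gap: float = 50.0, min_ratio: float = 100.0) -> bool:
    """True when the winner was at least `min_gap` Elo behind and still
    won with a score ratio of at least `min_ratio`."""
    return rating_diff <= -min_gap and score_ratio >= min_ratio

games = [
    ("Odin2014", "Soeren Klett", -121.24, 102.0),  # first row of the table, rounded
    ("BotA", "BotB", 30.0, 250.0),   # made up: the favorite won big
    ("BotC", "BotD", -80.0, 3.5),    # made up: an upset with an ordinary ratio
]
outliers = [g for g in games if is_big_upset(g[2], g[3])]
print([g[0] for g in outliers])  # ['Odin2014']
```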

SSCAIT scores - compared to rating differences

This scatter chart shows rating differences versus score ratios. For each SSCAIT game in the dataset which ended with both scores above zero (about 80,000 games), the x-axis has the rating difference, winner’s rating minus loser’s rating. The ratings are calculated as of that game, so changing versions don’t mess things up (much). Check the Elo table to see what the rating differences mean. The y-axis has the score ratio, winner’s score divided by loser’s score. The y-axis is on a logarithmic scale and shows ratios from 0.2 to 200. A small number of points are off the top or bottom; no points are off the left or right. You can click through the graph to get the same image on its own, which may make it easier to zoom in and out.

Since it’s from the winner’s point of view, most games have rating difference > 0 (the higher rated bot won) and score ratio > 1 (the bot with the higher score won).
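If you don’t have an Elo table handy, the usual logistic Elo formula converts a rating difference into an expected score. This sketch assumes the standard 400-point scale; I have not checked that SSCAIT’s ratings are computed exactly this way.

```python
def elo_expected_score(rating_diff: float) -> float:
    """Expected score of a player rated `rating_diff` points above the opponent."""
    return 1.0 / (1.0 + 10.0 ** (-rating_diff / 400.0))

for diff in (0, 50, 100, 200, 400):
    print(diff, round(elo_expected_score(diff), 2))
```

Under that assumption a 200-point favorite is expected to score about 0.76, and a bot 50 points behind (the cutoff used for the outlier list) about 0.43.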

[Figure: scatter chart of rating difference versus score ratio]

There is complex structure here, but I’m at a loss to interpret much of it. Horizontal lines show that some score ratios are popular, which seems like a quirk of the game. Beyond the sharp lines, some fuzzier stratification in the score ratios is visible. When the stronger bot wins, it is usually by a score ratio of at least 2, increasing slowly as the strength difference goes up. The slowly rising “soft floor” in the score ratio is interesting and surprising. There are other clear structures in the chart, but I don’t know what they mean. It is mildly interesting that the left side, when the stronger bot lost, looks less structured.

Games where the score ratio is less than 1 are games where stopping early and adjudicating by score would give the wrong answer. It’s rare... but not rare enough to give me confidence in the timeout adjudication procedure. Some points are off the bottom, so some bots won despite having less than 1/5 the score by SSCAIT rules. The adjudication procedure will make occasional extreme mistakes.
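The mistake rate is easy to count once you have the score pairs. A sketch with made-up scores; note that it uses final scores, as the chart does, while a real timeout adjudication would look at the scores at the moment the game is stopped.

```python
def adjudication_error_rate(games):
    """games: iterable of (winner_score, loser_score), both positive.
    Score adjudication picks the higher score, so it errs whenever the
    actual winner finished with the lower score (ratio below 1)."""
    games = list(games)
    wrong = sum(1 for winner, loser in games if winner / loser < 1.0)
    return wrong / len(games)

sample = [(5000, 1000), (800, 4000), (3000, 2500), (1200, 1500)]  # made-up scores
print(adjudication_error_rate(sample))  # 0.5
```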

Next: The winning attitude.

SSCAIT scores - crashes

In the SSCAIT data, some scores are recorded as 0 or -100. I think 0 means that the bot scored no kills and no razings, so the game finished very early. Bots might legitimately lose with no kills against a rush, but I have to guess that in many cases it means the bot failed to start up or failed to do anything. I think -100 means the bot crashed. I also think that exceeding the time limit per frame ends the game without changing the scores, so -100 only means crashes, not time infractions. I don’t promise that my interpretation of the numbers is correct! I could be wrong!

Anyway, assuming that my interpretation is right, this table of -100 scores should tell us about reliability. Like yesterday, I include only games after 17 August, so that distant versions of the same bot are not lumped together. The game counts are different from yesterday, because yesterday’s table excludes games in which either side has a score of 0 or -100.

“Crashes” are games in which the bot had -100 score. “Double crashes” are games in which both sides had -100 score. “Fast crashes” are games in which the opponent had 0 score, so the crash must have happened very early or when the opponent failed to achieve anything. “Max opp score” is the largest score that the opponent had in any game where the bot crashed. “Other crashes” are crashes where the opponent has a positive score (all crashes except double crashes and fast crashes). “Mean pos opp score” is the average of the opponent’s score in these other crashes. Large opponent scores mean that the game went on for a long time before the crash. Since a crash scores as -100, there is no way to tell whether the bot was ahead or behind when it crashed.

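| bot | games | crashes | crash % | double crash % | fast crash % | max opp score | other crash % | mean pos opp score |
| --- | --- | --- | --- | --- | --- | --- | --- | --- |
| krasi0 | 394 | 1 | 0.25% | 0.25% | 0.00% | -100 | 0.00% | - |
| Iron bot | 331 | 8 | 2.42% | 1.51% | 0.00% | 7225 | 0.91% | 4367 |
| Marian Devecka | 302 | 30 | 9.93% | 0.33% | 0.00% | 129110 | 9.60% | 52320 |
| Martin Rooijackers | 336 | 2 | 0.60% | 0.60% | 0.00% | -100 | 0.00% | - |
| tscmooz | 330 | 3 | 0.91% | 0.91% | 0.00% | -100 | 0.00% | - |
| tscmoo | 332 | 1 | 0.30% | 0.30% | 0.00% | -100 | 0.00% | - |
| LetaBot CIG 2016 | 318 | 1 | 0.31% | 0.31% | 0.00% | -100 | 0.00% | - |
| WuliBot | 304 | 3 | 0.99% | 0.99% | 0.00% | -100 | 0.00% | - |
| Simon Prins | 318 | 19 | 5.97% | 2.52% | 3.46% | 0 | 0.00% | - |
| ICELab | 334 | 12 | 3.59% | 0.30% | 0.30% | 3500 | 2.99% | 2960 |
| Sijia Xu | 285 | 1 | 0.35% | 0.35% | 0.00% | -100 | 0.00% | - |
| LetaBot SSCAI 2015 Final | 338 | 28 | 8.28% | 0.89% | 0.59% | 60850 | 6.80% | 16196 |
| Dave Churchill | 323 | 0 | 0.00% | 0.00% | 0.00% | - |  | - |
| Chris Coxe | 223 | 1 | 0.45% | 0.45% | 0.00% | -100 | 0.00% | - |
| Tomas Vajda | 317 | 16 | 5.05% | 0.63% | 0.32% | 118435 | 4.10% | 44220 |
| Flash | 306 | 2 | 0.65% | 0.65% | 0.00% | -100 | 0.00% | - |
| Zia bot | 327 | 33 | 10.09% | 0.92% | 1.22% | 45095 | 7.95% | 23120 |
| PeregrineBot | 157 | 0 | 0.00% | 0.00% | 0.00% | - |  | - |
| tscmoop | 328 | 2 | 0.61% | 0.61% | 0.00% | -100 | 0.00% | - |
| Andrew Smith | 334 | 5 | 1.50% | 1.50% | 0.00% | -100 | 0.00% | - |
| Florian Richoux | 308 | 31 | 10.06% | 0.97% | 0.32% | 57405 | 8.77% | 14962 |
| Carsten Nielsen | 347 | 4 | 1.15% | 1.15% | 0.00% | -100 | 0.00% | - |
| Soeren Klett | 284 | 2 | 0.70% | 0.35% | 0.35% | -100 | 0.00% | - |
| Jakub Trancik | 318 | 0 | 0.00% | 0.00% | 0.00% | - |  | - |
| Tomas Cere | 345 | 3 | 0.87% | 0.00% | 0.58% | 81825 | 0.29% | 81825 |
| MegaBot | 305 | 54 | 17.70% | 1.31% | 2.95% | 92450 | 13.44% | 16394 |
| Aurelien Lermant | 349 | 4 | 1.15% | 1.15% | 0.00% | -100 | 0.00% | - |
| Odin2014 | 183 | 17 | 9.29% | 0.55% | 1.64% | 38600 | 7.10% | 22077 |
| Gaoyuan Chen | 329 | 0 | 0.00% | 0.00% | 0.00% | - |  | - |
| DAIDOES | 136 | 8 | 5.88% | 0.74% | 0.74% | 1750 | 4.41% | 4742 |
| Igor Lacik | 144 | 13 | 9.03% | 0.69% | 0.00% | 54100 | 8.33% | 13421 |
| Matej Istenik | 323 | 1 | 0.31% | 0.31% | 0.00% | -100 | 0.00% | - |
| NUS Bot | 137 | 3 | 2.19% | 0.00% | 0.00% | 34175 | 2.19% | 22400 |
| Roman Danielis | 304 | 3 | 0.99% | 0.99% | 0.00% | -100 | 0.00% | - |
| ZerGreenBot | 36 | 25 | 69.44% | 2.78% | 8.33% | 37025 | 58.33% | 22087 |
| Ian Nicholas DaCosta | 150 | 1 | 0.67% | 0.67% | 0.00% | -100 | 0.00% | - |
| AwesomeBot | 164 | 16 | 9.76% | 1.22% | 1.83% | 27850 | 6.71% | 7968 |
| Johan Kayser | 306 | 1 | 0.33% | 0.00% | 0.33% | 0 | 0.00% | - |
| Martin Vlcak | 151 | 19 | 12.58% | 0.66% | 11.92% | 0 | 0.00% | - |
| Rob Bogie | 135 | 0 | 0.00% | 0.00% | 0.00% | - |  | - |
| Christoffer Artmann | 343 | 52 | 15.16% | 0.58% | 11.95% | 700 | 2.62% | 300 |
| Marek Gajdos | 175 | 68 | 38.86% | 26.29% | 12.00% | 100 | 0.57% | 100 |
| Travis Shelton | 155 | 0 | 0.00% | 0.00% | 0.00% | - |  | - |
| Bjorn P Mattsson | 288 | 36 | 12.50% | 0.00% | 12.50% | 0 | 0.00% | - |
| Vladimir Jurenka | 349 | 37 | 10.60% | 1.43% | 2.87% | 400 | 6.30% | 264 |
| neverdieTRX | 170 | 1 | 0.59% | 0.59% | 0.00% | -100 | 0.00% | - |
| OpprimoBot | 351 | 51 | 14.53% | 0.57% | 1.42% | 38825 | 12.54% | 17424 |
| Sungguk Cha | 320 | 168 | 52.50% | 1.25% | 4.69% | 60675 | 46.56% | 17463 |
| Jacob Knudsen | 177 | 56 | 31.64% | 0.56% | 3.39% | 17750 | 27.68% | 5957 |
| HoangPhuc | 151 | 115 | 76.16% | 0.66% | 5.96% | 38050 | 69.54% | 10765 |
| ButcherBoy | 142 | 35 | 24.65% | 0.70% | 7.04% | 10000 | 16.90% | 1364 |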
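The crash categories defined above the table can be written as a small classifier. This is a sketch of the definitions, not the script that produced the table; it relies on my interpretation that -100 means a crash, and the sample scores are made up.

```python
CRASH = -100  # the score I am interpreting as "this bot crashed"

def classify_crash(bot_score: int, opp_score: int) -> str:
    """Classify one game from the point of view of the bot being tallied."""
    if bot_score != CRASH:
        return "no crash"
    if opp_score == CRASH:
        return "double crash"
    if opp_score == 0:
        return "fast crash"
    return "other crash"  # assumes the opponent's score is positive

def crash_summary(games):
    """games: list of (bot_score, opp_score). Returns the count per crash
    category and the mean opponent score over the 'other' crashes."""
    counts = {"double crash": 0, "fast crash": 0, "other crash": 0}
    other_opp_scores = []
    for bot_score, opp_score in games:
        kind = classify_crash(bot_score, opp_score)
        if kind in counts:
            counts[kind] += 1
        if kind == "other crash":
            other_opp_scores.append(opp_score)
    mean_pos = (sum(other_opp_scores) / len(other_opp_scores)
                if other_opp_scores else None)
    return counts, mean_pos

games = [(-100, -100), (-100, 0), (-100, 3000), (-100, 5000), (1200, 800)]
print(crash_summary(games))
```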

The top bots are mostly reliable—except Marian Devecka’s Killerbot. My impression from watching games is that Killerbot may crash when losing. It makes sense that bots should crash less often when winning, because bot authors have more reason to fix crashes that lose winning games. The bottommost bots are all crash-prone. Not a coincidence, is it? Step 1 to a winning bot: Fix your crashing bugs!

We might imagine that double crashes are caused primarily by the game or some other part of the runtime system, but the high rate of double crashes by Marek Gajdos makes me question that. What is that bot breaking?

Tomorrow: I’ll try to correlate scores with ratings, one way or another.

SSCAIT scores - summary by bot

I’m looking into the scores recorded in the SSCAIT game data, which I have up to 27 September. So far I haven’t found anything too interesting, but it’s not entirely useless either.

According to the SSCAIT rules, a player’s score is the sum of its kill score (units killed) and its razing score (buildings razed): BWAPI::Player::getKillScore() + BWAPI::Player::getRazingScore().

Here’s basic score information for the dates between 17 August 2016 and 27 September 2016. It’s the same date range I used in the SSCAIT crosstables, chosen so as not to smear too many different bot versions into one table. The difference and ratio columns are all arithmetic means. They give the difference or the ratio between the winner’s and loser’s scores. (Well, the loss score ratio is the ratio between the loser’s and winner’s scores, to make it easier to compare by eye.) Games in which either side had a score of 0 (no kills) or -100 (crash) are left out.

bot | games | win % | mean score | mean win score | mean loss score | win score diff | loss score diff | win score ratio | loss score ratio
krasi029886.91%54850556764936845776930615.583.06
Iron bot24280.17%23967248352045918644-769916.774.06
Marian Devecka21192.42%211202221377989442-295788.978.94
Martin Rooijackers26079.23%25077287221117022046-276869.1413.59
tscmooz24773.68%2032225134684712118-152039.8410.71
tscmoo27271.69%35035415911843523035-240155.496.53
LetaBot CIG 201625673.05%26635302291689622995-218109.034.24
WuliBot23465.81%8441862780846814-1805717.2212.69
Simon Prins22465.62%24256280581699821971-1239117.6812.18
ICELab24266.53%36286430992274528818-263506.4710.22
Sijia Xu23663.98%121191459877159913-2346015.798.28
LetaBot SSCAI 2015 Final23664.41%1940024725976418854-1742711.045.24
Dave Churchill23956.49%6656841243776394-1637017.4111.89
Chris Coxe17557.71%2989388617633468-379823.596.49
Tomas Vajda23064.78%34027396272372734048-1923436.525.11
Flash25164.14%114351387270758335-253057.789.89
Zia bot23651.69%130821764082058544-1551812.127.28
PeregrineBot13151.91%3560541515584651-658722.7712.65
tscmoop25852.33%156532312874488055-307189.9514.85
Andrew Smith25756.03%15896183911271712180-265088.847.42
Florian Richoux22353.36%147752280455888518-228056.3011.30
Carsten Nielsen27550.91%101471212181008140-2005711.686.89
Soeren Klett22845.18%39869573192549043732-129739.3211.69
Jakub Trancik20445.10%15552159651521211133290914.556.41
Tomas Cere28144.13%1831732027748818054-289587.1316.25
MegaBot18555.68%1666122464937216042-174739.2116.45
Aurelien Lermant28837.15%18580366697886-12822-226411.2512.13
Odin201413146.56%125521675788878998-1675716.2311.67
Gaoyuan Chen25839.53%114171685578618691-266987.1313.62
DAIDOES10127.72%1079423254601517054-1009312.4313.22
Igor Lacik10535.24%1318628378492014545-144837.667.40
Matej Istenik26029.62%17509326521113719086-115097.547.64
NUS Bot9632.29%70691288542958953-147548.7313.61
Roman Danielis24422.95%19086432681188320858-271513.8310.30
ZerGreenBot666.67%97621355421802291-80453.4410.87
Ian Nicholas DaCosta11616.38%40481035828126223-87957.5612.45
AwesomeBot11526.96%81001871241849280-168983.1120.40
Johan Kayser24917.67%1296535732807823303-1020710.8711.14
Martin Vlcak10231.37%1138620558719410658-1545912.3414.06
Rob Bogie5950.85%127771624891869555-2266213.034.92
Christoffer Artmann19717.26%771224928412115025-151944.9416.67
Marek Gajdos7211.11%518716426378212820-96715.8016.60
Travis Shelton10616.98%79421781659229374-94553.5117.10
Bjorn P Mattsson19018.95%49951473027197300-152373.9928.59
Vladimir Jurenka14228.17%95611533372978970-116915.607.47
neverdieTRX13714.60%849321777622213883-107834.5511.22
OpprimoBot21812.39%15672252581431717671-565420.1813.84
Sungguk Cha12225.41%1526334072885521740-114686.508.50
Jacob Knudsen9219.57%728519250437511994-121935.0818.43
HoangPhuc1291.67%151481631623009550-223453.1410.72
ButcherBoy156.67%1777565515003105-65432.2218.09
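For reference, here is a sketch of how the win-side difference and ratio columns are defined, with games scoring 0 or -100 left out. The function names and the sample scores are made up; this illustrates the definitions and is not the script behind the table.

```python
from statistics import mean

def usable(score_a: int, score_b: int) -> bool:
    """Drop games where either side scored 0 (no kills) or -100 (crash)."""
    return score_a not in (0, -100) and score_b not in (0, -100)

def win_columns(games):
    """games: list of (bot_score, opp_score, bot_won).
    Returns (mean win score diff, mean win score ratio) over the bot's wins."""
    wins = [(b, o) for b, o, won in games if won and usable(b, o)]
    diffs = [b - o for b, o in wins]
    ratios = [b / o for b, o in wins]
    return mean(diffs), mean(ratios)

games = [  # made-up scores
    (6000, 2000, True),
    (9000, 1000, True),
    (4000, 0, True),      # excluded: the opponent scored 0
    (1500, 7000, False),  # a loss, not counted in the win columns
]
print(win_columns(games))
```

Because each column is an arithmetic mean over games, the mean ratio is not the ratio of the mean scores; one lopsided game can pull it a long way.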

The most striking point is that Krasi0 was ahead in points, on average, in the games that it lost. So was Jakub Trancik’s cannon bot. The data that I have does not record the cause of losses. It’s perfectly possible to lose while ahead on points, when you fight efficiently and destroy masses of enemy stuff before dying. But you may also be ahead on points when you crash or overstep the time limit.

Score increases as the game goes on. I think that the score diff columns mostly tell us how long the games were. So, for example, Marian Devecka’s Killerbot often won short games and hung on through a long fight in lost games. The score ratio columns seem more informative about how far ahead the winner was at the end of the game. Killerbot tended to win with about the same point ratio that it lost with. Krasi0 won with a huge point ratio and lost with a small ratio, which might reflect its defensive style. Iron, which is super-aggressive, also won with a huge ratio and lost with a small ratio, which in its case might mean that it won after a long series of pinpricks or lost in a sudden collapse.

The numbers are not easy to interpret! But they must mean something.

Tomorrow: I’ll try to dig out something about the rate of failing to start up.

AIIDE 2016 - upsets by player

Here is a list of the AIIDE 2016 players with upsets, cases in which a lower-ranked bot defeated a higher opponent. 15 of the 21 participants scored a total of 27 upsets out of 210 pairings. In most cases the upset was of an opponent only slightly ahead—the table doesn’t worry about the size of the upset. Xelnaga, well-positioned near the end of a run of bots with close scores, is the runaway upset champion; it also upset LetaBot 8 places ahead. To earn an upset you have to score worse overall, so another way to say it is that Xelnaga is the most inconsistent performer, with strengths and weaknesses that don’t always offset each other. The deepest upset is JiaBot’s upset of ZZZKBot, ranked 9 places ahead.

| bot | upsets |
| --- | --- |
| 2 ZZZKBot | 1 Iron |
| 4 LetaBot | 3 tscmoo |
| 5 UAlbertaBot | 3 tscmoo |
| 7 Overkill | 2 ZZZKBot, 5 UAlbertaBot |
| 8 Aiur | 4 LetaBot, 7 Overkill |
| 9 MegaBot | 5 UAlbertaBot, 7 Overkill |
| 10 IceBot | 6 Ximp, 9 MegaBot |
| 11 JiaBot | 2 ZZZKBot, 9 MegaBot |
| 12 Xelnaga | 4 LetaBot, 7 Overkill, 9 MegaBot, 10 IceBot, 11 JiaBot |
| 13 Skynet | 11 JiaBot, 12 Xelnaga |
| 14 GarmBot | 9 MegaBot, 12 Xelnaga |
| 15 NUSBot | 8 Aiur, 13 Skynet |
| 17 SRbotOne | 15 NUSBot |
| 19 Oritaka | 14 GarmBot |
| 20 CruzBot | 15 NUSBot |

Upsets are interesting because they suggest weaknesses in the stronger bot. They’re clues that may point toward what’s important to fix. For example, the upsets of #2 ZZZKBot by much less successful zergs remind us that 4 pool is easy for zerg to counter; protoss and terran need specific knowledge to counter the rush, but zerg can play a standard 9 pool (which in ZvZ is also fine against other openings) as a hard counter.
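Counting upsets from a crosstable is mechanical. The sketch below assumes an upset means the lower-ranked bot won more than half of the games in the pairing, so an exact 50% is not an upset. The win rates are the top five bots’ entries from the overall AIIDE 2016 crosstable; the function name is mine.

```python
# Win rates (%) of the first bot against the second, overall AIIDE 2016 results.
# `bots` is in final rank order, best first.
bots = ["Iron", "ZZZKBot", "tscmoo", "LetaBot", "UAlbertaBot"]
win_pct = {
    ("ZZZKBot", "Iron"): 77,
    ("tscmoo", "Iron"): 16, ("tscmoo", "ZZZKBot"): 50,
    ("LetaBot", "Iron"): 13, ("LetaBot", "ZZZKBot"): 49, ("LetaBot", "tscmoo"): 57,
    ("UAlbertaBot", "Iron"): 14, ("UAlbertaBot", "ZZZKBot"): 23,
    ("UAlbertaBot", "tscmoo"): 58, ("UAlbertaBot", "LetaBot"): 34,
}

def find_upsets(bots, win_pct):
    """Return (lower, higher) pairs where the lower-ranked bot won the pairing."""
    upsets = []
    for hi_rank, higher in enumerate(bots):
        for lower in bots[hi_rank + 1:]:
            if win_pct[(lower, higher)] > 50:
                upsets.append((lower, higher))
    return upsets

print(find_upsets(bots, win_pct))
# [('ZZZKBot', 'Iron'), ('LetaBot', 'tscmoo'), ('UAlbertaBot', 'tscmoo')]
```

Those three pairs are the first three rows of the table.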

AIIDE 2016 - upsets by map

I counted the number of upsets in the crosstables for each map. An upset is when a lower-ranked player defeats one who ends up at a higher rank. On any given map you should expect more upsets than in the tournament overall, which sums over the maps and smooths out differences.

| map | upsets |
| --- | --- |
| (2)Benzene | 32 |
| (2)Destination | 16 |
| (2)HeartbreakRidge | 36 |
| (3)Aztec | 29 |
| (3)TauCross | 33 |
| (4)Andromeda | 32 |
| (4)CircuitBreaker | 37 |
| (4)EmpireoftheSun | 33 |
| (4)Fortress | 36 |
| (4)Python | 30 |
I expected that more standard maps would have fewer upsets, but it didn’t turn out that way. Less standard Heartbreak Ridge is the only map to tease me by acting as if I understood. I’m a little surprised by the high upset counts on standard Circuit Breaker and Fortress and the lower counts on less standard Aztec. I am very surprised that Destination has half the upsets of other maps! I can’t explain it.

Another theory you might try is: Many bots are tuned on SSCAIT, so SSCAIT maps might show more solid play and fewer upsets. The non-SSCAIT maps here are only Aztec and Fortress, which don’t support the hypothesis (though there’s little evidence either way).

I thought that per-map upset rate might be a way to measure the strategic fragility of bots on maps that they weren’t tuned for, but it may also measure code fragility. Look into yesterday’s crosstables to see where today’s totals came from. Heartbreak Ridge has many upsets not so much because bots weren’t ready for it, but because NUSBot wasn’t ready for it. LetaBot could not defend Fortress. Only Circuit Breaker’s high rate of upsets was not helped along by any single bot—and Circuit Breaker is as standard as they come. So we’re not learning as much about map characteristics as about bot characteristics.

Can anybody explain why Destination was more stable and less upset-prone than other maps? That’s the big mystery here.

AIIDE 2016 - crosstables per map

AIIDE 2016 crosstables for each map in the tournament. First, the whole tournament crosstable, for reference. It should match the one on the official results page, though for some reason some of the overall win percentages differ in the second decimal (by a trivial amount). The official results are correct, so I may have some floating point error accumulation (an off-by-one error would cause a bigger discrepancy).

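| AIIDE 2016 | overall | Iron | ZZZK | tscm | Leta | UAlb | Ximp | Over | Aiur | Mega | IceB | JiaB | Xeln | Skyn | Garm | NUSB | Terr | SRbo | Cime | Orit | Cruz | Tyr |
| --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- |
| Iron | 87.44% | - | 23% | 84% | 87% | 86% | 99% | 88% | 96% | 68% | 98% | 81% | 91% | 57% | 100% | 96% | 100% | 99% | 98% | 100% | 100% | 100% |
| ZZZKBot | 85.05% | 77% | - | 50% | 51% | 77% | 98% | 39% | 87% | 92% | 93% | 46% | 100% | 94% | 100% | 100% | 100% | 98% | 100% | 100% | 100% | 100% |
| tscmoo | 82.71% | 16% | 50% | - | 43% | 42% | 91% | 97% | 80% | 90% | 99% | 91% | 87% | 83% | 100% | 98% | 100% | 100% | 99% | 89% | 100% | 100% |
| LetaBot | 74.00% | 13% | 49% | 57% | - | 66% | 68% | 94% | 34% | 52% | 84% | 88% | 39% | 62% | 93% | 93% | 92% | 99% | 99% | 97% | 100% | 100% |
| UAlbertaBot | 70.37% | 14% | 23% | 58% | 34% | - | 76% | 42% | 80% | 47% | 77% | 71% | 100% | 79% | 81% | 83% | 99% | 86% | 59% | 99% | 100% | 100% |
| Ximp | 64.54% | 1% | 2% | 9% | 32% | 24% | - | 52% | 81% | 67% | 7% | 96% | 62% | 98% | 94% | 91% | 98% | 92% | 96% | 92% | 97% | 100% |
| Overkill | 61.93% | 12% | 61% | 3% | 6% | 58% | 48% | - | 30% | 34% | 79% | 52% | 24% | 62% | 94% | 97% | 96% | 98% | 94% | 92% | 100% | 100% |
| Aiur | 61.26% | 4% | 13% | 20% | 66% | 20% | 19% | 70% | - | 53% | 79% | 53% | 56% | 86% | 87% | 44% | 88% | 96% | 92% | 88% | 91% | 100% |
| MegaBot | 58.42% | 32% | 8% | 10% | 48% | 53% | 33% | 66% | 47% | - | 49% | 42% | 32% | 78% | 47% | 91% | 94% | 83% | 89% | 84% | 91% | 90% |
| IceBot | 57.43% | 2% | 7% | 1% | 16% | 23% | 93% | 21% | 21% | 51% | - | 56% | 36% | 61% | 61% | 100% | 100% | 100% | 99% | 100% | 100% | 100% |
| JiaBot | 57.25% | 19% | 54% | 9% | 12% | 29% | 4% | 48% | 47% | 58% | 44% | - | 49% | 37% | 98% | 89% | 90% | 82% | 99% | 86% | 91% | 100% |
| Xelnaga | 56.98% | 9% | 0% | 13% | 61% | 0% | 38% | 76% | 44% | 68% | 64% | 51% | - | 42% | 2% | 99% | 100% | 92% | 94% | 89% | 96% | 100% |
| Skynet | 55.03% | 43% | 6% | 17% | 38% | 21% | 2% | 38% | 14% | 22% | 39% | 63% | 58% | - | 100% | 39% | 100% | 100% | 100% | 100% | 100% | 100% |
| GarmBot | 42.52% | 0% | 0% | 0% | 7% | 19% | 6% | 6% | 13% | 53% | 39% | 2% | 98% | 0% | - | 84% | 84% | 100% | 93% | 47% | 99% | 100% |
| NUSBot | 27.46% | 4% | 0% | 2% | 7% | 17% | 9% | 3% | 56% | 9% | 0% | 11% | 1% | 61% | 16% | - | 57% | 34% | 61% | 81% | 30% | 90% |
| TerranUAB | 27.39% | 0% | 0% | 0% | 8% | 1% | 2% | 4% | 12% | 6% | 0% | 10% | 0% | 0% | 16% | 43% | - | 90% | 73% | 88% | 96% | 99% |
| SRbotOne | 22.46% | 1% | 2% | 0% | 1% | 14% | 8% | 2% | 4% | 17% | 0% | 18% | 8% | 0% | 0% | 66% | 10% | - | 53% | 88% | 57% | 100% |
| Cimex | 20.50% | 2% | 0% | 1% | 1% | 41% | 4% | 6% | 8% | 11% | 1% | 1% | 6% | 0% | 7% | 39% | 27% | 47% | - | 59% | 51% | 99% |
| Oritaka | 19.51% | 0% | 0% | 11% | 3% | 1% | 8% | 8% | 12% | 16% | 0% | 14% | 11% | 0% | 53% | 19% | 12% | 12% | 41% | - | 68% | 100% |
| CruzBot | 16.75% | 0% | 0% | 0% | 0% | 0% | 3% | 0% | 9% | 9% | 0% | 9% | 4% | 0% | 1% | 70% | 4% | 43% | 49% | 32% | - | 100% |
| Tyr | 1.11% | 0% | 0% | 0% | 0% | 0% | 0% | 0% | 0% | 10% | 0% | 0% | 0% | 0% | 0% | 10% | 1% | 0% | 1% | 0% | 0% | - |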

Now each of the 10 maps. Each cell represents only 9 games, occasionally fewer when games are missing.

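| (2)Benzene | overall | Iron | ZZZK | tscm | Leta | UAlb | Ximp | Over | Aiur | Mega | IceB | JiaB | Xeln | Skyn | Garm | NUSB | Terr | SRbo | Cime | Orit | Cruz | Tyr |
| --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- |
| Iron | 95.00% | - | 89% | 100% | 78% | 100% | 100% | 78% | 100% | 89% | 100% | 89% | 100% | 78% | 100% | 100% | 100% | 100% | 100% | 100% | 100% | 100% |
| ZZZKBot | 73.89% | 11% | - | 22% | 11% | 78% | 100% | 11% | 11% | 89% | 100% | 67% | 100% | 78% | 100% | 100% | 100% | 100% | 100% | 100% | 100% | 100% |
| tscmoo | 83.33% | 0% | 78% | - | 22% | 44% | 100% | 100% | 78% | 89% | 100% | 100% | 78% | 78% | 100% | 100% | 100% | 100% | 100% | 100% | 100% | 100% |
| LetaBot | 81.67% | 22% | 89% | 78% | - | 67% | 67% | 100% | 44% | 33% | 100% | 100% | 44% | 89% | 100% | 100% | 100% | 100% | 100% | 100% | 100% | 100% |
| UAlbertaBot | 64.80% | 0% | 22% | 56% | 33% | - | 44% | 22% | 88% | 44% | 78% | 67% | 100% | 78% | 78% | 89% | 89% | 67% | 56% | 89% | 100% | 100% |
| Ximp | 59.22% | 0% | 0% | 0% | 33% | 56% | - | 25% | 11% | 78% | 11% | 100% | 44% | 100% | 100% | 33% | 100% | 100% | 100% | 89% | 100% | 100% |
| Overkill | 68.93% | 22% | 89% | 0% | 0% | 78% | 75% | - | 22% | 44% | 89% | 78% | 22% | 100% | 89% | 100% | 89% | 100% | 100% | 89% | 100% | 100% |
| Aiur | 72.07% | 0% | 89% | 22% | 56% | 12% | 89% | 78% | - | 78% | 67% | 78% | 78% | 89% | 100% | 33% | 100% | 89% | 89% | 100% | 89% | 100% |
| MegaBot | 56.11% | 11% | 11% | 11% | 67% | 56% | 22% | 56% | 22% | - | 22% | 44% | 22% | 89% | 56% | 100% | 89% | 78% | 100% | 100% | 67% | 100% |
| IceBot | 60.00% | 0% | 0% | 0% | 0% | 22% | 89% | 11% | 33% | 78% | - | 67% | 33% | 100% | 67% | 100% | 100% | 100% | 100% | 100% | 100% | 100% |
| JiaBot | 52.78% | 11% | 33% | 0% | 0% | 33% | 0% | 22% | 22% | 56% | 33% | - | 67% | 56% | 100% | 67% | 100% | 89% | 100% | 89% | 78% | 100% |
| Xelnaga | 56.67% | 0% | 0% | 22% | 56% | 0% | 56% | 78% | 22% | 78% | 67% | 33% | - | 67% | 0% | 100% | 100% | 100% | 89% | 89% | 78% | 100% |
| Skynet | 45.81% | 22% | 22% | 22% | 11% | 22% | 0% | 0% | 11% | 11% | 0% | 44% | 33% | - | 100% | 11% | 100% | 100% | 100% | 100% | 100% | 100% |
| GarmBot | 42.78% | 0% | 0% | 0% | 0% | 22% | 0% | 11% | 0% | 44% | 33% | 0% | 100% | 0% | - | 78% | 100% | 100% | 100% | 67% | 100% | 100% |
| NUSBot | 31.11% | 0% | 0% | 0% | 0% | 11% | 67% | 0% | 67% | 0% | 0% | 33% | 0% | 89% | 22% | - | 22% | 33% | 67% | 100% | 11% | 100% |
| TerranUAB | 27.22% | 0% | 0% | 0% | 0% | 11% | 0% | 11% | 0% | 11% | 0% | 0% | 0% | 0% | 0% | 78% | - | 100% | 56% | 89% | 89% | 100% |
| SRbotOne | 20.56% | 0% | 0% | 0% | 0% | 33% | 0% | 0% | 11% | 22% | 0% | 11% | 0% | 0% | 0% | 67% | 0% | - | 44% | 100% | 22% | 100% |
| Cimex | 22.78% | 0% | 0% | 0% | 0% | 44% | 0% | 0% | 11% | 0% | 0% | 0% | 11% | 0% | 0% | 33% | 44% | 56% | - | 67% | 89% | 100% |
| Oritaka | 14.44% | 0% | 0% | 0% | 0% | 11% | 11% | 11% | 0% | 0% | 0% | 11% | 11% | 0% | 33% | 0% | 11% | 0% | 33% | - | 56% | 100% |
| CruzBot | 21.23% | 0% | 0% | 0% | 0% | 0% | 0% | 0% | 11% | 33% | 0% | 22% | 22% | 0% | 0% | 89% | 11% | 78% | 11% | 44% | - | 100% |
| Tyr | 0.00% | 0% | 0% | 0% | 0% | 0% | 0% | 0% | 0% | 0% | 0% | 0% | 0% | 0% | 0% | 0% | 0% | 0% | 0% | 0% | 0% | - |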
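| (2)Destination | overall | Iron | ZZZK | tscm | Leta | UAlb | Ximp | Over | Aiur | Mega | IceB | JiaB | Xeln | Skyn | Garm | NUSB | Terr | SRbo | Cime | Orit | Cruz | Tyr |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| Iron | 91.67% | - | 22% | 89% | 67% | 89% | 100% | 100% | 100% | 78% | 100% | 89% | 100% | 100% | 100% | 100% | 100% | 100% | 100% | 100% | 100% | 100% |
| ZZZKBot | 89.44% | 78% | - | 56% | 56% | 100% | 100% | 44% | 100% | 100% | 100% | 67% | 100% | 89% | 100% | 100% | 100% | 100% | 100% | 100% | 100% | 100% |
| tscmoo | 85.47% | 11% | 44% | - | 44% | 67% | 100% | 89% | 67% | 100% | 100% | 89% | 100% | 100% | 100% | 100% | 100% | 100% | 100% | 100% | 100% | 100% |
| LetaBot | 84.44% | 33% | 44% | 56% | - | 67% | 89% | 100% | 56% | 89% | 100% | 78% | 78% | 100% | 100% | 100% | 100% | 100% | 100% | 100% | 100% | 100% |
| UAlbertaBot | 68.33% | 11% | 0% | 33% | 33% | - | 56% | 11% | 78% | 78% | 78% | 67% | 100% | 100% | 89% | 89% | 100% | 78% | 67% | 100% | 100% | 100% |
| Ximp | 70.00% | 0% | 0% | 0% | 11% | 44% | - | 78% | 100% | 78% | 11% | 89% | 89% | 100% | 100% | 100% | 100% | 100% | 100% | 100% | 100% | 100% |
| Overkill | 63.89% | 0% | 56% | 11% | 0% | 89% | 22% | - | 44% | 67% | 78% | 67% | 33% | 56% | 78% | 89% | 100% | 100% | 89% | 100% | 100% | 100% |
| Aiur | 61.67% | 0% | 0% | 33% | 44% | 22% | 0% | 56% | - | 67% | 67% | 33% | 100% | 100% | 89% | 56% | 100% | 100% | 78% | 89% | 100% | 100% |
| MegaBot | 50.56% | 22% | 0% | 0% | 11% | 22% | 22% | 33% | 33% | - | 11% | 67% | 22% | 89% | 11% | 89% | 89% | 100% | 89% | 100% | 100% | 100% |
| IceBot | 60.56% | 0% | 0% | 0% | 0% | 22% | 89% | 22% | 33% | 89% | - | 56% | 33% | 89% | 89% | 100% | 100% | 100% | 89% | 100% | 100% | 100% |
| JiaBot | 56.67% | 11% | 33% | 11% | 22% | 33% | 11% | 33% | 67% | 33% | 44% | - | 67% | 22% | 100% | 78% | 89% | 78% | 100% | 100% | 100% | 100% |
| Xelnaga | 51.11% | 0% | 0% | 0% | 22% | 0% | 11% | 67% | 0% | 78% | 67% | 33% | - | 67% | 11% | 100% | 100% | 78% | 100% | 89% | 100% | 100% |
| Skynet | 46.67% | 0% | 11% | 0% | 0% | 0% | 0% | 44% | 0% | 11% | 11% | 78% | 33% | - | 100% | 44% | 100% | 100% | 100% | 100% | 100% | 100% |
| GarmBot | 43.89% | 0% | 0% | 0% | 0% | 11% | 0% | 22% | 11% | 89% | 11% | 0% | 89% | 0% | - | 67% | 78% | 100% | 100% | 100% | 100% | 100% |
| NUSBot | 35.00% | 0% | 0% | 0% | 0% | 11% | 0% | 11% | 44% | 11% | 0% | 22% | 0% | 56% | 33% | - | 89% | 100% | 78% | 78% | 67% | 100% |
| TerranUAB | 24.44% | 0% | 0% | 0% | 0% | 0% | 0% | 0% | 0% | 11% | 0% | 11% | 0% | 0% | 22% | 11% | - | 89% | 78% | 78% | 89% | 100% |
| SRbotOne | 21.79% | 0% | 0% | 0% | 0% | 22% | 0% | 0% | 0% | 0% | 0% | 22% | 22% | 0% | 0% | 0% | 11% | - | 78% | 100% | 78% | 100% |
| Cimex | 17.78% | 0% | 0% | 0% | 0% | 33% | 0% | 11% | 22% | 11% | 11% | 0% | 0% | 0% | 0% | 22% | 22% | 22% | - | 67% | 33% | 100% |
| Oritaka | 13.89% | 0% | 0% | 0% | 0% | 0% | 0% | 0% | 11% | 0% | 0% | 0% | 11% | 0% | 0% | 22% | 22% | 0% | 33% | - | 78% | 100% |
| CruzBot | 12.78% | 0% | 0% | 0% | 0% | 0% | 0% | 0% | 0% | 0% | 0% | 0% | 0% | 0% | 0% | 33% | 11% | 22% | 67% | 22% | - | 100% |
| Tyr | 0.00% | 0% | 0% | 0% | 0% | 0% | 0% | 0% | 0% | 0% | 0% | 0% | 0% | 0% | 0% | 0% | 0% | 0% | 0% | 0% | 0% | - |
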
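| (2)Heartbreak Ridge | overall | Iron | ZZZK | tscm | Leta | UAlb | Ximp | Over | Aiur | Mega | IceB | JiaB | Xeln | Skyn | Garm | NUSB | Terr | SRbo | Cime | Orit | Cruz | Tyr |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| Iron | 88.89% | - | 22% | 89% | 89% | 89% | 100% | 100% | 100% | 33% | 100% | 89% | 78% | 89% | 100% | 100% | 100% | 100% | 100% | 100% | 100% | 100% |
| ZZZKBot | 87.78% | 78% | - | 78% | 67% | 78% | 100% | 44% | 100% | 100% | 100% | 11% | 100% | 100% | 100% | 100% | 100% | 100% | 100% | 100% | 100% | 100% |
| tscmoo | 80.56% | 11% | 22% | - | 44% | 56% | 78% | 100% | 67% | 89% | 100% | 89% | 89% | 89% | 100% | 100% | 100% | 100% | 100% | 78% | 100% | 100% |
| LetaBot | 72.22% | 11% | 33% | 56% | - | 78% | 44% | 100% | 22% | 11% | 89% | 100% | 0% | 100% | 100% | 100% | 100% | 100% | 100% | 100% | 100% | 100% |
| UAlbertaBot | 70.00% | 11% | 22% | 44% | 22% | - | 56% | 44% | 89% | 56% | 78% | 56% | 100% | 89% | 100% | 100% | 100% | 67% | 67% | 100% | 100% | 100% |
| Ximp | 73.33% | 0% | 0% | 22% | 56% | 44% | - | 67% | 100% | 78% | 11% | 100% | 89% | 100% | 100% | 100% | 100% | 100% | 100% | 100% | 100% | 100% |
| Overkill | 51.18% | 0% | 56% | 0% | 0% | 56% | 33% | - | 0% | 12% | 83% | 22% | 12% | 22% | 100% | 100% | 89% | 89% | 89% | 88% | 100% | 100% |
| Aiur | 68.89% | 0% | 0% | 33% | 78% | 11% | 0% | 100% | - | 67% | 100% | 78% | 78% | 100% | 78% | 100% | 78% | 100% | 100% | 78% | 100% | 100% |
| MegaBot | 54.19% | 67% | 0% | 11% | 89% | 44% | 22% | 88% | 33% | - | 100% | 44% | 22% | 11% | 44% | 100% | 100% | 100% | 100% | 22% | 89% | 0% |
| IceBot | 54.80% | 0% | 0% | 0% | 11% | 22% | 89% | 17% | 0% | 0% | - | 56% | 11% | 89% | 89% | 100% | 100% | 100% | 100% | 100% | 100% | 100% |
| JiaBot | 58.33% | 11% | 89% | 11% | 0% | 44% | 0% | 78% | 22% | 56% | 44% | - | 44% | 22% | 100% | 100% | 100% | 67% | 100% | 89% | 89% | 100% |
| Xelnaga | 60.89% | 22% | 0% | 11% | 100% | 0% | 11% | 88% | 22% | 78% | 89% | 56% | - | 67% | 0% | 100% | 100% | 100% | 100% | 89% | 89% | 100% |
| Skynet | 56.11% | 11% | 0% | 11% | 0% | 11% | 0% | 78% | 0% | 89% | 11% | 78% | 33% | - | 100% | 100% | 100% | 100% | 100% | 100% | 100% | 100% |
| GarmBot | 38.33% | 0% | 0% | 0% | 0% | 0% | 0% | 0% | 22% | 56% | 11% | 0% | 100% | 0% | - | 100% | 78% | 100% | 100% | 11% | 89% | 100% |
| NUSBot | 0.00% | 0% | 0% | 0% | 0% | 0% | 0% | 0% | 0% | 0% | 0% | 0% | 0% | 0% | 0% | - | 0% | 0% | 0% | 0% | 0% | 0% |
| TerranUAB | 30.56% | 0% | 0% | 0% | 0% | 0% | 0% | 11% | 22% | 0% | 0% | 0% | 0% | 0% | 22% | 100% | - | 89% | 78% | 100% | 89% | 100% |
| SRbotOne | 22.22% | 0% | 0% | 0% | 0% | 33% | 0% | 11% | 0% | 0% | 0% | 33% | 0% | 0% | 0% | 100% | 11% | - | 44% | 67% | 44% | 100% |
| Cimex | 21.67% | 0% | 0% | 0% | 0% | 33% | 0% | 11% | 0% | 0% | 0% | 0% | 0% | 0% | 0% | 100% | 22% | 56% | - | 56% | 56% | 100% |
| Oritaka | 28.49% | 0% | 0% | 22% | 0% | 0% | 0% | 12% | 22% | 78% | 0% | 11% | 11% | 0% | 89% | 100% | 0% | 33% | 44% | - | 44% | 100% |
| CruzBot | 20.79% | 0% | 0% | 0% | 0% | 0% | 0% | 0% | 0% | 11% | 0% | 11% | 11% | 0% | 11% | 100% | 11% | 56% | 44% | 56% | - | 100% |
| Tyr | 10.06% | 0% | 0% | 0% | 0% | 0% | 0% | 0% | 0% | 100% | 0% | 0% | 0% | 0% | 0% | 100% | 0% | 0% | 0% | 0% | 0% | - |
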
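| (3)Aztec | overall | Iron | ZZZK | tscm | Leta | UAlb | Ximp | Over | Aiur | Mega | IceB | JiaB | Xeln | Skyn | Garm | NUSB | Terr | SRbo | Cime | Orit | Cruz | Tyr |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| Iron | 91.11% | - | 11% | 78% | 89% | 100% | 100% | 89% | 100% | 89% | 100% | 89% | 89% | 89% | 100% | 100% | 100% | 100% | 100% | 100% | 100% | 100% |
| ZZZKBot | 85.56% | 89% | - | 56% | 44% | 89% | 100% | 44% | 78% | 89% | 100% | 22% | 100% | 100% | 100% | 100% | 100% | 100% | 100% | 100% | 100% | 100% |
| tscmoo | 81.67% | 22% | 44% | - | 22% | 56% | 78% | 100% | 67% | 89% | 100% | 100% | 100% | 67% | 100% | 89% | 100% | 100% | 100% | 100% | 100% | 100% |
| LetaBot | 78.33% | 11% | 56% | 78% | - | 44% | 89% | 100% | 33% | 56% | 100% | 100% | 11% | 100% | 100% | 89% | 100% | 100% | 100% | 100% | 100% | 100% |
| UAlbertaBot | 65.56% | 0% | 11% | 44% | 56% | - | 100% | 11% | 44% | 56% | 56% | 89% | 100% | 78% | 56% | 89% | 100% | 89% | 33% | 100% | 100% | 100% |
| Ximp | 62.78% | 0% | 0% | 22% | 11% | 0% | - | 67% | 100% | 67% | 0% | 78% | 89% | 100% | 89% | 100% | 89% | 78% | 78% | 100% | 89% | 100% |
| Overkill | 60.00% | 11% | 56% | 0% | 0% | 89% | 33% | - | 22% | 11% | 100% | 56% | 11% | 67% | 89% | 100% | 100% | 100% | 100% | 56% | 100% | 100% |
| Aiur | 63.89% | 0% | 22% | 33% | 67% | 56% | 0% | 78% | - | 67% | 44% | 44% | 67% | 89% | 89% | 67% | 89% | 89% | 100% | 89% | 89% | 100% |
| MegaBot | 60.00% | 11% | 11% | 11% | 44% | 44% | 33% | 89% | 33% | - | 67% | 22% | 11% | 89% | 56% | 100% | 100% | 100% | 89% | 100% | 89% | 100% |
| IceBot | 59.44% | 0% | 0% | 0% | 0% | 44% | 100% | 0% | 56% | 33% | - | 44% | 33% | 100% | 78% | 100% | 100% | 100% | 100% | 100% | 100% | 100% |
| JiaBot | 60.00% | 11% | 78% | 0% | 0% | 11% | 22% | 44% | 56% | 78% | 56% | - | 56% | 56% | 89% | 78% | 89% | 100% | 100% | 100% | 78% | 100% |
| Xelnaga | 57.78% | 11% | 0% | 0% | 89% | 0% | 11% | 89% | 33% | 89% | 67% | 44% | - | 33% | 0% | 100% | 100% | 100% | 89% | 100% | 100% | 100% |
| Skynet | 48.89% | 11% | 0% | 33% | 0% | 22% | 0% | 33% | 11% | 11% | 0% | 44% | 67% | - | 100% | 44% | 100% | 100% | 100% | 100% | 100% | 100% |
| GarmBot | 45.00% | 0% | 0% | 0% | 0% | 44% | 11% | 11% | 11% | 44% | 22% | 11% | 100% | 0% | - | 100% | 100% | 100% | 100% | 44% | 100% | 100% |
| NUSBot | 22.22% | 0% | 0% | 11% | 11% | 11% | 0% | 0% | 33% | 0% | 0% | 22% | 0% | 56% | 0% | - | 11% | 22% | 44% | 89% | 33% | 100% |
| TerranUAB | 27.78% | 0% | 0% | 0% | 0% | 0% | 11% | 0% | 11% | 0% | 0% | 11% | 0% | 0% | 0% | 89% | - | 89% | 67% | 89% | 89% | 100% |
| SRbotOne | 23.89% | 0% | 0% | 0% | 0% | 11% | 22% | 0% | 11% | 0% | 0% | 0% | 0% | 0% | 0% | 78% | 11% | - | 67% | 100% | 78% | 100% |
| Cimex | 24.44% | 0% | 0% | 0% | 0% | 67% | 22% | 0% | 0% | 11% | 0% | 0% | 11% | 0% | 0% | 56% | 33% | 33% | - | 78% | 78% | 100% |
| Oritaka | 16.67% | 0% | 0% | 0% | 0% | 0% | 0% | 44% | 11% | 0% | 0% | 0% | 0% | 0% | 56% | 11% | 11% | 0% | 22% | - | 78% | 100% |
| CruzBot | 15.00% | 0% | 0% | 0% | 0% | 0% | 11% | 0% | 11% | 11% | 0% | 22% | 0% | 0% | 0% | 67% | 11% | 22% | 22% | 22% | - | 100% |
| Tyr | 0.00% | 0% | 0% | 0% | 0% | 0% | 0% | 0% | 0% | 0% | 0% | 0% | 0% | 0% | 0% | 0% | 0% | 0% | 0% | 0% | 0% | - |
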
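| (3)Tau Cross | overall | Iron | ZZZK | tscm | Leta | UAlb | Ximp | Over | Aiur | Mega | IceB | JiaB | Xeln | Skyn | Garm | NUSB | Terr | SRbo | Cime | Orit | Cruz | Tyr |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| Iron | 86.67% | - | 22% | 100% | 89% | 67% | 100% | 89% | 89% | 44% | 100% | 78% | 67% | 89% | 100% | 100% | 100% | 100% | 100% | 100% | 100% | 100% |
| ZZZKBot | 83.80% | 78% | - | 44% | 12% | 89% | 100% | 22% | 89% | 100% | 100% | 44% | 100% | 100% | 100% | 100% | 100% | 89% | 100% | 100% | 100% | 100% |
| tscmoo | 84.44% | 0% | 56% | - | 44% | 78% | 89% | 89% | 100% | 100% | 100% | 100% | 78% | 89% | 100% | 100% | 100% | 100% | 100% | 67% | 100% | 100% |
| LetaBot | 76.70% | 11% | 88% | 56% | - | 78% | 44% | 100% | 44% | 33% | 89% | 100% | 22% | 89% | 88% | 100% | 100% | 100% | 100% | 100% | 100% | 100% |
| UAlbertaBot | 67.22% | 33% | 11% | 22% | 22% | - | 44% | 56% | 78% | 22% | 67% | 67% | 100% | 78% | 100% | 56% | 100% | 100% | 89% | 100% | 100% | 100% |
| Ximp | 70.56% | 0% | 0% | 11% | 56% | 56% | - | 56% | 78% | 67% | 11% | 100% | 78% | 100% | 100% | 100% | 100% | 100% | 100% | 100% | 100% | 100% |
| Overkill | 59.78% | 11% | 78% | 11% | 0% | 44% | 44% | - | 22% | 33% | 56% | 33% | 44% | 44% | 100% | 100% | 89% | 100% | 89% | 89% | 100% | 100% |
| Aiur | 61.67% | 11% | 11% | 0% | 56% | 22% | 22% | 78% | - | 67% | 89% | 56% | 67% | 100% | 78% | 44% | 67% | 100% | 100% | 78% | 89% | 100% |
| MegaBot | 63.89% | 56% | 0% | 0% | 67% | 78% | 33% | 67% | 33% | - | 56% | 44% | 33% | 89% | 56% | 89% | 100% | 100% | 89% | 89% | 100% | 100% |
| IceBot | 58.89% | 0% | 0% | 0% | 11% | 33% | 89% | 44% | 11% | 44% | - | 67% | 44% | 89% | 44% | 100% | 100% | 100% | 100% | 100% | 100% | 100% |
| JiaBot | 53.63% | 22% | 56% | 0% | 0% | 33% | 0% | 67% | 44% | 56% | 33% | - | 33% | 11% | 100% | 100% | 67% | 78% | 100% | 78% | 89% | 100% |
| Xelnaga | 56.11% | 33% | 0% | 22% | 78% | 0% | 22% | 56% | 33% | 67% | 56% | 67% | - | 22% | 0% | 100% | 100% | 100% | 100% | 67% | 100% | 100% |
| Skynet | 50.56% | 11% | 0% | 11% | 11% | 22% | 0% | 56% | 0% | 11% | 11% | 89% | 78% | - | 100% | 11% | 100% | 100% | 100% | 100% | 100% | 100% |
| GarmBot | 40.78% | 0% | 0% | 0% | 12% | 0% | 0% | 0% | 22% | 44% | 56% | 0% | 100% | 0% | - | 67% | 78% | 100% | 100% | 33% | 100% | 100% |
| NUSBot | 34.44% | 0% | 0% | 0% | 0% | 44% | 0% | 0% | 56% | 11% | 0% | 0% | 0% | 89% | 33% | - | 100% | 33% | 100% | 89% | 33% | 100% |
| TerranUAB | 26.67% | 0% | 0% | 0% | 0% | 0% | 0% | 11% | 33% | 0% | 0% | 33% | 0% | 0% | 22% | 0% | - | 100% | 67% | 67% | 100% | 100% |
| SRbotOne | 17.78% | 0% | 11% | 0% | 0% | 0% | 0% | 0% | 0% | 0% | 0% | 22% | 0% | 0% | 0% | 67% | 0% | - | 67% | 78% | 11% | 100% |
| Cimex | 15.56% | 0% | 0% | 0% | 0% | 11% | 0% | 11% | 0% | 11% | 0% | 0% | 0% | 0% | 0% | 0% | 33% | 33% | - | 56% | 56% | 100% |
| Oritaka | 25.56% | 0% | 0% | 33% | 0% | 0% | 0% | 11% | 22% | 11% | 0% | 22% | 33% | 0% | 67% | 11% | 33% | 22% | 44% | - | 100% | 100% |
| CruzBot | 16.11% | 0% | 0% | 0% | 0% | 0% | 0% | 0% | 11% | 0% | 0% | 11% | 0% | 0% | 0% | 67% | 0% | 89% | 44% | 0% | - | 100% |
| Tyr | 0.00% | 0% | 0% | 0% | 0% | 0% | 0% | 0% | 0% | 0% | 0% | 0% | 0% | 0% | 0% | 0% | 0% | 0% | 0% | 0% | 0% | - |
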
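| (4)Andromeda | overall | Iron | ZZZK | tscm | Leta | UAlb | Ximp | Over | Aiur | Mega | IceB | JiaB | Xeln | Skyn | Garm | NUSB | Terr | SRbo | Cime | Orit | Cruz | Tyr |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| Iron | 82.78% | - | 22% | 56% | 78% | 78% | 89% | 100% | 100% | 56% | 100% | 67% | 100% | 11% | 100% | 100% | 100% | 100% | 100% | 100% | 100% | 100% |
| ZZZKBot | 82.78% | 78% | - | 33% | 22% | 67% | 100% | 67% | 100% | 78% | 67% | 56% | 100% | 89% | 100% | 100% | 100% | 100% | 100% | 100% | 100% | 100% |
| tscmoo | 85.56% | 44% | 67% | - | 44% | 33% | 89% | 100% | 100% | 89% | 89% | 100% | 78% | 78% | 100% | 100% | 100% | 100% | 100% | 100% | 100% | 100% |
| LetaBot | 81.11% | 22% | 78% | 56% | - | 89% | 78% | 100% | 33% | 78% | 89% | 100% | 89% | 33% | 89% | 100% | 100% | 100% | 100% | 89% | 100% | 100% |
| UAlbertaBot | 73.33% | 22% | 33% | 67% | 11% | - | 100% | 56% | 89% | 33% | 89% | 78% | 100% | 78% | 78% | 89% | 100% | 89% | 56% | 100% | 100% | 100% |
| Ximp | 60.56% | 11% | 0% | 11% | 22% | 0% | - | 56% | 89% | 89% | 0% | 100% | 67% | 89% | 78% | 100% | 100% | 78% | 100% | 44% | 78% | 100% |
| Overkill | 62.78% | 0% | 33% | 0% | 0% | 44% | 44% | - | 44% | 44% | 78% | 78% | 67% | 44% | 100% | 100% | 100% | 89% | 89% | 100% | 100% | 100% |
| Aiur | 55.56% | 0% | 0% | 0% | 67% | 11% | 11% | 56% | - | 33% | 89% | 56% | 44% | 78% | 78% | 11% | 89% | 100% | 100% | 89% | 100% | 100% |
| MegaBot | 57.22% | 44% | 22% | 11% | 22% | 67% | 11% | 56% | 67% | - | 22% | 33% | 56% | 78% | 33% | 78% | 100% | 78% | 89% | 89% | 89% | 100% |
| IceBot | 57.22% | 0% | 33% | 11% | 11% | 11% | 100% | 22% | 11% | 78% | - | 44% | 67% | 22% | 33% | 100% | 100% | 100% | 100% | 100% | 100% | 100% |
| JiaBot | 55.00% | 33% | 44% | 0% | 0% | 22% | 0% | 22% | 44% | 67% | 56% | - | 56% | 33% | 100% | 89% | 78% | 89% | 100% | 67% | 100% | 100% |
| Xelnaga | 45.00% | 0% | 0% | 22% | 11% | 0% | 33% | 33% | 56% | 44% | 33% | 44% | - | 44% | 0% | 100% | 100% | 67% | 67% | 56% | 89% | 100% |
| Skynet | 62.78% | 89% | 11% | 22% | 67% | 22% | 11% | 56% | 22% | 22% | 78% | 67% | 56% | - | 100% | 33% | 100% | 100% | 100% | 100% | 100% | 100% |
| GarmBot | 46.11% | 0% | 0% | 0% | 11% | 22% | 22% | 0% | 22% | 67% | 67% | 0% | 100% | 0% | - | 89% | 78% | 100% | 100% | 44% | 100% | 100% |
| NUSBot | 29.44% | 0% | 0% | 0% | 0% | 11% | 0% | 0% | 89% | 22% | 0% | 11% | 0% | 67% | 11% | - | 78% | 11% | 100% | 78% | 11% | 100% |
| TerranUAB | 28.89% | 0% | 0% | 0% | 0% | 0% | 0% | 0% | 11% | 0% | 0% | 22% | 0% | 0% | 22% | 22% | - | 100% | 100% | 100% | 100% | 100% |
| SRbotOne | 25.56% | 0% | 0% | 0% | 0% | 11% | 22% | 11% | 0% | 22% | 0% | 11% | 33% | 0% | 0% | 89% | 0% | - | 33% | 89% | 89% | 100% |
| Cimex | 17.78% | 0% | 0% | 0% | 0% | 44% | 0% | 11% | 0% | 11% | 0% | 0% | 33% | 0% | 0% | 0% | 0% | 67% | - | 33% | 56% | 100% |
| Oritaka | 25.56% | 0% | 0% | 0% | 11% | 0% | 56% | 0% | 11% | 11% | 0% | 33% | 44% | 0% | 56% | 22% | 0% | 11% | 67% | - | 89% | 100% |
| CruzBot | 15.00% | 0% | 0% | 0% | 0% | 0% | 22% | 0% | 0% | 11% | 0% | 0% | 11% | 0% | 0% | 89% | 0% | 11% | 44% | 11% | - | 100% |
| Tyr | 0.00% | 0% | 0% | 0% | 0% | 0% | 0% | 0% | 0% | 0% | 0% | 0% | 0% | 0% | 0% | 0% | 0% | 0% | 0% | 0% | 0% | - |
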
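| (4)Circuit Breaker | overall | Iron | ZZZK | tscm | Leta | UAlb | Ximp | Over | Aiur | Mega | IceB | JiaB | Xeln | Skyn | Garm | NUSB | Terr | SRbo | Cime | Orit | Cruz | Tyr |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| Iron | 85.00% | - | 44% | 100% | 89% | 89% | 100% | 67% | 100% | 67% | 89% | 78% | 100% | 11% | 100% | 78% | 100% | 89% | 100% | 100% | 100% | 100% |
| ZZZKBot | 83.33% | 56% | - | 67% | 44% | 56% | 89% | 44% | 100% | 89% | 78% | 44% | 100% | 100% | 100% | 100% | 100% | 100% | 100% | 100% | 100% | 100% |
| tscmoo | 79.44% | 0% | 33% | - | 22% | 33% | 100% | 100% | 89% | 78% | 100% | 67% | 89% | 78% | 100% | 100% | 100% | 100% | 100% | 100% | 100% | 100% |
| LetaBot | 81.67% | 11% | 56% | 78% | - | 67% | 89% | 100% | 33% | 89% | 100% | 100% | 78% | 33% | 100% | 100% | 100% | 100% | 100% | 100% | 100% | 100% |
| UAlbertaBot | 70.56% | 11% | 44% | 67% | 33% | - | 89% | 22% | 89% | 22% | 78% | 78% | 100% | 78% | 67% | 89% | 100% | 100% | 44% | 100% | 100% | 100% |
| Ximp | 56.67% | 0% | 11% | 0% | 11% | 11% | - | 33% | 56% | 44% | 0% | 100% | 44% | 100% | 89% | 100% | 89% | 67% | 78% | 100% | 100% | 100% |
| Overkill | 62.78% | 33% | 56% | 0% | 0% | 78% | 67% | - | 0% | 11% | 78% | 67% | 11% | 56% | 100% | 100% | 100% | 100% | 100% | 100% | 100% | 100% |
| Aiur | 60.56% | 0% | 0% | 11% | 67% | 11% | 44% | 100% | - | 44% | 89% | 22% | 33% | 89% | 100% | 44% | 89% | 89% | 100% | 100% | 78% | 100% |
| MegaBot | 64.44% | 33% | 11% | 22% | 11% | 78% | 56% | 89% | 56% | - | 67% | 56% | 33% | 67% | 67% | 89% | 100% | 67% | 89% | 100% | 100% | 100% |
| IceBot | 53.33% | 11% | 22% | 0% | 0% | 22% | 100% | 22% | 11% | 33% | - | 44% | 33% | 33% | 33% | 100% | 100% | 100% | 100% | 100% | 100% | 100% |
| JiaBot | 56.67% | 22% | 56% | 33% | 0% | 22% | 0% | 33% | 78% | 44% | 56% | - | 22% | 22% | 89% | 100% | 89% | 100% | 100% | 78% | 89% | 100% |
| Xelnaga | 58.33% | 0% | 0% | 11% | 22% | 0% | 56% | 89% | 67% | 67% | 67% | 78% | - | 22% | 0% | 100% | 100% | 89% | 100% | 100% | 100% | 100% |
| Skynet | 62.22% | 89% | 0% | 22% | 67% | 22% | 0% | 44% | 11% | 33% | 67% | 78% | 78% | - | 100% | 33% | 100% | 100% | 100% | 100% | 100% | 100% |
| GarmBot | 42.78% | 0% | 0% | 0% | 0% | 33% | 11% | 0% | 0% | 33% | 67% | 11% | 100% | 0% | - | 100% | 89% | 100% | 100% | 11% | 100% | 100% |
| NUSBot | 31.67% | 22% | 0% | 0% | 0% | 11% | 0% | 0% | 56% | 11% | 0% | 0% | 0% | 67% | 0% | - | 67% | 67% | 56% | 100% | 78% | 100% |
| TerranUAB | 27.22% | 0% | 0% | 0% | 0% | 0% | 11% | 0% | 11% | 0% | 0% | 11% | 0% | 0% | 11% | 33% | - | 89% | 78% | 100% | 100% | 100% |
| SRbotOne | 21.67% | 11% | 0% | 0% | 0% | 0% | 33% | 0% | 11% | 33% | 0% | 0% | 11% | 0% | 0% | 33% | 11% | - | 33% | 89% | 67% | 100% |
| Cimex | 19.44% | 0% | 0% | 0% | 0% | 56% | 22% | 0% | 0% | 11% | 0% | 0% | 0% | 0% | 0% | 44% | 22% | 67% | - | 22% | 44% | 100% |
| Oritaka | 17.78% | 0% | 0% | 0% | 0% | 0% | 0% | 0% | 0% | 0% | 0% | 22% | 0% | 0% | 89% | 0% | 0% | 11% | 78% | - | 56% | 100% |
| CruzBot | 14.44% | 0% | 0% | 0% | 0% | 0% | 0% | 0% | 22% | 0% | 0% | 11% | 0% | 0% | 0% | 22% | 0% | 33% | 56% | 44% | - | 100% |
| Tyr | 0.00% | 0% | 0% | 0% | 0% | 0% | 0% | 0% | 0% | 0% | 0% | 0% | 0% | 0% | 0% | 0% | 0% | 0% | 0% | 0% | 0% | - |
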
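| (4)Empire of the Sun | overall | Iron | ZZZK | tscm | Leta | UAlb | Ximp | Over | Aiur | Mega | IceB | JiaB | Xeln | Skyn | Garm | NUSB | Terr | SRbo | Cime | Orit | Cruz | Tyr |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| Iron | 86.11% | - | 0% | 100% | 89% | 67% | 100% | 89% | 100% | 100% | 100% | 89% | 89% | 11% | 100% | 89% | 100% | 100% | 100% | 100% | 100% | 100% |
| ZZZKBot | 85.00% | 100% | - | 22% | 67% | 67% | 89% | 33% | 89% | 89% | 89% | 56% | 100% | 100% | 100% | 100% | 100% | 100% | 100% | 100% | 100% | 100% |
| tscmoo | 81.67% | 0% | 78% | - | 44% | 0% | 100% | 100% | 78% | 100% | 100% | 89% | 67% | 89% | 100% | 100% | 100% | 100% | 100% | 89% | 100% | 100% |
| LetaBot | 72.78% | 11% | 33% | 56% | - | 44% | 78% | 100% | 56% | 67% | 89% | 100% | 22% | 22% | 100% | 78% | 100% | 100% | 100% | 100% | 100% | 100% |
| UAlbertaBot | 76.11% | 33% | 33% | 100% | 56% | - | 100% | 78% | 100% | 44% | 78% | 67% | 100% | 78% | 67% | 56% | 100% | 100% | 33% | 100% | 100% | 100% |
| Ximp | 63.89% | 0% | 11% | 0% | 22% | 0% | - | 56% | 89% | 56% | 0% | 89% | 56% | 100% | 100% | 100% | 100% | 100% | 100% | 100% | 100% | 100% |
| Overkill | 58.33% | 11% | 67% | 0% | 0% | 22% | 44% | - | 11% | 33% | 56% | 56% | 11% | 78% | 89% | 100% | 89% | 100% | 100% | 100% | 100% | 100% |
| Aiur | 56.67% | 0% | 11% | 22% | 44% | 0% | 11% | 89% | - | 56% | 78% | 56% | 22% | 89% | 89% | 22% | 89% | 100% | 67% | 89% | 100% | 100% |
| MegaBot | 51.67% | 0% | 11% | 0% | 33% | 56% | 44% | 67% | 44% | - | 56% | 22% | 22% | 78% | 11% | 100% | 89% | 78% | 67% | 67% | 89% | 100% |
| IceBot | 53.33% | 0% | 11% | 0% | 11% | 22% | 100% | 44% | 22% | 44% | - | 33% | 11% | 22% | 44% | 100% | 100% | 100% | 100% | 100% | 100% | 100% |
| JiaBot | 60.56% | 11% | 44% | 11% | 0% | 33% | 11% | 44% | 44% | 78% | 67% | - | 67% | 67% | 100% | 100% | 100% | 78% | 100% | 67% | 89% | 100% |
| Xelnaga | 63.33% | 11% | 0% | 33% | 78% | 0% | 44% | 89% | 78% | 78% | 89% | 33% | - | 22% | 11% | 100% | 100% | 100% | 100% | 100% | 100% | 100% |
| Skynet | 59.44% | 89% | 0% | 11% | 78% | 22% | 0% | 22% | 11% | 22% | 78% | 33% | 78% | - | 100% | 44% | 100% | 100% | 100% | 100% | 100% | 100% |
| GarmBot | 43.89% | 0% | 0% | 0% | 0% | 33% | 0% | 11% | 11% | 89% | 56% | 0% | 89% | 0% | - | 89% | 89% | 100% | 89% | 22% | 100% | 100% |
| NUSBot | 27.78% | 11% | 0% | 0% | 22% | 44% | 0% | 0% | 78% | 0% | 0% | 0% | 0% | 56% | 11% | - | 44% | 56% | 44% | 78% | 11% | 100% |
| TerranUAB | 28.33% | 0% | 0% | 0% | 0% | 0% | 0% | 11% | 11% | 11% | 0% | 0% | 0% | 0% | 11% | 56% | - | 67% | 100% | 100% | 100% | 100% |
| SRbotOne | 21.11% | 0% | 0% | 0% | 0% | 0% | 0% | 0% | 0% | 22% | 0% | 22% | 0% | 0% | 0% | 44% | 33% | - | 67% | 89% | 44% | 100% |
| Cimex | 21.67% | 0% | 0% | 0% | 0% | 67% | 0% | 0% | 33% | 33% | 0% | 0% | 0% | 0% | 11% | 56% | 0% | 33% | - | 56% | 44% | 100% |
| Oritaka | 21.11% | 0% | 0% | 11% | 0% | 0% | 0% | 0% | 11% | 33% | 0% | 33% | 0% | 0% | 78% | 22% | 0% | 11% | 44% | - | 78% | 100% |
| CruzBot | 17.22% | 0% | 0% | 0% | 0% | 0% | 0% | 0% | 0% | 11% | 0% | 11% | 0% | 0% | 0% | 89% | 0% | 56% | 56% | 22% | - | 100% |
| Tyr | 0.00% | 0% | 0% | 0% | 0% | 0% | 0% | 0% | 0% | 0% | 0% | 0% | 0% | 0% | 0% | 0% | 0% | 0% | 0% | 0% | 0% | - |
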
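| (4)Fortress | overall | Iron | ZZZK | tscm | Leta | UAlb | Ximp | Over | Aiur | Mega | IceB | JiaB | Xeln | Skyn | Garm | NUSB | Terr | SRbo | Cime | Orit | Cruz | Tyr |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| Iron | 81.11% | - | 0% | 33% | 100% | 78% | 100% | 78% | 89% | 56% | 89% | 67% | 100% | 67% | 100% | 89% | 100% | 100% | 78% | 100% | 100% | 100% |
| ZZZKBot | 88.33% | 100% | - | 56% | 100% | 67% | 100% | 22% | 100% | 89% | 100% | 44% | 100% | 89% | 100% | 100% | 100% | 100% | 100% | 100% | 100% | 100% |
| tscmoo | 86.11% | 67% | 44% | - | 89% | 33% | 89% | 100% | 78% | 78% | 100% | 89% | 89% | 78% | 100% | 89% | 100% | 100% | 100% | 100% | 100% | 100% |
| LetaBot | 39.44% | 0% | 0% | 11% | - | 78% | 0% | 44% | 0% | 22% | 0% | 0% | 22% | 0% | 56% | 78% | 22% | 89% | 89% | 78% | 100% | 100% |
| UAlbertaBot | 75.00% | 22% | 33% | 67% | 22% | - | 100% | 78% | 89% | 33% | 78% | 89% | 100% | 78% | 100% | 78% | 100% | 78% | 56% | 100% | 100% | 100% |
| Ximp | 63.89% | 0% | 0% | 11% | 100% | 0% | - | 22% | 89% | 44% | 11% | 100% | 56% | 89% | 89% | 78% | 100% | 100% | 100% | 89% | 100% | 100% |
| Overkill | 68.89% | 22% | 78% | 0% | 56% | 22% | 78% | - | 78% | 22% | 89% | 22% | 22% | 89% | 100% | 100% | 100% | 100% | 100% | 100% | 100% | 100% |
| Aiur | 55.56% | 11% | 0% | 22% | 100% | 11% | 11% | 22% | - | 22% | 78% | 56% | 33% | 67% | 78% | 44% | 89% | 89% | 100% | 100% | 78% | 100% |
| MegaBot | 70.00% | 44% | 11% | 22% | 78% | 67% | 56% | 78% | 78% | - | 56% | 33% | 67% | 89% | 78% | 89% | 89% | 89% | 89% | 89% | 100% | 100% |
| IceBot | 61.11% | 11% | 0% | 0% | 100% | 22% | 89% | 11% | 22% | 44% | - | 56% | 33% | 44% | 89% | 100% | 100% | 100% | 100% | 100% | 100% | 100% |
| JiaBot | 62.78% | 33% | 56% | 11% | 100% | 11% | 0% | 78% | 44% | 67% | 44% | - | 44% | 33% | 100% | 78% | 89% | 67% | 100% | 100% | 100% | 100% |
| Xelnaga | 56.67% | 0% | 0% | 11% | 78% | 0% | 44% | 78% | 67% | 33% | 67% | 56% | - | 22% | 0% | 89% | 100% | 89% | 100% | 100% | 100% | 100% |
| Skynet | 59.44% | 33% | 11% | 22% | 100% | 22% | 11% | 11% | 33% | 11% | 56% | 67% | 78% | - | 100% | 33% | 100% | 100% | 100% | 100% | 100% | 100% |
| GarmBot | 39.44% | 0% | 0% | 0% | 44% | 0% | 11% | 0% | 22% | 22% | 11% | 0% | 100% | 0% | - | 56% | 67% | 100% | 56% | 100% | 100% | 100% |
| NUSBot | 33.89% | 11% | 0% | 11% | 22% | 22% | 22% | 0% | 56% | 11% | 0% | 22% | 11% | 67% | 44% | - | 67% | 22% | 56% | 100% | 33% | 100% |
| TerranUAB | 29.44% | 0% | 0% | 0% | 78% | 0% | 0% | 0% | 11% | 11% | 0% | 11% | 0% | 0% | 33% | 33% | - | 89% | 56% | 67% | 100% | 100% |
| SRbotOne | 23.89% | 0% | 0% | 0% | 11% | 22% | 0% | 0% | 11% | 11% | 0% | 33% | 11% | 0% | 0% | 78% | 11% | - | 56% | 67% | 67% | 100% |
| Cimex | 23.89% | 22% | 0% | 0% | 11% | 44% | 0% | 0% | 0% | 11% | 0% | 0% | 0% | 0% | 44% | 44% | 44% | 44% | - | 100% | 22% | 89% |
| Oritaka | 11.67% | 0% | 0% | 0% | 22% | 0% | 11% | 0% | 0% | 11% | 0% | 0% | 0% | 0% | 0% | 0% | 33% | 33% | 0% | - | 22% | 100% |
| CruzBot | 18.89% | 0% | 0% | 0% | 0% | 0% | 0% | 0% | 22% | 0% | 0% | 0% | 0% | 0% | 0% | 67% | 0% | 33% | 78% | 78% | - | 100% |
| Tyr | 0.56% | 0% | 0% | 0% | 0% | 0% | 0% | 0% | 0% | 0% | 0% | 0% | 0% | 0% | 0% | 0% | 0% | 0% | 11% | 0% | 0% | - |
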
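| (4)Python | overall | Iron | ZZZK | tscm | Leta | UAlb | Ximp | Over | Aiur | Mega | IceB | JiaB | Xeln | Skyn | Garm | NUSB | Terr | SRbo | Cime | Orit | Cruz | Tyr |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| Iron | 86.11% | - | 0% | 100% | 100% | 100% | 100% | 89% | 78% | 67% | 100% | 78% | 89% | 22% | 100% | 100% | 100% | 100% | 100% | 100% | 100% | 100% |
| ZZZKBot | 90.56% | 100% | - | 67% | 78% | 78% | 100% | 56% | 100% | 100% | 100% | 44% | 100% | 100% | 100% | 100% | 100% | 89% | 100% | 100% | 100% | 100% |
| tscmoo | 78.89% | 0% | 33% | - | 56% | 22% | 89% | 89% | 78% | 89% | 100% | 89% | 100% | 89% | 100% | 100% | 100% | 100% | 89% | 56% | 100% | 100% |
| LetaBot | 71.67% | 0% | 22% | 44% | - | 44% | 100% | 100% | 22% | 44% | 89% | 100% | 22% | 56% | 100% | 89% | 100% | 100% | 100% | 100% | 100% | 100% |
| UAlbertaBot | 72.78% | 0% | 22% | 78% | 56% | - | 67% | 44% | 56% | 78% | 89% | 56% | 100% | 56% | 78% | 100% | 100% | 89% | 89% | 100% | 100% | 100% |
| Ximp | 64.44% | 0% | 0% | 11% | 0% | 33% | - | 56% | 100% | 67% | 11% | 100% | 11% | 100% | 100% | 100% | 100% | 100% | 100% | 100% | 100% | 100% |
| Overkill | 62.22% | 11% | 44% | 11% | 0% | 56% | 44% | - | 56% | 56% | 89% | 44% | 0% | 67% | 100% | 78% | 100% | 100% | 89% | 100% | 100% | 100% |
| Aiur | 56.11% | 22% | 0% | 22% | 78% | 44% | 0% | 44% | - | 33% | 89% | 56% | 33% | 56% | 89% | 22% | 89% | 100% | 89% | 67% | 89% | 100% |
| MegaBot | 56.11% | 33% | 0% | 11% | 56% | 22% | 33% | 44% | 67% | - | 33% | 56% | 33% | 100% | 56% | 78% | 89% | 44% | 89% | 89% | 89% | 100% |
| IceBot | 55.56% | 0% | 0% | 0% | 11% | 11% | 89% | 11% | 11% | 67% | - | 89% | 56% | 22% | 44% | 100% | 100% | 100% | 100% | 100% | 100% | 100% |
| JiaBot | 56.11% | 22% | 56% | 11% | 0% | 44% | 0% | 56% | 44% | 44% | 11% | - | 33% | 44% | 100% | 100% | 100% | 78% | 89% | 89% | 100% | 100% |
| Xelnaga | 63.89% | 11% | 0% | 0% | 78% | 0% | 89% | 100% | 67% | 67% | 44% | 67% | - | 56% | 0% | 100% | 100% | 100% | 100% | 100% | 100% | 100% |
| Skynet | 58.33% | 78% | 0% | 11% | 44% | 44% | 0% | 33% | 44% | 0% | 78% | 56% | 44% | - | 100% | 33% | 100% | 100% | 100% | 100% | 100% | 100% |
| GarmBot | 42.22% | 0% | 0% | 0% | 0% | 22% | 0% | 0% | 11% | 44% | 56% | 0% | 100% | 0% | - | 100% | 89% | 100% | 89% | 33% | 100% | 100% |
| NUSBot | 28.89% | 0% | 0% | 0% | 11% | 0% | 0% | 22% | 78% | 22% | 0% | 0% | 0% | 67% | 0% | - | 89% | 0% | 67% | 100% | 22% | 100% |
| TerranUAB | 23.33% | 0% | 0% | 0% | 0% | 0% | 0% | 0% | 11% | 11% | 0% | 0% | 0% | 0% | 11% | 11% | - | 89% | 56% | 89% | 100% | 89% |
| SRbotOne | 26.11% | 0% | 11% | 0% | 0% | 11% | 0% | 0% | 0% | 56% | 0% | 22% | 0% | 0% | 0% | 100% | 11% | - | 44% | 100% | 67% | 100% |
| Cimex | 20.00% | 0% | 0% | 11% | 0% | 11% | 0% | 11% | 11% | 11% | 0% | 11% | 0% | 0% | 11% | 33% | 44% | 56% | - | 56% | 33% | 100% |
| Oritaka | 20.00% | 0% | 0% | 44% | 0% | 0% | 0% | 0% | 33% | 11% | 0% | 11% | 0% | 0% | 67% | 0% | 11% | 0% | 44% | - | 78% | 100% |
| CruzBot | 16.11% | 0% | 0% | 0% | 0% | 0% | 0% | 0% | 11% | 11% | 0% | 0% | 0% | 0% | 0% | 78% | 0% | 33% | 67% | 22% | - | 100% |
| Tyr | 0.56% | 0% | 0% | 0% | 0% | 0% | 0% | 0% | 0% | 0% | 0% | 0% | 0% | 0% | 0% | 0% | 11% | 0% | 0% | 0% | 0% | - |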

I could look at these charts all day and keep finding stuff to think about. Winner Iron was upset by ZZZKBot and scored overwhelmingly against every other bot except the two dragoon-heavy protosses MegaBot and Skynet, which earned upsets on some maps. ZZZKBot’s results fall with rush distance, closely in line with Martin Rooijackers’ explanation. LetaBot did well on most maps, but struggled on Fortress. NUSBot apparently crashed on Heartbreak Ridge, giving Tyr nearly half of its wins. Weak bot Cimex did surprisingly well against strong UAlbertaBot on every map except Tau Cross and Python. Why is that?
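The Tyr remark can be checked with a little arithmetic on the per-map tables. Here is a minimal sketch, assuming each of Tyr’s nonzero cells is a full 9 games (some cells in the tables have fewer). The dictionary is transcribed by hand from Tyr’s rows, and the helper name is mine.

```python
GAMES_PER_CELL = 9  # assumption: no missing games in these particular cells

# Tyr's nonzero cells in the per-map tables: (map, opponent) -> win percentage
tyr_cells = {
    ("Heartbreak Ridge", "MegaBot"): 100,
    ("Heartbreak Ridge", "NUSBot"): 100,
    ("Fortress", "Cimex"): 11,
    ("Python", "TerranUAB"): 11,
}

def wins(percent, games=GAMES_PER_CELL):
    """Convert a rounded cell percentage back to a whole number of wins."""
    return round(percent / 100 * games)

total_wins = sum(wins(p) for p in tyr_cells.values())
nusbot_wins = wins(tyr_cells[("Heartbreak Ridge", "NUSBot")])
print(total_wins, nusbot_wins, nusbot_wins / total_wins)
```

Under that assumption Tyr has 20 wins in total, 9 of them against NUSBot on Heartbreak Ridge, which is 45%. Another 9 came from MegaBot on the same map, so whatever went wrong there was not limited to NUSBot.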

AIIDE 2016 Bayesian Elo ratings

Again I have Elo as calculated by Remi Coulom’s bayeselo program. The # column gives the official ranking, so you can see how it differs from the rank by Elo (the bayeselo ranking is slightly more accurate because it takes into account all the information in the tournament results, not only the raw winning rate). I left out the 95% confidence interval column as relatively uninteresting, since the “better?” column tells us how likely each bot is to be superior to the one below it.

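| # | bot | score | Elo | better? |
|---|---|---|---|---|
| 1 | Iron | 87% | 2016 | 99.4% |
| 2 | ZZZKBot | 85% | 1974 | 99.6% |
| 3 | tscmoo zerg | 83% | 1932 | 99.9% |
| 4 | LetaBot | 74% | 1815 | 99.8% |
| 5 | UAlbertaBot | 70% | 1774 | 99.9% |
| 6 | Ximp | 65% | 1699 | 99.6% |
| 8 | Aiur | 61% | 1663 | 51.6% |
| 7 | Overkill | 62% | 1663 | 99.6% |
| 9 | MegaBot | 58% | 1627 | 88.5% |
| 10 | IceBot | 57% | 1611 | 57.5% |
| 12 | Xelnaga | 57% | 1608 | 50.0% |
| 11 | JiaBot | 57% | 1608 | 98.1% |
| 13 | Skynet | 55% | 1581 | 100% |
| 14 | GarmBot | 43% | 1441 | 100% |
| 16 | TerranUAB | 27% | 1250 | 74.7% |
| 15 | NUSBot | 27% | 1240 | 99.9% |
| 17 | SRbotOne | 22% | 1167 | 99.0% |
| 18 | Cimex | 21% | 1130 | 92.6% |
| 19 | Oritaka | 20% | 1106 | 99.3% |
| 20 | CruzBot | 17% | 1064 | 100% |
| 21 | Tyr | 1% | 533 | - |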

There are some switches from the official ranking, due to bots being statistically indistinguishable. Overkill and Aiur are in a dead heat. IceBot (terran), Xelnaga (protoss) and JiaBot (zerg) are also virtually even. bayeselo gives IceBot a 57.6% chance of being better than JiaBot two ranks down, essentially the same as its 57.5% chance of being better than Xelnaga one rank down.
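To turn an Elo gap into an expected score, here is a sketch using the textbook logistic Elo curve on a 400-point scale. That curve is my assumption for illustration; bayeselo’s own model has extra parameters that this ignores, and the result is an expected win rate, not the “better?” likelihood in the table. The ratings are copied from the table.

```python
def expected_score(elo_a, elo_b):
    """Textbook Elo expected score of A against B (400-point logistic scale)."""
    return 1.0 / (1.0 + 10.0 ** ((elo_b - elo_a) / 400.0))

# Ratings copied from the Elo table
ratings = {"Iron": 2016, "ZZZKBot": 1974, "Overkill": 1663, "Aiur": 1663, "Tyr": 533}

print(f"{expected_score(ratings['Iron'], ratings['ZZZKBot']):.2f}")
print(f"{expected_score(ratings['Overkill'], ratings['Aiur']):.2f}")
```

By this curve a 42-point gap like Iron over ZZZKBot is worth only about a 56% expected score, and equal ratings like Overkill and Aiur come out at exactly 50%.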

Tomorrow: The per-map crosstables.

AIIDE 2016 results discussion

The AIIDE 2016 results are out. The top finishers, in order, are Iron, ZZZKBot, Tscmoo zerg, and LetaBot: half terran and half zerg. As usual, Martin Rooijackers sent me a few observations. Here are my thoughts so far.

• The tournament was played on the same maps as last year. The rules don’t call for that. We know that they’re OK maps, though.

• Martin Rooijackers says that all four top finishers were updated in between the CIG 2016 entry deadline and the AIIDE deadline a month later. [This turned out to be wrong. See the comments. I misread Martin Rooijackers’ message: he wrote that #1 Iron and #2 ZZZKBot had been updated.] You need continual work to stay at the top. It’s fierce up there.

• In particular, ZZZKBot moved up to #2, with minus scores only against fellow zergs Overkill and JiaBot. I’m curious to know what the improvements are.

• Tscmoo zerg scored even against ZZZKBot (a poor performance; zerg should easily beat the 4 pool), and worse than even against the rest of the top 5. Tscmoo owed its #3 finish to terrorizing its inferiors. Martin Rooijackers took this to mean that terran dominated. But with 2 of the top 4, I have to say that zerg is hanging in there.

• In last year’s AIIDE, UAlbertaBot and Overkill were neck-and-neck. The Bayesian Elo calculation said that UAlbertaBot was about 60% likely to be stronger, a narrow margin. Neither has been updated since. [This was also wrong! See comments.] In CIG 2016, Overkill was ahead. In this tournament, UAlbertaBot finished well ahead. These two can’t make up their minds! Apparently this time UAlbertaBot with its rush builds was better able to cope with the changing field.

• XIMP made it to #6. On SSCAIT its last update was February 2015, and I believe it was a minor adjustment to timings. The carrier strategy seems resilient. By contrast IceBot, AIIDE and CIG champion in 2014 ahead of XIMP, was in the middle of the pack. All its smarts did not make it resilient against improvements in its enemies.

• “JiaBot” in AIIDE is probably the same as “Ziabot” in CIG and “Zia bot” in SSCAIT. In some languages, the same sound can transliterate into English as either J or Z. Looking at the Ziabot source from CIG 2016, I think that the author is Korean based on variable names.

• Xelnaga finished ahead of Skynet, which surprised me. In CIG, XelnagaII was near the bottom. That is a sudden change.

• Tyr finished dead last by a mile, with a 1% win rate, so low that it must be due to bugs. Tyr had major updates this year which left it fairly strong. It finished in the middle of the pack in CIG 2016. I imagine it was updated at the last minute with a grievous error.

Tomorrow: The Elo table. I’ve done the calculations, but I have to draw up the table.