AIIDE 2018 - the performance curves
I decided to look more closely at the Win Percentage Over Time curves. For this post, “learning” means online learning during the tournament; bots which only learned offline at home are “non-learning” bots for the moment.
To start off, here are the bots whose curves are more or less flat over time. Of these, #1 SAIDA is the only learning bot. Its learning apparently enabled it to hold its ground at a high level, but not to rise further. The other 3 are #13 LastOrder, #26 CUNYBot, and tail-ender #27 Hellbot. Hellbot gradually lost win rate over time despite its low starting point. The other 2 are very nearly level over time, despite being non-learners in a field of enemies eagerly seeking weaknesses to exploit. I suppose that their play is in some way difficult for learning opponents to exploit, whether because it is highly adaptive, or random and unpredictable, or simply does not expose weaknesses that other bots were able to catch.

Here are all the non-learning bots, as best I could identify from yesterday’s findings. I also included #1 SAIDA to maintain the scale, which usefully goes to 1.1 to accommodate any bots which won more games than they played.

Most of the curves trend down over time. The exceptions are #13 LastOrder and #26 CUNYBot from the first graph. Here’s a rescaled graph to tease apart the dense clump from #16 LetaBot to #24 WillyT. It makes the downward trend easier to see. Of these, #17 Arrakhammer, which has sophisticated play, and #20 XIMP, whose weaknesses may be difficult for many opponents to exploit, leveled out after the early losses. (So did #9 Iron, with its numerous adaptive reactions, from the chart above.) The others continued downward for the entire tournament. Apparently if your play is in some way good enough, you can avoid exploitation by other bots to an extent. But most non-learning bots seem doomed to keep losing win rate even over a long tournament.

Here are the learning bots which fell at first, then leveled out. The early fall is due to some combination of statistical fluctuation, learning by their opponents, and no doubt bugs and other random factors. There are only 3 of them.

#15 MetaBot might belong in the graph above, but I gave it its own picture because it is in a class by itself when it comes to struggling at the start and then recovering strongly. It fell hard (on the left its curve drops below #21 CDBot) and came back, but it did not level off! MetaBot rivals Steamhammer and AIUR for performance gains over time. I imagine it’s due to MetaBot’s 2-level learning ability, where it learns which of 3 heads is best against each opponent, and then 2 of the heads (AIUR and Skynet), when chosen, learn how best to play against that opponent. Like Steamhammer and AIUR, it has more scope to learn, and it learns more. The graph shows how many rivals MetaBot left in the dust; it came within an ace of surpassing #14 Tyr, and likely would have, given 10 more rounds.
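To make the 2-level idea concrete, here is a minimal sketch of how such a design could be organized. It is my own illustration under assumptions, not MetaBot’s actual code; all the names are hypothetical. The outer level keeps a win record per head per opponent, and the chosen head keeps its own finer-grained record.

```cpp
#include <map>
#include <string>
#include <utility>
#include <vector>

// Hypothetical sketch of 2-level learning; MetaBot's real code surely differs.
struct Record {
    int wins = 0, games = 0;
    double rate() const { return games ? double(wins) / games : 0.5; }  // neutral prior
};

// Inner level: a head (sub-bot) with its own per-opponent opening data.
struct Head {
    std::string name;
    std::map<std::pair<std::string, std::string>, Record> record;  // (opponent, opening)

    std::string pickOpening(const std::string& opponent,
                            const std::vector<std::string>& openings) {
        std::string best = openings.front();
        double bestRate = -1.0;
        for (const auto& o : openings) {
            double r = record[{opponent, o}].rate();
            if (r > bestRate) { bestRate = r; best = o; }
        }
        return best;
    }
};

// Outer level: learn which head is best against each opponent.
struct TwoLevelBot {
    std::vector<Head> heads;                                // e.g. AIUR, Skynet, and one more
    std::map<std::string, std::vector<Record>> headRecord;  // opponent -> one record per head

    Head& pickHead(const std::string& opponent) {
        std::vector<Record>& recs = headRecord[opponent];
        recs.resize(heads.size());
        std::size_t best = 0;
        for (std::size_t i = 1; i < recs.size(); ++i)
            if (recs[i].rate() > recs[best].rate()) best = i;
        return heads[best];
    }
};
```

A shape like this would fit the curve: the outer choice settles quickly (only 3 arms), while the inner learners keep improving long after the head choice has stabilized.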

Here are the bots which gained win rate early, then largely leveled out; most continued to gain or lose a little for the duration. This is partly because the curves are cumulative. Only the left part of the curve can change quickly; each data point is the average of all the per-round win rates to the left, so by the late rounds even a perfect round moves the curve only slightly. The non-learning #13 LastOrder is included; the others are learning bots. #14 Tyr, which learns less because it only remembers one previous game, had the biggest decline from its peak. That’s interesting: The extremely simple method of learning from only one game is already a powerful form of learning, but it is not as powerful as, say, the UCB learning of #12 Microwave, which remembers summary statistics from many games. All these bots arguably could have done better if they had scope to learn more; their learning ceilings may not be high enough for a long tournament. Perhaps some are tuned for SSCAIT, where fast learning with limited scope helps performance.
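For readers who haven’t met it, UCB-style selection picks the opening that maximizes average win rate plus an exploration bonus that shrinks as an opening accumulates games. Here is a minimal UCB1 sketch; it illustrates the general technique, not Microwave’s actual code.

```cpp
#include <cmath>
#include <vector>

// Minimal UCB1 sketch for opening selection vs. one opponent.
// An illustration of the general technique, not Microwave's actual code.
struct Arm {
    double wins = 0.0;  // wins with this opening
    int plays = 0;      // games played with this opening
};

// Returns the index of the opening to play next.
int chooseUCB1(const std::vector<Arm>& arms, int totalPlays) {
    int best = 0;
    double bestScore = -1.0;
    for (int i = 0; i < (int)arms.size(); ++i) {
        if (arms[i].plays == 0) return i;  // try every opening at least once
        double mean = arms[i].wins / arms[i].plays;               // exploitation
        double bonus = std::sqrt(2.0 * std::log(double(totalPlays))
                                 / arms[i].plays);                // exploration
        if (mean + bonus > bestScore) { bestScore = mean + bonus; best = i; }
    }
    return best;
}
```

The bonus term guarantees that rarely tried openings get revisited now and then, which is exactly the advantage of remembering summary statistics from many games over remembering only the last one.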

Finally, here are the learning bots which kept learning for a long time. (#15 MetaBot has its own graph above and is left out.) #2 CherryPi started strongly and reduced its loss rate by 1/3 over the course of the tournament, which is impressive. #10 ZZZKBot started poorly, then traced a clean, smooth curve which approaches an asymptote after about 10 rounds. #11 Steamhammer also started poorly, and its slower improvement seems to approach an asymptote after around 30 games, but in fact Steamhammer kept on learning throughout: it left Tyr, LastOrder, and Microwave behind, and came close to surpassing ZZZKBot. In a longer tournament, it likely would have; Steamhammer’s big repertoire of openings means it still has fresh ideas to try after 100 rounds. #22 AIUR struggled at first, then recovered and showed its usual strong learning gains.

I find that these performance curves are rich with insight. The top finishers have strong basics, and use learning to avoid being exploited (that seems to be the only purpose of learning in SAIDA), or to exploit the weaknesses of other bots. Most bots that did not learn suffered for it, but some were difficult to exploit and could hold their ground; LastOrder was chief among these. Bots that did learn sometimes learned too little and could not keep up with their rivals. Steamhammer and MetaBot were remarkable for their comparatively weak foundations and slow but strong learning skills.
Next I’ll look into what specific bots learned about their opponents. Following tradition, I’ll start with AIUR.
Comments
Tully Elliston:
Another way to look at the top learning bots that seemingly only used learning for adaptation is that they were already pre-loaded with learning; accordingly they didn't need to spend the first 30 games learning, and their learning only kicked in when they began to lose games. Arguably, SH would behave exactly the same if it had the most robust basics of any bot and was pre-loaded with learning; it's an emergent property of being good.
That SH managed around 50% win rate once its learning began to stabilise suggests that build order counter-picking (the most significant and cutting-edge skill SH has developed since last year) is only going to get you so far - I think a period of focus on agent, tactical & operational unit control skills is going to be needed to keep SH up with the pack next year.
Tully Elliston:
It's hard to find which openings counter an opponent when that opponent randomly moves between openings that require drastically different responses.