
Steamhammer’s prepared learning data for AIIDE 2020

What is the best way to prepare initial learning files for a tournament when you have partial knowledge of how your opponents will play? How should you seed your opponent model? Certainly it depends on your learning algorithm. I did not find the answer, but I poked at it and took my best guess for Steamhammer.

I tried an informal experiment on the Starcraft AI Ladder. The ladder is reset to zero once a week—it erases the game records, everybody’s learned data, everything, and makes a fresh start. One week I collected recent full-length data files and set those as Steamhammer’s prepared learning data for use after the weekly reset. (Because of how Steamhammer uses its prepared learning data, the files were not read at all until the reset.) The opponent is reset too, and may not play the same way that the old learning files expect, so it’s not guaranteed that the old learning files are the best seed for new learning. Still, I expected them to provide an advantage over starting from scratch. It makes intuitive sense that if the opponent is relatively constant, such as an opponent carried over in the tournament from a previous year, then keeping your learning files is good. But there may be cases where it’s not true, because both sides learn.

I was interested in whether the carried-over learning data would be helpful from the start, or would have trouble until it adjusted to the opponent’s reset. I let the ladder run overnight and collected data then. The carried-over data did seem to work well, certainly better than the missing or unmaintained initial data I had been using up to then.

The next week I tried an alternative, preparing minimal initial data. Steamhammer’s learning varies its approach depending on how much data is available, so I expected a difference between full-size and minimal prepared data. I looked through my records of previous weeks—not only one previous week—and selected by hand a small number of sample game records using varied openings that had scored well, never more than 4 games. This is the kind of preparation that makes intuitive sense if you have data on an opponent but you expect that the opponent has had a major update—you want to be ready to exploit known weaknesses, but also to be ready to switch in case the weaknesses are gone. Again I set things up and grabbed the data the next morning. And the minimal data performed better. No opponent had worse numbers.
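For illustration, here is a minimal sketch of that kind of selection in Python. It is not Steamhammer’s code, and the record fields (opponent, opening, win, timestamp) are invented for the example; the real game records look different.

```python
# Hypothetical sketch: pick a handful of winning seed games per opponent,
# preferring recent results and varied openings. The record fields are
# assumptions for this example, not Steamhammer's actual format.
from collections import defaultdict

MAX_SEED_GAMES = 4  # never seed with more than a few games per opponent

def select_seed_games(records):
    """records: list of dicts with keys opponent, opening, win, timestamp."""
    wins_by_opponent = defaultdict(list)
    for rec in records:
        if rec["win"]:
            wins_by_opponent[rec["opponent"]].append(rec)

    seeds = {}
    for opponent, wins in wins_by_opponent.items():
        wins.sort(key=lambda r: r["timestamp"], reverse=True)  # newest first
        chosen, openings_seen = [], set()
        for rec in wins:
            if rec["opening"] in openings_seen:
                continue  # keep the openings varied
            chosen.append(rec)
            openings_seen.add(rec["opening"])
            if len(chosen) == MAX_SEED_GAMES:
                break
        seeds[opponent] = chosen
    return seeds
```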

It was not in any sense a well-controlled experiment. Two weeks of data with the ladder’s small number of opponents is not enough to draw a statistically valid conclusion. Both Steamhammer and other bots were updated during the week between experiments, so the result is more than questionable, not solid but vapor. It’s entirely possible that Steamhammer performed better because I had made an important improvement during the week, and I know that I made improvements. Nevertheless this was the data I had, and I decided that it was more likely to be right than wrong. To my eye, Steamhammer’s performance curve over time looked more convincing with the minimal prepared data—not a scientific conclusion.

So in my AIIDE 2020 submission, I went with minimal prepared learning data. I selected the sample games with more care than in my experiment, trying to take everything into account. I could not prepare for the unknown bots, but I did invent one fictional game for EggBot so that Steamhammer would know it is a cannon bot. I did not prepare for Stardust because nothing has yet worked twice against Stardust. I also didn’t prepare against DaQin because I didn’t have recent data handy; I could have tried harder, but time was short.

We’ll see how it goes!

Surely the best preparation can’t be found by a fixed rule, but depends on the opponent and on what you know about it. And by nature it depends on your bot’s learning algorithm. It’s a question worth thought.


Comments

jtolmar:

Any thoughts on tagging your learning data with the version of Steamhammer that recorded it? You could weigh all past data so that it only counts as much as four games, for example.

It'd be even better if there was a way to detect enemy version changes, but I don't think there is.

Jay Scott:

I have thought of that and may do it someday, when the analysis of past game records is fancier. It’s not a priority.

It is possible to estimate when the enemy has started to play differently. If you have a detailed model of game play and the smarts to use it during games, you might be able to tell within a single game that the enemy has started to do something differently. If you don’t have a model like that and only look at the sequence of recognized strategies and the game outcomes, I think you might be able to guess after several games when the enemy has had a major update. You might never notice a minor update, though.
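To make the guess concrete, here is one simple scheme, purely as an illustration and not anything Steamhammer does: compare the win rate over the last few games with the win rate over the older games, and suspect a major update when they diverge sharply. The window size and threshold are arbitrary.

```python
# Illustrative sketch: suspect an opponent update when recent results
# diverge sharply from the longer history. Window and threshold are
# arbitrary, untuned choices.
def opponent_probably_updated(results, window=6, threshold=0.4):
    """results: list of booleans, oldest first, True = we won."""
    if len(results) < 2 * window:
        return False  # not enough games to compare
    recent = results[-window:]
    older = results[:-window]
    recent_rate = sum(recent) / len(recent)
    older_rate = sum(older) / len(older)
    return abs(recent_rate - older_rate) > threshold
```

A smarter version would also look at the sequence of recognized enemy strategies, not only the wins and losses.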

It’s an interesting question about opponent modeling.
