learning hides bugs

Today I uploaded the third tournament test version of Steamhammer. Games of the second test version show that it’s already visibly stronger, with a couple of the worst weaknesses ameliorated. In particular, I fixed the heinous bug “Why would I expand to that base? No, I’ll just build macro hatcheries all around it instead.” Watching a lot of games to verify my changes, I was reminded of a lesson.

Bruce Nielsen, author of Stardust, wrote a comment about the disadvantages of opening learning:

“I’ve found it refreshing to work without opening learning, as I was definitely using it in Locutus as a crutch to avoid doing necessary underlying work on stuff like worker defense or reacting to scouting information. While it of course worked to a certain extent, it also resulted in a lot of embarrassing losses from exploring builds that only work in very specific situations.”

Learning seeks to adapt to the situation. Empirical learning, which is almost all of what bots do, adapts by experimentation, which means that some experiments will fail—those are the embarrassing losses. To me, the first part is more central, the “necessary underlying work” on skills. The bot’s own skills and tendencies are part of the situation that learning adapts to; if you lack a skill, learning will seek a workaround so that the lack causes fewer losses.

And the same if you have a bug. Steamhammer’s build-beside-the-base bug caused macro games to go off the rails. The loss rate did not increase as much as you might expect, because opening learning compensated by switching to all-in builds that did not lead to macro games. Now I have fixed the bug, and it should switch back to macro builds when appropriate. But the learning is slow, and it will not switch all the way back before the tournament, so Steamhammer’s tournament result may be worse than if the bug had never existed. Even though the bug is fixed, it contaminated my learning data, and having had the bug before makes play worse now.
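
To get a feel for why the switch back is slow, here is a back-of-the-envelope sketch. It is not Steamhammer’s actual learning code. It assumes a simplified learner that pools every recorded game of an opening with equal weight and prefers whichever opening has the higher recorded win rate. The function name and all the numbers are made up for illustration.

```python
import math
from fractions import Fraction


def games_to_recover(old_wins, old_games, new_rate, rival_rate):
    """Smallest number of new games after which an opening's pooled
    win rate strictly exceeds rival_rate.

    Assumptions (hypothetical model, not Steamhammer's algorithm):
    - old and new games are pooled with equal weight;
    - new games are won at exactly new_rate (expected value, no noise).
    Rates are given as strings or Fractions to avoid float rounding.
    Returns None if the pooled rate can never exceed rival_rate.
    """
    new_rate = Fraction(new_rate)
    rival_rate = Fraction(rival_rate)

    # How many wins the old record is short of matching the rival's rate.
    deficit = rival_rate * old_games - old_wins
    if deficit < 0:
        return 0  # the old record is already ahead
    if new_rate <= rival_rate:
        return None  # new games never pull the average above the rival

    # Solve (old_wins + new_rate * n) / (old_games + n) > rival_rate for n.
    return math.floor(deficit / (new_rate - rival_rate)) + 1


# Hypothetical numbers: the macro opening went 20-80 while the bug existed
# and would now win 70% of its games; the all-in opening sits at 50%.
print(games_to_recover(20, 100, "0.7", "0.5"))  # prints 151
```

With these invented numbers, the macro opening needs 151 games at its new win rate before its record overtakes the all-in opening. A learner that mostly plays its current favorite would give the macro opening only occasional exploratory games, so under this model the wait in calendar time would be longer still.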

Learning hides bugs. Which would be fine if it hid them completely, but of course that’s impossible. Bugs and weaknesses hurt less when learning can find a workaround, but still hurt. It becomes harder to evaluate your bot’s play and choose which weaknesses are more important to work on.

It makes me think that, if you’re making a serious evaluation of how well your bot is performing, you need to do some tests with learning turned off. Drop the crutch and try to walk without it. For example, you could take learning data from a previous version and freeze it, and run a test to see if there are regressions in playing strength versus particular opponents or when playing particular builds.
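
A frozen-data regression test along those lines could be sketched like this. It is only an illustration under stated assumptions: each test run is a list of ((opponent, opening), won) records from games played with the same frozen learning data, and the function names, thresholds, and sample records are all hypothetical.

```python
from collections import defaultdict


def win_rates(games):
    """Tally (wins, total) per key from ((opponent, opening), won) records."""
    tally = defaultdict(lambda: [0, 0])
    for key, won in games:
        tally[key][0] += int(won)
        tally[key][1] += 1
    return {key: (wins, total) for key, (wins, total) in tally.items()}


def find_regressions(baseline, candidate, min_games=20, min_drop=0.10):
    """List (key, drop) pairs where the candidate version's win rate is at
    least min_drop below the baseline's.

    Keys with fewer than min_games in either run are skipped, since a
    handful of games says little. Both thresholds are arbitrary choices.
    """
    base = win_rates(baseline)
    cand = win_rates(candidate)
    flagged = []
    for key in sorted(base.keys() & cand.keys()):
        base_wins, base_total = base[key]
        cand_wins, cand_total = cand[key]
        if base_total < min_games or cand_total < min_games:
            continue
        drop = base_wins / base_total - cand_wins / cand_total
        if drop >= min_drop:
            flagged.append((key, round(drop, 3)))
    return flagged


# Invented records: the old version went 15-5 with this opening against
# this opponent, the new version 10-10, both using the same frozen data.
key = ("ExampleBot", "macro_opening")
old_run = [(key, True)] * 15 + [(key, False)] * 5
new_run = [(key, True)] * 10 + [(key, False)] * 10
print(find_regressions(old_run, new_run))
# prints [(('ExampleBot', 'macro_opening'), 0.25)]
```

Because learning is frozen in both runs, a drop flagged here points at the code change rather than at the learner drifting to a different opening. A raw difference in win rates is crude, and with small samples a proper significance test would be a safer judge.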

Comments

MarcoDBAA on Saturday, December 12. 2020:

Use the BASIL learning data? It should recover faster. If so, might be useful to switch to the old (= SSCAIT) maps there.

I once suggested that authors might completely randomize their random bot (learning off). Enable everything and go. Could have been especially interesting with the tscmoo bot, which was able to do some crazy stuff (terran nukes; protoss dark archon or scouts with shield battery; zerg mass overlord drops).

You could randomize Randomhammer zerg? Or at least explore more with it?

Jay Scott on Sunday, December 13. 2020:

I’m thinking my time will be best spent fixing bugs and weaknesses with permanent effect rather than fussing with data that may help in this tournament.

Jay Scott on Sunday, December 13. 2020:

Speaking of exploring more, when it comes time to switch to 100% data-driven opening selection, I expect I’ll have to turn exploration way up to collect all the data. That will probably be built into the algorithm: If there’s not enough data about it, try it and see.
