panic button and fish story

Yesterday’s post was about prior knowledge. The posts before were about learning. Today’s is about prior knowledge for learning.

I was inspired by a remark from Dave Churchill, author of UAlbertaBot, in his new A History of Starcraft AI Competitions: In AIIDE 2015 “UAlbertaBot had [only] a 2/3 winning percentage against some of the lower ranking bots due to the fact that one of the 3 races did not win against those bots.” UAlbertaBot, playing random, had its learning turned off, presumably because the selected strategy for each race was dominant. With learning turned on, it would have lost games trying weaker strategies before settling on the dominant strategy, ending up behind overall. At least that was the thinking, if my guess is right.

Well, that’s like Bisu defeating Savior. When somebody comes up with a counter for the game plan you thought was dominant, don’t you think you should try something different?

You can have it both ways. You can restrict yourself to playing your dominant strategy unless and until it turns out to lose repeatedly. You don’t have to lose games exploring your options; you can take losing to mean that you should start exploring your options.

The panic button implementation is simple. Start out recording the game results as usual, as if learning were turned on, but ignore them and always pick your dominant strategy. But when you get to (say) more than 10 games with a win rate under 10%, hit the panic button and let your learning algorithm try alternatives. It’s unlikely to make things worse!
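Here is a minimal sketch of the panic button in Python. Everything in it is my own illustration, not code from any real bot: the class name `PanicButtonSelector`, the thresholds (taken from the "say" numbers above), and especially the rule used after the button is pressed, which is a bare-bones stand-in (try anything untried, then favor the best observed win rate) for whatever learning algorithm you actually use.

```python
class PanicButtonSelector:
    """Play the dominant strategy until results are bad enough to panic."""

    def __init__(self, dominant, alternatives, min_games=10, max_win_rate=0.10):
        self.dominant = dominant
        self.strategies = [dominant] + list(alternatives)
        self.min_games = min_games        # panic only after MORE than this many games
        self.max_win_rate = max_win_rate  # ...and only with a win rate BELOW this
        self.record = {s: [0, 0] for s in self.strategies}  # strategy -> [wins, games]
        self.panicked = False

    def report(self, strategy, won):
        # Always record results, even while they are being ignored.
        stats = self.record[strategy]
        stats[0] += int(won)
        stats[1] += 1
        wins = sum(w for w, _ in self.record.values())
        games = sum(g for _, g in self.record.values())
        if games > self.min_games and wins / games < self.max_win_rate:
            self.panicked = True  # the button stays pressed once hit

    def choose(self):
        if not self.panicked:
            return self.dominant
        # Stand-in learning rule (an assumption): untried strategies first,
        # then the strategy with the best observed win rate.
        untried = [s for s in self.strategies if self.record[s][1] == 0]
        if untried:
            return untried[0]
        return max(self.strategies,
                   key=lambda s: self.record[s][0] / self.record[s][1])
```

Call `choose()` before each game and `report()` after it. Until the button is hit, `choose()` returns the dominant strategy no matter what the record says.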

The fish story implementation is also simple. Pretend, before the first game with a new opponent, that you actually have a history with this opponent. Tell yourself a fish story: “Oh, strategy A, I tried that a few times and always won. And strategy B sucked, I tried that a time or two and lost.” It’s literally a few lines of code to slide fictitious history into your learning data, and you’re done. Your strategy selection algorithm will look at it and say “Strategy A, duh,” and as long as A keeps winning it will explore others at a low rate.
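To make the "few lines of code" concrete, here is a sketch under stated assumptions. I use UCB1 as the selection rule only because it is a common, deterministic bandit formula; your bot may use something else. The class name `FishStoryBandit` and the pretend records are hypothetical. The only fish-story part is the constructor, which loads made-up results into the same table that real results go into.

```python
import math


class FishStoryBandit:
    """UCB1 strategy selection, seeded with fictitious history."""

    def __init__(self, fish_story):
        # fish_story maps strategy -> (pretend wins, pretend games).
        # Every strategy needs at least one pretend game, or UCB1 divides by zero.
        self.stats = {s: [wins, games] for s, (wins, games) in fish_story.items()}

    def choose(self):
        total = sum(games for _, games in self.stats.values())

        def ucb1(strategy):
            wins, games = self.stats[strategy]
            # Observed win rate plus an exploration bonus that shrinks
            # as the strategy accumulates games.
            return wins / games + math.sqrt(2 * math.log(total) / games)

        return max(self.stats, key=ucb1)

    def report(self, strategy, won):
        # Real results are added on top of the fictitious ones.
        self.stats[strategy][0] += int(won)
        self.stats[strategy][1] += 1


# "Strategy A, I tried that a few times and always won. B lost a time or two."
bot = FishStoryBandit({"A": (3, 3), "B": (0, 2)})
first_choice = bot.choose()  # A has the better story, so A is picked first
```

The more pretend games you give a strategy, the longer the story takes to be overruled by real results, so the size of the fictitious record is a knob for how stubborn the bot is.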

The simpleminded learning algorithms that bots use today assume that you start out knowing nothing about which choices are better. And that’s just false. You always know that some strategies are stronger than others, that some are safe and work against many opponents while others are risky and only exploit certain weaknesses. With the fish story, your bot can start out knowing that A is reliable (“it won repeatedly”), B is a fallback (“it lost once”), and C can be tried if all else fails (“it lost a few times”) in a last ditch attempt to trick a few points out of a killer opponent. Or any combination you want.
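A small simulation shows how the three stories rank the strategies. This is only a sketch: the greedy rule with add-one smoothing is my assumption, chosen because it is easy to follow by hand, and the pretend records (A won 3 of 3, B lost 1 of 1, C lost 3 of 3) are made up. The opponent here is a killer that wins every game.

```python
def smoothed_win_rate(wins, games):
    # Add-one smoothing, so a 0-for-1 record is not treated as hopeless.
    return (wins + 1) / (games + 2)


def pick(history):
    # Greedy choice: the strategy whose record currently looks best.
    return max(history, key=lambda s: smoothed_win_rate(*history[s]))


def first_tried_order(fish_story, games):
    """Order in which strategies are first tried when every game is lost."""
    history = {s: list(record) for s, record in fish_story.items()}
    order = []
    for _ in range(games):
        strategy = pick(history)
        if strategy not in order:
            order.append(strategy)
        history[strategy][1] += 1  # a loss: one more game, no more wins
    return order


story = {"A": (3, 3), "B": (0, 1), "C": (0, 3)}
order = first_tried_order(story, games=40)
```

With these records the bot sticks with A for a while, falls back to B, and only reaches C after both A and B have sunk below it, so `order` comes out as A, then B, then C.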

If you have prior knowledge about your opponents but you’re not sure whether they’ll have updates for the tournament, you can go Baron Munchausen and tell yourself a different fish story about each opponent.
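One way to keep a separate story per opponent is a lookup table with a default, sketched below. The opponent name `ExampleRushBot`, the records, and the function `initial_history` are all hypothetical. Because the stories are only starting records, an opponent that shows up with an update can still overturn them through real results.

```python
# Default story for opponents we know nothing about: strategy -> (wins, games).
DEFAULT_STORY = {"A": (3, 3), "B": (0, 1), "C": (0, 3)}

# Per-opponent stories, keyed by opponent name (hypothetical example entry).
OPPONENT_STORIES = {
    "ExampleRushBot": {"A": (0, 2), "B": (3, 3), "C": (0, 1)},
}


def initial_history(opponent_name, saved_history=None):
    """Return the learning data to start from against this opponent."""
    if saved_history:
        # Real results from earlier games take priority over any story.
        return saved_history
    story = OPPONENT_STORIES.get(opponent_name, DEFAULT_STORY)
    # Copy into mutable [wins, games] lists so the templates stay untouched.
    return {strategy: list(record) for strategy, record in story.items()}
```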

Many variations and other ideas work too. Think about your strategy choices and your selected algorithm and how you would like it to behave. You can probably find your own ways.

Update: Dave Churchill told me the real reason behind UAlbertaBot’s decision: He ran out of time! He wrote that he actually implemented a “panic button” method, but did not have time before the tournament to test it and make sure it was solid. I think it’s enough that UAlbertaBot can play random. Progress comes one step at a time.
