archive by month
Skip to content

AIIDE 2020 - McRave’s learning algorithm

I meant to summarize McRave’s learning data today, but to know what to put in the tables I had to understand how the numbers are used. Yesterday I examined McRave’s strategy representation with three elements, like “PoolHatch,Overpool,2HatchMuta”. In the code, the elements are named “build” (like PoolHatch), “opener” (like Overpool) and “transition” (like 2HatchMuta). Today I read the code to see what the numbers in the learning files are and how they are used.

Here’s a sample data file, showing McRave doing well versus Steamhammer. The first two numbers are the overall wins and losses. After that, delimited by dashes, is a section for the first build, followed by a section for the openers of that build and a section for the transitions of the build. Then more sections for the other two builds and their appendages. Each element has an independent count of wins and losses.

86 64
-
HatchPool 0 0
-
12Hatch 0 0
-
2HatchMuta 0 0
2HatchSpeedling 0 0
-
PoolHatch 28 27
-
4Pool 0 0
9Pool 0 0
Overpool 0 0
12Pool 28 27
-
2HatchMuta 16 17
2HatchSpeedling 12 10
3HatchSpeedling 0 0
-
PoolLair 58 37
-
9Pool 58 37
-
1HatchMuta 58 37

The code calls a function to check which triples are allowed and deals with other minor details, but even with the fiddly bits it’s simple: It picks the build with the highest UCB value, then given that build the corresponding opener with the highest UCB value, then given that build and opener the transition with the highest UCB value. Because of how the data file is organized, this can be done in one pass. The code is in the file LearningManager.cpp in the nested function parseLearningFile().

In theory, this three-level hierarchy could speed up learning. For example, you might be able to conclude that PoolHatch is better than PoolLair against some opponent, even if you don’t have enough data to know which PoolHatch opener or transition is best. My intuition is that the hierarchical scheme should on average work better than a flat scheme, but that there will be perverse situations where it does worse. Many of the triples are not allowed, which limits the value of the hierarchy. There should be enough data from this tournament to judge whether the hierarchy brought an advantage; it would be interesting to do the analysis.

Next: OK, now I know what tables to generate. I have to add some features to my script, but soon I should be able to post the summary tables.

Trackbacks

No Trackbacks

Comments

McRave on :

Another spot on analysis. It's a simple method that I started doing because it, as you said, gave a quicker way to learn a general build with more time refining which transition is better. It additionally keeps reactions more narrow, keeping them within the confines of a build and opener.

Add Comment

E-Mail addresses will not be displayed and will only be used for E-Mail notifications.

To prevent automated Bots from commentspamming, please enter the string you see in the image below in the appropriate input box. Your comment will only be submitted if the strings match. Please ensure that your browser supports and accepts cookies, or your comment cannot be verified correctly.
CAPTCHA

Form options

Submitted comments will be subject to moderation before being displayed.