AIIDE 2020 - McRave’s learning algorithm
I meant to summarize McRave’s learning data today, but to know what to put in the tables I had to understand how the numbers are used. Yesterday I examined McRave’s strategy representation with three elements, like “PoolHatch,Overpool,2HatchMuta”. In the code, the elements are named “build” (like PoolHatch), “opener” (like Overpool) and “transition” (like 2HatchMuta). Today I read the code to see what the numbers in the learning files are and how they are used.
Here’s a sample data file, showing McRave doing well versus Steamhammer. The first two numbers are the overall wins and losses. After that, delimited by dashes, is a section for the first build, followed by a section for the openers of that build and a section for the transitions of the build. Then more sections for the other two builds and their appendages. Each element has an independent count of wins and losses.
86 64 - HatchPool 0 0 - 12Hatch 0 0 - 2HatchMuta 0 0 2HatchSpeedling 0 0 - PoolHatch 28 27 - 4Pool 0 0 9Pool 0 0 Overpool 0 0 12Pool 28 27 - 2HatchMuta 16 17 2HatchSpeedling 12 10 3HatchSpeedling 0 0 - PoolLair 58 37 - 9Pool 58 37 - 1HatchMuta 58 37
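The layout described above can be read in one pass. Here is a minimal sketch in Python rather than McRave's C++. The function names and the dictionary shape are invented for illustration, and it assumes the file holds nothing beyond what the sample shows: whitespace-separated tokens, with a lone dash between sections.

```python
SAMPLE = (
    "86 64 - HatchPool 0 0 - 12Hatch 0 0 - 2HatchMuta 0 0 2HatchSpeedling 0 0 "
    "- PoolHatch 28 27 - 4Pool 0 0 9Pool 0 0 Overpool 0 0 12Pool 28 27 "
    "- 2HatchMuta 16 17 2HatchSpeedling 12 10 3HatchSpeedling 0 0 "
    "- PoolLair 58 37 - 9Pool 58 37 - 1HatchMuta 58 37"
)


def split_sections(text):
    """Split the token stream into sections at each lone '-' token."""
    sections, current = [], []
    for token in text.split():
        if token == "-":
            sections.append(current)
            current = []
        else:
            current.append(token)
    sections.append(current)
    return sections


def read_records(section):
    """Turn ['name', 'wins', 'losses', ...] into {name: (wins, losses)}."""
    return {
        section[i]: (int(section[i + 1]), int(section[i + 2]))
        for i in range(0, len(section), 3)
    }


def parse_learning_data(text):
    """Return ((wins, losses), builds) for one opponent's learning data."""
    sections = split_sections(text)
    totals = (int(sections[0][0]), int(sections[0][1]))
    builds = {}
    # After the totals, sections come in groups of three:
    # the build itself, its openers, its transitions.
    for i in range(1, len(sections), 3):
        name, record = next(iter(read_records(sections[i]).items()))
        builds[name] = {
            "record": record,
            "openers": read_records(sections[i + 1]),
            "transitions": read_records(sections[i + 2]),
        }
    return totals, builds


totals, builds = parse_learning_data(SAMPLE)
```

In the sample, the build-level wins and losses sum to the overall record (0 + 28 + 58 = 86 wins, 0 + 27 + 37 = 64 losses), which makes a handy sanity check on a parsed file.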
The code calls a function to check which triples are allowed and deals with other minor details, but even with the fiddly bits it’s simple: It picks the build with the highest UCB value, then given that build the corresponding opener with the highest UCB value, then given that build and opener the transition with the highest UCB value. Because of how the data file is organized, this can be done in one pass. The code is in the file LearningManager.cpp in the nested function parseLearningFile().
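The three-step choice can be sketched as below. This is not McRave's code. It assumes the textbook UCB1 formula (win rate plus an exploration bonus), an exploration constant `c`, that untried options score infinity, and that the overall game count is used at every level. None of those details are confirmed here. The set of allowed triples is also hypothetical and stands in for McRave's own check.

```python
import math


def ucb1(wins, losses, total_games, c=math.sqrt(2)):
    """Textbook UCB1 score; assumed, not taken from McRave."""
    games = wins + losses
    if games == 0:
        return math.inf  # assumption: an untried option is explored first
    return wins / games + c * math.sqrt(math.log(total_games) / games)


def best(records, total_games, c):
    """Name of the record with the highest UCB1 score."""
    return max(records, key=lambda name: ucb1(*records[name], total_games, c))


def choose_strategy(builds, total_games, allowed, c=math.sqrt(2)):
    """Pick build, then opener, then transition, honoring allowed triples."""
    ok_builds = {b for b, _, _ in allowed}
    build = best(
        {b: d["record"] for b, d in builds.items() if b in ok_builds},
        total_games, c)

    ok_openers = {o for b, o, _ in allowed if b == build}
    opener = best(
        {o: r for o, r in builds[build]["openers"].items() if o in ok_openers},
        total_games, c)

    ok_transitions = {t for b, o, t in allowed if b == build and o == opener}
    transition = best(
        {t: r for t, r in builds[build]["transitions"].items()
         if t in ok_transitions},
        total_games, c)
    return build, opener, transition


# Counts copied from the sample file; zero-game entries omitted for brevity.
builds = {
    "PoolHatch": {
        "record": (28, 27),
        "openers": {"12Pool": (28, 27)},
        "transitions": {"2HatchMuta": (16, 17), "2HatchSpeedling": (12, 10)},
    },
    "PoolLair": {
        "record": (58, 37),
        "openers": {"9Pool": (58, 37)},
        "transitions": {"1HatchMuta": (58, 37)},
    },
}

# Hypothetical list of allowed triples for this matchup.
allowed = {
    ("PoolHatch", "12Pool", "2HatchMuta"),
    ("PoolHatch", "12Pool", "2HatchSpeedling"),
    ("PoolLair", "9Pool", "1HatchMuta"),
}

choice = choose_strategy(builds, 86 + 64, allowed, c=0.5)
```

With a modest exploration constant such as `c=0.5`, the sketch settles on PoolLair, 9Pool, 1HatchMuta for this data. A larger constant gives more weight to the less-played PoolHatch, so the choice depends on a constant that is only assumed here.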
In theory, this three-level hierarchy could speed up learning. For example, you might be able to conclude that PoolHatch is better than PoolLair against some opponent, even if you don’t have enough data to know which PoolHatch opener or transition is best. My intuition is that the hierarchical scheme should on average work better than a flat scheme, but that there will be perverse situations where it does worse. Many of the triples are not allowed, which limits the value of the hierarchy. There should be enough data from this tournament to judge whether the hierarchy brought an advantage; it would be interesting to do the analysis.
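One way to see the possible speedup: a build pools the games of everything beneath it, so its estimate tightens faster than any single transition's. The sketch below compares exploration bonuses using the sample counts. It assumes the textbook UCB1 bonus, which may not match McRave's formula, and the function name is invented for illustration.

```python
import math


def exploration_bonus(games, total_games, c=math.sqrt(2)):
    """Textbook UCB1 uncertainty term; shrinks as games grows (assumed form)."""
    return c * math.sqrt(math.log(total_games) / games)


total = 86 + 64  # overall games in the sample file

# PoolHatch as a whole has 28 + 27 = 55 games behind it ...
build_bonus = exploration_bonus(28 + 27, total)

# ... while each of its played transitions has only part of that.
leaf_bonuses = {
    "2HatchMuta": exploration_bonus(16 + 17, total),
    "2HatchSpeedling": exploration_bonus(12 + 10, total),
}
```

The build-level bonus comes out smaller than either transition's, which is the sense in which the build can be judged before its transitions can.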
Next: OK, now I know what tables to generate. I have to add some features to my script, but soon I should be able to post the summary tables.
Comments
McRave on :