Steamhammer’s learning results
When I uploaded Steamhammer 1.4.3 to SSCAIT on 11 June, I erased its learned data from the server. Its elo immediately plunged, partly because the voters wanted to put it through its paces against strong opponents, and partly because it needs its learning data to cope. Most rushbots, and many others, won their first or first few games against Steamhammer. That didn’t change when I uploaded 1.4.4 a week later—the improvements weren’t many.
Only in the past several days have I started to feel that Steamhammer has learned enough that it is closing in on its equilibrium elo. It has been wavering around the high 2100s, unable to break above 2200 but not falling far either. That seems about right, at least under SSCAIT conditions.
The findings. As I’ve mentioned, clearing the learning data was a deliberate test of how well the learning system works when starting from scratch. I’m fairly pleased. I see weaknesses, but only weaknesses I expected.

Against XIMP, Steamhammer has settled on an opening that has won every game so far, but one that is not as strong as the 3 hatch before pool opening I hand-chose for it in the old days. Steamhammer only sees that it wins. It can’t tell the difference between two openings that each win nearly all their games, because it sees only the winning rate; it needs an evaluation function that can tell it “this one wins more convincingly.”

Against Proxy, it won one game with its unusual 6 pool opening. Then it played another game and recorded another win, because Proxy crashed. Steamhammer thought it had found a winner, and had to lose some games before it realized that the 6 pool was not a reliable counter. (It would be, if not for Proxy’s powerful worker defense.) Possibly 5 pool or 4 pool would succeed, but Steamhammer does not know that some openings are related to others. When I teach it that, it will be able to realize that if one opening shows promise but falls short, it should try related openings.
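Steamhammer’s actual data format and selection code aren’t shown here, but the limitation is easy to see in a minimal sketch (all names below, like `OpeningRecord` and `choose_opening`, are hypothetical). Selection by pure win rate cannot separate two openings that have each won every recorded game, no matter how convincing the wins were:

```python
import random

class OpeningRecord:
    """Hypothetical win/loss tally for one opening against one opponent."""
    def __init__(self):
        self.wins = 0
        self.games = 0

    def win_rate(self):
        # An unplayed opening gets an optimistic prior so it is tried at least once.
        return 1.0 if self.games == 0 else self.wins / self.games

def choose_opening(records, epsilon=0.1, rng=random):
    """Pick the opening with the best observed win rate; explore occasionally.
    Ties on win rate are broken only by games played - nothing more informative."""
    if rng.random() < epsilon:
        return rng.choice(list(records))
    return max(records, key=lambda name: (records[name].win_rate(), records[name].games))

# Two openings that have each won all their games look identical to this rule,
# even if one wins crushingly and the other barely scrapes by.
records = {"3HatchBeforePool": OpeningRecord(), "OverpoolSpeed": OpeningRecord()}
records["3HatchBeforePool"].wins = records["3HatchBeforePool"].games = 5
records["OverpoolSpeed"].wins = records["OverpoolSpeed"].games = 5
assert records["3HatchBeforePool"].win_rate() == records["OverpoolSpeed"].win_rate() == 1.0
```

Recording an extra evaluation per game, say a margin-of-victory score, would give the selector something to break such ties with.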
In some cases, Steamhammer hit on surprising counters. The most striking example is against TyrProtoss, which had been winning every game with its cannon turtle into timing attack strategy. Steamhammer tried its 2 hatch lurker all-in attack, which did not make sense to me; Steamhammer’s lurkers suck when cannons are around, and it has little idea how to break the cannons and no idea how to bypass them. But it won a game. We’ll see if it keeps winning!
I expected the weaknesses, and I expected the surprising counters. I feel as though I understood the learning system and its limitations fairly well. It gives me confidence that my planned improvements, when I finally get around to them, will be real improvements.