experience on the BWAPI bots ladder
I like the BWAPI ladder. It doesn’t seem to have an official name; I’ll just call it “the ladder”.
I’ve enjoyed following the games. I play over most of Steamhammer’s games on the ladder every day. The ladder makes random pairings, so it feeds me a wider variety of opponents and I see more strengths and weaknesses. Also the ladder plays more games in total, because it plays games at full speed, not slowed down for streaming.
Providing accurate rankings and elo is the primary purpose of the ladder, at least as I see it. With a diet of randomly chosen opponents, Steamhammer’s rank stabilized at #8, behind CherryPi and ahead of TyrProtoss, with elo steady in the 2230s. On SSCAIT, the same version’s ranking is not stable. It has varied by over a factor of 2, with the elo tending to rise into the low 2200s, then fall rapidly, then slowly rise again, depending on the whims of the voters. When the voters see Steamhammer with a high rank, they tend to pair it against opponents that it will lose rating points to; after it has lost the rating points, they tend to pay less attention. Steamhammer ends up below its equilibrium elo, and the popular opponents that defeat it end up overrated. The ladder pairs bots fairly, so it better predicts tournament performance.
Randomhammer’s rank can’t be compared across competitions so neatly, though, because the competitions treat random players differently. The difference in rules makes it less useful for predicting the tournament performance of a random player.
File I/O seems to work a little differently in each competition. They are all based on Dave Churchill’s tournament manager software, but each competition uses a different version or tweaks it differently, and the behavior is not exactly the same. They all have a read directory and a write directory: the bot gets read-only access to read and write-only access to write, and the tournament software copies the contents of write to read. They differ in whether and when they clear the directories. AIIDE proceeds in all-play-all rounds, and clears write at the end of each round of many games. SSCAIT and the ladder proceed by single games, and can’t do exactly the same thing. I believe that SSCAIT never clears write. I don’t know what the ladder does, but its behavior is different, and Steamhammer’s code doesn’t work correctly there.
Steamhammer’s problem, I saw immediately when I requested and received the stored data, is that its record of games against each opponent only extends back 1 game. Instead of the whole history, the opponent model has to draw conclusions based on the one previous game. Data is being cleared at some point; perhaps write is cleared before each game. Steamhammer appends data to the opponent’s file after each game, which works on SSCAIT. I think if I change it to rewrite the entire data file (originally read from read), instead of only appending the new game record, it will work everywhere, including the ladder and the AIIDE tournament. I won’t know for sure until it happens, though, because the details are not documented. The change will be in the next version, 1.4.2.
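As a minimal sketch of the rewrite-instead-of-append idea, and not Steamhammer’s actual code: the function below loads whatever history exists in the read directory, adds the new game record, and writes the whole thing to the write directory. The function names, the one-record-per-line file format, and the use of C++17 `std::filesystem` are all assumptions made for illustration; the directory paths are passed in rather than hard-coded because each competition may set them up differently.

```cpp
#include <filesystem>
#include <fstream>
#include <string>
#include <vector>

namespace fs = std::filesystem;

// Hypothetical helper: load one game record per line from a history file.
// A missing or unreadable file simply yields an empty history.
std::vector<std::string> loadHistory(const fs::path& file) {
    std::vector<std::string> records;
    std::ifstream in(file);
    for (std::string line; std::getline(in, line); ) {
        if (!line.empty()) {
            records.push_back(line);
        }
    }
    return records;
}

// Hypothetical helper: write the full history plus the new record to the
// write directory, replacing any file already there. Because the complete
// file is rewritten, the result does not depend on whether the competition
// cleared the write directory before the game.
void saveHistory(const fs::path& readDir,
                 const fs::path& writeDir,
                 const std::string& opponentFile,
                 const std::string& newRecord) {
    std::vector<std::string> records = loadHistory(readDir / opponentFile);
    records.push_back(newRecord);

    std::ofstream out(writeDir / opponentFile, std::ios::trunc);
    for (const std::string& record : records) {
        out << record << '\n';
    }
}
```

With appending, a cleared write directory leaves a file holding only the latest game, which matches the one-game history described above. Rewriting the whole file should avoid that, provided the competition keeps copying write to read.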
Call it a bug in Steamhammer. The bug means that Steamhammer’s rank and elo can’t be compared between SSCAIT and the ladder, even though the opponents are mostly the same in both. It’s possible that Steamhammer plays better with the bug, so its higher rank on the ladder is justified. The point about the stability of the rank stands, though.
Comments
Bytekeeper on :
It does not mention files being copied to the write folder.
But it does mention that you need to name files uniquely to prevent them from overwriting the files on the backup server!
I believe the write folder is always empty and is aggregated on the backup write folder on the server.
So I guess, either copy read to write like you did or create a unique file per match.
Bruce on :
Your suggested fix would work for this case as well.
krasi0 on :
SSCAIT, OTOH, only clears the contents of read/ and write/ when you upload something to read/ manually.