performance differences on BASIL ladder
On the BASIL ladder, Steamhammer performed poorly in its early days. But today Steamhammer ranks #8, ahead of Krasi0P. If we skip the BASIL participants that are not in the SSCAI tournament (Krasi0 terran and ChimeraBot), that corresponds to place #6 on SSCAIT, just behind BananaBrain, as compared to Steamhammer’s actual tournament finish at #11. The rise matches the general shape of Steamhammer’s performance curve in AIIDE 2018, starting low and rising strongly, but seems even more dramatic.
Many bots have different rankings on BASIL compared to SSCAIT. Random bots are handicapped on BASIL by comparison, since the opponent knows the race ahead of time. There are other differences in rules, plus the environment can cause different reliability and possibly different behavior. For most bots, I think these differences should not matter much—though anything could happen for a bot with reliability problems. Am I wrong? Am I missing something that can make a big difference?
If I’m right, then the important difference is that BASIL plays more games, so learning bots learn more. Other than environment-specific bugs, I don’t know another way to explain big differences in rank, such as Killerbot by Marian Devecka being #19 on BASIL while it came in #7 in SSCAIT 2018: Killerbot is not a learning bot. Another difference (just to give a second example) is that BASIL ranks Ecgberht one step below Arrakhammer, rather than far below (SSCAIT #16 versus #33): Ecgberht is a learning bot.
Steamhammer has a surprisingly high crash rate on BASIL, over 6%. It doesn’t crash remotely that often on SSCAIT. I’ll have to look into that.
Comments
Joseph Huang on :
Dan on :
Ecgberht on :
I only really worked on the SSCAIT update in the last 2 days before the SSCAIT deadline, so I did not test enough and I introduced a few undetected, critical bugs (crashes against the zealot rush bots, for example, academy first with the 2 fac build, etc.).
I guess that without those few bugs Ecg would have been around a 58~60% winrate instead of the 55-56% it got.
Even with the problems, I'm happy with Ecg's performance this year :D
Jay Scott on :
MicroDK on :
Bruce on :
Marian on :
http://www.openbw.com/replay-viewer/?rep=http://basilicum.bytekeeper.org/bots/JumpyDoggoBot/JumpyDoggoBot%20vs%20Marian%20Devecka%20Fighting%20Spirit%20CTR_E5F9535.rep
I have only seen it in ZvZ, and it might affect BASIL more than SSCAIT. I have to investigate more...
Bytekeeper on :
113 of the 116 marked crashes are caused by either of the 2 bot docker containers failing to start (or crashing so horribly that no result remains).
SC_DOCKER (even my fork) doesn't announce which bot failed if one of them crashes immediately (or manages to crash its docker container). BASIL cannot determine the cause and will add a "crash" for both bots.
This might help bot authors to find real crashes. It also reminds me to fix this problem at some point.
Every time a bot is marked as crashed, its log files will be saved - yours are here: http://basilicum.bytekeeper.org/bots/Steamhammer/logs/
I checked a few logs and couldn't find a problem.
But I found "more work": The log files show a game was played but still one container crashed. This shouldn't happen, I'll have to investigate it further.
I'm glad it doesn't happen too often.
PS: Those kinds of crashes are not counted as losses, and the ELO rating will not be updated in that case.
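To make the rating rule concrete, here is a minimal sketch of an Elo update that leaves both ratings untouched when a game is flagged as a crash. It uses the standard Elo formula; the function names, the `crashed` flag, and the K-factor of 32 are assumptions for illustration, not BASIL's actual code or parameters.

```python
def expected_score(rating_a: float, rating_b: float) -> float:
    """Standard Elo expected score for player A against player B."""
    return 1.0 / (1.0 + 10.0 ** ((rating_b - rating_a) / 400.0))


def update_ratings(rating_a: float, rating_b: float, score_a: float,
                   crashed: bool = False, k: float = 32.0):
    """Return the new (rating_a, rating_b) after one game.

    score_a is 1.0 for a win by A, 0.0 for a loss, 0.5 for a draw.
    If the game was flagged as a crash, it is not rated at all.
    k=32 is an assumed K-factor, not a value taken from BASIL.
    """
    if crashed:
        # Crash games count as neither a win nor a loss.
        return rating_a, rating_b
    delta = k * (score_a - expected_score(rating_a, rating_b))
    return rating_a + delta, rating_b - delta
```

With this rule, a bot whose container fails to start loses no rating points, but it also plays fewer rated games than its crash-free opponents.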
Jay Scott on :
Bytekeeper on :
A small update on my analysis: I could reproduce the problem playing Steamhammer vs Flash a few times. It only seems to happen if Steamhammer wins against Flash.
"Sometimes" it seems to hang in a busy loop after it has won. After 70 seconds sc-docker "detects" this as a crashed game.
Jay Scott on :
Bytekeeper on :
I can't always reproduce it. But I also can't reproduce it with other bots at all.
Is there a way it could run into an endless loop? Maybe due to some file permission restriction within sc-docker/linux?
Jay Scott on :
So any problems have to be due to either the wait period or to writing the opponent model. To me they seem equally likely suspects.
Jay Scott on :
Bytekeeper on :
I believe onFrame might be called after onEnd, maybe something is hidden there.
This game was lost by Assberht, which also acknowledges it:
http://basilicum.bytekeeper.org/bots/Steamhammer
SC_Docker says "One lingering container has been found after single container timeout (70 sec), the game probably crashed"
I also got similar ones for Locutus and Randomhammer. My log config is a bit "shallow" - so this is in the span of 1-2 days.
Nevertheless, every time a lingering bot was found, it was a Steamhammer based bot. That doesn't mean there's a problem in there per se, but it makes it very hard for me to analyze further.
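The "lingering container" check described in the message above can be pictured roughly as follows. This is only a sketch of the idea, assuming that the check compares the time since the first container exited against the 70-second timeout; the function name and its parameters are hypothetical and are not taken from sc-docker.

```python
LINGER_TIMEOUT_SECONDS = 70.0  # timeout quoted in the sc-docker message


def is_lingering(first_exit_time: float, now: float,
                 other_still_running: bool,
                 timeout: float = LINGER_TIMEOUT_SECONDS) -> bool:
    """Return True if the game should be treated as probably crashed.

    first_exit_time: when the first of the two bot containers stopped.
    now: current time, in the same units (seconds).
    other_still_running: whether the second container is still alive.
    """
    return other_still_running and (now - first_exit_time) >= timeout
```

Under this reading, a bot that finishes the game but never lets its process exit looks the same to the ladder as one that crashed mid-game.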
Bytekeeper on :
He thinks the VMs other ladders/tourneys use ignore the hanging and just shut down the VM. Maybe that's why it doesn't affect them.
Jay Scott on :
Jay Scott on :
Bytekeeper on :
Bruce Nielsen is also trying to debug again, thanks for that!
Jay Scott on :
Bytekeeper on :
http://basilicum.bytekeeper.org/bots/Dave%20Churchill/logs/?C=M;O=D
The first 2 are too large, yes, but the other crashes are pretty "old" and seem to be mostly game timeouts. But since some logs are not uploaded because they had too much output, there is a chance that they have the issue too.
Jay Scott on :
Bruce on :
For debugging it I added logging to all the handlers. The bot correctly processed the onEnd handler and onFrame is not called again afterwards. I’ve also tested with all file i/o disabled except loading the configuration file and it still happens, so I don’t think it is related to the actual end-of-game logic. It’s more likely to be some hanging resource preventing clean shutdown, though with file i/o disabled I’m not sure what that would be.
I’m going to try to see if I can get a container to stay hanging for long enough to shell into it and see what I can find there.
Jay Scott on :
Jay Scott on :
Bruce on :
I made a quick hack that exits the destructor immediately if the game is over (just checking a global bool that is set in onEnd), and it seems to be working: no issues after running about 40 test games locally. I just uploaded it to SSCAIT, so we’ll see how it goes when BASIL is updated.
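The shape of that hack can be illustrated with a small Python analogue. The real change is in the bot's C++ destructor, which is not shown in this thread, so everything below (the `Bot` class, `teardown`, the `cleanup_ran` marker) is a hypothetical stand-in that only shows the control flow: a global flag set in the end-of-game handler, and a teardown routine that returns early when the flag is set.

```python
game_over = False  # hypothetical global flag, set once the game has ended


class Bot:
    """Stand-in for an object whose teardown might hang at process exit."""

    def __init__(self) -> None:
        self.cleanup_ran = False

    def on_end(self, is_winner: bool) -> None:
        # Analogue of the onEnd handler: record that the game is over.
        global game_over
        game_over = True

    def teardown(self) -> None:
        # Analogue of the destructor: skip all cleanup once the game is
        # over, since the process is about to exit anyway.
        if game_over:
            return
        self.cleanup_ran = True  # placeholder for the real cleanup work
```

The trade-off is that resources are no longer released explicitly after the game ends; the sketch relies on the operating system reclaiming them when the process exits.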
Jay Scott on :
Jay Scott on :
MicroDK on :
Jay Scott on :
MicroDK on :
krasi0 on :