AIIDE 2021 - a questionable game result and fixing it
Working through the learning files of various bots, I ran into a strange discrepancy: One game between Dragon and McRave which both bots recorded as a loss. In the detailed results, the game ID is 4387 from round 97. The official result has Dragon winning after McRave timed out.
What really happened? I watched the replays recorded by both bots. On the surface, both replays of the game looked the same. McRave destroyed all of Dragon’s buildings, then as usual the replays continued a few more moments. According to OpenBW, Dragon’s replay ended at 22:10 and McRave’s replay at 22:09 (according to the official results, the game ended at 21:57). It’s reflected in the file sizes; Dragon has 1,553,196 bytes while McRave has 1,552,818 bytes (in other cases I checked, differences between replay sizes for the same game are less than 10 bytes). Here is a view at the end of Dragon’s replay, followed by the end of McRave’s. Notice that the valkyrie has moved a little farther in Dragon’s version, and Dragon shows supply 0 for both bots at the end (supporting that they both lost).
On the face of it, Dragon lost because all its buildings were destroyed, then in the brief runout of the game before Broodwar stopped, McRave lost by timing out (the tournament manager said it timed out, plus it recorded its own game result so it didn’t crash). That’s how they both believed they lost. The tournament manager, I have to assume, didn’t expect that situation and took the timeout as definitive, recording that Dragon won.
I wrote to Dave Churchill about it and got this answer:
I don’t have any time to work on this right now and it seems like a pretty small edge case, so if you could post about it and crowd source the fix that would be ideal. I’ll accept any pull request that makes the tournament better!
It’s not critical, it’s a rare case that is not even close to affecting the tournament finishing order. But it would still be nice to fix it.
The first job may be to read code and/or run experiments to figure out more exactly what actually happened. Regardless of the course of events, there are two issues to solve. 1. What should the game result be? 2. How can we tell both bots the correct result?
1. the game result
It was a misplay. Both bots messed up fatally. Don’t count it as a win for either, but skip the game.
McRave won. All terran buildings were destroyed, and that’s the winning condition, right? Never mind that McRave had trouble with time later. Why is there a later at all?
Dragon won. The tournament needs to control how much time bots take, no matter when they take it, as a matter of efficiency and fairness. Therefore the game is not over until Broodwar stops it.
Reasonable people can disagree. I don’t think the answer matters much. What’s more important is to make sure that the actual game, the tournament manager, and the bots all agree on the result as much as possible. I think that issue 1 and issue 2 are interrelated, and should be answered together, not separately.
2. notifying the bots
I don’t understand the technical details of how bots are notified that they won or lost much beyond “somebody calls onEnd()”. But I have poked at it a bit. Looking at it from the tournament end, the software includes a Java tournament manager, a C++ tournament module that is part of BWAPI, and then the bots and game itself.
I think the outline is this: When a game completes normally, there is a short runout phase, and then the tournament module notifies both bots of the result, all good. If a bot times out, the tournament module kicks it out of the game immediately and notifies it of its loss. Then Broodwar realizes there is only one player left, calls the game over, and the remaining player wins. Usually good. In this case, I think Dragon lost and the game entered the runout phase. Then McRave timed out, was kicked out and notified of its loss. Then, when Broodwar ended the game slightly later (giving Dragon the longer replay), for whatever reason the tournament module told Dragon it had lost too. Meanwhile, the tournament manager took the timeout as definitive and recorded that Dragon won. Is my thinking correct? I think it’s close, but I may have details wrong.
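To make the suspected sequence concrete, here is a toy model of it in Python. It is only a sketch of my hypothesis, not the tournament software: the Bot class, the simulate function, and the frame numbers are all made up for illustration, and it only covers a timeout that happens during the runout after Dragon was eliminated.

```python
from dataclasses import dataclass, field
from typing import List, Optional, Tuple


@dataclass
class Bot:
    """Hypothetical stand-in for a bot; it only remembers what it was told."""
    name: str
    results: List[str] = field(default_factory=list)

    def on_end(self, is_winner: bool) -> None:
        self.results.append("win" if is_winner else "loss")


def simulate(eliminated_frame: int,
             timeout_frame: Optional[int],
             runout_frames: int) -> Tuple[List[str], List[str], str]:
    """Replay the hypothesized order of events (frame numbers are invented)."""
    dragon, mcrave = Bot("Dragon"), Bot("McRave")
    game_end = eliminated_frame + runout_frames

    # Did McRave time out during the runout, after Dragon lost its buildings?
    timed_out = (timeout_frame is not None
                 and eliminated_frame <= timeout_frame <= game_end)

    if timed_out:
        # The tournament module kicks McRave and tells it that it lost.
        mcrave.on_end(False)
        manager_result = "Dragon wins (McRave timeout)"
    else:
        manager_result = "McRave wins"

    # Broodwar ends the game slightly later; Dragon has no buildings left.
    dragon.on_end(False)
    if not timed_out:
        mcrave.on_end(True)

    return dragon.results, mcrave.results, manager_result
```

With a timeout inside the runout, for example simulate(1000, 1005, 10), both bots are told they lost while the manager records a Dragon win, which matches what the learning files and the official results show.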
The goal is to find a fix so that one of the game results of issue 1 is decided on and carried through consistently. It would be nice if the fix only affected the Java tournament manager, but I don’t know if that’s possible.
Comments
Quatari on :
I haven't looked at the replay files but it sounds like your analysis and hypothesis are correct based on what I know about the internal workings of the C++ Tournament Module and the Java tournament manager. The main problem is that in some situations, one or both instances might update their state file again during the small number of frames after onEnd() without there being enough information in the state files to be able to distinguish whether onEnd() occurred before or after a timeout. Dave's source code for the C++ Tournament Module is in https://github.com/davechurchill/StarcraftAITournamentManager/tree/master/src/tournamentmodule. After the game ends for whatever reason, Dave's Java tournament manager parses the state files and analyzes the various fields in the files to decide who won. I'm not certain, but I think the logic that treats a timed-out bot as the loser is at https://github.com/davechurchill/StarcraftAITournamentManager/blob/73aa5d07c5d3974a04fe4609cad2e7035f6fb4b4/src/objects/GameResult.java#L157
I noticed potential problems like the one you describe (and also a similar problem involving an inconsistency between a hard-coded number 85714 and the value of a variable) and fixed them in https://github.com/chriscoxe/bwapi-tm. I used Dave's exact source code as a base, and there haven't been any changes to Dave's since then that I haven't also applied to my repo. SCHNAIL is using my repo as far as I know, as it has some minor features to enforce rules for bot-vs-human games. It also has some other minor enhancements/fixes, e.g. the INI file is more configurable, and it can update the state file based on time (e.g. if it has been more than M milliseconds since the last update) and/or every N frames rather than exactly every 360 frames, which can help when arbitrating/investigating/debugging games with slow bot(s).
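The "every N frames and/or more than M milliseconds" update rule can be sketched as a small predicate. This is an illustration in Python only, not code from either repo: the function name, parameter names, and defaults are assumptions (the 360 default just mirrors the fixed interval mentioned above).

```python
from typing import Optional


def should_update(frame: int, last_frame: int,
                  now_ms: int, last_ms: int,
                  every_n_frames: int = 360,
                  max_ms: Optional[int] = None) -> bool:
    """Decide whether the state file is due for a rewrite (hypothetical rule)."""
    # Frame-based trigger: at least N frames since the last update.
    if frame - last_frame >= every_n_frames:
        return True
    # Optional time-based trigger: more than M milliseconds since the last
    # update, which helps when a slow bot makes frames take a long time.
    if max_ms is not None and now_ms - last_ms > max_ms:
        return True
    return False
```

With a slow bot, the time-based trigger fires long before 360 frames have passed, so the state file stays reasonably fresh for whoever is investigating the game.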
In my repo, it uses guard checks to ensure that once it has updated the state file in onEnd(), it won't update the state file again. That is one way of solving the main problem, and it's a simple change that could be applied to Dave's source code. It doesn't solve the other similar problems/improvements like the 85714 bug or also updating the state file based on time.
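The guard-check idea can be sketched in a few lines. The real Tournament Module is C++; this Python version is only an illustration, and the class, method, and field names are hypothetical rather than taken from either repo.

```python
class StateFileWriter:
    """Hypothetical state writer that freezes once onEnd() has been recorded."""

    def __init__(self) -> None:
        self.written = []       # stands in for the state file on disk
        self.finalized = False  # the guard flag

    def update(self, frame: int, **fields) -> bool:
        # Guard check: after onEnd() has written its record, ignore updates.
        if self.finalized:
            return False
        self.written.append({"frame": frame, **fields})
        return True

    def on_end(self, frame: int, is_winner: bool) -> bool:
        # Write the final record once, then lock the file against changes.
        wrote = self.update(frame, game_ended=True, is_winner=is_winner)
        self.finalized = True
        return wrote
```

Once on_end has run, a later update such as a timeout detected during the runout returns False and leaves the recorded result untouched.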
If guard checks are added, the slight downside is that if a bot times out after onEnd(), it won't be recorded in the state file, which might make it harder to notice/detect bots that behave badly after onEnd(). I suppose some new field(s) could be added to the state file such as frameCountOnEnd to record this (my repo doesn't do this currently), and change the logic to forcefully update the state file whenever the various kinds of timeout occur (my repo does this), even after onEnd() (my repo doesn't do this), and record the frame number of the various timeouts if they occur (my repo does this).
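If the state file did record both the frame of onEnd() and the frame of any timeout, the reader of the file could tell the two cases apart. A minimal sketch in Python, assuming hypothetical field names (timeout_frame, frame_count_on_end) and assuming that a timeout on the same frame as onEnd() counts as happening during the game; it only classifies the timeout and leaves the decision about the winner to the tournament manager.

```python
from typing import Optional


def classify_timeout(timeout_frame: Optional[int],
                     frame_count_on_end: Optional[int]) -> str:
    """Classify a timeout relative to onEnd() (field names are hypothetical)."""
    if timeout_frame is None:
        return "none"
    # A timeout strictly after onEnd() happened in the runout, when the
    # game had already been decided.
    if frame_count_on_end is not None and timeout_frame > frame_count_on_end:
        return "after_end"
    # Otherwise the timeout came first (or onEnd() was never recorded).
    return "during_game"
```

A bot that misbehaves after onEnd() would then show up as "after_end" instead of being either lost or mistaken for an ordinary timeout.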
AIIDE/CoG and other competition organizers are welcome to use/modify my repo - it's MIT license. However, the fields in the state file are a bit different to Dave's (mostly just more fields than Dave) so the Java tournament manager would need to be updated to parse the different fields. An explanation of the fields and the logic I propose for how to determine the winner is in https://github.com/chriscoxe/bwapi-tm/blob/master/BWAPI_440/README.md. It's a bit convoluted, but that is mainly because the state file isn't updated every frame, which leaves some uncertainty. The INI file is also different (but Dave's INI file should still work). I wrote some instructions for Sonko for how to incorporate my Tournament Module into SCHNAIL. If people want, maybe I could do a similar thing for AIIDE/CoG (as they are bot-vs-bot, not bot-vs-human). Or maybe I could help merge fixes into Dave's repo. Regardless, it's a non-trivial amount of work (and tricky to test above and beyond checking nothing obvious broke). I don't know whether it's really worth it as there seem to be so few game results affected by problems like this. I don't know how much time I want to spend on it, if any.
Quatari on :
If the Tournament Module kicks its bot because it considers it to have timed out, it does this by calling leaveGame() on behalf of its bot, then (the same frame or the next frame? not sure) BWAPI calls that bot's onEnd() to notify it that it lost, then that bot shuts down (i.e. no more onFrame()/onEnd() etc callbacks are called for that bot). The other bot is notified that its opponent has left the game (via onPlayerLeft() callback) and its onEnd() notifies it that it is the winner, then it has a few more frames of callbacks before the game engine finally "ends" for that game.
Jay Scott on :
Jay Scott on :