timeout issues
Bruce @ Stardust commented on yesterday’s post. The indented paragraphs are quoted from the comment.
There is some AIIDE news from Discord that relates a bit to your last point about timeouts: PurpleWave has withdrawn from the tournament as it somewhat mysteriously was timing out in most of its games. I say mysteriously since Dan has invested a huge amount of effort into making PurpleWave run asynchronously, which should make timeouts impossible.
Executive summary: That sucks. Purple Dan Gant dug deep into low-level particulars that bot authors really should not have to know, and yet it wasn’t enough.
the problem
Last year when the BWAPI client timing bug was being investigated, some other issues were discovered, like problems relating to the timer not being high-enough resolution and problems with the bot process being pre-empted by the OS and therefore appearing to have spikes that were actually nothing to do with the bot.
BWAPI’s timer. I read the source and saw that BWAPI 4.4.0 times bots using the real-time counter GetTickCount(). The documentation that I just linked says that the timer’s resolution is “typically in the range of 10 milliseconds to 16 milliseconds.” That’s very crude for measuring that an interval does not exceed 55 milliseconds. A measurement “it took 55ms” means “it probably sort of maybe took between 40ms and 70ms, though it depends on your system.” One solution would be to use a high-resolution timer in a new BWAPI version. That’s how Steamhammer times itself, with code inherited from UAlbertaBot. Another solution might be to find a way to time accurately from outside BWAPI, somehow accounting for overheads.
BWAPI reports the time through a call BWAPI::Broodwar->getLastEventTime(). A comment in ExampleTournamentModule.cpp explains a workaround in the code to cope with peculiarities that are hard to understand. It’s a code smell, as the authors are well aware, or the comment would not be there. I don’t want to try to figure out if and when the code works as intended.
Both these points appear in the BWAPI issue “getLastEventTime() has different behavior for client/module bots”, linked by Dan in another comment. Confusion among BWAPI developers in the comment thread shows how hard it is to understand!
Being pre-empted. As I understand it, in the tournament the timer is provided by a multitasking virtual machine which itself runs under a multitasking operating system. Looks like ample opportunity for slippage in every aspect of timing. I don’t know the solution for that. Is it possible to measure something like CPU time + I/O time instead of real time? Surely every operating system keeps track. Would it work better, even when running under a VM that might itself be pre-empted by the host OS? I can think of other potential problems, but that’s a good start!
Experiments with timers in one environment might not tell us about timers in another environment. And yet if we want to hold bots to time limits, then bots need a reliable way to measure their time usage.
a proposed solution
All of this has got me wondering if we should change the approach to timeouts. I think the 1- and 10-second limits are fine, but perhaps the 55ms rule should be an average over the entire game instead of a frame count limit. I’m a bit worried that the current rules will result in more unfair disqualifications or force more bot authors to spend a lot of time working around single-frame spikes, both of which are bad for our already-quite-small community.
That worries me too, and I like your suggestion, especially for tournaments, because tournaments care more about total time needed than time per frame. (For playing against humans, or for streaming, consistent speed counts.) My first thought is that if there is a mean frame time limit, then the limit should be lower than 55ms, perhaps 42ms. Averages are easier to keep low than occasional peaks are. Maybe histograms of frame time for a bunch of bots would help us understand what is best. I’m imagining that the tournament would allow a startup transient, then keep track of the frame count and total frame time, and verify the average periodically, perhaps once per second. Fail and earn an immediate loss.
Dan suggested a rolling average (aka moving average) as a possible alternative. That’s more complicated to implement, but not by much.
There are other averages than the mean. The mean has the advantage of simplicity, and the advantage that the total time allowed is proportional to the length of the game. I think the mean is the right choice. But if the goal is to limit spikes above 55ms (or whatever), then we could choose an averaging function that penalizes those more. Choose appropriately, and the 1-second and 10-second rules could be eliminated because the averaging function takes over their roles.
real-time systems
I favor making life easy for bot authors, but there’s only so easy it can be made.
Stepping back for a bigger picture, a BWAPI bot is a complex real-time system. If the bot does little work per frame, it is easy to hold it to its real-time promises, no matter the details of the promises. Don’t worry, just run it, it’ll be fine (Steamhammer’s approach so far). If it does a lot of work per frame and risks breaking its promises, then in general it has to decide what work to skip to save time. It needs some way, preferably a smart way, to divide its work into pieces and to choose which pieces to drop or delay (PurpleWave’s approach). It’s much harder. The difficulty is intrinsic to real-time systems: If you want to play as well as possible, and playing your best may take more time than you have, then the bot needs a baked-in system to cut corners.
I can imagine that somebody might provide a real-time framework for bots, but even then not everybody would or should use it. With more to learn, starting a bot would be harder. Maybe it would be good to have a framework with optional real-time features.
I remember BeeBot, interesting but eventually disabled for being too slow. I can at least offer advice for authors whose bots are slow, or in danger of becoming slow. Many of these bots, I think, are by less-experienced programmers who haven’t yet mastered the art of efficient algorithms and structuring their code to avoid unnecessary work. Over-optimization that obfuscates code is an anti-skill for long-term development, but clear and efficient structure is good. Skip computations that you don’t need, calculate on demand data that you may or may not use, cache data that you may reuse, tolerate some out-of-date data if it doesn’t need to be the latest—all easy ideas, but not so easy to become expert at. And that means that the expertise is valuable.
A little more for those who do have the experience. If you’re not familiar with real-time systems, you may not realize: Code with a predictable runtime is often better than fast but unpredictable code. If you know how long it will take, then you can schedule it to safely meet your real-time promises. If it’s faster on average but occasionally takes longer, you may risk breaking your promises. Better yet is code where you decide how long it takes: See anytime algorithms, which offer some answer whenever you stop them, and a better answer if you let them run longer. Many search algorithms have the anytime property.