breakthrough game programs
In my observation, the progress of game-playing programs in a given game shows occasional sudden breakthroughs to higher performance and long periods of slow improvement while programmers learn to better exploit the previous breakthrough. Historically, most games have 1 breakthrough idea whose refinement, as computers grow faster, eventually leads to superhuman skill.
A breakthrough is always based on a simple idea. Maybe it’s my power of oversimplification, but that’s how it seems to me. A breakthrough program is always complex, though: A new idea is not accepted as a breakthrough unless it performs, and if you come up with a potential breakthrough idea then you have to do the detail work to bring it to the point where it can actually break through. The detail work can include inventing more new ideas to support the breakthrough.
chess: Chess 4.0, Slate and Atkin, 1973
the simple idea: full-width search
By 1973, the alpha-beta algorithm was universal. Chess programmers generally believed that, to search deeply enough to play well, programs had to be selective about which moves they searched. They wrote complicated code to select good candidate moves and prune bad ones. Slate and Atkin observed that the pruning code was slow and introduced a lot of play errors when good moves were mistakenly pruned—how are you supposed to know whether a move is good before you search it? Their new version, Chess 4.0, searched all moves as deeply as their previous version had searched selected moves, and played much better. To exploit the breakthrough, they used efficient bit-board data structures and added complex new ideas of iterative deepening search, a transposition table to avoid re-searching known positions and to re-use move ordering information from past search iterations, and a quiescence search of forcing moves such as captures. In the 1990s the null move heuristic became popular and chess programs started to become more selective again—they try to search bad moves less rather than not at all. Today’s superhuman chess programs are highly selective in their searches. I don’t see it as a repudiation of the original breakthrough idea, but as a refinement: Here is how you tune it; you still search every move, some deeply and some shallowly.
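To make the contrast concrete, here is a minimal sketch of full-width alpha-beta with iterative deepening. The position interface (legal_moves, make, unmake, evaluate, is_terminal) is hypothetical, and a real program adds the transposition table, move ordering, and quiescence search described above.

```python
# Minimal sketch of full-width alpha-beta search (negamax form).
# No forward pruning: every legal move is searched to the same depth.

INFINITY = 10**9

def alpha_beta(pos, depth, alpha, beta):
    if depth == 0 or pos.is_terminal():
        return pos.evaluate()              # static evaluation at the horizon
    best = -INFINITY
    for move in pos.legal_moves():         # full width: try ALL moves
        pos.make(move)
        score = -alpha_beta(pos, depth - 1, -beta, -alpha)
        pos.unmake(move)
        best = max(best, score)
        alpha = max(alpha, score)
        if alpha >= beta:                  # alpha-beta cutoff: sound pruning,
            break                          # not the risky selective kind
    return best

def iterative_deepening(pos, max_depth):
    # Search depth 1, then 2, then 3, ... so a best move is always ready;
    # real programs also re-use earlier iterations to order moves.
    best_move = None
    for depth in range(1, max_depth + 1):
        best_score = -INFINITY
        for move in pos.legal_moves():
            pos.make(move)
            score = -alpha_beta(pos, depth - 1, -INFINITY, INFINITY)
            pos.unmake(move)
            if score > best_score:
                best_score, best_move = score, move
    return best_move
```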
backgammon: TD-Gammon, Gerald Tesauro, 1992
the simple idea: neural network evaluator trained by self-play
Previous backgammon programs had used hand-coded evaluators. Gerald Tesauro figured out how to train a neural network as the evaluator by playing the program against itself while using temporal difference learning to reduce errors, producing a far more accurate evaluator. Today’s near-perfect backgammon programs use the same idea, with the addition of various kinds of search. AlphaGo also famously adopted this style of evaluator—now you know where the idea came from.
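As a rough illustration of the idea, here is a toy TD(0) training loop. The 198 inputs match TD-Gammon's original encoding, but the one-hidden-layer network and the game interface (features, play, over, result, best_move_by) are simplified stand-ins, not Tesauro's actual code; the real TD-Gammon used TD(λ) with eligibility traces.

```python
# Toy version of the TD-Gammon idea: a small neural network evaluates
# positions, the program plays itself, and each position's value is nudged
# toward the value of the position that followed it (TD(0) update).
import numpy as np

rng = np.random.default_rng(0)
W1 = rng.normal(0.0, 0.1, (40, 198))   # hidden layer weights
W2 = rng.normal(0.0, 0.1, (1, 40))     # output layer weights

def value(x):
    """Estimated probability of winning for feature vector x."""
    h = np.tanh(W1 @ x)
    return 1.0 / (1.0 + np.exp(-(W2 @ h)))     # sigmoid output

def td_update(x, target, lr=0.1):
    """One gradient step moving value(x) toward the TD target."""
    global W1, W2
    h = np.tanh(W1 @ x)
    v = 1.0 / (1.0 + np.exp(-(W2 @ h)))
    delta = (target - v) * v * (1.0 - v)        # error times output gradient
    W2 += lr * delta[:, None] * h[None, :]
    W1 += lr * (W2.T @ delta[:, None]) * (1.0 - h**2)[:, None] * x[None, :]

def self_play_game(game):
    """Play one game against itself, learning as it goes."""
    x = game.features()
    while not game.over():
        game.play(game.best_move_by(value))     # greedy move by current net
        x_next = game.features()
        # terminal positions use the actual result; others use the net's
        # own estimate of the next position as the training target
        target = game.result() if game.over() else value(x_next)
        td_update(x, target)
        x = x_next
```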
othello: Logistello, Michael Buro, 1993
the simple idea: statistical evaluator with a huge number of binary features
An othello program with a fast search and a simple evaluation function can play well. Michael Buro added a highly knowledgeable evaluation function for overwhelming strength, crushing the human champion 6-0 in 1997. The idea behind the evaluator is: Score a large number of positions by search; these scores have to be reasonably accurate. Find a huge number of binary features that you can calculate fast (in othello, millions of possible disk patterns that may appear on the board). Then treat the evaluation features as independent and use straightforward statistical regression to find the value of each feature. It’s important to get the math details right; you can read some of the papers at the link. Suddenly you have an evaluator with a hell of a lot of detailed knowledge! The same idea has been used to produce strong evaluators in many other games, and I don’t know any reason it wouldn’t work in Starcraft.
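A bare-bones sketch of the scheme might look like this. The feature extraction and training data are placeholders for Buro's real disk patterns and search-scored positions, and I use a plain least-squares SGD fit where his papers give the exact regression details.

```python
# Sketch of a Logistello-style evaluator: one weight per binary pattern
# feature, fitted by regression against scores from deep searches.
import numpy as np

NUM_FEATURES = 1_000_000        # real programs index millions of patterns
weights = np.zeros(NUM_FEATURES)

def evaluate(active_features):
    """Sparse evaluation: sum the weights of the features present
    in the position. active_features is a list of feature indices."""
    return weights[active_features].sum()

def fit(training_set, lr=0.01, epochs=10):
    """Least-squares fit by stochastic gradient descent.
    training_set: list of (active_feature_indices, search_score) pairs,
    where search_score came from a reasonably deep, accurate search."""
    for _ in range(epochs):
        for features, score in training_set:
            error = score - evaluate(features)
            weights[features] += lr * error   # only present features move
```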
contract bridge: GIB, Matthew Ginsberg, 1998
the simple idea: draw a statistical sample of the possible game situations
Bridge is a game of partial information: After the deal, you only know your own cards. As play goes on, you gain more information. GIB randomly generates game situations compatible with what it knows and evaluates its choices in the sample situations; whatever is best on average is best in reality, within the statistical margin of error. GIB and other current programs still use the same idea. See computer bridge for more. Starcraft is also a game of partial information, so the idea is directly relevant.
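In pseudocode the idea is only a few lines. Here `deal_consistent_with` and `score_action` are hypothetical stand-ins for GIB's deal generator and its double-dummy solver; the structure is the point.

```python
# Sketch of GIB-style sampling: generate full deals consistent with what we
# know, evaluate each candidate action with full information in every sample,
# and choose the action that is best on average.

def choose_action(known_info, candidate_actions, num_samples=100):
    totals = {action: 0.0 for action in candidate_actions}
    for _ in range(num_samples):
        deal = deal_consistent_with(known_info)   # fill in the hidden cards
        for action in candidate_actions:
            totals[action] += score_action(deal, action)
    # best on average over the samples is best in reality,
    # within the statistical margin of error
    return max(candidate_actions, key=lambda action: totals[action])
```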
go: Crazy Stone, Rémi Coulom, 2006
the simple idea: draw a statistical sample of possible lines of play (Monte Carlo tree search)
(This is the same Rémi Coulom who wrote bayeselo, by the way.) GIB knows its possible lines of play and has to sample from the unknown game situations. Go programs know the game situation but, because there are a huge number of possible moves at any time, can only sample from the possible lines of play. By now, this family of search algorithms is fairly well developed; there have been repeated rounds of improvement as new refinements were invented. The Starcraft bot MaasCraft uses a variation, and the latest versions of LetaBot are said to as well.
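For the record, here is the basic UCT form of Monte Carlo tree search, which later refinements build on. The game-state interface (legal_moves, play, is_terminal, random_rollout) is hypothetical.

```python
# Basic Monte Carlo tree search (UCT variant). Each iteration: descend the
# tree by UCB1, expand a leaf, play a random rollout, back the result up.
import math
import random

class Node:
    def __init__(self, state, parent=None):
        self.state, self.parent = state, parent
        self.children, self.visits, self.wins = [], 0, 0.0

def ucb1(child, parent_visits, c=1.4):
    if child.visits == 0:
        return float("inf")            # every move gets sampled at least once
    return (child.wins / child.visits +
            c * math.sqrt(math.log(parent_visits) / child.visits))

def mcts(root, iterations=10_000):
    for _ in range(iterations):
        node = root
        while node.children:           # selection: descend by UCB1
            node = max(node.children, key=lambda ch: ucb1(ch, node.visits))
        if not node.state.is_terminal():   # expansion: grow the tree one layer
            node.children = [Node(node.state.play(m), node)
                             for m in node.state.legal_moves()]
            node = random.choice(node.children)
        # simulation: random playout; 1.0 means the player who moved
        # into node.state went on to win
        result = node.state.random_rollout()
        while node is not None:        # backpropagation: flip result per ply
            node.visits += 1
            node.wins += result
            result = 1.0 - result
            node = node.parent
    return max(root.children, key=lambda ch: ch.visits)  # most-visited move
```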
go: AlphaGo, DeepMind, 2015
the simple idea: use deep learning for search control
Go is such a difficult game that it took 2 breakthroughs to catch up to human performance. AlphaGo keeps the idea of Monte Carlo tree search from Crazy Stone and other programs and adds search control by deep learning; that is its main new idea, in my view. AlphaGo also includes the learning by self-play idea of TD-Gammon and, with deep learning, makes it more successful than past attempts in go. I think everybody has noticed by now that deep learning is highly promising for Starcraft, too.
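As a sketch of the search-control idea only: the policy network supplies a prior probability for each move, and a PUCT-style selection rule trusts the priors until visit counts accumulate. The names here (policy_net, add_child, value_sum) are simplified stand-ins, not DeepMind's published interfaces.

```python
# Sketch of deep learning as search control: the network's move probabilities
# become priors that bias which branches MCTS explores first.
import math

def puct_score(child, parent_visits, c_puct=1.0):
    """Selection rule: average value plus prior-weighted exploration bonus."""
    q = child.value_sum / child.visits if child.visits else 0.0
    u = c_puct * child.prior * math.sqrt(parent_visits) / (1 + child.visits)
    return q + u

def expand(node, policy_net):
    """Ask the policy network for move probabilities, store them as priors."""
    for move, prob in policy_net(node.state).items():   # move -> probability
        child = node.add_child(move)
        child.prior = prob

def select_child(node):
    """Descend toward the move the search currently likes best."""
    return max(node.children, key=lambda ch: puct_score(ch, node.visits))
```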
Search and knowledge are the two ways to get stuff done, and here I strip new ideas down to their essence, so it should be no surprise that the breakthroughs fall more or less neatly into two categories. The search breakthroughs are in Chess 4.0, GIB, and Crazy Stone. The knowledge breakthroughs are in TD-Gammon, Logistello, and AlphaGo. AlphaGo does not categorize so neatly, though; it is both a knowledge and a search idea. And that is as it should be, if you ask me. Search and knowledge belong together.
Right now, Starcraft programs are at the stage where the greater part of most programs is a mass of ad hoc hand-crafted heuristic kludges (it’s redundant; the 4 terms all mean the same). We’re still at the stage where chess programs were before Chess 4.0 and backgammon programs before TD-Gammon. We are still looking for our breakthrough. People are casting around for the weak point in the wall of difficulty. Now is the time to learn from past breakthroughs!
Personally, I’m expecting a knowledge breakthrough more than a search breakthrough. But it’s only my guess, and I could easily be wrong. The skills we want are creativity and close analysis; whoever has enough of both will produce the Starcraft breakthrough.
Comments
krasi0 on :
Could you expand on the following statement: "I think everybody has noticed by now that deep learning is highly promising for Starcraft, too."? I must have missed something, but has there been any evidence to support that particular claim? I am not saying that DL wouldn't work here. It's an honest question.
I believe that the time for a breakthrough will come once we hit a *hard* wall/ceiling on the way up. Still, many bot authors are able to register gradual improvements in the strength of play of their bots (just look at Iron, tscmoo or Bereaver, for example), so no one really feels hard pressed to use their creativity and think outside the box for the sake of long-term success.
Just my 2 cents