Rich Sutton and The Bitter Lesson
One of the Undermind podcasts brought up Rich Sutton’s recent essay The Bitter Lesson. The basic point of the essay is true: “general methods that leverage computation are ultimately the most effective.” (That is why, though I don’t post about it, behind the scenes I am implementing a machine learning method for Steamhammer.) Nevertheless, I have a complaint about what he says.
Rich Sutton (home page) is a big name in machine learning, and his opinions are worth attention. He is co-author of the major textbook Reinforcement Learning: An Introduction, which I recommend. On his home page you can find more short essays on AI topics; I particularly like Verification, The Key to AI. He’s a smart guy who thinks things through carefully.
Reading The Bitter Lesson, I find that the first paragraph makes perfect sense. I might quibble with minor details, but I have no substantial objection; I agree with it, and it’s correct or nearly so. Then he goes into examples, and there I feel that he misrepresents history in a cartoonish way. That’s not how the bitter lesson was or is learned.
In the computer chess example, the claim that “this was looked upon with dismay by the majority of computer-chess researchers” is false. The Chess 4.x series of programs by Slate and Atkin showed in the 1970s that massive search was more successful than any other method tried. Long before Deep Blue defeated Kasparov, work on knowledge-based approaches was a minor activity; I have a pile of International Computer Chess Association journals to prove it.
The example of computer go is also presented misleadingly. I subscribe to the computer go mailing list and have listened in on the conversations since before the deep learning revolution in computer go. In the old days, people were writing complex programs with too many varieties of hand-coded knowledge, not because they were sure it was the right way—I think everyone agreed that it was not very successful—but because nobody had found a better one. There were experiments with neural networks that intimated that a breakthrough might be possible, but until AlphaGo, the breakthrough was not achieved. The big lesson was not “use a general method of massive learning”; it was “here is the recipe for massive learning.” Finding the recipe was extremely difficult; people had been working toward it through many years of slow progress, and victory did not arrive until there were technical breakthroughs in deep learning and DeepMind brought its great resources to bear. Knowing that you should use massive search and/or massive learning is only a tiny part of the battle; figuring out how to use them is the real fight.
That is exactly where we stand in Brood War. We know, and most agree, that search methods (as used in combat simulation) and learning methods (like deep learning) have the potential to be successful. And we have intimations of a breakthrough: bots that have achieved some of that success, like LastOrder with its macro model and CherryPi in TorchCraftAI with its build order switcher. But we don’t know the full recipe yet, and finding it is difficult work.
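To make concrete what “search methods (as used in combat simulation)” means, here is a minimal sketch of the simulate-then-decide pattern bots use to choose between fighting and retreating. Everything in it is made up for illustration (the SimUnit struct, the even damage spread, the fixed frame horizon); it is not Steamhammer’s code or anyone else’s. Real simulators such as SparCraft or FAP model attack range, cooldowns, armor, and target selection, and the distance between this toy and those is exactly the “figuring out how to use it” part.

```cpp
// Toy combat simulation sketch, for illustration only.
// Each side is a list of units with hit points, damage per frame, and value.
// We step the fight forward and compare the surviving value of each side.
#include <vector>
#include <cstddef>

struct SimUnit
{
    double hp;      // remaining hit points
    double dpf;     // damage dealt per frame while alive
    double value;   // resource value, used to score the outcome
};

// Total damage per frame from the units still alive.
double groupDamage(const std::vector<SimUnit> & group)
{
    double total = 0.0;
    for (const SimUnit & u : group)
    {
        if (u.hp > 0.0) total += u.dpf;
    }
    return total;
}

// Total value of the units still alive.
double groupValue(const std::vector<SimUnit> & group)
{
    double total = 0.0;
    for (const SimUnit & u : group)
    {
        if (u.hp > 0.0) total += u.value;
    }
    return total;
}

// Spread incoming damage evenly over the living units (a crude assumption;
// real simulators pick targets unit by unit).
void applyDamage(std::vector<SimUnit> & group, double damage)
{
    std::size_t living = 0;
    for (const SimUnit & u : group)
    {
        if (u.hp > 0.0) ++living;
    }
    if (living == 0) return;
    const double share = damage / living;
    for (SimUnit & u : group)
    {
        if (u.hp > 0.0) u.hp -= share;
    }
}

// Run the fight forward for a fixed number of frames and return
// (our surviving value) - (their surviving value).
// Positive suggests fighting, negative suggests retreating.
double simulateFight(std::vector<SimUnit> mine, std::vector<SimUnit> theirs, int frames)
{
    for (int f = 0; f < frames; ++f)
    {
        const double myDamage = groupDamage(mine);
        const double theirDamage = groupDamage(theirs);
        if (myDamage <= 0.0 && theirDamage <= 0.0) break;  // nobody left who can fight
        applyDamage(theirs, myDamage);
        applyDamage(mine, theirDamage);
    }
    return groupValue(mine) - groupValue(theirs);
}
```

A bot would fill the two vectors from the units it can currently see, call simulateFight over a short horizon, and retreat when the score is clearly negative. The principle is general and obvious; the recipe—what to model, how to score, when to trust the answer—is where the years of work go.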