Rich Sutton and The Bitter Lesson
In one of the Undermind podcasts, Rich Sutton's recent essay The Bitter Lesson came up. The basic point of the essay is true: “general methods that leverage computation are ultimately the most effective.” (That is why, though I don’t post about it, I am implementing a machine learning method for Steamhammer behind the scenes.) Nevertheless, I have a complaint about what he says.
Rich Sutton (home page) is a big name in machine learning, and his opinions are worth attention. He is co-author of the major textbook Reinforcement Learning: An Introduction, which I recommend. On his home page you can find more short essays on AI topics; I particularly like Verification, The Key to AI. He’s a smart guy who thinks things through carefully.
Reading The Bitter Lesson, the first paragraph makes perfect sense to me. I might quibble with minor details, but I have no substantial objection; I agree with it, it’s correct or nearly so. Then he goes into examples, and there I feel that he misrepresents history in a cartoonish way. That’s not how the bitter lesson was or is learned.
In the computer chess example, “this was looked upon with dismay by the majority of computer-chess researchers” is false. The Chess 4.x series of programs by Slate and Atkin showed in the 1970s that massive search was more successful than any other method tried. Long before Deep Blue defeated Kasparov, work on knowledge-based approaches was a minor activity; I have a pile of International Computer Chess Association journals to prove it.
The example of computer go is also presented misleadingly. I subscribe to the computer go mailing list and have listened in on the conversations since before the deep learning revolution in computer go. In the old days, people were writing complex programs with too many varieties of hand-coded knowledge, not because they were sure it was the right way—I think everyone agreed that it was not very successful—but because nobody had found a better one. There were experiments with neural networks that intimated that a breakthrough might be possible, but until AlphaGo, the breakthrough was not achieved. The big lesson was not “use a general method of massive learning,” it was “here is the recipe for massive learning.” Finding the recipe was extremely difficult; people had been working on it for many years of slow progress, and victory did not arrive until there were technical breakthroughs in deep learning and DeepMind brought its great resources to bear. Knowing that you should use massive search and/or massive learning is only a tiny part of the battle; figuring out how to use them is the real fight.
That is exactly where we stand in Brood War. We know, and most agree, that search methods (as used in combat simulation) and learning methods (like deep learning) have the potential to be successful. And we have intimations of a breakthrough, bots that have achieved some of that success, like LastOrder with its macro model and CherryPi in TorchCraft with its build order switcher. But we don’t know the full recipe yet, and finding it is difficult work.
Comments
Marian on :
The interesting problems in my opinion are:
- estimating army strength quickly and decisively, taking all the unit types, special abilities, and terrain into account
- estimating what the opponent is doing and how many units/buildings it has, given limited scouting information (opponent modeling)
- estimating where enemy units might be, given limited scouting information
- army group movement and positioning
- synchronizing multiple groups
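A common quick-and-dirty approach to the first item on that list is a Lanchester-style score. Here is a minimal sketch; the function name and the unit stats are my own illustration with made-up round numbers, not actual Brood War values or code from any bot, and a real estimator would also have to handle abilities, terrain, and unit-type interactions:

```python
# Minimal sketch: a Lanchester-style army strength estimate.
# All names and numbers here are illustrative assumptions.

def army_strength(units):
    """Score an army as the sum of hp * dps over its units.

    Under Lanchester's square law, comparing these sums gives a
    fast, crude prediction of which army would win an open-field
    fight, ignoring terrain, abilities, and damage types.
    """
    return sum(hp * dps for hp, dps in units)

# Hypothetical stats as (hit points, damage per second) tuples.
marines = [(40, 10.0)] * 10   # ten hypothetical marines
zealots = [(160, 16.0)] * 3   # three hypothetical zealots

print(army_strength(marines))  # 4000.0
print(army_strength(zealots))  # 7680.0
```

The appeal of a score like this is speed: it is a single pass over the unit lists, cheap enough to call every frame, which is why bots often use it as a first filter before running a full combat simulation.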
Jay Scott on :
Jay Scott on :
Joseph Huang on :
Tully Elliston on :
jtolmar on :
http://www.staredit.net/topic/17809/0/
Jay Scott on :
Johannes Holzfuß on :
In a game on stream against the new bot Dragon it drew its "I can't see this unit, but I know it used to be there" rectangles for invisible units and then *kept them up to date* while under the fog of war.
The game starts at https://www.twitch.tv/videos/420630546 around the 5 hour mark.
Jay Scott on :
MarcoDBAA on :
Dragon scored an impressive win vs SAIDA, where it exposed SAIDA's goliaths by repeatedly luring them into tank fire with its wraiths. M&Ms did well vs PW's carriers and storm (expected for a CPi descendant) too.
However, it showed a clear weakness in dealing with harassment: overreacting to just a few units, getting stuck trying to chase tscmoo's mutas, and seeming to lose any sort of real initiative (game vs Proxy).
Bytekeeper on :
The bot was disabled immediately.
Mistakes happen.
MicroDK on :
On Discord, the author was told to change the config file and upload again. The author was testing something locally but forgot to set the setting back to false.
Johannes Holzfuß on :
I was unaware of that bug and thought that the TM would prevent an accidentally enabled debug flag ("Tournament mode engaged"), so I assumed foul play was at work. I guess I shouldn't have.
Sorry. :/
Joseph Huang on :