Was Deep Blue intelligent? The computer was victorious in its 1997 rematch against World Chess Champion Garry Kasparov. Here I consider the value of brute force methods for artificial intelligence as a whole.
At the dawn of artificial intelligence, in the 1950’s and 1960’s, brute force methods were seen as a dead end. Both experience and theory backed up this early opinion: human experts search very selectively, and exhaustive search quickly hits the combinatorial wall. In fact, brute force methods were seen as so obviously bad that they were rarely tried, even when later experience found them to work well.
By the time I first wrote this page, in the 1990’s, opinions had adjusted because brute force methods had seen some successes, such as in chess programs. It didn’t hurt that computers were much faster, of course. My read on the general opinion by then is that brute force was considered appropriate in some domain-specific cases, but not broadly useful. At least people were willing to try ideas when they came up.
Now, in the 2010’s, we have tasted the fruit of those ideas that came up, such as Monte Carlo tree search. My read on the general opinion is that brute force is finally accepted as broadly applicable, but that the recognized good applications are still somewhat stereotyped. Today, with large memories and massive parallelism available, there is also an appreciation that it can be useful to bring to bear a large amount of data quickly (web search is an example—it relies on machine learning). I think there is still a lot of room for new ideas that will lead to further breakthroughs.
By a brute force method, I mean, rather vaguely, a technique that emphasizes calculation over knowledge. Deep Blue included a lot of chess knowledge, and the knowledge was essential to its performance, but its search tree was huge compared to its knowledge base.
There is no easy way to transplant Deep Blue’s power to other domains—it’s a chess program, and that’s all it is. Nevertheless I’ll argue that brute force methods as a class are important, promising, and underappreciated despite their success.
The first rule of brute force is that the more you have of it, the more you can do with it. “If brute force don’t work, ya ain’t usin’ enough of it.” In 1975, many chess players thought that chess computers would never reach master strength. In 1985, the barrier was grandmaster strength. In 1995, the barrier was the world champion. Every time, it turned out that faster computers—more brute force—broke through the barrier.
How broadly are brute force methods applicable? IBM's Deep Blue web site suggests data mining, molecular dynamics and financial analysis as suitable domains for brute-force attack. Many, many other domains are candidates as well; for example, design domains, like VLSI design, often have appropriate problems. Surprising applications exist too: Matt Ginsberg’s bridge program GIB relies on a brute force method for card play, although knowledge-based methods had usually been considered necessary.
But new applications are less than half the story. Most combinatorial optimization algorithms are brute force algorithms, with just a dash of smarts. For example, the knowledge embodied in a genetic algorithm consists of a few selection and recombination rules—far less knowledge than Deep Blue relies on. A genetic algorithm is a brute force method (or else Deep Blue’s algorithm isn’t!). In the same way, dynamic programming algorithms, constraint satisfaction algorithms, and even the simplex algorithm are brute force methods that take a little knowledge a long way.
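To make the “dash of smarts” concrete, here is a minimal sketch of a genetic algorithm in Python. The one-max fitness function and all the parameters are illustrative choices, not taken from any particular system. Notice how little knowledge it carries: tournament selection, one-point crossover, and bit-flip mutation are the whole of it, and everything else is brute force sampling.

```python
import random

def genetic_algorithm(fitness, length=32, pop_size=50, generations=100,
                      mutation_rate=0.01):
    """Minimal GA: all the 'knowledge' lives in selection, crossover, mutation."""
    pop = [[random.randint(0, 1) for _ in range(length)] for _ in range(pop_size)]
    for _ in range(generations):
        def pick():
            # Selection: two-way tournament.
            a, b = random.sample(pop, 2)
            return a if fitness(a) >= fitness(b) else b
        next_pop = []
        for _ in range(pop_size):
            p1, p2 = pick(), pick()
            cut = random.randrange(1, length)   # one-point crossover
            child = p1[:cut] + p2[cut:]
            for i in range(length):             # rare bit-flip mutation
                if random.random() < mutation_rate:
                    child[i] = 1 - child[i]
            next_pop.append(child)
        pop = next_pop
    return max(pop, key=fitness)

# Toy problem ("one-max"): maximize the number of 1 bits in the string.
best = genetic_algorithm(fitness=sum)
print(sum(best), "ones out of 32")
```

The same loop applies unchanged to any fitness function you plug in; that generality is exactly what taking a little knowledge a long way means.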
In the same way that all programs can be seen as a combination of data structures and algorithms, all problem solving can be seen as a combination of search and knowledge. In making a decision, you either know what to do, or else you must consider the options; knowing and searching are the only possibilities. So search is half of all thinking, and it shouldn’t be surprising that we can get good results by searching well.
This view can be traced back at least as far as Newell and Simon’s General Problem Solver (1963), and it is directly embodied in Soar. It is a weak view, in that by itself it doesn’t say anything about what knowledge to use or how to search, but by the same token it is completely general.
Humans think slowly. We don’t have any choice but to prefer knowledge over search in our reasoning, because we’re too sluggish to search much. Computers don’t have that limitation, so for an automated problem solver we should explore the entire range of the search/knowledge tradeoff. In fact, as long as we have difficulty encoding knowledge so that programs can use it—as long as there is a “knowledge acquisition bottleneck”—in practice we should often prefer brute force methods.
Besides being the fastest chess program, Deep Blue is also thought to have been one of the smartest of its day. Because its evaluation function was implemented in custom hardware, Deep Blue could apply evaluation knowledge in parallel. That meant it could take advantage of more knowledge in each evaluation without slowing down, whereas a software-only chess program becomes slower when knowledge is added.
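As a rough sketch of the tradeoff in Python (the feature functions below are hypothetical stubs, not Deep Blue’s actual terms): a software evaluator pays for each added feature on every node it visits, while evaluation hardware can compute the feature terms simultaneously.

```python
# Hypothetical feature-based evaluation. The feature functions are stubs;
# a real engine would compute material, mobility, king safety, and so on.
def material(pos):    return 0
def mobility(pos):    return 0
def king_safety(pos): return 0

WEIGHTED_FEATURES = [(material, 1.0), (mobility, 0.1), (king_safety, 0.5)]

def evaluate(pos):
    # Software sums the terms one after another, so every feature added here
    # slows down each of the millions of node evaluations. Deep Blue's custom
    # hardware computed its feature terms in parallel, so extra knowledge cost
    # it (almost) no extra time per node.
    return sum(w * f(pos) for f, w in WEIGHTED_FEATURES)
```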
Brute force alone solves nothing. The knowledge is essential; to play at world-class level, Deep Blue needed extensive knowledge of what’s important in many different types of chess positions. Skill is a rectangle whose dimensions are speed and smarts, and increasing the smarts is as important as increasing the speed. In fact, if you already have the speed then increasing the smarts may be the easiest way to advance.
Many problems are knowledge-intensive and can’t be approached at all without lots of knowledge. Natural language understanding is one. The success of expert systems shows that the knowledge-rich, search-poor range of the knowledge/search tradeoff is accessible to machines as well as to humans. An extreme example is Cyc, a large knowledge base created with the hope of duplicating common sense.
Search is half of thinking. It may be the easy half, but that’s not what counts! When you can, take advantage of the fact that it’s the easy half. Keep an eye out for that little bit of knowledge that will make your problem tractable for a brute force method.
If you need knowledge, you should look for ways to be smart quickly. Deep Blue’s big lesson is that you can do best if you can apply knowledge efficiently.
Capsule history of computer chess search: Until 1973 competition programs generally relied on hand-made selective searches, because selective search was seen as obviously necessary. Brute-force search (pruning only with alpha-beta, plus quiescence search) broke through with the famous program Chess 4.0. It rapidly proved successful and soon became the mainstream. In the 1980’s and 1990’s successful selective search heuristics like null-move pruning and singular extensions were found, and by the turn of the century the top chess programs once again had highly selective searches. But the selectivity was not ad hoc as in the 1960’s; it was based on well-understood and carefully-tuned domain-specific heuristics. The heuristics did not change the basic nature of the underlying brute-force search; they refined it to make it work better for playing chess.
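For the curious, the brute-force search that Chess 4.0 made mainstream has the shape below: depth-limited alpha-beta with a quiescence search at the horizon. This is a sketch in Python against a hypothetical position interface (moves(), captures(), apply(), evaluate() are assumptions here, not any real engine’s API); a real program supplies those pieces and layers its selective heuristics on top of this core.

```python
def alphabeta(pos, depth, alpha, beta):
    """Fail-hard negamax alpha-beta; at the horizon, fall into quiescence."""
    if depth == 0:
        return quiesce(pos, alpha, beta)
    moves = pos.moves()
    if not moves:
        return pos.evaluate()    # terminal; a real engine scores mate/stalemate
    for move in moves:
        score = -alphabeta(pos.apply(move), depth - 1, -beta, -alpha)
        if score >= beta:
            return beta          # cutoff: the opponent won't allow this line
        alpha = max(alpha, score)
    return alpha

def quiesce(pos, alpha, beta):
    """Search captures only, until the position is quiet (the horizon fix)."""
    stand_pat = pos.evaluate()
    if stand_pat >= beta:
        return beta
    alpha = max(alpha, stand_pat)
    for move in pos.captures():
        score = -quiesce(pos.apply(move), -beta, -alpha)
        if score >= beta:
            return beta
        alpha = max(alpha, score)
    return alpha
```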
Capsule history of computer go search: Until the decade of the 2000’s competition programs relied on hand-made selective algorithms, because nobody had any better ideas. The lack of ideas was not for lack of looking; the needed breakthrough was genuinely difficult to discover. Finally the 1990’s line of research into Monte Carlo go bore fruit in the next decade in the form of Monte Carlo tree search, and soon the best programs were using brute-force methods that randomly sample the game tree. At first the sampling used little knowledge and the sampled lines of play were “light playouts”, but over time it was found useful to add knowledge at the cost of speed, giving “heavy playouts”. And that’s where we are today; algorithm refinements and knowledge refinements continue to improve go programs.
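And here is the shape of the go breakthrough: a minimal Monte Carlo tree search with UCB1 selection and light (purely random) playouts. The game interface (moves(), apply(), is_terminal(), result()) is assumed, and for brevity every playout is scored from one player’s point of view; a real two-player implementation flips the sign of the reward at alternating levels.

```python
import math, random

class Node:
    def __init__(self, state, parent=None):
        self.state, self.parent = state, parent
        self.children, self.untried = [], list(state.moves())
        self.visits, self.value = 0, 0.0

def uct_select(node, c=1.4):
    # UCB1: balance exploitation (average value) against exploration.
    return max(node.children,
               key=lambda ch: ch.value / ch.visits
                              + c * math.sqrt(math.log(node.visits) / ch.visits))

def light_playout(state):
    # "Light" playout: purely random moves to the end of the game, with no
    # knowledge at all. A "heavy" playout would bias these choices instead.
    while not state.is_terminal():
        state = state.apply(random.choice(list(state.moves())))
    return state.result()

def mcts(root_state, iterations=1000):
    root = Node(root_state)
    for _ in range(iterations):
        node = root
        # 1. Selection: descend by UCB1 while fully expanded.
        while not node.untried and node.children:
            node = uct_select(node)
        # 2. Expansion: add one new child, if any move is untried.
        if node.untried:
            child = Node(node.state.apply(node.untried.pop()), parent=node)
            node.children.append(child)
            node = child
        # 3. Simulation: one random sample of the rest of the game.
        reward = light_playout(node.state)
        # 4. Backpropagation: update statistics back up to the root.
        while node is not None:
            node.visits += 1
            node.value += reward
            node = node.parent
    return max(root.children, key=lambda ch: ch.visits)   # most-sampled move
```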
Search for both games went through a similar progression of three steps, and I imagine that the pattern is general.
1. Ad hoc selective search, whatever can be cobbled together.
2. Brute force with the minimal domain-specific adjustments needed for success.
3. Principled selective search derived by refining the brute force search of the previous step.
My advice is:
Avoid ad hoc search. In other words, skip step 1 as best you can and start with step 2.
Start simple. This is a heuristic for skipping step 1. Occam’s Razor applies: add complexity only as you can prove that it is justified.
Don’t be afraid of a little dirt. A truly simple idea is likely to need some refinement to be successful at all; “minimal domain-specific adjustments” is not the same as no adjustments at all. Success counts more than simplicity. (But do be afraid of a lot of dirt. It will bury you.)
Refine it. High performance comes from refining a method that produces adequate performance. You can’t start great. Look for the right idea, a simple method that does reasonably well. Most simple methods will automatically come with ample scope for refinement.
The right idea is likely to be a brute force search. Search is half of thinking, and you need a simple idea, so some simple kind of brute force search is a good starting point. (But don’t overlook the possibility that the right idea is knowledge-based. Knowledge is the other half.)
99% perspiration. Most of the work is in the refinement stage. The breakthrough right idea is essential, but it is only the starting point.
Chess 4.0
Page on the Chess Programming Wiki about all versions of the historic program.
Cycorp
Cyc is another ambitious project, started by Douglas Lenat, to create a huge knowledge base of “common sense” information. Cycorp is the company created to commercialize the result.
GIB
Matt Ginsberg
Web page for the bridge program. It uses exhaustive search in a surprising way.
GPS, a program that simulates human thought (1963)
Allen Newell and Herbert A. Simon
A summary by J. William Murdock of the original paper from Computers and Thought, edited by E. A. Feigenbaum and J. Feldman, New York: McGraw-Hill. To my knowledge the original paper is not online anywhere.
Kasparov vs. Deep Blue: The Rematch
The official IBM web site.
Soar
This is an ambitious project involving many researchers. “Our intention is for Soar to support all the capabilities required of a general intelligent agent.”