As companies grow, they go through three phases. First, they do everything manually: the owners of a mom-and-pop store personally know their customers, and they order, display, and recommend items accordingly. This is nice, but it doesn't scale. In the second and least happy phase, the company grows large enough that it needs to use computers. In come the programmers, consultants, and database managers, and millions of lines of code get written to automate all the functions of the company that can be automated. Many more people are served, but not as well: decisions are made based on coarse demographic categories, and computer programs are too rigid to match humans' infinite versatility.

The argument from neuroscience

In any case, if we formalize Chomsky's "poverty of the stimulus" argument, we find that it's demonstrably false. In 1969, J. J. Horning proved that probabilistic context-free grammars can be learned from positive examples only, and stronger results have followed. (Context-free grammars are the linguist's bread and butter, and the probabilistic version models how likely each rule is to be used.) Besides, language learning doesn't happen in a vacuum; children get all sorts of cues from their parents and the environment. If we're able to learn language from a few years' worth of examples, it's partly because of the similarity between its structure and the structure of the world. This common structure is what we're interested in, and we know from Horning and others that it suffices.

Replace "god" with "learning algorithm" and "eternal life" with "accurate prediction," and you have the "no free lunch" theorem. Pick your favorite learner. (We'll see many in this book.) For every world where it does better than random guessing, I, the devil's advocate, will deviously construct one where it does worse by the same amount. All I have to do is flip the labels of all unseen instances.
Since the labels of the observed ones agree, there's no way your learner can distinguish between the world and the antiworld. On average over the two, it's as good as random guessing. And therefore, on average over all possible worlds, pairing each world with its antiworld, your learner is equivalent to flipping coins.

In his story "Funes the Memorious," Jorge Luis Borges tells of meeting a youth with perfect memory. This might at first seem like a great fortune, but it is in fact an awful curse. Funes can remember the exact shape of the clouds in the sky at an arbitrary time in the past, but he has trouble understanding that a dog seen from the side at 3:14 p.m. is the same dog seen from the front at 3:15 p.m. His own face in the mirror surprises him every time he sees it. Funes can't generalize; to him, two things are the same only if they look the same down to every last detail. An unrestricted rule learner is like Funes and is equally unable to function. Learning is forgetting the details as much as it is remembering the important parts. Computers are the ultimate idiot savants: they can remember everything with no trouble at all, but that's not what we want them to do.

Suppose a perceptron has two continuous inputs x and y. (In other words, x and y can take on any numeric values, not just 0 and 1.) Then each example can be represented by a point on the plane, and the boundary between positive examples (for which the perceptron outputs 1) and negative ones (output 0) is a straight line:

[Figure pic_9.jpg: a straight line separating positive from negative examples in the plane.]

In The Hitchhiker's Guide to the Galaxy, an alien race builds a massive supercomputer to answer the ultimate question, and after a long time the computer spits out "42." But the computer also points out that the aliens don't know what the question is, so they build an even bigger computer to figure that out.
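The straight-line boundary described above can be made concrete in a few lines of Python. This is a minimal sketch, not the book's code; the training data, learning rate, and epoch count are invented for illustration.

```python
# Minimal perceptron with two continuous inputs x and y (illustrative
# sketch; the data and hyperparameters below are made up). The learned
# weights define the straight line w1*x + w2*y + b = 0: points on one
# side output 1, points on the other side output 0.

def perceptron_output(w1, w2, b, x, y):
    """Fire (output 1) iff the weighted sum clears the threshold."""
    return 1 if w1 * x + w2 * y + b > 0 else 0

def train(examples, epochs=20, rate=0.1):
    """Classic perceptron rule: nudge the weights toward each mistake."""
    w1 = w2 = b = 0.0
    for _ in range(epochs):
        for (x, y), label in examples:
            error = label - perceptron_output(w1, w2, b, x, y)
            w1 += rate * error * x
            w2 += rate * error * y
            b += rate * error
    return w1, w2, b

# Toy data: points above the line y = x are positive, below are negative.
data = [((0.0, 1.0), 1), ((1.0, 2.0), 1), ((1.0, 0.0), 0), ((2.0, 1.0), 0)]
w1, w2, b = train(data)
```

Because the toy data is linearly separable, the update rule is guaranteed to converge on some separating line; with inseparable data it would cycle forever, which is exactly the limitation the perceptron's critics seized on.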
This computer, otherwise known as planet Earth, is unfortunately destroyed to make way for a space freeway minutes before finishing its multimillion-year computation. We can only guess at the question now, but perhaps it was: Which slot machine should you play?

Evolution searches for good structures, and neural learning fills them in: this combination is the easiest of the steps we'll take toward the Master Algorithm. This may come as a surprise to anyone familiar with the never-ending twists and turns of the nature versus nurture controversy, 2,500 years old and still going strong. Seeing life through the eyes of a computer clarifies a lot of things, however. "Nature" for a computer is the program it runs, and "nurture" is the data it gets. The question of which one is more important is clearly absurd; there's no output without both program and data, and it's not like the output is, say, 60 percent caused by the program and 40 percent by the data. That's the kind of linear thinking that a familiarity with machine learning immunizes you against.

The molecular biology of living cells is such a mess that molecular biologists often quip that only people who don't know any of it could believe in intelligent design. The architecture of the brain may well have similar faults (the brain has many constraints that computers don't, like very limited short-term memory), and there's no reason to stay within them. Moreover, we know of many situations where humans seem to consistently do the wrong thing, as Daniel Kahneman illustrates at length in his book Thinking, Fast and Slow.

A learner that uses Bayes' theorem and assumes the effects are independent given the cause is called a Naïve Bayes classifier. That's because, well, that's such a naïve assumption. In reality, having a fever makes having a cough more likely, even if you already know you have the flu, because (for example) it makes you more likely to have a bad flu.
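The flu-and-symptoms setup above fits in a few lines of code. This is a toy sketch with invented probabilities, not figures from the book; the naive step is multiplying the per-symptom likelihoods as if fever and cough were independent once the diagnosis is known.

```python
# Toy Naive Bayes classifier: flu or not, given fever and cough.
# All probabilities below are made up for illustration.

priors = {"flu": 0.1, "no flu": 0.9}          # hypothetical P(cause)
likelihoods = {                               # hypothetical P(symptom | cause)
    "flu":    {"fever": 0.9, "cough": 0.8},
    "no flu": {"fever": 0.2, "cough": 0.3},
}

def posterior(symptoms):
    """Bayes' theorem with the naive independence assumption, normalized."""
    scores = {}
    for cause, prior in priors.items():
        score = prior
        for s in symptoms:
            score *= likelihoods[cause][s]   # the "naive" multiplication
        scores[cause] = score
    total = sum(scores.values())
    return {cause: score / total for cause, score in scores.items()}

print(posterior(["fever", "cough"]))
```

With these invented numbers, seeing both symptoms lifts the flu hypothesis from a 10 percent prior to the more probable of the two diagnoses, even though the independence assumption is, as the text says, false.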
But machine learning is the art of making false assumptions and getting away with it. As the statistician George Box famously put it: "All models are wrong, but some are useful." An oversimplified model that you have enough data to estimate is better than a perfect one that you don't. It's astonishing how simultaneously very wrong and very useful some models can be. The economist Milton Friedman even argued in a highly influential essay that the best theories are the most oversimplified, provided their predictions are accurate, because they explain the most with the least. That seems to me like a bridge too far, but it illustrates that, counter to Einstein's dictum, science often progresses by making things as simple as possible, and then some.

This algorithm is called k-means, and its origins go back to the fifties. It's nice and simple and quite popular, but it has several shortcomings, some of which are easier to solve than others. For one, we need to fix the number of clusters in advance, but in the real world, Robby is always running into new kinds of objects. One option is to let an object start a new cluster if it's too different from the existing ones. Another is to allow clusters to split and merge as we go along. Either way, we probably want the algorithm to include a preference for fewer clusters, lest we wind up with each object as its own cluster (hard to beat if we want clusters to consist of similar objects, but clearly not the goal).

You've reached the final stage of your quest. You knock on the door of the Tower of Support Vectors. A menacing-looking guard opens it, and you suddenly realize that you don't know the password. "Kernel," you blurt out, trying to keep the panic from your voice. The guard bows and steps aside. Regaining your composure, you step in, mentally kicking yourself for your carelessness.
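For reference, the k-means procedure discussed above can be sketched in a bare-bones form. The points and starting centers are invented, and k is fixed at 2 in advance, which is exactly the shortcoming the text points out.

```python
# Bare-bones k-means sketch (illustrative; data and centers are made up).
# It alternates two steps: assign each point to its nearest center, then
# move each center to the mean of the points assigned to it.

def kmeans(points, centers, iterations=10):
    for _ in range(iterations):
        # Assignment step: each point joins the cluster of its nearest center.
        clusters = [[] for _ in centers]
        for p in points:
            nearest = min(
                range(len(centers)),
                key=lambda i: (p[0] - centers[i][0]) ** 2
                              + (p[1] - centers[i][1]) ** 2,
            )
            clusters[nearest].append(p)
        # Update step: each center moves to the mean of its cluster
        # (an empty cluster keeps its old center).
        centers = [
            (sum(p[0] for p in c) / len(c), sum(p[1] for p in c) / len(c))
            if c else centers[i]
            for i, c in enumerate(clusters)
        ]
    return centers

# Two obvious blobs, one near the origin and one near (9.5, 9.5).
points = [(0, 0), (0, 1), (1, 0), (9, 9), (9, 10), (10, 9)]
print(kmeans(points, centers=[(0, 0), (10, 10)]))
```

Note that the result depends on the initial centers: a bad start can leave k-means stuck in a poor local optimum, which is why practical implementations restart it from several random initializations.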
The entire ground floor of the tower is taken up by a lavishly appointed circular chamber, with what seems to be a marble representation of an SVM occupying pride of place at the center. As you walk around it, you notice a door on the far side. It must lead to the central tower, the Tower of the Master Algorithm. The door seems unguarded. You decide to take a shortcut. Slipping through the doorway, you walk down a short corridor and find yourself in an even larger pentagonal chamber, with a door in each wall. In the center, a spiral staircase rises as high as the eye can see. You hear voices above and duck into the doorway opposite. This one leads to the Tower of Neural Networks. Once again you're in a circular chamber, this one with a sculpture of a multilayer perceptron as the centerpiece. Its parts are different from the SVM's, but their arrangement is remarkably similar. Suddenly you see it: an SVM is just a multilayer perceptron with a hidden layer composed of kernels instead of S curves and an output that's a linear combination instead of another S curve.

In any case, banning robot warfare may not be viable. Far from banning drones, the precursors of tomorrow's warbots, countries large and small are busy developing them, presumably because in their estimation the benefits outweigh the risks. As with any weapon, it's safer to have robots than to trust the other side not to. If in future wars millions of kamikaze drones will destroy conventional armies in minutes, they'd better be our drones. If World War III will be over in seconds, as one side takes control of the other's systems, we'd better have the smarter, faster, more resilient network. (Off-grid systems are not the answer: systems that aren't networked can't be hacked, but they can't compete with networked systems, either.) And, on balance, a robot arms race may be a good thing, if it hastens the day when the Fifth Geneva Convention bans humans in combat.
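The structural parallel spotted in the Tower of Neural Networks can be made explicit in code. The support vectors and weights below are invented rather than learned; the point is only the shape of the computation: a hidden layer of kernel units in place of S curves, and a linear combination at the output.

```python
# Structural sketch of an SVM as a one-hidden-layer network (illustrative;
# the support vectors and weights are made up, not fitted to data).
import math

def rbf_kernel(sv, x, gamma=1.0):
    """One 'hidden unit': similarity of input x to a stored support vector."""
    return math.exp(-gamma * sum((a - b) ** 2 for a, b in zip(sv, x)))

def svm_decision(support_vectors, weights, bias, x):
    """Output layer: a weighted sum of the kernel units, thresholded."""
    score = bias + sum(
        w * rbf_kernel(sv, x) for sv, w in zip(support_vectors, weights)
    )
    return 1 if score > 0 else -1

# Two hypothetical support vectors, one per class; each weight plays the
# role of alpha_i * y_i in the usual SVM decision function.
svs = [(0.0, 0.0), (2.0, 2.0)]
weights = [1.0, -1.0]
print(svm_decision(svs, weights, bias=0.0, x=(0.1, 0.2)))
```

Swap `rbf_kernel` for a sigmoid of a dot product and this becomes an ordinary multilayer perceptron's forward pass, which is exactly the similarity the passage describes.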
War will always be with us, but the casualties of war need not be.

Google + Master Algorithm = Skynet?