Machine learning is the scientific method on steroids. It follows the same process of generating, testing, and discarding or refining hypotheses. But while a scientist may spend his or her whole life coming up with and testing a few hundred hypotheses, a machine-learning system can do the same in a fraction of a second. Machine learning automates discovery. ItвЂ™s no surprise, then, that itвЂ™s revolutionizing science as much as itвЂ™s revolutionizing business.. The argument from statistics. According to one school of statisticians, a single simple formula underlies all learning. BayesвЂ™ theorem, as the formula is known, tells you how to update your beliefs whenever you see new evidence. A Bayesian learner starts with a set of hypotheses about the world. When it sees a new piece of data, the hypotheses that are compatible with it become more likely, and the hypotheses that arenвЂ™t become less likely (or even impossible). After seeing enough data, a single hypothesis dominates, or a few do. For example, if IвЂ™m looking for a program that accurately predicts stock movements and a stock that a candidate program had predicted would go up instead goes down, that candidate loses credibility. After IвЂ™ve reviewed a number of candidates, only a few credible ones will remain, and they will encapsulate my new knowledge of the stock market.. Our quest will take us across the territory of each of the five tribes. The border crossings, where they meet, negotiate and skirmish, will be the trickiest part of the journey. Each tribe has a different piece of the puzzle, which we must gather. Machine learners, like all scientists, resemble the blind men and the elephant: one feels the trunk and thinks itвЂ™s a snake, another leans against the leg and thinks itвЂ™s a tree, yet another touches the tusk and thinks itвЂ™s a bull. Our aim is to touch each part without jumping to conclusions; and once weвЂ™ve touched all of them, we will try to picture the whole elephant. ItвЂ™s far from obvious how to combine all the pieces into one solution-impossible, according to some-but this is what we will do.. ItвЂ™s not an exaggeration to say that this innocuous-sounding statement is at the heart of the Newtonian revolution and of modern science. KeplerвЂ™s laws applied to exactly six entities: the planets of the solar system known in his time. NewtonвЂ™s laws apply to every last speck of matter in the universe. The leap in generality between the two is staggering, and itвЂ™s a direct consequence of NewtonвЂ™s principle. This one principle is all by itself a knowledge pump of phenomenal power. Without it there would be no laws of nature, only a forever incomplete patchwork of small regularities.. What we need is to learn concepts that are defined by a set of rules, not just a single rule, such as:. In his storyвЂњFunes the Memorious,вЂќ Jorge Luis Borges tells of meeting a youth with perfect memory. This might at first seem like a great fortune, but it is in fact an awful curse. Funes can remember the exact shape of the clouds in the sky at an arbitrary time in the past, but he has trouble understanding that a dog seen from the side at 3:14 p.m. is the same dog seen from the front at 3:15 p.m. His own face in the mirror surprises him every time he sees it. Funes canвЂ™t

generalize; to him, two things are the same only if they look the same down to every last detail. An unrestricted rule learner is like Funes and is equally unable to function. Learning is forgetting the details as much as it is remembering the important parts. Computers are the ultimate idiot savants: they can remember everything with no trouble at all, but thatвЂ™s not what we want them to do.. Learning to cure cancer. But if humans have all these abilities that their brains didnвЂ™t learn by tweaking synapses, where did they come from? Unless you believe in magic, the answer must be evolution. If youвЂ™re a connectionism skeptic and you have the courage of your convictions, it behooves you to figure out how evolution learned everything a baby knows at birth-and the more you think is innate, the taller the order. But if you can figure it out and program a computer to do it, it would be churlish to deny that youвЂ™ve invented at least one version of the Master Algorithm.. For example, suppose you want to evolve a program to compute the duration of a planetвЂ™s year,T, from its average distance to the sun,D. According to KeplerвЂ™s third law,T is the square root ofD cubed, times a constantC that depends on the units you use for time and distance. A genetic algorithm should be able to discover this by looking at Tycho BraheвЂ™s data on planetary motions like Kepler did. In KozaвЂ™s approach,D andC are the leaves of a program tree, and the operations that combine them, like multiplication and taking the square root, are the internal nodes. The following program tree correctly computesT:. This is a radical departure from the way science is usually done. ItвЂ™s like saying, вЂњActually, neither Copernicus nor Ptolemy was right; letвЂ™s just predict the planetsвЂ™ future trajectories assuming Earth goes round the sun and vice versa and average the results.вЂќ. Decision trees are not immune to the curse of dimensionality either. LetвЂ™s say the concept youвЂ™re trying to learn is a sphere: points inside it are positive, and points outside it are negative. A decision tree can approximate a sphere by the smallest cube it fits inside. Not perfect, but not too bad either: only the corners of the cube get misclassified. But in high dimensions, almost the entire volume of the hypercube lies outside the hypersphere. For every example you correctly classify as positive, you incorrectly classify many negative ones as positive, causing your accuracy to plummet.. Nevertheless, itвЂ™s still the case that most shops are pretty close to University Avenue, and if you were allowed only one number to locate a shop, its distance from the Caltrain station along the avenue would be a pretty good choice: after walking that distance, looking around is probably enough to find the shop. So youвЂ™ve just reduced the dimensionality of вЂњshop locations in Palo AltoвЂќ from two to one.. The doctor will see you now. Chapter Ten.