Over time, computer scientists build on each other’s work and invent algorithms for new things. Algorithms combine with other algorithms to use the results of other algorithms, in turn producing results for still more algorithms. Every second, billions of transistors in billions of computers switch billions of times. Algorithms form a new kind of ecosystem, ever growing, comparable in richness only to life itself.

Every computer scientist does battle with the complexity monster every day. When computer scientists lose the battle, complexity seeps into our lives. You’ve probably noticed that many a battle has been lost. Nevertheless, we continue to build our tower of algorithms, with greater and greater difficulty. Each new generation of algorithms has to be built on top of the previous ones and has to deal with their complexities in addition to its own. The tower grows taller and taller, and it covers the whole world, but it’s also increasingly fragile, like a house of cards waiting to collapse. One tiny error in an algorithm and a billion-dollar rocket explodes, or the power goes out for millions. Algorithms interact in unexpected ways, and the stock market crashes.

The good news today is that sciences that were once data-poor are now data-rich. Instead of paying fifty bleary-eyed undergraduates to perform some task in the lab, psychologists can get as many subjects as they want by posting the task on Amazon’s Mechanical Turk. (It makes for a more diverse sample too.) It’s getting hard to remember, but little more than a decade ago sociologists studying social networks lamented that they couldn’t get their hands on a network with more than a few hundred members. Now there’s Facebook, with over a billion. A good chunk of those members post almost blow-by-blow accounts of their lives too; it’s like having a live feed of social life on planet Earth.
In neuroscience, connectomics and functional magnetic resonance imaging have opened an extraordinarily detailed window into the brain. In molecular biology, databases of genes and proteins grow exponentially. Even in “older” sciences like physics and astronomy, progress continues because of the flood of data pouring forth from particle accelerators and digital sky surveys.

With big data and machine learning, you can understand much more complex phenomena than before. In most fields, scientists have traditionally used only very limited kinds of models, like linear regression, where the curve you fit to the data is always a straight line. Unfortunately, most phenomena in the world are nonlinear. (Or fortunately, since otherwise life would be very boring; in fact, there would be no life.) Machine learning opens up a vast new world of nonlinear models. It’s like turning on the lights in a room where only a sliver of moonlight filtered before.

Technology trends come and go all the time. What’s unusual about machine learning is that, through all these changes, through boom and bust, it just keeps growing. Its first big hit was in finance, predicting stock ups and downs, starting in the late 1980s. The next wave was mining corporate databases, which by the mid-1990s were starting to grow quite large, and in areas like direct marketing, customer relationship management, credit scoring, and fraud detection. Then came the web and e-commerce, where automated personalization quickly became de rigueur. When the dot-com bust temporarily curtailed that, the use of learning for web search and ad placement took off. For better or worse, the 9/11 attacks put machine learning in the front line of the war on terror. Web 2.0 brought a swath of new applications, from mining social networks to figuring out what bloggers are saying about your products.
In parallel, scientists of all stripes were increasingly turning to large-scale modeling, with molecular biologists and astronomers leading the charge. The housing bust barely registered; its main effect was a welcome transfer of talent from Wall Street to Silicon Valley. In 2011, the “big data” meme hit, putting machine learning squarely in the center of the global economy’s future. Today, there seems to be hardly an area of human endeavor untouched by machine learning, including seemingly unlikely candidates like music, sports, and wine tasting.

Even if by some miracle we managed to finish coding up all the necessary pieces, our troubles would be just beginning. Over the years, a number of research groups have attempted to build complete intelligent agents by putting together algorithms for vision, speech recognition, language understanding, reasoning, planning, navigation, manipulation, and so on. Without a unifying framework, these attempts soon hit an insurmountable wall of complexity: too many moving parts, too many interactions, too many bugs for poor human software engineers to cope with. Knowledge engineers believe AI is just an engineering problem, but we have not yet reached the point where engineering can take us the rest of the way. In 1962, when Kennedy gave his famous moon-shot speech, going to the moon was an engineering problem. In 1662, it wasn’t, and that’s closer to where AI is today.

In his story “Funes the Memorious,” Jorge Luis Borges tells of meeting a youth with perfect memory. This might at first seem like a great fortune, but it is in fact an awful curse. Funes can remember the exact shape of the clouds in the sky at an arbitrary time in the past, but he has trouble understanding that a dog seen from the side at 3:14 p.m.
is the same dog seen from the front at 3:15 p.m. His own face in the mirror surprises him every time he sees it. Funes can’t generalize; to him, two things are the same only if they look the same down to every last detail. An unrestricted rule learner is like Funes and is equally unable to function. Learning is forgetting the details as much as it is remembering the important parts. Computers are the ultimate idiot savants: they can remember everything with no trouble at all, but that’s not what we want them to do.

Your friend Ben is also pretty good, but he’s had a bit too much to drink. His darts are all over, but he loudly points out that on average he’s hitting the bull’s-eye. (Maybe he should have been a statistician.) This is the low-bias, high-variance case, shown in the bottom right corner. Ben’s girlfriend, Ashley, is very steady, but she has a tendency to aim too high and to the right. She has low variance and high bias (top left corner). Cody, who’s visiting from out of town and has never played darts before, is both all over and off center. He has both high bias and high variance (top right).

Symbolism is the shortest path to the Master Algorithm. It doesn’t require us to figure out how evolution or the brain works, and it avoids the mathematical complexities of Bayesianism. Sets of rules and decision trees are easy to understand, so we know what the learner is up to. This makes it easier to figure out what it’s doing right and wrong, fix the latter, and have confidence in the results.

HMMs are good for modeling sequences of all kinds, but they’re still a far cry from the flexibility of the symbolists’ If…then… rules, where anything can appear as an antecedent, and a rule’s consequent can in turn be an antecedent in any downstream rule. If we allow such an arbitrary structure in practice, however, the number of probabilities we need to learn blows up.
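
To get a feel for how quickly that number grows, here is a minimal sketch in Python. It assumes, purely for illustration, that every variable is Boolean, and it compares two extremes: a full joint table, where any variable may depend on any other, and a simple chain, where each variable depends only on the one before it. The chain is a simplification in the spirit of sequence models, not an HMM itself, and the function names are hypothetical, not taken from the text.

```python
def full_joint_parameters(n_vars: int) -> int:
    """Independent probabilities in a full joint table over n Boolean variables.

    The table has 2**n entries; one is fixed because they must sum to 1.
    """
    return 2 ** n_vars - 1


def chain_parameters(n_vars: int) -> int:
    """Probabilities needed when each Boolean variable depends only on its predecessor.

    One prior for the first variable, then two conditional probabilities
    (one per value of the predecessor) for every later variable.
    """
    if n_vars == 0:
        return 0
    return 1 + 2 * (n_vars - 1)


if __name__ == "__main__":
    for n in (3, 10, 30):
        print(n, chain_parameters(n), full_joint_parameters(n))
```

With 30 variables the chain needs 59 probabilities, while the unrestricted table needs over a billion, far more than any realistic data set could pin down.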
For a long time no one knew how to square this circle, and researchers resorted to ad-hoc schemes, like attaching confidence estimates to rules and somehow combining them. If A implies B with confidence 0.8 and B implies C with confidence 0.7, then perhaps A implies C with confidence 0.8 × 0.7.

To handle weakly relevant attributes, one option is to learn attribute weights. Instead of letting the similarity along all dimensions count equally, we “shrink” the less-relevant ones. Suppose the training examples are points in a room, and the height dimension is not that important for our purposes. Discarding it would project all examples onto the floor. Downweighting it is more like giving the room a lower ceiling. The height of a point still counts when computing its distance to other points, but less than its horizontal position. And like many other things in machine learning, we can learn attribute weights by gradient descent.

Discovering the shape of the data. One of the most popular algorithms for nonlinear dimensionality reduction, called Isomap, does just this. It connects each data point in a high-dimensional space (a face, say) to all nearby points (very similar faces), computes the shortest distances between all pairs of points along the resulting network, and finds the reduced coordinates that best approximate these distances. In contrast to PCA, faces’ coordinates in this space are often quite meaningful: one may represent which direction the face is facing (left profile, three quarters, head on, etc.); another how the face looks (very sad, a little sad, neutral, happy, very happy, etc.); and so on. From understanding motion in video to detecting emotion in speech, Isomap has a surprising ability to zero in on the most important dimensions of complex data.

Your head is spinning. You go outside to the balcony. The sun has risen over the city. You gaze out over the rooftops to the countryside beyond.
Forests of servers stretch away in all directions, humming quietly, waiting for the Master Algorithm. Convoys move along the roads, carrying gold from the data mines. Far to the west, the land gives way to a sea of information, dotted with ships. You look up at the flag of the Master Algorithm. You can now clearly see the inscription inside the five-pointed star:

Conversely, the long-term prospects of scientists are not the brightest, sadly. In the future, the only scientists may well be computer scientists, meaning computers doing science. The people formerly known as scientists (like me) will devote their lives to understanding the scientific advances made by computers. They won’t be noticeably less happy than before; after all, science was always a hobby to them. And one very important job for the technically minded will remain: keeping an eye on the computers. In fact, this will require more than engineers; ultimately, it may be the full-time occupation of all mankind to figure out what we want from the machines and make sure we’re getting it (more on this later in this chapter).