It would be prohibitively expensive, though, if we had to build a new computer for every different thing we want to do. Rather, a modern computer is a vast assembly of transistors that can do many different things, depending on which transistors are activated. Michelangelo said that all he did was see the statue inside the block of marble and carve away the excess stone until the statue was revealed. Likewise, an algorithm carves away the excess transistors in the computer until the intended function is revealed, whether it’s an airliner’s autopilot or a new Pixar movie.

Designing an algorithm is not easy. Pitfalls abound, and nothing can be taken for granted. Some of your intuitions will turn out to have been wrong, and you’ll have to find another way. On top of designing the algorithm, you have to write it down in a language computers can understand, like Java or Python (at which point it’s called a program). Then you have to debug it: find every error and fix it until the computer runs your program without screwing up. But once you have a program that does what you want, you can really go to town. Computers will do your bidding millions of times, at ultrahigh speed, without complaint. Everyone in the world can use your creation. The cost can be zero, if you so choose, or enough to make you a billionaire, if the problem you solved is important enough. A programmer, someone who creates algorithms and codes them up, is a minor god, creating universes at will. You could even say that the God of Genesis himself is a programmer: language, not manipulation, is his tool of creation. Words become worlds. Today, sitting on the couch with your laptop, you too can be a god. Imagine a universe and make it real. The laws of physics are optional.

One of the greatest talents a politician can have is the ability to understand voters, individually or in small groups, and speak directly to them (or seem to).
Bill Clinton is the paradigmatic example of this in recent memory. The effect of machine learning is like having a dedicated Bill Clinton for every voter. Each of these mini-Clintons is a far cry from the real one, but they have the advantage of numbers; even Bill Clinton can’t know what every single voter in America is thinking (although he’d surely like to). Learning algorithms are the ultimate retail politicians.

What makes this possible? How do learning algorithms work? What can’t they currently do, and what will the next generation look like? How will the machine-learning revolution unfold? And what opportunities and dangers should you look out for? That’s what this book is about. Read on!

P and NP are the two most important classes of problems in computer science. (The names are not very mnemonic, unfortunately.) A problem is in P if we can solve it efficiently, and it’s in NP if we can efficiently check its solution. The famous P = NP question is whether every efficiently checkable problem is also efficiently solvable. Because of NP-completeness, all it takes to answer it is to prove that one NP-complete problem is efficiently solvable (or not). NP is not the hardest class of problems in computer science, but it’s arguably the hardest “realistic” class: if you can’t even check a problem’s solution before the universe ends, what’s the point of trying to solve it? Humans are good at solving NP problems approximately, and conversely, problems that we find interesting (like Tetris) often have an “NP-ness” about them. One definition of artificial intelligence is that it consists of finding heuristic solutions to NP-complete problems. Often, we do this by reducing them to satisfiability, the canonical NP-complete problem: Can a given logical formula ever be true, or is it self-contradictory? If we invent a learner that can learn to solve satisfiability, it has a good claim to being the Master Algorithm.

To which the answer is: indeed.
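To make the satisfiability question concrete, here is a minimal sketch (not from the book) of a brute-force checker in Python. The clause encoding and function name are illustrative assumptions: a formula is given in conjunctive normal form as a list of clauses, each clause a list of nonzero integers, where k means “variable k is true” and -k means “variable k is false.” Real SAT solvers are vastly more sophisticated; trying all 2^n assignments is only feasible for toy formulas.

```python
from itertools import product

def is_satisfiable(clauses, n_vars):
    """Brute-force SAT check over a CNF formula.

    clauses: list of clauses; each clause is a list of nonzero ints,
             where k means variable k is true and -k means it is false.
    Tries every one of the 2^n_vars truth assignments.
    """
    for bits in product([False, True], repeat=n_vars):
        # assignment maps variable number (1-indexed) to its truth value
        assignment = {i + 1: b for i, b in enumerate(bits)}
        # the formula holds if every clause has at least one true literal
        if all(any(assignment[abs(lit)] == (lit > 0) for lit in clause)
               for clause in clauses):
            return True
    return False

# (x1 or x2) and (not x1 or x2) and (x1 or not x2): satisfiable (x1 = x2 = True)
print(is_satisfiable([[1, 2], [-1, 2], [1, -2]], 2))  # True
# (x1) and (not x1): self-contradictory
print(is_satisfiable([[1], [-1]], 1))                 # False
```

The second call illustrates the “self-contradictory” case from the text: no assignment can make both x1 and its negation true.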
Wouldn’t it be nice if, instead of trying hundreds of variations of many algorithms, we just had to try hundreds of variations of a single one? If we can figure out what’s important and not so important in each one, what the important parts have in common and how they complement each other, we can, indeed, synthesize a Master Algorithm from them. That’s what we’re going to do in this book, or as close to it as we can. Perhaps you,
dear reader, will have some ideas of your own as you read it.

Are you ready? Our journey begins with a visit to the symbolists, the tribe with the oldest roots.

The rationalist likes to plan everything in advance before making the first move. The empiricist prefers to try things and see how they turn out. I don’t know if there’s a gene for rationalism or one for empiricism, but looking at my computer scientist colleagues, I’ve observed time and again that they are almost like personality traits: some people are rationalistic to the core and could never have been otherwise; and others are empiricist through and through, and that’s what they’ll always be. The two sides can converse with each other and sometimes draw on each other’s results, but they can understand each other only so much. Deep down each believes that what the other does is secondary, and not very interesting.

These can be organized into the following decision tree:

CHAPTER SIX: In the Church of the Reverend Bayes

One of the most exciting applications of Bayesian networks is modeling how genes regulate each other in living cells. Billions of dollars have been spent trying to discover pairwise correlations between individual genes and specific diseases, but the yield has been disappointingly low. In retrospect, this is not so surprising: a cell’s behavior is the result of complex interactions among genes and the environment, and a single gene has limited predictive power. But with Bayesian networks, we can uncover these interactions, provided we have the requisite data, and with the spread of DNA microarrays, we increasingly do.

The inference problem

Markov weighs the evidence

The last type of data, data you don’t share, also has a problem, which is that maybe you should share it. Maybe it hasn’t occurred to you to do so, maybe there’s no easy way to, or maybe you just don’t want to. In the latter case, you should consider whether you have an ethical responsibility to share.
One example we’ve seen is cancer patients, who can contribute to curing cancer by sharing their tumors’ genomes and treatment histories. But it goes well beyond that. All sorts of questions about society and policy can potentially be answered by learning from the data we generate in our daily lives. Social science is entering a golden age, where it finally has data commensurate with the complexity of the phenomena it studies, and the benefits to all of us could be enormous, provided the data is accessible to researchers, policy makers, and citizens. This does not mean letting others peek into your private life; it means letting them see the learned models, which should contain only statistical information. So between you and them there needs to be an honest data broker that guarantees your data won’t be misused, but also that no free riders share the benefits without sharing the data.

“First links in the Markov chain,” by Brian Hayes (American Scientist, 2013), recounts Markov’s invention of the eponymous chains. “Large language models in machine translation,”* by Thorsten Brants et al. (Proceedings of the 2007 Joint Conference on Empirical Methods in Natural Language Processing and Computational Natural Language Learning, 2007), explains how Google Translate works. “The PageRank citation ranking: Bringing order to the Web,”* by Larry Page, Sergey Brin, Rajeev Motwani, and Terry Winograd (Stanford University technical report, 1998), describes the PageRank algorithm and its interpretation as a random walk over the web. Statistical Language Learning,* by Eugene Charniak (MIT Press, 1996), explains how hidden Markov models work. Statistical Methods for Speech Recognition,* by Fred Jelinek (MIT Press, 1997), describes their application to speech recognition.
The story of HMM-style inference in communication is told in “The Viterbi algorithm: A personal history,” by David Forney (unpublished; online at arxiv.org/pdf/cs/0504020v2.pdf). Bioinformatics: The Machine Learning Approach,* by Pierre Baldi and Søren Brunak (2nd ed., MIT Press, 2001), is an introduction to the use of machine learning in biology, including HMMs. “Engineers look to Kalman filtering for guidance,” by Barry Cipra (SIAM News, 1993), is a brief introduction to Kalman filters, their history, and their applications.