Machine learning plays a part in every stage of your life. If you studied online for the SAT college admission exam, a learning algorithm graded your practice essays. And if you applied to business school and took the GMAT exam recently, one of your essay graders was a learning system. Perhaps when you applied for your job, a learning algorithm picked your résumé from the virtual pile and told your prospective employer: here's a strong candidate; take a look. Your latest raise may have come courtesy of another learning algorithm. If you're looking to buy a house, Zillow.com will estimate what each one you're considering is worth. When you've settled on one, you apply for a home loan, and a learning algorithm studies your application and recommends accepting it (or not). Perhaps most important, if you've used an online dating service, machine learning may even have helped you find the love of your life.

Every computer scientist does battle with the complexity monster every day. When computer scientists lose the battle, complexity seeps into our lives. You've probably noticed that many a battle has been lost. Nevertheless, we continue to build our tower of algorithms, with greater and greater difficulty. Each new generation of algorithms has to be built on top of the previous ones and has to deal with their complexities in addition to its own. The tower grows taller and taller, and it covers the whole world, but it's also increasingly fragile, like a house of cards waiting to collapse. One tiny error in an algorithm and a billion-dollar rocket explodes, or the power goes out for millions. Algorithms interact in unexpected ways, and the stock market crashes.

One if by land, two if by Internet.

Machine learning has an unavoidable element of gambling. In the first Dirty Harry movie, Clint Eastwood chases a bank robber, repeatedly firing at him. Finally, the robber is lying next to a loaded gun, unsure whether to spring for it.
Did Harry fire six shots or only five? Harry sympathizes (so to speak): "You've got to ask yourself one question: 'Do I feel lucky?' Well, do you, punk?" That's the question machine learners have to ask themselves every day when they go to work: Do I feel lucky today? Just like evolution, machine learning doesn't get it right every time; in fact, errors are the rule, not the exception. But it's OK, because we discard the misses and build on the hits, and the cumulative result is what matters. Once we acquire a new piece of knowledge, it becomes a basis for inducing yet more knowledge. The only question is where to begin.

Aristotle is human. Aristotle is mortal.

The first formal model of a neuron was proposed by Warren McCulloch and Walter Pitts in 1943. It looked a lot like the logic gates computers are made of. An OR gate switches on when at least one of its inputs is on, and an AND gate when all of them are on. A McCulloch-Pitts neuron switches on when the number of its active inputs passes some threshold. If the threshold is one, the neuron acts as an OR gate; if the threshold is equal to the number of inputs, as an AND gate. In addition, a McCulloch-Pitts neuron can prevent another from switching on, which models both inhibitory synapses and NOT gates. So a network of neurons can do all the operations a computer does. In the early days, computers were often called electronic brains, and this was not just an analogy.

The exploration-exploitation dilemma.

Evolutionaries and connectionists have something important in common: they both design learning algorithms inspired by nature. But then they part ways. Evolutionaries focus on learning structure; to them, fine-tuning an evolved structure by optimizing parameters is of secondary importance. In contrast, connectionists prefer to take a simple, hand-coded structure with lots of connections and let weight learning do all the work.
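The McCulloch-Pitts neuron described a few paragraphs back is simple enough to write down directly. Here is a minimal sketch; the function and gate names are mine, chosen for illustration, not taken from McCulloch and Pitts:

```python
# A McCulloch-Pitts neuron: fires (outputs 1) when the number of active
# excitatory inputs reaches the threshold, unless any inhibitory input
# is active, in which case it is vetoed and stays off.

def mp_neuron(inputs, threshold, inhibitors=()):
    """inputs and inhibitors are sequences of 0/1 signals."""
    if any(inhibitors):  # an active inhibitory synapse blocks firing
        return 0
    return 1 if sum(inputs) >= threshold else 0

# Threshold 1 gives an OR gate; threshold equal to the number of
# inputs gives an AND gate; inhibition alone gives a NOT gate.
def or_gate(a, b):
    return mp_neuron([a, b], threshold=1)

def and_gate(a, b):
    return mp_neuron([a, b], threshold=2)

def not_gate(a):
    return mp_neuron([1], threshold=1, inhibitors=[a])
```

Since these three gates are universal for Boolean logic, wiring such neurons together can in principle compute anything a digital computer can, which is the point of the "electronic brains" remark above.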
This is machine learning's version of the nature versus nurture controversy, and there are good arguments on both sides.

This is an instance of a tension that runs throughout much of science and philosophy: the split between descriptive and normative theories, between "this is how it is" and "this is how it should be." Symbolists and Bayesians like to point out, however, that figuring out how we should learn can also help us to understand how we do learn, because the two are presumably not entirely unrelated; far from it. In particular, behaviors that are important for survival and have had a long time to evolve should not be far from optimal. We're not very good at answering written questions about probabilities, but we are very good at instantly choosing hand and arm movements to hit a target. Many psychologists have used symbolist or Bayesian models to explain aspects of human behavior. Symbolists dominated the first few decades of cognitive psychology. In the 1980s and 1990s, connectionists held sway, but now Bayesians are on the rise.

It often happens that,
even after we take all conditional independences into account, some nodes in a Bayesian network still have too many parents. Some networks are so dense with arrows that when we print them, the page turns solid black. (The physicist Mark Newman calls them "ridiculograms.") A doctor needs to simultaneously diagnose all the possible diseases a patient could have, not just one, and every disease is a parent of many different symptoms. A fever could be caused by any number of conditions besides the flu, but it's hopeless to try to predict its probability given every possible combination of conditions. All is not lost. Instead of a table specifying the node's conditional probability for every state of its parents, we can learn a simpler distribution. The most popular choice is a probabilistic version of the logical OR operation: any cause alone can provoke a fever, but each cause has a certain probability of failing to do so, even if it's usually sufficient. Heckerman and others have learned Bayesian networks that diagnose hundreds of infectious diseases in this way. Google uses a giant Bayesian network of this type in its AdSense system for automatically choosing ads to place on web pages. The network relates a million content variables to each other and to twelve million words and phrases via over three hundred million arrows, all learned from a hundred billion text snippets and search queries.

When you arrange books on a shelf so that books on similar topics are close to each other, you're doing a kind of dimensionality reduction, from the vast space of topics to the one-dimensional shelf. Unavoidably, some books that are closely related will wind up far apart on the shelf, but you can still order them in a way that minimizes such occurrences. That's what dimensionality reduction algorithms do.

Of course, even if you use Alchemy with no initial formulas (and you can), that doesn't make it knowledge-free.
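The probabilistic OR described earlier, usually called a noisy-OR, reduces the exponential conditional-probability table to one number per cause: the probability that the cause fails to produce the symptom. The symptom then fails to appear only if every active cause fails independently. A minimal sketch, with disease names and failure probabilities that are illustrative assumptions rather than figures from the text:

```python
# Noisy-OR: P(symptom | active causes) = 1 - product of the failure
# probabilities of the causes that are present. With no active cause,
# the probability is 0 (this sketch omits a "leak" term for unknown causes).

def noisy_or(active_causes, fail_probs):
    prob_all_fail = 1.0
    for cause in active_causes:
        prob_all_fail *= fail_probs[cause]
    return 1.0 - prob_all_fail

# Hypothetical failure probabilities: P(cause present but no fever).
fail_probs = {"flu": 0.1, "malaria": 0.05, "cold": 0.6}

p_fever = noisy_or({"flu", "cold"}, fail_probs)  # 1 - 0.1 * 0.6 = 0.94
```

A full table over these three parent diseases would need 2^3 entries; the noisy-OR needs only three parameters, which is why it scales to networks relating hundreds of diseases and symptoms.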
The choice of formal language, score function, and optimizer implicitly encodes assumptions about the world. So it's natural to ask whether we can have an even more general learner than Alchemy. What did evolution assume when it began its long journey from the first bacteria to all the life-forms around today? I think there's a simple assumption from which all else follows: the learner is part of the world. This means that the learner, as a physical system, obeys the same laws as its environment, whatever they are, and therefore already "knows" them implicitly and is primed to discover them. In the next section, we'll see what this can mean concretely and how to embody it in Alchemy. But for the moment, let's note that it's perhaps the best answer we can ever give to Hume's question. On the one hand, assuming the learner is part of the world is an assumption; in principle, the learner could obey different laws from those the world obeys. So it satisfies Hume's dictum that learning is only possible with prior knowledge. On the other hand, it's an assumption so basic and hard to disagree with that perhaps it's all we need for this world.

Seven for the Engineers in their halls of servers,

Picture two strands of DNA going for a swim in their private pool, aka a bacterium's cytoplasm, two billion years ago. They're pondering a momentous decision. "I'm worried, Diana," says one. "If we start making multicellular creatures, will they take over?" Fast-forward to the twenty-first century, and DNA is still alive and well. Better than ever, in fact, with an increasing fraction living safely in bipedal organisms comprising trillions of cells. It's been quite a ride for our tiny double-stranded friends since they made their momentous decision. Humans are their trickiest creation yet; we've invented things like contraception that let us have fun without spreading our DNA, and we have, or seem to have, free will.
But it's still DNA that shapes our notions of fun, and we use our free will to pursue pleasure and avoid pain, which, for the most part, still coincides with what's best for our DNA's survival. We may yet be DNA's demise if we choose to transmute ourselves into silicon, but even then, it's been a great two billion years. The decision we face today is similar: if we start making AIs, vast, interconnected, superhuman, unfathomable AIs, will they take over? No more than multicellular organisms took over from genes, vast and unfathomable as we may be to them. AIs are our survival machines, in the same way that we are our genes'.

Chapter Six.