The other reason machine learners are the über-geeks is that the world has far fewer of them than it needs, even by the already dire standards of computer science. According to tech guru Tim O’Reilly, “data scientist” is the hottest job title in Silicon Valley. The McKinsey Global Institute estimates that by 2018 the United States alone will need 140,000 to 190,000 more machine-learning experts than will be available, and 1.5 million more data-savvy managers. Machine learning’s applications have exploded too suddenly for education to keep up, and it has a reputation for being a difficult subject. Textbooks are liable to give you math indigestion. This difficulty is more apparent than real, however. All of the important ideas in machine learning can be expressed math-free. As you read this book, you may even find yourself inventing your own learning algorithms, with nary an equation in sight.

The larger outcome is that democracy works better because the bandwidth of communication between voters and politicians increases enormously. In these days of high-speed Internet, the amount of information your elected representatives get from you is still decidedly nineteenth century: a hundred bits or so every two years, as much as fits on a ballot. This is supplemented by polling and perhaps the occasional e-mail or town-hall meeting, but that’s still precious little. Big data and machine learning change the equation. In the future, provided voter models are accurate, elected officials will be able to ask voters what they want a thousand times a day and act accordingly, without having to pester the actual flesh-and-blood citizens.

Most of all, we have to worry about what the Master Algorithm could do in the wrong hands. The first line of defense is to make sure the good guys get it first or, if it’s not clear who the good guys are, to make sure it’s open-sourced.
The second is to realize that, no matter how good the learning algorithm is, it’s only as good as the data it gets. He who controls the data controls the learner. Your reaction to the datafication of life should not be to retreat to a log cabin (the woods, too, are full of sensors) but to aggressively seek control of the data that matters to you. It’s good to have recommenders that find what you want and bring it to you; you’d feel lost without them. But they should bring you what you want, not what someone else wants you to have. Control of data and ownership of the models learned from it is what many of the twenty-first century’s battles will be about: between governments, corporations, unions, and individuals. But you also have an ethical duty to share data for the common good. Machine learning alone will not cure cancer; cancer patients will, by sharing their data for the benefit of future patients.

The dating example is a little deceptive because it’s very small (four variables and four examples). But suppose now that you run an online dating service and you need to figure out which couples to match. If each user of your system has filled out a questionnaire with answers to fifty yes/no questions, each potential match is characterized by one hundred attributes, fifty from each member of the prospective couple. Based on the couples that have gone on a date and reported the outcome, can you find a conjunctive definition for the concept of a “good match”? There are 3^100 possible definitions to try. (The three options for each attribute are yes, no, and not part of the concept.) Even with the fastest computer in the world, the couples will all be long gone, and your company bankrupt, by the time you’re done, unless you’re lucky and a very short definition hits the jackpot. So many rules, so little time. We need to do something smarter.

The problem is not limited to memorizing instances wholesale.
Whenever a learner finds a pattern in the data that is not actually true in the real world, we say that it has overfit the data. Overfitting is the central problem in machine learning. More papers have been written about it than about any other topic. Every powerful learner, whether symbolist, connectionist, or any other, has to worry about hallucinating patterns. The only safe way to avoid it is to severely restrict what the learner can learn, for example by requiring that it be a short conjunctive concept. Unfortunately, that throws out the baby with the bathwater, leaving the learner unable to see most of the true patterns that are visible in the data. Thus a good learner is forever walking the narrow path between blindness and hallucination.

Socrates is human.

Aristotle is human. Aristotle is mortal.

Symbolist machine learners share this belief in the power of symbol manipulation with many other computer scientists, psychologists, and philosophers. The psychologist David Marr argued that every information processing system should be studied at three distinct levels: the fundamental properties of the problem it’s solving; the algorithms and representations used to solve it; and how they are physically implemented. For example, addition can be defined by a set of axioms irrespective of how it’s carried out; numbers can be expressed in different ways (e.g., Roman and Arabic) and added using different algorithms; and these can be implemented using an abacus, a pocket calculator, or even, very inefficiently, in your head. Learning is a prime example of a cognitive faculty we can profitably study according to Marr’s levels.

And this jungle crackles with electricity. Sparks run along tree trunks and set off more sparks in neighboring trees. Every now and then, a whole area of the jungle whips itself into a frenzy before settling down again. When you wiggle
your toe, a series of electric discharges, called action potentials, runs all the way down your spinal cord and leg until it reaches your toe muscles and tells them to move. Your brain at work is a symphony of these electric sparks. If you could sit inside it and watch what happens as you read this page, the scene you’d see would make even the busiest science-fiction metropolis look laid back by comparison. The end result of this phenomenally complex pattern of neuron firings is your consciousness.

What is sex for?

One solution, left as an exercise by Pearl in his book on Bayesian networks, is to pretend the graph has no loops and just keep propagating probabilities back and forth until they converge. This is known as loopy belief propagation, both because it works on graphs with loops and because it’s a crazy idea. Surprisingly, it turns out to work quite well in many cases. For instance, it’s a state-of-the-art method for wireless communication, with the random variables being the bits in the message, encoded in a clever way. But loopy belief propagation can also converge to the wrong answers or oscillate forever. Another solution, which originated in physics but was imported into machine learning and greatly extended by Michael Jordan and others, is to approximate an intractable distribution with a tractable one and optimize the latter’s parameters to make it as close as possible to the former.

Looking around for applications, Vapnik and his coworkers soon alighted on handwritten digit recognition, which their connectionist colleagues at Bell Labs were the world experts on. To everyone’s surprise, SVMs did as well out of the box as multilayer perceptrons that had been carefully crafted for digit recognition over the years. This set the stage for a long-running, wide-ranging competition between the two.
SVMs can be seen as a generalization of the perceptron, because a hyperplane boundary between classes is what you get when you use a particular similarity measure (the dot product between vectors). But SVMs have a major advantage compared to multilayer perceptrons: the weights have a single optimum instead of many local ones and so learning them reliably is much easier. Despite this, SVMs are no less expressive than multilayer perceptrons; the support vectors effectively act as a hidden layer and their weighted average as the output layer. For example, an SVM can easily represent the exclusive-OR function by having one support vector for each of the four possible configurations. But the connectionists didn’t give up without a fight. In 1995, Larry Jackel, the head of Vapnik’s department at Bell Labs, bet him a fancy dinner that by 2000 neural networks would be as well understood as SVMs. He lost. But in return, Vapnik bet that by 2005 no one would use neural networks any more, and he also lost. (The only one to get a free dinner was Yann LeCun, their witness.) Moreover, with the advent of deep learning, connectionists have regained the upper hand. Provided you can learn them, networks with many layers can express many functions more compactly than SVMs, which always have just one layer, and this can make all the difference.

The question, of course, is what algorithm should be running in Robby’s brain at birth. Researchers influenced by child psychology look askance at neural networks because the microscopic workings of a neuron seem a million miles from the sophistication of even a child’s most basic behaviors, like reaching for an object, grasping it, and inspecting it with wide, curious eyes. We need to model the child’s learning at a higher level of abstraction, lest we miss the planet for the trees.
Above all, even though children certainly get plenty of help from their parents, they learn mostly on their own, without supervision, and that’s what seems most miraculous. None of the algorithms we’ve seen so far can do it, but we’re about to see several that can, bringing us one step closer to the Master Algorithm.

In this rapidly approaching future, you’re not going to be the only one with a “digital half” doing your bidding twenty-four hours a day. Everyone will have a detailed model of him- or herself, and these models will talk to each other all the time. If you’re looking for a job and company X is looking to hire, its model will interview your model. It will be a lot like a real, flesh-and-blood interview (your model will still be well advised to not volunteer negative information about you, and so on), but it will take only a fraction of a second. You’ll click on “Find Job” in your future LinkedIn account, and you’ll immediately interview for every job in the universe that remotely fits your parameters (profession, location, pay, etc.). LinkedIn will respond on the spot with a ranked list of the best prospects, and out of those, you’ll pick the first company that you want to have a chat with. Same with dating: your model will go on millions of dates so you don’t have to, and come Saturday, you’ll meet your top prospects at an OkCupid-organized party, knowing that you’re also one of their top prospects, and knowing, of course, that their other top prospects are also in the room. It’s sure to be an interesting night.
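
One calculation from earlier in this section is easy to check by machine: the number of conjunctive definitions in the dating-service example, where each yes/no attribute can be required to be yes, required to be no, or left out of the concept. The Python sketch below is only an illustration and is not taken from the text. The function names are made up for the example, and the speed of 10^18 candidate definitions per second is an assumed, deliberately generous figure rather than a measurement of any real computer.

```python
from itertools import product

# Each attribute can be required to be "yes", required to be "no",
# or left out of the concept entirely (None).
OPTIONS = ("yes", "no", None)


def count_conjunctive_definitions(num_attributes: int) -> int:
    """Number of conjunctive concepts over yes/no attributes: 3 ** n."""
    return 3 ** num_attributes


def enumerate_conjunctive_definitions(num_attributes: int):
    """Yield every definition as a tuple with one option per attribute."""
    return product(OPTIONS, repeat=num_attributes)


def matches(definition, example) -> bool:
    """True if the example satisfies every attribute the definition constrains."""
    return all(req is None or req == val for req, val in zip(definition, example))


def years_to_try_all(num_attributes: int, tests_per_second: float) -> float:
    """Years needed to test every definition at an assumed (hypothetical) rate."""
    seconds = count_conjunctive_definitions(num_attributes) / tests_per_second
    return seconds / (365.25 * 24 * 3600)


if __name__ == "__main__":
    # Small case (four attributes): brute-force enumeration is trivial.
    print(len(list(enumerate_conjunctive_definitions(4))))
    # Dating-service case (one hundred attributes): only counting is feasible.
    print(count_conjunctive_definitions(100))
    print(f"{years_to_try_all(100, 1e18):.2e} years at the assumed rate")
```

With four attributes the enumeration yields only 81 definitions, which is why the small example is deceptive. With one hundred attributes the count has 48 digits, so even at the assumed rate exhaustive search is hopeless, and something smarter than trying every definition is needed.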