During the Battle of Britain, the Royal Air Force held back the Luftwaffe despite being heavily outnumbered. German pilots couldn’t understand how, wherever they went, they always ran into the RAF. The British had a secret weapon: radar, which detected the German planes well before they crossed into Britain’s airspace. Machine learning is like having a radar that sees into the future. Don’t just react to your adversary’s moves; predict them and preempt them.

In April 2000, a team of neuroscientists from MIT reported in Nature the results of an extraordinary experiment. They rewired the brain of a ferret, rerouting the connections from the eyes to the auditory cortex (the part of the brain responsible for processing sounds) and rerouting the connections from the ears to the visual cortex. You’d think the result would be a severely disabled ferret, but no: the auditory cortex learned to see, the visual cortex learned to hear, and the ferret was fine. In normal mammals, the visual cortex contains a map of the retina: neurons connected to nearby regions of the retina are close to each other in the cortex. Instead, the rewired ferrets developed a map of the retina in the auditory cortex. If the visual input is redirected instead to the somatosensory cortex, responsible for touch perception, it too learns to see. Other mammals also have this ability.

Another line of evidence comes from optimization, the branch of mathematics concerned with finding the input to a function that produces its highest output. For example, finding the sequence of stock purchases and sales that maximizes your total returns is an optimization problem. In optimization, simple functions often give rise to surprisingly complex solutions. Optimization plays a prominent role in almost every field of science, technology, and business, including machine learning. Each field optimizes within the constraints defined by optimizations in other fields.
We try to maximize our happiness within economic constraints, which are firms’ best solutions within the constraints of the available technology, which in turn consists of the best solutions we could find within the constraints of biology and physics. Biology, in turn, is the result of optimization by evolution within the constraints of physics and chemistry, and the laws of physics themselves are solutions to optimization problems. Perhaps, then, everything that exists is the progressive solution of an overarching optimization problem, and the Master Algorithm follows from the statement of that problem.

The argument from computer science. Swan bites robot. If your credit card was used to purchase one dollar of gas, it was stolen.

Sets of rules are popular with retailers who are deciding which goods to stock. Typically, they use a more exhaustive approach than “divide and conquer,” looking for all rules that strongly predict the purchase of each item. Walmart was a pioneer in this area. One of their early findings was that if you buy diapers you are also likely to buy beer. Huh? One interpretation of this is that Mom sends Dad to the supermarket to buy diapers, and as emotional compensation, Dad buys a case of beer to go with them. Knowing this, the supermarket can now sell more beer by putting it next to the diapers, which would never have occurred to it without rule mining. The “beer and diapers” rule has acquired legendary status among data miners (although some claim the legend is of the urban variety). Either way, it’s a long way from the digital circuit design problems Michalski had in mind when he first started thinking about rule induction in the 1960s. When you invent a new learning algorithm, you can’t even begin to imagine all the things it will be used for.

Your friend Ben is also pretty good, but he’s had a bit too much to drink. His darts are all over, but he loudly points out that on average he’s hitting the bull’s-eye.
(Maybe he should have been a statistician.) This is the low-bias, high-variance case, shown in the bottom right corner. Ben’s girlfriend, Ashley, is very steady, but she has a tendency to aim too high and to the right. She has low variance and high bias (top left corner). Cody, who’s visiting from out of town and has never played darts before, is both all over and off center. He has both high bias and high variance (top right).

Socrates is a philosopher.

Hebb’s rule was a confluence of ideas from psychology and neuroscience, with a healthy dose of speculation thrown in. Learning by association was a favorite theme of the British empiricists, from Locke and Hume to John Stuart Mill. In his Principles of Psychology, William James enunciates a general principle of association that’s remarkably similar to Hebb’s rule, with neurons replaced by brain processes and firing efficiency by propagation of excitement. Around the same time, the great Spanish neuroscientist Santiago Ramón y Cajal was making the first detailed observations of the brain, staining individual neurons using the recently invented Golgi method and cataloguing what he saw like a botanist classifying new species of trees. By Hebb’s time, neuroscientists had a rough understanding of how neurons work, but he was the first to propose a mechanism by which they could encode associations.

This is an instance of a tension that runs throughout much of science and philosophy: the split between descriptive and normative theories, between “this is how it is” and “this is how it should be.” Symbolists and Bayesians like to point out, however, that figuring out how we should learn can also help us to understand how we do learn because the two are presumably not entirely unrelated; far from it. In particular, behaviors that are important for survival and have had a long time to evolve should not be far from optimal.
We’re not very good at answering written questions about probabilities, but we are very good at instantly choosing hand and arm movements to hit a target. Many psychologists have used symbolist or Bayesian models to explain aspects of human behavior. Symbolists dominated the first few decades of cognitive psychology. In the 1980s and 1990s, connectionists held sway, but now Bayesians are on the rise.

P(A|B) = P(A) P(B|A) / P(B)

This is not the whole story, however. If we observe an effect that would happen even without the cause, then surely that’s not much evidence of the cause being present. Bayes’ theorem incorporates this by saying that P(cause | effect) goes down with P(effect), the prior probability of the effect (i.e., its probability in the absence of any knowledge of the causes). Finally, other things being equal, the more likely a cause is a priori, the more likely it should be a posteriori. Putting all of these together, Bayes’ theorem says that.

No one is sure who invented the Naïve Bayes algorithm. It was mentioned without attribution in a 1973 pattern recognition textbook, but it only took off in the 1990s, when researchers noticed that, surprisingly, it was often more accurate than much more sophisticated learners. I was a graduate student at the time, and when I belatedly decided to include Naïve Bayes in my experiments, I was shocked to find it did better than all the other algorithms I was comparing, save one (luckily, the algorithm I was developing for my thesis, or I might not be here now).

For those of us who are not keen on online dating, a more immediately useful notion is to choose which interactions to record and where. If you don’t want your Christmas shopping to leave Amazon confused about your tastes, do it on other sites. (Sorry, Amazon.) If you watch different kinds of videos at home and for work, keep two accounts on YouTube, one for each, and YouTube will learn to make the corresponding recommendations.
And if you’re about to watch some videos of a kind that you ordinarily have no interest in, log out first. Use Chrome’s incognito mode not for guilty browsing (which you’d never do, of course) but for when you don’t want the current session to influence future personalization. On Netflix, adding profiles for the different people using your account will spare you R-rated recommendations on family movie night. If you don’t like a company, click on their ads: this will not only waste their money now, but teach Google to waste it again in the future by showing the ads to people who are unlikely to buy the products. And if you have very specific queries that you want Google to answer correctly in the future, take a moment to trawl through the later results pages for the relevant links and click on them. More generally, if a system keeps recommending the wrong things to you, try teaching it by finding and clicking on a bunch of the right ones and come back later to see if it did.
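
To make the earlier statement of Bayes’ theorem concrete, here is a minimal sketch in Python of P(cause | effect) = P(cause) P(effect | cause) / P(effect). The function name, its parameters, and all the probabilities are hypothetical, chosen only for illustration. P(effect) is computed by summing over the two cases, cause present and cause absent, which is an assumption about how the prior probability of the effect would be obtained in practice.

```python
def posterior(prior, p_effect_given_cause, p_effect_given_no_cause):
    """Return P(cause | effect) using Bayes' theorem.

    All three arguments are illustrative probabilities between 0 and 1.
    """
    # P(effect): total probability of seeing the effect, with or without the cause.
    p_effect = (p_effect_given_cause * prior
                + p_effect_given_no_cause * (1.0 - prior))
    # Bayes' theorem: P(cause | effect) = P(cause) P(effect | cause) / P(effect).
    return prior * p_effect_given_cause / p_effect


# The effect rarely happens without the cause: seeing it is decent evidence.
rare_effect = posterior(0.01, 0.9, 0.05)

# The effect often happens anyway: P(effect) is larger, so the posterior drops.
common_effect = posterior(0.01, 0.9, 0.5)

# Same setting as the first case, but the cause is more likely a priori.
likelier_cause = posterior(0.1, 0.9, 0.05)

print(f"{rare_effect:.3f} {common_effect:.3f} {likelier_cause:.3f}")
# prints: 0.154 0.018 0.667
```

The three calls mirror the three points made in the text: the posterior rises with P(effect | cause), falls as the effect becomes more likely in general, and rises with the prior probability of the cause.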