[MUSIC] I think one of the good new applications of data science is in the medical field, like in drug delivery or cancer treatment. >> I think a very interesting one is how companies can now use all the information they're gathering from their customers to actually develop new products that respond to the needs of those customers. >> A good new application of data science was Pokemon GO. They used the data from the Ingress app, the previous app from the same company. And they chose the locations for Pokemon and Gyms according to data from that earlier app, so they learned from their errors. >> Google Search is an application of data science for us. We use Google Search whenever we want to search for anything. So I think whatever Google is now, it's all because of data science. >> Augmented reality is my favorite new implementation of data science. I think you can't look at a new technology and not see data science in there. But augmented reality is the one I'm just the most excited about. The ability to walk around and see things on walls or around us that aren't really there, Pokemon is just the start. >> So what has happened is that now that the tools are available and the data sets are available, people are applying them without much diligence. And I think one of the strange cases, which got reported in the newspapers, is the story of a father walking into a Target store in the US and complaining about the fact that Target was sending mail to his teenage daughter about diapers, milk, and baby formula, and he was angry with them. He said, why would you want my teenage daughter to have babies? And he was obviously disturbed by this mail, or the ad campaign, and they obviously apologized. But then the father returned two weeks later, and he apologized to them, saying he didn't know his daughter was pregnant. 
Now the question is, how did Target know this before the father knew? And what has happened is that they look at the purchasing behavior of individuals. So if you're buying some sort of supplements or vitamins, then they infer that this is the first trimester of pregnancy. So they know what products to send to you, assuming that the person who bought those supplements was pregnant. Now this is a great story about data science and how data science can forecast and predict these consumer behaviors, even before the family would find out. And I find it disturbing, and strange, and odd for a variety of reasons. First of all, for every correct prediction, you have hundreds of incorrect predictions, which we call false positives. And no data scientist actually advertises his or her false positives; we only advertise and promote what we got right. But when we got it wrong hundreds of times, we don't tell anyone. The second thing is, that's an abuse of data, and it's basically not really giving you much insight. You've just found a correlation, but someone could be purchasing the same items for someone else. And then the odds of getting it wrong, the odds of getting false positives, are much higher. So I find it strange, and I think it gives us a false sense of our ability to predict the future. The reality about data science, and the most important thing for budding data scientists to know, is that all forecasts are wrong. They're useful, but they are wrong. And so one should not put their faith in the idea that now that we can do predictive analytics, we can solve all problems. I think a good example is Google Search. Google published a paper saying that they can predict flu epidemics before the Centers for Disease Control. And what they did is they were looking at what people were searching for on Google about flu symptoms. So Google saw the flu symptoms as searches before anybody else, and they were able to predict it. 
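The point above about false positives can be made concrete with a little arithmetic. The sketch below is only an illustration: the function names are hypothetical, and the three input rates are made-up numbers, not figures from Target or from any study. It shows how, when the predicted condition is rare, even a fairly good model flags more people wrongly than rightly.

```python
def precision(prevalence, sensitivity, false_positive_rate):
    """Share of flagged people who really have the condition.

    prevalence: fraction of the population that has the condition
    sensitivity: fraction of true cases that the model flags
    false_positive_rate: fraction of non-cases that the model flags
    """
    true_pos = prevalence * sensitivity
    false_pos = (1 - prevalence) * false_positive_rate
    return true_pos / (true_pos + false_pos)


def false_positives_per_true_positive(prevalence, sensitivity, false_positive_rate):
    """How many wrong flags are raised for every correct one."""
    true_pos = prevalence * sensitivity
    false_pos = (1 - prevalence) * false_positive_rate
    return false_pos / true_pos


# Hypothetical inputs, chosen only for illustration:
# 1% of shoppers are pregnant, the model catches 90% of them,
# and it wrongly flags 5% of everyone else.
PREVALENCE = 0.01
SENSITIVITY = 0.90
FALSE_POSITIVE_RATE = 0.05

print(precision(PREVALENCE, SENSITIVITY, FALSE_POSITIVE_RATE))
print(false_positives_per_true_positive(PREVALENCE, SENSITIVITY, FALSE_POSITIVE_RATE))
```

With these made-up inputs, only about 15% of the flagged shoppers are actually pregnant, which is roughly five or six wrong flags for every right one. Making the condition rarer or the model sloppier pushes that ratio up quickly, which is the speaker's point about the predictions nobody advertises.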
The thing is, these searches are good, and they are correlated with some outcomes, but not necessarily all the time. So at the time when Google announced it, it was a big thing, and everybody really liked it and said that's a new era of predictive analytics. Only, a few years later, people realized that Google had started to predict false positives. They were predicting things that were not really there, or their predictions were not that accurate, for a variety of reasons. They probably changed their algorithms, and the datasets were no longer really correlated with the outcome. So what's the lesson to learn here? One has to avoid what we call data hubris: you should not believe in your models too much, because they can lead you astray. Data science has tremendous potential to bring change in parts of the world and parts of our society that have been disenfranchised for years. One sees great examples of data science, especially in developing countries, where they are targeting relief efforts. They're targeting food and other aid to individuals and to places that have not been targeted in the past. And the reason it is happening now is the greater availability of data, and of our models and analytics, to be able to pinpoint where the greatest needs are. There is also the ability to design and conduct experiments to see how microcredit, small loans given to very poor households in developing parts of the world, affects an individual household's ability to get out of poverty, and also the local community's ability to collectively improve its economic well-being through just very small infusions of cash or credit. So these experiments are happening all over the world, and that is a direct result of our ability to analyze data, to design experiments, and then to roll out humongous efforts in providing relief, providing credit. 
And providing those who have been disenfranchised in the past an opportunity to join the rest of the world in prosperity, and happiness, and health. [MUSIC]