Now, I want to come back to the topic of natural user interfaces. One of the most natural interfaces for people is human speech. For the last 60 years, computer scientists have been trying to find a way to understand and recognize human speech. At the beginning, when people first started tackling this problem, they looked at it largely as a pattern-matching problem. The earliest systems attempted to take the waveforms that came out of a speaker's voice and match them against waveforms they knew represented a certain word. Unfortunately, that approach was extremely fragile, partly because everyone speaks differently, but also because a given speaker will often say the same thing differently depending on the other words or the context in which they're speaking. You've probably already noticed me doing that. (The first sketch below illustrates this template-matching idea.)

Now, in the late 1970s, there was a major change in the way people did speech recognition. This was work being done at Carnegie Mellon University, and the idea was to use a statistical modeling technique, in this case the hidden Markov model, to take a lot of data from many speakers and produce more robust statistical models of speech. That was a huge improvement. Over the last 30 years, speech recognition systems have become dramatically better than they used to be. They still make a lot of mistakes, but in limited domains it's possible to build very successful speech interfaces. So, for example, in the United States when I call my bank, I'm talking to a computer, not to a person. The computer can answer simple questions about my bank account, or if necessary, it can connect me to a real person if I have a significant issue I want to discuss. I'm sure you've heard of Apple's Siri, which answers simple questions, and Microsoft's Kinect has a robust speech interface that allows you to control the interface and even issue commands in the middle of games. Still, these systems make a lot of errors: the error rate on arbitrary speech has been in the 20 to 25 percent range.

Well, a few years ago, researchers at Microsoft Research and the University of Toronto came together to develop another breakthrough in the field of speech recognition research. The idea they had was to use a technology patterned after the way the human brain works, called deep neural networks. It allowed engineers to take in much more data than had previously been usable with hidden Markov models alone, and to use that data to significantly improve recognition. (The second sketch below shows how a neural network slots into the same decoder.) That one change, that particular breakthrough, reduced the error rate by approximately 30 percent. That's a big deal. A 30 percent relative reduction takes a 20 to 25 percent error rate, about one error out of every four or five words, down to roughly 15 percent or slightly less (0.7 × 22 ≈ 15), about one error out of every seven or perhaps even eight words. So, it's still not perfect, and there's still a long way to go, but I think you can see that we have already made a significant amount of additional progress in the recognition of speech.
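For readers of this transcript, here is a minimal sketch of that early template-matching idea, written in Python with NumPy. Nothing here comes from any historical system; the correlation measure and the function names are illustrative assumptions.

```python
# A minimal sketch (not any historical system's actual code) of the early
# template-matching idea: compare an incoming waveform against stored
# reference waveforms, one per word, and pick the closest match.
import numpy as np

def normalized_correlation(a: np.ndarray, b: np.ndarray) -> float:
    """Similarity of two waveforms, in [-1, 1], over their common length."""
    n = min(len(a), len(b))
    a, b = a[:n] - a[:n].mean(), b[:n] - b[:n].mean()
    denom = np.linalg.norm(a) * np.linalg.norm(b)
    return float(a @ b / denom) if denom > 0 else 0.0

def recognize(waveform: np.ndarray, templates: dict[str, np.ndarray]) -> str:
    """Return the word whose stored template best matches the input."""
    return max(templates, key=lambda w: normalized_correlation(waveform, templates[w]))

# The fragility described above: any change in speaking rate, pitch, or
# surrounding context shifts the waveform, and raw correlation against a
# fixed template drops sharply -- which is why this approach broke down.
```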
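And here is a minimal sketch of the statistical approach: Viterbi decoding of an acoustic feature sequence against per-word hidden Markov models. The shapes and names are illustrative assumptions, not any production system's API. The emission scorer is deliberately pluggable, because the deep-neural-network breakthrough described above amounted to swapping the classic Gaussian emission model for a network's per-frame state scores while keeping the same decoder.

```python
# A minimal sketch of HMM-based recognition: score a feature sequence
# against each word's hidden Markov model and pick the best-scoring word.
import numpy as np

def viterbi_log_likelihood(frames, log_trans, log_init, emission_logprob):
    """Best-path log-likelihood of a feature sequence under one word's HMM.

    frames:           (T, D) acoustic feature vectors (e.g. MFCCs)
    log_trans:        (S, S) log transition probabilities between states
    log_init:         (S,)   log initial-state probabilities
    emission_logprob: fn(frame) -> (S,) log-score of the frame per state
    """
    scores = log_init + emission_logprob(frames[0])
    for t in range(1, len(frames)):
        # For each state, keep the best predecessor path, then add emission.
        scores = np.max(scores[:, None] + log_trans, axis=0) + emission_logprob(frames[t])
    return float(np.max(scores))

def recognize(frames, word_models):
    """word_models: word -> (log_trans, log_init, emission_logprob)."""
    return max(word_models, key=lambda w: viterbi_log_likelihood(frames, *word_models[w]))

# Classic emission scorer: one diagonal Gaussian per state (up to a constant).
def gaussian_emitter(means, var=1.0):          # means: (S, D)
    return lambda frame: -0.5 * np.sum((frame - means) ** 2, axis=1) / var

# The DNN breakthrough, in this framing: replace gaussian_emitter with a
# neural network trained on far more data, producing per-state scores for
# each frame -- the decoder above stays the same.
```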
Now, one of the problems that we've also been trying to solve for 60 years is machine translation. Again, in just the last few years, the combination of statistical techniques and big data has allowed us to do a much better job than we previously could of translating web pages and other kinds of information into other languages.

For example, today Bing Translate, which is Microsoft's translation system that came out of Microsoft Research, translates millions of pages a day for users into their native languages. It's an extremely heavily used service. Now, if I want to have what I'm saying translated into Chinese, we can take the text that comes from my voice and put it through a translation system. It really happens in two steps. In the first instance, we take the English and convert it, more or less word by word, into Chinese text. I think pretty soon we may see that up on this screen. So, what happens is we're basically taking the English text and pushing it through the translation system. We then have to reorder that text in Chinese, because the word order in Chinese is not the same as the word order in English, to produce something that begins to resemble what a Chinese speaker might say. (The first sketch at the end of this section illustrates that two-step process.) So, now we're taking the things that I'm saying and converting them into Chinese text.

Now, the last step I want to be able to take in this process is to actually speak to you in Chinese. The key thing there is that we've been able to take a large amount of information from many Chinese speakers and produce a text-to-speech system that takes Chinese text and converts it into spoken Chinese. Then we've taken an hour or so of my own voice and used that to modulate the standard text-to-speech system so that it would sound like me. So, what you see now is the result of that change: I'm speaking in English, and hopefully you will hear me speak in Chinese in my own voice. (A sketch of the full pipeline follows the translation sketch below.) Again, the results are not perfect. There are in fact quite a few errors. [spoken in Chinese]. There's much work to be done in this area. [spoken in Chinese]. So, this technology is very promising, [spoken in Chinese], and we hope in a few years that we'll be able to break down the language barriers between people. [spoken in Chinese]. Personally, I believe this is going to lead to a better world.
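For readers of this transcript, here is a minimal sketch of that two-step translation process in Python. The tiny dictionary and the single reordering rule are toy assumptions; a real statistical system learns phrase tables and reordering models from large parallel corpora.

```python
# A minimal sketch of the two-step process described above: (1) translate
# more or less word by word, (2) reorder to match target-language word order.

LEXICON = {"where": "哪里", "is": "在", "the": "", "station": "车站"}  # toy entries

def translate_word_by_word(english: str) -> list[str]:
    """Step 1: substitute each English word with a Chinese counterpart.

    Words mapped to "" (like "the") and unknown words are dropped here,
    purely to keep the toy example small.
    """
    words = english.lower().rstrip("?").split()
    return [LEXICON[w] for w in words if LEXICON.get(w)]

def reorder(tokens: list[str]) -> list[str]:
    """Step 2: rearrange the tokens toward Chinese word order.

    Toy rule: for "where is X" questions, Chinese says "X at where",
    which for this pattern is the exact reverse of the English order.
    A real system learns such reorderings from data rather than rules.
    """
    return tokens[::-1] if tokens and tokens[0] == "哪里" else tokens

print("".join(reorder(translate_word_by_word("Where is the station?"))))  # 车站在哪里
```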
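And finally, a sketch of how the three stages chain together. Every function and class here is a hypothetical stub, not a real API; the point is only the structure: the DNN-based recognizer produces English text, the two-step translator produces Chinese text, and a text-to-speech voice adapted from about an hour of the speaker's own recordings produces the audio.

```python
# A minimal sketch of the full speech-to-speech pipeline described above.
# Every component is a stub standing in for a large trained system.
from dataclasses import dataclass

@dataclass
class VoiceModel:
    """Stands in for a TTS voice adapted from ~1 hour of one speaker."""
    speaker: str

def recognize_speech(audio: bytes) -> str:
    return "where is the station"    # stub for the DNN-based recognizer

def translate(text: str) -> str:
    return "车站在哪里"              # stub for the two-step translator above

def synthesize(text: str, voice: VoiceModel) -> bytes:
    return f"<{voice.speaker}'s voice saying: {text}>".encode()  # stub for adapted TTS

def speech_to_speech(audio: bytes, voice: VoiceModel) -> bytes:
    """Chain the three stages: recognize, translate, synthesize."""
    return synthesize(translate(recognize_speech(audio)), voice)

print(speech_to_speech(b"...", VoiceModel(speaker="presenter")).decode())
```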