In October, Microsoft’s Chief Research Officer Rick Rashid demonstrated new speech recognition software to a packed auditorium in Tianjin, China.
As Rashid spoke, a computer flashed his words in English onto one screen and in Chinese onto another. Rashid paused while the computer translated what he had just said and played it back in Chinese, mimicking the sound of his own voice. The crowd’s reaction was euphoric.
(Credit: Shutterstock)
The technology Rashid showcased was made possible by a breakthrough in deep neural networks, an area of machine learning, or artificial intelligence, pioneered by CIFAR Fellow Geoff Hinton (University of Toronto) and his team of researchers. Their work is driving major improvements in computer vision, speech recognition and data mining, and is being adopted by global technology giants such as IBM, Microsoft and Google. A recent leap in the capabilities of deep neural networks, a field known as deep learning, has generated worldwide interest, prompting a feature story on the cover of the New York Times this past November.
In the 1980s, Hinton was among the first to show that it was possible to “train” artificial neural networks, so that a computer could accumulate information and “learn” to recognize patterns such as handwriting, images or sounds. These neural networks are mathematical models loosely inspired by the layered flow of information in the human brain: simple units take in signals, weight them, and adjust those weights whenever the network gets an answer wrong.
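The basic idea of “training” can be sketched in a few lines. This is a deliberately tiny, illustrative example (not any of Hinton’s actual models): a single artificial neuron whose weights start at zero and are nudged toward correct answers until it recognizes a simple pattern, here the logical AND of two inputs.

```python
def step(x):
    # The neuron "fires" (outputs 1) when its weighted input is non-negative
    return 1 if x >= 0 else 0

def train_neuron(examples, epochs=10, lr=0.1):
    w = [0.0, 0.0]  # one weight per input
    b = 0.0         # bias term
    for _ in range(epochs):
        for (x1, x2), target in examples:
            pred = step(w[0] * x1 + w[1] * x2 + b)
            error = target - pred
            # Nudge the weights in the direction that reduces the error
            w[0] += lr * error * x1
            w[1] += lr * error * x2
            b += lr * error
    return w, b

# The pattern to learn: output 1 only when both inputs are 1 (logical AND)
AND = [((0, 0), 0), ((0, 1), 0), ((1, 0), 0), ((1, 1), 1)]
w, b = train_neuron(AND)
for (x1, x2), target in AND:
    assert step(w[0] * x1 + w[1] * x2 + b) == target  # all four now correct
```

A deep neural network stacks many layers of such units, each layer learning patterns in the output of the layer below.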
In 2006, Hinton took this thinking to a new level with his groundbreaking work on deep learning. In a paper published in Science, Hinton and his team introduced a new training technique in which machines teach themselves from raw data, without supervision. Deep learning has substantially improved pattern recognition. “That particular breakthrough,” says Rick Rashid, “increased recognition rates by approximately thirty per cent. That’s a big deal.”
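What “without supervision” means can be illustrated with a toy example. This is not the 2006 algorithm itself (which used stacked restricted Boltzmann machines), just the underlying idea: the network is given no labels, only raw inputs, and teaches itself by learning to squeeze each input through a narrow bottleneck and reconstruct it.

```python
import numpy as np

rng = np.random.default_rng(0)
# Unlabeled data: 4-dimensional points that actually vary along one direction
direction = np.array([1.0, 2.0, -1.0, 0.5])
data = rng.normal(size=(200, 1)) * direction        # shape (200, 4)

# A linear "autoencoder": compress 4 numbers to 1, then try to rebuild them
W_enc = rng.normal(scale=0.1, size=(4, 1))
W_dec = rng.normal(scale=0.1, size=(1, 4))

def reconstruction_error(X):
    return float(np.mean((X @ W_enc @ W_dec - X) ** 2))

before = reconstruction_error(data)
lr = 0.01
for _ in range(500):
    code = data @ W_enc      # compress each point to a single number
    recon = code @ W_dec     # attempt to rebuild the original input
    err = recon - data
    # Gradient descent on the mean squared reconstruction error
    W_dec -= lr * code.T @ err / len(data)
    W_enc -= lr * data.T @ (err @ W_dec.T) / len(data)
after = reconstruction_error(data)
assert after < before  # the network improved using no labels at all
```

The features a network discovers this way can then serve as a starting point for recognition tasks, which is the sense in which unsupervised pre-training boosted speech recognition.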
At the same time that deep learning techniques were improving, so too was the processing power of computers. “There was bound to come a point that deep learning models would result in better speech recognition,” says Hinton. “We got lucky that computers have become fast enough that we could do a better job than the previous models.”
By deciding not to patent the technology, Hinton has encouraged other scientists to explore deep learning, which has also generated widespread enthusiasm among technology companies. In addition to the new speech recognition technology at Microsoft, Google is benefiting from Hinton’s work, using deep neural networks for the voice search function in its Android 4.1 operating system. The technology is likewise well-established in the speech labs of IBM, a long-time leader in the field.
Deep learning also has implications for managing large data sets. In October 2012, a group of Hinton’s graduate students won first place in a contest sponsored by the pharmaceutical company Merck, designing software that predicts which molecules are likely to be effective drug agents. The team used the competition to show that deep neural networks can deliver more accurate pattern recognition even in fields like health care.
CIFAR’s Learning in Machines & Brains program (formerly known as Neural Computation & Adaptive Perception) has been a driving force in this emerging field. “For the past ten years, my main group of intellectual peers was found in the CIFAR program, and thanks to this group, we have accelerated the development of deep learning by several years, and this has been tremendously valuable to a lot of people,” says Hinton.