Breakthroughs in machine translation using neural networks have improved our ability to translate words and sentences between many languages. Researchers say their techniques will soon surpass tools such as Google Translate.
Traditional machine translation trains computers with paired examples of text in both languages, keeping track of how a word or a short phrase in one language gets translated into another. These shorter pieces can then be combined into overall sentences. This approach works reasonably well given enough training pairs, but it ignores any underlying sense of meaning.
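The phrase-by-phrase approach described above can be illustrated with a minimal sketch. The phrase table, its counts, and the `translate` function are all invented for illustration; real systems learn much larger tables from paired text and score competing combinations far more carefully.

```python
# Toy phrase table: each source word or short phrase maps to candidate
# translations with counts, as might be tallied from paired training text.
# Every entry here is made up for illustration.
PHRASE_TABLE = {
    ("the", "cat"): {"le chat": 8, "la chatte": 2},
    ("the",): {"le": 5, "la": 4},
    ("cat",): {"chat": 9},
    ("sleeps",): {"dort": 7},
}


def translate(words, max_phrase_len=2):
    """Greedy left-to-right lookup that prefers the longest known phrase."""
    out = []
    i = 0
    while i < len(words):
        # Try the longest phrase first, then fall back to shorter ones.
        for n in range(min(max_phrase_len, len(words) - i), 0, -1):
            key = tuple(words[i:i + n])
            if key in PHRASE_TABLE:
                options = PHRASE_TABLE[key]
                # Pick the translation seen most often in the (toy) data.
                out.append(max(options, key=options.get))
                i += n
                break
        else:
            # Unknown word: copy it through unchanged.
            out.append(words[i])
            i += 1
    return " ".join(out)


print(translate("the cat sleeps".split()))
```

Nothing in this lookup represents what the sentence means; it only stitches together pieces it has seen before, which is the limitation the article points to.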
Researchers have been improving upon this traditional technique for about two decades, but recently they’ve taken a new approach using recurrent neural networks.
The technique uses an encoder and a decoder. The encoder reads a word, phrase or sentence and converts it into a set of numbers that represents its meaning. Two sentences that are different but mean the same thing would be represented in almost the same way, and sentences in two different languages with the same meaning would also be represented by a very similar set of numbers.
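One common way to check whether two sets of numbers are "very similar" is cosine similarity. The sketch below is only an illustration: the three vectors are invented by hand, whereas a trained encoder would produce much longer vectors, and the article does not say which similarity measure the researchers use.

```python
import math


def cosine_similarity(u, v):
    """Return the cosine of the angle between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)


# Hypothetical meaning vectors, invented for illustration only.
english_cat_sleeps = [0.9, 0.1, 0.3]      # "the cat sleeps"
french_chat_dort = [0.85, 0.15, 0.35]     # "le chat dort"
english_stocks_fell = [-0.2, 0.8, -0.5]   # "stocks fell today"

# Same meaning, different language: similarity close to 1.
print(cosine_similarity(english_cat_sleeps, french_chat_dort))
# Unrelated meaning, same language: much lower similarity.
print(cosine_similarity(english_cat_sleeps, english_stocks_fell))
```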
“You could imagine some kind of intermediate lingua franca, but it’s not in words, it’s in vectors,” says CIFAR Senior Fellow Yoshua Bengio (Université de Montréal).
Given a meaning representation, the decoder takes these numbers and produces a sequence of words in another language. If you are translating from English to French, the system uses the English encoder to convert to the numerical common language, then the French decoder to produce the translation.
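The encoder and decoder pipeline can be sketched with a tiny recurrent network. This is a structural illustration under loud assumptions: the vocabularies, the hidden size, and all function names are hypothetical, and the weights are random and untrained, so the output words are meaningless. It is not the researchers' model, which is far larger and is trained on paired text.

```python
import math
import random

rng = random.Random(0)  # fixed seed so the untrained toy is reproducible
HIDDEN = 4              # size of the "meaning" vector (assumed, tiny)

EN_VOCAB = ["the", "cat", "sleeps"]
FR_VOCAB = ["<eos>", "le", "chat", "dort"]


def rand_matrix(rows, cols):
    return [[rng.uniform(-1, 1) for _ in range(cols)] for _ in range(rows)]


def matvec(matrix, vector):
    return [sum(w * x for w, x in zip(row, vector)) for row in matrix]


def one_hot(index, size):
    return [1.0 if i == index else 0.0 for i in range(size)]


# Randomly initialised weights; training would adjust all of these.
ENC_IN = rand_matrix(HIDDEN, len(EN_VOCAB))
ENC_REC = rand_matrix(HIDDEN, HIDDEN)
DEC_IN = rand_matrix(HIDDEN, len(FR_VOCAB))
DEC_REC = rand_matrix(HIDDEN, HIDDEN)
DEC_OUT = rand_matrix(len(FR_VOCAB), HIDDEN)


def rnn_step(w_in, w_rec, x, h):
    """One recurrent update: new_state = tanh(w_in @ x + w_rec @ h)."""
    a = matvec(w_in, x)
    b = matvec(w_rec, h)
    return [math.tanh(p + q) for p, q in zip(a, b)]


def encode(words):
    """English encoder: read words one at a time into a fixed-size vector."""
    h = [0.0] * HIDDEN
    for word in words:
        x = one_hot(EN_VOCAB.index(word), len(EN_VOCAB))
        h = rnn_step(ENC_IN, ENC_REC, x, h)
    return h


def decode(meaning, max_len=5):
    """French decoder: emit words one at a time, starting from the vector."""
    h = meaning
    prev = one_hot(0, len(FR_VOCAB))  # index 0 doubles as the start symbol
    out = []
    for _ in range(max_len):
        h = rnn_step(DEC_IN, DEC_REC, prev, h)
        scores = matvec(DEC_OUT, h)
        best = scores.index(max(scores))  # greedy choice of the next word
        if FR_VOCAB[best] == "<eos>":
            break
        out.append(FR_VOCAB[best])
        prev = one_hot(best, len(FR_VOCAB))
    return out


meaning = encode(["the", "cat", "sleeps"])
print(meaning)          # the numerical "common language"
print(decode(meaning))  # arbitrary French words, since nothing is trained
```

The point of the sketch is the shape of the computation: the only thing passed from the English side to the French side is the vector `meaning`.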
Bengio and his colleagues, working in parallel with colleagues at Google who are also applying recurrent networks to machine translation, have recently produced a series of papers on improvements to this technique. One recent breakthrough overcame a problem with translating long sentences by incorporating an ability to pay attention to one element at a time.
“Let’s say I ask you to translate a whole page from one language to another one, you wouldn’t try to understand the whole page in your head and then try to spit out the translation,” he says. “After having read the original French text, as you produce translated words in English, you would go over appropriate segments of the French text to guide your translation as you do it.”
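The idea of going back over the relevant parts of the source while producing each translated word can be sketched as a weighted average over the encoder's per-word states. This sketch makes simplifying assumptions: it scores relevance with a plain dot product rather than whatever scoring function the researchers' papers use, and the states and the names `softmax` and `attend` are invented for illustration.

```python
import math


def softmax(scores):
    """Turn arbitrary scores into positive weights that sum to 1."""
    top = max(scores)  # subtract the max for numerical stability
    exps = [math.exp(s - top) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def attend(decoder_state, encoder_states):
    """Weight each source position by its relevance to the decoder state."""
    # Relevance score: dot product (a simplification assumed here).
    scores = [
        sum(d * e for d, e in zip(decoder_state, enc))
        for enc in encoder_states
    ]
    weights = softmax(scores)
    dim = len(encoder_states[0])
    # Context: the weighted average of the source states.
    context = [
        sum(w * enc[i] for w, enc in zip(weights, encoder_states))
        for i in range(dim)
    ]
    return weights, context


# Hypothetical states: one vector per source word, plus the decoder's
# current state while it produces one target word.
encoder_states = [[1.0, 0.0], [0.0, 1.0], [0.5, 0.5]]
decoder_state = [3.0, 0.0]

weights, context = attend(decoder_state, encoder_states)
print(weights)  # the largest weight falls on the first source position
print(context)
```

Because the weights are recomputed for every output word, a long sentence no longer has to be squeezed into a single fixed-size vector before translation begins.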
Bengio and his team have also collaborated with CIFAR Senior Fellow Richard Zemel (University of Toronto) to incorporate attention into a model for generating image captions from scratch.
Improving translation allows scientists to study one of the big challenges of artificial intelligence: natural language understanding. “It’s a task that requires semantic understanding,” Bengio says. “It needs to really understand what the person’s intent was.”
It is also a way to advance the applications of neural networks, which have already vastly improved image and speech recognition. Bengio says the successes of his team’s models contradict a long-held view in the AI community that neural networks could never handle linguistic meaning.
“A few decades ago, people in AI thought that it would be ridiculous to think that neural networks would be able to do a good job at something like machine translation,” he says. In fact, his team’s models are on par with the best existing methods and improving rapidly.
“It’s almost sure that within a year we’ll be substantially above the state of the art.”