The machine’s own language



30 Nov 2016

The Google Brain Artificial Intelligence (AI) has taught itself to translate between two languages without being directly trained on that pair. For example, training the AI to translate between Korean and English and between English and Japanese equips it to translate from Korean to Japanese, a challenge for machines so far. Translation accuracy ordinarily demands massive amounts of data, and the computational cost is significant; this advancement has the potential to reduce both burdens.

What is the latest advancement in Google's translation capabilities?

  • The Google Brain AI has developed, all by itself, an internal shared representation – an interlingua – that lets it translate between language pairs it was never explicitly taught.
  • According to a post on the Google Research Blog, if the neural network is taught to translate between Japanese and English and between Korean and English, it can figure out how to translate between Japanese and Korean. This method has produced reasonable results. Google calls it zero-shot translation.

The post says this is the first time – to the best of their knowledge – that zero-shot translation has been successfully realized in machine translation.

  • On observing the zero-shot translation behavior, researchers wondered how the machine was going about it. A study of the underlying mechanism revealed that the system was not merely memorizing phrase-to-phrase translations; it was storing information about the meaning of the sentence. Researchers inferred this must be due to the existence of an interlingua.
  • The system uses the information it has learnt to work out what it has not been taught. It does this by creating a common representation in which sentences with the same meaning are represented in a similar way irrespective of language.
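The mechanics can be sketched simply. In the multilingual system Google describes, a single model serves every language pair, and the desired target language is signalled by an artificial token prepended to the source sentence. The sketch below is purely illustrative (the function name is invented; the `<2xx>` token format follows the convention described in Google's write-up):

```python
def prepare_input(sentence: str, target_lang: str) -> str:
    """Prepend an artificial target-language token (e.g. '<2ja>').

    A single multilingual model reads this token and produces output in
    the requested language; no separate per-pair model is needed.
    """
    return f"<2{target_lang}> {sentence}"

# The same sentence can be routed to any target language the model knows:
print(prepare_input("How are you?", "ja"))  # <2ja> How are you?
print(prepare_input("How are you?", "ko"))  # <2ko> How are you?
```

Because every pair shares one set of parameters, sentences with the same meaning end up close together in the model's internal representation, which is what makes zero-shot pairs possible.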

In its post dated November 22, Google said the Multilingual Google Neural Machine Translation system was already available to Google Translate users, that multilingual systems served 10 of the recently launched 16 language pairs, and that they provided superior quality on a simplified production architecture.

Why is this remarkable even for Google?

  • While Google Translate had definitely got past the stage of funny translations that made one grin or grimace, the tons of data required to improve accuracy have been a drawback. For any two languages, say English and Tamil, one needed to feed in English sentences along with their acceptable Tamil versions.
  • Google Translate works on over 140 billion words daily. At the backend, this means building and maintaining several different systems to accomplish translation between any two languages. The computational cost is huge.
  • This new discovery means that, to accomplish a Tamil to Kannada translation, one does not have to feed in separate Tamil-Kannada data if the system already knows how to translate from Tamil to English and from English to Kannada through the appropriate data sets.
  • While the task may seem fairly easy for humans, it is a great leap forward for machines. For the first time, an AI has shown that it can perform a translation task without being directly trained on the language pair beforehand.

Most importantly, the AI built on its self-learnt translation skills to create its own interlingua.

When did Google adopt neural machine translation?

Google introduced Google Neural Machine Translation (GNMT) this September to handle inter-conversion between Chinese and English, a daunting pair for machines. The new system was deployed on the mobile and web platforms, and Google computer scientists said it reduced errors by 60 percent.

  • Before this, Google used Phrase-Based Machine Translation, which split sentences into words and phrases for translation. GNMT differed by taking the entire sentence as its input. Accuracy improved when engineers made it pick out and handle rare words separately. Bilingual humans continuously gave feedback on the new system.
  • GNMT processes 18 million requests daily, and the team managed to tune it for optimal performance. They designed customized chips for TensorFlow – Google's open source framework for machine learning – to handle the massive computations and produce the target-language results rapidly.
  • GNMT provided translation results comparable to those achieved by humans, as bilingual speakers were employed to coach the system.

Where did this lead?

In November, in the biggest update so far, GNMT's capabilities were expanded to handle eight more language pairs. With French, German, Japanese, Korean, Portuguese, Spanish and Turkish, GNMT could handle 35 percent of the translation requests on Google.

  • Google wanted to extend these capabilities to all the 103 languages Google Translate handled. While this would substantially improve translation quality, scaling was a daunting challenge: the AI would have to work with 103² = 10,609 models, one per language pair, each fed with its own data.
  • To address this issue, Google made a single system translate between multiple languages (a multilingual system) by sharing translation knowledge between language pairs. The blog post on this multilingual system explained, “This transfer learning and the need to translate between multiple languages forces the system to better use its modeling power.”
  • This fuelled the curiosity of the engineers and led them to ask whether their system could translate between language pairs it had never been exposed to.
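The scaling arithmetic above is easy to verify: with one model per directed language pair, the number of models grows quadratically with the number of languages, while the multilingual approach keeps it at one.

```python
# One model per (source, target) pair vs. a single shared multilingual model.
languages = 103
pairwise_models = languages ** 2   # 103 x 103 directed pairs, as in the post
multilingual_models = 1            # one shared system covers them all
print(pairwise_models)             # 10609
```

This quadratic blow-up is the practical reason a single shared system is worth the added modelling difficulty.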

The answer (Yes, indeed!) has been elaborated in the What section of this knapp.

Who benefits most from machine translators?

This section focuses on Google Translate.

  • Users of Google Translate can enable translation at the tap of a widget: the annoyance of copy-paste is eliminated, and translations of chat texts, comments, articles and lyrics are obtained much more easily. Tap to Translate was due to arrive in India, Thailand, Indonesia and Brazil.
  • The offline mode of Google Translate on Android and iOS phones is a lifesaver for travellers in far-off places where the Internet connection is unreliable. As of May this year, it supported 52 languages. Combined with Word Lens – which covered only 7 languages as of January last year and 29 by this May – translating foreign text on road signs or any other printed matter became as simple as holding the phone in front of the text.
  • Business owners can overcome language barriers and find out crucial information to evaluate the potential of their product in different countries by using Google Global Market Finder.

Google has said that 90 percent of Translate users are outside the US. To sustain the app, therefore, it is necessary to increasingly cater to the requirements of this huge user base.

How did Google Brain’s AI demonstrate its capabilities recently?

The neural networks showed they could invent their own encryption strategy and get better at it with practice. Researchers at Google Brain began by experimenting with three networks, named Alice, Bob and Eve.

  • Each network was trained for a specific role: Alice was the sender, Bob the receiver and Eve the eavesdropper. Alice had to make sure her message was understandable only to Bob, while Eve had to try and foil her plan. Alice and Bob shared a key that only the two of them knew.
  • Despite poor performance in the initial attempts, with practice Alice improved her encryption strategy, and Bob could decipher the message while Eve could guess only 8 of the 16 bits forming it.
  • While this machine-invented encryption is very basic, is yet to be fully understood and cannot be vouched for in practical applications, it has excited researchers.
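Eve's score of 8 out of 16 bits is telling: it is exactly what blind guessing achieves on average, which means the ciphertext leaked essentially no information to her. A quick simulation (illustrative only; this is not the Google Brain experimental setup) confirms the chance baseline:

```python
import random

random.seed(0)
BITS, TRIALS = 16, 10_000

def bits_guessed_correctly() -> int:
    """Guess a random 16-bit message with no information at all."""
    message = [random.randint(0, 1) for _ in range(BITS)]
    guess = [random.randint(0, 1) for _ in range(BITS)]
    return sum(m == g for m, g in zip(message, guess))

average = sum(bits_guessed_correctly() for _ in range(TRIALS)) / TRIALS
print(average)  # close to 8: half the bits, i.e. pure chance
```

In other words, an eavesdropper who matches the guessing baseline has effectively been shut out.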

For in-depth analysis of many such topics, download Knappily.  KNAPPILY is a must-have app for anyone who wants to know more, to know better and to know faster.

Tags | AI Artificial Intelligence Google Google Brain translation