Various Google products, such as Search and Assistant, are already available in India in several national languages. The company is now turning to new AI to potentially make more of its offerings available to Indian language speakers. More specifically, it is using a technology called MuRIL.
At its virtual event today, the Big G introduced a new language model called Multilingual Representations for Indian Languages (MuRIL). This is the first model to support the interaction between 16 different Indian languages.
The Indian languages covered are Assamese, Bengali, Gujarati, Hindi, Kannada, Kashmiri, Malayalam, Marathi, Nepali, Oriya, Punjabi, Sanskrit, Sindhi, Tamil, Telugu and Urdu. The model also supports English.
While MuRIL is based on Google's own BERT (Bidirectional Encoder Representations from Transformers) model, researchers claim that it is more efficient for Indian languages.
Partha Talukdar, a researcher at Google India, said the new model would better understand the context of statements in local languages.
For example, the previous model read the following Hindi statement as expressing a negative emotion: “Accha hua account bandh ho gaya” (It is good that the account has been closed). The new model, however, correctly predicts that the statement is positive.
Users in India often type local languages on an English-language keyboard, as in the sentence above. To handle this, researchers have added support for recognizing transliterated text, meaning local languages written in the Roman script.
Google is making this model open source for other researchers and startups.
Currently, MuRIL is not built into any Google product. However, with contributions from researchers and programmers, the model is expected to be added to the company's offerings in the future to improve accuracy.
You can learn more and read the MuRIL code here.
Published on December 17, 2020 – 06:04 UTC