Meta’s new AI model can translate & transcribe nearly 100 languages

The company has publicly released SeamlessM4T under a research licence to allow researchers and developers to build on it.

By: Tech Desk
New Delhi | Updated: August 24, 2023 01:05 PM IST

In a truly globalised world, language should not be a barrier. That idea appears to be behind Meta's new AI model, SeamlessM4T, which can translate and transcribe nearly 100 languages across text and speech. Meta describes it as the first all-in-one multilingual, multimodal AI translation and transcription model.

With its latest model, Meta aims to take interconnectedness to another level, offering users access to more multilingual content. “Today, we’re introducing SeamlessM4T, the first all-in-one multimodal and multilingual AI translation model that allows people to communicate effortlessly through speech and text across different languages,” reads the post on Meta’s official website.

Meta has publicly released SeamlessM4T under a research licence to allow researchers and developers to build on it. SeamlessM4T supports speech recognition for nearly 100 languages and speech-to-text translation for nearly 100 input and output languages. It also supports speech-to-speech translation from around 100 input languages into around 35 output languages, including English.

Built on a vast dataset

The company also said it is releasing the metadata for SeamlessAlign, its biggest open multimodal translation dataset to date, which totals 270,000 hours of mined speech and text alignments.

“Building a universal language translator, like the fictional Babel Fish in The Hitchhiker’s Guide to the Galaxy, is challenging because existing speech-to-speech and speech-to-text systems only cover a small fraction of the world’s languages. But we believe the work we’re announcing today is a significant step forward in this journey,” reads the post.

Meta claims that, compared with approaches that chain separate models, SeamlessM4T’s single-system approach reduces errors and delays, improving the efficiency and quality of translation. The company says this should make it easier for people who speak different languages to communicate effectively.

What is SeamlessM4T built on?

Meta said the new AI model builds on advances that it and others have made over the past few years in a bid to create a universal translator. In 2022, the company released its text-to-text machine translation model, No Language Left Behind (NLLB), which supports around 200 languages. NLLB has since been integrated into Wikipedia as one of its translation providers.

The company has also demonstrated its Universal Speech Translator, the first direct speech-to-speech translation model for Hokkien, a language that does not have a widely used writing system. In 2023, it unveiled Massively Multilingual Speech, a model that offers speech recognition, speech synthesis, and language identification for over 1,100 languages.

According to Meta, SeamlessM4T draws on the findings from all of these projects, which allows it to offer multilingual and multimodal translation from a single model.