When the pandemic was at its peak, Varshul Gupta and Anuja Dhawan, co-founders of Dubverse, saw a huge problem: video-based educational content was available only in English. The lack of local-language support was a barrier for non-native English speakers. That was when Gupta and Dhawan thought of an AI-based dubbing platform that would allow creators to dub a video in as many as 30 languages in real time.
“AI was primarily limited to labs and institutes two-three years back but we thought about how we should use AI to solve real-life problems,” says Dhawan, 30, explaining the idea behind Dubverse, which started with an educational focus but now caters to any visual creator who wants to dub a video in the language of their choice.
Dubbing has long been associated with movies. Studios and production houses have relied on it to release their films in other widely spoken languages and reach a wider audience. Dubbing, or voice-over, is a process in which the original spoken script is replaced with words in another language. The challenge is making each line of dialogue sound fluid, with the right emotions, so that it blends well in the new language.
Historically, dubbing content has been a complex task with multiple layers. It requires a narrator, a scriptwriter, a studio, and a video editor. The process is long and expensive, and that is before the cost of the narrator, whose fee can vary between Rs 20,000 an hour and Rs 5,00,000 an hour depending on the experience and the type of work the dubbing artiste has done over the years.
But as demand for high-production videos increases, dubbing a video in multiple languages presents a major challenge, especially for smaller content creators with limited budgets.
Dhawan’s Gurgaon-based startup wants to bring dubbing from movie production studios to independent creators and news publications, who may not want to get into the complex production process required to make their videos available in multiple languages. “The language barrier is real,” Dhawan says. “What we are trying to do is to make any video that’s created from a 10-second TikTok to a minute-long form YouTube content can be very easily dubbed into multiple languages with a single click of a button,” she adds.
The concept of AI-generated voices is not new. AI voices have been used in e-books, games, voice assistants, and branding exercises for years. What Dhawan’s Dubverse platform does differently is make digital voices sound as if a human were narrating. “We realised that the problem is when an AI speaker speaks, it sounds flat, monotonous and not engaging to the point that consumers never consume the content,” she says. “We went back to the drawing board and said ‘How can we improve this?’ That’s when we thought we needed to improve these voices to make them more human-like.”
Gupta and Dhawan alongside their team. (Image credit: Dubverse)
Dubverse represents a big shift: content creators now have the option to dub their videos in voices that sound less robotic and more human-like, without the need for an actual narrator. Dhawan’s company uses a combination of humans and artificial intelligence to create voices that sound like real people. “We take a human voice, we clone it, and create an AI avatar out of it,” Dhawan says when asked how the company’s machine learning algorithms and software produce audio that sounds more human than typical synthetic voices. The idea is to retain the original voice while speaking a new language with a local accent and dialect.
AI dubbing platforms like Dubverse promise a seamless way to dub a video in a language of choice. All the user needs to do is paste a video link (say, a YouTube video) into the platform, select the language the video should be converted to, and select a speaker; the video is then available in the new language. The company wants to make dubbing a video in multiple languages “accessible” so that creators can reach the scale they want for their content. Think of a creator who runs multiple YouTube channels and wants the content to be available not only in Hindi or Bengali but also in Spanish, German and Korean.
Dhawan says her company has a library of 30 human voices on which its AI models are trained, each replicating a certain accent or voice. Despite the use of AI-powered voice-over software for dubbing, Dhawan says the role of humans is critical because the software is trained using real people. Although a voice generated by AI-powered software still can’t fully match how an actual human voice sounds, Dhawan says she has seen results of between 90 and 95 per cent, though that varies from language to language.
Dubverse has a database of 500 million words across different languages. (Image credit: Dubverse)
“AI still needs assistance. It’s not 100 per cent accurate,” Dhawan admits. Although AI-based voice-overs have come a long way, Dhawan says the company has designed a mechanism through which a creator can request that a dubbed video created on the platform be reviewed by an expert who speaks the language before the content is posted online. “We definitely make sure that there are humans in the loop and make AI more assistive in nature so that the end result is more trustworthy to be put out in anybody’s name.”
The primary argument that puts AI dubbing platforms in the dock is whether the voice generated by the software can match that of a narrator, with real emotions and little nuances, without loss of quality or meaning. “I would say the only answer to this is the data. If you’re able to give me your happy voice, your excited voice, your sad voice and your criminal voice in a certain way, then I’m able to replicate all of that,” says Dhawan.
But Dhawan says getting the emotions right also depends on the type of content being presented. The emotional quotient, she says, matters more when the narration is done by a celebrity. For now, though, AI dubbing is best suited to “informational” content like e-learning, training and product explainers. Dhawan’s startup has worked with Reliance and ITC in the past, and Mahindra and Kotak will soon start using Dubverse internally.
The primary complaint about AI dubbing, regardless of the language being dubbed, is that artificial intelligence will take the jobs of human narrators and voice artists. “AI is not replacing humans,” Dhawan says. For her, AI dubbing is a way to get rid of the repetitive tasks a human narrator has to do. It could be as basic as visiting a studio multiple times to record a voice when that is not really required. “That’s a lot of added cost,” she adds.
Although Dubverse went live only in January, opening to the public after a closed beta of almost a year, the platform has garnered over 20,000 users. The startup is banking on subscriptions to increase the user base and bring in more revenue. The subscription plans start at Rs 3,000 a month and go up to Rs 60,000 a month. The multiple pricing tiers help attract a variety of creators, including YouTubers, course creators and news publishers, but the company is also focusing on bringing bigger organisations on board.