Internet in India has changed drastically over the last two years. The data wars that have ensued have made prices of mobile broadband plummet, allowing easier and more affordable access for many. India’s internet population is around 450 million users, and over 390 million of these are active users. But, according to search giant Google, what’s driving online consumption is the growth of regional languages on the Internet and increasing video consumption.
“India today has about 400 million users, and 250 million of them only access the internet in their own language. So really, Indian language internet users are pretty large already. In fact, there are more non-English users on the Indian internet today, than there are English users,” Rajan Anandan, Google India’s Managing Director, tells indianexpress.com in an interaction.
The prediction is that this 250 million number will get to 500 million Indian language internet users by 2021. “I think this idea of India’s internet was only English, that was four years ago, that India’s non-English internet is only Hindi users was three years ago. Today, it’s every language that is exploding,” points out Anandan.
As Google sees it, growth is coming from tier two, three and four cities and rural areas; metros are not the only driving force anymore. According to Anandan, “Many more women are coming online in rural areas. We think by 2024, 45 per cent of the internet user base in India will be women. It’s not just about millennials anymore.”
Online content in Indian languages
However, text-based content in regional languages is an area that’s still lacking. That’s where Google is hoping its new ‘Navlekha’ platform will help out. “If you look at internet in Indian languages, in non-video, which is text, only 1 per cent of the content is in Indian languages. As much as 99 per cent is still English, that is where the big gap lies,” he explains.
The problem is not that India has a shortage of written text-based content in regional languages. That content is just not on the internet. Navlekha plans to change all of that by letting regional-language publishers convert their PDF-based content into a web presence.
“Navlekha lets you take a PDF in whatever language you want and converts it into a web presence,” explained Anandan. The idea is to reach out to magazine and book publishers and help them bring their regional content online.
“Let’s say I’m a Tamil magazine publisher, and I put out a print publication. It could be just a magazine, 10 pages once a month. So essentially, I have a PDF document…what this lets you do is take that PDF, and literally within a minute, make it into a web presence,” he points out.
Google is hoping that it will make putting content in Indian languages online more seamless.
Another area where Google is seeing growth in online consumption is video. In fact, according to Anandan, video consumption is growing more in rural areas. “It’s not an urban phenomenon (video consumption). It’s not a rural phenomenon. It’s an all of India phenomenon,” he highlights.
This also explains why Google is getting ready to launch its YouTube originals in India starting with a musical talent show with AR Rahman.
YouTube and the growth of regional content
“YouTube has seen rapid growth in India in the last two years. The daily active user base is growing 100 per cent yearly. More than 75 per cent of the consumption is happening on mobile. What’s more is that our users are engaging multiple times a day. This is not just an active user, but also a connected user, because they are subscribing to channels as well and want to watch more videos from these in the future,” Satya Raghavan, Head of Entertainment, YouTube in India told indianexpress.com in a telephonic interaction.
According to a KPMG study, close to 95 per cent of video content consumed online in India is in regional languages. While YouTube did not give numbers on the kind of views regional content is raking in on its platform, it has 245 million unique users per month, according to official Comscore numbers.
“Regional languages across the board are growing on YouTube,” Raghavan said, though he refused to share numbers on which languages attracted the highest views. However, he insisted that it would be wrong to see Hindi as the only dominant Indian language, as others like Tamil, Telugu, Kannada are also growing.
He also pointed out that English content or creators who speak in English are no longer the only ones doing well. “In 2014, we had 16 YouTube channels with more than 1 million subscribers in India. Now, over 300 channels have passed that milestone, and we’re adding two more almost every single week,” he added.
Many of these channels with more than 1 million subscribers in India have content in regional languages. In fact, T-Series in India has close to 63 million subscribers, which makes it the second biggest channel in the world, just after PewDiePie, which has 66 million.
He also highlighted that because data is no longer a constraint, users are much more engaged on the platform. “In 2015, we saw comedy, food and music content starting to grow. In South India, regional language content started growing. In 2016, tech channels in India grew and we have also seen the growth of regional content in Punjabi, Gujarati. By 2017-18, we were seeing all verticals grow, and the ecosystem has become much wider,” he explains.
Given the diversity of content, it would be fair to say YouTube has in fact become one of the easiest sources for accessing video in Indian languages across a variety of categories, which is something no other platform can offer, at least not on this scale.
Challenges in translation, the future of voice
Still, for Google, getting more regional-language content online is just one part of the problem. With the growth of voice-based services, and the company’s growing focus on Google Assistant, the challenges are still very much there in India.
According to Anandan, the newer users coming online in India would much rather speak to the internet, than type or tap, which also explains why voice-based queries are growing.
“We have 270 per cent growth in search queries on a yearly basis in India. In the last ‘Google for India’, we said that 28% of search queries on Google search app are voice-based so you can imagine that percentage is going up much, much higher,” Rajan said. Last year, Google had revealed that Hindi voice search queries had grown by 400 per cent in India.
While he admits that when it comes to translation there is still scope for improvement, he is confident that Google’s machine learning capabilities will ensure that voice-based search in regional languages will only get better over time.
“We have spent a huge amount of time, you know, getting the machines to learn, relied on lots of data. At least in the top 11 Indian languages today, being able to translate into those languages we think is actually working pretty well,” he said.
Monetisation in regional languages
While the majority of India’s future internet citizens will rely on their own languages to access the web, for online publishers, especially those venturing into vernacular languages, the challenges to monetisation are still there.
A recent Internet and Mobile Association of India (IAMAI) and Kantar IMRB report said that by December 2018, digital advertising spending in India would be close to Rs 12,046 crore, and this is clearly an avenue which will continue to grow in double digits.
The importance of serving ads in local languages is not lost on Google, which has opened up AdWords in some of these languages. “Look I do think that monetisation lags. On our part we are making sure that you can make AdWords available in other languages. Last year we had announced AdWords in Hindi, then we added Bengali, now we’ve added Tamil, Telugu,” explains Anandan.
“If I’m a user, and I don’t speak English, I’m watching some content on and you show me ads in English, it is probably not going to be very effective because I can’t really figure out what you’re saying,” he said.