‘Digital colonisation’ can mean many things, especially in the context of the ‘Indian Internet’. Unlike China’s, India’s web space is dominated by American companies such as Amazon, Google and Facebook. But there is another aspect to this digital colonisation: the dominance of the English language over India’s internet, at the cost of local languages. This dominance is surprising considering that English is not spoken by everyone in the country, though it commands a clear aspirational value.
“There is a clear discriminatory treatment against Indian languages. In traditional media, over 90 percent of engagement happens in local languages. The reason is equality of access. However, on the internet that is not the case in India; the phones, the user interface, everything has been designed from the ground up for ease of use in the English language. It does not take into account the usability aspect for Indian language,” Arvind Pani, Co-Founder & CEO at Reverie Language Technologies, told indianexpress.com.
Bengaluru-based Reverie Language Technologies has been building Indian-language solutions for businesses across sectors in India since 2009. The company was acquired by Reliance Industries Investments & Holdings in 2019.
According to Reverie’s founders Arvind Pani and Vivekanand Pani, the Indian internet is far from being an inclusive space given the preference towards English, which has come at the expense of regional Indian languages.
They argue that historical blunders have ensured that regional Indian languages are not well represented on the Indian internet, a situation in complete contrast to countries like China, Japan or even European nations, where the native languages dominate. For instance, if you buy a laptop in Japan, the keyboard layout will typically be in native Japanese. In India, most laptops have an English QWERTY keyboard. And that is just one example of the dominance English has in Indian computing.
The existing problems for Indian languages
“Our Indian language standards are getting defined in the US, by companies who are in the US. They are represented by people who don’t understand the nuances of Indian languages, and we are expected to go and adopt standards laid out by them,” Arvind pointed out.
He said the problems created by Unicode, the universal character encoding standard maintained by the Unicode Consortium, are partly to blame. For one, he said, Unicode ended up introducing an additional eight to 10 characters for the Hindi script in India, which creates a whole new set of problems.
“Students do not study these in schools in India. So when they use the internet, they are left confused. It would be exactly like if someone told us English does not have 26 characters, but 32 characters,” he explained.
“Unicode is hampering representation for Indian languages. We have challenges with encoding, display, search and input methods (which is typing),” pointed out his co-founder Vivekanand Pani.
“The display does not consider fundamental properties of Indian regional languages. People would want to use their native scripts. But even on the phone there is a barrier to be able to type. Technology has not made it easier. It has actually made it difficult,” he added.
The issues, as laid out by Reverie’s founders, relate to the encoding of Indian languages, to display and fonts, and to input methods.
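To make the encoding and display point concrete, here is a minimal Python sketch of how a Devanagari word is stored under Unicode. It illustrates general Unicode behaviour rather than the specific objections raised by Reverie’s founders; the sample word and the helper name `describe_code_points` are chosen here purely for illustration.

```python
import unicodedata

def describe_code_points(text):
    """Return (code point, official Unicode name) for each stored character."""
    return [(f"U+{ord(ch):04X}", unicodedata.name(ch)) for ch in text]

# The word "Hindi" in Devanagari, written with escapes so the stored
# sequence is unambiguous regardless of fonts or editors.
word = "\u0939\u093f\u0928\u094d\u0926\u0940"

points = describe_code_points(word)
for code_point, name in points:
    print(code_point, name)
```

The word is stored as six code points, including a separate vowel sign and a virama (the sign that joins consonants), and it is the font and the rendering engine that combine them into the shapes a reader sees on screen. That gap between what is stored and what is displayed is where search, display and typing behaviour depend on the software handling the text.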
For instance, they point out that while the English keyboard has a set standard format that users have come to expect, for Indian regional languages each brand decides its own layout, which adds to general confusion for users.
The historical blunder
According to Vivekanand Pani, the historical blunders committed around the encoding of Indian languages are yet to be corrected. He pointed out that efforts to encode Indian languages started in India at IIT Kanpur as far back as 1970. In 1988, India introduced its very own Indian Script Code for Information Interchange (ISCII).
But the Unicode Consortium took what he calls a very arrogant stance when adopting the rules from ISCII. “Americans had no knowledge of how to encode Indian languages. Unicode ignored the full standard of complete guidelines, properties of Indian languages. They only picked superset characters from the ISCII document. So there were characters that might not belong to Devanagari, but might belong to Tamil, etc, were introduced. This corrupted the encoding,” he argued.
The problems were compounded further when Microsoft and Windows entered India. “Indian language computing started in the 1970s. India started adopting these technologies in 1988. But when Windows was introduced at the time, the OS, the architecture was in such a way that the display was controlled by them. You could not override the font rendering engines. They introduced OpenFont, which was terribly complex, Latin-based,” Vivekanand said.
In his view, the system introduced by Microsoft and others did not “understand Indian language properties,” leading to “absurd writing” being created in regional languages. “Even today we have abysmal font choices for Indian languages or no choices at all,” he lamented.
According to Vivekanand, India needs to set the rules on Indian languages and their encoding standards, which should be followed by all players. “We need to lay the right infrastructure, that is India’s responsibility. Anyone doing business in India should have to follow these standards,” he said.
Right now, as he sees it, Indian languages are at the mercy of the operating system provider, be it Microsoft, Google or Apple. In his view, ISCII had evolved a font standard for Indian languages, and if that were implemented, it would become simpler for font designers to work in Indian languages on the internet.
When asked why China and Japan got it so right with their policies on native language encoding for computing, Vivekanand said that in those countries the decision makers were always more comfortable in their native language. “In India, the legacy systems of the British meant that those who were deciding for the rest of the country always preferred English,” he said.
In his view, India made absolutely no effort to introduce its own standards. The end result was that people did not have a choice, and for many now, the internet is in English only. “So naturally we are two decades behind English,” he said.
According to Arvind, “For Indian languages, India needs to take charge. If it does not, the problem is going to get compounded if more and more content is added based on current infrastructure.”