Every character that we type on a keyboard has a unique code number. These numbers are encoded under a standard maintained by the Unicode Consortium, which also classifies characters according to the scripts they belong to. For example, the letter ‘A’ has the code number U+0041 and is listed as “Latin capital letter A” in the “Basic Latin” code chart. On the other hand, the first vowel of the Assamese alphabet is U+0985. The issue is, Unicode defines that symbol as “Bengali letter A”.
Indeed, the Bengali script uses the same symbol for its first vowel. The two scripts share a large number of characters, some pronounced the same way, others denoting different sounds. In Unicode’s charts, these shared forms are defined as Bengali characters. Only a few characters exclusive to Assamese are listed as “additions for Assamese” — in the chart for Bengali.
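The naming described above can be checked with Python's standard `unicodedata` module. This is a minimal sketch; the helper name `describe` is ours, not part of any standard.

```python
import unicodedata


def describe(ch: str) -> str:
    """Return the code point and formal Unicode name of one character."""
    return f"U+{ord(ch):04X} {unicodedata.name(ch)}"


print(describe("A"))       # U+0041 LATIN CAPITAL LETTER A
print(describe("\u0985"))  # U+0985 BENGALI LETTER A
```

The second line is the first vowel shared by Assamese and Bengali; its formal name mentions only Bengali.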
For years, prominent Assamese individuals, now backed by influential organisations and the state government, have been campaigning for recognition of the Assamese script as separate. A delegation attended a recent Unicode conference in London, presented a proposal by the Assam government, and returned last week with the news that ISO — an international organisation that defines the characters that Unicode encodes — has recommended that the Bengali script be renamed as “Bengali-Assamese”. But the original demand remains.
Unicode characters are defined by the international standard ISO 10646, and ISO does the encoding on the basis of submissions from the governments concerned. Prof Shikhar Sarma of Gauhati University’s IT department, who attended the London meeting as a member of a Bureau of Indian Standards (BIS) panel called LITD-20 and of ISO’s Working Group 2, said the Indian government had made a mistake in 1991 by dropping Assamese while translating its standard ISCII (Indian Standard Code for Information Interchange) for transfer to the Unicode standard.
The ISO recommendations are for inclusion of the Assamese name for every character, encoding of the unlisted characters, and renaming of the page title (the web page already says “Bengali and Assamese” but most characters are defined only as Bengali).
While Sarma said the “ball is in the court” of the BIS, Dr Satyakam Phukan, a Guwahati-based surgeon who ran a solo campaign for years before it gained wider support, would rather have the BIS push for the original demand. Phukan notes that the Assamese script is not yet included in the ISO 15919 standard for “transliteration of Devanagari and related Indic scripts into Latin characters”. This could get in the way of including the Assamese sounds in a common script.
Unicode’s stated objective is to provide a unique number for every character across platforms, devices and languages. If it were to recognise Assamese as separate from Bengali, would it not mean duplicating characters that are identical in form? A fact sheet written by Phukan and published by the Asam Sahitya Sabha, Assam’s apex literary organisation, argues against this objection.
For one thing, there is already duplication in the Unicode charts, not only between Latin and Cyrillic with the same character sometimes denoting different sounds, but also between Bengali and Tirhuta (Maithili) with the same character sometimes denoting the same sound (see illustration).
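The Latin and Cyrillic case can be illustrated with a short sketch. The two look-alike pairs below are our own choice of examples, not taken from the fact sheet, and the Tirhuta comparison is left out here.

```python
import unicodedata

# Look-alike pairs (Latin, Cyrillic); chosen for illustration only.
LOOKALIKES = [("A", "\u0410"), ("P", "\u0420")]


def code_point(ch: str) -> str:
    """Format a character's code point in the usual U+XXXX notation."""
    return f"U+{ord(ch):04X}"


for latin, cyrillic in LOOKALIKES:
    print(code_point(latin), unicodedata.name(latin))
    print(code_point(cyrillic), unicodedata.name(cyrillic))
    # Similar shapes on screen, yet encoded as two separate characters.
    print("same code point:", latin == cyrillic)
```

The Latin “P” and the Cyrillic “Er” look alike but stand for different sounds, and each has its own code point.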
At times, the same sound is expressed by different characters in Assamese and Bengali. The definitive example is the letter ‘ra’, which takes two different forms in the two languages, besides a letter ‘wa’ that is exclusive to Assamese. The Unicode chart lists the Assamese ‘ra’ and ‘wa’ among additions for Assamese, and defines them as being both Bengali and Assamese characters.
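As a rough check, the formal names of these letters can be printed alongside the Bengali ‘ra’. The dictionary labels below are ours. Python’s `unicodedata` module exposes only the formal character names, not the annotations printed in the code chart.

```python
import unicodedata

# Labels are informal descriptions added for this sketch.
LETTERS = {
    "Bengali ra": "\u09B0",
    "Assamese ra": "\u09F0",
    "Assamese wa": "\u09F1",
}

for label, ch in LETTERS.items():
    print(f"{label}: U+{ord(ch):04X} {unicodedata.name(ch)}")

# The two forms of 'ra' occupy two distinct code points.
print(LETTERS["Bengali ra"] != LETTERS["Assamese ra"])
```

All three formal names begin with the word BENGALI, including the two letters that Bengali itself does not use.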
At other times, the same letter denotes two different sounds. For example, three Assamese characters denote the sound ‘xa’, defined as “a soft ‘kh’ with the air released from the throat with the base of the tongue not touching the palate or the roof of the mouth”. In Bengali, the same three characters denote the sounds ‘sa’, ‘sha’ and ‘ssa’.
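One way to picture this is a romanisation table keyed by both the character and the language. The sketch below is hypothetical: the table and the function `romanize` are ours, and the pairing of each Bengali label with a code point is an assumption that follows the Unicode character names.

```python
# Hypothetical table: one code point, two readings depending on language.
SIBILANTS = {
    "\u09B6": {"bengali": "sha", "assamese": "xa"},
    "\u09B7": {"bengali": "ssa", "assamese": "xa"},
    "\u09B8": {"bengali": "sa", "assamese": "xa"},
}


def romanize(ch: str, language: str) -> str:
    """Look up the reading of a character for the given language."""
    return SIBILANTS[ch][language]


for ch in SIBILANTS:
    print(f"U+{ord(ch):04X}", romanize(ch, "bengali"), romanize(ch, "assamese"))
```

The code point alone does not say how the letter is read; software has to know the language of the text as well.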
Phukan’s fact sheet lists many other examples for all these arguments.
The Assam government proposal lists 104 characters and symbols — 35 identical to Bengali ones in name and shape, 42 similarly shaped but with different sounds/uses, and 27 yet to be encoded.
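The breakdown can be tallied in a few lines. The category labels are paraphrased from the paragraph above.

```python
# Counts as reported for the Assam government proposal.
proposal = {
    "identical to Bengali in name and shape": 35,
    "similar shape, different sound or use": 42,
    "not yet encoded": 27,
}

total = sum(proposal.values())
for label, count in proposal.items():
    print(f"{label}: {count} ({count / total:.0%})")
print("total:", total)  # total: 104
```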
“We did not get what we wanted… but what happened in London is big progress,” said Prof Sarma. “We will stick with our original demand… We insisted that you don’t decide but you recommend, and you let the BIS coordinate the entire effort… BIS has to sit with the government of Assam and people of Assam. We need a separate slot, or we need a slot where all the characters and symbols of Assamese language are encoded.”
All this would require consultations not only with the government of Assam but, later, possibly with those of West Bengal and Bangladesh. Dr Parmananda Rajbongshi, president of the Asam Sahitya Sabha, is hopeful that it can be resolved amicably. Rajbongshi, who was in the delegation to London, agreed that this is not what the campaign was for, but said it has opened a door.