Date: 2023-02-13

As the battle between tech giants Google and Microsoft over the future of Internet search intensifies, an enormously significant effort is under way in India to harness the power of artificial intelligence (AI) for the country’s estimated 150 million farmers.

WhatsApp, the enormously popular messaging service, could soon facilitate search on key government schemes, powered by the sensational AI chatbot ChatGPT and by an ambitious national-level programme that aims to build vast crowdsourced datasets with samples of Indian voices in several local languages.

WhatsApp is owned by Meta Platforms Inc (formerly Facebook). ChatGPT has been developed by the San Francisco-based firm OpenAI, in which Microsoft has reportedly made an investment of $10 billion.

Bhashini, a small team at the Ministry of Electronics and IT (MeitY), is building a WhatsApp-based chatbot that relies on information generated by ChatGPT to return appropriate responses to queries. And because users, especially farmers in rural areas, may not always want to, or be able to, type out their queries, questions can be put to the chatbot using voice notes.

How will the chatbot work?

In essence, a user could simply ask a question using voice notes, and receive a voice-based response generated by ChatGPT. According to a senior government official, a model of this bot was shown to Microsoft CEO Satya Nadella, who mentioned it at the World Economic Forum (WEF) in Davos last month.

During a demo seen by The Indian Express, the chatbot, which is currently under testing, seamlessly responded to a spoken query about the details of the PM Awas Yojana, the government’s flagship affordable housing scheme.

The chatbot has been developed keeping in mind sections of India’s rural and agrarian population that most depend on government schemes and subsidies. These potential users speak a wide range of languages, which makes it important to build a language model that can successfully identify and understand them, another senior government official associated with the project said.

Neither official committed to a date for the public release of the chatbot. But they said that its demo had left Microsoft’s Nadella impressed.

“A demo I saw was a rural Indian farmer trying to access some government programme. He just expressed a complex thought in speech in one of the local languages that got translated and interpreted by a bot, and a response came back saying ‘go to a portal and here is how you will access the programme’. He said, ‘I’m not going to go to the portal, I want you to do this for me.’ The bot completed it, and the reason why it was able to complete it was because a developer building it had taken GPT and trained it over all of the Government of India’s documents and then scaffolded it with the speech recognition software,” Nadella told WEF founder Klaus Schwab in an interview.

How will the bot understand and interpret local languages?

While ChatGPT has impressed with its ability to respond to complex queries in fascinating and eloquent ways, building a national digital public platform for Indian languages will be critical for the success of the WhatsApp chatbot that the Bhashini team is working on.

To build such a language model, it is essential to have large datasets of the various local languages spoken in India, on which the model can be trained, officials said. And this is where an initiative called Bhasha Daan comes in.

Bhasha Daan, officials said, is an ambitious project that aims to crowdsource voice datasets in multiple Indian languages. People can contribute on the project’s website by recording themselves reading out a portion of text, by typing out a sentence that they hear, or by translating text in one language into another.

“The majority of those who will use this chatbot would not know English. So for their voice inputs to work on the chatbot, it is important that we train our language processing models on as many Indian languages as possible. We have a decent-sized repository of voices in many Indian languages that people of the country have contributed to through the Bhasha Daan portal. We also have a vast database of all the languages that Doordarshan broadcasts in,” the second official said.

In its current test phase, the model supports 12 languages, including English, Hindi, Tamil, Telugu, Marathi, Bengali, Kannada, Odia, and Assamese. If a user sends a voice note in any of these languages, the chatbot will return a response.

Are there any limitations or concerns around such models?

Technology ethics experts have cautioned that responses of generative AI models like ChatGPT may not always be accurate. Last week, when Google unveiled Bard, its competitor to ChatGPT, the bot made a factual error about the James Webb Space Telescope. Google’s parent company, Alphabet, lost about $100 billion in market value after the mistake was spotted.

In its current testing phase, the WhatsApp chatbot can respond only to simple queries about government schemes. This is primarily because of a current limitation of ChatGPT itself: it cannot access information from the Internet in real time. ChatGPT’s language model was trained on a dataset that only includes information until 2021.

However, this could soon change. Last Wednesday, Microsoft announced a new version of its search engine Bing, powered by an upgraded version of GPT-3.5, the OpenAI language model that underpins ChatGPT. Microsoft called this the “Prometheus Model”, and said it was more powerful than GPT-3.5 and better able to answer search queries with up-to-date information and annotated answers.

The first official said that once ChatGPT is able to search the Internet and return real-time results, the scope of the WhatsApp chatbot could expand greatly. “People will not just be able to get information about various government schemes in a concise manner, but they will also be able to inquire if they are eligible for those schemes,” the official said.

The choice of WhatsApp as the delivery platform was deliberate, the official said. “WhatsApp has more than 500 million users and even those with relatively low digital literacy know their way around the app,” he said.