Per Capita AI consumption could become development indicator of the future: Pratyush Kumar, co-founder, Sarvam AI

Kumar is also the co-founder of AI4Bharat, which pioneers Indian-language AI applications, and of PadhAI, which provides deep and affordable online learning to students.

Kumar is optimistic that LLMs and other related technologies can be harnessed for good. (Express photo by Jithendra M)

Pratyush Kumar is passionate about democratising the impact of AI. The co-founder of Sarvam AI, a Bengaluru-based startup building large language models (LLMs) to solve India-specific problems, Kumar is optimistic that LLMs and other related technologies can be harnessed for good.

A Ph.D. from ETH Zurich and a B.Tech. from IIT Bombay, Kumar has worked at Microsoft Research, IBM Research, and IIT Madras. He is also an adjunct faculty at IIT Madras.

Kumar spoke to indianexpress.com on how generative AI could have an impact on sectors like education, health and agriculture, the need for enormous data curation in Indian languages for large language models to work, and how ten years from now, most things that we see on the internet might become unrecognisable. Edited excerpts:

Venkatesh Kannaiah: Can you tell us about the broad trends in AI and which domains might see a greater impact?

Pratyush Kumar: At the speed at which tech is moving in this sector, we may get closer to Artificial General Intelligence soon, where an intelligent agent could learn to accomplish any task that human beings can possibly do. There are those who say that the tech may not be moving that fast, but that is a debate for another day.

We at Sarvam AI are interested in the democratisation of access to AI, and this could come along two axes. The first is cost: it should be affordable. The second is relevance: it should have the right kind of use cases or applications for an Indian context. In the previous generation of tech, it was about providing access through laptops, smartphones and cheap data; now it is about adding intelligence to that access.

You must also understand that deep tech within various domains like health would take time to emerge to fully utilise the potential of AI. So the impact of AI would not be uniform across all domains. Some like education would see a radical shift, but in other domains it is for those subject matter experts to build applications to utilise the power of AI and for changes to happen.

We are accustomed to seeing per capita energy consumption as a metric of development; going forward, we would be seeing per capita AI consumption as an indicator of development.

“When it comes to agriculture, we see AI trying to solve the problem of information asymmetry, providing context specific information in real-time,” Kumar says. (Express photo by Jithendra M)

Venkatesh Kannaiah: Can you tell us how your company will change the life of a common man in the field of education, health and agriculture?

Pratyush Kumar: I see that in, say, five years, interactive AI systems will be able to teach a student a chapter in a manner comparable to a good teacher, if not better. These systems can in due course provide data and insights on students by page and perhaps even across pin codes. Being interactive and equipped with a real-time doubt-clearing system, they will change the way we deliver education. With this, we can know if students in some pin codes, say in Karnataka or Bihar, are finding some chapters or parts of chapters difficult, and work to improve them. The system would also have spatial awareness, meaning that it would know which part of the page you are reading from or working on. We are building Tuition Anna, a product which will have all these features and much more.

As for the health sector, it is slightly more complex, as large amounts of health data are needed, and these are now either with the hospitals or with various government agencies. While we need to be aware of privacy issues, we also need to build structures wherein data is shared with groups working on AI. If that happens fast, we would see personalised AI health assistants emerge in the near future. But the challenge for AI to work effectively and at scale in the health sector is for data to be made available. Governments perhaps need to step in and solve the logjam.

When it comes to agriculture, we see AI trying to solve the problem of information asymmetry, providing context-specific information in real time. It will lead to Indian farmers becoming more productive. What we might be looking at would be like a co-pilot for farmers, a kind of personalised information assistant.

Venkatesh Kannaiah: You talk of an India-centric AI. Does it mean that going forward AI models will increasingly become domain-specific and region-specific?

Pratyush Kumar: It is nice to talk about being India-centric or India-specific AI but what we must understand is that large domain agnostic models seem to be working better. For example, to write better poetry, we find that if the model has already been trained on Math, it would write better poetry and vice versa. So the larger the model, the better it would be to do seemingly unrelated tasks. I do not see different large language models developing for each country or region.

However, the applications that would be built on each of the models could be region-specific.

We already have Aadhaar, Jan Dhan, UPI and Ayushman Bharat, and we need to look at how we layer on them, use AI and build new products and applications solving India-specific problems, which could range from malnutrition in a particular region to iron deficiency in a segment of the population.

Sarvam AI’s large Indic language models, with voice interfaces, will make it easy for Indians to interact with their model.

Venkatesh Kannaiah: You talk of paucity of high-quality and diverse Indic language content at scale. Can you explain the same and tell us which Indian languages pass this test?

Pratyush Kumar: We have a rich heritage, culture and literature in most Indian languages, but the issue is that it is not digitised in the vernacular languages. Most LLMs take information from the internet, and not much Indian language content is available there compared to English. What is available is also not of the highest quality or fluency. We need to consciously digitise large amounts of Indian language content. We are quite behind the English language on this issue.

As for Indian languages, Hindi, Tamil, Telugu, Kannada, Malayalam and Bengali pass the test of substantial amounts of content being available on the internet for these large language AI models to work on. However, there is a need to digitise much more, with much higher quality. Machine translation works to a limited extent in expanding the amount of content available on the web.

Venkatesh Kannaiah: You say you would be building LLMs that use voice as the default interface in India. Why is it so?

Pratyush Kumar: Young Indians do not type in their regional languages and that makes voice the format to preserve and distribute. Voice would be the default interface in India for Sarvam AI.

The common customer experience with AI systems is with voice-interactive call centre models and with AI chatbots in customer care. These things are going to improve dramatically. Within a year, you would feel equally comfortable with an AI chatbot or an AI call agent as with a call centre executive.

You must understand that this tech is moving very fast. We are working on getting the science and the tech together and working on building an ecosystem of products for the market to adopt. After all, all of it will be done with an eye on the return on investment.

Venkatesh Kannaiah: What is your interaction with KisanAI/Dhenu and how will it help Indian farmers?

Pratyush Kumar: KisanAI is a very good example of how things will pan out. We built OpenHathi at Sarvam AI to make contributions to the ecosystem with open models and datasets to encourage innovation in Indian language AI. It is a partnership with our academic partners at AI4Bharat who have contributed language resources and benchmarks. We built the base model which understands English and Hindi, and KisanAI comes and builds a product for farmers within a few weeks. We finetuned our solution, did some hand holding and the product got built in a short period of time. The turnaround time for building products and applications would be surprisingly short.

We do not have subject matter expertise in, say, agriculture, and it is for other companies to build on our platform. It is a good template to showcase.

Venkatesh Kannaiah: What do you mean when you say that you will be leading efforts for large-scale data curation in the public good space? How would it work?

Pratyush Kumar: We have a societal imperative to unlock our data. Our culture is not there on the web now. The challenge is how we digitise the data in Indian languages. We need to come up with a system, a formula, wherein the rights of the data owners are taken care of, the data is digitised, and Indian data can be made available to build large Indian AI models to solve India-specific problems. It is a challenge, but we in India have expertise in thinking about protocols and systems in a different manner. We are confident that it will be resolved in due course. At Sarvam AI, we would be launching a set of initiatives to digitise and showcase data in the public domain, in a privacy-friendly manner.
