Chinese Artificial Intelligence (AI) start-up DeepSeek’s R1 model, which has disrupted the tech sector, holds lessons for India on developing the critical technology cost-effectively and without massive computational resources, the nation’s top researchers told The Indian Express.
“With the launch of the R1 model, India could do a Mangalyaan in AI,” said Gautam Shroff, Professor at IIIT-Delhi and former senior vice-president of Tata Consultancy Services and its head of research.
Researchers are of the view that although the US and China lead the AI race, India can certainly catch up. “With the right focus, India can position itself as a strong contender in the global AI ecosystem. The open-sourcing of DeepSeek models is creating a ripple effect, triggering global competition and collaboration. India should embrace this momentum,” said Mayank Vatsa, professor of Computer Science at IIT-Jodhpur.
Like Shroff of IIIT-Delhi, Vatsa, who works on biometrics and computer vision, said India needs to follow the template set by the Indian Space Research Organisation (ISRO) to usher in major AI advancements. “In the early days of India’s space programme, we were considered outsiders — so much so that cartoons ridiculed our perceived deficiencies in space technology. Yet ISRO’s journey has proven otherwise, transforming India into a global leader in this domain through cost-effective, innovative methods that encouraged self-reliance and delivered wide-ranging social impact,” he said.
These same principles, Vatsa felt, could be applied to AI amid the race to develop the technology. Shroff made the same point: “We managed to reach Mars at a fraction of the cost of the developed nations. This is what is needed in AI as well, and with the latest developments we have a demonstration and even a starting point.”
The R1 model released last week has sent shock waves across the world — the global tech market has seen a shift, DeepSeek’s model overtook rival ChatGPT to become the top-rated free application on Apple’s App Store, and Silicon Valley giants are spooked. The little-known company based in Hangzhou has built an AI model which can match the performance of its cutting-edge American rivals at a much lower cost and with limited resources.
“DeepSeek’s models show that necessity breeds innovation. When you are denied some technology, you figure out a way around it and do something smarter. Some of the engineering is quite ingenious,” Shroff said.
The R1 model is said to have been built with only around $6 million — OpenAI spent more than $100 million to train its GPT-4. DeepSeek’s model uses a “mixture of experts” approach: instead of a single big model handling every query, a router activates only the specialised sub-models, or “experts”, relevant to each input, while the rest stay idle.
Vatsa said, “This allows for better performance and efficiency compared to a single large model. This approach has the potential to democratise access to advanced AI by reducing the computational resources required, enabling researchers in resource-constrained environments to participate more actively.”
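To make the idea concrete, here is a minimal sketch of a mixture-of-experts layer in Python. It is a toy illustration of the routing idea described above, not DeepSeek’s actual architecture; the dimensions, expert count and random weights are invented for the example.

```python
# Toy mixture-of-experts layer: a router scores each input and only the
# top-k "experts" (small sub-models) run, so most parameters stay idle
# for any given query. Illustrative only -- not DeepSeek's implementation.
import numpy as np

rng = np.random.default_rng(0)
DIM, NUM_EXPERTS, TOP_K = 8, 4, 2

# Each "expert" is a tiny linear layer standing in for a specialised sub-model.
experts = [rng.normal(size=(DIM, DIM)) for _ in range(NUM_EXPERTS)]
router = rng.normal(size=(DIM, NUM_EXPERTS))  # produces one score per expert

def moe_forward(x: np.ndarray) -> np.ndarray:
    scores = x @ router                      # how well each expert suits x
    top = np.argsort(scores)[-TOP_K:]        # indices of the k best experts
    weights = np.exp(scores[top])
    weights /= weights.sum()                 # softmax over the chosen experts
    # Only the selected experts compute; skipping the rest is where the
    # training and inference savings come from.
    return sum(w * (x @ experts[i]) for w, i in zip(weights, top))

print(moe_forward(rng.normal(size=DIM)).shape)  # (8,)
```

Because only TOP_K of the NUM_EXPERTS sub-models run for any input, compute per query scales with the active experts rather than the total parameter count, which is the efficiency gain Vatsa describes.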
However, researchers also said it was important to wait and watch how the model performs in the coming days as it has just been rolled out. Shroff said, “The R1 model has beaten the benchmark tests. But you have to see how they scale this for hundreds of millions of people and commercialise it. It is not just about creating AI but also about serving half-a-billion people every day.”
The R1 model is not only cost-effective but also consumes less energy, according to DeepSeek. Thousands of GPUs used for training AI models are housed in data centres, which devour large amounts of energy and water for cooling. In the US, data centres consumed roughly 4.4% of electricity in 2023 but are anticipated to use 6.7% to 12% of all power by 2028, according to a report produced by the Lawrence Berkeley National Laboratory.
The Chinese start-up is said to have trained the R1 model using around 2,000 Graphics Processing Units (GPUs) — companies like OpenAI use as many as 16,000 or more GPUs to train their models. The R1 model was trained on Nvidia H800 chips, the less-advanced GPUs available to DeepSeek under US export restrictions that bar the sale of more sophisticated chips such as the H100 to China.
Pushpak Bhattacharyya, Professor at the Computer Science Department at IIT Bombay who works on machine translation, said the R1 model’s launch is “heartening” as “the AI community is worried about the environmental impact of the technology”.
Researchers hope that DeepSeek’s breakthrough will lead to the development of more models in India that can compete with the systems of the industry’s leading players such as Google, OpenAI, and Meta.
They also called for scaling up work on indigenous models built for Indian languages so that they can address the country’s regional diversities and complexities. For instance, Bhattacharyya explained, these models should ultimately be able to help farmers in rural areas.
“They should be able to access AI applications on their phones and, let’s say, upload a photo of a diseased crop. The application should then be able to lay out next steps for tackling the diseased crop to ensure that it does not contaminate the whole field. This communication to the farmer should be in their language,” he said.
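One way such an advisory pipeline could be wired together is sketched below. Every function is a hypothetical placeholder standing in for a real model (a crop-disease image classifier, a language model for recommendations, and an Indic machine-translation system); the names and canned outputs are invented so the example runs end to end.

```python
# Hedged sketch of the crop-advisory flow Bhattacharyya describes. Each
# function is a hypothetical placeholder for a real model; the canned
# return values exist only so the example runs end to end.
from dataclasses import dataclass

@dataclass
class Advisory:
    disease: str
    steps: list[str]
    language: str

def classify_disease(photo: bytes) -> str:
    # Placeholder for a vision model fine-tuned on regional crop diseases.
    return "leaf_blight"

def recommend_steps(disease: str) -> list[str]:
    # Placeholder for a language model prompted for containment steps.
    return ["Remove and destroy infected leaves.",
            "Apply an approved fungicide promptly.",
            "Avoid overhead irrigation until symptoms subside."]

def translate(steps: list[str], language: str) -> list[str]:
    # Placeholder for an Indic machine-translation model that would render
    # the advice in the farmer's own language.
    return steps  # identity stand-in

def advise_farmer(photo: bytes, language: str) -> Advisory:
    disease = classify_disease(photo)
    steps = translate(recommend_steps(disease), language)
    return Advisory(disease=disease, steps=steps, language=language)

print(advise_farmer(b"<jpeg bytes>", language="hi"))
```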
To propel AI growth in India, researchers highlighted the need to develop indigenous foundational models through sustained funding and industry-government collaboration.
Shroff said that currently, private companies in India “do not have the capacity or even the desire to build foundational models”. “We have seen many senior leaders go and say we should not be building models and we should just be using them. We do not have to just consume, we have to contribute,” he said.
He also said that those building models in India should learn from DeepSeek and think of “better spending of the 10,000 crores allocated in the AI mission along all verticals”. In March 2024, the Centre launched the IndiaAI Mission, allocating Rs 10,372 crore over five years to several initiatives to boost the country’s AI capabilities.