DeepSeek-R2 to launch soon? What we know about the highly-anticipated AI model

So far, DeepSeek has been tight-lipped about the upcoming R2 model and little information is available in the public domain.

The release of R2 may galvanise the US to impose further restrictions on the export of GPUs to China. (File Photo)

There has been a lot of chatter lately around DeepSeek’s upcoming AI model R2. While the Chinese AI startup was initially expected to introduce R2 in May this year, Reuters has reported that DeepSeek is accelerating the launch timeline of the successor to its R1 model, which was introduced in January 2025.

DeepSeek-R2 is likely to be able to reason in languages beyond English. The new model is also expected to have improved capabilities for generating code and will be multimodal.

So far, DeepSeek has been tight-lipped about the upcoming R2 model and little information is available in the public domain. However, DeepSeek’s technical report on R1 suggests that its successor’s reasoning ability will improve substantially as reinforcement learning (RL) training datasets are expanded.

“We believe the engineering performance of DeepSeek-R1 will improve in the next version, as the amount of related RL training data currently remains very limited,” DeepSeek researchers wrote in their technical paper that accompanied the release of R1.

The excitement around DeepSeek-R2 is growing, even as rival companies in the United States reckon with the implications of R1. The Hangzhou-based firm claims to have built R1 on less-powerful Nvidia chips and a far tighter budget than competing AI models developed by US tech giants at costs of hundreds of billions of dollars.

As DeepSeek’s popularity grew, its AI chatbot app temporarily dethroned OpenAI’s ChatGPT as the most downloaded app on the Apple App Store in the US. Soon, it triggered a $1 trillion-plus sell-off in global equities markets, with Nvidia stock dropping more than 15 per cent in a single trading day.

Since then, DeepSeek has attempted to sustain the momentum by offering a 75 per cent discount on API access to its R1 reasoning model during off-peak hours (16:30 to 00:30 UTC every day).

🚨 Off-Peak Discounts Alert!

Starting today, enjoy off-peak discounts on the DeepSeek API Platform from 16:30–00:30 UTC daily:

🔹 DeepSeek-V3 at 50% off
🔹 DeepSeek-R1 at a massive 75% off

Maximize your resources smarter — save more during these high-value hours! pic.twitter.com/00Qa6iNbcG

— DeepSeek (@deepseek_ai) February 26, 2025
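
The off-peak window in the announcement wraps past midnight, which is easy to get wrong when checking it in code. The following is a minimal Python sketch of that check, not DeepSeek's own billing logic. The model keys, the function names and the `list_price` argument are hypothetical, and treating 00:30 UTC as the first moment outside the window is an assumption, since the announcement does not say whether the end time is inclusive.

```python
from datetime import datetime, time, timezone

# Discount rates as stated in the announcement. The dictionary keys are
# illustrative labels, not official API model identifiers.
OFF_PEAK_DISCOUNT = {"deepseek-v3": 0.50, "deepseek-r1": 0.75}

OFF_PEAK_START = time(16, 30)  # 16:30 UTC
OFF_PEAK_END = time(0, 30)     # 00:30 UTC on the following day


def is_off_peak(moment: datetime) -> bool:
    """Return True if a timezone-aware `moment` is in the 16:30-00:30 UTC window."""
    t = moment.astimezone(timezone.utc).time()
    # The window crosses midnight, so it is the union of two ranges:
    # [16:30, 24:00) and [00:00, 00:30). The end is assumed to be exclusive.
    return t >= OFF_PEAK_START or t < OFF_PEAK_END


def discounted_price(list_price: float, model: str, moment: datetime) -> float:
    """Apply the off-peak discount to a hypothetical list price."""
    if is_off_peak(moment):
        return list_price * (1 - OFF_PEAK_DISCOUNT[model])
    return list_price
```

Under these assumptions, a request made at 17:00 UTC against a hypothetical R1 list price of 100 units would be billed at 25 units, while the same request at 12:00 UTC would be billed at the full 100.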

The reduction in API prices was announced in the midst of DeepSeek’s “Open-Source Week,” where it pledged to open-source five code repositories that have been “documented, deployed and battle-tested in production.”

“As part of the open-source community, we believe that every line shared becomes collective momentum that accelerates the journey. Daily unlocks are coming soon. No ivory towers — just pure garage-energy and community-driven innovation,” DeepSeek wrote in a post on X.

🚀 Day 0: Warming up for #OpenSourceWeek!

We’re a tiny team @deepseek_ai exploring AGI. Starting next week, we’ll be open-sourcing 5 repos, sharing our small but sincere progress with full transparency.

These humble building blocks in our online service have been documented,…

— DeepSeek (@deepseek_ai) February 21, 2025

Challenges facing DeepSeek-R2

While DeepSeek aims to build on its R1 success, the roll-out of its next generation of AI models faces several challenges. The release of R2 may galvanise the US government to impose further restrictions on the export of graphics processing units (GPUs) to China and other countries. For instance, Nvidia’s H20 chip, which is still outside the scope of US export controls, could be brought under such restrictions.

DeepSeek’s main backer High-Flyer, a Chinese quant hedge fund, is known to have set up two supercomputing AI clusters in 2020 and 2021, one of which comprises around 10,000 Nvidia A100 chips used for training AI models. The US government banned the sale and export of A100 chips to China in 2022.

The startup has claimed that it relied on advanced techniques such as Mixture-of-Experts (MoE) and multi-head latent attention (MLA) to develop R1 with less computing capacity. However, experts have pointed out that making gains in the stages that follow training, such as inference, will require advanced GPUs.

Privacy concerns around DeepSeek’s AI services could also hinder the widespread adoption of the upcoming R2 model. From South Korea to Italy, several governments have ordered the removal of DeepSeek from app stores, citing such concerns.
