
What is Phi-2, Microsoft’s new small language model

Microsoft's new generative AI model is leaner than, and on several benchmarks more capable than, much larger language models.

Satya Nadella announcing Phi-2 at Ignite 2023 (Image credit: Microsoft)

In a world of large language models (LLMs) like GPT-4 and Bard, Microsoft has just released a new small language model: Phi-2, a 2.7-billion-parameter upgrade of Phi-1.5. Currently available via the Azure AI Studio model catalogue, Phi-2 can, Microsoft claims, outperform larger models such as Llama-2, Mistral, and Gemini Nano 2 in various generative AI benchmark tests.

Originally announced by Satya Nadella at Ignite 2023 and released earlier this week, Phi-2 was built by the Microsoft research team, and the generative AI model is said to demonstrate "common sense," "language understanding," and "logical reasoning." According to the company, Phi-2 can even outperform models that are 25 times larger on specific tasks.

Microsoft's Phi-2 SLM is trained on "textbook-quality" data, which includes synthetic datasets, general knowledge, theory of mind, daily activities, and more. It is a transformer-based model trained with a next-word prediction objective. Microsoft trained Phi-2 on 96 A100 GPUs for 14 days, a fraction of the compute reportedly needed for GPT-4, which is said to have taken around 90-100 days of training on tens of thousands of A100 Tensor Core GPUs.
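The next-word prediction objective mentioned above can be illustrated with a minimal sketch. This is not Microsoft's training code, just the standard cross-entropy loss that transformer language models like Phi-2 typically minimise, written here with NumPy for clarity:

```python
import numpy as np

def next_token_loss(logits, targets):
    """Average cross-entropy of predicting each next token.

    logits:  (seq_len, vocab) scores for the next token at each position
    targets: (seq_len,) index of the token that actually came next
    """
    # Numerically stable log-softmax over the vocabulary axis
    shifted = logits - logits.max(axis=-1, keepdims=True)
    log_probs = shifted - np.log(np.exp(shifted).sum(axis=-1, keepdims=True))
    # Pick out the log-probability assigned to each true next token
    return -log_probs[np.arange(len(targets)), targets].mean()

# A model that is unsure (uniform scores) pays the maximum average loss,
# log(vocab_size); a model that scores the true next token highly pays
# almost nothing. Training pushes the loss toward the second case.
uniform = np.zeros((3, 4))                 # vocab of 4, sequence of 3 steps
confident = np.full((3, 4), -10.0)
confident[[0, 1, 2], [1, 2, 3]] = 10.0     # strongly predicts tokens 1, 2, 3
print(next_token_loss(uniform, np.array([1, 2, 3])))
print(next_token_loss(confident, np.array([1, 2, 3])))
```

The entire pre-training signal is just this: given the tokens so far, assign high probability to the token that actually follows.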

Performance of Phi-2 against Llama and Mistral (Image credit: Microsoft)

Microsoft’s Phi-2 can also solve complex mathematical equations and physics problems. On top of that, it can identify a mistake made by a student in a calculation.

On benchmarks like commonsense reasoning, language understanding, math, and coding, Phi-2 outperforms the 13B Llama-2 and the 7B Mistral. It also outperforms the far larger 70B Llama-2 on multi-step reasoning tasks such as coding and math. Not just that, it even outperforms Google's Gemini Nano 2, a 3.25B model that runs natively on the Google Pixel 8 Pro.

A smaller model that outperforms a large language model like Llama-2 has a huge advantage: it costs far less to run, with lower power and computing requirements. Such models can also be trained for specific tasks and can easily run natively on a device, reducing output latency. Developers can access the Phi-2 model on Azure AI Studio.
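The cost advantage is easy to see with a back-of-the-envelope calculation. Assuming fp16 weights (2 bytes per parameter, a common serving precision, and an assumption here, not a figure from Microsoft), the memory needed just to hold the weights differs by more than an order of magnitude:

```python
def fp16_footprint_gb(n_params_billion: float) -> float:
    """Rough fp16 memory for model weights: 2 bytes per parameter.

    Uses decimal GB (1e9 bytes) and ignores activations, KV cache, and
    runtime overhead -- a back-of-the-envelope figure only.
    """
    return n_params_billion * 1e9 * 2 / 1e9

phi2_gb = fp16_footprint_gb(2.7)    # ~5.4 GB: within reach of a single GPU
llama70_gb = fp16_footprint_gb(70)  # ~140 GB: needs several data-centre GPUs
print(f"Phi-2: {phi2_gb:.1f} GB vs Llama-2-70B: {llama70_gb:.1f} GB")
```

Roughly 5.4 GB versus 140 GB of weights alone is the difference between a model that fits on one consumer-grade GPU (or, quantised further, on a phone or laptop) and one that needs a multi-GPU server, which is why on-device latency and power draw favour SLMs like Phi-2.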
