This is an archive article published on August 22, 2024

What is the Phi-3.5 series, Microsoft’s newly launched trio of smaller AI models?

Microsoft says its latest small language AI models can outperform larger models by Meta and Mistral.

Microsoft logo is seen near a computer motherboard in this illustration taken January 8, 2024. (File photo: REUTERS/Dado Ruvic)

Microsoft has released a new batch of lightweight AI models that are open-source and said to outperform Google’s Gemini 1.5 Flash, Meta’s Llama 3.1, and, in some respects, OpenAI’s GPT-4o.

The Phi-3.5-mini-instruct, Phi-3.5-Mixture of Experts (MoE)-instruct, and Phi-3.5-vision-instruct are the latest additions to the tech giant’s family of small language models (SLMs), known as the Phi-3 series. The Phi-3-mini, Microsoft’s first SLM, made its debut in April this year.

What are the new Phi-3.5 models?

The Phi-3.5-mini-instruct comes with 3.82 billion parameters, while the Phi-3.5-MoE-instruct has 41.9 billion parameters, of which reportedly only 6.6 billion are active at any given time. Meanwhile, the Phi-3.5-vision-instruct includes 4.15 billion parameters.

The parameter count of an AI model serves as an indicator of its size. It also provides a rough estimate of the knowledge and skills the model has acquired through machine learning.

Meanwhile, all three Phi-3.5 models support a context window of 128k tokens. Context windows are measured in tokens, and they indicate how much information an AI model can process and generate at any given time. A longer context window means the model is capable of processing more text, images, audio, code, video, etc.

According to Microsoft, the Phi-3.5 Mini was trained for a period of ten days on 3.4 trillion tokens while the Phi-3.5 MoE model was trained for a period of 23 days on 4.9 trillion tokens. It took 500 billion tokens and six days to train the Phi-3.5 Vision model, the company said. The training datasets fed to the new Phi-3.5 models comprised high-quality, reasoning-dense, publicly available data.

What are the capabilities?

In a nutshell, the Phi-3.5 Mini is equipped with basic and quick reasoning capabilities that are useful for generating code or solving mathematical and logical problems. Since it combines multiple expert models that each specialise in certain tasks, the Phi-3.5 MoE model can handle complex AI tasks across multiple languages.


On the other hand, the Phi-3.5 Vision model is capable of processing text as well as images. As a result, the multimodal AI model can carry out visual tasks such as summarising videos or analysing charts and tables.

How to use these AI models?

Developers can download, customise, and integrate the Phi-3.5 series into their platforms at no cost, as Microsoft has released these AI models under an open-source licence. They can be accessed via Hugging Face, an AI model hosting platform, with no restrictions on commercial use or modification.
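As a rough illustration of what this looks like in practice, the sketch below loads Phi-3.5-mini-instruct from Hugging Face using the `transformers` library and generates a reply to a prompt. The model ID follows the naming used on the Hugging Face Hub; the chat format and generation settings here are illustrative assumptions, so developers should check the model card for the recommended usage.

```python
# Illustrative sketch: running Phi-3.5-mini-instruct locally via the
# Hugging Face transformers library. Downloading the weights requires
# several gigabytes of disk space and a transformers-compatible setup.
from transformers import AutoModelForCausalLM, AutoTokenizer

MODEL_ID = "microsoft/Phi-3.5-mini-instruct"


def generate_reply(prompt: str, max_new_tokens: int = 128) -> str:
    """Generate a chat-style completion from Phi-3.5-mini-instruct."""
    tokenizer = AutoTokenizer.from_pretrained(MODEL_ID)
    model = AutoModelForCausalLM.from_pretrained(MODEL_ID)

    # Wrap the prompt in the model's chat template (instruct models
    # expect role-tagged messages rather than raw text).
    messages = [{"role": "user", "content": prompt}]
    inputs = tokenizer.apply_chat_template(
        messages, add_generation_prompt=True, return_tensors="pt"
    )

    outputs = model.generate(inputs, max_new_tokens=max_new_tokens)
    # Decode only the newly generated tokens, not the echoed prompt.
    return tokenizer.decode(
        outputs[0][inputs.shape[-1]:], skip_special_tokens=True
    )


if __name__ == "__main__":
    print(generate_reply("Write a one-line Python function that squares a number."))
```

Because the models are hosted as standard Hub repositories, the same pattern works for the MoE and vision variants by swapping in their model IDs, though the vision model additionally takes image inputs.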
