Microsoft Unveils New Small Language Model, Phi-2, for Research Use

Microsoft has launched Phi-2, the latest version of its small language models (SLMs), exclusively for research use through the Azure AI Studio model catalog. SLMs, which are typically trained on domain-specific data, offer an affordable option for academic work because they do not require the extensive computational resources of larger models.

Phi-2 distinguishes itself by training on "textbook-quality" data, chosen for its educational value and content quality. "Phi-2's modest size makes it perfect for research exploration, including areas such as mechanistic interpretability, safety enhancements, or fine-tuning tasks," Microsoft stated.

The company has augmented Phi-2's training data with synthetic datasets intended to teach common-sense reasoning and general knowledge in areas such as science, everyday activities, and theory of mind. This is complemented by selected web data, filtered for educational value and content quality.

Despite being labeled "small," Phi-2 is a significant step up from its predecessor, Phi-1.5: it has 2.7 billion parameters, compared with the latter's 1.3 billion. Microsoft claims that Phi-2 can match or outperform models up to 25 times its size, which it attributes to more efficient scaling methods.

In benchmark tests, Phi-2 exceeded the performance of other SLMs, including Mistral and Llama-2, in tasks such as reasoning, language comprehension, mathematics, and coding. It also competes with Google's latest SLM, Gemini Nano 2, despite having fewer parameters.

According to Microsoft, Phi-2 was trained on specially curated data and did not undergo reinforcement learning from human feedback (RLHF), yet it exhibits lower levels of bias and toxicity than Llama-2 and previous Phi versions.

About the Author

Chris Paoli (@ChrisPaoli5) is the associate editor for Converge360.