In recent months, the remarkable strides made in AI innovation have ignited a wave of transformative possibilities, captivating our collective imagination with the promise of reshaping industries and the way we work.
Today, at Microsoft Inspire, Meta and Microsoft announced support for the Llama 2 family of large language models (LLMs) on Azure and Windows. Llama 2 is designed to enable developers and organizations to build generative AI-powered tools and experiences. Meta and Microsoft share a commitment to democratizing AI and its benefits, and we are excited that Meta is taking an open approach with Llama 2. We offer developers choice in the types of models they build on, supporting both open and frontier models, and we are thrilled to be Meta’s preferred partner as it releases its new version of Llama 2 to commercial customers for the first time.
Now Azure customers can fine-tune and deploy the 7B, 13B, and 70B-parameter Llama 2 models more easily and safely on Azure, the platform for the most widely adopted frontier and open models. In addition, Llama 2 will be optimized to run locally on Windows. Windows developers will be able to use Llama 2 by targeting the DirectML execution provider through ONNX Runtime, allowing a seamless workflow as they bring generative AI experiences to their applications.
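As a rough illustration of the Windows path above, the sketch below shows how a developer might prefer the DirectML execution provider and fall back to CPU. The provider names match those exposed by ONNX Runtime; the model path and the commented usage lines are assumptions (they require the onnxruntime-directml package and an ONNX export of Llama 2, which this post does not specify).

```python
# Sketch: pick the DirectML execution provider (GPU on Windows) when it is
# available, otherwise fall back to the CPU provider. The provider names
# are the ones ONNX Runtime reports; the model file below is a placeholder.

def pick_provider(available_providers):
    """Return the most preferred execution provider that is available."""
    preferred = ["DmlExecutionProvider", "CPUExecutionProvider"]
    for provider in preferred:
        if provider in available_providers:
            return provider
    raise RuntimeError("No supported execution provider found")

# Hypothetical usage (requires onnxruntime-directml and a Llama 2 ONNX export):
# import onnxruntime as ort
# provider = pick_provider(ort.get_available_providers())
# session = ort.InferenceSession("llama2-7b.onnx", providers=[provider])
```

Keeping the provider choice in one place makes it easy to run the same inference code on machines with and without a capable GPU.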
Our growing partnership with Meta
Meta and Microsoft have been longtime partners on AI, starting with a collaboration to integrate ONNX Runtime with PyTorch to create a great developer experience for PyTorch on Azure, and Meta’s choice of Azure as a strategic cloud provider. Today’s announcement builds on our partnership to accelerate innovation in the era of AI and further extends Microsoft’s open model ecosystem and position as the world’s supercomputing platform for AI.
Azure’s purpose-built AI supercomputing platform is uniquely designed, from facilities to hardware and software, to support the world’s leading AI organizations in building, training, and deploying some of the most demanding AI workloads. The availability of the Llama 2 models in Azure AI enables developers to take advantage of Azure AI’s powerful tooling for model training, fine-tuning, and inference, and particularly the capabilities that support AI safety.
The inclusion of the Llama 2 models on Windows helps position Windows as the best place for developers to build AI experiences tailored to their customers’ needs, using world-class tools like Windows Subsystem for Linux (WSL), Windows Terminal, Microsoft Visual Studio, and VS Code.
Expanding Azure AI model catalog and Windows availability
Llama 2 is the latest addition to our growing Azure AI model catalog. The model catalog, currently in public preview, serves as a hub of foundation models and empowers developers and machine learning (ML) professionals to easily discover, evaluate, customize, and deploy pre-built large AI models at scale.
The catalog eliminates the need for users to manage all infrastructure dependencies when operationalizing Llama 2. It provides turnkey support for model fine-tuning and evaluation, including powerful optimization techniques such as DeepSpeed and ONNX Runtime, which can significantly speed up model fine-tuning.
Windows developers will be able to easily build new experiences using Llama 2, which can be accessed via the GitHub repository. With Windows Subsystem for Linux and highly capable GPUs, developers can fine-tune LLMs to meet their specific needs right on their Windows PCs.
Building responsibly with Azure
Responsible AI is at the heart of Microsoft’s approach to AI and how we partner. For years we’ve invested heavily in making Azure the place for responsible, cutting-edge AI innovation, whether customers are building their own models or using pre-built and customizable models from Microsoft, Meta, OpenAI and the open-source ecosystem.
At Microsoft, we mitigate potential risks presented by the use of large language models through an iterative, layered approach that includes experimentation and measurement. Azure AI customers can test Llama 2 with their own sample data to see how it performs for their particular use case. Then, customers can use prompt engineering and retrieval augmented generation (RAG) techniques to develop, evaluate and optimize meta-prompts for their app and deliver safer and more reliable experiences for end users.
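To make the RAG workflow above concrete, here is a minimal sketch: score candidate documents by word overlap with the user's question, then ground the prompt in the best match. A production system would use embeddings and a vector index rather than word overlap; the document store and prompt template here are illustrative assumptions, not part of any Azure AI API.

```python
# Minimal retrieval-augmented generation (RAG) sketch. A real deployment
# would retrieve with embeddings and a vector index; word overlap is used
# here only to keep the example self-contained.

def retrieve(question, documents):
    """Return the document sharing the most words with the question."""
    q_words = set(question.lower().split())
    return max(documents, key=lambda d: len(q_words & set(d.lower().split())))

def build_prompt(question, context):
    """Compose a grounded meta-prompt for the model (illustrative template)."""
    return (
        "Answer using only the context below. If the answer is not "
        "in the context, say you don't know.\n"
        f"Context: {context}\n"
        f"Question: {question}"
    )

docs = [
    "Llama 2 models are available in 7B, 13B, and 70B parameter sizes.",
    "Azure AI Content Safety helps filter harmful model output.",
]
question = "What parameter sizes does Llama 2 come in?"
prompt = build_prompt(question, retrieve(question, docs))
```

Grounding the prompt in retrieved context, as the paragraph describes, is one of the main levers customers have for making responses more reliable without retraining the model.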
Services like Azure AI Content Safety add another layer of protection, helping ensure a safer online experience with AI apps. As part of our collaboration with Meta, we combined Meta’s safety techniques with Azure AI Content Safety, so that deployments of the Llama 2 models in Azure AI come with a layered safety approach by default.
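The layered approach can be pictured as checks on both sides of the model call: screen the user's input before it reaches the model, and screen the model's output before it reaches the user. The sketch below is a hypothetical stand-in, not the Azure AI Content Safety API; the blocklist classifier and `generate` callback are placeholder assumptions.

```python
# Sketch of layered safety around a model call: filter the prompt going in
# and the response coming out. BLOCKED_TERMS and is_unsafe() are toy
# stand-ins for a real moderation service such as Azure AI Content Safety.

BLOCKED_TERMS = {"exploit-kit", "attack-plan"}  # illustrative placeholder list

def is_unsafe(text):
    """Flag text containing any blocked term (stand-in for a real classifier)."""
    return bool(set(text.lower().split()) & BLOCKED_TERMS)

def safe_generate(prompt, generate):
    """Apply safety checks before and after the model call."""
    if is_unsafe(prompt):
        return "Request blocked by input filter."
    output = generate(prompt)
    if is_unsafe(output):
        return "Response withheld by output filter."
    return output
```

Stacking independent checks like this means a failure in any single layer does not leave the application unprotected.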
Today’s expansion of our model catalog with Llama 2 and our partnership with Meta is a big step forward in achieving a responsible, open approach to AI.