Meta has already made Llama-2 largely open source, but it’s now going a step further — it’s working to make Llama-2 available on users’ phones.
Meta and chip maker Qualcomm are working to bring the Llama-2 model to phones. The companies say that Llama-2 should be available on flagship smartphones and PCs starting in 2024. This is perhaps the first instance of a chip maker partnering with an LLM creator to optimize a model for mobile devices. “Developers can start today optimizing applications for on-device AI using the Qualcomm® AI Stack – a dedicated set of tools that allow to process AI more efficiently on Snapdragon, making on-device AI possible even in small, thin, and light devices,” the companies said in a statement.
“We applaud Meta’s approach to open and responsible AI and are committed to driving innovation and reducing barriers-to-entry for developers of any size by bringing generative AI on-device,” said Durga Malladi, senior vice president and general manager of technology, planning and edge solutions businesses at Qualcomm. “To effectively scale generative AI into the mainstream, AI will need to run on both the cloud and devices at the edge, such as smartphones, laptops, vehicles, and IoT devices,” he added.
There could be several advantages to having a powerful LLM run directly on users’ phones. For starters, users would be able to access Llama-2 without an internet connection, even with the phone in flight mode. An on-phone implementation would also help protect users’ privacy — their questions and conversations would theoretically never leave their phones. The LLM would also be free to run, unlike models such as GPT-4, which charge users per token.
And it’s already been demonstrated that LLMs don’t need sophisticated cloud infrastructure for inference. Several people have been tinkering with open-source models and getting them to run on all manner of devices, including Apple M1 computers, a Pixel phone, a Raspberry Pi, and even a calculator. But having an AI company and a chip maker jointly work on getting an LLM to run on a phone could lead to a faster and more efficient implementation than those hacked together by enthusiasts.
All this wouldn’t be the best news for OpenAI, the pioneer in creating public-facing LLMs. OpenAI has followed a path that’s diametrically opposed to what Meta is doing — while its models are closed-source, Meta’s Llama-2 is largely open source; while OpenAI charges by the token for GPT-4, Llama-2 is free. And if Llama-2 can work on phones without requiring an internet connection, and provide reasonable results, it could significantly dent OpenAI’s business prospects. It’s still early days, but with open-source models, rapid advances in LLM research, and now models which run on phones, OpenAI’s closed AI approach is getting challenged in more ways than one.