Generative AI on the Edge
NEWS
On 24 February, Qualcomm announced that it had successfully deployed Stable Diffusion, a text-to-image generative Artificial Intelligence (AI) model capable of creating photorealistic images from any text input within tens of seconds, on an Android phone through full-stack AI optimization.
Before this demo, Stable Diffusion, with over one billion parameters, was confined to running in the cloud. Qualcomm started with the FP32 version 1-5 open-source model from Hugging Face and optimized it through model compression and hardware acceleration. To shrink the model from FP32 to INT8, it used post-training quantization from the AI Model Efficiency Toolkit (AIMET), a tool developed from techniques created by Qualcomm AI Research.
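To make the FP32-to-INT8 step concrete, the following is a minimal sketch of generic affine post-training quantization on a handful of weights, plus the weight-storage arithmetic for a model of roughly one billion parameters. It is an illustration under stated assumptions, not Qualcomm's pipeline and not the AIMET API: the function names, the sample weights, and the per-tensor min/max calibration are all hypothetical choices made for this example.

```python
# Illustrative sketch only: generic affine post-training quantization.
# This is NOT the AIMET API; every name below is hypothetical.

def quantize_int8(weights):
    """Map a list of floats to INT8 codes plus the (scale, zero_point) used."""
    # The representable range must include 0.0 so that zero maps exactly.
    w_min = min(min(weights), 0.0)
    w_max = max(max(weights), 0.0)
    # 255 steps span the INT8 range [-128, 127]; fall back to 1.0 if all zeros.
    scale = (w_max - w_min) / 255.0 or 1.0
    zero_point = round(-128 - w_min / scale)
    codes = [max(-128, min(127, round(w / scale) + zero_point)) for w in weights]
    return codes, scale, zero_point


def dequantize_int8(codes, scale, zero_point):
    """Recover approximate float values from INT8 codes."""
    return [(c - zero_point) * scale for c in codes]


def weight_storage_gb(num_params, bytes_per_param):
    """Storage for the weights alone, in decimal gigabytes."""
    return num_params * bytes_per_param / 1e9


# Hypothetical sample weights standing in for one small tensor.
weights = [-0.62, -0.1, 0.0, 0.25, 0.9]
codes, scale, zero_point = quantize_int8(weights)
restored = dequantize_int8(codes, scale, zero_point)
max_error = max(abs(a - b) for a, b in zip(weights, restored))

# Assumes exactly one billion parameters as a round figure.
fp32_gb = weight_storage_gb(1_000_000_000, 4)  # 4 bytes per FP32 weight
int8_gb = weight_storage_gb(1_000_000_000, 1)  # 1 byte per INT8 weight

print(codes, scale, zero_point, max_error)
print(fp32_gb, int8_gb)
```

Under these assumptions, the weights alone shrink by a factor of four, at the cost of a rounding error of at most half a quantization step per weight in this sample. Production toolkits add calibration data, per-channel scales, and accuracy-recovery techniques that this sketch leaves out.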
In the demo, Qualcomm ran 20 inference steps in under 15 seconds to generate a 512x512 pixel image. Qualcomm claimed this to be the world’s first successful demo of generative AI on an edge device, opening up the potential of running generative AI on consumer devices.
Collaboration as the Catalyst for Rapid AI Development
IMPACT
The demo’s result is significant for the edge AI industry. Qualcomm’s use case is a solid first step toward deploying resource-demanding generative AI on edge devices. Generative AI models create new outputs based on the data they have been trained on. The outputs can be text, as with GPT-3 and its conversational AI derivative, ChatGPT; synthetic data, as with Generative Adversarial Networks (GANs); or images generated from text, as with Stable Diffusion. These models can even be combined under a single framework, such as Visual Commonsense Reasoning in Time (VisualCOMET), to understand and contextualize images and videos. Traditionally, all these models have been extremely compute-intensive due to their large number of parameters and have required data center-class Graphics Processing Units (GPUs) to run. However, Qualcomm’s demo shows that generative AI can also run on commercial devices given the right edge AI chipset.
As Qualcomm noted, edge AI hardware alone is not the solution. The critical component of this successful demo is the combination of quantization, compilation, and hardware acceleration from Qualcomm AI Research. According to ABI Research’s previous insight (IN-6849), major chipset vendors are introducing advanced software to optimize AI model performance and developer experience for their hardware platforms. In June 2022, Qualcomm announced it was strengthening its existing AI Software Development Kit (SDK) with its Qualcomm AI Stack. Qualcomm also partnered with Google Cloud Platform to integrate the Vertex AI Neural Architecture Search (NAS) capability into its AI Engine Direct.
Work Together to Gain Upper Hand
RECOMMENDATIONS
The demo also sends an important message: the future development of AI will rely more heavily on collaboration between big market players and turnkey service providers, i.e., companies specializing in AI optimization that accelerate the AI development process. For example, Hugging Face provides optimized open-source models that significantly reduce research and development time. With an optimized model in hand, major AI chipset players can focus on developing more sophisticated AI products.
The total edge AI market is estimated to reach US$34 billion by 2028 (MD-AIML-110). Time is money; whoever can deploy a mature and proven edge AI solution the fastest will grab the lion’s share. The partnership between Qualcomm and Hugging Face is a win-win for both parties, especially in edge AI, where the use cases are diverse and market opportunities are widespread. Simplifying the AI development process is therefore the key to staying competitive in the AI market. ABI Research’s recommendation is to explore, forge, and deepen collaboration opportunities between vendors, especially Software-as-a-Service (SaaS) and turnkey service providers, in order to capture market share as quickly as possible. We expect other edge AI chipset vendors to catch up with this development. For example, Perceive’s recently launched Ergo 2 chipset can support Large Language Models (LLMs) in low-power consumer devices. In January 2023, Deci, an AI software company that provides a deep learning acceleration platform, achieved GPU-like AI inference performance on Intel’s 4th Gen Xeon Scalable processors for computer vision and Natural Language Processing (NLP) tasks in edge cloud servers.
In the foreseeable future, such collaboration efforts will become more common and more important. The challenge is to identify compatible vendors, since the AI market is extremely diversified. The safe bet is to find vendors that host open-source frameworks or models with value-added services, to ensure compatibility and interoperability within the wider AI ecosystem.