AI Accelerated Hardware Unlocks On-Device Generative AI
NEWS
Generative Artificial Intelligence (AI) inferencing was initially restricted to cloud environments due to the vast memory, compute, and power required to run workloads from models containing hundreds of billions of parameters. However, Small Language Models (SLMs), enabled by advances in model compression and optimization techniques, coupled with hardware performance gains from AI accelerators within heterogeneous systems, are allowing inference to move toward more memory- and resource-constrained form factors like Personal Computers (PCs) and smartphones. The market for on-device generative AI has made huge strides over the last year, with all major Original Equipment Manufacturers (OEMs) and chip vendors announcing and bringing to market solutions capable of handling AI inference workloads on-device. Notable developments in PC AI include:
- Select AMD Ryzen 7000 and 8000 series processors contain AMD Ryzen AI, a heterogeneous computing platform made up of AMD Radeon Graphics Processing Units (GPUs), Ryzen Central Processing Units (CPUs), and dedicated AI engines or Neural Processing Units (NPUs). Select enterprise-destined chipsets also contain the AMD Ryzen AI engine.
- Intel Core Ultra (Meteor Lake family) series desktop and laptop processors integrate the CPU, GPU, and NPU into a heterogeneous computing chipset, distributing workloads across the three architectures.
- At CES, NVIDIA announced that its GeForce RTX PC GPUs, capable of handling generative AI workloads on-device, will be complemented by an array of new hardware (GeForce RTX SUPER GPUs targeting demanding workloads in gaming, development, and the creative industries) and tools to bring generative AI to devices.
- Qualcomm’s Snapdragon X Elite Systems-on-Chip (SoCs) for laptops pair its Hexagon NPU with Arm CPUs to support on-device generative AI models of up to 13 billion parameters, inferencing at up to 30 tokens per second.
- Apple’s 4Q 2023 M3 series SoCs for PCs contain an updated version of its Neural Engine, as well as Arm CPUs, supporting on-device AI/Machine Learning (ML) workloads, including transformer models with billions of parameters.
Interestingly, both Apple and Qualcomm SoCs use Arm CPUs, eschewing the more power-hungry x86 CPUs from Intel and AMD. Apple’s successful journey to Arm SoCs in its PCs started with its 2020 M1 series chipsets, which initiated the transition away from Intel x86 processors. Two generations later, it has demonstrated Arm’s viability in the PC form factor. Moreover, the neural engines included in the heterogeneous M1 SoCs, and carried through the succeeding M2 and M3 families, have proven successful in powering general AI workloads and, with the M3, generative AI.
If Apple Can, Why Can't Others?
IMPACT
An impact of this innovation cycle is the renewed interest in Arm architectures running Windows, which remains the most popular Operating System (OS) for PCs. It has been reported that NVIDIA and AMD have begun designing CPUs based on the Arm Instruction Set Architecture (ISA), as part of Microsoft’s drive for chip vendors to implement an alternative to x86 and capitalize on, for example, the superior energy efficiency of Arm SoCs. Add to that Microsoft’s October 2023 announcement of an Arm Advisory Service to help Independent Software Vendors (ISVs) bring their software to Windows on Arm and increase customer adoption of Arm systems, which coincides neatly with the rumors surrounding AMD and NVIDIA’s interest in the alternative architecture.
The vast majority of Windows PCs run on legacy x86 CPUs, around which a billion-dollar software and OEM ecosystem has been built (Apple has its own OS, macOS). However, Windows has long supported Arm architectures, and, if the reports are confirmed, NVIDIA and AMD would join Qualcomm, which has been producing Arm-based SoCs for laptops for several years (and will continue to do so with Snapdragon X Elite, set to ship in mid-2024, in time for the next academic year, when many will replace existing devices). Moreover, forthcoming Windows PCs will need on-device AI capabilities as Microsoft rolls out Copilot and other generative AI features requiring heterogeneous architectures across the newest PCs running Windows 11 (and eventually 12), with the company proclaiming 2024 the “year of the AI PC.” These capabilities are supported by Qualcomm’s PC chipset and would be joined by any Arm PC systems developed by AMD and NVIDIA.
Applications, Not Hardware, Will Drive Adoption
RECOMMENDATIONS
For 2024 to be the year of the AI PC, and for Arm to catch up, the ISV ecosystem will need to build out its applications, which must be compelling enough to convince customers to upgrade. To this end, all of the above vendors offer Software Development Kits (SDKs) and toolkits to spur the development of AI software that runs efficiently on their respective hardware platforms. But for Arm to become a leader alongside x86-powered Windows PCs, its relatively niche position will need to be elevated. The ecosystem of OEMs, chip vendors, and ISVs must address several items:
- OEMs and Microsoft will need to tempt customers to move to Arm-based systems, lifting the architecture out of the market niche in which it currently resides. This requires concerted marketing efforts to build trust in devices powered by an ISA other than the legacy x86 processors from AMD and Intel.
- Differentiation from x86 systems should be highlighted to spur demand. This includes superior connectivity, extended battery life, and lightweight devices.
- Windows software applications will need to be ported to Arm, which requires significant work, given the billion-dollar legacy of the mature x86 ecosystem in the PC space. Apple has an advantage here given its full-stack control from OS to chipset design.
- Vendor efforts in ISV partnerships should be ramped up. Intel is the leading example in the x86 arena, while the comparatively smaller publicized partnership ecosystems of Qualcomm and AMD demonstrate the fragmentation on the Arm side.
- Focus on building an Arm PC software ecosystem around these chipsets, enabling rapid application development with tools such as model optimization to expedite deployment on Arm CPUs.
Custom Arm licenses to run Windows are expensive, which raises the question of whether a RISC-V solution, an alternative to both x86 and Arm explored in ABI Research’s RISC-V for Edge AI Applications report, would be more attractive, benefiting from the instruction set’s flexibility and lack of licensing fees. The strong interest in RISC-V in China also makes it a potentially commercially beneficial choice, as devices could be developed using home-grown chipsets likely favored by Chinese consumers.