Private Cloud Steps Forward for AI Inferencing, Helping Overcome Some Early Enterprise Privacy and Availability Concerns
By Reece Hayden | 25 Jun 2024 | IN-7421
Leading Technology Companies Tap into Private Cloud for AI Processing
NEWS
Private AI is not a new concept: Equinix, VMware, Intel, and NVIDIA are already active. Once again, NVIDIA has unveiled a partner-led solution that might revolutionize the enterprise private Artificial Intelligence (AI) cloud opportunity. Alongside HPE, it announced NVIDIA AI Computing by Hewlett Packard Enterprise (HPE). A core element of this solution is HPE Private Cloud AI, a turnkey, fully integrated private AI solution that enables enterprises to deploy generative AI with device-like privacy. This solution is built around NVIDIA AI Enterprise (including NVIDIA Inference Microservices (NIMs)) and HPE AI Essentials, which have been tightly integrated to provide an end-to-end AI solution that supports easy deployment of optimized AI applications. On the face of it, this solution is hitting the right notes in the enterprise market, as it reduces time-to-value, bridges enterprise talent gaps, and soothes data privacy and security concerns. It clearly makes a great deal of sense for NVIDIA, as it will accelerate enterprise AI adoption while tying enterprises deeper into its ecosystem, which will certainly ensure strong recurring software revenue to complement hardware sales.
Other vendors are also seeing an important position for private cloud in the growing AI market. At its much-awaited (but slightly underwhelming) WWDC24, Apple threw its weight behind Private Cloud Compute (PCC) for AI inferencing. Alongside on-device AI and its third-party partnership with OpenAI, Apple highlighted that “Apple Intelligence” will use PCC for advanced reasoning, utilizing larger foundation models. PCC is designed specifically for private AI processing and aims to extend device-level security/privacy to the cloud. It does this not by relying on traditional cloud security measures, but by setting core requirements, including stateless computation of personal user data (user data cannot be available to anyone during processing and the data cannot be retained). Although Apple’s partnership with OpenAI has come under fire from commentators like Elon Musk, PCC seems to provide a solution that effectively balances the benefits of larger foundation models for more complex applications with the risks of data privacy. However, that is not to say that it will be free of challenges that do not affect on-device AI, such as availability, latency, and data sovereignty.
Private Cloud Has Privacy Advantages, but May Hinder AI Scalability
IMPACT
From an enterprise perspective, private cloud deployments can be broadly split into four categories: on-premises private cloud (hosted in an enterprise data center), virtual private cloud (hosted within a public cloud environment with additional security), hosted private cloud (run off-premises on a Communication Service Provider (CSP) server), and managed private cloud (hosted private cloud with management services, including maintenance, upgrades, and security management tools).
Private clouds bring numerous benefits for enterprises and consumers by mitigating AI deployment risks. First, they provide enterprises with greater control over costs. Public cloud costs are unpredictable, as they are based on usage and carry high egress fees. Private cloud costs are more or less fixed, with resources allocated to support specific requirements. Second, private cloud deployment mitigates data privacy challenges by ensuring that user data are retained locally and not shared across publicly available resources. With hosted or on-premises private cloud deployments, enterprises can ensure that they are the only ones with access to the servers.
However, private AI deployments bring challenges, as they have high setup costs. As AI is an emerging technology with an unclear Return on Investment (ROI) for enterprises, this high Capital Expenditure (CAPEX) may slow Proofs of Concept (PoCs) and implementation, limiting value creation. In addition, private AI is less scalable, as it is limited to the compute resources that enterprises have invested in. This can create challenges for enterprises as AI usage scales quickly, impacting performance and experience. Lastly, most private cloud solutions do not have access to AI development and Machine Learning Operations (MLOps) platforms like SageMaker or Vertex AI, which means that any AI deployment will be more time, resource, and talent intensive, as resource orchestration, data tools, and inference/training services are not readily available and will require integration.
Enterprise AI deployment is still at a relatively early stage (given the high rate of adoption ABI Research expects in the next 6 years), with most enterprises still running PoCs and deploying AI to support individual use cases. At this stage, it makes most sense to use public cloud resources, as they come with no CAPEX requirements, can scale quickly to meet capacity demands, and can be integrated with hyperscaler AI platforms to enable testing and evaluation of new AI models. As enterprise strategies mature, however, the public cloud will hinder AI rollout, as costs will quickly skyrocket and data privacy concerns will hinder C-suite buy-in. ABI Research expects that maturing enterprise AI strategies will lead to growing private cloud deployments over the next 6 years.
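The cost trade-off described above (usage-based public cloud spending versus largely fixed private cloud spending with high upfront CAPEX) can be illustrated with a simple break-even calculation. The following is a minimal sketch only: the function names, parameters, and the simplified cost model are assumptions made for illustration, not figures or a methodology from ABI Research or any vendor.

```python
import math
from typing import Optional


def monthly_public_cost(
    usage_units: float,
    price_per_unit: float,
    egress_gb: float,
    egress_price_per_gb: float,
) -> float:
    """Usage-based public cloud bill: compute usage plus data egress fees.

    All four inputs are hypothetical; real pricing has many more components.
    """
    return usage_units * price_per_unit + egress_gb * egress_price_per_gb


def breakeven_months(
    private_capex: float,
    private_monthly_opex: float,
    public_monthly_cost: float,
) -> Optional[int]:
    """Months until private cloud CAPEX is recovered by lower running costs.

    Returns None when the private cloud is not cheaper to run each month,
    in which case the upfront investment is never recovered under this model.
    """
    monthly_saving = public_monthly_cost - private_monthly_opex
    if monthly_saving <= 0:
        return None
    return math.ceil(private_capex / monthly_saving)
```

Under this simplified model, low usage during the PoC stage keeps the monthly saving small or negative, so the public cloud is the cheaper choice. As usage and egress grow, the monthly saving rises and the break-even point moves closer, which is consistent with the shift toward private cloud described above.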
Where Does the Private Cloud Fit into the Hybrid AI Enterprise Story and What Should Innovators Do?
RECOMMENDATIONS
The future of AI processing is hybrid. When taken to its logical conclusion, this means that AI processing is spread across the entire distributed compute continuum from device to public cloud. Each deployment location is used for different workloads depending on numerous requirements, including data privacy, hardware access, latency, model size, availability, connectivity, cost, and energy consumption. For example, the public cloud may be used to perform inference on large, third-party Application Programming Interface (API)-based models, while device AI will be used to run inference on small, use case-specific models with stricter data privacy and latency expectations. Private clouds will be an important domain within a hybrid AI system, as they can effectively balance reliability, cost, security, and performance, while ensuring that users have direct control over their data.
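The placement logic described above can be sketched as a small rule-based function. This is an illustrative sketch under stated assumptions: the `Workload` fields, the two threshold constants, and the rule ordering are hypothetical and consider only three of the requirements listed (model size, data privacy, and latency). A real placement policy would also weigh hardware access, availability, connectivity, cost, and energy consumption.

```python
from dataclasses import dataclass
from enum import Enum


class Location(Enum):
    DEVICE = "device"
    PRIVATE_CLOUD = "private cloud"
    PUBLIC_CLOUD = "public cloud"


@dataclass(frozen=True)
class Workload:
    model_params_billions: float  # model size, in billions of parameters
    handles_sensitive_data: bool  # strict data privacy expectations
    max_latency_ms: float         # latency the use case can tolerate


# Hypothetical thresholds chosen only for illustration.
DEVICE_MAX_PARAMS_BILLIONS = 3.0
DEVICE_LATENCY_CUTOFF_MS = 100.0


def place(workload: Workload) -> Location:
    """Pick a deployment location for an inference workload."""
    fits_on_device = workload.model_params_billions <= DEVICE_MAX_PARAMS_BILLIONS
    latency_critical = workload.max_latency_ms <= DEVICE_LATENCY_CUTOFF_MS

    # Small models with strict privacy or latency needs run on the device.
    if fits_on_device and (workload.handles_sensitive_data or latency_critical):
        return Location.DEVICE
    # Larger models on sensitive data stay under the user's direct control.
    if workload.handles_sensitive_data:
        return Location.PRIVATE_CLOUD
    # Everything else can use elastic public cloud capacity.
    return Location.PUBLIC_CLOUD
```

For example, `place(Workload(70.0, True, 500.0))` returns `Location.PRIVATE_CLOUD`, since the model is too large for the device threshold but the data are sensitive.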
Consequently, as private clouds for AI deployment grow, Original Equipment Manufacturers (OEMs) must develop both their solution proposition and their commercial approach. But how?
- Building turnkey solutions through partnerships with chip vendors (like AMD, Intel, and NVIDIA) with tight integration between hardware/software will help accelerate deployments across cloud and enterprise premises. OEMs must diversify their private AI partnerships to mitigate supply chain risks associated with AI hardware vendors. This should extend to regional partnerships to limit geopolitical risks.
- As security is a key driver for private AI deployments, investing in internal capabilities or developing partnerships with cloud-focused cybersecurity vendors like Red Hat will help build further market awareness and differentiation.
- Although potentially creating privacy/security challenges, OEMs should partner with and integrate AI/data specialists into their software solutions to help bridge enterprise AI talent gaps by providing the foundation to build AI solutions within a secure, private AI environment. This partnership could expose enterprises to ready-made, pre-trained foundation models and other MLOps tools, reducing time to market and development costs. However, private clouds should not be directly integrated into AI development platforms due to enterprise security considerations.
Beyond technology investment and partnerships, OEMs should also look to develop further commercial partnerships—especially with infrastructure aggregators like regional cloud service providers, neutral hosts, interconnection providers, and even telcos. These infrastructure vendors can provide the necessary resources to support enterprise private AI deployments. These “local aggregators” will become increasingly important for enterprise AI strategies that focus on regulatory compliance, especially AI/data sovereignty. ABI Research expects that, moving forward, enterprise security, sovereignty, cloud lock-in, and cost concerns will contribute to growing demand for private cloud AI deployments.