Akamai has launched Akamai Cloud Inference, a service designed to improve the efficiency and cost-effectiveness of organizations running large language models (LLMs) and predictive models. Built on Akamai's highly distributed cloud platform, it addresses the drawbacks of centralized cloud systems: Akamai claims the service can cut AI inference and related workload costs by up to 86% compared with traditional hyperscaler infrastructure. By bringing AI data closer to users and devices, Akamai aims to overcome major limitations of legacy cloud models and improve performance metrics such as throughput and latency.
Enhancing AI Performance Through Distributed Cloud
Adam Karon, Akamai’s Chief Operating Officer and General Manager of the Cloud Technology Group, emphasizes the unique advantage provided by Akamai Cloud Inference. He notes that while large-scale training of LLMs will continue to occur in big hyperscale data centers, inferencing tasks can be efficiently handled at the edge through Akamai’s extensive distributed cloud infrastructure, developed over the past 25 years. This shift allows organizations to achieve up to three times better throughput and up to 60% lower latency compared with mainstream hyperscaler infrastructure.
Akamai Cloud Inference comes with a comprehensive toolset tailored for platform engineers and developers, enabling them to build and run AI applications and data-intensive workloads close to end users. These tools cover compute options, data management, and containerization, all operating on a globally distributed infrastructure. Traditional CPUs, GPUs, and specialized ASIC VPUs form the backbone of the compute capabilities, integrated with NVIDIA’s AI Enterprise ecosystem to further optimize AI inference performance on NVIDIA GPUs.
Advanced Data Management and Containerization
The platform handles data management through a collaboration with VAST Data, offering the real-time data access that AI workloads depend on. It also provides scalable object storage and partnerships with vector database providers such as Aiven and Milvus. These integrations support seamless AI data retrieval and management, while the scalable object storage gives the platform the flexibility to handle the vast amounts of data modern AI applications require.
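Vector retrieval of this kind works by embedding both the query and stored items as numeric vectors, then ranking stored items by similarity to the query. A minimal, database-agnostic sketch of that ranking step (the toy embeddings and helper functions below are illustrative, not Akamai's or Milvus's actual API):

```python
import math

def cosine_similarity(a, b):
    # Cosine of the angle between two embedding vectors: 1.0 means identical direction.
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm

def top_k(query_vec, index, k=2):
    # index: list of (doc_id, embedding) pairs, as a vector database would store them.
    scored = [(doc_id, cosine_similarity(query_vec, vec)) for doc_id, vec in index]
    return sorted(scored, key=lambda pair: pair[1], reverse=True)[:k]

# Toy 3-dimensional embeddings standing in for real model output.
index = [
    ("doc-a", [0.9, 0.1, 0.0]),
    ("doc-b", [0.0, 1.0, 0.0]),
    ("doc-c", [0.7, 0.7, 0.1]),
]
print(top_k([1.0, 0.0, 0.0], index, k=2))  # doc-a ranks first
```

Production vector databases replace the linear scan above with approximate nearest-neighbor indexes so that retrieval stays fast at scale.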
Containerization, implemented with Kubernetes, plays a significant role in the platform’s performance, enabling efficient scaling, improved resilience, and better cost management. This capability is reinforced by Linode Kubernetes Engine-Enterprise, which handles large-scale enterprise applications. Akamai Cloud Inference also offers edge compute capabilities built on WebAssembly (Wasm), letting developers execute LLM inferencing directly from serverless applications and supporting latency-sensitive deployments that demand real-time processing.
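The latency win for these deployments comes from answering each request at the nearest point of presence rather than a distant central region. A toy routing sketch under that assumption (the region names and round-trip times below are hypothetical, not Akamai's):

```python
# Hypothetical measured round-trip times (ms) from one client to candidate regions.
REGION_RTT_MS = {
    "edge-frankfurt": 12,
    "edge-newark": 85,
    "central-us-east": 140,
}

def pick_region(rtt_by_region):
    # Route the inference request to the region with the lowest round-trip time.
    return min(rtt_by_region, key=rtt_by_region.get)

region = pick_region(REGION_RTT_MS)
print(region)  # the nearby edge PoP wins for latency-sensitive inference
```

In practice this selection happens inside the platform's request routing rather than in application code, but the principle is the same: more candidate locations close to users means a lower minimum round-trip time.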
Shifting Trends Towards Edge Computing
The transition towards a distributed cloud model aligns with a broader industry trend of managing data closer to where it is generated. This shift is pivotal as organizations move from merely training LLMs to leveraging AI for real-time, intelligent decisions and customized experiences. Adam Karon compares training LLMs to map-making and inference to using a GPS for real-time navigation, underscoring the immediate, intelligent decision-making that edge computing enables.
Early applications of Akamai Cloud Inference underscore its versatility and potential. This includes in-car voice assistance that provides seamless user experiences, AI-driven agricultural management solutions that offer real-time insights for farmers, and personalized virtual shopping experiences that enhance consumer satisfaction. These applications demonstrate the breadth and scope of Akamai Cloud Inference, showcasing its ability to revolutionize various industries by providing critical operational intelligence through its distributed cloud infrastructure.
The Future of AI Efficiency at the Edge
Akamai Cloud Inference positions edge computing as the next frontier for AI efficiency. By moving inference closer to users and devices, the service addresses the limitations of centralized cloud models while cutting AI inference and related workload costs by up to 86% compared with traditional hyperscaler infrastructure. In doing so, Akamai aims to make advanced AI capabilities more accessible and affordable for a wide range of organizations, helping companies apply AI more effectively in today’s data-driven world. By reducing latency and increasing throughput, Akamai sets a new standard for AI processing efficiency at the edge.