Google Launches Parallelstore to Accelerate AI and HPC Workloads
Google Cloud Platform (GCP) recently announced Parallelstore, a managed parallel file storage service designed to meet the demanding input/output (I/O) requirements of artificial intelligence (AI) applications. Built on Intel's Distributed Asynchronous Object Storage (DAOS) architecture, originally developed alongside Intel's now-discontinued Optane persistent memory technology, Parallelstore is positioned to accelerate AI and high-performance computing (HPC) workloads.

The Core of Parallelstore: Leveraging DAOS

DAOS Architecture: Foundation of Performance

Parallelstore's foundation is Intel's DAOS architecture, a parallel file system that spans multiple storage nodes. By keeping file metadata in a fast persistent store and spreading file data across as many nodes as possible, DAOS minimizes I/O latency. Despite the discontinuation of Optane, DAOS still draws on other Intel technologies, such as the Intel Omni-Path communications protocol. Compute nodes consult the metadata servers to locate files for read and write operations, communicating in block mode using RDMA over Converged Ethernet (RoCE).

This design keeps data readily accessible even as Optane is phased out: servers communicate directly, and many parallel streams move data without creating bottlenecks. The result is the high availability and resiliency that intensive AI and HPC tasks demand, with rapid, consistent access to large volumes of data.

Optimizing Data Delivery: Ensuring Maximum Performance

Parallelstore is meticulously designed to saturate server bandwidth to maximize data delivery to GPUs and TPUs. This focus on performance is vital for reducing the expenses associated with extensive AI workloads. According to GCP product director Barak Epstein, Parallelstore can sustain continuous read/write access for thousands of VMs, GPUs, and TPUs, catering to both modest and massively scaled AI and HPC requirements.

By keeping computational resources fed with data, Parallelstore helps organizations cut operational costs and allocate resources more effectively. The benefit is most visible in AI model training, where data throughput and processing speed are critical: better data delivery means faster training runs, lower latency, and higher throughput.

Handling Intensive Workloads: Performance Metrics and Capacities

Throughput and Input/Output Operations Per Second (IOPS)

In a full 100 TB deployment, Parallelstore achieves impressive performance metrics: throughput of approximately 115 GB/s, three million read IOPS, and one million write IOPS, with latency as low as 0.3 milliseconds. These capabilities make Parallelstore well suited to small files and to random, distributed access from numerous clients. In testing, Parallelstore accelerated AI model training by nearly four times compared with other machine learning data loaders, illustrating its effectiveness in real-world AI applications.
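As a back-of-the-envelope illustration, the quoted figures can be reduced to per-terabyte rates and a full-scan time. The linear-scaling assumption and the Python sketch below are illustrative, not taken from the announcement:

```python
# Back-of-the-envelope check of the quoted 100 TB deployment figures.
# Assumption (not from the article): performance scales roughly linearly
# with provisioned capacity, so the per-TB rates are illustrative only.

CAPACITY_TB = 100
THROUGHPUT_GBPS = 115          # ~115 GB/s aggregate throughput
READ_IOPS = 3_000_000
WRITE_IOPS = 1_000_000

throughput_per_tb = THROUGHPUT_GBPS / CAPACITY_TB   # GB/s per provisioned TB
read_iops_per_tb = READ_IOPS / CAPACITY_TB

# Time to stream the full 100 TB once at the quoted aggregate rate:
seconds_to_read_all = CAPACITY_TB * 1000 / THROUGHPUT_GBPS  # TB -> GB

print(f"{throughput_per_tb:.2f} GB/s per TB")            # 1.15 GB/s per TB
print(f"{read_iops_per_tb:,.0f} read IOPS per TB")       # 30,000 read IOPS per TB
print(f"~{seconds_to_read_all:.0f} s to read the full deployment once")
```

At these rates, scanning the entire 100 TB deployment once takes roughly 870 seconds, which gives a feel for why the service targets training loops that re-read large datasets.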

Sustained low latency and high throughput mean that projects with stringent performance demands can rely on Parallelstore consistently, translating into faster data processing, shorter training times, and more efficient AI and HPC workflows overall.

Integration with Google Cloud Storage and Kubernetes

Customers first house their data in Google Cloud Storage, selecting the datasets to be processed through Parallelstore. GCP's Storage Insights Dataset service, part of the Gemini AI offerings, helps evaluate and curate data for AI training. Parallelstore also integrates with Kubernetes clusters on Google Kubernetes Engine (GKE) through dedicated Container Storage Interface (CSI) drivers, letting administrators manage Parallelstore volumes like any other storage attached to GKE.
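In a GKE cluster with the Parallelstore CSI driver enabled, requesting a volume might look like the sketch below. The StorageClass name and capacity are assumptions following ordinary Kubernetes CSI conventions, not values from the announcement:

```yaml
# Illustrative PersistentVolumeClaim for a Parallelstore-backed volume.
# The StorageClass name is hypothetical; actual names depend on how the
# CSI driver is configured in the cluster.
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: parallelstore-pvc
spec:
  accessModes:
    - ReadWriteMany              # shared access across many pods
  storageClassName: parallelstore-class   # hypothetical
  resources:
    requests:
      storage: 12Ti
```

A pod then mounts the claim like any other volume, which is what makes Parallelstore manageable in the same way as other GKE-attached storage.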

This integration with Google's existing cloud infrastructure gives users a unified platform for storage and processing, reducing complexity and streamlining workflows. The combination of Google Cloud Storage and Kubernetes integration provides a scalable, flexible solution tailored to the diverse needs of modern AI and HPC applications.

Addressing AI-Specific Needs: Small Files and Transfer Speed

Small File Handling and Random Access

Parallelstore particularly excels with the small files that are often critical for AI model training. It supports random, distributed access at high volume, handling up to 5,000 files per second for files smaller than 32 MB. This capability matters for AI applications that need swift access to a broad array of small data objects.
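To put the small-file rate in perspective, here is a rough estimate under stated assumptions (a hypothetical one-million-file dataset, and the quoted rate holding steadily at scale):

```python
# Rough small-file ingest estimate from the quoted rate.
# Assumptions: a hypothetical 1M-file dataset, and the ~5,000 files/s
# figure (for files < 32 MB) holding steadily at scale.

FILES_PER_SECOND = 5_000
dataset_files = 1_000_000

seconds = dataset_files / FILES_PER_SECOND
print(f"~{seconds:.0f} s (~{seconds / 60:.1f} min) for {dataset_files:,} small files")
# ~200 s (~3.3 min) for 1,000,000 small files
```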

Handling numerous small files concurrently reduces the delays typically associated with data retrieval, letting AI models reach their training data quickly. That proficiency makes Parallelstore well suited to data-intensive machine learning projects, where quick, reliable data access is crucial.

Data Transfer Speed and Scalability

Transfer speeds into Parallelstore can reach 20 GB/s, which is most valuable for large-scale data migrations: data scientists and engineers can move vast datasets into Parallelstore quickly, reducing downtime and shortening project timelines. Read/write I/O performance also scales near-linearly as client I/O requests increase, making Parallelstore a robust solution for varied and demanding environments.
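The same kind of rough arithmetic shows what the peak ingest rate means for a bulk load; the dataset size below is a hypothetical, not a figure from the announcement:

```python
# Hypothetical bulk-load estimate at the quoted peak transfer rate.
INGEST_GBPS = 20        # up to ~20 GB/s into Parallelstore
dataset_tb = 100        # hypothetical dataset size

seconds = dataset_tb * 1000 / INGEST_GBPS   # TB -> GB
print(f"~{seconds:.0f} s (~{seconds / 3600:.1f} h) to load {dataset_tb} TB")
# ~5000 s (~1.4 h) to load 100 TB
```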

As projects grow and data volumes increase, Parallelstore's high transfer speeds and near-linear scaling help maintain performance without degradation. That scalability is essential for projects that evolve over time, supporting the dynamic requirements of AI and HPC workloads well into the future.

Strategic Utilization of Existing Technologies

Repurposing Intel’s Innovations

Parallelstore's reliance on DAOS and other Intel technologies shows pragmatic repurposing of existing innovations. Although Optane has been phased out, the use of the Intel Omni-Path communications protocol and RDMA over Converged Ethernet (RoCE) keeps the architecture potent and relevant, extending the longevity and utility of proven technologies.

These established technologies give Parallelstore a stable foundation for performance and reliability, and their effective reuse underscores the adaptability of its architecture: a familiar yet advanced option for enterprises seeking robust, reliable storage despite shifts in the underlying technology landscape.

Future Proofing AI and HPC Workloads

Parallelstore arrives as industries come to depend ever more heavily on intensive computation and large-scale data processing, and it is built to keep pace with those demands.

Its managed nature means businesses get high-performance infrastructure without the overhead of maintenance and scaling, while its architecture delivers the throughput and low latency that complex operations require. The launch strengthens GCP's cloud portfolio and sets a benchmark for file storage tailored to demanding AI and HPC environments, helping enterprises pursue optimal efficiency in their computational endeavors.
