Google Cloud takes a shine to Lustre

Google Cloud Managed Lustre, based on DDN’s EXAScaler software, is now generally available.

The service was first announced in April, enabling Google Cloud to offer file storage and fast-access services for enterprises and startups building AI, GenAI, and HPC applications. Lustre is an open source parallel file system offering high throughput and low latency. DDN, Lustre’s primary maintainer, uses it as the software foundation of its scale-out EXAScaler array hardware.

Paul Bloch, DDN

DDN co-founder and president Paul Bloch stated: “By bringing our EXAScaler technology to Google Cloud customers as a fully managed service, we’re enabling organizations across industries to accelerate innovation without the overhead of managing complex infrastructure.”

Google Cloud Managed Lustre is aimed at tightly coupled HPC workloads as well as AI training and inference. Google says it delivers up to 1 TBps of read throughput with sub-millisecond latency and scales from 18 TiB to more than 8 PiB. With multiple performance tiers (125 MBps/TiB to 1,000 MBps/TiB), customers can tailor performance to the specific needs of their AI, simulation, or analytics workloads.
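For a rough sense of how the per-TiB tiers relate to the headline 1 TBps figure, here is a back-of-the-envelope sketch in Python. It assumes aggregate read throughput scales linearly with provisioned capacity, which the per-TiB tiering implies but the announcement does not spell out, and it uses only the two tier endpoints quoted above.

```python
# Back-of-the-envelope sizing check based on the published figures.
# Assumption: aggregate read throughput = capacity (TiB) x tier (MBps/TiB).

def aggregate_read_gbps(capacity_tib: float, tier_mbps_per_tib: float) -> float:
    """Estimated aggregate read throughput in GBps for a capacity and tier."""
    return capacity_tib * tier_mbps_per_tib / 1000.0

if __name__ == "__main__":
    # At the top 1,000 MBps/TiB tier, roughly 1,000 TiB (about 1 PiB) of
    # provisioned capacity is needed to reach ~1 TBps of aggregate read.
    for capacity_tib in (18, 100, 1000):
        for tier in (125, 1000):  # the two tier endpoints quoted in the article
            gbps = aggregate_read_gbps(capacity_tib, tier)
            print(f"{capacity_tib:>5} TiB @ {tier:>5} MBps/TiB ~ {gbps:,.1f} GBps read")
```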

It is POSIX-compliant and integrates natively with Google services such as Compute Engine, Google Kubernetes Engine (GKE), IAM, VPC Service Controls, and the Vertex AI platform, and it works with Google Cloud’s Nvidia GPU servers. Vertex AI is Google’s combined data engineering, data science, and ML engineering workflow offering for training, deploying, and customizing large language models (LLMs) and developing AI apps.
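POSIX compliance means applications on Compute Engine VMs or GKE nodes can use ordinary file I/O once the file system is mounted; no Lustre-specific client API is required. A minimal Python illustration, assuming a hypothetical mount point of /mnt/lustre (the actual path depends on how the instance is mounted):

```python
import os

# Hypothetical mount point; the real path depends on how the Lustre
# instance was mounted on the VM or GKE node.
MOUNT_POINT = "/mnt/lustre"

# Standard POSIX file operations work as they would on any local file system.
checkpoint_dir = os.path.join(MOUNT_POINT, "checkpoints")
os.makedirs(checkpoint_dir, exist_ok=True)

with open(os.path.join(checkpoint_dir, "epoch_001.bin"), "wb") as f:
    f.write(b"\x00" * 1024)  # placeholder payload

# statvfs reports capacity and usage for the mount, just as for a local disk.
stats = os.statvfs(MOUNT_POINT)
free_tib = stats.f_bavail * stats.f_frsize / 2**40
print(f"Free space on {MOUNT_POINT}: {free_tib:.1f} TiB")
```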

Google Cloud Managed Lustre supports Terraform provisioning, bulk data movement to and from Google Cloud Storage, and a managed CSI driver for GKE. The service carries a 99.9 percent availability SLA.
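The bulk transfer between Cloud Storage and the Lustre instance is a managed capability whose invocation isn’t covered in the announcement, but the POSIX mount also makes a straightforward client-side copy possible for smaller, ad hoc transfers. A sketch using the google-cloud-storage Python client, with a hypothetical bucket name, object prefix, and mount path:

```python
import os
from google.cloud import storage  # pip install google-cloud-storage

# Hypothetical names; substitute your own bucket, prefix, and Lustre mount point.
BUCKET = "my-training-data"
PREFIX = "datasets/imagenet/"
MOUNT_POINT = "/mnt/lustre"

client = storage.Client()

# Copy objects under the prefix from Cloud Storage onto the mounted Lustre file system.
for blob in client.list_blobs(BUCKET, prefix=PREFIX):
    if blob.name.endswith("/"):  # skip zero-byte "directory" placeholder objects
        continue
    dest = os.path.join(MOUNT_POINT, blob.name)
    os.makedirs(os.path.dirname(dest), exist_ok=True)
    blob.download_to_filename(dest)
    print(f"gs://{BUCKET}/{blob.name} -> {dest}")
```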

Dave Salvator, Nvidia’s director of accelerated computing products, said: “By integrating DDN’s enterprise-grade data platforms and Google’s global cloud capabilities, organizations can readily access vast amounts of data and unlock the full potential of AI with the Nvidia AI platform on Google Cloud – reducing time-to-insight, maximizing GPU utilization, and lowering total cost of ownership.”

Google Cloud says it offers two other parallel file systems – DDN Infinia and Sycomp Storage, which is based on IBM’s Storage Scale. Both are available in the Google Cloud Marketplace, but neither is a Google-managed service.

On-premises DDN customers can move their EXAScaler workloads to Google’s cloud as needed. Google has published an overview of Google Cloud Managed Lustre with further detail.

Customers can deploy instances directly through the Google Cloud Console or speak with Google Cloud or DDN representatives for tailored guidance and support.