Ultra-massive GPU arrays.
Pre-built clusters of hundreds to thousands of interconnected NVIDIA H100/H200 GPUs. InfiniBand fabric, parallel storage, and job scheduling — ready to train the largest models.
Up to 16,384 GPUs
Scale
400 Gbps
InfiniBand
1T+ parameters
Ready for
99.9%
Uptime SLA
Machine families
Purpose-built configurations for every workload profile — from web serving to GPU-accelerated ML training.
Superpod Configurations
Pre-configured GPU superpods with optimized network topology, parallel storage, and job scheduling for training the world's largest AI models.
GPU
H200 / H100
Max GPUs
16,384
InfiniBand
Fat-tree NDR
Storage
DAOS / Lustre
Infrastructure for frontier AI.
Turn-key GPU superpods with optimized networking, storage, and scheduling.
Up to 16,384 GPUs
Pre-built superpod clusters from 256 to 16,384 GPUs with optimized fat-tree InfiniBand topology for all-to-all communication.
InfiniBand fat-tree fabric
400 Gbps NDR InfiniBand with non-blocking fat-tree topology. Sub-microsecond MPI latency across the entire cluster.
Parallel storage system
DAOS or Lustre file system delivering 1+ TB/s aggregate read throughput for checkpoint I/O and training data.
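To see why 1+ TB/s matters for checkpoint I/O, here is a back-of-envelope sizing sketch. It assumes a trillion-parameter model with bf16 weights, fp32 Adam optimizer state, and fp32 master weights (a common mixed-precision layout, not a measurement of any specific cluster):

```python
# Back-of-envelope checkpoint sizing; byte counts assume a common
# mixed-precision training layout, not any particular framework.
def checkpoint_bytes(params: int) -> int:
    weights = params * 2        # bf16 weights: 2 bytes/param
    optimizer = params * 4 * 2  # Adam moments (m, v) in fp32: 8 bytes/param
    master = params * 4         # fp32 master weights: 4 bytes/param
    return weights + optimizer + master

def write_seconds(params: int, throughput_bytes_per_s: float) -> float:
    return checkpoint_bytes(params) / throughput_bytes_per_s

one_t = 10**12
print(checkpoint_bytes(one_t) / 1e12)  # 14.0 -> ~14 TB per checkpoint
print(write_seconds(one_t, 1e12))      # 14.0 -> ~14 s at 1 TB/s aggregate
```

At that layout, each checkpoint is roughly 14 bytes per parameter, so aggregate storage bandwidth directly bounds how often you can checkpoint without stalling training.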
Training frameworks
Pre-tuned Megatron-LM, DeepSpeed, and FSDP configurations. NCCL topology files generated for your cluster.
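As an illustration of what a per-cluster NCCL tuning looks like, here is a hedged environment fragment; the file path and values are placeholders, and the actual settings are generated for each cluster:

```shell
# Illustrative NCCL environment for an InfiniBand fat-tree cluster;
# values and the topology file path are placeholders, not defaults.
export NCCL_IB_HCA=mlx5                    # select the Mellanox InfiniBand adapters
export NCCL_TOPO_FILE=/etc/nccl/topo.xml   # generated topology file (hypothetical path)
export NCCL_NET_GDR_LEVEL=PHB              # allow GPUDirect RDMA where PCIe topology permits
```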
Automatic health monitoring
GPU health checks, InfiniBand link monitoring, and automatic node replacement for failed GPUs.
Job scheduling
Managed Slurm scheduler with priority queues, job preemption, and multi-tenant access control.
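A multi-node training job on the managed scheduler can be sketched as a standard Slurm batch script; the partition name and script path below are placeholders, not actual cluster defaults:

```shell
#!/bin/bash
# Illustrative Slurm batch script for a 256-GPU training job;
# partition and training script are placeholders.
#SBATCH --job-name=llm-pretrain
#SBATCH --nodes=32
#SBATCH --gres=gpu:8          # 8 GPUs per node
#SBATCH --ntasks-per-node=8   # one task per GPU
#SBATCH --time=48:00:00
#SBATCH --partition=h200      # placeholder partition name

srun python train.py          # srun launches one task per allocated GPU
```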
Getting started
Launch your first instance in three steps. CLI, console, or API — your choice.
ur superpod request \
--config=sp-h200-256 \
--duration=3months \
--storage=daos-500tb
Train the largest models.
Foundation models, multi-modal AI, and sovereign AI infrastructure.
Train foundation models
Train GPT, Llama, and Mixtral-class models from scratch. 256-4096 GPU clusters with optimized 3D parallelism.
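The constraint behind 3D parallelism is simple: tensor, pipeline, and data parallel degrees must multiply out to the cluster size. A minimal sketch, with illustrative (not tuned) degrees:

```python
# The product of tensor- and pipeline-parallel degrees must divide the
# cluster size; the remainder becomes the data-parallel replica count.
def data_parallel_size(world_size: int, tensor_parallel: int, pipeline_parallel: int) -> int:
    model_parallel = tensor_parallel * pipeline_parallel
    assert world_size % model_parallel == 0, "degrees must divide the cluster size"
    return world_size // model_parallel

# e.g. a 1,024-GPU cluster with 8-way tensor and 16-way pipeline parallelism
print(data_parallel_size(1024, tensor_parallel=8, pipeline_parallel=16))  # 8 replicas
```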
View tutorial
Suggested configuration
SP-H200-1024 · Megatron-LM · DAOS
Estimate your costs
Create detailed configurations to see exactly how much your architecture will cost. Pay for what you use, down to the second.
Configuration 1
Platform & Architecture
Compute Resources
Storage
Cost Optimization
Cost details
InfiniBand networking included. Volume discounts for 100+ GPUs.
Works seamlessly with
Frequently asked questions
Train at any scale.
Pre-built GPU superpods from 256 to 16,384 GPUs. Request access today.