Managed Hadoop & Spark clusters.
Fully managed big data clusters. Run Hadoop, Spark, Presto, Hive, and Flink at petabyte scale. Auto-scaling, spot instances, and HDFS.
Petabytes
Scale
Hadoop/Spark
Engines
Auto
Scaling
Spot ready
Cost
Data, at scale.
Petabyte-scale Hadoop & Spark clusters.
Petabyte scale
Process petabytes with auto-scaling clusters.
Multi-engine
Hadoop, Spark, Presto, Hive, and Flink.
Spot instances
Up to 80% cost savings with spot instance support.
HDFS & S3
Native HDFS and cloud object storage.
Auto-scaling
Scale workers based on workload demand.
Notebook integration
Jupyter and Zeppelin notebooks built in.
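Auto-scaling keeps worker count proportional to queued work while respecting the cluster's configured bounds. A minimal sketch of such a policy is below; the function name, thresholds, and tasks-per-worker ratio are illustrative assumptions, not the service's actual scaling logic.

```python
# Hypothetical auto-scaling policy sketch: pick a worker count from the
# current backlog, clamped to the cluster's min/max bounds (e.g. 5-50).
# All names and numbers here are illustrative, not the real algorithm.

def target_workers(pending_tasks: int, min_workers: int = 5,
                   max_workers: int = 50, tasks_per_worker: int = 100) -> int:
    """Return a worker count proportional to queued tasks, within bounds."""
    needed = -(-pending_tasks // tasks_per_worker)  # ceiling division
    return max(min_workers, min(max_workers, needed))

print(target_workers(0))      # idle cluster stays at the floor: 5
print(target_workers(2500))   # scales toward demand: 25
print(target_workers(90000))  # capped at the ceiling: 50
```

Clamping to a floor avoids cold-start latency for small jobs, and the ceiling bounds cost, mirroring the `--auto-scale=5-50` style of range used by the CLI.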
Getting started
Launch your first cluster in three steps. CLI, console, or API — your choice.
ur data cluster create analytics \
  --engine=spark --workers=10 \
  --spot=true --auto-scale=5-50

Big data patterns.
ETL and interactive analytics.
Suggested configuration
Spark · Petabyte · Auto-scale
Estimate your costs
Create detailed configurations to see exactly how much your architecture will cost. Pay for what you use, down to the second.
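Per-second billing means a job's cost is just worker count times the hourly rate, prorated to its runtime in seconds. The quick estimate below illustrates the arithmetic; the hourly rate is a made-up placeholder, not a published price.

```python
# Back-of-the-envelope per-second billing estimate.
# The per-worker hourly rate is hypothetical, for illustration only.

HOURLY_RATE_PER_WORKER = 0.12  # USD per worker-hour (placeholder value)

def estimate_cost(workers: int, seconds: int) -> float:
    """Cost of `workers` nodes running for `seconds`, billed per second."""
    return workers * HOURLY_RATE_PER_WORKER / 3600 * seconds

# A 10-worker Spark job that runs for 42 minutes:
print(round(estimate_cost(10, 42 * 60), 2))  # 0.84
```

Because billing is prorated to the second, a job that finishes in 42 minutes costs 70% of the hourly figure rather than being rounded up to a full hour.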
Configuration 1
Spark Engine
Compute Resources
Storage & Output
Cost details
Managed Spark and Hadoop. Unified data lake.
Works seamlessly with
Frequently asked questions
Big data, managed.
Petabyte-scale Hadoop & Spark.