

Scalable, fault-tolerant platform for big-data storage and processing
YTsaurus delivers a multitenant, distributed storage and compute engine with MapReduce, SQL, and NoSQL capabilities, supporting exabyte-scale data, millions of cores, and seamless scaling.

YTsaurus is a distributed storage and processing platform designed for organizations that need to handle petabyte‑to‑exabyte data volumes across many users. It combines a MapReduce engine, an SQL query layer powered by ClickHouse (CHYT), a job scheduler, and a key‑value store for OLTP workloads, all within a single multitenant ecosystem.
The system offers fault‑tolerant operation with automated replication and zero‑downtime updates, while scaling to millions of CPU cores, thousands of GPUs, and tens of thousands of nodes. Data can reside on HDD, SSD, NVMe, or RAM, and the platform supports ACID transactions, a rich set of SDKs/APIs, and secure isolation of compute and storage resources. Integrated SPYT brings Apache Spark‑compatible tools for ETL, and the web UI simplifies cluster monitoring and job management.
YTsaurus can be built from source and run locally, or provisioned on Kubernetes using the provided Helm chart. An online demo is also available for hands‑on evaluation. The Apache‑2.0 license permits commercial use, modification, and redistribution, subject to its attribution and notice requirements, making the platform suitable for both on‑premises data centers and cloud environments.
Real‑time clickstream analytics
Process billions of events per day with low latency using MapReduce and CHYT for instant dashboards.
Large‑scale ETL pipelines
Leverage SPYT to orchestrate Spark jobs that transform and load petabytes of data into the lake.
Multi‑tenant data lake for business units
Provide isolated storage and compute environments for different departments while sharing underlying hardware.
High‑frequency trading data storage
Store tick‑level data with ACID guarantees and query it efficiently via the ClickHouse‑compatible SQL layer.
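The clickstream use case above relies on the MapReduce model: a map step emits key/value pairs, a shuffle groups them by key, and a reduce step aggregates each group. The following is a minimal in-memory sketch of that model in plain Python. It does not use the YTsaurus SDK, and the event rows, field names, and function names are all hypothetical, chosen only for illustration.

```python
from collections import defaultdict
from typing import Dict, Iterable, Iterator, List, Tuple

# Hypothetical clickstream rows; field names are illustrative only.
EVENTS: List[Dict[str, str]] = [
    {"user": "u1", "page": "/home"},
    {"user": "u2", "page": "/home"},
    {"user": "u1", "page": "/pricing"},
    {"user": "u3", "page": "/home"},
    {"user": "u2", "page": "/docs"},
]


def map_event(row: Dict[str, str]) -> Iterator[Tuple[str, int]]:
    """Map phase: emit one (page, 1) pair per event."""
    yield row["page"], 1


def shuffle(pairs: Iterable[Tuple[str, int]]) -> Dict[str, List[int]]:
    """Shuffle phase: group all emitted values by their key."""
    groups: Dict[str, List[int]] = defaultdict(list)
    for key, value in pairs:
        groups[key].append(value)
    return groups


def reduce_counts(key: str, values: List[int]) -> Dict[str, object]:
    """Reduce phase: collapse one group into a single output row."""
    return {"page": key, "views": sum(values)}


def run_map_reduce(rows: Iterable[Dict[str, str]]) -> List[Dict[str, object]]:
    """Run map, shuffle, and reduce in sequence; output is sorted by page."""
    pairs = (pair for row in rows for pair in map_event(row))
    groups = shuffle(pairs)
    return [reduce_counts(key, values) for key, values in sorted(groups.items())]


if __name__ == "__main__":
    for out_row in run_map_reduce(EVENTS):
        print(out_row)
```

In a distributed engine the map and reduce steps run in parallel across many nodes and the shuffle moves data over the network. The sketch keeps everything in one process so the data flow is easy to follow.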
YTsaurus offers SDKs and APIs for C++, Python, Java, Go, and additional languages through REST and gRPC interfaces.
The platform uses automated data replication across nodes and a distributed architecture that continues operating despite individual server failures.
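To illustrate why replication lets a cluster keep serving data through server failures, here is a toy simulation in plain Python. It is a sketch under stated assumptions, not YTsaurus's actual placement algorithm: the replication factor of 3, the random placement, and all names (`place_replicas`, `readable_chunks`, `under_replicated`) are hypothetical.

```python
import random
from typing import Dict, List, Sequence, Set

REPLICATION_FACTOR = 3  # assumed value for this sketch


def place_replicas(
    chunk_ids: Sequence[str],
    nodes: Sequence[str],
    rng: random.Random,
    factor: int = REPLICATION_FACTOR,
) -> Dict[str, List[str]]:
    """Assign each chunk to `factor` distinct nodes chosen at random."""
    return {chunk: rng.sample(list(nodes), factor) for chunk in chunk_ids}


def readable_chunks(placement: Dict[str, List[str]], failed: Set[str]) -> List[str]:
    """A chunk stays readable while at least one replica is on a healthy node."""
    return [
        chunk
        for chunk, replicas in placement.items()
        if any(node not in failed for node in replicas)
    ]


def under_replicated(
    placement: Dict[str, List[str]],
    failed: Set[str],
    factor: int = REPLICATION_FACTOR,
) -> List[str]:
    """Chunks with fewer than `factor` live replicas; candidates for re-replication."""
    return [
        chunk
        for chunk, replicas in placement.items()
        if sum(node not in failed for node in replicas) < factor
    ]


if __name__ == "__main__":
    nodes = ["node-%d" % i for i in range(5)]
    chunks = ["chunk-%d" % i for i in range(8)]
    placement = place_replicas(chunks, nodes, random.Random(0))

    failed = {"node-0", "node-1"}
    print("readable:", len(readable_chunks(placement, failed)), "of", len(chunks))
    print("needs re-replication:", under_replicated(placement, failed))
```

With three replicas on distinct nodes, any two simultaneous node failures leave every chunk readable. The `under_replicated` list is what an automated system would then copy to healthy nodes to restore the target replica count.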
A Helm chart and quick‑start guide enable deployment of YTsaurus clusters on Kubernetes.
The CHYT layer is powered by ClickHouse, providing a familiar ClickHouse SQL dialect for fast analytic queries.
YTsaurus includes a web‑based UI for monitoring nodes, managing jobs, and interacting with stored data.