
Astronomer
Managed Apache Airflow service for orchestrating and monitoring data pipelines in the cloud
Discover top open-source software, updated regularly with real-world adoption signals.

Cloud-native orchestrator for developing and maintaining data assets
Dagster is a declarative data pipeline orchestrator with integrated lineage, observability, and testability. Build and maintain tables, datasets, ML models, and reports using Python functions.

Dagster is a cloud-native orchestration platform designed for developing, testing, and maintaining data assets—tables, datasets, machine learning models, and reports. Unlike traditional workflow engines, Dagster uses a declarative programming model where you define assets as Python functions, and the platform handles scheduling, dependencies, and execution.
From local development and unit testing to staging and production, Dagster provides integrated lineage tracking, observability, and diagnostics in a unified control plane. The platform scales both technically and organizationally with multi-tenant, multi-tool orchestration capabilities. Data practitioners can embrace CI/CD best practices, build reusable components, catch data quality issues early, and maintain visibility as complexity grows.
Dagster integrates with the modern data stack through a growing library of connectors for popular tools. Deploy to your own infrastructure while centralizing metadata, cataloging, and performance monitoring. The platform supports Python 3.9 through 3.13 and includes a web UI for development and operations.
When teams consider Dagster, these hosted platforms usually appear on the same shortlist.
Looking for a hosted option? These are the services engineering teams benchmark against before choosing open source.
Machine Learning Pipeline Orchestration
Define feature tables, training datasets, and models as assets with automatic dependency tracking and lineage from raw data to deployed models
Data Warehouse Table Management
Declaratively manage table dependencies and transformations with built-in data quality checks and observability across staging and production environments
Analytics Report Generation
Orchestrate report creation with upstream data dependencies, ensuring reports update automatically when source data changes while maintaining full lineage
Multi-Team Data Platform
Centralize orchestration across teams using different tools with unified metadata, cataloging, and performance monitoring in a single control plane
Dagster uses a declarative asset-based model where you define what data assets to build rather than just task sequences. It includes integrated lineage, observability, and testability as core features rather than add-ons.
Dagster officially supports Python 3.9 through Python 3.13 and is available via PyPI for easy installation.
Yes, Dagster provides a growing library of integrations for popular data stack tools and allows deployment to your own infrastructure while maintaining centralized orchestration.
Yes, Dagster is designed for the entire development lifecycle—from local development and unit tests through integration testing, staging, and production deployment with multi-tenant capabilities.
Dagster provides built-in lineage tracking that automatically maps dependencies between assets, along with integrated observability, diagnostics, and cataloging in a unified control plane.
Project at a glance
ActiveLast synced 4 days ago