Best Open-Source ETL & Data Integration Tools

Explore curated open-source tools in the ETL & Data Integration category. Compare technologies, see alternatives, and find the right solution for your workflow.

10+ projects

CocoIndex

Ultra-performant data transformation framework for AI pipelines

Stars
6,308
License
Apache-2.0
Last commit
3 hours ago
Rust · Active

Airbyte

Data integration platform for ELT pipelines from any source

Stars
20,838
License
Last commit
4 hours ago
Python · Active

OLake

Blazing-fast database replication to Apache Iceberg tables

Stars
1,304
License
Apache-2.0
Last commit
5 hours ago
Go · Active

Apache Spark

Fast, unified engine for large-scale data analytics

Stars
42,941
License
Apache-2.0
Last commit
6 hours ago
Scala · Active

Crawl4AI

Turn the web into clean, LLM-ready Markdown instantly

Stars
61,499
License
Apache-2.0
Last commit
14 hours ago
Python · Active

Artie Transfer

Real-time CDC replication from OLTP to OLAP databases

Stars
832
License
Last commit
22 hours ago
Go · Active

CloudQuery

High-performance ELT framework powered by Apache Arrow

Stars
6,337
License
MPL-2.0
Last commit
22 hours ago
Go · Active

Meltano

Declarative code-first data integration engine for modern pipelines

Stars
2,377
License
MIT
Last commit
1 day ago
Python · Active

Apache SeaTunnel

Multimodal distributed data integration for massive-scale synchronization

Stars
9,151
License
Apache-2.0
Last commit
1 day ago
Java · Active

Mara Pipelines

Lightweight Python ETL framework with PostgreSQL and web UI

Stars
2,086
License
MIT
Last commit
2 years ago
Python · Dormant