Best Open-Source ETL & Data Integration Tools

Explore curated open-source tools in the ETL & Data Integration category. Compare technologies, see alternatives, and find the right solution for your workflow.

10+ projects

Airbyte

Data integration platform for ELT pipelines from any source

Stars: 21,017
License: (not listed)
Last commit: 17 days ago
Python · Active

OLake

Blazing-fast database replication to Apache Iceberg tables

Stars: 1,314
License: Apache-2.0
Last commit: 17 days ago
Go · Active

Meltano

Declarative code-first data integration engine for modern pipelines

Stars: 2,409
License: MIT
Last commit: 17 days ago
Python · Active

Crawl4AI

Turn the web into clean, LLM-ready Markdown instantly

Stars: 63,373
License: Apache-2.0
Last commit: 18 days ago
Python · Active

CocoIndex

Ultra-performant data transformation framework for AI pipelines

Stars: 6,726
License: Apache-2.0
Last commit: 18 days ago
Rust · Active

CloudQuery

High-performance ELT framework powered by Apache Arrow

Stars: 6,365
License: MPL-2.0
Last commit: 18 days ago
Go · Active

Artie Transfer

Real-time CDC replication from OLTP to OLAP databases

Stars: 836
License: (not listed)
Last commit: 18 days ago
Go · Active

Apache Spark

Fast, unified engine for large-scale data analytics

Stars: 43,084
License: Apache-2.0
Last commit: 18 days ago
Scala · Active

Apache SeaTunnel

Multimodal distributed data integration for massive-scale synchronization

Stars: 9,226
License: Apache-2.0
Last commit: 19 days ago
Java · Active

Mara Pipelines

Lightweight Python ETL framework with PostgreSQL and web UI

Stars: 2,084
License: MIT
Last commit: 2 years ago
Python · Dormant