Best Open-Source ETL & Data Integration Tools

Explore curated open-source tools in the ETL & Data Integration category. Compare technologies, see alternatives, and find the right solution for your workflow.

10+ projects

Airbyte

Data integration platform for ELT pipelines from any source

Stars: 21,396
License: (not listed)
Last commit: 4 hours ago
Language: Python
Status: Active

Apache Spark

Fast, unified engine for large-scale data analytics

Stars: 43,407
License: Apache-2.0
Last commit: 4 hours ago
Language: Scala
Status: Active

OLake

Blazing-fast database replication to Apache Iceberg tables

Stars: 1,350
License: Apache-2.0
Last commit: 12 hours ago
Language: Go
Status: Active

CocoIndex

Ultra-performant data transformation framework for AI pipelines

Stars: 10,205
License: Apache-2.0
Last commit: 19 hours ago
Language: Rust
Status: Active

CloudQuery

High-performance ELT framework powered by Apache Arrow

Stars: 6,424
License: MPL-2.0
Last commit: 20 hours ago
Language: Go
Status: Active

Meltano

Declarative code-first data integration engine for modern pipelines

Stars: 2,532
License: MIT
Last commit: 23 hours ago
Language: Python
Status: Active

Apache SeaTunnel

Multimodal distributed data integration for massive-scale synchronization

Stars: 9,374
License: Apache-2.0
Last commit: 1 day ago
Language: Java
Status: Active

Crawl4AI

Turn the web into clean, LLM-ready Markdown instantly

Stars: 67,916
License: Apache-2.0
Last commit: 2 days ago
Language: Python
Status: Active

Artie Transfer

Real-time CDC replication from OLTP to OLAP databases

Stars: 836
License: (not listed)
Last commit: 2 months ago
Language: Go
Status: Active

Mara Pipelines

Lightweight Python ETL framework with PostgreSQL and web UI

Stars: 2,085
License: MIT
Last commit: 2 years ago
Language: Python
Status: Dormant