

Real-time CDC replication from OLTP to OLAP databases
Stream production data from operational databases to warehouses like Snowflake, BigQuery, Redshift, and Databricks with sub-minute latency using change data capture.

Artie Transfer eliminates the hours-to-days lag inherent in traditional batch ETL pipelines. By leveraging change data capture (CDC) and stream processing, it delivers production data to your warehouse in under a minute—keeping analytics fresh as business happens.
Designed for teams tired of managing complex DAGs and Airflow schedules, Artie requires only a simple configuration file to start replicating data. It automatically detects schemas, creates tables, and merges schema changes downstream without manual intervention. Idempotent processing and automatic retries ensure reliability, while built-in telemetry and error reporting provide visibility into every sync.
Whether you're moving 1 GB or 100+ TB, Artie scales seamlessly across a wide range of sources—including PostgreSQL, MySQL, MongoDB, DynamoDB, and Oracle—to destinations like Snowflake, BigQuery, Redshift, Databricks, and S3. Written in Go and distributed as cross-compiled binaries and Docker images, it integrates with Kafka for robust message queuing and fits naturally into modern data stacks.
Real-Time Business Intelligence
Analysts query live production data in Snowflake for up-to-the-minute dashboards and reports without waiting for nightly batch jobs.
Operational Analytics at Scale
E-commerce platforms replicate 50+ TB from PostgreSQL to BigQuery, enabling real-time inventory and customer behavior analysis.
Multi-Source Data Consolidation
Enterprises stream data from MongoDB, MySQL, and DynamoDB into a unified Databricks lakehouse for cross-system analytics.
Compliance and Audit Trails
Financial services maintain sub-minute replicas in Redshift for regulatory reporting and fraud detection without impacting transactional systems.
How does Artie deliver data in real time?
Artie uses change data capture (CDC) to stream database changes in real time via Kafka, processing updates continuously instead of waiting for batch schedules.
Which sources and destinations are supported?
Sources include PostgreSQL, MySQL, MongoDB, DynamoDB, DocumentDB, Oracle, and SQL Server. Destinations include Snowflake, BigQuery, Redshift, Databricks, S3, Iceberg, SQL Server, and PostgreSQL.
Do I need to manage schemas or create tables manually?
No. Artie automatically detects schemas, creates tables, and merges schema changes downstream without manual intervention.
Does Artie depend on Kafka?
Yes. Artie currently uses Kafka as the default message queue for reliable, scalable stream processing between sources and destinations.
How do I get started?
Set up a simple configuration file specifying your source and destination, then run the binary or Docker container. Examples and a getting-started guide are available in the repository.
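A configuration file along these lines is what the getting-started flow calls for. The key names below (`source`, `destination`, `tables`) are illustrative guesses, not Artie's exact schema; the repository's examples show the real format:

```yaml
# Hypothetical sketch — consult the repository's examples for the exact schema.
source:
  type: postgresql
  host: db.internal
  port: 5432
  database: app
  tables:
    - schema: public
      name: orders
destination: snowflake
```

With the file in place, you would point the binary or Docker container at it (the image name and flag here are placeholders, not confirmed by the source): `docker run -v $(pwd)/config.yaml:/config.yaml <artie-image> --config /config.yaml`.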
Project at a glance
Active. Last synced 4 days ago.