Redpanda Connect

High-performance resilient stream processor with declarative pipelines

Redpanda Connect lets you connect dozens of sources and sinks, apply Bloblang transformations, and guarantee at‑least‑once delivery—all via a single declarative YAML file, runnable as a binary or Docker image.

Overview

Redpanda Connect is a high‑performance stream processor that lets you stitch together a wide variety of data sources and sinks—AWS, Azure, GCP, Kafka, NATS, MQTT, SQL databases, and more—using a single declarative YAML configuration. The built‑in Bloblang mapping language enables complex enrichments, transformations, and filtering without writing code, while an in‑process transaction model guarantees at‑least‑once delivery even after crashes.

Deployment & Operations

The processor can be run as a static binary (rpk connect run) or as a minimal Docker image, making it fit seamlessly into container‑orchestrated environments. Health endpoints (/ping, /ready) and extensive metrics (Prometheus, StatsD, JSON) simplify monitoring, and OpenTelemetry tracing provides visibility into each processing stage. Custom connectors are added via Go plugins, and extra components can be compiled with the x_benthos_extra build tag when needed.
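As a rough illustration of how the /ping and /ready endpoints might be consumed, the Go sketch below probes both and reports liveness and readiness separately. It assumes that a non-200 response from /ready means "not ready". The helper names (probe, checkHealth, newStubServer) are hypothetical, and the stub server only stands in for a real instance so the example runs on its own; point the base URL at wherever your instance serves HTTP.

```go
package main

import (
	"fmt"
	"net/http"
	"net/http/httptest"
	"time"
)

// probe reports whether a GET to baseURL+path answers with HTTP 200.
func probe(client *http.Client, baseURL, path string) (bool, error) {
	resp, err := client.Get(baseURL + path)
	if err != nil {
		return false, err
	}
	defer resp.Body.Close()
	return resp.StatusCode == http.StatusOK, nil
}

// checkHealth probes /ping (liveness) and then /ready (readiness).
// Assumption: any non-200 status is treated as "not alive" / "not ready".
func checkHealth(client *http.Client, baseURL string) (alive, ready bool, err error) {
	if alive, err = probe(client, baseURL, "/ping"); err != nil {
		return false, false, err
	}
	if ready, err = probe(client, baseURL, "/ready"); err != nil {
		return alive, false, err
	}
	return alive, ready, nil
}

// newStubServer is a hypothetical stand-in for a running instance.
// It always answers /ping and answers /ready with 503 when ready is false.
func newStubServer(ready bool) *httptest.Server {
	mux := http.NewServeMux()
	mux.HandleFunc("/ping", func(w http.ResponseWriter, _ *http.Request) {
		fmt.Fprint(w, "pong")
	})
	mux.HandleFunc("/ready", func(w http.ResponseWriter, _ *http.Request) {
		if !ready {
			w.WriteHeader(http.StatusServiceUnavailable)
			return
		}
		fmt.Fprint(w, "OK")
	})
	return httptest.NewServer(mux)
}

func main() {
	// Simulate an instance that is up but whose input/output is not connected yet.
	srv := newStubServer(false)
	defer srv.Close()

	client := &http.Client{Timeout: 2 * time.Second}
	alive, ready, err := checkHealth(client, srv.URL)
	if err != nil {
		fmt.Println("probe failed:", err)
		return
	}
	fmt.Printf("alive=%t ready=%t\n", alive, ready) // prints "alive=true ready=false"
}
```

Keeping the two probes separate mirrors how container orchestrators usually distinguish liveness from readiness: a process can be alive while still waiting for its connections to come up.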

Who Benefits

It targets data‑engineering teams that need reliable, low‑latency pipelines without managing offset storage, and ops teams that value straightforward observability and cloud‑native deployment.

Highlights

Declarative pipelines defined in a single YAML file
Built‑in Bloblang language for complex transformations
At‑least‑once delivery without external state persistence
Extensive source/sink catalog and native Docker/binary deployment

Pros

  • High throughput and low latency processing
  • Simple configuration and cloud‑native deployment
  • Rich monitoring via Prometheus, StatsD, OpenTelemetry
  • Extensible through Go plugins

Considerations

  • Learning curve for Bloblang syntax
  • Limited built‑in stateful processing (requires external storage)
  • Extra plugins need external libraries and build tags
  • Configuration can become large for complex topologies

Managed products teams compare with

When teams consider Redpanda Connect, these hosted platforms usually appear on the same shortlist.

Aiven for Apache Flink

Fully managed Apache Flink service by Aiven.

Amazon Managed Service for Apache Flink

Serverless Apache Flink for real-time stream processing on AWS.

Azure Stream Analytics

Serverless real-time analytics with SQL on streams.

Fit guide

Great for

  • Teams needing reliable data pipelines across cloud services
  • Developers who prefer code‑free pipeline definitions
  • Ops teams looking for easy health‑check and metrics integration
  • Organizations requiring at‑least‑once guarantees without managing offsets

Not ideal when

  • Use cases requiring exactly‑once semantics
  • Heavy stateful stream processing that needs built‑in windowing
  • Environments where only a single language runtime is allowed (no Go)
  • Scenarios demanding a GUI pipeline designer

How teams use it

Real‑time event enrichment from Pub/Sub to Redis Streams

Enriches incoming messages with computed fields using Bloblang and delivers them to Redis with at‑least‑once guarantees.

Batch file ingestion from S3 to ClickHouse

Pulls objects from S3, transforms records, and streams them into ClickHouse for analytics.

Cross‑cloud data sync between Azure Blob and GCP BigQuery

Continuously moves files from Azure Blob storage to BigQuery tables, handling format conversion on the fly.

Log aggregation from MQTT to Elasticsearch

Collects MQTT telemetry, filters noise, and indexes logs in Elasticsearch for search and visualization.

Tech snapshot

Go 96%
Python 4%
TSQL 1%
Shell 1%
Makefile 1%
Dockerfile 1%

Tags

logs, kafka, rabbitmq, data-ops, go, cqrs, etl, message-queue, message-bus, event-sourcing, nats, stream-processor, data-engineering, streaming-data, golang, amqp, stream-processing

Frequently asked questions

How does Redpanda Connect ensure at‑least‑once delivery?

It uses an in‑process transaction model that acknowledges messages only after successful processing, without persisting state to disk.

Can I run Redpanda Connect without Docker?

Yes, you can download the static binary (rpk) for Linux or install via Homebrew and run it directly with a config file.

What monitoring integrations are available?

Metrics can be exposed to Prometheus, StatsD, or a JSON endpoint; health checks are provided via /ping and /ready, and OpenTelemetry tracing is emitted.

How do I add a custom connector?

Write a Go plugin implementing the connector interface, import it via the public API, and build Redpanda Connect with the appropriate build tag.

Is there support for exactly‑once processing?

Redpanda Connect guarantees at‑least‑once; exactly‑once requires external coordination and is not provided out of the box.

Project at a glance

Status: Active
Stars: 8,555
Watchers: 8,555
Forks: 912
Repo age: 9 years old
Last commit: 6 hours ago
Primary language: Go

Last synced 4 hours ago