
Amazon Redshift
Fully managed, petabyte-scale cloud data warehouse for analytics and reporting

In-process SQL OLAP engine powered by ClickHouse
Embedded SQL analytics engine bringing ClickHouse's columnar performance directly into Python applications without external dependencies or separate server installations.

chDB is an embedded SQL OLAP engine that runs ClickHouse inside your Python application as an in-process library. Unlike a traditional database deployment, it requires no separate server installation, configuration, or network overhead: install it with pip and start querying.
It is designed for data engineers, analysts, and Python developers who need high-performance analytical queries on local data without the operational complexity of managing database infrastructure. Whether you're processing Parquet files, transforming Pandas DataFrames, or running ad-hoc analytics, chDB delivers ClickHouse-grade performance with minimal setup.
chDB supports 60+ data formats including Parquet, CSV, JSON, Arrow, and ORC with zero-copy data access via Python memoryview. It offers multiple query interfaces: a DB-API 2.0 compliant connection API, direct file querying, stateful sessions with persistent tables and views, and Python UDF support for custom transformations. Query results can be returned as DataFrames, JSON, CSV, or any ClickHouse-supported format.
Available via pip for Python 3.8+ on macOS and Linux (x86_64 and ARM64). Use in-memory mode for ephemeral analytics or file-based mode for persistent storage across sessions.
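As a rough sketch of the two storage modes, the snippet below opens a connection with chdb.connect and reads rows through a cursor. The ":memory:" and "test.db" targets are the ones quoted on this page; the helper name fetch_rows and the sample query are illustrative assumptions, and the cursor calls assume a recent chDB release that ships the connection API.

```python
def fetch_rows(sql: str, target: str = ":memory:"):
    """Run `sql` on a chDB connection and return all rows.

    `target` is ":memory:" for an ephemeral database, or a path such as
    "test.db" for a database that persists between Python sessions.
    fetch_rows is a hypothetical helper, not part of chDB itself.
    """
    import chdb  # third-party, imported lazily: pip install chdb

    conn = chdb.connect(target)
    try:
        cur = conn.cursor()
        try:
            cur.execute(sql)
            return cur.fetchall()
        finally:
            cur.close()   # release the cursor even if the query fails
    finally:
        conn.close()      # always close the connection


# Illustrative query against ClickHouse's built-in system.numbers table.
SAMPLE_SQL = "SELECT number, toString(number) AS label FROM system.numbers LIMIT 3"

# Example calls (require chDB to be installed):
#   fetch_rows(SAMPLE_SQL)               # ephemeral, in-memory
#   fetch_rows(SAMPLE_SQL, "test.db")    # file-based, persistent
```

Keeping the target as a parameter means the same code path can be exercised in memory during development and pointed at a file once persistence is needed.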
When teams consider chDB, these hosted platforms usually appear on the same shortlist.
Ad-hoc Parquet Analysis
Query multi-gigabyte Parquet files directly from disk with SQL, returning results as Pandas DataFrames without ETL pipelines or database imports.
Embedded Application Analytics
Integrate real-time OLAP queries into Python applications for dashboards, reporting, or user-facing analytics without external database dependencies.
DataFrame Join Acceleration
Perform complex joins and aggregations on multiple Pandas DataFrames using SQL. ClickHouse's columnar engine can deliver 10-100x speedups over native Pandas, depending on data size and query shape.
ETL Pipeline Prototyping
Test and validate ClickHouse SQL transformations locally before deploying to production clusters, using identical query syntax and behavior.
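To make the Parquet use case above more concrete, here is a minimal sketch that builds a ClickHouse query over a Parquet file with the file() table function and asks chDB for a DataFrame back. The helper names top_values_sql and top_values, the file path, and the column name are all hypothetical; the "DataFrame" output format assumes pandas and pyarrow are installed alongside chDB.

```python
def top_values_sql(path: str, column: str, limit: int = 10) -> str:
    """Build a query counting the most frequent values of `column`
    in a Parquet file read through ClickHouse's file() table function."""
    if not column.isidentifier():
        # Keep the sketch simple: accept only plain identifier-style names.
        raise ValueError(f"unsupported column name: {column!r}")
    # Escape backslashes and single quotes for a ClickHouse string literal.
    safe_path = path.replace("\\", "\\\\").replace("'", "\\'")
    return (
        f"SELECT `{column}` AS value, count() AS n "
        f"FROM file('{safe_path}', Parquet) "
        f"GROUP BY value ORDER BY n DESC LIMIT {int(limit)}"
    )


def top_values(path: str, column: str, limit: int = 10):
    """Run the query in-process and return the result as a Pandas DataFrame."""
    import chdb  # third-party, imported lazily: pip install chdb

    return chdb.query(top_values_sql(path, column, limit), "DataFrame")


# Example call (requires chDB and a real file; path and column are made up):
#   df = top_values("data/events.parquet", "country", limit=5)
```

Because the SQL is plain ClickHouse syntax, the string returned by top_values_sql could also be tried against a ClickHouse cluster, which is the idea behind the prototyping use case.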
No separate server is required. chDB embeds the ClickHouse engine directly into Python as a library: run pip install chdb and start querying, with no server setup, configuration files, or network ports.
chDB supports the 60+ formats that ClickHouse supports, including Parquet, CSV, JSON, Arrow, and ORC. You can query files directly from disk or work with in-memory data structures like Pandas DataFrames.
Data can persist between sessions. Use a file-based connection (e.g., chdb.connect('test.db')) to create persistent databases with tables and views that survive across Python sessions. In-memory mode (:memory:) is ephemeral.
chDB and DuckDB are both embedded OLAP engines. chDB brings ClickHouse's columnar engine and SQL dialect to Python, while DuckDB has its own engine. Choose chDB if you need ClickHouse compatibility or plan to migrate queries to a ClickHouse cluster.
UDFs must be stateless, pure Python functions that process input line-by-line. They do not support user-defined aggregations (UDAFs) or stateful operations. All inputs are strings (tab-separated), and you must handle type conversions manually.
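The UDF constraints above can be illustrated with a function written to fit them: stateless, pure Python, and converting its string inputs by hand. The function name add_as_ints is made up for illustration, and the registration steps in the trailing comments assume the chdb_udf decorator from chdb.udf, so treat them as a sketch, not a verified recipe.

```python
def add_as_ints(lhs, rhs):
    # Every argument arrives as a string (one tab-separated input line per
    # call), so numeric inputs have to be converted explicitly.
    return int(lhs) + int(rhs)


# Registration sketch (assumes chDB is installed):
#
#   from chdb.udf import chdb_udf
#   from chdb import query
#
#   @chdb_udf()                  # placed directly above the definition
#   def add_as_ints(lhs, rhs):
#       return int(lhs) + int(rhs)
#
#   print(query("SELECT add_as_ints(12, 22)"))
```

The function keeps no state between calls and touches nothing outside its arguments, which is what allows it to be applied independently to each input line.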