ODD Platform

Unified data discovery, lineage, and observability platform

A modern platform that centralizes data cataloging, lineage, quality, and security, enabling teams to discover, monitor, and govern data assets across diverse sources.

Overview

The Open Data Discovery (ODD) platform provides a federated data catalog with end‑to‑end lineage, quality dashboards, and security tagging. Designed for data engineers, analysts, and ML practitioners, it consolidates metadata from hundreds of sources, offering a single pane of glass to understand how data flows through pipelines, dashboards, and models.

Capabilities & Deployment

ODD integrates with tools such as Airflow, dbt, Great Expectations, and many databases via native adapters. It logs ML experiment parameters automatically and supports reference data management for centrally maintained lookup tables. Deployments are container‑first: run a single Docker image, use the provided docker‑compose demo, or install via Helm charts on Kubernetes. All metadata is stored in PostgreSQL, and the connection is configured through environment variables.
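As a rough illustration of the environment-variable configuration, the sketch below assembles a PostgreSQL connection as a set of variables. The variable names follow the Spring Boot datasource convention and are an assumption, as are the host, database name, and credentials; check the project's docker‑compose demo for the names your version actually reads.

```python
def postgres_env(host: str, port: int, database: str,
                 user: str, password: str) -> dict[str, str]:
    """Build environment variables for a PostgreSQL-backed deployment.

    Assumption: the variable names below follow the Spring Boot datasource
    convention. They are illustrative, not confirmed by this page.
    """
    return {
        "SPRING_DATASOURCE_URL": f"jdbc:postgresql://{host}:{port}/{database}",
        "SPRING_DATASOURCE_USERNAME": user,
        "SPRING_DATASOURCE_PASSWORD": password,
    }


# Hypothetical connection details for a local PostgreSQL instance.
env = postgres_env("localhost", 5432, "odd-platform", "odd", "change-me")

# Print in KEY=value form, suitable for an env file.
for name, value in env.items():
    print(f"{name}={value}")
```

The printed lines can be saved to an env file and passed to the container at startup, which keeps credentials out of the image itself.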

Benefits

By shortening discovery cycles, providing transparent usage insights, and enabling proactive data quality monitoring, ODD helps organizations foster a data‑centric culture while maintaining compliance and governance.

Highlights

Federated data catalog with end‑to‑end lineage across heterogeneous sources

Built‑in data quality dashboard compatible with Great Expectations and dbt

Automatic ML experiment tracking and parameter logging

Extensive integrations (200+ adapters) and Helm/Kubernetes deployment options

Pros

Reduces time required for data discovery
Provides comprehensive lineage visibility
Supports compliance via tagging and security features
Deployable via Docker, Docker‑Compose, or Helm

Considerations

Requires a PostgreSQL backend to store metadata
UI may feel lightweight compared to commercial alternatives
Large number of integrations can increase configuration complexity
Observability features depend on proper instrumentation of pipelines

Managed products teams compare it with

When teams consider ODD Platform, these hosted platforms usually appear on the same shortlist.

Alation

Data catalog platform for data discovery, governance, and lineage

Ataccama

Unified data management platform combining catalog, governance, data quality, and MDM

Atlan

Modern data catalog and collaborative metadata platform for data discovery and governance

Fit guide

Great for

Data engineering teams needing a unified catalog and lineage view
Analytics groups that require data quality dashboards
ML practitioners wanting automatic experiment metadata capture
Organizations adopting the Open Data Discovery specification

Not ideal when

The project is small and has no dedicated PostgreSQL instance
The team needs an out‑of‑the‑box SaaS UI
The environment cannot run containers
Users require advanced data profiling beyond basic metrics

How teams use it

Accelerate dashboard creation

Analysts quickly locate source tables and understand lineage, reducing time to build reliable BI reports.

Automate data quality monitoring

Data engineers integrate Great Expectations tests, view failures in the DQ dashboard, and receive alerts to prevent downstream issues.
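The monitoring loop described above can be sketched as a simple aggregation of test outcomes per dataset. The record shape, dataset names, and test names here are hypothetical and do not reflect the format ODD or Great Expectations actually emit; the sketch only shows the idea of turning raw results into per-dataset alerts.

```python
from collections import Counter

# Hypothetical test results, shaped as (dataset, test name, passed).
results = [
    ("analytics.orders", "order_id_not_null", True),
    ("analytics.orders", "amount_non_negative", False),
    ("analytics.customers", "email_unique", True),
]


def failures_by_dataset(rows):
    """Count failed tests per dataset; datasets with no failures are omitted."""
    return Counter(dataset for dataset, _test, passed in rows if not passed)


failed = failures_by_dataset(results)

# Emit one alert line per dataset that has at least one failing test.
for dataset, count in failed.items():
    print(f"ALERT {dataset}: {count} failing test(s)")
```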

Track ML experiment provenance

ML teams log parameters and results automatically, enabling reproducibility and comparison across model runs.

Govern reference data

Data stewards manage lookup tables centrally, ensuring consistent codes and compliance across pipelines.

Tech snapshot

Java: 58%
TypeScript: 41%
Mustache: 1%
Groovy: 1%
HTML: 1%
JavaScript: 1%

Frequently asked questions

What database does the platform use for metadata?

It stores metadata in PostgreSQL; you configure the connection via environment variables.

Can I run the platform locally without Kubernetes?

Yes, you can start a Docker container or use the provided docker‑compose demo.

How does ODD integrate with existing data pipelines?

ODD provides proxy adapters and native connectors for tools like Airflow, dbt, Great Expectations, and many databases.

Is there support for data lineage visualization?

The UI includes end‑to‑end lineage graphs for datasets, transformers, and consumers.
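To make the idea of a lineage graph concrete, here is a minimal sketch that walks downstream from one dataset through a transformer to its consumers. The entity names and the dictionary representation are hypothetical; this is not how ODD stores or queries lineage, only an illustration of what an end‑to‑end traversal computes.

```python
from collections import deque

# Hypothetical lineage edges: each key feeds the entities listed as its value.
LINEAGE = {
    "raw.orders": ["clean_orders_job"],          # dataset -> transformer
    "clean_orders_job": ["analytics.orders"],    # transformer -> dataset
    "analytics.orders": ["revenue_dashboard", "churn_model"],  # -> consumers
}


def downstream(start: str, edges: dict[str, list[str]]) -> list[str]:
    """Return every entity reachable from `start`, in breadth-first order."""
    seen = {start}
    queue = deque([start])
    order = []
    while queue:
        node = queue.popleft()
        for child in edges.get(node, []):
            if child not in seen:
                seen.add(child)
                order.append(child)
                queue.append(child)
    return order


impacted = downstream("raw.orders", LINEAGE)
print(impacted)
```

A traversal like this answers the impact question behind a lineage view: which jobs, tables, dashboards, and models are affected if the starting dataset changes.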

Is the platform compatible with the Open Data Discovery specification?

Yes, ODD is a reference implementation of the Open Data Discovery spec.

Project at a glance

Active

Stars: 1,419
Watchers: 1,419
Forks: 143

License: Apache-2.0
Repo age: 5 years
Last commit: 2 weeks ago
Primary language: Java

