ODD Platform

Unified data discovery, lineage, and observability platform

A modern platform that centralizes data cataloging, lineage, quality, and security, enabling teams to discover, monitor, and govern data assets across diverse sources.

Overview

The Open Data Discovery (ODD) platform provides a federated data catalog with end‑to‑end lineage, quality dashboards, and security tagging. Designed for data engineers, analysts, and ML practitioners, it consolidates metadata from hundreds of sources, offering a single pane of glass to understand how data flows through pipelines, dashboards, and models.

Capabilities & Deployment

ODD integrates with tools such as Airflow, dbt, Great Expectations, and many databases via native adapters. It logs ML experiment parameters automatically and supports reference data management for master data. Deployments are container‑first: run a single Docker image, use the provided Docker Compose demo, or install via Helm charts on Kubernetes. All metadata is stored in PostgreSQL, and the database connection is configured through environment variables.

Benefits

By shortening discovery cycles, providing transparent usage insights, and enabling proactive data quality monitoring, ODD helps organizations foster a data‑centric culture while maintaining compliance and governance.

Highlights

Federated data catalog with end‑to‑end lineage across heterogeneous sources
Built‑in data quality dashboard compatible with Great Expectations and dbt
Automatic ML experiment tracking and parameter logging
Extensive integrations (200+ adapters) and Helm/Kubernetes deployment options

Pros

  • Reduces time required for data discovery
  • Provides comprehensive lineage visibility
  • Supports compliance via tagging and security features
  • Deployable via Docker, Docker Compose, or Helm

Considerations

  • Requires a PostgreSQL backend to store metadata
  • UI may feel lightweight compared to commercial alternatives
  • Large number of integrations can increase configuration complexity
  • Observability features depend on proper instrumentation of pipelines

Managed products teams compare with

When teams consider ODD Platform, these hosted platforms usually appear on the same shortlist.

Alation

Data catalog platform for data discovery, governance, and lineage

Ataccama

Unified data management platform combining catalog, governance, data quality, and MDM

Atlan

Modern data catalog and collaborative metadata platform for data discovery and governance

Looking for a hosted option? These are the services engineering teams benchmark against before choosing open source.

Fit guide

Great for

  • Data engineering teams needing a unified catalog and lineage view
  • Analytics groups that require data quality dashboards
  • ML practitioners wanting automatic experiment metadata capture
  • Organizations adopting the Open Data Discovery specification

Not ideal when

  • Small projects without a dedicated PostgreSQL instance
  • Teams that need an out‑of‑the‑box SaaS UI
  • Environments lacking container orchestration support
  • Users requiring advanced data profiling beyond basic metrics

How teams use it

Accelerate dashboard creation

Analysts quickly locate source tables and understand lineage, reducing time to build reliable BI reports.

Automate data quality monitoring

Data engineers integrate Great Expectations tests, view failures in the DQ dashboard, and receive alerts to prevent downstream issues.

Track ML experiment provenance

ML teams log parameters and results automatically, enabling reproducibility and comparison across model runs.

Govern reference data

Data stewards manage lookup tables centrally, ensuring consistent codes and compliance across pipelines.

Tech snapshot

Java 58%
TypeScript 41%
Mustache 1%
Groovy 1%
HTML 1%
JavaScript 1%

Tags

data-pipelines, observability, data-governance, alerting, data-profiling, bigdata, data-platform, data-exploration, metadata, data-catalog, data-quality, datacatalog, data-observability, data-engineering, data-lineage, data-science, data-discovery, metadata-management, lineage, oss

Frequently asked questions

What database does the platform use for metadata?

It stores metadata in PostgreSQL; you configure the connection via environment variables.
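
As a rough illustration of what that configuration might look like, the Python sketch below assembles a set of environment variables for a PostgreSQL connection. The variable names (`SPRING_DATASOURCE_URL`, `SPRING_DATASOURCE_USERNAME`, `SPRING_DATASOURCE_PASSWORD`) are an assumption based on Spring Boot's standard datasource properties, not something this page confirms, and the helper functions are hypothetical. Check the project's own documentation for the exact names before relying on them.

```python
from urllib.parse import quote


def postgres_env(host, port, database, user, password):
    """Build environment variables describing a PostgreSQL connection.

    The variable names are ASSUMED (Spring Boot datasource convention);
    verify them against the project's documentation.
    """
    # Standard PostgreSQL JDBC URL shape: jdbc:postgresql://host:port/database
    url = f"jdbc:postgresql://{host}:{port}/{quote(database, safe='')}"
    return {
        "SPRING_DATASOURCE_URL": url,
        "SPRING_DATASOURCE_USERNAME": user,
        "SPRING_DATASOURCE_PASSWORD": password,
    }


def to_docker_flags(env):
    """Turn an env mapping into a list of `-e KEY=VALUE` arguments for `docker run`."""
    flags = []
    for key, value in sorted(env.items()):
        flags += ["-e", f"{key}={value}"]
    return flags


# Placeholder connection values for illustration only.
env = postgres_env("localhost", 5432, "odd-platform", "odd", "secret")
print(" ".join(to_docker_flags(env)))
```

Keeping the connection settings in one mapping makes it easy to reuse the same values for a plain Docker run, a Compose file, or Helm values.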

Can I run the platform locally without Kubernetes?

Yes, you can start a Docker container or use the provided docker‑compose demo.

How does ODD integrate with existing data pipelines?

ODD provides proxy adapters and native connectors for tools like Airflow, dbt, Great Expectations, and many databases.

Is there support for data lineage visualization?

The UI includes end‑to‑end lineage graphs for datasets, transformers, and consumers.
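
To make the idea of end‑to‑end lineage concrete, here is a minimal Python sketch that models lineage as a directed graph and walks it upstream or downstream from any entity. It is a generic illustration, not ODD's internal data model or API; the `LineageGraph` class and the entity names are hypothetical.

```python
from collections import defaultdict, deque


class LineageGraph:
    """Toy lineage graph: an edge means data flows from source to target."""

    def __init__(self):
        self._downstream = defaultdict(set)
        self._upstream = defaultdict(set)

    def add_edge(self, source, target):
        self._downstream[source].add(target)
        self._upstream[target].add(source)

    @staticmethod
    def _walk(start, edges):
        # Breadth-first traversal collecting every node reachable from start.
        seen, queue = set(), deque([start])
        while queue:
            node = queue.popleft()
            for neighbor in edges.get(node, ()):
                if neighbor not in seen:
                    seen.add(neighbor)
                    queue.append(neighbor)
        seen.discard(start)  # do not report the starting entity itself
        return seen

    def upstream(self, node):
        """Everything the given entity depends on, directly or transitively."""
        return self._walk(node, self._upstream)

    def downstream(self, node):
        """Everything affected by the given entity, directly or transitively."""
        return self._walk(node, self._downstream)


# Hypothetical pipeline: dataset -> transformer -> dataset -> consumer.
graph = LineageGraph()
graph.add_edge("raw.orders", "job.clean_orders")
graph.add_edge("job.clean_orders", "mart.orders")
graph.add_edge("mart.orders", "dashboard.revenue")

print(sorted(graph.upstream("dashboard.revenue")))
print(sorted(graph.downstream("raw.orders")))
```

Walking upstream answers "where did this dashboard's data come from?", while walking downstream answers "what breaks if this table changes?".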

Is the platform compatible with the Open Data Discovery specification?

Yes, ODD is a reference implementation of the Open Data Discovery spec.

Project at a glance

Active
Stars: 1,372
Watchers: 1,372
Forks: 132
License: Apache-2.0
Repo age: 4 years old
Last commit: 6 days ago
Primary language: Java

Last synced 3 hours ago