
Amazon SageMaker Feature Store
Fully managed repository to create, store, share, and serve ML features

Scalable feature store for unified data and AI engineering
Feathr provides Pythonic APIs to define, register, and share feature transformations across batch, streaming, and online environments, with point-in-time correctness and native cloud integrations for enterprise AI pipelines.

Feathr is a data and AI engineering platform used in production at LinkedIn for over six years and now available as an open‑source project under the LF AI & Data Foundation. It lets data scientists define feature transformations with Pythonic APIs, register them by name, and reuse them across teams, ensuring consistent, point‑in‑time‑correct data for model training and online serving.
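To make the "register by name, reuse across teams" idea concrete, here is a minimal sketch in plain Python. It is not Feathr's API: `FeatureDef`, `FeatureRegistry`, and the feature name `f_fare_per_mile` are hypothetical names chosen for illustration. The point is only that training and serving code resolve the same named definition instead of each reimplementing the transformation.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List


@dataclass(frozen=True)
class FeatureDef:
    """A named feature transformation (hypothetical structure, not Feathr's API)."""
    name: str
    transform: Callable[[dict], float]
    description: str = ""


class FeatureRegistry:
    """Minimal in-memory registry keyed by feature name."""

    def __init__(self) -> None:
        self._features: Dict[str, FeatureDef] = {}

    def register(self, feature: FeatureDef) -> None:
        # Reject duplicates so a name always maps to exactly one definition.
        if feature.name in self._features:
            raise ValueError(f"feature {feature.name!r} is already registered")
        self._features[feature.name] = feature

    def get(self, name: str) -> FeatureDef:
        return self._features[name]

    def compute(self, names: List[str], row: dict) -> Dict[str, float]:
        # Training and serving both call this, so they share one definition.
        return {n: self._features[n].transform(row) for n in names}


registry = FeatureRegistry()
registry.register(FeatureDef(
    name="f_fare_per_mile",
    transform=lambda r: r["fare_amount"] / r["trip_distance"],
    description="Fare divided by trip distance",
))

row = {"fare_amount": 12.0, "trip_distance": 3.0}
features = registry.compute(["f_fare_per_mile"], row)
print(features)
```

A production registry would also persist definitions and record lineage and access rules; this sketch covers only the lookup-by-name part.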
Feathr supports batch, streaming, and online workloads with built‑in optimizations that can handle billions of rows and petabyte‑scale datasets. Its native integrations with Databricks and Azure Synapse, along with ARM templates and CLI guides, simplify cloud deployment. Users can start quickly with the Feathr Sandbox Docker container, which includes a UI and Jupyter notebooks for hands‑on experimentation, or install the client via pip for local development.
A built‑in registry and intuitive UI provide feature discovery, lineage tracking, and access control. Rich transformation primitives—including time‑based aggregations, sliding windows, and custom UDFs with PySpark or Spark SQL—enable flexible engineering of complex AI features.
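As a rough illustration of what a time-based sliding-window aggregation computes, the sketch below sums event amounts over a trailing window in plain Python. It does not use Feathr or Spark; the function name `sliding_window_sum` and the sample events are assumptions made for this example.

```python
from collections import deque
from datetime import datetime, timedelta
from typing import Deque, Iterable, List, Tuple

Event = Tuple[datetime, float]  # (event time, amount)


def sliding_window_sum(events: Iterable[Event], window: timedelta) -> List[Tuple[datetime, float]]:
    """For each event (sorted by time), sum the amounts in (t - window, t]."""
    buf: Deque[Event] = deque()
    total = 0.0
    out: List[Tuple[datetime, float]] = []
    for ts, amount in events:
        buf.append((ts, amount))
        total += amount
        # Evict events that have fallen out of the window ending at ts.
        while buf[0][0] <= ts - window:
            _, old_amount = buf.popleft()
            total -= old_amount
        out.append((ts, total))
    return out


events = [
    (datetime(2024, 1, 1, 10, 0), 10.0),
    (datetime(2024, 1, 1, 10, 30), 5.0),
    (datetime(2024, 1, 1, 11, 15), 2.0),
]
sums = [value for _, value in sliding_window_sum(events, timedelta(hours=1))]
print(sums)
```

The deque keeps each event in memory only while it is inside the window, so every event is added once and evicted once.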
When teams consider Feathr, these hosted platforms usually appear on the same shortlist.

Fully managed repository to create, store, share, and serve ML features

Feature registry with governance, lineage, and MLflow integration

Central hub to manage, govern, and serve ML features across batch, streaming, and real time
NYC Taxi fare prediction: Rapidly define, materialize, and serve fare prediction features with point‑in‑time correctness
Fraud detection pipeline: Combine user account and transaction streams into real‑time fraud risk features
Product recommendation system: Generate and serve user‑item interaction features for personalized ranking
Feature embedding for NLP: Create embedding features using transformer models and serve them in online inference
Run the Feathr Sandbox Docker container, which includes UI, Jupyter, and core services, and follow the quickstart notebook.
Feathr’s APIs are Pythonic; transformations can be expressed with native PySpark or Spark SQL.
Feathr has native integrations with Databricks and Azure Synapse, supported by deployment guides and ARM templates.
Feathr computes features using point‑in‑time‑correct semantics, so each training row only sees feature values that were available at or before its event timestamp.
Feathr includes a web UI for searching, exploring lineage, and managing access to registered features.
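The point-in-time semantics described above can be sketched as an "as of" lookup: for each observation, take the most recent feature value whose timestamp is not later than the observation's event timestamp. This is a minimal illustration in plain Python, not Feathr's implementation; `point_in_time_lookup`, the integer timestamps, and the sample data are all assumptions made for this example.

```python
from bisect import bisect_right
from typing import Dict, List, Optional, Tuple

# Feature history per entity key: (timestamp, value) pairs sorted by timestamp.
FeatureHistory = Dict[str, List[Tuple[int, float]]]


def point_in_time_lookup(history: FeatureHistory, key: str, event_ts: int) -> Optional[float]:
    """Return the latest feature value with timestamp <= event_ts, else None."""
    rows = history.get(key, [])
    timestamps = [ts for ts, _ in rows]
    # bisect_right returns the count of timestamps <= event_ts.
    idx = bisect_right(timestamps, event_ts)
    if idx == 0:
        return None  # no value was known yet at event_ts
    return rows[idx - 1][1]


history = {"user_1": [(100, 0.2), (200, 0.5), (300, 0.9)]}
observations = [("user_1", 50), ("user_1", 200), ("user_1", 250), ("user_2", 250)]

# Each training row only sees values available at or before its event time.
training_rows = [(k, ts, point_in_time_lookup(history, k, ts)) for k, ts in observations]
print(training_rows)
```

Note that the observation at timestamp 250 receives the value written at 200, not the later value written at 300. Using the later value would leak future information into training data.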
Project at a glance
Dormant. Last synced 4 days ago.