
Amazon SageMaker Feature Store
Fully managed repository to create, store, share, and serve ML features
Discover top open-source software, updated regularly with real-world adoption signals.

Turn existing data pipelines into a collaborative virtual feature store
Featureform provides a centralized, immutable repository for defining, managing, and serving ML features, leveraging your current data stack while adding RBAC, audit logs, and vector-database support.

Featureform is a virtual feature store that sits atop your existing data stack. It lets data scientists define features, labels, and training sets in a logical, immutable form while the platform orchestrates the underlying compute and storage—whether Spark, Redis, or a vector database.
By centralizing definitions with metadata such as lineage, owner, and variant, teams can share and reuse features safely. Built-in role-based access control, audit logs, and dynamic serving rules help organizations meet compliance requirements without changing their workflow.
Featureform can run locally on a single machine, in a Docker container, or at scale on Kubernetes, connecting to any supported provider. Native support for embeddings enables versioned vector stores for both training and inference, making it suitable for modern ML applications.
When teams consider Featureform, these hosted platforms usually appear on the same shortlist.

Fully managed repository to create, store, share, and serve ML features

Feature registry with governance, lineage, and MLflow integration

Central hub to manage, govern, and serve ML features across batch, streaming, and real time
Looking for a hosted option? These are the services engineering teams benchmark against before choosing open source.
Notebook-to-Production Feature Pipeline
Data scientists push transformations from Jupyter notebooks to a central repository, then Featureform orchestrates Spark jobs and serves the feature in Redis for online inference.
Enterprise Governance Enforcement
Featureform’s RBAC and audit logs automatically enforce GDPR-based serving rules, ensuring only authorized models access sensitive features.
Embedding Store for Recommendation System
Transformer-based embeddings are versioned and stored in a vector database, enabling consistent training and real-time similarity queries.
Cross-Team Feature Reuse
Multiple teams discover and reuse existing feature definitions, reducing duplicate work and improving model consistency.
Featureform is infrastructure-agnostic and can orchestrate transformations on platforms such as Spark, Redis, and vector databases, connecting to your existing resources.
All feature, label, and training set definitions are stored as immutable objects with versioning and lineage metadata, preventing accidental changes.
Yes, Featureform can be run on a single machine, in a Docker container, or via Minikube for local testing.
Featureform is provided as open-source software; there is no separate SaaS offering mentioned in the documentation.
Featureform is released under the Mozilla Public License 2.0 (MPL-2.0).
Project at a glance
StableLast synced 4 days ago