
Seldon Core 2

Deploy modular, data-centric AI applications at scale on Kubernetes

Seldon Core 2 provides a Kubernetes-native MLOps and LLMOps platform for deploying, managing, and scaling modular AI applications, supporting pipelines, autoscaling, multi-model serving, and experiments.


Overview

Seldon Core 2 is a Kubernetes‑native framework that enables data‑centric AI teams to package, deploy, and operate both machine‑learning and large‑language‑model workloads at production scale. It abstracts the complexities of Kubernetes while giving fine‑grained control over model serving, routing, and resource management.

Capabilities

The platform supports composable pipelines that stream data via Kafka, automatic scaling of models and custom components, and multi‑model serving that consolidates many models onto shared inference servers. Experiments let you run A/B tests and shadow deployments, and monitor models for drift, without disrupting live traffic. Overcommit further reduces infrastructure spend by letting a cluster schedule more models than physical memory would normally permit.
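
The Pipeline resource makes the first of these concrete. Below is a minimal sketch, assuming two already‑deployed Models named preprocess and classifier (both names are hypothetical); each connection between steps is backed by a Kafka topic that Seldon Core 2 manages:

```yaml
# Hypothetical two-step pipeline: "preprocess" streams its output into "classifier".
# Each inter-step connection is backed by a Kafka topic managed by Seldon Core 2.
apiVersion: mlops.seldon.io/v1alpha1
kind: Pipeline
metadata:
  name: example-pipeline
spec:
  steps:
    - name: preprocess        # assumes a Model named "preprocess" is deployed
    - name: classifier        # assumes a Model named "classifier" is deployed
      inputs:
        - preprocess          # feed classifier from preprocess's output
  output:
    steps:
      - classifier            # the pipeline response is the classifier's output
```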

Deployment

Deployments are defined declaratively and can run on‑premises or in any cloud that hosts a Kubernetes cluster. Seldon Core 2 integrates with existing CI/CD pipelines and provides out‑of‑the‑box monitoring and performance tuning tools, making it suitable for enterprises seeking a production‑ready MLOps solution.
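
As a minimal sketch of that declarative style (the artifact URI is a placeholder), a Model resource looks roughly like this:

```yaml
# Minimal Model resource; the scheduler places it onto a compatible inference server.
apiVersion: mlops.seldon.io/v1alpha1
kind: Model
metadata:
  name: iris
spec:
  storageUri: "gs://example-bucket/models/iris-sklearn"  # placeholder artifact location
  requirements:
    - sklearn    # matched against server capabilities during scheduling
```

Because it is an ordinary Kubernetes manifest, it rolls out with standard tooling such as `kubectl apply -f model.yaml`, which is what makes the CI/CD integration straightforward.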

Highlights

Composable pipelines with Kafka‑based real‑time streaming
Native and custom autoscaling for models and components
Multi‑model serving to consolidate inference workloads
Experiment routing for A/B tests, shadow deployments, and drift detection

Pros

  • Kubernetes‑native scalability and reliability
  • Modular architecture enables flexible AI application design
  • Cost‑saving overcommit and shared inference servers
  • Built‑in experiment management for rapid model iteration

Considerations

  • Requires operational expertise with Kubernetes
  • Business Source License may limit unrestricted commercial redistribution
  • Complexity when building custom components or pipelines
  • Steeper learning curve compared to lightweight single‑model servers

Managed products teams compare with

When teams consider Seldon Core 2, these hosted platforms usually appear on the same shortlist.


Amazon SageMaker

Fully managed machine learning service to build, train, and deploy ML models at scale


Anyscale

Ray-powered platform for scalable LLM training and inference.


BentoML

Open-source model serving framework to ship AI applications.

Looking for a hosted option? These are the services engineering teams benchmark against before choosing open source.

Fit guide

Great for

  • Enterprises needing production‑grade model serving at scale
  • Teams building data‑centric AI pipelines with real‑time requirements
  • Organizations that run many models and want to share infrastructure
  • Developers experimenting with A/B testing and LLM integrations

Not ideal when

  • Small projects without access to a Kubernetes cluster
  • Users requiring a permissive open‑source license for redistribution
  • Teams lacking ops resources to manage cluster lifecycle
  • Simple single‑model deployments where a lightweight server suffices

How teams use it

Real‑time fraud detection pipeline

Stream transaction data through Kafka‑linked models to flag anomalies instantly while auto‑scaling under load.

Recommendation model A/B testing

Route live traffic between candidate models, collect performance metrics, and promote the best performer without downtime.
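
A minimal sketch of such an experiment, assuming two deployed Models named recommender-v1 and recommender-v2 (the names and the 50/50 split are hypothetical):

```yaml
# Split live traffic evenly between an incumbent and a challenger model.
apiVersion: mlops.seldon.io/v1alpha1
kind: Experiment
metadata:
  name: recommender-ab
spec:
  default: recommender-v1      # traffic returns here when the experiment is removed
  candidates:
    - name: recommender-v1
      weight: 50               # half the traffic stays on the incumbent
    - name: recommender-v2
      weight: 50               # half goes to the challenger
```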

Scalable LLM chat service with drift detection

Deploy LLMs alongside custom components that monitor response quality and trigger fallback models when drift is detected.
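
One way to express this, as a hedged sketch (model and detector names are hypothetical; the drift detector itself would be a custom component deployed as a Model):

```yaml
# Chat pipeline with a drift detector running on a batched side branch.
apiVersion: mlops.seldon.io/v1alpha1
kind: Pipeline
metadata:
  name: llm-chat
spec:
  steps:
    - name: llm                 # the chat model answering requests
    - name: drift-detector      # custom component scoring response quality
      inputs:
        - llm
      batch:
        size: 20                # score drift over batches rather than per request
  output:
    steps:
      - llm                     # clients see only the LLM output; drift runs out of band
```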

Consolidated IoT inference for edge devices

Host dozens of lightweight models on shared servers, using overcommit to maximize hardware utilization across edge workloads.
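
A sketch of the shared‑server pattern under stated assumptions (names and sizes are placeholders; the overcommit ratio itself is an agent‑level setting rather than a field on these resources):

```yaml
# A shared inference server that many lightweight models are packed onto.
apiVersion: mlops.seldon.io/v1alpha1
kind: Server
metadata:
  name: edge-mlserver
spec:
  serverConfig: mlserver    # base server configuration to instantiate
  replicas: 2               # two shared replicas host all matching models
---
# Each small model declares an approximate memory footprint, which the scheduler
# uses when packing models onto servers, including beyond RAM under overcommit.
apiVersion: mlops.seldon.io/v1alpha1
kind: Model
metadata:
  name: sensor-anomaly-01
spec:
  storageUri: "gs://example-bucket/models/sensor-anomaly-01"  # placeholder
  requirements:
    - sklearn
  memory: 100Ki
```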

Tech snapshot

Go 54%
Jupyter Notebook 21%
Kotlin 14%
JavaScript 3%
Smarty 3%
Java 2%

Tags

mlops, serving, kubernetes, aiops, production-machine-learning, machine-learning, deployment, machine-learning-operations

Frequently asked questions

What Kubernetes version is required?

Seldon Core 2 supports Kubernetes 1.21 and newer; consult the documentation for specific version compatibility.

How does autoscaling work?

Autoscaling can be driven by native metrics (CPU, memory) or custom logic defined in Seldon resources.
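
A minimal sketch of the declarative side (model name and artifact path are placeholders): replica bounds on the Model spec give the scheduler room to scale.

```yaml
# Model that Seldon Core 2 may scale between one and five replicas under load.
apiVersion: mlops.seldon.io/v1alpha1
kind: Model
metadata:
  name: fraud-classifier
spec:
  storageUri: "gs://example-bucket/models/fraud-classifier"  # placeholder
  requirements:
    - xgboost
  minReplicas: 1    # floor kept warm at all times
  maxReplicas: 5    # ceiling under peak traffic
```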

Can I run Seldon Core 2 on‑premises?

Yes, the platform is cloud‑agnostic and runs on any on‑prem Kubernetes cluster.

What are the licensing terms?

The project is distributed under the Business Source License; contributions inherit the same license.

How do I add custom components?

Implement a Dockerized service that conforms to Seldon's component API and reference it in your pipeline definition.
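
A hedged sketch of the wiring (artifact path, capability tag, and step names are hypothetical): the containerized component is deployed as a Model and then referenced by name as a pipeline step.

```yaml
# Custom component deployed as a Model and exposed as a pipeline step.
apiVersion: mlops.seldon.io/v1alpha1
kind: Model
metadata:
  name: my-transformer
spec:
  storageUri: "gs://example-bucket/components/my-transformer"  # placeholder artifact
  requirements:
    - my-runtime    # hypothetical capability advertised by your custom server
---
apiVersion: mlops.seldon.io/v1alpha1
kind: Pipeline
metadata:
  name: with-custom-step
spec:
  steps:
    - name: my-transformer    # the custom component runs first
    - name: classifier        # assumes a Model named "classifier" is deployed
      inputs:
        - my-transformer
  output:
    steps:
      - classifier
```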

Project at a glance

Status: Active
Stars: 4,710
Watchers: 4,710
Forks: 857
Repo age: 8 years old
Last commit: 2 days ago
Primary language: Go
