
Seldon Core 2

Deploy modular, data-centric AI applications at scale on Kubernetes

Seldon Core 2 provides a Kubernetes-native MLOps and LLMOps platform for deploying, managing, and scaling modular AI applications, supporting pipelines, autoscaling, multi-model serving, and experiments.


Overview

Seldon Core 2 is a Kubernetes‑native framework that enables data‑centric AI teams to package, deploy, and operate both machine‑learning and large‑language‑model workloads at production scale. It abstracts the complexities of Kubernetes while giving fine‑grained control over model serving, routing, and resource management.

Capabilities

The platform supports composable pipelines that stream data via Kafka, automatic scaling of models and custom components, and multi‑model serving that consolidates many models onto shared inference servers. Experiments let you run A/B tests and shadow deployments, and monitor models for drift, without disrupting live traffic. Overcommit further reduces infrastructure spend by letting a cluster schedule more models than physical memory would normally permit.
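
The Pipeline resource makes the first of these concrete. Below is a minimal sketch, assuming two already‑deployed Models named preprocess and classifier (both names are hypothetical); each connection between steps is backed by a Kafka topic that Seldon Core 2 manages:

```yaml
# Hypothetical two-step pipeline: "preprocess" streams its output into "classifier".
# Each inter-step connection is backed by a Kafka topic managed by Seldon Core 2.
apiVersion: mlops.seldon.io/v1alpha1
kind: Pipeline
metadata:
  name: example-pipeline
spec:
  steps:
    - name: preprocess        # assumes a Model named "preprocess" is deployed
    - name: classifier        # assumes a Model named "classifier" is deployed
      inputs:
        - preprocess          # feed classifier from preprocess's output
  output:
    steps:
      - classifier            # the pipeline response is the classifier's output
```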

Deployment

Deployments are defined declaratively and can run on‑premises or in any cloud that hosts a Kubernetes cluster. Seldon Core 2 integrates with existing CI/CD pipelines and provides out‑of‑the‑box monitoring and performance tuning tools, making it suitable for enterprises seeking a production‑ready MLOps solution.
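
As a minimal sketch of that declarative style (the artifact URI is a placeholder), a Model resource looks roughly like this:

```yaml
# Minimal Model resource; the scheduler places it onto a compatible inference server.
apiVersion: mlops.seldon.io/v1alpha1
kind: Model
metadata:
  name: iris
spec:
  storageUri: "gs://example-bucket/models/iris-sklearn"  # placeholder artifact location
  requirements:
    - sklearn    # matched against server capabilities during scheduling
```

Because it is an ordinary Kubernetes manifest, it rolls out with standard tooling such as `kubectl apply -f model.yaml`, which is what makes the CI/CD integration straightforward.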

Highlights

Composable pipelines with Kafka‑based real‑time streaming
Native and custom autoscaling for models and components
Multi‑model serving to consolidate inference workloads
Experiment routing for A/B tests, shadow deployments, and drift detection

Pros

  • Kubernetes‑native scalability and reliability
  • Modular architecture enables flexible AI application design
  • Cost‑saving overcommit and shared inference servers
  • Built‑in experiment management for rapid model iteration

Considerations

  • Requires operational expertise with Kubernetes
  • Business Source License may limit unrestricted commercial redistribution
  • Complexity when building custom components or pipelines
  • Steeper learning curve compared to lightweight single‑model servers

Managed products teams compare with

When teams consider Seldon Core 2, these hosted platforms usually appear on the same shortlist.


Amazon SageMaker

Fully managed machine learning service to build, train, and deploy ML models at scale


Anyscale

Ray-powered platform for scalable LLM training and inference.


BentoML

Open-source model serving framework to ship AI applications.

Looking for a hosted option? These are the services engineering teams benchmark against before choosing open source.

Fit guide

Great for

  • Enterprises needing production‑grade model serving at scale
  • Teams building data‑centric AI pipelines with real‑time requirements
  • Organizations that run many models and want to share infrastructure
  • Developers experimenting with A/B testing and LLM integrations

Not ideal when

  • Small projects without access to a Kubernetes cluster
  • Users requiring a permissive open‑source license for redistribution
  • Teams lacking ops resources to manage cluster lifecycle
  • Simple single‑model deployments where a lightweight server suffices

How teams use it

Real‑time fraud detection pipeline

Stream transaction data through Kafka‑linked models to flag anomalies instantly while auto‑scaling under load.

Recommendation model A/B testing

Route live traffic between candidate models, collect performance metrics, and promote the best performer without downtime.
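
A minimal sketch of such an experiment, assuming two deployed Models named recommender-v1 and recommender-v2 (the names and the 50/50 split are hypothetical):

```yaml
# Split live traffic evenly between an incumbent and a challenger model.
apiVersion: mlops.seldon.io/v1alpha1
kind: Experiment
metadata:
  name: recommender-ab
spec:
  default: recommender-v1      # traffic returns here when the experiment is removed
  candidates:
    - name: recommender-v1
      weight: 50               # half the traffic stays on the incumbent
    - name: recommender-v2
      weight: 50               # half goes to the challenger
```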

Scalable LLM chat service with drift detection

Deploy LLMs alongside custom components that monitor response quality and trigger fallback models when drift is detected.
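
One way to express this, as a hedged sketch (model and detector names are hypothetical; the drift detector itself would be a custom component deployed as a Model):

```yaml
# Chat pipeline with a drift detector running on a batched side branch.
apiVersion: mlops.seldon.io/v1alpha1
kind: Pipeline
metadata:
  name: llm-chat
spec:
  steps:
    - name: llm                 # the chat model answering requests
    - name: drift-detector      # custom component scoring response quality
      inputs:
        - llm
      batch:
        size: 20                # score drift over batches rather than per request
  output:
    steps:
      - llm                     # clients see only the LLM output; drift runs out of band
```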

Consolidated IoT inference for edge devices

Host dozens of lightweight models on shared servers, using overcommit to maximize hardware utilization across edge workloads.
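
A sketch of the shared‑server pattern under stated assumptions (names and sizes are placeholders; the overcommit ratio itself is an agent‑level setting rather than a field on these resources):

```yaml
# A shared inference server that many lightweight models are packed onto.
apiVersion: mlops.seldon.io/v1alpha1
kind: Server
metadata:
  name: edge-mlserver
spec:
  serverConfig: mlserver    # base server configuration to instantiate
  replicas: 2               # two shared replicas host all matching models
---
# Each small model declares an approximate memory footprint, which the scheduler
# uses when packing models onto servers, including beyond RAM under overcommit.
apiVersion: mlops.seldon.io/v1alpha1
kind: Model
metadata:
  name: sensor-anomaly-01
spec:
  storageUri: "gs://example-bucket/models/sensor-anomaly-01"  # placeholder
  requirements:
    - sklearn
  memory: 100Ki
```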

Tech snapshot

Go 54%
Jupyter Notebook 21%
Kotlin 14%
JavaScript 3%
Smarty 3%
Java 2%

Tags

mlops, serving, kubernetes, aiops, production-machine-learning, machine-learning, deployment, machine-learning-operations

Frequently asked questions

What Kubernetes version is required?

Seldon Core 2 supports Kubernetes 1.21 and newer; consult the documentation for specific version compatibility.

How does autoscaling work?

Autoscaling can be driven by native metrics (CPU, memory) or custom logic defined in Seldon resources.
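
A minimal sketch of the declarative side (model name and artifact path are placeholders): replica bounds on the Model spec give the scheduler room to scale.

```yaml
# Model that Seldon Core 2 may scale between one and five replicas under load.
apiVersion: mlops.seldon.io/v1alpha1
kind: Model
metadata:
  name: fraud-classifier
spec:
  storageUri: "gs://example-bucket/models/fraud-classifier"  # placeholder
  requirements:
    - xgboost
  minReplicas: 1    # floor kept warm at all times
  maxReplicas: 5    # ceiling under peak traffic
```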

Can I run Seldon Core 2 on‑premises?

Yes, the platform is cloud‑agnostic and runs on any on‑prem Kubernetes cluster.

What are the licensing terms?

The project is distributed under the Business Source License; contributions inherit the same license.

How do I add custom components?

Implement a Dockerized service that conforms to Seldon's component API and reference it in your pipeline definition.
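
A hedged sketch of the wiring (artifact path, capability tag, and step names are hypothetical): the containerized component is deployed as a Model and then referenced by name as a pipeline step.

```yaml
# Custom component deployed as a Model and exposed as a pipeline step.
apiVersion: mlops.seldon.io/v1alpha1
kind: Model
metadata:
  name: my-transformer
spec:
  storageUri: "gs://example-bucket/components/my-transformer"  # placeholder artifact
  requirements:
    - my-runtime    # hypothetical capability advertised by your custom server
---
apiVersion: mlops.seldon.io/v1alpha1
kind: Pipeline
metadata:
  name: with-custom-step
spec:
  steps:
    - name: my-transformer    # the custom component runs first
    - name: classifier        # assumes a Model named "classifier" is deployed
      inputs:
        - my-transformer
  output:
    steps:
      - classifier
```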

Project at a glance

Status: Active
Stars: 4,710
Watchers: 4,710
Forks: 857
Repo age: 8 years old
Last commit: 2 days ago
Primary language: Go
