
SGLang
High‑performance serving framework for LLMs and vision‑language models.
- Stars: 20,742
- License: Apache-2.0
- Last commit: 3 days ago
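SGLang serves models behind an OpenAI-compatible HTTP API. As a minimal sketch, the client below builds a chat-completion payload and posts it with only the standard library; the base URL and model name are assumptions for a locally running server, and the endpoint path and payload shape follow the OpenAI chat-completions convention:

```python
import json
import urllib.request


def build_chat_request(model: str, prompt: str, max_tokens: int = 64) -> dict:
    """Build an OpenAI-style chat-completion payload."""
    return {
        "model": model,
        "messages": [{"role": "user", "content": prompt}],
        "max_tokens": max_tokens,
    }


def post_chat(base_url: str, payload: dict) -> dict:
    """POST the payload to an OpenAI-compatible /v1/chat/completions endpoint."""
    req = urllib.request.Request(
        f"{base_url}/v1/chat/completions",
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
    )
    with urllib.request.urlopen(req) as resp:
        return json.load(resp)


if __name__ == "__main__":
    # Assumes a server is already running locally, e.g. one started with
    # `python -m sglang.launch_server --model-path <model> --port 30000`.
    payload = build_chat_request("<model>", "Say hello in one sentence.")
    print(post_chat("http://localhost:30000", payload))
```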
Explore curated open-source tools in the Model Serving & Inference Platforms category. Compare technologies, see alternatives, and find the right solution for your workflow.
10+ projects


- Fast, lightweight Python framework for scalable LLM inference
- Unified AI model serving across clouds, edge, and GPUs
- Accelerated LLM inference with NVIDIA TensorRT optimizations
- Scale Python and AI workloads from laptop to cluster effortlessly
- Unified Python framework for building high‑performance AI inference APIs
- Unified AI inference platform for generative and predictive workloads on Kubernetes
- Deploy modular, data-centric AI applications at scale on Kubernetes
- High‑throughput LLM serving with intra‑device parallelism and asynchronous CPU scheduling
- Unified ML library for scalable training, serving, and federated learning