
LiteLLM

Unified gateway for all LLM APIs with OpenAI compatibility

LiteLLM provides a single Python interface to call dozens of LLM providers—OpenAI, Azure, Anthropic, Bedrock, HuggingFace, and more—using the familiar OpenAI request/response format.


Overview

LiteLLM is designed for developers, data scientists, and enterprises that need to integrate multiple large language model (LLM) services without rewriting code for each vendor. By exposing a consistent OpenAI‑style API, it lets you swap providers or run parallel experiments with a single function call.

Core capabilities

The library translates inputs to each provider’s completion, embedding, and image‑generation endpoints, guarantees a uniform `choices[0].message.content` response shape, and includes built‑in retry, fallback, and routing logic. It supports async calls, streaming token‑by‑token output, and configurable budgets, rate limits, and per‑project isolation. Observability callbacks can forward logs to Lunary, MLflow, Langfuse, Helicone, and other platforms.
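A minimal sketch of the call shape (assuming an OpenAI key is set in the environment; the model name and prompt are placeholders):

```python
from litellm import completion

# One call shape for every provider; the model string selects the backend.
response = completion(
    model="gpt-4o-mini",  # placeholder model name
    messages=[{"role": "user", "content": "Summarize LiteLLM in one sentence."}],
)

# Responses expose the uniform OpenAI-style shape described above.
print(response.choices[0].message.content)
```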

Deployment options

Install via `pip install litellm` or run the official Docker image with the `-stable` tag for production‑grade, load‑tested containers. Set provider API keys as environment variables and optionally deploy the LiteLLM proxy server for multi‑tenant routing, hosted preview, or enterprise‑managed services.
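As a rough sketch of the key setup, assuming the standard OPENAI_API_KEY and ANTHROPIC_API_KEY variable names; the key values and model name are placeholders:

```python
import os
from litellm import completion

# Provider keys are read from environment variables; values here are placeholders.
os.environ["OPENAI_API_KEY"] = "sk-..."
os.environ["ANTHROPIC_API_KEY"] = "sk-ant-..."

# The same completion() call can now target either provider.
response = completion(
    model="claude-3-haiku-20240307",  # placeholder Anthropic model name
    messages=[{"role": "user", "content": "Hello"}],
)
```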

Highlights

Single OpenAI‑compatible API for over 20 LLM providers
Built‑in retry, fallback, and routing across multiple deployments
Streaming and async support for real‑time applications
Observability callbacks to Lunary, MLflow, Langfuse, Helicone, and more

Pros

  • Reduces code duplication when switching between providers
  • Consistent response schema simplifies downstream processing
  • Supports budgeting, rate limiting, and multi‑project isolation
  • Extensible logging integrates with popular observability platforms

Considerations

  • Adds an abstraction layer that may introduce slight latency
  • Requires keeping provider API keys as environment variables
  • Feature set depends on provider‑specific capabilities; not all endpoints are fully mapped
  • Complex routing rules may need additional configuration

Managed products teams compare with

When teams consider LiteLLM, these hosted platforms usually appear on the same shortlist.


Eden AI

Unified API aggregator for AI services across providers


OpenRouter

One API for 400+ AI models with smart routing and unified billing/BYOK


Vercel AI Gateway

Unified AI gateway for multi-provider routing, caching, rate limits, and observability

Looking for a hosted option? These are the services engineering teams benchmark against before choosing open source.

Fit guide

Great for

  • Teams building applications that need to experiment with multiple LLMs
  • Enterprises requiring spend tracking and rate‑limit enforcement per project
  • Developers who prefer the OpenAI request format across providers
  • Projects that need streaming or async responses for chat‑like interfaces

Not ideal when

  • Simple scripts that only ever call a single provider
  • Environments where adding another Python dependency is undesirable
  • Use cases demanding ultra‑low latency without any abstraction overhead
  • Scenarios requiring provider‑specific features not yet exposed through LiteLLM

How teams use it

Multi‑model A/B testing

Switch between OpenAI, Anthropic, and Cohere models with a single function call, enabling rapid performance comparison.
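An illustrative sketch, with placeholder model names for three providers and the relevant API keys assumed to be set:

```python
from litellm import completion

prompt = [{"role": "user", "content": "Explain retrieval-augmented generation in two sentences."}]

# Placeholder model names for three providers; the call and response shape
# stay identical, so outputs can be compared side by side.
for model in ["gpt-4o-mini", "claude-3-haiku-20240307", "cohere/command-r"]:
    response = completion(model=model, messages=prompt)
    print(f"{model}: {response.choices[0].message.content}")
```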

Enterprise spend monitoring

Set per‑project budgets and rate limits, automatically routing excess traffic to fallback providers.
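A rough sketch using LiteLLM's Router for fallback routing; the deployment names and model strings are placeholders, and per‑project budgets and rate limits are typically configured on the proxy rather than inline:

```python
from litellm import Router

# Two deployments registered under logical names; values are placeholders.
router = Router(
    model_list=[
        {"model_name": "primary", "litellm_params": {"model": "gpt-4o-mini"}},
        {"model_name": "backup", "litellm_params": {"model": "anthropic/claude-3-haiku-20240307"}},
    ],
    # If "primary" fails, retry the request against "backup".
    fallbacks=[{"primary": ["backup"]}],
)

response = router.completion(
    model="primary",
    messages=[{"role": "user", "content": "Hello"}],
)
```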

Real‑time chat application

Leverage async and streaming support to deliver token‑by‑token responses to end users.
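A minimal sketch of async streaming, assuming an OpenAI key is set; the model name is a placeholder:

```python
import asyncio
from litellm import acompletion

async def stream_reply() -> None:
    # Stream tokens as they arrive; model name is a placeholder.
    response = await acompletion(
        model="gpt-4o-mini",
        messages=[{"role": "user", "content": "Tell me a short joke."}],
        stream=True,
    )
    async for chunk in response:
        delta = chunk.choices[0].delta.content
        if delta:
            print(delta, end="", flush=True)

asyncio.run(stream_reply())
```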

Centralized logging for compliance

Send request and response data to Langfuse, Helicone, or MLflow for audit trails and performance analytics.
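A sketch of the callback hookup for Langfuse, assuming the usual Langfuse environment variables; key values and the model name are placeholders:

```python
import os
import litellm
from litellm import completion

# Langfuse credentials; values here are placeholders.
os.environ["LANGFUSE_PUBLIC_KEY"] = "pk-..."
os.environ["LANGFUSE_SECRET_KEY"] = "sk-..."

# Forward successful and failed calls to Langfuse for audit trails.
litellm.success_callback = ["langfuse"]
litellm.failure_callback = ["langfuse"]

response = completion(
    model="gpt-4o-mini",
    messages=[{"role": "user", "content": "Hello"}],
)
```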

Tech snapshot

Python 87%
TypeScript 12%
HTML 1%
JavaScript 1%
Shell 1%
Makefile 1%

Tags

bedrock, llm-gateway, llm, gateway, ai-gateway, azure-openai, vertex-ai, mcp-gateway, langchain, anthropic, openai, openai-proxy, llmops, litellm

Frequently asked questions

How do I install LiteLLM?

Use `pip install litellm` or pull the official Docker image with the `-stable` tag.

Which LLM providers are supported?

LiteLLM supports OpenAI, Azure, Anthropic, Bedrock, HuggingFace, TogetherAI, VertexAI, Groq, and many others; see the provider list in the docs.
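Non-OpenAI backends are generally selected with a provider prefix on the model name; the identifiers below are illustrative and each call assumes the matching provider credentials are set:

```python
from litellm import completion

messages = [{"role": "user", "content": "Hello"}]

# Provider-prefixed model names route the same call to different backends;
# the identifiers below are illustrative placeholders.
completion(model="azure/my-gpt4o-deployment", messages=messages)  # Azure deployment name
completion(model="bedrock/anthropic.claude-3-haiku-20240307-v1:0", messages=messages)
completion(model="groq/llama3-8b-8192", messages=messages)
```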

Can I use LiteLLM for embeddings and image generation?

Yes, the library translates calls to the provider’s completion, embedding, and image_generation endpoints.
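For example (model names are placeholders, with the matching API keys assumed to be set):

```python
from litellm import embedding, image_generation

# Embedding call; the response follows the OpenAI embeddings format.
emb = embedding(model="text-embedding-3-small", input=["LiteLLM unifies LLM APIs."])

# Image generation call; model and prompt are placeholders.
img = image_generation(model="dall-e-3", prompt="A lighthouse at dusk")
```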

How does rate limiting work?

You can configure budgets and rate limits per project, API key, or model through the proxy’s routing settings.

Is there a hosted version?

A preview hosted proxy is available, and an enterprise tier offers managed deployment.

Project at a glance

Status: Active
Stars: 34,233
Watchers: 34,233
Forks: 5,416
Repo age: 2 years old
Last commit: 3 hours ago
Primary language: Python

Last synced 3 hours ago