

Collaborative platform for building, monitoring, and debugging LLM applications.
Langfuse lets AI teams develop, observe, evaluate, and iterate on LLM apps with unified tracing, prompt versioning, evaluation pipelines, and a deployment that can be self‑hosted in minutes.

Langfuse is a platform designed for AI product teams that need end‑to‑end visibility into their LLM applications. It captures traces of model calls, retrievals, embeddings, and agent actions, allowing developers to debug complex sessions and iterate quickly using an integrated playground.
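As an illustration of that tracing flow, here is a minimal sketch using the decorator-based instrumentation in the Python SDK (v2-style API); the retrieval step, the metadata, and the credentials-from-environment setup are illustrative assumptions, and method names may differ in newer SDK versions.

```python
# Minimal tracing sketch with the Langfuse Python SDK's decorators (v2-style).
# Assumes LANGFUSE_PUBLIC_KEY / LANGFUSE_SECRET_KEY / LANGFUSE_HOST are set.
from langfuse.decorators import observe, langfuse_context


@observe()  # records this function as a span in the current trace
def retrieve_context(query: str) -> str:
    # Hypothetical retrieval step; a real app would query a vector store here.
    return "Langfuse is an open-source LLM engineering platform."


@observe()  # the outermost decorated function becomes the trace root
def answer(query: str) -> str:
    context = retrieve_context(query)
    # Attach metadata that shows up on the trace in the Langfuse UI.
    langfuse_context.update_current_trace(metadata={"channel": "docs-demo"})
    return f"Answer based on: {context}"


if __name__ == "__main__":
    print(answer("What is Langfuse?"))
```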
Beyond observability, Langfuse offers centralized prompt management with version control and low‑latency caching; flexible evaluation frameworks that support LLM‑as‑judge scoring, human feedback, and custom pipelines; and dataset handling for benchmark testing. Typed SDKs for Python and JavaScript/TypeScript, plus a comprehensive OpenAPI spec, let you embed Langfuse into bespoke LLMOps workflows.
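A prompt-management call might look roughly like the sketch below, using the Python SDK; the prompt name `movie-critic` and its `movie` variable are hypothetical and assume the prompt was created and versioned in Langfuse beforehand.

```python
# Sketch of centralized prompt management with the Python SDK (v2-style client).
# Assumes LANGFUSE_PUBLIC_KEY / LANGFUSE_SECRET_KEY / LANGFUSE_HOST are set.
from langfuse import Langfuse

langfuse = Langfuse()

# Fetch the current production version of a prompt; "movie-critic" is a
# hypothetical prompt name. The SDK caches prompts client-side for low latency.
prompt = langfuse.get_prompt("movie-critic")

# Fill in the template variables defined when the prompt was versioned.
compiled = prompt.compile(movie="Dune: Part Two")
print(compiled)
```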
You can start with Langfuse Cloud’s free tier (no credit card required) or self‑host in minutes using Docker Compose. For production, a Kubernetes Helm chart and Terraform templates for AWS, Azure, and GCP give you full control over data residency and compliance.
When teams consider Langfuse, these hosted platforms usually appear on the shortlist engineering teams benchmark against before choosing open source:
Confident AI
DeepEval-powered LLM evaluation platform to test, benchmark, and safeguard apps
Debugging a multi‑step agent workflow
Trace each LLM call, retrieval, and tool use to pinpoint failures and iterate via the integrated playground.
A/B testing prompt versions
Manage prompt versions, collect user feedback, and run automated evaluations to select the highest‑performing prompt.
Continuous evaluation of model updates
Run custom evaluation pipelines on new model releases, compare metrics, and prevent regressions before deployment; a code sketch of such a run follows these use cases.
Self‑hosting LLM observability in a regulated environment
Deploy Langfuse on a private Kubernetes cluster, keeping all trace data on‑premise to meet compliance requirements.
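For the continuous-evaluation use case above, a run against a Langfuse dataset might look roughly like the following sketch, based on the v2-style Python SDK; the dataset name `release-regression`, the run name, and the `my_model` stub are illustrative assumptions, and exact method names may differ between SDK versions.

```python
# Hedged sketch of an evaluation run against a Langfuse dataset (v2-style SDK).
from langfuse import Langfuse

langfuse = Langfuse()


def my_model(question: str) -> str:
    # Placeholder for the model or pipeline version under test.
    return "stub answer to: " + question


dataset = langfuse.get_dataset("release-regression")

for item in dataset.items:
    # item.observe() links a new trace to this dataset item and run,
    # so results stay comparable across model releases in the UI.
    with item.observe(run_name="model-v2-candidate") as trace_id:
        output = my_model(item.input)
        # Attach a simple exact-match score; LLM-as-judge or human review
        # could be wired in here instead.
        langfuse.score(
            trace_id=trace_id,
            name="exact_match",
            value=float(output == item.expected_output),
        )

langfuse.flush()
```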
Clone the repository and run `docker compose up` for a local instance; for production, use the Helm chart or Terraform templates.
Official SDKs are available for Python and JavaScript/TypeScript.
Yes, integrations include Ollama, Amazon Bedrock, LiteLLM proxy, and any model accessible via the API.
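For example, a locally served Ollama model can be traced through the OpenAI drop-in wrapper in the Python SDK; the sketch below assumes Ollama’s OpenAI-compatible endpoint on localhost and that a `llama3` model has already been pulled.

```python
# Hedged sketch: tracing a local Ollama model via Langfuse's OpenAI drop-in wrapper.
from langfuse.openai import OpenAI  # drop-in replacement for openai.OpenAI

client = OpenAI(
    base_url="http://localhost:11434/v1",  # Ollama's OpenAI-compatible API
    api_key="ollama",                      # Ollama ignores the key, but one is required
)

response = client.chat.completions.create(
    model="llama3",  # assumes this model is available locally
    messages=[{"role": "user", "content": "Summarize what Langfuse does."}],
)
print(response.choices[0].message.content)
# The call, latency, and token usage are captured as a Langfuse generation.
```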
Langfuse Cloud offers a generous free tier with no credit card required.
Data resides in a PostgreSQL database; you control encryption, backups, and network access as part of your infrastructure.
Project at a glance
Active · Last synced 4 days ago