Farfalle

Self-hosted AI search engine with local and cloud LLMs

Farfalle combines multiple search APIs with local LLMs (via Ollama) or cloud models (OpenAI, Groq, or any endpoint through LiteLLM) to deliver AI-enhanced answers, all deployable via Docker and Vercel.

Overview

Farfalle is a self‑hosted AI‑powered search platform that merges web search providers (SearXNG, Tavily, Serper, Bing) with large language models. Users can run local models through Ollama (llama3, gemma, mistral, phi3) or tap cloud services such as OpenAI GPT‑4o, Groq Llama‑3, and any custom endpoint via LiteLLM. The system answers queries with a blend of live web results and LLM reasoning, offering a richer, more contextual response than standard search.

Audience & Deployment

Designed for developers, teams, and privacy‑conscious users who need control over data and model costs. The stack—Next.js frontend, FastAPI backend, Redis rate limiting, and Logfire logging—runs in Docker containers, with a pre‑built image for quick start. After launching the backend, the frontend can be deployed on Vercel by pointing it to the API URL. Optional API keys enable additional search sources, but the core functionality works entirely offline with local models.

Highlights

Multi‑search provider integration (SearXNG, Tavily, Serper, Bing)
Support for local LLMs via Ollama and cloud models (OpenAI, Groq)
Custom LLM routing through LiteLLM
Agent‑driven search planning for higher relevance

Pros

  • Runs locally for privacy‑focused users
  • Flexible model selection (local, cloud, custom)
  • Dockerized deployment simplifies setup
  • Rich UI built with Next.js and shadcn/ui

Considerations

  • Requires Docker and Ollama for local models
  • Performance depends on chosen LLM and API limits
  • File‑based chat feature is still on the roadmap
  • Configuration of multiple API keys can be complex

Managed products teams compare it with

When teams consider Farfalle, these hosted platforms usually appear on the same shortlist.

ChatGPT

AI conversational assistant for answering questions, writing, and coding help

Claude

AI conversational assistant for reasoning, writing, and coding

Manus

General purpose AI agent for automating complex tasks

Looking for a hosted option? These are the services engineering teams benchmark against before choosing open source.

Fit guide

Great for

  • Developers wanting a self‑hosted AI search platform
  • Teams needing control over model privacy and cost
  • Projects that require integration with multiple search APIs
  • Users who prefer a customizable UI built with Next.js

Not ideal when

  • Non‑technical users uncomfortable with Docker setup
  • Environments without internet access to fetch models
  • Scenarios demanding instant zero‑config deployment
  • Use cases needing built‑in document ingestion (not yet released)

How teams use it

Research assistant for developers

Provides concise answers sourced from web and LLM reasoning, accelerating code exploration.

Enterprise knowledge base search

Combines internal search APIs with private LLMs to answer employee queries while keeping data on‑prem.

Educational Q&A portal

Students ask questions and receive AI‑generated explanations backed by live web results.

Prototype AI chatbot

Leverages LiteLLM to route requests to a preferred cloud provider, enabling rapid experimentation.
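The routing idea can be sketched as a small lookup that resolves an experiment tier to a provider-prefixed model string in LiteLLM's `provider/model` naming convention. The tier names and model identifiers below are illustrative, not part of Farfalle; check each provider's catalog for current names.

```python
# Map experiment tiers to LiteLLM "provider/model" strings.
# These identifiers are examples only and may change over time.
ROUTES = {
    "fast": "groq/llama3-70b-8192",
    "quality": "openai/gpt-4o",
    "local": "ollama/llama3",
}


def pick_model(tier: str) -> str:
    """Resolve an experiment tier to a LiteLLM model string."""
    if tier not in ROUTES:
        raise ValueError(f"unknown tier: {tier}")
    return ROUTES[tier]


# With litellm installed and the relevant API key set, the resolved
# name is passed straight through to the completion call:
#   import litellm
#   reply = litellm.completion(
#       model=pick_model("fast"),
#       messages=[{"role": "user", "content": "Summarize this page."}],
#   )
```

Swapping providers then means editing one dictionary entry rather than touching call sites.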

Tech snapshot

TypeScript 69%
Python 28%
Dockerfile 2%
CSS 1%
JavaScript 1%
Mako 1%

Tags

search-engine, generative-ui, llm, react, shadcn-ui, ollama, groq, perplexity, nextjs, gpt-4o, fastapi, tailwindcss, openai

Frequently asked questions

Do I need an OpenAI API key to run Farfalle?

No, you can operate entirely with local models via Ollama; API keys are only required for cloud providers.
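As a minimal sketch of the offline path, assuming an Ollama daemon on its default port (`localhost:11434`) with a model already pulled (e.g. `ollama pull llama3`), a client can call Ollama's `/api/chat` endpoint directly:

```python
import json
import urllib.request

OLLAMA_URL = "http://localhost:11434/api/chat"  # Ollama's default local endpoint


def build_chat_payload(model: str, prompt: str) -> dict:
    """Assemble a request body for Ollama's /api/chat endpoint."""
    return {
        "model": model,
        "messages": [{"role": "user", "content": prompt}],
        "stream": False,  # ask for a single JSON response, not a stream
    }


def ask_ollama(model: str, prompt: str) -> str:
    """Send one chat turn to a locally running Ollama daemon."""
    data = json.dumps(build_chat_payload(model, prompt)).encode()
    req = urllib.request.Request(
        OLLAMA_URL, data=data, headers={"Content-Type": "application/json"}
    )
    with urllib.request.urlopen(req) as resp:
        return json.load(resp)["message"]["content"]
```

No API key appears anywhere in this path, which is what keeps the local setup self-contained.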

Which Docker command starts the application?

Run `docker-compose -f docker-compose.dev.yaml up -d` after configuring the .env file.
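A minimal setup sketch; the variable names below are illustrative, so consult the repository's env template for the authoritative keys (cloud keys are optional when running local models through Ollama):

```shell
# .env — illustrative keys only; cloud keys may be left unset for local use
OPENAI_API_KEY=sk-...
TAVILY_API_KEY=tvly-...

# Then bring the stack up in the background:
docker-compose -f docker-compose.dev.yaml up -d
```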

Can I add my own search provider?

Farfalle is built to integrate additional providers; you can extend the FastAPI backend to call custom APIs.
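A rough sketch of what such an extension might look like, using hypothetical class and field names (Farfalle's actual backend interfaces may differ): a provider wraps an external JSON search API and maps its responses onto a common result shape.

```python
import json
import urllib.parse
import urllib.request
from dataclasses import dataclass


@dataclass
class SearchResult:
    # Hypothetical result shape; Farfalle's real schema may differ.
    title: str
    url: str
    snippet: str


def parse_results(payload: dict) -> list[SearchResult]:
    """Map a provider's raw JSON payload onto the common result shape."""
    return [
        SearchResult(r["title"], r["url"], r.get("snippet", ""))
        for r in payload.get("results", [])
    ]


class CustomSearchProvider:
    """Illustrative wrapper around a hypothetical JSON search API."""

    def __init__(self, base_url: str, api_key: str):
        self.base_url = base_url
        self.api_key = api_key

    def search(self, query: str, limit: int = 5) -> list[SearchResult]:
        qs = urllib.parse.urlencode({"q": query, "limit": limit})
        req = urllib.request.Request(
            f"{self.base_url}/search?{qs}",
            headers={"Authorization": f"Bearer {self.api_key}"},
        )
        with urllib.request.urlopen(req) as resp:
            return parse_results(json.load(resp))
```

The key design point is the normalization step: once results share one shape, the rest of the pipeline does not care which provider produced them.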

Is there a pre‑built Docker image?

Yes, a ready‑to‑use image is published as part of the repository’s releases.

What license governs Farfalle?

The project is released under the Apache‑2.0 license.

Project at a glance

Dormant
Stars
3,499
Watchers
3,499
Forks
323
License
Apache-2.0
Repo age
1 year old
Last commit
last year
Self-hosting
Supported
Primary language
TypeScript