SurfSense

Private AI research agent integrated with your knowledge base

Self-hosted AI research platform that combines NotebookLM-style chat with Perplexity-like search, connected to 15+ external sources including Slack, Notion, GitHub, and Gmail.

Overview

Your Personal AI Research Hub

SurfSense is a customizable AI research agent that answers questions from your own data. Unlike standalone tools, it connects directly to your personal knowledge base, ingesting documents, videos, images, and other content in 50+ file formats, while also pulling from external sources such as Slack, Linear, Jira, Confluence, Gmail, Notion, YouTube, GitHub, and Discord.

Advanced Capabilities

Built on a RAG architecture, SurfSense uses hierarchical indices and hybrid search that combines semantic and full-text retrieval, and it supports 100+ LLMs and 6,000+ embedding models. Every answer carries Perplexity-style citations, so each claim can be traced back to its source. The platform generates a 3-minute podcast from a chat conversation in under 20 seconds and offers a cross-browser extension to capture authenticated web content.
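
One common way to combine a semantic ranking with a full-text ranking is reciprocal rank fusion (RRF). The sketch below assumes that technique purely for illustration. The function name, the constant k=60, and the document IDs are hypothetical and are not taken from SurfSense's code.

```python
from collections import defaultdict


def reciprocal_rank_fusion(rankings, k=60):
    """Merge several ranked lists of document IDs into one ranking.

    Each document earns 1 / (k + rank) from every list it appears in,
    so documents ranked well by both retrievers rise to the top.
    k=60 is a conventional default, assumed here for illustration.
    """
    scores = defaultdict(float)
    for ranking in rankings:
        for rank, doc_id in enumerate(ranking, start=1):
            scores[doc_id] += 1.0 / (k + rank)
    # Highest fused score first; ties are broken by document ID.
    return sorted(scores, key=lambda doc_id: (-scores[doc_id], doc_id))


# Hypothetical results from the two retrievers.
semantic_hits = ["doc_a", "doc_b", "doc_c"]   # e.g. vector similarity
full_text_hits = ["doc_b", "doc_d", "doc_a"]  # e.g. keyword search

fused = reciprocal_rank_fusion([semantic_hits, full_text_hits])
print(fused)
```

Because RRF only looks at rank positions, it does not need the two retrievers to produce comparable scores, which is why it is often used for this kind of fusion.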

Privacy-First Deployment

Designed for self-hosting, SurfSense runs entirely on your infrastructure with full Ollama support for local LLMs. Choose between Docker deployment with pgAdmin or manual installation for granular control. Whether you're a researcher consolidating scattered knowledge, a team centralizing project intelligence, or a privacy-conscious organization, SurfSense delivers cited, conversational answers grounded in your data.

Highlights

Unified search across 50+ file formats and 15+ external sources (Slack, Notion, GitHub, Gmail, Jira)
Advanced RAG with hierarchical indices, hybrid search, and support for 100+ LLMs and 6,000+ embedding models
Fast podcast generation: 3-minute audio in under 20 seconds from chat conversations
Privacy-first self-hosting with Ollama local LLM support and flexible Docker or manual deployment

Pros

  • Integrates personal files with live external sources for comprehensive research
  • Cited answers with transparent, Perplexity-style source attribution
  • Supports local LLMs via Ollama for complete data privacy
  • Extensive file format support (50+ extensions) and multiple ETL service options

Considerations

  • Not yet production-ready; actively under development
  • Requires setup of PGVector database and ETL service configuration
  • External source integrations require individual API keys and authentication
  • Self-hosting demands infrastructure management and technical expertise

Managed products teams compare it with

When teams consider SurfSense, these hosted platforms usually appear on the same shortlist.

Coda

Docs, tables, and apps combined into one collaborative workspace

Craft

Collaborative documents and notes with rich formatting

Document360

Knowledge base software for product docs and self‑service help

Fit guide

Great for

  • Researchers consolidating knowledge from documents, Slack, Notion, and other scattered sources
  • Privacy-conscious teams needing on-premise AI research without cloud data exposure
  • Organizations with authenticated content behind firewalls or login walls
  • Technical users comfortable with Docker or manual deployment seeking customizable RAG pipelines

Not ideal when

  • Non-technical users seeking plug-and-play SaaS solutions without infrastructure setup
  • Production environments requiring battle-tested stability and enterprise SLAs
  • Teams without resources to manage self-hosted databases and LLM infrastructure
  • Users needing immediate deployment without configuring ETL services or API keys

How teams use it

Cross-Platform Project Research

Query Slack threads, Jira tickets, Confluence docs, and GitHub issues simultaneously to surface relevant context for sprint planning or incident response.

Knowledge Base Podcast Creation

Convert lengthy chat conversations or research findings into 3-minute podcasts in under 20 seconds for team updates or asynchronous knowledge sharing.

Authenticated Content Capture

Use the browser extension to save paywalled articles, internal wikis, or login-protected resources into your searchable knowledge base.

Privacy-Compliant Research

Run entirely on local infrastructure with Ollama LLMs to analyze sensitive documents without exposing data to third-party APIs.

Tech snapshot

Python 50%
TypeScript 48%
MDX 2%
CSS 1%
Dockerfile 1%
JavaScript 1%

Tags

ai, extension, aceternity-ui, slack, agents, ollama, rag, chrome-extension, python, perplexity, langchain, nextjs, agent, notion, fastapi, typescript, langgraph, notebooklm

Frequently asked questions

Does SurfSense require cloud APIs or can it run completely offline?

SurfSense supports fully local operation using Ollama for LLMs and Docling for document processing. External source integrations (Slack, Notion, etc.) require internet access, but core chat and search work offline with local models.
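
To make "local models" concrete, here is a minimal sketch of a request to an Ollama server running on the same machine. It is not SurfSense's own code. It assumes Ollama's default local endpoint (port 11434), and the model name "llama3" is only an example of a model you might have pulled.

```python
import json
import urllib.request

# Ollama's default local endpoint; assumed unchanged in this sketch.
OLLAMA_URL = "http://localhost:11434/api/generate"


def build_ollama_request(prompt, model="llama3"):
    """Build a POST request for a single, non-streamed completion.

    The model name is illustrative; use any model available locally.
    """
    payload = {"model": model, "prompt": prompt, "stream": False}
    return urllib.request.Request(
        OLLAMA_URL,
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def ask_local_model(prompt, model="llama3"):
    """Send the prompt to the local server and return the generated text.

    Requires a running Ollama instance; the request never leaves localhost.
    """
    with urllib.request.urlopen(build_ollama_request(prompt, model)) as response:
        return json.loads(response.read())["response"]
```

Calling `ask_local_model("Summarize my notes")` only works while an Ollama server is running locally. The point of the sketch is that the prompt and any document text it contains are sent to localhost rather than to a third-party API.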

What file formats can I upload to my knowledge base?

The answer depends on your ETL service: SurfSense supports 50+ formats via LlamaCloud, 34+ via Unstructured, or a core set via Docling. Supported types include PDFs, Office docs, images, videos, spreadsheets, and email files.

How does SurfSense differ from NotebookLM or Perplexity?

SurfSense combines NotebookLM-style conversational chat with Perplexity-like cited search, but integrates directly with your personal files and 15+ external sources (Slack, GitHub, Notion, etc.) in a self-hosted environment.

Is SurfSense ready for production use?

No, SurfSense is actively under development and not yet production-ready. It's suitable for experimentation, research projects, and teams comfortable with evolving software.

What deployment options are available?

SurfSense offers Docker installation with pgAdmin for easy setup, or manual installation for users needing granular control. Both methods support Windows, macOS, and Linux with detailed guides.

Project at a glance

Status: Active
Stars: 12,525
Watchers: 12,525
Forks: 1,093
License: Apache-2.0
Repo age: 1 year old
Last commit: 2 days ago
Self-hosting: Supported
Primary language: Python