
Near‑real‑time speech transcription with flexible AI backends
WhisperLive streams audio to text in near real time, supporting microphone, file, and RTSP/HLS inputs, multiple inference backends, Docker deployment, and optional translation.
WhisperLive streams audio to text with near‑real‑time latency. It works with live microphone input, local audio files, and network streams (RTSP, HLS), automatically handling language detection and optional translation.
The server supports three inference backends—faster_whisper for CPU, TensorRT for NVIDIA GPUs, and OpenVINO for Intel CPUs/GPUs—allowing you to choose the best performance for your hardware. Docker images simplify GPU setup, while native execution is possible with the appropriate drivers. Client configuration lets you control model size, VAD, recording, and translation features.
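As a rough illustration of client-side configuration (a minimal sketch assuming the whisper_live Python package; argument names can differ between releases):

```python
# Minimal client-side configuration sketch. Assumes the whisper_live
# Python package; exact argument names may differ between releases.
from whisper_live.client import TranscriptionClient

client = TranscriptionClient(
    "localhost",               # server host
    9090,                      # server port
    model="small",             # Whisper model size
    lang="en",                 # or None to let the server auto-detect
    use_vad=True,              # voice activity detection
    save_output_recording=True,                  # keep the captured audio
    output_recording_filename="./recording.wav", # where to store it
)

client()  # no argument: stream from the default microphone
```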
You can limit concurrent users with --max_clients and set connection timeouts. The server can instantiate a separate model per client or share a single model to reduce RAM usage. Environment variables like OMP_NUM_THREADS let you fine‑tune CPU threading.
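For scale and tuning, a hedged launch sketch (the run_server.py entry point and the --port/--backend flag spellings are assumptions based on the project's documented options; only --max_clients and OMP_NUM_THREADS come from the notes above):

```python
# Sketch of a server launch with a client cap and a CPU-thread limit.
# The run_server.py entry point and flag spellings are assumptions drawn
# from the project's documented options; check your installed version.
import os
import subprocess

env = dict(os.environ, OMP_NUM_THREADS="4")  # cap CPU threads used for inference

subprocess.run(
    [
        "python3", "run_server.py",
        "--port", "9090",
        "--backend", "faster_whisper",
        "--max_clients", "4",   # reject connections beyond this many clients
    ],
    env=env,
    check=True,
)
```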
Looking for a hosted option? When teams consider WhisperLive, these are the hosted services engineering teams usually benchmark against before choosing open source.
Otter.ai
AI meeting assistant for transcription and automated note-taking
Live meeting captioning
Generate real‑time subtitles for virtual conferences and team calls
Streaming broadcast transcription
Provide captions for RTSP/HLS video streams in live broadcasts
Multilingual podcast translation
Transcribe and translate episodes into English or other target languages
Offline audio archiving
Batch process recorded audio files into searchable text transcripts
TensorRT usually offers the highest throughput on NVIDIA GPUs, OpenVINO is optimized for Intel CPUs/GPUs, and faster_whisper works well on CPUs.
A GPU is not required; CPU inference works via faster_whisper, but GPU backends provide faster results.
Set `enable_translation=True` and specify `target_language` in the client; the server will run a translation thread.
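For example, reusing the client sketch above (only `enable_translation` and `target_language` come from this answer; the other arguments are illustrative):

```python
# Enable server-side translation of the recognised speech.
# enable_translation and target_language come from the answer above;
# the remaining arguments are illustrative and may vary by version.
from whisper_live.client import TranscriptionClient

client = TranscriptionClient(
    "localhost",
    9090,
    model="small",
    enable_translation=True,   # the server starts a translation thread
    target_language="fr",      # language to translate the transcript into
)

client()  # stream from the microphone and receive translated text
```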
Yes, you can run natively after installing the required drivers and runtimes for the chosen backend.
Microphone input, local audio files, RTSP streams, and HLS streams are all supported.
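Roughly, the four input types map onto client calls like this (the `rtsp_url` and `hls_url` keyword names are assumptions to check against your installed version):

```python
# The same client API covers all four input types. The rtsp_url and hls_url
# keyword names are assumptions to verify against your installed version;
# each call below blocks until its stream ends.
from whisper_live.client import TranscriptionClient

client = TranscriptionClient("localhost", 9090, model="small", use_vad=True)

client()                                          # live microphone input
client("meeting_recording.wav")                   # local audio file
client(rtsp_url="rtsp://camera.local/stream")     # RTSP network stream
client(hls_url="https://example.com/live.m3u8")   # HLS network stream
```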
Project at a glance
Active · Last synced 4 days ago