ten-vad
Voice Activity Detector (VAD): low-latency, high-performance and lightweight
About ten-vad
You are welcome to visit the TEN Hugging Face Space to try VAD and Turn Detection together.
TEN VAD is a real-time voice activity detection system designed for enterprise use, providing accurate frame-level speech activity detection. It shows superior precision compared to both WebRTC VAD and Silero VAD, which are commonly used in the industry. Additionally, TEN VAD offers lower computational complexity and reduced memory usage compared to Silero VAD. Meanwhile, the architecture's temporal efficiency enables rapid voice activity detection, significantly reducing end-to-end response…
The precision-recall curves comparing the performance of WebRTC VAD (pitch-based), Silero VAD, and TEN VAD are shown below. The evaluation is conducted on a manually annotated test set with precise labels. The audio files come from LibriSpeech, GigaSpeech, the DNS Challenge, and other sources. As demonstrated, TEN VAD achieves the best performance. Additionally, cross-validation experiments conducted on large internal real-world datasets demonstrate the reproducibility of these findings. The testset with annotated labels…
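To make the precision-recall comparison concrete, here is a minimal sketch of how one point on such a curve can be computed from frame-level outputs. It assumes each frame has a speech probability and a manual 0/1 label, and that a frame counts as speech when its probability meets the threshold. The function name `precision_recall_points` and the toy values are illustrative assumptions, not part of ten-vad or its evaluation scripts.

```python
def precision_recall_points(probs, labels, thresholds):
    """Return one (threshold, precision, recall) tuple per threshold.

    probs:  per-frame speech probabilities in [0, 1]
    labels: per-frame ground truth, 1 for speech and 0 for non-speech
    """
    points = []
    for t in thresholds:
        tp = fp = fn = 0
        for p, y in zip(probs, labels):
            predicted_speech = p >= t
            if predicted_speech and y:
                tp += 1  # speech frame correctly detected
            elif predicted_speech and not y:
                fp += 1  # non-speech frame flagged as speech
            elif not predicted_speech and y:
                fn += 1  # speech frame missed
        # Conventions chosen for this sketch when a denominator is zero.
        precision = tp / (tp + fp) if (tp + fp) else 1.0
        recall = tp / (tp + fn) if (tp + fn) else 0.0
        points.append((t, precision, recall))
    return points


# Toy, made-up values for illustration only (not real evaluation data).
probs = [0.9, 0.8, 0.3, 0.6, 0.1, 0.7]
labels = [1, 1, 0, 0, 0, 1]

for t, precision, recall in precision_recall_points(probs, labels, [0.25, 0.5, 0.75]):
    print(f"threshold={t:.2f} precision={precision:.2f} recall={recall:.2f}")
```

Sweeping the threshold from low to high trades recall for precision, which is what traces out the curve for each detector.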
ten-vad is an open-source project written primarily in C, with 2.2k stars on GitHub. It was last updated in February 2026.
pip install -U --force-reinstall -v git+https://github.com/TEN-framework/ten-vad.git

ten-vad vs. the alternatives

| Agent | Stars | Language | License | Pricing |
|---|---|---|---|---|
| ten-vad | 2.2k | C | — | Open source |
| xiaozhi-esp32-server | 10.0k | JavaScript | MIT | Open source |
| bailing | 1.7k | Python | MIT | Open source |
| RCLI | 1.5k | C++ | MIT | Open source |
| CyberVerse | 1.4k | Python | GPL-3.0 | Open source |
| Patter | 919 | Python | MIT | Open source |
