Abstract: Voice recognition is widely used in intelligent human-computer interaction, especially in voice assistants, smart homes, and autonomous driving. Due to the rapid ...
Overview: Open-source Python libraries empower developers to build advanced, customizable voice agents with full transparency. Python libraries like Whisper, Rasa ...
Abstract: This paper reports how speech recognition accuracy can be improved using the speech few-shot in-context learning capabilities of a multimodal foundation model when applied to the speech of ...
A lightweight Node.js application for local voice-to-text transcription with keyboard shortcut activation. Automatically types transcribed text at your cursor position in any application.
In the traditional cascade modeling approach, automatic speech recognition (ASR) first produces a single text string, which is then passed to retrieval. Small transcription errors can change query ...
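The cascade described in the snippet above can be illustrated with a toy sketch. Everything here is a hypothetical stand-in and not taken from the source: `fake_asr` replaces a real recognizer with a lookup table, `DOCS` is an invented two-document corpus, and `retrieve` uses plain token overlap rather than any particular retrieval model. The point is only to show how a single 1-best transcript carries its errors straight into retrieval.

```python
from collections import Counter

# Hypothetical toy corpus (an assumption for illustration only).
DOCS = {
    "d1": "how to bake bread at home",
    "d2": "how to brake safely on a wet road",
}


def fake_asr(audio_id: str) -> str:
    """Stand-in for an ASR system: returns one 1-best transcript string."""
    transcripts = {
        "clean": "how to bake",
        "noisy": "how to brake",  # a one-word recognition error
    }
    return transcripts[audio_id]


def retrieve(query: str) -> str:
    """Return the id of the document with the largest token overlap."""
    query_tokens = Counter(query.split())

    def overlap(doc_id: str) -> int:
        doc_tokens = Counter(DOCS[doc_id].split())
        return sum((query_tokens & doc_tokens).values())

    return max(DOCS, key=overlap)


def cascade(audio_id: str) -> str:
    """Cascade pipeline: ASR produces text first, then retrieval consumes it."""
    return retrieve(fake_asr(audio_id))


if __name__ == "__main__":
    print(cascade("clean"))
    print(cascade("noisy"))
```

In this toy setup the two transcripts differ by one word yet select different documents, because retrieval only ever sees the single string that the ASR stage emitted.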
A privacy-focused, local speech-to-text application that enables system-wide dictation on Linux. Speak into your microphone and have the text appear at your cursor position in any application.