RapidSpeech
ASR & TTS in the Browser — WebAssembly + WebGPU
Checking WebGPU support…
ASR Offline
ASR Online
TTS Offline
Model
Model URL
Threads
Use LLM
Load Model
Enter a model URL and click Load.
VAD (optional — silero-vad or firered-vad)
VAD model URL
Threshold
0.50
Min seg (s)
Load VAD
No VAD loaded — full clip will be transcribed.
Input
Source
Upload WAV
Record from mic
Audio file
Recording
● Record
00:00.0
Transcribe
Re-decode (LLM)
Transcript
Model
Model URL
Threads
Use LLM
Two-pass
Load Model
Enter a model URL and click Load.
VAD (neural — silero-vad or firered-vad; falls back to energy gate)
VAD model URL
Load VAD
Energy gate (default).
Speech threshold
0.500
Silence frames (energy mode only)
15
Microphone
Start
Stop
Clear
silence
Transcript
Model
Model URL
Threads
Load Model
Enter a model URL and click Load.
Generation params
Instruct
male
female
child
Language
English
Chinese
Seed
Diffusion steps
32
Voice cloning (optional)
Reference WAV
Reference text
Synthesize
Text
Generate
Clear
⬇ Download WAV