Free browser speech-to-text with session history and transcript download
Audio privacy notice
In Chrome/Edge, audio is sent to Google's speech servers (same as voice search). In Safari, it goes to Apple. No audio or text is stored on our servers — we have none. All session data stays in your browser.
Voice Transcriber turns speech to text using your browser’s built-in speech recognition engine. Press R to record, speak, press R again to stop — the transcript is automatically copied to your clipboard. Each segment is saved to a session log you can review, re-copy, or download as a text file. No account, no API key, no audio upload to Gekro servers.
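As a rough illustration of the download step, a session log could be flattened into plain text along these lines. This is a sketch under assumptions, not the tool's actual code: the `LogEntry` shape and the `formatSessionAsText` name are invented for the example.

```typescript
// Hypothetical shape for one saved segment; the real tool may store more or less.
interface LogEntry {
  text: string;
  recordedAt: Date;
}

// Render each segment as a numbered, timestamped block, separated by blank lines.
function formatSessionAsText(entries: LogEntry[]): string {
  const blocks = entries.map(
    (entry, i) => `[${i + 1}] ${entry.recordedAt.toISOString()}\n${entry.text.trim()}`
  );
  // End with a trailing newline only when there is content to write.
  return blocks.length > 0 ? blocks.join("\n\n") + "\n" : "";
}
```

In a browser, the returned string could be wrapped in a `Blob` with type `text/plain` to offer it as a download, or passed to `navigator.clipboard.writeText` for the copy step.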
Session persistence: the current session is saved to localStorage. Refreshing or reopening the tab restores your segment history until you clear it.
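The persistence behaviour described above could look roughly like the sketch below. The storage key, the `Segment` shape, and the function names are all hypothetical. `StorageLike` mirrors the subset of `localStorage` methods used, so the helpers can also be tried outside a browser.

```typescript
// Hypothetical segment shape; recordedAt is an ISO 8601 string so it survives JSON.
interface Segment {
  text: string;
  recordedAt: string;
}

// The subset of the Web Storage interface these helpers rely on.
interface StorageLike {
  getItem(key: string): string | null;
  setItem(key: string, value: string): void;
  removeItem(key: string): void;
}

const SESSION_KEY = "voice-transcriber-session"; // assumed key name, not the tool's

function saveSession(storage: StorageLike, segments: Segment[]): void {
  storage.setItem(SESSION_KEY, JSON.stringify(segments));
}

// Restore the saved segments, falling back to an empty session on bad data.
function loadSession(storage: StorageLike): Segment[] {
  const raw = storage.getItem(SESSION_KEY);
  if (raw === null) return [];
  try {
    const parsed: unknown = JSON.parse(raw);
    if (!Array.isArray(parsed)) return [];
    return parsed.filter(
      (s): s is Segment =>
        typeof s === "object" &&
        s !== null &&
        typeof s.text === "string" &&
        typeof s.recordedAt === "string"
    );
  } catch {
    return []; // corrupted JSON: start fresh rather than crash
  }
}

function clearSession(storage: StorageLike): void {
  storage.removeItem(SESSION_KEY);
}
```

In the browser, `window.localStorage` satisfies `StorageLike`, so `loadSession(window.localStorage)` on page load would restore the history after a refresh.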
The Web Speech API (SpeechRecognition / webkitSpeechRecognition) is a browser standard that abstracts over the underlying speech recognition engine. The API operates in two modes simultaneously:
The recognizer runs continuously while recording is active. When you stop, the final result for the last segment fires and the transcript is assembled from all final results since the last stop event.
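Assembling a transcript from final results can be sketched as follows. The interfaces are minimal stand-ins for the parts of the Web Speech API result list used here (`length`, indexed results with `isFinal`, and a top alternative with `transcript`). The helper name is an assumption, and the tool's own logic may differ.

```typescript
// Minimal structural types mirroring the result list fields used below.
interface AlternativeLike {
  transcript: string;
}
interface ResultLike {
  isFinal: boolean;
  0: AlternativeLike; // top-ranked alternative
}
interface ResultListLike {
  length: number;
  [index: number]: ResultLike;
}

// Join the top alternative of every final result; interim results are skipped.
function assembleFinalTranscript(results: ResultListLike): string {
  const parts: string[] = [];
  for (let i = 0; i < results.length; i++) {
    const result = results[i];
    if (result && result.isFinal) {
      const text = result[0].transcript.trim();
      if (text.length > 0) parts.push(text);
    }
  }
  return parts.join(" ");
}
```

In a browser, the `results` property of the event passed to the recognizer's `onresult` handler has this shape, so it could be handed to the helper directly.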
Where does the actual recognition happen?
This means “no upload to Gekro” is accurate — but audio does go to Google or Apple, depending on your browser. Same infrastructure as voice search. If that’s a concern, Firefox with local recognition is the alternative, or a self-hosted Whisper instance for production use cases.
The core use case is frictionless quick capture. The gap between having a thought and having it in text is where ideas die. Dictating is typically 3–5x faster than typing for prose-length content, and for tasks like drafting a Slack message, writing a note, or capturing a verbal idea during a walk, voice-to-clipboard is meaningfully faster than any keyboard-first workflow.
Why not just use the OS dictation? macOS and Windows have built-in dictation. The difference with a browser-based tool is context and workflow: this tool saves a session log. You can dictate 10 segments across a 30-minute session, then download the full transcript and edit it. OS dictation discards intermediate results. This is the difference between a shorthand pad and a transcription service.
When the Web Speech API is right vs. when it isn’t:
The Web Speech API is right for: quick notes, meeting summaries where someone is manually curating (not automated), personal productivity workflows, low-stakes transcription of clear speech in a quiet environment.
The Web Speech API falls short for: multi-speaker transcription (no diarization), high-accuracy medical/legal transcription, offline-only environments, non-English languages with complex phonetics, audio files (it only accepts live microphone input, not uploaded audio).
For offline or high-accuracy use cases, the alternative is running OpenAI Whisper locally or via API. Whisper handles 99 languages, can produce word-level timestamps, supports audio file transcription, and runs fully offline in its local version. The tradeoff is setup complexity and latency: Whisper processes in batch after recording completes, not in real time. For streaming real-time transcription at scale, Whisper Streaming or AssemblyAI’s real-time API are the production options.
Browser support reality check:
| Browser | Support |
|---|---|
| Chrome / Edge | Full — Google speech recognition |
| Safari | Full — Apple speech recognition |
| Firefox | Limited — may require media.webspeech.recognition.enable flag |
| Mobile Chrome | Supported |
| Mobile Safari | Supported |
Firefox’s Web Speech API support has been behind a flag for years. If your target users are Firefox-first, this tool won’t work reliably for them. Chrome is the primary target.
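Given the uneven support, a page would normally feature-detect before offering a record button. A minimal sketch follows, assuming the constructor is exposed under either the standard or the `webkit`-prefixed name. The `SpeechGlobals` type and the function name are made up for the example.

```typescript
// Constructor type kept loose on purpose: only its presence is checked here.
type RecognitionCtor = new () => unknown;

// The two global names a browser may expose the recognizer under.
interface SpeechGlobals {
  SpeechRecognition?: RecognitionCtor;
  webkitSpeechRecognition?: RecognitionCtor;
}

// Prefer the unprefixed constructor, fall back to the prefixed one, else null.
function getRecognitionCtor(scope: SpeechGlobals): RecognitionCtor | null {
  return scope.SpeechRecognition ?? scope.webkitSpeechRecognition ?? null;
}
```

In a page this might be called as `getRecognitionCtor(window as unknown as SpeechGlobals)`. A `null` result is the cue to show an unsupported-browser message instead of the recorder.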
Download the session as a .txt file and open it in your editor. The session log preserves the temporal ordering of your thoughts, which is useful context for a first edit pass.
Session data is stored in localStorage and cleared when browser storage is cleared. Not synced across devices.
For informational purposes only. Not financial, medical, or legal advice. You are solely responsible for how you use these tools.