Item: Wispr Flow
Rating: 4.5
Author: Rohit Burani

Why I tried it

The bottleneck in my day was never thinking; it was getting the thought down. Prompts, Slack replies, commit messages, the first messy draft of a post - all of it lived at typing speed, and I type slower than I talk. I had tried dictation before and bounced off it every time, because the raw transcript always needed a second pass to strip the “um”s and add the commas, and that second pass cost back whatever the first pass saved.

Wispr Flow was the first one where the second pass mostly disappeared. You hold a hotkey, talk, let go, and the text that appears in whatever field you were in is already punctuated, de-ummed, and reads like you wrote it on purpose. It sits in the menu bar and works the same in the editor, the browser, and a chat window. After a week it had absorbed most of the writing I do that is not code.

The habit stuck, and the numbers back it up. Wispr’s own dashboard has me at 150 words a minute (it puts that in the top 0.1% of its users), 71,137 words dictated in about seven weeks, and 2,339 cleanups it made along the way - a copy editor riding shotgun that I never have to thank. Of those dictations, 1,406 were prompts fired straight at AI tools, which is to say a large share of how I talk to models now is literally me talking. Those are the app’s own numbers, not a cherry-picked week of mine.
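For scale, the dashboard figures can be turned into a few derived numbers. The sketch below rests on two assumptions the app does not state: that "about seven weeks" means 49 days, and that 150 words a minute is an average over active dictation time.

```python
# Figures quoted from the Wispr dashboard in the paragraph above.
WORDS = 71_137
WORDS_PER_MINUTE = 150
CLEANUPS = 2_339

# Assumption: "about seven weeks" is treated as exactly 49 days.
DAYS = 7 * 7

# Assumption: the 150 wpm figure is an average over time spent speaking.
minutes_speaking = WORDS / WORDS_PER_MINUTE
words_per_day = WORDS / DAYS
words_per_cleanup = WORDS / CLEANUPS

print(f"Time spent speaking: {minutes_speaking / 60:.1f} hours")
print(f"Words dictated per day: {words_per_day:.0f}")
print(f"Words per cleanup: {words_per_cleanup:.0f}")
```

Under those assumptions that is a little under eight hours of actual talking, roughly 1,450 words a day, and one cleanup about every 30 words.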

So I built my own

Here is the part that makes this an engineer’s review and not a coupon. I liked Wispr enough that I wanted to own the workflow instead of renting it, so I forked whisper-key-local by Pin Wang and then kept going. Whisper Local is now more than 40 commits ahead of the upstream I started from, and still moving: a profiles system, per-app rules, an inline voice-command DSL, a transforms-and-dictionary layer, a floating capture overlay, an opt-in whisper.cpp backend, and a stats dashboard, all bent toward one goal - the Wispr feel, running locally. I have been on it daily for weeks.
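To make "transforms-and-dictionary layer" concrete, here is a minimal sketch of what such a pass could look like. It is not the code in Whisper Local: the filler list, the dictionary entries, and every function name are hypothetical and chosen only for illustration.

```python
import re

# Hypothetical filler list and dictionary, not taken from Whisper Local.
FILLERS = ("um", "uh", "erm")
DICTIONARY = {
    "get user by i.d.": "getUserById",
    "whisper local": "Whisper Local",
}

# Matches a standalone filler word, an optional comma or period, and trailing spaces.
_FILLER_RE = re.compile(
    r"\b(?:" + "|".join(re.escape(f) for f in FILLERS) + r")\b[,.]?\s*",
    re.IGNORECASE,
)


def strip_fillers(text: str) -> str:
    """Transform step: drop filler words and collapse leftover whitespace."""
    cleaned = _FILLER_RE.sub("", text)
    return re.sub(r"\s{2,}", " ", cleaned).strip()


def apply_dictionary(text: str, dictionary: dict) -> str:
    """Dictionary step: swap spoken phrases for written forms, longest phrase first."""
    for spoken in sorted(dictionary, key=len, reverse=True):
        written = dictionary[spoken]
        pattern = re.compile(re.escape(spoken), re.IGNORECASE)
        # A function replacement keeps backslashes in `written` literal.
        text = pattern.sub(lambda _match, w=written: w, text)
    return text


def clean_transcript(text: str) -> str:
    """Run the transform step, then the dictionary step."""
    return apply_dictionary(strip_fillers(text), DICTIONARY)


print(clean_transcript("Um, call get user by I.D. and uh log the result"))
```

Running the fillers first keeps the dictionary keys simple, since a phrase never has to account for an "uh" in the middle of it. A fixed dictionary only patches phrases you have already seen misheard, so it narrows the identifier problem without solving it.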

The difference from Wispr is the whole point: it runs OpenAI’s Whisper entirely on your machine through faster-whisper, so the audio never leaves the laptop - not even the metadata. A hotkey records, you release, it pastes. An optional local Ollama pass adds the punctuation-and-capitalization polish that makes dictated text feel finished, and there is a small local API on localhost:7777 if you want to wire it into other tools. It is MIT licensed, runs on Windows and macOS, uses the GPU if you have one, and installs in three lines of pip.
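The optional polish step can be pictured as one request to a local Ollama server. The sketch below is an illustration, not Whisper Local's implementation: it uses Ollama's standard `/api/generate` endpoint on its default port, while the model name, the prompt wording, and the function names are assumptions.

```python
import json
import urllib.request

# Ollama's default local address and its documented generate endpoint.
OLLAMA_URL = "http://localhost:11434/api/generate"
# Assumption: substitute any model you have already pulled locally.
MODEL = "llama3.2"

# Assumption: illustrative prompt, not the one Whisper Local uses.
PROMPT = (
    "Add punctuation and capitalization to the dictated text below. "
    "Do not change, add, or remove words. Return only the corrected text.\n\n"
)


def build_request(raw_text: str) -> urllib.request.Request:
    """Build the POST request that asks Ollama for one non-streamed completion."""
    payload = {"model": MODEL, "prompt": PROMPT + raw_text, "stream": False}
    return urllib.request.Request(
        OLLAMA_URL,
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def polish(raw_text: str, timeout: float = 30.0) -> str:
    """Return the polished text, or the raw transcript if the server is unavailable."""
    try:
        with urllib.request.urlopen(build_request(raw_text), timeout=timeout) as resp:
            body = json.load(resp)
        return body.get("response", "").strip() or raw_text
    except (OSError, ValueError):
        # Connection errors and malformed JSON both fall back to the raw text.
        return raw_text
```

Falling back to the raw transcript matters for a dictation tool: if the polish step is slow or down, pasting unpolished text is still better than pasting nothing.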

Honest comparison after a few weeks side by side: Wispr is still smoother out of the box. It is a real product - the mobile apps and Command Mode have no equivalent in my build, and the default tiny model is less accurate than Wispr’s cloud until you size up to a larger Whisper model. What mine gives back is everything in the “what it’s bad at” list above: no cloud, no screenshots, no subscription, and a codebase you can actually read. If privacy or the $144 a year is what has been stopping you, clone it and go: github.com/drajb/whisper-local.

When I’d skip it

Skip it for two things. First, code - the accuracy on identifiers, flags, and library names is not good enough, and correcting getUserById back from “get user by I.D.” is slower than just typing it. I dictate the comment and the commit message, then type the function.

Second, anything sensitive on screen. Wispr uses your on-screen context, and unless Privacy Mode is on, their policy lets your audio and text feed model training. If you are looking at a terminal with a token in it, or a customer’s data, that is the wrong tool - which is exactly why I built the fully local Whisper Local for that side of my day. Windows’ own Voice Typing (Win+H) is free and built in, but in my use it is unreliable on technical terms and whole phrases, and it only auto-punctuates and strips filler on Copilot+ PCs - so it has never been a real substitute for me.

My setup

I run the Pro tier with Privacy Mode (Zero Data Retention) switched on, which is the only configuration I would recommend for anyone whose screen ever shows credentials. Day to day it is: dictate the first draft of a prompt, dictate Slack and email, dictate the rough shape of a blog section, then go back and type the parts that need precision. The same subscription follows me to the phone, though the dashboard says 97% of my dictation still happens at the desktop. The cleanup is good enough that I have stopped proofreading short messages before I send them. That, more than any benchmark, is the real endorsement.

This pairs with how I think about letting AI handle the mechanical parts of writing while I keep the judgment - dictation is the same trade for prose that an AI pair is for code.

Where it goes next

Both stay, for now, and that is the honest state of it. Wispr is the polished daily driver, my fork is the one I keep tightening, and I will keep publishing the side-by-side numbers as the sample grows instead of declaring a winner early. The day my own build clearly wins on the work I do every day, this entry moves to watching and the writeup of Whisper Local becomes the main event. Until then the choice is simple: rent the polish, or own the workflow. Wispr if you want it handled for you, mine if you would rather it never leave your machine.

This entry includes a referral link. If you sign up through it, gekro may receive free credits or a small benefit at no extra cost to you. Referrals never affect the verdict. Gekro takes no ad revenue.

How it compares

Feature	Wispr Flow	Whisper Local	Windows Voice Typing
Runs offline	No	Yes	No (cloud)
Auto-cleanup	Yes	Yes (optional)	Copilot+ PCs only
Works in every app	Yes	Yes	Yes
Sends audio to cloud	Yes	No	Yes
Price	$15/mo ($12/mo billed annually)	Free (MIT)	Free (built-in)
