Speaker Recognition for Mac Meeting Transcription — Vext 1.1.0

Vext 1.0.0 shipped meeting transcription with a simple speaker model: Me for your microphone, Them for system audio. It worked, but it flattened multi-person calls into a single voice on each side.

1.1.0 is the speakers release. Vext now recognizes individual voices, learns them, remembers them across meetings, and lets you color them however you like — all on your Mac, with no biometric data ever leaving the device.

Here's what's new.

Cross-meeting voice recognition

Label a speaker once. The next time that voice appears in a meeting, Vext recognizes it and applies the name automatically.

It's a real voice profile, not a name lookup. When you save a label, Vext quietly remembers what that voice sounds like. The next time you hit record, it listens for the people you've already named and tags them as soon as they speak.

A few things that fall out of this:

It gets sharper over time. Each meeting gives Vext a slightly better fingerprint of each voice, so accuracy quietly improves the more you use it.
It works across mic and system audio. Sarah on Zoom on Monday is still Sarah in the conference room on Friday.
Voices you haven't labeled yet show up as Speaker 1, Speaker 2 — same per-meeting fallback you're used to. Label them when you have a moment, or don't.
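For the curious, here is a rough sketch of how label-once recognition like this can work. It is not Vext's actual code. It assumes each voice has already been turned into a numeric embedding by some speaker model (not shown), and the names `VoiceBook`, `label`, `identify`, and the 0.75 threshold are all made up for illustration. A saved label becomes a running-mean profile, and new voices are matched by cosine similarity.

```python
import math


def cosine(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


class VoiceBook:
    """Hypothetical on-device store of labeled voice profiles."""

    def __init__(self, threshold=0.75):
        # Illustrative threshold; a real system would tune this.
        self.threshold = threshold
        self.profiles = {}  # name -> (mean embedding, sample count)

    def label(self, name, embedding):
        """Save a new profile, or refine an existing one with a running mean."""
        if name not in self.profiles:
            self.profiles[name] = (list(embedding), 1)
            return
        mean, n = self.profiles[name]
        updated = [(m * n + e) / (n + 1) for m, e in zip(mean, embedding)]
        self.profiles[name] = (updated, n + 1)

    def identify(self, embedding):
        """Return the best-matching saved name, or None if nothing is close."""
        best_name, best_score = None, self.threshold
        for name, (mean, _) in self.profiles.items():
            score = cosine(mean, embedding)
            if score >= best_score:
                best_name, best_score = name, score
        return best_name


book = VoiceBook()
book.label("Sarah", [0.9, 0.1, 0.0])       # Monday, on Zoom
print(book.identify([0.8, 0.2, 0.1]))      # a similar voice on Friday
print(book.identify([0.0, 0.1, 0.9]))      # an unknown voice
```

Calling `label` again for the same name folds the new sample into the stored mean, which is one simple way a profile could get sharper with each meeting. A `None` result is where the Speaker 1, Speaker 2 fallback would take over.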

If you've used Otter or Fireflies, this idea is familiar — they do similar voice recognition. The difference is where the voice profile lives. In Vext, it lives next to your transcripts, on your Mac. There's no server-side biometric profile of you or anyone you talk to. Nothing to leak, nothing to subpoena, nothing to opt out of.

Multi-speaker on a single mic

Until now, anything coming from your microphone was tagged Me. That's the right model for solo dictation and remote calls — but it falls apart in the room. Three people huddled around one laptop, an interview, a panel.

1.1.0 separates voices on the microphone the same way it already did for system audio. Two or three people sharing one mic now show up as Me, Me 2, Me 3. Once you rename them, the cross-meeting recognition kicks in for those voices too.

In-person meetings now produce the same labeled transcripts as remote ones.
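As a small illustration of the naming scheme, the fallback labels could be generated like this. This is a sketch, not Vext's code: the function names, the `"mic"` and `"system"` source strings, and the idea of numbering voices by order of first appearance are assumptions.

```python
def fallback_label(source, index):
    """Default label for an unnamed voice.

    source: "mic" or "system" (hypothetical identifiers).
    index:  0-based order in which the voice first appeared on that source.
    """
    if source == "mic":
        return "Me" if index == 0 else f"Me {index + 1}"
    return f"Speaker {index + 1}"


def display_name(saved_name, source, index):
    """Prefer a name the user has saved; otherwise use the fallback."""
    return saved_name if saved_name else fallback_label(source, index)


# Three people around one laptop, one of them already renamed.
print([display_name(None, "mic", 0),
       display_name("Priya", "mic", 1),
       display_name(None, "mic", 2)])
```

Once a voice has a saved name, the fallback never shows up for it again.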

Pick your colors

Each speaker now has a color. Pick it once and it sticks across every transcript that voice appears in.

This sounds cosmetic. It isn't. The transcript view is dense, and consistent color makes it scannable: skim a 30-minute meeting and you can see at a glance who dominated, who interrupted, where action items got assigned. Older meetings from before this update get sensible default colors so nothing looks broken.

A more honest hotkey

Two small fixes that close annoying paper cuts:

Won't trigger while you're typing. If you're holding the dictation key while still finishing a sentence, dictation no longer fires the moment your fingers leave the keyboard.

Screenshot mode arms before it shows. A short pause on the hotkey now means nothing happens at all; only an actual drag opens the overlay. The result: fewer flickers, fewer accidental captures, and mouse-driven screenshots that feel instant.
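The "arm first, show later" idea can be sketched as a tiny state machine. This is an illustration only, not Vext's implementation: the class name, the method names, and the 4-point drag threshold are assumptions. Pressing the hotkey arms the mode, and the overlay opens only once the pointer has moved far enough to count as a drag.

```python
import math


class ScreenshotArming:
    """Hypothetical state machine: the overlay opens only after a real drag."""

    def __init__(self, drag_threshold=4.0):
        # Illustrative distance (in points) before movement counts as a drag.
        self.drag_threshold = drag_threshold
        self.origin = None          # pointer position when the hotkey went down
        self.overlay_open = False

    def hotkey_down(self, x, y):
        """Arm screenshot mode without showing anything yet."""
        self.origin = (x, y)
        self.overlay_open = False

    def pointer_moved(self, x, y):
        """Open the overlay once the pointer has travelled past the threshold."""
        if self.origin is None or self.overlay_open:
            return
        distance = math.hypot(x - self.origin[0], y - self.origin[1])
        if distance >= self.drag_threshold:
            self.overlay_open = True

    def hotkey_up(self):
        """Disarm. Returns True only if a drag actually opened the overlay."""
        captured = self.overlay_open
        self.origin = None
        self.overlay_open = False
        return captured
```

With this shape, pressing and releasing the hotkey without moving returns `False` and never draws anything, which is where the missing flicker comes from.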

Launch at login

Yes, finally. Toggle it in Settings — Vext registers itself with macOS's official login items system, so you can manage or disable it from System Settings the same way you manage any other startup app.

Quality of life

A handful of fixes that you'll feel without noticing:

Timestamps stay aligned. Long meetings used to show a tiny drift between the transcript and the screenshots you captured. Both timelines are now anchored to the same wall clock — no more drift.
Transcripts open faster. Long meeting views load noticeably quicker, especially for recordings with lots of screenshots.
No more phantom phrases. Near-silent or pure-noise audio sometimes used to come back transcribed as random filler ("you know", "uh-huh", "thanks for watching"). 1.1.0 quietly drops those.
Markdown summaries. Meeting summaries now render with proper headings, lists, and bold — instead of one wall of text.
Fewer merged speakers. Two people with similar voices used to occasionally collapse into one cluster. Tighter clustering keeps them separate.
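To make the timestamp fix concrete, here is a minimal sketch of what anchoring both timelines to one wall clock can look like. It is not Vext's code: the function names and the example data are invented, and it assumes every transcript segment and screenshot carries a timezone-aware wall-clock time. Every offset is computed from the same meeting start, so the two streams cannot drift apart.

```python
from datetime import datetime, timedelta, timezone


def offset_in_meeting(event_time, meeting_start):
    """Seconds from the meeting start to an event, both wall-clock datetimes."""
    return (event_time - meeting_start).total_seconds()


def merge_timeline(meeting_start, segments, screenshots):
    """Interleave transcript segments and screenshots on one shared time axis.

    segments:    list of (datetime, text)
    screenshots: list of (datetime, path)
    Returns a list of (offset_seconds, kind, payload) sorted by offset.
    """
    events = [(offset_in_meeting(t, meeting_start), "speech", text)
              for t, text in segments]
    events += [(offset_in_meeting(t, meeting_start), "screenshot", path)
               for t, path in screenshots]
    return sorted(events, key=lambda event: event[0])


# Invented example data.
start = datetime(2024, 1, 1, 10, 0, 0, tzinfo=timezone.utc)
segments = [(start + timedelta(seconds=5), "Let's get started."),
            (start + timedelta(seconds=65), "Here's the roadmap.")]
screenshots = [(start + timedelta(seconds=30), "shot-001.png")]

for offset, kind, payload in merge_timeline(start, segments, screenshots):
    print(f"{offset:6.1f}s  {kind:10s}  {payload}")
```

The alternative, letting each stream count elapsed time with its own clock, is the kind of setup where small differences add up over a long meeting.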

Update

If you have Vext installed:

brew upgrade muvon/tap/vext

Or download Vext 1.1.0 directly. Existing meetings keep their data — older transcripts pick up default speaker colors automatically the first time you open them.

If you label your team once over the next few meetings, Vext will be doing most of the work for you by the end of the week. That's the whole point.

Download Vext 1.1.0