About the product
A voice layer for writing, translation, and app-aware output on macOS.
Voxt is a macOS menu bar voice input and translation app. The product is built around a fast hold-to-speak shortcut, live transcription cleanup, translation, rewriting, and app-specific enhancement that can adapt output to the current workspace.
Workflows
What Voxt does in practice.
The product has a deliberately small surface but covers a broad range of workflows. Each shortcut changes the result type without requiring a different app or a second capture surface.
Transcription
The default transcription path keeps a live preview while you speak, then applies punctuation, filler-word cleanup, app-specific prompts, and personal dictionary rules before paste.
Translation
Translation can run immediately after speech transcription or directly on selected text, with its own model choice and terminology guidance.
Rewrite and prompt
Rewrite mode uses voice as the instruction. It can rewrite selected text, generate fresh text, and keep an answer card visible even when no writable field is focused.
App-specific enhancement
App Branch lets Voxt apply different cleanup rules, prompts, dictionaries, and output preferences based on the current app or URL, so chat, mail, docs, and editors can each receive the right tone.
Model architecture
Local ASR
MLX Audio, Whisper, Direct Dictation
Local LLM
Qwen, GLM, Llama, Mistral, Gemma
Remote ASR
OpenAI, Doubao ASR, GLM ASR, Aliyun Bailian ASR
Remote LLM
Anthropic, Gemini, OpenAI, Ollama, OpenRouter and more
Details
The product behavior is tuned around real desktop writing.
A few product details explain why Voxt is faster than a generic push-to-talk wrapper.
App-specific behavior
App Branch lets different apps or URLs use different enhancement prompts and cleanup rules, so chat, email, coding, and research can each keep their own voice.
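As a rough illustration of how per-app and per-URL rules like these could be resolved, here is a minimal Python sketch. It is not Voxt's implementation: the `Branch` type, the `resolve_branch` function, the example prompts, and the choice to check URL rules before app rules are all assumptions made for the example.

```python
from dataclasses import dataclass
from typing import Optional
from urllib.parse import urlparse


@dataclass(frozen=True)
class Branch:
    """One hypothetical rule: a matcher plus the prompt it selects."""
    name: str
    prompt: str
    bundle_id: Optional[str] = None  # matches the frontmost app, e.g. "com.apple.mail"
    url_host: Optional[str] = None   # matches the active page's host, e.g. "github.com"


# Fallback used when no rule matches (hypothetical wording).
DEFAULT = Branch(name="default", prompt="Clean up punctuation and filler words.")

# Example rules; the names and prompts are invented for illustration.
BRANCHES = [
    Branch("code-review", "Keep identifiers verbatim; be terse.", url_host="github.com"),
    Branch("mail", "Use a polite, complete-sentence tone.", bundle_id="com.apple.mail"),
    Branch("browser", "Use a neutral tone.", bundle_id="com.apple.Safari"),
]


def resolve_branch(bundle_id: str, url: Optional[str] = None) -> Branch:
    """Pick the rule for the current app and, if known, the current URL."""
    host = urlparse(url).hostname if url else None

    # Assumption: URL rules win, so a site can override its browser's rule.
    if host:
        for branch in BRANCHES:
            if branch.url_host and (
                host == branch.url_host or host.endswith("." + branch.url_host)
            ):
                return branch

    # Otherwise fall back to a rule keyed on the frontmost app.
    for branch in BRANCHES:
        if branch.bundle_id == bundle_id:
            return branch

    return DEFAULT
```

With these example rules, dictating in Safari on a github.com page would select the "code-review" prompt, while Safari on any other site would fall back to the "browser" rule.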
Dictionary-aware output
Personal dictionary support can inject exact terms into prompts and auto-correct high-confidence near matches, which is especially useful for names, products, and bilingual jargon.
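One way to picture the near-match correction is a similarity check against the dictionary with a strict cutoff. The sketch below uses Python's standard `difflib` for this; the dictionary contents, the `correct_terms` name, and the 0.8 threshold are assumptions for illustration, and Voxt's actual matching method is not described here.

```python
import difflib
import re

# Hypothetical personal dictionary of exact spellings.
DICTIONARY = ["Voxt", "OpenRouter", "Doubao", "Bailian"]

# Assumed cutoff: only very close matches are rewritten.
THRESHOLD = 0.8


def correct_terms(text, terms=DICTIONARY, threshold=THRESHOLD):
    """Replace words that closely match a dictionary term with its exact spelling."""
    # Compare case-insensitively, but emit the dictionary's own casing.
    by_lower = {term.lower(): term for term in terms}

    def fix(match):
        word = match.group(0)
        hits = difflib.get_close_matches(
            word.lower(), list(by_lower), n=1, cutoff=threshold
        )
        return by_lower[hits[0]] if hits else word

    # Check each alphabetic word on its own; everything else is left untouched.
    return re.sub(r"[A-Za-z]+", fix, text)
```

A high cutoff keeps ordinary words from being rewritten, at the cost of missing looser misrecognitions. A per-word check like this also cannot repair a term that was split into two words during recognition.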
One workflow, multiple engines
Each workflow can stay on the same shortcut-driven surface while still routing transcription, translation, rewriting, and enhancement through different providers when that produces better latency or quality.
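The idea of one surface with several engines behind it can be sketched as a small routing table. This is an assumed structure, not Voxt's configuration format: the `Route` type, the `plan` function, the `local:`/`remote:` labels, and the specific workflow-to-provider pairings are invented for the example, using provider names from the lists above.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass(frozen=True)
class Route:
    asr: str             # engine that turns speech into text
    llm: Optional[str]   # engine for the text step, or None to skip it


# Hypothetical pairings; any workflow could point at any provider.
ROUTES = {
    "transcription": Route(asr="local:whisper", llm="local:qwen"),
    "translation": Route(asr="local:whisper", llm="remote:openai"),
    "rewrite": Route(asr="remote:openai", llm="remote:anthropic"),
}


def plan(workflow: str) -> List[str]:
    """Return the ordered engine steps for a workflow."""
    if workflow not in ROUTES:
        raise ValueError(f"unknown workflow: {workflow}")
    route = ROUTES[workflow]
    steps = [f"asr via {route.asr}"]
    if route.llm is not None:
        steps.append(f"{workflow} via {route.llm}")
    return steps
```

Keeping the routing in data rather than in the capture code is what would let the shortcut surface stay the same while each workflow changes providers independently.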
Main window
The desktop app keeps permissions, models, shortcuts, and workflow settings close to the main control surface.
