MacWhisperer Tutorial

1. Installation

📦 Download the App

Download the latest .dmg file from the GitHub Releases page. Open the DMG and drag MacWhisperer.app to your Applications folder.

🛠 Build from Source (optional)

If you prefer building from source, you'll need Xcode 15+ and CMake installed:

# Install CMake if needed
$ brew install cmake

# Clone the repository
$ git clone https://github.com/stonedMoose/iWhisperer.git
$ cd iWhisperer/Whisperer

# Build whisper.cpp native libraries (one-time)
$ bash scripts/build-whisper.sh

# Build and run
$ swift run MacWhisperer

Tip: MacWhisperer requires macOS 14 Sonoma or later and is optimized for Apple Silicon (M1/M2/M3/M4). It also works on Intel Macs but without Metal GPU acceleration.
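Before installing, you can check the requirements above from a terminal. The following is a minimal sketch, not part of MacWhisperer: the function name `check_requirements` is hypothetical, and it takes the version and architecture as arguments so the logic can be tried on any machine.

```shell
# Report whether a machine meets the stated requirements.
# $1: macOS version string (e.g. "14.2"); $2: CPU architecture ("arm64" or "x86_64").
check_requirements() {
  local version="$1" arch="$2"
  local major="${version%%.*}"   # keep only the major version number

  if [ "$major" -lt 14 ]; then
    echo "unsupported: macOS $version is older than 14 (Sonoma)"
    return 1
  fi

  if [ "$arch" = "arm64" ]; then
    echo "supported: Apple Silicon, Metal GPU acceleration available"
  else
    echo "supported: Intel, no Metal GPU acceleration"
  fi
}

# On a Mac, feed in the real values:
#   check_requirements "$(sw_vers -productVersion)" "$(uname -m)"
```

Note that `uname -m` reports `x86_64` inside a terminal running under Rosetta, even on Apple Silicon.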

2. Initial Setup

On first launch, MacWhisperer opens a setup wizard that walks you through two essential permissions and the initial model download:

🎤 Microphone Permission

MacWhisperer needs access to your microphone to capture audio. macOS will display a system prompt — click Allow. You can verify this later in System Settings > Privacy & Security > Microphone.

♿ Accessibility Permission

This permission allows MacWhisperer to inject transcribed text directly at your cursor position in any application. Without it, the app can still copy text to your clipboard but cannot type it automatically.

1. Open System Settings > Privacy & Security > Accessibility.
2. Click the + button and add MacWhisperer.
3. Toggle it on.

💾 First Model Download

After you grant permissions, MacWhisperer downloads the default Whisper model (Small, ~460 MB). This happens once: the model is cached locally for all future use, and transcription needs no network access afterwards. Switching to a model that is not yet cached triggers another download.

Launch app ➔ Grant permissions ➔ Download model ➔ Ready to use

3. Choosing a Model

MacWhisperer offers five Whisper model sizes. Larger models are more accurate but slower and use more memory. Choose based on your hardware and needs:

Model	Size	Speed	Accuracy	Best For
Tiny	~75 MB	Fastest	Basic	Quick notes, single language
Base	~140 MB	Very fast	Good	Casual dictation
Small ★	~460 MB	Fast	Very good	Daily use (recommended)
Medium	~1.5 GB	Moderate	Excellent	Important transcriptions
Large v3	~3 GB	Slowest	Best	Multi-language, maximum accuracy

Recommendation: Start with Small — it offers the best balance of speed and accuracy for most use cases. Switch to Large v3 if you frequently transcribe multilingual audio or need maximum precision.
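The guidance in this tutorial can be condensed into a small decision helper. This is only an illustrative sketch: `recommend_model` is a hypothetical function, not something MacWhisperer ships, and the Intel rule comes from the troubleshooting advice that Small is the recommended maximum without Metal acceleration.

```shell
# Suggest a model from the guidance in this tutorial.
# $1: CPU architecture ("arm64" or "x86_64"); $2: "yes" if audio is often multilingual.
recommend_model() {
  local arch="$1" multilingual="$2"
  if [ "$arch" != "arm64" ]; then
    echo "small"        # Intel: no Metal acceleration, so stay at Small or below
  elif [ "$multilingual" = "yes" ]; then
    echo "large-v3"     # maximum accuracy for multilingual audio
  else
    echo "small"        # recommended default
  fi
}

# Example: recommend_model "$(uname -m)" yes
```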

To change your model, open Settings from the menu bar icon and select a new model under the Model section of the General column. If the new model is not already cached, it is downloaded automatically.

4. Dictation Mode

This is MacWhisperer's core feature: speak, and text appears at your cursor. It works in any application — text editors, email, chat apps, browsers, search fields, and more.

🎹 How to Dictate

1. Place your cursor where you want text to appear (e.g., an email draft, a document, a search field).
2. Hold your keyboard shortcut (configure it in Settings > Insert at Caret). A purple recording indicator appears on screen.
3. Speak naturally. You don't need to worry about pace or pauses.
4. Release the shortcut. MacWhisperer processes the audio and injects the transcribed text at your cursor position.

Hold shortcut ➔ Speak ➔ Release ➔ Text appears at cursor

⌨ Setting the Keyboard Shortcut

Open Settings from the menu bar icon. Under Insert at Caret, click the recorder field and press the key combination you want (e.g., ⌥ Option + Space or F5). This is a hold-to-record shortcut — recording starts when you press and stops when you release.

Important: Avoid shortcuts already used by macOS or other apps. Common safe choices: ⌥ Option + Space, Fn + F5, or ⌃ Ctrl + ⇧ Shift + R.

5. Streaming Mode

Streaming mode transcribes audio in real time as you speak, instead of waiting for you to finish. Words appear progressively as they are recognized.

📡 How It Works

When streaming is enabled, MacWhisperer processes audio in small chunks (3-second windows) and injects recognized words immediately. A stability algorithm ensures that only confident words are typed — earlier words may be revised as more context arrives.
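This tutorial does not describe MacWhisperer's stability algorithm. One common approach in streaming recognizers, shown here purely as an assumption, is to commit only the leading words on which two consecutive hypotheses agree and to hold back the rest until a later pass confirms them. The function name `stable_prefix` is hypothetical.

```shell
# Print the leading words on which two hypotheses agree.
# $1: previous hypothesis, $2: current hypothesis (words separated by single spaces).
# Illustrative only: not MacWhisperer's actual algorithm.
stable_prefix() {
  local prev="$1" curr="$2" out="" p c
  # Walk both word lists in parallel until they diverge or one runs out.
  while [ -n "$prev" ] && [ -n "$curr" ]; do
    p="${prev%% *}"; c="${curr%% *}"      # first word of each hypothesis
    [ "$p" = "$c" ] || break
    out="${out:+$out }$p"
    # Drop the first word; an empty string means no words are left.
    case "$prev" in *" "*) prev="${prev#* }" ;; *) prev="" ;; esac
    case "$curr" in *" "*) curr="${curr#* }" ;; *) curr="" ;; esac
  done
  echo "$out"
}

stable_prefix "let us start with" "let us start width the"
# prints: let us start
```

In this scheme the disputed word ("with" versus "width") is not typed until a later hypothesis settles it.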

⚙ Enable Streaming

Open Settings and toggle Enable streaming mode under the Streaming section of the Insert at Caret column. Then use the same hold-to-record shortcut. The only difference is that text appears while you're still speaking.

When to use: Streaming is great for live presentations, brainstorming, or when you want immediate visual feedback. For formal documents, the standard (batch) mode is often more accurate since it processes the full audio at once.

6. Meeting Recording

MacWhisperer can record entire meetings and produce a timestamped transcript with automatic speaker diarization — each participant is identified and labeled.

🎬 Start a Meeting Recording

1. Set up a Meeting shortcut in Settings under the Meeting section (e.g., ⌃ Ctrl + ⇧ Shift + M).
2. Press the shortcut to start recording. The menu bar icon indicates active recording and elapsed time.
3. Press the shortcut again to stop recording.
4. MacWhisperer processes the audio, running transcription and speaker diarization.
5. A Markdown transcript is saved to your configured transcript directory.

📄 Transcript Output

Meeting transcripts are saved as Markdown files with the following format:

# Meeting — 2025-03-16 14:30

[00:00] SPEAKER_00: Let's start with the Q4 results...
[00:15] SPEAKER_01: Revenue is up 12% compared to last quarter.
[00:28] SPEAKER_00: Great. What about the mobile launch timeline?
[00:35] SPEAKER_02: We're targeting early April...
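Because each line follows the same `[timestamp] LABEL: text` pattern, transcripts are easy to post-process with standard tools. The sketch below counts lines per speaker. The function name `speaker_counts` and the file name in the usage comment are hypothetical, and the pattern assumes the raw single-word `SPEAKER_NN` labels shown above, not names added by AI refinement.

```shell
# Count how many transcript lines each speaker has.
# Reads transcript text on stdin; prints "LABEL count", sorted by label.
speaker_counts() {
  # Lines look like "[00:15] SPEAKER_01: text"; field 2 is the label plus a colon.
  awk '/^\[[0-9:]+\] [^ ]+: / { sub(/:$/, "", $2); n[$2]++ }
       END { for (s in n) print s, n[s] }' | sort
}

# Usage (hypothetical file name):
#   speaker_counts < "meeting.md"
```

Lines that do not match the pattern, such as the `# Meeting` heading and blank lines, are ignored.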

📁 Transcript Storage

By default, transcripts are saved in your Documents folder. To change this, open Settings > Meeting > Transcript Location, click Choose, and select your preferred directory.

Tip: Point the transcript directory to a cloud-synced folder (iCloud Drive, Dropbox, etc.) to automatically back up all your meeting notes.

7. AI Transcript Refinement

After a meeting recording, MacWhisperer can optionally send the transcript to an LLM to clean it up: identify speakers by name, fix transcription errors, and improve readability.

Privacy note: This is the only feature that sends data outside your machine. It is disabled by default and entirely optional. The core transcription always remains 100% local.

⚙ Configure Refinement

1. Open Settings and scroll to the AI Refinement section.
2. Toggle Refine transcript with AI on.
3. Choose your provider:

OpenAI

Uses GPT-4o by default. Requires an API key from platform.openai.com.

Anthropic

Uses Claude Sonnet by default. Requires an API key from console.anthropic.com.

Claude Code (CLI)

Uses Claude Code installed locally — no API key needed. Requires Claude Code to be installed on your system.
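If you plan to use the Claude Code option, you can confirm from a terminal that the CLI is installed. Claude Code's executable is named `claude`. How MacWhisperer locates it is not covered here, so treat the PATH lookup below as an assumption. The helper name `have_cmd` is hypothetical.

```shell
# Succeeds if the given command is found on PATH.
have_cmd() {
  command -v "$1" >/dev/null 2>&1
}

if have_cmd claude; then
  echo "Claude Code found at: $(command -v claude)"
else
  echo "Claude Code not found on PATH"
fi
```

Apps launched from Finder do not necessarily see the same PATH as your terminal, so a successful check here does not guarantee that the app finds the CLI.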

✍ Custom Prompts

You can customize the refinement prompt in Settings. The default prompt instructs the LLM to identify speakers by name from context clues, fix misattributed speech segments, and clean up transcription errors. Edit it to match your specific needs — for example, you could add participant names upfront.

8. Settings Reference

Access all settings by clicking the MacWhisperer icon in the menu bar and selecting Settings. The settings panel is organized into four columns:

⚙ General

Launch at Login — start MacWhisperer automatically when you log in
App Language — interface language (English, French, Spanish, Chinese, Portuguese, German)
Model — select the Whisper model size (Tiny to Large v3)
Permissions — verify microphone and accessibility status, re-run setup guide

🌐 Languages

Preferred Languages — select which languages you commonly speak. This helps Whisper prioritize detection. Supported: English, French, German, Spanish, Italian, Portuguese, Dutch, Japanese, Chinese, Korean, Russian, Arabic

⌨ Insert at Caret

Keyboard Shortcut — hold-to-record shortcut for dictation
Streaming Mode — toggle real-time transcription on/off

👥 Meeting

Meeting Shortcut — toggle shortcut for starting/stopping meeting recordings
Transcript Location — directory where meeting transcripts are saved
AI Refinement — provider, API key, model, and custom prompt

9. Tips & Troubleshooting

💡 Best Practices

Speak clearly at a natural pace — Whisper handles pauses and hesitations well
Use a good microphone — built-in MacBook mics work, but a dedicated mic improves accuracy significantly
Match model to task — use Small for daily dictation, Large v3 for important or multilingual recordings
Quiet environment — background noise reduces accuracy, especially with smaller models
Set preferred languages — narrowing the language list improves detection speed and accuracy

🔧 Common Issues

Text doesn't appear at my cursor

The Accessibility permission is missing or disabled. Go to System Settings > Privacy & Security > Accessibility and ensure MacWhisperer is toggled on. You may need to remove and re-add the app if you updated it.

Recording doesn't start

Check that the Microphone permission is granted (System Settings > Privacy & Security > Microphone). Also verify your keyboard shortcut isn't conflicting with another app.
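For either permission issue above, you can jump straight to the relevant System Settings pane from a terminal. The sketch below builds the `x-apple.systempreferences` URLs that macOS has accepted for these panes. Apple does not formally document them, so treat their continued availability as an assumption. The function name `privacy_pane_url` is hypothetical.

```shell
# Build the URL that opens a Privacy & Security pane in System Settings.
# $1: "Accessibility" or "Microphone".
privacy_pane_url() {
  case "$1" in
    Accessibility|Microphone)
      echo "x-apple.systempreferences:com.apple.preference.security?Privacy_$1" ;;
    *)
      echo "unknown pane: $1" >&2; return 1 ;;
  esac
}

# On a Mac:
#   open "$(privacy_pane_url Accessibility)"
```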

Transcription is slow

Switch to a smaller model (Tiny or Base). On Intel Macs, Metal GPU acceleration is not available, so the Small model is the recommended maximum.

Wrong language detected

Open Settings > Languages and configure your Preferred Languages. Reducing the list helps Whisper detect your language faster and more accurately.

Model download fails

Models are downloaded from Hugging Face. Make sure you have an internet connection. If the download is interrupted, MacWhisperer retries on next launch. You can also download the GGML model files manually from huggingface.co/ggerganov/whisper.cpp.
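A manual download could look like the sketch below. The file names follow the `ggml-<model>.bin` convention used in the ggerganov/whisper.cpp repository. Which exact files MacWhisperer expects, and where it caches them, is not stated in this tutorial, so the function names and the destination directory are assumptions.

```shell
# Map a whisper.cpp model name to its download URL on Hugging Face.
model_url() {
  case "$1" in
    tiny|base|small|medium|large-v3) ;;
    *) echo "unknown model: $1" >&2; return 1 ;;
  esac
  echo "https://huggingface.co/ggerganov/whisper.cpp/resolve/main/ggml-$1.bin"
}

# Download a model into a directory. "-C -" lets curl resume an interrupted transfer.
# $1: model name; $2: destination directory (your choice, not the app's cache).
download_model() {
  local url
  url="$(model_url "$1")" || return 1
  curl -L --fail -C - -o "$2/ggml-$1.bin" "$url"
}

# Usage:
#   download_model small "$HOME/Downloads"
```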

App doesn't appear in menu bar

MacWhisperer is a menu bar app, so it doesn't have a Dock icon. Look for the MacWhisperer icon in the top-right area of your menu bar. If you have many menu bar icons, it may be hidden behind the notch on a MacBook Pro.
