How to Read Messages Aloud and Dictate Questions

Daneel can read any assistant reply aloud and take your questions by voice. You choose between three text-to-speech providers depending on whether you want instant setup, the best voice quality, or full on-device privacy. Dictation uses the browser’s built-in recognizer today, with a local Moonshine option in the pipeline.

For background on privacy trade-offs between the providers, see Speech in Daneel. For a full list of controls, see Speech Reference.

Open Settings > Speech in the widget. The page has two sections, each with a toggle at the top:

  • Text-to-speech — enables the Play buttons on assistant messages and the Auto-read option.
  • Speech recognition — enables the mic button in the composer.

Both default to on. Each section has its own provider picker below.

Hover over any assistant message. A Play button appears in the action row next to Copy and Delete. Click it and the reply is read aloud using the currently selected TTS provider. Click Stop to interrupt, or click Play on a different message to stop the first one cleanly and start the new one.

If you send a new question while a reply is being read, the current playback cancels automatically, so the two never overlap.
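The one-at-a-time playback rule described above can be sketched as a small controller. This is an illustration only, not Daneel's actual code: `PlaybackController`, `SynthLike`, and `UtteranceLike` are hypothetical names, and the sketch leaves out what happens when a message finishes playing on its own.

```typescript
// Minimal shapes for the parts of a speech engine this sketch needs.
interface UtteranceLike {
  text: string;
}

interface SynthLike {
  cancel(): void;
  speak(utterance: UtteranceLike): void;
}

// Hypothetical controller: at most one message plays at a time.
class PlaybackController {
  private synth: SynthLike;
  private currentId: string | null = null;

  constructor(synth: SynthLike) {
    this.synth = synth;
  }

  // Play click: cancel whatever is playing, then start the new message.
  play(messageId: string, text: string): void {
    this.synth.cancel();
    this.currentId = messageId;
    this.synth.speak({ text });
  }

  // Stop click: cancel playback and forget the current message.
  stop(): void {
    this.synth.cancel();
    this.currentId = null;
  }

  // Sending a new question cancels playback the same way Stop does.
  onQuestionSent(): void {
    this.stop();
  }

  // Id of the message most recently started, or null after a stop.
  current(): string | null {
    return this.currentId;
  }
}
```

In a browser, a thin adapter around `window.speechSynthesis` could play the role of `SynthLike`, building a `SpeechSynthesisUtterance` from the text before calling `speak()`. Calling `cancel()` before every `speak()` is what prevents two replies from overlapping.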

In Settings > Speech > Text-to-speech, three provider cards appear:

  • System voices (default) — uses the voices your operating system already has. Starts instantly, nothing to download, no data leaves your machine.
  • Kokoro 82M — a 326 MB local neural TTS model that runs on your GPU. Needs a one-time download. Delivers expressive, natural voices across seven languages.
  • Moonshine — marked “Coming soon”. Placeholder for the upcoming local STT provider; not selectable yet.

Click a card to make that provider active. The next Play click uses it.

Some of the voices your browser exposes, typically the ones named Google UK English Male and similar, stream text to Google servers for a richer prosody. Daneel filters them out by default to keep speech on-device.

To enable them:

  1. Pick System voices as the active TTS provider.
  2. Expand the Advanced accordion under the voice picker.
  3. Flip Allow Google cloud voices.

The voice list refreshes, and cloud voices appear with a (cloud) suffix. They tend to sound more natural, but any text you play with them is sent to Google servers.
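As a rough sketch of how such a filter could work: the Web Speech API exposes a `localService` flag on each voice, which is `false` for voices served remotely. Whether Daneel relies on this flag is an assumption; `VoiceLike` and `listVoices` are hypothetical names used for illustration.

```typescript
// Subset of the browser's SpeechSynthesisVoice fields used by this sketch.
interface VoiceLike {
  name: string;
  lang: string;
  localService: boolean; // false when the voice is served remotely
}

// Returns the labels to show in a voice picker.
// Remote voices are hidden unless allowCloud is true, and tagged "(cloud)" when shown.
function listVoices(voices: VoiceLike[], allowCloud: boolean): string[] {
  return voices
    .filter((voice) => allowCloud || voice.localService)
    .map((voice) => (voice.localService ? voice.name : `${voice.name} (cloud)`));
}
```

In a browser, the input could come from `speechSynthesis.getVoices()`, whose entries carry the same three fields.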

Kokoro is the option to pick when you want TTS to stay fully local.

  1. Settings > Speech > Text-to-speech > click the Download (~326 MB) button on the Kokoro card.
  2. A progress bar shows the model fetching from Hugging Face. First load takes a few minutes on a typical connection; the download is cached in your browser and reused until you remove it.
  3. When complete, a green Downloaded pill replaces the button, and a Remove link appears next to it if you want to free the space later.
  4. Click the Kokoro card to make it the active TTS provider. The voice picker refreshes to show 54 Kokoro voices across US English, British English, Spanish, French, Italian, Hindi, Japanese, and Mandarin.

Click Test next to the voice picker to hear a short sample in the chosen voice before committing.

A mic button sits next to the Send button in every chat composer. Hold it while you speak and release, or click to toggle recording:

  1. Click the mic button. On first use, your browser asks for microphone permission.
  2. Speak your question.
  3. Click the mic again to stop. The transcript lands in the composer input box. It does not auto-send, which gives you a chance to read it first.
  4. Correct anything if needed, then click Send.
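The steps above amount to a toggle plus a draft buffer that is never sent automatically. A minimal sketch of that behavior follows, with hypothetical names (`Dictation`, `RecognizerLike`) that are not taken from Daneel's code.

```typescript
// Minimal recognizer shape; a wrapper around the browser's recognizer could implement it.
interface RecognizerLike {
  start(): void;
  stop(): void;
}

// Hypothetical dictation state: toggles recording and collects transcripts into a draft.
class Dictation {
  private recognizer: RecognizerLike;
  private recording = false;
  private draft = "";

  constructor(recognizer: RecognizerLike) {
    this.recognizer = recognizer;
  }

  // Mic click: start if idle, stop if recording. Returns the new recording state.
  toggle(): boolean {
    if (this.recording) {
      this.recognizer.stop();
    } else {
      this.recognizer.start();
    }
    this.recording = !this.recording;
    return this.recording;
  }

  // Called with a finished transcript. Appends to the draft; nothing is sent.
  acceptTranscript(text: string): void {
    const cleaned = text.trim();
    if (cleaned === "") return;
    this.draft = this.draft === "" ? cleaned : `${this.draft} ${cleaned}`;
  }

  // Text currently sitting in the composer, ready for review before Send.
  composerText(): string {
    return this.draft;
  }
}
```

In Chrome, the recognizer behind `RecognizerLike` would typically be a `webkitSpeechRecognition` instance, with its `result` event handler passing `event.results[i][0].transcript` to `acceptTranscript`. Keeping the send step out of this class mirrors the review-before-send behavior described in step 3.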

You do not need to find the mic button with your cursor. The keyboard shortcut Alt+Space toggles dictation from anywhere on the page, even while you are scrolling through content. Press once to start, once more to stop.

If the shortcut does not work, check chrome://extensions/shortcuts. Chrome occasionally reassigns shortcuts when another extension claims the same keys.

If you would rather not click Play on each message, flip the Auto-read responses toggle under the voice picker. Every new assistant message plays automatically the moment it finishes streaming. Asking a new question interrupts the current playback cleanly.

The Speed slider under the voice picker ranges from 0.5× to 2.0×. The setting applies to whichever provider is active and takes effect on the next Play click.
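For illustration, the slider range could be enforced with a small clamp before the value is handed to the speech engine. The browser's `SpeechSynthesisUtterance.rate` accepts a wider range, so a clamp like this keeps playback within the 0.5× to 2.0× bounds stated above. The names and the fallback to 1.0 for invalid input are assumptions of this sketch.

```typescript
// Slider bounds from the documentation above.
const MIN_SPEED = 0.5;
const MAX_SPEED = 2.0;

// Hypothetical helper: keeps a requested speed inside the slider range.
// Non-finite input (NaN, Infinity) falls back to normal speed, an assumption of this sketch.
function clampSpeed(value: number): number {
  if (!isFinite(value)) return 1.0;
  return Math.min(MAX_SPEED, Math.max(MIN_SPEED, value));
}
```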

All speech settings are live. Change the active provider, pick a different voice, flip Auto-read, adjust speed, and the very next Play uses the new configuration. No reload, no restart.