
Daneel can read any assistant reply aloud and take your questions by voice. You choose between three text-to-speech providers depending on whether you want instant setup, the best voice quality, or full on-device privacy. Dictation uses the browser's built-in recognizer today, with a local Moonshine option in the pipeline.

For background on privacy trade-offs between the providers, see [Speech in Daneel](/concepts/speech/). For a full list of controls, see [Speech Reference](/reference/speech/).

## Enable speech

Open **Settings > Speech** in the widget. Each of the two sections has a toggle at the top:

- **Text-to-speech** — enables the Play buttons on assistant messages and the Auto-read option.
- **Speech recognition** — enables the mic button in the composer.

Both default to on. Each section has its own provider picker below.

## Read a reply aloud

Hover over any assistant message. A **Play** button appears in the action row next to **Copy** and **Delete**. Click it and the assistant's reply reads aloud using the currently selected TTS provider. Click **Stop** to interrupt, or click Play on a different message to cut the first one cleanly and start the new one.

Send a new question while a reply is reading? The current playback cancels automatically. No overlap.
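The "one reply at a time" behavior can be pictured as a small controller that always cancels before it speaks. The sketch below is illustrative only and is not Daneel's actual code: `SpeechEngine` and `PlaybackController` are hypothetical names, and the engine is a stand-in for whichever TTS provider is active.

```typescript
// Minimal stand-in for the active TTS provider. In a browser,
// window.speechSynthesis offers cancel() and speak(utterance) with a similar role.
interface SpeechEngine {
  speak(text: string): void;
  cancel(): void;
}

// Hypothetical controller: tracks which message is playing so that
// at most one reply is being read at any time.
class PlaybackController {
  private playingId: string | null = null;
  private engine: SpeechEngine;

  constructor(engine: SpeechEngine) {
    this.engine = engine;
  }

  // Play clicked on a message: cut the current reply, then start the new one.
  play(messageId: string, text: string): void {
    this.stop();
    this.playingId = messageId;
    this.engine.speak(text);
  }

  // Stop clicked, or the user sent a new question.
  // This sketch does not track natural end of playback.
  stop(): void {
    if (this.playingId !== null) {
      this.engine.cancel();
      this.playingId = null;
    }
  }

  get current(): string | null {
    return this.playingId;
  }
}
```

Routing both the Stop button and the "new question sent" event through the same `stop()` method is one way to guarantee that playback never overlaps.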

## Pick a TTS provider

In **Settings > Speech > Text-to-speech**, three provider cards appear:

- **System voices** (default) — uses the voices your operating system already has. Starts instantly, nothing to download, no data leaves your machine.
- **Kokoro 82M** — a 326 MB local neural TTS model that runs on your GPU. Needs a one-time download. Delivers expressive, natural voices across seven languages.
- **Moonshine** — marked "Coming soon". Placeholder for the upcoming local STT provider; not selectable yet.

Click a card to make that provider active. The next Play click uses it.

### Opt into Google Cloud voices

Some of the voices your browser exposes, typically those with names like `Google UK English Male`, stream the text to Google servers in exchange for richer prosody. Daneel filters them out by default to keep speech on-device.

To enable them:

1. Pick **System voices** as the active TTS provider.
2. Expand the **Advanced** accordion under the voice picker.
3. Flip **Allow Google cloud voices**.

The voice list refreshes. Cloud voices appear with a `(cloud)` suffix. They sound noticeably richer, and the suffix makes the trade-off explicit: the text being read is sent to Google.
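As a rough illustration of the filtering described above, the browser's `SpeechSynthesisVoice` objects carry a `localService` flag that is `false` for remotely synthesized voices. Whether Daneel relies on that flag is an assumption; `VoiceInfo`, `VoiceOption`, and `listVoiceOptions` are hypothetical names for this sketch.

```typescript
// Subset of the browser's SpeechSynthesisVoice fields used in this sketch.
interface VoiceInfo {
  name: string;
  lang: string;
  localService: boolean; // false when a remote service synthesizes the voice
}

interface VoiceOption {
  label: string;
  voice: VoiceInfo;
}

// Hide cloud voices unless the user opted in, and label the ones that are shown.
function listVoiceOptions(voices: VoiceInfo[], allowCloud: boolean): VoiceOption[] {
  return voices
    .filter((v) => v.localService || allowCloud)
    .map((v) => ({
      label: v.localService ? v.name : `${v.name} (cloud)`,
      voice: v,
    }));
}
```

In a browser, the input list would typically come from `speechSynthesis.getVoices()`.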

:::note
When [Offline Mode](/how-to/offline/) is active, cloud voices still play even though they send text to Google, because the speech synthesis API is not part of the network gate. Speech recognition, however, is blocked in Offline Mode because it sends data off the machine.
:::

## Download and use Kokoro

Kokoro is the option to pick when you want neural-quality TTS that stays fully local.

1. Open **Settings > Speech > Text-to-speech** and click the **Download (~326 MB)** button on the Kokoro card.
2. A progress bar shows the model downloading from Hugging Face. The first download takes a few minutes on a typical connection; the model is then cached in your browser and reused until you remove it.
3. When complete, a green **Downloaded** pill replaces the button, and a **Remove** link appears next to it if you want to free the space later.
4. Click the Kokoro card to make it the active TTS provider. The voice picker refreshes to show 54 Kokoro voices across seven languages: English (US and British), Spanish, French, Italian, Hindi, Japanese, and Mandarin.

Click **Test** next to the voice picker to hear a short sample in the chosen voice before committing.

:::note
Kokoro runs on WebGPU, which most GPUs released since 2018 support. Without WebGPU, Kokoro cannot run; stick to System voices in that case.
:::
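To check for WebGPU yourself, a page can test for `navigator.gpu` and ask it for an adapter. The sketch below shows that check with structural stand-in types so it runs outside a browser. `hasWebGpu` and `resolveProvider` are hypothetical helpers, and the fallback rule is an assumption rather than Daneel's documented behavior.

```typescript
// Structural stand-ins so the sketch type-checks without WebGPU typings.
interface GpuLike {
  requestAdapter(): Promise<object | null>;
}
interface NavigatorLike {
  gpu?: GpuLike;
}

// Resolves to true only when the WebGPU entry point exists and grants an adapter.
async function hasWebGpu(nav: NavigatorLike): Promise<boolean> {
  if (!nav.gpu) return false;
  try {
    const adapter = await nav.gpu.requestAdapter();
    return adapter !== null;
  } catch {
    return false; // treat any failure as "not available"
  }
}

type TtsProvider = "system" | "kokoro";

// Hypothetical fallback rule: keep Kokoro only when WebGPU is usable.
function resolveProvider(requested: TtsProvider, webGpuOk: boolean): TtsProvider {
  return requested === "kokoro" && !webGpuOk ? "system" : requested;
}
```

In a browser you would pass the real `navigator` object to `hasWebGpu`.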

## Dictate a question

A mic button sits next to the Send button in every chat composer. You can hold it while you speak and release it to finish, or click it to toggle recording. The click flow looks like this:

1. Click the mic button. On first use, your browser asks for microphone permission.
2. Speak your question.
3. Click the mic again to stop. The transcript lands in the composer input box. It does **not** auto-send, which gives you a chance to read it first.
4. Correct anything if needed, then click **Send**.
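The steps above amount to a two-state toggle plus a rule that transcripts only ever land in the draft. Here is a minimal sketch of that logic, assuming a recognizer with `start()` and `stop()` methods (the browser's `SpeechRecognition` object has both). `DictationSession` and its members are hypothetical names.

```typescript
// Minimal stand-in for a speech recognizer.
interface Recognizer {
  start(): void;
  stop(): void;
}

// Hypothetical composer state: transcripts are written to `draft` and never sent.
class DictationSession {
  draft = "";
  private recording = false;
  private recognizer: Recognizer;

  constructor(recognizer: Recognizer) {
    this.recognizer = recognizer;
  }

  // Mic click: start on the first click, stop on the second.
  // Returns true while recording is active.
  toggle(): boolean {
    if (this.recording) {
      this.recognizer.stop();
    } else {
      this.recognizer.start();
    }
    this.recording = !this.recording;
    return this.recording;
  }

  // Called with a final transcript: append it to the draft for review.
  acceptTranscript(text: string): void {
    const cleaned = text.trim();
    if (cleaned === "") return;
    this.draft = this.draft === "" ? cleaned : `${this.draft} ${cleaned}`;
  }
}
```

Keeping the send action out of this class entirely is what leaves the final decision with the user.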

### Alt+Space from anywhere

You do not need to find the mic button with your cursor. The keyboard shortcut **Alt+Space** toggles dictation from anywhere on the page, even while you are scrolling through content. Press once to start, once more to stop.

If the shortcut does not work, check `chrome://extensions/shortcuts`. Another extension may have claimed the same keys, in which case you can reassign the shortcut there.

## Auto-read every reply

If you would rather not click Play on each message, flip the **Auto-read responses** toggle under the voice picker. Every new assistant message plays automatically the moment it finishes streaming. Asking a new question interrupts the current playback cleanly.

## Change the speaking speed

The **Speed** slider under the voice picker ranges from 0.5× to 2.0×. The setting applies to whichever provider is active and takes effect on the next Play click.
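For context, the Web Speech API's `SpeechSynthesisUtterance.rate` accepts values from 0.1 to 10, so the slider exposes a narrower, safer band. A provider-agnostic clamp might look like the sketch below; `clampSpeed` and the constants are hypothetical names, and falling back to 1.0 for invalid input is an assumption.

```typescript
const MIN_SPEED = 0.5;
const MAX_SPEED = 2.0;
const DEFAULT_SPEED = 1.0; // assumed fallback for invalid input

// Keep a requested playback speed inside the slider's range.
function clampSpeed(value: number): number {
  if (!Number.isFinite(value)) return DEFAULT_SPEED;
  return Math.min(MAX_SPEED, Math.max(MIN_SPEED, value));
}
```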

## Switch providers mid-session

All speech settings are live. Change the active provider, pick a different voice, flip Auto-read, adjust speed, and the very next Play uses the new configuration. No reload, no restart.
