
How to Reclaim Disk Space from Local Models

Local models used by Daneel, whether for chat, embedding, or knowledge extraction, are downloaded once and cached in your browser so they stay available offline. A single chat model can run from a few hundred megabytes up to several gigabytes. The Models Storage settings panel shows exactly what is on disk and lets you reclaim space at any time.

  1. Open Settings > Models Storage.
  2. The summary card at the top shows the total disk used by all cached model artifacts, the number of models involved, and a browser-level “used of available” line for context.
  3. Below the summary, models are grouped into sections by role:
    • Language models: the WebGPU LLMs used in chat.
    • Embedding models: the sentence embedders used for site and vault indexing.
    • Knowledge extraction models: the GLiNER and LFM2-Extract variants used to build knowledge graphs.
    • Other: any cached artifact that does not match a current catalog entry (see The “Other” section below).

Each section shows its own subtotal, model count, and file count. Within a section, models are sorted largest first. Each row has a progress bar showing its share of the overall total.
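The section layout described above (per-role subtotals, largest-first sorting, each row's share of the overall total) can be sketched as a small pure function. All names here are illustrative assumptions, not Daneel's actual code:

```javascript
// Sketch: group cached models by role, subtotal each section, and sort
// rows largest first with their share of the grand total.
// Field names (role, bytes, files) are assumptions for illustration.
function groupModels(models) {
  const grandTotal = models.reduce((sum, m) => sum + m.bytes, 0);
  const sections = new Map();
  for (const m of models) {
    if (!sections.has(m.role)) sections.set(m.role, []);
    sections.get(m.role).push(m);
  }
  return [...sections.entries()].map(([role, items]) => ({
    role,
    subtotal: items.reduce((s, m) => s + m.bytes, 0),
    modelCount: items.length,
    fileCount: items.reduce((s, m) => s + m.files, 0),
    // Largest first; `share` drives each row's progress bar.
    rows: items
      .slice()
      .sort((a, b) => b.bytes - a.bytes)
      .map((m) => ({ ...m, share: grandTotal ? m.bytes / grandTotal : 0 })),
  }));
}
```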

To delete a single model:

  1. Click the trash icon on the row of the model you want to remove.
  2. An inline confirmation appears showing the exact amount of disk space that will be freed.
  3. Click Delete to confirm, or Cancel to dismiss.

The cached files are removed immediately. The model will re-download from HuggingFace the next time you pick it, so do this when you are comfortable with that cost.
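A per-model delete against the browser Cache API might look like the following sketch. The URL-matching rule (the model id appears as a path segment in the cached request URL, as in Hugging Face `resolve` URLs) is my assumption, not a documented Daneel invariant:

```javascript
// Assumption: cached request URLs embed the model id as a path segment,
// e.g. https://huggingface.co/<modelId>/resolve/main/onnx/model.onnx
function belongsToModel(url, modelId) {
  return new URL(url).pathname.includes(`/${modelId}/`);
}

// Browser-only sketch: remove every cached entry for one model and
// report how many bytes were reclaimed (the number the confirmation shows).
async function deleteModel(modelId, cacheName = 'transformers-cache') {
  const cache = await caches.open(cacheName);
  let freed = 0;
  for (const req of await cache.keys()) {
    if (belongsToModel(req.url, modelId)) {
      const res = await cache.match(req);
      if (res) freed += (await res.blob()).size;
      await cache.delete(req);
    }
  }
  return freed;
}
```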

To delete every cached model at once:

  1. Click Delete all in the summary card.
  2. An inline confirmation appears showing the total that will be freed and the number of models affected.
  3. Click Delete all again to confirm.

This wipes every cached model artifact in one step. Your provider selection, embedding-model choice, and knowledge-extraction model preference are not touched, so picking a provider again will simply trigger a fresh download.
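The wholesale version can be sketched as dropping both cache stores outright, with a small helper for the human-readable size in the confirmation. Function names are illustrative, and the store names are the two listed later in this article:

```javascript
// Browser-only sketch: "Delete all" drops both cache stores wholesale.
async function deleteAllModelCaches() {
  const stores = ['transformers-cache', 'daneel-ner-models'];
  const results = await Promise.all(stores.map((name) => caches.delete(name)));
  return results.some(Boolean); // true if anything was actually removed
}

// Pure helper for the confirmation line ("X.Y GB will be freed").
function formatBytes(bytes) {
  const units = ['B', 'KB', 'MB', 'GB'];
  let i = 0;
  let v = bytes;
  while (v >= 1024 && i < units.length - 1) {
    v /= 1024;
    i += 1;
  }
  return `${v.toFixed(1)} ${units[i]}`;
}
```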

You can delete a model that is currently selected or even actively running. The model’s weights stay in GPU memory until the next cold start, so your current chat, ongoing ingestion, or running knowledge-graph build continues uninterrupted. The deletion only affects the next time the model has to load from scratch.

No reload, no restart, no re-activation is needed.

When a model is renamed in the catalog or removed in a later update, any files you previously downloaded for it stay on disk until you delete them. The Other section groups these orphan artifacts so you can reclaim the space without waiting on a catalog change.

The section only appears when orphan artifacts are present. A generic graph icon is used since no catalog label is available.
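Orphan detection can be sketched as a set difference between cached URLs and the current catalog. The matching rule below is an assumption for illustration, not Daneel's actual implementation:

```javascript
// Sketch: a cached URL is an orphan when it matches no model id in the
// current catalog. Path-segment matching is an illustrative assumption.
function findOrphanUrls(cachedUrls, catalogIds) {
  return cachedUrls.filter(
    (url) =>
      !catalogIds.some((id) => new URL(url).pathname.includes(`/${id}/`))
  );
}
```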

Cached artifacts live in two separate browser cache stores:

  • transformers-cache, populated by the transformers.js runtime for ONNX shards, tokenizers, and configuration JSON.
  • daneel-ner-models, populated by the NER worker for GLiNER ONNX binaries, since those are downloaded outside the transformers.js pipeline.

The panel queries both stores, so the totals you see account for all downloaded artifacts in one place. Everything stays on your machine. Nothing about your model footprint ever leaves the browser.
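Under stated assumptions (the two store names above, plus standard Cache API and `navigator.storage.estimate()` calls), measuring the footprint might look like this sketch; `tallyByStore` is a hypothetical helper:

```javascript
// Pure helper: fold per-file sizes into a grand total and per-store subtotals.
function tallyByStore(files) {
  const byStore = {};
  let total = 0;
  for (const { store, bytes } of files) {
    byStore[store] = (byStore[store] || 0) + bytes;
    total += bytes;
  }
  return { total, byStore };
}

// Browser-only sketch: walk both stores, size each cached response, and add
// the browser-level "used of available" numbers from storage.estimate().
async function measureModelStorage() {
  const files = [];
  for (const store of ['transformers-cache', 'daneel-ner-models']) {
    const cache = await caches.open(store);
    for (const req of await cache.keys()) {
      const res = await cache.match(req);
      if (res) files.push({ store, bytes: (await res.blob()).size });
    }
  }
  const { usage = 0, quota = 0 } = await navigator.storage.estimate();
  return { ...tallyByStore(files), fileCount: files.length, usage, quota };
}
```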

For a broader overview of what Daneel stores and where, see Storage and Limits.

Related guides:

  • How Providers Work to understand which providers download models locally and which do not.
  • Use Daneel Offline to confirm that the models you keep are the ones you actually need without a network.
  • Switch Embedding Models if you want a different embedding model, which will leave the old cache behind until you delete it from this panel.