# Daneel AI — Full Documentation
> Privacy-first Chrome extension for AI-powered browsing and document analysis.
> This file contains all documentation pages for LLM consumption.
Built: 2026-04-27
Version: latest
Pages: 50
---
# Daneel AI
**Chat with any website or your own documents, 100% in your browser.**
> Source: https://doc.daneel.injen.io/index.md
---
# Background Tasks
**How Daneel runs site crawls, vault indexing, and knowledge graph builds independently of the UI — and survives panel close, tab switches, and browser restarts.**
> Source: https://doc.daneel.injen.io/concepts/background-tasks/index.md
Site crawling, vault indexing, and knowledge graph extraction can take minutes. Daneel runs these operations in the background so you can close panels, switch tabs, or navigate away without losing progress.
This is what powers the task monitor in [Settings > Tasks](/how-to/background-tasks/).
## The problem: UI-coupled operations
Chrome extensions run UI code inside content scripts and popup panels. When a panel closes — because you clicked outside it, navigated to another site, or simply switched views — any JavaScript running inside that panel stops. If a crawl was in progress, it dies silently.
This is a fundamental problem: long-running operations cannot live inside short-lived UI components.
## The solution: service worker ownership
Daneel inverts the relationship. Instead of the UI owning the operation, the **background service worker** owns it. The UI becomes a passive observer — it subscribes to progress updates, but its presence or absence does not affect execution.
When you start a site crawl from the Search overlay:
1. The overlay sends a `task-enqueue` message to the service worker
2. The service worker creates a **task record**, persists it to storage, and dispatches the work to the host tab
3. The host tab (which has GPU access) does the actual crawling and embedding
4. Progress broadcasts flow back through the service worker, which updates the task record
5. The overlay (if still open) receives these updates and renders a progress bar
If you close the overlay at step 3, steps 3 and 4 continue unchanged. When you reopen the overlay later, it connects to the service worker and receives the current task state.
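The task record at the center of this flow can be sketched as a plain data structure. The field names below are illustrative, not Daneel's actual schema, and a `Map` stands in for `chrome.storage.local`:

```typescript
// Hypothetical shape of a task record; field names are illustrative.
type TaskStatus = "pending" | "running" | "paused" | "complete" | "failed" | "cancelled";

interface TaskRecord {
  id: string;
  kind: "site-crawl" | "vault-index" | "kg-build";
  status: TaskStatus;
  // Checkpoint data, e.g. already-crawled URLs for a site crawl.
  checkpoint: { crawledUrls: string[] };
  updatedAt: number;
}

// The service worker would persist each record under a stable key so it
// survives worker eviction (sketched here with a plain Map instead of
// chrome.storage.local).
const taskStore = new Map<string, TaskRecord>();

function enqueueTask(id: string, kind: TaskRecord["kind"]): TaskRecord {
  const record: TaskRecord = {
    id,
    kind,
    status: "pending",
    checkpoint: { crawledUrls: [] },
    updatedAt: Date.now(),
  };
  taskStore.set(id, record);
  return record;
}
```

Because the record lives in storage rather than in a panel's memory, any UI that reconnects later can read the same state.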
## Checkpoint-resume: surviving eviction
Chrome's Manifest V3 service workers are not persistent. They sleep after roughly 30 seconds of inactivity and can be terminated at any time. A naive implementation would lose all state when the service worker dies.
Daneel handles this with **checkpoint-forward execution**:
- After each meaningful progress update (a page crawled, a document embedded, a batch of entities extracted), the service worker writes the updated task record to `chrome.storage.local`
- A `chrome.alarms` heartbeat fires every 60 seconds
- When the alarm wakes the service worker, it reads task records from storage and checks: is there a task marked as "running" but with no active dispatch in memory?
- If so, the service worker re-creates the host tab (if needed) and re-dispatches the operation from where it left off
For site crawls, the checkpoint includes the list of already-crawled URLs. On resume, these URLs are passed as a `skipUrls` parameter, so the crawler picks up where it left off without re-fetching pages.
For knowledge graph builds, the checkpoint tracks how many chunks have been processed. On resume, incremental mode detects which chunks already have entities extracted and skips them.
For vault indexing, the checkpoint tracks which documents have been ingested. The scheduler drives the per-document loop, dispatching the next un-ingested document after each one completes.
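The resume logic for a site crawl reduces to set subtraction: the checkpointed `skipUrls` list is removed from the discovered frontier. A minimal sketch (the function name is illustrative; only the `skipUrls` parameter is named in the docs above):

```typescript
// Given the URLs discovered so far and the checkpointed list of
// already-crawled pages, compute what still needs fetching.
function urlsToCrawl(discovered: string[], skipUrls: string[]): string[] {
  const done = new Set(skipUrls);
  return discovered.filter((url) => !done.has(url));
}
```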
## The GPU semaphore
All three task types need the GPU — site crawling embeds chunks via WebGPU, vault indexing does the same, and knowledge graph extraction runs NER inference plus entity embedding. Running two GPU-heavy operations simultaneously causes WebGPU context loss (the second model's ONNX Runtime session loses its device reference).
The task scheduler enforces a simple rule: **one GPU-bound task at a time**. When a second task is enqueued while the first is running, it enters a `pending` state and waits in a FIFO queue. When the first task completes (or is cancelled), the next pending task is dispatched automatically.
This is implemented as a boolean lock (`gpuBusy`) with a queue of waiting task IDs. There is no priority system or preemption — tasks run in the order they were enqueued.
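The lock-plus-queue rule can be sketched in a few lines. Only the `gpuBusy` name comes from the docs; the rest is illustrative:

```typescript
// One GPU-bound task at a time: a boolean lock plus a FIFO queue of
// waiting task IDs. No priorities, no preemption.
let gpuBusy = false;
const waiting: string[] = [];
const started: string[] = []; // records dispatch order, for illustration

function dispatch(taskId: string): void {
  gpuBusy = true;
  started.push(taskId);
}

function enqueueGpuTask(taskId: string): void {
  if (gpuBusy) {
    waiting.push(taskId); // task sits in `pending` until the lock frees
  } else {
    dispatch(taskId);
  }
}

function onTaskFinished(): void {
  gpuBusy = false;
  const next = waiting.shift(); // FIFO: oldest pending task runs next
  if (next !== undefined) dispatch(next);
}
```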
## Task lifecycle
A task moves through these states:
| Status | Meaning |
|--------|---------|
| `pending` | Queued, waiting for the GPU lock |
| `running` | Actively executing in the host tab |
| `paused` | User requested pause — checkpoint saved, host stopped |
| `complete` | All work finished successfully |
| `failed` | An error occurred (network timeout, model crash, etc.) |
| `cancelled` | User cancelled the task |
State transitions are guarded — a completed task cannot be paused, a pending task cannot be "completed," and so on. Invalid transitions are silently ignored rather than throwing errors, which prevents race conditions in async code.
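A guarded transition table makes the "silently ignore" behavior concrete. The table below is inferred from the lifecycle description, not copied from Daneel's source:

```typescript
// Allowed next states per current state (an assumption based on the
// lifecycle table above). Invalid moves return the current state
// unchanged instead of throwing.
const allowed: Record<string, string[]> = {
  pending: ["running", "cancelled"],
  running: ["paused", "complete", "failed", "cancelled"],
  paused: ["running", "cancelled"],
  complete: [],
  failed: [],
  cancelled: [],
};

function transition(current: string, next: string): string {
  return allowed[current]?.includes(next) ? next : current;
}
```

Returning the unchanged state instead of throwing means two racing async callers can both attempt a transition and the loser is a no-op rather than a crash.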
## What the UI sees
Any UI component (the Search overlay, the Vault panel, the Settings > Tasks panel) can observe task state by connecting a `chrome.runtime.Port` named `task-observer`. On connect, the service worker sends a snapshot of all current tasks. After that, incremental `task:updated` events flow over the port whenever progress changes.
This means you can start a crawl from the Search overlay, close it, open Settings > Tasks, and see the same live progress bar. The data source is the same — only the rendering differs.
The existing progress messages from the host tab (like `crawler-page` and `embedding-page`) still flow through the normal relay system. The task layer intercepts these to update checkpoints, then lets them pass through to any UI that's listening. This means the old progress indicators still work alongside the new task system.
## What this enables
Beyond fixing the "close panel, lose progress" bug, the task layer is the foundation for:
- **Concurrent awareness** — even though only one GPU task runs at a time, you can see what's queued and what's waiting
- **Crash recovery** — if Chrome closes mid-crawl, the task resumes from its checkpoint on next launch
- **Cross-panel visibility** — any settings panel or overlay can show the same task state
- **Future deep research** — multi-step operations (query planning, source fetching, synthesis) that could run for 10+ minutes will use the same infrastructure
See [How to Monitor Background Tasks](/how-to/background-tasks/) for the practical guide to the Settings > Tasks panel.
---
# Environment Context
**How Daneel injects location, datetime, and timezone into agent prompts — and why it matters.**
> Source: https://doc.daneel.injen.io/concepts/context-injection/index.md
When you ask an AI "what time is the next flight?", it needs to know your timezone. When you ask "find me a good restaurant nearby", it needs to know where you are. By default, most AI models have no idea about either. Daneel's environment context system solves this by injecting location, datetime, and timezone data into the system prompt before the AI sees your message.
## What gets injected
When enabled, Daneel adds an `## Environment Context` section to the system prompt:
```
## Environment Context
- Location: Lyon, France
- Current date/time: 2026-04-02T14:30:00.000Z
- Timezone: Europe/Paris
```
The AI reads this as part of its instructions and can reference it naturally in responses. There is no special API or tool involved — the context is plain text, visible to the model alongside the rest of the prompt.
## The three-tier architecture
Context injection uses a layered permission model. All three tiers must agree before data is injected.
### Tier 1: Global privacy gate
Two toggles in **Settings > Privacy** control what the extension is allowed to share:
- **Share location with agents** (off by default) — controls geolocation
- **Share date & timezone with agents** (on by default) — controls datetime/timezone
If a toggle is off, that data type is never injected, regardless of what agents or MCP servers request. This is the hard override.
### Tier 2: MCP server declarations
Each registered MCP server can declare whether its tools benefit from context:
- `requiresLocation` — tools like Google Maps, weather APIs, or local search
- `requiresDatetime` — tools that deal with scheduling, deadlines, or time-sensitive data
These flags are toggled per-server in **Settings > MCP** via clickable `location` and `datetime` badges on each server row.
### Tier 3: Agent overrides
Each agent has optional override fields:
- **Uses location** — force on/off regardless of bound servers
- **Uses date & time** — force on/off regardless of bound servers
When unset (the default), the agent inherits from its bound MCP servers: if any bound server declares `requiresLocation`, the agent gets location injected.
### Resolution order
```
Agent explicit flag (if set)
→ else: inherit from any bound MCP server
→ else: defaults (location=false, datetime=true)
→ gated by global privacy toggles
```
Datetime defaults to `true` because it has zero privacy cost (it uses `Date` and `Intl.DateTimeFormat` — no network calls, no permissions). Location defaults to `false` because it requires browser geolocation permission.
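The resolution order can be expressed as a single function. All names here are illustrative; only the precedence (agent flag, then server inheritance, then defaults, all gated by the privacy toggle) mirrors the docs:

```typescript
// A sketch of the three-tier resolution order.
interface ResolveInput {
  agentFlag?: boolean;       // Tier 3: explicit agent override, if set
  serverRequires: boolean[]; // Tier 2: flags from bound MCP servers
  globalGate: boolean;       // Tier 1: Settings > Privacy toggle
  defaultValue: boolean;     // false for location, true for datetime
}

function shouldInject(input: ResolveInput): boolean {
  const wanted =
    input.agentFlag !== undefined
      ? input.agentFlag
      : input.serverRequires.some(Boolean) || input.defaultValue;
  // The global privacy toggle is the hard override.
  return wanted && input.globalGate;
}
```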
## How geolocation works
The geolocation pipeline has several steps, each designed to degrade gracefully:
1. The background service worker sends a `geo-request` to the host page (an extension page with DOM access)
2. The host page calls `navigator.geolocation.getCurrentPosition()` with `enableHighAccuracy: false` (WiFi/IP-level, not GPS)
3. If the user grants permission, the host returns latitude and longitude
4. The background calls [Nominatim](https://nominatim.openstreetmap.org/) (OpenStreetMap's free reverse geocoder) to convert coordinates into a city name
5. The result (e.g., "Lyon, France") is cached in memory for the session
If any step fails — permission denied, timeout, geocoding error — the location is silently omitted. No error is shown to the user; the prompt simply proceeds without location context.
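Step 4 amounts to building a reverse-geocoding request against Nominatim. The endpoint is real; the exact parameters Daneel sends are an assumption:

```typescript
// A sketch of a Nominatim reverse-geocoding URL. format, lat, lon, and
// zoom are documented Nominatim parameters; the zoom level chosen here
// is illustrative.
function nominatimReverseUrl(lat: number, lon: number): string {
  const params = new URLSearchParams({
    format: "jsonv2",
    lat: String(lat),
    lon: String(lon),
    zoom: "10", // roughly city-level detail
  });
  return `https://nominatim.openstreetmap.org/reverse?${params}`;
}
```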
### Why the host page?
`navigator.geolocation` is not available in Chrome extension service workers. Calling it from a content script would prompt the user against the *host site's* origin ("example.com wants your location"), which is confusing. By using the extension's own host page, the permission prompt correctly shows the extension name.
### Permission flow
When you toggle "Share location with agents" **on** for the first time:
1. The host tab is briefly focused so Chrome can show its native geolocation prompt
2. You grant or deny the permission
3. Focus returns to your original tab automatically
4. If you denied, the toggle stays off
This only happens once. After granting permission, location resolves silently on subsequent uses.
### Caching and rate limits
- The resolved location is cached in memory for the browser session (cleared when the service worker restarts)
- The browser's own position cache (`maximumAge` of 300,000 ms, i.e. five minutes) avoids repeated GPS/WiFi lookups
- Nominatim is called at most once per session — well within its 1 request/second policy
- No location data is written to `chrome.storage` or persisted to disk
## How datetime works
Datetime injection is trivial by comparison:
- `new Date().toISOString()` for the current timestamp
- `Intl.DateTimeFormat().resolvedOptions().timeZone` for the IANA timezone (e.g., `Europe/Paris`)
No permissions, no network calls, no caching needed. The value is computed fresh on each message.
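The two calls named above are the whole implementation:

```typescript
// Computed fresh on each message; no permissions or network involved.
const timestamp = new Date().toISOString(); // e.g. "2026-04-02T14:30:00.000Z"
const timezone = Intl.DateTimeFormat().resolvedOptions().timeZone; // e.g. "Europe/Paris"
```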
## Privacy considerations
Environment context follows Daneel's [privacy model](/concepts/privacy/):
- **Geolocation** stays between your browser and Nominatim (for reverse geocoding). The resolved city name is only sent to your selected LLM provider as part of the prompt. Nominatim receives coordinates but no user identity — requests use a generic `DaneelAI-Extension` user-agent.
- **Datetime** never leaves your browser until it's embedded in a prompt sent to the LLM.
- **Neither type** is stored in `chrome.storage`, included in data exports, or sent to telemetry. The in-memory cache is lost when the service worker restarts.
The telemetry system has its own separate IP-based geolocation (for country/region analytics), which is independent and governed by the "IP geolocation" toggle. The two systems do not share data.
## When context injection activates
| Scenario | Location | Datetime |
|----------|----------|----------|
| Agent with `usesLocation: true` | Yes (if global gate on) | Yes |
| Agent inheriting from MCP server with `requiresLocation` | Yes (if global gate on) | Yes |
| Bare MCP server usage (no agent) | Inherits from server flags | Yes |
| Page Q&A or Site RAG (no agent, no MCP) | No | Yes |
Datetime is injected in all scenarios by default, including plain page questions. This gives the AI temporal awareness even without agents or tools — useful for questions like "is this documentation still current?"
## Next steps
- [How to Use Environment Context](/how-to/context-injection/) — practical setup steps
- [How to Create a Custom Agent](/how-to/agents/) — agents with context overrides
- [Settings Reference](/reference/settings/) — all context injection toggles
---
# Entity Resolution
**How Daneel maps entity mentions in your documents to canonical Wikidata identifiers, and why this is harder than it looks.**
> Source: https://doc.daneel.injen.io/concepts/entity-resolution/index.md
Entity resolution is the process of turning a string like "Feynman" into a stable identifier like `Q39246` — the Wikidata QID for Richard Feynman the physicist, not Roger Feynman the mathematician, not "Feynman, Alabama". This page explains how Daneel performs that mapping when you click a node in the knowledge graph, what the trade-offs are, and why the implementation fans out across two APIs instead of picking one.
## Why it is hard
Text is ambiguous in ways databases are not. A single surface form can refer to many entities:
- *"Paris"* — the French capital, the Greek mythological figure, the city in Texas, the Paris hotel in Las Vegas
- *"Einstein"* — Albert, but also a dozen other Einsteins on Wikidata
- *"Apple"* — the fruit, the company, the record label, the Manhattan neighborhood
A knowledge-graph node gives us two extra pieces of context the raw string does not carry: the ontology type (is this a person, a place, a company?) and the other entities that co-occur with it in the source documents. Good resolution uses both.
Daneel's own NER extractor assigns an ontology label when it identifies an entity. Labels like `person`, `city`, `organization`, `book`, `war` come from the vault's ontology (either a preset or a user-defined set). The fact-box resolver uses this label to narrow the search space on Wikidata.
## Two signals, one QID
The fact-box panel resolves entities by querying two services in parallel:
**OpenRefine Reconciliation API** — `wikidata-reconciliation.wmcloud.org/en/api`
Purpose-built for "free-text string plus optional type filter, rank the candidates". Accepts a Wikidata class QID (e.g., `Q5` = human, `Q515` = city) as the `type` parameter. When the entity's ontology label maps to a known Wikidata class, the reconciliation service receives that class and returns only candidates of the matching type — "Feynman" as a person, not "Feynman" as a place. Scores come back on a [0, 100] scale.
**Wikidata Search API** — `wbsearchentities`
A general-purpose label and alias matcher. No type filter. Returns the top candidates by text similarity. Useful as a fallback when the ontology label doesn't map to a clean Wikidata class, and as a coverage net when reconciliation misses.
Both calls fire in parallel. When they return, the candidates are merged by QID, scores are reconciled, and the list is sorted.
## Scoring and the auto-select threshold
Candidates that appear in both sources keep reconciliation's authoritative score. Candidates that appear only in the general search get a synthetic score, highest for the top hit (0.6 by default, with a boost to 0.7 when reconciliation returned nothing at all), decaying for lower ranks.
If the top candidate's final score crosses the auto-select threshold (0.85 by default), the panel commits to that QID without asking. This cuts a click out of the flow for unambiguous names like "Albert Einstein" or "Napoleon Bonaparte".
When no candidate crosses the threshold, the panel shows a disambiguation picker instead. This is the common case for short last names, common given names, and anything polysemous. The user sees the top 5 candidates with their QIDs, confidence scores, and one-line descriptions, and picks the right one.
The picker's choice is cached for 30 days and keyed on the surface text and ontology type. Future clicks on the same entity in any vault resolve instantly.
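The merge-and-score step can be sketched as follows. This assumes reconciliation scores are normalized from their [0, 100] scale to [0, 1] before comparison with the 0.85 threshold, and the per-rank decay for search-only candidates is illustrative; only the 0.6/0.7 synthetic scores and the threshold come from the docs:

```typescript
// Merge candidates from both sources by QID, keep reconciliation's
// score where it exists, synthesize scores for search-only hits, and
// auto-select when the winner clears the threshold.
interface Candidate { qid: string; score: number; }

function mergeCandidates(
  reconciliation: Candidate[], // scores already normalized to [0, 1]
  search: { qid: string }[],   // ranked by text similarity, no scores
  threshold = 0.85,
): { merged: Candidate[]; autoSelect: string | null } {
  const byQid = new Map<string, Candidate>();
  for (const c of reconciliation) byQid.set(c.qid, { ...c });
  // Top synthetic score: 0.6, boosted to 0.7 when reconciliation
  // returned nothing at all.
  const top = reconciliation.length === 0 ? 0.7 : 0.6;
  search.forEach((s, rank) => {
    if (!byQid.has(s.qid)) {
      // Decay rate per rank is an assumption for illustration.
      byQid.set(s.qid, { qid: s.qid, score: Math.max(0, top - 0.1 * rank) });
    }
  });
  const merged = [...byQid.values()].sort((a, b) => b.score - a.score);
  const autoSelect =
    merged.length > 0 && merged[0].score >= threshold ? merged[0].qid : null;
  return { merged, autoSelect };
}
```

A search-only top hit (score 0.7 at best) can never clear the 0.85 threshold on its own, which matches the behavior described: without reconciliation agreement, the user sees the picker.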
## Why both sources instead of one
It is tempting to pick one API and be done. We evaluated each during planning, and neither works alone:
- **Reconciliation alone** misses entities whose Wikidata class is not well covered, or whose ontology label doesn't map to a clean class. Users with custom ontologies like `research_paper` or `compound` would resolve nothing.
- **Search alone** has no type awareness. "Paris" asked for a city returns a mix of the city, the myth, and several American towns, with no way to bias toward the geographic sense.
The merge handles both weaknesses. Reconciliation narrows; search widens. Combined, coverage is broader than either source alone, and the type filter still gives reconciliation-aware queries the precision they need.
## What happens after resolution
Once a QID is committed (either auto-select or user pick), the panel fetches the full entity payload with `wbgetentities`. The response is simplified with `wikibase-sdk`'s `simplifyClaims` helper into a flat structure keyed by property ID, with qualifiers and references kept where useful.
Statement values come back as either a QID (for item-typed claims) or a primitive (for strings, dates, quantities, URLs). QID-valued statements need labels, so the panel collects every referenced QID and property ID and fetches human-readable labels in a single batch via `wbformatentities`. This batches up to 50 IDs per request, so a typical entity with 15 to 30 referenced QIDs resolves in one follow-up call.
The resulting label map is shared across every fact box in the same session. Seeing "Princeton University" once caches it for every future entity that references `Q21578`.
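The batching step is straightforward chunking: the collected QIDs and property IDs are split into groups of at most 50, one `wbformatentities` request per group. The helper name is illustrative:

```typescript
// Split referenced IDs into batches of up to 50, matching the
// per-request limit described above.
function chunkIds(ids: string[], size = 50): string[][] {
  const batches: string[][] = [];
  for (let i = 0; i < ids.length; i += size) {
    batches.push(ids.slice(i, i + size));
  }
  return batches;
}
```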
## Why this is a good fit for knowledge graphs
A knowledge graph built from your documents captures what is in your documents. A knowledge graph built from Wikidata captures what is true about the wider world. The fact box bridges the two: it takes your graph's nodes, connects each one to its canonical Wikidata identity, and surfaces the structured facts that Wikidata knows but your documents never mentioned.
This matters most when your documents assume background knowledge. A research paper citing "Bell" expects you to know it means John Stewart Bell, the physicist behind Bell's theorem. A historical document mentioning "Agincourt" expects you to know roughly when and where. The fact box fills in that background on demand, without you having to leave the graph view.
It is also the foundation for the upcoming factual-edges layer. Once every node in your vault has a resolved QID, a single SPARQL query can pull real relationships from Wikidata — "X was educated at Y", "A founded B", "C is located in D" — and render them as a distinct edge type in the 3D graph, alongside the co-occurrence edges you already see. That work uses the same resolution pipeline described here; the QIDs cached today become the input to the graph augmentation tomorrow.
## Trade-offs we accepted
- **English only in v1.** Reconciliation runs against the English endpoint, entity fetching pulls English labels, the label cache is lang-unaware. Multilingual resolution is a separate, bigger piece of work.
- **No semantic fallback by default.** We researched using [wd-vectordb](https://wd-vectordb.wmcloud.org/), Wikimedia's hybrid vector-keyword search, as a third signal. The service is promising but slow and fuzzy compared to reconciliation on the entity types we care most about. It may land as an optional backstop later.
- **The user is the tiebreaker.** When the two-source merge leaves multiple candidates in a dead heat, we show a picker rather than guess. The cached pick becomes our durable ground truth.
## Related reading
- [How to Use the Wikidata Fact Box](/how-to/wikidata-fact-box/) — the task-oriented guide to the feature.
- [Knowledge Graphs](/concepts/knowledge-graph/) — how the graph itself is built from your documents.
- [Privacy Model](/concepts/privacy/) — how external lookups fit into the data-residency picture.
---
# Graph Analytics
**Why your knowledge graph has importance scores, topic clusters, bridges, and a health rating — and what they actually measure.**
> Source: https://doc.daneel.injen.io/concepts/graph-analytics/index.md
A knowledge graph by itself is just a picture: nodes for entities, edges for connections. Daneel's analytics layer turns that picture into a structured set of insights — which entities matter most, which ones bridge separate ideas, what topical clusters exist, and whether the graph is healthy.
This page explains what each insight means, why it's useful, and what trade-offs come with the approach. To use these features in the extension, see [How to Explore Your Knowledge Graph](/how-to/explore-knowledge-graph/).
## Why analytics on a knowledge graph?
Visualizing 5,000 entities in 3D space is impressive but not directly useful. You can rotate it, zoom it, watch the physics simulation settle, and still walk away with no sense of what's important. The graph needs to be **summarized** before it can be understood.
The analytics layer answers four questions a curious user would naturally ask:
1. **What are the most important entities?** → Key Entities (importance ranking)
2. **What topics exist in this corpus?** → Topics (community detection)
3. **What connects different topics?** → Bridges (bridge detection)
4. **How is X related to Y?** → Path Finder (shortest path search)
A fifth question — **is my graph healthy?** — is answered by the structural diagnostics that flag fragmentation and possible duplicate entities.
All these questions have well-known answers from the field of graph theory. Daneel's analytics layer wraps the [graphology](https://graphology.github.io/) library to compute them.
## Importance: who's connected to whom
The "Key Entities" ranking uses **PageRank**, the same algorithm Google originally used to rank web pages. The intuition is recursive: an entity is important if it's connected to other important entities.
A famous person mentioned only twice in your corpus, but in passages with many other important entities, can outrank a generic term mentioned hundreds of times in passing. PageRank rewards being part of the conversation, not being repeated.
This is different from raw mention counts. If your vault contains physics papers and a generic term like "energy" appears thousands of times in passing, mention count rewards ubiquity, not significance. PageRank looks at the network structure instead.
**Trade-off:** PageRank assumes the graph is connected enough for the recursion to flow. On heavily fragmented graphs, isolated clusters get inflated scores within their own bubble. Daneel's "Graph Health" card warns you when this is happening.
## Topics: groups of things that travel together
Topics come from **Louvain community detection**, an algorithm that partitions a graph into clusters by maximizing internal connections relative to external connections. Two entities end up in the same community if they appear together more often than chance would predict.
In a knowledge graph from NER extraction, communities tend to correspond to:
- **Research domains** in academic corpora (a "machine learning" cluster, a "molecular biology" cluster)
- **Geographic regions** in news or travel content
- **Historical periods** in history documents
- **Product lines** in technical documentation
- **Topical themes** in mixed-domain corpora
Each community gets an auto-generated label by combining its top two entities (e.g., "Einstein & Relativity"). The label is heuristic — it's just a hint, not a semantic summary.
**Trade-off:** Community detection has no notion of meaning. It only looks at connection patterns. Two completely unrelated topics that happen to share a few documents can end up bundled into one community. The clustering is also non-deterministic — running it twice can produce slightly different communities.
## Bridges: the structural connectors
Bridges come from **betweenness centrality**, which measures how often an entity sits on the shortest path between other pairs of entities. High betweenness means: "if you removed this entity, the graph would fragment."
These are the **connectors**: entities that link otherwise separate communities. In a research corpus, a bridge might be an interdisciplinary scholar whose work spans two fields. In a business corpus, it might be a parent company that owns brands in different sectors.
Bridges are often the most interesting entities in a knowledge graph because they reveal **non-obvious connections**. The most-mentioned entity is usually the obvious one — but the bridge between Topic A and Topic B is something you wouldn't have spotted by reading documents one at a time.
**Trade-off:** Betweenness is the most computationally expensive analytic — it scales as O(n × m). On graphs with thousands of nodes it's still fast (a few hundred milliseconds), but it's the reason analytics are computed once and cached, not on every interaction.
## Connections: how is X related to Y?
The Path Finder answers the most human question of all: "given these two entities, is there any connection between them, and what's the shortest one?"
Daneel uses **Dijkstra's shortest path algorithm** with one important twist: edges with stronger co-occurrence (more shared documents) count as **shorter** distances. Mathematically, the distance is `1 / weight`. So a path through highly co-occurring entities is preferred over a path through weak ones, even if the weak path has fewer hops.
The result is a chain like:
> Einstein → General Relativity → Eddington → Royal Society
Each step in the chain comes with the chunk IDs that established the link, so you can trace the path back to the source documents.
If multiple paths of the same length exist, Daneel returns up to 5 alternatives. This reveals different "stories" connecting the same two entities — Einstein might connect to the Nobel Prize through one collaborator and to the same prize through a different one.
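The weight-to-distance twist is the key detail. Daneel uses graphology's shortest-path implementation; the standalone Dijkstra below is only a sketch of the same idea, with co-occurrence weight `w` mapped to distance `1 / w`:

```typescript
// Plain Dijkstra over an adjacency map, where edge distance is the
// reciprocal of co-occurrence weight: stronger edges are "shorter".
type Edges = Record<string, Record<string, number>>; // node -> neighbor -> weight

function shortestPath(edges: Edges, from: string, to: string): string[] | null {
  const dist: Record<string, number> = { [from]: 0 };
  const prev: Record<string, string> = {};
  const visited = new Set<string>();
  while (true) {
    // Pick the unvisited node with the smallest tentative distance.
    let node: string | null = null;
    for (const n of Object.keys(dist)) {
      if (!visited.has(n) && (node === null || dist[n] < dist[node])) node = n;
    }
    if (node === null) return null; // `to` is unreachable
    if (node === to) break;
    visited.add(node);
    for (const [nb, w] of Object.entries(edges[node] ?? {})) {
      const d = dist[node] + 1 / w; // stronger co-occurrence = shorter step
      if (dist[nb] === undefined || d < dist[nb]) {
        dist[nb] = d;
        prev[nb] = node;
      }
    }
  }
  const path = [to];
  while (path[0] !== from) path.unshift(prev[path[0]]);
  return path;
}
```

With this metric, a two-hop path through edges of weight 10 (distance 0.1 + 0.1) beats a direct edge of weight 1 (distance 1.0), which is exactly the "prefer strong links over short hops" behavior described above.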
**Trade-off:** Path search is constrained by the graph itself. If two entities are in different connected components (different islands), there's no path. The Graph Health card tells you when fragmentation is severe enough to make this likely.
## Graph health: is your graph trustworthy?
The Graph Health card combines three structural measures into a single traffic-light status:
- **Connected components** — how many disjoint islands the graph contains. A healthy graph has one big island and maybe a few small satellites.
- **Largest component percent** — what fraction of entities are in the biggest island. A healthy graph has 70%+ in the main cluster.
- **Possible duplicates** — entity pairs that look textually similar but weren't merged. Detected via the same heuristics the entity resolver uses (substring containment, edit distance, reversed word order).
A heavily fragmented graph (many small components) usually means one of two things:
1. **Entity resolution failed** — the same entity exists under multiple slightly different names, splitting what should be one connected blob into many small ones. The "Possible duplicates" list flags these cases.
2. **Your corpus is genuinely disjoint** — you imported documents about completely unrelated topics. The graph correctly reflects that.
The duplicates list is **actionable**: each entry shows the pair, the type, and the reason they look similar. You can use this to spot recurring entity-resolution failures and adjust your ontology or model accordingly.
## Visual encoding: sizing and coloring modes
The 3D visualization separates **what's measured** (analytics) from **how it's drawn** (sizing and coloring). You can swap either dimension without recomputing the underlying metrics.
**Node size** can represent:
- **Mentions** (default) — how often the entity appears, log-scaled
- **Importance** — PageRank score
- **Bridges** — betweenness centrality
- **Connectivity** — degree (raw connection count)
The size uses a square-root curve over a wide range so the natural power-law shape of centrality metrics stays visible — a handful of dramatic outliers stay dramatic instead of being flattened into uniform blobs.
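A minimal sketch of that size mapping, assuming values are normalized against the corpus maximum; the pixel bounds are illustrative, not Daneel's actual values:

```typescript
// Map a metric value through a square-root curve so power-law outliers
// stay visually dominant without flattening mid-range nodes.
function nodeSize(value: number, maxValue: number, minPx = 2, maxPx = 24): number {
  if (maxValue <= 0) return minPx;
  const t = Math.sqrt(value / maxValue); // sqrt compresses the top end gently
  return minPx + t * (maxPx - minPx);
}
```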
**Node color** can represent:
- **Type** (default) — the entity's ontology type (person, location, organization, etc.)
- **Topic** — which Louvain community it belongs to
Switching color modes also switches the bottom-left legend. In topic-color mode, the legend lists communities and lets you click one to filter the view.
The decoupling matters because the same graph tells different stories depending on what you ask. "Color by type" reveals the ontological mix. "Color by topic" reveals thematic clusters. "Size by importance" reveals influential entities. "Size by bridges" reveals structural connectors. Combine them however you like.
## Wikipedia: external context on demand
The graph tells you what's in your corpus. Wikipedia tells you what the world knows about each entity. Daneel bridges the two: clicking any node in the 3D view triggers a Wikipedia search alongside the focus action.
The lookup uses Wikipedia's prefix-search API to find pages whose titles match the entity name, returns up to 10 candidates with thumbnails and short descriptions, and lets you either:
- Open an article in the **document viewer** (Daneel fetches the page, converts the HTML to markdown, and renders it inline)
- Open the article on **wikipedia.org** in a new tab
The first option is useful because the article appears alongside the graph — you can read about an entity without losing your place in the visualization.
Results are cached in `chrome.storage.local` for 7 days, so repeat lookups don't hit the Wikipedia API again. Empty results (no matching articles) are cached for 1 day to avoid repeatedly retrying broken queries.
**Trade-off:** Wikipedia is general-purpose. If your corpus uses domain-specific terminology that Wikipedia doesn't cover, the lookup may return empty or irrelevant results. The disambiguation pages (e.g., five different "John Tate"s) help with the opposite problem — when an entity name maps to multiple real-world things and you need to pick the right one for context.
## How it all fits together
The analytics layer is an **interpretation layer** between raw extraction (NER + entity resolution) and human exploration (the 3D viz, panels, and Wikipedia). Each piece serves a different question, and the cross-references between them are the point — clicking a topic filters the graph, clicking an entity opens its neighborhood and Wikipedia, clicking a path highlights the connection chain.
You don't need to know which algorithm is running underneath. The point is that "Importance", "Topics", "Bridges", "Path", and "Health" map to questions you'd ask about any large network of things, and Daneel answers them with well-understood graph theory operating entirely in your browser.
---
# Knowledge Graphs
**What entity extraction and knowledge graphs are, and how they improve document understanding.**
> Source: https://doc.daneel.injen.io/concepts/knowledge-graph/index.md
A knowledge graph is a structured representation of the entities (people, organizations, places, concepts) in your documents and the relationships between them. Daneel builds knowledge graphs from vault documents to help you see connections that aren't obvious from reading individual files.
To build one, follow [How to Build a Knowledge Graph](/how-to/knowledge-graph/). To explore one with analytics and Wikipedia lookup, follow [How to Explore Your Knowledge Graph](/how-to/explore-knowledge-graph/). For the analytics layer's underlying ideas (importance, topics, bridges, paths), see [Graph Analytics](/concepts/graph-analytics/). For configuration details, see [Settings > Knowledge Graph](/reference/settings/#knowledge-graph).
## What it does
When you enable the knowledge graph on a vault, Daneel:
1. Reads every document in the vault
2. Extracts named entities using a local NER (Named Entity Recognition) model
3. Resolves duplicates ("OpenAI", "Open AI", "OPENAI" become one entity)
4. Identifies relationships based on co-occurrence within the same text passages
5. Builds an interactive 3D graph you can explore visually
The result is a map of your documents' key concepts and how they connect.
## Named Entity Recognition (NER)
NER is the process of identifying and classifying named things in text. Given the sentence:
> *"Satya Nadella announced that Microsoft would invest $10 billion in OpenAI."*
A NER model extracts:
- **Satya Nadella** — Person
- **Microsoft** — Organization
- **$10 billion** — Financial value
- **OpenAI** — Organization
Daneel uses GLiNER, an ONNX-based NER model that runs entirely in your browser via a dedicated web worker. No text is sent to any server for entity extraction.
The model comes in four variants, trading size for accuracy and language support:
| Model | Size | Languages | Best for |
|-------|------|-----------|----------|
| GLiNER Small v2.1 (fp32) | 583 MB | English | Maximum accuracy |
| GLiNER Small v2.1 (int8) | 183 MB | English | Good balance (default) |
| GLiNER Multi v2.1 (int8) | 349 MB | Multilingual | Non-English documents |
| GLiNER Multi v2.1 (fp16) | 580 MB | Multilingual | Best multilingual accuracy |
## Entity resolution
Raw NER output contains duplicates. "IBM", "I.B.M.", and "International Business Machines" might all refer to the same entity. Daneel's `EntityResolver` deduplicates using normalized string matching — comparing lowercased, whitespace-collapsed versions of entity names and merging those above a similarity threshold (default: 85%).
This is a heuristic, not perfect. It handles case variations and minor formatting differences well, but won't merge "IBM" and "Big Blue" (which would require semantic understanding). You can adjust the threshold in settings — lower values merge more aggressively, higher values are more conservative.
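A minimal sketch of the normalize-and-merge idea follows. The similarity metric shown (a normalized Levenshtein ratio) is an assumption for illustration — the real `EntityResolver` may use a different measure — but the normalization and the 0.85 threshold mirror the description above:

```typescript
// Sketch of normalized string matching for entity resolution.
// The edit-distance ratio is an assumed metric, not Daneel's actual one.
function normalize(name: string): string {
  return name.toLowerCase().replace(/\s+/g, " ").trim();
}

// Standard Levenshtein distance with a rolling single-row table.
function editDistance(a: string, b: string): number {
  const dp = Array.from({ length: a.length + 1 }, (_, i) => i);
  for (let j = 1; j <= b.length; j++) {
    let prev = dp[0]; // dist(a[0..i-1], b[0..j-1]) from the previous column
    dp[0] = j;
    for (let i = 1; i <= a.length; i++) {
      const tmp = dp[i];
      dp[i] = Math.min(
        dp[i] + 1,     // deletion
        dp[i - 1] + 1, // insertion
        prev + (a[i - 1] === b[j - 1] ? 0 : 1), // substitution / match
      );
      prev = tmp;
    }
  }
  return dp[a.length];
}

function similarity(a: string, b: string): number {
  const [x, y] = [normalize(a), normalize(b)];
  if (x === y) return 1;
  const maxLen = Math.max(x.length, y.length);
  return maxLen === 0 ? 1 : 1 - editDistance(x, y) / maxLen;
}

// Greedily merge mentions whose similarity to a cluster clears the threshold.
function resolve(mentions: string[], threshold = 0.85): string[][] {
  const clusters: string[][] = [];
  for (const m of mentions) {
    const home = clusters.find((c) => similarity(c[0], m) >= threshold);
    if (home) home.push(m);
    else clusters.push([m]);
  }
  return clusters;
}
```

Under this sketch, `resolve(["OpenAI", "Open AI", "OPENAI", "Microsoft"])` merges the first three into one cluster and keeps "Microsoft" separate — and, as the text notes, "Big Blue" would never merge with "IBM" because no string metric can see the semantic link.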
## Ontology presets
An ontology defines what types of entities the NER model looks for. Different domains have different relevant entity types. Daneel ships with eight presets:

- **General** — people, organizations, places, events, concepts
- **Academic** — researchers, institutions, theories, publications
- **Legal** — cases, statutes, courts, parties
- **Medical** — conditions, treatments, drugs, anatomy
- **Programming** — languages, frameworks, APIs, data structures
- **Business** — companies, products, markets, financials
- **Travel** — destinations, landmarks, transport, accommodations
- **History** — historical figures, battles, treaties, eras
You can also define custom ontology labels for specialized domains. The ontology is configured per-vault, so a legal vault and a programming vault can use different entity types.
## Relationships
Daneel infers relationships from co-occurrence: if two entities appear in the same text passage, they're likely related. The more often they co-occur, the stronger the relationship.
This is simpler than hand-curated knowledge graphs (like Wikidata) where relationships have explicit types ("works for", "located in"). But for document analysis, co-occurrence captures the important signal: these things are discussed together.
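The weighting scheme can be sketched in a few lines: every pair of entities found in the same passage adds one unit of weight to the edge between them. Names and data shapes here are illustrative, not Daneel's internals:

```typescript
// Co-occurrence edge weighting sketch: each passage contributes +1 weight
// to every pair of entities it mentions.
type Edges = Map<string, number>; // key "a|b" with a < b lexicographically

function buildEdges(passages: string[][]): Edges {
  const edges: Edges = new Map();
  for (const entities of passages) {
    const unique = [...new Set(entities)].sort(); // dedupe within a passage
    for (let i = 0; i < unique.length; i++) {
      for (let j = i + 1; j < unique.length; j++) {
        const key = `${unique[i]}|${unique[j]}`;
        edges.set(key, (edges.get(key) ?? 0) + 1);
      }
    }
  }
  return edges;
}
```

Two entities mentioned together in many passages end up with a heavy edge; a one-off co-mention produces a thin one.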
## The visualization
The 3D graph uses WebGL rendering (via ngraph):
- **Nodes** are entities, sized by mention frequency
- **Edges** are relationships, weighted by co-occurrence strength
- **Colors** map to entity types (configurable per type)
- **Physics simulation** positions nodes — related entities cluster together, unrelated ones drift apart
You can rotate, zoom, and hover to explore. Customizable parameters include charge strength (node repulsion), link opacity, particle animations, and bloom glow effects.
## Why it helps
Reading 50 documents individually, you might miss that three different papers all mention the same researcher, or that a concept from document A is the foundation for the technique described in document D. The knowledge graph surfaces these cross-document connections visually.
It's most useful for:
- **Literature reviews** — mapping the landscape of who studies what
- **Legal discovery** — seeing which entities appear across case files
- **Business intelligence** — understanding relationships between companies, people, and products
- **Research synthesis** — finding thematic connections across a corpus
## Beyond visualization: analytics
A graph by itself is hard to read at scale. With thousands of nodes, the visualization is impressive but not directly informative. Daneel adds an [analytics layer](/concepts/graph-analytics/) on top of the graph that summarizes it into actionable insights:
- **Key Entities** — entities that are structurally important via PageRank
- **Topics** — clusters of entities that travel together (Louvain communities)
- **Bridges** — entities that connect otherwise separate parts of the graph (betweenness centrality)
- **Paths** — shortest connection chains between any two entities (Dijkstra)
- **Graph Health** — fragmentation diagnostics + possible duplicate detection
The analytics layer also enables **one-click Wikipedia lookup**: clicking any entity in the graph triggers a search and lets you read articles directly inside the document viewer.
## Limitations
- **Co-occurrence is not causation.** Two entities appearing in the same paragraph doesn't mean they're directly related. The graph shows proximity, not meaning.
- **NER quality depends on the model.** The int8 model is fast but occasionally misidentifies entities or misses subtle ones. The fp32 model is more accurate but larger.
- **English bias.** The English-only models work best on English text. For other languages, use the multilingual variants.
- **No relationship typing.** Edges don't have labels like "works at" or "located in" — they just indicate co-occurrence strength.
---
# How Licensing Works
**The Daneel AI licensing model — one-time payment, portable license keys, offline JWT verification, and the free vs paid feature split.**
> Source: https://doc.daneel.injen.io/concepts/licensing/index.md
Daneel AI uses a one-time payment model with no accounts, no subscriptions, and no phone-home checks during normal use. This page explains how the licensing system is designed, what stays local, and why features keep working when you're offline.
## Free by default, paid for extras
The core of the extension is free. Page Q&A, Site RAG, local inference with WebGPU or Ollama, MCP tool calling, agents, and vaults all work without paying or signing up. A handful of capabilities — the ones with meaningful infrastructure costs or aimed at power users — are gated behind a single payment.
Paid features are enforced by **feature flags** embedded in your license token, not by hardcoded product tiers. The current gates include vault limits (free: 1 vault with up to 5 documents; paid: unlimited vaults with up to 50 documents each) and a few premium model and backup destinations. The set evolves as Daneel grows; your license automatically picks up any new flags added to your tier.
## No account, just a key
There is no sign-up form, no password, no dashboard. When you click **Unlock** in **Settings > License**, you're taken to a standard Stripe checkout page. After payment, Stripe fires a webhook to the Daneel backend, which generates a key in the format `DAN-XXXX-XXXX-XXXX` and emails it to the address on your receipt.
The key is your identity. Copy it into your password manager, paste it into the license panel on any machine, and you're activated. The same key works across Chrome profiles and devices — it is not bound to a browser fingerprint, hardware ID, or account.
## The JWT token and offline caching
Pasting a license key does not unlock features directly. The extension sends the key to the backend once, which returns a signed **ES256 JWT** containing your plan tier, feature flags, and an expiry timestamp. The extension caches this token in Chrome's local storage.
Every feature check is local. A small `LicenseGate` reads the cached token, verifies its signature against an embedded public key, and checks whether the requested feature flag is present. There is no network call on normal use, which means:
- Feature checks add no latency.
- The extension works fully offline once the token is cached.
- The backend does not see which features you use or how often.
The token carries a **7-day TTL**. During that window everything continues to work without connectivity. In the background, Daneel refreshes the token before it expires. If you happen to be offline when the refresh window opens, the cached token continues to work until it runs out; the next time you come online, the refresh happens automatically and the clock resets.
If you add new flags to your tier on the backend — for instance, when a previously experimental feature ships to paid users — the next refresh pulls them in. Your key does not change, but what it unlocks can grow over time.
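The check described above reduces to a small local predicate. This is a minimal sketch — the token shape and names are assumptions, and signature verification against the embedded public key is elided:

```typescript
// Sketch of a local, offline-capable feature check in the spirit of
// LicenseGate. No network call is involved; everything reads the cache.
interface CachedToken {
  features: string[]; // feature flags granted to this tier
  exp: number;        // expiry, epoch seconds (the 7-day TTL window)
}

function hasFeature(token: CachedToken | null, flag: string, nowSec: number): boolean {
  if (!token) return false;              // nothing cached: free tier only
  if (nowSec >= token.exp) return false; // TTL exhausted: a refresh is needed
  return token.features.includes(flag);  // flag present in the signed claims
}
```

Because the decision is a list lookup on cached data, feature checks cost nothing and keep working on a plane — exactly the properties the TTL design is after.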
## Recovery
If you lose your key, open the recovery page and enter the email address you used at checkout. The backend looks up any keys associated with that address and emails them back to you. There is no support ticket, no identity verification beyond control of the email inbox, and no waiting.
## What the backend stores
Because there are no user accounts, the record of you on the backend is minimal: the license key itself and the Stripe receipt email. Stripe handles card details — Daneel never sees them. Refresh calls transmit only the license key; they do not carry any information about your browsing, your documents, your indexes, or your usage.
The backend runs on Vercel with a Supabase PostgreSQL database. The JWT is signed with an ES256 key held on the backend; the extension only ships the public half for verification.
## Related
- The license panel itself is documented in [Settings](/reference/settings/).
- For the broader data-handling picture, see the [Privacy Model](/concepts/privacy/).
---
# MCP and Tool Calling
**How Daneel AI uses the Model Context Protocol to let AI call tools on external services.**
> Source: https://doc.daneel.injen.io/concepts/mcp/index.md
MCP (Model Context Protocol) is a standard that lets AI models call tools on external services. When you connect an MCP server to Daneel, the AI gains the ability to read Stripe invoices, query databases, manage deployments, and more — all within the chat conversation.
To connect a server, follow [How to Connect an MCP Server](/how-to/mcp-server/). For which providers support tool calling, see the [AI Providers reference](/reference/providers/).
## What is a tool call?
A tool call is when the AI decides it needs external data to answer your question. Instead of making something up, it:
1. Recognizes it needs information from a connected service
2. Formats a tool call request (e.g., "list invoices for customer X")
3. Sends the request to the MCP server
4. Receives the result
5. Incorporates the result into its response
You see this in the chat as an inline tool call indicator showing which tool was called and a summary of the action.
## Multi-turn tool loops
For complex questions, the AI may need multiple tool calls across several turns:
```
You: "Compare last month's revenue to the month before"
AI → Stripe: list_invoices(period: "last month")
Stripe → AI: [invoice data...]
AI → Stripe: list_invoices(period: "two months ago")
Stripe → AI: [invoice data...]
AI: "Last month's revenue was $12,400, up 15% from $10,800 the previous month."
```
The `ToolCallLoop` orchestrator manages these multi-turn conversations automatically. The AI decides when it has enough information to answer.
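The orchestration above can be sketched as a simple loop: the model either emits a tool call or a final answer, and tool results are fed back until it answers. All names here are illustrative, not the real `ToolCallLoop` API:

```typescript
// Schematic multi-turn tool loop. The model function stands in for an LLM
// turn; callTool stands in for an MCP server round trip.
type ModelTurn =
  | { kind: "tool_call"; tool: string; args: Record<string, string> }
  | { kind: "answer"; text: string };

function runToolLoop(
  model: (history: string[]) => ModelTurn,
  callTool: (tool: string, args: Record<string, string>) => string,
  maxTurns = 10, // safety cap so a confused model cannot loop forever
): string {
  const history: string[] = [];
  for (let turn = 0; turn < maxTurns; turn++) {
    const step = model(history);
    if (step.kind === "answer") return step.text;
    // Append the tool result so the next model turn can see it.
    history.push(callTool(step.tool, step.args));
  }
  return "stopped: turn limit reached";
}
```

The key property is that the model, not the loop, decides when it has enough information: the loop only terminates early on an `answer` turn.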
## Tool call strategies
Different LLM providers format tool calls differently. Daneel handles this transparently:
| Provider | Strategy | How it works |
|----------|----------|-------------|
| Claude | Native tool_use | Claude's API has built-in tool calling with structured `tool_use` blocks. Most reliable. |
| Ollama / Azure OpenAI | OpenAI function format | Uses the OpenAI-compatible function calling format. Reliability depends on the specific model. |
| WebGPU / Gemini Nano | Prompt-based XML | Tool calls are formatted as XML tags embedded in the prompt. Works in theory, but small models often misformat calls. |
## Discovery and authentication
When you connect an MCP server, Daneel discovers its available tools automatically. The server declares what tools it offers (e.g., `list_invoices`, `create_customer`, `get_deployment`), and Daneel makes them available to the AI.
Authentication is handled per-server:
- **OAuth2 + PKCE** — for services like Stripe, Notion, Vercel. Daneel manages the full OAuth flow, token storage, and refresh via Chrome's identity API.
- **API Key** — for services like Google Maps, Exa. You paste the key once and it's stored locally.
- **Bearer token** — for custom servers with token-based auth.
- **None** — for open servers that don't require auth.
Daneel discovers auth requirements automatically using WWW-Authenticate headers and RFC 8414 OAuth metadata.
## Agents and tools
[Agents](/how-to/agents/) can have specific MCP servers bound to them. When you use an agent, the AI only has access to the tools you've explicitly assigned. This lets you create focused workflows — a "Billing Agent" that only sees Stripe, a "DevOps Agent" that only sees Vercel and Cloudflare.
## Validated servers
These MCP servers have been tested and confirmed working with Daneel:
| Server | Auth | Category |
|--------|------|----------|
| Stripe | OAuth | Payments |
| Supabase | OAuth | Database |
| Vercel | OAuth | DevOps |
| Notion | OAuth | Productivity |
| Figma | OAuth | Design |
| Linear | OAuth | Project management |
| Slack | OAuth | Communication |
| Google Maps | API Key | Maps |
| Cloudflare | OAuth | Infrastructure |
| Exa | API Key | Search |
| data.gouv.fr | None | Open data |
| Context7 | None | Documentation |
The full list of featured servers is in **Settings > MCP**.
## Limitations
- **Small models struggle with tools.** WebGPU (3B) and Gemini Nano often fail to format tool calls correctly. Use Claude or Ollama with 7B+ models for reliable tool calling.
- **One tool loop at a time.** The current architecture processes tool calls sequentially within a conversation turn.
- **Server availability.** MCP servers are remote services — they can be down, rate-limited, or require paid subscriptions.
---
# Offline Mode
**How Daneel's Offline Mode blocks every outbound network call while keeping on-device and LAN features fully functional.**
> Source: https://doc.daneel.injen.io/concepts/offline-mode/index.md
Daneel's Offline Mode is a user-asserted, verifiable no-network switch. When you flip it on, every outbound call that would leave your machine is blocked by design. The extension stays fully functional for everything that runs on-device or on your LAN. This page explains the rules behind what is allowed and what is denied, so the guarantee is predictable rather than mysterious.
## The data-residency rule
Every AI provider and every tool integration in Daneel carries a `PrivacyProfile` describing where its data physically goes. The most important field is `leavesMachine`: `true` if a call reaches a third party, `false` if it stays on your device or your LAN.
Offline Mode is keyed on that single field. There is no bespoke block list. The rule is:
> If a call would leave this machine, it is blocked. If it stays local, it passes through.
This is why Ollama on localhost keeps answering, why a Docker Companion MCP server on your LAN keeps working, and why WebGPU inference never notices that anything changed. The same rule denies Claude, Azure OpenAI, cloud backup, and the license verification endpoint, because those genuinely leave the machine.
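Because everything hangs off one field, the whole rule fits in a one-line predicate. A sketch (the real `PrivacyProfile` carries more fields than shown):

```typescript
// The Offline Mode gate, reduced to its essence: a call passes unless
// Offline Mode is on AND the call's profile says data leaves the machine.
interface PrivacyProfile {
  leavesMachine: boolean; // true if the call reaches a third party
}

function allowedOffline(profile: PrivacyProfile, offlineMode: boolean): boolean {
  return !offlineMode || !profile.leavesMachine;
}
```

There is nothing provider-specific to maintain: adding a new integration means declaring its profile honestly, and the gate handles the rest.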
## What stays enabled
The following keep working exactly as they do online:
- WebGPU inference — runs in your browser, never touches the network
- Gemini Nano — on-device Chrome AI, no outbound calls
- Ollama on localhost or LAN — local server, local data
- Docker Companion and MCP bridges you host yourself — same rationale
- Vault search, chat, import, and document viewing from already-cached content
- The knowledge graph (GLiNER entity extraction runs in a Web Worker locally)
- Local filesystem data export and import (ZIP download, file picker)
## What gets blocked
Everything below hits a third-party host. All of it is denied with a typed error the UI knows how to recover from:
- Claude and Azure OpenAI inference
- Remote MCP servers (Stripe, Supabase, Vercel, public registries)
- MCP registry search (the Official registry, PulseMCP)
- Cloud backup via Azure Blob Storage or S3-compatible targets
- License verification against the Daneel backend (the cached token is used)
- Model registry refresh (the bundled catalog is used)
- Telemetry events (dropped entirely, not buffered for later replay)
- Wikipedia lookup from the knowledge graph node viewer
- Loading external pages in the vault's document viewer
- Fetches of the news, changelog, credits, and documentation pages from the live site (the local cache serves them instead)
Several surfaces that depend on remote calls also render a disabled overlay so you do not spend time filling in credentials that cannot be used: Claude, Azure OpenAI, MCP, and Models Storage. The Models Storage panel is specifically frozen because the one thing you absolutely do not want mid-flight is an accidental delete of the cached model keeping your offline session alive.
The Data panel is split: local filesystem export and import stay enabled, Azure Blob and S3 sub-sections grey out with a short notice.
## Persistent and Test modes
The switch has two forms, deliberately separate:
- **Switch to offline mode** — persistent, survives browser restarts. This is the real trust feature. Flip it on, close the browser, come back tomorrow on a plane, it is still on.
- **Test offline mode** — transient, active until the extension reloads. It lets you verify your offline setup without pulling the ethernet cable. The effect is identical, the duration is not.
Either one flipped on makes the effective state offline. You can have both on at once, which is equivalent to having one on.
## Three escape hatches
The worst outcome of a trust feature is getting trapped inside it. Daneel gives you three independent ways to turn it off:
1. The extension popup shows a prominent green OFFLINE MODE card with a one-click Turn off button whenever the mode is active. The popup is always reachable from the toolbar icon.
2. The Vault tab (the standalone vault.html page) shows the same card at the top of the page. It works even if no webpage is loaded anywhere in Chrome.
3. Settings > Offline mode, from the widget on any normal webpage, gives you the canonical toggles.
Wherever you find yourself, you are one click from the exit.
## How the trust is verifiable
Open Chrome DevTools, switch to the service worker's network panel, flip Offline Mode on, and use the extension normally. You will see zero requests to `api.anthropic.com`, to the license backend, to Google Analytics, to the docs site, to Wikipedia. Localhost (for Ollama) and the extension's own local resources are all that move.
This is not a disclosure trick. The service worker's proxy-fetch handler, the MCP transport, and the provider clients all consult the same gate before issuing any request. The block happens at the point of call, not at the network boundary.
Telemetry deserves a specific mention: when the gate denies a telemetry event, the event is dropped, not queued. Your actions while offline are not reported when you reconnect. The privacy expectation of the switch would be violated by any kind of replay.
## Related reading
- [How to Prepare for Offline](/how-to/prepare-for-offline/) — the practical preflight walkthrough
- [How to Use Daneel Offline](/how-to/offline/) — daily-use guide, activation, recovery
- [Privacy Model](/concepts/privacy/) — the full data-residency picture per provider
- [The Provider Spectrum](/concepts/providers/) — which providers work offline and which do not
---
# Privacy Model
**How Daneel AI handles your data and what stays local vs. what leaves your machine.**
> Source: https://doc.daneel.injen.io/concepts/privacy/index.md
Daneel AI is designed with a privacy gradient — you choose how much (or how little) of your data leaves your machine. This page explains the data flow for each provider and feature.
## The privacy gradient
Every AI provider in Daneel has a **data residency** classification:
| Level | Meaning | Providers |
|-------|---------|-----------|
| **On-device** | Data never leaves your browser process | WebGPU, Gemini Nano |
| **Local network** | Data goes to a server on your LAN, never to the internet | Ollama |
| **Your cloud** | Data goes to infrastructure you control | Azure OpenAI |
| **Third-party cloud** | Data goes to an external API provider | Claude (Anthropic) |
You can filter models by privacy level in **Settings > AI Models** to find models that match your requirements.
## What stays local — always
Regardless of which LLM provider you use, these operations never leave your browser:
- **Embedding** — All vector embeddings are generated locally by the BGE Small model running on WebGPU (or WASM fallback). Your text is chunked and embedded on-device.
- **Vector search** — Cosine similarity search runs in IndexedDB or GPU-accelerated memory. Search queries never leave the browser.
- **Document storage** — Vault documents, site indexes, and knowledge graphs are stored in IndexedDB in your browser profile.
- **Settings and credentials** — All configuration data, including encrypted API keys, stays in Chrome's local storage.
- **Content extraction** — Page text extraction (Readability.js, Turndown) runs in the content script or service worker.
## What leaves your machine — by choice
When you select a cloud LLM provider, the following data is sent to that provider's API:
- The assembled prompt (page content or RAG context + your question + conversation history)
- The AI's response streams back
This is the standard flow for any AI chat application. The difference is that with Daneel, you can avoid it entirely by using WebGPU or Ollama.
### Claude (Anthropic)
- Data sent to Anthropic's API servers
- API key is encrypted with AES-256-GCM before storage; transmitted via HTTPS
- Anthropic's [data usage policy](https://www.anthropic.com/policies) applies
- The `anthropic-dangerous-direct-browser-access: true` header is set (required for browser-based API calls)
### Ollama
- Data sent to your Ollama server (default: `localhost:11434`)
- Stays on your local network — nothing reaches the internet
- You control the server and its data retention
### Azure OpenAI
- Data sent to your Azure OpenAI deployment in your tenant
- Your Azure data residency and compliance policies apply
- Authentication via API key or Entra ID (your Azure AD)
### MCP tool calls
When using MCP servers, tool call parameters and results are exchanged with the remote server. Each MCP server has its own data handling policy. OAuth-connected servers (Stripe, Notion, etc.) operate under their respective privacy policies.
## Environment context
When enabled, Daneel injects your approximate location (city level) and current datetime into agent system prompts. This data is:
- **Location** — resolved once per session via browser geolocation + OpenStreetMap reverse geocoding. Stored only in memory (never persisted to disk). Sent to your LLM provider as part of the prompt.
- **Datetime** — computed locally from `Date` and `Intl.DateTimeFormat`. No network calls.
Both are gated by toggles in **Settings > Privacy**: location injection is off by default, datetime injection is on. The telemetry geolocation system (below) is completely separate and does not share data with context injection.
See [Environment Context](/concepts/context-injection/) for the full architecture.
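The datetime half really is pure local computation. A minimal sketch of what "computed locally from `Date` and `Intl.DateTimeFormat`" looks like (the function name and formatting options are illustrative, not Daneel's actual code):

```typescript
// Build a human-readable datetime string entirely from local browser APIs.
// No network call is involved at any point.
function datetimeContext(date = new Date(), locale = "en-US"): string {
  const fmt = new Intl.DateTimeFormat(locale, {
    dateStyle: "full",  // weekday, month name, day, year
    timeStyle: "short", // hours and minutes
  });
  return fmt.format(date);
}
```

The resulting string is simply appended to the agent's system prompt, so the model knows "now" without any server being consulted.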
## Telemetry
Daneel includes optional analytics (GA4 Measurement Protocol). When enabled:
**Collected:** feature usage counters (chat, search, crawl, model load), provider and model name, OS, Chrome version, language, country/region.
**Never collected:** page content, URLs you visit, chat messages, documents, API keys, or any personally identifiable information.
Telemetry is controlled by a toggle in **Settings > Privacy**. Disabling it stops all analytics collection immediately.
## Encryption
- Claude API keys: AES-256-GCM encryption at rest in Chrome storage
- MCP OAuth tokens: stored in Chrome's local storage with auto-migration from legacy formats
- S3 credentials: stored in Chrome storage, excluded from data exports
- Azure SAS URLs: stored in Chrome storage, excluded from data exports
## Practical guidance
- **Maximum privacy:** Use WebGPU for LLM + default local embedding. Zero data leaves your machine.
- **Privacy with power:** Use Ollama on localhost. Data stays on your machine but you get access to larger models.
- **Enterprise compliance:** Use Azure OpenAI. Data stays in your Azure tenant under your compliance umbrella.
- **Best quality:** Use Claude. Prompts are sent to Anthropic's API, but embedding and search remain local.
To see this in action, follow [Your First Page Chat](/guides/first-page-chat/) with the WebGPU provider — everything runs locally.
---
# The Provider Spectrum
**Trade-offs between local and cloud AI providers in Daneel.**
> Source: https://doc.daneel.injen.io/concepts/providers/index.md
Daneel supports five LLM providers, each with different trade-offs in quality, privacy, cost, and setup. This page helps you understand when to use which.
For configuration steps, see [Connect a Cloud Provider](/guides/connect-provider/). For the full technical specs, see the [AI Providers reference](/reference/providers/).
## The spectrum
Providers are arranged from most private (left) to most capable (right):
```
WebGPU → Gemini Nano → Ollama → Azure OpenAI → Claude
↑ ↑
100% local highest quality
zero cost per-token cost
smaller models largest models
```
There is no universally "best" provider. The right choice depends on what you're optimizing for.
## Provider comparison
| | WebGPU | Gemini Nano | Ollama | Azure OpenAI | Claude |
|---|---|---|---|---|---|
| **Privacy** | On-device | On-device | Local network | Your cloud | Third-party |
| **Quality** | Good (up to 3B models) | Basic (3B) | Good to excellent | Excellent | Excellent |
| **Cost** | Free | Free | Free | Azure pricing | Per-token |
| **Setup** | None | Chrome flag | Install Ollama | Azure subscription | API key |
| **Internet** | No | No | LAN only | Yes | Yes |
| **Tool calling** | Experimental | Experimental | Yes | Yes | Yes (best) |
| **Model variety** | 20+ models | 1 model | Thousands | Your deployments | 3 models |
## When to use each
### WebGPU — privacy above all
Choose WebGPU when:
- You need complete privacy with zero data leaving your machine
- You're working with sensitive or confidential content
- You don't have (or don't want to use) API keys
- You have a decent GPU (most modern integrated GPUs work)
Limitations: smaller models (up to ~3B parameters) mean lower quality on complex reasoning tasks. Tool calling is experimental and unreliable. That said, models like Bonsai 1.7B (q1) weigh just 291 MB while still supporting step-by-step reasoning, making WebGPU viable even on low-end hardware.
### Gemini Nano — zero-setup local AI
Choose Gemini Nano when:
- You want on-device inference with zero downloads
- You're on a Chrome version that supports the AI API
- Quality requirements are modest
Limitations: single model, no model choice, limited capabilities, experimental tool calling.
### Ollama — local power
Choose Ollama when:
- You want to run larger, more capable models locally
- Privacy matters but you're comfortable with localhost/LAN traffic
- You want to experiment with many different models
- You need reliable tool calling with local models
Limitations: requires installing and running the Ollama server. Resource-heavy for large models.
### Azure OpenAI — enterprise compliance
Choose Azure OpenAI when:
- Your organization requires Azure data residency
- You need enterprise-grade compliance and audit trails
- You have existing Azure OpenAI deployments
- Tool calling reliability matters
Limitations: requires Azure subscription and deployment setup.
### Claude — maximum quality
Choose Claude when:
- Response quality is the top priority
- You need the best tool calling experience with MCP servers
- You're comfortable sending prompts to Anthropic's API
- Cost per token is acceptable for your use case
Limitations: requires API key, internet connection, and costs money per token.
## Mixing providers
You can switch providers at any time from the chat panel dropdown. A common pattern:
- **Daily browsing:** WebGPU for quick, private page Q&A
- **Deep research:** Switch to Claude when you need high-quality synthesis across a large site index
- **Tool workflows:** Use Claude or Ollama when working with MCP-connected agents
Embedding always runs locally regardless of LLM provider. Your indexes and vaults work with any provider — switching only changes the AI that generates answers.
## Quality vs. model size
A rough rule of thumb for LLM quality:
- **< 1B parameters** — basic summarization, simple Q&A
- **1B–3B parameters** — good for most page Q&A and document chat (WebGPU default)
- **7B–13B parameters** — strong reasoning, reliable tool calling (Ollama sweet spot)
- **70B+ parameters** — near state-of-the-art (large Ollama models, Claude, Azure)
Larger models need more memory and compute. WebGPU is limited by browser GPU memory. Ollama can use system RAM for larger models but inference is slower.
Quantization also matters. A 1.7B model at standard q4 weighs ~1.1 GB, but at q1 (1-bit) it drops to 291 MB. Daneel supports q1 and q2 quantization for models that are designed for it, like PrismML's Bonsai family. Not every model benefits from extreme quantization, though: 1-bit works best when the model was trained for it from the start.
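The size figures above follow from simple arithmetic. As a back-of-envelope check (an approximation: real model files add metadata, and some tensors stay at higher precision, so treat this as a floor rather than the exact download size):

```typescript
// Lower bound on weight storage: parameters × bits-per-weight / 8 bytes.
// Real files (e.g. the ~1.1 GB q4 and 291 MB q1 figures in the text) sit
// somewhat above this floor because of embeddings and container overhead.
function weightBytes(params: number, bitsPerWeight: number): number {
  return (params * bitsPerWeight) / 8;
}

const GB = 1024 ** 3; // for converting the result to gibibytes
```

For a 1.7B-parameter model this gives 850 MB at 4 bits and ~212 MB at 1 bit, which is why dropping from q4 to q1 roughly quarters the download.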
---
# How RAG Works
**The retrieval-augmented generation pipeline behind Daneel's site search and document chat.**
> Source: https://doc.daneel.injen.io/concepts/rag/index.md
RAG (Retrieval-Augmented Generation) is the technique Daneel uses to answer questions about websites and documents. Instead of sending an entire site to the AI, Daneel finds the most relevant passages first, then asks the AI to answer based on those passages.
This is what powers [Site Search](/guides/first-site-index/) and [Document Vault](/guides/first-vault/).
## The pipeline
RAG in Daneel has four stages:
### 1. Content acquisition
**For sites:** Daneel discovers sitemaps, crawls pages (BFS up to configurable depth and page count), and extracts text using a three-strategy pipeline: Readability.js for articles, CSS cascade + Turndown for structured pages, plain-text fallback for everything else.
**For vaults:** You import files directly. Daneel converts them to text using format-specific converters (EdgeParse for PDFs, Mammoth for DOCX, Turndown for HTML).
**For pages:** The content script extracts the current page's text in real time. YouTube videos get special treatment — Daneel fetches the transcript via the InnerTube API.
### 2. Chunking
Raw text is split into overlapping chunks. Daneel uses recursive chunking (via Chonkie) with these defaults:
- **Chunk size:** 512 tokens
- **Overlap:** 64 tokens
Overlap ensures that important context near chunk boundaries isn't lost. You can adjust chunk size in [Settings > Indexes](/reference/settings/#indexes).
A single page produces up to 200 chunks (configurable up to 2,000). This cap prevents a single massive page from dominating the index.
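The overlap mechanic can be sketched in a few lines, using characters as a stand-in for tokens (Daneel's actual chunker is token-based and recursive, via Chonkie):

```typescript
// Character-based sketch of overlapping chunking. Each chunk shares `overlap`
// units with its predecessor, so context near a boundary appears in both.
function chunkText(text: string, size = 512, overlap = 64): string[] {
  const chunks: string[] = [];
  const step = size - overlap; // advance less than a full chunk each time
  for (let start = 0; start < text.length; start += step) {
    chunks.push(text.slice(start, start + size));
    if (start + size >= text.length) break; // last chunk reached the end
  }
  return chunks;
}
```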
### 3. Embedding
Each chunk is converted to a vector (a list of numbers) using an embedding model. Daneel runs this locally using the BGE Small EN v1.5 model on WebGPU:
- **384 dimensions** per vector
- **fp16 quantization** for GPU performance
- **Batch size of 32** to prevent GPU memory issues
The embedding model runs entirely in your browser. Even when you use Claude or Azure for the LLM, embeddings are always local.
Vectors are stored in IndexedDB, partitioned by domain (for sites) or vault ID (for documents).
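The batching described above amounts to a simple loop. In this sketch, `embed` stands in for the real BGE model call; only the batch size of 32 comes from this page:

```typescript
// Embed chunks in fixed-size batches so GPU memory stays bounded.
async function embedAll(
  chunks: string[],
  embed: (batch: string[]) => Promise<number[][]>, // stand-in for the model
  batchSize = 32,
): Promise<number[][]> {
  const vectors: number[][] = [];
  for (let i = 0; i < chunks.length; i += batchSize) {
    vectors.push(...await embed(chunks.slice(i, i + batchSize)));
  }
  return vectors;
}
```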
### 4. Retrieval + generation
When you ask a question:
1. Your question is embedded using the same model.
2. Cosine similarity finds the top-k most similar chunks (default: 50 candidates, narrowed to 15 source URLs).
3. A keyword boost (15% weight) supplements semantic similarity — this helps with exact term matches that embedding models sometimes miss.
4. Chunks scoring below a minimum threshold (0.6) are filtered out.
5. The top chunks are assembled into a prompt with source URLs.
6. The prompt + your question go to the active LLM provider for answer generation.
The AI sees the relevant context and source links, then generates a grounded response. You see the answer with clickable source references.
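The scoring in steps 2–4 can be sketched as follows. The 85/15 blend and the 0.6 threshold are the weights quoted above; the keyword score itself is a stand-in:

```typescript
// Cosine similarity between two vectors of equal length.
function cosine(a: number[], b: number[]): number {
  let dot = 0, na = 0, nb = 0;
  for (let i = 0; i < a.length; i++) {
    dot += a[i] * b[i];
    na += a[i] * a[i];
    nb += b[i] * b[i];
  }
  return dot / (Math.sqrt(na) * Math.sqrt(nb));
}

// Blend semantic similarity with a keyword score (15% weight), then
// drop anything below the minimum threshold.
function finalScore(semantic: number, keyword: number): number | null {
  const s = 0.85 * semantic + 0.15 * keyword;
  return s >= 0.6 ? s : null; // null → chunk is filtered out
}
```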
## Why local embedding matters
Because embedding runs locally, your documents and site content are never sent to a cloud service for indexing. The only time content reaches a cloud provider is during step 4, when the assembled prompt (selected chunks, not the full corpus) goes to the LLM. And even that step is optional — use WebGPU or Ollama to keep everything local.
For more on data flow, see [Privacy Model](/concepts/privacy/).
## GPU-accelerated search
For large indexes (50,000+ chunks), Daneel uses GPU-accelerated cosine similarity search. This keeps search times under 5ms even at scale, compared to sequential CPU search which would take noticeably longer.
## Trade-offs
**Chunk size** is a balance between context and precision. Larger chunks provide more context per result but may include irrelevant text. Smaller chunks are more precise but may miss surrounding context. The 512-token default works well for most content.
**Top-k** controls how many chunks the AI sees. More chunks give the AI more information but increase prompt size (and cost, for token-billed providers). The default of 15 source URLs strikes a balance.
**Embedding model** quality affects retrieval accuracy. BGE Small is compact and fast but less accurate than larger models. This is a deliberate trade-off for a browser extension where model size matters.
---
# Speech in Daneel
**Why three providers, how Kokoro stays fully local on WebGPU, and the design decisions behind gapless playback and strict preemption.**
> Source: https://doc.daneel.injen.io/concepts/speech/index.md
Speech is one of the few features where the same user action maps to three genuinely different trade-offs. Reading a reply aloud can stay entirely on your device, or it can stream text to a cloud service for better prosody, or it can land somewhere in between with OS-provided voices. Daneel exposes that choice honestly rather than picking for you.
This page explains the spectrum, why Kokoro gets to be the privacy-first option, and a few engineering decisions that shape how speech actually feels in use.
## The provider spectrum
Every text-to-speech provider in Daneel carries a privacy profile describing where your text goes. Two fields matter: `leavesProcess` (does the text cross the browser sandbox?) and `leavesMachine` (does it leave your device?).
| Provider | leaves process | leaves machine | Observer |
|---|---|---|---|
| Kokoro 82M | no | no | none |
| System voices (local) | yes | no | browser vendor |
| System voices (Google cloud) | yes | yes | browser vendor |
Kokoro is the only option where no component outside Daneel ever sees your text. System voices go through the browser's Speech Synthesis engine, which is part of the OS but not Daneel. Google cloud voices genuinely leave your machine for richer prosody. The three tiers are intentional, and they are visible in the settings panel as privacy pills.
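The two fields can be pictured as a small type. The field names come from this page; everything else in the sketch is illustrative:

```typescript
// Privacy profile carried by each text-to-speech provider.
interface TtsPrivacyProfile {
  leavesProcess: boolean; // does the text cross the browser sandbox?
  leavesMachine: boolean; // does the text leave your device?
}

const kokoro: TtsPrivacyProfile = { leavesProcess: false, leavesMachine: false };
const googleCloudVoice: TtsPrivacyProfile = { leavesProcess: true, leavesMachine: true };
```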
For the parallel spectrum applied to speech recognition, see the provider table in [Speech Reference](/reference/speech/). The short version: the browser recognizer streams to Google today; the coming Moonshine provider will close that gap with a local model.
## Why Kokoro is fp32 on WebGPU
Kokoro's ONNX weights ship in several quantizations: `fp32` (~326 MB), `fp16` (~165 MB), `q8` (~80 MB), and a few others. The instinct is to pick the smallest to minimize download. For WebGPU, the instinct is wrong.
Quantized ONNX models on WebGPU rely on dequantization ops that, as of writing, have no WebGPU kernel. ONNX Runtime quietly assigns those ops to CPU, which bounces tensors across the CPU/GPU boundary on every forward pass. The result is synthesis latency measured in tens of seconds for what should take one.
Daneel uses `fp32` on WebGPU because pure fp32 inference has no such ops and runs entirely on the GPU. The download is larger, but it is a one-time cost, and the runtime is 3 to 5 times faster. The kokoro-js library's README recommends the same combination. The smaller quantizations are a good match for WASM (CPU) execution, not for WebGPU.
If your hardware does not support WebGPU, Daneel falls back to the System voices provider. Kokoro is not a viable option on WASM-only devices today.
## Host-owned audio and gapless playback
A user-visible detail that only exists because of a specific architecture: audio keeps playing when you switch tabs.
The AudioContext that plays Kokoro's PCM is owned by Daneel's background host page, not by the tab you are browsing. When Kokoro produces a chunk of audio, the PCM is posted from the worker to the host page. The host page's AudioContext schedules the buffer on the WebAudio timeline at a specific absolute time, computed as `max(currentTime, end_of_previous_chunk)`. The hardware audio driver executes the schedule; reordering is physically impossible once a chunk is committed.
One consequence is that synthesis can run ahead of playback without introducing gaps. Daneel pipelines chunk N+1's synthesis while chunk N plays, with a backpressure limit of one chunk ahead. Another consequence is that navigating away from the tab where you started playback does not interrupt the audio. The AudioContext survives the tab switch.
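The scheduling rule is small enough to isolate. Here is a sketch of the timing math only; the real implementation schedules AudioBuffers on a host-page AudioContext, and the names below are illustrative:

```typescript
// Tracks the end of the last scheduled chunk. Each new chunk starts at
// max(now, end-of-previous), so playback never overlaps and never gaps.
class GaplessQueue {
  private nextStart = 0;

  /** Returns the absolute start time (seconds) for a chunk of `duration` seconds. */
  schedule(currentTime: number, duration: number): number {
    const start = Math.max(currentTime, this.nextStart);
    this.nextStart = start + duration;
    return start;
  }

  reset(): void {
    this.nextStart = 0;
  }
}
```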
## Strict preemption between messages
When you click Play on message A and, partway through, click Play on message B, you expect A to stop cleanly and B to start. No overlap, no lingering last sentence of A.
The implementation is mechanical rather than clever. Every new TTS request at the host first aborts every prior in-flight request and calls `reset()` on the playback queue. Only then does it register itself. The "latest wins" semantics are enforced at the host boundary, not negotiated between widget and host over a cancel round-trip.
An earlier iteration of this feature relied on the widget sending a `tts-cancel` before the new `tts-synthesize`, with host-side handling that assumed they arrived in order. They did not always, which produced an audible bug where message A finished reading while B was already halfway through. Moving the preemption to unconditional host-side abort removed the race.
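A sketch of that host-side rule using AbortController; the actual host code and message names are not shown here:

```typescript
// "Latest wins": every new request unconditionally aborts all prior in-flight
// requests before registering itself. No cancel round-trip to race against.
const inFlight = new Set<AbortController>();

function startSynthesis(run: (signal: AbortSignal) => Promise<void>): Promise<void> {
  for (const ctrl of inFlight) ctrl.abort(); // preempt everything already running
  inFlight.clear();

  const ctrl = new AbortController();
  inFlight.add(ctrl);
  return run(ctrl.signal).finally(() => inFlight.delete(ctrl));
}
```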
## Why the sanitizer and chunker are boring
Kokoro's style vector, which controls prosody, is offset into a voice tensor using the token length of the input. Very short inputs (a bare title, an eight-word heading) pick a low-offset region of the tensor where the prosody is unstable and can mangle or repeat the phrase. Slightly longer inputs pick the stable middle region.
The chunker has a boring job: take a markdown message, strip formatting, split on paragraph and sentence boundaries, and emit chunks in a bounded character range. One non-obvious detail is that when the first chunk is tiny (a heading preceding a paragraph), it gets forward-merged into the next chunk even if the combined length exceeds the soft target, up to a hard ceiling. This protects Kokoro from the unstable-prosody region.
Markdown never reaches Kokoro. Headings become plain text, code fences become `(Code block.)`, math becomes `(Math formula.)`, Mermaid diagrams become `(Diagram.)`. Kokoro sees only speakable prose.
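In the same boring spirit, a deliberately simplified sanitizer; these regexes are illustrative and far cruder than the real pipeline:

```typescript
// Replace unspeakable markdown constructs with short spoken placeholders
// and strip formatting, so only plain prose reaches the TTS model.
function sanitizeForTts(md: string): string {
  return md
    .replace(/`{3}[\s\S]*?`{3}/g, "(Code block.)")   // fenced code
    .replace(/\$\$[\s\S]*?\$\$/g, "(Math formula.)") // display math
    .replace(/^#{1,6}\s+/gm, "")                     // headings become plain text
    .replace(/\*{1,2}([^*]+)\*{1,2}/g, "$1")         // strip emphasis markers
    .trim();
}
```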
## Dictation and the permission model
The microphone button in the composer does not prompt the browser for permission until you click it. That is a deliberate decision: Daneel does not want to ask for hardware access at install time, because the user has not yet asked for speech recognition. When you click the mic, Chrome's permission prompt appears, and it is scoped to the extension origin, so you grant it once and never again.
The transcript does not auto-send. It lands in the composer input so you can see what was heard, correct anything, and send deliberately. In practice this catches dictation errors and sidesteps the awkwardness of accidental sends.
When [Offline Mode](/concepts/offline-mode/) is active, the mic button disables itself with a tooltip. The reason is that the default recognizer streams audio to Google; the network gate denies it on principle. When the local Moonshine option ships, the mic will stay enabled under Offline Mode with Moonshine selected, because that provider's privacy profile says the audio never leaves the machine.
## Future: voice locality on both sides
Kokoro closed the locality gap for text-to-speech. Moonshine is the same story for the other direction. Once the provider class, worker, and microphone capture pipeline are in place, selecting Moonshine in **Settings > Speech > Speech recognition** will give you a dictation experience that is indistinguishable from the cloud version, without any audio leaving your device.
The speech catalog is already set up to receive it. The UI path is already there. The remaining piece is the runtime.
---
# Connect a Cloud Provider
**Set up Claude, Ollama, or Azure OpenAI as your AI backend.**
> Source: https://doc.daneel.injen.io/guides/connect-provider/index.md
This tutorial walks you through connecting a cloud AI provider. By the end, you'll have switched from the default local model to a more powerful cloud backend.
Daneel AI works out of the box with WebGPU (local inference). But if you want higher-quality responses, you can connect a cloud provider. This tutorial covers the three main options.
## Option A: Claude (Anthropic API)
Claude is Anthropic's flagship model family. It offers the highest quality responses and supports native tool calling with MCP servers.
1. Open Daneel's settings (gear icon on the launcher).
2. Navigate to **Claude** in the sidebar.
3. Paste your Anthropic API key. The key is encrypted with AES-256-GCM and stored locally — it never leaves your browser unencrypted.
4. Select a model:
- **Claude Opus 4.7** — most capable, hybrid reasoning for coding and vision
- **Claude Opus 4.6** — previous flagship, same pricing as 4.7
- **Claude Sonnet 4.6** — balanced quality and speed
- **Claude Haiku 4.5** — fastest, lowest cost
5. Close settings. In the chat panel, switch the provider dropdown to **Claude**.
You're now chatting with Claude. You'll see a cost annotation next to each response showing token usage.
:::note
Claude requires an API key from [Anthropic's console](https://console.anthropic.com/). Usage is billed per token.
:::
## Option B: Ollama (local server)
Ollama runs open-source models on your machine. Responses stay on your local network — nothing reaches the internet.
1. [Install Ollama](https://ollama.com/) on your computer.
2. Pull a model: `ollama pull llama3.2` (or any model you prefer).
3. In Daneel's settings, navigate to **Ollama**.
4. Set the base URL (default: `http://localhost:11434`). Daneel auto-probes the connection.
5. Select a model from the dropdown — Daneel lists all models installed on your Ollama server.
6. Close settings and switch the provider dropdown to **Ollama**.
Ollama supports tool calling with MCP servers, model management (pull, delete), and think-block streaming.
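As a sketch of what the auto-probe amounts to: Ollama's documented `/api/tags` endpoint lists installed models. The error handling below is illustrative, not Daneel's actual code:

```typescript
// Probe a local Ollama server and list installed model names via /api/tags.
async function listOllamaModels(baseUrl = "http://localhost:11434"): Promise<string[]> {
  const res = await fetch(`${baseUrl}/api/tags`);
  if (!res.ok) throw new Error(`Ollama not reachable: HTTP ${res.status}`);
  const data = (await res.json()) as { models: { name: string }[] };
  return data.models.map((m) => m.name);
}
```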
## Option C: Azure OpenAI (enterprise)
For enterprise environments with Azure OpenAI Service deployments.
1. In Daneel's settings, navigate to **Azure OpenAI**.
2. Enter your Azure endpoint URL and deployment name.
3. Choose an authentication method:
- **API Key** — paste your Azure API key
- **Entra ID (OAuth2)** — authenticate via Microsoft identity
4. Select your deployed model.
5. Close settings and switch the provider to **Azure OpenAI**.
See [How to Set Up Azure OpenAI](/how-to/azure-openai/) for the detailed guide.
## Option D: Gemini Nano (Chrome built-in)
Gemini Nano is a small model built into Chrome. No downloads, no API keys.
1. Make sure you're on Chrome 120+ with the Gemini Nano flag enabled.
2. In Daneel's settings, navigate to **Gemini Nano**.
3. Daneel detects availability automatically. If available, select a language.
4. Switch the provider dropdown to **Gemini Nano**.
Gemini Nano runs on-device with no internet required, but it's a small model — expect lower quality than Claude or Ollama with larger models.
## Comparing providers
For a deeper comparison of trade-offs between local and cloud providers, see [The Provider Spectrum](/concepts/providers/).
## Next steps
- [Connect an MCP server](/how-to/mcp-server/) to give your AI access to external tools
- [Create a custom agent](/how-to/agents/) with a specialized prompt
- Read about the [privacy model](/concepts/privacy/) to understand the data flow for each provider
---
# Your First Page Chat
**Ask a question about any webpage in under a minute.**
> Source: https://doc.daneel.injen.io/guides/first-page-chat/index.md
This tutorial walks you through chatting with a webpage for the first time. By the end, you'll have asked a question about a page and received an AI-generated answer.
## 1. Navigate to a page
Open any article, documentation page, or blog post in Chrome. For this tutorial, pick something with a few paragraphs of text — a Wikipedia article works well.
## 2. Open Daneel AI
Look for the floating launcher bubble in the bottom-right corner of the page. Click the **sparkles icon** (Ask Site) to open the chat panel.
If you don't see the bubble, click the Daneel AI extension icon in the Chrome toolbar to open the side panel instead.
## 3. Ask a question
The chat panel opens with **Page** mode selected by default. Type a question about the page content and press **Enter**.
For example, on a Wikipedia article about space exploration:
> *What were the key milestones in the Apollo program?*
## 4. See the response
Daneel extracts the page content, sends it along with your question to the active AI model, and streams back a response. You'll see tokens appear in real time.
The response is formatted in Markdown with headings, lists, and tables rendered inline.
## 5. Continue the conversation
Ask follow-up questions. Daneel keeps the conversation context, so you can refer to previous answers:
> *Which of those milestones had the highest budget?*
## 6. Try a different page
Navigate to a new page and ask another question. The chat context resets per page in Page mode.
## What just happened
Daneel extracted the page content using a three-strategy pipeline (Readability.js, CSS cascade with Turndown, or plain-text fallback), assembled it into a prompt with your question, and streamed the response from your active AI provider.
Everything ran locally if you're using the default WebGPU backend. No data left your machine.
## Next steps
- [Index Your First Site](/guides/first-site-index/) to search across an entire website
- [Build a Document Vault](/guides/first-vault/) to chat with your own files
- Learn [how to chat with a YouTube video](/how-to/youtube-chat/) — Daneel extracts transcripts automatically
---
# Index Your First Site
**Crawl a website, build a vector index, and search it with AI.**
> Source: https://doc.daneel.injen.io/guides/first-site-index/index.md
This tutorial walks you through indexing a website and searching it with natural language. By the end, you'll have a local vector index you can query anytime, even offline.
## 1. Navigate to the site
Go to any website you want to index. Documentation sites, blogs, wikis, and knowledge bases all work well.
## 2. Open Site Search
Click the **magnifying glass icon** on the Daneel launcher bubble to open the search overlay.
## 3. Check for sitemaps
Daneel automatically checks the current domain for sitemaps. Two things can happen:
- **Sitemaps found** — the **Sitemap** discovery method is pre-selected. You'll see a checklist of discovered sitemaps with page counts.
- **No sitemap found** — Daneel switches to **Web Crawl**, which discovers pages by following links from your current page.
You can switch between the two methods at any time using the discovery method cards at the top of the panel. For a first test, use whichever Daneel selects automatically.
:::note
For details on when to choose one method over the other, see [How to Index a Site](/how-to/site-indexing/).
:::
## 4. Configure the crawl
Set your crawl parameters:
- **Max pages** — how many pages to crawl (1–200, default 50). Start small for your first run.
- **Depth** — how many levels deep to follow (1–10, default 3). For sitemaps this controls sitemap nesting depth; for web crawl it controls how many link hops from the starting page.
If you're using **Web Crawl**, you'll also see a **Path prefix** field. Daneel infers a prefix from your current URL to keep the crawl focused on the section you're browsing. You can edit or clear it.
For a first test, try 10–20 pages.
## 5. Start indexing
Click the **Crawl** button. Daneel begins the indexing pipeline:
1. **Discovers** page URLs (from the sitemap or by following links)
2. **Fetches** each page and extracts text content using Readability
3. **Splits** text into overlapping chunks
4. **Embeds** each chunk as a vector using your local embedding model
5. **Stores** everything in IndexedDB in your browser
A progress bar shows crawl and embedding progress. The task runs in the background, so you can close the panel or navigate away without losing progress.
## 6. Search the index
Once indexing completes, a search box appears. Type a natural language question:
> *How do I configure authentication?*
Daneel runs a vector similarity search across all indexed chunks, finds the most relevant passages, and assembles an AI-powered answer with links to the source pages.
## 7. Review results
Each result shows:
- The source page title and URL
- A relevance score
- A text excerpt from the matching chunk
Click any source link to jump directly to that page.
## What just happened
Daneel discovered the pages (via sitemap or link crawling), fetched each one through the background service worker, extracted clean text with Readability, chunked it, and embedded each chunk using the BGE Small model running on WebGPU. The vector index is stored in IndexedDB, partitioned by domain. Searches run cosine similarity (GPU-accelerated when available) and assemble the top results into a RAG prompt.
The entire index lives in your browser. Nothing was sent to any server (assuming you're on the WebGPU backend).
## Next steps
- [How to Index a Site](/how-to/site-indexing/) for choosing between sitemap and web crawl, path prefix filtering, and safety guards
- [Build a Document Vault](/guides/first-vault/) to chat with your own files
- [Manage Site Indexes](/how-to/manage-indexes/) to re-index, clear, or view stats
- [How RAG Works](/concepts/rag/) to understand the pipeline under the hood
---
# Build a Document Vault
**Import local documents and chat with them using AI.**
> Source: https://doc.daneel.injen.io/guides/first-vault/index.md
This tutorial walks you through creating a document vault, importing a file, and asking questions about it. By the end, you'll have a personal knowledge base you can query with AI.
## 1. Open the Vault
Click the **folder icon** on the Daneel launcher bubble to open the vault overlay.
## 2. Create a vault
If this is your first time, you'll see an empty vault list. Click **Create Vault** and give it a name — something like "Research Papers" or "Project Docs".
## 3. Import a document
Click the **Import** button inside your vault. You can:
- **Click** to open a file picker
- **Drag and drop** files directly onto the vault area
Supported formats: PDF, DOCX, TXT, HTML, PPTX, Excel (XLS, XLSX).
Pick a document — a PDF works well for this tutorial. Daneel converts it to text, splits it into chunks, generates embeddings, and stores everything locally.
You'll see the document appear in the vault with its name, format icon, and chunk count.
## 4. Ask a question
With your document imported, type a question in the chat input at the bottom of the vault overlay:
> *What are the main conclusions of this paper?*
Daneel searches the vault's vector index, finds the most relevant chunks, and generates an answer using your active AI model.
## 5. Import more documents
Add more files to the same vault. Daneel deduplicates by content hash (SHA-256), so importing the same file twice won't create duplicates.
## 6. Try the knowledge graph (optional)
If you want to visualize entity relationships across your documents:
1. Open the vault's settings (gear icon)
2. Enable **Knowledge Graph**
3. Daneel extracts named entities (people, organizations, places, concepts) using a local NER model and builds an interactive 3D graph
See [How to Build a Knowledge Graph](/how-to/knowledge-graph/) for the full guide.
## What just happened
Daneel converted your document to structured Markdown (using EdgeParse for PDFs, Mammoth for DOCX), chunked it into overlapping segments, and embedded each chunk with the BGE Small model on WebGPU. The vectors are stored in IndexedDB, partitioned by vault ID. Queries run semantic search over those vectors and feed the top matches into a RAG prompt.
Everything stays in your browser. Your documents are never uploaded anywhere.
## Free vs. paid limits
| | Free | Paid |
|---|---|---|
| Vaults | 1 | Unlimited |
| Documents per vault | 5 | 50 |
| Max file size | 1 MB | 10 MB |
| Max characters per doc | 50,000 | 500,000 |
[Upgrade your license](/how-to/offline/#license-activation) to unlock the full limits.
## Next steps
- [Connect a Cloud Provider](/guides/connect-provider/) for more powerful AI responses
- Learn [how to create a custom agent](/how-to/agents/) to specialize your vault's AI
- [Browse linked pages from a vault document](/how-to/vault-mini-browser/) — turn web-origin docs into a navigable surface
- Read about [the privacy model](/concepts/privacy/) to understand what stays local
---
# Installation
**How to install Daneel AI from the Chrome Web Store or as a packaged extension.**
> Source: https://doc.daneel.injen.io/guides/installation/index.md
Daneel AI is a Chrome extension. There are two ways to install it.
## Chrome Web Store
1. Visit the [Daneel AI listing](https://chromewebstore.google.com/) on the Chrome Web Store.
2. Click **Add to Chrome**.
3. Confirm the permissions prompt.
The extension icon appears in your toolbar. Click it to open the side panel, or look for the floating launcher bubble on any webpage.
## Beta testing (packaged extension)
If you received a `.crx` or `.zip` file for beta testing:
1. Open `chrome://extensions` in Chrome.
2. Enable **Developer mode** (toggle in the top-right corner).
3. Drag and drop the `.crx` file onto the page — or click **Load unpacked** and select the unzipped folder.
4. The extension installs immediately.
:::caution
Packaged extensions installed this way may show a "Developer mode extensions" warning on Chrome startup. This is normal for beta builds.
:::
## First launch
After installation, Daneel AI shows a one-time onboarding flow:
1. **Language** — pick your preferred language (English, French, Spanish, German, Italian).
2. **Feature tour** — quick overview of Page Chat, Site Search, and Document Vault.
3. **Provider setup** — choose your AI backend. WebGPU (local) is selected by default and works immediately. You can optionally configure Ollama, Claude, or Gemini Nano.
4. **License** — enter a license key if you have one, or continue on the free plan.
5. **Data import** — restore a previous backup if migrating from another device.
6. **Telemetry** — opt in or out of anonymous usage analytics.
You're ready to go. Head to [Your First Page Chat](/guides/first-page-chat/) to start using the extension.
## Requirements
- **Chrome 113+** (for WebGPU support). Chrome 120+ recommended.
- A GPU with WebGPU support for local inference. Most modern GPUs (integrated or discrete) work. Check `chrome://gpu` to verify.
- No server, no account, and no API key needed for the default local setup.
---
# Introduction
**What Daneel AI is, what it does, and who it's for.**
> Source: https://doc.daneel.injen.io/guides/introduction/index.md
Daneel AI is a Chrome extension that lets you chat with any website, any local document, or any MCP-connected tool — using AI that runs directly in your browser.
## What it does
Daneel AI has three core modes:
**Page Chat** — Ask questions about the page you're looking at. Daneel extracts the content (articles, docs, even YouTube transcripts), sends it to an AI model, and streams back an answer. No copy-pasting, no context switching.
**Site Search** — Index an entire website by its sitemap. Daneel crawls the pages, splits them into chunks, and builds a vector index stored locally in your browser. When you search, it finds the most relevant passages and assembles an AI-powered answer with source links.
**Document Vault** — Import your own files (PDF, DOCX, TXT, HTML, PPTX, Excel) into named vaults. Daneel embeds them locally and lets you chat with your documents the same way you chat with websites. You can also build a knowledge graph to visualize entity relationships.
## What makes it different
**Privacy by default.** The default AI backend (WebGPU) runs entirely in your browser. Your data never leaves your machine unless you choose a cloud provider.
**No API keys required.** Daneel works out of the box with local inference. Cloud providers like Claude, Ollama, and Azure OpenAI are opt-in.
**Multiple AI backends.** Switch between five LLM providers depending on your needs:
| Provider | Runs on | Internet needed? |
|----------|---------|-----------------|
| WebGPU | Your GPU, in-browser | No |
| Ollama | Local server | No (LAN only) |
| Gemini Nano | Chrome built-in | No |
| Claude | Anthropic API | Yes |
| Azure OpenAI | Azure cloud | Yes |
**MCP tool calling.** Connect remote services (Stripe, Notion, Vercel, Supabase, and more) via the Model Context Protocol. The AI can call tools autonomously across multi-turn conversations.
**Agents.** Define structured personas with custom system prompts and bound MCP servers for specialized workflows.
## Who it's for
- Researchers who want to query documentation sites without leaving the browser
- Developers who want an AI assistant connected to their tools (Stripe, Vercel, GitHub)
- Privacy-conscious users who want local-first AI with no data leaving their machine
- Teams who need document Q&A over internal files without uploading them to a third party
## Next steps
- [Install Daneel AI](/guides/installation/) from the Chrome Web Store
- Follow the [Your First Page Chat](/guides/first-page-chat/) tutorial to get started in under a minute
---
# Unlock Paid Features
**Purchase a Daneel AI license with Stripe and activate paid features with a one-time payment.**
> Source: https://doc.daneel.injen.io/guides/unlock-paid/index.md
Daneel is free to use for most things; a small set of features is gated behind a one-time license purchase. This tutorial walks you through buying a license and activating it on your machine. No account, no subscription — a single payment delivers a key that works across all your browsers and devices.
Which features are paid changes over time as Daneel evolves. [How Licensing Works](/concepts/licensing/) explains the model; the current list of what's gated is visible in the extension itself — any locked feature is marked as such in its own panel. You don't need to decide up front.
## 1. Open the License panel
1. Click the Daneel launcher on any page to open the widget.
2. Click the gear icon to open settings.
3. In the sidebar, click **License**.
You'll see your current status. If this is a fresh install, the panel shows "Free plan — Some features are locked."
## 2. Click Unlock
Click the **Unlock** card. A new tab opens with a Stripe-hosted checkout page. Fill in your email and card details, then complete the purchase.
:::note
Daneel never sees your card details. Payment is handled entirely by Stripe, and only the final "a purchase happened" event and the email you used at checkout are relayed to the Daneel backend.
:::
## 3. Wait for auto-activation
After payment, Stripe redirects you to a success page. Within a few seconds, the extension detects the redirect, fetches your license key, and activates it automatically. You'll see your license panel flip to "Premium active" with a green masked preview of your key, e.g. `DAN-XXXX-..-D78B`.
## 4. Check your email
You will receive two emails right after payment:
- **A receipt from Stripe** at `receipts@stripe.com`, containing your receipt number (format `XXXX-XXXX`).
- **A license-key email from Daneel** at `noreply@daneel.injen.io`, containing your full `DAN-XXXX-XXXX-XXXX` key in plaintext.
Save both. The license key is your identity — you'll need it to activate Daneel on a second machine or after a reinstall. The receipt number is your second factor for recovering a lost key.
:::caution
Daneel has no account you can log into and no password reset. Losing both the key and the receipt number means losing automated recovery. Store them in a password manager like any other credential.
:::
## 5. Verify activation
Back in the License panel, you should now see:
- **Premium active** with a green unlocked-lock icon
- Your plan name
- The masked license key
- A "Refreshes in Nd" line — the extension refreshes the token automatically before it expires, so you never need to re-enter your key
Paid features are now unlocked across all your browsing sessions, and the extension keeps working offline using its cached token.
## Next steps
- Learn how the licensing model works, what stays local, and why features keep working offline: [How Licensing Works](/concepts/licensing/).
- Use the same key on another computer, recover a lost key, or switch licenses: [How to Manage Your License Key](/how-to/manage-license/).
---
# How to Create a Custom Agent
**Define specialized AI personas with custom prompts and bound MCP servers.**
> Source: https://doc.daneel.injen.io/how-to/agents/index.md
Agents are structured configurations that give the AI a specific persona, task focus, and optional tool access. This guide shows how to create and use them.
## Prerequisites
- Daneel AI installed
- (Optional) One or more [MCP servers connected](/how-to/mcp-server/) if you want the agent to use tools
## Create an agent
1. Open **Settings > Agents**.
2. Click **Create Agent**.
3. Fill in the agent definition:
- **Name** — a short label (e.g., "Billing Helper", "Code Reviewer")
- **Purpose** — one-line description of what this agent does
- **System prompt** — the full instructions the AI receives. This is where you define the persona, constraints, output format, and behavior. Write it as if you're briefing the AI directly.
- **MCP servers** — select which connected MCP servers this agent can use. The agent only has access to the servers you bind here.
4. Click **Save**.
## Attach an agent to a vault
Agents can be attached to document vaults for specialized document Q&A:
1. Open the vault overlay.
2. Click the **Agent** tab in your vault.
3. Select an agent from the dropdown.
When attached, the agent's system prompt and MCP servers are used for all conversations within that vault.
:::note
A vault uses either an agent or standalone MCP servers — not both. Attaching an agent replaces any directly-bound MCP servers.
:::
## Use an agent in chat
Agents are also available in regular chat conversations:
1. Open the chat panel.
2. Select an agent from the agent picker at the top.
3. The AI now responds according to the agent's system prompt and has access to the agent's bound tools.
## Example: Stripe billing agent
Here's a practical example:
- **Name:** Billing Support
- **Purpose:** Answer billing questions using Stripe data
- **System prompt:**
```
You are a billing support specialist. When asked about invoices,
subscriptions, or payments, use the Stripe tools to look up real data.
Always include invoice IDs and amounts in your responses.
Format currency as USD with two decimal places.
If you can't find a customer, ask for their email address.
```
- **MCP servers:** Stripe
Now you can ask: *"What's the current subscription status for acme@example.com?"* and the agent will query Stripe and respond in the defined format.
## Configure context injection
Agents can receive environment context (your location and current date/time) in their system prompt. This is useful for location-aware or time-sensitive workflows.
In the agent editor, scroll to the **Context** section:
- **Uses location** — when checked, the agent receives your city (e.g., "Lyon, France") in its prompt. Requires "Share location with agents" to be enabled in Settings > Privacy.
- **Uses date & time** — when checked, the agent receives the current date, time, and timezone.
Both fields support three states: checked (always inject), unchecked (never inject), and indeterminate (inherit from bound MCP servers). Click through to cycle states.
When set to indeterminate, the agent inherits from its bound MCP servers — if any server has the `location` or `datetime` badge active in Settings > MCP, the agent gets that context automatically.
See [Environment Context](/concepts/context-injection/) for the full architecture and privacy details.
## Edit or delete agents
- In **Settings > Agents**, click an agent to edit its configuration.
- Click **Delete** to remove an agent. If it's attached to a vault, the vault reverts to no agent.
## Next steps
- [Connect MCP servers](/how-to/mcp-server/) to expand what your agents can do
- Read about [MCP and tool calling](/concepts/mcp/) to understand how agents invoke tools
- See the [AI Providers reference](/reference/providers/) for tool calling support per provider
---
# How to Set Up Azure OpenAI
**Connect Daneel AI to your Azure OpenAI Service deployment.**
> Source: https://doc.daneel.injen.io/how-to/azure-openai/index.md
Azure OpenAI gives you access to OpenAI models hosted in your own Azure tenant. This guide covers both authentication methods.
## Prerequisites
- An Azure subscription with Azure OpenAI Service enabled
- A deployed model (e.g., GPT-4o, GPT-4 Turbo) in your Azure OpenAI resource
- The endpoint URL and deployment name from the Azure portal
## Configure with API key
1. Open **Settings > Azure OpenAI** in Daneel.
2. Enter your **Endpoint URL** (e.g., `https://your-resource.openai.azure.com/`).
3. Enter your **Deployment name** (the name you gave your model deployment).
4. Select **API Key** as the auth method.
5. Paste your Azure API key.
6. Click **Save**.
Switch the provider dropdown in the chat panel to **Azure OpenAI** to start using it.
## Configure with Entra ID (OAuth2)
For environments that require Azure Active Directory authentication:
1. Open **Settings > Azure OpenAI** in Daneel.
2. Enter your **Endpoint URL** and **Deployment name**.
3. Select **Entra ID** as the auth method.
4. Daneel initiates an OAuth2 flow via Chrome's identity API.
5. Sign in with your Microsoft account and consent to the required permissions.
Daneel handles token refresh automatically.
## Tool calling
Azure OpenAI supports MCP tool calling using the OpenAI function calling format. Once configured, [connected MCP servers](/how-to/mcp-server/) work the same as with Claude or Ollama.
## Next steps
- [Connect MCP servers](/how-to/mcp-server/) to give Azure OpenAI access to external tools
- See the [AI Providers reference](/reference/providers/) for a comparison of all backends
- Read about [the provider spectrum](/concepts/providers/) to understand trade-offs
---
# Monitor Background Tasks
**How to track, pause, resume, and cancel long-running operations like site crawls, vault indexing, and knowledge graph builds.**
> Source: https://doc.daneel.injen.io/how-to/background-tasks/index.md
Long-running operations in Daneel — site crawls, vault indexing, knowledge graph builds — run in the background. You can close panels, switch tabs, and navigate freely without losing progress. The Settings > Tasks panel lets you monitor and control these operations.
## Open the task monitor
1. Click the Daneel widget icon to open the extension
2. Open **Settings** (gear icon)
3. Select **Tasks** in the sidebar (clock icon, between Data Backup and AI Models)
The panel shows two sections: **Active** tasks and **History**.
## Active tasks
Each running task shows:
- **Task name** — the site hostname or vault name
- **Task type** — Site Crawl, Vault Indexing, or Knowledge Graph
- **Progress bar** — percentage complete
- **Progress label** — current step (e.g., "Embedding page 12 of 50...")
- **Duration** — how long the task has been running
- **ETA** — estimated time remaining (extrapolated from current progress rate)
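The ETA extrapolation can be sketched as a simple rate projection. This is illustrative only, not Daneel's actual implementation; the function name and signature are hypothetical:

```js
// Illustrative only: project total time from elapsed time and fraction done.
// Not Daneel's actual code; names are hypothetical.
function estimateEtaMs(startedAtMs, nowMs, fractionDone) {
  if (fractionDone <= 0) return null;            // no progress data yet
  const elapsed = nowMs - startedAtMs;
  const projectedTotal = elapsed / fractionDone; // assume a steady rate
  return Math.round(projectedTotal - elapsed);   // ms remaining
}
```

For example, a crawl that is 25% done after 60 seconds projects roughly 180 seconds remaining.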
### Transport controls
Each active task has four icon buttons:
| Button | Action |
|--------|--------|
| **Pause** (two vertical bars) | Stops the task at its current checkpoint. The host tab releases resources. |
| **Resume** (play triangle) | Restarts a paused task from its last checkpoint. Skips already-completed work. |
| **Stop** (square) | Cancels the task permanently. Partial results (pages already crawled, documents already embedded) are kept. |
| **Delete** (trash) | Cancels the task (if active) and removes it from the list entirely. |
### Queued tasks
If you start a second task while one is already running, it shows as **"Waiting for GPU..."** with an amber dot. Daneel runs one GPU-heavy task at a time to prevent crashes. The queued task starts automatically when the current one finishes.
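That one-at-a-time scheduling can be sketched as a minimal queue. This is a hypothetical illustration; the real task manager also persists task records and handles pause/resume:

```js
// Hypothetical sketch: run GPU-heavy tasks strictly one at a time.
class GpuTaskQueue {
  constructor() {
    this.pending = [];   // tasks shown as "Waiting for GPU..."
    this.running = false;
  }
  enqueue(task) {        // task: async () => void
    this.pending.push(task);
    this.#drain();
  }
  async #drain() {
    if (this.running) return;       // the current task keeps the GPU
    this.running = true;
    while (this.pending.length) {
      await this.pending.shift()(); // next starts when the previous finishes
    }
    this.running = false;
  }
}
```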
## History
Completed, failed, and cancelled tasks appear in the History section, sorted newest first. Each entry shows:
- Status (green dot for completed, red for failed, grey for cancelled)
- Task name and type
- Total duration
- How long ago it finished
Failed tasks include an error message you can use for troubleshooting.
### Clearing history
- **Clear all** — click "Clear all" next to the History heading to remove all history entries
- **Delete one** — click the trash icon on any individual history entry
History entries are also cleaned up automatically after 24 hours.
## Starting tasks
You don't start tasks from the Tasks panel — you start them from their respective UIs:
- **Site crawls** — open the Search overlay on any website, configure sitemaps and page limits, click "Start Crawl"
- **Vault indexing** — open the Vault panel, import files into a vault. Embedding starts automatically after conversion.
- **Knowledge graph builds** — open the Vault panel, scroll to the Knowledge Graph section, click "Build" or "Update"
Once started, all three task types appear in Settings > Tasks regardless of which panel triggered them.
## Recovering from interruptions
If the extension's service worker is evicted (Chrome routinely suspends inactive extension workers), or if you close and reopen Chrome entirely, tasks resume automatically within about 60 seconds. You may notice a brief pause in the progress bar, then it picks up where it left off.
For site crawls, already-crawled pages are not re-fetched. For knowledge graphs, already-processed chunks are detected via incremental mode and skipped.
To understand the technical details behind this, see [Background Tasks (concepts)](/concepts/background-tasks/).
---
# How to Back Up Your Data
**Export your settings, indexes, and vaults to a local file or cloud storage.**
> Source: https://doc.daneel.injen.io/how-to/cloud-backup/index.md
Daneel stores everything in your browser's local storage and IndexedDB. This guide covers how to back up and restore that data.
## Local backup (ZIP file)
### Export
1. Open **Settings > Data Backup**.
2. Click **Export**.
3. Daneel packages your settings, vault data, indexes, agents, and MCP server configurations into a `.zip` file.
4. Save the file to your computer.
### Import
1. Open **Settings > Data Backup**.
2. Click **Import** and select a previously exported `.zip` file — or drag and drop it.
3. Daneel validates the archive and restores your data. A progress bar shows the import status.
:::caution
Importing a backup overwrites your current data. Export first if you want to keep your existing configuration.
:::
## Cloud backup: Azure Blob Storage
Back up to Azure Blob Storage using a SAS (Shared Access Signature) URL. No Azure SDK or dependencies required.
1. In the Azure portal, generate a SAS URL for a blob container with read and write permissions.
2. Open **Settings > Data Backup**.
3. Paste the SAS URL in the **Azure Blob Storage** section.
4. Click **Upload**. Your backup is stored as `daneel.backup.zip` in the container.
5. To restore, click **Download** to pull the latest backup.
The SAS URL is stored locally and excluded from data exports for security.
## Cloud backup: S3-compatible storage
Back up to any S3-compatible service (AWS S3, Cloudflare R2, Backblaze B2, MinIO).
1. Open **Settings > Data Backup**.
2. In the **S3-Compatible Storage** section, enter:
- **Access Key ID**
- **Secret Access Key**
- **Bucket name**
- **Region** (e.g., `us-east-1`)
- **Endpoint** (optional, for non-AWS services like R2 or MinIO)
3. Click **Upload**. Your backup is stored as `daneel.backup.zip`.
4. To restore, click **Download**.
Signing uses SigV4 via the Web Crypto API — no AWS SDK needed.
## What's included in a backup
- All user settings and preferences
- Vault definitions and document metadata
- Indexed site data and embeddings
- Agent configurations
- MCP server registrations
- Knowledge graph data
**Not included:** API keys and cloud storage credentials (for security).
## Next steps
- Read about the [privacy model](/concepts/privacy/) to understand where your data lives
- See [Storage and Limits](/reference/storage/) for details on what's stored where
---
# How to Use Environment Context
**Enable geolocation and datetime injection so your agents and tools know where and when you are.**
> Source: https://doc.daneel.injen.io/how-to/context-injection/index.md
This guide shows how to set up location and datetime context injection for agents and MCP tools.
## Prerequisites
- Daneel AI installed
- (Optional) An [agent](/how-to/agents/) or [MCP server](/how-to/mcp-server/) configured
## Enable datetime injection
Datetime injection is on by default. To verify or change it:
1. Open **Settings > Privacy**.
2. Check that **Share date & timezone with agents** is toggled on.
When enabled, every prompt sent to the AI includes the current date, time, and IANA timezone. No permission is needed.
## Enable geolocation
Geolocation is off by default and requires a one-time browser permission:
1. Open **Settings > Privacy**.
2. Toggle **Share location with agents** on.
3. The browser will briefly switch to an extension tab and show a permission prompt: "Daneel AI wants to know your location."
4. Click **Allow**.
5. Focus returns to your page automatically.
If you click **Block**, the toggle stays off. You can retry later or enable location in Chrome's site settings for the extension.
:::note
Location is resolved at city level using WiFi/IP positioning (not GPS). The coordinates are sent to OpenStreetMap's Nominatim service once per session and cached in memory.
:::
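A reverse-geocode request of that kind could look like this. The parameter choices below (coordinate rounding, `zoom` level) are illustrative assumptions, not Daneel's exact request:

```js
// Illustrative reverse-geocode URL for Nominatim; parameters are assumptions.
function nominatimReverseUrl(lat, lon) {
  const params = new URLSearchParams({
    lat: lat.toFixed(3),  // coarse coordinates: city-level is the goal
    lon: lon.toFixed(3),
    format: "jsonv2",
    zoom: "10",           // ask for roughly city-level granularity
  });
  return `https://nominatim.openstreetmap.org/reverse?${params}`;
}
```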
## Mark an MCP server as location-aware
If you use an MCP server whose tools benefit from knowing your location (e.g., Google Maps, a weather API):
1. Open **Settings > MCP**.
2. Find the server in the **Registered Servers** list.
3. Click the `location` badge next to the server name. It turns blue when active.
4. Optionally click the `datetime` badge too (though datetime is injected by default for all MCP-connected flows).
Any agent that binds this server will automatically inherit the location requirement — no per-agent configuration needed.
## Configure an agent with context overrides
If you want an agent to always receive location regardless of its MCP servers, or to opt out of datetime:
1. Open **Settings > Agents**.
2. Create or edit an agent.
3. Scroll to the **Context** section at the bottom of the editor.
4. Set the checkboxes:
- **Uses location** — three states: checked (always inject), unchecked (never inject), indeterminate (inherit from servers)
- **Uses date & time** — same three states
Click through the checkbox to cycle: indeterminate (inherit) → checked (force on) → unchecked (force off) → indeterminate.
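The cycle behaves like a simple three-state rotation. A hypothetical helper, for illustration only:

```js
// Hypothetical sketch of the three-state checkbox cycle.
const STATES = ["inherit", "on", "off"]; // indeterminate → checked → unchecked
function nextState(current) {
  return STATES[(STATES.indexOf(current) + 1) % STATES.length];
}
```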
## Example: travel planner agent
A practical configuration for a location-aware agent:
1. **Enable geolocation** in Settings > Privacy.
2. **Create an agent** in Settings > Agents:
- **Name:** Travel Planner
- **Purpose:** Plan trips and find local attractions
- **Persona:** "You are an experienced travel planner who knows local customs, transportation, and dining."
- **Task:** "Help users plan trips, find nearby attractions, and suggest restaurants. Always consider their current location and local time when making recommendations."
- **Context:** Uses location = checked, Uses date & time = checked
3. **Attach the agent** to a chat conversation.
4. Ask: "What restaurants are open near me right now?"
The AI will see your city and current time in its context and respond accordingly.
## Verify what the AI sees
To confirm context injection is working, ask the agent directly:
> "What is my current location and the current date and time?"
The AI will report the injected values from the `## Environment Context` section of its system prompt.
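The injected section looks roughly like this; the exact wording and field names are illustrative, and which lines appear depends on your settings:

```
## Environment Context
Location: Lyon, France
Date: 27 April 2026
Time: 14:32 (Europe/Paris)
```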
## Troubleshooting
**Location not appearing:**
- Check that "Share location with agents" is on in Settings > Privacy
- Check that the agent has "Uses location" checked, or that a bound MCP server has the `location` badge active
- Verify you granted browser geolocation permission (check Chrome's address bar for the location icon)
**Permission prompt never appeared:**
- The prompt shows on the extension's host tab. If it was blocked by a popup blocker, navigate to `chrome://settings/content/location` and allow the extension
**Wrong city displayed:**
- Nominatim resolves based on WiFi/IP positioning, which can be approximate. This is normal — the goal is city-level context, not street-level precision.
## Next steps
- [Environment Context](/concepts/context-injection/) — how the three-tier architecture works
- [Privacy Model](/concepts/privacy/) — what data leaves your machine
- [MCP and Tool Calling](/concepts/mcp/) — how agents use tools with context
---
# How to Set Up the Docker Companion
**Generate a Docker Compose stack for local MCP servers and Ollama.**
> Source: https://doc.daneel.injen.io/how-to/docker-companion/index.md
The Docker Companion generates a `docker-compose.yml` that bridges local MCP servers (designed for stdio) to HTTP/SSE endpoints that Daneel can connect to — alongside optional Ollama for local LLM inference.
## Prerequisites
- [Docker Desktop](https://www.docker.com/products/docker-desktop/) installed and running
- Basic familiarity with Docker Compose
## Configure presets
Open **Settings > Docker** in Daneel. The configuration is organized as presets:
### Companion sidecar (always included)
The Companion sidecar provides service discovery and health checks by monitoring the Docker socket. It's always included in the generated compose file.
- **Port:** 8809 (default)
- **Image:** `ghcr.io/daneel-ai/companion:latest`
### Ollama (optional)
Enable the Ollama preset to include a local Ollama container:
- **Port:** 11434 (default)
- **Image:** `ollama/ollama:latest`
- CORS is pre-configured for Chrome extension access
### MCP Servers (optional, beta)
Add local MCP servers from the template catalog:
| Template | Description |
|----------|-------------|
| Filesystem | Browse and search local files |
| GitHub | Repository operations |
| SQLite | Query local databases |
| PostgreSQL | Query PostgreSQL databases |
| Brave Search | Web search |
| Puppeteer | Browser automation |
| Everything | Local file search (Windows) |
| Memory | Persistent knowledge store |
You can also add custom servers by entering the command (e.g., `npx -y my-mcp-server`).
Each MCP server runs in a `supercorp/supergateway` container that bridges stdio to SSE. Ports are assigned sequentially starting from 8810.
## Export and run
1. Review the YAML preview in the settings panel.
2. Click **Export**. Daneel:
- Downloads the `daneel.compose.yml` file
- Auto-registers each MCP server as `http://localhost:{port}/sse` in Daneel's MCP server list
3. Run the stack:
```bash
docker compose -f daneel.compose.yml up -d
```
4. Daneel detects the running containers via the Companion sidecar's service discovery endpoint.
## How it works
The `supercorp/supergateway` image wraps any stdio-based MCP server as an HTTP/SSE endpoint:
```
Your MCP server (stdio) → supergateway container → HTTP/SSE on localhost:{port}
                                                            ↑
                                         Daneel connects here, just like
                                         any remote MCP server
```
Each container runs supergateway with `--stdio "" --port 8000 --cors`, with the server's stdio command filled into the `--stdio` argument.
For Python-based servers, the `:uvx` variant of supergateway is used automatically.
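Putting that together, a generated service entry plausibly looks like the fragment below. This is an illustrative sketch: the stdio command, service name, and port assignments depend on your selections and may differ from the actual generated file:

```yaml
services:
  mcp-filesystem:
    image: supercorp/supergateway
    # The stdio command here is an example; Daneel fills in the real one.
    command: --stdio "npx -y @modelcontextprotocol/server-filesystem /data" --port 8000 --cors
    ports:
      - "8810:8000"   # first MCP server gets host port 8810
```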
## Next steps
- [Connect MCP servers](/how-to/mcp-server/) once your stack is running
- See the [AI Providers reference](/reference/providers/) for Ollama configuration details
---
# How to Explore Your Knowledge Graph
**Use the analytics insights panel, path finder, and Wikipedia lookup to make sense of large knowledge graphs.**
> Source: https://doc.daneel.injen.io/how-to/explore-knowledge-graph/index.md
Once you've built a knowledge graph from your vault, the analytics layer turns the visualization into a research workbench. This guide shows you how to use the insights panel, path finder, sizing modes, and one-click Wikipedia lookup.
## Prerequisites
- A vault with a built [knowledge graph](/how-to/knowledge-graph/)
- The graph must have entities and connections (single isolated entities won't surface much)
## Open the insights panel
With the knowledge graph view active, click the **chart icon** in the toolbar (top-right of the canvas, next to the document selector). The Insights panel slides in from the right.
The first time you open it, Daneel computes all analytics in one pass. For graphs of a few thousand entities this takes under a second; larger graphs show an "Analyzing graph..." progress indicator.
The panel has four collapsible cards. **Key Entities** and **Graph Health** are open by default; **Topics** and **Bridges** are collapsed.
## Find your most important entities
The **Key Entities** card lists the top 12 entities ranked by structural importance — entities that are connected to other well-connected entities (not just entities mentioned the most often).
Each row shows:
- Rank number
- A colored dot for the entity type
- The entity name
- The type label
- A bar showing the relative importance score
**Click any entity** to focus the 3D view on that entity's neighborhood (the same as clicking it in the graph). The Wikipedia panel opens automatically with matching articles.
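Conceptually, "connected to other well-connected entities" is the PageRank idea. A toy version over an adjacency list, for illustration only (not Daneel's implementation):

```js
// Toy PageRank over an adjacency list { node: [neighbors...] }.
// Illustrates the "Key Entities" ranking idea; not Daneel's code.
function pagerank(adj, iterations = 30, damping = 0.85) {
  const nodes = Object.keys(adj);
  let rank = Object.fromEntries(nodes.map((n) => [n, 1 / nodes.length]));
  for (let i = 0; i < iterations; i++) {
    const next = Object.fromEntries(nodes.map((n) => [n, (1 - damping) / nodes.length]));
    for (const n of nodes) {
      const out = adj[n];
      if (!out.length) continue;                // dangling node: drop its mass
      for (const m of out) next[m] += (damping * rank[n]) / out.length;
    }
    rank = next;
  }
  return rank;
}
```

A node linked from other well-ranked nodes ends up ranked higher than one with the same number of links from obscure nodes.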
## Discover topics
Click the **Topics** card header to expand it. Daneel groups entities into topical clusters by analyzing which entities frequently appear together in the same documents.
Each topic shows:
- A colored dot (the topic's color in the graph when in "Color: Topic" mode)
- An auto-generated label using the top two entities (e.g., "Einstein & Bohr")
- The number of entities in the topic
- A small bar showing the type distribution within the topic
**Click any topic** to filter the 3D view to only show that cluster — non-topic entities are hidden, leaving a clean view of just the topic and its internal connections.
To return to the full graph, click the **Reset view** button (bottom-right of the canvas).
Topics with fewer than 2 entities are filtered out of the list to reduce noise.
## Find bridge entities
The **Bridges** card lists entities that connect otherwise separate parts of the graph. These are the structural "connectors" — remove them and the graph would fragment.
Bridges are useful for spotting:
- Researchers who span multiple fields
- Concepts that link different topics
- Institutions that bridge geographies
- Anything that acts as a hub between communities
Click any bridge entity to focus on its neighborhood in the 3D view.
## Check graph health
The **Graph Health** card shows a traffic-light status:
- **Good** (green): the graph is well-connected, with most entities forming a single cluster
- **Fair** (amber): some gaps detected — multiple separate clusters
- **Poor** (red): heavily fragmented, many disconnected pieces
Below the status, four metrics:
- **Clusters**: total number of disconnected components
- **Isolated**: entities with no connections at all
- **Main cluster**: percentage of entities in the largest connected component
- **Density**: how interconnected the graph is overall
If the status is amber or red, the card also shows a **Possible duplicates** section — entity pairs that look similar enough to suggest entity resolution missed them (e.g., "luminiferous ether" and "luminiferous aether"). These are candidates for manual cleanup.
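As an illustration of how near-duplicates can be flagged, a bigram-overlap (Dice) similarity catches pairs like the ether/aether example. Daneel's actual duplicate heuristic may differ:

```js
// Toy near-duplicate check via bigram overlap (Dice coefficient).
// Illustrative only; not Daneel's actual heuristic.
function bigrams(s) {
  const counts = new Map();
  const t = s.toLowerCase();
  for (let i = 0; i < t.length - 1; i++) {
    const b = t.slice(i, i + 2);
    counts.set(b, (counts.get(b) || 0) + 1);
  }
  return counts;
}

function diceSimilarity(a, b) {
  const A = bigrams(a), B = bigrams(b);
  let overlap = 0;
  for (const [g, n] of A) overlap += Math.min(n, B.get(g) || 0);
  const total = (a.length - 1) + (b.length - 1);
  return total > 0 ? (2 * overlap) / total : 0;
}
```

Scores near 1 suggest the two names refer to the same entity.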
## Find a connection between two entities
Click the **path finder icon** (looks like two connected circles) in the toolbar. The Path Finder panel replaces the Insights panel.
1. Type into the **From** field and select a source entity from the autocomplete dropdown
2. Type into the **To** field and select a target entity
3. Click **Find connection**
If a path exists, Daneel shows:
- The shortest chain of entities connecting the two
- The number of hops and total connection weight
- Edge provenance (which co-occurrence count established each link)
- An expandable **alternative paths** section if multiple shortest paths exist
The path is highlighted in the 3D view — path entities are colored, everything else is dimmed.
If no path exists, the entities are in different clusters and aren't connected via your corpus.
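Under the hood this is a shortest-path search. A minimal unweighted BFS sketch; the real path finder also tracks edge weights, provenance, and alternative paths:

```js
// Toy BFS shortest path over an adjacency list; illustrative only.
function shortestPath(adj, from, to) {
  const prev = new Map([[from, null]]);
  const queue = [from];
  while (queue.length) {
    const n = queue.shift();
    if (n === to) {
      const path = [];
      for (let cur = to; cur !== null; cur = prev.get(cur)) path.unshift(cur);
      return path;
    }
    for (const m of adj[n] || []) {
      if (!prev.has(m)) { prev.set(m, n); queue.push(m); }
    }
  }
  return null; // different clusters: no connection in the corpus
}
```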
## Change what node size represents
By default, node size reflects **mention count** — frequently mentioned entities appear larger. The toolbar dropdown lets you switch:
- **Size: Mentions** — original behavior, biggest nodes are most-mentioned
- **Size: Importance** — biggest nodes are structurally important (PageRank)
- **Size: Bridges** — biggest nodes are bridge entities (betweenness)
- **Size: Connectivity** — biggest nodes have the most connections (degree)
Switching modes preserves the layout — only the visual sizes change, so you can compare the same graph from different angles.
The sizing uses a value-based curve (not a flat ranking), so dramatic outliers stay dramatic. A handful of high-PageRank entities will tower over the rest, just as in the underlying data.
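One way to implement value-based sizing is to scale the radius by the metric value itself rather than by rank position. An illustrative sketch, not Daneel's exact curve:

```js
// Illustrative value-based sizing: radius tracks the metric value itself,
// so a dominant PageRank outlier stays visually dominant.
function nodeRadius(value, maxValue, minPx = 2, maxPx = 18) {
  if (maxValue <= 0) return minPx;
  const t = value / maxValue;       // 0..1, preserves gaps between values
  return minPx + t * (maxPx - minPx);
}
```

With rank-based sizing the top two nodes would differ by one step regardless of how far apart their scores are; here the gap in scores is the gap on screen.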
## Switch to topic colors
The toolbar also has a **Color** dropdown:
- **Color: Type** — default, entities colored by their ontology type (person, location, etc.)
- **Color: Topic** — entities colored by the topic cluster they belong to
When you switch to topic colors, the bottom-left legend changes to show topic labels instead of entity types. Click any topic in the legend to filter the view to that topic.
## Look up entities on Wikipedia
When you click any node in the 3D view, two things happen at once:
1. The graph focuses on the entity's neighborhood (using the depth setting in the toolbar)
2. A **Wikipedia panel** appears in the top-left corner with matching articles
The Wikipedia search uses the entity's name as a prefix query and returns up to 10 matching pages with thumbnails and short descriptions.
**Click any result** to fetch the article, convert it to readable text, and display it in the document viewer pane on the right of the screen. Click the small **external-link icon** to open the article on wikipedia.org in a new tab instead.
Wikipedia results are cached locally for 7 days to avoid repeated API calls.
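A prefix search of this kind against the MediaWiki Action API can be expressed as follows. This is an assumed sketch of the request shape, not Daneel's actual code:

```js
// Illustrative prefix-search URL for the MediaWiki Action API.
function wikipediaPrefixSearchUrl(entityName, limit = 10) {
  const params = new URLSearchParams({
    action: "query",
    list: "prefixsearch",
    pssearch: entityName,
    pslimit: String(limit),
    format: "json",
    origin: "*",        // enable CORS for browser-side callers
  });
  return `https://en.wikipedia.org/w/api.php?${params}`;
}
```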
To close the Wikipedia panel, click its **X** button. To clear the article from the document viewer, click the **X** in the article's header.
## Resize the analytics panel
The analytics and path finder panels share a draggable handle on their left edge. Click and drag to resize between 220 and 500 pixels wide. Useful when entity names are long.
## Reset everything
The floating **Reset view** button appears in the bottom-right of the 3D canvas whenever any focus, highlight, or topic filter is active. Click it to clear all filters and return to the full graph.
## Next steps
- For background on what these algorithms actually compute, see [Graph Analytics](/concepts/graph-analytics/)
- For configuration of the underlying knowledge graph, see [Settings reference](/reference/settings/)
- For the broader knowledge graph concept, see [Knowledge Graphs](/concepts/knowledge-graph/)
---
# How to Enable Gemini Nano
**Turn on Chrome's built-in Gemini Nano model and select it as a provider in Daneel.**
> Source: https://doc.daneel.injen.io/how-to/gemini-nano/index.md
Gemini Nano is Google's on-device language model shipped inside Chrome itself. When enabled, prompts run in the browser process, never cross a network boundary, and require no API key. This guide walks you through enabling the model in Chrome, verifying it works, and selecting it in Daneel.
## Prerequisites
- **Chrome 138 or later** (stable channel). The Prompt API for on-device usage ships in Chrome 138+, and the `LanguageModel` global is available as of this version. ([Chrome for Developers — Prompt API](https://developer.chrome.com/docs/ai/prompt-api))
- **Operating system**: Windows 10/11, macOS 13 (Ventura) or later, Linux, or ChromeOS Platform 16389.0.0+ on a Chromebook Plus device. ([Chrome for Developers — Get started with built-in AI](https://developer.chrome.com/docs/ai/get-started))
- **Hardware**: at least 16 GB RAM, 4+ CPU cores, and a GPU with more than 4 GB of VRAM.
- **Storage**: roughly 22 GB free for the model download. The model is shared across Chrome profiles.
- **Network**: unmetered connection for the initial download.
Gemini Nano is not yet available on Android or iOS Chrome.
## Step 1 — Enable the Chrome flags
Open two URLs in Chrome and flip both flags.
1. Navigate to `chrome://flags/#optimization-guide-on-device-model`.
2. Set it to **Enabled BypassPerfRequirement**. The plain *Enabled* state also works on machines that meet the hardware bar, but `BypassPerfRequirement` avoids silent "unavailable" results if Chrome's heuristics disagree with your GPU. ([Chrome for Developers — Get started](https://developer.chrome.com/docs/ai/get-started))
3. Navigate to `chrome://flags/#prompt-api-for-gemini-nano`.
4. Set it to **Enabled**. Choose **Enabled multilingual** if you plan to prompt in languages other than English.
5. Click **Relaunch** at the bottom of the page.
After Chrome restarts, both flags must show as enabled. If either reverts to Default, re-open the flag URL and save again.
## Step 2 — Trigger the model download
The model is fetched lazily the first time Chrome recognizes it is needed.
1. Open `chrome://components/`.
2. Scroll to **Optimization Guide On Device Model**. If the version reads `0.0.0.0`, click **Check for update**.
3. The component status will progress through *Downloading* to a real version string (for example `2024.x.x.x`). The download is roughly 2 GB on disk and may take several minutes.
4. Keep Chrome open until the version number appears. The download resumes across restarts but stalls if Chrome is closed mid-fetch.
If the version stays at `0.0.0.0` after a few minutes, see [Troubleshooting](#troubleshooting) below.
## Step 3 — Verify availability from the DevTools console
Confirm the model is ready before switching providers in Daneel.
1. Open any tab and press **F12** to open DevTools.
2. In the **Console**, run:
```js
await LanguageModel.availability();
```
3. The expected return values are:
- `"available"` — ready to use.
- `"downloadable"` — flags are set but the model has not been fetched yet. Return to Step 2.
- `"downloading"` — the component is still being fetched.
- `"unavailable"` — hardware, OS, or flag requirements are not met.
The global is called `LanguageModel`, not `ai.languageModel` or `window.ai` — earlier Chrome builds exposed different surfaces, and Daneel's provider targets the current `LanguageModel` class directly. ([Chrome for Developers — Prompt API](https://developer.chrome.com/docs/ai/prompt-api))
You can also smoke-test a prompt:
```js
const session = await LanguageModel.create();
await session.prompt("Write one sentence about otters.");
```
## Step 4 — Select Gemini Nano in Daneel
1. Open Daneel from the Chrome toolbar.
2. Go to **Settings > Gemini Nano**.
3. Daneel runs its own availability check and shows a status badge: *Available*, *Downloadable*, *Downloading*, or *Unavailable*.
4. If the status is *Available*, pick a language from the **Language** dropdown (used as the session's expected input and output locale).
5. Close settings, open the chat panel, and switch the provider selector to **Gemini Nano**.
You can now ask page questions, summarize selections, and run vault RAG with zero network traffic for inference. Embeddings still run locally through the WebGPU embedding provider.
:::note
Gemini Nano is Daneel's highest privacy tier — prompts never leave the Chrome process. See the [privacy model](/concepts/privacy/) for where it sits relative to other providers.
:::
## Troubleshooting
**`LanguageModel` is undefined in the console.**
The flags did not take effect. Re-check both `chrome://flags/#optimization-guide-on-device-model` and `chrome://flags/#prompt-api-for-gemini-nano`, confirm they read *Enabled*, and relaunch Chrome from the prompt at the bottom of the flags page (not by closing the window manually).
**Component stuck at `0.0.0.0`.**
Chrome only downloads the model when the Optimization Guide flag is enabled *and* the device passes the performance check. Switch the flag to **Enabled BypassPerfRequirement**, relaunch, then click **Check for update** in `chrome://components/` again. Corporate proxies and strict DNS filters also block the download — try from an unfiltered network.
**Availability returns `"unavailable"` after download.**
Confirm you have at least 22 GB free on the Chrome user data partition and more than 4 GB of VRAM. On laptops with hybrid graphics, force Chrome onto the discrete GPU via your OS graphics settings.
**Daneel shows "Unavailable" even though the console test works.**
Reload the Daneel side panel. The provider caches its availability probe for the lifetime of the panel.
**Responses are short, repetitive, or off-topic.**
Gemini Nano is a ~3B parameter model. It handles summarization, rewriting, and simple Q&A well, but struggles with long multi-step reasoning and tool calling. For complex agentic workflows, switch to Claude, Azure OpenAI, or a larger Ollama model. See [the provider spectrum](/concepts/providers/) for guidance.
## Next steps
- Try it on a page: follow [Your First Page Chat](/guides/first-chat/) with Gemini Nano selected.
- Compare it against other local options in the [AI Providers reference](/reference/providers/).
- Learn [how to use Daneel fully offline](/how-to/offline/).
---
# How to Build a Knowledge Graph
**Extract entities from your vault documents and visualize their relationships.**
> Source: https://doc.daneel.injen.io/how-to/knowledge-graph/index.md
The knowledge graph feature extracts named entities (people, organizations, places, concepts) from your vault documents and builds an interactive 3D visualization of their relationships.
## Prerequisites
- A [document vault](/guides/first-vault/) with at least one imported document
- Enough free memory for the NER model (183 MB–583 MB depending on model choice)
## Enable the knowledge graph
1. Open the vault overlay and select your vault.
2. Click the **Knowledge Graph** toggle to enable it.
3. Daneel downloads the NER (Named Entity Recognition) model if not already cached, then processes all documents in the vault.
Entity extraction runs locally in a dedicated web worker using GLiNER (an ONNX model). No data leaves your browser.
## Choose a NER model
Open **Settings > Knowledge Graph** to select a model:
| Model | Size | Languages | Notes |
|-------|------|-----------|-------|
| GLiNER Small v2.1 (fp32) | 583 MB | English | Highest accuracy |
| GLiNER Small v2.1 (int8) | 183 MB | English | Good balance of size and quality |
| GLiNER Multi v2.1 (int8) | 349 MB | Multilingual | For non-English documents |
| GLiNER Multi v2.1 (fp16) | 580 MB | Multilingual | Highest multilingual accuracy |
The int8 English model is a good default for most use cases.
## Pick an ontology preset
Ontology presets define what types of entities Daneel looks for. Choose one in **Settings > Knowledge Graph**:
- **General** — people, organizations, places, events, concepts
- **Academic** — researchers, institutions, theories, publications
- **Legal** — cases, statutes, courts, parties
- **Medical** — conditions, treatments, drugs, anatomy
- **Programming** — languages, frameworks, APIs, data structures
- **Business** — companies, products, markets, financials
- **Travel** — destinations, landmarks, transport, accommodations
- **History** — historical figures, battles, treaties, eras
You can also define a **custom ontology** by entering your own entity type labels.
## Explore the visualization
Once extraction completes, the knowledge graph appears as a 3D interactive visualization:
- **Nodes** represent entities, sized by how often they appear
- **Edges** represent co-occurrence relationships between entities
- **Colors** indicate entity types (people in blue, organizations in green, places in red, etc.)
- **Hover** over a node to see its label and type
- **Click a node** to focus on its neighborhood and trigger a Wikipedia lookup
- **Click and drag** to rotate the view
- **Scroll** to zoom in and out
For analytics, path finding, sizing modes, topic clusters, and the Wikipedia lookup, see [How to Explore Your Knowledge Graph](/how-to/explore-knowledge-graph/).
## Customize the visualization
In **Settings > Knowledge Graph**, adjust:
- **Particle animation** — toggle animated particles along edges
- **Bloom glow** — toggle a glow effect on nodes
- **Charge strength** — how strongly nodes repel each other (affects spacing)
- **Link opacity** — transparency of relationship edges
- **Node scale** — base size multiplier for nodes
## How entity resolution works
Daneel automatically deduplicates entities using normalized string matching with a configurable threshold (default: 85% similarity). "OpenAI", "Open AI", and "OPENAI" resolve to a single entity. Adjust the deduplication threshold in settings if you need stricter or looser matching.
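A minimal sketch of how such threshold-based matching can work, using normalization plus an edit-distance ratio. Daneel's actual matcher may differ in both steps; this only illustrates the idea behind the 85% default.

```typescript
// Illustrative entity deduplication: normalize labels, then merge any
// pair whose similarity ratio meets the threshold.
function normalize(label: string): string {
  return label.toLowerCase().replace(/[^a-z0-9]/g, "");
}

// Classic Levenshtein edit distance between two strings.
function levenshtein(a: string, b: string): number {
  const dp: number[][] = Array.from({ length: a.length + 1 }, (_, i) => [
    i,
    ...Array(b.length).fill(0),
  ]);
  for (let j = 0; j <= b.length; j++) dp[0][j] = j;
  for (let i = 1; i <= a.length; i++) {
    for (let j = 1; j <= b.length; j++) {
      dp[i][j] = Math.min(
        dp[i - 1][j] + 1,
        dp[i][j - 1] + 1,
        dp[i - 1][j - 1] + (a[i - 1] === b[j - 1] ? 0 : 1),
      );
    }
  }
  return dp[a.length][b.length];
}

// Similarity in [0, 1]; 1 means identical after normalization.
function similarity(a: string, b: string): number {
  const na = normalize(a);
  const nb = normalize(b);
  if (na.length === 0 && nb.length === 0) return 1;
  return 1 - levenshtein(na, nb) / Math.max(na.length, nb.length);
}

// Greedy clustering: a label joins the first cluster it matches.
function dedupe(labels: string[], threshold = 0.85): string[][] {
  const clusters: string[][] = [];
  for (const label of labels) {
    const hit = clusters.find((c) => similarity(c[0], label) >= threshold);
    if (hit) hit.push(label);
    else clusters.push([label]);
  }
  return clusters;
}
```

With the default threshold, "OpenAI", "Open AI", and "OPENAI" all normalize to the same string and land in one cluster, while "Anthropic" stays separate.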
## Next steps
- [Explore the graph with analytics, path finder, and Wikipedia lookup](/how-to/explore-knowledge-graph/)
- Learn more about [what knowledge graphs are and why they help](/concepts/knowledge-graph/)
- Understand the [graph analytics layer](/concepts/graph-analytics/) — importance, topics, bridges, paths
- See the [Settings reference](/reference/settings/) for all knowledge graph parameters
- [Build a Document Vault](/guides/first-vault/) if you haven't created one yet
---
# How to Manage Site Indexes
**View, re-index, and clear your indexed websites.**
> Source: https://doc.daneel.injen.io/how-to/manage-indexes/index.md
When you index a site with Daneel, the vector data is stored in IndexedDB in your browser. This guide covers how to manage those indexes.
## View indexed sites
1. Open **Settings > Indexes**.
2. You'll see a list of all indexed domains with:
- Number of pages crawled
- Number of chunks embedded
- Last indexed date
You can also see a summary on the **Settings > Home** dashboard card.
## Re-index a site
To update an index with fresh content:
1. Navigate to the site.
2. Open the search overlay (magnifying glass icon).
3. Click **Re-index**. Daneel re-crawls using the same discovery method (sitemap or web crawl) and replaces the old chunks.
Alternatively, from **Settings > Indexes**, click the re-index button next to any domain.
For details on choosing between sitemap and web crawl discovery, see [How to Index a Site](/how-to/site-indexing/).
## Clear an index
To remove all indexed data for a site:
1. Open **Settings > Indexes**.
2. Click the **Clear** button next to the domain.
This deletes all chunks and embeddings for that domain from IndexedDB. The action is immediate and cannot be undone.
You can also clear a domain from the search overlay by clicking the clear button next to the domain stats.
## Crawl settings
Adjust default crawl parameters in **Settings > Indexes** or per-crawl in the search overlay:
| Setting | Default | Range | Description |
|---------|---------|-------|-------------|
| Max pages | 150 | 1–200 | Maximum pages to crawl per site |
| Max depth | 3 | 1–10 | How deep to follow links from the sitemap |
| Max chunks per page | 2,000 | — | Upper limit on chunks per single page |
| Chunk size | 512 tokens | — | Target size of each text chunk |
| Overlap | 64 tokens | — | Overlap between consecutive chunks |
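The chunk-size and overlap settings describe a sliding window. Here is a minimal sketch of the mechanics over an array of pre-tokenized strings; Daneel's actual tokenizer and chunking code may differ.

```typescript
// Sliding-window chunker: each chunk is `size` tokens, and consecutive
// chunks share `overlap` tokens so context is not cut mid-thought.
function chunk(tokens: string[], size = 512, overlap = 64): string[][] {
  if (overlap >= size) throw new Error("overlap must be smaller than chunk size");
  const step = size - overlap; // advance by size minus overlap each time
  const chunks: string[][] = [];
  for (let start = 0; start < tokens.length; start += step) {
    chunks.push(tokens.slice(start, start + size));
    if (start + size >= tokens.length) break; // last chunk reached the end
  }
  return chunks;
}
```

With the defaults, a 1,000-token page yields three chunks, and each chunk after the first repeats the final 64 tokens of its predecessor.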
## Storage
Indexed data is stored in IndexedDB, partitioned by domain. Each domain's data is independent — clearing one site doesn't affect others.
For storage details, see [Storage and Limits](/reference/storage/).
## Next steps
- [Index Your First Site](/guides/first-site-index/) if you haven't tried it yet
- Learn [how RAG works](/concepts/rag/) under the hood
- [Back up your data](/how-to/cloud-backup/) to preserve your indexes
---
# How to Manage Your License Key
**Enter a license key, recover a lost key, reset or switch licenses, and check your license status.**
> Source: https://doc.daneel.injen.io/how-to/manage-license/index.md
This guide covers the four short flows you'll come back to over the life of your license: entering a key on a new install, recovering a lost key, resetting or switching, and checking what's currently active.
If you don't yet have a license, start with [Unlock Paid Features](/guides/unlock-paid/). The concept page [How Licensing Works](/concepts/licensing/) covers the design behind all four flows.
## Enter a license key
Use this when you already have a `DAN-XXXX-XXXX-XXXX` key — typically when installing Daneel on a second machine or after a browser-profile reset.
1. Open **Settings > License**.
2. Click **Enter key**.
3. Paste your license key into the dialog. The input auto-uppercases and the format is `DAN-XXXX-XXXX-XXXX`.
4. Click **Activate**.
The extension exchanges your key with the backend for a signed token, caches it locally, and flips the panel to "Premium active". From this point on feature checks are fully local — no network needed.
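The key handling above can be sketched as a tiny normalizer: trim, uppercase, and check the documented shape. The accepted character set (alphanumeric) is an assumption, not Daneel's actual validation code.

```typescript
// Normalize user input to the documented DAN-XXXX-XXXX-XXXX shape,
// or return null when the input cannot be a valid key.
function normalizeKey(input: string): string | null {
  const key = input.trim().toUpperCase(); // the dialog auto-uppercases
  return /^DAN-[A-Z0-9]{4}-[A-Z0-9]{4}-[A-Z0-9]{4}$/.test(key) ? key : null;
}
```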
## Recover a lost key
Your license key is your only identifier, and Daneel has no user account you can log into. If you've lost your key, recover it with the email and Stripe receipt number from your original purchase.
1. Open **Settings > License** and click **Enter key** to open the dialog.
2. Scroll down to **Forgot your license key?** below the activation form.
3. Enter the email address you used at checkout.
4. Enter your Stripe receipt number. The format is `XXXX-XXXX`, for example `1513-6412`. You'll find it in the receipt email from Stripe (`receipts@stripe.com`) — look for "Receipt #" in the header.
5. Click **Recover**.
The dialog collapses to a confirmation: _"An email was sent to \<your email\> if your license key was found."_ If both fields match a live license on file, the backend emails the key to that address. If either doesn't match, no email fires — but the UI shows the same message regardless.
:::note
Both the email and the receipt number must match. Daneel never surfaces a key from email alone — this is a deliberate guard against anyone who knows your email address claiming your license.
:::
Once the email arrives, follow [Enter a license key](#enter-a-license-key) above to activate it on your current machine.
If you can also access the Stripe receipt email, it includes a clickable link to the hosted Stripe receipt — a useful fallback for tax purposes. If you've lost both the key and the receipt number, email [support@daneel.injen.io](mailto:support@daneel.injen.io).
## Reset or switch licenses
Use this to move a license to a different Chrome profile, revert the extension to the free tier on a given machine, or free the slot for a different key.
1. Open **Settings > License**.
2. Click the **Reset** button.
3. Confirm.
The local token is cleared and the extension returns to the free tier. Your license key is not destroyed on the backend — it remains valid, and you can re-enter it here or on any other machine to reactivate.
:::caution
Resetting does not cancel your purchase or trigger a refund. It only clears the cached token on this machine. The license key stays valid indefinitely on the backend.
:::
## Check your license status
Open **Settings > License**. The panel shows one of two states.
**Premium active** — green unlocked-lock icon with:
- Your plan name
- The active feature flags
- "Refreshes in Nd" — days until the next automatic token refresh
- A masked preview of your key in the form `DAN-XXXX-..-LAST4`
**Free plan** — grey locked-lock icon with the note that some features are gated.
When active, a **Refresh** button lets you force a token refresh against the server. You don't normally need it — the extension refreshes the token automatically in the background, on a 7-day TTL. Manual refresh is useful right after a plan change, to pick up new feature flags immediately without waiting for the next scheduled refresh.
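The "Refreshes in Nd" countdown follows directly from the 7-day TTL. A hypothetical helper, assuming the token records its issue time in milliseconds (the field and rounding are illustrative, not Daneel's internals):

```typescript
const TTL_DAYS = 7; // documented token lifetime

// Days remaining before the next automatic token refresh,
// rounded up so "partially elapsed" days still count.
function daysUntilRefresh(issuedAtMs: number, nowMs: number): number {
  const elapsedDays = (nowMs - issuedAtMs) / 86_400_000;
  return Math.max(0, Math.ceil(TTL_DAYS - elapsedDays));
}
```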
See [How Licensing Works](/concepts/licensing/) for the reasoning behind the 7-day TTL and offline caching.
---
# How to Connect an MCP Server
**Add a remote MCP server to give your AI access to external tools like Stripe, Notion, or Vercel.**
> Source: https://doc.daneel.injen.io/how-to/mcp-server/index.md
The Model Context Protocol (MCP) lets Daneel's AI call tools on remote services — reading Stripe invoices, querying Supabase tables, managing Vercel deployments, and more. This guide shows how to connect a server.
## Prerequisites
- Daneel AI installed
- A cloud provider that supports tool calling (Claude or Ollama recommended). WebGPU and Gemini Nano have experimental tool calling support but results vary with small models.
## Browse featured servers
1. Open **Settings > MCP** in Daneel.
2. The **Featured** tab shows curated servers with known-good configurations:
| Server | Auth type | Category |
|--------|-----------|----------|
| Stripe | OAuth | Payments |
| Notion | OAuth | Productivity |
| Vercel | OAuth | DevOps |
| Supabase | OAuth | Database |
| Figma | OAuth | Design |
| Linear | OAuth | Project management |
| Slack | OAuth | Communication |
| Google Maps | API Key | Maps |
| Cloudflare | OAuth | Infrastructure |
| Exa | API Key | Search |
3. Click a server to see its description and connection details.
## Connect with OAuth
For servers that use OAuth (Stripe, Notion, Vercel, etc.):
1. Click **Connect** on the server card.
2. Daneel opens an OAuth consent flow in a new tab.
3. Authorize the connection.
4. The server appears in your **Registered** list with a green status badge.
Daneel handles the full OAuth2 + PKCE flow, token storage, and refresh automatically.
## Connect with an API key
For servers that use API key authentication (Google Maps, Exa, etc.):
1. Click **Connect** on the server card.
2. Paste your API key in the input field.
3. Click **Save**.
The key is stored in Chrome's local storage.
## Add a custom server
For servers not in the featured list:
1. In the MCP settings panel, click **Add Custom Server**.
2. Enter the server's SSE endpoint URL (e.g., `https://mcp.example.com/sse`).
3. Choose the auth type (None, API Key, Bearer Token, or OAuth).
4. Enter credentials if required.
5. Click **Register**.
Daneel discovers the server's available tools automatically.
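A registration record for a custom server might look like the following sketch. The field names and the shape of the auth union are illustrative assumptions, not Daneel's actual schema; they mirror the four auth choices in the dialog.

```typescript
// Illustrative registration shape for a custom MCP server.
type McpAuth =
  | { kind: "none" }
  | { kind: "apiKey"; key: string }
  | { kind: "bearer"; token: string }
  | { kind: "oauth" };

interface CustomMcpServer {
  url: string; // SSE endpoint, e.g. https://mcp.example.com/sse
  auth: McpAuth;
}

const server: CustomMcpServer = {
  url: "https://mcp.example.com/sse",
  auth: { kind: "apiKey", key: "YOUR_KEY_HERE" },
};
```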
## Use tools in conversation
Once connected, tools are available in any chat conversation:
1. Start a chat in any mode (Page, Site, or Vault).
2. Ask a question that requires the connected service. For example, with Stripe connected:
> *Show me the last 5 invoices for customer acme@example.com*
3. The AI recognizes it needs Stripe data, calls the appropriate tool, receives the result, and incorporates it into the response.
Tool calls appear inline in the chat with the tool name and a summary of what was called.
## Manage servers
- **Disable** a server without removing it: toggle it off in the registered servers list. The AI won't use its tools until you re-enable it.
- **Remove** a server: click the delete button. This revokes stored credentials.
- **Test** a connection: click the test button to verify the server responds.
## Next steps
- [Create a custom agent](/how-to/agents/) with specific MCP servers bound to it
- Read about [MCP and tool calling](/concepts/mcp/) to understand how multi-turn tool loops work
- See the [AI Providers reference](/reference/providers/) for which providers support tool calling
---
# How to Reclaim Disk Space from Local Models
**View, delete, or wipe the WebGPU model artifacts stored in your browser.**
> Source: https://doc.daneel.injen.io/how-to/models-storage/index.md
Local models used by Daneel, whether for chat, embedding, or knowledge extraction, are downloaded once and cached in your browser so they stay available offline. A single chat model can run from a few hundred megabytes up to several gigabytes. The **Models Storage** settings panel shows exactly what is on disk and lets you reclaim space at any time.
## View downloaded models
1. Open **Settings > Models Storage**.
2. The summary card at the top shows the total disk used by all cached model artifacts, the number of models involved, and a browser-level "used of available" line for context.
3. Below the summary, models are grouped into sections by role:
- **Language models**, the WebGPU LLMs used in chat.
- **Embedding models**, the sentence-embedders used for site and vault indexing.
- **Knowledge extraction models**, the GLiNER and LFM2-Extract variants used to build knowledge graphs.
- **Other**, any cached artifact that does not match a current catalog entry (see [The "Other" section](#the-other-section) below).
Each section shows its own subtotal, model count, and file count. Within a section, models are sorted largest first. Each row has a progress bar showing its share of the overall total.
## Delete one model
1. Click the trash icon on the row of the model you want to remove.
2. An inline confirmation appears showing the exact amount of disk space that will be freed.
3. Click **Delete** to confirm, or **Cancel** to dismiss.
The cached files are removed immediately. The model will re-download from HuggingFace the next time you pick it, so do this when you are comfortable with that cost.
## Delete all downloaded models
1. Click **Delete all** in the summary card.
2. An inline confirmation appears showing the total that will be freed and the number of models affected.
3. Click **Delete all** again to confirm.
This wipes every cached model artifact in one step. Your provider selection, embedding-model choice, and knowledge-extraction model preference are not touched, so picking a provider again will simply trigger a fresh download.
## Deleting an active model is safe
You can delete a model that is currently selected or even actively running. The model's weights stay in GPU memory until the next cold start, so your current chat, ongoing ingestion, or running knowledge-graph build continues uninterrupted. The deletion only affects the next time the model has to load from scratch.
No reload, no restart, no re-activation is needed.
## The "Other" section
When a model is renamed in the catalog or removed in a later update, any files you had previously downloaded for it stay on disk until you delete them. The **Other** section groups these orphan artifacts so you can reclaim the space without waiting on a catalog change.
The section only appears when orphan artifacts are present. A generic graph icon is used since no catalog label is available.
## Storage details
Cached artifacts live in two separate browser cache stores:
- `transformers-cache`, populated by the transformers.js runtime for ONNX shards, tokenizers, and configuration JSON.
- `daneel-ner-models`, populated by the NER worker for GLiNER ONNX binaries, since those are downloaded outside the transformers.js pipeline.
The panel queries both stores, so the totals you see account for all downloaded artifacts in one place. Everything stays on your machine. Nothing about your model footprint ever leaves the browser.
For a broader overview of what Daneel stores and where, see [Storage and Limits](/reference/storage/).
## Next steps
- [How Providers Work](/concepts/providers/) to understand which providers download models locally and which do not.
- [Use Daneel Offline](/how-to/offline/) to confirm that the models you keep are the ones you actually need without a network.
- [Switch Embedding Models](/reference/settings/) if you want a different embedding model, which will leave the old cache behind until you delete it from this panel.
---
# Configure OS Notifications
**How to enable, silence, and tune OS toast notifications for background tasks — crawls, vault indexing, knowledge graph builds, and data backups.**
> Source: https://doc.daneel.injen.io/how-to/notifications/index.md
Daneel fires OS toast notifications when background tasks start, complete, fail, or are cancelled. That way a site crawl finishing while you're on another tab, or a 250 MB data export completing while you walk away from your desk, still reaches you. This guide shows how to turn notifications on and off, suppress their sound, tune how long they stay on screen, and troubleshoot when no toast appears.
If you are new to background tasks themselves, see [Monitor Background Tasks](/how-to/background-tasks/) first.
## Open the Notifications panel
1. Click the Daneel widget icon to open the extension
2. Open **Settings** (gear icon)
3. Select **Notifications** in the sidebar (bell icon, between Tasks and AI Models)
The panel has three sections: master controls, per-event settings, and a **Test now** button.
## Master toggle
The top card has a single **Enable notifications** toggle. When this is off, nothing fires regardless of per-kind settings. Use it as a one-click silence when you want Daneel quiet for a while without reconfiguring anything.
## Silent toggle (no sound)
Directly below the master toggle is a **Silent (no sound)** switch. When on, Daneel asks the OS to suppress the notification sound — you still see the toast, you just don't hear a chime.
The OS has the final say: on Windows, Focus Assist and Do Not Disturb can still mute or suppress the toast entirely. On Linux the "silent" hint depends on your notification daemon.
## Per-event settings
Four task-lifecycle transitions can each fire a toast:
| Event | Default sound | Default duration | Typical use |
|-------|---------------|------------------|-------------|
| **Started** | On | 2.5s | Confirmation that a long-running task has begun |
| **Complete** | On | 3.5s | A task finished successfully |
| **Cancelled** | On | 3.5s | You stopped a task from the Tasks panel or a cancel button |
| **Failed** | On | 10s | A task errored out — longer default so you have time to read it |
Each event card has its own toggle + auto-dismiss slider (1.5–15 seconds). The slider controls how long the toast stays visible before it auto-closes. After the auto-dismiss the notification still appears in your OS notification center (Action Center on Windows, Notification Center on macOS) until you clear it manually.
Disabling a specific kind (for example, turning **Started** off because you don't need a chime every time a task kicks off) only silences that one kind — the others keep firing normally.
## Test now button
At the bottom of the panel, the **Test now** button fires a synthetic toast — exactly what a real completed task would produce. Useful for:
- Confirming the OS is rendering Daneel toasts at all
- Previewing the look-and-feel after you change Windows theme, OS dark mode, or a duration slider
- Debugging Do Not Disturb / Focus Assist interactions
Click it, watch for the toast. If nothing appears, jump to [Troubleshooting](#troubleshooting) below.
## Toast format
Every toast uses the same compact structure:
```
Daneel AI
{operation} > {scope} > {target} > {state}
```
For example:
- `Indexing > site > example.com > Complete`
- `Building knowledge graph > vault > Research notes > Started`
- `Exporting data > backup > Local file > Failed`
- `Indexing documents > vault > Project Alpha > Cancelled`
The format is deliberately terse — enough to know what happened, no marketing noise. There is no progress bar inside the toast itself; progress lives in the [Tasks panel](/how-to/background-tasks/).
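The format above is simple enough to reproduce in a few lines. A sketch, where the `TaskToast` type is illustrative rather than Daneel's internal shape:

```typescript
// Build the toast body "{operation} > {scope} > {target} > {state}".
interface TaskToast {
  operation: string;
  scope: "site" | "vault" | "backup";
  target: string;
  state: "Started" | "Complete" | "Cancelled" | "Failed";
}

const formatToast = (t: TaskToast): string =>
  [t.operation, t.scope, t.target, t.state].join(" > ");
```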
## Click behavior
Clicking a Daneel toast closes it. That is the whole interaction — no tab focus, no deep link, no extra panel opens. This keeps notifications unobtrusive: they inform you, they don't pull you away from what you're doing.
If you want to see more detail after a toast, open the extension and go to **Settings > Tasks**.
## Where toasts appear on your platform
| OS | Location | Persistence |
|----|----------|-------------|
| **Windows 10/11** | Bottom-right, above the system tray | Sits in Action Center (speech-bubble icon on the taskbar) after auto-dismiss |
| **macOS** | Top-right corner | Stacks in Notification Center (click the clock in the menu bar) |
| **Linux** | Depends on your desktop environment (GNOME: top-right; KDE: bottom-right; most DEs use `notify-send` conventions) | Varies by notification daemon |
## Styling and theme
You cannot customize Daneel toast appearance — colors, fonts, backgrounds, or layouts. Chrome's notification API renders through the OS, and the OS owns the visual style. Daneel supplies only the icon, title, and message text.
The good news: toasts automatically follow your **system theme**. Turn Windows on Dark Mode (Settings → Personalization → Colors → "Choose your default app mode" → Dark), and Daneel toasts will appear with dark backgrounds and light text without any extra configuration. Same for macOS dark mode and Linux dark themes.
If you genuinely need fully custom-styled toasts that live inside the browser — for branding, for example — that would require an in-browser overlay instead of OS notifications, and it would only show while a Daneel surface is open. Daneel does not offer that today.
## Troubleshooting
If you click **Test now** and nothing appears on screen, Chrome accepted the notification but your OS silently dropped it. On Windows this is common after a fresh Chrome install or when Focus Assist is active.
### Check three Windows gates in order
Open **Settings → System → Notifications** (Win+I, then navigate):
1. **Master toggle** — the top switch labeled "Notifications" must be **On**.
2. **Per-app toggle** — scroll the apps list, find **Google Chrome**, confirm it is **On**. Click through to Chrome's entry and confirm both "Show notification banners" and "Show in notification center" are enabled.
3. **Focus Assist / Do Not Disturb** — must be **Off** for Daneel toasts to pop as banners. On Windows 11 this is at Settings → System → Notifications → Do not disturb. "Priority only" still suppresses most app notifications; either switch Focus Assist off, or add Chrome to the priority list.
:::note
Missed toasts are usually still in your notification center. On Windows, click the bell/speech-bubble icon on the taskbar — you'll typically find the toasts you never saw as banners waiting there.
:::
### Chrome missing from the per-app list
If Google Chrome doesn't appear in the Windows apps list at all, Windows never registered Chrome's notification subsystem. Fully quit Chrome — right-click its system-tray icon → **Quit**, not just close the window — then relaunch it. Windows registers Chrome's notifications on app launch.
### macOS settings
System Settings → Notifications → **Google Chrome** → "Allow Notifications" must be on. Banner style determines how long the toast stays visible.
### Nothing helps
Open the extension's service worker console (`chrome://extensions` → find Daneel AI → click **Inspect views: service worker**) and click **Test now** again. You should see two log lines:
```
[notifications] firing task-complete: {id: …, iconUrl: …, message: …}
[notifications] created id=… (requested …)
```
If both appear, Chrome created the notification successfully and your OS is dropping it — the three gates above cover 99% of cases. If you see `[notifications] chrome.notifications.create rejected:` instead, something is blocking the Chrome API itself — report the error message as a bug.
## See also
- [Monitor Background Tasks](/how-to/background-tasks/) — the panel where you actually watch tasks progress
- [Back Up Your Data](/how-to/cloud-backup/) — the one background task you'll most want a completion toast for
- [Privacy Model](/concepts/privacy/) — what Daneel shares with the OS notification system (nothing beyond the visible toast text)
---
# How to Use Daneel Offline
**Activate Offline Mode, know what to expect, and get out of it when you reconnect.**
> Source: https://doc.daneel.injen.io/how-to/offline/index.md
Daneel ships a first-class Offline Mode: a single switch that blocks every outbound call the extension would make, verifiable in DevTools, with three independent ways to turn it off. This guide covers activation, what you will see while it is on, and recovery.
## Prerequisites
- You have run through [How to Prepare for Offline](/how-to/prepare-for-offline/) at least once. The Prepare panel makes sure your models and resources are cached.
- You have imported your vault documents. Import is a local operation and works any time, online or off.
## Activate Offline Mode
You have two toggles in **Settings > Offline mode > Mode**, and they do different things:
- **Switch to offline mode** — persistent. Survives browser restarts. Use this before a flight.
- **Test offline mode** — transient. Active until the service worker reloads (browser restart or extension update). Use this to verify your offline setup without actually disconnecting.
Flip either one. The effective state becomes offline. A green OFFLINE MODE pill appears in the Settings header and the Vault header, so you can tell at a glance.
## What changes while offline
The behavior is deliberate and predictable. Nothing crashes, nothing hangs waiting for a timeout.
- **The provider bar** — if your active provider is Claude or Azure OpenAI, any new question shows an amber banner with a one-click switch to WebGPU, Gemini Nano, or Ollama.
- **MCP surfaces** — the Settings > MCP panel and the Tools section of the vault render a disabled overlay labelled "paused". Local MCP servers you host via Docker Companion still work.
- **Cloud backup** — the Azure and S3 cards in Settings > Data grey out with a short notice. The local filesystem export and import controls in the same panel stay active. Your vault can still be backed up to a USB drive.
- **Model Storage** — the Settings > Models Storage panel is locked. This is on purpose. Deleting a cached model mid-flight would leave your local providers unable to answer.
- **Wikipedia and external links** — the knowledge graph's Wikipedia lookup pauses with a short notice. The fetched-page viewer in the vault hides the open-in-browser icon for the same reason.
- **News, documentation, changelog, credits** — served from the local cache instead of the live site. The Refresh button is disabled until you go back online.
All of these reflect the same underlying rule: if a call would leave your machine, it is blocked. See [Offline Mode](/concepts/offline-mode/) for the full rule table.
## Recover from offline
Three independent escape hatches, any one of them works:
1. **Extension popup** — click the toolbar icon. A green OFFLINE MODE card with a Turn off button appears. One click. Works on any tab, including blank tabs and error pages.
2. **Vault tab banner** — open the dedicated Vault tab. A banner at the top of the page shows the same Turn off button.
3. **Settings toggle** — from any normal webpage, click the widget, open Settings > Offline mode, flip Switch and Test back to off.
Turn off clears both the persistent and the transient flags. Whichever one was active, it is fully off afterwards.
## Open the vault when no webpage is available
Daneel's widget normally injects into the page you are viewing. On a blank new tab, on a `chrome://error` page, or on a captive-portal holding page, that assumption breaks. The extension popup has an **Open Vault** button that launches a dedicated Vault tab at `chrome-extension://.../src/vault/vault.html`. The Vault tab hosts the full vault experience: search, chat, knowledge graph, import.
In Offline Mode you will also see a URL-parameter deep-link option: `vault.html?action=import` opens the import dialog automatically once the active vault loads. This is useful if a preflight button in the Prepare panel needs to route you to vault import.
## What keeps working
This list is the heart of the feature. All of it works identically in Offline Mode:
- Chat with any document in your vault, via WebGPU, Gemini Nano, or Ollama
- Search your vault (cosine similarity over cached embeddings)
- Import new documents (.md, .txt, .docx, .pdf, .html from disk)
- View documents, including PDFs rendered with edgeparse-wasm
- Explore the knowledge graph in 3D
- Browse previously cached pages via the vault mini-browser
- Read cached news, documentation, changelog, and credits
- Export your vault to a local ZIP file
- Use local MCP servers via Docker Companion
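The vault search in the list above (cosine similarity over cached embeddings) reduces to a few lines. A minimal sketch, not Daneel's actual ranking code:

```typescript
// Cosine similarity between two equal-length embedding vectors.
function cosine(a: number[], b: number[]): number {
  let dot = 0, na = 0, nb = 0;
  for (let i = 0; i < a.length; i++) {
    dot += a[i] * b[i];
    na += a[i] * a[i];
    nb += b[i] * b[i];
  }
  return dot / (Math.sqrt(na) * Math.sqrt(nb));
}

// Rank cached chunk embeddings against a query embedding,
// returning chunk indices from most to least similar.
function rank(query: number[], chunks: number[][]): number[] {
  return chunks
    .map((vec, i) => ({ i, score: cosine(query, vec) }))
    .sort((x, y) => y.score - x.score)
    .map((r) => r.i);
}
```

Because both the query embedding and the chunk embeddings are computed locally, this whole loop runs without any network access.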
## What does not work
All of these are blocked by design:
- Chat with Claude or Azure OpenAI
- Connect to a remote MCP server (Stripe, Supabase, public registries)
- Cloud backup to Azure Blob or S3
- Index a new site (requires fetching pages)
- Fetch a fresh Wikipedia page from the knowledge graph viewer
- Activate or refresh a license
For license specifically: the cached JWT lasts 7 days, so you can keep using paid features offline during that window. After it expires, turn Offline Mode off, refresh the license, and turn it back on.
## Related
- [How to Prepare for Offline](/how-to/prepare-for-offline/) — get everything cached before you go
- [Offline Mode](/concepts/offline-mode/) — the rules behind what is blocked and what is allowed
- [Privacy Model](/concepts/privacy/) — the data-residency picture per provider
---
# How to Chat with a PDF
**Ask questions about any PDF document directly in your browser — no upload, no copy-paste.**
> Source: https://doc.daneel.injen.io/how-to/pdf-chat/index.md
Daneel detects when Chrome is displaying a PDF and automatically extracts its text, letting you chat with the document, copy its content as Markdown, or save it to a vault.
## Steps
1. Open any PDF in Chrome (click a link, paste a URL, or navigate directly — e.g. `arxiv.org/pdf/2601.00162`).
2. The Daneel widget appears in the corner, just like on any web page.
3. Open the chat panel (sparkles icon). The mode button shows **PDF** instead of **Page**, and a green status bar confirms how much text was extracted.
4. Ask a question about the document:
> *Summarize the main contributions of this paper in bullet points.*
The AI receives the extracted text as context and responds based on the PDF content.
## Quick actions
| Action | How |
|--------|-----|
| **Copy as Markdown** | Single-click the Markdown button on the launcher — PDF text is copied to your clipboard. |
| **Download as Markdown** | Double-click the Markdown button — saves a `.md` file named `daneel.{title}.{timestamp}.md`. |
| **Save to Vault** | Click *+ Vault* in the chat panel, pick a vault, and the PDF is imported with a descriptive filename (`{hostname}.{path}.{timestamp}.pdf.md`). |
## How it works
Chrome's modern PDF viewer (OOPIF, Chrome 126+) renders PDFs at the original URL rather than redirecting to an internal `chrome-extension://` page. This means Daneel's widget can inject normally.
When the widget detects a PDF page, it:
1. **Detects** the PDF via three signals: the `pdfoopifenabled` attribute set by Chrome's OOPIF viewer, `document.contentType`, or a `.pdf` URL suffix.
2. **Fetches** the PDF binary through the background service worker proxy (bypasses CORS restrictions).
3. **Extracts** structured Markdown using [EdgeParse WASM](https://github.com/raphaelmansuy/edgeparse), preserving headings, tables, and reading order.
4. **Caches** the result so subsequent questions reuse the same extraction.
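The detection step can be sketched in a few lines. The function name and parameter shape below are illustrative, not Daneel's actual code:

```typescript
// Illustrative sketch of the three detection signals; `isPdfPage`
// and its parameter shape are assumptions, not Daneel's real API.
function isPdfPage(
  doc: { contentType?: string; documentElement?: { hasAttribute(name: string): boolean } },
  url: string
): boolean {
  // Signal 1: attribute set by Chrome's OOPIF PDF viewer
  if (doc.documentElement?.hasAttribute("pdfoopifenabled")) return true;
  // Signal 2: the document's reported MIME type
  if (doc.contentType === "application/pdf") return true;
  // Signal 3: the URL path ends in .pdf (query string ignored)
  return new URL(url).pathname.toLowerCase().endsWith(".pdf");
}
```

Any one signal firing is enough; checking all three covers viewer variants where one signal is missing.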
The extracted Markdown flows into the same prompt pipeline as any other page — context selection, prompt building, and streaming to whichever AI provider you have active.
## What works differently on PDF pages
- **Site mode is disabled.** A PDF has no sitemap or crawlable structure, so the *Site* toggle is hidden.
- **Page title comes from the URL.** Chrome's PDF viewer leaves `document.title` empty, so Daneel derives a display title from the URL path (e.g., `2601.00162` from `arxiv.org/pdf/2601.00162`).
- **DOM extraction is skipped.** The PDF viewer wraps its content in a closed shadow root that cannot be read. Daneel fetches the PDF binary directly instead of parsing the DOM.
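Deriving the display title can be as simple as taking the last path segment. `titleFromUrl` below is a hypothetical helper, not Daneel's actual code:

```typescript
// Hypothetical helper: derive a display title from the last URL
// path segment, falling back to the hostname for bare URLs.
function titleFromUrl(url: string): string {
  const segments = new URL(url).pathname.split("/").filter(Boolean);
  return segments[segments.length - 1] ?? new URL(url).hostname;
}
```

For `arxiv.org/pdf/2601.00162`, this yields `2601.00162`, matching the example above.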
## Limitations
- **Scanned PDFs** (image-only, no selectable text) cannot be extracted. Daneel will show an error if every page contains fewer than 20 characters.
- **Very large PDFs** work but may take a few seconds to fetch and extract. The context selection algorithm trims the text to fit the model's token budget.
- **`file://` PDFs** require granting Daneel file access in Chrome's extension settings — this is not enabled by default.
## Next steps
- [Build a Document Vault](/guides/first-vault/) to organize and search across multiple PDFs
- [How RAG works](/concepts/rag/) explains the chunking and search pipeline behind document Q&A
- [Your First Page Chat](/guides/first-page-chat/) covers the general chat flow that PDFs build on
---
*PDF extraction is powered by [EdgeParse](https://github.com/raphaelmansuy/edgeparse) by [Raphaël Mansuy](https://github.com/raphaelmansuy). Apache 2.0 licensed.*
---
# How to Prepare for Offline
**Walk through the Prepare for offline panel and make sure Daneel is ready before you lose connection.**
> Source: https://doc.daneel.injen.io/how-to/prepare-for-offline/index.md
Before a long flight, a train through a tunnel, or a day somewhere with captive-portal Wi-Fi, open **Settings > Offline mode** and run through the Prepare for offline panel. Each row is a check you actually need, and the Cache resources button fills in the static content in one click.
## Prerequisites
- An active license if you use paid features. The cached token lasts 7 days, but you want a fresh refresh before you go offline for long. See [How Licensing Works](/concepts/licensing/) if you are not sure what your status is.
- At least one vault with the documents you want to work with. If you have not built one yet, follow [Build a Document Vault](/guides/first-vault/).
- A local provider chosen in Settings. WebGPU works out of the box. Ollama on localhost or LAN is also a valid choice. See [The Provider Spectrum](/concepts/providers/) if you need help deciding.
## Open the panel
1. Click the widget in the bottom-right corner of any page, then the cog icon to open Settings.
2. Scroll the sidebar to **Offline mode**. The panel has two sections: Status and Prepare for offline.
3. The Prepare section shows six rows, each with a status pill on the right.
## Read each row
The panel composes the report from live state every time you open it. The Refresh button in the top-right of the panel re-runs the check.
**Vault content** — counts every vault and the total documents + chunks. Green Ready means at least one vault has indexed documents. Red No vault means you have not created one. Amber Empty means you have a vault but nothing indexed yet.
**License** — reads your cached token and reports days remaining. Green Ready means more than three days left. Amber Refresh needed means three days or fewer. Red Missing means no license is activated, which is fine if you only use free features. Red Expired means the cached token is past its expiration.
**Language model** — counts WebGPU models in the `transformers-cache` Cache API bucket. Green Ready requires at least one downloaded. Red Missing means you need to download one in Settings > Models.
**Embedding model** — same check against the embedding catalog. This is specifically required for vault search. Without it, offline chat works but offline search does not.
**Cached resources** — counts entries in the `daneel-resources` cache bucket. Green Ready means the bucket is populated and fresh. Amber Stale means the entries exist but are older than two weeks. Red Missing means the bucket is empty.
**Knowledge extraction (optional)** — counts GLiNER NER models. Green Ready if at least one is downloaded. Amber Missing is acceptable if you do not need the knowledge graph while offline. If you want in-vault entity extraction without a network, this has to be green.
The final line shows your storage quota: used MB vs quota MB. Chrome extensions with `unlimitedStorage` permission rarely hit the quota, but it is surfaced for transparency.
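The freshness rule behind the Cached resources pill can be sketched like this. Whether the check looks at the oldest or the newest entry is an assumption:

```typescript
// Sketch of the Cached resources status logic: empty bucket means
// Missing, entries older than two weeks mean Stale. Using the
// oldest entry as the reference point is an assumption.
type ResourceStatus = "ready" | "stale" | "missing";

const TWO_WEEKS_MS = 14 * 24 * 60 * 60 * 1000;

function cachedResourcesStatus(cachedAtMs: number[], nowMs: number): ResourceStatus {
  if (cachedAtMs.length === 0) return "missing";
  const oldest = Math.min(...cachedAtMs);
  return nowMs - oldest > TWO_WEEKS_MS ? "stale" : "ready";
}
```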
## Cache the static resources
Click **Cache resources for offline**. The button calls the PreCacheManager, which:
1. Fetches the docs site's `pages.json` manifest.
2. Iterates every documentation page and writes its markdown into the `daneel-resources` bucket.
3. Fetches `articles.json` from the news site and caches every article.
4. Fetches the changelog and credits pages.
Each cached response carries an `X-Daneel-Cached-At` header with a timestamp so the panel can show you how fresh the cache is.
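The stamping step can be pictured like this. The ISO timestamp format is an assumption, and PreCacheManager's real internals may differ:

```typescript
// Wrap a fetched page body in a Response carrying the
// X-Daneel-Cached-At header before it is written to the cache
// bucket. The ISO timestamp format is an assumption.
function stampCachedAt(body: string, now: Date): Response {
  return new Response(body, {
    headers: { "X-Daneel-Cached-At": now.toISOString() },
  });
}
```

In the browser, a response stamped this way would then be written into the `daneel-resources` bucket with the standard Cache API (`caches.open` and `cache.put`).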
When it finishes, the Cached resources row flips to green and a small caption appears below the button: "Cached N resources" and a failure count if any URL did not respond.
## Download missing models
Models are not downloaded by the Cache resources button. A multi-gigabyte pull is not something we do silently. If your Language model or Embedding model rows are red:
1. Open **Settings > Models** (or **Settings > WebGPU** for the default WebGPU model).
2. Pick a model and click Download. Granite 4.0 Micro 3B is a good all-round starting point. Bonsai 1.7B q1 is 291 MB and still supports step-by-step reasoning if you are tight on disk.
3. Return to **Settings > Offline mode** and press Refresh. The row should flip green.
For the optional knowledge extraction row, see [How to Build a Knowledge Graph](/how-to/knowledge-graph/).
## Verify your setup
The safest way to check that everything works without actually disconnecting is the **Test offline mode** toggle.
1. In **Settings > Offline mode**, under the Mode section, flip the Test offline mode toggle (the amber one).
2. Open a vault, run a search, ask a question. Everything should work.
3. Try to run a chat against a remote provider like Claude. You should see an amber banner offering to switch to a local provider.
4. Flip Test offline mode off when you are done.
Test mode persists until the service worker reloads (browser restart or extension update). The real Offline Mode switch, by contrast, persists across browser restarts.
## Troubleshooting
**The Cache resources button returned some failed URLs.** Not every failure is a problem. If the docs site was briefly unreachable, the remaining URLs succeeded and you have most of what you need. Run the button again to retry.
**License shows Missing but I have a paid plan.** You have not activated the license in this browser. Open **Settings > License** and paste your key while online.
**Language model row stays red after download.** Open **Settings > Models Storage** to see what is actually cached. If your model does appear there but the Offline mode panel shows it missing, Refresh the panel. The inventory is computed via cache inspection and can lag by a few seconds after a download finishes.
**Storage quota is close to full.** Use [Reclaim Disk from Local Models](/how-to/models-storage/) to delete large models you no longer need, but only while online. The Models Storage panel is deliberately locked while offline mode is on so an accidental delete does not break your session.
## Next steps
- Read [How to Use Daneel Offline](/how-to/offline/) for the activation and recovery flow once everything is cached.
- Read [Offline Mode](/concepts/offline-mode/) to understand why certain calls are blocked and others are not.
---
# How to Index a Site
**Choose between sitemap and web crawl discovery to index any website for semantic search.**
> Source: https://doc.daneel.injen.io/how-to/site-indexing/index.md
Daneel can index an entire website so you can search and ask questions about its content. There are two discovery methods: **Sitemap**, which reads the site's `sitemap.xml`, and **Web Crawl**, which discovers pages by following links. This guide covers when to use each and how to configure them.
## Prerequisites
- Daneel installed and an AI provider configured (any provider works for embedding)
- Navigate to the site you want to index
## Open the Site panel
Click the Daneel icon on any page, then open the **Site** tab (magnifying glass icon). Daneel automatically checks for sitemaps when the panel opens.
## Choose a discovery method
After the sitemap check completes, you'll see two options:
### Sitemap
Best for sites that maintain a `sitemap.xml`. Daneel discovers sitemaps automatically from `robots.txt` and standard locations (`/sitemap.xml`, `/sitemap_index.xml`). It also checks path-level candidates based on your current URL.
When sitemaps are found:
1. Review the discovered sitemaps in the checklist. Each entry shows the URL and estimated page count.
2. Uncheck any sitemaps you don't want to include.
3. Set **Max pages** and **Depth**, then click **Crawl**.
### Web Crawl
Best for sites without a sitemap, or when the sitemap is incomplete. The crawler starts from your current page and discovers content by following every link it finds in the HTML, breadth-first.
When no sitemap is found, Daneel automatically selects Web Crawl. You can also switch to it manually when sitemaps exist but don't cover the full site.
1. Select the **Web Crawl** card.
2. Optionally set a **path prefix** to limit the crawl scope (see below).
3. Set **Max pages** and **Depth**, then click **Crawl**.
## Use the path prefix filter
When Web Crawl is selected, a **Path prefix** field appears. Daneel infers a prefix from your current URL. For example, if you're on `example.com/docs/getting-started`, the prefix is set to `/docs`.
The crawler only follows links whose path starts with this prefix. This keeps the crawl focused on a section of the site instead of indexing everything.
- Edit the prefix to narrow or widen the scope
- Click the **x** button to clear it entirely and crawl the whole site
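The rule reduces to a path `startsWith` check. `withinScope` is an illustrative name, not Daneel's actual code:

```typescript
// Illustrative sketch: a link is in scope when its path starts
// with the configured prefix. An empty prefix allows everything.
function withinScope(link: string, prefix: string): boolean {
  if (!prefix) return true;
  return new URL(link).pathname.startsWith(prefix);
}
```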
## Crawl settings
| Setting | Default | Description |
|---------|---------|-------------|
| Max pages | 50 | Maximum pages to fetch in this crawl session |
| Depth | 3 | For sitemap: nesting depth of sitemap indexes. For web crawl: BFS hops from the starting page |
## What happens during a crawl
Once you click Crawl, the task runs in the background:
1. **Discovery** finds page URLs (from sitemap or by following links)
2. **Extraction** converts each page's HTML to clean Markdown using Readability
3. **Chunking** splits the Markdown into overlapping segments
4. **Embedding** converts each chunk to a vector using the active embedding model
5. **Storage** saves vectors to IndexedDB, partitioned by domain
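Step 3 can be sketched as a sliding window. The sizes below are illustrative, and Daneel's real chunker may be token-aware rather than character-based:

```typescript
// Sliding-window chunking with overlap. Character-based sizes
// here are illustrative only.
function chunk(text: string, size = 200, overlap = 40): string[] {
  const chunks: string[] = [];
  const step = size - overlap; // each window starts `step` chars after the last
  for (let start = 0; start < text.length; start += step) {
    chunks.push(text.slice(start, start + size));
    if (start + size >= text.length) break; // last window reached the end
  }
  return chunks;
}
```

The overlap means a sentence split across a chunk boundary still appears whole in at least one chunk, which keeps embeddings meaningful.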
A progress bar shows crawl and embedding progress. You can close the panel or navigate away; the task continues. See [Monitor Background Tasks](/how-to/background-tasks/) for details.
## Cancel a crawl
Click **Cancel** next to the progress bar, or go to **Settings > Tasks** and stop the task from there.
## After indexing
Once the crawl completes, the Site panel switches to search view. Type a question to search across all indexed pages. Results are ranked by semantic similarity and include source links.
To re-index, clear, or manage stored data, see [Manage Site Indexes](/how-to/manage-indexes/).
## Web Crawl safety guards
The web crawler includes several protections against runaway crawls:
- **Same-origin only**: links to other domains are discovered but not followed
- **Query normalization**: pagination parameters (`page`, `offset`, `cursor`, etc.) are stripped, so `/results?page=1` and `/results?page=2` are treated as the same URL
- **Path depth cap**: URLs with more than 10 path segments are skipped
- **Queue limit**: the internal queue is capped at 3x the max pages setting
- **Retry with backoff**: server errors (5xx) are retried up to twice with exponential backoff; client errors (4xx) are skipped immediately
- **Optional robots.txt**: when enabled, the crawler respects `User-agent: *` Disallow rules
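The query-normalization guard amounts to stripping known pagination parameters before deduplication. The exact parameter list below is an assumption:

```typescript
// Strip pagination-style query parameters so paged URLs collapse
// to a single queue entry. The parameter list is an assumption.
const PAGINATION_PARAMS = new Set(["page", "offset", "cursor", "start"]);

function normalizeUrl(raw: string): string {
  const url = new URL(raw);
  for (const key of [...url.searchParams.keys()]) {
    if (PAGINATION_PARAMS.has(key)) url.searchParams.delete(key);
  }
  url.hash = ""; // fragments never change the fetched page
  return url.toString();
}
```

With this, `/results?page=1` and `/results?page=2` normalize to the same URL and only one is fetched.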
## Next steps
- Learn [how RAG works](/concepts/rag/) under the hood
- [Build a Document Vault](/guides/first-vault/) for local files
- [Monitor Background Tasks](/how-to/background-tasks/) to track crawl progress
---
# How to Read Messages Aloud and Dictate Questions
**Enable speech in Daneel, pick a voice, download the local Kokoro model, and use the Alt+Space shortcut to dictate.**
> Source: https://doc.daneel.injen.io/how-to/speech/index.md
Daneel can read any assistant reply aloud and take your questions by voice. You choose between three text-to-speech providers depending on whether you want instant setup, the best voice quality, or full on-device privacy. Dictation uses the browser's built-in recognizer today, with a local Moonshine option in the pipeline.
For background on privacy trade-offs between the providers, see [Speech in Daneel](/concepts/speech/). For a full list of controls, see [Speech Reference](/reference/speech/).
## Enable speech
Open **Settings > Speech** in the widget. Two toggles live at the top of each section:
- **Text-to-speech** — enables the Play buttons on assistant messages and the Auto-read option.
- **Speech recognition** — enables the mic button in the composer.
Both default to on. Each section has its own provider picker below.
## Read a reply aloud
Hover over any assistant message. A **Play** button appears in the action row next to **Copy** and **Delete**. Click it and the assistant's reply reads aloud using the currently selected TTS provider. Click **Stop** to interrupt, or click Play on a different message to cut the first one cleanly and start the new one.
Send a new question while a reply is reading? The current playback cancels automatically. No overlap.
## Pick a TTS provider
In **Settings > Speech > Text-to-speech**, three provider cards appear:
- **System voices** (default) — uses the voices your operating system already has. Starts instantly, nothing to download, no data leaves your machine.
- **Kokoro 82M** — a 326 MB local neural TTS model that runs on your GPU. Needs a one-time download. Delivers expressive, natural voices across seven languages.
- **Moonshine** — marked "Coming soon". Placeholder for the upcoming local STT provider; not selectable yet.
Click a card to make that provider active. The next Play click uses it.
### Opt into Google Cloud voices
Some of the voices your browser exposes, typically the ones named `Google UK English Male` and similar, stream text to Google servers for a richer prosody. Daneel filters them out by default to keep speech on-device.
To enable them:
1. Pick **System voices** as the active TTS provider.
2. Expand the **Advanced** accordion under the voice picker.
3. Flip **Allow Google cloud voices**.
The voice list refreshes. Cloud voices appear with a `(cloud)` suffix. They sound remarkable, and the trade-off is explicit.
:::note
When [Offline Mode](/how-to/offline/) is active, cloud voices still play (the speech synthesis API is not part of the network gate). The speech recognition side, however, is blocked because it genuinely leaves the machine.
:::
## Download and use Kokoro
Kokoro is the option to pick when you want TTS to stay fully local.
1. **Settings > Speech > Text-to-speech** > click the **Download (~326 MB)** button on the Kokoro card.
2. A progress bar shows the model fetching from Hugging Face. First load takes a few minutes on a typical connection; the download is cached in your browser and reused forever.
3. When complete, a green **Downloaded** pill replaces the button, and a **Remove** link appears next to it if you want to free the space later.
4. Click the Kokoro card to make it the active TTS provider. The voice picker refreshes to show 54 Kokoro voices across US English, British English, Spanish, French, Italian, Hindi, Japanese, and Mandarin.
Click **Test** next to the voice picker to hear a short sample in the chosen voice before committing.
:::note
Kokoro runs on WebGPU when your hardware supports it, which is the case for most GPUs since 2018. No WebGPU means Kokoro will not be usable; stick to System voices in that case.
:::
## Dictate a question
A mic button sits next to the Send button in every chat composer. Hold-and-release or click to toggle recording:
1. Click the mic button. On first use, your browser asks for microphone permission.
2. Speak your question.
3. Click the mic again to stop. The transcript lands in the composer input box. It does **not** auto-send, which gives you a chance to read it first.
4. Correct anything if needed, then click **Send**.
### Alt+Space from anywhere
You do not need to find the mic button with your cursor. The keyboard shortcut **Alt+Space** toggles dictation from anywhere on the page, even while you are scrolling through content. Press once to start, once more to stop.
If the shortcut does not work, check `chrome://extensions/shortcuts`. Chrome occasionally reassigns shortcuts when another extension claims the same keys.
## Auto-read every reply
If you would rather not click Play on each message, flip the **Auto-read responses** toggle under the voice picker. Every new assistant message plays automatically the moment it finishes streaming. Asking a new question interrupts the current playback cleanly.
## Change the speaking speed
The **Speed** slider under the voice picker ranges from 0.5× to 2.0×. The setting applies to whichever provider is active and takes effect on the next Play click.
## Switch providers mid-session
All speech settings are live. Change the active provider, pick a different voice, flip Auto-read, adjust speed, and the very next Play uses the new configuration. No reload, no restart.
---
# How to Browse Linked Pages from a Vault Document
**Click links inside any web-origin vault document to open the next page in the same viewer, with a back button.**
> Source: https://doc.daneel.injen.io/how-to/vault-mini-browser/index.md
When a vault document was imported from a web page, the document viewer turns into a small markdown-based browser. Click any link inside the document, and the next page is fetched, converted to clean markdown, and displayed right where the previous one was. A back button keeps your trail.
This guide shows how to use that surface and what to expect.
## Prerequisites
- A [document vault](/guides/first-vault/) with at least one document that came from a web page
- A document either:
- Saved from the chat using **Add to vault** (any open browser tab), or
- Imported through the **Wikipedia panel** that appears when you click a node in the [knowledge graph viewer](/how-to/explore-knowledge-graph/)
Local files (`.md`, `.pdf`, `.docx`, `.html` imported from disk) do not become navigable. They render as plain markdown like before. Only documents with a known web origin are treated as browsable.
## Open a navigable document
1. Open the vault overlay and select your vault.
2. In the **Documents** list, click a document that was originally a web page. The viewer pane opens with a header showing the source label (for example, "Wikipedia") and the page title.
3. The document is now displayed as a navigable surface. The **Add to vault** button is replaced by an **In vault** badge, since the page is already saved.
## Click links to navigate
Click any link inside the rendered markdown. The viewer briefly shows a "Fetching page…" state, then replaces its content with the linked page, also rendered as markdown.
Each navigation pushes the previous page onto a back stack. When the stack has at least one entry, a small **Back arrow** appears at the top-left of the viewer header.
You can hop several pages deep, then click **Back** to return to where you came from. Going back never re-fetches: pages are kept in memory for the session.
## Open a link in your real browser instead
Hold any modifier key while clicking — **Ctrl**, **Cmd**, **Shift**, or **Alt** — or use a **middle-click**. The viewer ignores the click, and the browser opens the link in a new tab as usual.
The header also has a small external-link icon next to the **Add to vault** button. Click it to open the current page in a real browser tab.
## Close the navigable view
Click the **X** at the top-right of the viewer header. The viewer closes, the document is unscoped, and the chat returns to your full-vault view.
## What gets fetched, where, and what's safe
In-viewer fetches go through the extension's background worker, which sidesteps the cross-origin restrictions that would otherwise block in-page fetches. The fetched HTML is processed by the same Readability-based extraction the rest of Daneel uses for page Q&A, then sanitized with [DOMPurify](https://github.com/cure53/DOMPurify) before it lands in the viewer. Scripts, inline event handlers, and dangerous attributes are stripped — nothing the page sends can execute.
See [Privacy Model](/concepts/privacy/) for the broader picture of what stays local and what leaves your machine.
## Limits and known trade-offs
- **Some sites refuse to cooperate.** Pages that gate everything behind a login, or that aggressively detect non-browser fetches, will fail to load. When this happens you get an error toast, and the viewer stays on the current page — your back stack is preserved.
- **Readability is not perfect.** It is excellent for article bodies but sometimes drops sidebars, callouts, or rich-formatted elements. If a page looks too thin in the viewer, use the external-link icon to open the original.
- **Image links are not navigated.** Direct links to image files (`.jpg`, `.png`, `.svg`, etc.) open in the browser instead of attempting markdown extraction.
- **In-page anchors (`#section`) keep their default scroll behavior** and are not intercepted.
- **There is no forward button yet** — the back stack is one-way for now. Use the document list to start a new navigation.
- **Local file documents are not navigable.** Their links, if any, behave like normal browser links.
## Save a hopped-to page back into the vault
If you navigate to a page you'd like to keep, click **Add to vault** in the viewer header. The page is added to the active vault as a new markdown document with its source URL recorded. You can reopen it later as a navigable surface, just like the page you started from.
---
# How to Use the Wikidata Fact Box
**Surface structured facts, external IDs, and disambiguation candidates from Wikidata when you click entities in your knowledge graph.**
> Source: https://doc.daneel.injen.io/how-to/wikidata-fact-box/index.md
The Wikidata fact box appears beside the Wikipedia panel whenever you click an entity in the 3D knowledge graph. Where Wikipedia shows you prose, Wikidata shows you a scannable card: image, label, curated key-value facts, authority links, and a deep link back to the canonical record. This guide covers the panel's states, the disambiguation flow, and the offline behavior.
## Prerequisites
- A vault with a built [knowledge graph](/how-to/knowledge-graph/)
- Internet access on the first lookup for each entity (subsequent clicks are cached)
- No setup needed — the fact box ships enabled by default
## Open the fact box
Click any node in the 3D knowledge graph view. Two panels appear in the top-left corner of the canvas:
- **Wikipedia** (violet icon) — a list of matching articles
- **Wikidata** (amber icon) — the fact box for the most likely QID
The panels load independently. The fact box shows a spinner with "Finding entity…" while resolution runs, then "Loading facts…" while the entity payload is fetched, then the rendered card.
## Understand the fact box layout
When resolution succeeds, the panel renders in four stacked sections:
**Header**
- A thumbnail (from Wikidata's P18 "image" property, served via Commons)
- The canonical label in your UI language
- The one-line description from Wikidata
- A QID badge that opens `wikidata.org/wiki/Q…` in a new tab
**Grouped facts**
Curated per entity type. Example for a person:
- *Life* — date of birth, date of death, place of birth, place of death, citizenship
- *Career* — occupation, educated at, employer, doctoral advisor, influenced by
- *Relations* — spouse, child
- *Recognition* — awards received, notable works
Dates are formatted with precision respected: day when Wikidata has it, month or year when the record is less specific. Multi-value properties (e.g., two citizenships) are joined with a middle-dot separator.
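Precision-respecting formatting can be sketched against Wikidata's numeric precision codes (9 is year, 10 is month, 11 is day). Daneel's actual formatter may differ, and real Wikidata time strings also carry a leading sign that would need trimming first:

```typescript
// Format a date at the precision Wikidata actually recorded.
// Precision codes: 9 = year, 10 = month, 11 = day.
function formatWikidataDate(iso: string, precision: number): string {
  const options: Intl.DateTimeFormatOptions =
    precision >= 11 ? { year: "numeric", month: "long", day: "numeric" } :
    precision === 10 ? { year: "numeric", month: "long" } :
    { year: "numeric" };
  return new Intl.DateTimeFormat("en", { ...options, timeZone: "UTC" })
    .format(new Date(iso));
}
```

With precision 11 a birth date renders as "March 14, 1879"; with precision 9, just "1879".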
**External IDs**
A row of small chips for known authority records. Clickable ones open the external service directly:
- VIAF, ORCID, ISNI — scholarly and name authority IDs
- IMDB, MusicBrainz — entertainment
- GitHub, Twitter — online identity
- GND, LoC — library catalogs
**Attribution footer**
A single line: "Data from Wikidata · CC0", with a link to the Wikidata entity page.
## Pick the right entity when the name is ambiguous
Common names produce multiple candidates. When the resolver is not confident enough to auto-select, the panel switches to a picker:
- A short "Multiple matches — pick one:" header
- Up to 5 candidate rows, each showing: label, description, QID, confidence score
- Click any row to commit
Your pick is cached for 30 days and keyed on the entity text + ontology type. The next time the same entity appears in any vault with the same type, it resolves instantly to your chosen QID.
If you later realise the pick was wrong, clicking the node again currently reloads the cached choice. Clearing browser storage for the extension resets all resolutions.
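The caching rule can be sketched as a composite key plus a TTL that depends on how the QID was chosen (the 30-day and 1-day windows match the Cache lifetimes section of this page). Key format and lowercase normalization are assumptions:

```typescript
// Sketch of the resolution cache: keyed on entity text + ontology
// type. User-confirmed picks live 30 days, auto-selected ones 1 day.
// The key format and lowercase normalization are assumptions.
interface Resolution {
  qid: string;
  confirmedByUser: boolean;
  storedAtMs: number;
}

const DAY_MS = 24 * 60 * 60 * 1000;

function resolutionKey(entityText: string, ontologyType: string): string {
  return `${entityText.toLowerCase()}::${ontologyType}`;
}

function isFresh(r: Resolution, nowMs: number): boolean {
  const ttl = (r.confirmedByUser ? 30 : 1) * DAY_MS;
  return nowMs - r.storedAtMs < ttl;
}
```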
## Work with different entity types
Daneel maps your knowledge graph's ontology labels to Wikidata's class hierarchy so the reconciliation stage can filter candidates by type. Five pillars are recognized out of the box:
- **Person** — labels like `person`, `human`, `fictional_character`, `deity`
- **Organization** — `organization`, `company`, `university`, `political_party`, `sports_team`, `news_outlet`
- **Place** — `location`, `country`, `city`, `region`, `continent`, `body_of_water`, `mountain`, `building`, `monument`
- **Creative work** — `book`, `film`, `song`, `album`, `artwork`, `television_show`, `video_game`
- **Event** — `war`, `battle`, `election`, `revolution`, `treaty`, `conference`
Custom ontology labels that don't match any of the above still work: the reconciliation runs without a type filter, candidates come back ranked by label match, and the disambiguation picker shows top results.
## Use the panel offline
The fact box is gated by [Offline Mode](/how-to/offline/). When the switch is on:
- **Cached entities still render.** If you looked up Einstein yesterday and his payload is still fresh in `chrome.storage`, the fact box loads normally from cache, no network needed.
- **New lookups show a paused notice.** Clicking a node whose QID has never been resolved shows "OFFLINE MODE — Wikidata lookup is paused".
- **External ID chips still link.** The chip URLs open wikidata.org and authority sites in a new tab; the extension itself issues no outbound call.
To pre-warm the cache before going offline, click through the entities you want to have ready while online.
## Rate-limit behavior
All calls to Wikidata and the reconciliation service share a three-slot concurrency limit. If you rapidly click through a dozen nodes, no more than three network requests run at a time. The rest queue silently.
If Wikimedia responds with a 429 (rate limited), the panel respects the `Retry-After` header and retries up to three times with exponential backoff. On repeated 429s the panel shows "Couldn't reach Wikidata" — wait a minute and try again.
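The retry schedule reduces to a small delay function: honor `Retry-After` when the server sends one, otherwise back off exponentially. The 1-second base delay is an assumption:

```typescript
// Delay before retry attempt `attempt` (0-based). A Retry-After
// value from the server wins; otherwise back off exponentially.
// The 1-second base delay is an assumption.
function retryDelayMs(attempt: number, retryAfterSeconds?: number): number {
  if (retryAfterSeconds !== undefined) return retryAfterSeconds * 1000;
  return 1000 * 2 ** attempt; // 1s, 2s, 4s for attempts 0, 1, 2
}
```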
## Cache lifetimes
Three separate caches keep the panel responsive:
- **Resolutions** (text + type → QID): 30 days for user-confirmed picks, 1 day for auto-selected
- **Entity payloads** (QID → full simplified claims): 7 days
- **Label map** (QID or property ID → human-readable label): LRU-capped at 5,000 entries, untimed
Labels change rarely, so the map grows monotonically across sessions until you hit the cap.
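An LRU cap like this is commonly built on a Map's insertion order. A minimal sketch, not Daneel's actual implementation:

```typescript
// Minimal LRU label map capped at 5,000 entries, no expiry.
// A Map's insertion order doubles as the recency order.
class LabelMap {
  private map = new Map<string, string>();
  private cap: number;
  constructor(cap = 5000) {
    this.cap = cap;
  }
  get(id: string): string | undefined {
    const label = this.map.get(id);
    if (label !== undefined) {
      this.map.delete(id); // re-insert to mark as most recently used
      this.map.set(id, label);
    }
    return label;
  }
  set(id: string, label: string): void {
    this.map.delete(id);
    this.map.set(id, label);
    if (this.map.size > this.cap) {
      const oldest = this.map.keys().next().value; // least recently used
      if (oldest !== undefined) this.map.delete(oldest);
    }
  }
}
```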
## Close the panel
Click the **X** in the panel header to dismiss the fact box. The Wikipedia panel stays open independently — you can dismiss either, both, or neither.
Closing the 3D graph view or switching documents clears the panel automatically.
## Language
Version 1 is English only. The label on your node is resolved against English Wikidata labels, the fact box renders in English, and the cache is language-unaware. Multilingual support is tracked for a later release.
For background on how resolution works — reconciliation with a type filter, running in parallel with the general entity search, merged and thresholded — see [Entity Resolution](/concepts/entity-resolution/).
## Next steps
- For the broader 3D graph exploration workflow, see [How to Explore Your Knowledge Graph](/how-to/explore-knowledge-graph/).
- For how entities are extracted from your documents in the first place, see [Knowledge Graphs](/concepts/knowledge-graph/).
- For the offline-mode rules that gate the panel, see [Offline Mode](/concepts/offline-mode/).
---
# How to Chat with a YouTube Video
**Ask questions about any YouTube video using its transcript.**
> Source: https://doc.daneel.injen.io/how-to/youtube-chat/index.md
Daneel automatically extracts transcripts from YouTube videos, letting you ask questions about video content the same way you chat with any other page.
## Steps
1. Navigate to any YouTube video in Chrome.
2. Open the Daneel chat panel (sparkles icon on the launcher).
3. Daneel detects the YouTube URL and extracts the transcript automatically.
4. Ask a question about the video:
> *What are the three main points discussed in this talk?*
The AI receives the full transcript with timestamps and responds based on the video content.
## How it works
Daneel extracts transcripts via YouTube's InnerTube API:
1. Detects `youtube.com/watch`, `youtu.be`, or `m.youtube.com/watch` URLs
2. Parses video metadata (title, channel, duration)
3. Fetches caption tracks — preferring manual captions over auto-generated ones
4. Merges short fragments into natural sentences with timestamps
5. Formats everything as Markdown:
```
# Video Title
**Channel**: Creator Name · **Duration**: 12:34 · **Language**: English
## Transcript
[0:00] First sentence of the video...
[0:15] Second sentence continues here...
```
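Steps 4 and 5 above can be sketched in TypeScript. This is an illustrative reconstruction, not Daneel's actual implementation — the `CaptionFragment` shape, function names, and the end-of-sentence heuristic (terminal punctuation) are assumptions:

```typescript
// Hypothetical sketch: merge short caption fragments into sentence-level
// transcript lines, keeping the timestamp of the first fragment in each line.
interface CaptionFragment {
  startSec: number; // offset from video start, in seconds
  text: string;
}

interface TranscriptLine {
  startSec: number;
  text: string;
}

function mergeFragments(fragments: CaptionFragment[]): TranscriptLine[] {
  const lines: TranscriptLine[] = [];
  let current: TranscriptLine | null = null;
  for (const frag of fragments) {
    const text = frag.text.trim();
    if (!text) continue;
    if (current === null) {
      current = { startSec: frag.startSec, text };
    } else {
      current.text += " " + text;
    }
    // Close the line when the accumulated text ends in terminal punctuation.
    if (/[.!?]$/.test(current.text)) {
      lines.push(current);
      current = null;
    }
  }
  if (current) lines.push(current); // flush a trailing unterminated line
  return lines;
}

// Format a timestamp as [m:ss] for the Markdown transcript.
function formatTimestamp(sec: number): string {
  const m = Math.floor(sec / 60);
  const s = Math.floor(sec % 60);
  return `[${m}:${String(s).padStart(2, "0")}]`;
}
```

A real implementation would also need to handle fragments that never hit punctuation (auto-generated captions often lack it), e.g. by flushing on a gap or a maximum length.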
## Supported videos
- Any video with captions (manual or auto-generated)
- Any language with available caption tracks
- Daneel prefers: manual English > any manual > auto-generated English > first available
For videos without any captions, Daneel falls back to the visible page content (title, description).
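The preference order above amounts to a short cascade of filters. A minimal sketch, assuming a hypothetical `CaptionTrack` shape (the field names are illustrative, not Daneel's actual types):

```typescript
// Hypothetical sketch of the caption-track preference order:
// manual English > any manual > auto-generated English > first available.
interface CaptionTrack {
  languageCode: string; // e.g. "en", "en-US", "fr"
  isAutoGenerated: boolean;
}

function pickCaptionTrack(tracks: CaptionTrack[]): CaptionTrack | undefined {
  const isEnglish = (t: CaptionTrack) => t.languageCode.startsWith("en");
  return (
    tracks.find((t) => !t.isAutoGenerated && isEnglish(t)) ?? // manual English
    tracks.find((t) => !t.isAutoGenerated) ??                 // any manual
    tracks.find((t) => t.isAutoGenerated && isEnglish(t)) ??  // auto English
    tracks[0]                                                 // first available
  );
}
```

Returning `undefined` when the track list is empty is what triggers the page-content fallback described above.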
## Next steps
- [Your First Page Chat](/guides/first-page-chat/) covers the general page chat flow
- Read about [how RAG works](/concepts/rag/) for the pipeline behind content extraction
---
# Changelog
**Release history and notable changes for every version of Daneel AI.**
> Source: https://doc.daneel.injen.io/reference/changelog/index.md
## [1.50.0](https://github.com/daneel-ai/extension-code/compare/v1.49.0...v1.50.0) (2026-04-25)
### Features
* **research:** add /research and /paper skills with publishing pipeline ([#213](https://github.com/daneel-ai/extension-code/issues/213)) ([6373b5b](https://github.com/daneel-ai/extension-code/commit/6373b5b8337f4dea12cc38ff36c69be61e77ad2c))
## [1.49.0](https://github.com/daneel-ai/extension-code/compare/v1.48.1...v1.49.0) (2026-04-24)
### Features
* simplify license recovery to email-only ([#210](https://github.com/daneel-ai/extension-code/issues/210)) ([fbeba3f](https://github.com/daneel-ai/extension-code/commit/fbeba3fbf3702d413da204789e4de09481018b03))
### Bug Fixes
* **site:** public home renders correctly on mobile ([#212](https://github.com/daneel-ai/extension-code/issues/212)) ([f2fe1a4](https://github.com/daneel-ai/extension-code/commit/f2fe1a4174137164bde57a7323c5e0714fe80d61))
## [1.48.1](https://github.com/daneel-ai/extension-code/compare/v1.48.0...v1.48.1) (2026-04-24)
### Documentation
* unlock paid features + manage license key ([#208](https://github.com/daneel-ai/extension-code/issues/208)) ([bbe530e](https://github.com/daneel-ai/extension-code/commit/bbe530e1805745bb1ad283351ccfe98801529c5c))
## [1.48.0](https://github.com/daneel-ai/extension-code/compare/v1.47.0...v1.48.0) (2026-04-24)
### Features
* transactional emails + two-factor license recovery ([#207](https://github.com/daneel-ai/extension-code/issues/207)) ([d61d4ed](https://github.com/daneel-ai/extension-code/commit/d61d4ed7c966416b7afadea511970e5145bdfd73))
### Bug Fixes
* CI coverage-job race + Privacy Policy tab inline rendering ([#205](https://github.com/daneel-ai/extension-code/issues/205)) ([e2e1d51](https://github.com/daneel-ai/extension-code/commit/e2e1d5129959356ec0a41aca87a1fb8f6052d2a3))
## [1.47.0](https://github.com/daneel-ai/extension-code/compare/v1.46.0...v1.47.0) (2026-04-24)
### Features
* **extension:** link the privacy policy from Settings > Privacy ([#204](https://github.com/daneel-ai/extension-code/issues/204)) ([4bb0df0](https://github.com/daneel-ai/extension-code/commit/4bb0df0af13ce65cca96ba60af9dca552f6bb192))
* **site:** Chrome Web Store-compliant privacy policy + static pages pipeline ([#202](https://github.com/daneel-ai/extension-code/issues/202)) ([22ad96e](https://github.com/daneel-ai/extension-code/commit/22ad96e88da86a2cb5f9302f3947d81207c375dc))
## [1.46.0](https://github.com/daneel-ai/extension-code/compare/v1.45.0...v1.46.0) (2026-04-23)
### Features
* **extension:** Data I/O as long-running tasks + OS toast notifications + KG in backups ([#199](https://github.com/daneel-ai/extension-code/issues/199)) ([aad0fbe](https://github.com/daneel-ai/extension-code/commit/aad0fbe878ce4e90157474d17e70d1bf0e878114))
## [1.45.0](https://github.com/daneel-ai/extension-code/compare/v1.44.0...v1.45.0) (2026-04-22)
### Features
* **site:** UTM-tag syndication feed click destinations ([#196](https://github.com/daneel-ai/extension-code/issues/196)) ([d4ea212](https://github.com/daneel-ai/extension-code/commit/d4ea212b7f1f763253870da157a48145add8b7e8))
### Bug Fixes
* **site:** escape & in UTM-tagged URLs emitted in RSS and Atom ([#198](https://github.com/daneel-ai/extension-code/issues/198)) ([99ce906](https://github.com/daneel-ai/extension-code/commit/99ce90656a4ed11536b6a6d378351b834b9582b7))
## [1.44.0](https://github.com/daneel-ai/extension-code/compare/v1.43.0...v1.44.0) (2026-04-22)
### Features
* **site:** hardware-check benchmark + RC-only CTAs ([#194](https://github.com/daneel-ai/extension-code/issues/194)) ([31eef16](https://github.com/daneel-ai/extension-code/commit/31eef16ad408f99d9245f8a1946f5ed7eb9f8f5a))
* **site:** route RC CTAs to GitHub issue-form template ([#195](https://github.com/daneel-ai/extension-code/issues/195)) ([348c2d7](https://github.com/daneel-ai/extension-code/commit/348c2d71eb0ead00fbe569ac46c93769d82294b4))
### Documentation
* **claude:** add OS Notifications planned-feature section to backlog ([#192](https://github.com/daneel-ai/extension-code/issues/192)) ([3e6b4e2](https://github.com/daneel-ai/extension-code/commit/3e6b4e244e7e5c2b37d127ed26f5304a3bb266c0))
## [1.43.0](https://github.com/daneel-ai/extension-code/compare/v1.42.2...v1.43.0) (2026-04-22)
### Features
* **chat:** search_images tool via swappable ImageSearchProvider ([#190](https://github.com/daneel-ai/extension-code/issues/190)) ([5076ceb](https://github.com/daneel-ai/extension-code/commit/5076ceb4907b651cd7479805e1a9c745f9d19f1e))
* **chat:** universal fetch_url tool with abuse guardrails ([#188](https://github.com/daneel-ai/extension-code/issues/188)) ([944e871](https://github.com/daneel-ai/extension-code/commit/944e8715f378cfc98d57d23916d07d0378b229e4))
### Documentation
* **search-images:** correct DDG safesearch claim + opt-in integration test ([#191](https://github.com/daneel-ai/extension-code/issues/191)) ([6229d56](https://github.com/daneel-ai/extension-code/commit/6229d56bd1429100eff16b7331d3195afc4ba9f6))
## [1.42.2](https://github.com/daneel-ai/extension-code/compare/v1.42.1...v1.42.2) (2026-04-22)
### Documentation
* **rag:** mark SourceIndex augmentWithLiteralMatches as shipped ([#186](https://github.com/daneel-ai/extension-code/issues/186)) ([418056e](https://github.com/daneel-ai/extension-code/commit/418056e652bc021d453241138e71d588a7980b69))
## [1.42.1](https://github.com/daneel-ai/extension-code/compare/v1.42.0...v1.42.1) (2026-04-22)
### Bug Fixes
* **rag:** wire augmentWithLiteralMatches into SiteSourceIndex + VaultSourceIndex ([#184](https://github.com/daneel-ai/extension-code/issues/184)) ([550584e](https://github.com/daneel-ai/extension-code/commit/550584e563753eb2378419c4dae3896d7cbf4cb0))
## [1.42.0](https://github.com/daneel-ai/extension-code/compare/v1.41.0...v1.42.0) (2026-04-22)
### Features
* **embeddings:** asymmetric query/passage prefix support ([#181](https://github.com/daneel-ai/extension-code/issues/181)) ([8962d75](https://github.com/daneel-ai/extension-code/commit/8962d7508b8ccaf101ce5f66f85cc1c5fffca701))
### Documentation
* **wrap:** embedding-prefix notes + article copy refinements ([#183](https://github.com/daneel-ai/extension-code/issues/183)) ([1878a0e](https://github.com/daneel-ai/extension-code/commit/1878a0eb1ec9c8b76a7668bb7f6302bb23bad7ec))
## [1.41.0](https://github.com/daneel-ai/extension-code/compare/v1.40.2...v1.41.0) (2026-04-22)
### Features
* **mcp:** 20 new featured servers + tags field on FeaturedMcpServer ([#178](https://github.com/daneel-ai/extension-code/issues/178)) ([cf61182](https://github.com/daneel-ai/extension-code/commit/cf6118293988d46862aaec39adeb758e37f07f76))
## [1.40.2](https://github.com/daneel-ai/extension-code/compare/v1.40.1...v1.40.2) (2026-04-21)
### Documentation
* **speech:** CLAUDE.md + docs site (3 pages) + Settings docsLinks + news article ([#177](https://github.com/daneel-ai/extension-code/issues/177)) ([5d1343b](https://github.com/daneel-ai/extension-code/commit/5d1343bc17c6614ddbbe23d26f7be8c1414aaaf6))
## [1.40.1](https://github.com/daneel-ai/extension-code/compare/v1.40.0...v1.40.1) (2026-04-21)
### Bug Fixes
* **speech:** Kokoro card unselectable + voice picker missing + stale voice-id leak ([#173](https://github.com/daneel-ai/extension-code/issues/173)) ([09d034e](https://github.com/daneel-ai/extension-code/commit/09d034e7f51da0413fbd8df8f6a359d2ac5f5e99))
## [1.40.0](https://github.com/daneel-ai/extension-code/compare/v1.39.0...v1.40.0) (2026-04-21)
### Features
* **speech:** M2c — createSpeechControls routes to KokoroTTSProvider when selected ([#171](https://github.com/daneel-ai/extension-code/issues/171)) ([07d4337](https://github.com/daneel-ai/extension-code/commit/07d4337fb0307a0fdec7d1ddae4a333cb950ab30))
## [1.39.0](https://github.com/daneel-ai/extension-code/compare/v1.38.0...v1.39.0) (2026-04-21)
### Features
* **speech:** M2b — Kokoro download button + progress + cached-state detection ([#169](https://github.com/daneel-ai/extension-code/issues/169)) ([d963874](https://github.com/daneel-ai/extension-code/commit/d9638741c8d581231adc15b0b2c6b7390da350fb))
## [1.38.0](https://github.com/daneel-ai/extension-code/compare/v1.37.0...v1.38.0) (2026-04-21)
### Features
* **speech:** M2a — Kokoro provider + host PlaybackQueue + worker ([#166](https://github.com/daneel-ai/extension-code/issues/166)) ([94ff43b](https://github.com/daneel-ai/extension-code/commit/94ff43b355e0fd39c7a187b77646c8511797ea56))
### Bug Fixes
* **speech:** wire autoRead — new assistant messages auto-play when enabled ([#168](https://github.com/daneel-ai/extension-code/issues/168)) ([6e9ca93](https://github.com/daneel-ai/extension-code/commit/6e9ca93702ff38c9549058d0d1d087c2fc26f8c2))
## [1.37.0](https://github.com/daneel-ai/extension-code/compare/v1.36.0...v1.37.0) (2026-04-21)
### Features
* **speech:** M1d — wire mic + play buttons end-to-end in ChatOverlay + VaultOverlay ([#164](https://github.com/daneel-ai/extension-code/issues/164)) ([3551998](https://github.com/daneel-ai/extension-code/commit/35519988d61963a0ede3eb3047ffb18cf9d107e9))
## [1.36.0](https://github.com/daneel-ai/extension-code/compare/v1.35.0...v1.36.0) (2026-04-21)
### Features
* **speech:** M1c — UI wiring (settings panel, mic + play buttons, shortcut) ([#162](https://github.com/daneel-ai/extension-code/issues/162)) ([3f994d9](https://github.com/daneel-ai/extension-code/commit/3f994d99a5b9cd175f87756e8a6a939db7c85c1b))
## [1.35.0](https://github.com/daneel-ai/extension-code/compare/v1.34.2...v1.35.0) (2026-04-21)
### Features
* **speech:** M1 foundation — settings, catalog, privacy, interfaces, pure logic ([#157](https://github.com/daneel-ai/extension-code/issues/157)) ([b365e09](https://github.com/daneel-ai/extension-code/commit/b365e09bb631c8711a4ab6e4a00dfd855a2a5407))
* **speech:** M1b — WebSpeechTTSProvider + WebSpeechSTTProvider + contracts ([#161](https://github.com/daneel-ai/extension-code/issues/161)) ([29f1aa1](https://github.com/daneel-ai/extension-code/commit/29f1aa1346ec7641fa7609c47124b67a60ebef23))
## [1.34.2](https://github.com/daneel-ai/extension-code/compare/v1.34.1...v1.34.2) (2026-04-21)
### Code Refactoring
* **chat:** unify chat surface across Page/Site/Vault ([#154](https://github.com/daneel-ai/extension-code/issues/154)) ([9b921d4](https://github.com/daneel-ai/extension-code/commit/9b921d47112310d4793b8368e4a4979774fb31c5))
## [1.34.1](https://github.com/daneel-ai/extension-code/compare/v1.34.0...v1.34.1) (2026-04-21)
### Bug Fixes
* **vault:** live context preview, MCP disabled-state filter, picker freshness ([#152](https://github.com/daneel-ai/extension-code/issues/152)) ([e7f184a](https://github.com/daneel-ai/extension-code/commit/e7f184a7ee64dcdea7e704708b3a3667186f1b4c))
## [1.34.0](https://github.com/daneel-ai/extension-code/compare/v1.33.1...v1.34.0) (2026-04-21)
### Features
* **webgpu:** LFM2 gen-param matrix + progress UX + registry-driven vault RAG ([#150](https://github.com/daneel-ai/extension-code/issues/150)) ([90417a8](https://github.com/daneel-ai/extension-code/commit/90417a838ca252e4b66568238ddbcceb176f18de))
## [1.33.1](https://github.com/daneel-ai/extension-code/compare/v1.33.0...v1.33.1) (2026-04-20)
### Bug Fixes
* **settings:** mask license key in active status ([#148](https://github.com/daneel-ai/extension-code/issues/148)) ([469c929](https://github.com/daneel-ai/extension-code/commit/469c929d2d2e65e3cad9d5fc01819e6ecf46a8fb))
## [1.33.0](https://github.com/daneel-ai/extension-code/compare/v1.32.0...v1.33.0) (2026-04-20)
### Features
* **settings:** knowledge graphs in Indexes panel with size bars ([#146](https://github.com/daneel-ai/extension-code/issues/146)) ([c6abaf8](https://github.com/daneel-ai/extension-code/commit/c6abaf8262bbdc6079e3cd4d7f23bab69ffaa57d))
## [1.32.0](https://github.com/daneel-ai/extension-code/compare/v1.31.0...v1.32.0) (2026-04-18)
### Features
* **wikidata:** fact-box panel with reconciliation, caching, and telemetry template ([#144](https://github.com/daneel-ai/extension-code/issues/144)) ([35b2e96](https://github.com/daneel-ai/extension-code/commit/35b2e961195992fd7428d6af2923cfa20a98335e))
## [1.31.0](https://github.com/daneel-ai/extension-code/compare/v1.30.0...v1.31.0) (2026-04-18)
### Features
* **offline:** full offline mode + news article + docs ([#142](https://github.com/daneel-ai/extension-code/issues/142)) ([77f8534](https://github.com/daneel-ai/extension-code/commit/77f8534504682c422a57228534b234c990b3cdd0))
## [1.30.0](https://github.com/daneel-ai/extension-code/compare/v1.29.0...v1.30.0) (2026-04-18)
### Features
* **settings:** models storage panel ([#140](https://github.com/daneel-ai/extension-code/issues/140)) ([bd14faa](https://github.com/daneel-ai/extension-code/commit/bd14faad445fca445107dd781be23e4bc171c865))
## [1.29.0](https://github.com/daneel-ai/extension-code/compare/v1.28.0...v1.29.0) (2026-04-18)
### Features
* **widget:** hide in fullscreen + per-site kill switch ([#138](https://github.com/daneel-ai/extension-code/issues/138)) ([e2bba9c](https://github.com/daneel-ai/extension-code/commit/e2bba9c5948ea4042ebb0ece3eb897cd50d4f145))
## [1.28.0](https://github.com/daneel-ai/extension-code/compare/v1.27.0...v1.28.0) (2026-04-17)
### Features
* **context:** context injection preview + per-vault controls ([#131](https://github.com/daneel-ai/extension-code/issues/131)) ([18f74d8](https://github.com/daneel-ai/extension-code/commit/18f74d8d976748dd31cc2bcb2abbe3d8a19de8ed))
* **news:** agentic development workflow article ([#129](https://github.com/daneel-ai/extension-code/issues/129)) ([21ed61c](https://github.com/daneel-ai/extension-code/commit/21ed61c3a83040ebc70ddb441214f4b2930f232a))
* **vault:** chat UI redesign, tool traces column, model attribution + mcp: force final synthesis ([#132](https://github.com/daneel-ai/extension-code/issues/132)) ([13b9a32](https://github.com/daneel-ai/extension-code/commit/13b9a32482444a00f629dae8bc50ea6a1e8a9f1f))
* **widget:** KaTeX math rendering + build hardening ([#133](https://github.com/daneel-ai/extension-code/issues/133)) ([9ac76be](https://github.com/daneel-ai/extension-code/commit/9ac76be5229c21a295fec31f035119b4767720a8))
* **widget:** mermaid diagram rendering in chat ([#137](https://github.com/daneel-ai/extension-code/issues/137)) ([a93bd30](https://github.com/daneel-ai/extension-code/commit/a93bd30d23ea32be3997f40232459a8582be7a21))
### Bug Fixes
* **news:** async walkTokens for shiki code highlighting ([#135](https://github.com/daneel-ai/extension-code/issues/135)) ([ddc4020](https://github.com/daneel-ai/extension-code/commit/ddc402069db79de5941f09d41e3448fdd4019fce))
* **news:** render KaTeX math in article pipeline ([#134](https://github.com/daneel-ai/extension-code/issues/134)) ([1d344cb](https://github.com/daneel-ai/extension-code/commit/1d344cbdeaa6ea66ccb4bd382ca898f44c6768f5))
## [1.27.0](https://github.com/daneel-ai/extension-code/compare/v1.26.1...v1.27.0) (2026-04-16)
### Features
* **docs:** Changelog and Credits pages from public repo ([#124](https://github.com/daneel-ai/extension-code/issues/124)) ([bcc68a3](https://github.com/daneel-ai/extension-code/commit/bcc68a3e56f18eb0d620ca495f51520161aaaec7))
## [1.26.1](https://github.com/daneel-ai/extension-code/compare/v1.26.0...v1.26.1) (2026-04-16)
### Bug Fixes
* JSON Feed spec compliance (content_html + content_text) ([#122](https://github.com/daneel-ai/extension-code/issues/122)) ([712b634](https://github.com/daneel-ai/extension-code/commit/712b634a2755c185cec7cc1e54d8c7b9273d37b8))
## [1.26.0](https://github.com/daneel-ai/extension-code/compare/v1.25.0...v1.26.0) (2026-04-16)
### Features
* eager YouTube detection and transcript extraction ([#119](https://github.com/daneel-ai/extension-code/issues/119)) ([25a1bf4](https://github.com/daneel-ai/extension-code/commit/25a1bf4ef44a4b68ce5802a0d03e84db27493ecb))
## [1.25.0](https://github.com/daneel-ai/extension-code/compare/v1.24.0...v1.25.0) (2026-04-16)
### Features
* add credits.md generation and sync workflow ([#116](https://github.com/daneel-ai/extension-code/issues/116)) ([8358db1](https://github.com/daneel-ai/extension-code/commit/8358db188ba9d4f8798f9c10ac5066eefde076fb))
## [1.24.0](https://github.com/daneel-ai/extension-code/compare/v1.23.0...v1.24.0) (2026-04-16)
### Features
* Changelog and Credits settings with MarkdownPage component ([#114](https://github.com/daneel-ai/extension-code/issues/114)) ([a299e14](https://github.com/daneel-ai/extension-code/commit/a299e14646e85e34ef0c14d7ac0b7f345bf1d4a9))
## [1.23.0](https://github.com/daneel-ai/extension-code/compare/v1.22.0...v1.23.0) (2026-04-16)
### Features
* **crawler:** BFS web crawler with link discovery ([#108](https://github.com/daneel-ai/extension-code/issues/108)) ([1e5a95f](https://github.com/daneel-ai/extension-code/commit/1e5a95f6bd3f7bb9ec913bd61748e7a7f98d7862))
## [1.22.0](https://github.com/daneel-ai/extension-code/compare/v1.21.0...v1.22.0) (2026-04-16)
### Features
* **models:** Bonsai 1.7B (q4 + q1), transformers.js 4.1.0, 1-bit quantization ([#106](https://github.com/daneel-ai/extension-code/issues/106)) ([4416130](https://github.com/daneel-ai/extension-code/commit/441613016d1965145986b7412ff51ce1448135f4))
### Bug Fixes
* **widget:** Vault cold-start + Enter-to-submit in shadow DOM ([#104](https://github.com/daneel-ai/extension-code/issues/104)) ([0f6850d](https://github.com/daneel-ai/extension-code/commit/0f6850d1c6aa5e1926ec9980d2b0b393cd736ceb))
## [1.21.0](https://github.com/daneel-ai/extension-code/compare/v1.20.0...v1.21.0) (2026-04-15)
### Features
* **search:** unified ranker, hybrid retrieval, KG cleanup, observer resilience ([9542e3d](https://github.com/daneel-ai/extension-code/commit/9542e3d835636b8af2e690fa4dc74c5ac789608d))
* **search:** unified ranker, hybrid retrieval, KG cleanup, observer resilience ([38efb83](https://github.com/daneel-ai/extension-code/commit/38efb833501718f85b696b66349fd2f2e2497754))
## [1.20.0](https://github.com/daneel-ai/extension-code/compare/v1.19.0...v1.20.0) (2026-04-14)
### Features
* **kg:** GLiNER-X model support, Unicode NER preprocessing, entity dedup improvements ([81e3091](https://github.com/daneel-ai/extension-code/commit/81e3091da869b64c15d85a86590e4cdd05b95942))
* **kg:** GLiNER-X model, Unicode NER, entity dedup improvements ([b5b3771](https://github.com/daneel-ai/extension-code/commit/b5b3771b761a4e6b39f9a269d6e1ba971ac16aee))
### Bug Fixes
* **vault:** PDF-to-vault indexing pipeline, re-index banner, chunkCount persistence ([a0b274f](https://github.com/daneel-ai/extension-code/commit/a0b274ff21b60a9a9a29133e277b0d22df737e6e))
## [1.19.0](https://github.com/daneel-ai/extension-code/compare/v1.18.1...v1.19.0) (2026-04-13)
### Features
* **pdf:** replace pdfjs-dist with edgeparse-wasm ([1d64052](https://github.com/daneel-ai/extension-code/commit/1d64052e26b8a9213087290c719bd873ac02cca5))
* **pdf:** replace pdfjs-dist with edgeparse-wasm for structured markdown extraction ([9118f43](https://github.com/daneel-ai/extension-code/commit/9118f4362b2c6b100a816bfa2e2b6cac2e52afac))
### Bug Fixes
* **build:** remove lfm2-extract worker from additionalInputs ([6c17d4b](https://github.com/daneel-ai/extension-code/commit/6c17d4ba5c4062abef63e5cf93ec61df95f749fb))
## [1.18.1](https://github.com/daneel-ai/extension-code/compare/v1.18.0...v1.18.1) (2026-04-10)
### Bug Fixes
* site crawl empty overlay + KG section button visibility ([a95590c](https://github.com/daneel-ai/extension-code/commit/a95590c585d7aecfb93a7373ccb43bcff82104c7))
* **site-rag:** empty overlay after crawl completion + KG buttons visibility ([a79ea78](https://github.com/daneel-ai/extension-code/commit/a79ea78546049bcd4aba7aa6645a9e168bb14051))
* **vault:** hide Knowledge Graph section when vault has no documents ([a82679f](https://github.com/daneel-ai/extension-code/commit/a82679f06d8000c1b7b5879d7da1578534ef3b0d))
* WebGPU progress bar, KG section visibility, keygen CLI ([95912dd](https://github.com/daneel-ai/extension-code/commit/95912ddbe9d5bb46c6bfa324453cf100e3507541))
* **webgpu:** show "Compiling model for GPU" during ORT shader compilation ([f0deaf1](https://github.com/daneel-ai/extension-code/commit/f0deaf1d14358a052266d5cc3871babba17a0905))
## [1.18.0](https://github.com/daneel-ai/extension-code/compare/v1.17.0...v1.18.0) (2026-04-10)
### Features
* **kg:** entity search overlay + graph filter + section UI cleanup ([0e03bb9](https://github.com/daneel-ai/extension-code/commit/0e03bb9854f2d9618be3f90ea60407fe2231f7a5))
* **kg:** entity search overlay with graph filter + KG section UI cleanup ([86bd3f4](https://github.com/daneel-ai/extension-code/commit/86bd3f4d5a916cd582cf71ebfac2d58b3e36cbec))
* **kg:** token-based entity resolution + pre-NER cleanup ([#92](https://github.com/daneel-ai/extension-code/issues/92)) ([7354bb3](https://github.com/daneel-ai/extension-code/commit/7354bb325c4f93db03eeb7e274948b90c6dfec4a))
* **kg:** token-based entity resolution, pre-NER cleanup, canonical form promotion ([4a8919e](https://github.com/daneel-ai/extension-code/commit/4a8919e714ae12f30da476da21b6504f1b01fedc))
### Bug Fixes
* **manifest:** remove side panel + options page, add action popup ([650cea8](https://github.com/daneel-ai/extension-code/commit/650cea8a3f58625a4568cf8ff7d73e3136c921e2))
* remove side panel + options page, add action popup ([065a1b2](https://github.com/daneel-ai/extension-code/commit/065a1b2f9f3ca5141d456de6899c7ffe6b8c8e6a))
* **widget:** resolve Svelte 5 reactivity warnings — 2 real bugs + 4 convention fixes ([ea257e6](https://github.com/daneel-ai/extension-code/commit/ea257e674f7d8abd8ea42fdb3e314fb3a4bf81d9))
* **widget:** Svelte 5 reactivity warnings — 2 bugs + convention fixes ([9b06456](https://github.com/daneel-ai/extension-code/commit/9b06456dcd83327557b04272891108e09fc25775))
## [1.17.0](https://github.com/daneel-ai/extension-code/compare/v1.16.0...v1.17.0) (2026-04-09)
### Features
* PDF first-class support ([2cecb3f](https://github.com/daneel-ai/extension-code/commit/2cecb3f02fe2640a9993c4066382d4ad883fbaa5))
* **widget:** PDF first-class support — chat, markdown export, vault import ([17fcf17](https://github.com/daneel-ai/extension-code/commit/17fcf17637a4ecbfd9702bbda522eddbc02d4ec9))
## [1.16.0](https://github.com/daneel-ai/extension-code/compare/v1.15.0...v1.16.0) (2026-04-09)
### Features
* **extraction:** clean up Wikipedia math, citations, and noise in fetched-page viewer ([807aba9](https://github.com/daneel-ai/extension-code/commit/807aba9a3f46dea3069765235f97699de03757ba))
* **vault:** PDF viewer + proxy headers + nbsp cleanup ([1953acd](https://github.com/daneel-ai/extension-code/commit/1953acd9edb5efe7cf94f78aa0601102e31fbf5e))
* **vault:** PDF-aware fetched-page viewer + nbsp cleanup + proxy header forwarding ([be65416](https://github.com/daneel-ai/extension-code/commit/be65416a2965c0217d0f4ff54af9d73dc67384e9))
* Wikipedia content cleanup for fetched-page viewer ([504d283](https://github.com/daneel-ai/extension-code/commit/504d283c29ea34f3f08067b703b0f7b1e0efee55))
### Bug Fixes
* **backend:** articles icons and images. ([e43e58a](https://github.com/daneel-ai/extension-code/commit/e43e58a2e49704d69de96b32cac5d448df331eed))
## [1.15.0](https://github.com/daneel-ai/extension-code/compare/v1.14.0...v1.15.0) (2026-04-08)
### Features
* **backend:** full SEO/social meta overhaul for news + home ([14fcdf9](https://github.com/daneel-ai/extension-code/commit/14fcdf9537785808ced1a10e4a36fe93200ce9f2))
* **docs:** full SEO/social meta overhaul for Astro/Starlight site ([9a69538](https://github.com/daneel-ai/extension-code/commit/9a6953853da5096f7139f00e737b9b7c9989d807))
* full SEO/social meta overhaul for news + docs sites ([#75](https://github.com/daneel-ai/extension-code/issues/75)) ([8ed568c](https://github.com/daneel-ai/extension-code/commit/8ed568c9e9bb4c5ebc2bc0a9883a05e52a701d2b))
* **settings:** inline docs viewer with per-section tabs ([67f9c76](https://github.com/daneel-ai/extension-code/commit/67f9c7668c1b403ec1e14c0b96e343b6ade7916d))
* **settings:** inline docs viewer with per-section tabs ([ac0f12d](https://github.com/daneel-ai/extension-code/commit/ac0f12d63beaef62681bdccec8dc034f8e56456e))
## [1.14.0](https://github.com/daneel-ai/extension-code/compare/v1.13.0...v1.14.0) (2026-04-07)
### Features
* navigable markdown mini-browser in the vault viewer ([#73](https://github.com/daneel-ai/extension-code/issues/73)) ([528b490](https://github.com/daneel-ai/extension-code/commit/528b4907086dd8df06fbaacd8a9b071005d84ead))
* New Tab launcher (dormant) + pin field on vaults & agents ([8cd598a](https://github.com/daneel-ai/extension-code/commit/8cd598ad729d51da18cdbee2693a330cede8f29a))
## [1.13.0](https://github.com/daneel-ai/extension-code/compare/v1.12.0...v1.13.0) (2026-04-07)
### Features
* import Wikipedia article into vault from KG viz ([04501f3](https://github.com/daneel-ai/extension-code/commit/04501f3a28c762981ceb4e20540ad10c834c4f80))
* import Wikipedia article into vault from KG viz ([1df1f1a](https://github.com/daneel-ai/extension-code/commit/1df1f1a4482473ee54ec56ca5dc5a42269f6f472))
### Bug Fixes
* add unlimitedStorage permission for vault content ([5863eea](https://github.com/daneel-ai/extension-code/commit/5863eeac3b5bb5bd4b1fce877568dabc3139b56b))
* add unlimitedStorage permission for vault content ([e9357ac](https://github.com/daneel-ai/extension-code/commit/e9357acc5d85597800774c9274c3d031ec15f5eb))
* cross-path Wikipedia dedup via URL key ([88509a1](https://github.com/daneel-ai/extension-code/commit/88509a1a0212d168083299eec331678a70e3f794))
* cross-path Wikipedia dedup via URL key + canonical URL form ([ddacc44](https://github.com/daneel-ai/extension-code/commit/ddacc4491401d177e93e6abe032adc5d98cd5c13))
* KG incremental no longer reprocesses noise chunks forever ([c764506](https://github.com/daneel-ai/extension-code/commit/c7645065453da15b901b0a2e20735ef13052cbae))
* KG incremental no longer reprocesses noise chunks forever ([620fe4e](https://github.com/daneel-ai/extension-code/commit/620fe4e852aea8476b540ac33206f767fd25aab6))
* vault_index first-doc dispatch silently ingesting 0 chunks ([b4f8264](https://github.com/daneel-ai/extension-code/commit/b4f826490480ecd5f43f69df00b4463b9eb30a4e))
* vault_index first-document dispatch was silently ingesting 0 chunks ([ab11e4f](https://github.com/daneel-ai/extension-code/commit/ab11e4f026a8a66b1fd1c792ea42510818e4c8cf))
## [1.12.0](https://github.com/daneel-ai/extension-code/compare/v1.11.0...v1.12.0) (2026-04-07)
### Features
* knowledge graph analytics + Wikipedia integration ([3455ffa](https://github.com/daneel-ai/extension-code/commit/3455ffaeb58ae97b68d05370a4bd7f2f96dbdbbd))
* knowledge graph analytics layer + Wikipedia integration ([928f2a9](https://github.com/daneel-ai/extension-code/commit/928f2a9db51bd922d47fa048277fb016b495610d))
## [1.11.0](https://github.com/daneel-ai/extension-code/compare/v1.10.1...v1.11.0) (2026-04-05)
### Features
* add background long-running task layer ([1873da8](https://github.com/daneel-ai/extension-code/commit/1873da897aba1100b1b66770029b0560ccc86e07))
* add Gemma 4 support (WebGPU + Ollama) and publish announcement ([0057260](https://github.com/daneel-ai/extension-code/commit/00572604a0d4da4f7da40597dca6bafe99131c32))
* Gemma 4 support (WebGPU + Ollama) + announcement article ([9f891d3](https://github.com/daneel-ai/extension-code/commit/9f891d363bbc1663a0c03fb4f3e8d8c6b2990811))
### Bug Fixes
* SettingsWebGPU uses live model registry instead of static catalog ([959ca72](https://github.com/daneel-ai/extension-code/commit/959ca7247782e87627140e2b9e407e31ffe7bb6d))
* show user-friendly error when Ollama model requires newer server version ([e4792fb](https://github.com/daneel-ai/extension-code/commit/e4792fb2fdf8cfb4269becce0f28ade281ce195f))
* user-friendly Ollama 412 error message ([8159f88](https://github.com/daneel-ai/extension-code/commit/8159f884b57c0a5067f099f29a5bc2666d43198d))
* wire SettingsWebGPU to live ModelRegistryService instead of static catalog ([6f4ee2f](https://github.com/daneel-ai/extension-code/commit/6f4ee2f9018249beb909ee4340062a10323c913d))
## [1.10.1](https://github.com/daneel-ai/extension-code/compare/v1.10.0...v1.10.1) (2026-04-03)
### Bug Fixes
* resolve pre-existing TypeScript errors caught by CI ([5f05b31](https://github.com/daneel-ai/extension-code/commit/5f05b31bb9e5ee40079456618da75df0dc03aa1e))
* resolve TypeScript errors caught by new CI ([75233a2](https://github.com/daneel-ai/extension-code/commit/75233a2e86ee5f7827ca14dab09ba9db6a64c076))
## [1.10.0](https://github.com/daneel-ai/extension-code/compare/v1.9.0...v1.10.0) (2026-04-03)
### Features
* knowledge graph — entity resolution overhaul, Neo4j export, incremental builds, viz fixes ([fe12583](https://github.com/daneel-ai/extension-code/commit/fe1258383538f8bfc9f44d465845e297b09c810a))
* knowledge graph — entity resolution, Neo4j export, incremental builds, viz fixes ([9f8ebbe](https://github.com/daneel-ai/extension-code/commit/9f8ebbeb4cf3dd402d141289861e0487fe06df98))
### Bug Fixes
* **ci:** run biome from repo root so .gitignore is found ([f1aa513](https://github.com/daneel-ai/extension-code/commit/f1aa513114824387d1cc74ebd6517b2e9c28227c))
* license key prefix CWS → DAN + robots.txt fixes ([3b20358](https://github.com/daneel-ai/extension-code/commit/3b2035873c7f5d674a591cb37d051fd4cb2d1484))
* license key prefix CWS → DAN, sitemap priorities, robots.txt fixes ([4fe758a](https://github.com/daneel-ai/extension-code/commit/4fe758ac2de00439868e84b9d443fe5fe40d1543))
* remove unused variable in recommend test ([c3d8adc](https://github.com/daneel-ai/extension-code/commit/c3d8adcc0c3b68d6e2d4e6781681796e6a6fe83f))
## [1.9.0](https://github.com/daneel-ai/extension-code/compare/v1.8.0...v1.9.0) (2026-04-02)
### Features
* add llms.txt/llms-full.txt link tags + remove noindex meta ([6a010ad](https://github.com/daneel-ai/extension-code/commit/6a010ada95c292000d1f75c9dec51b156fee55ab))
* inject markdown alternate link into each doc HTML page ([45bb648](https://github.com/daneel-ai/extension-code/commit/45bb648dadf867b4a6767f59f38a1eec0f8d8e34))
* llms link tags + remove noindex ([56d574f](https://github.com/daneel-ai/extension-code/commit/56d574f39c8c986214a2bed6d30eb0700a79fff0))
* markdown alternate link in doc HTML pages ([9153579](https://github.com/daneel-ai/extension-code/commit/9153579230a2e15718220b5e77c6f5aceabd391c))
## [1.8.0](https://github.com/daneel-ai/extension-code/compare/v1.7.0...v1.8.0) (2026-04-02)
### Features
* enhanced llms.txt + llms-full.txt generation ([9a0069f](https://github.com/daneel-ai/extension-code/commit/9a0069f3c2425500881a2b9f12eb62146788eb61))
* enhanced llms.txt + llms-full.txt generation ([5c9277d](https://github.com/daneel-ai/extension-code/commit/5c9277dda40e36d6ea655d953ec97e66b8f51a90))
## [1.7.0](https://github.com/daneel-ai/extension-code/compare/v1.6.0...v1.7.0) (2026-04-02)
### Features
* generate pages.json manifest + document help integration in CLAUDE.md ([91ffdfc](https://github.com/daneel-ai/extension-code/commit/91ffdfc8ac104aa0487341176ea833a88f16c6b3))
* geolocation & datetime context injection ([7fd80dd](https://github.com/daneel-ai/extension-code/commit/7fd80dda55e572b31f5c06c6f6f72cc25fb9205a))
* geolocation & datetime context injection for agents and MCP tools ([64f75a2](https://github.com/daneel-ai/extension-code/commit/64f75a241c0691bd5cbce57035eeb18307a23143))
* in-app Documentation section in Settings ([#42](https://github.com/daneel-ai/extension-code/issues/42)) ([f882c67](https://github.com/daneel-ai/extension-code/commit/f882c67c3ca290b4551c472bfae6000c1c2036b8))
* pages.json manifest for in-app help ([9877a46](https://github.com/daneel-ai/extension-code/commit/9877a4667afbbadb81c68e3319d3c5d892a0efda))
## [1.6.0](https://github.com/daneel-ai/extension-code/compare/v1.5.0...v1.6.0) (2026-04-02)
### Features
* /document skill — Diataxis-based documentation generator for Astro docs site ([d195577](https://github.com/daneel-ai/extension-code/commit/d195577ab9dca514eadac491f287d087485aa724))
* news article system, /announce skill, SettingsNews panel, registry updates ([d3d9dbc](https://github.com/daneel-ai/extension-code/commit/d3d9dbc1ed3ae801f562ce6a5b18826b044b3e4d))
## [1.5.0](https://github.com/daneel-ai/extension-code/compare/v1.4.2...v1.5.0) (2026-04-01)
### Features
* vault knowledge graph — GLiNER NER extraction, entity dedup, 3D WebGL visualization ([ede23e1](https://github.com/daneel-ai/extension-code/commit/ede23e1433324f8f6c4679c21b60545fc3cd03a1))
* vault knowledge graph — NER extraction, entity dedup, 3D visualization ([a08aded](https://github.com/daneel-ai/extension-code/commit/a08aded133b8031bb8b25f5fd514dfdd7de9338a))
## [1.4.2](https://github.com/daneel-ai/extension-code/compare/v1.4.1...v1.4.2) (2026-04-01)
### Bug Fixes
* upgrade @huggingface/transformers to 4.0.0 for LFM2.5-350M support ([1e975aa](https://github.com/daneel-ai/extension-code/commit/1e975aafda0dbf9db47434805edfa8224c7543d7))
* upgrade transformers.js for LFM2.5-350M support ([7ea415e](https://github.com/daneel-ai/extension-code/commit/7ea415ecf54897ab836c2ea49d6ea164058638c6))
## [1.4.1](https://github.com/daneel-ai/extension-code/compare/v1.4.0...v1.4.1) (2026-03-31)
### Bug Fixes
* Azure SAS validation, data progress bars, Ollama auto-probe, export security ([ff8ac32](https://github.com/daneel-ai/extension-code/commit/ff8ac325aeb054662e92c2ed27967812914c8342))
* Azure SAS, progress bars, Ollama auto-probe, export security ([4cf6800](https://github.com/daneel-ai/extension-code/commit/4cf680023847077e309c29b0655387a44114f2e5))
## [1.4.0](https://github.com/daneel-ai/extension-code/compare/v1.3.0...v1.4.0) (2026-03-31)
### Features
* MCP tool calling for WebGPU + Gemini Nano local models ([896a3b4](https://github.com/daneel-ai/extension-code/commit/896a3b487ccc7dd1b5cdfba2f0453b3d52c1b461))
* MCP tool calling for WebGPU + Gemini Nano local models ([b0bb7ca](https://github.com/daneel-ai/extension-code/commit/b0bb7cae297c16fe9eea4bccd464413a60ea87d2))
* unified model registry — shared package, evaluation engine, Settings UI ([6fdf331](https://github.com/daneel-ai/extension-code/commit/6fdf331fa144d73c06e49b5bbe002477475460e1))
* unified model registry + evaluation engine + Settings UI ([b6e800b](https://github.com/daneel-ai/extension-code/commit/b6e800bf02fd617cbe2147b7172ee097f0b2d417))
## [1.3.0](https://github.com/daneel-ai/extension-code/compare/v1.2.1...v1.3.0) (2026-03-30)
### Features
* Gemini Nano MCP tool calling + Chrome 146 fix ([f9dd448](https://github.com/daneel-ai/extension-code/commit/f9dd44878be7bb0bf10424ddb8bd0a332190ef33))
* wire MCP tool calling into Gemini Nano via PromptToolCallStrategy ([4f9893e](https://github.com/daneel-ai/extension-code/commit/4f9893eaaade27d4b4916ea22795708fd5a69974))
### Bug Fixes
* Gemini Nano on Chrome 146 — modern API probe first ([6e3683c](https://github.com/daneel-ai/extension-code/commit/6e3683c2b6996495eed268948abf1496590438ea))
* Gemini Nano on Chrome 146 — probe modern API first, require outputLanguage ([f934f08](https://github.com/daneel-ai/extension-code/commit/f934f0861f0f4bee1011308ef7c41cda70f0e85d))
## [1.2.1](https://github.com/daneel-ai/extension-code/compare/v1.2.0...v1.2.1) (2026-03-30)
### Bug Fixes
* MCP OAuth — resource indicator, redirect auth, SSE hang ([281edef](https://github.com/daneel-ai/extension-code/commit/281edefb4ff54d77fc362d1c0035fd1d820d82d2))
* MCP OAuth — RFC 8707 resource indicator, redirect auth header, SSE hang ([3309b73](https://github.com/daneel-ai/extension-code/commit/3309b73698820b9738d6fd3343781af83d7b3ad9))
## [1.2.0](https://github.com/daneel-ai/extension-code/compare/v1.1.0...v1.2.0) (2026-03-30)
### Features
* Ollama MCP tool calling, Azure OpenAI provider, cloud backup, agentic RAG ([c1eb026](https://github.com/daneel-ai/extension-code/commit/c1eb026d781839b6b23901955834569c310d4bc4))
## [1.1.0](https://github.com/daneel-ai/extension-code/compare/v1.0.0...v1.1.0) (2026-03-28)
### Features
* add Docker Companion — compose generator, MCP gateway, and health sidecar ([318ab31](https://github.com/daneel-ai/extension-code/commit/318ab311d73af4ae92d04ea63b349f6307214c8b))
* add Perplexity MCP server to template catalog ([#16](https://github.com/daneel-ai/extension-code/issues/16)) ([7a6b3d1](https://github.com/daneel-ai/extension-code/commit/7a6b3d1c4f62452a4a7cdbba5170388ab1a73bce))
* Docker Companion — compose generator, supergateway MCP bridge, health sidecar ([021842a](https://github.com/daneel-ai/extension-code/commit/021842a74c7c506a6146bcacc43829ee48ce0289))
### Bug Fixes
* use streamableHttp transport and /mcp endpoint for supergateway ([c28ae03](https://github.com/daneel-ai/extension-code/commit/c28ae0339065efcd369a7fb0130f114a9e7fa93f))
## 1.0.0 (2026-03-27)
### Features
* **agents:** add Agents — structured prompts + MCP servers ([7b2a516](https://github.com/daneel-ai/extension-code/commit/7b2a516534881b000f9c1f4e9adfb11d5b3cbe8f))
* **appearance:** add Neon color style + deepen all color themes ([887415f](https://github.com/daneel-ai/extension-code/commit/887415fc858ed996a276e5a828412a0222f2b08b))
* **extraction:** add YouTube video transcript extraction ([4478d17](https://github.com/daneel-ai/extension-code/commit/4478d17e84e7a45261af4cb4cf61f992cb2ad032))
* **extraction:** wire YouTube transcript to Markdown export button ([fff2211](https://github.com/daneel-ai/extension-code/commit/fff22115f3cdd25036ed9338246c17827d8737f2))
* **mcp:** add auth providers — credential store, OAuth2 flow, transport ([54d7277](https://github.com/daneel-ai/extension-code/commit/54d7277766576f271280bea2c6c8220f0f5effc7))
* **mcp:** add background handlers — register, OAuth, tools, Origin stripping ([ac09a57](https://github.com/daneel-ai/extension-code/commit/ac09a57f6975c2dac78145f12950799e93f17316))
* **mcp:** add core interfaces for MCP auth, registry, and credential management ([fbcfca3](https://github.com/daneel-ai/extension-code/commit/fbcfca36bc24b069d2bbe4faa8fbe4e6b2bea35a))
* **mcp:** add core logic — auth discovery, registration flow, registry search ([02220a8](https://github.com/daneel-ai/extension-code/commit/02220a8d943dbb8a4c228a47395b057ea6936822))
* **mcp:** add curated featured servers config with auth hints ([55eaf05](https://github.com/daneel-ai/extension-code/commit/55eaf0548b81675d16c71fb8f7f6c91c5680429f))
* **mcp:** add registry providers — Official MCP Registry + PulseMCP ([aefa525](https://github.com/daneel-ai/extension-code/commit/aefa52542a5c993f8056573be312dd748fea3171))
* **mcp:** add Settings MCP UI — search, featured grid, registered servers ([9f3f7b4](https://github.com/daneel-ai/extension-code/commit/9f3f7b4f3b3475f8a8a2cc6381333253aa3b171a))
* **mcp:** add tool call interfaces, executor, and loop orchestrator ([68df6c1](https://github.com/daneel-ai/extension-code/commit/68df6c18ca60bdd63b8f550f6eb390f0f5cad7e2))
* **mcp:** add tool call strategies — Claude native + prompt-based stub ([c7b2f5d](https://github.com/daneel-ai/extension-code/commit/c7b2f5d8a8a38d0ad7068c7f88070d385a28331f))
* **mcp:** vault MCP integration — attach servers, [MCP] button, tool calls ([bc0122f](https://github.com/daneel-ai/extension-code/commit/bc0122f609b478b49eb545f1ead06bcbb8869619))
* **ui:** OAuth credential form, dark mode polish, vault UX improvements ([fa5bfdf](https://github.com/daneel-ai/extension-code/commit/fa5bfdfb8ca99bcb8475715807fc041a6777ba59))
* vault quick-save, hardware detection, YouTube fixes ([68fddf8](https://github.com/daneel-ai/extension-code/commit/68fddf8602b5b22175add20deeaedb18e36d8df0))
* vault quick-save, hardware detection, YouTube SPA fixes, Gemini Nano ([168e65c](https://github.com/daneel-ai/extension-code/commit/168e65c2626047fdd66cafe1a86f02d54bebaec2))
* YouTube transcript, hardware detection, settings fixes, Gemini Nano output language ([731c318](https://github.com/daneel-ai/extension-code/commit/731c31888835af4d64f5f3db528209951f22d258))
### Bug Fixes
* **backend:** presentation page with proper links and screenshots for chrome store distribution. ([5a4c02d](https://github.com/daneel-ai/extension-code/commit/5a4c02d803f605dc9112b2f0fc56893006653cb3))
* **backend:** presentation page with proper links and screenshots for… ([f31e88e](https://github.com/daneel-ai/extension-code/commit/f31e88ee2d9e1a099460f15902dbb263e3346e49))
* **backend:** remove html comments in public pages. ([ee1b7a2](https://github.com/daneel-ai/extension-code/commit/ee1b7a23e4c844360a189c7ff2d3cb38a150ebe3))
* **build:** add shebang to husky pre-commit hook ([e0d2316](https://github.com/daneel-ai/extension-code/commit/e0d2316b2e802c0fca45d19f4bcff8003342f0d0))
* **chat:** enable GFM tables in markdown rendering ([378822e](https://github.com/daneel-ai/extension-code/commit/378822efc8e2204c6e8249f2564d3fd7abb48627))
* **chat:** include query params in conversation key + YouTube SPA nav ([742b5bb](https://github.com/daneel-ai/extension-code/commit/742b5bb50abffda28ca671044106a5a3b9e99e27))
* **ci:** build docs in CI before deploying to Vercel ([cae2771](https://github.com/daneel-ai/extension-code/commit/cae27714b441ed877ebb90bfc03b444bc270492d))
* **ci:** disable migration step in deploy workflow ([f9bdf58](https://github.com/daneel-ai/extension-code/commit/f9bdf58b78006f296c766d31c72c23c8ca20dd09))
* **ci:** use release-type simple — no root package.json needed ([871c8bf](https://github.com/daneel-ai/extension-code/commit/871c8bf2c75227bd8d0cb26811058e5280ca7968))
* **ci:** use release-type simple — no root package.json needed ([327c19d](https://github.com/daneel-ai/extension-code/commit/327c19d3c0edb99630c8f98b2de5d2bb5aa08ea0))
* **claude:** settings ([8c3ccce](https://github.com/daneel-ai/extension-code/commit/8c3ccce69eb97f2231b1e4a8237d7495d51c686a))
* **claude:** settings ([b2094d1](https://github.com/daneel-ai/extension-code/commit/b2094d1fd17936328610abac551a0e07e2edde0d))
* **extraction:** use ANDROID player API for all YouTube metadata ([1af6a63](https://github.com/daneel-ai/extension-code/commit/1af6a6310cc477c375cf4883e1f1c317702be9e5))
* **icons:** restore mandatory extension icons ([8fa66e1](https://github.com/daneel-ai/extension-code/commit/8fa66e12c5088d36baf29265c454cec09218a9b0))
* **public:** change enterprise accent color to orange. ([ef64f27](https://github.com/daneel-ai/extension-code/commit/ef64f27cd3a7d13ac20c914159363b0d6a2f446d))
* **settings:** add missing ACTIVE_MODEL import to SettingsSystem ([94ed166](https://github.com/daneel-ai/extension-code/commit/94ed16633d4ad29160af3eb5ce0831f6ef732c30))
* **settings:** add missing Button/UnlockBanner imports to extracted components ([5f97e1d](https://github.com/daneel-ai/extension-code/commit/5f97e1dc04bef01e5c0de577eaaa50d9df4b8010))
* **settings:** restore _ prefix mismatches after revert collision ([b39d902](https://github.com/daneel-ai/extension-code/commit/b39d902db618dd62398d0d73fbf3e78cdeed1be1))
* **settings:** restore 55 biome-broken template references ([3ea4753](https://github.com/daneel-ai/extension-code/commit/3ea47533c1ee6dbc561aaf299187d6e28e20672f))
* **ui:** restore biome-broken Badge, Button, Progress components ([421fb54](https://github.com/daneel-ai/extension-code/commit/421fb54ced16a98c5d026809b85026565c856af2))
* **ui:** revert all UI primitives to pre-biome state ([35d491e](https://github.com/daneel-ai/extension-code/commit/35d491e3f4838850ee22e3d26cc65f44df9f5863))
---
# Credits
**Open-source libraries and tools that power Daneel AI.**
> Source: https://doc.daneel.injen.io/reference/credits/index.md
## Authors
- **Injen.io Team** — design, architecture, product
- **Claude Code** (Anthropic) — implementation partner
## Acknowledgements and dependencies
Daneel AI is built on the shoulders of these open-source projects:
| Name | Version | Description | Link |
| --- | --- | --- | --- |
| **@chonkiejs/core** | `0.0.7` | Core chunking library for Chonkie - lightweight and efficient text chunking | [npm](https://docs.chonkie.ai) |
| **@huggingface/transformers** | `4.1.0` | State-of-the-art Machine Learning for the web. Run 🤗 Transformers directly in your browser, with no need for a server! | [npm](https://github.com/huggingface/transformers.js#readme) |
| **@mozilla/readability** | `0.6.0` | A standalone version of the readability library used for Firefox Reader View. | [npm](https://github.com/mozilla/readability) |
| **3d-force-graph** | `1.80.0` | UI component for a 3D force-directed graph using ThreeJS and d3-force-3d layout engine | [npm](https://github.com/vasturiano/3d-force-graph) |
| **dompurify** | `3.4.0` | DOMPurify is a DOM-only, super-fast, uber-tolerant XSS sanitizer for HTML, MathML and SVG. It's written in JavaScript and works in all modern browsers (Safari, Opera (15+), Internet Explorer (10+), Firefox and Chrome - as well as almost anything else using HTML). | [npm](https://github.com/cure53/DOMPurify) |
| **edgeparse-wasm** | `0.2.5` | EdgeParse PDF parser — WebAssembly build for browsers | [npm](https://www.edgeparse.com) |
| **fflate** | `0.8.2` | High performance (de)compression in an 8kB package | [npm](https://101arrowz.github.io/fflate) |
| **graphology** | `0.26.0` | A robust and multipurpose Graph object for JavaScript. | [npm](https://github.com/graphology/graphology#readme) |
| **graphology-communities-louvain** | `2.0.2` | Louvain community detection for graphology. | [npm](https://github.com/graphology/graphology#readme) |
| **graphology-components** | `1.5.4` | Connected components for graphology. | [npm](https://github.com/graphology/graphology#readme) |
| **graphology-metrics** | `2.4.0` | Miscellaneous graph metrics for graphology. | [npm](https://github.com/graphology/graphology#readme) |
| **graphology-operators** | `1.6.1` | Miscellaneous operators for graphology. | [npm](https://github.com/graphology/graphology#readme) |
| **graphology-shortest-path** | `2.1.0` | Shortest path functions for graphology. | [npm](https://github.com/graphology/graphology#readme) |
| **graphology-types** | `0.24.8` | TypeScript declaration for graphology. | [npm](https://github.com/graphology/graphology#readme) |
| **idb** | `8.0.3` | A small wrapper that makes IndexedDB usable | [npm](https://github.com/jakearchibald/idb#readme) |
| **katex** | `0.16.45` | Fast math typesetting for the web. | [npm](https://katex.org) |
| **kokoro-js** | `1.2.1` | High-quality text-to-speech for the web | [npm](https://github.com/hexgrad/kokoro) |
| **mammoth** | `1.12.0` | Convert Word documents from docx to simple HTML and Markdown | [npm](https://github.com/mwilliamson/mammoth.js#readme) |
| **marked** | `17.0.4` | A markdown parser built for speed | [npm](https://marked.js.org) |
| **marked-katex-extension** | `5.1.8` | MarkedJS extension to render KaTeX | [npm](https://github.com/UziTech/marked-katex-extension#readme) |
| **mermaid** | `11.14.0` | Markdown-ish syntax for generating flowcharts, mindmaps, sequence diagrams, class diagrams, gantt charts, git graphs and more. | [npm](https://github.com/mermaid-js/mermaid#readme) |
| **oauth4webapi** | `3.8.5` | Low-Level OAuth 2 / OpenID Connect Client API for JavaScript Runtimes | [npm](https://github.com/panva/oauth4webapi) |
| **turndown** | `7.2.4` | A library that converts HTML to Markdown | [npm](https://github.com/mixmark-io/turndown#readme) |
| **turndown-plugin-gfm** | `1.0.2` | Turndown plugin to add GitHub Flavored Markdown extensions. | [npm](https://github.com/domchristie/turndown-plugin-gfm#readme) |
| **wikibase-sdk** | `11.2.7` | Utility functions to query a Wikibase instance and simplify its results | [npm](https://github.com/maxlath/wikibase-sdk) |
---
# Supported File Formats
**File formats accepted by Document Vault and how they are converted.**
> Source: https://doc.daneel.injen.io/reference/formats/index.md
To get started with document import, see [Build a Document Vault](/guides/first-vault/).
## Supported formats
| Format | Extensions | Conversion method | Notes |
|--------|-----------|-------------------|-------|
| PDF | `.pdf` | EdgeParse WASM | Structured Markdown extraction (no OCR for scanned PDFs) |
| Microsoft Word | `.docx` | Mammoth (DOCX → HTML → text) | Modern `.docx` only, not legacy `.doc` |
| Plain text | `.txt` | Direct read | UTF-8 assumed |
| HTML | `.html`, `.htm` | Turndown (HTML → Markdown) | Strips scripts, styles, and navigation |
| PowerPoint | `.pptx` | Text extraction from slides | Slide text only, no speaker notes or images |
| Excel | `.xls`, `.xlsx` | Cell text extraction | Text content from cells, not formulas |
| Markdown | `.md` | Direct read | Preserved as-is |
## Conversion pipeline
1. **Format detection** — Daneel infers the format from the file extension.
2. **Conversion** — The `CompositeConverter` delegates to the appropriate converter (PdfConverter, DocxConverter, or HtmlDocumentConverter).
3. **Text output** — All formats are converted to plain text or Markdown.
4. **Chunking** — The text is split into overlapping chunks (default: 512 tokens, 64 token overlap).
5. **Embedding** — Each chunk is embedded using the active embedding model.
6. **Deduplication** — SHA-256 content hash prevents duplicate imports.
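The chunking step (4) can be sketched as follows. This is a minimal illustration in which whitespace-separated words stand in for real tokenizer output; the actual pipeline counts tokens with the embedding model's tokenizer, and `chunkText` is a hypothetical name, not Daneel's API:

```typescript
// Split text into overlapping windows of `chunkSize` tokens, where each
// window starts (chunkSize - overlap) tokens after the previous one.
// Whitespace splitting approximates tokenization for illustration only.
function chunkText(text: string, chunkSize = 512, overlap = 64): string[] {
  const tokens = text.split(/\s+/).filter(Boolean);
  const chunks: string[] = [];
  const step = chunkSize - overlap;
  for (let start = 0; start < tokens.length; start += step) {
    chunks.push(tokens.slice(start, start + chunkSize).join(" "));
    if (start + chunkSize >= tokens.length) break; // last window reached the end
  }
  return chunks;
}
```

With the defaults above, consecutive chunks share 64 tokens of context, which helps retrieval for passages that straddle a chunk boundary.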
## Size limits
| Limit | Free plan | Paid plan |
|-------|-----------|-----------|
| Max file size | 1 MB | 10 MB |
| Max converted characters | 50,000 | 500,000 |
| Max chunks per document | 100 | 1,000 |
| Max documents per vault | 5 | 50 |
| Max vaults | 1 | Unlimited |
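A hypothetical sketch of how the limits above might be checked at import time. The numbers mirror the table; the `checkImport` helper and its field names are illustrative, not the extension's actual code:

```typescript
type Plan = "free" | "paid";

// Limit values taken from the size-limits table above.
const LIMITS: Record<Plan, { fileBytes: number; chars: number; chunks: number }> = {
  free: { fileBytes: 1 * 1024 * 1024, chars: 50_000, chunks: 100 },
  paid: { fileBytes: 10 * 1024 * 1024, chars: 500_000, chunks: 1_000 },
};

// Returns a rejection reason, or null when the import is within limits.
function checkImport(plan: Plan, fileBytes: number, chars: number, chunks: number): string | null {
  const l = LIMITS[plan];
  if (fileBytes > l.fileBytes) return "file too large";
  if (chars > l.chars) return "converted text too long";
  if (chunks > l.chunks) return "too many chunks";
  return null;
}
```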
## Page content extraction
For web pages (Page Chat and Site Search modes), Daneel uses a three-strategy extraction pipeline:
1. **Readability.js** — Mozilla's reader-mode extractor. Best for articles and blog posts.
2. **CSS cascade + Turndown** — Selects the main content area via CSS heuristics, converts to Markdown. Used when Readability fails.
3. **Plain-text fallback** — Strips all HTML and returns raw text. Last resort.
YouTube pages use a separate [transcript extraction pipeline](/how-to/youtube-chat/).
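The three-strategy fallback can be sketched as a chain that returns the first non-empty result. The `extractContent` helper and the strategy signatures are illustrative stand-ins for Readability.js, the CSS cascade + Turndown path, and the plain-text fallback:

```typescript
// A strategy takes raw HTML and returns extracted text, or null on failure.
type Extractor = (html: string) => string | null;

// Try each strategy in order; a throw or empty result falls through
// to the next one. Returns "" only when every strategy fails.
function extractContent(html: string, strategies: Extractor[]): string {
  for (const extract of strategies) {
    try {
      const result = extract(html);
      if (result && result.trim().length > 0) return result;
    } catch {
      // a failing strategy is not fatal; continue down the chain
    }
  }
  return "";
}
```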
---
# Privacy Policy
**Daneel AI is a private AI reading and research assistant for web pages, websites, and local documents. Conversation content stays on the User's device.**
> Source: https://doc.daneel.injen.io/reference/privacy-policy/index.md
:::note
The authoritative version of this policy is published at
[daneel.injen.io/privacy-policy.html](https://daneel.injen.io/privacy-policy.html).
This page is a mirror served by the docs site for convenience.
:::
**Last updated:** 2026-04-24
**Effective date:** 2026-04-24
[View on daneel.injen.io](https://daneel.injen.io/privacy-policy.html) · [Markdown source](https://daneel.injen.io/privacy-policy.md)
---
## Purpose
Daneel AI is a private AI reading and research assistant for the web pages, websites, and local documents the User chooses to engage with. Every feature in the extension is an implementation of that single purpose: helping the User understand content they have already decided to read.
All design decisions in Daneel flow from this purpose. Daneel reads, fetches, or processes content only in response to an explicit User action — asking a question about the current page, importing a document into a vault, or starting a site index.
## Privacy summary
Daneel runs AI inference inside the User's browser or against backends the User configures. Prompts, AI responses, indexed website content, and local document contents stay on the User's device.
The only personal data we store on our servers is payment- and license-related: a purchase record and a per-license activation counter. Details are in section 3.
We do not sell User data and do not train AI models on User content.
---
## 1. Who we are
Daneel AI is developed and operated by **WAEBS**, a French company trading under the **Injen.io** brand.
- **Legal entity:** WAEBS
- **SIREN:** 431 946 532
- **SIRET:** 431 946 532 00071
- **European VAT number:** FR21 431 946 532
- **Registered address:** 131 Boulevard Pereire, 75017 Paris, France
- **Legal representative:** Julien Borrel
- **Official public registry entry (France):** [annuaire-entreprises.data.gouv.fr](https://annuaire-entreprises.data.gouv.fr/entreprise/waebs-incand-inkand-431946532)
- **Company website:** [injen.io](https://injen.io)
- **Privacy contact:** [think+daneel-privacy@injen.io](mailto:think+daneel-privacy@injen.io)
WAEBS is the **data controller** within the meaning of the EU General Data Protection Regulation (GDPR) for the personal data processed in connection with Daneel AI.
For EU residents, the lead supervisory authority is the French data protection authority (CNIL, [cnil.fr](https://www.cnil.fr)).
---
## 2. Scope of this policy
This policy covers:
- The **Daneel AI Chrome extension** (Chrome Web Store ID: `ebmjckdkmojgnbhnnindennabonbnogp`)
- The **Daneel purchase and license-activation flow** (Stripe Checkout, the `/api/*` backend endpoints on `daneel.injen.io`, and the Supabase database that stores license records)
- Emails sent to WAEBS contact addresses (e.g. `think+daneel-privacy@injen.io`)
This policy does **not** cover:
- The **LLM backends the User configures** (Azure OpenAI, a self-hosted Ollama server, Claude API, Chrome's built-in Gemini Nano, etc.). Each of these is operated by a separate provider or by the User, and each has its own privacy terms.
- **Stripe's** handling of payment card details. Card numbers, CVV, expiry, and billing address entered into Stripe Checkout are governed by [Stripe's privacy policy](https://stripe.com/privacy). WAEBS never sees that information.
- **Supabase's** infrastructure-level data handling, governed by [Supabase's privacy policy](https://supabase.com/privacy).
- **Google Chrome** itself and its browser-level data practices, governed by Google's policies.
- Websites the User visits through the browser.
---
## 3. Information collected — and what is not
### 3.1. What is NOT collected
WAEBS does not collect, transmit, or retain on its servers any of the following:
- Conversation content — the User's prompts, the AI's responses, and the context of chats
- The text of queries typed into Daneel
- Indexed website content (this stays in the User's browser `IndexedDB` / `chrome.storage.local`)
- Local document contents imported into a vault (these stay on the User's device)
- Browsing history
- Cookies from sites the User visits
- Form data from sites the User visits
- Keystrokes outside of Daneel's own UI
- Screenshots
- Full payment card details — card numbers, CVV, expiry, and billing address are handled entirely by Stripe's hosted Checkout; WAEBS never sees them
- Precise geolocation (GPS coordinates, street-level location)
- Any Daneel user account — there is no sign-in to Daneel itself
### 3.2. What is collected or processed
#### A. Payment and purchase records (at the time of a Daneel Pro purchase)
**Data stored:**
- Email address
- Payment status (succeeded, failed, refunded)
- Stripe session ID, customer ID, and payment intent ID
- Purchase timestamp
- License key issued (`DAN-XXXX-XXXX-XXXX` format)
- Plan tier (`paid` or `sponsor`)
- Amount and currency
- Billing country (for tax compliance)
- Feature flags associated with the license
- Revocation status
**Source of the data:** Stripe's hosted Checkout page. Once a payment succeeds, Stripe sends a webhook to the Daneel backend (`POST /api/webhook/stripe`) with the fields listed above.
**Where the data is stored:** A row in a **Supabase** (PostgreSQL) database, hosted in Supabase's **EU West (Ireland, `eu-west-1`)** region. Access is restricted to WAEBS administrators through authenticated Supabase service credentials.
**Purpose:**
- To issue a valid license key to the paying customer
- To provide billing support (refunds, disputes, receipts)
- To comply with French and EU tax and accounting obligations
- To detect and prevent payment fraud
**Retention:** Retained for as long as the license is active, plus the minimum period required by French tax and accounting law — typically **ten years** — after which records are anonymised or deleted.
**Legal basis (GDPR):**
- Art. 6(1)(b) — performance of contract (to issue and support the purchased licence)
- Art. 6(1)(c) — legal obligation (French tax and accounting retention)
- Art. 6(1)(f) — legitimate interest (fraud prevention)
#### B. License-key activation counter
**Data stored:** Each license key has an `activation_count` integer and a `last_activated_at` timestamp. Every time a device calls the Daneel backend to unlock Daneel with that key, the counter is incremented by one and the timestamp is updated.
**Data NOT stored alongside the counter:** The IP address of the activating device, the device fingerprint, the operating system, the hostname, the browser fingerprint, and any other identifier tied to the machine are not recorded. Only the running count and the timestamp of the most recent activation are kept.
**Source of the data:** The Daneel extension contacts the `/api/activate` endpoint when a User enters a license key.
**Where the data is stored:** The same Supabase database row as the license itself, keyed by the license key.
**Purpose:** License keys are sold per User. A counter that climbs far beyond the number of seats purchased indicates the key is being shared or leaked in violation of the license terms. This is a **license-enforcement signal**, not a user-tracking mechanism.
**Retention:** Retained for the lifetime of the license. Reset or deleted when the license is revoked or refunded.
**Legal basis (GDPR):** Art. 6(1)(f) — legitimate interest (protecting the integrity of a paid product against unauthorised sharing).
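The bookkeeping described above amounts to two fields per license. A minimal sketch (field names mirror the description; the record shape and helper are illustrative, not the backend's actual code):

```typescript
// Only a running count and the most recent activation timestamp are kept;
// no device identifier of any kind is stored alongside them.
type LicenseRecord = {
  licenseKey: string;          // DAN-XXXX-XXXX-XXXX
  activation_count: number;
  last_activated_at: string;   // ISO 8601 timestamp
};

function recordActivation(rec: LicenseRecord, now: Date = new Date()): LicenseRecord {
  return {
    ...rec,
    activation_count: rec.activation_count + 1,
    last_activated_at: now.toISOString(),
  };
}
```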
#### C. License validation requests
**Data processed:** When the extension calls the `/api/activate` or `/api/refresh` endpoint, it transmits the license key and a product version string. As with any HTTP request, standard network metadata (source IP, timestamp, user-agent header) is observable at the edge and may be retained in **operational logs** for a short window (typically 30 days) for abuse mitigation and debugging, after which logs are rotated or deleted.
**Retention:**
- Operational edge logs: **30 days maximum**, then rotated.
- Validation outcome (success/failure linked to the license): retained with the license record.
**Legal basis (GDPR):**
- Art. 6(1)(b) — performance of contract (validation is part of serving the purchased license)
- Art. 6(1)(f) — legitimate interest (abuse prevention, service reliability)
#### D. Extension update pings
When Chrome automatically updates Daneel from the Chrome Web Store, it pings Google's update servers. This is handled entirely by Google Chrome and the Chrome Web Store, not by WAEBS. It is governed by Google's privacy policy.
#### E. Anonymous product analytics
Daneel ships with **optional, anonymous product analytics** that help WAEBS understand which features are used and catch regressions. This is enabled by default and can be turned off at any time in **Settings → Privacy → Telemetry**.
**Transport:** Google Analytics 4 (GA4). Events are sent from the extension's service worker directly to Google's endpoint, not via WAEBS servers. Google acts as a data processor for this data.
**What is sent in each event:**
- An event name from a fixed catalog (for example `chat`, `search`, `vault_create`, `install`, `provider_switch`, `model_load`, `license_set`, `data_export`) — never a free-form string.
- Typed, non-identifying properties such as the name of the LLM provider in use (`webgpu`, `ollama`, `claude`, `gemini-nano`), the number of pages crawled, or a duration in milliseconds. No prompts, no responses, no URLs, no document content.
- Common device properties: extension version, browser name and version, operating system and architecture, screen resolution, language, and IANA timezone.
- Approximate geography (country, region, city) **derived from the browser timezone** — no external IP lookup is performed by default. Users may opt in separately to a more accurate IP-based geolocation in **Settings → Privacy** (the `ipGeoEnabled` toggle, off by default), which makes a single HTTP call to `ipapi.co` on service-worker startup.
- A randomly generated client ID stored in `chrome.storage.local`. It is not linked to the User's email, license key, or any other identifier.
**What is never sent:** conversation text, prompts, AI responses, URLs of pages the User visits, document content, file names, or any error messages that could contain user text. Telemetry payloads contain only enumerated event names, booleans, integers, and durations.
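Under these constraints, a telemetry payload has roughly the following shape. The type and guard below are an illustrative sketch, not the extension's actual schema:

```typescript
// Event names come from a fixed catalog; properties are typed and
// non-identifying (enumerated strings, booleans, integers, durations).
type TelemetryEvent = {
  name: "chat" | "search" | "vault_create" | "install" | "provider_switch"
      | "model_load" | "license_set" | "data_export";
  props: Record<string, string | number | boolean>; // never free-form user text
  clientId: string; // random ID in chrome.storage.local, unlinked to any license
};

// Reject anything outside the fixed catalog before it is ever sent.
function isAllowedEvent(e: TelemetryEvent, catalog: Set<string>): boolean {
  return catalog.has(e.name);
}
```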
**How to disable:** open Daneel → Settings → Privacy → turn off "Anonymous product analytics". No event is sent after the toggle is off.
**Legal basis (GDPR):** Art. 6(1)(f) — legitimate interest in improving the product. This interest is balanced against User rights by (a) making the data non-identifying, (b) providing a clear opt-out, and (c) not linking telemetry events to any license record.
WAEBS does not use crash reporting services (e.g. Sentry, Bugsnag). Errors surface in the extension's own UI and in the browser's developer console; they are not transmitted to WAEBS.
#### F. Correspondence
When a User emails WAEBS (for example at `think+daneel-privacy@injen.io` or a support address), WAEBS retains the message, the email address, and its responses as long as necessary to resolve the query and for a reasonable follow-up window afterwards.
---
## 4. How data flows
### 4.1. AI content flow (the vast majority of Daneel usage)
```text
A question is typed into Daneel
↓
The Daneel extension (running inside the User's browser)
↓
The LLM backend the User chose:
• WebGPU (100% in-browser, no network at all)
• Chrome Built-in Gemini Nano (Chrome ships the model locally)
• A self-hosted Ollama server (typically localhost)
• An Azure OpenAI deployment operated by the User
• Anthropic Claude API (with the User's own API key)
↓
The response streams back to the extension
↓
Stored in the browser's IndexedDB / chrome.storage.local,
on the User's device only
```
**WAEBS never appears in this flow.** There is no proxy, no relay, no mirror, no sampling point. WAEBS could not read the User's prompts or responses even if it wanted to, because they are never sent to WAEBS.
### 4.2. Payment and licensing flow (one-time, at purchase)
```text
User clicks "Buy Pro" on daneel.injen.io
↓
Redirect to Stripe Checkout (hosted by Stripe)
↓
Card details entered on Stripe's servers
(WAEBS never sees the card)
↓
Stripe webhook → POST /api/webhook/stripe (Daneel backend)
↓
Row inserted into Supabase:
{ email, stripe_customer_id, license_key, plan,
amount, currency, country, status, timestamps }
↓
License key is delivered to the purchasing email via Resend,
and also shown on the /api/success page in the browser
so the User can copy it into the extension
```
### 4.3. License validation flow (on activation and periodic refresh)
```text
User pastes the license key in Daneel's Settings → License
↓
Extension → POST /api/activate on daneel.injen.io (HTTPS)
↓
Daneel backend:
• Looks up the key in Supabase
• Verifies it is not revoked
• Increments activation_count, updates last_activated_at
• Signs a 7-day ES256 JWT containing plan + feature flags
↓
JWT returned to the extension, cached locally,
and verified offline using a public key bundled
in the extension
```
The JWT is refreshed in the background before expiry via `POST /api/refresh`. No user content is part of these requests — only the license key and a product version string.
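The offline verification step can be sketched with the standard WebCrypto API. This is an illustrative reconstruction, not Daneel's actual code: the claim names (`plan`, `exp`) and function name are assumptions based on the flow above.

```javascript
// Illustrative sketch of offline ES256 JWT verification against a bundled
// SPKI public key (not Daneel's implementation; claim names are assumed).
const b64urlToBytes = (s) =>
  Uint8Array.from(atob(s.replace(/-/g, '+').replace(/_/g, '/')), (c) => c.charCodeAt(0));

async function verifyLicenseJwt(jwt, spkiPublicKeyBytes) {
  const [header, payload, signature] = jwt.split('.');
  const key = await crypto.subtle.importKey(
    'spki', spkiPublicKeyBytes,
    { name: 'ECDSA', namedCurve: 'P-256' },  // ES256 = ECDSA P-256 + SHA-256
    false, ['verify'],
  );
  const ok = await crypto.subtle.verify(
    { name: 'ECDSA', hash: 'SHA-256' },
    key,
    b64urlToBytes(signature),                // JWS ES256 signatures are raw r||s
    new TextEncoder().encode(`${header}.${payload}`),
  );
  if (!ok) return null;
  const claims = JSON.parse(new TextDecoder().decode(b64urlToBytes(payload)));
  // Reject expired tokens; `exp` is seconds since the epoch (RFC 7519).
  if (claims.exp * 1000 < Date.now()) return null;
  return claims;
}
```

Because the public key ships inside the extension bundle, this check needs no network call, which is what makes week-long offline use possible between refreshes.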
---
## 5. Chrome extension permissions
Daneel declares the following permissions in its `manifest.json`. Each one is listed here with its purpose and what it does **not** do.
### `activeTab`
- **Purpose:** Read the current page's DOM so the User can ask questions about it (Page Q&A mode).
- **Scope:** Granted only for the tab the User interacts with, only while the User interacts with it.
- **What Daneel does:** Extract the page's visible text on demand to include it as context for the LLM backend the User chose.
- **What Daneel does not do:** Continuously monitor the tab, capture screenshots, log keystrokes, or read forms that have not been submitted.
### `storage` and `unlimitedStorage`
- **Purpose:** Persist the User's settings, conversations, local vaults, indexed website chunks, knowledge graphs, and downloaded local AI models.
- **Scope:** Purely local. Data is stored in `chrome.storage.local` and `IndexedDB` on the User's device. `unlimitedStorage` is needed because AI models and vector indexes can exceed Chrome's default quota.
- **What Daneel does:** Read and write data the User generated inside Daneel.
- **What Daneel does not do:** Send the contents of local storage anywhere. Settings and conversations are not synced to WAEBS servers.
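For context on the quota pressure that motivates `unlimitedStorage`, the standard Storage API reports how much of the browser's allowance local data is using. A minimal sketch (the function name is illustrative):

```javascript
// Sketch: inspect local storage pressure via the standard Storage API.
// Accepts the StorageManager as a parameter so it can run outside a browser.
async function storagePressure(storage = navigator.storage) {
  const { usage, quota } = await storage.estimate();  // both values in bytes
  return { usedMB: usage / 2 ** 20, fractionUsed: usage / quota };
}
```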
### `scripting`
- **Purpose:** Inject Daneel's widget overlay (the small launcher button and panel) into web pages the User visits.
- **Scope:** Only Daneel's own widget scripts, loaded from the extension bundle — never arbitrary code.
- **What Daneel does not do:** Inject scripts into pages for tracking, ad injection, affiliate hijacking, or any other purpose.
### `tabs`
- **Purpose:** Open internal extension pages (the Vault tab, the onboarding flow, the Stripe checkout success page), coordinate widget state across tabs of the same site, and target the correct tab when activating a license.
- **What Daneel does:** Query metadata (URL, title) of tabs that need to be coordinated with.
- **What Daneel does not do:** Inventory the User's open tabs, record browsing history, or transmit tab data off the device.
### `identity`
- **Purpose:** Drive the OAuth 2.0 + PKCE login flow for third-party services the User chooses to connect (for example Stripe, Vercel, Cloudflare via the Model Context Protocol). Uses Chrome's `chrome.identity.launchWebAuthFlow`.
- **Scope:** Invoked only when the User clicks "Connect" on a service card in Settings.
- **What Daneel does not do:** Authenticate the User to a WAEBS backend — there is no Daneel user account system. WAEBS never sees the OAuth tokens: they are stored in the User's browser and sent only to the service the User authorised.
### `webNavigation`
- **Purpose:** Used for two User-initiated flows:
- Detect Single-Page Application (SPA) navigation events on pages where the widget is active (for example YouTube navigating between videos without a full page load) so the widget can refresh its state (for example re-extract the new video's transcript).
- Capture the redirect URI that completes an OAuth flow when the User connects a third-party service (e.g. an MCP server). Daneel listens for that redirect only in the tab it opened for the auth flow, and closes that tab as soon as the redirect is observed.
- **What Daneel does not do:** Record a history of pages the User visits. Navigation events are consumed in memory and not stored.
### `declarativeNetRequest`
- **Purpose:** Remove the `Origin` HTTP header on `POST` requests the extension makes to certain third-party services that reject cross-origin preflight requests from browser extensions. This is the Manifest V3-compliant way to handle that constraint.
- **Scope:** Applied only to extension-originated requests to services the User has registered.
- **What Daneel does not do:** Block ads, filter trackers, rewrite URLs, or modify any traffic that Daneel did not initiate itself.
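A rule of this kind might look like the following `chrome.declarativeNetRequest` dynamic rule. This is a hypothetical sketch: the rule id, URL filter, and registration function are illustrative, not taken from Daneel's source.

```javascript
// Hypothetical dynamic rule stripping the Origin header on extension POSTs
// to one registered service (rule id and URL filter are illustrative).
async function allowExtensionPost(urlFilter, ruleId) {
  await chrome.declarativeNetRequest.updateDynamicRules({
    removeRuleIds: [ruleId],  // idempotent: replace any previous version
    addRules: [{
      id: ruleId,
      priority: 1,
      action: {
        type: 'modifyHeaders',
        requestHeaders: [{ header: 'Origin', operation: 'remove' }],
      },
      condition: { urlFilter, resourceTypes: ['xmlhttprequest'] },
    }],
  });
}
```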
### `alarms`
- **Purpose:** Schedule a heartbeat (roughly every minute) so the background service worker can resume long-running tasks the User has started (site crawl, vault indexing, knowledge-graph build, data export, data import) after Chrome's Manifest V3 eviction of idle service workers.
- **What Daneel does not do:** Use alarms for standalone background operations. The alarm handler only resumes tasks the User has explicitly initiated; it does not start new work on its own.
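This heartbeat is a common Manifest V3 idiom; a minimal sketch, with an illustrative alarm name and `resumePendingTasks` callback:

```javascript
// Minimal MV3 heartbeat sketch: chrome.alarms survives service-worker
// eviction, so each firing gives the worker a chance to resume saved tasks.
const HEARTBEAT = 'task-heartbeat';  // illustrative name

function installHeartbeat(resumePendingTasks) {
  chrome.alarms.create(HEARTBEAT, { periodInMinutes: 1 });
  chrome.alarms.onAlarm.addListener((alarm) => {
    if (alarm.name === HEARTBEAT) resumePendingTasks();
  });
}
```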
### `notifications`
- **Purpose:** Show native operating-system toast notifications when long-running background tasks reach a milestone (start, complete, failed, cancelled). Can be disabled in Settings → Notifications.
- **What Daneel does:** Render a toast such as `Indexing > site > example.com > Complete`. No content of the task — no URL titles, no query text — is included.
- **What Daneel does not do:** Use notifications for marketing, re-engagement, or upsell.
### `host_permissions` (`<all_urls>`)
- **Purpose:** Allow the Daneel widget to run on any website so the User can ask questions about the current page (Page Q&A) or index a site for later search (Site RAG).
- **Scope:** Daneel reads web page content only in response to an explicit User action: asking a question about the current page, or starting a site index. Once the User starts a site index, Daneel continues crawling that site's pages in the background service worker until the crawl is complete or the User cancels it — this is the continuation of the action the User started. Daneel does not read pages on sites where no indexing task has been started.
- **What Daneel does not do:** Crawl the open web without User initiation, build a profile of the User's browsing, or transmit page content to WAEBS servers.
---
## 6. Third-party services (sub-processors)
The following processors may receive or store data on behalf of WAEBS, each strictly for the operational purpose listed:
| Processor | What they do | Data they receive | Their policy |
|---|---|---|---|
| **Stripe, Inc.** (US) | Payment processing, Checkout UI, and the automatic payment receipt email sent to the purchasing email address | Email, card details (entered directly by the User), billing country, amount, currency | [stripe.com/privacy](https://stripe.com/privacy) |
| **Supabase, Inc.** (US; data hosted in EU West / Ireland) | Database (PostgreSQL) and backend hosting | Purchase records, license keys, activation counters | [supabase.com/privacy](https://supabase.com/privacy) |
| **Vercel, Inc.** (US; EU regions available) | Hosting the Daneel backend serverless endpoints and the public marketing site | Standard HTTP request metadata (IP, user-agent, timestamp) handled at the edge for routing and abuse-mitigation | [vercel.com/legal/privacy-policy](https://vercel.com/legal/privacy-policy) |
| **Resend** (Ireland, EU West / `eu-west-1`) | Transactional email delivery. Daneel sends the license-key confirmation email from `noreply@daneel.injen.io` via Resend after a successful purchase | The purchasing email address, the license key, and the email body | [resend.com/legal/privacy-policy](https://resend.com/legal/privacy-policy) |
| **Google LLC** (Chrome Web Store) | Extension distribution and automatic updates | Standard Chrome Web Store update telemetry, handled by Google | [policies.google.com/privacy](https://policies.google.com/privacy) |
| **Google Analytics 4** (Google LLC) | Optional anonymous product analytics (enabled by default, can be turned off in Settings → Privacy) | Event names from a fixed catalog, device/browser properties, timezone-derived coarse geography | [policies.google.com/privacy](https://policies.google.com/privacy) |
| **ipapi.co** | Used **only if** the User opts in to IP-based geography enrichment (`ipGeoEnabled` toggle, off by default). A single HTTP call on service-worker startup | The User's IP address (as part of any HTTP request) | [ipapi.co/privacy](https://ipapi.co/privacy) |
| **LLM backends the User configures** | AI inference | The User's prompts and context, as the User chooses. **Not WAEBS sub-processors** — the User picks and controls them | Each provider's own terms |
---
## 7. Data storage and security
**AI content and user data.** Lives entirely inside the User's browser (`IndexedDB`, `chrome.storage.local`) or on infrastructure the User or the User's employer operates (a self-hosted Ollama server, an Azure OpenAI tenant). WAEBS has no access.
**Payment and license data.** Lives in the Daneel Supabase (PostgreSQL) database, in the EU West (Ireland) region. Encrypted at rest by Supabase. Access is restricted to WAEBS administrators via authenticated service credentials. Row-level security policies are applied on the server-side API layer.
**Transit.** All network calls from the extension to the `daneel.injen.io` endpoints use HTTPS / TLS 1.2 or higher.
**Retention.**
- AI content, vaults, conversations, knowledge graphs — **persist in the User's browser until the User deletes them** from Daneel's UI or clears the browser's site data.
- Payment and license data — retained as described in §3.2.A.
- Operational logs — 30 days maximum.
- Telemetry (for Users who have not opted out) — retained by Google Analytics for up to 14 months (GA4's default).
**Breach notification.** In the event of a data breach affecting the Supabase records, WAEBS will notify affected Users by email within **72 hours** of becoming aware of the breach, consistent with GDPR Articles 33 and 34, and notify the CNIL as required.
---
## 8. Data sharing and sale
- WAEBS does **not** sell personal information.
- WAEBS does **not** share personal information with advertisers, data brokers, or analytics vendors other than the processors listed in §6, and those only for the operational purposes described.
- WAEBS does **not** train AI models on User data — WAEBS has no access to User conversations in the first place.
- WAEBS may disclose data if compelled by a lawful request (subpoena, court order, or binding regulatory order), in which case it will notify the affected User to the extent legally permitted.
---
## 9. International data transfers
Stripe and Vercel are US-headquartered companies. Supabase is US-headquartered but offers regional data residency; Daneel's Supabase project uses the **EU West (Ireland, `eu-west-1`) region**, so license and payment records are primarily stored inside the EU. Resend is headquartered in Ireland and its infrastructure used for Daneel runs in the `eu-west-1` region. Standard cross-border support and operations may still involve incidental access from outside the EU.
Where personal data is transferred to the United States or to any country that does not have an EU Commission adequacy decision, WAEBS relies on:
- **Standard Contractual Clauses (SCCs)** executed by each processor, and/or
- The **EU–US Data Privacy Framework** where the processor is certified.
Stripe and Supabase both publish SCC-backed Data Processing Addenda.
---
## 10. User rights
### 10.1. Users in the European Union, European Economic Area, United Kingdom, or Switzerland (GDPR / UK GDPR / Swiss FADP)
Users have the right to:
- **Access** (Art. 15) — request a copy of the personal data held about them
- **Rectification** (Art. 16) — have inaccurate data corrected
- **Erasure** (Art. 17) — have their data deleted, subject to tax-retention obligations that may require WAEBS to keep certain purchase records for the statutory period (ten years)
- **Restriction** (Art. 18) — ask WAEBS to stop processing in specific situations
- **Portability** (Art. 20) — receive the data they provided in a structured, commonly used, machine-readable format
- **Object** (Art. 21) — object to processing based on legitimate interest
- **Withdraw consent** at any time where processing is based on consent
- **Lodge a complaint** with the relevant national data protection authority. For France, this is the CNIL ([cnil.fr](https://www.cnil.fr))
To exercise these rights, Users may email `think+daneel-privacy@injen.io` from the address associated with the purchase. WAEBS responds within **30 days**.
### 10.2. California residents (CCPA / CPRA)
California residents have the right to:
- **Know** what personal information has been collected about them
- **Delete** their personal information, subject to the tax-retention exception above
- **Correct** inaccurate personal information
- **Opt out of sale or sharing** of personal information — not applicable, because **WAEBS does not sell or share personal information**
- **Non-discrimination** for exercising any of these rights
To exercise these rights, California residents may email `think+daneel-privacy@injen.io`.
### 10.3. Other jurisdictions
Residents of Brazil (LGPD), Canada (PIPEDA), and other jurisdictions with equivalent frameworks may exercise analogous rights by contacting WAEBS at the same address.
---
## 11. Children's privacy
Daneel AI is not directed at children. WAEBS does not knowingly collect personal data from anyone under 13 (United States, COPPA) or, where applicable, under 16 (European Union). If WAEBS becomes aware that it has inadvertently collected data from a minor, it will delete that data promptly. Guardians who believe their child has provided personal data may contact `think+daneel-privacy@injen.io`.
---
## 12. Changes to this policy
- Changes are reflected by updating the **Last updated** date at the top of this page.
- **Material changes** — for example, adding a new sub-processor, changing the region in which data is stored, or adding a new category of data collected — will also be announced in the extension's release notes and in the News section inside Daneel.
- Continued use of Daneel after a material change constitutes acceptance of the updated policy.
---
## 13. Contact
For any question, request, or complaint about this policy or a User's personal data:
- **Email:** [think+daneel-privacy@injen.io](mailto:think+daneel-privacy@injen.io)
- **Postal mail:**
WAEBS — Daneel AI Privacy
131 Boulevard Pereire
75017 Paris
France
- **Response SLA:** WAEBS responds within **30 days**.
---
## 14. Chrome Web Store Limited Use disclosure
Daneel AI's use and transfer of information received from Google APIs to any other app will adhere to the [Chrome Web Store User Data Policy](https://developer.chrome.com/docs/webstore/user_data/), including the Limited Use requirements.
Daneel AI does not currently use Google OAuth or any Google user-data API. If that ever changes, the following additional Limited Use language applies, verbatim:
> Daneel AI's use of information received from Google APIs will adhere to the [Google API Services User Data Policy](https://developers.google.com/terms/api-services-user-data-policy), including the Limited Use requirements.
---
*This page is served as both HTML and Markdown at the same URL. The `.md` version is intended for automated processing and AI crawlers and is linguistically identical to the rendered page.*
---
# AI Providers
**Complete reference for all LLM and embedding providers supported by Daneel AI.**
> Source: https://doc.daneel.injen.io/reference/providers/index.md
To get started with a provider, see [Connect a Cloud Provider](/guides/connect-provider/). For a conceptual comparison, see [The Provider Spectrum](/concepts/providers/).
## LLM Providers
Daneel supports five LLM backends. All implement the same interface — switching providers changes the AI brain without affecting the rest of the experience.
### WebGPU (Local)
Runs AI models directly on your GPU using WebGPU and ONNX Runtime. No server, no API key, no internet after first model download.
| Property | Value |
|----------|-------|
| Data residency | On-device |
| Internet required | No (after model cache) |
| Tool calling | Experimental (prompt-based XML tags) |
| Streaming | Yes |
| Thinking/reasoning | Yes (model-dependent) |
| Cost | Free |
**Available models:**
Models are auto-selected based on your GPU capabilities. The catalog includes 20+ models from Liquid AI, Microsoft, HuggingFace, PrismML, DeepSeek, Zhipu, Alibaba, Meta, Google, and IBM, ranging from 350M to 3B+ parameters. Open **Settings > AI Models** to browse the full catalog with hardware compatibility scores.
Default model: **Granite 4.0 Micro 3B** (q4f16).
**Quantization formats:** Models ship in various quantization levels. Most use q4 (4-bit), which balances quality and size. Some models also offer q4f16 (4-bit with fp16 compute, requires shader-f16), q8 (8-bit), q2 (2-bit), and q1 (1-bit). The 1-bit and 2-bit formats are new, enabled by `@huggingface/transformers` 4.1.0.
Notable: [Bonsai 1.7B](https://huggingface.co/onnx-community/Bonsai-1.7B-ONNX) from [PrismML](https://prismml.com) is available in both q4 (1.1 GB) and q1 (291 MB). The q1 variant is the lightest thinking-capable model in the catalog, designed for low-end GPUs and fast cold starts.
**Configuration:** Settings > WebGPU. Model selection, quantization level, and context window are auto-configured based on GPU detection.
### Ollama (Local Server)
Connects to a local [Ollama](https://ollama.com/) server via the OpenAI-compatible API.
| Property | Value |
|----------|-------|
| Data residency | Local network |
| Internet required | No (LAN only) |
| Tool calling | Yes (OpenAI function format) |
| Streaming | Yes |
| Thinking/reasoning | Yes (think-block stripping) |
| Cost | Free (self-hosted) |
**Configuration:** Settings > Ollama.
| Setting | Default | Description |
|---------|---------|-------------|
| Base URL | `http://localhost:11434` | Ollama server address |
| Model | — | Selected from detected installed models |
| Availability timeout | 3,000 ms | How long to wait for server probe |
| Model list timeout | 5,000 ms | How long to wait for model enumeration |
Daneel auto-probes the Ollama server on settings open. Model management (pull, delete) is available in the Ollama settings panel.
### Gemini Nano (Chrome Built-in)
Uses Chrome's built-in Gemini Nano model via the Chrome AI API.
| Property | Value |
|----------|-------|
| Data residency | On-device |
| Internet required | No |
| Tool calling | Experimental (prompt-based XML tags) |
| Streaming | Yes |
| Thinking/reasoning | No |
| Cost | Free |
**Configuration:** Settings > Gemini Nano. Language selection. Availability is auto-detected — requires Chrome 120+ with the Gemini Nano flag enabled.
### Claude (Anthropic API)
Connects to Anthropic's Claude models via the API.
| Property | Value |
|----------|-------|
| Data residency | Third-party cloud (Anthropic) |
| Internet required | Yes |
| Tool calling | Yes (native `tool_use` blocks) |
| Streaming | Yes (SSE) |
| Thinking/reasoning | Yes |
| Cost | Per-token (see below) |
**Available models:**
| Model | Input cost | Output cost | Context |
|-------|-----------|-------------|---------|
| Claude Opus 4.7 | $5 / 1M tokens | $25 / 1M tokens | 200K |
| Claude Opus 4.6 | $5 / 1M tokens | $25 / 1M tokens | 200K |
| Claude Sonnet 4.6 | $3 / 1M tokens | $15 / 1M tokens | 200K |
| Claude Haiku 4.5 | $1 / 1M tokens | $5 / 1M tokens | 200K |
Cost annotations appear next to each response in the chat panel.
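Given the table above, the annotation is simple arithmetic. A sketch, with prices in USD per million tokens (the model keys below are illustrative, not Anthropic's real API identifiers):

```javascript
// Sketch: per-response cost from the pricing table above.
const PRICES = {
  'claude-sonnet-4.6': { input: 3, output: 15 },
  'claude-haiku-4.5': { input: 1, output: 5 },
};

function responseCostUsd(model, inputTokens, outputTokens) {
  const p = PRICES[model];
  return (inputTokens * p.input + outputTokens * p.output) / 1e6;
}
// e.g. 2,000 input + 500 output tokens on Haiku 4.5 → $0.0045
```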
**Configuration:** Settings > Claude. API key is encrypted with AES-256-GCM and stored locally. The key never leaves your browser unencrypted.
### Azure OpenAI (Enterprise)
Connects to Azure OpenAI Service deployments.
| Property | Value |
|----------|-------|
| Data residency | Your Azure tenant |
| Internet required | Yes |
| Tool calling | Yes (OpenAI function format) |
| Streaming | Yes |
| Thinking/reasoning | Model-dependent |
| Cost | Per your Azure pricing |
**Authentication:** API Key or Entra ID (OAuth2). See [How to Set Up Azure OpenAI](/how-to/azure-openai/) for configuration steps.
## Embedding Providers
Daneel uses a local embedding model for all vector operations (site indexing, vault search, knowledge graph).
| Model | Dimensions | Context | Backend | Size |
|-------|-----------|---------|---------|------|
| BGE Small EN v1.5 (default) | 384 | 512 tokens | WebGPU fp16 | ~23 MB |
| Granite Embedding | 384 | 1,024 tokens | WebGPU q8 | — |
| MiniLM-L6-v2 | 384 | 256 tokens | WebGPU q8 | — |
Embeddings always run locally regardless of your LLM provider choice, and are batched at a maximum of 32 chunks per call to avoid exhausting GPU memory.
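The 32-chunk batching can be sketched as plain slicing (illustrative, not Daneel's code):

```javascript
// Sketch: yield chunks in batches of at most 32 before embedding,
// so a single GPU call never sees an unbounded input.
function* embedBatches(chunks, size = 32) {
  for (let i = 0; i < chunks.length; i += size) {
    yield chunks.slice(i, i + size);
  }
}
```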
:::caution
Switching embedding models clears all existing indexes and vault embeddings, since vector dimensions may differ. Back up your data first.
:::
## Vector Search
| Implementation | Persistence | Use case |
|----------------|------------|----------|
| IndexedDBVectorStore | Persistent (survives browser restart) | Production — site indexes, vaults |
| GPUCosineSearch | In-memory (GPU-accelerated) | <5ms search over 50k+ chunks |
| InMemoryVectorStore | Ephemeral | Testing only |
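The GPU-accelerated path in the table scores by cosine similarity; a plain CPU reference for that metric (a sketch of the math, not Daneel's implementation):

```javascript
// CPU reference for cosine similarity over two equal-length vectors:
// dot(a, b) / (|a| * |b|), in the range [-1, 1].
function cosine(a, b) {
  let dot = 0, na = 0, nb = 0;
  for (let i = 0; i < a.length; i++) {
    dot += a[i] * b[i];
    na += a[i] * a[i];
    nb += b[i] * b[i];
  }
  return dot / (Math.sqrt(na) * Math.sqrt(nb));
}
```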
## Tool Calling Support by Provider
| Provider | Strategy | Reliability | Notes |
|----------|----------|-------------|-------|
| Claude | Native `tool_use` blocks | High | Best MCP experience |
| Ollama | OpenAI function format | High | Depends on model |
| Azure OpenAI | OpenAI function format | High | Depends on deployment |
| WebGPU | Prompt-based XML tags | Low | Small models struggle with tool format |
| Gemini Nano | Prompt-based XML tags | Low | 3B model often misformats calls |
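For the prompt-based providers, tool calls arrive as tagged text that must be parsed leniently. The exact tag and payload format is not documented here, so the following is purely illustrative:

```javascript
// Illustrative parser for a prompt-based tool call of the assumed shape
// <tool_call>{"name": ..., "arguments": {...}}</tool_call>.
// Returns null on malformed output, which small models produce often.
function parseToolCall(text) {
  const m = text.match(/<tool_call>([\s\S]*?)<\/tool_call>/);
  if (!m) return null;
  try {
    return JSON.parse(m[1]);
  } catch {
    return null;  // misformatted JSON: fall back to treating it as prose
  }
}
```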
---
# Settings Reference
**Complete reference for every settings panel and control in Daneel AI.**
> Source: https://doc.daneel.injen.io/reference/settings/index.md
To get started with settings, see [Connect a Cloud Provider](/guides/connect-provider/). For background on provider trade-offs, see [The Provider Spectrum](/concepts/providers/).
Daneel's settings are organized into 19 panels, accessible via the gear icon on the launcher bubble.
## Home
Dashboard overview showing:
- Active LLM provider and model
- GPU/hardware summary (GPU name, cores, RAM, VRAM, bandwidth)
- Index statistics (indexed sites, total chunks, storage estimate)
- Appearance summary (theme, color, position)
- Privacy and telemetry status
- License plan and feature flags
- MCP server count
- Current page index status
## News
Latest updates, features, and release notes from the Daneel AI team.
## Appearance
| Control | Options | Default |
|---------|---------|---------|
| Theme | Light, Dark | System-detected |
| Color style | Default (slate), Cyan, Magenta, Yellow, Green, Neon | Default |
| Launcher position | Bottom right, Bottom center | Bottom right |
## Data Backup
| Control | Description |
|---------|-------------|
| Export | Download a `.zip` of all settings, vaults, indexes, agents, MCP configs |
| Import | Restore from a `.zip` backup (drag-drop or file picker) |
| Azure Blob Storage | SAS URL input for cloud backup |
| S3-Compatible Storage | Access key, secret, bucket, region, endpoint for S3/R2/B2/MinIO |
See [How to Back Up Your Data](/how-to/cloud-backup/) for step-by-step instructions.
## AI Models
Cross-provider model browser with:
- **Search** — filter by name, provider, description
- **Provider filter** — All, WebGPU, Ollama, Claude, Azure, Gemini Nano
- **Privacy filter** — On-device only, Local network, Your cloud, Any cloud
- **Capability filter** — Tool calling, Thinking, Vision
- **Hardware detection** — GPU description, VRAM, RAM, GFLOPS, shader-f16 support
- **Model wizard** — guided recommendation based on your hardware
- **Expandable model cards** — description, quality rating, match score, license, effective context window, speed estimate, data residency
## WebGPU
| Control | Description |
|---------|-------------|
| Model selection | Choose from 20+ models auto-filtered by GPU compatibility |
| Model loading | Download and cache models for offline use |
| Status | Current model status, GPU detection results |
## Ollama
| Control | Description |
|---------|-------------|
| Base URL | Ollama server address (default: `http://localhost:11434`) |
| Model selection | Dropdown of installed models (auto-detected) |
| Model management | Pull new models, delete existing ones |
| Connection status | Auto-probe indicator |
## Gemini Nano
| Control | Description |
|---------|-------------|
| Availability | Auto-detected Chrome AI API status |
| Language | Language preference for the built-in model |
## Claude
| Control | Description |
|---------|-------------|
| API key | Encrypted input (AES-256-GCM, stored locally) |
| Model selection | Opus 4.7, Opus 4.6, Sonnet 4.6, Haiku 4.5 |
## Azure OpenAI
| Control | Description |
|---------|-------------|
| Endpoint URL | Azure OpenAI resource endpoint |
| Deployment name | Your model deployment name |
| Auth method | API Key or Entra ID (OAuth2) |
| API key | Input field (when API Key auth is selected) |
See [How to Set Up Azure OpenAI](/how-to/azure-openai/).
## Indexes
| Control | Description |
|---------|-------------|
| Indexed sites list | Domain, page count, chunk count, last indexed date |
| Re-index | Re-crawl and update a site's index |
| Clear | Delete all indexed data for a domain |
Default crawl parameters:
| Parameter | Default |
|-----------|---------|
| Max pages | 150 |
| Max depth | 3 |
| Max chunks per page | 2,000 |
| Chunk size | 512 tokens |
| Chunk overlap | 64 tokens |
See [How to Manage Site Indexes](/how-to/manage-indexes/).
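The chunk size and overlap defaults above correspond to a sliding window over each page's tokens; a sketch (illustrative; real tokenization happens upstream):

```javascript
// Sketch: 512-token chunks with 64-token overlap (defaults from the table).
// `tokens` is any array; each window starts size - overlap after the last.
function chunkTokens(tokens, size = 512, overlap = 64) {
  const chunks = [];
  for (let i = 0; i < tokens.length; i += size - overlap) {
    chunks.push(tokens.slice(i, i + size));
    if (i + size >= tokens.length) break;  // last window reached the end
  }
  return chunks;
}
```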
## Knowledge Graph
| Control | Default | Description |
|---------|---------|-------------|
| NER model | GLiNER Small v2.1 int8 | Entity extraction model (4 options, 183–583 MB) |
| Ontology preset | General | Entity type categories (8 presets + custom) |
| Custom ontology | — | User-defined entity type labels |
| Dedup threshold | 0.85 | Entity name similarity threshold for merging |
| Extraction threshold | 0.55 | Minimum confidence for entity extraction |
| Max entity width | 12 | Maximum token span for a single entity |
| Particle animation | On | Animated particles on graph edges |
| Bloom glow | Off | Glow effect on nodes |
| Particle speed | 0.004 | Animation speed |
| Charge strength | -80 | Node repulsion force |
| Node scale | 4 | Base node size multiplier |
| Link opacity | 0.4 | Edge transparency |
| Background color | #060810 | Visualization background |
| Bloom strength | 1.0 | Intensity of bloom effect |
See [How to Build a Knowledge Graph](/how-to/knowledge-graph/).
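To make the dedup threshold concrete: with a normalized name-similarity score in [0, 1], two entities merge when their score reaches 0.85. Daneel's actual similarity metric is not specified here; the sketch below uses normalized edit distance as a stand-in:

```javascript
// Sketch: merge entities whose normalized name similarity >= 0.85.
// Similarity here is 1 - (edit distance / longer length), case-insensitive.
function levenshtein(a, b) {
  const dp = Array.from({ length: a.length + 1 }, () => new Array(b.length + 1).fill(0));
  for (let i = 0; i <= a.length; i++) dp[i][0] = i;
  for (let j = 0; j <= b.length; j++) dp[0][j] = j;
  for (let i = 1; i <= a.length; i++) {
    for (let j = 1; j <= b.length; j++) {
      dp[i][j] = Math.min(
        dp[i - 1][j] + 1,                                    // deletion
        dp[i][j - 1] + 1,                                    // insertion
        dp[i - 1][j - 1] + (a[i - 1] === b[j - 1] ? 0 : 1),  // substitution
      );
    }
  }
  return dp[a.length][b.length];
}

function shouldMerge(nameA, nameB, threshold = 0.85) {
  const a = nameA.toLowerCase(), b = nameB.toLowerCase();
  const sim = 1 - levenshtein(a, b) / Math.max(a.length, b.length);
  return sim >= threshold;
}
```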
## MCP
| Control | Description |
|---------|-------------|
| Featured servers | Curated list of known-good MCP servers |
| Search | Browse MCP registries |
| Registered servers | Your connected servers with status badges |
| Add custom server | SSE URL + auth configuration |
| Enable/disable | Toggle individual servers without removing them |
| Test connection | Verify server responds |
See [How to Connect an MCP Server](/how-to/mcp-server/).
## Agents
| Control | Description |
|---------|-------------|
| Agent list | All configured agents |
| Create/edit agent | Name, purpose, system prompt, bound MCP servers |
| Delete | Remove an agent |
See [How to Create a Custom Agent](/how-to/agents/).
## Docker
| Control | Description |
|---------|-------------|
| Companion sidecar | Always included (port 8809) |
| Ollama preset | Toggle Ollama container (port 11434) |
| MCP server templates | Add from catalog or custom command |
| YAML preview | Live preview of the generated compose file |
| Export | Download `daneel.compose.yml` + auto-register servers |
See [How to Set Up the Docker Companion](/how-to/docker-companion/).
## System
Hardware and runtime information: GPU details, browser version, extension version, memory usage.
## Privacy
### Context injection
| Control | Default | Description |
|---------|---------|-------------|
| Share location with agents | Off | Inject city-level geolocation into agent system prompts. Triggers a browser permission prompt on first enable. |
| Share date & timezone with agents | On | Inject current date, time, and IANA timezone into prompts. No permission needed. |
These are global gates — when off, no agent or MCP server can trigger context injection regardless of its own flags. See [Environment Context](/concepts/context-injection/) for the full architecture.
### Analytics
| Control | Default | Description |
|---------|---------|-------------|
| Anonymous usage analytics | On | Feature usage telemetry via GA4 |
| Enhanced geolocation | Off | IP-based country/region for analytics (separate from agent context injection) |
**Collected:** feature usage counters (chat, search, crawl, model load, MCP, agents, vault), provider and model name, OS, Chrome version, language, country/region.
**Never collected:** page content, URLs visited, chat messages, personal information.
See [Privacy Model](/concepts/privacy/) for the full data flow explanation.
## License
| Control | Description |
|---------|-------------|
| Status | Current plan and feature flags |
| Unlock | Opens Stripe checkout |
| Enter key | Manual license key entry (`DAN-XXXX-XXXX-XXXX`) |
## Debug
Diagnostic tools: logs, storage inspection, runtime state.
---
# Speech Providers
**Complete reference for the text-to-speech and speech-to-text providers supported by Daneel AI.**
> Source: https://doc.daneel.injen.io/reference/speech/index.md
To get started with speech, see [How to Read Messages Aloud and Dictate Questions](/how-to/speech/). For the design rationale behind multiple providers, see [Speech in Daneel](/concepts/speech/).
## Text-to-speech providers
Daneel supports three TTS providers. All implement the same interface; switching is a single click in **Settings > Speech**.
### System voices (default)
Uses the browser's built-in Speech Synthesis API. The voice catalog is whatever your operating system provides.
| Property | Value |
|---|---|
| Provider id | `web-speech` |
| Data residency | On-device (mostly) |
| Download size | 0 MB |
| Internet required | No (except for optional cloud voices) |
| Languages | All voices your OS provides |
| Streaming start | Instant |
| Cancellation latency | ~100 ms |
Chrome exposes a subset of voices, those whose names begin with `Google `, that stream your text to Google's servers for higher-quality prosody. Daneel filters these out by default. Enable **Settings > Speech > Advanced > Allow Google cloud voices** to expose them; they are clearly marked `(cloud)` in the voice list.
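The default filtering described above amounts to a simple name check; a sketch (voice objects would come from `speechSynthesis.getVoices()`, and the function name is illustrative):

```javascript
// Sketch: hide Chrome's cloud-backed voices unless the user opts in.
// Voices whose names begin with "Google " stream text to Google's servers.
function visibleVoices(allVoices, allowGoogleCloud = false) {
  return allowGoogleCloud
    ? allVoices
    : allVoices.filter((v) => !v.name.startsWith('Google '));
}
```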
### Kokoro 82M (local)
A neural TTS model running entirely in your browser on WebGPU. 82 million parameters, 54 voices, seven languages.
| Property | Value |
|---|---|
| Provider id | `kokoro` |
| Data residency | On-device |
| Download size | ~326 MB (one-time) |
| Internet required | First download only |
| Languages | en-US, en-GB, es, fr, it, hi, ja, zh |
| Quantization (dtype) | fp32 (Xenova reference config for WebGPU) |
| Sample rate | 24 kHz mono |
| Cache location | Browser Cache API (`transformers-cache` + `kokoro-voices`) |
The 54-voice list is split by locale and gender. High-quality voices are marked with emoji in kokoro-js's voice table (Heart ❤️, Bella 🔥, Nicole 🎧, Emma 🚺, George 🚹).
Voice style files are fetched on first use per voice and cached separately under the `kokoro-voices` Cache API bucket.
:::note
Kokoro uses fp32 on WebGPU rather than a smaller quantization because quantized variants force dequantization ops onto the CPU, slowing synthesis by a factor of 3 to 5. The trade-off is a larger download in exchange for a much faster runtime.
:::
### Moonshine (coming soon)
Placeholder provider. Catalog entries exist in the provider picker but the card remains disabled. When the provider class ships, it will extend local speech recognition with the same privacy guarantees as Kokoro.
## Speech-to-text providers
### Browser speech recognition (default)
Uses Chrome's built-in `SpeechRecognition` API. Audio streams to Google servers for transcription.
| Property | Value |
|---|---|
| Provider id | `web-speech` |
| Data residency | Third-party cloud (Google) |
| Download size | 0 MB |
| Internet required | Yes |
| Languages | Any BCP-47 tag supported by Chrome |
| Offline Mode behavior | Blocked; the mic button is disabled with a tooltip |
Set the recognition language in **Settings > Speech > Speech recognition > Language**. The default is `en-US`.
### Moonshine Base / Tiny (coming soon)
Two sizes of a local English speech recognition model. Catalog entries exist, provider classes pending.
| Variant | Download | Use case |
|---|---|---|
| Moonshine Base | ~120 MB | Best accuracy |
| Moonshine Tiny | ~55 MB | Low-end devices |
## Settings reference
All speech controls live under **Settings > Speech**, split into two sections.
### Text-to-speech section
| Control | Values | Default |
|---|---|---|
| Enabled | on / off | on |
| Provider | System voices / Kokoro 82M | System voices |
| Voice | provider-specific list | provider default |
| Speed | 0.5× to 2.0× | 1.0× |
| Auto-read responses | on / off | off |
| Allow Google cloud voices | on / off | off (advanced) |
The voice picker updates based on the active provider. Kokoro's picker is populated after the model is cached; before that, the card shows a Download button instead of the picker.
### Speech recognition section
| Control | Values | Default |
|---|---|---|
| Enabled | on / off | on |
| Provider | Browser speech recognition / Moonshine Base / Moonshine Tiny | Browser speech recognition |
| Recognition language | BCP-47 tag | `en-US` |
## Keyboard shortcut
`Alt+Space` toggles dictation from anywhere on the page. The shortcut is registered via the `toggle-stt` Chrome extension command and can be reassigned at `chrome://extensions/shortcuts`.
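For reference, a Chrome extension registers such a command in its `manifest.json` under the `commands` key. This is a sketch following the Chrome commands API shape; the description string is illustrative:

```json
{
  "commands": {
    "toggle-stt": {
      "suggested_key": { "default": "Alt+Space" },
      "description": "Toggle speech-to-text dictation"
    }
  }
}
```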
## UI affordances
- **Play button** — appears in the hover action row on every assistant message, between Copy and Delete. Flips to **Stop** when that message is playing.
- **Mic button** — appears in the chat composer next to Send. Four states: idle (grey), requesting-permission (amber, pulsing), listening (red, pulsing), transcribing (amber, static).
- **Test button** — next to the voice picker in Settings. Plays a short sample of the currently selected voice at the current rate.
- **Cloud badge** — the `(cloud)` suffix on voice list entries indicates a voice that streams text to a remote service. Visible only when Allow Google cloud voices is enabled.
## Privacy profiles
Each provider carries a `PrivacyProfile` consulted by the [Offline Mode](/how-to/offline/) network gate.
| Provider | leavesProcess | leavesMachine | dataObservers |
|---|---|---|---|
| System voices (local) | true | false | browser-vendor |
| System voices (Google cloud) | true | true | browser-vendor |
| Kokoro 82M | false | false | none |
| Web Speech STT | true | true | browser-vendor |
| Moonshine (planned) | false | false | none |
When `leavesMachine: true` and Offline Mode is effective, the network gate blocks the call and the relevant UI affordance (mic button, cloud-voice playback) disables with a tooltip.
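A minimal sketch of that gate, assuming a `PrivacyProfile` shape matching the table above; the function name and exact type are illustrative:

```typescript
// Offline Mode network gate: only calls whose privacy profile says the
// data would leave the machine are blocked.
interface PrivacyProfile {
  leavesProcess: boolean;
  leavesMachine: boolean;
  dataObservers: "none" | "browser-vendor";
}

function isBlockedOffline(
  profile: PrivacyProfile,
  offlineModeEffective: boolean
): boolean {
  return offlineModeEffective && profile.leavesMachine;
}

// Profiles transcribed from the table above.
const webSpeechStt: PrivacyProfile = {
  leavesProcess: true,
  leavesMachine: true,
  dataObservers: "browser-vendor",
};

const kokoro: PrivacyProfile = {
  leavesProcess: false,
  leavesMachine: false,
  dataObservers: "none",
};
```

So with Offline Mode on, Web Speech STT is blocked (the mic button disables) while Kokoro playback proceeds untouched.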
## What Daneel never touches
- **Raw audio waveforms are not persisted.** Neither the PCM produced by Kokoro nor the audio captured by the mic is written to storage. Everything lives in memory for the duration of the playback or recording.
- **Transcripts are not saved outside the chat message.** When dictation completes, the text lands in the composer. If you do not send the message, nothing is stored.
- **No telemetry includes speech content.** The analytics catalog explicitly forbids logging transcripts, voice IDs the user typed, or error messages; only enums, booleans, durations, and character counts are emitted.
---
# Storage and Limits
**What Daneel AI stores, where it lives, and the limits that apply at each tier.**
> Source: https://doc.daneel.injen.io/reference/storage/index.md
For backup instructions, see [How to Back Up Your Data](/how-to/cloud-backup/). For privacy implications, see [Privacy Model](/concepts/privacy/).
## Storage locations
Daneel uses two browser storage mechanisms:
### Chrome Storage (`chrome.storage.local`)
Stores settings, credentials, and lightweight metadata.
| Key | Content |
|-----|---------|
| `selectedProvider` | Active LLM provider name |
| `selectedModelId` | Active WebGPU model ID |
| `selectedEmbeddingModelId` | Active embedding model ID |
| `claudeApiKeyEncrypted` | AES-256-GCM encrypted Claude API key |
| `claudeModelId` | Selected Claude model |
| `ollamaBaseUrl` | Ollama server URL |
| `daneel:mcp:servers` | Registered MCP server list |
| `daneel:mcp:disabled` | Disabled MCP server IDs |
| `daneel:mcp:catalog` | Cached MCP tool manifests |
| `daneel:credential:{url}` | MCP server credentials |
| `daneel:agents` | Agent definitions |
| `daneel:docker:config` | Docker Companion configuration |
| `daneel:cloud:azure-sas` | Azure Blob Storage SAS URL |
| `daneel:cloud:s3-config` | S3 credentials (JSON) |
| Widget preferences | Theme, color style, position, panel width |
| License data | JWT token, plan tier, feature flags |
| Telemetry preferences | Analytics opt-in/out |
### IndexedDB
Stores vector embeddings and large datasets, partitioned by domain or vault.
| Store | Partitioning | Content |
|-------|-------------|---------|
| `embeddings:{hostname}` | Per domain | Site page chunks + vector embeddings |
| `vault:{vaultId}` | Per vault | Document chunks + vector embeddings |
| Knowledge graph data | Per vault | Entities, relationships, graph structure |
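The partitioning scheme above amounts to deriving a store name from the scope. A sketch, with names mirroring the table (the real schema may differ):

```typescript
// Per-domain store for site embeddings, e.g. "embeddings:example.com".
function siteStore(hostname: string): string {
  return `embeddings:${hostname}`;
}

// Per-vault store for document embeddings, e.g. "vault:v1".
function vaultStore(vaultId: string): string {
  return `vault:${vaultId}`;
}
```

Partitioning this way lets Daneel drop a single site's or vault's index without touching any other data.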
### CacheStorage
Model files downloaded from HuggingFace are cached in the browser's CacheStorage. These persist across sessions and enable offline use after first download. Daneel uses two separate named caches:
| Cache name | Populated by | Contents |
|------------|--------------|----------|
| `transformers-cache` | transformers.js runtime | ONNX shards, tokenizer files, configuration JSON for LLM and embedding models, plus tokenizer files for GLiNER |
| `daneel-ner-models` | NER worker | GLiNER ONNX binaries, fetched directly outside the transformers.js pipeline |
To see how much disk each cached model uses and to reclaim space, see [How to Reclaim Disk Space from Local Models](/how-to/models-storage/).
## Free vs. paid limits
| Feature | Free | Paid |
|---------|------|------|
| Page Chat | Unlimited | Unlimited |
| Site Search | Unlimited | Unlimited |
| Vaults | 1 | Unlimited |
| Documents per vault | 5 | 50 |
| Max file size | 1 MB | 10 MB |
| Max characters per document | 50,000 | 500,000 |
| Max chunks per document | 100 | 1,000 |
| MCP servers | Unlimited | Unlimited |
| Agents | Unlimited | Unlimited |
| Knowledge Graph | Yes | Yes |
| Cloud backup | Yes | Yes |
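How these per-vault limits could be enforced, as a sketch: the numbers come straight from the table, but the `Plan` type and function name are illustrative:

```typescript
// Document limits per plan, transcribed from the table above.
type Plan = "free" | "paid";

const LIMITS: Record<
  Plan,
  { docsPerVault: number; maxFileMB: number; maxChars: number; maxChunks: number }
> = {
  free: { docsPerVault: 5, maxFileMB: 1, maxChars: 50_000, maxChunks: 100 },
  paid: { docsPerVault: 50, maxFileMB: 10, maxChars: 500_000, maxChunks: 1_000 },
};

// A document can be added when the vault has room and the file fits the cap.
function canAddDocument(plan: Plan, currentDocs: number, fileSizeBytes: number): boolean {
  const l = LIMITS[plan];
  return currentDocs < l.docsPerVault && fileSizeBytes <= l.maxFileMB * 1024 * 1024;
}
```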
## License
One-time payment via Stripe. After purchase, you receive a `DAN-XXXX-XXXX-XXXX` license key. The license is verified via a signed JWT with a 7-day TTL and offline caching — you don't need to be online continuously.
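The offline-caching behavior implied by the 7-day TTL can be sketched as follows, assuming the cached record stores the timestamp of the last successful online verification; names are illustrative:

```typescript
// 7-day license TTL: the cached JWT stays valid for offline use until
// seven days after its last online verification.
const TTL_MS = 7 * 24 * 60 * 60 * 1000;

interface CachedLicense {
  jwt: string;
  verifiedAt: number; // epoch ms of last successful online verification
}

function isLicenseFresh(cached: CachedLicense, now: number): boolean {
  return now - cached.verifiedAt < TTL_MS;
}
```

Within the window the cached license is honored without a network round-trip; past it, a fresh online verification is needed.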
## Data portability
All data can be exported as a portable `.zip` file via **Settings > Data Backup**. Cloud backup to Azure Blob Storage or S3-compatible storage is also available. See [How to Back Up Your Data](/how-to/cloud-backup/).
**Excluded from exports** (for security): API keys, cloud storage credentials (SAS URLs, S3 secrets).