
This tutorial walks you through indexing a website and searching it with natural language. By the end, you'll have a local vector index you can query anytime, even offline.

## 1. Navigate to the site

Go to any website you want to index. Documentation sites, blogs, wikis, and knowledge bases all work well.

## 2. Open Site Search

Click the **magnifying glass icon** on the Daneel launcher bubble to open the search overlay.

## 3. Check for sitemaps

Daneel automatically checks the current domain for sitemaps. One of two things happens:

- **Sitemaps found** — the **Sitemap** discovery method is pre-selected. You'll see a checklist of discovered sitemaps with page counts.
- **No sitemap found** — Daneel switches to **Web Crawl**, which discovers pages by following links from your current page.

You can switch between the two methods at any time using the discovery method cards at the top of the panel. For a first test, use whichever Daneel selects automatically.

:::note
For details on when to choose one method over the other, see [How to Index a Site](/how-to/site-indexing/).
:::
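
If you're curious what that sitemap check involves, here's a minimal sketch of the idea in TypeScript. The probed paths, the parsing, and the function name are illustrative assumptions, not Daneel's actual implementation:

```ts
// Probe common sitemap locations on the current domain and count the
// <loc> entries in anything found. Paths and parsing are illustrative
// assumptions, not Daneel's implementation.
const CANDIDATE_PATHS = ["/sitemap.xml", "/sitemap_index.xml"];

async function discoverSitemaps(
  origin: string,
): Promise<{ url: string; pageCount: number }[]> {
  const found: { url: string; pageCount: number }[] = [];
  for (const path of CANDIDATE_PATHS) {
    try {
      const res = await fetch(origin + path);
      if (!res.ok) continue;
      const xml = new DOMParser().parseFromString(await res.text(), "application/xml");
      // <loc> elements hold page URLs (or child sitemap URLs in an index file).
      const locs = xml.getElementsByTagName("loc");
      found.push({ url: origin + path, pageCount: locs.length });
    } catch {
      // Network error or CORS block: treat as "no sitemap here".
    }
  }
  return found;
}
```

When nothing turns up, falling back to following links from the current page (Web Crawl) is the natural alternative, which is exactly the switch described above.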

## 4. Configure the crawl

Set your crawl parameters:

- **Max pages** — how many pages to crawl (1–200, default 50). Start small for your first run.
- **Depth** — how many levels deep to go (1–10, default 3). For sitemaps this controls sitemap nesting depth; for web crawl it controls the number of link hops from the starting page.

If you're using **Web Crawl**, you'll also see a **Path prefix** field. Daneel infers a prefix from your current URL to keep the crawl focused on the section you're browsing. You can edit or clear it.

For a first test, try 10–20 pages.
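
As a mental model, these parameters map onto a small configuration object. The interface and field names below are hypothetical, shown only to make the limits concrete:

```ts
// Hypothetical shape of the crawl settings above; the interface and
// field names are illustrative, not Daneel's real API.
interface CrawlConfig {
  method: "sitemap" | "webcrawl";
  maxPages: number;    // 1–200, default 50
  depth: number;       // 1–10, default 3 (sitemap nesting or link hops)
  pathPrefix?: string; // web crawl only, e.g. "/docs/"
}

// A sensible first run: small and scoped to one section.
const firstRun: CrawlConfig = {
  method: "webcrawl",
  maxPages: 15,
  depth: 2,
  pathPrefix: "/docs/",
};
```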

## 5. Start indexing

Click the **Crawl** button. Daneel begins the indexing pipeline:

1. **Discovers** page URLs (from the sitemap or by following links)
2. **Fetches** each page and extracts text content using Readability
3. **Splits** text into overlapping chunks
4. **Embeds** each chunk as a vector using your local embedding model
5. **Stores** everything in IndexedDB in your browser

A progress bar shows crawl and embedding progress. The task runs in the background, so you can close the panel or navigate away without losing progress.
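
Step 3, the splitting step, is the least visible part of the pipeline. Here's a minimal sketch of overlapping fixed-size chunking; the 1000-character size and 200-character overlap are assumptions, not Daneel's actual settings:

```ts
// Split extracted page text into fixed-size chunks that overlap, so a
// sentence falling on a chunk boundary still appears intact in at
// least one chunk. The 1000/200 sizes are illustrative assumptions.
function chunkText(text: string, chunkSize = 1000, overlap = 200): string[] {
  const chunks: string[] = [];
  const step = chunkSize - overlap;
  for (let start = 0; start < text.length; start += step) {
    chunks.push(text.slice(start, start + chunkSize));
    if (start + chunkSize >= text.length) break; // last chunk reached
  }
  return chunks;
}
```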

## 6. Search the index

Once indexing completes, a search box appears. Type a natural language question:

> *How do I configure authentication?*

Daneel runs a vector similarity search across all indexed chunks, finds the most relevant passages, and assembles an AI-powered answer with links to the source pages.
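
Under the hood, your question is embedded with the same local model and then scored against every stored chunk. A plain CPU sketch of that scoring step (Daneel's GPU path and exact ranking aren't shown; the names here are illustrative):

```ts
// Score every stored chunk against the query vector with cosine
// similarity and keep the k best matches. Daneel accelerates this on
// the GPU when available; this is the CPU version of the same idea.
function cosine(a: Float32Array, b: Float32Array): number {
  let dot = 0, na = 0, nb = 0;
  for (let i = 0; i < a.length; i++) {
    dot += a[i] * b[i];
    na += a[i] * a[i];
    nb += b[i] * b[i];
  }
  return dot / (Math.sqrt(na) * Math.sqrt(nb));
}

interface IndexedChunk {
  vector: Float32Array; // embedding of the chunk text
  text: string;
  url: string;
}

function topK(queryVector: Float32Array, chunks: IndexedChunk[], k = 5) {
  return chunks
    .map((c) => ({ ...c, score: cosine(queryVector, c.vector) }))
    .sort((a, b) => b.score - a.score)
    .slice(0, k);
}
```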

## 7. Review results

Each result shows:
- The source page title and URL
- A relevance score
- A text excerpt from the matching chunk

Click any source link to jump directly to that page.
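
The fields above map onto a small result record. This interface is only an illustration of what each result carries, not Daneel's actual type:

```ts
// Illustrative shape of a single search result as described above.
interface SiteSearchResult {
  title: string;   // source page title
  url: string;     // link back to the page
  score: number;   // relevance (cosine similarity), higher is better
  excerpt: string; // text from the matching chunk
}
```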

## What just happened

Daneel discovered the pages (via sitemap or link crawling), fetched each one through the background service worker, extracted clean text with Readability, chunked it, and embedded each chunk using the BGE Small model running on WebGPU. The vector index is stored in IndexedDB, partitioned by domain. Searches run cosine similarity (GPU-accelerated when available) and assemble the top results into a RAG prompt.

The entire index lives in your browser. Nothing was sent to any server (assuming you're on the WebGPU backend).
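
That last assembly step, turning the top chunks into an answer, is the retrieval-augmented generation (RAG) part. Here's a sketch of how such a prompt might be put together; the template wording and function name are assumptions, not Daneel's actual prompt:

```ts
// Assemble a RAG prompt: the user's question plus the top-scoring
// chunks as context, each labelled with its source URL so the answer
// can cite pages. The template text is illustrative only.
function buildRagPrompt(question: string, passages: { text: string; url: string }[]): string {
  const context = passages
    .map((p, i) => `[${i + 1}] (${p.url})\n${p.text}`)
    .join("\n\n");
  return [
    "Answer the question using only the context below.",
    "Cite sources by their [number].",
    "",
    "Context:",
    context,
    "",
    `Question: ${question}`,
  ].join("\n");
}
```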

## Next steps

- [How to Index a Site](/how-to/site-indexing/) for choosing between sitemap and web crawl, path prefix filtering, and safety guards
- [Build a Document Vault](/guides/first-vault/) to chat with your own files
- [Manage Site Indexes](/how-to/manage-indexes/) to re-index, clear, or view stats
- [How RAG Works](/concepts/rag/) to understand the pipeline under the hood
