# CrawlOptions
Defined in: Crawler.ts:2
Options controlling a crawl operation.
## Properties
| Property | Modifier | Type | Description | Defined in |
|---|---|---|---|---|
| `delay?` | `readonly` | `number` | Delay between requests, in ms. Default: `100`. | Crawler.ts:18 |
| `excludePatterns?` | `readonly` | `readonly (string \| RegExp)[]` | URL patterns to exclude. Strings are substring-matched; RegExps are tested against the full URL. | Crawler.ts:14 |
| `headers?` | `readonly` | `Readonly<Record<string, string>>` | Extra HTTP headers appended to every request. | Crawler.ts:22 |
| `ignoredQueryParams?` | `readonly` | `readonly string[]` | Query parameter names stripped during URL normalization. | Crawler.ts:26 |
| `inPath?` | `readonly` | `string` | Strict path prefix filter; only links under this path are followed. | Crawler.ts:12 |
| `maxDepth?` | `readonly` | `number` | Maximum link-follow depth from the start URL. | Crawler.ts:4 |
| `maxPages?` | `readonly` | `number` | Maximum number of pages to crawl. | Crawler.ts:3 |
| `maxPathDepth?` | `readonly` | `number` | Maximum path segment count; URLs deeper than this are skipped. Default: `10`. | Crawler.ts:16 |
| `maxRetries?` | `readonly` | `number` | Retry attempts on network errors or HTTP 5xx responses. Default: `2`. | Crawler.ts:24 |
| `respectRobots?` | `readonly` | `boolean` | Whether to honor the site's robots.txt rules. | Crawler.ts:5 |
| `skipUrls?` | `readonly` | `readonly string[]` | URLs to skip, e.g. pages already crawled in a previous run (checkpoint resume). | Crawler.ts:7 |
| `timeout?` | `readonly` | `number` | Per-page fetch timeout, in ms. Default: `10000`. | Crawler.ts:20 |
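
## Example

A minimal sketch of an options object exercising most of these fields. The import path and the concrete values are assumptions for illustration; only the `CrawlOptions` shape itself comes from this page.

```ts
import type { CrawlOptions } from "./Crawler";

const options: CrawlOptions = {
  maxPages: 500,          // stop after 500 pages
  maxDepth: 5,            // follow links at most 5 hops from the start URL
  respectRobots: true,    // honor robots.txt
  inPath: "/docs/",       // only follow links under /docs/
  excludePatterns: [
    "/api/",              // string: substring match against the URL
    /\.(png|jpe?g|svg)$/, // RegExp: tested against the full URL
  ],
  ignoredQueryParams: ["utm_source", "utm_medium", "ref"],
  delay: 250,             // ms between requests (default 100)
  timeout: 15_000,        // per-page fetch timeout in ms (default 10000)
  maxRetries: 3,          // retries on network errors / HTTP 5xx (default 2)
  headers: { "User-Agent": "docs-crawler/1.0" },
};
```

Because every property is optional, `{}` is also a valid `CrawlOptions` value; the documented defaults then apply where one is listed.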