GenerateOptions

Defined in: LLMProvider.ts:37

Options controlling text generation behavior.

All fields are optional — providers apply sensible defaults when omitted.

const options: GenerateOptions = {
  maxNewTokens: 512,    // cap on generated tokens; higher values use more GPU memory
  temperature: 0.3,     // has no effect here because doSample is false
  doSample: false,      // greedy decoding: always pick the most likely token
  thinkingBudget: 256,  // allow up to 256 tokens of <think> reasoning
};
const response = await provider.generate('Summarize this page.', options);
Properties

doSample? (readonly boolean, default: false)
Whether to use sampling (true) or greedy decoding (false).
Remarks: When false, the model always picks the highest-probability token regardless of temperature.
Defined in: LLMProvider.ts:70

maxNewTokens? (readonly number, default: provider-specific, typically 512)
Maximum number of tokens to generate.
Remarks: Higher values allow longer responses but increase GPU memory usage. Values above 512 may cause OOM errors on consumer GPUs with WebGPU providers.
Defined in: LLMProvider.ts:47

temperature? (readonly number, default: 0.3)
Sampling temperature. Lower values produce more deterministic output.
Remarks:
  0 — greedy decoding (always pick the most likely token)
  0.3 — slightly creative but mostly focused
  1.0 — full sampling distribution
Defined in: LLMProvider.ts:59

thinkingBudget? (readonly number, default: 0)
Maximum number of tokens the model may spend in its <think> reasoning block. Set to 0 to disable thinking entirely.
Remarks: Only effective with reasoning-capable models (e.g. LFM2.5-Thinking). The thinking tokens are filtered out by ThinkBlockFilter before the response reaches the user.
Defined in: LLMProvider.ts:83
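
Because doSample gates whether temperature matters at all, the two are best set together. A minimal sketch of the two common configurations, reusing the provider from the example above:

// Greedy decoding: fully deterministic; temperature can be omitted
// since it is ignored when doSample is false.
const deterministic: GenerateOptions = {
  doSample: false,
  maxNewTokens: 256, // stay well under 512 to avoid OOM on consumer GPUs with WebGPU
};

// Sampled decoding: temperature shapes how adventurous the output is.
const creative: GenerateOptions = {
  doSample: true,
  temperature: 1.0, // full sampling distribution
  maxNewTokens: 256,
};

const summary = await provider.generate('Summarize this page.', deterministic);
const ideas = await provider.generate('Suggest five alternative titles.', creative);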
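
Per the remarks above, thinkingBudget only takes effect when the provider is backed by a reasoning-capable model. A sketch, assuming the provider above wraps such a model (e.g. LFM2.5-Thinking):

// Grant the model a bounded <think> block before it answers.
const reasoned: GenerateOptions = {
  doSample: false,
  maxNewTokens: 512,
  thinkingBudget: 256, // at most 256 tokens spent inside <think>
};

// ThinkBlockFilter strips the <think> block from the output,
// so `answer` holds only the final response text.
const answer = await provider.generate(
  'Why does increasing temperature make output less deterministic?',
  reasoned,
);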