fix: keep local embedding batches from flooding providers

commit 956cb1c7db
parent 3f90005e56
Author: Peter Steinberger
Date: 2026-04-26 00:11:07 +01:00

19 changed files with 205 additions and 8 deletions


@@ -219,6 +219,17 @@ to an existing local file. `hf:` and HTTP(S) model references can still be used
explicitly with `provider: "local"`, but they do not make `auto` select local
before the model is available on disk.
### Inline embedding timeout

| Key | Type | Default | Description |
| ----------------------------------- | -------- | ---------------- | ------------------------------------------------------------------------ |
| `sync.embeddingBatchTimeoutSeconds` | `number` | provider default | Override the timeout for inline embedding batches during memory indexing |

When unset, the provider default applies: 600 seconds for local/self-hosted
providers such as `local`, `ollama`, and `lmstudio`, and 120 seconds for hosted
providers. Increase this when local CPU-bound embedding batches are healthy but
slow.
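For illustration, a minimal sketch of how the override might look in a JSON-style config file. The surrounding structure is an assumption; only the `sync.embeddingBatchTimeoutSeconds` key and the 600-second local default come from the table above:

```jsonc
{
  "sync": {
    // Hypothetical config shape. Raise the inline embedding timeout to
    // 20 minutes so healthy-but-slow CPU-bound local batches are not
    // cut off at the 600 s local/self-hosted default.
    "embeddingBatchTimeoutSeconds": 1200
  }
}
```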
---
## Hybrid search config
@@ -347,6 +358,10 @@ Prevents re-embedding unchanged text during reindex or transcript updates.
Available for `openai`, `gemini`, and `voyage`. OpenAI batch is typically
fastest and cheapest for large backfills.
This is separate from `sync.embeddingBatchTimeoutSeconds`, which governs inline
embedding calls only: those made by local/self-hosted providers, and by hosted
providers when no provider batch API is active.
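As an illustrative sketch of that separation (the config shape is an assumption; only the key name and the `openai`, `gemini`, and `voyage` provider names come from this page), a hosted setup might set an inline timeout that only matters when the batch API is not in use:

```jsonc
{
  "sync": {
    // Hypothetical config shape. This timeout applies to inline
    // embedding calls; while a provider batch API (openai, gemini,
    // voyage) handles a large backfill, it is not consulted.
    "embeddingBatchTimeoutSeconds": 180
  }
}
```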
---
## Session memory search (experimental)