Peter Steinberger
a0ff9d9bbb
perf(test): reduce sync passes in memory batch failure test
2026-02-14 23:16:37 +00:00
Peter Steinberger
b3c3ec4231
perf(test): reuse managers in embedding token limit suite
2026-02-14 23:16:37 +00:00
Peter Steinberger
9860d6fcc2
perf(test): reuse managers in embedding batches suite
2026-02-14 23:16:37 +00:00
Peter Steinberger
57b91b6b81
perf(test): reuse memory manager in batch suite
2026-02-14 23:16:36 +00:00
Vignesh Natarajan
c0bf6bc24f
Memory/QMD: parse scope once in qmd scope checks
2026-02-14 14:59:18 -08:00
Vignesh Natarajan
0fdcb3be43
Memory/QMD: skip unchanged session export writes
2026-02-14 14:59:18 -08:00
Vignesh Natarajan
83e08b3bd5
Memory/QMD: optimize qmd readFile for line-window reads
2026-02-14 14:59:18 -08:00
Vignesh Natarajan
62aae7f69d
Memory/QMD: add limit arg to search command
2026-02-14 14:59:18 -08:00
Vignesh Natarajan
19df928e7f
Memory/QMD: robustly parse noisy qmd JSON output
2026-02-14 14:59:18 -08:00
Vignesh Natarajan
6bf333bf31
Memory/QMD: prefer exact docid lookup in index
2026-02-14 14:59:18 -08:00
Vignesh Natarajan
f9f816d139
Memory/QMD: cap qmd command output buffering
2026-02-14 14:59:18 -08:00
Peter Steinberger
9521fe977a
refactor(test): dedupe openai batch test fetch mocks
2026-02-14 20:15:35 +00:00
Peter Steinberger
82f0388951
test: disable unsafe memory reindex for atomic suite
2026-02-14 20:12:26 +00:00
Peter Steinberger
3f5351529f
perf(test): skip atomic sqlite swaps for memory index
2026-02-14 20:12:26 +00:00
Peter Steinberger
d5a724fbee
perf(test): mock chokidar in memory tests
2026-02-14 18:46:24 +00:00
Peter Steinberger
f2c56de955
perf(test): speed up memory suites
2026-02-14 16:36:15 +00:00
Peter Steinberger
2b5e0a6075
perf(test): speed up memory batch + web logout
2026-02-14 16:36:15 +00:00
Peter Steinberger
76e4e9d176
perf(test): reduce skills + update + memory suite overhead
2026-02-14 16:36:15 +00:00
Peter Steinberger
684c18458a
perf(test): speed up line, models list, and memory batch
2026-02-14 16:36:15 +00:00
Peter Steinberger
bc4881ed0c
refactor(memory): share stale index cleanup
2026-02-14 15:39:46 +00:00
Peter Steinberger
7bd073340a
refactor(memory): share batch output parsing
2026-02-14 15:39:45 +00:00
Peter Steinberger
e707a7bd36
refactor(memory): reuse runWithConcurrency
2026-02-14 15:39:44 +00:00
Peter Steinberger
eb4215d570
perf(test): speed up Vitest bootstrap
2026-02-14 12:13:27 +00:00
vignesh07
e38ed4f640
fix(memory): default qmd searchMode to search + scope search/vsearch to collections
2026-02-13 23:14:34 -08:00
Peter Steinberger
a50638eead
perf(test): disable vector index in OpenAI batch tests
2026-02-14 05:25:40 +00:00
Peter Steinberger
0e5e72edb4
perf(test): shrink memory embedding batch fixtures
2026-02-14 05:25:40 +00:00
Peter Steinberger
115444b37c
perf(test): deflake and speed up qmd manager tests
2026-02-14 03:08:13 +00:00
Peter Steinberger
dd08ca97bb
perf(test): reduce import and fixture overhead in hot tests
2026-02-14 02:49:19 +00:00
Peter Steinberger
2583de5305
refactor(routing): normalize binding matching and harden qmd boot-update tests
2026-02-14 03:40:28 +01:00
Peter Steinberger
36726b52f4
perf(test): drop redundant memory reindex integration case
2026-02-14 02:37:09 +00:00
Peter Steinberger
63711330e4
perf(test): dedupe browser/telegram coverage and trim batch retry cost
2026-02-14 02:37:09 +00:00
Peter Steinberger
03fee3c605
refactor(memory): unify embedding provider constants
2026-02-14 03:16:46 +01:00
Peter Steinberger
61b5133264
fix(memory): align QAT default docs/tests (#15429) (thanks @azade-c)
2026-02-14 03:11:14 +01:00
Azade 🐐
5219f74615
fix(memory): use QAT variant of embedding model for better quality
Switch default local embedding model from embeddinggemma-300M to embeddinggemma-300m-qat (Quantization Aware Training). QAT models are trained with quantization in mind, yielding better embedding quality at the same size (Q8_0).
2026-02-14 03:11:14 +01:00
Peter Steinberger
e794ef0478
perf(test): reduce hot-suite setup and duplicate test work
2026-02-13 23:30:41 +00:00
Peter Steinberger
dc507f3dec
perf(test): reduce memory and port probe overhead
2026-02-13 23:22:30 +00:00
Peter Steinberger
1aa746f042
perf(test): lower synthetic payload in embedding batch split case
2026-02-13 23:16:42 +00:00
Peter Steinberger
faeac955b5
perf(test): trim retry-loop work in embedding batch tests
2026-02-13 23:16:42 +00:00
Peter Steinberger
e324cb5b94
perf(test): reduce fixture churn in hot suites
2026-02-13 23:16:41 +00:00
Peter Steinberger
dac8f5ba3f
perf(test): trim fixture and import overhead in hot suites
2026-02-13 23:16:41 +00:00
Peter Steinberger
4c401d336d
refactor(memory): extract manager sync and embedding ops
2026-02-13 19:08:37 +00:00
Peter Steinberger
ca3a42009c
refactor(memory): extract qmd scope helpers
2026-02-13 19:08:37 +00:00
Peter Steinberger
5d8eef8b35
perf(test): remove module reloads in browser and embedding suites
2026-02-13 15:31:17 +00:00
Peter Steinberger
faec6ccb1d
perf(test): reduce module reload churn in unit suites
2026-02-13 15:19:13 +00:00
Rodrigo Uroz
b912d3992d
(fix): handle Cloudflare 521 and transient 5xx errors gracefully (#13500)
Merged via /review-pr -> /prepare-pr -> /merge-pr.
Prepared head SHA: a8347e95c5
Co-authored-by: rodrigouroz <384037+rodrigouroz@users.noreply.github.com>
Co-authored-by: Takhoffman <781889+Takhoffman@users.noreply.github.com>
Reviewed-by: @Takhoffman
2026-02-11 21:42:33 -06:00
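A hedged sketch of what "handle transient 5xx gracefully" typically means in practice: retry Cloudflare 521 (and similar transient statuses) with exponential backoff instead of failing the request outright. The status set, attempt count, and delays below are illustrative assumptions, not the values from PR #13500.

    // Illustrative only: the retry set, attempt count, and backoff are
    // assumptions for this sketch, not the PR's actual values.
    const TRANSIENT_STATUS = new Set([500, 502, 503, 504, 521, 522, 524]);

    async function fetchWithRetry(url: string, init?: RequestInit, attempts = 3): Promise<Response> {
      let lastError: unknown = new Error("no attempts made");
      for (let attempt = 0; attempt < attempts; attempt++) {
        try {
          const res = await fetch(url, init);
          // Anything outside the transient set (including 4xx) returns as-is.
          if (!TRANSIENT_STATUS.has(res.status)) return res;
          lastError = new Error(`transient HTTP ${res.status}`);
        } catch (err) {
          lastError = err; // network-level failures are retried too
        }
        if (attempt < attempts - 1) {
          // Exponential backoff between attempts: 250ms, 500ms, 1s, ...
          await new Promise((resolve) => setTimeout(resolve, 250 * 2 ** attempt));
        }
      }
      throw lastError;
    }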
Vignesh Natarajan
36e27ad561
Memory: make qmd search-mode flags compatible
2026-02-11 17:51:08 -08:00
Vignesh Natarajan
6d9d4d04ed
Memory/QMD: add configurable search mode
2026-02-11 17:51:08 -08:00
Vignesh Natarajan
2f1f82674a
Memory/QMD: harden no-results parsing
2026-02-11 15:39:28 -08:00
Vignesh Natarajan
3d343932cf
Memory/QMD: treat plain-text no-results as empty
2026-02-11 15:39:28 -08:00
Rodrigo Uroz
7f1712c1ba
(fix): enforce embedding model token limit to prevent overflow (#13455)
* fix: enforce embedding model token limit to prevent 8192 overflow
- Replace EMBEDDING_APPROX_CHARS_PER_TOKEN=1 with UTF-8 byte length estimation (safe upper bound for tokenizer output)
- Add EMBEDDING_MODEL_MAX_TOKENS=8192 hard cap
- Add splitChunkToTokenLimit() that binary-searches for the largest safe split point, with surrogate pair handling
- Add enforceChunkTokenLimit() wrapper called in indexFile() after chunkMarkdown(), before any embedding API call
- Fixes: session files with large JSONL entries could produce chunks exceeding text-embedding-3-small's 8192 token limit
Tests: 2 new colocated tests in manager.embedding-token-limit.test.ts
- Verifies oversized ASCII chunks are split to <=8192 bytes each
- Verifies multibyte (emoji) content batching respects byte limits
* fix: make embedding token limit provider-aware
- Add optional maxInputTokens to EmbeddingProvider interface
- Each provider (openai, gemini, voyage) reports its own limit
- Known-limits map as fallback: openai 8192, gemini 2048, voyage 32K
- Resolution: provider field > known map > default 8192
- Backward compatible: local/llama uses fallback
* fix: enforce embedding input size limits (#13455) (thanks @rodrigouroz)
---------
Co-authored-by: Tak Hoffman <781889+Takhoffman@users.noreply.github.com>
2026-02-10 20:10:17 -06:00
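A minimal TypeScript sketch of the splitting logic this commit describes. The names splitChunkToTokenLimit, enforceChunkTokenLimit, and EMBEDDING_MODEL_MAX_TOKENS come from the commit message; the internals (the exact binary search, the resolveMaxInputTokens helper, the KNOWN_LIMITS shape) are assumptions for illustration, not the repository's actual code.

    // Hedged sketch of the approach in the commit above; only the exported
    // names are taken from the message, the internals are assumptions.
    const EMBEDDING_MODEL_MAX_TOKENS = 8192;

    // UTF-8 byte length is a safe upper bound on token count: common BPE
    // tokenizers never emit more tokens than input bytes.
    const utf8 = new TextEncoder();
    const byteLength = (s: string): number => utf8.encode(s).length;

    // Back past any low surrogate so a split never lands inside a pair.
    function safeSplitIndex(s: string, i: number): number {
      while (i > 0 && (s.charCodeAt(i) & 0xfc00) === 0xdc00) i--;
      return i;
    }

    // Binary-search the largest prefix that fits, then recurse on the rest.
    // Assumes maxTokens >= 4 so any single code point always fits.
    function splitChunkToTokenLimit(text: string, maxTokens: number): string[] {
      if (byteLength(text) <= maxTokens) return [text];
      let lo = 1;
      let hi = text.length;
      while (lo < hi) {
        const mid = Math.ceil((lo + hi) / 2);
        if (byteLength(text.slice(0, safeSplitIndex(text, mid))) <= maxTokens) lo = mid;
        else hi = mid - 1;
      }
      const cut = safeSplitIndex(text, lo);
      return [text.slice(0, cut), ...splitChunkToTokenLimit(text.slice(cut), maxTokens)];
    }

    // Wrapper applied to chunkMarkdown() output before any embedding call.
    function enforceChunkTokenLimit(chunks: string[], maxTokens: number): string[] {
      return chunks.flatMap((chunk) => splitChunkToTokenLimit(chunk, maxTokens));
    }

    // Provider-aware resolution: provider field > known-limits map > default.
    const KNOWN_LIMITS: Record<string, number> = { openai: 8192, gemini: 2048, voyage: 32000 };
    function resolveMaxInputTokens(p: { id: string; maxInputTokens?: number }): number {
      return p.maxInputTokens ?? KNOWN_LIMITS[p.id] ?? EMBEDDING_MODEL_MAX_TOKENS;
    }

Under these assumptions, enforceChunkTokenLimit(chunks, resolveMaxInputTokens(provider)) splits only the oversized chunks and passes the rest through unchanged.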