node-llama-cpp defaults contextSize to "auto". On large embedding models such as
Qwen3-Embedding-8B (trained context 40,960), this inflates the gateway's VRAM
footprint from ~8.8 GB to ~32 GB and causes OOM on single-GPU hosts where the
gateway shares the card with an LLM runtime.
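For context, a minimal sketch of the underlying call (the model path is a placeholder, and the option shape assumes node-llama-cpp v3's `createEmbeddingContext`):

```ts
import { getLlama } from "node-llama-cpp";

const llama = await getLlama();
const model = await llama.loadModel({
  // Placeholder path; any large embedding GGUF reproduces the effect.
  modelPath: "./Qwen3-Embedding-8B-Q8_0.gguf",
});

// With contextSize: "auto" (the default), node-llama-cpp sizes the context
// to the model's trained length (40,960 here), allocating KV cache and
// compute buffers for the full window even though memory-search inputs
// are only a few hundred tokens.
// const context = await model.createEmbeddingContext({ contextSize: "auto" });

// Capping the context bounds the non-weight allocation:
const context = await model.createEmbeddingContext({ contextSize: 4096 });

const embedding = await context.getEmbeddingFor("example memory chunk");
```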
Expose memorySearch.local.contextSize in openclaw.json (number | "auto"),
defaulting to 4096; this comfortably covers typical memory-search chunks
(128–512 tokens) while keeping non-weight VRAM bounded.
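A sketch of the resulting openclaw.json entry, assuming the nesting implied by the key path:

```json
{
  "memorySearch": {
    "local": {
      "contextSize": 4096
    }
  }
}
```

Omitting the field keeps the 4096 default; setting "auto" opts back into the model's trained context size.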
Closes #69667.