* fix(onboard): increase verification timeout and reduce max_tokens for custom provider probes
The onboard wizard sends a chat-completion request to verify custom
providers. With max_tokens: 1024 and a 10 s timeout, large local
models (e.g. Qwen3.5-27B on llama.cpp) routinely time out because
the server needs to load the model and generate up to 1024 tokens
before responding.
Changes:
- Raise VERIFY_TIMEOUT_MS from 10 s to 30 s
- Lower max_tokens from 1024 to 1 (verification only needs a single
  token to confirm the API is reachable and the model ID is valid)
- Add explicit stream: false to both OpenAI and Anthropic probes
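
For illustration, the probe described above might look roughly like the sketch below. This is not the actual onboarding code: only VERIFY_TIMEOUT_MS, max_tokens, and stream come from this commit, while buildProbeBody, probeProvider, the prompt text, and the OpenAI-style Bearer header are hypothetical names and assumptions. It relies on the global fetch available in Node 18+.

```typescript
// Hypothetical sketch of a custom-provider verification probe.
// Only VERIFY_TIMEOUT_MS, max_tokens, and stream are taken from the commit;
// every other name here is an assumption made for illustration.
const VERIFY_TIMEOUT_MS = 30_000; // raised from 10 s to 30 s

interface ProbeBody {
  model: string;
  max_tokens: number;
  stream: boolean;
  messages: { role: "user"; content: string }[];
}

// Build the smallest request that still exercises the model ID.
function buildProbeBody(modelId: string): ProbeBody {
  return {
    model: modelId,
    max_tokens: 1, // one token is enough to confirm reachability
    stream: false, // ask for a single, non-streamed response
    messages: [{ role: "user", content: "Hi" }],
  };
}

// Send the probe and abort it if the server takes longer than the timeout.
async function probeProvider(
  url: string,
  apiKey: string,
  modelId: string,
): Promise<boolean> {
  const controller = new AbortController();
  const timer = setTimeout(() => controller.abort(), VERIFY_TIMEOUT_MS);
  try {
    const res = await fetch(url, {
      method: "POST",
      headers: {
        "Content-Type": "application/json",
        Authorization: `Bearer ${apiKey}`, // OpenAI-style auth (assumed)
      },
      body: JSON.stringify(buildProbeBody(modelId)),
      signal: controller.signal,
    });
    return res.ok;
  } catch {
    // Network failure or timeout abort: treat as "not verified".
    return false;
  } finally {
    clearTimeout(timer);
  }
}
```

With max_tokens set to 1, the response time is dominated by model load rather than generation, which is why the longer timeout and the smaller token budget work together.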
Closes #27346
Made-with: Cursor
* Changelog: note custom-provider onboarding verification fix
---------
Co-authored-by: Philipp Spiess <hello@philippspiess.com>