mirror of
https://github.com/openclaw/openclaw.git
synced 2026-04-06 06:41:08 +00:00
When a model endpoint becomes unreachable (e.g., local proxy down, relay server offline), the failover system fails to switch to the next candidate model. Errors like "Connection error." are not classified as retryable, causing the session to hang on a broken endpoint instead of falling back to healthy alternatives.

Connection/network errors are not recognized by the current failover classifier:

- Text patterns like "Connection error.", "fetch failed", "network error"
- Error codes like ECONNREFUSED, ENOTFOUND, EAI_AGAIN (in message text)

While `failover-error.ts` handles these as error codes (`err.code`), it misses them when they appear as plain text in error messages.

Extend timeout error patterns to include connection/network failures:

**In `errors.ts` (`ERROR_PATTERNS.timeout`):**

- Text: "connection error", "network error", "fetch failed", etc.
- Regex: `/\beconn(?:refused|reset|aborted)\b/i`, `/\benotfound\b/i`, `/\beai_again\b/i`

**In `failover-error.ts` (`TIMEOUT_HINT_RE`):**

- Same patterns for non-assistant error paths

Added test cases covering:

- "Connection error."
- "fetch failed"
- "network error: ECONNREFUSED"
- "ENOTFOUND" / "EAI_AGAIN" in message text

- **Compatibility:** High - only expands retryable error detection
- **Behavior:** Connection failures now trigger automatic fallback
- **Risk:** Low - changes are additive and well-tested
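The message-text matching described above can be illustrated with a minimal sketch. It is not the actual code in `errors.ts` or `failover-error.ts`: the names `CONNECTION_TEXT_PATTERNS`, `CONNECTION_CODE_PATTERNS`, and `isConnectionErrorMessage` are hypothetical. Only the three text phrases and three regexes are taken from this description.

```typescript
// Hypothetical sketch: these names are illustrative and are not the
// project's real exports. The phrases and regexes come from the description.
const CONNECTION_TEXT_PATTERNS: readonly string[] = [
  "connection error",
  "network error",
  "fetch failed",
];

const CONNECTION_CODE_PATTERNS: readonly RegExp[] = [
  /\beconn(?:refused|reset|aborted)\b/i,
  /\benotfound\b/i,
  /\beai_again\b/i,
];

// Returns true when the message text looks like a connection/network failure,
// whether it uses a plain phrase or embeds a Node.js-style error code.
function isConnectionErrorMessage(message: string): boolean {
  const lower = message.toLowerCase();
  if (CONNECTION_TEXT_PATTERNS.some((phrase) => lower.includes(phrase))) {
    return true;
  }
  // The regexes carry no `g` flag, so `test` keeps no state between calls.
  return CONNECTION_CODE_PATTERNS.some((re) => re.test(message));
}
```

Under these assumptions, a message such as "connect ECONNREFUSED 127.0.0.1:8080" would be treated as retryable even when `err.code` is unset. An unrelated message such as "rate limit exceeded" would not match.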