10924 Commits

Author SHA1 Message Date
Vincent Koc
911f853b7f fix(agents): preserve absent embedded session keys 2026-06-24 14:26:03 +08:00
Sally O'Malley
487951f813 fix(compaction): route codex oauth compaction natively (#95831)
Signed-off-by: sallyom <somalley@redhat.com>
2026-06-24 00:16:01 -04:00
Alexander Zogheb
cf86a9799c fix(agents): run heartbeat_prompt_contribution on harness prompt builds (#96233)
* fix(agents): run heartbeat_prompt_contribution on harness prompt builds

Harness runtimes (e.g. the Codex app-server) assemble the prompt through
resolveAgentHarnessBeforePromptBuildResult rather than the embedded runner's
resolvePromptBuildHookResult. The harness helper ran before_prompt_build and
before_agent_start but never invoked heartbeat_prompt_contribution, so that hook
silently no-ops on those runtimes: plugins that contribute heartbeat context via
the documented hook get nothing on heartbeat turns.

Invoke heartbeat_prompt_contribution from the harness helper too, gated on
ctx.trigger === "heartbeat", merging its prepend/append context ahead of the
before_prompt_build / before_agent_start contributions (matching the embedded
path's ordering). before_prompt_build appendContext is already honored here, so
no change is needed for boot-style append contributions.
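
The ordering described above can be sketched as follows. This is only an illustration: the hook names come from the commit message, while `PromptContribution`, `mergePromptContributions`, and the blank-line join are assumptions, not the project's actual types or merge rules.

```typescript
// Hypothetical shape for what a prompt hook returns (not the project's real type).
interface PromptContribution {
  prependContext?: string;
  appendContext?: string;
}

// Hypothetical helper: puts the heartbeat contribution ahead of the
// before_prompt_build / before_agent_start contributions, heartbeat turns only.
function mergePromptContributions(
  trigger: string,
  heartbeat: PromptContribution | undefined,
  others: PromptContribution[],
): PromptContribution {
  const ordered =
    trigger === "heartbeat" && heartbeat ? [heartbeat, ...others] : others;

  // Assumption: contexts are joined with a blank line; empty parts are dropped.
  const join = (parts: (string | undefined)[]): string | undefined => {
    const kept = parts.filter(
      (p): p is string => typeof p === "string" && p.length > 0,
    );
    return kept.length > 0 ? kept.join("\n\n") : undefined;
  };

  return {
    prependContext: join(ordered.map((c) => c.prependContext)),
    appendContext: join(ordered.map((c) => c.appendContext)),
  };
}
```

On a non-heartbeat trigger the heartbeat contribution is ignored entirely, which mirrors the `ctx.trigger === "heartbeat"` gate named in the message.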

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>

* fix(agents): preserve heartbeat hook ordering

---------

Co-authored-by: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
Co-authored-by: Vincent Koc <25068+vincentkoc@users.noreply.github.com>
2026-06-24 12:03:25 +08:00
Yuval Dinodia
82ccee027c fix(exec): preserve turn-source routing target in approval followups for plugin channels (#96140)
* fix(exec): preserve turn-source routing target in approval followups for plugin channels

When an async exec approval is resolved and the originating session is
resumed, buildAgentFollowupArgs forwarded the turn-source to/accountId/threadId
only for built-in deliverable channels or gateway-internal channels. For an
external channel plugin whose channel is not in the in-process deliverable set,
the followup dispatched the channel alone and dropped the recipient, so the resumed
agent reply routed to webchat instead of the originating channel.

Forward the turn-source routing fields whenever the resolved delivery target is
not used, matching how the channel itself is already preserved, so the gateway
can route the post-approval reply back to the originating channel.

Fixes #96103

* fix(exec): normalize followup thread routing

---------

Co-authored-by: Vincent Koc <vincentkoc@ieee.org>
2026-06-24 11:28:03 +08:00
Josh Lehman
96cee6cb64 refactor: route live model reads through session accessor (#96206) 2026-06-23 16:52:22 -07:00
Josh Lehman
6f2869c296 refactor: migrate agent session accessors (#96182)
* refactor: migrate agent session accessor writes

* refactor: move subagent orphan lookup to reconciliation

* test: align session accessor mocks
2026-06-23 16:31:43 -07:00
Josh Lehman
9512294e8f fix: bridge ACP metadata to session accessors (#96195)
* fix: bridge ACP metadata to session accessors

* fix: simplify ACP accessor key ownership

* fix: bind ACP metadata after session canonicalization
2026-06-23 16:14:53 -07:00
Jamil Zakirov
a86ca4f4ba perf(gateway): drop redundant per-access session-key case scan (#95699)
Merged via squash.

Prepared head SHA: 42c922460a
Co-authored-by: jzakirov <15848838+jzakirov@users.noreply.github.com>
Co-authored-by: jalehman <550978+jalehman@users.noreply.github.com>
Reviewed-by: @jalehman
2026-06-23 11:29:05 -07:00
Josh Lehman
0dfa22c6e0 refactor: add embedded run session target seam (#90439) 2026-06-23 10:08:29 -07:00
Josh Lehman
475252453b refactor: add transcript update identity contract (#89912) 2026-06-23 05:52:08 -07:00
Vincent Koc
d37300f357 fix(anthropic): narrow stream block index guard 2026-06-23 17:45:15 +08:00
Vincent Koc
9e63323388 perf(anthropic): index active stream blocks 2026-06-23 17:45:15 +08:00
Vincent Koc
4dac8f47ed perf(agents): index displaced tool results 2026-06-23 17:40:21 +08:00
maweibin
740578b596 fix: assistant reply lost between compaction summary and first kept user in successor transcript (#95484)
Merged via squash.

Prepared head SHA: eff5894fb8
Co-authored-by: maweibin <18023423+maweibin@users.noreply.github.com>
Co-authored-by: vincentkoc <25068+vincentkoc@users.noreply.github.com>
Reviewed-by: @vincentkoc
2026-06-23 17:25:45 +08:00
Yuval Dinodia
9f0d2427cd fix(agents): keep post-compaction user re-issue of a kept-tail prompt during compaction rotation (#94328)
Merged via squash.

Prepared head SHA: 05981b6c9f
Co-authored-by: yetval <102706514+yetval@users.noreply.github.com>
Co-authored-by: vincentkoc <25068+vincentkoc@users.noreply.github.com>
Reviewed-by: @vincentkoc
2026-06-23 16:41:12 +08:00
Rohit
695cea68f5 Fix recent session resume with long headers (#94578)
Merged via squash.

Prepared head SHA: 8102961184
Co-authored-by: rohitjavvadi <76606932+rohitjavvadi@users.noreply.github.com>
Co-authored-by: vincentkoc <25068+vincentkoc@users.noreply.github.com>
Reviewed-by: @vincentkoc
2026-06-23 16:21:15 +08:00
Vincent Koc
a0f93cf88f fix(agents): gate subagent stream suppression 2026-06-23 15:54:57 +08:00
chenhaoqiang
1876e3e1c1 perf: skip per-chunk live parsing for subagents
Subagent runs do not have a live stream consumer; their result is delivered from
the terminal message path after the child run finishes. The intermediate
message_update stream work only feeds live preview output.

Thread suppressLiveStreamOutput from the subagent lane into the embedded runner
subscription and return from handleMessageUpdate after accumulating the raw
chunk. This keeps final delivery unchanged while skipping per-chunk visible text
and reasoning stream parsing for subagents, which reduces event-loop pressure
when multiple child agents stream long answers in parallel.

Interactive and Control UI runs keep the existing live preview path.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
(cherry picked from commit e0382c2c58c3eabdf64638777ec82cb1e68514e9)
2026-06-23 15:54:57 +08:00
Vincent Koc
21d67b168a feat(copilot): wire harness parity helpers 2026-06-23 15:48:27 +08:00
mushuiyu886
01abe0a33d fix(agents): suggest recovery for unknown tool ids (#93374)
Merged via squash.

Prepared head SHA: bee84e4eb8
Co-authored-by: mushuiyu886 <266724580+mushuiyu886@users.noreply.github.com>
Co-authored-by: vincentkoc <25068+vincentkoc@users.noreply.github.com>
Reviewed-by: @vincentkoc
2026-06-23 14:20:09 +08:00
Jason O'Neal
b8f1961aae fix(model-fallback): classify Codex usage-limit payloads (#95400)
* fix(model-fallback): classify Codex usage-limit payloads

* test: add real behavior proof for Codex usage-limit fallback

Adds a permanent real behavior proof test that exercises the production
classifyEmbeddedAgentRunResultForModelFallback() classifier with the exact
Codex subscription usage-limit error text.

Covers:
- Primary path: isError payload with usage-limit text -> rate_limit fallback
- Non-error payload: same text as normal assistant output -> no fallback
- Visible output already delivered -> no fallback
- Cross-provider: same text via openrouter -> rate_limit fallback

* fix(fallback-classifier): guard on finalAssistantVisibleText delivery evidence

When finalAssistantVisibleText contains real visible output (non-empty,
non-silent-reply), the agent already delivered a response to the user.
The classifier must not trigger model fallback in that case, because the
user already has their answer and rotating models would only burn quota
without improving the outcome.

Adds a guard in classifyEmbeddedAgentRunResultForModelFallback() that
checks finalAssistantVisibleText after committed outbound delivery
evidence and before the hook_block check. Uses the existing
isSilentReplyPayloadText() helper to avoid suppressing NO_REPLY and
similar intentional silent tokens.
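
A minimal sketch of the guard described above, under stated assumptions: `isSilentReply` is a stand-in for `isSilentReplyPayloadText()`, the token list and the `usage limit` pattern are hypothetical, and `classify` is a simplified placeholder for the real classifier, which performs more checks than shown.

```typescript
// Hypothetical stand-in for isSilentReplyPayloadText(); the real token list is not in this log.
const SILENT_REPLY_TOKENS = new Set(["NO_REPLY"]);

function isSilentReply(text: string): boolean {
  return SILENT_REPLY_TOKENS.has(text.trim());
}

// True when the user already received real, visible output.
function hasDeliveredVisibleOutput(
  finalAssistantVisibleText: string | undefined,
): boolean {
  const text = (finalAssistantVisibleText ?? "").trim();
  return text.length > 0 && !isSilentReply(text);
}

type Classification = { reason: "rate_limit" } | null;

// Simplified placeholder classifier: the guard runs before any error-text matching.
function classify(
  errorText: string,
  finalAssistantVisibleText?: string,
): Classification {
  if (hasDeliveredVisibleOutput(finalAssistantVisibleText)) return null;
  // Assumption: illustrative pattern, not the exact Codex usage-limit text.
  return /usage limit/i.test(errorText) ? { reason: "rate_limit" } : null;
}
```

A silent token such as `NO_REPLY` does not count as delivered output, so fallback can still trigger in that case.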

This fixes the already-delivered-output test case in the Codex
usage-limit real behavior proof test.

* fix(test): use toEqual for cross-provider proof test type safety

The ModelFallbackResultClassification union includes { error: unknown },
so accessing .reason/.code after not.toBeNull() fails type checking.
Use toEqual with the full expected object instead, matching the pattern
used in result-fallback-classifier.test.ts.

* fix(model-fallback): refresh usage-limit fallback

Signed-off-by: sallyom <somalley@redhat.com>

---------

Signed-off-by: sallyom <somalley@redhat.com>
Co-authored-by: sallyom <somalley@redhat.com>
2026-06-23 00:55:17 -04:00
Stellar鱼
a64e270ae7 fix(agents): infer runtime provider from qualified model ids (#91724)
Merged via squash.

Prepared head SHA: 9b544a23d7
Co-authored-by: yu-xin-c <175149126+yu-xin-c@users.noreply.github.com>
Co-authored-by: vincentkoc <25068+vincentkoc@users.noreply.github.com>
Reviewed-by: @vincentkoc
2026-06-23 12:23:52 +08:00
Vincent Koc
68a1e00b73 fix(agents): retry silent subagent completion handoffs 2026-06-23 06:04:16 +02:00
兰之
bd479958c0 feat(plugin-sdk): add extensible channel identity hook context (#91903)
Merged via squash.

Prepared head SHA: 90f51eafd5
Co-authored-by: lanzhi-lee <36190508+lanzhi-lee@users.noreply.github.com>
Co-authored-by: vincentkoc <25068+vincentkoc@users.noreply.github.com>
Reviewed-by: @vincentkoc
2026-06-23 11:56:49 +08:00
Vincent Koc
ad3b2f4b88 fix(agents): align OpenRouter model scan body cap 2026-06-23 11:28:29 +08:00
Alix-007
91b0567e89 fix(agents): bound Google prompt cache response reads (#95417)
The Google embedded-agent prompt-cache helpers parsed cachedContents
metadata with an unbounded `await response.json()` in both
createGooglePromptCache and updateGooglePromptCacheTtl. A buggy or
hostile Generative Language endpoint returning a 200 with a large or
never-ending body (especially with no Content-Length) would be fully
buffered into memory before parsing, with the existing
cancelUnreadResponseBody guard firing too late (json() already drained
the body).

Route both reads through the shared streaming byte-cap reader
(readResponseWithLimit) under a 1 MiB cap, cancelling the stream on
overflow instead of buffering it, then JSON.parse the bounded buffer.
This is the symmetric Google-endpoint counterpart to the Anthropic
error-stream and gateway pricing-catalog bounds.
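
The byte-cap technique above can be sketched roughly as follows. The signature of the project's `readResponseWithLimit` is not shown in this log, so `readBodyWithLimit` and `readJsonWithLimit` below are hypothetical helpers built only on the standard `Response` and stream reader APIs.

```typescript
// Hypothetical helper: reads a response body chunk by chunk and stops at maxBytes.
async function readBodyWithLimit(
  response: Response,
  maxBytes: number,
): Promise<Uint8Array> {
  if (!response.body) return new Uint8Array(0);
  const reader = response.body.getReader();
  const chunks: Uint8Array[] = [];
  let total = 0;
  while (true) {
    const { done, value } = await reader.read();
    if (done) break;
    total += value.byteLength;
    if (total > maxBytes) {
      // Cancel the stream instead of buffering the rest of the body.
      await reader.cancel();
      throw new Error(`response body exceeded ${maxBytes} bytes`);
    }
    chunks.push(value);
  }
  // Reassemble the bounded chunks into one buffer.
  const out = new Uint8Array(total);
  let offset = 0;
  for (const chunk of chunks) {
    out.set(chunk, offset);
    offset += chunk.byteLength;
  }
  return out;
}

// Hypothetical wrapper: parse JSON only after the size bound has held.
async function readJsonWithLimit(
  response: Response,
  maxBytes: number,
): Promise<unknown> {
  const bytes = await readBodyWithLimit(response, maxBytes);
  return JSON.parse(new TextDecoder().decode(bytes));
}
```

Unlike `await response.json()`, this never holds more than `maxBytes` of body data, and it works whether or not the server sends a Content-Length header.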

Adds regressions that stream an oversized no-Content-Length body through
the real create and TTL-refresh paths and assert the body is cancelled.
2026-06-22 23:26:37 -04:00
Alix-007
06ca1235ef fix(agents): bound OpenRouter model-scan catalog success body (#95418)
The OpenRouter /models catalog read in fetchOpenRouterModels hardened only
the error/early-return path (dbd5689 cancels the body when res.bodyUsed is
false), but the success branch still buffered the whole body with an
unbounded `await res.json()`. The response is a provider-controlled,
runtime-fetched body, so a faulty or hostile provider can stream an
effectively unbounded JSON document and exhaust process memory before the
parse completes; the finally-cancel is a no-op once .json() has drained.

Read the success body through the canonical byte-cap reader
(readResponseWithLimit) under a 4 MiB ceiling before JSON.parse, cancelling
the stream on overflow and bounding idle stalls with the call's existing
timeout. This is the symmetric success-path counterpart to the bounded-stream
hardening landed in #95103 (pricing catalog) and #95108 (Anthropic error
streams), reusing the same helper rather than a new abstraction.
2026-06-22 23:10:15 -04:00
Alix-007
3da4280caf fix(agents): bound OpenRouter model catalog response reads (#95420)
* fix(agents): bound OpenRouter model catalog response reads

The runtime OpenRouter model-capability detector fetched the full
/models catalog with an unbounded `await response.json()`, so a
compromised or misbehaving endpoint could stream an arbitrarily large
body and force the process to buffer the whole payload before parsing.

Read the body through the shared bounded reader instead, capping it at
16 MiB (matching the sibling pricing-cache endpoint hardened in #95103)
and cancelling the stream on overflow. This mirrors the symmetric
bound-stream fixes in #95103 and #95108.

Adds coverage that an oversized streamed catalog is cancelled instead of
buffered and that an under-cap chunked body still reassembles, parses,
and round-trips through the SQLite cache on a fresh import.

* fix(agents): avoid OpenRouter refetch after capped catalog miss

---------

Co-authored-by: sallyom <somalley@redhat.com>
2026-06-22 23:00:17 -04:00
mikasa
aa0bdb901f fix #95489: [Bug]: claude-cli out-of-credits error bypasses model fallback chain — error text delivered as final response (#95508)
* fix(agents): fallback on generic cli failure text

* fix(agents): guard generic cli failure payload visibility

Co-Authored-By: Claude <noreply@anthropic.com>

* fix(agents): use exported generic failure text

Signed-off-by: sallyom <somalley@redhat.com>

---------

Signed-off-by: sallyom <somalley@redhat.com>
Co-authored-by: Claude <noreply@anthropic.com>
Co-authored-by: sallyom <somalley@redhat.com>
2026-06-22 22:55:35 -04:00
Vincent Koc
abd8a46b0a improve: reduce hot-path linear scans and redundant I/O (#95697)
Merged via squash.

Prepared head SHA: 67f2678a34
Co-authored-by: vincentkoc <25068+vincentkoc@users.noreply.github.com>
Reviewed-by: @vincentkoc
2026-06-23 10:11:18 +08:00
Vincent Koc
7fc4bbc0bc fix(agents): wake active parents for subagent completions 2026-06-23 04:01:11 +02:00
Vincent Koc
2920dc3282 refactor(openai): share completion stop reason mapping 2026-06-23 08:45:12 +08:00
Vincent Koc
32494c7ace refactor(agents): share session truncation warnings 2026-06-23 08:39:34 +08:00
Alix-007
2592f8a51a fix(agents): bound provider JSON response reads (#95218) 2026-06-23 00:33:38 +00:00
Vincent Koc
b60f63150f refactor(exec): share policy layer merging 2026-06-23 08:27:23 +08:00
Vincent Koc
321e58c030 refactor(files): share nested ignore rule loading 2026-06-23 07:56:52 +08:00
Vincent Koc
a70dae40b7 refactor(media): share duplicate guard action results 2026-06-23 07:50:31 +08:00
Vincent Koc
cc32f277fe refactor(models): centralize model key normalization 2026-06-23 07:45:50 +08:00
Bek
5e915e1f89 fix(agents): keep cron cloud idle watchdog enabled (#94445)
* fix(agents): keep cron cloud idle watchdog enabled

* docs: align cron idle timeout guidance
2026-06-23 06:47:19 +08:00
cornna
ef62076789 fix(agents): resolve webchat current session status
* fix(agents): resolve webchat current session status

---------

Co-authored-by: Cornna <96944678+ymylive@users.noreply.github.com>
Co-authored-by: openclaw-clownfish[bot] <280122609+openclaw-clownfish[bot]@users.noreply.github.com>
2026-06-23 06:26:49 +08:00
zw-xysk
3a32d24395 fix(cron): trim trailing whitespace from recognized job object keys (#95674)
* fix(cron): trim trailing whitespace from recognized job object keys (#95407)

Some tool-call extraction/serialization pipelines can produce cron object
keys with trailing spaces (e.g. 'schedule ' instead of 'schedule'), causing
gateway validation to reject the job.

Add repairPaddedCronKeys() to canonicalizeCronToolObject() that trims only
recognized CRON_RECOVERABLE_OBJECT_KEYS. Non-recognized keys (including
special ones like '__proto__') are never trimmed, preventing prototype
pollution. When both padded and canonical forms exist, the canonical key
wins.

Tests:
- add job with trailing-space keys -> trimmed
- update patch with trailing-space keys -> trimmed
- non-recognized padded keys left intact (safety)
- canonical key preserved over padded duplicate
- clean keys unchanged

133 tests pass (128 existing + 5 new).

* fix(cron): preserve padded duplicate keys when canonical form already exists (#95407)

When both a padded key (e.g. 'schedule ') and its canonical form
('schedule') exist, the padded key is now preserved so strict gateway
validation rejects the ambiguous input rather than silently picking one
value. Only padded keys without a canonical counterpart are trimmed.
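
The final trimming rule from the two commits above might look like the sketch below. `RECOGNIZED_KEYS` is a hypothetical stand-in for `CRON_RECOVERABLE_OBJECT_KEYS` (its real contents are not shown here), and `repairPaddedKeys` illustrates the rule rather than reproducing `repairPaddedCronKeys()`.

```typescript
// Hypothetical subset; the real CRON_RECOVERABLE_OBJECT_KEYS list is not in this log.
const RECOGNIZED_KEYS = new Set(["schedule", "payload", "name"]);

const hasOwn = (obj: object, key: string): boolean =>
  Object.prototype.hasOwnProperty.call(obj, key);

function repairPaddedKeys(
  input: Record<string, unknown>,
): Record<string, unknown> {
  // Null-prototype output so a key like "__proto__" is stored as plain data.
  const out: Record<string, unknown> = Object.create(null);
  for (const [key, value] of Object.entries(input)) {
    const trimmed = key.trimEnd();
    const canTrim =
      trimmed !== key && // key has trailing whitespace
      RECOGNIZED_KEYS.has(trimmed) && // only recognized keys are repaired
      !hasOwn(input, trimmed) && // canonical form present: keep the padded key
      !hasOwn(out, trimmed); // never overwrite an earlier repaired key
    out[canTrim ? trimmed : key] = value;
  }
  return out;
}
```

With `{ "schedule ": "a", schedule: "b" }` both keys survive unchanged, so a strict validator downstream can still reject the ambiguous input.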
2026-06-22 20:24:59 +00:00
Vincent Koc
9122e762d8 refactor(records): reuse canonical object guard 2026-06-23 03:58:08 +08:00
zhang-guiping
769579bcf0 fix(opencode-go): streaming completes when provider ends responses (#93965)
* fix(opencode-go): abort stalled SSE streams at provider-owned raw boundary

opencode-go routes through the shared OpenAI-compatible completions provider,
where a stalled SSE socket (provider emits tokens then never closes the stream)
hangs the gateway until stuckSessionAbortMs (~622s) and surfaces as
'LLM request failed' / 'Request was aborted'. Issue #93610 reports ~90% of
opencode-go cron jobs failing intermittently this way.

Add a provider-owned stream wrapper at the opencode-go raw SSE boundary that
injects an AbortController into the underlying OpenAI SDK request and aborts
it after a configurable idle window (default 30s, far below 622s) elapses
without any forward-progress event. The wrapper is:

- Provider-scoped: only applies when model.provider === 'opencode-go'; the
  shared openai-completions.ts path is untouched.
- Abortable: calls controller.abort() on the injected AbortSignal, which
  propagates through OpenAI SDK requestOptions.signal and genuinely
  interrupts the underlying fetch/stream (not just iterator return()).
- Idle-based: every event (text/tool/thinking delta, including delayed
  usage-only chunks) refreshes the timer; natural completion (done/error)
  cancels it. Normal delayed usage-only completion is preserved.
- Boundary-terminal: pushes a terminal { type: 'error', reason: 'aborted' }
  event downstream so consumers do not hang.

TDD: stream-termination.test.ts covers (a) stalled stream after first
progress is aborted within the idle window with a downstream 'aborted'
terminal event, and (b) normal delayed completion within the idle window
is not aborted and the done event is forwarded unchanged.

* fix(opencode-go): align stalled-stream idle default with runtime (120s)

Match the runtime's shared `DEFAULT_LLM_IDLE_TIMEOUT_MS` (120s) so
non-cron interactive opencode-go runs see no behavior change versus the
existing watchdog. Cron runs — for which the runtime disables its idle
watchdog entirely (`resolveLlmIdleTimeoutMs` returns 0 when trigger is
cron and no explicit timeout is set) — still get provider-owned
termination well before the ~622s stuck-session recovery.

Refs #93610

* fix(opencode-go): satisfy CI lint and test type checks

- Remove unnecessary `?? {}` fallback in spread (oxlint
  no-useless-fallback-in-spread).
- Drop non-narrowing `!` on the wrapper return type; use
  `await Promise.resolve(...)` to collapse the
  `StreamLike | Promise<StreamLike>` union before `for await`.

Refs #93610

* fix(opencode-go): arm stalled-stream idle timer only after first event

The wrapper armed the idle timer before the first upstream event, which
would mis-abort slow time-to-first-byte requests — including the
opencode-go cron runs that the runtime deliberately leaves uncapped via
resolveLlmIdleTimeoutMs. Arm only after the first forwarded event, and
add regression coverage for the slow-first-event path.
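
A rough sketch of an idle-abort wrapper that arms its timer only after the first event, as described above. The names `withIdleAbort`, `StreamEvent`, and `AbortedEvent` are hypothetical, and the real wrapper also handles synthetic preambles and explicit timeouts, which are omitted here.

```typescript
type StreamEvent = { type: string };
type AbortedEvent = { type: "error"; reason: "aborted" };

const ABORTED: AbortedEvent = { type: "error", reason: "aborted" };

// Hypothetical wrapper: forwards events and aborts when the stream goes idle
// for idleMs after it has already made progress.
async function* withIdleAbort<T extends StreamEvent>(
  source: AsyncIterable<T>,
  idleMs: number,
  controller: AbortController,
): AsyncGenerator<T | AbortedEvent> {
  const iterator = source[Symbol.asyncIterator]();
  let timer: ReturnType<typeof setTimeout> | undefined;
  let seenFirst = false;
  try {
    while (true) {
      const next = iterator.next();
      // Before the first event the idle promise never settles, so slow
      // time-to-first-byte requests are not aborted by this wrapper.
      const idle = new Promise<"idle">((resolve) => {
        if (seenFirst) timer = setTimeout(() => resolve("idle"), idleMs);
      });
      const result = await Promise.race([next, idle]);
      clearTimeout(timer);
      if (result === "idle") {
        controller.abort(); // interrupts the underlying request via its signal
        yield ABORTED; // terminal event so downstream consumers do not hang
        return;
      }
      if (result.done) return; // natural completion
      seenFirst = true;
      yield result.value;
    }
  } finally {
    clearTimeout(timer);
  }
}
```

The caller is assumed to pass `controller.signal` to the request that produces `source`, so that `abort()` actually tears down the socket rather than only ending iteration.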

* fix(opencode-go): cover stalled stream first event

* fix(opencode-go): respect explicit stream timeout

* fix(opencode-go): preserve first-event timer after synthetic start

* fix(opencode-go): satisfy stream termination test lint

* fix(opencode-go): distinguish synthetic stream preambles

* fix(opencode-go): route stalled streams through failover
2026-06-22 19:57:21 +00:00
Vincent Koc
056e5b6b07 refactor(routing): share optional agent id normalization 2026-06-23 03:53:45 +08:00
NIO
8fdb1b61db fix(agents): classify generic LLM-request-failed error as transient timeout (#94062)
The generic assistant error text "LLM request failed." (GENERIC_ASSISTANT_ERROR_TEXT) is
produced by formatUserFacingAssistantErrorText when the underlying provider error cannot
be formatted into a specific category. For local providers (LM Studio, Ollama) this wraps
connection/availability failures when the model is not loaded or the endpoint is unreachable.

Without this match, the error is not classified as any transient type (rate_limit, overloaded,
network, server_error, timeout), so cron retry and payload.fallbacks never engage — even
though the configured fallback chain should handle provider availability failures.

Add /^llm request failed\.$/i as an exact-match regex in the timeout error patterns. This
strictly matches only the bare "LLM request failed." string, not variants like
"LLM request failed: provider rejected the request schema or tool payload." (which is a
format/schema error, not transient). Variants with specific transient reasons (connection
refused, network error, etc.) are classified through their own existing patterns.
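
The exact-match behavior can be checked with a short sketch. Only the regex comes from the commit message; `TIMEOUT_PATTERNS` and `isTimeoutLikeError` are hypothetical names for illustration.

```typescript
// Hypothetical pattern list; only the regex itself is taken from the commit message.
const TIMEOUT_PATTERNS: RegExp[] = [/^llm request failed\.$/i];

function isTimeoutLikeError(text: string): boolean {
  return TIMEOUT_PATTERNS.some((pattern) => pattern.test(text));
}

// The bare generic string matches, case-insensitively.
console.log(isTimeoutLikeError("LLM request failed.")); // true

// A variant with a specific reason does not match, because of the ^...$ anchors.
console.log(
  isTimeoutLikeError(
    "LLM request failed: provider rejected the request schema or tool payload.",
  ),
); // false
```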

Closes #93931
2026-06-22 19:53:26 +00:00
Vincent Koc
0a338147a5 refactor(numbers): share non-negative finite guard 2026-06-23 03:46:22 +08:00
Hoi Hin Adrian Ip
dbd4c98b02 Handle Codex toolResult blocks in truncation (#87912)
Co-authored-by: Hoi Hin Adrian Ip <255652477+AdrianIp0204@users.noreply.github.com>
2026-06-22 19:41:30 +00:00
Vincent Koc
066700bdd0 refactor(anthropic): share Foundry bearer auth policy 2026-06-23 03:31:32 +08:00
Vincent Koc
b31bf811cb refactor(providers): share bounded error body reader 2026-06-23 03:24:54 +08:00
Yzx
a0ed4273ee fix(agents): resolve bound route agent for inbound sessions (#95118) 2026-06-22 19:14:17 +00:00