Fail closed when managed OpenAI OAuth refresh fails instead of silently falling back to stale external Codex CLI credentials.
Make managed provider OAuth authoritative after bootstrap, preserve API-key and non-OpenAI external CLI behavior, and surface targeted re-auth guidance without exposing profile IDs in group/channel replies.
Fixes #99120.
Co-authored-by: Eva <239388517+100yenadmin@users.noreply.github.com>
* fix(interactive): preserve button command values in fallback text for degraded approval UX
* fix(interactive): keep callback values private in fallback text and narrow Feishu interactive detection
- P1: Skip rendering action.type === "callback" values in
renderMessagePresentationFallbackText to avoid leaking opaque
channel/plugin data into user-visible text. Command and legacy
values are still rendered.
- P2: Replace hasMessagePresentationBlocks/hasInteractiveReplyBlocks
with isMessagePresentationInteractiveBlock so Feishu comment
guidance only appears when the presentation actually contains
buttons or selects, not for text-only blocks.
- Update tests: callback button now shows label-only; all 137 tests pass.
* fix(interactive): only render typed command values in fallback text, keep legacy value private
* fix(feishu): gate document-comment command guidance on actual command action
* docs(message-presentation): document command/callback value fallback visibility
* fix(feishu): omit command guidance when URL overrides fallback command text
* docs: regenerate docs_map.md
* fix(interactive): exclude disabled buttons from fallback command rendering and guidance
* fix(interactive): extract hasRenderedCommandAction, exclude disabled buttons from command fallback
* fix(feishu): preserve command guidance marker through core presentation rendering
* fix(feishu): type-narrow channelData.feishu with isRecord before reading rendered-command marker
* fix(feishu): move hasRenderedCommandAction from public SDK into Feishu plugin as local helper
Keep the helper local to the only caller (Feishu outbound) instead of
adding a new public plugin SDK API contract. The shared fallback renderer
in renderMessagePresentationFallbackText already inlines the same
command-visibility logic; a local helper is sufficient for the Feishu
comment-thread guidance gate.
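A minimal sketch of such a local gate, assuming a simplified button shape. The `Action` and `Button` types and `renderFallbackLine` are illustrative stand-ins, not the plugin's actual types or renderer:

```typescript
// Hypothetical, simplified shapes for illustration only.
type Action =
  | { type: "command"; value: string }
  | { type: "callback"; value: string };

interface Button {
  label: string;
  action?: Action;
  disabled?: boolean;
}

// True when at least one enabled button carries a typed command action,
// i.e. when the fallback text would show a command the user can type.
function hasRenderedCommandAction(buttons: readonly Button[]): boolean {
  return buttons.some((b) => !b.disabled && b.action?.type === "command");
}

// Fallback text for one button: the label is always shown, the value only
// for enabled command buttons. Callback values stay private.
function renderFallbackLine(b: Button): string {
  if (!b.disabled && b.action?.type === "command") {
    return `${b.label}: ${b.action.value}`;
  }
  return b.label;
}
```

Keeping both checks on the same predicate (enabled and `type === "command"`) means the guidance can only appear when a command is actually visible in the fallback text.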
* refactor(feishu): tighten fallback command marker
---------
Co-authored-by: Peter Steinberger <steipete@gmail.com>
* fix(config/sessions): narrow reply-session initialization revision to identity fields
The initialization guard compared the full persisted session entry, so
background touches to updatedAt, heartbeat timestamps, context-budget
metadata, etc. produced false-positive stale-snapshot conflicts and the
"reply session initialization conflicted" error.
Only sessionId and sessionFile matter for detecting a session rotation.
Narrow the revision to those identity fields and add a regression test.
Fixes #98672
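A minimal sketch of an identity-only revision, assuming a trimmed-down entry shape. `SessionEntry`, `identityRevision`, and `isStaleSnapshot` are hypothetical names for illustration, not the names used in the change:

```typescript
// Hypothetical, trimmed-down session entry; the persisted entry has more fields.
interface SessionEntry {
  sessionId: string;
  sessionFile?: string;
  updatedAt?: number;
}

// Revision derived only from the fields that change on a session rotation.
function identityRevision(entry: SessionEntry | undefined): string {
  if (!entry) return "none";
  return JSON.stringify([entry.sessionId, entry.sessionFile ?? null]);
}

// A snapshot conflicts only if the session identity changed underneath it;
// background touches to updatedAt and similar metadata are ignored.
function isStaleSnapshot(
  snapshot: SessionEntry | undefined,
  current: SessionEntry | undefined,
): boolean {
  return identityRevision(snapshot) !== identityRevision(current);
}
```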
* fix(config/sessions): merge current metadata when reply-init identity guard passes
* fix(config/sessions): preserve only snapshot-drifted metadata in reply-init commit
* fix: preserve cleared reply-session metadata
* fix: allow same-session reply initialization drift
---------
Co-authored-by: moguangyu5-design <moguangyu5-design@users.noreply.github.com>
Co-authored-by: Josh Lehman <josh@martian.engineering>
Summary:
- The branch updates agent tool-result truncation so live prompt projection protects trailing fresh tool-resul ... ges under the aggregate cap, adds a bounded aggregate elision marker, and extends focused truncation tests.
- PR surface: Source +54, Tests +124. Total +178 across 2 files.
- Reproducibility: yes. source-level: the linked issues give a deterministic saturated-history prompt projecti ... clear tool-result text to empty. I did not run a live WebChat or Discord session in this read-only review.
Automerge notes:
- No ClawSweeper repair was needed after automerge opt-in.
Validation:
- ClawSweeper review passed for head 50069fdd6f.
- Required merge gates passed before the squash merge.
Prepared head SHA: 50069fdd6f
Review: https://github.com/openclaw/openclaw/pull/98955#issuecomment-4862947669
Co-authored-by: momothemage <niuzhengnan@163.com>
Approved-by: momothemage
Cron jobs run in the session they were created from unless the job explicitly requests an isolated fresh session.
Co-authored-by: Peter Steinberger <58493+steipete@users.noreply.github.com>
* fix(anthropic): restore Fable 5 Vertex simple completions
* test(agents): satisfy custom API model types
* test(ci): route reliability test from temp helper
* test(agents): satisfy custom API model types
* fix(agents): fail fast with attributable reason after MCP stdio session dies mid-run
Wires MCP Client onclose/onerror during bundle-mcp session creation so a
crashed/exited server flips session.connected instead of staying stale.
Next tool/resource/prompt call throws a domain-specific 'is disconnected'
error immediately instead of surfacing the SDK's generic 'Not connected'.
A disconnected reused session is retired and rebuilt fresh on the next
catalog pass rather than reused, since the SDK chains onclose/onerror
cumulatively on repeat connect() and the stdio transport never clears its
read buffer on an unexpected exit.
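A minimal sketch of the wiring described above, assuming a reduced client surface. `ClientLike`, `Session`, `createSession`, and `assertConnected` are hypothetical; recording the last error for attribution is also an assumption of this sketch:

```typescript
// Hypothetical minimal client surface; the real MCP SDK client has more to it.
interface ClientLike {
  onclose?: () => void;
  onerror?: (error: Error) => void;
}

interface Session {
  serverName: string;
  client: ClientLike;
  connected: boolean;
  lastError?: string;
}

// Wire the callbacks once, at session creation, so a dead server flips
// `connected` instead of leaving a stale "connected" session behind.
function createSession(serverName: string, client: ClientLike): Session {
  const session: Session = { serverName, client, connected: true };
  client.onclose = () => {
    session.connected = false;
  };
  client.onerror = (error) => {
    session.lastError = error.message; // kept so the failure is attributable
  };
  return session;
}

// Guard to run before any tool/resource/prompt call.
function assertConnected(session: Session): void {
  if (!session.connected) {
    const reason = session.lastError ? ` (${session.lastError})` : "";
    throw new Error(`MCP server "${session.serverName}" is disconnected${reason}`);
  }
}
```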
* fix(agents): retire a reused MCP session that dies mid-refresh, not just pre-refresh
codex review found: the catch-path retirement in getCatalog()'s per-server
task only covered two cases (fresh session that never connected this pass,
and a non-reused session that failed for any reason) - a reused session
that was healthy when this pass started but disconnects mid-refresh (child
process dies between ensureSessionConnected() returning and
listAllToolsBestEffort() finishing) hit neither branch, so onclose flipped
connected=false but the dead session object stayed in the map until the
next catalog rebuild happened to notice it. Added the missing branch plus
a regression test that kills the child process mid tools/list on a reused
session and asserts the session is purged from the map within the same
refresh (not just fail-fast on tool calls, which already worked).
* fix(agents): retire a dead reused MCP session even across overlapping catalog generations
codex round-2 review found: the mid-refresh retirement branch still
skipped when sharedWithNewerGeneration was true, which protects a
still-alive session another generation is actively using - but once
onclose flips session.connected=false, the transport is dead for every
generation sharing that object, so that guard no longer applies.
Simplified to a single !session.connected branch that always retires
(retireSessionIfCurrent already no-ops safely if a newer generation
replaced the map entry), and dropped the now-write-only
connectedForCatalog local it replaced.
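A minimal sketch of the simplified branch, assuming sessions live in a map keyed by server name. The `Session` shape and `handleRefreshFailure` are hypothetical, and this `retireSessionIfCurrent` is a stand-in that only mirrors the no-op behavior described above:

```typescript
// Hypothetical session shape for illustration.
interface Session {
  connected: boolean;
}

// Removes the map entry only if it still points at this exact session object,
// so a session already replaced by a newer generation is left alone.
function retireSessionIfCurrent(
  sessions: Map<string, Session>,
  serverName: string,
  session: Session,
): boolean {
  if (sessions.get(serverName) !== session) return false;
  sessions.delete(serverName);
  return true;
}

// Catch path of a per-server refresh: a dead transport is dead for every
// generation sharing the object, so a disconnected session is always retired.
function handleRefreshFailure(
  sessions: Map<string, Session>,
  serverName: string,
  session: Session,
): void {
  if (!session.connected) {
    retireSessionIfCurrent(sessions, serverName, session);
  }
}
```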
* test(scripts): allow MCP SDK callback suppressions
* fix(agents): retire closed MCP sessions
---------
Co-authored-by: Vincent Koc <vincentkoc@ieee.org>
Makes native iMessage polls behave correctly end to end.
What changed:
- Inbound polls render with a numbered-options vote cue so the agent casts a native vote instead of answering the poll in prose.
- poll-vote resolves the poll reference from pollId/pollGuid/messageId and now defaults to the current inbound poll message when the model omits it; still errors when no reference exists.
- poll-vote echo suppression is session-scoped, so the redundant spoken answer is dropped across the separate poll and comment runs.
- A poll's inline-reply caption is folded (not delivered as a standalone question) only when the poll creator and reply sender are both known and equal; unknown/mismatched sender falls through to the normal inbound decision gate, so no inbound reply is silently dropped.
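The reference resolution and the caption-folding rule above can be sketched as follows. The function names, the argument shape, and the precedence order among `pollId`, `pollGuid`, and `messageId` are assumptions of this sketch:

```typescript
// Hypothetical argument shape for the poll-vote action.
interface PollVoteArgs {
  pollId?: string;
  pollGuid?: string;
  messageId?: string;
}

// Use an explicit reference when the model supplies one, otherwise fall back
// to the current inbound poll message; error when nothing is available.
function resolvePollReference(
  args: PollVoteArgs,
  currentInboundPollMessageId?: string,
): string {
  const ref =
    args.pollId || args.pollGuid || args.messageId || currentInboundPollMessageId;
  if (!ref) {
    throw new Error("poll-vote: no poll reference available");
  }
  return ref;
}

// Fold the caption only when both identities are known and equal. Anything
// else fails closed and goes through the normal inbound decision gate.
function shouldFoldCaption(pollCreator?: string, replySender?: string): boolean {
  return Boolean(pollCreator) && Boolean(replySender) && pollCreator === replySender;
}
```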
Evidence:
- 64 passing tests in the poll suites (poll-comment, poll-render, actions), incl. sender fail-closed regressions; pnpm build clean.
- Two Codex autoreviews clean (patch is correct); ClawSweeper re-review rated it platinum hermit.
- Live-verified on macOS 26.4.1 on the deployed gateway: poll "What color pill?" -> native vote delivered with a 7333-byte payload, caption folded, zero echo.
Note: vote delivery also depends on the imsg vote-stamp fix (openclaw/imsg#150); OpenClaw ships ahead of imsg per owner decision.
* fix(fal): route grok-imagine and nano-banana-2-lite edits to correct endpoints
The fal image-generation provider appends '/image-to-image' to any model
that isn't 'openai/gpt-image-*' or 'fal-ai/nano-banana-*' when reference
images are supplied. That's wrong for two models fal serves:
- `xai/grok-imagine-image`: fal 404s on '/image-to-image'. The real edit
endpoint is '/quality/edit'. The endpoint also expects lowercase
resolution values ('1k'/'2k' only) and a distinct aspect_ratio enum.
- `google/nano-banana-2-lite`: fal 404s on '/image-to-image'. The real
edit endpoint is '/edit'. The endpoint does not accept a 'resolution'
parameter.
Add schema entries for both models so ensureFalModelPath and
applyFalImageGeometry pick the right suffix and body shape. Introduce
resolution allowlist support ('resolutions: readonly string[]') and
lowercase transform ('resolutionCase: "lower"') on the schema; existing
schemas keep their behaviour (nb2 still forwards uppercase resolution
unchanged; flux/gpt-image/nb2/krea untouched). Refactor
ensureFalModelPath to consult schema.appendEditPath instead of hardcoded
prefix checks so future models only need a schema entry.
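A minimal sketch of schema-driven routing, using the field names mentioned above (`appendEditPath`, `resolutions`, `resolutionCase`). The helper names `editPathFor` and `resolutionFor`, the table layout, and the use of an empty allowlist to mean "send no resolution" are assumptions; the special handling for the gpt-image and nano-banana families is left out:

```typescript
// Hypothetical schema shape built from the fields named in this commit.
interface FalModelSchema {
  appendEditPath: string;
  resolutions?: readonly string[]; // allowlist; empty means "send none" here
  resolutionCase?: "lower";
}

const DEFAULT_EDIT_PATH = "/image-to-image";

const SCHEMAS: Record<string, FalModelSchema> = {
  "xai/grok-imagine-image": {
    appendEditPath: "/quality/edit",
    resolutions: ["1k", "2k"],
    resolutionCase: "lower",
  },
  "google/nano-banana-2-lite": { appendEditPath: "/edit", resolutions: [] },
};

// Pick the edit suffix from the schema instead of hardcoded prefix checks.
function editPathFor(model: string, hasReferenceImages: boolean): string {
  if (!hasReferenceImages) return model;
  const suffix = SCHEMAS[model]?.appendEditPath ?? DEFAULT_EDIT_PATH;
  return model.endsWith(suffix) ? model : `${model}${suffix}`;
}

// Apply the case transform first, then the allowlist; undefined means omit.
function resolutionFor(model: string, requested?: string): string | undefined {
  if (requested === undefined) return undefined;
  const schema = SCHEMAS[model];
  if (!schema) return requested;
  const value = schema.resolutionCase === "lower" ? requested.toLowerCase() : requested;
  if (schema.resolutions && !schema.resolutions.includes(value)) return undefined;
  return value;
}
```

With this layout, supporting another model with a nonstandard edit endpoint only requires a new table entry.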
Tested:
- Existing 49 fal unit tests still pass; added 9 new tests covering the
two new endpoints and their guard conditions (32 -> 34 tests in the
image-generation-provider suite).
- Live fal.ai calls confirm both endpoints return 200 with real
reference images; the buggy old URLs still return 404.
* fix(fal): preserve standard edit routing
* fix(image): apply inferred resolution per model
* fix(image): preserve provider reference limits
* fix(image): resolve reference limits per model
* fix(fal): preserve nano banana family limits
* test(ios): stub generated file list helper
---------
Co-authored-by: Vincent Koc <vincentkoc@ieee.org>
Summary:
- The branch adds `useAutoCleanupTempDirTracker()`, broadens the temp-dir warning reporter to flag new manual helper imports/usages, updates docs, and migrates two script tests to the new helper.
- PR surface: Tests +301, Docs +1, Other +248. Total +550 across 8 files.
- Reproducibility: not applicable. this is test/tooling cleanup, and the changed behavior is exercised through helper/reporter tests and CI evidence rather than a user reproduction path.
Automerge notes:
- PR branch already contained follow-up commit before automerge: test: harden temp dir helper guard
- PR branch already contained follow-up commit before automerge: test: clarify auto cleanup temp dir helper name
- PR branch already contained follow-up commit before automerge: test: cover existing mkdtemp temp dir forms
- PR branch already contained follow-up commit before automerge: test: read staged temp helper source from index
Validation:
- ClawSweeper review passed for head 1fdd7d2a9a.
- Required merge gates passed before the squash merge.
Prepared head SHA: 1fdd7d2a9a
Review: https://github.com/openclaw/openclaw/pull/93209#issuecomment-4705653665
Co-authored-by: Mason Huang <masonxhuang@tencent.com>
Approved-by: hxy91819
* fix(agents): don't inject A2A turns into isolated-cron sessions_send (#92257)
Fire-and-forget sessions_send (timeoutSeconds === 0) with announce
delivery runs the A2A ping-pong loop. For a cross-session send
(requester != target) the loop's first iteration feeds the target
agent's reply back into the requester session as a new turn. For a
normal requester that roundtrip is intended, but for an *isolated cron*
requester it injects reply context into the isolated run and causes an
agent feedback loop.
Narrow the fix to isolated-cron requesters only (detected by a session
key containing ":cron:" or channel "cron" -- the same signal used by
src/agents/subagent-registry.ts and set in
src/cron/isolated-agent/run.ts), NOT by timeoutSeconds. Gating on
timeoutSeconds was too broad: it disabled the intended ping-pong for
normal cross-session fire-and-forget sends.
Two hunks in src/agents/tools/sessions-send-tool.ts, both reusing one
`isIsolatedCronRequester` gate:
1. Force maxPingPongTurns to 0 in the runSessionsSendA2AFlow invocation
only for an isolated-cron requester. The a2a flow's ping-pong guard
(`maxPingPongTurns > 0`) then skips the requester-injection loop and
proceeds straight to the announce step in the TARGET session,
preserving fire-and-forget announce delivery. Normal requesters keep
the configured turn count.
2. Read baselineReply for fire-and-forget sends when same-session (prior
behavior) OR isolated-cron. Without a baseline fingerprint, a2a.ts
would treat pre-existing assistant text in the target session (e.g.
an unrelated concurrent cron's output) as "the reply" and
misattribute it. The normal cross-session fire leg's history-call
count is unchanged from origin/main. Read failures are tolerated so a
snapshot error never blocks accepting the send.
Tests (sessions.test.ts): an isolated-cron cross-session fire-and-forget
forwards maxPingPongTurns: 0; a normal-requester regression guard
(discord:group:req / channel discord) forwards the configured turn count
(not 0), proving the normal ping-pong is preserved. a2a flow tests
(sessions-send-tool.a2a.test.ts): with turns=0 + requester!=target the
requester is never stepped but the target is still announced, and a
baseline-matching reply is neither injected nor announced.
Co-Authored-By: Claude Fable 5 <noreply@anthropic.com>
* fix(agents): gate isolated-cron A2A on canonical cron-run classifier (#92257)
Address Codex [P2] review: replace the raw `:cron:` substring detector with
the canonical isCronRunSessionKey so a non-canonical cron-like requester key
(e.g. agent:main:slack:cron:job:run:uuid) keeps its intended cross-session
ping-pong. Keep the requesterChannel === "cron" arm (isolated cron runs set
that channel in src/cron/isolated-agent/run.ts). Add a regression covering the
non-canonical key.
* refactor(agents): drop dead cron-channel A2A arm, gate on canonical key only (#92257)
The requesterChannel === "cron" arm was unreachable: agentChannel is always a
DeliverableMessageChannel from resolveGatewayMessageChannel(messageProvider),
never the literal "cron". The channel: "cron" in src/cron/isolated-agent/run.ts
labels diagnostics events/lifecycle, not the tool channel. Gate on
isCronRunSessionKey alone and fix the misleading comment.
Tests drove the dead arm via agentChannel/requesterChannel "cron" plus a
non-canonical key (agent:main:cron:run:abc, which isCronRunSessionKey rejects).
Switch them to a canonical cron-run key (agent:main:cron:job:run:abc) and a
normal delivery channel so they exercise the real production gate.
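A minimal sketch of the gate, checked against the example keys quoted in these commits. `looksLikeCronRunKey` is a hypothetical stand-in and its regular expression is an assumption, not the real `isCronRunSessionKey` implementation; `effectiveMaxPingPongTurns` is likewise illustrative:

```typescript
// Assumed canonical shape: agent:<agentId>:cron:<jobId>:run:<runId>.
const CRON_RUN_KEY = /^agent:[^:]+:cron:[^:]+:run:[^:]+$/;

// Hypothetical stand-in for the canonical cron-run classifier.
function looksLikeCronRunKey(key: string | undefined): boolean {
  return key !== undefined && CRON_RUN_KEY.test(key);
}

// Isolated-cron requesters get zero ping-pong turns, which skips the
// requester-injection loop; every other requester keeps the configured count.
function effectiveMaxPingPongTurns(
  requesterSessionKey: string | undefined,
  configuredTurns: number,
): number {
  return looksLikeCronRunKey(requesterSessionKey) ? 0 : configuredTurns;
}
```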
* fix(agents): align cron A2A fallback baselines
* chore: prepare branch refresh
---------
Co-authored-by: Claude Fable 5 <noreply@anthropic.com>
Co-authored-by: Vincent Koc <vincentkoc@ieee.org>
* fix(gateway): distinguish reachable gateway from failed status probe
* fix(status): gate owns-port RPC recovery guidance on no stale gateway PIDs
inspectGatewayRestart can set health.healthy from bare reachability after
ownership attribution failed, while still returning a non-empty
staleGatewayPids. Printing owns-port recovery guidance in that case
contradicted the dedicated stale-PID diagnostic below it. Require
staleGatewayPids to be empty before treating healthy as owns-port proof.
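A minimal sketch of the tightened condition, assuming a reduced result shape. `RestartInspection` and `canShowOwnsPortGuidance` are hypothetical names; only `health.healthy` and `staleGatewayPids` come from the description above:

```typescript
// Hypothetical, reduced view of what inspectGatewayRestart returns.
interface RestartInspection {
  health: { healthy: boolean };
  staleGatewayPids: readonly number[];
}

// healthy alone may only mean "reachable"; treat it as owns-port proof only
// when no stale gateway PIDs were reported alongside it.
function canShowOwnsPortGuidance(result: RestartInspection): boolean {
  return result.health.healthy && result.staleGatewayPids.length === 0;
}
```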