Commit Graph

230 Commits

Author SHA1 Message Date
Ayaan Zaidi
4de9b79d30 refactor(agents): simplify stale cli retry cleanup 2026-05-31 17:28:05 +05:30
brokemac79
e8c7c933f8 Retry stale CLI sessions in runner lifecycle 2026-05-31 17:28:05 +05:30
Peter Steinberger
77f1359612 refactor: extract media and ACP core packages (#88534)
* refactor: extract media and acp core packages

* refactor: remove relocated media and acp sources

* build: wire new core packages into dependency checks

* test: alias new core packages in vitest

* build: keep media sniffer runtime dependency

* docs: refresh plugin sdk api baseline

* fix: keep normalized proposal queries non-empty

* test: keep channel timer tests isolated

* fix: keep rebased plugin checks green

* fix: preserve sms numeric allowlist entries

* test: harden exec foreground timeout failure

* test: remove duplicate skill workshop assertion

* fix: remove channel config lint suppression

* test: refresh lint suppression allowlist
2026-05-31 11:30:33 +01:00
Frank Yang
15c1511817 fix(agents): clear stale compaction bindings 2026-05-31 13:38:52 +05:30
Frank Yang
5e1e029d91 fix(agents): skip below-target CLI compaction failures 2026-05-31 13:38:52 +05:30
Peter Steinberger
23dac6c263 test: keep vitest cases under one second 2026-05-31 06:51:34 +01:00
Peter Steinberger
00d8d7ead0 refactor: extract normalization core package
Extract shared normalization/coercion helpers into private @openclaw/normalization-core workspace package while preserving existing plugin SDK helper subpaths.\n\nAlso keeps direct normalization-core imports internal, wires UI/build/loader resolution, and replaces the slow PR network CodeQL lane with a fast added-line boundary scan while retaining full CodeQL for scheduled/manual runs.\n\nVerification: local moved tests, plugin SDK boundary tests, extension loader tests, agents-support shard, UI build/test, build artifacts, lint, workflow guards, autoreview, and GitHub CI passed on PR head 963d893715.
2026-05-31 01:33:00 +01:00
Peter Steinberger
4c33aaa86c refactor: unify OpenAI provider identity (#88451)
* refactor: unify OpenAI provider identity

* refactor: move legacy oauth sidecar doctor helpers

* test: align OpenAI fixtures after rebase

* test: clean OpenAI provider unification

* fix: finish OpenAI provider cleanup

* fix: finish OpenAI cleanup follow-through

* fix: finish OpenAI CI cleanup
2026-05-31 00:29:44 +01:00
scotthuang
7920af0c9e refactor: route browser screenshot vision through shared media understanding
* feat(browser): add optional vision understanding to screenshot tool

* fix(browser): wrap vision output as external content, enforce maxBytes, forward auth profiles

* fix(browser): remove no-op scope/attachments config, drop profile pass-through lacking runtime support

* feat(media-understanding): add profile/preferredProfile to DescribeImageFileWithModelParams and forward to describeImage

* style(browser): add curly braces to satisfy eslint curly rule

* fix(browser): correct tools.browser.enabled help text to match actual behavior

* fix(browser): thread agentDir/workspaceDir from plugin tool context into browser vision

* refactor(browser): move vision config from tools.browser to browser.models

The browser plugin's vision configuration now lives on the top-level
`browser` config namespace (browser.models, browser.visionEnabled,
browser.visionPrompt, etc.) instead of `tools.browser`. This aligns
with the plugin's existing config location and avoids confusion between
tool-level and plugin-level settings.

- Remove tools.browser from ToolsSchema and ToolsConfig
- Add models/vision* fields to BrowserConfig and its zod schema
- Update getBrowserVisionConfig to read from cfg.browser
- Update schema help, labels, and quality test
- Update vision.test.ts to use new config shape

* docs(browser): add screenshot vision configuration section

Document the new browser.models config for automatic screenshot
description via vision models, enabling text-only main models to
reason about web page content.

* fix(browser): remove deliverable media markers from vision result, drop unused import

P1: Vision-success path no longer exposes the raw screenshot as
deliverable media (removes MEDIA: line and details.media.mediaUrl).
This prevents channel delivery from auto-sending sensitive page content
when the intended output is a text description.

P2: Remove unused ToolsMediaUnderstandingSchema import that would fail
noUnusedLocals typecheck.

* fix(browser): add command/args fields to browser models schema

The browser vision model schema uses .strict(), so CLI-type entries
with command/args were rejected by TypeScript. Add these fields to
align with MediaUnderstandingModelSchema.

* chore(browser): remove debug console.log statements

* fix(browser): harden screenshot vision result against MEDIA: directive injection and restore image sanitization on failure fallback

ClawSweeper #84247 review round 2:

P1 (security, high): neutralize line-start MEDIA: directives in vision descriptions
before wrapping with wrapExternalContent. The agent media extractor scans every
browser tool-result text block via splitMediaFromOutput which treats line-start
MEDIA: as a trusted local-media delivery directive, and browser is on the
trusted-media allowlist. Without neutralization, page or vision-provider output
containing 'MEDIA:/tmp/secret.png' could synthesize a channel-deliverable media
artifact from untrusted content. wrapExternalContent itself does not strip
line-start directives. Introduce neutralizeMediaDirectives in vision.ts that
prepends '[neutralized] ' to any line whose trimStart() begins with MEDIA:
(case-insensitive), defanging the parser anchor while keeping the original
text human-readable.

P2 (compatibility): pass resolveRuntimeImageSanitization() to imageResultFromFile
in the vision-failure catch fallback. The non-vision screenshot path already
forwards this option (d5cc0d53b7) so configured agents.defaults.imageMaxDimensionPx
takes effect. Without this fix, any provider timeout/error silently bypasses the
sanitization guard and returns a raw full-resolution screenshot.

Regression coverage:
- vision.test.ts: 6 unit cases for neutralizeMediaDirectives (no-op fast path,
  mid-line MEDIA: untouched, line-start defanged, leading-whitespace defanged,
  case-insensitive, multiple directives per blob).
- browser-tool.test.ts: 2 integration cases that drive the full screenshot
  tool execute path:
    - 'neutralizes MEDIA: directives in vision text and does not attach media'
      asserts no line matches /^\s*MEDIA:/i in returned text, secret path text
      is preserved verbatim, details.media is absent, and imageResultFromFile
      is not called on the success path.
    - 'preserves screenshot image sanitization on vision failure fallback'
      mocks describeImageFileWithModel to reject and asserts the fallback
      imageResultFromFile call receives imageSanitization: {maxDimensionPx:1600}
      plus the 'browser screenshot vision failed' extraText.

* fix(browser): apply clawsweeper fallback media fix from PR #84247

* refactor: reuse media image understanding for browser screenshots

* refactor: use structured media delivery

* test: update music completion media instruction expectation

* fix: trim buffered reply directive padding

* test: refresh codex prompt snapshots for message media aliases

---------

Co-authored-by: scotthuang <scotthuang@tencent.com>
Co-authored-by: Peter Steinberger <steipete@gmail.com>
2026-05-31 00:00:19 +01:00
Lellansin Huang
fe3c3ac5cd fix(gateway): forward stop sequences across providers
Forward OpenAI-compatible stop sequences from gateway chat completions through the agent runner into provider transports.

The gateway now normalizes stop into sampling extras, agent transports pass it into the shared stream options, and OpenAI, Anthropic, Mistral, Google, and Vertex-backed simple providers map it to their native request fields. Provider/gateway/agent coverage plus Crabbox live gateway proof verify valid stop dispatch and invalid stop rejection.

Refs #87920
2026-05-30 19:24:21 +01:00
Peter Steinberger
18e7d28b21 perf(gateway): reuse stable turn metadata 2026-05-30 17:30:47 +01:00
Peter Steinberger
d92b3b5cc2 refactor: unify OpenAI provider identity
Refactor OpenAI provider identity so OpenAI remains the canonical provider for API-key and OAuth-backed flows while legacy openai-codex state is doctor/migration-only.

Keeps OpenAI Codex Responses as an API/transport class rather than a provider identity, moves auth aliases through providerAuthAliases, updates doctor repair sequencing for old auth/profile state, and refreshes tests/docs around the canonical OpenAI behavior.
2026-05-30 11:48:41 +02:00
Peter Steinberger
de1dfab03e refactor: move terminal core into package (#88279)
* refactor: move terminal core into package

* refactor: move terminal module files

* fix: clean terminal package CI followups

* test: update lint suppression allowlist

* fix: ship terminal core runtime aliases
2026-05-30 11:07:45 +02:00
Jason (Json)
81505ada18 fix(codex): rotate native threads before overflow
Fix Codex app-server native thread overflow recovery and CLI compaction fallback.

- rotate Codex native startup bindings when rollout token pressure leaves too little headroom
- keep byte-size rollout fuses ahead of rollout content reads
- clear stale resumed context-engine bindings only when the stored thread id still matches
- fall back to context-engine compaction when Codex owns/skips native compaction

Verification:
- node scripts/run-vitest.mjs run --config test/vitest/vitest.extension-codex.config.ts extensions/codex/src/app-server/startup-binding.test.ts extensions/codex/src/app-server/run-attempt.context-engine.test.ts extensions/codex/src/app-server/session-binding.test.ts --reporter=verbose
- node scripts/run-vitest.mjs run --config test/vitest/vitest.agents.config.ts src/agents/command/cli-compaction.test.ts --reporter=verbose
- git diff --check origin/main...HEAD
- autoreview --mode branch --base origin/main: clean
- GitHub CI for 466bfbe78c: green

Co-authored-by: fuller-stack-dev <263060202+fuller-stack-dev@users.noreply.github.com>
2026-05-30 08:07:29 +02:00
Ayaan Zaidi
f848a6f7f7 perf(agents): bound claude orphan transcript scan 2026-05-30 10:09:19 +05:30
Ayaan Zaidi
72eff6b2e9 fix(agents): clear orphan tool state on string assistant turns 2026-05-30 10:09:19 +05:30
Ayaan Zaidi
622404fcec fix(agents): detect claude-specific orphaned tools 2026-05-30 10:09:19 +05:30
Ayaan Zaidi
c0a5f15dc8 fix(agents): clear unflushed cli bindings 2026-05-30 10:09:19 +05:30
Ayaan Zaidi
2e21158d04 refactor(agents): simplify cli session recovery probes 2026-05-30 10:09:19 +05:30
Abdel Gomez-Perez
16b510807b fix(agents/cli-runner): invalidate sessions whose transcript ends mid-tool
A claude-cli session whose JSONL transcript ends with an assistant
`tool_use` content block that was never answered by a `tool_result` user
message cannot resume — claude-cli will sit waiting for the missing
`tool_result`, hit its no-output watchdog, and the runtime kills it
with `reason=abort`. The dispatcher then sees an empty payload and emits
NO_REPLY, which to the user looks like the agent silently ignored their
message — same end-user symptom as the binding-flush amnesia bug, but a
different root cause.

The orphan can be left behind when:
  - Gateway restarts mid-tool (brew upgrade, manual kickstart, OOM,
    crash) — claude was waiting on a tool result that never arrived.
  - `claude-live-session.ts` no-output watchdog fires while a tool is
    actively running and OC kills the subprocess.
  - The tool itself crashed or hung past its own deadline.

In all cases the resumed session is dead until the binding gets cleared,
because every subsequent resume hits the same trailing tool_use and the
same kill cycle. Observed in production on a personal OpenClaw gateway
(3d-engineer agent, 50-message-deep transcript ending in a Bash
`tool_use`; every Telegram message after the orphan landed silently
aborted at the 180s no-output mark).

Add `claudeCliSessionTranscriptHasOrphanedToolUse` to the helpers that
walks the JSONL, finds the last assistant message, and returns true if
any of its `tool_use` ids has no matching `tool_result` later in the
file. Wire into `prepareCliRunContext` as a second invalidator gate
alongside `missing-transcript`. The new `invalidatedReason:
"orphaned-tool-use"` follows the same path as missing-transcript: the
binding is dropped, this turn starts a fresh session, and the prior
context is reseeded into the new session via `RAW_TRANSCRIPT_RESEED`.

Detection only considers TRAILING orphans — an unanswered tool_use
deeper in history is inert because a later assistant message already
moved past it. Only the most recent assistant message's tool_use ids
matter for forward progress.

Probe runs only for claude-cli providers and only when the transcript-
content gate already passed, so we add no I/O on already-invalidated
sessions and no behavior change for non-claude providers.

AI-assisted: yes. Tooling: Claude Opus + claude-cli.
2026-05-30 10:09:19 +05:30
Peter Steinberger
a509c48f0e feat: add core session goals (#87469)
* feat: add core session goals

* feat: polish session goals in tui

* fix: resolve goal tool session stores

* fix: keep get goal read-only

* fix: migrate legacy goal session slots

* fix: persist goal token accounting

* fix: validate goal session rows

* refactor: remove unshipped goal legacy handling

* fix: handle goal commands in local tui

* fix: satisfy goal tool display checks

* fix: reset goal budget on overdue resume

* feat: surface session goals across control surfaces

* test: update gateway protocol test import

* test: align goal fixture types with protocol

* fix: scope selected global transcript usage fallback

* fix: scope selected global web subscriptions

* fix: preserve selected global agent during chat dispatch

* fix: scope chat inject to selected global agents
2026-05-29 22:36:29 +02:00
xin zhuang
960117259d fix(agents): preserve rotated compaction session identity
Fix `sessions.json` persistence after compaction transcript rotation.

When the agent runtime rotates from the pre-compaction session transcript to the post-compaction transcript, post-run consumers now receive the effective OpenClaw session id and session file. Backend CLI session ids remain backend metadata and no longer overwrite the top-level OpenClaw session identity.

Refs #88040.
Thanks @1052326311.

Verification:
- `node scripts/run-vitest.mjs src/agents/agent-command.compaction-rotation.test.ts src/agents/agent-command.live-model-switch.test.ts src/agents/command/session-store.test.ts`
- Autoreview clean
- GitHub CI green on PR head `c3d3c77ddf675bbba0b9ba6681b030a2f69a898c`
2026-05-29 22:05:05 +02:00
Peter Steinberger
467b068fdc perf(sessions): patch single-entry store writes 2026-05-29 19:54:01 +01:00
Peter Steinberger
1ca7f5c0a0 perf(gateway): reuse session maintenance config during turns 2026-05-29 19:23:28 +01:00
benjamin1492
de455304cc fix(command): stabilize claude-cli transcript resume (#81048)
Fix claude-cli transcript resume so session-id rotation and transcript flush timing do not drop valid resume state.

- Capture the latest claude-cli session_id from JSONL output.
- Resolve Claude project transcript paths through the shared canonical project-dir resolver.
- Probe transcript content from the actual CLI process cwd.
- Thanks @benjamin1492!
2026-05-29 22:56:09 +05:30
Shakker
efffb42ef9 refactor: split skills index follow-up 2026-05-29 17:35:02 +01:00
Shakker
d9278c8efd refactor: organize skills subsystem layout 2026-05-29 17:35:02 +01:00
Shakker
355fb4d860 refactor: use direct skills imports 2026-05-29 17:35:02 +01:00
Shakker
22e2d1560f refactor: centralize skills subsystem 2026-05-29 17:35:02 +01:00
Paul Frederiksen
e69855e68c fix(codex): recover raw missing-thread compaction failures (#87738)
Recover Codex compaction paths when a stale app-server thread binding returns an unstructured `thread not found` failure. The raw missing-thread response now shares the same recovery behavior as structured missing/stale binding failures for preflight, queued compaction, and CLI fallback.

Fixes #87736.

Co-authored-by: Paul Frederiksen <paul@paulfrederiksen.com>
2026-05-28 22:41:44 +01:00
Andy Ye
5f88932806 fix(sessions): recover empty preflight compaction
Fixes #87016.

Empty preflight compaction recovery now resets stale token snapshots immediately, preserves valid legacy transcript rows during cleanup, and avoids re-persisting stale context-budget or compaction metadata after a successful retry.

Co-authored-by: Andy Ye <35905412+TurboTheTurtle@users.noreply.github.com>
2026-05-28 17:06:38 +01:00
Peter Steinberger
aab5410bd5 test: speed up slow test suite (#87611)
* test: speed up slow test suite

* test: preserve fake timer cleanup hooks

* test: avoid timeout readiness race

* test: satisfy reply test types

* test: restore runner and image coverage

* test: restore final media runner path

* test: make cli auth status fixture deterministic

* test: repair runtime alias fixtures
2026-05-28 13:20:19 +01:00
Andy Ye
d8641a661b fix(sessions): avoid stale restart continuation reuse
Avoid stale restart continuation reuse after a session key has rotated.

Queued restart agent turns now carry the session id they were queued for and fall back to a system wake if the key points at a different session by delivery time. Normal completed-run lifecycle fields stay reusable for fresh sessions, while new-session creation clears stale lifecycle markers.

Closes #86593.

Co-authored-by: Andy Ye <35905412+TurboTheTurtle@users.noreply.github.com>
2026-05-27 23:55:24 +01:00
Mariano
7299c56953 Fix sub-agent cwd/workspace separation (#87218)
Merged via squash.

Prepared head SHA: f47b073830
Co-authored-by: mbelinky <132747814+mbelinky@users.noreply.github.com>
Co-authored-by: mbelinky <132747814+mbelinky@users.noreply.github.com>
Reviewed-by: @mbelinky
2026-05-27 23:55:24 +02:00
Peter Steinberger
bb46b79d3c refactor: internalize OpenClaw agent runtime (#85341)
* refactor: extract agent core package

Introduce packages/agent-core as the OpenClaw-owned home for reusable agent loop, harness, session, prompt, and runtime dependency contracts.

* refactor: extract shared llm runtime

Move provider model registries, stream wrappers, OAuth helpers, and LLM utilities into src/llm with plugin-sdk barrels instead of depending on the old embedded runtime layout.

* refactor: remove pi runtime internals

Rename remaining Pi-shaped agent surfaces to OpenClaw agent runtime names, delete obsolete Pi docs and package graph checks, and add the third-party notice for incorporated code.

* refactor: tighten agent session runtime

Make agent-core/runtime dependencies explicit, consolidate compaction and session transcript helpers, and move model/session helpers behind OpenClaw-owned contracts.

* refactor: remove static model and pi auth paths

Drop static model catalogs and Pi auth bridges, move model/provider facts to manifest-owned runtime contracts, and harden internal embedded-agent utilities.

* refactor: remove legacy provider compat paths

* docs: remove agent parity notes

* fix: skip provider wildcard metadata parsing

* refactor: share session extension sdk loading

* refactor: inline acpx proxy error formatter

* refactor: fold edit recovery into edit tool

* fix: accept extension batch separator

* test: align startup provider plugin expectations

* fix: restore provider-scoped release discovery

* test: align static asset packaging expectations

* fix: run static provider catalogs during scoped discovery

* fix: add provider entry catalogs for scoped live discovery

* fix: load lightweight provider catalog entries

* fix: refresh provider-scoped plugin metadata

* fix: keep provider catalog entries on release live path

* fix: keep static manifest models in release live checks

* fix: harden release model discovery

* fix: reduce OpenAI live cache probe reasoning

* fix: disable OpenAI cache probe reasoning

* ci: extend OpenAI gateway live timeout

* fix: extend live gateway model budget

* fix: stabilize release validation regressions

* fix: honor provider aliases in model rows

* fix: stabilize release validation lanes

* fix: stabilize release memory qa

* ci: stabilize release validation lanes

* ci: prefer ipv4 for live docker node calls

* fix: restore shared tool-call stream wrapper

* ci: remove legacy pi test shard alias

* fix: clean up embedded agent test drift

* fix: stabilize runtime alias status

* fix: clean up embedded agent ci drift

* fix: restore release ci invariants

* fix: clean up post-rebase runtime drift

* fix: restore release ci checks

* fix: restore release ci after rebase

* fix: remove stale pi runtime path

* test: align compaction runtime expectations

* test: update plugin prerelease expectations

* fix: handle claude live tool approvals

* fix: stabilize release validation gates

* fix: finish agent runtime import

* test: finish post-rebase agent runtime mocks

* fix: keep codex compaction native

* fix: stabilize codex app-server hook tests

* test: isolate codex diagnostic active run

* test: remove codex diagnostic completion race

# Conflicts:
#	extensions/codex/src/app-server/run-attempt.test.ts

* ci: fix full release manifest performance run id

* refactor: narrow llm plugin sdk boundary

* chore: drop generated google boundary stamps

* fix: repair rebase fallout

* fix: clean up rebased runtime references

* fix: decode codex jwt payloads as base64url

* fix: preserve shipped pi runtime alias

* fix: add scoped sdk virtual modules

* fix: decode llm codex oauth jwt as base64url

* fix: avoid stale vertex adc negative cache

* fix: harden tool arg decoding and codeql path

* fix: keep vertex adc negative checks live

* refactor: consolidate codex jwt and edit helpers

* fix: await codex oauth node runtime imports

* fix: preserve sdk tool and notice contracts

* fix: preserve shipped compat config boundaries

* fix: align codex oauth callback host

* fix: terminate agent-core loop streams on failure

* fix: keep codex oauth callback alive during fallback

* ci: include session tools in critical codeql scans

* fix: keep Cloudflare Anthropic provider auth header

* docs: redirect legacy pi runtime pages

* fix: honor bundled web provider compat discovery

* fix: protect session output spill files

* fix: keep legacy agent dir env blocked

* fix: contain auto-discovered skill symlinks

* fix: harden agent core sdk proxy surfaces

* fix: restore approval reaction sdk compat

* fix: keep live docker runs bounded

* fix: keep codex oauth redirect host aligned

* fix: resolve post-rebase agent runtime drift

* fix: redact anthropic oauth parse failures

* fix: preserve responses strict tool shaping

* fix: repair agent runtime rebase cleanup

* docs: redirect retired parity pages

* fix: bound auto-discovered resources to roots

* fix: repair post-rebase agent test drift

* fix: preserve bundled provider allowlist migration

* fix: preserve manifest-owned provider aliases

* fix: declare photon image dependency

* fix: keep provider headers out of proxy body

* fix: preserve shipped env aliases

* fix: refresh control ui i18n generated state

* fix: quote read fallback paths

* fix: preview edits through configured backend

* test: satisfy core test typecheck

* fix: preserve ZAI usage auth fallback

* test: repair codex diagnostic test

* fix: repair agent runtime rebase drift

* test: finish embedded runner import rename

* fix: repair agent runtime rebase integrations

* test: align compaction oauth fallback expectations

* fix: allow sdk-auth session models

* fix: update doctor tool schema import

* fix: preserve bedrock plugin region

* fix: stream harmony-like prose immediately

* ci: include session runtime in codeql shards

* fix: repair latest rebase integrations

* fix: honor explicit codex websocket transport

* fix: keep openai-compatible credentials provider-scoped

* fix: refresh sdk api baseline after rebase

* fix: route cli runtime aliases through openclaw harness

* test: rename stale harness mock expectation

* test: rename embedded agent overflow calls

* test: clean embedded auth test wording

* test: use openclaw stream types in deepinfra cache test

* fix: refresh sdk api baseline on latest main

* fix: honor bundled discovery compat allowlists

* fix: refresh sdk api baseline after latest rebase

* fix: remove stale rebase imports

* test: rename stale model catalog mock

* test: mock renamed doctor runtime modules

* fix: map canonical kimi env auth

* fix: use internal model registry in bench script

* fix: migrate deepinfra provider catalog entry

* fix: enforce builtin tool suppression

* fix: route compaction auth and proxy payloads safely

* refactor: prune unused llm registry leftovers

* test: update codex hooks session import

* test: fix model picker ci coverage

* test: align model picker auth mock types
2026-05-27 19:24:04 +01:00
Peter Steinberger
8c8162f1f7 perf(gateway): borrow read-only session metadata 2026-05-27 07:32:29 +01:00
Peter Steinberger
8fa4fad3a7 perf(gateway): skip duplicate turn session touch 2026-05-27 05:28:10 +01:00
Peter Steinberger
d2711c900d perf(gateway): reuse prepared auth stores 2026-05-27 04:51:43 +01:00
Peter Steinberger
cdfb1b4bf1 perf: trim gateway session cache churn 2026-05-27 03:23:26 +01:00
Shakker
696fb41c5b fix: restore user turn persistence checks 2026-05-27 02:38:58 +01:00
Shakker
5268bf900e refactor: persist cli user turns through sessions 2026-05-27 02:38:58 +01:00
Shakker
840cea5d6e refactor: use shared user turn builder for command transcripts 2026-05-27 02:38:58 +01:00
Lellansin Huang
6c7b3f3f23 feat(gateway): forward OpenAI sampling params (#84094)
Forward OpenAI-compatible frequency_penalty, presence_penalty, and seed params through the gateway/chat-completions path while keeping Responses untouched.

Verification:
- pnpm test src/gateway/openai-http.test.ts src/agents/pi-embedded-runner/extra-params.sampling.test.ts src/agents/openai-transport-stream.test.ts
- CI passed on head 9abb9466d9 after rerunning cancelled jobs: preflight, critical quality network-runtime-boundary, security high, checks, docs, Real behavior proof.

Co-authored-by: lellansin <lellansin@gmail.com>
2026-05-26 00:33:26 +01:00
Sally O'Malley
f0b6f70053 fix(agents): honor effective exec policy for Claude live Bash (#86330)
* fix(agents): answer Claude live control_request can_use_tool via exec policy

Claude CLI emits stream-json control_request frames with subtype
can_use_tool when it wants to use a native tool. The Claude live-session
bridge previously dropped these frames, leaving Claude waiting for a
control_response until the 180/600s no-output timeout fired (see #80819).

Resolve the effective OpenClaw exec policy (per-agent tools.exec -> global
tools.exec -> allowlist/on-miss defaults) once at session-start time and
thread it through fingerprinting and the session record. When a
can_use_tool request arrives:

- Allow native Bash when the resolved policy is security=full, ask=off
  (matching the bypassPermissions semantics OpenClaw already documents).
- Otherwise deny with a message that names the resolved policy and
  points the agent at OpenClaw MCP tools.

Unsupported control_request subtypes get a structured error response
instead of a silent no-op, and stray control_response frames are
silently dropped. Adds spawn-test coverage for both allow and deny paths.

Fixes #80819

* fix(agents): align Claude live control_request policy with backend defaults

Resolve the effective exec policy through the same defaults that
extensions/anthropic/cli-shared.ts:isOpenClawRequestedYolo and
src/agents/exec-defaults.ts:resolveExecDefaults already use (security
?? "full", ask ?? "off") instead of falling back to a hand-rolled
allowlist/on-miss default that disagreed with the rest of the codebase.
Without this, a default-config OpenClaw deployment launches Claude with
--permission-mode bypassPermissions but the bridge would still deny
Bash control_requests, re-creating the #80819 stall for the very
default-config case the issue reports.

Also thread the effective Claude permission mode into the policy
decision. Prefer the operator's explicit --permission-mode in argv,
falling back to what normalizeClaudePermissionArgs would have inserted
for an un-overridden launch. Native Bash is auto-allowed only when the
effective mode is bypassPermissions AND tools.exec resolves to
full/no-ask, so explicit raw-arg overrides like --permission-mode
default or acceptEdits broaden Claude's native prompting and are
honored by routing through deny.

Adds a no-config regression test (default deployment allows Bash, no
stall) and a permission-mode-override test (tools.exec full/off plus
explicit --permission-mode default in raw args denies). Existing
allow/deny tests continue to pass via the synthesized-mode fallback.

* fix(agents): honor effective exec policy for Claude live Bash

---------

Co-authored-by: Guillaume Thirry <g.thirry@gmail.com>
2026-05-25 11:39:17 -04:00
pashpashpash
dd47e479ae Fail Codex compaction at the Codex boundary (#85958) 2026-05-24 22:12:34 -07:00
brokemac79
f55e98671a fix: preserve internal handoff status attribution [AI-assisted] (#85726)
* fix: preserve status attribution for internal handoffs

* fix: preserve internal handoff status attribution (#85726) (thanks @brokemac79)

* fix: surface internal fallback failures (#85726)

* fix: preserve internal handoff session continuity (#85726)

* fix: skip internal fallback auto overrides (#85726)

* fix: preserve direct internal handoff state (#85726)

* fix: authorize internal announce handoff (#85726)

* fix: preserve handoff accounting without hiding transcript (#85726)

* test: fix session-store cli backend fixture (#85726)

* fix: trust-gate handoff accounting preservation (#85726)

* fix: avoid stale preserve-mode session writes (#85726)

* fix: avoid preserve-mode session identity writes (#85726)

* fix: hide internal handoff usage footers (#85726)

---------

Co-authored-by: Peter Steinberger <steipete@gmail.com>
2026-05-24 03:24:27 +01:00
Peter Steinberger
d73f3ac85d refactor: split subagent delivery state 2026-05-24 00:05:48 +01:00
Gio Della-Libera
05c6e7a553 feat(agents): expose estimated context budget status
Expose a path-free estimated context budget status on session entries and gateway session rows, render it in status when fresh provider usage is unavailable, and clear stale estimates across reset, refresh, compaction, and session-rotation boundaries.

Verification: focused local Vitest covered session persistence, status rendering, gateway rows, model resets, compaction, and session rotation; GitHub CI passed on clean head cad199e43d.

Refs #80594, #54996, #77992, #84490, #83177, #43009, #83526, #8635.
2026-05-23 14:17:44 -07:00
Peter Steinberger
c4f0da00a9 refactor: use channel target resolution APIs (#85814)
* refactor: use channel target resolution apis

* refactor: satisfy delivery lint

* refactor: remove unused target parsing shim

* fix: preserve routed cron topic targets
2026-05-23 21:26:55 +01:00
brokemac79
f4b92f5e6c fix(agents): simplify subagent completion handoff
Simplify native subagent completion handoff and remove manual subagent control surfaces.

Co-authored-by: brokemac79 <martin_cleary@yahoo.co.uk>
2026-05-23 13:50:08 +01:00