Commit Graph

31922 Commits

Author SHA1 Message Date
Peter Steinberger
7c1484d637 refactor: extract media generation core package
Extract pure media generation catalog/model-ref/normalization helpers into a private workspace package and wire the package through build, watch, SDK alias, and plugin boundary d.ts paths.

Verification:
- node scripts/run-vitest.mjs test/scripts/crabbox-wrapper.test.ts packages/media-generation-core/src src/media-generation/runtime-shared.test.ts src/plugins/sdk-alias.test.ts src/infra/watch-node.test.ts src/plugins/registry.provider-like.test.ts src/agents/model-ref-shared.test.ts extensions/codex-supervisor/src/plugin-tools.test.ts extensions/codex-supervisor/src/supervisor.test.ts src/wizard/setup.official-plugins.test.ts src/infra/net/http-connect-tunnel.test.ts
- node scripts/prepare-extension-package-boundary-artifacts.mjs --mode=all
- node scripts/run-vitest.mjs src/plugins/contracts/extension-package-project-boundaries.test.ts src/plugins/sdk-alias.test.ts
- pnpm protocol:check
- pnpm check:changed
- .agents/skills/autoreview/scripts/autoreview --mode branch --base origin/main
- GitHub CI 26676608512
2026-05-30 08:17:43 +02:00
Peter Steinberger
be2c43ee3e fix(llm): cap codex retry delays 2026-05-30 02:17:30 -04:00
Peter Steinberger
5aa2bd7921 fix(agents): cap subagent context TTLs 2026-05-30 02:12:45 -04:00
Peter Steinberger
5db2cd6c00 perf: skip session store clones in turn hot paths 2026-05-30 07:11:03 +01:00
Jason (Json)
81505ada18 fix(codex): rotate native threads before overflow
Fix Codex app-server native thread overflow recovery and CLI compaction fallback.

- rotate Codex native startup bindings when rollout token pressure leaves too little headroom
- keep byte-size rollout fuses ahead of rollout content reads
- clear stale resumed context-engine bindings only when the stored thread id still matches
- fall back to context-engine compaction when Codex owns/skips native compaction

Verification:
- node scripts/run-vitest.mjs run --config test/vitest/vitest.extension-codex.config.ts extensions/codex/src/app-server/startup-binding.test.ts extensions/codex/src/app-server/run-attempt.context-engine.test.ts extensions/codex/src/app-server/session-binding.test.ts --reporter=verbose
- node scripts/run-vitest.mjs run --config test/vitest/vitest.agents.config.ts src/agents/command/cli-compaction.test.ts --reporter=verbose
- git diff --check origin/main...HEAD
- autoreview --mode branch --base origin/main: clean
- GitHub CI for 466bfbe78c: green

Co-authored-by: fuller-stack-dev <263060202+fuller-stack-dev@users.noreply.github.com>
2026-05-30 08:07:29 +02:00
Peter Steinberger
8edeba0de3 fix(agents): cap provider request timeouts 2026-05-30 02:07:14 -04:00
Peter Steinberger
beb42b12c9 refactor(agents): type media completion delivery misses (#88250) 2026-05-30 08:04:50 +02:00
Peter Steinberger
42b320ad65 fix(cron): cap explicit job timeouts 2026-05-30 02:00:52 -04:00
Peter Steinberger
bba8015688 fix: show chat errors as visible messages
Surface gateway chat failures as visible assistant messages in the Control UI, with regression coverage and Crabbox/WebVNC proof.
2026-05-30 07:57:18 +02:00
Peter Steinberger
05e31bbedd refactor(agents): reuse terminal outcome for subagent waits 2026-05-30 06:56:52 +01:00
Peter Steinberger
c806a736af fix(agents): cap session wait timeouts 2026-05-30 01:56:44 -04:00
Vincent Koc
ceb179f84d refactor: share web search time filters 2026-05-30 07:53:51 +02:00
Peter Steinberger
afa6d0cd18 fix(web): cap provider timeout seconds 2026-05-30 01:47:06 -04:00
Peter Steinberger
aa0d6e1bca refactor: extract LLM core packages (#88117)
* refactor: extract llm core packages

* chore: drop generated llm package artifacts

* fix: align llm package export artifacts

* test: fix moving main CI expectations

* fix: align llm core subpath aliases

* fix: use llm package exports

* fix: stabilize llm package boundary artifacts

* fix: sync llm boundary path contract

* test: isolate crabbox provider env

* test: pin crabbox configured-provider cases

* test: apply crabbox lease provider override
2026-05-30 07:45:04 +02:00
Peter Steinberger
c536bd6af1 fix(agents): cap exec reviewer timeout 2026-05-30 01:29:05 -04:00
Peter Steinberger
fcdc25ba64 test: dedupe redundant test coverage 2026-05-30 06:27:13 +01:00
Ayaan Zaidi
f848a6f7f7 perf(agents): bound claude orphan transcript scan 2026-05-30 10:09:19 +05:30
Ayaan Zaidi
72eff6b2e9 fix(agents): clear orphan tool state on string assistant turns 2026-05-30 10:09:19 +05:30
Ayaan Zaidi
56fc17be78 fix(agents): avoid cli facade load in flush gate 2026-05-30 10:09:19 +05:30
Ayaan Zaidi
3c3e39684e test(agents): cover flushed cli context engine session 2026-05-30 10:09:19 +05:30
Ayaan Zaidi
25dfe9294f fix(agents): pass workspace to cli flush probe 2026-05-30 10:09:19 +05:30
Ayaan Zaidi
622404fcec fix(agents): detect claude-specific orphaned tools 2026-05-30 10:09:19 +05:30
Ayaan Zaidi
bda02f4be8 fix(agents): scope cli binding clears 2026-05-30 10:09:19 +05:30
Ayaan Zaidi
58de6f91dc fix(auto-reply): clear unflushed cli bindings 2026-05-30 10:09:19 +05:30
Ayaan Zaidi
c0a5f15dc8 fix(agents): clear unflushed cli bindings 2026-05-30 10:09:19 +05:30
Ayaan Zaidi
21b5f601b6 fix(agents): preserve auth-boundary cli invalidation 2026-05-30 10:09:19 +05:30
Ayaan Zaidi
2e21158d04 refactor(agents): simplify cli session recovery probes 2026-05-30 10:09:19 +05:30
Abdel Gomez-Perez
16b510807b fix(agents/cli-runner): invalidate sessions whose transcript ends mid-tool
A claude-cli session whose JSONL transcript ends with an assistant
`tool_use` content block that was never answered by a `tool_result` user
message cannot resume — claude-cli will sit waiting for the missing
`tool_result`, hit its no-output watchdog, and the runtime kills it
with `reason=abort`. The dispatcher then sees an empty payload and emits
NO_REPLY, which to the user looks like the agent silently ignored their
message — same end-user symptom as the binding-flush amnesia bug, but a
different root cause.

The orphan can be left behind when:
  - Gateway restarts mid-tool (brew upgrade, manual kickstart, OOM,
    crash) — claude was waiting on a tool result that never arrived.
  - `claude-live-session.ts` no-output watchdog fires while a tool is
    actively running and OC kills the subprocess.
  - The tool itself crashed or hung past its own deadline.

In all cases the resumed session is dead until the binding gets cleared,
because every subsequent resume hits the same trailing tool_use and the
same kill cycle. Observed in production on a personal OpenClaw gateway
(3d-engineer agent, 50-message-deep transcript ending in a Bash
`tool_use`; every Telegram message sent after the orphan landed was
silently aborted at the 180s no-output mark).

Add `claudeCliSessionTranscriptHasOrphanedToolUse` to the helpers. It
walks the JSONL, finds the last assistant message, and returns true if
any of its `tool_use` ids has no matching `tool_result` later in the
file. Wire it into `prepareCliRunContext` as a second invalidator gate
alongside `missing-transcript`. The new `invalidatedReason:
"orphaned-tool-use"` follows the same path as missing-transcript: the
binding is dropped, this turn starts a fresh session, and the prior
context is reseeded into the new session via `RAW_TRANSCRIPT_RESEED`.

Detection only considers TRAILING orphans — an unanswered tool_use
deeper in history is inert because a later assistant message already
moved past it. Only the most recent assistant message's tool_use ids
matter for forward progress.

Probe runs only for claude-cli providers and only when the transcript-
content gate already passed, so we add no I/O on already-invalidated
sessions and no behavior change for non-claude providers.

AI-assisted: yes. Tooling: Claude Opus + claude-cli.
2026-05-30 10:09:19 +05:30
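
The trailing-orphan check described in commit 16b510807b can be sketched roughly as follows. This is an illustration only, not the repository's `claudeCliSessionTranscriptHasOrphanedToolUse`: the record shape (`role`, `content`, `tool_use_id`) is a simplified assumption about the JSONL schema, and the names `TranscriptRecord`, `parseJsonl`, and `hasTrailingOrphanedToolUse` are hypothetical.

```typescript
// Hypothetical, simplified record shapes; the real claude-cli JSONL schema may differ.
type ContentBlock =
  | { type: "tool_use"; id: string }
  | { type: "tool_result"; tool_use_id: string }
  | { type: "text"; text: string };

interface TranscriptRecord {
  role: "assistant" | "user";
  content: string | ContentBlock[];
}

// Parse JSONL text, skipping blank lines and lines that fail to parse
// (for example a torn final line from an interrupted write).
function parseJsonl(jsonl: string): TranscriptRecord[] {
  const records: TranscriptRecord[] = [];
  for (const line of jsonl.split("\n")) {
    const trimmed = line.trim();
    if (trimmed === "") continue;
    try {
      const parsed: unknown = JSON.parse(trimmed);
      if (typeof parsed === "object" && parsed !== null) {
        records.push(parsed as TranscriptRecord);
      }
    } catch {
      // Unparseable line: ignore it.
    }
  }
  return records;
}

// Returns true when the LAST assistant message contains a tool_use id that
// no later user record answers with a tool_result. Unanswered tool_use
// blocks deeper in history are deliberately ignored.
function hasTrailingOrphanedToolUse(jsonl: string): boolean {
  const records = parseJsonl(jsonl);
  const lastAssistant = records.map((r) => r.role).lastIndexOf("assistant");
  const last = records[lastAssistant];
  if (last === undefined || !Array.isArray(last.content)) return false;

  // Collect the tool_use ids of the last assistant message.
  const pending = new Set<string>();
  for (const block of last.content) {
    if (block.type === "tool_use") pending.add(block.id);
  }

  // Remove every id that a later user record answers.
  for (const record of records.slice(lastAssistant + 1)) {
    if (record.role !== "user" || !Array.isArray(record.content)) continue;
    for (const block of record.content) {
      if (block.type === "tool_result") pending.delete(block.tool_use_id);
    }
  }
  return pending.size > 0;
}
```

Restricting the scan to the last assistant message mirrors the reasoning in the commit message: an older unanswered `tool_use` cannot block a resume, because a later assistant message has already moved past it.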
Abdel Gomez-Perez
07c1245db4 fix(agents/cli-runner): gate cliSessionBinding persist on transcript flush
When a claude-cli turn produces a session id but the underlying claude
subprocess fails to flush an assistant-role record to its
~/.claude/projects/<cwd>/<sid>.jsonl transcript (e.g. mid-turn kill from
a concurrent fingerprint-mismatched turn, supervisor restart, internal
failure), buildCliRunResult was still persisting that session id into
cliSessionBinding. The next turn ran claudeCliSessionTranscriptHasContent,
didn't find the file, logged 'cli session reset: reason=missing-transcript',
and started a brand-new claude session with empty memory.

End-user symptom: agent forgets prior conversation between turns.

Gate the cliSessionBinding spread on the same predicate the next-turn
invalidator uses, evaluated at write time. Also clear agentMeta.sessionId
in the same case so the session-store fallback at command/session-store.ts
(which reads agentMeta.sessionId via setCliSessionId when the binding is
absent) doesn't re-persist the unflushed sid through a different field
path. The fallback is what makes the binding-only gate insufficient on
its own; both writes must drop together.

The gate only fires for claude-cli providers — other CLI providers don't
write to ~/.claude/projects, so probing them would always return false
and incorrectly strip valid binding metadata. isCliBindingFlushed now
takes the provider id and returns true unconditionally for non-claude-cli
sessions.

A bounded retry (0 / 50 / 150 ms) tolerates the brief gap between
claude-cli's stdio close and the OS making the JSONL line visible to
readers (cooperative fsync semantics on APFS, but not guaranteed under
stress).

The transcript-probe is exposed as an injectable dep
(setCliRunnerTestDeps / restoreCliRunnerTestDeps) mirroring the existing
pattern in src/agents/cli-runner/prepare.ts so isCliBindingFlushed is
testable without touching ~/.claude/projects.

AI-assisted: yes. Tooling: Claude Opus + claude-cli. Codex review caught
the fallback path and the missing provider gate before this hit upstream.
Real-Behavior-Proof: dist-side patch on M5 gateway; branch-build
follow-up pending — see PR body.
2026-05-30 10:09:19 +05:30
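
A rough sketch of the provider gate and bounded retry described in commit 07c1245db4 follows. It is not the repository's `isCliBindingFlushed`: the names `TranscriptProbe`, `probeWithBoundedRetry`, and `isBindingFlushedSketch` are hypothetical, the probe is injected instead of reading `~/.claude/projects`, and the 0 / 50 / 150 ms schedule is assumed to mean the wait before each successive attempt.

```typescript
// Hypothetical probe signature: resolves true once the transcript for the
// session holds a flushed assistant record. Injected so the sketch needs no disk access.
type TranscriptProbe = (sessionId: string) => Promise<boolean>;

const sleep = (ms: number): Promise<void> =>
  new Promise<void>((resolve) => setTimeout(resolve, ms));

// Try the probe once per entry in delaysMs, waiting that long before each
// attempt. Stops at the first success; returns false if every attempt fails.
async function probeWithBoundedRetry(
  probe: TranscriptProbe,
  sessionId: string,
  delaysMs: readonly number[] = [0, 50, 150],
): Promise<boolean> {
  for (const delay of delaysMs) {
    if (delay > 0) await sleep(delay);
    if (await probe(sessionId)) return true;
  }
  return false;
}

// Provider gate: only claude-cli sessions are probed. Every other provider
// is treated as flushed so valid binding metadata is never stripped.
async function isBindingFlushedSketch(
  providerId: string,
  sessionId: string,
  probe: TranscriptProbe,
): Promise<boolean> {
  if (providerId !== "claude-cli") return true;
  return probeWithBoundedRetry(probe, sessionId);
}
```

A caller would persist the session id only when this resolves to true, and would drop both the binding and the fallback session id together when it resolves to false, as the commit message requires.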
Ayaan Zaidi
1659b26151 fix(agent): allow media retry after blocked delivery 2026-05-30 09:07:53 +05:30
Ayaan Zaidi
c88178d9b6 fix(agent): recover media completion delivery 2026-05-30 09:07:53 +05:30
Peter Steinberger
d115fb4cf9 refactor: move task state to shared sqlite
Move task run, delivery, and flow registry persistence onto the shared OpenClaw state SQLite database.

Summary:
- Store task runs, delivery state, and flow runs in state/openclaw.sqlite via the generated Kysely schema.
- Migrate shipped task sidecars into the shared state DB and archive old sidecars, including invalid-config/read-only CLI paths.
- Keep startup migration lightweight for read-only status/tasks paths while still detecting known legacy state markers and custom session stores.

Verification:
- .agents/skills/autoreview/scripts/autoreview --mode local: clean after final fix
- pnpm test src/tasks/task-registry.store.test.ts src/tasks/task-flow-registry.store.test.ts src/commands/doctor-state-migrations.test.ts -- --reporter=verbose
- pnpm test src/commands/doctor-state-migrations.test.ts src/cli/program/config-guard.test.ts src/cli/route.test.ts src/cli/command-path-policy.test.ts -- --reporter=verbose
- pnpm test src/cli/program/config-guard.test.ts src/cli/route.test.ts src/cli/command-startup-policy.test.ts src/cli/command-path-policy.test.ts src/cli/command-execution-startup.test.ts -- --reporter=verbose
- pnpm test src/cli/program/config-guard.test.ts src/cli/argv.test.ts src/cli/route.test.ts src/commands/doctor-config-preflight.state-migration.test.ts -- --reporter=verbose
- pnpm test src/tasks/task-flow-registry.store.test.ts -- --reporter=verbose
- pnpm test test/scripts/lint-suppressions.test.ts -- --reporter=verbose
- pnpm db:kysely:check
- pnpm lint:kysely
- git diff --check HEAD
- pnpm test:startup:memory
- PR CI green on 2f7d76f0d5
2026-05-30 04:54:37 +02:00
Josh Avant
584fa3215c Fix restart sentinel internal continuations (#88161)
* fix restart sentinel internal continuations

* update gateway prompt snapshots

* stabilize sandbox browser audit timer tests

* drive sandbox audit timeouts deterministically

* drive gh-read timeout tests deterministically

* drive label-open-issues timeout tests deterministically

* document deterministic timeout test timers

* test: preserve deterministic timer setup after rebase
2026-05-29 19:06:54 -07:00
Vincent Koc
985b41e136 refactor: share Codex auth identity helpers 2026-05-30 03:57:20 +02:00
Vincent Koc
75de853c37 refactor: share provider OAuth runtime helpers 2026-05-30 03:30:51 +02:00
Josh Avant
b3b962a051 fix subagent dm completion delivery (#88182) 2026-05-29 18:24:49 -07:00
Peter Steinberger
acb0e9c155 fix(agents): extend terminal outcome projections (#88162)
* fix(agents): extend terminal outcome projections

* fix(agents): align terminal outcome follow-up checks

* fix(agents): satisfy terminal outcome mapper lint

* test(scripts): isolate websocket open timers

* test(security): drive sandbox browser timeout timers

* test(scripts): drive gh-read timeout timers

* test(agents): isolate code mode timers

* fix(agents): preserve hard timeouts on wait surfaces

* fix(agents): require timeout attribution for provider errors

* fix(sdk): require timeout attribution for provider errors

* fix(scripts): preserve changelog parse cause
2026-05-30 03:13:01 +02:00
Vincent Koc
deb48a96fb refactor: share prompt template arguments 2026-05-30 03:05:46 +02:00
Vincent Koc
1a4eb0b5e7 refactor: share agent truncate utilities 2026-05-30 02:46:45 +02:00
clawsweeper[bot]
18f94fc83a fix(agents): classify embedded provider business denials for fallback (#84814)
Summary:
- The PR classifies selected embedded agent provider-denial error payloads through the shared failover matcher ... 1/current-ak auth matching, preserves guarded non-fallback cases, and covers fallback progression in tests.
- PR surface: Source +34, Tests +166. Total +200 across 5 files.
- Reproducibility: yes. Current main is source-reproducible: a non-GPT embedded result whose only signal is CE ... returns null from the classifier, and the fallback wrapper treats null classification as candidate success.

Automerge notes:
- PR branch already contained follow-up commit before automerge: fix(agents): classify embedded provider business denials for fallback
- PR branch already contained follow-up commit before automerge: fix(clawsweeper): address review for automerge-openclaw-openclaw-8304…

Validation:
- ClawSweeper review passed for head e266beac93.
- Required merge gates passed before the squash merge.

Prepared head SHA: e266beac93
Review: https://github.com/openclaw/openclaw/pull/84814#issuecomment-4505010446

Co-authored-by: Stellar鱼 <2182712990@qq.com>
Co-authored-by: clawsweeper <274271284+clawsweeper[bot]@users.noreply.github.com>
Co-authored-by: clawsweeper[bot] <274271284+clawsweeper[bot]@users.noreply.github.com>
Approved-by: takhoffman
Co-authored-by: takhoffman <781889+takhoffman@users.noreply.github.com>
2026-05-30 00:34:28 +00:00
Peter Steinberger
aada44fca5 fix(agents): preserve Codex auth for compaction fallback
Fixes #86820.

Preserve Codex OAuth-backed compaction by selecting and loading the Codex harness before resolving direct or queued compaction models, while keeping OpenAI-compatible custom base URLs on the OpenAI context config path. Also preserves persisted concrete harness pins so compaction does not hot-switch existing sessions just because an explicit Codex fallback exists.

Verification:
- node scripts/run-vitest.mjs src/agents/embedded-agent-runner/compact.hooks.test.ts src/agents/harness/selection.test.ts src/agents/harness/runtime-plugin.test.ts
- pnpm tsgo:prod
- pnpm check:test-types
- pnpm lint --threads=8
- git diff --check origin/main...HEAD
- git diff --check
- autoreview clean: no accepted/actionable findings reported; overall patch is correct (0.82)
- GitHub PR checks green on ac6f93de4a
2026-05-30 02:26:00 +02:00
Peter Steinberger
43658872d9 test: stabilize sandbox browser audit timers 2026-05-30 01:18:53 +01:00
Merlin
c8a733eae5 fix(gateway): resolve message actions against runtime config (#84535)
* fix(gateway): resolve message action config from runtime snapshot

* fix(gateway): preserve runtime config matching through auto-enable

* fix(gateway): preserve auto-enabled message action fallback

* fix(gateway): use canonical runtime snapshot for message actions

* fix(discord): route credential actions through gateway

---------

Co-authored-by: Merlin <258679497+funmerlin@users.noreply.github.com>
Co-authored-by: joshavant <830519+joshavant@users.noreply.github.com>
2026-05-29 17:14:45 -07:00
Dallin Romney
914f313740 test(unit-fast): isolate fake-timer files (#88160) 2026-05-29 17:11:05 -07:00
Peter Steinberger
4efc48a80d test(ci): stabilize sandbox browser audit timeout 2026-05-30 02:06:58 +02:00
Peter Steinberger
14795dc0cc test: stabilize block reply abort timers 2026-05-30 00:56:15 +01:00
Vincent Koc
c01a0f5588 refactor: share provider oauth runtime helpers 2026-05-30 01:31:10 +02:00
Peter Steinberger
8ff61be8d6 fix(providers): cap local service timers 2026-05-29 19:29:40 -04:00
Peter Steinberger
90d569e896 fix(telegram): centralize positive timer bounds 2026-05-29 19:25:30 -04:00
Peter Steinberger
d8bc71f222 test: stabilize realtime websocket timeout 2026-05-30 00:18:02 +01:00