Commit Graph

311 Commits

Author SHA1 Message Date
Peter Steinberger
0b8aabe864 docs: document auth profile failure policy contract (#89613)
* docs: document markdown marker renderer

* docs: document rendered markdown chunking

* docs: document markdown text chunking

* docs: document shared text chunking

* docs: document plugin text chunking exports

* docs: document avatar policy constants

* docs: document node match candidates

* docs: document scoped expiring id cache

* docs: document runtime import normalization

* docs: document string sample summaries

* docs: document session usage timeseries types

* docs: document session usage response types

* docs: document manifest frontmatter shapes

* docs: document channel route input metadata

* docs: document pair loop guard settings

* docs: document migration config patch helpers

* docs: document api provider registry

* docs: document tool call repair payloads

* docs: document plugin tool payload helpers

* docs: document lazy promise loader

* docs: document store writer queue state

* docs: document thread binding lifecycle

* docs: document concurrency helper contract

* docs: document gateway client info contract

* docs: document delivery context contracts

* docs: document secret ref defaults contract

* docs: document command gating contract

* docs: document avatar policy contract

* docs: document node match policy

* docs: document message channel normalization

* docs: document boolean parsing contract

* docs: document zod parse helpers

* docs: document direct dm guard policy

* docs: document fixed window limiter contract

* docs: document node presence event contract

* docs: document secret normalization contract

* docs: document progress draft line removal

* docs: document usage formatting contracts

* docs: document agent run status contract

* docs: document runtime import helpers

* docs: document provider utility ownership

* docs: document invalid config helpers

* docs: document json compat parser

* docs: document channel config metadata ownership

* docs: document channel logging helpers

* docs: document sender identity validation ownership

* docs: document string sampling helper

* docs: document global singleton helpers

* docs: document transcript tool helpers

* docs: document exec safe-bin normalization

* docs: document reaction level resolver

* docs: document account snapshot redaction boundary

* docs: document messaging target helpers

* docs: document thread binding messages

* docs: document conversation binding context

* docs: document conversation resolution helper

* docs: document owner display secret retention

* docs: document provider request config types

* docs: document skills config types

* docs: document memory config types

* docs: document imessage config types

* docs: document crestodian config types

* docs: document tools config policies

* docs: document shared config base types

* docs: document channel config contracts

* docs: document openclaw config state types

* docs: document model config contracts

* docs: document shared agent config types

* docs: document agent defaults config types

* docs: document secret input contracts

* docs: document auth config contracts

* docs: document gateway config contracts

* docs: document tool call stream repair contracts

* docs: document memory host facades

* docs: document llm core contracts

* docs: document markdown core contracts

* docs: document gateway connect error contracts

* docs: document gateway protocol primitives

* docs: document gateway frame schemas

* docs: document gateway device schemas

* docs: document gateway environment schemas

* docs: document gateway push schemas

* docs: document gateway plugin schemas

* docs: document gateway artifact schemas

* docs: document gateway command schemas

* docs: document gateway task schemas

* docs: document gateway exec approval schemas

* docs: document gateway secret schemas

* docs: document gateway config schemas

* docs: document gateway snapshot schemas

* docs: document gateway chat schemas

* docs: document gateway wizard schemas

* docs: document gateway node schemas

* docs: document gateway plugin approval schemas

* docs: document gateway talk schemas

* docs: document gateway agent schemas

* docs: document gateway session schemas

* docs: document gateway cron schemas

* docs: document gateway agent model skill schemas

* docs: document gateway skill proposal tool schemas

* docs: document gateway protocol registry

* docs: document gateway channel status schemas

* docs: document gateway schema regression tests

* docs: document gateway schema barrel

* docs: document gateway validator tests

* docs: document gateway primitive push tests

* docs: document gateway contract tests

* docs: document native protocol guard

* docs: document channel schema tests

* docs: document gateway protocol smoke tests

* docs: document gateway protocol entrypoint

* docs: document gateway protocol type exports

* docs: document gateway error codes

* docs: document protocol schema registry

* docs: document talk audio codec

* docs: document talk activation names

* docs: document talk consult questions

* docs: document talk consult tool

* docs: document talk run control contracts

* docs: document talk run control adapter

* docs: document talkback consult queue

* docs: document talk consult transcript guard

* docs: document talk fast context runtime

* docs: document forced talk consult coordinator

* docs: document talk output activity tracker

* docs: document talk event metrics

* docs: document talk diagnostics

* docs: document talk observability hook

* docs: document talk provider resolver

* docs: document talk provider registry

* docs: document talk runtime primitives

* docs: document talk consult controller logs

* docs: document channel identity helpers

* docs: document channel account allowlist helpers

* docs: document channel metadata draft controls

* docs: document channel ingress policy

* docs: document channel sender access gates

* docs: document channel catalog message contracts

* docs: document channel account plugin helpers

* docs: document configured binding helpers

* docs: document channel acp approval config helpers

* docs: document channel bundled config write helpers

* docs: document channel plugin utility contracts

* docs: document channel config access helpers

* docs: document channel message action helpers

* docs: document channel outbound runtime helpers

* docs: document channel pairing promotion helpers

* docs: document channel registry helpers

* docs: document channel setup wizard helpers

* docs: document channel lifecycle status helpers

* docs: document channel target thread helpers

* docs: document channel session binding helpers

* docs: document channel package module probes

* docs: document channel setup wizard contracts

* docs: document channel plugin API barrels

* docs: document channel contract test helpers

* docs: document channel core helpers

* docs: document small core facades

* docs: document provider runtime helpers

* docs: document persistence and realtime helpers

* docs: document mcp and state helpers

* docs: document tool planner contracts

* docs: document music generation runtime

* docs: document crestodian command flow

* docs: document utility helpers

* docs: document node host helpers

* docs: document transcript contracts

* docs: document trajectory export contracts

* docs: document image generation contracts

* docs: document routing helper contracts

* docs: document session helper contracts

* docs: document video generation contracts

* docs: document model catalog contracts

* docs: document proxy capture contracts

* docs: document status rendering contracts

* docs: document test helper contracts

* docs: document wizard setup contracts

* docs: document process contracts

* docs: document memory host sdk contracts

* docs: document tts contracts

* docs: document secrets runtime contracts

* docs: document shared helper contracts

* docs: document hook runtime contracts

* docs: document security audit contracts

* docs: document flow contracts

* docs: document media understanding contracts

* docs: document tui contracts

* docs: document logging contracts

* docs: document llm contracts

* docs: document cron contracts

* docs: document daemon contracts

* docs: document task contracts

* docs: document acp contracts

* docs: document test utility contracts

* docs: document skill contracts

* docs: document config contracts

* docs: document outbound infra contracts

* docs: document command analysis contracts

* docs: document provider usage infra contracts

* docs: document file safety infra contracts

* docs: document exec approval infra contracts

* docs: document gateway runtime infra contracts

* docs: document infra utility contracts

* docs: document infra queue storage contracts

* docs: document heartbeat infra contracts

* docs: document remaining infra contracts

* docs: document gateway auth contracts

* docs: document gateway display helpers

* docs: document gateway http helpers

* docs: document gateway node helpers

* docs: document gateway mcp helpers

* docs: document gateway support helpers

* docs: document gateway server runtime helpers

* docs: document gateway runtime bootstrap helpers

* docs: document gateway session events

* docs: document gateway utility helpers

* docs: document gateway talk helpers

* docs: document gateway helper contracts

* docs: document gateway server method helpers

* docs: document gateway server auth helpers

* docs: document gateway server tests

* docs: document gateway test helpers

* docs: document gateway node tests

* docs: document gateway channel tests

* docs: document gateway session tests

* docs: document gateway server startup tests

* docs: document gateway tool test helpers

* docs: document gateway server test helpers

* docs: document gateway server method tests

* docs: document remaining gateway tests

* docs: document plugin sdk public subpaths

* docs: document plugin sdk runtime helpers

* docs: document plugin sdk memory provider helpers

* docs: document plugin sdk runtime facades

* docs: document plugin sdk command approval helpers

* docs: document plugin sdk runtime types

* docs: document plugin sdk browser account helpers

* docs: document plugin sdk media memory helpers

* docs: document plugin sdk core tests

* docs: document plugin sdk contract helpers

* docs: document plugin sdk test helpers

* docs: document remaining plugin sdk tests

* docs: document cli utility helpers

* docs: document cli runtime helpers

* docs: document cli command registration helpers

* docs: document node cli helpers

* docs: document cli program registration

* docs: document message cli registration

* docs: document daemon cli helpers

* docs: document cli route parsers
2026-06-03 15:20:39 -07:00
LiLan0125
ad9f7f9a59 fix(diagnostics): requeue stuck session lane after recovery
Reset the session command lane when stuck-session recovery aborts and drains a ghost embedded run but queued lane work remains. This preserves pending user messages by using the existing lane recovery pump instead of leaving them stranded after recovery reports success.

Adds focused regression coverage for the abort=true, drained=true, queuedCount=1 path.

Fixes #89208.
Supersedes #89293.
Thanks @LiLan0125.

Co-authored-by: 李兰 0668001394 <li.lan3@xydigit.com>
2026-06-02 06:36:19 -04:00
Chunyue Wang
c0195f7ed5 fix(diagnostics): clear embedded-run activity when recovery declares lane idle (#88820)
* fix(diagnostics): clear embedded-run activity when recovery declares lane idle

Stuck-session recovery transitions a lane to idle via the recovery
coordinator, but only mutated the session-state store. When an aborted
embedded run was removed without markDiagnosticEmbeddedRunEnded, the
activity store kept hasActiveEmbeddedRun set, so the liveness sweep
reported idle/embedded_run and isIdleQueuedRecoverableSessionStall
re-triggered recovery indefinitely.

Reconcile the activity store from the authoritative idle declaration by
clearing the session's embedded-run owners. The existing generation
guard already excludes any newer run that re-armed activity, so a live
requeued run is preserved.

* fix(diagnostics): reconcile tool/model activity on authoritative idle cleanup

clearDiagnosticEmbeddedRunActivityForSession (renamed from
clearDiagnosticEmbeddedRunsForSession) now clears the aborted run's tool and
model markers alongside the embedded-run owners, matching the default
markDiagnosticEmbeddedRunEnded teardown. Clearing only the owner set left the
lane as idle + orphaned tool/model activity, which
isIdleQueuedRecoverableSessionStall still treats as recoverable while work is
queued, so the liveness sweep kept re-triggering recovery instead of converging.
Adds regression cases with stale tool and model markers plus queued work.

* test(phone-control): align service mocks with keyed store API

* fix(diagnostics): preserve rearmed recovery activity

* fix(diagnostics): clear recovered owner markers

* fix(diagnostics): clear recovered embedded work keys

* fix(diagnostics): ignore stale same-key recovery owners

* fix(diagnostics): preserve same-session recovery rearm

* fix(diagnostics): ignore stale queued activity starts

* fix(diagnostics): record recovery cutoffs for empty activity

* fix(diagnostics): preserve fresh recovery markers

* fix(diagnostics): prune stale activity before fresh recovery block

---------

Co-authored-by: Peter Steinberger <steipete@gmail.com>
2026-06-01 01:07:35 -04:00
Peter Steinberger
27dde7a4d6 chore(lint): enable stricter error rules 2026-06-01 01:12:21 +01:00
Peter Steinberger
22cb7fb6b7 chore(lint): enable no-promise-executor-return 2026-05-31 23:06:13 +01:00
Peter Steinberger
b653d94918 chore(lint): enable no-useless-assignment 2026-05-31 22:40:48 +01:00
Shubhankar Tripathy
d042452d20 fix(logging): refresh file log hostname per write
Fix JSONL file-log hostnames getting pinned to `unknown` when the first hostname read returns an empty value. The logger now retries empty hostname reads and caches the first non-empty value, keeping the top-level `hostname` and `_meta.hostname` fields aligned.

Fixes #87258.
Thanks @lonexreb for the fix.

Verification:
- `node scripts/run-vitest.mjs src/logging/logger-redaction-behavior.test.ts src/logger.test.ts`
- `node_modules/.bin/oxfmt --check --threads=1 src/logging/logger.ts src/logging/logger-redaction-behavior.test.ts`
- `.agents/skills/autoreview/scripts/autoreview --mode branch --base origin/main`
- `gh pr checks 88131 --watch=false`

Co-authored-by: lonexreb <reach2shubhankar@gmail.com>
Co-authored-by: Peter Steinberger <steipete@gmail.com>
2026-05-31 22:35:04 +01:00
Peter Steinberger
304e2c83c0 chore(lint): enable stricter oxlint rules 2026-05-31 18:59:02 +01:00
Peter Steinberger
63de51ab96 refactor(cron): clarify sqlite store internals 2026-05-31 16:38:49 +01:00
Peter Steinberger
c797f02ff7 fix(diagnostics): surface Bonjour state in support exports 2026-05-31 13:57:17 +01:00
Peter Steinberger
ae4ab2a41f refactor(logging): share stuck recovery session refs 2026-05-31 11:10:06 +01:00
Feelw00
b4cdc33fc9 fix(logging): align diagnostic recovery dedup keys
Align diagnostic stuck-session recovery in-flight dedup with the runtime recovery key. The coordinator now dedups by logical session ref only, so a mid-flight generation bump cannot emit a phantom `session.recovery.requested` event that runtime recovery skips as already in flight.

Adds a regression test for the idle-queued stall path where a queued message bumps generation while recovery is pending.

Fixes #88010
2026-05-31 11:00:42 +01:00
Peter Steinberger
b9fe0894a6 chore(lint): enable additional cleanup rules 2026-05-31 08:16:11 +01:00
Peter Steinberger
deb7bc6539 chore(lint): enable readability lint rules 2026-05-31 07:17:57 +01:00
Peter Steinberger
51dee73a5d perf: cache log timestamp formatters 2026-05-31 06:09:22 +01:00
Peter Steinberger
00d8d7ead0 refactor: extract normalization core package
Extract shared normalization/coercion helpers into private @openclaw/normalization-core workspace package while preserving existing plugin SDK helper subpaths.\n\nAlso keeps direct normalization-core imports internal, wires UI/build/loader resolution, and replaces the slow PR network CodeQL lane with a fast added-line boundary scan while retaining full CodeQL for scheduled/manual runs.\n\nVerification: local moved tests, plugin SDK boundary tests, extension loader tests, agents-support shard, UI build/test, build artifacts, lint, workflow guards, autoreview, and GitHub CI passed on PR head 963d893715.
2026-05-31 01:33:00 +01:00
Peter Steinberger
57c88dd46e chore: remove more unused internal helpers 2026-05-30 23:26:29 +01:00
Peter Steinberger
005da57957 Move cron persistence to SQLite (#88285)
* refactor: move cron persistence to sqlite

* fix: repair sqlite cron migration regressions

* fix: move cron legacy migration to doctor

* test: align cron sqlite migration fixtures

* test: fix cron sqlite rebase gates

* test: align cron sqlite runtime tests

* test: fix doctor e2e migration mock

* test: fix doctor shard e2e isolation

* test: fix infra child-process mocks
2026-05-30 21:03:41 +01:00
Peter Steinberger
de1dfab03e refactor: move terminal core into package (#88279)
* refactor: move terminal core into package

* refactor: move terminal module files

* fix: clean terminal package CI followups

* test: update lint suppression allowlist

* fix: ship terminal core runtime aliases
2026-05-30 11:07:45 +02:00
Peter Steinberger
f4c6c0aec4 refactor: extract net policy package 2026-05-29 09:45:14 +01:00
Peter Steinberger
b6ef874220 fix: reject partial numeric parsing 2026-05-28 10:51:32 -04:00
Peter Steinberger
4252f07ff0 fix: reduce gateway warning noise
Reduce repeated gateway warning noise in startup/auth retry paths while preserving credential mismatch and rate-limit audit visibility.

Also hardens empty embedded-assistant retry handling by carrying lifecycle state through the missing-assistant guard, and keeps the relevant regression coverage in gateway and agent tests.
2026-05-28 13:17:57 +01:00
Peter Steinberger
bb46b79d3c refactor: internalize OpenClaw agent runtime (#85341)
* refactor: extract agent core package

Introduce packages/agent-core as the OpenClaw-owned home for reusable agent loop, harness, session, prompt, and runtime dependency contracts.

* refactor: extract shared llm runtime

Move provider model registries, stream wrappers, OAuth helpers, and LLM utilities into src/llm with plugin-sdk barrels instead of depending on the old embedded runtime layout.

* refactor: remove pi runtime internals

Rename remaining Pi-shaped agent surfaces to OpenClaw agent runtime names, delete obsolete Pi docs and package graph checks, and add the third-party notice for incorporated code.

* refactor: tighten agent session runtime

Make agent-core/runtime dependencies explicit, consolidate compaction and session transcript helpers, and move model/session helpers behind OpenClaw-owned contracts.

* refactor: remove static model and pi auth paths

Drop static model catalogs and Pi auth bridges, move model/provider facts to manifest-owned runtime contracts, and harden internal embedded-agent utilities.

* refactor: remove legacy provider compat paths

* docs: remove agent parity notes

* fix: skip provider wildcard metadata parsing

* refactor: share session extension sdk loading

* refactor: inline acpx proxy error formatter

* refactor: fold edit recovery into edit tool

* fix: accept extension batch separator

* test: align startup provider plugin expectations

* fix: restore provider-scoped release discovery

* test: align static asset packaging expectations

* fix: run static provider catalogs during scoped discovery

* fix: add provider entry catalogs for scoped live discovery

* fix: load lightweight provider catalog entries

* fix: refresh provider-scoped plugin metadata

* fix: keep provider catalog entries on release live path

* fix: keep static manifest models in release live checks

* fix: harden release model discovery

* fix: reduce OpenAI live cache probe reasoning

* fix: disable OpenAI cache probe reasoning

* ci: extend OpenAI gateway live timeout

* fix: extend live gateway model budget

* fix: stabilize release validation regressions

* fix: honor provider aliases in model rows

* fix: stabilize release validation lanes

* fix: stabilize release memory qa

* ci: stabilize release validation lanes

* ci: prefer ipv4 for live docker node calls

* fix: restore shared tool-call stream wrapper

* ci: remove legacy pi test shard alias

* fix: clean up embedded agent test drift

* fix: stabilize runtime alias status

* fix: clean up embedded agent ci drift

* fix: restore release ci invariants

* fix: clean up post-rebase runtime drift

* fix: restore release ci checks

* fix: restore release ci after rebase

* fix: remove stale pi runtime path

* test: align compaction runtime expectations

* test: update plugin prerelease expectations

* fix: handle claude live tool approvals

* fix: stabilize release validation gates

* fix: finish agent runtime import

* test: finish post-rebase agent runtime mocks

* fix: keep codex compaction native

* fix: stabilize codex app-server hook tests

* test: isolate codex diagnostic active run

* test: remove codex diagnostic completion race

# Conflicts:
#	extensions/codex/src/app-server/run-attempt.test.ts

* ci: fix full release manifest performance run id

* refactor: narrow llm plugin sdk boundary

* chore: drop generated google boundary stamps

* fix: repair rebase fallout

* fix: clean up rebased runtime references

* fix: decode codex jwt payloads as base64url

* fix: preserve shipped pi runtime alias

* fix: add scoped sdk virtual modules

* fix: decode llm codex oauth jwt as base64url

* fix: avoid stale vertex adc negative cache

* fix: harden tool arg decoding and codeql path

* fix: keep vertex adc negative checks live

* refactor: consolidate codex jwt and edit helpers

* fix: await codex oauth node runtime imports

* fix: preserve sdk tool and notice contracts

* fix: preserve shipped compat config boundaries

* fix: align codex oauth callback host

* fix: terminate agent-core loop streams on failure

* fix: keep codex oauth callback alive during fallback

* ci: include session tools in critical codeql scans

* fix: keep Cloudflare Anthropic provider auth header

* docs: redirect legacy pi runtime pages

* fix: honor bundled web provider compat discovery

* fix: protect session output spill files

* fix: keep legacy agent dir env blocked

* fix: contain auto-discovered skill symlinks

* fix: harden agent core sdk proxy surfaces

* fix: restore approval reaction sdk compat

* fix: keep live docker runs bounded

* fix: keep codex oauth redirect host aligned

* fix: resolve post-rebase agent runtime drift

* fix: redact anthropic oauth parse failures

* fix: preserve responses strict tool shaping

* fix: repair agent runtime rebase cleanup

* docs: redirect retired parity pages

* fix: bound auto-discovered resources to roots

* fix: repair post-rebase agent test drift

* fix: preserve bundled provider allowlist migration

* fix: preserve manifest-owned provider aliases

* fix: declare photon image dependency

* fix: keep provider headers out of proxy body

* fix: preserve shipped env aliases

* fix: refresh control ui i18n generated state

* fix: quote read fallback paths

* fix: preview edits through configured backend

* test: satisfy core test typecheck

* fix: preserve ZAI usage auth fallback

* test: repair codex diagnostic test

* fix: repair agent runtime rebase drift

* test: finish embedded runner import rename

* fix: repair agent runtime rebase integrations

* test: align compaction oauth fallback expectations

* fix: allow sdk-auth session models

* fix: update doctor tool schema import

* fix: preserve bedrock plugin region

* fix: stream harmony-like prose immediately

* ci: include session runtime in codeql shards

* fix: repair latest rebase integrations

* fix: honor explicit codex websocket transport

* fix: keep openai-compatible credentials provider-scoped

* fix: refresh sdk api baseline after rebase

* fix: route cli runtime aliases through openclaw harness

* test: rename stale harness mock expectation

* test: rename embedded agent overflow calls

* test: clean embedded auth test wording

* test: use openclaw stream types in deepinfra cache test

* fix: refresh sdk api baseline on latest main

* fix: honor bundled discovery compat allowlists

* fix: refresh sdk api baseline after latest rebase

* fix: remove stale rebase imports

* test: rename stale model catalog mock

* test: mock renamed doctor runtime modules

* fix: map canonical kimi env auth

* fix: use internal model registry in bench script

* fix: migrate deepinfra provider catalog entry

* fix: enforce builtin tool suppression

* fix: route compaction auth and proxy payloads safely

* refactor: prune unused llm registry leftovers

* test: update codex hooks session import

* test: fix model picker ci coverage

* test: align model picker auth mock types
2026-05-27 19:24:04 +01:00
Josh Avant
3349fe21bb Fix embedded session file ownership race (#87159)
* fix: serialize embedded session file attempts

* test: update reply runtime mock for session file lookup

* fix: thread session files into diagnostic recovery

* fix: attach causes to session owner abort errors
2026-05-26 23:18:27 -07:00
Samuel Soares da Silva
286964cd6a fix(diagnostics): recover orphaned session activity
Recover idle queued sessions whose diagnostic activity retained stale ownerless model or tool calls by classifying them as recoverable session.stuck after the usual recovery gates. Yield the event loop before stale session-lock process inspection so sync process lookup cannot monopolize lock contention paths.

Docs now describe the widened session.stuck telemetry contract for recoverable stale bookkeeping, including ownerless activity. Thanks @samuelsoaress.

Refs #84903.

Co-authored-by: samuelsoaress <samuelsoares177778@gmail.com>
2026-05-27 02:47:42 +01:00
Peter Steinberger
2035f38ab2 perf: trim gateway runtime hotspots 2026-05-27 02:17:29 +01:00
Peter Steinberger
a43da0c8c5 perf: reduce gateway cpu churn 2026-05-27 01:52:27 +01:00
Onur Solmaz
6c18c212e9 fix(logging): preserve env placeholders during redaction
* fix(logging): preserve env placeholders during redaction

* fix(logging): honor custom redaction patterns

* fix(logging): preserve generic env placeholders

---------

Co-authored-by: Onur Solmaz <onur@Onurs-MacBook-Pro.local>
2026-05-27 00:49:34 +08:00
clawsweeper[bot]
23e9bc8c0b fix(diagnostics): track model stream progress (#86757)
Summary:
- The PR updates diagnostics to mark streamed model chunks as run progress, keeps silent model calls abortable after the stuck-session timeout, and adds regression coverage for stream progress and recovery behavior.
- PR surface: Source +54, Tests +229. Total +283 across 6 files.
- Reproducibility: yes. at source level: current main tracks model-call start/end activity but streamed chunks ... covery keys on stale lastProgressAgeMs. I did not run a live local-provider repro in this read-only review.

Automerge notes:
- PR branch already contained follow-up commit before automerge: fix(diagnostics): track model stream progress
- PR branch already contained follow-up commit before automerge: test(diagnostics): cover silent local model aborts
- PR branch already contained follow-up commit before automerge: fix(diagnostics): skip stream progress when disabled

Validation:
- ClawSweeper review passed for head fcc74d9869.
- Required merge gates passed before the squash merge.

Prepared head SHA: fcc74d9869
Review: https://github.com/openclaw/openclaw/pull/86757#issuecomment-4540111930

Co-authored-by: Onur Solmaz <2453968+osolmaz@users.noreply.github.com>
Co-authored-by: clawsweeper <274271284+clawsweeper[bot]@users.noreply.github.com>
Co-authored-by: clawsweeper[bot] <274271284+clawsweeper[bot]@users.noreply.github.com>
Approved-by: osolmaz
Co-authored-by: osolmaz <2453968+osolmaz@users.noreply.github.com>
2026-05-26 04:47:11 +00:00
Vincent Koc
ef8619d5f5 fix(diagnostics): expose missing telemetry signals (#86682) 2026-05-26 01:10:59 +01:00
Peter Steinberger
77d9ac30bb refactor: reuse shared coercion helpers (#86419)
* refactor: share talk event metric extraction

* refactor: reuse shared coercion helpers

* refactor: reuse shared primitive guards

* refactor: reuse shared record guard

* refactor: reuse shared primitive helpers

* refactor: reuse shared string guards

* refactor: reuse shared non-empty string guard

* refactor: share plugin primitive coercion helpers

* refactor: reuse plugin coercion helpers

* refactor: reuse plugin coercion helpers in more plugins

* refactor: reuse channel coercion helpers

* refactor: reuse monitor coercion helpers

* refactor: reuse provider coercion helpers

* refactor: reuse core coercion helpers

* refactor: reuse runtime coercion helpers

* refactor: reuse helper coercion in codex paths

* refactor: reuse helper coercion in runtime paths

* refactor: reuse codex app-server coercion helpers

* refactor: reuse codex record helpers

* refactor: reuse migration and qa record helpers

* refactor: reuse feishu and core helper guards

* refactor: reuse browser and policy coercion helpers

* refactor: reuse memory wiki record helper

* refactor: share boolean coercion helpers

* refactor: reuse finite number coercion

* refactor: reuse trimmed string list helpers

* refactor: reuse string list normalization

* refactor: reuse remaining string list helpers

* refactor: reuse string entry normalizer

* refactor: share sorted string helpers

* refactor: share string list normalization

* test: preserve command registry browser imports

* refactor: reuse trimmed list helpers

* refactor: reuse string dedupe helpers

* refactor: reuse local dedupe helpers

* refactor: reuse more string dedupe helpers

* refactor: reuse command string dedupe helpers

* refactor: dedupe memory path lists with helper

* refactor: expose string dedupe helpers to plugins

* refactor: reuse core string dedupe helpers

* refactor: reuse shared unique value helpers

* refactor: reuse unique helpers in agent utilities

* refactor: reuse unique helpers in config plumbing

* refactor: reuse unique helpers in extensions

* refactor: reuse unique helpers in core utilities

* refactor: reuse unique helpers in qa plugins

* refactor: reuse unique helpers in memory plugins

* refactor: reuse unique helpers in channel plugins

* refactor: reuse unique helpers in core tails

* refactor: reuse unique helper in comfy workflow

* refactor: reuse unique helpers in test utilities

* refactor: expose unique value helper to plugins

* refactor: reuse unique helpers for numeric lists

* refactor: replace index dedupe filters

* refactor: reuse string entry normalization

* refactor: reuse string normalization in plugin helpers

* refactor: reuse string normalization in extension helpers

* refactor: reuse string normalization in channel parsers

* refactor: reuse string normalization in memory search

* refactor: reuse string normalization in provider parsers

* refactor: reuse string normalization in qa helpers

* refactor: reuse string normalization in infra parsers

* refactor: reuse string normalization in messaging parsers

* refactor: reuse string normalization in core parsers

* refactor: reuse string normalization in extension parsers

* refactor: reuse string normalization in remaining parsers

* refactor: reuse string normalization in final parser spots

* refactor: reuse string normalization in qa media helpers

* refactor: reuse normalization in provider and media lists

* refactor: reuse normalization for remaining set filters

* refactor: reuse normalization in policy allowlists

* refactor: reuse normalization in session and owner lists

* refactor: centralize primitive string lists

* refactor: reuse lowercase entry helpers

* refactor: reuse sorted string helpers

* refactor: reuse unique trimmed helpers

* refactor: reuse string normalization helpers

* refactor: reuse catalog string helpers

* refactor: reuse remaining string helpers

* refactor: simplify remaining list normalization

* refactor: reuse codex auth order normalization

* chore: refresh plugin sdk api baseline

* fix: make shared string sorting deterministic

* chore: refresh plugin sdk api baseline

* fix: align host env security ordering
2026-05-25 21:20:41 +01:00
Peter Steinberger
78a1e7dfe6 fix(logging): keep string failure codes on EPIPE 2026-05-25 20:56:56 +01:00
Peter Steinberger
623a60a2b7 fix(logging): preserve failure exit on EPIPE 2026-05-25 20:56:56 +01:00
Pavel Zakharov
2aa5f1771f fix(logging): exit on stdout/stderr EPIPE instead of spinning
When the gateway process is orphaned after a systemd service restart,
the parent's journal pipe closes and every write to stdout/stderr returns
EPIPE. The previous handler swallowed it with a bare return, so background
loops (config file watcher, etc.) kept firing and the process spun at
100% CPU indefinitely.

Exit cleanly with code 0 instead — a process whose own output streams
are broken has nowhere to log and no reason to keep running.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-05-25 20:56:56 +01:00
Peter Steinberger
baab4cf045 refactor(logging): share diagnostic message lifecycle
Refactor diagnostic queued/state/processed emission into a shared helper used by dispatch and isolated cron turns.

Preserve dispatch processed-event behavior, cron queue-depth symmetry, and final cron session-id adoption while adding focused helper coverage and reviewer comments for the non-obvious invariants.
2026-05-25 19:48:45 +01:00
Peter Steinberger
459e89ada8 fix(gateway): keep session tool mirrors under pressure
Reverts the diagnostic queue-pressure suppression of non-terminal session tool mirrors from PR 84846 while keeping PR 86503 recipient dedupe intact. Session-only Control UI subscribers keep receiving tool lifecycle mirrors; overlapping run and session subscribers still receive one canonical run-scoped frame. Verification: focused gateway and diagnostic tests, diff check, changed check, and autoreview all passed.
2026-05-25 15:22:52 +01:00
Chunyue Wang
6f76d9f246 fix(diagnostics): reclaim wedged session lanes with a stale leaked active run (#86056)
* fix(diagnostics): reclaim wedged session lanes with a stale leaked active run

A group session lane could wedge permanently (#85639): an embedded run that dies
abnormally leaves a stale ACTIVE_EMBEDDED_RUNS handle, so the diagnostic heartbeat
classifies the lane stale_session_state (recoveryEligible without allowActiveAbort)
while stuck-session recovery reads the leaked isEmbeddedPiRunActive flag and skips
with active_reply_work — a tautology that keeps the lane forever. The age-based
escape never fires because ageMs (last-activity) resets on every incoming queued
message.

Make the active-run skip a liveness check: before keeping the lane, consult the
run's real forward-progress age (lastProgressAgeMs, not refreshed by incoming
messages). If a run flagged active has made no forward progress past the resolved
diagnostics.stuckSessionAbortMs threshold (threaded through the recovery request;
falls back to a 5-minute floor) with queued work waiting, treat it as a
leaked/dead handle and reclaim it (abort + drain + force-clear) instead of
skipping. A genuinely progressing run, or one within an operator-raised
threshold, is kept.

Fixes #85639

* test(diagnostics): cover stale active run recovery

---------

Co-authored-by: Peter Steinberger <steipete@gmail.com>
2026-05-25 14:20:59 +01:00
Galin Iliev
42bdc949f2 fix(gateway): back off session tool mirrors under pressure (#84846)
Co-authored-by: Galin Iliev <Galin.Iliev@microsoft.com>
2026-05-24 18:34:37 -07:00
Gaurav Prasad
558a05b6d0 feat(diagnostics): classify skill and tool usage (#80370)
Co-authored-by: Vincent Koc <vincentkoc@ieee.org>
2026-05-23 16:08:55 +08:00
Vincent Koc
bdcaac06c6 fix(diagnostics): bound diagnostic buffers 2026-05-22 22:01:41 +08:00
Josh Avant
221f5349b5 fix: redact denied exec failure params (#85140)
* fix exec failure log redaction

* docs: add exec redaction changelog

* test: satisfy redaction lint
2026-05-21 18:10:50 -07:00
Josh Avant
0ab1449215 Fix Discord session recovery abort ownership (#85100)
* fix auto-reply abort ownership

* add changelog for #85100
2026-05-21 14:34:18 -07:00
clawsweeper[bot]
5955f354f7 fix(status): add gateway delivery health telemetry (#85016)
Summary:
- This replacement PR adds inbound delivery diagnostic events, gateway status counters and warnings, transport ... ut, Prometheus/OpenTelemetry metrics, docs, changelog, and regression coverage for gateway delivery health.
- Reproducibility: no. high-confidence live reproduction of the original Feishu failure was run here. Source i ... ch/turn telemetry, and the source PR supplies after-fix live output for the connected WebChat gateway path.

Automerge notes:
- PR branch already contained follow-up commit before automerge: fix(types): restore PR conflict resolution type checks

Validation:
- ClawSweeper review passed for head 6ffe08a9c7.
- Required merge gates passed before the squash merge.

Prepared head SHA: 6ffe08a9c7
Review: https://github.com/openclaw/openclaw/pull/85016#issuecomment-4510224436

Co-authored-by: Andi Liao <liaoandi95@gmail.com>
Co-authored-by: clawsweeper[bot] <274271284+clawsweeper[bot]@users.noreply.github.com>
Approved-by: takhoffman
Co-authored-by: takhoffman <781889+takhoffman@users.noreply.github.com>
2026-05-21 16:55:29 +00:00
clawsweeper[bot]
c9b6a8b408 fix(trajectory): tolerate partial skill snapshot entries in support capture (#84797)
Summary:
- This PR filters partial skill snapshot entries in trajectory support metadata, accepts nullish support-redaction paths, adds regression tests, and records the fix in the changelog.
- Reproducibility: yes. Source inspection on current main shows undefined skill path/name values can reach str ... and the related source PR provides redacted live before/after gateway logs for the symlink-escape scenario.

Automerge notes:
- PR branch already contained follow-up commit before automerge: fix(trajectory): tighten test types for partial skill entries
- PR branch already contained follow-up commit before automerge: fix(trajectory): tolerate partial skill snapshot entries in support c…

Validation:
- ClawSweeper review passed for head ecb3df6c08.
- Required merge gates passed before the squash merge.

Prepared head SHA: ecb3df6c08
Review: https://github.com/openclaw/openclaw/pull/84797#issuecomment-4504703074

Co-authored-by: Luke Boyett <46942646+lukeboyett@users.noreply.github.com>
Co-authored-by: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Co-authored-by: clawsweeper <274271284+clawsweeper[bot]@users.noreply.github.com>
Co-authored-by: clawsweeper[bot] <274271284+clawsweeper[bot]@users.noreply.github.com>
Approved-by: takhoffman
Co-authored-by: takhoffman <781889+takhoffman@users.noreply.github.com>
2026-05-21 05:38:57 +00:00
Peter Steinberger
4f4d108639 chore(lint): remove underscore-dangle allow list (#83542)
* chore(lint): reduce underscore-dangle exceptions

* chore(lint): reduce more underscore exceptions

* chore(lint): remove underscore-dangle allow list

* fix(lint): repair underscore cleanup regressions

* test(lint): track version define suppression
2026-05-18 14:56:06 +01:00
Eva (agent)
08ecc518ec fix: close adversarial release stability gaps 2026-05-18 14:58:10 +05:30
Eva (agent)
54f87184f0 fix: recover stale release-stability stalls and auth loops 2026-05-18 14:58:10 +05:30
Eva (agent)
a792068d9d fix: harden release stability diagnostics 2026-05-18 14:58:10 +05:30
Peter Steinberger
669786595d fix(logging): avoid false liveness backlog warnings 2026-05-17 07:15:17 +01:00
吴杨帆
cc9117f729 fix: validate limit edge cases and voicecall numeric flags (#82679)
Fix diagnostics/session usage limit handling and voice-call numeric CLI validation.

- Treat explicit zero, negative, and non-finite diagnostics/session limits as empty results instead of falling back to defaults.
- Reject invalid, non-finite, and fractional voice-call numeric flags.
- Add focused tests and a live repro proof for the canonical edge cases.

Fixes #82646, #82650, #82651, #82653.

Co-authored-by: wuyangfan <1102042793@qq.com>
2026-05-17 00:11:46 +01:00