Commit Graph

260 Commits

Author SHA1 Message Date
Vincent Koc
a171600d1d test: isolate broad unit state 2026-05-17 02:32:58 +08:00
Vincent Koc
b6b33ad6d3 test: harden broad qa timing 2026-05-17 02:32:57 +08:00
Vincent Koc
e06782d5e7 fix(gateway): land linked diagnostics fixes
Fix logs.tail credential-header redaction and JSON-mode gateway transport errors.\n\nFixes #66832.\nFixes #79108.\nSupersedes #67041.\nSupersedes #79233.\n\nCo-authored-by: Mil Wang <mingjwan@microsoft.com>\nCo-authored-by: Andy Ye <35905412+TurboTheTurtle@users.noreply.github.com>
2026-05-17 02:05:02 +08:00
Josh Avant
a1e208ee26 Recover stale embedded tool calls during gateway diagnostics (#82369)
* fix(gateway): recover stale embedded tool calls

* chore(changelog): note stale embedded tool recovery
2026-05-15 19:36:59 -05:00
Peter Steinberger
a383baac03 test(logging): fix stalled recovery threshold test 2026-05-15 12:25:31 +01:00
Peter Steinberger
778ad09ff2 test(logging): derive diagnostic abort threshold 2026-05-15 10:38:43 +01:00
Peter Steinberger
e30be460e1 fix: shorten stalled Codex recovery window 2026-05-15 10:19:37 +01:00
Josh Lehman
f64feab47a fix: prevent codex app-server surrogate stalls 2026-05-14 19:59:23 +01:00
Josh Avant
fd244fd76d Fix Telegram polling ingress under event-loop stalls (#81746)
* fix telegram polling ingress under event-loop stalls

* add changelog for telegram ingress fix
2026-05-14 03:35:06 -05:00
Josh Avant
4e1f59010e fix(gateway): suppress startup liveness warnings (#81699)
* fix(gateway): suppress startup liveness warnings

* docs(changelog): note diagnostic startup grace fix
2026-05-14 01:39:46 -05:00
Peter Steinberger
694ca50e97 Revert "refactor: move runtime state to SQLite"
This reverts commit f91de52f0d.
2026-05-13 13:33:38 +01:00
Peter Steinberger
f91de52f0d refactor: move runtime state to SQLite
* refactor: remove stale file-backed shims

* fix: harden sqlite state ci boundaries

* refactor: store matrix idb snapshots in sqlite

* fix: satisfy rebased CI guardrails

* refactor: store current conversation bindings in sqlite table

* refactor: store tui last sessions in sqlite table

* refactor: reset sqlite schema history

* refactor: drop unshipped sqlite table migration

* refactor: remove plugin index file rollback

* refactor: drop unshipped sqlite sidecar migrations

* refactor: remove runtime commitments kv migration

* refactor: preserve kysely sync result types

* refactor: drop unshipped sqlite schema migration table

* test: keep session usage coverage sqlite-backed

* refactor: keep sqlite migration doctor-only

* refactor: isolate device legacy imports

* refactor: isolate push voicewake legacy imports

* refactor: isolate remaining runtime legacy imports

* refactor: tighten sqlite migration guardrails

* test: cover sqlite persisted enum parsing

* refactor: isolate legacy update and tui imports

* refactor: tighten sqlite state ownership

* refactor: move legacy imports behind doctor

* refactor: remove legacy session row lookup

* refactor: canonicalize memory transcript locators

* refactor: drop transcript path scope fallbacks

* refactor: drop runtime legacy session delivery pruning

* refactor: store tts prefs only in sqlite

* refactor: remove cron store path runtime

* refactor: use cron sqlite store keys

* refactor: rename telegram message cache scope

* refactor: read memory dreaming status from sqlite

* refactor: rename cron status store key

* refactor: stop remembering transcript file paths

* test: use sqlite locators in agent fixtures

* refactor: remove file-shaped commitments and cron store surfaces

* refactor: keep compaction transcript handles out of session rows

* refactor: derive transcript handles from session identity

* refactor: derive runtime transcript handles

* refactor: remove gateway session locator reads

* refactor: remove transcript locator from session rows

* refactor: store raw stream diagnostics in sqlite

* refactor: remove file-shaped transcript rotation

* refactor: hide legacy trajectory paths from runtime

* refactor: remove runtime transcript file bridges

* refactor: repair database-first rebase fallout

* refactor: align tests with database-first state

* refactor: remove transcript file handoffs

* refactor: sync post-compaction memory by transcript scope

* refactor: run codex app-server sessions by id

* refactor: bind codex runtime state by session id

* refactor: pass memory transcripts by sqlite scope

* refactor: remove transcript locator cleanup leftovers

* test: remove stale transcript file fixtures

* refactor: remove transcript locator test helper

* test: make cron sqlite keys explicit

* test: remove cron runtime store paths

* test: remove stale session file fixtures

* test: use sqlite cron keys in diagnostics

* refactor: remove runtime delivery queue backfill

* test: drop fake export session file mocks

* refactor: rename acp session read failure flag

* refactor: rename acp row session key

* refactor: remove session store test seams

* refactor: move legacy session parser tests to doctor

* refactor: reindex managed memory in place

* refactor: drop stale session store wording

* refactor: rename session row helpers

* refactor: rename sqlite session entry modules

* refactor: remove transcript locator leftovers

* refactor: trim file-era audit wording

* refactor: clean managed media through sqlite

* fix: prefer explicit agent for exports

* fix: use prepared agent for session resets

* fix: canonicalize legacy codex binding import

* test: rename state cleanup helper

* docs: align backup docs with sqlite state

* refactor: drop legacy Pi usage auth fallback

* refactor: move legacy auth profile imports to doctor

* refactor: keep Pi model discovery auth in memory

* refactor: remove MSTeams legacy learning key fallback

* refactor: store model catalog config in sqlite

* refactor: use sqlite model catalog at runtime

* refactor: remove model json compatibility aliases

* refactor: store auth profiles in sqlite

* refactor: seed copied auth profiles in sqlite

* refactor: make auth profile runtime sqlite-addressed

* refactor: migrate hermes secrets into sqlite auth store

* refactor: move plugin install config migration to doctor

* refactor: rename plugin index audit checks

* test: drop auth file assumptions

* test: remove legacy transcript file assertions

* refactor: drop legacy cli session aliases

* refactor: store skill uploads in sqlite

* refactor: keep subagent attachments in sqlite vfs

* refactor: drop subagent attachment cleanup state

* refactor: move legacy session aliases to doctor

* refactor: require node 24 for sqlite state runtime

* refactor: move provider caches into sqlite state

* fix: harden virtual agent filesystem

* refactor: enforce database-first runtime state

* refactor: rename compaction transcript rotation setting

* test: clean sqlite refactor test types

* refactor: consolidate sqlite runtime state

* refactor: model session conversations in sqlite

* refactor: stop deriving cron delivery from session keys

* refactor: stop classifying sessions from key shape

* refactor: hydrate announce targets from typed delivery

* refactor: route heartbeat delivery from typed sqlite context

* refactor: tighten typed sqlite session routing

* refactor: remove session origin routing shadow

* refactor: drop session origin shadow fixtures

* perf: query sqlite vfs paths by prefix

* refactor: use typed conversation metadata for sessions

* refactor: prefer typed session routing metadata

* refactor: require typed session routing metadata

* refactor: resolve group tool policy from typed sessions

* refactor: delete dead session thread info bridge

* Show Codex subscription reset times in channel errors (#80456)

* feat(plugin-sdk): consolidate session workflow APIs

* fix(agents): allow read-only agent mount reads

* [codex] refresh plugin regression fixtures

* fix(agents): restore compaction gateway logs

* test: tighten gateway startup assertions

* Redact persisted secret-shaped payloads [AI] (#79006)

* test: tighten device pair notify assertions

* test: tighten hermes secret assertions

* test: assert matrix client error shapes

* test: assert config compat warnings

* fix(heartbeat): remap cron-run exec events to session keys (#80214)

* fix(codex): route btw through native side threads

* fix(auth): accept friendly OpenAI order for Codex profiles

* fix(codex): rotate auth profiles inside harness

* fix: keep browser status page probe within timeout

* test: assert agents add outputs

* test: pin cron read status

* fix(agents): avoid Pi resource discovery stalls

Co-authored-by: dataCenter430 <titan032000@gmail.com>

* fix: retire timed-out codex app-server clients

* test: tighten qa lab runtime assertions

* test: check security fix outputs

* test: verify extension runtime messages

* feat(wake): expose typed sessionKey on wake protocol + system event CLI

* fix(gateway): await session_end during shutdown drain and track channel + compaction lifecycle paths (#57790)

* test: guard talk consult call helper

* fix(codex): scale context engine projection (#80761)

* fix(codex): scale context engine projection

* fix: document Codex context projection scaling

* fix: document Codex context projection scaling

* fix: document Codex context projection scaling

* fix: document Codex context projection scaling

* chore: align Codex projection changelog

* chore: realign Codex projection changelog

* fix: isolate Codex projection patch

---------

Co-authored-by: Eva (agent) <eva+agent-78055@100yen.org>
Co-authored-by: Josh Lehman <josh@martian.engineering>

* refactor: move agent runtime state toward piless

* refactor: remove cron session reaper

* refactor: move session management to sqlite

* refactor: finish database-first state migration

* chore: refresh generated sqlite db types

* refactor: remove stale file-backed shims

* test: harden kysely type coverage

# Conflicts:
#	.agents/skills/kysely-database-access/SKILL.md
#	src/infra/kysely-sync.types.test.ts
#	src/proxy-capture/store.sqlite.test.ts
#	src/state/openclaw-agent-db.test.ts
#	src/state/openclaw-state-db.test.ts

* refactor: remove cron store path runtime

* refactor: keep compaction transcript handles out of session rows

* refactor: derive embedded transcripts from sqlite identity

* refactor: remove embedded transcript locator handoff

* refactor: remove runtime transcript file bridges

* refactor: remove transcript file handoffs

* refactor: remove MSTeams legacy learning key fallback

* refactor: store model catalog config in sqlite

* refactor: use sqlite model catalog at runtime

# Conflicts:
#	docs/cli/secrets.md
#	docs/gateway/authentication.md
#	docs/gateway/secrets.md

* fix: keep oauth sibling sync sqlite-local

# Conflicts:
#	src/commands/onboard-auth.test.ts

* refactor: remove task session store maintenance

# Conflicts:
#	src/commands/tasks.ts

* refactor: keep diagnostics in state sqlite

* refactor: enforce database-first runtime state

* refactor: consolidate sqlite runtime state

* Show Codex subscription reset times in channel errors (#80456)

* fix(codex): refresh subscription limit resets

* fix(codex): format reset times for channels

* Update CHANGELOG with latest changes and fixes

Updated CHANGELOG with recent fixes and improvements.

* fix(codex): keep command load failures on codex surface

* fix(codex): format account rate limits as rows

* fix(codex): summarize account limits as usage status

* fix(codex): simplify account limit status

* test: tighten subagent announce queue assertion

* test: tighten session delete lifecycle assertions

* test: tighten cron ops assertions

* fix: track cron execution milestones

* test: tighten hermes secret assertions

* test: assert matrix sync store payloads

* test: assert config compat warnings

* fix(codex): align btw side thread semantics

* fix(codex): honor codex fallback blocking

* fix(agents): avoid Pi resource discovery stalls

* test: tighten codex event assertions

* test: tighten cron assertions

* Fix Codex app-server OAuth harness auth

* refactor: move agent runtime state toward piless

* refactor: move device and push state to sqlite

* refactor: move runtime json state imports to doctor

* refactor: finish database-first state migration

* chore: refresh generated sqlite db types

* refactor: clarify cron sqlite store keys

* refactor: remove stale file-backed shims

* refactor: bind codex runtime state by session id

* test: expect sqlite trajectory branch export

* refactor: rename session row helpers

* fix: keep legacy device identity import in doctor

* refactor: enforce database-first runtime state

* refactor: consolidate sqlite runtime state

* build: align pi contract wrappers

* chore: repair database-first rebase

* refactor: remove session file test contracts

* test: update gateway session expectations

* refactor: stop routing from session compatibility shadows

* refactor: stop persisting session route shadows

* refactor: use typed delivery context in clients

* refactor: stop echoing session route shadows

* refactor: repair embedded runner rebase imports

# Conflicts:
#	src/agents/pi-embedded-runner/run/attempt.tool-call-argument-repair.ts

* refactor: align pi contract imports

* refactor: satisfy kysely sync helper guard

* refactor: remove file transcript bridge remnants

* refactor: remove session locator compatibility

* refactor: remove session file test contracts

* refactor: keep rebase database-first clean

* refactor: remove session file assumptions from e2e

* docs: clarify database-first goal state

* test: remove legacy store markers from sqlite runtime tests

* refactor: remove legacy store assumptions from runtime seams

* refactor: align sqlite runtime helper seams

* test: update memory recall sqlite audit mock

* refactor: align database-first runtime type seams

* test: clarify doctor cron legacy store names

* fix: preserve sqlite session route projections

* test: fix copilot token cache test syntax

* docs: update database-first proof status

* test: align database-first test fixtures

* docs: update database-first proof status

* refactor: clean extension database-first drift

* test: align agent session route proof

* test: clarify doctor legacy path fixtures

* chore: clean database-first changed checks

* chore: repair database-first rebase markers

* build: allow baileys git subdependency

* chore: repair exp-vfs rebase drift

* chore: finish exp-vfs rebase cleanup

* chore: satisfy rebase lint drift

* chore: fix qqbot rebase type seam

* chore: fix rebase drift leftovers

* fix: keep auth profile oauth secrets out of sqlite

* fix: repair rebase drift tests

* test: stabilize pairing request ordering

* test: use source manifests in plugin contract checks

* fix: restore gateway session metadata after rebase

* fix: repair database-first rebase drift

* fix: clean up database-first rebase fallout

* test: stabilize line quick reply receipt time

* fix: repair extension rebase drift

* test: keep transcript redaction tests sqlite-backed

* fix: carry injected transcript redaction through sqlite

* chore: clean database branch rebase residue

* fix: repair database branch CI drift

* fix: repair database branch CI guard drift

* fix: stabilize oauth tls preflight test

* test: align database branch fast guards

* test: repair build artifact boundary guards

* chore: clean changelog rebase markers

---------

Co-authored-by: pashpashpash <nik@vault77.ai>
Co-authored-by: Eva <eva@100yen.org>
Co-authored-by: stainlu <stainlu@newtype-ai.org>
Co-authored-by: Jason Zhou <jason.zhou.design@gmail.com>
Co-authored-by: Ruben Cuevas <hi@rubencu.com>
Co-authored-by: Pavan Kumar Gondhi <pavangondhi@gmail.com>
Co-authored-by: Shakker <shakkerdroid@gmail.com>
Co-authored-by: Kaspre <36520309+Kaspre@users.noreply.github.com>
Co-authored-by: dataCenter430 <titan032000@gmail.com>
Co-authored-by: Kaspre <kaspre@gmail.com>
Co-authored-by: pandadev66 <nova.full.stack@outlook.com>
Co-authored-by: Eva <admin@100yen.org>
Co-authored-by: Eva (agent) <eva+agent-78055@100yen.org>
Co-authored-by: Josh Lehman <josh@martian.engineering>
Co-authored-by: jeffjhunter <support@aipersonamethod.com>
2026-05-13 13:15:12 +01:00
clawsweeper[bot]
faaa7efef0 fix(security): inline redact into appendSessionTranscriptMessage (#79645)
Merged via squash.

Prepared head SHA: da91ab6cf1
Co-authored-by: app/clawsweeper <274271284+clawsweeper[bot]@users.noreply.github.com>
Co-authored-by: hxy91819 <8814856+hxy91819@users.noreply.github.com>
Reviewed-by: @hxy91819
2026-05-13 16:31:04 +08:00
Ayaan Zaidi
3d3a2399b5 fix(logging): track reply runs in diagnostics 2026-05-13 10:48:46 +05:30
Pavan Kumar Gondhi
39bcd1e088 fix(plugins): scan installed dependency runtime code [AI] (#81066)
* fix: scan installed plugin dependency code

* addressing review-skill

* addressing review-skill

* addressing codex review

* addressing codex review

* addressing codex review

* addressing codex review

* addressing codex review

* addressing codex review

* addressing codex review

* addressing codex review

* addressing ci

* addressing ci

* docs: add changelog entry for PR merge
2026-05-13 10:26:24 +05:30
nimbleenigma
277eb16652 fix: redact persisted tool result details
Refresh PR #80444 on current upstream main.
2026-05-12 20:13:26 -04:00
Shakker
201dd735a0 test: log diagnostic messages 2026-05-12 22:00:59 +01:00
Peter Steinberger
2b2664f17f test: guard logging mock calls 2026-05-11 22:34:04 +01:00
Peter Steinberger
32b8925cfa test: guard core sdk null helpers 2026-05-11 21:16:17 +01:00
Peter Steinberger
5f00135a44 test: guard support helper assertions 2026-05-11 20:16:47 +01:00
Val Alexander
342ae551ae fix(logging): reduce active-only liveness noise
Summary:
- Reduce active-only diagnostic liveness noise by emitting transient event-loop max delay samples as info-level telemetry.
- Keep warnings for queued or waiting work and for sustained high P99 loop delay.
- Cover the active-only path in the diagnostic stability tests and changelog.

Verification:
- pnpm format:check src/logging/diagnostic-stability.ts src/logging/diagnostic.test.ts CHANGELOG.md
- pnpm test src/logging/diagnostic.test.ts
- pnpm check:changed
- GitHub PR checks passed on head 25e674fe41.
2026-05-11 07:09:12 -05:00
Peter Steinberger
da9e77f6e9 test: tighten diagnostic support export assertions 2026-05-11 09:11:15 +01:00
Pavan Kumar Gondhi
17ceca86d6 Redact persisted secret-shaped payloads [AI] (#79006)
* fix: redact persisted secret-shaped payloads

* docs: add changelog entry for PR merge
2026-05-11 12:52:32 +05:30
Shakker
5cbcf2a549 test: tighten stability bundle assertions 2026-05-11 05:58:21 +01:00
Shakker
aa6e640304 test: tighten diagnostic memory assertions 2026-05-11 05:57:34 +01:00
Shakker
469421144c test: tighten logger context assertions 2026-05-11 05:55:37 +01:00
Shakker
ebe201a3e3 test: tighten browser logger settings assertion 2026-05-11 05:21:32 +01:00
Shakker
069fd6787a test: tighten logging config assertions 2026-05-11 05:11:25 +01:00
Peter Steinberger
0fba29b495 test: tighten stuck session recovery log assertions 2026-05-10 23:21:14 +01:00
Peter Steinberger
47317236ab test: tighten diagnostic logger assertions 2026-05-10 22:20:02 +01:00
Peter Steinberger
52111a0d5b test: tighten diagnostic stability assertions 2026-05-10 20:52:42 +01:00
Ayaan Zaidi
529bfdbaca refactor(codex): simplify native tool diagnostics 2026-05-10 18:21:44 +05:30
Keshav's Bot
a624988ae6 fix(codex): mark native tools active for diagnostics 2026-05-10 18:21:44 +05:30
Peter Steinberger
e384932497 test: clear logging diagnostic broad matchers 2026-05-10 12:52:43 +01:00
Ruben Cuevas
d0bba218e4 fix(gateway): redact fast-path console logs 2026-05-09 23:55:37 -04:00
Shakker
9ec2831c20 test: avoid channel catalog import in logger setup 2026-05-09 23:21:44 +01:00
Peter Steinberger
a4e3b4b6e3 test: tighten logging assertions 2026-05-09 12:40:54 +01:00
Shakker
d19735cbb3 test: tighten shared empty array assertions 2026-05-09 05:37:51 +01:00
Andi Liao
21d758c644 fix(logging): redact http client secrets (#75033)
Summary:
- The branch expands shared logging redaction for quoted HTTP client secret and auth-header fields, adds redaction/error-chain regression tests, and adds an Unreleased changelog entry.
- Reproducibility: yes. at source level. Current main routes formatted Error.cause text through shared redaction, and the current default patterns do not match the quoted HTTP client secret/auth fields covered by this PR.

Automerge notes:
- No ClawSweeper repair was needed after automerge opt-in.

Validation:
- ClawSweeper review passed for head fcd308be71.
- Required merge gates passed before the squash merge.

Prepared head SHA: fcd308be71
Review: https://github.com/openclaw/openclaw/pull/75033#issuecomment-4351803001

Co-authored-by: Andi Liao <liaoandi95@gmail.com>
2026-05-09 03:53:59 +00:00
Peter Steinberger
ce515dbf4d test: avoid misc count filter allocations 2026-05-08 22:05:41 +01:00
Peter Steinberger
6d785f01e8 test: avoid diagnostic count filter allocations 2026-05-08 21:50:46 +01:00
Peter Steinberger
84c4e66288 test: avoid zero length filter assertions 2026-05-08 21:49:20 +01:00
Peter Steinberger
7011bbb953 test: require logging async callbacks 2026-05-08 19:44:10 +01:00
Peter Steinberger
e6fa674b75 test: tighten parser null assertions 2026-05-08 16:42:26 +01:00
Peter Steinberger
2806e22caa test: tighten gateway logging string assertions 2026-05-08 14:35:32 +01:00
Peter Steinberger
d0f484d024 test: clarify runtime event assertions 2026-05-08 13:06:18 +01:00
Peter Steinberger
9ef37d1907 test: tighten assertions and harness coverage 2026-05-08 05:28:12 +01:00
Mert Başar
029ca8c268 feat(agents): implement state-aware failover and lane suspension
Summary:
- Persist quota-suspension state transitions and reload fresh suspension state before failover handoff injection.
- Restore suspended lanes to configured concurrency and share failover-to-suspension reason mapping across fallback and embedded runner paths.
- Export model.failover diagnostics via OTLP and cover queueing/resume behavior with regressions.

Verification:
- pnpm test src/config/sessions/store.pruning.integration.test.ts src/process/command-queue.test.ts src/agents/session-suspension.test.ts src/agents/model-fallback.test.ts extensions/diagnostics-otel/src/service.test.ts
- git diff --check
- pnpm exec oxfmt --check --threads=1 on changed TypeScript files
- GitHub checks: 92 successful, 0 pending, 0 failed on head 962146be88
- Review threads: none unresolved
2026-05-07 18:34:05 -05:00
Vincent Koc
e2501b2d6d fix(diagnostics): export Talk metrics after SDK refactor
Adds bounded Talk lifecycle/audio diagnostics and session recovery metrics for OTEL, Prometheus, and stability snapshots after the Talk SDK/session refactor. Includes changelog/docs updates and Testbox/live proof.
2026-05-06 02:01:52 -07:00
Peter Steinberger
538605ff44 [codex] Extract filesystem safety primitives (#77918)
* refactor: extract filesystem safety primitives

* refactor: use fs-safe for file access helpers

* refactor: reuse fs-safe for media reads

* refactor: use fs-safe for image reads

* refactor: reuse fs-safe in qqbot media opener

* refactor: reuse fs-safe for local media checks

* refactor: consume cleaner fs-safe api

* refactor: align fs-safe json option names

* fix: preserve fs-safe migration contracts

* refactor: use fs-safe primitive subpaths

* refactor: use grouped fs-safe subpaths

* refactor: align fs-safe api usage

* refactor: adapt private state store api

* chore: refresh proof gate

* refactor: follow fs-safe json api split

* refactor: follow reduced fs-safe surface

* build: default fs-safe python helper off

* fix: preserve fs-safe plugin sdk aliases

* refactor: consolidate fs-safe usage

* refactor: unify fs-safe store usage

* refactor: trim fs-safe temp workspace usage

* refactor: hide low-level fs-safe primitives

* build: use published fs-safe package

* fix: preserve outbound recovery durability after rebase

* chore: refresh pr checks
2026-05-06 02:15:17 +01:00