Adds owner-level startup trace attribution for gateway auth, plugin loading, lookup counts, and plugin sidecar services.
Verification:
- node scripts/run-vitest.mjs src/plugins/startup-trace-segment.test.ts src/plugins/services.test.ts src/plugins/loader.test.ts src/gateway/server-startup-config.secrets.test.ts
- pnpm build
- pnpm check
CI override:
- Red checks are unrelated baseline noise. The failed CI shard is src/cli/plugins-install-persist.test.ts, which fails on origin/main 336ba2a2b3 with the same missing resolveIsNixMode mock export. PR #81738 touches gateway/plugin startup trace files and CHANGELOG.md, not the failing CLI plugin install test.
Thanks @samzong.
Co-authored-by: samzong <13782141+samzong@users.noreply.github.com>
Summary:
- The PR changes Gateway reload planning, CLI plugin install-index writes, plugin runtime/cache cleanup, docs, changelog, and tests so plugin enable/disable hot reloads while install/update/uninstall stay restart-backed.
- Reproducibility: yes. The earlier blocker has a source-level reproduction: run an external plugin install/up ... watches config and only the managed plugin index changes; the PR now tests that path and queues a restart.
ClawSweeper fixups:
- Included follow-up commit: fix: hot reload plugin management changes
- Included follow-up commit: fix(clawsweeper): address review for automerge-openclaw-openclaw-7597…
- Ran the ClawSweeper repair loop before final review.
Validation:
- ClawSweeper review passed for head 860594f722.
- Required merge gates passed before the squash merge.
Prepared head SHA: 860594f722
Review: https://github.com/openclaw/openclaw/pull/75976#issuecomment-4363168379
Co-authored-by: Peter Steinberger <steipete@gmail.com>
Co-authored-by: clawsweeper <274271284+clawsweeper[bot]@users.noreply.github.com>
Simplify plugin installation and runtime loading around package-manager-owned dependencies, with Jiti reserved for local/TS fallback paths.
Also scans npm plugin install roots so hoisted transitive dependencies are covered by dependency denylist and node_modules symlink checks.
* Revert "fix(memory/dreaming): surface blocked status when heartbeat is disabled for main (#69875)"
This reverts commit 529577e045.
Making way for the dreaming-vs-heartbeat decoupling from Josh's
josh/dreaming-isolated-cron-fix branch, which moves the managed dreaming
cron to isolated agent turns (sessionTarget: "isolated") so dreaming no
longer requires heartbeat to fire. Once the cron no longer rides the
heartbeat path, the blocked-reason observability has nothing left to
report — removing it cleanly here before the cherry-picks land.
* openclaw-3ba.1: move managed dreaming cron to isolated agent turns
* openclaw-46d: claim cron runs before embedded attempts
* openclaw-575: disable managed dreaming cron delivery
* openclaw-575: accept wrapped dreaming cron tokens
* openclaw-ccd: filter cron and wrapper transcript noise from dreaming corpus
* openclaw-cd9: filter archived, cron, and heartbeat transcript noise from dreaming corpus
* openclaw-cd9: suppress role-label reflection tags in rem dreaming
* openclaw-b49: stop narrative timeouts from blocking dreaming cron
* openclaw-b49: keep managed dreaming cron out of diary subagents
* openclaw-ff9: restore cron dream diary generation without serial waits
* openclaw-ff9: run dreaming narratives with lightweight isolated subagent lanes
* openclaw-ff9: detach cron dream diary generation from run completion
* openclaw-ff9: defer cron diary task startup until after cron completion
* doctor/cron: migrate stale managed dreaming jobs to isolated agent turns
After the dreaming cron moved off the heartbeat path to sessionTarget:
"isolated" + payload.kind: "agentTurn" (see the preceding memory-core
changes), users with existing ~/.openclaw/cron/jobs.json entries in the
old sessionTarget: "main" + payload.kind: "systemEvent" shape still
carry stale jobs until the gateway restart reconcile rewrites them.
Add a dreaming-specific cron migration to the existing
maybeRepairLegacyCronStore doctor path so "openclaw doctor" (and
"openclaw doctor --fix") rewrites those jobs without needing a gateway
restart. Match lives in a new doctor-cron-dreaming-payload-migration
helper alongside the existing legacy-delivery and store-migration files.
The matching uses the memory-core managed-job name and description tag
plus the short-term-promotion payload token. Constants are mirrored
from extensions/memory-core/src/dreaming.ts and commented so a future
rename in memory-core is a visible drift point here too.
* memory/dreaming: tighten cron-token match to known wrapper, not substring
The previous match relaxed the line check from 'trimmed line equals token'
to 'line contains token anywhere as a substring' to accept the
`[cron:<id>] <token>` wrapper that isolated-cron turns add. Substring
matching also let any user message embedding the token mid-sentence
trigger the dream-promotion hook, and was flagged by both Greptile and
Aisle on PR #70737.
Replace it with strip-the-known-prefix-then-exact-match: keep the
`[cron:<id>]` wrapper case working, reject every other variant. Add
focused unit coverage that the bare token, the wrapped token, and bare
multiline cases match while embedded / code-fenced / arbitrarily-wrapped
variants do not.
* memory/dreaming: drop assistant followup only on assistant-side signals
Per PR #70737 review (aisle-research-bot, Medium): the previous logic
suppressed the next assistant message whenever the prior user message
matched a 'generated prompt' pattern (`[cron:...]`,
`System (untrusted): ...`, heartbeat prompts, exec-completion events).
Real users can type those same patterns, which let a user exfiltrate
real assistant replies from the dreaming corpus by prefixing their own
prompt — the assistant's reply would be silently dropped.
Remove the cross-message coupling. Assistant-side machinery (silent
replies, system wrappers) is already dropped by sanitizeSessionText,
which is the right layer for that filter. Add an explicit assistant-side
HEARTBEAT_TOKEN check to keep the legitimate `HEARTBEAT_OK` ack drop
working without depending on the prior user message. Add a regression
test exercising the spoofing scenario.
* doctor/cron: assert mirrored dreaming constants stay in sync
Per PR #70737 review (greptile-apps): the doctor migration mirrors three
constants (MANAGED_DREAMING_CRON_NAME, MANAGED_DREAMING_CRON_TAG,
DREAMING_SYSTEM_EVENT_TEXT) from extensions/memory-core/src/dreaming.ts.
A future rename in either file would silently break the migration.
Add a vitest unit that reads both files and asserts the literals match.
Manually verified the assertion fires with a clear error when one side
diverges. Adds no runtime cost; sits in the regular test pipeline.
* fix(memory): stabilize dreaming CI checks
* memory/dreaming: skip eager narrative session cleanup when detached
Per PR #70737 review (chatgpt-codex-connector, P2): runDreamingSweepPhases
called deleteNarrativeSessionBestEffort synchronously right after each
phase. Once narrative generation moved to detached mode (queued via
queueMicrotask), the eager cleanup races the writer: the session is
deleted before the queued subagent run reads it, silently dropping cron
diary entries.
Skip the eager cleanup branch when params.detachNarratives is true.
generateAndAppendDreamNarrative still runs its own deleteSession in the
finally{} block, so the cleanup intent is preserved without the race.
Heartbeat-driven (non-detached) runs keep the original eager-cleanup
behavior.
* fix(plugin-sdk): restore heartbeat-summary re-export
Per PR #70737 review (chatgpt-codex-connector, P1): the revert of
PR #69875 dropped the `heartbeat-summary` re-export from
`openclaw/plugin-sdk/infra-runtime`. That subpath shipped publicly two
days earlier, so removing it is technically a breaking change to a
public SDK surface — third-party plugins importing
`isHeartbeatEnabledForAgent` / `resolveHeartbeatIntervalMs` from this
path would fail with no replacement contract introduced.
Restore the re-export. Costs nothing to keep; the helpers are already
public via `../infra/heartbeat-summary.ts`. SDK additions are by
default backwards-compatible (CLAUDE.md), so removing within days of
introduction violates that intent.
* changelog: note dreaming decoupling from heartbeat
Refs PR #70737.
---------
Co-authored-by: Josh Lehman <josh@martian.engineering>
* plugins: include resolved workspaceDir in provider hook cache keys
resolveProviderPluginsForHooks, resolveProviderPluginsForCatalogHooks, and
resolveProviderRuntimePlugin used the raw params.workspaceDir for cache keys
and plugin-id discovery while resolvePluginProviders already fell back to
the active registry workspace. Resolve workspaceDir once at the top of each
function so cache keys, candidate filtering, and loading all use the same
workspace root.
* fix(plugins): inherit runtime workspace for snapshot loads
* test(gateway): stub runtime registry seam
* fix(plugins): restore workspace fallback after rebase
---------
Co-authored-by: Vincent Koc <vincentkoc@ieee.org>
Plugin subagent dispatch used a hardcoded synthetic client carrying
operator.admin, operator.approvals, and operator.pairing for all
runtime.subagent.* calls. Plugin HTTP routes with auth:"plugin" require
no gateway auth by design, so an unauthenticated external request could
drive admin-only gateway methods (sessions.delete, agent.run) through
the subagent runtime.
Propagate the real gateway client into the plugin runtime request scope
when one is available. Plugin HTTP routes now run inside a scoped
runtime client: auth:"plugin" routes receive a non-admin synthetic
operator.write client; gateway-authenticated routes retain admin-capable
scopes. The security boundary is enforced at the HTTP handler level.
Fixes GHSA-xw77-45gv-p728
* feat(context-engine): add ContextEngine interface and registry
Introduce the pluggable ContextEngine abstraction that allows external
plugins to register custom context management strategies.
- ContextEngine interface with lifecycle methods: bootstrap, ingest,
ingestBatch, afterTurn, assemble, compact, prepareSubagentSpawn,
onSubagentEnded, dispose
- Module-level singleton registry with registerContextEngine() and
resolveContextEngine() (config-driven slot selection)
- LegacyContextEngine: pass-through implementation wrapping existing
compaction behavior for 100% backward compatibility
- ensureContextEnginesInitialized() guard for safe one-time registration
- 19 tests covering contract, registry, resolution, and legacy parity
* feat(plugins): add context-engine slot and registerContextEngine API
Wire the ContextEngine abstraction into the plugin system so external
plugins can register context engines via the standard plugin API.
- Add 'context-engine' to PluginKind union type
- Add 'contextEngine' slot to PluginSlotsConfig (default: 'legacy')
- Wire registerContextEngine() through OpenClawPluginApi
- Export ContextEngine types from plugin-sdk for external consumers
- Restore proper slot-based resolution in registry
* feat(context-engine): wire ContextEngine into agent run lifecycle
Integrate the ContextEngine abstraction into the core agent run path:
- Resolve context engine once per run (reused across retries)
- Bootstrap: hydrate canonical store from session file on first run
- Assemble: route context assembly through pluggable engine
- Auto-compaction guard: disable built-in auto-compaction when
the engine declares ownsCompaction (prevents double-compaction)
- AfterTurn: post-turn lifecycle hook for ingest + background
compaction decisions
- Overflow compaction: route through contextEngine.compact()
- Dispose: clean up engine resources in finally block
- Notify context engine on subagent lifecycle events
Legacy engine: all lifecycle methods are pass-through/no-op, preserving
100% backward compatibility for users without a context engine plugin.
* feat(plugins): add scoped subagent methods and gateway request scope
Expose runtime.subagent.{run, waitForRun, getSession, deleteSession}
so external plugins can spawn sub-agent sessions without raw gateway
dispatch access.
Uses AsyncLocalStorage request-scope bridge to dispatch internally via
handleGatewayRequest with a synthetic operator client. Methods are only
available during gateway request handling.
- Symbol.for-backed global singleton for cross-module-reload safety
- Fallback gateway context for non-WS dispatch paths (Telegram/WhatsApp)
- Set gateway request scope for all handlers, not just plugin handlers
- 3 staleness tests for fallback context hardening
* feat(context-engine): route /compact and sessions.get through context engine
Wire the /compact command and sessions.get handler through the pluggable
ContextEngine interface.
- Thread tokenBudget and force parameters to context engine compact
- Route /compact through contextEngine.compact() when registered
- Wire sessions.get as runtime alias for plugin subagent dispatch
- Add .pebbles/ to .gitignore
* style: format with oxfmt 0.33.0
Fix duplicate import (ControlUiRootState in server.impl.ts) and
import ordering across all changed files.
* fix: update extension test mocks for context-engine types
Add missing subagent property to bluebubbles PluginRuntime mock.
Add missing registerContextEngine to lobster OpenClawPluginApi mock.
* fix(subagents): keep deferred delete cleanup retryable
* style: format run attempt for CI
* fix(rebase): remove duplicate embedded-run imports
* test: add missing gateway context mock export
* fix: pass resolved auth profile into afterTurn compaction
Ensure the embedded runner forwards resolved auth profile context into
legacy context-engine compaction params on the normal afterTurn path,
matching overflow compaction behavior. This allows downstream LCM
summarization to use the intended provider auth/profile consistently.
Also fix strict TS typing in external-link token dedupe and align an
attempt unit test reasoningLevel value with the current ReasoningLevel
enum.
Regeneration-Prompt: |
We were debugging context-engine compaction where downstream summary
calls were missing the right auth/profile context in normal afterTurn
flow, while overflow compaction already propagated it. Preserve current
behavior and keep changes additive: thread the resolved authProfileId
through run -> attempt -> legacy compaction param builder without
broad refactors.
Add tests that prove the auth profile is included in afterTurn legacy
params and that overflow compaction still passes it through run
attempts. Keep existing APIs stable, and only adjust small type issues
needed for strict compilation.
* fix: remove duplicate imports from rebase
* feat: add context-engine system prompt additions
* fix(rebase): dedupe attempt import declarations
* test: fix fetch mock typing in ollama autodiscovery
* fix(test): add registerContextEngine to diffs extension mock APIs
* test(windows): use path.delimiter in ios-team-id fixture PATH
* test(cron): add model formatting and precedence edge case tests
Covers:
- Provider/model string splitting (whitespace, nested paths, empty segments)
- Provider normalization (casing, aliases like bedrock→amazon-bedrock)
- Anthropic model alias normalization (opus-4.5→claude-opus-4-5)
- Precedence: job payload > session override > config default
- Sequential runs with different providers (CI flake regression pattern)
- forceNew session preserving stored model overrides
- Whitespace/empty model string edge cases
- Config model as string vs object format
* test(cron): fix model formatting test config types
* test(phone-control): add registerContextEngine to mock API
* fix: re-export ChannelKind from config-reload-plan
* fix: add subagent mock to plugin-runtime-mock test util
* docs: add changelog fragment for context engine PR #22201