* perf(plugins): thread metadata snapshot and discovery through hot paths
With the snapshot memo now actually hitting, route the snapshot's
manifestRegistry and discovery through the helper chains that already
had fast paths for them. Eliminates redundant per-call rebuilds at
two big amplifiers.
- Provider resolve paths (resolvePluginProviders /
isPluginProvidersLoadInFlight / resolveOwningPluginIdsForProvider /
resolveExternalAuthProfilesWithPlugins) self-service a snapshot once
at the public entry, then thread it as a separate required arg
through resolvePluginProviderLoadBase,
resolveExplicitProviderOwnerPluginIds, and the setup/runtime load
state helpers. Inner reads change from
'params.pluginMetadataSnapshot?.x' to 'snapshot.x', no more
enrichedParams clone. loadPluginManifestRegistryForInstalledIndex
fires drop ~685 -> ~10 per cold start.
- Bundled-channel / auto-enable chain accepts an optional
PluginDiscoveryResult. discoverOpenClawPlugins is fired once during
snapshot building (resolveInstalledPluginIndexRegistry already
produced it internally; now bubbled up through
loadInstalledPluginIndexWithDiscovery, PluginRegistrySnapshotResult,
and onto PluginMetadataSnapshot.discovery). load-context reads
metadataSnapshot.discovery and passes it through
applyPluginAutoEnable, so the bundled-channel cascade
(collectConfiguredChannelIds, listBundledChannelIdsWith*,
listPotentialConfiguredChannelPresenceSignals) short-circuits
instead of each leaf re-firing discovery. Persisted-cache path is
unchanged: no discovery on the snapshot, downstream chain handles
its own fallback (pre-PR behavior on that path).
* test(plugins): isolate snapshot memo across tests that mock manifest registry
The snapshot memo is now process-scoped and effective (~98% hit rate).
Three test files were depending on cache misses (because the broken
cache returned them) — each test would set up its own
loadPluginManifestRegistry mock and expect a fresh derive. With the
cache fixed, an earlier test's mocked registry now leaks into later
tests in the same file.
- io.write-config.test.ts: afterEach now clears the snapshot memo so
the 'demo' plugin mocked in the first test does not survive into
'keeps shipped plugin install config records when index migration
fails', which expects an empty registry to surface the 'plugin not
found: demo' warning.
- gateway/model-pricing-cache.ts: resetGatewayModelPricingCacheForTest
also clears the memo. Tests in model-pricing-cache.test.ts assert
loadPluginManifestRegistryForInstalledIndex was called; the memo
hit otherwise skips the call.
- providers.test.ts: vi.doMock loadPluginMetadataSnapshot to wrap the
existing loadPluginManifestRegistryMock fixture. The plumbing
commit added an auto-fetch fall-through in
resolveOwningPluginIdsForProvider; without the mock, providers
tests hit real disk reads and return empty registries (which is
what surfaced as 9 unrelated-looking failures in the prior CI
run).
* fix(plugins): preserve setup.cliBackends owner matching in provider scan
resolveOwningPluginIdsForProvider now also checks plugin.setup?.cliBackends.
The pre-PR no-registry fallback used resolvePluginContributionOwners which
includes both top-level cliBackends and setup.cliBackends; the PR's manifest
scan replacement was missing the setup case.
* fix(plugins): inherit active registry workspaceDir before loading metadata snapshot
isPluginProvidersLoadInFlight and resolvePluginProviders now resolve
env and workspaceDir once at the entry point (falling back to
getActivePluginRegistryWorkspaceDir) and pass them into both
loadPluginMetadataSnapshot and resolvePluginProviderLoadBase. Pre-fix
the snapshot used params.workspaceDir raw while the load base inherited
the active workspace, so workspace-scoped provider plugins could be
absent from the snapshot manifest registry even though owner resolution
expected them.
Regression test asserts the snapshot mock receives the active
workspaceDir when the caller omits it.
* perf(gateway): thread discovery into applyPluginAutoEnable call sites
Every gateway applyPluginAutoEnable call now passes the snapshot's
PluginDiscoveryResult so the bundled-channel cascade (collectConfiguredChannelIds
→ listBundledChannelIdsWith* → listPotentialConfiguredChannelPresenceSignals)
short-circuits instead of each leaf re-firing discovery.
Startup-time sites pull discovery from the snapshot/lookup-table they already
hold:
- server-plugin-bootstrap.ts (pluginLookUpTable)
- server-startup-plugins.ts (pluginMetadataSnapshot)
- server-startup-config.ts (pluginMetadataSnapshot)
- server-plugins.ts (pluginLookUpTable, both call sites)
Per-RPC sites (server.impl getRuntimeConfig callback, server-methods/channels
status + start handlers, server-methods/send) source discovery via
getCurrentPluginMetadataSnapshot using the runtime config to validate
compatibility. Falls through to the original slow path when the snapshot is
absent or incompatible.
Adds owner-level startup trace attribution for gateway auth, plugin loading, lookup counts, and plugin sidecar services.
Verification:
- node scripts/run-vitest.mjs src/plugins/startup-trace-segment.test.ts src/plugins/services.test.ts src/plugins/loader.test.ts src/gateway/server-startup-config.secrets.test.ts
- pnpm build
- pnpm check
CI override:
- Red checks are unrelated baseline noise. The failed CI shard is src/cli/plugins-install-persist.test.ts, which fails on origin/main 336ba2a2b3 with the same missing resolveIsNixMode mock export. PR #81738 touches gateway/plugin startup trace files and CHANGELOG.md, not the failing CLI plugin install test.
Thanks @samzong.
Co-authored-by: samzong <13782141+samzong@users.noreply.github.com>
Summary:
- The PR changes Gateway reload planning, CLI plugin install-index writes, plugin runtime/cache cleanup, docs, changelog, and tests so plugin enable/disable hot reloads while install/update/uninstall stay restart-backed.
- Reproducibility: yes. The earlier blocker has a source-level reproduction: run an external plugin install/up ... watches config and only the managed plugin index changes; the PR now tests that path and queues a restart.
ClawSweeper fixups:
- Included follow-up commit: fix: hot reload plugin management changes
- Included follow-up commit: fix(clawsweeper): address review for automerge-openclaw-openclaw-7597…
- Ran the ClawSweeper repair loop before final review.
Validation:
- ClawSweeper review passed for head 860594f722.
- Required merge gates passed before the squash merge.
Prepared head SHA: 860594f722
Review: https://github.com/openclaw/openclaw/pull/75976#issuecomment-4363168379
Co-authored-by: Peter Steinberger <steipete@gmail.com>
Co-authored-by: clawsweeper <274271284+clawsweeper[bot]@users.noreply.github.com>
Simplify plugin installation and runtime loading around package-manager-owned dependencies, with Jiti reserved for local/TS fallback paths.
Also scans npm plugin install roots so hoisted transitive dependencies are covered by dependency denylist and node_modules symlink checks.
* Revert "fix(memory/dreaming): surface blocked status when heartbeat is disabled for main (#69875)"
This reverts commit 529577e045.
Making way for the dreaming-vs-heartbeat decoupling from Josh's
josh/dreaming-isolated-cron-fix branch, which moves the managed dreaming
cron to isolated agent turns (sessionTarget: "isolated") so dreaming no
longer requires heartbeat to fire. Once the cron no longer rides the
heartbeat path, the blocked-reason observability has nothing left to
report — removing it cleanly here before the cherry-picks land.
* openclaw-3ba.1: move managed dreaming cron to isolated agent turns
* openclaw-46d: claim cron runs before embedded attempts
* openclaw-575: disable managed dreaming cron delivery
* openclaw-575: accept wrapped dreaming cron tokens
* openclaw-ccd: filter cron and wrapper transcript noise from dreaming corpus
* openclaw-cd9: filter archived, cron, and heartbeat transcript noise from dreaming corpus
* openclaw-cd9: suppress role-label reflection tags in rem dreaming
* openclaw-b49: stop narrative timeouts from blocking dreaming cron
* openclaw-b49: keep managed dreaming cron out of diary subagents
* openclaw-ff9: restore cron dream diary generation without serial waits
* openclaw-ff9: run dreaming narratives with lightweight isolated subagent lanes
* openclaw-ff9: detach cron dream diary generation from run completion
* openclaw-ff9: defer cron diary task startup until after cron completion
* doctor/cron: migrate stale managed dreaming jobs to isolated agent turns
After the dreaming cron moved off the heartbeat path to sessionTarget:
"isolated" + payload.kind: "agentTurn" (see the preceding memory-core
changes), users with existing ~/.openclaw/cron/jobs.json entries in the
old sessionTarget: "main" + payload.kind: "systemEvent" shape still
carry stale jobs until the gateway restart reconcile rewrites them.
Add a dreaming-specific cron migration to the existing
maybeRepairLegacyCronStore doctor path so "openclaw doctor" (and
"openclaw doctor --fix") rewrites those jobs without needing a gateway
restart. Match lives in a new doctor-cron-dreaming-payload-migration
helper alongside the existing legacy-delivery and store-migration files.
The matching uses the memory-core managed-job name and description tag
plus the short-term-promotion payload token. Constants are mirrored
from extensions/memory-core/src/dreaming.ts and commented so a future
rename in memory-core is a visible drift point here too.
* memory/dreaming: tighten cron-token match to known wrapper, not substring
The previous match relaxed the line check from 'trimmed line equals token'
to 'line contains token anywhere as a substring' to accept the
`[cron:<id>] <token>` wrapper that isolated-cron turns add. Substring
matching also let any user message embedding the token mid-sentence
trigger the dream-promotion hook, and was flagged by both Greptile and
Aisle on PR #70737.
Replace it with strip-the-known-prefix-then-exact-match: keep the
`[cron:<id>]` wrapper case working, reject every other variant. Add
focused unit coverage that the bare token, the wrapped token, and bare
multiline cases match while embedded / code-fenced / arbitrarily-wrapped
variants do not.
* memory/dreaming: drop assistant followup only on assistant-side signals
Per PR #70737 review (aisle-research-bot, Medium): the previous logic
suppressed the next assistant message whenever the prior user message
matched a 'generated prompt' pattern (`[cron:...]`,
`System (untrusted): ...`, heartbeat prompts, exec-completion events).
Real users can type those same patterns, which let a user exfiltrate
real assistant replies from the dreaming corpus by prefixing their own
prompt — the assistant's reply would be silently dropped.
Remove the cross-message coupling. Assistant-side machinery (silent
replies, system wrappers) is already dropped by sanitizeSessionText,
which is the right layer for that filter. Add an explicit assistant-side
HEARTBEAT_TOKEN check to keep the legitimate `HEARTBEAT_OK` ack drop
working without depending on the prior user message. Add a regression
test exercising the spoofing scenario.
* doctor/cron: assert mirrored dreaming constants stay in sync
Per PR #70737 review (greptile-apps): the doctor migration mirrors three
constants (MANAGED_DREAMING_CRON_NAME, MANAGED_DREAMING_CRON_TAG,
DREAMING_SYSTEM_EVENT_TEXT) from extensions/memory-core/src/dreaming.ts.
A future rename in either file would silently break the migration.
Add a vitest unit that reads both files and asserts the literals match.
Manually verified the assertion fires with a clear error when one side
diverges. Adds no runtime cost; sits in the regular test pipeline.
* fix(memory): stabilize dreaming CI checks
* memory/dreaming: skip eager narrative session cleanup when detached
Per PR #70737 review (chatgpt-codex-connector, P2): runDreamingSweepPhases
called deleteNarrativeSessionBestEffort synchronously right after each
phase. Once narrative generation moved to detached mode (queued via
queueMicrotask), the eager cleanup races the writer: the session is
deleted before the queued subagent run reads it, silently dropping cron
diary entries.
Skip the eager cleanup branch when params.detachNarratives is true.
generateAndAppendDreamNarrative still runs its own deleteSession in the
finally{} block, so the cleanup intent is preserved without the race.
Heartbeat-driven (non-detached) runs keep the original eager-cleanup
behavior.
* fix(plugin-sdk): restore heartbeat-summary re-export
Per PR #70737 review (chatgpt-codex-connector, P1): the revert of
PR #69875 dropped the `heartbeat-summary` re-export from
`openclaw/plugin-sdk/infra-runtime`. That subpath shipped publicly two
days earlier, so removing it is technically a breaking change to a
public SDK surface — third-party plugins importing
`isHeartbeatEnabledForAgent` / `resolveHeartbeatIntervalMs` from this
path would fail with no replacement contract introduced.
Restore the re-export. Costs nothing to keep; the helpers are already
public via `../infra/heartbeat-summary.ts`. SDK additions are by
default backwards-compatible (CLAUDE.md), so removing within days of
introduction violates that intent.
* changelog: note dreaming decoupling from heartbeat
Refs PR #70737.
---------
Co-authored-by: Josh Lehman <josh@martian.engineering>
* plugins: include resolved workspaceDir in provider hook cache keys
resolveProviderPluginsForHooks, resolveProviderPluginsForCatalogHooks, and
resolveProviderRuntimePlugin used the raw params.workspaceDir for cache keys
and plugin-id discovery while resolvePluginProviders already fell back to
the active registry workspace. Resolve workspaceDir once at the top of each
function so cache keys, candidate filtering, and loading all use the same
workspace root.
* fix(plugins): inherit runtime workspace for snapshot loads
* test(gateway): stub runtime registry seam
* fix(plugins): restore workspace fallback after rebase
---------
Co-authored-by: Vincent Koc <vincentkoc@ieee.org>