* Plugins: add native ask dialog for before_tool_call hooks
Extend the before_tool_call plugin hook with a requireApproval return field
that pauses agent execution and waits for real user approval via channels
(Telegram, Discord, /approve command) instead of relying on the agent to
cooperate with a soft block.
- Add requireApproval field to PluginHookBeforeToolCallResult with id, title,
description, severity, timeout, and timeoutBehavior options
- Extend runModifyingHook merge callback to receive hook registration so
mergers can stamp pluginId; always invoke merger even for the first result
- Make ExecApprovalManager generic so it can be reused for plugin approvals
- Add plugin.approval.request/waitDecision/resolve gateway methods with
schemas, scope guards, and broadcast events
- Handle requireApproval in pi-tools via two-phase gateway RPC with fallback
to soft block when the gateway is unavailable
- Extend the exec approval forwarder with plugin approval message builders
and forwarding methods
- Update /approve command to fall back to plugin.approval.resolve when exec
approval lookup fails
- Document before_tool_call requireApproval in hooks docs and unified
/approve behavior in exec-approvals docs
* Plugins: simplify plugin approval code
- Extract mergeParamsWithApprovalOverrides helper to deduplicate param
merge logic in before_tool_call hook handling
- Use idiomatic conditional spread syntax in toolContext construction
- Extract callApprovalMethod helper in /approve command to eliminate
duplicated callGateway calls
- Simplify plugin approval schema by removing unnecessary Type.Union
with Type.Null on optional fields
- Extract normalizeTrimmedString helper for turn source field trimming
* Tests: add plugin approval wiring and /approve fallback coverage
Fix 3 broken assertions expecting old "Exec approval" message text.
Add tests for the /approve command's exec→plugin fallback path,
plugin approval method registration and scope authorization, and
handler factory key verification.
* UI: wire plugin approval events into the exec approval overlay
Handle plugin.approval.requested and plugin.approval.resolved gateway
events by extending the existing exec approval queue with a kind
discriminator. Plugin approvals reuse the same overlay, queue management,
and expiry timer, with branched rendering for plugin-specific content
(title, description, severity). The decision handler routes resolve calls
to the correct gateway method based on kind.
* fix: read plugin approval fields from nested request payload
The gateway broadcasts plugin approval payloads with title, description,
severity, pluginId, agentId, and sessionKey nested inside the request
object (PluginApprovalRequestPayload), not at the top level. Fix the
parser to read from the correct location so the overlay actually appears.
* feat: invoke plugin onResolution callback after approval decision
Adds onResolution to the requireApproval type and invokes it after
the user resolves the approval dialog, enabling plugins to react to
allow-always vs allow-once decisions.
* docs: add onResolution callback to requireApproval hook documentation
* test: fix /approve assertion for unified approval response text
* docs: regenerate plugin SDK API baseline
* docs: add changelog entry for plugin approval hooks
* fix: harden plugin approval hook reliability
- Add APPROVAL_NOT_FOUND error code so /approve fallback uses structured
matching instead of fragile string comparison
- Check block before requireApproval so higher-priority plugin blocks
cannot be overridden by a lower-priority approval
- Race waitDecision against abort signal so users are not stuck waiting
for the full approval timeout after cancelling a run
- Use null consistently for missing pluginDescription instead of
converting to undefined
- Add comments explaining the +10s timeout buffer on gateway RPCs
* docs: document block > requireApproval precedence in hooks
* fix: address Phase 1 critical correctness issues for plugin approval hooks
- Fix timeout-allow param bug: return merged hook params instead of
original params when timeoutBehavior is "allow", preventing security
plugins from having their parameter rewrites silently discarded.
- Host-generate approval IDs: remove plugin-provided id field from the
requireApproval type, gateway request, and protocol schema. Server
always generates IDs via randomUUID() to prevent forged/predictable
ID attacks.
- Define onResolution semantics: add PluginApprovalResolutions constants
and PluginApprovalResolution type. onResolution callback now fires on
every exit path (allow, deny, timeout, abort, gateway error, no-ID).
Decision branching uses constants instead of hard-coded strings.
- Fix pre-existing test infrastructure issues: bypass CJS mock cache for
getGlobalHookRunner global singleton, reset gateway mock between tests,
fix hook merger priority ordering in block+requireApproval test.
* fix: tighten plugin approval schema and add kind-prefixed IDs
Harden the plugin approval request schema: restrict severity to
enum (info|warning|critical), cap timeoutMs at 600s, limit title
to 80 chars and description to 256 chars. Prefix plugin approval
IDs with `plugin:` so /approve routing can distinguish them from
exec approvals deterministically instead of relying on fallback.
* fix: address remaining PR feedback (Phases 1-3 source changes)
* chore: regenerate baselines and protocol artifacts
* fix: exclude requesting connection from approval-client availability check
hasExecApprovalClients() counted the backend connection that issued
the plugin.approval.request RPC as an approval client, preventing
the no-approval-route fast path from firing in headless setups and
causing 120s stalls. Pass the caller's connId so it is skipped.
Applied to both plugin and exec approval handlers.
* Approvals: complete Discord parity and compatibility fallback
* Hooks: make plugin approval onResolution non-blocking
* Hooks: freeze params after approval owner is selected
* Gateway: harden plugin approval request/decision flow
* Discord/Telegram: fix plugin approval delivery parity
* Approvals: fix Telegram plugin approval edge cases
* Auto-reply: enforce Telegram plugin approval approvers
* Approvals: harden Telegram and plugin resolve policies
* Agents: static-import gateway approval call and fix e2e mock loading
* Auto-reply: restore /approve Telegram import boundary
* Approvals: fail closed on no-route and neutralize Discord mentions
* docs: refresh generated config and plugin API baselines
---------
Co-authored-by: Václav Belák <vaclav.belak@gendigital.com>
When preferSetupRuntimeForChannelPlugins is active, gateway boot performs
two plugin loads: a setup-runtime pass and a full reload after listen.
The initial pin captured the setup-entry snapshot. The deferred reload now
re-pins so getChannelPlugin() resolves against the full implementations.
emitChatFinal frees buffers on clean run completion, and the
maintenance timer sweeps abortedRuns after ABORTED_RUN_TTL_MS. But
runs that get stuck (e.g. LLM timeout without triggering clean
lifecycle end) are never aborted and their string buffers persist
indefinitely. This is the direct trigger for the StringAdd_CheckNone
OOM crash reported in the issue.
Add a stale buffer sweep in the maintenance timer that cleans up
buffers, deltaSentAt, and deltaLastBroadcastLen for any run not
updated within ABORTED_RUN_TTL_MS, regardless of abort status.
Closes#51821
* fix: make cleanup "keep" persist subagent sessions indefinitely
* feat: expose subagent session metadata in sessions list
* fix: include status and timing in sessions_list tool
* fix: hide injected timestamp prefixes in chat ui
* feat: push session list updates over websocket
* feat: expose child subagent sessions in subagents list
* feat: add admin http endpoint to kill sessions
* Emit session.message websocket events for transcript updates
* Estimate session costs in sessions list
* Add direct session history HTTP and SSE endpoints
* Harden dashboard session events and history APIs
* Add session lifecycle gateway methods
* Add dashboard session API improvements
* Add dashboard session model and parent linkage support
* fix: tighten dashboard session API metadata
* Fix dashboard session cost metadata
* Persist accumulated session cost
* fix: stop followup queue drain cfg crash
* Fix dashboard session create and model metadata
* fix: stop guessing session model costs
* Gateway: cache OpenRouter pricing for configured models
* Gateway: add timeout session status
* Fix subagent spawn test config loading
* Gateway: preserve operator scopes without device identity
* Emit user message transcript events and deduplicate plugin warnings
* feat: emit sessions.changed lifecycle event on subagent spawn
Adds a session-lifecycle-events module (similar to transcript-events)
that emits create events when subagents are spawned. The gateway
server.impl.ts listens for these events and broadcasts sessions.changed
with reason=create to SSE subscribers, so dashboards can pick up new
subagent sessions without polling.
* Gateway: allow persistent dashboard orchestrator sessions
* fix: preserve operator scopes for token-authenticated backend clients
Backend clients (like agent-dashboard) that authenticate with a valid gateway
token but don't present a device identity were getting their scopes stripped.
The scope-clearing logic ran before checking the device identity decision,
so even when evaluateMissingDeviceIdentity returned 'allow' (because
roleCanSkipDeviceIdentity passed for token-authed operators), scopes were
already cleared.
Fix: also check decision.kind before clearing scopes, so token-authenticated
operators keep their requested scopes.
* Gateway: allow operator-token session kills
* Fix stale active subagent status after follow-up runs
* Fix dashboard image attachments in sessions send
* Fix completed session follow-up status updates
* feat: stream session tool events to operator UIs
* Add sessions.steer gateway coverage
* Persist subagent timing in session store
* Fix subagent session transcript event keys
* Fix active subagent session status in gateway
* bump session label max to 512
* Fix gateway send session reactivation
* fix: publish terminal session lifecycle state
* feat: change default session reset to effectively never
- Change DEFAULT_RESET_MODE from "daily" to "idle"
- Change DEFAULT_IDLE_MINUTES from 60 to 0 (0 = disabled/never)
- Allow idleMinutes=0 through normalization (don't clamp to 1)
- Treat idleMinutes=0 as "no idle expiry" in evaluateSessionFreshness
- Default behavior: mode "idle" + idleMinutes 0 = sessions never auto-reset
- Update test assertion for new default mode
* fix: prep session management followups (#50101) (thanks @clay-datacurve)
---------
Co-authored-by: Tyler Yust <TYTYYUST@YAHOO.COM>
* feat(context-engine): add ContextEngine interface and registry
Introduce the pluggable ContextEngine abstraction that allows external
plugins to register custom context management strategies.
- ContextEngine interface with lifecycle methods: bootstrap, ingest,
ingestBatch, afterTurn, assemble, compact, prepareSubagentSpawn,
onSubagentEnded, dispose
- Module-level singleton registry with registerContextEngine() and
resolveContextEngine() (config-driven slot selection)
- LegacyContextEngine: pass-through implementation wrapping existing
compaction behavior for 100% backward compatibility
- ensureContextEnginesInitialized() guard for safe one-time registration
- 19 tests covering contract, registry, resolution, and legacy parity
* feat(plugins): add context-engine slot and registerContextEngine API
Wire the ContextEngine abstraction into the plugin system so external
plugins can register context engines via the standard plugin API.
- Add 'context-engine' to PluginKind union type
- Add 'contextEngine' slot to PluginSlotsConfig (default: 'legacy')
- Wire registerContextEngine() through OpenClawPluginApi
- Export ContextEngine types from plugin-sdk for external consumers
- Restore proper slot-based resolution in registry
* feat(context-engine): wire ContextEngine into agent run lifecycle
Integrate the ContextEngine abstraction into the core agent run path:
- Resolve context engine once per run (reused across retries)
- Bootstrap: hydrate canonical store from session file on first run
- Assemble: route context assembly through pluggable engine
- Auto-compaction guard: disable built-in auto-compaction when
the engine declares ownsCompaction (prevents double-compaction)
- AfterTurn: post-turn lifecycle hook for ingest + background
compaction decisions
- Overflow compaction: route through contextEngine.compact()
- Dispose: clean up engine resources in finally block
- Notify context engine on subagent lifecycle events
Legacy engine: all lifecycle methods are pass-through/no-op, preserving
100% backward compatibility for users without a context engine plugin.
* feat(plugins): add scoped subagent methods and gateway request scope
Expose runtime.subagent.{run, waitForRun, getSession, deleteSession}
so external plugins can spawn sub-agent sessions without raw gateway
dispatch access.
Uses AsyncLocalStorage request-scope bridge to dispatch internally via
handleGatewayRequest with a synthetic operator client. Methods are only
available during gateway request handling.
- Symbol.for-backed global singleton for cross-module-reload safety
- Fallback gateway context for non-WS dispatch paths (Telegram/WhatsApp)
- Set gateway request scope for all handlers, not just plugin handlers
- 3 staleness tests for fallback context hardening
* feat(context-engine): route /compact and sessions.get through context engine
Wire the /compact command and sessions.get handler through the pluggable
ContextEngine interface.
- Thread tokenBudget and force parameters to context engine compact
- Route /compact through contextEngine.compact() when registered
- Wire sessions.get as runtime alias for plugin subagent dispatch
- Add .pebbles/ to .gitignore
* style: format with oxfmt 0.33.0
Fix duplicate import (ControlUiRootState in server.impl.ts) and
import ordering across all changed files.
* fix: update extension test mocks for context-engine types
Add missing subagent property to bluebubbles PluginRuntime mock.
Add missing registerContextEngine to lobster OpenClawPluginApi mock.
* fix(subagents): keep deferred delete cleanup retryable
* style: format run attempt for CI
* fix(rebase): remove duplicate embedded-run imports
* test: add missing gateway context mock export
* fix: pass resolved auth profile into afterTurn compaction
Ensure the embedded runner forwards resolved auth profile context into
legacy context-engine compaction params on the normal afterTurn path,
matching overflow compaction behavior. This allows downstream LCM
summarization to use the intended provider auth/profile consistently.
Also fix strict TS typing in external-link token dedupe and align an
attempt unit test reasoningLevel value with the current ReasoningLevel
enum.
Regeneration-Prompt: |
We were debugging context-engine compaction where downstream summary
calls were missing the right auth/profile context in normal afterTurn
flow, while overflow compaction already propagated it. Preserve current
behavior and keep changes additive: thread the resolved authProfileId
through run -> attempt -> legacy compaction param builder without
broad refactors.
Add tests that prove the auth profile is included in afterTurn legacy
params and that overflow compaction still passes it through run
attempts. Keep existing APIs stable, and only adjust small type issues
needed for strict compilation.
* fix: remove duplicate imports from rebase
* feat: add context-engine system prompt additions
* fix(rebase): dedupe attempt import declarations
* test: fix fetch mock typing in ollama autodiscovery
* fix(test): add registerContextEngine to diffs extension mock APIs
* test(windows): use path.delimiter in ios-team-id fixture PATH
* test(cron): add model formatting and precedence edge case tests
Covers:
- Provider/model string splitting (whitespace, nested paths, empty segments)
- Provider normalization (casing, aliases like bedrock→amazon-bedrock)
- Anthropic model alias normalization (opus-4.5→claude-opus-4-5)
- Precedence: job payload > session override > config default
- Sequential runs with different providers (CI flake regression pattern)
- forceNew session preserving stored model overrides
- Whitespace/empty model string edge cases
- Config model as string vs object format
* test(cron): fix model formatting test config types
* test(phone-control): add registerContextEngine to mock API
* fix: re-export ChannelKind from config-reload-plan
* fix: add subagent mock to plugin-runtime-mock test util
* docs: add changelog fragment for context engine PR #22201
## Overview
This PR enables external channel plugins (loaded via Plugin SDK) to access
advanced runtime features like AI response dispatching, which were previously
only available to built-in channels.
## Changes
### src/gateway/server-channels.ts
- Import PluginRuntime type
- Add optional channelRuntime parameter to ChannelManagerOptions
- Pass channelRuntime to channel startAccount calls via conditional spread
- Ensures backward compatibility (field is optional)
### src/gateway/server.impl.ts
- Import createPluginRuntime from plugins/runtime
- Create and pass channelRuntime to channel manager
### src/channels/plugins/types.adapters.ts
- Import PluginRuntime type
- Add comprehensive documentation for channelRuntime field
- Document available features, use cases, and examples
- Improve type safety (use imported PluginRuntime type vs inline import)
## Benefits
External channel plugins can now:
- Generate AI-powered responses using dispatchReplyWithBufferedBlockDispatcher
- Access routing, text processing, and session management utilities
- Use command authorization and group policy resolution
- Maintain feature parity with built-in channels
## Backward Compatibility
- channelRuntime field is optional in ChannelGatewayContext
- Conditional spread ensures it's only passed when explicitly provided
- Existing channels without channelRuntime support continue to work unchanged
- No breaking changes to channel plugin API
## Testing
- Email channel plugin successfully uses channelRuntime for AI responses
- All existing built-in channels (slack, discord, telegram, etc.) work unchanged
- Gateway loads and runs without errors when channelRuntime is provided
The health monitor was created once at startup and never touched by
applyHotReload(), so changing channelHealthCheckMinutes only took
effect after a full gateway restart.
Wire up a "restart-health-monitor" reload action so hot-reload can
stop the old monitor and (re)create one with the updated interval —
or disable it entirely when set to 0.
Closes#32105
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>