[codex] Move internal development notes to maintainers (#57316)

* docs: move internal notes to maintainers

* docs: drop internal notes agent guidance
Onur Solmaz
2026-03-30 00:15:08 +02:00
committed by GitHub
parent d82d6ba0c4
commit 2da61e6553
26 changed files with 6 additions and 4369 deletions

View File

@@ -3,7 +3,6 @@
"ignores": [
"docs/zh-CN/**",
"docs/.i18n/**",
"docs/internal/**",
"docs/reference/templates/**",
"**/.local/**"
],

View File

@@ -76,20 +76,6 @@
- README (GitHub): keep absolute docs URLs (`https://docs.openclaw.ai/...`) so links work on GitHub.
- Docs content must be generic: no personal device names/hostnames/paths; use placeholders like `user@gateway-host` and “gateway host”.
## Internal Development Docs
- `docs/internal/**` is for internal-only development context that should stay in the repo but not on the public docs site.
- Automatically create or update a note in `docs/internal/<github-username>/` when starting meaningful implementation work, refactors, migrations, architecture changes, or other development context that would otherwise be lost.
- Default note types include implementation plans, refactor plans, migration plans, architecture notes, debugging notes, and other development context.
- Infer `<github-username>` from the current authenticated GitHub user. Prefer the active GitHub login; if that cannot be resolved, fall back to the best available local git identity.
- Use dated filenames: `YYYY-MM-DD-short-topic.md`.
- Use YAML frontmatter at minimum: `title`, `summary`, `author`, `github_username`, and `created`.
- When the author identity is known, prefer `author: "Name <email>"`.
- Keep these notes in plain language.
- Avoid deleting or rewriting someone else's internal notes unless there is a clear valid reason, such as an explicit user request, accidental duplicate content, secret removal, replacement by a newer note, or override by BDFL.
- Do not place internal planning notes in public `docs/**` pages or revive `experiments/`; use `docs/internal/` instead.
- Read `docs/internal/README.md` before creating or reorganizing internal development notes.
## Docs i18n (zh-CN)
- `docs/zh-CN/**` is generated; do not edit unless the user explicitly asks.

View File

@@ -1 +0,0 @@
internal/

View File

@@ -1,67 +0,0 @@
# Internal Development Docs
`docs/internal/` stores internal-only development notes that should live in the repo but stay out of the public docs site and docs tooling flow.
These notes are for development context that would otherwise get lost as code changes over time.
## What belongs here
- Implementation plans
- Refactor plans
- Migration plans
- Architecture notes
- Debugging and investigation notes
- Other development context that is useful to keep in version control
## Folder layout
Create notes under:
`docs/internal/<github-username>/`
For new notes, infer `<github-username>` from the current authenticated GitHub user. If that cannot be resolved, fall back to the best available local git identity.
## File naming
Use dated filenames:
`YYYY-MM-DD-short-topic.md`
Example:
`docs/internal/steipete/2026-01-05-model-config.md`
Do not use this README to track or index individual notes. Keep it focused on the conventions for creating and organizing them.
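As a rough illustration of the folder and file naming rules above, a path builder could look like the following. The function name `internalNotePath` and the slug rules are assumptions made for this sketch, not part of the repo.

```ts
// Hypothetical helper: builds a note path that follows the convention
// docs/internal/<github-username>/YYYY-MM-DD-short-topic.md.
function internalNotePath(githubUsername: string, created: Date, topic: string): string {
  // Take YYYY-MM-DD from the UTC date so the result does not depend on the local timezone.
  const date = created.toISOString().slice(0, 10);
  // Lowercase the topic, turn runs of non-alphanumerics into single hyphens,
  // and trim hyphens from both ends.
  const slug = topic
    .toLowerCase()
    .replace(/[^a-z0-9]+/g, "-")
    .replace(/^-+|-+$/g, "");
  return `docs/internal/${githubUsername}/${date}-${slug}.md`;
}
```

Called with `"steipete"`, 5 January 2026 (UTC), and `"Model Config"`, this yields the example path shown above.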
## Frontmatter
Use YAML frontmatter. At minimum include:
```yaml
---
title: "Model Config Exploration"
summary: "Exploration: model config, auth profiles, and fallback behavior"
author: "Peter Steinberger <steipete@gmail.com>"
github_username: "steipete"
created: "2026-01-05"
---
```
Optional fields commonly used here:
- `status`
- `last_updated`
- `read_when`
## Writing guidance
- Use plain language.
- Prefer preserving dated context over deleting it.
- Update an existing note when it is clearly the same thread of work; create a new dated note when the plan or investigation is meaningfully separate.
- If a note moves folders or is adopted by another contributor later, keep the history clear in git rather than rewriting the repo history.
## Important rules
- Do not put these notes in public `docs/` pages.
- Do not use `experiments/` for this content.
- New implementation plans, refactor plans, and similar development notes should be created here by default.

View File

@@ -1,152 +0,0 @@
---
title: "Background Task Lifecycle Tracking"
summary: "Plain-language outline of the durable task registry, quieter background ACP delivery, and new task inspection commands"
author: "Mariano Belinky <mbelinky@gmail.com>"
github_username: "mbelinky"
created: "2026-03-29"
---
# Feature Outline: Background Task Lifecycle Tracking
## TL;DR
- OpenClaw now treats long-running background work as tracked tasks instead of "something a child session is doing somewhere."
- The system saves task state to disk, so operators can later check what happened even after a restart.
- Background ACP runs are quieter: by default they stop sending raw child-session chatter into the user chat and send one short final update instead.
- You can inspect tasks with `openclaw tasks`, change the notification level, and cancel running ACP or subagent work.
## What problem this solves
Before this PR, background work had a few weak spots:
1. It was hard to tell whether a task was still running, done, failed, or simply disappeared.
2. ACP child sessions could leak setup or progress chatter into the main user thread.
3. After restarts or partial failures, operators sometimes had to inspect sessions or logs manually to reconstruct what happened.
## What changed
### 1. OpenClaw now has a durable task registry
OpenClaw now creates a task record for:
- background ACP work
- subagent runs
- gateway-started background CLI runs
Each record can keep:
- a task ID and run ID
- the runtime (`acp`, `subagent`, or `cli`)
- the requester session and child session
- the current status
- short progress and final summaries
- delivery state and notification policy
The registry is saved under the configured state directory as `tasks/runs.json`, so it can be restored after restart.
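To make the record shape concrete, here is a minimal sketch of what one entry and a save/restore round trip might look like. All field names, the status spellings, and the `{ tasks: [...] }` file layout are assumptions for illustration; the actual schema of `tasks/runs.json` is not described in this note.

```ts
// Hypothetical types; only the runtime values and policy names come from this note.
type TaskRuntime = "acp" | "subagent" | "cli";
type TaskStatus = "running" | "done" | "failed" | "timed_out" | "lost" | "cancelled";
type NotifyPolicy = "done_only" | "state_changes" | "silent";

interface TaskRecord {
  taskId: string;
  runId: string;
  runtime: TaskRuntime;
  requesterSessionKey: string;
  childSessionKey?: string;
  status: TaskStatus;
  progressSummary?: string;
  finalSummary?: string;
  notifyPolicy: NotifyPolicy;
  delivered: boolean;
}

// Serialize the registry to the JSON text that would be written to disk.
function serializeRegistry(tasks: TaskRecord[]): string {
  return JSON.stringify({ tasks }, null, 2);
}

// Restore the registry from disk contents; a file without a task list yields an empty registry.
function restoreRegistry(raw: string): TaskRecord[] {
  const parsed = JSON.parse(raw) as { tasks?: TaskRecord[] };
  return Array.isArray(parsed.tasks) ? parsed.tasks : [];
}
```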
### 2. Background ACP runs now report through the parent, not the child
This PR adds a clear split between:
- normal interactive ACP sessions
- parent-owned background ACP sessions
If an ACP oneshot session was spawned from another session, OpenClaw now treats it as background work. In plain terms, the child session should not speak directly in the user-facing chat. Instead, the parent-owned task record handles the final update.
### 3. Default ACP notifications are short and quiet
For background ACP tasks, the default notification policy is `done_only`.
That means the requester normally gets one short message when the task ends, such as:
- done
- failed
- timed out
- lost
- cancelled
This is intentionally quieter than streaming raw child updates into the main conversation.
### 4. Optional state-change updates
If someone wants more visibility, they can switch a task to `state_changes`. Then OpenClaw can send short updates like:
- task started
- no output for a while
- output resumed
- a short progress summary
There is also a `silent` mode if no notifications should be sent.
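The three notification policies can be read as a small decision table. The sketch below is one way to express it; the event names and the assumption that `state_changes` also delivers the final update are illustrative, not taken from the implementation.

```ts
type NotifyPolicy = "done_only" | "state_changes" | "silent";

// Hypothetical event names covering the updates listed above.
type TaskEvent =
  | "started"
  | "stalled"
  | "resumed"
  | "progress"
  | "done"
  | "failed"
  | "timed_out"
  | "lost"
  | "cancelled";

// Events that end a task and therefore trigger the single final update.
const TERMINAL_EVENTS: ReadonlySet<TaskEvent> = new Set<TaskEvent>([
  "done",
  "failed",
  "timed_out",
  "lost",
  "cancelled",
]);

// Decide whether an event should produce a message to the requester.
function shouldNotify(policy: NotifyPolicy, event: TaskEvent): boolean {
  switch (policy) {
    case "silent":
      return false;
    case "done_only":
      return TERMINAL_EVENTS.has(event);
    case "state_changes":
      // Assumption: state-change mode sends both intermediate and final updates.
      return true;
  }
}
```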
### 5. New `openclaw tasks` commands
This PR adds a durable operator-facing CLI for task inspection and control:
```bash
openclaw tasks
openclaw tasks list --runtime acp --status running
openclaw tasks show <task-id|run-id|session-key>
openclaw tasks notify <lookup> done_only
openclaw tasks notify <lookup> state_changes
openclaw tasks notify <lookup> silent
openclaw tasks cancel <lookup>
```
This gives operators a direct way to answer "what is this task doing?" without scraping session output.
### 6. Recovery and cleanup
The registry is restored on startup and maintained in the background.
Important behaviors:
- If a task is still marked active but its backing session disappears for long enough, OpenClaw can mark it as `lost`.
- Old finished tasks are pruned after a retention window.
- Task lookup works by task ID, run ID, or session key.
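The lookup and `lost` behaviors above might be sketched as follows. The record fields, the grace-period parameter, and the choice to match the child session key are assumptions for this example; the real thresholds and matching rules are not stated in this note.

```ts
// Hypothetical, trimmed-down record with only the fields this sketch needs.
interface TaskRecord {
  taskId: string;
  runId: string;
  childSessionKey: string;
  status: "running" | "done" | "failed" | "timed_out" | "lost" | "cancelled";
  lastSeenMs: number;
}

// Resolve a lookup string against task ID, run ID, or session key.
function findTask(tasks: TaskRecord[], lookup: string): TaskRecord | undefined {
  return tasks.find(
    (t) => t.taskId === lookup || t.runId === lookup || t.childSessionKey === lookup,
  );
}

// Mark running tasks as lost once their backing session has been missing for at least graceMs.
function markLostTasks(
  tasks: TaskRecord[],
  liveSessions: ReadonlySet<string>,
  nowMs: number,
  graceMs: number,
): TaskRecord[] {
  return tasks.map((t) =>
    t.status === "running" &&
    !liveSessions.has(t.childSessionKey) &&
    nowMs - t.lastSeenMs >= graceMs
      ? { ...t, status: "lost" as const }
      : t,
  );
}
```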
## What users will notice
The biggest user-facing behavior change is for background ACP runs.
Before:
- a background ACP child session could leak raw chatter into the main chat
- the final outcome could be unclear
- "let me know when you're done" was not very reliable
After:
- the main chat stays cleaner
- background ACP work has a durable status record
- the requester can get a single clear final update
- operators can inspect or cancel tasks explicitly
## What operators and maintainers will notice
- ACP, subagent, and background CLI work now share one task-tracking model.
- Task records are linked to the requester session, child session, and run ID.
- Some existing status and info views now surface linked task IDs and delivery state.
- Restart and recovery behavior is stronger because task state is no longer only implicit in session logs.
## Simple example
1. A user asks OpenClaw to do a long ACP job in the background.
2. OpenClaw creates a task record right away.
3. The child ACP session runs in its own session.
4. Progress is tracked in the registry instead of being dumped into the user thread.
5. When the job finishes, OpenClaw sends one short final update back to the original requester.
6. If needed, an operator can inspect or cancel the task with `openclaw tasks`.
## What this feature does not add
This PR is about lifecycle tracking and delivery behavior. It does not add:
- a new UI
- automatic retries
- a general-purpose job queue
## Main takeaway
This feature makes background work feel like a real first-class task instead of hidden session state. The main win is reliability: OpenClaw can now track, recover, inspect, notify, and cancel background work in a much more predictable way.
Source PR: https://github.com/openclaw/openclaw/pull/52518

View File

@@ -1,43 +0,0 @@
---
title: "Telegram Allowlist Hardening"
summary: "Telegram allowlist hardening: prefix + whitespace normalization"
author: "Marcus Neves <2423436+mneves75@users.noreply.github.com>"
github_username: "mneves75"
created: "2026-01-05"
read_when:
- Reviewing historical Telegram allowlist changes
---
# Telegram Allowlist Hardening
**Date**: 2026-01-05
**Status**: Complete
**PR**: #216
## Summary
Telegram allowlists now accept `telegram:` and `tg:` prefixes case-insensitively, and tolerate
accidental whitespace. This aligns inbound allowlist checks with outbound send normalization.
## What changed
- Prefixes `telegram:` and `tg:` are treated the same (case-insensitive).
- Allowlist entries are trimmed; empty entries are ignored.
## Examples
All of these are accepted for the same ID:
- `telegram:123456`
- `TG:123456`
- `tg:123456`
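A minimal sketch of the normalization described above, assuming allowlist entries arrive as plain strings. The function name is hypothetical and this is not the code from the PR.

```ts
// Hypothetical helper: trims entries, strips a leading telegram:/tg: prefix
// case-insensitively, and drops entries that end up empty.
function normalizeTelegramAllowlist(entries: string[]): string[] {
  return entries
    .map((entry) => entry.trim().replace(/^(telegram|tg):/i, "").trim())
    .filter((entry) => entry.length > 0);
}
```

With this sketch, `telegram:123456`, ` TG:123456 `, and `tg:123456` all normalize to `123456`, and a whitespace-only entry is ignored.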
## Why it matters
Copy/paste from logs or chat IDs often includes prefixes and whitespace. Normalizing avoids
false negatives when deciding whether to respond in DMs or groups.
## Related docs
- [Group Chats](/channels/groups)
- [Telegram Provider](/channels/telegram)

View File

@@ -1,69 +0,0 @@
---
title: "Cron Add Hardening"
summary: "Harden cron.add input handling, align schemas, and improve cron UI/agent tooling"
author: "Marcus Neves <2423436+mneves75@users.noreply.github.com>"
github_username: "mneves75"
created: "2026-01-06"
read_when:
- Debugging invalid `cron.add` payloads
- Aligning cron schemas across gateway, CLI, and UI
owner: "openclaw"
status: "complete"
last_updated: "2026-01-05"
---
# Cron Add Hardening & Schema Alignment
## Context
Recent gateway logs show repeated `cron.add` failures with invalid parameters (missing `sessionTarget`, `wakeMode`, `payload`, and malformed `schedule`). This indicates that at least one client (likely the agent tool call path) is sending wrapped or partially specified job payloads. Separately, there is drift between cron provider enums in TypeScript, gateway schema, CLI flags, and UI form types, plus a UI mismatch for `cron.status` (expects `jobCount` while gateway returns `jobs`).
## Goals
- Stop `cron.add` INVALID_REQUEST spam by normalizing common wrapper payloads and inferring missing `kind` fields.
- Align cron provider lists across gateway schema, cron types, CLI docs, and UI forms.
- Make agent cron tool schema explicit so the LLM produces correct job payloads.
- Fix the Control UI cron status job count display.
- Add tests to cover normalization and tool behavior.
## Non-goals
- Change cron scheduling semantics or job execution behavior.
- Add new schedule kinds or cron expression parsing.
- Overhaul the UI/UX for cron beyond the necessary field fixes.
## Findings (current gaps)
- `CronPayloadSchema` in gateway excludes `signal` + `imessage`, while TS types include them.
- Control UI CronStatus expects `jobCount`, but gateway returns `jobs`.
- Agent cron tool schema allows arbitrary `job` objects, enabling malformed inputs.
- Gateway strictly validates `cron.add` with no normalization, so wrapped payloads fail.
## What changed
- `cron.add` and `cron.update` now normalize common wrapper shapes and infer missing `kind` fields.
- Agent cron tool schema matches the gateway schema, which reduces invalid payloads.
- Provider enums are aligned across gateway, CLI, UI, and macOS picker.
- Control UI uses the gateway's `jobs` count field for status.
## Current behavior
- **Normalization:** wrapped `data`/`job` payloads are unwrapped; `schedule.kind` and `payload.kind` are inferred when safe.
- **Defaults:** safe defaults are applied for `wakeMode` and `sessionTarget` when missing.
- **Providers:** Discord/Slack/Signal/iMessage are now consistently surfaced across CLI/UI.
See [Cron jobs](/automation/cron-jobs) for the normalized shape and examples.
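The unwrapping step can be illustrated with a small sketch. It assumes a wrapper is only removed when the current level has no `schedule` field of its own, which is one possible reading of "when safe"; the function name and that guard are assumptions, and `kind` inference and defaults are left out because their values are not given in this note.

```ts
type JsonObject = Record<string, unknown>;

function isJsonObject(value: unknown): value is JsonObject {
  return typeof value === "object" && value !== null && !Array.isArray(value);
}

// Hypothetical sketch: peel off { data: {...} } or { job: {...} } wrappers
// until an object that looks like a job (it has a schedule) is reached.
function unwrapCronJob(input: unknown): JsonObject {
  if (!isJsonObject(input)) {
    throw new Error("cron.add payload must be an object");
  }
  if (!("schedule" in input)) {
    for (const key of ["data", "job"]) {
      const inner = input[key];
      if (isJsonObject(inner)) {
        return unwrapCronJob(inner);
      }
    }
  }
  return input;
}
```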
## Verification
- Watch gateway logs for reduced `cron.add` INVALID_REQUEST errors.
- Confirm Control UI cron status shows job count after refresh.
## Optional Follow-ups
- Manual Control UI smoke: add a cron job per provider + verify status job count.
## Open Questions
- Should `cron.add` accept explicit `state` from clients (currently disallowed by schema)?
- Should we allow `webchat` as an explicit delivery provider (currently filtered in delivery resolution)?

View File

@@ -1,234 +0,0 @@
---
title: "Browser Evaluate CDP Refactor"
summary: "Plan: isolate browser act:evaluate from Playwright queue using CDP, with end-to-end deadlines and safer ref resolution"
author: "Onur Solmaz <onur@solmaz.io>"
github_username: "osolmaz"
created: "2026-02-11"
read_when:
- Working on browser `act:evaluate` timeout, abort, or queue blocking issues
- Planning CDP based isolation for evaluate execution
status: "draft"
last_updated: "2026-02-10"
---
# Browser Evaluate CDP Refactor Plan
## Context
`act:evaluate` executes user provided JavaScript in the page. Today it runs via Playwright
(`page.evaluate` or `locator.evaluate`). Playwright serializes CDP commands per page, so a
stuck or long running evaluate can block the page command queue and make every later action
on that tab look "stuck".
PR #13498 adds a pragmatic safety net (bounded evaluate, abort propagation, and best-effort
recovery). This document describes a larger refactor that makes `act:evaluate` inherently
isolated from Playwright so a stuck evaluate cannot wedge normal Playwright operations.
## Goals
- `act:evaluate` cannot permanently block later browser actions on the same tab.
- Timeouts have a single source of truth end to end, so a caller can rely on one budget.
- Abort and timeout are treated the same way across HTTP and in-process dispatch.
- Element targeting for evaluate is supported without switching everything off Playwright.
- Maintain backward compatibility for existing callers and payloads.
## Non-goals
- Replace all browser actions (click, type, wait, etc.) with CDP implementations.
- Remove the existing safety net introduced in PR #13498 (it remains a useful fallback).
- Introduce new unsafe capabilities beyond the existing `browser.evaluateEnabled` gate.
- Add process isolation (worker process/thread) for evaluate. If we still see hard to recover
stuck states after this refactor, that is a follow-up idea.
## Current Architecture (Why It Gets Stuck)
At a high level:
- Callers send `act:evaluate` to the browser control service.
- The route handler calls into Playwright to execute the JavaScript.
- Playwright serializes page commands, so an evaluate that never finishes blocks the queue.
- A stuck queue means later click/type/wait operations on the tab can appear to hang.
## Proposed Architecture
### 1. Deadline Propagation
Introduce a single budget concept and derive everything from it:
- Caller sets `timeoutMs` (or a deadline in the future).
- The outer request timeout, route handler logic, and the execution budget inside the page
all use the same budget, with small headroom where needed for serialization overhead.
- Abort is propagated as an `AbortSignal` everywhere so cancellation is consistent.
Implementation direction:
- Add a small helper (for example `createBudget({ timeoutMs, signal })`) that returns:
- `signal`: the linked AbortSignal
- `deadlineAtMs`: absolute deadline
- `remainingMs()`: remaining budget for child operations
- Use this helper in:
- `src/browser/client-fetch.ts` (HTTP and in-process dispatch)
- `src/node-host/runner.ts` (proxy path)
- browser action implementations (Playwright and CDP)
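A minimal sketch of the budget helper, assuming a runtime with the standard `AbortController` and `setTimeout` (Node or a browser). The `dispose()` method is an addition for this example so callers can release the timer early; it is not part of the plan above.

```ts
interface Budget {
  signal: AbortSignal; // aborts on timeout or when the upstream signal aborts
  deadlineAtMs: number; // absolute deadline (epoch milliseconds)
  remainingMs(): number; // remaining budget for child operations, never negative
  dispose(): void; // assumption: cleanup hook, not named in the plan
}

function createBudget(opts: { timeoutMs: number; signal?: AbortSignal }): Budget {
  const upstream = opts.signal;
  const controller = new AbortController();
  const deadlineAtMs = Date.now() + opts.timeoutMs;

  // Abort the linked signal when the budget runs out.
  const timer = setTimeout(() => controller.abort(new Error("budget exceeded")), opts.timeoutMs);

  // Propagate upstream cancellation and stop the timer.
  const onUpstreamAbort = () => {
    clearTimeout(timer);
    controller.abort(upstream?.reason);
  };
  if (upstream?.aborted) {
    onUpstreamAbort();
  } else {
    upstream?.addEventListener("abort", onUpstreamAbort, { once: true });
  }

  return {
    signal: controller.signal,
    deadlineAtMs,
    remainingMs: () => Math.max(0, deadlineAtMs - Date.now()),
    dispose: () => {
      clearTimeout(timer);
      upstream?.removeEventListener("abort", onUpstreamAbort);
    },
  };
}
```

A child operation would then pass `budget.signal` along and use `budget.remainingMs()` (minus any headroom) as its own timeout, rather than starting a fresh timer.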
### 2. Separate Evaluate Engine (CDP Path)
Add a CDP based evaluate implementation that does not share Playwright's per page command
queue. The key property is that the evaluate transport is a separate WebSocket connection
and a separate CDP session attached to the target.
Implementation direction:
- New module, for example `src/browser/cdp-evaluate.ts`, that:
- Connects to the configured CDP endpoint (browser level socket).
- Uses `Target.attachToTarget({ targetId, flatten: true })` to get a `sessionId`.
- Runs either:
- `Runtime.evaluate` for page level evaluate, or
- `DOM.resolveNode` plus `Runtime.callFunctionOn` for element evaluate.
- On timeout or abort:
- Sends `Runtime.terminateExecution` best-effort for the session.
- Closes the WebSocket and returns a clear error.
Notes:
- This still executes JavaScript in the page, so termination can have side effects. The win
is that it does not wedge the Playwright queue, and it is cancelable at the transport
layer by killing the CDP session.
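To show the message flow without committing to a WebSocket client, the sketch below only builds the CDP command payloads that such a module would send over the browser-level socket. The CDP method names and parameters (`Target.attachToTarget`, `Runtime.evaluate`, `Runtime.terminateExecution`) are real; the helper names and the choice of `awaitPromise` and `returnByValue` are assumptions.

```ts
// Shape of a CDP command in flattened-session mode: sessionId sits at the top level.
interface CdpCommand {
  id: number;
  method: string;
  params?: Record<string, unknown>;
  sessionId?: string;
}

// Step 1: attach to the page target to obtain a sessionId (returned in the response).
function attachCommand(id: number, targetId: string): CdpCommand {
  return { id, method: "Target.attachToTarget", params: { targetId, flatten: true } };
}

// Step 2: run a page-level evaluate inside the attached session.
function evaluateCommand(id: number, sessionId: string, expression: string): CdpCommand {
  return {
    id,
    sessionId,
    method: "Runtime.evaluate",
    params: { expression, awaitPromise: true, returnByValue: true },
  };
}

// On timeout or abort: best-effort termination before closing the socket.
function terminateCommand(id: number, sessionId: string): CdpCommand {
  return { id, sessionId, method: "Runtime.terminateExecution" };
}
```

Each payload would be sent as `JSON.stringify(command)` on the dedicated socket, which is what keeps this traffic off Playwright's per-page queue.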
### 3. Ref Story (Element Targeting Without A Full Rewrite)
The hard part is element targeting. CDP needs a DOM handle or `backendDOMNodeId`, while
today most browser actions use Playwright locators based on refs from snapshots.
Recommended approach: keep existing refs, but attach an optional CDP resolvable id.
#### 3.1 Extend Stored Ref Info
Extend the stored role ref metadata to optionally include a CDP id:
- Today: `{ role, name, nth }`
- Proposed: `{ role, name, nth, backendDOMNodeId?: number }`
This keeps all existing Playwright based actions working and allows CDP evaluate to accept
the same `ref` value when the `backendDOMNodeId` is available.
#### 3.2 Populate backendDOMNodeId At Snapshot Time
When producing a role snapshot:
1. Generate the existing role ref map as today (role, name, nth).
2. Fetch the AX tree via CDP (`Accessibility.getFullAXTree`) and compute a parallel map of
`(role, name, nth) -> backendDOMNodeId` using the same duplicate handling rules.
3. Merge the id back into the stored ref info for the current tab.
If mapping fails for a ref, leave `backendDOMNodeId` undefined. This makes the feature
best-effort and safe to roll out.
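The merge in steps 2 and 3 could be sketched like this. The input shapes are simplified assumptions: real nodes from `Accessibility.getFullAXTree` wrap role and name in value objects, and the sketch assumes `nth` is a zero-based counter over nodes with the same role and name in tree order.

```ts
interface RoleRef {
  role: string;
  name: string;
  nth: number;
  backendDOMNodeId?: number;
}

// Simplified AX node: role and name already flattened to strings (assumption).
interface AxNodeLite {
  role: string;
  name: string;
  backendDOMNodeId?: number;
}

function attachBackendIds(
  refs: Record<string, RoleRef>,
  axNodes: AxNodeLite[],
): Record<string, RoleRef> {
  const seen = new Map<string, number>(); // (role, name) -> next nth
  const ids = new Map<string, number>(); // (role, name, nth) -> backendDOMNodeId

  for (const node of axNodes) {
    const base = JSON.stringify([node.role, node.name]);
    const nth = seen.get(base) ?? 0;
    seen.set(base, nth + 1);
    if (node.backendDOMNodeId !== undefined) {
      ids.set(JSON.stringify([node.role, node.name, nth]), node.backendDOMNodeId);
    }
  }

  const merged: Record<string, RoleRef> = {};
  for (const [ref, info] of Object.entries(refs)) {
    const id = ids.get(JSON.stringify([info.role, info.name, info.nth]));
    // Best-effort: refs without a match keep backendDOMNodeId undefined.
    merged[ref] = id === undefined ? { ...info } : { ...info, backendDOMNodeId: id };
  }
  return merged;
}
```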
#### 3.3 Evaluate Behavior With Ref
In `act:evaluate`:
- If `ref` is present and has `backendDOMNodeId`, run element evaluate via CDP.
- If `ref` is present but has no `backendDOMNodeId`, fall back to the Playwright path (with
the safety net).
Optional escape hatch:
- Extend the request shape to accept `backendDOMNodeId` directly for advanced callers (and
for debugging), while keeping `ref` as the primary interface.
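The routing rule can be written as a small pure function. The engine labels are hypothetical, and the no-`ref` case (always use CDP) is taken from the implementation checklist later in this document rather than from this subsection.

```ts
// Hypothetical labels for the three execution paths.
type EvaluateEngine = "cdp-page" | "cdp-element" | "playwright";

// Pick the engine for act:evaluate from the resolved ref info, if any.
function chooseEvaluateEngine(ref?: { backendDOMNodeId?: number }): EvaluateEngine {
  if (ref === undefined) {
    return "cdp-page"; // no element targeting needed
  }
  return ref.backendDOMNodeId !== undefined
    ? "cdp-element" // element evaluate via DOM.resolveNode + Runtime.callFunctionOn
    : "playwright"; // bounded fallback with the existing safety net
}
```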
### 4. Keep A Last Resort Recovery Path
Even with CDP evaluate, there are other ways to wedge a tab or a connection. Keep the
existing recovery mechanisms (terminate execution + disconnect Playwright) as a last resort
for:
- legacy callers
- environments where CDP attach is blocked
- unexpected Playwright edge cases
## Implementation Plan (Single Iteration)
### Deliverables
- A CDP based evaluate engine that runs outside the Playwright per-page command queue.
- A single end-to-end timeout/abort budget used consistently by callers and handlers.
- Ref metadata that can optionally carry `backendDOMNodeId` for element evaluate.
- `act:evaluate` prefers the CDP engine when possible and falls back to Playwright when not.
- Tests that prove a stuck evaluate does not wedge later actions.
- Logs/metrics that make failures and fallbacks visible.
### Implementation Checklist
1. Add a shared "budget" helper to link `timeoutMs` + upstream `AbortSignal` into:
- a single `AbortSignal`
- an absolute deadline
- a `remainingMs()` helper for downstream operations
2. Update all caller paths to use that helper so `timeoutMs` means the same thing everywhere:
- `src/browser/client-fetch.ts` (HTTP and in-process dispatch)
- `src/node-host/runner.ts` (node proxy path)
- CLI wrappers that call `/act` (add `--timeout-ms` to `browser evaluate`)
3. Implement `src/browser/cdp-evaluate.ts`:
- connect to the browser-level CDP socket
- `Target.attachToTarget` to get a `sessionId`
- run `Runtime.evaluate` for page evaluate
- run `DOM.resolveNode` + `Runtime.callFunctionOn` for element evaluate
- on timeout/abort: best-effort `Runtime.terminateExecution` then close the socket
4. Extend stored role ref metadata to optionally include `backendDOMNodeId`:
- keep existing `{ role, name, nth }` behavior for Playwright actions
- add `backendDOMNodeId?: number` for CDP element targeting
5. Populate `backendDOMNodeId` during snapshot creation (best-effort):
- fetch AX tree via CDP (`Accessibility.getFullAXTree`)
- compute `(role, name, nth) -> backendDOMNodeId` and merge into the stored ref map
- if mapping is ambiguous or missing, leave the id undefined
6. Update `act:evaluate` routing:
- if no `ref`: always use CDP evaluate
- if `ref` resolves to a `backendDOMNodeId`: use CDP element evaluate
- otherwise: fall back to Playwright evaluate (still bounded and abortable)
7. Keep the existing "last resort" recovery path as a fallback, not the default path.
8. Add tests:
- stuck evaluate times out within budget and the next click/type succeeds
- abort cancels evaluate (client disconnect or timeout) and unblocks subsequent actions
- mapping failures cleanly fall back to Playwright
9. Add observability:
- evaluate duration and timeout counters
- terminateExecution usage
- fallback rate (CDP -> Playwright) and reasons
### Acceptance Criteria
- A deliberately hung `act:evaluate` returns within the caller budget and does not wedge the
tab for later actions.
- `timeoutMs` behaves consistently across CLI, agent tool, node proxy, and in-process calls.
- If `ref` can be mapped to `backendDOMNodeId`, element evaluate uses CDP; otherwise the
fallback path is still bounded and recoverable.
## Testing Plan
- Unit tests:
- `(role, name, nth)` matching logic between role refs and AX tree nodes.
- Budget helper behavior (headroom, remaining time math).
- Integration tests:
- CDP evaluate timeout returns within budget and does not block the next action.
- Abort cancels evaluate and triggers termination best-effort.
- Contract tests:
- Ensure `BrowserActRequest` and `BrowserActResponse` remain compatible.
## Risks And Mitigations
- Mapping is imperfect:
- Mitigation: best-effort mapping, fallback to Playwright evaluate, and add debug tooling.
- `Runtime.terminateExecution` has side effects:
- Mitigation: only use on timeout/abort and document the behavior in errors.
- Extra overhead:
- Mitigation: only fetch AX tree when snapshots are requested, cache per target, and keep
CDP session short lived.
- Extension relay limitations:
- Mitigation: use browser level attach APIs when per page sockets are not available, and
keep the current Playwright path as fallback.
## Open Questions
- Should the new engine be configurable as `playwright`, `cdp`, or `auto`?
- Do we want to expose a new "nodeRef" format for advanced users, or keep `ref` only?
- How should frame snapshots and selector scoped snapshots participate in AX mapping?

View File

@@ -1,197 +0,0 @@
---
title: "PTY and Process Supervision Plan"
summary: "Production plan for reliable interactive process supervision (PTY + non-PTY) with explicit ownership, unified lifecycle, and deterministic cleanup"
author: "Onur Solmaz <onur@solmaz.io>"
github_username: "osolmaz"
created: "2026-02-16"
read_when:
- Working on exec/process lifecycle ownership and cleanup
- Debugging PTY and non-PTY supervision behavior
status: "in_progress"
last_updated: "2026-02-15"
---
# PTY and Process Supervision Plan
## 1. Problem and goal
We need one reliable lifecycle for long-running command execution across:
- `exec` foreground runs
- `exec` background runs
- `process` follow up actions (`poll`, `log`, `send-keys`, `paste`, `submit`, `kill`, `remove`)
- CLI agent runner subprocesses
The goal is not just to support PTY. The goal is predictable ownership, cancellation, timeout, and cleanup with no unsafe process matching heuristics.
## 2. Scope and boundaries
- Keep implementation internal in `src/process/supervisor`.
- Do not create a new package for this.
- Keep current behavior compatibility where practical.
- Do not broaden scope to terminal replay or tmux style session persistence.
## 3. Implemented in this branch
### Supervisor baseline already present
- Supervisor module is in place under `src/process/supervisor/*`.
- Exec runtime and CLI runner are already routed through supervisor spawn and wait.
- Registry finalization is idempotent.
### This pass completed
1. Explicit PTY command contract
- `SpawnInput` is now a discriminated union in `src/process/supervisor/types.ts`.
- PTY runs require `ptyCommand` instead of reusing generic `argv`.
- Supervisor no longer rebuilds PTY command strings from argv joins in `src/process/supervisor/supervisor.ts`.
- Exec runtime now passes `ptyCommand` directly in `src/agents/bash-tools.exec-runtime.ts`.
2. Process layer type decoupling
- Supervisor types no longer import `SessionStdin` from agents.
- Process local stdin contract lives in `src/process/supervisor/types.ts` (`ManagedRunStdin`).
- Adapters now depend only on process level types:
- `src/process/supervisor/adapters/child.ts`
- `src/process/supervisor/adapters/pty.ts`
3. Process tool lifecycle ownership improvement
- `src/agents/bash-tools.process.ts` now requests cancellation through supervisor first.
- `process kill/remove` now use process-tree fallback termination when supervisor lookup misses.
- `remove` keeps deterministic remove behavior by dropping running session entries immediately after termination is requested.
4. Single source watchdog defaults
- Added shared defaults in `src/agents/cli-watchdog-defaults.ts`.
- `src/agents/cli-backends.ts` consumes the shared defaults.
- `src/agents/cli-runner/reliability.ts` consumes the same shared defaults.
5. Dead helper cleanup
- Removed unused `killSession` helper path from `src/agents/bash-tools.shared.ts`.
6. Direct supervisor path tests added
- Added `src/agents/bash-tools.process.supervisor.test.ts` to cover kill and remove routing through supervisor cancellation.
7. Reliability gap fixes completed
- `src/agents/bash-tools.process.ts` now falls back to real OS-level process termination when supervisor lookup misses.
- `src/process/supervisor/adapters/child.ts` now uses process-tree termination semantics for default cancel/timeout kill paths.
- Added shared process-tree utility in `src/process/kill-tree.ts`.
8. PTY contract edge-case coverage added
- Added `src/process/supervisor/supervisor.pty-command.test.ts` for verbatim PTY command forwarding and empty-command rejection.
- Added `src/process/supervisor/adapters/child.test.ts` for process-tree kill behavior in child adapter cancellation.
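As an illustration of the explicit PTY command contract in item 1, a discriminated union might look like the sketch below. Only `argv` and `ptyCommand` are named in this note; the `mode` discriminant, its values, and the helper function are assumptions, and the real `SpawnInput` in `src/process/supervisor/types.ts` may differ.

```ts
// Hypothetical shape: the discriminant decides which command field exists.
type SpawnInput =
  | { mode: "child"; argv: string[] }
  | { mode: "pty"; ptyCommand: string };

// Return the PTY command verbatim (no argv join), rejecting empty commands.
function ptyCommandOf(input: SpawnInput): string | undefined {
  if (input.mode !== "pty") {
    return undefined;
  }
  if (input.ptyCommand.trim() === "") {
    throw new Error("ptyCommand must not be empty");
  }
  return input.ptyCommand;
}
```

Because the union separates the two shapes, a PTY run cannot accidentally be built from `argv`, and the type checker flags any code that tries to read `ptyCommand` on a non-PTY input.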
## 4. Remaining gaps and decisions
### Reliability status
The two required reliability gaps for this pass are now closed:
- `process kill/remove` now has a real OS termination fallback when supervisor lookup misses.
- child cancel/timeout now uses process-tree kill semantics for the default kill path.
- Regression tests were added for both behaviors.
### Durability and startup reconciliation
Restart behavior is now explicitly defined as in-memory lifecycle only.
- `reconcileOrphans()` remains a no-op in `src/process/supervisor/supervisor.ts` by design.
- Active runs are not recovered after process restart.
- This boundary is intentional for this implementation pass to avoid partial persistence risks.
### Maintainability follow-ups
1. `runExecProcess` in `src/agents/bash-tools.exec-runtime.ts` still handles multiple responsibilities and can be split into focused helpers in a follow-up.
## 5. Implementation plan
The implementation pass for required reliability and contract items is complete.
Completed:
- real fallback termination for `process kill/remove`
- process-tree cancellation for child adapter default kill path
- regression tests for fallback kill and child adapter kill path
- PTY command edge-case tests under explicit `ptyCommand`
- explicit in-memory restart boundary with `reconcileOrphans()` no-op by design
Optional follow-up:
- split `runExecProcess` into focused helpers with no behavior drift
## 6. File map
### Process supervisor
- `src/process/supervisor/types.ts` updated with discriminated spawn input and process local stdin contract.
- `src/process/supervisor/supervisor.ts` updated to use explicit `ptyCommand`.
- `src/process/supervisor/adapters/child.ts` and `src/process/supervisor/adapters/pty.ts` decoupled from agent types.
- `src/process/supervisor/registry.ts` idempotent finalize unchanged and retained.
### Exec and process integration
- `src/agents/bash-tools.exec-runtime.ts` updated to pass PTY command explicitly and keep fallback path.
- `src/agents/bash-tools.process.ts` updated to cancel via supervisor with real process-tree fallback termination.
- `src/agents/bash-tools.shared.ts` removed direct kill helper path.
### CLI reliability
- `src/agents/cli-watchdog-defaults.ts` added as shared baseline.
- `src/agents/cli-backends.ts` and `src/agents/cli-runner/reliability.ts` now consume the same defaults.
## 7. Validation run in this pass
Unit tests:
- `pnpm vitest src/process/supervisor/registry.test.ts`
- `pnpm vitest src/process/supervisor/supervisor.test.ts`
- `pnpm vitest src/process/supervisor/supervisor.pty-command.test.ts`
- `pnpm vitest src/process/supervisor/adapters/child.test.ts`
- `pnpm vitest src/agents/cli-backends.test.ts`
- `pnpm vitest src/agents/bash-tools.exec.pty-cleanup.test.ts`
- `pnpm vitest src/agents/bash-tools.process.poll-timeout.test.ts`
- `pnpm vitest src/agents/bash-tools.process.supervisor.test.ts`
- `pnpm vitest src/process/exec.test.ts`
E2E targets:
- `pnpm vitest src/agents/cli-runner.test.ts`
- `pnpm vitest run src/agents/bash-tools.exec.pty-fallback.test.ts src/agents/bash-tools.exec.background-abort.test.ts src/agents/bash-tools.process.send-keys.test.ts`
Typecheck note:
- Use `pnpm build` (and `pnpm check` for full lint/docs gate) in this repo. Older notes that mention `pnpm tsgo` are obsolete.
## 8. Operational guarantees preserved
- Exec env hardening behavior is unchanged.
- Approval and allowlist flow is unchanged.
- Output sanitization and output caps are unchanged.
- PTY adapter still guarantees wait settlement on forced kill and listener disposal.
## 9. Definition of done
1. Supervisor is lifecycle owner for managed runs.
2. PTY spawn uses explicit command contract with no argv reconstruction.
3. Process layer has no type dependency on agent layer for supervisor stdin contracts.
4. Watchdog defaults have a single source.
5. Targeted unit and e2e tests remain green.
6. Restart durability boundary is explicitly documented or fully implemented.
## 10. Summary
The branch now has a coherent and safer supervision shape:
- explicit PTY contract
- cleaner process layering
- supervisor driven cancellation path for process operations
- real fallback termination when supervisor lookup misses
- process-tree cancellation for child-run default kill paths
- unified watchdog defaults
- explicit in-memory restart boundary (no orphan reconciliation across restart in this pass)


@@ -1,228 +0,0 @@
---
title: "Session Binding Channel Agnostic Plan"
summary: "Channel agnostic session binding architecture and iteration 1 delivery scope"
author: "Onur Solmaz <onur@solmaz.io>"
github_username: "osolmaz"
created: "2026-02-21"
read_when:
- Refactoring channel-agnostic session routing and bindings
- Investigating duplicate, stale, or missing session delivery across channels
status: "in_progress"
last_updated: "2026-02-21"
---
# Session Binding Channel Agnostic Plan
## Overview
This document defines the long term channel agnostic session binding model and the concrete scope for the next implementation iteration.
Goal:
- make subagent bound session routing a core capability
- keep channel specific behavior in adapters
- avoid regressions in normal Discord behavior
## Why this exists
Current behavior mixes:
- completion content policy
- destination routing policy
- Discord specific details
This caused edge cases such as:
- duplicate main and thread delivery under concurrent runs
- stale token usage on reused binding managers
- missing activity accounting for webhook sends
## Iteration 1 scope
This iteration is intentionally limited.
### 1. Add channel agnostic core interfaces
Add core types and service interfaces for bindings and routing.
Proposed core types:
```ts
export type BindingTargetKind = "subagent" | "session";
export type BindingStatus = "active" | "ending" | "ended";

export type ConversationRef = {
  channel: string;
  accountId: string;
  conversationId: string;
  parentConversationId?: string;
};

export type SessionBindingRecord = {
  bindingId: string;
  targetSessionKey: string;
  targetKind: BindingTargetKind;
  conversation: ConversationRef;
  status: BindingStatus;
  boundAt: number;
  expiresAt?: number;
  metadata?: Record<string, unknown>;
};
```
Core service contract:
```ts
export interface SessionBindingService {
  bind(input: {
    targetSessionKey: string;
    targetKind: BindingTargetKind;
    conversation: ConversationRef;
    metadata?: Record<string, unknown>;
    ttlMs?: number;
  }): Promise<SessionBindingRecord>;

  listBySession(targetSessionKey: string): SessionBindingRecord[];

  resolveByConversation(ref: ConversationRef): SessionBindingRecord | null;

  touch(bindingId: string, at?: number): void;

  unbind(input: {
    bindingId?: string;
    targetSessionKey?: string;
    reason: string;
  }): Promise<SessionBindingRecord[]>;
}
```
### 2. Add one core delivery router for subagent completions
Add a single destination resolution path for completion events.
Router contract:
```ts
export interface BoundDeliveryRouter {
  resolveDestination(input: {
    eventKind: "task_completion";
    targetSessionKey: string;
    requester?: ConversationRef;
    failClosed: boolean;
  }): {
    binding: SessionBindingRecord | null;
    mode: "bound" | "fallback";
    reason: string;
  };
}
```
For this iteration:
- only `task_completion` is routed through this new path
- existing paths for other event kinds remain as-is
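A minimal sketch of how a router could resolve a completion destination from the binding service, under stated assumptions. The types are trimmed copies of the contracts above. The `reason` strings and the "first active binding wins" rule are illustrative assumptions, and the sketch does not model `failClosed`, whose exact semantics this plan does not spell out.

```ts
// Trimmed copies of the plan's core types; only the fields this sketch needs.
type ConversationRef = { channel: string; accountId: string; conversationId: string };
type BindingStatus = "active" | "ending" | "ended";
type SessionBindingRecord = {
  bindingId: string;
  targetSessionKey: string;
  conversation: ConversationRef;
  status: BindingStatus;
};
type Destination = {
  binding: SessionBindingRecord | null;
  mode: "bound" | "fallback";
  reason: string;
};

// `listBySession` stands in for SessionBindingService.listBySession.
function resolveDestination(
  listBySession: (targetSessionKey: string) => SessionBindingRecord[],
  input: { targetSessionKey: string; requester?: ConversationRef },
): Destination {
  const active = listBySession(input.targetSessionKey).filter((b) => b.status === "active");
  const first = active[0];
  if (first !== undefined) {
    // Assumption: the first active binding is the destination.
    return { binding: first, mode: "bound", reason: "active-binding" };
  }
  // No active binding: the fallback is explicit in the result, never implicit.
  return { binding: null, mode: "fallback", reason: "no-active-binding" };
}
```

Returning `mode` and `reason` on every call keeps fallback decisions observable in logs, which is what the routing invariants in this plan ask for.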
### 3. Keep Discord as adapter
Discord remains the first adapter implementation.
Adapter responsibilities:
- create/reuse thread conversations
- send bound messages via webhook or channel send
- validate thread state (archived/deleted)
- map adapter metadata (webhook identity, thread ids)
### 4. Fix currently known correctness issues
Required in this iteration:
- refresh token usage when reusing existing thread binding manager
- record outbound activity for webhook based Discord sends
- stop implicit main channel fallback when a bound thread destination is selected for session mode completion
### 5. Preserve current runtime safety defaults
No behavior change for users with thread bound spawn disabled.
Defaults stay:
- `channels.discord.threadBindings.spawnSubagentSessions = false`
Result:
- normal Discord users stay on current behavior
- new core path affects only bound session completion routing where enabled
## Not in iteration 1
Explicitly deferred:
- ACP binding targets (`targetKind: "acp"`)
- new channel adapters beyond Discord
- global replacement of all delivery paths (`spawn_ack`, future `subagent_message`)
- protocol level changes
- store migration/versioning redesign for all binding persistence
Notes on ACP:
- interface design keeps room for ACP
- ACP implementation is not started in this iteration
## Routing invariants
These invariants are mandatory for iteration 1.
- destination selection and content generation are separate steps
- if session mode completion resolves to an active bound destination, delivery must target that destination
- no hidden reroute from bound destination to main channel
- fallback behavior must be explicit and observable
## Compatibility and rollout
Compatibility target:
- no regression for users with thread bound spawning off
- no change to non-Discord channels in this iteration
Rollout:
1. Land interfaces and router behind current feature gates.
2. Route Discord completion mode bound deliveries through router.
3. Keep legacy path for non-bound flows.
4. Verify with targeted tests and canary runtime logs.
## Tests required in iteration 1
Unit and integration coverage required:
- manager token rotation uses latest token after manager reuse
- webhook sends update channel activity timestamps
- two active bound sessions in same requester channel do not duplicate to main channel
- completion for bound session mode run resolves to thread destination only
- disabled spawn flag keeps legacy behavior unchanged
## Proposed implementation files
Core:
- `src/infra/outbound/session-binding-service.ts` (new)
- `src/infra/outbound/bound-delivery-router.ts` (new)
- `src/agents/subagent-announce.ts` (completion destination resolution integration)
Discord adapter and runtime:
- `src/discord/monitor/thread-bindings.manager.ts`
- `src/discord/monitor/reply-delivery.ts`
- `src/discord/send.outbound.ts`
Tests:
- `src/discord/monitor/provider*.test.ts`
- `src/discord/monitor/reply-delivery.test.ts`
- `src/agents/subagent-announce.format.test.ts`
## Done criteria for iteration 1
- core interfaces exist and are wired for completion routing
- correctness fixes above are merged with tests
- no main and thread duplicate completion delivery in session mode bound runs
- no behavior change for disabled bound spawn deployments
- ACP remains explicitly deferred

View File

@@ -1,341 +0,0 @@
---
title: "Thread Bound Subagents"
summary: "Discord thread bound subagent sessions with plugin lifecycle hooks, routing, and config kill switches"
author: "Onur Solmaz <onur@solmaz.io>"
github_username: "osolmaz"
created: "2026-02-21"
owner: "onutc"
status: "implemented"
last_updated: "2026-02-21"
---
# Thread Bound Subagents
## Overview
This feature lets users interact with spawned subagents directly inside Discord threads.
Instead of only waiting for a completion summary in the parent session, users can move into a dedicated thread that routes messages to the spawned subagent session. Replies are sent in-thread with a thread bound persona.
The implementation is split between channel agnostic core lifecycle hooks and Discord specific extension behavior.
## Goals
- Allow direct thread conversation with a spawned subagent session.
- Keep default subagent orchestration channel agnostic.
- Support both automatic thread creation on spawn and manual focus controls.
- Provide predictable cleanup on completion, kill, timeout, and thread lifecycle changes.
- Keep behavior configurable with global defaults plus channel and account overrides.
## Out of scope
- New ACP protocol features.
- Non Discord thread binding implementations in this document.
- New bot accounts or app level Discord identity changes.
## What shipped
- `sessions_spawn` supports `thread: true` and `mode: "run" | "session"`.
- Spawn flow supports persistent thread bound sessions.
- Discord thread binding manager supports bind, unbind, TTL sweep, and persistence.
- Plugin hook lifecycle for subagents:
  - `subagent_spawning`
  - `subagent_spawned`
  - `subagent_delivery_target`
  - `subagent_ended`
- Discord extension implements thread auto bind, delivery target override, and unbind on end.
- Text commands for manual control:
  - `/focus`
  - `/unfocus`
  - `/agents`
  - `/session ttl`
- Global and Discord scoped enablement and TTL controls, including a global kill switch.
## Core concepts
### Spawn modes
- `mode: "run"`
  - one task lifecycle
  - completion announcement flow
- `mode: "session"`
  - persistent thread bound session
  - supports follow up user messages in thread
Default mode behavior:
- if `thread: true` and mode omitted, mode defaults to `"session"`
- otherwise mode defaults to `"run"`
Constraint:
- `mode: "session"` requires `thread: true`
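The default and constraint rules above can be sketched as one small function. `resolveSpawnMode` and its input shape are hypothetical names for illustration, not the actual `sessions_spawn` implementation.

```ts
type SpawnMode = "run" | "session";

// Hypothetical helper: applies the documented default, then the constraint.
function resolveSpawnMode(input: { thread?: boolean; mode?: SpawnMode }): SpawnMode {
  const thread = input.thread === true;
  // Default: "session" when thread is true and mode is omitted, otherwise "run".
  const mode: SpawnMode = input.mode ?? (thread ? "session" : "run");
  // Constraint: session mode only makes sense with a bound thread.
  if (mode === "session" && !thread) {
    throw new Error('mode: "session" requires thread: true');
  }
  return mode;
}
```

An explicit `mode: "run"` with `thread: true` stays valid under these rules; only `mode: "session"` without a thread is rejected.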
### Thread binding target model
Bindings are generic targets, not only subagents.
- `targetKind: "subagent" | "acp"`
- `targetSessionKey: string`
This allows the same routing primitive to support ACP/session bindings as well.
### Thread binding manager
The manager is responsible for:
- binding or creating threads for a session target
- unbinding by thread or by target session
- managing webhook reuse and recent unbound webhook echo suppression
- TTL based unbind and stale thread cleanup
- persistence load and save
## Architecture
### Core and extension boundary
Core (`src/agents/*`) does not directly depend on Discord routing internals.
Core emits lifecycle intent through plugin hooks.
Discord extension (`extensions/discord/src/subagent-hooks.ts`) implements Discord specific behavior:
- pre spawn thread bind preparation
- completion delivery target override to bound thread
- unbind on subagent end
### Plugin hook flow
1. `subagent_spawning`
   - before run starts
   - can block spawn with `status: "error"`
   - used to prepare thread binding when `thread: true`
2. `subagent_spawned`
   - post run registration event
3. `subagent_delivery_target`
   - completion routing override hook
   - can redirect completion delivery to bound Discord thread origin
4. `subagent_ended`
   - cleanup and unbind signal
### Account ID normalization contract
Thread binding and routing state must use one canonical account id abstraction.
Specification:
- Introduce a shared account id module (proposed: `src/routing/account-id.ts`) and stop defining local normalizers.
- Expose two explicit helpers:
  - `normalizeAccountId(value): string`
    - returns canonical, defaulted id (current default is `default`)
    - use for map keys, manager registration and lookup, persistence keys, routing keys
  - `normalizeOptionalAccountId(value): string | undefined`
    - returns canonical id when present, `undefined` when absent
    - use for inbound optional context fields and merge logic
- Do not implement ad hoc account normalization in feature modules.
  - This includes `trim`, `toLowerCase`, or defaulting logic in local helper functions.
- Any map keyed by account id must only accept canonical ids from shared helpers.
- Hook payloads and delivery context should carry raw optional account ids, and normalize at module boundaries only.
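A minimal sketch of the two helpers, assuming the canonical form is "trimmed and lowercased" (the spec names `trim` and `toLowerCase` only as things feature modules must not do locally, so the exact canonical rules here are an assumption). `DEFAULT_ACCOUNT_ID` is a hypothetical constant name.

```ts
// Assumed constant name; the spec only states that the current default is `default`.
const DEFAULT_ACCOUNT_ID = "default";

// Returns the canonical id when present, undefined when absent or blank.
function normalizeOptionalAccountId(value: string | null | undefined): string | undefined {
  const canonical = (value ?? "").trim().toLowerCase();
  return canonical === "" ? undefined : canonical;
}

// Returns the canonical id, defaulted; intended for map, persistence, and routing keys.
function normalizeAccountId(value: string | null | undefined): string {
  return normalizeOptionalAccountId(value) ?? DEFAULT_ACCOUNT_ID;
}
```

Building the defaulted helper on top of the optional one keeps both on a single canonicalization path, so the two cannot drift apart.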
Migration guardrails:
- Replace duplicate normalizers in routing, reply payload, command context, and provider helpers with shared helpers.
- Add contract tests that assert identical normalization behavior across:
  - route resolution
  - thread binding manager lookup
  - reply delivery target filtering
  - command run context merge
### Persistence and state
Binding state path:
- `${stateDir}/discord/thread-bindings.json`
Record shape contains:
- account, channel, thread
- target kind and target session key
- agent label metadata
- webhook id/token
- boundBy, boundAt, expiresAt
State is stored on `globalThis` to keep one shared registry across ESM and Jiti loader paths.
## Configuration
### Effective precedence
For Discord thread binding options, account override wins, then channel, then global session default, then built in fallback.
- account: `channels.discord.accounts.<id>.threadBindings.<key>`
- channel: `channels.discord.threadBindings.<key>`
- global: `session.threadBindings.<key>`
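The precedence chain can be sketched as a nullish-coalescing cascade. `ThreadBindingOptions`, `resolveThreadBindings`, and the layer names are hypothetical; the built-in fallbacks (`true`, `24`, `false`) are taken from the defaults this document lists for each key and are otherwise assumptions.

```ts
type ThreadBindingOptions = {
  enabled?: boolean;
  ttlHours?: number;
  spawnSubagentSessions?: boolean;
};

type Layers = {
  account?: ThreadBindingOptions; // channels.discord.accounts.<id>.threadBindings
  channel?: ThreadBindingOptions; // channels.discord.threadBindings
  global?: ThreadBindingOptions; // session.threadBindings
};

// Account wins, then channel, then global, then the built-in fallback.
function resolveThreadBindings(layers: Layers): Required<ThreadBindingOptions> {
  const { account, channel, global } = layers;
  return {
    enabled: account?.enabled ?? channel?.enabled ?? global?.enabled ?? true,
    ttlHours: account?.ttlHours ?? channel?.ttlHours ?? global?.ttlHours ?? 24,
    // This key has no global scope in this document, so only two layers are consulted.
    spawnSubagentSessions:
      account?.spawnSubagentSessions ?? channel?.spawnSubagentSessions ?? false,
  };
}
```

Using `??` rather than `||` matters here: an explicit `enabled: false` at the account level must win over `true` further down the chain.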
### Keys
| Key | Scope | Default | Notes |
| ------------------------------------------------------- | --------------- | --------------- | ----------------------------------------- |
| `session.threadBindings.enabled` | global | `true` | master default kill switch |
| `session.threadBindings.ttlHours` | global | `24` | default auto unfocus TTL |
| `channels.discord.threadBindings.enabled` | channel/account | inherits global | Discord override kill switch |
| `channels.discord.threadBindings.ttlHours` | channel/account | inherits global | Discord TTL override |
| `channels.discord.threadBindings.spawnSubagentSessions` | channel/account | `false` | opt in for `thread: true` spawn auto bind |
### Runtime effect of enable switch
When effective `enabled` is false for a Discord account:
- provider creates a noop thread binding manager for runtime wiring
- no real manager is registered for lookup by account id
- inbound bound thread routing is effectively disabled
- completion routing overrides do not resolve bound thread origins
- `/focus`, `/unfocus`, and thread binding specific operations report unavailable
- `thread: true` spawn path returns actionable error from Discord hook layer
## Flow and behavior
### Spawn with `thread: true`
1. Spawn validates mode and permissions.
2. `subagent_spawning` hook runs.
3. Discord extension checks effective flags:
   - thread bindings enabled
   - `spawnSubagentSessions` enabled
4. Extension attempts auto bind and thread creation.
5. If bind fails:
   - spawn returns error
   - provisional child session is deleted
6. If bind succeeds:
   - child run starts
   - run is registered with spawn mode
### Manual focus and unfocus
- `/focus <target>`
  - Discord only
  - resolves subagent or session target
  - binds current or created thread to target session
- `/unfocus`
  - Discord thread only
  - unbinds current thread
### Inbound routing
- Discord preflight checks current thread id against thread binding manager.
- If bound, effective session routing uses bound target session key.
- If not bound, normal routing path is used.
### Outbound routing
- Reply delivery checks whether current session has thread bindings.
- Bound sessions deliver to thread via webhook aware path.
- Unbound sessions use normal bot delivery.
### Completion routing
- Core completion flow calls `subagent_delivery_target`.
- Discord extension returns bound thread origin when it can resolve one.
- Core merges hook origin with requester origin and delivers completion.
### Cleanup
Cleanup occurs on:
- completion
- error or timeout completion path
- kill and terminate paths
- TTL expiration
- archived or deleted thread probes
- manual `/unfocus`
Cleanup behavior includes unbind and optional farewell messaging.
## Commands and user UX
| Command                                                          | Purpose                                                              |
| ---------------------------------------------------------------- | -------------------------------------------------------------------- |
| `/subagents spawn <agentId> <task> [--model] [--thinking]`       | spawn subagent; may be thread bound when `thread: true` path is used |
| `/focus <subagent-label\|session-key\|session-id\|session-label>` | manually bind thread to subagent or session                          |
| `/unfocus`                                                       | remove binding from current thread                                   |
| `/agents`                                                        | list active agents and binding state                                 |
| `/session ttl <duration\|off>`                                    | update TTL for focused thread binding                                |
Notes:
- `/session ttl` is currently Discord thread focused behavior.
- Thread intro and farewell text are generated by thread binding message helpers.
## Failure handling and safety
- Spawn returns explicit errors when thread binding cannot be prepared.
- Spawn failure after provisional bind attempts best effort unbind and session delete.
- Completion logic prevents duplicate ended hook emission.
- Retry and expiry guards prevent infinite completion announce retry loops.
- Webhook echo suppression avoids unbound webhook messages being reprocessed as inbound turns.
## Module map
### Core orchestration
- `src/agents/subagent-spawn.ts`
- `src/agents/subagent-announce.ts`
- `src/agents/subagent-registry.ts`
- `src/agents/subagent-registry-cleanup.ts`
- `src/agents/subagent-registry-completion.ts`
### Discord runtime
- `src/discord/monitor/provider.ts`
- `src/discord/monitor/thread-bindings.manager.ts`
- `src/discord/monitor/thread-bindings.state.ts`
- `src/discord/monitor/thread-bindings.lifecycle.ts`
- `src/discord/monitor/thread-bindings.messages.ts`
- `src/discord/monitor/message-handler.preflight.ts`
- `src/discord/monitor/message-handler.process.ts`
- `src/discord/monitor/reply-delivery.ts`
### Plugin hooks and extension
- `src/plugins/types.ts`
- `src/plugins/hooks.ts`
- `extensions/discord/src/subagent-hooks.ts`
### Config and schema
- `src/config/types.base.ts`
- `src/config/types.discord.ts`
- `src/config/zod-schema.session.ts`
- `src/config/zod-schema.providers-core.ts`
- `src/config/schema.help.ts`
- `src/config/schema.labels.ts`
## Test coverage highlights
- `extensions/discord/src/subagent-hooks.test.ts`
- `src/discord/monitor/thread-bindings.ttl.test.ts`
- `src/discord/monitor/thread-bindings.shared-state.test.ts`
- `src/discord/monitor/reply-delivery.test.ts`
- `src/discord/monitor/message-handler.preflight.test.ts`
- `src/discord/monitor/message-handler.process.test.ts`
- `src/auto-reply/reply/commands-subagents-focus.test.ts`
- `src/auto-reply/reply/commands-session-ttl.test.ts`
- `src/agents/subagent-registry.steer-restart.test.ts`
- `src/agents/subagent-registry-completion.test.ts`
## Operational summary
- Use `session.threadBindings.enabled` as the global kill switch default.
- Use `channels.discord.threadBindings.enabled` and account overrides for selective enablement.
- Keep `spawnSubagentSessions` opt in for thread auto spawn behavior.
- Use TTL settings for automatic unfocus policy control.
This model keeps subagent lifecycle orchestration generic while giving Discord a full thread bound interaction path.
## Related plan
For channel agnostic SessionBinding architecture and scoped iteration planning, see:
- `docs/experiments/plans/session-binding-channel-agnostic.md`
ACP remains a next step in that plan and is intentionally not implemented in this shipped Discord thread-bound flow.

View File

@@ -1,802 +0,0 @@
---
title: "ACP Thread Bound Agents"
summary: "Integrate ACP coding agents via a first-class ACP control plane in core and plugin-backed runtimes (acpx first)"
author: "Onur Solmaz <onur@solmaz.io>"
github_username: "osolmaz"
created: "2026-02-26"
status: "draft"
last_updated: "2026-02-25"
---
# ACP Thread Bound Agents
## Overview
This plan defines how OpenClaw should support ACP coding agents in thread-capable channels (Discord first) with production-level lifecycle and recovery.
Related document:
- [Unified Runtime Streaming Refactor Plan](./2026-02-26-acp-unified-streaming-refactor.md)
Target user experience:
- a user spawns or focuses an ACP session into a thread
- user messages in that thread route to the bound ACP session
- agent output streams back to the same thread persona
- session can be persistent or one shot with explicit cleanup controls
## Decision summary
Long term recommendation is a hybrid architecture:
- OpenClaw core owns ACP control plane concerns
  - session identity and metadata
  - thread binding and routing decisions
  - delivery invariants and duplicate suppression
  - lifecycle cleanup and recovery semantics
- ACP runtime backend is pluggable
  - first backend is an acpx-backed plugin service
  - runtime does ACP transport, queueing, cancel, reconnect
OpenClaw should not reimplement ACP transport internals in core.
OpenClaw should not rely on a pure plugin-only interception path for routing.
## North-star architecture (holy grail)
Treat ACP as a first-class control plane in OpenClaw, with pluggable runtime adapters.
Non-negotiable invariants:
- every ACP thread binding references a valid ACP session record
- every ACP session has explicit lifecycle state (`creating`, `idle`, `running`, `cancelling`, `closed`, `error`)
- every ACP run has explicit run state (`queued`, `running`, `completed`, `failed`, `cancelled`)
- spawn, bind, and initial enqueue are atomic
- command retries are idempotent (no duplicate runs or duplicate Discord outputs)
- bound-thread channel output is a projection of ACP run events, never ad-hoc side effects
Long-term ownership model:
- `AcpSessionManager` is the single ACP writer and orchestrator
- manager lives in gateway process first; can be moved to a dedicated sidecar later behind the same interface
- per ACP session key, manager owns one in-memory actor (serialized command execution)
- adapters (`acpx`, future backends) are transport/runtime implementations only
Long-term persistence model:
- move ACP control-plane state to a dedicated SQLite store (WAL mode) under OpenClaw state dir
- keep `SessionEntry.acp` as compatibility projection during migration, not source-of-truth
- store ACP events append-only to support replay, crash recovery, and deterministic delivery
### Delivery strategy (bridge to holy-grail)
- short-term bridge
  - keep current thread binding mechanics and existing ACP config surface
  - fix metadata-gap bugs and route ACP turns through a single core ACP branch
  - add idempotency keys and fail-closed routing checks immediately
- long-term cutover
  - move ACP source-of-truth to control-plane DB + actors
  - make bound-thread delivery purely event-projection based
  - remove legacy fallback behavior that depends on opportunistic session-entry metadata
## Why not pure plugin only
Current plugin hooks are not sufficient for end to end ACP session routing without core changes.
- inbound routing from thread binding resolves to a session key in core dispatch first
- message hooks are fire-and-forget and cannot short-circuit the main reply path
- plugin commands are good for control operations but not for replacing core per-turn dispatch flow
Result:
- ACP runtime can be pluginized
- ACP routing branch must exist in core
## Existing foundation to reuse
Already implemented and should remain canonical:
- thread binding target supports `subagent` and `acp`
- inbound thread routing override resolves by binding before normal dispatch
- outbound thread identity via webhook in reply delivery
- `/focus` and `/unfocus` flow with ACP target compatibility
- persistent binding store with restore on startup
- unbind lifecycle on thread archive or delete, unfocus, and session reset or delete
This plan extends that foundation rather than replacing it.
## Architecture
### Boundary model
Core (must be in OpenClaw core):
- ACP session-mode dispatch branch in the reply pipeline
- delivery arbitration to avoid parent plus thread duplication
- ACP control-plane persistence (with `SessionEntry.acp` compatibility projection during migration)
- lifecycle unbind and runtime detach semantics tied to session reset/delete
Plugin backend (acpx implementation):
- ACP runtime worker supervision
- acpx process invocation and event parsing
- ACP command handlers (`/acp ...`) and operator UX
- backend-specific config defaults and diagnostics
### Runtime ownership model
- one gateway process owns ACP orchestration state
- ACP execution runs in supervised child processes via acpx backend
- process strategy is long lived per active ACP session key, not per message
This avoids startup cost on every prompt and keeps cancel and reconnect semantics reliable.
### Core runtime contract
Add a core ACP runtime contract so routing code does not depend on CLI details and can switch backends without changing dispatch logic:
```ts
export type AcpRuntimePromptMode = "prompt" | "steer";

export type AcpRuntimeHandle = {
  sessionKey: string;
  backend: string;
  runtimeSessionName: string;
};

export type AcpRuntimeEvent =
  | { type: "text_delta"; stream: "output" | "thought"; text: string }
  | { type: "tool_call"; name: string; argumentsText: string }
  | { type: "done"; usage?: Record<string, number> }
  | { type: "error"; code: string; message: string; retryable?: boolean };

export interface AcpRuntime {
  ensureSession(input: {
    sessionKey: string;
    agent: string;
    mode: "persistent" | "oneshot";
    cwd?: string;
    env?: Record<string, string>;
    idempotencyKey: string;
  }): Promise<AcpRuntimeHandle>;

  submit(input: {
    handle: AcpRuntimeHandle;
    text: string;
    mode: AcpRuntimePromptMode;
    idempotencyKey: string;
  }): Promise<{ runtimeRunId: string }>;

  stream(input: {
    handle: AcpRuntimeHandle;
    runtimeRunId: string;
    onEvent: (event: AcpRuntimeEvent) => Promise<void> | void;
    signal?: AbortSignal;
  }): Promise<void>;

  cancel(input: {
    handle: AcpRuntimeHandle;
    runtimeRunId?: string;
    reason?: string;
    idempotencyKey: string;
  }): Promise<void>;

  close(input: { handle: AcpRuntimeHandle; reason: string; idempotencyKey: string }): Promise<void>;

  health?(): Promise<{ ok: boolean; details?: string }>;
}
```
Implementation detail:
- first backend: `AcpxRuntime` shipped as a plugin service
- core resolves runtime via registry and fails with explicit operator error when no ACP runtime backend is available
### Control-plane data model and persistence
Long-term source-of-truth is a dedicated ACP SQLite database (WAL mode), for transactional updates and crash-safe recovery:
- `acp_sessions`
  - `session_key` (pk), `backend`, `agent`, `mode`, `cwd`, `state`, `created_at`, `updated_at`, `last_error`
- `acp_runs`
  - `run_id` (pk), `session_key` (fk), `state`, `requester_message_id`, `idempotency_key`, `started_at`, `ended_at`, `error_code`, `error_message`
- `acp_bindings`
  - `binding_key` (pk), `thread_id`, `channel_id`, `account_id`, `session_key` (fk), `expires_at`, `bound_at`
- `acp_events`
  - `event_id` (pk), `run_id` (fk), `seq`, `kind`, `payload_json`, `created_at`
- `acp_delivery_checkpoint`
  - `run_id` (pk/fk), `last_event_seq`, `last_discord_message_id`, `updated_at`
- `acp_idempotency`
  - `scope`, `idempotency_key`, `result_json`, `created_at`, unique `(scope, idempotency_key)`
```ts
export type AcpSessionMeta = {
  backend: string;
  agent: string;
  runtimeSessionName: string;
  mode: "persistent" | "oneshot";
  cwd?: string;
  state: "idle" | "running" | "error";
  lastActivityAt: number;
  lastError?: string;
};
```
Storage rules:
- keep `SessionEntry.acp` as a compatibility projection during migration
- process ids and sockets stay in memory only
- durable lifecycle and run status live in ACP DB, not generic session JSON
- if runtime owner dies, gateway rehydrates from ACP DB and resumes from checkpoints
### Routing and delivery
Inbound:
- keep current thread binding lookup as first routing step
- if bound target is ACP session, route to ACP runtime branch instead of `getReplyFromConfig`
- explicit `/acp steer` command uses `mode: "steer"`
Outbound:
- ACP event stream is normalized to OpenClaw reply chunks
- delivery target is resolved through existing bound destination path
- when a bound thread is active for that session turn, parent channel completion is suppressed
Streaming policy:
- stream partial output with coalescing window
- configurable min interval and max chunk bytes to stay under Discord rate limits
- final message always emitted on completion or failure
### State machines and transaction boundaries
Session state machine:
- `creating -> idle -> running -> idle`
- `running -> cancelling -> idle | error`
- `idle -> closed`
- `error -> idle | closed`
Run state machine:
- `queued -> running -> completed`
- `running -> failed | cancelled`
- `queued -> cancelled`
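The two state machines can be encoded as transition tables so illegal transitions are rejected in one place. This is a sketch: the table and function names are hypothetical, and it encodes only the edges listed above (for example, no direct `running -> error` edge, because none is listed).

```ts
type SessionState = "creating" | "idle" | "running" | "cancelling" | "closed" | "error";
type RunState = "queued" | "running" | "completed" | "failed" | "cancelled";

// Allowed next states per session state, copied from the listed transitions.
const SESSION_TRANSITIONS: Record<SessionState, readonly SessionState[]> = {
  creating: ["idle"],
  idle: ["running", "closed"],
  running: ["idle", "cancelling"],
  cancelling: ["idle", "error"],
  error: ["idle", "closed"],
  closed: [], // terminal
};

// Allowed next states per run state; completed, failed, and cancelled are terminal.
const RUN_TRANSITIONS: Record<RunState, readonly RunState[]> = {
  queued: ["running", "cancelled"],
  running: ["completed", "failed", "cancelled"],
  completed: [],
  failed: [],
  cancelled: [],
};

function canTransition<S extends string>(
  table: Record<S, readonly S[]>,
  from: S,
  to: S,
): boolean {
  return table[from].includes(to);
}
```

A store-level guard could call `canTransition` inside the same transaction that writes the new state, so an invalid transition aborts the write instead of being persisted.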
Required transaction boundaries:
- spawn transaction
  - create ACP session row
  - create/update ACP thread binding row
  - enqueue initial run row
- close transaction
  - mark session closed
  - delete/expire binding rows
  - write final close event
- cancel transaction
  - mark target run cancelling/cancelled with idempotency key
No partial success is allowed across these boundaries.
### Per-session actor model
`AcpSessionManager` runs one actor per ACP session key:
- actor mailbox serializes `submit`, `cancel`, `close`, and `stream` side effects
- actor owns runtime handle hydration and runtime adapter process lifecycle for that session
- actor writes run events in-order (`seq`) before any Discord delivery
- actor updates delivery checkpoints after successful outbound send
This removes cross-turn races and prevents duplicate or out-of-order thread output.
### Idempotency and delivery projection
All external ACP actions must carry idempotency keys:
- spawn idempotency key
- prompt/steer idempotency key
- cancel idempotency key
- close idempotency key
Delivery rules:
- Discord messages are derived from `acp_events` plus `acp_delivery_checkpoint`
- retries resume from checkpoint without re-sending already delivered chunks
- final reply emission is exactly-once per run from projection logic
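A sketch of the checkpoint-based resume rule: given a run's committed events and its delivery checkpoint, only events past the checkpoint are candidates for sending. The row types mirror the `acp_events` and `acp_delivery_checkpoint` columns in camelCase; `eventsToDeliver` is a hypothetical name, and the sketch assumes `seq` starts at 1.

```ts
type AcpEventRow = { runId: string; seq: number; kind: string; payloadJson: string };
type DeliveryCheckpoint = { runId: string; lastEventSeq: number };

// Returns the not-yet-delivered events for one run, in seq order.
function eventsToDeliver(
  events: readonly AcpEventRow[],
  runId: string,
  checkpoint?: DeliveryCheckpoint,
): AcpEventRow[] {
  // Assumption: a missing checkpoint means nothing has been delivered (seq starts at 1).
  const lastDelivered =
    checkpoint !== undefined && checkpoint.runId === runId ? checkpoint.lastEventSeq : 0;
  return events
    .filter((e) => e.runId === runId && e.seq > lastDelivered)
    .sort((a, b) => a.seq - b.seq);
}
```

On retry, the projection reruns this query instead of replaying the whole run, which is what keeps already delivered chunks from being sent twice.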
### Recovery and self-healing
On gateway start:
- load non-terminal ACP sessions (`creating`, `idle`, `running`, `cancelling`, `error`)
- recreate actors lazily on first inbound event or eagerly under configured cap
- reconcile any `running` runs missing heartbeats and mark `failed` or recover via adapter
On inbound Discord thread message:
- if binding exists but ACP session is missing, fail closed with explicit stale-binding message
- optionally auto-unbind stale binding after operator-safe validation
- never silently route stale ACP bindings to normal LLM path
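The fail-closed rule for stale bindings can be sketched as a pure routing decision. All names here (`routeInbound`, `InboundRoute`, the map and set inputs) are hypothetical stand-ins for the real binding manager and ACP session store lookups.

```ts
type Binding = { targetKind: "subagent" | "acp"; targetSessionKey: string };

type InboundRoute =
  | { kind: "normal" }
  | { kind: "bound_session"; sessionKey: string }
  | { kind: "acp"; sessionKey: string }
  | { kind: "stale_binding"; message: string };

function routeInbound(
  threadId: string,
  bindings: ReadonlyMap<string, Binding>,
  acpSessionKeys: ReadonlySet<string>,
): InboundRoute {
  const binding = bindings.get(threadId);
  if (binding === undefined) return { kind: "normal" };
  if (binding.targetKind !== "acp") {
    return { kind: "bound_session", sessionKey: binding.targetSessionKey };
  }
  if (!acpSessionKeys.has(binding.targetSessionKey)) {
    // Fail closed: report the stale binding instead of falling through to the normal LLM path.
    return {
      kind: "stale_binding",
      message: `ACP session ${binding.targetSessionKey} for this thread no longer exists`,
    };
  }
  return { kind: "acp", sessionKey: binding.targetSessionKey };
}
```

Modeling the stale case as its own variant forces every caller to handle it explicitly, so it cannot be confused with the unbound `normal` path.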
### Lifecycle and safety
Supported operations:
- cancel current run: `/acp cancel`
- unbind thread: `/unfocus`
- close ACP session: `/acp close`
- auto close idle sessions by effective TTL
TTL policy:
- effective TTL is minimum of
  - global/session TTL
  - Discord thread binding TTL
  - ACP runtime owner TTL
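The TTL rule reduces to a minimum over whichever TTLs are configured. A sketch, assuming that an unset TTL is ignored rather than treated as zero (the plan does not say how unset values are handled) and using hypothetical field names:

```ts
type TtlInputs = {
  sessionTtlMs?: number; // global/session TTL
  threadBindingTtlMs?: number; // Discord thread binding TTL
  runtimeOwnerTtlMs?: number; // ACP runtime owner TTL
};

// Returns the smallest configured TTL, or undefined when none is configured.
function effectiveTtlMs(ttls: TtlInputs): number | undefined {
  const configured = [ttls.sessionTtlMs, ttls.threadBindingTtlMs, ttls.runtimeOwnerTtlMs].filter(
    (value): value is number => typeof value === "number",
  );
  return configured.length > 0 ? Math.min(...configured) : undefined;
}
```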
Safety controls:
- allowlist ACP agents by name
- restrict workspace roots for ACP sessions
- env allowlist passthrough
- max concurrent ACP sessions per account and globally
- bounded restart backoff for runtime crashes
## Config surface
Core keys:
- `acp.enabled`
- `acp.dispatch.enabled` (independent ACP routing kill switch)
- `acp.backend` (default `acpx`)
- `acp.defaultAgent`
- `acp.allowedAgents[]`
- `acp.maxConcurrentSessions`
- `acp.stream.coalesceIdleMs`
- `acp.stream.maxChunkChars`
- `acp.runtime.ttlMinutes`
- `acp.controlPlane.store` (`sqlite` default)
- `acp.controlPlane.storePath`
- `acp.controlPlane.recovery.eagerActors`
- `acp.controlPlane.recovery.reconcileRunningAfterMs`
- `acp.controlPlane.checkpoint.flushEveryEvents`
- `acp.controlPlane.checkpoint.flushEveryMs`
- `acp.idempotency.ttlHours`
- `channels.discord.threadBindings.spawnAcpSessions`
Plugin/backend keys (acpx plugin section):
- backend command/path overrides
- backend env allowlist
- backend per-agent presets
- backend startup/stop timeouts
- backend max inflight runs per session
## Implementation specification
### Control-plane modules (new)
Add dedicated ACP control-plane modules in core:
- `src/acp/control-plane/manager.ts`
  - owns ACP actors, lifecycle transitions, command serialization
- `src/acp/control-plane/store.ts`
  - SQLite schema management, transactions, query helpers
- `src/acp/control-plane/events.ts`
  - typed ACP event definitions and serialization
- `src/acp/control-plane/checkpoint.ts`
  - durable delivery checkpoints and replay cursors
- `src/acp/control-plane/idempotency.ts`
  - idempotency key reservation and response replay
- `src/acp/control-plane/recovery.ts`
  - boot-time reconciliation and actor rehydrate plan
Compatibility bridge modules:
- `src/acp/runtime/session-meta.ts`
  - remains temporarily for projection into `SessionEntry.acp`
  - must stop being source-of-truth after migration cutover
### Required invariants (must enforce in code)
- ACP session creation and thread bind are atomic (single transaction)
- there is at most one active run per ACP session actor at a time
- event `seq` is strictly increasing per run
- delivery checkpoint never advances past last committed event
- idempotency replay returns previous success payload for duplicate command keys
- stale/missing ACP metadata cannot route into normal non-ACP reply path
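
The two ordering invariants above (strictly increasing `seq`, checkpoint never ahead of the last committed event) can be sketched as a small guard object. This is an illustrative in-memory model only; the class and method names are hypothetical, and the real store is expected to enforce the same rules inside SQLite transactions. Rejecting a backwards checkpoint move is an extra assumption, not something the list above requires.

```typescript
// Hypothetical in-memory model of one run's event log. Names are illustrative only.
class RunEventLog {
  private lastCommittedSeq = 0; // seq of the last committed event (0 = none yet)
  private checkpointSeq = 0; // seq of the last event confirmed as delivered

  // Enforces "event seq is strictly increasing per run".
  append(seq: number): void {
    if (!Number.isInteger(seq) || seq <= this.lastCommittedSeq) {
      throw new Error(`non-monotonic seq ${seq} (last committed ${this.lastCommittedSeq})`);
    }
    this.lastCommittedSeq = seq;
  }

  // Enforces "delivery checkpoint never advances past last committed event".
  advanceCheckpoint(seq: number): void {
    if (seq > this.lastCommittedSeq) {
      throw new Error(`checkpoint ${seq} is ahead of last committed event ${this.lastCommittedSeq}`);
    }
    // Assumption: a checkpoint also never moves backwards.
    if (seq < this.checkpointSeq) {
      throw new Error(`checkpoint ${seq} is behind current checkpoint ${this.checkpointSeq}`);
    }
    this.checkpointSeq = seq;
  }

  committedThrough(): number {
    return this.lastCommittedSeq;
  }

  deliveredThrough(): number {
    return this.checkpointSeq;
  }
}
```
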
### Core touchpoints
Core files to change:
- `src/auto-reply/reply/dispatch-from-config.ts`
- ACP branch calls `AcpSessionManager.submit` and event-projection delivery
- remove direct ACP fallback that bypasses control-plane invariants
- `src/auto-reply/reply/inbound-context.ts` (or nearest normalized context boundary)
- expose normalized routing keys and idempotency seeds for ACP control plane
- `src/config/sessions/types.ts`
- keep `SessionEntry.acp` as projection-only compatibility field
- `src/gateway/server-methods/sessions.ts`
- reset/delete/archive must call ACP manager close/unbind transaction path
- `src/infra/outbound/bound-delivery-router.ts`
- enforce fail-closed destination behavior for ACP bound session turns
- `src/discord/monitor/thread-bindings.ts`
- add ACP stale-binding validation helpers wired to control-plane lookups
- `src/auto-reply/reply/commands-acp.ts`
- route spawn/cancel/close/steer through ACP manager APIs
- `src/agents/acp-spawn.ts`
- stop ad-hoc metadata writes; call ACP manager spawn transaction
- `src/plugin-sdk/**` and plugin runtime bridge
- expose ACP backend registration and health semantics cleanly
Core files explicitly not replaced:
- `src/discord/monitor/message-handler.preflight.ts`
- keep thread binding override behavior as the canonical session-key resolver
### ACP runtime registry API
Add a core registry module:
- `src/acp/runtime/registry.ts`
Required API:
```ts
export type AcpRuntimeBackend = {
  id: string;
  runtime: AcpRuntime;
  healthy?: () => boolean;
};
export function registerAcpRuntimeBackend(backend: AcpRuntimeBackend): void;
export function unregisterAcpRuntimeBackend(id: string): void;
export function getAcpRuntimeBackend(id?: string): AcpRuntimeBackend | null;
export function requireAcpRuntimeBackend(id?: string): AcpRuntimeBackend;
```
Behavior:
- `requireAcpRuntimeBackend` throws a typed ACP backend missing error when unavailable
- plugin service registers backend on `start` and unregisters on `stop`
- runtime lookups are read-only and process-local
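
A minimal sketch of how the registry could satisfy this behavior with a process-local `Map`. `AcpRuntime` is stubbed here because its real shape is not shown in this section, and `AcpBackendMissingError` is a hypothetical name for the typed error. Returning the first registered backend when no `id` is passed is an assumption. In the real module these functions would be exported as in the API above.

```typescript
// Placeholder for the real runtime interface, which is not shown in this section.
type AcpRuntime = Record<string, unknown>;

type AcpRuntimeBackend = {
  id: string;
  runtime: AcpRuntime;
  healthy?: () => boolean;
};

// Hypothetical typed error; the code string mirrors the error contract in this document.
class AcpBackendMissingError extends Error {
  readonly code = "ACP_BACKEND_MISSING";
  constructor(id?: string) {
    super(`ACP runtime backend ${id ?? "(default)"} is not registered`);
    this.name = "AcpBackendMissingError";
  }
}

// Process-local state: nothing is persisted, and lookups never mutate the map.
const backends = new Map<string, AcpRuntimeBackend>();

function registerAcpRuntimeBackend(backend: AcpRuntimeBackend): void {
  backends.set(backend.id, backend);
}

function unregisterAcpRuntimeBackend(id: string): void {
  backends.delete(id);
}

// Assumption: with no id, fall back to the first registered backend.
function getAcpRuntimeBackend(id?: string): AcpRuntimeBackend | null {
  if (id !== undefined) return backends.get(id) ?? null;
  const first = backends.values().next();
  return first.done ? null : first.value;
}

function requireAcpRuntimeBackend(id?: string): AcpRuntimeBackend {
  const backend = getAcpRuntimeBackend(id);
  if (!backend) throw new AcpBackendMissingError(id);
  return backend;
}
```

A plugin service would call `registerAcpRuntimeBackend` in its `start` hook and `unregisterAcpRuntimeBackend` in `stop`.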
### acpx runtime plugin contract (implementation detail)
For the first production backend (`extensions/acpx`), OpenClaw and acpx are
connected with a strict command contract:
- backend id: `acpx`
- plugin service id: `acpx-runtime`
- runtime handle encoding: `runtimeSessionName = acpx:v1:<base64url(json)>`
- encoded payload fields:
- `name` (acpx named session; uses OpenClaw `sessionKey`)
- `agent` (acpx agent command)
- `cwd` (session workspace root)
- `mode` (`persistent | oneshot`)
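
As a rough illustration of the handle encoding, assuming a Node.js runtime (for `Buffer` and its `base64url` encoding) and assuming all four payload fields are required. The function names are hypothetical.

```typescript
import { Buffer } from "node:buffer";

// Payload fields as listed above; treating all of them as required is an assumption.
type AcpxHandlePayload = {
  name: string;
  agent: string;
  cwd: string;
  mode: "persistent" | "oneshot";
};

const HANDLE_PREFIX = "acpx:v1:";

// Hypothetical helper: runtimeSessionName = acpx:v1:<base64url(json)>
function encodeRuntimeSessionName(payload: AcpxHandlePayload): string {
  const json = JSON.stringify(payload);
  return HANDLE_PREFIX + Buffer.from(json, "utf8").toString("base64url");
}

// Hypothetical helper: rejects handles with a different prefix or an incomplete payload.
function decodeRuntimeSessionName(handle: string): AcpxHandlePayload {
  if (!handle.startsWith(HANDLE_PREFIX)) {
    throw new Error("not an acpx:v1 runtime handle");
  }
  const json = Buffer.from(handle.slice(HANDLE_PREFIX.length), "base64url").toString("utf8");
  const parsed = JSON.parse(json) as Partial<AcpxHandlePayload>;
  if (
    typeof parsed.name !== "string" ||
    typeof parsed.agent !== "string" ||
    typeof parsed.cwd !== "string" ||
    (parsed.mode !== "persistent" && parsed.mode !== "oneshot")
  ) {
    throw new Error("acpx runtime handle payload is incomplete");
  }
  return { name: parsed.name, agent: parsed.agent, cwd: parsed.cwd, mode: parsed.mode };
}
```

Keeping the version (`v1`) in the prefix lets a later payload change be detected before any JSON parsing happens.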
Command mapping:
- ensure session:
- `acpx --format json --json-strict --cwd <cwd> <agent> sessions ensure --name <name>`
- prompt turn:
- `acpx --format json --json-strict --cwd <cwd> <agent> prompt --session <name> --file -`
- cancel:
- `acpx --format json --json-strict --cwd <cwd> <agent> cancel --session <name>`
- close:
- `acpx --format json --json-strict --cwd <cwd> <agent> sessions close <name>`
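
The mapping above can be expressed as one argv builder, which keeps the shared flags in a single place. This is only a sketch: `buildAcpxArgs` and `AcpxOp` are hypothetical names, and passing `agent` as a single argv entry is an assumption.

```typescript
type AcpxOp = "ensure" | "prompt" | "cancel" | "close";

type AcpxTarget = { cwd: string; agent: string; name: string };

// Hypothetical helper: builds the argument list that follows the `acpx` binary name.
function buildAcpxArgs(op: AcpxOp, target: AcpxTarget): string[] {
  // Flags shared by every command in the mapping above.
  const base = ["--format", "json", "--json-strict", "--cwd", target.cwd, target.agent];
  switch (op) {
    case "ensure":
      return [...base, "sessions", "ensure", "--name", target.name];
    case "prompt":
      // "--file -" means the prompt text is read from stdin.
      return [...base, "prompt", "--session", target.name, "--file", "-"];
    case "cancel":
      return [...base, "cancel", "--session", target.name];
    case "close":
      return [...base, "sessions", "close", target.name];
  }
}
```

Building an argv array (instead of a shell string) avoids quoting problems when `cwd` or the session name contains spaces.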
Streaming:
- OpenClaw consumes ndjson events from `acpx --format json --json-strict`
- `text` => `text_delta/output`
- `thought` => `text_delta/thought`
- `tool_call` => `tool_call`
- `done` => `done`
- `error` => `error`
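
A possible shape for the event normalization step, reading `text_delta/output` as "event type `text_delta`, stream `output`". That reading, the `type` field on the raw acpx event, and the choice to drop unknown event types silently are all assumptions made for this sketch.

```typescript
// Normalized ACP runtime event kinds (payload fields omitted for brevity).
type AcpRuntimeEventKind =
  | { type: "text_delta"; stream: "output" | "thought" }
  | { type: "tool_call" }
  | { type: "done" }
  | { type: "error" };

// Maps an acpx event type onto the runtime event kind from the list above.
function normalizeAcpxEventType(acpxType: string): AcpRuntimeEventKind | null {
  switch (acpxType) {
    case "text":
      return { type: "text_delta", stream: "output" };
    case "thought":
      return { type: "text_delta", stream: "thought" };
    case "tool_call":
      return { type: "tool_call" };
    case "done":
      return { type: "done" };
    case "error":
      return { type: "error" };
    default:
      return null;
  }
}

// Parses one ndjson chunk (one JSON object per line) into normalized kinds.
// Assumption: each raw event carries its kind in a string `type` field.
function normalizeNdjson(chunk: string): AcpRuntimeEventKind[] {
  const out: AcpRuntimeEventKind[] = [];
  for (const line of chunk.split("\n")) {
    const trimmed = line.trim();
    if (trimmed === "") continue; // tolerate blank lines between events
    const raw = JSON.parse(trimmed) as { type?: unknown };
    const kind = typeof raw.type === "string" ? normalizeAcpxEventType(raw.type) : null;
    if (kind) out.push(kind); // unknown types are dropped here; real code would likely log them
  }
  return out;
}
```
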
### Session schema patch
Patch `SessionEntry` in `src/config/sessions/types.ts`:
```ts
type SessionAcpMeta = {
  backend: string;
  agent: string;
  runtimeSessionName: string;
  mode: "persistent" | "oneshot";
  cwd?: string;
  state: "idle" | "running" | "error";
  lastActivityAt: number;
  lastError?: string;
};
```
Persisted field:
- `SessionEntry.acp?: SessionAcpMeta`
Migration rules:
- phase A: dual-write (`acp` projection + ACP SQLite source-of-truth)
- phase B: read-primary from ACP SQLite, fallback-read from legacy `SessionEntry.acp`
- phase C: migration command backfills missing ACP rows from valid legacy entries
- phase D: remove fallback-read and keep projection optional for UX only
- legacy fields (`cliSessionIds`, `claudeCliSessionId`) remain untouched
### Error contract
Add stable ACP error codes and user-facing messages:
- `ACP_BACKEND_MISSING`
- message: `ACP runtime backend is not configured. Install and enable the acpx runtime plugin.`
- `ACP_BACKEND_UNAVAILABLE`
- message: `ACP runtime backend is currently unavailable. Try again in a moment.`
- `ACP_SESSION_INIT_FAILED`
- message: `Could not initialize ACP session runtime.`
- `ACP_TURN_FAILED`
- message: `ACP turn failed before completion.`
Rules:
- return actionable user-safe message in-thread
- log detailed backend/system error only in runtime logs
- never silently fall back to normal LLM path when ACP routing was explicitly selected
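
One way to keep the user-safe message and the detailed log line separate is a single error class keyed by the stable codes. The class name, the `detail` field, and the fallback to `ACP_TURN_FAILED` for unknown errors are assumptions for illustration; the codes and messages are copied from the list above.

```typescript
const ACP_ERROR_MESSAGES = {
  ACP_BACKEND_MISSING:
    "ACP runtime backend is not configured. Install and enable the acpx runtime plugin.",
  ACP_BACKEND_UNAVAILABLE: "ACP runtime backend is currently unavailable. Try again in a moment.",
  ACP_SESSION_INIT_FAILED: "Could not initialize ACP session runtime.",
  ACP_TURN_FAILED: "ACP turn failed before completion.",
} as const;

type AcpErrorCode = keyof typeof ACP_ERROR_MESSAGES;

// Hypothetical error type: `message` is user-safe, `detail` is for runtime logs only.
class AcpRuntimeError extends Error {
  readonly code: AcpErrorCode;
  readonly detail: string | undefined;

  constructor(code: AcpErrorCode, detail?: string) {
    super(ACP_ERROR_MESSAGES[code]);
    // Keeps `instanceof` working when compiled down to ES5.
    Object.setPrototypeOf(this, new.target.prototype);
    this.name = "AcpRuntimeError";
    this.code = code;
    this.detail = detail;
  }
}

// Splits a failure into what the thread sees and what the runtime log records.
function describeAcpFailure(err: unknown): { userMessage: string; logLine: string } {
  if (err instanceof AcpRuntimeError) {
    return {
      userMessage: err.message,
      logLine: `${err.code}: ${err.detail ?? "(no detail)"}`,
    };
  }
  // Assumption: unknown failures are reported as a generic turn failure.
  return {
    userMessage: ACP_ERROR_MESSAGES.ACP_TURN_FAILED,
    logLine: `ACP_TURN_FAILED: ${String(err)}`,
  };
}
```
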
### Duplicate delivery arbitration
Single routing rule for ACP bound turns:
- if an active thread binding exists for the target ACP session and requester context, deliver only to that bound thread
- do not also send to parent channel for the same turn
- if bound destination selection is ambiguous, fail closed with explicit error (no implicit parent fallback)
- if no active binding exists, use normal session destination behavior
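
The routing rule above reduces to a three-way decision on the number of active bindings. A minimal sketch, assuming the caller has already filtered bindings by target ACP session and requester context; the type and function names are hypothetical.

```typescript
type AcpDestination =
  | { kind: "bound-thread"; threadId: string }
  | { kind: "session-default" };

// `activeThreadIds` holds the thread IDs of active bindings for this session and requester.
function resolveAcpDestination(activeThreadIds: string[]): AcpDestination {
  const [only, ...rest] = activeThreadIds;
  if (only === undefined) {
    // No active binding: use normal session destination behavior.
    return { kind: "session-default" };
  }
  if (rest.length > 0) {
    // Ambiguous: fail closed, with no implicit parent-channel fallback.
    throw new Error(`ambiguous ACP bound destination (${activeThreadIds.length} active bindings)`);
  }
  // Exactly one binding: deliver only to that thread, never also to the parent channel.
  return { kind: "bound-thread", threadId: only };
}
```
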
### Observability and operational readiness
Required metrics:
- ACP spawn success/failure count by backend and error code
- ACP run latency percentiles (queue wait, runtime turn time, delivery projection time)
- ACP actor restart count and restart reason
- stale-binding detection count
- idempotency replay hit rate
- Discord delivery retry and rate-limit counters
Required logs:
- structured logs keyed by `sessionKey`, `runId`, `backend`, `threadId`, `idempotencyKey`
- explicit state transition logs for session and run state machines
- adapter command logs with redaction-safe arguments and exit summary
Required diagnostics:
- `/acp sessions` includes state, active run, last error, and binding status
- `/acp doctor` (or equivalent) validates backend registration, store health, and stale bindings
### Config precedence and effective values
ACP enablement precedence:
- account override: `channels.discord.accounts.<id>.threadBindings.spawnAcpSessions`
- channel override: `channels.discord.threadBindings.spawnAcpSessions`
- global ACP gate: `acp.enabled`
- dispatch gate: `acp.dispatch.enabled`
- backend availability: registered backend for `acp.backend`
Auto-enable behavior:
- when ACP is configured (`acp.enabled=true`, `acp.dispatch.enabled=true`, or
`acp.backend=acpx`), plugin auto-enable marks `plugins.entries.acpx.enabled=true`
unless denylisted or explicitly disabled
TTL effective value:
- `min(session ttl, discord thread binding ttl, acp runtime ttl)`
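
The effective TTL rule is a plain minimum over whichever TTLs are configured. The sketch below assumes all three values are already normalized to minutes and that an unset TTL simply does not take part in the minimum; both points are assumptions, as is the function name.

```typescript
type TtlInputs = {
  sessionTtlMinutes?: number;
  threadBindingTtlMinutes?: number;
  acpRuntimeTtlMinutes?: number;
};

// Returns min(session ttl, discord thread binding ttl, acp runtime ttl),
// or undefined when none of the three is configured.
function effectiveTtlMinutes(ttls: TtlInputs): number | undefined {
  const configured = [
    ttls.sessionTtlMinutes,
    ttls.threadBindingTtlMinutes,
    ttls.acpRuntimeTtlMinutes,
  ].filter((value): value is number => typeof value === "number");
  return configured.length > 0 ? Math.min(...configured) : undefined;
}
```
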
### Test map
Unit tests:
- `src/acp/runtime/registry.test.ts` (new)
- `src/auto-reply/reply/dispatch-from-config.acp.test.ts` (new)
- `src/infra/outbound/bound-delivery-router.test.ts` (extend ACP fail-closed cases)
- `src/config/sessions/types.test.ts` or nearest session-store tests (ACP metadata persistence)
Integration tests:
- `src/discord/monitor/reply-delivery.test.ts` (bound ACP delivery target behavior)
- `src/discord/monitor/message-handler.preflight*.test.ts` (bound ACP session-key routing continuity)
- acpx plugin runtime tests in backend package (service register/start/stop + event normalization)
Gateway e2e tests:
- `src/gateway/server.sessions.gateway-server-sessions-a.e2e.test.ts` (extend ACP reset/delete lifecycle coverage)
- ACP thread turn roundtrip e2e for spawn, message, stream, cancel, unfocus, restart recovery
### Rollout guard
Add independent ACP dispatch kill switch:
- `acp.dispatch.enabled` default `false` for first release
- when disabled:
- ACP spawn/focus control commands may still bind sessions
- ACP dispatch path does not activate
- user receives explicit message that ACP dispatch is disabled by policy
- after canary validation, default can be flipped to `true` in a later release
## Command and UX plan
### New commands
- `/acp spawn <agent-id> [--mode persistent|oneshot] [--thread auto|here|off]`
- `/acp cancel [session]`
- `/acp steer <instruction>`
- `/acp close [session]`
- `/acp sessions`
### Existing command compatibility
- `/focus <sessionKey>` continues to support ACP targets
- `/unfocus` keeps current semantics
- `/session idle` and `/session max-age` replace the old TTL override
## Phased rollout
### Phase 0 ADR and schema freeze
- ship ADR for ACP control-plane ownership and adapter boundaries
- freeze DB schema (`acp_sessions`, `acp_runs`, `acp_bindings`, `acp_events`, `acp_delivery_checkpoint`, `acp_idempotency`)
- define stable ACP error codes, event contract, and state-transition guards
### Phase 1 Control-plane foundation in core
- implement `AcpSessionManager` and per-session actor runtime
- implement ACP SQLite store and transaction helpers
- implement idempotency store and replay helpers
- implement event append + delivery checkpoint modules
- wire spawn/cancel/close APIs to manager with transactional guarantees
### Phase 2 Core routing and lifecycle integration
- route thread-bound ACP turns from dispatch pipeline into ACP manager
- enforce fail-closed routing when ACP binding/session invariants fail
- integrate reset/delete/archive/unfocus lifecycle with ACP close/unbind transactions
- add stale-binding detection and optional auto-unbind policy
### Phase 3 acpx backend adapter/plugin
- implement `acpx` adapter against runtime contract (`ensureSession`, `submit`, `stream`, `cancel`, `close`)
- add backend health checks and startup/teardown registration
- normalize acpx ndjson events into ACP runtime events
- enforce backend timeouts, process supervision, and restart/backoff policy
### Phase 4 Delivery projection and channel UX (Discord first)
- implement event-driven channel projection with checkpoint resume (Discord first)
- coalesce streaming chunks with rate-limit aware flush policy
- guarantee exactly-once final completion message per run
- ship `/acp spawn`, `/acp cancel`, `/acp steer`, `/acp close`, `/acp sessions`
### Phase 5 Migration and cutover
- introduce dual-write to `SessionEntry.acp` projection plus ACP SQLite source-of-truth
- add migration utility for legacy ACP metadata rows
- flip read path to ACP SQLite primary
- remove legacy fallback routing that depends on missing `SessionEntry.acp`
### Phase 6 Hardening, SLOs, and scale limits
- enforce concurrency limits (global/account/session), queue policies, and timeout budgets
- add full telemetry, dashboards, and alert thresholds
- chaos-test crash recovery and duplicate-delivery suppression
- publish runbook for backend outage, DB corruption, and stale-binding remediation
### Full implementation checklist
- core control-plane modules and tests
- DB migrations and rollback plan
- ACP manager API integration across dispatch and commands
- adapter registration interface in plugin runtime bridge
- acpx adapter implementation and tests
- thread-capable channel delivery projection logic with checkpoint replay (Discord first)
- lifecycle hooks for reset/delete/archive/unfocus
- stale-binding detector and operator-facing diagnostics
- config validation and precedence tests for all new ACP keys
- operational docs and troubleshooting runbook
## Test plan
Unit tests:
- ACP DB transaction boundaries (spawn/bind/enqueue atomicity, cancel, close)
- ACP state-machine transition guards for sessions and runs
- idempotency reservation/replay semantics across all ACP commands
- per-session actor serialization and queue ordering
- acpx event parser and chunk coalescer
- runtime supervisor restart and backoff policy
- config precedence and effective TTL calculation
- core ACP routing branch selection and fail-closed behavior when backend/session is invalid
Integration tests:
- fake ACP adapter process for deterministic streaming and cancel behavior
- ACP manager + dispatch integration with transactional persistence
- thread-bound inbound routing to ACP session key
- thread-bound outbound delivery suppresses parent channel duplication
- checkpoint replay recovers after delivery failure and resumes from last event
- plugin service registration and teardown of ACP runtime backend
Gateway e2e tests:
- spawn ACP with thread, exchange multi-turn prompts, unfocus
- gateway restart with persisted ACP DB and bindings, then continue same session
- concurrent ACP sessions in multiple threads have no cross-talk
- duplicate command retries (same idempotency key) do not create duplicate runs or replies
- stale-binding scenario yields explicit error and optional auto-clean behavior
## Risks and mitigations
- Duplicate deliveries during transition
- Mitigation: single destination resolver and idempotent event checkpoint
- Runtime process churn under load
- Mitigation: long-lived per-session owners + concurrency caps + backoff
- Plugin absent or misconfigured
- Mitigation: explicit operator-facing error and fail-closed ACP routing (no implicit fallback to normal session path)
- Config confusion between subagent and ACP gates
- Mitigation: explicit ACP keys and command feedback that includes effective policy source
- Control-plane store corruption or migration bugs
- Mitigation: WAL mode, backup/restore hooks, migration smoke tests, and read-only fallback diagnostics
- Actor deadlocks or mailbox starvation
- Mitigation: watchdog timers, actor health probes, and bounded mailbox depth with rejection telemetry
## Acceptance checklist
- ACP session spawn can create or bind a thread in a supported channel adapter (currently Discord)
- all thread messages route to bound ACP session only
- ACP outputs appear under the same thread identity, either streamed or batched
- no duplicate output in parent channel for bound turns
- spawn+bind+initial enqueue are atomic in persistent store
- ACP command retries are idempotent and do not duplicate runs or outputs
- cancel, close, unfocus, archive, reset, and delete perform deterministic cleanup
- crash restart preserves mapping and resumes multi-turn continuity
- concurrent thread-bound ACP sessions work independently
- ACP backend missing state produces clear actionable error
- stale bindings are detected and surfaced explicitly (with optional safe auto-clean)
- control-plane metrics and diagnostics are available for operators
- new unit, integration, and e2e coverage passes
## Addendum: targeted refactors for current implementation (status)
These are non-blocking follow-ups to keep the ACP path maintainable after the current feature set lands.
### 1) Centralize ACP dispatch policy evaluation (completed)
- implemented via shared ACP policy helpers in `src/acp/policy.ts`
- dispatch, ACP command lifecycle handlers, and ACP spawn path now consume shared policy logic
### 2) Split ACP command handler by subcommand domain (completed)
- `src/auto-reply/reply/commands-acp.ts` is now a thin router
- subcommand behavior is split into:
- `src/auto-reply/reply/commands-acp/lifecycle.ts`
- `src/auto-reply/reply/commands-acp/runtime-options.ts`
- `src/auto-reply/reply/commands-acp/diagnostics.ts`
- shared helpers in `src/auto-reply/reply/commands-acp/shared.ts`
### 3) Split ACP session manager by responsibility (completed)
- manager is split into:
- `src/acp/control-plane/manager.ts` (public facade + singleton)
- `src/acp/control-plane/manager.core.ts` (manager implementation)
- `src/acp/control-plane/manager.types.ts` (manager types/deps)
- `src/acp/control-plane/manager.utils.ts` (normalization + helper functions)
### 4) Optional acpx runtime adapter cleanup
- `extensions/acpx/src/runtime.ts` can be split into:
- process execution/supervision
- ndjson event parsing/normalization
- runtime API surface (`submit`, `cancel`, `close`, etc.)
- improves testability and makes backend behavior easier to audit


@@ -1,98 +0,0 @@
---
title: "Unified Runtime Streaming Refactor Plan"
summary: "Holy grail refactor plan for one unified runtime streaming pipeline across main, subagent, and ACP"
author: "Onur Solmaz <onur@solmaz.io>"
github_username: "osolmaz"
created: "2026-02-26"
status: "draft"
last_updated: "2026-02-25"
---
# Unified Runtime Streaming Refactor Plan
## Objective
Deliver one shared streaming pipeline for `main`, `subagent`, and `acp` so all runtimes get identical coalescing, chunking, delivery ordering, and crash recovery behavior.
## Why this exists
- Current behavior is split across multiple runtime-specific shaping paths.
- Formatting/coalescing bugs can be fixed in one path but remain in others.
- Delivery consistency, duplicate suppression, and recovery semantics are harder to reason about.
## Target architecture
Single pipeline, runtime-specific adapters:
1. Runtime adapters emit canonical events only.
2. Shared stream assembler coalesces and finalizes text/tool/status events.
3. Shared channel projector applies channel-specific chunking/formatting once.
4. Shared delivery ledger enforces idempotent send/replay semantics.
5. Outbound channel adapter executes sends and records delivery checkpoints.
Canonical event contract:
- `turn_started`
- `text_delta`
- `block_final`
- `tool_started`
- `tool_finished`
- `status`
- `turn_completed`
- `turn_failed`
- `turn_cancelled`
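
The event names above could be pinned down as a literal union with a runtime guard, so malformed runtime events are rejected at the adapter boundary. Treating `turn_completed`, `turn_failed`, and `turn_cancelled` as the terminal events is an assumption made for this sketch, and the helper names are hypothetical.

```typescript
const CANONICAL_EVENT_TYPES = [
  "turn_started",
  "text_delta",
  "block_final",
  "tool_started",
  "tool_finished",
  "status",
  "turn_completed",
  "turn_failed",
  "turn_cancelled",
] as const;

type CanonicalEventType = (typeof CANONICAL_EVENT_TYPES)[number];

// Assumption: these three events end a turn; nothing may follow them for the same turn.
const TERMINAL_EVENT_TYPES: readonly CanonicalEventType[] = [
  "turn_completed",
  "turn_failed",
  "turn_cancelled",
];

// Runtime guard used to reject malformed adapter output early.
function isCanonicalEventType(value: unknown): value is CanonicalEventType {
  return (
    typeof value === "string" &&
    (CANONICAL_EVENT_TYPES as readonly string[]).indexOf(value) !== -1
  );
}

function isTerminalEventType(type: CanonicalEventType): boolean {
  return TERMINAL_EVENT_TYPES.indexOf(type) !== -1;
}
```
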
## Workstreams
### 1) Canonical streaming contract
- Define strict event schema + validation in core.
- Add adapter contract tests to guarantee each runtime emits compatible events.
- Reject malformed runtime events early and surface structured diagnostics.
### 2) Shared stream processor
- Replace runtime-specific coalescer/projector logic with one processor.
- Processor owns text delta buffering, idle flush, max-chunk splitting, and completion flush.
- Move ACP/main/subagent config resolution into one helper to prevent drift.
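
A rough sketch of the buffering behavior described above: deltas accumulate, full chunks are emitted once the buffer reaches the maximum size, an idle gap flushes the remainder, and completion flushes whatever is left. The class name and the explicit `nowMs` parameter (used instead of timers so the logic stays deterministic) are assumptions. The two options loosely correspond to `acp.stream.maxChunkChars` and `acp.stream.coalesceIdleMs`.

```typescript
type CoalescerOptions = { maxChunkChars: number; idleMs: number };

class TextCoalescer {
  private buffer = "";
  private lastDeltaAtMs = 0;
  private readonly maxChunkChars: number;
  private readonly idleMs: number;

  constructor(options: CoalescerOptions) {
    this.maxChunkChars = options.maxChunkChars;
    this.idleMs = options.idleMs;
  }

  // Buffers a delta and returns any chunks that reached the maximum size.
  push(delta: string, nowMs: number): string[] {
    this.buffer += delta;
    this.lastDeltaAtMs = nowMs;
    const ready: string[] = [];
    while (this.buffer.length >= this.maxChunkChars) {
      ready.push(this.buffer.slice(0, this.maxChunkChars));
      this.buffer = this.buffer.slice(this.maxChunkChars);
    }
    return ready;
  }

  // Idle flush: emits the remainder once no delta has arrived for `idleMs`.
  tick(nowMs: number): string[] {
    if (this.buffer === "" || nowMs - this.lastDeltaAtMs < this.idleMs) return [];
    return this.finish();
  }

  // Completion flush: emits whatever is still buffered.
  finish(): string[] {
    if (this.buffer === "") return [];
    const rest = this.buffer;
    this.buffer = "";
    return [rest];
  }
}
```

Splitting here is by UTF-16 code unit, so a production version would need to avoid cutting surrogate pairs or channel markup in half.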
### 3) Shared channel projection
- Keep channel adapters dumb: accept finalized blocks and send.
- Move Discord-specific chunking quirks to channel projector only.
- Keep pipeline channel-agnostic before projection.
### 4) Delivery ledger + replay
- Add per-turn/per-chunk delivery IDs.
- Record checkpoints before and after physical send.
- On restart, replay pending chunks idempotently and avoid duplicates.
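
The checkpoint-before-and-after-send idea can be sketched with an in-memory ledger. In a real implementation the ledger would need durable storage, and the names below are hypothetical. Note one limit of this sketch: a crash between the physical send and the commit leaves the entry pending, so replay would resend it unless the channel send itself is idempotent.

```typescript
type DeliveryStatus = "pending" | "sent";

// Hypothetical ledger keyed by per-turn/per-chunk delivery ID (for example "turn-1:0").
class DeliveryLedger {
  private readonly entries = new Map<string, DeliveryStatus>();

  // Checkpoint before the physical send. Returns false if this ID was already sent.
  begin(deliveryId: string): boolean {
    if (this.entries.get(deliveryId) === "sent") return false;
    this.entries.set(deliveryId, "pending");
    return true;
  }

  // Checkpoint after the physical send succeeded.
  commit(deliveryId: string): void {
    this.entries.set(deliveryId, "sent");
  }

  // IDs that were started but never committed; these are replayed after a restart.
  pending(): string[] {
    const ids: string[] = [];
    this.entries.forEach((status, id) => {
      if (status === "pending") ids.push(id);
    });
    return ids;
  }
}

// Sends at most once per committed delivery ID. Returns true if `send` ran to completion.
function deliverOnce(ledger: DeliveryLedger, deliveryId: string, send: () => void): boolean {
  if (!ledger.begin(deliveryId)) return false;
  send(); // if this throws, the entry stays "pending" and is picked up by replay
  ledger.commit(deliveryId);
  return true;
}
```
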
### 5) Migration and cutover
- Phase 1: shadow mode (new pipeline computes output but old path sends; compare).
- Phase 2: runtime-by-runtime cutover (`acp`, then `subagent`, then `main` or reverse by risk).
- Phase 3: delete legacy runtime-specific streaming code.
## Non-goals
- No changes to ACP policy/permissions model in this refactor.
- No channel-specific feature expansion outside projection compatibility fixes.
- No transport/backend redesign (acpx plugin contract remains as-is unless needed for event parity).
## Risks and mitigations
- Risk: behavioral regressions in existing main/subagent paths.
Mitigation: shadow mode diffing + adapter contract tests + channel e2e tests.
- Risk: duplicate sends during crash recovery.
Mitigation: durable delivery IDs + idempotent replay in delivery adapter.
- Risk: runtime adapters diverge again.
Mitigation: required shared contract test suite for all adapters.
## Acceptance criteria
- All runtimes pass shared streaming contract tests.
- Discord ACP/main/subagent produce equivalent spacing/chunking behavior for tiny deltas.
- Crash/restart replay sends no duplicate chunk for the same delivery ID.
- Legacy ACP projector/coalescer path is removed.
- Streaming config resolution is shared and runtime-independent.


@@ -1,92 +0,0 @@
---
title: "ACP Bound Command Authorization (Proposal)"
summary: "Proposal: long-term command authorization model for ACP-bound conversations"
author: "Onur Solmaz <onur@solmaz.io>"
github_username: "osolmaz"
created: "2026-03-05"
read_when:
- Designing native command auth behavior in Telegram/Discord ACP-bound channels/topics
---
# ACP Bound Command Authorization (Proposal)
Status: Proposed, **not implemented yet**.
This document describes a long-term authorization model for native commands in
ACP-bound conversations. It is an experiments proposal and does not replace
current production behavior.
For implemented behavior, read source and tests in:
- `src/telegram/bot-native-commands.ts`
- `src/discord/monitor/native-command.ts`
- `src/auto-reply/reply/commands-core.ts`
## Problem
Today we have command-specific checks (for example `/new` and `/reset`) that
need to work inside ACP-bound channels/topics even when allowlists are empty.
This solves immediate UX pain, but command-name-based exceptions do not scale.
## Long-term shape
Move command authorization from ad-hoc handler logic to command metadata plus a
shared policy evaluator.
### 1) Add auth policy metadata to command definitions
Each command definition should declare an auth policy. Example shape:
```ts
type CommandAuthPolicy =
  | { mode: "owner_or_allowlist" } // default, current strict behavior
  | { mode: "bound_acp_or_owner_or_allowlist" } // allow in explicitly bound ACP conversations
  | { mode: "owner_only" };
```
`/new` and `/reset` would use `bound_acp_or_owner_or_allowlist`.
Most other commands would remain `owner_or_allowlist`.
### 2) Share one evaluator across channels
Introduce one helper that evaluates command auth using:
- command policy metadata
- sender authorization state
- resolved conversation binding state
Both Telegram and Discord native handlers should call the same helper to avoid
behavior drift.
### 3) Use binding-match as the bypass boundary
When policy allows bound ACP bypass, authorize only if a configured binding
match was resolved for the current conversation (not just because current
session key looks ACP-like).
This keeps the boundary explicit and minimizes accidental widening.
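
A shared evaluator along these lines could be as small as one switch over the policy mode. The `CommandAuthContext` shape and the function name are hypothetical; `bindingMatched` stands for "a configured binding match was resolved for this conversation", not "the session key looks ACP-like".

```typescript
type CommandAuthPolicy =
  | { mode: "owner_or_allowlist" }
  | { mode: "bound_acp_or_owner_or_allowlist" }
  | { mode: "owner_only" };

// Hypothetical inputs, resolved by the channel handler before calling the evaluator.
type CommandAuthContext = {
  senderIsOwner: boolean;
  senderInAllowlist: boolean;
  bindingMatched: boolean; // true only when a configured binding match was resolved
};

function isCommandAuthorized(policy: CommandAuthPolicy, ctx: CommandAuthContext): boolean {
  switch (policy.mode) {
    case "owner_only":
      return ctx.senderIsOwner;
    case "owner_or_allowlist":
      return ctx.senderIsOwner || ctx.senderInAllowlist;
    case "bound_acp_or_owner_or_allowlist":
      return ctx.bindingMatched || ctx.senderIsOwner || ctx.senderInAllowlist;
  }
}
```

Because both Telegram and Discord handlers would pass the same context shape, a new policy mode only needs one new `case` here.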
## Why this is better
- Scales to future commands without adding more command-name conditionals.
- Keeps behavior consistent across channels.
- Preserves current security model by requiring explicit binding match.
- Keeps allowlists optional hardening instead of a universal requirement.
## Rollout plan (future)
1. Add command auth policy field to command registry types and command data.
2. Implement shared evaluator and migrate Telegram + Discord native handlers.
3. Move `/new` and `/reset` to metadata-driven policy.
4. Add tests per policy mode and channel surface.
## Non-goals
- This proposal does not change ACP session lifecycle behavior.
- This proposal does not require allowlists for all ACP-bound commands.
- This proposal does not change existing route binding semantics.
## Note
This proposal is intentionally additive and does not delete or replace existing
experiments documents.


@@ -1,383 +0,0 @@
---
title: "ACP Persistent Bindings for Discord Channels and Telegram Topics"
summary: "Introduce persistent ACP bindings for Discord channels and Telegram topics backed by long-lived ACP session bindings."
author: "Onur Solmaz <onur@solmaz.io>"
github_username: "osolmaz"
created: "2026-03-05"
status: "draft"
last_updated: "2026-03-05"
---
# ACP Persistent Bindings for Discord Channels and Telegram Topics
## Summary
Introduce persistent ACP bindings that map:
- Discord channels (and existing threads, where needed), and
- Telegram forum topics in groups/supergroups (`chatId:topic:topicId`)
to long-lived ACP sessions, with binding state stored in top-level `bindings[]` entries using explicit binding types.
This makes ACP usage in high-traffic messaging channels predictable and durable, so users can create dedicated channels/topics such as `codex`, `claude-1`, or `claude-myrepo`.
## Why
Current thread-bound ACP behavior is optimized for ephemeral Discord thread workflows. Telegram does not have the same thread model; it has forum topics in groups/supergroups. Users want stable, always-on ACP “workspaces” in chat surfaces, not only temporary thread sessions.
## Goals
- Support durable ACP binding for:
- Discord channels/threads
- Telegram forum topics (groups/supergroups)
- Make binding source-of-truth config-driven.
- Keep `/acp`, `/new`, `/reset`, `/focus`, and delivery behavior consistent across Discord and Telegram.
- Preserve existing temporary binding flows for ad-hoc usage.
## Non-Goals
- Full redesign of ACP runtime/session internals.
- Removing existing ephemeral binding flows.
- Expanding to every channel in the first iteration.
- Implementing Telegram channel direct-messages topics (`direct_messages_topic_id`) in this phase.
- Implementing Telegram private-chat topic variants in this phase.
## UX Direction
### 1) Two binding types
- **Persistent binding**: saved in config, reconciled on startup, intended for “named workspace” channels/topics.
- **Temporary binding**: runtime-only, expires by idle/max-age policy.
### 2) Command behavior
- `/acp spawn ... --thread here|auto|off` remains available.
- Add explicit bind lifecycle controls:
- `/acp bind [session|agent] [--persist]`
- `/acp unbind [--persist]`
- `/acp status` includes whether binding is `persistent` or `temporary`.
- In bound conversations, `/new` and `/reset` reset the bound ACP session in place and keep the binding attached.
### 3) Conversation identity
- Use canonical conversation IDs:
- Discord: channel/thread ID.
- Telegram topic: `chatId:topic:topicId`.
- Never key Telegram bindings by bare topic ID alone.
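
For illustration, the Telegram key format could be built and parsed as follows. The helper names are hypothetical, and the parser deliberately rejects a bare topic ID so it can never be used as a binding key on its own.

```typescript
// Builds the canonical Telegram topic conversation ID: chatId:topic:topicId
function telegramTopicConversationId(chatId: string, topicId: number): string {
  return `${chatId}:topic:${topicId}`;
}

// Parses a canonical ID back into its parts; returns null for anything else,
// including a bare topic ID such as "42".
function parseTelegramTopicConversationId(
  id: string,
): { chatId: string; topicId: number } | null {
  const match = /^(-?\d+):topic:(\d+)$/.exec(id);
  if (!match) return null;
  const [, chatId, topic] = match;
  if (chatId === undefined || topic === undefined) return null;
  return { chatId, topicId: Number(topic) };
}
```

The chat ID is kept as a string so it can be compared directly with config keys such as `"-1001234567890"`.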
## Config Model (Proposed)
Unify routing and persistent ACP binding configuration in top-level `bindings[]` with explicit `type` discriminator:
```jsonc
{
"agents": {
"list": [
{
"id": "main",
"default": true,
"workspace": "~/.openclaw/workspace-main",
"runtime": { "type": "embedded" },
},
{
"id": "codex",
"workspace": "~/.openclaw/workspace-codex",
"runtime": {
"type": "acp",
"acp": {
"agent": "codex",
"backend": "acpx",
"mode": "persistent",
"cwd": "/workspace/repo-a",
},
},
},
{
"id": "claude",
"workspace": "~/.openclaw/workspace-claude",
"runtime": {
"type": "acp",
"acp": {
"agent": "claude",
"backend": "acpx",
"mode": "persistent",
"cwd": "/workspace/repo-b",
},
},
},
],
},
"acp": {
"enabled": true,
"backend": "acpx",
"allowedAgents": ["codex", "claude"],
},
"bindings": [
// Route bindings (existing behavior)
{
"type": "route",
"agentId": "main",
"match": { "channel": "discord", "accountId": "default" },
},
{
"type": "route",
"agentId": "main",
"match": { "channel": "telegram", "accountId": "default" },
},
// Persistent ACP conversation bindings
{
"type": "acp",
"agentId": "codex",
"match": {
"channel": "discord",
"accountId": "default",
"peer": { "kind": "channel", "id": "222222222222222222" },
},
"acp": {
"label": "codex-main",
"mode": "persistent",
"cwd": "/workspace/repo-a",
"backend": "acpx",
},
},
{
"type": "acp",
"agentId": "claude",
"match": {
"channel": "discord",
"accountId": "default",
"peer": { "kind": "channel", "id": "333333333333333333" },
},
"acp": {
"label": "claude-repo-b",
"mode": "persistent",
"cwd": "/workspace/repo-b",
},
},
{
"type": "acp",
"agentId": "codex",
"match": {
"channel": "telegram",
"accountId": "default",
"peer": { "kind": "group", "id": "-1001234567890:topic:42" },
},
"acp": {
"label": "tg-codex-42",
"mode": "persistent",
},
},
],
"channels": {
"discord": {
"guilds": {
"111111111111111111": {
"channels": {
"222222222222222222": {
"enabled": true,
"requireMention": false,
},
"333333333333333333": {
"enabled": true,
"requireMention": false,
},
},
},
},
},
"telegram": {
"groups": {
"-1001234567890": {
"topics": {
"42": {
"requireMention": false,
},
},
},
},
},
},
}
```
### Minimal Example (No Per-Binding ACP Overrides)
```jsonc
{
"agents": {
"list": [
{ "id": "main", "default": true, "runtime": { "type": "embedded" } },
{
"id": "codex",
"runtime": {
"type": "acp",
"acp": { "agent": "codex", "backend": "acpx", "mode": "persistent" },
},
},
{
"id": "claude",
"runtime": {
"type": "acp",
"acp": { "agent": "claude", "backend": "acpx", "mode": "persistent" },
},
},
],
},
"acp": { "enabled": true, "backend": "acpx" },
"bindings": [
{
"type": "route",
"agentId": "main",
"match": { "channel": "discord", "accountId": "default" },
},
{
"type": "route",
"agentId": "main",
"match": { "channel": "telegram", "accountId": "default" },
},
{
"type": "acp",
"agentId": "codex",
"match": {
"channel": "discord",
"accountId": "default",
"peer": { "kind": "channel", "id": "222222222222222222" },
},
},
{
"type": "acp",
"agentId": "claude",
"match": {
"channel": "discord",
"accountId": "default",
"peer": { "kind": "channel", "id": "333333333333333333" },
},
},
{
"type": "acp",
"agentId": "codex",
"match": {
"channel": "telegram",
"accountId": "default",
"peer": { "kind": "group", "id": "-1009876543210:topic:5" },
},
},
],
}
```
Notes:
- `bindings[].type` is explicit:
- `route`: normal agent routing.
- `acp`: persistent ACP harness binding for a matched conversation.
- For `type: "acp"`, `match.peer.id` is the canonical conversation key:
- Discord channel/thread: raw channel/thread ID.
- Telegram topic: `chatId:topic:topicId`.
- `bindings[].acp.backend` is optional. Backend fallback order:
1. `bindings[].acp.backend`
2. `agents.list[].runtime.acp.backend`
3. global `acp.backend`
- `mode`, `cwd`, and `label` follow the same override pattern (`binding override -> agent runtime default -> global/default behavior`).
- Keep existing `session.threadBindings.*` and `channels.discord.threadBindings.*` for temporary binding policies.
- Persistent entries declare desired state; runtime reconciles to actual ACP sessions/bindings.
- One active ACP binding per conversation node is the intended model.
- Backward compatibility: missing `type` is interpreted as `route` for legacy entries.
### Backend Selection
- ACP session initialization already uses configured backend selection during spawn (`acp.backend` today).
- This proposal extends spawn/reconcile logic to prefer typed ACP binding overrides:
- `bindings[].acp.backend` for conversation-local override.
- `agents.list[].runtime.acp.backend` for per-agent defaults.
- If no override exists, keep current behavior (`acp.backend` default).
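
The backend fallback order is a first-defined-wins lookup, and the same helper would cover `mode`, `cwd`, and `label`. A minimal sketch with hypothetical names, assuming an unset value is represented as `undefined`:

```typescript
// Returns the first candidate that is not undefined, in argument order.
function firstDefined<T>(...candidates: Array<T | undefined>): T | undefined {
  for (const candidate of candidates) {
    if (candidate !== undefined) return candidate;
  }
  return undefined;
}

type BackendSources = {
  bindingBackend?: string; // bindings[].acp.backend
  agentBackend?: string; // agents.list[].runtime.acp.backend
  globalBackend?: string; // acp.backend
};

// binding override -> agent runtime default -> global acp.backend
function resolveAcpBackend(sources: BackendSources): string | undefined {
  return firstDefined(sources.bindingBackend, sources.agentBackend, sources.globalBackend);
}
```
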
## Architecture Fit in Current System
### Reuse existing components
- `SessionBindingService` already supports channel-agnostic conversation references.
- ACP spawn/bind flows already support binding through service APIs.
- Telegram already carries topic/thread context via `MessageThreadId` and `chatId`.
### New/extended components
- **Telegram binding adapter** (parallel to Discord adapter):
- register adapter per Telegram account,
- resolve/list/bind/unbind/touch by canonical conversation ID.
- **Typed binding resolver/index**:
- split `bindings[]` into `route` and `acp` views,
- keep `resolveAgentRoute` on `route` bindings only,
- resolve persistent ACP intent from `acp` bindings only.
- **Inbound binding resolution for Telegram**:
- resolve bound session before route finalization (Discord already does this).
- **Persistent binding reconciler**:
- on startup: load configured top-level `type: "acp"` bindings, ensure ACP sessions exist, ensure bindings exist.
- on config change: apply deltas safely.
- **Cutover model**:
- no channel-local ACP binding fallback is read,
- persistent ACP bindings are sourced only from top-level `bindings[].type="acp"` entries.
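
The typed binding split could look roughly like this, with a missing `type` treated as `route` for legacy entries. The type and function names are hypothetical, the `Binding` shape is reduced to the fields used in the config examples, and requiring an exact `accountId` match is an assumption.

```typescript
type BindingMatch = {
  channel: string;
  accountId?: string;
  peer?: { kind: string; id: string };
};

type Binding = { type?: "route" | "acp"; agentId: string; match: BindingMatch };

// Splits bindings[] into route and acp views; legacy entries without `type` count as route.
function splitBindings(bindings: Binding[]): { route: Binding[]; acp: Binding[] } {
  const route: Binding[] = [];
  const acp: Binding[] = [];
  for (const binding of bindings) {
    if (binding.type === "acp") acp.push(binding);
    else route.push(binding);
  }
  return { route, acp };
}

// Resolves the persistent ACP binding for one conversation, failing closed on ambiguity.
function resolveAcpBinding(
  acp: Binding[],
  channel: string,
  accountId: string,
  conversationId: string,
): Binding | null {
  const hits = acp.filter(
    (b) =>
      b.match.channel === channel &&
      b.match.accountId === accountId &&
      b.match.peer?.id === conversationId,
  );
  if (hits.length > 1) {
    throw new Error(`multiple ACP bindings match conversation ${conversationId}`);
  }
  const [only] = hits;
  return only ?? null;
}
```
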
## Phased Delivery
### Phase 1: Typed binding schema foundation
- Extend config schema to support `bindings[].type` discriminator:
- `route`,
- `acp` with optional `acp` override object (`mode`, `backend`, `cwd`, `label`).
- Extend agent schema with runtime descriptor to mark ACP-native agents (`agents.list[].runtime.type`).
- Add parser/indexer split for route vs ACP bindings.
### Phase 2: Runtime resolution + Discord/Telegram parity
- Resolve persistent ACP bindings from top-level `type: "acp"` entries for:
- Discord channels/threads,
- Telegram forum topics (`chatId:topic:topicId` canonical IDs).
- Implement Telegram binding adapter and inbound bound-session override parity with Discord.
- Do not include Telegram direct/private topic variants in this phase.
### Phase 3: Command parity and resets
- Align `/acp`, `/new`, `/reset`, and `/focus` behavior in bound Telegram/Discord conversations.
- Ensure binding survives reset flows as configured.
### Phase 4: Hardening
- Better diagnostics (`/acp status`, startup reconciliation logs).
- Conflict handling and health checks.
## Guardrails and Policy
- Respect ACP enablement and sandbox restrictions exactly as today.
- Keep explicit account scoping (`accountId`) to avoid cross-account bleed.
- Fail closed on ambiguous routing.
- Keep mention/access policy behavior explicit per channel config.
## Testing Plan
- Unit:
  - conversation ID normalization (especially Telegram topic IDs),
  - reconciler create/update/delete paths,
  - `/acp bind --persist` and unbind flows.
- Integration:
  - inbound Telegram topic -> bound ACP session resolution,
  - inbound Discord channel/thread -> persistent binding precedence.
- Regression:
  - temporary bindings continue to work,
  - unbound channels/topics keep current routing behavior.
## Open Questions
- Should `/acp spawn --thread auto` in a Telegram topic default to `here`?
- Should persistent bindings always bypass mention-gating in bound conversations, or require explicit `requireMention=false`?
- Should `/focus` gain `--persist` as an alias for `/acp bind --persist`?
## Rollout
- Ship as opt-in per conversation (`bindings[].type="acp"` entry present).
- Start with Discord + Telegram only.
- Add docs with examples for:
  - “one channel/topic per agent”
  - “multiple channels/topics per same agent with different `cwd`”
  - “team naming patterns (`codex-1`, `claude-repo-x`)”.


@@ -1,339 +0,0 @@
---
title: "Discord Async Inbound Worker Plan"
summary: "Status and next steps for decoupling Discord gateway listeners from long-running agent turns with a Discord-specific inbound worker"
author: "Onur Solmaz <onur@solmaz.io>"
github_username: "osolmaz"
created: "2026-03-06"
status: "in_progress"
last_updated: "2026-03-05"
---
# Discord Async Inbound Worker Plan
## Objective
Remove Discord listener timeout as a user-facing failure mode by making inbound Discord turns asynchronous:
1. Gateway listener accepts and normalizes inbound events quickly.
2. A Discord run queue stores serialized jobs keyed by the same ordering boundary we use today.
3. A worker executes the actual agent turn outside the Carbon listener lifetime.
4. Replies are delivered back to the originating channel or thread after the run completes.
This is the long-term fix for queued Discord runs timing out at `channels.discord.eventQueue.listenerTimeout` while the agent run itself is still making progress.
## Current status
This plan is partially implemented.
Already done:
- Discord listener timeout and Discord run timeout are now separate settings.
- Accepted inbound Discord turns are enqueued into `src/discord/monitor/inbound-worker.ts`.
- The worker now owns the long-running turn instead of the Carbon listener.
- Existing per-route ordering is preserved by queue key.
- Timeout regression coverage exists for the Discord worker path.
What this means in plain language:
- the production timeout bug is fixed
- the long-running turn no longer dies just because the Discord listener budget expires
- the worker architecture is not finished yet
What is still missing:
- `DiscordInboundJob` is still only partially normalized and still carries live runtime references
- command semantics (`stop`, `new`, `reset`, future session controls) are not yet fully worker-native
- worker observability and operator status are still minimal
- there is still no restart durability
## Why this exists
Current behavior ties the full agent turn to the listener lifetime:
- `src/discord/monitor/listeners.ts` applies the timeout and abort boundary.
- `src/discord/monitor/message-handler.ts` keeps the queued run inside that boundary.
- `src/discord/monitor/message-handler.process.ts` performs media loading, routing, dispatch, typing, draft streaming, and final reply delivery inline.
That architecture has two bad properties:
- long but healthy turns can be aborted by the listener watchdog
- users can see no reply even when the downstream runtime would have produced one
Raising the timeout helps but does not change the failure mode.
## Non-goals
- Do not redesign non-Discord channels in this pass.
- Do not broaden this into a generic all-channel worker framework in the first implementation.
- Do not extract a shared cross-channel inbound worker abstraction yet; only share low-level primitives when duplication is obvious.
- Do not add durable crash recovery in the first pass unless needed to land safely.
- Do not change route selection, binding semantics, or ACP policy in this plan.
## Current constraints
The current Discord processing path still depends on some live runtime objects that should not stay inside the long-term job payload:
- Carbon `Client`
- raw Discord event shapes
- in-memory guild history map
- thread binding manager callbacks
- live typing and draft stream state
We already moved execution onto a worker queue, but the normalization boundary is still incomplete. Right now the worker is "run later in the same process with some of the same live objects," not a fully data-only job boundary.
## Target architecture
### 1. Listener stage
`DiscordMessageListener` remains the ingress point, but its job becomes:
- run preflight and policy checks
- normalize accepted input into a serializable `DiscordInboundJob`
- enqueue the job into a per-session or per-channel async queue
- return immediately to Carbon once the enqueue succeeds
The listener should no longer own the end-to-end LLM turn lifetime.
### 2. Normalized job payload
Introduce a serializable job descriptor that contains only the data needed to run the turn later.
Minimum shape:
- route identity
  - `agentId`
  - `sessionKey`
  - `accountId`
  - `channel`
- delivery identity
  - destination channel id
  - reply target message id
  - thread id if present
- sender identity
  - sender id, label, username, tag
- channel context
  - guild id
  - channel name or slug
  - thread metadata
  - resolved system prompt override
- normalized message body
  - base text
  - effective message text
  - attachment descriptors or resolved media references
- gating decisions
  - mention requirement outcome
  - command authorization outcome
  - bound session or agent metadata if applicable
The job payload must not contain live Carbon objects or mutable closures.
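As a rough illustration of the minimum shape above, the sketch below defines a plain-data job type and a small check that rejects live objects and closures. The type name `DiscordInboundJobSketch`, every field name other than `agentId`, `sessionKey`, `accountId`, and `channel`, and the `isPlainData` helper are hypothetical and not taken from `src/discord/monitor/inbound-job.ts`.

```typescript
// Illustrative plain-data shape; most field names are assumptions.
interface DiscordInboundJobSketch {
  route: { agentId: string; sessionKey: string; accountId: string; channel: string };
  delivery: { channelId: string; replyToMessageId?: string; threadId?: string };
  sender: { id: string; label?: string; username?: string; tag?: string };
  context: { guildId?: string; channelName?: string; systemPromptOverride?: string };
  message: {
    baseText: string;
    effectiveText: string;
    attachments: { url: string; contentType?: string }[];
  };
  gating: { mentionRequired: boolean; commandAuthorized: boolean };
}

// Returns true only for JSON-like values: primitives, arrays, and plain objects.
// Functions, class instances (clients, Maps, managers), symbols, and bigints fail.
function isPlainData(value: unknown): boolean {
  if (value === null || value === undefined) return true;
  const kind = typeof value;
  if (kind === "string" || kind === "boolean") return true;
  if (kind === "number") return Number.isFinite(value);
  if (Array.isArray(value)) return value.every((item) => isPlainData(item));
  if (kind === "object") {
    const proto = Object.getPrototypeOf(value);
    if (proto !== Object.prototype && proto !== null) return false;
    return Object.values(value as Record<string, unknown>).every((item) => isPlainData(item));
  }
  return false;
}

const exampleJob: DiscordInboundJobSketch = {
  route: { agentId: "main", sessionKey: "discord:123", accountId: "default", channel: "discord" },
  delivery: { channelId: "123", replyToMessageId: "456" },
  sender: { id: "789", username: "example-user" },
  context: { guildId: "42", channelName: "general" },
  message: { baseText: "hello", effectiveText: "hello", attachments: [] },
  gating: { mentionRequired: false, commandAuthorized: true },
};
```

A check like `isPlainData` could run in tests against whatever the job builder returns, so that a live reference slipping back into the payload fails fast.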
Current implementation status:
- partially done
- `src/discord/monitor/inbound-job.ts` exists and defines the worker handoff
- the payload still contains live Discord runtime context and should be reduced further
### 3. Worker stage
Add a Discord-specific worker runner responsible for:
- reconstructing the turn context from `DiscordInboundJob`
- loading media and any additional channel metadata needed for the run
- dispatching the agent turn
- delivering final reply payloads
- updating status and diagnostics
Recommended location:
- `src/discord/monitor/inbound-worker.ts`
- `src/discord/monitor/inbound-job.ts`
### 4. Ordering model
Ordering must remain equivalent to today for a given route boundary.
Recommended key:
- use the same queue key logic as `resolveDiscordRunQueueKey(...)`
This preserves existing behavior:
- one bound agent conversation does not interleave with itself
- different Discord channels can still progress independently
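The ordering rule above can be sketched as a keyed serial queue: jobs that share a key run strictly one after another, while jobs with different keys proceed independently. This is a simplified in-memory illustration, not the actual `DiscordInboundWorkerQueue`; the class name and its methods are hypothetical.

```typescript
// Minimal keyed serial queue (hypothetical sketch, in memory only).
class KeyedSerialQueue {
  // For each key, a promise that settles when the last enqueued job is done.
  private readonly tails = new Map<string, Promise<void>>();

  enqueue<T>(key: string, task: () => Promise<T>): Promise<T> {
    const previous = this.tails.get(key) ?? Promise.resolve();
    // Start this task only after the previous job for the same key has settled.
    const run = previous.then(() => task());
    // The tail never rejects, so one failed job does not block later jobs.
    const tail: Promise<void> = run.then(
      () => undefined,
      () => undefined,
    );
    this.tails.set(key, tail);
    // Drop the entry once this key is idle so the map does not grow forever.
    void tail.then(() => {
      if (this.tails.get(key) === tail) this.tails.delete(key);
    });
    return run;
  }

  get activeKeyCount(): number {
    return this.tails.size;
  }
}
```

In this sketch the key would be whatever `resolveDiscordRunQueueKey(...)` returns, so the ordering boundary stays the same as today.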
### 5. Timeout model
After cutover, there are two separate timeout classes:
- listener timeout
  - only covers normalization and enqueue
  - should be short
- run timeout
  - optional, worker-owned, explicit, and user-visible
  - should not be inherited accidentally from Carbon listener settings
This removes the current accidental coupling between "Discord gateway listener stayed alive" and "agent run is healthy."
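One way to picture the worker-owned run timeout is the sketch below. It assumes the run accepts an `AbortSignal`; the function name, the error class, and the `timeoutMs` parameter are hypothetical and do not correspond to actual settings. When `timeoutMs` is omitted the run is unbounded, and nothing is inherited from the listener timeout.

```typescript
// Hypothetical error type so timeouts are an explicit worker outcome.
class RunTimeoutError extends Error {
  readonly timeoutMs: number;
  constructor(timeoutMs: number) {
    super(`agent run exceeded ${timeoutMs}ms`);
    this.name = "RunTimeoutError";
    this.timeoutMs = timeoutMs;
  }
}

// Run a turn with an optional, explicit timeout owned by the worker.
async function runWithOptionalTimeout<T>(
  run: (signal: AbortSignal) => Promise<T>,
  timeoutMs?: number,
): Promise<T> {
  const controller = new AbortController();
  // No explicit run timeout configured: do not fall back to any listener setting.
  if (timeoutMs === undefined) return run(controller.signal);

  let timer: ReturnType<typeof setTimeout> | undefined;
  const timeout = new Promise<never>((_, reject) => {
    timer = setTimeout(() => {
      controller.abort(); // let the run observe cancellation
      reject(new RunTimeoutError(timeoutMs));
    }, timeoutMs);
  });
  try {
    return await Promise.race([run(controller.signal), timeout]);
  } finally {
    clearTimeout(timer);
  }
}
```

The listener side would not call this at all; it only normalizes and enqueues, which is why its own timeout can stay short.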
## Recommended implementation phases
### Phase 1: normalization boundary
- Status: partially implemented
- Done:
- extracted `buildDiscordInboundJob(...)`
- added worker handoff tests
- Remaining:
- make `DiscordInboundJob` plain data only
- move live runtime dependencies to worker-owned services instead of per-job payload
- stop rebuilding process context by stitching live listener refs back into the job
### Phase 2: in-memory worker queue
- Status: implemented
- Done:
- added `DiscordInboundWorkerQueue` keyed by resolved run queue key
- listener enqueues jobs instead of directly awaiting `processDiscordMessage(...)`
- worker executes jobs in-process, in memory only
This is the first functional cutover.
### Phase 3: process split
- Status: not started
- Move delivery, typing, and draft streaming ownership behind worker-facing adapters.
- Replace direct use of live preflight context with worker context reconstruction.
- Keep `processDiscordMessage(...)` temporarily as a facade if needed, then split it.
### Phase 4: command semantics
- Status: not started
Make sure native Discord commands still behave correctly when work is queued:
- `stop`
- `new`
- `reset`
- any future session-control commands
The worker queue must expose enough run state for commands to target the active or queued turn.
### Phase 5: observability and operator UX
- Status: not started
- emit queue depth and active worker counts into monitor status
- record enqueue time, start time, finish time, and timeout or cancellation reason
- surface worker-owned timeout or delivery failures clearly in logs
### Phase 6: optional durability follow-up
- Status: not started
Only after the in-memory version is stable:
- decide whether queued Discord jobs should survive gateway restart
- if yes, persist job descriptors and delivery checkpoints
- if no, document the explicit in-memory boundary
This should be a separate follow-up unless restart recovery is required to land.
## File impact
Current primary files:
- `src/discord/monitor/listeners.ts`
- `src/discord/monitor/message-handler.ts`
- `src/discord/monitor/message-handler.preflight.ts`
- `src/discord/monitor/message-handler.process.ts`
- `src/discord/monitor/status.ts`
Current worker files:
- `src/discord/monitor/inbound-job.ts`
- `src/discord/monitor/inbound-worker.ts`
- `src/discord/monitor/inbound-job.test.ts`
- `src/discord/monitor/message-handler.queue.test.ts`
Likely next touch points:
- `src/auto-reply/dispatch.ts`
- `src/discord/monitor/reply-delivery.ts`
- `src/discord/monitor/thread-bindings.ts`
- `src/discord/monitor/native-command.ts`
## Next step now
The next step is to make the worker boundary real instead of partial.
Do this next:
1. Move live runtime dependencies out of `DiscordInboundJob`
2. Keep those dependencies on the Discord worker instance instead
3. Reduce queued jobs to plain Discord-specific data:
   - route identity
   - delivery target
   - sender info
   - normalized message snapshot
   - gating and binding decisions
4. Reconstruct worker execution context from that plain data inside the worker
In practice, that means:
- `client`
- `threadBindings`
- `guildHistories`
- `discordRestFetch`
- other mutable runtime-only handles
should stop living on each queued job and instead live on the worker itself or behind worker-owned adapters.
After that lands, the next follow-up should be command-state cleanup for `stop`, `new`, and `reset`.
## Testing plan
Keep the existing timeout repro coverage in:
- `src/discord/monitor/message-handler.queue.test.ts`
Add new tests for:
1. listener returns after enqueue without awaiting full turn
2. per-route ordering is preserved
3. different channels still run concurrently
4. replies are delivered to the original message destination
5. `stop` cancels the active worker-owned run
6. worker failure produces visible diagnostics without blocking later jobs
7. ACP-bound Discord channels still route correctly under worker execution
## Risks and mitigations
- Risk: command semantics drift from current synchronous behavior
  Mitigation: land command-state plumbing in the same cutover, not later
- Risk: reply delivery loses thread or reply-to context
  Mitigation: make delivery identity first-class in `DiscordInboundJob`
- Risk: duplicate sends during retries or queue restarts
  Mitigation: keep first pass in-memory only, or add explicit delivery idempotency before persistence
- Risk: `message-handler.process.ts` becomes harder to reason about during migration
  Mitigation: split into normalization, execution, and delivery helpers before or during worker cutover
## Acceptance criteria
The plan is complete when:
1. Discord listener timeout no longer aborts healthy long-running turns.
2. Listener lifetime and agent-turn lifetime are separate concepts in code.
3. Existing per-session ordering is preserved.
4. ACP-bound Discord channels work through the same worker path.
5. `stop` targets the worker-owned run instead of the old listener-owned call stack.
6. Timeout and delivery failures become explicit worker outcomes, not silent listener drops.
## Remaining landing strategy
Finish this in follow-up PRs:
1. make `DiscordInboundJob` plain-data only and move live runtime refs onto the worker
2. clean up command-state ownership for `stop`, `new`, and `reset`
3. add worker observability and operator status
4. decide whether durability is needed or explicitly document the in-memory boundary
This is still a bounded follow-up if kept Discord-only and if we continue to avoid a premature cross-channel worker abstraction.


@@ -1,529 +0,0 @@
---
title: "Bindings Capability Architecture Plan"
summary: "Plan for keeping bindings as a small core capability while moving ACP-specific binding policy and app-server policy out of core"
author: "Onur Solmaz <onur@solmaz.io>"
github_username: "osolmaz"
created: "2026-03-17"
status: "in_progress"
last_updated: "2026-03-17"
---
# Bindings Capability Architecture Plan
Status: in progress
## Summary
The goal is not to move all ACP code out of core.
The goal is to make `bindings` a small core capability, keep the ACP session kernel in core, and move ACP-specific binding policy plus codex app server policy out of core.
That gives us a lightweight core without hiding core semantics behind plugin indirection.
## Current Conclusion
The current architecture should converge on this split:
- Core owns the generic binding capability.
- Core owns the generic ACP session kernel.
- Channel plugins own channel-specific binding semantics.
- ACP backend plugins own runtime protocol details.
- Product-level consumers like ACP configured bindings and the codex app server sit on top of the binding capability instead of hardcoding their own binding plumbing.
This is different from "everything becomes a plugin".
## Why This Changed
The current codebase already shows that there are really three different layers:
- binding and conversation ownership
- long-lived session and runtime-handle orchestration
- product-specific turn logic
Those layers should not all be forced into one runtime engine.
Today the duplication is mostly in the execution/control-plane shape, not in storage or binding plumbing:
- the main harness has its own turn engine
- ACP has its own session control plane
- the codex app server plugin path likely owns its own app-level turn engine outside this repo
The right move is to share the stable control-plane contracts, not to force all three into one giant executor.
## Verified Current State
### Generic binding pieces already exist
- `src/infra/outbound/session-binding-service.ts` already provides a generic binding store and adapter model.
- `src/plugins/conversation-binding.ts` already lets plugins request a conversation binding and stores plugin-owned binding metadata.
- `src/plugins/types.ts` already exposes plugin-facing binding APIs.
- `src/plugins/types.ts` already exposes the generic `inbound_claim` hook.
### ACP is only partially pluginified
- `src/channels/plugins/configured-binding-registry.ts` now owns generic configured binding compilation and lookup.
- `src/channels/plugins/binding-routing.ts` and `src/channels/plugins/binding-targets.ts` now own the generic route and target lifecycle seams.
- ACP now plugs into that seam through `src/channels/plugins/acp-configured-binding-consumer.ts` and `src/channels/plugins/acp-stateful-target-driver.ts`.
- `src/acp/persistent-bindings.lifecycle.ts` still owns configured ACP ensure and reset behavior.
- runtime-created plugin conversation bindings still use a separate path in `src/plugins/conversation-binding.ts`.
### Codex app server is already closer to the desired shape
From this repo's side, the codex app server path is much thinner:
- a plugin binds a conversation
- core stores that binding
- inbound dispatch targets the plugin's `inbound_claim` hook
What core does not provide for the codex app server path is an ACP-like shared session kernel. If the app server needs retries, long-lived runtime handles, cancellation, or session health logic, it must own that itself today.
## The Durable Split
### 1. Core Binding Capability
This should become the primary shared seam.
Responsibilities:
- canonical `ConversationRef`
- binding record storage
- configured binding compilation
- runtime-created binding storage
- fast binding lookup on inbound
- binding touch/unbind lifecycle
- generic dispatch handoff to the binding target
What core binding capability must not own:
- Discord thread rules
- Telegram topic rules
- Feishu chat rules
- ACP session orchestration
- codex app server business logic
### 2. Core Stateful Target Kernel
This is the small generic kernel for long-lived bound targets.
Responsibilities:
- ensure target ready
- run turn
- cancel turn
- close target
- reset target
- status and health
- persistence of target metadata
- retries and runtime-handle safety
- per-target serialization and concurrency
ACP is the first real implementation of this shape.
This kernel should stay in core because it is mandatory infrastructure and has strict startup, reset, and recovery semantics.
### 3. Channel Binding Providers
Each channel plugin should own the meaning of "this channel conversation maps to this binding rule".
Responsibilities:
- normalize configured binding targets
- normalize inbound conversations
- match inbound conversations against compiled bindings
- define channel-specific matching priority
- optionally provide binding description text for status and logs
This is where Discord channel vs thread logic, Telegram topic rules, and Feishu conversation rules belong.
### 4. Product Consumers
Bindings are a shared capability. Different products should consume it differently.
ACP configured bindings:
- compile config rules
- resolve a target session
- ensure the ACP session is ready through the ACP kernel
Codex app server:
- create runtime-requested bindings
- claim inbound messages through plugin hooks
- optionally adopt the shared stateful target contract later if it really needs long-lived session orchestration
Main harness:
- does not need to become "a binding product"
- may eventually share small lifecycle contracts, but it should not be forced into the same engine as ACP
## The Key Architectural Decision
The shared abstraction should be:
- `bindings` as the capability
- `stateful target drivers` as an optional lower-level contract
The shared abstraction should not be:
- "one runtime engine for main harness, ACP, and codex app server"
That would overfit very different systems into one executor.
## Stable Nouns
Core should understand only stable nouns.
The stable nouns are:
- `ConversationRef`
- `BindingRule`
- `CompiledBinding`
- `BindingResolution`
- `BindingTargetDescriptor`
- `StatefulTargetDriver`
- `StatefulTargetHandle`
ACP, codex app server, and future products should compile down to those nouns instead of leaking product-specific routing rules through core.
## Proposed Capability Model
### Binding capability
The binding capability should support both configured bindings and runtime-created bindings.
Required operations:
- compile configured bindings at startup or reload
- resolve a binding from an inbound `ConversationRef`
- create a runtime binding
- touch and unbind an existing binding
- dispatch a resolved binding to its target
### Binding target descriptor
A resolved binding should point to a typed target descriptor rather than ad hoc ACP- or plugin-specific metadata blobs.
The descriptor should be able to represent at least:
- plugin-owned inbound claim targets
- stateful target drivers
That means the same binding capability can support both:
- codex app server plugin-bound conversations
- ACP configured bindings
without pretending they are the same product.
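A typed target descriptor along these lines could be modeled as a discriminated union. The `kind` values and field names below are hypothetical; the plan only states that the descriptor must cover plugin-owned inbound claim targets and stateful target drivers.

```typescript
// Hypothetical descriptor shapes; field names are assumptions.
type BindingTargetDescriptor =
  | { kind: "plugin-inbound-claim"; pluginId: string }
  | { kind: "stateful-target"; driverId: string; targetRef: string };

// Produce a short description for status output and logs.
function describeTarget(target: BindingTargetDescriptor): string {
  switch (target.kind) {
    case "plugin-inbound-claim":
      return `plugin:${target.pluginId} (inbound_claim hook)`;
    case "stateful-target":
      return `driver:${target.driverId} target:${target.targetRef}`;
  }
}
```

Because the union is closed, adding a third target kind would force every `switch` over `kind` to handle it, which keeps product-specific metadata blobs out of the generic path.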
### Stateful target driver
This is the reusable control-plane contract for long-lived bound targets.
Required operations:
- `ensureReady`
- `runTurn`
- `cancel`
- `close`
- `reset`
- `status`
- `health`
ACP should remain the first built-in driver.
If the codex app server later proves that it also needs durable session handles, it can either:
- use a driver that consumes this contract, or
- keep its own product-owned runtime if that remains simpler
That should be a product decision, not something forced by the binding capability.
## Why ACP Kernel Stays In Core
ACP's kernel should remain in core because session lifecycle, persistence, retries, cancellation, and runtime-handle safety are generic platform machinery.
Those concerns are not channel-specific, and they are not codex-app-server-specific.
If we move that machinery into an ordinary plugin, we create circular bootstrapping:
- channels need it during startup and inbound routing
- reset and recovery need it when plugins may already be degraded
- failure semantics become special-case core logic anyway
If we later wrap it in a "built-in capability module", that is still effectively core.
## What Should Move Out Of Core
The following should move out of ACP-shaped core code:
- channel-specific configured binding matching
- channel-specific binding target normalization
- channel-specific recovery UX
- ACP-specific route wrapping helpers as named ACP seams
- codex app server fallback policy beyond generic plugin-bound dispatch behavior
The following should stay:
- generic binding storage and dispatch
- generic ACP control plane
- generic stateful target driver contract
## Current Problems To Remove
### Residual cleanup is now small
Most ACP-era compatibility names are gone from the generic seam.
The remaining cleanup is smaller:
- `src/acp/persistent-bindings.ts` compatibility barrel can be deleted once tests stop importing it
- ACP-named tests and mocks can be renamed over time for consistency
- docs should stop describing already-removed ACP wrappers as if they still exist
### Configured binding implementation is still too monolithic
`src/channels/plugins/configured-binding-registry.ts` still mixes:
- registry compilation
- cache invalidation
- inbound matching
- materialization of binding targets
- session-key reverse lookup
That file is now generic, but still too large and too coupled.
### Runtime-created plugin bindings still use a separate stack
`src/plugins/conversation-binding.ts` is still a separate implementation path for plugin-created bindings.
That means configured bindings and runtime-created bindings share storage, but not one consistent capability layer.
### Generic registries still hardcode ACP as a built-in
`src/channels/plugins/configured-binding-consumers.ts` and `src/channels/plugins/stateful-target-drivers.ts` still import ACP directly.
That is acceptable for now, but the clean final shape is to keep ACP built in while registering it from a dedicated bootstrap point instead of wiring it inside the generic registry files.
## Target Contracts
### Channel binding provider contract
Conceptually, each channel plugin should support:
- `compileConfiguredBinding(binding, cfg) -> CompiledBinding | null`
- `resolveInboundConversation(event) -> ConversationRef | null`
- `matchInboundConversation(compiledBinding, conversation) -> BindingMatch | null`
- `describeBinding(compiledBinding) -> string | undefined`
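Expressed as a TypeScript interface, the provider contract above might look like the sketch below. The four method names come from this plan; every type shape, the `demo` channel, the priority values, and the `pickBestMatch` helper are hypothetical and only illustrate how channel-specific matching priority could be kept inside the provider.

```typescript
interface ConversationRef { channel: string; accountId: string; conversationId: string }
interface ConfiguredBindingInput { channel: string; accountId?: string; conversationId: string; agentId: string }
interface CompiledBinding { channel: string; accountId: string; conversationId: string; agentId: string }
interface BindingMatch { binding: CompiledBinding; priority: number }

interface ChannelBindingProvider<InboundEvent> {
  compileConfiguredBinding(binding: ConfiguredBindingInput, cfg: unknown): CompiledBinding | null;
  resolveInboundConversation(event: InboundEvent): ConversationRef | null;
  matchInboundConversation(compiled: CompiledBinding, conversation: ConversationRef): BindingMatch | null;
  describeBinding(compiled: CompiledBinding): string | undefined;
}

// Toy provider for a made-up "demo" channel with chats and optional topics.
interface DemoEvent { accountId: string; chatId: string; topicId?: string }

const demoProvider: ChannelBindingProvider<DemoEvent> = {
  compileConfiguredBinding(binding) {
    const conversationId = binding.conversationId.trim();
    if (binding.channel !== "demo" || conversationId === "") return null;
    return { channel: "demo", accountId: binding.accountId ?? "default", conversationId, agentId: binding.agentId };
  },
  resolveInboundConversation(event) {
    const conversationId = event.topicId ? `${event.chatId}:topic:${event.topicId}` : event.chatId;
    return { channel: "demo", accountId: event.accountId, conversationId };
  },
  matchInboundConversation(compiled, conversation) {
    if (compiled.channel !== conversation.channel || compiled.accountId !== conversation.accountId) return null;
    if (compiled.conversationId === conversation.conversationId) return { binding: compiled, priority: 2 };
    // Assumption: a chat-level binding also covers topics in that chat, at lower priority.
    if (conversation.conversationId.startsWith(`${compiled.conversationId}:topic:`)) {
      return { binding: compiled, priority: 1 };
    }
    return null;
  },
  describeBinding(compiled) {
    return `demo ${compiled.conversationId} -> ${compiled.agentId}`;
  },
};

// Generic side: pick the highest-priority match without knowing channel rules.
function pickBestMatch<E>(
  provider: ChannelBindingProvider<E>,
  compiled: readonly CompiledBinding[],
  conversation: ConversationRef,
): BindingMatch | null {
  let best: BindingMatch | null = null;
  for (const candidate of compiled) {
    const match = provider.matchInboundConversation(candidate, conversation);
    if (match && (best === null || match.priority > best.priority)) best = match;
  }
  return best;
}
```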
### Binding capability contract
Core should support:
- `compileConfiguredBindings(cfg, plugins) -> CompiledBindingRegistry`
- `resolveBinding(conversationRef) -> BindingResolution | null`
- `createRuntimeBinding(target, conversationRef, metadata) -> BindingRecord`
- `touchBinding(bindingId)`
- `unbindBinding(bindingId | target)`
- `dispatchResolvedBinding(bindingResolution, inboundEvent)`
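To make the runtime-binding half of this contract concrete, here is a small in-memory sketch. The method names follow the list above, but the class, the record fields, and the replace-on-rebind behavior are assumptions; configured-binding compilation and dispatch are left out, `resolveBinding` returns the stored record rather than a full `BindingResolution`, and `unbindBinding` accepts only a binding ID.

```typescript
interface ConversationRef { channel: string; accountId: string; conversationId: string }

interface BindingRecord {
  bindingId: string;
  target: string;
  conversation: ConversationRef;
  metadata: Record<string, unknown>;
  lastTouchedAt: number;
}

// Hypothetical in-memory sketch of the runtime-created binding operations.
class InMemoryBindingCapability {
  private readonly byId = new Map<string, BindingRecord>();
  private readonly idByConversation = new Map<string, string>();
  private nextId = 1;

  // Account ID is part of the key so bindings never leak across accounts.
  private static keyOf(ref: ConversationRef): string {
    return JSON.stringify([ref.channel, ref.accountId, ref.conversationId]);
  }

  createRuntimeBinding(
    target: string,
    conversation: ConversationRef,
    metadata: Record<string, unknown> = {},
    now: number = Date.now(),
  ): BindingRecord {
    const key = InMemoryBindingCapability.keyOf(conversation);
    const previousId = this.idByConversation.get(key);
    if (previousId !== undefined) this.byId.delete(previousId); // assumption: newest binding wins
    const record: BindingRecord = {
      bindingId: `binding-${this.nextId++}`,
      target,
      conversation,
      metadata,
      lastTouchedAt: now,
    };
    this.byId.set(record.bindingId, record);
    this.idByConversation.set(key, record.bindingId);
    return record;
  }

  resolveBinding(conversation: ConversationRef): BindingRecord | null {
    const id = this.idByConversation.get(InMemoryBindingCapability.keyOf(conversation));
    return id === undefined ? null : (this.byId.get(id) ?? null);
  }

  touchBinding(bindingId: string, now: number = Date.now()): boolean {
    const record = this.byId.get(bindingId);
    if (!record) return false;
    record.lastTouchedAt = now;
    return true;
  }

  unbindBinding(bindingId: string): boolean {
    const record = this.byId.get(bindingId);
    if (!record) return false;
    this.byId.delete(bindingId);
    this.idByConversation.delete(InMemoryBindingCapability.keyOf(record.conversation));
    return true;
  }
}
```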
### Stateful target driver contract
Core should support:
- `ensureReady(targetRef, cfg)`
- `runTurn(targetRef, input)`
- `cancel(targetRef, reason)`
- `close(targetRef, reason)`
- `reset(targetRef, reason)`
- `status(targetRef)`
- `health(targetRef)`
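The driver contract above maps naturally onto an interface. In the sketch below the seven method names come from this plan, while the return types, the `TargetStatus` values, and the toy `EchoDriver` are assumptions made for illustration; a real driver such as ACP would add persistence, retries, and per-target serialization.

```typescript
type TargetStatus = "idle" | "running" | "closed";

interface StatefulTargetDriver<Input, Output> {
  ensureReady(targetRef: string, cfg: unknown): Promise<void>;
  runTurn(targetRef: string, input: Input): Promise<Output>;
  cancel(targetRef: string, reason: string): Promise<void>;
  close(targetRef: string, reason: string): Promise<void>;
  reset(targetRef: string, reason: string): Promise<void>;
  status(targetRef: string): Promise<TargetStatus>;
  health(targetRef: string): Promise<{ ok: boolean; detail?: string }>;
}

// Toy driver that echoes its input and counts turns per target.
class EchoDriver implements StatefulTargetDriver<string, string> {
  private readonly targets = new Map<string, { status: TargetStatus; turns: number }>();

  async ensureReady(targetRef: string): Promise<void> {
    const existing = this.targets.get(targetRef);
    if (!existing || existing.status === "closed") {
      this.targets.set(targetRef, { status: "idle", turns: 0 });
    }
  }

  async runTurn(targetRef: string, input: string): Promise<string> {
    const target = this.targets.get(targetRef);
    if (!target || target.status === "closed") throw new Error(`target ${targetRef} is not ready`);
    target.status = "running";
    target.turns += 1;
    target.status = "idle";
    return `echo(${target.turns}): ${input}`;
  }

  async cancel(targetRef: string): Promise<void> {
    const target = this.targets.get(targetRef);
    if (target && target.status === "running") target.status = "idle";
  }

  async close(targetRef: string): Promise<void> {
    const target = this.targets.get(targetRef);
    if (target) target.status = "closed";
  }

  async reset(targetRef: string): Promise<void> {
    this.targets.set(targetRef, { status: "idle", turns: 0 });
  }

  async status(targetRef: string): Promise<TargetStatus> {
    return this.targets.get(targetRef)?.status ?? "closed";
  }

  async health(targetRef: string): Promise<{ ok: boolean; detail?: string }> {
    const current = await this.status(targetRef);
    return current === "closed" ? { ok: false, detail: "target is closed or unknown" } : { ok: true };
  }
}
```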
## File-Level Transition Plan
### Keep
- `src/infra/outbound/session-binding-service.ts`
- `src/acp/control-plane/*`
- `extensions/acpx/*`
### Generalize
- `src/plugins/conversation-binding.ts`
  - fold runtime-created plugin bindings into the same generic binding capability instead of keeping a separate implementation stack
- `src/channels/plugins/configured-binding-registry.ts`
  - split into compiler, matcher, and session-key resolution modules with a thin facade
- `src/channels/plugins/types.adapters.ts`
  - finish removing ACP-era aliases after the deprecation window
- `src/plugin-sdk/conversation-runtime.ts`
  - export only the generic binding capability surfaces
- `src/acp/persistent-bindings.lifecycle.ts`
  - either become a generic stateful target driver consumer or be renamed to ACP driver-specific lifecycle code
### Shrink Or Delete
- `src/acp/persistent-bindings.ts`
  - delete the compatibility barrel once tests import the real modules directly
- `src/acp/persistent-bindings.resolve.ts`
  - keep only while ACP-specific compatibility helpers are still useful to internal callers
- ACP-named test files
  - rename over time once the behavior is stable and there is no risk of mixing behavioral and naming churn
## Recommended Refactor Order
### Completed groundwork
The current branch has already completed most of the first migration wave:
- stable generic binding nouns exist
- configured bindings compile through a generic registry
- inbound routing goes through generic binding resolution
- configured binding lookup no longer performs fallback plugin discovery
- ACP is expressed as a configured-binding consumer plus a built-in stateful target driver
The remaining work is cleanup and unification, not first-principles redesign.
### Phase 1: Freeze the nouns
Introduce and document the stable binding and target types:
- `ConversationRef`
- `CompiledBinding`
- `BindingResolution`
- `BindingTargetDescriptor`
- `StatefulTargetDriver`
Do this before more movement so the rest of the refactor has firm vocabulary.
### Phase 2: Promote bindings to a first-class core capability
Refactor the existing generic binding store into an explicit capability layer.
Requirements:
- runtime-created bindings stay supported
- configured bindings become first-class
- lookup becomes channel-agnostic
### Phase 3: Compile configured bindings at startup and reload
Move configured binding compilation off the inbound hot path.
Requirements:
- load enabled channel plugins once
- compile configured bindings once
- rebuild on config or plugin reload
- inbound path becomes pure registry lookup
### Phase 4: Expand the channel provider seam
Replace the ACP-specific adapter shape with a generic channel binding provider contract.
Requirements:
- channel plugins own normalization and matching
- core no longer knows channel-specific configured binding rules
### Phase 5: Re-express ACP as a binding consumer plus built-in stateful target driver
Move ACP configured binding policy to the new binding capability while keeping ACP runtime orchestration in core.
Requirements:
- ACP configured bindings resolve through the generic binding registry
- ACP target readiness uses the ACP driver contract
- ACP-specific naming disappears from generic binding code
### Phase 6: Finish residual ACP cleanup
Remove the last compatibility leftovers and stale naming.
Requirements:
- delete `src/acp/persistent-bindings.ts`
- rename ACP-named tests where that improves clarity without changing behavior
- keep docs synchronized with the actual generic seam instead of the earlier transition state
### Phase 7: Split the configured binding registry by responsibility
Refactor `src/channels/plugins/configured-binding-registry.ts` into smaller modules.
Suggested split:
- compiler module
- inbound matcher module
- session-key reverse lookup module
- thin public facade
Requirements:
- caching behavior remains unchanged
- matching behavior remains unchanged
- session-key resolution behavior remains unchanged
### Phase 8: Keep codex app server on the same binding capability
Do not force the codex app server into ACP semantics.
Requirements:
- codex app server keeps runtime-created bindings through the same binding capability
- inbound claim remains the default delivery path
- only adopt the stateful target driver seam if the app server truly needs long-lived target orchestration
- `src/plugins/conversation-binding.ts` stops being a separate binding stack and becomes a consumer of the generic binding capability
### Phase 9: Decouple built-in ACP registration from generic registry files
Keep ACP built in, but stop importing it directly from the generic registry modules.
Requirements:
- `src/channels/plugins/configured-binding-consumers.ts` no longer hardcodes ACP imports
- `src/channels/plugins/stateful-target-drivers.ts` no longer hardcodes ACP imports
- ACP still registers by default during normal startup
- generic registry files remain product-agnostic
### Phase 10: Remove ACP-shaped compatibility facades
Once all call sites are on the generic capability:
- delete ACP-shaped routing helpers
- delete hot-path plugin bootstrapping logic
- keep only thin compatibility exports if external plugins still need a deprecation window
## Success Criteria
The architecture is done when all of these are true:
- no inbound configured-binding resolution performs plugin discovery
- no channel-specific binding semantics remain in generic core binding code
- ACP still uses a core session kernel
- codex app server and ACP both sit on top of the same binding capability
- the binding capability can represent both configured and runtime-created bindings
- runtime-created plugin bindings do not use a separate implementation stack
- long-lived target orchestration is shared through a small core driver contract
- generic registry files do not import ACP directly
- ACP-era alias names are gone from the generic/plugin SDK surface
- the main harness is not forced into the ACP engine
- external plugins can use the same capability without internal imports
## Non-Goals
These are not goals of the remaining refactor:
- moving the ACP session kernel into an ordinary plugin
- forcing the main harness, ACP, and codex app server into one executor
- making every channel implement its own retry and session-safety logic
- keeping ACP-shaped naming in the long-term generic binding layer
## Bottom Line
The right 20-year split is:
- bindings are the shared core capability
- ACP session orchestration remains a small built-in core kernel
- channel plugins own binding semantics
- backend plugins own runtime protocol details
- product consumers like ACP configured bindings and codex app server build on the same binding capability without being forced into one runtime engine
That is the leanest core that still has honest boundaries.


@@ -1,128 +0,0 @@
---
title: "OpenResponses Gateway Plan"
summary: "Plan: Add OpenResponses /v1/responses endpoint and deprecate chat completions cleanly"
author: "Ryan Lisse <ryan@ryanlisse.com>"
github_username: "RyanLisse"
created: "2026-01-20"
read_when:
- Designing or implementing `/v1/responses` gateway support
- Planning migration from Chat Completions compatibility
status: "draft"
last_updated: "2026-01-19"
---
# OpenResponses Gateway Integration Plan
## Context
OpenClaw Gateway currently exposes a minimal OpenAI-compatible Chat Completions endpoint at
`/v1/chat/completions` (see [OpenAI Chat Completions](/gateway/openai-http-api)).
Open Responses is an open inference standard based on the OpenAI Responses API. It is designed
for agentic workflows and uses item-based inputs plus semantic streaming events. The OpenResponses
spec defines `/v1/responses`, not `/v1/chat/completions`.
## Goals
- Add a `/v1/responses` endpoint that adheres to OpenResponses semantics.
- Keep Chat Completions as a compatibility layer that is easy to disable and eventually remove.
- Standardize validation and parsing with isolated, reusable schemas.
## Non-goals
- Full OpenResponses feature parity in the first pass (images, files, hosted tools).
- Replacing internal agent execution logic or tool orchestration.
- Changing the existing `/v1/chat/completions` behavior during the first phase.
## Research Summary
Sources: OpenResponses OpenAPI, OpenResponses specification site, and the Hugging Face blog post.
Key points extracted:
- `POST /v1/responses` accepts `CreateResponseBody` fields like `model`, `input` (string or
`ItemParam[]`), `instructions`, `tools`, `tool_choice`, `stream`, `max_output_tokens`, and
`max_tool_calls`.
- `ItemParam` is a discriminated union of:
- `message` items with roles `system`, `developer`, `user`, `assistant`
- `function_call` and `function_call_output`
- `reasoning`
- `item_reference`
- Successful responses return a `ResponseResource` with `object: "response"`, `status`, and
`output` items.
- Streaming uses semantic events such as:
- `response.created`, `response.in_progress`, `response.completed`, `response.failed`
- `response.output_item.added`, `response.output_item.done`
- `response.content_part.added`, `response.content_part.done`
- `response.output_text.delta`, `response.output_text.done`
- The spec requires:
- `Content-Type: text/event-stream`
- `event:` must match the JSON `type` field
- terminal event must be literal `[DONE]`
- Reasoning items may expose `content`, `encrypted_content`, and `summary`.
- HF examples include `OpenResponses-Version: latest` in requests (optional header).
## Proposed Architecture
- Add `src/gateway/open-responses.schema.ts` containing Zod schemas only (no gateway imports).
- Add `src/gateway/openresponses-http.ts` (or `open-responses-http.ts`) for `/v1/responses`.
- Keep `src/gateway/openai-http.ts` intact as a legacy compatibility adapter.
- Add config `gateway.http.endpoints.responses.enabled` (default `false`).
- Keep `gateway.http.endpoints.chatCompletions.enabled` independent; allow both endpoints to be
toggled separately.
- Emit a startup warning when Chat Completions is enabled to signal legacy status.
## Deprecation Path for Chat Completions
- Maintain strict module boundaries: no shared schema types between responses and chat completions.
- Make Chat Completions opt-in by config so it can be disabled without code changes.
- Update docs to label Chat Completions as legacy once `/v1/responses` is stable.
- Optional future step: map Chat Completions requests to the Responses handler for a simpler
removal path.
## Phase 1 Support Subset
- Accept `input` as string or `ItemParam[]` with message roles and `function_call_output`.
- Extract system and developer messages into `extraSystemPrompt`.
- Use the most recent `user` or `function_call_output` as the current message for agent runs.
- Reject unsupported content parts (image/file) with `invalid_request_error`.
- Return a single assistant message with `output_text` content.
- Return `usage` with zeroed values until token accounting is wired.
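A minimal sketch of the Phase 1 input handling described above, assuming message `content` has already been reduced to a plain string. The type names and `normalizePhase1Input` are hypothetical and only cover the supported subset; they are not existing gateway code.

```typescript
// Hypothetical reduced subset of the OpenResponses `ItemParam` union.
// Assumption: message content is already flattened to a plain string.
type MessageItem = {
  type: "message";
  role: "system" | "developer" | "user" | "assistant";
  content: string;
};
type FunctionCallOutputItem = {
  type: "function_call_output";
  call_id: string;
  output: string;
};
type Phase1Item = MessageItem | FunctionCallOutputItem;

type NormalizedInput = {
  extraSystemPrompt: string; // joined system + developer messages
  currentMessage: string; // most recent user message or function_call_output
};

// Illustrative helper name, not an existing gateway function.
function normalizePhase1Input(input: string | Phase1Item[]): NormalizedInput {
  if (typeof input === "string") {
    return { extraSystemPrompt: "", currentMessage: input };
  }
  const systemParts: string[] = [];
  let currentMessage: string | undefined;
  for (const item of input) {
    if (item.type === "message") {
      if (item.role === "system" || item.role === "developer") {
        systemParts.push(item.content);
      } else if (item.role === "user") {
        currentMessage = item.content; // later items overwrite earlier ones
      }
      // assistant messages are history only and are skipped here
    } else {
      currentMessage = item.output;
    }
  }
  if (currentMessage === undefined) {
    throw new Error("invalid_request_error: no user message or function_call_output in input");
  }
  return { extraSystemPrompt: systemParts.join("\n\n"), currentMessage };
}
```

Joining system and developer messages with a blank line is an arbitrary choice for the sketch; the plan does not specify a separator.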
## Validation Strategy (No SDK)
- Implement Zod schemas for the supported subset of:
- `CreateResponseBody`
- `ItemParam` + message content part unions
- `ResponseResource`
- Streaming event shapes used by the gateway
- Keep schemas in a single, isolated module to avoid drift and allow future codegen.
## Streaming Implementation (Phase 1)
- SSE lines with both `event:` and `data:`.
- Required sequence (minimum viable):
- `response.created`
- `response.output_item.added`
- `response.content_part.added`
- `response.output_text.delta` (repeat as needed)
- `response.output_text.done`
- `response.content_part.done`
- `response.completed`
- `[DONE]`
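The sequence above could be framed roughly as follows. This is a sketch only: the helper names are hypothetical, the payload fields other than `type` are placeholders, and the terminal marker is assumed to be sent as a `data: [DONE]` line.

```typescript
// Hypothetical event shape: any JSON object carrying a `type` discriminator.
type StreamEvent = { type: string; [key: string]: unknown };

// Serialize one event as an SSE frame; `event:` mirrors the JSON `type` field.
function formatSseEvent(event: StreamEvent): string {
  return `event: ${event.type}\ndata: ${JSON.stringify(event)}\n\n`;
}

// The terminal marker is a literal string, not JSON (framing is an assumption).
function formatSseDone(): string {
  return "data: [DONE]\n\n";
}

// Build the minimum viable Phase 1 sequence for a single text output.
function buildPhase1Stream(responseId: string, deltas: string[]): string[] {
  const text = deltas.join("");
  const events: StreamEvent[] = [
    { type: "response.created", response: { id: responseId, status: "in_progress" } },
    { type: "response.output_item.added", output_index: 0 },
    { type: "response.content_part.added", output_index: 0, content_index: 0 },
    ...deltas.map((delta) => ({ type: "response.output_text.delta", delta })),
    { type: "response.output_text.done", text },
    { type: "response.content_part.done", output_index: 0, content_index: 0 },
    { type: "response.completed", response: { id: responseId, status: "completed" } },
  ];
  return [...events.map(formatSseEvent), formatSseDone()];
}
```

Deriving the `event:` line from the payload in one place keeps the two from drifting apart, which is the invariant the spec notes call out.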
## Tests and Verification Plan
- Add e2e coverage for `/v1/responses`:
- Auth required
- Non-stream response shape
- Stream event ordering and `[DONE]`
- Session routing with headers and `user`
- Keep `src/gateway/openai-http.test.ts` unchanged.
- Manual: curl to `/v1/responses` with `stream: true` and verify event ordering and terminal
`[DONE]`.
## Doc Updates (Follow-up)
- Add a new docs page for `/v1/responses` usage and examples.
- Update `/gateway/openai-http-api` with a legacy note and pointer to `/v1/responses`.


@@ -1,231 +0,0 @@
---
title: "Workspace Memory Research"
summary: "Research notes: offline memory system for Clawd workspaces (Markdown source-of-truth + derived index)"
author: "Peter Steinberger <steipete@gmail.com>"
github_username: "steipete"
created: "2025-12-23"
read_when:
- Designing workspace memory (~/.openclaw/workspace) beyond daily Markdown logs
- Deciding: standalone CLI vs deep OpenClaw integration
- Adding offline recall + reflection (retain/recall/reflect)
---
# Workspace Memory v2 (offline): research notes
Target: Clawd-style workspace (`agents.defaults.workspace`, default `~/.openclaw/workspace`) where “memory” is stored as one Markdown file per day (`memory/YYYY-MM-DD.md`) plus a small set of stable files (e.g. `memory.md`, `SOUL.md`).
This doc proposes an **offline-first** memory architecture that keeps Markdown as the canonical, reviewable source of truth, but adds **structured recall** (search, entity summaries, confidence updates) via a derived index.
## Why change?
The current setup (one file per day) is excellent for:
- “append-only” journaling
- human editing
- git-backed durability + auditability
- low-friction capture (“just write it down”)
It's weak for:
- high-recall retrieval (“what did we decide about X?”, “last time we tried Y?”)
- entity-centric answers (“tell me about Alice / The Castle / warelay”) without rereading many files
- opinion/preference stability (and evidence when it changes)
- time constraints (“what was true during Nov 2025?”) and conflict resolution
## Design goals
- **Offline**: works without network; can run on laptop/Castle; no cloud dependency.
- **Explainable**: retrieved items should be attributable (file + location) and separable from inference.
- **Low ceremony**: daily logging stays Markdown, no heavy schema work.
- **Incremental**: v1 is useful with FTS only; semantic/vector and graphs are optional upgrades.
- **Agent-friendly**: makes “recall within token budgets” easy (return small bundles of facts).
## North star model (Hindsight × Letta)
Two pieces to blend:
1. **Letta/MemGPT-style control loop**
- keep a small “core” always in context (persona + key user facts)
- everything else is out-of-context and retrieved via tools
- memory writes are explicit tool calls (append/replace/insert), persisted, then re-injected next turn
2. **Hindsight-style memory substrate**
- separate what's observed vs what's believed vs what's summarized
- support retain/recall/reflect
- confidence-bearing opinions that can evolve with evidence
- entity-aware retrieval + temporal queries (even without full knowledge graphs)
## Proposed architecture (Markdown source-of-truth + derived index)
### Canonical store (git-friendly)
Keep `~/.openclaw/workspace` as canonical human-readable memory.
Suggested workspace layout:
```
~/.openclaw/workspace/
memory.md # small: durable facts + preferences (core-ish)
memory/
YYYY-MM-DD.md # daily log (append; narrative)
bank/ # “typed” memory pages (stable, reviewable)
world.md # objective facts about the world
experience.md # what the agent did (first-person)
opinions.md # subjective prefs/judgments + confidence + evidence pointers
entities/
Peter.md
The-Castle.md
warelay.md
...
```
Notes:
- **Daily log stays daily log**. No need to turn it into JSON.
- The `bank/` files are **curated**, produced by reflection jobs, and can still be edited by hand.
- `memory.md` remains “small + core-ish”: the things you want Clawd to see every session.
### Derived store (machine recall)
Add a derived index under the workspace (not necessarily git tracked):
```
~/.openclaw/workspace/.memory/index.sqlite
```
Back it with:
- SQLite schema for facts + entity links + opinion metadata
- SQLite **FTS5** for lexical recall (fast, tiny, offline)
- optional embeddings table for semantic recall (still offline)
The index is always **rebuildable from Markdown**.
## Retain / Recall / Reflect (operational loop)
### Retain: normalize daily logs into “facts”
Hindsight's key insight that matters here: store **narrative, self-contained facts**, not tiny snippets.
Practical rule for `memory/YYYY-MM-DD.md`:
- at end of day (or during), add a `## Retain` section with 2–5 bullets that are:
- narrative (cross-turn context preserved)
- self-contained (standalone makes sense later)
- tagged with type + entity mentions
Example:
```
## Retain
- W @Peter: Currently in Marrakech (Nov 27–Dec 1, 2025) for Andy's birthday.
- B @warelay: I fixed the Baileys WS crash by wrapping connection.update handlers in try/catch (see memory/2025-11-27.md).
- O(c=0.95) @Peter: Prefers concise replies (<1500 chars) on WhatsApp; long content goes into files.
```
Minimal parsing:
- Type prefix: `W` (world), `B` (experience/biographical), `O` (opinion), `S` (observation/summary; usually generated)
- Entities: `@Peter`, `@warelay`, etc (slugs map to `bank/entities/*.md`)
- Opinion confidence: `O(c=0.0..1.0)` optional
If you don't want authors to think about it: the reflect job can infer these bullets from the rest of the log, but having an explicit `## Retain` section is the easiest “quality lever”.
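The minimal parsing rules above fit in one regular expression. This is a sketch under assumptions: `parseRetainLine` and the `RetainFact` shape are hypothetical names, and entity slugs are assumed to consist of word characters and hyphens.

```typescript
type RetainKind = "world" | "experience" | "opinion" | "observation";

type RetainFact = {
  kind: RetainKind;
  confidence?: number; // only present when written as O(c=...)
  entities: string[]; // slugs without the leading "@"
  content: string;
};

const KIND_BY_PREFIX: Record<string, RetainKind> = {
  W: "world",
  B: "experience",
  O: "opinion",
  S: "observation",
};

// Matches e.g. "- O(c=0.95) @Peter: Prefers concise replies"
const RETAIN_LINE_RE =
  /^-\s+([WBOS])(?:\(c=([0-9]*\.?[0-9]+)\))?((?:\s+@[\w-]+)*)\s*:\s*(.+)$/;

// Returns null for lines that are not well-formed Retain bullets.
function parseRetainLine(line: string): RetainFact | null {
  const match = RETAIN_LINE_RE.exec(line.trim());
  if (!match) return null;
  const [, prefix, conf, mentions, content] = match;
  const fact: RetainFact = {
    kind: KIND_BY_PREFIX[prefix],
    entities: (mentions.match(/@[\w-]+/g) ?? []).map((m) => m.slice(1)),
    content: content.trim(),
  };
  if (conf !== undefined) {
    const c = Number(conf);
    if (c < 0 || c > 1) return null; // confidence must stay within [0, 1]
    fact.confidence = c;
  }
  return fact;
}
```

Returning `null` instead of throwing lets an indexer skip free-form bullets in the same section without aborting the whole file.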
### Recall: queries over the derived index
Recall should support:
- **lexical**: “find exact terms / names / commands” (FTS5)
- **entity**: “tell me about X” (entity pages + entity-linked facts)
- **temporal**: “what happened around Nov 27” / “since last week”
- **opinion**: “what does Peter prefer?” (with confidence + evidence)
Return format should be agent-friendly and cite sources:
- `kind` (`world|experience|opinion|observation`)
- `timestamp` (source day, or extracted time range if present)
- `entities` (`["Peter","warelay"]`)
- `content` (the narrative fact)
- `source` (`memory/2025-11-27.md#L12` etc)
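As a rough illustration of that return format, the fields map onto a small record type, and a renderer can keep a bundle within a size budget. `RecallResult` and `renderRecallBundle` are hypothetical names, and a character budget stands in for a token budget here.

```typescript
type RecallKind = "world" | "experience" | "opinion" | "observation";

type RecallResult = {
  kind: RecallKind;
  timestamp: string; // source day (YYYY-MM-DD) or an extracted range
  entities: string[];
  content: string; // the narrative fact
  source: string; // e.g. "memory/2025-11-27.md#L12"
};

// Render results as cited bullets, stopping before the budget is exceeded.
// Assumption: results arrive already ranked, best first.
function renderRecallBundle(results: RecallResult[], maxChars: number): string {
  const lines: string[] = [];
  let used = 0;
  for (const r of results) {
    const line = `- [${r.kind}] ${r.content} (${r.source})`;
    if (used + line.length + 1 > maxChars) break;
    lines.push(line);
    used += line.length + 1; // +1 accounts for the newline
  }
  return lines.join("\n");
}
```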
### Reflect: produce stable pages + update beliefs
Reflection is a scheduled job (daily or heartbeat `ultrathink`) that:
- updates `bank/entities/*.md` from recent facts (entity summaries)
- updates `bank/opinions.md` confidence based on reinforcement/contradiction
- optionally proposes edits to `memory.md` (“core-ish” durable facts)
Opinion evolution (simple, explainable):
- each opinion has:
- statement
- confidence `c ∈ [0,1]`
- last_updated
- evidence links (supporting + contradicting fact IDs)
- when new facts arrive:
- find candidate opinions by entity overlap + similarity (FTS first, embeddings later)
- update confidence by small deltas; big jumps require strong contradiction + repeated evidence
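One way to read "small deltas; big jumps require strong contradiction + repeated evidence" is sketched below. The step sizes and the threshold are invented for illustration and are not part of these notes; `applyEvidence` is a hypothetical name.

```typescript
type Opinion = {
  statement: string;
  confidence: number; // c in [0, 1]
  lastUpdated: string; // ISO date
  supporting: string[]; // fact IDs
  contradicting: string[]; // fact IDs
};

type Evidence = { factId: string; supports: boolean; date: string };

// Hypothetical tuning constants, chosen only to make the sketch concrete.
const STEP = 0.05; // small nudge per fact
const STRONG_STEP = 0.2; // larger drop once contradictions repeat
const STRONG_THRESHOLD = 3; // contradicting facts needed for the larger drop

// Returns a new opinion; the input is not mutated.
function applyEvidence(opinion: Opinion, evidence: Evidence): Opinion {
  const supporting = evidence.supports
    ? [...opinion.supporting, evidence.factId]
    : opinion.supporting;
  const contradicting = evidence.supports
    ? opinion.contradicting
    : [...opinion.contradicting, evidence.factId];

  let delta: number;
  if (evidence.supports) {
    delta = STEP;
  } else {
    delta = contradicting.length >= STRONG_THRESHOLD ? -STRONG_STEP : -STEP;
  }

  const confidence = Math.min(1, Math.max(0, opinion.confidence + delta));
  return { ...opinion, confidence, lastUpdated: evidence.date, supporting, contradicting };
}
```

Keeping the evidence IDs alongside the score is what makes each change explainable after the fact.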
## CLI integration: standalone vs deep integration
Recommendation: **deep integration in OpenClaw**, but keep a separable core library.
### Why integrate into OpenClaw?
- OpenClaw already knows:
- the workspace path (`agents.defaults.workspace`)
- the session model + heartbeats
- logging + troubleshooting patterns
- You want the agent itself to call the tools:
- `openclaw memory recall "…" --k 25 --since 30d`
- `openclaw memory reflect --since 7d`
### Why still split a library?
- keep memory logic testable without gateway/runtime
- reuse from other contexts (local scripts, future desktop app, etc.)
Shape:
The memory tooling is intended to be a small CLI + library layer, but this is exploratory only.
## “S-Collide” / SuCo: when to use it (research)
If “S-Collide” refers to **SuCo (Subspace Collision)**: it's an ANN retrieval approach that targets strong recall/latency tradeoffs by using learned/structured collisions in subspaces (paper: arXiv 2411.14754, 2024).
Pragmatic take for `~/.openclaw/workspace`:
- **don't start** with SuCo.
- start with SQLite FTS + (optional) simple embeddings; you'll get most UX wins immediately.
- consider SuCo/HNSW/ScaNN-class solutions only once:
- corpus is big (tens/hundreds of thousands of chunks)
- brute-force embedding search becomes too slow
- recall quality is meaningfully bottlenecked by lexical search
Offline-friendly alternatives (in increasing complexity):
- SQLite FTS5 + metadata filters (zero ML)
- Embeddings + brute force (works surprisingly far if chunk count is low)
- HNSW index (common, robust; needs a library binding)
- SuCo (research-grade; attractive if there's a solid implementation you can embed)
Open question:
- what's the **best** offline embedding model for “personal assistant memory” on your machines (laptop + desktop)?
- if you already have Ollama: embed with a local model; otherwise ship a small embedding model in the toolchain.
## Smallest useful pilot
If you want a minimal, still-useful version:
- Add `bank/` entity pages and a `## Retain` section in daily logs.
- Use SQLite FTS for recall with citations (path + line numbers).
- Add embeddings only if recall quality or scale demands it.
## References
- Letta / MemGPT concepts: “core memory blocks” + “archival memory” + tool-driven self-editing memory.
- Hindsight Technical Report: “retain / recall / reflect”, four-network memory, narrative fact extraction, opinion confidence evolution.
- SuCo: arXiv 2411.14754 (2024): “Subspace Collision” approximate nearest neighbor retrieval.


@@ -1,46 +0,0 @@
---
title: "Onboarding and Config Protocol"
summary: "RPC protocol notes for setup wizard and config schema"
author: "Peter Steinberger <steipete@gmail.com>"
github_username: "steipete"
created: "2026-01-03"
read_when: "Changing setup wizard steps or config schema endpoints"
---
# Onboarding + Config Protocol
Purpose: shared onboarding + config surfaces across CLI, macOS app, and Web UI.
## Components
- Wizard engine (shared session + prompts + onboarding state).
- CLI onboarding uses the same wizard flow as the UI clients.
- Gateway RPC exposes wizard + config schema endpoints.
- macOS onboarding uses the wizard step model.
- Web UI renders config forms from JSON Schema + UI hints.
## Gateway RPC
- `wizard.start` params: `{ mode?: "local"|"remote", workspace?: string }`
- `wizard.next` params: `{ sessionId, answer?: { stepId, value? } }`
- `wizard.cancel` params: `{ sessionId }`
- `wizard.status` params: `{ sessionId }`
- `config.schema` params: `{}`
- `config.schema.lookup` params: `{ path }`
- `path` accepts standard config segments plus slash-delimited plugin ids, for example `plugins.entries.pack/one.config`.
Responses (shape)
- Wizard: `{ sessionId, done, step?, status?, error? }`
- Config schema: `{ schema, uiHints, version, generatedAt }`
- Config schema lookup: `{ path, schema, hint?, hintPath?, children[] }`
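For orientation, the wizard shapes above translate into types roughly like these. Fields whose types the notes leave open (`step`, `status`, `error`, `value`) are typed as `unknown` or `string` by assumption, and `nextWizardRequest` is a hypothetical helper rather than an existing client API.

```typescript
type WizardStartParams = { mode?: "local" | "remote"; workspace?: string };
type WizardAnswer = { stepId: string; value?: unknown };
type WizardNextParams = { sessionId: string; answer?: WizardAnswer };
type WizardSessionParams = { sessionId: string }; // wizard.cancel and wizard.status

// Assumption: `status` and `error` are strings; the notes only list the field names.
type WizardResponse = {
  sessionId: string;
  done: boolean;
  step?: unknown;
  status?: string;
  error?: string;
};

type WizardNextRequest = { method: "wizard.next"; params: WizardNextParams };

// Decide what a client sends after a wizard response.
// Returns null once the session is finished or has failed.
function nextWizardRequest(
  res: WizardResponse,
  answer?: WizardAnswer,
): WizardNextRequest | null {
  if (res.done || res.error !== undefined) return null;
  const params: WizardNextParams = answer
    ? { sessionId: res.sessionId, answer }
    : { sessionId: res.sessionId };
  return { method: "wizard.next", params };
}
```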
## UI Hints
- `uiHints` keyed by path; optional metadata (label/help/group/order/advanced/sensitive/placeholder).
- Sensitive fields render as password inputs; no redaction layer.
- Unsupported schema nodes fall back to the raw JSON editor.
## Notes
- This doc is the single place to track protocol refactors for onboarding/config.


@@ -1,39 +0,0 @@
---
title: "Model Config Exploration"
summary: "Exploration: model config, auth profiles, and fallback behavior"
author: "Peter Steinberger <steipete@gmail.com>"
github_username: "steipete"
created: "2026-01-05"
read_when:
- Exploring future model selection + auth profile ideas
---
# Model Config (Exploration)
This document captures **ideas** for future model configuration. It is not a
shipping spec. For current behavior, see:
- [Models](/concepts/models)
- [Model failover](/concepts/model-failover)
- [OAuth + profiles](/concepts/oauth)
## Motivation
Operators want:
- Multiple auth profiles per provider (personal vs work).
- Simple `/model` selection with predictable fallbacks.
- Clear separation between text models and image-capable models.
## Possible direction (high level)
- Keep model selection simple: `provider/model` with optional aliases.
- Let providers have multiple auth profiles, with an explicit order.
- Use a global fallback list so all sessions fail over consistently.
- Only override image routing when explicitly configured.
## Open questions
- Should profile rotation be per-provider or per-model?
- How should the UI surface profile selection for a session?
- What is the safest migration path from legacy config keys?


@@ -1,320 +0,0 @@
---
title: "hooks.gmail.model for Gmail PubSub Processing"
summary: "Spec for hooks.gmail.model - cheaper model for Gmail PubSub processing"
author: "Peter Steinberger <steipete@gmail.com>"
github_username: "steipete"
created: "2026-01-09"
read_when:
- Implementing hooks.gmail.model feature
- Modifying Gmail hook processing
- Working on hook model selection
---
# hooks.gmail.model: Cheaper Model for Gmail PubSub Processing
## Problem
Gmail PubSub hook processing (`/gmail-pubsub`) currently uses the session's primary model (`agents.defaults.model.primary`), which may be an expensive model like `claude-opus-4-5`. For automated email processing that doesn't require the most capable model, this wastes tokens/cost.
## Solution
Add `hooks.gmail.model` config option to specify an optional cheaper model for Gmail PubSub processing, with intelligent fallback to the primary model on auth/rate-limit/timeout failures.
## Config Structure
```json5
{
hooks: {
gmail: {
account: "user@gmail.com",
// ... existing gmail config ...
// NEW: Optional model override for Gmail hook processing
model: "openrouter/meta-llama/llama-3.3-70b-instruct:free",
// NEW: Optional thinking level override
thinking: "off",
},
},
}
```
### Fields
| Field | Type | Default | Description |
| ---------------------- | -------- | ----------- | ---------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------- |
| `hooks.gmail.model` | `string` | (none) | Model to use for Gmail hook processing. Accepts `provider/model` refs or aliases from `agents.defaults.models`. |
| `hooks.gmail.thinking` | `string` | (inherited) | Thinking level override (`off`, `minimal`, `low`, `medium`, `high`, `xhigh`; GPT-5.2 + Codex models only). If unset, inherits from `agents.defaults.thinkingDefault` or model's default. |
### Alias Support
`hooks.gmail.model` accepts:
- Full refs: `"openrouter/meta-llama/llama-3.3-70b-instruct:free"`
- Aliases from `agents.defaults.models`: `"Opus"`, `"Sonnet"`, `"GLM"`
Resolution uses `buildModelAliasIndex()` from `model-selection.ts`.
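To illustrate the two accepted forms, here is a standalone sketch. It does not reproduce `buildModelAliasIndex()`; `resolveModelRefSketch`, the alias map, and case-insensitive alias matching are all assumptions made for the example.

```typescript
type ModelRef = { provider: string; model: string };

// Hypothetical resolver: alias lookup first, then split on the FIRST slash so
// that model ids containing slashes (e.g. OpenRouter ids) stay intact.
function resolveModelRefSketch(
  raw: string,
  aliases: Map<string, ModelRef>, // keys assumed to be lowercased alias names
  defaultProvider: string,
): ModelRef | null {
  const trimmed = raw.trim();
  if (!trimmed) return null;

  const alias = aliases.get(trimmed.toLowerCase());
  if (alias) return alias;

  const slash = trimmed.indexOf("/");
  if (slash === -1) {
    // Bare model name: assume the default provider.
    return { provider: defaultProvider, model: trimmed };
  }
  const provider = trimmed.slice(0, slash);
  const model = trimmed.slice(slash + 1);
  if (!provider || !model) return null;
  return { provider, model };
}
```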
## Fallback Behavior
### Fallback Triggers
Auth, rate-limit, and timeout errors trigger fallback:
- `401 Unauthorized`
- `403 Forbidden`
- `429 Too Many Requests`
- Timeouts (provider hangs / network timeouts)
Other errors (500s, content errors) fail in place.
### Fallback Chain
```
hooks.gmail.model (if set)
↓ (on auth/rate-limit/timeout)
agents.defaults.model.fallbacks[0..n]
↓ (exhausted)
agents.defaults.model.primary
```
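Read as an ordered candidate list, the chain might be built like this. The flat config shape and `buildGmailHookCandidates` are hypothetical, and skipping duplicates is an assumption the spec does not state.

```typescript
// Hypothetical flattened view of the relevant config values.
type GmailHookModelConfig = {
  hooksModel?: string; // hooks.gmail.model
  fallbacks?: string[]; // agents.defaults.model.fallbacks
  primary: string; // agents.defaults.model.primary
};

// Order: hook override (if set), then fallbacks, then primary.
// Duplicates keep their first position so no model is tried twice.
function buildGmailHookCandidates(cfg: GmailHookModelConfig): string[] {
  const ordered = [cfg.hooksModel, ...(cfg.fallbacks ?? []), cfg.primary];
  const seen = new Set<string>();
  const candidates: string[] = [];
  for (const model of ordered) {
    if (!model || seen.has(model)) continue;
    seen.add(model);
    candidates.push(model);
  }
  return candidates;
}
```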
### Uncatalogued Model
If `hooks.gmail.model` is set but not found in the model catalog or allowlist:
- **Config load**: Log warning (surfaced in `clawdbot doctor`)
- **Allowlist**: If `agents.defaults.models` is set and the model isn't listed, the hook falls back to primary.
### Cooldown Integration
Uses existing model-failover cooldown from `model-failover.ts`:
- After auth/rate-limit failure, model enters cooldown
- Next hook invocation respects cooldown before retrying
- Integrates with auth profile rotation
## Implementation
### Type Changes
```typescript
// src/config/types.ts
export type HooksGmailConfig = {
account?: string;
label?: string;
// ... existing fields ...
/** Optional model override for Gmail hook processing (provider/model or alias). */
model?: string;
/** Optional thinking level override for Gmail hook processing. */
thinking?: "off" | "minimal" | "low" | "medium" | "high" | "xhigh";
};
```
### Model Resolution
New function in `src/cron/isolated-agent.ts` or `src/agents/model-selection.ts`:
```typescript
export function resolveHooksGmailModel(params: {
cfg: ClawdbotConfig;
defaultProvider: string;
defaultModel: string;
}): { provider: string; model: string; isHooksOverride: boolean } | null {
const hooksModel = params.cfg.hooks?.gmail?.model;
if (!hooksModel) return null;
const aliasIndex = buildModelAliasIndex({
cfg: params.cfg,
defaultProvider: params.defaultProvider,
});
const resolved = resolveModelRefFromString({
raw: hooksModel,
defaultProvider: params.defaultProvider,
aliasIndex,
});
if (!resolved) return null;
return {
provider: resolved.ref.provider,
model: resolved.ref.model,
isHooksOverride: true,
};
}
```
### Processing Flow
In `runCronIsolatedAgentTurn()` (or new wrapper for hooks):
```typescript
// Resolve model - prefer hooks.gmail.model for Gmail hooks
const isGmailHook = params.sessionKey.startsWith("hook:gmail:");
const hooksModelRef = isGmailHook
? resolveHooksGmailModel({ cfg, defaultProvider, defaultModel })
: null;
const { provider, model } =
hooksModelRef ??
resolveConfiguredModelRef({
cfg: params.cfg,
defaultProvider: DEFAULT_PROVIDER,
defaultModel: DEFAULT_MODEL,
});
// Run with fallback - on auth/rate-limit/timeout, fall through to agents.defaults.model.fallbacks
const fallbackResult = await runWithModelFallback({
cfg: params.cfg,
provider,
model,
hooksOverride: hooksModelRef?.isHooksOverride,
run: (providerOverride, modelOverride) =>
runEmbeddedPiAgent({
// ... existing params ...
}),
});
```
### Fallback Detection
Extend `runWithModelFallback()` to detect auth/rate-limit:
```typescript
function isAuthRateLimitError(err: unknown): boolean {
if (err instanceof ApiError) {
return [401, 403, 429].includes(err.status);
}
// Check for common patterns in error messages
const msg = String(err).toLowerCase();
return (
msg.includes("unauthorized") || msg.includes("rate limit") || msg.includes("quota exceeded")
);
}
```
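The check above covers auth and rate-limit errors only, while the Fallback Triggers section also lists timeouts. A possible timeout-aware variant is sketched here; `HttpStatusError` is a stand-in for whatever error class the provider client throws, and the timeout signals checked (`TimeoutError` name, `ETIMEDOUT` code) are assumptions about how timeouts surface.

```typescript
// Stand-in for the provider client's HTTP error type (hypothetical).
class HttpStatusError extends Error {
  status: number;
  constructor(status: number, message: string) {
    super(message);
    this.name = "HttpStatusError";
    this.status = status;
  }
}

// True for the triggers listed in the spec: 401, 403, 429, and timeouts.
// 5xx and content errors return false and fail in place.
function isFallbackTrigger(err: unknown): boolean {
  if (err instanceof HttpStatusError) {
    return [401, 403, 429].includes(err.status);
  }
  if (err instanceof Error) {
    // AbortSignal.timeout() rejects with an error named "TimeoutError".
    if (err.name === "TimeoutError") return true;
    // Node socket timeouts usually carry a string `code`.
    const code = (err as Error & { code?: unknown }).code;
    if (code === "ETIMEDOUT") return true;
  }
  return false;
}
```

Matching on status codes and error names rather than message substrings avoids false positives from error text that merely mentions "unauthorized" or "rate limit".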
## Validation
### Config Load Time
In config validation (for `clawdbot doctor`):
```typescript
if (cfg.hooks?.gmail?.model) {
const resolved = resolveHooksGmailModel({ cfg, defaultProvider, defaultModel });
if (!resolved) {
issues.push({
path: "hooks.gmail.model",
message: `Model "${cfg.hooks.gmail.model}" could not be resolved`,
});
} else {
const catalog = await loadModelCatalog({ config: cfg });
const key = modelKey(resolved.provider, resolved.model);
const inCatalog = catalog.some((e) => modelKey(e.provider, e.id) === key);
if (!inCatalog) {
issues.push({
path: "hooks.gmail.model",
message: `Model "${key}" not found in agents.defaults.models catalog (will fall back to primary)`,
});
}
}
}
```
### Runtime
At hook invocation time, validate and fall back:
- If model not in catalog → log warning, use primary
- If model auth fails → log warning, enter cooldown, fall back
## Observability
### Log Messages
```
[hooks] Gmail hook: using model openrouter/meta-llama/llama-3.3-70b-instruct:free
[hooks] Gmail hook: model llama auth failed (429), falling back to claude-opus-4-5
```
### Hook Event Summary
Include fallback info in the hook summary sent to session:
```
Hook Gmail (fallback:llama→opus): <summary>
```
## Hot Reload
`hooks.gmail.model` and `hooks.gmail.thinking` are hot-reloadable:
- Changes apply to the next hook invocation
- No gateway restart required
- Hooks config is already in the hot-reload matrix
## Test Plan
### Unit Tests
1. **Model resolution** (`model-selection.test.ts`):
- `resolveHooksGmailModel()` with valid ref
- `resolveHooksGmailModel()` with alias
- `resolveHooksGmailModel()` with invalid input → null
2. **Config validation** (`config.test.ts`):
- Warning on uncatalogued model
- No warning on valid model
- Graceful handling of missing hooks.gmail section
3. **Fallback triggers** (`model-fallback.test.ts`):
- 401/403/429 → triggers fallback
- timeouts → triggers fallback
- 500 → no fallback
- Content error → no fallback
### Integration Tests
1. **Hook processing** (`server.hooks.test.ts`):
- Gmail hook uses `hooks.gmail.model` when set
- Fallback to primary on auth failure
- Thinking level override applied
2. **Hot reload** (`config-reload.test.ts`):
- Change `hooks.gmail.model` → next hook uses new model
## Documentation
Update `docs/gateway/configuration.md`:
```json5
{
hooks: {
gmail: {
account: "user@gmail.com",
topic: "projects/my-project/topics/gmail-watch",
// ... existing config ...
// Optional: Use a cheaper model for Gmail processing
// Falls back to agents.defaults.model.primary on auth/rate-limit errors
model: "openrouter/meta-llama/llama-3.3-70b-instruct:free",
// Optional: Override thinking level for Gmail processing
thinking: "off",
},
},
}
```
## Scope Limitation
This PR is Gmail-specific. Future hooks (`hooks.github.model`, etc.) would follow the same pattern but are out of scope.
## Changelog Entry
```
- feat: add hooks.gmail.model for cheaper Gmail PubSub processing (#XXX)
- Falls back to agents.defaults.model.primary on auth/rate-limit/timeouts (401/403/429)
- Supports aliases from agents.defaults.models
- Add hooks.gmail.thinking override
```


@@ -30,7 +30,6 @@
"docs/",
"!docs/.generated/**",
"!docs/.i18n/zh-CN.tm.jsonl",
"!docs/internal/**",
"skills/",
"scripts/postinstall-bundled-plugins.mjs"
],
@@ -1033,8 +1032,8 @@
"format:all": "pnpm format && pnpm format:swift",
"format:check": "oxfmt --check --threads=1",
"format:diff": "oxfmt --write && git --no-pager diff",
"format:docs": "git ls-files 'docs/**/*.md' 'docs/**/*.mdx' 'README.md' ':(exclude)docs/internal/**' | xargs oxfmt --write",
"format:docs:check": "git ls-files 'docs/**/*.md' 'docs/**/*.mdx' 'README.md' ':(exclude)docs/internal/**' | xargs oxfmt --check",
"format:docs": "git ls-files 'docs/**/*.md' 'docs/**/*.mdx' 'README.md' | xargs oxfmt --write",
"format:docs:check": "git ls-files 'docs/**/*.md' 'docs/**/*.mdx' 'README.md' | xargs oxfmt --check",
"format:fix": "oxfmt --write",
"format:swift": "swiftformat --lint --config .swiftformat apps/macos/Sources apps/ios/Sources apps/shared/OpenClawKit/Sources",
"gateway:dev": "OPENCLAW_SKIP_CHANNELS=1 node scripts/run-node.mjs --dev gateway",


@@ -6,7 +6,7 @@ import path from "node:path";
const ROOT = process.cwd();
const GLOSSARY_PATH = path.join(ROOT, "docs", ".i18n", "glossary.zh-CN.json");
const DOC_FILE_RE = /^docs\/(?!zh-CN\/)(?!internal\/).+\.(md|mdx)$/i;
const DOC_FILE_RE = /^docs\/(?!zh-CN\/).+\.(md|mdx)$/i;
const LIST_ITEM_LINK_RE = /^\s*(?:[-*]|\d+\.)\s+\[([^\]]+)\]\((\/[^)]+)\)/;
const MAX_TITLE_WORDS = 8;
const MAX_LABEL_WORDS = 6;


@@ -65,15 +65,7 @@ for (const item of docsConfig.redirects || []) {
}
const allFiles = walk(DOCS_DIR);
function isIgnoredDocsPath(relPath) {
return relPath.startsWith("internal/");
}
const relAllFiles = new Set(
allFiles
.map((abs) => normalizeSlashes(path.relative(DOCS_DIR, abs)))
.filter((rel) => !isIgnoredDocsPath(rel)),
);
const relAllFiles = new Set(allFiles.map((abs) => normalizeSlashes(path.relative(DOCS_DIR, abs))));
function isGeneratedTranslatedDoc(relPath) {
return relPath.startsWith("zh-CN/");
@@ -84,7 +76,7 @@ const markdownFiles = allFiles.filter((abs) => {
return false;
}
const rel = normalizeSlashes(path.relative(DOCS_DIR, abs));
return !isGeneratedTranslatedDoc(rel) && !isIgnoredDocsPath(rel);
return !isGeneratedTranslatedDoc(rel);
});
const routes = new Set();


@@ -20,7 +20,7 @@ if (!statSync(DOCS_DIR).isDirectory()) {
process.exit(1);
}
const EXCLUDED_DIRS = new Set(["archive", "internal", "research"]);
const EXCLUDED_DIRS = new Set(["archive", "research"]);
/**
* @param {unknown[]} values