From da2289869d692cb5619668eb825e9d92fca2eecf Mon Sep 17 00:00:00 2001 From: Vincent Koc Date: Wed, 18 Mar 2026 00:55:47 -0700 Subject: [PATCH] docs: remove experiments/ and design/ directories Delete all experiment plans, proposals, research docs, and the kilo-gateway-integration design doc. These are internal planning docs that do not belong on the public docs site. - 12 English experiment files - 5 zh-CN experiment translations - 1 design doc (kilo-gateway-integration) - Remove nav groups from docs.json (English + zh-CN) - Remove 3 redirects pointing to deleted experiment pages - Remove dead experiment links from hubs.md Co-Authored-By: Claude Opus 4.6 --- docs/design/kilo-gateway-integration.md | 542 ------------ docs/docs.json | 38 - .../experiments/onboarding-config-protocol.md | 43 - ...ndings-discord-channels-telegram-topics.md | 375 -------- .../plans/acp-thread-bound-agents.md | 800 ------------------ .../plans/acp-unified-streaming-refactor.md | 96 --- .../plans/browser-evaluate-cdp-refactor.md | 232 ----- .../plans/discord-async-inbound-worker.md | 337 -------- .../plans/openresponses-gateway.md | 126 --- .../plans/pty-process-supervision.md | 195 ----- .../plans/session-binding-channel-agnostic.md | 226 ----- .../proposals/acp-bound-command-auth.md | 89 -- docs/experiments/proposals/model-config.md | 36 - docs/experiments/research/memory.md | 228 ----- docs/start/hubs.md | 6 - .../experiments/onboarding-config-protocol.md | 47 - .../experiments/plans/cron-add-hardening.md | 70 -- .../plans/group-policy-hardening.md | 45 - .../plans/openresponses-gateway.md | 121 --- .../experiments/proposals/model-config.md | 42 - docs/zh-CN/experiments/research/memory.md | 235 ----- 21 files changed, 3929 deletions(-) delete mode 100644 docs/design/kilo-gateway-integration.md delete mode 100644 docs/experiments/onboarding-config-protocol.md delete mode 100644 docs/experiments/plans/acp-persistent-bindings-discord-channels-telegram-topics.md delete mode 100644 docs/experiments/plans/acp-thread-bound-agents.md delete mode 100644 docs/experiments/plans/acp-unified-streaming-refactor.md delete mode 100644 docs/experiments/plans/browser-evaluate-cdp-refactor.md delete mode 100644 docs/experiments/plans/discord-async-inbound-worker.md delete mode 100644 docs/experiments/plans/openresponses-gateway.md delete mode 100644 docs/experiments/plans/pty-process-supervision.md delete mode 100644 docs/experiments/plans/session-binding-channel-agnostic.md delete mode 100644 docs/experiments/proposals/acp-bound-command-auth.md delete mode 100644 docs/experiments/proposals/model-config.md delete mode 100644 docs/experiments/research/memory.md delete mode 100644 docs/zh-CN/experiments/onboarding-config-protocol.md delete mode 100644 docs/zh-CN/experiments/plans/cron-add-hardening.md delete mode 100644 docs/zh-CN/experiments/plans/group-policy-hardening.md delete mode 100644 docs/zh-CN/experiments/plans/openresponses-gateway.md delete mode 100644 docs/zh-CN/experiments/proposals/model-config.md delete mode 100644 docs/zh-CN/experiments/research/memory.md diff --git a/docs/design/kilo-gateway-integration.md b/docs/design/kilo-gateway-integration.md deleted file mode 100644 index e498ea36e89..00000000000 --- a/docs/design/kilo-gateway-integration.md +++ /dev/null @@ -1,542 +0,0 @@ ---- -title: "Kilo Gateway Integration Design" -summary: "Design doc for integrating Kilo Gateway as a first-class OpenClaw provider" -read_when: - - Working on the Kilo Gateway provider integration - - Understanding provider integration patterns ---- - -# Kilo Gateway Provider Integration Design - -## Overview - -This document outlines the design for integrating "Kilo Gateway" as a first-class provider in OpenClaw, modeled after the existing OpenRouter implementation. Kilo Gateway uses an OpenAI-compatible completions API with a different base URL. - -## Design Decisions - -### 1. Provider Naming - -**Recommendation: `kilocode`** - -Rationale: - -- Matches the user config example provided (`kilocode` provider key) -- Consistent with existing provider naming patterns (e.g., `openrouter`, `opencode`, `moonshot`) -- Short and memorable -- Avoids confusion with generic "kilo" or "gateway" terms - -Alternative considered: `kilo-gateway` - rejected because hyphenated names are less common in the codebase and `kilocode` is more concise. - -### 2. Default Model Reference - -**Recommendation: `kilocode/anthropic/claude-opus-4.6`** - -Rationale: - -- Based on user config example -- Claude Opus 4.5 is a capable default model -- Explicit model selection avoids reliance on auto-routing - -### 3. Base URL Configuration - -**Recommendation: Hardcoded default with config override** - -- **Default Base URL:** `https://api.kilo.ai/api/gateway/` -- **Configurable:** Yes, via `models.providers.kilocode.baseUrl` - -This matches the pattern used by other providers like Moonshot, Venice, and Synthetic. - -### 4. Model Scanning - -**Recommendation: No dedicated model scanning endpoint initially** - -Rationale: - -- Kilo Gateway proxies to OpenRouter, so models are dynamic -- Users can manually configure models in their config -- If Kilo Gateway exposes a `/models` endpoint in the future, scanning can be added - -### 5. Special Handling - -**Recommendation: Inherit OpenRouter behavior for Anthropic models** - -Since Kilo Gateway proxies to OpenRouter, the same special handling should apply: - -- Cache TTL eligibility for `anthropic/*` models -- Extra params (cacheControlTtl) for `anthropic/*` models -- Transcript policy follows OpenRouter patterns - -## Files to Modify - -### Core Credential Management - -#### 1. `src/commands/onboard-auth.credentials.ts` - -Add: - -```typescript -export const KILOCODE_DEFAULT_MODEL_REF = "kilocode/anthropic/claude-opus-4.6"; - -export async function setKilocodeApiKey(key: string, agentDir?: string) { - upsertAuthProfile({ - profileId: "kilocode:default", - credential: { - type: "api_key", - provider: "kilocode", - key, - }, - agentDir: resolveAuthAgentDir(agentDir), - }); -} -``` - -#### 2. `src/agents/model-auth.ts` - -Add to `envMap` in `resolveEnvApiKey()`: - -```typescript -const envMap: Record = { - // ... existing entries - kilocode: "KILOCODE_API_KEY", -}; -``` - -#### 3. `src/config/io.ts` - -Add to `SHELL_ENV_EXPECTED_KEYS`: - -```typescript -const SHELL_ENV_EXPECTED_KEYS = [ - // ... existing entries - "KILOCODE_API_KEY", -]; -``` - -### Config Application - -#### 4. `src/commands/onboard-auth.config-core.ts` - -Add new functions: - -```typescript -export const KILOCODE_BASE_URL = "https://api.kilo.ai/api/gateway/"; - -export function applyKilocodeProviderConfig(cfg: OpenClawConfig): OpenClawConfig { - const models = { ...cfg.agents?.defaults?.models }; - models[KILOCODE_DEFAULT_MODEL_REF] = { - ...models[KILOCODE_DEFAULT_MODEL_REF], - alias: models[KILOCODE_DEFAULT_MODEL_REF]?.alias ?? "Kilo Gateway", - }; - - const providers = { ...cfg.models?.providers }; - const existingProvider = providers.kilocode; - const { apiKey: existingApiKey, ...existingProviderRest } = (existingProvider ?? {}) as Record< - string, - unknown - > as { apiKey?: string }; - const resolvedApiKey = typeof existingApiKey === "string" ? existingApiKey : undefined; - const normalizedApiKey = resolvedApiKey?.trim(); - - providers.kilocode = { - ...existingProviderRest, - baseUrl: KILOCODE_BASE_URL, - api: "openai-completions", - ...(normalizedApiKey ? { apiKey: normalizedApiKey } : {}), - }; - - return { - ...cfg, - agents: { - ...cfg.agents, - defaults: { - ...cfg.agents?.defaults, - models, - }, - }, - models: { - mode: cfg.models?.mode ?? "merge", - providers, - }, - }; -} - -export function applyKilocodeConfig(cfg: OpenClawConfig): OpenClawConfig { - const next = applyKilocodeProviderConfig(cfg); - const existingModel = next.agents?.defaults?.model; - return { - ...next, - agents: { - ...next.agents, - defaults: { - ...next.agents?.defaults, - model: { - ...(existingModel && "fallbacks" in (existingModel as Record) - ? { - fallbacks: (existingModel as { fallbacks?: string[] }).fallbacks, - } - : undefined), - primary: KILOCODE_DEFAULT_MODEL_REF, - }, - }, - }, - }; -} -``` - -### Auth Choice System - -#### 5. `src/commands/onboard-types.ts` - -Add to `AuthChoice` type: - -```typescript -export type AuthChoice = - // ... existing choices - "kilocode-api-key"; -// ... -``` - -Add to `OnboardOptions`: - -```typescript -export type OnboardOptions = { - // ... existing options - kilocodeApiKey?: string; - // ... -}; -``` - -#### 6. `src/commands/auth-choice-options.ts` - -Add to `AuthChoiceGroupId`: - -```typescript -export type AuthChoiceGroupId = - // ... existing groups - "kilocode"; -// ... -``` - -Add to `AUTH_CHOICE_GROUP_DEFS`: - -```typescript -{ - value: "kilocode", - label: "Kilo Gateway", - hint: "API key (OpenRouter-compatible)", - choices: ["kilocode-api-key"], -}, -``` - -Add to `buildAuthChoiceOptions()`: - -```typescript -options.push({ - value: "kilocode-api-key", - label: "Kilo Gateway API key", - hint: "OpenRouter-compatible gateway", -}); -``` - -#### 7. `src/commands/auth-choice.preferred-provider.ts` - -Add mapping: - -```typescript -const PREFERRED_PROVIDER_BY_AUTH_CHOICE: Partial> = { - // ... existing mappings - "kilocode-api-key": "kilocode", -}; -``` - -### Auth Choice Application - -#### 8. `src/commands/auth-choice.apply.api-providers.ts` - -Add import: - -```typescript -import { - // ... existing imports - applyKilocodeConfig, - applyKilocodeProviderConfig, - KILOCODE_DEFAULT_MODEL_REF, - setKilocodeApiKey, -} from "./onboard-auth.js"; -``` - -Add handling for `kilocode-api-key`: - -```typescript -if (authChoice === "kilocode-api-key") { - const store = ensureAuthProfileStore(params.agentDir, { - allowKeychainPrompt: false, - }); - const profileOrder = resolveAuthProfileOrder({ - cfg: nextConfig, - store, - provider: "kilocode", - }); - const existingProfileId = profileOrder.find((profileId) => Boolean(store.profiles[profileId])); - const existingCred = existingProfileId ? store.profiles[existingProfileId] : undefined; - let profileId = "kilocode:default"; - let mode: "api_key" | "oauth" | "token" = "api_key"; - let hasCredential = false; - - if (existingProfileId && existingCred?.type) { - profileId = existingProfileId; - mode = - existingCred.type === "oauth" ? "oauth" : existingCred.type === "token" ? "token" : "api_key"; - hasCredential = true; - } - - if (!hasCredential && params.opts?.token && params.opts?.tokenProvider === "kilocode") { - await setKilocodeApiKey(normalizeApiKeyInput(params.opts.token), params.agentDir); - hasCredential = true; - } - - if (!hasCredential) { - const envKey = resolveEnvApiKey("kilocode"); - if (envKey) { - const useExisting = await params.prompter.confirm({ - message: `Use existing KILOCODE_API_KEY (${envKey.source}, ${formatApiKeyPreview(envKey.apiKey)})?`, - initialValue: true, - }); - if (useExisting) { - await setKilocodeApiKey(envKey.apiKey, params.agentDir); - hasCredential = true; - } - } - } - - if (!hasCredential) { - const key = await params.prompter.text({ - message: "Enter Kilo Gateway API key", - validate: validateApiKeyInput, - }); - await setKilocodeApiKey(normalizeApiKeyInput(String(key)), params.agentDir); - hasCredential = true; - } - - if (hasCredential) { - nextConfig = applyAuthProfileConfig(nextConfig, { - profileId, - provider: "kilocode", - mode, - }); - } - { - const applied = await applyDefaultModelChoice({ - config: nextConfig, - setDefaultModel: params.setDefaultModel, - defaultModel: KILOCODE_DEFAULT_MODEL_REF, - applyDefaultConfig: applyKilocodeConfig, - applyProviderConfig: applyKilocodeProviderConfig, - noteDefault: KILOCODE_DEFAULT_MODEL_REF, - noteAgentModel, - prompter: params.prompter, - }); - nextConfig = applied.config; - agentModelOverride = applied.agentModelOverride ?? agentModelOverride; - } - return { config: nextConfig, agentModelOverride }; -} -``` - -Also add tokenProvider mapping at the top of the function: - -```typescript -if (params.opts.tokenProvider === "kilocode") { - authChoice = "kilocode-api-key"; -} -``` - -### CLI Registration - -#### 9. `src/cli/program/register.onboard.ts` - -Add CLI option: - -```typescript -.option("--kilocode-api-key ", "Kilo Gateway API key") -``` - -Add to action handler: - -```typescript -kilocodeApiKey: opts.kilocodeApiKey as string | undefined, -``` - -Update auth-choice help text: - -```typescript -.option( - "--auth-choice ", - "Auth: setup-token|token|chutes|openai-codex|openai-api-key|openrouter-api-key|kilocode-api-key|ai-gateway-api-key|...", -) -``` - -### Non-Interactive Onboarding - -#### 10. `src/commands/onboard-non-interactive/local/auth-choice.ts` - -Add handling for `kilocode-api-key`: - -```typescript -if (authChoice === "kilocode-api-key") { - const resolved = await resolveNonInteractiveApiKey({ - provider: "kilocode", - cfg: baseConfig, - flagValue: opts.kilocodeApiKey, - flagName: "--kilocode-api-key", - envVar: "KILOCODE_API_KEY", - }); - await setKilocodeApiKey(resolved.apiKey, agentDir); - nextConfig = applyAuthProfileConfig(nextConfig, { - profileId: "kilocode:default", - provider: "kilocode", - mode: "api_key", - }); - // ... apply default model -} -``` - -### Export Updates - -#### 11. `src/commands/onboard-auth.ts` - -Add exports: - -```typescript -export { - // ... existing exports - applyKilocodeConfig, - applyKilocodeProviderConfig, - KILOCODE_BASE_URL, -} from "./onboard-auth.config-core.js"; - -export { - // ... existing exports - KILOCODE_DEFAULT_MODEL_REF, - setKilocodeApiKey, -} from "./onboard-auth.credentials.js"; -``` - -### Special Handling (Optional) - -#### 12. `src/agents/pi-embedded-runner/cache-ttl.ts` - -Add Kilo Gateway support for Anthropic models: - -```typescript -export function isCacheTtlEligibleProvider(provider: string, modelId: string): boolean { - const normalizedProvider = provider.toLowerCase(); - const normalizedModelId = modelId.toLowerCase(); - if (normalizedProvider === "anthropic") return true; - if (normalizedProvider === "openrouter" && normalizedModelId.startsWith("anthropic/")) - return true; - if (normalizedProvider === "kilocode" && normalizedModelId.startsWith("anthropic/")) return true; - return false; -} -``` - -#### 13. `src/agents/transcript-policy.ts` - -Add Kilo Gateway handling (similar to OpenRouter): - -```typescript -const isKilocodeGemini = provider === "kilocode" && modelId.toLowerCase().includes("gemini"); - -// Include in needsNonImageSanitize check -const needsNonImageSanitize = - isGoogle || isAnthropic || isMistral || isOpenRouterGemini || isKilocodeGemini; -``` - -## Configuration Structure - -### User Config Example - -```json -{ - "models": { - "mode": "merge", - "providers": { - "kilocode": { - "baseUrl": "https://api.kilo.ai/api/gateway/", - "apiKey": "xxxxx", - "api": "openai-completions", - "models": [ - { - "id": "anthropic/claude-opus-4.6", - "name": "Anthropic: Claude Opus 4.6" - }, - { "id": "minimax/minimax-m2.5:free", "name": "Minimax: Minimax M2.5" } - ] - } - } - } -} -``` - -### Auth Profile Structure - -```json -{ - "profiles": { - "kilocode:default": { - "type": "api_key", - "provider": "kilocode", - "key": "xxxxx" - } - } -} -``` - -## Testing Considerations - -1. **Unit Tests:** - - Test `setKilocodeApiKey()` writes correct profile - - Test `applyKilocodeConfig()` sets correct defaults - - Test `resolveEnvApiKey("kilocode")` returns correct env var - -2. **Integration Tests:** - - Test setup flow with `--auth-choice kilocode-api-key` - - Test non-interactive setup with `--kilocode-api-key` - - Test model selection with `kilocode/` prefix - -3. **E2E Tests:** - - Test actual API calls through Kilo Gateway (live tests) - -## Migration Notes - -- No migration needed for existing users -- New users can immediately use `kilocode-api-key` auth choice -- Existing manual config with `kilocode` provider will continue to work - -## Future Considerations - -1. **Model Catalog:** If Kilo Gateway exposes a `/models` endpoint, add scanning support similar to `scanOpenRouterModels()` - -2. **OAuth Support:** If Kilo Gateway adds OAuth, extend the auth system accordingly - -3. **Rate Limiting:** Consider adding rate limit handling specific to Kilo Gateway if needed - -4. **Documentation:** Add docs at `docs/providers/kilocode.md` explaining setup and usage - -## Summary of Changes - -| File | Change Type | Description | -| ----------------------------------------------------------- | ----------- | ----------------------------------------------------------------------- | -| `src/commands/onboard-auth.credentials.ts` | Add | `KILOCODE_DEFAULT_MODEL_REF`, `setKilocodeApiKey()` | -| `src/agents/model-auth.ts` | Modify | Add `kilocode` to `envMap` | -| `src/config/io.ts` | Modify | Add `KILOCODE_API_KEY` to shell env keys | -| `src/commands/onboard-auth.config-core.ts` | Add | `applyKilocodeProviderConfig()`, `applyKilocodeConfig()` | -| `src/commands/onboard-types.ts` | Modify | Add `kilocode-api-key` to `AuthChoice`, add `kilocodeApiKey` to options | -| `src/commands/auth-choice-options.ts` | Modify | Add `kilocode` group and option | -| `src/commands/auth-choice.preferred-provider.ts` | Modify | Add `kilocode-api-key` mapping | -| `src/commands/auth-choice.apply.api-providers.ts` | Modify | Add `kilocode-api-key` handling | -| `src/cli/program/register.onboard.ts` | Modify | Add `--kilocode-api-key` option | -| `src/commands/onboard-non-interactive/local/auth-choice.ts` | Modify | Add non-interactive handling | -| `src/commands/onboard-auth.ts` | Modify | Export new functions | -| `src/agents/pi-embedded-runner/cache-ttl.ts` | Modify | Add kilocode support | -| `src/agents/transcript-policy.ts` | Modify | Add kilocode Gemini handling | diff --git a/docs/docs.json b/docs/docs.json index 9d04ab81c5c..1d98a93c602 100644 --- a/docs/docs.json +++ b/docs/docs.json @@ -535,10 +535,6 @@ "source": "/onboarding", "destination": "/start/onboarding" }, - { - "source": "/onboarding-config-protocol", - "destination": "/experiments/onboarding-config-protocol" - }, { "source": "/pairing", "destination": "/channels/pairing" @@ -559,10 +555,6 @@ "source": "/presence", "destination": "/concepts/presence" }, - { - "source": "/proposals/model-config", - "destination": "/experiments/proposals/model-config" - }, { "source": "/provider-routing", "destination": "/channels/channel-routing" @@ -583,10 +575,6 @@ "source": "/remote-gateway-readme", "destination": "/gateway/remote-gateway-readme" }, - { - "source": "/research/memory", - "destination": "/experiments/research/memory" - }, { "source": "/rpc", "destination": "/reference/rpc" @@ -1358,21 +1346,6 @@ { "group": "Release policy", "pages": ["reference/RELEASING", "reference/test"] - }, - { - "group": "Experiments", - "pages": [ - "design/kilo-gateway-integration", - "experiments/onboarding-config-protocol", - "experiments/plans/acp-thread-bound-agents", - "experiments/plans/acp-unified-streaming-refactor", - "experiments/plans/browser-evaluate-cdp-refactor", - "experiments/plans/openresponses-gateway", - "experiments/plans/pty-process-supervision", - "experiments/plans/session-binding-channel-agnostic", - "experiments/research/memory", - "experiments/proposals/model-config" - ] } ] }, @@ -1938,17 +1911,6 @@ { "group": "发布策略", "pages": ["zh-CN/reference/RELEASING", "zh-CN/reference/test"] - }, - { - "group": "实验性功能", - "pages": [ - "zh-CN/experiments/onboarding-config-protocol", - "zh-CN/experiments/plans/openresponses-gateway", - "zh-CN/experiments/plans/cron-add-hardening", - "zh-CN/experiments/plans/group-policy-hardening", - "zh-CN/experiments/research/memory", - "zh-CN/experiments/proposals/model-config" - ] } ] }, diff --git a/docs/experiments/onboarding-config-protocol.md b/docs/experiments/onboarding-config-protocol.md deleted file mode 100644 index e3b9d9dff10..00000000000 --- a/docs/experiments/onboarding-config-protocol.md +++ /dev/null @@ -1,43 +0,0 @@ ---- -summary: "RPC protocol notes for setup wizard and config schema" -read_when: "Changing setup wizard steps or config schema endpoints" -title: "Onboarding and Config Protocol" ---- - -# Onboarding + Config Protocol - -Purpose: shared onboarding + config surfaces across CLI, macOS app, and Web UI. - -## Components - -- Wizard engine (shared session + prompts + onboarding state). -- CLI onboarding uses the same wizard flow as the UI clients. -- Gateway RPC exposes wizard + config schema endpoints. -- macOS onboarding uses the wizard step model. -- Web UI renders config forms from JSON Schema + UI hints. - -## Gateway RPC - -- `wizard.start` params: `{ mode?: "local"|"remote", workspace?: string }` -- `wizard.next` params: `{ sessionId, answer?: { stepId, value? } }` -- `wizard.cancel` params: `{ sessionId }` -- `wizard.status` params: `{ sessionId }` -- `config.schema` params: `{}` -- `config.schema.lookup` params: `{ path }` - - `path` accepts standard config segments plus slash-delimited plugin ids, for example `plugins.entries.pack/one.config`. - -Responses (shape) - -- Wizard: `{ sessionId, done, step?, status?, error? }` -- Config schema: `{ schema, uiHints, version, generatedAt }` -- Config schema lookup: `{ path, schema, hint?, hintPath?, children[] }` - -## UI Hints - -- `uiHints` keyed by path; optional metadata (label/help/group/order/advanced/sensitive/placeholder). -- Sensitive fields render as password inputs; no redaction layer. -- Unsupported schema nodes fall back to the raw JSON editor. - -## Notes - -- This doc is the single place to track protocol refactors for onboarding/config. diff --git a/docs/experiments/plans/acp-persistent-bindings-discord-channels-telegram-topics.md b/docs/experiments/plans/acp-persistent-bindings-discord-channels-telegram-topics.md deleted file mode 100644 index e85ddeaf4a7..00000000000 --- a/docs/experiments/plans/acp-persistent-bindings-discord-channels-telegram-topics.md +++ /dev/null @@ -1,375 +0,0 @@ -# ACP Persistent Bindings for Discord Channels and Telegram Topics - -Status: Draft - -## Summary - -Introduce persistent ACP bindings that map: - -- Discord channels (and existing threads, where needed), and -- Telegram forum topics in groups/supergroups (`chatId:topic:topicId`) - -to long-lived ACP sessions, with binding state stored in top-level `bindings[]` entries using explicit binding types. - -This makes ACP usage in high-traffic messaging channels predictable and durable, so users can create dedicated channels/topics such as `codex`, `claude-1`, or `claude-myrepo`. - -## Why - -Current thread-bound ACP behavior is optimized for ephemeral Discord thread workflows. Telegram does not have the same thread model; it has forum topics in groups/supergroups. Users want stable, always-on ACP “workspaces” in chat surfaces, not only temporary thread sessions. - -## Goals - -- Support durable ACP binding for: - - Discord channels/threads - - Telegram forum topics (groups/supergroups) -- Make binding source-of-truth config-driven. -- Keep `/acp`, `/new`, `/reset`, `/focus`, and delivery behavior consistent across Discord and Telegram. -- Preserve existing temporary binding flows for ad-hoc usage. - -## Non-Goals - -- Full redesign of ACP runtime/session internals. -- Removing existing ephemeral binding flows. -- Expanding to every channel in the first iteration. -- Implementing Telegram channel direct-messages topics (`direct_messages_topic_id`) in this phase. -- Implementing Telegram private-chat topic variants in this phase. - -## UX Direction - -### 1) Two binding types - -- **Persistent binding**: saved in config, reconciled on startup, intended for “named workspace” channels/topics. -- **Temporary binding**: runtime-only, expires by idle/max-age policy. - -### 2) Command behavior - -- `/acp spawn ... --thread here|auto|off` remains available. -- Add explicit bind lifecycle controls: - - `/acp bind [session|agent] [--persist]` - - `/acp unbind [--persist]` - - `/acp status` includes whether binding is `persistent` or `temporary`. -- In bound conversations, `/new` and `/reset` reset the bound ACP session in place and keep the binding attached. - -### 3) Conversation identity - -- Use canonical conversation IDs: - - Discord: channel/thread ID. - - Telegram topic: `chatId:topic:topicId`. -- Never key Telegram bindings by bare topic ID alone. - -## Config Model (Proposed) - -Unify routing and persistent ACP binding configuration in top-level `bindings[]` with explicit `type` discriminator: - -```jsonc -{ - "agents": { - "list": [ - { - "id": "main", - "default": true, - "workspace": "~/.openclaw/workspace-main", - "runtime": { "type": "embedded" }, - }, - { - "id": "codex", - "workspace": "~/.openclaw/workspace-codex", - "runtime": { - "type": "acp", - "acp": { - "agent": "codex", - "backend": "acpx", - "mode": "persistent", - "cwd": "/workspace/repo-a", - }, - }, - }, - { - "id": "claude", - "workspace": "~/.openclaw/workspace-claude", - "runtime": { - "type": "acp", - "acp": { - "agent": "claude", - "backend": "acpx", - "mode": "persistent", - "cwd": "/workspace/repo-b", - }, - }, - }, - ], - }, - "acp": { - "enabled": true, - "backend": "acpx", - "allowedAgents": ["codex", "claude"], - }, - "bindings": [ - // Route bindings (existing behavior) - { - "type": "route", - "agentId": "main", - "match": { "channel": "discord", "accountId": "default" }, - }, - { - "type": "route", - "agentId": "main", - "match": { "channel": "telegram", "accountId": "default" }, - }, - // Persistent ACP conversation bindings - { - "type": "acp", - "agentId": "codex", - "match": { - "channel": "discord", - "accountId": "default", - "peer": { "kind": "channel", "id": "222222222222222222" }, - }, - "acp": { - "label": "codex-main", - "mode": "persistent", - "cwd": "/workspace/repo-a", - "backend": "acpx", - }, - }, - { - "type": "acp", - "agentId": "claude", - "match": { - "channel": "discord", - "accountId": "default", - "peer": { "kind": "channel", "id": "333333333333333333" }, - }, - "acp": { - "label": "claude-repo-b", - "mode": "persistent", - "cwd": "/workspace/repo-b", - }, - }, - { - "type": "acp", - "agentId": "codex", - "match": { - "channel": "telegram", - "accountId": "default", - "peer": { "kind": "group", "id": "-1001234567890:topic:42" }, - }, - "acp": { - "label": "tg-codex-42", - "mode": "persistent", - }, - }, - ], - "channels": { - "discord": { - "guilds": { - "111111111111111111": { - "channels": { - "222222222222222222": { - "enabled": true, - "requireMention": false, - }, - "333333333333333333": { - "enabled": true, - "requireMention": false, - }, - }, - }, - }, - }, - "telegram": { - "groups": { - "-1001234567890": { - "topics": { - "42": { - "requireMention": false, - }, - }, - }, - }, - }, - }, -} -``` - -### Minimal Example (No Per-Binding ACP Overrides) - -```jsonc -{ - "agents": { - "list": [ - { "id": "main", "default": true, "runtime": { "type": "embedded" } }, - { - "id": "codex", - "runtime": { - "type": "acp", - "acp": { "agent": "codex", "backend": "acpx", "mode": "persistent" }, - }, - }, - { - "id": "claude", - "runtime": { - "type": "acp", - "acp": { "agent": "claude", "backend": "acpx", "mode": "persistent" }, - }, - }, - ], - }, - "acp": { "enabled": true, "backend": "acpx" }, - "bindings": [ - { - "type": "route", - "agentId": "main", - "match": { "channel": "discord", "accountId": "default" }, - }, - { - "type": "route", - "agentId": "main", - "match": { "channel": "telegram", "accountId": "default" }, - }, - - { - "type": "acp", - "agentId": "codex", - "match": { - "channel": "discord", - "accountId": "default", - "peer": { "kind": "channel", "id": "222222222222222222" }, - }, - }, - { - "type": "acp", - "agentId": "claude", - "match": { - "channel": "discord", - "accountId": "default", - "peer": { "kind": "channel", "id": "333333333333333333" }, - }, - }, - { - "type": "acp", - "agentId": "codex", - "match": { - "channel": "telegram", - "accountId": "default", - "peer": { "kind": "group", "id": "-1009876543210:topic:5" }, - }, - }, - ], -} -``` - -Notes: - -- `bindings[].type` is explicit: - - `route`: normal agent routing. - - `acp`: persistent ACP harness binding for a matched conversation. -- For `type: "acp"`, `match.peer.id` is the canonical conversation key: - - Discord channel/thread: raw channel/thread ID. - - Telegram topic: `chatId:topic:topicId`. -- `bindings[].acp.backend` is optional. Backend fallback order: - 1. `bindings[].acp.backend` - 2. `agents.list[].runtime.acp.backend` - 3. global `acp.backend` -- `mode`, `cwd`, and `label` follow the same override pattern (`binding override -> agent runtime default -> global/default behavior`). -- Keep existing `session.threadBindings.*` and `channels.discord.threadBindings.*` for temporary binding policies. -- Persistent entries declare desired state; runtime reconciles to actual ACP sessions/bindings. -- One active ACP binding per conversation node is the intended model. -- Backward compatibility: missing `type` is interpreted as `route` for legacy entries. - -### Backend Selection - -- ACP session initialization already uses configured backend selection during spawn (`acp.backend` today). -- This proposal extends spawn/reconcile logic to prefer typed ACP binding overrides: - - `bindings[].acp.backend` for conversation-local override. - - `agents.list[].runtime.acp.backend` for per-agent defaults. -- If no override exists, keep current behavior (`acp.backend` default). - -## Architecture Fit in Current System - -### Reuse existing components - -- `SessionBindingService` already supports channel-agnostic conversation references. -- ACP spawn/bind flows already support binding through service APIs. -- Telegram already carries topic/thread context via `MessageThreadId` and `chatId`. - -### New/extended components - -- **Telegram binding adapter** (parallel to Discord adapter): - - register adapter per Telegram account, - - resolve/list/bind/unbind/touch by canonical conversation ID. -- **Typed binding resolver/index**: - - split `bindings[]` into `route` and `acp` views, - - keep `resolveAgentRoute` on `route` bindings only, - - resolve persistent ACP intent from `acp` bindings only. -- **Inbound binding resolution for Telegram**: - - resolve bound session before route finalization (Discord already does this). -- **Persistent binding reconciler**: - - on startup: load configured top-level `type: "acp"` bindings, ensure ACP sessions exist, ensure bindings exist. - - on config change: apply deltas safely. -- **Cutover model**: - - no channel-local ACP binding fallback is read, - - persistent ACP bindings are sourced only from top-level `bindings[].type="acp"` entries. - -## Phased Delivery - -### Phase 1: Typed binding schema foundation - -- Extend config schema to support `bindings[].type` discriminator: - - `route`, - - `acp` with optional `acp` override object (`mode`, `backend`, `cwd`, `label`). -- Extend agent schema with runtime descriptor to mark ACP-native agents (`agents.list[].runtime.type`). -- Add parser/indexer split for route vs ACP bindings. - -### Phase 2: Runtime resolution + Discord/Telegram parity - -- Resolve persistent ACP bindings from top-level `type: "acp"` entries for: - - Discord channels/threads, - - Telegram forum topics (`chatId:topic:topicId` canonical IDs). -- Implement Telegram binding adapter and inbound bound-session override parity with Discord. -- Do not include Telegram direct/private topic variants in this phase. - -### Phase 3: Command parity and resets - -- Align `/acp`, `/new`, `/reset`, and `/focus` behavior in bound Telegram/Discord conversations. -- Ensure binding survives reset flows as configured. - -### Phase 4: Hardening - -- Better diagnostics (`/acp status`, startup reconciliation logs). -- Conflict handling and health checks. - -## Guardrails and Policy - -- Respect ACP enablement and sandbox restrictions exactly as today. -- Keep explicit account scoping (`accountId`) to avoid cross-account bleed. -- Fail closed on ambiguous routing. -- Keep mention/access policy behavior explicit per channel config. - -## Testing Plan - -- Unit: - - conversation ID normalization (especially Telegram topic IDs), - - reconciler create/update/delete paths, - - `/acp bind --persist` and unbind flows. -- Integration: - - inbound Telegram topic -> bound ACP session resolution, - - inbound Discord channel/thread -> persistent binding precedence. -- Regression: - - temporary bindings continue to work, - - unbound channels/topics keep current routing behavior. - -## Open Questions - -- Should `/acp spawn --thread auto` in Telegram topic default to `here`? -- Should persistent bindings always bypass mention-gating in bound conversations, or require explicit `requireMention=false`? -- Should `/focus` gain `--persist` as an alias for `/acp bind --persist`? - -## Rollout - -- Ship as opt-in per conversation (`bindings[].type="acp"` entry present). -- Start with Discord + Telegram only. -- Add docs with examples for: - - “one channel/topic per agent” - - “multiple channels/topics per same agent with different `cwd`” - - “team naming patterns (`codex-1`, `claude-repo-x`)". diff --git a/docs/experiments/plans/acp-thread-bound-agents.md b/docs/experiments/plans/acp-thread-bound-agents.md deleted file mode 100644 index a0637cedee5..00000000000 --- a/docs/experiments/plans/acp-thread-bound-agents.md +++ /dev/null @@ -1,800 +0,0 @@ ---- -summary: "Integrate ACP coding agents via a first-class ACP control plane in core and plugin-backed runtimes (acpx first)" -owner: "onutc" -status: "draft" -last_updated: "2026-02-25" -title: "ACP Thread Bound Agents" ---- - -# ACP Thread Bound Agents - -## Overview - -This plan defines how OpenClaw should support ACP coding agents in thread-capable channels (Discord first) with production-level lifecycle and recovery. - -Related document: - -- [Unified Runtime Streaming Refactor Plan](/experiments/plans/acp-unified-streaming-refactor) - -Target user experience: - -- a user spawns or focuses an ACP session into a thread -- user messages in that thread route to the bound ACP session -- agent output streams back to the same thread persona -- session can be persistent or one shot with explicit cleanup controls - -## Decision summary - -Long term recommendation is a hybrid architecture: - -- OpenClaw core owns ACP control plane concerns - - session identity and metadata - - thread binding and routing decisions - - delivery invariants and duplicate suppression - - lifecycle cleanup and recovery semantics -- ACP runtime backend is pluggable - - first backend is an acpx-backed plugin service - - runtime does ACP transport, queueing, cancel, reconnect - -OpenClaw should not reimplement ACP transport internals in core. -OpenClaw should not rely on a pure plugin-only interception path for routing. - -## North-star architecture (holy grail) - -Treat ACP as a first-class control plane in OpenClaw, with pluggable runtime adapters. - -Non-negotiable invariants: - -- every ACP thread binding references a valid ACP session record -- every ACP session has explicit lifecycle state (`creating`, `idle`, `running`, `cancelling`, `closed`, `error`) -- every ACP run has explicit run state (`queued`, `running`, `completed`, `failed`, `cancelled`) -- spawn, bind, and initial enqueue are atomic -- command retries are idempotent (no duplicate runs or duplicate Discord outputs) -- bound-thread channel output is a projection of ACP run events, never ad-hoc side effects - -Long-term ownership model: - -- `AcpSessionManager` is the single ACP writer and orchestrator -- manager lives in gateway process first; can be moved to a dedicated sidecar later behind the same interface -- per ACP session key, manager owns one in-memory actor (serialized command execution) -- adapters (`acpx`, future backends) are transport/runtime implementations only - -Long-term persistence model: - -- move ACP control-plane state to a dedicated SQLite store (WAL mode) under OpenClaw state dir -- keep `SessionEntry.acp` as compatibility projection during migration, not source-of-truth -- store ACP events append-only to support replay, crash recovery, and deterministic delivery - -### Delivery strategy (bridge to holy-grail) - -- short-term bridge - - keep current thread binding mechanics and existing ACP config surface - - fix metadata-gap bugs and route ACP turns through a single core ACP branch - - add idempotency keys and fail-closed routing checks immediately -- long-term cutover - - move ACP source-of-truth to control-plane DB + actors - - make bound-thread delivery purely event-projection based - - remove legacy fallback behavior that depends on opportunistic session-entry metadata - -## Why not pure plugin only - -Current plugin hooks are not sufficient for end to end ACP session routing without core changes. - -- inbound routing from thread binding resolves to a session key in core dispatch first -- message hooks are fire-and-forget and cannot short-circuit the main reply path -- plugin commands are good for control operations but not for replacing core per-turn dispatch flow - -Result: - -- ACP runtime can be pluginized -- ACP routing branch must exist in core - -## Existing foundation to reuse - -Already implemented and should remain canonical: - -- thread binding target supports `subagent` and `acp` -- inbound thread routing override resolves by binding before normal dispatch -- outbound thread identity via webhook in reply delivery -- `/focus` and `/unfocus` flow with ACP target compatibility -- persistent binding store with restore on startup -- unbind lifecycle on archive, delete, unfocus, reset, and delete - -This plan extends that foundation rather than replacing it. - -## Architecture - -### Boundary model - -Core (must be in OpenClaw core): - -- ACP session-mode dispatch branch in the reply pipeline -- delivery arbitration to avoid parent plus thread duplication -- ACP control-plane persistence (with `SessionEntry.acp` compatibility projection during migration) -- lifecycle unbind and runtime detach semantics tied to session reset/delete - -Plugin backend (acpx implementation): - -- ACP runtime worker supervision -- acpx process invocation and event parsing -- ACP command handlers (`/acp ...`) and operator UX -- backend-specific config defaults and diagnostics - -### Runtime ownership model - -- one gateway process owns ACP orchestration state -- ACP execution runs in supervised child processes via acpx backend -- process strategy is long lived per active ACP session key, not per message - -This avoids startup cost on every prompt and keeps cancel and reconnect semantics reliable. - -### Core runtime contract - -Add a core ACP runtime contract so routing code does not depend on CLI details and can switch backends without changing dispatch logic: - -```ts -export type AcpRuntimePromptMode = "prompt" | "steer"; - -export type AcpRuntimeHandle = { - sessionKey: string; - backend: string; - runtimeSessionName: string; -}; - -export type AcpRuntimeEvent = - | { type: "text_delta"; stream: "output" | "thought"; text: string } - | { type: "tool_call"; name: string; argumentsText: string } - | { type: "done"; usage?: Record } - | { type: "error"; code: string; message: string; retryable?: boolean }; - -export interface AcpRuntime { - ensureSession(input: { - sessionKey: string; - agent: string; - mode: "persistent" | "oneshot"; - cwd?: string; - env?: Record; - idempotencyKey: string; - }): Promise; - - submit(input: { - handle: AcpRuntimeHandle; - text: string; - mode: AcpRuntimePromptMode; - idempotencyKey: string; - }): Promise<{ runtimeRunId: string }>; - - stream(input: { - handle: AcpRuntimeHandle; - runtimeRunId: string; - onEvent: (event: AcpRuntimeEvent) => Promise | void; - signal?: AbortSignal; - }): Promise; - - cancel(input: { - handle: AcpRuntimeHandle; - runtimeRunId?: string; - reason?: string; - idempotencyKey: string; - }): Promise; - - close(input: { handle: AcpRuntimeHandle; reason: string; idempotencyKey: string }): Promise; - - health?(): Promise<{ ok: boolean; details?: string }>; -} -``` - -Implementation detail: - -- first backend: `AcpxRuntime` shipped as a plugin service -- core resolves runtime via registry and fails with explicit operator error when no ACP runtime backend is available - -### Control-plane data model and persistence - -Long-term source-of-truth is a dedicated ACP SQLite database (WAL mode), for transactional updates and crash-safe recovery: - -- `acp_sessions` - - `session_key` (pk), `backend`, `agent`, `mode`, `cwd`, `state`, `created_at`, `updated_at`, `last_error` -- `acp_runs` - - `run_id` (pk), `session_key` (fk), `state`, `requester_message_id`, `idempotency_key`, `started_at`, `ended_at`, `error_code`, `error_message` -- `acp_bindings` - - `binding_key` (pk), `thread_id`, `channel_id`, `account_id`, `session_key` (fk), `expires_at`, `bound_at` -- `acp_events` - - `event_id` (pk), `run_id` (fk), `seq`, `kind`, `payload_json`, `created_at` -- `acp_delivery_checkpoint` - - `run_id` (pk/fk), `last_event_seq`, `last_discord_message_id`, `updated_at` -- `acp_idempotency` - - `scope`, `idempotency_key`, `result_json`, `created_at`, unique `(scope, idempotency_key)` - -```ts -export type AcpSessionMeta = { - backend: string; - agent: string; - runtimeSessionName: string; - mode: "persistent" | "oneshot"; - cwd?: string; - state: "idle" | "running" | "error"; - lastActivityAt: number; - lastError?: string; -}; -``` - -Storage rules: - -- keep `SessionEntry.acp` as a compatibility projection during migration -- process ids and sockets stay in memory only -- durable lifecycle and run status live in ACP DB, not generic session JSON -- if runtime owner dies, gateway rehydrates from ACP DB and resumes from checkpoints - -### Routing and delivery - -Inbound: - -- keep current thread binding lookup as first routing step -- if bound target is ACP session, route to ACP runtime branch instead of `getReplyFromConfig` -- explicit `/acp steer` command uses `mode: "steer"` - -Outbound: - -- ACP event stream is normalized to OpenClaw reply chunks -- delivery target is resolved through existing bound destination path -- when a bound thread is active for that session turn, parent channel completion is suppressed - -Streaming policy: - -- stream partial output with coalescing window -- configurable min interval and max chunk bytes to stay under Discord rate limits -- final message always emitted on completion or failure - -### State machines and transaction boundaries - -Session state machine: - -- `creating -> idle -> running -> idle` -- `running -> cancelling -> idle | error` -- `idle -> closed` -- `error -> idle | closed` - -Run state machine: - -- `queued -> running -> completed` -- `running -> failed | cancelled` -- `queued -> cancelled` - -Required transaction boundaries: - -- spawn transaction - - create ACP session row - - create/update ACP thread binding row - - enqueue initial run row -- close transaction - - mark session closed - - delete/expire binding rows - - write final close event -- cancel transaction - - mark target run cancelling/cancelled with idempotency key - -No partial success is allowed across these boundaries. - -### Per-session actor model - -`AcpSessionManager` runs one actor per ACP session key: - -- actor mailbox serializes `submit`, `cancel`, `close`, and `stream` side effects -- actor owns runtime handle hydration and runtime adapter process lifecycle for that session -- actor writes run events in-order (`seq`) before any Discord delivery -- actor updates delivery checkpoints after successful outbound send - -This removes cross-turn races and prevents duplicate or out-of-order thread output. - -### Idempotency and delivery projection - -All external ACP actions must carry idempotency keys: - -- spawn idempotency key -- prompt/steer idempotency key -- cancel idempotency key -- close idempotency key - -Delivery rules: - -- Discord messages are derived from `acp_events` plus `acp_delivery_checkpoint` -- retries resume from checkpoint without re-sending already delivered chunks -- final reply emission is exactly-once per run from projection logic - -### Recovery and self-healing - -On gateway start: - -- load non-terminal ACP sessions (`creating`, `idle`, `running`, `cancelling`, `error`) -- recreate actors lazily on first inbound event or eagerly under configured cap -- reconcile any `running` runs missing heartbeats and mark `failed` or recover via adapter - -On inbound Discord thread message: - -- if binding exists but ACP session is missing, fail closed with explicit stale-binding message -- optionally auto-unbind stale binding after operator-safe validation -- never silently route stale ACP bindings to normal LLM path - -### Lifecycle and safety - -Supported operations: - -- cancel current run: `/acp cancel` -- unbind thread: `/unfocus` -- close ACP session: `/acp close` -- auto close idle sessions by effective TTL - -TTL policy: - -- effective TTL is minimum of - - global/session TTL - - Discord thread binding TTL - - ACP runtime owner TTL - -Safety controls: - -- allowlist ACP agents by name -- restrict workspace roots for ACP sessions -- env allowlist passthrough -- max concurrent ACP sessions per account and globally -- bounded restart backoff for runtime crashes - -## Config surface - -Core keys: - -- `acp.enabled` -- `acp.dispatch.enabled` (independent ACP routing kill switch) -- `acp.backend` (default `acpx`) -- `acp.defaultAgent` -- `acp.allowedAgents[]` -- `acp.maxConcurrentSessions` -- `acp.stream.coalesceIdleMs` -- `acp.stream.maxChunkChars` -- `acp.runtime.ttlMinutes` -- `acp.controlPlane.store` (`sqlite` default) -- `acp.controlPlane.storePath` -- `acp.controlPlane.recovery.eagerActors` -- `acp.controlPlane.recovery.reconcileRunningAfterMs` -- `acp.controlPlane.checkpoint.flushEveryEvents` -- `acp.controlPlane.checkpoint.flushEveryMs` -- `acp.idempotency.ttlHours` -- `channels.discord.threadBindings.spawnAcpSessions` - -Plugin/backend keys (acpx plugin section): - -- backend command/path overrides -- backend env allowlist -- backend per-agent presets -- backend startup/stop timeouts -- backend max inflight runs per session - -## Implementation specification - -### Control-plane modules (new) - -Add dedicated ACP control-plane modules in core: - -- `src/acp/control-plane/manager.ts` - - owns ACP actors, lifecycle transitions, command serialization -- `src/acp/control-plane/store.ts` - - SQLite schema management, transactions, query helpers -- `src/acp/control-plane/events.ts` - - typed ACP event definitions and serialization -- `src/acp/control-plane/checkpoint.ts` - - durable delivery checkpoints and replay cursors -- `src/acp/control-plane/idempotency.ts` - - idempotency key reservation and response replay -- `src/acp/control-plane/recovery.ts` - - boot-time reconciliation and actor rehydrate plan - -Compatibility bridge modules: - -- `src/acp/runtime/session-meta.ts` - - remains temporarily for projection into `SessionEntry.acp` - - must stop being source-of-truth after migration cutover - -### Required invariants (must enforce in code) - -- ACP session creation and thread bind are atomic (single transaction) -- there is at most one active run per ACP session actor at a time -- event `seq` is strictly increasing per run -- delivery checkpoint never advances past last committed event -- idempotency replay returns previous success payload for duplicate command keys -- stale/missing ACP metadata cannot route into normal non-ACP reply path - -### Core touchpoints - -Core files to change: - -- `src/auto-reply/reply/dispatch-from-config.ts` - - ACP branch calls `AcpSessionManager.submit` and event-projection delivery - - remove direct ACP fallback that bypasses control-plane invariants -- `src/auto-reply/reply/inbound-context.ts` (or nearest normalized context boundary) - - expose normalized routing keys and idempotency seeds for ACP control plane -- `src/config/sessions/types.ts` - - keep `SessionEntry.acp` as projection-only compatibility field -- `src/gateway/server-methods/sessions.ts` - - reset/delete/archive must call ACP manager close/unbind transaction path -- `src/infra/outbound/bound-delivery-router.ts` - - enforce fail-closed destination behavior for ACP bound session turns -- `src/discord/monitor/thread-bindings.ts` - - add ACP stale-binding validation helpers wired to control-plane lookups -- `src/auto-reply/reply/commands-acp.ts` - - route spawn/cancel/close/steer through ACP manager APIs -- `src/agents/acp-spawn.ts` - - stop ad-hoc metadata writes; call ACP manager spawn transaction -- `src/plugin-sdk/**` and plugin runtime bridge - - expose ACP backend registration and health semantics cleanly - -Core files explicitly not replaced: - -- `src/discord/monitor/message-handler.preflight.ts` - - keep thread binding override behavior as the canonical session-key resolver - -### ACP runtime registry API - -Add a core registry module: - -- `src/acp/runtime/registry.ts` - -Required API: - -```ts -export type AcpRuntimeBackend = { - id: string; - runtime: AcpRuntime; - healthy?: () => boolean; -}; - -export function registerAcpRuntimeBackend(backend: AcpRuntimeBackend): void; -export function unregisterAcpRuntimeBackend(id: string): void; -export function getAcpRuntimeBackend(id?: string): AcpRuntimeBackend | null; -export function requireAcpRuntimeBackend(id?: string): AcpRuntimeBackend; -``` - -Behavior: - -- `requireAcpRuntimeBackend` throws a typed ACP backend missing error when unavailable -- plugin service registers backend on `start` and unregisters on `stop` -- runtime lookups are read-only and process-local - -### acpx runtime plugin contract (implementation detail) - -For the first production backend (`extensions/acpx`), OpenClaw and acpx are -connected with a strict command contract: - -- backend id: `acpx` -- plugin service id: `acpx-runtime` -- runtime handle encoding: `runtimeSessionName = acpx:v1:` -- encoded payload fields: - - `name` (acpx named session; uses OpenClaw `sessionKey`) - - `agent` (acpx agent command) - - `cwd` (session workspace root) - - `mode` (`persistent | oneshot`) - -Command mapping: - -- ensure session: - - `acpx --format json --json-strict --cwd sessions ensure --name ` -- prompt turn: - - `acpx --format json --json-strict --cwd prompt --session --file -` -- cancel: - - `acpx --format json --json-strict --cwd cancel --session ` -- close: - - `acpx --format json --json-strict --cwd sessions close ` - -Streaming: - -- OpenClaw consumes ndjson events from `acpx --format json --json-strict` -- `text` => `text_delta/output` -- `thought` => `text_delta/thought` -- `tool_call` => `tool_call` -- `done` => `done` -- `error` => `error` - -### Session schema patch - -Patch `SessionEntry` in `src/config/sessions/types.ts`: - -```ts -type SessionAcpMeta = { - backend: string; - agent: string; - runtimeSessionName: string; - mode: "persistent" | "oneshot"; - cwd?: string; - state: "idle" | "running" | "error"; - lastActivityAt: number; - lastError?: string; -}; -``` - -Persisted field: - -- `SessionEntry.acp?: SessionAcpMeta` - -Migration rules: - -- phase A: dual-write (`acp` projection + ACP SQLite source-of-truth) -- phase B: read-primary from ACP SQLite, fallback-read from legacy `SessionEntry.acp` -- phase C: migration command backfills missing ACP rows from valid legacy entries -- phase D: remove fallback-read and keep projection optional for UX only -- legacy fields (`cliSessionIds`, `claudeCliSessionId`) remain untouched - -### Error contract - -Add stable ACP error codes and user-facing messages: - -- `ACP_BACKEND_MISSING` - - message: `ACP runtime backend is not configured. Install and enable the acpx runtime plugin.` -- `ACP_BACKEND_UNAVAILABLE` - - message: `ACP runtime backend is currently unavailable. Try again in a moment.` -- `ACP_SESSION_INIT_FAILED` - - message: `Could not initialize ACP session runtime.` -- `ACP_TURN_FAILED` - - message: `ACP turn failed before completion.` - -Rules: - -- return actionable user-safe message in-thread -- log detailed backend/system error only in runtime logs -- never silently fall back to normal LLM path when ACP routing was explicitly selected - -### Duplicate delivery arbitration - -Single routing rule for ACP bound turns: - -- if an active thread binding exists for the target ACP session and requester context, deliver only to that bound thread -- do not also send to parent channel for the same turn -- if bound destination selection is ambiguous, fail closed with explicit error (no implicit parent fallback) -- if no active binding exists, use normal session destination behavior - -### Observability and operational readiness - -Required metrics: - -- ACP spawn success/failure count by backend and error code -- ACP run latency percentiles (queue wait, runtime turn time, delivery projection time) -- ACP actor restart count and restart reason -- stale-binding detection count -- idempotency replay hit rate -- Discord delivery retry and rate-limit counters - -Required logs: - -- structured logs keyed by `sessionKey`, `runId`, `backend`, `threadId`, `idempotencyKey` -- explicit state transition logs for session and run state machines -- adapter command logs with redaction-safe arguments and exit summary - -Required diagnostics: - -- `/acp sessions` includes state, active run, last error, and binding status -- `/acp doctor` (or equivalent) validates backend registration, store health, and stale bindings - -### Config precedence and effective values - -ACP enablement precedence: - -- account override: `channels.discord.accounts..threadBindings.spawnAcpSessions` -- channel override: `channels.discord.threadBindings.spawnAcpSessions` -- global ACP gate: `acp.enabled` -- dispatch gate: `acp.dispatch.enabled` -- backend availability: registered backend for `acp.backend` - -Auto-enable behavior: - -- when ACP is configured (`acp.enabled=true`, `acp.dispatch.enabled=true`, or - `acp.backend=acpx`), plugin auto-enable marks `plugins.entries.acpx.enabled=true` - unless denylisted or explicitly disabled - -TTL effective value: - -- `min(session ttl, discord thread binding ttl, acp runtime ttl)` - -### Test map - -Unit tests: - -- `src/acp/runtime/registry.test.ts` (new) -- `src/auto-reply/reply/dispatch-from-config.acp.test.ts` (new) -- `src/infra/outbound/bound-delivery-router.test.ts` (extend ACP fail-closed cases) -- `src/config/sessions/types.test.ts` or nearest session-store tests (ACP metadata persistence) - -Integration tests: - -- `src/discord/monitor/reply-delivery.test.ts` (bound ACP delivery target behavior) -- `src/discord/monitor/message-handler.preflight*.test.ts` (bound ACP session-key routing continuity) -- acpx plugin runtime tests in backend package (service register/start/stop + event normalization) - -Gateway e2e tests: - -- `src/gateway/server.sessions.gateway-server-sessions-a.e2e.test.ts` (extend ACP reset/delete lifecycle coverage) -- ACP thread turn roundtrip e2e for spawn, message, stream, cancel, unfocus, restart recovery - -### Rollout guard - -Add independent ACP dispatch kill switch: - -- `acp.dispatch.enabled` default `false` for first release -- when disabled: - - ACP spawn/focus control commands may still bind sessions - - ACP dispatch path does not activate - - user receives explicit message that ACP dispatch is disabled by policy -- after canary validation, default can be flipped to `true` in a later release - -## Command and UX plan - -### New commands - -- `/acp spawn [--mode persistent|oneshot] [--thread auto|here|off]` -- `/acp cancel [session]` -- `/acp steer ` -- `/acp close [session]` -- `/acp sessions` - -### Existing command compatibility - -- `/focus ` continues to support ACP targets -- `/unfocus` keeps current semantics -- `/session idle` and `/session max-age` replace the old TTL override - -## Phased rollout - -### Phase 0 ADR and schema freeze - -- ship ADR for ACP control-plane ownership and adapter boundaries -- freeze DB schema (`acp_sessions`, `acp_runs`, `acp_bindings`, `acp_events`, `acp_delivery_checkpoint`, `acp_idempotency`) -- define stable ACP error codes, event contract, and state-transition guards - -### Phase 1 Control-plane foundation in core - -- implement `AcpSessionManager` and per-session actor runtime -- implement ACP SQLite store and transaction helpers -- implement idempotency store and replay helpers -- implement event append + delivery checkpoint modules -- wire spawn/cancel/close APIs to manager with transactional guarantees - -### Phase 2 Core routing and lifecycle integration - -- route thread-bound ACP turns from dispatch pipeline into ACP manager -- enforce fail-closed routing when ACP binding/session invariants fail -- integrate reset/delete/archive/unfocus lifecycle with ACP close/unbind transactions -- add stale-binding detection and optional auto-unbind policy - -### Phase 3 acpx backend adapter/plugin - -- implement `acpx` adapter against runtime contract (`ensureSession`, `submit`, `stream`, `cancel`, `close`) -- add backend health checks and startup/teardown registration -- normalize acpx ndjson events into ACP runtime events -- enforce backend timeouts, process supervision, and restart/backoff policy - -### Phase 4 Delivery projection and channel UX (Discord first) - -- implement event-driven channel projection with checkpoint resume (Discord first) -- coalesce streaming chunks with rate-limit aware flush policy -- guarantee exactly-once final completion message per run -- ship `/acp spawn`, `/acp cancel`, `/acp steer`, `/acp close`, `/acp sessions` - -### Phase 5 Migration and cutover - -- introduce dual-write to `SessionEntry.acp` projection plus ACP SQLite source-of-truth -- add migration utility for legacy ACP metadata rows -- flip read path to ACP SQLite primary -- remove legacy fallback routing that depends on missing `SessionEntry.acp` - -### Phase 6 Hardening, SLOs, and scale limits - -- enforce concurrency limits (global/account/session), queue policies, and timeout budgets -- add full telemetry, dashboards, and alert thresholds -- chaos-test crash recovery and duplicate-delivery suppression -- publish runbook for backend outage, DB corruption, and stale-binding remediation - -### Full implementation checklist - -- core control-plane modules and tests -- DB migrations and rollback plan -- ACP manager API integration across dispatch and commands -- adapter registration interface in plugin runtime bridge -- acpx adapter implementation and tests -- thread-capable channel delivery projection logic with checkpoint replay (Discord first) -- lifecycle hooks for reset/delete/archive/unfocus -- stale-binding detector and operator-facing diagnostics -- config validation and precedence tests for all new ACP keys -- operational docs and troubleshooting runbook - -## Test plan - -Unit tests: - -- ACP DB transaction boundaries (spawn/bind/enqueue atomicity, cancel, close) -- ACP state-machine transition guards for sessions and runs -- idempotency reservation/replay semantics across all ACP commands -- per-session actor serialization and queue ordering -- acpx event parser and chunk coalescer -- runtime supervisor restart and backoff policy -- config precedence and effective TTL calculation -- core ACP routing branch selection and fail-closed behavior when backend/session is invalid - -Integration tests: - -- fake ACP adapter process for deterministic streaming and cancel behavior -- ACP manager + dispatch integration with transactional persistence -- thread-bound inbound routing to ACP session key -- thread-bound outbound delivery suppresses parent channel duplication -- checkpoint replay recovers after delivery failure and resumes from last event -- plugin service registration and teardown of ACP runtime backend - -Gateway e2e tests: - -- spawn ACP with thread, exchange multi-turn prompts, unfocus -- gateway restart with persisted ACP DB and bindings, then continue same session -- concurrent ACP sessions in multiple threads have no cross-talk -- duplicate command retries (same idempotency key) do not create duplicate runs or replies -- stale-binding scenario yields explicit error and optional auto-clean behavior - -## Risks and mitigations - -- Duplicate deliveries during transition - - Mitigation: single destination resolver and idempotent event checkpoint -- Runtime process churn under load - - Mitigation: long lived per session owners + concurrency caps + backoff -- Plugin absent or misconfigured - - Mitigation: explicit operator-facing error and fail-closed ACP routing (no implicit fallback to normal session path) -- Config confusion between subagent and ACP gates - - Mitigation: explicit ACP keys and command feedback that includes effective policy source -- Control-plane store corruption or migration bugs - - Mitigation: WAL mode, backup/restore hooks, migration smoke tests, and read-only fallback diagnostics -- Actor deadlocks or mailbox starvation - - Mitigation: watchdog timers, actor health probes, and bounded mailbox depth with rejection telemetry - -## Acceptance checklist - -- ACP session spawn can create or bind a thread in a supported channel adapter (currently Discord) -- all thread messages route to bound ACP session only -- ACP outputs appear in the same thread identity with streaming or batches -- no duplicate output in parent channel for bound turns -- spawn+bind+initial enqueue are atomic in persistent store -- ACP command retries are idempotent and do not duplicate runs or outputs -- cancel, close, unfocus, archive, reset, and delete perform deterministic cleanup -- crash restart preserves mapping and resumes multi turn continuity -- concurrent thread bound ACP sessions work independently -- ACP backend missing state produces clear actionable error -- stale bindings are detected and surfaced explicitly (with optional safe auto-clean) -- control-plane metrics and diagnostics are available for operators -- new unit, integration, and e2e coverage passes - -## Addendum: targeted refactors for current implementation (status) - -These are non-blocking follow-ups to keep the ACP path maintainable after the current feature set lands. - -### 1) Centralize ACP dispatch policy evaluation (completed) - -- implemented via shared ACP policy helpers in `src/acp/policy.ts` -- dispatch, ACP command lifecycle handlers, and ACP spawn path now consume shared policy logic - -### 2) Split ACP command handler by subcommand domain (completed) - -- `src/auto-reply/reply/commands-acp.ts` is now a thin router -- subcommand behavior is split into: - - `src/auto-reply/reply/commands-acp/lifecycle.ts` - - `src/auto-reply/reply/commands-acp/runtime-options.ts` - - `src/auto-reply/reply/commands-acp/diagnostics.ts` - - shared helpers in `src/auto-reply/reply/commands-acp/shared.ts` - -### 3) Split ACP session manager by responsibility (completed) - -- manager is split into: - - `src/acp/control-plane/manager.ts` (public facade + singleton) - - `src/acp/control-plane/manager.core.ts` (manager implementation) - - `src/acp/control-plane/manager.types.ts` (manager types/deps) - - `src/acp/control-plane/manager.utils.ts` (normalization + helper functions) - -### 4) Optional acpx runtime adapter cleanup - -- `extensions/acpx/src/runtime.ts` can be split into: -- process execution/supervision -- ndjson event parsing/normalization -- runtime API surface (`submit`, `cancel`, `close`, etc.) -- improves testability and makes backend behavior easier to audit diff --git a/docs/experiments/plans/acp-unified-streaming-refactor.md b/docs/experiments/plans/acp-unified-streaming-refactor.md deleted file mode 100644 index 3834fb9f8d8..00000000000 --- a/docs/experiments/plans/acp-unified-streaming-refactor.md +++ /dev/null @@ -1,96 +0,0 @@ ---- -summary: "Holy grail refactor plan for one unified runtime streaming pipeline across main, subagent, and ACP" -owner: "onutc" -status: "draft" -last_updated: "2026-02-25" -title: "Unified Runtime Streaming Refactor Plan" ---- - -# Unified Runtime Streaming Refactor Plan - -## Objective - -Deliver one shared streaming pipeline for `main`, `subagent`, and `acp` so all runtimes get identical coalescing, chunking, delivery ordering, and crash recovery behavior. - -## Why this exists - -- Current behavior is split across multiple runtime-specific shaping paths. -- Formatting/coalescing bugs can be fixed in one path but remain in others. -- Delivery consistency, duplicate suppression, and recovery semantics are harder to reason about. - -## Target architecture - -Single pipeline, runtime-specific adapters: - -1. Runtime adapters emit canonical events only. -2. Shared stream assembler coalesces and finalizes text/tool/status events. -3. Shared channel projector applies channel-specific chunking/formatting once. -4. Shared delivery ledger enforces idempotent send/replay semantics. -5. Outbound channel adapter executes sends and records delivery checkpoints. - -Canonical event contract: - -- `turn_started` -- `text_delta` -- `block_final` -- `tool_started` -- `tool_finished` -- `status` -- `turn_completed` -- `turn_failed` -- `turn_cancelled` - -## Workstreams - -### 1) Canonical streaming contract - -- Define strict event schema + validation in core. -- Add adapter contract tests to guarantee each runtime emits compatible events. -- Reject malformed runtime events early and surface structured diagnostics. - -### 2) Shared stream processor - -- Replace runtime-specific coalescer/projector logic with one processor. -- Processor owns text delta buffering, idle flush, max-chunk splitting, and completion flush. -- Move ACP/main/subagent config resolution into one helper to prevent drift. - -### 3) Shared channel projection - -- Keep channel adapters dumb: accept finalized blocks and send. -- Move Discord-specific chunking quirks to channel projector only. -- Keep pipeline channel-agnostic before projection. - -### 4) Delivery ledger + replay - -- Add per-turn/per-chunk delivery IDs. -- Record checkpoints before and after physical send. -- On restart, replay pending chunks idempotently and avoid duplicates. - -### 5) Migration and cutover - -- Phase 1: shadow mode (new pipeline computes output but old path sends; compare). -- Phase 2: runtime-by-runtime cutover (`acp`, then `subagent`, then `main` or reverse by risk). -- Phase 3: delete legacy runtime-specific streaming code. - -## Non-goals - -- No changes to ACP policy/permissions model in this refactor. -- No channel-specific feature expansion outside projection compatibility fixes. -- No transport/backend redesign (acpx plugin contract remains as-is unless needed for event parity). - -## Risks and mitigations - -- Risk: behavioral regressions in existing main/subagent paths. - Mitigation: shadow mode diffing + adapter contract tests + channel e2e tests. -- Risk: duplicate sends during crash recovery. - Mitigation: durable delivery IDs + idempotent replay in delivery adapter. -- Risk: runtime adapters diverge again. - Mitigation: required shared contract test suite for all adapters. - -## Acceptance criteria - -- All runtimes pass shared streaming contract tests. -- Discord ACP/main/subagent produce equivalent spacing/chunking behavior for tiny deltas. -- Crash/restart replay sends no duplicate chunk for the same delivery ID. -- Legacy ACP projector/coalescer path is removed. -- Streaming config resolution is shared and runtime-independent. diff --git a/docs/experiments/plans/browser-evaluate-cdp-refactor.md b/docs/experiments/plans/browser-evaluate-cdp-refactor.md deleted file mode 100644 index 5832c8a65e6..00000000000 --- a/docs/experiments/plans/browser-evaluate-cdp-refactor.md +++ /dev/null @@ -1,232 +0,0 @@ ---- -summary: "Plan: isolate browser act:evaluate from Playwright queue using CDP, with end-to-end deadlines and safer ref resolution" -read_when: - - Working on browser `act:evaluate` timeout, abort, or queue blocking issues - - Planning CDP based isolation for evaluate execution -owner: "openclaw" -status: "draft" -last_updated: "2026-02-10" -title: "Browser Evaluate CDP Refactor" ---- - -# Browser Evaluate CDP Refactor Plan - -## Context - -`act:evaluate` executes user provided JavaScript in the page. Today it runs via Playwright -(`page.evaluate` or `locator.evaluate`). Playwright serializes CDP commands per page, so a -stuck or long running evaluate can block the page command queue and make every later action -on that tab look "stuck". - -PR #13498 adds a pragmatic safety net (bounded evaluate, abort propagation, and best-effort -recovery). This document describes a larger refactor that makes `act:evaluate` inherently -isolated from Playwright so a stuck evaluate cannot wedge normal Playwright operations. - -## Goals - -- `act:evaluate` cannot permanently block later browser actions on the same tab. -- Timeouts are single source of truth end to end so a caller can rely on a budget. -- Abort and timeout are treated the same way across HTTP and in-process dispatch. -- Element targeting for evaluate is supported without switching everything off Playwright. -- Maintain backward compatibility for existing callers and payloads. - -## Non-goals - -- Replace all browser actions (click, type, wait, etc.) with CDP implementations. -- Remove the existing safety net introduced in PR #13498 (it remains a useful fallback). -- Introduce new unsafe capabilities beyond the existing `browser.evaluateEnabled` gate. -- Add process isolation (worker process/thread) for evaluate. If we still see hard to recover - stuck states after this refactor, that is a follow-up idea. - -## Current Architecture (Why It Gets Stuck) - -At a high level: - -- Callers send `act:evaluate` to the browser control service. -- The route handler calls into Playwright to execute the JavaScript. -- Playwright serializes page commands, so an evaluate that never finishes blocks the queue. -- A stuck queue means later click/type/wait operations on the tab can appear to hang. - -## Proposed Architecture - -### 1. Deadline Propagation - -Introduce a single budget concept and derive everything from it: - -- Caller sets `timeoutMs` (or a deadline in the future). -- The outer request timeout, route handler logic, and the execution budget inside the page - all use the same budget, with small headroom where needed for serialization overhead. -- Abort is propagated as an `AbortSignal` everywhere so cancellation is consistent. - -Implementation direction: - -- Add a small helper (for example `createBudget({ timeoutMs, signal })`) that returns: - - `signal`: the linked AbortSignal - - `deadlineAtMs`: absolute deadline - - `remainingMs()`: remaining budget for child operations -- Use this helper in: - - `src/browser/client-fetch.ts` (HTTP and in-process dispatch) - - `src/node-host/runner.ts` (proxy path) - - browser action implementations (Playwright and CDP) - -### 2. Separate Evaluate Engine (CDP Path) - -Add a CDP based evaluate implementation that does not share Playwright's per page command -queue. The key property is that the evaluate transport is a separate WebSocket connection -and a separate CDP session attached to the target. - -Implementation direction: - -- New module, for example `src/browser/cdp-evaluate.ts`, that: - - Connects to the configured CDP endpoint (browser level socket). - - Uses `Target.attachToTarget({ targetId, flatten: true })` to get a `sessionId`. - - Runs either: - - `Runtime.evaluate` for page level evaluate, or - - `DOM.resolveNode` plus `Runtime.callFunctionOn` for element evaluate. - - On timeout or abort: - - Sends `Runtime.terminateExecution` best-effort for the session. - - Closes the WebSocket and returns a clear error. - -Notes: - -- This still executes JavaScript in the page, so termination can have side effects. The win - is that it does not wedge the Playwright queue, and it is cancelable at the transport - layer by killing the CDP session. - -### 3. Ref Story (Element Targeting Without A Full Rewrite) - -The hard part is element targeting. CDP needs a DOM handle or `backendDOMNodeId`, while -today most browser actions use Playwright locators based on refs from snapshots. - -Recommended approach: keep existing refs, but attach an optional CDP resolvable id. - -#### 3.1 Extend Stored Ref Info - -Extend the stored role ref metadata to optionally include a CDP id: - -- Today: `{ role, name, nth }` -- Proposed: `{ role, name, nth, backendDOMNodeId?: number }` - -This keeps all existing Playwright based actions working and allows CDP evaluate to accept -the same `ref` value when the `backendDOMNodeId` is available. - -#### 3.2 Populate backendDOMNodeId At Snapshot Time - -When producing a role snapshot: - -1. Generate the existing role ref map as today (role, name, nth). -2. Fetch the AX tree via CDP (`Accessibility.getFullAXTree`) and compute a parallel map of - `(role, name, nth) -> backendDOMNodeId` using the same duplicate handling rules. -3. Merge the id back into the stored ref info for the current tab. - -If mapping fails for a ref, leave `backendDOMNodeId` undefined. This makes the feature -best-effort and safe to roll out. - -#### 3.3 Evaluate Behavior With Ref - -In `act:evaluate`: - -- If `ref` is present and has `backendDOMNodeId`, run element evaluate via CDP. -- If `ref` is present but has no `backendDOMNodeId`, fall back to the Playwright path (with - the safety net). - -Optional escape hatch: - -- Extend the request shape to accept `backendDOMNodeId` directly for advanced callers (and - for debugging), while keeping `ref` as the primary interface. - -### 4. Keep A Last Resort Recovery Path - -Even with CDP evaluate, there are other ways to wedge a tab or a connection. Keep the -existing recovery mechanisms (terminate execution + disconnect Playwright) as a last resort -for: - -- legacy callers -- environments where CDP attach is blocked -- unexpected Playwright edge cases - -## Implementation Plan (Single Iteration) - -### Deliverables - -- A CDP based evaluate engine that runs outside the Playwright per-page command queue. -- A single end-to-end timeout/abort budget used consistently by callers and handlers. -- Ref metadata that can optionally carry `backendDOMNodeId` for element evaluate. -- `act:evaluate` prefers the CDP engine when possible and falls back to Playwright when not. -- Tests that prove a stuck evaluate does not wedge later actions. -- Logs/metrics that make failures and fallbacks visible. - -### Implementation Checklist - -1. Add a shared "budget" helper to link `timeoutMs` + upstream `AbortSignal` into: - - a single `AbortSignal` - - an absolute deadline - - a `remainingMs()` helper for downstream operations -2. Update all caller paths to use that helper so `timeoutMs` means the same thing everywhere: - - `src/browser/client-fetch.ts` (HTTP and in-process dispatch) - - `src/node-host/runner.ts` (node proxy path) - - CLI wrappers that call `/act` (add `--timeout-ms` to `browser evaluate`) -3. Implement `src/browser/cdp-evaluate.ts`: - - connect to the browser-level CDP socket - - `Target.attachToTarget` to get a `sessionId` - - run `Runtime.evaluate` for page evaluate - - run `DOM.resolveNode` + `Runtime.callFunctionOn` for element evaluate - - on timeout/abort: best-effort `Runtime.terminateExecution` then close the socket -4. Extend stored role ref metadata to optionally include `backendDOMNodeId`: - - keep existing `{ role, name, nth }` behavior for Playwright actions - - add `backendDOMNodeId?: number` for CDP element targeting -5. Populate `backendDOMNodeId` during snapshot creation (best-effort): - - fetch AX tree via CDP (`Accessibility.getFullAXTree`) - - compute `(role, name, nth) -> backendDOMNodeId` and merge into the stored ref map - - if mapping is ambiguous or missing, leave the id undefined -6. Update `act:evaluate` routing: - - if no `ref`: always use CDP evaluate - - if `ref` resolves to a `backendDOMNodeId`: use CDP element evaluate - - otherwise: fall back to Playwright evaluate (still bounded and abortable) -7. Keep the existing "last resort" recovery path as a fallback, not the default path. -8. Add tests: - - stuck evaluate times out within budget and the next click/type succeeds - - abort cancels evaluate (client disconnect or timeout) and unblocks subsequent actions - - mapping failures cleanly fall back to Playwright -9. Add observability: - - evaluate duration and timeout counters - - terminateExecution usage - - fallback rate (CDP -> Playwright) and reasons - -### Acceptance Criteria - -- A deliberately hung `act:evaluate` returns within the caller budget and does not wedge the - tab for later actions. -- `timeoutMs` behaves consistently across CLI, agent tool, node proxy, and in-process calls. -- If `ref` can be mapped to `backendDOMNodeId`, element evaluate uses CDP; otherwise the - fallback path is still bounded and recoverable. - -## Testing Plan - -- Unit tests: - - `(role, name, nth)` matching logic between role refs and AX tree nodes. - - Budget helper behavior (headroom, remaining time math). -- Integration tests: - - CDP evaluate timeout returns within budget and does not block the next action. - - Abort cancels evaluate and triggers termination best-effort. -- Contract tests: - - Ensure `BrowserActRequest` and `BrowserActResponse` remain compatible. - -## Risks And Mitigations - -- Mapping is imperfect: - - Mitigation: best-effort mapping, fallback to Playwright evaluate, and add debug tooling. -- `Runtime.terminateExecution` has side effects: - - Mitigation: only use on timeout/abort and document the behavior in errors. -- Extra overhead: - - Mitigation: only fetch AX tree when snapshots are requested, cache per target, and keep - CDP session short lived. -- Extension relay limitations: - - Mitigation: use browser level attach APIs when per page sockets are not available, and - keep the current Playwright path as fallback. - -## Open Questions - -- Should the new engine be configurable as `playwright`, `cdp`, or `auto`? -- Do we want to expose a new "nodeRef" format for advanced users, or keep `ref` only? -- How should frame snapshots and selector scoped snapshots participate in AX mapping? diff --git a/docs/experiments/plans/discord-async-inbound-worker.md b/docs/experiments/plans/discord-async-inbound-worker.md deleted file mode 100644 index 70397b51338..00000000000 --- a/docs/experiments/plans/discord-async-inbound-worker.md +++ /dev/null @@ -1,337 +0,0 @@ ---- -summary: "Status and next steps for decoupling Discord gateway listeners from long-running agent turns with a Discord-specific inbound worker" -owner: "openclaw" -status: "in_progress" -last_updated: "2026-03-05" -title: "Discord Async Inbound Worker Plan" ---- - -# Discord Async Inbound Worker Plan - -## Objective - -Remove Discord listener timeout as a user-facing failure mode by making inbound Discord turns asynchronous: - -1. Gateway listener accepts and normalizes inbound events quickly. -2. A Discord run queue stores serialized jobs keyed by the same ordering boundary we use today. -3. A worker executes the actual agent turn outside the Carbon listener lifetime. -4. Replies are delivered back to the originating channel or thread after the run completes. - -This is the long-term fix for queued Discord runs timing out at `channels.discord.eventQueue.listenerTimeout` while the agent run itself is still making progress. - -## Current status - -This plan is partially implemented. - -Already done: - -- Discord listener timeout and Discord run timeout are now separate settings. -- Accepted inbound Discord turns are enqueued into `src/discord/monitor/inbound-worker.ts`. -- The worker now owns the long-running turn instead of the Carbon listener. -- Existing per-route ordering is preserved by queue key. -- Timeout regression coverage exists for the Discord worker path. - -What this means in plain language: - -- the production timeout bug is fixed -- the long-running turn no longer dies just because the Discord listener budget expires -- the worker architecture is not finished yet - -What is still missing: - -- `DiscordInboundJob` is still only partially normalized and still carries live runtime references -- command semantics (`stop`, `new`, `reset`, future session controls) are not yet fully worker-native -- worker observability and operator status are still minimal -- there is still no restart durability - -## Why this exists - -Current behavior ties the full agent turn to the listener lifetime: - -- `src/discord/monitor/listeners.ts` applies the timeout and abort boundary. -- `src/discord/monitor/message-handler.ts` keeps the queued run inside that boundary. -- `src/discord/monitor/message-handler.process.ts` performs media loading, routing, dispatch, typing, draft streaming, and final reply delivery inline. - -That architecture has two bad properties: - -- long but healthy turns can be aborted by the listener watchdog -- users can see no reply even when the downstream runtime would have produced one - -Raising the timeout helps but does not change the failure mode. - -## Non-goals - -- Do not redesign non-Discord channels in this pass. -- Do not broaden this into a generic all-channel worker framework in the first implementation. -- Do not extract a shared cross-channel inbound worker abstraction yet; only share low-level primitives when duplication is obvious. -- Do not add durable crash recovery in the first pass unless needed to land safely. -- Do not change route selection, binding semantics, or ACP policy in this plan. - -## Current constraints - -The current Discord processing path still depends on some live runtime objects that should not stay inside the long-term job payload: - -- Carbon `Client` -- raw Discord event shapes -- in-memory guild history map -- thread binding manager callbacks -- live typing and draft stream state - -We already moved execution onto a worker queue, but the normalization boundary is still incomplete. Right now the worker is "run later in the same process with some of the same live objects," not a fully data-only job boundary. - -## Target architecture - -### 1. Listener stage - -`DiscordMessageListener` remains the ingress point, but its job becomes: - -- run preflight and policy checks -- normalize accepted input into a serializable `DiscordInboundJob` -- enqueue the job into a per-session or per-channel async queue -- return immediately to Carbon once the enqueue succeeds - -The listener should no longer own the end-to-end LLM turn lifetime. - -### 2. Normalized job payload - -Introduce a serializable job descriptor that contains only the data needed to run the turn later. - -Minimum shape: - -- route identity - - `agentId` - - `sessionKey` - - `accountId` - - `channel` -- delivery identity - - destination channel id - - reply target message id - - thread id if present -- sender identity - - sender id, label, username, tag -- channel context - - guild id - - channel name or slug - - thread metadata - - resolved system prompt override -- normalized message body - - base text - - effective message text - - attachment descriptors or resolved media references -- gating decisions - - mention requirement outcome - - command authorization outcome - - bound session or agent metadata if applicable - -The job payload must not contain live Carbon objects or mutable closures. - -Current implementation status: - -- partially done -- `src/discord/monitor/inbound-job.ts` exists and defines the worker handoff -- the payload still contains live Discord runtime context and should be reduced further - -### 3. Worker stage - -Add a Discord-specific worker runner responsible for: - -- reconstructing the turn context from `DiscordInboundJob` -- loading media and any additional channel metadata needed for the run -- dispatching the agent turn -- delivering final reply payloads -- updating status and diagnostics - -Recommended location: - -- `src/discord/monitor/inbound-worker.ts` -- `src/discord/monitor/inbound-job.ts` - -### 4. Ordering model - -Ordering must remain equivalent to today for a given route boundary. - -Recommended key: - -- use the same queue key logic as `resolveDiscordRunQueueKey(...)` - -This preserves existing behavior: - -- one bound agent conversation does not interleave with itself -- different Discord channels can still progress independently - -### 5. Timeout model - -After cutover, there are two separate timeout classes: - -- listener timeout - - only covers normalization and enqueue - - should be short -- run timeout - - optional, worker-owned, explicit, and user-visible - - should not be inherited accidentally from Carbon listener settings - -This removes the current accidental coupling between "Discord gateway listener stayed alive" and "agent run is healthy." - -## Recommended implementation phases - -### Phase 1: normalization boundary - -- Status: partially implemented -- Done: - - extracted `buildDiscordInboundJob(...)` - - added worker handoff tests -- Remaining: - - make `DiscordInboundJob` plain data only - - move live runtime dependencies to worker-owned services instead of per-job payload - - stop rebuilding process context by stitching live listener refs back into the job - -### Phase 2: in-memory worker queue - -- Status: implemented -- Done: - - added `DiscordInboundWorkerQueue` keyed by resolved run queue key - - listener enqueues jobs instead of directly awaiting `processDiscordMessage(...)` - - worker executes jobs in-process, in memory only - -This is the first functional cutover. - -### Phase 3: process split - -- Status: not started -- Move delivery, typing, and draft streaming ownership behind worker-facing adapters. -- Replace direct use of live preflight context with worker context reconstruction. -- Keep `processDiscordMessage(...)` temporarily as a facade if needed, then split it. - -### Phase 4: command semantics - -- Status: not started - Make sure native Discord commands still behave correctly when work is queued: - -- `stop` -- `new` -- `reset` -- any future session-control commands - -The worker queue must expose enough run state for commands to target the active or queued turn. - -### Phase 5: observability and operator UX - -- Status: not started -- emit queue depth and active worker counts into monitor status -- record enqueue time, start time, finish time, and timeout or cancellation reason -- surface worker-owned timeout or delivery failures clearly in logs - -### Phase 6: optional durability follow-up - -- Status: not started - Only after the in-memory version is stable: - -- decide whether queued Discord jobs should survive gateway restart -- if yes, persist job descriptors and delivery checkpoints -- if no, document the explicit in-memory boundary - -This should be a separate follow-up unless restart recovery is required to land. - -## File impact - -Current primary files: - -- `src/discord/monitor/listeners.ts` -- `src/discord/monitor/message-handler.ts` -- `src/discord/monitor/message-handler.preflight.ts` -- `src/discord/monitor/message-handler.process.ts` -- `src/discord/monitor/status.ts` - -Current worker files: - -- `src/discord/monitor/inbound-job.ts` -- `src/discord/monitor/inbound-worker.ts` -- `src/discord/monitor/inbound-job.test.ts` -- `src/discord/monitor/message-handler.queue.test.ts` - -Likely next touch points: - -- `src/auto-reply/dispatch.ts` -- `src/discord/monitor/reply-delivery.ts` -- `src/discord/monitor/thread-bindings.ts` -- `src/discord/monitor/native-command.ts` - -## Next step now - -The next step is to make the worker boundary real instead of partial. - -Do this next: - -1. Move live runtime dependencies out of `DiscordInboundJob` -2. Keep those dependencies on the Discord worker instance instead -3. Reduce queued jobs to plain Discord-specific data: - - route identity - - delivery target - - sender info - - normalized message snapshot - - gating and binding decisions -4. Reconstruct worker execution context from that plain data inside the worker - -In practice, that means: - -- `client` -- `threadBindings` -- `guildHistories` -- `discordRestFetch` -- other mutable runtime-only handles - -should stop living on each queued job and instead live on the worker itself or behind worker-owned adapters. - -After that lands, the next follow-up should be command-state cleanup for `stop`, `new`, and `reset`. - -## Testing plan - -Keep the existing timeout repro coverage in: - -- `src/discord/monitor/message-handler.queue.test.ts` - -Add new tests for: - -1. listener returns after enqueue without awaiting full turn -2. per-route ordering is preserved -3. different channels still run concurrently -4. replies are delivered to the original message destination -5. `stop` cancels the active worker-owned run -6. worker failure produces visible diagnostics without blocking later jobs -7. ACP-bound Discord channels still route correctly under worker execution - -## Risks and mitigations - -- Risk: command semantics drift from current synchronous behavior - Mitigation: land command-state plumbing in the same cutover, not later - -- Risk: reply delivery loses thread or reply-to context - Mitigation: make delivery identity first-class in `DiscordInboundJob` - -- Risk: duplicate sends during retries or queue restarts - Mitigation: keep first pass in-memory only, or add explicit delivery idempotency before persistence - -- Risk: `message-handler.process.ts` becomes harder to reason about during migration - Mitigation: split into normalization, execution, and delivery helpers before or during worker cutover - -## Acceptance criteria - -The plan is complete when: - -1. Discord listener timeout no longer aborts healthy long-running turns. -2. Listener lifetime and agent-turn lifetime are separate concepts in code. -3. Existing per-session ordering is preserved. -4. ACP-bound Discord channels work through the same worker path. -5. `stop` targets the worker-owned run instead of the old listener-owned call stack. -6. Timeout and delivery failures become explicit worker outcomes, not silent listener drops. - -## Remaining landing strategy - -Finish this in follow-up PRs: - -1. make `DiscordInboundJob` plain-data only and move live runtime refs onto the worker -2. clean up command-state ownership for `stop`, `new`, and `reset` -3. add worker observability and operator status -4. decide whether durability is needed or explicitly document the in-memory boundary - -This is still a bounded follow-up if kept Discord-only and if we continue to avoid a premature cross-channel worker abstraction. diff --git a/docs/experiments/plans/openresponses-gateway.md b/docs/experiments/plans/openresponses-gateway.md deleted file mode 100644 index 8ca63c34ec9..00000000000 --- a/docs/experiments/plans/openresponses-gateway.md +++ /dev/null @@ -1,126 +0,0 @@ ---- -summary: "Plan: Add OpenResponses /v1/responses endpoint and deprecate chat completions cleanly" -read_when: - - Designing or implementing `/v1/responses` gateway support - - Planning migration from Chat Completions compatibility -owner: "openclaw" -status: "draft" -last_updated: "2026-01-19" -title: "OpenResponses Gateway Plan" ---- - -# OpenResponses Gateway Integration Plan - -## Context - -OpenClaw Gateway currently exposes a minimal OpenAI-compatible Chat Completions endpoint at -`/v1/chat/completions` (see [OpenAI Chat Completions](/gateway/openai-http-api)). - -Open Responses is an open inference standard based on the OpenAI Responses API. It is designed -for agentic workflows and uses item-based inputs plus semantic streaming events. The OpenResponses -spec defines `/v1/responses`, not `/v1/chat/completions`. - -## Goals - -- Add a `/v1/responses` endpoint that adheres to OpenResponses semantics. -- Keep Chat Completions as a compatibility layer that is easy to disable and eventually remove. -- Standardize validation and parsing with isolated, reusable schemas. - -## Non-goals - -- Full OpenResponses feature parity in the first pass (images, files, hosted tools). -- Replacing internal agent execution logic or tool orchestration. -- Changing the existing `/v1/chat/completions` behavior during the first phase. - -## Research Summary - -Sources: OpenResponses OpenAPI, OpenResponses specification site, and the Hugging Face blog post. - -Key points extracted: - -- `POST /v1/responses` accepts `CreateResponseBody` fields like `model`, `input` (string or - `ItemParam[]`), `instructions`, `tools`, `tool_choice`, `stream`, `max_output_tokens`, and - `max_tool_calls`. -- `ItemParam` is a discriminated union of: - - `message` items with roles `system`, `developer`, `user`, `assistant` - - `function_call` and `function_call_output` - - `reasoning` - - `item_reference` -- Successful responses return a `ResponseResource` with `object: "response"`, `status`, and - `output` items. -- Streaming uses semantic events such as: - - `response.created`, `response.in_progress`, `response.completed`, `response.failed` - - `response.output_item.added`, `response.output_item.done` - - `response.content_part.added`, `response.content_part.done` - - `response.output_text.delta`, `response.output_text.done` -- The spec requires: - - `Content-Type: text/event-stream` - - `event:` must match the JSON `type` field - - terminal event must be literal `[DONE]` -- Reasoning items may expose `content`, `encrypted_content`, and `summary`. -- HF examples include `OpenResponses-Version: latest` in requests (optional header). - -## Proposed Architecture - -- Add `src/gateway/open-responses.schema.ts` containing Zod schemas only (no gateway imports). -- Add `src/gateway/openresponses-http.ts` (or `open-responses-http.ts`) for `/v1/responses`. -- Keep `src/gateway/openai-http.ts` intact as a legacy compatibility adapter. -- Add config `gateway.http.endpoints.responses.enabled` (default `false`). -- Keep `gateway.http.endpoints.chatCompletions.enabled` independent; allow both endpoints to be - toggled separately. -- Emit a startup warning when Chat Completions is enabled to signal legacy status. - -## Deprecation Path for Chat Completions - -- Maintain strict module boundaries: no shared schema types between responses and chat completions. -- Make Chat Completions opt-in by config so it can be disabled without code changes. -- Update docs to label Chat Completions as legacy once `/v1/responses` is stable. -- Optional future step: map Chat Completions requests to the Responses handler for a simpler - removal path. - -## Phase 1 Support Subset - -- Accept `input` as string or `ItemParam[]` with message roles and `function_call_output`. -- Extract system and developer messages into `extraSystemPrompt`. -- Use the most recent `user` or `function_call_output` as the current message for agent runs. -- Reject unsupported content parts (image/file) with `invalid_request_error`. -- Return a single assistant message with `output_text` content. -- Return `usage` with zeroed values until token accounting is wired. - -## Validation Strategy (No SDK) - -- Implement Zod schemas for the supported subset of: - - `CreateResponseBody` - - `ItemParam` + message content part unions - - `ResponseResource` - - Streaming event shapes used by the gateway -- Keep schemas in a single, isolated module to avoid drift and allow future codegen. - -## Streaming Implementation (Phase 1) - -- SSE lines with both `event:` and `data:`. -- Required sequence (minimum viable): - - `response.created` - - `response.output_item.added` - - `response.content_part.added` - - `response.output_text.delta` (repeat as needed) - - `response.output_text.done` - - `response.content_part.done` - - `response.completed` - - `[DONE]` - -## Tests and Verification Plan - -- Add e2e coverage for `/v1/responses`: - - Auth required - - Non-stream response shape - - Stream event ordering and `[DONE]` - - Session routing with headers and `user` -- Keep `src/gateway/openai-http.test.ts` unchanged. -- Manual: curl to `/v1/responses` with `stream: true` and verify event ordering and terminal - `[DONE]`. - -## Doc Updates (Follow-up) - -- Add a new docs page for `/v1/responses` usage and examples. -- Update `/gateway/openai-http-api` with a legacy note and pointer to `/v1/responses`. diff --git a/docs/experiments/plans/pty-process-supervision.md b/docs/experiments/plans/pty-process-supervision.md deleted file mode 100644 index 4ec898058cd..00000000000 --- a/docs/experiments/plans/pty-process-supervision.md +++ /dev/null @@ -1,195 +0,0 @@ ---- -summary: "Production plan for reliable interactive process supervision (PTY + non-PTY) with explicit ownership, unified lifecycle, and deterministic cleanup" -read_when: - - Working on exec/process lifecycle ownership and cleanup - - Debugging PTY and non-PTY supervision behavior -owner: "openclaw" -status: "in-progress" -last_updated: "2026-02-15" -title: "PTY and Process Supervision Plan" ---- - -# PTY and Process Supervision Plan - -## 1. Problem and goal - -We need one reliable lifecycle for long-running command execution across: - -- `exec` foreground runs -- `exec` background runs -- `process` follow up actions (`poll`, `log`, `send-keys`, `paste`, `submit`, `kill`, `remove`) -- CLI agent runner subprocesses - -The goal is not just to support PTY. The goal is predictable ownership, cancellation, timeout, and cleanup with no unsafe process matching heuristics. - -## 2. Scope and boundaries - -- Keep implementation internal in `src/process/supervisor`. -- Do not create a new package for this. -- Keep current behavior compatibility where practical. -- Do not broaden scope to terminal replay or tmux style session persistence. - -## 3. Implemented in this branch - -### Supervisor baseline already present - -- Supervisor module is in place under `src/process/supervisor/*`. -- Exec runtime and CLI runner are already routed through supervisor spawn and wait. -- Registry finalization is idempotent. - -### This pass completed - -1. Explicit PTY command contract - -- `SpawnInput` is now a discriminated union in `src/process/supervisor/types.ts`. -- PTY runs require `ptyCommand` instead of reusing generic `argv`. -- Supervisor no longer rebuilds PTY command strings from argv joins in `src/process/supervisor/supervisor.ts`. -- Exec runtime now passes `ptyCommand` directly in `src/agents/bash-tools.exec-runtime.ts`. - -2. Process layer type decoupling - -- Supervisor types no longer import `SessionStdin` from agents. -- Process local stdin contract lives in `src/process/supervisor/types.ts` (`ManagedRunStdin`). -- Adapters now depend only on process level types: - - `src/process/supervisor/adapters/child.ts` - - `src/process/supervisor/adapters/pty.ts` - -3. Process tool lifecycle ownership improvement - -- `src/agents/bash-tools.process.ts` now requests cancellation through supervisor first. -- `process kill/remove` now use process-tree fallback termination when supervisor lookup misses. -- `remove` keeps deterministic remove behavior by dropping running session entries immediately after termination is requested. - -4. Single source watchdog defaults - -- Added shared defaults in `src/agents/cli-watchdog-defaults.ts`. -- `src/agents/cli-backends.ts` consumes the shared defaults. -- `src/agents/cli-runner/reliability.ts` consumes the same shared defaults. - -5. Dead helper cleanup - -- Removed unused `killSession` helper path from `src/agents/bash-tools.shared.ts`. - -6. Direct supervisor path tests added - -- Added `src/agents/bash-tools.process.supervisor.test.ts` to cover kill and remove routing through supervisor cancellation. - -7. Reliability gap fixes completed - -- `src/agents/bash-tools.process.ts` now falls back to real OS-level process termination when supervisor lookup misses. -- `src/process/supervisor/adapters/child.ts` now uses process-tree termination semantics for default cancel/timeout kill paths. -- Added shared process-tree utility in `src/process/kill-tree.ts`. - -8. PTY contract edge-case coverage added - -- Added `src/process/supervisor/supervisor.pty-command.test.ts` for verbatim PTY command forwarding and empty-command rejection. -- Added `src/process/supervisor/adapters/child.test.ts` for process-tree kill behavior in child adapter cancellation. - -## 4. Remaining gaps and decisions - -### Reliability status - -The two required reliability gaps for this pass are now closed: - -- `process kill/remove` now has a real OS termination fallback when supervisor lookup misses. -- child cancel/timeout now uses process-tree kill semantics for default kill path. -- Regression tests were added for both behaviors. - -### Durability and startup reconciliation - -Restart behavior is now explicitly defined as in-memory lifecycle only. - -- `reconcileOrphans()` remains a no-op in `src/process/supervisor/supervisor.ts` by design. -- Active runs are not recovered after process restart. -- This boundary is intentional for this implementation pass to avoid partial persistence risks. - -### Maintainability follow-ups - -1. `runExecProcess` in `src/agents/bash-tools.exec-runtime.ts` still handles multiple responsibilities and can be split into focused helpers in a follow-up. - -## 5. Implementation plan - -The implementation pass for required reliability and contract items is complete. - -Completed: - -- `process kill/remove` fallback real termination -- process-tree cancellation for child adapter default kill path -- regression tests for fallback kill and child adapter kill path -- PTY command edge-case tests under explicit `ptyCommand` -- explicit in-memory restart boundary with `reconcileOrphans()` no-op by design - -Optional follow-up: - -- split `runExecProcess` into focused helpers with no behavior drift - -## 6. File map - -### Process supervisor - -- `src/process/supervisor/types.ts` updated with discriminated spawn input and process local stdin contract. -- `src/process/supervisor/supervisor.ts` updated to use explicit `ptyCommand`. -- `src/process/supervisor/adapters/child.ts` and `src/process/supervisor/adapters/pty.ts` decoupled from agent types. -- `src/process/supervisor/registry.ts` idempotent finalize unchanged and retained. - -### Exec and process integration - -- `src/agents/bash-tools.exec-runtime.ts` updated to pass PTY command explicitly and keep fallback path. -- `src/agents/bash-tools.process.ts` updated to cancel via supervisor with real process-tree fallback termination. -- `src/agents/bash-tools.shared.ts` removed direct kill helper path. - -### CLI reliability - -- `src/agents/cli-watchdog-defaults.ts` added as shared baseline. -- `src/agents/cli-backends.ts` and `src/agents/cli-runner/reliability.ts` now consume same defaults. - -## 7. Validation run in this pass - -Unit tests: - -- `pnpm vitest src/process/supervisor/registry.test.ts` -- `pnpm vitest src/process/supervisor/supervisor.test.ts` -- `pnpm vitest src/process/supervisor/supervisor.pty-command.test.ts` -- `pnpm vitest src/process/supervisor/adapters/child.test.ts` -- `pnpm vitest src/agents/cli-backends.test.ts` -- `pnpm vitest src/agents/bash-tools.exec.pty-cleanup.test.ts` -- `pnpm vitest src/agents/bash-tools.process.poll-timeout.test.ts` -- `pnpm vitest src/agents/bash-tools.process.supervisor.test.ts` -- `pnpm vitest src/process/exec.test.ts` - -E2E targets: - -- `pnpm vitest src/agents/cli-runner.test.ts` -- `pnpm vitest run src/agents/bash-tools.exec.pty-fallback.test.ts src/agents/bash-tools.exec.background-abort.test.ts src/agents/bash-tools.process.send-keys.test.ts` - -Typecheck note: - -- Use `pnpm build` (and `pnpm check` for full lint/docs gate) in this repo. Older notes that mention `pnpm tsgo` are obsolete. - -## 8. Operational guarantees preserved - -- Exec env hardening behavior is unchanged. -- Approval and allowlist flow is unchanged. -- Output sanitization and output caps are unchanged. -- PTY adapter still guarantees wait settlement on forced kill and listener disposal. - -## 9. Definition of done - -1. Supervisor is lifecycle owner for managed runs. -2. PTY spawn uses explicit command contract with no argv reconstruction. -3. Process layer has no type dependency on agent layer for supervisor stdin contracts. -4. Watchdog defaults are single source. -5. Targeted unit and e2e tests remain green. -6. Restart durability boundary is explicitly documented or fully implemented. - -## 10. Summary - -The branch now has a coherent and safer supervision shape: - -- explicit PTY contract -- cleaner process layering -- supervisor driven cancellation path for process operations -- real fallback termination when supervisor lookup misses -- process-tree cancellation for child-run default kill paths -- unified watchdog defaults -- explicit in-memory restart boundary (no orphan reconciliation across restart in this pass) diff --git a/docs/experiments/plans/session-binding-channel-agnostic.md b/docs/experiments/plans/session-binding-channel-agnostic.md deleted file mode 100644 index aa1f926b36b..00000000000 --- a/docs/experiments/plans/session-binding-channel-agnostic.md +++ /dev/null @@ -1,226 +0,0 @@ ---- -summary: "Channel agnostic session binding architecture and iteration 1 delivery scope" -read_when: - - Refactoring channel-agnostic session routing and bindings - - Investigating duplicate, stale, or missing session delivery across channels -owner: "onutc" -status: "in-progress" -last_updated: "2026-02-21" -title: "Session Binding Channel Agnostic Plan" ---- - -# Session Binding Channel Agnostic Plan - -## Overview - -This document defines the long term channel agnostic session binding model and the concrete scope for the next implementation iteration. - -Goal: - -- make subagent bound session routing a core capability -- keep channel specific behavior in adapters -- avoid regressions in normal Discord behavior - -## Why this exists - -Current behavior mixes: - -- completion content policy -- destination routing policy -- Discord specific details - -This caused edge cases such as: - -- duplicate main and thread delivery under concurrent runs -- stale token usage on reused binding managers -- missing activity accounting for webhook sends - -## Iteration 1 scope - -This iteration is intentionally limited. - -### 1. Add channel agnostic core interfaces - -Add core types and service interfaces for bindings and routing. - -Proposed core types: - -```ts -export type BindingTargetKind = "subagent" | "session"; -export type BindingStatus = "active" | "ending" | "ended"; - -export type ConversationRef = { - channel: string; - accountId: string; - conversationId: string; - parentConversationId?: string; -}; - -export type SessionBindingRecord = { - bindingId: string; - targetSessionKey: string; - targetKind: BindingTargetKind; - conversation: ConversationRef; - status: BindingStatus; - boundAt: number; - expiresAt?: number; - metadata?: Record; -}; -``` - -Core service contract: - -```ts -export interface SessionBindingService { - bind(input: { - targetSessionKey: string; - targetKind: BindingTargetKind; - conversation: ConversationRef; - metadata?: Record; - ttlMs?: number; - }): Promise; - - listBySession(targetSessionKey: string): SessionBindingRecord[]; - resolveByConversation(ref: ConversationRef): SessionBindingRecord | null; - touch(bindingId: string, at?: number): void; - unbind(input: { - bindingId?: string; - targetSessionKey?: string; - reason: string; - }): Promise; -} -``` - -### 2. Add one core delivery router for subagent completions - -Add a single destination resolution path for completion events. - -Router contract: - -```ts -export interface BoundDeliveryRouter { - resolveDestination(input: { - eventKind: "task_completion"; - targetSessionKey: string; - requester?: ConversationRef; - failClosed: boolean; - }): { - binding: SessionBindingRecord | null; - mode: "bound" | "fallback"; - reason: string; - }; -} -``` - -For this iteration: - -- only `task_completion` is routed through this new path -- existing paths for other event kinds remain as-is - -### 3. Keep Discord as adapter - -Discord remains the first adapter implementation. - -Adapter responsibilities: - -- create/reuse thread conversations -- send bound messages via webhook or channel send -- validate thread state (archived/deleted) -- map adapter metadata (webhook identity, thread ids) - -### 4. Fix currently known correctness issues - -Required in this iteration: - -- refresh token usage when reusing existing thread binding manager -- record outbound activity for webhook based Discord sends -- stop implicit main channel fallback when a bound thread destination is selected for session mode completion - -### 5. Preserve current runtime safety defaults - -No behavior change for users with thread bound spawn disabled. - -Defaults stay: - -- `channels.discord.threadBindings.spawnSubagentSessions = false` - -Result: - -- normal Discord users stay on current behavior -- new core path affects only bound session completion routing where enabled - -## Not in iteration 1 - -Explicitly deferred: - -- ACP binding targets (`targetKind: "acp"`) -- new channel adapters beyond Discord -- global replacement of all delivery paths (`spawn_ack`, future `subagent_message`) -- protocol level changes -- store migration/versioning redesign for all binding persistence - -Notes on ACP: - -- interface design keeps room for ACP -- ACP implementation is not started in this iteration - -## Routing invariants - -These invariants are mandatory for iteration 1. - -- destination selection and content generation are separate steps -- if session mode completion resolves to an active bound destination, delivery must target that destination -- no hidden reroute from bound destination to main channel -- fallback behavior must be explicit and observable - -## Compatibility and rollout - -Compatibility target: - -- no regression for users with thread bound spawning off -- no change to non-Discord channels in this iteration - -Rollout: - -1. Land interfaces and router behind current feature gates. -2. Route Discord completion mode bound deliveries through router. -3. Keep legacy path for non-bound flows. -4. Verify with targeted tests and canary runtime logs. - -## Tests required in iteration 1 - -Unit and integration coverage required: - -- manager token rotation uses latest token after manager reuse -- webhook sends update channel activity timestamps -- two active bound sessions in same requester channel do not duplicate to main channel -- completion for bound session mode run resolves to thread destination only -- disabled spawn flag keeps legacy behavior unchanged - -## Proposed implementation files - -Core: - -- `src/infra/outbound/session-binding-service.ts` (new) -- `src/infra/outbound/bound-delivery-router.ts` (new) -- `src/agents/subagent-announce.ts` (completion destination resolution integration) - -Discord adapter and runtime: - -- `src/discord/monitor/thread-bindings.manager.ts` -- `src/discord/monitor/reply-delivery.ts` -- `src/discord/send.outbound.ts` - -Tests: - -- `src/discord/monitor/provider*.test.ts` -- `src/discord/monitor/reply-delivery.test.ts` -- `src/agents/subagent-announce.format.test.ts` - -## Done criteria for iteration 1 - -- core interfaces exist and are wired for completion routing -- correctness fixes above are merged with tests -- no main and thread duplicate completion delivery in session mode bound runs -- no behavior change for disabled bound spawn deployments -- ACP remains explicitly deferred diff --git a/docs/experiments/proposals/acp-bound-command-auth.md b/docs/experiments/proposals/acp-bound-command-auth.md deleted file mode 100644 index 1d02e9e8469..00000000000 --- a/docs/experiments/proposals/acp-bound-command-auth.md +++ /dev/null @@ -1,89 +0,0 @@ ---- -summary: "Proposal: long-term command authorization model for ACP-bound conversations" -read_when: - - Designing native command auth behavior in Telegram/Discord ACP-bound channels/topics -title: "ACP Bound Command Authorization (Proposal)" ---- - -# ACP Bound Command Authorization (Proposal) - -Status: Proposed, **not implemented yet**. - -This document describes a long-term authorization model for native commands in -ACP-bound conversations. It is an experiments proposal and does not replace -current production behavior. - -For implemented behavior, read source and tests in: - -- `src/telegram/bot-native-commands.ts` -- `src/discord/monitor/native-command.ts` -- `src/auto-reply/reply/commands-core.ts` - -## Problem - -Today we have command-specific checks (for example `/new` and `/reset`) that -need to work inside ACP-bound channels/topics even when allowlists are empty. -This solves immediate UX pain, but command-name-based exceptions do not scale. - -## Long-term shape - -Move command authorization from ad-hoc handler logic to command metadata plus a -shared policy evaluator. - -### 1) Add auth policy metadata to command definitions - -Each command definition should declare an auth policy. Example shape: - -```ts -type CommandAuthPolicy = - | { mode: "owner_or_allowlist" } // default, current strict behavior - | { mode: "bound_acp_or_owner_or_allowlist" } // allow in explicitly bound ACP conversations - | { mode: "owner_only" }; -``` - -`/new` and `/reset` would use `bound_acp_or_owner_or_allowlist`. -Most other commands would remain `owner_or_allowlist`. - -### 2) Share one evaluator across channels - -Introduce one helper that evaluates command auth using: - -- command policy metadata -- sender authorization state -- resolved conversation binding state - -Both Telegram and Discord native handlers should call the same helper to avoid -behavior drift. - -### 3) Use binding-match as the bypass boundary - -When policy allows bound ACP bypass, authorize only if a configured binding -match was resolved for the current conversation (not just because current -session key looks ACP-like). - -This keeps the boundary explicit and minimizes accidental widening. - -## Why this is better - -- Scales to future commands without adding more command-name conditionals. -- Keeps behavior consistent across channels. -- Preserves current security model by requiring explicit binding match. -- Keeps allowlists optional hardening instead of a universal requirement. - -## Rollout plan (future) - -1. Add command auth policy field to command registry types and command data. -2. Implement shared evaluator and migrate Telegram + Discord native handlers. -3. Move `/new` and `/reset` to metadata-driven policy. -4. Add tests per policy mode and channel surface. - -## Non-goals - -- This proposal does not change ACP session lifecycle behavior. -- This proposal does not require allowlists for all ACP-bound commands. -- This proposal does not change existing route binding semantics. - -## Note - -This proposal is intentionally additive and does not delete or replace existing -experiments documents. diff --git a/docs/experiments/proposals/model-config.md b/docs/experiments/proposals/model-config.md deleted file mode 100644 index 6a0ef6524b0..00000000000 --- a/docs/experiments/proposals/model-config.md +++ /dev/null @@ -1,36 +0,0 @@ ---- -summary: "Exploration: model config, auth profiles, and fallback behavior" -read_when: - - Exploring future model selection + auth profile ideas -title: "Model Config Exploration" ---- - -# Model Config (Exploration) - -This document captures **ideas** for future model configuration. It is not a -shipping spec. For current behavior, see: - -- [Models](/concepts/models) -- [Model failover](/concepts/model-failover) -- [OAuth + profiles](/concepts/oauth) - -## Motivation - -Operators want: - -- Multiple auth profiles per provider (personal vs work). -- Simple `/model` selection with predictable fallbacks. -- Clear separation between text models and image-capable models. - -## Possible direction (high level) - -- Keep model selection simple: `provider/model` with optional aliases. -- Let providers have multiple auth profiles, with an explicit order. -- Use a global fallback list so all sessions fail over consistently. -- Only override image routing when explicitly configured. - -## Open questions - -- Should profile rotation be per-provider or per-model? -- How should the UI surface profile selection for a session? -- What is the safest migration path from legacy config keys? diff --git a/docs/experiments/research/memory.md b/docs/experiments/research/memory.md deleted file mode 100644 index 99135e78be9..00000000000 --- a/docs/experiments/research/memory.md +++ /dev/null @@ -1,228 +0,0 @@ ---- -summary: "Research notes: offline memory system for Clawd workspaces (Markdown source-of-truth + derived index)" -read_when: - - Designing workspace memory (~/.openclaw/workspace) beyond daily Markdown logs - - Deciding: standalone CLI vs deep OpenClaw integration - - Adding offline recall + reflection (retain/recall/reflect) -title: "Workspace Memory Research" ---- - -# Workspace Memory v2 (offline): research notes - -Target: Clawd-style workspace (`agents.defaults.workspace`, default `~/.openclaw/workspace`) where “memory” is stored as one Markdown file per day (`memory/YYYY-MM-DD.md`) plus a small set of stable files (e.g. `memory.md`, `SOUL.md`). - -This doc proposes an **offline-first** memory architecture that keeps Markdown as the canonical, reviewable source of truth, but adds **structured recall** (search, entity summaries, confidence updates) via a derived index. - -## Why change? - -The current setup (one file per day) is excellent for: - -- “append-only” journaling -- human editing -- git-backed durability + auditability -- low-friction capture (“just write it down”) - -It’s weak for: - -- high-recall retrieval (“what did we decide about X?”, “last time we tried Y?”) -- entity-centric answers (“tell me about Alice / The Castle / warelay”) without rereading many files -- opinion/preference stability (and evidence when it changes) -- time constraints (“what was true during Nov 2025?”) and conflict resolution - -## Design goals - -- **Offline**: works without network; can run on laptop/Castle; no cloud dependency. -- **Explainable**: retrieved items should be attributable (file + location) and separable from inference. -- **Low ceremony**: daily logging stays Markdown, no heavy schema work. -- **Incremental**: v1 is useful with FTS only; semantic/vector and graphs are optional upgrades. -- **Agent-friendly**: makes “recall within token budgets” easy (return small bundles of facts). - -## North star model (Hindsight × Letta) - -Two pieces to blend: - -1. **Letta/MemGPT-style control loop** - -- keep a small “core” always in context (persona + key user facts) -- everything else is out-of-context and retrieved via tools -- memory writes are explicit tool calls (append/replace/insert), persisted, then re-injected next turn - -2. **Hindsight-style memory substrate** - -- separate what’s observed vs what’s believed vs what’s summarized -- support retain/recall/reflect -- confidence-bearing opinions that can evolve with evidence -- entity-aware retrieval + temporal queries (even without full knowledge graphs) - -## Proposed architecture (Markdown source-of-truth + derived index) - -### Canonical store (git-friendly) - -Keep `~/.openclaw/workspace` as canonical human-readable memory. - -Suggested workspace layout: - -``` -~/.openclaw/workspace/ - memory.md # small: durable facts + preferences (core-ish) - memory/ - YYYY-MM-DD.md # daily log (append; narrative) - bank/ # “typed” memory pages (stable, reviewable) - world.md # objective facts about the world - experience.md # what the agent did (first-person) - opinions.md # subjective prefs/judgments + confidence + evidence pointers - entities/ - Peter.md - The-Castle.md - warelay.md - ... -``` - -Notes: - -- **Daily log stays daily log**. No need to turn it into JSON. -- The `bank/` files are **curated**, produced by reflection jobs, and can still be edited by hand. -- `memory.md` remains “small + core-ish”: the things you want Clawd to see every session. - -### Derived store (machine recall) - -Add a derived index under the workspace (not necessarily git tracked): - -``` -~/.openclaw/workspace/.memory/index.sqlite -``` - -Back it with: - -- SQLite schema for facts + entity links + opinion metadata -- SQLite **FTS5** for lexical recall (fast, tiny, offline) -- optional embeddings table for semantic recall (still offline) - -The index is always **rebuildable from Markdown**. - -## Retain / Recall / Reflect (operational loop) - -### Retain: normalize daily logs into “facts” - -Hindsight’s key insight that matters here: store **narrative, self-contained facts**, not tiny snippets. - -Practical rule for `memory/YYYY-MM-DD.md`: - -- at end of day (or during), add a `## Retain` section with 2–5 bullets that are: - - narrative (cross-turn context preserved) - - self-contained (standalone makes sense later) - - tagged with type + entity mentions - -Example: - -``` -## Retain -- W @Peter: Currently in Marrakech (Nov 27–Dec 1, 2025) for Andy’s birthday. -- B @warelay: I fixed the Baileys WS crash by wrapping connection.update handlers in try/catch (see memory/2025-11-27.md). -- O(c=0.95) @Peter: Prefers concise replies (<1500 chars) on WhatsApp; long content goes into files. -``` - -Minimal parsing: - -- Type prefix: `W` (world), `B` (experience/biographical), `O` (opinion), `S` (observation/summary; usually generated) -- Entities: `@Peter`, `@warelay`, etc (slugs map to `bank/entities/*.md`) -- Opinion confidence: `O(c=0.0..1.0)` optional - -If you don’t want authors to think about it: the reflect job can infer these bullets from the rest of the log, but having an explicit `## Retain` section is the easiest “quality lever”. - -### Recall: queries over the derived index - -Recall should support: - -- **lexical**: “find exact terms / names / commands” (FTS5) -- **entity**: “tell me about X” (entity pages + entity-linked facts) -- **temporal**: “what happened around Nov 27” / “since last week” -- **opinion**: “what does Peter prefer?” (with confidence + evidence) - -Return format should be agent-friendly and cite sources: - -- `kind` (`world|experience|opinion|observation`) -- `timestamp` (source day, or extracted time range if present) -- `entities` (`["Peter","warelay"]`) -- `content` (the narrative fact) -- `source` (`memory/2025-11-27.md#L12` etc) - -### Reflect: produce stable pages + update beliefs - -Reflection is a scheduled job (daily or heartbeat `ultrathink`) that: - -- updates `bank/entities/*.md` from recent facts (entity summaries) -- updates `bank/opinions.md` confidence based on reinforcement/contradiction -- optionally proposes edits to `memory.md` (“core-ish” durable facts) - -Opinion evolution (simple, explainable): - -- each opinion has: - - statement - - confidence `c ∈ [0,1]` - - last_updated - - evidence links (supporting + contradicting fact IDs) -- when new facts arrive: - - find candidate opinions by entity overlap + similarity (FTS first, embeddings later) - - update confidence by small deltas; big jumps require strong contradiction + repeated evidence - -## CLI integration: standalone vs deep integration - -Recommendation: **deep integration in OpenClaw**, but keep a separable core library. - -### Why integrate into OpenClaw? - -- OpenClaw already knows: - - the workspace path (`agents.defaults.workspace`) - - the session model + heartbeats - - logging + troubleshooting patterns -- You want the agent itself to call the tools: - - `openclaw memory recall "…" --k 25 --since 30d` - - `openclaw memory reflect --since 7d` - -### Why still split a library? - -- keep memory logic testable without gateway/runtime -- reuse from other contexts (local scripts, future desktop app, etc.) - -Shape: -The memory tooling is intended to be a small CLI + library layer, but this is exploratory only. - -## “S-Collide” / SuCo: when to use it (research) - -If “S-Collide” refers to **SuCo (Subspace Collision)**: it’s an ANN retrieval approach that targets strong recall/latency tradeoffs by using learned/structured collisions in subspaces (paper: arXiv 2411.14754, 2024). - -Pragmatic take for `~/.openclaw/workspace`: - -- **don’t start** with SuCo. -- start with SQLite FTS + (optional) simple embeddings; you’ll get most UX wins immediately. -- consider SuCo/HNSW/ScaNN-class solutions only once: - - corpus is big (tens/hundreds of thousands of chunks) - - brute-force embedding search becomes too slow - - recall quality is meaningfully bottlenecked by lexical search - -Offline-friendly alternatives (in increasing complexity): - -- SQLite FTS5 + metadata filters (zero ML) -- Embeddings + brute force (works surprisingly far if chunk count is low) -- HNSW index (common, robust; needs a library binding) -- SuCo (research-grade; attractive if there’s a solid implementation you can embed) - -Open question: - -- what’s the **best** offline embedding model for “personal assistant memory” on your machines (laptop + desktop)? - - if you already have Ollama: embed with a local model; otherwise ship a small embedding model in the toolchain. - -## Smallest useful pilot - -If you want a minimal, still-useful version: - -- Add `bank/` entity pages and a `## Retain` section in daily logs. -- Use SQLite FTS for recall with citations (path + line numbers). -- Add embeddings only if recall quality or scale demands it. - -## References - -- Letta / MemGPT concepts: “core memory blocks” + “archival memory” + tool-driven self-editing memory. -- Hindsight Technical Report: “retain / recall / reflect”, four-network memory, narrative fact extraction, opinion confidence evolution. -- SuCo: arXiv 2411.14754 (2024): “Subspace Collision” approximate nearest neighbor retrieval. diff --git a/docs/start/hubs.md b/docs/start/hubs.md index 882f547f65a..fb3357a46aa 100644 --- a/docs/start/hubs.md +++ b/docs/start/hubs.md @@ -176,12 +176,6 @@ Use these hubs to discover every page, including deep dives and reference docs t - [Templates: TOOLS](/reference/templates/TOOLS) - [Templates: USER](/reference/templates/USER) -## Experiments (exploratory) - -- [Onboarding config protocol](/experiments/onboarding-config-protocol) -- [Research: memory](/experiments/research/memory) -- [Model config exploration](/experiments/proposals/model-config) - ## Project - [Credits](/reference/credits) diff --git a/docs/zh-CN/experiments/onboarding-config-protocol.md b/docs/zh-CN/experiments/onboarding-config-protocol.md deleted file mode 100644 index 991801871ef..00000000000 --- a/docs/zh-CN/experiments/onboarding-config-protocol.md +++ /dev/null @@ -1,47 +0,0 @@ ---- -read_when: Changing onboarding wizard steps or config schema endpoints -summary: 新手引导向导和配置模式的 RPC 协议说明 -title: 新手引导和配置协议 -x-i18n: - generated_at: "2026-02-03T07:47:10Z" - model: claude-opus-4-5 - provider: pi - source_hash: 55163b3ee029c02476800cb616a054e5adfe97dae5bb72f2763dce0079851e06 - source_path: experiments/onboarding-config-protocol.md - workflow: 15 ---- - -# 新手引导 + 配置协议 - -目的:CLI、macOS 应用和 Web UI 之间共享的新手引导 + 配置界面。 - -## 组件 - -- 向导引擎(共享会话 + 提示 + 新手引导状态)。 -- CLI 新手引导使用与 UI 客户端相同的向导流程。 -- Gateway 网关 RPC 公开向导 + 配置模式端点。 -- macOS 新手引导使用向导步骤模型。 -- Web UI 从 JSON Schema + UI 提示渲染配置表单。 - -## Gateway 网关 RPC - -- `wizard.start` 参数:`{ mode?: "local"|"remote", workspace?: string }` -- `wizard.next` 参数:`{ sessionId, answer?: { stepId, value? } }` -- `wizard.cancel` 参数:`{ sessionId }` -- `wizard.status` 参数:`{ sessionId }` -- `config.schema` 参数:`{}` - -响应(结构) - -- 向导:`{ sessionId, done, step?, status?, error? }` -- 配置模式:`{ schema, uiHints, version, generatedAt }` - -## UI 提示 - -- `uiHints` 按路径键入;可选元数据(label/help/group/order/advanced/sensitive/placeholder)。 -- 敏感字段渲染为密码输入;无脱敏层。 -- 不支持的模式节点回退到原始 JSON 编辑器。 - -## 注意 - -- 本文档是跟踪新手引导/配置协议重构的唯一位置。 diff --git a/docs/zh-CN/experiments/plans/cron-add-hardening.md b/docs/zh-CN/experiments/plans/cron-add-hardening.md deleted file mode 100644 index c1dcf1d53bd..00000000000 --- a/docs/zh-CN/experiments/plans/cron-add-hardening.md +++ /dev/null @@ -1,70 +0,0 @@ ---- -last_updated: "2026-01-05" -owner: openclaw -status: complete -summary: 加固 cron.add 输入处理,对齐 schema,改进 cron UI/智能体工具 -title: Cron Add 加固 -x-i18n: - generated_at: "2026-02-03T07:47:26Z" - model: claude-opus-4-5 - provider: pi - source_hash: d7e469674bd9435b846757ea0d5dc8f174eaa8533917fc013b1ef4f82859496d - source_path: experiments/plans/cron-add-hardening.md - workflow: 15 ---- - -# Cron Add 加固 & Schema 对齐 - -## 背景 - -最近的 Gateway 网关日志显示重复的 `cron.add` 失败,参数无效(缺少 `sessionTarget`、`wakeMode`、`payload`,以及格式错误的 `schedule`)。这表明至少有一个客户端(可能是智能体工具调用路径)正在发送包装的或部分指定的任务负载。另外,TypeScript 中的 cron 提供商枚举、Gateway 网关 schema、CLI 标志和 UI 表单类型之间存在漂移,加上 `cron.status` 的 UI 不匹配(期望 `jobCount` 而 Gateway 网关返回 `jobs`)。 - -## 目标 - -- 通过规范化常见的包装负载并推断缺失的 `kind` 字段来停止 `cron.add` INVALID_REQUEST 垃圾。 -- 在 Gateway 网关 schema、cron 类型、CLI 文档和 UI 表单之间对齐 cron 提供商列表。 -- 使智能体 cron 工具 schema 明确,以便 LLM 生成正确的任务负载。 -- 修复 Control UI cron 状态任务计数显示。 -- 添加测试以覆盖规范化和工具行为。 - -## 非目标 - -- 更改 cron 调度语义或任务执行行为。 -- 添加新的调度类型或 cron 表达式解析。 -- 除了必要的字段修复外,不大改 cron 的 UI/UX。 - -## 发现(当前差距) - -- Gateway 网关中的 `CronPayloadSchema` 排除了 `signal` + `imessage`,而 TS 类型包含它们。 -- Control UI CronStatus 期望 `jobCount`,但 Gateway 网关返回 `jobs`。 -- 智能体 cron 工具 schema 允许任意 `job` 对象,导致格式错误的输入。 -- Gateway 网关严格验证 `cron.add` 而不进行规范化,因此包装的负载会失败。 - -## 变更内容 - -- `cron.add` 和 `cron.update` 现在规范化常见的包装形式并推断缺失的 `kind` 字段。 -- 智能体 cron 工具 schema 与 Gateway 网关 schema 匹配,减少无效负载。 -- 提供商枚举在 Gateway 网关、CLI、UI 和 macOS 选择器之间对齐。 -- Control UI 使用 Gateway 网关的 `jobs` 计数字段显示状态。 - -## 当前行为 - -- **规范化:**包装的 `data`/`job` 负载被解包;`schedule.kind` 和 `payload.kind` 在安全时被推断。 -- **默认值:**当缺失时,为 `wakeMode` 和 `sessionTarget` 应用安全默认值。 -- **提供商:**Discord/Slack/Signal/iMessage 现在在 CLI/UI 中一致显示。 - -参见 [Cron 任务](/automation/cron-jobs) 了解规范化的形式和示例。 - -## 验证 - -- 观察 Gateway 网关日志中 `cron.add` INVALID_REQUEST 错误是否减少。 -- 确认 Control UI cron 状态在刷新后显示任务计数。 - -## 可选后续工作 - -- 手动 Control UI 冒烟测试:为每个提供商添加一个 cron 任务 + 验证状态任务计数。 - -## 开放问题 - -- `cron.add` 是否应该接受来自客户端的显式 `state`(当前被 schema 禁止)? -- 我们是否应该允许 `webchat` 作为显式投递提供商(当前在投递解析中被过滤)? diff --git a/docs/zh-CN/experiments/plans/group-policy-hardening.md b/docs/zh-CN/experiments/plans/group-policy-hardening.md deleted file mode 100644 index afbb8b39d6a..00000000000 --- a/docs/zh-CN/experiments/plans/group-policy-hardening.md +++ /dev/null @@ -1,45 +0,0 @@ ---- -read_when: - - 查看历史 Telegram 允许列表更改 -summary: Telegram 允许列表加固:前缀 + 空白规范化 -title: Telegram 允许列表加固 -x-i18n: - generated_at: "2026-02-03T07:47:16Z" - model: claude-opus-4-5 - provider: pi - source_hash: a2eca5fcc85376948cfe1b6044f1a8bc69c7f0eb94d1ceafedc1e507ba544162 - source_path: experiments/plans/group-policy-hardening.md - workflow: 15 ---- - -# Telegram 允许列表加固 - -**日期**:2026-01-05 -**状态**:已完成 -**PR**:#216 - -## 摘要 - -Telegram 允许列表现在不区分大小写地接受 `telegram:` 和 `tg:` 前缀,并容忍意外的空白。这使入站允许列表检查与出站发送规范化保持一致。 - -## 更改内容 - -- 前缀 `telegram:` 和 `tg:` 被同等对待(不区分大小写)。 -- 允许列表条目会被修剪;空条目会被忽略。 - -## 示例 - -以下所有形式都被接受为同一 ID: - -- `telegram:123456` -- `TG:123456` -- `tg:123456` - -## 为什么重要 - -从日志或聊天 ID 复制/粘贴通常会包含前缀和空白。规范化可避免在决定是否在私信或群组中响应时出现误判。 - -## 相关文档 - -- [群聊](/channels/groups) -- [Telegram 提供商](/channels/telegram) diff --git a/docs/zh-CN/experiments/plans/openresponses-gateway.md b/docs/zh-CN/experiments/plans/openresponses-gateway.md deleted file mode 100644 index 797da3d91af..00000000000 --- a/docs/zh-CN/experiments/plans/openresponses-gateway.md +++ /dev/null @@ -1,121 +0,0 @@ ---- -last_updated: "2026-01-19" -owner: openclaw -status: draft -summary: 计划:添加 OpenResponses /v1/responses 端点并干净地弃用 chat completions -title: OpenResponses Gateway 网关计划 -x-i18n: - generated_at: "2026-02-03T07:47:33Z" - model: claude-opus-4-5 - provider: pi - source_hash: 71a22c48397507d1648b40766a3153e420c54f2a2d5186d07e51eb3d12e4636a - source_path: experiments/plans/openresponses-gateway.md - workflow: 15 ---- - -# OpenResponses Gateway 网关集成计划 - -## 背景 - -OpenClaw Gateway 网关目前在 `/v1/chat/completions` 暴露了一个最小的 OpenAI 兼容 Chat Completions 端点(参见 [OpenAI Chat Completions](/gateway/openai-http-api))。 - -Open Responses 是基于 OpenAI Responses API 的开放推理标准。它专为智能体工作流设计,使用基于项目的输入加语义流式事件。OpenResponses 规范定义的是 `/v1/responses`,而不是 `/v1/chat/completions`。 - -## 目标 - -- 添加一个遵循 OpenResponses 语义的 `/v1/responses` 端点。 -- 保留 Chat Completions 作为兼容层,易于禁用并最终移除。 -- 使用隔离的、可复用的 schema 标准化验证和解析。 - -## 非目标 - -- 第一阶段完全实现 OpenResponses 功能(图片、文件、托管工具)。 -- 替换内部智能体执行逻辑或工具编排。 -- 在第一阶段更改现有的 `/v1/chat/completions` 行为。 - -## 研究摘要 - -来源:OpenResponses OpenAPI、OpenResponses 规范网站和 Hugging Face 博客文章。 - -提取的关键点: - -- `POST /v1/responses` 接受 `CreateResponseBody` 字段,如 `model`、`input`(字符串或 `ItemParam[]`)、`instructions`、`tools`、`tool_choice`、`stream`、`max_output_tokens` 和 `max_tool_calls`。 -- `ItemParam` 是以下类型的可区分联合: - - 具有角色 `system`、`developer`、`user`、`assistant` 的 `message` 项 - - `function_call` 和 `function_call_output` - - `reasoning` - - `item_reference` -- 成功响应返回带有 `object: "response"`、`status` 和 `output` 项的 `ResponseResource`。 -- 流式传输使用语义事件,如: - - `response.created`、`response.in_progress`、`response.completed`、`response.failed` - - `response.output_item.added`、`response.output_item.done` - - `response.content_part.added`、`response.content_part.done` - - `response.output_text.delta`、`response.output_text.done` -- 规范要求: - - `Content-Type: text/event-stream` - - `event:` 必须匹配 JSON `type` 字段 - - 终止事件必须是字面量 `[DONE]` -- Reasoning 项可能暴露 `content`、`encrypted_content` 和 `summary`。 -- HF 示例在请求中包含 `OpenResponses-Version: latest`(可选头部)。 - -## 提议的架构 - -- 添加 `src/gateway/open-responses.schema.ts`,仅包含 Zod schema(无 gateway 导入)。 -- 添加 `src/gateway/openresponses-http.ts`(或 `open-responses-http.ts`)用于 `/v1/responses`。 -- 保持 `src/gateway/openai-http.ts` 不变,作为遗留兼容适配器。 -- 添加配置 `gateway.http.endpoints.responses.enabled`(默认 `false`)。 -- 保持 `gateway.http.endpoints.chatCompletions.enabled` 独立;允许两个端点分别切换。 -- 当 Chat Completions 启用时发出启动警告,以表明其遗留状态。 - -## Chat Completions 弃用路径 - -- 保持严格的模块边界:responses 和 chat completions 之间不共享 schema 类型。 -- 通过配置使 Chat Completions 成为可选,这样无需代码更改即可禁用。 -- 一旦 `/v1/responses` 稳定,更新文档将 Chat Completions 标记为遗留。 -- 可选的未来步骤:将 Chat Completions 请求映射到 Responses 处理器,以便更简单地移除。 - -## 第一阶段支持子集 - -- 接受 `input` 为字符串或带有消息角色和 `function_call_output` 的 `ItemParam[]`。 -- 将 system 和 developer 消息提取到 `extraSystemPrompt` 中。 -- 使用最近的 `user` 或 `function_call_output` 作为智能体运行的当前消息。 -- 对不支持的内容部分(图片/文件)返回 `invalid_request_error` 拒绝。 -- 返回带有 `output_text` 内容的单个助手消息。 -- 返回带有零值的 `usage`,直到 token 计数接入。 - -## 验证策略(无 SDK) - -- 为以下支持子集实现 Zod schema: - - `CreateResponseBody` - - `ItemParam` + 消息内容部分联合 - - `ResponseResource` - - Gateway 网关使用的流式事件形状 -- 将 schema 保存在单个隔离模块中,以避免漂移并允许未来代码生成。 - -## 流式实现(第一阶段) - -- 带有 `event:` 和 `data:` 的 SSE 行。 -- 所需序列(最小可行): - - `response.created` - - `response.output_item.added` - - `response.content_part.added` - - `response.output_text.delta`(根据需要重复) - - `response.output_text.done` - - `response.content_part.done` - - `response.completed` - - `[DONE]` - -## 测试和验证计划 - -- 为 `/v1/responses` 添加端到端覆盖: - - 需要认证 - - 非流式响应形状 - - 流式事件顺序和 `[DONE]` - - 使用头部和 `user` 的会话路由 -- 保持 `src/gateway/openai-http.e2e.test.ts` 不变。 -- 手动:用 `stream: true` curl `/v1/responses` 并验证事件顺序和终止 `[DONE]`。 - -## 文档更新(后续) - -- 为 `/v1/responses` 使用和示例添加新文档页面。 -- 更新 `/gateway/openai-http-api`,添加遗留说明和指向 `/v1/responses` 的指针。 diff --git a/docs/zh-CN/experiments/proposals/model-config.md b/docs/zh-CN/experiments/proposals/model-config.md deleted file mode 100644 index 291e5a193ba..00000000000 --- a/docs/zh-CN/experiments/proposals/model-config.md +++ /dev/null @@ -1,42 +0,0 @@ ---- -read_when: - - 探索未来模型选择和认证配置文件的方案 -summary: 探索:模型配置、认证配置文件和回退行为 -title: 模型配置探索 -x-i18n: - generated_at: "2026-02-01T20:25:05Z" - model: claude-opus-4-5 - provider: pi - source_hash: 48623233d80f874c0ae853b51f888599cf8b50ae6fbfe47f6d7b0216bae9500b - source_path: experiments/proposals/model-config.md - workflow: 14 ---- - -# 模型配置(探索) - -本文档记录了未来模型配置的**构想**。这不是正式的发布规范。如需了解当前行为,请参阅: - -- [模型](/concepts/models) -- [模型故障转移](/concepts/model-failover) -- [OAuth + 配置文件](/concepts/oauth) - -## 动机 - -运营者希望: - -- 每个提供商支持多个认证配置文件(个人 vs 工作)。 -- 简单的 `/model` 选择,并具有可预测的回退行为。 -- 文本模型与图像模型之间有清晰的分离。 - -## 可能的方向(高层级) - -- 保持模型选择简洁:`provider/model` 加可选别名。 -- 允许提供商拥有多个认证配置文件,并指定明确的顺序。 -- 使用全局回退列表,使所有会话以一致的方式进行故障转移。 -- 仅在明确配置时才覆盖图像路由。 - -## 待解决的问题 - -- 配置文件轮换应该按提供商还是按模型进行? -- UI 应如何为会话展示配置文件选择? -- 从旧版配置键迁移的最安全路径是什么? diff --git a/docs/zh-CN/experiments/research/memory.md b/docs/zh-CN/experiments/research/memory.md deleted file mode 100644 index 6f5b521c06c..00000000000 --- a/docs/zh-CN/experiments/research/memory.md +++ /dev/null @@ -1,235 +0,0 @@ ---- -read_when: - - 设计超越每日 Markdown 日志的工作区记忆(~/.openclaw/workspace) - - Deciding: standalone CLI vs deep OpenClaw integration - - 添加离线回忆 + 反思(retain/recall/reflect) -summary: 研究笔记:Clawd 工作区的离线记忆系统(Markdown 作为数据源 + 派生索引) -title: 工作区记忆研究 -x-i18n: - generated_at: "2026-02-03T10:06:14Z" - model: claude-opus-4-5 - provider: pi - source_hash: 1753c8ee6284999fab4a94ff5fae7421c85233699c9d3088453d0c2133ac0feb - source_path: experiments/research/memory.md - workflow: 15 ---- - -# 工作区记忆 v2(离线):研究笔记 - -目标:Clawd 风格的工作区(`agents.defaults.workspace`,默认 `~/.openclaw/workspace`),其中"记忆"以每天一个 Markdown 文件(`memory/YYYY-MM-DD.md`)加上一小组稳定文件(例如 `memory.md`、`SOUL.md`)的形式存储。 - -本文档提出一种**离线优先**的记忆架构,保持 Markdown 作为规范的、可审查的数据源,但通过派生索引添加**结构化回忆**(搜索、实体摘要、置信度更新)。 - -## 为什么要改变? - -当前设置(每天一个文件)非常适合: - -- "仅追加"式日志记录 -- 人工编辑 -- git 支持的持久性 + 可审计性 -- 低摩擦捕获("直接写下来") - -但它在以下方面较弱: - -- 高召回率检索("我们对 X 做了什么决定?"、"上次我们尝试 Y 时?") -- 以实体为中心的答案("告诉我关于 Alice / The Castle / warelay 的信息")而无需重读多个文件 -- 观点/偏好稳定性(以及变化时的证据) -- 时间约束("2025 年 11 月期间什么是真实的?")和冲突解决 - -## 设计目标 - -- **离线**:无需网络即可工作;可在笔记本电脑/Castle 上运行;无云依赖。 -- **可解释**:检索的项目应该可归因(文件 + 位置)并与推理分离。 -- **低仪式感**:每日日志保持 Markdown,无需繁重的 schema 工作。 -- **增量式**:v1 仅使用 FTS 就很有用;语义/向量和图是可选升级。 -- **对智能体友好**:使"在 token 预算内回忆"变得简单(返回小型事实包)。 - -## 北极星模型(Hindsight × Letta) - -需要融合两个部分: - -1. **Letta/MemGPT 风格的控制循环** - -- 保持一个小的"核心"始终在上下文中(角色 + 关键用户事实) -- 其他所有内容都在上下文之外,通过工具检索 -- 记忆写入是显式的工具调用(append/replace/insert),持久化后在下一轮重新注入 - -2. **Hindsight 风格的记忆基底** - -- 分离观察到的、相信的和总结的内容 -- 支持 retain/recall/reflect -- 带有置信度的观点可以随证据演变 -- 实体感知检索 + 时间查询(即使没有完整的知识图谱) - -## 提议的架构(Markdown 数据源 + 派生索引) - -### 规范存储(git 友好) - -保持 `~/.openclaw/workspace` 作为规范的人类可读记忆。 - -建议的工作区布局: - -``` -~/.openclaw/workspace/ - memory.md # 小型:持久事实 + 偏好(类似核心) - memory/ - YYYY-MM-DD.md # 每日日志(追加;叙事) - bank/ # "类型化"记忆页面(稳定、可审查) - world.md # 关于世界的客观事实 - experience.md # 智能体做了什么(第一人称) - opinions.md # 主观偏好/判断 + 置信度 + 证据指针 - entities/ - Peter.md - The-Castle.md - warelay.md - ... -``` - -注意: - -- **每日日志保持为每日日志**。无需将其转换为 JSON。 -- `bank/` 文件是**经过整理的**,由反思任务生成,仍可手动编辑。 -- `memory.md` 保持"小型 + 类似核心":你希望 Clawd 每次会话都能看到的内容。 - -### 派生存储(机器回忆) - -在工作区下添加派生索引(不一定需要 git 跟踪): - -``` -~/.openclaw/workspace/.memory/index.sqlite -``` - -后端支持: - -- 用于事实 + 实体链接 + 观点元数据的 SQLite schema -- SQLite **FTS5** 用于词法回忆(快速、小巧、离线) -- 可选的嵌入表用于语义回忆(仍然离线) - -索引始终**可从 Markdown 重建**。 - -## Retain / Recall / Reflect(操作循环) - -### Retain:将每日日志规范化为"事实" - -Hindsight 在这里重要的关键洞察:存储**叙事性、自包含的事实**,而不是微小的片段。 - -`memory/YYYY-MM-DD.md` 的实用规则: - -- 在一天结束时(或期间),添加一个 `## Retain` 部分,包含 2-5 个要点: - - 叙事性(保留跨轮上下文) - - 自包含(独立时也有意义) - - 标记类型 + 实体提及 - -示例: - -``` -## Retain -- W @Peter: Currently in Marrakech (Nov 27–Dec 1, 2025) for Andy's birthday. -- B @warelay: I fixed the Baileys WS crash by wrapping connection.update handlers in try/catch (see memory/2025-11-27.md). -- O(c=0.95) @Peter: Prefers concise replies (<1500 chars) on WhatsApp; long content goes into files. -``` - -最小化解析: - -- 类型前缀:`W`(世界)、`B`(经历/传记)、`O`(观点)、`S`(观察/摘要;通常是生成的) -- 实体:`@Peter`、`@warelay` 等(slug 映射到 `bank/entities/*.md`) -- 观点置信度:`O(c=0.0..1.0)` 可选 - -如果你不想让作者考虑这些:反思任务可以从日志的其余部分推断这些要点,但有一个显式的 `## Retain` 部分是最简单的"质量杠杆"。 - -### Recall:对派生索引的查询 - -Recall 应支持: - -- **词法**:"查找精确的术语/名称/命令"(FTS5) -- **实体**:"告诉我关于 X 的信息"(实体页面 + 实体链接的事实) -- **时间**:"11 月 27 日前后发生了什么"/"自上周以来" -- **观点**:"Peter 偏好什么?"(带置信度 + 证据) - -返回格式应对智能体友好并引用来源: - -- `kind`(`world|experience|opinion|observation`) -- `timestamp`(来源日期,或如果存在则提取的时间范围) -- `entities`(`["Peter","warelay"]`) -- `content`(叙事性事实) -- `source`(`memory/2025-11-27.md#L12` 等) - -### Reflect:生成稳定页面 + 更新信念 - -反思是一个定时任务(每日或心跳 `ultrathink`),它: - -- 根据最近的事实更新 `bank/entities/*.md`(实体摘要) -- 根据强化/矛盾更新 `bank/opinions.md` 置信度 -- 可选地提议对 `memory.md`("类似核心"的持久事实)的编辑 - -观点演变(简单、可解释): - -- 每个观点有: - - 陈述 - - 置信度 `c ∈ [0,1]` - - last_updated - - 证据链接(支持 + 矛盾的事实 ID) -- 当新事实到达时: - - 通过实体重叠 + 相似性找到候选观点(先 FTS,后嵌入) - - 通过小幅增量更新置信度;大幅跳跃需要强矛盾 + 重复证据 - -## CLI 集成:独立 vs 深度集成 - -建议:**深度集成到 OpenClaw**,但保持可分离的核心库。 - -### 为什么要集成到 OpenClaw? - -- OpenClaw 已经知道: - - 工作区路径(`agents.defaults.workspace`) - - 会话模型 + 心跳 - - 日志记录 + 故障排除模式 -- 你希望智能体自己调用工具: - - `openclaw memory recall "…" --k 25 --since 30d` - - `openclaw memory reflect --since 7d` - -### 为什么仍要分离库? - -- 保持记忆逻辑可测试,无需 Gateway 网关/运行时 -- 可从其他上下文重用(本地脚本、未来的桌面应用等) - -形态: -记忆工具预计是一个小型 CLI + 库层,但这仅是探索性的。 - -## "S-Collide" / SuCo:何时使用(研究) - -如果"S-Collide"指的是 **SuCo(Subspace Collision)**:这是一种 ANN 检索方法,通过在子空间中使用学习/结构化碰撞来实现强召回/延迟权衡(论文:arXiv 2411.14754,2024)。 - -对于 `~/.openclaw/workspace` 的务实观点: - -- **不要从** SuCo 开始。 -- 从 SQLite FTS +(可选的)简单嵌入开始;你会立即获得大部分 UX 收益。 -- 仅在以下情况下考虑 SuCo/HNSW/ScaNN 级别的解决方案: - - 语料库很大(数万/数十万个块) - - 暴力嵌入搜索变得太慢 - - 召回质量明显受到词法搜索的瓶颈限制 - -离线友好的替代方案(按复杂性递增): - -- SQLite FTS5 + 元数据过滤(零 ML) -- 嵌入 + 暴力搜索(如果块数量低,效果出奇地好) -- HNSW 索引(常见、稳健;需要库绑定) -- SuCo(研究级;如果有可嵌入的可靠实现则很有吸引力) - -开放问题: - -- 对于你的机器(笔记本 + 台式机)上的"个人助理记忆",**最佳**的离线嵌入模型是什么? - - 如果你已经有 Ollama:使用本地模型嵌入;否则在工具链中附带一个小型嵌入模型。 - -## 最小可用试点 - -如果你想要一个最小但仍有用的版本: - -- 添加 `bank/` 实体页面和每日日志中的 `## Retain` 部分。 -- 使用 SQLite FTS 进行带引用的回忆(路径 + 行号)。 -- 仅在召回质量或规模需要时添加嵌入。 - -## 参考资料 - -- Letta / MemGPT 概念:"核心记忆块" + "档案记忆" + 工具驱动的自编辑记忆。 -- Hindsight 技术报告:"retain / recall / reflect",四网络记忆,叙事性事实提取,观点置信度演变。 -- SuCo:arXiv 2411.14754(2024):"Subspace Collision"近似最近邻检索。