docs: remove experiments/ and design/ directories

Delete all experiment plans, proposals, research docs, and the
kilo-gateway-integration design doc. These are internal planning
docs that do not belong on the public docs site.

- 12 English experiment files
- 5 zh-CN experiment translations
- 1 design doc (kilo-gateway-integration)
- Remove nav groups from docs.json (English + zh-CN)
- Remove 3 redirects pointing to deleted experiment pages
- Remove dead experiment links from hubs.md

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
This commit is contained in:
Vincent Koc
2026-03-18 00:55:47 -07:00
parent 0ae3e70a5c
commit da2289869d
21 changed files with 0 additions and 3929 deletions

View File

@@ -1,542 +0,0 @@
---
title: "Kilo Gateway Integration Design"
summary: "Design doc for integrating Kilo Gateway as a first-class OpenClaw provider"
read_when:
- Working on the Kilo Gateway provider integration
- Understanding provider integration patterns
---
# Kilo Gateway Provider Integration Design
## Overview
This document outlines the design for integrating "Kilo Gateway" as a first-class provider in OpenClaw, modeled after the existing OpenRouter implementation. Kilo Gateway uses an OpenAI-compatible completions API with a different base URL.
## Design Decisions
### 1. Provider Naming
**Recommendation: `kilocode`**
Rationale:
- Matches the user config example provided (`kilocode` provider key)
- Consistent with existing provider naming patterns (e.g., `openrouter`, `opencode`, `moonshot`)
- Short and memorable
- Avoids confusion with generic "kilo" or "gateway" terms
Alternative considered: `kilo-gateway` - rejected because hyphenated names are less common in the codebase and `kilocode` is more concise.
### 2. Default Model Reference
**Recommendation: `kilocode/anthropic/claude-opus-4.6`**
Rationale:
- Based on user config example
- Claude Opus 4.5 is a capable default model
- Explicit model selection avoids reliance on auto-routing
### 3. Base URL Configuration
**Recommendation: Hardcoded default with config override**
- **Default Base URL:** `https://api.kilo.ai/api/gateway/`
- **Configurable:** Yes, via `models.providers.kilocode.baseUrl`
This matches the pattern used by other providers like Moonshot, Venice, and Synthetic.
### 4. Model Scanning
**Recommendation: No dedicated model scanning endpoint initially**
Rationale:
- Kilo Gateway proxies to OpenRouter, so models are dynamic
- Users can manually configure models in their config
- If Kilo Gateway exposes a `/models` endpoint in the future, scanning can be added
### 5. Special Handling
**Recommendation: Inherit OpenRouter behavior for Anthropic models**
Since Kilo Gateway proxies to OpenRouter, the same special handling should apply:
- Cache TTL eligibility for `anthropic/*` models
- Extra params (cacheControlTtl) for `anthropic/*` models
- Transcript policy follows OpenRouter patterns
## Files to Modify
### Core Credential Management
#### 1. `src/commands/onboard-auth.credentials.ts`
Add:
```typescript
export const KILOCODE_DEFAULT_MODEL_REF = "kilocode/anthropic/claude-opus-4.6";
export async function setKilocodeApiKey(key: string, agentDir?: string) {
upsertAuthProfile({
profileId: "kilocode:default",
credential: {
type: "api_key",
provider: "kilocode",
key,
},
agentDir: resolveAuthAgentDir(agentDir),
});
}
```
#### 2. `src/agents/model-auth.ts`
Add to `envMap` in `resolveEnvApiKey()`:
```typescript
const envMap: Record<string, string> = {
// ... existing entries
kilocode: "KILOCODE_API_KEY",
};
```
#### 3. `src/config/io.ts`
Add to `SHELL_ENV_EXPECTED_KEYS`:
```typescript
const SHELL_ENV_EXPECTED_KEYS = [
// ... existing entries
"KILOCODE_API_KEY",
];
```
### Config Application
#### 4. `src/commands/onboard-auth.config-core.ts`
Add new functions:
```typescript
export const KILOCODE_BASE_URL = "https://api.kilo.ai/api/gateway/";
export function applyKilocodeProviderConfig(cfg: OpenClawConfig): OpenClawConfig {
const models = { ...cfg.agents?.defaults?.models };
models[KILOCODE_DEFAULT_MODEL_REF] = {
...models[KILOCODE_DEFAULT_MODEL_REF],
alias: models[KILOCODE_DEFAULT_MODEL_REF]?.alias ?? "Kilo Gateway",
};
const providers = { ...cfg.models?.providers };
const existingProvider = providers.kilocode;
const { apiKey: existingApiKey, ...existingProviderRest } = (existingProvider ?? {}) as Record<
string,
unknown
> as { apiKey?: string };
const resolvedApiKey = typeof existingApiKey === "string" ? existingApiKey : undefined;
const normalizedApiKey = resolvedApiKey?.trim();
providers.kilocode = {
...existingProviderRest,
baseUrl: KILOCODE_BASE_URL,
api: "openai-completions",
...(normalizedApiKey ? { apiKey: normalizedApiKey } : {}),
};
return {
...cfg,
agents: {
...cfg.agents,
defaults: {
...cfg.agents?.defaults,
models,
},
},
models: {
mode: cfg.models?.mode ?? "merge",
providers,
},
};
}
export function applyKilocodeConfig(cfg: OpenClawConfig): OpenClawConfig {
const next = applyKilocodeProviderConfig(cfg);
const existingModel = next.agents?.defaults?.model;
return {
...next,
agents: {
...next.agents,
defaults: {
...next.agents?.defaults,
model: {
...(existingModel && "fallbacks" in (existingModel as Record<string, unknown>)
? {
fallbacks: (existingModel as { fallbacks?: string[] }).fallbacks,
}
: undefined),
primary: KILOCODE_DEFAULT_MODEL_REF,
},
},
},
};
}
```
### Auth Choice System
#### 5. `src/commands/onboard-types.ts`
Add to `AuthChoice` type:
```typescript
export type AuthChoice =
// ... existing choices
"kilocode-api-key";
// ...
```
Add to `OnboardOptions`:
```typescript
export type OnboardOptions = {
// ... existing options
kilocodeApiKey?: string;
// ...
};
```
#### 6. `src/commands/auth-choice-options.ts`
Add to `AuthChoiceGroupId`:
```typescript
export type AuthChoiceGroupId =
// ... existing groups
"kilocode";
// ...
```
Add to `AUTH_CHOICE_GROUP_DEFS`:
```typescript
{
value: "kilocode",
label: "Kilo Gateway",
hint: "API key (OpenRouter-compatible)",
choices: ["kilocode-api-key"],
},
```
Add to `buildAuthChoiceOptions()`:
```typescript
options.push({
value: "kilocode-api-key",
label: "Kilo Gateway API key",
hint: "OpenRouter-compatible gateway",
});
```
#### 7. `src/commands/auth-choice.preferred-provider.ts`
Add mapping:
```typescript
const PREFERRED_PROVIDER_BY_AUTH_CHOICE: Partial<Record<AuthChoice, string>> = {
// ... existing mappings
"kilocode-api-key": "kilocode",
};
```
### Auth Choice Application
#### 8. `src/commands/auth-choice.apply.api-providers.ts`
Add import:
```typescript
import {
// ... existing imports
applyKilocodeConfig,
applyKilocodeProviderConfig,
KILOCODE_DEFAULT_MODEL_REF,
setKilocodeApiKey,
} from "./onboard-auth.js";
```
Add handling for `kilocode-api-key`:
```typescript
if (authChoice === "kilocode-api-key") {
const store = ensureAuthProfileStore(params.agentDir, {
allowKeychainPrompt: false,
});
const profileOrder = resolveAuthProfileOrder({
cfg: nextConfig,
store,
provider: "kilocode",
});
const existingProfileId = profileOrder.find((profileId) => Boolean(store.profiles[profileId]));
const existingCred = existingProfileId ? store.profiles[existingProfileId] : undefined;
let profileId = "kilocode:default";
let mode: "api_key" | "oauth" | "token" = "api_key";
let hasCredential = false;
if (existingProfileId && existingCred?.type) {
profileId = existingProfileId;
mode =
existingCred.type === "oauth" ? "oauth" : existingCred.type === "token" ? "token" : "api_key";
hasCredential = true;
}
if (!hasCredential && params.opts?.token && params.opts?.tokenProvider === "kilocode") {
await setKilocodeApiKey(normalizeApiKeyInput(params.opts.token), params.agentDir);
hasCredential = true;
}
if (!hasCredential) {
const envKey = resolveEnvApiKey("kilocode");
if (envKey) {
const useExisting = await params.prompter.confirm({
message: `Use existing KILOCODE_API_KEY (${envKey.source}, ${formatApiKeyPreview(envKey.apiKey)})?`,
initialValue: true,
});
if (useExisting) {
await setKilocodeApiKey(envKey.apiKey, params.agentDir);
hasCredential = true;
}
}
}
if (!hasCredential) {
const key = await params.prompter.text({
message: "Enter Kilo Gateway API key",
validate: validateApiKeyInput,
});
await setKilocodeApiKey(normalizeApiKeyInput(String(key)), params.agentDir);
hasCredential = true;
}
if (hasCredential) {
nextConfig = applyAuthProfileConfig(nextConfig, {
profileId,
provider: "kilocode",
mode,
});
}
{
const applied = await applyDefaultModelChoice({
config: nextConfig,
setDefaultModel: params.setDefaultModel,
defaultModel: KILOCODE_DEFAULT_MODEL_REF,
applyDefaultConfig: applyKilocodeConfig,
applyProviderConfig: applyKilocodeProviderConfig,
noteDefault: KILOCODE_DEFAULT_MODEL_REF,
noteAgentModel,
prompter: params.prompter,
});
nextConfig = applied.config;
agentModelOverride = applied.agentModelOverride ?? agentModelOverride;
}
return { config: nextConfig, agentModelOverride };
}
```
Also add tokenProvider mapping at the top of the function:
```typescript
if (params.opts.tokenProvider === "kilocode") {
authChoice = "kilocode-api-key";
}
```
### CLI Registration
#### 9. `src/cli/program/register.onboard.ts`
Add CLI option:
```typescript
.option("--kilocode-api-key <key>", "Kilo Gateway API key")
```
Add to action handler:
```typescript
kilocodeApiKey: opts.kilocodeApiKey as string | undefined,
```
Update auth-choice help text:
```typescript
.option(
"--auth-choice <choice>",
"Auth: setup-token|token|chutes|openai-codex|openai-api-key|openrouter-api-key|kilocode-api-key|ai-gateway-api-key|...",
)
```
### Non-Interactive Onboarding
#### 10. `src/commands/onboard-non-interactive/local/auth-choice.ts`
Add handling for `kilocode-api-key`:
```typescript
if (authChoice === "kilocode-api-key") {
const resolved = await resolveNonInteractiveApiKey({
provider: "kilocode",
cfg: baseConfig,
flagValue: opts.kilocodeApiKey,
flagName: "--kilocode-api-key",
envVar: "KILOCODE_API_KEY",
});
await setKilocodeApiKey(resolved.apiKey, agentDir);
nextConfig = applyAuthProfileConfig(nextConfig, {
profileId: "kilocode:default",
provider: "kilocode",
mode: "api_key",
});
// ... apply default model
}
```
### Export Updates
#### 11. `src/commands/onboard-auth.ts`
Add exports:
```typescript
export {
// ... existing exports
applyKilocodeConfig,
applyKilocodeProviderConfig,
KILOCODE_BASE_URL,
} from "./onboard-auth.config-core.js";
export {
// ... existing exports
KILOCODE_DEFAULT_MODEL_REF,
setKilocodeApiKey,
} from "./onboard-auth.credentials.js";
```
### Special Handling (Optional)
#### 12. `src/agents/pi-embedded-runner/cache-ttl.ts`
Add Kilo Gateway support for Anthropic models:
```typescript
export function isCacheTtlEligibleProvider(provider: string, modelId: string): boolean {
const normalizedProvider = provider.toLowerCase();
const normalizedModelId = modelId.toLowerCase();
if (normalizedProvider === "anthropic") return true;
if (normalizedProvider === "openrouter" && normalizedModelId.startsWith("anthropic/"))
return true;
if (normalizedProvider === "kilocode" && normalizedModelId.startsWith("anthropic/")) return true;
return false;
}
```
#### 13. `src/agents/transcript-policy.ts`
Add Kilo Gateway handling (similar to OpenRouter):
```typescript
const isKilocodeGemini = provider === "kilocode" && modelId.toLowerCase().includes("gemini");
// Include in needsNonImageSanitize check
const needsNonImageSanitize =
isGoogle || isAnthropic || isMistral || isOpenRouterGemini || isKilocodeGemini;
```
## Configuration Structure
### User Config Example
```json
{
"models": {
"mode": "merge",
"providers": {
"kilocode": {
"baseUrl": "https://api.kilo.ai/api/gateway/",
"apiKey": "xxxxx",
"api": "openai-completions",
"models": [
{
"id": "anthropic/claude-opus-4.6",
"name": "Anthropic: Claude Opus 4.6"
},
{ "id": "minimax/minimax-m2.5:free", "name": "Minimax: Minimax M2.5" }
]
}
}
}
}
```
### Auth Profile Structure
```json
{
"profiles": {
"kilocode:default": {
"type": "api_key",
"provider": "kilocode",
"key": "xxxxx"
}
}
}
```
## Testing Considerations
1. **Unit Tests:**
- Test `setKilocodeApiKey()` writes correct profile
- Test `applyKilocodeConfig()` sets correct defaults
- Test `resolveEnvApiKey("kilocode")` returns correct env var
2. **Integration Tests:**
- Test setup flow with `--auth-choice kilocode-api-key`
- Test non-interactive setup with `--kilocode-api-key`
- Test model selection with `kilocode/` prefix
3. **E2E Tests:**
- Test actual API calls through Kilo Gateway (live tests)
## Migration Notes
- No migration needed for existing users
- New users can immediately use `kilocode-api-key` auth choice
- Existing manual config with `kilocode` provider will continue to work
## Future Considerations
1. **Model Catalog:** If Kilo Gateway exposes a `/models` endpoint, add scanning support similar to `scanOpenRouterModels()`
2. **OAuth Support:** If Kilo Gateway adds OAuth, extend the auth system accordingly
3. **Rate Limiting:** Consider adding rate limit handling specific to Kilo Gateway if needed
4. **Documentation:** Add docs at `docs/providers/kilocode.md` explaining setup and usage
## Summary of Changes
| File | Change Type | Description |
| ----------------------------------------------------------- | ----------- | ----------------------------------------------------------------------- |
| `src/commands/onboard-auth.credentials.ts` | Add | `KILOCODE_DEFAULT_MODEL_REF`, `setKilocodeApiKey()` |
| `src/agents/model-auth.ts` | Modify | Add `kilocode` to `envMap` |
| `src/config/io.ts` | Modify | Add `KILOCODE_API_KEY` to shell env keys |
| `src/commands/onboard-auth.config-core.ts` | Add | `applyKilocodeProviderConfig()`, `applyKilocodeConfig()` |
| `src/commands/onboard-types.ts` | Modify | Add `kilocode-api-key` to `AuthChoice`, add `kilocodeApiKey` to options |
| `src/commands/auth-choice-options.ts` | Modify | Add `kilocode` group and option |
| `src/commands/auth-choice.preferred-provider.ts` | Modify | Add `kilocode-api-key` mapping |
| `src/commands/auth-choice.apply.api-providers.ts` | Modify | Add `kilocode-api-key` handling |
| `src/cli/program/register.onboard.ts` | Modify | Add `--kilocode-api-key` option |
| `src/commands/onboard-non-interactive/local/auth-choice.ts` | Modify | Add non-interactive handling |
| `src/commands/onboard-auth.ts` | Modify | Export new functions |
| `src/agents/pi-embedded-runner/cache-ttl.ts` | Modify | Add kilocode support |
| `src/agents/transcript-policy.ts` | Modify | Add kilocode Gemini handling |

View File

@@ -535,10 +535,6 @@
"source": "/onboarding",
"destination": "/start/onboarding"
},
{
"source": "/onboarding-config-protocol",
"destination": "/experiments/onboarding-config-protocol"
},
{
"source": "/pairing",
"destination": "/channels/pairing"
@@ -559,10 +555,6 @@
"source": "/presence",
"destination": "/concepts/presence"
},
{
"source": "/proposals/model-config",
"destination": "/experiments/proposals/model-config"
},
{
"source": "/provider-routing",
"destination": "/channels/channel-routing"
@@ -583,10 +575,6 @@
"source": "/remote-gateway-readme",
"destination": "/gateway/remote-gateway-readme"
},
{
"source": "/research/memory",
"destination": "/experiments/research/memory"
},
{
"source": "/rpc",
"destination": "/reference/rpc"
@@ -1358,21 +1346,6 @@
{
"group": "Release policy",
"pages": ["reference/RELEASING", "reference/test"]
},
{
"group": "Experiments",
"pages": [
"design/kilo-gateway-integration",
"experiments/onboarding-config-protocol",
"experiments/plans/acp-thread-bound-agents",
"experiments/plans/acp-unified-streaming-refactor",
"experiments/plans/browser-evaluate-cdp-refactor",
"experiments/plans/openresponses-gateway",
"experiments/plans/pty-process-supervision",
"experiments/plans/session-binding-channel-agnostic",
"experiments/research/memory",
"experiments/proposals/model-config"
]
}
]
},
@@ -1938,17 +1911,6 @@
{
"group": "发布策略",
"pages": ["zh-CN/reference/RELEASING", "zh-CN/reference/test"]
},
{
"group": "实验性功能",
"pages": [
"zh-CN/experiments/onboarding-config-protocol",
"zh-CN/experiments/plans/openresponses-gateway",
"zh-CN/experiments/plans/cron-add-hardening",
"zh-CN/experiments/plans/group-policy-hardening",
"zh-CN/experiments/research/memory",
"zh-CN/experiments/proposals/model-config"
]
}
]
},

View File

@@ -1,43 +0,0 @@
---
summary: "RPC protocol notes for setup wizard and config schema"
read_when: "Changing setup wizard steps or config schema endpoints"
title: "Onboarding and Config Protocol"
---
# Onboarding + Config Protocol
Purpose: shared onboarding + config surfaces across CLI, macOS app, and Web UI.
## Components
- Wizard engine (shared session + prompts + onboarding state).
- CLI onboarding uses the same wizard flow as the UI clients.
- Gateway RPC exposes wizard + config schema endpoints.
- macOS onboarding uses the wizard step model.
- Web UI renders config forms from JSON Schema + UI hints.
## Gateway RPC
- `wizard.start` params: `{ mode?: "local"|"remote", workspace?: string }`
- `wizard.next` params: `{ sessionId, answer?: { stepId, value? } }`
- `wizard.cancel` params: `{ sessionId }`
- `wizard.status` params: `{ sessionId }`
- `config.schema` params: `{}`
- `config.schema.lookup` params: `{ path }`
- `path` accepts standard config segments plus slash-delimited plugin ids, for example `plugins.entries.pack/one.config`.
Responses (shape)
- Wizard: `{ sessionId, done, step?, status?, error? }`
- Config schema: `{ schema, uiHints, version, generatedAt }`
- Config schema lookup: `{ path, schema, hint?, hintPath?, children[] }`
## UI Hints
- `uiHints` keyed by path; optional metadata (label/help/group/order/advanced/sensitive/placeholder).
- Sensitive fields render as password inputs; no redaction layer.
- Unsupported schema nodes fall back to the raw JSON editor.
## Notes
- This doc is the single place to track protocol refactors for onboarding/config.

View File

@@ -1,375 +0,0 @@
# ACP Persistent Bindings for Discord Channels and Telegram Topics
Status: Draft
## Summary
Introduce persistent ACP bindings that map:
- Discord channels (and existing threads, where needed), and
- Telegram forum topics in groups/supergroups (`chatId:topic:topicId`)
to long-lived ACP sessions, with binding state stored in top-level `bindings[]` entries using explicit binding types.
This makes ACP usage in high-traffic messaging channels predictable and durable, so users can create dedicated channels/topics such as `codex`, `claude-1`, or `claude-myrepo`.
## Why
Current thread-bound ACP behavior is optimized for ephemeral Discord thread workflows. Telegram does not have the same thread model; it has forum topics in groups/supergroups. Users want stable, always-on ACP “workspaces” in chat surfaces, not only temporary thread sessions.
## Goals
- Support durable ACP binding for:
- Discord channels/threads
- Telegram forum topics (groups/supergroups)
- Make binding source-of-truth config-driven.
- Keep `/acp`, `/new`, `/reset`, `/focus`, and delivery behavior consistent across Discord and Telegram.
- Preserve existing temporary binding flows for ad-hoc usage.
## Non-Goals
- Full redesign of ACP runtime/session internals.
- Removing existing ephemeral binding flows.
- Expanding to every channel in the first iteration.
- Implementing Telegram channel direct-messages topics (`direct_messages_topic_id`) in this phase.
- Implementing Telegram private-chat topic variants in this phase.
## UX Direction
### 1) Two binding types
- **Persistent binding**: saved in config, reconciled on startup, intended for “named workspace” channels/topics.
- **Temporary binding**: runtime-only, expires by idle/max-age policy.
### 2) Command behavior
- `/acp spawn ... --thread here|auto|off` remains available.
- Add explicit bind lifecycle controls:
- `/acp bind [session|agent] [--persist]`
- `/acp unbind [--persist]`
- `/acp status` includes whether binding is `persistent` or `temporary`.
- In bound conversations, `/new` and `/reset` reset the bound ACP session in place and keep the binding attached.
### 3) Conversation identity
- Use canonical conversation IDs:
- Discord: channel/thread ID.
- Telegram topic: `chatId:topic:topicId`.
- Never key Telegram bindings by bare topic ID alone.
## Config Model (Proposed)
Unify routing and persistent ACP binding configuration in top-level `bindings[]` with explicit `type` discriminator:
```jsonc
{
"agents": {
"list": [
{
"id": "main",
"default": true,
"workspace": "~/.openclaw/workspace-main",
"runtime": { "type": "embedded" },
},
{
"id": "codex",
"workspace": "~/.openclaw/workspace-codex",
"runtime": {
"type": "acp",
"acp": {
"agent": "codex",
"backend": "acpx",
"mode": "persistent",
"cwd": "/workspace/repo-a",
},
},
},
{
"id": "claude",
"workspace": "~/.openclaw/workspace-claude",
"runtime": {
"type": "acp",
"acp": {
"agent": "claude",
"backend": "acpx",
"mode": "persistent",
"cwd": "/workspace/repo-b",
},
},
},
],
},
"acp": {
"enabled": true,
"backend": "acpx",
"allowedAgents": ["codex", "claude"],
},
"bindings": [
// Route bindings (existing behavior)
{
"type": "route",
"agentId": "main",
"match": { "channel": "discord", "accountId": "default" },
},
{
"type": "route",
"agentId": "main",
"match": { "channel": "telegram", "accountId": "default" },
},
// Persistent ACP conversation bindings
{
"type": "acp",
"agentId": "codex",
"match": {
"channel": "discord",
"accountId": "default",
"peer": { "kind": "channel", "id": "222222222222222222" },
},
"acp": {
"label": "codex-main",
"mode": "persistent",
"cwd": "/workspace/repo-a",
"backend": "acpx",
},
},
{
"type": "acp",
"agentId": "claude",
"match": {
"channel": "discord",
"accountId": "default",
"peer": { "kind": "channel", "id": "333333333333333333" },
},
"acp": {
"label": "claude-repo-b",
"mode": "persistent",
"cwd": "/workspace/repo-b",
},
},
{
"type": "acp",
"agentId": "codex",
"match": {
"channel": "telegram",
"accountId": "default",
"peer": { "kind": "group", "id": "-1001234567890:topic:42" },
},
"acp": {
"label": "tg-codex-42",
"mode": "persistent",
},
},
],
"channels": {
"discord": {
"guilds": {
"111111111111111111": {
"channels": {
"222222222222222222": {
"enabled": true,
"requireMention": false,
},
"333333333333333333": {
"enabled": true,
"requireMention": false,
},
},
},
},
},
"telegram": {
"groups": {
"-1001234567890": {
"topics": {
"42": {
"requireMention": false,
},
},
},
},
},
},
}
```
### Minimal Example (No Per-Binding ACP Overrides)
```jsonc
{
"agents": {
"list": [
{ "id": "main", "default": true, "runtime": { "type": "embedded" } },
{
"id": "codex",
"runtime": {
"type": "acp",
"acp": { "agent": "codex", "backend": "acpx", "mode": "persistent" },
},
},
{
"id": "claude",
"runtime": {
"type": "acp",
"acp": { "agent": "claude", "backend": "acpx", "mode": "persistent" },
},
},
],
},
"acp": { "enabled": true, "backend": "acpx" },
"bindings": [
{
"type": "route",
"agentId": "main",
"match": { "channel": "discord", "accountId": "default" },
},
{
"type": "route",
"agentId": "main",
"match": { "channel": "telegram", "accountId": "default" },
},
{
"type": "acp",
"agentId": "codex",
"match": {
"channel": "discord",
"accountId": "default",
"peer": { "kind": "channel", "id": "222222222222222222" },
},
},
{
"type": "acp",
"agentId": "claude",
"match": {
"channel": "discord",
"accountId": "default",
"peer": { "kind": "channel", "id": "333333333333333333" },
},
},
{
"type": "acp",
"agentId": "codex",
"match": {
"channel": "telegram",
"accountId": "default",
"peer": { "kind": "group", "id": "-1009876543210:topic:5" },
},
},
],
}
```
Notes:
- `bindings[].type` is explicit:
- `route`: normal agent routing.
- `acp`: persistent ACP harness binding for a matched conversation.
- For `type: "acp"`, `match.peer.id` is the canonical conversation key:
- Discord channel/thread: raw channel/thread ID.
- Telegram topic: `chatId:topic:topicId`.
- `bindings[].acp.backend` is optional. Backend fallback order:
1. `bindings[].acp.backend`
2. `agents.list[].runtime.acp.backend`
3. global `acp.backend`
- `mode`, `cwd`, and `label` follow the same override pattern (`binding override -> agent runtime default -> global/default behavior`).
- Keep existing `session.threadBindings.*` and `channels.discord.threadBindings.*` for temporary binding policies.
- Persistent entries declare desired state; runtime reconciles to actual ACP sessions/bindings.
- One active ACP binding per conversation node is the intended model.
- Backward compatibility: missing `type` is interpreted as `route` for legacy entries.
### Backend Selection
- ACP session initialization already uses configured backend selection during spawn (`acp.backend` today).
- This proposal extends spawn/reconcile logic to prefer typed ACP binding overrides:
- `bindings[].acp.backend` for conversation-local override.
- `agents.list[].runtime.acp.backend` for per-agent defaults.
- If no override exists, keep current behavior (`acp.backend` default).
## Architecture Fit in Current System
### Reuse existing components
- `SessionBindingService` already supports channel-agnostic conversation references.
- ACP spawn/bind flows already support binding through service APIs.
- Telegram already carries topic/thread context via `MessageThreadId` and `chatId`.
### New/extended components
- **Telegram binding adapter** (parallel to Discord adapter):
- register adapter per Telegram account,
- resolve/list/bind/unbind/touch by canonical conversation ID.
- **Typed binding resolver/index**:
- split `bindings[]` into `route` and `acp` views,
- keep `resolveAgentRoute` on `route` bindings only,
- resolve persistent ACP intent from `acp` bindings only.
- **Inbound binding resolution for Telegram**:
- resolve bound session before route finalization (Discord already does this).
- **Persistent binding reconciler**:
- on startup: load configured top-level `type: "acp"` bindings, ensure ACP sessions exist, ensure bindings exist.
- on config change: apply deltas safely.
- **Cutover model**:
- no channel-local ACP binding fallback is read,
- persistent ACP bindings are sourced only from top-level `bindings[].type="acp"` entries.
## Phased Delivery
### Phase 1: Typed binding schema foundation
- Extend config schema to support `bindings[].type` discriminator:
- `route`,
- `acp` with optional `acp` override object (`mode`, `backend`, `cwd`, `label`).
- Extend agent schema with runtime descriptor to mark ACP-native agents (`agents.list[].runtime.type`).
- Add parser/indexer split for route vs ACP bindings.
### Phase 2: Runtime resolution + Discord/Telegram parity
- Resolve persistent ACP bindings from top-level `type: "acp"` entries for:
- Discord channels/threads,
- Telegram forum topics (`chatId:topic:topicId` canonical IDs).
- Implement Telegram binding adapter and inbound bound-session override parity with Discord.
- Do not include Telegram direct/private topic variants in this phase.
### Phase 3: Command parity and resets
- Align `/acp`, `/new`, `/reset`, and `/focus` behavior in bound Telegram/Discord conversations.
- Ensure binding survives reset flows as configured.
### Phase 4: Hardening
- Better diagnostics (`/acp status`, startup reconciliation logs).
- Conflict handling and health checks.
## Guardrails and Policy
- Respect ACP enablement and sandbox restrictions exactly as today.
- Keep explicit account scoping (`accountId`) to avoid cross-account bleed.
- Fail closed on ambiguous routing.
- Keep mention/access policy behavior explicit per channel config.
## Testing Plan
- Unit:
- conversation ID normalization (especially Telegram topic IDs),
- reconciler create/update/delete paths,
- `/acp bind --persist` and unbind flows.
- Integration:
- inbound Telegram topic -> bound ACP session resolution,
- inbound Discord channel/thread -> persistent binding precedence.
- Regression:
- temporary bindings continue to work,
- unbound channels/topics keep current routing behavior.
## Open Questions
- Should `/acp spawn --thread auto` in Telegram topic default to `here`?
- Should persistent bindings always bypass mention-gating in bound conversations, or require explicit `requireMention=false`?
- Should `/focus` gain `--persist` as an alias for `/acp bind --persist`?
## Rollout
- Ship as opt-in per conversation (`bindings[].type="acp"` entry present).
- Start with Discord + Telegram only.
- Add docs with examples for:
- “one channel/topic per agent”
- “multiple channels/topics per same agent with different `cwd`
- “team naming patterns (`codex-1`, `claude-repo-x`)".

View File

@@ -1,800 +0,0 @@
---
summary: "Integrate ACP coding agents via a first-class ACP control plane in core and plugin-backed runtimes (acpx first)"
owner: "onutc"
status: "draft"
last_updated: "2026-02-25"
title: "ACP Thread Bound Agents"
---
# ACP Thread Bound Agents
## Overview
This plan defines how OpenClaw should support ACP coding agents in thread-capable channels (Discord first) with production-level lifecycle and recovery.
Related document:
- [Unified Runtime Streaming Refactor Plan](/experiments/plans/acp-unified-streaming-refactor)
Target user experience:
- a user spawns or focuses an ACP session into a thread
- user messages in that thread route to the bound ACP session
- agent output streams back to the same thread persona
- session can be persistent or one shot with explicit cleanup controls
## Decision summary
Long term recommendation is a hybrid architecture:
- OpenClaw core owns ACP control plane concerns
- session identity and metadata
- thread binding and routing decisions
- delivery invariants and duplicate suppression
- lifecycle cleanup and recovery semantics
- ACP runtime backend is pluggable
- first backend is an acpx-backed plugin service
- runtime does ACP transport, queueing, cancel, reconnect
OpenClaw should not reimplement ACP transport internals in core.
OpenClaw should not rely on a pure plugin-only interception path for routing.
## North-star architecture (holy grail)
Treat ACP as a first-class control plane in OpenClaw, with pluggable runtime adapters.
Non-negotiable invariants:
- every ACP thread binding references a valid ACP session record
- every ACP session has explicit lifecycle state (`creating`, `idle`, `running`, `cancelling`, `closed`, `error`)
- every ACP run has explicit run state (`queued`, `running`, `completed`, `failed`, `cancelled`)
- spawn, bind, and initial enqueue are atomic
- command retries are idempotent (no duplicate runs or duplicate Discord outputs)
- bound-thread channel output is a projection of ACP run events, never ad-hoc side effects
Long-term ownership model:
- `AcpSessionManager` is the single ACP writer and orchestrator
- manager lives in gateway process first; can be moved to a dedicated sidecar later behind the same interface
- per ACP session key, manager owns one in-memory actor (serialized command execution)
- adapters (`acpx`, future backends) are transport/runtime implementations only
Long-term persistence model:
- move ACP control-plane state to a dedicated SQLite store (WAL mode) under OpenClaw state dir
- keep `SessionEntry.acp` as compatibility projection during migration, not source-of-truth
- store ACP events append-only to support replay, crash recovery, and deterministic delivery
### Delivery strategy (bridge to holy-grail)
- short-term bridge
- keep current thread binding mechanics and existing ACP config surface
- fix metadata-gap bugs and route ACP turns through a single core ACP branch
- add idempotency keys and fail-closed routing checks immediately
- long-term cutover
- move ACP source-of-truth to control-plane DB + actors
- make bound-thread delivery purely event-projection based
- remove legacy fallback behavior that depends on opportunistic session-entry metadata
## Why not pure plugin only
Current plugin hooks are not sufficient for end to end ACP session routing without core changes.
- inbound routing from thread binding resolves to a session key in core dispatch first
- message hooks are fire-and-forget and cannot short-circuit the main reply path
- plugin commands are good for control operations but not for replacing core per-turn dispatch flow
Result:
- ACP runtime can be pluginized
- ACP routing branch must exist in core
## Existing foundation to reuse
Already implemented and should remain canonical:
- thread binding target supports `subagent` and `acp`
- inbound thread routing override resolves by binding before normal dispatch
- outbound thread identity via webhook in reply delivery
- `/focus` and `/unfocus` flow with ACP target compatibility
- persistent binding store with restore on startup
- unbind lifecycle on archive, delete, unfocus, reset, and delete
This plan extends that foundation rather than replacing it.
## Architecture
### Boundary model
Core (must be in OpenClaw core):
- ACP session-mode dispatch branch in the reply pipeline
- delivery arbitration to avoid parent plus thread duplication
- ACP control-plane persistence (with `SessionEntry.acp` compatibility projection during migration)
- lifecycle unbind and runtime detach semantics tied to session reset/delete
Plugin backend (acpx implementation):
- ACP runtime worker supervision
- acpx process invocation and event parsing
- ACP command handlers (`/acp ...`) and operator UX
- backend-specific config defaults and diagnostics
### Runtime ownership model
- one gateway process owns ACP orchestration state
- ACP execution runs in supervised child processes via acpx backend
- process strategy is long lived per active ACP session key, not per message
This avoids startup cost on every prompt and keeps cancel and reconnect semantics reliable.
### Core runtime contract
Add a core ACP runtime contract so routing code does not depend on CLI details and can switch backends without changing dispatch logic:
```ts
export type AcpRuntimePromptMode = "prompt" | "steer";
export type AcpRuntimeHandle = {
sessionKey: string;
backend: string;
runtimeSessionName: string;
};
export type AcpRuntimeEvent =
| { type: "text_delta"; stream: "output" | "thought"; text: string }
| { type: "tool_call"; name: string; argumentsText: string }
| { type: "done"; usage?: Record<string, number> }
| { type: "error"; code: string; message: string; retryable?: boolean };
export interface AcpRuntime {
ensureSession(input: {
sessionKey: string;
agent: string;
mode: "persistent" | "oneshot";
cwd?: string;
env?: Record<string, string>;
idempotencyKey: string;
}): Promise<AcpRuntimeHandle>;
submit(input: {
handle: AcpRuntimeHandle;
text: string;
mode: AcpRuntimePromptMode;
idempotencyKey: string;
}): Promise<{ runtimeRunId: string }>;
stream(input: {
handle: AcpRuntimeHandle;
runtimeRunId: string;
onEvent: (event: AcpRuntimeEvent) => Promise<void> | void;
signal?: AbortSignal;
}): Promise<void>;
cancel(input: {
handle: AcpRuntimeHandle;
runtimeRunId?: string;
reason?: string;
idempotencyKey: string;
}): Promise<void>;
close(input: { handle: AcpRuntimeHandle; reason: string; idempotencyKey: string }): Promise<void>;
health?(): Promise<{ ok: boolean; details?: string }>;
}
```
Implementation detail:
- first backend: `AcpxRuntime` shipped as a plugin service
- core resolves runtime via registry and fails with explicit operator error when no ACP runtime backend is available
### Control-plane data model and persistence
Long-term source-of-truth is a dedicated ACP SQLite database (WAL mode), for transactional updates and crash-safe recovery:
- `acp_sessions`
- `session_key` (pk), `backend`, `agent`, `mode`, `cwd`, `state`, `created_at`, `updated_at`, `last_error`
- `acp_runs`
- `run_id` (pk), `session_key` (fk), `state`, `requester_message_id`, `idempotency_key`, `started_at`, `ended_at`, `error_code`, `error_message`
- `acp_bindings`
- `binding_key` (pk), `thread_id`, `channel_id`, `account_id`, `session_key` (fk), `expires_at`, `bound_at`
- `acp_events`
- `event_id` (pk), `run_id` (fk), `seq`, `kind`, `payload_json`, `created_at`
- `acp_delivery_checkpoint`
- `run_id` (pk/fk), `last_event_seq`, `last_discord_message_id`, `updated_at`
- `acp_idempotency`
- `scope`, `idempotency_key`, `result_json`, `created_at`, unique `(scope, idempotency_key)`
```ts
export type AcpSessionMeta = {
backend: string;
agent: string;
runtimeSessionName: string;
mode: "persistent" | "oneshot";
cwd?: string;
state: "idle" | "running" | "error";
lastActivityAt: number;
lastError?: string;
};
```
Storage rules:
- keep `SessionEntry.acp` as a compatibility projection during migration
- process ids and sockets stay in memory only
- durable lifecycle and run status live in ACP DB, not generic session JSON
- if runtime owner dies, gateway rehydrates from ACP DB and resumes from checkpoints
### Routing and delivery
Inbound:
- keep current thread binding lookup as first routing step
- if bound target is ACP session, route to ACP runtime branch instead of `getReplyFromConfig`
- explicit `/acp steer` command uses `mode: "steer"`
Outbound:
- ACP event stream is normalized to OpenClaw reply chunks
- delivery target is resolved through existing bound destination path
- when a bound thread is active for that session turn, parent channel completion is suppressed
Streaming policy:
- stream partial output with coalescing window
- configurable min interval and max chunk bytes to stay under Discord rate limits
- final message always emitted on completion or failure
### State machines and transaction boundaries
Session state machine:
- `creating -> idle -> running -> idle`
- `running -> cancelling -> idle | error`
- `idle -> closed`
- `error -> idle | closed`
Run state machine:
- `queued -> running -> completed`
- `running -> failed | cancelled`
- `queued -> cancelled`
Required transaction boundaries:
- spawn transaction
- create ACP session row
- create/update ACP thread binding row
- enqueue initial run row
- close transaction
- mark session closed
- delete/expire binding rows
- write final close event
- cancel transaction
- mark target run cancelling/cancelled with idempotency key
No partial success is allowed across these boundaries.
### Per-session actor model
`AcpSessionManager` runs one actor per ACP session key:
- actor mailbox serializes `submit`, `cancel`, `close`, and `stream` side effects
- actor owns runtime handle hydration and runtime adapter process lifecycle for that session
- actor writes run events in-order (`seq`) before any Discord delivery
- actor updates delivery checkpoints after successful outbound send
This removes cross-turn races and prevents duplicate or out-of-order thread output.
### Idempotency and delivery projection
All external ACP actions must carry idempotency keys:
- spawn idempotency key
- prompt/steer idempotency key
- cancel idempotency key
- close idempotency key
Delivery rules:
- Discord messages are derived from `acp_events` plus `acp_delivery_checkpoint`
- retries resume from checkpoint without re-sending already delivered chunks
- final reply emission is exactly-once per run from projection logic
### Recovery and self-healing
On gateway start:
- load non-terminal ACP sessions (`creating`, `idle`, `running`, `cancelling`, `error`)
- recreate actors lazily on first inbound event or eagerly under configured cap
- reconcile any `running` runs missing heartbeats and mark `failed` or recover via adapter
On inbound Discord thread message:
- if binding exists but ACP session is missing, fail closed with explicit stale-binding message
- optionally auto-unbind stale binding after operator-safe validation
- never silently route stale ACP bindings to normal LLM path
### Lifecycle and safety
Supported operations:
- cancel current run: `/acp cancel`
- unbind thread: `/unfocus`
- close ACP session: `/acp close`
- auto close idle sessions by effective TTL
TTL policy:
- effective TTL is minimum of
- global/session TTL
- Discord thread binding TTL
- ACP runtime owner TTL
Safety controls:
- allowlist ACP agents by name
- restrict workspace roots for ACP sessions
- env allowlist passthrough
- max concurrent ACP sessions per account and globally
- bounded restart backoff for runtime crashes
## Config surface
Core keys:
- `acp.enabled`
- `acp.dispatch.enabled` (independent ACP routing kill switch)
- `acp.backend` (default `acpx`)
- `acp.defaultAgent`
- `acp.allowedAgents[]`
- `acp.maxConcurrentSessions`
- `acp.stream.coalesceIdleMs`
- `acp.stream.maxChunkChars`
- `acp.runtime.ttlMinutes`
- `acp.controlPlane.store` (`sqlite` default)
- `acp.controlPlane.storePath`
- `acp.controlPlane.recovery.eagerActors`
- `acp.controlPlane.recovery.reconcileRunningAfterMs`
- `acp.controlPlane.checkpoint.flushEveryEvents`
- `acp.controlPlane.checkpoint.flushEveryMs`
- `acp.idempotency.ttlHours`
- `channels.discord.threadBindings.spawnAcpSessions`
Plugin/backend keys (acpx plugin section):
- backend command/path overrides
- backend env allowlist
- backend per-agent presets
- backend startup/stop timeouts
- backend max inflight runs per session
## Implementation specification
### Control-plane modules (new)
Add dedicated ACP control-plane modules in core:
- `src/acp/control-plane/manager.ts`
- owns ACP actors, lifecycle transitions, command serialization
- `src/acp/control-plane/store.ts`
- SQLite schema management, transactions, query helpers
- `src/acp/control-plane/events.ts`
- typed ACP event definitions and serialization
- `src/acp/control-plane/checkpoint.ts`
- durable delivery checkpoints and replay cursors
- `src/acp/control-plane/idempotency.ts`
- idempotency key reservation and response replay
- `src/acp/control-plane/recovery.ts`
- boot-time reconciliation and actor rehydrate plan
Compatibility bridge modules:
- `src/acp/runtime/session-meta.ts`
- remains temporarily for projection into `SessionEntry.acp`
- must stop being source-of-truth after migration cutover
### Required invariants (must enforce in code)
- ACP session creation and thread bind are atomic (single transaction)
- there is at most one active run per ACP session actor at a time
- event `seq` is strictly increasing per run
- delivery checkpoint never advances past last committed event
- idempotency replay returns previous success payload for duplicate command keys
- stale/missing ACP metadata cannot route into normal non-ACP reply path
### Core touchpoints
Core files to change:
- `src/auto-reply/reply/dispatch-from-config.ts`
- ACP branch calls `AcpSessionManager.submit` and event-projection delivery
- remove direct ACP fallback that bypasses control-plane invariants
- `src/auto-reply/reply/inbound-context.ts` (or nearest normalized context boundary)
- expose normalized routing keys and idempotency seeds for ACP control plane
- `src/config/sessions/types.ts`
- keep `SessionEntry.acp` as projection-only compatibility field
- `src/gateway/server-methods/sessions.ts`
- reset/delete/archive must call ACP manager close/unbind transaction path
- `src/infra/outbound/bound-delivery-router.ts`
- enforce fail-closed destination behavior for ACP bound session turns
- `src/discord/monitor/thread-bindings.ts`
- add ACP stale-binding validation helpers wired to control-plane lookups
- `src/auto-reply/reply/commands-acp.ts`
- route spawn/cancel/close/steer through ACP manager APIs
- `src/agents/acp-spawn.ts`
- stop ad-hoc metadata writes; call ACP manager spawn transaction
- `src/plugin-sdk/**` and plugin runtime bridge
- expose ACP backend registration and health semantics cleanly
Core files explicitly not replaced:
- `src/discord/monitor/message-handler.preflight.ts`
- keep thread binding override behavior as the canonical session-key resolver
### ACP runtime registry API
Add a core registry module:
- `src/acp/runtime/registry.ts`
Required API:
```ts
export type AcpRuntimeBackend = {
id: string;
runtime: AcpRuntime;
healthy?: () => boolean;
};
export function registerAcpRuntimeBackend(backend: AcpRuntimeBackend): void;
export function unregisterAcpRuntimeBackend(id: string): void;
export function getAcpRuntimeBackend(id?: string): AcpRuntimeBackend | null;
export function requireAcpRuntimeBackend(id?: string): AcpRuntimeBackend;
```
Behavior:
- `requireAcpRuntimeBackend` throws a typed ACP backend missing error when unavailable
- plugin service registers backend on `start` and unregisters on `stop`
- runtime lookups are read-only and process-local
### acpx runtime plugin contract (implementation detail)
For the first production backend (`extensions/acpx`), OpenClaw and acpx are
connected with a strict command contract:
- backend id: `acpx`
- plugin service id: `acpx-runtime`
- runtime handle encoding: `runtimeSessionName = acpx:v1:<base64url(json)>`
- encoded payload fields:
- `name` (acpx named session; uses OpenClaw `sessionKey`)
- `agent` (acpx agent command)
- `cwd` (session workspace root)
- `mode` (`persistent | oneshot`)
Command mapping:
- ensure session:
- `acpx --format json --json-strict --cwd <cwd> <agent> sessions ensure --name <name>`
- prompt turn:
- `acpx --format json --json-strict --cwd <cwd> <agent> prompt --session <name> --file -`
- cancel:
- `acpx --format json --json-strict --cwd <cwd> <agent> cancel --session <name>`
- close:
- `acpx --format json --json-strict --cwd <cwd> <agent> sessions close <name>`
Streaming:
- OpenClaw consumes ndjson events from `acpx --format json --json-strict`
- `text` => `text_delta/output`
- `thought` => `text_delta/thought`
- `tool_call` => `tool_call`
- `done` => `done`
- `error` => `error`
### Session schema patch
Patch `SessionEntry` in `src/config/sessions/types.ts`:
```ts
type SessionAcpMeta = {
backend: string;
agent: string;
runtimeSessionName: string;
mode: "persistent" | "oneshot";
cwd?: string;
state: "idle" | "running" | "error";
lastActivityAt: number;
lastError?: string;
};
```
Persisted field:
- `SessionEntry.acp?: SessionAcpMeta`
Migration rules:
- phase A: dual-write (`acp` projection + ACP SQLite source-of-truth)
- phase B: read-primary from ACP SQLite, fallback-read from legacy `SessionEntry.acp`
- phase C: migration command backfills missing ACP rows from valid legacy entries
- phase D: remove fallback-read and keep projection optional for UX only
- legacy fields (`cliSessionIds`, `claudeCliSessionId`) remain untouched
### Error contract
Add stable ACP error codes and user-facing messages:
- `ACP_BACKEND_MISSING`
- message: `ACP runtime backend is not configured. Install and enable the acpx runtime plugin.`
- `ACP_BACKEND_UNAVAILABLE`
- message: `ACP runtime backend is currently unavailable. Try again in a moment.`
- `ACP_SESSION_INIT_FAILED`
- message: `Could not initialize ACP session runtime.`
- `ACP_TURN_FAILED`
- message: `ACP turn failed before completion.`
Rules:
- return actionable user-safe message in-thread
- log detailed backend/system error only in runtime logs
- never silently fall back to normal LLM path when ACP routing was explicitly selected
### Duplicate delivery arbitration
Single routing rule for ACP bound turns:
- if an active thread binding exists for the target ACP session and requester context, deliver only to that bound thread
- do not also send to parent channel for the same turn
- if bound destination selection is ambiguous, fail closed with explicit error (no implicit parent fallback)
- if no active binding exists, use normal session destination behavior
### Observability and operational readiness
Required metrics:
- ACP spawn success/failure count by backend and error code
- ACP run latency percentiles (queue wait, runtime turn time, delivery projection time)
- ACP actor restart count and restart reason
- stale-binding detection count
- idempotency replay hit rate
- Discord delivery retry and rate-limit counters
Required logs:
- structured logs keyed by `sessionKey`, `runId`, `backend`, `threadId`, `idempotencyKey`
- explicit state transition logs for session and run state machines
- adapter command logs with redaction-safe arguments and exit summary
Required diagnostics:
- `/acp sessions` includes state, active run, last error, and binding status
- `/acp doctor` (or equivalent) validates backend registration, store health, and stale bindings
### Config precedence and effective values
ACP enablement precedence:
- account override: `channels.discord.accounts.<id>.threadBindings.spawnAcpSessions`
- channel override: `channels.discord.threadBindings.spawnAcpSessions`
- global ACP gate: `acp.enabled`
- dispatch gate: `acp.dispatch.enabled`
- backend availability: registered backend for `acp.backend`
Auto-enable behavior:
- when ACP is configured (`acp.enabled=true`, `acp.dispatch.enabled=true`, or
`acp.backend=acpx`), plugin auto-enable marks `plugins.entries.acpx.enabled=true`
unless denylisted or explicitly disabled
TTL effective value:
- `min(session ttl, discord thread binding ttl, acp runtime ttl)`
### Test map
Unit tests:
- `src/acp/runtime/registry.test.ts` (new)
- `src/auto-reply/reply/dispatch-from-config.acp.test.ts` (new)
- `src/infra/outbound/bound-delivery-router.test.ts` (extend ACP fail-closed cases)
- `src/config/sessions/types.test.ts` or nearest session-store tests (ACP metadata persistence)
Integration tests:
- `src/discord/monitor/reply-delivery.test.ts` (bound ACP delivery target behavior)
- `src/discord/monitor/message-handler.preflight*.test.ts` (bound ACP session-key routing continuity)
- acpx plugin runtime tests in backend package (service register/start/stop + event normalization)
Gateway e2e tests:
- `src/gateway/server.sessions.gateway-server-sessions-a.e2e.test.ts` (extend ACP reset/delete lifecycle coverage)
- ACP thread turn roundtrip e2e for spawn, message, stream, cancel, unfocus, restart recovery
### Rollout guard
Add independent ACP dispatch kill switch:
- `acp.dispatch.enabled` default `false` for first release
- when disabled:
- ACP spawn/focus control commands may still bind sessions
- ACP dispatch path does not activate
- user receives explicit message that ACP dispatch is disabled by policy
- after canary validation, default can be flipped to `true` in a later release
## Command and UX plan
### New commands
- `/acp spawn <agent-id> [--mode persistent|oneshot] [--thread auto|here|off]`
- `/acp cancel [session]`
- `/acp steer <instruction>`
- `/acp close [session]`
- `/acp sessions`
### Existing command compatibility
- `/focus <sessionKey>` continues to support ACP targets
- `/unfocus` keeps current semantics
- `/session idle` and `/session max-age` replace the old TTL override
## Phased rollout
### Phase 0 ADR and schema freeze
- ship ADR for ACP control-plane ownership and adapter boundaries
- freeze DB schema (`acp_sessions`, `acp_runs`, `acp_bindings`, `acp_events`, `acp_delivery_checkpoint`, `acp_idempotency`)
- define stable ACP error codes, event contract, and state-transition guards
### Phase 1 Control-plane foundation in core
- implement `AcpSessionManager` and per-session actor runtime
- implement ACP SQLite store and transaction helpers
- implement idempotency store and replay helpers
- implement event append + delivery checkpoint modules
- wire spawn/cancel/close APIs to manager with transactional guarantees
### Phase 2 Core routing and lifecycle integration
- route thread-bound ACP turns from dispatch pipeline into ACP manager
- enforce fail-closed routing when ACP binding/session invariants fail
- integrate reset/delete/archive/unfocus lifecycle with ACP close/unbind transactions
- add stale-binding detection and optional auto-unbind policy
### Phase 3 acpx backend adapter/plugin
- implement `acpx` adapter against runtime contract (`ensureSession`, `submit`, `stream`, `cancel`, `close`)
- add backend health checks and startup/teardown registration
- normalize acpx ndjson events into ACP runtime events
- enforce backend timeouts, process supervision, and restart/backoff policy
### Phase 4 Delivery projection and channel UX (Discord first)
- implement event-driven channel projection with checkpoint resume (Discord first)
- coalesce streaming chunks with rate-limit aware flush policy
- guarantee exactly-once final completion message per run
- ship `/acp spawn`, `/acp cancel`, `/acp steer`, `/acp close`, `/acp sessions`
### Phase 5 Migration and cutover
- introduce dual-write to `SessionEntry.acp` projection plus ACP SQLite source-of-truth
- add migration utility for legacy ACP metadata rows
- flip read path to ACP SQLite primary
- remove legacy fallback routing that depends on missing `SessionEntry.acp`
### Phase 6 Hardening, SLOs, and scale limits
- enforce concurrency limits (global/account/session), queue policies, and timeout budgets
- add full telemetry, dashboards, and alert thresholds
- chaos-test crash recovery and duplicate-delivery suppression
- publish runbook for backend outage, DB corruption, and stale-binding remediation
### Full implementation checklist
- core control-plane modules and tests
- DB migrations and rollback plan
- ACP manager API integration across dispatch and commands
- adapter registration interface in plugin runtime bridge
- acpx adapter implementation and tests
- thread-capable channel delivery projection logic with checkpoint replay (Discord first)
- lifecycle hooks for reset/delete/archive/unfocus
- stale-binding detector and operator-facing diagnostics
- config validation and precedence tests for all new ACP keys
- operational docs and troubleshooting runbook
## Test plan
Unit tests:
- ACP DB transaction boundaries (spawn/bind/enqueue atomicity, cancel, close)
- ACP state-machine transition guards for sessions and runs
- idempotency reservation/replay semantics across all ACP commands
- per-session actor serialization and queue ordering
- acpx event parser and chunk coalescer
- runtime supervisor restart and backoff policy
- config precedence and effective TTL calculation
- core ACP routing branch selection and fail-closed behavior when backend/session is invalid
Integration tests:
- fake ACP adapter process for deterministic streaming and cancel behavior
- ACP manager + dispatch integration with transactional persistence
- thread-bound inbound routing to ACP session key
- thread-bound outbound delivery suppresses parent channel duplication
- checkpoint replay recovers after delivery failure and resumes from last event
- plugin service registration and teardown of ACP runtime backend
Gateway e2e tests:
- spawn ACP with thread, exchange multi-turn prompts, unfocus
- gateway restart with persisted ACP DB and bindings, then continue same session
- concurrent ACP sessions in multiple threads have no cross-talk
- duplicate command retries (same idempotency key) do not create duplicate runs or replies
- stale-binding scenario yields explicit error and optional auto-clean behavior
## Risks and mitigations
- Duplicate deliveries during transition
- Mitigation: single destination resolver and idempotent event checkpoint
- Runtime process churn under load
- Mitigation: long lived per session owners + concurrency caps + backoff
- Plugin absent or misconfigured
- Mitigation: explicit operator-facing error and fail-closed ACP routing (no implicit fallback to normal session path)
- Config confusion between subagent and ACP gates
- Mitigation: explicit ACP keys and command feedback that includes effective policy source
- Control-plane store corruption or migration bugs
- Mitigation: WAL mode, backup/restore hooks, migration smoke tests, and read-only fallback diagnostics
- Actor deadlocks or mailbox starvation
- Mitigation: watchdog timers, actor health probes, and bounded mailbox depth with rejection telemetry
## Acceptance checklist
- ACP session spawn can create or bind a thread in a supported channel adapter (currently Discord)
- all thread messages route to bound ACP session only
- ACP outputs appear in the same thread identity with streaming or batches
- no duplicate output in parent channel for bound turns
- spawn+bind+initial enqueue are atomic in persistent store
- ACP command retries are idempotent and do not duplicate runs or outputs
- cancel, close, unfocus, archive, reset, and delete perform deterministic cleanup
- crash restart preserves mapping and resumes multi turn continuity
- concurrent thread bound ACP sessions work independently
- ACP backend missing state produces clear actionable error
- stale bindings are detected and surfaced explicitly (with optional safe auto-clean)
- control-plane metrics and diagnostics are available for operators
- new unit, integration, and e2e coverage passes
## Addendum: targeted refactors for current implementation (status)
These are non-blocking follow-ups to keep the ACP path maintainable after the current feature set lands.
### 1) Centralize ACP dispatch policy evaluation (completed)
- implemented via shared ACP policy helpers in `src/acp/policy.ts`
- dispatch, ACP command lifecycle handlers, and ACP spawn path now consume shared policy logic
### 2) Split ACP command handler by subcommand domain (completed)
- `src/auto-reply/reply/commands-acp.ts` is now a thin router
- subcommand behavior is split into:
- `src/auto-reply/reply/commands-acp/lifecycle.ts`
- `src/auto-reply/reply/commands-acp/runtime-options.ts`
- `src/auto-reply/reply/commands-acp/diagnostics.ts`
- shared helpers in `src/auto-reply/reply/commands-acp/shared.ts`
### 3) Split ACP session manager by responsibility (completed)
- manager is split into:
- `src/acp/control-plane/manager.ts` (public facade + singleton)
- `src/acp/control-plane/manager.core.ts` (manager implementation)
- `src/acp/control-plane/manager.types.ts` (manager types/deps)
- `src/acp/control-plane/manager.utils.ts` (normalization + helper functions)
### 4) Optional acpx runtime adapter cleanup
- `extensions/acpx/src/runtime.ts` can be split into:
- process execution/supervision
- ndjson event parsing/normalization
- runtime API surface (`submit`, `cancel`, `close`, etc.)
- improves testability and makes backend behavior easier to audit

View File

@@ -1,96 +0,0 @@
---
summary: "Holy grail refactor plan for one unified runtime streaming pipeline across main, subagent, and ACP"
owner: "onutc"
status: "draft"
last_updated: "2026-02-25"
title: "Unified Runtime Streaming Refactor Plan"
---
# Unified Runtime Streaming Refactor Plan
## Objective
Deliver one shared streaming pipeline for `main`, `subagent`, and `acp` so all runtimes get identical coalescing, chunking, delivery ordering, and crash recovery behavior.
## Why this exists
- Current behavior is split across multiple runtime-specific shaping paths.
- Formatting/coalescing bugs can be fixed in one path but remain in others.
- Delivery consistency, duplicate suppression, and recovery semantics are harder to reason about.
## Target architecture
Single pipeline, runtime-specific adapters:
1. Runtime adapters emit canonical events only.
2. Shared stream assembler coalesces and finalizes text/tool/status events.
3. Shared channel projector applies channel-specific chunking/formatting once.
4. Shared delivery ledger enforces idempotent send/replay semantics.
5. Outbound channel adapter executes sends and records delivery checkpoints.
Canonical event contract:
- `turn_started`
- `text_delta`
- `block_final`
- `tool_started`
- `tool_finished`
- `status`
- `turn_completed`
- `turn_failed`
- `turn_cancelled`
## Workstreams
### 1) Canonical streaming contract
- Define strict event schema + validation in core.
- Add adapter contract tests to guarantee each runtime emits compatible events.
- Reject malformed runtime events early and surface structured diagnostics.
### 2) Shared stream processor
- Replace runtime-specific coalescer/projector logic with one processor.
- Processor owns text delta buffering, idle flush, max-chunk splitting, and completion flush.
- Move ACP/main/subagent config resolution into one helper to prevent drift.
### 3) Shared channel projection
- Keep channel adapters dumb: accept finalized blocks and send.
- Move Discord-specific chunking quirks to channel projector only.
- Keep pipeline channel-agnostic before projection.
### 4) Delivery ledger + replay
- Add per-turn/per-chunk delivery IDs.
- Record checkpoints before and after physical send.
- On restart, replay pending chunks idempotently and avoid duplicates.
### 5) Migration and cutover
- Phase 1: shadow mode (new pipeline computes output but old path sends; compare).
- Phase 2: runtime-by-runtime cutover (`acp`, then `subagent`, then `main` or reverse by risk).
- Phase 3: delete legacy runtime-specific streaming code.
## Non-goals
- No changes to ACP policy/permissions model in this refactor.
- No channel-specific feature expansion outside projection compatibility fixes.
- No transport/backend redesign (acpx plugin contract remains as-is unless needed for event parity).
## Risks and mitigations
- Risk: behavioral regressions in existing main/subagent paths.
Mitigation: shadow mode diffing + adapter contract tests + channel e2e tests.
- Risk: duplicate sends during crash recovery.
Mitigation: durable delivery IDs + idempotent replay in delivery adapter.
- Risk: runtime adapters diverge again.
Mitigation: required shared contract test suite for all adapters.
## Acceptance criteria
- All runtimes pass shared streaming contract tests.
- Discord ACP/main/subagent produce equivalent spacing/chunking behavior for tiny deltas.
- Crash/restart replay sends no duplicate chunk for the same delivery ID.
- Legacy ACP projector/coalescer path is removed.
- Streaming config resolution is shared and runtime-independent.

View File

@@ -1,232 +0,0 @@
---
summary: "Plan: isolate browser act:evaluate from Playwright queue using CDP, with end-to-end deadlines and safer ref resolution"
read_when:
- Working on browser `act:evaluate` timeout, abort, or queue blocking issues
- Planning CDP based isolation for evaluate execution
owner: "openclaw"
status: "draft"
last_updated: "2026-02-10"
title: "Browser Evaluate CDP Refactor"
---
# Browser Evaluate CDP Refactor Plan
## Context
`act:evaluate` executes user provided JavaScript in the page. Today it runs via Playwright
(`page.evaluate` or `locator.evaluate`). Playwright serializes CDP commands per page, so a
stuck or long running evaluate can block the page command queue and make every later action
on that tab look "stuck".
PR #13498 adds a pragmatic safety net (bounded evaluate, abort propagation, and best-effort
recovery). This document describes a larger refactor that makes `act:evaluate` inherently
isolated from Playwright so a stuck evaluate cannot wedge normal Playwright operations.
## Goals
- `act:evaluate` cannot permanently block later browser actions on the same tab.
- Timeouts are single source of truth end to end so a caller can rely on a budget.
- Abort and timeout are treated the same way across HTTP and in-process dispatch.
- Element targeting for evaluate is supported without switching everything off Playwright.
- Maintain backward compatibility for existing callers and payloads.
## Non-goals
- Replace all browser actions (click, type, wait, etc.) with CDP implementations.
- Remove the existing safety net introduced in PR #13498 (it remains a useful fallback).
- Introduce new unsafe capabilities beyond the existing `browser.evaluateEnabled` gate.
- Add process isolation (worker process/thread) for evaluate. If we still see hard to recover
stuck states after this refactor, that is a follow-up idea.
## Current Architecture (Why It Gets Stuck)
At a high level:
- Callers send `act:evaluate` to the browser control service.
- The route handler calls into Playwright to execute the JavaScript.
- Playwright serializes page commands, so an evaluate that never finishes blocks the queue.
- A stuck queue means later click/type/wait operations on the tab can appear to hang.
## Proposed Architecture
### 1. Deadline Propagation
Introduce a single budget concept and derive everything from it:
- Caller sets `timeoutMs` (or a deadline in the future).
- The outer request timeout, route handler logic, and the execution budget inside the page
all use the same budget, with small headroom where needed for serialization overhead.
- Abort is propagated as an `AbortSignal` everywhere so cancellation is consistent.
Implementation direction:
- Add a small helper (for example `createBudget({ timeoutMs, signal })`) that returns:
- `signal`: the linked AbortSignal
- `deadlineAtMs`: absolute deadline
- `remainingMs()`: remaining budget for child operations
- Use this helper in:
- `src/browser/client-fetch.ts` (HTTP and in-process dispatch)
- `src/node-host/runner.ts` (proxy path)
- browser action implementations (Playwright and CDP)
### 2. Separate Evaluate Engine (CDP Path)
Add a CDP based evaluate implementation that does not share Playwright's per page command
queue. The key property is that the evaluate transport is a separate WebSocket connection
and a separate CDP session attached to the target.
Implementation direction:
- New module, for example `src/browser/cdp-evaluate.ts`, that:
- Connects to the configured CDP endpoint (browser level socket).
- Uses `Target.attachToTarget({ targetId, flatten: true })` to get a `sessionId`.
- Runs either:
- `Runtime.evaluate` for page level evaluate, or
- `DOM.resolveNode` plus `Runtime.callFunctionOn` for element evaluate.
- On timeout or abort:
- Sends `Runtime.terminateExecution` best-effort for the session.
- Closes the WebSocket and returns a clear error.
Notes:
- This still executes JavaScript in the page, so termination can have side effects. The win
is that it does not wedge the Playwright queue, and it is cancelable at the transport
layer by killing the CDP session.
### 3. Ref Story (Element Targeting Without A Full Rewrite)
The hard part is element targeting. CDP needs a DOM handle or `backendDOMNodeId`, while
today most browser actions use Playwright locators based on refs from snapshots.
Recommended approach: keep existing refs, but attach an optional CDP resolvable id.
#### 3.1 Extend Stored Ref Info
Extend the stored role ref metadata to optionally include a CDP id:
- Today: `{ role, name, nth }`
- Proposed: `{ role, name, nth, backendDOMNodeId?: number }`
This keeps all existing Playwright based actions working and allows CDP evaluate to accept
the same `ref` value when the `backendDOMNodeId` is available.
#### 3.2 Populate backendDOMNodeId At Snapshot Time
When producing a role snapshot:
1. Generate the existing role ref map as today (role, name, nth).
2. Fetch the AX tree via CDP (`Accessibility.getFullAXTree`) and compute a parallel map of
`(role, name, nth) -> backendDOMNodeId` using the same duplicate handling rules.
3. Merge the id back into the stored ref info for the current tab.
If mapping fails for a ref, leave `backendDOMNodeId` undefined. This makes the feature
best-effort and safe to roll out.
#### 3.3 Evaluate Behavior With Ref
In `act:evaluate`:
- If `ref` is present and has `backendDOMNodeId`, run element evaluate via CDP.
- If `ref` is present but has no `backendDOMNodeId`, fall back to the Playwright path (with
the safety net).
Optional escape hatch:
- Extend the request shape to accept `backendDOMNodeId` directly for advanced callers (and
for debugging), while keeping `ref` as the primary interface.
### 4. Keep A Last Resort Recovery Path
Even with CDP evaluate, there are other ways to wedge a tab or a connection. Keep the
existing recovery mechanisms (terminate execution + disconnect Playwright) as a last resort
for:
- legacy callers
- environments where CDP attach is blocked
- unexpected Playwright edge cases
## Implementation Plan (Single Iteration)
### Deliverables
- A CDP based evaluate engine that runs outside the Playwright per-page command queue.
- A single end-to-end timeout/abort budget used consistently by callers and handlers.
- Ref metadata that can optionally carry `backendDOMNodeId` for element evaluate.
- `act:evaluate` prefers the CDP engine when possible and falls back to Playwright when not.
- Tests that prove a stuck evaluate does not wedge later actions.
- Logs/metrics that make failures and fallbacks visible.
### Implementation Checklist
1. Add a shared "budget" helper to link `timeoutMs` + upstream `AbortSignal` into:
- a single `AbortSignal`
- an absolute deadline
- a `remainingMs()` helper for downstream operations
2. Update all caller paths to use that helper so `timeoutMs` means the same thing everywhere:
- `src/browser/client-fetch.ts` (HTTP and in-process dispatch)
- `src/node-host/runner.ts` (node proxy path)
- CLI wrappers that call `/act` (add `--timeout-ms` to `browser evaluate`)
3. Implement `src/browser/cdp-evaluate.ts`:
- connect to the browser-level CDP socket
- `Target.attachToTarget` to get a `sessionId`
- run `Runtime.evaluate` for page evaluate
- run `DOM.resolveNode` + `Runtime.callFunctionOn` for element evaluate
- on timeout/abort: best-effort `Runtime.terminateExecution` then close the socket
4. Extend stored role ref metadata to optionally include `backendDOMNodeId`:
- keep existing `{ role, name, nth }` behavior for Playwright actions
- add `backendDOMNodeId?: number` for CDP element targeting
5. Populate `backendDOMNodeId` during snapshot creation (best-effort):
- fetch AX tree via CDP (`Accessibility.getFullAXTree`)
- compute `(role, name, nth) -> backendDOMNodeId` and merge into the stored ref map
- if mapping is ambiguous or missing, leave the id undefined
6. Update `act:evaluate` routing:
- if no `ref`: always use CDP evaluate
- if `ref` resolves to a `backendDOMNodeId`: use CDP element evaluate
- otherwise: fall back to Playwright evaluate (still bounded and abortable)
7. Keep the existing "last resort" recovery path as a fallback, not the default path.
8. Add tests:
- stuck evaluate times out within budget and the next click/type succeeds
- abort cancels evaluate (client disconnect or timeout) and unblocks subsequent actions
- mapping failures cleanly fall back to Playwright
9. Add observability:
- evaluate duration and timeout counters
- terminateExecution usage
- fallback rate (CDP -> Playwright) and reasons
### Acceptance Criteria
- A deliberately hung `act:evaluate` returns within the caller budget and does not wedge the
tab for later actions.
- `timeoutMs` behaves consistently across CLI, agent tool, node proxy, and in-process calls.
- If `ref` can be mapped to `backendDOMNodeId`, element evaluate uses CDP; otherwise the
fallback path is still bounded and recoverable.
## Testing Plan
- Unit tests:
- `(role, name, nth)` matching logic between role refs and AX tree nodes.
- Budget helper behavior (headroom, remaining time math).
- Integration tests:
- CDP evaluate timeout returns within budget and does not block the next action.
- Abort cancels evaluate and triggers termination best-effort.
- Contract tests:
- Ensure `BrowserActRequest` and `BrowserActResponse` remain compatible.
## Risks And Mitigations
- Mapping is imperfect:
- Mitigation: best-effort mapping, fallback to Playwright evaluate, and add debug tooling.
- `Runtime.terminateExecution` has side effects:
- Mitigation: only use on timeout/abort and document the behavior in errors.
- Extra overhead:
- Mitigation: only fetch AX tree when snapshots are requested, cache per target, and keep
CDP session short lived.
- Extension relay limitations:
- Mitigation: use browser level attach APIs when per page sockets are not available, and
keep the current Playwright path as fallback.
## Open Questions
- Should the new engine be configurable as `playwright`, `cdp`, or `auto`?
- Do we want to expose a new "nodeRef" format for advanced users, or keep `ref` only?
- How should frame snapshots and selector scoped snapshots participate in AX mapping?

View File

@@ -1,337 +0,0 @@
---
summary: "Status and next steps for decoupling Discord gateway listeners from long-running agent turns with a Discord-specific inbound worker"
owner: "openclaw"
status: "in_progress"
last_updated: "2026-03-05"
title: "Discord Async Inbound Worker Plan"
---
# Discord Async Inbound Worker Plan
## Objective
Remove Discord listener timeout as a user-facing failure mode by making inbound Discord turns asynchronous:
1. Gateway listener accepts and normalizes inbound events quickly.
2. A Discord run queue stores serialized jobs keyed by the same ordering boundary we use today.
3. A worker executes the actual agent turn outside the Carbon listener lifetime.
4. Replies are delivered back to the originating channel or thread after the run completes.
This is the long-term fix for queued Discord runs timing out at `channels.discord.eventQueue.listenerTimeout` while the agent run itself is still making progress.
## Current status
This plan is partially implemented.
Already done:
- Discord listener timeout and Discord run timeout are now separate settings.
- Accepted inbound Discord turns are enqueued into `src/discord/monitor/inbound-worker.ts`.
- The worker now owns the long-running turn instead of the Carbon listener.
- Existing per-route ordering is preserved by queue key.
- Timeout regression coverage exists for the Discord worker path.
What this means in plain language:
- the production timeout bug is fixed
- the long-running turn no longer dies just because the Discord listener budget expires
- the worker architecture is not finished yet
What is still missing:
- `DiscordInboundJob` is still only partially normalized and still carries live runtime references
- command semantics (`stop`, `new`, `reset`, future session controls) are not yet fully worker-native
- worker observability and operator status are still minimal
- there is still no restart durability
## Why this exists
Current behavior ties the full agent turn to the listener lifetime:
- `src/discord/monitor/listeners.ts` applies the timeout and abort boundary.
- `src/discord/monitor/message-handler.ts` keeps the queued run inside that boundary.
- `src/discord/monitor/message-handler.process.ts` performs media loading, routing, dispatch, typing, draft streaming, and final reply delivery inline.
That architecture has two bad properties:
- long but healthy turns can be aborted by the listener watchdog
- users can see no reply even when the downstream runtime would have produced one
Raising the timeout helps but does not change the failure mode.
## Non-goals
- Do not redesign non-Discord channels in this pass.
- Do not broaden this into a generic all-channel worker framework in the first implementation.
- Do not extract a shared cross-channel inbound worker abstraction yet; only share low-level primitives when duplication is obvious.
- Do not add durable crash recovery in the first pass unless needed to land safely.
- Do not change route selection, binding semantics, or ACP policy in this plan.
## Current constraints
The current Discord processing path still depends on some live runtime objects that should not stay inside the long-term job payload:
- Carbon `Client`
- raw Discord event shapes
- in-memory guild history map
- thread binding manager callbacks
- live typing and draft stream state
We already moved execution onto a worker queue, but the normalization boundary is still incomplete. Right now the worker is "run later in the same process with some of the same live objects," not a fully data-only job boundary.
## Target architecture
### 1. Listener stage
`DiscordMessageListener` remains the ingress point, but its job becomes:
- run preflight and policy checks
- normalize accepted input into a serializable `DiscordInboundJob`
- enqueue the job into a per-session or per-channel async queue
- return immediately to Carbon once the enqueue succeeds
The listener should no longer own the end-to-end LLM turn lifetime.
### 2. Normalized job payload
Introduce a serializable job descriptor that contains only the data needed to run the turn later.
Minimum shape:
- route identity
- `agentId`
- `sessionKey`
- `accountId`
- `channel`
- delivery identity
- destination channel id
- reply target message id
- thread id if present
- sender identity
- sender id, label, username, tag
- channel context
- guild id
- channel name or slug
- thread metadata
- resolved system prompt override
- normalized message body
- base text
- effective message text
- attachment descriptors or resolved media references
- gating decisions
- mention requirement outcome
- command authorization outcome
- bound session or agent metadata if applicable
The job payload must not contain live Carbon objects or mutable closures.
Current implementation status:
- partially done
- `src/discord/monitor/inbound-job.ts` exists and defines the worker handoff
- the payload still contains live Discord runtime context and should be reduced further
### 3. Worker stage
Add a Discord-specific worker runner responsible for:
- reconstructing the turn context from `DiscordInboundJob`
- loading media and any additional channel metadata needed for the run
- dispatching the agent turn
- delivering final reply payloads
- updating status and diagnostics
Recommended location:
- `src/discord/monitor/inbound-worker.ts`
- `src/discord/monitor/inbound-job.ts`
### 4. Ordering model
Ordering must remain equivalent to today for a given route boundary.
Recommended key:
- use the same queue key logic as `resolveDiscordRunQueueKey(...)`
This preserves existing behavior:
- one bound agent conversation does not interleave with itself
- different Discord channels can still progress independently
### 5. Timeout model
After cutover, there are two separate timeout classes:
- listener timeout
- only covers normalization and enqueue
- should be short
- run timeout
- optional, worker-owned, explicit, and user-visible
- should not be inherited accidentally from Carbon listener settings
This removes the current accidental coupling between "Discord gateway listener stayed alive" and "agent run is healthy."
## Recommended implementation phases
### Phase 1: normalization boundary
- Status: partially implemented
- Done:
- extracted `buildDiscordInboundJob(...)`
- added worker handoff tests
- Remaining:
- make `DiscordInboundJob` plain data only
- move live runtime dependencies to worker-owned services instead of per-job payload
- stop rebuilding process context by stitching live listener refs back into the job
### Phase 2: in-memory worker queue
- Status: implemented
- Done:
- added `DiscordInboundWorkerQueue` keyed by resolved run queue key
- listener enqueues jobs instead of directly awaiting `processDiscordMessage(...)`
- worker executes jobs in-process, in memory only
This is the first functional cutover.
### Phase 3: process split
- Status: not started
- Move delivery, typing, and draft streaming ownership behind worker-facing adapters.
- Replace direct use of live preflight context with worker context reconstruction.
- Keep `processDiscordMessage(...)` temporarily as a facade if needed, then split it.
### Phase 4: command semantics
- Status: not started
Make sure native Discord commands still behave correctly when work is queued:
- `stop`
- `new`
- `reset`
- any future session-control commands
The worker queue must expose enough run state for commands to target the active or queued turn.
### Phase 5: observability and operator UX
- Status: not started
- emit queue depth and active worker counts into monitor status
- record enqueue time, start time, finish time, and timeout or cancellation reason
- surface worker-owned timeout or delivery failures clearly in logs
### Phase 6: optional durability follow-up
- Status: not started
Only after the in-memory version is stable:
- decide whether queued Discord jobs should survive gateway restart
- if yes, persist job descriptors and delivery checkpoints
- if no, document the explicit in-memory boundary
This should be a separate follow-up unless restart recovery is required to land.
## File impact
Current primary files:
- `src/discord/monitor/listeners.ts`
- `src/discord/monitor/message-handler.ts`
- `src/discord/monitor/message-handler.preflight.ts`
- `src/discord/monitor/message-handler.process.ts`
- `src/discord/monitor/status.ts`
Current worker files:
- `src/discord/monitor/inbound-job.ts`
- `src/discord/monitor/inbound-worker.ts`
- `src/discord/monitor/inbound-job.test.ts`
- `src/discord/monitor/message-handler.queue.test.ts`
Likely next touch points:
- `src/auto-reply/dispatch.ts`
- `src/discord/monitor/reply-delivery.ts`
- `src/discord/monitor/thread-bindings.ts`
- `src/discord/monitor/native-command.ts`
## Next step now
The next step is to make the worker boundary real instead of partial.
Do this next:
1. Move live runtime dependencies out of `DiscordInboundJob`
2. Keep those dependencies on the Discord worker instance instead
3. Reduce queued jobs to plain Discord-specific data:
- route identity
- delivery target
- sender info
- normalized message snapshot
- gating and binding decisions
4. Reconstruct worker execution context from that plain data inside the worker
In practice, that means:
- `client`
- `threadBindings`
- `guildHistories`
- `discordRestFetch`
- other mutable runtime-only handles
should stop living on each queued job and instead live on the worker itself or behind worker-owned adapters.
After that lands, the next follow-up should be command-state cleanup for `stop`, `new`, and `reset`.
## Testing plan
Keep the existing timeout repro coverage in:
- `src/discord/monitor/message-handler.queue.test.ts`
Add new tests for:
1. listener returns after enqueue without awaiting full turn
2. per-route ordering is preserved
3. different channels still run concurrently
4. replies are delivered to the original message destination
5. `stop` cancels the active worker-owned run
6. worker failure produces visible diagnostics without blocking later jobs
7. ACP-bound Discord channels still route correctly under worker execution
## Risks and mitigations
- Risk: command semantics drift from current synchronous behavior
Mitigation: land command-state plumbing in the same cutover, not later
- Risk: reply delivery loses thread or reply-to context
Mitigation: make delivery identity first-class in `DiscordInboundJob`
- Risk: duplicate sends during retries or queue restarts
Mitigation: keep first pass in-memory only, or add explicit delivery idempotency before persistence
- Risk: `message-handler.process.ts` becomes harder to reason about during migration
Mitigation: split into normalization, execution, and delivery helpers before or during worker cutover
## Acceptance criteria
The plan is complete when:
1. Discord listener timeout no longer aborts healthy long-running turns.
2. Listener lifetime and agent-turn lifetime are separate concepts in code.
3. Existing per-session ordering is preserved.
4. ACP-bound Discord channels work through the same worker path.
5. `stop` targets the worker-owned run instead of the old listener-owned call stack.
6. Timeout and delivery failures become explicit worker outcomes, not silent listener drops.
## Remaining landing strategy
Finish this in follow-up PRs:
1. make `DiscordInboundJob` plain-data only and move live runtime refs onto the worker
2. clean up command-state ownership for `stop`, `new`, and `reset`
3. add worker observability and operator status
4. decide whether durability is needed or explicitly document the in-memory boundary
This is still a bounded follow-up if kept Discord-only and if we continue to avoid a premature cross-channel worker abstraction.

View File

@@ -1,126 +0,0 @@
---
summary: "Plan: Add OpenResponses /v1/responses endpoint and deprecate chat completions cleanly"
read_when:
- Designing or implementing `/v1/responses` gateway support
- Planning migration from Chat Completions compatibility
owner: "openclaw"
status: "draft"
last_updated: "2026-01-19"
title: "OpenResponses Gateway Plan"
---
# OpenResponses Gateway Integration Plan
## Context
OpenClaw Gateway currently exposes a minimal OpenAI-compatible Chat Completions endpoint at
`/v1/chat/completions` (see [OpenAI Chat Completions](/gateway/openai-http-api)).
Open Responses is an open inference standard based on the OpenAI Responses API. It is designed
for agentic workflows and uses item-based inputs plus semantic streaming events. The OpenResponses
spec defines `/v1/responses`, not `/v1/chat/completions`.
## Goals
- Add a `/v1/responses` endpoint that adheres to OpenResponses semantics.
- Keep Chat Completions as a compatibility layer that is easy to disable and eventually remove.
- Standardize validation and parsing with isolated, reusable schemas.
## Non-goals
- Full OpenResponses feature parity in the first pass (images, files, hosted tools).
- Replacing internal agent execution logic or tool orchestration.
- Changing the existing `/v1/chat/completions` behavior during the first phase.
## Research Summary
Sources: OpenResponses OpenAPI, OpenResponses specification site, and the Hugging Face blog post.
Key points extracted:
- `POST /v1/responses` accepts `CreateResponseBody` fields like `model`, `input` (string or
`ItemParam[]`), `instructions`, `tools`, `tool_choice`, `stream`, `max_output_tokens`, and
`max_tool_calls`.
- `ItemParam` is a discriminated union of:
- `message` items with roles `system`, `developer`, `user`, `assistant`
- `function_call` and `function_call_output`
- `reasoning`
- `item_reference`
- Successful responses return a `ResponseResource` with `object: "response"`, `status`, and
`output` items.
- Streaming uses semantic events such as:
- `response.created`, `response.in_progress`, `response.completed`, `response.failed`
- `response.output_item.added`, `response.output_item.done`
- `response.content_part.added`, `response.content_part.done`
- `response.output_text.delta`, `response.output_text.done`
- The spec requires:
- `Content-Type: text/event-stream`
- `event:` must match the JSON `type` field
- terminal event must be literal `[DONE]`
- Reasoning items may expose `content`, `encrypted_content`, and `summary`.
- HF examples include `OpenResponses-Version: latest` in requests (optional header).
## Proposed Architecture
- Add `src/gateway/open-responses.schema.ts` containing Zod schemas only (no gateway imports).
- Add `src/gateway/openresponses-http.ts` (or `open-responses-http.ts`) for `/v1/responses`.
- Keep `src/gateway/openai-http.ts` intact as a legacy compatibility adapter.
- Add config `gateway.http.endpoints.responses.enabled` (default `false`).
- Keep `gateway.http.endpoints.chatCompletions.enabled` independent; allow both endpoints to be
toggled separately.
- Emit a startup warning when Chat Completions is enabled to signal legacy status.
## Deprecation Path for Chat Completions
- Maintain strict module boundaries: no shared schema types between responses and chat completions.
- Make Chat Completions opt-in by config so it can be disabled without code changes.
- Update docs to label Chat Completions as legacy once `/v1/responses` is stable.
- Optional future step: map Chat Completions requests to the Responses handler for a simpler
removal path.
## Phase 1 Support Subset
- Accept `input` as string or `ItemParam[]` with message roles and `function_call_output`.
- Extract system and developer messages into `extraSystemPrompt`.
- Use the most recent `user` or `function_call_output` as the current message for agent runs.
- Reject unsupported content parts (image/file) with `invalid_request_error`.
- Return a single assistant message with `output_text` content.
- Return `usage` with zeroed values until token accounting is wired.
## Validation Strategy (No SDK)
- Implement Zod schemas for the supported subset of:
- `CreateResponseBody`
- `ItemParam` + message content part unions
- `ResponseResource`
- Streaming event shapes used by the gateway
- Keep schemas in a single, isolated module to avoid drift and allow future codegen.
## Streaming Implementation (Phase 1)
- SSE lines with both `event:` and `data:`.
- Required sequence (minimum viable):
- `response.created`
- `response.output_item.added`
- `response.content_part.added`
- `response.output_text.delta` (repeat as needed)
- `response.output_text.done`
- `response.content_part.done`
- `response.completed`
- `[DONE]`
## Tests and Verification Plan
- Add e2e coverage for `/v1/responses`:
- Auth required
- Non-stream response shape
- Stream event ordering and `[DONE]`
- Session routing with headers and `user`
- Keep `src/gateway/openai-http.test.ts` unchanged.
- Manual: curl to `/v1/responses` with `stream: true` and verify event ordering and terminal
`[DONE]`.
## Doc Updates (Follow-up)
- Add a new docs page for `/v1/responses` usage and examples.
- Update `/gateway/openai-http-api` with a legacy note and pointer to `/v1/responses`.

View File

@@ -1,195 +0,0 @@
---
summary: "Production plan for reliable interactive process supervision (PTY + non-PTY) with explicit ownership, unified lifecycle, and deterministic cleanup"
read_when:
- Working on exec/process lifecycle ownership and cleanup
- Debugging PTY and non-PTY supervision behavior
owner: "openclaw"
status: "in-progress"
last_updated: "2026-02-15"
title: "PTY and Process Supervision Plan"
---
# PTY and Process Supervision Plan
## 1. Problem and goal
We need one reliable lifecycle for long-running command execution across:
- `exec` foreground runs
- `exec` background runs
- `process` follow up actions (`poll`, `log`, `send-keys`, `paste`, `submit`, `kill`, `remove`)
- CLI agent runner subprocesses
The goal is not just to support PTY. The goal is predictable ownership, cancellation, timeout, and cleanup with no unsafe process matching heuristics.
## 2. Scope and boundaries
- Keep implementation internal in `src/process/supervisor`.
- Do not create a new package for this.
- Keep current behavior compatibility where practical.
- Do not broaden scope to terminal replay or tmux style session persistence.
## 3. Implemented in this branch
### Supervisor baseline already present
- Supervisor module is in place under `src/process/supervisor/*`.
- Exec runtime and CLI runner are already routed through supervisor spawn and wait.
- Registry finalization is idempotent.
### This pass completed
1. Explicit PTY command contract
- `SpawnInput` is now a discriminated union in `src/process/supervisor/types.ts`.
- PTY runs require `ptyCommand` instead of reusing generic `argv`.
- Supervisor no longer rebuilds PTY command strings from argv joins in `src/process/supervisor/supervisor.ts`.
- Exec runtime now passes `ptyCommand` directly in `src/agents/bash-tools.exec-runtime.ts`.
2. Process layer type decoupling
- Supervisor types no longer import `SessionStdin` from agents.
- Process local stdin contract lives in `src/process/supervisor/types.ts` (`ManagedRunStdin`).
- Adapters now depend only on process level types:
- `src/process/supervisor/adapters/child.ts`
- `src/process/supervisor/adapters/pty.ts`
3. Process tool lifecycle ownership improvement
- `src/agents/bash-tools.process.ts` now requests cancellation through supervisor first.
- `process kill/remove` now use process-tree fallback termination when supervisor lookup misses.
- `remove` keeps deterministic remove behavior by dropping running session entries immediately after termination is requested.
4. Single source watchdog defaults
- Added shared defaults in `src/agents/cli-watchdog-defaults.ts`.
- `src/agents/cli-backends.ts` consumes the shared defaults.
- `src/agents/cli-runner/reliability.ts` consumes the same shared defaults.
5. Dead helper cleanup
- Removed unused `killSession` helper path from `src/agents/bash-tools.shared.ts`.
6. Direct supervisor path tests added
- Added `src/agents/bash-tools.process.supervisor.test.ts` to cover kill and remove routing through supervisor cancellation.
7. Reliability gap fixes completed
- `src/agents/bash-tools.process.ts` now falls back to real OS-level process termination when supervisor lookup misses.
- `src/process/supervisor/adapters/child.ts` now uses process-tree termination semantics for default cancel/timeout kill paths.
- Added shared process-tree utility in `src/process/kill-tree.ts`.
8. PTY contract edge-case coverage added
- Added `src/process/supervisor/supervisor.pty-command.test.ts` for verbatim PTY command forwarding and empty-command rejection.
- Added `src/process/supervisor/adapters/child.test.ts` for process-tree kill behavior in child adapter cancellation.
## 4. Remaining gaps and decisions
### Reliability status
The two required reliability gaps for this pass are now closed:
- `process kill/remove` now has a real OS termination fallback when supervisor lookup misses.
- child cancel/timeout now uses process-tree kill semantics for default kill path.
- Regression tests were added for both behaviors.
### Durability and startup reconciliation
Restart behavior is now explicitly defined as in-memory lifecycle only.
- `reconcileOrphans()` remains a no-op in `src/process/supervisor/supervisor.ts` by design.
- Active runs are not recovered after process restart.
- This boundary is intentional for this implementation pass to avoid partial persistence risks.
### Maintainability follow-ups
1. `runExecProcess` in `src/agents/bash-tools.exec-runtime.ts` still handles multiple responsibilities and can be split into focused helpers in a follow-up.
## 5. Implementation plan
The implementation pass for required reliability and contract items is complete.
Completed:
- `process kill/remove` fallback real termination
- process-tree cancellation for child adapter default kill path
- regression tests for fallback kill and child adapter kill path
- PTY command edge-case tests under explicit `ptyCommand`
- explicit in-memory restart boundary with `reconcileOrphans()` no-op by design
Optional follow-up:
- split `runExecProcess` into focused helpers with no behavior drift
## 6. File map
### Process supervisor
- `src/process/supervisor/types.ts` updated with discriminated spawn input and process local stdin contract.
- `src/process/supervisor/supervisor.ts` updated to use explicit `ptyCommand`.
- `src/process/supervisor/adapters/child.ts` and `src/process/supervisor/adapters/pty.ts` decoupled from agent types.
- `src/process/supervisor/registry.ts` idempotent finalize unchanged and retained.
### Exec and process integration
- `src/agents/bash-tools.exec-runtime.ts` updated to pass PTY command explicitly and keep fallback path.
- `src/agents/bash-tools.process.ts` updated to cancel via supervisor with real process-tree fallback termination.
- `src/agents/bash-tools.shared.ts` removed direct kill helper path.
### CLI reliability
- `src/agents/cli-watchdog-defaults.ts` added as shared baseline.
- `src/agents/cli-backends.ts` and `src/agents/cli-runner/reliability.ts` now consume same defaults.
## 7. Validation run in this pass
Unit tests:
- `pnpm vitest src/process/supervisor/registry.test.ts`
- `pnpm vitest src/process/supervisor/supervisor.test.ts`
- `pnpm vitest src/process/supervisor/supervisor.pty-command.test.ts`
- `pnpm vitest src/process/supervisor/adapters/child.test.ts`
- `pnpm vitest src/agents/cli-backends.test.ts`
- `pnpm vitest src/agents/bash-tools.exec.pty-cleanup.test.ts`
- `pnpm vitest src/agents/bash-tools.process.poll-timeout.test.ts`
- `pnpm vitest src/agents/bash-tools.process.supervisor.test.ts`
- `pnpm vitest src/process/exec.test.ts`
E2E targets:
- `pnpm vitest src/agents/cli-runner.test.ts`
- `pnpm vitest run src/agents/bash-tools.exec.pty-fallback.test.ts src/agents/bash-tools.exec.background-abort.test.ts src/agents/bash-tools.process.send-keys.test.ts`
Typecheck note:
- Use `pnpm build` (and `pnpm check` for full lint/docs gate) in this repo. Older notes that mention `pnpm tsgo` are obsolete.
## 8. Operational guarantees preserved
- Exec env hardening behavior is unchanged.
- Approval and allowlist flow is unchanged.
- Output sanitization and output caps are unchanged.
- PTY adapter still guarantees wait settlement on forced kill and listener disposal.
## 9. Definition of done
1. Supervisor is lifecycle owner for managed runs.
2. PTY spawn uses explicit command contract with no argv reconstruction.
3. Process layer has no type dependency on agent layer for supervisor stdin contracts.
4. Watchdog defaults are single source.
5. Targeted unit and e2e tests remain green.
6. Restart durability boundary is explicitly documented or fully implemented.
## 10. Summary
The branch now has a coherent and safer supervision shape:
- explicit PTY contract
- cleaner process layering
- supervisor driven cancellation path for process operations
- real fallback termination when supervisor lookup misses
- process-tree cancellation for child-run default kill paths
- unified watchdog defaults
- explicit in-memory restart boundary (no orphan reconciliation across restart in this pass)

View File

@@ -1,226 +0,0 @@
---
summary: "Channel agnostic session binding architecture and iteration 1 delivery scope"
read_when:
- Refactoring channel-agnostic session routing and bindings
- Investigating duplicate, stale, or missing session delivery across channels
owner: "onutc"
status: "in-progress"
last_updated: "2026-02-21"
title: "Session Binding Channel Agnostic Plan"
---
# Session Binding Channel Agnostic Plan
## Overview
This document defines the long term channel agnostic session binding model and the concrete scope for the next implementation iteration.
Goal:
- make subagent bound session routing a core capability
- keep channel specific behavior in adapters
- avoid regressions in normal Discord behavior
## Why this exists
Current behavior mixes:
- completion content policy
- destination routing policy
- Discord specific details
This caused edge cases such as:
- duplicate main and thread delivery under concurrent runs
- stale token usage on reused binding managers
- missing activity accounting for webhook sends
## Iteration 1 scope
This iteration is intentionally limited.
### 1. Add channel agnostic core interfaces
Add core types and service interfaces for bindings and routing.
Proposed core types:
```ts
export type BindingTargetKind = "subagent" | "session";
export type BindingStatus = "active" | "ending" | "ended";
export type ConversationRef = {
channel: string;
accountId: string;
conversationId: string;
parentConversationId?: string;
};
export type SessionBindingRecord = {
bindingId: string;
targetSessionKey: string;
targetKind: BindingTargetKind;
conversation: ConversationRef;
status: BindingStatus;
boundAt: number;
expiresAt?: number;
metadata?: Record<string, unknown>;
};
```
Core service contract:
```ts
export interface SessionBindingService {
bind(input: {
targetSessionKey: string;
targetKind: BindingTargetKind;
conversation: ConversationRef;
metadata?: Record<string, unknown>;
ttlMs?: number;
}): Promise<SessionBindingRecord>;
listBySession(targetSessionKey: string): SessionBindingRecord[];
resolveByConversation(ref: ConversationRef): SessionBindingRecord | null;
touch(bindingId: string, at?: number): void;
unbind(input: {
bindingId?: string;
targetSessionKey?: string;
reason: string;
}): Promise<SessionBindingRecord[]>;
}
```
### 2. Add one core delivery router for subagent completions
Add a single destination resolution path for completion events.
Router contract:
```ts
export interface BoundDeliveryRouter {
resolveDestination(input: {
eventKind: "task_completion";
targetSessionKey: string;
requester?: ConversationRef;
failClosed: boolean;
}): {
binding: SessionBindingRecord | null;
mode: "bound" | "fallback";
reason: string;
};
}
```
For this iteration:
- only `task_completion` is routed through this new path
- existing paths for other event kinds remain as-is
### 3. Keep Discord as adapter
Discord remains the first adapter implementation.
Adapter responsibilities:
- create/reuse thread conversations
- send bound messages via webhook or channel send
- validate thread state (archived/deleted)
- map adapter metadata (webhook identity, thread ids)
### 4. Fix currently known correctness issues
Required in this iteration:
- refresh token usage when reusing existing thread binding manager
- record outbound activity for webhook based Discord sends
- stop implicit main channel fallback when a bound thread destination is selected for session mode completion
### 5. Preserve current runtime safety defaults
No behavior change for users with thread bound spawn disabled.
Defaults stay:
- `channels.discord.threadBindings.spawnSubagentSessions = false`
Result:
- normal Discord users stay on current behavior
- new core path affects only bound session completion routing where enabled
## Not in iteration 1
Explicitly deferred:
- ACP binding targets (`targetKind: "acp"`)
- new channel adapters beyond Discord
- global replacement of all delivery paths (`spawn_ack`, future `subagent_message`)
- protocol level changes
- store migration/versioning redesign for all binding persistence
Notes on ACP:
- interface design keeps room for ACP
- ACP implementation is not started in this iteration
## Routing invariants
These invariants are mandatory for iteration 1.
- destination selection and content generation are separate steps
- if session mode completion resolves to an active bound destination, delivery must target that destination
- no hidden reroute from bound destination to main channel
- fallback behavior must be explicit and observable
## Compatibility and rollout
Compatibility target:
- no regression for users with thread bound spawning off
- no change to non-Discord channels in this iteration
Rollout:
1. Land interfaces and router behind current feature gates.
2. Route Discord completion mode bound deliveries through router.
3. Keep legacy path for non-bound flows.
4. Verify with targeted tests and canary runtime logs.
## Tests required in iteration 1
Unit and integration coverage required:
- manager token rotation uses latest token after manager reuse
- webhook sends update channel activity timestamps
- two active bound sessions in same requester channel do not duplicate to main channel
- completion for bound session mode run resolves to thread destination only
- disabled spawn flag keeps legacy behavior unchanged
## Proposed implementation files
Core:
- `src/infra/outbound/session-binding-service.ts` (new)
- `src/infra/outbound/bound-delivery-router.ts` (new)
- `src/agents/subagent-announce.ts` (completion destination resolution integration)
Discord adapter and runtime:
- `src/discord/monitor/thread-bindings.manager.ts`
- `src/discord/monitor/reply-delivery.ts`
- `src/discord/send.outbound.ts`
Tests:
- `src/discord/monitor/provider*.test.ts`
- `src/discord/monitor/reply-delivery.test.ts`
- `src/agents/subagent-announce.format.test.ts`
## Done criteria for iteration 1
- core interfaces exist and are wired for completion routing
- correctness fixes above are merged with tests
- no main and thread duplicate completion delivery in session mode bound runs
- no behavior change for disabled bound spawn deployments
- ACP remains explicitly deferred

View File

@@ -1,89 +0,0 @@
---
summary: "Proposal: long-term command authorization model for ACP-bound conversations"
read_when:
- Designing native command auth behavior in Telegram/Discord ACP-bound channels/topics
title: "ACP Bound Command Authorization (Proposal)"
---
# ACP Bound Command Authorization (Proposal)
Status: Proposed, **not implemented yet**.
This document describes a long-term authorization model for native commands in
ACP-bound conversations. It is an experiments proposal and does not replace
current production behavior.
For implemented behavior, read source and tests in:
- `src/telegram/bot-native-commands.ts`
- `src/discord/monitor/native-command.ts`
- `src/auto-reply/reply/commands-core.ts`
## Problem
Today we have command-specific checks (for example `/new` and `/reset`) that
need to work inside ACP-bound channels/topics even when allowlists are empty.
This solves immediate UX pain, but command-name-based exceptions do not scale.
## Long-term shape
Move command authorization from ad-hoc handler logic to command metadata plus a
shared policy evaluator.
### 1) Add auth policy metadata to command definitions
Each command definition should declare an auth policy. Example shape:
```ts
type CommandAuthPolicy =
| { mode: "owner_or_allowlist" } // default, current strict behavior
| { mode: "bound_acp_or_owner_or_allowlist" } // allow in explicitly bound ACP conversations
| { mode: "owner_only" };
```
`/new` and `/reset` would use `bound_acp_or_owner_or_allowlist`.
Most other commands would remain `owner_or_allowlist`.
### 2) Share one evaluator across channels
Introduce one helper that evaluates command auth using:
- command policy metadata
- sender authorization state
- resolved conversation binding state
Both Telegram and Discord native handlers should call the same helper to avoid
behavior drift.
### 3) Use binding-match as the bypass boundary
When policy allows bound ACP bypass, authorize only if a configured binding
match was resolved for the current conversation (not just because current
session key looks ACP-like).
This keeps the boundary explicit and minimizes accidental widening.
## Why this is better
- Scales to future commands without adding more command-name conditionals.
- Keeps behavior consistent across channels.
- Preserves current security model by requiring explicit binding match.
- Keeps allowlists optional hardening instead of a universal requirement.
## Rollout plan (future)
1. Add command auth policy field to command registry types and command data.
2. Implement shared evaluator and migrate Telegram + Discord native handlers.
3. Move `/new` and `/reset` to metadata-driven policy.
4. Add tests per policy mode and channel surface.
## Non-goals
- This proposal does not change ACP session lifecycle behavior.
- This proposal does not require allowlists for all ACP-bound commands.
- This proposal does not change existing route binding semantics.
## Note
This proposal is intentionally additive and does not delete or replace existing
experiments documents.

View File

@@ -1,36 +0,0 @@
---
summary: "Exploration: model config, auth profiles, and fallback behavior"
read_when:
- Exploring future model selection + auth profile ideas
title: "Model Config Exploration"
---
# Model Config (Exploration)
This document captures **ideas** for future model configuration. It is not a
shipping spec. For current behavior, see:
- [Models](/concepts/models)
- [Model failover](/concepts/model-failover)
- [OAuth + profiles](/concepts/oauth)
## Motivation
Operators want:
- Multiple auth profiles per provider (personal vs work).
- Simple `/model` selection with predictable fallbacks.
- Clear separation between text models and image-capable models.
## Possible direction (high level)
- Keep model selection simple: `provider/model` with optional aliases.
- Let providers have multiple auth profiles, with an explicit order.
- Use a global fallback list so all sessions fail over consistently.
- Only override image routing when explicitly configured.
## Open questions
- Should profile rotation be per-provider or per-model?
- How should the UI surface profile selection for a session?
- What is the safest migration path from legacy config keys?

View File

@@ -1,228 +0,0 @@
---
summary: "Research notes: offline memory system for Clawd workspaces (Markdown source-of-truth + derived index)"
read_when:
- Designing workspace memory (~/.openclaw/workspace) beyond daily Markdown logs
- Deciding: standalone CLI vs deep OpenClaw integration
- Adding offline recall + reflection (retain/recall/reflect)
title: "Workspace Memory Research"
---
# Workspace Memory v2 (offline): research notes
Target: Clawd-style workspace (`agents.defaults.workspace`, default `~/.openclaw/workspace`) where “memory” is stored as one Markdown file per day (`memory/YYYY-MM-DD.md`) plus a small set of stable files (e.g. `memory.md`, `SOUL.md`).
This doc proposes an **offline-first** memory architecture that keeps Markdown as the canonical, reviewable source of truth, but adds **structured recall** (search, entity summaries, confidence updates) via a derived index.
## Why change?
The current setup (one file per day) is excellent for:
- “append-only” journaling
- human editing
- git-backed durability + auditability
- low-friction capture (“just write it down”)
Its weak for:
- high-recall retrieval (“what did we decide about X?”, “last time we tried Y?”)
- entity-centric answers (“tell me about Alice / The Castle / warelay”) without rereading many files
- opinion/preference stability (and evidence when it changes)
- time constraints (“what was true during Nov 2025?”) and conflict resolution
## Design goals
- **Offline**: works without network; can run on laptop/Castle; no cloud dependency.
- **Explainable**: retrieved items should be attributable (file + location) and separable from inference.
- **Low ceremony**: daily logging stays Markdown, no heavy schema work.
- **Incremental**: v1 is useful with FTS only; semantic/vector and graphs are optional upgrades.
- **Agent-friendly**: makes “recall within token budgets” easy (return small bundles of facts).
## North star model (Hindsight × Letta)
Two pieces to blend:
1. **Letta/MemGPT-style control loop**
- keep a small “core” always in context (persona + key user facts)
- everything else is out-of-context and retrieved via tools
- memory writes are explicit tool calls (append/replace/insert), persisted, then re-injected next turn
2. **Hindsight-style memory substrate**
- separate whats observed vs whats believed vs whats summarized
- support retain/recall/reflect
- confidence-bearing opinions that can evolve with evidence
- entity-aware retrieval + temporal queries (even without full knowledge graphs)
## Proposed architecture (Markdown source-of-truth + derived index)
### Canonical store (git-friendly)
Keep `~/.openclaw/workspace` as canonical human-readable memory.
Suggested workspace layout:
```
~/.openclaw/workspace/
memory.md # small: durable facts + preferences (core-ish)
memory/
YYYY-MM-DD.md # daily log (append; narrative)
bank/ # “typed” memory pages (stable, reviewable)
world.md # objective facts about the world
experience.md # what the agent did (first-person)
opinions.md # subjective prefs/judgments + confidence + evidence pointers
entities/
Peter.md
The-Castle.md
warelay.md
...
```
Notes:
- **Daily log stays daily log**. No need to turn it into JSON.
- The `bank/` files are **curated**, produced by reflection jobs, and can still be edited by hand.
- `memory.md` remains “small + core-ish”: the things you want Clawd to see every session.
### Derived store (machine recall)
Add a derived index under the workspace (not necessarily git tracked):
```
~/.openclaw/workspace/.memory/index.sqlite
```
Back it with:
- SQLite schema for facts + entity links + opinion metadata
- SQLite **FTS5** for lexical recall (fast, tiny, offline)
- optional embeddings table for semantic recall (still offline)
The index is always **rebuildable from Markdown**.
## Retain / Recall / Reflect (operational loop)
### Retain: normalize daily logs into “facts”
Hindsights key insight that matters here: store **narrative, self-contained facts**, not tiny snippets.
Practical rule for `memory/YYYY-MM-DD.md`:
- at end of day (or during), add a `## Retain` section with 25 bullets that are:
- narrative (cross-turn context preserved)
- self-contained (standalone makes sense later)
- tagged with type + entity mentions
Example:
```
## Retain
- W @Peter: Currently in Marrakech (Nov 27Dec 1, 2025) for Andys birthday.
- B @warelay: I fixed the Baileys WS crash by wrapping connection.update handlers in try/catch (see memory/2025-11-27.md).
- O(c=0.95) @Peter: Prefers concise replies (&lt;1500 chars) on WhatsApp; long content goes into files.
```
Minimal parsing:
- Type prefix: `W` (world), `B` (experience/biographical), `O` (opinion), `S` (observation/summary; usually generated)
- Entities: `@Peter`, `@warelay`, etc (slugs map to `bank/entities/*.md`)
- Opinion confidence: `O(c=0.0..1.0)` optional
If you dont want authors to think about it: the reflect job can infer these bullets from the rest of the log, but having an explicit `## Retain` section is the easiest “quality lever”.
### Recall: queries over the derived index
Recall should support:
- **lexical**: “find exact terms / names / commands” (FTS5)
- **entity**: “tell me about X” (entity pages + entity-linked facts)
- **temporal**: “what happened around Nov 27” / “since last week”
- **opinion**: “what does Peter prefer?” (with confidence + evidence)
Return format should be agent-friendly and cite sources:
- `kind` (`world|experience|opinion|observation`)
- `timestamp` (source day, or extracted time range if present)
- `entities` (`["Peter","warelay"]`)
- `content` (the narrative fact)
- `source` (`memory/2025-11-27.md#L12` etc)
### Reflect: produce stable pages + update beliefs
Reflection is a scheduled job (daily or heartbeat `ultrathink`) that:
- updates `bank/entities/*.md` from recent facts (entity summaries)
- updates `bank/opinions.md` confidence based on reinforcement/contradiction
- optionally proposes edits to `memory.md` (“core-ish” durable facts)
Opinion evolution (simple, explainable):
- each opinion has:
- statement
- confidence `c ∈ [0,1]`
- last_updated
- evidence links (supporting + contradicting fact IDs)
- when new facts arrive:
- find candidate opinions by entity overlap + similarity (FTS first, embeddings later)
- update confidence by small deltas; big jumps require strong contradiction + repeated evidence
## CLI integration: standalone vs deep integration
Recommendation: **deep integration in OpenClaw**, but keep a separable core library.
### Why integrate into OpenClaw?
- OpenClaw already knows:
- the workspace path (`agents.defaults.workspace`)
- the session model + heartbeats
- logging + troubleshooting patterns
- You want the agent itself to call the tools:
- `openclaw memory recall "…" --k 25 --since 30d`
- `openclaw memory reflect --since 7d`
### Why still split a library?
- keep memory logic testable without gateway/runtime
- reuse from other contexts (local scripts, future desktop app, etc.)
Shape:
The memory tooling is intended to be a small CLI + library layer, but this is exploratory only.
## “S-Collide” / SuCo: when to use it (research)
If “S-Collide” refers to **SuCo (Subspace Collision)**: its an ANN retrieval approach that targets strong recall/latency tradeoffs by using learned/structured collisions in subspaces (paper: arXiv 2411.14754, 2024).
Pragmatic take for `~/.openclaw/workspace`:
- **dont start** with SuCo.
- start with SQLite FTS + (optional) simple embeddings; youll get most UX wins immediately.
- consider SuCo/HNSW/ScaNN-class solutions only once:
- corpus is big (tens/hundreds of thousands of chunks)
- brute-force embedding search becomes too slow
- recall quality is meaningfully bottlenecked by lexical search
Offline-friendly alternatives (in increasing complexity):
- SQLite FTS5 + metadata filters (zero ML)
- Embeddings + brute force (works surprisingly far if chunk count is low)
- HNSW index (common, robust; needs a library binding)
- SuCo (research-grade; attractive if theres a solid implementation you can embed)
Open question:
- whats the **best** offline embedding model for “personal assistant memory” on your machines (laptop + desktop)?
- if you already have Ollama: embed with a local model; otherwise ship a small embedding model in the toolchain.
## Smallest useful pilot
If you want a minimal, still-useful version:
- Add `bank/` entity pages and a `## Retain` section in daily logs.
- Use SQLite FTS for recall with citations (path + line numbers).
- Add embeddings only if recall quality or scale demands it.
## References
- Letta / MemGPT concepts: “core memory blocks” + “archival memory” + tool-driven self-editing memory.
- Hindsight Technical Report: “retain / recall / reflect”, four-network memory, narrative fact extraction, opinion confidence evolution.
- SuCo: arXiv 2411.14754 (2024): “Subspace Collision” approximate nearest neighbor retrieval.

View File

@@ -176,12 +176,6 @@ Use these hubs to discover every page, including deep dives and reference docs t
- [Templates: TOOLS](/reference/templates/TOOLS)
- [Templates: USER](/reference/templates/USER)
## Experiments (exploratory)
- [Onboarding config protocol](/experiments/onboarding-config-protocol)
- [Research: memory](/experiments/research/memory)
- [Model config exploration](/experiments/proposals/model-config)
## Project
- [Credits](/reference/credits)

View File

@@ -1,47 +0,0 @@
---
read_when: Changing onboarding wizard steps or config schema endpoints
summary: 新手引导向导和配置模式的 RPC 协议说明
title: 新手引导和配置协议
x-i18n:
generated_at: "2026-02-03T07:47:10Z"
model: claude-opus-4-5
provider: pi
source_hash: 55163b3ee029c02476800cb616a054e5adfe97dae5bb72f2763dce0079851e06
source_path: experiments/onboarding-config-protocol.md
workflow: 15
---
# 新手引导 + 配置协议
目的CLI、macOS 应用和 Web UI 之间共享的新手引导 + 配置界面。
## 组件
- 向导引擎(共享会话 + 提示 + 新手引导状态)。
- CLI 新手引导使用与 UI 客户端相同的向导流程。
- Gateway 网关 RPC 公开向导 + 配置模式端点。
- macOS 新手引导使用向导步骤模型。
- Web UI 从 JSON Schema + UI 提示渲染配置表单。
## Gateway 网关 RPC
- `wizard.start` 参数:`{ mode?: "local"|"remote", workspace?: string }`
- `wizard.next` 参数:`{ sessionId, answer?: { stepId, value? } }`
- `wizard.cancel` 参数:`{ sessionId }`
- `wizard.status` 参数:`{ sessionId }`
- `config.schema` 参数:`{}`
响应(结构)
- 向导:`{ sessionId, done, step?, status?, error? }`
- 配置模式:`{ schema, uiHints, version, generatedAt }`
## UI 提示
- `uiHints` 按路径键入可选元数据label/help/group/order/advanced/sensitive/placeholder
- 敏感字段渲染为密码输入;无脱敏层。
- 不支持的模式节点回退到原始 JSON 编辑器。
## 注意
- 本文档是跟踪新手引导/配置协议重构的唯一位置。

View File

@@ -1,70 +0,0 @@
---
last_updated: "2026-01-05"
owner: openclaw
status: complete
summary: 加固 cron.add 输入处理,对齐 schema改进 cron UI/智能体工具
title: Cron Add 加固
x-i18n:
generated_at: "2026-02-03T07:47:26Z"
model: claude-opus-4-5
provider: pi
source_hash: d7e469674bd9435b846757ea0d5dc8f174eaa8533917fc013b1ef4f82859496d
source_path: experiments/plans/cron-add-hardening.md
workflow: 15
---
# Cron Add 加固 & Schema 对齐
## 背景
最近的 Gateway 网关日志显示重复的 `cron.add` 失败,参数无效(缺少 `sessionTarget``wakeMode``payload`,以及格式错误的 `schedule`。这表明至少有一个客户端可能是智能体工具调用路径正在发送包装的或部分指定的任务负载。另外TypeScript 中的 cron 提供商枚举、Gateway 网关 schema、CLI 标志和 UI 表单类型之间存在漂移,加上 `cron.status` 的 UI 不匹配(期望 `jobCount` 而 Gateway 网关返回 `jobs`)。
## 目标
- 通过规范化常见的包装负载并推断缺失的 `kind` 字段来停止 `cron.add` INVALID_REQUEST 垃圾。
- 在 Gateway 网关 schema、cron 类型、CLI 文档和 UI 表单之间对齐 cron 提供商列表。
- 使智能体 cron 工具 schema 明确,以便 LLM 生成正确的任务负载。
- 修复 Control UI cron 状态任务计数显示。
- 添加测试以覆盖规范化和工具行为。
## 非目标
- 更改 cron 调度语义或任务执行行为。
- 添加新的调度类型或 cron 表达式解析。
- 除了必要的字段修复外,不大改 cron 的 UI/UX。
## 发现(当前差距)
- Gateway 网关中的 `CronPayloadSchema` 排除了 `signal` + `imessage`,而 TS 类型包含它们。
- Control UI CronStatus 期望 `jobCount`,但 Gateway 网关返回 `jobs`
- 智能体 cron 工具 schema 允许任意 `job` 对象,导致格式错误的输入。
- Gateway 网关严格验证 `cron.add` 而不进行规范化,因此包装的负载会失败。
## 变更内容
- `cron.add``cron.update` 现在规范化常见的包装形式并推断缺失的 `kind` 字段。
- 智能体 cron 工具 schema 与 Gateway 网关 schema 匹配,减少无效负载。
- 提供商枚举在 Gateway 网关、CLI、UI 和 macOS 选择器之间对齐。
- Control UI 使用 Gateway 网关的 `jobs` 计数字段显示状态。
## 当前行为
- **规范化:**包装的 `data`/`job` 负载被解包;`schedule.kind``payload.kind` 在安全时被推断。
- **默认值:**当缺失时,为 `wakeMode``sessionTarget` 应用安全默认值。
- **提供商:**Discord/Slack/Signal/iMessage 现在在 CLI/UI 中一致显示。
参见 [Cron 任务](/automation/cron-jobs) 了解规范化的形式和示例。
## 验证
- 观察 Gateway 网关日志中 `cron.add` INVALID_REQUEST 错误是否减少。
- 确认 Control UI cron 状态在刷新后显示任务计数。
## 可选后续工作
- 手动 Control UI 冒烟测试:为每个提供商添加一个 cron 任务 + 验证状态任务计数。
## 开放问题
- `cron.add` 是否应该接受来自客户端的显式 `state`(当前被 schema 禁止)?
- 我们是否应该允许 `webchat` 作为显式投递提供商(当前在投递解析中被过滤)?

View File

@@ -1,45 +0,0 @@
---
read_when:
- 查看历史 Telegram 允许列表更改
summary: Telegram 允许列表加固:前缀 + 空白规范化
title: Telegram 允许列表加固
x-i18n:
generated_at: "2026-02-03T07:47:16Z"
model: claude-opus-4-5
provider: pi
source_hash: a2eca5fcc85376948cfe1b6044f1a8bc69c7f0eb94d1ceafedc1e507ba544162
source_path: experiments/plans/group-policy-hardening.md
workflow: 15
---
# Telegram 允许列表加固
**日期**2026-01-05
**状态**:已完成
**PR**#216
## 摘要
Telegram 允许列表现在不区分大小写地接受 `telegram:``tg:` 前缀,并容忍意外的空白。这使入站允许列表检查与出站发送规范化保持一致。
## 更改内容
- 前缀 `telegram:``tg:` 被同等对待(不区分大小写)。
- 允许列表条目会被修剪;空条目会被忽略。
## 示例
以下所有形式都被接受为同一 ID
- `telegram:123456`
- `TG:123456`
- `tg:123456`
## 为什么重要
从日志或聊天 ID 复制/粘贴通常会包含前缀和空白。规范化可避免在决定是否在私信或群组中响应时出现误判。
## 相关文档
- [群聊](/channels/groups)
- [Telegram 提供商](/channels/telegram)

View File

@@ -1,121 +0,0 @@
---
last_updated: "2026-01-19"
owner: openclaw
status: draft
summary: 计划:添加 OpenResponses /v1/responses 端点并干净地弃用 chat completions
title: OpenResponses Gateway 网关计划
x-i18n:
generated_at: "2026-02-03T07:47:33Z"
model: claude-opus-4-5
provider: pi
source_hash: 71a22c48397507d1648b40766a3153e420c54f2a2d5186d07e51eb3d12e4636a
source_path: experiments/plans/openresponses-gateway.md
workflow: 15
---
# OpenResponses Gateway 网关集成计划
## 背景
OpenClaw Gateway 网关目前在 `/v1/chat/completions` 暴露了一个最小的 OpenAI 兼容 Chat Completions 端点(参见 [OpenAI Chat Completions](/gateway/openai-http-api))。
Open Responses 是基于 OpenAI Responses API 的开放推理标准。它专为智能体工作流设计使用基于项目的输入加语义流式事件。OpenResponses 规范定义的是 `/v1/responses`,而不是 `/v1/chat/completions`
## 目标
- 添加一个遵循 OpenResponses 语义的 `/v1/responses` 端点。
- 保留 Chat Completions 作为兼容层,易于禁用并最终移除。
- 使用隔离的、可复用的 schema 标准化验证和解析。
## 非目标
- 第一阶段完全实现 OpenResponses 功能(图片、文件、托管工具)。
- 替换内部智能体执行逻辑或工具编排。
- 在第一阶段更改现有的 `/v1/chat/completions` 行为。
## 研究摘要
来源OpenResponses OpenAPI、OpenResponses 规范网站和 Hugging Face 博客文章。
提取的关键点:
- `POST /v1/responses` 接受 `CreateResponseBody` 字段,如 `model``input`(字符串或 `ItemParam[]`)、`instructions``tools``tool_choice``stream``max_output_tokens``max_tool_calls`
- `ItemParam` 是以下类型的可区分联合:
- 具有角色 `system``developer``user``assistant``message`
- `function_call``function_call_output`
- `reasoning`
- `item_reference`
- 成功响应返回带有 `object: "response"``status``output` 项的 `ResponseResource`
- 流式传输使用语义事件,如:
- `response.created``response.in_progress``response.completed``response.failed`
- `response.output_item.added``response.output_item.done`
- `response.content_part.added``response.content_part.done`
- `response.output_text.delta``response.output_text.done`
- 规范要求:
- `Content-Type: text/event-stream`
- `event:` 必须匹配 JSON `type` 字段
- 终止事件必须是字面量 `[DONE]`
- Reasoning 项可能暴露 `content``encrypted_content``summary`
- HF 示例在请求中包含 `OpenResponses-Version: latest`(可选头部)。
## 提议的架构
- 添加 `src/gateway/open-responses.schema.ts`,仅包含 Zod schema无 gateway 导入)。
- 添加 `src/gateway/openresponses-http.ts`(或 `open-responses-http.ts`)用于 `/v1/responses`
- 保持 `src/gateway/openai-http.ts` 不变,作为遗留兼容适配器。
- 添加配置 `gateway.http.endpoints.responses.enabled`(默认 `false`)。
- 保持 `gateway.http.endpoints.chatCompletions.enabled` 独立;允许两个端点分别切换。
- 当 Chat Completions 启用时发出启动警告,以表明其遗留状态。
## Chat Completions 弃用路径
- 保持严格的模块边界responses 和 chat completions 之间不共享 schema 类型。
- 通过配置使 Chat Completions 成为可选,这样无需代码更改即可禁用。
- 一旦 `/v1/responses` 稳定,更新文档将 Chat Completions 标记为遗留。
- 可选的未来步骤:将 Chat Completions 请求映射到 Responses 处理器,以便更简单地移除。
## 第一阶段支持子集
- 接受 `input` 为字符串或带有消息角色和 `function_call_output``ItemParam[]`
- 将 system 和 developer 消息提取到 `extraSystemPrompt` 中。
- 使用最近的 `user``function_call_output` 作为智能体运行的当前消息。
- 对不支持的内容部分(图片/文件)返回 `invalid_request_error` 拒绝。
- 返回带有 `output_text` 内容的单个助手消息。
- 返回带有零值的 `usage`,直到 token 计数接入。
## 验证策略(无 SDK
- 为以下支持子集实现 Zod schema
- `CreateResponseBody`
- `ItemParam` + 消息内容部分联合
- `ResponseResource`
- Gateway 网关使用的流式事件形状
- 将 schema 保存在单个隔离模块中,以避免漂移并允许未来代码生成。
## 流式实现(第一阶段)
- 带有 `event:``data:` 的 SSE 行。
- 所需序列(最小可行):
- `response.created`
- `response.output_item.added`
- `response.content_part.added`
- `response.output_text.delta`(根据需要重复)
- `response.output_text.done`
- `response.content_part.done`
- `response.completed`
- `[DONE]`
## 测试和验证计划
-`/v1/responses` 添加端到端覆盖:
- 需要认证
- 非流式响应形状
- 流式事件顺序和 `[DONE]`
- 使用头部和 `user` 的会话路由
- 保持 `src/gateway/openai-http.e2e.test.ts` 不变。
- 手动:用 `stream: true` curl `/v1/responses` 并验证事件顺序和终止 `[DONE]`
## 文档更新(后续)
-`/v1/responses` 使用和示例添加新文档页面。
- 更新 `/gateway/openai-http-api`,添加遗留说明和指向 `/v1/responses` 的指针。

View File

@@ -1,42 +0,0 @@
---
read_when:
- 探索未来模型选择和认证配置文件的方案
summary: 探索:模型配置、认证配置文件和回退行为
title: 模型配置探索
x-i18n:
generated_at: "2026-02-01T20:25:05Z"
model: claude-opus-4-5
provider: pi
source_hash: 48623233d80f874c0ae853b51f888599cf8b50ae6fbfe47f6d7b0216bae9500b
source_path: experiments/proposals/model-config.md
workflow: 14
---
# 模型配置(探索)
本文档记录了未来模型配置的**构想**。这不是正式的发布规范。如需了解当前行为,请参阅:
- [模型](/concepts/models)
- [模型故障转移](/concepts/model-failover)
- [OAuth + 配置文件](/concepts/oauth)
## 动机
运营者希望:
- 每个提供商支持多个认证配置文件(个人 vs 工作)。
- 简单的 `/model` 选择,并具有可预测的回退行为。
- 文本模型与图像模型之间有清晰的分离。
## 可能的方向(高层级)
- 保持模型选择简洁:`provider/model` 加可选别名。
- 允许提供商拥有多个认证配置文件,并指定明确的顺序。
- 使用全局回退列表,使所有会话以一致的方式进行故障转移。
- 仅在明确配置时才覆盖图像路由。
## 待解决的问题
- 配置文件轮换应该按提供商还是按模型进行?
- UI 应如何为会话展示配置文件选择?
- 从旧版配置键迁移的最安全路径是什么?

View File

@@ -1,235 +0,0 @@
---
read_when:
- 设计超越每日 Markdown 日志的工作区记忆(~/.openclaw/workspace
- Deciding: standalone CLI vs deep OpenClaw integration
- 添加离线回忆 + 反思retain/recall/reflect
summary: 研究笔记Clawd 工作区的离线记忆系统Markdown 作为数据源 + 派生索引)
title: 工作区记忆研究
x-i18n:
generated_at: "2026-02-03T10:06:14Z"
model: claude-opus-4-5
provider: pi
source_hash: 1753c8ee6284999fab4a94ff5fae7421c85233699c9d3088453d0c2133ac0feb
source_path: experiments/research/memory.md
workflow: 15
---
# 工作区记忆 v2离线研究笔记
目标Clawd 风格的工作区(`agents.defaults.workspace`,默认 `~/.openclaw/workspace`),其中"记忆"以每天一个 Markdown 文件(`memory/YYYY-MM-DD.md`)加上一小组稳定文件(例如 `memory.md``SOUL.md`)的形式存储。
本文档提出一种**离线优先**的记忆架构,保持 Markdown 作为规范的、可审查的数据源,但通过派生索引添加**结构化回忆**(搜索、实体摘要、置信度更新)。
## 为什么要改变?
当前设置(每天一个文件)非常适合:
- "仅追加"式日志记录
- 人工编辑
- git 支持的持久性 + 可审计性
- 低摩擦捕获("直接写下来"
但它在以下方面较弱:
- 高召回率检索("我们对 X 做了什么决定?"、"上次我们尝试 Y 时?"
- 以实体为中心的答案("告诉我关于 Alice / The Castle / warelay 的信息")而无需重读多个文件
- 观点/偏好稳定性(以及变化时的证据)
- 时间约束("2025 年 11 月期间什么是真实的?")和冲突解决
## 设计目标
- **离线**:无需网络即可工作;可在笔记本电脑/Castle 上运行;无云依赖。
- **可解释**:检索的项目应该可归因(文件 + 位置)并与推理分离。
- **低仪式感**:每日日志保持 Markdown无需繁重的 schema 工作。
- **增量式**v1 仅使用 FTS 就很有用;语义/向量和图是可选升级。
- **对智能体友好**:使"在 token 预算内回忆"变得简单(返回小型事实包)。
## 北极星模型Hindsight × Letta
需要融合两个部分:
1. **Letta/MemGPT 风格的控制循环**
- 保持一个小的"核心"始终在上下文中(角色 + 关键用户事实)
- 其他所有内容都在上下文之外,通过工具检索
- 记忆写入是显式的工具调用append/replace/insert持久化后在下一轮重新注入
2. **Hindsight 风格的记忆基底**
- 分离观察到的、相信的和总结的内容
- 支持 retain/recall/reflect
- 带有置信度的观点可以随证据演变
- 实体感知检索 + 时间查询(即使没有完整的知识图谱)
## 提议的架构Markdown 数据源 + 派生索引)
### 规范存储git 友好)
保持 `~/.openclaw/workspace` 作为规范的人类可读记忆。
建议的工作区布局:
```
~/.openclaw/workspace/
memory.md # 小型:持久事实 + 偏好(类似核心)
memory/
YYYY-MM-DD.md # 每日日志(追加;叙事)
bank/ # "类型化"记忆页面(稳定、可审查)
world.md # 关于世界的客观事实
experience.md # 智能体做了什么(第一人称)
opinions.md # 主观偏好/判断 + 置信度 + 证据指针
entities/
Peter.md
The-Castle.md
warelay.md
...
```
注意:
- **每日日志保持为每日日志**。无需将其转换为 JSON。
- `bank/` 文件是**经过整理的**,由反思任务生成,仍可手动编辑。
- `memory.md` 保持"小型 + 类似核心":你希望 Clawd 每次会话都能看到的内容。
### 派生存储(机器回忆)
在工作区下添加派生索引(不一定需要 git 跟踪):
```
~/.openclaw/workspace/.memory/index.sqlite
```
后端支持:
- 用于事实 + 实体链接 + 观点元数据的 SQLite schema
- SQLite **FTS5** 用于词法回忆(快速、小巧、离线)
- 可选的嵌入表用于语义回忆(仍然离线)
索引始终**可从 Markdown 重建**。
## Retain / Recall / Reflect操作循环
### Retain将每日日志规范化为"事实"
Hindsight 在这里重要的关键洞察:存储**叙事性、自包含的事实**,而不是微小的片段。
`memory/YYYY-MM-DD.md` 的实用规则:
- 在一天结束时(或期间),添加一个 `## Retain` 部分,包含 2-5 个要点:
- 叙事性(保留跨轮上下文)
- 自包含(独立时也有意义)
- 标记类型 + 实体提及
示例:
```
## Retain
- W @Peter: Currently in Marrakech (Nov 27Dec 1, 2025) for Andy's birthday.
- B @warelay: I fixed the Baileys WS crash by wrapping connection.update handlers in try/catch (see memory/2025-11-27.md).
- O(c=0.95) @Peter: Prefers concise replies (&lt;1500 chars) on WhatsApp; long content goes into files.
```
最小化解析:
- 类型前缀:`W`(世界)、`B`(经历/传记)、`O`(观点)、`S`(观察/摘要;通常是生成的)
- 实体:`@Peter``@warelay`slug 映射到 `bank/entities/*.md`
- 观点置信度:`O(c=0.0..1.0)` 可选
如果你不想让作者考虑这些:反思任务可以从日志的其余部分推断这些要点,但有一个显式的 `## Retain` 部分是最简单的"质量杠杆"。
### Recall对派生索引的查询
Recall 应支持:
- **词法**"查找精确的术语/名称/命令"FTS5
- **实体**"告诉我关于 X 的信息"(实体页面 + 实体链接的事实)
- **时间**"11 月 27 日前后发生了什么"/"自上周以来"
- **观点**"Peter 偏好什么?"(带置信度 + 证据)
返回格式应对智能体友好并引用来源:
- `kind``world|experience|opinion|observation`
- `timestamp`(来源日期,或如果存在则提取的时间范围)
- `entities``["Peter","warelay"]`
- `content`(叙事性事实)
- `source``memory/2025-11-27.md#L12` 等)
### Reflect生成稳定页面 + 更新信念
反思是一个定时任务(每日或心跳 `ultrathink`),它:
- 根据最近的事实更新 `bank/entities/*.md`(实体摘要)
- 根据强化/矛盾更新 `bank/opinions.md` 置信度
- 可选地提议对 `memory.md`"类似核心"的持久事实)的编辑
观点演变(简单、可解释):
- 每个观点有:
- 陈述
- 置信度 `c ∈ [0,1]`
- last_updated
- 证据链接(支持 + 矛盾的事实 ID
- 当新事实到达时:
- 通过实体重叠 + 相似性找到候选观点(先 FTS后嵌入
- 通过小幅增量更新置信度;大幅跳跃需要强矛盾 + 重复证据
## CLI 集成:独立 vs 深度集成
建议:**深度集成到 OpenClaw**,但保持可分离的核心库。
### 为什么要集成到 OpenClaw
- OpenClaw 已经知道:
- 工作区路径(`agents.defaults.workspace`
- 会话模型 + 心跳
- 日志记录 + 故障排除模式
- 你希望智能体自己调用工具:
- `openclaw memory recall "…" --k 25 --since 30d`
- `openclaw memory reflect --since 7d`
### 为什么仍要分离库?
- 保持记忆逻辑可测试,无需 Gateway 网关/运行时
- 可从其他上下文重用(本地脚本、未来的桌面应用等)
形态:
记忆工具预计是一个小型 CLI + 库层,但这仅是探索性的。
## "S-Collide" / SuCo何时使用研究
如果"S-Collide"指的是 **SuCoSubspace Collision**:这是一种 ANN 检索方法,通过在子空间中使用学习/结构化碰撞来实现强召回/延迟权衡论文arXiv 2411.147542024
对于 `~/.openclaw/workspace` 的务实观点:
- **不要从** SuCo 开始。
- 从 SQLite FTS +(可选的)简单嵌入开始;你会立即获得大部分 UX 收益。
- 仅在以下情况下考虑 SuCo/HNSW/ScaNN 级别的解决方案:
- 语料库很大(数万/数十万个块)
- 暴力嵌入搜索变得太慢
- 召回质量明显受到词法搜索的瓶颈限制
离线友好的替代方案(按复杂性递增):
- SQLite FTS5 + 元数据过滤(零 ML
- 嵌入 + 暴力搜索(如果块数量低,效果出奇地好)
- HNSW 索引(常见、稳健;需要库绑定)
- SuCo研究级如果有可嵌入的可靠实现则很有吸引力
开放问题:
- 对于你的机器(笔记本 + 台式机)上的"个人助理记忆"**最佳**的离线嵌入模型是什么?
- 如果你已经有 Ollama使用本地模型嵌入否则在工具链中附带一个小型嵌入模型。
## 最小可用试点
如果你想要一个最小但仍有用的版本:
- 添加 `bank/` 实体页面和每日日志中的 `## Retain` 部分。
- 使用 SQLite FTS 进行带引用的回忆(路径 + 行号)。
- 仅在召回质量或规模需要时添加嵌入。
## 参考资料
- Letta / MemGPT 概念:"核心记忆块" + "档案记忆" + 工具驱动的自编辑记忆。
- Hindsight 技术报告:"retain / recall / reflect",四网络记忆,叙事性事实提取,观点置信度演变。
- SuCoarXiv 2411.147542024"Subspace Collision"近似最近邻检索。