diff --git a/docs/cli/path.md b/docs/cli/path.md
index 89d313de9f4..a9555963bcd 100644
--- a/docs/cli/path.md
+++ b/docs/cli/path.md
@@ -9,11 +9,19 @@ title: "Path"
 
 # `openclaw path`
 
-Plugin-provided shell access to the `oc://` addressing substrate — one universal,
-kind-dispatched path scheme for inspecting and surgically editing workspace
-files (markdown, jsonc, jsonl). Self-hosters and editor extensions use
-it to read or write a single leaf inside a workspace file without scripting
-against the SDK directly.
+Plugin-provided shell access to the `oc://` addressing substrate: one
+kind-dispatched path scheme for inspecting and editing addressable workspace
+files (markdown, jsonc, jsonl). Self-hosters, plugin authors, and editor
+extensions use it to read, find, or update a narrow location without
+hand-rolling per-file parsers.
+
+The CLI mirrors the substrate's public verbs:
+
+- `resolve` is concrete and single-match.
+- `find` is the multi-match verb for wildcards, unions, predicates, and
+  positional expansion.
+- `set` only accepts concrete paths or insertion markers; wildcard patterns are
+  rejected before writing.
 
 `path` is provided by the bundled optional `oc-path` plugin. Enable it
 before first use:
@@ -26,10 +34,10 @@ openclaw plugins enable oc-path
 
 | Subcommand | Purpose |
 | ----------------------- | ---------------------------------------------------------------------------- |
-| `resolve ` | Print the match at the path (or "not found"). |
-| `find ` | Enumerate matches for a wildcard / predicate path. |
-| `set ` | Write a leaf at the path. Supports `--dry-run`. |
-| `validate ` | Parse-only — print structural breakdown (file / section / item / field). |
+| `resolve ` | Print the concrete match at the path (or "not found"). |
+| `find ` | Enumerate matches for a wildcard / union / predicate path. |
+| `set ` | Write a leaf or insertion target at a concrete path. Supports `--dry-run`. |
+| `validate ` | Parse-only; print structural breakdown (file / section / item / field). |
 | `emit ` | Round-trip a file through `parseXxx` + `emitXxx` (byte-fidelity diagnostic). |
 
 ## Global flags
@@ -48,7 +56,7 @@ openclaw plugins enable oc-path
 oc://FILE/SECTION/ITEM/FIELD?session=SCOPE
 ```
 
-Slot rules — `field` requires `item`, `item` requires `section`. Across all
+Slot rules: `field` requires `item`, and `item` requires `section`. Across all
 four slots:
 
 - **Quoted segments** — `"a/b.c"` survives `/` and `.` separators.
@@ -65,12 +73,49 @@ four slots:
 - **Ordinal** — `#N` for Nth match by document order.
 - **Insertion markers** — `+`, `+key`, `+nnn` for keyed / indexed
   insertion (use with `set`).
-- **Session scope** — `?session=cron:daily` etc. Orthogonal to slot
-  nesting.
+- **Session scope** — `?session=cron-daily` etc. Orthogonal to slot
+  nesting. Session values are raw, not percent-decoded; they may not contain
+  control characters or reserved query delimiters (`?`, `&`, `%`).
 
 Reserved characters (`?`, `&`, `%`) outside quoted, predicate, or union
-segments are rejected. Control characters (U+0000–U+001F, U+007F) are
-rejected anywhere.
+segments are rejected. Control characters (U+0000-U+001F, U+007F) are rejected
+anywhere, including the `session` query value.
+
+`formatOcPath(parseOcPath(path)) === path` is guaranteed for canonical paths.
+Non-canonical query parameters are ignored except for the first non-empty
+`session=` value.
+
+## Addressing by file kind
+
+| Kind | Addressing model |
+| ---------- | -------------------------------------------------------------------------------- |
+| Markdown | H2 sections by slug, bullet items by slug or `#N`, frontmatter via `[frontmatter]`. |
+| JSONC/JSON | Object keys and array indexes; dots split nested sub-segments unless quoted. |
+| JSONL | Top-level line addresses (`L1`, `L2`, `$last`), then JSONC-style descent inside the line. |
+
+`resolve` returns a structured match: `root`, `node`, `leaf`, or
+`insertion-point`, with a 1-based line number. Leaf values are surfaced as text
+plus a `leafType` so plugin authors can render previews without depending on
+the per-kind AST shape.
+
+## Mutation contract
+
+`set` writes one concrete target:
+
+- Markdown frontmatter values and `- key: value` item fields are string leaves.
+  Markdown insertions append sections, frontmatter keys, or section items and
+  render a canonical markdown shape for the changed file.
+- JSONC leaf writes coerce the string value to the existing leaf type
+  (`string`, finite `number`, `true`/`false`, or `null`). JSONC object and array
+  insertions parse `` as JSON and use the `jsonc-parser` edit path for
+  ordinary leaf writes, preserving comments and nearby formatting.
+- JSONL leaf writes coerce like JSONC inside a line. Whole-line replacement and
+  append parse `` as JSON. Rendered JSONL preserves the file's dominant
+  LF/CRLF line-ending convention.
+
+Use `--dry-run` before user-visible writes when the exact bytes matter. The
+substrate preserves byte-identical output for parse/emit round-trips, but a
+mutation can canonicalize the edited region or file depending on kind.
 
 ## Examples
 
@@ -94,6 +139,40 @@ openclaw path set 'oc://gateway.jsonc/version' '2.0'
 openclaw path emit ./AGENTS.md
 ```
 
+More grammar examples:
+
+```bash
+# Quote keys containing / or .
+openclaw path resolve 'oc://config.jsonc/agents.defaults.models/"anthropic/claude-opus-4-7"/alias'
+
+# Predicate search over JSONC children
+openclaw path find 'oc://config.jsonc/plugins/[enabled=true]/id'
+
+# Insert into a JSONC array
+openclaw path set 'oc://config.jsonc/items/+1' '{"id":"new","enabled":true}' --dry-run
+
+# Insert a JSONC object key
+openclaw path set 'oc://config.jsonc/plugins/+github' '{"enabled":true}' --dry-run
+
+# Append a JSONL event
+openclaw path set 'oc://session.jsonl/+' '{"event":"checkpoint","ok":true}' --file ./logs/session.jsonl
+
+# Resolve the last JSONL value line
+openclaw path resolve 'oc://session.jsonl/$last/event' --file ./logs/session.jsonl
+
+# Address markdown frontmatter
+openclaw path resolve 'oc://AGENTS.md/[frontmatter]/name'
+
+# Insert markdown frontmatter
+openclaw path set 'oc://AGENTS.md/[frontmatter]/+description' 'Agent instructions' --dry-run
+
+# Find markdown item fields
+openclaw path find 'oc://SKILL.md/Tools/*/send_email'
+
+# Validate a session-scoped path
+openclaw path validate 'oc://AGENTS.md/Tools/$last/risk?session=cron-daily'
+```
+
 ## Exit codes
 
 | Code | Meaning |
@@ -110,7 +189,7 @@ auto-detection.
 
 ## Notes
 
-- `set` writes raw bytes through the substrate's emit path, which applies the
+- `set` writes bytes through the substrate's emit path, which applies the
   redaction-sentinel guard automatically. A leaf carrying
   `__OPENCLAW_REDACTED__` (verbatim or as a substring) is refused at write time.
 
diff --git a/extensions/oc-path/src/oc-path/dispatch.ts b/extensions/oc-path/src/oc-path/dispatch.ts
index 6d2ab0a0519..0bad6fe4493 100644
--- a/extensions/oc-path/src/oc-path/dispatch.ts
+++ b/extensions/oc-path/src/oc-path/dispatch.ts
@@ -1,14 +1,11 @@
 /**
- * Cross-kind utilities. The substrate exposes per-kind verbs only;
- * `inferKind` is a convention helper for callers who want to map
- * filename → kind so they can pick the right `parseXxx` / `setXxx` /
- * `resolveXxx` function.
+ * Cross-kind utilities. `inferKind` is a convention helper for callers
+ * who want to map filename to the parser they should use before calling
+ * the universal verbs (`resolveOcPath`, `findOcPaths`, `setOcPath`).
  *
- * Earlier drafts had `resolveOcPath` / `setOcPath` / `appendOcPath`
- * universal dispatchers with tagged-union AST inputs. They were dropped
- * — the kind tag bled through every consumer (lint runner, doctor
- * fixers, tests) since those code paths still needed to know the kind
- * to use the result. Per-kind verbs are honest about input/output.
+ * Encoding remains per-kind (`parseMd`, `parseJsonc`, `parseJsonl`),
+ * while addressing and mutation dispatch are universal once callers
+ * have an AST carrying its `kind` discriminator.
  *
  * @module @openclaw/oc-path/dispatch
  */
diff --git a/extensions/oc-path/src/oc-path/index.ts b/extensions/oc-path/src/oc-path/index.ts
index 6c421dcdd05..54d24ccdbea 100644
--- a/extensions/oc-path/src/oc-path/index.ts
+++ b/extensions/oc-path/src/oc-path/index.ts
@@ -7,16 +7,17 @@
  * addressing (resolve/set) is universal.
  *
  * **Public verbs**:
- * - One `setOcPath(ast, path, value)` — universal, kind-dispatched
- * - One `resolveOcPath(ast, path)` — universal, kind-dispatched
- * - Per-kind `parseXxx` / `emitXxx` (parsing IS per-kind by nature)
+ * - One `resolveOcPath(ast, path)` - concrete, kind-dispatched
+ * - One `findOcPaths(ast, pattern)` - multi-match, kind-dispatched
+ * - One `setOcPath(ast, path, value)` - concrete mutation / insertion
+ * - Per-kind `parseXxx` / `emitXxx` (parsing is per-kind by nature)
  *
  * `setOcPath` accepts a string value; the substrate coerces based on
  * AST shape at the path location. The OcPath syntax encodes the
  * operation: plain path = leaf set, `+` suffix = insertion.
  *
  * Per-kind set/resolve helpers exist as internal implementation; they
- * aren't on the public surface. Callers don't need to pick a kind —
+ * aren't on the public surface. Callers don't need to pick a kind -
  * the AST carries its `kind` discriminator and the universal verbs
  * dispatch internally.
  *
diff --git a/extensions/oc-path/src/oc-path/jsonl/edit.ts b/extensions/oc-path/src/oc-path/jsonl/edit.ts
index 397f3d30f22..180eecf55ca 100644
--- a/extensions/oc-path/src/oc-path/jsonl/edit.ts
+++ b/extensions/oc-path/src/oc-path/jsonl/edit.ts
@@ -188,7 +188,12 @@ export function appendJsonlOcPath(ast: JsonlAst, value: JsoncValue): JsonlAst {
     value,
     raw: "",
   };
-  const next: JsonlAst = { kind: "jsonl", raw: "", lines: [...ast.lines, newLine] };
+  const next: JsonlAst = {
+    kind: "jsonl",
+    raw: "",
+    lines: [...ast.lines, newLine],
+    ...(ast.lineEnding !== undefined ? { lineEnding: ast.lineEnding } : {}),
+  };
   const rendered = emitJsonl(next, { mode: "render" });
   return { ...next, raw: rendered };
 }
diff --git a/extensions/oc-path/src/oc-path/oc-path.ts b/extensions/oc-path/src/oc-path/oc-path.ts
index cfc35283b65..44cd3d9c640 100644
--- a/extensions/oc-path/src/oc-path/oc-path.ts
+++ b/extensions/oc-path/src/oc-path/oc-path.ts
@@ -3,7 +3,9 @@
  *
  * oc://{file}[/{section}[/{item}[/{field}]]][?session={id}]
  *
- * Round-trip contract: `formatOcPath(parseOcPath(s)) === s`.
+ * Canonical round-trip contract: `formatOcPath(parseOcPath(s)) === s`
+ * for canonical paths. Extra query parameters are ignored except for
+ * the first non-empty `session=` value.
  *
  * @module @openclaw/oc-path/oc-path
  */
@@ -49,7 +51,9 @@ function printable(s: string): string {
  * Parsed `oc://` path. Components nest strictly: `item` implies
  * `section`, `field` implies `item`. `field` directly under file
  * addresses a frontmatter key; under item it addresses the value of a
- * `- key: value` bullet.
+ * `- key: value` bullet. `session` is an opaque raw scope string; it is
+ * not percent-decoded and cannot contain control characters or reserved
+ * query delimiters (`?`, `&`, `%`).
  */
 export interface OcPath {
   readonly file: string;
@@ -102,6 +106,23 @@ function validateFileSlot(file: string, contextInput: string): void {
   }
 }
 
+function validateSessionSlot(session: string, contextInput: string): void {
+  if (hasControlChar(session)) {
+    fail(
+      `Control character in oc:// session query: ${printable(contextInput)}`,
+      contextInput,
+      "OC_PATH_CONTROL_CHAR",
+    );
+  }
+  if (RESERVED_CHARS_RE.test(session)) {
+    fail(
+      `Reserved character (\`?\` / \`&\` / \`%\`) in oc:// session query: ${printable(contextInput)}`,
+      contextInput,
+      "OC_PATH_RESERVED_CHAR",
+    );
+  }
+}
+
 /** Parse an `oc://` path string into a structured `OcPath`. */
 export function parseOcPath(input: string): OcPath {
   if (typeof input !== "string") {
@@ -131,6 +152,13 @@ export function parseOcPath(input: string): OcPath {
   if (!normalized.startsWith(OC_SCHEME)) {
     fail(`Missing oc:// scheme: ${printable(input)}`, input, "OC_PATH_MISSING_SCHEME");
   }
+  if (hasControlChar(normalized)) {
+    fail(
+      `Control character in oc:// path: ${printable(input)}`,
+      input,
+      "OC_PATH_CONTROL_CHAR",
+    );
+  }
   const afterScheme = normalized.slice(OC_SCHEME.length);
 
   // Top-level split skips quoted keys so `"foo?bar"` isn't broken.
@@ -178,7 +206,7 @@ export function parseOcPath(input: string): OcPath {
   const file = isQuotedSeg(fileSeg) ? unquoteSeg(fileSeg) : fileSeg;
   validateFileSlot(file, input);
 
-  const session = extractSession(queryPart);
+  const session = extractSession(queryPart, input);
   return {
     file,
     ...(segments[1] !== undefined ? { section: segments[1] } : {}),
@@ -244,7 +272,10 @@ export function formatOcPath(path: OcPath): string {
   if (path.section !== undefined) {out += "/" + formatSlot(path.section, "section");}
   if (path.item !== undefined) {out += "/" + formatSlot(path.item, "item");}
   if (path.field !== undefined) {out += "/" + formatSlot(path.field, "field");}
-  if (path.session !== undefined) {out += "?session=" + path.session;}
+  if (path.session !== undefined) {
+    validateSessionSlot(path.session, path.file);
+    out += "?session=" + path.session;
+  }
 
   if (out.length > MAX_PATH_LENGTH) {
     fail(
@@ -464,14 +495,17 @@ export function repackPath(pattern: OcPath, subs: readonly string[]): OcPath {
   };
 }
 
-function extractSession(queryPart: string): string | undefined {
+function extractSession(queryPart: string, input: string): string | undefined {
   if (queryPart.length === 0) {return undefined;}
   for (const pair of queryPart.split("&")) {
     const eqIndex = pair.indexOf("=");
     if (eqIndex === -1) {continue;}
     const key = pair.slice(0, eqIndex);
     const value = pair.slice(eqIndex + 1);
-    if (key === "session" && value.length > 0) {return value;}
+    if (key === "session" && value.length > 0) {
+      validateSessionSlot(value, input);
+      return value;
+    }
   }
   return undefined;
 }
diff --git a/extensions/oc-path/src/oc-path/tests/jsonl/edit.test.ts b/extensions/oc-path/src/oc-path/tests/jsonl/edit.test.ts
index 7ae4c97be26..68f50eabba2 100644
--- a/extensions/oc-path/src/oc-path/tests/jsonl/edit.test.ts
+++ b/extensions/oc-path/src/oc-path/tests/jsonl/edit.test.ts
@@ -85,6 +85,15 @@ describe("appendJsonlOcPath — session checkpointing primitive", () => {
     expect(out).toHaveLength(2);
     expect(JSON.parse(out[1] ?? "")).toEqual({ b: 2 });
   });
+
+  it("preserves CRLF line endings when appending", () => {
+    const { ast } = parseJsonl('{"a":1}\r\n');
+    const next = appendJsonlOcPath(ast, {
+      kind: "object",
+      entries: [{ key: "b", line: 0, value: { kind: "number", value: 2 } }],
+    });
+    expect(emitJsonl(next)).toBe('{"a":1}\r\n{"b":2}');
+  });
 });
 
 describe("setJsonlOcPath — $last line address", () => {
diff --git a/extensions/oc-path/src/oc-path/tests/oc-path.test.ts b/extensions/oc-path/src/oc-path/tests/oc-path.test.ts
index e7536aebfe1..eda06d994c8 100644
--- a/extensions/oc-path/src/oc-path/tests/oc-path.test.ts
+++ b/extensions/oc-path/src/oc-path/tests/oc-path.test.ts
@@ -37,6 +37,27 @@ describe("parseOcPath", () => {
     });
   });
 
+  it("rejects reserved chars in session query values", () => {
+    expectOcPathError(
+      () => parseOcPath("oc://SOUL.md?session=cron%2Fdaily"),
+      "OC_PATH_RESERVED_CHAR",
+    );
+  });
+
+  it("rejects control chars in session query values", () => {
+    expectOcPathError(
+      () => parseOcPath("oc://SOUL.md?session=daily\x00cron"),
+      "OC_PATH_CONTROL_CHAR",
+    );
+  });
+
+  it("rejects control chars in ignored query values", () => {
+    expectOcPathError(
+      () => parseOcPath("oc://SOUL.md?ignored=\x00"),
+      "OC_PATH_CONTROL_CHAR",
+    );
+  });
+
   it("rejects missing scheme", () => {
     expectOcPathError(() => parseOcPath("SOUL.md"), "OC_PATH_MISSING_SCHEME");
   });
@@ -88,6 +109,13 @@ describe("formatOcPath", () => {
     expect(formatOcPath({ file: "SOUL.md", session: "cron" })).toBe("oc://SOUL.md?session=cron");
   });
 
+  it("rejects reserved chars in formatted session values", () => {
+    expectOcPathError(
+      () => formatOcPath({ file: "SOUL.md", session: "cron&scope=daily" }),
+      "OC_PATH_RESERVED_CHAR",
+    );
+  });
+
   it("rejects empty file", () => {
     expectOcPathError(() => formatOcPath({ file: "" }), "OC_PATH_FILE_REQUIRED");
   });
diff --git a/extensions/oc-path/src/oc-path/universal.ts b/extensions/oc-path/src/oc-path/universal.ts
index a7e4499213e..cf05676f42b 100644
--- a/extensions/oc-path/src/oc-path/universal.ts
+++ b/extensions/oc-path/src/oc-path/universal.ts
@@ -2,7 +2,9 @@
  * Universal `setOcPath` / `resolveOcPath` / `detectInsertion`.
  * Addressing is universal; encoding is per-kind. Callers pass any AST
  * + path + value; the substrate dispatches on `ast.kind` and coerces
- * the value based on the AST shape at the resolution point.
+ * the value based on the AST shape at the resolution point. Wildcard,
+ * union, and predicate expansion belong to `findOcPaths`; `resolveOcPath`
+ * and `setOcPath` require concrete paths.
  *
  * oc://FILE/section/item/field → leaf address
  * oc://FILE/section/+ → end-insertion