mirror of
https://github.com/openclaw/openclaw.git
synced 2026-05-06 05:10:44 +00:00
main
1 Commits
| Author | SHA1 | Message | Date | |
|---|---|---|---|---|
|
|
0603c2327d |
fix(file-transfer): require canonical node policy authorization (#74742)
* feat(file-transfer): add bundled plugin for binary file ops on nodes
New extensions/file-transfer/ plugin exposing four agent tools
(file_fetch, dir_list, dir_fetch, file_write) and four matching
node-host commands (file.fetch, dir.list, dir.fetch, file.write).
Lets agents read and write files on paired nodes by absolute path,
bypassing the bash output cap (200KB) and the live tool-result
text cap that would otherwise truncate base64 payloads.
Public surface
--------------
- file_fetch({ node, path, maxBytes? })
Image MIMEs return image content blocks; small text (<=8 KB) inlines
as text content; everything else returns a saved-media-path text
block. sha256-verified end-to-end.
- dir_list({ node, path, pageToken?, maxEntries? })
Structured directory listing — name, path, size, mimeType, isDir,
mtime. Paginated. No content transfer.
- dir_fetch({ node, path, maxBytes?, includeDotfiles? })
Server-side tar -czf streamed back, unpacked into the gateway media
store, returns a manifest of saved paths. Single round-trip.
60s wall-clock timeouts on tar create/unpack. tar -xzf without -P
rejects absolute paths in archive entries.
- file_write({ node, path, contentBase64, mimeType?, overwrite?,
createParents? })
Atomic write (temp + rename). Refuses to overwrite by default.
Refuses to write through symlinks (lstat check). Buffer-side
sha256 (no read-back race). Pair with file_fetch to round-trip
files between nodes — DO NOT use exec/cp for file copies.
All four commands gated by:
- dangerous-by-default node command policy
(gateway.nodes.allowCommands opt-in)
- per-node path policy (gateway.nodes.fileTransfer)
- optional operator approval prompt (ask: off | on-miss | always)
16 MB raw byte ceiling per single-frame round-trip (25 MB WS frame
with ~33% base64 overhead and JSON envelope). 8 MB defaults.
Path policy and approvals
-------------------------
Default behavior is DENY. The operator must explicitly opt in:
{
"gateway": {
"nodes": {
"fileTransfer": {
"<nodeId-or-displayName>": {
"ask": "off" | "on-miss" | "always",
"allowReadPaths": ["~/Screenshots/**", "/tmp/**"],
"allowWritePaths": ["~/Downloads/**"],
"denyPaths": ["**/.ssh/**", "**/.aws/**"],
"maxBytes": 16777216
},
"*": { "ask": "on-miss" }
}
}
}
}
ask modes:
off — silent: allow if matched, deny if not (default)
on-miss — silent allow if matched; prompt on miss
always — prompt every call (denyPaths still hard-deny)
denyPaths always wins. allow-always from the prompt persists the
exact path back into allowReadPaths/allowWritePaths via
mutateConfigFile so subsequent matching calls go silent.
Reuses existing primitives — no new gateway methods:
plugin.approval.request / plugin.approval.waitDecision
decision: allow-once | allow-always | deny
Pre-flight against requested path AND post-flight against the
canonicalPath returned by the node — closes symlink-escape attacks
where the requested path matched policy but realpath resolves
somewhere else.
Audit log
---------
JSONL at ~/.openclaw/audit/file-transfer.jsonl. Records every
decision (allow/allowed-once/allowed-always/denied/error) with
timestamp, op, nodeId, displayName, requestedPath, canonicalPath,
decision, error code, sizeBytes, sha256, durationMs. Best-effort
writes; never propagates failure.
Plugin layout
-------------
extensions/file-transfer/
index.ts definePluginEntry, nodeHostCommands
openclaw.plugin.json contracts.tools registration
package.json
src/node-host/{file-fetch,dir-list,dir-fetch,file-write}.ts
src/tools/{file-fetch,dir-list,dir-fetch,file-write}-tool.ts
src/shared/
mime.ts single-source extension->MIME map + image/text sets
errors.ts shared error code enum and helpers
params.ts shared param-validation helpers + GatewayCallOptions
policy.ts evaluateFilePolicy, persistAllowAlways
approval.ts plugin.approval.request wrapper
gatekeep.ts one-stop policy + approval + audit orchestrator
audit.ts JSONL audit sink
Core touch points
-----------------
- src/infra/node-commands.ts: NODE_FILE_FETCH_COMMAND,
NODE_DIR_LIST_COMMAND, NODE_DIR_FETCH_COMMAND,
NODE_FILE_WRITE_COMMAND, NODE_FILE_COMMANDS array
- src/gateway/node-command-policy.ts: all four added to
DEFAULT_DANGEROUS_NODE_COMMANDS
- src/security/audit-extra.sync.ts: audit detail mentions file ops
- src/agents/tools/nodes-tool-media.ts: MEDIA_INVOKE_ACTIONS entry
for file.fetch redirects raw nodes(action=invoke) callers to the
dedicated file_fetch tool to prevent base64 context bloat
- src/agents/tools/nodes-tool.ts: nodes tool description points to
the dedicated file_fetch tool
Known limitations / follow-ups
------------------------------
- No tests in this PR. For a security-sensitive surface this is a
gap; will follow up with a test pass.
- Direct CLI invocation (openclaw nodes invoke --command file.fetch)
bypasses the plugin policy entirely. Plugin-side gating is the
realistic threat model (agent on iMessage requesting paths it
shouldn't), but for true defense-in-depth, policy belongs in the
gateway-side node.invoke dispatch. Move-policy-to-core is a
separate PR.
- file_watch (long-lived filesystem event subscription) is not
included; it needs a new node-protocol primitive for streaming
event channels and was descoped from this PR.
- dir_fetch includeDotfiles: true is the only supported mode;
BSD tar exclude patterns reliably collapse dotfile filtering
to an empty archive. Reliable filtering needs a
`find ! -name ".*" | tar -T -` pipeline; deferred.
- dir_fetch du -sk preflight is a heuristic (du * 4 vs maxBytes);
the mid-stream byte cap is the actual safety net.
* test(file-transfer): add unit tests for handlers, policy, and shared utilities
Adds 77 tests covering:
- handleFileFetch: validation, fs errors, sha256, size cap, symlink canonicalization
- handleFileWrite: validation, atomic write, overwrite policy, parent dir handling, symlink refusal, integrity check, size cap
- handleDirList: validation, fs errors, sorted listing, dotfile inclusion, pagination
- handleDirFetch: validation, fs errors, gzipped tar with sha256, mid-stream byte cap
- evaluateFilePolicy: default-deny, denyPaths-wins, allow matching, ask modes (off/on-miss/always), node-id/displayName/'*' resolution
- persistAllowAlways: append, dedupe, create-on-missing
- shared/mime: extension lookup, image/text inline sets
- shared/errors: err helper, classifyFsError, throwFromNodePayload
Also fixes accumulated lint regressions in the prod source flagged once these
files moved into the changed-gate scope (parseInt -> Number.parseInt, redundant
type casts removed, single-statement if bodies wrapped in braces).
* fix(file-transfer): address PR review feedback (security + availability)
Reviewer findings addressed (greptile + aisle):
- policy: persistAllowAlways no longer escalates per-node approvals to the
'*' wildcard entry; allow-always now writes under the specific node's
own entry, never the wildcard (greptile P1 SECURITY).
- policy: add literal '..' segment short-circuit in evaluateFilePolicy,
raised before glob match. Stops "/allowed/../etc/passwd" from passing
preflight against "/allowed/**" globs (aisle MEDIUM CWE-22).
- file-write: replace no-op base64 try/catch with actual round-trip
validation. Buffer.from(s, "base64") never throws — invalid input
silently decoded to garbage bytes. Now re-encodes and compares
modulo padding/url-variant chars (greptile P1 SECURITY).
- file-write: document the parent-symlink residual risk and rely on the
existing gateway-side post-flight policy check; full rollback requires
a node-side file.unlink which is deferred to a follow-up. Initial
segment-walk attempt was reverted because it false-positives on system
symlinks like macOS /var → /private/var (aisle HIGH CWE-59).
- dir-fetch tool: add preValidateTarball pass that runs `tar -tzvf` and
rejects symlinks, hardlinks, absolute paths, '..' traversal,
uncompressed sizes >64MB, and entry counts >5000 — before any
extraction. Drops --no-overwrite-dir (GNU-only flag rejected by BSD
tar on macOS) (aisle HIGH x2 CWE-22 + CWE-409, greptile P2).
- dir-fetch tool: stream-hash files via fs.open + read loop instead of
fs.readFile to avoid full-buffer reads on large extracted entries.
- dir-fetch handler: replace spawnSync in countTarEntries with async
spawn + bounded buffer so tar -tzf can't park the node-host event
loop for up to 10s on a slow filesystem (greptile P1 AVAIL).
- audit: clear auditDirPromise on rejection so a transient mkdir
failure doesn't permanently silence the audit log (greptile P2).
New tests: wildcard escalation rejection, base64 malformed/url-variant,
'..' traversal short-circuit (3 cases). 84/84 passing.
* fix(file-transfer): CI failures + second-round PR review feedback
CI failures on previous push:
- Declare runtime deps (minimatch, typebox) in package.json — failed the
extension-runtime-dependencies contract test that scans imports.
- Switch policy.ts and policy.test.ts off the broad
openclaw/plugin-sdk/config-runtime barrel and onto the narrow
openclaw/plugin-sdk/config-mutation + runtime-config-snapshot subpaths.
This satisfies the deprecated-internal-config-api architecture guard.
Second-round Aisle findings:
- policy: traversal-segment check now treats backslash and forward slash
as equivalent, so a Windows node can't be hit with mixed-separator
"C:\\allowed\\..\\Windows\\system.ini" (Aisle HIGH CWE-22).
- dir-fetch tool: replace the single fragile `tar -tvzf` parser pass
(which broke for filenames containing whitespace) with two robust
passes: `tar -tzf` for paths only (one per line, no parsing of
fixed columns) and `tar -tzvf` for type chars only (FIRST CHAR of each
line, never the path column). Also reject backslash-containing entry
names. Drops the in-process uncompressed-size cap because reliably
parsing sizes from tar output is fragile and Aisle flagged it as a
bypass primitive — entry-count cap stays (Aisle HIGH CWE-22, MED).
Tests still 84/84 passing.
* fix(file-transfer): third-round PR review feedback
Aisle's re-analysis on b63daa6a05 surfaced 3 actionable findings:
- nodes.invoke bypass (HIGH CWE-285): generic nodes.action="invoke" let
agents call dir.list/dir.fetch/file.write directly, skipping the
file-transfer plugin's gatekeep + policy + approval flow. Only file.fetch
was redirected to its dedicated tool. Add the other three to
MEDIA_INVOKE_ACTIONS so the redirect-or-deny logic in
nodes-tool-commands fires for all four. The dedicated tools enforce
policy; the generic invoke surface no longer has a way to skip them
without an explicit allowMediaInvokeCommands opt-in.
- prototype pollution in persistAllowAlways (MED CWE-1321): a paired
node with displayName "__proto__" / "prototype" / "constructor" would
mutate the fileTransfer object's prototype when persisting allow-always.
Reject those keys explicitly. Switch the existing-key lookup to
Object.prototype.hasOwnProperty.call so a key like "constructor"
doesn't accidentally match Object.prototype.constructor.
- decompression-bomb cap in dir_fetch (MED CWE-409): compressed tar is
bounded upstream, but a highly compressible bomb can still expand to
gigabytes. Enforce DIR_FETCH_MAX_UNCOMPRESSED_BYTES (64MB) summed
across extracted files and DIR_FETCH_MAX_SINGLE_FILE_BYTES (16MB) per
entry, both checked during the post-extract walk. On bust, rm -rf the
rootDir and audit-log + throw UNCOMPRESSED_TOO_LARGE.
Tests: 85/85 passing (added prototype-pollution rejection test).
Aisle's HIGH parent-symlink finding remains documented as deferred — full
rollback requires a node-side file.unlink command which is out of scope
for this PR. The gateway-side post-flight policy check still detects and
loudly errors on canonical-path mismatches.
* fix(file-transfer): refuse symlink traversal by default with followSymlinks opt-in
Closes the deferred Aisle HIGH parent-symlink finding. Instead of
detecting the escape in a post-flight gateway check after the file is
already written, the node-side handler now refuses pre-flight if any
component of the requested path resolves through a symlink.
Behavior:
- Reads (file.fetch / dir.list / dir.fetch): node realpath()s the
requested path. If canonical != requested AND followSymlinks=false,
return SYMLINK_REDIRECT { canonicalPath } — no I/O happens.
- Writes (file.write): node realpath()s the parent dir. Same refusal
rule. The lstat-on-final check is kept to catch the case where the
target file itself is an existing symlink.
- Opt-in: set gateway.nodes.fileTransfer.<node>.followSymlinks=true to
bring back the previous "follow + post-flight check" behavior.
Operator UX: the SYMLINK_REDIRECT response includes the canonical path
so the operator can either update their allow list to the canonical form
or set followSymlinks=true on that node. On macOS, /var → /private/var
and /tmp → /private/tmp are system aliases that trip the new check, so
operators using those paths need followSymlinks=true OR canonical-path
allowlists.
Wiring:
- Add followSymlinks?: boolean to NodeFilePolicyConfig.
- evaluateFilePolicy returns followSymlinks (default false) on its
ok=true branches.
- gatekeep propagates it via GatekeepOutcome.
- Each tool passes it as a node.invoke param.
- Each handler honors it pre-flight before any read/write.
Tests updated: 89/89 passing.
- realpath(mkdtemp()) so existing happy-path tests don't trip the new
default on macOS where mkdtemp lands under symlinked /var/folders.
- New tests: SYMLINK_REDIRECT refusal for file.fetch and file.write
parent traversal; opt-in passthrough when followSymlinks=true.
- New policy test: followSymlinks propagation default false / true.
* fix(file-transfer): close two more aisle findings on
|