mirror of
https://github.com/openclaw/openclaw.git
synced 2026-05-06 05:30:42 +00:00
* feat(file-transfer): add bundled plugin for binary file ops on nodes
New extensions/file-transfer/ plugin exposing four agent tools
(file_fetch, dir_list, dir_fetch, file_write) and four matching
node-host commands (file.fetch, dir.list, dir.fetch, file.write).
Lets agents read and write files on paired nodes by absolute path,
bypassing the bash output cap (200KB) and the live tool-result
text cap that would otherwise truncate base64 payloads.
Public surface
--------------
- file_fetch({ node, path, maxBytes? })
Image MIMEs return image content blocks; small text (<=8 KB) inlines
as text content; everything else returns a saved-media-path text
block. sha256-verified end-to-end.
- dir_list({ node, path, pageToken?, maxEntries? })
Structured directory listing — name, path, size, mimeType, isDir,
mtime. Paginated. No content transfer.
- dir_fetch({ node, path, maxBytes?, includeDotfiles? })
Server-side tar -czf streamed back, unpacked into the gateway media
store, returns a manifest of saved paths. Single round-trip.
60s wall-clock timeouts on tar create/unpack. tar -xzf without -P
rejects absolute paths in archive entries.
- file_write({ node, path, contentBase64, mimeType?, overwrite?,
createParents? })
Atomic write (temp + rename). Refuses to overwrite by default.
Refuses to write through symlinks (lstat check). Buffer-side
sha256 (no read-back race). Pair with file_fetch to round-trip
files between nodes — DO NOT use exec/cp for file copies.
All four commands gated by:
- dangerous-by-default node command policy
(gateway.nodes.allowCommands opt-in)
- per-node path policy (gateway.nodes.fileTransfer)
- optional operator approval prompt (ask: off | on-miss | always)
16 MB raw byte ceiling per single-frame round-trip (25 MB WS frame
with ~33% base64 overhead and JSON envelope). 8 MB defaults.
Path policy and approvals
-------------------------
Default behavior is DENY. The operator must explicitly opt in:
{
"gateway": {
"nodes": {
"fileTransfer": {
"<nodeId-or-displayName>": {
"ask": "off" | "on-miss" | "always",
"allowReadPaths": ["~/Screenshots/**", "/tmp/**"],
"allowWritePaths": ["~/Downloads/**"],
"denyPaths": ["**/.ssh/**", "**/.aws/**"],
"maxBytes": 16777216
},
"*": { "ask": "on-miss" }
}
}
}
}
ask modes:
off — silent: allow if matched, deny if not (default)
on-miss — silent allow if matched; prompt on miss
always — prompt every call (denyPaths still hard-deny)
denyPaths always wins. allow-always from the prompt persists the
exact path back into allowReadPaths/allowWritePaths via
mutateConfigFile so subsequent matching calls go silent.
Reuses existing primitives — no new gateway methods:
plugin.approval.request / plugin.approval.waitDecision
decision: allow-once | allow-always | deny
Pre-flight against requested path AND post-flight against the
canonicalPath returned by the node — closes symlink-escape attacks
where the requested path matched policy but realpath resolves
somewhere else.
Audit log
---------
JSONL at ~/.openclaw/audit/file-transfer.jsonl. Records every
decision (allow/allowed-once/allowed-always/denied/error) with
timestamp, op, nodeId, displayName, requestedPath, canonicalPath,
decision, error code, sizeBytes, sha256, durationMs. Best-effort
writes; never propagates failure.
Plugin layout
-------------
extensions/file-transfer/
index.ts definePluginEntry, nodeHostCommands
openclaw.plugin.json contracts.tools registration
package.json
src/node-host/{file-fetch,dir-list,dir-fetch,file-write}.ts
src/tools/{file-fetch,dir-list,dir-fetch,file-write}-tool.ts
src/shared/
mime.ts single-source extension->MIME map + image/text sets
errors.ts shared error code enum and helpers
params.ts shared param-validation helpers + GatewayCallOptions
policy.ts evaluateFilePolicy, persistAllowAlways
approval.ts plugin.approval.request wrapper
gatekeep.ts one-stop policy + approval + audit orchestrator
audit.ts JSONL audit sink
Core touch points
-----------------
- src/infra/node-commands.ts: NODE_FILE_FETCH_COMMAND,
NODE_DIR_LIST_COMMAND, NODE_DIR_FETCH_COMMAND,
NODE_FILE_WRITE_COMMAND, NODE_FILE_COMMANDS array
- src/gateway/node-command-policy.ts: all four added to
DEFAULT_DANGEROUS_NODE_COMMANDS
- src/security/audit-extra.sync.ts: audit detail mentions file ops
- src/agents/tools/nodes-tool-media.ts: MEDIA_INVOKE_ACTIONS entry
for file.fetch redirects raw nodes(action=invoke) callers to the
dedicated file_fetch tool to prevent base64 context bloat
- src/agents/tools/nodes-tool.ts: nodes tool description points to
the dedicated file_fetch tool
Known limitations / follow-ups
------------------------------
- No tests in this PR. For a security-sensitive surface this is a
gap; will follow up with a test pass.
- Direct CLI invocation (openclaw nodes invoke --command file.fetch)
bypasses the plugin policy entirely. Plugin-side gating is the
realistic threat model (agent on iMessage requesting paths it
shouldn't), but for true defense-in-depth, policy belongs in the
gateway-side node.invoke dispatch. Move-policy-to-core is a
separate PR.
- file_watch (long-lived filesystem event subscription) is not
included; it needs a new node-protocol primitive for streaming
event channels and was descoped from this PR.
- dir_fetch includeDotfiles: true is the only supported mode;
BSD tar exclude patterns reliably collapse dotfile filtering
to an empty archive. Reliable filtering needs a
`find ! -name ".*" | tar -T -` pipeline; deferred.
- dir_fetch du -sk preflight is a heuristic (du * 4 vs maxBytes);
the mid-stream byte cap is the actual safety net.
* test(file-transfer): add unit tests for handlers, policy, and shared utilities
Adds 77 tests covering:
- handleFileFetch: validation, fs errors, sha256, size cap, symlink canonicalization
- handleFileWrite: validation, atomic write, overwrite policy, parent dir handling, symlink refusal, integrity check, size cap
- handleDirList: validation, fs errors, sorted listing, dotfile inclusion, pagination
- handleDirFetch: validation, fs errors, gzipped tar with sha256, mid-stream byte cap
- evaluateFilePolicy: default-deny, denyPaths-wins, allow matching, ask modes (off/on-miss/always), node-id/displayName/'*' resolution
- persistAllowAlways: append, dedupe, create-on-missing
- shared/mime: extension lookup, image/text inline sets
- shared/errors: err helper, classifyFsError, throwFromNodePayload
Also fixes accumulated lint regressions in the prod source flagged once these
files moved into the changed-gate scope (parseInt -> Number.parseInt, redundant
type casts removed, single-statement if bodies wrapped in braces).
* fix(file-transfer): address PR review feedback (security + availability)
Reviewer findings addressed (greptile + aisle):
- policy: persistAllowAlways no longer escalates per-node approvals to the
'*' wildcard entry; allow-always now writes under the specific node's
own entry, never the wildcard (greptile P1 SECURITY).
- policy: add literal '..' segment short-circuit in evaluateFilePolicy,
raised before glob match. Stops "/allowed/../etc/passwd" from passing
preflight against "/allowed/**" globs (aisle MEDIUM CWE-22).
- file-write: replace no-op base64 try/catch with actual round-trip
validation. Buffer.from(s, "base64") never throws — invalid input
silently decoded to garbage bytes. Now re-encodes and compares
modulo padding/url-variant chars (greptile P1 SECURITY).
- file-write: document the parent-symlink residual risk and rely on the
existing gateway-side post-flight policy check; full rollback requires
a node-side file.unlink which is deferred to a follow-up. Initial
segment-walk attempt was reverted because it false-positives on system
symlinks like macOS /var → /private/var (aisle HIGH CWE-59).
- dir-fetch tool: add preValidateTarball pass that runs `tar -tzvf` and
rejects symlinks, hardlinks, absolute paths, '..' traversal,
uncompressed sizes >64MB, and entry counts >5000 — before any
extraction. Drops --no-overwrite-dir (GNU-only flag rejected by BSD
tar on macOS) (aisle HIGH x2 CWE-22 + CWE-409, greptile P2).
- dir-fetch tool: stream-hash files via fs.open + read loop instead of
fs.readFile to avoid full-buffer reads on large extracted entries.
- dir-fetch handler: replace spawnSync in countTarEntries with async
spawn + bounded buffer so tar -tzf can't park the node-host event
loop for up to 10s on a slow filesystem (greptile P1 AVAIL).
- audit: clear auditDirPromise on rejection so a transient mkdir
failure doesn't permanently silence the audit log (greptile P2).
New tests: wildcard escalation rejection, base64 malformed/url-variant,
'..' traversal short-circuit (3 cases). 84/84 passing.
* fix(file-transfer): CI failures + second-round PR review feedback
CI failures on previous push:
- Declare runtime deps (minimatch, typebox) in package.json — failed the
extension-runtime-dependencies contract test that scans imports.
- Switch policy.ts and policy.test.ts off the broad
openclaw/plugin-sdk/config-runtime barrel and onto the narrow
openclaw/plugin-sdk/config-mutation + runtime-config-snapshot subpaths.
This satisfies the deprecated-internal-config-api architecture guard.
Second-round Aisle findings:
- policy: traversal-segment check now treats backslash and forward slash
as equivalent, so a Windows node can't be hit with mixed-separator
"C:\\allowed\\..\\Windows\\system.ini" (Aisle HIGH CWE-22).
- dir-fetch tool: replace the single fragile `tar -tvzf` parser pass
(which broke for filenames containing whitespace) with two robust
passes: `tar -tzf` for paths only (one per line, no parsing of
fixed columns) and `tar -tzvf` for type chars only (FIRST CHAR of each
line, never the path column). Also reject backslash-containing entry
names. Drops the in-process uncompressed-size cap because reliably
parsing sizes from tar output is fragile and Aisle flagged it as a
bypass primitive — entry-count cap stays (Aisle HIGH CWE-22, MED).
Tests still 84/84 passing.
* fix(file-transfer): third-round PR review feedback
Aisle's re-analysis on b63daa6a05 surfaced 3 actionable findings:
- nodes.invoke bypass (HIGH CWE-285): generic nodes.action="invoke" let
agents call dir.list/dir.fetch/file.write directly, skipping the
file-transfer plugin's gatekeep + policy + approval flow. Only file.fetch
was redirected to its dedicated tool. Add the other three to
MEDIA_INVOKE_ACTIONS so the redirect-or-deny logic in
nodes-tool-commands fires for all four. The dedicated tools enforce
policy; the generic invoke surface no longer has a way to skip them
without an explicit allowMediaInvokeCommands opt-in.
- prototype pollution in persistAllowAlways (MED CWE-1321): a paired
node with displayName "__proto__" / "prototype" / "constructor" would
mutate the fileTransfer object's prototype when persisting allow-always.
Reject those keys explicitly. Switch the existing-key lookup to
Object.prototype.hasOwnProperty.call so a key like "constructor"
doesn't accidentally match Object.prototype.constructor.
- decompression-bomb cap in dir_fetch (MED CWE-409): compressed tar is
bounded upstream, but a highly compressible bomb can still expand to
gigabytes. Enforce DIR_FETCH_MAX_UNCOMPRESSED_BYTES (64MB) summed
across extracted files and DIR_FETCH_MAX_SINGLE_FILE_BYTES (16MB) per
entry, both checked during the post-extract walk. On bust, rm -rf the
rootDir and audit-log + throw UNCOMPRESSED_TOO_LARGE.
Tests: 85/85 passing (added prototype-pollution rejection test).
Aisle's HIGH parent-symlink finding remains documented as deferred — full
rollback requires a node-side file.unlink command which is out of scope
for this PR. The gateway-side post-flight policy check still detects and
loudly errors on canonical-path mismatches.
* fix(file-transfer): refuse symlink traversal by default with followSymlinks opt-in
Closes the deferred Aisle HIGH parent-symlink finding. Instead of
detecting the escape in a post-flight gateway check after the file is
already written, the node-side handler now refuses pre-flight if any
component of the requested path resolves through a symlink.
Behavior:
- Reads (file.fetch / dir.list / dir.fetch): node realpath()s the
requested path. If canonical != requested AND followSymlinks=false,
return SYMLINK_REDIRECT { canonicalPath } — no I/O happens.
- Writes (file.write): node realpath()s the parent dir. Same refusal
rule. The lstat-on-final check is kept to catch the case where the
target file itself is an existing symlink.
- Opt-in: set gateway.nodes.fileTransfer.<node>.followSymlinks=true to
bring back the previous "follow + post-flight check" behavior.
Operator UX: the SYMLINK_REDIRECT response includes the canonical path
so the operator can either update their allow list to the canonical form
or set followSymlinks=true on that node. On macOS, /var → /private/var
and /tmp → /private/tmp are system aliases that trip the new check, so
operators using those paths need followSymlinks=true OR canonical-path
allowlists.
Wiring:
- Add followSymlinks?: boolean to NodeFilePolicyConfig.
- evaluateFilePolicy returns followSymlinks (default false) on its
ok=true branches.
- gatekeep propagates it via GatekeepOutcome.
- Each tool passes it as a node.invoke param.
- Each handler honors it pre-flight before any read/write.
Tests updated: 89/89 passing.
- realpath(mkdtemp()) so existing happy-path tests don't trip the new
default on macOS where mkdtemp lands under symlinked /var/folders.
- New tests: SYMLINK_REDIRECT refusal for file.fetch and file.write
parent traversal; opt-in passthrough when followSymlinks=true.
- New policy test: followSymlinks propagation default false / true.
* fix(file-transfer): close two more aisle findings on 069bd66
Aisle re-analysis on 069bd66 surfaced two issues my earlier round-three
fix missed:
- HIGH (CWE-284): file.fetch / dir.fetch / dir.list / file.write were
still bypassable via the generic nodes.action="invoke" surface when
the operator had set allowMediaInvokeCommands=true. That flag was
meant to opt in to base64-bloat for camera/screen, not to disable
path policy on file-transfer. Split the redirect map: introduce
POLICY_REDIRECT_INVOKE_COMMANDS (file-transfer only) which ALWAYS
rerouts to its dedicated tool regardless of the bloat flag. Camera
and screen continue to use the bloat-only redirect (suppressed by
allowMediaInvokeCommands=true). Confirmed by clawsweeper P1.
- MED (CWE-276): tar -xzf in dir_fetch unpack preserved archive
ownership and permissions, so a malicious node could plant
setuid/setgid or world-writable files on a gateway running with
elevated privileges. Add --no-same-owner --no-same-permissions
(both flags are portable across BSD tar / GNU tar).
Tests: 89/89 passing.
* chore(file-transfer): drop file_watch from plugin description
Phase 5 (file_watch) was deferred earlier in this PR. Strip the watch
mention from the plugin description in package.json,
openclaw.plugin.json, and index.ts so the metadata reflects what's
actually shipped (file_fetch, dir_list, dir_fetch, file_write).
Closes clawsweeper P3.
* fix(file-transfer): hash before rename and allow zero-byte round-trip
Two of Peter's review findings on PR #74134:
- P2 (file-write integrity): hash the decoded buffer + compare against
expectedSha256 BEFORE temp+rename. Previously the rename happened
first, then the sha check unlinked the target on mismatch — with
overwrite=true a bad caller hash could replace + delete the original.
Now a hash mismatch returns INTEGRITY_FAILURE without touching disk.
Added a regression test that asserts the original file survives.
- P2/P3 (zero-byte round-trip): the tool layer's truthy checks on
contentBase64 and base64 rejected the empty string, blocking zero-byte
files from round-tripping through file_fetch -> file_write. Switched
to type-checks (typeof === "string") and added zero-byte tests at the
handler layer for both fetch and write (sha matches the known empty
digest).
Tests: 92/92 passing.
* fix(file-transfer): declare gateway.nodes.fileTransfer in core config schema
Peter's P1/P2 finding: the plugin reads/writes gateway.nodes.fileTransfer
via casts through unknown because the strict zod schema and OpenClawConfig
type didn't declare it. That meant `openclaw config validate` would
reject the very examples in the plugin's own documentation.
- Add fileTransfer block to gateway.nodes in src/config/zod-schema.ts
with the full per-node entry shape (ask, allowReadPaths,
allowWritePaths, denyPaths, maxBytes, followSymlinks).
- Add GatewayNodeFileTransferEntry + the fileTransfer field on
GatewayNodesConfig in src/config/types.gateway.ts.
- Drop the `as unknown` casts in the extension's policy.ts now that
gateway.nodes.fileTransfer is properly typed end-to-end.
- Regenerate docs/.generated/config-baseline.sha256.
Tests: 92/92 passing. pnpm config:docs:check OK.
* fix(file-transfer): enforce path policy at gateway dispatch
Closes Peter's P1 review finding on PR #74134.
The agent-tool-only redirect added in earlier commits left CLI
(`openclaw nodes invoke`), plugin-runtime, and raw `node.invoke` callers
able to skip the file-transfer path policy entirely. The fix moves the
security boundary down to the gateway: every code path that reaches
`node.invoke` for file.fetch / dir.list / dir.fetch / file.write now
runs the same allow/deny check.
- New: src/gateway/file-transfer-dispatch.ts with
`evaluateFileTransferDispatchPolicy` and `isFileTransferCommand`. Same
semantics as the extension-side `evaluateFilePolicy` minus the
operator-prompt flow (prompts stay at the agent-tool layer; the
gateway is silent enforcement).
- src/gateway/server-methods/nodes.ts: after the existing command
allowlist check, run the new gate before forwarding. Denies emit
INVALID_REQUEST with a structured `{ command, code, reason }`.
- Decision matrix mirrors the extension: NO_POLICY (no entry for
this node) deny, denyPaths-wins, '..' traversal short-circuit
(with backslash separator handling), allowPaths match → allow,
no allow match → deny.
- 19 new unit tests covering each branch including identity
resolution (nodeId/displayName/'*'), prototype-pollution-safe lookup,
and read-vs-write allow-list separation.
Note on allow-once approvals: the agent tool's interactive
`allow-once` decision now has to flow through the dedicated tool's
pre-flight (which forwards an approved request); raw `nodes.invoke`
callers cannot benefit from one-time approvals because the gateway is
silent. allow-always (which persists to allowReadPaths/allowWritePaths)
continues to work transparently because by the time the next request
hits the gateway the path is in the persisted allow list.
Tests: 92 extension + 19 gateway = 111 total, all passing.
* fix(file-transfer): enforce node policy in gateway
* fix(file-transfer): use plugin node policy only
* fix(file-transfer): harden node policy edge cases
* fix(file-transfer): close review hardening gaps
* fix(file-transfer): harden node invoke policy
* fix(file-transfer): align runtime dependency versions
* fix(file-transfer): keep minimatch extension-owned
* refactor(file-transfer): remove unused approval gate
* fix(file-transfer): require canonical node policy authorization
Co-authored-by: Omar Shahine <10343873+omarshahine@users.noreply.github.com>
* fix(clawsweeper): address review for automerge-openclaw-openclaw-74134 (1)
Co-authored-by: Omar Shahine <10343873+omarshahine@users.noreply.github.com>
* fix(file-transfer): recheck dir fetch archive policy after fetch
* fix(file-transfer): name file-transfer tool in invoke redirect
---------
Co-authored-by: Omar Shahine <10343873+omarshahine@users.noreply.github.com>
Co-authored-by: Peter Steinberger <steipete@gmail.com>
Co-authored-by: clawsweeper-repair <clawsweeper-repair@users.noreply.github.com>