mirror of
https://github.com/openclaw/openclaw.git
synced 2026-05-06 07:50:43 +00:00
* test(security): add coverage tests before security fixes
- scan-paths.ts: 100% line coverage (new test file, previously zero)
- windows-acl.ts: 100% line coverage (SID bypass, whoami throw, no-user null return)
- external-content.ts: 99% (line 248 defensive overlap guard, unreachable)
- skill-scanner.ts: 93% (lines 293-294/330/571 are defensive guards for
future extensibility, unreachable with current rules/patterns)
200+ tests covering TOCTOU paths, cache invalidation, forced-file escapes,
dir-entry-cache hit, SID world-bypass, diacritic-strip fallback,
fullwidth homoglyph markers, and more.
* fix(security): 5 security hardening fixes in src/security/
scan-paths: flip the requireRealpath default from false to true (safe by
default). All production callers already pass requireRealpath: true
explicitly; callers relying on the default are now secure as well.
windows-acl: block world-equivalent SIDs (S-1-1-0 Everyone etc.) from being
added to trusted set via USERSID env var.
windows-acl: log resolveCurrentUserSid failures instead of bare catch{}.
audit-extra: wrap JSON.parse in readPluginManifestExtensions with try-catch.
Malformed package.json returns [] instead of crashing the audit.
audit-extra: depth guard in listWorkspaceSkillMarkdownFiles to prevent
resource exhaustion from deep symlink cycles.
audit-extra: 2s timeout on fs.realpath in collectWorkspaceSkillSymlinkEscapeFindings
to protect against hanging on slow/network filesystems.
audit-extra: warn about phantom entries in plugins.allow that don't match
any installed plugin (pre-approval exploitation vector).
media-understanding/types: add allowPrivateNetwork to transport overrides
(duplicate of PR #66967, required for tsgo to pass here).
* fix(security): address security review findings in audit-extra.async.ts
Issue 1 — Symlink escape audit bypass on realpath timeout:
When realpathWithTimeout returns null (timeout or failure), the previous code
called 'continue', silently skipping the escape check. An attacker with a
symlink to a slow/network filesystem could hang realpath to prevent escape
detection. Now treats unverifiable symlinks as potential escapes and includes
them in the finding.
Issue 2 — Malformed package.json hides extension entrypoints from deep scan:
readPluginManifestExtensions previously swallowed JSON.parse errors and
returned [], which a malicious plugin could exploit by crafting a malformed
package.json to hide its openclaw.extensions entrypoints from the deep code
scanner. Now re-throws the parse error (with cause) so the caller in
collectPluginsCodeSafetyFindings can surface a warn finding and alert the
user, while still scanning the plugin directory via getCodeSafetySummary.
* fix(security): address PR review findings (P1 + P2)
P1 — BFS realpath in listWorkspaceSkillMarkdownFiles lacks timeout:
Extract realpathWithTimeout to module scope so the BFS dequeue loop
uses the same 2 s guard as the outer escape-detection callers. Previously
only the per-workspace and per-skill-file realpaths had the timeout;
a hanging NFS/SMB directory entry inside the BFS could still block
indefinitely.
P1 (acknowledged limitation) — Promise.race leaves the underlying
fs.realpath call running after timeout. fs.realpath cannot be cancelled
once submitted to libuv. Callers are sequential (one await at a time),
so at most one worker thread is occupied; the OS will eventually time
out the stuck call. This is documented in the module-level JSDoc.
P2 — Phantom allowlist check incorrectly flags bundled plugin IDs:
listChannelPlugins() returns bundled channel plugin IDs (telegram,
discord, browser, etc.) that are never in stateDir/extensions.
Add bundledPluginIds exclusion so the phantom-entry finding is scoped
to user-installed extension IDs only.
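The scoped phantom-entry check can be sketched as a simple set difference; the function name and inputs here are illustrative assumptions (the real code also normalizes IDs, per a later commit):

```typescript
// Hypothetical sketch: a plugins.allow entry is "phantom" only if it
// matches neither a user-installed extension in stateDir/extensions nor
// a bundled channel plugin ID (telegram, discord, browser, ...), which
// never appear on disk.
function findPhantomAllowEntries(
  allow: string[],
  installedIds: Set<string>,
  bundledPluginIds: Set<string>,
): string[] {
  return allow.filter((id) => !installedIds.has(id) && !bundledPluginIds.has(id));
}
```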
P2 — Rename MAX_SYMLINK_DEPTH / depthGuard to MAX_TOTAL_DIR_VISITS /
totalDirVisits to accurately reflect that the guard caps total BFS
iterations (2_000 * 20 = 40_000), not per-path symlink depth.
* fix(security): clean up realpathWithTimeout timer and add regression tests
- Clear the timer handle when fs.realpath resolves before the deadline,
preventing timer accumulation during large audit runs with many files.
- Add .unref() on the timer so it cannot hold the process alive while
waiting on a potentially hanging NFS/SMB path.
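The timer handling described above can be sketched as follows. `realpathWithTimeout`, the 2s deadline, the `.unref()`, and the timer cleanup come from the commit messages; the internals are an assumed reconstruction, not the actual source.

```typescript
import fs from "node:fs";

const REALPATH_TIMEOUT_MS = 2_000;

// Race fs.realpath against a timer. On the fast path the timer is cleared
// so handles do not accumulate across large audit runs; .unref() keeps a
// hung NFS/SMB path from holding the process alive. Note the limitation
// acknowledged above: the underlying fs.realpath cannot be cancelled once
// submitted to libuv, so on timeout it keeps running until the OS gives up.
async function realpathWithTimeout(
  target: string,
  timeoutMs: number = REALPATH_TIMEOUT_MS,
): Promise<string | null> {
  let timer: NodeJS.Timeout | undefined;
  const timeout = new Promise<null>((resolve) => {
    timer = setTimeout(() => resolve(null), timeoutMs);
    timer.unref();
  });
  try {
    return await Promise.race([fs.promises.realpath(target), timeout]);
  } catch {
    // Failure is treated like a timeout: the caller flags the path as an
    // unverifiable (potential) escape rather than silently skipping it.
    return null;
  } finally {
    if (timer) {
      clearTimeout(timer);
    }
  }
}
```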
Regression tests added for three audit-extra.async security fixes:
- manifest parse error: malformed plugin package.json surfaces
plugins.code_safety.manifest_parse_error (audit-extra.async.test.ts)
- phantom allowlist with bundled exclusion: bundled channel plugin IDs
are excluded from plugins.allow_phantom_entries warnings; non-installed
non-bundled IDs are correctly reported (audit-plugins-phantom.test.ts)
- unverifiable realpath escape: fs.realpath failure / timeout produces a
skills.workspace.symlink_escape finding with 'realpath timed out' in
the detail (audit-workspace-skill-escape.test.ts)
* chore(security): add TODO for structured logger in windows-acl resolveCurrentUserSid
console.warn is acceptable short-term but may be noisy on constrained
Windows hosts; note the follow-up in-code so it is not lost.
* chore: drop unrelated formatting churn from security PR
Restores extensions/memory-lancedb/config.ts and
src/agents/pi-embedded-helpers/errors.ts to their origin/main state.
These were line-wrap-only formatting changes with no relation to the
security fixes in this branch.
* fix(security): address Codex P2 review findings
1. Normalize plugins.allow entries through normalizePluginId before
phantom-entry filtering so that bundled plugin aliases and legacy IDs
are correctly excluded. Without this, valid allow entries that resolve
via alias normalization could generate false-positive phantom warnings.
2. Surface a skills.workspace.scan_truncated warn finding when the BFS
visit cap (MAX_TOTAL_DIR_VISITS) is hit mid-traversal. Previously the
scanner silently returned partial results, allowing escaped SKILL.md
symlinks in the unvisited tree to go undetected.
listWorkspaceSkillMarkdownFiles now returns {skillFilePaths, truncated}
and collectWorkspaceSkillSymlinkEscapeFindings emits the new finding
when truncated is true.
Regression test added for the truncation path using a mocked readdir
that fills the queue past the cap (40,001 fake entries) and a mocked
realpath for zero-I/O iteration speed.
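The truncation-aware BFS can be sketched as below. The cap name, the 40,000 limit, and the `{skillFilePaths, truncated}` return shape come from the commits above; this synchronous, simplified traversal is an assumption (the real function is async and also applies the realpath guard per entry).

```typescript
import fs from "node:fs";
import path from "node:path";

const MAX_TOTAL_DIR_VISITS = 40_000;

// Breadth-first scan for SKILL.md files that caps *total* directory
// visits (not per-path symlink depth) and reports truncation instead of
// silently returning partial results.
function listSkillMarkdownFiles(root: string): {
  skillFilePaths: string[];
  truncated: boolean;
} {
  const skillFilePaths: string[] = [];
  const queue: string[] = [root];
  let totalDirVisits = 0;
  while (queue.length > 0) {
    if (totalDirVisits >= MAX_TOTAL_DIR_VISITS) {
      // Cap hit mid-traversal: surface truncated=true so the caller can
      // emit a skills.workspace.scan_truncated warn finding.
      return { skillFilePaths, truncated: true };
    }
    totalDirVisits += 1;
    const dir = queue.shift()!;
    let entries: fs.Dirent[];
    try {
      entries = fs.readdirSync(dir, { withFileTypes: true });
    } catch {
      continue; // unreadable directory: skip it, keep scanning the rest
    }
    for (const entry of entries) {
      const full = path.join(dir, entry.name);
      if (entry.isDirectory()) {
        queue.push(full);
      } else if (entry.isFile() && entry.name === "SKILL.md") {
        skillFilePaths.push(full);
      }
    }
  }
  return { skillFilePaths, truncated: false };
}
```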
---------
Co-authored-by: copilot-swe-agent[bot] <198982749+Copilot@users.noreply.github.com>
226 lines
6.8 KiB
TypeScript
import fs from "node:fs";
import { homedir } from "node:os";
import { join } from "node:path";

export type MemoryConfig = {
  embedding: {
    provider: "openai";
    model: string;
    apiKey: string;
    baseUrl?: string;
    dimensions?: number;
  };
  dreaming?: Record<string, unknown>;
  dbPath?: string;
  autoCapture?: boolean;
  autoRecall?: boolean;
  captureMaxChars?: number;
  storageOptions?: Record<string, string>;
};

export const MEMORY_CATEGORIES = ["preference", "fact", "decision", "entity", "other"] as const;
export type MemoryCategory = (typeof MEMORY_CATEGORIES)[number];

const DEFAULT_MODEL = "text-embedding-3-small";
export const DEFAULT_CAPTURE_MAX_CHARS = 500;
const LEGACY_STATE_DIRS: string[] = [];

function resolveDefaultDbPath(): string {
  const home = homedir();
  const preferred = join(home, ".openclaw", "memory", "lancedb");
  try {
    if (fs.existsSync(preferred)) {
      return preferred;
    }
  } catch {
    // best-effort
  }

  for (const legacy of LEGACY_STATE_DIRS) {
    const candidate = join(home, legacy, "memory", "lancedb");
    try {
      if (fs.existsSync(candidate)) {
        return candidate;
      }
    } catch {
      // best-effort
    }
  }

  return preferred;
}

const DEFAULT_DB_PATH = resolveDefaultDbPath();

const EMBEDDING_DIMENSIONS: Record<string, number> = {
  "text-embedding-3-small": 1536,
  "text-embedding-3-large": 3072,
};

function assertAllowedKeys(value: Record<string, unknown>, allowed: string[], label: string) {
  const unknown = Object.keys(value).filter((key) => !allowed.includes(key));
  if (unknown.length === 0) {
    return;
  }
  throw new Error(`${label} has unknown keys: ${unknown.join(", ")}`);
}

export function vectorDimsForModel(model: string): number {
  const dims = EMBEDDING_DIMENSIONS[model];
  if (!dims) {
    throw new Error(`Unsupported embedding model: ${model}`);
  }
  return dims;
}

function resolveEnvVars(value: string): string {
  return value.replace(/\$\{([^}]+)\}/g, (_, envVar) => {
    const envValue = process.env[envVar];
    if (!envValue) {
      throw new Error(`Environment variable ${envVar} is not set`);
    }
    return envValue;
  });
}

function resolveEmbeddingModel(embedding: Record<string, unknown>): string {
  const model = typeof embedding.model === "string" ? embedding.model : DEFAULT_MODEL;
  if (typeof embedding.dimensions !== "number") {
    vectorDimsForModel(model);
  }
  return model;
}

export const memoryConfigSchema = {
  parse(value: unknown): MemoryConfig {
    if (!value || typeof value !== "object" || Array.isArray(value)) {
      throw new Error("memory config required");
    }
    const cfg = value as Record<string, unknown>;
    assertAllowedKeys(
      cfg,
      [
        "embedding",
        "dreaming",
        "dbPath",
        "autoCapture",
        "autoRecall",
        "captureMaxChars",
        "storageOptions",
      ],
      "memory config",
    );

    const embedding = cfg.embedding as Record<string, unknown> | undefined;
    if (!embedding || typeof embedding.apiKey !== "string") {
      throw new Error("embedding.apiKey is required");
    }
    assertAllowedKeys(embedding, ["apiKey", "model", "baseUrl", "dimensions"], "embedding config");

    const model = resolveEmbeddingModel(embedding);

    const captureMaxChars =
      typeof cfg.captureMaxChars === "number" ? Math.floor(cfg.captureMaxChars) : undefined;
    if (
      typeof captureMaxChars === "number" &&
      (captureMaxChars < 100 || captureMaxChars > 10_000)
    ) {
      throw new Error("captureMaxChars must be between 100 and 10000");
    }

    const dreaming =
      typeof cfg.dreaming === "undefined"
        ? undefined
        : cfg.dreaming && typeof cfg.dreaming === "object" && !Array.isArray(cfg.dreaming)
          ? (cfg.dreaming as Record<string, unknown>)
          : (() => {
              throw new Error("dreaming config must be an object");
            })();

    // Parse storageOptions (object with string values)
    let storageOptions: Record<string, string> | undefined;
    const storageOpts = cfg.storageOptions as Record<string, unknown> | undefined;
    if (storageOpts !== undefined && storageOpts !== null) {
      if (!storageOpts || typeof storageOpts !== "object" || Array.isArray(storageOpts)) {
        throw new Error("storageOptions must be an object");
      }
      storageOptions = {};
      // Validate all values are strings
      for (const [key, value] of Object.entries(storageOpts)) {
        if (typeof value !== "string") {
          throw new Error(`storageOptions.${key} must be a string`);
        }
        storageOptions[key] = resolveEnvVars(value);
      }
    }

    return {
      embedding: {
        provider: "openai",
        model,
        apiKey: resolveEnvVars(embedding.apiKey),
        baseUrl:
          typeof embedding.baseUrl === "string" ? resolveEnvVars(embedding.baseUrl) : undefined,
        dimensions: typeof embedding.dimensions === "number" ? embedding.dimensions : undefined,
      },
      dreaming,
      dbPath: typeof cfg.dbPath === "string" ? cfg.dbPath : DEFAULT_DB_PATH,
      autoCapture: cfg.autoCapture === true,
      autoRecall: cfg.autoRecall !== false,
      captureMaxChars: captureMaxChars ?? DEFAULT_CAPTURE_MAX_CHARS,
      ...(storageOptions ? { storageOptions } : {}),
    };
  },
  uiHints: {
    "embedding.apiKey": {
      label: "OpenAI API Key",
      sensitive: true,
      placeholder: "sk-proj-...",
      help: "API key for OpenAI embeddings (or use ${OPENAI_API_KEY})",
    },
    "embedding.baseUrl": {
      label: "Base URL",
      placeholder: "https://api.openai.com/v1",
      help: "Base URL for compatible providers (e.g. http://localhost:11434/v1)",
      advanced: true,
    },
    "embedding.dimensions": {
      label: "Dimensions",
      placeholder: "1536",
      help: "Vector dimensions for custom models (required for non-standard models)",
      advanced: true,
    },
    "embedding.model": {
      label: "Embedding Model",
      placeholder: DEFAULT_MODEL,
      help: "OpenAI embedding model to use",
    },
    dbPath: {
      label: "Database Path",
      placeholder: "~/.openclaw/memory/lancedb",
      advanced: true,
      help: "Local filesystem path or cloud storage URI (s3://, gs://) for LanceDB database",
    },
    autoCapture: {
      label: "Auto-Capture",
      help: "Automatically capture important information from conversations",
    },
    autoRecall: {
      label: "Auto-Recall",
      help: "Automatically inject relevant memories into context",
    },
    captureMaxChars: {
      label: "Capture Max Chars",
      help: "Maximum message length eligible for auto-capture",
      advanced: true,
      placeholder: String(DEFAULT_CAPTURE_MAX_CHARS),
    },
    storageOptions: {
      label: "Storage Options",
      sensitive: true,
      advanced: true,
      help: "Storage configuration options (access_key, secret_key, endpoint, etc.); supports ${ENV_VAR} values",
    },
  },
};