mirror of
https://github.com/openclaw/openclaw.git
synced 2026-03-20 06:20:55 +00:00
Tests: Add tooling / skill for detecting and fixing memory leaks in tests (#50654)
* Tests: add periodic heap snapshot tooling
* Skills: add test heap leak workflow
* Apply suggestion from @greptile-apps[bot]

Co-authored-by: greptile-apps[bot] <165735046+greptile-apps[bot]@users.noreply.github.com>

* Update scripts/test-parallel.mjs

Co-authored-by: greptile-apps[bot] <165735046+greptile-apps[bot]@users.noreply.github.com>

---------

Co-authored-by: Vincent Koc <vincentkoc@ieee.org>
Co-authored-by: greptile-apps[bot] <165735046+greptile-apps[bot]@users.noreply.github.com>
71
.agents/skills/openclaw-test-heap-leaks/SKILL.md
Normal file
@@ -0,0 +1,71 @@
---
name: openclaw-test-heap-leaks
description: Investigate `pnpm test` memory growth, Vitest worker OOMs, and suspicious RSS increases in OpenClaw using the `scripts/test-parallel.mjs` heap snapshot tooling. Use when Codex needs to reproduce test-lane memory growth, collect repeated `.heapsnapshot` files, compare snapshots from the same worker PID, distinguish transformed-module retention from real data leaks, and fix or reduce the impact by patching cleanup logic or isolating hotspot tests.
---

# OpenClaw Test Heap Leaks

Use this skill for test-memory investigations. Do not guess from RSS alone when heap snapshots are available.

## Workflow

1. Reproduce the failing shape first.
   - Match the real entrypoint if possible. For Linux CI-style unit failures, start with:
     - `pnpm canvas:a2ui:bundle && OPENCLAW_TEST_MEMORY_TRACE=1 OPENCLAW_TEST_HEAPSNAPSHOT_INTERVAL_MS=60000 OPENCLAW_TEST_HEAPSNAPSHOT_DIR=.tmp/heapsnap OPENCLAW_TEST_WORKERS=2 OPENCLAW_TEST_MAX_OLD_SPACE_SIZE_MB=6144 pnpm test`
   - Keep `OPENCLAW_TEST_MEMORY_TRACE=1` enabled so the wrapper prints per-file RSS summaries alongside the snapshots.
   - If the report is about a specific shard or worker budget, preserve that shape.

2. Wait for repeated snapshots before concluding anything.
   - Take at least two intervals from the same lane.
   - Compare snapshots from the same PID inside one lane directory such as `.tmp/heapsnap/unit-fast/`.
   - Use `scripts/heapsnapshot-delta.mjs` to compare either two files directly or the earliest/latest pair per PID in one lane directory.

3. Classify the growth before choosing a fix.
   - If growth is dominated by Vite/Vitest transformed source strings, `Module`, `system / Context`, bytecode, descriptor arrays, or property maps, treat it as retained module graph growth in long-lived workers.
   - If growth is dominated by app objects, caches, buffers, server handles, timers, mock state, sqlite state, or similar runtime objects, treat it as a likely cleanup or lifecycle leak.

4. Fix the right layer.
   - For retained transformed-module growth in shared workers:
     - Move hotspot files out of `unit-fast` by updating `test/fixtures/test-parallel.behavior.json`.
     - Prefer `singletonIsolated` for files that are safe alone but inflate shared worker heaps.
     - If the file should already have been peeled out by timings but is absent from `test/fixtures/test-timings.unit.json`, call that out explicitly. Missing timings are a scheduling blind spot.
   - For real leaks:
     - Patch the implicated test or runtime cleanup path.
     - Look for missing `afterEach`/`afterAll`, module-reset gaps, retained global state, unreleased DB handles, or listeners/timers that survive the file.

5. Verify with the most direct proof.
   - Re-run the targeted lane or file with heap snapshots enabled if the suite still finishes in reasonable time.
   - If snapshot overhead pushes tests over Vitest timeouts, fall back to the same lane without snapshots and confirm the RSS trend or OOM is reduced.
   - For wrapper-only changes, at minimum verify the expected lanes start and the snapshot files are written.
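The classification in step 3 can be sketched as a small heuristic over snapshot-delta rows. This is a rough illustration, not part of the repo tooling: `classifyRow`, `classifyGrowth`, and the type/name lists are hypothetical, and treating all string growth as transform output is only a coarse proxy for "transformed source strings".

```javascript
// Hypothetical heuristic, not repo code. The lists below are assumptions
// loosely based on step 3; tune them against real snapshot deltas.
const MODULE_GRAPH_TYPES = new Set(["code", "object shape", "string", "concatenated string"]);
const MODULE_GRAPH_NAMES = new Set(["Module", "system / Context"]);

// Bucket one delta row as module-graph retention or runtime-object growth.
function classifyRow(row) {
  if (MODULE_GRAPH_TYPES.has(row.type) || MODULE_GRAPH_NAMES.has(row.name)) {
    return "module-graph";
  }
  return "runtime";
}

// Sum positive growth per bucket and report which bucket dominates.
function classifyGrowth(rows) {
  const totals = { "module-graph": 0, runtime: 0 };
  for (const row of rows) {
    if (row.sizeDelta > 0) {
      totals[classifyRow(row)] += row.sizeDelta;
    }
  }
  const verdict = totals["module-graph"] >= totals.runtime ? "module-graph" : "runtime";
  return { verdict, totals };
}

// Made-up rows shaped like the output of a snapshot delta.
const example = classifyGrowth([
  { type: "string", name: "export const x = 1;", sizeDelta: 900_000 },
  { type: "code", name: "(bytecode)", sizeDelta: 300_000 },
  { type: "object", name: "Map", sizeDelta: 120_000 },
]);
console.log(example.verdict, example.totals);
```

A `module-graph` verdict points toward lane isolation in step 4; a `runtime` verdict points toward a cleanup fix. Treat the verdict as a starting hint and still read the top rows by hand.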
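For the "real leaks" branch of step 4, one possible cleanup pattern is a per-file disposer stack that releases everything a test acquired. This is a minimal sketch under assumptions: `createCleanupStack` is a hypothetical helper that does not exist in the repo, and it only handles synchronous disposers.

```javascript
// Hypothetical cleanup registry; none of these names come from OpenClaw.
function createCleanupStack() {
  const disposers = [];
  return {
    // Register a disposer and hand the resource back for inline use.
    track(resource, dispose) {
      disposers.push(() => dispose(resource));
      return resource;
    },
    // Run disposers in reverse registration order. Keep going when one
    // throws so a single failing close cannot strand the other handles.
    disposeAll() {
      const errors = [];
      while (disposers.length > 0) {
        const dispose = disposers.pop();
        try {
          dispose();
        } catch (error) {
          errors.push(error);
        }
      }
      return errors;
    },
    get size() {
      return disposers.length;
    },
  };
}

// Stand-ins for the listeners and timers a test might leave behind.
const cleanup = createCleanupStack();
const listeners = cleanup.track(new Set(["onMessage"]), (set) => set.clear());
cleanup.track(setInterval(() => {}, 1_000), clearInterval);

const errors = cleanup.disposeAll();
console.log(errors.length, listeners.size, cleanup.size);
```

In a Vitest file this would typically be wired as `afterEach(() => cleanup.disposeAll())`, so every resource tracked during a test is released before the next one starts. Asynchronous handles would need an awaited variant.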
## Heuristics

- Do not call everything a leak. In this repo, large `unit-fast` growth can be a worker-lifetime problem rather than an application object leak.
- `scripts/test-parallel.mjs` and `scripts/test-parallel-memory.mjs` are the primary control points for wrapper diagnostics.
- The lane names printed by `[test-parallel] start ...` and `[test-parallel][mem] summary ...` tell you where to focus.
- When one or two files account for most of the delta and they are missing from timings, reducing impact by isolating them is usually the first pragmatic fix.
- When the same retained object families grow across multiple intervals in the same worker PID, trust the snapshots over intuition.

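The wrapper also logs a `[test-parallel][heap]` line each time it signals workers, in the form `<lane> seq=<n> reason=<reason> signaled=<a>/<b> dir=<path>`. A captured run log can be scanned for these lines to find the lane directory to compare. The parser below is a sketch: `parseHeapLogLine` is a hypothetical helper and the sample line is made up.

```javascript
// Matches the heap log line format printed by the wrapper's snapshot trigger.
const HEAP_LINE_PATTERN =
  /^\[test-parallel\]\[heap\] (?<lane>\S+) seq=(?<seq>\d+) reason=(?<reason>\S+) signaled=(?<signaled>\d+)\/(?<targets>\d+) dir=(?<dir>.+)$/u;

// Hypothetical helper: returns null for lines that are not heap log lines.
function parseHeapLogLine(line) {
  const match = line.trim().match(HEAP_LINE_PATTERN);
  if (!match?.groups) {
    return null;
  }
  const { lane, seq, reason, signaled, targets, dir } = match.groups;
  return {
    lane,
    sequence: Number.parseInt(seq, 10),
    reason,
    signaled: Number.parseInt(signaled, 10),
    targets: Number.parseInt(targets, 10),
    dir,
  };
}

// Made-up sample line; real paths and counts will differ.
const sampleLine =
  "[test-parallel][heap] unit-fast seq=2 reason=interval signaled=2/2 dir=/repo/.tmp/heapsnap/unit-fast";
const parsedLine = parseHeapLogLine(sampleLine);
console.log(parsedLine);
```

A `signaled` count lower than `targets` means some processes exited between sampling and signal delivery, so fewer snapshot files than expected is not necessarily a tooling failure.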
## Snapshot Comparison

- Direct comparison:
  - `node .agents/skills/openclaw-test-heap-leaks/scripts/heapsnapshot-delta.mjs before.heapsnapshot after.heapsnapshot`
- Auto-select earliest/latest snapshots per PID within one lane:
  - `node .agents/skills/openclaw-test-heap-leaks/scripts/heapsnapshot-delta.mjs --lane-dir .tmp/heapsnap/unit-fast`
- Useful flags:
  - `--top 40`
  - `--min-kb 32`
  - `--pid 16133`

Read the top positive deltas first. Large positive growth in module-transform artifacts suggests lane isolation; large positive growth in runtime objects suggests a real leak.

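To make "positive delta" concrete: a heap snapshot stores nodes as one flat integer array, with `snapshot.meta.node_fields` naming the slots of each node. Summing `self_size` per type/name pair and subtracting the earlier totals gives the delta rows. The sketch below uses tiny synthetic snapshots; `makeSnapshot`, `summarize`, and `positiveDeltas` are hypothetical names, and real snapshots carry more fields per node.

```javascript
// Build a tiny synthetic snapshot. Real files have more node fields.
function makeSnapshot(rows) {
  const meta = {
    node_fields: ["type", "name", "self_size"],
    node_types: [["object", "string"]],
  };
  const strings = [];
  const nodes = [];
  for (const [type, name, selfSize] of rows) {
    let nameIndex = strings.indexOf(name);
    if (nameIndex === -1) {
      nameIndex = strings.push(name) - 1;
    }
    nodes.push(meta.node_types[0].indexOf(type), nameIndex, selfSize);
  }
  return { snapshot: { meta }, nodes, strings };
}

// Sum self_size per "type\tname" key by walking the flat node array.
function summarize(data) {
  const { node_fields: fields, node_types: types } = data.snapshot.meta;
  const typeIndex = fields.indexOf("type");
  const nameIndex = fields.indexOf("name");
  const sizeIndex = fields.indexOf("self_size");
  const totals = new Map();
  for (let offset = 0; offset < data.nodes.length; offset += fields.length) {
    const type = types[0][data.nodes[offset + typeIndex]];
    const name = data.strings[data.nodes[offset + nameIndex]];
    const key = `${type}\t${name}`;
    totals.set(key, (totals.get(key) ?? 0) + data.nodes[offset + sizeIndex]);
  }
  return totals;
}

// Keep only keys that grew, largest growth first.
function positiveDeltas(before, after) {
  const rows = [];
  for (const [key, size] of after) {
    const delta = size - (before.get(key) ?? 0);
    if (delta > 0) {
      rows.push({ key, delta });
    }
  }
  return rows.sort((left, right) => right.delta - left.delta);
}

const beforeTotals = summarize(makeSnapshot([["object", "Cache", 100], ["string", "src", 50]]));
const afterTotals = summarize(
  makeSnapshot([["object", "Cache", 100], ["object", "Cache", 300], ["string", "src", 50]]),
);
const deltas = positiveDeltas(beforeTotals, afterTotals);
console.log(deltas);
```

Here only the `Cache` objects grew, so they are the single row reported; the unchanged string is filtered out, much like entries below the `--min-kb` threshold.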
## Output Expectations

When using this skill, report:

- The exact reproduce command.
- Which lane and PID were compared.
- The dominant retained object families from the snapshot delta.
- Whether the issue is a real leak or shared-worker retained module growth.
- The concrete fix or impact-reduction patch.
- What you verified, and what snapshot overhead prevented you from verifying.
@@ -0,0 +1,4 @@
interface:
  display_name: "Test Heap Leaks"
  short_description: "Investigate test OOMs with heap snapshots"
  default_prompt: "Use $openclaw-test-heap-leaks to investigate test memory growth with heap snapshots and reduce its impact."
@@ -0,0 +1,265 @@
#!/usr/bin/env node

import fs from "node:fs";
import path from "node:path";

function printUsage() {
  console.error(
    "Usage: node heapsnapshot-delta.mjs <before.heapsnapshot> <after.heapsnapshot> [--top N] [--min-kb N]",
  );
  console.error(
    " or: node heapsnapshot-delta.mjs --lane-dir <dir> [--pid PID] [--top N] [--min-kb N]",
  );
}

function fail(message) {
  console.error(message);
  process.exit(1);
}

function parseArgs(argv) {
  const options = {
    top: 30,
    minKb: 64,
    laneDir: null,
    pid: null,
    files: [],
  };

  for (let index = 0; index < argv.length; index += 1) {
    const arg = argv[index];
    if (arg === "--top") {
      options.top = Number.parseInt(argv[index + 1] ?? "", 10);
      index += 1;
      continue;
    }
    if (arg === "--min-kb") {
      options.minKb = Number.parseInt(argv[index + 1] ?? "", 10);
      index += 1;
      continue;
    }
    if (arg === "--lane-dir") {
      options.laneDir = argv[index + 1] ?? null;
      index += 1;
      continue;
    }
    if (arg === "--pid") {
      options.pid = Number.parseInt(argv[index + 1] ?? "", 10);
      index += 1;
      continue;
    }
    options.files.push(arg);
  }

  if (!Number.isFinite(options.top) || options.top <= 0) {
    fail("--top must be a positive integer");
  }
  if (!Number.isFinite(options.minKb) || options.minKb < 0) {
    fail("--min-kb must be a non-negative integer");
  }
  if (options.pid !== null && (!Number.isInteger(options.pid) || options.pid <= 0)) {
    fail("--pid must be a positive integer");
  }

  return options;
}

function parseHeapFilename(filePath) {
  const base = path.basename(filePath);
  const match = base.match(
    /^Heap\.(?<stamp>\d{8}\.\d{6})\.(?<pid>\d+)\.0\.(?<seq>\d+)\.heapsnapshot$/u,
  );
  if (!match?.groups) {
    return null;
  }
  return {
    filePath,
    pid: Number.parseInt(match.groups.pid, 10),
    stamp: match.groups.stamp,
    sequence: Number.parseInt(match.groups.seq, 10),
  };
}

function resolvePair(options) {
  if (options.laneDir) {
    const entries = fs
      .readdirSync(options.laneDir)
      .map((name) => parseHeapFilename(path.join(options.laneDir, name)))
      .filter((entry) => entry !== null)
      .filter((entry) => options.pid === null || entry.pid === options.pid)
      .toSorted((left, right) => {
        if (left.pid !== right.pid) {
          return left.pid - right.pid;
        }
        if (left.stamp !== right.stamp) {
          return left.stamp.localeCompare(right.stamp);
        }
        return left.sequence - right.sequence;
      });

    if (entries.length === 0) {
      fail(`No matching heap snapshots found in ${options.laneDir}`);
    }

    const groups = new Map();
    for (const entry of entries) {
      const group = groups.get(entry.pid) ?? [];
      group.push(entry);
      groups.set(entry.pid, group);
    }

    const candidates = Array.from(groups.values())
      .map((group) => ({
        pid: group[0].pid,
        before: group[0],
        after: group.at(-1),
        count: group.length,
      }))
      .filter((entry) => entry.count >= 2);

    if (candidates.length === 0) {
      fail(`Need at least two snapshots for one PID in ${options.laneDir}`);
    }

    const chosen =
      options.pid !== null
        ? (candidates.find((entry) => entry.pid === options.pid) ?? null)
        : candidates.toSorted((left, right) => right.count - left.count || left.pid - right.pid)[0];

    if (!chosen) {
      fail(`No PID with at least two snapshots matched in ${options.laneDir}`);
    }

    return {
      before: chosen.before.filePath,
      after: chosen.after.filePath,
      pid: chosen.pid,
      snapshotCount: chosen.count,
    };
  }

  if (options.files.length !== 2) {
    printUsage();
    process.exit(1);
  }

  return {
    before: options.files[0],
    after: options.files[1],
    pid: null,
    snapshotCount: 2,
  };
}

function loadSummary(filePath) {
  const data = JSON.parse(fs.readFileSync(filePath, "utf8"));
  const meta = data.snapshot?.meta;
  if (!meta) {
    fail(`Invalid heap snapshot: ${filePath}`);
  }

  const nodeFieldCount = meta.node_fields.length;
  const typeNames = meta.node_types[0];
  const strings = data.strings;
  const typeIndex = meta.node_fields.indexOf("type");
  const nameIndex = meta.node_fields.indexOf("name");
  const selfSizeIndex = meta.node_fields.indexOf("self_size");

  const summary = new Map();
  for (let offset = 0; offset < data.nodes.length; offset += nodeFieldCount) {
    const type = typeNames[data.nodes[offset + typeIndex]];
    const name = strings[data.nodes[offset + nameIndex]];
    const selfSize = data.nodes[offset + selfSizeIndex];
    const key = `${type}\t${name}`;
    const current = summary.get(key) ?? {
      type,
      name,
      selfSize: 0,
      count: 0,
    };
    current.selfSize += selfSize;
    current.count += 1;
    summary.set(key, current);
  }
  return {
    nodeCount: data.snapshot.node_count,
    summary,
  };
}

function formatBytes(bytes) {
  if (Math.abs(bytes) >= 1024 ** 2) {
    return `${(bytes / 1024 ** 2).toFixed(2)} MiB`;
  }
  if (Math.abs(bytes) >= 1024) {
    return `${(bytes / 1024).toFixed(1)} KiB`;
  }
  return `${bytes} B`;
}

function formatDelta(bytes) {
  return `${bytes >= 0 ? "+" : "-"}${formatBytes(Math.abs(bytes))}`;
}

function truncate(text, maxLength) {
  return text.length <= maxLength ? text : `${text.slice(0, maxLength - 1)}…`;
}

function main() {
  const options = parseArgs(process.argv.slice(2));
  const pair = resolvePair(options);
  const before = loadSummary(pair.before);
  const after = loadSummary(pair.after);
  const minBytes = options.minKb * 1024;

  const rows = [];
  for (const [key, next] of after.summary) {
    const previous = before.summary.get(key) ?? { selfSize: 0, count: 0 };
    const sizeDelta = next.selfSize - previous.selfSize;
    const countDelta = next.count - previous.count;
    if (sizeDelta < minBytes) {
      continue;
    }
    rows.push({
      type: next.type,
      name: next.name,
      sizeDelta,
      countDelta,
      afterSize: next.selfSize,
      afterCount: next.count,
    });
  }

  rows.sort(
    (left, right) => right.sizeDelta - left.sizeDelta || right.countDelta - left.countDelta,
  );

  console.log(`before: ${pair.before}`);
  console.log(`after: ${pair.after}`);
  if (pair.pid !== null) {
    console.log(`pid: ${pair.pid} (${pair.snapshotCount} snapshots found)`);
  }
  console.log(
    `nodes: ${before.nodeCount} -> ${after.nodeCount} (${after.nodeCount - before.nodeCount >= 0 ? "+" : ""}${after.nodeCount - before.nodeCount})`,
  );
  console.log(`filter: top=${options.top} min=${options.minKb} KiB`);
  console.log("");

  if (rows.length === 0) {
    console.log("No entries exceeded the minimum delta.");
    return;
  }

  for (const row of rows.slice(0, options.top)) {
    console.log(
      [
        formatDelta(row.sizeDelta).padStart(11),
        `count ${row.countDelta >= 0 ? "+" : ""}${row.countDelta}`.padStart(10),
        row.type.padEnd(16),
        truncate(row.name || "(empty)", 96),
      ].join(" "),
    );
  }
}

main();
@@ -11,7 +11,7 @@ const ANSI_ESCAPE_PATTERN = new RegExp(
 const COMPLETED_TEST_FILE_LINE_PATTERN =
   /(?<file>(?:src|extensions|test|ui)\/\S+?\.(?:live\.test|e2e\.test|test)\.ts)\s+\(.*\)\s+(?<duration>\d+(?:\.\d+)?)(?<unit>ms|s)\s*$/;

-const PS_COLUMNS = ["pid=", "ppid=", "rss="];
+const PS_COLUMNS = ["pid=", "ppid=", "rss=", "comm="];

 function parseDurationMs(rawValue, unit) {
   const parsed = Number.parseFloat(rawValue);
@@ -41,7 +41,7 @@ export function parseCompletedTestFileLines(text) {
     .filter((entry) => entry !== null);
 }

-export function sampleProcessTreeRssKb(rootPid) {
+export function getProcessTreeRecords(rootPid) {
   if (!Number.isInteger(rootPid) || rootPid <= 0 || process.platform === "win32") {
     return null;
   }
@@ -54,13 +54,13 @@ export function sampleProcessTreeRssKb(rootPid) {
   }

   const childPidsByParent = new Map();
-  const rssByPid = new Map();
+  const recordsByPid = new Map();
   for (const line of result.stdout.split(/\r?\n/u)) {
     const trimmed = line.trim();
     if (!trimmed) {
       continue;
     }
-    const [pidRaw, parentRaw, rssRaw] = trimmed.split(/\s+/u);
+    const [pidRaw, parentRaw, rssRaw, commandRaw] = trimmed.split(/\s+/u, 4);
     const pid = Number.parseInt(pidRaw ?? "", 10);
     const parentPid = Number.parseInt(parentRaw ?? "", 10);
     const rssKb = Number.parseInt(rssRaw ?? "", 10);
@@ -70,27 +70,30 @@ export function sampleProcessTreeRssKb(rootPid) {
     const siblings = childPidsByParent.get(parentPid) ?? [];
     siblings.push(pid);
     childPidsByParent.set(parentPid, siblings);
-    rssByPid.set(pid, rssKb);
+    recordsByPid.set(pid, {
+      pid,
+      parentPid,
+      rssKb,
+      command: commandRaw ?? "",
+    });
   }

-  if (!rssByPid.has(rootPid)) {
+  if (!recordsByPid.has(rootPid)) {
     return null;
   }

-  let rssKb = 0;
-  let processCount = 0;
   const queue = [rootPid];
   const visited = new Set();
+  const records = [];
   while (queue.length > 0) {
     const pid = queue.shift();
     if (pid === undefined || visited.has(pid)) {
       continue;
     }
     visited.add(pid);
-    const currentRssKb = rssByPid.get(pid);
-    if (currentRssKb !== undefined) {
-      rssKb += currentRssKb;
-      processCount += 1;
+    const record = recordsByPid.get(pid);
+    if (record) {
+      records.push(record);
     }
     for (const childPid of childPidsByParent.get(pid) ?? []) {
       if (!visited.has(childPid)) {
@@ -99,5 +102,21 @@ export function sampleProcessTreeRssKb(rootPid) {
     }
   }

+  return records;
+}
+
+export function sampleProcessTreeRssKb(rootPid) {
+  const records = getProcessTreeRecords(rootPid);
+  if (!records) {
+    return null;
+  }
+
+  let rssKb = 0;
+  let processCount = 0;
+  for (const record of records) {
+    rssKb += record.rssKb;
+    processCount += 1;
+  }
+
   return { rssKb, processCount };
 }
@@ -4,7 +4,11 @@ import os from "node:os";
 import path from "node:path";
 import { channelTestPrefixes } from "../vitest.channel-paths.mjs";
 import { isUnitConfigTestFile } from "../vitest.unit-paths.mjs";
-import { parseCompletedTestFileLines, sampleProcessTreeRssKb } from "./test-parallel-memory.mjs";
+import {
+  getProcessTreeRecords,
+  parseCompletedTestFileLines,
+  sampleProcessTreeRssKb,
+} from "./test-parallel-memory.mjs";
 import {
   appendCapturedOutput,
   hasFatalTestRunOutput,
@@ -725,6 +729,25 @@ const memoryTraceEnabled =
   (rawMemoryTrace !== "0" && rawMemoryTrace !== "false" && isCI));
 const memoryTracePollMs = Math.max(250, parseEnvNumber("OPENCLAW_TEST_MEMORY_TRACE_POLL_MS", 1000));
 const memoryTraceTopCount = Math.max(1, parseEnvNumber("OPENCLAW_TEST_MEMORY_TRACE_TOP_COUNT", 6));
+const heapSnapshotIntervalMs = Math.max(
+  0,
+  parseEnvNumber("OPENCLAW_TEST_HEAPSNAPSHOT_INTERVAL_MS", 0),
+);
+const heapSnapshotMinIntervalMs = 5000;
+const heapSnapshotEnabled =
+  process.platform !== "win32" &&
+  heapSnapshotIntervalMs >= heapSnapshotMinIntervalMs;
+const heapSnapshotEnabled = process.platform !== "win32" && heapSnapshotIntervalMs > 0;
+const heapSnapshotSignal = process.env.OPENCLAW_TEST_HEAPSNAPSHOT_SIGNAL?.trim() || "SIGUSR2";
+const heapSnapshotBaseDir = heapSnapshotEnabled
+  ? path.resolve(
+      process.env.OPENCLAW_TEST_HEAPSNAPSHOT_DIR?.trim() ||
+        path.join(os.tmpdir(), `openclaw-heapsnapshots-${Date.now()}`),
+    )
+  : null;
+const ensureNodeOptionFlag = (nodeOptions, flagPrefix, nextValue) =>
+  nodeOptions.includes(flagPrefix) ? nodeOptions : `${nodeOptions} ${nextValue}`.trim();
+const isNodeLikeProcess = (command) => /(?:^|\/)node(?:$|\.exe$)/iu.test(command);

 const runOnce = (entry, extraArgs = []) =>
   new Promise((resolve) => {
@@ -757,23 +780,44 @@ const runOnce = (entry, extraArgs = []) =>
       (acc, flag) => (acc.includes(flag) ? acc : `${acc} ${flag}`.trim()),
       nodeOptions,
     );
-    const heapFlag =
+    const heapSnapshotDir =
+      heapSnapshotBaseDir === null ? null : path.join(heapSnapshotBaseDir, entry.name);
+    let resolvedNodeOptions =
       maxOldSpaceSizeMb && !nextNodeOptions.includes("--max-old-space-size=")
-        ? `--max-old-space-size=${maxOldSpaceSizeMb}`
-        : null;
-    const resolvedNodeOptions = heapFlag
-      ? `${nextNodeOptions} ${heapFlag}`.trim()
-      : nextNodeOptions;
+        ? `${nextNodeOptions} --max-old-space-size=${maxOldSpaceSizeMb}`.trim()
+        : nextNodeOptions;
+    if (heapSnapshotEnabled && heapSnapshotDir) {
+      try {
+        fs.mkdirSync(heapSnapshotDir, { recursive: true });
+      } catch (err) {
+        console.error(`[test-parallel] failed to create heap snapshot dir ${heapSnapshotDir}: ${String(err)}`);
+        resolve(1);
+        return;
+      }
+      resolvedNodeOptions = ensureNodeOptionFlag(
+        resolvedNodeOptions,
+        "--diagnostic-dir=",
+        `--diagnostic-dir=${heapSnapshotDir}`,
+      );
+      resolvedNodeOptions = ensureNodeOptionFlag(
+        resolvedNodeOptions,
+        "--heapsnapshot-signal=",
+        `--heapsnapshot-signal=${heapSnapshotSignal}`,
+      );
+    }
+    }
     let output = "";
     let fatalSeen = false;
     let childError = null;
     let child;
     let pendingLine = "";
     let memoryPollTimer = null;
+    let heapSnapshotTimer = null;
     const memoryFileRecords = [];
     let initialTreeSample = null;
     let latestTreeSample = null;
     let peakTreeSample = null;
+    let heapSnapshotSequence = 0;
     const updatePeakTreeSample = (sample, reason) => {
       if (!sample) {
         return;
@@ -782,6 +826,35 @@ const runOnce = (entry, extraArgs = []) =>
         peakTreeSample = { ...sample, reason };
       }
     };
+    const triggerHeapSnapshot = (reason) => {
+      if (!heapSnapshotEnabled || !child?.pid || !heapSnapshotDir) {
+        return;
+      }
+      const records = getProcessTreeRecords(child.pid) ?? [];
+      const targetPids = records
+        .filter((record) => record.pid !== process.pid && isNodeLikeProcess(record.command))
+        .map((record) => record.pid);
+      if (targetPids.length === 0) {
+        return;
+      }
+      heapSnapshotSequence += 1;
+      let signaledCount = 0;
+      for (const pid of targetPids) {
+        try {
+          process.kill(pid, heapSnapshotSignal);
+          signaledCount += 1;
+        } catch {
+          // Process likely exited between ps sampling and signal delivery.
+        }
+      }
+      if (signaledCount > 0) {
+        console.log(
+          `[test-parallel][heap] ${entry.name} seq=${String(heapSnapshotSequence)} reason=${reason} signaled=${String(
+            signaledCount,
+          )}/${String(targetPids.length)} dir=${heapSnapshotDir}`,
+        );
+      }
+    };
     const captureTreeSample = (reason) => {
       if (!memoryTraceEnabled || !child?.pid) {
         return null;
@@ -877,6 +950,11 @@ const runOnce = (entry, extraArgs = []) =>
           captureTreeSample("poll");
         }, memoryTracePollMs);
       }
+      if (heapSnapshotEnabled) {
+        heapSnapshotTimer = setInterval(() => {
+          triggerHeapSnapshot("interval");
+        }, heapSnapshotIntervalMs);
+      }
     } catch (err) {
       console.error(`[test-parallel] spawn failed: ${String(err)}`);
       resolve(1);
@@ -905,6 +983,9 @@ const runOnce = (entry, extraArgs = []) =>
       if (memoryPollTimer) {
         clearInterval(memoryPollTimer);
       }
+      if (heapSnapshotTimer) {
+        clearInterval(heapSnapshotTimer);
+      }
       children.delete(child);
       const resolvedCode = resolveTestRunExitCode({ code, signal, output, fatalSeen, childError });
       logMemoryTraceSummary();