From 89706d323c97d474863344a2b0bc410b021e56b9 Mon Sep 17 00:00:00 2001 From: Gustavo Madeira Santana Date: Fri, 17 Apr 2026 02:19:06 -0400 Subject: [PATCH] Docs: add test performance guardrails --- AGENTS.md | 4 +++ extensions/AGENTS.md | 4 +++ src/agents/AGENTS.md | 29 +++++++++++++++++++++ src/agents/pi-embedded-runner/run/AGENTS.md | 26 ++++++++++++++++++ src/agents/tools/AGENTS.md | 25 ++++++++++++++++++ src/channels/AGENTS.md | 3 +++ src/gateway/AGENTS.md | 23 ++++++++++++++++ 7 files changed, 114 insertions(+) create mode 100644 src/agents/AGENTS.md create mode 100644 src/agents/pi-embedded-runner/run/AGENTS.md create mode 100644 src/agents/tools/AGENTS.md create mode 100644 src/gateway/AGENTS.md diff --git a/AGENTS.md b/AGENTS.md index 97bd0c8693f..78fc5fc1324 100644 --- a/AGENTS.md +++ b/AGENTS.md @@ -87,6 +87,8 @@ - `src/plugin-sdk/AGENTS.md` expands public SDK contract rules. - `src/plugins/AGENTS.md` expands plugin loading, registry, and manifest rules. - `src/gateway/protocol/AGENTS.md` expands typed Gateway protocol rules. + - `src/gateway/AGENTS.md` expands Gateway server hot-path and plugin artifact rules. + - `src/agents/AGENTS.md` expands agent test/import performance rules. - `test/helpers/AGENTS.md` and `test/helpers/channels/AGENTS.md` expand shared test helper boundary rules. - Plugin architecture direction: - Keep a manifest-first control plane: discovery, validation, enablement, setup hints, and activation planning should stay metadata-driven by default. @@ -215,6 +217,8 @@ - Test performance guardrail: when production code already accepts `deps`, callbacks, or runtime injection, use that seam in tests before adding module-level mocks. - Test performance guardrail: prefer narrow public SDK subpaths such as `models-provider-runtime`, `skill-commands-runtime`, and `reply-dispatch-runtime` over older broad helper barrels when both expose the needed helper. - Test performance guardrail: treat import-dominated test time as a boundary bug. 
Refactor the import surface before adding more cases to the slow file. +- Test performance guardrail: when replacing a slow integration test with helper-level coverage, extract the exact production composition into a named helper and test that helper. Do not trade coverage shape for speed without preserving the behavior proof somewhere cheaper. +- Test performance guardrail: for plugin-owned static descriptors used by core tests or cold paths, prefer lightweight public artifacts with full-runtime fallback over loading broad bundled plugin barrels. - Agents MUST NOT modify baseline, inventory, ignore, snapshot, or expected-failure files to silence failing checks without explicit approval in this chat. - For targeted/local debugging, use the native root-project entrypoint: `pnpm test [vitest args...]` (for example `pnpm test src/commands/onboard-search.test.ts -t "shows registered plugin providers"`); do not default to raw `pnpm vitest run ...` because it bypasses the repo's default config/profile/pool routing. - Do not set test workers above 16; tried already. diff --git a/extensions/AGENTS.md b/extensions/AGENTS.md index 0232e37f00d..513d638d859 100644 --- a/extensions/AGENTS.md +++ b/extensions/AGENTS.md @@ -60,6 +60,10 @@ third-party plugins see. - Do not rely on eager global registry seeding or import-time side effects to make a plugin “available”. Plugin availability should come from manifest ownership plus targeted activation. +- When core needs plugin-owned static data on a hot path, expose a lightweight + top-level artifact such as `gateway-auth-api.ts`, `message-tool-api.ts`, or a + similarly narrow `*-api.ts`. Reuse the same local helper from the artifact and + the full plugin so fast paths do not drift from runtime behavior. 
## Expanding The Boundary diff --git a/src/agents/AGENTS.md b/src/agents/AGENTS.md new file mode 100644 index 00000000000..0c8fb308d66 --- /dev/null +++ b/src/agents/AGENTS.md @@ -0,0 +1,29 @@ +# Agents Test Performance + +Agent tests are often import-bound. Treat slow test files as architecture +signals, not just runner noise. + +## Guardrails + +- Benchmark before and after performance edits. Prefer existing grouped + artifacts when comparing suites, or use `/usr/bin/time -l pnpm test ` + for a scoped hotspot. +- If a test only needs schema, capability, routing, or static discovery data, + do not cold-load full bundled plugin/channel/provider runtime. Add or reuse a + lightweight typed artifact and keep full runtime as a fallback. +- Keep expensive bootstrap, embedded runner, provider, plugin, and channel + runtime work behind dependency injection or narrow helpers so tests can cover + behavior without starting the whole runtime. +- If moving coverage out of a slow integration test, preserve the exact + production composition in a named helper and test that helper. Do not remove + the behavior proof just because the old proof was slow. +- Avoid broad `importOriginal()` partial mocks and module resets in hot agent + tests. Use explicit mock factories, one-time imports, and reset only the + state the test mutates. + +## Verification + +- For agent performance changes, record seconds and RSS before/after in the + handoff or benchmark report. +- If the change touches lazy-loading, plugin runtime imports, or bundled + artifacts, run `pnpm build`. diff --git a/src/agents/pi-embedded-runner/run/AGENTS.md b/src/agents/pi-embedded-runner/run/AGENTS.md new file mode 100644 index 00000000000..e20d083b899 --- /dev/null +++ b/src/agents/pi-embedded-runner/run/AGENTS.md @@ -0,0 +1,26 @@ +# Embedded Runner Test Performance + +The embedded attempt runner is one of the most expensive agent test surfaces. +Use full-runner tests only when the behavior truly requires the runner. 
+ +## Guardrails + +- Prefer focused helper tests for prompt assembly, runtime-context construction, + cache metadata, token accounting, and maintenance decision logic. +- Keep full `runEmbeddedAttempt` coverage for cross-component behavior that + cannot be proven through helpers, not for a single derived field. +- When extracting a helper from runner logic, make production call that helper + directly, then test the helper. Avoid test-only copies of runner behavior. +- Preserve context-engine coverage for `sessionKey`, `sessionFile`, token + budget, current token count, prompt cache, and routing fields when slimming + tests. +- Treat a standalone full-runner test above a few seconds as suspect. First ask + whether the proof can move to a production helper plus one cheap integration + smoke. + +## Verification + +- For runner test slimming, run the touched helper test and the nearest + two-file runner/context-engine surface. +- Record Vitest duration, wall time, and RSS when the change is performance + motivated. diff --git a/src/agents/tools/AGENTS.md b/src/agents/tools/AGENTS.md new file mode 100644 index 00000000000..8b3db77eb7b --- /dev/null +++ b/src/agents/tools/AGENTS.md @@ -0,0 +1,25 @@ +# Agent Tools Performance + +Tool tests should not load full channel or plugin runtimes for static tool +descriptions. + +## Guardrails + +- Message-tool discovery should flow through shared discovery helpers and + lightweight channel artifacts before falling back to a full channel plugin + load. +- Channel-specific tool schemas, action lists, and static capabilities belong + in plugin-owned helpers that are reused by both the full plugin and the + lightweight artifact. +- Do not add direct bundled-plugin imports to agent tool tests for schema or + capability assertions. If the production path needs the same data, promote a + small public artifact instead. 
+- If a single assertion starts paying multi-second import/setup cost, split the + static descriptor path from runtime execution instead of adding more mocks + around the broad import. + +## Verification + +- For `src/agents/tools/*.test.ts` performance work, compare targeted file + runtime with `pnpm test ` before/after. +- Run `pnpm build` when adding or changing bundled plugin artifacts. diff --git a/src/channels/AGENTS.md b/src/channels/AGENTS.md index 616ee5c65a4..8a964673061 100644 --- a/src/channels/AGENTS.md +++ b/src/channels/AGENTS.md @@ -29,6 +29,9 @@ import from this tree directly. those files unless startup truly needs them. - Prefer a small local seam such as `channel-api.ts`, `*.runtime.ts`, or `*.runtime-api.ts` to keep heavy runtime code off the hot path. +- For core discovery paths used by Gateway or agent tools, prefer lightweight + bundled-plugin artifacts first and full channel plugin loading only as a + fallback. - Do not mix static and dynamic imports for the same heavy module family across a channel boundary change. If the path should stay lazy, keep it lazy end to end. diff --git a/src/gateway/AGENTS.md b/src/gateway/AGENTS.md new file mode 100644 index 00000000000..943b875b2e6 --- /dev/null +++ b/src/gateway/AGENTS.md @@ -0,0 +1,23 @@ +# Gateway Hot Paths + +Gateway server tests and startup paths should not materialize bundled plugin +runtime when they only need plugin-owned static descriptors. + +## Guardrails + +- For plugin-owned Gateway behavior such as auth-bypass paths, prefer a + lightweight public artifact resolver before falling back to the full channel + plugin. +- Keep the full plugin contract and the lightweight artifact backed by the same + plugin-owned helper so behavior does not diverge. +- Do not load broad bundled channel registries from Gateway HTTP/server code + just to answer static questions. 
+- If adding a new plugin-owned Gateway descriptor, add the core resolver, + plugin artifact, and mirrored full-plugin export in the same change. + +## Verification + +- Benchmark the affected Gateway test file before/after with + `pnpm test `. +- Run `pnpm build` when changing Gateway lazy-loading or bundled plugin + artifacts.
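
Reviewer note: the Gateway guardrails in this patch (lightweight artifact resolver first, full plugin only as a fallback, both backed by one plugin-owned helper) can be sketched roughly as follows. This is a minimal illustration, not the repo's implementation. Every name in it (`GatewayAuthDescriptor`, `DescriptorDeps`, `buildAuthDescriptor`, `resolveAuthDescriptor`, the plugin ids and paths) is hypothetical. Real loaders would probably be lazy dynamic imports; they are synchronous here to keep the sketch short.

```typescript
// Hypothetical sketch: none of these names come from the repository.

// Static, plugin-owned data that core needs on a hot path.
interface GatewayAuthDescriptor {
  bypassPaths: string[];
}

// Loaders are injected so tests can observe calls without loading real plugin code.
interface DescriptorDeps {
  // Cheap lookup; returns undefined when the plugin ships no lightweight artifact.
  loadArtifact: (pluginId: string) => GatewayAuthDescriptor | undefined;
  // Expensive lookup that would materialize the full plugin runtime.
  loadFullPlugin: (pluginId: string) => GatewayAuthDescriptor;
}

// Plugin-owned helper shared by the artifact and the full plugin,
// so the fast path cannot drift from runtime behavior.
function buildAuthDescriptor(prefix: string): GatewayAuthDescriptor {
  return { bypassPaths: [`${prefix}/webhook`, `${prefix}/health`] };
}

// Core resolver: lightweight artifact first, full runtime only as a fallback.
function resolveAuthDescriptor(
  pluginId: string,
  deps: DescriptorDeps,
): GatewayAuthDescriptor {
  return deps.loadArtifact(pluginId) ?? deps.loadFullPlugin(pluginId);
}

// Usage: a test-style harness that counts how often the expensive path runs.
let fullLoads = 0;
const deps: DescriptorDeps = {
  loadArtifact: (id) =>
    id === "demo-channel" ? buildAuthDescriptor("/demo") : undefined,
  loadFullPlugin: (id) => {
    fullLoads += 1;
    return buildAuthDescriptor(`/${id}`);
  },
};

const fast = resolveAuthDescriptor("demo-channel", deps);
const slow = resolveAuthDescriptor("legacy-channel", deps);
console.log(fast.bypassPaths, slow.bypassPaths, fullLoads);
```

Because the loaders arrive through `deps`, a test can assert that the expensive path stayed cold without module-level mocks, which matches the root guardrail about using existing injection seams first.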