openclaw/src/agents/openclaw-tools.sessions.e2e.test.ts
Tyler Yust b8f66c260d Agents: add nested subagent orchestration controls and reduce subagent token waste (#14447)
* Agents: add subagent orchestration controls

* Agents: add subagent orchestration controls (WIP uncommitted changes)

* feat(subagents): add depth-based spawn gating for sub-sub-agents

* feat(subagents): tool policy, registry, and announce chain for nested agents

* feat(subagents): system prompt, docs, changelog for nested sub-agents

* fix(subagents): prevent model fallback override, show model during active runs, and block context overflow fallback

Bug 1: When a session has an explicit model override (e.g., gpt/openai-codex),
the fallback candidate logic in resolveFallbackCandidates silently appended the
global primary model (opus) as a backstop. On reinjection/steer with a transient
error, the session could fall back to opus which has a smaller context window
and crash. Fix: when storedModelOverride is set, pass fallbacksOverride ?? []
instead of undefined, preventing the implicit primary backstop.
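The guard above can be sketched as follows. `resolveFallbackCandidates`, `storedModelOverride`, and `fallbacksOverride` are names from the commit message; the option shape and return value here are simplified stand-ins, not the real signature:

```typescript
// Hypothetical sketch of the Bug 1 fix: when the session pins a model,
// do not let the resolver append the global primary as an implicit backstop.
type FallbackOpts = {
  storedModelOverride?: string; // per-session pinned model, e.g. "gpt/openai-codex"
  fallbacksOverride?: string[]; // explicit fallback list, if the caller set one
  globalPrimary: string; // e.g. "opus"
};

function resolveFallbackCandidates(opts: FallbackOpts): string[] {
  // Passing `undefined` means "use defaults", which implicitly appends the
  // global primary. An explicit [] disables that backstop entirely.
  if (opts.storedModelOverride) {
    return opts.fallbacksOverride ?? [];
  }
  return opts.fallbacksOverride ?? [opts.globalPrimary];
}
```

The key distinction is `?? []` versus `?? [globalPrimary]`: only an unpinned session ever inherits the primary as a fallback.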

Bug 2: Active subagents showed 'model n/a' in /subagents list because
resolveModelDisplay only read entry.model/modelProvider (populated after run
completes). Fix: fall back to modelOverride/providerOverride fields which are
populated at spawn time via sessions.patch.
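A minimal sketch of that display fallback; the entry shape is an assumption based on the field names the commit mentions:

```typescript
// Hypothetical sketch of the Bug 2 fix: entry.model is only written after a
// run completes, so fall back to the spawn-time override field.
type SessionEntryLike = {
  model?: string; // populated once the run completes
  modelOverride?: string; // populated at spawn time via sessions.patch
};

function resolveModelDisplay(entry: SessionEntryLike): string {
  return entry.model ?? entry.modelOverride ?? "model n/a";
}
```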

Bug 3: Context overflow errors (prompt too long, context_length_exceeded) could
theoretically escape runEmbeddedPiAgent and be treated as failover candidates
in runWithModelFallback, causing a switch to a model with a smaller context
window. Fix: in runWithModelFallback, detect context overflow errors via
isLikelyContextOverflowError and rethrow them immediately instead of trying the
next model candidate.
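In outline, the rethrow looks like this; the error-message patterns and loop shape are illustrative, not the exact implementation:

```typescript
// Hypothetical sketch of the Bug 3 fix: a context-overflow error must not
// trigger failover to a model with a smaller window, so rethrow immediately.
function isLikelyContextOverflowError(err: unknown): boolean {
  const msg = err instanceof Error ? err.message : String(err);
  return /prompt too long|context_length_exceeded/i.test(msg);
}

async function runWithModelFallback<T>(
  candidates: string[],
  run: (model: string) => Promise<T>,
): Promise<T> {
  let lastError: unknown;
  for (const model of candidates) {
    try {
      return await run(model);
    } catch (err) {
      if (isLikelyContextOverflowError(err)) {
        throw err; // switching models cannot fix an oversized prompt
      }
      lastError = err; // transient failure: try the next candidate
    }
  }
  throw lastError;
}
```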

* fix(subagents): track spawn depth in session store and fix announce routing for nested agents

* Fix compaction status tracking and dedupe overflow compaction triggers

* fix(subagents): enforce depth block via session store and implement cascade kill

* fix: inject group chat context into system prompt

* fix(subagents): always write model to session store at spawn time

* Preserve spawnDepth when agent handler rewrites session entry

* fix(subagents): suppress announce on steer-restart

* fix(subagents): fallback spawned session model to runtime default

* fix(subagents): enforce spawn depth when caller key resolves by sessionId

* feat(subagents): implement active-first ordering for numeric targets and enhance task display

- Added a test to verify that subagents with numeric targets follow an active-first list ordering.
- Updated `resolveSubagentTarget` to sort subagent runs based on active status and recent activity.
- Enhanced task display in command responses to prevent truncation of long task descriptions.
- Introduced new utility functions for compacting task text and managing subagent run states.
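The active-first ordering can be expressed as a comparator along these lines; the run shape and field names are assumptions for illustration:

```typescript
// Hypothetical sketch of active-first ordering for numeric subagent targets:
// active runs sort ahead of ended ones, then by most recent activity.
type RunLike = { endedAt?: number; updatedAt: number };

function compareActiveFirst(a: RunLike, b: RunLike): number {
  const aActive = a.endedAt == null;
  const bActive = b.endedAt == null;
  if (aActive !== bActive) {
    return aActive ? -1 : 1; // active runs come first
  }
  return b.updatedAt - a.updatedAt; // newer activity first within each group
}
```

With this ordering, a numeric target like `2` consistently refers to the second entry of the displayed list rather than an arbitrary registry position.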

* fix(subagents): show model for active runs via run record fallback

When the spawned model matches the agent's default model, the session
store's override fields are intentionally cleared (isDefault: true).
The model/modelProvider fields are only populated after the run
completes. This left active subagents showing 'model n/a'.

Fix: store the resolved model on SubagentRunRecord at registration
time, and use it as a fallback in both display paths (subagents tool
and /subagents command) when the session store entry has no model info.

Changes:
- SubagentRunRecord: add optional model field
- registerSubagentRun: accept and persist model param
- sessions-spawn-tool: pass resolvedModel to registerSubagentRun
- subagents-tool: pass run record model as fallback to resolveModelDisplay
- commands-subagents: pass run record model as fallback to resolveModelDisplay

* feat(chat): implement session key resolution and reset on sidebar navigation

- Added functions to resolve the main session key and reset chat state when switching sessions from the sidebar.
- Updated the `renderTab` function to handle session key changes when navigating to the chat tab.
- Introduced a test to verify that the session resets to "main" when opening chat from the sidebar navigation.

* fix: subagent timeout=0 passthrough and fallback prompt duplication

Bug 1: runTimeoutSeconds=0 now means 'no timeout' instead of applying 600s default
- sessions-spawn-tool: default to undefined (not 0) when neither timeout param
  is provided; use != null check so explicit 0 passes through to gateway
- agent.ts: accept 0 as valid timeout (resolveAgentTimeoutMs already handles
  0 → MAX_SAFE_TIMEOUT_MS)

Bug 2: model fallback no longer re-injects the original prompt as a duplicate
- agent.ts: track fallback attempt index; on retries use a short continuation
  message instead of the full original prompt since the session file already
  contains it from the first attempt
- Also skip re-sending images on fallback retries (already in session)

* feat(subagents): truncate long task descriptions in subagents command output

- Introduced a new utility function to format task previews, limiting their length to improve readability.
- Updated the command handler to use the new formatting function, ensuring task descriptions are truncated appropriately.
- Adjusted related tests to verify that long task descriptions are now truncated in the output.
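The utility might look like the following sketch; the name `formatTaskPreview` and the 64-character cap are illustrative assumptions:

```typescript
// Hypothetical sketch of the task-preview helper: collapse whitespace and
// cap the preview length, marking truncation with an ellipsis.
function formatTaskPreview(task: string, maxLength = 64): string {
  const compact = task.replace(/\s+/g, " ").trim();
  if (compact.length <= maxLength) {
    return compact;
  }
  return `${compact.slice(0, maxLength - 1)}…`;
}
```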

* refactor(subagents): update subagent registry path resolution and improve command output formatting

- Replaced direct import of STATE_DIR with a utility function to resolve the state directory dynamically.
- Enhanced the formatting of command output for active and recent subagents, adding separators for better readability.
- Updated related tests to reflect changes in command output structure.

* fix(subagent): default sessions_spawn to no timeout when runTimeoutSeconds omitted

The previous fix (75a791106) correctly handled the case where
runTimeoutSeconds was explicitly set to 0 ("no timeout"). However,
when models omit the parameter entirely (which is common since the
schema marks it as optional), runTimeoutSeconds resolved to undefined.

undefined flowed through the chain as:
  sessions_spawn → timeout: undefined (since undefined != null is false)
  → gateway agent handler → agentCommand opts.timeout: undefined
  → resolveAgentTimeoutMs({ overrideSeconds: undefined })
  → DEFAULT_AGENT_TIMEOUT_SECONDS (600s = 10 minutes)

This caused subagents to be killed at exactly 10 minutes even though
the user's intent (via TOOLS.md) was for subagents to run without a
timeout.

Fix: default runTimeoutSeconds to 0 (no timeout) when neither
runTimeoutSeconds nor timeoutSeconds is provided by the caller.
Subagent spawns are long-running by design and should not inherit the
600s agent-command default timeout.
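Taken together with the earlier timeout fix, the resolution chain can be sketched as below. `resolveAgentTimeoutMs` and `runTimeoutSeconds` are names from the commits; the constant values and `Number.MAX_SAFE_INTEGER` (standing in for `MAX_SAFE_TIMEOUT_MS`) are illustrative:

```typescript
// Hypothetical sketch of the timeout resolution after both fixes:
// 0 is a deliberate "no timeout" sentinel, and omission at the spawn-tool
// boundary now defaults to 0 instead of falling through to 600s.
const DEFAULT_AGENT_TIMEOUT_SECONDS = 600;

function resolveAgentTimeoutMs(overrideSeconds?: number): number {
  const seconds = overrideSeconds ?? DEFAULT_AGENT_TIMEOUT_SECONDS;
  if (seconds === 0) {
    return Number.MAX_SAFE_INTEGER; // effectively no timeout
  }
  return seconds * 1000;
}

// sessions_spawn boundary: an omitted runTimeoutSeconds becomes 0 so
// long-running subagents never inherit the 600s agent-command default.
function resolveSpawnTimeoutSeconds(runTimeoutSeconds?: number): number {
  return runTimeoutSeconds != null ? runTimeoutSeconds : 0;
}
```

The `!= null` check matters: a plain truthiness test would treat an explicit `0` the same as omission and reintroduce the 600s default.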

* fix(subagent): accept timeout=0 in agent-via-gateway path (second 600s default)

* fix: thread timeout override through getReplyFromConfig dispatch path

getReplyFromConfig called resolveAgentTimeoutMs({ cfg }) with no override,
always falling back to the config default (600s). Add timeoutOverrideSeconds
to GetReplyOptions and pass it through as overrideSeconds so callers of the
dispatch chain can specify a custom timeout (0 = no timeout).

This complements the existing timeout threading in agentCommand and the
cron isolated-agent runner, which already pass overrideSeconds correctly.

* feat(model-fallback): normalize OpenAI Codex model references and enhance fallback handling

- Added normalization for OpenAI Codex model references, specifically converting "gpt-5.3-codex" to "openai-codex" before execution.
- Updated the `resolveFallbackCandidates` function to utilize the new normalization logic.
- Enhanced tests to verify the correct behavior of model normalization and fallback mechanisms.
- Introduced a new test case to ensure that the normalization process works as expected for various input formats.

* feat(tests): add unit tests for steer failure behavior in openclaw-tools

- Introduced a new test file to validate the behavior of subagents when steer replacement dispatch fails.
- Implemented tests to ensure that the announce behavior is restored correctly and that the suppression reason is cleared as expected.
- Enhanced the subagent registry with a new function to clear steer restart suppression.
- Updated related components to support the new test scenarios.

* fix(subagents): replace stop command with kill in slash commands and documentation

- Updated the `/subagents` command to replace `stop` with `kill` for consistency in controlling sub-agent runs.
- Modified related documentation to reflect the change in command usage.
- Removed legacy timeoutSeconds references from the sessions-spawn-tool schema and tests to streamline timeout handling.
- Enhanced tests to ensure correct behavior of the updated commands and their interactions.

* feat(tests): add unit tests for readLatestAssistantReply function

- Introduced a new test file for the `readLatestAssistantReply` function to validate its behavior with various message scenarios.
- Implemented tests to ensure the function correctly retrieves the latest assistant message and handles cases where the latest message has no text.
- Mocked the gateway call to simulate different message histories for comprehensive testing.

* feat(tests): enhance subagent kill-all cascade tests and announce formatting

- Added a new test to verify that the `kill-all` command cascades through ended parents to active descendants in subagents.
- Updated the subagent announce formatting tests to reflect changes in message structure, including the replacement of "Findings:" with "Result:" and the addition of new expectations for message content.
- Improved the handling of long findings and stats in the announce formatting logic to ensure concise output.
- Refactored related functions to enhance clarity and maintainability in the subagent registry and tools.

* refactor(subagent): update announce formatting and remove unused constants

- Modified the subagent announce formatting to replace "Findings:" with "Result:" and adjusted related expectations in tests.
- Removed constants for maximum announce findings characters and summary words, simplifying the announcement logic.
- Updated the handling of findings to retain full content instead of truncating, ensuring more informative outputs.
- Cleaned up unused imports in the commands-subagents file to enhance code clarity.

* feat(tests): enhance billing error handling in user-facing text

- Added tests to ensure that normal text mentioning billing plans is not rewritten, preserving user context.
- Updated the `isBillingErrorMessage` and `sanitizeUserFacingText` functions to improve handling of billing-related messages.
- Introduced new test cases for various scenarios involving billing messages to ensure accurate processing and output.
- Enhanced the subagent announce flow to correctly manage active descendant runs, preventing premature announcements.

* feat(subagent): enhance workflow guidance and auto-announcement clarity

- Added a new guideline in the subagent system prompt to emphasize trust in push-based completion, discouraging busy polling for status updates.
- Updated documentation to clarify that sub-agents will automatically announce their results, improving user understanding of the workflow.
- Enhanced tests to verify the new guidance on avoiding polling loops and to ensure the accuracy of the updated prompts.

* fix(cron): avoid announcing interim subagent spawn acks

* chore: clean post-rebase imports

* fix(cron): fall back to child replies when parent stays interim

* fix(subagents): make active-run guidance advisory

* fix(subagents): update announce flow to handle active descendants and enhance test coverage

- Modified the announce flow to defer announcements when active descendant runs are present, ensuring accurate status reporting.
- Updated tests to verify the new behavior, including scenarios where no fallback requester is available and ensuring proper handling of finished subagents.
- Enhanced the announce formatting to include an `expectFinal` flag for better clarity in the announcement process.

* fix(subagents): enhance announce flow and formatting for user updates

- Updated the announce flow to provide clearer instructions for user updates based on active subagent runs and requester context.
- Refactored the announcement logic to improve clarity and ensure internal context remains private.
- Enhanced tests to verify the new message expectations and formatting, including updated prompts for user-facing updates.
- Introduced a new function to build reply instructions based on session context, improving the overall announcement process.

* fix: resolve prep blockers and changelog placement (#14447) (thanks @tyler6204)

* fix: restore cron delivery-plan import after rebase (#14447) (thanks @tyler6204)

* fix: resolve test failures from rebase conflicts (#14447) (thanks @tyler6204)

* fix: apply formatting after rebase (#14447) (thanks @tyler6204)
2026-02-14 22:03:45 -08:00

1010 lines
33 KiB
TypeScript

import { describe, expect, it, vi } from "vitest";
import {
  addSubagentRunForTests,
  listSubagentRunsForRequester,
  resetSubagentRegistryForTests,
} from "./subagent-registry.js";
const callGatewayMock = vi.fn();
vi.mock("../gateway/call.js", () => ({
  callGateway: (opts: unknown) => callGatewayMock(opts),
}));
vi.mock("../config/config.js", async (importOriginal) => {
  const actual = await importOriginal<typeof import("../config/config.js")>();
  return {
    ...actual,
    loadConfig: () => ({
      session: {
        mainKey: "main",
        scope: "per-sender",
        agentToAgent: { maxPingPongTurns: 2 },
      },
    }),
    resolveGatewayPort: () => 18789,
  };
});
import "./test-helpers/fast-core-tools.js";
import { sleep } from "../utils.js";
import { createOpenClawTools } from "./openclaw-tools.js";
const waitForCalls = async (getCount: () => number, count: number, timeoutMs = 2000) => {
  const start = Date.now();
  while (getCount() < count) {
    if (Date.now() - start > timeoutMs) {
      throw new Error(`timed out waiting for ${count} calls`);
    }
    await sleep(0);
  }
};
describe("sessions tools", () => {
  it("uses number (not integer) in tool schemas for Gemini compatibility", () => {
    const tools = createOpenClawTools();
    const byName = (name: string) => {
      const tool = tools.find((candidate) => candidate.name === name);
      expect(tool).toBeDefined();
      if (!tool) {
        throw new Error(`missing ${name} tool`);
      }
      return tool;
    };
    const schemaProp = (toolName: string, prop: string) => {
      const tool = byName(toolName);
      const schema = tool.parameters as {
        anyOf?: unknown;
        oneOf?: unknown;
        properties?: Record<string, unknown>;
      };
      expect(schema.anyOf).toBeUndefined();
      expect(schema.oneOf).toBeUndefined();
      const properties = schema.properties ?? {};
      const value = properties[prop] as { type?: unknown } | undefined;
      expect(value).toBeDefined();
      if (!value) {
        throw new Error(`missing ${toolName} schema prop: ${prop}`);
      }
      return value;
    };
    expect(schemaProp("sessions_history", "limit").type).toBe("number");
    expect(schemaProp("sessions_list", "limit").type).toBe("number");
    expect(schemaProp("sessions_list", "activeMinutes").type).toBe("number");
    expect(schemaProp("sessions_list", "messageLimit").type).toBe("number");
    expect(schemaProp("sessions_send", "timeoutSeconds").type).toBe("number");
    expect(schemaProp("sessions_spawn", "thinking").type).toBe("string");
    expect(schemaProp("sessions_spawn", "runTimeoutSeconds").type).toBe("number");
    expect(schemaProp("subagents", "recentMinutes").type).toBe("number");
  });
  it("sessions_list filters kinds and includes messages", async () => {
    callGatewayMock.mockReset();
    callGatewayMock.mockImplementation(async (opts: unknown) => {
      const request = opts as { method?: string };
      if (request.method === "sessions.list") {
        return {
          path: "/tmp/sessions.json",
          sessions: [
            {
              key: "main",
              kind: "direct",
              sessionId: "s-main",
              updatedAt: 10,
              lastChannel: "whatsapp",
            },
            {
              key: "discord:group:dev",
              kind: "group",
              sessionId: "s-group",
              updatedAt: 11,
              channel: "discord",
              displayName: "discord:g-dev",
            },
            {
              key: "cron:job-1",
              kind: "direct",
              sessionId: "s-cron",
              updatedAt: 9,
            },
            { key: "global", kind: "global" },
            { key: "unknown", kind: "unknown" },
          ],
        };
      }
      if (request.method === "chat.history") {
        return {
          messages: [
            { role: "toolResult", content: [] },
            {
              role: "assistant",
              content: [{ type: "text", text: "hi" }],
            },
          ],
        };
      }
      return {};
    });
    const tool = createOpenClawTools().find((candidate) => candidate.name === "sessions_list");
    expect(tool).toBeDefined();
    if (!tool) {
      throw new Error("missing sessions_list tool");
    }
    const result = await tool.execute("call1", { messageLimit: 1 });
    const details = result.details as {
      sessions?: Array<Record<string, unknown>>;
    };
    expect(details.sessions).toHaveLength(3);
    const main = details.sessions?.find((s) => s.key === "main");
    expect(main?.channel).toBe("whatsapp");
    expect(main?.messages?.length).toBe(1);
    expect(main?.messages?.[0]?.role).toBe("assistant");
    const cronOnly = await tool.execute("call2", { kinds: ["cron"] });
    const cronDetails = cronOnly.details as {
      sessions?: Array<Record<string, unknown>>;
    };
    expect(cronDetails.sessions).toHaveLength(1);
    expect(cronDetails.sessions?.[0]?.kind).toBe("cron");
  });
  it("sessions_history filters tool messages by default", async () => {
    callGatewayMock.mockReset();
    callGatewayMock.mockImplementation(async (opts: unknown) => {
      const request = opts as { method?: string };
      if (request.method === "chat.history") {
        return {
          messages: [
            { role: "toolResult", content: [] },
            { role: "assistant", content: [{ type: "text", text: "ok" }] },
          ],
        };
      }
      return {};
    });
    const tool = createOpenClawTools().find((candidate) => candidate.name === "sessions_history");
    expect(tool).toBeDefined();
    if (!tool) {
      throw new Error("missing sessions_history tool");
    }
    const result = await tool.execute("call3", { sessionKey: "main" });
    const details = result.details as { messages?: unknown[] };
    expect(details.messages).toHaveLength(1);
    expect(details.messages?.[0]?.role).toBe("assistant");
    const withTools = await tool.execute("call4", {
      sessionKey: "main",
      includeTools: true,
    });
    const withToolsDetails = withTools.details as { messages?: unknown[] };
    expect(withToolsDetails.messages).toHaveLength(2);
  });
  it("sessions_history caps oversized payloads and strips heavy fields", async () => {
    callGatewayMock.mockReset();
    const oversized = Array.from({ length: 80 }, (_, idx) => ({
      role: "assistant",
      content: [
        {
          type: "text",
          text: `${String(idx)}:${"x".repeat(5000)}`,
        },
        {
          type: "thinking",
          thinking: "y".repeat(7000),
          thinkingSignature: "sig".repeat(4000),
        },
      ],
      details: {
        giant: "z".repeat(12000),
      },
      usage: {
        input: 1,
        output: 1,
      },
    }));
    callGatewayMock.mockImplementation(async (opts: unknown) => {
      const request = opts as { method?: string };
      if (request.method === "chat.history") {
        return { messages: oversized };
      }
      return {};
    });
    const tool = createOpenClawTools().find((candidate) => candidate.name === "sessions_history");
    expect(tool).toBeDefined();
    if (!tool) {
      throw new Error("missing sessions_history tool");
    }
    const result = await tool.execute("call4b", {
      sessionKey: "main",
      includeTools: true,
    });
    const details = result.details as {
      messages?: Array<Record<string, unknown>>;
      truncated?: boolean;
      droppedMessages?: boolean;
      contentTruncated?: boolean;
      bytes?: number;
    };
    expect(details.truncated).toBe(true);
    expect(details.droppedMessages).toBe(true);
    expect(details.contentTruncated).toBe(true);
    expect(typeof details.bytes).toBe("number");
    expect((details.bytes ?? 0) <= 80 * 1024).toBe(true);
    expect(details.messages && details.messages.length > 0).toBe(true);
    const first = details.messages?.[0] as
      | {
          details?: unknown;
          usage?: unknown;
          content?: Array<{
            type?: string;
            text?: string;
            thinking?: string;
            thinkingSignature?: string;
          }>;
        }
      | undefined;
    expect(first?.details).toBeUndefined();
    expect(first?.usage).toBeUndefined();
    const textBlock = first?.content?.find((block) => block.type === "text");
    expect(typeof textBlock?.text).toBe("string");
    expect((textBlock?.text ?? "").length <= 4015).toBe(true);
    const thinkingBlock = first?.content?.find((block) => block.type === "thinking");
    expect(thinkingBlock?.thinkingSignature).toBeUndefined();
  });
  it("sessions_history enforces a hard byte cap even when a single message is huge", async () => {
    callGatewayMock.mockReset();
    callGatewayMock.mockImplementation(async (opts: unknown) => {
      const request = opts as { method?: string };
      if (request.method === "chat.history") {
        return {
          messages: [
            {
              role: "assistant",
              content: [{ type: "text", text: "ok" }],
              extra: "x".repeat(200_000),
            },
          ],
        };
      }
      return {};
    });
    const tool = createOpenClawTools().find((candidate) => candidate.name === "sessions_history");
    expect(tool).toBeDefined();
    if (!tool) {
      throw new Error("missing sessions_history tool");
    }
    const result = await tool.execute("call4c", {
      sessionKey: "main",
      includeTools: true,
    });
    const details = result.details as {
      messages?: Array<Record<string, unknown>>;
      truncated?: boolean;
      droppedMessages?: boolean;
      contentTruncated?: boolean;
      bytes?: number;
    };
    expect(details.truncated).toBe(true);
    expect(details.droppedMessages).toBe(true);
    expect(details.contentTruncated).toBe(false);
    expect(typeof details.bytes).toBe("number");
    expect((details.bytes ?? 0) <= 80 * 1024).toBe(true);
    expect(details.messages).toHaveLength(1);
    expect(details.messages?.[0]?.content).toContain(
      "[sessions_history omitted: message too large]",
    );
  });
  it("sessions_history resolves sessionId inputs", async () => {
    callGatewayMock.mockReset();
    const sessionId = "sess-group";
    const targetKey = "agent:main:discord:channel:1457165743010611293";
    callGatewayMock.mockImplementation(async (opts: unknown) => {
      const request = opts as {
        method?: string;
        params?: Record<string, unknown>;
      };
      if (request.method === "sessions.resolve") {
        return {
          key: targetKey,
        };
      }
      if (request.method === "chat.history") {
        return {
          messages: [{ role: "assistant", content: [{ type: "text", text: "ok" }] }],
        };
      }
      return {};
    });
    const tool = createOpenClawTools().find((candidate) => candidate.name === "sessions_history");
    expect(tool).toBeDefined();
    if (!tool) {
      throw new Error("missing sessions_history tool");
    }
    const result = await tool.execute("call5", { sessionKey: sessionId });
    const details = result.details as { messages?: unknown[] };
    expect(details.messages).toHaveLength(1);
    const historyCall = callGatewayMock.mock.calls.find(
      (call) => (call[0] as { method?: string }).method === "chat.history",
    );
    expect(historyCall?.[0]).toMatchObject({
      method: "chat.history",
      params: { sessionKey: targetKey },
    });
  });
  it("sessions_history errors on missing sessionId", async () => {
    callGatewayMock.mockReset();
    const sessionId = "aaaaaaaa-aaaa-4aaa-aaaa-aaaaaaaaaaaa";
    callGatewayMock.mockImplementation(async (opts: unknown) => {
      const request = opts as { method?: string };
      if (request.method === "sessions.resolve") {
        throw new Error("No session found");
      }
      return {};
    });
    const tool = createOpenClawTools().find((candidate) => candidate.name === "sessions_history");
    expect(tool).toBeDefined();
    if (!tool) {
      throw new Error("missing sessions_history tool");
    }
    const result = await tool.execute("call6", { sessionKey: sessionId });
    const details = result.details as { status?: string; error?: string };
    expect(details.status).toBe("error");
    expect(details.error).toMatch(/Session not found|No session found/);
  });
  it("sessions_send supports fire-and-forget and wait", async () => {
    callGatewayMock.mockReset();
    const calls: Array<{ method?: string; params?: unknown }> = [];
    let agentCallCount = 0;
    let _historyCallCount = 0;
    let sendCallCount = 0;
    let lastWaitedRunId: string | undefined;
    const replyByRunId = new Map<string, string>();
    const requesterKey = "discord:group:req";
    callGatewayMock.mockImplementation(async (opts: unknown) => {
      const request = opts as { method?: string; params?: unknown };
      calls.push(request);
      if (request.method === "agent") {
        agentCallCount += 1;
        const runId = `run-${agentCallCount}`;
        const params = request.params as { message?: string; sessionKey?: string } | undefined;
        const message = params?.message ?? "";
        let reply = "REPLY_SKIP";
        if (message === "ping" || message === "wait") {
          reply = "done";
        } else if (message === "Agent-to-agent announce step.") {
          reply = "ANNOUNCE_SKIP";
        } else if (params?.sessionKey === requesterKey) {
          reply = "pong";
        }
        replyByRunId.set(runId, reply);
        return {
          runId,
          status: "accepted",
          acceptedAt: 1234 + agentCallCount,
        };
      }
      if (request.method === "agent.wait") {
        const params = request.params as { runId?: string } | undefined;
        lastWaitedRunId = params?.runId;
        return { runId: params?.runId ?? "run-1", status: "ok" };
      }
      if (request.method === "chat.history") {
        _historyCallCount += 1;
        const text = (lastWaitedRunId && replyByRunId.get(lastWaitedRunId)) ?? "";
        return {
          messages: [
            {
              role: "assistant",
              content: [
                {
                  type: "text",
                  text,
                },
              ],
              timestamp: 20,
            },
          ],
        };
      }
      if (request.method === "send") {
        sendCallCount += 1;
        return { messageId: "m1" };
      }
      return {};
    });
    const tool = createOpenClawTools({
      agentSessionKey: requesterKey,
      agentChannel: "discord",
    }).find((candidate) => candidate.name === "sessions_send");
    expect(tool).toBeDefined();
    if (!tool) {
      throw new Error("missing sessions_send tool");
    }
    const fire = await tool.execute("call5", {
      sessionKey: "main",
      message: "ping",
      timeoutSeconds: 0,
    });
    expect(fire.details).toMatchObject({
      status: "accepted",
      runId: "run-1",
      delivery: { status: "pending", mode: "announce" },
    });
    await waitForCalls(() => calls.filter((call) => call.method === "agent").length, 4);
    await waitForCalls(() => calls.filter((call) => call.method === "agent.wait").length, 4);
    await waitForCalls(() => calls.filter((call) => call.method === "chat.history").length, 4);
    const waitPromise = tool.execute("call6", {
      sessionKey: "main",
      message: "wait",
      timeoutSeconds: 1,
    });
    const waited = await waitPromise;
    expect(waited.details).toMatchObject({
      status: "ok",
      reply: "done",
      delivery: { status: "pending", mode: "announce" },
    });
    expect(typeof (waited.details as { runId?: string }).runId).toBe("string");
    await waitForCalls(() => calls.filter((call) => call.method === "agent").length, 8);
    await waitForCalls(() => calls.filter((call) => call.method === "agent.wait").length, 8);
    await waitForCalls(() => calls.filter((call) => call.method === "chat.history").length, 8);
    const agentCalls = calls.filter((call) => call.method === "agent");
    const waitCalls = calls.filter((call) => call.method === "agent.wait");
    const historyOnlyCalls = calls.filter((call) => call.method === "chat.history");
    expect(agentCalls).toHaveLength(8);
    for (const call of agentCalls) {
      expect(call.params).toMatchObject({
        lane: "nested",
        channel: "webchat",
        inputProvenance: { kind: "inter_session" },
      });
    }
    expect(
      agentCalls.some(
        (call) =>
          typeof (call.params as { extraSystemPrompt?: string })?.extraSystemPrompt === "string" &&
          (call.params as { extraSystemPrompt?: string })?.extraSystemPrompt?.includes(
            "Agent-to-agent message context",
          ),
      ),
    ).toBe(true);
    expect(
      agentCalls.some(
        (call) =>
          typeof (call.params as { extraSystemPrompt?: string })?.extraSystemPrompt === "string" &&
          (call.params as { extraSystemPrompt?: string })?.extraSystemPrompt?.includes(
            "Agent-to-agent reply step",
          ),
      ),
    ).toBe(true);
    expect(
      agentCalls.some(
        (call) =>
          typeof (call.params as { extraSystemPrompt?: string })?.extraSystemPrompt === "string" &&
          (call.params as { extraSystemPrompt?: string })?.extraSystemPrompt?.includes(
            "Agent-to-agent announce step",
          ),
      ),
    ).toBe(true);
    expect(waitCalls).toHaveLength(8);
    expect(historyOnlyCalls).toHaveLength(8);
    expect(sendCallCount).toBe(0);
  });
  it("sessions_send resolves sessionId inputs", async () => {
    callGatewayMock.mockReset();
    const sessionId = "sess-send";
    const targetKey = "agent:main:discord:channel:123";
    callGatewayMock.mockImplementation(async (opts: unknown) => {
      const request = opts as {
        method?: string;
        params?: Record<string, unknown>;
      };
      if (request.method === "sessions.resolve") {
        return { key: targetKey };
      }
      if (request.method === "agent") {
        return { runId: "run-1", acceptedAt: 123 };
      }
      if (request.method === "agent.wait") {
        return { status: "ok" };
      }
      if (request.method === "chat.history") {
        return { messages: [] };
      }
      return {};
    });
    const tool = createOpenClawTools({
      agentSessionKey: "main",
      agentChannel: "discord",
    }).find((candidate) => candidate.name === "sessions_send");
    expect(tool).toBeDefined();
    if (!tool) {
      throw new Error("missing sessions_send tool");
    }
    const result = await tool.execute("call7", {
      sessionKey: sessionId,
      message: "ping",
      timeoutSeconds: 0,
    });
    const details = result.details as { status?: string };
    expect(details.status).toBe("accepted");
    const agentCall = callGatewayMock.mock.calls.find(
      (call) => (call[0] as { method?: string }).method === "agent",
    );
    expect(agentCall?.[0]).toMatchObject({
      method: "agent",
      params: { sessionKey: targetKey },
    });
  });
  it("sessions_send runs ping-pong then announces", async () => {
    callGatewayMock.mockReset();
    const calls: Array<{ method?: string; params?: unknown }> = [];
    let agentCallCount = 0;
    let lastWaitedRunId: string | undefined;
    const replyByRunId = new Map<string, string>();
    const requesterKey = "discord:group:req";
    const targetKey = "discord:group:target";
    let sendParams: { to?: string; channel?: string; message?: string } = {};
    callGatewayMock.mockImplementation(async (opts: unknown) => {
      const request = opts as { method?: string; params?: unknown };
      calls.push(request);
      if (request.method === "agent") {
        agentCallCount += 1;
        const runId = `run-${agentCallCount}`;
        const params = request.params as
          | {
              message?: string;
              sessionKey?: string;
              extraSystemPrompt?: string;
            }
          | undefined;
        let reply = "initial";
        if (params?.extraSystemPrompt?.includes("Agent-to-agent reply step")) {
          reply = params.sessionKey === requesterKey ? "pong-1" : "pong-2";
        }
        if (params?.extraSystemPrompt?.includes("Agent-to-agent announce step")) {
          reply = "announce now";
        }
        replyByRunId.set(runId, reply);
        return {
          runId,
          status: "accepted",
          acceptedAt: 2000 + agentCallCount,
        };
      }
      if (request.method === "agent.wait") {
        const params = request.params as { runId?: string } | undefined;
        lastWaitedRunId = params?.runId;
        return { runId: params?.runId ?? "run-1", status: "ok" };
      }
      if (request.method === "chat.history") {
        const text = (lastWaitedRunId && replyByRunId.get(lastWaitedRunId)) ?? "";
        return {
          messages: [
            {
              role: "assistant",
              content: [{ type: "text", text }],
              timestamp: 20,
            },
          ],
        };
      }
      if (request.method === "send") {
        const params = request.params as
          | { to?: string; channel?: string; message?: string }
          | undefined;
        sendParams = {
          to: params?.to,
          channel: params?.channel,
          message: params?.message,
        };
        return { messageId: "m-announce" };
      }
      return {};
    });
    const tool = createOpenClawTools({
      agentSessionKey: requesterKey,
      agentChannel: "discord",
    }).find((candidate) => candidate.name === "sessions_send");
    expect(tool).toBeDefined();
    if (!tool) {
      throw new Error("missing sessions_send tool");
    }
    const waited = await tool.execute("call7", {
      sessionKey: targetKey,
      message: "ping",
      timeoutSeconds: 1,
    });
    expect(waited.details).toMatchObject({
      status: "ok",
      reply: "initial",
    });
    await sleep(0);
    await sleep(0);
    const agentCalls = calls.filter((call) => call.method === "agent");
    expect(agentCalls).toHaveLength(4);
    for (const call of agentCalls) {
      expect(call.params).toMatchObject({
        lane: "nested",
        channel: "webchat",
        inputProvenance: { kind: "inter_session" },
      });
    }
    const replySteps = calls.filter(
      (call) =>
        call.method === "agent" &&
        typeof (call.params as { extraSystemPrompt?: string })?.extraSystemPrompt === "string" &&
        (call.params as { extraSystemPrompt?: string })?.extraSystemPrompt?.includes(
          "Agent-to-agent reply step",
        ),
    );
    expect(replySteps).toHaveLength(2);
    expect(sendParams).toMatchObject({
      to: "channel:target",
      channel: "discord",
      message: "announce now",
    });
  });
  it("subagents lists active and recent runs", async () => {
    resetSubagentRegistryForTests();
    callGatewayMock.mockReset();
    const now = Date.now();
    addSubagentRunForTests({
      runId: "run-active",
      childSessionKey: "agent:main:subagent:active",
      requesterSessionKey: "agent:main:main",
      requesterDisplayKey: "main",
      task: "investigate auth",
      cleanup: "keep",
      createdAt: now - 2 * 60_000,
      startedAt: now - 2 * 60_000,
    });
    addSubagentRunForTests({
      runId: "run-recent",
      childSessionKey: "agent:main:subagent:recent",
      requesterSessionKey: "agent:main:main",
      requesterDisplayKey: "main",
      task: "summarize findings",
      cleanup: "keep",
      createdAt: now - 15 * 60_000,
      startedAt: now - 14 * 60_000,
      endedAt: now - 5 * 60_000,
      outcome: { status: "ok" },
    });
    addSubagentRunForTests({
      runId: "run-old",
      childSessionKey: "agent:main:subagent:old",
      requesterSessionKey: "agent:main:main",
      requesterDisplayKey: "main",
      task: "old completed run",
      cleanup: "keep",
      createdAt: now - 90 * 60_000,
      startedAt: now - 89 * 60_000,
      endedAt: now - 80 * 60_000,
      outcome: { status: "ok" },
    });
    const tool = createOpenClawTools({
      agentSessionKey: "agent:main:main",
    }).find((candidate) => candidate.name === "subagents");
    expect(tool).toBeDefined();
    if (!tool) {
      throw new Error("missing subagents tool");
    }
    const result = await tool.execute("call-subagents-list", { action: "list" });
    const details = result.details as {
      status?: string;
      active?: unknown[];
      recent?: unknown[];
      text?: string;
    };
    expect(details.status).toBe("ok");
    expect(details.active).toHaveLength(1);
    expect(details.recent).toHaveLength(1);
    expect(details.text).toContain("active subagents:");
    expect(details.text).toContain("recent (last 30m):");
    resetSubagentRegistryForTests();
  });
it("subagents list usage separates io tokens from prompt/cache", async () => {
resetSubagentRegistryForTests();
callGatewayMock.mockReset();
const now = Date.now();
addSubagentRunForTests({
runId: "run-usage-active",
childSessionKey: "agent:main:subagent:usage-active",
requesterSessionKey: "agent:main:main",
requesterDisplayKey: "main",
task: "wait and check weather",
cleanup: "keep",
createdAt: now - 2 * 60_000,
startedAt: now - 2 * 60_000,
});
const sessionsModule = await import("../config/sessions.js");
const loadSessionStoreSpy = vi
.spyOn(sessionsModule, "loadSessionStore")
.mockImplementation(() => ({
"agent:main:subagent:usage-active": {
modelProvider: "anthropic",
model: "claude-opus-4-6",
inputTokens: 12,
outputTokens: 1000,
totalTokens: 197000,
},
}));
try {
const tool = createOpenClawTools({
agentSessionKey: "agent:main:main",
}).find((candidate) => candidate.name === "subagents");
expect(tool).toBeDefined();
if (!tool) {
throw new Error("missing subagents tool");
}
const result = await tool.execute("call-subagents-list-usage", { action: "list" });
const details = result.details as {
status?: string;
text?: string;
};
expect(details.status).toBe("ok");
expect(details.text).toContain("tokens 1k (in 12 / out 1k)");
expect(details.text).toContain("prompt/cache 197k");
expect(details.text).not.toContain("1.0k io");
} finally {
loadSessionStoreSpy.mockRestore();
resetSubagentRegistryForTests();
}
});
it("subagents steer sends guidance to a running run", async () => {
resetSubagentRegistryForTests();
callGatewayMock.mockReset();
callGatewayMock.mockImplementation(async (opts: unknown) => {
const request = opts as { method?: string };
if (request.method === "agent") {
return { runId: "run-steer-1" };
}
return {};
});
addSubagentRunForTests({
runId: "run-steer",
childSessionKey: "agent:main:subagent:steer",
requesterSessionKey: "agent:main:main",
requesterDisplayKey: "main",
task: "prepare release notes",
cleanup: "keep",
createdAt: Date.now() - 60_000,
startedAt: Date.now() - 60_000,
});
const sessionsModule = await import("../config/sessions.js");
const loadSessionStoreSpy = vi
.spyOn(sessionsModule, "loadSessionStore")
.mockImplementation(() => ({
"agent:main:subagent:steer": {
sessionId: "child-session-steer",
updatedAt: Date.now(),
},
}));
try {
const tool = createOpenClawTools({
agentSessionKey: "agent:main:main",
}).find((candidate) => candidate.name === "subagents");
expect(tool).toBeDefined();
if (!tool) {
throw new Error("missing subagents tool");
}
const result = await tool.execute("call-subagents-steer", {
action: "steer",
target: "1",
message: "skip changelog and focus on tests",
});
const details = result.details as { status?: string; runId?: string; text?: string };
expect(details.status).toBe("accepted");
expect(details.runId).toBe("run-steer-1");
expect(details.text).toContain("steered");
      const steerWaitIndex = callGatewayMock.mock.calls.findIndex((call) => {
        const request = call[0] as { method?: string; params?: { runId?: string } };
        return request.method === "agent.wait" && request.params?.runId === "run-steer";
      });
expect(steerWaitIndex).toBeGreaterThanOrEqual(0);
const steerRunIndex = callGatewayMock.mock.calls.findIndex(
(call) => (call[0] as { method?: string }).method === "agent",
);
expect(steerRunIndex).toBeGreaterThan(steerWaitIndex);
expect(callGatewayMock.mock.calls[steerWaitIndex]?.[0]).toMatchObject({
method: "agent.wait",
params: { runId: "run-steer", timeoutMs: 5_000 },
timeoutMs: 7_000,
});
expect(callGatewayMock.mock.calls[steerRunIndex]?.[0]).toMatchObject({
method: "agent",
params: {
lane: "subagent",
sessionKey: "agent:main:subagent:steer",
sessionId: "child-session-steer",
timeout: 0,
},
});
const trackedRuns = listSubagentRunsForRequester("agent:main:main");
expect(trackedRuns).toHaveLength(1);
expect(trackedRuns[0].runId).toBe("run-steer-1");
expect(trackedRuns[0].endedAt).toBeUndefined();
} finally {
loadSessionStoreSpy.mockRestore();
resetSubagentRegistryForTests();
}
});
it("subagents numeric targets follow active-first list ordering", async () => {
resetSubagentRegistryForTests();
callGatewayMock.mockReset();
addSubagentRunForTests({
runId: "run-active",
childSessionKey: "agent:main:subagent:active",
requesterSessionKey: "agent:main:main",
requesterDisplayKey: "main",
task: "active task",
cleanup: "keep",
createdAt: Date.now() - 120_000,
startedAt: Date.now() - 120_000,
});
addSubagentRunForTests({
runId: "run-recent",
childSessionKey: "agent:main:subagent:recent",
requesterSessionKey: "agent:main:main",
requesterDisplayKey: "main",
task: "recent task",
cleanup: "keep",
createdAt: Date.now() - 30_000,
startedAt: Date.now() - 30_000,
endedAt: Date.now() - 10_000,
outcome: { status: "ok" },
});
const tool = createOpenClawTools({
agentSessionKey: "agent:main:main",
}).find((candidate) => candidate.name === "subagents");
expect(tool).toBeDefined();
if (!tool) {
throw new Error("missing subagents tool");
}
const result = await tool.execute("call-subagents-kill-order", {
action: "kill",
target: "1",
});
const details = result.details as { status?: string; runId?: string; text?: string };
expect(details.status).toBe("ok");
expect(details.runId).toBe("run-active");
expect(details.text).toContain("killed");
resetSubagentRegistryForTests();
});
it("subagents kill stops a running run", async () => {
resetSubagentRegistryForTests();
callGatewayMock.mockReset();
addSubagentRunForTests({
runId: "run-kill",
childSessionKey: "agent:main:subagent:kill",
requesterSessionKey: "agent:main:main",
requesterDisplayKey: "main",
task: "long running task",
cleanup: "keep",
createdAt: Date.now() - 60_000,
startedAt: Date.now() - 60_000,
});
const tool = createOpenClawTools({
agentSessionKey: "agent:main:main",
}).find((candidate) => candidate.name === "subagents");
expect(tool).toBeDefined();
if (!tool) {
throw new Error("missing subagents tool");
}
const result = await tool.execute("call-subagents-kill", {
action: "kill",
target: "1",
});
const details = result.details as { status?: string; text?: string };
expect(details.status).toBe("ok");
expect(details.text).toContain("killed");
resetSubagentRegistryForTests();
});
it("subagents kill-all cascades through ended parents to active descendants", async () => {
resetSubagentRegistryForTests();
callGatewayMock.mockReset();
const now = Date.now();
const endedParentKey = "agent:main:subagent:parent-ended";
const activeChildKey = "agent:main:subagent:parent-ended:subagent:worker";
addSubagentRunForTests({
runId: "run-parent-ended",
childSessionKey: endedParentKey,
requesterSessionKey: "agent:main:main",
requesterDisplayKey: "main",
task: "orchestrator",
cleanup: "keep",
createdAt: now - 120_000,
startedAt: now - 120_000,
endedAt: now - 60_000,
outcome: { status: "ok" },
});
addSubagentRunForTests({
runId: "run-worker-active",
childSessionKey: activeChildKey,
requesterSessionKey: endedParentKey,
requesterDisplayKey: endedParentKey,
task: "leaf worker",
cleanup: "keep",
createdAt: now - 30_000,
startedAt: now - 30_000,
});
const tool = createOpenClawTools({
agentSessionKey: "agent:main:main",
}).find((candidate) => candidate.name === "subagents");
expect(tool).toBeDefined();
if (!tool) {
throw new Error("missing subagents tool");
}
const result = await tool.execute("call-subagents-kill-all-cascade-ended", {
action: "kill",
target: "all",
});
const details = result.details as { status?: string; killed?: number; text?: string };
expect(details.status).toBe("ok");
expect(details.killed).toBe(1);
expect(details.text).toContain("killed 1 subagent");
const descendants = listSubagentRunsForRequester(endedParentKey);
const worker = descendants.find((entry) => entry.runId === "run-worker-active");
expect(worker?.endedAt).toBeTypeOf("number");
resetSubagentRegistryForTests();
});
});