mirror of https://github.com/openclaw/openclaw.git (synced 2026-05-06 06:20:43 +00:00)
test: add weighted Docker aggregate scheduler
@@ -91,7 +91,7 @@ Jobs are ordered so cheap checks fail before expensive ones run:
 
 Scope logic lives in `scripts/ci-changed-scope.mjs` and is covered by unit tests in `src/scripts/ci-changed-scope.test.ts`.
 
 CI workflow edits validate the Node CI graph plus workflow linting, but do not force Windows, Android, or macOS native builds by themselves; those platform lanes stay scoped to platform source changes.
 
 Windows Node checks are scoped to Windows-specific process/path wrappers, npm/pnpm/UI runner helpers, package manager config, and the CI workflow surfaces that execute that lane; unrelated source, plugin, install-smoke, and test-only changes stay on the Linux Node lanes so they do not reserve a 16-vCPU Windows worker for coverage that is already exercised by the normal test shards.
 
-The separate `install-smoke` workflow reuses the same scope script through its own `preflight` job. It splits smoke coverage into `run_fast_install_smoke` and `run_full_install_smoke`. Pull requests run the fast path for Docker/package surfaces, bundled plugin package/manifest changes, and core plugin/channel/gateway/Plugin SDK surfaces that the Docker smoke jobs exercise. Source-only bundled plugin changes, test-only edits, and docs-only edits do not reserve Docker workers. The fast path builds the root Dockerfile image once, checks the CLI, runs the container gateway-network e2e, verifies a bundled extension build arg, and runs the bounded bundled-plugin Docker profile under a 120-second command timeout. The full path keeps QR package install and installer Docker/update coverage for nightly scheduled runs, manual dispatches, workflow-call release checks, and pull requests that truly touch installer/package/Docker surfaces. `main` pushes, including merge commits, do not force the full path; when changed-scope logic would request full coverage on a push, the workflow keeps the fast Docker smoke and leaves the full install smoke to nightly or release validation. The slow Bun global install image-provider smoke is separately gated by `run_bun_global_install_smoke`; it runs on the nightly schedule and from the release checks workflow, and manual `install-smoke` dispatches can opt into it, but pull requests and `main` pushes do not run it. QR and installer Docker tests keep their own install-focused Dockerfiles. Local `test:docker:all` prebuilds one shared live-test image and one shared `scripts/e2e/Dockerfile` built-app image, then runs the live/E2E smoke lanes in parallel with `OPENCLAW_SKIP_DOCKER_BUILD=1`; tune the default main-pool concurrency of 8 with `OPENCLAW_DOCKER_ALL_PARALLELISM` and the provider-sensitive tail-pool concurrency of 8 with `OPENCLAW_DOCKER_ALL_TAIL_PARALLELISM`. Lane starts are staggered by 2 seconds by default to avoid local Docker daemon create storms; override with `OPENCLAW_DOCKER_ALL_START_STAGGER_MS=0` or another millisecond value. The local aggregate stops scheduling new pooled lanes after the first failure by default, and each lane has a 120-minute timeout overridable with `OPENCLAW_DOCKER_ALL_LANE_TIMEOUT_MS`. The reusable live/E2E workflow mirrors the shared-image pattern by building and pushing one SHA-tagged GHCR Docker E2E image before the Docker matrix, then running the matrix with `OPENCLAW_SKIP_DOCKER_BUILD=1`. The scheduled live/E2E workflow runs the full release-path Docker suite daily. The full bundled update/channel matrix remains manual/full-suite because it performs repeated real npm update and doctor repair passes.
+The separate `install-smoke` workflow reuses the same scope script through its own `preflight` job. It splits smoke coverage into `run_fast_install_smoke` and `run_full_install_smoke`. Pull requests run the fast path for Docker/package surfaces, bundled plugin package/manifest changes, and core plugin/channel/gateway/Plugin SDK surfaces that the Docker smoke jobs exercise. Source-only bundled plugin changes, test-only edits, and docs-only edits do not reserve Docker workers. The fast path builds the root Dockerfile image once, checks the CLI, runs the container gateway-network e2e, verifies a bundled extension build arg, and runs the bounded bundled-plugin Docker profile under a 120-second command timeout. The full path keeps QR package install and installer Docker/update coverage for nightly scheduled runs, manual dispatches, workflow-call release checks, and pull requests that truly touch installer/package/Docker surfaces. `main` pushes, including merge commits, do not force the full path; when changed-scope logic would request full coverage on a push, the workflow keeps the fast Docker smoke and leaves the full install smoke to nightly or release validation. The slow Bun global install image-provider smoke is separately gated by `run_bun_global_install_smoke`; it runs on the nightly schedule and from the release checks workflow, and manual `install-smoke` dispatches can opt into it, but pull requests and `main` pushes do not run it. QR and installer Docker tests keep their own install-focused Dockerfiles. Local `test:docker:all` prebuilds one shared live-test image and one shared `scripts/e2e/Dockerfile` built-app image, then runs the live/E2E smoke lanes with a weighted scheduler and `OPENCLAW_SKIP_DOCKER_BUILD=1`; tune the default main-pool slot count of 10 with `OPENCLAW_DOCKER_ALL_PARALLELISM` and the provider-sensitive tail-pool slot count of 10 with `OPENCLAW_DOCKER_ALL_TAIL_PARALLELISM`. Heavy lane caps default to `OPENCLAW_DOCKER_ALL_LIVE_LIMIT=4`, `OPENCLAW_DOCKER_ALL_NPM_LIMIT=4`, and `OPENCLAW_DOCKER_ALL_SERVICE_LIMIT=5` so npm install and multi-service lanes do not overcommit Docker while lighter lanes still fill available slots. Lane starts are staggered by 2 seconds by default to avoid local Docker daemon create storms; override with `OPENCLAW_DOCKER_ALL_START_STAGGER_MS=0` or another millisecond value. The local aggregate stops scheduling new pooled lanes after the first failure by default, and each lane has a 120-minute timeout overridable with `OPENCLAW_DOCKER_ALL_LANE_TIMEOUT_MS`. The reusable live/E2E workflow mirrors the shared-image pattern by building and pushing one SHA-tagged GHCR Docker E2E image before the Docker matrix, then running the matrix with `OPENCLAW_SKIP_DOCKER_BUILD=1`. The scheduled live/E2E workflow runs the full release-path Docker suite daily. The full bundled update/channel matrix remains manual/full-suite because it performs repeated real npm update and doctor repair passes.
 
 Local changed-lane logic lives in `scripts/changed-lanes.mjs` and is executed by `scripts/check-changed.mjs`. That local gate is stricter about architecture boundaries than the broad CI platform scope: core production changes run core prod typecheck plus core tests, core test-only changes run only core test typecheck/tests, extension production changes run extension prod typecheck plus extension tests, and extension test-only changes run only extension test typecheck/tests. Public Plugin SDK or plugin-contract changes expand to extension validation because extensions depend on those core contracts. Release metadata-only version bumps run targeted version/config/root-dependency checks. Unknown root/config changes fail safe to all lanes.
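The fast/full gating described above can be sketched as a small path classifier. This is a hypothetical illustration, not the real `ci-changed-scope.mjs` logic: `decideInstallSmoke` and its path patterns are invented here to show how changed paths plus the trigger event might map onto the two smoke flags.

```javascript
// Hypothetical sketch of the fast/full install-smoke gate. The path
// patterns below are illustrative stand-ins, not the real scope rules.
function decideInstallSmoke(changedPaths, event) {
  const dockerOrPackage = changedPaths.some(
    (p) => p === "Dockerfile" || /docker|package/i.test(p),
  );
  const installer = changedPaths.some((p) => p.startsWith("scripts/install"));
  const docsOnly = changedPaths.length > 0 && changedPaths.every((p) => p.endsWith(".md"));
  // Docs-only edits reserve no Docker workers at all.
  if (docsOnly) return { fast: false, full: false };
  // Nightly schedules, manual dispatches, and release workflow_call keep full coverage.
  if (event === "schedule" || event === "workflow_dispatch" || event === "workflow_call") {
    return { fast: true, full: true };
  }
  // `main` pushes never force the full path; they keep the fast Docker smoke.
  if (event === "push") return { fast: dockerOrPackage || installer, full: false };
  // Pull requests run full only when installer/package/Docker surfaces change.
  return { fast: dockerOrPackage || installer, full: installer };
}
```

The key asymmetry, as the prose explains, is that a push can downgrade a would-be full run to the fast path, while a pull request that genuinely touches installer surfaces still gets full coverage.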
@@ -537,7 +537,7 @@ These Docker runners split into two buckets:
 
 `OPENCLAW_LIVE_GATEWAY_STEP_TIMEOUT_MS=45000`, and
 `OPENCLAW_LIVE_GATEWAY_MODEL_TIMEOUT_MS=90000`. Override those env vars when you
 explicitly want the larger exhaustive scan.
 
-- `test:docker:all` builds the live Docker image once via `test:docker:live-build`, then reuses it for the two live Docker lanes. It also builds one shared `scripts/e2e/Dockerfile` image via `test:docker:e2e-build` and reuses it for the E2E container smoke runners that exercise the built app.
+- `test:docker:all` builds the live Docker image once via `test:docker:live-build`, then reuses it for the live Docker lanes. It also builds one shared `scripts/e2e/Dockerfile` image via `test:docker:e2e-build` and reuses it for the E2E container smoke runners that exercise the built app. The aggregate uses a weighted local scheduler: `OPENCLAW_DOCKER_ALL_PARALLELISM` controls process slots, while resource caps keep heavy live, npm-install, and multi-service lanes from all starting at once. Defaults are 10 slots, `OPENCLAW_DOCKER_ALL_LIVE_LIMIT=4`, `OPENCLAW_DOCKER_ALL_NPM_LIMIT=4`, and `OPENCLAW_DOCKER_ALL_SERVICE_LIMIT=5`; tune `OPENCLAW_DOCKER_ALL_WEIGHT_LIMIT` or `OPENCLAW_DOCKER_ALL_DOCKER_LIMIT` only when the Docker host has more headroom.
 
 - Container smoke runners: `test:docker:openwebui`, `test:docker:onboard`, `test:docker:npm-onboard-channel-agent`, `test:docker:gateway-network`, `test:docker:mcp-channels`, `test:docker:pi-bundle-mcp-tools`, `test:docker:cron-mcp-cleanup`, `test:docker:plugins`, `test:docker:plugin-update`, and `test:docker:config-reload` boot one or more real containers and verify higher-level integration paths.
 
 The live-model Docker runners also bind-mount only the needed CLI auth homes (or all supported ones when the run is not narrowed), then copy them into the container home before the run so external-CLI OAuth can refresh tokens without mutating the host auth store:
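The per-resource caps above are clamped to the pool size, so shrinking `OPENCLAW_DOCKER_ALL_PARALLELISM` automatically shrinks the heavy-lane caps too. A minimal sketch of that clamping rule (the `effectiveLimits` helper is illustrative, not part of the script):

```javascript
// Sketch of the clamp rule: an unset heavy-lane cap defaults to
// min(pool size, built-in default), so a small pool lowers every cap.
const DEFAULTS = { live: 4, npm: 4, service: 5 };

function effectiveLimits(parallelism) {
  const limits = {};
  for (const [resource, fallback] of Object.entries(DEFAULTS)) {
    limits[resource] = Math.min(parallelism, fallback);
  }
  return limits;
}

// With the default 10-slot pool the caps stay at 4/4/5; with a 3-slot
// pool they all collapse to 3.
```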
@@ -32,7 +32,7 @@ title: "Tests"
 
 - Gateway integration: opt-in via `OPENCLAW_TEST_INCLUDE_GATEWAY=1 pnpm test` or `pnpm test:gateway`.
 - `pnpm test:e2e`: Runs gateway end-to-end smoke tests (multi-instance WS/HTTP/node pairing). Defaults to `threads` + `isolate: false` with adaptive workers in `vitest.e2e.config.ts`; tune with `OPENCLAW_E2E_WORKERS=<n>` and set `OPENCLAW_E2E_VERBOSE=1` for verbose logs.
 - `pnpm test:live`: Runs provider live tests (minimax/zai). Requires API keys and `LIVE=1` (or provider-specific `*_LIVE_TEST=1`) to unskip.
-- `pnpm test:docker:all`: Builds the shared live-test image and Docker E2E image once, then runs the Docker smoke lanes with `OPENCLAW_SKIP_DOCKER_BUILD=1` at concurrency 8 by default. Tune the main pool with `OPENCLAW_DOCKER_ALL_PARALLELISM=<n>` and the provider-sensitive tail pool with `OPENCLAW_DOCKER_ALL_TAIL_PARALLELISM=<n>`; both default to 8. Lane starts are staggered by 2 seconds by default to avoid local Docker daemon create storms; override with `OPENCLAW_DOCKER_ALL_START_STAGGER_MS=<ms>`. The runner stops scheduling new pooled lanes after the first failure unless `OPENCLAW_DOCKER_ALL_FAIL_FAST=0` is set, and each lane has a 120-minute timeout overridable with `OPENCLAW_DOCKER_ALL_LANE_TIMEOUT_MS`. Per-lane logs are written under `.artifacts/docker-tests/<run-id>/`.
+- `pnpm test:docker:all`: Builds the shared live-test image and Docker E2E image once, then runs the Docker smoke lanes with `OPENCLAW_SKIP_DOCKER_BUILD=1` through a weighted scheduler. `OPENCLAW_DOCKER_ALL_PARALLELISM=<n>` controls process slots and defaults to 10; `OPENCLAW_DOCKER_ALL_TAIL_PARALLELISM=<n>` controls the provider-sensitive tail pool and defaults to 10. Heavy lane caps default to `OPENCLAW_DOCKER_ALL_LIVE_LIMIT=4`, `OPENCLAW_DOCKER_ALL_NPM_LIMIT=4`, and `OPENCLAW_DOCKER_ALL_SERVICE_LIMIT=5`; use `OPENCLAW_DOCKER_ALL_WEIGHT_LIMIT` or `OPENCLAW_DOCKER_ALL_DOCKER_LIMIT` for larger hosts. Lane starts are staggered by 2 seconds by default to avoid local Docker daemon create storms; override with `OPENCLAW_DOCKER_ALL_START_STAGGER_MS=<ms>`. The runner stops scheduling new pooled lanes after the first failure unless `OPENCLAW_DOCKER_ALL_FAIL_FAST=0` is set, and each lane has a 120-minute timeout overridable with `OPENCLAW_DOCKER_ALL_LANE_TIMEOUT_MS`. Per-lane logs are written under `.artifacts/docker-tests/<run-id>/`.
 - `pnpm test:docker:openwebui`: Starts Dockerized OpenClaw + Open WebUI, signs in through Open WebUI, checks `/api/models`, then runs a real proxied chat through `/api/chat/completions`. Requires a usable live model key (for example OpenAI in `~/.profile`), pulls an external Open WebUI image, and is not expected to be CI-stable like the normal unit/e2e suites.
 - `pnpm test:docker:mcp-channels`: Starts a seeded Gateway container and a second client container that spawns `openclaw mcp serve`, then verifies routed conversation discovery, transcript reads, attachment metadata, live event queue behavior, outbound send routing, and Claude-style channel + permission notifications over the real stdio bridge. The Claude notification assertion reads the raw stdio MCP frames directly so the smoke reflects what the bridge actually emits.
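The 2-second start stagger described above can be modeled as a promise chain that spaces lane launches out without serializing the lanes themselves. This is a simplified sketch of the idea (`makeStartGate` is an invented name), not the script's exact implementation:

```javascript
// Sketch: serialize lane *starts* so each begins at least `staggerMs`
// after the previous one; the lanes themselves still run concurrently.
function makeStartGate(staggerMs, now = Date.now) {
  let queue = Promise.resolve();
  let lastStartAt = 0;
  return function waitForStartSlot() {
    const ticket = queue.then(async () => {
      const waitMs = lastStartAt + staggerMs - now();
      if (waitMs > 0) {
        await new Promise((resolve) => setTimeout(resolve, waitMs));
      }
      lastStartAt = now();
    });
    queue = ticket;
    return ticket;
  };
}
```

Setting the stagger to `0` (as `OPENCLAW_DOCKER_ALL_START_STAGGER_MS=0` does in the real runner) makes every slot resolve immediately, so all lanes can launch back to back.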
@@ -11,81 +11,166 @@ const DEFAULT_TAIL_PARALLELISM = 10;
 const DEFAULT_FAILURE_TAIL_LINES = 80;
 const DEFAULT_LANE_TIMEOUT_MS = 120 * 60 * 1000;
 const DEFAULT_LANE_START_STAGGER_MS = 2_000;
+const DEFAULT_RESOURCE_LIMITS = {
+  docker: DEFAULT_PARALLELISM,
+  live: 4,
+  npm: 4,
+  service: 5,
+};
 
 const bundledChannelLaneCommand =
   "OPENCLAW_SKIP_DOCKER_BUILD=1 OPENCLAW_BUNDLED_CHANNEL_UPDATE_SCENARIO=0 OPENCLAW_BUNDLED_CHANNEL_ROOT_OWNED_SCENARIO=0 OPENCLAW_BUNDLED_CHANNEL_SETUP_ENTRY_SCENARIO=0 OPENCLAW_BUNDLED_CHANNEL_LOAD_FAILURE_SCENARIO=0 pnpm test:docker:bundled-channel-deps";
 
+function lane(name, command, options = {}) {
+  return {
+    command,
+    name,
+    resources: options.resources ?? [],
+    weight: options.weight ?? 1,
+  };
+}
+
+function liveLane(name, command, options = {}) {
+  return lane(name, command, {
+    resources: ["live", ...(options.resources ?? [])],
+    weight: options.weight ?? 3,
+  });
+}
+
+function npmLane(name, command, options = {}) {
+  return lane(name, command, {
+    resources: ["npm", ...(options.resources ?? [])],
+    weight: options.weight ?? 2,
+  });
+}
+
+function serviceLane(name, command, options = {}) {
+  return lane(name, command, {
+    resources: ["service", ...(options.resources ?? [])],
+    weight: options.weight ?? 2,
+  });
+}
 
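As a quick sanity check of the wrappers above, here is a self-contained copy of `lane`/`liveLane`/`npmLane` plus a usage example showing the default weights and resource tags each wrapper applies (lane names and commands are taken from the surrounding diff):

```javascript
// Self-contained copy of the lane helpers, plus a usage example showing
// the defaults: plain lanes weigh 1, live lanes 3, npm lanes 2, and each
// wrapper prepends its resource tag to any caller-supplied resources.
function lane(name, command, options = {}) {
  return {
    command,
    name,
    resources: options.resources ?? [],
    weight: options.weight ?? 1,
  };
}

function liveLane(name, command, options = {}) {
  return lane(name, command, {
    resources: ["live", ...(options.resources ?? [])],
    weight: options.weight ?? 3,
  });
}

function npmLane(name, command, options = {}) {
  return lane(name, command, {
    resources: ["npm", ...(options.resources ?? [])],
    weight: options.weight ?? 2,
  });
}

const plain = lane("qr", "pnpm test:docker:qr");
const live = liveLane("live-models", "pnpm test:docker:live-models", { weight: 4 });
const npm = npmLane("doctor-switch", "pnpm test:docker:doctor-switch", { resources: ["service"] });
// plain: weight 1, no resources; live: weight 4, ["live"];
// npm: weight 2 (default kept), ["npm", "service"].
```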
 const bundledScenarioLanes = [
-  ["bundled-channel-telegram", `OPENCLAW_BUNDLED_CHANNELS=telegram ${bundledChannelLaneCommand}`],
-  ["bundled-channel-discord", `OPENCLAW_BUNDLED_CHANNELS=discord ${bundledChannelLaneCommand}`],
-  ["bundled-channel-slack", `OPENCLAW_BUNDLED_CHANNELS=slack ${bundledChannelLaneCommand}`],
-  ["bundled-channel-feishu", `OPENCLAW_BUNDLED_CHANNELS=feishu ${bundledChannelLaneCommand}`],
-  [
+  npmLane(
+    "bundled-channel-telegram",
+    `OPENCLAW_BUNDLED_CHANNELS=telegram ${bundledChannelLaneCommand}`,
+  ),
+  npmLane(
+    "bundled-channel-discord",
+    `OPENCLAW_BUNDLED_CHANNELS=discord ${bundledChannelLaneCommand}`,
+  ),
+  npmLane("bundled-channel-slack", `OPENCLAW_BUNDLED_CHANNELS=slack ${bundledChannelLaneCommand}`),
+  npmLane(
+    "bundled-channel-feishu",
+    `OPENCLAW_BUNDLED_CHANNELS=feishu ${bundledChannelLaneCommand}`,
+  ),
+  npmLane(
     "bundled-channel-memory-lancedb",
     `OPENCLAW_BUNDLED_CHANNELS=memory-lancedb ${bundledChannelLaneCommand}`,
-  ],
-  [
+  ),
+  npmLane(
     "bundled-channel-update",
     "OPENCLAW_SKIP_DOCKER_BUILD=1 OPENCLAW_BUNDLED_CHANNEL_SCENARIOS=0 OPENCLAW_BUNDLED_CHANNEL_UPDATE_SCENARIO=1 OPENCLAW_BUNDLED_CHANNEL_ROOT_OWNED_SCENARIO=0 OPENCLAW_BUNDLED_CHANNEL_SETUP_ENTRY_SCENARIO=0 OPENCLAW_BUNDLED_CHANNEL_LOAD_FAILURE_SCENARIO=0 pnpm test:docker:bundled-channel-deps",
-  ],
-  [
+  ),
+  npmLane(
     "bundled-channel-root-owned",
     "OPENCLAW_SKIP_DOCKER_BUILD=1 OPENCLAW_BUNDLED_CHANNEL_SCENARIOS=0 OPENCLAW_BUNDLED_CHANNEL_UPDATE_SCENARIO=0 OPENCLAW_BUNDLED_CHANNEL_ROOT_OWNED_SCENARIO=1 OPENCLAW_BUNDLED_CHANNEL_SETUP_ENTRY_SCENARIO=0 OPENCLAW_BUNDLED_CHANNEL_LOAD_FAILURE_SCENARIO=0 pnpm test:docker:bundled-channel-deps",
-  ],
-  [
+  ),
+  npmLane(
     "bundled-channel-setup-entry",
     "OPENCLAW_SKIP_DOCKER_BUILD=1 OPENCLAW_BUNDLED_CHANNEL_SCENARIOS=0 OPENCLAW_BUNDLED_CHANNEL_UPDATE_SCENARIO=0 OPENCLAW_BUNDLED_CHANNEL_ROOT_OWNED_SCENARIO=0 OPENCLAW_BUNDLED_CHANNEL_SETUP_ENTRY_SCENARIO=1 OPENCLAW_BUNDLED_CHANNEL_LOAD_FAILURE_SCENARIO=0 pnpm test:docker:bundled-channel-deps",
-  ],
-  [
+  ),
+  npmLane(
     "bundled-channel-load-failure",
     "OPENCLAW_SKIP_DOCKER_BUILD=1 OPENCLAW_BUNDLED_CHANNEL_SCENARIOS=0 OPENCLAW_BUNDLED_CHANNEL_UPDATE_SCENARIO=0 OPENCLAW_BUNDLED_CHANNEL_ROOT_OWNED_SCENARIO=0 OPENCLAW_BUNDLED_CHANNEL_SETUP_ENTRY_SCENARIO=0 OPENCLAW_BUNDLED_CHANNEL_LOAD_FAILURE_SCENARIO=1 pnpm test:docker:bundled-channel-deps",
-  ],
+  ),
 ];
 
 const lanes = [
-  ["live-models", "OPENCLAW_SKIP_DOCKER_BUILD=1 pnpm test:docker:live-models"],
-  ["live-gateway", "OPENCLAW_SKIP_DOCKER_BUILD=1 pnpm test:docker:live-gateway"],
-  [
+  liveLane("live-models", "OPENCLAW_SKIP_DOCKER_BUILD=1 pnpm test:docker:live-models", {
+    weight: 4,
+  }),
+  liveLane("live-gateway", "OPENCLAW_SKIP_DOCKER_BUILD=1 pnpm test:docker:live-gateway", {
+    weight: 4,
+  }),
+  liveLane(
     "live-cli-backend-claude",
     "OPENCLAW_SKIP_DOCKER_BUILD=1 pnpm test:docker:live-cli-backend:claude",
-  ],
-  [
+    { resources: ["npm"], weight: 3 },
+  ),
+  liveLane(
     "live-cli-backend-gemini",
     "OPENCLAW_SKIP_DOCKER_BUILD=1 pnpm test:docker:live-cli-backend:gemini",
-  ],
-  ["openwebui", "OPENCLAW_SKIP_DOCKER_BUILD=1 pnpm test:docker:openwebui"],
-  ["onboard", "OPENCLAW_SKIP_DOCKER_BUILD=1 pnpm test:docker:onboard"],
-  [
+    { resources: ["npm"], weight: 3 },
+  ),
+  serviceLane("openwebui", "OPENCLAW_SKIP_DOCKER_BUILD=1 pnpm test:docker:openwebui", {
+    weight: 3,
+  }),
+  serviceLane("onboard", "OPENCLAW_SKIP_DOCKER_BUILD=1 pnpm test:docker:onboard", {
+    weight: 2,
+  }),
+  npmLane(
     "npm-onboard-channel-agent",
     "OPENCLAW_SKIP_DOCKER_BUILD=1 pnpm test:docker:npm-onboard-channel-agent",
-  ],
-  ["gateway-network", "OPENCLAW_SKIP_DOCKER_BUILD=1 pnpm test:docker:gateway-network"],
-  ["mcp-channels", "OPENCLAW_SKIP_DOCKER_BUILD=1 pnpm test:docker:mcp-channels"],
-  ["pi-bundle-mcp-tools", "OPENCLAW_SKIP_DOCKER_BUILD=1 pnpm test:docker:pi-bundle-mcp-tools"],
-  ["cron-mcp-cleanup", "OPENCLAW_SKIP_DOCKER_BUILD=1 pnpm test:docker:cron-mcp-cleanup"],
-  ["doctor-switch", "OPENCLAW_SKIP_DOCKER_BUILD=1 pnpm test:docker:doctor-switch"],
-  ["plugins", "OPENCLAW_SKIP_DOCKER_BUILD=1 pnpm test:docker:plugins"],
-  ["plugin-update", "OPENCLAW_SKIP_DOCKER_BUILD=1 pnpm test:docker:plugin-update"],
-  ["config-reload", "OPENCLAW_SKIP_DOCKER_BUILD=1 pnpm test:docker:config-reload"],
+    { resources: ["service"], weight: 3 },
+  ),
+  serviceLane("gateway-network", "OPENCLAW_SKIP_DOCKER_BUILD=1 pnpm test:docker:gateway-network"),
+  serviceLane("mcp-channels", "OPENCLAW_SKIP_DOCKER_BUILD=1 pnpm test:docker:mcp-channels", {
+    resources: ["npm"],
+    weight: 3,
+  }),
+  lane("pi-bundle-mcp-tools", "OPENCLAW_SKIP_DOCKER_BUILD=1 pnpm test:docker:pi-bundle-mcp-tools"),
+  serviceLane(
+    "cron-mcp-cleanup",
+    "OPENCLAW_SKIP_DOCKER_BUILD=1 pnpm test:docker:cron-mcp-cleanup",
+    { resources: ["npm"], weight: 3 },
+  ),
+  npmLane("doctor-switch", "OPENCLAW_SKIP_DOCKER_BUILD=1 pnpm test:docker:doctor-switch", {
+    weight: 3,
+  }),
+  npmLane("plugins", "OPENCLAW_SKIP_DOCKER_BUILD=1 pnpm test:docker:plugins", { weight: 2 }),
+  npmLane("plugin-update", "OPENCLAW_SKIP_DOCKER_BUILD=1 pnpm test:docker:plugin-update"),
+  serviceLane("config-reload", "OPENCLAW_SKIP_DOCKER_BUILD=1 pnpm test:docker:config-reload"),
   ...bundledScenarioLanes,
-  ["openai-image-auth", "OPENCLAW_SKIP_DOCKER_BUILD=1 pnpm test:docker:openai-image-auth"],
-  ["qr", "pnpm test:docker:qr"],
+  lane("openai-image-auth", "OPENCLAW_SKIP_DOCKER_BUILD=1 pnpm test:docker:openai-image-auth"),
+  lane("qr", "pnpm test:docker:qr"),
 ];
 
 const exclusiveLanes = [
-  [
+  serviceLane(
     "openai-web-search-minimal",
     "OPENCLAW_SKIP_DOCKER_BUILD=1 pnpm test:docker:openai-web-search-minimal",
-  ],
-  ["live-codex-harness", "OPENCLAW_SKIP_DOCKER_BUILD=1 pnpm test:docker:live-codex-harness"],
-  ["live-codex-bind", "OPENCLAW_SKIP_DOCKER_BUILD=1 pnpm test:docker:live-codex-bind"],
-  [
+  ),
+  liveLane(
+    "live-codex-harness",
+    "OPENCLAW_SKIP_DOCKER_BUILD=1 pnpm test:docker:live-codex-harness",
+    { resources: ["npm"], weight: 3 },
+  ),
+  liveLane("live-codex-bind", "OPENCLAW_SKIP_DOCKER_BUILD=1 pnpm test:docker:live-codex-bind", {
+    resources: ["npm"],
+    weight: 3,
+  }),
+  liveLane(
     "live-cli-backend-codex",
     "OPENCLAW_SKIP_DOCKER_BUILD=1 pnpm test:docker:live-cli-backend:codex",
-  ],
-  ["live-acp-bind-claude", "OPENCLAW_SKIP_DOCKER_BUILD=1 pnpm test:docker:live-acp-bind:claude"],
-  ["live-acp-bind-codex", "OPENCLAW_SKIP_DOCKER_BUILD=1 pnpm test:docker:live-acp-bind:codex"],
-  ["live-acp-bind-gemini", "OPENCLAW_SKIP_DOCKER_BUILD=1 pnpm test:docker:live-acp-bind:gemini"],
+    { resources: ["npm"], weight: 3 },
+  ),
+  liveLane(
+    "live-acp-bind-claude",
+    "OPENCLAW_SKIP_DOCKER_BUILD=1 pnpm test:docker:live-acp-bind:claude",
+    { resources: ["npm"], weight: 3 },
+  ),
+  liveLane(
+    "live-acp-bind-codex",
+    "OPENCLAW_SKIP_DOCKER_BUILD=1 pnpm test:docker:live-acp-bind:codex",
+    { resources: ["npm"], weight: 3 },
+  ),
+  liveLane(
+    "live-acp-bind-gemini",
+    "OPENCLAW_SKIP_DOCKER_BUILD=1 pnpm test:docker:live-acp-bind:gemini",
+    { resources: ["npm"], weight: 3 },
+  ),
 ];
 
 const tailLanes = exclusiveLanes;
@@ -119,6 +204,41 @@ function parseBool(raw, fallback) {
   return !/^(?:0|false|no)$/i.test(raw);
 }
 
+function parseResourceLimit(env, resource, parallelism, fallback) {
+  const envName = `OPENCLAW_DOCKER_ALL_${resource.toUpperCase()}_LIMIT`;
+  return parsePositiveInt(env[envName], Math.min(parallelism, fallback), envName);
+}
+
+function parseSchedulerOptions(env, parallelism) {
+  const weightLimit = parsePositiveInt(
+    env.OPENCLAW_DOCKER_ALL_WEIGHT_LIMIT,
+    parallelism,
+    "OPENCLAW_DOCKER_ALL_WEIGHT_LIMIT",
+  );
+  return {
+    resourceLimits: {
+      docker: parseResourceLimit(env, "docker", parallelism, parallelism),
+      live: parseResourceLimit(env, "live", parallelism, DEFAULT_RESOURCE_LIMITS.live),
+      npm: parseResourceLimit(env, "npm", parallelism, DEFAULT_RESOURCE_LIMITS.npm),
+      service: parseResourceLimit(env, "service", parallelism, DEFAULT_RESOURCE_LIMITS.service),
+    },
+    weightLimit,
+  };
+}
+
+function laneWeight(poolLane) {
+  return Math.max(1, poolLane.weight ?? 1);
+}
+
+function laneResources(poolLane) {
+  return ["docker", ...(poolLane.resources ?? [])];
+}
+
+function laneSummary(poolLane) {
+  const resources = laneResources(poolLane).join(",");
+  return `${poolLane.name}(w=${laneWeight(poolLane)} r=${resources})`;
+}
+
 function sleep(ms) {
   return new Promise((resolve) => {
     setTimeout(resolve, ms);
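The `laneSummary` helper added above produces the compact `name(w=… r=…)` form that later appears in the "no lanes fit" error message. A self-contained copy shows the exact shape, including the implicit `docker` resource every lane holds:

```javascript
// Self-contained copy of laneWeight/laneResources/laneSummary from the
// diff above. Every lane implicitly holds the shared "docker" resource.
function laneWeight(poolLane) {
  return Math.max(1, poolLane.weight ?? 1);
}

function laneResources(poolLane) {
  return ["docker", ...(poolLane.resources ?? [])];
}

function laneSummary(poolLane) {
  const resources = laneResources(poolLane).join(",");
  return `${poolLane.name}(w=${laneWeight(poolLane)} r=${resources})`;
}

const summary = laneSummary({ name: "live-gateway", resources: ["live"], weight: 4 });
// → "live-gateway(w=4 r=docker,live)"
// A bare lane with no weight or resources falls back to weight 1:
// laneSummary({ name: "qr" }) → "qr(w=1 r=docker)"
```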
@@ -287,7 +407,7 @@ function laneEnv(name, baseEnv, logDir) {
 }
 
 async function runLane(lane, baseEnv, logDir, timeoutMs) {
-  const [name, command] = lane;
+  const { command, name } = lane;
   const logFile = path.join(logDir, `${name}.log`);
   const env = laneEnv(name, baseEnv, logDir);
   await mkdir(env.OPENCLAW_DOCKER_CLI_TOOLS_DIR, { recursive: true });
@@ -323,7 +443,13 @@ async function runLane(lane, baseEnv, logDir, timeoutMs) {
 
 async function runLanePool(poolLanes, baseEnv, logDir, parallelism, options) {
   const failures = [];
-  let nextIndex = 0;
+  const pending = [...poolLanes];
+  const running = new Set();
+  const active = {
+    count: 0,
+    resources: new Map(),
+    weight: 0,
+  };
   let lastLaneStartAt = 0;
   let laneStartQueue = Promise.resolve();
 
@@ -345,25 +471,96 @@ async function runLanePool(poolLanes, baseEnv, logDir, parallelism, options) {
     releaseQueue();
   }
 
-  async function worker() {
-    while (nextIndex < poolLanes.length) {
-      if (options.failFast && failures.length > 0) {
-        return;
-      }
-      const lane = poolLanes[nextIndex++];
-      await waitForLaneStartSlot();
-      const result = await runLane(lane, baseEnv, logDir, options.timeoutMs);
-      if (result.status !== 0) {
-        failures.push(result);
-        if (options.failFast) {
-          return;
-        }
-      }
-    }
-  }
-
-  const workerCount = Math.min(parallelism, poolLanes.length);
-  await Promise.all(Array.from({ length: workerCount }, () => worker()));
+  function canStartLane(candidate) {
+    const weight = laneWeight(candidate);
+    if (active.count >= parallelism || active.weight + weight > options.weightLimit) {
+      return false;
+    }
+    for (const resource of laneResources(candidate)) {
+      const limit = options.resourceLimits[resource] ?? options.weightLimit;
+      const current = active.resources.get(resource) ?? 0;
+      if (current + weight > limit) {
+        return false;
+      }
+    }
+    return true;
+  }
+
+  function reserve(candidate) {
+    const weight = laneWeight(candidate);
+    active.count += 1;
+    active.weight += weight;
+    for (const resource of laneResources(candidate)) {
+      active.resources.set(resource, (active.resources.get(resource) ?? 0) + weight);
+    }
+  }
+
+  function release(candidate) {
+    const weight = laneWeight(candidate);
+    active.count -= 1;
+    active.weight -= weight;
+    for (const resource of laneResources(candidate)) {
+      const next = (active.resources.get(resource) ?? 0) - weight;
+      if (next > 0) {
+        active.resources.set(resource, next);
+      } else {
+        active.resources.delete(resource);
+      }
+    }
+  }
+
+  async function startLane(poolLane) {
+    await waitForLaneStartSlot();
+    reserve(poolLane);
+    let promise;
+    promise = runLane(poolLane, baseEnv, logDir, options.timeoutMs)
+      .then((result) => ({ lane: poolLane, promise, result }))
+      .finally(() => {
+        release(poolLane);
+      });
+    running.add(promise);
+  }
+
+  while (pending.length > 0 || running.size > 0) {
+    let started = false;
+    if (!options.failFast || failures.length === 0) {
+      for (let index = 0; index < pending.length; ) {
+        const candidate = pending[index];
+        if (!canStartLane(candidate)) {
+          index += 1;
+          continue;
+        }
+        pending.splice(index, 1);
+        await startLane(candidate);
+        started = true;
+      }
+    }
+
+    if (started) {
+      continue;
+    }
+    if (running.size === 0) {
+      const blocked = pending.map(laneSummary).join(", ");
+      throw new Error(`No Docker lanes fit scheduler limits: ${blocked}`);
+    }
+
+    const { promise, result } = await Promise.race(running);
+    running.delete(promise);
+    if (result.status !== 0) {
+      failures.push(result);
+    }
+    if (options.failFast && failures.length > 0) {
+      const remainingResults = await Promise.all(running);
+      running.clear();
+      for (const remaining of remainingResults) {
+        if (remaining.result.status !== 0) {
+          failures.push(remaining.result);
+        }
+      }
+      break;
+    }
+  }
 
   return failures;
 }
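The scheduling loop above admits a lane only while three budgets hold: the process-slot count, the global weight limit, and each per-resource cap (including the implicit `docker` resource). A stripped-down, synchronous model of that admission test, with no child processes and illustrative limits, shows the interaction:

```javascript
// Synchronous model of the weighted admission check from runLanePool:
// a lane may start only if slots, total weight, and every resource
// budget it touches still have room for its weight.
function makeScheduler(parallelism, weightLimit, resourceLimits) {
  const active = { count: 0, weight: 0, resources: new Map() };
  const resourcesOf = (l) => ["docker", ...(l.resources ?? [])];
  const weightOf = (l) => Math.max(1, l.weight ?? 1);

  function canStart(l) {
    const weight = weightOf(l);
    if (active.count >= parallelism || active.weight + weight > weightLimit) return false;
    return resourcesOf(l).every((resource) => {
      const limit = resourceLimits[resource] ?? weightLimit;
      return (active.resources.get(resource) ?? 0) + weight <= limit;
    });
  }

  function start(l) {
    const weight = weightOf(l);
    active.count += 1;
    active.weight += weight;
    for (const resource of resourcesOf(l)) {
      active.resources.set(resource, (active.resources.get(resource) ?? 0) + weight);
    }
  }

  return { canStart, start };
}

// With "live" capped at 4, a second weight-3 live lane must wait
// (3 + 3 > 4) even though a light plain lane still fits in the pool.
const sched = makeScheduler(10, 10, { docker: 10, live: 4 });
const heavyLive = { resources: ["live"], weight: 3 };
sched.start(heavyLive);
```

This is why the prose earlier says npm-install and multi-service lanes "do not overcommit Docker while lighter lanes still fill available slots": blocked heavy lanes are skipped over, not queued at the head.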
@@ -448,6 +645,14 @@ async function main() {
   console.log(`==> Lane start stagger: ${laneStartStaggerMs}ms`);
   console.log(`==> Fail fast: ${failFast ? "yes" : "no"}`);
   console.log(`==> Live-test bundled plugin deps: ${baseEnv.OPENCLAW_DOCKER_BUILD_EXTENSIONS}`);
+  const schedulerOptions = parseSchedulerOptions(process.env, parallelism);
+  const tailSchedulerOptions = parseSchedulerOptions(process.env, tailParallelism);
+  console.log(
+    `==> Scheduler: weight=${schedulerOptions.weightLimit} docker=${schedulerOptions.resourceLimits.docker} live=${schedulerOptions.resourceLimits.live} npm=${schedulerOptions.resourceLimits.npm} service=${schedulerOptions.resourceLimits.service}`,
+  );
+  console.log(
+    `==> Tail scheduler: weight=${tailSchedulerOptions.weightLimit} docker=${tailSchedulerOptions.resourceLimits.docker} live=${tailSchedulerOptions.resourceLimits.live} npm=${tailSchedulerOptions.resourceLimits.npm} service=${tailSchedulerOptions.resourceLimits.service}`,
+  );
 
   await runForegroundGroup(
     [
@@ -461,7 +666,12 @@
   );
   await prepareBundledChannelPackage(baseEnv, logDir);
 
-  const options = { failFast, startStaggerMs: laneStartStaggerMs, timeoutMs: laneTimeoutMs };
+  const options = {
+    ...schedulerOptions,
+    failFast,
+    startStaggerMs: laneStartStaggerMs,
+    timeoutMs: laneTimeoutMs,
+  };
   const failures = await runLanePool(lanes, baseEnv, logDir, parallelism, options);
   if (failFast && failures.length > 0) {
     await printFailureSummary(failures, tailLines);
@@ -469,7 +679,12 @@
   }
 
   console.log("==> Running provider-sensitive Docker tail lanes");
-  failures.push(...(await runLanePool(tailLanes, baseEnv, logDir, tailParallelism, options)));
+  failures.push(
+    ...(await runLanePool(tailLanes, baseEnv, logDir, tailParallelism, {
+      ...options,
+      ...tailSchedulerOptions,
+    })),
+  );
   if (failures.length > 0) {
     await printFailureSummary(failures, tailLines);
     process.exit(1);
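The tail pool above reuses the main pool's options but overlays its own scheduler limits; because later object spreads win, every key in `tailSchedulerOptions` overrides the matching key from `options` while `failFast`, stagger, and timeout settings carry over unchanged. A tiny illustration with made-up values:

```javascript
// Later spreads win: tail scheduler limits override the main pool's
// limits, while unrelated settings (failFast, timeoutMs) carry over.
const options = { failFast: true, timeoutMs: 7_200_000, weightLimit: 10 };
const tailSchedulerOptions = { weightLimit: 10, resourceLimits: { live: 4 } };

const tailOptions = { ...options, ...tailSchedulerOptions };
// tailOptions.failFast is still true; weightLimit and resourceLimits
// come from the tail scheduler options.
```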