mirror of
https://github.com/openclaw/openclaw.git
synced 2026-05-06 15:50:46 +00:00
test(release): prefer GPT-5.5 smoke models
This commit is contained in:
@@ -188,7 +188,7 @@ Keep `workflow_ref` and `package_ref` separate. `workflow_ref` is the trusted wo
|
||||
|
||||
The `package` profile uses offline plugin coverage so published-package validation is not gated on live ClawHub availability. The optional Telegram lane reuses the `package-under-test` artifact in `NPM Telegram Beta E2E`, with the published npm spec path kept for standalone dispatches.
|
||||
|
||||
Release checks call Package Acceptance with `source=ref`, `package_ref=<release-ref>`, `workflow_ref=<release workflow ref>`, `suite_profile=custom`, `docker_lanes='plugins-offline plugin-update'`, and `telegram_mode=mock-openai`. Release-path Docker chunks cover the overlapping package/update/plugin lanes; Package Acceptance keeps offline plugin, update, and Telegram proof against the same resolved package tarball. Cross-OS release checks still cover OS-specific onboarding, installer, and platform behavior; package/update product validation should start with Package Acceptance. The `published-upgrade-survivor` Docker lane validates one published package baseline per run. In Package Acceptance, the resolved `package-under-test` tarball is always the candidate and `published_upgrade_survivor_baseline` selects the fallback published baseline, defaulting to `openclaw@latest`; failed-lane rerun commands preserve that baseline. Set `published_upgrade_survivor_baselines=release-history` to expand the lane across a deduped history matrix: the latest six stable releases, `2026.4.23`, and the latest stable release before `2026-03-15`. Set `published_upgrade_survivor_scenarios=reported-issues` to expand the same baselines across issue-shaped fixtures for Feishu config, preserved bootstrap/persona files, tilde log paths, and stale legacy plugin dependency roots. Local aggregate runs can pass exact package specs with `OPENCLAW_UPGRADE_SURVIVOR_BASELINE_SPECS`, keep a single lane with `OPENCLAW_UPGRADE_SURVIVOR_BASELINE_SPEC` such as `openclaw@2026.4.15`, or set `OPENCLAW_UPGRADE_SURVIVOR_SCENARIOS` for the scenario matrix. The published lane configures the baseline with a baked `openclaw config set` command recipe, records recipe steps in `summary.json`, and probes `/healthz`, `/readyz`, plus RPC status after Gateway start. The Windows packaged and installer fresh lanes also verify that an installed package can import a browser-control override from a raw absolute Windows path. The OpenAI cross-OS agent-turn smoke defaults to `OPENCLAW_CROSS_OS_OPENAI_MODEL` when set, otherwise `openai/gpt-5.4-mini`, so the install and gateway proof stays fast and deterministic.
|
||||
Release checks call Package Acceptance with `source=ref`, `package_ref=<release-ref>`, `workflow_ref=<release workflow ref>`, `suite_profile=custom`, `docker_lanes='plugins-offline plugin-update'`, and `telegram_mode=mock-openai`. Release-path Docker chunks cover the overlapping package/update/plugin lanes; Package Acceptance keeps offline plugin, update, and Telegram proof against the same resolved package tarball. Cross-OS release checks still cover OS-specific onboarding, installer, and platform behavior; package/update product validation should start with Package Acceptance. The `published-upgrade-survivor` Docker lane validates one published package baseline per run. In Package Acceptance, the resolved `package-under-test` tarball is always the candidate and `published_upgrade_survivor_baseline` selects the fallback published baseline, defaulting to `openclaw@latest`; failed-lane rerun commands preserve that baseline. Set `published_upgrade_survivor_baselines=release-history` to expand the lane across a deduped history matrix: the latest six stable releases, `2026.4.23`, and the latest stable release before `2026-03-15`. Set `published_upgrade_survivor_scenarios=reported-issues` to expand the same baselines across issue-shaped fixtures for Feishu config, preserved bootstrap/persona files, tilde log paths, and stale legacy plugin dependency roots. Local aggregate runs can pass exact package specs with `OPENCLAW_UPGRADE_SURVIVOR_BASELINE_SPECS`, keep a single lane with `OPENCLAW_UPGRADE_SURVIVOR_BASELINE_SPEC` such as `openclaw@2026.4.15`, or set `OPENCLAW_UPGRADE_SURVIVOR_SCENARIOS` for the scenario matrix. The published lane configures the baseline with a baked `openclaw config set` command recipe, records recipe steps in `summary.json`, and probes `/healthz`, `/readyz`, plus RPC status after Gateway start. The Windows packaged and installer fresh lanes also verify that an installed package can import a browser-control override from a raw absolute Windows path. The OpenAI cross-OS agent-turn smoke defaults to `OPENCLAW_CROSS_OS_OPENAI_MODEL` when set, otherwise `openai/gpt-5.5`, so the install and gateway proof stays on the preferred GPT-5 test model.
|
||||
|
||||
### Legacy compatibility windows
|
||||
|
||||
|
||||
@@ -248,7 +248,7 @@ gh workflow run package-acceptance.yml --ref main \
|
||||
- Use `--platform macos`, `--platform windows`, or `--platform linux` while
|
||||
iterating on one guest. Use `--json` for the summary artifact path and
|
||||
per-lane status.
|
||||
- The OpenAI lane uses `openai/gpt-5.4` for the live agent-turn proof by
|
||||
- The OpenAI lane uses `openai/gpt-5.5` for the live agent-turn proof by
|
||||
default. Pass `--model <provider/model>` or set
|
||||
`OPENCLAW_PARALLELS_OPENAI_MODEL` when deliberately validating another
|
||||
OpenAI model.
|
||||
|
||||
@@ -276,7 +276,7 @@ ref once as `release-package-under-test` and reuses that artifact in both
|
||||
release-path Docker checks and Package Acceptance. This keeps all
|
||||
package-facing boxes on the same bytes and avoids repeated package builds.
|
||||
The cross-OS OpenAI install smoke uses `OPENCLAW_CROSS_OS_OPENAI_MODEL` when the
|
||||
repo/org variable is set, otherwise `openai/gpt-5.4-mini`, because this lane is
|
||||
repo/org variable is set, otherwise `openai/gpt-5.5`, because this lane is
|
||||
proving package install, onboarding, gateway startup, and one live agent turn
|
||||
rather than benchmarking the slowest default model. The broader live provider
|
||||
matrix remains the place for model-specific coverage.
|
||||
|
||||
Reference in New Issue
Block a user