Commit Graph

82 Commits

Author SHA1 Message Date
Peter Steinberger
bb46b79d3c refactor: internalize OpenClaw agent runtime (#85341)
* refactor: extract agent core package

Introduce packages/agent-core as the OpenClaw-owned home for reusable agent loop, harness, session, prompt, and runtime dependency contracts.

* refactor: extract shared llm runtime

Move provider model registries, stream wrappers, OAuth helpers, and LLM utilities into src/llm with plugin-sdk barrels instead of depending on the old embedded runtime layout.

* refactor: remove pi runtime internals

Rename remaining Pi-shaped agent surfaces to OpenClaw agent runtime names, delete obsolete Pi docs and package graph checks, and add the third-party notice for incorporated code.

* refactor: tighten agent session runtime

Make agent-core/runtime dependencies explicit, consolidate compaction and session transcript helpers, and move model/session helpers behind OpenClaw-owned contracts.

* refactor: remove static model and pi auth paths

Drop static model catalogs and Pi auth bridges, move model/provider facts to manifest-owned runtime contracts, and harden internal embedded-agent utilities.

* refactor: remove legacy provider compat paths

* docs: remove agent parity notes

* fix: skip provider wildcard metadata parsing

* refactor: share session extension sdk loading

* refactor: inline acpx proxy error formatter

* refactor: fold edit recovery into edit tool

* fix: accept extension batch separator

* test: align startup provider plugin expectations

* fix: restore provider-scoped release discovery

* test: align static asset packaging expectations

* fix: run static provider catalogs during scoped discovery

* fix: add provider entry catalogs for scoped live discovery

* fix: load lightweight provider catalog entries

* fix: refresh provider-scoped plugin metadata

* fix: keep provider catalog entries on release live path

* fix: keep static manifest models in release live checks

* fix: harden release model discovery

* fix: reduce OpenAI live cache probe reasoning

* fix: disable OpenAI cache probe reasoning

* ci: extend OpenAI gateway live timeout

* fix: extend live gateway model budget

* fix: stabilize release validation regressions

* fix: honor provider aliases in model rows

* fix: stabilize release validation lanes

* fix: stabilize release memory qa

* ci: stabilize release validation lanes

* ci: prefer ipv4 for live docker node calls

* fix: restore shared tool-call stream wrapper

* ci: remove legacy pi test shard alias

* fix: clean up embedded agent test drift

* fix: stabilize runtime alias status

* fix: clean up embedded agent ci drift

* fix: restore release ci invariants

* fix: clean up post-rebase runtime drift

* fix: restore release ci checks

* fix: restore release ci after rebase

* fix: remove stale pi runtime path

* test: align compaction runtime expectations

* test: update plugin prerelease expectations

* fix: handle claude live tool approvals

* fix: stabilize release validation gates

* fix: finish agent runtime import

* test: finish post-rebase agent runtime mocks

* fix: keep codex compaction native

* fix: stabilize codex app-server hook tests

* test: isolate codex diagnostic active run

* test: remove codex diagnostic completion race

# Conflicts:
#	extensions/codex/src/app-server/run-attempt.test.ts

* ci: fix full release manifest performance run id

* refactor: narrow llm plugin sdk boundary

* chore: drop generated google boundary stamps

* fix: repair rebase fallout

* fix: clean up rebased runtime references

* fix: decode codex jwt payloads as base64url

* fix: preserve shipped pi runtime alias

* fix: add scoped sdk virtual modules

* fix: decode llm codex oauth jwt as base64url

* fix: avoid stale vertex adc negative cache

* fix: harden tool arg decoding and codeql path

* fix: keep vertex adc negative checks live

* refactor: consolidate codex jwt and edit helpers

* fix: await codex oauth node runtime imports

* fix: preserve sdk tool and notice contracts

* fix: preserve shipped compat config boundaries

* fix: align codex oauth callback host

* fix: terminate agent-core loop streams on failure

* fix: keep codex oauth callback alive during fallback

* ci: include session tools in critical codeql scans

* fix: keep Cloudflare Anthropic provider auth header

* docs: redirect legacy pi runtime pages

* fix: honor bundled web provider compat discovery

* fix: protect session output spill files

* fix: keep legacy agent dir env blocked

* fix: contain auto-discovered skill symlinks

* fix: harden agent core sdk proxy surfaces

* fix: restore approval reaction sdk compat

* fix: keep live docker runs bounded

* fix: keep codex oauth redirect host aligned

* fix: resolve post-rebase agent runtime drift

* fix: redact anthropic oauth parse failures

* fix: preserve responses strict tool shaping

* fix: repair agent runtime rebase cleanup

* docs: redirect retired parity pages

* fix: bound auto-discovered resources to roots

* fix: repair post-rebase agent test drift

* fix: preserve bundled provider allowlist migration

* fix: preserve manifest-owned provider aliases

* fix: declare photon image dependency

* fix: keep provider headers out of proxy body

* fix: preserve shipped env aliases

* fix: refresh control ui i18n generated state

* fix: quote read fallback paths

* fix: preview edits through configured backend

* test: satisfy core test typecheck

* fix: preserve ZAI usage auth fallback

* test: repair codex diagnostic test

* fix: repair agent runtime rebase drift

* test: finish embedded runner import rename

* fix: repair agent runtime rebase integrations

* test: align compaction oauth fallback expectations

* fix: allow sdk-auth session models

* fix: update doctor tool schema import

* fix: preserve bedrock plugin region

* fix: stream harmony-like prose immediately

* ci: include session runtime in codeql shards

* fix: repair latest rebase integrations

* fix: honor explicit codex websocket transport

* fix: keep openai-compatible credentials provider-scoped

* fix: refresh sdk api baseline after rebase

* fix: route cli runtime aliases through openclaw harness

* test: rename stale harness mock expectation

* test: rename embedded agent overflow calls

* test: clean embedded auth test wording

* test: use openclaw stream types in deepinfra cache test

* fix: refresh sdk api baseline on latest main

* fix: honor bundled discovery compat allowlists

* fix: refresh sdk api baseline after latest rebase

* fix: remove stale rebase imports

* test: rename stale model catalog mock

* test: mock renamed doctor runtime modules

* fix: map canonical kimi env auth

* fix: use internal model registry in bench script

* fix: migrate deepinfra provider catalog entry

* fix: enforce builtin tool suppression

* fix: route compaction auth and proxy payloads safely

* refactor: prune unused llm registry leftovers

* test: update codex hooks session import

* test: fix model picker ci coverage

* test: align model picker auth mock types
2026-05-27 19:24:04 +01:00
Vincent Koc
4d4e2ec256 fix(qa): require genai otel model spans (#86920) 2026-05-26 14:51:50 +01:00
Vincent Koc
9be760fb37 test(qa): add collector-backed otel smoke 2026-05-25 23:51:17 +02:00
Peter Steinberger
a1fe86a0ff feat(qa): add coverage scenario matching 2026-05-25 10:22:51 +01:00
Vincent Koc
7f05be041e fix(diagnostics): harden observability exports and smokes (#85371)
* test(diagnostics): widen observability smokes

* fix(diagnostics): sanitize observability exports

* docs(diagnostics): format otel export docs
2026-05-23 15:27:43 +08:00
Kevin Lin
5656f687c1 Add Slack approval QA checkpoints (#85141)
* test: add slack approval qa checkpoints

* fix(slack): scope plugin approval session fallback

* ci(mantis): allow slack approval checkpoint dispatch

* ci(mantis): use on-demand aws slack desktops

* ci(mantis): run slack smoke from candidate checkout

* ci(mantis): pin aws ssh ingress to runner

* test(mantis): skip crabbox actions hydrate for slack desktop

* ci(mantis): use fresh pr checkout for slack desktop

* ci(mantis): start slack desktop smoke from source

* fix(mantis): use relative slack qa output dir

* test(mantis): surface slack smoke failure logs

* fix(mantis): write slack approval watcher script

* fix(mantis): accept successful slack qa metadata

* fix(mantis): tighten slack approval evidence

* fix(mantis): repair slack evidence manifest

* fix(mantis): render slack approval checkpoint proof

* fix(mantis): quote approval checkpoint renderer html

* fix(mantis): preserve slack approval failure artifacts

* fix(mantis): timeout silent slack desktop runs

* fix(mantis): keep slack desktop runs chatty

* fix(mantis): keep slack workflow harness trusted

* fix(qa-lab): make slack approval evidence robust

* fix(qa-lab): harden slack approval workflow proof

* test(qa-lab): surface slack approval diagnostics

* test(qa-lab): loosen slack approval readiness
2026-05-22 22:04:15 -07:00
Ayaan Zaidi
cd15ce35a0 fix(qa): keep telegram user creds mantis-only 2026-05-18 10:04:58 +05:30
Vincent Koc
1300b22630 fix(qa-lab): classify runtime token efficiency 2026-05-18 11:09:08 +08:00
Vincent Koc
1926982c4c fix(qa-lab): refresh parity model targets 2026-05-17 23:12:26 +08:00
Vincent Koc
da8afe359d feat(qa-lab): add scenario pack selector 2026-05-17 09:23:48 +08:00
Vincent Koc
440333125c test(qa-lab): add personal agent scenarios 2026-05-17 02:56:53 +08:00
Vincent Koc
ac2e3a23b9 fix(qa): preserve RTT samples with Convex credentials 2026-05-17 02:17:35 +08:00
Ayaan Zaidi
d7bbff2185 feat(telegram): default Crabbox proof GIFs to 1080p 2026-05-10 15:46:30 +05:30
Ayaan Zaidi
a9bf94c62d feat(telegram): harden Crabbox real-user proof 2026-05-10 15:46:30 +05:30
Ayaan Zaidi
984174fb9d feat(telegram): publish crabbox proof gif by default 2026-05-10 15:10:39 +05:30
Ayaan Zaidi
32e1236cb7 feat(telegram): hold crabbox user sessions 2026-05-10 15:10:39 +05:30
Ayaan Zaidi
ecb7ea19a5 feat(telegram): add real user crabbox proof 2026-05-10 15:10:39 +05:30
Ayaan Zaidi
dc64b99c41 docs(qa): list Telegram auth E2E scenario 2026-05-09 15:28:54 +05:30
Ayaan Zaidi
5e27993cbe docs(qa): document telegram e2e defaults 2026-05-08 16:14:42 +05:30
Peter Steinberger
5ed1cfc15c docs: keep qa broker notes internal 2026-05-08 06:01:23 +01:00
Vincent Koc
93747f6955 test(qa): add discord voice autojoin smoke 2026-05-06 22:30:36 -07:00
Vincent Koc
2b8d91d9ee docs: typography hygiene + 2 in-body H1 removals across 5 pages
Replaced 112 typography characters (curly quotes, apostrophes, em/en
dashes, non-breaking hyphens) with ASCII equivalents per
docs/CLAUDE.md heading and content hygiene rules.

- docs/help/gpt55-codex-agentic-parity.md: 22 chars; removed the
  duplicate '# GPT-5.5 / Codex Agentic Parity in OpenClaw' H1 (Mintlify
  renders the title from frontmatter; the in-body H1 with the slash
  produced a brittle anchor).
- docs/platforms/mac/menu-bar.md: 21 chars; removed the duplicate
  '# Menu Bar Status Logic' H1.
- docs/tools/acp-agents.md: 23 chars
- docs/concepts/qa-matrix.md: 23 chars
- docs/concepts/qa-e2e-automation.md: 23 chars
2026-05-05 19:34:52 -07:00
Peter Steinberger
057d3a43c0 feat(mantis): capture logged-in discord web evidence 2026-05-06 02:43:49 +01:00
Peter Steinberger
ad2d13cc67 fix(discord): preserve thread reply file attachments 2026-05-06 01:16:57 +01:00
Kevin Lin
dd643b52df test: expand slack live qa coverage (#77713) 2026-05-05 16:11:07 -07:00
Peter Steinberger
430814ebc1 docs: add Mantis Slack desktop runbook 2026-05-05 23:48:49 +01:00
Peter Steinberger
26bc40c1a4 perf: add Mantis Slack hydrate timings 2026-05-05 21:07:07 +01:00
Peter Steinberger
e8a9c766c2 perf: speed up Mantis Slack desktop smoke 2026-05-05 19:57:26 +01:00
Peter Steinberger
35266879de feat: add Mantis visual task video QA 2026-05-05 05:35:12 +01:00
Vincent Koc
e03fe1e289 fix(telegram): reuse preview for long text finals (#77658)
* fix(telegram): reuse preview for long text finals

* test(qa): cover long telegram finals

* fix(qa): satisfy extension lint

* fix(qa): keep telegram long final fixture to two chunks

* test(telegram): cover three chunk finals

* fix(telegram): force long final preview boundary
2026-05-04 21:19:44 -07:00
Vincent Koc
b062bb670d docs(channels): inline Slack manifest into Quick Setup with Recommended/Minimal variants
The Quick Setup steps in docs/channels/slack.md previously sent users to
the `#manifest-and-scope-checklist` anchor lower on the page to copy the
manifest, breaking the copy-paste flow. Pull the manifest inline as a
Mintlify <CodeGroup> for both Socket Mode and HTTP Request URLs tabs and
add a Minimal variant for workspaces that restrict scopes (drops
files:*, reactions:*, pins:*, mpim:*, emoji:read, usergroups:read while
keeping DMs, channel/group history, mentions, App Home, and slash
commands). Recommended matches extensions/slack/src/setup-shared.ts.
Existing Manifest and scope checklist section stays as the canonical
per-scope reference.

Cross-link from docs/concepts/qa-e2e-automation.md so QA maintainers see
the production manifest reference, while keeping the QA Driver/SUT pair
of manifests inline (the lane intentionally needs two distinct apps so
its shape is different from a single-app production install).
2026-05-04 18:16:15 -07:00
Sarah Fortune
d6e991db49 Add instructions for how to setup slack for QA tests (#77606) 2026-05-04 17:38:16 -07:00
Peter Steinberger
f632f5e60b feat(qa): add mantis Slack desktop smoke 2026-05-04 03:47:27 +01:00
Peter Steinberger
57b2d29761 feat(qa): add Mantis desktop browser smoke 2026-05-04 01:30:20 +01:00
Vincent Koc
31cafbb802 test(qa): add Slack live transport lane 2026-05-03 15:19:55 -07:00
Peter Steinberger
d4af125b52 feat(qa): add Mantis before-after CLI 2026-05-03 21:27:43 +01:00
Peter Steinberger
77a50db9ea feat(qa): add Mantis Discord status reaction scenario (#76747)
* feat(qa): add Mantis Discord status reaction scenario

* fix(qa): retry Discord rate limits in Mantis runs

* refactor(qa): reuse Discord API retry helper

* fix(qa): import Discord API through package surface

* fix(ci): generate Discord boundary declarations

* fix(ci): keep xai boundary overrides stable
2026-05-03 17:00:06 +01:00
Peter Steinberger
0bf06e953f feat: add Mantis Discord smoke runner (#76696)
* docs: add Mantis QA system design

* feat: add Mantis Discord smoke runner

* fix: harden Mantis Discord smoke

* fix: redact Mantis Discord artifacts

* fix: satisfy Mantis redaction lint

* fix: redact Mantis mismatch failures

* test: avoid promise assertions in Mantis tests
2026-05-03 15:25:56 +01:00
Vincent Koc
b9eb31b54c ci: fold parity into QA release validation (#74622)
Summary:
- The PR deletes the standalone `Parity gate` workflow, renames QA parity wording from gate to lane, routes docs toward QA/release validation, and adjusts the Docker E2E boundary guard for package-backed live lanes.
- Reproducibility: not applicable. as a CI/docs refactor. The high-confidence review path is static comparison of the repaired PR diff against current workflow/docs code plus exact-head check status.

ClawSweeper fixups:
- Included follow-up commit: ci: fold parity into QA release validation
- Ran the ClawSweeper repair loop before final review.

Validation:
- ClawSweeper review passed for head 3482654058.
- Required merge gates passed before the squash merge.

Prepared head SHA: 3482654058
Review: https://github.com/openclaw/openclaw/pull/74622#issuecomment-4359168336

Co-authored-by: Vincent Koc <vincentkoc@ieee.org>
Co-authored-by: clawsweeper <274271284+clawsweeper[bot]@users.noreply.github.com>
2026-05-02 19:57:15 +00:00
Gustavo Madeira Santana
d59f001507 test(qa-matrix): cover allowBots modes 2026-04-28 00:47:40 -04:00
Gustavo Madeira Santana
c5678194d4 docs(qa): document Telegram and Discord QA lanes against code
Both lanes had only one paragraph each in qa-e2e-automation.md. Adds a
"Telegram and Discord QA reference" section verified against
extensions/qa-lab/src/live-transports/{telegram,discord}/* with:

- shared CLI flags table (--scenario, --output-dir, --repo-root, --sut-account,
  --provider-mode, --model, --alt-model, --fast, --credential-source,
  --credential-role) — none of these were enumerated for either lane.
- Telegram QA: 8 scenario ids
  (telegram-canary/-mention-gating/-mentioned-message-reply/-help-command/
  -commands-command/-tools-compact-command/-whoami-command/-context-command),
  output artifact paths (telegram-qa-report.md, -summary.json,
  -observed-messages.json), and the redaction toggle.
- Discord QA: 3 scenario ids
  (discord-canary/-mention-gating/-native-help-command-registration), output
  artifact paths, and the SUT-application-id-must-match-bot-user-id check.
- Convex credential pool: documents Discord support (only Telegram was
  mentioned before) and the per-kind payload shapes for the
  admin/add validator. Cross-links to testing.md for the broker endpoint
  contract.

Slims the duplicate Operator-flow paragraphs for Telegram and Discord into a
single one-block pointer that links to the new reference section.
2026-04-27 13:48:03 -04:00
Gustavo Madeira Santana
dd1a94f089 docs(qa): reorg, audit against code, and refresh stale content
Reorg
- Rename the architecture page title to "QA overview" (slug stays
  /concepts/qa-e2e-automation so inbound links keep working).
- Move "Adding a channel to QA" + scenario-helper-name reference from
  testing.md into qa-e2e-automation.md under "Transport adapters". Architecture
  belongs with the architecture page.
- Drop the duplicate live-transport coverage table from testing.md; canonical
  copy stays in qa-e2e-automation.md under a new "Live transport coverage"
  heading so qa-matrix.md can deep-link to it.
- Slim testing.md QA-specific runners section to ops only, with cross-links.

Audit (against extensions/qa-lab/src/cli.ts, qa-channel/src/config-schema.ts,
and live-transport runtimes)
- qa-e2e-automation.md gains a "Command surface" table covering all 14
  openclaw qa <subcommand> forms; previously only ~7 of 14 were named.
- Document missing OPENCLAW_QA_TELEGRAM_CAPTURE_CONTENT and
  OPENCLAW_QA_DISCORD_CAPTURE_CONTENT env vars (Matrix already had it).
- Cross-link qa coverage from the Reporting section.
- qa-channel.md completes the config-key list (enabled, name, accounts,
  defaultAccount were missing from the schema doc) and pollTimeoutMs range.
- Drop stale "Follow-up work" framing in qa-channel.md (provider/model matrix,
  scenario discovery, orchestration) — all three already shipped.
- Replace "vertical slice" language with current behavior; fix misplaced
  debugger-UI paragraph.

Discoverability
- Add a Note callout to testing.md pointing at the three QA pages
  (QA overview, Matrix QA, QA channel) so maintainers landing on testing.md
  see the QA stack in the prologue.

Glossary entries for the renamed/new doc titles.
2026-04-27 13:40:11 -04:00
Peter Steinberger
6987132aed ci: add Matrix QA profiles 2026-04-27 05:43:14 +01:00
Peter Steinberger
7ca2f9fed5 test(docker): align package harness image 2026-04-27 01:22:58 +01:00
Peter Steinberger
42db865673 test(docker): run observability on shared image 2026-04-27 00:49:36 +01:00
Vincent Koc
5d7c6e6bda test(docker): add observability smoke
Add Docker aggregate observability coverage for QA-lab OTEL and Prometheus diagnostics.
2026-04-26 16:43:56 -07:00
Vincent Koc
82ddcf24f5 feat(diagnostics): add harness lifecycle telemetry 2026-04-25 23:34:34 -07:00
Vincent Koc
a0ca546997 test(qa): add local otel smoke harness 2026-04-25 19:30:46 -07:00
Peter Steinberger
768bbc7cc0 docs: update OpenAI GPT-5.5 API guidance 2026-04-25 18:14:10 +01:00
Peter Steinberger
88c91675e2 test: stabilize qa suite concurrency 2026-04-24 20:39:33 +01:00