openclaw

mirror of https://github.com/openclaw/openclaw.git synced 2026-06-28 23:13:42 +00:00

Author	SHA1	Message	Date
Colin Johnson	591313e80a	qa-lab: support script-backed evidence scenarios (#94276) * qa: add script scenario execution kind * fix(qa-lab): carry suite profile into script producer evidence and simplify artifact path resolution * fix(qa-lab): keep out-of-repo producer artifacts absolute to avoid ../ traversal refs --------- Co-authored-by: Dallin Romney <dallinromney@gmail.com>	2026-06-17 15:09:25 -07:00
Dallin Romney	e32929e12c	Add slim evidence mode for QA profile evidence (#93179) * test(qa): compact profile evidence execution metadata * docs(qa): document compact profile evidence * test(qa): support compact evidence mode * test(qa): rename compact evidence mode to slim * docs(qa): trim slim evidence wording * fix(qa): avoid commander runtime import	2026-06-15 14:50:40 -07:00
Dallin Romney	3d38c9a633	test(qa): embed profile scorecard evidence (#93109) * test(qa): embed profile scorecard evidence * test(qa): fix profile runner return lint * test(qa): satisfy suite command lint return	2026-06-14 20:51:38 -07:00
Dallin Romney	e8db9c3bc0	test(qa): add qa run --profile and unified output summary/evidence (#91587 ) * test(qa): add mapped qa run profiles * test(qa): document mapped profile runner * test(qa): validate run profiles from mapping * test(qa): preserve root profile parsing * test(qa): simplify taxonomy profile dispatch * test(qa): align tool coverage CLI expectation * test(qa): fix profile dispatch fixture type * test(qa): share profile runner option types * test(qa): split shared cli runner options * test(qa): unify profile suite artifacts * fix(qa): filter profile scenarios by provider lane * test(qa): drop native scenario subreports * fix(qa): keep native log refs repo-relative * fix(cli): preserve qa run root profile parsing * fix(qa): avoid qa profile flag collision * fix(qa): reject profile flags without qa profile	2026-06-14 18:08:42 -07:00
Dallin Romney	fef8394079	Convert QA scenarios to YAML files (#92915) * refactor: load QA scenarios from YAML * docs: update personal QA scenario docs * test: keep QA scenarios YAML-only	2026-06-14 17:31:18 -07:00
Dallin Romney	1affe4fcdf	Fold Telegram RTT sampling into live QA evidence (#92550 ) * refactor(qa): fold telegram rtt into live evidence * test: default package telegram rtt samples * refactor(qa-lab): fold telegram rtt into live evidence * fix(qa-lab): keep package telegram rtt optional for focused runs * fix(qa-lab): avoid stale rtt evidence on failed samples * fix(qa-lab): pass telegram live env into credential leasing * fix(qa-lab): update telegram canary remediation artifacts * docs(qa): remove stale telegram observed artifact guidance * fix(qa-lab): clarify telegram empty-reply remediation * fix(qa-lab): honor telegram rtt timeout * ci(qa): drop stale telegram capture env * refactor: align telegram evidence coverage fields * fix: ignore stale telegram observed artifacts * fix: preserve telegram rtt coverage mapping * fix: omit unused telegram rtt catch binding * docs: document telegram rtt check selector	2026-06-14 17:02:33 +08:00
Dallin Romney	561b293c7a	Run Vitest and Playwright scenarios from qa suite (#92606) * test(qa): run vitest and playwright scenarios from qa suite * fix(qa): harden scenario suite dispatch * refactor(qa): share scenario path utilities * refactor(qa): share test file scenario runner * refactor(qa): route test file scenarios through suite runtime * refactor(qa): use explicit suite runtime result kind * test(qa): write suite evidence artifact * refactor(qa): clarify suite execution dispatch * fix(qa): keep test-file scenarios out of flow-only runners * refactor(qa): export mixed scenario suite runner	2026-06-13 01:06:10 -07:00
Dallin Romney	4809ac70fa	Add QA evidence artifact output (#91484 ) * feat: add qa evidence summary normalization * chore: rename qa evidence target environment * chore: align qa evidence profile terminology * chore: align qa evidence summary fields * chore: add qa evidence taxonomy ref * test: remove stale multipass evidence example * test(qa): normalize vitest and playwright evidence * test(qa): slim evidence summary metadata * test(qa): clarify evidence summary inputs * test(qa): rename scenario specs in evidence flow * test(qa): treat evidence profiles as mapping strings * test(qa): use neutral evidence test identity * test(qa): nest evidence summary joins * refactor(qa): normalize live evidence summaries * fix(qa): accept normalized telegram rtt summaries * fix(qa): normalize evidence lane summaries * fix(qa): align evidence summaries with requirements * refactor(qa): tighten evidence summary builders * refactor(qa): restore standard evidence ids * fix(qa): keep legacy summaries out of rtt evidence * refactor(qa): make package evidence provenance explicit * test(qa): keep script tests out of qa lab internals * refactor(qa): rename scenario evidence definitions * refactor(qa): clean evidence summary wording * test(qa): fix evidence summary test inputs * refactor(qa): simplify evidence identity fields * refactor(qa): tighten evidence summary inputs * refactor(qa): rename evidence artifact	2026-06-12 16:12:58 -07:00
Marcus Castro	181238fb53	feat(whatsapp): expand live QA coverage (#90480) * feat(whatsapp): expand qa driver message support * feat(qa-lab): add deterministic whatsapp mock replies * feat(qa-lab): expand whatsapp live qa scenarios * docs(qa): document whatsapp live qa coverage	2026-06-08 00:03:23 -03:00
Vincent Koc	f3f85ae5f7	refactor: share live transport scenario helpers	2026-05-30 01:05:56 +02:00
Peter Steinberger	1188aa3b81	feat: add Claude Opus 4.8 support (#87890) * feat: add Claude Opus 4.8 support * fix: omit Vertex Opus sampling overrides * fix: preserve Opus adaptive thinking levels * fix: clamp Anthropic max effort support * fix: use sha256 for QA mock call ids * fix: type Anthropic transport test model metadata * test: update PDF model default for Opus 4.8	2026-05-29 06:10:42 +01:00
Kevin Lin	359c31b7e7	Add WhatsApp approval QA scenarios (#87782) * test(qa): add WhatsApp approval scenarios * fix(qa): keep WhatsApp approval scenarios explicit	2026-05-28 15:27:20 -07:00
Peter Steinberger	bb46b79d3c	refactor: internalize OpenClaw agent runtime (#85341 ) * refactor: extract agent core package Introduce packages/agent-core as the OpenClaw-owned home for reusable agent loop, harness, session, prompt, and runtime dependency contracts. * refactor: extract shared llm runtime Move provider model registries, stream wrappers, OAuth helpers, and LLM utilities into src/llm with plugin-sdk barrels instead of depending on the old embedded runtime layout. * refactor: remove pi runtime internals Rename remaining Pi-shaped agent surfaces to OpenClaw agent runtime names, delete obsolete Pi docs and package graph checks, and add the third-party notice for incorporated code. * refactor: tighten agent session runtime Make agent-core/runtime dependencies explicit, consolidate compaction and session transcript helpers, and move model/session helpers behind OpenClaw-owned contracts. * refactor: remove static model and pi auth paths Drop static model catalogs and Pi auth bridges, move model/provider facts to manifest-owned runtime contracts, and harden internal embedded-agent utilities. 
* refactor: remove legacy provider compat paths * docs: remove agent parity notes * fix: skip provider wildcard metadata parsing * refactor: share session extension sdk loading * refactor: inline acpx proxy error formatter * refactor: fold edit recovery into edit tool * fix: accept extension batch separator * test: align startup provider plugin expectations * fix: restore provider-scoped release discovery * test: align static asset packaging expectations * fix: run static provider catalogs during scoped discovery * fix: add provider entry catalogs for scoped live discovery * fix: load lightweight provider catalog entries * fix: refresh provider-scoped plugin metadata * fix: keep provider catalog entries on release live path * fix: keep static manifest models in release live checks * fix: harden release model discovery * fix: reduce OpenAI live cache probe reasoning * fix: disable OpenAI cache probe reasoning * ci: extend OpenAI gateway live timeout * fix: extend live gateway model budget * fix: stabilize release validation regressions * fix: honor provider aliases in model rows * fix: stabilize release validation lanes * fix: stabilize release memory qa * ci: stabilize release validation lanes * ci: prefer ipv4 for live docker node calls * fix: restore shared tool-call stream wrapper * ci: remove legacy pi test shard alias * fix: clean up embedded agent test drift * fix: stabilize runtime alias status * fix: clean up embedded agent ci drift * fix: restore release ci invariants * fix: clean up post-rebase runtime drift * fix: restore release ci checks * fix: restore release ci after rebase * fix: remove stale pi runtime path * test: align compaction runtime expectations * test: update plugin prerelease expectations * fix: handle claude live tool approvals * fix: stabilize release validation gates * fix: finish agent runtime import * test: finish post-rebase agent runtime mocks * fix: keep codex compaction native * fix: stabilize codex app-server hook tests * test: 
isolate codex diagnostic active run * test: remove codex diagnostic completion race # Conflicts: # extensions/codex/src/app-server/run-attempt.test.ts * ci: fix full release manifest performance run id * refactor: narrow llm plugin sdk boundary * chore: drop generated google boundary stamps * fix: repair rebase fallout * fix: clean up rebased runtime references * fix: decode codex jwt payloads as base64url * fix: preserve shipped pi runtime alias * fix: add scoped sdk virtual modules * fix: decode llm codex oauth jwt as base64url * fix: avoid stale vertex adc negative cache * fix: harden tool arg decoding and codeql path * fix: keep vertex adc negative checks live * refactor: consolidate codex jwt and edit helpers * fix: await codex oauth node runtime imports * fix: preserve sdk tool and notice contracts * fix: preserve shipped compat config boundaries * fix: align codex oauth callback host * fix: terminate agent-core loop streams on failure * fix: keep codex oauth callback alive during fallback * ci: include session tools in critical codeql scans * fix: keep Cloudflare Anthropic provider auth header * docs: redirect legacy pi runtime pages * fix: honor bundled web provider compat discovery * fix: protect session output spill files * fix: keep legacy agent dir env blocked * fix: contain auto-discovered skill symlinks * fix: harden agent core sdk proxy surfaces * fix: restore approval reaction sdk compat * fix: keep live docker runs bounded * fix: keep codex oauth redirect host aligned * fix: resolve post-rebase agent runtime drift * fix: redact anthropic oauth parse failures * fix: preserve responses strict tool shaping * fix: repair agent runtime rebase cleanup * docs: redirect retired parity pages * fix: bound auto-discovered resources to roots * fix: repair post-rebase agent test drift * fix: preserve bundled provider allowlist migration * fix: preserve manifest-owned provider aliases * fix: declare photon image dependency * fix: keep provider headers out of 
proxy body * fix: preserve shipped env aliases * fix: refresh control ui i18n generated state * fix: quote read fallback paths * fix: preview edits through configured backend * test: satisfy core test typecheck * fix: preserve ZAI usage auth fallback * test: repair codex diagnostic test * fix: repair agent runtime rebase drift * test: finish embedded runner import rename * fix: repair agent runtime rebase integrations * test: align compaction oauth fallback expectations * fix: allow sdk-auth session models * fix: update doctor tool schema import * fix: preserve bedrock plugin region * fix: stream harmony-like prose immediately * ci: include session runtime in codeql shards * fix: repair latest rebase integrations * fix: honor explicit codex websocket transport * fix: keep openai-compatible credentials provider-scoped * fix: refresh sdk api baseline after rebase * fix: route cli runtime aliases through openclaw harness * test: rename stale harness mock expectation * test: rename embedded agent overflow calls * test: clean embedded auth test wording * test: use openclaw stream types in deepinfra cache test * fix: refresh sdk api baseline on latest main * fix: honor bundled discovery compat allowlists * fix: refresh sdk api baseline after latest rebase * fix: remove stale rebase imports * test: rename stale model catalog mock * test: mock renamed doctor runtime modules * fix: map canonical kimi env auth * fix: use internal model registry in bench script * fix: migrate deepinfra provider catalog entry * fix: enforce builtin tool suppression * fix: route compaction auth and proxy payloads safely * refactor: prune unused llm registry leftovers * test: update codex hooks session import * test: fix model picker ci coverage * test: align model picker auth mock types	2026-05-27 19:24:04 +01:00
Vincent Koc	4d4e2ec256	fix(qa): require genai otel model spans (#86920)	2026-05-26 14:51:50 +01:00
Vincent Koc	9be760fb37	test(qa): add collector-backed otel smoke	2026-05-25 23:51:17 +02:00
Peter Steinberger	a1fe86a0ff	feat(qa): add coverage scenario matching	2026-05-25 10:22:51 +01:00
Vincent Koc	7f05be041e	fix(diagnostics): harden observability exports and smokes (#85371) * test(diagnostics): widen observability smokes * fix(diagnostics): sanitize observability exports * docs(diagnostics): format otel export docs	2026-05-23 15:27:43 +08:00
Kevin Lin	5656f687c1	Add Slack approval QA checkpoints (#85141 ) * test: add slack approval qa checkpoints * fix(slack): scope plugin approval session fallback * ci(mantis): allow slack approval checkpoint dispatch * ci(mantis): use on-demand aws slack desktops * ci(mantis): run slack smoke from candidate checkout * ci(mantis): pin aws ssh ingress to runner * test(mantis): skip crabbox actions hydrate for slack desktop * ci(mantis): use fresh pr checkout for slack desktop * ci(mantis): start slack desktop smoke from source * fix(mantis): use relative slack qa output dir * test(mantis): surface slack smoke failure logs * fix(mantis): write slack approval watcher script * fix(mantis): accept successful slack qa metadata * fix(mantis): tighten slack approval evidence * fix(mantis): repair slack evidence manifest * fix(mantis): render slack approval checkpoint proof * fix(mantis): quote approval checkpoint renderer html * fix(mantis): preserve slack approval failure artifacts * fix(mantis): timeout silent slack desktop runs * fix(mantis): keep slack desktop runs chatty * fix(mantis): keep slack workflow harness trusted * fix(qa-lab): make slack approval evidence robust * fix(qa-lab): harden slack approval workflow proof * test(qa-lab): surface slack approval diagnostics * test(qa-lab): loosen slack approval readiness	2026-05-22 22:04:15 -07:00
Ayaan Zaidi	cd15ce35a0	fix(qa): keep telegram user creds mantis-only	2026-05-18 10:04:58 +05:30
Vincent Koc	1300b22630	fix(qa-lab): classify runtime token efficiency	2026-05-18 11:09:08 +08:00
Vincent Koc	1926982c4c	fix(qa-lab): refresh parity model targets	2026-05-17 23:12:26 +08:00
Vincent Koc	da8afe359d	feat(qa-lab): add scenario pack selector	2026-05-17 09:23:48 +08:00
Vincent Koc	440333125c	test(qa-lab): add personal agent scenarios	2026-05-17 02:56:53 +08:00
Vincent Koc	ac2e3a23b9	fix(qa): preserve RTT samples with Convex credentials	2026-05-17 02:17:35 +08:00
Ayaan Zaidi	d7bbff2185	feat(telegram): default Crabbox proof GIFs to 1080p	2026-05-10 15:46:30 +05:30
Ayaan Zaidi	a9bf94c62d	feat(telegram): harden Crabbox real-user proof	2026-05-10 15:46:30 +05:30
Ayaan Zaidi	984174fb9d	feat(telegram): publish crabbox proof gif by default	2026-05-10 15:10:39 +05:30
Ayaan Zaidi	32e1236cb7	feat(telegram): hold crabbox user sessions	2026-05-10 15:10:39 +05:30
Ayaan Zaidi	ecb7ea19a5	feat(telegram): add real user crabbox proof	2026-05-10 15:10:39 +05:30
Ayaan Zaidi	dc64b99c41	docs(qa): list Telegram auth E2E scenario	2026-05-09 15:28:54 +05:30
Ayaan Zaidi	5e27993cbe	docs(qa): document telegram e2e defaults	2026-05-08 16:14:42 +05:30
Peter Steinberger	5ed1cfc15c	docs: keep qa broker notes internal	2026-05-08 06:01:23 +01:00
Vincent Koc	93747f6955	test(qa): add discord voice autojoin smoke	2026-05-06 22:30:36 -07:00
Vincent Koc	2b8d91d9ee	docs: typography hygiene + 2 in-body H1 removals across 5 pages Replaced 112 typography characters (curly quotes, apostrophes, em/en dashes, non-breaking hyphens) with ASCII equivalents per docs/CLAUDE.md heading and content hygiene rules. - docs/help/gpt55-codex-agentic-parity.md: 22 chars; removed the duplicate '# GPT-5.5 / Codex Agentic Parity in OpenClaw' H1 (Mintlify renders the title from frontmatter; the in-body H1 with the slash produced a brittle anchor). - docs/platforms/mac/menu-bar.md: 21 chars; removed the duplicate '# Menu Bar Status Logic' H1. - docs/tools/acp-agents.md: 23 chars - docs/concepts/qa-matrix.md: 23 chars - docs/concepts/qa-e2e-automation.md: 23 chars	2026-05-05 19:34:52 -07:00
Peter Steinberger	057d3a43c0	feat(mantis): capture logged-in discord web evidence	2026-05-06 02:43:49 +01:00
Peter Steinberger	ad2d13cc67	fix(discord): preserve thread reply file attachments	2026-05-06 01:16:57 +01:00
Kevin Lin	dd643b52df	test: expand slack live qa coverage (#77713)	2026-05-05 16:11:07 -07:00
Peter Steinberger	430814ebc1	docs: add Mantis Slack desktop runbook	2026-05-05 23:48:49 +01:00
Peter Steinberger	26bc40c1a4	perf: add Mantis Slack hydrate timings	2026-05-05 21:07:07 +01:00
Peter Steinberger	e8a9c766c2	perf: speed up Mantis Slack desktop smoke	2026-05-05 19:57:26 +01:00
Peter Steinberger	35266879de	feat: add Mantis visual task video QA	2026-05-05 05:35:12 +01:00
Vincent Koc	e03fe1e289	fix(telegram): reuse preview for long text finals (#77658) * fix(telegram): reuse preview for long text finals * test(qa): cover long telegram finals * fix(qa): satisfy extension lint * fix(qa): keep telegram long final fixture to two chunks * test(telegram): cover three chunk finals * fix(telegram): force long final preview boundary	2026-05-04 21:19:44 -07:00
Vincent Koc	b062bb670d	docs(channels): inline Slack manifest into Quick Setup with Recommended/Minimal variants The Quick Setup steps in docs/channels/slack.md previously sent users to the `#manifest-and-scope-checklist` anchor lower on the page to copy the manifest, breaking the copy-paste flow. Pull the manifest inline as a Mintlify <CodeGroup> for both Socket Mode and HTTP Request URLs tabs and add a Minimal variant for workspaces that restrict scopes (drops files:, reactions:, pins:, mpim:, emoji:read, usergroups:read while keeping DMs, channel/group history, mentions, App Home, and slash commands). Recommended matches extensions/slack/src/setup-shared.ts. Existing Manifest and scope checklist section stays as the canonical per-scope reference. Cross-link from docs/concepts/qa-e2e-automation.md so QA maintainers see the production manifest reference, while keeping the QA Driver/SUT pair of manifests inline (the lane intentionally needs two distinct apps so its shape is different from a single-app production install).	2026-05-04 18:16:15 -07:00
Sarah Fortune	d6e991db49	Add instructions for how to set up Slack for QA tests (#77606)	2026-05-04 17:38:16 -07:00
Peter Steinberger	f632f5e60b	feat(qa): add mantis Slack desktop smoke	2026-05-04 03:47:27 +01:00
Peter Steinberger	57b2d29761	feat(qa): add Mantis desktop browser smoke	2026-05-04 01:30:20 +01:00
Vincent Koc	31cafbb802	test(qa): add Slack live transport lane	2026-05-03 15:19:55 -07:00
Peter Steinberger	d4af125b52	feat(qa): add Mantis before-after CLI	2026-05-03 21:27:43 +01:00
Peter Steinberger	77a50db9ea	feat(qa): add Mantis Discord status reaction scenario (#76747) * feat(qa): add Mantis Discord status reaction scenario * fix(qa): retry Discord rate limits in Mantis runs * refactor(qa): reuse Discord API retry helper * fix(qa): import Discord API through package surface * fix(ci): generate Discord boundary declarations * fix(ci): keep xai boundary overrides stable	2026-05-03 17:00:06 +01:00
Peter Steinberger	0bf06e953f	feat: add Mantis Discord smoke runner (#76696) * docs: add Mantis QA system design * feat: add Mantis Discord smoke runner * fix: harden Mantis Discord smoke * fix: redact Mantis Discord artifacts * fix: satisfy Mantis redaction lint * fix: redact Mantis mismatch failures * test: avoid promise assertions in Mantis tests	2026-05-03 15:25:56 +01:00

94 Commits