openclaw

mirror of https://github.com/openclaw/openclaw.git synced 2026-06-28 08:33:38 +00:00

Author	SHA1	Message	Date
Dallin Romney	e17d111990	fix: require all taxonomy coverage ids (#94296)	2026-06-17 16:38:14 -07:00
Vincent Koc	e2b6753b87	fix(qa-lab): bound credential payload reads	2026-06-17 11:59:55 +02:00
Vincent Koc	c32ba171db	fix(qa): fail unsuccessful self-checks	2026-06-16 01:32:47 +02:00
Dallin Romney	e32929e12c	Add slim evidence mode for QA profile evidence (#93179) * test(qa): compact profile evidence execution metadata * docs(qa): document compact profile evidence * test(qa): support compact evidence mode * test(qa): rename compact evidence mode to slim * docs(qa): trim slim evidence wording * fix(qa): avoid commander runtime import	2026-06-15 14:50:40 -07:00
Dallin Romney	1f8c4d3958	simplify QA evidence profile and mappings/coverage shape (#93153) * test(qa): simplify evidence coverage shape * test(qa): collapse evidence scorecard metadata * test(qa): document evidence schema version	2026-06-14 22:26:58 -07:00
Dallin Romney	3d38c9a633	test(qa): embed profile scorecard evidence (#93109) * test(qa): embed profile scorecard evidence * test(qa): fix profile runner return lint * test(qa): satisfy suite command lint return	2026-06-14 20:51:38 -07:00
Dallin Romney	e8db9c3bc0	test(qa): add qa run --profile and unified output summary/evidence (#91587 ) * test(qa): add mapped qa run profiles * test(qa): document mapped profile runner * test(qa): validate run profiles from mapping * test(qa): preserve root profile parsing * test(qa): simplify taxonomy profile dispatch * test(qa): align tool coverage CLI expectation * test(qa): fix profile dispatch fixture type * test(qa): share profile runner option types * test(qa): split shared cli runner options * test(qa): unify profile suite artifacts * fix(qa): filter profile scenarios by provider lane * test(qa): drop native scenario subreports * fix(qa): keep native log refs repo-relative * fix(cli): preserve qa run root profile parsing * fix(qa): avoid qa profile flag collision * fix(qa): reject profile flags without qa profile	2026-06-14 18:08:42 -07:00
Vincent Koc	b1caba5906	test(qa): align tool coverage CLI expectation	2026-06-14 16:11:00 +08:00
Dallin Romney	561b293c7a	Run Vitest and Playwright scenarios from qa suite (#92606 ) * test(qa): run vitest and playwright scenarios from qa suite * fix(qa): harden scenario suite dispatch * refactor(qa): share scenario path utilities * refactor(qa): share test file scenario runner * refactor(qa): route test file scenarios through suite runtime * refactor(qa): use explicit suite runtime result kind * test(qa): write suite evidence artifact * refactor(qa): clarify suite execution dispatch * fix(qa): keep test-file scenarios out of flow-only runners * refactor(qa): export mixed scenario suite runner	2026-06-13 01:06:10 -07:00
Dallin Romney	4809ac70fa	Add QA evidence artifact output (#91484 ) * feat: add qa evidence summary normalization * chore: rename qa evidence target environment * chore: align qa evidence profile terminology * chore: align qa evidence summary fields * chore: add qa evidence taxonomy ref * test: remove stale multipass evidence example * test(qa): normalize vitest and playwright evidence * test(qa): slim evidence summary metadata * test(qa): clarify evidence summary inputs * test(qa): rename scenario specs in evidence flow * test(qa): treat evidence profiles as mapping strings * test(qa): use neutral evidence test identity * test(qa): nest evidence summary joins * refactor(qa): normalize live evidence summaries * fix(qa): accept normalized telegram rtt summaries * fix(qa): normalize evidence lane summaries * fix(qa): align evidence summaries with requirements * refactor(qa): tighten evidence summary builders * refactor(qa): restore standard evidence ids * fix(qa): keep legacy summaries out of rtt evidence * refactor(qa): make package evidence provenance explicit * test(qa): keep script tests out of qa lab internals * refactor(qa): rename scenario evidence definitions * refactor(qa): clean evidence summary wording * test(qa): fix evidence summary test inputs * refactor(qa): simplify evidence identity fields * refactor(qa): tighten evidence summary inputs * refactor(qa): rename evidence artifact	2026-06-12 16:12:58 -07:00
Vincent Koc	c7b01cf201	test(release): stabilize qa runtime parity gate	2026-06-09 01:02:24 +02:00
Vincent Koc	3dc6ac3802	fix(qa): fail closed on skipped suite summaries	2026-06-07 11:20:25 +02:00
Vincent Koc	e12141fa9f	fix(qa): gate live transport exits on summaries	2026-06-07 02:53:50 +02:00
Vincent Koc	6d2566682a	fix(qa): fail suite on summary scenario failures	2026-06-07 02:49:02 +02:00
Peter Steinberger	58912f8fd8	docs: document channel extension sources	2026-06-04 21:59:00 -04:00
Peter Steinberger	1188aa3b81	feat: add Claude Opus 4.8 support (#87890) * feat: add Claude Opus 4.8 support * fix: omit Vertex Opus sampling overrides * fix: preserve Opus adaptive thinking levels * fix: clamp Anthropic max effort support * fix: use sha256 for QA mock call ids * fix: type Anthropic transport test model metadata * test: update PDF model default for Opus 4.8	2026-05-29 06:10:42 +01:00
Peter Steinberger	a8dec44f56	fix(release): accept openclaw qa runtime alias	2026-05-28 21:22:17 +01:00
Peter Steinberger	cd80b4efca	fix: parse qa cli integers strictly	2026-05-28 14:36:47 -04:00
Peter Steinberger	bb46b79d3c	refactor: internalize OpenClaw agent runtime (#85341 ) * refactor: extract agent core package Introduce packages/agent-core as the OpenClaw-owned home for reusable agent loop, harness, session, prompt, and runtime dependency contracts. * refactor: extract shared llm runtime Move provider model registries, stream wrappers, OAuth helpers, and LLM utilities into src/llm with plugin-sdk barrels instead of depending on the old embedded runtime layout. * refactor: remove pi runtime internals Rename remaining Pi-shaped agent surfaces to OpenClaw agent runtime names, delete obsolete Pi docs and package graph checks, and add the third-party notice for incorporated code. * refactor: tighten agent session runtime Make agent-core/runtime dependencies explicit, consolidate compaction and session transcript helpers, and move model/session helpers behind OpenClaw-owned contracts. * refactor: remove static model and pi auth paths Drop static model catalogs and Pi auth bridges, move model/provider facts to manifest-owned runtime contracts, and harden internal embedded-agent utilities. 
* refactor: remove legacy provider compat paths * docs: remove agent parity notes * fix: skip provider wildcard metadata parsing * refactor: share session extension sdk loading * refactor: inline acpx proxy error formatter * refactor: fold edit recovery into edit tool * fix: accept extension batch separator * test: align startup provider plugin expectations * fix: restore provider-scoped release discovery * test: align static asset packaging expectations * fix: run static provider catalogs during scoped discovery * fix: add provider entry catalogs for scoped live discovery * fix: load lightweight provider catalog entries * fix: refresh provider-scoped plugin metadata * fix: keep provider catalog entries on release live path * fix: keep static manifest models in release live checks * fix: harden release model discovery * fix: reduce OpenAI live cache probe reasoning * fix: disable OpenAI cache probe reasoning * ci: extend OpenAI gateway live timeout * fix: extend live gateway model budget * fix: stabilize release validation regressions * fix: honor provider aliases in model rows * fix: stabilize release validation lanes * fix: stabilize release memory qa * ci: stabilize release validation lanes * ci: prefer ipv4 for live docker node calls * fix: restore shared tool-call stream wrapper * ci: remove legacy pi test shard alias * fix: clean up embedded agent test drift * fix: stabilize runtime alias status * fix: clean up embedded agent ci drift * fix: restore release ci invariants * fix: clean up post-rebase runtime drift * fix: restore release ci checks * fix: restore release ci after rebase * fix: remove stale pi runtime path * test: align compaction runtime expectations * test: update plugin prerelease expectations * fix: handle claude live tool approvals * fix: stabilize release validation gates * fix: finish agent runtime import * test: finish post-rebase agent runtime mocks * fix: keep codex compaction native * fix: stabilize codex app-server hook tests * test: 
isolate codex diagnostic active run * test: remove codex diagnostic completion race # Conflicts: # extensions/codex/src/app-server/run-attempt.test.ts * ci: fix full release manifest performance run id * refactor: narrow llm plugin sdk boundary * chore: drop generated google boundary stamps * fix: repair rebase fallout * fix: clean up rebased runtime references * fix: decode codex jwt payloads as base64url * fix: preserve shipped pi runtime alias * fix: add scoped sdk virtual modules * fix: decode llm codex oauth jwt as base64url * fix: avoid stale vertex adc negative cache * fix: harden tool arg decoding and codeql path * fix: keep vertex adc negative checks live * refactor: consolidate codex jwt and edit helpers * fix: await codex oauth node runtime imports * fix: preserve sdk tool and notice contracts * fix: preserve shipped compat config boundaries * fix: align codex oauth callback host * fix: terminate agent-core loop streams on failure * fix: keep codex oauth callback alive during fallback * ci: include session tools in critical codeql scans * fix: keep Cloudflare Anthropic provider auth header * docs: redirect legacy pi runtime pages * fix: honor bundled web provider compat discovery * fix: protect session output spill files * fix: keep legacy agent dir env blocked * fix: contain auto-discovered skill symlinks * fix: harden agent core sdk proxy surfaces * fix: restore approval reaction sdk compat * fix: keep live docker runs bounded * fix: keep codex oauth redirect host aligned * fix: resolve post-rebase agent runtime drift * fix: redact anthropic oauth parse failures * fix: preserve responses strict tool shaping * fix: repair agent runtime rebase cleanup * docs: redirect retired parity pages * fix: bound auto-discovered resources to roots * fix: repair post-rebase agent test drift * fix: preserve bundled provider allowlist migration * fix: preserve manifest-owned provider aliases * fix: declare photon image dependency * fix: keep provider headers out of 
proxy body * fix: preserve shipped env aliases * fix: refresh control ui i18n generated state * fix: quote read fallback paths * fix: preview edits through configured backend * test: satisfy core test typecheck * fix: preserve ZAI usage auth fallback * test: repair codex diagnostic test * fix: repair agent runtime rebase drift * test: finish embedded runner import rename * fix: repair agent runtime rebase integrations * test: align compaction oauth fallback expectations * fix: allow sdk-auth session models * fix: update doctor tool schema import * fix: preserve bedrock plugin region * fix: stream harmony-like prose immediately * ci: include session runtime in codeql shards * fix: repair latest rebase integrations * fix: honor explicit codex websocket transport * fix: keep openai-compatible credentials provider-scoped * fix: refresh sdk api baseline after rebase * fix: route cli runtime aliases through openclaw harness * test: rename stale harness mock expectation * test: rename embedded agent overflow calls * test: clean embedded auth test wording * test: use openclaw stream types in deepinfra cache test * fix: refresh sdk api baseline on latest main * fix: honor bundled discovery compat allowlists * fix: refresh sdk api baseline after latest rebase * fix: remove stale rebase imports * test: rename stale model catalog mock * test: mock renamed doctor runtime modules * fix: map canonical kimi env auth * fix: use internal model registry in bench script * fix: migrate deepinfra provider catalog entry * fix: enforce builtin tool suppression * fix: route compaction auth and proxy payloads safely * refactor: prune unused llm registry leftovers * test: update codex hooks session import * test: fix model picker ci coverage * test: align model picker auth mock types	2026-05-27 19:24:04 +01:00
Peter Steinberger	a1fe86a0ff	feat(qa): add coverage scenario matching	2026-05-25 10:22:51 +01:00
Vincent Koc	7f05be041e	fix(diagnostics): harden observability exports and smokes (#85371) * test(diagnostics): widen observability smokes * fix(diagnostics): sanitize observability exports * docs(diagnostics): format otel export docs	2026-05-23 15:27:43 +08:00
Peter Steinberger	0def3e20e4	test(release): align prerelease validation	2026-05-22 14:43:36 +01:00
Firas Alswihry	229323d37a	test(qa-lab): add personal failure recovery scenario	2026-05-21 23:22:35 +08:00
Vincent Koc	cf0657852f	feat(qa-lab): add jsonl replay harness	2026-05-21 23:03:51 +08:00
Firas Alswihry	a9eaf0c993	test(qa-lab): add personal no-fake-progress scenario (#83824 ) Summary: - The PR adds a personal-agent QA-Lab no-fake-progress scenario, registers it in the personal-agent pack, teaches mock-openai the scripted path, and updates focused tests, docs, and changelog. - Reproducibility: not applicable. This PR adds QA coverage rather than reporting a current-main bug; the branch supplies concrete after-patch QA-Lab/mock-openai commands and copied pass output. Automerge notes: - PR branch already contained follow-up commit before automerge: test(qa-lab): add personal no-fake-progress scenario Validation: - ClawSweeper review passed for head `95d2e46288`. - Required merge gates passed before the squash merge. Prepared head SHA: `95d2e46288` Review: https://github.com/openclaw/openclaw/pull/83824#issuecomment-4483439200 Co-authored-by: Firas Alswihry <itzfiras@gmail.com> Co-authored-by: clawsweeper <274271284+clawsweeper[bot]@users.noreply.github.com> Co-authored-by: clawsweeper[bot] <274271284+clawsweeper[bot]@users.noreply.github.com> Approved-by: takhoffman Co-authored-by: takhoffman <781889+takhoffman@users.noreply.github.com>	2026-05-19 01:16:00 +00:00
Peter Steinberger	59defa3e71	ci(release): fix beta validation gates	2026-05-19 01:05:52 +01:00
Firas Alswihry	94c012b2ec	test(qa-lab): add personal task followthrough scenario	2026-05-18 14:35:03 +08:00
Vincent Koc	1300b22630	fix(qa-lab): classify runtime token efficiency	2026-05-18 11:09:08 +08:00
Vincent Koc	4dec9679e6	fix(qa-lab): gate missing runtime tool coverage	2026-05-18 11:00:20 +08:00
Vincent Koc	58e1351863	fix(qa-lab): hard gate runtime tool coverage	2026-05-18 10:05:04 +08:00
Vincent Koc	79212f9869	feat(qa-lab): select runtime parity tiers	2026-05-18 00:21:13 +08:00
Vincent Koc	9249e13891	test(qa-lab): sync personal pack expectation	2026-05-17 23:56:18 +08:00
Vincent Koc	1926982c4c	fix(qa-lab): refresh parity model targets	2026-05-17 23:12:26 +08:00
Vincent Koc	37dcf385e5	fix(qa): expose codex tools for runtime parity	2026-05-17 17:20:12 +08:00
Vincent Koc	1f9d8c1e9d	fix(qa-lab): wire tool coverage report command	2026-05-17 17:12:10 +08:00
Vincent Koc	da8afe359d	feat(qa-lab): add scenario pack selector	2026-05-17 09:23:48 +08:00
Vincent Koc	f345b54d04	test(qa-lab): add runtime parity axis	2026-05-17 03:32:50 +08:00
Peter Steinberger	2d0c3750d8	test: guard extension provider helpers	2026-05-11 21:01:56 +01:00
Peter Steinberger	5d4113a2c9	test: tighten qa cli runtime assertions	2026-05-10 20:32:50 +01:00
Ayaan Zaidi	5cd4996205	feat(qa-lab): list telegram live scenarios	2026-05-08 16:14:42 +05:30
Vincent Koc	30e259b9c5	test(qa-lab): accept native Windows paths	2026-05-04 09:20:03 -07:00
Vincent Koc	a6dfaaeb4e	test(plugins): add gateway gauntlet	2026-04-28 16:44:10 -07:00
Peter Steinberger	cc7a209982	fix: normalize QA model refs for parity gates	2026-04-28 23:01:58 +01:00
Peter Steinberger	6b3e4b88d6	test: update QA parity fixtures for GPT-5.5	2026-04-25 18:05:28 +01:00
Peter Steinberger	fa22ca8883	test(qa): cover stale subagent child links	2026-04-25 05:59:42 +01:00
Peter Steinberger	903308dbf2	fix: stabilize qa lab mock suite	2026-04-24 02:46:33 +01:00
Peter Steinberger	ff56a9d41b	test(openai): prefer canonical GPT refs	2026-04-23 20:47:39 +01:00
Peter Steinberger	cd5bc2fc93	test(openai): cover GPT-5.5 defaults	2026-04-23 20:19:15 +01:00
Val Alexander	dab46a7e98	qa: harden parity gate execution (#70045)	2026-04-22 03:08:25 -05:00
Josh Avant	d5b326523f	qa-lab: make live lanes CI-ready for v1 E2E automation (#69122) * qa-lab: harden CI defaults and failure semantics for live lanes * qa-lab: add unit tests for suite progress logging defaults * qa-lab: cover malformed multipass summary edge cases * qa-lab: share suite summary failure counting helper * qa-lab: test allow-failures parse wiring and sanitize progress ids * fix: note qa CI live-lane defaults in changelog (#69122) (thanks @joshavant)	2026-04-19 21:13:27 -05:00

77 Commits