Commit Graph

43 Commits

Author SHA1 Message Date
Simon Peck
6653193fdb fix(openai): avoid orphan Responses message id replay
Omit provider-owned OpenAI Responses assistant message ids unless the paired reasoning item was replayed immediately before the message. Preserve commentary/final_answer phase metadata so replay quality stays intact without sending orphan msg_* ids that OpenAI rejects.

Verification:
- node scripts/run-vitest.mjs src/agents/openai-transport-stream.test.ts src/agents/openai-responses.reasoning-replay.test.ts src/llm/providers/openai-responses-shared.test.ts
- node scripts/run-oxlint.mjs on touched files
- git diff --check
- live OpenAI Responses API proof with gpt-5.5
- autoreview clean
- PR CI clean on d6902ed1a0

Co-authored-by: latensified <880715+latensified@users.noreply.github.com>
2026-05-31 20:08:33 +01:00
Peter Steinberger
304e2c83c0 chore(lint): enable stricter oxlint rules 2026-05-31 18:59:02 +01:00
Peter Steinberger
f5eca3f84c chore(lint): enable object and reassignment rules 2026-05-31 09:32:52 +01:00
Peter Steinberger
4eba3e5d7d chore(lint): enable more readability rules 2026-05-31 07:38:33 +01:00
Peter Steinberger
deb7bc6539 chore(lint): enable readability lint rules 2026-05-31 07:17:57 +01:00
Peter Steinberger
00d8d7ead0 refactor: extract normalization core package
Extract shared normalization/coercion helpers into private @openclaw/normalization-core workspace package while preserving existing plugin SDK helper subpaths.\n\nAlso keeps direct normalization-core imports internal, wires UI/build/loader resolution, and replaces the slow PR network CodeQL lane with a fast added-line boundary scan while retaining full CodeQL for scheduled/manual runs.\n\nVerification: local moved tests, plugin SDK boundary tests, extension loader tests, agents-support shard, UI build/test, build artifacts, lint, workflow guards, autoreview, and GitHub CI passed on PR head 963d893715.
2026-05-31 01:33:00 +01:00
Peter Steinberger
4c33aaa86c refactor: unify OpenAI provider identity (#88451)
* refactor: unify OpenAI provider identity

* refactor: move legacy oauth sidecar doctor helpers

* test: align OpenAI fixtures after rebase

* test: clean OpenAI provider unification

* fix: finish OpenAI provider cleanup

* fix: finish OpenAI cleanup follow-through

* fix: finish OpenAI CI cleanup
2026-05-31 00:29:44 +01:00
Jerry-Xin
f59fc0d477 fix(gateway): strip spurious tool calls on non-tool stops
Treat OpenAI-compatible streaming tool deltas as executable only when the final finish reason is `tool_calls`. This prevents malformed provider streams from triggering spurious tool execution while preserving normal tool-call responses.

Fixes #85161.

Verification:
- Local OpenAI-compatible SSE replay: spurious stop stream `finalToolCalls: 0`; valid tool-call stream `finalToolCalls: 1`.
- `pnpm test src/agents/openai-transport-stream.test.ts src/llm/providers/openai-completions.test.ts -- --reporter=verbose`
- PR CI green on `cdc2fc34753492c862cae99b37f8cf3761d9bbed`.

Co-authored-by: 忻役 <xinyi@mininglamp.com>
Co-authored-by: Jerry-Xin <jerryxin0@gmail.com>
2026-05-30 23:14:16 +01:00
Thomas Krohnfuß
48980a0f41 fix(responses): drop orphaned assistant msg_* id when reasoning is dropped (#88019) (#88067)
* fix(responses): drop orphaned assistant msg_* id when reasoning is dropped (#88019)

When an Azure/OpenAI Responses session falls back to a non-Responses model
and later resumes a Responses model, sanitizeSessionHistory drops the
replayable reasoning (rs_*) item via downgradeOpenAIReasoningBlocks. The
paired assistant text block still carried its textSignature (the msg_* id),
so the transport replayed an assistant message item referencing msg_* with
no accompanying rs_* reasoning item. Azure Responses then rejected the next
turn with:

  400 Item 'msg_...' provided without its required 'reasoning' item: 'rs_...'

permanently poisoning the session.

Fix:
- downgradeOpenAIReasoningBlocks now strips the textSignature from a turn's
  text blocks whenever it drops a replayable reasoning item, so the msg_* id
  and its rs_* reasoning are removed together. The transport then falls back
  to a synthetic, unpaired id that Azure accepts.
- Because the synthetic fallback id is derived from the per-message msgIndex,
  multiple id-less text blocks in one assistant turn (e.g. commentary +
  final_answer) would collide on the same id. Make the fallback unique per
  text block in both Responses conversion sites
  (openai-transport-stream.ts and the shared llm provider
  openai-responses-shared.ts).

Tests:
- sanitize-session-history: model-switch path drops the paired msg_* id.
- embedded-agent-helpers: downgrade strips paired text signature(s).
- reasoning-replay: multiple id-less text blocks get distinct item ids.

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>

* fix(responses): preserve phase metadata and guard malformed blocks (#88019)

Address PR review feedback on the orphaned msg_* replay fix:

- Preserve Responses phase metadata: dropping the paired msg_* id when its
  rs_* reasoning is removed previously stripped the entire textSignature,
  which also discarded the phase (commentary/final_answer). Phased text now
  keeps a phase-only signature ({v:1,phase}) so commentary is not replayed
  as user-visible output. Both parseTextSignature copies (shared provider and
  embedded transport) now accept id-less phase-only signatures and fall back
  to a synthetic id while preserving the phase.
- Guard malformed content blocks: the post-drop map no longer dereferences
  contentBlock.type unconditionally, so a corrupted transcript with a
  null/primitive block can still sanitize through a model switch.

Tests:
- sanitize-session-history: phase metadata is preserved while the paired id
  is dropped on a model switch.
- reasoning-replay: id-less phase-only signatures get distinct synthetic ids
  and retain their phase.

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>

---------

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
2026-05-30 21:45:08 +01:00
Coder
adcac404e1 fix(llm): repair invalid streaming unicode escapes
Repair invalid \u escapes during streaming JSON parsing without changing valid Unicode escapes. Split oversized node CI doctor/infra shards and fix the restart test mock deadlock so PR CI stays under the no-output threshold.\n\nCo-authored-by: Coder <83845889+coder999999999@users.noreply.github.com>
2026-05-30 20:53:26 +01:00
Lellansin Huang
fe3c3ac5cd fix(gateway): forward stop sequences across providers
Forward OpenAI-compatible stop sequences from gateway chat completions through the agent runner into provider transports.

The gateway now normalizes stop into sampling extras, agent transports pass it into the shared stream options, and OpenAI, Anthropic, Mistral, Google, and Vertex-backed simple providers map it to their native request fields. Provider/gateway/agent coverage plus Crabbox live gateway proof verify valid stop dispatch and invalid stop rejection.

Refs #87920
2026-05-30 19:24:21 +01:00
Peter Steinberger
941e04e9f3 fix: clamp configured OpenAI-compatible output tokens 2026-05-30 14:46:30 +01:00
Peter Steinberger
d92b3b5cc2 refactor: unify OpenAI provider identity
Refactor OpenAI provider identity so OpenAI remains the canonical provider for API-key and OAuth-backed flows while legacy openai-codex state is doctor/migration-only.

Keeps OpenAI Codex Responses as an API/transport class rather than a provider identity, moves auth aliases through providerAuthAliases, updates doctor repair sequencing for old auth/profile state, and refreshes tests/docs around the canonical OpenAI behavior.
2026-05-30 11:48:41 +02:00
Peter Steinberger
be2c43ee3e fix(llm): cap codex retry delays 2026-05-30 02:17:30 -04:00
Peter Steinberger
aa0d6e1bca refactor: extract LLM core packages (#88117)
* refactor: extract llm core packages

* chore: drop generated llm package artifacts

* fix: align llm package export artifacts

* test: fix moving main CI expectations

* fix: align llm core subpath aliases

* fix: use llm package exports

* fix: stabilize llm package boundary artifacts

* fix: sync llm boundary path contract

* test: isolate crabbox provider env

* test: pin crabbox configured-provider cases

* test: apply crabbox lease provider override
2026-05-30 07:45:04 +02:00
Vincent Koc
75de853c37 refactor: share provider OAuth runtime helpers 2026-05-30 03:30:51 +02:00
Vincent Koc
c01a0f5588 refactor: share provider oauth runtime helpers 2026-05-30 01:31:10 +02:00
Peter Steinberger
fb8b9e9138 fix(copilot): cap oauth request timeouts 2026-05-29 15:54:28 -04:00
Peter Steinberger
a0c1f5962d fix(runtime): centralize safe timer timeout resolution 2026-05-29 15:36:38 -04:00
Peter Steinberger
e5845dd452 fix(codex): cap responses request timeout delays 2026-05-29 14:59:37 -04:00
Peter Steinberger
16cd7f9d3f fix(oauth): cap request abort timeout delays 2026-05-29 14:52:01 -04:00
Peter Steinberger
5a294cb2bd refactor: centralize safe expiry parsing 2026-05-29 12:38:11 -04:00
Peter Steinberger
a5717c34ab fix(github-copilot): validate oauth expiry values 2026-05-29 12:09:47 -04:00
Peter Steinberger
b67679fb73 fix(anthropic): validate oauth token lifetimes 2026-05-29 11:50:12 -04:00
Vincent Koc
a18bc56996 refactor: share google provider stream helpers 2026-05-29 14:02:26 +02:00
Vincent Koc
9366d0a873 refactor: share responses stream lifecycle 2026-05-29 13:38:55 +02:00
Peter Steinberger
f212176e91 fix(azure): preserve equals in deployment maps 2026-05-29 02:03:03 -04:00
Peter Steinberger
1188aa3b81 feat: add Claude Opus 4.8 support (#87890)
* feat: add Claude Opus 4.8 support

* fix: omit Vertex Opus sampling overrides

* fix: preserve Opus adaptive thinking levels

* fix: clamp Anthropic max effort support

* fix: use sha256 for QA mock call ids

* fix: type Anthropic transport test model metadata

* test: update PDF model default for Opus 4.8
2026-05-29 06:10:42 +01:00
Peter Steinberger
4ad9f0bdbb refactor: route node proxy agents through proxyline 2026-05-28 19:21:50 +01:00
Peter Steinberger
bcf354eac1 fix: parse codex retry headers strictly 2026-05-28 14:19:47 -04:00
Peter Steinberger
e205888fa7 fix: honor ipv6 no_proxy entries 2026-05-28 12:50:59 -04:00
Peter Steinberger
b6ef874220 fix: reject partial numeric parsing 2026-05-28 10:51:32 -04:00
Peter Steinberger
aab5410bd5 test: speed up slow test suite (#87611)
* test: speed up slow test suite

* test: preserve fake timer cleanup hooks

* test: avoid timeout readiness race

* test: satisfy reply test types

* test: restore runner and image coverage

* test: restore final media runner path

* test: make cli auth status fixture deterministic

* test: repair runtime alias fixtures
2026-05-28 13:20:19 +01:00
Dallin Romney
e805ffd2eb refactor(openai): centralize codex oauth flow (#87411) 2026-05-27 22:32:08 -07:00
Josh Avant
4a45a259ec fix(agents): preserve signed thinking payloads (#87493) 2026-05-27 21:57:41 -07:00
Vincent Koc
5846878924 fix(auth): honor OAuth login cancellation 2026-05-28 03:12:40 +02:00
Peter Steinberger
cee2a50fe6 chore(release): prepare 2026.5.28 2026-05-28 01:48:07 +01:00
Peter Steinberger
45e6af5e57 fix: reject partial numeric runtime values 2026-05-27 20:10:01 -04:00
Fermin Quant
3f9d2415ac fix(cron): stabilize isolated prompt cache affinity
Stabilize isolated cron prompt cache affinity by deriving a stable prompt cache key per cron job/session/model and forwarding it separately from the rotating run session id.

Thread the key through embedded runs, stream resolution, provider options, proxy forwarding, custom streams, and prompt-cache observability. Keep OpenAI-compatible payloads valid by using hyphen-safe keys, clamping upstream prompt_cache_key values, and omitting affinity when cache retention is disabled.

Thanks @ferminquant.

Co-authored-by: Fermin Quant <ferminquant@hotmail.com>
2026-05-28 00:31:19 +01:00
Vincent Koc
c20a055341 fix(provider): honor Codex response timeouts 2026-05-28 01:03:21 +02:00
Vincent Koc
7b967c5701 fix(oauth): bound GitHub Copilot requests 2026-05-27 23:27:27 +02:00
Vincent Koc
67277088eb fix(oauth): bound Codex token requests 2026-05-27 23:20:15 +02:00
Peter Steinberger
bb46b79d3c refactor: internalize OpenClaw agent runtime (#85341)
* refactor: extract agent core package

Introduce packages/agent-core as the OpenClaw-owned home for reusable agent loop, harness, session, prompt, and runtime dependency contracts.

* refactor: extract shared llm runtime

Move provider model registries, stream wrappers, OAuth helpers, and LLM utilities into src/llm with plugin-sdk barrels instead of depending on the old embedded runtime layout.

* refactor: remove pi runtime internals

Rename remaining Pi-shaped agent surfaces to OpenClaw agent runtime names, delete obsolete Pi docs and package graph checks, and add the third-party notice for incorporated code.

* refactor: tighten agent session runtime

Make agent-core/runtime dependencies explicit, consolidate compaction and session transcript helpers, and move model/session helpers behind OpenClaw-owned contracts.

* refactor: remove static model and pi auth paths

Drop static model catalogs and Pi auth bridges, move model/provider facts to manifest-owned runtime contracts, and harden internal embedded-agent utilities.

* refactor: remove legacy provider compat paths

* docs: remove agent parity notes

* fix: skip provider wildcard metadata parsing

* refactor: share session extension sdk loading

* refactor: inline acpx proxy error formatter

* refactor: fold edit recovery into edit tool

* fix: accept extension batch separator

* test: align startup provider plugin expectations

* fix: restore provider-scoped release discovery

* test: align static asset packaging expectations

* fix: run static provider catalogs during scoped discovery

* fix: add provider entry catalogs for scoped live discovery

* fix: load lightweight provider catalog entries

* fix: refresh provider-scoped plugin metadata

* fix: keep provider catalog entries on release live path

* fix: keep static manifest models in release live checks

* fix: harden release model discovery

* fix: reduce OpenAI live cache probe reasoning

* fix: disable OpenAI cache probe reasoning

* ci: extend OpenAI gateway live timeout

* fix: extend live gateway model budget

* fix: stabilize release validation regressions

* fix: honor provider aliases in model rows

* fix: stabilize release validation lanes

* fix: stabilize release memory qa

* ci: stabilize release validation lanes

* ci: prefer ipv4 for live docker node calls

* fix: restore shared tool-call stream wrapper

* ci: remove legacy pi test shard alias

* fix: clean up embedded agent test drift

* fix: stabilize runtime alias status

* fix: clean up embedded agent ci drift

* fix: restore release ci invariants

* fix: clean up post-rebase runtime drift

* fix: restore release ci checks

* fix: restore release ci after rebase

* fix: remove stale pi runtime path

* test: align compaction runtime expectations

* test: update plugin prerelease expectations

* fix: handle claude live tool approvals

* fix: stabilize release validation gates

* fix: finish agent runtime import

* test: finish post-rebase agent runtime mocks

* fix: keep codex compaction native

* fix: stabilize codex app-server hook tests

* test: isolate codex diagnostic active run

* test: remove codex diagnostic completion race

# Conflicts:
#	extensions/codex/src/app-server/run-attempt.test.ts

* ci: fix full release manifest performance run id

* refactor: narrow llm plugin sdk boundary

* chore: drop generated google boundary stamps

* fix: repair rebase fallout

* fix: clean up rebased runtime references

* fix: decode codex jwt payloads as base64url

* fix: preserve shipped pi runtime alias

* fix: add scoped sdk virtual modules

* fix: decode llm codex oauth jwt as base64url

* fix: avoid stale vertex adc negative cache

* fix: harden tool arg decoding and codeql path

* fix: keep vertex adc negative checks live

* refactor: consolidate codex jwt and edit helpers

* fix: await codex oauth node runtime imports

* fix: preserve sdk tool and notice contracts

* fix: preserve shipped compat config boundaries

* fix: align codex oauth callback host

* fix: terminate agent-core loop streams on failure

* fix: keep codex oauth callback alive during fallback

* ci: include session tools in critical codeql scans

* fix: keep Cloudflare Anthropic provider auth header

* docs: redirect legacy pi runtime pages

* fix: honor bundled web provider compat discovery

* fix: protect session output spill files

* fix: keep legacy agent dir env blocked

* fix: contain auto-discovered skill symlinks

* fix: harden agent core sdk proxy surfaces

* fix: restore approval reaction sdk compat

* fix: keep live docker runs bounded

* fix: keep codex oauth redirect host aligned

* fix: resolve post-rebase agent runtime drift

* fix: redact anthropic oauth parse failures

* fix: preserve responses strict tool shaping

* fix: repair agent runtime rebase cleanup

* docs: redirect retired parity pages

* fix: bound auto-discovered resources to roots

* fix: repair post-rebase agent test drift

* fix: preserve bundled provider allowlist migration

* fix: preserve manifest-owned provider aliases

* fix: declare photon image dependency

* fix: keep provider headers out of proxy body

* fix: preserve shipped env aliases

* fix: refresh control ui i18n generated state

* fix: quote read fallback paths

* fix: preview edits through configured backend

* test: satisfy core test typecheck

* fix: preserve ZAI usage auth fallback

* test: repair codex diagnostic test

* fix: repair agent runtime rebase drift

* test: finish embedded runner import rename

* fix: repair agent runtime rebase integrations

* test: align compaction oauth fallback expectations

* fix: allow sdk-auth session models

* fix: update doctor tool schema import

* fix: preserve bedrock plugin region

* fix: stream harmony-like prose immediately

* ci: include session runtime in codeql shards

* fix: repair latest rebase integrations

* fix: honor explicit codex websocket transport

* fix: keep openai-compatible credentials provider-scoped

* fix: refresh sdk api baseline after rebase

* fix: route cli runtime aliases through openclaw harness

* test: rename stale harness mock expectation

* test: rename embedded agent overflow calls

* test: clean embedded auth test wording

* test: use openclaw stream types in deepinfra cache test

* fix: refresh sdk api baseline on latest main

* fix: honor bundled discovery compat allowlists

* fix: refresh sdk api baseline after latest rebase

* fix: remove stale rebase imports

* test: rename stale model catalog mock

* test: mock renamed doctor runtime modules

* fix: map canonical kimi env auth

* fix: use internal model registry in bench script

* fix: migrate deepinfra provider catalog entry

* fix: enforce builtin tool suppression

* fix: route compaction auth and proxy payloads safely

* refactor: prune unused llm registry leftovers

* test: update codex hooks session import

* test: fix model picker ci coverage

* test: align model picker auth mock types
2026-05-27 19:24:04 +01:00