Commit Graph

1126 Commits

Author SHA1 Message Date
Peter Steinberger
97f713f459 test(agents): isolate compaction token estimator mocks 2026-04-17 17:18:12 +01:00
Gustavo Madeira Santana
c66703300a Tests: narrow bootstrap routing coverage 2026-04-17 12:17:41 -04:00
Gustavo Madeira Santana
8de7aefe0a Tests: narrow embedded timeout wiring 2026-04-17 11:37:46 -04:00
Tak Hoffman
62703d8430 fix(bootstrap): workspace bootstrap prompt routing (#68000)
* fix(bootstrap): workspace bootstrap prompt routing

* Fix bootstrap routing edge cases

* Refine bootstrap mode routing and reset prompts

* Fix bootstrap workspace routing for embedded runs

* Fix embedded bootstrap compile follow-up

* Align bare reset bootstrap file access

* Honor reset override model for bootstrap gating

* Align chat reset bootstrap topology
2026-04-17 10:18:50 -05:00
Gustavo Madeira Santana
5775fe272a Docs: refresh agent instructions 2026-04-17 02:46:38 -04:00
Gustavo Madeira Santana
89706d323c Docs: add test performance guardrails 2026-04-17 02:23:49 -04:00
Gustavo Madeira Santana
e4c4f955b3 Tests: restore context-engine usage proof 2026-04-17 02:23:49 -04:00
Gustavo Madeira Santana
74c198f2e8 Tests: slim context engine runtime coverage 2026-04-17 02:23:49 -04:00
Peter Steinberger
35dcd06764 test: trim agent test hotspots 2026-04-17 07:15:27 +01:00
Rubén Cuevas
7e18c07e41 fix: dedupe repeated bootstrap truncation warnings (#67906) (thanks @rubencu)
* Agents: dedupe bootstrap truncation warnings

* Agents: normalize bootstrap warning cache bookkeeping

* fix(agents): scope bootstrap warning dedupe by workspace

* refactor(agents): simplify bootstrap warning wrapper

---------

Co-authored-by: Ayaan Zaidi <hi@obviy.us>
2026-04-17 10:54:15 +05:30
Peter Steinberger
acd86a06cd test: preserve tool helpers in embedded runner mocks 2026-04-17 02:57:18 +01:00
Tak Hoffman
81818df1b4 fix(startup): prioritize bootstrap on fresh sessions 2026-04-16 17:53:07 -05:00
Josh Lehman
a327b6750d fix: stabilize context engine prompt cache touches (#67767)
* fix: stabilize context engine prompt cache touches

* fix(changelog): document context-engine prompt cache touch stabilization
2026-04-16 11:53:42 -07:00
Peter Steinberger
1183832d4f fix: pin codex resume sandbox override 2026-04-16 17:31:41 +01:00
stain lu
c3c7a9953f fix: repair sanitized replay tool results before send (#67620) (thanks @stainlu)
* fix(agents): preserve native Anthropic tool IDs for hybrid providers

Fixes #66892

MiniMax and other hybrid providers use api.minimaxi.com/anthropic
(modelApi: anthropic-messages), which generates and expects native
Anthropic tool_call_ids in toolu_* format. The hybrid replay policy
(buildHybridAnthropicOrOpenAIReplayPolicy) applied strict
sanitization that stripped underscores from these IDs, causing
MiniMax to reject them with error 2013.

The native Anthropic provider already preserved these IDs via
preserveNativeAnthropicToolUseIds (added in 4613f121ad). This
commit enables the same flag for the hybrid anthropic-messages
branch, so toolu_* IDs pass through unsanitized while other
synthetic IDs still get strict cleanup.

* fix(agents): repair sanitized replay tool results before send

* fix: repair sanitized replay tool results before send (#67620) (thanks @stainlu)

* fix: preserve aborted-span tool results during replay sanitize (#67620) (thanks @stainlu)

---------

Co-authored-by: Ayaan Zaidi <hi@obviy.us>
2026-04-16 18:38:57 +05:30
Xan Torres
36ed36768c Agents/tool-loop: enable unknown-tool stream guard by default 2026-04-16 18:31:56 +05:30
Nimrod Gutman
90801ba400 fix(openai-codex): normalize stale transport metadata in resolution and discovery (#67635)
Merged via squash.

Supersedes:
- #66969 by @saamuelng601-pixel
- #67159 by @hclsys

Co-authored-by: saamuelng601-pixel <274746699+saamuelng601-pixel@users.noreply.github.com>
Co-authored-by: hclsys <7755017+hclsys@users.noreply.github.com>
2026-04-16 14:30:05 +03:00
Devin Robison
52ef42302e fix: tighten trusted tool media passthrough (#67303)
* fix: tighten trusted tool media passthrough

* changelog: tighten trusted tool media passthrough (#67303)

* address review: thread rawToolName into emitToolResultOutput and keep plugin-tool media passthrough

- Pass rawToolName through emitToolResultOutput params so the emit and
  collect calls no longer reference an out-of-scope identifier
  (ReferenceError on any verbose tool-output path).
- Widen builtinToolNames to all effective tool raw names for this run
  (core + bundled/trusted plugin tools), so plugin tools on the trusted
  media list still receive local MEDIA: passthrough. Admission-time
  client-tool conflict check keeps using the core-only set so unrelated
  plugin names do not spuriously reject client definitions; MEDIA
  passthrough is still gated by the raw-name set, so a client tool that
  normalize-collides with a plugin name cannot inherit its media trust.
- Add unit coverage for bundled-plugin raw-name passthrough and for
  case-variant plugin-name collisions.

* drop redundant String() casts flagged by oxlint no-useless-cast

The names from effectiveTools, client tool function names, and the
existingToolNames iterable are already typed as string, so wrapping them
in String(...) adds nothing and trips oxlint's no-useless-cast rule.
2026-04-15 13:12:44 -06:00
Peter Steinberger
4efd3c3d74 test: harden beta release gates 2026-04-15 19:28:49 +01:00
Tak Hoffman
4f00b76925 fix(context-window): Tighten context limits and bound memory excerpts (#67277)
* Tighten context limits and bound memory excerpts

* Align startup context defaults in config docs

* Align qmd memory_get bounds with shared limits

* Preserve qmd partial memory reads

* Fix shared memory read type import

* Add changelog entry for context bounds
2026-04-15 13:06:02 -05:00
Josh Lehman
75e7fc97f8 fix: preserve runtime token budget in deferred context-engine maintenance (#66820)
* fix(context-engine): pass deferred maintenance token budget

Thread tokenBudget through the after-turn runtime context so background context-engine maintenance reuses the real model context window instead of falling back to 128k. Also pass through a best-effort currentTokenCount from the latest call total and make the runtime context type explicit about both fields.

Regeneration-Prompt: |
  OpenClaw already passed the real context token budget into direct context-engine calls like afterTurn and assemble, but deferred maintain() reused only the runtimeContext object and that object did not carry tokenBudget. Lossless Claw therefore fell back to 128k during background maintenance, which made budget-trigger fire much more aggressively than the live model context warranted. Thread the real contextTokenBudget into buildAfterTurnRuntimeContext so deferred maintenance receives the same budget, and pass a straightforward best-effort currentTokenCount from the latest call total while the relevant data is already in scope. Keep the change additive, update the runtime-context type, and cover the background maintenance/runtime-context behavior with focused tests.

* fix(context-engine): use prompt usage for deferred maintenance
2026-04-14 15:30:37 -07:00
Peter Steinberger
54cf4cd857 test(agents): isolate shared subagent state 2026-04-14 22:49:31 +01:00
Chunyue Wang
4bc46ccfed fix(gateway): cap compaction reserve floor to context window for small models (#65671)
Fixes #65465. Caps the compaction reserveTokensFloor so that at least min(8 000, 50%) of the context window remains available for
  prompt content, preventing the default 20 000-token floor from exceeding the entire context window on small-context local models (e.g. Ollama
  16K). The cap is only applied when contextTokenBudget is provided, preserving backward compatibility.
2026-04-15 01:08:11 +08:00
Jordan Brant Baker
b4e4f96fd5 fix: restore embedded-run param forwarding (#62675) (thanks @hexsprite)
* fix: forward optional params dropped at the runEmbeddedAttempt call site

runEmbeddedPiAgent in pi-embedded-runner/run.ts hand-enumerates ~85 fields
when calling runEmbeddedAttempt({...}). Several optional fields on
RunEmbeddedPiAgentParams were added to the type and to attempt.ts (the
consumer) but were never wired at this specific call site. Because every
field is declared as ?: optional on EmbeddedRunAttemptParams, TypeScript
does not flag the missing fields and the attempt silently receives
undefined for each.

Four fields were affected:

- toolsAllow (#58504, #62569): cron's --tools allow-list. Persisted in
  jobs.json by the CLI, forwarded by cron/isolated-agent/run-executor.ts
  to runEmbeddedPiAgent, but dropped here. Result: provider request
  ships the full tool catalog on every cron run regardless of toolsAllow,
  defeating the ~95% input-token reduction documented in #58504 and the
  --tools restriction documented in docs/automation/cron-jobs.md:85.

- disableMessageTool: cron/isolated-agent/run-executor.ts:164 sets it
  from toolPolicy.disableMessageTool, derived at run.ts:110 as
  `params.deliveryContract === "cron-owned" ? true : params.deliveryRequested`.
  Every cron-owned delivery (the default per docs) is supposed to disable
  the message tool so the runner owns the final delivery path. Without
  forwarding, the agent can call messaging tools mid-cron and cause
  duplicate or wrong-channel sends.

- requireExplicitMessageTarget: cron/isolated-agent/run-executor.ts:163
  sets it from toolPolicy.requireExplicitMessageTarget. Has a fallback at
  attempt.ts:568-569 to `?? isSubagentSessionKey(params.sessionKey)`, so
  non-subagent crons silently get false instead of the intended value.

- internalEvents: agents/command/attempt-execution.ts:478 passes it via
  params.opts.internalEvents. Different caller path from cron, but the
  same drop point. Internal events array silently dropped before reaching
  the consumer at attempt.ts:1480.

The fix is four lines in the runEmbeddedAttempt({...}) call, immediately
after the bootstrapContextMode/bootstrapContextRunKind lines added by
PR #62264 (which fixed two more fields with the identical pattern at the
same call site).

A regression test (run.attempt-param-forwarding.test.ts) covers all six
optional fields shown to have been bitten by this class of bug at this
seam. The next ?: optional field added to RunEmbeddedPiAgentParams without
wiring at the runEmbeddedAttempt call site will fail a test instead of
silently shipping broken — addressing the missing-guardrail concern PR
#60776's writeup explicitly noted.

Verified locally: 6/6 forwarding tests pass, 258 pi-embedded-runner/run*
tests pass, 176 cron/isolated-agent tests pass, oxlint and tsgo deltas
versus origin/main are zero.

Fixes #62569

* test: distill param forwarding guardrails

* fix: restore embedded-run param forwarding (#62675) (thanks @hexsprite)

---------

Co-authored-by: Ayaan Zaidi <hi@obviy.us>
2026-04-14 21:36:14 +05:30
Vincent Koc
a7436c8b4a perf(plugins): split provider hook runtime seam 2026-04-14 17:01:05 +01:00
Vincent Koc
82364e901a test(codex): cover exact gpt-5.4 registry upgrades (#66454) 2026-04-14 11:05:45 +01:00
Vincent Koc
5a5ca6d62c feat(codex): add gpt-5.4-pro forward compat (#66453)
* feat(openai-codex): add gpt-5.4-pro forward-compat #63404

* feat(openai-codex): add gpt-5.4-pro forward-compat #63404

* openai-codex: use patch.cost when forward-compat falls back to normalizeModelCompat

* feat(codex): add gpt-5.4-pro forward compat

* fix(codex): reuse gpt-5.4 fallback for gpt-5.4-pro

---------

Co-authored-by: jepson-liu <jepsonliu@gmail.com>
2026-04-14 11:05:24 +01:00
Vincent Koc
b90d4ea3d7 fix(codex): canonicalize the gpt-5.4-codex alias (#66438)
* fix(codex): canonicalize the gpt-5.4-codex alias

* Update CHANGELOG.md
2026-04-14 09:56:58 +01:00
Bikkies
fecd4fcc55 fix(agents) context-engine: per-iteration ingest and assemble for compaction (#63555)
Merged via squash.

Prepared head SHA: 0d815fc190
Co-authored-by: Bikkies <29473797+Bikkies@users.noreply.github.com>
Co-authored-by: jalehman <550978+jalehman@users.noreply.github.com>
Reviewed-by: @jalehman
2026-04-14 01:42:14 -07:00
Vincent Koc
38de896419 fix(agents): honor embedded ollama timeouts (#66418) 2026-04-14 09:21:22 +01:00
Luke
0abe64a4ff Agents: clarify local model context preflight (#66236)
Merged via squash.

Prepared head SHA: 11bfaf15f6
Co-authored-by: ImLukeF <92253590+ImLukeF@users.noreply.github.com>
Co-authored-by: ImLukeF <92253590+ImLukeF@users.noreply.github.com>
Reviewed-by: @ImLukeF
2026-04-14 15:38:10 +10:00
Agustin Rivera
1c35795fce fix(slack): align interaction auth with allowlists (#66028)
* fix(slack): align interaction auth with allowlists

* fix(slack): address review followups

* fix(slack): preserve explicit owners with wildcard

* chore: append Claude comments resolution worklog

* fix(slack): harden interaction auth with default-deny, mandatory actor binding, and channel type validation

- Add interactiveEvent flag to authorizeSlackSystemEventSender for stricter
  interactive control authorization
- Default-deny when no allowFrom or channel users are configured for
  interactive events (block actions, modals)
- Require expectedSenderId for all interactive event types; block actions
  pass Slack-verified userId, modals pass metadata-embedded userId
- Reject ambiguous channel types for interactive events to prevent DM
  authorization bypass via channel-type fallback
- Add comprehensive test coverage for all new behaviors

* fix(slack): scope interactive owner/allowFrom enforcement to interactive paths only

* fix(slack): preserve no-channel interactive default

* Update context-engine-maintenance test

* chore: remove USER.md worklog artifact

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

* changelog: note Slack interactive auth allowlist alignment (#66028)

---------

Co-authored-by: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Co-authored-by: Devin Robison <drobison@nvidia.com>
2026-04-13 20:38:11 -06:00
Josh Lehman
14779eaeb0 fix: recover reasoning-only OpenAI turns (#66167)
* openclaw-11f.1: retry reasoning-only OpenAI turns

Regeneration-Prompt: |
  Patch the embedded runner so a signed reasoning-only assistant turn with no user-visible text is treated as recoverable instead of silently ending the run. Keep the change focused on the active OpenAI GPT-style path, retry the turn with an explicit visible-answer continuation instruction, and fall back to the existing incomplete-turn error handling only after retries are exhausted. Add regression coverage for the helper classification and for the outer run loop retry behavior, and keep unrelated provider behavior unchanged.

* openclaw-11f.1: address reasoning-only review feedback

Regeneration-Prompt: |
  Follow up on PR review feedback for the reasoning-only retry patch. Keep the fix narrow: move the retry limit into a named constant alongside the other retry-policy values, document why the limit is 2, and prevent reasoning-only auto-retries after any side effects so the runner falls back to the existing caution path instead of risking duplicate actions. Add regression coverage for the side-effect guard and the named limit behavior.

* openclaw-11f.1: drop local pebbles artifacts

Regeneration-Prompt: |
  Remove accidentally committed local pebbles tracker artifacts from the PR branch without changing runtime code. Keep the cleanup limited to deleting the tracked .pebbles files from version control, and rely on local git excludes for future pebbles activity so these files stay out of diffs.

* openclaw-11f.1: tighten reasoning-only retry guards

Regeneration-Prompt: |
  Follow up on the remaining review feedback for the reasoning-only retry path. Keep the fix narrow: do not auto-retry a reasoning-only turn when the assistant already terminated with stopReason error, and evaluate the OpenAI-specific retry guard against the provider/model metadata of the assistant turn that actually produced the partial output rather than the outer run configuration. Add regression coverage for both behaviors in the incomplete-turn runner tests.

* openclaw-11f.1: retry empty GPT turns once

Regeneration-Prompt: |
  Extend the embedded runner's GPT-style incomplete-turn recovery with a separate generic empty-response retry path. Keep it narrower than the existing reasoning-only recovery: one retry only, replay-safe only, no side effects, no assistant error turns, and scoped to the active assistant provider/model metadata. Add explicit warning logs when the empty-response retry triggers and when its single retry budget is exhausted, and add regression coverage for the success and exhaustion cases without changing broader provider fallback behavior.

* openclaw-11f.1: harden reasoning-only retry completion checks

Regeneration-Prompt: |
  Follow up on the remaining review feedback for the GPT-style recovery path. Keep the change narrow: only retry reasoning-only turns when there is no visible assistant answer yet, and if the reasoning-only retry budget is exhausted without any visible answer, surface the existing incomplete-turn error instead of treating reasoning-only payloads as a successful completion. Add focused regression coverage for both scenarios and preserve the adjacent empty-response retry behavior.

* openclaw-11f.1: preserve profile cooldown on retry exhaustion

Regeneration-Prompt: |
  Follow up on the final review comment for the GPT-style recovery path. Keep the change narrow: when the reasoning-only retry budget is exhausted and the run returns the incomplete-turn error early, preserve the same auth-profile cooldown behavior that the normal incomplete-turn branch already applies so multi-profile failover continues to work consistently. Verify the touched runner suites still pass.

* fix: recover GPT-style empty turns

Regeneration-Prompt: |
  Add the required changelog entry for the PR that hardens embedded GPT-style recovery of reasoning-only and empty-response turns. Keep the changelog update under ## Unreleased > ### Fixes, append-only, and include the PR number plus author attribution on the same line.
2026-04-13 16:58:28 -07:00
Bob
8c7f17b953 fix: count unknown-tool retries only when streamed (#66145)
Merged via squash.

Prepared head SHA: b79209cdb5
Co-authored-by: Bob <dutifulbob@gmail.com>
Reviewed-by: @osolmaz
2026-04-13 22:49:05 +02:00
Tak Hoffman
7c09ba70ef fix(trace command): Improve trace raw diagnostics and trace command UX (#66089)
* improve trace raw diagnostics and command acks

* address trace review feedback

* avoid sync transcript reads in raw trace

* preserve raw cli output for trace

* gate trace emission at reply time

* reflect raw trace mode in status surfaces
2026-04-13 14:26:57 -05:00
pashpashpash
8efbe8c1ed agents: stop strict mode from hijacking chat turns 2026-04-13 11:49:00 -07:00
Bob
74f2c4a56b fix: stop repeated unknown-tool loops (#65922)
Merged via squash.

Prepared head SHA: f352a270a6
Reviewed-by: @osolmaz
2026-04-13 17:42:11 +02:00
Tak Hoffman
dc5ed7edea fix(logging) add failover log source and target (#65955)
* clarify failover log source and target

* fix embedded runner final assistant raw text helper
2026-04-13 08:56:10 -05:00
EVA
c15b295a85 Run context-engine turn maintenance as idle-aware background work (#65233)
Merged via squash.

Prepared head SHA: e9f6c679ba
Co-authored-by: 100yenadmin <239388517+100yenadmin@users.noreply.github.com>
Co-authored-by: jalehman <550978+jalehman@users.noreply.github.com>
Reviewed-by: @jalehman
2026-04-13 06:50:22 -07:00
pashpashpash
c848ebc8ce agents: split GPT-5 prompt and retry behavior (#65597)
* agents: split GPT-5 prompt and retry behavior

* agents: fix GPT-5 review follow-ups

* agents: address GPT-5 review follow-ups

* agents: avoid replaying side-effectful GPT retries

* agents: mark subagent control as mutating

* agents: fail closed on single-action retries

* commands: stabilize channel legacy doctor migration test

* agents: narrow single-action retry promise trigger
2026-04-12 18:52:22 -07:00
Peter Steinberger
03d042d2b9 perf: mock hot agents import tests 2026-04-13 01:35:52 +01:00
EVA
26945ddb49 agents: GPT-5.4 runtime completion rollup (#65219)
* agents: auto-activate strict-agentic for GPT-5 and emit blocked-exit liveness

Closes two hard blockers on the GPT-5.4 parity completion gate:

1) Criterion 1 (no stalls after planning) is universal, but the pre-existing
   strict-agentic execution contract was opt-in only. Out-of-the-box GPT-5
   openai / openai-codex users who never set
   `agents.defaults.embeddedPi.executionContract` still got only 1
   planning-only retry and then fell through to the normal completion path
   with the plan-only text, i.e. they still stalled.

   Introduce `resolveEffectiveExecutionContract(...)` in
   src/agents/execution-contract.ts. Behavior:

   - supported provider/model (openai or openai-codex + gpt-5-family) AND
     explicit "strict-agentic" or unspecified → "strict-agentic"
   - supported provider/model AND explicit "default" → "default" (opt-out)
   - unsupported provider/model → "default" regardless of explicit value

   `isStrictAgenticExecutionContractActive` now delegates to the effective
   resolver so the 2-retry + blocked-state treatment applies by default to
   every GPT-5 openai/codex run. Explicit opt-out still works for users who
   intentionally want the pre-parity-program behavior.

2) Criterion 4 (replay/liveness failures are explicit, not silent
   disappearance) is violated by the strict-agentic blocked exit itself.
   Every other terminal return path in src/agents/pi-embedded-runner/run.ts
   sets `replayInvalid` + `livenessState` via `setTerminalLifecycleMeta`,
   but the strict-agentic exit at run.ts:1615 falls through without them.

   Add explicit `livenessState: "abandoned"` + `replayInvalid` (via the
   shared `resolveReplayInvalidForAttempt` helper) to that exit, plus a
   `setTerminalLifecycleMeta` call so downstream observers (lifecycle log,
   ACP bridge, telemetry) see the same explicit terminal state they see on
   every other exit branch.

Regressions added:

- `auto-enables update_plan for unconfigured GPT-5 openai runs`
- `respects explicit default contract opt-out on GPT-5 runs`
- `does not auto-enable update_plan for non-openai providers even when unconfigured`
- `emits explicit replayInvalid + abandoned liveness state at the strict-agentic blocked exit`
- `auto-activates strict-agentic for unconfigured GPT-5 openai runs and surfaces the blocked state`
- `respects explicit default contract opt-out on GPT-5 openai runs`

Local validation:

- pnpm test src/agents/openclaw-tools.update-plan.test.ts src/agents/pi-embedded-runner/run.incomplete-turn.test.ts src/agents/pi-embedded-runner.buildembeddedsandboxinfo.test.ts src/agents/system-prompt.test.ts src/agents/openclaw-tools.sessions.test.ts src/agents/pi-embedded-runner/run.overflow-compaction.test.ts

122/122 passing.

Refs #64227

* agents: address loop-6 review comments on strict-agentic contract

Triages all three loop-6 review comments on PR #64679:

1. Copilot: 'The strict-agentic blocked exit returns an error payload
   (isError: true) but sets livenessState to "abandoned". Elsewhere in
   the runner/lifecycle flow, error terminal states are treated as
   "blocked".' Verified: every other hardcoded error terminal branch in
   run.ts (role ordering at 1152, image size at 1206, schema error at
   1244, compaction timeout at 1128, aborted-with-no-payloads at 606)
   uses livenessState: "blocked". Match that convention at the
   strict-agentic blocked exit at 1634. Updated the 'emits explicit
   replayInvalid + abandoned liveness state' regression test to assert
   the new "blocked" value and renamed the assertion commentary.

2. Copilot: 'The JSDoc for resolveEffectiveExecutionContract says
   explicit "strict-agentic" in config always resolves to
   "strict-agentic", but the implementation collapses to "default"
   whenever the provider/mode is unsupported.' Rewrite the JSDoc to
   explicitly document the unsupported-provider collapse as the lead
   case (strict-agentic is a GPT-5-family openai/openai-codex-only
   runtime contract) before listing the supported-lane behavior matrix.
   No code change; this is a docstring-only clarification.

3. Greptile P2: 'Non-preferred Anthropic model constant. CLAUDE.md says
   to prefer sonnet-4.6 for Anthropic test constants.' Swap
   claude-opus-4-6 → claude-sonnet-4-6 in the two update_plan gating
   fixtures that assert non-openai providers don't auto-enable the
   planning tool. Behavior unchanged; model constant now matches repo
   testing guidance.

Local validation:

- pnpm test src/agents/openclaw-tools.update-plan.test.ts src/agents/pi-embedded-runner/run.incomplete-turn.test.ts

29/29 passing.

Refs #64227

* test: rename strict-agentic blocked-exit liveness regression to match blocked state

Addresses loop-7 Copilot finding on PR #64679: loop 6 changed the
assertion to livenessState === 'blocked' to match the rest of the
hard-error terminal branches in run.ts, but the test title still said
'abandoned liveness state', which made failures and test output
misleading. Rename the test title to match the asserted value. No
code change beyond the it(...) title.

Validation: pnpm test src/agents/pi-embedded-runner/run.incomplete-turn.test.ts
(19/19 pass).

Refs #64227

* agents: widen strict-agentic auto-activation to handle prefixed and variant GPT-5 model ids

* Align strict-agentic retry matching

* runtime: harden strict-agentic model matching

---------

Co-authored-by: Eva <eva@100yen.org>
2026-04-12 16:36:11 -07:00
pashpashpash
383c854313 CI: fix mainline regression blockers (#65269)
* MSTeams: align logger test expectations

* Gateway: fix CI follow-up regressions

* Config: refresh generated schema baseline

* VoiceCall: type webhook test doubles

* CI: retrigger blocker workflow

* CI: retrigger retry workflow

* Agents: fix current mainline agentic regressions

* Agents: type auth controller test mock

* CI: retrigger blocker validation

* Agents: repair OpenAI replay pairing order
2026-04-13 06:18:37 +09:00
Vincent Koc
fcae3bf943 fix(agents): preserve active-turn queued user prompts (#65478)
* fix(agents): preserve active-turn queued user prompts

* Update src/agents/pi-embedded-runner/run/attempt.prompt-helpers.ts

Co-authored-by: greptile-apps[bot] <165735046+greptile-apps[bot]@users.noreply.github.com>

* Update CHANGELOG.md

* Update CHANGELOG.md

---------

Co-authored-by: greptile-apps[bot] <165735046+greptile-apps[bot]@users.noreply.github.com>
2026-04-12 19:02:55 +01:00
Peter Steinberger
50fcdb36a8 fix: preserve prompt budget for small context models 2026-04-12 17:16:37 +01:00
Peter Steinberger
d6bb36730b fix(agents): stabilize subagent lifecycle 2026-04-12 16:07:46 +01:00
Nimrod Gutman
1fe14627a2 fix(tests): restore ci type and format checks 2026-04-12 14:13:57 +03:00
Vincent Koc
2389b3bdd3 test(agents): share fallback error fixtures 2026-04-12 11:11:41 +01:00
Vincent Koc
f2c7cec8de test(agents): share proxy stream wrapper fixture 2026-04-12 10:53:47 +01:00
Vincent Koc
a09e228e3e fix(agents): preserve openai replay ids and timeout hooks 2026-04-12 10:35:52 +01:00