Speed up the first global chat send in the Control UI by letting the safe literal-global startup refresh use the fresh hello default before agents.list finishes, while keeping stale carried or cached agent ids out of that fast path. Also adds timing markers for chat history/send and Gateway chat.send for the next latency pass.
Add telemetry for the first visible assistant output after a chat send in the Control UI, plus Gateway diagnostics correlation attributes for chat.send dispatch spans. Verified with focused UI/Gateway tests, tsgo, oxlint, autoreview, PR checks, and Testbox-through-Crabbox check:changed.
* fix(diagnostics): clear embedded-run activity when recovery declares lane idle
Stuck-session recovery transitions a lane to idle via the recovery
coordinator, but it previously mutated only the session-state store. When an aborted
embedded run was removed without markDiagnosticEmbeddedRunEnded, the
activity store kept hasActiveEmbeddedRun set, so the liveness sweep
reported idle/embedded_run and isIdleQueuedRecoverableSessionStall
re-triggered recovery indefinitely.
Reconcile the activity store from the authoritative idle declaration by
clearing the session's embedded-run owners. The existing generation
guard already excludes any newer run that re-armed activity, so a live
requeued run is preserved.
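A minimal sketch of the generation-guarded clear described above. The store shape, the `ActivityStore` class, and every method name in it are hypothetical stand-ins for illustration; the real activity store and its guard are not shown in this message and may differ.

```typescript
// Hypothetical owner record: each embedded run is stamped with the
// generation at which it armed activity for the session.
type EmbeddedRunOwner = { runId: string; generation: number };

// Hypothetical in-memory activity store (not the project's actual API).
class ActivityStore {
  private owners = new Map<string, EmbeddedRunOwner[]>();
  private generation = 0;

  // Arm activity for a session and return the generation assigned to the run.
  markEmbeddedRunStarted(sessionKey: string, runId: string): number {
    const generation = ++this.generation;
    const list = this.owners.get(sessionKey) ?? [];
    list.push({ runId, generation });
    this.owners.set(sessionKey, list);
    return generation;
  }

  // Latest generation handed out; recovery can capture this as its cutoff.
  currentGeneration(): number {
    return this.generation;
  }

  hasActiveEmbeddedRun(sessionKey: string): boolean {
    return (this.owners.get(sessionKey)?.length ?? 0) > 0;
  }

  // Clear only owners armed at or before the cutoff. A run that re-armed
  // activity after recovery captured the cutoff has a newer generation
  // and is left in place.
  clearEmbeddedRunsForSession(sessionKey: string, cutoffGeneration: number): void {
    const kept = (this.owners.get(sessionKey) ?? []).filter(
      (owner) => owner.generation > cutoffGeneration,
    );
    if (kept.length === 0) {
      this.owners.delete(sessionKey);
    } else {
      this.owners.set(sessionKey, kept);
    }
  }
}
```

In this sketch, recovery would capture `currentGeneration()` when it declares the lane idle and pass it as the cutoff, so an aborted run's owner is dropped while a requeued run started afterwards keeps `hasActiveEmbeddedRun` true.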
* fix(diagnostics): reconcile tool/model activity on authoritative idle cleanup
clearDiagnosticEmbeddedRunActivityForSession (renamed from
clearDiagnosticEmbeddedRunsForSession) now clears the aborted run's tool and
model markers alongside the embedded-run owners, matching the default
markDiagnosticEmbeddedRunEnded teardown. Clearing only the owner set left the
lane idle but with orphaned tool/model activity, which
isIdleQueuedRecoverableSessionStall still treats as recoverable while work is
queued, so the liveness sweep kept re-triggering recovery instead of converging.
Adds regression cases with stale tool and model markers plus queued work.
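The non-convergence can be illustrated with a small sketch. The `SessionActivity` shape, the predicate body, and both clear helpers below are assumptions inferred from the description above, not the actual isIdleQueuedRecoverableSessionStall or clearDiagnosticEmbeddedRunActivityForSession implementations.

```typescript
// Hypothetical per-session activity markers.
type SessionActivity = {
  embeddedRunOwners: Set<string>;
  activeTools: Set<string>;
  activeModelCalls: Set<string>;
};

function hasAnyActivity(activity: SessionActivity): boolean {
  return (
    activity.embeddedRunOwners.size > 0 ||
    activity.activeTools.size > 0 ||
    activity.activeModelCalls.size > 0
  );
}

// Assumed predicate: an idle lane with queued work that still reports
// activity looks like a recoverable stall.
function looksLikeIdleQueuedStall(
  laneState: "idle" | "busy",
  queuedCount: number,
  activity: SessionActivity,
): boolean {
  return laneState === "idle" && queuedCount > 0 && hasAnyActivity(activity);
}

// Earlier behaviour (sketch): only the owner set is cleared.
function clearOwnersOnly(activity: SessionActivity): void {
  activity.embeddedRunOwners.clear();
}

// Behaviour after this fix (sketch): tool and model markers are cleared too.
function clearAllRunActivity(activity: SessionActivity): void {
  activity.embeddedRunOwners.clear();
  activity.activeTools.clear();
  activity.activeModelCalls.clear();
}
```

Under these assumptions, a session with a stale tool marker and queued work still satisfies the predicate after `clearOwnersOnly`, so each sweep would trigger recovery again; after `clearAllRunActivity` the predicate is false and the sweep converges.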
* test(phone-control): align service mocks with keyed store API
* fix(diagnostics): preserve rearmed recovery activity
* fix(diagnostics): clear recovered owner markers
* fix(diagnostics): clear recovered embedded work keys
* fix(diagnostics): ignore stale same-key recovery owners
* fix(diagnostics): preserve same-session recovery rearm
* fix(diagnostics): ignore stale queued activity starts
* fix(diagnostics): record recovery cutoffs for empty activity
* fix(diagnostics): preserve fresh recovery markers
* fix(diagnostics): prune stale activity before fresh recovery block
---------
Co-authored-by: Peter Steinberger <steipete@gmail.com>
* fix(agents): clear legacy auto fallback pins
* fix(agents): repair legacy auto-fallback test mock and tighten review feedback
Add hasLegacyAutoFallbackWithoutOrigin to the live-model-switch agent-scope mock so the agents-core lane runs, simplify the redundant hasSessionModelOverride guard, use a single source of truth for the legacy-pin staleness check with a comment on the load-bearing modelKey guard, and add preservation/edge-case/guard regression coverage. Rename the misleading primary-probe agent test.
* style(agents): format rebased fallback fix
---------
Co-authored-by: Peter Steinberger <steipete@gmail.com>