Implements a strict format validation for the agentId parameter in
sessions_spawn to fully resolve the ghost workspace creation bug reported
in #31311.
This fix introduces a regex format gate at the entry point to
immediately reject malformed agentId strings. This prevents error
messages (e.g., 'Agent not found: xyz') or path traversals from being
mangled by normalizeAgentId into seemingly valid IDs (e.g.,
'agent-not-found--xyz'), which was the root cause of the bug.
The validation is placed before normalization and does not interfere
with existing workflows, including delegating to agents that are
allowlisted but not globally configured.
New, non-redundant tests are added to
sessions-spawn.allowlist.test.ts to cover format validation and
ensure no regressions in allowlist behavior.
Fixes#31311
When a Chrome relay targetId becomes stale between snapshot and action,
the browser tool now retries once without targetId so the relay falls
back to the currently attached tab.
Drop the unknown recovered field from the test mock return value
to satisfy tsc strict checking against BrowserActResponse.
* test: fix typing and test fixture issues
* Fix type-test harness issues from session routing and mock typing
* Add routing regression test for session.mainKey precedence
* feat: add PDF analysis tool with native provider support
New `pdf` tool for analyzing PDF documents with model-powered analysis.
Architecture:
- Native PDF path: sends raw PDF bytes directly to providers that support
inline document input (Anthropic via DocumentBlockParam, Google Gemini
via inlineData with application/pdf MIME type)
- Extraction fallback: for providers without native PDF support, extracts
text via pdfjs-dist and rasterizes pages to images via @napi-rs/canvas,
then sends through the standard vision/text completion path
Key features:
- Single PDF (`pdf` param) or multiple PDFs (`pdfs` array, up to 10)
- Page range selection (`pages` param, e.g. "1-5", "1,3,7-9")
- Model override (`model` param) and file size limits (`maxBytesMb`)
- Auto-detects provider capability and falls back gracefully
- Same security patterns as image tool (SSRF guards, sandbox support,
local path roots, workspace-only policy)
Config (agents.defaults):
- pdfModel: primary/fallbacks (defaults to imageModel, then session model)
- pdfMaxBytesMb: max PDF file size (default: 10)
- pdfMaxPages: max pages to process (default: 20)
Model catalog:
- Extended ModelInputType to include "document" alongside "text"/"image"
- Added modelSupportsDocument() capability check
Files:
- src/agents/tools/pdf-tool.ts - main tool factory
- src/agents/tools/pdf-tool.helpers.ts - helpers (page range, config, etc.)
- src/agents/tools/pdf-native-providers.ts - direct API calls for Anthropic/Google
- src/agents/tools/pdf-tool.test.ts - 43 tests covering all paths
- Modified: model-catalog.ts, openclaw-tools.ts, config schema/types/labels/help
* fix: prepare pdf tool for merge (#31319) (thanks @tyler6204)
When an OAuth auth profile returns HTTP 403 with permission_error
(e.g. expired plan), the error was not matched by the authPermanent
patterns. This caused the profile to receive only a short cooldown
instead of being disabled, so the gateway kept retrying the same
broken profile indefinitely.
Add "permission_error" and "not allowed for this organization" to
the authPermanent error patterns so these errors trigger the longer
billing/auth_permanent disable window and proper profile rotation.
Closes#31306
Made-with: Cursor
Co-authored-by: Vincent Koc <vincentkoc@ieee.org>
* fix: detect PID recycling in session write lock staleness check
The session lock uses isPidAlive() to determine if a lock holder is
still running. In containers, PID recycling can cause a different
process to inherit the same PID, making the lock appear valid when
the original holder is dead.
Record the process start time (field 22 of /proc/pid/stat) in the
lock file and compare it during staleness checks. If the PID is alive
but its start time differs from the recorded value, the lock is
treated as stale and reclaimed immediately.
Backward compatible: lock files without starttime are handled with
the existing PID-alive + age-based logic. Non-Linux platforms skip
the starttime check entirely (getProcessStartTime returns null).
* shared: harden pid starttime parsing
* sessions: validate lock pid/starttime payloads
* changelog: note recycled PID lock recovery fix
* changelog: credit hiroki and vincent on lock recovery fix
---------
Co-authored-by: HirokiKobayashi-R <hiroki@rhems-japan.co.jp>
* fix(cron): guard against year-rollback in croner nextRun
Croner can return a past-year timestamp for some timezone/date
combinations (e.g. Asia/Shanghai). When nextRun returns a value at or
before nowMs, retry from the next whole second and, if still stale,
from midnight-tomorrow UTC before giving up.
Closes#30351
* googlechat: guard API calls with SSRF-safe fetch
* test: fix hoisted plugin context mock setup
---------
Co-authored-by: Tak Hoffman <781889+Takhoffman@users.noreply.github.com>
* feat(agents): support `thinkingDefault: "adaptive"` for Anthropic models
Anthropic's Opus 4.6 and Sonnet 4.6 support adaptive thinking where the
model dynamically decides when and how much to think. This is now
Anthropic's recommended mode and `budget_tokens` is deprecated on these
models.
Add "adaptive" as a valid thinking level:
- Config: `agents.defaults.thinkingDefault: "adaptive"`
- CLI: `/think adaptive` or `/think auto`
- Pi SDK mapping: "adaptive" → "medium" effort at the pi-agent-core
layer, which the Anthropic provider translates to
`thinking.type: "adaptive"` with `output_config.effort: "medium"`
- Provider fallbacks: OpenRouter and Google map "adaptive" to their
respective "medium" equivalents
Closes#30880
Made-with: Cursor
* style(changelog): format changelog with oxfmt
* test(types): fix strict typing in runtime/plugin-context tests
---------
Co-authored-by: Peter Steinberger <steipete@gmail.com>
Some OpenAI-format providers (via pi-ai) pre-subtract cached_tokens from
prompt_tokens upstream. When cached_tokens exceeds prompt_tokens due to
provider inconsistencies the subtraction produces a negative input value
that flows through to the TUI status bar and /usage dashboard.
Clamp rawInput to 0 in normalizeUsage() so downstream consumers never
see nonsensical negative token counts.
Closes#30765
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
GitHub Copilot API tokens expire after ~30 minutes. When OpenClaw spawns
a long-running subagent using Copilot as the provider, the token would
expire mid-session with no recovery mechanism, causing 401 auth errors.
This commit adds:
- Periodic token refresh scheduled 5 minutes before expiry
- Auth error detection with automatic token refresh and single retry
- Proper timer cleanup on session shutdown to prevent leaks
The implementation uses a per-attempt retry flag to ensure each auth
error can trigger one refresh+retry cycle without creating infinite
retry loops.
🤖 AI-assisted: This fix was developed with GitHub Copilot CLI assistance.
Testing: Fully tested with 3 new unit tests covering auth retry, retry
reset, and timer cleanup scenarios. All 11 auth rotation tests pass.