Commit Graph

154 Commits

Author SHA1 Message Date
Bruno Lorente
ca76e2fedc fix(cron-tool): add typed properties to job/patch schemas (#55043)
Merged via squash.

Prepared head SHA: 979bb0e8b7
Co-authored-by: brunolorente <127802443+brunolorente@users.noreply.github.com>
Co-authored-by: altaywtf <9790196+altaywtf@users.noreply.github.com>
Reviewed-by: @altaywtf
2026-04-01 23:41:19 +03:00
Peter Steinberger
f559ea126d fix: land slash command metadata parsing (#58725) (thanks @Mlightsnow) 2026-04-01 10:17:47 +01:00
Forgely3D
4fa11632b4 fix: escalate to model fallback after rate-limit profile rotation cap (#58707)
* fix: escalate to model fallback after rate-limit profile rotation cap

Per-model rate limits (e.g. Anthropic Sonnet-only quotas) are not
relieved by rotating auth profiles — if all profiles share the same
model quota, cycling between them loops forever without falling back
to the next model in the configured fallbacks chain.

Apply the same rotation-cap pattern introduced for overloaded_error
(#58348) to rate_limit errors:

- Add `rateLimitedProfileRotations` to auth.cooldowns config (default: 1)
- After N profile rotations on a rate_limit error, throw FailoverError
  to trigger cross-provider model fallback
- Add `resolveRateLimitProfileRotationLimit` helper following the same
  pattern as `resolveOverloadProfileRotationLimit`

Fixes #58572

* fix: cap prompt-side rate-limit failover (#58707) (thanks @Forgely3D)

* fix: restore latest-main gates for #58707

---------

Co-authored-by: Ember (Forgely3D) <ember@forgely.co>
Co-authored-by: Peter Steinberger <steipete@gmail.com>
2026-04-01 17:54:10 +09:00
scoootscooob
10750fb80e Cron: avoid busy-wait drift for recurring main jobs (#58872) 2026-04-01 01:31:21 -07:00
Peter Steinberger
19d0c2dd1d refactor: remove cron legacy delivery from runtime 2026-04-01 17:06:01 +09:00
Peter Steinberger
802bdb099e refactor: move cron legacy delivery migration to doctor 2026-04-01 16:44:10 +09:00
Peter Steinberger
6776306387 fix: preserve telegram topic delivery routing (#58489) (thanks @cwmine) 2026-04-01 16:13:24 +09:00
Vincent Koc
80ed55332d fix(tasks): restore owner-key task scope 2026-04-01 03:53:12 +09:00
Vincent Koc
338d313043 fix(tasks): scope shared run updates by session 2026-04-01 03:41:29 +09:00
Vincent Koc
7cd0ff2d88 refactor(tasks): add owner-key task access boundaries (#58516)
* refactor(tasks): add owner-key task access boundaries

* test(acp): update task owner-key assertion

* fix(tasks): align owner key checks and migration scope
2026-04-01 03:12:33 +09:00
Peter Steinberger
759d37635d Revert "refactor: move tasks behind plugin-sdk seam"
This reverts commit da6e9bb76f.
2026-04-01 01:30:22 +09:00
Peter Steinberger
da6e9bb76f refactor: move tasks behind plugin-sdk seam 2026-03-31 15:22:09 +01:00
Vincent Koc
1c9053802a refactor(cron): split main and detached dispatch (#57482)
* refactor(tasks): add executor facade

* refactor(tasks): extract delivery policy

* refactor(tasks): route acp through executor

* refactor(tasks): route subagents through executor

* refactor(cron): split main and detached dispatch
2026-03-30 13:59:55 +09:00
Vincent Koc
c7106c4285 refactor(tasks): replace generic task mutation api 2026-03-30 12:49:36 +09:00
Vincent Koc
53bcd5769e refactor(tasks): unify the shared task run registry (#57324)
* refactor(tasks): simplify shared task run registry

* refactor(tasks): remove legacy task registry aliases

* fix(cron): normalize timeout task status and harden ledger writes

* fix(cron): keep manual runs resilient to ledger failures
2026-03-30 08:28:17 +09:00
Peter Steinberger
7d4fab3e73 test: debrand pairing and dm policy fixtures 2026-03-27 22:18:20 +00:00
Tak Hoffman
50d996a6ec tests: cron coverage and NO_REPLY delivery fixes (#53366)
* tools: extend seam audit inventory

* tools: audit cron seam coverage gaps

* test: add cron seam coverage tests

* fix: avoid marking NO_REPLY cron deliveries as delivered

* fix: clean up delete-after-run NO_REPLY cron sessions
2026-03-23 22:52:13 -05:00
Peter Steinberger
1e1372027e perf: avoid cron startup store reload churn 2026-03-22 21:52:42 +00:00
Peter Steinberger
9267e694f7 perf: reduce cron persistence churn 2026-03-22 21:28:16 +00:00
kkhomej33-netizen
e7d9648fba feat(cron): support custom session IDs and auto-bind to current session (#16511)
feat(cron): support persistent session targets for cron jobs (#9765)

Add support for `sessionTarget: "current"` and `session:<id>` so cron jobs can
bind to the creating session or a persistent named session instead of only
`main` or ephemeral `isolated` sessions.

Also:
- preserve custom session targets across reloads and restarts
- update gateway validation and normalization for the new target forms
- add cron coverage for current/custom session targets and fallback behavior
- fix merged CI regressions in Discord and diffs tests
- add a changelog entry for the new cron session behavior

Co-authored-by: kkhomej33-netizen <kkhomej33-netizen@users.noreply.github.com>
Co-authored-by: ImLukeF <92253590+ImLukeF@users.noreply.github.com>
2026-03-14 16:48:46 +11:00
Peter Steinberger
49f3fbf726 fix: restore cron manual run type narrowing 2026-03-13 16:51:59 +00:00
Peter Steinberger
7b8e48ffb6 refactor: share cron manual run preflight 2026-03-13 16:51:59 +00:00
futuremind2026
382287026b cron: record lastErrorReason in job state (#14382)
Merged via squash.

Prepared head SHA: baa6b5d566
Co-authored-by: futuremind2026 <258860756+futuremind2026@users.noreply.github.com>
Co-authored-by: BunsDev <68980965+BunsDev@users.noreply.github.com>
Reviewed-by: @BunsDev
2026-03-10 00:01:45 -05:00
Mariano
d4e59a3666 Cron: enforce cron-owned delivery contract (#40998)
Merged via squash.

Prepared head SHA: 5877389e33
Co-authored-by: mbelinky <132747814+mbelinky@users.noreply.github.com>
Co-authored-by: mbelinky <132747814+mbelinky@users.noreply.github.com>
Reviewed-by: @mbelinky
2026-03-09 20:12:37 +01:00
Peter Steinberger
e86b38f09d refactor: split cron startup catch-up flow 2026-03-09 06:19:10 +00:00
Peter Steinberger
96d17f3cb1 fix: stagger missed cron jobs on restart (#18925) (thanks @rexlunae) 2026-03-09 06:07:43 +00:00
rexlunae
79853aca9c fix(cron): stagger missed jobs on restart to prevent gateway overload
When the gateway restarts with many overdue cron jobs, they are now
executed with staggered delays to prevent overwhelming the gateway.

- Add missedJobStaggerMs config (default 5s between jobs)
- Add maxMissedJobsPerRestart limit (default 5 jobs immediately)
- Prioritize most overdue jobs by sorting by nextRunAtMs
- Reschedule deferred jobs to fire gradually via normal timer

Fixes #18892
2026-03-09 06:07:43 +00:00
Tyler Yust
38543d8196 fix(cron): consolidate announce delivery, fire-and-forget trigger, and minimal prompt mode (#40204)
* fix(cron): consolidate announce delivery and detach manual runs

* fix: queue detached cron runs (#40204)
2026-03-08 14:46:33 -07:00
gambletan
8a20f51460 fix: add rate limit patterns for 'too many tokens' and 'tokens per day' (#39377)
Merged via squash.

Prepared head SHA: 132a457286
Co-authored-by: gambletan <266203672+gambletan@users.noreply.github.com>
Co-authored-by: altaywtf <9790196+altaywtf@users.noreply.github.com>
Reviewed-by: @altaywtf
2026-03-08 13:03:33 +03:00
Peter Steinberger
149ae45bad fix(cron): preserve manual timeoutSeconds on add 2026-03-08 00:48:57 +00:00
Peter Steinberger
e66c418c45 refactor(cron): normalize legacy delivery at ingress 2026-03-08 00:48:57 +00:00
Peter Steinberger
6b18ec479c refactor(cron): centralize initial delivery defaults 2026-03-08 00:48:56 +00:00
Peter Steinberger
4e07bdbdfd fix(cron): restore isolated delivery defaults 2026-03-08 00:18:45 +00:00
Altay
6e962d8b9e fix(agents): handle overloaded failover separately (#38301)
* fix(agents): skip auth-profile failure on overload

* fix(agents): note overload auth-profile fallback fix

* fix(agents): classify overloaded failures separately

* fix(agents): back off before overload failover

* fix(agents): tighten overload probe and backoff state

* fix(agents): persist overloaded cooldown across runs

* fix(agents): tighten overloaded status handling

* test(agents): add overload regression coverage

* fix(agents): restore runner imports after rebase

* test(agents): add overload fallback integration coverage

* fix(agents): harden overloaded failover abort handling

* test(agents): tighten overload classifier coverage

* test(agents): cover all-overloaded fallback exhaustion

* fix(cron): retry overloaded fallback summaries

* fix(cron): treat HTTP 529 as overloaded retry
2026-03-07 01:42:11 +03:00
Tak Hoffman
cc5dad81bc cron: unify stale-run recovery and preserve manual-run every anchors (#35363)
* cron: unify stale-run recovery and preserve manual every anchors

* cron: address unresolved review threads on recovery paths

* cron: remove duplicate timestamp helper after rebase
2026-03-04 22:12:32 -06:00
Tak Hoffman
28dc2e8a40 cron: narrow startup replay backoff guard (#35391) 2026-03-04 22:11:11 -06:00
Tak Hoffman
79d00ae398 fix(cron): stabilize restart catch-up replay semantics (#35351)
* Cron: stabilize restart catch-up replay semantics

* Cron: respect backoff in startup missed-run replay
2026-03-04 21:50:16 -06:00
sline
1059b406a8 fix: cron backup should preserve pre-edit snapshot (#35195) (#35234)
* fix(cron): avoid overwriting .bak during normalization

Fixes openclaw/openclaw#35195

* test(cron): preserve pre-edit bak snapshot in normalization path

---------

Co-authored-by: 0xsline <sline@users.noreply.github.com>
Co-authored-by: Tak Hoffman <781889+Takhoffman@users.noreply.github.com>
2026-03-04 21:46:27 -06:00
Peter Steinberger
f9cbcfca0d refactor: modularize slack/config/cron/daemon internals 2026-03-02 22:30:21 +00:00
Peter Steinberger
f7765bc151 perf(cron): cache schedule evaluators and stagger offsets 2026-03-02 20:19:10 +00:00
bmendonca3
4cd04e4652 fix(cron): migrate legacy string schedule and command jobs 2026-03-02 19:55:19 +00:00
scoootscooob
a3c5d21b4d fix(cron): suppress HEARTBEAT_OK summary from leaking into main session (#32013)
When an isolated cron agent returns HEARTBEAT_OK (nothing to announce),
the direct delivery is correctly skipped, but the fallback path in
timer.ts still enqueues the summary as a system event to the main
session. Filter out heartbeat-only summaries using isCronSystemEvent
before enqueuing, so internal ack tokens never reach user conversations.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-03-02 19:51:51 +00:00
scoootscooob
4030de6c73 fix(cron): move session reaper to finally block so it runs reliably (#31996)
* fix(cron): move session reaper to finally block so it runs reliably

The cron session reaper was placed inside the try block of onTimer(),
after job execution and state updates. If the locked persist section
threw, the reaper was skipped — causing isolated cron run sessions to
accumulate indefinitely in sessions.json.

Move the reaper into the finally block so it always executes after a
timer tick, regardless of whether job execution succeeded. The reaper
is already self-throttled (MIN_SWEEP_INTERVAL_MS = 5 min) so calling
it more reliably has no performance impact.

Closes #31946

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

* fix: strengthen cron reaper failure-path coverage and changelog (#31996) (thanks @scoootscooob)

---------

Co-authored-by: Claude Opus 4.6 <noreply@anthropic.com>
Co-authored-by: Peter Steinberger <steipete@gmail.com>
2026-03-02 18:38:59 +00:00
Glucksberg
09f49cd921 fix(cron): accept delivery mode "none" for sessionTarget="main" (#27431) (#28871) 2026-03-02 10:32:00 -06:00
Evgeny Zislis
4b4ea5df8b feat(cron): add failure destination support to failed cron jobs (#31059)
* feat(cron): add failure destination support with webhook mode and bestEffort handling

Extends PR #24789 failure alerts with features from PR #29145:
- Add webhook delivery mode for failure alerts (mode: 'webhook')
- Add accountId support for multi-account channel configurations
- Add bestEffort handling to skip alerts when job has bestEffort=true
- Add separate failureDestination config (global + per-job in delivery)
- Add duplicate prevention (prevents sending to same as primary delivery)
- Add CLI flags: --failure-alert-mode, --failure-alert-account-id
- Add UI fields for new options in web cron editor

* fix(cron): merge failureAlert mode/accountId and preserve failureDestination on updates

- Fix mergeCronFailureAlert to merge mode and accountId fields
- Fix mergeCronDelivery to preserve failureDestination on updates
- Fix isSameDeliveryTarget to use 'announce' as default instead of 'none'
  to properly detect duplicates when delivery.mode is undefined

* fix(cron): validate webhook mode requires URL in resolveFailureDestination

When mode is 'webhook' but no 'to' URL is provided, return null
instead of creating an invalid plan that silently fails later.

* fix(cron): fail closed on webhook mode without URL and make failureDestination fields clearable

- sendCronFailureAlert: fail closed when mode is webhook but URL is missing
- mergeCronDelivery: use per-key presence checks so callers can clear
  nested failureDestination fields via cron.update

Note: protocol:check shows missing internalEvents in Swift models - this is
a pre-existing issue unrelated to these changes (upstream sync needed).

* fix(cron): use separate schema for failureDestination and fix type cast

- Create CronFailureDestinationSchema excluding after/cooldownMs fields
- Fix type cast in sendFailureNotificationAnnounce to use CronMessageChannel

* fix(cron): merge global failureDestination with partial job overrides

When job has partial failureDestination config, fall back to global
config for unset fields instead of treating it as a full override.

* fix(cron): avoid forcing announce mode and clear inherited to on mode change

- UI: only include mode in patch if explicitly set to non-default
- delivery.ts: clear inherited 'to' when job overrides mode, since URL
  semantics differ between announce and webhook modes

* fix(cron): preserve explicit to on mode override and always include mode in UI patches

- delivery.ts: preserve job-level explicit 'to' when overriding mode
- UI: always include mode in failureAlert patch so users can switch between announce/webhook

* fix(cron): allow clearing accountId and treat undefined global mode as announce

- UI: always include accountId in patch so users can clear it
- delivery.ts: treat undefined global mode as announce when comparing for clearing inherited 'to'

* Cron: harden failure destination routing and add regression coverage

* Cron: resolve failure destination review feedback

* Cron: drop unrelated timeout assertions from conflict resolution

* Cron: format cron CLI regression test

* Cron: align gateway cron test mock types

---------

Co-authored-by: Tak Hoffman <781889+Takhoffman@users.noreply.github.com>
2026-03-02 09:27:41 -06:00
Yasunori Morishima(盛島康徳)
be8930d6f9 fix: clear stale runningAtMs in cron.run() before already-running check (#17949)
Add recomputeNextRunsForMaintenance() call in run() so that stale
runningAtMs markers (from a crashed Phase-1 persist) are cleared by the
existing normalizeJobTickState logic before the already-running guard.

Without this, a manual cron.run() could be blocked for up to
STUCK_RUN_MS (2 hours) even though no job was actually running.

Fixes #17554

Co-authored-by: Claude Opus 4.6 <noreply@anthropic.com>
2026-03-02 07:47:36 -06:00
Kate Chapman
6df8bd9741 fix(cron): wrap computeJobNextRunAtMs in try-catch inside applyJobResult (#30905)
* fix(cron): wrap computeJobNextRunAtMs in try-catch inside applyJobResult

Without this guard, if the croner library throws during schedule
computation (timezone/expression edge cases), the exception propagates
out of applyJobResult and the entire state update is lost — runningAtMs
never clears, lastRunAtMs never advances, nextRunAtMs never recomputes.
After STUCK_RUN_MS (2h), stuck detection clears runningAtMs and the job
re-fires, creating a ~2h repeat cycle instead of the intended schedule.

The sibling function recomputeJobNextRunAtMs in jobs.ts already wraps
computeJobNextRunAtMs in try-catch; this was an oversight in the
applyJobResult call sites.

Changes:
- Error-backoff path: catch and fall back to backoff-only schedule
- Success path: catch and fall through to the MIN_REFIRE_GAP_MS safety net
- applyOutcomeToStoredJob: log a warning when job not found after forceReload

* fix(cron): use recordScheduleComputeError in applyJobResult catch blocks

Address review feedback: the original catch blocks only logged a warning,
which meant a persistent computeJobNextRunAtMs throw would cause a
MIN_REFIRE_GAP_MS (2s) hot loop on cron-kind jobs.

Now both catch blocks call recordScheduleComputeError (exported from
jobs.ts), which tracks consecutive schedule errors and auto-disables the
job after 3 failures — matching the existing behavior in
recomputeJobNextRunAtMs.

* test(cron): cover applyJobResult schedule-throw fallback paths

---------

Co-authored-by: Tak Hoffman <781889+Takhoffman@users.noreply.github.com>
2026-03-02 07:38:33 -06:00
Tak Hoffman
254bb7ceee ui(cron): add advanced controls for run-if-due and routing (#31244)
* ui(cron): add advanced run controls and routing fields

* ui(cron): gate delivery account id to announce mode

* ui(cron): allow clearing delivery account id in editor

* cron: persist payload lightContext updates

* tests(cron): fix payload lightContext assertion typing
2026-03-02 07:24:33 -06:00
Peter Steinberger
6b78544f82 refactor(commands): unify repeated ACP and routing flows 2026-03-02 05:20:19 +00:00
Glucksberg
08c35eb13f fix(cron): re-arm one-shot at-jobs when rescheduled after completion (openclaw#28915) thanks @Glucksberg
Verified:
- pnpm install --frozen-lockfile
- pnpm build
- pnpm check
- pnpm test:macmini

Co-authored-by: Glucksberg <80581902+Glucksberg@users.noreply.github.com>
Co-authored-by: Tak Hoffman <781889+Takhoffman@users.noreply.github.com>
2026-03-01 21:31:24 -06:00