Fixes#86161.
Route Telegram media-message edits through the Telegram caption/reply-markup APIs instead of always calling `editMessageText`. Button-only edits now update reply markup, explicit captions use `editMessageCaption`, and text edits can fall back to caption edits when Telegram reports the message has no editable text.
Also documents the edit behavior, adds regression coverage, tightens timer-spy cleanup for the affected agents test lane, and removes a stale loader helper from the current base that broke core typecheck.
Co-authored-by: Andy Ye <35905412+TurboTheTurtle@users.noreply.github.com>
Move Telegram plugin-local state from JSON sidecars into plugin-state SQLite. Keep legacy JSON handling in startup and doctor migration plans, with runtime state now reading and writing SQLite directly. Stabilize the channel Vitest lane by cleaning up typing timers and isolating that lane.
Align Telegram proactive DM-topic outbound session routing with inbound reply routing.
The Telegram plugin now uses the chat-scoped DM-topic suffix for direct-topic outbound sessions, so cron/proactive sends and replies reuse the same session. Delivery metadata is kept as the numeric Telegram topic id so visible sends still target the correct private topic.
Refs #80212.
Thanks @brokemac79.
Verification:
- PR head d904115e4c
- GitHub CI/checks green on PR head; Real behavior proof passed; OpenGrep passed; CodeQL neutral/pass
- git diff --check origin/main...pr/88421 -- extensions/telegram/src/channel.ts extensions/telegram/src/session-route.test.ts
- git merge-tree $(git merge-base origin/main pr/88421) origin/main pr/88421
Reduce repeated gateway warning noise in startup/auth retry paths while preserving credential mismatch and rate-limit audit visibility.
Also hardens empty embedded-assistant retry handling by carrying lifecycle state through the missing-assistant guard, and keeps the relevant regression coverage in gateway and agent tests.
* fix(telegram): enable TCP keepalive on getUpdates connections to prevent NAT timeout stalls
Long-polling connections to api.telegram.org stay idle for up to the
getUpdates timeout (~900 s). Most home/office NAT tables expire idle TCP
entries after 60–1800 s (commonly ~1000 s). When the NAT entry is
silently dropped the connection hangs rather than returning an error,
leaving the grammY runner stuck until the 90 s stall watchdog fires and
forces a restart cycle.
Fix: unconditionally set `keepAlive: true` and
`keepAliveInitialDelay: 30_000` (30 s) on the undici Agent `connect`
options built in `buildTelegramConnectOptions`. OS-level TCP keepalive
probes sent every ~75 s (OS default) will:
1. Refresh the NAT table entry before it expires.
2. Surface dead connections immediately with ETIMEDOUT instead of
hanging forever.
The `return Object.keys(connect).length > 0 ? connect : null` guard is
also removed; `connect` is now always non-empty so it always returns the
object.
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
(cherry picked from commit 92e454c0614256201cdf6f0f73c7897d006616d4)
* fix(telegram): stop self-flagging disconnected on poll-cycle start; widen channel connect grace to 300s
(cherry picked from commit 1ca963a05dac0d9d605e9a15dc97fced9cf7725e)
* fix(telegram): catch hung polling startups that preserve inherited connected:true
The widened 300s channel connect grace and the removal of connected:false from
notePollingStart left a path where a polling restart could hang forever
looking healthy. notePollingStart clears lastConnectedAt, lastEventAt, and
lastTransportActivityAt but deliberately omits connected, so server-channels'
patch-merge inherits a connected:true from the previous lifecycle. After grace,
evaluateChannelHealth's stale-socket branch requires lastTransportActivityAt
to be non-null and the connected:false branch is masked, so the channel sits
healthy with no first getUpdates.
Add a post-grace branch to evaluateChannelHealth that flags polling channels
as stale-socket when connected:true is paired with null lastConnectedAt and
null lastTransportActivityAt and a non-null lastStartAt. Scoped to mode:polling
so webhook channels and channels without continuous transport tracking are
not falsely flagged. Align TELEGRAM_POLLING_CONNECT_GRACE_MS in the Telegram
status diagnostic with DEFAULT_CHANNEL_CONNECT_GRACE_MS so openclaw channels
status agrees with the shared health monitor on the grace window. Refresh
the notePollingStart comment to point at the new evaluateChannelHealth branch.
Addresses clawsweeper review on #83304 (P1 connect-grace startup-hang, P2
diagnostic grace drift). Tests cover the new flagged path, the in-grace happy
path, and the prior-successful-connect happy path.
* fix(telegram): clear polling connected state on startup
* fix(gateway): add defense-in-depth health-policy branch for hung polling startups
Defense in depth on top of 87db46c576's notePollingStart connected:false fix.
The primary path (notePollingStart writes connected:false explicitly so
evaluateChannelHealth's existing connected===false branch catches a hung
restart) is unchanged. This adds a defensive post-grace branch that catches
the same hang via a different signature -- inherited connected:true paired
with null lastConnectedAt and null lastTransportActivityAt -- in case a
future code path forgets to clear the inherited connected flag on lifecycle
start. Scoped to mode:polling so webhook channels and channels without
continuous transport tracking are not falsely flagged.
Also bump lastStartAt: Date.now() - 121_000 to 301_000 in the spool-handler
timeout test added by upstream #83505 so it falls past the widened 300s
TELEGRAM_POLLING_CONNECT_GRACE_MS suppression window (mirroring the same
fixup already applied to the two adjacent polling-startup tests).
* revert(telegram,gateway): keep connect grace at 120s
Drop the 120s -> 300s widening from this PR after maintainer feedback that
the extra grace masks real startup bugs. The defense-in-depth checks added
in earlier commits (notePollingStart clearing inherited connected state,
the stale-socket policy branch, the per-snapshot startup grace test) all
work fine at 120s and remain valuable on their own.
Reverts in:
- src/gateway/channel-health-policy.ts: DEFAULT_CHANNEL_CONNECT_GRACE_MS 300 -> 120
- extensions/telegram/src/status-issues.ts: TELEGRAM_POLLING_CONNECT_GRACE_MS 300 -> 120
- extensions/telegram/src/status.test.ts: lastStartAt 301_000 -> 121_000 (3 cases)
The new channel-health-policy.test.ts cases use explicit channelConnectGraceMs:
10_000 in the policy, so they are unaffected by the default constant change.
* fix(telegram): narrow polling keepalive fix
---------
Co-authored-by: Yibei Ou <yibeiou@Yibeis-Mac-mini.local>
Co-authored-by: Claude Sonnet 4.6 <noreply@anthropic.com>
Co-authored-by: Ayaan Zaidi <hi@obviy.us>
Make plugin-state enforce the plugin-wide live-row fuse by evicting only from the namespace currently being written, preserving sibling namespace rows and still failing atomically when the current namespace cannot free enough rows.
Raise the plugin-wide cap to 6,000 rows, keep Telegram's persistent message-cache namespace at 3,000 entries, and document the updated SDK runtime contract. Harden legacy plugin-state import so capacity pressure cannot archive a source after losing imported keys, with focused regression coverage for Telegram-shaped namespaces and migration rollback.
Also restore the Docker runtime-assets preflight step in full release validation so release workflow contract tests stay aligned.
Verification: focused plugin-state, migration, Telegram, workflow-contract, lint, deprecated-API, diff-check, Blacksmith Testbox, CI, CodeQL, Workflow Sanity, OpenGrep, and autoreview all passed on PR head fee021cfa6.
Co-authored-by: Keshav's Bot <keshavbotagent@gmail.com>
Use read-only Telegram account inspection for prompt-time channel actions, inline buttons, and reaction guidance so unresolved SecretRef tokens retain configured non-secret behavior before runtime snapshot hydration.
Match runtime Telegram account lookup for normalized config keys and multi-account fallback guards, while keeping sends/actions on the existing strict credential resolution path.
Fixes#75433.
Co-authored-by: Shubhankar Tripathy <reach2shubhankar@gmail.com>
Route Telegram sendMessage action replies through durable outbound delivery so completed agent responses remain retryable when the gateway send path times out.
Verified with focused Telegram/outbound tests, extension test typecheck, prepare build/check/full test gates, and green CI rerun for head 20b45687e1.