Regression: `costUsageCache` in `src/gateway/server-methods/usage.ts` had no
delete/prune/evict path. The TTL check at L310 only gates stale reads — on a
miss after expiry, `set()` overwrites the same key but never removes stale
keys. `parseDateRange` derives cacheKey from `getTodayStartMs`, so cacheKey
rolls at every UTC 00:00, and additional axes (days / startDate / endDate /
utcOffset) multiply cardinality. The macOS menu polls `usage.cost` every ~45s
with no params, exercising `parseDateRange`'s default branch every day. Over
gateway uptime the map grows monotonically.
Three sibling caches in the same subsystem already implement MAX + FIFO
eviction (resolvedSessionKeyByRunId, TRANSCRIPT_SESSION_KEY_CACHE,
sessionTitleFieldsCache). This change mirrors their pattern:
- `COST_USAGE_CACHE_MAX = 256` (matches RUN_LOOKUP_CACHE_LIMIT and
TRANSCRIPT_SESSION_KEY_CACHE_MAX).
- New `setCostUsageCache(cacheKey, entry)` helper checks size + evicts
`keys().next().value` when adding a new key would exceed the cap.
- The three existing `costUsageCache.set(...)` call sites now route through
the helper. TTL-on-read, in-flight dedup, and overwrite-on-same-key
semantics are preserved.
Adds `src/gateway/server-methods/usage.cost-usage-cache.test.ts` which drives
growth through `__test.loadCostUsageSummaryCached` with 600 distinct
(startMs, endMs) pairs (mirrors day rollover + range switches). Pre-fix the
Map grows to 600; post-fix it plateaus, the last key is retained, and the
first key is evicted (FIFO).
AI-assisted (fully tested). 432 server-methods tests pass, pnpm check +
pnpm build clean.