fix(gateway): prefer linux child OOM victims

Raise eligible Linux child processes own oom_score_adj from a child-side /bin/sh exec shim so cgroup memory pressure prefers transient workers over the long-lived gateway. Cover supervisor children, PTY shells, MCP stdio servers, and OpenClaw-launched browser processes through the shared process runtime seam.

Harden the wrapper for distroless images, shell startup env, per-child and process-level opt-outs, dash-compatible exec, and leading-dash command names. Document Linux verification and OOM behavior.

Fixes #70404.

Co-authored-by: Neerav Makwana <261249544+neeravmakwana@users.noreply.github.com>
This commit is contained in:
Peter Steinberger
2026-04-23 05:10:30 +01:00
parent d3a2e993a8
commit cc9dcd3d69
14 changed files with 451 additions and 25 deletions

View File

@@ -114,3 +114,6 @@ If you deliberately installed a system unit instead, edit
How `Restart=` policies help automated recovery:
[systemd can automate service recovery](https://www.redhat.com/en/blog/systemd-automate-recovery).
For Linux OOM behavior, child process victim selection, and `exit 137`
diagnostics, see [Linux memory pressure and OOM kills](/platforms/linux#memory-pressure-and-oom-kills).