fix(gateway): catch startup failure in run loop to prevent process exit (#35862)

When a config change triggers an in-process restart (SIGUSR1) and the new
config is invalid, params.start() throws, the while loop exits, and the
process dies. On macOS the respawned process loses its TCC permissions.

Wrap params.start() in try/catch: on failure, set server=null, log the
error, and wait for the next SIGUSR1 instead of crashing.
merlin
2026-03-05 18:36:39 +08:00
committed by Peter Steinberger
parent da4e43d52c
commit b11933d8a9


@@ -193,7 +193,19 @@ export async function runGatewayLoop(params: {
   // eslint-disable-next-line no-constant-condition
   while (true) {
     onIteration();
-    server = await params.start();
+    try {
+      server = await params.start();
+    } catch (err) {
+      // If startup fails (e.g., invalid config after a config-triggered
+      // restart), keep the process alive and wait for the next SIGUSR1
+      // instead of crashing. A crash here would respawn a new process that
+      // loses macOS Full Disk Access (TCC permissions are PID-bound). (#35862)
+      server = null;
+      gatewayLog.error(
+        `gateway startup failed: ${err instanceof Error ? err.message : String(err)}. ` +
+          "Process will stay alive; fix the issue and restart.",
+      );
+    }
     await new Promise<void>((resolve) => {
       restartResolver = resolve;
     });
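
The behavior this change aims for can be illustrated in isolation. The following is a minimal sketch, not the actual runGatewayLoop: the names runLoopSketch, LoopSketchParams, waitForRestart, and maxIterations are hypothetical, the restart signal is simulated by a promise instead of SIGUSR1, and the loop is bounded so that the example terminates. It shows a failed start being logged and followed by a successful start on the next iteration, rather than the loop exiting.

```typescript
// Hypothetical stand-in for the gateway server handle.
type Server = { name: string };

// Hypothetical parameter shape; the real runGatewayLoop params differ.
interface LoopSketchParams {
  start: () => Promise<Server>;
  // Assumption: resolves when a restart is requested (SIGUSR1 in the real loop).
  waitForRestart: () => Promise<void>;
  // Assumption: bounds the loop so the sketch terminates; the real loop is `while (true)`.
  maxIterations: number;
  log: (message: string) => void;
}

type Outcome = "started" | "failed";

async function runLoopSketch(params: LoopSketchParams): Promise<Outcome[]> {
  const outcomes: Outcome[] = [];
  let server: Server | null = null;
  for (let i = 0; i < params.maxIterations; i++) {
    try {
      server = await params.start();
    } catch (err) {
      // A failed start leaves no server, but the loop keeps running.
      server = null;
      params.log(
        `gateway startup failed: ${err instanceof Error ? err.message : String(err)}`,
      );
    }
    outcomes.push(server ? "started" : "failed");
    // Park until the next restart request instead of exiting.
    await params.waitForRestart();
  }
  return outcomes;
}

// Usage: the first start throws (simulating an invalid config), the second succeeds.
async function demo(): Promise<{ outcomes: Outcome[]; logs: string[] }> {
  const logs: string[] = [];
  let attempt = 0;
  const outcomes = await runLoopSketch({
    start: async () => {
      attempt++;
      if (attempt === 1) throw new Error("invalid config");
      return { name: `server-${attempt}` };
    },
    waitForRestart: () => Promise.resolve(),
    maxIterations: 2,
    log: (message) => logs.push(message),
  });
  return { outcomes, logs };
}
```

Without the try/catch, the first rejected start() would propagate out of the loop and the second attempt would never run.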