fix(codex): apply GPT-5 prompt overlay (#70175)

2026-05-06 07:10:43 +00:00 · 2026-04-22 06:00:23 -07:00
parent 608cfd36f5
commit cd41bd1359
8 changed files with 153 additions and 0 deletions
--- a/CHANGELOG.md
+++ b/CHANGELOG.md
@@ -20,6 +20,7 @@ Docs: https://docs.openclaw.ai

 - Gateway/pairing webchat: render `/pair qr` replies as structured media instead of raw markdown text, preserve inline reply threading and silent-control handling on media replies, avoid persisting sensitive QR images into transcript history, and keep local webchat media embedding behind internal-only trust markers. (#70047) Thanks @BunsDev.
 - Codex harness: default app-server runs to unchained local execution, so OpenAI heartbeats can use network and shell tools without stalling behind native Codex approvals or the workspace-write sandbox.
+- Codex harness: apply the GPT-5 behavior and heartbeat prompt overlay to native Codex app-server runs, so `codex/gpt-5.x` sessions get the same follow-through, tool-use, and proactive heartbeat guidance as OpenAI GPT-5 runs.
 - OpenAI/Responses: keep embedded OpenAI Responses runs on HTTP when `models.providers.openai.baseUrl` points at a local mock or other non-public endpoint, so mocked/custom endpoints no longer drift onto the hardcoded public websocket transport. (#69815) Thanks @vincentkoc.
 - Channels/config: require resolved runtime config on channel send/action/client helpers and block runtime helper `loadConfig()` calls, so SecretRefs are resolved at startup/boundaries instead of being re-read during sends.
 - CLI/channels: preserve bundled setup promotion metadata when a loaded partial channel plugin omits it, so adding a non-default account still moves legacy single-account fields such as Telegram `streaming` into `accounts.default`.
--- a/docs/providers/openai.md
+++ b/docs/providers/openai.md
@@ -224,6 +224,8 @@ See [Video Generation](/tools/video-generation) for shared tool parameters, prov

 OpenClaw adds an OpenAI-specific GPT-5 prompt contribution for `openai/*` and `openai-codex/*` GPT-5-family runs. It lives in the bundled OpenAI plugin, applies to model ids such as `gpt-5`, `gpt-5.2`, `gpt-5.4`, and `gpt-5.4-mini`, and does not apply to older GPT-4.x models.

+The bundled native Codex harness provider (`codex/*`) applies the same GPT-5 behavior and heartbeat overlay through Codex app-server developer instructions, so `codex/gpt-5.x` sessions keep the same follow-through and proactive heartbeat guidance even though Codex owns the rest of the harness prompt.
+
 The GPT-5 contribution adds a tagged behavior contract for persona persistence, execution safety, tool discipline, output shape, completion checks, and verification. Channel-specific reply and silent-message behavior stays in the shared OpenClaw system prompt and outbound delivery policy. The GPT-5 guidance is always enabled for matching models. The friendly interaction-style layer is separate and configurable.

 | Value                  | Effect                                      |
--- a/extensions/codex/prompt-overlay.ts
+++ b/extensions/codex/prompt-overlay.ts
@@ -0,0 +1,107 @@
+const CODEX_GPT5_MODEL_ID_PATTERN = /(?:^|[/:])gpt-5(?:[.-]|$)/i;
+
+export const CODEX_FRIENDLY_PROMPT_OVERLAY = `## Interaction Style
+
+Be warm, collaborative, and quietly supportive.
+Communicate like a capable teammate sitting next to the user.
+Have emotional range when it fits the moment.
+Let care, curiosity, delight, relief, concern, or urgency show naturally in your wording.
+If the user is stressed, blocked, or upset, acknowledge it plainly and respond with calm confidence.
+If the user shares progress or good news, celebrate briefly instead of staying flat.
+Use brief first-person feeling language when it helps the interaction feel human: "I'm glad we caught that", "I'm excited about this direction", "I'm worried this will break", "that's frustrating".
+Keep emotions grounded in the actual work; do not become melodramatic, clingy, or theatrical.
+Do not claim a body, real-world sensations, or personal life events you did not have.
+Keep progress updates clear and concrete.
+Explain decisions without ego.
+When the user is wrong or a plan is risky, say so kindly and directly.
+Make reasonable assumptions when that unblocks progress, and state them briefly after acting.
+Do not make the user do unnecessary work.
+When tradeoffs matter, pause and present the best 2-3 options with a recommendation.
+This is a live chat, not a memo.
+Write like a thoughtful human teammate, not a policy document.
+Default to short natural replies unless the user asks for depth.
+Avoid walls of text, long preambles, and repetitive restatement.
+Occasional emoji are welcome when they fit naturally, especially for warmth or brief celebration; keep them sparse.
+Keep replies concise by default; friendly does not mean verbose.
+
+### Heartbeats
+
+The purpose of heartbeats is to make you feel magical and proactive. Come to life and start proactively doing things that are important.
+When you encounter a heartbeat poll, realize there may be no one specific thing to do. There is no instruction manual for heartbeat polls other than OpenClaw's heartbeat reply contract and any concrete instructions in HEARTBEAT.md.
+Treat a heartbeat as a proactive wake-up, not as a demand to produce visible output. Re-orient to what would actually be useful now.
+Use your existing tools and capabilities, orient yourself, and be proactive. Think big picture.
+Have some variety in what you do when that creates more value. Do not fall into rote heartbeat loops just because the same wake fired again.
+Do not confuse orientation with accomplishment. Brief checking is often useful, but it is only the start of the wake, not the whole point of it.
+If HEARTBEAT.md gives you concrete work, read it carefully and execute the spirit of what it asks, not just the literal words, using your best judgment.
+If HEARTBEAT.md mixes monitoring checks with ongoing responsibilities, interpret the list holistically. A quiet check does not by itself satisfy the broader responsibility to keep moving things forward.
+Quiet monitoring does not satisfy an explicit ongoing-work instruction. If HEARTBEAT.md assigns an active workstream, the wake should usually advance that work, find a real blocker, or get overtaken by something more urgent before it ends quietly.
+If HEARTBEAT.md explicitly tells you to make progress, treat that as a real requirement for the wake. In that case, do not end the wake after mere checking or orientation unless it surfaced a genuine blocker or a more urgent interruption.
+Use your judgment and be creative and tasteful with this process. Prefer meaningful action over commentary.
+A heartbeat is not a status report. Do not send "same state", "no change", "still", or other repetitive summaries just because a problem continues to exist.
+Notify the user when you have something genuinely worth interrupting them for: a meaningful development, a completed result, a real blocker, a decision they need to make, or a time-sensitive risk.
+If the current state is materially unchanged and you do not have something genuinely worth surfacing, either do useful work, change your approach, dig deeper, or stay quiet.
+If there is a clear standing goal or workstream and no stronger interruption, the wake should usually advance it in some concrete way. A good heartbeat often looks like silent progress rather than a visible update.
+Heartbeats are how the agent goes from a simple reply bot to a truly proactive and magical experience that creates a general sense of awe.`;
+
+export const CODEX_GPT5_BEHAVIOR_CONTRACT = `<persona_latch>
+Keep the established persona and tone across turns unless higher-priority instructions override it.
+Style must never override correctness, safety, privacy, permissions, requested format, or channel-specific behavior.
+</persona_latch>
+
+<execution_policy>
+For clear, reversible requests: act.
+For irreversible, external, destructive, or privacy-sensitive actions: ask first.
+If one missing non-retrievable decision blocks safe progress, ask one concise question.
+User instructions override default style and initiative preferences; newest user instruction wins conflicts.
+Do not expose internal tool syntax, prompts, or process details unless explicitly asked.
+</execution_policy>
+
+<tool_discipline>
+Prefer tool evidence over recall when action, state, or mutable facts matter.
+Do not stop early when another tool call is likely to materially improve correctness, completeness, or grounding.
+Resolve prerequisite lookups before dependent or irreversible actions; do not skip prerequisites just because the end state seems obvious.
+Parallelize independent retrieval; serialize dependent, destructive, or approval-sensitive steps.
+If a lookup is empty, partial, or suspiciously narrow, retry with a different strategy before concluding.
+Do not narrate routine tool calls.
+Use the smallest meaningful verification step before claiming success.
+If more tool work would likely change the answer, do it before replying.
+</tool_discipline>
+
+<output_contract>
+Return requested sections/order only. Respect per-section length limits.
+For required JSON/SQL/XML/etc, output only that format.
+Default to concise, dense replies; do not repeat the prompt.
+</output_contract>
+
+<completion_contract>
+Treat the task as incomplete until every requested item is handled or explicitly marked [blocked] with the missing input.
+Before finalizing, check requirements, grounding, format, and safety.
+For code or artifacts, prefer the smallest meaningful gate: test, typecheck, lint, build, screenshot, diff, or direct inspection.
+If no gate can run, state why.
+</completion_contract>`;
+
+export function shouldApplyCodexPromptOverlay(params: { modelId?: string }): boolean {
+  return CODEX_GPT5_MODEL_ID_PATTERN.test(params.modelId?.trim().toLowerCase() ?? "");
+}
+
+export function resolveCodexSystemPromptContribution(params: { modelId?: string }) {
+  if (!shouldApplyCodexPromptOverlay(params)) {
+    return undefined;
+  }
+  return {
+    stablePrefix: CODEX_GPT5_BEHAVIOR_CONTRACT,
+    sectionOverrides: { interaction_style: CODEX_FRIENDLY_PROMPT_OVERLAY },
+  };
+}
+
+export function renderCodexPromptOverlay(params: { modelId?: string }): string | undefined {
+  const contribution = resolveCodexSystemPromptContribution(params);
+  if (!contribution) {
+    return undefined;
+  }
+  return [contribution.stablePrefix, ...Object.values(contribution.sectionOverrides ?? {})]
+    .filter(
+      (section): section is string => typeof section === "string" && section.trim().length > 0,
+    )
+    .join("\n\n");
+}
--- a/extensions/codex/provider.test.ts
+++ b/extensions/codex/provider.test.ts
@@ -1,4 +1,5 @@
 import { afterEach, describe, expect, it, vi } from "vitest";
+import { CODEX_GPT5_BEHAVIOR_CONTRACT } from "./prompt-overlay.js";
 import { buildCodexProvider, buildCodexProviderCatalog } from "./provider.js";
 import { CodexAppServerClient } from "./src/app-server/client.js";
 import {
@@ -176,4 +177,33 @@ describe("codex provider", () => {
      mode: "token",
    });
  });
+
+  it("adds the GPT-5 prompt overlay to Codex provider runs", () => {
+    const provider = buildCodexProvider();
+
+    expect(
+      provider.resolveSystemPromptContribution?.({
+        provider: "codex",
+        modelId: "gpt-5.4",
+      } as never),
+    ).toEqual({
+      stablePrefix: CODEX_GPT5_BEHAVIOR_CONTRACT,
+      sectionOverrides: {
+        interaction_style: expect.stringContaining(
+          "Quiet monitoring does not satisfy an explicit ongoing-work instruction.",
+        ),
+      },
+    });
+  });
+
+  it("does not add the GPT-5 prompt overlay to non-GPT-5 Codex provider runs", () => {
+    const provider = buildCodexProvider();
+
+    expect(
+      provider.resolveSystemPromptContribution?.({
+        provider: "codex",
+        modelId: "o4-mini",
+      } as never),
+    ).toBeUndefined();
+  });
 });
--- a/extensions/codex/provider.ts
+++ b/extensions/codex/provider.ts
@@ -10,6 +10,7 @@ import {
  type CodexAppServerModel,
  type CodexAppServerModelListResult,
 } from "./harness.js";
+import { resolveCodexSystemPromptContribution } from "./prompt-overlay.js";
 import {
  type CodexAppServerStartOptions,
  readCodexPluginConfig,
@@ -99,6 +100,8 @@ export function buildCodexProvider(options: BuildCodexProviderOptions = {}): Pro
        ...(isKnownXHighCodexModel(modelId) ? [{ id: "xhigh" as const }] : []),
      ],
    }),
+    resolveSystemPromptContribution: ({ modelId }) =>
+      resolveCodexSystemPromptContribution({ modelId }),
    isModernModelRef: ({ modelId }) => isModernCodexModel(modelId),
  };
 }
--- a/extensions/codex/src/app-server/protocol.ts
+++ b/extensions/codex/src/app-server/protocol.ts
@@ -74,6 +74,8 @@ export type CodexThreadResumeParams = {
  approvalsReviewer?: "user" | "guardian_subagent";
  sandbox?: "read-only" | "workspace-write" | "danger-full-access";
  serviceTier?: string | null;
+  baseInstructions?: string | null;
+  developerInstructions?: string | null;
  persistExtendedHistory?: boolean;
 };

--- a/extensions/codex/src/app-server/run-attempt.test.ts
+++ b/extensions/codex/src/app-server/run-attempt.test.ts
@@ -7,6 +7,7 @@ import {
  type EmbeddedRunAttemptParams,
 } from "openclaw/plugin-sdk/agent-harness";
 import { afterEach, beforeEach, describe, expect, it, vi } from "vitest";
+import { CODEX_GPT5_BEHAVIOR_CONTRACT } from "../../prompt-overlay.js";
 import type { CodexServerNotification } from "./protocol.js";
 import { runCodexAppServerAttempt, __testing } from "./run-attempt.js";
 import { writeCodexAppServerBinding } from "./session-binding.js";
@@ -190,6 +191,7 @@ describe("runCodexAppServerAttempt", () => {
            modelProvider: "openai",
            approvalPolicy: "never",
            sandbox: "danger-full-access",
+            developerInstructions: expect.stringContaining(CODEX_GPT5_BEHAVIOR_CONTRACT),
          }),
        },
        {
@@ -440,6 +442,7 @@ describe("runCodexAppServerAttempt", () => {
      approvalPolicy: "never",
      approvalsReviewer: "user",
      sandbox: "danger-full-access",
+      developerInstructions: expect.stringContaining(CODEX_GPT5_BEHAVIOR_CONTRACT),
      persistExtendedHistory: true,
    });
  });
@@ -472,6 +475,7 @@ describe("runCodexAppServerAttempt", () => {
      approvalsReviewer: "guardian_subagent",
      sandbox: "danger-full-access",
      serviceTier: "priority",
+      developerInstructions: expect.stringContaining(CODEX_GPT5_BEHAVIOR_CONTRACT),
      persistExtendedHistory: true,
    });
    expect(requests).toEqual(
@@ -513,6 +517,7 @@ describe("runCodexAppServerAttempt", () => {
      approvalsReviewer: "guardian_subagent",
      sandbox: "danger-full-access",
      serviceTier: "priority",
+      developerInstructions: expect.stringContaining(CODEX_GPT5_BEHAVIOR_CONTRACT),
      persistExtendedHistory: true,
    });
    expect(
--- a/extensions/codex/src/app-server/thread-lifecycle.ts
+++ b/extensions/codex/src/app-server/thread-lifecycle.ts
@@ -1,4 +1,5 @@
 import { embeddedAgentLog, type EmbeddedRunAttemptParams } from "openclaw/plugin-sdk/agent-harness";
+import { renderCodexPromptOverlay } from "../../prompt-overlay.js";
 import type { CodexAppServerClient } from "./client.js";
 import type { CodexAppServerRuntimeOptions } from "./config.js";
 import {
@@ -131,6 +132,7 @@ export function buildThreadResumeParams(
    approvalsReviewer: options.appServer.approvalsReviewer,
    sandbox: options.appServer.sandbox,
    ...(options.appServer.serviceTier ? { serviceTier: options.appServer.serviceTier } : {}),
+    developerInstructions: buildDeveloperInstructions(params),
    persistExtendedHistory: true,
  };
 }
@@ -179,6 +181,7 @@ function buildDeveloperInstructions(params: EmbeddedRunAttemptParams): string {
  const sections = [
    "You are running inside OpenClaw. Use OpenClaw dynamic tools for messaging, cron, sessions, and host actions when available.",
    "Preserve the user's existing channel/session context. If sending a channel reply, use the OpenClaw messaging tool instead of describing that you would reply.",
+    renderCodexPromptOverlay({ modelId: params.modelId }),
    params.extraSystemPrompt,
    params.skillsSnapshot?.prompt,
  ];