diff --git a/docs/gateway/opentelemetry.md b/docs/gateway/opentelemetry.md
index 2331636514c..26d185abd05 100644
--- a/docs/gateway/opentelemetry.md
+++ b/docs/gateway/opentelemetry.md
@@ -61,11 +61,11 @@ openclaw plugins enable diagnostics-otel
 
 ## Signals exported
 
-| Signal | What goes in it |
-| ----------- | --------------------------------------------------------------------------------------------------------------------------------- |
-| **Metrics** | Counters and histograms for token usage, cost, run duration, message flow, queue lanes, session state, exec, and memory pressure. |
-| **Traces** | Spans for model usage, model calls, tool execution, exec, webhook/message processing, context assembly, and tool loops. |
-| **Logs** | Structured `logging.file` records exported over OTLP when `diagnostics.otel.logs` is enabled. |
+| Signal | What goes in it |
+| ----------- | ------------------------------------------------------------------------------------------------------------------------------------------ |
+| **Metrics** | Counters and histograms for token usage, cost, run duration, message flow, queue lanes, session state, exec, and memory pressure. |
+| **Traces** | Spans for model usage, model calls, harness lifecycle, tool execution, exec, webhook/message processing, context assembly, and tool loops. |
+| **Logs** | Structured `logging.file` records exported over OTLP when `diagnostics.otel.logs` is enabled. |
 
 Toggle `traces`, `metrics`, and `logs` independently. All three default to on
 when `diagnostics.otel.enabled` is true.
@@ -176,6 +176,10 @@ When any subkey is enabled, model and tool spans get bounded, redacted
 - `openclaw.session.stuck_age_ms` (histogram, attrs: `openclaw.state`)
 - `openclaw.run.attempt` (counter, attrs: `openclaw.attempt`)
 
+### Harness lifecycle
+
+- `openclaw.harness.duration_ms` (histogram, attrs: `openclaw.harness.id`, `openclaw.harness.plugin`, `openclaw.outcome`, `openclaw.harness.phase` on errors)
+
 ### Exec
 
 - `openclaw.exec.duration_ms` (histogram, attrs: `openclaw.exec.target`, `openclaw.exec.mode`, `openclaw.outcome`, `openclaw.failureKind`)
@@ -201,6 +205,10 @@ When any subkey is enabled, model and tool spans get bounded, redacted
   - `gen_ai.system` by default, or `gen_ai.provider.name` when the latest GenAI semantic conventions are opted in
   - `gen_ai.request.model`, `gen_ai.operation.name`, `openclaw.provider`, `openclaw.model`, `openclaw.api`, `openclaw.transport`
   - `openclaw.provider.request_id_hash` (bounded SHA-based hash of the upstream provider request id; raw ids are not exported)
+- `openclaw.harness.run`
+  - `openclaw.harness.id`, `openclaw.harness.plugin`, `openclaw.outcome`, `openclaw.provider`, `openclaw.model`, `openclaw.channel`
+  - On completion: `openclaw.harness.result_classification`, `openclaw.harness.yield_detected`, `openclaw.harness.items.started`, `openclaw.harness.items.completed`, `openclaw.harness.items.active`
+  - On error: `openclaw.harness.phase`, `openclaw.errorCategory`, optional `openclaw.harness.cleanup_failed`
 - `openclaw.tool.execution`
   - `gen_ai.tool.name`, `openclaw.toolName`, `openclaw.errorCategory`, `openclaw.tool.params.*`
 - `openclaw.exec`
@@ -251,6 +259,16 @@ to them directly without OTLP export.
 - `run.attempt`
 - `diagnostic.heartbeat` (aggregate counters: webhooks/queue/session)
 
+**Harness lifecycle**
+
+- `harness.run.started` / `harness.run.completed` / `harness.run.error` —
+  per-run lifecycle for the agent harness. Includes `harnessId`, optional
+  `pluginId`, provider/model/channel, and run id. Completion adds
+  `durationMs`, `outcome`, optional `resultClassification`, `yieldDetected`,
+  and `itemLifecycle` counts. Errors add `phase`
+  (`prepare`/`start`/`send`/`resolve`/`cleanup`), `errorCategory`, and
+  optional `cleanupFailed`.
+
 **Exec**
 
 - `exec.process.completed` — terminal outcome, duration, target, mode, exit