openclaw/docs/gateway/opentelemetry.md at 64a7a34c83343e3add5638eb34f48693abaf05dc

vultr/openclaw

mirror of https://github.com/openclaw/openclaw.git synced 2026-05-06 21:50:42 +00:00

Vincent Koc a77996dc56 fix(diagnostics): propagate trusted traceparent headers

2026-04-26 00:24:47 -07:00

17 KiB

summary: Export OpenClaw diagnostics to any OpenTelemetry collector via the diagnostics-otel plugin (OTLP/HTTP)
title: OpenTelemetry export
read_when:
- You want to send OpenClaw model usage, message flow, or session metrics to an OpenTelemetry collector
- You are wiring traces, metrics, or logs into Grafana, Datadog, Honeycomb, New Relic, Tempo, or another OTLP backend
- You need the exact metric names, span names, or attribute shapes to build dashboards or alerts

OpenClaw exports diagnostics through the bundled diagnostics-otel plugin using OTLP/HTTP (protobuf). Any collector or backend that accepts OTLP/HTTP works without code changes. For local file logs and how to read them, see Logging.

How it fits together

Diagnostics events are structured, in-process records emitted by the Gateway and bundled plugins for model runs, message flow, sessions, queues, and exec.
The diagnostics-otel plugin subscribes to those events and exports them as OpenTelemetry metrics, traces, and logs over OTLP/HTTP.
Provider calls receive a W3C traceparent header from OpenClaw's trusted model-call span context when the provider transport accepts custom headers. Plugin-emitted trace context is not propagated.
Exporters only attach when both the diagnostics surface and the plugin are enabled, so the in-process cost stays near zero by default.
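
The gating described above can be pictured with a small sketch. Everything in it is hypothetical: `DiagnosticsBus`, `attach_exporter`, and the event shape are illustrative names, not OpenClaw APIs. It only shows why emitting stays cheap while no exporter is attached.

```python
from typing import Callable, Dict, List

Event = Dict[str, object]
Listener = Callable[[Event], None]


class DiagnosticsBus:
    """Hypothetical in-process event bus; not the OpenClaw implementation."""

    def __init__(self) -> None:
        self._listeners: List[Listener] = []

    def subscribe(self, listener: Listener) -> None:
        self._listeners.append(listener)

    def emit(self, event: Event) -> None:
        # With no listeners the loop body never runs, so emitting is cheap.
        for listener in self._listeners:
            listener(event)


def attach_exporter(bus: DiagnosticsBus, diagnostics_enabled: bool,
                    plugin_enabled: bool, sink: List[Event]) -> bool:
    """Attach the exporter only when both switches are on."""
    if not (diagnostics_enabled and plugin_enabled):
        return False
    # A plain list stands in for the OTLP exporter in this sketch.
    bus.subscribe(sink.append)
    return True
```

With either switch off, `emit` iterates over an empty listener list and returns immediately.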

Quick start

```json5
{
  plugins: {
    allow: ["diagnostics-otel"],
    entries: {
      "diagnostics-otel": { enabled: true },
    },
  },
  diagnostics: {
    enabled: true,
    otel: {
      enabled: true,
      endpoint: "http://otel-collector:4318",
      protocol: "http/protobuf",
      serviceName: "openclaw-gateway",
      traces: true,
      metrics: true,
      logs: true,
      sampleRate: 0.2,
      flushIntervalMs: 60000,
    },
  },
}
```

You can also enable the plugin from the CLI:

```bash
openclaw plugins enable diagnostics-otel
```

`protocol` currently supports `http/protobuf` only. `grpc` is ignored.

Signals exported

| Signal | What goes in it |
| --- | --- |
| Metrics | Counters and histograms for token usage, cost, run duration, message flow, queue lanes, session state, exec, and memory pressure. |
| Traces | Spans for model usage, model calls, harness lifecycle, tool execution, exec, webhook/message processing, context assembly, and tool loops. |
| Logs | Structured `logging.file` records exported over OTLP when `diagnostics.otel.logs` is enabled. |

Toggle traces, metrics, and logs independently. All three default to on when diagnostics.otel.enabled is true.

Configuration reference

```json5
{
  diagnostics: {
    enabled: true,
    otel: {
      enabled: true,
      endpoint: "http://otel-collector:4318",
      tracesEndpoint: "http://otel-collector:4318/v1/traces",
      metricsEndpoint: "http://otel-collector:4318/v1/metrics",
      logsEndpoint: "http://otel-collector:4318/v1/logs",
      protocol: "http/protobuf", // grpc is ignored
      serviceName: "openclaw-gateway",
      headers: { "x-collector-token": "..." },
      traces: true,
      metrics: true,
      logs: true,
      sampleRate: 0.2, // root-span sampler, 0.0..1.0
      flushIntervalMs: 60000, // metric export interval (min 1000ms)
      captureContent: {
        enabled: false,
        inputMessages: false,
        outputMessages: false,
        toolInputs: false,
        toolOutputs: false,
        systemPrompt: false,
      },
    },
  },
}
```

Environment variables

| Variable | Purpose |
| --- | --- |
| `OTEL_EXPORTER_OTLP_ENDPOINT` | Override `diagnostics.otel.endpoint`. If the value already contains `/v1/traces`, `/v1/metrics`, or `/v1/logs`, it is used as-is. |
| `OTEL_EXPORTER_OTLP_TRACES_ENDPOINT` / `OTEL_EXPORTER_OTLP_METRICS_ENDPOINT` / `OTEL_EXPORTER_OTLP_LOGS_ENDPOINT` | Signal-specific endpoint overrides used when the matching `diagnostics.otel.*Endpoint` config key is unset. Signal-specific config wins over signal-specific env, which wins over the shared endpoint. |
| `OTEL_SERVICE_NAME` | Override `diagnostics.otel.serviceName`. |
| `OTEL_EXPORTER_OTLP_PROTOCOL` | Override the wire protocol (only `http/protobuf` is honored today). |
| `OTEL_SEMCONV_STABILITY_OPT_IN` | Set to `gen_ai_latest_experimental` to emit the latest experimental GenAI span attribute (`gen_ai.provider.name`) instead of the legacy `gen_ai.system`. GenAI metrics always use bounded, low-cardinality semantic attributes regardless. |
| `OPENCLAW_OTEL_PRELOADED` | Set to `1` when another preload or host process already registered the global OpenTelemetry SDK. The plugin then skips its own NodeSDK lifecycle but still wires diagnostic listeners and honors `traces`/`metrics`/`logs`. |
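
The endpoint precedence in the table can be sketched as a small resolver. This is an illustration under stated assumptions, not the plugin's code: `resolve_endpoint` is a hypothetical helper, and appending `/v1/<signal>` to a bare base URL is assumed from the config reference, where the per-signal URLs are the shared endpoint plus that path.

```python
from typing import Mapping, Optional

SIGNAL_PATHS = {"traces": "/v1/traces", "metrics": "/v1/metrics", "logs": "/v1/logs"}


def resolve_endpoint(signal: str, otel_config: Mapping[str, str],
                     env: Mapping[str, str]) -> Optional[str]:
    """Resolve the OTLP/HTTP URL for one signal using the documented precedence."""
    # 1. Signal-specific config key, e.g. diagnostics.otel.tracesEndpoint.
    specific_config = otel_config.get(f"{signal}Endpoint")
    if specific_config:
        return specific_config
    # 2. Signal-specific env var, e.g. OTEL_EXPORTER_OTLP_TRACES_ENDPOINT.
    specific_env = env.get(f"OTEL_EXPORTER_OTLP_{signal.upper()}_ENDPOINT")
    if specific_env:
        return specific_env
    # 3. Shared endpoint: the env var overrides diagnostics.otel.endpoint.
    shared = env.get("OTEL_EXPORTER_OTLP_ENDPOINT") or otel_config.get("endpoint")
    if not shared:
        return None
    # A shared value that already names a signal path is used as-is.
    if any(path in shared for path in SIGNAL_PATHS.values()):
        return shared
    # Assumption: otherwise the per-signal path is appended to the base URL.
    return shared.rstrip("/") + SIGNAL_PATHS[signal]
```

For example, with only `endpoint: "http://otel-collector:4318"` configured and no environment overrides, the traces URL resolves to `http://otel-collector:4318/v1/traces` under these assumptions.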

Privacy and content capture

Raw model/tool content is not exported by default. Spans carry bounded identifiers (channel, provider, model, error category, hash-only request ids) and never include prompt text, response text, tool inputs, tool outputs, or session keys.

Outbound model requests may include a W3C traceparent header. That header is generated only from OpenClaw-owned diagnostic trace context for the active model call. Existing caller-supplied traceparent headers are replaced, so plugins or custom provider options cannot spoof cross-service trace ancestry.
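
As a rough illustration of the replacement rule, the sketch below builds a version-00 W3C `traceparent` value (`00-<32 hex trace id>-<16 hex span id>-<2 hex flags>`) and overwrites whatever the caller supplied. The helper names are hypothetical and this is not OpenClaw's implementation; only the header format comes from the W3C Trace Context specification.

```python
import re
from typing import Dict

# Handy for validating a version-00 traceparent value.
TRACEPARENT_RE = re.compile(r"^00-([0-9a-f]{32})-([0-9a-f]{16})-([0-9a-f]{2})$")


def format_traceparent(trace_id: int, span_id: int, sampled: bool) -> str:
    """Build a version-00 W3C traceparent value."""
    # All-zero ids are invalid in the W3C format.
    if not (0 < trace_id < 2**128 and 0 < span_id < 2**64):
        raise ValueError("trace_id and span_id must be non-zero and in range")
    flags = 1 if sampled else 0
    return f"00-{trace_id:032x}-{span_id:016x}-{flags:02x}"


def with_trusted_traceparent(headers: Dict[str, str], trace_id: int,
                             span_id: int, sampled: bool) -> Dict[str, str]:
    """Return a copy of headers whose traceparent comes only from trusted context."""
    # Drop any caller-supplied value, whatever its casing, before setting ours.
    cleaned = {k: v for k, v in headers.items() if k.lower() != "traceparent"}
    cleaned["traceparent"] = format_traceparent(trace_id, span_id, sampled)
    return cleaned
```

Removing the header case-insensitively matters because HTTP header names are case-insensitive, so a `Traceparent` entry would otherwise survive next to the trusted one.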

Set diagnostics.otel.captureContent.* to true only when your collector and retention policy are approved for prompt, response, tool, or system-prompt text. Each subkey is opt-in independently:

inputMessages — user prompt content.
outputMessages — model response content.
toolInputs — tool argument payloads.
toolOutputs — tool result payloads.
systemPrompt — assembled system/developer prompt.

When any subkey is enabled, model and tool spans get bounded, redacted openclaw.content.* attributes for that class only.
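
A minimal sketch of per-class gating, under several assumptions that the text above does not confirm: `captureContent.enabled` is treated as a master switch, the attribute key is formed as `openclaw.content.<subkey>`, the bound is 256 characters, and redaction is a simple pattern match. `content_attributes` is a hypothetical helper.

```python
import re
from typing import Dict, Mapping

MAX_CHARS = 256  # hypothetical bound; the real limit is not documented here
# Hypothetical redaction rule: mask values that follow secret-looking keys.
SECRET_RE = re.compile(r"(?i)(api[_-]?key|token|password)\s*[=:]\s*\S+")


def content_attributes(capture: Mapping[str, bool],
                       content: Mapping[str, str]) -> Dict[str, str]:
    """Return span attributes only for the content classes that were opted into."""
    attrs: Dict[str, str] = {}
    # Assumption: `enabled` acts as a master switch on top of the subkeys.
    if not capture.get("enabled", False):
        return attrs
    for content_class, text in content.items():
        if not capture.get(content_class, False):
            continue  # class not opted in, so nothing is attached for it
        redacted = SECRET_RE.sub(r"\1=[REDACTED]", text)
        # Hypothetical attribute key layout under the openclaw.content.* prefix.
        attrs[f"openclaw.content.{content_class}"] = redacted[:MAX_CHARS]
    return attrs
```

With `toolInputs` enabled and `inputMessages` disabled, only the tool-input attribute is produced, which is the "that class only" behavior described above.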

Sampling and flushing

Traces: diagnostics.otel.sampleRate (root-span only, 0.0 drops all, 1.0 keeps all).
Metrics: diagnostics.otel.flushIntervalMs (minimum 1000).
Logs: OTLP logs respect logging.level (file log level). Console redaction does not apply to OTLP logs. High-volume installs should prefer OTLP collector sampling/filtering over local sampling.
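
To make the two knobs concrete, here is a hedged sketch. It assumes a trace-id-ratio style decision for root spans (the approach used by OpenTelemetry's ratio samplers) and assumes that values below the minimum flush interval are clamped rather than rejected; neither detail is confirmed by this page, and both function names are hypothetical.

```python
def keep_root_span(trace_id: int, sample_rate: float) -> bool:
    """Ratio-style decision for a root span, deterministic per trace id."""
    if not 0.0 <= sample_rate <= 1.0:
        raise ValueError("sample_rate must be within 0.0..1.0")
    # Compare the low 64 bits of the trace id against a rate-scaled bound.
    bound = round(sample_rate * (1 << 64))
    return (trace_id & ((1 << 64) - 1)) < bound


def flush_interval_ms(configured: int) -> int:
    """Assumption: intervals below the documented 1000 ms minimum are clamped."""
    return max(1000, configured)
```

Because the decision depends only on the trace id, every process that sees the same trace reaches the same keep/drop answer for a given rate.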

Exported metrics

Model usage

openclaw.tokens (counter, attrs: openclaw.token, openclaw.channel, openclaw.provider, openclaw.model, openclaw.agent)
openclaw.cost.usd (counter, attrs: openclaw.channel, openclaw.provider, openclaw.model)
openclaw.run.duration_ms (histogram, attrs: openclaw.channel, openclaw.provider, openclaw.model)
openclaw.context.tokens (histogram, attrs: openclaw.context, openclaw.channel, openclaw.provider, openclaw.model)
gen_ai.client.token.usage (histogram, GenAI semantic-conventions metric, attrs: gen_ai.token.type = input/output, gen_ai.provider.name, gen_ai.operation.name, gen_ai.request.model)
gen_ai.client.operation.duration (histogram, seconds, GenAI semantic-conventions metric, attrs: gen_ai.provider.name, gen_ai.operation.name, gen_ai.request.model, optional error.type)
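
When building a dashboard from `openclaw.tokens`, the attributes above are the grouping keys. The sketch below sums counter points per provider and model. The data-point shape is hypothetical, and it assumes `openclaw.token` can carry a `total` value alongside the per-type values, in which case that series should be skipped to avoid double counting.

```python
from collections import defaultdict
from typing import Any, Dict, Iterable, Mapping, Tuple


def tokens_by_model(points: Iterable[Mapping[str, Any]]) -> Dict[Tuple[str, str], int]:
    """Sum openclaw.tokens points per (provider, model)."""
    totals: Dict[Tuple[str, str], int] = defaultdict(int)
    for point in points:
        attrs = point["attributes"]
        # Skip a "total" series (if present) so tokens are not counted twice.
        if attrs.get("openclaw.token") == "total":
            continue
        key = (attrs["openclaw.provider"], attrs["openclaw.model"])
        totals[key] += int(point["value"])
    return dict(totals)
```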

Message flow

openclaw.webhook.received (counter, attrs: openclaw.channel, openclaw.webhook)
openclaw.webhook.error (counter, attrs: openclaw.channel, openclaw.webhook)
openclaw.webhook.duration_ms (histogram, attrs: openclaw.channel, openclaw.webhook)
openclaw.message.queued (counter, attrs: openclaw.channel, openclaw.source)
openclaw.message.processed (counter, attrs: openclaw.channel, openclaw.outcome)
openclaw.message.duration_ms (histogram, attrs: openclaw.channel, openclaw.outcome)
openclaw.message.delivery.started (counter, attrs: openclaw.channel, openclaw.delivery.kind)
openclaw.message.delivery.duration_ms (histogram, attrs: openclaw.channel, openclaw.delivery.kind, openclaw.outcome, openclaw.errorCategory)

Queues and sessions

openclaw.queue.lane.enqueue (counter, attrs: openclaw.lane)
openclaw.queue.lane.dequeue (counter, attrs: openclaw.lane)
openclaw.queue.depth (histogram, attrs: openclaw.lane or openclaw.channel=heartbeat)
openclaw.queue.wait_ms (histogram, attrs: openclaw.lane)
openclaw.session.state (counter, attrs: openclaw.state, openclaw.reason)
openclaw.session.stuck (counter, attrs: openclaw.state)
openclaw.session.stuck_age_ms (histogram, attrs: openclaw.state)
openclaw.run.attempt (counter, attrs: openclaw.attempt)

Harness lifecycle

openclaw.harness.duration_ms (histogram, attrs: openclaw.harness.id, openclaw.harness.plugin, openclaw.outcome, openclaw.harness.phase on errors)

Exec

openclaw.exec.duration_ms (histogram, attrs: openclaw.exec.target, openclaw.exec.mode, openclaw.outcome, openclaw.failureKind)

Diagnostics internals (memory and tool loop)

openclaw.memory.heap_used_bytes (histogram, attrs: openclaw.memory.kind)
openclaw.memory.rss_bytes (histogram)
openclaw.memory.pressure (counter, attrs: openclaw.memory.level)
openclaw.tool.loop.iterations (counter, attrs: openclaw.toolName, openclaw.outcome)
openclaw.tool.loop.duration_ms (histogram, attrs: openclaw.toolName, openclaw.outcome)

Exported spans

openclaw.model.usage
- openclaw.channel, openclaw.provider, openclaw.model
- openclaw.tokens.* (input/output/cache_read/cache_write/total)
- gen_ai.system by default, or gen_ai.provider.name when the latest GenAI semantic conventions are opted in
- gen_ai.request.model, gen_ai.operation.name, gen_ai.usage.*
openclaw.run
- openclaw.outcome, openclaw.channel, openclaw.provider, openclaw.model, openclaw.errorCategory
openclaw.model.call
- gen_ai.system by default, or gen_ai.provider.name when the latest GenAI semantic conventions are opted in
- gen_ai.request.model, gen_ai.operation.name, openclaw.provider, openclaw.model, openclaw.api, openclaw.transport
- openclaw.provider.request_id_hash (bounded SHA-based hash of the upstream provider request id; raw ids are not exported)
openclaw.harness.run
- openclaw.harness.id, openclaw.harness.plugin, openclaw.outcome, openclaw.provider, openclaw.model, openclaw.channel
- On completion: openclaw.harness.result_classification, openclaw.harness.yield_detected, openclaw.harness.items.started, openclaw.harness.items.completed, openclaw.harness.items.active
- On error: openclaw.harness.phase, openclaw.errorCategory, optional openclaw.harness.cleanup_failed
openclaw.tool.execution
- gen_ai.tool.name, openclaw.toolName, openclaw.errorCategory, openclaw.tool.params.*
openclaw.exec
- openclaw.exec.target, openclaw.exec.mode, openclaw.outcome, openclaw.failureKind, openclaw.exec.command_length, openclaw.exec.exit_code, openclaw.exec.timed_out
openclaw.webhook.processed
- openclaw.channel, openclaw.webhook, openclaw.chatId
openclaw.webhook.error
- openclaw.channel, openclaw.webhook, openclaw.chatId, openclaw.error
openclaw.message.processed
- openclaw.channel, openclaw.outcome, openclaw.chatId, openclaw.messageId, openclaw.reason
openclaw.message.delivery
- openclaw.channel, openclaw.delivery.kind, openclaw.outcome, openclaw.errorCategory, openclaw.delivery.result_count
openclaw.session.stuck
- openclaw.state, openclaw.ageMs, openclaw.queueDepth
openclaw.context.assembled
- openclaw.prompt.size, openclaw.history.size, openclaw.context.tokens, openclaw.errorCategory (no prompt, history, response, or session-key content)
openclaw.tool.loop
- openclaw.toolName, openclaw.outcome, openclaw.iterations, openclaw.errorCategory (no loop messages, params, or tool output)
openclaw.memory.pressure
- openclaw.memory.level, openclaw.memory.heap_used_bytes, openclaw.memory.rss_bytes
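
The `openclaw.provider.request_id_hash` attribute on `openclaw.model.call` is described only as a bounded SHA-based hash. One way such a value could be derived is sketched below; the choice of SHA-256 and a 16-character prefix are assumptions, and `request_id_hash` is a hypothetical helper.

```python
import hashlib


def request_id_hash(raw_request_id: str, length: int = 16) -> str:
    """Return a fixed-length SHA-256 prefix so the raw id never leaves the process."""
    digest = hashlib.sha256(raw_request_id.encode("utf-8")).hexdigest()
    # Truncating keeps the attribute bounded while staying stable per request id.
    return digest[:length]
```

The same upstream id always maps to the same hash, so spans can still be correlated without exporting the raw id.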

When content capture is explicitly enabled, model and tool spans can also include bounded, redacted openclaw.content.* attributes for the specific content classes you opted into.

Diagnostic event catalog

The events below back the metrics and spans above. Plugins can also subscribe to them directly without OTLP export.

Model usage

model.usage — tokens, cost, duration, context, provider/model/channel, session ids. usage is provider/turn accounting for cost and telemetry; context.used is the current prompt/context snapshot and can be lower than provider usage.total when cached input or tool-loop calls are involved.
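
A worked example with made-up numbers shows why `context.used` can be lower than `usage.total`. It assumes a single turn that makes two provider calls in a tool loop, and it treats the context snapshot as the latest call's input plus output; that formula is an illustration, not the documented computation.

```python
from typing import Dict, List

# Hypothetical per-call token counts for one turn with a tool loop.
calls: List[Dict[str, int]] = [
    {"input": 1200, "output": 150},  # first model call
    {"input": 1400, "output": 200},  # second call resends history plus the tool result
]


def usage_total(turn_calls: List[Dict[str, int]]) -> int:
    """Provider accounting: every call in the turn is billed and summed."""
    return sum(c["input"] + c["output"] for c in turn_calls)


def context_used(turn_calls: List[Dict[str, int]]) -> int:
    """Assumed snapshot: only what the latest call holds in context."""
    last = turn_calls[-1]
    return last["input"] + last["output"]
```

Here the provider accounting sums to 2950 tokens while the snapshot is 1600, because the first call's prompt is billed again when it is resent in the second call.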

Message flow

webhook.received / webhook.processed / webhook.error
message.queued / message.processed
message.delivery.started / message.delivery.completed / message.delivery.error

Queue and session

queue.lane.enqueue / queue.lane.dequeue
session.state / session.stuck
run.attempt
diagnostic.heartbeat (aggregate counters: webhooks/queue/session)

Harness lifecycle

harness.run.started / harness.run.completed / harness.run.error — per-run lifecycle for the agent harness. Includes harnessId, optional pluginId, provider/model/channel, and run id. Completion adds durationMs, outcome, optional resultClassification, yieldDetected, and itemLifecycle counts. Errors add phase (prepare/start/send/resolve/cleanup), errorCategory, and optional cleanupFailed.

Exec

exec.process.completed — terminal outcome, duration, target, mode, exit code, and failure kind. Command text and working directories are not included.

Without an exporter

You can keep diagnostics events available to plugins or custom sinks without running diagnostics-otel:

```json5
{
  diagnostics: { enabled: true },
}
```

For targeted debug output without raising logging.level, use diagnostics flags. Flags are case-insensitive and support wildcards (e.g. telegram.* or *):

```json5
{
  diagnostics: { flags: ["telegram.http"] },
}
```

Or as a one-off env override:

```bash
OPENCLAW_DIAGNOSTICS=telegram.http,telegram.payload openclaw gateway
```

Flag output goes to the standard log file (logging.file) and is still redacted by logging.redactSensitive. Full guide: Diagnostics flags.
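
The matching rules for flags (case-insensitive, `*` wildcards, comma-separated in the env override) can be sketched as follows. Both helpers are hypothetical and use Python's `fnmatch`, which also accepts `?` and `[...]` patterns; whether OpenClaw supports anything beyond `*` is not stated here.

```python
from fnmatch import fnmatchcase
from typing import Iterable, List


def parse_flags(raw: str) -> List[str]:
    """Split a comma-separated OPENCLAW_DIAGNOSTICS value into lowercase flags."""
    return [part.strip().lower() for part in raw.split(",") if part.strip()]


def flag_enabled(name: str, flags: Iterable[str]) -> bool:
    """Case-insensitive match of a flag name against patterns with * wildcards."""
    lowered = name.lower()
    # Lowercasing both sides gives case-insensitive matching on every platform.
    return any(fnmatchcase(lowered, pattern.lower()) for pattern in flags)
```

For instance, `flag_enabled("telegram.payload", ["telegram.*"])` is true, and a lone `*` pattern enables every flag.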

Disable

```json5
{
  diagnostics: { otel: { enabled: false } },
}
```

You can also leave diagnostics-otel out of plugins.allow, or run openclaw plugins disable diagnostics-otel.

Related

Logging: file logs, console output, CLI tailing, and the Control UI Logs tab
Gateway logging internals: WS log styles, subsystem prefixes, and console capture
Diagnostics flags: targeted debug-log flags
Diagnostics export: operator support-bundle tool (separate from OTEL export)
Configuration reference: full diagnostics.* field reference
