| summary | title | read_when |
|---|---|---|
| Export OpenClaw diagnostics to any OpenTelemetry collector via the diagnostics-otel plugin (OTLP/HTTP) | OpenTelemetry export | |
OpenClaw exports diagnostics through the bundled diagnostics-otel plugin
using OTLP/HTTP (protobuf). Any collector or backend that accepts OTLP/HTTP
works without code changes. For local file logs and how to read them, see
Logging.
How it fits together
- Diagnostics events are structured, in-process records emitted by the Gateway and bundled plugins for model runs, message flow, sessions, queues, and exec.
- The diagnostics-otel plugin subscribes to those events and exports them as OpenTelemetry metrics, traces, and logs over OTLP/HTTP.
- Provider calls receive a W3C traceparent header from OpenClaw's trusted model-call span context when the provider transport accepts custom headers. Plugin-emitted trace context is not propagated.
- Exporters only attach when both the diagnostics surface and the plugin are enabled, so the in-process cost stays near zero by default.
Quick start
{
  plugins: {
    allow: ["diagnostics-otel"],
    entries: {
      "diagnostics-otel": { enabled: true },
    },
  },
  diagnostics: {
    enabled: true,
    otel: {
      enabled: true,
      endpoint: "http://otel-collector:4318",
      protocol: "http/protobuf",
      serviceName: "openclaw-gateway",
      traces: true,
      metrics: true,
      logs: true,
      sampleRate: 0.2,
      flushIntervalMs: 60000,
    },
  },
}
You can also enable the plugin from the CLI:
openclaw plugins enable diagnostics-otel
Signals exported
| Signal | What goes in it |
|---|---|
| Metrics | Counters and histograms for token usage, cost, run duration, message flow, queue lanes, session state, exec, and memory pressure. |
| Traces | Spans for model usage, model calls, harness lifecycle, tool execution, exec, webhook/message processing, context assembly, and tool loops. |
| Logs | Structured logging.file records exported over OTLP when diagnostics.otel.logs is enabled. |
Toggle traces, metrics, and logs independently. All three default to on
when diagnostics.otel.enabled is true.
Configuration reference
{
  diagnostics: {
    enabled: true,
    otel: {
      enabled: true,
      endpoint: "http://otel-collector:4318",
      tracesEndpoint: "http://otel-collector:4318/v1/traces",
      metricsEndpoint: "http://otel-collector:4318/v1/metrics",
      logsEndpoint: "http://otel-collector:4318/v1/logs",
      protocol: "http/protobuf", // grpc is ignored
      serviceName: "openclaw-gateway",
      headers: { "x-collector-token": "..." },
      traces: true,
      metrics: true,
      logs: true,
      sampleRate: 0.2, // root-span sampler, 0.0..1.0
      flushIntervalMs: 60000, // metric export interval (min 1000ms)
      captureContent: {
        enabled: false,
        inputMessages: false,
        outputMessages: false,
        toolInputs: false,
        toolOutputs: false,
        systemPrompt: false,
      },
    },
  },
}
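The bounds noted in the comments above (sampleRate between 0.0 and 1.0, flushIntervalMs of at least 1000, and http/protobuf as the only honored protocol) can be checked before deploying a config. The following is a minimal operator-side sketch, not part of OpenClaw: the function name check_otel_config is hypothetical, it takes the diagnostics.otel object as an already-parsed dict, and it only reports problems because this page does not say whether OpenClaw clamps or rejects out-of-range values.

```python
from typing import Any, Dict, List


def check_otel_config(otel: Dict[str, Any]) -> List[str]:
    """Return human-readable problems found in a diagnostics.otel mapping.

    Hypothetical preflight helper: it mirrors only the limits stated in the
    configuration reference and makes no claim about OpenClaw's own validation.
    """
    problems: List[str] = []

    rate = otel.get("sampleRate")
    if rate is not None and not (0.0 <= rate <= 1.0):
        problems.append(f"sampleRate {rate} is outside 0.0..1.0")

    interval = otel.get("flushIntervalMs")
    if interval is not None and interval < 1000:
        problems.append(f"flushIntervalMs {interval} is below the 1000 ms minimum")

    protocol = otel.get("protocol")
    if protocol is not None and protocol != "http/protobuf":
        problems.append(f"protocol {protocol!r} is not honored; use http/protobuf")

    return problems


if __name__ == "__main__":
    for problem in check_otel_config(
        {"sampleRate": 1.5, "flushIntervalMs": 500, "protocol": "grpc"}
    ):
        print(problem)
```

An empty list means the three checked fields are within the documented bounds; other keys are not inspected.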
Environment variables
| Variable | Purpose |
|---|---|
| OTEL_EXPORTER_OTLP_ENDPOINT | Override diagnostics.otel.endpoint. If the value already contains /v1/traces, /v1/metrics, or /v1/logs, it is used as-is. |
| OTEL_EXPORTER_OTLP_TRACES_ENDPOINT / OTEL_EXPORTER_OTLP_METRICS_ENDPOINT / OTEL_EXPORTER_OTLP_LOGS_ENDPOINT | Signal-specific endpoint overrides used when the matching diagnostics.otel.*Endpoint config key is unset. Signal-specific config wins over signal-specific env, which wins over the shared endpoint. |
| OTEL_SERVICE_NAME | Override diagnostics.otel.serviceName. |
| OTEL_EXPORTER_OTLP_PROTOCOL | Override the wire protocol (only http/protobuf is honored today). |
| OTEL_SEMCONV_STABILITY_OPT_IN | Set to gen_ai_latest_experimental to emit the latest experimental GenAI span attribute (gen_ai.provider.name) instead of the legacy gen_ai.system. GenAI metrics always use bounded, low-cardinality semantic attributes regardless. |
| OPENCLAW_OTEL_PRELOADED | Set to 1 when another preload or host process already registered the global OpenTelemetry SDK. The plugin then skips its own NodeSDK lifecycle but still wires diagnostic listeners and honors traces/metrics/logs. |
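The endpoint precedence described in the table can be sketched as a small resolver. This is an illustration under stated assumptions, not the plugin's code: resolve_endpoint is a hypothetical name, the config is passed as a plain dict, and appending the signal path to a shared endpoint that lacks one is assumed from common OTLP/HTTP exporter behavior rather than stated on this page.

```python
from typing import Mapping, Optional

# OTLP/HTTP signal paths named in the table above.
SIGNAL_PATHS = {"traces": "/v1/traces", "metrics": "/v1/metrics", "logs": "/v1/logs"}


def resolve_endpoint(
    signal: str,
    otel_config: Mapping[str, str],
    env: Mapping[str, str],
) -> Optional[str]:
    """Pick the export URL for one signal ("traces", "metrics", or "logs")."""
    # 1. Signal-specific config key, e.g. diagnostics.otel.tracesEndpoint.
    from_config = otel_config.get(f"{signal}Endpoint")
    if from_config:
        return from_config

    # 2. Signal-specific environment variable.
    from_env = env.get(f"OTEL_EXPORTER_OTLP_{signal.upper()}_ENDPOINT")
    if from_env:
        return from_env

    # 3. Shared endpoint; the environment variable overrides the config value.
    shared = env.get("OTEL_EXPORTER_OTLP_ENDPOINT") or otel_config.get("endpoint")
    if not shared:
        return None

    # A shared value that already carries a signal path is used as-is.
    if any(path in shared for path in SIGNAL_PATHS.values()):
        return shared

    # Assumption: otherwise the signal path is appended to the base URL.
    return shared.rstrip("/") + SIGNAL_PATHS[signal]
```

For example, with endpoint set to http://otel-collector:4318 and no other overrides, resolve_endpoint("metrics", ...) yields http://otel-collector:4318/v1/metrics under these assumptions.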
Privacy and content capture
Raw model/tool content is not exported by default. Spans carry bounded identifiers (channel, provider, model, error category, hash-only request ids) and never include prompt text, response text, tool inputs, tool outputs, or session keys.
Outbound model requests may include a W3C traceparent header. That header is
generated only from OpenClaw-owned diagnostic trace context for the active model
call. Existing caller-supplied traceparent headers are replaced, so plugins or
custom provider options cannot spoof cross-service trace ancestry.
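To make the replacement behavior concrete, here is a minimal sketch of building a W3C Trace Context traceparent value (version 00) and overwriting any caller-supplied copy. The helper names and the dict-based header handling are hypothetical; only the header format comes from the W3C specification, and the trace and span ids in the usage note are illustrative values.

```python
import re
from typing import Dict

_TRACE_ID = re.compile(r"[0-9a-f]{32}")
_SPAN_ID = re.compile(r"[0-9a-f]{16}")


def format_traceparent(trace_id: str, span_id: str, sampled: bool) -> str:
    """Build a version-00 traceparent: 00-<trace-id>-<parent-id>-<flags>."""
    if not _TRACE_ID.fullmatch(trace_id) or trace_id == "0" * 32:
        raise ValueError("trace_id must be 32 lowercase hex chars and not all zeros")
    if not _SPAN_ID.fullmatch(span_id) or span_id == "0" * 16:
        raise ValueError("span_id must be 16 lowercase hex chars and not all zeros")
    flags = "01" if sampled else "00"
    return f"00-{trace_id}-{span_id}-{flags}"


def with_trusted_traceparent(
    headers: Dict[str, str], trace_id: str, span_id: str, sampled: bool
) -> Dict[str, str]:
    """Return a copy of headers whose traceparent comes only from trusted context.

    Any existing traceparent header, in any letter case, is dropped first, so a
    caller-supplied value cannot survive into the outbound request.
    """
    cleaned = {k: v for k, v in headers.items() if k.lower() != "traceparent"}
    cleaned["traceparent"] = format_traceparent(trace_id, span_id, sampled)
    return cleaned
```

Calling with_trusted_traceparent({"TraceParent": "spoofed"}, "4bf92f3577b34da6a3ce929d0e0e4736", "00f067aa0ba902b7", True) returns a single traceparent entry built from the supplied ids.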
Set diagnostics.otel.captureContent.* to true only when your collector and
retention policy are approved for prompt, response, tool, or system-prompt
text. Each subkey is opt-in independently:
- inputMessages: user prompt content.
- outputMessages: model response content.
- toolInputs: tool argument payloads.
- toolOutputs: tool result payloads.
- systemPrompt: assembled system/developer prompt.
When any subkey is enabled, model and tool spans get bounded, redacted
openclaw.content.* attributes for that class only.
Sampling and flushing
- Traces: diagnostics.otel.sampleRate (root-span only, 0.0 drops all, 1.0 keeps all).
- Metrics: diagnostics.otel.flushIntervalMs (minimum 1000).
- Logs: OTLP logs respect logging.level (file log level). They use the diagnostic log-record redaction path, not console formatting. High-volume installs should prefer OTLP collector sampling/filtering over local sampling.
- File-log correlation: JSONL file logs include top-level traceId, spanId, parentSpanId, and traceFlags when the log call carries a valid diagnostic trace context, which lets log processors join local log lines with exported spans.
- Request correlation: Gateway HTTP requests and WebSocket frames create an internal request trace scope. Logs and diagnostic events inside that scope inherit the request trace by default, while agent run and model-call spans are created as children so provider traceparent headers stay on the same trace.
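The file-log correlation described above can be used from a small log processor. The sketch below groups JSONL log lines by their top-level traceId so they can be matched against exported spans. The function name is hypothetical, and the only field it relies on is traceId as named above; the msg field in the sample lines is invented for illustration.

```python
import json
from collections import defaultdict
from typing import Dict, Iterable, List


def group_log_lines_by_trace(lines: Iterable[str]) -> Dict[str, List[dict]]:
    """Group JSONL log records by top-level traceId.

    Lines that are blank, not valid JSON, not objects, or lacking a traceId
    are skipped, since only log calls with trace context carry the field.
    """
    grouped: Dict[str, List[dict]] = defaultdict(list)
    for raw in lines:
        raw = raw.strip()
        if not raw:
            continue
        try:
            record = json.loads(raw)
        except json.JSONDecodeError:
            continue
        if not isinstance(record, dict):
            continue
        trace_id = record.get("traceId")
        if isinstance(trace_id, str) and trace_id:
            grouped[trace_id].append(record)
    return dict(grouped)


if __name__ == "__main__":
    sample = [
        '{"traceId": "4bf92f3577b34da6a3ce929d0e0e4736", "spanId": "00f067aa0ba902b7", "msg": "a"}',
        '{"msg": "no trace context"}',
        '{"traceId": "4bf92f3577b34da6a3ce929d0e0e4736", "spanId": "b7ad6b7169203331", "msg": "b"}',
    ]
    for trace_id, records in group_log_lines_by_trace(sample).items():
        print(trace_id, len(records))
```

In practice the lines would come from iterating over the open log file rather than a literal list.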
Exported metrics
Model usage
- openclaw.tokens (counter, attrs: openclaw.token, openclaw.channel, openclaw.provider, openclaw.model, openclaw.agent)
- openclaw.cost.usd (counter, attrs: openclaw.channel, openclaw.provider, openclaw.model)
- openclaw.run.duration_ms (histogram, attrs: openclaw.channel, openclaw.provider, openclaw.model)
- openclaw.context.tokens (histogram, attrs: openclaw.context, openclaw.channel, openclaw.provider, openclaw.model)
- gen_ai.client.token.usage (histogram, GenAI semantic-conventions metric, attrs: gen_ai.token.type = input/output, gen_ai.provider.name, gen_ai.operation.name, gen_ai.request.model)
- gen_ai.client.operation.duration (histogram, seconds, GenAI semantic-conventions metric, attrs: gen_ai.provider.name, gen_ai.operation.name, gen_ai.request.model, optional error.type)
- openclaw.model_call.duration_ms (histogram, attrs: openclaw.provider, openclaw.model, openclaw.api, openclaw.transport, plus openclaw.errorCategory and openclaw.failureKind on classified errors)
- openclaw.model_call.request_bytes (histogram, UTF-8 byte size of the final model request payload; no raw payload content)
- openclaw.model_call.response_bytes (histogram, UTF-8 byte size of streamed model response events; no raw response content)
- openclaw.model_call.time_to_first_byte_ms (histogram, elapsed time before the first streamed response event)
Message flow
- openclaw.webhook.received (counter, attrs: openclaw.channel, openclaw.webhook)
- openclaw.webhook.error (counter, attrs: openclaw.channel, openclaw.webhook)
- openclaw.webhook.duration_ms (histogram, attrs: openclaw.channel, openclaw.webhook)
- openclaw.message.queued (counter, attrs: openclaw.channel, openclaw.source)
- openclaw.message.processed (counter, attrs: openclaw.channel, openclaw.outcome)
- openclaw.message.duration_ms (histogram, attrs: openclaw.channel, openclaw.outcome)
- openclaw.message.delivery.started (counter, attrs: openclaw.channel, openclaw.delivery.kind)
- openclaw.message.delivery.duration_ms (histogram, attrs: openclaw.channel, openclaw.delivery.kind, openclaw.outcome, openclaw.errorCategory)
Queues and sessions
- openclaw.queue.lane.enqueue (counter, attrs: openclaw.lane)
- openclaw.queue.lane.dequeue (counter, attrs: openclaw.lane)
- openclaw.queue.depth (histogram, attrs: openclaw.lane or openclaw.channel=heartbeat)
- openclaw.queue.wait_ms (histogram, attrs: openclaw.lane)
- openclaw.session.state (counter, attrs: openclaw.state, openclaw.reason)
- openclaw.session.stuck (counter, attrs: openclaw.state; emitted only for stale session bookkeeping with no active work)
- openclaw.session.stuck_age_ms (histogram, attrs: openclaw.state; emitted only for stale session bookkeeping with no active work)
- openclaw.run.attempt (counter, attrs: openclaw.attempt)
Session liveness telemetry
diagnostics.stuckSessionWarnMs is the no-progress age threshold for session
liveness diagnostics. A processing session does not age toward this threshold
while OpenClaw observes reply, tool, status, block, or ACP runtime progress.
Typing keepalives are not counted as progress, so a silent model or harness can
still be detected.
OpenClaw classifies sessions by the work it can still observe:
- session.long_running: active embedded work, model calls, or tool calls are still making progress.
- session.stalled: active work exists, but the active run has not reported recent progress.
- session.stuck: stale session bookkeeping with no active work. This is the only liveness classification that releases the affected session lane.
Only session.stuck emits the openclaw.session.stuck counter, the
openclaw.session.stuck_age_ms histogram, and the openclaw.session.stuck
span. Repeated session.stuck diagnostics back off while the session remains
unchanged, so dashboards should alert on sustained increases rather than every
heartbeat tick. For the config knob and defaults, see
Configuration reference.
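The three liveness classifications can be summarized as a small decision function. This is a simplified sketch for reading dashboards, not OpenClaw's implementation: the function and type names are hypothetical, and it assumes the session has already been selected for a liveness diagnostic, with "recent progress" meaning progress observed within diagnostics.stuckSessionWarnMs.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Liveness:
    classification: str   # diagnostic event name
    releases_lane: bool   # whether the session lane is released
    emits_stuck_telemetry: bool  # openclaw.session.stuck counter, histogram, span


def classify_session(has_active_work: bool, has_recent_progress: bool) -> Liveness:
    """Map observable session work onto the documented classifications.

    Assumption: the caller only invokes this for sessions that are already
    candidates for a liveness diagnostic.
    """
    if not has_active_work:
        # Stale bookkeeping with nothing running: the only case that
        # releases the lane and emits the stuck counter, histogram, and span.
        return Liveness("session.stuck", True, True)
    if has_recent_progress:
        return Liveness("session.long_running", False, False)
    return Liveness("session.stalled", False, False)
```

Because typing keepalives do not count as progress, a session that is only sending keepalives would be passed in with has_recent_progress set to False.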
Harness lifecycle
- openclaw.harness.duration_ms (histogram, attrs: openclaw.harness.id, openclaw.harness.plugin, openclaw.outcome, openclaw.harness.phase on errors)
Exec
- openclaw.exec.duration_ms (histogram, attrs: openclaw.exec.target, openclaw.exec.mode, openclaw.outcome, openclaw.failureKind)
Diagnostics internals (memory and tool loop)
- openclaw.memory.heap_used_bytes (histogram, attrs: openclaw.memory.kind)
- openclaw.memory.rss_bytes (histogram)
- openclaw.memory.pressure (counter, attrs: openclaw.memory.level)
- openclaw.tool.loop.iterations (counter, attrs: openclaw.toolName, openclaw.outcome)
- openclaw.tool.loop.duration_ms (histogram, attrs: openclaw.toolName, openclaw.outcome)
Exported spans
- openclaw.model.usage
  - openclaw.channel, openclaw.provider, openclaw.model
  - openclaw.tokens.* (input/output/cache_read/cache_write/total)
  - gen_ai.system by default, or gen_ai.provider.name when the latest GenAI semantic conventions are opted in
  - gen_ai.request.model, gen_ai.operation.name, gen_ai.usage.*
- openclaw.run
  - openclaw.outcome, openclaw.channel, openclaw.provider, openclaw.model, openclaw.errorCategory
- openclaw.model.call
  - gen_ai.system by default, or gen_ai.provider.name when the latest GenAI semantic conventions are opted in
  - gen_ai.request.model, gen_ai.operation.name, openclaw.provider, openclaw.model, openclaw.api, openclaw.transport
  - openclaw.errorCategory and optional openclaw.failureKind on errors
  - openclaw.model_call.request_bytes, openclaw.model_call.response_bytes, openclaw.model_call.time_to_first_byte_ms
  - openclaw.provider.request_id_hash (bounded SHA-based hash of the upstream provider request id; raw ids are not exported)
- openclaw.harness.run
  - openclaw.harness.id, openclaw.harness.plugin, openclaw.outcome, openclaw.provider, openclaw.model, openclaw.channel
  - On completion: openclaw.harness.result_classification, openclaw.harness.yield_detected, openclaw.harness.items.started, openclaw.harness.items.completed, openclaw.harness.items.active
  - On error: openclaw.harness.phase, openclaw.errorCategory, optional openclaw.harness.cleanup_failed
- openclaw.tool.execution
  - gen_ai.tool.name, openclaw.toolName, openclaw.errorCategory, openclaw.tool.params.*
- openclaw.exec
  - openclaw.exec.target, openclaw.exec.mode, openclaw.outcome, openclaw.failureKind, openclaw.exec.command_length, openclaw.exec.exit_code, openclaw.exec.timed_out
- openclaw.webhook.processed
  - openclaw.channel, openclaw.webhook, openclaw.chatId
- openclaw.webhook.error
  - openclaw.channel, openclaw.webhook, openclaw.chatId, openclaw.error
- openclaw.message.processed
  - openclaw.channel, openclaw.outcome, openclaw.chatId, openclaw.messageId, openclaw.reason
- openclaw.message.delivery
  - openclaw.channel, openclaw.delivery.kind, openclaw.outcome, openclaw.errorCategory, openclaw.delivery.result_count
- openclaw.session.stuck
  - openclaw.state, openclaw.ageMs, openclaw.queueDepth
- openclaw.context.assembled
  - openclaw.prompt.size, openclaw.history.size, openclaw.context.tokens, openclaw.errorCategory (no prompt, history, response, or session-key content)
- openclaw.tool.loop
  - openclaw.toolName, openclaw.outcome, openclaw.iterations, openclaw.errorCategory (no loop messages, params, or tool output)
- openclaw.memory.pressure
  - openclaw.memory.level, openclaw.memory.heap_used_bytes, openclaw.memory.rss_bytes
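The openclaw.provider.request_id_hash attribute on model-call spans is described only as a bounded SHA-based hash. As a rough illustration of that idea, the sketch below hashes an id with SHA-256 and keeps a fixed-length prefix. The choice of SHA-256, the 16-character length, and the function name are all assumptions; the plugin's actual algorithm and length are not documented on this page.

```python
import hashlib


def hash_request_id(request_id: str, length: int = 16) -> str:
    """Return a fixed-length hex prefix of the SHA-256 digest of request_id.

    Assumed parameters: SHA-256 and a 16-character bound are illustrative
    choices, not the documented behavior of diagnostics-otel.
    """
    if not 1 <= length <= 64:
        raise ValueError("length must be between 1 and 64 hex characters")
    digest = hashlib.sha256(request_id.encode("utf-8")).hexdigest()
    return digest[:length]
```

The same input always yields the same value, so spans for one upstream request can still be correlated without exporting the raw id.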
When content capture is explicitly enabled, model and tool spans can also
include bounded, redacted openclaw.content.* attributes for the specific
content classes you opted into.
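The choice between gen_ai.system and gen_ai.provider.name on model spans depends on OTEL_SEMCONV_STABILITY_OPT_IN. A minimal sketch of that switch follows. The function name is hypothetical, and treating the variable as a comma-separated list follows the general OpenTelemetry convention for this variable; how the plugin itself parses it is an assumption.

```python
from typing import Dict, Mapping


def provider_span_attribute(provider: str, env: Mapping[str, str]) -> Dict[str, str]:
    """Return the provider attribute under the legacy or the opted-in key."""
    raw = env.get("OTEL_SEMCONV_STABILITY_OPT_IN", "")
    # Assumption: the value may list several opt-ins separated by commas.
    opt_ins = {part.strip() for part in raw.split(",") if part.strip()}
    if "gen_ai_latest_experimental" in opt_ins:
        return {"gen_ai.provider.name": provider}
    return {"gen_ai.system": provider}
```

With the variable unset, the result uses gen_ai.system; with gen_ai_latest_experimental present, it uses gen_ai.provider.name.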
Diagnostic event catalog
The events below back the metrics and spans above. Plugins can also subscribe to them directly without OTLP export.
Model usage
- model.usage: tokens, cost, duration, context, provider/model/channel, session ids. usage is provider/turn accounting for cost and telemetry; context.used is the current prompt/context snapshot and can be lower than provider usage.total when cached input or tool-loop calls are involved.
Message flow
- webhook.received / webhook.processed / webhook.error
- message.queued / message.processed
- message.delivery.started / message.delivery.completed / message.delivery.error
Queue and session
- queue.lane.enqueue / queue.lane.dequeue
- session.state / session.long_running / session.stalled / session.stuck
- run.attempt / run.progress
- diagnostic.heartbeat (aggregate counters: webhooks/queue/session)
Harness lifecycle
- harness.run.started / harness.run.completed / harness.run.error: per-run lifecycle for the agent harness. Includes harnessId, optional pluginId, provider/model/channel, and run id. Completion adds durationMs, outcome, optional resultClassification, yieldDetected, and itemLifecycle counts. Errors add phase (prepare/start/send/resolve/cleanup), errorCategory, and optional cleanupFailed.
Exec
- exec.process.completed: terminal outcome, duration, target, mode, exit code, and failure kind. Command text and working directories are not included.
Without an exporter
You can keep diagnostics events available to plugins or custom sinks without
running diagnostics-otel:
{
  diagnostics: { enabled: true },
}
For targeted debug output without raising logging.level, use diagnostics
flags. Flags are case-insensitive and support wildcards (e.g. telegram.* or
*):
{
  diagnostics: { flags: ["telegram.http"] },
}
Or as a one-off env override:
OPENCLAW_DIAGNOSTICS=telegram.http,telegram.payload openclaw gateway
Flag output goes to the standard log file (logging.file) and is still
redacted by logging.redactSensitive. Full guide:
Diagnostics flags.
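The flag matching rules above (case-insensitive, with wildcards such as telegram.* or *) can be illustrated with a short sketch. It parses a comma-separated value in the style of OPENCLAW_DIAGNOSTICS and tests flag names against it. The function names are hypothetical, and glob-style wildcard semantics via Python's fnmatch are an assumption; the Diagnostics flags guide is the authority on the exact matching rules.

```python
from fnmatch import fnmatchcase
from typing import List


def parse_flags(value: str) -> List[str]:
    """Split a comma-separated flag list and normalize to lowercase."""
    return [part.strip().lower() for part in value.split(",") if part.strip()]


def flag_enabled(name: str, patterns: List[str]) -> bool:
    """Return True when name matches any pattern, ignoring case.

    Assumption: wildcards behave like shell globs, so "telegram.*" matches
    "telegram.http" and "*" matches every flag. Patterns are expected to be
    lowercase already, as produced by parse_flags.
    """
    lowered = name.lower()
    return any(fnmatchcase(lowered, pattern) for pattern in patterns)


if __name__ == "__main__":
    patterns = parse_flags("telegram.http,telegram.payload")
    print(flag_enabled("Telegram.HTTP", patterns))
    print(flag_enabled("telegram.polling", patterns))
```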
Disable
{
  diagnostics: { otel: { enabled: false } },
}
You can also leave diagnostics-otel out of plugins.allow, or run
openclaw plugins disable diagnostics-otel.
Related
- Logging: file logs, console output, CLI tailing, and the Control UI Logs tab
- Gateway logging internals: WS log styles, subsystem prefixes, and console capture
- Diagnostics flags: targeted debug-log flags
- Diagnostics export: operator support-bundle tool (separate from OTEL export)
- Configuration reference: full diagnostics.* field reference