From 64582bb3a73c342b6835d82840512304d566cb91 Mon Sep 17 00:00:00 2001
From: Vincent Koc
Date: Sat, 25 Apr 2026 12:29:54 -0700
Subject: [PATCH] docs(diagnostics-otel): clarify genai semconv exports

---
 CHANGELOG.md                            |  1 +
 docs/gateway/configuration-reference.md |  1 +
 docs/logging.md                         | 22 +++++++++++++++++++---
 3 files changed, 21 insertions(+), 3 deletions(-)

diff --git a/CHANGELOG.md b/CHANGELOG.md
index aaa15ee7854..0ab8bb22a94 100644
--- a/CHANGELOG.md
+++ b/CHANGELOG.md
@@ -43,6 +43,7 @@ Docs: https://docs.openclaw.ai
 - Diagnostics/OTEL: include bounded GenAI operation, provider, and request-model attributes on model-usage spans so token usage remains self-describing without diagnostic identifiers. Thanks @vincentkoc.
 - Diagnostics/OTEL: keep model-usage span GenAI provider attributes aligned with the existing semantic-convention opt-in policy, using legacy `gen_ai.system` unless latest experimental GenAI conventions are enabled. Thanks @vincentkoc.
 - Diagnostics/OTEL: keep `gen_ai.request.model` present on GenAI token usage metrics with a bounded `unknown` fallback when model usage events do not include a model. Thanks @vincentkoc.
+- Docs/OTEL: document the GenAI token and model-call duration metrics, model-usage span attributes, and `OTEL_SEMCONV_STABILITY_OPT_IN=gen_ai_latest_experimental` provider-attribute behavior. Thanks @vincentkoc.
 - Diagnostics/OTEL: add bounded outbound message delivery lifecycle diagnostics and export them as low-cardinality delivery spans/metrics without message body, recipient, room, or media-path data. (#71471) Thanks @vincentkoc and @jlapenna.
 - Diagnostics/OTEL: emit bounded exec-process diagnostics and export them as `openclaw.exec` spans without exposing command text, working directories, or container identifiers. (#71451) Thanks @vincentkoc and @jlapenna.
 - Diagnostics/OTEL: support `OPENCLAW_OTEL_PRELOADED=1` so the plugin can reuse an already-registered OpenTelemetry SDK while keeping OpenClaw diagnostic listeners wired. (#71450) Thanks @vincentkoc and @jlapenna.
diff --git a/docs/gateway/configuration-reference.md b/docs/gateway/configuration-reference.md
index 9ad7da2e01b..d1eb0fec488 100644
--- a/docs/gateway/configuration-reference.md
+++ b/docs/gateway/configuration-reference.md
@@ -917,6 +917,7 @@ Notes:
 - `otel.sampleRate`: trace sampling rate `0`–`1`.
 - `otel.flushIntervalMs`: periodic telemetry flush interval in ms.
 - `otel.captureContent`: opt-in raw content capture for OTEL span attributes. Defaults to off. Boolean `true` captures non-system message/tool content; the object form lets you enable `inputMessages`, `outputMessages`, `toolInputs`, `toolOutputs`, and `systemPrompt` explicitly.
+- `OTEL_SEMCONV_STABILITY_OPT_IN=gen_ai_latest_experimental`: environment toggle for latest experimental GenAI span provider attributes. By default spans keep the legacy `gen_ai.system` attribute for compatibility; GenAI metrics use bounded semantic attributes.
 - `OPENCLAW_OTEL_PRELOADED=1`: environment toggle for hosts that already registered a global OpenTelemetry SDK. OpenClaw then skips plugin-owned SDK startup/shutdown while keeping diagnostic listeners active.
 - `cacheTrace.enabled`: log cache trace snapshots for embedded runs (default: `false`).
 - `cacheTrace.filePath`: output path for cache trace JSONL (default: `$OPENCLAW_STATE_DIR/logs/cache-trace.jsonl`).
diff --git a/docs/logging.md b/docs/logging.md
index 67f4283661c..6da54994f7b 100644
--- a/docs/logging.md
+++ b/docs/logging.md
@@ -310,7 +310,8 @@ Notes:
 - You can also enable the plugin with `openclaw plugins enable diagnostics-otel`.
 - `protocol` currently supports `http/protobuf` only. `grpc` is ignored.
 - Metrics include token usage, cost, context size, run duration, and message-flow
-  counters/histograms (webhooks, queueing, session state, queue depth/wait).
+  counters/histograms (webhooks, queueing, session state, queue depth/wait),
+  plus GenAI token usage and model-call duration histograms.
 - Traces/metrics can be toggled with `traces` / `metrics` (default: on). Traces
   include model usage spans plus webhook/message processing spans when enabled.
 - Raw model/tool content is not exported by default. Use
@@ -319,6 +320,10 @@ Notes:
 - Set `headers` when your collector requires auth.
 - Environment variables supported: `OTEL_EXPORTER_OTLP_ENDPOINT`,
   `OTEL_SERVICE_NAME`, `OTEL_EXPORTER_OTLP_PROTOCOL`.
+- Set `OTEL_SEMCONV_STABILITY_OPT_IN=gen_ai_latest_experimental` to emit the
+  latest experimental GenAI provider span attribute (`gen_ai.provider.name`)
+  instead of the legacy span attribute (`gen_ai.system`). GenAI metrics always
+  use bounded, low-cardinality semantic attributes.
 - Set `OPENCLAW_OTEL_PRELOADED=1` when another preload or host process already
   registered the global OpenTelemetry SDK. In that mode the plugin does not start
   or shut down its own SDK, but it still wires OpenClaw diagnostic listeners and
@@ -337,8 +342,11 @@ Model usage:
 - `openclaw.context.tokens` (histogram, attrs: `openclaw.context`,
   `openclaw.channel`, `openclaw.provider`, `openclaw.model`)
 - `gen_ai.client.token.usage` (histogram, GenAI semantic-conventions metric,
-  attrs: `gen_ai.token.type` = `input`/`output`, `gen_ai.system`,
+  attrs: `gen_ai.token.type` = `input`/`output`, `gen_ai.provider.name`,
   `gen_ai.operation.name`, `gen_ai.request.model`)
+- `gen_ai.client.operation.duration` (histogram, seconds, GenAI
+  semantic-conventions metric, attrs: `gen_ai.provider.name`,
+  `gen_ai.operation.name`, `gen_ai.request.model`, optional `error.type`)
 
 Message flow:
 
@@ -392,11 +400,16 @@ Diagnostics internals (memory + tool loop):
 - `openclaw.model.usage`
   - `openclaw.channel`, `openclaw.provider`, `openclaw.model`
   - `openclaw.tokens.*` (input/output/cache_read/cache_write/total)
+  - `gen_ai.system` by default, or `gen_ai.provider.name` when latest GenAI
+    semantic conventions are opted in
+  - `gen_ai.request.model`, `gen_ai.operation.name`, `gen_ai.usage.*`
 - `openclaw.run`
   - `openclaw.outcome`, `openclaw.channel`, `openclaw.provider`, `openclaw.model`,
     `openclaw.errorCategory`
 - `openclaw.model.call`
-  - `gen_ai.system`, `gen_ai.request.model`, `gen_ai.operation.name`,
+  - `gen_ai.system` by default, or `gen_ai.provider.name` when latest GenAI
+    semantic conventions are opted in
+  - `gen_ai.request.model`, `gen_ai.operation.name`,
     `openclaw.provider`, `openclaw.model`, `openclaw.api`, `openclaw.transport`,
     `openclaw.provider.request_id_hash` (bounded SHA-based hash of the upstream
     provider request id; raw ids are not
@@ -447,6 +460,9 @@ classes you opted into.
   `OTEL_EXPORTER_OTLP_ENDPOINT`.
 - If the endpoint already contains `/v1/traces` or `/v1/metrics`, it is used as-is.
 - If the endpoint already contains `/v1/logs`, it is used as-is for logs.
+- `OTEL_SEMCONV_STABILITY_OPT_IN=gen_ai_latest_experimental` controls only the
+  GenAI span provider attribute shape. Existing dashboards that read
+  `gen_ai.system` can keep the default until they migrate.
 - `OPENCLAW_OTEL_PRELOADED=1` reuses an externally registered OpenTelemetry SDK
   for traces/metrics instead of starting a plugin-owned NodeSDK.
 - `diagnostics.otel.logs` enables OTLP log export for the main logger output.