---
summary: "Expose OpenClaw diagnostics as Prometheus text metrics through the diagnostics-prometheus plugin"
title: "Prometheus metrics"
sidebarTitle: "Prometheus"
read_when:
  - You want Prometheus, Grafana, VictoriaMetrics, or another scraper to collect OpenClaw Gateway metrics
  - You need the Prometheus metric names and label policy for dashboards or alerts
  - You want metrics without running an OpenTelemetry collector
---

OpenClaw can expose diagnostics metrics through the official `diagnostics-prometheus` plugin. It listens to trusted internal diagnostics and renders a Prometheus text endpoint at:

```text
GET /api/diagnostics/prometheus
```

Content type is `text/plain; version=0.0.4; charset=utf-8`, the standard Prometheus exposition format.

The route uses Gateway authentication (operator scope). Do not expose it as a public unauthenticated `/metrics` endpoint. Scrape it through the same auth path you use for other operator APIs.

For traces, logs, OTLP push, and OpenTelemetry GenAI semantic attributes, see [OpenTelemetry export](/gateway/opentelemetry).

## Quick start

Install the plugin:

```bash
openclaw plugins install clawhub:@openclaw/diagnostics-prometheus
```

Enable the plugin and diagnostics in config:

```json5
{
  plugins: {
    allow: ["diagnostics-prometheus"],
    entries: {
      "diagnostics-prometheus": { enabled: true },
    },
  },
  diagnostics: {
    enabled: true,
  },
}
```

You can also enable the plugin from the CLI:

```bash
openclaw plugins enable diagnostics-prometheus
```

The HTTP route is registered at plugin startup, so reload after enabling.

Send the same gateway auth your operator clients use:

```bash
curl -H "Authorization: Bearer $OPENCLAW_GATEWAY_TOKEN" \
  http://127.0.0.1:18789/api/diagnostics/prometheus
```

Then add a scrape job to Prometheus:

```yaml
# prometheus.yml
scrape_configs:
  - job_name: openclaw
    scrape_interval: 30s
    metrics_path: /api/diagnostics/prometheus
    authorization:
      credentials_file: /etc/prometheus/openclaw-gateway-token
    static_configs:
      - targets: ["openclaw-gateway:18789"]
```

`diagnostics.enabled: true` is required.
Without it, the plugin still registers the HTTP route but no diagnostic events flow into the exporter, so the response is empty.

## Metrics exported

| Metric                                        | Type      | Labels                                                                                    |
| --------------------------------------------- | --------- | ----------------------------------------------------------------------------------------- |
| `openclaw_run_completed_total`                | counter   | `channel`, `model`, `outcome`, `provider`, `trigger`                                      |
| `openclaw_run_duration_seconds`               | histogram | `channel`, `model`, `outcome`, `provider`, `trigger`                                      |
| `openclaw_model_call_total`                   | counter   | `api`, `error_category`, `model`, `outcome`, `provider`, `transport`                      |
| `openclaw_model_call_duration_seconds`        | histogram | `api`, `error_category`, `model`, `outcome`, `provider`, `transport`                      |
| `openclaw_model_tokens_total`                 | counter   | `agent`, `channel`, `model`, `provider`, `token_type`                                     |
| `openclaw_gen_ai_client_token_usage`          | histogram | `model`, `provider`, `token_type`                                                         |
| `openclaw_model_cost_usd_total`               | counter   | `agent`, `channel`, `model`, `provider`                                                   |
| `openclaw_tool_execution_total`               | counter   | `error_category`, `outcome`, `params_kind`, `tool`                                        |
| `openclaw_tool_execution_duration_seconds`    | histogram | `error_category`, `outcome`, `params_kind`, `tool`                                        |
| `openclaw_harness_run_total`                  | counter   | `channel`, `error_category`, `harness`, `model`, `outcome`, `phase`, `plugin`, `provider` |
| `openclaw_harness_run_duration_seconds`       | histogram | `channel`, `error_category`, `harness`, `model`, `outcome`, `phase`, `plugin`, `provider` |
| `openclaw_message_processed_total`            | counter   | `channel`, `outcome`, `reason`                                                            |
| `openclaw_message_processed_duration_seconds` | histogram | `channel`, `outcome`, `reason`                                                            |
| `openclaw_message_delivery_total`             | counter   | `channel`, `delivery_kind`, `error_category`, `outcome`                                   |
| `openclaw_message_delivery_duration_seconds`  | histogram | `channel`, `delivery_kind`, `error_category`, `outcome`                                   |
| `openclaw_queue_lane_size`                    | gauge     | `lane`                                                                                    |
| `openclaw_queue_lane_wait_seconds`            | histogram | `lane`                                                                                    |
| `openclaw_session_state_total`                | counter   | `reason`, `state`                                                                         |
| `openclaw_session_queue_depth`                | gauge     | `state`                                                                                   |
| `openclaw_memory_bytes`                       | gauge     | `kind`                                                                                    |
| `openclaw_memory_rss_bytes`                   | histogram | none                                                                                      |
| `openclaw_memory_pressure_total`              | counter   | `level`, `reason`                                                                         |
| `openclaw_telemetry_exporter_total`           | counter   | `exporter`, `reason`, `signal`, `status`                                                  |
| `openclaw_prometheus_series_dropped_total`    | counter   | none                                                                                      |

## Label policy

Prometheus labels stay bounded and low-cardinality. The exporter does not emit raw diagnostic identifiers such as `runId`, `sessionKey`, `sessionId`, `callId`, `toolCallId`, message IDs, chat IDs, or provider request IDs.

Label values are redacted and must match OpenClaw's low-cardinality character policy. Values that fail the policy are replaced with `unknown`, `other`, or `none`, depending on the metric.

The exporter caps retained time series in memory at **2048** series across counters, gauges, and histograms combined. New series beyond that cap are dropped, and `openclaw_prometheus_series_dropped_total` increments by one each time. Watch this counter as a hard signal that an attribute upstream is leaking high-cardinality values. The exporter never lifts the cap automatically; if the counter climbs, fix the source rather than disabling the cap.

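
The series cap described above can be pictured with a small sketch. This is not the plugin's code: the class `CappedSeriesRegistry`, its `inc` method, and the choice to count every rejected update are assumptions made for illustration. Only the 2048 default and the dropped-series counter come from the text.

```python
from typing import Dict, Tuple

# A series is identified by metric name plus its sorted label pairs.
SeriesKey = Tuple[str, Tuple[Tuple[str, str], ...]]

MAX_SERIES = 2048  # cap stated in this document


class CappedSeriesRegistry:
    """Hypothetical in-memory counter store with a hard series cap."""

    def __init__(self, max_series: int = MAX_SERIES) -> None:
        self.max_series = max_series
        self.series: Dict[SeriesKey, float] = {}
        # Plays the role of openclaw_prometheus_series_dropped_total.
        self.dropped = 0

    def inc(self, name: str, labels: Dict[str, str], value: float = 1.0) -> bool:
        """Add to a series; return False if a new series was dropped."""
        key: SeriesKey = (name, tuple(sorted(labels.items())))
        if key not in self.series:
            if len(self.series) >= self.max_series:
                # Assumption: every rejected update counts as one drop.
                self.dropped += 1
                return False
            self.series[key] = 0.0
        self.series[key] += value
        return True


# Tiny cap so the behavior is visible without 2048 distinct label sets.
registry = CappedSeriesRegistry(max_series=2)
registry.inc("openclaw_model_call_total", {"provider": "a"})
registry.inc("openclaw_model_call_total", {"provider": "b"})
registry.inc("openclaw_model_call_total", {"provider": "a"})  # existing series still updates
accepted = registry.inc("openclaw_model_call_total", {"provider": "c"})  # new series is dropped
```

The point of the design is that existing series keep updating after the cap is reached, while a runaway label only costs dropped series and a visible counter increase rather than unbounded memory.
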
The following never appear in Prometheus output:

- prompt text, response text, tool inputs, tool outputs, system prompts
- raw provider request IDs (only bounded hashes, where applicable, on spans; never on metrics)
- session keys and session IDs
- hostnames, file paths, secret values

## PromQL recipes

```promql
# Tokens per minute, split by provider (rate() is per second, so scale by 60)
60 * sum by (provider) (rate(openclaw_model_tokens_total[1m]))

# Spend (USD) over the last hour, by model
sum by (model) (increase(openclaw_model_cost_usd_total[1h]))

# 95th percentile run duration, by provider and model
histogram_quantile(
  0.95,
  sum by (le, provider, model) (rate(openclaw_run_duration_seconds_bucket[5m]))
)

# Queue wait time SLO (p95 under 2s)
histogram_quantile(
  0.95,
  sum by (le, lane) (rate(openclaw_queue_lane_wait_seconds_bucket[5m]))
) < 2

# Dropped Prometheus series (cardinality alarm)
increase(openclaw_prometheus_series_dropped_total[15m]) > 0
```

Prefer `openclaw_gen_ai_client_token_usage` for cross-provider dashboards: it follows the OpenTelemetry GenAI semantic conventions and is consistent with metrics from non-OpenClaw GenAI services.

## Choosing between Prometheus and OpenTelemetry export

OpenClaw supports both surfaces independently. You can run either, both, or neither.

- **Pull** model: Prometheus scrapes `/api/diagnostics/prometheus`.
  - No external collector required.
  - Authenticated through normal Gateway auth.
  - Surface is metrics only (no traces or logs).
  - Best for stacks already standardized on Prometheus + Grafana.
- **Push** model: OpenClaw sends OTLP/HTTP to a collector or OTLP-compatible backend.
  - Surface includes metrics, traces, and logs.
  - Bridges to Prometheus through an OpenTelemetry Collector (`prometheus` or `prometheusremotewrite` exporter) when you need both.
  - See [OpenTelemetry export](/gateway/opentelemetry) for the full catalog.

## Troubleshooting

### Empty response

- Check `diagnostics.enabled: true` in config.
- Confirm the plugin is enabled and loaded with `openclaw plugins list --enabled`.
- Generate some traffic; counters and histograms only emit lines after at least one event.

### Authentication failures

The endpoint requires the Gateway operator scope (`auth: "gateway"` with `gatewayRuntimeScopeSurface: "trusted-operator"`). Have Prometheus send the same token or password that works for any other Gateway operator route. There is no public unauthenticated mode.

### `openclaw_prometheus_series_dropped_total` is climbing

A new attribute is exceeding the **2048**-series cap. Inspect recent metrics for an unexpectedly high-cardinality label and fix it at the source. The exporter intentionally drops new series instead of silently rewriting labels.

### Counters reset after a restart

The plugin keeps state in memory only. After a Gateway restart, counters reset to zero and gauges restart at their next reported value. Use PromQL `rate()` and `increase()` to handle resets cleanly.

## Related

- [Diagnostics export](/gateway/diagnostics): local diagnostics zip for support bundles
- [Health and readiness](/gateway/health): `/healthz` and `/readyz` probes
- [Logging](/logging): file-based logging
- [OpenTelemetry export](/gateway/opentelemetry): OTLP push for traces, metrics, and logs