Mirror of https://github.com/openclaw/openclaw.git, synced 2026-04-17 20:21:13 +00:00
docs: rename and improve infer docs
@@ -1,191 +0,0 @@
---
summary: "Infer-first CLI for provider-backed model, image, audio, TTS, video, web, and embedding workflows"
read_when:
  - Adding or modifying `openclaw infer` commands
  - Designing stable headless capability automation
title: "Inference CLI"
---

# Inference CLI

`openclaw infer` is the canonical headless surface for provider-backed inference workflows.

It intentionally exposes capability families, not raw gateway RPC names and not raw agent tool ids.

## Common tasks

This table maps common inference tasks to the corresponding infer command.

| If the user wants to...         | Use this command                                                       |
| ------------------------------- | ---------------------------------------------------------------------- |
| run a text/model prompt         | `openclaw infer model run --prompt "..." --json`                       |
| list configured model providers | `openclaw infer model providers --json`                                |
| generate an image               | `openclaw infer image generate --prompt "..." --json`                  |
| describe an image file          | `openclaw infer image describe --file ./image.png --json`              |
| transcribe audio                | `openclaw infer audio transcribe --file ./memo.m4a --json`             |
| synthesize speech               | `openclaw infer tts convert --text "..." --output ./speech.mp3 --json` |
| generate a video                | `openclaw infer video generate --prompt "..." --json`                  |
| describe a video file           | `openclaw infer video describe --file ./clip.mp4 --json`               |
| search the web                  | `openclaw infer web search --query "..." --json`                       |
| fetch a web page                | `openclaw infer web fetch --url https://example.com --json`            |
| create embeddings               | `openclaw infer embedding create --text "..." --json`                  |

## Command tree

```text
openclaw infer
  list
  inspect

  model
    run
    list
    inspect
    providers
    auth login
    auth logout
    auth status

  image
    generate
    edit
    describe
    describe-many
    providers

  audio
    transcribe
    providers

  tts
    convert
    voices
    providers
    status
    enable
    disable
    set-provider

  video
    generate
    describe
    providers

  web
    search
    fetch
    providers

  embedding
    create
    providers
```

## Examples

These examples show the standard command shape across the infer surface.

```bash
openclaw infer list --json
openclaw infer inspect --name image.generate --json
openclaw infer model run --prompt "Reply with exactly: smoke-ok" --json
openclaw infer model providers --json
openclaw infer image generate --prompt "friendly lobster illustration" --json
openclaw infer image describe --file ./photo.jpg --json
openclaw infer audio transcribe --file ./memo.m4a --json
openclaw infer tts convert --text "hello from openclaw" --output ./hello.mp3 --json
openclaw infer video generate --prompt "cinematic sunset over the ocean" --json
openclaw infer video describe --file ./clip.mp4 --json
openclaw infer web search --query "OpenClaw docs" --json
openclaw infer embedding create --text "friendly lobster" --json
```

## Additional examples

```bash
openclaw infer audio transcribe --file ./team-sync.m4a --language en --prompt "Focus on names and action items" --json
openclaw infer image describe --file ./ui-screenshot.png --model openai/gpt-4.1-mini --json
openclaw infer tts convert --text "Your build is complete" --output ./build-complete.mp3 --json
openclaw infer web search --query "OpenClaw docs infer web providers" --json
openclaw infer embedding create --text "customer support ticket: delayed shipment" --model openai/text-embedding-3-large --json
```

## Transport

Supported transport flags:

- `--local`
- `--gateway`

Default transport is implicit auto at the command-family level:

- Stateless execution commands default to local.
- Gateway-managed state commands default to gateway.

Examples:

```bash
openclaw infer model run --prompt "hello" --json
openclaw infer image generate --prompt "friendly lobster" --json
openclaw infer tts status --json
openclaw infer embedding create --text "hello world" --json
```

## Usage notes

- `openclaw infer ...` is the primary CLI surface for these workflows.
- Use `--json` when the output will be consumed by another command or script.
- Use `--provider` or `--model provider/model` when a specific backend is required.
- For `image describe`, `audio transcribe`, and `video describe`, `--model` must use the form `<provider/model>`.
- The normal local path does not require the gateway to be running.

## JSON output

Capability commands normalize JSON output under a shared envelope:

```json
{
  "ok": true,
  "capability": "image.generate",
  "transport": "local",
  "provider": "openai",
  "model": "gpt-image-1",
  "attempts": [],
  "outputs": []
}
```

Top-level fields are stable:

- `ok`
- `capability`
- `transport`
- `provider`
- `model`
- `attempts`
- `outputs`
- `error`

## Common pitfalls

```bash
# Bad
openclaw infer media image generate --prompt "friendly lobster"

# Good
openclaw infer image generate --prompt "friendly lobster"
```

```bash
# Bad
openclaw infer audio transcribe --file ./memo.m4a --model whisper-1 --json

# Good
openclaw infer audio transcribe --file ./memo.m4a --model openai/whisper-1 --json
```

## Notes

- `model run` reuses the agent runtime so provider/model overrides behave like normal agent execution.
- `tts status` defaults to gateway because it reflects gateway-managed TTS state.
- `openclaw capability ...` is an alias for `openclaw infer ...`.
@@ -35,7 +35,7 @@ This page describes the current CLI behavior. If commands change, update this do
 - [`logs`](/cli/logs)
 - [`system`](/cli/system)
 - [`models`](/cli/models)
-- [`infer`](/cli/capability)
+- [`infer`](/cli/infer)
 - [`memory`](/cli/memory)
 - [`directory`](/cli/directory)
 - [`nodes`](/cli/nodes)

280
docs/cli/infer.md
Normal file
@@ -0,0 +1,280 @@
---
summary: "Infer-first CLI for provider-backed model, image, audio, TTS, video, web, and embedding workflows"
read_when:
  - Adding or modifying `openclaw infer` commands
  - Designing stable headless capability automation
title: "Inference CLI"
---

# Inference CLI

`openclaw infer` is the canonical headless surface for provider-backed inference workflows.

It intentionally exposes capability families, not raw gateway RPC names and not raw agent tool ids.

## Turn infer into a skill

Copy and paste this to an agent:

```text
Read https://docs.openclaw.ai/cli/infer, then create a skill that routes my common workflows to `openclaw infer`.
Focus on model runs, image generation, video generation, audio transcription, TTS, web search, and embeddings.
```

A good infer-based skill should:

- map common user intents to the correct infer subcommand
- include a few canonical infer examples for the workflows it covers
- prefer `openclaw infer ...` in examples and suggestions
- avoid re-documenting the entire infer surface inside the skill body

Typical infer-focused skill coverage:

- `openclaw infer model run`
- `openclaw infer image generate`
- `openclaw infer audio transcribe`
- `openclaw infer tts convert`
- `openclaw infer web search`
- `openclaw infer embedding create`

## Why use infer

`openclaw infer` provides one consistent CLI for provider-backed inference tasks inside OpenClaw.

Benefits:

- Use the providers and models already configured in OpenClaw instead of wiring up one-off wrappers for each backend.
- Keep model, image, audio transcription, TTS, video, web, and embedding workflows under one command tree.
- Use a stable `--json` output shape for scripts, automation, and agent-driven workflows.
- Prefer a first-party OpenClaw surface when the task is fundamentally "run inference."
- Use the normal local path without requiring the gateway for most infer commands.

## Command tree

```text
openclaw infer
  list
  inspect

  model
    run
    list
    inspect
    providers
    auth login
    auth logout
    auth status

  image
    generate
    edit
    describe
    describe-many
    providers

  audio
    transcribe
    providers

  tts
    convert
    voices
    providers
    status
    enable
    disable
    set-provider

  video
    generate
    describe
    providers

  web
    search
    fetch
    providers

  embedding
    create
    providers
```

## Common tasks

This table maps common inference tasks to the corresponding infer command.

| Task                    | Command                                                                | Notes                                                |
| ----------------------- | ---------------------------------------------------------------------- | ---------------------------------------------------- |
| Run a text/model prompt | `openclaw infer model run --prompt "..." --json`                       | Uses the normal local path by default                |
| Generate an image       | `openclaw infer image generate --prompt "..." --json`                  | Use `image edit` when starting from an existing file |
| Describe an image file  | `openclaw infer image describe --file ./image.png --json`              | `--model` must be `<provider/model>`                 |
| Transcribe audio        | `openclaw infer audio transcribe --file ./memo.m4a --json`             | `--model` must be `<provider/model>`                 |
| Synthesize speech       | `openclaw infer tts convert --text "..." --output ./speech.mp3 --json` | `tts status` is gateway-oriented                     |
| Generate a video        | `openclaw infer video generate --prompt "..." --json`                  |                                                      |
| Describe a video file   | `openclaw infer video describe --file ./clip.mp4 --json`               | `--model` must be `<provider/model>`                 |
| Search the web          | `openclaw infer web search --query "..." --json`                       |                                                      |
| Fetch a web page        | `openclaw infer web fetch --url https://example.com --json`            |                                                      |
| Create embeddings       | `openclaw infer embedding create --text "..." --json`                  |                                                      |

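For scripts or skills that route intents to infer, the table above can be expressed as a small lookup. The following is a minimal Python sketch, not part of OpenClaw: the intent names and the `build_infer_argv` helper are hypothetical, while the subcommand paths and flags are copied from the table. It only builds the argument list and does not run the CLI.

```python
import shlex
from typing import Dict, List

# Hypothetical intent names (not defined by OpenClaw). Each entry holds the
# subcommand path and the primary flag taken from the "Common tasks" table.
INFER_COMMANDS: Dict[str, List[str]] = {
    "run_prompt": ["model", "run", "--prompt"],
    "generate_image": ["image", "generate", "--prompt"],
    "describe_image": ["image", "describe", "--file"],
    "transcribe_audio": ["audio", "transcribe", "--file"],
    "synthesize_speech": ["tts", "convert", "--text"],
    "generate_video": ["video", "generate", "--prompt"],
    "describe_video": ["video", "describe", "--file"],
    "search_web": ["web", "search", "--query"],
    "fetch_page": ["web", "fetch", "--url"],
    "create_embedding": ["embedding", "create", "--text"],
}


def build_infer_argv(intent: str, value: str, *extra: str) -> List[str]:
    """Return the argv list for one infer call, always requesting JSON output."""
    try:
        path_and_flag = INFER_COMMANDS[intent]
    except KeyError:
        raise ValueError(f"no infer command mapped for intent {intent!r}") from None
    # Extra flags (for example --output for tts convert) go before --json.
    return ["openclaw", "infer", *path_and_flag, value, *extra, "--json"]


argv = build_infer_argv(
    "synthesize_speech", "hello from openclaw", "--output", "./hello.mp3"
)
# shlex.join quotes arguments so the printed line can be pasted into a shell.
print(shlex.join(argv))
```

Returning a list rather than a single string keeps quoting out of the picture when the result is handed to something like `subprocess.run`.
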
## Behavior

- `openclaw infer ...` is the primary CLI surface for these workflows.
- Use `--json` when the output will be consumed by another command or script.
- Use `--provider` or `--model provider/model` when a specific backend is required.
- For `image describe`, `audio transcribe`, and `video describe`, `--model` must use the form `<provider/model>`.
- Stateless execution commands default to local.
- Gateway-managed state commands default to gateway.
- The normal local path does not require the gateway to be running.

## Model

Use `model` for provider-backed text inference and model/provider inspection.

```bash
openclaw infer model run --prompt "Reply with exactly: smoke-ok" --json
openclaw infer model run --prompt "Summarize this changelog entry" --provider openai --json
openclaw infer model providers --json
openclaw infer model inspect --name gpt-5.4 --json
```

Notes:

- `model run` reuses the agent runtime so provider/model overrides behave like normal agent execution.
- `model auth login`, `model auth logout`, and `model auth status` manage saved provider auth state.

## Image

Use `image` for generation, edit, and description.

```bash
openclaw infer image generate --prompt "friendly lobster illustration" --json
openclaw infer image generate --prompt "cinematic product photo of headphones" --json
openclaw infer image describe --file ./photo.jpg --json
openclaw infer image describe --file ./ui-screenshot.png --model openai/gpt-4.1-mini --json
```

Notes:

- Use `image edit` when starting from existing input files.
- For `image describe`, `--model` must be `<provider/model>`.

## Audio

Use `audio` for file transcription.

```bash
openclaw infer audio transcribe --file ./memo.m4a --json
openclaw infer audio transcribe --file ./team-sync.m4a --language en --prompt "Focus on names and action items" --json
openclaw infer audio transcribe --file ./memo.m4a --model openai/whisper-1 --json
```

Notes:

- `audio transcribe` is for file transcription, not realtime session management.
- `--model` must be `<provider/model>`.

## TTS

Use `tts` for speech synthesis and TTS provider state.

```bash
openclaw infer tts convert --text "hello from openclaw" --output ./hello.mp3 --json
openclaw infer tts convert --text "Your build is complete" --output ./build-complete.mp3 --json
openclaw infer tts providers --json
openclaw infer tts status --json
```

Notes:

- `tts status` defaults to gateway because it reflects gateway-managed TTS state.
- Use `tts providers`, `tts voices`, and `tts set-provider` to inspect and configure TTS behavior.

## Video

Use `video` for generation and description.

```bash
openclaw infer video generate --prompt "cinematic sunset over the ocean" --json
openclaw infer video generate --prompt "slow drone shot over a forest lake" --json
openclaw infer video describe --file ./clip.mp4 --json
openclaw infer video describe --file ./clip.mp4 --model openai/gpt-4.1-mini --json
```

Notes:

- `--model` must be `<provider/model>` for `video describe`.

## Web

Use `web` for search and fetch workflows.

```bash
openclaw infer web search --query "OpenClaw docs" --json
openclaw infer web search --query "OpenClaw infer web providers" --json
openclaw infer web fetch --url https://docs.openclaw.ai/cli/infer --json
openclaw infer web providers --json
```

Notes:

- Use `web providers` to inspect available, configured, and selected providers.

## Embedding

Use `embedding` for vector creation and embedding provider inspection.

```bash
openclaw infer embedding create --text "friendly lobster" --json
openclaw infer embedding create --text "customer support ticket: delayed shipment" --model openai/text-embedding-3-large --json
openclaw infer embedding providers --json
```

## JSON output

Infer commands normalize JSON output under a shared envelope:

```json
{
  "ok": true,
  "capability": "image.generate",
  "transport": "local",
  "provider": "openai",
  "model": "gpt-image-1",
  "attempts": [],
  "outputs": []
}
```

Top-level fields are stable:

- `ok`
- `capability`
- `transport`
- `provider`
- `model`
- `attempts`
- `outputs`
- `error`

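A consumer of `--json` output might branch on `ok` before reading `outputs`. The sketch below is illustrative Python, not OpenClaw code: the `read_outputs` helper is hypothetical, and because this page does not document the shape of `error` or of the entries in `outputs`, both are treated as opaque values. The sample string is the envelope shown above.

```python
import json
from typing import Any, Dict, List


def read_outputs(raw: str) -> List[Any]:
    """Parse one infer JSON envelope and return its outputs, or raise on failure."""
    envelope: Dict[str, Any] = json.loads(raw)
    if not envelope.get("ok"):
        # The structure of `error` is an unknown here, so it is reported as-is.
        raise RuntimeError(
            f"{envelope.get('capability')} failed: {envelope.get('error')!r}"
        )
    return envelope.get("outputs", [])


# The example envelope from this page, as captured stdout might look.
sample = """
{
  "ok": true,
  "capability": "image.generate",
  "transport": "local",
  "provider": "openai",
  "model": "gpt-image-1",
  "attempts": [],
  "outputs": []
}
"""

print(read_outputs(sample))
```

Relying only on the top-level field names keeps the script aligned with the part of the envelope this page describes as stable.
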
## Common pitfalls

```bash
# Bad
openclaw infer media image generate --prompt "friendly lobster"

# Good
openclaw infer image generate --prompt "friendly lobster"
```

```bash
# Bad
openclaw infer audio transcribe --file ./memo.m4a --model whisper-1 --json

# Good
openclaw infer audio transcribe --file ./memo.m4a --model openai/whisper-1 --json
```

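A wrapper script could catch the second pitfall before invoking the CLI. This is a minimal Python sketch under stated assumptions: `split_model_ref` is a hypothetical helper, and splitting at the first slash is a guess at how `<provider/model>` is meant to be read, not the CLI's actual validation logic.

```python
from typing import Tuple


def split_model_ref(ref: str) -> Tuple[str, str]:
    """Split "provider/model" at the first slash; reject refs missing either part."""
    provider, sep, model = ref.partition("/")
    if not sep or not provider or not model:
        raise ValueError(f"expected <provider/model>, got {ref!r}")
    return provider, model


# The "Good" form from the example above passes.
print(split_model_ref("openai/whisper-1"))

# The "Bad" form (bare model name) is rejected before any command runs.
try:
    split_model_ref("whisper-1")
except ValueError as exc:
    print(f"rejected: {exc}")
```
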
## Notes

- `openclaw capability ...` is an alias for `openclaw infer ...`.
@@ -76,6 +76,10 @@
       "source": "/plugins/agent-tools",
       "destination": "/plugins/building-plugins#registering-agent-tools"
     },
+    {
+      "source": "/cli/capability",
+      "destination": "/cli/infer"
+    },
     {
       "source": "/tools/capability-cookbook",
       "destination": "/plugins/architecture"
@@ -1185,7 +1185,7 @@ export function registerCapabilityCli(program: Command) {
     .addHelpText(
       "after",
       () =>
-        `\n${theme.muted("Docs:")} ${formatDocsLink("/cli/capability", "docs.openclaw.ai/cli/capability")}\n`,
+        `\n${theme.muted("Docs:")} ${formatDocsLink("/cli/infer", "docs.openclaw.ai/cli/infer")}\n`,
     );
 
   registerCapabilityListAndInspect(capability);