mirror of
https://github.com/openclaw/openclaw.git
synced 2026-05-06 20:00:42 +00:00
209 lines
6.1 KiB
Markdown
209 lines
6.1 KiB
Markdown
---
|
|
summary: "Run OpenClaw through inferrs (OpenAI-compatible local server)"
|
|
read_when:
|
|
- You want to run OpenClaw against a local inferrs server
|
|
- You are serving Gemma or another model through inferrs
|
|
- You need the exact OpenClaw compat flags for inferrs
|
|
title: "Inferrs"
|
|
---
|
|
|
|
[inferrs](https://github.com/ericcurtin/inferrs) can serve local models behind an
|
|
OpenAI-compatible `/v1` API. OpenClaw works with `inferrs` through the generic
|
|
`openai-completions` path.
|
|
|
|
`inferrs` is currently best treated as a custom self-hosted OpenAI-compatible
|
|
backend, not a dedicated OpenClaw provider plugin.
|
|
|
|
## Getting started
|
|
|
|
<Steps>
|
|
<Step title="Start inferrs with a model">
|
|
```bash
|
|
inferrs serve google/gemma-4-E2B-it \
|
|
--host 127.0.0.1 \
|
|
--port 8080 \
|
|
--device metal
|
|
```
|
|
</Step>
|
|
<Step title="Verify the server is reachable">
|
|
```bash
|
|
curl http://127.0.0.1:8080/health
|
|
curl http://127.0.0.1:8080/v1/models
|
|
```
|
|
</Step>
|
|
<Step title="Add an OpenClaw provider entry">
|
|
Add an explicit provider entry and point your default model at it. See the full config example below.
|
|
</Step>
|
|
</Steps>
|
|
|
|
## Full config example
|
|
|
|
This example uses Gemma 4 on a local `inferrs` server.
|
|
|
|
```json5
|
|
{
|
|
agents: {
|
|
defaults: {
|
|
model: { primary: "inferrs/google/gemma-4-E2B-it" },
|
|
models: {
|
|
"inferrs/google/gemma-4-E2B-it": {
|
|
alias: "Gemma 4 (inferrs)",
|
|
},
|
|
},
|
|
},
|
|
},
|
|
models: {
|
|
mode: "merge",
|
|
providers: {
|
|
inferrs: {
|
|
baseUrl: "http://127.0.0.1:8080/v1",
|
|
apiKey: "inferrs-local",
|
|
api: "openai-completions",
|
|
models: [
|
|
{
|
|
id: "google/gemma-4-E2B-it",
|
|
name: "Gemma 4 E2B (inferrs)",
|
|
reasoning: false,
|
|
input: ["text"],
|
|
cost: { input: 0, output: 0, cacheRead: 0, cacheWrite: 0 },
|
|
contextWindow: 131072,
|
|
maxTokens: 4096,
|
|
compat: {
|
|
requiresStringContent: true,
|
|
},
|
|
},
|
|
],
|
|
},
|
|
},
|
|
},
|
|
}
|
|
```
|
|
|
|
## Advanced configuration
|
|
|
|
<AccordionGroup>
|
|
<Accordion title="Why requiresStringContent matters">
|
|
Some `inferrs` Chat Completions routes accept only string
|
|
`messages[].content`, not structured content-part arrays.
|
|
|
|
<Warning>
|
|
If OpenClaw runs fail with an error like:
|
|
|
|
```text
|
|
messages[1].content: invalid type: sequence, expected a string
|
|
```
|
|
|
|
set `compat.requiresStringContent: true` in your model entry.
|
|
</Warning>
|
|
|
|
```json5
|
|
compat: {
|
|
requiresStringContent: true
|
|
}
|
|
```
|
|
|
|
OpenClaw will flatten pure text content parts into plain strings before sending
|
|
the request.
|
|
|
|
</Accordion>
|
|
|
|
<Accordion title="Gemma and tool-schema caveat">
|
|
Some current `inferrs` + Gemma combinations accept small direct
|
|
`/v1/chat/completions` requests but still fail on full OpenClaw agent-runtime
|
|
turns.
|
|
|
|
If that happens, try this first:
|
|
|
|
```json5
|
|
compat: {
|
|
requiresStringContent: true,
|
|
supportsTools: false
|
|
}
|
|
```
|
|
|
|
That disables OpenClaw's tool schema surface for the model and can reduce prompt
|
|
pressure on stricter local backends.
|
|
|
|
If tiny direct requests still work but normal OpenClaw agent turns continue to
|
|
crash inside `inferrs`, the remaining issue is usually upstream model/server
|
|
behavior rather than OpenClaw's transport layer.
|
|
|
|
</Accordion>
|
|
|
|
<Accordion title="Manual smoke test">
|
|
Once configured, test both layers:
|
|
|
|
```bash
|
|
curl http://127.0.0.1:8080/v1/chat/completions \
|
|
-H 'content-type: application/json' \
|
|
-d '{"model":"google/gemma-4-E2B-it","messages":[{"role":"user","content":"What is 2 + 2?"}],"stream":false}'
|
|
```
|
|
|
|
```bash
|
|
openclaw infer model run \
|
|
--model inferrs/google/gemma-4-E2B-it \
|
|
--prompt "What is 2 + 2? Reply with one short sentence." \
|
|
--json
|
|
```
|
|
|
|
If the first command works but the second fails, check the troubleshooting section below.
|
|
|
|
</Accordion>
|
|
|
|
<Accordion title="Proxy-style behavior">
|
|
`inferrs` is treated as a proxy-style OpenAI-compatible `/v1` backend, not a
|
|
native OpenAI endpoint.
|
|
|
|
- Native OpenAI-only request shaping does not apply here
|
|
- No `service_tier`, no Responses `store`, no prompt-cache hints, and no
|
|
OpenAI reasoning-compat payload shaping
|
|
- Hidden OpenClaw attribution headers (`originator`, `version`, `User-Agent`)
|
|
are not injected on custom `inferrs` base URLs
|
|
|
|
</Accordion>
|
|
</AccordionGroup>
|
|
|
|
## Troubleshooting
|
|
|
|
<AccordionGroup>
|
|
<Accordion title="curl /v1/models fails">
|
|
`inferrs` is not running, not reachable, or not bound to the expected
|
|
host/port. Make sure the server is started and listening on the address you
|
|
configured.
|
|
</Accordion>
|
|
|
|
<Accordion title="messages[].content expected a string">
|
|
Set `compat.requiresStringContent: true` in the model entry. See the
|
|
`requiresStringContent` section above for details.
|
|
</Accordion>
|
|
|
|
<Accordion title="Direct /v1/chat/completions calls pass but openclaw infer model run fails">
|
|
Try setting `compat.supportsTools: false` to disable the tool schema surface.
|
|
See the Gemma tool-schema caveat above.
|
|
</Accordion>
|
|
|
|
<Accordion title="inferrs still crashes on larger agent turns">
|
|
If OpenClaw no longer gets schema errors but `inferrs` still crashes on larger
|
|
agent turns, treat it as an upstream `inferrs` or model limitation. Reduce
|
|
prompt pressure or switch to a different local backend or model.
|
|
</Accordion>
|
|
</AccordionGroup>
|
|
|
|
<Tip>
|
|
For general help, see [Troubleshooting](/help/troubleshooting) and [FAQ](/help/faq).
|
|
</Tip>
|
|
|
|
## Related
|
|
|
|
<CardGroup cols={2}>
|
|
<Card title="Local models" href="/gateway/local-models" icon="server">
|
|
Running OpenClaw against local model servers.
|
|
</Card>
|
|
<Card title="Gateway troubleshooting" href="/gateway/troubleshooting#local-openai-compatible-backend-passes-direct-probes-but-agent-runs-fail" icon="wrench">
|
|
Debugging local OpenAI-compatible backends that pass probes but fail agent runs.
|
|
</Card>
|
|
<Card title="Model selection" href="/concepts/model-providers" icon="layers">
|
|
Overview of all providers, model refs, and failover behavior.
|
|
</Card>
|
|
</CardGroup>
|