Commit Graph

6192 Commits

Author SHA1 Message Date
Peter Steinberger
0e58654dba fix(agents): silence empty group model turns 2026-04-26 06:25:59 +01:00
Vincent Koc
d531760898 docs(music-generation): rewrite around Steps, Tabs, and provider Accordion
The music-generation page was 291 lines with two side-by-side
'Quick start' subsections (shared provider-backed vs. ComfyUI
workflow), a flat parameter table, two prose paragraphs explaining
async behaviour and task lifecycle, and a 'Provider notes' bullet
list mixed with a separate 'Choosing the right path' section.

Restructure for scan-first reading without losing technical content:

- Wrap Quick start in a top-level Tabs with two child Steps blocks
  (Shared provider-backed | ComfyUI workflow), so readers pick a path
  first and only see the matching steps.
- Convert the tool parameter list to ParamField definitions with
  type signatures and required flags surfaced visually.
- Convert the four async-behaviour bullets to a labelled bullet list
  and the four-state task lifecycle to a table for at-a-glance
  scanning.
- Change Capability matrix Yes/No values to checkmarks/em-dashes for
  alignment with the rest of the media docs.
- Convert the 'Provider notes' free-form paragraphs into an
  AccordionGroup keyed by provider (ComfyUI / Google Lyria 3 /
  MiniMax), keeping wording faithful.
- Sentence-case Related entries and add sidebarTitle so the nav reads
  'Music generation' explicitly.

Provider rows were already alphabetized in the supported providers
table (ComfyUI / Google / MiniMax), so that order is kept. Wording,
model refs, defaults, env vars, and capability declarations are
unchanged.
2026-04-25 22:24:58 -07:00
Vincent Koc
f0ea901a0d docs(image-generation): rewrite around Steps, Tabs, and AZ providers
The image-generation page was 395 lines with a 3-step quick-start
written as plain numbered prose, a sprawling 'OpenAI gpt-image-2'
section that mixed routing/legacy/OpenAI options with five inline
slash-command examples, and provider tables that mixed alphabetic
and recency order.

Restructure for scan-first reading without losing technical content:

- Wrap Quick start in a Steps component (auth -> default model ->
  ask the agent), pulling the Codex OAuth note inline with the model
  step where it belongs and surfacing the LAN/SSRF caveat as a
  Warning callout.
- Alphabetize the Supported providers table (ComfyUI, fal, Google,
  LiteLLM, MiniMax, OpenAI, OpenRouter, Vydra, xAI) and the Provider
  capabilities table (same order across both). Convert the Yes/No
  capability table to checkmarks plus exact counts for readability.
- Replace the long inline OpenAI / OpenRouter / MiniMax / xAI prose
  with a 'Provider deep dives' AccordionGroup so each backend's
  routing, legacy URL handling, and provider-specific knobs collapse
  by default.
- Move the four provider-selection-order notes into a small
  AccordionGroup ('Per-call overrides are exact', 'Auto-detection is
  auth-aware', 'Timeouts', 'Inspect at runtime').
- Collapse the five flat slash-command examples into a single Tabs
  component (4K landscape / transparent PNG / two-square /
  edit-one-ref / edit-multi-ref) with the matching CLI variant inline
  on the transparent-PNG tab.
- Sentence-case the Related list (Tools overview, Configuration
  reference) and drop the redundant generic introductory wording.
- Add sidebarTitle so the nav reads 'Image generation' explicitly.

Wording, schema fields, defaults, model refs, env vars, and the
detailed OpenAI/OpenRouter/Codex routing rules are unchanged.
2026-04-25 22:23:09 -07:00
Vincent Koc
d1502c2ba1 docs(media-overview): rewrite around CardGroup, sync/async split, and AZ providers
The media overview was a 91-line page that opened with a redundant
Title-Case body H1 ('# Media Generation and Understanding'), then
mixed a capability table, a Yes/Yes/Yes provider matrix, dense prose
about async behaviour and STT/Voice Call surfaces, plus duplicate
'Quick links' and 'Related' sections at the end.

Restructure for scan-first reading without losing any content:

- Drop the redundant body H1; lead with a one-paragraph summary.
- Replace the 'Capabilities at a glance' table with a CardGroup of six
  entry cards (Image / Video / Music / TTS / Media understanding / STT)
  each linking directly to its dedicated page. Mode (sync/async) is
  noted on the card so readers see latency expectations up front.
- Convert the provider matrix to checkmarks for readability and align
  the column header names. Provider rows already alphabetized.
- Pull async vs synchronous behaviour into a 5-row table that names
  why each capability is sync or async, then keep the operator-facing
  paragraph that explains task-id handoff.
- Move the long 'Google maps to ... OpenAI maps to ... xAI maps to ...'
  paragraph into a per-vendor AccordionGroup so each mapping is a
  collapsible panel instead of one large prose block.
- Drop duplicate 'Quick links' section in favour of a single Related
  list, sentence-cased to match the rest of the docs.
2026-04-25 22:20:35 -07:00
Longbiao CHEN
afe1abc297 feat(voicewake): refresh trigger routing on main 2026-04-26 06:19:35 +01:00
Vincent Koc
724e92505a docs(tts): add sidebarTitle 'Text to speech (TTS)' for the nav
Default sidebar label fell back to title 'Text-to-speech', which is fine
on the page header but readers scanning the Tools sidebar look for the
acronym 'TTS'. Add a sidebarTitle so Mintlify renders 'Text to speech
(TTS)' in the sidebar while keeping the canonical page title intact.

Sentence case matches the rest of the Tools sidebar group (e.g.
'Image generation', 'Music generation', 'Video generation').
2026-04-25 22:11:31 -07:00
Peter Steinberger
8c35e45c00 fix: guard gateway mutations from older binaries 2026-04-26 06:07:55 +01:00
Vincent Koc
fbd6b3ce3c docs(tts): A-Z order providers and add tools/tts to Tools nav group
- docs/tools/tts.md: alphabetize providers in three places that listed
  them: the supported-providers table (Azure Speech ... Xiaomi MiMo),
  the configuration Tabs (12 provider presets in A-Z), and the field
  reference AccordionGroup. Top-level fields stay first; provider
  tabs/accordions follow strict alphabetical order. Wording, schema,
  and defaults unchanged.
- docs/docs.json: add tools/tts to the main Tools sidebar group
  (slotted between trajectory and video-generation, matching the
  alphabetical neighborhood with image-generation, music-generation,
  video-generation). Previously tts only appeared under
  Nodes > Media capabilities, which was a discoverability gap for
  readers looking for TTS alongside the other generation tools.
2026-04-25 22:05:46 -07:00
Vincent Koc
71b79f49ad docs(tts): rewrite tts.md around personas with Mintlify components
The TTS doc had grown to 1008 lines with 11 separate flat 'X primary'
config blocks, a 100-line dense 'Notes on fields' bullet list, and
the new provider-personas feature (#70748) buried near the bottom.
Restructure for readability and feature visibility:

- Lead with a Steps-based 'Quick start' so first-time readers can
  enable TTS in 4 explicit steps.
- Replace the 13-bullet provider list with a single 'Supported
  providers' table that names auth env vars and per-provider notes
  inline. Add a Warning callout for the Microsoft/edge legacy alias.
- Collapse the 11 'X primary' config blocks into one Tabs component
  ('OpenAI + ElevenLabs', 'Google Gemini', 'Azure Speech',
  'Microsoft (no key)', 'MiniMax', 'Inworld', 'xAI', 'Volcengine',
  'Xiaomi MiMo', 'OpenRouter', 'Gradium', 'Local CLI') so users see
  one preset at a time and the page is scannable.
- Promote 'Personas' to its own top-level section with two examples
  (minimal and the Alfred provider-neutral persona), and add a new
  'How providers use persona prompts' AccordionGroup covering Google
  (promptTemplate audio-profile-v1, personaPrompt), OpenAI
  (instructions auto-mapping), and Other providers, plus a fallback
  policy table.
- Note that agents.list[].tts.persona overrides global persona
  per-agent (covers the recent feat(tts) per-agent voice-override
  work).
- Convert the 100-line 'Notes on fields' wall into a per-provider
  AccordionGroup using ParamField, so the field reference is
  scannable and field types/defaults are visually distinct.
- Sentence-case headings, drop redundant body H1, fold the flow
  diagram inline with Auto-TTS behavior, and refresh the Output
  formats section to a table-first layout.
- Schema fields (label/description/provider/fallbackPolicy/prompt
  with profile/scene/sampleContext/style/accent/pacing/constraints
  and providers map) verified against src/config/types.tts.ts; all
  defaults and env-var fallbacks preserved verbatim.
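
The per-agent override noted above can be pictured with a minimal sketch. Everything here is an assumption made for illustration: the interface names, the `resolvePersona` helper, and the placement of the global persona under a top-level `tts` key are hypothetical and are not taken from src/config/types.tts.ts. Only the `agents.list[].tts.persona` path comes from the message.

```typescript
// Hypothetical shapes; the real schema lives in src/config/types.tts.ts.
interface TtsPersona {
  label: string;
  provider?: string;
}

interface AgentEntry {
  id: string;
  tts?: { persona?: TtsPersona };
}

interface Config {
  // Assumed location of the global persona (not confirmed by the source).
  tts?: { persona?: TtsPersona };
  agents: { list: AgentEntry[] };
}

// Prefer the agent's own persona; fall back to the global one if the
// agent is unknown or does not define a persona.
function resolvePersona(config: Config, agentId: string): TtsPersona | undefined {
  const agent = config.agents.list.find((entry) => entry.id === agentId);
  return agent?.tts?.persona ?? config.tts?.persona;
}
```

An agent without its own `tts.persona` resolves to the global persona, and an agent with one resolves to its own.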

Net diff: 585 insertions, 684 deletions across the same surface
area.
2026-04-25 22:00:19 -07:00
Peter Steinberger
ad5c00b8e0 docs: expand bonjour disable troubleshooting 2026-04-26 05:56:25 +01:00
Peter Steinberger
d1a5ea2024 fix(docker): disable bonjour by default for compose 2026-04-26 05:51:05 +01:00
Vincent Koc
4cba24a4c3 fix(logging): redact console and file sinks 2026-04-25 21:50:00 -07:00
Peter Steinberger
6a67f65568 fix(voice): reuse preflight transcripts across channels 2026-04-26 05:42:04 +01:00
Vincent Koc
46b9044c3f docs: update model input modalities and OTEL token-metric attrs
Two recent commits added user-facing surface that left signature-style
references in docs stale:

- 4428661779 Alvin Tang (#20721, thanks @alvinttang) extends the
  configured model 'input' modality set to also accept 'audio' and
  'video', matching what providers like LM Studio already report.
  docs/plugins/manifest.md model-fields table listed only
  'text | image | document', so add 'audio' and 'video'.
- 44da034516 Vincent (thanks @oc-factus) adds a bounded openclaw.agent
  attribute on the openclaw.tokens counter so per-agent dashboards can
  group usage. docs/gateway/opentelemetry.md metric reference omitted
  it; add it to the attrs list.
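
As a rough illustration of the first item, the widened modality set could be modelled as a string union. The names `MODEL_INPUT_MODALITIES`, `ModelInputModality`, and `isModelInputModality` are hypothetical and not taken from the repository; only the five modality strings come from the message.

```typescript
// Hypothetical names; only the modality strings are from the commit message.
const MODEL_INPUT_MODALITIES = ["text", "image", "document", "audio", "video"] as const;

type ModelInputModality = (typeof MODEL_INPUT_MODALITIES)[number];

// Type guard that accepts the original three values plus 'audio' and 'video'.
function isModelInputModality(value: string): value is ModelInputModality {
  return (MODEL_INPUT_MODALITIES as readonly string[]).includes(value);
}
```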
2026-04-25 21:39:44 -07:00
Peter Steinberger
9b93b7df62 fix(whatsapp): remove ack reactions after replies 2026-04-26 05:36:14 +01:00
Neerav Makwana
dc9ce2a1bf fix: honor agent for models auth writes (#71933)
Honor the parent `models auth --agent <id>` flag across auth write commands: `add`, `login`, `setup-token`, `paste-token`, and `login-github-copilot`.

The auth helpers now resolve the requested configured agent before choosing the auth-profile store and provider workspace, while preserving default-agent behavior when `--agent` is omitted.
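
A minimal sketch of that resolution order, under stated assumptions: `AgentConfig`, `isDefault`, and `resolveAuthAgent` are hypothetical names used for illustration, and the actual helpers in the models auth CLI may be structured differently.

```typescript
// Hypothetical types and helper; not the repository's actual API.
interface AgentConfig {
  id: string;
  isDefault?: boolean;
}

// Resolve the agent whose auth-profile store should receive the write.
// With no --agent value, keep the default-agent behavior.
function resolveAuthAgent(agents: AgentConfig[], requested?: string): AgentConfig {
  if (requested === undefined) {
    const fallback = agents.find((agent) => agent.isDefault) ?? agents[0];
    if (!fallback) {
      throw new Error("no agents configured");
    }
    return fallback;
  }
  const match = agents.find((agent) => agent.id === requested);
  if (!match) {
    // Reject unknown ids rather than silently writing to the default agent.
    throw new Error(`unknown agent: ${requested}`);
  }
  return match;
}
```

Rejecting an unknown id, instead of falling back to the default agent, is a design choice of this sketch; the commit message does not say how the real helpers handle that case.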

Validation:
- `pnpm test src/cli/models-cli.test.ts src/commands/models/auth.test.ts`
- `pnpm test src/commands/models/auth.test.ts`
- `pnpm docs:check-mdx`
- `pnpm check:changed`
- `pnpm check`
- `pnpm build`
- `pnpm test src/cli/run-main.test.ts`

Full `pnpm test` was also run; it failed in unrelated `src/cli/run-main.test.ts` assertions when run in full-suite order, while that exact file passes on its own on both latest main and this branch. The PR diff only touches models auth CLI/auth files, docs, and changelog.

Fixes #71864.

Thanks @neeravmakwana.
2026-04-26 05:30:47 +01:00
Peter Steinberger
ae45eebef1 fix: route remote mac browser through node host 2026-04-26 05:25:59 +01:00
Peter Steinberger
b8aef04ccd docs(config): refresh config baseline hash 2026-04-26 05:20:45 +01:00
Peter Steinberger
f1eef47839 fix(agents): treat empty group replies as silent 2026-04-26 05:17:02 +01:00
Shakker
c953e98c59 docs: clarify provider index foundation scope 2026-04-26 05:14:51 +01:00
Shakker
e827778129 fix: keep provider index previews authoritative 2026-04-26 05:14:51 +01:00
Shakker
f1e28370c4 docs: explain provider index authority 2026-04-26 05:14:51 +01:00
Eulices
008e4ca81f fix: add placeholder transcript for silent voice notes (#49131)
* fix: add placeholder transcript for silent voice notes

* fix: handle placeholder transcripts per skipped attachment

* fix: preserve synthetic transcript attachment order

* fix: scope synthetic audio merge to audio slice only, preserve cross-capability and prefer ordering

Replace the global outputs.sort() with a targeted merge that:
1. Only sorts within the audio output slice (real + synthetic),
   preserving CAPABILITY_ORDER and per-capability attachments.prefer
   ordering for non-audio outputs.
2. Excludes synthetic placeholder indexes from audioAttachmentIndexes
   used by extractFileBlocks, so tiny audio-MIME files with text
   extensions can still be recovered via forcedTextMime.

Adds mergeAudioOutputsPreservingAttachmentOrder helper.
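
The targeted merge in point 1 might look roughly like the sketch below. It is not the actual `mergeAudioOutputsPreservingAttachmentOrder` helper: the `MediaOutput` shape, the `mergeAudioSlice` name, and the choice to append audio at the end when no real audio output exists are all assumptions made for illustration.

```typescript
// Hypothetical output shape; the real types are not shown in the message.
type Capability = "image" | "audio" | "video";

interface MediaOutput {
  capability: Capability;
  attachmentIndex: number;
  synthetic?: boolean;
}

// Sort only the audio outputs (real + synthetic placeholders) by
// attachment index; every non-audio output keeps its existing position.
function mergeAudioSlice(
  outputs: readonly MediaOutput[],
  placeholders: readonly MediaOutput[],
): MediaOutput[] {
  const isAudio = (output: MediaOutput) => output.capability === "audio";
  const mergedAudio = [...outputs.filter(isAudio), ...placeholders].sort(
    (a, b) => a.attachmentIndex - b.attachmentIndex,
  );
  const firstAudio = outputs.findIndex(isAudio);
  if (firstAudio === -1) {
    // Assumption of this sketch: with no real audio, append at the end.
    return [...outputs, ...mergedAudio];
  }
  const before = outputs.slice(0, firstAudio);
  const after = outputs.slice(firstAudio).filter((output) => !isAudio(output));
  return [...before, ...mergedAudio, ...after];
}
```

Unlike a global `outputs.sort()`, the non-audio outputs are never compared with each other, so their capability and preference ordering is left exactly as it was.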

* fix: remove unused function and use toSorted() for oxlint compliance

* fix(media-understanding): preserve selected audio order for synthetic placeholders

- merge synthetic skipped-audio placeholders using audio decision order
  instead of raw attachmentIndex sorting, preserving attachments.prefer
- insert synthetic-only audio outputs at the audio capability slot
  (before video) when no real audio outputs were produced

* fix(media-understanding): use neutral too-small placeholder text

Clarify that this synthetic transcript path is triggered by attachment size,
not by a silence/no-speech detection result.

* test(media-understanding): update too-small audio placeholder expectations

* test(media-understanding): cover mixed too-small audio placeholder

* test(media-understanding): cover too-small audio context

* fix(tasks): preserve visible task title before internal context

* Revert "fix(tasks): preserve visible task title before internal context"

This reverts commit dc536fb4d3c8a01168de5d05e8562193dd68a88e.

---------

Co-authored-by: Eulices Lopez <eulices@users.noreply.github.com>
Co-authored-by: Peter Steinberger <steipete@gmail.com>
2026-04-26 05:14:01 +01:00
Barron Roth
0594fa3c4d TTS: add provider personas 2026-04-26 09:42:38 +05:30
likewen-tech
86328585fa fix(tasks): terminalize gateway agent run ledger
Terminalize Gateway-backed async task records from the run result while preserving aborted, failed, cancelled, and lost outcomes.

Thanks @likewen-tech.
2026-04-26 05:06:33 +01:00
Peter Steinberger
b277eac656 fix: pin macos ssh remote url to loopback 2026-04-26 05:01:25 +01:00
Peter Steinberger
9ed11d6c49 fix: steer agents to safe gateway config flow 2026-04-26 05:00:17 +01:00
Peter Steinberger
29741f696a fix(feishu): transcribe inbound voice notes 2026-04-26 04:47:45 +01:00
Peter Steinberger
540c70d166 fix(plugins): ignore bundled load path aliases 2026-04-26 04:46:05 +01:00
Shakker
26a647d4bb docs: scope manifest model list note 2026-04-26 04:41:51 +01:00
Shakker
469bd5f51e docs: mention manifest model list rows 2026-04-26 04:41:51 +01:00
Peter Steinberger
9e4a0e7f3c fix(qqbot): ignore bot self-echo events 2026-04-26 04:40:53 +01:00
Peter Steinberger
e40094a9ef test(browser): add CDP snapshot Docker smoke 2026-04-26 04:40:26 +01:00
Peter Steinberger
4edf22f63f fix(acpx): avoid startup agent probes by default 2026-04-26 04:40:26 +01:00
Peter Steinberger
ed1ac2fc44 feat(browser): add CDP role snapshot fallback 2026-04-26 04:40:26 +01:00
Peter Steinberger
6d4f65c9d4 docs: clarify codex runtime routing 2026-04-26 04:38:39 +01:00
Peter Steinberger
6336ed4166 fix: gate codex acp route hints 2026-04-26 04:36:26 +01:00
Peter Steinberger
b58223510c fix(providers): support zai preserved thinking 2026-04-26 04:35:50 +01:00
Peter Steinberger
2c8c79de5c fix(tts): normalize streamed tts voice media 2026-04-26 04:28:19 +01:00
Pinghuachiu
7b943667a0 fix: expose image edit geometry flags in capability cli
Expose image edit geometry flags in the capability CLI and document the new infer options.

Thanks @Pinghuachiu.
2026-04-26 04:22:22 +01:00
Peter Steinberger
ee8f41f56e fix(channels): strip copied inbound metadata from replies 2026-04-26 04:21:20 +01:00
Vincent Koc
7fef13abbc docs(anthropic): note context1m param applies to Claude CLI backend
Ayaan's 28e4cd81a9 (#70863, thanks @bidadh, source from Arthur Kazemi
8abbae0101) extended params.context1m:true so the configured 1M
context window override now applies to eligible Claude CLI Opus and
Sonnet models, not only direct API calls. CHANGELOG entry covered
the change but docs/providers/anthropic.md '1M context window (beta)'
Accordion only described direct-API behavior, so Claude CLI users had
no signal the same param works for their backend. Add a sentence
inside the same Accordion.
2026-04-25 20:18:51 -07:00
Shakker
862b39976d fix: remove managed plugin files on uninstall 2026-04-26 04:16:33 +01:00
Shakker
48ba3a4198 fix: clean migrated plugin install config 2026-04-26 04:16:33 +01:00
Shakker
f5f4477bae fix: reject manifestless plugin archives 2026-04-26 04:16:33 +01:00
Peter Steinberger
a91baa16de fix(tts): honor explicit directive providers 2026-04-26 04:14:48 +01:00
Peter Steinberger
cf834e2a21 fix(tts): clean streamed directive text 2026-04-26 04:09:56 +01:00
Peter Steinberger
ed5276f9b9 fix(providers): keep vllm nemotron replies visible 2026-04-26 03:54:20 +01:00
Peter Steinberger
7a85c1a822 fix(tts): surface voice status and harden providers 2026-04-26 03:51:30 +01:00
Peter Steinberger
0cf30b6a65 docs(tasks): document retained lost task audit 2026-04-26 03:50:40 +01:00