docs: tighten subscription guidance and update MiniMax M2.5 refs

Peter Steinberger
2026-03-03 00:02:25 +00:00
parent 3e1ec5ad8b
commit 6b85ec3022
54 changed files with 272 additions and 245 deletions

@@ -516,7 +516,7 @@ Even with strong system prompts, **prompt injection is not solved**. System prom
- Run sensitive tool execution in a sandbox; keep secrets out of the agent's reachable filesystem.
- Note: sandboxing is opt-in. If sandbox mode is off, exec runs on the gateway host even though tools.exec.host defaults to sandbox, and host exec does not require approvals unless you set host=gateway and configure exec approvals.
- Limit high-risk tools (`exec`, `browser`, `web_fetch`, `web_search`) to trusted agents or explicit allowlists.
-- **Model choice matters:** older/legacy models can be less robust against prompt injection and tool misuse. Prefer modern, instruction-hardened models for any bot with tools. We recommend Anthropic Opus 4.6 (or the latest Opus) because it's strong at recognizing prompt injections (see [“A step forward on safety”](https://www.anthropic.com/news/claude-opus-4-5)).
+- **Model choice matters:** older/legacy models can be less robust against prompt injection and tool misuse. Prefer the strongest latest-generation, instruction-hardened model available for any bot with tools.
Red flags to treat as untrusted:
@@ -570,7 +570,7 @@ Prompt injection resistance is **not** uniform across model tiers. Smaller/cheap
Recommendations:
- **Use the latest generation, best-tier model** for any bot that can run tools or touch files/networks.
-- **Avoid weaker tiers** (for example, Sonnet or Haiku) for tool-enabled agents or untrusted inboxes.
+- **Avoid older/weaker tiers** for tool-enabled agents or untrusted inboxes.
- If you must use a smaller model, **reduce blast radius** (read-only tools, strong sandboxing, minimal filesystem access, strict allowlists).
- When running small models, **enable sandboxing for all sessions** and **disable web_search/web_fetch/browser** unless inputs are tightly controlled.
- For chat-only personal assistants with trusted input and no tools, smaller models are usually fine.
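
The recommendations above can be read as a small checklist. The following is a minimal sketch of such a check, not part of this project's tooling: `AgentProfile`, its fields, and `audit` are hypothetical names invented for illustration and do not correspond to a real config schema. Only the tool names come from the guidance itself.

```python
from __future__ import annotations

from dataclasses import dataclass, field

# Tool names taken from the guidance above.
HIGH_RISK_TOOLS = {"exec", "browser", "web_fetch", "web_search"}
WEB_TOOLS = {"web_search", "web_fetch", "browser"}


@dataclass
class AgentProfile:
    """Hypothetical summary of one agent; NOT a real config schema."""
    name: str
    small_model: bool      # True for a smaller/weaker-tier model
    sandboxed: bool        # True if sandboxing is enabled for all sessions
    trusted_input: bool    # True if only trusted senders can reach the agent
    tools: set[str] = field(default_factory=set)


def audit(profile: AgentProfile) -> list[str]:
    """Return warnings where a profile departs from the recommendations."""
    # Chat-only assistants with trusted input and no tools are usually fine.
    if not profile.tools and profile.trusted_input:
        return []

    warnings: list[str] = []

    # High-risk tools should run sandboxed.
    risky = sorted(profile.tools & HIGH_RISK_TOOLS)
    if risky and not profile.sandboxed:
        warnings.append("high-risk tools without sandbox: " + ", ".join(risky))

    if profile.small_model:
        # Small models: enable sandboxing for all sessions.
        if not profile.sandboxed:
            warnings.append("small model: enable sandboxing for all sessions")
        # Small models: disable web tools unless inputs are tightly controlled.
        web = sorted(profile.tools & WEB_TOOLS)
        if web and not profile.trusted_input:
            warnings.append("small model: disable " + ", ".join(web))

    return warnings
```

For example, under these assumptions `audit(AgentProfile("inbox", small_model=True, sandboxed=False, trusted_input=False, tools={"exec", "web_fetch"}))` returns three warnings, while a tool-free profile with trusted input returns none.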