mirror of
https://github.com/openclaw/openclaw.git
synced 2026-04-12 01:31:08 +00:00
feat(memory): add Bedrock embedding provider for memory search (#61547)
* feat(memory): add Bedrock embedding provider for memory search Add Amazon Bedrock as a native embedding provider for memory search. Supports Titan Embed Text v1/v2 and Cohere Embed models via AWS SDK. - New embeddings-bedrock.ts: BedrockRuntimeClient + InvokeModel - Auth via AWS default credential chain (same as Bedrock inference) - Auto-selected in 'auto' mode when AWS credentials are detected - Titan V2: configurable dimensions (256/512/1024), normalization - Cohere: native batch support with search_query/search_document types - 16 new tests covering all model types, auth detection, edge cases Closes #26289 * fix(memory): harden bedrock embedding selection --------- Co-authored-by: Vincent Koc <vincentkoc@ieee.org>
This commit is contained in:
@@ -39,6 +39,7 @@ Docs: https://docs.openclaw.ai
|
||||
- Docs/IRC: replace public IRC hostname examples with `irc.example.com` and recommend private servers for bot coordination while listing common public networks for intentional use.
|
||||
- Memory/dreaming: write dreaming trail content to top-level `dreams.md` instead of daily memory notes, update `/dreaming` help text to point there, and keep `dreams.md` available for explicit reads without pulling it into default recall. Thanks @davemorin.
|
||||
- Memory/dreaming: add the Dream Diary surface in Dreams, simplify user-facing dreaming config to `enabled` plus optional `frequency`, treat phases as implementation detail in docs/UI, and keep the lobster animation visible above diary content. Thanks @vignesh07.
|
||||
- Memory/search: add Amazon Bedrock embeddings for Titan, Cohere, Nova, and TwelveLabs models, with AWS credential-chain auto-detection for `provider: "auto"` and provider-specific dimension controls. Thanks @wirjo.
|
||||
|
||||
### Fixes
|
||||
|
||||
|
||||
@@ -1,4 +1,4 @@
|
||||
57a3b1cc7d573c3788a670d927eac947fb1685384804f5c3c926f702a27fe00b config-baseline.json
|
||||
82163136ff466db3caa61290fd65a8b8dd9487fc61f3871c177f96fcecf9e29b config-baseline.core.json
|
||||
ae67508350baf891b902348d55fada6c17e9c053adf53aaf3a8b92cd364ef3f1 config-baseline.channel.json
|
||||
d972a11d0f86080a722bddfe48990dd1b8fa16eb8e157e83f49bd46a5941c512 config-baseline.plugin.json
|
||||
73fbcd00d17685b462dfb11aff74baae99265ae5671db28893d8608456daa44e config-baseline.json
|
||||
effaf240920c16fce2c78af52dec15aa9ceb049e34f703c568669cb6beef3f91 config-baseline.core.json
|
||||
3c999707b167138de34f6255e3488b99e404c5132d3fc5879a1fa12d815c31f5 config-baseline.channel.json
|
||||
031b237717ca108ea2cd314413db4c91edfdfea55f808179e3066331f41af134 config-baseline.plugin.json
|
||||
|
||||
@@ -35,14 +35,15 @@ node-llama-cpp).
|
||||
|
||||
## Supported providers
|
||||
|
||||
| Provider | ID | Needs API key | Notes |
|
||||
| -------- | --------- | ------------- | ----------------------------- |
|
||||
| OpenAI | `openai` | Yes | Auto-detected, fast |
|
||||
| Gemini | `gemini` | Yes | Supports image/audio indexing |
|
||||
| Voyage | `voyage` | Yes | Auto-detected |
|
||||
| Mistral | `mistral` | Yes | Auto-detected |
|
||||
| Ollama | `ollama` | No | Local, must set explicitly |
|
||||
| Local | `local` | No | GGUF model, ~0.6 GB download |
|
||||
| Provider | ID | Needs API key | Notes |
|
||||
| -------- | --------- | ------------- | ---------------------------------------------------- |
|
||||
| OpenAI | `openai` | Yes | Auto-detected, fast |
|
||||
| Gemini | `gemini` | Yes | Supports image/audio indexing |
|
||||
| Voyage | `voyage` | Yes | Auto-detected |
|
||||
| Mistral | `mistral` | Yes | Auto-detected |
|
||||
| Bedrock | `bedrock` | No | Auto-detected when the AWS credential chain resolves |
|
||||
| Ollama | `ollama` | No | Local, must set explicitly |
|
||||
| Local | `local` | No | GGUF model, ~0.6 GB download |
|
||||
|
||||
## How search works
|
||||
|
||||
|
||||
@@ -271,3 +271,32 @@ grounding checks.
|
||||
|
||||
The IAM principal used by the gateway must have the `bedrock:ApplyGuardrail`
|
||||
permission in addition to the standard invoke permissions.
|
||||
|
||||
## Embeddings for memory search
|
||||
|
||||
Bedrock can also serve as the embedding provider for
|
||||
[memory search](/concepts/memory-search). This is configured separately from the
|
||||
inference provider — set `agents.defaults.memorySearch.provider` to `"bedrock"`:
|
||||
|
||||
```json5
|
||||
{
|
||||
agents: {
|
||||
defaults: {
|
||||
memorySearch: {
|
||||
provider: "bedrock",
|
||||
model: "amazon.titan-embed-text-v2:0", // default
|
||||
},
|
||||
},
|
||||
},
|
||||
}
|
||||
```
|
||||
|
||||
Bedrock embeddings use the same AWS SDK credential chain as inference (instance
|
||||
roles, SSO, access keys, shared config, and web identity). No API key is
|
||||
needed. When `provider` is `"auto"`, Bedrock is auto-detected if that
|
||||
credential chain resolves successfully.
|
||||
|
||||
Supported embedding models include Amazon Titan Embed (v1, v2), Amazon Nova
|
||||
Embed, Cohere Embed (v3, v4), and TwelveLabs Marengo. See
|
||||
[Memory configuration reference — Bedrock](/reference/memory-config#bedrock-embedding-config)
|
||||
for the full model list and dimension options.
|
||||
|
||||
@@ -25,12 +25,12 @@ All memory search settings live under `agents.defaults.memorySearch` in
|
||||
|
||||
## Provider selection
|
||||
|
||||
| Key | Type | Default | Description |
|
||||
| ---------- | --------- | ---------------- | -------------------------------------------------------------------------------- |
|
||||
| `provider` | `string` | auto-detected | Embedding adapter ID: `openai`, `gemini`, `voyage`, `mistral`, `ollama`, `local` |
|
||||
| `model` | `string` | provider default | Embedding model name |
|
||||
| `fallback` | `string` | `"none"` | Fallback adapter ID when the primary fails |
|
||||
| `enabled` | `boolean` | `true` | Enable or disable memory search |
|
||||
| Key | Type | Default | Description |
|
||||
| ---------- | --------- | ---------------- | ------------------------------------------------------------------------------------------- |
|
||||
| `provider` | `string` | auto-detected | Embedding adapter ID: `openai`, `gemini`, `voyage`, `mistral`, `bedrock`, `ollama`, `local` |
|
||||
| `model` | `string` | provider default | Embedding model name |
|
||||
| `fallback` | `string` | `"none"` | Fallback adapter ID when the primary fails |
|
||||
| `enabled` | `boolean` | `true` | Enable or disable memory search |
|
||||
|
||||
### Auto-detection order
|
||||
|
||||
@@ -41,13 +41,14 @@ When `provider` is not set, OpenClaw selects the first available:
|
||||
3. `gemini` -- if a Gemini key can be resolved.
|
||||
4. `voyage` -- if a Voyage key can be resolved.
|
||||
5. `mistral` -- if a Mistral key can be resolved.
|
||||
6. `bedrock` -- if the AWS SDK credential chain resolves (instance role, access keys, profile, SSO, web identity, or shared config).
|
||||
|
||||
`ollama` is supported but not auto-detected (set it explicitly).
|
||||
|
||||
### API key resolution
|
||||
|
||||
Remote embeddings require an API key. OpenClaw resolves from:
|
||||
auth profiles, `models.providers.*.apiKey`, or environment variables.
|
||||
Remote embeddings require an API key. Bedrock uses the AWS SDK default
|
||||
credential chain instead (instance roles, SSO, access keys).
|
||||
|
||||
| Provider | Env var | Config key |
|
||||
| -------- | ------------------------------ | --------------------------------- |
|
||||
@@ -55,6 +56,7 @@ auth profiles, `models.providers.*.apiKey`, or environment variables.
|
||||
| Gemini | `GEMINI_API_KEY` | `models.providers.google.apiKey` |
|
||||
| Voyage | `VOYAGE_API_KEY` | `models.providers.voyage.apiKey` |
|
||||
| Mistral | `MISTRAL_API_KEY` | `models.providers.mistral.apiKey` |
|
||||
| Bedrock | AWS credential chain | No API key needed |
|
||||
| Ollama | `OLLAMA_API_KEY` (placeholder) | -- |
|
||||
|
||||
Codex OAuth covers chat/completions only and does not satisfy embedding
|
||||
@@ -104,6 +106,84 @@ Changing model or `outputDimensionality` triggers an automatic full reindex.
|
||||
|
||||
---
|
||||
|
||||
## Bedrock embedding config
|
||||
|
||||
Bedrock uses the AWS SDK default credential chain -- no API keys needed.
|
||||
If OpenClaw runs on EC2 with a Bedrock-enabled instance role, just set the
|
||||
provider and model:
|
||||
|
||||
```json5
|
||||
{
|
||||
agents: {
|
||||
defaults: {
|
||||
memorySearch: {
|
||||
provider: "bedrock",
|
||||
model: "amazon.titan-embed-text-v2:0",
|
||||
},
|
||||
},
|
||||
},
|
||||
}
|
||||
```
|
||||
|
||||
| Key | Type | Default | Description |
|
||||
| ---------------------- | -------- | ------------------------------ | ------------------------------- |
|
||||
| `model` | `string` | `amazon.titan-embed-text-v2:0` | Any Bedrock embedding model ID |
|
||||
| `outputDimensionality` | `number` | model default | For Titan V2: 256, 512, or 1024 |
|
||||
|
||||
### Supported models
|
||||
|
||||
The following models are supported (with family detection and dimension
|
||||
defaults):
|
||||
|
||||
| Model ID | Provider | Default Dims | Configurable Dims |
|
||||
| ------------------------------------------ | ---------- | ------------ | -------------------- |
|
||||
| `amazon.titan-embed-text-v2:0` | Amazon | 1024 | 256, 512, 1024 |
|
||||
| `amazon.titan-embed-text-v1` | Amazon | 1536 | -- |
|
||||
| `amazon.titan-embed-g1-text-02` | Amazon | 1536 | -- |
|
||||
| `amazon.titan-embed-image-v1` | Amazon | 1024 | -- |
|
||||
| `amazon.nova-2-multimodal-embeddings-v1:0` | Amazon | 1024 | 256, 384, 1024, 3072 |
|
||||
| `cohere.embed-english-v3` | Cohere | 1024 | -- |
|
||||
| `cohere.embed-multilingual-v3` | Cohere | 1024 | -- |
|
||||
| `cohere.embed-v4:0` | Cohere | 1536 | 256-1536 |
|
||||
| `twelvelabs.marengo-embed-3-0-v1:0` | TwelveLabs | 512 | -- |
|
||||
| `twelvelabs.marengo-embed-2-7-v1:0` | TwelveLabs | 1024 | -- |
|
||||
|
||||
Throughput-suffixed variants (e.g., `amazon.titan-embed-text-v1:2:8k`) inherit
|
||||
the base model's configuration.
|
||||
|
||||
### Authentication
|
||||
|
||||
Bedrock auth uses the standard AWS SDK credential resolution order:
|
||||
|
||||
1. Environment variables (`AWS_ACCESS_KEY_ID` + `AWS_SECRET_ACCESS_KEY`)
|
||||
2. SSO token cache
|
||||
3. Web identity token credentials
|
||||
4. Shared credentials and config files
|
||||
5. ECS or EC2 metadata credentials
|
||||
|
||||
Region is resolved from `AWS_REGION`, `AWS_DEFAULT_REGION`, the
|
||||
`amazon-bedrock` provider `baseUrl`, or defaults to `us-east-1`.
|
||||
|
||||
### IAM permissions
|
||||
|
||||
The IAM role or user needs:
|
||||
|
||||
```json
|
||||
{
|
||||
"Effect": "Allow",
|
||||
"Action": "bedrock:InvokeModel",
|
||||
"Resource": "*"
|
||||
}
|
||||
```
|
||||
|
||||
For least-privilege, scope `InvokeModel` to the specific model:
|
||||
|
||||
```
|
||||
arn:aws:bedrock:*::foundation-model/amazon.titan-embed-text-v2:0
|
||||
```
|
||||
|
||||
---
|
||||
|
||||
## Local embedding config
|
||||
|
||||
| Key | Type | Default | Description |
|
||||
|
||||
@@ -1145,6 +1145,8 @@
|
||||
"@agentclientprotocol/sdk": "0.18.0",
|
||||
"@anthropic-ai/vertex-sdk": "^0.14.4",
|
||||
"@aws-sdk/client-bedrock": "3.1023.0",
|
||||
"@aws-sdk/client-bedrock-runtime": "3.1023.0",
|
||||
"@aws-sdk/credential-provider-node": "3.972.29",
|
||||
"@clack/prompts": "^1.2.0",
|
||||
"@homebridge/ciao": "^1.3.6",
|
||||
"@line/bot-sdk": "^10.6.0",
|
||||
|
||||
377
packages/memory-host-sdk/src/host/embeddings-bedrock.test.ts
Normal file
377
packages/memory-host-sdk/src/host/embeddings-bedrock.test.ts
Normal file
@@ -0,0 +1,377 @@
|
||||
import { afterEach, beforeAll, beforeEach, describe, expect, it, vi } from "vitest";
|
||||
|
||||
const { defaultProviderMock, resolveCredentialsMock, sendMock } = vi.hoisted(() => ({
|
||||
defaultProviderMock: vi.fn(),
|
||||
resolveCredentialsMock: vi.fn(),
|
||||
sendMock: vi.fn(),
|
||||
}));
|
||||
|
||||
vi.mock("@aws-sdk/client-bedrock-runtime", () => {
|
||||
class MockClient {
|
||||
region: string;
|
||||
constructor(config: { region: string }) {
|
||||
this.region = config.region;
|
||||
}
|
||||
send = sendMock;
|
||||
}
|
||||
class MockCommand {
|
||||
input: unknown;
|
||||
constructor(input: unknown) {
|
||||
this.input = input;
|
||||
}
|
||||
}
|
||||
return { BedrockRuntimeClient: MockClient, InvokeModelCommand: MockCommand };
|
||||
});
|
||||
|
||||
vi.mock("@aws-sdk/credential-provider-node", () => ({
|
||||
defaultProvider: defaultProviderMock.mockImplementation(() => resolveCredentialsMock),
|
||||
}));
|
||||
|
||||
let createBedrockEmbeddingProvider: typeof import("./embeddings-bedrock.js").createBedrockEmbeddingProvider;
|
||||
let resolveBedrockEmbeddingClient: typeof import("./embeddings-bedrock.js").resolveBedrockEmbeddingClient;
|
||||
let normalizeBedrockEmbeddingModel: typeof import("./embeddings-bedrock.js").normalizeBedrockEmbeddingModel;
|
||||
let hasAwsCredentials: typeof import("./embeddings-bedrock.js").hasAwsCredentials;
|
||||
|
||||
beforeAll(async () => {
|
||||
({
|
||||
createBedrockEmbeddingProvider,
|
||||
resolveBedrockEmbeddingClient,
|
||||
normalizeBedrockEmbeddingModel,
|
||||
hasAwsCredentials,
|
||||
} = await import("./embeddings-bedrock.js"));
|
||||
});
|
||||
|
||||
beforeEach(() => {
|
||||
defaultProviderMock.mockImplementation(() => resolveCredentialsMock);
|
||||
});
|
||||
|
||||
const enc = (body: unknown) => ({ body: new TextEncoder().encode(JSON.stringify(body)) });
|
||||
const reqBody = (i = 0): Record<string, unknown> =>
|
||||
JSON.parse(sendMock.mock.calls[i][0].input.body);
|
||||
|
||||
describe("bedrock embedding provider", () => {
|
||||
const originalEnv = process.env;
|
||||
afterEach(() => {
|
||||
process.env = originalEnv;
|
||||
vi.restoreAllMocks();
|
||||
defaultProviderMock.mockClear();
|
||||
resolveCredentialsMock.mockReset();
|
||||
sendMock.mockReset();
|
||||
});
|
||||
|
||||
// --- Normalization ---
|
||||
|
||||
it("normalizes model names with prefixes", () => {
|
||||
expect(normalizeBedrockEmbeddingModel("bedrock/amazon.titan-embed-text-v2:0")).toBe(
|
||||
"amazon.titan-embed-text-v2:0",
|
||||
);
|
||||
expect(normalizeBedrockEmbeddingModel("amazon-bedrock/cohere.embed-english-v3")).toBe(
|
||||
"cohere.embed-english-v3",
|
||||
);
|
||||
expect(normalizeBedrockEmbeddingModel("")).toBe("amazon.titan-embed-text-v2:0");
|
||||
});
|
||||
|
||||
// --- Client resolution ---
|
||||
|
||||
it("resolves region from env", () => {
|
||||
process.env = { ...originalEnv, AWS_REGION: "eu-west-1" };
|
||||
const c = resolveBedrockEmbeddingClient({
|
||||
config: {} as never,
|
||||
provider: "bedrock",
|
||||
model: "amazon.titan-embed-text-v2:0",
|
||||
fallback: "none",
|
||||
});
|
||||
expect(c.region).toBe("eu-west-1");
|
||||
expect(c.dimensions).toBe(1024);
|
||||
});
|
||||
|
||||
it("defaults to us-east-1", () => {
|
||||
process.env = { ...originalEnv };
|
||||
delete process.env.AWS_REGION;
|
||||
delete process.env.AWS_DEFAULT_REGION;
|
||||
expect(
|
||||
resolveBedrockEmbeddingClient({
|
||||
config: {} as never,
|
||||
provider: "bedrock",
|
||||
model: "amazon.titan-embed-text-v2:0",
|
||||
fallback: "none",
|
||||
}).region,
|
||||
).toBe("us-east-1");
|
||||
});
|
||||
|
||||
it("extracts region from baseUrl", () => {
|
||||
process.env = { ...originalEnv };
|
||||
delete process.env.AWS_REGION;
|
||||
const c = resolveBedrockEmbeddingClient({
|
||||
config: {
|
||||
models: {
|
||||
providers: {
|
||||
"amazon-bedrock": { baseUrl: "https://bedrock-runtime.ap-southeast-2.amazonaws.com" },
|
||||
},
|
||||
},
|
||||
} as never,
|
||||
provider: "bedrock",
|
||||
model: "amazon.titan-embed-text-v2:0",
|
||||
fallback: "none",
|
||||
});
|
||||
expect(c.region).toBe("ap-southeast-2");
|
||||
});
|
||||
|
||||
it("validates dimensions", () => {
|
||||
expect(() =>
|
||||
resolveBedrockEmbeddingClient({
|
||||
config: {} as never,
|
||||
provider: "bedrock",
|
||||
model: "amazon.titan-embed-text-v2:0",
|
||||
fallback: "none",
|
||||
outputDimensionality: 768,
|
||||
}),
|
||||
).toThrow("Invalid dimensions 768");
|
||||
});
|
||||
|
||||
it("accepts valid dimensions", () => {
|
||||
expect(
|
||||
resolveBedrockEmbeddingClient({
|
||||
config: {} as never,
|
||||
provider: "bedrock",
|
||||
model: "amazon.titan-embed-text-v2:0",
|
||||
fallback: "none",
|
||||
outputDimensionality: 256,
|
||||
}).dimensions,
|
||||
).toBe(256);
|
||||
});
|
||||
|
||||
it("resolves throughput-suffixed variants", () => {
|
||||
expect(
|
||||
resolveBedrockEmbeddingClient({
|
||||
config: {} as never,
|
||||
provider: "bedrock",
|
||||
model: "amazon.titan-embed-text-v1:2:8k",
|
||||
fallback: "none",
|
||||
}).dimensions,
|
||||
).toBe(1536);
|
||||
});
|
||||
|
||||
// --- Credential detection ---
|
||||
|
||||
it("detects access keys", async () => {
|
||||
await expect(
|
||||
hasAwsCredentials({
|
||||
AWS_ACCESS_KEY_ID: "A",
|
||||
AWS_SECRET_ACCESS_KEY: "s",
|
||||
} as NodeJS.ProcessEnv),
|
||||
).resolves.toBe(true);
|
||||
});
|
||||
it("detects profile", async () => {
|
||||
await expect(hasAwsCredentials({ AWS_PROFILE: "default" } as NodeJS.ProcessEnv)).resolves.toBe(
|
||||
true,
|
||||
);
|
||||
});
|
||||
it("detects ECS task role", async () => {
|
||||
await expect(
|
||||
hasAwsCredentials({ AWS_CONTAINER_CREDENTIALS_RELATIVE_URI: "/v2" } as NodeJS.ProcessEnv),
|
||||
).resolves.toBe(true);
|
||||
});
|
||||
it("detects EKS IRSA", async () => {
|
||||
await expect(
|
||||
hasAwsCredentials({
|
||||
AWS_WEB_IDENTITY_TOKEN_FILE: "/var/run/secrets/token",
|
||||
AWS_ROLE_ARN: "arn:aws:iam::123:role/x",
|
||||
} as NodeJS.ProcessEnv),
|
||||
).resolves.toBe(true);
|
||||
});
|
||||
it("detects credentials via the AWS SDK default provider chain", async () => {
|
||||
resolveCredentialsMock.mockResolvedValue({ accessKeyId: "AKIAEXAMPLE" });
|
||||
await expect(hasAwsCredentials({} as NodeJS.ProcessEnv)).resolves.toBe(true);
|
||||
expect(defaultProviderMock).toHaveBeenCalledWith({ timeout: 1000, maxRetries: 0 });
|
||||
});
|
||||
it("returns false with no creds", async () => {
|
||||
resolveCredentialsMock.mockRejectedValue(new Error("no aws credentials"));
|
||||
await expect(hasAwsCredentials({} as NodeJS.ProcessEnv)).resolves.toBe(false);
|
||||
});
|
||||
|
||||
// --- Titan V2 ---
|
||||
|
||||
it("embeds with Titan V2", async () => {
|
||||
sendMock.mockResolvedValue(enc({ embedding: [0.1, 0.2, 0.3] }));
|
||||
const { provider } = await createBedrockEmbeddingProvider({
|
||||
config: {} as never,
|
||||
provider: "bedrock",
|
||||
model: "amazon.titan-embed-text-v2:0",
|
||||
fallback: "none",
|
||||
});
|
||||
expect(await provider.embedQuery("test")).toHaveLength(3);
|
||||
expect(reqBody()).toMatchObject({ inputText: "test", normalize: true, dimensions: 1024 });
|
||||
});
|
||||
|
||||
it("returns empty for blank text", async () => {
|
||||
const { provider } = await createBedrockEmbeddingProvider({
|
||||
config: {} as never,
|
||||
provider: "bedrock",
|
||||
model: "amazon.titan-embed-text-v2:0",
|
||||
fallback: "none",
|
||||
});
|
||||
expect(await provider.embedQuery(" ")).toEqual([]);
|
||||
expect(sendMock).not.toHaveBeenCalled();
|
||||
});
|
||||
|
||||
it("batches Titan V2 concurrently", async () => {
|
||||
sendMock
|
||||
.mockResolvedValueOnce(enc({ embedding: [0.1] }))
|
||||
.mockResolvedValueOnce(enc({ embedding: [0.2] }));
|
||||
const { provider } = await createBedrockEmbeddingProvider({
|
||||
config: {} as never,
|
||||
provider: "bedrock",
|
||||
model: "amazon.titan-embed-text-v2:0",
|
||||
fallback: "none",
|
||||
});
|
||||
expect(await provider.embedBatch(["a", "b"])).toHaveLength(2);
|
||||
expect(sendMock).toHaveBeenCalledTimes(2);
|
||||
});
|
||||
|
||||
// --- Titan V1 ---
|
||||
|
||||
it("sends only inputText for Titan V1", async () => {
|
||||
sendMock.mockResolvedValue(enc({ embedding: [0.5] }));
|
||||
const { provider } = await createBedrockEmbeddingProvider({
|
||||
config: {} as never,
|
||||
provider: "bedrock",
|
||||
model: "amazon.titan-embed-text-v1",
|
||||
fallback: "none",
|
||||
});
|
||||
await provider.embedQuery("hi");
|
||||
expect(reqBody()).toEqual({ inputText: "hi" });
|
||||
});
|
||||
|
||||
it("handles Titan G1 text variant", async () => {
|
||||
sendMock.mockResolvedValue(enc({ embedding: [0.1] }));
|
||||
const { provider } = await createBedrockEmbeddingProvider({
|
||||
config: {} as never,
|
||||
provider: "bedrock",
|
||||
model: "amazon.titan-embed-g1-text-02",
|
||||
fallback: "none",
|
||||
});
|
||||
await provider.embedQuery("hi");
|
||||
expect(reqBody()).toEqual({ inputText: "hi" });
|
||||
});
|
||||
|
||||
// --- Cohere V3 ---
|
||||
|
||||
it("embeds Cohere V3 batch in single call", async () => {
|
||||
sendMock.mockResolvedValue(enc({ embeddings: [[0.1], [0.2]] }));
|
||||
const { provider } = await createBedrockEmbeddingProvider({
|
||||
config: {} as never,
|
||||
provider: "bedrock",
|
||||
model: "cohere.embed-english-v3",
|
||||
fallback: "none",
|
||||
});
|
||||
expect(await provider.embedBatch(["a", "b"])).toHaveLength(2);
|
||||
expect(sendMock).toHaveBeenCalledTimes(1);
|
||||
expect(reqBody()).toMatchObject({ texts: ["a", "b"], input_type: "search_document" });
|
||||
});
|
||||
|
||||
it("uses search_query for Cohere embedQuery", async () => {
|
||||
sendMock.mockResolvedValue(enc({ embeddings: [[0.1]] }));
|
||||
const { provider } = await createBedrockEmbeddingProvider({
|
||||
config: {} as never,
|
||||
provider: "bedrock",
|
||||
model: "cohere.embed-english-v3",
|
||||
fallback: "none",
|
||||
});
|
||||
await provider.embedQuery("q");
|
||||
expect(reqBody().input_type).toBe("search_query");
|
||||
});
|
||||
|
||||
// --- Cohere V4 ---
|
||||
|
||||
it("embeds Cohere V4 with embedding_types + output_dimension", async () => {
|
||||
sendMock.mockResolvedValue(enc({ embeddings: { float: [[0.1], [0.2]] } }));
|
||||
const { provider } = await createBedrockEmbeddingProvider({
|
||||
config: {} as never,
|
||||
provider: "bedrock",
|
||||
model: "cohere.embed-v4:0",
|
||||
fallback: "none",
|
||||
});
|
||||
expect(await provider.embedBatch(["a", "b"])).toHaveLength(2);
|
||||
expect(reqBody()).toMatchObject({ embedding_types: ["float"], output_dimension: 1536 });
|
||||
});
|
||||
|
||||
it("validates Cohere V4 dimensions", () => {
|
||||
expect(() =>
|
||||
resolveBedrockEmbeddingClient({
|
||||
config: {} as never,
|
||||
provider: "bedrock",
|
||||
model: "cohere.embed-v4:0",
|
||||
fallback: "none",
|
||||
outputDimensionality: 2048,
|
||||
}),
|
||||
).toThrow("Invalid dimensions 2048");
|
||||
});
|
||||
|
||||
// --- Nova ---
|
||||
|
||||
it("embeds Nova with SINGLE_EMBEDDING format", async () => {
|
||||
sendMock.mockResolvedValue(
|
||||
enc({ embeddings: [{ embeddingType: "TEXT", embedding: [0.1, 0.2] }] }),
|
||||
);
|
||||
const { provider } = await createBedrockEmbeddingProvider({
|
||||
config: {} as never,
|
||||
provider: "bedrock",
|
||||
model: "amazon.nova-2-multimodal-embeddings-v1:0",
|
||||
fallback: "none",
|
||||
});
|
||||
expect(await provider.embedQuery("hi")).toHaveLength(2);
|
||||
expect(reqBody().taskType).toBe("SINGLE_EMBEDDING");
|
||||
});
|
||||
|
||||
it("validates Nova dimensions", () => {
|
||||
expect(() =>
|
||||
resolveBedrockEmbeddingClient({
|
||||
config: {} as never,
|
||||
provider: "bedrock",
|
||||
model: "amazon.nova-2-multimodal-embeddings-v1:0",
|
||||
fallback: "none",
|
||||
outputDimensionality: 512,
|
||||
}),
|
||||
).toThrow("Invalid dimensions 512");
|
||||
});
|
||||
|
||||
it("batches Nova concurrently", async () => {
|
||||
sendMock
|
||||
.mockResolvedValueOnce(enc({ embeddings: [{ embeddingType: "TEXT", embedding: [0.1] }] }))
|
||||
.mockResolvedValueOnce(enc({ embeddings: [{ embeddingType: "TEXT", embedding: [0.2] }] }));
|
||||
const { provider } = await createBedrockEmbeddingProvider({
|
||||
config: {} as never,
|
||||
provider: "bedrock",
|
||||
model: "amazon.nova-2-multimodal-embeddings-v1:0",
|
||||
fallback: "none",
|
||||
});
|
||||
expect(await provider.embedBatch(["a", "b"])).toHaveLength(2);
|
||||
expect(sendMock).toHaveBeenCalledTimes(2);
|
||||
});
|
||||
|
||||
// --- TwelveLabs ---
|
||||
|
||||
it("embeds TwelveLabs Marengo", async () => {
|
||||
sendMock.mockResolvedValue(enc({ data: [{ embedding: [0.1, 0.2] }] }));
|
||||
const { provider } = await createBedrockEmbeddingProvider({
|
||||
config: {} as never,
|
||||
provider: "bedrock",
|
||||
model: "twelvelabs.marengo-embed-3-0-v1:0",
|
||||
fallback: "none",
|
||||
});
|
||||
expect(await provider.embedQuery("hi")).toHaveLength(2);
|
||||
expect(reqBody()).toEqual({ inputType: "text", text: { inputText: "hi" } });
|
||||
});
|
||||
|
||||
it("embeds TwelveLabs object-style responses", async () => {
|
||||
sendMock.mockResolvedValue(enc({ data: { embedding: [0.3, 0.4] } }));
|
||||
const { provider } = await createBedrockEmbeddingProvider({
|
||||
config: {} as never,
|
||||
provider: "bedrock",
|
||||
model: "twelvelabs.marengo-embed-2-7-v1:0",
|
||||
fallback: "none",
|
||||
});
|
||||
expect(await provider.embedQuery("hi")).toEqual([0.6, 0.8]);
|
||||
});
|
||||
});
|
||||
397
packages/memory-host-sdk/src/host/embeddings-bedrock.ts
Normal file
397
packages/memory-host-sdk/src/host/embeddings-bedrock.ts
Normal file
@@ -0,0 +1,397 @@
|
||||
import { sanitizeAndNormalizeEmbedding } from "./embedding-vectors.js";
|
||||
import { debugEmbeddingsLog } from "./embeddings-debug.js";
|
||||
import type { EmbeddingProvider, EmbeddingProviderOptions } from "./embeddings.js";
|
||||
|
||||
// ---------------------------------------------------------------------------
|
||||
// Types & constants
|
||||
// ---------------------------------------------------------------------------
|
||||
|
||||
export type BedrockEmbeddingClient = {
|
||||
region: string;
|
||||
model: string;
|
||||
dimensions?: number;
|
||||
};
|
||||
|
||||
export const DEFAULT_BEDROCK_EMBEDDING_MODEL = "amazon.titan-embed-text-v2:0";
|
||||
|
||||
/** Request/response format family — each has a different API shape. */
|
||||
type Family = "titan-v1" | "titan-v2" | "cohere-v3" | "cohere-v4" | "nova" | "twelvelabs";
|
||||
|
||||
interface ModelSpec {
|
||||
maxTokens: number;
|
||||
dims: number;
|
||||
validDims?: number[];
|
||||
family: Family;
|
||||
}
|
||||
|
||||
// ---------------------------------------------------------------------------
|
||||
// Model catalog
|
||||
// ---------------------------------------------------------------------------
|
||||
|
||||
const MODELS: Record<string, ModelSpec> = {
|
||||
"amazon.titan-embed-text-v2:0": {
|
||||
maxTokens: 8192,
|
||||
dims: 1024,
|
||||
validDims: [256, 512, 1024],
|
||||
family: "titan-v2",
|
||||
},
|
||||
"amazon.titan-embed-text-v1": { maxTokens: 8000, dims: 1536, family: "titan-v1" },
|
||||
"amazon.titan-embed-g1-text-02": { maxTokens: 8000, dims: 1536, family: "titan-v1" },
|
||||
"amazon.titan-embed-image-v1": { maxTokens: 128, dims: 1024, family: "titan-v1" },
|
||||
"cohere.embed-english-v3": { maxTokens: 512, dims: 1024, family: "cohere-v3" },
|
||||
"cohere.embed-multilingual-v3": { maxTokens: 512, dims: 1024, family: "cohere-v3" },
|
||||
"cohere.embed-v4:0": {
|
||||
maxTokens: 128000,
|
||||
dims: 1536,
|
||||
validDims: [256, 384, 512, 768, 1024, 1536],
|
||||
family: "cohere-v4",
|
||||
},
|
||||
"amazon.nova-2-multimodal-embeddings-v1:0": {
|
||||
maxTokens: 8192,
|
||||
dims: 1024,
|
||||
validDims: [256, 384, 1024, 3072],
|
||||
family: "nova",
|
||||
},
|
||||
"twelvelabs.marengo-embed-2-7-v1:0": { maxTokens: 512, dims: 1024, family: "twelvelabs" },
|
||||
"twelvelabs.marengo-embed-3-0-v1:0": { maxTokens: 512, dims: 512, family: "twelvelabs" },
|
||||
};
|
||||
|
||||
/** Resolve spec, stripping throughput suffixes like `:2:8k` or `:0:512`. */
|
||||
function resolveSpec(modelId: string): ModelSpec | undefined {
|
||||
if (MODELS[modelId]) {
|
||||
return MODELS[modelId];
|
||||
}
|
||||
const parts = modelId.split(":");
|
||||
for (let i = parts.length - 1; i >= 1; i--) {
|
||||
const spec = MODELS[parts.slice(0, i).join(":")];
|
||||
if (spec) {
|
||||
return spec;
|
||||
}
|
||||
}
|
||||
return undefined;
|
||||
}
|
||||
|
||||
/** Infer family from model ID prefix when not in catalog. */
|
||||
function inferFamily(modelId: string): Family {
|
||||
const id = modelId.toLowerCase();
|
||||
if (id.startsWith("amazon.titan-embed-text-v2")) {
|
||||
return "titan-v2";
|
||||
}
|
||||
if (id.startsWith("amazon.titan-embed")) {
|
||||
return "titan-v1";
|
||||
}
|
||||
if (id.startsWith("amazon.nova")) {
|
||||
return "nova";
|
||||
}
|
||||
if (id.startsWith("cohere.embed-v4")) {
|
||||
return "cohere-v4";
|
||||
}
|
||||
if (id.startsWith("cohere.embed")) {
|
||||
return "cohere-v3";
|
||||
}
|
||||
if (id.startsWith("twelvelabs.")) {
|
||||
return "twelvelabs";
|
||||
}
|
||||
return "titan-v1"; // safest default — simplest request format
|
||||
}
|
||||
|
||||
// ---------------------------------------------------------------------------
|
||||
// AWS SDK lazy loader
|
||||
// ---------------------------------------------------------------------------
|
||||
|
||||
type SdkClient = import("@aws-sdk/client-bedrock-runtime").BedrockRuntimeClient;
|
||||
type SdkCommand = import("@aws-sdk/client-bedrock-runtime").InvokeModelCommand;
|
||||
|
||||
interface AwsSdk {
|
||||
BedrockRuntimeClient: new (config: { region: string }) => SdkClient;
|
||||
InvokeModelCommand: new (input: {
|
||||
modelId: string;
|
||||
body: string;
|
||||
contentType: string;
|
||||
accept: string;
|
||||
}) => SdkCommand;
|
||||
}
|
||||
|
||||
interface AwsCredentialProviderSdk {
|
||||
defaultProvider: (init?: { timeout?: number; maxRetries?: number }) => () => Promise<{
|
||||
accessKeyId?: string;
|
||||
}>;
|
||||
}
|
||||
|
||||
let sdkCache: AwsSdk | null = null;
|
||||
let credentialProviderSdkCache: AwsCredentialProviderSdk | null | undefined;
|
||||
|
||||
async function loadSdk(): Promise<AwsSdk> {
|
||||
if (sdkCache) {
|
||||
return sdkCache;
|
||||
}
|
||||
try {
|
||||
sdkCache = (await import("@aws-sdk/client-bedrock-runtime")) as unknown as AwsSdk;
|
||||
return sdkCache;
|
||||
} catch {
|
||||
throw new Error(
|
||||
"No API key found for provider bedrock: @aws-sdk/client-bedrock-runtime is not installed. " +
|
||||
"Install it with: npm install @aws-sdk/client-bedrock-runtime",
|
||||
);
|
||||
}
|
||||
}
|
||||
|
||||
async function loadCredentialProviderSdk(): Promise<AwsCredentialProviderSdk | null> {
|
||||
if (credentialProviderSdkCache !== undefined) {
|
||||
return credentialProviderSdkCache;
|
||||
}
|
||||
try {
|
||||
credentialProviderSdkCache =
|
||||
(await import("@aws-sdk/credential-provider-node")) as unknown as AwsCredentialProviderSdk;
|
||||
} catch {
|
||||
credentialProviderSdkCache = null;
|
||||
}
|
||||
return credentialProviderSdkCache;
|
||||
}
|
||||
|
||||
// ---------------------------------------------------------------------------
|
||||
// Helpers
|
||||
// ---------------------------------------------------------------------------
|
||||
|
||||
const MODEL_PREFIX_RE = /^(?:bedrock|amazon-bedrock|aws)\//;
|
||||
const REGION_RE = /bedrock-runtime\.([a-z0-9-]+)\./;
|
||||
|
||||
export function normalizeBedrockEmbeddingModel(model: string): string {
|
||||
const trimmed = model.trim();
|
||||
return trimmed ? trimmed.replace(MODEL_PREFIX_RE, "") : DEFAULT_BEDROCK_EMBEDDING_MODEL;
|
||||
}
|
||||
|
||||
function regionFromUrl(url: string | undefined): string | undefined {
|
||||
return url?.trim() ? REGION_RE.exec(url)?.[1] : undefined;
|
||||
}
|
||||
|
||||
// ---------------------------------------------------------------------------
|
||||
// Request builders
|
||||
// ---------------------------------------------------------------------------
|
||||
|
||||
function buildBody(family: Family, text: string, dims?: number): string {
|
||||
switch (family) {
|
||||
case "titan-v2": {
|
||||
const b: Record<string, unknown> = { inputText: text };
|
||||
if (dims != null) {
|
||||
b.dimensions = dims;
|
||||
b.normalize = true;
|
||||
}
|
||||
return JSON.stringify(b);
|
||||
}
|
||||
case "titan-v1":
|
||||
return JSON.stringify({ inputText: text });
|
||||
case "nova":
|
||||
return JSON.stringify({
|
||||
taskType: "SINGLE_EMBEDDING",
|
||||
singleEmbeddingParams: {
|
||||
embeddingPurpose: "GENERIC_INDEX",
|
||||
embeddingDimension: dims ?? 1024,
|
||||
text: { truncationMode: "END", value: text },
|
||||
},
|
||||
});
|
||||
case "twelvelabs":
|
||||
return JSON.stringify({ inputType: "text", text: { inputText: text } });
|
||||
default:
|
||||
return JSON.stringify({ inputText: text });
|
||||
}
|
||||
}
|
||||
|
||||
function buildCohereBody(
|
||||
family: Family,
|
||||
texts: string[],
|
||||
inputType: "search_query" | "search_document",
|
||||
dims?: number,
|
||||
): string {
|
||||
const body: Record<string, unknown> = { texts, input_type: inputType, truncate: "END" };
|
||||
if (family === "cohere-v4") {
|
||||
body.embedding_types = ["float"];
|
||||
if (dims != null) {
|
||||
body.output_dimension = dims;
|
||||
}
|
||||
}
|
||||
return JSON.stringify(body);
|
||||
}
|
||||
|
||||
// ---------------------------------------------------------------------------
|
||||
// Response parsers
|
||||
// ---------------------------------------------------------------------------
|
||||
|
||||
function parseSingle(family: Family, raw: string): number[] {
|
||||
const data = JSON.parse(raw);
|
||||
switch (family) {
|
||||
case "nova":
|
||||
return data.embeddings?.[0]?.embedding ?? [];
|
||||
case "twelvelabs": {
|
||||
if (Array.isArray(data.data)) {
|
||||
return data.data[0]?.embedding ?? [];
|
||||
}
|
||||
if (Array.isArray(data.data?.embedding)) {
|
||||
return data.data.embedding;
|
||||
}
|
||||
return data.embedding ?? [];
|
||||
}
|
||||
default:
|
||||
return data.embedding ?? [];
|
||||
}
|
||||
}
|
||||
|
||||
function parseCohereBatch(family: Family, raw: string): number[][] {
|
||||
const data = JSON.parse(raw);
|
||||
const embeddings = data.embeddings;
|
||||
if (!embeddings) {
|
||||
return [];
|
||||
}
|
||||
if (family === "cohere-v4" && !Array.isArray(embeddings)) {
|
||||
return embeddings.float ?? [];
|
||||
}
|
||||
return embeddings;
|
||||
}
|
||||
|
||||
// ---------------------------------------------------------------------------
|
||||
// Provider
|
||||
// ---------------------------------------------------------------------------
|
||||
|
||||
export async function createBedrockEmbeddingProvider(
|
||||
options: EmbeddingProviderOptions,
|
||||
): Promise<{ provider: EmbeddingProvider; client: BedrockEmbeddingClient }> {
|
||||
const client = resolveBedrockEmbeddingClient(options);
|
||||
const { BedrockRuntimeClient, InvokeModelCommand } = await loadSdk();
|
||||
const sdk = new BedrockRuntimeClient({ region: client.region });
|
||||
const spec = resolveSpec(client.model);
|
||||
const family = spec?.family ?? inferFamily(client.model);
|
||||
|
||||
debugEmbeddingsLog("memory embeddings: bedrock client", {
|
||||
region: client.region,
|
||||
model: client.model,
|
||||
dimensions: client.dimensions,
|
||||
family,
|
||||
});
|
||||
|
||||
const invoke = async (body: string): Promise<string> => {
|
||||
const res = await sdk.send(
|
||||
new InvokeModelCommand({
|
||||
modelId: client.model,
|
||||
body,
|
||||
contentType: "application/json",
|
||||
accept: "application/json",
|
||||
}),
|
||||
);
|
||||
return new TextDecoder().decode(res.body);
|
||||
};
|
||||
|
||||
const isCohere = family === "cohere-v3" || family === "cohere-v4";
|
||||
|
||||
const embedSingle = async (text: string): Promise<number[]> => {
|
||||
const raw = await invoke(buildBody(family, text, client.dimensions));
|
||||
return sanitizeAndNormalizeEmbedding(parseSingle(family, raw));
|
||||
};
|
||||
|
||||
const embedCohere = async (
|
||||
texts: string[],
|
||||
inputType: "search_query" | "search_document",
|
||||
): Promise<number[][]> => {
|
||||
const raw = await invoke(buildCohereBody(family, texts, inputType, client.dimensions));
|
||||
return parseCohereBatch(family, raw).map((e) => sanitizeAndNormalizeEmbedding(e));
|
||||
};
|
||||
|
||||
const embedQuery = async (text: string): Promise<number[]> => {
|
||||
if (!text.trim()) {
|
||||
return [];
|
||||
}
|
||||
if (isCohere) {
|
||||
return (await embedCohere([text], "search_query"))[0] ?? [];
|
||||
}
|
||||
return embedSingle(text);
|
||||
};
|
||||
|
||||
const embedBatch = async (texts: string[]): Promise<number[][]> => {
|
||||
if (texts.length === 0) {
|
||||
return [];
|
||||
}
|
||||
if (isCohere) {
|
||||
return embedCohere(texts, "search_document");
|
||||
}
|
||||
return Promise.all(texts.map((t) => (t.trim() ? embedSingle(t) : Promise.resolve([]))));
|
||||
};
|
||||
|
||||
return {
|
||||
provider: {
|
||||
id: "bedrock",
|
||||
model: client.model,
|
||||
maxInputTokens: spec?.maxTokens,
|
||||
embedQuery,
|
||||
embedBatch,
|
||||
},
|
||||
client,
|
||||
};
|
||||
}
|
||||
|
||||
// ---------------------------------------------------------------------------
|
||||
// Client resolution
|
||||
// ---------------------------------------------------------------------------
|
||||
|
||||
export function resolveBedrockEmbeddingClient(
|
||||
options: EmbeddingProviderOptions,
|
||||
): BedrockEmbeddingClient {
|
||||
const model = normalizeBedrockEmbeddingModel(options.model);
|
||||
const spec = resolveSpec(model);
|
||||
const providerConfig = options.config.models?.providers?.["amazon-bedrock"];
|
||||
|
||||
const region =
|
||||
regionFromUrl(options.remote?.baseUrl) ??
|
||||
regionFromUrl(providerConfig?.baseUrl) ??
|
||||
process.env.AWS_REGION ??
|
||||
process.env.AWS_DEFAULT_REGION ??
|
||||
"us-east-1";
|
||||
|
||||
let dimensions: number | undefined;
|
||||
if (options.outputDimensionality != null) {
|
||||
if (spec?.validDims && !spec.validDims.includes(options.outputDimensionality)) {
|
||||
throw new Error(
|
||||
`Invalid dimensions ${options.outputDimensionality} for ${model}. Valid values: ${spec.validDims.join(", ")}`,
|
||||
);
|
||||
}
|
||||
dimensions = options.outputDimensionality;
|
||||
} else {
|
||||
dimensions = spec?.dims;
|
||||
}
|
||||
|
||||
return { region, model, dimensions };
|
||||
}
|
||||
|
||||
// ---------------------------------------------------------------------------
|
||||
// Credential detection
|
||||
// ---------------------------------------------------------------------------
|
||||
|
||||
const CREDENTIAL_ENV_VARS = [
|
||||
"AWS_PROFILE",
|
||||
"AWS_BEARER_TOKEN_BEDROCK",
|
||||
"AWS_CONTAINER_CREDENTIALS_RELATIVE_URI",
|
||||
"AWS_CONTAINER_CREDENTIALS_FULL_URI",
|
||||
"AWS_EC2_METADATA_SERVICE_ENDPOINT",
|
||||
"AWS_WEB_IDENTITY_TOKEN_FILE",
|
||||
"AWS_ROLE_ARN",
|
||||
] as const;
|
||||
|
||||
export async function hasAwsCredentials(env: NodeJS.ProcessEnv = process.env): Promise<boolean> {
|
||||
if (env.AWS_ACCESS_KEY_ID?.trim() && env.AWS_SECRET_ACCESS_KEY?.trim()) {
|
||||
return true;
|
||||
}
|
||||
if (CREDENTIAL_ENV_VARS.some((k) => env[k]?.trim())) {
|
||||
return true;
|
||||
}
|
||||
const credentialProviderSdk = await loadCredentialProviderSdk();
|
||||
if (!credentialProviderSdk) {
|
||||
return false;
|
||||
}
|
||||
try {
|
||||
const credentials = await credentialProviderSdk.defaultProvider({
|
||||
timeout: 1000,
|
||||
maxRetries: 0,
|
||||
})();
|
||||
return typeof credentials.accessKeyId === "string" && credentials.accessKeyId.trim().length > 0;
|
||||
} catch {
|
||||
return false;
|
||||
}
|
||||
}
|
||||
@@ -6,6 +6,11 @@ import { formatErrorMessage } from "../../../../src/infra/errors.js";
|
||||
import { resolveUserPath } from "../../../../src/utils.js";
|
||||
import type { EmbeddingInput } from "./embedding-inputs.js";
|
||||
import { sanitizeAndNormalizeEmbedding } from "./embedding-vectors.js";
|
||||
import {
|
||||
createBedrockEmbeddingProvider,
|
||||
hasAwsCredentials,
|
||||
type BedrockEmbeddingClient,
|
||||
} from "./embeddings-bedrock.js";
|
||||
import {
|
||||
createGeminiEmbeddingProvider,
|
||||
type GeminiEmbeddingClient,
|
||||
@@ -25,6 +30,7 @@ export type { MistralEmbeddingClient } from "./embeddings-mistral.js";
|
||||
export type { OpenAiEmbeddingClient } from "./embeddings-openai.js";
|
||||
export type { VoyageEmbeddingClient } from "./embeddings-voyage.js";
|
||||
export type { OllamaEmbeddingClient } from "./embeddings-ollama.js";
|
||||
export type { BedrockEmbeddingClient } from "./embeddings-bedrock.js";
|
||||
|
||||
export type EmbeddingProvider = {
|
||||
id: string;
|
||||
@@ -35,13 +41,21 @@ export type EmbeddingProvider = {
|
||||
embedBatchInputs?: (inputs: EmbeddingInput[]) => Promise<number[][]>;
|
||||
};
|
||||
|
||||
export type EmbeddingProviderId = "openai" | "local" | "gemini" | "voyage" | "mistral" | "ollama";
|
||||
export type EmbeddingProviderId =
|
||||
| "openai"
|
||||
| "local"
|
||||
| "gemini"
|
||||
| "voyage"
|
||||
| "mistral"
|
||||
| "ollama"
|
||||
| "bedrock";
|
||||
export type EmbeddingProviderRequest = EmbeddingProviderId | "auto";
|
||||
export type EmbeddingProviderFallback = EmbeddingProviderId | "none";
|
||||
|
||||
// Remote providers considered for auto-selection when provider === "auto".
|
||||
// Ollama is intentionally excluded here so that "auto" mode does not
|
||||
// implicitly assume a local Ollama instance is available.
|
||||
// Bedrock is included when AWS credentials are detected.
|
||||
const REMOTE_EMBEDDING_PROVIDER_IDS = ["openai", "gemini", "voyage", "mistral"] as const;
|
||||
|
||||
export type EmbeddingProviderResult = {
|
||||
@@ -55,6 +69,7 @@ export type EmbeddingProviderResult = {
|
||||
voyage?: VoyageEmbeddingClient;
|
||||
mistral?: MistralEmbeddingClient;
|
||||
ollama?: OllamaEmbeddingClient;
|
||||
bedrock?: BedrockEmbeddingClient;
|
||||
};
|
||||
|
||||
export type EmbeddingProviderOptions = {
|
||||
@@ -72,7 +87,7 @@ export type EmbeddingProviderOptions = {
|
||||
modelPath?: string;
|
||||
modelCacheDir?: string;
|
||||
};
|
||||
/** Gemini embedding-2: output vector dimensions (768, 1536, or 3072). */
|
||||
/** Provider-specific output vector dimensions for supported embedding families. */
|
||||
outputDimensionality?: number;
|
||||
/** Gemini: override the default task type sent with embedding requests. */
|
||||
taskType?: GeminiTaskType;
|
||||
@@ -192,6 +207,10 @@ export async function createEmbeddingProvider(
|
||||
const { provider, client } = await createMistralEmbeddingProvider(options);
|
||||
return { provider, mistral: client };
|
||||
}
|
||||
if (id === "bedrock") {
|
||||
const { provider, client } = await createBedrockEmbeddingProvider(options);
|
||||
return { provider, bedrock: client };
|
||||
}
|
||||
const { provider, client } = await createOpenAiEmbeddingProvider(options);
|
||||
return { provider, openAi: client };
|
||||
};
|
||||
@@ -229,6 +248,23 @@ export async function createEmbeddingProvider(
|
||||
}
|
||||
}
|
||||
|
||||
// Try bedrock if AWS credentials are available
|
||||
if (await hasAwsCredentials()) {
|
||||
try {
|
||||
const result = await createProvider("bedrock");
|
||||
return { ...result, requestedProvider };
|
||||
} catch (err) {
|
||||
const message = formatPrimaryError(err, "bedrock");
|
||||
if (isMissingApiKeyError(err)) {
|
||||
missingKeyErrors.push(message);
|
||||
} else {
|
||||
const wrapped = new Error(message) as Error & { cause?: unknown };
|
||||
wrapped.cause = err;
|
||||
throw wrapped;
|
||||
}
|
||||
}
|
||||
}
|
||||
|
||||
// All providers failed due to missing API keys - return null provider for FTS-only mode
|
||||
const details = [...missingKeyErrors, localError].filter(Boolean) as string[];
|
||||
const reason = details.length > 0 ? details.join("\n\n") : "No embeddings provider available.";
|
||||
|
||||
142
pnpm-lock.yaml
generated
142
pnpm-lock.yaml
generated
@@ -39,6 +39,12 @@ importers:
|
||||
'@aws-sdk/client-bedrock':
|
||||
specifier: 3.1023.0
|
||||
version: 3.1023.0
|
||||
'@aws-sdk/client-bedrock-runtime':
|
||||
specifier: 3.1023.0
|
||||
version: 3.1023.0
|
||||
'@aws-sdk/credential-provider-node':
|
||||
specifier: 3.972.29
|
||||
version: 3.972.29
|
||||
'@clack/prompts':
|
||||
specifier: ^1.2.0
|
||||
version: 1.2.0
|
||||
@@ -757,56 +763,8 @@ importers:
|
||||
specifier: workspace:*
|
||||
version: link:../..
|
||||
|
||||
packages/clawdbot:
|
||||
dependencies:
|
||||
openclaw:
|
||||
specifier: workspace:*
|
||||
version: link:../..
|
||||
|
||||
packages/memory-host-sdk: {}
|
||||
|
||||
packages/moltbot:
|
||||
dependencies:
|
||||
openclaw:
|
||||
specifier: workspace:*
|
||||
version: link:../..
|
||||
|
||||
packages/plugin-package-contract: {}
|
||||
|
||||
ui:
|
||||
dependencies:
|
||||
'@create-markdown/preview':
|
||||
specifier: ^2.0.0
|
||||
version: 2.0.0(@create-markdown/core@2.0.0)(shiki@3.23.0)
|
||||
'@noble/ed25519':
|
||||
specifier: 3.0.1
|
||||
version: 3.0.1
|
||||
dompurify:
|
||||
specifier: ^3.3.3
|
||||
version: 3.3.3
|
||||
lit:
|
||||
specifier: ^3.3.2
|
||||
version: 3.3.2
|
||||
marked:
|
||||
specifier: ^17.0.5
|
||||
version: 17.0.5
|
||||
devDependencies:
|
||||
'@vitest/browser-playwright':
|
||||
specifier: 4.1.2
|
||||
version: 4.1.2(playwright@1.59.1)(vite@8.0.3(@emnapi/core@1.8.1)(@emnapi/runtime@1.9.1)(@types/node@25.5.2)(esbuild@0.27.4)(jiti@2.6.1)(tsx@4.21.0)(yaml@2.8.3))(vitest@4.1.2)
|
||||
jsdom:
|
||||
specifier: ^29.0.1
|
||||
version: 29.0.1(@noble/hashes@2.0.1)
|
||||
playwright:
|
||||
specifier: ^1.59.1
|
||||
version: 1.59.1
|
||||
vite:
|
||||
specifier: 8.0.3
|
||||
version: 8.0.3(@emnapi/core@1.8.1)(@emnapi/runtime@1.9.1)(@types/node@25.5.2)(esbuild@0.27.4)(jiti@2.6.1)(tsx@4.21.0)(yaml@2.8.3)
|
||||
vitest:
|
||||
specifier: 4.1.2
|
||||
version: 4.1.2(@opentelemetry/api@1.9.1)(@types/node@25.5.2)(@vitest/browser-playwright@4.1.2)(jsdom@29.0.1(@noble/hashes@2.0.1))(vite@8.0.3(@emnapi/core@1.8.1)(@emnapi/runtime@1.9.1)(@types/node@25.5.2)(esbuild@0.27.4)(jiti@2.6.1)(tsx@4.21.0)(yaml@2.8.3))
|
||||
|
||||
packages:
|
||||
|
||||
'@agentclientprotocol/sdk@0.18.0':
|
||||
@@ -860,8 +818,8 @@ packages:
|
||||
'@aws-crypto/util@5.2.0':
|
||||
resolution: {integrity: sha512-4RkU9EsI6ZpBve5fseQlGNUWKMa1RLPQ1dnjnQoe07ldfIzcsGb5hC5W0Dm7u423KWzawlrpbjXBrXCEv9zazQ==}
|
||||
|
||||
'@aws-sdk/client-bedrock-runtime@3.1022.0':
|
||||
resolution: {integrity: sha512-gT8+ebNzmLjk07dPTVn0f4ZdEDSFYsyCX3rAxX2QGGOasKeeQQEBW4PxYqHGM6lJrcuFSc/ScSVKTRDxGZlFiA==}
|
||||
'@aws-sdk/client-bedrock-runtime@3.1023.0':
|
||||
resolution: {integrity: sha512-C0He9qhrClUp6JEk3QjE0WScDN1GSZF8eruP0uoh5kXeQEJLxyfFDrR2TIYnHntlRs/sMwhO82Vu7yGGQM2pfQ==}
|
||||
engines: {node: '>=20.0.0'}
|
||||
|
||||
'@aws-sdk/client-bedrock@3.1023.0':
|
||||
@@ -984,10 +942,6 @@ packages:
|
||||
resolution: {integrity: sha512-TKY6h9spUk3OLs5v1oAgW9mAeBE3LAGNBwJokLy96wwmd4W2v/tYlXseProyed9ValDj2u1jK/4Rg1T+1NXyJA==}
|
||||
engines: {node: '>=20.0.0'}
|
||||
|
||||
'@aws-sdk/token-providers@3.1022.0':
|
||||
resolution: {integrity: sha512-rC0+QQh5uo9Y0wtrvsVuGWi8njtf6h6FB94h5NClUoiNTuQiRG/+AjXiqhv1x/m8TnLrgYCHiFzykOdOb5Ea9w==}
|
||||
engines: {node: '>=20.0.0'}
|
||||
|
||||
'@aws-sdk/token-providers@3.1023.0':
|
||||
resolution: {integrity: sha512-g/t814ec7g+MbazONIdQzb0c8FalVnSKCLc665GLG4QdrviKXHzag7HQmf5wBhCDsUDNAIi77fLeElaZSkylTA==}
|
||||
engines: {node: '>=20.0.0'}
|
||||
@@ -1127,25 +1081,6 @@ packages:
|
||||
resolution: {integrity: sha512-ooWCrlZP11i8GImSjTHYHLkvFDP48nS4+204nGb1RiX/WXYHmJA2III9/e2DWVabCESdW7hBAEzHRqUn9OUVvQ==}
|
||||
engines: {node: '>=0.1.90'}
|
||||
|
||||
'@create-markdown/core@2.0.0':
|
||||
resolution: {integrity: sha512-xOmhoiDSa82EzjXp3aViQdB+xfCP4E2jEKxJiKJ702sup3p/CTCtL8fZBKQ3BvzASQRpq/xKCRXZZwRrg1DmZQ==}
|
||||
engines: {node: '>=20.0.0'}
|
||||
|
||||
'@create-markdown/preview@2.0.0':
|
||||
resolution: {integrity: sha512-3WTGCrCOVBy9wH2X82Oa2ZHJ+eiEqu8AlucckenWVtFSbzRKzkxgci0BRw7IDvsOsTUEMxS9Ltc9/hSUHkdidA==}
|
||||
engines: {node: '>=20.0.0'}
|
||||
peerDependencies:
|
||||
'@create-markdown/core': '>=2.0.0'
|
||||
mermaid: '>=10.0.0'
|
||||
shiki: '>=1.0.0'
|
||||
peerDependenciesMeta:
|
||||
'@create-markdown/core':
|
||||
optional: true
|
||||
mermaid:
|
||||
optional: true
|
||||
shiki:
|
||||
optional: true
|
||||
|
||||
'@csstools/color-helpers@6.0.2':
|
||||
resolution: {integrity: sha512-LMGQLS9EuADloEFkcTBR3BwV/CGHV7zyDxVRtVDTwdI2Ca4it0CCVTT9wCkxSgokjE5Ho41hEPgb8OEUwoXr6Q==}
|
||||
engines: {node: '>=20.19.0'}
|
||||
@@ -2181,9 +2116,6 @@ packages:
|
||||
resolution: {integrity: sha512-vs1Az2OOTBiP4q0pwjW5aF0xp9n4MxVrmkFBxc6EKZc6ddYx5gaZiAsZoq0uRRXWbi3AT/sBqn05eRPtn1JCPw==}
|
||||
engines: {node: '>= 20.19.0'}
|
||||
|
||||
'@noble/ed25519@3.0.1':
|
||||
resolution: {integrity: sha512-t/T8LuK0ym8ALQudCCQCtrRdMSxBnRgHXw+wg+YsSlE6d+on7sX3flqlSJ2mOs9xEuchM36kj9SuX5MG7pXQMA==}
|
||||
|
||||
'@noble/hashes@2.0.1':
|
||||
resolution: {integrity: sha512-XlOlEbQcE9fmuXxrVTXCTlG2nlRXa9Rj3rr5Ue/+tX+nmkgbX720YHh0VR3hBF9xDvwnb8D2shVGOwNx+ulArw==}
|
||||
engines: {node: '>= 20.19.0'}
|
||||
@@ -4158,9 +4090,6 @@ packages:
|
||||
resolution: {integrity: sha512-cgwlv/1iFQiFnU96XXgROh8xTeetsnJiDsTc7TYCLFd9+/WNkIqPTxiM/8pSd8VIrhXGTf1Ny1q1hquVqDJB5w==}
|
||||
engines: {node: '>= 4'}
|
||||
|
||||
dompurify@3.3.3:
|
||||
resolution: {integrity: sha512-Oj6pzI2+RqBfFG+qOaOLbFXLQ90ARpcGG6UePL82bJLtdsa6CYJD7nmiU8MW9nQNOtCHV3lZ/Bzq1X0QYbBZCA==}
|
||||
|
||||
domutils@3.2.2:
|
||||
resolution: {integrity: sha512-6kZKyUajlDuqlHKVX1w7gyslj9MPIXzIFiz/rGu35uC1wMi+kMhQwGhl4lt9unC9Vb9INnY9Z3/ZA3+FhASLaw==}
|
||||
|
||||
@@ -5088,11 +5017,6 @@ packages:
|
||||
engines: {node: '>= 18'}
|
||||
hasBin: true
|
||||
|
||||
marked@17.0.5:
|
||||
resolution: {integrity: sha512-6hLvc0/JEbRjRgzI6wnT2P1XuM1/RrrDEX0kPt0N7jGm1133g6X7DlxFasUIx+72aKAr904GTxhSLDrd5DIlZg==}
|
||||
engines: {node: '>= 20'}
|
||||
hasBin: true
|
||||
|
||||
math-intrinsics@1.1.0:
|
||||
resolution: {integrity: sha512-/IXtbwEk5HTPyEwyKX6hGkYXxM9nbj64B+ilVJnC/R6B0pH5G4V3b0pVbL7DBj4tkhBAppbQUlf6F6Xl9LHu1g==}
|
||||
engines: {node: '>= 0.4'}
|
||||
@@ -6730,7 +6654,7 @@ snapshots:
|
||||
'@smithy/util-utf8': 2.3.0
|
||||
tslib: 2.8.1
|
||||
|
||||
'@aws-sdk/client-bedrock-runtime@3.1022.0':
|
||||
'@aws-sdk/client-bedrock-runtime@3.1023.0':
|
||||
dependencies:
|
||||
'@aws-crypto/sha256-browser': 5.2.0
|
||||
'@aws-crypto/sha256-js': 5.2.0
|
||||
@@ -6744,7 +6668,7 @@ snapshots:
|
||||
'@aws-sdk/middleware-user-agent': 3.972.28
|
||||
'@aws-sdk/middleware-websocket': 3.972.14
|
||||
'@aws-sdk/region-config-resolver': 3.972.10
|
||||
'@aws-sdk/token-providers': 3.1022.0
|
||||
'@aws-sdk/token-providers': 3.1023.0
|
||||
'@aws-sdk/types': 3.973.6
|
||||
'@aws-sdk/util-endpoints': 3.996.5
|
||||
'@aws-sdk/util-user-agent-browser': 3.972.8
|
||||
@@ -7219,18 +7143,6 @@ snapshots:
|
||||
transitivePeerDependencies:
|
||||
- aws-crt
|
||||
|
||||
'@aws-sdk/token-providers@3.1022.0':
|
||||
dependencies:
|
||||
'@aws-sdk/core': 3.973.26
|
||||
'@aws-sdk/nested-clients': 3.996.18
|
||||
'@aws-sdk/types': 3.973.6
|
||||
'@smithy/property-provider': 4.2.12
|
||||
'@smithy/shared-ini-file-loader': 4.4.7
|
||||
'@smithy/types': 4.13.1
|
||||
tslib: 2.8.1
|
||||
transitivePeerDependencies:
|
||||
- aws-crt
|
||||
|
||||
'@aws-sdk/token-providers@3.1023.0':
|
||||
dependencies:
|
||||
'@aws-sdk/core': 3.973.26
|
||||
@@ -7342,7 +7254,8 @@ snapshots:
|
||||
|
||||
'@bcoe/v8-coverage@1.0.2': {}
|
||||
|
||||
'@blazediff/core@1.9.1': {}
|
||||
'@blazediff/core@1.9.1':
|
||||
optional: true
|
||||
|
||||
'@borewit/text-codec@0.2.2': {}
|
||||
|
||||
@@ -7411,14 +7324,6 @@ snapshots:
|
||||
'@colors/colors@1.5.0':
|
||||
optional: true
|
||||
|
||||
'@create-markdown/core@2.0.0':
|
||||
optional: true
|
||||
|
||||
'@create-markdown/preview@2.0.0(@create-markdown/core@2.0.0)(shiki@3.23.0)':
|
||||
optionalDependencies:
|
||||
'@create-markdown/core': 2.0.0
|
||||
shiki: 3.23.0
|
||||
|
||||
'@csstools/color-helpers@6.0.2': {}
|
||||
|
||||
'@csstools/css-calc@3.1.1(@csstools/css-parser-algorithms@4.0.0(@csstools/css-tokenizer@4.0.0))(@csstools/css-tokenizer@4.0.0)':
|
||||
@@ -8261,7 +8166,7 @@ snapshots:
|
||||
'@mariozechner/pi-ai@0.65.0(@modelcontextprotocol/sdk@1.29.0(zod@4.3.6))(ws@8.20.0)(zod@4.3.6)':
|
||||
dependencies:
|
||||
'@anthropic-ai/sdk': 0.81.0(zod@4.3.6)
|
||||
'@aws-sdk/client-bedrock-runtime': 3.1022.0
|
||||
'@aws-sdk/client-bedrock-runtime': 3.1023.0
|
||||
'@google/genai': 1.48.0(@modelcontextprotocol/sdk@1.29.0(zod@4.3.6))
|
||||
'@mistralai/mistralai': 1.14.1
|
||||
'@sinclair/typebox': 0.34.49
|
||||
@@ -8515,8 +8420,6 @@ snapshots:
|
||||
dependencies:
|
||||
'@noble/hashes': 2.0.1
|
||||
|
||||
'@noble/ed25519@3.0.1': {}
|
||||
|
||||
'@noble/hashes@2.0.1': {}
|
||||
|
||||
'@node-llama-cpp/linux-arm64@3.18.1':
|
||||
@@ -8961,7 +8864,8 @@ snapshots:
|
||||
|
||||
'@pinojs/redact@0.4.0': {}
|
||||
|
||||
'@polka/url@1.0.0-next.29': {}
|
||||
'@polka/url@1.0.0-next.29':
|
||||
optional: true
|
||||
|
||||
'@protobufjs/aspromise@1.1.2': {}
|
||||
|
||||
@@ -9875,6 +9779,7 @@ snapshots:
|
||||
- msw
|
||||
- utf-8-validate
|
||||
- vite
|
||||
optional: true
|
||||
|
||||
'@vitest/browser@4.1.2(vite@8.0.3(@emnapi/core@1.8.1)(@emnapi/runtime@1.9.1)(@types/node@25.5.2)(esbuild@0.27.4)(jiti@2.6.1)(tsx@4.21.0)(yaml@2.8.3))(vitest@4.1.2)':
|
||||
dependencies:
|
||||
@@ -9892,6 +9797,7 @@ snapshots:
|
||||
- msw
|
||||
- utf-8-validate
|
||||
- vite
|
||||
optional: true
|
||||
|
||||
'@vitest/coverage-v8@4.1.2(@vitest/browser@4.1.2(vite@8.0.3(@emnapi/core@1.8.1)(@emnapi/runtime@1.9.1)(@types/node@25.5.2)(esbuild@0.27.4)(jiti@2.6.1)(tsx@4.21.0)(yaml@2.8.3))(vitest@4.1.2))(vitest@4.1.2)':
|
||||
dependencies:
|
||||
@@ -10497,10 +10403,6 @@ snapshots:
|
||||
dependencies:
|
||||
domelementtype: 2.3.0
|
||||
|
||||
dompurify@3.3.3:
|
||||
optionalDependencies:
|
||||
'@types/trusted-types': 2.0.7
|
||||
|
||||
domutils@3.2.2:
|
||||
dependencies:
|
||||
dom-serializer: 2.0.0
|
||||
@@ -11578,8 +11480,6 @@ snapshots:
|
||||
|
||||
marked@15.0.12: {}
|
||||
|
||||
marked@17.0.5: {}
|
||||
|
||||
math-intrinsics@1.1.0: {}
|
||||
|
||||
matrix-events-sdk@0.0.1: {}
|
||||
@@ -11691,7 +11591,8 @@ snapshots:
|
||||
mri@1.2.0:
|
||||
optional: true
|
||||
|
||||
mrmime@2.0.1: {}
|
||||
mrmime@2.0.1:
|
||||
optional: true
|
||||
|
||||
ms@2.1.3: {}
|
||||
|
||||
@@ -12133,6 +12034,7 @@ snapshots:
|
||||
playwright-core: 1.59.1
|
||||
optionalDependencies:
|
||||
fsevents: 2.3.2
|
||||
optional: true
|
||||
|
||||
pngjs@6.0.0: {}
|
||||
|
||||
@@ -12653,6 +12555,7 @@ snapshots:
|
||||
'@polka/url': 1.0.0-next.29
|
||||
mrmime: 2.0.1
|
||||
totalist: 3.0.1
|
||||
optional: true
|
||||
|
||||
sisteransi@1.0.5: {}
|
||||
|
||||
@@ -12870,7 +12773,8 @@ snapshots:
|
||||
'@tokenizer/token': 0.3.0
|
||||
ieee754: 1.2.1
|
||||
|
||||
totalist@3.0.1: {}
|
||||
totalist@3.0.1:
|
||||
optional: true
|
||||
|
||||
tough-cookie@4.1.3:
|
||||
dependencies:
|
||||
|
||||
@@ -51,6 +51,20 @@ describe("config schema regressions", () => {
|
||||
expect(res.ok).toBe(true);
|
||||
});
|
||||
|
||||
it('accepts memorySearch provider "bedrock"', () => {
|
||||
const res = validateConfigObject({
|
||||
agents: {
|
||||
defaults: {
|
||||
memorySearch: {
|
||||
provider: "bedrock",
|
||||
},
|
||||
},
|
||||
},
|
||||
});
|
||||
|
||||
expect(res.ok).toBe(true);
|
||||
});
|
||||
|
||||
it("accepts memorySearch.qmd.extraCollections", () => {
|
||||
const res = validateConfigObject({
|
||||
agents: {
|
||||
|
||||
@@ -3366,7 +3366,7 @@ export const GENERATED_BASE_CONFIG_SCHEMA: BaseConfigSchemaResponse = {
|
||||
type: "string",
|
||||
title: "Memory Search Provider",
|
||||
description:
|
||||
'Selects the embedding backend used to build/query memory vectors: "openai", "gemini", "voyage", "mistral", "ollama", or "local". Keep your most reliable provider here and configure fallback for resilience.',
|
||||
'Selects the embedding backend used to build/query memory vectors: "openai", "gemini", "voyage", "mistral", "bedrock", "ollama", or "local". Keep your most reliable provider here and configure fallback for resilience.',
|
||||
},
|
||||
remote: {
|
||||
type: "object",
|
||||
@@ -3521,7 +3521,7 @@ export const GENERATED_BASE_CONFIG_SCHEMA: BaseConfigSchemaResponse = {
|
||||
maximum: 9007199254740991,
|
||||
title: "Memory Search Output Dimensionality",
|
||||
description:
|
||||
"Gemini embedding-2 only: chooses the output vector size for memory embeddings. Use 768, 1536, or 3072 (default), and expect a full reindex when you change it because stored vector dimensions must stay consistent.",
|
||||
"Provider-specific output vector size override for memory embeddings. Gemini embedding-2 supports 768, 1536, or 3072; Bedrock families such as Titan V2, Cohere V4, and Nova expose their own allowed sizes. Expect a full reindex when you change it because stored vector dimensions must stay consistent.",
|
||||
},
|
||||
local: {
|
||||
type: "object",
|
||||
@@ -23958,7 +23958,7 @@ export const GENERATED_BASE_CONFIG_SCHEMA: BaseConfigSchemaResponse = {
|
||||
},
|
||||
"agents.defaults.memorySearch.provider": {
|
||||
label: "Memory Search Provider",
|
||||
help: 'Selects the embedding backend used to build/query memory vectors: "openai", "gemini", "voyage", "mistral", "ollama", or "local". Keep your most reliable provider here and configure fallback for resilience.',
|
||||
help: 'Selects the embedding backend used to build/query memory vectors: "openai", "gemini", "voyage", "mistral", "bedrock", "ollama", or "local". Keep your most reliable provider here and configure fallback for resilience.',
|
||||
tags: ["advanced"],
|
||||
},
|
||||
"agents.defaults.memorySearch.remote.baseUrl": {
|
||||
@@ -24009,7 +24009,7 @@ export const GENERATED_BASE_CONFIG_SCHEMA: BaseConfigSchemaResponse = {
|
||||
},
|
||||
"agents.defaults.memorySearch.outputDimensionality": {
|
||||
label: "Memory Search Output Dimensionality",
|
||||
help: "Gemini embedding-2 only: chooses the output vector size for memory embeddings. Use 768, 1536, or 3072 (default), and expect a full reindex when you change it because stored vector dimensions must stay consistent.",
|
||||
help: "Provider-specific output vector size override for memory embeddings. Gemini embedding-2 supports 768, 1536, or 3072; Bedrock families such as Titan V2, Cohere V4, and Nova expose their own allowed sizes. Expect a full reindex when you change it because stored vector dimensions must stay consistent.",
|
||||
tags: ["advanced"],
|
||||
},
|
||||
"agents.defaults.memorySearch.fallback": {
|
||||
|
||||
@@ -867,11 +867,11 @@ export const FIELD_HELP: Record<string, string> = {
|
||||
"agents.defaults.memorySearch.experimental.sessionMemory":
|
||||
"Indexes session transcripts into memory search so responses can reference prior chat turns. Keep this off unless transcript recall is needed, because indexing cost and storage usage both increase.",
|
||||
"agents.defaults.memorySearch.provider":
|
||||
'Selects the embedding backend used to build/query memory vectors: "openai", "gemini", "voyage", "mistral", "ollama", or "local". Keep your most reliable provider here and configure fallback for resilience.',
|
||||
'Selects the embedding backend used to build/query memory vectors: "openai", "gemini", "voyage", "mistral", "bedrock", "ollama", or "local". Keep your most reliable provider here and configure fallback for resilience.',
|
||||
"agents.defaults.memorySearch.model":
|
||||
"Embedding model override used by the selected memory provider when a non-default model is required. Set this only when you need explicit recall quality/cost tuning beyond provider defaults.",
|
||||
"agents.defaults.memorySearch.outputDimensionality":
|
||||
"Gemini embedding-2 only: chooses the output vector size for memory embeddings. Use 768, 1536, or 3072 (default), and expect a full reindex when you change it because stored vector dimensions must stay consistent.",
|
||||
"Provider-specific output vector size override for memory embeddings. Gemini embedding-2 supports 768, 1536, or 3072; Bedrock families such as Titan V2, Cohere V4, and Nova expose their own allowed sizes. Expect a full reindex when you change it because stored vector dimensions must stay consistent.",
|
||||
"agents.defaults.memorySearch.remote.baseUrl":
|
||||
"Overrides the embedding API endpoint, such as an OpenAI-compatible proxy or custom Gemini base URL. Use this only when routing through your own gateway or vendor endpoint; keep provider defaults otherwise.",
|
||||
"agents.defaults.memorySearch.remote.apiKey":
|
||||
|
||||
377
src/memory-host-sdk/host/embeddings-bedrock.test.ts
Normal file
377
src/memory-host-sdk/host/embeddings-bedrock.test.ts
Normal file
@@ -0,0 +1,377 @@
|
||||
import { afterEach, beforeAll, beforeEach, describe, expect, it, vi } from "vitest";
|
||||
|
||||
const { defaultProviderMock, resolveCredentialsMock, sendMock } = vi.hoisted(() => ({
|
||||
defaultProviderMock: vi.fn(),
|
||||
resolveCredentialsMock: vi.fn(),
|
||||
sendMock: vi.fn(),
|
||||
}));
|
||||
|
||||
vi.mock("@aws-sdk/client-bedrock-runtime", () => {
|
||||
class MockClient {
|
||||
region: string;
|
||||
constructor(config: { region: string }) {
|
||||
this.region = config.region;
|
||||
}
|
||||
send = sendMock;
|
||||
}
|
||||
class MockCommand {
|
||||
input: unknown;
|
||||
constructor(input: unknown) {
|
||||
this.input = input;
|
||||
}
|
||||
}
|
||||
return { BedrockRuntimeClient: MockClient, InvokeModelCommand: MockCommand };
|
||||
});
|
||||
|
||||
vi.mock("@aws-sdk/credential-provider-node", () => ({
|
||||
defaultProvider: defaultProviderMock.mockImplementation(() => resolveCredentialsMock),
|
||||
}));
|
||||
|
||||
let createBedrockEmbeddingProvider: typeof import("./embeddings-bedrock.js").createBedrockEmbeddingProvider;
|
||||
let resolveBedrockEmbeddingClient: typeof import("./embeddings-bedrock.js").resolveBedrockEmbeddingClient;
|
||||
let normalizeBedrockEmbeddingModel: typeof import("./embeddings-bedrock.js").normalizeBedrockEmbeddingModel;
|
||||
let hasAwsCredentials: typeof import("./embeddings-bedrock.js").hasAwsCredentials;
|
||||
|
||||
beforeAll(async () => {
|
||||
({
|
||||
createBedrockEmbeddingProvider,
|
||||
resolveBedrockEmbeddingClient,
|
||||
normalizeBedrockEmbeddingModel,
|
||||
hasAwsCredentials,
|
||||
} = await import("./embeddings-bedrock.js"));
|
||||
});
|
||||
|
||||
beforeEach(() => {
|
||||
defaultProviderMock.mockImplementation(() => resolveCredentialsMock);
|
||||
});
|
||||
|
||||
const enc = (body: unknown) => ({ body: new TextEncoder().encode(JSON.stringify(body)) });
|
||||
const reqBody = (i = 0): Record<string, unknown> =>
|
||||
JSON.parse(sendMock.mock.calls[i][0].input.body);
|
||||
|
||||
describe("bedrock embedding provider", () => {
|
||||
const originalEnv = process.env;
|
||||
afterEach(() => {
|
||||
process.env = originalEnv;
|
||||
vi.restoreAllMocks();
|
||||
defaultProviderMock.mockClear();
|
||||
resolveCredentialsMock.mockReset();
|
||||
sendMock.mockReset();
|
||||
});
|
||||
|
||||
// --- Normalization ---
|
||||
|
||||
it("normalizes model names with prefixes", () => {
|
||||
expect(normalizeBedrockEmbeddingModel("bedrock/amazon.titan-embed-text-v2:0")).toBe(
|
||||
"amazon.titan-embed-text-v2:0",
|
||||
);
|
||||
expect(normalizeBedrockEmbeddingModel("amazon-bedrock/cohere.embed-english-v3")).toBe(
|
||||
"cohere.embed-english-v3",
|
||||
);
|
||||
expect(normalizeBedrockEmbeddingModel("")).toBe("amazon.titan-embed-text-v2:0");
|
||||
});
|
||||
|
||||
// --- Client resolution ---
|
||||
|
||||
it("resolves region from env", () => {
|
||||
process.env = { ...originalEnv, AWS_REGION: "eu-west-1" };
|
||||
const c = resolveBedrockEmbeddingClient({
|
||||
config: {} as never,
|
||||
provider: "bedrock",
|
||||
model: "amazon.titan-embed-text-v2:0",
|
||||
fallback: "none",
|
||||
});
|
||||
expect(c.region).toBe("eu-west-1");
|
||||
expect(c.dimensions).toBe(1024);
|
||||
});
|
||||
|
||||
it("defaults to us-east-1", () => {
|
||||
process.env = { ...originalEnv };
|
||||
delete process.env.AWS_REGION;
|
||||
delete process.env.AWS_DEFAULT_REGION;
|
||||
expect(
|
||||
resolveBedrockEmbeddingClient({
|
||||
config: {} as never,
|
||||
provider: "bedrock",
|
||||
model: "amazon.titan-embed-text-v2:0",
|
||||
fallback: "none",
|
||||
}).region,
|
||||
).toBe("us-east-1");
|
||||
});
|
||||
|
||||
it("extracts region from baseUrl", () => {
|
||||
process.env = { ...originalEnv };
|
||||
delete process.env.AWS_REGION;
|
||||
const c = resolveBedrockEmbeddingClient({
|
||||
config: {
|
||||
models: {
|
||||
providers: {
|
||||
"amazon-bedrock": { baseUrl: "https://bedrock-runtime.ap-southeast-2.amazonaws.com" },
|
||||
},
|
||||
},
|
||||
} as never,
|
||||
provider: "bedrock",
|
||||
model: "amazon.titan-embed-text-v2:0",
|
||||
fallback: "none",
|
||||
});
|
||||
expect(c.region).toBe("ap-southeast-2");
|
||||
});
|
||||
|
||||
it("validates dimensions", () => {
|
||||
expect(() =>
|
||||
resolveBedrockEmbeddingClient({
|
||||
config: {} as never,
|
||||
provider: "bedrock",
|
||||
model: "amazon.titan-embed-text-v2:0",
|
||||
fallback: "none",
|
||||
outputDimensionality: 768,
|
||||
}),
|
||||
).toThrow("Invalid dimensions 768");
|
||||
});
|
||||
|
||||
it("accepts valid dimensions", () => {
|
||||
expect(
|
||||
resolveBedrockEmbeddingClient({
|
||||
config: {} as never,
|
||||
provider: "bedrock",
|
||||
model: "amazon.titan-embed-text-v2:0",
|
||||
fallback: "none",
|
||||
outputDimensionality: 256,
|
||||
}).dimensions,
|
||||
).toBe(256);
|
||||
});
|
||||
|
||||
it("resolves throughput-suffixed variants", () => {
|
||||
expect(
|
||||
resolveBedrockEmbeddingClient({
|
||||
config: {} as never,
|
||||
provider: "bedrock",
|
||||
model: "amazon.titan-embed-text-v1:2:8k",
|
||||
fallback: "none",
|
||||
}).dimensions,
|
||||
).toBe(1536);
|
||||
});
|
||||
|
||||
// --- Credential detection ---
|
||||
|
||||
it("detects access keys", async () => {
|
||||
await expect(
|
||||
hasAwsCredentials({
|
||||
AWS_ACCESS_KEY_ID: "A",
|
||||
AWS_SECRET_ACCESS_KEY: "s",
|
||||
} as NodeJS.ProcessEnv),
|
||||
).resolves.toBe(true);
|
||||
});
|
||||
it("detects profile", async () => {
|
||||
await expect(hasAwsCredentials({ AWS_PROFILE: "default" } as NodeJS.ProcessEnv)).resolves.toBe(
|
||||
true,
|
||||
);
|
||||
});
|
||||
it("detects ECS task role", async () => {
|
||||
await expect(
|
||||
hasAwsCredentials({ AWS_CONTAINER_CREDENTIALS_RELATIVE_URI: "/v2" } as NodeJS.ProcessEnv),
|
||||
).resolves.toBe(true);
|
||||
});
|
||||
it("detects EKS IRSA", async () => {
|
||||
await expect(
|
||||
hasAwsCredentials({
|
||||
AWS_WEB_IDENTITY_TOKEN_FILE: "/var/run/secrets/token",
|
||||
AWS_ROLE_ARN: "arn:aws:iam::123:role/x",
|
||||
} as NodeJS.ProcessEnv),
|
||||
).resolves.toBe(true);
|
||||
});
|
||||
it("detects credentials via the AWS SDK default provider chain", async () => {
|
||||
resolveCredentialsMock.mockResolvedValue({ accessKeyId: "AKIAEXAMPLE" });
|
||||
await expect(hasAwsCredentials({} as NodeJS.ProcessEnv)).resolves.toBe(true);
|
||||
expect(defaultProviderMock).toHaveBeenCalledWith({ timeout: 1000, maxRetries: 0 });
|
||||
});
|
||||
it("returns false with no creds", async () => {
|
||||
resolveCredentialsMock.mockRejectedValue(new Error("no aws credentials"));
|
||||
await expect(hasAwsCredentials({} as NodeJS.ProcessEnv)).resolves.toBe(false);
|
||||
});
|
||||
|
||||
// --- Titan V2 ---
|
||||
|
||||
it("embeds with Titan V2", async () => {
|
||||
sendMock.mockResolvedValue(enc({ embedding: [0.1, 0.2, 0.3] }));
|
||||
const { provider } = await createBedrockEmbeddingProvider({
|
||||
config: {} as never,
|
||||
provider: "bedrock",
|
||||
model: "amazon.titan-embed-text-v2:0",
|
||||
fallback: "none",
|
||||
});
|
||||
expect(await provider.embedQuery("test")).toHaveLength(3);
|
||||
expect(reqBody()).toMatchObject({ inputText: "test", normalize: true, dimensions: 1024 });
|
||||
});
|
||||
|
||||
it("returns empty for blank text", async () => {
|
||||
const { provider } = await createBedrockEmbeddingProvider({
|
||||
config: {} as never,
|
||||
provider: "bedrock",
|
||||
model: "amazon.titan-embed-text-v2:0",
|
||||
fallback: "none",
|
||||
});
|
||||
expect(await provider.embedQuery(" ")).toEqual([]);
|
||||
expect(sendMock).not.toHaveBeenCalled();
|
||||
});
|
||||
|
||||
it("batches Titan V2 concurrently", async () => {
|
||||
sendMock
|
||||
.mockResolvedValueOnce(enc({ embedding: [0.1] }))
|
||||
.mockResolvedValueOnce(enc({ embedding: [0.2] }));
|
||||
const { provider } = await createBedrockEmbeddingProvider({
|
||||
config: {} as never,
|
||||
provider: "bedrock",
|
||||
model: "amazon.titan-embed-text-v2:0",
|
||||
fallback: "none",
|
||||
});
|
||||
expect(await provider.embedBatch(["a", "b"])).toHaveLength(2);
|
||||
expect(sendMock).toHaveBeenCalledTimes(2);
|
||||
});
|
||||
|
||||
// --- Titan V1 ---
|
||||
|
||||
it("sends only inputText for Titan V1", async () => {
|
||||
sendMock.mockResolvedValue(enc({ embedding: [0.5] }));
|
||||
const { provider } = await createBedrockEmbeddingProvider({
|
||||
config: {} as never,
|
||||
provider: "bedrock",
|
||||
model: "amazon.titan-embed-text-v1",
|
||||
fallback: "none",
|
||||
});
|
||||
await provider.embedQuery("hi");
|
||||
expect(reqBody()).toEqual({ inputText: "hi" });
|
||||
});
|
||||
|
||||
it("handles Titan G1 text variant", async () => {
|
||||
sendMock.mockResolvedValue(enc({ embedding: [0.1] }));
|
||||
const { provider } = await createBedrockEmbeddingProvider({
|
||||
config: {} as never,
|
||||
provider: "bedrock",
|
||||
model: "amazon.titan-embed-g1-text-02",
|
||||
fallback: "none",
|
||||
});
|
||||
await provider.embedQuery("hi");
|
||||
expect(reqBody()).toEqual({ inputText: "hi" });
|
||||
});
|
||||
|
||||
// --- Cohere V3 ---
|
||||
|
||||
it("embeds Cohere V3 batch in single call", async () => {
|
||||
sendMock.mockResolvedValue(enc({ embeddings: [[0.1], [0.2]] }));
|
||||
const { provider } = await createBedrockEmbeddingProvider({
|
||||
config: {} as never,
|
||||
provider: "bedrock",
|
||||
model: "cohere.embed-english-v3",
|
||||
fallback: "none",
|
||||
});
|
||||
expect(await provider.embedBatch(["a", "b"])).toHaveLength(2);
|
||||
expect(sendMock).toHaveBeenCalledTimes(1);
|
||||
expect(reqBody()).toMatchObject({ texts: ["a", "b"], input_type: "search_document" });
|
||||
});
|
||||
|
||||
it("uses search_query for Cohere embedQuery", async () => {
|
||||
sendMock.mockResolvedValue(enc({ embeddings: [[0.1]] }));
|
||||
const { provider } = await createBedrockEmbeddingProvider({
|
||||
config: {} as never,
|
||||
provider: "bedrock",
|
||||
model: "cohere.embed-english-v3",
|
||||
fallback: "none",
|
||||
});
|
||||
await provider.embedQuery("q");
|
||||
expect(reqBody().input_type).toBe("search_query");
|
||||
});
|
||||
|
||||
// --- Cohere V4 ---
|
||||
|
||||
it("embeds Cohere V4 with embedding_types + output_dimension", async () => {
|
||||
sendMock.mockResolvedValue(enc({ embeddings: { float: [[0.1], [0.2]] } }));
|
||||
const { provider } = await createBedrockEmbeddingProvider({
|
||||
config: {} as never,
|
||||
provider: "bedrock",
|
||||
model: "cohere.embed-v4:0",
|
||||
fallback: "none",
|
||||
});
|
||||
expect(await provider.embedBatch(["a", "b"])).toHaveLength(2);
|
||||
expect(reqBody()).toMatchObject({ embedding_types: ["float"], output_dimension: 1536 });
|
||||
});
|
||||
|
||||
it("validates Cohere V4 dimensions", () => {
|
||||
expect(() =>
|
||||
resolveBedrockEmbeddingClient({
|
||||
config: {} as never,
|
||||
provider: "bedrock",
|
||||
model: "cohere.embed-v4:0",
|
||||
fallback: "none",
|
||||
outputDimensionality: 2048,
|
||||
}),
|
||||
).toThrow("Invalid dimensions 2048");
|
||||
});
|
||||
|
||||
// --- Nova ---
|
||||
|
||||
it("embeds Nova with SINGLE_EMBEDDING format", async () => {
|
||||
sendMock.mockResolvedValue(
|
||||
enc({ embeddings: [{ embeddingType: "TEXT", embedding: [0.1, 0.2] }] }),
|
||||
);
|
||||
const { provider } = await createBedrockEmbeddingProvider({
|
||||
config: {} as never,
|
||||
provider: "bedrock",
|
||||
model: "amazon.nova-2-multimodal-embeddings-v1:0",
|
||||
fallback: "none",
|
||||
});
|
||||
expect(await provider.embedQuery("hi")).toHaveLength(2);
|
||||
expect(reqBody().taskType).toBe("SINGLE_EMBEDDING");
|
||||
});
|
||||
|
||||
it("validates Nova dimensions", () => {
|
||||
expect(() =>
|
||||
resolveBedrockEmbeddingClient({
|
||||
config: {} as never,
|
||||
provider: "bedrock",
|
||||
model: "amazon.nova-2-multimodal-embeddings-v1:0",
|
||||
fallback: "none",
|
||||
outputDimensionality: 512,
|
||||
}),
|
||||
).toThrow("Invalid dimensions 512");
|
||||
});
|
||||
|
||||
it("batches Nova concurrently", async () => {
|
||||
sendMock
|
||||
.mockResolvedValueOnce(enc({ embeddings: [{ embeddingType: "TEXT", embedding: [0.1] }] }))
|
||||
.mockResolvedValueOnce(enc({ embeddings: [{ embeddingType: "TEXT", embedding: [0.2] }] }));
|
||||
const { provider } = await createBedrockEmbeddingProvider({
|
||||
config: {} as never,
|
||||
provider: "bedrock",
|
||||
model: "amazon.nova-2-multimodal-embeddings-v1:0",
|
||||
fallback: "none",
|
||||
});
|
||||
expect(await provider.embedBatch(["a", "b"])).toHaveLength(2);
|
||||
expect(sendMock).toHaveBeenCalledTimes(2);
|
||||
});
|
||||
|
||||
// --- TwelveLabs ---
|
||||
|
||||
it("embeds TwelveLabs Marengo", async () => {
|
||||
sendMock.mockResolvedValue(enc({ data: [{ embedding: [0.1, 0.2] }] }));
|
||||
const { provider } = await createBedrockEmbeddingProvider({
|
||||
config: {} as never,
|
||||
provider: "bedrock",
|
||||
model: "twelvelabs.marengo-embed-3-0-v1:0",
|
||||
fallback: "none",
|
||||
});
|
||||
expect(await provider.embedQuery("hi")).toHaveLength(2);
|
||||
expect(reqBody()).toEqual({ inputType: "text", text: { inputText: "hi" } });
|
||||
});
|
||||
|
||||
it("embeds TwelveLabs object-style responses", async () => {
|
||||
sendMock.mockResolvedValue(enc({ data: { embedding: [0.3, 0.4] } }));
|
||||
const { provider } = await createBedrockEmbeddingProvider({
|
||||
config: {} as never,
|
||||
provider: "bedrock",
|
||||
model: "twelvelabs.marengo-embed-2-7-v1:0",
|
||||
fallback: "none",
|
||||
});
|
||||
expect(await provider.embedQuery("hi")).toEqual([0.6, 0.8]);
|
||||
});
|
||||
});
|
||||
397
src/memory-host-sdk/host/embeddings-bedrock.ts
Normal file
397
src/memory-host-sdk/host/embeddings-bedrock.ts
Normal file
@@ -0,0 +1,397 @@
|
||||
import { sanitizeAndNormalizeEmbedding } from "./embedding-vectors.js";
|
||||
import { debugEmbeddingsLog } from "./embeddings-debug.js";
|
||||
import type { EmbeddingProvider, EmbeddingProviderOptions } from "./embeddings.js";
|
||||
|
||||
// ---------------------------------------------------------------------------
|
||||
// Types & constants
|
||||
// ---------------------------------------------------------------------------
|
||||
|
||||
export type BedrockEmbeddingClient = {
|
||||
region: string;
|
||||
model: string;
|
||||
dimensions?: number;
|
||||
};
|
||||
|
||||
export const DEFAULT_BEDROCK_EMBEDDING_MODEL = "amazon.titan-embed-text-v2:0";
|
||||
|
||||
/** Request/response format family — each has a different API shape. */
|
||||
type Family = "titan-v1" | "titan-v2" | "cohere-v3" | "cohere-v4" | "nova" | "twelvelabs";
|
||||
|
||||
interface ModelSpec {
|
||||
maxTokens: number;
|
||||
dims: number;
|
||||
validDims?: number[];
|
||||
family: Family;
|
||||
}
|
||||
|
||||
// ---------------------------------------------------------------------------
|
||||
// Model catalog
|
||||
// ---------------------------------------------------------------------------
|
||||
|
||||
const MODELS: Record<string, ModelSpec> = {
|
||||
"amazon.titan-embed-text-v2:0": {
|
||||
maxTokens: 8192,
|
||||
dims: 1024,
|
||||
validDims: [256, 512, 1024],
|
||||
family: "titan-v2",
|
||||
},
|
||||
"amazon.titan-embed-text-v1": { maxTokens: 8000, dims: 1536, family: "titan-v1" },
|
||||
"amazon.titan-embed-g1-text-02": { maxTokens: 8000, dims: 1536, family: "titan-v1" },
|
||||
"amazon.titan-embed-image-v1": { maxTokens: 128, dims: 1024, family: "titan-v1" },
|
||||
"cohere.embed-english-v3": { maxTokens: 512, dims: 1024, family: "cohere-v3" },
|
||||
"cohere.embed-multilingual-v3": { maxTokens: 512, dims: 1024, family: "cohere-v3" },
|
||||
"cohere.embed-v4:0": {
|
||||
maxTokens: 128000,
|
||||
dims: 1536,
|
||||
validDims: [256, 384, 512, 768, 1024, 1536],
|
||||
family: "cohere-v4",
|
||||
},
|
||||
"amazon.nova-2-multimodal-embeddings-v1:0": {
|
||||
maxTokens: 8192,
|
||||
dims: 1024,
|
||||
validDims: [256, 384, 1024, 3072],
|
||||
family: "nova",
|
||||
},
|
||||
"twelvelabs.marengo-embed-2-7-v1:0": { maxTokens: 512, dims: 1024, family: "twelvelabs" },
|
||||
"twelvelabs.marengo-embed-3-0-v1:0": { maxTokens: 512, dims: 512, family: "twelvelabs" },
|
||||
};
|
||||
|
||||
/** Resolve spec, stripping throughput suffixes like `:2:8k` or `:0:512`. */
|
||||
function resolveSpec(modelId: string): ModelSpec | undefined {
|
||||
if (MODELS[modelId]) {
|
||||
return MODELS[modelId];
|
||||
}
|
||||
const parts = modelId.split(":");
|
||||
for (let i = parts.length - 1; i >= 1; i--) {
|
||||
const spec = MODELS[parts.slice(0, i).join(":")];
|
||||
if (spec) {
|
||||
return spec;
|
||||
}
|
||||
}
|
||||
return undefined;
|
||||
}
|
||||
|
||||
/** Infer family from model ID prefix when not in catalog. */
|
||||
function inferFamily(modelId: string): Family {
|
||||
const id = modelId.toLowerCase();
|
||||
if (id.startsWith("amazon.titan-embed-text-v2")) {
|
||||
return "titan-v2";
|
||||
}
|
||||
if (id.startsWith("amazon.titan-embed")) {
|
||||
return "titan-v1";
|
||||
}
|
||||
if (id.startsWith("amazon.nova")) {
|
||||
return "nova";
|
||||
}
|
||||
if (id.startsWith("cohere.embed-v4")) {
|
||||
return "cohere-v4";
|
||||
}
|
||||
if (id.startsWith("cohere.embed")) {
|
||||
return "cohere-v3";
|
||||
}
|
||||
if (id.startsWith("twelvelabs.")) {
|
||||
return "twelvelabs";
|
||||
}
|
||||
return "titan-v1"; // safest default — simplest request format
|
||||
}
|
||||
|
||||
// ---------------------------------------------------------------------------
|
||||
// AWS SDK lazy loader
|
||||
// ---------------------------------------------------------------------------
|
||||
|
||||
type SdkClient = import("@aws-sdk/client-bedrock-runtime").BedrockRuntimeClient;
|
||||
type SdkCommand = import("@aws-sdk/client-bedrock-runtime").InvokeModelCommand;
|
||||
|
||||
interface AwsSdk {
|
||||
BedrockRuntimeClient: new (config: { region: string }) => SdkClient;
|
||||
InvokeModelCommand: new (input: {
|
||||
modelId: string;
|
||||
body: string;
|
||||
contentType: string;
|
||||
accept: string;
|
||||
}) => SdkCommand;
|
||||
}
|
||||
|
||||
interface AwsCredentialProviderSdk {
|
||||
defaultProvider: (init?: { timeout?: number; maxRetries?: number }) => () => Promise<{
|
||||
accessKeyId?: string;
|
||||
}>;
|
||||
}
|
||||
|
||||
let sdkCache: AwsSdk | null = null;
|
||||
let credentialProviderSdkCache: AwsCredentialProviderSdk | null | undefined;
|
||||
|
||||
async function loadSdk(): Promise<AwsSdk> {
|
||||
if (sdkCache) {
|
||||
return sdkCache;
|
||||
}
|
||||
try {
|
||||
sdkCache = (await import("@aws-sdk/client-bedrock-runtime")) as unknown as AwsSdk;
|
||||
return sdkCache;
|
||||
} catch {
|
||||
throw new Error(
|
||||
"No API key found for provider bedrock: @aws-sdk/client-bedrock-runtime is not installed. " +
|
||||
"Install it with: npm install @aws-sdk/client-bedrock-runtime",
|
||||
);
|
||||
}
|
||||
}
|
||||
|
||||
async function loadCredentialProviderSdk(): Promise<AwsCredentialProviderSdk | null> {
|
||||
if (credentialProviderSdkCache !== undefined) {
|
||||
return credentialProviderSdkCache;
|
||||
}
|
||||
try {
|
||||
credentialProviderSdkCache =
|
||||
(await import("@aws-sdk/credential-provider-node")) as unknown as AwsCredentialProviderSdk;
|
||||
} catch {
|
||||
credentialProviderSdkCache = null;
|
||||
}
|
||||
return credentialProviderSdkCache;
|
||||
}
|
||||
|
||||
// ---------------------------------------------------------------------------
|
||||
// Helpers
|
||||
// ---------------------------------------------------------------------------
|
||||
|
||||
const MODEL_PREFIX_RE = /^(?:bedrock|amazon-bedrock|aws)\//;
|
||||
const REGION_RE = /bedrock-runtime\.([a-z0-9-]+)\./;
|
||||
|
||||
export function normalizeBedrockEmbeddingModel(model: string): string {
|
||||
const trimmed = model.trim();
|
||||
return trimmed ? trimmed.replace(MODEL_PREFIX_RE, "") : DEFAULT_BEDROCK_EMBEDDING_MODEL;
|
||||
}
|
||||
|
||||
function regionFromUrl(url: string | undefined): string | undefined {
|
||||
return url?.trim() ? REGION_RE.exec(url)?.[1] : undefined;
|
||||
}
|
||||
|
||||
// ---------------------------------------------------------------------------
|
||||
// Request builders
|
||||
// ---------------------------------------------------------------------------
|
||||
|
||||
function buildBody(family: Family, text: string, dims?: number): string {
|
||||
switch (family) {
|
||||
case "titan-v2": {
|
||||
const b: Record<string, unknown> = { inputText: text };
|
||||
if (dims != null) {
|
||||
b.dimensions = dims;
|
||||
b.normalize = true;
|
||||
}
|
||||
return JSON.stringify(b);
|
||||
}
|
||||
case "titan-v1":
|
||||
return JSON.stringify({ inputText: text });
|
||||
case "nova":
|
||||
return JSON.stringify({
|
||||
taskType: "SINGLE_EMBEDDING",
|
||||
singleEmbeddingParams: {
|
||||
embeddingPurpose: "GENERIC_INDEX",
|
||||
embeddingDimension: dims ?? 1024,
|
||||
text: { truncationMode: "END", value: text },
|
||||
},
|
||||
});
|
||||
case "twelvelabs":
|
||||
return JSON.stringify({ inputType: "text", text: { inputText: text } });
|
||||
default:
|
||||
return JSON.stringify({ inputText: text });
|
||||
}
|
||||
}
|
||||
|
||||
function buildCohereBody(
|
||||
family: Family,
|
||||
texts: string[],
|
||||
inputType: "search_query" | "search_document",
|
||||
dims?: number,
|
||||
): string {
|
||||
const body: Record<string, unknown> = { texts, input_type: inputType, truncate: "END" };
|
||||
if (family === "cohere-v4") {
|
||||
body.embedding_types = ["float"];
|
||||
if (dims != null) {
|
||||
body.output_dimension = dims;
|
||||
}
|
||||
}
|
||||
return JSON.stringify(body);
|
||||
}
|
||||
|
||||
// ---------------------------------------------------------------------------
|
||||
// Response parsers
|
||||
// ---------------------------------------------------------------------------
|
||||
|
||||
function parseSingle(family: Family, raw: string): number[] {
|
||||
const data = JSON.parse(raw);
|
||||
switch (family) {
|
||||
case "nova":
|
||||
return data.embeddings?.[0]?.embedding ?? [];
|
||||
case "twelvelabs": {
|
||||
if (Array.isArray(data.data)) {
|
||||
return data.data[0]?.embedding ?? [];
|
||||
}
|
||||
if (Array.isArray(data.data?.embedding)) {
|
||||
return data.data.embedding;
|
||||
}
|
||||
return data.embedding ?? [];
|
||||
}
|
||||
default:
|
||||
return data.embedding ?? [];
|
||||
}
|
||||
}
|
||||
|
||||
function parseCohereBatch(family: Family, raw: string): number[][] {
|
||||
const data = JSON.parse(raw);
|
||||
const embeddings = data.embeddings;
|
||||
if (!embeddings) {
|
||||
return [];
|
||||
}
|
||||
if (family === "cohere-v4" && !Array.isArray(embeddings)) {
|
||||
return embeddings.float ?? [];
|
||||
}
|
||||
return embeddings;
|
||||
}
|
||||
|
||||
// ---------------------------------------------------------------------------
|
||||
// Provider
|
||||
// ---------------------------------------------------------------------------
|
||||
|
||||
export async function createBedrockEmbeddingProvider(
|
||||
options: EmbeddingProviderOptions,
|
||||
): Promise<{ provider: EmbeddingProvider; client: BedrockEmbeddingClient }> {
|
||||
const client = resolveBedrockEmbeddingClient(options);
|
||||
const { BedrockRuntimeClient, InvokeModelCommand } = await loadSdk();
|
||||
const sdk = new BedrockRuntimeClient({ region: client.region });
|
||||
const spec = resolveSpec(client.model);
|
||||
const family = spec?.family ?? inferFamily(client.model);
|
||||
|
||||
debugEmbeddingsLog("memory embeddings: bedrock client", {
|
||||
region: client.region,
|
||||
model: client.model,
|
||||
dimensions: client.dimensions,
|
||||
family,
|
||||
});
|
||||
|
||||
const invoke = async (body: string): Promise<string> => {
|
||||
const res = await sdk.send(
|
||||
new InvokeModelCommand({
|
||||
modelId: client.model,
|
||||
body,
|
||||
contentType: "application/json",
|
||||
accept: "application/json",
|
||||
}),
|
||||
);
|
||||
return new TextDecoder().decode(res.body);
|
||||
};
|
||||
|
||||
const isCohere = family === "cohere-v3" || family === "cohere-v4";
|
||||
|
||||
const embedSingle = async (text: string): Promise<number[]> => {
|
||||
const raw = await invoke(buildBody(family, text, client.dimensions));
|
||||
return sanitizeAndNormalizeEmbedding(parseSingle(family, raw));
|
||||
};
|
||||
|
||||
const embedCohere = async (
|
||||
texts: string[],
|
||||
inputType: "search_query" | "search_document",
|
||||
): Promise<number[][]> => {
|
||||
const raw = await invoke(buildCohereBody(family, texts, inputType, client.dimensions));
|
||||
return parseCohereBatch(family, raw).map((e) => sanitizeAndNormalizeEmbedding(e));
|
||||
};
|
||||
|
||||
const embedQuery = async (text: string): Promise<number[]> => {
|
||||
if (!text.trim()) {
|
||||
return [];
|
||||
}
|
||||
if (isCohere) {
|
||||
return (await embedCohere([text], "search_query"))[0] ?? [];
|
||||
}
|
||||
return embedSingle(text);
|
||||
};
|
||||
|
||||
const embedBatch = async (texts: string[]): Promise<number[][]> => {
|
||||
if (texts.length === 0) {
|
||||
return [];
|
||||
}
|
||||
if (isCohere) {
|
||||
return embedCohere(texts, "search_document");
|
||||
}
|
||||
return Promise.all(texts.map((t) => (t.trim() ? embedSingle(t) : Promise.resolve([]))));
|
||||
};
|
||||
|
||||
return {
|
||||
provider: {
|
||||
id: "bedrock",
|
||||
model: client.model,
|
||||
maxInputTokens: spec?.maxTokens,
|
||||
embedQuery,
|
||||
embedBatch,
|
||||
},
|
||||
client,
|
||||
};
|
||||
}
|
||||
|
||||
// ---------------------------------------------------------------------------
|
||||
// Client resolution
|
||||
// ---------------------------------------------------------------------------
|
||||
|
||||
export function resolveBedrockEmbeddingClient(
|
||||
options: EmbeddingProviderOptions,
|
||||
): BedrockEmbeddingClient {
|
||||
const model = normalizeBedrockEmbeddingModel(options.model);
|
||||
const spec = resolveSpec(model);
|
||||
const providerConfig = options.config.models?.providers?.["amazon-bedrock"];
|
||||
|
||||
const region =
|
||||
regionFromUrl(options.remote?.baseUrl) ??
|
||||
regionFromUrl(providerConfig?.baseUrl) ??
|
||||
process.env.AWS_REGION ??
|
||||
process.env.AWS_DEFAULT_REGION ??
|
||||
"us-east-1";
|
||||
|
||||
let dimensions: number | undefined;
|
||||
if (options.outputDimensionality != null) {
|
||||
if (spec?.validDims && !spec.validDims.includes(options.outputDimensionality)) {
|
||||
throw new Error(
|
||||
`Invalid dimensions ${options.outputDimensionality} for ${model}. Valid values: ${spec.validDims.join(", ")}`,
|
||||
);
|
||||
}
|
||||
dimensions = options.outputDimensionality;
|
||||
} else {
|
||||
dimensions = spec?.dims;
|
||||
}
|
||||
|
||||
return { region, model, dimensions };
|
||||
}
|
||||
|
||||
// ---------------------------------------------------------------------------
|
||||
// Credential detection
|
||||
// ---------------------------------------------------------------------------
|
||||
|
||||
const CREDENTIAL_ENV_VARS = [
|
||||
"AWS_PROFILE",
|
||||
"AWS_BEARER_TOKEN_BEDROCK",
|
||||
"AWS_CONTAINER_CREDENTIALS_RELATIVE_URI",
|
||||
"AWS_CONTAINER_CREDENTIALS_FULL_URI",
|
||||
"AWS_EC2_METADATA_SERVICE_ENDPOINT",
|
||||
"AWS_WEB_IDENTITY_TOKEN_FILE",
|
||||
"AWS_ROLE_ARN",
|
||||
] as const;
|
||||
|
||||
export async function hasAwsCredentials(env: NodeJS.ProcessEnv = process.env): Promise<boolean> {
|
||||
if (env.AWS_ACCESS_KEY_ID?.trim() && env.AWS_SECRET_ACCESS_KEY?.trim()) {
|
||||
return true;
|
||||
}
|
||||
if (CREDENTIAL_ENV_VARS.some((k) => env[k]?.trim())) {
|
||||
return true;
|
||||
}
|
||||
const credentialProviderSdk = await loadCredentialProviderSdk();
|
||||
if (!credentialProviderSdk) {
|
||||
return false;
|
||||
}
|
||||
try {
|
||||
const credentials = await credentialProviderSdk.defaultProvider({
|
||||
timeout: 1000,
|
||||
maxRetries: 0,
|
||||
})();
|
||||
return typeof credentials.accessKeyId === "string" && credentials.accessKeyId.trim().length > 0;
|
||||
} catch {
|
||||
return false;
|
||||
}
|
||||
}
|
||||
@@ -6,10 +6,18 @@ import { createEmbeddingProvider, DEFAULT_LOCAL_MODEL } from "./embeddings.js";
|
||||
import * as nodeLlamaModule from "./node-llama.js";
|
||||
import { mockPublicPinnedHostname } from "./test-helpers/ssrf.js";
|
||||
|
||||
const { createOllamaEmbeddingProviderMock } = vi.hoisted(() => ({
|
||||
const {
|
||||
bedrockSendMock,
|
||||
createOllamaEmbeddingProviderMock,
|
||||
defaultProviderMock,
|
||||
resolveCredentialsMock,
|
||||
} = vi.hoisted(() => ({
|
||||
bedrockSendMock: vi.fn(),
|
||||
createOllamaEmbeddingProviderMock: vi.fn(async () => {
|
||||
throw new Error("Unexpected ollama provider in embeddings.test.ts");
|
||||
}),
|
||||
defaultProviderMock: vi.fn(),
|
||||
resolveCredentialsMock: vi.fn(),
|
||||
}));
|
||||
|
||||
vi.mock("../../infra/net/fetch-guard.js", () => ({
|
||||
@@ -35,6 +43,23 @@ vi.mock("./embeddings-ollama.js", () => ({
|
||||
createOllamaEmbeddingProvider: createOllamaEmbeddingProviderMock,
|
||||
}));
|
||||
|
||||
vi.mock("@aws-sdk/client-bedrock-runtime", () => {
|
||||
class MockClient {
|
||||
send = bedrockSendMock;
|
||||
}
|
||||
class MockCommand {
|
||||
input: unknown;
|
||||
constructor(input: unknown) {
|
||||
this.input = input;
|
||||
}
|
||||
}
|
||||
return { BedrockRuntimeClient: MockClient, InvokeModelCommand: MockCommand };
|
||||
});
|
||||
|
||||
vi.mock("@aws-sdk/credential-provider-node", () => ({
|
||||
defaultProvider: defaultProviderMock.mockImplementation(() => resolveCredentialsMock),
|
||||
}));
|
||||
|
||||
const createFetchMock = () =>
|
||||
vi.fn(async (_input?: unknown, _init?: unknown) => ({
|
||||
ok: true,
|
||||
@@ -63,6 +88,7 @@ type ResolvedProviderAuth = Awaited<ReturnType<typeof authModule.resolveApiKeyFo
|
||||
beforeEach(() => {
|
||||
vi.spyOn(authModule, "resolveApiKeyForProvider");
|
||||
vi.spyOn(nodeLlamaModule, "importNodeLlamaCpp");
|
||||
defaultProviderMock.mockImplementation(() => resolveCredentialsMock);
|
||||
});
|
||||
|
||||
beforeEach(() => {
|
||||
@@ -108,7 +134,7 @@ function createLocalProvider(options?: { fallback?: "none" | "openai" }) {
|
||||
|
||||
function expectAutoSelectedProvider(
|
||||
result: Awaited<ReturnType<typeof createEmbeddingProvider>>,
|
||||
expectedId: "openai" | "gemini" | "mistral",
|
||||
expectedId: "openai" | "gemini" | "mistral" | "bedrock",
|
||||
) {
|
||||
expect(result.requestedProvider).toBe("auto");
|
||||
const provider = requireProvider(result);
|
||||
@@ -434,6 +460,39 @@ describe("embedding provider auto selection", () => {
|
||||
expect(url, testCase.name).toBe(testCase.expectedUrl);
|
||||
}
|
||||
});
|
||||
|
||||
it("selects Bedrock in auto mode when the AWS credential chain resolves", async () => {
|
||||
bedrockSendMock.mockResolvedValue({
|
||||
body: new TextEncoder().encode(JSON.stringify({ embedding: [1, 2, 3] })),
|
||||
});
|
||||
resolveCredentialsMock.mockResolvedValue({ accessKeyId: "AKIAEXAMPLE" });
|
||||
vi.mocked(authModule.resolveApiKeyForProvider).mockImplementation(async ({ provider }) => {
|
||||
throw new Error(`No API key found for provider "${provider}".`);
|
||||
});
|
||||
|
||||
const result = await createAutoProvider();
|
||||
const provider = expectAutoSelectedProvider(result, "bedrock");
|
||||
await provider.embedQuery("hello");
|
||||
|
||||
expect(bedrockSendMock).toHaveBeenCalledTimes(1);
|
||||
});
|
||||
|
||||
it("rethrows non-auth Bedrock setup errors in auto mode", async () => {
|
||||
resolveCredentialsMock.mockResolvedValue({ accessKeyId: "AKIAEXAMPLE" });
|
||||
vi.mocked(authModule.resolveApiKeyForProvider).mockImplementation(async ({ provider }) => {
|
||||
throw new Error(`No API key found for provider "${provider}".`);
|
||||
});
|
||||
|
||||
await expect(
|
||||
createEmbeddingProvider({
|
||||
config: {} as never,
|
||||
provider: "auto",
|
||||
model: "",
|
||||
fallback: "none",
|
||||
outputDimensionality: 768,
|
||||
}),
|
||||
).rejects.toThrow("Invalid dimensions 768");
|
||||
});
|
||||
});
|
||||
|
||||
describe("embedding provider local fallback", () => {
|
||||
|
||||
@@ -6,6 +6,11 @@ import { formatErrorMessage } from "../../infra/errors.js";
|
||||
import { resolveUserPath } from "../../utils.js";
|
||||
import type { EmbeddingInput } from "./embedding-inputs.js";
|
||||
import { sanitizeAndNormalizeEmbedding } from "./embedding-vectors.js";
|
||||
import {
|
||||
createBedrockEmbeddingProvider,
|
||||
hasAwsCredentials,
|
||||
type BedrockEmbeddingClient,
|
||||
} from "./embeddings-bedrock.js";
|
||||
import {
|
||||
createGeminiEmbeddingProvider,
|
||||
type GeminiEmbeddingClient,
|
||||
@@ -25,6 +30,7 @@ export type { MistralEmbeddingClient } from "./embeddings-mistral.js";
|
||||
export type { OpenAiEmbeddingClient } from "./embeddings-openai.js";
|
||||
export type { VoyageEmbeddingClient } from "./embeddings-voyage.js";
|
||||
export type { OllamaEmbeddingClient } from "./embeddings-ollama.js";
|
||||
export type { BedrockEmbeddingClient } from "./embeddings-bedrock.js";
|
||||
|
||||
export type EmbeddingProvider = {
|
||||
id: string;
|
||||
@@ -35,13 +41,21 @@ export type EmbeddingProvider = {
|
||||
embedBatchInputs?: (inputs: EmbeddingInput[]) => Promise<number[][]>;
|
||||
};
|
||||
|
||||
export type EmbeddingProviderId = "openai" | "local" | "gemini" | "voyage" | "mistral" | "ollama";
|
||||
export type EmbeddingProviderId =
|
||||
| "openai"
|
||||
| "local"
|
||||
| "gemini"
|
||||
| "voyage"
|
||||
| "mistral"
|
||||
| "ollama"
|
||||
| "bedrock";
|
||||
export type EmbeddingProviderRequest = EmbeddingProviderId | "auto";
|
||||
export type EmbeddingProviderFallback = EmbeddingProviderId | "none";
|
||||
|
||||
// Remote providers considered for auto-selection when provider === "auto".
|
||||
// Ollama is intentionally excluded here so that "auto" mode does not
|
||||
// implicitly assume a local Ollama instance is available.
|
||||
// Bedrock is included when AWS credentials are detected.
|
||||
const REMOTE_EMBEDDING_PROVIDER_IDS = ["openai", "gemini", "voyage", "mistral"] as const;
|
||||
|
||||
export type EmbeddingProviderResult = {
|
||||
@@ -55,6 +69,7 @@ export type EmbeddingProviderResult = {
|
||||
voyage?: VoyageEmbeddingClient;
|
||||
mistral?: MistralEmbeddingClient;
|
||||
ollama?: OllamaEmbeddingClient;
|
||||
bedrock?: BedrockEmbeddingClient;
|
||||
};
|
||||
|
||||
export type EmbeddingProviderOptions = {
|
||||
@@ -72,7 +87,7 @@ export type EmbeddingProviderOptions = {
|
||||
modelPath?: string;
|
||||
modelCacheDir?: string;
|
||||
};
|
||||
/** Gemini embedding-2: output vector dimensions (768, 1536, or 3072). */
|
||||
/** Provider-specific output vector dimensions for supported embedding families. */
|
||||
outputDimensionality?: number;
|
||||
/** Gemini: override the default task type sent with embedding requests. */
|
||||
taskType?: GeminiTaskType;
|
||||
@@ -192,6 +207,10 @@ export async function createEmbeddingProvider(
|
||||
const { provider, client } = await createMistralEmbeddingProvider(options);
|
||||
return { provider, mistral: client };
|
||||
}
|
||||
if (id === "bedrock") {
|
||||
const { provider, client } = await createBedrockEmbeddingProvider(options);
|
||||
return { provider, bedrock: client };
|
||||
}
|
||||
const { provider, client } = await createOpenAiEmbeddingProvider(options);
|
||||
return { provider, openAi: client };
|
||||
};
|
||||
@@ -229,6 +248,23 @@ export async function createEmbeddingProvider(
|
||||
}
|
||||
}
|
||||
|
||||
// Try bedrock if AWS credentials are available
|
||||
if (await hasAwsCredentials()) {
|
||||
try {
|
||||
const result = await createProvider("bedrock");
|
||||
return { ...result, requestedProvider };
|
||||
} catch (err) {
|
||||
const message = formatPrimaryError(err, "bedrock");
|
||||
if (isMissingApiKeyError(err)) {
|
||||
missingKeyErrors.push(message);
|
||||
} else {
|
||||
const wrapped = new Error(message) as Error & { cause?: unknown };
|
||||
wrapped.cause = err;
|
||||
throw wrapped;
|
||||
}
|
||||
}
|
||||
}
|
||||
|
||||
// All providers failed due to missing API keys - return null provider for FTS-only mode
|
||||
const details = [...missingKeyErrors, localError].filter(Boolean) as string[];
|
||||
const reason = details.length > 0 ? details.join("\n\n") : "No embeddings provider available.";
|
||||
|
||||
Reference in New Issue
Block a user