Voyage AI embedding provider
The Voyage AI embedding provider implements llm.EmbeddingProviderPort. It generates dense vectors for semantic search, retrieval, and similarity scoring. Voyage is an embedding-only service, with no chat completions.
Overview
Voyage AI is an embedding-only specialist focused on retrieval quality. The provider knows nine models: voyage-3.5, voyage-3.5-lite, voyage-4, voyage-4-lite, voyage-4-large, voyage-3-large, voyage-code-3, voyage-finance-2, and voyage-law-2. Each of the three domain models targets a single corpus type: reach for voyage-code-3 on a code search corpus, voyage-finance-2 on financial text, or voyage-law-2 on legal text. The general models cover everything else.
Voyage slots into piko as a driven port. You write one NewVoyageProvider call and one WithEmbeddingProvider option, and the LLM service drives it. The provider is pure net/http with no build tags or CGO, so it runs the same in interpreted dev mode (dev-i) and in compiled builds.
The provider records OpenTelemetry metrics through piko's machinery with no extra wiring: piko.llm.provider.voyage.embed.count, .duration, and .errors. It caps each response body at 16 MiB to bound memory from a hostile or malfunctioning peer. It drains and closes bodies so the HTTP client reuses connections, and recovers panics inside Embed. On a 4xx status the provider wraps the error so upstream detail does not leak.
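The body cap and connection reuse described above follow a common net/http pattern. The sketch below is not the provider's actual code. The names readCapped, drainAndClose, demoCap, and errBodyTooLarge are hypothetical, and only the 16 MiB figure comes from the text above.

```go
package main

import (
	"errors"
	"io"
	"strings"
)

// maxBodyBytes mirrors the 16 MiB cap described above.
const maxBodyBytes = 16 << 20

// errBodyTooLarge is a hypothetical sentinel; the real provider may report
// an oversized body differently.
var errBodyTooLarge = errors.New("response body exceeds cap")

// readCapped reads at most limit bytes from body and fails if more are available.
func readCapped(body io.Reader, limit int64) ([]byte, error) {
	// Read one byte past the limit so an oversized body can be detected.
	data, err := io.ReadAll(io.LimitReader(body, limit+1))
	if err != nil {
		return nil, err
	}
	if int64(len(data)) > limit {
		return nil, errBodyTooLarge
	}
	return data, nil
}

// drainAndClose discards unread bytes (up to the cap) and closes the body.
// Reading an HTTP/1.x body to the end before closing lets net/http reuse
// the underlying connection.
func drainAndClose(body io.ReadCloser) {
	_, _ = io.Copy(io.Discard, io.LimitReader(body, maxBodyBytes))
	_ = body.Close()
}

// demoCap runs readCapped against an in-memory body of n bytes.
func demoCap(n int, limit int64) (int, error) {
	body := io.NopCloser(strings.NewReader(strings.Repeat("x", n)))
	defer drainAndClose(body)
	data, err := readCapped(body, limit)
	return len(data), err
}
```

In real use the body would be resp.Body from an http.Client call and the limit would be maxBodyBytes. Reading one byte past the limit is what lets the sketch tell an oversized body apart from one that is exactly at the cap.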
Vector dimensions are model-specific. The provider resolves them from a built-in lookup unless you override EmbeddingDimensions.
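As a rough illustration of that resolution order, the sketch below checks the override first and falls back to a lookup table. The function name, the table, and the numbers in it are placeholders for illustration, not the provider's real table. Consult Voyage's documentation for actual vector sizes.

```go
package main

// illustrativeDims is a placeholder lookup table. The entries are
// illustrative values, not a copy of the provider's built-in table.
var illustrativeDims = map[string]int{
	"voyage-3.5":    1024,
	"voyage-code-3": 1024,
}

// resolveDimensions returns the override when it is positive, otherwise the
// table entry for the model. The boolean is false when neither is available.
func resolveDimensions(model string, override int) (int, bool) {
	if override > 0 {
		return override, true
	}
	dims, ok := illustrativeDims[model]
	return dims, ok
}
```

With this table, resolveDimensions("voyage-3.5", 0) yields the table value, while resolveDimensions("voyage-3.5", 512) yields 512 because the override wins.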
Per-request retrieval options
The provider reads two per-request knobs through the embedding request, both optional. Set ProviderOptions["input_type"] to "query" or "document" to tune retrieval for asymmetric search. Set the request Dimensions field to override the output vector dimension for models that support it. Both default to Voyage's server-side behaviour when unset.
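To make the "unset means server default" behaviour concrete, the sketch below maps the two knobs onto a JSON request body. It assumes the wire field names input_type and output_dimension from Voyage's public embeddings API. The Go types embedRequest and voyagePayload are hypothetical stand-ins, not piko's real types, and dropping unrecognised input_type values is a choice made for this sketch only.

```go
package main

import "encoding/json"

// embedRequest is a hypothetical stand-in for piko's embedding request.
type embedRequest struct {
	Model           string
	Input           []string
	Dimensions      int            // 0 leaves the output dimension to Voyage
	ProviderOptions map[string]any // e.g. {"input_type": "query"}
}

// voyagePayload models the JSON body. omitempty drops unset knobs so that
// Voyage's server-side defaults apply.
type voyagePayload struct {
	Model           string   `json:"model"`
	Input           []string `json:"input"`
	InputType       string   `json:"input_type,omitempty"`
	OutputDimension int      `json:"output_dimension,omitempty"`
}

// buildPayload copies the optional knobs into the payload and encodes it.
func buildPayload(req embedRequest) ([]byte, error) {
	p := voyagePayload{
		Model:           req.Model,
		Input:           req.Input,
		OutputDimension: req.Dimensions,
	}
	// Forward only the two values named above; anything else is ignored here.
	if v, ok := req.ProviderOptions["input_type"].(string); ok &&
		(v == "query" || v == "document") {
		p.InputType = v
	}
	return json.Marshal(p)
}
```

A request with neither knob set encodes to just model and input, which is how the sketch expresses "fall back to Voyage's defaults".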
Requirements
- A Voyage AI API key from dash.voyageai.com.
- Network egress to api.voyageai.com.
- A separately registered LLM provider for completions. Voyage does not provide chat models.
Configuration
import (
	"os"

	"piko.sh/piko/wdk/llm/llm_provider_voyage"
)

provider, err := llm_provider_voyage.NewVoyageProvider(llm_provider_voyage.Config{
	APIKey:              os.Getenv("VOYAGE_API_KEY"), // required
	BaseURL:             "",                          // empty for https://api.voyageai.com
	DefaultModel:        "voyage-3.5",                // optional; package default if empty
	EmbeddingDimensions: 0,                           // 0 = use the model's default
})
if err != nil {
	return err
}
Config.Validate() rejects a missing API key at construction time. Config.WithDefaults() fills DefaultModel, BaseURL, and EmbeddingDimensions when they are left at their zero values.
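The split between validation and defaulting can be pictured with a reduced, hypothetical mirror of the config. Everything below is an assumption made for illustration: the lowercase type and method names, the value-returning withDefaults, and the assumed default model and dimension. Only the base URL is taken from the comment in the configuration snippet.

```go
package main

import "errors"

// config is a hypothetical mirror of the provider's Config, reduced to the
// four fields shown in the configuration snippet.
type config struct {
	APIKey              string
	BaseURL             string
	DefaultModel        string
	EmbeddingDimensions int
}

const (
	assumedBaseURL = "https://api.voyageai.com" // from the BaseURL comment
	assumedModel   = "voyage-3.5"               // assumption: real default may differ
	assumedDims    = 1024                       // assumption: placeholder value
)

// withDefaults returns a copy with every zero-valued field filled in.
func (c config) withDefaults() config {
	if c.BaseURL == "" {
		c.BaseURL = assumedBaseURL
	}
	if c.DefaultModel == "" {
		c.DefaultModel = assumedModel
	}
	if c.EmbeddingDimensions == 0 {
		c.EmbeddingDimensions = assumedDims
	}
	return c
}

// validate rejects a config that has no API key.
func (c config) validate() error {
	if c.APIKey == "" {
		return errors.New("voyage: API key is required")
	}
	return nil
}
```

Keeping the two steps separate means a caller can leave every optional field empty and still fail fast on the one field that has no sensible default.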
Bootstrap
Voyage is an embedding-only provider, so it registers via WithEmbeddingProvider, not WithLLMProvider. Pair it with a completion provider that you register separately:
ssr := piko.New(
	piko.WithLLMProvider("anthropic", anthropicProvider),
	piko.WithDefaultLLMProvider("anthropic"),
	piko.WithEmbeddingProvider("voyage", voyageProvider),
	piko.WithDefaultEmbeddingProvider("voyage"),
)
This explicit registration is the piko-specific step Voyage requires. When your default LLM provider already supports embeddings, for example OpenAI or Ollama, piko auto-detects that support and WithEmbeddingProvider is not needed. Register Voyage when you want a dedicated embedding model alongside a completion provider that lacks one, such as Anthropic. WithDefaultEmbeddingProvider takes precedence over the auto-detected provider.
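The precedence rule can be summarised as a small decision function. This is a hypothetical model of the behaviour described above, not piko's implementation. The function name and the map of embedding-capable providers are invented for the sketch.

```go
package main

// pickEmbeddingProvider models the precedence described above: an explicit
// default embedding provider wins; otherwise the default LLM provider is
// used if it supports embeddings. The boolean is false when neither applies.
func pickEmbeddingProvider(explicitDefault, defaultLLM string, supportsEmbeddings map[string]bool) (string, bool) {
	if explicitDefault != "" {
		return explicitDefault, true
	}
	if supportsEmbeddings[defaultLLM] {
		return defaultLLM, true
	}
	return "", false
}
```

Under this model, an Anthropic default with no explicit embedding provider resolves to nothing, which is the case where registering Voyage is needed.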
See also
Embedding-capable providers:
- OpenAI, text-embedding-3-small/-large, also chat.
- Gemini, text-embedding-004, also chat.
- Mistral, mistral-embed, also chat.
- Ollama, local embedding models like all-minilm, nomic-embed-text.
Completion providers to pair Voyage with:
- Anthropic Claude, long-context, strong tool use.
- OpenAI, widest model selection.
- Gemini, cheap and multimodal.
- Mistral, open-weight European provider.
- Grok, xAI provider.
- Ollama, local inference.
Framework docs:
- How to use LLMs, embeddings, and RAG, wiring the LLM service end-to-end, including separate embedding providers.
- LLM API reference, EmbeddingProviderPort and the LLM service.
External:
- Voyage AI documentation, authoritative reference for models and the API.
- MTEB leaderboard, public benchmarks for embedding retrieval quality.