Hanzo Embeddings
Generate, store, and search vector embeddings at scale — an OpenAI-compatible embeddings API on the Hanzo gateway that pairs with Vector.
Hanzo Embeddings
Hanzo Embeddings turns text into vectors through a single OpenAI-compatible endpoint on the AI gateway. Generate embeddings with any supported model, then store and query them in Vector for semantic search and RAG.
Generate Embeddings
The endpoint is drop-in compatible with the OpenAI embeddings API, so existing SDKs work by pointing at api.hanzo.ai.
curl https://api.hanzo.ai/v1/embeddings \
-H "Authorization: Bearer hk-..." \
-H "Content-Type: application/json" \
-d '{
"model": "text-embedding-3-small",
"input": "Hanzo is an AI cloud platform."
}'Response (OpenAI shape):
{
"object": "list",
"data": [{ "object": "embedding", "index": 0, "embedding": [0.021, -0.014, "..."] }],
"model": "text-embedding-3-small",
"usage": { "prompt_tokens": 8, "total_tokens": 8 }
}Pass an array of strings as input to embed a batch in one request.
Store and Search
Provision a Vector collection whose dimensions match your model, then upsert the returned vectors and query for nearest neighbors.
from openai import OpenAI
client = OpenAI(api_key="hk-...", base_url="https://api.hanzo.ai/v1")
vecs = client.embeddings.create(
model="text-embedding-3-small",
input=["first document", "second document"],
)
# Upsert vecs.data[i].embedding into your Vector collection, then search.Models
| Model | Dimensions | Notes |
|---|---|---|
text-embedding-3-small | 1536 | Fast, low-cost default |
text-embedding-3-large | 3072 | Highest quality |
bge-base-en-v1.5 | 768 | Open-weight, self-hostable |
Model availability and pricing follow the LLM gateway. Because embeddings share the gateway's per-key budgets and rate limits, the same hk- key governs spend across chat and embeddings.
Related
- Vector — store and search the embeddings you generate
- Search — hybrid full-text + semantic ranking
- LLM Gateway — one API for 200+ models, including embeddings
- API Keys — budgets and rate limits per key
- API Reference — every endpoint at
api.hanzo.ai
How is this guide?
Last updated on