Hanzo Vector
High-performance vector search engine for AI applications
Hanzo Vector is a high-performance vector similarity search engine and database, based on Qdrant. It powers semantic search, recommendation systems, and RAG applications across the Hanzo AI platform.
Features
- Vector Similarity Search: Fast nearest neighbor search with HNSW indexing
- Filtering Support: Combine vector search with metadata filtering
- High Performance: Written in Rust for speed and reliability
- Scalable: Horizontal scaling with sharding and replication
- Multiple Metrics: Cosine, Euclidean, Dot Product similarity
- Payload Storage: Store and filter on JSON metadata
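To make the three metrics concrete, here is a minimal TypeScript sketch of what each one computes for a pair of vectors. It is illustrative only: the function names are hypothetical and this is not the engine's own implementation. Cosine and dot product are similarities (higher means closer), while Euclidean is a distance (lower means closer).

```typescript
// Illustrative helpers only; names are hypothetical, not part of any client API.

// Dot product of two equal-length vectors.
function dot(a: number[], b: number[]): number {
  if (a.length !== b.length) throw new Error("dimension mismatch");
  let sum = 0;
  for (let i = 0; i < a.length; i++) sum += a[i] * b[i];
  return sum;
}

// Cosine similarity: dot product of the vectors divided by the product of their lengths.
function cosineSimilarity(a: number[], b: number[]): number {
  return dot(a, b) / (Math.sqrt(dot(a, a)) * Math.sqrt(dot(b, b)));
}

// Euclidean distance: straight-line distance between the two points.
function euclideanDistance(a: number[], b: number[]): number {
  if (a.length !== b.length) throw new Error("dimension mismatch");
  let sum = 0;
  for (let i = 0; i < a.length; i++) sum += (a[i] - b[i]) ** 2;
  return Math.sqrt(sum);
}
```

Cosine ignores vector length, so it is a common default for text embeddings; dot product rewards both direction and magnitude.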
Architecture
+-----------------------------------------------------+
|                    HANZO VECTOR                     |
+-----------------------------------------------------+
|                                                     |
|  +-----------------------------------------------+  |
|  |             Vector Search Engine              |  |
|  |  +-----------+  +-----------+  +-----------+  |  |
|  |  |   HNSW    |  |  Scalar   |  |  Sparse   |  |  |
|  |  |   Index   |  | Quantize  |  |  Vectors  |  |  |
|  |  +-----------+  +-----------+  +-----------+  |  |
|  +-----------------------------------------------+  |
|                                                     |
|  +-----------------------------------------------+  |
|  |                 Storage Layer                 |  |
|  |  +-----------+  +-----------+  +-----------+  |  |
|  |  | Segments  |  |    WAL    |  | Snapshots |  |  |
|  |  +-----------+  +-----------+  +-----------+  |  |
|  +-----------------------------------------------+  |
|                                                     |
+-----------------------------------------------------+
Endpoints
| Environment | URL |
|---|---|
| Production | https://vector.hanzo.ai |
| Staging | https://stg.vector.hanzo.ai |
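One way to avoid hard-coding these URLs is a small lookup keyed by environment name. The sketch below is an assumption about how you might wire this up: `resolveVectorUrl` and the idea of defaulting to staging when no environment is given are hypothetical choices, not documented Hanzo behavior.

```typescript
// Hypothetical helper: maps an environment name to the endpoints in the table above.
type VectorEnvironment = "production" | "staging";

const VECTOR_ENDPOINTS: Record<VectorEnvironment, string> = {
  production: "https://vector.hanzo.ai",
  staging: "https://stg.vector.hanzo.ai",
};

function resolveVectorUrl(env: string | undefined): string {
  // Assumption: default to staging so an unset variable never hits production.
  const name = env ?? "staging";
  if (name === "production" || name === "staging") {
    return VECTOR_ENDPOINTS[name];
  }
  throw new Error(`Unknown environment: ${name}`);
}
```

The returned string can be passed as the `url` option when constructing the client.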
Quick Start
Development Setup
cd ~/work/hanzo/vector
# Build from source
cargo build --release
# Or run with Docker
docker run -p 6333:6333 hanzoai/vector:latest
Create Collection
import { QdrantClient } from '@qdrant/js-client-rest'
const client = new QdrantClient({
  url: 'https://vector.hanzo.ai',
  apiKey: process.env.HANZO_VECTOR_KEY
})
// Create collection with 1536-dimensional vectors (OpenAI)
await client.createCollection('documents', {
  vectors: {
    size: 1536,
    distance: 'Cosine'
  }
})
Insert Vectors
// Upsert points with vectors and metadata
await client.upsert('documents', {
  wait: true,
  points: [
    {
      id: 1,
      vector: embedding, // [0.1, 0.2, ...]
      payload: {
        title: 'Document Title',
        content: 'Document content...',
        category: 'technical'
      }
    }
  ]
})
Search
// Search for similar vectors
const results = await client.search('documents', {
  vector: queryEmbedding,
  limit: 10,
  filter: {
    must: [{ key: 'category', match: { value: 'technical' } }]
  }
})
Collection Types
Dense Vectors
Standard embedding vectors (OpenAI, Cohere, etc.):
await client.createCollection('embeddings', {
  vectors: {
    size: 1536,
    distance: 'Cosine',
    on_disk: true // Store on disk for large collections
  }
})
Sparse Vectors
For keyword/BM25 hybrid search:
await client.createCollection('hybrid', {
  sparse_vectors: {
    keywords: {}
  },
  vectors: {
    size: 1536,
    distance: 'Cosine'
  }
})
Multi-Vector
Multiple vectors per point:
await client.createCollection('multimodal', {
  vectors: {
    text: { size: 1536, distance: 'Cosine' },
    image: { size: 512, distance: 'Cosine' }
  }
})
Filtering
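Before looking at the individual filter shapes, it can help to see how `must` and `must_not` combine. The sketch below is a simplified local model, not the engine's code: every `must` condition has to hold, and no `must_not` condition may hold. It only covers `match` and `range` on flat payload keys; `matchesFilter` and the type names are hypothetical.

```typescript
// Simplified local model of filter semantics; illustrative only.
type Payload = Record<string, unknown>;

type Condition =
  | { key: string; match: { value: string | number | boolean } }
  | { key: string; range: { gt?: number; gte?: number; lt?: number; lte?: number } };

interface Filter {
  must?: Condition[];
  must_not?: Condition[];
}

function matchesCondition(payload: Payload, cond: Condition): boolean {
  const value = payload[cond.key];
  if ("match" in cond) {
    // Exact-value match on a flat key.
    return value === cond.match.value;
  }
  // Range conditions only apply to numeric payload values.
  if (typeof value !== "number") return false;
  const { gt, gte, lt, lte } = cond.range;
  return (
    (gt === undefined || value > gt) &&
    (gte === undefined || value >= gte) &&
    (lt === undefined || value < lt) &&
    (lte === undefined || value <= lte)
  );
}

function matchesFilter(payload: Payload, filter: Filter): boolean {
  const must = filter.must ?? [];
  const mustNot = filter.must_not ?? [];
  return (
    must.every((c) => matchesCondition(payload, c)) &&
    !mustNot.some((c) => matchesCondition(payload, c))
  );
}
```

A helper like this is mainly useful in unit tests, to check that a filter selects the payloads you expect before sending it to the server.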
Match Filter
filter: {
  must: [
    { key: 'category', match: { value: 'technical' } }
  ],
  must_not: [
    { key: 'archived', match: { value: true } }
  ]
}
Range Filter
filter: {
  must: [
    { key: 'price', range: { gte: 10, lte: 100 } }
  ]
}
Geo Filter
filter: {
  must: [
    {
      key: 'location',
      geo_radius: {
        center: { lat: 37.7749, lon: -122.4194 },
        radius: 10000 // meters
      }
    }
  ]
}
Quantization
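As a back-of-the-envelope illustration of why quantization matters, the sketch below compares raw storage for float32 vectors (4 bytes per component) with int8 vectors (1 byte per component). It is a rough estimate under stated assumptions: it ignores index overhead, payloads, and the fact that the original vectors are typically still kept alongside the quantized copies. The function name is hypothetical.

```typescript
// Rough storage estimate; ignores index overhead, payloads, and retained originals.
const FLOAT32_BYTES = 4;
const INT8_BYTES = 1;

interface QuantizationEstimate {
  float32Bytes: number;
  int8Bytes: number;
  ratio: number;
}

function estimateScalarQuantization(
  vectorCount: number,
  dimensions: number
): QuantizationEstimate {
  if (vectorCount <= 0 || dimensions <= 0) {
    throw new Error("vectorCount and dimensions must be positive");
  }
  const float32Bytes = vectorCount * dimensions * FLOAT32_BYTES;
  const int8Bytes = vectorCount * dimensions * INT8_BYTES;
  return { float32Bytes, int8Bytes, ratio: float32Bytes / int8Bytes };
}
```

For one million 1536-dimensional vectors this works out to roughly 6.1 GB of float32 data versus about 1.5 GB of int8 data.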
Reduce memory usage with scalar quantization:
await client.createCollection('quantized', {
  vectors: {
    size: 1536,
    distance: 'Cosine'
  },
  quantization_config: {
    scalar: {
      type: 'int8',
      quantile: 0.99,
      always_ram: true
    }
  }
})
Clustering
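Sharding splits a collection into pieces, and replication keeps several copies of each piece, so the two settings multiply. The sketch below works through that arithmetic for the values used in this section (3 shards, replication factor 2). The field names `shard_number` and `replication_factor` follow upstream Qdrant's per-collection parameters; whether Hanzo Vector exposes them identically is an assumption, and the helper functions are hypothetical.

```typescript
// Field names follow upstream Qdrant's collection parameters (assumed to carry over).
interface ClusterLayout {
  shard_number: number;
  replication_factor: number;
}

function assertPositiveInteger(name: string, value: number): void {
  if (!Number.isInteger(value) || value < 1) {
    throw new Error(`${name} must be a positive integer`);
  }
}

// Total shard copies the cluster must host: shards times copies of each shard.
function totalShardReplicas(layout: ClusterLayout): number {
  assertPositiveInteger("shard_number", layout.shard_number);
  assertPositiveInteger("replication_factor", layout.replication_factor);
  return layout.shard_number * layout.replication_factor;
}

// Average number of shard copies each node would hold if spread evenly.
function averageReplicasPerNode(layout: ClusterLayout, nodeCount: number): number {
  assertPositiveInteger("nodeCount", nodeCount);
  return totalShardReplicas(layout) / nodeCount;
}
```

With 3 shards and a replication factor of 2, the cluster hosts 6 shard copies, or 2 per node on a three-node cluster.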
Sharding
# Deploy with 3 shards
QDRANT_COLLECTION__SHARDS=3 hanzo-vector
Replication
# qdrant.yaml
cluster:
  enabled: true
  replication_factor: 2
API Reference
Collections
- PUT /collections/{name} - Create collection
- GET /collections/{name} - Get collection info
- DELETE /collections/{name} - Delete collection
- PATCH /collections/{name} - Update collection
Points
- PUT /collections/{name}/points - Upsert points
- POST /collections/{name}/points/search - Search vectors
- POST /collections/{name}/points/scroll - Paginate points
- DELETE /collections/{name}/points - Delete points
Snapshots
- POST /collections/{name}/snapshots - Create snapshot
- GET /collections/{name}/snapshots - List snapshots
- GET /snapshots/{name} - Download snapshot
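The endpoints listed above can also be called without the client library. The sketch below builds the pieces of a search request so they can be handed to `fetch`. The `api-key` header is what upstream Qdrant uses for authentication; that Hanzo Vector accepts the same header is an assumption, and `buildSearchRequest` is a hypothetical helper.

```typescript
// Hypothetical helper that describes a POST /collections/{name}/points/search call.
interface RestRequest {
  method: "GET" | "PUT" | "POST" | "PATCH" | "DELETE";
  url: string;
  headers: Record<string, string>;
  body?: string;
}

function buildSearchRequest(
  baseUrl: string,
  collection: string,
  vector: number[],
  limit: number,
  apiKey?: string
): RestRequest {
  const headers: Record<string, string> = { "Content-Type": "application/json" };
  // Assumption: authentication uses the `api-key` header, as in upstream Qdrant.
  if (apiKey) headers["api-key"] = apiKey;
  // Strip trailing slashes so the path is joined cleanly.
  const root = baseUrl.replace(/\/+$/, "");
  return {
    method: "POST",
    url: `${root}/collections/${encodeURIComponent(collection)}/points/search`,
    headers,
    body: JSON.stringify({ vector, limit }),
  };
}
```

The result can be sent with `fetch(req.url, { method: req.method, headers: req.headers, body: req.body })`.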
Integration with Gateway
Hanzo Vector integrates with Gateway for RAG:
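// Note: `client` here is presumably an OpenAI-compatible Gateway client,
// not the QdrantClient used in the earlier examples.
const response = await client.chat.completions.create({
  model: 'gpt-4o',
  messages: [{ role: 'user', content: query }],
  tools: [{
    type: 'retrieval',
    retrieval: {
      collection: 'documents',
      top_k: 5
    }
  }]
})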
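If you would rather perform retrieval yourself, the same flow can be approximated by hand: run a vector search, then place the top hits into the prompt. The sketch below shows only the prompt-assembly step. It assumes search hits carry a `score` and a payload with the `title` and `content` fields used in the upsert example, and that higher scores are better (true for cosine similarity). The helper names and the system prompt wording are hypothetical.

```typescript
// Hypothetical prompt-assembly helpers for a manual RAG flow.
interface ScoredHit {
  id: number | string;
  score: number;
  payload?: Record<string, unknown> | null;
}

interface ChatMessage {
  role: "system" | "user";
  content: string;
}

// Formats the topK highest-scoring hits as a numbered context block.
function buildContext(hits: ScoredHit[], topK: number): string {
  return hits
    .slice() // copy so the caller's array is not reordered
    .sort((a, b) => b.score - a.score)
    .slice(0, topK)
    .map((hit, i) => {
      const title = String(hit.payload?.title ?? "Untitled");
      const content = String(hit.payload?.content ?? "");
      return `[${i + 1}] ${title}\n${content}`;
    })
    .join("\n\n");
}

// Builds chat messages: retrieved context in the system prompt, then the user query.
function buildRagMessages(query: string, hits: ScoredHit[], topK = 5): ChatMessage[] {
  return [
    {
      role: "system",
      content: `Answer using only the context below.\n\n${buildContext(hits, topK)}`,
    },
    { role: "user", content: query },
  ];
}
```

The resulting array can be passed as `messages` to a chat completion call in place of the retrieval tool.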