Hanzo Vector

High-performance vector search engine for AI applications

Hanzo Vector is a high-performance vector similarity search engine and database, based on Qdrant. It powers semantic search, recommendation systems, and RAG applications across the Hanzo AI platform.

Features

  • Vector Similarity Search: Fast approximate nearest-neighbor search with HNSW indexing
  • Filtering Support: Combine vector search with metadata filtering
  • High Performance: Written in Rust for speed and reliability
  • Scalable: Horizontal scaling with sharding and replication
  • Multiple Metrics: Cosine, Euclidean, Dot Product similarity
  • Payload Storage: Store and filter on JSON metadata
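
The three metrics named in the list above can be written out as plain reference functions. This is a minimal sketch for intuition only; the function names are illustrative and the engine's own implementation is optimized and may normalize vectors differently.

```typescript
// Illustrative reference implementations of the three similarity metrics.
// These are not the engine's optimized code paths.

function dot(a: number[], b: number[]): number {
  if (a.length !== b.length) throw new Error('dimension mismatch')
  return a.reduce((sum, x, i) => sum + x * (b[i] ?? 0), 0)
}

function euclidean(a: number[], b: number[]): number {
  if (a.length !== b.length) throw new Error('dimension mismatch')
  return Math.sqrt(a.reduce((sum, x, i) => sum + (x - (b[i] ?? 0)) ** 2, 0))
}

function cosine(a: number[], b: number[]): number {
  const norm = Math.sqrt(dot(a, a)) * Math.sqrt(dot(b, b))
  if (norm === 0) throw new Error('cosine is undefined for a zero vector')
  return dot(a, b) / norm
}

const u = [1, 0, 1]
const v = [2, 0, 2]

console.log(dot(u, v))       // 4
console.log(euclidean(u, v)) // sqrt(2), about 1.414
console.log(cosine(u, v))    // about 1, the vectors point the same way
```

Cosine ignores vector length, so u and v score as near-identical even though their Euclidean distance is non-zero.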

Architecture

+-----------------------------------------------------------+
|                       HANZO VECTOR                        |
+-----------------------------------------------------------+
|                                                           |
|  +-----------------------------------------------------+  |
|  |                Vector Search Engine                 |  |
|  |  +-------------+  +-------------+  +-------------+  |  |
|  |  |    HNSW     |  |   Scalar    |  |   Sparse    |  |  |
|  |  |    Index    |  |  Quantize   |  |   Vectors   |  |  |
|  |  +-------------+  +-------------+  +-------------+  |  |
|  +-----------------------------------------------------+  |
|                                                           |
|  +-----------------------------------------------------+  |
|  |                    Storage Layer                    |  |
|  |  +-------------+  +-------------+  +-------------+  |  |
|  |  |  Segments   |  |     WAL     |  |  Snapshots  |  |  |
|  |  +-------------+  +-------------+  +-------------+  |  |
|  +-----------------------------------------------------+  |
|                                                           |
+-----------------------------------------------------------+

Endpoints

Environment   URL
Production    https://vector.hanzo.ai
Staging       https://stg.vector.hanzo.ai

Quick Start

Development Setup

cd ~/work/hanzo/vector

# Build from source
cargo build --release

# Or run with Docker
docker run -p 6333:6333 hanzoai/vector:latest

Create Collection

import { QdrantClient } from '@qdrant/js-client-rest'

const client = new QdrantClient({
  url: 'https://vector.hanzo.ai',
  apiKey: process.env.HANZO_VECTOR_KEY
})

// Create collection with 1536-dimensional vectors (OpenAI)
await client.createCollection('documents', {
  vectors: {
    size: 1536,
    distance: 'Cosine'
  }
})

Insert Vectors

// Upsert points with vectors and metadata
await client.upsert('documents', {
  wait: true,
  points: [
    {
      id: 1,
      vector: embedding, // [0.1, 0.2, ...]
      payload: {
        title: 'Document Title',
        content: 'Document content...',
        category: 'technical'
      }
    }
  ]
})
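
A vector whose length differs from the collection's configured size will be rejected, so it can help to check points locally before an upsert. The following is a hedged sketch: VectorPoint and assertDimensions are hypothetical names introduced here, not part of the client library.

```typescript
// Hypothetical pre-flight check; not part of @qdrant/js-client-rest.
// VectorPoint mirrors the point shape used in the upsert example.
interface VectorPoint {
  id: number | string
  vector: number[]
  payload?: Record<string, unknown>
}

function assertDimensions(points: VectorPoint[], expectedSize: number): void {
  for (const p of points) {
    if (p.vector.length !== expectedSize) {
      throw new Error(
        `point ${p.id}: expected ${expectedSize} dimensions, got ${p.vector.length}`
      )
    }
    if (!p.vector.every((x) => Number.isFinite(x))) {
      throw new Error(`point ${p.id}: vector contains NaN or Infinity`)
    }
  }
}

// Passes silently for a correctly sized vector.
assertDimensions([{ id: 1, vector: new Array(1536).fill(0.1) }], 1536)
```

Calling assertDimensions(points, 1536) just before the upsert turns a server-side rejection into an immediate, descriptive local error.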

Search Vectors

// Search for similar vectors
const results = await client.search('documents', {
  vector: queryEmbedding,
  limit: 10,
  filter: {
    must: [{ key: 'category', match: { value: 'technical' } }]
  }
})
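
Each search hit carries an id, a similarity score, and the stored payload. The sketch below shows one way to post-process hits; the Hit type is a trimmed local stand-in for the client's result type, titlesAbove is a hypothetical helper, and the sample data is made up.

```typescript
// Minimal local type for a search hit; the real client returns more fields.
interface Hit {
  id: number | string
  score: number
  payload?: Record<string, unknown> | null
}

// Keep hits at or above a score threshold and pull out their titles.
// With Cosine distance, a higher score means a closer match.
function titlesAbove(hits: Hit[], minScore: number): string[] {
  return hits
    .filter((h) => h.score >= minScore)
    .map((h) => String(h.payload?.['title'] ?? '(untitled)'))
}

// Made-up sample data standing in for the search results.
const sampleHits: Hit[] = [
  { id: 1, score: 0.91, payload: { title: 'Document Title' } },
  { id: 2, score: 0.42, payload: { title: 'Unrelated' } },
  { id: 3, score: 0.88, payload: null }
]

console.log(titlesAbove(sampleHits, 0.8)) // [ 'Document Title', '(untitled)' ]
```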

Collection Types

Dense Vectors

Standard embedding vectors (OpenAI, Cohere, etc.):

await client.createCollection('embeddings', {
  vectors: {
    size: 1536,
    distance: 'Cosine',
    on_disk: true // Store on disk for large collections
  }
})
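
To judge when on-disk storage is worth enabling, a back-of-the-envelope size estimate helps. This sketch assumes float32 components (4 bytes each) and counts raw vector data only, ignoring index and payload overhead; rawVectorBytes is an illustrative helper.

```typescript
// Rough size of raw vector data, assuming 4-byte float32 components.
// Excludes HNSW index and payload overhead.
function rawVectorBytes(count: number, dims: number, bytesPerValue = 4): number {
  return count * dims * bytesPerValue
}

const oneMillionVectors = rawVectorBytes(1_000_000, 1536)

console.log(oneMillionVectors)                        // 6144000000 bytes
console.log((oneMillionVectors / 2 ** 30).toFixed(2)) // "5.72" GiB
```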

Sparse Vectors

For keyword/BM25 hybrid search:

await client.createCollection('hybrid', {
  sparse_vectors: {
    keywords: {}
  },
  vectors: {
    size: 1536,
    distance: 'Cosine'
  }
})
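
A sparse vector stores only its non-zero dimensions as parallel indices and values arrays, which is the shape upstream Qdrant uses. The sketch below scores two such vectors with a dot product; the token ids and weights are invented for illustration and sparseDot is not a client function.

```typescript
// Sparse vectors keep only non-zero dimensions as parallel arrays.
interface SparseVector {
  indices: number[]
  values: number[]
}

// Dot product over shared indices: only overlapping terms contribute.
function sparseDot(a: SparseVector, b: SparseVector): number {
  const lookup = new Map<number, number>()
  a.indices.forEach((idx, i) => lookup.set(idx, a.values[i] ?? 0))
  return b.indices.reduce(
    (sum, idx, i) => sum + (lookup.get(idx) ?? 0) * (b.values[i] ?? 0),
    0
  )
}

// Hypothetical token ids and weights.
const docTerms: SparseVector = { indices: [3, 17, 42], values: [0.5, 1.2, 0.8] }
const queryTerms: SparseVector = { indices: [17, 42, 99], values: [1.0, 0.5, 2.0] }

console.log(sparseDot(docTerms, queryTerms)) // 1.2 * 1.0 + 0.8 * 0.5, about 1.6
```

Index 99 appears only in the query and index 3 only in the document, so neither affects the score.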

Multi-Vector

Multiple named vectors per point, such as separate text and image embeddings:

await client.createCollection('multimodal', {
  vectors: {
    text: { size: 1536, distance: 'Cosine' },
    image: { size: 512, distance: 'Cosine' }
  }
})
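
With named vectors, each point supplies one vector per name and each search targets a single name. The request bodies below follow the upstream Qdrant shapes and are a sketch only: the short arrays are placeholders for real 1536- and 512-dimensional embeddings.

```typescript
// Placeholder embeddings; real ones must match the configured sizes.
const textEmbedding = [0.1, 0.2, 0.3]
const imageEmbedding = [0.4, 0.5]

// Each point carries one vector per name declared on the collection.
const upsertBody = {
  wait: true,
  points: [
    {
      id: 1,
      vector: { text: textEmbedding, image: imageEmbedding },
      payload: { title: 'Document Title' }
    }
  ]
}

// A search targets exactly one named vector.
const searchBody = {
  vector: { name: 'image', vector: imageEmbedding },
  limit: 10
}

// With a configured client these would be passed as:
//   await client.upsert('multimodal', upsertBody)
//   await client.search('multimodal', searchBody)
console.log(Object.keys(upsertBody.points[0]?.vector ?? {})) // [ 'text', 'image' ]
```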

Filtering

Match Filter

filter: {
  must: [
    { key: 'category', match: { value: 'technical' } }
  ],
  must_not: [
    { key: 'archived', match: { value: true } }
  ]
}
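
The semantics of must and must_not can be mirrored locally: a point passes when every must condition holds and no must_not condition does. This sketch handles scalar payload values only, and the passes helper is hypothetical; server-side matching also covers cases such as array fields.

```typescript
type Payload = Record<string, unknown>

interface MatchCondition {
  key: string
  match: { value: string | number | boolean }
}

interface MatchFilter {
  must?: MatchCondition[]
  must_not?: MatchCondition[]
}

// A point passes when every `must` holds and no `must_not` holds.
function passes(payload: Payload, filter: MatchFilter): boolean {
  const holds = (c: MatchCondition) => payload[c.key] === c.match.value
  return (filter.must ?? []).every(holds) && !(filter.must_not ?? []).some(holds)
}

const matchFilter: MatchFilter = {
  must: [{ key: 'category', match: { value: 'technical' } }],
  must_not: [{ key: 'archived', match: { value: true } }]
}

console.log(passes({ category: 'technical' }, matchFilter))                 // true
console.log(passes({ category: 'technical', archived: true }, matchFilter)) // false
console.log(passes({ category: 'marketing' }, matchFilter))                 // false
```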

Range Filter

filter: {
  must: [
    { key: 'price', range: { gte: 10, lte: 100 } }
  ]
}
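
Range conditions accept gt, gte, lt, and lte bounds, and every bound that is present must hold. A small local equivalent, with inRange as an illustrative helper, makes the inclusive and exclusive edges explicit.

```typescript
interface RangeBounds {
  gt?: number
  gte?: number
  lt?: number
  lte?: number
}

// All bounds that are present must hold; missing bounds are ignored.
function inRange(value: number, r: RangeBounds): boolean {
  return (
    (r.gt === undefined || value > r.gt) &&
    (r.gte === undefined || value >= r.gte) &&
    (r.lt === undefined || value < r.lt) &&
    (r.lte === undefined || value <= r.lte)
  )
}

const priceRange: RangeBounds = { gte: 10, lte: 100 }

console.log(inRange(10, priceRange))  // true, gte is inclusive
console.log(inRange(100, priceRange)) // true, lte is inclusive
console.log(inRange(101, priceRange)) // false
```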

Geo Filter

filter: {
  must: [
    {
      key: 'location',
      geo_radius: {
        center: { lat: 37.7749, lon: -122.4194 },
        radius: 10000 // meters
      }
    }
  ]
}
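
A geo_radius condition keeps points whose great-circle distance from the center is within the radius. The haversine sketch below approximates that test locally; the mean Earth radius is an assumption and the engine may use a different constant, and the Oakland coordinates are an added example.

```typescript
interface GeoPoint {
  lat: number
  lon: number
}

// Great-circle distance in meters using the haversine formula.
// 6,371,000 m is a mean Earth radius; the engine's constant may differ.
function haversineMeters(a: GeoPoint, b: GeoPoint): number {
  const R = 6_371_000
  const rad = (deg: number) => (deg * Math.PI) / 180
  const dLat = rad(b.lat - a.lat)
  const dLon = rad(b.lon - a.lon)
  const h =
    Math.sin(dLat / 2) ** 2 +
    Math.cos(rad(a.lat)) * Math.cos(rad(b.lat)) * Math.sin(dLon / 2) ** 2
  return 2 * R * Math.asin(Math.sqrt(h))
}

function withinRadius(point: GeoPoint, center: GeoPoint, radius: number): boolean {
  return haversineMeters(point, center) <= radius
}

const sf: GeoPoint = { lat: 37.7749, lon: -122.4194 }
const oakland: GeoPoint = { lat: 37.8044, lon: -122.2712 } // roughly 13 km from sf

console.log(withinRadius(sf, sf, 10000))      // true
console.log(withinRadius(oakland, sf, 10000)) // false
console.log(withinRadius(oakland, sf, 20000)) // true
```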

Quantization

Reduce memory usage with scalar quantization, which stores each vector component as an 8-bit integer instead of a 32-bit float:

await client.createCollection('quantized', {
  vectors: {
    size: 1536,
    distance: 'Cosine'
  },
  quantization_config: {
    scalar: {
      type: 'int8',
      quantile: 0.99,
      always_ram: true
    }
  }
})
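
To show where the memory saving comes from, here is a simplified sketch of 8-bit scalar quantization: values are clipped to a range and mapped linearly onto 256 levels. It uses unsigned codes for simplicity, and the quantile handling is an assumption about the general idea rather than the engine's actual bound selection or encoding.

```typescript
// Simplified 8-bit scalar quantization: clip to [lo, hi], map linearly to 0..255.
// Illustration only; the engine's bounds and encoding may differ.

function quantileBounds(values: number[], quantile: number): [number, number] {
  const sorted = [...values].sort((a, b) => a - b)
  // Number of extreme values dropped from each tail.
  const cut = Math.floor((sorted.length * (1 - quantile)) / 2)
  return [sorted[cut] ?? 0, sorted[sorted.length - 1 - cut] ?? 0]
}

function quantize(vector: number[], lo: number, hi: number): Uint8Array {
  const scale = hi > lo ? 255 / (hi - lo) : 0
  return Uint8Array.from(vector, (x) =>
    Math.round((Math.min(hi, Math.max(lo, x)) - lo) * scale)
  )
}

function dequantize(codes: Uint8Array, lo: number, hi: number): number[] {
  const step = (hi - lo) / 255
  return Array.from(codes, (c) => lo + c * step)
}

const original = [-1.0, -0.5, 0.0, 0.5, 1.0]
const [lo, hi] = quantileBounds(original, 1.0) // keep the full range for this tiny example
const codes = quantize(original, lo, hi)

console.log(Array.from(codes))                  // [ 0, 64, 128, 191, 255 ]
console.log(original.length * 4, codes.length)  // 20 bytes as float32, 5 bytes quantized
```

Each component shrinks from four bytes to one, at the cost of a rounding error of at most half a quantization step per value.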

Clustering

Sharding

# Deploy with 3 shards
QDRANT_COLLECTION__SHARDS=3 hanzo-vector

Replication

# qdrant.yaml
cluster:
  enabled: true
  replication_factor: 2
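
Sharding and replication can also be expressed per collection. In this sketch, shard_number and replication_factor are the upstream Qdrant collection parameters; whether Hanzo Vector accepts them unchanged is an assumption.

```typescript
// Per-collection sharding and replication settings.
// Parameter names follow upstream Qdrant; treat them as an assumption here.
const clusteredCollection = {
  vectors: { size: 1536, distance: 'Cosine' as const },
  shard_number: 3,
  replication_factor: 2
}

// Each shard is stored replication_factor times across the cluster.
const totalShardReplicas =
  clusteredCollection.shard_number * clusteredCollection.replication_factor

console.log(totalShardReplicas) // 6

// With a configured client:
//   await client.createCollection('documents', clusteredCollection)
```

With 3 shards and a replication factor of 2, the cluster holds 6 shard replicas in total, so at least 2 nodes are needed for the copies of any one shard to live on different machines.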

API Reference

Collections

  • PUT /collections/{name} - Create collection
  • GET /collections/{name} - Get collection info
  • DELETE /collections/{name} - Delete collection
  • PATCH /collections/{name} - Update collection

Points

  • PUT /collections/{name}/points - Upsert points
  • POST /collections/{name}/points/search - Search vectors
  • POST /collections/{name}/points/scroll - Paginate points
  • DELETE /collections/{name}/points - Delete points

Snapshots

  • POST /collections/{name}/snapshots - Create snapshot
  • GET /collections/{name}/snapshots - List snapshots
  • GET /snapshots/{name} - Download snapshot
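
The routes above can be called directly over HTTP without the client library. The helper below is a hypothetical sketch that only builds the arguments for fetch; the api-key header name follows upstream Qdrant and is an assumption for Hanzo Vector, and the key shown is a placeholder.

```typescript
// Hypothetical helper that maps a route to fetch() arguments.
interface RestRequest {
  url: string
  init: { method: string; headers: Record<string, string>; body?: string }
}

function buildRequest(
  baseUrl: string,
  apiKey: string,
  method: 'GET' | 'PUT' | 'POST' | 'DELETE' | 'PATCH',
  path: string,
  body?: unknown
): RestRequest {
  // Header name assumed from upstream Qdrant.
  const headers: Record<string, string> = { 'api-key': apiKey }
  const init: RestRequest['init'] = { method, headers }
  if (body !== undefined) {
    headers['Content-Type'] = 'application/json'
    init.body = JSON.stringify(body)
  }
  return { url: baseUrl.replace(/\/+$/, '') + path, init }
}

const searchRequest = buildRequest(
  'https://vector.hanzo.ai',
  'example-key', // placeholder, not a real credential
  'POST',
  '/collections/documents/points/search',
  { vector: [0.1, 0.2, 0.3], limit: 10 }
)

console.log(searchRequest.init.method, searchRequest.url)
// POST https://vector.hanzo.ai/collections/documents/points/search

// To send it: const res = await fetch(searchRequest.url, searchRequest.init)
```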

Integration with Gateway

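Hanzo Vector integrates with Gateway for RAG. Here client is a Gateway chat-completions client, not the QdrantClient used in the earlier sections: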

const response = await client.chat.completions.create({
  model: 'gpt-4o',
  messages: [{ role: 'user', content: query }],
  tools: [{
    type: 'retrieval',
    retrieval: {
      collection: 'documents',
      top_k: 5
    }
  }]
})
