
Hanzo ML

Kubeflow workflows and a Rust ML toolkit

Hanzo ML combines Kubeflow-based orchestration with a Rust ML toolkit for training, evaluation, and model packaging. It provides an end-to-end MLOps pipeline, from data preparation to production deployment.

Features

  • Pipelines: Kubeflow workflows for training, evaluation, and deployment
  • Rust Toolkit: High-performance data prep, feature engineering, and inference utilities
  • Model Registry: Versioning, promotion, A/B testing, and rollback
  • Experiment Tracking: Metrics, hyperparameters, and artifact logging
  • Deployment: Direct integration with Hanzo Cloud for model serving

Pipeline Architecture

┌──────────┐   ┌──────────┐   ┌──────────┐   ┌──────────┐
│   Data   │──▶│ Feature  │──▶│ Training │──▶│  Eval    │
│  Ingest  │   │  Engine  │   │          │   │          │
└──────────┘   └──────────┘   └──────────┘   └──────────┘


┌──────────┐   ┌──────────┐   ┌──────────┐   ┌──────────┐
│  Deploy  │◀──│ Registry │◀──│ Package  │◀──│ Validate │
│ (Serve)  │   │          │   │          │   │          │
└──────────┘   └──────────┘   └──────────┘   └──────────┘
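The two rows of the diagram form one linear flow, from data ingest through to serving. As a rough illustration (the names and runner below are hypothetical, not the Hanzo ML API), the eight stages chain like this:

```python
# Hypothetical sketch of the pipeline's stage order; not the Hanzo ML API.
STAGES = [
    "data_ingest", "feature_engine", "training", "eval",
    "validate", "package", "registry", "deploy",
]

def run_pipeline(stage_fns, artifact):
    """Run each stage in order, feeding each stage's output to the next."""
    for name in STAGES:
        artifact = stage_fns[name](artifact)
    return artifact

# Identity-like stages just to show the wiring: each appends its name.
fns = {name: (lambda a, n=name: a + [n]) for name in STAGES}
print(run_pipeline(fns, []))  # stages run in diagram order
```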

Kubeflow Pipelines

Define training workflows as composable pipeline steps:

from kfp import dsl

# prep_op, train_op, eval_op, and register_op are pipeline components
# assumed to be defined elsewhere (e.g. with @dsl.component).

@dsl.pipeline(name="hanzo-finetune")
def finetune_pipeline(
    base_model: str = "meta-llama/Llama-3.1-8B",
    dataset: str = "s3://hanzo-data/training/v2",
    epochs: int = 3,
    learning_rate: float = 2e-5,
):
    # Data preparation
    prep = prep_op(dataset=dataset)

    # Fine-tuning
    train = train_op(
        base_model=base_model,
        data=prep.outputs["processed"],
        epochs=epochs,
        lr=learning_rate,
    )

    # Evaluation (avoid "eval", which shadows the Python builtin)
    evaluation = eval_op(
        model=train.outputs["model"],
        test_data=prep.outputs["test_set"],
    )

    # Register only if the quality threshold is met
    register = register_op(
        model=train.outputs["model"],
        metrics=evaluation.outputs["metrics"],
        min_accuracy=0.85,
    )

Rust ML Toolkit

The Rust toolkit provides high-performance utilities for data-intensive ML tasks:

use hanzo_ml::{Dataset, EncoderType, FeatureEngine, ScalerType, Tokenizer};

// High-performance tokenization (10x faster than Python)
let tokenizer = Tokenizer::from_pretrained("meta-llama/Llama-3.1-8B")?;
let tokens = tokenizer.encode_batch(&texts, true)?;

// Feature engineering
let engine = FeatureEngine::new()
    .add_numeric_scaler("price", ScalerType::Standard)
    .add_text_embedder("description", "bge-small-en-v1.5")
    .add_categorical_encoder("category", EncoderType::OneHot);

let features = engine.transform(&dataset)?;

Model Registry

Track, version, and promote models through environments:

Stage         Description                          Auto-promote
dev           Experimental models, local testing
staging       Passed evaluation thresholds         On eval pass
canary        A/B testing with 5% traffic          Manual
production    Full production traffic              On canary success

# CLI model management
hanzo ml models list
hanzo ml models promote my-model@v3 --to staging
hanzo ml models rollback my-model --to v2
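The promotion ladder in the table can be pictured as a small state machine, where each stage's "Auto-promote" entry is the event that moves a model into it. The sketch below is illustrative only, not the registry's actual implementation:

```python
# Hypothetical sketch of the stage ladder and its auto-promote triggers,
# mirroring the table above; not the real Hanzo ML registry.
STAGES = ["dev", "staging", "canary", "production"]

# Event that promotes a model INTO each stage; None means manual only.
AUTO_PROMOTE = {
    "staging": "eval_pass",
    "canary": None,               # canary promotion is always manual
    "production": "canary_success",
}

def next_stage(current: str):
    """Return the stage after `current`, or None at the top of the ladder."""
    i = STAGES.index(current)
    return STAGES[i + 1] if i + 1 < len(STAGES) else None

def can_auto_promote(current: str, event: str) -> bool:
    """True if `event` triggers automatic promotion out of `current`."""
    target = next_stage(current)
    return target is not None and AUTO_PROMOTE.get(target) == event

print(can_auto_promote("dev", "eval_pass"))       # dev -> staging is automatic
print(can_auto_promote("staging", "eval_pass"))   # staging -> canary is manual
```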

Experiment Tracking

Log metrics, hyperparameters, and artifacts during training:

import hanzo_ml

# Start an experiment run
with hanzo_ml.start_run(experiment="llm-finetune") as run:
    run.log_params({
        "base_model": "Llama-3.1-8B",
        "learning_rate": 2e-5,
        "epochs": 3,
    })

    for epoch in range(3):
        loss = train_epoch(model, data)
        run.log_metrics({"loss": loss, "epoch": epoch})

    run.log_artifact("model", model_path)
    run.log_artifact("tokenizer", tokenizer_path)
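Conceptually, a run is just an accumulating record of params, metrics, and artifacts. A stdlib-only sketch of what the context manager above collects (not the real hanzo_ml client, which would report to a tracking server rather than a local dict):

```python
import time
from contextlib import contextmanager

# Illustrative stand-in for hanzo_ml.start_run: accumulate everything
# logged during the run into a plain dict.
@contextmanager
def start_run(experiment: str):
    run = {"experiment": experiment, "params": {}, "metrics": [], "artifacts": {}}

    class Run:
        record = run  # expose the accumulated record for inspection

        def log_params(self, params):
            run["params"].update(params)

        def log_metrics(self, metrics):
            run["metrics"].append(metrics)

        def log_artifact(self, name, path):
            run["artifacts"][name] = path

    yield Run()
    run["finished_at"] = time.time()  # stamp the run on exit

with start_run("llm-finetune") as r:
    r.log_params({"epochs": 3})
    r.log_metrics({"loss": 0.42, "epoch": 0})
    r.log_artifact("model", "/tmp/model")

print(r.record["params"])  # params logged during the run
```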

Integration with Hanzo Cloud

Deploy trained models directly to Hanzo Cloud for serving:

# Deploy a model from the registry
hanzo ml deploy my-model@v3 \
  --runtime vllm \
  --gpu a100 \
  --replicas 2 \
  --env production
