Hanzo Skills Reference

Hanzo Python SDK

Stainless-generated Python client library for the Hanzo AI API with full type safety, async support, streaming, auto-pagination, and auto-retry across 187 endpoints.

Overview

The Hanzo Python SDK is generated from the Hanzo AI OpenAPI spec via Stainless. It covers all 187 API endpoints across 51 resource namespaces, provides both synchronous and asynchronous clients, and presents an API surface identical to the JS and Go SDKs.

Why Hanzo Python SDK?

  • Full type safety: Generated from OpenAPI spec via Stainless
  • OpenAI compatible: Same interface patterns, easy migration
  • Async native: Both sync and async clients
  • Streaming: First-class SSE streaming with iterators
  • Auto-retry: Configurable retry with exponential backoff
  • Pagination: Auto-pagination helpers for list endpoints

OSS Base

Generated by Stainless. Repo: hanzoai/python-sdk.

When to use

  • Python applications calling Hanzo API
  • FastAPI, Django, or Flask backends
  • ML/AI pipelines needing LLM access
  • Replacing OpenAI Python SDK with Hanzo
  • Jupyter notebooks and data science workflows

Hard requirements

  1. Python >=3.12
  2. HANZO_API_KEY for authentication

Quick reference

Item          Value
Package       hanzoai (PyPI)
Version       2.2.0
Repo          github.com/hanzoai/python-sdk
Generated by  Stainless
License       BSD-3-Clause
Python        >=3.12
Base URL      https://api.hanzo.ai

Installation

# pip
pip install hanzoai

# uv (preferred)
uv add hanzoai

# poetry
poetry add hanzoai

One-file quickstart

Chat Completion

from hanzoai import Hanzo

client = Hanzo(
    api_key="your-api-key",  # defaults to HANZO_API_KEY env var
)

response = client.chat.completions.create(
    model="zen-70b",
    messages=[{"role": "user", "content": "Hello, Hanzo!"}],
)
print(response.choices[0].message.content)

Streaming

stream = client.chat.completions.create(
    model="zen-70b",
    messages=[{"role": "user", "content": "Write a poem about code"}],
    stream=True,
)

for chunk in stream:
    content = chunk.choices[0].delta.content
    if content:
        print(content, end="")

Async Client

import asyncio
from hanzoai import AsyncHanzo

client = AsyncHanzo()

async def main():
    response = await client.chat.completions.create(
        model="zen-70b",
        messages=[{"role": "user", "content": "Hello!"}],
    )
    print(response.choices[0].message.content)

asyncio.run(main())
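Because the async client returns awaitables, several completions can run concurrently on one event loop with asyncio.gather. A minimal sketch (the helper names and prompts are illustrative, not part of the SDK):

```python
import asyncio

async def ask(client, prompt: str) -> str:
    # One chat completion; awaiting yields control while the request is in flight
    resp = await client.chat.completions.create(
        model="zen-70b",
        messages=[{"role": "user", "content": prompt}],
    )
    return resp.choices[0].message.content

async def ask_many(client, prompts: list[str]) -> list[str]:
    # gather() runs all requests concurrently and preserves input order
    return await asyncio.gather(*(ask(client, p) for p in prompts))

# Usage (requires HANZO_API_KEY):
# from hanzoai import AsyncHanzo
# answers = asyncio.run(ask_many(AsyncHanzo(), ["Hello!", "What is Hanzo?"]))
```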

Async Streaming

async def stream_response():
    stream = await client.chat.completions.create(
        model="zen-70b",
        messages=[{"role": "user", "content": "Write a story"}],
        stream=True,
    )

    async for chunk in stream:
        content = chunk.choices[0].delta.content
        if content:
            print(content, end="")

Embeddings

embedding = client.embeddings.create(
    model="zen-embedding",
    input="Hello world",
)
print(len(embedding.data[0].embedding))  # dimension count
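Embeddings are typically compared with cosine similarity for search and retrieval. A minimal sketch (the helper is local code, not an SDK method; the batched endpoint usage is shown in comments):

```python
import math

def cosine_similarity(a: list[float], b: list[float]) -> float:
    # Standard cosine similarity: dot product over the product of norms
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(x * x for x in b))
    return dot / (norm_a * norm_b)

# Usage with the embeddings endpoint (requires HANZO_API_KEY):
# resp = client.embeddings.create(model="zen-embedding", input=["query", "document"])
# score = cosine_similarity(resp.data[0].embedding, resp.data[1].embedding)
```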

Function Calling / Tools

response = client.chat.completions.create(
    model="zen-70b",
    messages=[{"role": "user", "content": "What's the weather in Tokyo?"}],
    tools=[{
        "type": "function",
        "function": {
            "name": "get_weather",
            "description": "Get current weather for a location",
            "parameters": {
                "type": "object",
                "properties": {
                    "location": {"type": "string", "description": "City name"},
                },
                "required": ["location"],
            },
        },
    }],
)

tool_calls = response.choices[0].message.tool_calls
if tool_calls:  # check before indexing — tool_calls is None when no tool was chosen
    tool_call = tool_calls[0]
    print(tool_call.function.name)       # "get_weather"
    print(tool_call.function.arguments)  # '{"location":"Tokyo"}'
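To complete the round trip, each tool call is executed locally and its result is sent back as a role="tool" message before a follow-up completion. A sketch, assuming the message shape shown above (the get_weather implementation is hypothetical):

```python
import json

def get_weather(location: str) -> str:
    # Hypothetical local implementation of the advertised tool
    return f"18°C and clear in {location}"

def tool_replies(message) -> list[dict]:
    # Turn each tool call on a model message into a role="tool" reply
    # suitable for appending to `messages` before the follow-up call
    replies = []
    for call in message.tool_calls or []:
        args = json.loads(call.function.arguments)
        replies.append({
            "role": "tool",
            "tool_call_id": call.id,
            "content": get_weather(**args),
        })
    return replies

# Follow-up completion (sketch):
# messages.append(response.choices[0].message)
# messages.extend(tool_replies(response.choices[0].message))
# final = client.chat.completions.create(model="zen-70b", messages=messages)
```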

Vision (Multimodal)

response = client.chat.completions.create(
    model="zen-vision",
    messages=[{
        "role": "user",
        "content": [
            {"type": "text", "text": "What's in this image?"},
            {"type": "image_url", "image_url": {"url": "https://example.com/photo.jpg"}},
        ],
    }],
)

API Surface

The SDK provides 187 endpoints across 51 resource namespaces, generated from the OpenAPI spec via Stainless.

Core Resources

Resource                          Method  Description
client.chat.completions.create()  POST    Chat completion (streaming optional)
client.completions.create()       POST    Text completion (legacy)
client.embeddings.create()        POST    Generate embeddings
client.models.list()              GET     List available models
client.models.retrieve(id)        GET     Get model details

File Management

Method                     Description
client.files.create()      Upload a file
client.files.retrieve(id)  Get file metadata
client.files.list()        List uploaded files
client.files.delete(id)    Delete a file
client.files.content(id)   Get file content
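Uploads take a binary file handle. A minimal sketch, assuming the `purpose` parameter follows the OpenAI-style convention (the helper name is illustrative):

```python
from pathlib import Path

def upload_dataset(client, path: str, purpose: str = "fine-tune"):
    # Open the file in binary mode; the SDK handles multipart encoding
    with Path(path).open("rb") as f:
        return client.files.create(file=f, purpose=purpose)

# Usage (requires HANZO_API_KEY):
# uploaded = upload_dataset(client, "training.jsonl")
# print(uploaded.id)
```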

Fine-Tuning

Method                                   Description
client.fine_tuning.jobs.create()         Create fine-tuning job
client.fine_tuning.jobs.retrieve(id)     Get job status
client.fine_tuning.jobs.list()           List all jobs
client.fine_tuning.jobs.cancel(id)       Cancel a job
client.fine_tuning.jobs.list_events(id)  List job events

Images

Method                    Description
client.images.generate()  Generate images
client.images.edit()      Edit images

Audio

Method                                Description
client.audio.transcriptions.create()  Speech-to-text
client.audio.translations.create()    Audio translation
client.audio.speech.create()          Text-to-speech

Additional Resource Namespaces

The SDK also includes namespaces for: assistants, threads, runs, batches, vector stores, model management, key management, team management, organization, budget, guardrails, credentials, and more (51 total).

Configuration

client = Hanzo(
    api_key="your-key",                    # Required (or HANZO_API_KEY env)
    base_url="https://api.hanzo.ai",       # Default
    timeout=60.0,                          # Request timeout (seconds)
    max_retries=2,                         # Auto-retry count
    default_headers={"X-Custom": "val"},   # Extra headers
    default_query={"version": "2"},        # Extra query params
)

Environment Variables

HANZO_API_KEY=your-api-key          # Required
HANZO_BASE_URL=https://...          # Override base URL
HANZO_LOG=debug                     # Enable debug logging

OpenAI Drop-In Replacement

import os

from openai import OpenAI

client = OpenAI(
    base_url="https://api.hanzo.ai/v1",
    api_key=os.environ["HANZO_API_KEY"],
)
# Everything works — same API shape
response = client.chat.completions.create(
    model="zen-70b",
    messages=[{"role": "user", "content": "Hello"}],
)

Framework Integration

FastAPI

from fastapi import FastAPI
from fastapi.responses import StreamingResponse
from hanzoai import AsyncHanzo

app = FastAPI()
client = AsyncHanzo()  # async client, so streaming never blocks the event loop

@app.post("/api/chat")
async def chat(messages: list[dict]):
    stream = await client.chat.completions.create(
        model="zen-70b",
        messages=messages,
        stream=True,
    )

    async def generate():
        async for chunk in stream:
            content = chunk.choices[0].delta.content
            if content:
                yield f"data: {content}\n\n"

    return StreamingResponse(generate(), media_type="text/event-stream")

LangChain

import os

from langchain_openai import ChatOpenAI

llm = ChatOpenAI(
    base_url="https://api.hanzo.ai/v1",
    api_key=os.environ["HANZO_API_KEY"],
    model="zen-70b",
)

response = llm.invoke("Hello, Hanzo!")

Error Handling

from hanzoai import Hanzo, APIError, RateLimitError, APIConnectionError

client = Hanzo()

try:
    response = client.chat.completions.create(...)
except RateLimitError as e:
    print(f"Rate limited, retry after: {e.response.headers.get('retry-after')}")
except APIError as e:
    print(f"API error: {e.status_code} {e.message}")
except APIConnectionError as e:
    print(f"Connection error: {e}")
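The client already retries automatically (see max_retries), but for long-running jobs an application-level retry with jittered exponential backoff can be layered on top. A sketch; the helper names are illustrative and not part of the SDK:

```python
import random
import time

def backoff_delay(attempt: int, base: float = 0.5, cap: float = 8.0) -> float:
    # Exponential backoff with full jitter: uniform in [0, min(cap, base * 2^attempt)]
    return random.uniform(0.0, min(cap, base * (2 ** attempt)))

def with_retries(make_request, is_retryable, max_attempts: int = 5):
    # Retry `make_request` while `is_retryable(exc)` is true, sleeping between tries
    for attempt in range(max_attempts):
        try:
            return make_request()
        except Exception as exc:
            if attempt == max_attempts - 1 or not is_retryable(exc):
                raise
            time.sleep(backoff_delay(attempt))

# Usage (sketch):
# with_retries(
#     lambda: client.chat.completions.create(model="zen-70b", messages=[...]),
#     lambda exc: isinstance(exc, RateLimitError),
# )
```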

Pagination

# Auto-pagination
for model in client.models.list():
    print(model.id)

# Manual pagination
page = client.models.list()
print(page.data)
if page.has_next_page():
    next_page = page.get_next_page()
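The has_next_page()/get_next_page() pair generalizes to a small page walker when you need manual control but still want every item. A sketch, assuming those method names and a .data attribute as shown above:

```python
def iter_all(first_page):
    # Yield every item, following pages via has_next_page()/get_next_page()
    page = first_page
    while True:
        yield from page.data
        if not page.has_next_page():
            return
        page = page.get_next_page()

# Usage (sketch):
# ids = [m.id for m in iter_all(client.models.list())]
```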

Package Structure

The SDK contains 55 packages in the pkg/ directory, organized by resource namespace:

hanzoai/
├── __init__.py           # Main client export
├── _client.py            # Hanzo and AsyncHanzo clients
├── _streaming.py         # Stream and AsyncStream
├── _response.py          # Response types
├── types/                # All Pydantic response models
│   ├── chat/
│   ├── fine_tuning/
│   ├── audio/
│   └── ...
├── resources/            # Resource namespace implementations
│   ├── chat/
│   │   └── completions.py
│   ├── embeddings.py
│   ├── models.py
│   ├── files.py
│   └── ...
└── pkg/                  # 55 sub-packages

Runtime Support

Runtime        Support  Notes
CPython 3.12+  Full     Primary target
PyPy 3.12+     Full     Compatible
Jupyter        Full     Sync client in notebooks

Related

  • hanzo/hanzo-chat.md - API reference and model catalog
  • hanzo/js-sdk.md - JavaScript equivalent
  • hanzo/go-sdk.md - Go equivalent
  • hanzo/rust-sdk.md - Rust equivalent (infrastructure SDK)
