Fleet Overview

Monitor health, resources, and workloads across all clusters from one dashboard.

The fleet dashboard gives you a unified view of every cluster connected to Hanzo Platform. See health status, resource utilization, and workload distribution at a glance.

Dashboard Overview

The fleet page displays all registered clusters in a grid or list view with real-time metrics:

Metric      Description
Status      Healthy (green), Degraded (yellow), Unreachable (red)
Nodes       Total node count and ready/not-ready breakdown
CPU         Aggregate CPU usage vs. allocatable capacity
Memory      Aggregate memory usage vs. allocatable capacity
Pods        Running pods vs. total pod capacity
Workloads   Deployments, StatefulSets, DaemonSets, Services

Health Status

Cluster health is determined by checking:

  1. API Server Reachability -- Can the platform reach the cluster's API endpoint?
  2. Node Health -- Are all nodes in Ready state?
  3. System Pods -- Are critical system workloads (DNS, networking, ingress) running?
  4. Certificate Validity -- Are TLS certificates current and not expiring soon?

Health checks run every 30 seconds. Status changes trigger notifications if you have alerts configured in Settings > Notifications.
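
As a rough illustration of how the four checks could roll up into one status, here is a minimal Python sketch. The mapping is an assumption, not documented platform behavior: an unreachable API server is treated as Unreachable, any other failing check as Degraded. The names `HealthChecks` and `cluster_status` are hypothetical.

```python
from dataclasses import dataclass


@dataclass
class HealthChecks:
    """Results of the four checks listed above (hypothetical structure)."""
    api_reachable: bool
    nodes_ready: bool
    system_pods_running: bool
    certs_valid: bool


def cluster_status(checks: HealthChecks) -> str:
    """Collapse the individual checks into one dashboard status."""
    # Assumption: if the API endpoint cannot be reached, nothing else
    # can be verified, so the cluster is reported as Unreachable.
    if not checks.api_reachable:
        return "Unreachable"
    # Assumption: every remaining check must pass for Healthy.
    if checks.nodes_ready and checks.system_pods_running and checks.certs_valid:
        return "Healthy"
    return "Degraded"


if __name__ == "__main__":
    print(cluster_status(HealthChecks(True, True, True, True)))
    print(cluster_status(HealthChecks(True, False, True, True)))
    print(cluster_status(HealthChecks(False, True, True, True)))
```

The platform may weight the checks differently, for example by tolerating a single not-ready node, so treat this only as a mental model.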

Resource Utilization

Click any cluster to see detailed resource breakdowns for CPU and memory, individual nodes, and storage.

CPU and Memory

A time-series graph shows CPU and memory usage over the last 1h, 6h, 24h, 7d, or 30d. Hover over the graph to see exact values at any point.

Node-Level Breakdown

A table lists each node with its individual resource consumption:

Node              CPU Used   CPU Total   Memory Used   Memory Total   Pods
worker-01         1.2 cores  4 cores     3.1 GB        8 GB           28/110
worker-02         2.8 cores  4 cores     5.4 GB        8 GB           45/110
worker-03         0.4 cores  4 cores     1.2 GB        8 GB           12/110
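
To relate the per-node figures to the aggregate percentages shown elsewhere on the dashboard, the following Python sketch computes utilization from the example rows above. It assumes the aggregate is simply total used divided by total capacity across nodes; the function names are hypothetical and the platform may compute its figures differently.

```python
# Example rows copied from the table above:
# (node, cpu_used, cpu_total, mem_used_gb, mem_total_gb, pods, pod_capacity)
NODES = [
    ("worker-01", 1.2, 4, 3.1, 8, 28, 110),
    ("worker-02", 2.8, 4, 5.4, 8, 45, 110),
    ("worker-03", 0.4, 4, 1.2, 8, 12, 110),
]


def percent(used: float, total: float) -> float:
    """Return used/total as a percentage, rounded to one decimal place."""
    return round(100 * used / total, 1)


def node_utilization(rows):
    """Map each node name to its CPU, memory, and pod utilization."""
    return {
        name: {
            "cpu": percent(cpu_used, cpu_total),
            "memory": percent(mem_used, mem_total),
            "pods": percent(pods, pod_cap),
        }
        for name, cpu_used, cpu_total, mem_used, mem_total, pods, pod_cap in rows
    }


def cluster_utilization(rows):
    """Aggregate utilization, assumed to be sum(used) / sum(capacity)."""
    return {
        "cpu": percent(sum(r[1] for r in rows), sum(r[2] for r in rows)),
        "memory": percent(sum(r[3] for r in rows), sum(r[4] for r in rows)),
        "pods": percent(sum(r[5] for r in rows), sum(r[6] for r in rows)),
    }


if __name__ == "__main__":
    for name, usage in node_utilization(NODES).items():
        print(name, usage)
    print("cluster", cluster_utilization(NODES))
```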

Storage

View PersistentVolumeClaim usage across the cluster:

  • Total provisioned storage
  • Used vs. available per PVC
  • Storage class distribution

On the fleet page, use the toolbar to filter clusters by:

  • Provider -- AWS, DigitalOcean, Hetzner, or self-managed
  • Orchestrator -- Kubernetes, Docker Swarm, Docker Host
  • Status -- Healthy, Degraded, Unreachable
  • Labels -- Custom labels you assign (e.g., env:production, team:backend)

# Filter examples
provider:digitalocean status:healthy
label:env=production
label:team=platform provider:aws
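
The filter examples suggest a simple grammar: space-separated `key:value` terms, with label terms written as `label:name=value`. The Python sketch below shows one way such a query could be evaluated. It assumes that all terms must match (logical AND) and that matching is case-insensitive for provider, orchestrator, and status; neither is confirmed by the platform, and `Cluster`, `parse_filter`, and `matches` are hypothetical names.

```python
from dataclasses import dataclass, field


@dataclass
class Cluster:
    """Minimal stand-in for a registered cluster (hypothetical)."""
    name: str
    provider: str
    orchestrator: str
    status: str
    labels: dict = field(default_factory=dict)


def parse_filter(query: str):
    """Split a query into (key, value) terms, e.g. ('provider', 'aws')."""
    terms = []
    for token in query.split():
        key, sep, value = token.partition(":")
        if not sep or not value:
            raise ValueError(f"malformed filter term: {token!r}")
        terms.append((key.lower(), value))
    return terms


def matches(cluster: Cluster, query: str) -> bool:
    """Return True if the cluster satisfies every term (assumed AND)."""
    for key, value in parse_filter(query):
        if key == "label":
            # Label terms look like label:env=production.
            name, sep, expected = value.partition("=")
            if not sep or cluster.labels.get(name) != expected:
                return False
        elif key in ("provider", "orchestrator", "status"):
            if getattr(cluster, key).lower() != value.lower():
                return False
        else:
            raise ValueError(f"unknown filter key: {key!r}")
    return True


if __name__ == "__main__":
    prod = Cluster("prod", "digitalocean", "kubernetes", "Healthy",
                   {"env": "production", "team": "platform"})
    print(matches(prod, "provider:digitalocean status:healthy"))
    print(matches(prod, "label:team=platform provider:aws"))
```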

Cluster Comparison

Select two or more clusters using their checkboxes, then click Compare to see a side-by-side view:

Metric        adnexus-k8s   hanzo-k8s   staging-k8s
Nodes         3             5           2
CPU Used      45%           62%         18%
Memory Used   58%           71%         22%
Pods          89            142         31
Deployments   14            22          8

Alerts

Configure fleet-wide alerts from Settings > Notifications:

  • Node not ready -- Alert when any node leaves the Ready state for more than 5 minutes
  • High CPU -- Alert when cluster CPU exceeds a threshold (default: 85%)
  • High memory -- Alert when cluster memory exceeds a threshold (default: 90%)
  • Pod crash loop -- Alert when a pod restarts more than 5 times in 10 minutes
  • Certificate expiry -- Alert 30 days before any cluster certificate expires
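
The rules above can be read as simple threshold comparisons. The Python sketch below evaluates them against a single cluster snapshot using the stated defaults. The snapshot fields and the `fired_alerts` function are hypothetical, and the certificate rule is assumed to fire once 30 or fewer days remain.

```python
from dataclasses import dataclass


@dataclass
class ClusterSnapshot:
    """Point-in-time values an alert evaluator might need (hypothetical)."""
    cpu_percent: float
    memory_percent: float
    longest_node_not_ready_minutes: float
    max_pod_restarts_in_10m: int
    days_until_cert_expiry: int


def fired_alerts(snapshot: ClusterSnapshot,
                 cpu_threshold: float = 85.0,
                 memory_threshold: float = 90.0) -> list:
    """Return the names of the alert rules that the snapshot triggers."""
    alerts = []
    # Node has been out of the Ready state for more than 5 minutes.
    if snapshot.longest_node_not_ready_minutes > 5:
        alerts.append("Node not ready")
    if snapshot.cpu_percent > cpu_threshold:
        alerts.append("High CPU")
    if snapshot.memory_percent > memory_threshold:
        alerts.append("High memory")
    # More than 5 restarts of a single pod within 10 minutes.
    if snapshot.max_pod_restarts_in_10m > 5:
        alerts.append("Pod crash loop")
    # Assumption: fire when 30 or fewer days remain before expiry.
    if snapshot.days_until_cert_expiry <= 30:
        alerts.append("Certificate expiry")
    return alerts


if __name__ == "__main__":
    snap = ClusterSnapshot(cpu_percent=91.0, memory_percent=71.0,
                           longest_node_not_ready_minutes=0,
                           max_pod_restarts_in_10m=7,
                           days_until_cert_expiry=120)
    print(fired_alerts(snap))
```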

Fleet alerts are distinct from application-level monitoring. Use both for comprehensive coverage.
