kube-state-metrics: All 200+ Kubernetes Metrics Reference (v2.13)

🫛

Pod Metrics

Track lifecycle, phase, resource requests, and container-level state

28 metrics

kube_pod_info info

Information about a pod. Returns 1 for each pod. Useful for joining its labels (such as node) onto other pod metrics with vector matching (on / group_left).

Labels: namespace, pod, node, host_ip, pod_ip, uid

kube_pod_info{namespace="production", node="worker-1"}
# Count pods per node
count by (node) (kube_pod_info)
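Since kube_pod_info is always 1, multiplying another pod metric by it leaves the value unchanged while copying labels across. The following is a minimal sketch of such a join, assuming the restart counter does not already carry a node label in your setup:

```promql
# Attach the node label from kube_pod_info to per-container restart counts
kube_pod_container_status_restarts_total
  * on (namespace, pod) group_left (node)
kube_pod_info
```

The on (namespace, pod) clause picks the matching keys, and group_left (node) allows many containers per pod on the left-hand side.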

kube_pod_status_phase gauge

The current phase of the pod (Pending, Running, Succeeded, Failed, Unknown). Value is 1 for the active phase, 0 otherwise.

Labels: namespace, pod, phase

# Count of Running pods per namespace
count by (namespace) (
  kube_pod_status_phase{phase="Running"} == 1
)
# Alert: pods stuck in Pending
kube_pod_status_phase{phase="Pending"} == 1
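The Pending check above also matches pods that are only briefly Pending during normal scheduling. One possible refinement, sketched here with an assumed 15-minute window, is to require the phase to hold for the whole range:

```promql
# Pods that stayed Pending for the entire last 15 minutes (window is an assumption)
min_over_time(kube_pod_status_phase{phase="Pending"}[15m]) == 1
```

A for: clause in an alerting rule would achieve a similar effect; which of the two fits better depends on how the rule is deployed.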

kube_pod_status_ready gauge

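Describes whether the pod is ready to serve requests. One series is exposed per condition value (true, false, unknown); the series matching the current state has value 1, the others 0.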

Labels: namespace, pod, condition

# Percentage of ready pods
sum(kube_pod_status_ready{condition="true"}) /
count(kube_pod_info) * 100

kube_pod_container_status_restarts_total counter

The number of container restarts per container. A high restart count often points to CrashLoopBackOff or OOM kills.

Labels: namespace, pod, container

# Alert: container restarting frequently
increase(kube_pod_container_status_restarts_total[1h]) > 5
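The restart counter alone does not say why a container restarted. As a hedged sketch, it can be combined with kube_pod_container_status_last_terminated_reason to narrow the alert to containers whose last termination was an OOM kill (the threshold of 5 is carried over from the example above):

```promql
# Frequently restarting containers whose last termination reason was OOMKilled
(increase(kube_pod_container_status_restarts_total[1h]) > 5)
  and on (namespace, pod, container)
(kube_pod_container_status_last_terminated_reason{reason="OOMKilled"} == 1)
```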

kube_pod_container_resource_requests gauge

The amount of a resource (CPU cores or memory bytes) requested by a container.

Labels: namespace, pod, container, resource, unit

# Total CPU requested in namespace
sum by (namespace) (
  kube_pod_container_resource_requests{resource="cpu"}
)

kube_pod_container_resource_limits gauge

The resource limit (CPU cores or memory bytes) set for a container.

Labels: namespace, pod, container, resource, unit

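# Containers without memory limits
# (compares containers with containers; kube_pod_info counts pods, not containers)
count(
  kube_pod_container_info
    unless on (namespace, pod, container)
  kube_pod_container_resource_limits{resource="memory"}
)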

🚀

Deployment Metrics

Monitor rollout status, replica counts, and update strategy

16 metrics

kube_deployment_status_replicas_available gauge

The number of available replicas per deployment.

Labels: namespace, deployment

# Alert: deployment has unavailable replicas
kube_deployment_spec_replicas -
kube_deployment_status_replicas_available > 0

kube_deployment_status_replicas_updated gauge

The number of updated replicas per deployment. Watch this converge during rolling updates.

Labels: namespace, deployment

# Rollout progress as percentage
kube_deployment_status_replicas_updated /
kube_deployment_spec_replicas * 100

kube_deployment_spec_replicas gauge

Number of desired pods for a deployment as specified in the deployment spec.

Labels: namespace, deployment

# Deployments scaled to zero
kube_deployment_spec_replicas == 0

kube_deployment_status_condition gauge

The current status conditions of a deployment (Available, Progressing, ReplicaFailure).

Labels: namespace, deployment, condition, status

# Alert: deployment reporting ReplicaFailure
kube_deployment_status_condition{
  condition="ReplicaFailure", status="true"
} == 1

🖥️

Node Metrics

Node readiness, conditions, capacity, and allocatable resources

22 metrics

kube_node_status_condition gauge

The condition status of a node (Ready, MemoryPressure, DiskPressure, NetworkUnavailable, PIDPressure).

Labels: node, condition, status

# Alert: node not Ready
kube_node_status_condition{
  condition="Ready", status="true"
} == 0
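The same metric covers the pressure conditions listed above. A minimal sketch that surfaces any node currently reporting resource pressure:

```promql
# Nodes reporting memory, disk, or PID pressure
kube_node_status_condition{
  condition=~"MemoryPressure|DiskPressure|PIDPressure", status="true"
} == 1
```

Unlike the Ready check, these conditions are healthy when status="true" is 0, so the comparison is inverted.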

kube_node_status_allocatable gauge

The amount of each resource on a node that is available for scheduling, after accounting for system-reserved resources.

Labels: node, resource, unit

# CPU allocation ratio per node
sum by (node) (kube_pod_container_resource_requests{resource="cpu"}) /
sum by (node) (kube_node_status_allocatable{resource="cpu"})
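The ratio above sums requests for every pod object that still exists, which can include completed pods that no longer consume capacity. One possible refinement, assuming only Running pods should count, joins against kube_pod_status_phase first:

```promql
# CPU allocation ratio per node, counting only Running pods (assumed policy)
sum by (node) (
  kube_pod_container_resource_requests{resource="cpu"}
    * on (namespace, pod) group_left ()
  (kube_pod_status_phase{phase="Running"} == 1)
)
/
sum by (node) (kube_node_status_allocatable{resource="cpu"})
```

Whether Pending pods should also be included is a policy choice; widening the filter to phase=~"Pending|Running" would cover pods that are already bound to a node but not yet started.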

kube_node_info info

Information about a node including kernel version, OS image, container runtime version, and kubelet version.

Labels: node, kernel_version, os_image, container_runtime_version, kubelet_version

# Count nodes by kubelet version
count by (kubelet_version) (kube_node_info)

kube_node_spec_taint gauge

The taint of a node. Useful for tracking cordoned/drained nodes and custom scheduling constraints.

Labels: node, key, value, effect

# Cordoned nodes
kube_node_spec_taint{
  key="node.kubernetes.io/unschedulable"
}

🗃️

StatefulSet Metrics

Monitor ordered pod management, update strategy, and persistence

10 metrics

kube_statefulset_status_replicas_ready gauge

The number of ready replicas for this StatefulSet controller.

Labels: namespace, statefulset

# StatefulSets not fully available
kube_statefulset_replicas !=
kube_statefulset_status_replicas_ready

kube_statefulset_status_current_revision gauge

Indicates the version of the StatefulSet used to generate Pods. Tracks rolling update progress.

Labels: namespace, statefulset, revision

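# StatefulSet mid-rollout (current revision differs from update revision)
# The revision is carried in a label, so compare label sets rather than values
kube_statefulset_status_current_revision
  unless on (namespace, statefulset, revision)
kube_statefulset_status_update_revision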

👻

DaemonSet Metrics

Track desired vs current vs ready counts for node-level workloads

11 metrics

kube_daemonset_status_desired_number_scheduled gauge

The number of nodes that should be running the daemon pod.

Labels: namespace, daemonset

# DaemonSet coverage gap
kube_daemonset_status_desired_number_scheduled -
kube_daemonset_status_number_ready

kube_daemonset_status_number_misscheduled gauge

The number of nodes that are running a daemon pod but are not supposed to. Indicates scheduling constraint drift.

Labels: namespace, daemonset

# Alert: misscheduled DaemonSet pods
kube_daemonset_status_number_misscheduled > 0

💾

PersistentVolumeClaim Metrics

Storage binding status, capacity, and access modes

9 metrics

kube_persistentvolumeclaim_status_phase gauge

The phase the persistent volume claim is currently in (Bound, Lost, Pending).

Labels: namespace, persistentvolumeclaim, phase

# Unbound PVCs
kube_persistentvolumeclaim_status_phase{
  phase!="Bound"
} == 1

kube_persistentvolumeclaim_resource_requests_storage_bytes gauge

The capacity of storage requested by the persistent volume claim.

Labels: namespace, persistentvolumeclaim

# Total storage requested per namespace (GiB)
sum by (namespace) (
  kube_persistentvolumeclaim_resource_requests_storage_bytes
) / 1073741824

⚖️

HorizontalPodAutoscaler Metrics

Current vs desired replicas and scaling conditions

10 metrics

kube_horizontalpodautoscaler_status_current_replicas gauge

Current number of replicas of pods managed by this autoscaler.

Labels: namespace, horizontalpodautoscaler

# HPAs at max capacity
kube_horizontalpodautoscaler_status_current_replicas ==
kube_horizontalpodautoscaler_spec_max_replicas

kube_horizontalpodautoscaler_spec_min_replicas gauge

Lower limit for the number of replicas to which the autoscaler can scale down.

Labels: namespace, horizontalpodautoscaler

# HPAs at minimum replicas (fully scaled down)
kube_horizontalpodautoscaler_status_current_replicas ==
kube_horizontalpodautoscaler_spec_min_replicas

⚙️

Job & CronJob Metrics

Completion status, duration, active pods, and schedule health

15 metrics

kube_job_status_active gauge

The number of actively running pods for a Job.

Labels: namespace, job_name

# Long-running jobs (over 1 hour)
# > 0 rather than == 1, so jobs with several active pods also match
kube_job_status_active > 0
  and
(time() - kube_job_status_start_time) > 3600

kube_job_status_failed gauge

The number of pods which reached phase Failed for a Job.

Labels: namespace, job_name, condition

# Alert: any job failure
kube_job_status_failed > 0

kube_cronjob_next_schedule_time gauge

Next time the CronJob should be scheduled. Useful for detecting missed schedules.

Labels: namespace, cronjob

# Missed CronJob executions
time() - kube_cronjob_next_schedule_time > 3600
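A suspended CronJob can look like a missed schedule in a query of this kind. As a hedged sketch, suspended CronJobs can be excluded by matching against kube_cronjob_spec_suspend (the one-hour threshold is carried over from the example above):

```promql
# Missed executions, ignoring CronJobs that are intentionally suspended
(time() - kube_cronjob_next_schedule_time > 3600)
  and on (namespace, cronjob)
(kube_cronjob_spec_suspend == 0)
```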

📁

Namespace Metrics

Namespace phase, labels, and annotations

4 metrics

kube_namespace_status_phase gauge

The phase of the namespace (Active or Terminating).

Labels: namespace, phase

# Namespaces currently in the Terminating phase
kube_namespace_status_phase{phase="Terminating"} == 1

🌐

Service & Ingress Metrics

Service type, selector, and ingress TLS/backend information

14 metrics

kube_service_info info

Information about a Kubernetes service including cluster IP, type, and external IP.

Labels: namespace, service, cluster_ip, external_ip, type

# Count LoadBalancer services (in kube-state-metrics the type label is exposed on kube_service_spec_type)
count(kube_service_spec_type{type="LoadBalancer"})

kube_ingress_info info

Information about an Ingress resource including class, default backend, and TLS configuration.

Labels: namespace, ingress, ingressclass

# Ingress objects per namespace
count by (namespace) (kube_ingress_info)
