If you are trying to understand Kubernetes monitoring, you will quickly run into two names: kube-state-metrics and Metrics Server.
At first glance, they seem similar. Both expose metrics. Both integrate into the cluster. Both are widely used.
But they serve completely different purposes.
And confusing them leads to bad monitoring decisions, incomplete visibility, or worse, alerting that does not reflect reality.
This guide breaks down kube-state-metrics vs Metrics Server in plain terms, so you can choose the right tool (or combination) based on what you actually need.
What Is Metrics Server?
Metrics Server is a lightweight, cluster-wide aggregator of resource usage metrics. It collects real-time data such as:
- CPU usage
- Memory usage
This data comes directly from the Kubelet on each node and is exposed through the Kubernetes Metrics API.
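Because the data is served through the standard Metrics API, you can query it directly from the API server. A minimal sketch, assuming Metrics Server is installed in the cluster (the `default` namespace is just an example):

```shell
# Raw Metrics API queries; Metrics Server aggregates these from the Kubelets
kubectl get --raw "/apis/metrics.k8s.io/v1beta1/nodes"
kubectl get --raw "/apis/metrics.k8s.io/v1beta1/namespaces/default/pods"
```

`kubectl top` is essentially a friendlier front end over these same endpoints.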
What It Is Used For
Metrics Server is primarily designed for autoscaling and quick insights, not deep monitoring. Typical use cases:
- Horizontal Pod Autoscaler (HPA)
- Vertical Pod Autoscaler (VPA)
- `kubectl top` commands
If you have ever run:

```shell
kubectl top pods
```

you were using Metrics Server.
Key Characteristics
- Focuses on live resource usage
- Stores no historical data
- Lightweight and fast
- Not designed for alerting or dashboards
If you want a deeper walkthrough, this internal guide on Kubernetes Metrics Server explains its setup and behavior in more detail.
What Is Kube-State-Metrics?
Kube-state-metrics does something fundamentally different. Instead of measuring usage, it exposes the state of Kubernetes objects. That includes:
- Deployments
- Pods
- Nodes
- StatefulSets
- Jobs
What It Tracks
It answers questions like:
- Is a deployment fully available?
- How many replicas are desired vs running?
- Are pods stuck in pending or crash loop states?
- Has a job failed or completed?
This data comes from the Kubernetes API server, not from resource usage.
Key Characteristics
- Focuses on cluster state and configuration
- Works best with tools like Prometheus
- Enables alerting, dashboards, and analysis
- Provides semantic, human-readable metrics
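To make "semantic, human-readable metrics" concrete, here is the kind of series kube-state-metrics exposes in Prometheus exposition format. The metric names below are real kube-state-metrics metrics; the label values are invented for illustration:

```
kube_deployment_spec_replicas{namespace="default",deployment="web"} 3
kube_deployment_status_replicas_available{namespace="default",deployment="web"} 2
kube_pod_status_phase{namespace="default",pod="web-7f9c",phase="Pending"} 1
```

Note that these are gauges describing state, not counters of CPU or memory consumption.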
Kube-State-Metrics vs Metrics Server (Core Difference)
Let us make it simple.
| Aspect | Metrics Server | Kube-State-Metrics |
| --- | --- | --- |
| Purpose | Resource usage | Cluster state |
| Data Source | Kubelet | Kubernetes API server |
| Metrics Type | CPU, memory | Object status |
| Historical Data | No | No (Prometheus stores the history when it scrapes) |
| Use Case | Autoscaling | Monitoring, alerting |
| Example | CPU usage of a pod | Pod readiness status |
The Real Difference
- Metrics Server tells you how much your app is using
- Kube-state-metrics tells you whether your app is healthy
You need both perspectives to understand what is actually happening.
Why This Difference Matters
Many Kubernetes monitoring setups fall short, not because tools are missing, but because they are misunderstood.
Relying on only one of these tools creates blind spots. And in production, blind spots turn into delayed responses, poor scaling decisions, and hard-to-debug failures.
Scenario 1: Only Metrics Server
You can see:
- CPU is low
- Memory is stable
At a glance, everything looks healthy. But beneath that:
- Pods might be restarting repeatedly
- Deployments may not have the required number of ready replicas
- Jobs could be failing silently
Metrics Server does not surface any of this. It only shows resource consumption, not whether workloads are functioning correctly.
So you end up with a false sense of stability: systems look fine, but users may already be impacted.
Scenario 2: Only Kube-State-Metrics
You can see:
- Pods are failing
- Replicas are missing
- Deployments are not progressing
But, you do not know if it is due to CPU or memory pressure. Without resource metrics, you are guessing. You can detect the symptom, but not the cause.
The Real Problem
Each tool answers only half the story:
- Metrics Server → “How much is being used?”
- Kube-state-metrics → “Is it working as expected?”
Individually, both are incomplete.
The Right Approach
Use them together.
- Metrics Server → performance signals (CPU, memory, real-time usage)
- Kube-state-metrics → state signals (readiness, availability, failures)
When combined:
- You detect issues faster
- You understand root causes more clearly
- You avoid chasing the wrong signals
That is when monitoring stops being reactive and starts becoming reliable. Combined, they give you a complete picture.
When to Use Metrics Server
Use Metrics Server if your goal is:
1. Autoscaling
It powers:
- Horizontal Pod Autoscaler (HPA)
- Vertical Pod Autoscaler (VPA)
Without it (or another implementation of the Metrics API), resource-based autoscaling will not work.
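For reference, this is what an HPA that depends on Metrics Server looks like. The deployment name `web` and the thresholds are placeholder values:

```yaml
apiVersion: autoscaling/v2
kind: HorizontalPodAutoscaler
metadata:
  name: web-hpa
spec:
  scaleTargetRef:
    apiVersion: apps/v1
    kind: Deployment
    name: web          # example target deployment
  minReplicas: 2
  maxReplicas: 10
  metrics:
  - type: Resource
    resource:
      name: cpu
      target:
        type: Utilization
        averageUtilization: 70   # scale out above 70% average CPU
```

The `Resource` metric type here is exactly what the Metrics API (and therefore Metrics Server) feeds.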
2. Quick Debugging
Need a quick check?

```shell
kubectl top nodes
kubectl top pods
```

That is Metrics Server doing its job.
3. Lightweight Clusters
If you do not want a full monitoring stack, Metrics Server is minimal and efficient.
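Installation is correspondingly small. A sketch using the project's published release manifest (check the metrics-server repository for the version pinned to your cluster):

```shell
kubectl apply -f https://github.com/kubernetes-sigs/metrics-server/releases/latest/download/components.yaml

# Verify after a minute or so
kubectl top nodes
```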
When to Use Kube-State-Metrics
Use kube-state-metrics when you care about cluster behavior over time.
1. Alerting
Examples:
- Deployment replicas mismatch
- Pods not ready
- CrashLoopBackOff detection
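The three alerts above translate almost directly into Prometheus rules over kube-state-metrics series. A sketch (metric names are real; thresholds, durations, and group names are illustrative):

```yaml
groups:
- name: workload-health
  rules:
  - alert: DeploymentReplicasMismatch
    # desired replicas differ from available replicas
    expr: kube_deployment_spec_replicas != kube_deployment_status_replicas_available
    for: 10m
    labels:
      severity: warning
  - alert: PodCrashLooping
    # container stuck waiting with reason CrashLoopBackOff
    expr: kube_pod_container_status_waiting_reason{reason="CrashLoopBackOff"} == 1
    for: 5m
    labels:
      severity: critical
```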
2. Dashboards
Tools like Prometheus + Grafana rely on kube-state-metrics to visualize cluster health.
3. Operational Visibility
It helps answer:
- Is my deployment actually working?
- Are resources stuck or misconfigured?
Can You Replace One with the Other?
No. They solve different problems. Trying to replace:
- Metrics Server with kube-state-metrics → You lose resource usage data
- Kube-state-metrics with Metrics Server → You lose cluster state insights
This is not a choice between tools. It is a layered system.
How They Work Together in a Real Stack
A typical production setup looks like this:
- Metrics Server → feeds autoscaling
- Kube-state-metrics → feeds Prometheus
- Prometheus → stores and queries metrics
- Grafana → visualizes everything
Each layer has a purpose. Remove one, and the system becomes incomplete.
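Wiring the middle layers together is usually a single scrape job. A minimal sketch of the Prometheus side, assuming kube-state-metrics is deployed as a service named `kube-state-metrics` in `kube-system` on its default port (adjust to your deployment):

```yaml
scrape_configs:
- job_name: kube-state-metrics
  static_configs:
  - targets: ['kube-state-metrics.kube-system.svc:8080']
```

In practice most people get this via the Prometheus Operator or a Helm chart rather than hand-writing it, but the data flow is the same.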
Common Misunderstandings
“Metrics Server Is Enough for Monitoring”
It is not. Metrics Server was built for autoscaling decisions, not observability.
It gives you short-lived snapshots of CPU and memory, but:
- no historical trends
- no workload health signals
- no context around failures
So while it helps Kubernetes decide when to scale, it does not help you understand why something broke or what is degrading over time.
“Kube-State-Metrics Shows Resource Usage”
It does not. Kube-state-metrics exposes desired vs actual state, not consumption. You will see:
- how many replicas should be running vs how many are ready
- whether pods are pending, running, or failing
- rollout and deployment status
But it has zero visibility into:
- CPU pressure
- memory limits
- actual runtime load
It tells you what is wrong structurally, not what is causing it.
“They Compete With Each Other”
They do not operate in the same layer.
- Metrics Server → resource layer (usage)
- Kube-state-metrics → control plane layer (state)
Treating them as alternatives leads to incomplete monitoring. They are designed to be used together, not replaced.
Practical Example
Let us say your app is down.
With Metrics Server
You see:
- CPU usage is low
- Memory looks fine
No obvious issue.
With Kube-State-Metrics
You see:
- Pods are restarting
- Deployment has unavailable replicas
Now you know something is wrong.
Combined Insight
You check:
- Resource usage → fine
- Pod state → failing
Likely config issue, not resource issue. That is the value of using both.
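The same triage can be sketched at the command line. The first command reads from Metrics Server; the last two read object state from the API server, which is the same data kube-state-metrics exports to Prometheus (`web` and `default` are example names):

```shell
# Usage view: is the app starved for CPU or memory? (Metrics Server)
kubectl top pods -n default

# State view: restarts, readiness, desired vs available replicas
kubectl get pods -n default
kubectl get deployment web -n default
```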
Which One Should You Start With?
If you are just starting:
- Begin with Metrics Server for basic visibility
- Add kube-state-metrics when you need real monitoring
If you are running production workloads, you should already be using both.
Conclusion
The debate around kube-state-metrics vs Metrics Server is not about choosing one over the other. It is about understanding what each tool does and what it does not.
- Metrics Server shows how resources are being used.
- Kube-state-metrics shows whether your cluster is behaving correctly.
You need both to see the full picture.
FAQ Section
1. Is Metrics Server required for Kubernetes?
It is not mandatory, but it is required for autoscaling features like HPA.
2. Does kube-state-metrics store data?
No. It exposes metrics. Storage depends on tools like Prometheus.
3. Can I use kube-state-metrics without Prometheus?
Technically yes, but it is most useful when paired with a monitoring system.
4. Why does Metrics Server not show pod status?
Because it only tracks resource usage, not object state.
5. Do I need both in production?
Yes, if you want complete visibility across performance and state.