Kube-State-Metrics Scrape Config: Setup That Actually Works

If your metrics look incomplete, stale, or just wrong, the issue is often not Kubernetes; it is your scrape configuration.

Getting the kube-state-metrics scrape config right is what turns raw cluster data into something Prometheus can actually use. Without it, dashboards break, alerts misfire, and debugging becomes guesswork.

This guide walks you through exactly how to configure Prometheus to scrape Kube-State-Metrics the right way: cleanly, reliably, and without hidden pitfalls.

What “Scrape Config” Actually Means (In Simple Terms)

Before jumping into YAML, it helps to understand what is really happening. Prometheus does not “receive” metrics automatically. It pulls them. A scrape config tells Prometheus:

  • where to find metrics (target endpoint)
  • how often to fetch them (interval)
  • how to label them (metadata for querying)

Kube-State-Metrics exposes cluster state at an HTTP endpoint. Prometheus needs a precise configuration to discover and scrape that endpoint consistently.

If you want a deeper understanding of how this data flow works, it helps to see the full pipeline in action in this Kube State Metrics and Prometheus integration guide.

Why Kube-State-Metrics Needs a Proper Scrape Setup

Kube-State-Metrics is different from node exporters or application metrics. It does not measure performance. It exposes cluster state:

  • pod status
  • deployment replicas
  • job completions
  • resource conditions

That means:

  • Data must be fresh → otherwise alerts lag
  • Labels must be clean → otherwise queries break
  • Targets must be stable → otherwise metrics disappear

A weak scrape config leads to:

  • missing metrics
  • duplicate targets
  • incorrect labeling
  • noisy dashboards

Basic Kube State Metrics Scrape Config (Working Example)

Here is a clean and minimal Prometheus scrape config:

scrape_configs:
  - job_name: 'kube-state-metrics'
    scrape_interval: 30s
    metrics_path: /metrics

    static_configs:
      - targets:
          - kube-state-metrics.kube-system.svc.cluster.local:8080

What This Does

  • job_name → groups metrics under a logical name
  • scrape_interval → fetches metrics every 30 seconds
  • metrics_path → default endpoint exposed by KSM
  • targets → Kubernetes service endpoint

This works well for simple setups or local clusters. But in real environments, static configs are rarely enough.

Using Kubernetes Service Discovery (Recommended)

Static targets break easily in dynamic clusters. Instead, use Kubernetes service discovery:

scrape_configs:
  - job_name: 'kube-state-metrics'

    kubernetes_sd_configs:
      - role: endpoints

    relabel_configs:
      - source_labels: [__meta_kubernetes_service_name]
        action: keep
        regex: kube-state-metrics

      - source_labels: [__meta_kubernetes_namespace]
        action: keep
        regex: kube-system

Why This Is Better

  • Automatically discovers the service
  • Adapts to pod restarts and scaling
  • Reduces manual updates
  • Keeps config future-proof

What the Relabeling Does

  • Filters only the kube-state-metrics service
  • Restricts scraping to the correct namespace

Without relabeling, Prometheus may scrape unnecessary endpoints.
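Conceptually, `action: keep` is a regex filter over discovered targets. Here is a minimal Python sketch of that behavior (the target dicts are illustrative stand-ins, not the real discovery payload):

```python
import re

def keep(targets, source_label, pattern):
    # Prometheus anchors relabel regexes, so use fullmatch, not search.
    rx = re.compile(pattern)
    return [t for t in targets if rx.fullmatch(t.get(source_label, ""))]

discovered = [
    {"__meta_kubernetes_service_name": "kube-state-metrics",
     "__meta_kubernetes_namespace": "kube-system"},
    {"__meta_kubernetes_service_name": "kube-dns",
     "__meta_kubernetes_namespace": "kube-system"},
]

# Apply the two keep rules from the config above, in order:
kept = keep(discovered, "__meta_kubernetes_service_name", "kube-state-metrics")
kept = keep(kept, "__meta_kubernetes_namespace", "kube-system")
print([t["__meta_kubernetes_service_name"] for t in kept])  # ['kube-state-metrics']
```

Each rule narrows the target set; targets that fail any `keep` are dropped before scraping.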

How to Verify Your Scrape Config Is Working

After applying your Prometheus scrape configuration, do not just assume everything is fine. Always verify it properly. This step helps you catch issues early before they affect your monitoring data.

Step 1: Open Prometheus Targets Page

Start by checking if Prometheus is actually scraping your target. Go to the Prometheus UI and open:

/targets

Now look for the kube-state-metrics job in the list. You should see:

  • Status: UP
  • Last scrape: recent (within the last minute or so)

If it shows DOWN, click on it to see the error. Issues here are usually related to service discovery or network access.
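You can also script this check against the Prometheus HTTP API at /api/v1/targets. A sketch, assuming Prometheus is reachable at localhost:9090 (the helper operates on the parsed JSON response, shown here with a canned sample in the same shape):

```python
import json
import urllib.request  # used by the live check below

def unhealthy(payload, job):
    """Return targets of the given job whose health is not 'up'."""
    return [
        t for t in payload["data"]["activeTargets"]
        if t["labels"].get("job") == job and t["health"] != "up"
    ]

# Live check (uncomment when Prometheus is reachable at this address):
# resp = urllib.request.urlopen("http://localhost:9090/api/v1/targets")
# payload = json.load(resp)

# Canned response in the API's shape, for illustration:
payload = {"data": {"activeTargets": [
    {"labels": {"job": "kube-state-metrics"}, "health": "up"},
]}}
print(unhealthy(payload, "kube-state-metrics"))  # [] means all targets are up
```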

Step 2: Check Metrics Directly

Next, confirm that data is actually coming in. Go to the Graph or Explore section and run a simple query:

kube_pod_info

If everything is working correctly, you will see active time series appear. If nothing shows up, it usually means one of two things:

  • scraping is not working
  • kube-state-metrics is not exposing metrics properly
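To check the second case, hit the KSM endpoint directly: it serves plain-text Prometheus exposition format. The sample lines below are illustrative of what KSM emits, and the tiny parser shows the shape a scraper works with:

```python
import re

# Illustrative lines in the shape KSM's /metrics endpoint returns:
sample = """\
# HELP kube_pod_info Information about pod.
# TYPE kube_pod_info gauge
kube_pod_info{namespace="default",pod="web-0"} 1
kube_pod_info{namespace="kube-system",pod="coredns-abc"} 1
"""

line_re = re.compile(r'^(\w+)\{(.*)\}\s+(\S+)$')

series = []
for line in sample.splitlines():
    if line.startswith("#"):
        continue  # skip HELP/TYPE comment lines
    m = line_re.match(line)
    if m:
        name, labels, value = m.groups()
        series.append((name, labels, float(value)))

print(len(series))  # 2
```

If a curl of the endpoint returns nothing like this, the problem is on the KSM side, not in your scrape config.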

Step 3: Validate Labels

Now check the quality of your metrics. Inspect the output and look at key labels like:

  • namespace
  • pod
  • container

These labels should be clear and consistent. Why this matters:

  • clean labels make dashboards easier to build
  • consistent labels make filtering and grouping reliable
  • both make debugging much faster later

If labels look messy or missing, your scraping setup might still need tuning.

Common Mistakes That Break Scraping

1. Wrong Service Endpoint

A small DNS mistake breaks everything. Typical causes:

  • wrong namespace
  • wrong port
  • wrong service name

Always confirm:

kubectl get svc -n kube-system

2. Missing RBAC Permissions

Prometheus must be allowed to discover endpoints. Without proper RBAC:

  • targets do not appear
  • scraping silently fails

This is especially common in restricted clusters.
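As a sketch, here is a minimal ClusterRole granting the permissions endpoint discovery needs (the names prometheus-discovery, prometheus, and the monitoring namespace are illustrative; adapt them to your deployment):

```yaml
apiVersion: rbac.authorization.k8s.io/v1
kind: ClusterRole
metadata:
  name: prometheus-discovery
rules:
  - apiGroups: [""]
    resources: ["endpoints", "services", "pods"]
    verbs: ["get", "list", "watch"]
---
apiVersion: rbac.authorization.k8s.io/v1
kind: ClusterRoleBinding
metadata:
  name: prometheus-discovery
roleRef:
  apiGroup: rbac.authorization.k8s.io
  kind: ClusterRole
  name: prometheus-discovery
subjects:
  - kind: ServiceAccount
    name: prometheus        # the service account your Prometheus pod runs as
    namespace: monitoring
```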

3. Overly Broad Discovery

Scraping everything sounds easy but creates chaos. Without filtering:

  • Prometheus scrapes unrelated services
  • metrics become noisy
  • storage usage increases

Always limit targets using relabel configs.

4. Too Frequent Scraping

Lower interval ≠ better monitoring. Example mistake:

scrape_interval: 5s

This causes:

  • unnecessary load
  • duplicate data
  • faster storage consumption

For Kube-State-Metrics, a 30s–60s interval is usually enough.
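Rough arithmetic shows why. Assuming an illustrative 10,000 KSM time series (check your own count), the ingested sample volume scales inversely with the interval:

```python
series = 10_000          # illustrative series count, not a measured value
seconds_per_day = 86_400

for interval in (5, 30, 60):
    samples = series * seconds_per_day // interval
    print(f"{interval:>2}s interval -> {samples:,} samples/day")
```

A 5s interval ingests six times the samples of a 30s interval with no extra insight, since cluster state rarely changes that fast.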

5. Ignoring Metrics Cardinality

Kube-State-Metrics exposes many labels. Too many unique label combinations (high cardinality) lead to:

  • slow queries
  • heavy Prometheus memory usage

Fix:

  • avoid collecting unused metrics
  • control label usage where possible
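One way to cut this at scrape time is a drop rule on the Prometheus side (the metric name pattern here is illustrative; drop is the inverse of keep):

```yaml
metric_relabel_configs:
  - source_labels: [__name__]
    regex: 'kube_pod_container_status_last_terminated_.*'
    action: drop
```

Dropped series are discarded before storage, so they cost nothing beyond the scrape itself.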

Advanced Improvements (When You Need More Control)

Once the basics are solid, you can optimize further.

Add Custom Labels for Better Querying

relabel_configs:
  - target_label: cluster
    replacement: production

This helps when:

  • managing multiple clusters
  • building shared dashboards
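An alternative for the multi-cluster case is Prometheus's global external_labels, which attaches the label to data leaving this server (remote write, federation, alerts):

```yaml
global:
  external_labels:
    cluster: production
```

Use the relabel approach when you need the label on locally stored series; use external_labels when clusters feed a shared backend.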

Filter Metrics at Scrape Time

If you do not need everything:

metric_relabel_configs:
  - source_labels: [__name__]
    regex: 'kube_pod_.*'
    action: keep

This keeps only pod-related metrics. Result:

  • cleaner data
  • faster queries
  • lower storage cost

Separate Jobs for Clarity

Instead of mixing kube-state-metrics into a shared job with other targets, give it a dedicated one:

job_name: 'kube-state-metrics'

Keep it isolated. This makes:

  • debugging easier
  • dashboards cleaner
  • alerts more predictable

How Kube-State-Metrics Fits Into Your Monitoring Flow

Kube-state-metrics is like a translator between Kubernetes and Prometheus. Understanding the full flow prevents misconfiguration.

1. Kubernetes continuously updates the status of objects like pods and deployments, but Prometheus cannot read that state directly.

2. Kube-State-Metrics converts that state into metrics Prometheus can understand.

3. Prometheus scrapes the endpoint to collect the data.

4. The data is stored and queried.

5. Dashboards and alerts use that data.

If something goes wrong at step 3, everything below it stops working properly. You might still see Prometheus running, but your dashboards can show empty or incomplete data.

When Static Config Is Still Fine

Static config is fine when your setup is simple and does not change much.

  • It works well for local clusters, Docker-based setups, or testing environments where you just want things to run quickly.
  • It is easy because you manually tell Prometheus what to scrape. But in real production systems, things change often, and static config becomes harder to manage.

That is why most production setups use service discovery instead.

Quick Checklist (Before You Move On)

Before you finish, quickly confirm that your setup:

  • uses correct service endpoint
  • has RBAC configured
  • filters targets properly
  • uses reasonable scrape interval (not too fast or too slow)
  • avoids unnecessary metrics

If all of this looks good, your setup is healthy and ready to use.

Conclusion

A good kube-state-metrics scrape config is not complicated, but it has to be intentional. Most monitoring issues do not come from Kubernetes itself. They come from weak or incomplete scraping setups.

Get the basics right, keep the config clean, and your metrics will stay reliable, predictable, and useful.

FAQ Section

1. What port does Kube-State-Metrics expose?

By default, it exposes metrics on port 8080 at the /metrics endpoint.

2. Do I need service discovery for Kube-State-Metrics?

Not strictly. But in dynamic Kubernetes environments, service discovery is strongly recommended for stability.

3. How often should Prometheus scrape Kube-State-Metrics?

A 30–60 second interval is usually sufficient. Faster scraping rarely adds value.

4. Why are my Kube-State-Metrics targets showing DOWN?

Common causes are:

  • wrong service endpoint
  • missing RBAC permissions
  • incorrect namespace filtering

5. Can I reduce the number of metrics scraped?

Yes. Use metric_relabel_configs to filter out unnecessary metrics and reduce load.
