Skip to main content
Updated Feb 23, 2026

OpenCost/Kubecost Visibility

Your Kubernetes cluster runs three namespaces: production (customer-facing APIs), staging (testing), and data-science (ML training jobs). At month-end, your cloud bill shows $15,000. Your CFO asks: "Which team is responsible for what portion of that cost?"

Without cost visibility, you're guessing. Maybe production is expensive because it runs 24/7. Maybe data-science is expensive because GPU nodes cost $3/hour. Maybe staging is wasting money running unused replicas. You could estimate based on node count, but that ignores the reality: pods share nodes, some pods are memory-heavy while others are CPU-heavy, and resource requests rarely match actual usage.

OpenCost solves this. It watches your cluster, tracks resource consumption per pod, and calculates cost allocation with precision. When your CFO asks "where's the money going?", you query the API: "Show me cost breakdown by namespace for the last 30 days." Seconds later: production ($9,200), data-science ($4,800), staging ($1,000). Now you can have a real conversation about whether that data-science spend is generating value.

This lesson teaches you to install OpenCost, understand its architecture, and query the allocation API to answer cost visibility questions.

OpenCost vs Kubecost: CNCF Open Source vs Commercial

Before diving into implementation, understand the landscape. Two tools dominate Kubernetes cost visibility, and they share the same core engine.

OpenCost: The CNCF incubating project (accepted October 2024 as incubating). Fully open-source under Apache 2.0 license. Developed by the same team that created Kubecost's allocation engine. OpenCost provides the core cost allocation capabilities that most teams need.

Kubecost: The commercial product built on top of OpenCost. Adds enterprise features like discount reconciliation (reserved instances, spot pricing, credits), multi-cluster unified views, cost forecasting, anomaly detection, and integrations with Slack/PagerDuty.

AspectOpenCostKubecost
LicenseApache 2.0 (fully open source)Freemium with paid tiers
CNCF StatusIncubating projectCommercial product
Pricing AccuracyOn-demand list pricesIncludes discounts, credits, spot pricing
Multi-ClusterSingle cluster viewUnified multi-cluster view (paid)
Cost AllocationFull namespace/label/pod supportSame, plus forecasting and anomaly detection
Best ForLearning, single clusters, budget-conscious teamsEnterprises needing discount reconciliation

For this course: We use OpenCost because it's the CNCF standard, free to use, and teaches the core concepts. Everything you learn applies directly to Kubecost if your organization needs enterprise features.

OpenCost Architecture

OpenCost runs as a deployment in your cluster. It integrates with Prometheus to collect resource metrics and calculate costs.

┌─────────────────────────────────────────────────────────┐
│ Kubernetes Cluster │
│ │
│ ┌──────────────┐ metrics ┌──────────────────┐ │
│ │ Prometheus │ ◄──────────── │ Node Exporter │ │
│ │ │ │ kube-state- │ │
│ │ │ │ metrics │ │
│ └──────┬───────┘ └──────────────────┘ │
│ │ │
│ │ PromQL queries │
│ ▼ │
│ ┌──────────────┐ /allocation ┌──────────────┐ │
│ │ OpenCost │ ◄──────────────────│ Your Query │ │
│ │ │ /assets │ (curl, UI) │ │
│ │ Port 9003 │ └──────────────┘ │
│ └──────────────┘ │
│ │
└─────────────────────────────────────────────────────────┘

How it works:

  1. Metrics collection: Prometheus scrapes resource metrics from your cluster (CPU, memory, network, storage usage per pod)

  2. Cost calculation: OpenCost queries Prometheus and applies pricing:

    • Cloud pricing from AWS/Azure/GCP billing APIs (or custom on-prem rates)
    • Formula: max(request, usage) * hourly_rate
  3. Allocation API: You query OpenCost's HTTP API to retrieve cost data aggregated by namespace, label, controller, or pod

  4. Prometheus integration: OpenCost also exposes its own Prometheus metrics, enabling Grafana dashboards and alerting

Installing OpenCost

OpenCost requires Prometheus. If you followed Chapter 55, you already have kube-prometheus-stack installed. OpenCost connects to your existing Prometheus.

Prerequisites

Verify Prometheus is running:

kubectl get pods -n monitoring | grep prometheus

Output:

prometheus-kube-prometheus-stack-prometheus-0   2/2     Running   0   5d

Install OpenCost with Helm

# Add the OpenCost Helm repository
helm repo add opencost https://opencost.github.io/opencost-helm-chart
helm repo update

# Install OpenCost pointing to your Prometheus
helm install opencost opencost/opencost \
--namespace opencost \
--create-namespace \
--set prometheus.internal.serviceName=prometheus-kube-prometheus-stack-prometheus \
--set prometheus.internal.namespaceName=monitoring

Output:

NAME: opencost
LAST DEPLOYED: Mon Dec 30 10:15:00 2025
NAMESPACE: opencost
STATUS: deployed
REVISION: 1

Verify Installation

Check that OpenCost is running:

kubectl get pods -n opencost

Output:

NAME                        READY   STATUS    RESTARTS   AGE
opencost-7d8f9c4b5f-xk2m9 1/1 Running 0 60s

Port-forward to access the API:

kubectl port-forward -n opencost svc/opencost 9003:9003

Test the API is responding:

curl http://localhost:9003/allocation/compute \
-G \
-d window=1h \
-d aggregate=namespace

Output:

{
"code": 200,
"data": [
{
"default": {
"cpuCost": 0.0234,
"ramCost": 0.0156,
...
}
}
]
}

OpenCost is now collecting cost data from your cluster.

Querying the Allocation API

The /allocation API is your primary interface for cost visibility. It answers: "How much did X cost during time window Y?"

Query Structure

curl http://localhost:9003/allocation/compute \
-G \
-d window=<time-range> \
-d aggregate=<grouping> \
-d filter=<optional-filter>

Parameters:

ParameterDescriptionExamples
windowTime range to query1h, 24h, 7d, 30d, lastweek
aggregateHow to group costsnamespace, label:team, controller, pod
filterLimit to specific resourcesnamespace:production, label:app=task-api
shareIdleInclude idle coststrue, false

Query by Namespace

The most common query: "How much did each namespace cost?"

curl http://localhost:9003/allocation/compute \
-G \
-d window=7d \
-d aggregate=namespace \
-d shareIdle=true

Output:

{
"code": 200,
"data": [
{
"production": {
"name": "production",
"cpuCost": 45.23,
"ramCost": 28.67,
"pvCost": 12.50,
"totalCost": 86.40
},
"data-science": {
"name": "data-science",
"cpuCost": 102.45,
"gpuCost": 234.00,
"ramCost": 45.12,
"totalCost": 381.57
},
"staging": {
"name": "staging",
"cpuCost": 8.90,
"ramCost": 5.23,
"totalCost": 14.13
}
}
]
}

Now you can answer: "Data-science costs 79% of our cluster spend, mostly from GPU usage."

Query by Team Label

When namespaces don't map cleanly to teams, use labels. If your pods have labels.team: product-team, query by that label:

curl http://localhost:9003/allocation/compute \
-G \
-d window=7d \
-d aggregate=label:team

Output:

{
"code": 200,
"data": [
{
"product-team": {
"name": "product-team",
"totalCost": 156.78
},
"platform-team": {
"name": "platform-team",
"totalCost": 89.34
},
"ml-team": {
"name": "ml-team",
"totalCost": 312.45
}
}
]
}

This works because OpenCost reads labels from your pods. If your pods don't have team labels, you can't query by team.

Filter to Specific Resources

Query a single namespace or application:

# Cost for production namespace only
curl http://localhost:9003/allocation/compute \
-G \
-d window=7d \
-d aggregate=pod \
-d filter=namespace:production

Output:

{
"code": 200,
"data": [
{
"production/task-api-deployment-abc123": {
"name": "production/task-api-deployment-abc123",
"totalCost": 23.45
},
"production/inference-service-xyz789": {
"name": "production/inference-service-xyz789",
"totalCost": 62.90
}
}
]
}

Now you see that inference-service costs nearly 3x what task-api costs. Worth investigating.

Understanding Idle Cost

Idle cost is the money you're paying for resources that no one is using. It's the gap between what you're paying for (provisioned resources) and what workloads are actually consuming (allocated resources).

Idle Cost = Provisioned Cost - Allocated Cost

Why Idle Cost Matters

If you provision a node with 32 GB of memory but pods only request 16 GB, you're paying for 16 GB of idle memory. That idle capacity has a cost.

Example scenario:

MetricValueCost Impact
Node capacity32 GB RAMYou pay for 32 GB
Pod requests16 GB RAMAllocated to workloads
Idle16 GB RAMPaying but not using

If memory costs $0.01/GB/hour, you're spending $0.16/hour on idle memory alone. Over a month: $115 wasted.

Querying Idle Cost

Use shareIdle=true to distribute idle costs across namespaces proportionally:

# Without idle cost distribution
curl http://localhost:9003/allocation/compute \
-G \
-d window=7d \
-d aggregate=namespace \
-d shareIdle=false

# With idle cost distribution
curl http://localhost:9003/allocation/compute \
-G \
-d window=7d \
-d aggregate=namespace \
-d shareIdle=true

With shareIdle=true, each namespace's cost includes its proportional share of cluster-wide idle costs. This gives a more accurate picture of true cost.

Reducing Idle Cost

High idle cost signals optimization opportunities:

  1. Right-size nodes: Use VPA recommendations (Lesson 2) to reduce over-provisioning
  2. Use autoscaling: KEDA and HPA can scale down during low-traffic periods
  3. Bin-packing: Cluster autoscaler removes underutilized nodes
  4. Review requests: If pods request 1 GB but use 200 MB, reduce the request

Cost Attribution Labels

OpenCost can only report costs by dimensions it knows about. If your pods don't have team labels, you can't query costs by team. Labeling is essential for visibility.

Add these labels to all workloads for comprehensive cost attribution:

apiVersion: apps/v1
kind: Deployment
metadata:
name: task-api
labels:
team: product-team # Which team owns this?
app: task-api # Which application is this?
environment: production # Production, staging, development?
cost-center: engineering # Budget allocation
spec:
template:
metadata:
labels:
team: product-team
app: task-api
environment: production
cost-center: engineering

Label strategy:

LabelPurposeQuery Example
teamTeam responsibilityaggregate=label:team
appApplication attributionaggregate=label:app
environmentEnvironment separationfilter=label:environment=production
cost-centerFinance/billingaggregate=label:cost-center

Labeling Best Practices

  1. Apply labels at deployment time: Include labels in your Helm charts or Kustomize bases

  2. Use consistent naming: team not owner or squad across all resources

  3. Label namespaces too: OpenCost can aggregate by namespace labels

  4. Enforce with admission controllers: Use OPA/Gatekeeper to require cost labels on all deployments

From Visibility to Action: The FinOps Progression

OpenCost provides visibility. What you do with that visibility follows the FinOps progression:

Stage 1: Showback (Where We Are Now)

Report costs to teams without charging them. Build trust that the data is accurate.

Actions:

  • Create weekly cost reports by team
  • Share dashboards showing cost trends
  • Let teams review and validate their costs

Stage 2: Allocation

Map costs to business entities. Connect cost centers to budgets.

Actions:

  • Assign cost-center labels to all workloads
  • Create namespace/team cost reports for finance
  • Track costs against budget

Stage 3: Chargeback

Formally bill internal teams for their usage. This is the mature state.

Actions:

  • Monthly invoices to cost centers
  • Usage-based internal billing
  • Accountability for over-budget spend

Most teams never reach chargeback. Showback is often sufficient to drive cost-conscious behavior.

Try With AI

These prompts help you apply cost visibility concepts to your own projects.

Prompt 1: Cost Query Design

I have a Kubernetes cluster with 5 namespaces and 3 teams.
Pods have labels: team, app, environment.

Design a set of OpenCost queries that answer:
1. Monthly cost by team
2. Cost breakdown for the 'ml-team' by application
3. Idle cost percentage across the cluster

Show the curl commands I would use.

What you're learning: How to translate cost visibility questions into specific API queries. AI helps you construct the query parameters.

Prompt 2: Label Strategy

I'm setting up cost attribution for a new Kubernetes cluster.
We have:
- 4 teams (product, platform, data, infrastructure)
- 3 environments (dev, staging, production)
- 15 applications

Design a labeling strategy:
- What labels should every resource have?
- How do I enforce these labels?
- What queries will these labels enable?

What you're learning: How to design a labeling strategy before deployment. Retrofitting labels is painful; planning ahead saves time.

Prompt 3: Cost Report Generation

I want to create a weekly cost report for my team leads.
Each team should see:
- Their total cost for the week
- Cost breakdown by application
- Comparison to previous week
- Top 3 most expensive pods

Using the OpenCost API, design a script that generates this report.

What you're learning: How to automate cost reporting. AI helps you combine multiple API calls into a coherent report.

Safety note: Cost data can reveal business information (which products get investment, team sizes, infrastructure scale). Restrict access to cost dashboards and APIs. In multi-tenant clusters, ensure teams only see their own cost data.

Reflect on Your Skill

You built an operational-excellence skill in Lesson 0. Test and improve it based on what you learned.

Test Your Skill

Using my operational-excellence skill, explain how to query cost data by team using OpenCost.
Does my skill describe the /allocation API parameters correctly?
Does it include the cost attribution label strategy?

Identify Gaps

Ask yourself:

  • Did my skill include the idle cost concept (provisioned - allocated)?
  • Did it explain the FinOps progression (showback -> allocation -> chargeback)?
  • Did it include the recommended labels (team, app, environment, cost-center)?

Improve Your Skill

If you found gaps:

My operational-excellence skill is missing the OpenCost query patterns.
Update it to include:
- The /allocation API structure with window, aggregate, and filter parameters
- Example queries for namespace and team-label cost allocation
- The idle cost formula and why it matters