Updated Feb 23, 2026

Chapter 59: Cost & Disaster Recovery

You build an operational-excellence skill first, then apply it to control spend and recover from failures through FinOps visibility, right-sizing, backups, restores, and chaos testing.

Goals

Understand cloud cost fundamentals and allocation
Right-size workloads with VPA and optimize requests/limits
Enable cost visibility with OpenCost and FinOps practices
Implement backups/restores with Velero for clusters and volumes
Practice chaos engineering to validate resilience
Capture patterns in a reusable operational-excellence skill

Lesson Progression

Build Your Operational-Excellence Skill
Cost fundamentals and allocation
Right-sizing with VPA; cost visibility with OpenCost
FinOps practices (budgets, alerts, showback/chargeback)
Backup/recovery fundamentals and Velero usage
Chaos engineering basics
Capstone: cost-aware, resilient Task API; finalize the skill

Each lesson ends with a reflection in which you test your skill, find its gaps, and improve it.

Outcome & Method

You finish with cost monitoring, right-sized workloads, tested backups and restores, and chaos validation for the Task API, plus a reusable operational-excellence skill. The chapter follows the skill-first pattern and ends with a spec-driven capstone.

Prerequisites

Chapters 49-58 (containerized, deployed, secured, observable service)
