Updated Feb 23, 2026

Chapter 59: Cost & Disaster Recovery

You build an operational-excellence skill first, then apply it to control spend and recover from failures: FinOps visibility, right-sizing, backups, restores, and chaos testing.


Goals

  • Understand cloud cost fundamentals and allocation
  • Right-size workloads with the Vertical Pod Autoscaler (VPA) and optimize requests/limits
  • Enable cost visibility with OpenCost and FinOps practices
  • Implement backups/restores with Velero for clusters and volumes
  • Practice chaos engineering to validate resilience
  • Capture patterns in a reusable operational-excellence skill

Lesson Progression

  • Build Your Operational-Excellence Skill
  • Cost fundamentals and allocation
  • Right-sizing with VPA; cost visibility with OpenCost
  • FinOps practices (budgets, alerts, showback/chargeback)
  • Backup/recovery fundamentals and Velero usage
  • Chaos engineering basics
  • Capstone: cost-aware, resilient Task API; finalize the skill

Each lesson ends with a reflection in which you test your skill, find its gaps, and improve it.


Outcome & Method

You finish with cost monitoring, right-sized workloads, tested backups and restores, and chaos validation for the Task API, plus a reusable operational-excellence skill. The chapter follows the skill-first pattern and ends with a spec-driven capstone.


Prerequisites

  • Chapters 49-58 (a containerized, deployed, secured, and observable service)