Chapter 59: Cost & Disaster Recovery
You build an operational-excellence skill first, then apply it to control spend and recover from failures through FinOps visibility, right-sizing, backups, restores, and chaos testing.
Goals
- Understand cloud cost fundamentals and allocation
- Right-size workloads with the Vertical Pod Autoscaler (VPA) and optimize resource requests/limits
- Enable cost visibility with OpenCost and FinOps practices
- Implement backups/restores with Velero for clusters and volumes
- Practice chaos engineering to validate resilience
- Capture patterns in a reusable operational-excellence skill
Lesson Progression
- Build Your Operational-Excellence Skill
- Cost fundamentals and allocation
- Right-sizing with VPA; cost visibility with OpenCost
- FinOps practices (budgets, alerts, showback/chargeback)
- Backup/recovery fundamentals and Velero usage
- Chaos engineering basics
- Capstone: cost-aware, resilient Task API; finalize the skill
Each lesson ends with a reflection in which you test the skill, find its gaps, and improve it.
Outcome & Method
You finish with cost monitoring, right-sized workloads, tested backups/restores, and chaos validation for the Task API, plus a reusable operational-excellence skill. The chapter follows the skill-first pattern with a spec-driven capstone.
Prerequisites
- Chapters 49-58 (containerized, deployed, secured, observable service)