Data Sovereignty and Compliance
Your Task API agent is working well. Velero backs it up nightly. Chaos Mesh confirms it recovers from failures. Then your sales team closes a deal with a German healthcare company, and you get a message from legal: "We need to discuss GDPR compliance for the backup infrastructure."
Suddenly, technical decisions about where S3 buckets live become legal decisions. The backup storage location you chose for convenience might violate data residency laws. The audit logs you never configured might be required for compliance certification. The encryption you assumed was "good enough" might not meet regulatory standards.
Data sovereignty transforms backup strategy from "store it somewhere reliable" to "store it where the law allows, prove you did it correctly, and encrypt it to specific standards." This lesson explains what data sovereignty means, how it affects your Velero backup configuration, and what audit trails you need to demonstrate compliance.
What is Data Sovereignty?
Data sovereignty is the principle that data is subject to the laws of the country where it physically resides. When your Kubernetes backup sits in an S3 bucket in Virginia, US law governs that data. When the same backup sits in an S3 bucket in Frankfurt, German and EU law governs it.
This matters because different jurisdictions have different rules about:
- Who can access data: Some countries require government access to data stored within their borders. Others prohibit such access.
- Where data can be transferred: The EU restricts transferring personal data to countries without "adequate" data protection laws.
- How long data must be retained: Financial regulations might require 7-year retention. Privacy laws might require deletion after purpose is fulfilled.
- How data must be encrypted: Some regulations specify encryption algorithms. Others require specific key management practices.
The core question for disaster recovery: Where does your backup storage physically exist, and what laws apply there?
GDPR and Data Residency
The General Data Protection Regulation (GDPR) is the EU's data protection law. It affects any organization that processes data of EU residents, regardless of where the organization is based. If your Task API has users in Germany, GDPR applies to you even if your company is in California.
Key GDPR Requirements for Backups
Data Residency: GDPR doesn't require data to stay in the EU, but transferring data outside the EU requires legal mechanisms. Since the Schrems II ruling in 2020, transfers to the US require additional safeguards beyond the standard contractual clauses that were previously sufficient.
Practical impact: If your Velero backup storage is in a US region and your application processes EU personal data, you need legal review. Many organizations choose the simpler path: keep EU data in EU regions.
Right to Erasure (Article 17): Users can request deletion of their personal data. This includes data in backups. You need a process to either delete specific records from backups or document why full deletion isn't technically feasible.
Data Protection by Design (Article 25): Encryption isn't optional. Backups containing personal data must be encrypted at rest and in transit. The encryption must be "appropriate to the risk."
Backup Location Strategy
For GDPR compliance, the safest approach is geographic isolation:
| Data Origin | Backup Storage Region | Legal Basis |
|---|---|---|
| EU users | EU region (eu-west-1, eu-central-1) | Data stays in adequate jurisdiction |
| US users | US region (us-east-1, us-west-2) | Domestic data, US law applies |
| Mixed users | Both regions, routed by origin | Data stays in origin jurisdiction |
For Velero, this means configuring separate BackupStorageLocation resources for each region:
EU backup location (conceptual configuration):
- Bucket:
task-api-backups-eu - Region:
eu-central-1(Frankfurt) - Encryption: AES-256 at rest, TLS 1.3 in transit
- Retention: Per GDPR requirements (purpose-based)
US backup location (conceptual configuration):
- Bucket:
task-api-backups-us - Region:
us-east-1(Virginia) - Encryption: AES-256 at rest, TLS 1.3 in transit
- Retention: Per US requirements (may differ from EU)
The key insight: backup location isn't just about latency or cost. It's about legal jurisdiction.
Beyond GDPR: Other Regulations
GDPR is the most discussed regulation, but it's not the only one affecting backup strategy:
| Regulation | Scope | Key Backup Requirement |
|---|---|---|
| HIPAA (US Healthcare) | Protected health information | Encryption, access controls, audit trails, BAA with cloud provider |
| PCI DSS (Payment Cards) | Cardholder data | Encryption, key management, quarterly vulnerability scans of backup systems |
| SOX (US Financial) | Financial records | 7-year retention, immutable backups, audit trails |
| CCPA (California Privacy) | California residents' personal data | Deletion capability, disclosure of data practices |
The pattern: Every regulation requires encryption, most require audit trails, many specify retention periods or deletion capabilities.
What this lesson covers vs. doesn't cover:
- Covers: Understanding that these regulations exist and affect backup design
- Covers: Data sovereignty and audit trail concepts
- Does NOT cover: Certification processes (SOC2, ISO27001 are multi-month projects requiring dedicated compliance teams)
Encryption Requirements
Regulatory compliance almost universally requires encryption. But "encrypted" isn't a binary state. Compliance requires specific encryption configurations.
At-Rest Encryption
Data stored in backup files must be encrypted. For Velero with S3-compatible storage:
Server-Side Encryption (SSE): The storage provider encrypts data when it's written. Options include:
- SSE-S3: AWS manages keys. Simple but less control.
- SSE-KMS: AWS Key Management Service manages keys. Better for compliance because you control key rotation and access policies.
- SSE-C: You provide keys. Maximum control but you manage key distribution.
For most compliance scenarios, SSE-KMS provides the right balance: AWS manages encryption operations, you control keys and policies, and you get audit logs of key usage.
In-Transit Encryption
Data moving between your Kubernetes cluster and backup storage must be encrypted. This typically means:
- TLS 1.2 or 1.3: Velero uses HTTPS by default when communicating with S3
- Certificate validation: Don't disable TLS certificate verification (a common but dangerous shortcut)
- Private endpoints: Use VPC endpoints to keep traffic off the public internet
Key Management
Encryption is only as strong as key management. Compliance requires:
- Key rotation: Keys should rotate periodically (annually at minimum)
- Access control: Only authorized systems should access encryption keys
- Audit logging: Every key access should be logged
Audit Trails: Proving Compliance
Encryption protects data. Audit trails prove you protected it. When a compliance auditor asks "Who accessed this backup on March 15?", you need an answer. When a regulator investigates a breach, you need logs showing what happened.
What Audit Trails Must Capture
Backup operations:
- When was each backup created?
- What namespaces/resources were included?
- Did the backup succeed or fail?
- Who or what system initiated the backup?
Restore operations:
- When was each restore performed?
- What backup was used?
- What target namespace received the restore?
- Who authorized the restore?
Access to backup storage:
- Who accessed the S3 bucket?
- What operations did they perform (read, write, delete)?
- Was access authorized?
Kubernetes Audit Logs
Kubernetes has built-in audit logging that captures API server activity. For backup compliance, you need logs showing:
- Velero backup and restore custom resource creation
- Access to secrets (backup credentials)
- Service account token usage
The audit policy determines what gets logged. For compliance, you typically need at least Metadata level logging for backup-related resources and Request or RequestResponse for sensitive operations.
Velero Backup Logs
Velero generates detailed logs for each backup and restore operation. These logs show:
- Resources discovered and backed up
- Volume snapshots created
- Errors or warnings during backup
- Timing information
For compliance, these logs should be stored in a separate, tamper-evident location (not just the Velero pod logs that rotate and disappear).
Storage Access Logs
S3 and compatible storage systems can log all access requests. Enable server access logging to capture:
- Every request to the backup bucket
- Source IP address
- Action performed (GET, PUT, DELETE)
- Success or failure
These logs go to a separate bucket that the backup system cannot modify, providing tamper-evident access records.
The Compliance Mindset
Technical teams often approach compliance as a checkbox exercise: "Enable encryption. Done." But compliance is actually about demonstrating a system of controls:
Design: Document why you made specific choices (EU region for GDPR, SSE-KMS for key control)
Implement: Configure systems according to documented design
Monitor: Continuously verify that controls are working (audit logs, compliance dashboards)
Respond: Have procedures for when things go wrong (data breach notification, audit response)
The backup configuration is just one piece. Compliance requires the full lifecycle: design documents, configuration evidence, ongoing monitoring, and incident response procedures.
Multi-Region Considerations
Real-world applications often have users in multiple jurisdictions. A multi-region backup strategy addresses this:
Region affinity: Route user data to the appropriate region based on user location or preference. Backups inherit this affinity.
Separate backup locations: Configure Velero with multiple BackupStorageLocation resources, one per compliance region.
Namespace organization: Consider organizing namespaces by compliance region (e.g., production-eu, production-us) to simplify backup targeting.
Retention differences: Different regulations may require different retention periods. EU data might have shorter retention (purpose limitation), while US financial data might require 7+ years.
Try With AI
These prompts help you apply data sovereignty concepts to your own deployment scenarios.
Prompt 1: Compliance Requirements Analysis
I'm building a Task API that will have users in the EU and US.
The application stores user-created tasks with titles and descriptions.
Help me understand:
1. Does GDPR apply to my application?
2. What data sovereignty requirements affect my backup strategy?
3. What backup location configuration would satisfy both EU and US requirements?
What you're learning: How to analyze your application's data to determine which regulations apply and how they affect infrastructure decisions. AI helps you think through the regulatory landscape before making technical choices.
Prompt 2: Audit Trail Design
My Kubernetes cluster runs Velero for backups.
I need to demonstrate compliance to auditors.
Design an audit trail strategy that captures:
- All backup and restore operations
- Access to backup storage
- Who initiated each operation
What logging should I enable, and where should logs be stored?
What you're learning: How to design audit infrastructure that satisfies compliance requirements. AI helps you map compliance needs to specific Kubernetes and cloud logging configurations.
Prompt 3: Encryption Review
Review this backup storage configuration for compliance gaps:
Backup bucket: us-east-1
Encryption: SSE-S3 (AWS managed keys)
Access logging: Not enabled
Users: Mix of EU and US
Identify what I need to change for GDPR compliance.
What specific configuration changes would you recommend?
What you're learning: How to identify compliance gaps in existing configurations. AI helps you analyze configurations against regulatory requirements and prioritize necessary changes.
Safety note: This lesson provides educational guidance on compliance concepts. Actual compliance certification (GDPR, HIPAA, SOC2) requires legal counsel and may require third-party auditors. Use this material to understand concepts and ask informed questions, not as legal advice.
Reflect on Your Skill
You built an operational-excellence skill earlier in this chapter. Test and improve it with data sovereignty knowledge.
Test Your Skill
Using my operational-excellence skill, explain how data residency requirements
affect Velero backup configuration.
Does my skill understand:
- What data sovereignty means?
- Why backup location matters for compliance?
- What audit trails are needed?
Identify Gaps
Ask yourself:
- Did my skill mention any specific regulations (GDPR, HIPAA)?
- Did it explain the relationship between backup location and legal jurisdiction?
- Did it include audit trail requirements?
Improve Your Skill
If you found gaps:
My operational-excellence skill is missing data sovereignty knowledge.
Update it to include:
- Data residency requirements for GDPR
- Backup location strategy for multi-region compliance
- Audit trail requirements for demonstrating compliance