Capstone: Production Traffic for Task API
You have learned Gateway API fundamentals, configured Envoy Gateway, written HTTPRoutes, applied rate limiting, set up TLS, implemented traffic splitting, configured autoscaling with KEDA, and explored Envoy AI Gateway for LLM traffic. Each lesson added a pattern to your traffic-engineer skill. Now you will compose all these patterns into a production-ready traffic stack for your Task API.
This capstone follows the specification-first approach. You will write a specification defining what "production-ready" means for your deployment, then use your traffic-engineer skill to generate configurations that meet those criteria. The goal is not just working configuration—it is configuration you can validate, defend, and evolve.
By the end, you will have a complete traffic engineering stack protecting your Task API: external access through Gateway, rate limiting per user, TLS termination, and autoscaling based on demand. You will also finalize your traffic-engineer skill as a reusable component for any future AI agent deployment.
## Phase 1: Write Your Specification
Before generating any YAML, define what you are building. Create a file named traffic-spec.md in your project:
````markdown
# Production Traffic Specification: Task API
## Overview
Configure production-ready traffic management for Task API running
in the task-api namespace on Docker Desktop Kubernetes.
## Business Goals
- External users can reach Task API from their browsers
- Abusive users cannot exhaust resources for others
- Communication is encrypted (HTTPS)
- System scales automatically with demand
## Success Criteria
### SC-001: External Access
Users can reach Task API at http://localhost:8080/api/tasks
from outside the cluster.
**Validation:**
```bash
curl -s -o /dev/null -w "%{http_code}" http://localhost:8080/api/tasks
# Expected: 200
```
### SC-002: Rate Limiting Active
Each user (identified by x-user-id header) is limited to 100 requests/minute. Anonymous users are limited to 10 requests/minute.
**Validation:**
```bash
# Authenticated user: 100 requests succeed, then 429
for i in {1..110}; do
  curl -s -o /dev/null -w "%{http_code}\n" \
    -H "x-user-id: test" http://localhost:8080/api/tasks
done | grep 429 | wc -l
# Expected: 10 (requests 101-110 rejected)
```
### SC-003: TLS Termination
HTTPS access works with a valid (self-signed for local) certificate.
**Validation:**
```bash
curl -k -s -o /dev/null -w "%{http_code}" https://localhost:8443/api/tasks
# Expected: 200
```
### SC-004: Autoscaling Active
KEDA ScaledObject scales pods from 1-10 based on request rate. Scale-to-zero disabled (minimum 1 replica for availability).
**Validation:**
```bash
kubectl get scaledobject task-api-scaler -n task-api
# Expected: READY=True, MIN=1, MAX=10
```
### SC-005: Versioned API Paths
Traffic routes to correct backends based on path prefix:
- /api/v1/* -> task-api-v1 service
- /api/v2/* -> task-api-v2 service (if deployed)
**Validation:**
```bash
curl -s http://localhost:8080/api/v1/tasks | jq .version
# Expected: "v1"
```
### SC-006: Circuit Breaker Protection
Backend protected from overload with max 50 concurrent connections.
**Validation:**
```bash
kubectl get backendtrafficpolicy task-api-protection -n task-api -o yaml | grep maxConnections
# Expected: maxConnections: 50
```
### SC-007: Skill Produces Valid Configuration
Traffic-engineer skill generates all required resources from this spec.
**Validation:**
```bash
# Skill generates YAML that passes dry-run
kubectl apply --dry-run=client -f generated-config.yaml
# Expected: All resources created (dry run)
```
## Constraints
- Docker Desktop Kubernetes (no cloud load balancer)
- Self-signed TLS certificates (no external CA)
- Prometheus available at prometheus.monitoring.svc.cluster.local:9090
- KEDA installed in keda namespace
## Non-Goals
- Multi-cluster routing
- External DNS integration
- Production certificate management (covered in real deployments)
- Global rate limiting with Redis (local rate limiting sufficient)
````
**Output:**
```
traffic-spec.md created (2847 bytes)
```
This specification answers the key questions: What are you building? How will you know it works? What constraints exist?
---
## Phase 2: Generate Configuration with Your Skill
With the specification in hand, use your traffic-engineer skill to generate the configuration, one resource at a time.
### Step 1: Generate Gateway and HTTPRoute
Ask your skill to create the entry point:
Using my traffic-engineer skill, generate Gateway and HTTPRoute for Task API:
From traffic-spec.md:
- Gateway listening on ports 80 (HTTP) and 443 (HTTPS)
- HTTPRoute for /api/v1/* to task-api service on port 8000
- Running in task-api namespace
- Docker Desktop Kubernetes (gatewayClassName: eg)
**Review AI's output.** Check these specifics:
- Does Gateway use `gatewayClassName: eg` (Envoy Gateway)?
- Does Gateway have two listeners (http on 80, https on 443)?
- Does HTTPRoute use `parentRefs` pointing to the Gateway?
- Does HTTPRoute match `/api/v1` path prefix?
**Expected Gateway structure:**
```yaml
apiVersion: gateway.networking.k8s.io/v1
kind: Gateway
metadata:
  name: task-api-gateway
  namespace: task-api
spec:
  gatewayClassName: eg
  listeners:
    - name: http
      protocol: HTTP
      port: 80
    - name: https
      protocol: HTTPS
      port: 443
      tls:
        mode: Terminate
        certificateRefs:
          - name: task-api-tls
```
**Output:**
```
gateway.gateway.networking.k8s.io/task-api-gateway created (dry run)
```
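Part of the Gateway review checklist can be automated with plain text matching. The following is a minimal sketch, assuming the skill wrote the manifest to a file such as gateway.yaml. The helper name `check_gateway_manifest` is hypothetical, and the check only confirms that the expected fields appear as whole values on some line; it does not parse YAML or verify which listener a field belongs to.

```shell
# Hypothetical helper: report checklist fields missing from a Gateway manifest.
# Text matching only; this is not a YAML parser.
check_gateway_manifest() (
  file="$1"
  status=0
  for needle in "gatewayClassName: eg" "protocol: HTTP" "port: 80" \
                "protocol: HTTPS" "port: 443"; do
    # Anchor at end of line so "protocol: HTTP" does not match "protocol: HTTPS"
    # and "port: 80" does not match "port: 8080".
    if ! grep -qE -- "(^|[[:space:]])${needle}[[:space:]]*\$" "$file"; then
      echo "missing: ${needle}"
      status=1
    fi
  done
  exit "$status"
)
```

Calling `check_gateway_manifest gateway.yaml` prints nothing and returns 0 when every field is present, which makes it easy to chain before the dry-run.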
### Step 2: Generate Rate Limiting Policy
Request BackendTrafficPolicy matching SC-002:
Generate BackendTrafficPolicy for Task API rate limiting:
- 100 requests/minute per authenticated user (x-user-id header)
- 10 requests/minute for anonymous users (no x-user-id header)
- Local rate limiting (no Redis dependency)
- Apply to task-api-route HTTPRoute
**Review AI's output.** Check these specifics:
- Does policy use `clientSelectors` with header matching?
- Does anonymous rule use `invert: true`?
- Are limits set correctly (100 vs 10)?
**Expected structure:**
```yaml
apiVersion: gateway.envoyproxy.io/v1alpha1
kind: BackendTrafficPolicy
metadata:
  name: task-api-ratelimit
  namespace: task-api
spec:
  targetRefs:
    - group: gateway.networking.k8s.io
      kind: HTTPRoute
      name: task-api-route
  rateLimit:
    local:
      rules:
        # Authenticated users
        - clientSelectors:
            - headers:
                - name: x-user-id
                  value: "*"
          limit:
            requests: 100
            unit: Minute
        # Anonymous users
        - clientSelectors:
            - headers:
                - name: x-user-id
                  value: "*"
                  invert: true
          limit:
            requests: 10
            unit: Minute
```
**Output:**
```
backendtrafficpolicy.gateway.envoyproxy.io/task-api-ratelimit created (dry run)
```
### Step 3: Generate Circuit Breaker Policy
Request protection against backend overload:
Add circuit breaker to the BackendTrafficPolicy:
- Maximum 50 concurrent connections
- Maximum 25 parallel requests
- Maximum 5 pending requests (fail fast)
**Review AI's output.** Ensure the circuit breaker is added to the existing BackendTrafficPolicy rather than created as a separate resource. The combined policy is named `task-api-protection` from this point on, which is the name SC-006 checks for:
```yaml
circuitBreaker:
  maxConnections: 50
  maxParallelRequests: 25
  maxPendingRequests: 5
```
**Output:**
```
backendtrafficpolicy.gateway.envoyproxy.io/task-api-protection created (dry run)
```
### Step 4: Generate TLS Configuration
Request self-signed certificate setup:
Generate TLS configuration for local development:
- Self-signed certificate for localhost
- Certificate stored in task-api-tls Secret
- Gateway configured for TLS termination
**Review AI's output.** For local development, you need:
- Certificate generation commands
- Secret containing cert and key
- Gateway listener referencing the secret
**Certificate generation:**
```bash
# Generate self-signed certificate
openssl req -x509 -nodes -days 365 -newkey rsa:2048 \
  -keyout tls.key -out tls.crt \
  -subj "/CN=localhost" \
  -addext "subjectAltName=DNS:localhost,IP:127.0.0.1"
# Create Kubernetes secret
kubectl create secret tls task-api-tls \
  --cert=tls.crt --key=tls.key \
  -n task-api
```
**Output:**
```
secret/task-api-tls created
```
### Step 5: Generate KEDA ScaledObject
Request autoscaling configuration matching SC-004:
Generate KEDA ScaledObject for Task API:
- Scale based on request rate from Prometheus
- Minimum 1 replica (no scale-to-zero for availability)
- Maximum 10 replicas
- Scale up when requests exceed 50/second
- Prometheus at prometheus.monitoring.svc.cluster.local:9090
**Review AI's output.** Check these specifics:
- Is `minReplicaCount` set to 1 (not 0)?
- Is the Prometheus query correct for your metrics?
- Is the threshold appropriate?
**Expected structure:**
```yaml
apiVersion: keda.sh/v1alpha1
kind: ScaledObject
metadata:
  name: task-api-scaler
  namespace: task-api
spec:
  scaleTargetRef:
    name: task-api
  minReplicaCount: 1
  maxReplicaCount: 10
  pollingInterval: 15
  cooldownPeriod: 300
  triggers:
    - type: prometheus
      metadata:
        serverAddress: http://prometheus.monitoring.svc.cluster.local:9090
        metricName: task_api_request_rate
        query: sum(rate(http_requests_total{service="task-api"}[2m]))
        threshold: "50"
```
**Output:**
```
scaledobject.keda.sh/task-api-scaler created (dry run)
```
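To judge whether a threshold of 50 is appropriate, it helps to estimate the replica count it implies. KEDA passes the query result to a Horizontal Pod Autoscaler, which, for the default average-value target, aims for roughly the metric divided by the threshold, rounded up and clamped to the min and max replica counts. The sketch below is only that arithmetic: the function name `desired_replicas` is hypothetical, it assumes whole-number request rates, and it ignores the autoscaler's tolerance band and stabilization windows.

```shell
# Hypothetical estimator: desired_replicas <rate> <threshold> <min> <max>
# Approximates ceil(rate / threshold), clamped to [min, max].
desired_replicas() (
  rate="$1"; threshold="$2"; min="$3"; max="$4"
  # Integer ceiling division.
  want=$(( (rate + threshold - 1) / threshold ))
  if [ "$want" -lt "$min" ]; then want="$min"; fi
  if [ "$want" -gt "$max" ]; then want="$max"; fi
  echo "$want"
)
```

For example, `desired_replicas 120 50 1 10` prints 3, while an idle service (`desired_replicas 0 50 1 10`) stays at the minimum of 1.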
## Phase 3: Apply Configuration
With all configurations generated and reviewed, apply them to your cluster in the correct order.
### Step 1: Create Namespace and Prerequisites
```bash
# Create namespace if not exists
kubectl create namespace task-api --dry-run=client -o yaml | kubectl apply -f -
# Verify Envoy Gateway is running
kubectl get pods -n envoy-gateway-system
```
**Output:**
```
namespace/task-api configured
NAME                            READY   STATUS    RESTARTS   AGE
envoy-gateway-7b9c4d6f5-xxxxx   1/1     Running   0          1h
```
### Step 2: Create TLS Secret
```bash
# Generate certificate
openssl req -x509 -nodes -days 365 -newkey rsa:2048 \
  -keyout tls.key -out tls.crt \
  -subj "/CN=localhost" \
  -addext "subjectAltName=DNS:localhost,IP:127.0.0.1"
# Create secret
kubectl create secret tls task-api-tls \
  --cert=tls.crt --key=tls.key \
  -n task-api
```
**Output:**
```
Generating a RSA private key
...
secret/task-api-tls created
```
### Step 3: Apply Gateway and HTTPRoute
```bash
# Apply Gateway first (HTTPRoute depends on it)
kubectl apply -f gateway.yaml
# Wait for Gateway to be ready
kubectl wait --for=condition=Programmed gateway/task-api-gateway -n task-api --timeout=60s
# Apply HTTPRoute
kubectl apply -f httproute.yaml
```
**Output:**
```
gateway.gateway.networking.k8s.io/task-api-gateway created
gateway.gateway.networking.k8s.io/task-api-gateway condition met
httproute.gateway.networking.k8s.io/task-api-route created
```
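The Gateway listens on ports 80 and 443, while the success criteria use localhost:8080 and localhost:8443. One way to bridge the two on a local cluster is a port-forward to the proxy Service that Envoy Gateway creates for the Gateway. The sketch below assumes a default Envoy Gateway install, where that Service lives in the envoy-gateway-system namespace and carries `owning-gateway-*` labels. The function names are hypothetical, and if your setup already exposes the Gateway on those local ports, none of this is needed.

```shell
# Build the label selector for the proxy Service of a given Gateway.
# Assumption: default Envoy Gateway labels on the generated Service.
# Usage: gateway_selector <gateway-namespace> <gateway-name>
gateway_selector() {
  echo "gateway.envoyproxy.io/owning-gateway-namespace=$1,gateway.envoyproxy.io/owning-gateway-name=$2"
}

# Forward localhost:8080 -> 80 and localhost:8443 -> 443 on that Service.
# Blocks until interrupted, so run it in a separate terminal.
forward_gateway_ports() {
  svc=$(kubectl get svc -n envoy-gateway-system \
    --selector="$(gateway_selector task-api task-api-gateway)" \
    -o jsonpath='{.items[0].metadata.name}') || return 1
  kubectl -n envoy-gateway-system port-forward "service/${svc}" 8080:80 8443:443
}
```

With `forward_gateway_ports` running, the curl commands in the specification reach the HTTP and HTTPS listeners through the forwarded ports.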
### Step 4: Apply Traffic Policies
```bash
# Apply BackendTrafficPolicy (rate limiting + circuit breaker)
kubectl apply -f backend-traffic-policy.yaml
```
**Output:**
```
backendtrafficpolicy.gateway.envoyproxy.io/task-api-protection created
```
### Step 5: Apply KEDA ScaledObject
```bash
# Apply ScaledObject
kubectl apply -f scaledobject.yaml
# Verify KEDA recognized it
kubectl get scaledobject -n task-api
```
**Output:**
```
scaledobject.keda.sh/task-api-scaler created
NAME              SCALETARGETKIND      SCALETARGETNAME   MIN   MAX   TRIGGERS     READY   ACTIVE
task-api-scaler   apps/v1.Deployment   task-api          1     10    prometheus   True    True
```
## Phase 4: Validate Against Success Criteria
Run each validation from your specification.
### SC-001: External Access
```bash
curl -s -o /dev/null -w "%{http_code}" http://localhost:8080/api/tasks
```
**Output:**
```
200
```
**Result:** PASS
### SC-002: Rate Limiting
```bash
# Test authenticated user limit (100/min)
for i in {1..110}; do
  curl -s -o /dev/null -w "%{http_code}\n" \
    -H "x-user-id: test-user" http://localhost:8080/api/tasks
done | sort | uniq -c
```
**Output:**
```
    100 200
     10 429
```
**Result:** PASS (100 succeeded, 10 rate limited)
```bash
# Test anonymous limit (10/min)
for i in {1..15}; do
  curl -s -o /dev/null -w "%{http_code}\n" http://localhost:8080/api/tasks
done | sort | uniq -c
```
**Output:**
```
     10 200
      5 429
```
**Result:** PASS (10 succeeded, 5 rate limited)
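Reading the counts by eye works for a one-off check, but a pass/fail assertion is easier to rerun. The helpers below are a small sketch with hypothetical names (`count_status`, `expect_count`); they read one status code per line from stdin, so any curl loop that prints `%{http_code}` can be piped into them.

```shell
# Count how many stdin lines are exactly the given status code.
count_status() {
  # grep -c exits non-zero when the count is 0; keep the exit status clean.
  grep -cx -- "$1" || true
}

# expect_count <status-code> <expected-count>
# Reads status codes from stdin and reports PASS or FAIL.
expect_count() (
  code="$1"; expected="$2"
  actual=$(count_status "$code")
  if [ "$actual" -eq "$expected" ]; then
    echo "PASS: ${actual} responses with ${code}"
  else
    echo "FAIL: expected ${expected} responses with ${code}, got ${actual}"
    exit 1
  fi
)
```

For the authenticated test, replace the trailing `sort | uniq -c` with `expect_count 429 10`; for the anonymous test, use `expect_count 429 5`.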
### SC-003: TLS Termination
```bash
curl -k -s -o /dev/null -w "%{http_code}" https://localhost:8443/api/tasks
```
**Output:**
```
200
```
**Result:** PASS
### SC-004: Autoscaling Active
```bash
kubectl get scaledobject task-api-scaler -n task-api
```
**Output:**
```
NAME              SCALETARGETKIND      SCALETARGETNAME   MIN   MAX   TRIGGERS     READY   ACTIVE
task-api-scaler   apps/v1.Deployment   task-api          1     10    prometheus   True    True
```
**Result:** PASS (READY=True, MIN=1, MAX=10)
### SC-005: Versioned API Paths
```bash
curl -s http://localhost:8080/api/v1/tasks | jq .version
```
**Output:**
```
"v1"
```
**Result:** PASS
### SC-006: Circuit Breaker Protection
```bash
kubectl get backendtrafficpolicy task-api-protection -n task-api -o yaml | grep -A3 circuitBreaker
```
**Output:**
```
circuitBreaker:
  maxConnections: 50
  maxParallelRequests: 25
  maxPendingRequests: 5
```
**Result:** PASS
### SC-007: Skill Produces Valid Configuration
```bash
kubectl apply --dry-run=client -f generated-config.yaml
```
**Output:**
```
gateway.gateway.networking.k8s.io/task-api-gateway created (dry run)
httproute.gateway.networking.k8s.io/task-api-route created (dry run)
backendtrafficpolicy.gateway.envoyproxy.io/task-api-protection created (dry run)
scaledobject.keda.sh/task-api-scaler created (dry run)
```
**Result:** PASS
### Validation Summary
| Criterion | Result |
|---|---|
| SC-001: External Access | PASS |
| SC-002: Rate Limiting | PASS |
| SC-003: TLS Termination | PASS |
| SC-004: Autoscaling Active | PASS |
| SC-005: Versioned API Paths | PASS |
| SC-006: Circuit Breaker | PASS |
| SC-007: Skill Valid Output | PASS |
All success criteria met. Your Task API has production-ready traffic engineering.
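Because the criteria are meant to be rechecked whenever the configuration evolves, it can help to collect them in one runner that prints the same summary table. The sketch below uses a hypothetical `run_check` helper that runs a command, compares its stdout to an expected string, and emits a table row. The commented example commands are the ones from the specification.

```shell
failures=0

# run_check <criterion> <expected-stdout> <command...>
# Prints a Markdown table row and counts failures in $failures.
run_check() {
  name="$1"; expected="$2"; shift 2
  actual=$("$@" 2>/dev/null) || true
  if [ "$actual" = "$expected" ]; then
    echo "| ${name} | PASS |"
  else
    echo "| ${name} | FAIL (got: ${actual:-nothing}) |"
    failures=$((failures + 1))
  fi
}

# Example wiring (uncomment once the cluster is reachable):
# run_check "SC-001: External Access" 200 \
#   curl -s -o /dev/null -w '%{http_code}' http://localhost:8080/api/tasks
# run_check "SC-003: TLS Termination" 200 \
#   curl -k -s -o /dev/null -w '%{http_code}' https://localhost:8443/api/tasks
# [ "$failures" -eq 0 ]
```

Checks whose output needs parsing, such as the rate-limit counts, fit the same shape if the command is wrapped in a small function that prints a single value.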
## Phase 5: Finalize Your Skill
Your traffic-engineer skill has accumulated patterns from all 12 lessons. Review and finalize it.
### Complete Decision Tree
Your skill should include this decision logic:
| Question | If Yes | If No |
|---|---|---|
| Need external access? | Configure Gateway + HTTPRoute | Use ClusterIP Service |
| Multiple API versions? | Path-based routing in HTTPRoute | Single backend |
| Per-user quotas needed? | Header-based rate limiting | Global rate limit |
| Protect backend from overload? | Add circuit breaker | Skip for low-traffic |
| TLS required? | Configure listener with certificateRefs | HTTP only |
| Event-driven workload? | KEDA with appropriate scaler | HPA on CPU |
| Cost-sensitive? | KEDA with scale-to-zero | Minimum replicas |
| LLM/AI traffic? | Envoy AI Gateway with token limits | Standard rate limiting |
### Complete Template Library
Your skill should include templates for:
1. Gateway with HTTP and HTTPS:
```yaml
# Template: gateway-dual-listener
apiVersion: gateway.networking.k8s.io/v1
kind: Gateway
metadata:
  name: {{ name }}-gateway
  namespace: {{ namespace }}
spec:
  gatewayClassName: eg
  listeners:
    - name: http
      protocol: HTTP
      port: {{ http_port | default(80) }}
    - name: https
      protocol: HTTPS
      port: {{ https_port | default(443) }}
      tls:
        mode: Terminate
        certificateRefs:
          - name: {{ tls_secret }}
```
2. HTTPRoute with versioned paths:
```yaml
# Template: httproute-versioned
apiVersion: gateway.networking.k8s.io/v1
kind: HTTPRoute
metadata:
  name: {{ name }}-route
  namespace: {{ namespace }}
spec:
  parentRefs:
    - name: {{ gateway }}
  rules:
    - matches:
        - path:
            type: PathPrefix
            value: /api/v1
      backendRefs:
        - name: {{ service }}-v1
          port: {{ port }}
    - matches:
        - path:
            type: PathPrefix
            value: /api/v2
      backendRefs:
        - name: {{ service }}-v2
          port: {{ port }}
```
3. BackendTrafficPolicy with rate limiting and circuit breaker:
```yaml
# Template: backend-traffic-policy-complete
apiVersion: gateway.envoyproxy.io/v1alpha1
kind: BackendTrafficPolicy
metadata:
  name: {{ name }}-protection
  namespace: {{ namespace }}
spec:
  targetRefs:
    - group: gateway.networking.k8s.io
      kind: HTTPRoute
      name: {{ route }}
  rateLimit:
    local:
      rules:
        - clientSelectors:
            - headers:
                - name: x-user-id
                  value: "*"
          limit:
            requests: {{ authenticated_limit | default(100) }}
            unit: Minute
        - clientSelectors:
            - headers:
                - name: x-user-id
                  value: "*"
                  invert: true
          limit:
            requests: {{ anonymous_limit | default(10) }}
            unit: Minute
  circuitBreaker:
    maxConnections: {{ max_connections | default(50) }}
    maxParallelRequests: {{ max_parallel | default(25) }}
    maxPendingRequests: {{ max_pending | default(5) }}
```
4. KEDA ScaledObject:
```yaml
# Template: scaledobject-prometheus
apiVersion: keda.sh/v1alpha1
kind: ScaledObject
metadata:
  name: {{ name }}-scaler
  namespace: {{ namespace }}
spec:
  scaleTargetRef:
    name: {{ deployment }}
  minReplicaCount: {{ min_replicas | default(1) }}
  maxReplicaCount: {{ max_replicas | default(10) }}
  pollingInterval: {{ polling_interval | default(15) }}
  cooldownPeriod: {{ cooldown | default(300) }}
  triggers:
    - type: prometheus
      metadata:
        serverAddress: {{ prometheus_url }}
        metricName: {{ metric_name }}
        query: {{ query }}
        threshold: "{{ threshold }}"
```
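How these placeholders get filled is left to your skill. Purely as an illustration, the sketch below renders plain `{{ key }}` placeholders with sed. The `render_template` name and its `key=value` argument format are assumptions, it does not implement the `| default(...)` filter (that needs a real template engine), and values must not contain `|`, `&`, or backslashes because sed treats them specially here.

```shell
# Hypothetical renderer: render_template <file> key=value [key=value ...]
# Replaces each "{{ key }}" (spaces inside the braces optional) with value.
render_template() (
  file="$1"; shift
  script=""
  for pair in "$@"; do
    key=${pair%%=*}     # text before the first "="
    value=${pair#*=}    # text after the first "="
    script="${script}s|{{ *${key} *}}|${value}|g;"
  done
  sed -e "$script" "$file"
)
```

A call such as `render_template scaledobject.tpl name=task-api namespace=task-api` (with a hypothetical template file) writes the rendered manifest to stdout, ready to pipe into `kubectl apply --dry-run=client -f -`.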
### Troubleshooting Guide
Add this to your skill:
| Symptom | Check | Likely Cause |
|---|---|---|
| 404 on all requests | HTTPRoute parentRefs correct? | Gateway name mismatch |
| No rate limiting | BackendTrafficPolicy targetRefs? | Route name mismatch |
| TLS handshake fails | Secret exists? Certificate valid? | Missing or expired cert |
| KEDA not scaling | Prometheus query returns data? | Query syntax or no metrics |
| 503 under low load | Circuit breaker too aggressive? | Increase maxParallelRequests |
| Cold start too slow | minReplicaCount = 0? | Set minimum 1 for availability |
### Skill Verification
Run this test to verify your skill handles the complete workflow:
```bash
# Ask skill to generate complete config from spec
# Then validate
kubectl apply --dry-run=client -f complete-config.yaml
# All resources should pass validation
```
### Reflect on Your Skill
Your traffic-engineer skill is now complete. It should handle:
- Gateway setup (single and dual listener)
- HTTPRoute configuration (paths, headers, query params)
- Rate limiting (global, per-user, tiered)
- Circuit breaker protection
- TLS termination (with self-signed or CA certs)
- Traffic splitting (canary, blue-green, A/B)
- Autoscaling (HPA, KEDA with various scalers)
- LLM traffic (token-based limiting, provider fallback)
Consider what edge cases need documentation:
- Multi-tenant deployments: How do you isolate traffic between tenants?
- Gradual rollouts: What traffic split percentages work for your canary process?
- Disaster recovery: How quickly can you fail over to a secondary backend?
This skill becomes part of your Digital FTE portfolio. When deploying future AI agents, invoke this skill to generate production-ready traffic configurations in minutes rather than hours.
### Submit Your Skill
Your traffic-engineer skill is ready for your skills portfolio:
```bash
# Verify skill structure
ls -la .claude/skills/traffic-engineer/
# Expected:
# SKILL.md - Core patterns and decision trees
# templates/ - YAML templates for all resources
# references/ - Links to official documentation
```
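The expected layout can also be checked mechanically before committing. This is a minimal sketch with a hypothetical `check_skill_layout` helper; it only tests that the three entries listed above exist and says nothing about their contents.

```shell
# Hypothetical helper: check_skill_layout <skill-dir>
# Reports which of the expected entries are missing.
check_skill_layout() (
  dir="$1"
  status=0
  [ -f "$dir/SKILL.md" ]   || { echo "missing: SKILL.md"; status=1; }
  [ -d "$dir/templates" ]  || { echo "missing: templates/"; status=1; }
  [ -d "$dir/references" ] || { echo "missing: references/"; status=1; }
  exit "$status"
)
```

Running `check_skill_layout .claude/skills/traffic-engineer` prints nothing and returns 0 when the structure is complete.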
Commit your finalized skill:
```bash
cd .claude/skills/traffic-engineer
git add .
git commit -m "feat(skill): finalize traffic-engineer with complete patterns"
```
You now own a production-tested traffic engineering skill that transforms any AI agent deployment from "pod running in cluster" to "production service ready for users."