Updated Feb 10, 2026

AI-Assisted GitOps Workflows

You've learned ArgoCD architecture, ApplicationSets, secrets management, and multi-cluster patterns manually. You can write manifests, reason about sync strategies, and debug deployment issues. Now you're ready for the next layer: using AI as a collaborator to generate sophisticated GitOps configurations that would take hours to write by hand.

This lesson teaches a critical skill: evaluating and refining AI-generated manifests. Claude can generate working ArgoCD configurations in seconds, but that output needs your domain knowledge to become production-ready.

Why AI Helps With GitOps

GitOps configurations are highly structured YAML where small mistakes have large consequences. A typo in a sync policy, a missing imagePullSecret, or incorrect resource ordering can break deployments.
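
A single character is enough to take down a rollout. The sketch below is illustrative (the names are hypothetical): the manifest is valid YAML and applies without complaint, but every pod sits in ImagePullBackOff because no secret with the misspelled name exists in the namespace.

# Pod template spec (fragment, illustrative)
spec:
  imagePullSecrets:
    - name: ghcr-creds # typo: the secret in the namespace is actually ghcr-credentials
  containers:
    - name: fastapi-agent
      image: ghcr.io/your-org/fastapi-agent:sha-abc1234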

AI excels at:

  • Boilerplate generation — ApplicationSets with matrix generators, complex sync strategies
  • Multi-environment templates — Dev/staging/prod variations that differ only in replicas and registries
  • Manifest composition — Combining Helm values, ConfigMaps, and ArgoCD policies into coherent configurations
  • Pattern recognition — Suggesting sync strategies or health checks you might not have considered

But AI has no visibility into:

  • Your actual cluster topology and names
  • Registry credentials and pull secret names
  • Environment-specific constraints (storage classes, ingress controllers)
  • Your organization's naming conventions and security policies

This is where you come in. You provide constraints, validate assumptions, and catch environment-specific mistakes that AI can't know about.

When to Use AI for GitOps

Ask yourself these questions:

Use AI if:

  • The configuration is complex (3+ environments, multiple deployment patterns)
  • You're using unfamiliar features (Argo Rollouts integration, advanced sync waves)
  • You're generating boilerplate that follows a pattern you've defined
  • You need to quickly explore design options

Don't rely on AI if:

  • The manifest is simple (single application, one cluster)
  • You're unsure what the manifest should do (write the spec first, then ask for the manifest)
  • The configuration involves undocumented internal systems
  • You haven't validated cluster names, registries, or credentials

Critical Evaluation: What to Check

When Claude generates a manifest, you are not accepting it as gospel. You're evaluating it against your environment.

Checklist: Validate AI Output

  1. Cluster references — Does the manifest use YOUR cluster names?

    # Wrong (generic example)
    destination:
      server: https://kubernetes.default.svc

    # Right (your actual cluster)
    destination:
      server: https://prod-eks-cluster.example.com
  2. Registry credentials — Are imagePullSecrets correct for your registries?

    # Check: Does my cluster have this secret?
    imagePullSecrets:
    - name: ghcr-credentials # Must exist in the namespace
  3. Namespace alignment — Does every resource deploy to the right namespace?

    # Validate: argocd/default/target namespace consistency
    metadata:
      namespace: production-agents # Must match destination.namespace
  4. Resource limits — Are requests/limits appropriate for your workloads?

    resources:
      requests:
        memory: "256Mi" # Is this realistic for a FastAPI agent?
        cpu: "250m"
      limits:
        memory: "512Mi" # Or too generous?
        cpu: "500m"
  5. Sync strategies — Does auto-sync make sense for this environment?

    syncPolicy:
      automated:
        prune: true # Safe for dev? Dangerous for prod.
        selfHeal: true
  6. Health assessment — Does the health check match your service?

    # AI might assume HTTP health checks for all services
    # Your database might need different health criteria
    healthChecks:
    - type: Application # Correct for most deployments
  7. Variable substitution — Are placeholders actually filled in?

    # Wrong (template not rendered)
    image: ghcr.io/organization/agent:{{ version }}

    # Right (AI should fill this from your context)
    image: ghcr.io/organization/agent:sha-abc1234
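
Putting the checklist together, a manifest that passes these checks might look like the following sketch for a dev environment. The repository URL, registry, secret name, and namespace are illustrative placeholders, and the Helm value keys (image, imagePullSecrets, resources) assume your chart exposes them; substitute your own values before using it.

apiVersion: argoproj.io/v1alpha1
kind: Application
metadata:
  name: fastapi-agent-dev
  namespace: argocd
spec:
  project: default
  source:
    repoURL: https://github.com/your-org/helm-charts # your chart repository
    chart: fastapi-agent
    targetRevision: main
    helm:
      valuesObject:
        image:
          repository: ghcr.io/your-org/fastapi-agent # your registry
          tag: sha-abc1234 # pinned tag, not a template placeholder or 'latest'
        imagePullSecrets:
          - name: ghcr-credentials # must already exist in the target namespace
        resources:
          requests:
            memory: "256Mi"
            cpu: "250m"
          limits:
            memory: "512Mi"
            cpu: "500m"
  destination:
    server: https://kubernetes.default.svc # in-cluster server; fine for local dev, replace for real clusters
    namespace: agents-dev # must match where the workload is expected to run
  syncPolicy:
    automated: # reasonable for dev; consider manual sync for prod
      prune: true
      selfHeal: true
    syncOptions:
      - CreateNamespace=true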

Example: AI-Generated Manifest with Issues

Here's what Claude might generate for a multi-environment deployment:

---
apiVersion: argoproj.io/v1alpha1
kind: ApplicationSet
metadata:
  name: fastapi-agent-multi-env
spec:
  generators:
    - matrix:
        generators:
          - list:
              elements:
                - name: dev
                  cluster: minikube
                  replicas: 1
                  image_tag: latest
                - name: staging
                  cluster: staging-cluster
                  replicas: 3
                  image_tag: v1.2.3
                - name: prod
                  cluster: prod-cluster
                  replicas: 5
                  image_tag: v1.2.3
          - list:
              elements:
                - registry: docker.io
                - registry: ghcr.io

  template:
    metadata:
      name: fastapi-agent-{{name}}-{{registry}}
    spec:
      project: default
      source:
        repoURL: https://github.com/your-org/helm-charts
        chart: fastapi-agent
        targetRevision: HEAD
        helm:
          valuesObject:
            replicaCount: '{{replicas}}'
            image:
              repository: '{{registry}}/your-org/fastapi-agent'
              tag: '{{image_tag}}'
            ingress:
              enabled: true
              className: nginx

      destination:
        server: https://{{cluster}}.example.com
        namespace: agents

      syncPolicy:
        automated:
          prune: true
          selfHeal: true
        syncOptions:
          - CreateNamespace=true

Problems to identify:

  1. Matrix generator — Combining environment with registry creates unwanted combinations (dev with ghcr, prod with docker.io)
  2. Cluster server — {{cluster}}.example.com won't resolve; needs actual cluster URLs
  3. Image registry mismatch — Prod probably shouldn't pull from docker.io; dev might not use ghcr
  4. Auto-sync in prod — prune: true in production is risky without additional safeguards
  5. No imagePullSecrets — ghcr and private registries need credentials
  6. Missing namespace — No validation that agents namespace exists

This is where you step in. You know these constraints. Claude doesn't.

Teaching Claude Your Constraints

The collaborative process works like this:

Round 1: Initial Generation

You ask Claude to generate the manifest. You get a structurally correct starting point that's missing your environment-specific details.

Round 2: Share Your Constraints

You respond with your actual constraints:

"The cluster setup is different. We have:

  • Dev cluster: minikube (local)
  • Staging: 'staging-eks' (AWS EKS)
  • Prod: 'prod-eks' (AWS EKS, requires IAM auth)

Dev can use public registries (docker.io), but staging and prod MUST use private ECR. The ECR registry is 123456789.dkr.ecr.us-east-1.amazonaws.com

All clusters have imagePullSecrets named 'ecr-credentials' except dev (which doesn't need one).

Auto-sync is okay for dev and staging, but prod MUST be manual sync only."

Claude doesn't know these constraints. But once you tell it, it can adapt immediately.

Round 3: Refined Manifest

Claude regenerates the manifest accounting for your constraints:

---
apiVersion: argoproj.io/v1alpha1
kind: ApplicationSet
metadata:
  name: fastapi-agent-multi-env
spec:
  generators:
    - list:
        elements:
          - name: dev
            cluster: minikube
            server: https://kubernetes.default.svc # Local minikube
            replicas: 1
            registry: docker.io
            pullSecret: null
            autoSync: "true"

          - name: staging
            cluster: staging-eks
            server: https://staging-eks.example.com
            replicas: 3
            registry: 123456789.dkr.ecr.us-east-1.amazonaws.com
            pullSecret: ecr-credentials
            autoSync: "true"

          - name: prod
            cluster: prod-eks
            server: https://prod-eks.example.com
            replicas: 5
            registry: 123456789.dkr.ecr.us-east-1.amazonaws.com
            pullSecret: ecr-credentials
            autoSync: "false"

  template:
    metadata:
      name: fastapi-agent-{{name}}
    spec:
      project: default
      source:
        repoURL: https://github.com/your-org/helm-charts
        chart: fastapi-agent
        targetRevision: main
        helm:
          valuesObject:
            replicaCount: '{{replicas}}'
            image:
              repository: '{{registry}}/your-org/fastapi-agent'
              tag: sha-abc1234
            imagePullSecrets:
              {{#pullSecret}}- name: {{pullSecret}}{{/pullSecret}}
            resources:
              requests:
                memory: "256Mi"
                cpu: "250m"
              limits:
                memory: "512Mi"
                cpu: "500m"

      destination:
        server: '{{server}}'
        namespace: agents

      syncPolicy:
        {{#if autoSync == "true"}}
        automated:
          prune: true
          selfHeal: true
        {{/if}}
        syncOptions:
          - CreateNamespace=true
          - PruneLast=true


The manifest now:

  • Uses your actual cluster servers
  • Applies imagePullSecrets only where needed (staging, prod)
  • Makes auto-sync conditional (true for dev/staging, false for prod)
  • Specifies the correct registry for each environment
  • Uses a specific image tag (sha-abc1234) instead of latest

This is still a template, but it's now grounded in your environment.

Iterative Refinement

The collaborative process doesn't stop there. As Claude generates output, you ask clarifying questions:

Your question after seeing the refined manifest:

"The pull secret conditional syntax looks like Kustomize, not Helm. ArgoCD uses standard Helm templating. How would you rewrite the imagePullSecrets to work with ArgoCD's Helm integration?"

Claude explains and corrects:

# Cleaner approach: default to an empty list in the base values.yaml
imagePullSecrets: [] # Empty in dev, populated in values-staging.yaml and values-prod.yaml

Then you validate the fix by checking if it matches your Helm chart expectations.
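
One hedged way to realize that, assuming your chart reads an imagePullSecrets value and you keep one values file per environment, is to leave the list empty in the base file and populate it only where credentials are required:

# values.yaml (base, used by dev)
imagePullSecrets: []

# values-staging.yaml
imagePullSecrets:
  - name: ecr-credentials

# values-prod.yaml
imagePullSecrets:
  - name: ecr-credentials

# In the ApplicationSet template, select the file per environment
helm:
  valueFiles:
    - values.yaml
    - values-{{name}}.yaml
  ignoreMissingValueFiles: true # dev has no override file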

You push back when Claude makes assumptions:

"You suggest using default StorageClass for all environments. But dev uses emptyDir, staging uses ebs-gp3, and prod uses ebs-io2 (expensive, high-performance). How do you handle per-environment storage class selection?"

Claude offers solutions:

# Approach 1: Pass storageClassName through Helm values
values:
  persistence:
    storageClassName: '{{storage_class}}'

# Approach 2: Use an ArgoCD sync wave to create environment-specific PVCs first
# Approach 3: Use Kustomize patches to override the storage class per environment

You choose the approach that fits your constraints.
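
If you went with Approach 1, the per-environment value can ride along in the generator elements. This fragment is a sketch: it assumes your chart wires persistence.storageClassName into its PersistentVolumeClaim, and it leaves dev out entirely because dev uses emptyDir.

# ApplicationSet fragment (illustrative)
generators:
  - list:
      elements:
        - name: staging
          storage_class: ebs-gp3
        - name: prod
          storage_class: ebs-io2
template:
  spec:
    source:
      helm:
        valuesObject:
          persistence:
            storageClassName: '{{storage_class}}'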

Testing Before Deploying

Claude can generate manifests, but you must validate them before applying:

# Step 1: Validate against the API server, then apply the ApplicationSet
kubectl apply --dry-run=server -f manifest.yaml
kubectl apply -f manifest.yaml

# Step 2: Check what ArgoCD would deploy
argocd app diff fastapi-agent-dev

# Step 3: Verify the ApplicationSet generates correct Applications
kubectl get Application -n argocd
# Expected output shows 3 Applications: one per environment

# Step 4: Sync and monitor
argocd app sync fastapi-agent-dev
argocd app wait fastapi-agent-dev

# Step 5: Validate actual deployment
kubectl get pods -n agents
kubectl logs -n agents -l app=fastapi-agent --tail=50

Each step confirms that Claude's generated manifest actually works in your environment.


Try With AI: Multi-Environment GitOps Deployment

Now practice this collaborative pattern yourself.

Setup

Have these ready:

  • Your actual cluster names and server URLs
  • Your container registry (Docker Hub, ECR, GHCR)
  • Environment differences (dev/staging/prod): replicas, resources, auto-sync preferences
  • Your team's naming conventions

Part 1: Initial Request

Ask Claude to generate an ApplicationSet:

Generate an ApplicationSet for deploying our FastAPI agent to dev, staging, and prod.

Here's what we have:
- Dev: local Minikube cluster, public registries allowed
- Staging: AWS EKS cluster, private ECR registry
- Prod: AWS EKS cluster, private ECR, high-availability requirements

The agent needs:
- Dev: 1 replica, 256Mi memory
- Staging: 3 replicas, 512Mi memory
- Prod: 5 replicas, 1Gi memory, managed node group with specific labels

Each environment has different:
- Cluster names
- Registry credentials
- Auto-sync policies (dev and staging can auto-sync; prod must be manual)

Generate a manifest that handles these variations. Use a single ApplicationSet with a list or matrix generator.

Part 2: Critical Evaluation

Review Claude's output. Ask yourself:

  • Does this handle my cluster topology?

    • Are cluster server URLs correct? (Not just placeholders)
    • Does it reference my actual namespaces?
  • What assumptions did Claude make?

    • Pull secrets: Are they named correctly in my clusters?
    • Resource limits: Do they match my infrastructure?
    • Auto-sync: Did it apply the right policies to each environment?
  • What would fail in my environment?

    • Are there any hardcoded values that don't match my setup?
    • Does the registry URL match my actual ECR or registry?
    • Will the ConfigMap or Secret fields work with my secrets backend?

Write down 2-3 specific concerns before moving to the next part.

Part 3: Share Your Constraints

Tell Claude your actual constraints:

The manifest is close, but I need adjustments for our actual environment:

- Dev cluster is 'minikube' (local, server: https://kubernetes.default.svc)
- Staging is 'staging-eks' in AWS (server: https://staging-eks-cluster-xyz.eks.amazonaws.com)
- Prod is 'prod-eks' in AWS (server: https://prod-eks-cluster-abc.eks.amazonaws.com)

Registry setup:
- Dev: Uses docker.io (no credentials needed)
- Staging: Uses 123456789.dkr.ecr.us-east-1.amazonaws.com (needs 'ecr-staging' secret)
- Prod: Uses 123456789.dkr.ecr.us-east-1.amazonaws.com (needs 'ecr-prod' secret)

Auto-sync policy:
- Dev: Fully automatic (prune: true, selfHeal: true)
- Staging: Automatic, but without pruning (prune: false, selfHeal: true)
- Prod: Manual only (no automated sync)

We also need to handle node affinity: Staging and prod both require the FastAPI workload to run only on nodes labeled 'workload: agents'. Dev has no such requirement.

Can you regenerate the ApplicationSet with these specifics?

Part 4: Refinement

After Claude regenerates, ask a clarifying question about an assumption:

You're using imagePullSecrets in the values, but I'm not sure that's the right place.
Should the secret reference live in the Helm values.yaml, or should the secret itself be created
in the namespace by ArgoCD first, for example in an earlier sync wave?

Also, for the node affinity in staging/prod: is it better to set that in the ApplicationSet
or in the Helm chart's values.yaml? We want to be able to change it per environment without
modifying the chart.

Claude explains the tradeoffs. You decide which approach fits your workflow.
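
For example, if you keep scheduling in the chart and vary it per environment, the staging and prod value files could pin the workload to labeled nodes while dev omits the block entirely. This is a sketch and assumes your chart passes nodeSelector (or affinity) through to the pod spec, as charts scaffolded with helm create do:

# values-staging.yaml and values-prod.yaml (values-dev.yaml leaves this out)
nodeSelector:
  workload: agents

# Equivalent hard requirement expressed as node affinity, if your chart exposes it
affinity:
  nodeAffinity:
    requiredDuringSchedulingIgnoredDuringExecution:
      nodeSelectorTerms:
        - matchExpressions:
            - key: workload
              operator: In
              values:
                - agents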

Part 5: Validation

Apply the final manifest and validate:

# Create the ApplicationSet
kubectl apply -f applicationset-fastapi-agent.yaml

# Verify it generated 3 Applications (one per environment)
argocd app list | grep fastapi-agent
# Expected: 3 rows (fastapi-agent-dev, fastapi-agent-staging, fastapi-agent-prod)

# Check sync status
argocd app get fastapi-agent-dev
argocd app get fastapi-agent-staging
argocd app get fastapi-agent-prod

# Verify deployment across environments
kubectl get pods -n agents -o wide --context=minikube
kubectl get pods -n agents -o wide --context=staging-eks
kubectl get pods -n agents -o wide --context=prod-eks

# Confirm replicas match expectations
kubectl get deployments -n agents -o wide --context=minikube
# Expected: 1 replica in dev
kubectl get deployments -n agents -o wide --context=staging-eks
# Expected: 3 replicas in staging
kubectl get deployments -n agents -o wide --context=prod-eks
# Expected: 5 replicas in prod

Expected Outcome

Through this 5-part process, you've:

  • Learned Claude's strengths — Generating boilerplate manifests quickly
  • Identified its gaps — Lack of visibility into your environment
  • Provided your constraints — Translating domain knowledge into specific requirements
  • Validated the output — Confirmed it actually works in your clusters

Reflect on the collaboration:

  • What did Claude suggest that you hadn't considered?
  • What mistakes would the first version have caused?
  • Which constraints would have been hardest to debug if they were wrong?

These reflections become the foundation for your next use of AI with GitOps.


Reflect on Your Skill

You built a gitops-deployment skill in Lesson 0. Test and improve it based on what you learned.

Test Your Skill

Using my gitops-deployment skill, generate a complete multi-environment ApplicationSet with all the patterns from this chapter.
Does my skill produce production-ready manifests with proper validation?

Identify Gaps

Ask yourself:

  • Did my skill include all the concepts: sync waves, hooks, secrets, multi-cluster, RBAC, notifications?
  • Did it validate cluster names, registries, and environment-specific constraints?

Improve Your Skill

If you found gaps:

My gitops-deployment skill generates basic manifests but misses advanced patterns.
Review all 16 lessons and update the skill to include comprehensive GitOps workflows with all safety checks.