
Phase 12 — GitOps with ArgoCD

GitOps replaces manual kubectl apply and helm install with continuous reconciliation against a Git repository. The repo becomes the single source of truth; ArgoCD watches it and brings the cluster back to that state every few minutes — automatically.

After this phase the question "if I lost the cluster, how would I rebuild it?" has a one-command answer: kubectl apply -f bootstrap/root-app.yaml, and ArgoCD recreates everything else.

This phase is also the most disruptive so far — ArgoCD starts managing live workloads. We migrated Homer (low-risk, stateless) and added a new greenfield whoami demo, but deliberately left the cluster-critical workloads (Ingress, MetalLB, Harbor, Prometheus, podinfo) helm-managed until later phases.


Architecture

GitHub: andrelair-platform/minicloud-gitops
┌──────────────────────────────────────────┐
│ bootstrap/root-app.yaml  (App-of-Apps)   │
│ apps/homer.yaml                          │
│ apps/whoami.yaml                         │
│ manifests/homer/   (5 YAMLs)             │
│ manifests/whoami/  (4 YAMLs)             │
└────────────────┬─────────────────────────┘
                 │ git pull every 3 min
                 ▼
┌────────────────────────────────────────────┐
│ ArgoCD (in-cluster, namespace argocd)      │
│ ┌──────────────────────────────────────┐   │
│ │ Application "root" (App-of-Apps)     │   │
│ │   ↓ syncs apps/                      │   │
│ │ Application "homer"                  │   │
│ │   ↓ syncs manifests/homer/           │   │
│ │ Application "whoami"                 │   │
│ │   ↓ syncs manifests/whoami/          │   │
│ └──────────────────────────────────────┘   │
│ Reconciles → Kubernetes API every 3 min    │
└────────────────────────────────────────────┘


Cluster state matches git

UI at http://argocd.10.0.0.200.nip.io; log in as admin with the password stored in ~/.argocd-admin.
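
The same credentials also work from the argocd CLI, if it happens to be installed on the controller (the CLI is not part of this phase; the flags below assume the plain-HTTP Ingress configured in this phase):

argocd login argocd.10.0.0.200.nip.io \
  --username admin \
  --password "$(cat ~/.argocd-admin)" \
  --plaintext --grpc-web   # --plaintext matches server.insecure; --grpc-web works through the HTTP Ingress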


Decisions

| Decision | Choice | Rationale |
|---|---|---|
| Apps migrated to ArgoCD | Homer only + new whoami demo | Smallest blast radius — Homer is stateless, a single ConfigMap + Deployment + Service + Ingress |
| Apps left helm-managed | ingress-nginx, MetalLB, Harbor, kube-prometheus-stack, podinfo | Critical-path or stateful; migrating them deserves its own phase. Phase 9's HPA test rig stays intact. |
| Pattern | App-of-Apps | Industry standard; one root Application bootstraps the whole platform |
| Sync policy | automated: prune+selfHeal=true, syncOptions=[CreateNamespace, ServerSideApply] | Real GitOps — drift gets corrected automatically. ServerSideApply tracks ownership cleanly during adoption. |
| Auth | Built-in admin from auto-generated argocd-initial-admin-secret, dumped to ~/.argocd-admin | Same out-of-band pattern as Harbor / Grafana. Keycloak SSO comes in Phase 15. |
| UI access | argocd.10.0.0.200.nip.io via existing NGINX Ingress (HTTP, server.insecure=true) | Matches Homer / Harbor / Grafana / podinfo pattern. TLS in Phase 15. |
| GitOps repo | New public andrelair-platform/minicloud-gitops | Public is fine — no secrets in this repo. Anything sensitive lives in ~/.* files on the controller. |
| Repo location | Separate repo (not a monorepo) | Each repo has one purpose, clean ArgoCD path, independent CI |

Pre-flight

# Add the chart repo
helm repo add argo https://argoproj.github.io/argo-helm
helm repo update argo

# Confirm chart and app version
helm search repo argo/argo-cd
# argo/argo-cd 9.5.13 v3.4.1
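
Optionally, dump the chart's full default values before writing the override file; this is just a convenience for confirming key names such as configs.params and server.ingress:

helm show values argo/argo-cd > /tmp/argo-cd-defaults.yaml
grep -n 'insecure' /tmp/argo-cd-defaults.yaml   # the server.insecure param lives under configs.params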

values.yaml

argocd-values.yaml:

global:
  domain: argocd.10.0.0.200.nip.io

configs:
  params:
    server.insecure: true        # HTTP behind the Ingress; TLS in Phase 15
  cm:
    timeout.reconciliation: 180s

server:
  # F5 NGINX Ingress gets confused when the chart's default Service has
  # both http (port 80 -> 8080) and https (port 443 -> 8080). The
  # duplicate targetPorts make endpointslice resolution fail with
  # "no endpointslices for target port 8080". We later patched the
  # generated Ingress to use port name "http" instead of port number 80,
  # which avoids the issue.
  service:
    type: ClusterIP
    servicePortHttp: 80
    servicePortHttpName: http
    servicePortHttps: 443
    servicePortHttpsName: https
  ingress:
    enabled: true
    ingressClassName: nginx
    hostname: argocd.10.0.0.200.nip.io
    # NB: do NOT add nginx.org/proxy-buffer-size here — see gotcha #1 below

  resources:
    requests: { cpu: 50m, memory: 128Mi }
    limits: { cpu: 500m, memory: 512Mi }

repoServer:
  resources:
    requests: { cpu: 50m, memory: 128Mi }
    limits: { cpu: 500m, memory: 512Mi }

controller:
  resources:
    requests: { cpu: 100m, memory: 256Mi }
    limits: { cpu: 1000m, memory: 1Gi }

applicationSet:
  resources:
    requests: { cpu: 25m, memory: 64Mi }
    limits: { cpu: 250m, memory: 256Mi }

# Keep things lean — Phase 15 will add Keycloak SSO; Phase 21 will add notifications.
dex:
  enabled: false
notifications:
  enabled: false

redis:
  resources:
    requests: { cpu: 25m, memory: 64Mi }
    limits: { cpu: 250m, memory: 128Mi }
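
Before installing, the chart can be rendered locally as a sanity check on the override file (optional; the output path is arbitrary):

helm template argo-cd argo/argo-cd -n argocd -f argocd-values.yaml \
  > /tmp/argocd-rendered.yaml
grep -c '^kind:' /tmp/argocd-rendered.yaml   # rough count of objects the release will create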

Install

kubectl create namespace argocd
helm install argo-cd argo/argo-cd -n argocd \
  -f argocd-values.yaml \
  --wait --timeout 10m

# Capture the auto-generated initial admin password (length ~16 chars)
kubectl -n argocd get secret argocd-initial-admin-secret \
  -o jsonpath='{.data.password}' | base64 -d > ~/.argocd-admin
chmod 600 ~/.argocd-admin

Verify:

kubectl get pods -n argocd
# 5 pods: server, controller, applicationset, repo-server, redis

curl -sI -m 5 http://argocd.10.0.0.200.nip.io/
# HTTP/1.1 200 OK
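
To confirm the captured password works without opening the UI, the same session endpoint the UI uses can be hit directly (a sketch; assumes jq is installed on the controller):

curl -s http://argocd.10.0.0.200.nip.io/api/v1/session \
  -H 'Content-Type: application/json' \
  -d "{\"username\": \"admin\", \"password\": \"$(cat ~/.argocd-admin)\"}" \
  | jq -r '.token' | wc -c
# a non-trivial length means a JWT came back and the login succeeded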

GitOps repo layout

minicloud-gitops/
├── README.md
├── bootstrap/
│   └── root-app.yaml        # the App-of-Apps Application
├── apps/                    # one Application per child app
│   ├── homer.yaml
│   └── whoami.yaml
└── manifests/
    ├── homer/               # Namespace, ConfigMap, Deployment, Service, Ingress
    └── whoami/              # Namespace, Deployment, Service, Ingress
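
For reference, whoami is about as small as a GitOps-managed workload gets. A minimal sketch of its Deployment and Service follows; field values such as the image tag are illustrative rather than copied verbatim from the repo (two replicas matches the alternating hostnames seen in verification):

apiVersion: apps/v1
kind: Deployment
metadata:
  name: whoami
  namespace: gitops-demo
spec:
  replicas: 2                         # two pods -> alternating hostnames in curl output
  selector:
    matchLabels: { app: whoami }
  template:
    metadata:
      labels: { app: whoami }
    spec:
      containers:
        - name: whoami
          image: traefik/whoami       # pin a specific tag in the real manifest
          ports:
            - containerPort: 80
---
apiVersion: v1
kind: Service
metadata:
  name: whoami
  namespace: gitops-demo
spec:
  selector: { app: whoami }
  ports:
    - port: 80
      targetPort: 80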

bootstrap/root-app.yaml (the App-of-Apps)

apiVersion: argoproj.io/v1alpha1
kind: Application
metadata:
  name: root
  namespace: argocd
  finalizers: [resources-finalizer.argocd.argoproj.io]
spec:
  project: default
  source:
    repoURL: https://github.com/andrelair-platform/minicloud-gitops.git
    targetRevision: main
    path: apps
    directory:
      recurse: false
  destination:
    server: https://kubernetes.default.svc
    namespace: argocd
  syncPolicy:
    automated: { prune: true, selfHeal: true }
    syncOptions:
      - CreateNamespace=true
      - ServerSideApply=true

apps/homer.yaml

apiVersion: argoproj.io/v1alpha1
kind: Application
metadata:
  name: homer
  namespace: argocd
  finalizers: [resources-finalizer.argocd.argoproj.io]
spec:
  project: default
  source:
    repoURL: https://github.com/andrelair-platform/minicloud-gitops.git
    targetRevision: main
    path: manifests/homer
  destination:
    server: https://kubernetes.default.svc
    namespace: homer
  syncPolicy:
    automated: { prune: true, selfHeal: true }
    syncOptions:
      - CreateNamespace=true
      - ServerSideApply=true

apps/whoami.yaml is identical with path manifests/whoami and namespace gitops-demo.


Bootstrap

kubectl apply -f bootstrap/root-app.yaml

Within seconds, ArgoCD discovers the apps directory and creates the homer and whoami child Applications:

$ kubectl get applications -n argocd
NAME     SYNC STATUS   HEALTH STATUS
homer    Synced        Healthy
root     Synced        Healthy
whoami   Synced        Healthy
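
The root Application's status also lists exactly which objects it manages, which is a quick CLI check that the App-of-Apps wiring worked (a jsonpath sketch; jq over -o json works just as well):

kubectl -n argocd get application root \
  -o jsonpath='{range .status.resources[*]}{.kind}{"/"}{.name}{"\n"}{end}'
# Application/homer
# Application/whoami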

Adopting Homer (no downtime)

Homer was previously deployed via raw kubectl apply, with no ArgoCD labels or annotations. Naive adoption would conflict ("resource already exists"). We avoid that by setting ServerSideApply=true in the syncOptions — server-side apply uses field managers to take over ownership cleanly without recreating resources.

After bootstrap, the existing Homer pod kept running (no restart) and ArgoCD took ownership. The tracking annotation appeared on the live ConfigMap:

$ kubectl get cm homer-config -n homer -o jsonpath='{.metadata.annotations.argocd\.argoproj\.io/tracking-id}'
homer:/ConfigMap:homer/homer-config
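
Server-side apply ownership can also be inspected via the object's field managers; the exact manager string varies by ArgoCD version, so treat the name below as indicative:

kubectl get cm homer-config -n homer --show-managed-fields \
  -o jsonpath='{range .metadata.managedFields[*]}{.manager}{"\n"}{end}'
# one entry should belong to ArgoCD's application controller (e.g. "argocd-controller")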

The git-push-to-deploy demo

Edit manifests/homer/01-configmap.yaml to change the dashboard subtitle, then push:

cd minicloud-gitops/
sed -i 's|subtitle: "Private Bare-Metal Infrastructure"|subtitle: "Private Bare-Metal Infrastructure — GitOps via ArgoCD"|' \
  manifests/homer/01-configmap.yaml
git add manifests/homer/01-configmap.yaml
git commit -m "demo: GitOps callout in Homer subtitle"
git push origin main

Watch ArgoCD pick up the new revision:

$ kubectl get app homer -n argocd -w
NAME    SYNC STATUS   HEALTH STATUS
homer   Synced        Healthy
homer   OutOfSync     Healthy        # detected at +180s
homer   Synced        Progressing    # applied
homer   Synced        Healthy        # rolled out

The first sync took ~3 min 24 s end-to-end — the configured 180 s reconciliation interval (which is also the chart default) plus a small processing delay.
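
The 180 s poll is only an upper bound. For demos, a refresh can be requested immediately, either with argocd app get homer --refresh from the CLI or, without the CLI, by annotating the Application (ArgoCD watches for this annotation and clears it once the refresh runs):

kubectl -n argocd annotate application homer argocd.argoproj.io/refresh=normal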


Two real gotchas this phase exposed

1. The nginx.org/proxy-buffer-size annotation breaks all Ingresses

Initially I added nginx.org/proxy-buffer-size: "16k" to ArgoCD's Ingress annotations as a "just in case" for the 308-redirect concern. That change made nginx -s reload fail with:

[emerg] "proxy_busy_buffers_size" must be less than the size of all
"proxy_buffers" minus one buffer

NGINX kept the previous config running, but every subsequent change (including endpoints for new pods) silently failed to apply. Result: a 404 on the new ArgoCD ingress, plus zero updates to anything else F5 NGINX manages.

Fix: remove the annotation. F5 NGINX uses default buffer config and handles ArgoCD fine without it.
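
The stuck-reload state is easy to check for once you know to look. Something like the following works, where the namespace and deployment name are assumptions to adjust to wherever the F5 NGINX Ingress controller actually runs:

kubectl logs -n nginx-ingress deploy/nginx-ingress-controller --since=1h \
  | grep -i '\[emerg\]'
# any [emerg] line means the last reload failed and NGINX is still serving the previous config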

2. ConfigMap subPath mounts + ArgoCD selfHeal == checksum annotation

Homer mounts its config like this:

volumeMounts:
  - mountPath: /www/assets/config.yml
    name: config
    subPath: config.yml

When a ConfigMap is mounted with subPath, Kubernetes does not auto-propagate changes to the file. The pod sees the old config until it restarts.
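
The staleness is visible by comparing the ConfigMap with what the running pod actually serves; the deploy/homer name and the availability of head inside the container are assumptions to adjust as needed:

kubectl -n homer get cm homer-config -o jsonpath='{.data.config\.yml}' | head -n 3
kubectl -n homer exec deploy/homer -- head -n 3 /www/assets/config.yml
# after a ConfigMap-only change the two outputs differ until the pod restarts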

The obvious fix is kubectl rollout restart, but that doesn't survive ArgoCD's selfHeal: true: ArgoCD sees the auto-injected restartedAt annotation as drift and reverts it within seconds.

The GitOps-correct fix is to bump a checksum annotation in the pod template alongside the ConfigMap change:

spec:
  template:
    metadata:
      annotations:
        config-checksum: "v3-argocd-whoami-tiles"   # bump on every config change

This changes the pod template (and with it the pod-template-hash), ArgoCD applies it as a normal sync, and the resulting rolling update picks up the new ConfigMap. Production teams automate this with Reloader or with kustomize's configMapGenerator (which generates suffixed ConfigMap names).
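
For comparison, Reloader's opt-in is a single annotation on the Deployment (not used in this phase; shown only as the automated equivalent of the manual checksum bump):

metadata:
  annotations:
    reloader.stakater.com/auto: "true"   # restart the workload whenever a referenced ConfigMap or Secret changes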


Verification (results)

| Test | Expected | Actual |
|---|---|---|
| Connectivity | curl http://argocd.10.0.0.200.nip.io/ | 200 OK |
| Login | POST /api/v1/session returns JWT | token length 257 ✓ |
| Migration | Homer ConfigMap has argocd.argoproj.io/tracking-id | homer:/ConfigMap:homer/homer-config |
| Apps healthy | kubectl get applications -n argocd | root + homer + whoami all Synced + Healthy |
| GitOps loop | Push → cluster reflects change | ~3 min 24 s ✓ |
| New workload | curl http://whoami.10.0.0.200.nip.io/ | full traefik/whoami response with 2 alternating pod hostnames ✓ |

Done When

✔ ArgoCD UI reachable at http://argocd.10.0.0.200.nip.io
✔ Admin login works with the password from ~/.argocd-admin
✔ kubectl get applications -n argocd shows root + homer + whoami all Synced + Healthy
✔ Homer was migrated without restart (existing pod kept running)
✔ A push to minicloud-gitops triggers a sync within ~3 minutes
✔ http://whoami.10.0.0.200.nip.io/ returns valid traefik/whoami output
✔ Homer dashboard has live ArgoCD and whoami tiles

What's NOT migrated to ArgoCD yet

Intentional scope. Migrating these is a future phase:

| Component | Why deferred |
|---|---|
| ingress-nginx | Critical path — if ArgoCD breaks the Ingress, half the platform goes down with it |
| metallb-system | Critical path — same as above, plus has cluster-scoped resources |
| harbor | Stateful, 5 PVCs. Adopting a stateful workload is its own playbook. |
| kube-prometheus-stack | Heavyweight chart, complex CRDs, lifecycle hooks |
| podinfo | Phase 9's HPA + custom dashboard test rig is too valuable to risk during migration |

The principle: migrate the cheapest things first, prove the workflow, then come back for the expensive things in dedicated migration phases.


Real-world skills demonstrated

| Skill | Where it applies in industry |
|---|---|
| App-of-Apps pattern | The standard ArgoCD bootstrapping pattern at every scale, from 5-app clusters to 500-app multi-tenant platforms |
| Brownfield resource adoption | Every team that adopts ArgoCD post-hoc has to deal with existing kubectl-applied resources. ServerSideApply is the cleanest path. |
| Auto-sync + selfHeal in production | Real GitOps means the cluster auto-corrects drift. The catch — see the "two gotchas" — is real production knowledge. |
| ConfigMap subPath + checksum-annotation pattern | Solves a real problem every Kubernetes shop hits. Reloader is the canonical solution; the inline checksum is the manual baseline. |
| Risk-aware migration scope | Choosing which workloads to migrate first vs. which to defer is the same calculus every platform team makes. Critical-path apps go last, not first. |
| Recognizing failed-reload symptoms | The "404 on a new Ingress that should work" diagnosis pattern — check ingress controller logs for [emerg] reload failures — is Day-2 ops 101. |
| Multi-repo org structure | One purpose per repo (minicloud-platform-docs, minicloud-ansible, minicloud-opentofu, minicloud-gitops) is the canonical setup for real platform teams. |