Phase 18: Backstage (Internal Developer Portal)
Backstage is Spotify's open-source developer-portal framework. Its core value is the Software Catalog, a unified registry of services, APIs, docs, and ownership metadata, plus a plugin system for embedding Kubernetes/ArgoCD/Grafana/etc. dashboards per service.
For a single-operator homelab like this one, Backstage is honestly portfolio-only, not operationally needed. Homer + the live docs site already do 90% of what a developer portal would. This is the same architectural question we asked at every "do we need this?" decision: GitLab in Phase 13, Crossplane in Phase 11, Vault in Phase 15, Backstage here.
Still, the skill of installing, configuring, and reasoning about an IDP is real and recruiter-recognizable. So Phase 18 ships a focused, minimal Backstage (catalog-only, no plugins, no SSO) that demonstrates the architecture without burning a weekend on plugins nobody will use.
Architecture
                        ┌───────────────────────────────────────────────┐
  Browser               │ backstage namespace                           │
    │ HTTPS             │                                               │
    │ + guest auth      │   ┌────────────────────────────┐              │
    ▼                   │   │ Backstage Pod (1 replica)  │              │
  cert-manager TLS      │   │ off-the-shelf image:       │              │
  + NGINX Ingress ─────▶│   │ ghcr.io/backstage/         │              │
                        │   │   backstage:latest         │              │
                        │   │ pulled through Harbor's    │              │
                        │   │ ghcr proxy (Phase 16)      │              │
                        │   └─────────────┬──────────────┘              │
                        │                 │                             │
                        │                 ▼                             │
                        │   ┌────────────────────────────┐              │
                        │   │ PostgreSQL (StatefulSet)   │              │
                        │   │ bitnami/postgresql 18.x    │              │
                        │   │ standalone, Longhorn 1Gi   │              │
                        │   └────────────────────────────┘              │
                        │                                               │
                        │   Catalog refresh every 100s:                 │
                        │   pulls 5 catalog-info.yaml from              │
                        │   raw.githubusercontent.com/...               │
                        └───────────────────────────────────────────────┘
After this phase: https://backstage.10.0.0.200.nip.io shows a
Software Catalog with 1 System (minicloud-platform) and 5
Components (one per repo in andrelair-platform).
Decisions
| Decision | Choice | Rationale |
|---|---|---|
| Install method | Helm chart backstage/backstage v2.7.0 | Standard install path; bundles Postgres subchart |
| Image | Off-the-shelf ghcr.io/backstage/backstage:latest (pulled through Harbor ghcr proxy cache) | Avoids 3+ hours of Node.js scaffold + Docker build for marginal portfolio gain. Custom build deferred. |
| Database | PostgreSQL (bitnami subchart, standalone), 1 GiB on Longhorn | SQLite is documented as "for development only"; Postgres is the right shape for portfolio even at this scale |
| Authentication | Guest auth (no SSO) with dangerouslyAllowOutsideDevelopment: true | GitHub OAuth requires GitHub App + browser setup; SSO requires Keycloak (future phase). Guest gets us to "logged in" instantly. |
| Plugins | Catalog-only (chart's default: no Kubernetes/ArgoCD/Grafana plugins) | Each plugin requires custom-build with its node module. Deferred. |
| Catalog source | 5 catalog-info.yaml files, one per repo, fetched via raw.githubusercontent.com URLs | Avoids needing a GitHub integration token (anonymous reads of public raw URLs work) |
| Hostname | backstage.10.0.0.200.nip.io via NGINX Ingress + cert-manager TLS | Same single-IP host-routing pattern |
| Resource budget | Backstage: 384 MiB request / 1 GiB limit; Postgres: 128 MiB request / 512 MiB limit | Conservative; cluster has ~30 GiB headroom |
What's deferred (with future homes)
Same scope-reduction pattern as Phase 11 (Crossplane), Phase 13 (GitLab), Phase 15 (Vault), Phase 16 (n8n/Temporal/Airflow).
| Component | Reason | Future home |
|---|---|---|
| Custom Backstage image build | 3+ hours of Node.js scaffold + Docker build for marginal gain over off-the-shelf | Future "Backstage Plugins" phase, when we wire Kubernetes/ArgoCD/Grafana plugins |
| GitHub OAuth / SSO | Requires GitHub App + Keycloak as backbone | Future phase pairing with Keycloak |
| Kubernetes / ArgoCD / Grafana plugins | Each requires the custom-build pipeline; the off-the-shelf image doesn't include them | Future "Backstage Plugins" phase |
| Software Templates ("Golden Paths") | Most valuable feature, needs scaffolder backend + template repos. ~6 hours focused work. | Dedicated future phase, after SSO |
| Vault | Same reasons as Phase 15: single-control-plane SPOF, no current workload needs dynamic creds | Future phase pairing with external-DB-credentials need |
| Crossplane | Promised "alongside Phase 18" in Phase 11's deferral; still no compelling external-infra use case | Future phase when there's a real cloud account or self-service template need |
This means Phase 18 ships roughly 30% of "real Backstage": the catalog. The table above is honest about what's missing and why.
Pre-flight
helm repo add backstage https://backstage.github.io/charts
helm repo add bitnami https://charts.bitnami.com/bitnami
helm repo update
# Generate Postgres password (mode 600, never committed)
openssl rand -base64 24 > ~/.backstage-postgres
chmod 600 ~/.backstage-postgres
Out-of-band Postgres Secret
The bitnami chart's existingSecret mechanism expects a Secret with
specific keys. We pre-create it in the chart's expected format:
kubectl create namespace backstage
PG_PW=$(cat ~/.backstage-postgres)
kubectl create secret generic backstage-postgres-secret \
-n backstage \
--from-literal=postgres-password="$PG_PW" \
--from-literal=password="$PG_PW"
backstage-values.yaml
backstage:
  replicas: 1
  image:
    registry: ghcr.io
    repository: backstage/backstage
    tag: latest
    # ghcr.io is routed through Harbor's ghcr proxy via the Phase 16
    # mirror config in /etc/rancher/k3s/registries.yaml, so no override is needed
  resources:
    requests: { cpu: 100m, memory: 384Mi }
    limits: { cpu: 1000m, memory: 1Gi }
  containerPorts:
    backend: 7007
  appConfig:
    app:
      title: minicloud platform
      baseUrl: https://backstage.10.0.0.200.nip.io
    organization:
      name: andrelair-platform
    backend:
      baseUrl: https://backstage.10.0.0.200.nip.io
      listen: { port: 7007 }
      cors:
        origin: https://backstage.10.0.0.200.nip.io
        methods: [GET, POST, PUT, DELETE]
        credentials: true
      # URL whitelist: without this, raw.githubusercontent.com reads
      # are blocked by Backstage's default policy.
      reading:
        allow:
          - host: raw.githubusercontent.com
          - host: github.com
      database:
        client: pg
        connection:
          host: ${POSTGRES_HOST}
          port: ${POSTGRES_PORT}
          user: ${POSTGRES_USER}
          password: ${POSTGRES_PASSWORD}
    auth:
      providers:
        guest:
          # Required when NODE_ENV=production (off-the-shelf image runs
          # in production mode); without this, /api/auth/guest/refresh
          # returns 403 NotAllowedError. Acceptable here: internal-only
          # via Tailscale + cluster network, no sensitive catalog data.
          dangerouslyAllowOutsideDevelopment: true
    catalog:
      rules:
        - allow: [Component, System, API, Resource, Location]
      locations:
        - type: url
          target: https://raw.githubusercontent.com/andrelair-platform/minicloud-platform-docs/main/catalog-info.yaml
        - type: url
          target: https://raw.githubusercontent.com/andrelair-platform/minicloud-ansible/main/catalog-info.yaml
        - type: url
          target: https://raw.githubusercontent.com/andrelair-platform/minicloud-opentofu/main/catalog-info.yaml
        - type: url
          target: https://raw.githubusercontent.com/andrelair-platform/minicloud-gitops/main/catalog-info.yaml
        - type: url
          target: https://raw.githubusercontent.com/andrelair-platform/platform-demo/main/catalog-info.yaml
  extraEnvVars:
    - name: POSTGRES_HOST
      value: backstage-postgresql
    - name: POSTGRES_PORT
      value: "5432"
    - name: POSTGRES_USER
      value: bn_backstage
    - name: POSTGRES_PASSWORD
      valueFrom:
        secretKeyRef:
          name: backstage-postgres-secret
          key: password
ingress:
  enabled: false # we add our own with cert-manager TLS
postgresql:
  enabled: true
  image:
    registry: docker.io
    repository: bitnamilegacy/postgresql
  architecture: standalone
  auth:
    username: bn_backstage
    database: backstage_plugin_catalog
    existingSecret: backstage-postgres-secret
    secretKeys:
      adminPasswordKey: postgres-password
      userPasswordKey: password
  primary:
    persistence:
      enabled: true
      storageClass: longhorn
      size: 1Gi
    resources:
      requests: { cpu: 50m, memory: 128Mi }
      limits: { cpu: 500m, memory: 512Mi }
Install + bootstrap order trap
helm install backstage backstage/backstage -n backstage \
-f backstage-values.yaml \
--wait --timeout 10m
Likely first failure: Backend startup failed... ECONNREFUSED 5432.
Postgres needs about 5 minutes for its first boot (init scripts, schema
setup, checkpoint). Backstage starts in parallel and tries to connect
immediately. Backstage 1.x doesn't retry startup-time DB connection
failures: it logs BackendStartupError and stays broken.
Fix: restart the Backstage pod once Postgres is fully Ready.
kubectl rollout restart deployment/backstage -n backstage
kubectl rollout status deployment/backstage -n backstage --timeout=120s
This is a real production gotcha. Even with helm install --wait, the
chart's readiness probes wait for Backstage's HTTP server to be alive,
which can happen before Backstage's database connection succeeds. The
--wait then times out at 10 minutes on the readiness probe, but the helm
install itself reports STATUS: deployed. Restart the pod and you're
fine.
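The manual timing can be avoided by gating the restart on Postgres readiness. The following is a minimal sketch rather than part of the original install: the function name restart_backstage_when_db_ready is hypothetical, and it assumes the pod and Deployment names used on this page (backstage-postgresql-0 and backstage).

```shell
#!/bin/sh
# Sketch: restart Backstage only after the Postgres pod reports Ready.
# Assumes the resource names used on this page; adjust if your release differs.

restart_backstage_when_db_ready() {
  ns="${1:-backstage}"

  # Block until the Postgres pod is Ready (up to 10 minutes).
  kubectl wait --for=condition=Ready pod/backstage-postgresql-0 \
    -n "$ns" --timeout=600s || return 1

  # Postgres is Ready, so a fresh Backstage pod should connect on startup.
  kubectl rollout restart deployment/backstage -n "$ns" || return 1
  kubectl rollout status deployment/backstage -n "$ns" --timeout=120s
}

# Usage (needs cluster access):
#   restart_backstage_when_db_ready backstage
```

If the wait itself times out, the function returns non-zero without touching the Deployment, so it is safe to re-run.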
TLS Ingress
apiVersion: cert-manager.io/v1
kind: Certificate
metadata:
  name: backstage-tls
  namespace: backstage
spec:
  secretName: backstage-tls
  issuerRef:
    name: minicloud-ca
    kind: ClusterIssuer
  dnsNames: [backstage.10.0.0.200.nip.io]
  duration: 2160h
  renewBefore: 720h
  privateKey: { algorithm: ECDSA, size: 256 }
---
apiVersion: networking.k8s.io/v1
kind: Ingress
metadata:
  name: backstage
  namespace: backstage
  annotations:
    nginx.org/redirect-to-https: "true"
spec:
  ingressClassName: nginx
  tls:
    - hosts: [backstage.10.0.0.200.nip.io]
      secretName: backstage-tls
  rules:
    - host: backstage.10.0.0.200.nip.io
      http:
        paths:
          - path: /
            pathType: Prefix
            backend:
              service:
                name: backstage
                port: { number: 7007 }
The 5 catalog-info.yaml files
One per repo, at the repo's root. Components reference the umbrella
System (spec.system: minicloud-platform) for tree-view navigation.
minicloud-platform-docs/catalog-info.yaml: System + Component
apiVersion: backstage.io/v1alpha1
kind: System
metadata:
  name: minicloud-platform
  description: 3-node bare-metal Kubernetes platform on ThinkPads
  tags: [homelab, bare-metal, portfolio]
spec:
  owner: andrelair-platform
---
apiVersion: backstage.io/v1alpha1
kind: Component
metadata:
  name: minicloud-platform-docs
  description: Docusaurus 3 documentation site
  annotations:
    backstage.io/source-location: url:https://github.com/andrelair-platform/minicloud-platform-docs
    github.com/project-slug: andrelair-platform/minicloud-platform-docs
  tags: [documentation, docusaurus]
  links:
    - { url: https://andrelair-platform.github.io/minicloud-platform-docs/, title: Live docs site }
spec:
  type: documentation
  lifecycle: production
  owner: andrelair-platform
  system: minicloud-platform
platform-demo/catalog-info.yaml (most-decorated example)
apiVersion: backstage.io/v1alpha1
kind: Component
metadata:
  name: platform-demo
  description: Phase 13 - tiny Go HTTP service end-to-end CI/CD demo
  annotations:
    backstage.io/source-location: url:https://github.com/andrelair-platform/platform-demo
    github.com/project-slug: andrelair-platform/platform-demo
    argocd/app-name: platform-demo
    prometheus.io/rule: http_request_duration_seconds_count{namespace="gitops-demo"}
    grafana/dashboard-selector: namespace=gitops-demo
  tags: [go, service, demo, phase-13, cicd]
  links:
    - { url: https://platform-demo.10.0.0.200.nip.io/, title: Live service }
    - { url: https://argocd.10.0.0.200.nip.io/applications/platform-demo, title: ArgoCD Application }
spec:
  type: service
  lifecycle: production
  owner: andrelair-platform
  system: minicloud-platform
The other three (ansible / opentofu / gitops) follow the same shape with
appropriate type (tool), tags, and links.
The argocd/app-name, prometheus.io/rule, and grafana/dashboard-selector
annotations don't do anything yet, but they're the right metadata for
when we add the corresponding plugins. Future-proofing.
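Because the three remaining files share one shape, they could be stamped out from a small template. This is only a sketch: render_tool_component is a hypothetical helper, the description is a caller-supplied placeholder, and lifecycle: production is an assumption carried over from the two files shown here. Tags and links are left out because the real values are not listed on this page.

```shell
#!/bin/sh
# Sketch: print a catalog-info.yaml for a "tool"-type repo.
# $1 = repo name, $2 = one-line description (placeholder text is fine).

render_tool_component() {
  cat <<EOF
apiVersion: backstage.io/v1alpha1
kind: Component
metadata:
  name: $1
  description: "$2"
  annotations:
    backstage.io/source-location: url:https://github.com/andrelair-platform/$1
    github.com/project-slug: andrelair-platform/$1
spec:
  type: tool
  lifecycle: production
  owner: andrelair-platform
  system: minicloud-platform
EOF
}

# Usage: write the file at the root of the matching repo checkout.
#   render_tool_component minicloud-ansible "placeholder description" > catalog-info.yaml
```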
Verification
# Pods
kubectl get pods -n backstage
# Expected: backstage-... (1/1 Running) + backstage-postgresql-0 (1/1 Running)
# UI reachable
curl -sIL --cacert ~/minicloud-ca.crt https://backstage.10.0.0.200.nip.io/
# Page title
curl -s --cacert ~/minicloud-ca.crt https://backstage.10.0.0.200.nip.io/ \
| grep -oE "<title>[^<]+</title>"
# <title>minicloud platform</title>
# Catalog query (with guest auth)
TOKEN=$(curl -s --cacert ~/minicloud-ca.crt -X POST \
https://backstage.10.0.0.200.nip.io/api/auth/guest/refresh \
| jq -r '.backstageIdentity.token')
curl -s --cacert ~/minicloud-ca.crt \
-H "Authorization: Bearer $TOKEN" \
"https://backstage.10.0.0.200.nip.io/api/catalog/entities?filter=kind=component" \
  | jq -r '.[] | "\(.metadata.name) - \(.spec.type)"'
# minicloud-ansible - tool
# minicloud-gitops - tool
# minicloud-opentofu - tool
# minicloud-platform-docs - documentation
# platform-demo - service
In the browser at https://backstage.10.0.0.200.nip.io:
- Click "Sign in as Guest"
- Catalog page: switch from "OWNER" to "ALL" filter
- See 5 Components + 1 System with descriptions, tags, owner, links
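To look inside the guest token instead of only checking that one was issued, its payload segment can be decoded locally. A minimal sketch, assuming a base64 binary that accepts -d (GNU coreutils, recent macOS); jwt_payload is a hypothetical helper, and it only decodes the claims without verifying the signature.

```shell
#!/bin/sh
# Sketch: print the JSON claims of a JWT (header.payload.signature).
# No signature verification is performed.

jwt_payload() {
  # Take the middle segment and map the base64url alphabet back to base64.
  seg=$(printf '%s' "$1" | cut -d. -f2 | tr '_-' '/+')

  # base64url drops '=' padding; restore it so base64 -d accepts the input.
  case $(( ${#seg} % 4 )) in
    2) seg="$seg==" ;;
    3) seg="$seg=" ;;
  esac

  printf '%s' "$seg" | base64 -d
}

# Usage, with TOKEN obtained as in the catalog query on this page:
#   jwt_payload "$TOKEN" | jq .
```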
Three real install gotchas
1. Postgres race on first install
helm install --wait reports STATUS: deployed even when Backstage's
DB connection failed. Always restart the Backstage pod manually if logs
show ECONNREFUSED.
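Since the Helm status is not a reliable signal here, the logs are the thing to check. A small sketch of that check: backstage_db_refused is a hypothetical helper that reads log text on stdin and matches only the ECONNREFUSED string quoted in this section.

```shell
#!/bin/sh
# Sketch: succeed if the log text on stdin shows a refused DB connection.

backstage_db_refused() {
  grep -q 'ECONNREFUSED'
}

# Usage (needs cluster access): restart only when the race actually happened.
#   kubectl logs deployment/backstage -n backstage | backstage_db_refused \
#     && kubectl rollout restart deployment/backstage -n backstage
```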
2. Default URL reading whitelist blocks raw.githubusercontent.com
Without backend.reading.allow: [{host: raw.githubusercontent.com}],
catalog refresh logs:
Unable to read url, Reading from 'https://raw.githubusercontent.com/...'
is not allowed.
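A quick offline check can catch this before the first catalog refresh. The sketch below compares each catalog target's host with the hosts listed under backend.reading.allow. It does exact host matching only and is not Backstage's own matching logic; ALLOWED_HOSTS, url_host, and is_allowed_host are hypothetical names.

```shell
#!/bin/sh
# Sketch: flag catalog targets whose host is missing from the allow list.

ALLOWED_HOSTS="raw.githubusercontent.com github.com"

url_host() {
  rest="${1#*://}"             # drop the scheme
  host="${rest%%/*}"           # drop the path
  printf '%s\n' "${host%%:*}"  # drop an optional port
}

is_allowed_host() {
  h=$(url_host "$1")
  for allowed in $ALLOWED_HOSTS; do
    [ "$h" = "$allowed" ] && return 0
  done
  return 1
}

# The five catalog locations from backstage-values.yaml.
for repo in minicloud-platform-docs minicloud-ansible minicloud-opentofu \
            minicloud-gitops platform-demo; do
  url="https://raw.githubusercontent.com/andrelair-platform/$repo/main/catalog-info.yaml"
  is_allowed_host "$url" || echo "blocked: $url"
done
```

With the allow list from this page the loop prints nothing; removing raw.githubusercontent.com from ALLOWED_HOSTS makes it list all five targets.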
3. Guest auth gated in production-mode
The off-the-shelf image runs NODE_ENV=production. Without
auth.providers.guest.dangerouslyAllowOutsideDevelopment: true, every
guest sign-in returns:
NotAllowedError: The guest provider cannot be used outside of a development environment
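The gate can be probed from the command line without opening a browser. A hedged sketch: check_guest_auth is a hypothetical helper, the CA path matches the one used in the verification commands, and it assumes a successful refresh answers with a plain 200 while the blocked case answers 403 as described in the values file comments.

```shell
#!/bin/sh
# Sketch: probe the guest refresh endpoint and interpret the HTTP status.
# $1 = Backstage base URL, e.g. https://backstage.10.0.0.200.nip.io

check_guest_auth() {
  base="$1"
  code=$(curl -s -o /dev/null -w '%{http_code}' \
    --cacert "$HOME/minicloud-ca.crt" \
    -X POST "$base/api/auth/guest/refresh")

  case "$code" in
    200) echo "guest auth enabled" ;;
    403) echo "guest auth blocked: set dangerouslyAllowOutsideDevelopment: true"
         return 1 ;;
    *)   echo "unexpected HTTP status: $code"
         return 2 ;;
  esac
}

# Usage:
#   check_guest_auth https://backstage.10.0.0.200.nip.io
```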
Done When
✅ 2 pods Running in backstage namespace (backstage + backstage-postgresql-0)
✅ 1 PVC Bound on Longhorn (backstage-postgresql)
✅ Cert + Ingress for backstage.10.0.0.200.nip.io
✅ HTTP→HTTPS 301 redirect; HTTPS returns 200 with title "minicloud platform"
✅ Guest auth issues a JWT (~514 chars)
✅ /api/catalog/entities returns 6 entities: 1 System + 5 Components
✅ Each Component has its system: minicloud-platform link
✅ Homer has a Backstage tile
Real-world skills demonstrated
| Skill | Industry context |
|---|---|
| Off-the-shelf vs custom Backstage trade-off | The single most important Backstage adoption decision. Real teams agonize over this. |
| Software Catalog as the entry point | Catalog-first adoption is the canonical Backstage rollout pattern at scale (Spotify, Netflix, Roadie's customers, etc.) |
| PostgreSQL bitnami subchart with existingSecret | Standard pattern for any Helm chart that ships a DB subchart: pre-create the secret out-of-band, reference it via existingSecret |
| raw.githubusercontent.com reads to avoid GitHub rate limits | Real production knowledge: even reading public repo files via the GitHub API hits unauthenticated rate limits (60 requests per hour per IP). Fetching from raw.githubusercontent.com bypasses the API and its quota, although heavy anonymous use can still be throttled. |
| Annotations for future plugins | Adding argocd/app-name and prometheus.io/rule annotations even before the corresponding plugins are wired is the high-signal move: when plugins arrive, the metadata is already there |
| Production-mode auth gating | NODE_ENV=production + dangerouslyAllowOutsideDevelopment is a Backstage-specific quirk that's documented but easily missed. Real install knowledge. |
| Risk-aware scope reduction | Choosing minimal-IDP over full custom-build with all plugins is the same skill as Phase 11's Crossplane deferral, Phase 13's GitLab deferral, etc. |
| Honest "this is portfolio-only" framing | Naming what isn't operationally needed alongside what is. A portfolio that says "we don't actually use this much" is more credible than one that pretends the homelab needs Backstage. |