
Migrated from ADR-0044 on 2026-05-02 per ADR-0047. This source file is retained as a reference; the canonical content is in PLAT-0006.

PLAT-0006 — Traefik IngressRoutes with TLS for Platform Services

| Field | Value |
| --- | --- |
| Status | Accepted |
| Date | 2026-04-28 |
| Author | Ben Peries |
| Sources | ADR-0044 |
| Epic | E2 Platform Security & Hardening (WI-258) |

Context

cert-manager (PLAT-0005) is installed and both ClusterIssuers (letsencrypt-staging, letsencrypt-prod) are registered. The cluster now has automated TLS lifecycle management, but no services are yet wired to use it.

At the time of this ADR, the k3s cluster exposes platform services as follows:

| Service | Access method | TLS |
| --- | --- | --- |
| Grafana | HTTP-only IngressRoute (grafana-platform.peries.ca:80) | None |
| AWX | NodePort :30080 on all nodes | None |
| Traefik dashboard | Internal only (:8080, not exposed) | N/A |
| Prometheus / Alertmanager | No ingress | Deferred |

This ADR covers the first two services. All traffic to internal platform services must use HTTPS; plaintext HTTP must be rejected or redirected.

Decision

Wire Grafana and AWX through Traefik IngressRoutes with TLS termination using certificates issued by cert-manager.

Manifest structure

k8s/ingress/
  grafana/
    certificate.yaml                # cert-manager Certificate → letsencrypt-prod
    middleware-redirect-https.yaml  # Traefik Middleware: HTTP → HTTPS redirect
    ingressroute.yaml               # Two IngressRoutes: web (redirect) + websecure (TLS)
  awx/
    certificate.yaml
    middleware-redirect-https.yaml
    ingressroute.yaml

Hostname → Service mapping

| Hostname | Namespace | Backend service | Port |
| --- | --- | --- | --- |
| grafana-platform.peries.ca | archon-monitoring | kube-prometheus-stack-grafana | 80 |
| awx-platform.peries.ca | awx | awx-service | 80 |

Internal DNS (AdGuard, caneast-site1-node2) resolves both hostnames to REDACTED (k3s LoadBalancer VIP / Traefik). External Cloudflare A records are a follow-up action.

HTTP → HTTPS redirect

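A Middleware of type redirectScheme is deployed in each service namespace. Traefik Middleware resources are namespace-scoped, and with the Kubernetes CRD provider's default settings (allowCrossNamespace disabled) an IngressRoute cannot reference a Middleware in another namespace. Each namespace that requires the redirect therefore gets its own identical Middleware.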
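
A minimal sketch of the per-namespace redirect for Grafana. The Middleware name redirect-https appears later in this document; the IngressRoute name grafana-http and the permanent: true setting are assumptions for illustration and are not copied from the deployed manifests.

```yaml
# Middleware: redirect any request arriving over HTTP to HTTPS.
apiVersion: traefik.io/v1alpha1
kind: Middleware
metadata:
  name: redirect-https
  namespace: archon-monitoring
spec:
  redirectScheme:
    scheme: https
    permanent: true   # assumption: issue a permanent redirect
---
# IngressRoute on the plaintext entry point; it exists only to apply the redirect.
apiVersion: traefik.io/v1alpha1
kind: IngressRoute
metadata:
  name: grafana-http   # hypothetical name
  namespace: archon-monitoring
spec:
  entryPoints:
    - web
  routes:
    - match: Host(`grafana-platform.peries.ca`)
      kind: Rule
      middlewares:
        - name: redirect-https   # same namespace as this IngressRoute
      services:
        - name: kube-prometheus-stack-grafana
          port: 80
```

The AWX namespace would carry an identical Middleware, with only metadata.namespace changed to awx.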

TLS termination

Certificate resources in each service namespace reference the ClusterIssuer letsencrypt-prod. cert-manager issues the certificates via Cloudflare DNS-01 (PLAT-0005) and writes each one to a TLS Secret in the same namespace, which the HTTPS IngressRoute references by tls.secretName.
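
The pairing of Certificate and HTTPS IngressRoute could look roughly like the following for Grafana. The resource names grafana-platform-tls and grafana-https are hypothetical; the issuer, hostname, namespace, and backend come from this ADR.

```yaml
# cert-manager Certificate: requests a cert and stores it in the named Secret.
apiVersion: cert-manager.io/v1
kind: Certificate
metadata:
  name: grafana-platform-tls        # hypothetical name
  namespace: archon-monitoring
spec:
  secretName: grafana-platform-tls  # Secret that cert-manager creates and renews
  issuerRef:
    name: letsencrypt-prod
    kind: ClusterIssuer
  dnsNames:
    - grafana-platform.peries.ca
---
# IngressRoute on the TLS entry point; Traefik terminates TLS with the Secret above.
apiVersion: traefik.io/v1alpha1
kind: IngressRoute
metadata:
  name: grafana-https               # hypothetical name
  namespace: archon-monitoring
spec:
  entryPoints:
    - websecure
  routes:
    - match: Host(`grafana-platform.peries.ca`)
      kind: Rule
      services:
        - name: kube-prometheus-stack-grafana
          port: 80
  tls:
    secretName: grafana-platform-tls
```

Because the IngressRoute refers to the Secret only by name, cert-manager can renew the certificate in place without any change to the route.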

Phase alignment (IAM-0002)

  • Phase 1 (current): flat LAN 192.168.2.x. All services are on the internal network.
  • Phase 2 (future): OPNsense MGT zone (10.10.20.x).

The *-platform.peries.ca hostname pattern stays unchanged across phases; only the backend routing changes. No manifest rewrites are needed at Phase 2 migration.

AWX NodePort → ClusterIP

awx-service is currently NodePort :30080 (managed by the AWX operator). The IngressRoute routes to the service's ClusterIP port 80, which works regardless of service type. Post-smoke-test conversion to ClusterIP requires patching the AWX CR (spec.service_type: ClusterIP) rather than editing the Service directly, as the operator will reconcile any direct Service patch back to NodePort.
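
As a sketch of that follow-up change, the relevant fragment of the AWX custom resource might look like this. The CR name awx is an assumption inferred from the service name awx-service; only the service_type field and the namespace are taken from this ADR, and all other fields of the existing CR are omitted here.

```yaml
# Fragment of the AWX custom resource managed by the AWX operator.
apiVersion: awx.ansible.com/v1beta1
kind: AWX
metadata:
  name: awx        # assumption: inferred from the Service name awx-service
  namespace: awx
spec:
  service_type: ClusterIP   # operator reconciles the Service to this type
```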

Options Considered

Option A: Vanilla Kubernetes Ingress with cert-manager annotation (rejected)

  • kubernetes.io/tls-acme: "true" annotation triggers cert issuance automatically.
  • k3s Traefik v3 CRDs are the established pattern in this cluster; vanilla Ingress is a second-class citizen and does not support Traefik-specific middleware features.
  • IngressRoute is the correct API for this stack.
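
For contrast, the rejected vanilla Ingress approach would look roughly like the sketch below. The Ingress name, the Secret name, and the ingressClassName value traefik are assumptions. Note also that the kubernetes.io/tls-acme annotation only takes effect when cert-manager's ingress-shim is configured with a default issuer; otherwise an explicit issuer annotation such as cert-manager.io/cluster-issuer is needed.

```yaml
apiVersion: networking.k8s.io/v1
kind: Ingress
metadata:
  name: grafana                      # hypothetical name
  namespace: archon-monitoring
  annotations:
    kubernetes.io/tls-acme: "true"   # asks cert-manager's ingress-shim for a cert
spec:
  ingressClassName: traefik          # assumption: class name used by k3s Traefik
  tls:
    - hosts:
        - grafana-platform.peries.ca
      secretName: grafana-platform-tls   # hypothetical Secret name
  rules:
    - host: grafana-platform.peries.ca
      http:
        paths:
          - path: /
            pathType: Prefix
            backend:
              service:
                name: kube-prometheus-stack-grafana
                port:
                  number: 80
```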

Option B: Traefik IngressRoute with tls.certResolver (rejected)

  • tls.certResolver delegates ACME issuance to Traefik's built-in resolver.
  • No cert-manager integration; bypasses ESO token injection and PLAT-0005 lifecycle.
  • cert-manager (PLAT-0005) is the authoritative TLS manager for the cluster — a second ACME client would create split state.
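
To make the difference concrete, Option B would replace the Secret reference in the HTTPS IngressRoute with a resolver reference, along these lines. The resolver name letsencrypt is hypothetical and would have to be defined in Traefik's static configuration.

```yaml
# Fragment of an IngressRoute spec under Option B (rejected).
spec:
  entryPoints:
    - websecure
  tls:
    certResolver: letsencrypt   # hypothetical resolver; Traefik itself acts as the ACME client
```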

Option C: Traefik IngressRoute + cert-manager Certificate (selected)

  • Clean separation: cert-manager owns issuance lifecycle; Traefik owns routing.
  • TLS Secrets are referenced by name in IngressRoute — no controller coupling.
  • Consistent with PLAT-0005 and the platform single-TLS-manager principle.

Rationale

Traefik IngressRoute + cert-manager is the correct layering for this stack. The redirectScheme middleware enforces HTTPS at the ingress layer without requiring application changes. Per-namespace Middleware instances are a minor duplication but respect Traefik's namespace isolation model and avoid cross-namespace reference complexity.

Using the letsencrypt-prod issuer at initial deployment is appropriate because both peries.ca and peries.ca DNS-01 solvers were validated in C.2 (PLAT-0005 smoke test), and these are production-access services.

Consequences

  • grafana-platform.peries.ca — HTTP redirects to HTTPS, TLS cert issued by Let's Encrypt prod CA.
  • awx-platform.peries.ca — HTTP redirects to HTTPS, TLS cert issued by Let's Encrypt prod CA.
  • AWX awx-service remains NodePort until AWX CR is patched; follow-up WI required.
  • Prometheus and Alertmanager ingress is deferred — requires auth middleware to prevent unauthenticated metric access before exposure.
  • External Cloudflare A records for both hostnames are a follow-up action (currently NXDOMAIN externally; internal access via AdGuard rewrites works immediately).
  • Future services follow the same manifest structure under k8s/ingress/<service>/.

References

  • IAM-0002 — IT/OT zone separation policy
  • PLAT-0005 — cert-manager with Let's Encrypt DNS-01 via Cloudflare
  • IAM-0005 — External Secrets Operator (ESO)
  • Traefik IngressRoute: https://doc.traefik.io/traefik/routing/providers/kubernetes-crd/
  • Traefik Middleware redirectScheme: https://doc.traefik.io/traefik/middlewares/http/redirectscheme/

Addendum — 2026-05-01

Services added following this pattern (WI-376, WI-377, WI-378)

After the initial Grafana + AWX deployment, the following services were brought onto the same IngressRoute + cert-manager TLS pattern:

| Hostname | Namespace | Backend service | Port | WI |
| --- | --- | --- | --- | --- |
| prometheus-platform.peries.ca | archon-monitoring | kube-prometheus-stack-prometheus | 9090 | WI-376 |
| alertmanager-platform.peries.ca | archon-monitoring | kube-prometheus-stack-alertmanager | 9093 | WI-376 |
| headlamp-platform.peries.ca | archon-infra | archon-headlamp | 80 | WI-377 |
| loki-platform.peries.ca | archon-monitoring | archon-loki | 3100 | WI-378 |

Note on AdGuard DNS rewrites: Each new *-platform.peries.ca hostname requires its own AdGuard rewrite entry pointing to REDACTED (Traefik LoadBalancer VIP); the specific entry overrides the wildcard *.peries.ca → REDACTED rule. The AdGuard HTTP API requires basic auth. If the credential is not in Infisical, edit the config file directly instead: copy it out of the container with docker cp, modify it on the node, copy it back, then run docker restart adguard.
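
When editing the config file by hand, a rewrite entry is a domain/answer pair. The fragment below is a sketch only: the parent key that holds the rewrites list differs between AdGuard Home versions, so check the existing file rather than relying on this layout. The addresses are redacted here as they are elsewhere in this document.

```yaml
# Fragment of the AdGuard Home config; the parent key varies by version.
rewrites:
  - domain: '*.peries.ca'              # existing wildcard rule
    answer: REDACTED
  - domain: loki-platform.peries.ca    # specific entry, overrides the wildcard
    answer: REDACTED                   # Traefik LoadBalancer VIP
```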

Note on namespace-scoped Middleware: The redirect-https Middleware is deployed once per namespace. archon-monitoring already had one from WI-376, so subsequent services in that namespace (Loki) reuse it; re-applying the manifest with kubectl apply reports it as unchanged.
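
As an illustration of that reuse, the Loki routes could reference the existing Middleware by name without redeclaring it. The IngressRoute names and the Secret name loki-platform-tls are hypothetical; the hostname, namespace, service, and port come from the table above.

```yaml
# HTTP route: reuses the redirect-https Middleware already present in archon-monitoring.
apiVersion: traefik.io/v1alpha1
kind: IngressRoute
metadata:
  name: loki-http    # hypothetical name
  namespace: archon-monitoring
spec:
  entryPoints:
    - web
  routes:
    - match: Host(`loki-platform.peries.ca`)
      kind: Rule
      middlewares:
        - name: redirect-https
      services:
        - name: archon-loki
          port: 3100
---
# HTTPS route: terminates TLS with the Secret produced by Loki's Certificate.
apiVersion: traefik.io/v1alpha1
kind: IngressRoute
metadata:
  name: loki-https   # hypothetical name
  namespace: archon-monitoring
spec:
  entryPoints:
    - websecure
  routes:
    - match: Host(`loki-platform.peries.ca`)
      kind: Rule
      services:
        - name: archon-loki
          port: 3100
  tls:
    secretName: loki-platform-tls   # hypothetical Secret name
```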