Migrated from ADR-0016 on 2026-05-02 per ADR-0047. This source file is retained as a reference; the canonical content is in PLAT-0002.
PLAT-0002 - k3s Namespace Design: Domain-Based Pattern
| Field | Value |
|---|---|
| Status | Accepted |
| Date | 2026-04-03 |
| Author | Ben Peries |
| Sources | ADR-0016 |
Context
The k3s cluster is running with caneast-site1-node3 as control-plane and caneast-site1-node2 as worker node. Workloads span infrastructure tooling, monitoring, application services, OT-facing pipelines, and security. A namespace strategy is needed before deploying workloads.
Two patterns were evaluated:
- Pattern A — environment-based: dev, staging, prod namespaces. Each environment contains all workloads.
- Pattern B — domain/team-based: namespaces map to functional domains. Each namespace owns a category of workloads.
Decision
Pattern B — domain/team-based namespaces.
Namespace map
| Namespace | Purpose | Example workloads |
|---|---|---|
| archon-infra | Core platform tools | AWX, Infisical agents, ArgoCD |
| archon-monitoring | Observability stack | Grafana, InfluxDB, Telegraf, Prometheus |
| archon-apps | Application workloads | Node-RED, peries.ca, Homepage |
| archon-ot | OT-facing services | MQTT bridge, OT data processors |
| archon-security | Security tooling | CrowdSec, Suricata, Falco, Conpot |
Rationale
- Maps directly to the IT/OT separation story established in IAM-0002: OT workloads are isolated in archon-ot, security tooling in archon-security
- Mirrors enterprise domain ownership model: each namespace has a clear owner and purpose, even in a single-operator platform
- Supports per-namespace RBAC: when AWX (Phase 3) manages deployments, service accounts can be scoped to their domain namespace only
- Enables per-namespace NetworkPolicies: archon-ot can restrict egress to MQTT broker and InfluxDB only, archon-security can access all namespaces for monitoring
- Aligns with ArgoCD ApplicationSets: one Application per namespace, clean GitOps sync boundaries
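The RBAC and NetworkPolicy points above can be sketched as manifests. This is a minimal illustration, not the deployed configuration: the resource names (awx-deployer, deployer, ot-restrict-egress), the pod labels (app: influxdb, app: mqtt-broker), the ports, and the assumption that the MQTT broker runs inside archon-ot are all hypothetical.

```yaml
# Sketch only: names, labels, and ports are assumptions, not taken from the cluster.

# Service account that may manage Deployments in archon-apps and nowhere else.
apiVersion: v1
kind: ServiceAccount
metadata:
  name: awx-deployer              # hypothetical name
  namespace: archon-apps
---
apiVersion: rbac.authorization.k8s.io/v1
kind: Role
metadata:
  name: deployer                  # hypothetical name
  namespace: archon-apps
rules:
  - apiGroups: ["apps"]
    resources: ["deployments"]
    verbs: ["get", "list", "watch", "create", "update", "patch"]
---
# A RoleBinding (not a ClusterRoleBinding) keeps the grant inside one namespace.
apiVersion: rbac.authorization.k8s.io/v1
kind: RoleBinding
metadata:
  name: awx-deployer
  namespace: archon-apps
subjects:
  - kind: ServiceAccount
    name: awx-deployer
    namespace: archon-apps
roleRef:
  apiGroup: rbac.authorization.k8s.io
  kind: Role
  name: deployer
---
# Egress allow-list for every pod in archon-ot.
apiVersion: networking.k8s.io/v1
kind: NetworkPolicy
metadata:
  name: ot-restrict-egress        # hypothetical name
  namespace: archon-ot
spec:
  podSelector: {}                 # empty selector matches all pods in the namespace
  policyTypes:
    - Egress
  egress:
    # DNS, so that service names still resolve
    - to:
        - namespaceSelector:
            matchLabels:
              kubernetes.io/metadata.name: kube-system
      ports:
        - protocol: UDP
          port: 53
        - protocol: TCP
          port: 53
    # InfluxDB in archon-monitoring (assumed pod label and port)
    - to:
        - namespaceSelector:
            matchLabels:
              kubernetes.io/metadata.name: archon-monitoring
          podSelector:
            matchLabels:
              app: influxdb
      ports:
        - protocol: TCP
          port: 8086
    # MQTT broker, assumed to run in archon-ot itself (assumed pod label and port)
    - to:
        - podSelector:
            matchLabels:
              app: mqtt-broker
      ports:
        - protocol: TCP
          port: 1883
```

The namespace selectors rely on the kubernetes.io/metadata.name label that Kubernetes sets automatically on every namespace, so no extra namespace labelling is needed for this sketch.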
Alternatives Considered
Pattern A: environment-based (dev, staging, prod)
Rejected. Archon is a single-environment homelab — there is no staging or prod distinction at this scale. Creating dev/staging/prod namespaces would result in workloads only ever running in prod, with the other namespaces empty. Environment separation is better handled at the ADO pipeline level (environment gates) than at the namespace level.
Flat default namespace
Rejected. Running all workloads in default provides no isolation, no RBAC granularity, and no NetworkPolicy boundaries, which makes it impossible to reason about the blast radius of a misconfigured deployment.
Exceptions
AWX Operator: awx namespace
The AWX Operator upstream (github.com/ansible/awx-operator/config/default) hardcodes the awx namespace in its kustomization manifests. Deploying to archon-infra was attempted but produced namespace mismatch errors on serviceaccount, role, rolebinding, configmap, service, and deployment resources — the operator's own RBAC and resource references assume awx and cannot be overridden cleanly via a downstream kustomization namespace field.
Decision: accept awx as a fixed upstream constraint, not a local naming choice. AWX deploys to the awx namespace. All other workloads follow the archon-* domain-based pattern.
Reference: https://github.com/ansible/awx-operator/issues/2002
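As an illustration of the accepted constraint, a downstream kustomization that keeps the upstream namespace might look like the sketch below. The `<operator-version>` value is a placeholder, not a version taken from this platform, and the file layout is an assumption.

```yaml
# Sketch only: <operator-version> is a placeholder; pin it to the release actually deployed.
apiVersion: kustomize.config.k8s.io/v1beta1
kind: Kustomization
resources:
  # Upstream operator manifests referenced in the section above
  - github.com/ansible/awx-operator/config/default?ref=<operator-version>
# Keep the namespace the upstream manifests assume instead of overriding it to archon-infra
namespace: awx
```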
Consequences
- All k3s workloads must declare namespace explicitly in their manifests, with no implicit default
- No deployments to the default namespace; it remains empty
- Helm charts and ArgoCD Applications must target a specific archon-* namespace
- Namespace creation managed via k3s manifests in archon-platform/k3s/namespaces/
- Future multi-environment support (if needed) would layer environment labels on resources within domain namespaces, not create parallel namespace trees
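A namespace manifest kept under archon-platform/k3s/namespaces/ could be as small as the sketch below. The single-file layout and the app.kubernetes.io/part-of label value are assumptions for illustration; only the namespace names come from the namespace map.

```yaml
# Sketch only: file layout and label value are assumptions.
apiVersion: v1
kind: Namespace
metadata:
  name: archon-infra
  labels:
    app.kubernetes.io/part-of: archon   # hypothetical grouping label
---
apiVersion: v1
kind: Namespace
metadata:
  name: archon-monitoring
  labels:
    app.kubernetes.io/part-of: archon
---
apiVersion: v1
kind: Namespace
metadata:
  name: archon-apps
  labels:
    app.kubernetes.io/part-of: archon
---
apiVersion: v1
kind: Namespace
metadata:
  name: archon-ot
  labels:
    app.kubernetes.io/part-of: archon
---
apiVersion: v1
kind: Namespace
metadata:
  name: archon-security
  labels:
    app.kubernetes.io/part-of: archon
```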
References
- IAM-0002 — IT/OT zone separation policy
Addendum: 2026-04-12
caneast-site1-node2 ServiceLB disabled (WI #175)
Observation: k3s Traefik's LoadBalancer service was claiming port 443 on caneast-site1-node2 (worker node), conflicting with the Nginx Proxy Manager (NRP) Docker container which also binds port 443 on that host. NRP went offline as a result.
Temporary fix applied:
k3s-agent on caneast-site1-node2 and k3s on caneast-site1-node3 were restarted. Traefik LoadBalancer external IP now shows only REDACTED (caneast-site1-node3). NRP owns port 443 on caneast-site1-node2 cleanly.
Permanent fix (WI #175, backlog): caneast-site1-node2 should not run both a k3s worker and NRP long term. Options:
1. Move NRP to port 8443 and update all Cloudflare/DNS accordingly
2. Migrate all proxy hosts from NRP to Traefik IngressRoute resources and retire NRP
The label persists across reboots: it is stored on the Node object in the cluster datastore (etcd, or SQLite on a default single-server k3s install) and remains in place when the agent reconnects. No additional configuration is required to make this permanent at the k3s level.
Status: Temporary fix is active. Permanent resolution tracked in WI #175.
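For option 2 of the permanent fix, each NRP proxy host would become a Traefik IngressRoute in the relevant domain namespace. The sketch below is a hedged illustration only: the hostname, the Service name and port, and the choice of the Homepage workload are hypothetical, and the API group depends on the Traefik version bundled with the installed k3s release.

```yaml
# Sketch only: hostname, service name, and port are hypothetical.
# Older Traefik v2 releases use apiVersion traefik.containo.us/v1alpha1 instead.
apiVersion: traefik.io/v1alpha1
kind: IngressRoute
metadata:
  name: homepage                  # hypothetical name
  namespace: archon-apps
spec:
  entryPoints:
    - websecure                   # Traefik's HTTPS entry point (port 443)
  routes:
    - match: Host(`home.example.com`)
      kind: Rule
      services:
        - name: homepage          # assumed Service in archon-apps
          port: 80
  tls: {}                         # terminate TLS with Traefik's default certificate
```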