Roadmap
| Phase | Name | Status |
|---|---|---|
| 1 | Foundation | Complete |
| 2 | Monitoring & Automation | Complete |
| 3 | Cloud-Native & TLS | Complete |
| 4 | AI & LLM | Active |
| 5 | OT Hardening | Active |
| 6 | Cloud Expansion | Planned |
Phase 1 — Foundation
- Deployed ESP32 sensor nodes for sump pit monitoring (flood, level, rain)
- Stood up Mosquitto MQTT broker on dedicated OT hardware
- Established InfluxDB time-series historian with Telegraf collectors
- Wired Telegram alerts on flood and level thresholds; first OT Grafana dashboard live
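The threshold alerting in the last bullet can be sketched roughly as follows. This is a minimal illustration, not the deployed code: the `Reading` fields, the threshold values, and the `evaluate` function are all hypothetical, and the Telegram delivery step is left out.

```python
from dataclasses import dataclass

# Hypothetical thresholds; the real values are not stated in this roadmap.
LEVEL_WARN_CM = 30.0
LEVEL_CRIT_CM = 45.0


@dataclass
class Reading:
    """One sample from a sump pit node (field names are assumptions)."""
    node: str
    level_cm: float
    flood: bool


def evaluate(reading: Reading) -> list[str]:
    """Return alert messages for a reading; an empty list means no alert."""
    alerts = []
    # The flood sensor is binary, so any positive reading alerts immediately.
    if reading.flood:
        alerts.append(f"FLOOD detected on {reading.node}")
    # Level alerts are tiered: critical takes precedence over warning.
    if reading.level_cm >= LEVEL_CRIT_CM:
        alerts.append(f"CRITICAL level {reading.level_cm:.1f} cm on {reading.node}")
    elif reading.level_cm >= LEVEL_WARN_CM:
        alerts.append(f"WARNING level {reading.level_cm:.1f} cm on {reading.node}")
    return alerts
```

Keeping the evaluation a pure function makes the thresholds testable without a broker or a Telegram bot token; whatever sends the message only needs to iterate over the returned list.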
Phase 2 — Monitoring & Automation
- Self-hosted Infisical as single secrets plane; machine identities replace shared credentials
- Ansible fleet-wide provisioning: node hardening, Docker, common roles; idempotent from day one
- Separated IT service accounts from shared logins; keypair-per-role pattern established
- AdGuard Home DNS, WireGuard VPN, Portainer, Uptime Kuma all production-ready
- archon-docs launched with DLP sanitization pipeline; every doc scrubbed before publish
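As an illustration of what a DLP sanitization pass might look like, here is a minimal sketch. The rules and the `sanitize` function are assumptions made for this example; the actual archon-docs pipeline and its rule set are not described here.

```python
import re

# Hypothetical rules: (pattern, replacement) pairs applied in order.
RULES = [
    # IPv4 addresses are replaced wholesale.
    (re.compile(r"\b(?:\d{1,3}\.){3}\d{1,3}\b"), "REDACTED"),
    # Numeric values following "UID" or "port" are replaced; the keyword is kept.
    (re.compile(r"\b(UID|port) \d+\b"), r"\1 REDACTED"),
]


def sanitize(text: str) -> str:
    """Apply every rule to the text and return the scrubbed result."""
    for pattern, replacement in RULES:
        text = pattern.sub(replacement, text)
    return text
```

For example, `sanitize("SSH on port 2222 at 10.0.0.5, UID 1001")` returns `"SSH on port REDACTED at REDACTED, UID REDACTED"`. A three-part version string such as `0.14.1` is left untouched because the IPv4 pattern requires four dotted groups.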
Phase 3 — Cloud-Native & TLS
- k3s cluster provisioned: 1 control plane, 3 workers; all nodes Ready
- AWX deployed for Ansible job orchestration; job templates wired to pipelines
- External Secrets Operator wired to Infisical; ClusterSecretStore READY with 5-minute TTL
- cert-manager with Let's Encrypt DNS-01 validation; automatic certificate renewal live
- Traefik IngressRoutes with TLS enforced across all platform services
- kube-prometheus-stack: Prometheus, Alertmanager, Grafana cluster observability operational
- Loki log aggregation in k3s; IngressRoute with TLS; log collector ADR pending
- Grafana Logs Drilldown plugin installed; datasource live
- Headlamp Kubernetes UI deployed; read-only cluster introspection without kubectl
- CanEast AI Node scrape removed from additionalScrapeConfigs; stale targets cleared
- ccagnt agent fleet deployed: 21 specialized Claude agents with proactive invocation and model tier discipline
Phase 4 — AI & LLM
- OpenClaw LLM gateway deployed; policy enforcement layer for production AI workloads
- MCP integration live: Claude Code connected to k3s, Cloudflare, Azure DevOps, Infisical
- Ask Archy public assistant: Cloudflare Worker + Groq API; Turnstile bot protection
- Frigate 0.14.1 on k3s with Intel UHD 630 GPU allocation via intel-gpu-plugin
- qwen3-vl:4b vision model on Ollama; 5-tier inference pipeline for cam01 event detection
- cam01 capture pipeline: Raspberry Pi 5 camera node; event-driven MQTT publish and JSONL logging
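The event-driven logging in the cam01 bullet could take a shape like the sketch below. The event fields, `build_event`, and `append_jsonl` are hypothetical names chosen for illustration. The MQTT publish itself is omitted; the same serialized line could serve as the message payload.

```python
import io
import json
from datetime import datetime, timezone


def build_event(camera: str, label: str, score: float, ts: datetime) -> dict:
    """Assemble one detection event (the schema is an assumption)."""
    return {
        "camera": camera,
        "label": label,
        "score": round(score, 3),
        "ts": ts.isoformat(),
    }


def append_jsonl(stream, event: dict) -> str:
    """Write the event as a single JSON line and return the serialized line."""
    line = json.dumps(event, sort_keys=True)
    stream.write(line + "\n")
    return line


# Usage: an in-memory buffer stands in for the log file.
buf = io.StringIO()
ts = datetime(2025, 1, 1, 12, 0, 0, tzinfo=timezone.utc)
line = append_jsonl(buf, build_event("cam01", "person", 0.87654, ts))
```

One JSON object per line keeps the log append-only and lets a consumer read events back one at a time without parsing the whole file.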
Phase 5 — OT Hardening
- ansible-ot-svc-account service account separated from IT account: UID REDACTED, Ed25519 keypair, Infisical-stored key
- SSH port isolation enforced: OT nodes on dedicated port, IT nodes on port REDACTED
- OT Ansible inventory separated from IT; Phase 2 migration to archon-apps repo planned
- Log collector ADR: Promtail vs Fluent Bit vs Alloy decision; Kibana/ELK retirement assessment
Phase 6 — Cloud Expansion
- Terraform modules for Azure and Cloudflare; git-driven lifecycle management
- ERP/CMMS deployment for manufacturing and maintenance workflow integration
- Manufacturing test loop: OT sensor data driving automated alerting and work order creation
Current Sprint (Sprint 5)
| WI | Title | Status |
|---|---|---|
| WI-387 | Author log collector ADR (Promtail vs Fluent Bit vs Alloy) | Backlog |
| WI-383 | Store AdGuard admin password in Infisical | Backlog |
| WI-374 | Homepage config-as-code in git | In Progress |
| WI-373 | NPM proxy host for infisical.peries.ca | Backlog |
| WI-353 | Rename kube-prometheus-stack to archon-prom | Blocked (risk of losing 15d of TSDB data) |