Roadmap

Phase | Name | Status
1 | Foundation | Complete
2 | Monitoring & Automation | Complete
3 | Cloud-Native & TLS | Complete
4 | AI & LLM | Active
5 | OT Hardening | Active
6 | Cloud Expansion | Planned

Phase 1 — Foundation

  • Deployed ESP32 sensor nodes for sump pit monitoring (flood, level, rain)
  • Stood up Mosquitto MQTT broker on dedicated OT hardware
  • Established InfluxDB time-series historian with Telegraf collectors
  • Wired Telegram alerts on flood and level thresholds; first OT Grafana dashboard live
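
The threshold alerting in the last bullet can be sketched roughly as follows. This is a minimal illustration, not the deployed logic: the `SumpReading` fields, the `alert_message` helper, and the 30 cm limit are all hypothetical, and the actual MQTT subscription and Telegram delivery are left out.

```python
from dataclasses import dataclass
from typing import Optional

# Hypothetical threshold; the real value is not stated in this roadmap.
LEVEL_ALERT_CM = 30.0


@dataclass
class SumpReading:
    level_cm: float  # water level reported by the sensor node
    flood: bool      # True when the flood sensor reports water


def alert_message(reading: SumpReading,
                  level_limit_cm: float = LEVEL_ALERT_CM) -> Optional[str]:
    """Return alert text if a threshold is crossed, otherwise None."""
    # The flood sensor takes priority over the level threshold.
    if reading.flood:
        return "FLOOD: flood sensor triggered in sump pit"
    if reading.level_cm >= level_limit_cm:
        return (f"LEVEL: sump level {reading.level_cm:.1f} cm "
                f">= {level_limit_cm:.1f} cm")
    return None
```

Keeping the decision in a pure function like this makes the thresholds testable without a broker or a chat bot in the loop; the caller decides how to deliver any non-`None` result.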

Phase 2 — Monitoring & Automation

  • Self-hosted Infisical as single secrets plane; machine identities replace shared credentials
  • Ansible fleet-wide provisioning: node hardening, Docker, common roles; idempotent from day one
  • Separated IT service accounts from shared logins; keypair-per-role pattern established
  • AdGuard Home DNS, WireGuard VPN, Portainer, and Uptime Kuma all production-ready
  • archon-docs launched with DLP sanitization pipeline; every doc scrubbed before publish
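
As an illustration of what a scrub step in such a sanitization pipeline might look like, here is a minimal sketch. The patterns and the `scrub` function are assumptions made for this example; the actual archon-docs rules are not described in this roadmap. The `REDACTED` placeholder mirrors the one used elsewhere on this page.

```python
import re

# Hypothetical rules for illustration only.
# Matches RFC 1918 private IPv4 addresses (10/8, 172.16/12, 192.168/16).
PRIVATE_IP = re.compile(
    r"\b(?:10\.\d{1,3}|192\.168|172\.(?:1[6-9]|2\d|3[01]))\.\d{1,3}\.\d{1,3}\b"
)
# Matches simple email addresses such as user@example.com.
EMAIL = re.compile(r"\b[\w.+-]+@[\w-]+(?:\.[\w-]+)+\b")

RULES = [PRIVATE_IP, EMAIL]


def scrub(text: str, placeholder: str = "REDACTED") -> str:
    """Replace every match of every rule with the placeholder."""
    for pattern in RULES:
        text = pattern.sub(placeholder, text)
    return text
```

Regex rules of this kind are easy to audit but only catch what they are written for, so a real pipeline would likely pair them with an allowlist or a review step.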

Phase 3 — Cloud-Native & TLS

  • k3s cluster provisioned: 1 control plane, 3 workers; all nodes Ready
  • AWX deployed for Ansible job orchestration; job templates wired to pipelines
  • External Secrets Operator wired to Infisical; ClusterSecretStore READY with 5-minute TTL
  • cert-manager with Let's Encrypt DNS-01 validation; automatic certificate renewal live
  • Traefik IngressRoutes with TLS enforced across all platform services
  • kube-prometheus-stack: Prometheus, Alertmanager, and Grafana; cluster observability operational
  • Loki log aggregation in k3s; IngressRoute with TLS; log collector ADR pending
  • Grafana Logs Drilldown plugin installed; datasource live
  • Headlamp Kubernetes UI deployed; read-only cluster introspection without kubectl
  • CanEast AI Node scrape removed from additionalScrapeConfigs; stale targets cleared
  • ccagnt agent fleet deployed: 21 specialized Claude agents with proactive invocation and model tier discipline

Phase 4 — AI & LLM

  • OpenClaw LLM gateway deployed; policy enforcement layer for production AI workloads
  • MCP integration live: Claude Code connected to k3s, Cloudflare, Azure DevOps, Infisical
  • Ask Archy public assistant: Cloudflare Worker + Groq API; Turnstile bot protection
  • Frigate 0.14.1 on k3s with Intel UHD 630 GPU allocation via intel-gpu-plugin
  • qwen3-vl:4b vision model on Ollama; 5-tier inference pipeline for cam01 event detection
  • cam01 capture pipeline: Raspberry Pi 5 camera node; event-driven MQTT publish and JSONL logging
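
The JSONL logging half of the cam01 capture pipeline could look something like the sketch below. The event fields (`ts`, `camera`, `label`, `score`) and both function names are hypothetical, and the MQTT publish step is deliberately omitted so the example needs only the standard library.

```python
import json
from datetime import datetime, timezone
from pathlib import Path


def append_event(log_path: Path, camera: str, label: str, score: float) -> dict:
    """Append one detection event to a JSONL file and return it."""
    event = {
        "ts": datetime.now(timezone.utc).isoformat(),  # UTC timestamp
        "camera": camera,
        "label": label,
        "score": round(score, 3),
    }
    # One JSON object per line; append mode keeps earlier events intact.
    with log_path.open("a", encoding="utf-8") as fh:
        fh.write(json.dumps(event) + "\n")
    return event


def read_events(log_path: Path) -> list:
    """Read all events back, skipping blank lines."""
    with log_path.open(encoding="utf-8") as fh:
        return [json.loads(line) for line in fh if line.strip()]
```

JSONL suits an event-driven capture node because each write is a single appended line, so a crash mid-run loses at most the event being written.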

Phase 5 — OT Hardening

  • ansible-ot-svc-account service account separated from IT account: UID REDACTED, Ed25519 keypair, Infisical-stored key
  • SSH port isolation enforced: OT nodes on dedicated port, IT nodes on port REDACTED
  • OT Ansible inventory separated from IT; Phase 2 migration to archon-apps repo planned
  • Log collector ADR: Promtail vs Fluent Bit vs Alloy decision; Kibana/ELK retirement assessment
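
One way to keep the SSH port isolation rule from drifting is a small check over the inventory data. The sketch below is only an assumption about how that might be done: the `hosts` structure, the zone names, and the `port_violations` helper are invented for illustration, and the port numbers are passed in as parameters because the real values are redacted here.

```python
def port_violations(hosts: dict, ot_port: int, it_port: int) -> list:
    """List hosts whose SSH port does not match their zone.

    hosts maps a host name to {"zone": "ot" or "it", "ssh_port": int}.
    """
    if ot_port == it_port:
        raise ValueError("OT and IT must use different SSH ports")
    expected = {"ot": ot_port, "it": it_port}
    problems = []
    # Sort by host name so the report order is stable.
    for name, meta in sorted(hosts.items()):
        want = expected.get(meta["zone"])
        if want is None:
            problems.append(f"{name}: unknown zone {meta['zone']!r}")
        elif meta["ssh_port"] != want:
            problems.append(
                f"{name}: zone {meta['zone']} expects port {want}, "
                f"found {meta['ssh_port']}"
            )
    return problems
```

An empty result means every host is on the port its zone requires; a check like this could run in CI before any inventory change is merged.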

Phase 6 — Cloud Expansion

  • Terraform modules for Azure and Cloudflare; git-driven lifecycle management
  • ERP/CMMS deployment for manufacturing and maintenance workflow integration
  • Manufacturing test loop: OT sensor data driving automated alerting and work order creation

Current Sprint (Sprint 5)

WI | Title | Status
WI-387 | Author log collector ADR (Promtail vs Fluent Bit vs Alloy) | Backlog
WI-383 | Store AdGuard admin password in Infisical | Backlog
WI-374 | Homepage config-as-code in git | In Progress
WI-373 | NPM proxy host for infisical.peries.ca | Backlog
WI-353 | kube-prometheus-stack rename to archon-prom | Blocked (15d TSDB loss)