LLM Operations Architecture
Overview
The Archon Platform runs a self-hosted LLM operations layer that provides multi-backend model routing, a public documentation chat interface, an AI agent plane for infrastructure operations, and cluster intelligence via k8sgpt. These components are designed to be independent -- each solves a distinct problem and can be updated or replaced without affecting the others.
LLM Gateway: OpenClaw
OpenClaw is the platform LLM gateway. It provides a single API endpoint that routes requests across multiple model backends -- local (Ollama), cloud-hosted (provider APIs), and any future backends. Consumers of OpenClaw do not need to know which backend serves a given request; routing is configuration-driven.
OpenClaw enables the platform to use local models for cost-sensitive or privacy-sensitive workloads while retaining access to cloud models for tasks that require larger context windows or higher capability. The OpenClaw adoption decision is documented in LLMOPS-0001.
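As an illustration of what "routing is configuration-driven" can mean in practice, the sketch below resolves a workload tag to a backend purely from a configuration object. It is not OpenClaw's actual configuration format or API: the `RoutingConfig` and `RouteRule` shapes, the workload tags, the backend names, the URLs, and `selectBackend` are all hypothetical.

```typescript
// Hypothetical sketch of configuration-driven routing.
// None of these names or shapes come from OpenClaw itself.
type Backend = { name: string; kind: "local" | "cloud"; baseUrl: string };

type RouteRule = {
  workload: string; // tag supplied by the caller, e.g. "privacy-sensitive"
  backend: string;  // name of the backend that should serve this workload
};

type RoutingConfig = {
  backends: Backend[];
  rules: RouteRule[];
  defaultBackend: string; // used when no rule matches
};

// Assumed example configuration; URLs and tags are placeholders.
const config: RoutingConfig = {
  backends: [
    { name: "ollama", kind: "local", baseUrl: "http://localhost:11434" },
    { name: "cloud", kind: "cloud", baseUrl: "https://llm-provider.example" },
  ],
  rules: [
    { workload: "privacy-sensitive", backend: "ollama" },
    { workload: "cost-sensitive", backend: "ollama" },
    { workload: "long-context", backend: "cloud" },
  ],
  defaultBackend: "ollama",
};

// Resolve a workload tag to a backend using only the configuration.
function selectBackend(cfg: RoutingConfig, workload: string): Backend {
  const rule = cfg.rules.find((r) => r.workload === workload);
  const name = rule ? rule.backend : cfg.defaultBackend;
  const backend = cfg.backends.find((b) => b.name === name);
  if (!backend) {
    throw new Error(`routing config references unknown backend: ${name}`);
  }
  return backend;
}
```

In this sketch, moving a workload from a local to a cloud backend is a one-line change to `rules`; callers of `selectBackend` are unaffected, which is the decoupling the gateway is meant to provide.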
Documentation Chat: Ask Archy
Ask Archy is a chat widget embedded in the public documentation site (peries.ca). It allows visitors to ask questions about the platform and receive answers grounded in the documentation content. Ask Archy is implemented as a Cloudflare Worker that calls a configured LLM backend with the platform documentation as context.
Ask Archy uses Cloudflare Turnstile for bot protection. Its secrets (API key, Turnstile secret) live in the Cloudflare Worker environment, not in the MkDocs build output. No platform secrets are embedded in the static site.
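The request flow described above (verify the Turnstile token, then call the LLM backend with documentation as context) might look roughly like the following. This is a sketch under stated assumptions, not the Ask Archy source: `Env`, `ChatRequest`, `handleChat`, the `LLM_ENDPOINT` variable, and the LLM request and response shapes are hypothetical. The siteverify URL and its `secret` and `response` fields follow Cloudflare's documented Turnstile verification API.

```typescript
// Sketch only: Env, ChatRequest, handleChat and the LLM payload are hypothetical.
type Env = {
  TURNSTILE_SECRET: string; // Worker secret, never shipped in the static site
  LLM_API_KEY: string;      // Worker secret for the configured LLM backend
  LLM_ENDPOINT: string;     // assumed: URL of the configured LLM backend
};

type ChatRequest = { question: string; turnstileToken: string };

// Minimal fetch signature so the sketch can be exercised with a stub.
type FetchLike = (
  url: string,
  init: { method: string; headers: Record<string, string>; body: string },
) => Promise<{ ok: boolean; json(): Promise<unknown> }>;

const SITEVERIFY_URL =
  "https://challenges.cloudflare.com/turnstile/v0/siteverify";

// Ask Cloudflare whether the visitor's Turnstile token is valid.
async function verifyTurnstile(
  token: string,
  env: Env,
  fetchImpl: FetchLike,
): Promise<boolean> {
  const res = await fetchImpl(SITEVERIFY_URL, {
    method: "POST",
    headers: { "content-type": "application/json" },
    body: JSON.stringify({ secret: env.TURNSTILE_SECRET, response: token }),
  });
  const outcome = (await res.json()) as { success?: boolean };
  return outcome.success === true;
}

// Returns a status and body; a real Worker would wrap these in a Response.
async function handleChat(
  req: ChatRequest,
  env: Env,
  docsContext: string,
  fetchImpl: FetchLike,
): Promise<{ status: number; body: string }> {
  if (!(await verifyTurnstile(req.turnstileToken, env, fetchImpl))) {
    return { status: 403, body: "bot check failed" };
  }
  // Assumed request shape; adjust to the API of the configured backend.
  const res = await fetchImpl(env.LLM_ENDPOINT, {
    method: "POST",
    headers: {
      "content-type": "application/json",
      authorization: `Bearer ${env.LLM_API_KEY}`,
    },
    body: JSON.stringify({ context: docsContext, question: req.question }),
  });
  if (!res.ok) return { status: 502, body: "LLM backend error" };
  const data = (await res.json()) as { answer?: string };
  return { status: 200, body: data.answer ?? "" };
}
```

Both secrets are read from `env` at request time, so nothing sensitive needs to appear in the MkDocs build output. Passing `fetchImpl` explicitly is a choice made for this sketch so it can be tested with a stub; in a Worker the global `fetch` would be passed in.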
Agent Plane
The agent plane provides AI-assisted infrastructure operations. An orchestrator agent interprets high-level tasks and delegates to specialised executor agents with narrowly scoped tools and personas -- each covering a specific platform domain (Ansible, Terraform, Kubernetes, OT firmware, pipeline review, and others).
The agent plane architecture -- including the orchestrator/executor split, agent persona model, and constraints on agent tool scope -- is documented in LLMOPS-0002.
Agents are declarative, CLAUDE.md-backed definitions that are reviewed and versioned like code. No agent has unrestricted tool access; each agent's scope is limited to the minimum required for its domain.
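One way to picture the orchestrator/executor split and per-agent tool scoping is an explicit allow-list checked at delegation time. The sketch below is illustrative only and is not taken from LLMOPS-0002: the `AgentDefinition` shape, the agent names, the tool names, and the `delegate` function are all hypothetical.

```typescript
// Hypothetical sketch of scoped delegation; names are illustrative only.
type AgentDefinition = {
  name: string;
  domain: string;
  // Explicit allow-list: any tool not listed here is denied.
  allowedTools: ReadonlySet<string>;
};

const executors: AgentDefinition[] = [
  {
    name: "terraform-executor",
    domain: "terraform",
    allowedTools: new Set(["terraform-plan"]),
  },
  {
    name: "k8s-executor",
    domain: "kubernetes",
    allowedTools: new Set(["kubectl-get", "kubectl-describe"]),
  },
];

type Task = { domain: string; tool: string };

// The orchestrator selects an executor by domain and refuses any tool
// outside that executor's allow-list.
function delegate(task: Task, agents: AgentDefinition[]): string {
  const agent = agents.find((a) => a.domain === task.domain);
  if (!agent) {
    throw new Error(`no executor for domain: ${task.domain}`);
  }
  if (!agent.allowedTools.has(task.tool)) {
    throw new Error(`${agent.name} is not permitted to use ${task.tool}`);
  }
  return agent.name;
}
```

Denying by default keeps each executor's reach limited to its own domain: a Kubernetes task that asks for `terraform-plan` fails at delegation instead of relying on the executor to decline.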
Cluster Intelligence: k8sgpt
k8sgpt analyses Kubernetes cluster state and produces natural-language explanations of degraded conditions (failing pods, resource pressure, misconfigured objects). It is deployed as an operator in the cluster and is accessible via the Grafana integration or the CLI.
k8sgpt augments human cluster review -- it is not a remediation engine. All remediation is performed by human operators or by the agent plane under explicit instruction. k8sgpt's role is accelerated diagnosis, not autonomous action.
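To make the diagnostic-only role concrete, the sketch below turns an analysis report into plain lines for a human reviewer and does nothing else. The `Finding` and `AnalysisReport` shapes are assumptions modelled on k8sgpt's JSON analysis output and should be checked against the deployed version; `summarise` is a hypothetical helper, not part of k8sgpt.

```typescript
// Assumed shapes, modelled on k8sgpt's JSON analysis output.
// Verify field names against the k8sgpt version running in the cluster.
type Finding = {
  kind: string;              // e.g. "Pod"
  name: string;              // e.g. "default/api-7c9f"
  error: { Text: string }[]; // raw error messages
  details: string;           // natural-language explanation, if requested
};

type AnalysisReport = { problems: number; results: Finding[] | null };

// Produce one review line per finding. Deliberately read-only:
// nothing in this function changes cluster state.
function summarise(report: AnalysisReport): string[] {
  return (report.results ?? []).map((finding) => {
    const errors = finding.error.map((e) => e.Text).join("; ");
    return `${finding.kind} ${finding.name}: ${errors}`;
  });
}
```

The output is intended as input to a human operator, or to the agent plane under explicit instruction; any remediation happens outside this path.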
Key Properties
- OpenClaw decouples LLM consumers from backend selection -- routing is configuration-driven
- Ask Archy is a public surface backed by platform documentation, not a general chatbot
- The agent plane uses orchestrator/executor separation -- no single agent has broad scope
- k8sgpt is diagnostic only -- no autonomous remediation
- All LLM-adjacent secrets live in Infisical or the Cloudflare Worker environment, never in source files