IT Node Onboarding Golden Path
Four-step process for bringing a new physical node into the Archon fleet.
Step 1: OS Install (autoinstall)
Boot the node from the Archon autoinstall USB. The cloud-init user-data:
- Installs Ubuntu (current fleet version — see Inventory)
- Creates the operator local admin account with your SSH key
- Sets hostname, locale, timezone
- Expands LVM to full disk
After the install, the node reboots and is reachable at its DHCP address. Reserve a static IP on the Bell Giga Hub before proceeding.
Step 2: Bootstrap (bootstrap.yml)
Runs as operator with password auth. Creates the ansible-svc-account service account
so AWX can reach the node for all subsequent automation.
Add node to inventory
In ansible/inventories/it/hosts.yml, add the node to new_nodes:
    new_nodes:
      hosts:
        <hostname>:
          ansible_host: <ip>
          ansible_port: 22
          ansible_user: operator
          ansible_python_interpreter: /usr/bin/python3
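
To avoid indentation slips when typing the entry by hand, the stanza can be generated and then pasted into hosts.yml. This is a minimal sketch: `new_node_stanza` is a hypothetical helper, not part of the repository, and `node01` / `192.0.2.10` are placeholder values.

```shell
#!/bin/sh
# Print a new_nodes inventory stanza for one host.
# new_node_stanza is a hypothetical helper; the field values mirror the
# entry shown above.
new_node_stanza() {
  node_name=$1
  node_ip=$2
  cat <<EOF
new_nodes:
  hosts:
    ${node_name}:
      ansible_host: ${node_ip}
      ansible_port: 22
      ansible_user: operator
      ansible_python_interpreter: /usr/bin/python3
EOF
}

# Example: review the output, then merge it into hosts.yml by hand.
new_node_stanza node01 192.0.2.10
```

If new_nodes already exists in hosts.yml, only the host entry (from the hostname line down) needs to be merged under the existing hosts key.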
Run bootstrap
    ansible-playbook -i ansible/inventories/it/hosts.yml \
      ansible/playbooks/it/bootstrap.yml \
      --limit new_nodes \
      --ask-pass --ask-become-pass
What it does:
- Creates ansible-svc-account user (/bin/bash shell, home directory)
- Creates /home/ansible-svc-account/.ssh with mode 0700
- Deploys ~/.ssh/ansible-svc-account.pub from the control node as authorized_keys
- Writes /etc/sudoers.d/ansible-svc-account (NOPASSWD:ALL, validated by visudo)
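
The four results listed above can be spot-checked on the node before moving on. The following is a sketch, not part of the bootstrap playbook: `has_mode`, `user_exists`, and `report` are hypothetical helpers, and the script assumes it runs as root on the new node because the .ssh directory is readable only by its owner.

```shell
#!/bin/sh
# Hypothetical post-bootstrap checks. Run on the new node as root
# (for example: sudo sh check.sh).

# Succeeds when the path exists and its octal mode equals the expected value.
# Relies on GNU stat, which Ubuntu ships.
has_mode() {
  [ "$(stat -c '%a' "$1" 2>/dev/null)" = "$2" ]
}

# Succeeds when the account exists in the passwd database.
user_exists() {
  id -u "$1" >/dev/null 2>&1
}

# Runs a check command and prints ok/FAIL followed by a label.
report() {
  label=$1
  shift
  if "$@"; then
    echo "ok:   $label"
  else
    echo "FAIL: $label"
  fi
}

svc=ansible-svc-account

report "user $svc exists"             user_exists "$svc"
report "/home/$svc/.ssh has mode 700" has_mode "/home/$svc/.ssh" 700
report "authorized_keys is non-empty" test -s "/home/$svc/.ssh/authorized_keys"
report "sudoers drop-in exists"       test -f "/etc/sudoers.d/$svc"
```

To re-run the sudoers syntax check by hand, `visudo -cf /etc/sudoers.d/ansible-svc-account` (as root) validates the drop-in without editing it.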
Move node to permanent group
After bootstrap succeeds, move the host from new_nodes to its target group
(nodes, docker_hosts, etc.) and update ansible_user to ansible-svc-account.
Step 3: Baseline (AWX: it-baseline-all)
Run the it-baseline-all AWX job template, scoped to the new node.
What it applies (roles/common + security roles):
- Hostname, timezone, NTP
- Base packages
- SSH hardening (ssh_hardening role, which moves SSH to port REDACTED)
- fail2ban, ufw
- node_exporter (Prometheus metrics)
- Power management (sleep/suspend targets masked, lid-switch actions disabled)
After this run:
- SSH port is now 2222
- Update ansible_port in hosts.yml for the node
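
Before editing hosts.yml, it can help to confirm from the control node which port actually answers. The sketch below assumes that sshd stops listening on port 22 once the baseline has run; `port_open` and `ssh_port_status` are hypothetical helpers, and the probe uses bash's /dev/tcp feature together with coreutils `timeout`.

```shell
#!/bin/sh
# port_open HOST PORT: succeeds if a TCP connection completes within 3 seconds.
# Hypothetical helper; needs bash (for /dev/tcp) and coreutils timeout.
port_open() {
  timeout 3 bash -c 'exec 3<>"/dev/tcp/$0/$1"' "$1" "$2" 2>/dev/null
}

# ssh_port_status HOST: prints which SSH port answers.
ssh_port_status() {
  if port_open "$1" 2222; then
    echo "moved"        # the hardened port answers
  elif port_open "$1" 22; then
    echo "not-moved"    # still on the default port
  else
    echo "unreachable"  # neither port answers
  fi
}

# Example call (192.0.2.10 is a placeholder address):
# ssh_port_status 192.0.2.10
```

A result of `moved` means ansible_port can be set to 2222; `not-moved` suggests the baseline job has not applied the SSH hardening to this node yet.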
Step 4: Services (AWX: site.yml)
Run the site.yml AWX job template for the node's role:
| Role | Playbook | Adds |
|---|---|---|
| Docker host | site.yml | Docker, Portainer |
| KVM host | kvm.yml | KVM/libvirt bridges |
| Monitoring | monitoring.yml | Grafana, InfluxDB, Telegraf |
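
If the role-to-playbook mapping is needed in a script, the table can be expressed as a small lookup. This is only an illustration: `playbook_for_role` and the short role keys (`docker`, `kvm`, `monitoring`) are hypothetical names chosen here, not identifiers from the repository.

```shell
#!/bin/sh
# playbook_for_role ROLE: prints the playbook listed for that role in the
# table above. The short role keys are hypothetical.
playbook_for_role() {
  case $1 in
    docker)     echo site.yml ;;
    kvm)        echo kvm.yml ;;
    monitoring) echo monitoring.yml ;;
    *)
      echo "unknown role: $1" >&2
      return 1
      ;;
  esac
}

# Example: look up the playbook for a KVM host.
playbook_for_role kvm
```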
Post-onboarding checklist
- [ ] Static IP reserved (Bell Giga Hub — or OPNsense once active)
- [ ] hosts.yml updated: correct group, ansible_port: 2222, ansible_user: ansible-svc-account
- [ ] Node removed from new_nodes group
- [ ] Device page created in docs/devices/<hostname>.md
- [ ] docs/reference/inventory.md updated
- [ ] docs/reference/network.md DNS rewrite added (<hostname>.home)
- [ ] k3s worker join (if applicable): kubectl get nodes shows Ready
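
The documentation items in the checklist lend themselves to a quick scripted check. The sketch below is an assumption-laden helper, not an existing tool: `docs_checklist` is a hypothetical name, and it assumes that inventory.md mentions the hostname literally and that network.md contains the string `<hostname>.home`. It does not cover the hosts.yml, static IP, or k3s items.

```shell
#!/bin/sh
# docs_checklist HOST [REPO_ROOT]: reports missing documentation items for a
# node. Prints one line per gap and returns non-zero if any are found.
# Hypothetical helper; the file paths come from the checklist above.
docs_checklist() {
  host=$1
  root=${2:-.}
  rc=0

  if [ ! -f "$root/docs/devices/$host.md" ]; then
    echo "missing: docs/devices/$host.md"
    rc=1
  fi

  # -F treats the hostname as a fixed string, so dots are not wildcards.
  if ! grep -qF -- "$host" "$root/docs/reference/inventory.md" 2>/dev/null; then
    echo "not listed: docs/reference/inventory.md"
    rc=1
  fi

  if ! grep -qF -- "$host.home" "$root/docs/reference/network.md" 2>/dev/null; then
    echo "no DNS rewrite: docs/reference/network.md"
    rc=1
  fi

  return "$rc"
}

# Example call from the repository root (node01 is a placeholder hostname):
# docs_checklist node01
```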
Reference
- Inventory: ansible/inventories/it/hosts.yml
- Bootstrap playbook: ansible/playbooks/it/bootstrap.yml
- Baseline playbook: ansible/playbooks/it/baseline.yml
- Site playbook: ansible/playbooks/it/site.yml
- Service account variable: roles/common/defaults/main.yml → common_ansible_user