Pre-Commit Hooks: archon-docs
Pre-commit hooks catch DLP (data loss prevention) violations locally before a push reaches the CI pipeline.
What the hook does
The hook runs sanitize.py, then verify-sanitization.py, before every commit to archon-docs. If the DLP scan finds leaked IPs, UUIDs, node names, or credentials, the commit is blocked.
Install
Run once from the root of the archon-docs repo:
cat > .git/hooks/pre-commit << 'EOF'
#!/bin/bash
set -e
# DLP pre-commit hook for archon-docs
# Runs sanitize.py then verify-sanitization.py.
# Commit is blocked if any internal strings are detected.
echo "[pre-commit] Running sanitization DLP check..."
if ! python3 sanitize.py > /dev/null 2>&1; then
  echo "[pre-commit] FAIL: sanitize.py exited non-zero"
  exit 1
fi
if ! python3 verify-sanitization.py; then
  echo "[pre-commit] FAIL: DLP leak detected, commit blocked"
  exit 1
fi
echo "[pre-commit] PASS: no DLP leaks detected"
EOF
chmod +x .git/hooks/pre-commit
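Because .git/hooks/ is not part of the tracked tree, the install step has to be repeated in every clone. A minimal sketch of a check that the hook is in place and executable follows; the helper name `hook_is_installed` is hypothetical and not part of the archon-docs scripts.

```python
import os
from pathlib import Path


def hook_is_installed(repo_root="."):
    """Return True if .git/hooks/pre-commit exists and is executable.

    Hypothetical helper for illustration; it only inspects the file,
    it does not run the hook.
    """
    hook = Path(repo_root) / ".git" / "hooks" / "pre-commit"
    return hook.is_file() and os.access(hook, os.X_OK)


if __name__ == "__main__":
    print("installed" if hook_is_installed() else "missing or not executable")
```

Run it from the repo root after the install command; a "missing or not executable" result usually means the heredoc or the chmod step was skipped.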
What is checked
The hook checks the same LEAK_PATTERNS as the ADO pipeline gate:
| Pattern | Example |
|---|---|
| RFC1918 IPs | 192.168.2.x, 10.x.x.x |
| Node names | caneast-site1-node2, caneast-site1-mqtt1 |
| Hyphenated UUIDs | Infisical project/identity IDs |
| Personal identifiers | operator, caneast-site1-ai1, Ben Peries |
| Hardware identifiers | CanEast AI Node, <device-model>, REDACTED |
| Service accounts | ansible-svc-account |
| Credentials | TOKEN=REDACTED, KEY=REDACTED, password: |
| Network identifiers | dmz-bridge-0, lan-bridge, port REDACTED |
| MQTT topics with internal prefix | caneast/ot |
| ADO org URL | dev.azure.com/caneast-platform |
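A pattern scan of this kind can be sketched with Python's `re` module. The regular expressions below are assumptions written for illustration, covering only three rows of the table; the actual LEAK_PATTERNS definitions live in the repo's scripts and may differ. The function name `find_leaks` is also hypothetical.

```python
import re

# Hypothetical subset of LEAK_PATTERNS, for illustration only.
LEAK_PATTERNS = {
    # RFC 1918 private ranges: 10/8, 172.16/12, 192.168/16
    "rfc1918_ip": re.compile(
        r"\b(?:10\.\d{1,3}\.\d{1,3}\.\d{1,3}"
        r"|192\.168\.\d{1,3}\.\d{1,3}"
        r"|172\.(?:1[6-9]|2\d|3[01])\.\d{1,3}\.\d{1,3})\b"
    ),
    # Canonical 8-4-4-4-12 hexadecimal UUID form
    "hyphenated_uuid": re.compile(
        r"\b[0-9a-fA-F]{8}-[0-9a-fA-F]{4}-[0-9a-fA-F]{4}"
        r"-[0-9a-fA-F]{4}-[0-9a-fA-F]{12}\b"
    ),
    # Literal ADO org URL from the table above
    "ado_org_url": re.compile(r"dev\.azure\.com/caneast-platform"),
}


def find_leaks(text):
    """Return (pattern_name, matched_text) pairs for every hit in text."""
    hits = []
    for name, pattern in LEAK_PATTERNS.items():
        for match in pattern.finditer(text):
            hits.append((name, match.group(0)))
    return hits
```

A verifier built this way would treat any non-empty result as a failure and exit non-zero, which is what causes the hook to block the commit.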
Excluded files
sanitize.py never copies these to build/docs/:
- docs/_index.md: machine context index
- docs/_context.md: AI portability file with full inventory
- docs/internal/: sensitive runbooks and identity details
The hook verifies these are absent from build/docs/ after sanitization runs.
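The absence check can be sketched as follows. This is an illustration under assumptions, not the code in verify-sanitization.py: it assumes each excluded path keeps the same relative name under build/docs/ that it has under docs/, and the names `EXCLUDED` and `find_excluded_in_build` are hypothetical.

```python
from pathlib import Path

# Excluded paths from the list above, relative to docs/.
# Assumption: they would appear under build/docs/ with the same names.
EXCLUDED = ["_index.md", "_context.md", "internal"]


def find_excluded_in_build(build_docs="build/docs"):
    """Return the excluded paths that are present under build_docs."""
    root = Path(build_docs)
    return [name for name in EXCLUDED if (root / name).exists()]


if __name__ == "__main__":
    present = find_excluded_in_build()
    if present:
        raise SystemExit(f"excluded files found in build output: {present}")
```

An empty list means the sanitized output is clean with respect to the exclusion rules; anything else should fail the check.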
Relationship to CI pipeline
The ADO pipeline runs the same checks on every push to main. The pre-commit hook is a local fast-fail check; it does not replace the pipeline gate. Both must pass.
See ADR-0018 — Sanitization Verification Strategy and ADR-0031 — Public Docs Security Controls.