OT-0005 OT Grafana Dashboard Taxonomy
| Field | Value |
|---|---|
| ID | OT-0005 |
| Date | 2026-05-02 |
| Status | Proposed |
| Deciders | Ben Peries |
| Migrated from | OT-0009 (2026-04-14) |
Context
The OT Grafana instance (on caneast-site1-node2:3002) hosts dashboards for OT telemetry,
alarms, and historian data. As the number of sensor nodes and metric streams grows, an
unstructured dashboard list becomes operationally difficult to navigate. A consistent
folder hierarchy and UID scheme is needed before the dashboard count exceeds ~10.
Decision
Folder Hierarchy (5 tiers)
| Folder | UID | Audience | Contents |
|---|---|---|---|
| `archon-platform` | `archon-platform` | All | Platform home; links to OT and IT Grafana |
| `ot-operations` | `ot-operations` | Operators | Live sensor status, active alarms |
| `ot-engineering` | `ot-engineering` | Engineers | Calibration, debug, raw signal views |
| `ot-infrastructure` | `ot-infrastructure` | Platform | Node health, MQTT broker, InfluxDB |
| `ot-historian` | `ot-historian` | All | Long-term trends, seasonal comparisons |
The `archon-platform` folder contains the org home dashboard (`ot-home-archon`), which is
the default landing page for the OT Grafana instance.
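As a rough illustration of how the five folders could be provisioned from code, the sketch below builds one request body per folder for Grafana's `POST /api/folders` endpoint. The folder UIDs and audiences come from the table; using the UID as the display title is an assumption, since this ADR does not specify titles, and `FOLDERS` and `folder_payloads` are hypothetical names.

```python
import json

# Folder UIDs and audiences copied from the folder hierarchy table.
# Assumption: the display title equals the UID (the ADR does not define titles).
FOLDERS = [
    ("archon-platform", "All"),
    ("ot-operations", "Operators"),
    ("ot-engineering", "Engineers"),
    ("ot-infrastructure", "Platform"),
    ("ot-historian", "All"),
]


def folder_payloads(folders=FOLDERS):
    """Build one JSON-serialisable body per folder for POST /api/folders."""
    return [{"uid": uid, "title": uid} for uid, _audience in folders]


if __name__ == "__main__":
    # Print the bodies; sending them (with authentication) is left to the caller.
    for body in folder_payloads():
        print(json.dumps(body))
```

Keeping the list in one place means the folder count stays at five by construction; adding a node never touches it.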
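UID Pattern
Dashboard UIDs follow the pattern `ot-{tier}-{node}-{purpose}`, where `{tier}` is the folder name without its `ot-` prefix.
Examples: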
| UID | Dashboard |
|---|---|
| `ot-operations-snr01-sumppit` | Live sump-pit status (snr01) |
| `ot-historian-snr01-level-trend` | Annual level trend (snr01) |
| `ot-engineering-snr01-raw` | Raw signal debug (snr01) |
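The `ot-{tier}-{node}-{purpose}` pattern required under Consequences can be checked mechanically. The following is a minimal sketch, not an agreed linter: it assumes the node is a single lowercase alphanumeric token (such as `snr01`), that the purpose may contain hyphens (such as `level-trend`), and that the valid tiers are the four `ot-*` folders. `parse_uid` is a hypothetical helper name. The 40-character cap reflects Grafana's limit on dashboard UID length.

```python
import re

# Tiers are the four ot-* folders. The home dashboard (ot-home-archon) lives in
# archon-platform and is treated here as an exception outside the pattern.
TIERS = ("operations", "engineering", "infrastructure", "historian")

# Assumptions: node has no hyphens; purpose may be hyphenated.
UID_RE = re.compile(
    r"^ot-(?P<tier>" + "|".join(TIERS) + r")"
    r"-(?P<node>[a-z0-9]+)"
    r"-(?P<purpose>[a-z0-9]+(?:-[a-z0-9]+)*)$"
)

MAX_UID_LEN = 40  # Grafana's maximum dashboard UID length


def parse_uid(uid):
    """Return tier, node, purpose, and target folder UID, or None if invalid."""
    if len(uid) > MAX_UID_LEN:
        return None
    match = UID_RE.match(uid)
    if match is None:
        return None
    parts = match.groupdict()
    parts["folder_uid"] = "ot-" + parts["tier"]  # folder the dashboard belongs in
    return parts


if __name__ == "__main__":
    for uid in ("ot-operations-snr01-sumppit", "ot-home-archon"):
        print(uid, "->", parse_uid(uid))
```

Because the tier is recoverable from the UID, the same check can confirm that a dashboard sits in the folder its UID implies.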
Panel Pattern: Event-Driven Data
OT metrics are sparse (state-change events, not fixed-interval polling). The correct Grafana panel pattern for event-driven data is:
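```flux
// Combines: last known value before the window + all events within the window.
// lastBefore (lookback start), windowStart, and windowEnd are placeholders; in a
// Grafana panel the window is typically v.timeRangeStart to v.timeRangeStop.
union(
    tables: [
        // Seed point: bound the range at windowStart so last() cannot pick a later event
        from(bucket: "homelab") |> range(start: lastBefore, stop: windowStart) |> last(),
        // All events inside the visible window
        from(bucket: "homelab") |> range(start: windowStart, stop: windowEnd),
    ],
)
    |> sort(columns: ["_time"]) // union() does not preserve row order
```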
Use stepAfter interpolation (the "Step after" line interpolation option of a time series panel) for state-change metrics (flood, door_state) so the panel holds the last known value until the next event rather than drawing a sloped line between events.
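To make the intent of the query and the step-after rule concrete, here is a small Python model of the same logic. It is only an illustration of the semantics, not code that runs in Grafana; `window_with_seed`, `step_after`, and the sample events are hypothetical.

```python
from bisect import bisect_left, bisect_right


def window_with_seed(events, window_start, window_end):
    """Mimic the union query: last event before the window plus events inside it.

    events is a list of (timestamp, value) tuples sorted by timestamp.
    The window is [window_start, window_end), matching Flux range() bounds.
    """
    times = [t for t, _ in events]
    lo = bisect_left(times, window_start)  # first event at or after window_start
    hi = bisect_left(times, window_end)    # first event at or after window_end
    seed = events[lo - 1:lo] if lo > 0 else []
    return seed + events[lo:hi]


def step_after(points, t):
    """Value held at time t: the latest point at or before t, else None."""
    times = [pt for pt, _ in points]
    i = bisect_right(times, t)
    return points[i - 1][1] if i > 0 else None


# Hypothetical door_state events: (epoch seconds, state)
events = [(10, 0), (50, 1), (120, 0), (300, 1)]
points = window_with_seed(events, 100, 200)
print(points)                        # [(50, 1), (120, 0)]
print(step_after(points, 100))       # 1, held from the seed point
print(step_after(events[2:3], 100))  # None: without the seed the panel shows a gap
```

The last line shows why the seed point matters: without it, the state at the left edge of the window is unknown until the first in-window event arrives.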
Options Considered
Option A: Flat dashboard list (rejected)
No folder structure. Works below ~10 dashboards; breaks as node count grows. Navigation requires scrolling a flat list with no audience segmentation.
Option B: Per-node folders (rejected)
One folder per sensor node (snr01/, snr02/). Cross-node comparison dashboards
have no natural home, and the folder count grows as O(nodes) rather than O(roles).
Option C: Role-based 5-tier hierarchy (selected)
Aligns with ISA-95 role separation (operators vs. engineers vs. platform admins).
Folder count is fixed at 5 regardless of node count. Cross-node dashboards (e.g.,
all-zone alarm summary) belong naturally in ot-operations.
Rationale
The 5-tier hierarchy mirrors enterprise SCADA display conventions without requiring formal ISA-101 compliance. Separating operator, engineering, and historian views prevents dashboard clutter in the primary operations view. The UID pattern makes dashboards addressable by code (Ansible, Terraform) without relying on numeric IDs that change on import/export cycles.
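As one example of what addressing a dashboard by UID looks like, the sketch below builds the URL for Grafana's `GET /api/dashboards/uid/:uid` endpoint. The `http` scheme is an assumption (the ADR gives only the host and port), and `dashboard_api_url` is a hypothetical helper; authentication and the request itself are omitted.

```python
from urllib.parse import quote, urljoin


def dashboard_api_url(base_url, uid):
    """Return the Grafana HTTP API URL that fetches a dashboard by its UID."""
    base = base_url.rstrip("/") + "/"
    return urljoin(base, "api/dashboards/uid/" + quote(uid, safe=""))


if __name__ == "__main__":
    # Assumption: the OT Grafana instance is served over plain http.
    print(dashboard_api_url("http://caneast-site1-node2:3002",
                            "ot-operations-snr01-sumppit"))
```

The URL stays the same across export and re-import because it depends only on the UID, not on the instance-local numeric ID.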
The event-driven panel pattern avoids the "gaps between points" problem that occurs when standard Grafana time-series panels assume fixed-interval data.
Consequences
- All new OT dashboards must be placed in one of the five folders.
- Dashboard UIDs must follow the `ot-{tier}-{node}-{purpose}` pattern.
- The `archon-platform` folder and `ot-home-archon` dashboard are the entry point; navigation links must be maintained when new dashboards are added.
- State-change metrics must use stepAfter interpolation and the union+lastBefore query pattern to avoid misleading gaps in the time series.