OT-0003 Historian Retention

| Field | Value |
|---|---|
| ID | OT-0003 |
| Date | 2026-05-02 |
| Status | Accepted |
| Deciders | Ben Peries |
| Migrated from | OT-0005 (2026-04-05) |

Context

OT telemetry is written to InfluxDB at full sensor resolution (approximately 10-second intervals). Retaining raw data indefinitely is impractical at homelab storage scale. A retention and aggregation strategy is needed that preserves recent data at full resolution while compressing historical data for long-term trend analysis — analogous to an ISA-95 Level 2 historian.

Decision

The historian uses a two-tier retention model backed by two InfluxDB buckets:

| Bucket | Retention | Resolution | Purpose |
|---|---|---|---|
| homelab | 30 days | Raw (10 s) | Recent operations, alert evaluation |
| homelab-archive | 365 days | 1-minute aggregates | Trend analysis, seasonal comparisons |

A Flux task (downsample-sumppit-to-archive, ID 108cb726d43c7000) runs every hour, computing 1-minute mean aggregates from homelab and writing them to homelab-archive. The task references org=REDACTED (pending rename to archon in WI-309).

Aggregation Logic

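// Task options: name and hourly schedule as described in this ADR.
option task = {name: "downsample-sumppit-to-archive", every: 1h}

from(bucket: "homelab")
  |> range(start: -1h)  // the last hour of raw data, matching the hourly schedule
  |> filter(fn: (r) => r._measurement == "mqtt_consumer")
  |> aggregateWindow(every: 1m, fn: mean, createEmpty: false)  // 1-minute means; empty windows are skipped
  |> to(bucket: "homelab-archive", org: "REDACTED")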
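The windowing behaviour of the task can be illustrated outside InfluxDB. The following is a minimal Python sketch, not the task itself: it groups (timestamp, value) pairs into 1-minute windows, averages each window, and omits windows with no points, which is what createEmpty: false does. The function name and sample data are hypothetical. Each result is labelled with the window's stop time, which is the default behaviour of aggregateWindow.

```python
from collections import defaultdict
from statistics import fmean

WINDOW_S = 60  # 1-minute windows, matching `every: 1m`


def downsample_mean(points):
    """Average (unix_seconds, value) pairs per 1-minute window.

    Windows are half-open [start, stop). Each result is stamped with the
    window stop time, and windows without points are omitted.
    """
    windows = defaultdict(list)
    for ts, value in points:
        start = ts - ts % WINDOW_S
        windows[start + WINDOW_S].append(value)
    return [(stop, fmean(values)) for stop, values in sorted(windows.items())]


# Hypothetical raw readings: three in the first minute, none in the
# second minute, one in the third.
raw = [(0, 10.0), (10, 12.0), (20, 14.0), (130, 20.0)]
result = downsample_mean(raw)
```

Here result contains two entries, one for the window ending at 60 s and one for the window ending at 180 s. The empty window ending at 120 s produces no archive point, so a missing point in homelab-archive can mean either a silent sensor or a missed task run.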

ISA-95 Analogy

The two-bucket pattern maps to ISA-95 Level 2 historian practice:

| ISA-95 | Archon equivalent |
|---|---|
| Real-time buffer | homelab (30 d raw) |
| Historical archive | homelab-archive (365 d 1-min) |
| Compression ratio | ~6× (10 s → 1 min) |
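The ratio and the resulting point counts follow directly from the two intervals. As a rough check in Python, assuming one series that reports exactly every 10 seconds with no gaps (real sensors will drift around this):

```python
RAW_INTERVAL_S = 10      # raw sensor resolution
AGG_INTERVAL_S = 60      # 1-minute aggregates
SECONDS_PER_DAY = 86_400

# Six raw points collapse into one archived point.
compression_ratio = AGG_INTERVAL_S // RAW_INTERVAL_S

# Points held per series when each bucket is full.
raw_points = 30 * SECONDS_PER_DAY // RAW_INTERVAL_S        # homelab, 30 days
archive_points = 365 * SECONDS_PER_DAY // AGG_INTERVAL_S   # homelab-archive, 365 days

print(compression_ratio, raw_points, archive_points)
```

Under these assumptions a full raw bucket holds 259,200 points per series and a full archive bucket holds 525,600, so the archive covers roughly twelve times the time span with about twice the point count.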

Options Considered

Option A: Single bucket, no aggregation (rejected)

Unbounded raw storage. At 10 s resolution across multiple sensors, storage grows ~8.6 MB/day/sensor. Not viable at homelab scale for multi-year retention.
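To put the growth figure in multi-year terms, the sketch below projects unbounded raw storage from the ~8.6 MB/day/sensor estimate quoted above. The sensor count and time horizon are hypothetical inputs, and the calculation assumes linear growth with 1 GB = 1000 MB.

```python
RAW_MB_PER_SENSOR_DAY = 8.6  # estimate quoted in this ADR


def projected_raw_storage_gb(sensors: int, years: float) -> float:
    """Raw storage in GB with no retention limit and no aggregation."""
    return RAW_MB_PER_SENSOR_DAY * 365 * years * sensors / 1000


# Hypothetical scenarios, for illustration only.
one_sensor_one_year = projected_raw_storage_gb(sensors=1, years=1)
five_sensors_three_years = projected_raw_storage_gb(sensors=5, years=3)
```

With these inputs, one sensor accumulates roughly 3.1 GB per year, and five sensors over three years reach roughly 47 GB.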

Option B: Two-tier with 1-minute aggregates (selected)

Preserves operational fidelity for the 30-day window most relevant to alarm analysis. 1-minute resolution is sufficient for seasonal trend queries.

Option C: Three-tier (raw → 1 min → 1 hour) (deferred)

See OT-0010 Historian Retention Architecture for the full four-bucket hierarchy. Once implemented, OT-0010 will supersede this ADR's bucket design; until then, this two-tier model remains the operational baseline.

Rationale

Thirty days of raw data covers the operational window for alarm investigation and manual analysis. A one-year archive of 1-minute aggregates supports seasonal flood-risk comparisons, such as comparing current sump-pit levels with those of prior wet seasons. Flux task-based aggregation is native to InfluxDB OSS and requires no external ETL tooling.

Consequences

  • homelab bucket retention is set to 30 days; data older than 30 days is automatically purged.
  • homelab-archive bucket retention is set to 365 days.
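  • The Flux downsampling task must be monitored. Each run reads only the previous hour, so a silent failure leaves a gap in the archive. The gap can be backfilled only while the raw data is still within the 30-day homelab retention; after that, the missed window is unrecoverable.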
  • Future bucket design (four-tier with 1 h and 7-year retention) tracked in OT-0010.
  • InfluxDB org rename (REDACTED → archon) tracked in WI-309; update task after rename.
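The monitoring consequence above can be sketched as a simple gap check. The following Python is an illustrative outline only: it assumes the archive timestamps for one series have already been fetched by some other means, and the function names are hypothetical. It finds gaps larger than the 1-minute step and reports whether the raw data needed to fill them should still be inside the 30-day homelab retention.

```python
RAW_RETENTION_S = 30 * 86_400  # homelab bucket retention (30 days)
STEP_S = 60                    # expected spacing of archive points


def find_gaps(timestamps, step=STEP_S):
    """Return (last_seen, next_seen) pairs spaced more than one step apart."""
    ordered = sorted(timestamps)
    return [(a, b) for a, b in zip(ordered, ordered[1:]) if b - a > step]


def backfillable(gap, now, retention=RAW_RETENTION_S):
    """True if raw data from the start of the gap is still retained."""
    gap_start, _ = gap
    return now - gap_start <= retention


# Hypothetical archive timestamps (unix seconds) with one missed hour.
archive_ts = [60, 120, 180, 3780, 3840]
gaps = find_gaps(archive_ts)
still_recoverable = [backfillable(g, now=3840) for g in gaps]
```

Because the task writes no point for empty windows, a gap found this way may also reflect a sensor that was silent, so a check like this is a prompt for investigation rather than proof of a task failure.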

References