Intel GPU Plugin for K3s¶

Status: Tier 3 Operational Reference

Last Updated: 2026-05-02

WI: WI-395

Overview¶

The Intel GPU plugin for Kubernetes exposes Intel iGPU hardware as a schedulable Kubernetes resource. Workloads (e.g., Frigate NVR) request the GPU in their pod spec and the scheduler places them on a node that has one free, which avoids manual device mounts and contention between pods.

Node: caneast-site1-node4 (Intel Core REDACTED, UHD 630 iGPU)

Resource name: gpu.intel.com/i915

Current consumer: Frigate (1 iGPU claimed)

Hardware Specifics¶

caneast-site1-node4 iGPU Details¶

Property	Value
GPU Model	Intel UHD Graphics 630
CPU	Intel Core REDACTED (integrated graphics; UHD Graphics 630 ships with 8th gen and later Core CPUs, while 7th gen parts carry HD Graphics 630)
Device Node	/dev/dri/renderD128
Kernel Module	i915
Memory	Shared with system RAM (no discrete VRAM)
Encoding	H.264, H.265, VP8, VP9 (hardware-accelerated)

Kernel Module Verification¶

Ensure the i915 driver is loaded:

lsmod | grep i915

Expected output:

i915               1234567  1

If not loaded, load it:

sudo modprobe i915
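
In scripts, `grep i915` can also match other modules that merely list i915 as a dependency. The sketch below shows one way to do an exact-name check instead. The helper name `module_loaded` is hypothetical, and it assumes the usual `/proc/modules` layout in which the module name is the first field of each line.

```shell
# module_loaded NAME [FILE]
# Succeeds (exit 0) only if NAME is the first field of some line in FILE.
# FILE defaults to /proc/modules, the same data that lsmod formats.
module_loaded() {
  local name="$1" file="${2:-/proc/modules}"
  awk -v m="$name" '$1 == m { found = 1 } END { exit !found }' "$file"
}

# Example (run on the node):
#   module_loaded i915 && echo "i915 loaded" || echo "i915 missing"
```

The optional second argument exists only so the check can be exercised against a saved listing rather than the live system.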

Installation and Deployment¶

Intel Device Plugins Helm Chart¶

The GPU plugin from the intel-device-plugins-for-kubernetes project is deployed as a DaemonSet. On each node it periodically scans for Intel GPU devices and advertises them to the kubelet as Kubernetes extended resources.

Deployment method: Helm chart (on caneast-site1-node4 and other Intel nodes in the cluster)

Chart: intel/intel-device-plugins

Namespace: kube-system (default)

Key configuration:

dpu:
  enabled: false  # DPU not used
gpu:
  enabled: true
  sharedDevNum: 1  # Containers allowed to share one GPU; 1 = exclusive use
  kubeconfig: ""   # Use in-cluster auth

Installation Command (Reference)¶

helm repo add intel https://intel.github.io/helm-charts
helm repo update

helm install intel-gpu-plugin intel/intel-device-plugins \
  --namespace kube-system \
  --set gpu.enabled=true \
  --set gpu.sharedDevNum=1

Verification¶

Check that the DaemonSet is running:

kubectl get daemonsets -n kube-system | grep intel
kubectl get pods -n kube-system | grep intel-gpu

Expected output:

intel-gpu-plugin-abc123  1/1  Running
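
For scripted checks, comparing ready pods against desired pods is less fragile than grepping the pod list. This is a minimal sketch, assuming the DaemonSet is named `intel-gpu-plugin` (the name used in the restart command in the Troubleshooting section). The helper `daemonset_ready` is hypothetical.

```shell
# daemonset_ready NAME [NAMESPACE]
# Succeeds when the DaemonSet wants at least one pod and all of them are Ready.
daemonset_ready() {
  local name="$1" ns="${2:-kube-system}" status desired ready
  # Prints "<desired> <ready>", e.g. "1 1"
  status=$(kubectl get daemonset "$name" -n "$ns" \
    -o jsonpath='{.status.desiredNumberScheduled} {.status.numberReady}') || return 1
  desired=${status%% *}   # text before the first space
  ready=${status##* }     # text after the last space
  [ "${desired:-0}" -gt 0 ] && [ "${desired:-0}" -eq "${ready:-0}" ]
}

# Example:
#   daemonset_ready intel-gpu-plugin kube-system && echo "plugin ready"
```

A desired count of 0 is treated as "not ready" on purpose: it usually means the DaemonSet's node selector matched no nodes.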

Node Capacity¶

Describing Node GPU Capacity¶

kubectl describe node caneast-site1-node4 | grep -A 10 "Allocatable"

Expected output:

Allocatable:
  cpu:                     6
  memory:                   15Gi
  pods:                     110
  gpu.intel.com/i915:       1

If gpu.intel.com/i915 is not shown, the plugin may not have initialized. Check DaemonSet logs:

kubectl logs -n kube-system -l app=intel-gpu-plugin --tail=20
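
The `grep -A 10` approach above depends on how many lines follow the Allocatable header, and the same resource name also appears under Capacity. As a sketch of a stricter check, the hypothetical helper below reads `kubectl describe node` output on stdin and prints only the Allocatable value.

```shell
# gpu_allocatable: read `kubectl describe node` output on stdin and print the
# gpu.intel.com/i915 count from the Allocatable section (prints 0 if absent).
gpu_allocatable() {
  awk '
    /^Allocatable:/    { sec = 1; next }   # entering the section
    sec && /^[^ \t]/   { sec = 0 }         # next top-level key ends it
    sec && $1 == "gpu.intel.com/i915:" { n = $2 }
    END { print n + 0 }
  '
}

# Example:
#   kubectl describe node caneast-site1-node4 | gpu_allocatable
```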

Pod Resource Requests and Limits¶

GPU Allocation in Pod Spec¶

To request the Intel iGPU, add a resources section to the container spec:

apiVersion: v1
kind: Pod
metadata:
  name: frigate
  namespace: archon-vision
spec:
  containers:
  - name: frigate
    image: ghcr.io/blakeblackshear/frigate:0.14.1
    resources:
      limits:
        gpu.intel.com/i915: 1
      requests:
        gpu.intel.com/i915: 1

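Important: gpu.intel.com/i915 is an extended resource, so it cannot be overcommitted. If both requests and limits are set, they must be equal, as in the spec above. Setting only limits is also valid, because Kubernetes then defaults the request to the same value.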

Current Usage¶

Verify which workloads are using the GPU:

kubectl describe node caneast-site1-node4 | grep -A 20 "Allocated resources"

Example output:

Allocated resources:
  (Total limits may be over 100 percent, i.e., overcommitted.)
  Resource           Requests      Limits
  --------           --------      ------
  cpu                500m          1000m
  memory             512Mi         1Gi
  gpu.intel.com/i915 1             1       <-- Frigate is using it
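
The node summary shows that the GPU is claimed but not by which pod. One way to find the consumer is sketched below. It assumes kubectl's custom-columns output accepts backslash-escaped dots in the resource name and prints `<none>` for pods without a GPU limit. The helper name `gpu_pods` is hypothetical.

```shell
# gpu_pods: filter custom-columns output (NAMESPACE NAME GPU) down to pods
# that have a GPU limit set, printed as namespace/name.
gpu_pods() {
  awk 'NR > 1 && $3 != "<none>" { print $1 "/" $2 }'
}

# Example (the column expression escapes the dots in the resource name):
#   kubectl get pods -A -o custom-columns='NAMESPACE:.metadata.namespace,NAME:.metadata.name,GPU:.spec.containers[*].resources.limits.gpu\.intel\.com/i915' | gpu_pods
```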

Frigate OpenVINO Integration¶

OpenVINO Detector¶

Frigate can use OpenVINO (Intel's inference toolkit) to offload object detection to the GPU. This is configured in the Frigate config YAML:

detectors:
  ov:
    type: openvino
    device: GPU.0  # Use first GPU
    num_threads: 4

Performance gains:

- CPU-only detection on caneast-site1-node4: ~1 FPS
- GPU-accelerated detection: ~5-10 FPS (depending on model)

Encoding Acceleration¶

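Through the i915 driver, the iGPU also accelerates H.264/H.265 video processing, which Frigate uses when re-streaming video from RTSP to its internal format. No extra plugin configuration is needed: the plugin mounts the render device into any container that is allocated gpu.intel.com/i915.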

Troubleshooting¶

GPU Not Detected¶

If kubectl describe node caneast-site1-node4 does not show gpu.intel.com/i915, check:

i915 kernel module loaded:
```
ssh caneast-site1-node4 "lsmod | grep i915"
```
If not, load it and restart the intel-gpu-plugin pod.
Device node exists:
```
ssh caneast-site1-node4 "ls -la /dev/dri/"
```
Should show renderD128 or renderD129.

Plugin pod logs:

kubectl logs -n kube-system -l app=intel-gpu-plugin --tail=50

Look for device discovery errors.

Restart plugin:

kubectl rollout restart daemonset/intel-gpu-plugin -n kube-system
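
The device-node check above can be scripted so that it fails fast before anyone reads plugin logs. A minimal sketch, run on the node itself, with a hypothetical helper name `render_nodes`:

```shell
# render_nodes [DIR]: print the DRM render nodes (renderD*) found in DIR,
# one per line. DIR defaults to /dev/dri. Prints nothing if there are none.
render_nodes() {
  local dir="${1:-/dev/dri}" node
  for node in "$dir"/renderD*; do
    # An unmatched glob stays literal, so confirm the path really exists.
    if [ -e "$node" ]; then
      basename "$node"
    fi
  done
}

# Example (run on the node):
#   [ -n "$(render_nodes)" ] || echo "no render node under /dev/dri"
```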

GPU Memory Pressure¶

If Frigate fails with "out of memory" errors:

Check available GPU memory:
```
ssh caneast-site1-node4 "sudo clinfo 2>/dev/null | grep -E 'Global memory size|Max memory allocation'"
```
(Note: clinfo may not be installed; alternative: check dmesg for OOM)

Reduce Frigate's detection resolution or frame rate:

detect:
  width: 1280   # instead of 1920
  height: 720   # instead of 1080
  fps: 3        # instead of 5

Monitor real-time GPU usage with intel_gpu_top (part of the intel-gpu-tools package, which may need to be installed on the node):

ssh -t caneast-site1-node4 "sudo intel_gpu_top"
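
To get a rough sense of what the reduced detect settings above buy, the pixel volume per second can be compared before and after. This is only a back-of-the-envelope estimate: actual memory and GPU load will not necessarily scale linearly with pixel count. The helper name `pixels_per_second` is hypothetical.

```shell
# pixels_per_second WIDTH HEIGHT FPS
pixels_per_second() {
  echo $(( $1 * $2 * $3 ))
}

before=$(pixels_per_second 1920 1080 5)   # 10368000
after=$(pixels_per_second 1280 720 3)     # 2764800

# Integer division truncates 26.67 to 26.
echo "reduced load: $(( 100 * after / before ))% of original"
```

Under these assumptions the reduced settings process roughly a quarter of the original pixel volume.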

Pod Stuck in Pending¶

If a pod requests gpu.intel.com/i915 but stays Pending:

kubectl describe pod <pod-name>

Look for "Insufficient gpu.intel.com/i915" in the Events section. This means no node has an available GPU. Either:

Free the GPU by stopping Frigate (if testing another workload)
Wait for Frigate to finish and release the resource
Add a second Intel GPU node to the cluster (Phase 2 decision)
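
When several pods are Pending, checking each one's events by eye gets tedious. The hypothetical helper below is a small sketch that reads `kubectl describe pod` output on stdin and reports whether the scheduler cited the Intel GPU resource.

```shell
# gpu_starved: read `kubectl describe pod` output on stdin; succeed if the
# scheduler reported the Intel GPU resource as insufficient.
gpu_starved() {
  grep -q 'Insufficient gpu\.intel\.com/i915'
}

# Example:
#   kubectl describe pod <pod-name> | gpu_starved && echo "waiting for the iGPU"
```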

Design Rationale¶

Single GPU, Single Consumer¶

Currently, caneast-site1-node4 has one iGPU, and Frigate is the only consumer. If future workloads (e.g., GPU-accelerated k8sgpt inference) require GPU access, a decision must be made:

1. Add a second Intel node (hardware acquisition + network config)
2. Share GPU time via time-slicing (Kubernetes GPU sharing plugin; complex scheduling)
3. Prioritize one workload; demote others to CPU (simple, clear trade-off)

For WI-395, option 3 is the chosen strategy: Frigate has GPU priority; k8sgpt uses CPU.

Why Not Disable GPU for Cost Savings¶

The iGPU is integrated into the CPU, so there is no discrete card to buy or power. Frigate's GPU acceleration improves detection throughput and lowers CPU load. Disabling it would bring no financial benefit and would degrade video monitoring quality.

References¶

Intel Device Plugins for Kubernetes
Frigate OpenVINO Documentation
Kubernetes Device Plugin API
docs/platform/frigate-caneast-site1-node4.md - Frigate NVR deployment (GPU consumer)
docs/architecture/cam01-capture-pipeline.md - Frigate in the vision pipeline