The kiosk chromium pinned ~92% of a core (software-compositing spin from
--enable-gpu-rasterization on a GPU-less/headless node), saturating the machine
and starving the backend + container builds — it caused the .198 receive timeout
and the deploy storms.
- archipelago-kiosk.service: CPUQuota=75% + MemoryMax/High + Delegate, so a
runaway kiosk can never take the whole node down.
- archipelago-kiosk-launcher.sh: detect /dev/dri — use GPU rasterization only
when a GPU exists, else --disable-gpu (avoids the headless spin).
- bootstrap::ensure_kiosk_hardened: OTA self-heal that installs the updated
unit+launcher on already-deployed nodes, daemon-reloads, and only try-restarts
a *running* kiosk (never re-enables an operator-disabled one).
cargo check clean; launcher bash -n clean; unit syntax valid.
Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>