The kiosk chromium pinned ~92% of a core (software-compositing spin from --enable-gpu-rasterization on a GPU-less/headless node), saturating the machine and starving the backend + container builds — it caused the .198 receive timeout and the deploy storms. - archipelago-kiosk.service: CPUQuota=75% + MemoryMax/High + Delegate, so a runaway kiosk can never take the whole node down. - archipelago-kiosk-launcher.sh: detect /dev/dri — use GPU rasterization only when a GPU exists, else --disable-gpu (avoids the headless spin). - bootstrap::ensure_kiosk_hardened: OTA self-heal that installs the updated unit+launcher on already-deployed nodes, daemon-reloads, and only try-restarts a *running* kiosk (never re-enables an operator-disabled one). cargo check clean; launcher bash -n clean; unit syntax valid. Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
35 lines
1.4 KiB
Desktop File
35 lines
1.4 KiB
Desktop File
[Unit]
|
|
Description=Archipelago Kiosk (X11 + Chromium)
|
|
After=archipelago.service systemd-user-sessions.service network-online.target
|
|
Wants=archipelago.service network-online.target
|
|
ConditionPathExists=/usr/local/bin/archipelago-kiosk-launcher
|
|
Conflicts=getty@tty1.service
|
|
|
|
[Service]
|
|
Type=simple
|
|
# Wait up to 5 min for archipelago to serve /health. On slow hardware
|
|
# first-boot is dominated by the FileBrowser pull (unbundled ISO),
|
|
# initial archipelago state sync, and frontend settle — .198 took
|
|
# longer than 120s and chromium launched against an empty backend,
|
|
# producing a white window that only recovered on reboot. 300s gives
|
|
# slow-but-functional hardware enough headroom; TimeoutStartSec is
|
|
# bumped in lockstep so systemd doesn't kill us mid-wait.
|
|
ExecStartPre=/bin/bash -c 'for i in $(seq 1 150); do curl -sf http://localhost/health >/dev/null 2>&1 && break; sleep 2; done'
|
|
ExecStart=/usr/local/bin/archipelago-kiosk-launcher
|
|
TimeoutStartSec=360
|
|
Restart=always
|
|
RestartSec=5
|
|
|
|
# Resource guardrail (#36). On GPU-less / headless hardware chromium could spin
|
|
# software compositing at ~92% of a core, saturating the node and starving the
|
|
# backend (it caused the .198 receive timeout + deploy storms). Cap CPU + memory
|
|
# so a runaway kiosk can never take the whole machine down; Delegate so the cap
|
|
# also binds the chromium/Xorg children in this unit's cgroup.
|
|
Delegate=yes
|
|
CPUQuota=75%
|
|
MemoryMax=1500M
|
|
MemoryHigh=1200M
|
|
|
|
[Install]
|
|
WantedBy=multi-user.target
|