test(resilience): gate host-reboot batch on os-audit (L3 per-boot health)

batch_host_reboot previously asserted only container-set equality after the
reboot. Add the os-audit.sh per-boot health gate: after rpc_login succeeds
post-reboot, run os-audit against the target (ARCHY_LOCAL=0, https) and record
host_reboot_osaudit PASS/FAIL. This asserts the node is actually healthy after
a reboot — RPC up, OTA not wedged (FM12), every app reachable with valid launch
metadata, FM-guards green — not just that the right containers exist. Validated
green on .116 (11 pass / 0 fail / 0 warn).

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
This commit is contained in:
archipelago 2026-06-14 08:01:30 -04:00
parent 4232424b23
commit 5b052372b7

View File

@ -470,6 +470,21 @@ batch_host_reboot() {
missing=$(comm -23 <(echo "$before") <(echo "$after") | tr '\n' ',' | sed 's/,$//')
record "_batch" host_reboot FAIL "missing: $missing"
fi
# ── L3 per-boot health gate ──────────────────────────────────
# Container-set equality proves the right containers exist; os-audit proves
# the node is actually *healthy* after the reboot: RPC up, OTA not wedged
# (FM12), every app reachable with valid launch metadata, FM-guards green.
# This is the per-boot building block os-audit.sh was written to be.
if [ -x "$ROOT/tests/lifecycle/os-audit.sh" ]; then
echo "── per-boot os-audit gate ──"
if ARCHY_HOST="$HOST" ARCHY_SCHEME=https ARCHY_PASSWORD="$UI_PASS" ARCHY_LOCAL=0 \
"$ROOT/tests/lifecycle/os-audit.sh" >"$OUT_DIR/os-audit-postboot.log" 2>&1; then
record "_batch" host_reboot_osaudit PASS "os-audit green after reboot"
else
record "_batch" host_reboot_osaudit FAIL "os-audit not green after reboot (see $OUT_DIR/os-audit-postboot.log)"
fi
fi
}
# ── main ─────────────────────────────────────────────────────────