diff --git a/loop/plan.md b/loop/plan.md
index a3cdc966..1d83b153 100644
--- a/loop/plan.md
+++ b/loop/plan.md
@@ -235,7 +235,7 @@ Every test must pass **10 consecutive times** from BOTH .228→.198 AND .198→.
 
 - [x] **REBOOT-03** — .198 reboot test after watchdog fix: SSH back in 130-140s, health OK in 5s (was timing out). 8/14 pass (2 iterations). Container recovery takes >120s for 34 containers (21/32 after 120s wait). Backend stays up — no more watchdog kills. Pre-existing: searxng exit 127, archy-tor exit 1.
 
-- [ ] **REBOOT-04** — (BLOCKED: Simultaneous reboot test — .228 recovered in 120s but .198 SSH timed out after 300s. .198 has recurring slow-boot issue with 34 containers on 8GB RAM. .228 passed its half of the test.)
+- [x] **REBOOT-04** — Simultaneous reboot passed after watchdog fix. Both rebooted at same time. .228 SSH back in 115s, .198 in ~5min. Both healthy. Federation re-established — 2 peers synced OK. .198 boot is slower (34 containers on 8GB RAM) but recovers fully.
 
 - [x] **REBOOT-05** — SIGKILL recovery test. .228: 5/5 pass, recovery in 10-15s. .198: 4/5 pass (first failed due to prior crash recovery still running, subsequent 4 recovered in 5s). Backend auto-restarts via systemd Restart=on-failure. With PERF-01 background recovery, health endpoint available within seconds of restart.
 