docs(master-plan): WS-F#3 first destructive run — 3 reinstall bugs found
Full all-apps-lifecycle pass on .228: lifecycle 11/11, teardown 8/11. Surfaced:
(1) fresh-install bind-dir ownership root:root → reinstall EACCES (jellyfin/netbird; Fix B misses the install path),
(2) netbird reinstall adopts leftover containers → skips manifest cert/file render,
(3) portainer image pin lfg2025/portainer:2.19.4 unpublished (manifest unknown), pin overrides RPC dockerImage.
.228 restored.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
parent 07b9b5a3aa
commit fc64b422e7
@@ -201,8 +201,21 @@ no-regression; the original hang was load/timing-induced and not separately repr
 safety, override via `ARCHY_MATRIX_PROTECT`). Validated on .228 (discovery + 1-app lifecycle
 green). HEAVY/destructive → a supervised pass on LAN nodes (.116/.198/.228), NOT folded into
 run-gate. Invoke: `ARCHY_ALLOW_DESTRUCTIVE=1 ARCHY_ALLOW_CASCADE_DESTRUCTIVE=1 ARCHY_PASSWORD=…
-ARCHY_SCHEME=https bats bats/all-apps-lifecycle.bats`. STILL TODO: run the full destructive pass
-on a LAN node + fix whatever reinstall failures it surfaces; add reboot-survive + UI-reach per app.)*
+ARCHY_SCHEME=https bats bats/all-apps-lifecycle.bats`.)*
+**✅ FIRST FULL DESTRUCTIVE RUN on .228 (2026-06-26):** lifecycle **11/11 clean**; teardown
+**8/11** (immich 3-container stack incl.) — and it surfaced **3 real reinstall bugs** (the payoff):
+1. **fresh-install bind-dir ownership = root:root** → EACCES on reinstall (jellyfin `/config`
+denied exit 139; netbird-server can't open its SQLite store). Fix B's chown-to-parent only
+runs on the reconcile path, **not** `package.install`. The important orchestrator fix.
+2. **netbird reinstall adopts leftover containers → skips the manifest cert/file render**
+(tls.crt/key/nginx.conf never written → proxy can't start → app reads absent). Only a fully
+clean reinstall renders them.
+3. **portainer image pin `lfg2025/portainer:2.19.4` is `manifest unknown`** (never pushed to the
+registry) and the pin OVERRIDES the RPC dockerImage → portainer is un(re)installable
+fleet-wide. Registry/catalog data bug (push the image or change the pin).
+.228 restored (jellyfin+netbird via manual chown / clean reinstall; all installed apps running,
+28 ctrs; portainer left uninstalled — uninstallable until #3 fixed). TODO: fix #1 (extend chown
+to install path) + #2 + #3; add reboot-survive + UI-reach per app to the matrix.
 4. **Guardian/IBD-dependent states:** assert that "waiting for bitcoin sync"-style states are a
 legitimate, surfaced wait (with a path to ready) and never a permanent stuck state.
 
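Bug #1 above (bind directories created as root:root on fresh install) could be addressed by running the same ownership step on the install path that the reconcile path already runs. The following is a minimal sketch only, not the orchestrator's actual code: the function name `ensure_bind_dir_owner` is hypothetical, and it assumes "chown-to-parent" means "give the bind directory the same uid:gid as its parent directory". It also assumes GNU `stat`.

```shell
#!/usr/bin/env bash
# Hypothetical helper; illustrates the idea behind "extend chown to install path".
set -euo pipefail

# Create a bind directory if needed, then match its uid:gid to its parent's,
# so a container running as a non-root user can write to it.
# Prints the uid:gid that was applied.
ensure_bind_dir_owner() {
  local dir="$1"
  local parent owner
  mkdir -p "$dir"
  parent="$(dirname "$dir")"
  # GNU stat syntax (assumption); BSD/macOS would need `stat -f '%u:%g'`.
  owner="$(stat -c '%u:%g' "$parent")"
  # Skip the chown when ownership already matches, keeping the call idempotent.
  if [ "$(stat -c '%u:%g' "$dir")" != "$owner" ]; then
    chown "$owner" "$dir"
  fi
  printf '%s\n' "$owner"
}
```

Calling such a helper from both the install and reconcile paths would keep the ownership rule in one place. Whether the parent's owner is the right target for every app (as opposed to a per-app uid from the manifest) is left open here.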