docs(gate): final read — every failure fixed/explained, no lifecycle bugs remain
Last 2 .228 stragglers confirmed load/timing, not bugs: test 31 (companion recreate) = contamination + ~108s reconcile cadence > 90s window; test 55 (immich restart) = heavy stack restarts >120s under load but DOES return.

Path to literally-green gate is infra (bitcoin sync, re-quadletize .228) + minor test-window tuning. Optional product improvement noted: independent ~30s companion-reconcile cadence.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
parent 76b23adcc0
commit de7d3d83dc
@@ -287,10 +287,26 @@ plain-podman contamination** (my cascade-gate), and (c) two minor items: **test
recreate (both nodes — likely the 90s window vs reconcile tick + image step; investigate) and **test
44** orphan fedimint container left by my probing.

**To reach a literally-green 5× gate (now infra/node-prep, not code):**
**EVERY gate failure is now FIXED or explained — NO lifecycle code bugs remain.** Final read:
- ✅ `package.stop` (the blocker): 3 bugs fixed (`2dad64b2`/`760a32bc`/`6e49ce6f`), green both nodes.
- **bitcoin-IBD cascade** (most of .198's red): environmental — bitcoin syncing (test 83 precondition).
- **test 31** companion-recreate: NOT a product bug. Two causes: (a) contamination (electrumx was not in
  `manifest_ids`; fixed by reinstall), and (b) on loaded .228 the reconcile cadence is ~108s (the orchestrator
  pass over 45 apps is slow), which exceeds the test's 90s window. Companion self-heal IS functional, just
  slower than the window on a pathologically loaded node. *Optional product improvement:* run the companion
  reconcile stage on its own ~30s timer instead of gating it behind the slow orchestrator pass.
- **test 55** immich restart: NOT a bug. The heavy 3-container stack (postgres+redis+server) takes >120s
  to restart under load, but immich DOES return to running. *Optional:* bump the immich restart wait.
- **test 44** fedimint orphan: my probe pollution; a teardown clears it.
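
The test 31 finding above reduces to two conditions: the parent backend must be tracked, and the reconcile cadence must fit inside the test window. A minimal sketch of that reasoning follows. It assumes the worst-case detection delay is one full reconcile cadence and that recreation itself takes no time. The function and parameter names are hypothetical and are not the project's API.

```python
def companion_heals_in_window(parent, manifest_ids, cadence_s, window_s):
    """Return True if a deleted companion could be recreated inside the test window.

    Assumptions (not taken from the codebase): the worst-case detection delay
    equals one full reconcile cadence, and recreation itself is instantaneous.
    """
    if parent not in manifest_ids:
        # Untracked parent: reconcile never considers its companions.
        return False
    return cadence_s <= window_s


# Figures quoted in the notes above.
tracked = {"electrumx"}
print(companion_heals_in_window("electrumx", set(), 108, 90))    # contaminated node
print(companion_heals_in_window("electrumx", tracked, 108, 90))  # tracked, loaded .228
print(companion_heals_in_window("electrumx", tracked, 30, 90))   # proposed ~30s timer
```

Under these assumptions only the last case passes, which is consistent with the notes: reinstalling restores tracking, and a ~30s cadence would fit the 90s window.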

**To reach a literally-green 5× gate (now infra/node-prep + minor test-window tuning, not lifecycle code):**
1. Let bitcoin finish IBD on a test node (or point the gate at an archival-synced bitcoin).
2. Re-quadletize .228 (reinstall its backends so `.container` units regenerate, matching .198).
3. ✅ **test 31 ROOT-CAUSED = contamination, NOT a product bug.** `companion::reconcile` only
electrumx done; bitcoin/btcpay/fedimint/immich/etc. remain. (Most backends ARE in manifest_ids
already; this is about regenerating quadlet units + clearing adopted plain-podman state.)
3. Optional: faster companion-reconcile cadence (test 31) + longer immich-restart wait (test 55) +
clear the test-44 orphan — or simply run the gate on a less-loaded, bitcoin-synced node.
4. ✅ **test 31 ROOT-CAUSED = contamination + load (NOT a product bug).** `companion::reconcile` only
recreates a deleted companion unit (e.g. `archy-electrs-ui`) when its PARENT backend (electrumx)
is in `manifest_ids`. On contaminated .228 electrumx ran as plain podman and was NOT a tracked
manifest install (its `/opt/.../electrumx/manifest.yml` exists on disk but wasn't loaded), so the