From 236a2dee855d062a13475e6f89e385cfaff5e69c Mon Sep 17 00:00:00 2001
From: archipelago
Date: Thu, 23 Apr 2026 02:34:43 -0400
Subject: [PATCH] docs: split Step 8 into 8a/8b/8c

Discovered during Step 8 execution that first-boot-containers.sh creates
30+ containers with per-container logic (wallet loads, DB init, rpcauth
derivations, post-create health waits) and does substantial
non-container setup (secret gen, rootless-podman subuid chowns, Tor
hostnames, WireGuard, firewall, nostr-relay). Only 3 of the 30+
containers have manifests today (the UIs from Step 7). Deleting the bash
in a single step bricks first-boot on fresh installs.

Split into:
- 8a: delete reconcile-containers.sh + container-specs.sh + reconcile
  systemd unit + timer. BootReconciler fully covers these. Safe, atomic,
  no manifest porting required.
- 8b: port remaining ~25 containers into apps//manifest.yml. One
  manifest per commit, validated against current bash behavior.
  Multi-day scope.
- 8c: rename first-boot-containers.sh -> first-boot-setup.sh, strip
  container ops, keep secret/dir/Tor/WG/firewall setup. Final one-way
  door, requires 8b complete.
---
 docs/STATUS.md                      | 40 ++++++++++++++++++-----------
 docs/rust-orchestrator-migration.md |  5 +++-
 2 files changed, 29 insertions(+), 16 deletions(-)

diff --git a/docs/STATUS.md b/docs/STATUS.md
index 8b527a69..a33be137 100644
--- a/docs/STATUS.md
+++ b/docs/STATUS.md
@@ -15,7 +15,9 @@ Working through the 11-step plan in [`rust-orchestrator-migration.md`](./rust-or
 - [x] **Step 5** — `fc39b04b` BootReconciler with Arc shutdown, 4 paused-time tests pass
 - [x] **Step 6** — main.rs wire-up: construct orchestrator once, load_manifests + adopt_existing + spawn BootReconciler, thread through Server::new / ApiHandler::new / RpcHandler::new, wire shutdown Notify to SIGTERM/SIGINT. Clean `cargo check -p archipelago` (6 pre-existing warnings), container tests 43/44 pass (the one failing `test_parse_image_versions` is pre-existing and unrelated — asserts `!contains_key("NOT_AN_IMAGE")` but the retain on line 106 keeps anything ending in `_IMAGE`).
 - [x] **Step 7** — `069bc4a5` bitcoin-ui pre-start hook renders nginx.conf from embedded template. New `container::bitcoin_ui` module (render fn, atomic tmp+rename, idempotent byte-compare, 8 unit tests). `ProdContainerOrchestrator::run_pre_start_hooks` fires in `install_fresh` before `create_container` and in `ensure_running` (Running+Rewritten → restart; Stopped → re-render+start). bitcoin-ui Dockerfile no longer COPYs nginx conf; arrives via runtime bind-mount (safe-failure → 404 if missing, never stale auth). `apps/{bitcoin,electrs,lnd}-ui/manifest.yml` land. Integration test asserts `install("bitcoin-ui")` writes substituted config to disk. 39/39 container:: tests pass (same 1 pre-existing failure).
-- [ ] **Step 8** — Delete bash scripts + systemd units + ISO builder lines, **plus** add ISO builder lines to copy `apps/*/manifest.yml` → `/opt/archipelago/apps/` on install — next up
+- [ ] **Step 8a** — Delete `reconcile-containers.sh` + `container-specs.sh` + `archipelago-reconcile.{service,timer}` + ISO builder touchpoints. Safe, `BootReconciler` fully replaces. Next up.
+- [ ] **Step 8b** — Port remaining ~25 container creations from `first-boot-containers.sh` into `apps//manifest.yml` (deferred, multi-day work)
+- [ ] **Step 8c** — Rename `first-boot-containers.sh` → `first-boot-setup.sh`, strip container ops, keep setup. Add ISO lines to copy `apps/` (final one-way door, requires 8b complete)
 - [ ] **Step 9** — Hot-swap + verify on .228
 - [ ] **Step 10** — Hot-swap + verify on .116
 - [ ] **Step 11** — Chaos matrix on both nodes
@@ -53,25 +55,33 @@ Both are development alpha nodes — **full destructive latitude**, no need to a
 
 ## Next action
 
-**Step 8 — Delete bash scripts + systemd units, and teach the ISO builder to install manifests.**
+**Step 8a — Delete the reconcile bash path.** Safe, isolated, atomic.
 
-Files to delete (the scripts the Rust orchestrator has now replaced):
+Files to delete:
+1. `scripts/reconcile-containers.sh` (531 LOC — `BootReconciler` fully replaces)
+2. `scripts/container-specs.sh` (602 LOC — manifest-driven now)
+3. `image-recipe/configs/archipelago-reconcile.service`
+4. `image-recipe/configs/archipelago-reconcile.timer`
 
-1. `scripts/first-boot-containers.sh` (1392 lines — the sed/envsubst path, now covered by bitcoin-ui pre-start hook + `install_fresh`)
-2. `scripts/reconcile-containers.sh` (now covered by `BootReconciler` in-process loop)
-3. `scripts/container-specs.sh` (manifests live in `apps/*/manifest.yml` instead)
-4. `image-recipe/configs/archipelago-first-boot-containers.service` (systemd unit)
-5. `image-recipe/configs/archipelago-reconcile.service` (systemd unit)
+ISO builder edits in `image-recipe/build-auto-installer-iso.sh`:
+- L412-413: drop `COPY archipelago-reconcile.{service,timer}`
+- L429-430: drop `COPY reconcile-containers.sh` + `container-specs.sh`
+- L453: drop `systemctl enable archipelago-reconcile.timer`
+- L547-548: drop the `cp archipelago-reconcile.{service,timer}` block
+- L550: drop `reconcile-containers.sh container-specs.sh` from the loop
 
-Enablement lines to remove in `image-recipe/build-auto-installer-iso.sh`:
-- line ~2773 (first-boot unit enable)
-- line ~2896 (reconcile unit enable)
-- line ~2961 (any remaining enablement hooks — verify)
+No Rust changes. Atomic single commit. Full ISO build test on .116 before commit per user ask.
 
-**New requirement** (discovered this session, not in original spec):
-- Add ISO builder lines to copy `apps/*/manifest.yml` → `/opt/archipelago/apps/` on install. Without this, `load_manifests()` finds no manifests on fresh nodes and the orchestrator has nothing to reconcile. Reference: `image-recipe/build-auto-installer-iso.sh:2350-2351` (existing `cp -r /docker/* /mnt/target/opt/archipelago/docker/` pattern — mirror it for `/apps/`).
+**Step 8b/8c come later** — they require porting 25+ container creations from `first-boot-containers.sh` into `apps/*/manifest.yml`, which is a multi-day scope. Not tonight.
 
-No Rust code changes in this step. Atomic commit: deletions + ISO builder edits together.
+---
+
+### Why Step 8 got split (discovered 2026-04-23)
+
+Original plan was one commit "delete bash + edit ISO builder". But on investigation:
+- `first-boot-containers.sh` creates **30+ containers** with per-container logic (wallets, DB init, rpcauth derivations, post-create health waits). The repo only has manifests for 3 (bitcoin-ui, electrs-ui, lnd-ui from Step 7). Deleting bash now = brick first-boot on fresh installs.
+- Script also does non-container setup: secret generation (RPC pw, DB pw, FileBrowser admin pw), UID-mapping chowns for rootless podman subuid, Tor hostnames dir, WireGuard, firewall rules, nostr-relay dir. None of this lives in the Rust orchestrator.
+- Design doc §505 updated to split 8 → 8a/8b/8c. Only 8a is safe to execute before we port manifests.
 
 ---
 
diff --git a/docs/rust-orchestrator-migration.md b/docs/rust-orchestrator-migration.md
index 1508c172..ce1d6f6f 100644
--- a/docs/rust-orchestrator-migration.md
+++ b/docs/rust-orchestrator-migration.md
@@ -502,7 +502,10 @@ Chaos matrix (bash + Playwright, the original goal):
 5. **BootReconciler**: task spawner with loop + cancellation. ~80 LOC + unit tests.
 6. **main.rs wire-up**: adopt + spawn reconciler. ~20 LOC.
 7. **3 UI manifests + Dockerfile BITCOIN_RPC_AUTH refactor** (use ARG + template file, not sed). ~60 lines of YAML + ~20 lines of Dockerfile.
-8. **Remove bash scripts + services**: `git rm` + ISO-builder edits + changelog.
+8. **Remove bash scripts + services**: split into sub-steps because `first-boot-containers.sh` creates 25+ containers (only 3 ported in Step 7) AND does non-container setup (secret gen, UID-mapping chowns, Tor hostnames, WireGuard, firewall, nostr-relay dir):
+   - **8a** (cheap, safe): delete `scripts/reconcile-containers.sh` + `scripts/container-specs.sh` + `image-recipe/configs/archipelago-reconcile.{service,timer}` + their ISO-builder touchpoints. `BootReconciler` fully replaces these — no manifest porting required. Atomic commit, low risk.
+   - **8b** (large, deferred): port the remaining ~25 container creations from `first-boot-containers.sh` into `apps//manifest.yml` files. One manifest per commit, validated against current bash behavior (ports, volumes, env, deps, health checks, post-create wallet/db bootstrap). Probably 1-2 days of careful porting. Includes `apps/filebrowser/manifest.yml`.
+   - **8c** (final, one-way door): rename `first-boot-containers.sh` → `first-boot-setup.sh`, strip out all `$DOCKER run/pull/exec` calls, keep only secret generation + dir prep + Tor/WG/firewall/nostr setup. Rename `archipelago-first-boot-containers.service` → `archipelago-first-boot-setup.service`. Add ISO builder lines to copy `apps/*/manifest.yml` → `/opt/archipelago/apps/`. Full ISO build test on .116 required before commit.
 9. **Live test on .228**: hot-swap binary, expect 3 UIs to come up within 60s of service restart.
 10. **Live test on .116**: hot-swap binary, expect zero container recreation + adoption-confirmed log lines.
 11. **Chaos matrix** on both nodes.