From a0707f4d48d49da9ad2a37b4e1cde08092a25392 Mon Sep 17 00:00:00 2001
From: archipelago
Date: Thu, 23 Apr 2026 03:04:58 -0400
Subject: [PATCH] =?UTF-8?q?feat(iso):=20Step=208a=20=E2=80=94=20retire=20a?=
 =?UTF-8?q?rchipelago-reconcile=20systemd=20timer?=
MIME-Version: 1.0
Content-Type: text/plain; charset=UTF-8
Content-Transfer-Encoding: 8bit

BootReconciler (in-process, 30s interval, spawned from main.rs as of
Step 6 commit 48f08aa3) fully replaces the timer-driven bash
reconciliation path. Delete the systemd unit + timer and their
ISO-builder touchpoints.

Removed:
- image-recipe/configs/archipelago-reconcile.service
- image-recipe/configs/archipelago-reconcile.timer
- image-recipe/build-auto-installer-iso.sh L412-413 (COPY unit+timer)
- image-recipe/build-auto-installer-iso.sh L449 (systemctl enable)
- image-recipe/build-auto-installer-iso.sh L542-543 (cp to WORK_DIR)

Kept (intentionally):
- scripts/reconcile-containers.sh
- scripts/container-specs.sh

Reason: core/archipelago/src/api/rpc/package/update.rs still invokes
reconcile-containers.sh at two sites (OTA update + rollback paths).
Porting those call sites to ContainerOrchestrator::upgrade() requires
manifests for every container update.rs might touch — that scope
belongs in Step 8b. Until then the script stays on disk, just no
longer runs on a periodic timer.

No Rust code changes. cargo check -p archipelago clean, 6 pre-existing
warnings.

Skipped full ISO rebuild validation per user decision — edits are 5
textual deletions with zero behavioral ambiguity; Step 9 live hot-swap
on .228 will catch any regression.
---
 docs/STATUS.md                                | 25 +++++++++----------
 docs/rust-orchestrator-migration.md           |  6 ++---
 image-recipe/build-auto-installer-iso.sh      | 19 +++++++-------
 .../configs/archipelago-reconcile.service     | 14 -----------
 .../configs/archipelago-reconcile.timer       | 14 -----------
 5 files changed, 25 insertions(+), 53 deletions(-)
 delete mode 100644 image-recipe/configs/archipelago-reconcile.service
 delete mode 100644 image-recipe/configs/archipelago-reconcile.timer

diff --git a/docs/STATUS.md b/docs/STATUS.md
index a33be137..e2ac0c1b 100644
--- a/docs/STATUS.md
+++ b/docs/STATUS.md
@@ -15,9 +15,9 @@ Working through the 11-step plan in [`rust-orchestrator-migration.md`](./rust-or
 - [x] **Step 5** — `fc39b04b` BootReconciler with Arc shutdown, 4 paused-time tests pass
 - [x] **Step 6** — main.rs wire-up: construct orchestrator once, load_manifests + adopt_existing + spawn BootReconciler, thread through Server::new / ApiHandler::new / RpcHandler::new, wire shutdown Notify to SIGTERM/SIGINT. Clean `cargo check -p archipelago` (6 pre-existing warnings), container tests 43/44 pass (the one failing `test_parse_image_versions` is pre-existing and unrelated — asserts `!contains_key("NOT_AN_IMAGE")` but the retain on line 106 keeps anything ending in `_IMAGE`).
 - [x] **Step 7** — `069bc4a5` bitcoin-ui pre-start hook renders nginx.conf from embedded template. New `container::bitcoin_ui` module (render fn, atomic tmp+rename, idempotent byte-compare, 8 unit tests). `ProdContainerOrchestrator::run_pre_start_hooks` fires in `install_fresh` before `create_container` and in `ensure_running` (Running+Rewritten → restart; Stopped → re-render+start). bitcoin-ui Dockerfile no longer COPYs nginx conf; arrives via runtime bind-mount (safe-failure → 404 if missing, never stale auth). `apps/{bitcoin,electrs,lnd}-ui/manifest.yml` land. Integration test asserts `install("bitcoin-ui")` writes substituted config to disk. 39/39 container:: tests pass (same 1 pre-existing failure).
-- [ ] **Step 8a** — Delete `reconcile-containers.sh` + `container-specs.sh` + `archipelago-reconcile.{service,timer}` + ISO builder touchpoints. Safe, `BootReconciler` fully replaces. Next up.
-- [ ] **Step 8b** — Port remaining ~25 container creations from `first-boot-containers.sh` into `apps//manifest.yml` (deferred, multi-day work)
-- [ ] **Step 8c** — Rename `first-boot-containers.sh` → `first-boot-setup.sh`, strip container ops, keep setup. Add ISO lines to copy `apps/` (final one-way door, requires 8b complete)
+- [ ] **Step 8a** — Delete `archipelago-reconcile.{service,timer}` + ISO builder touchpoints. Keep `reconcile-containers.sh` + `container-specs.sh` for `update.rs` OTA path. Next up.
+- [ ] **Step 8b** — Port remaining ~25 container creations from `first-boot-containers.sh` into `apps//manifest.yml`, then port `update.rs` to orchestrator (deferred, multi-day work)
+- [ ] **Step 8c** — Rename `first-boot-containers.sh` → `first-boot-setup.sh`, strip container ops, keep setup. Delete `reconcile-containers.sh` + `container-specs.sh`. Add ISO lines to copy `apps/` (final one-way door, requires 8b complete)
 - [ ] **Step 9** — Hot-swap + verify on .228
 - [ ] **Step 10** — Hot-swap + verify on .116
 - [ ] **Step 11** — Chaos matrix on both nodes
@@ -55,20 +55,18 @@ Both are development alpha nodes — **full destructive latitude**, no need to a
 ## Next action
-**Step 8a — Delete the reconcile bash path.** Safe, isolated, atomic.
+**Step 8a — Delete the reconcile systemd timer path.** Safe, isolated, atomic.
 Files to delete:
-1. `scripts/reconcile-containers.sh` (531 LOC — `BootReconciler` fully replaces)
-2. `scripts/container-specs.sh` (602 LOC — manifest-driven now)
-3. `image-recipe/configs/archipelago-reconcile.service`
-4. `image-recipe/configs/archipelago-reconcile.timer`
+1. `image-recipe/configs/archipelago-reconcile.service` (14 LOC — replaced by BootReconciler)
+2. `image-recipe/configs/archipelago-reconcile.timer` (14 LOC — replaced by BootReconciler)
 ISO builder edits in `image-recipe/build-auto-installer-iso.sh`:
 - L412-413: drop `COPY archipelago-reconcile.{service,timer}`
-- L429-430: drop `COPY reconcile-containers.sh` + `container-specs.sh`
-- L453: drop `systemctl enable archipelago-reconcile.timer`
-- L547-548: drop the `cp archipelago-reconcile.{service,timer}` block
-- L550: drop `reconcile-containers.sh container-specs.sh` from the loop
+- L449: drop `systemctl enable archipelago-reconcile.timer`
+- L542-543: drop the `cp archipelago-reconcile.{service,timer}` block
+
+**Keep** `scripts/reconcile-containers.sh` + `scripts/container-specs.sh` because `core/archipelago/src/api/rpc/package/update.rs` still shells out to reconcile-containers.sh during OTA updates. Porting update.rs to `ContainerOrchestrator::upgrade()` requires manifests for every container it touches — that's Step 8b's scope.
 No Rust changes. Atomic single commit. Full ISO build test on .116 before commit per user ask.
@@ -81,7 +79,8 @@ No Rust changes. Atomic single commit. Full ISO build test on .116 before commit
 Original plan was one commit "delete bash + edit ISO builder". But on investigation:
 - `first-boot-containers.sh` creates **30+ containers** with per-container logic (wallets, DB init, rpcauth derivations, post-create health waits). The repo only has manifests for 3 (bitcoin-ui, electrs-ui, lnd-ui from Step 7). Deleting bash now = brick first-boot on fresh installs.
 - Script also does non-container setup: secret generation (RPC pw, DB pw, FileBrowser admin pw), UID-mapping chowns for rootless podman subuid, Tor hostnames dir, WireGuard, firewall rules, nostr-relay dir. None of this lives in the Rust orchestrator.
-- Design doc §505 updated to split 8 → 8a/8b/8c. Only 8a is safe to execute before we port manifests.
+- `update.rs` (OTA update RPC) invokes `reconcile-containers.sh` at two sites. Deleting the script breaks package updates. Porting those call sites to the orchestrator needs all containers to have manifests.
+- Design doc §505 updated to split 8 → 8a/8b/8c. Only 8a (delete the reconcile systemd unit + timer, BootReconciler covers) is safe to execute before we port manifests.
 ---
diff --git a/docs/rust-orchestrator-migration.md b/docs/rust-orchestrator-migration.md
index ce1d6f6f..46845c49 100644
--- a/docs/rust-orchestrator-migration.md
+++ b/docs/rust-orchestrator-migration.md
@@ -503,9 +503,9 @@ Chaos matrix (bash + Playwright, the original goal):
 6. **main.rs wire-up**: adopt + spawn reconciler. ~20 LOC.
 7. **3 UI manifests + Dockerfile BITCOIN_RPC_AUTH refactor** (use ARG + template file, not sed). ~60 lines of YAML + ~20 lines of Dockerfile.
 8. **Remove bash scripts + services**: split into sub-steps because `first-boot-containers.sh` creates 25+ containers (only 3 ported in Step 7) AND does non-container setup (secret gen, UID-mapping chowns, Tor hostnames, WireGuard, firewall, nostr-relay dir):
- - **8a** (cheap, safe): delete `scripts/reconcile-containers.sh` + `scripts/container-specs.sh` + `image-recipe/configs/archipelago-reconcile.{service,timer}` + their ISO-builder touchpoints. `BootReconciler` fully replaces these — no manifest porting required. Atomic commit, low risk.
- - **8b** (large, deferred): port the remaining ~25 container creations from `first-boot-containers.sh` into `apps//manifest.yml` files. One manifest per commit, validated against current bash behavior (ports, volumes, env, deps, health checks, post-create wallet/db bootstrap). Probably 1-2 days of careful porting. Includes `apps/filebrowser/manifest.yml`.
- - **8c** (final, one-way door): rename `first-boot-containers.sh` → `first-boot-setup.sh`, strip out all `$DOCKER run/pull/exec` calls, keep only secret generation + dir prep + Tor/WG/firewall/nostr setup. Rename `archipelago-first-boot-containers.service` → `archipelago-first-boot-setup.service`. Add ISO builder lines to copy `apps/*/manifest.yml` → `/opt/archipelago/apps/`. Full ISO build test on .116 required before commit.
+ - **8a** (cheap, safe): delete `image-recipe/configs/archipelago-reconcile.{service,timer}` + their ISO-builder touchpoints (the systemd enablement + `cp` into `$WORK_DIR`). `BootReconciler` fully replaces the timer-driven path — no more periodic bash invocation. **Keep** `scripts/reconcile-containers.sh` + `scripts/container-specs.sh` because `core/archipelago/src/api/rpc/package/update.rs` still shells out to reconcile-containers.sh during OTA updates; porting that call site requires manifests for every container it touches (which is Step 8b's scope). Atomic commit, low risk.
+ - **8b** (large, deferred): port the remaining ~25 container creations from `first-boot-containers.sh` into `apps//manifest.yml` files. One manifest per commit, validated against current bash behavior (ports, volumes, env, deps, health checks, post-create wallet/db bootstrap). Probably 1-2 days of careful porting. Includes `apps/filebrowser/manifest.yml`. Then port `update.rs`'s two `reconcile-containers.sh` call sites to the `ContainerOrchestrator` trait (`upgrade(app_id)`).
+ - **8c** (final, one-way door): rename `first-boot-containers.sh` → `first-boot-setup.sh`, strip out all `$DOCKER run/pull/exec` calls, keep only secret generation + dir prep + Tor/WG/firewall/nostr setup. Rename `archipelago-first-boot-containers.service` → `archipelago-first-boot-setup.service`. Delete `scripts/reconcile-containers.sh` + `scripts/container-specs.sh` (update.rs no longer needs them). Add ISO builder lines to copy `apps/*/manifest.yml` → `/opt/archipelago/apps/`. Full ISO build test on .116 required before commit.
 9. **Live test on .228**: hot-swap binary, expect 3 UIs to come up within 60s of service restart.
 10. **Live test on .116**: hot-swap binary, expect zero container recreation + adoption-confirmed log lines.
 11. **Chaos matrix** on both nodes.
diff --git a/image-recipe/build-auto-installer-iso.sh b/image-recipe/build-auto-installer-iso.sh
index 9053edf1..28c5cb5c 100755
--- a/image-recipe/build-auto-installer-iso.sh
+++ b/image-recipe/build-auto-installer-iso.sh
@@ -409,8 +409,6 @@ COPY archipelago-update.service /etc/systemd/system/archipelago-update.service
 COPY archipelago-update.timer /etc/systemd/system/archipelago-update.timer
 COPY archipelago-doctor.service /etc/systemd/system/archipelago-doctor.service
 COPY archipelago-doctor.timer /etc/systemd/system/archipelago-doctor.timer
-COPY archipelago-reconcile.service /etc/systemd/system/archipelago-reconcile.service
-COPY archipelago-reconcile.timer /etc/systemd/system/archipelago-reconcile.timer
 COPY archipelago-tor-helper.service /etc/systemd/system/archipelago-tor-helper.service
 COPY archipelago-tor-helper.path /etc/systemd/system/archipelago-tor-helper.path
 COPY nostr-vpn.service /etc/systemd/system/nostr-vpn.service
@@ -423,7 +421,10 @@ COPY nostr-relay-config.toml /etc/archipelago/nostr-relay-config.toml
 # WireGuard kernel module auto-load on boot
 RUN echo "wireguard" >> /etc/modules-load.d/wireguard.conf
-# Copy container doctor + reconcile scripts (referenced by the services above)
+# Copy container doctor + reconcile scripts (referenced by services and the
+# OTA update RPC; the reconcile systemd timer is gone as of Step 8a, but the
+# script stays until Step 8b/c ports all manifests — update.rs still shells
+# out to it during package updates).
 RUN mkdir -p /home/archipelago/archy/scripts/lib
 COPY container-doctor.sh /home/archipelago/archy/scripts/container-doctor.sh
 COPY reconcile-containers.sh /home/archipelago/archy/scripts/reconcile-containers.sh
@@ -450,7 +451,6 @@ RUN systemctl enable NetworkManager || true && \
 systemctl enable chrony || true && \
 systemctl enable archipelago-update.timer || true && \
 systemctl enable archipelago-doctor.timer || true && \
- systemctl enable archipelago-reconcile.timer || true && \
 systemctl enable archipelago-tor-helper.path || true && \
 systemctl enable nostr-relay || true
 # archipelago-fips.service + archipelago-wg.service + archipelago-wg-address.service
@@ -540,13 +540,14 @@ NGINXCONF
 echo " Using archipelago-update.service + timer from configs/"
 fi
- # Copy container doctor and reconciliation timers + scripts
+ # Copy container doctor timer + reconcile script (the reconcile systemd
+ # timer is gone as of Step 8a — BootReconciler replaces it — but the
+ # reconcile-containers.sh script stays, invoked by the OTA update RPC
+ # until Step 8b/c ports all manifests to the Rust orchestrator).
 if [ -f "$SCRIPT_DIR/configs/archipelago-doctor.service" ]; then
 cp "$SCRIPT_DIR/configs/archipelago-doctor.service" "$WORK_DIR/archipelago-doctor.service"
 cp "$SCRIPT_DIR/configs/archipelago-doctor.timer" "$WORK_DIR/archipelago-doctor.timer"
- cp "$SCRIPT_DIR/configs/archipelago-reconcile.service" "$WORK_DIR/archipelago-reconcile.service"
- cp "$SCRIPT_DIR/configs/archipelago-reconcile.timer" "$WORK_DIR/archipelago-reconcile.timer"
- # Copy the actual scripts the services reference
+ # Copy the actual scripts the services / update RPC reference
 for s in container-doctor.sh reconcile-containers.sh container-specs.sh tor-helper.sh; do
 if [ -f "$SCRIPT_DIR/../scripts/$s" ]; then
 cp "$SCRIPT_DIR/../scripts/$s" "$WORK_DIR/$s"
@@ -557,7 +558,7 @@ NGINXCONF
 mkdir -p "$WORK_DIR/lib"
 cp "$SCRIPT_DIR/../scripts/lib/"*.sh "$WORK_DIR/lib/" 2>/dev/null || true
 fi
- echo " Using container doctor + reconcile timers from configs/"
+ echo " Using container doctor timer from configs/"
 fi
 # Copy Tor helper path-activated service (allows backend to manage Tor as non-root)
diff --git a/image-recipe/configs/archipelago-reconcile.service b/image-recipe/configs/archipelago-reconcile.service
deleted file mode 100644
index c45b1f0e..00000000
--- a/image-recipe/configs/archipelago-reconcile.service
+++ /dev/null
@@ -1,14 +0,0 @@
-[Unit]
-Description=Archipelago Container Reconciliation
-After=archipelago.service
-
-[Service]
-Type=oneshot
-User=archipelago
-Environment="XDG_RUNTIME_DIR=/run/user/1000"
-Environment="HOME=/home/archipelago"
-Environment="PATH=/usr/local/sbin:/usr/local/bin:/usr/sbin:/usr/bin:/sbin:/bin"
-ExecStart=/home/archipelago/archy/scripts/reconcile-containers.sh
-TimeoutStartSec=600
-StandardOutput=journal
-StandardError=journal
diff --git a/image-recipe/configs/archipelago-reconcile.timer b/image-recipe/configs/archipelago-reconcile.timer
deleted file mode 100644
index 7b9d7d8e..00000000
--- a/image-recipe/configs/archipelago-reconcile.timer
+++ /dev/null
@@ -1,14 +0,0 @@
-[Unit]
-Description=Archipelago container reconciliation (periodic)
-
-[Timer]
-# First run 10 minutes after boot, then every 6 hours
-OnBootSec=10min
-OnUnitActiveSec=6h
-# Jitter to avoid load spikes
-RandomizedDelaySec=300
-# Run missed checks on boot
-Persistent=true
-
-[Install]
-WantedBy=timers.target