feat(iso): Step 8a — retire archipelago-reconcile systemd timer

BootReconciler (in-process, 30s interval, spawned from main.rs as of
Step 6 commit 48f08aa3) fully replaces the timer-driven bash
reconciliation path. Delete the systemd unit + timer and their
ISO-builder touchpoints.

Removed:
- image-recipe/configs/archipelago-reconcile.service
- image-recipe/configs/archipelago-reconcile.timer
- image-recipe/build-auto-installer-iso.sh L412-413 (COPY unit+timer)
- image-recipe/build-auto-installer-iso.sh L449 (systemctl enable)
- image-recipe/build-auto-installer-iso.sh L542-543 (cp to WORK_DIR)

Kept (intentionally):
- scripts/reconcile-containers.sh
- scripts/container-specs.sh

Reason: core/archipelago/src/api/rpc/package/update.rs still invokes
reconcile-containers.sh at two sites (OTA update + rollback paths).
Porting those call sites to ContainerOrchestrator::upgrade() requires
manifests for every container update.rs might touch — that scope
belongs in Step 8b. Until then the script stays on disk, just no
longer runs on a periodic timer.
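
A minimal sketch of how the remaining callers of the script could be listed before it is finally deleted. The helper names `script_callers` and `mention_count` are hypothetical and not part of the repository; the count includes every line that mentions the script, comments as well as real call sites.

```shell
#!/usr/bin/env bash
# Hypothetical helpers (not in the repo): find what still references a script.

# List Rust sources under <source-root> that mention <script-name>.
script_callers() {
    local root="$1" script="$2"
    # -r recurse, -l print file names only, -F match the name literally.
    # grep exits 1 when nothing matches, so "|| true" keeps this usable
    # under "set -e" / "set -o pipefail".
    grep -rlF --include='*.rs' -- "$script" "$root" | sort || true
}

# Count the lines in <file> that mention <script-name>.
mention_count() {
    local file="$1" script="$2"
    # -c prints the number of matching lines ("0" when there are none).
    grep -cF -- "$script" "$file" || true
}
```

Run as `script_callers core/archipelago/src reconcile-containers.sh`; once the output is empty, nothing in the Rust tree depends on the script any more.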

No Rust code changes. cargo check -p archipelago clean, 6 pre-existing
warnings. Skipped full ISO rebuild validation per user decision —
edits are 5 textual deletions with zero behavioral ambiguity; Step 9
live hot-swap on .228 will catch any regression.
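
Since the ISO rebuild was skipped, a cheap textual check can stand in for part of it. The sketch below assumes the only way to reference the retired units is by their literal file names; `stale_unit_refs` is a hypothetical helper, not something in the repository.

```shell
#!/usr/bin/env bash
# Hypothetical check (not in the repo): fail if any file under <root> still
# names the retired reconcile unit or timer.
stale_unit_refs() {
    local root="$1"
    # The pattern matches archipelago-reconcile.service / .timer only, so the
    # intentionally kept reconcile-containers.sh does not trigger it.
    if grep -rnE -- 'archipelago-reconcile\.(service|timer)' "$root"; then
        echo "stale references found under $root" >&2
        return 1
    fi
    return 0
}
```

Run as `stale_unit_refs image-recipe`. It prints each offending `file:line:` and returns non-zero if any remain. It does not replace a real ISO build, which also exercises the COPY and `systemctl enable` steps.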
archipelago 2026-04-23 03:04:58 -04:00
parent 236a2dee85
commit c396be8068
5 changed files with 25 additions and 53 deletions

@@ -15,9 +15,9 @@ Working through the 11-step plan in [`rust-orchestrator-migration.md`](./rust-or
 - [x] **Step 5** `fc39b04b` BootReconciler with Arc<Notify> shutdown, 4 paused-time tests pass
 - [x] **Step 6** — main.rs wire-up: construct orchestrator once, load_manifests + adopt_existing + spawn BootReconciler, thread through Server::new / ApiHandler::new / RpcHandler::new, wire shutdown Notify to SIGTERM/SIGINT. Clean `cargo check -p archipelago` (6 pre-existing warnings), container tests 43/44 pass (the one failing `test_parse_image_versions` is pre-existing and unrelated — asserts `!contains_key("NOT_AN_IMAGE")` but the retain on line 106 keeps anything ending in `_IMAGE`).
 - [x] **Step 7** `069bc4a5` bitcoin-ui pre-start hook renders nginx.conf from embedded template. New `container::bitcoin_ui` module (render fn, atomic tmp+rename, idempotent byte-compare, 8 unit tests). `ProdContainerOrchestrator::run_pre_start_hooks` fires in `install_fresh` before `create_container` and in `ensure_running` (Running+Rewritten → restart; Stopped → re-render+start). bitcoin-ui Dockerfile no longer COPYs nginx conf; arrives via runtime bind-mount (safe-failure → 404 if missing, never stale auth). `apps/{bitcoin,electrs,lnd}-ui/manifest.yml` land. Integration test asserts `install("bitcoin-ui")` writes substituted config to disk. 39/39 container:: tests pass (same 1 pre-existing failure).
-- [ ] **Step 8a** — Delete `reconcile-containers.sh` + `container-specs.sh` + `archipelago-reconcile.{service,timer}` + ISO builder touchpoints. Safe, `BootReconciler` fully replaces. Next up.
-- [ ] **Step 8b** — Port remaining ~25 container creations from `first-boot-containers.sh` into `apps/<id>/manifest.yml` (deferred, multi-day work)
-- [ ] **Step 8c** — Rename `first-boot-containers.sh` → `first-boot-setup.sh`, strip container ops, keep setup. Add ISO lines to copy `apps/` (final one-way door, requires 8b complete)
+- [ ] **Step 8a** — Delete `archipelago-reconcile.{service,timer}` + ISO builder touchpoints. Keep `reconcile-containers.sh` + `container-specs.sh` for `update.rs` OTA path. Next up.
+- [ ] **Step 8b** — Port remaining ~25 container creations from `first-boot-containers.sh` into `apps/<id>/manifest.yml`, then port `update.rs` to orchestrator (deferred, multi-day work)
+- [ ] **Step 8c** — Rename `first-boot-containers.sh` → `first-boot-setup.sh`, strip container ops, keep setup. Delete `reconcile-containers.sh` + `container-specs.sh`. Add ISO lines to copy `apps/` (final one-way door, requires 8b complete)
 - [ ] **Step 9** — Hot-swap + verify on .228
 - [ ] **Step 10** — Hot-swap + verify on .116
 - [ ] **Step 11** — Chaos matrix on both nodes
@@ -55,20 +55,18 @@ Both are development alpha nodes — **full destructive latitude**, no need to a
 ## Next action
-**Step 8a — Delete the reconcile bash path.** Safe, isolated, atomic.
+**Step 8a — Delete the reconcile systemd timer path.** Safe, isolated, atomic.
 Files to delete:
-1. `scripts/reconcile-containers.sh` (531 LOC — `BootReconciler` fully replaces)
-2. `scripts/container-specs.sh` (602 LOC — manifest-driven now)
-3. `image-recipe/configs/archipelago-reconcile.service`
-4. `image-recipe/configs/archipelago-reconcile.timer`
+1. `image-recipe/configs/archipelago-reconcile.service` (14 LOC — replaced by BootReconciler)
+2. `image-recipe/configs/archipelago-reconcile.timer` (14 LOC — replaced by BootReconciler)
 ISO builder edits in `image-recipe/build-auto-installer-iso.sh`:
 - L412-413: drop `COPY archipelago-reconcile.{service,timer}`
-- L429-430: drop `COPY reconcile-containers.sh` + `container-specs.sh`
-- L453: drop `systemctl enable archipelago-reconcile.timer`
-- L547-548: drop the `cp archipelago-reconcile.{service,timer}` block
-- L550: drop `reconcile-containers.sh container-specs.sh` from the loop
+- L449: drop `systemctl enable archipelago-reconcile.timer`
+- L542-543: drop the `cp archipelago-reconcile.{service,timer}` block
+**Keep** `scripts/reconcile-containers.sh` + `scripts/container-specs.sh` because `core/archipelago/src/api/rpc/package/update.rs` still shells out to reconcile-containers.sh during OTA updates. Porting update.rs to `ContainerOrchestrator::upgrade()` requires manifests for every container it touches — that's Step 8b's scope.
 No Rust changes. Atomic single commit. Full ISO build test on .116 before commit per user ask.
@@ -81,7 +79,8 @@ No Rust changes. Atomic single commit. Full ISO build test on .116 before commit
 Original plan was one commit "delete bash + edit ISO builder". But on investigation:
 - `first-boot-containers.sh` creates **30+ containers** with per-container logic (wallets, DB init, rpcauth derivations, post-create health waits). The repo only has manifests for 3 (bitcoin-ui, electrs-ui, lnd-ui from Step 7). Deleting bash now = brick first-boot on fresh installs.
 - Script also does non-container setup: secret generation (RPC pw, DB pw, FileBrowser admin pw), UID-mapping chowns for rootless podman subuid, Tor hostnames dir, WireGuard, firewall rules, nostr-relay dir. None of this lives in the Rust orchestrator.
-- Design doc §505 updated to split 8 → 8a/8b/8c. Only 8a is safe to execute before we port manifests.
+- `update.rs` (OTA update RPC) invokes `reconcile-containers.sh` at two sites. Deleting the script breaks package updates. Porting those call sites to the orchestrator needs all containers to have manifests.
+- Design doc §505 updated to split 8 → 8a/8b/8c. Only 8a (delete the reconcile systemd unit + timer, BootReconciler covers) is safe to execute before we port manifests.
 ---

@@ -503,9 +503,9 @@ Chaos matrix (bash + Playwright, the original goal):
 6. **main.rs wire-up**: adopt + spawn reconciler. ~20 LOC.
 7. **3 UI manifests + Dockerfile BITCOIN_RPC_AUTH refactor** (use ARG + template file, not sed). ~60 lines of YAML + ~20 lines of Dockerfile.
 8. **Remove bash scripts + services**: split into sub-steps because `first-boot-containers.sh` creates 25+ containers (only 3 ported in Step 7) AND does non-container setup (secret gen, UID-mapping chowns, Tor hostnames, WireGuard, firewall, nostr-relay dir):
-- **8a** (cheap, safe): delete `scripts/reconcile-containers.sh` + `scripts/container-specs.sh` + `image-recipe/configs/archipelago-reconcile.{service,timer}` + their ISO-builder touchpoints. `BootReconciler` fully replaces these — no manifest porting required. Atomic commit, low risk.
-- **8b** (large, deferred): port the remaining ~25 container creations from `first-boot-containers.sh` into `apps/<id>/manifest.yml` files. One manifest per commit, validated against current bash behavior (ports, volumes, env, deps, health checks, post-create wallet/db bootstrap). Probably 1-2 days of careful porting. Includes `apps/filebrowser/manifest.yml`.
-- **8c** (final, one-way door): rename `first-boot-containers.sh` → `first-boot-setup.sh`, strip out all `$DOCKER run/pull/exec` calls, keep only secret generation + dir prep + Tor/WG/firewall/nostr setup. Rename `archipelago-first-boot-containers.service` → `archipelago-first-boot-setup.service`. Add ISO builder lines to copy `apps/*/manifest.yml` → `/opt/archipelago/apps/`. Full ISO build test on .116 required before commit.
+- **8a** (cheap, safe): delete `image-recipe/configs/archipelago-reconcile.{service,timer}` + their ISO-builder touchpoints (the systemd enablement + `cp` into `$WORK_DIR`). `BootReconciler` fully replaces the timer-driven path — no more periodic bash invocation. **Keep** `scripts/reconcile-containers.sh` + `scripts/container-specs.sh` because `core/archipelago/src/api/rpc/package/update.rs` still shells out to reconcile-containers.sh during OTA updates; porting that call site requires manifests for every container it touches (which is Step 8b's scope). Atomic commit, low risk.
+- **8b** (large, deferred): port the remaining ~25 container creations from `first-boot-containers.sh` into `apps/<id>/manifest.yml` files. One manifest per commit, validated against current bash behavior (ports, volumes, env, deps, health checks, post-create wallet/db bootstrap). Probably 1-2 days of careful porting. Includes `apps/filebrowser/manifest.yml`. Then port `update.rs`'s two `reconcile-containers.sh` call sites to the `ContainerOrchestrator` trait (`upgrade(app_id)`).
+- **8c** (final, one-way door): rename `first-boot-containers.sh` → `first-boot-setup.sh`, strip out all `$DOCKER run/pull/exec` calls, keep only secret generation + dir prep + Tor/WG/firewall/nostr setup. Rename `archipelago-first-boot-containers.service` → `archipelago-first-boot-setup.service`. Delete `scripts/reconcile-containers.sh` + `scripts/container-specs.sh` (update.rs no longer needs them). Add ISO builder lines to copy `apps/*/manifest.yml` → `/opt/archipelago/apps/`. Full ISO build test on .116 required before commit.
 9. **Live test on .228**: hot-swap binary, expect 3 UIs to come up within 60s of service restart.
 10. **Live test on .116**: hot-swap binary, expect zero container recreation + adoption-confirmed log lines.
 11. **Chaos matrix** on both nodes.

@@ -409,8 +409,6 @@ COPY archipelago-update.service /etc/systemd/system/archipelago-update.service
 COPY archipelago-update.timer /etc/systemd/system/archipelago-update.timer
 COPY archipelago-doctor.service /etc/systemd/system/archipelago-doctor.service
 COPY archipelago-doctor.timer /etc/systemd/system/archipelago-doctor.timer
-COPY archipelago-reconcile.service /etc/systemd/system/archipelago-reconcile.service
-COPY archipelago-reconcile.timer /etc/systemd/system/archipelago-reconcile.timer
 COPY archipelago-tor-helper.service /etc/systemd/system/archipelago-tor-helper.service
 COPY archipelago-tor-helper.path /etc/systemd/system/archipelago-tor-helper.path
 COPY nostr-vpn.service /etc/systemd/system/nostr-vpn.service
@@ -423,7 +421,10 @@ COPY nostr-relay-config.toml /etc/archipelago/nostr-relay-config.toml
 # WireGuard kernel module auto-load on boot
 RUN echo "wireguard" >> /etc/modules-load.d/wireguard.conf
-# Copy container doctor + reconcile scripts (referenced by the services above)
+# Copy container doctor + reconcile scripts (referenced by services and the
+# OTA update RPC; the reconcile systemd timer is gone as of Step 8a, but the
+# script stays until Step 8b/c ports all manifests — update.rs still shells
+# out to it during package updates).
 RUN mkdir -p /home/archipelago/archy/scripts/lib
 COPY container-doctor.sh /home/archipelago/archy/scripts/container-doctor.sh
 COPY reconcile-containers.sh /home/archipelago/archy/scripts/reconcile-containers.sh
@@ -450,7 +451,6 @@ RUN systemctl enable NetworkManager || true && \
 systemctl enable chrony || true && \
 systemctl enable archipelago-update.timer || true && \
 systemctl enable archipelago-doctor.timer || true && \
-systemctl enable archipelago-reconcile.timer || true && \
 systemctl enable archipelago-tor-helper.path || true && \
 systemctl enable nostr-relay || true
 # archipelago-fips.service + archipelago-wg.service + archipelago-wg-address.service
@@ -540,13 +540,14 @@ NGINXCONF
 echo " Using archipelago-update.service + timer from configs/"
 fi
-# Copy container doctor and reconciliation timers + scripts
+# Copy container doctor timer + reconcile script (the reconcile systemd
+# timer is gone as of Step 8a — BootReconciler replaces it — but the
+# reconcile-containers.sh script stays, invoked by the OTA update RPC
+# until Step 8b/c ports all manifests to the Rust orchestrator).
 if [ -f "$SCRIPT_DIR/configs/archipelago-doctor.service" ]; then
 cp "$SCRIPT_DIR/configs/archipelago-doctor.service" "$WORK_DIR/archipelago-doctor.service"
 cp "$SCRIPT_DIR/configs/archipelago-doctor.timer" "$WORK_DIR/archipelago-doctor.timer"
-cp "$SCRIPT_DIR/configs/archipelago-reconcile.service" "$WORK_DIR/archipelago-reconcile.service"
-cp "$SCRIPT_DIR/configs/archipelago-reconcile.timer" "$WORK_DIR/archipelago-reconcile.timer"
-# Copy the actual scripts the services reference
+# Copy the actual scripts the services / update RPC reference
 for s in container-doctor.sh reconcile-containers.sh container-specs.sh tor-helper.sh; do
 if [ -f "$SCRIPT_DIR/../scripts/$s" ]; then
 cp "$SCRIPT_DIR/../scripts/$s" "$WORK_DIR/$s"
@@ -557,7 +558,7 @@ NGINXCONF
 mkdir -p "$WORK_DIR/lib"
 cp "$SCRIPT_DIR/../scripts/lib/"*.sh "$WORK_DIR/lib/" 2>/dev/null || true
 fi
-echo " Using container doctor + reconcile timers from configs/"
+echo " Using container doctor timer from configs/"
 fi
 # Copy Tor helper path-activated service (allows backend to manage Tor as non-root)

@@ -1,14 +0,0 @@
-[Unit]
-Description=Archipelago Container Reconciliation
-After=archipelago.service
-[Service]
-Type=oneshot
-User=archipelago
-Environment="XDG_RUNTIME_DIR=/run/user/1000"
-Environment="HOME=/home/archipelago"
-Environment="PATH=/usr/local/sbin:/usr/local/bin:/usr/sbin:/usr/bin:/sbin:/bin"
-ExecStart=/home/archipelago/archy/scripts/reconcile-containers.sh
-TimeoutStartSec=600
-StandardOutput=journal
-StandardError=journal

@@ -1,14 +0,0 @@
-[Unit]
-Description=Archipelago container reconciliation (periodic)
-[Timer]
-# First run 10 minutes after boot, then every 6 hours
-OnBootSec=10min
-OnUnitActiveSec=6h
-# Jitter to avoid load spikes
-RandomizedDelaySec=300
-# Run missed checks on boot
-Persistent=true
-[Install]
-WantedBy=timers.target