feat(iso): Step 8a — retire archipelago-reconcile systemd timer

BootReconciler (in-process, 30s interval, spawned from main.rs as of Step 6 commit 48f08aa3) fully replaces the timer-driven bash reconciliation path. Delete the systemd unit + timer and their ISO-builder touchpoints.

Removed:
- image-recipe/configs/archipelago-reconcile.service
- image-recipe/configs/archipelago-reconcile.timer
- image-recipe/build-auto-installer-iso.sh L412-413 (COPY unit+timer)
- image-recipe/build-auto-installer-iso.sh L449 (systemctl enable)
- image-recipe/build-auto-installer-iso.sh L542-543 (cp to WORK_DIR)

Kept (intentionally):
- scripts/reconcile-containers.sh
- scripts/container-specs.sh

Reason: core/archipelago/src/api/rpc/package/update.rs still invokes reconcile-containers.sh at two sites (OTA update + rollback paths). Porting those call sites to ContainerOrchestrator::upgrade() requires manifests for every container update.rs might touch — that scope belongs in Step 8b. Until then the script stays on disk, just no longer runs on a periodic timer.

No Rust code changes. cargo check -p archipelago clean, 6 pre-existing warnings.

Skipped full ISO rebuild validation per user decision — edits are 5 textual deletions with zero behavioral ambiguity; Step 9 live hot-swap on .228 will catch any regression.
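
For orientation, the in-process loop that takes over from the timer can be pictured roughly as below. This is a minimal std-only sketch, not the BootReconciler code: `run_reconciler`, its `reconcile_once` callback, and the `mpsc` shutdown channel are hypothetical stand-ins (the real reconciler is described as using an `Arc<Notify>` shutdown signal and a 30s interval).

```rust
use std::sync::mpsc::{Receiver, RecvTimeoutError};
use std::time::Duration;

/// Hypothetical periodic reconcile loop (illustration only).
///
/// Runs `reconcile_once` immediately, then again after every `interval`,
/// until a shutdown message arrives or the sender side is dropped.
/// Returns the number of passes that were executed.
fn run_reconciler<F: FnMut()>(
    interval: Duration,
    shutdown: Receiver<()>,
    mut reconcile_once: F,
) -> u32 {
    let mut passes = 0;
    loop {
        reconcile_once();
        passes += 1;
        // Waiting on the shutdown channel doubles as the sleep between
        // passes, so a shutdown request interrupts the wait right away.
        match shutdown.recv_timeout(interval) {
            Err(RecvTimeoutError::Timeout) => continue,
            Ok(()) | Err(RecvTimeoutError::Disconnected) => return passes,
        }
    }
}
```

A caller would typically spawn this on its own thread with `Duration::from_secs(30)` and send `()` on the channel from the SIGTERM/SIGINT path. Unlike the retired 6h timer, the wait and the shutdown check are the same operation, so no separate unit has to be stopped.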
parent 236a2dee85
commit c396be8068
@@ -15,9 +15,9 @@ Working through the 11-step plan in [`rust-orchestrator-migration.md`](./rust-or
 - [x] **Step 5** — `fc39b04b` BootReconciler with Arc<Notify> shutdown, 4 paused-time tests pass
 - [x] **Step 6** — main.rs wire-up: construct orchestrator once, load_manifests + adopt_existing + spawn BootReconciler, thread through Server::new / ApiHandler::new / RpcHandler::new, wire shutdown Notify to SIGTERM/SIGINT. Clean `cargo check -p archipelago` (6 pre-existing warnings), container tests 43/44 pass (the one failing `test_parse_image_versions` is pre-existing and unrelated — asserts `!contains_key("NOT_AN_IMAGE")` but the retain on line 106 keeps anything ending in `_IMAGE`).
 - [x] **Step 7** — `069bc4a5` bitcoin-ui pre-start hook renders nginx.conf from embedded template. New `container::bitcoin_ui` module (render fn, atomic tmp+rename, idempotent byte-compare, 8 unit tests). `ProdContainerOrchestrator::run_pre_start_hooks` fires in `install_fresh` before `create_container` and in `ensure_running` (Running+Rewritten → restart; Stopped → re-render+start). bitcoin-ui Dockerfile no longer COPYs nginx conf; arrives via runtime bind-mount (safe-failure → 404 if missing, never stale auth). `apps/{bitcoin,electrs,lnd}-ui/manifest.yml` land. Integration test asserts `install("bitcoin-ui")` writes substituted config to disk. 39/39 container:: tests pass (same 1 pre-existing failure).
-- [ ] **Step 8a** — Delete `reconcile-containers.sh` + `container-specs.sh` + `archipelago-reconcile.{service,timer}` + ISO builder touchpoints. Safe, `BootReconciler` fully replaces. Next up.
-- [ ] **Step 8b** — Port remaining ~25 container creations from `first-boot-containers.sh` into `apps/<id>/manifest.yml` (deferred, multi-day work)
-- [ ] **Step 8c** — Rename `first-boot-containers.sh` → `first-boot-setup.sh`, strip container ops, keep setup. Add ISO lines to copy `apps/` (final one-way door, requires 8b complete)
+- [ ] **Step 8a** — Delete `archipelago-reconcile.{service,timer}` + ISO builder touchpoints. Keep `reconcile-containers.sh` + `container-specs.sh` for `update.rs` OTA path. Next up.
+- [ ] **Step 8b** — Port remaining ~25 container creations from `first-boot-containers.sh` into `apps/<id>/manifest.yml`, then port `update.rs` to orchestrator (deferred, multi-day work)
+- [ ] **Step 8c** — Rename `first-boot-containers.sh` → `first-boot-setup.sh`, strip container ops, keep setup. Delete `reconcile-containers.sh` + `container-specs.sh`. Add ISO lines to copy `apps/` (final one-way door, requires 8b complete)
 - [ ] **Step 9** — Hot-swap + verify on .228
 - [ ] **Step 10** — Hot-swap + verify on .116
 - [ ] **Step 11** — Chaos matrix on both nodes
@@ -55,20 +55,18 @@ Both are development alpha nodes — **full destructive latitude**, no need to a

 ## Next action

-**Step 8a — Delete the reconcile bash path.** Safe, isolated, atomic.
+**Step 8a — Delete the reconcile systemd timer path.** Safe, isolated, atomic.

 Files to delete:
-1. `scripts/reconcile-containers.sh` (531 LOC — `BootReconciler` fully replaces)
-2. `scripts/container-specs.sh` (602 LOC — manifest-driven now)
-3. `image-recipe/configs/archipelago-reconcile.service`
-4. `image-recipe/configs/archipelago-reconcile.timer`
+1. `image-recipe/configs/archipelago-reconcile.service` (14 LOC — replaced by BootReconciler)
+2. `image-recipe/configs/archipelago-reconcile.timer` (14 LOC — replaced by BootReconciler)

 ISO builder edits in `image-recipe/build-auto-installer-iso.sh`:
 - L412-413: drop `COPY archipelago-reconcile.{service,timer}`
-- L429-430: drop `COPY reconcile-containers.sh` + `container-specs.sh`
-- L453: drop `systemctl enable archipelago-reconcile.timer`
-- L547-548: drop the `cp archipelago-reconcile.{service,timer}` block
-- L550: drop `reconcile-containers.sh container-specs.sh` from the loop
+- L449: drop `systemctl enable archipelago-reconcile.timer`
+- L542-543: drop the `cp archipelago-reconcile.{service,timer}` block

+**Keep** `scripts/reconcile-containers.sh` + `scripts/container-specs.sh` because `core/archipelago/src/api/rpc/package/update.rs` still shells out to reconcile-containers.sh during OTA updates. Porting update.rs to `ContainerOrchestrator::upgrade()` requires manifests for every container it touches — that's Step 8b's scope.
+
 No Rust changes. Atomic single commit. Full ISO build test on .116 before commit per user ask.

@@ -81,7 +79,8 @@ No Rust changes. Atomic single commit. Full ISO build test on .116 before commit
 Original plan was one commit "delete bash + edit ISO builder". But on investigation:
 - `first-boot-containers.sh` creates **30+ containers** with per-container logic (wallets, DB init, rpcauth derivations, post-create health waits). The repo only has manifests for 3 (bitcoin-ui, electrs-ui, lnd-ui from Step 7). Deleting bash now = brick first-boot on fresh installs.
 - Script also does non-container setup: secret generation (RPC pw, DB pw, FileBrowser admin pw), UID-mapping chowns for rootless podman subuid, Tor hostnames dir, WireGuard, firewall rules, nostr-relay dir. None of this lives in the Rust orchestrator.
-- Design doc §505 updated to split 8 → 8a/8b/8c. Only 8a is safe to execute before we port manifests.
+- `update.rs` (OTA update RPC) invokes `reconcile-containers.sh` at two sites. Deleting the script breaks package updates. Porting those call sites to the orchestrator needs all containers to have manifests.
+- Design doc §505 updated to split 8 → 8a/8b/8c. Only 8a (delete the reconcile systemd unit + timer, BootReconciler covers) is safe to execute before we port manifests.

 ---

@@ -503,9 +503,9 @@ Chaos matrix (bash + Playwright, the original goal):
 6. **main.rs wire-up**: adopt + spawn reconciler. ~20 LOC.
 7. **3 UI manifests + Dockerfile BITCOIN_RPC_AUTH refactor** (use ARG + template file, not sed). ~60 lines of YAML + ~20 lines of Dockerfile.
 8. **Remove bash scripts + services**: split into sub-steps because `first-boot-containers.sh` creates 25+ containers (only 3 ported in Step 7) AND does non-container setup (secret gen, UID-mapping chowns, Tor hostnames, WireGuard, firewall, nostr-relay dir):
-- **8a** (cheap, safe): delete `scripts/reconcile-containers.sh` + `scripts/container-specs.sh` + `image-recipe/configs/archipelago-reconcile.{service,timer}` + their ISO-builder touchpoints. `BootReconciler` fully replaces these — no manifest porting required. Atomic commit, low risk.
-- **8b** (large, deferred): port the remaining ~25 container creations from `first-boot-containers.sh` into `apps/<id>/manifest.yml` files. One manifest per commit, validated against current bash behavior (ports, volumes, env, deps, health checks, post-create wallet/db bootstrap). Probably 1-2 days of careful porting. Includes `apps/filebrowser/manifest.yml`.
-- **8c** (final, one-way door): rename `first-boot-containers.sh` → `first-boot-setup.sh`, strip out all `$DOCKER run/pull/exec` calls, keep only secret generation + dir prep + Tor/WG/firewall/nostr setup. Rename `archipelago-first-boot-containers.service` → `archipelago-first-boot-setup.service`. Add ISO builder lines to copy `apps/*/manifest.yml` → `/opt/archipelago/apps/`. Full ISO build test on .116 required before commit.
+- **8a** (cheap, safe): delete `image-recipe/configs/archipelago-reconcile.{service,timer}` + their ISO-builder touchpoints (the systemd enablement + `cp` into `$WORK_DIR`). `BootReconciler` fully replaces the timer-driven path — no more periodic bash invocation. **Keep** `scripts/reconcile-containers.sh` + `scripts/container-specs.sh` because `core/archipelago/src/api/rpc/package/update.rs` still shells out to reconcile-containers.sh during OTA updates; porting that call site requires manifests for every container it touches (which is Step 8b's scope). Atomic commit, low risk.
+- **8b** (large, deferred): port the remaining ~25 container creations from `first-boot-containers.sh` into `apps/<id>/manifest.yml` files. One manifest per commit, validated against current bash behavior (ports, volumes, env, deps, health checks, post-create wallet/db bootstrap). Probably 1-2 days of careful porting. Includes `apps/filebrowser/manifest.yml`. Then port `update.rs`'s two `reconcile-containers.sh` call sites to the `ContainerOrchestrator` trait (`upgrade(app_id)`).
+- **8c** (final, one-way door): rename `first-boot-containers.sh` → `first-boot-setup.sh`, strip out all `$DOCKER run/pull/exec` calls, keep only secret generation + dir prep + Tor/WG/firewall/nostr setup. Rename `archipelago-first-boot-containers.service` → `archipelago-first-boot-setup.service`. Delete `scripts/reconcile-containers.sh` + `scripts/container-specs.sh` (update.rs no longer needs them). Add ISO builder lines to copy `apps/*/manifest.yml` → `/opt/archipelago/apps/`. Full ISO build test on .116 required before commit.
 9. **Live test on .228**: hot-swap binary, expect 3 UIs to come up within 60s of service restart.
 10. **Live test on .116**: hot-swap binary, expect zero container recreation + adoption-confirmed log lines.
 11. **Chaos matrix** on both nodes.
@@ -409,8 +409,6 @@ COPY archipelago-update.service /etc/systemd/system/archipelago-update.service
 COPY archipelago-update.timer /etc/systemd/system/archipelago-update.timer
 COPY archipelago-doctor.service /etc/systemd/system/archipelago-doctor.service
 COPY archipelago-doctor.timer /etc/systemd/system/archipelago-doctor.timer
-COPY archipelago-reconcile.service /etc/systemd/system/archipelago-reconcile.service
-COPY archipelago-reconcile.timer /etc/systemd/system/archipelago-reconcile.timer
 COPY archipelago-tor-helper.service /etc/systemd/system/archipelago-tor-helper.service
 COPY archipelago-tor-helper.path /etc/systemd/system/archipelago-tor-helper.path
 COPY nostr-vpn.service /etc/systemd/system/nostr-vpn.service
@@ -423,7 +421,10 @@ COPY nostr-relay-config.toml /etc/archipelago/nostr-relay-config.toml
 # WireGuard kernel module auto-load on boot
 RUN echo "wireguard" >> /etc/modules-load.d/wireguard.conf

-# Copy container doctor + reconcile scripts (referenced by the services above)
+# Copy container doctor + reconcile scripts (referenced by services and the
+# OTA update RPC; the reconcile systemd timer is gone as of Step 8a, but the
+# script stays until Step 8b/c ports all manifests — update.rs still shells
+# out to it during package updates).
 RUN mkdir -p /home/archipelago/archy/scripts/lib
 COPY container-doctor.sh /home/archipelago/archy/scripts/container-doctor.sh
 COPY reconcile-containers.sh /home/archipelago/archy/scripts/reconcile-containers.sh
@@ -450,7 +451,6 @@ RUN systemctl enable NetworkManager || true && \
 systemctl enable chrony || true && \
 systemctl enable archipelago-update.timer || true && \
 systemctl enable archipelago-doctor.timer || true && \
-systemctl enable archipelago-reconcile.timer || true && \
 systemctl enable archipelago-tor-helper.path || true && \
 systemctl enable nostr-relay || true
 # archipelago-fips.service + archipelago-wg.service + archipelago-wg-address.service
@@ -540,13 +540,14 @@ NGINXCONF
 echo " Using archipelago-update.service + timer from configs/"
 fi

-# Copy container doctor and reconciliation timers + scripts
+# Copy container doctor timer + reconcile script (the reconcile systemd
+# timer is gone as of Step 8a — BootReconciler replaces it — but the
+# reconcile-containers.sh script stays, invoked by the OTA update RPC
+# until Step 8b/c ports all manifests to the Rust orchestrator).
 if [ -f "$SCRIPT_DIR/configs/archipelago-doctor.service" ]; then
 cp "$SCRIPT_DIR/configs/archipelago-doctor.service" "$WORK_DIR/archipelago-doctor.service"
 cp "$SCRIPT_DIR/configs/archipelago-doctor.timer" "$WORK_DIR/archipelago-doctor.timer"
-cp "$SCRIPT_DIR/configs/archipelago-reconcile.service" "$WORK_DIR/archipelago-reconcile.service"
-cp "$SCRIPT_DIR/configs/archipelago-reconcile.timer" "$WORK_DIR/archipelago-reconcile.timer"
-# Copy the actual scripts the services reference
+# Copy the actual scripts the services / update RPC reference
 for s in container-doctor.sh reconcile-containers.sh container-specs.sh tor-helper.sh; do
 if [ -f "$SCRIPT_DIR/../scripts/$s" ]; then
 cp "$SCRIPT_DIR/../scripts/$s" "$WORK_DIR/$s"
@@ -557,7 +558,7 @@ NGINXCONF
 mkdir -p "$WORK_DIR/lib"
 cp "$SCRIPT_DIR/../scripts/lib/"*.sh "$WORK_DIR/lib/" 2>/dev/null || true
 fi
-echo " Using container doctor + reconcile timers from configs/"
+echo " Using container doctor timer from configs/"
 fi

 # Copy Tor helper path-activated service (allows backend to manage Tor as non-root)
@@ -1,14 +0,0 @@
-[Unit]
-Description=Archipelago Container Reconciliation
-After=archipelago.service
-
-[Service]
-Type=oneshot
-User=archipelago
-Environment="XDG_RUNTIME_DIR=/run/user/1000"
-Environment="HOME=/home/archipelago"
-Environment="PATH=/usr/local/sbin:/usr/local/bin:/usr/sbin:/usr/bin:/sbin:/bin"
-ExecStart=/home/archipelago/archy/scripts/reconcile-containers.sh
-TimeoutStartSec=600
-StandardOutput=journal
-StandardError=journal
@@ -1,14 +0,0 @@
-[Unit]
-Description=Archipelago container reconciliation (periodic)
-
-[Timer]
-# First run 10 minutes after boot, then every 6 hours
-OnBootSec=10min
-OnUnitActiveSec=6h
-# Jitter to avoid load spikes
-RandomizedDelaySec=300
-# Run missed checks on boot
-Persistent=true
-
-[Install]
-WantedBy=timers.target