From 0ea4f96de95a5e14e6e21ec96919ebce42cdcf49 Mon Sep 17 00:00:00 2001 From: archipelago Date: Thu, 23 Apr 2026 05:30:45 -0400 Subject: [PATCH] docs(status): mark async-spawn lifecycle fix as shipped Records the four landed commits, the .228 deploy (binary + frontend paths, backups, md5), the manual LND Stop verification, and the rollback incantation. Leaves the older "NEXT SESSION" design block in place as historical reference with a note that it's stale. Adds a follow-ups list: chaos matrix is now unblocked, bundled-app RPCs are still sync (deprecate or mirror-async?), transitional_since is in-memory only, and there are 22 pre-existing test failures in unrelated modules that should get their own cleanup pass. --- docs/STATUS.md | 35 +++++++++++++++++++++++++++++++++-- 1 file changed, 33 insertions(+), 2 deletions(-) diff --git a/docs/STATUS.md b/docs/STATUS.md index 58c9092d..2e022c8c 100644 --- a/docs/STATUS.md +++ b/docs/STATUS.md @@ -1,12 +1,43 @@ # RESUME HERE — Rust orchestrator migration -Updated: 2026-04-23 (Dashboard Stop UX bug diagnosed; async-spawn fix fully designed, ready to implement) +Updated: 2026-04-23 (Async-spawn lifecycle fix LANDED + deployed to .228, manual LND Stop verified — "absolutely beautiful" per user. Ready for chaos matrix / Step 11.) **To resume this work, SSH into the ThinkPad and run `opencode` from `~/Projects/archy/`. Or work from the laptop via the SSHFS mount at `~/mnt/archy-thinkpad/`.** --- -## ⚡ NEXT SESSION — START HERE +## ✅ ASYNC-SPAWN LIFECYCLE FIX — SHIPPED + +Landed 2026-04-23 in 4 commits on `main` (unpushed per user mirror protocol): + +- `44cd5eef` `feat(rpc): spawn_transitional helper for async lifecycle ops` — new `api/rpc/transitional.rs` with `Op::{Stop,Start,Restart}` and `RpcHandler::spawn_transitional` / `flip_to_transitional` / `set_state` helpers. `install_log` re-exported so sibling modules can use it. +- `19a99ca9` `fix(rpc): async container stop/start/restart; widen state mapping` — `container.rs` start/stop rewritten + restart added; `container-list` now emits all transitional variants instead of falling back to `"unknown"`. `dispatcher.rs` registers `container-restart`. `package/runtime.rs` mirrored with `do_package_*` helpers inside `tokio::spawn` and revert-on-error. +- `6712810b` `fix(state): preserve transitional state across container scans` — `server.rs` scan merge now keeps transitional states while taking fresh observability fields; 1200s stuck-timeout escape hatch via `transitional_since: HashMap`. Three passing `server::merge_tests`. +- `9ce28f08` `fix(ui): single-button lifecycle control with transitional labels` — `ContainerApps.vue` and `ContainerAppDetails.vue` use a single primary button driven by `getAppVisualState()`. **Dashboard now routes through `container-start`/`container-stop`** (the async RPCs) instead of the legacy synchronous `bundled-app-*` path. `ContainerStatus.vue` widened to render all new variants. + +**Deployed to .228** (ThinkPad demo device): +- Binary at `/usr/local/bin/archipelago` (md5 `de86b63f74c7e6fe6e555ffe30b86b4f`), backup at `/usr/local/bin/archipelago.bak-pre-async-stop`. +- Frontend at `/opt/archipelago/web-ui/`, backup at `/opt/archipelago/web-ui.bak-pre-async-stop/`. +- Release build took 3m56s on .116. Deploy via scp + atomic `install -m 755` + `systemctl restart archipelago`. `nginx -t` + `systemctl reload nginx` for frontend. + +**Manual verification**: user clicked Stop on LND in the dashboard. Button flipped to `Stopping…` instantly, held for the full graceful-stop window, transitioned to `Start` when `podman stop` completed. No mid-flight revert to Running. User sign-off: _"absolutely beautiful"_. + +**Rollback (if ever needed)**: +``` +ssh archy228 'sudo cp /usr/local/bin/archipelago.bak-pre-async-stop /usr/local/bin/archipelago && sudo rsync -a --delete /opt/archipelago/web-ui.bak-pre-async-stop/ /opt/archipelago/web-ui/ && sudo systemctl restart archipelago && sudo systemctl reload nginx' +``` + +### Follow-ups to consider + +1. **Chaos matrix / Step 11** — the original next-step gated behind this fix. Now unblocked. +2. **bundled-app-start / bundled-app-stop** — still synchronous in the backend. Dashboard no longer calls them, but the RPC methods remain for any external caller. Decide: deprecate, or mirror the async-spawn treatment for parity. +3. **`transitional_since` persistence** — currently in-memory only, so a backend restart mid-stop loses the timeout anchor. Acceptable for now (scan loop re-observes live podman state and reconciles), but worth revisiting if crash-recovery stories tighten. +4. **Test regressions inventory** — the full `cargo test -p archipelago` run on .116 shows 22 pre-existing failures in unrelated modules (mesh/wallet/credentials/avatar/session/transport/update-mirrors/fips/identity_manager/image_versions). Unrelated to this work but tech debt. Log at `/tmp/cargo-test-all.log` on .116. +5. **Amend STATUS.md's older "NEXT SESSION — START HERE" section** (below) — it is now stale. Left in place for historical reference of how the fix was designed; delete on the next pass if it gets confusing. + +--- + +## ⚡ NEXT SESSION — START HERE (historical — fix above is now shipped) **Goal**: implement async-spawn lifecycle fix so the dashboard never shows a frozen spinner again. User mandate: _"best server containers in the world"_. Do not ship the chaos matrix (Step 11) until this lands and manual LND stop verifies instant RPC + live `Stopping…` label.