Weekly Release Tracker
Last updated: 2026-06-14 (session on node .116 / archi-thinkpad)
▶ IN PROGRESS — LND wallet auto-unlock fix (2026-06-14)
RESUME PROMPT (paste into a fresh session, on .116 / archi-thinkpad, tree at /home/archipelago/Projects/archy)
Resume the LND wallet-password fix. Read memory project_lnd_wallet_password.md FIRST (full root-cause + design + validated facts). Work is on branch lnd-wallet-password-fix (pushed to gitea-vps2, commit 91adc281, NOT merged to main, NOT shipped). Bug: hardcoded WALLET_PASSWORD="hellohello" left LND wallets LOCKED fleet-wide after OTA → Bitcoin-receive shows "wallet is locked" on every updated node. DONE + cargo-checked: per-node random secret (secrets/lnd-wallet-password), both init paths unified, candidate-unlock with fail-fast, login-time candidate-migration (ChangePassword). DETECTION GATE already shipped on main (commit 8c8e4d7a). DECISION: alpha, NO funds on nodes → destructive wipe+recreate is OK and wanted UNATTENDED for ALL nodes in the next update. A wallet locked with an unknown password is already inaccessible, so wiping loses nothing reachable.
EXACT NEXT STEPS — LND fix (in order)
- Finish seed/fresh recovery (REMAINING piece): in container/lnd.rs ensure_wallet_initialized, when wallet.db exists but ALL unlock candidates fail → wipe wallet.db (+ macaroons + graph/chain mainnet state, as root via host_sudo) and re-init fresh (random genseed + per-node secret) so the node self-heals unattended at boot. (Login-time candidate-migration already handles nodes whose pw matches.) Validate the wipe→reinit mechanic on the scratch LND first (see below).
- Scratch validation (was in progress, .249 unreachable from .116's subnet → use a throwaway lnd-scratch podman container on .116, regtest/neutrino, REST :18099; already proven for init/unlock/ChangePassword). Test: init(passA) → restart→LOCKED → delete wallet.db while locked → confirm /v1/state→NON_EXISTING (may need container restart) → genseed+initwallet fresh → unlock. NOTE: scratch wallet.db lives at the container's LND data dir (regtest): podman exec lnd-scratch find / -name wallet.db. CLEAN UP: podman rm -f lnd-scratch when done.
- cargo check -p archipelago (on .116 ~15-30s incremental; full test compile ~9min).
- End-to-end on .228 (reachable 192.168.1.x, SSH pw archipelago, UI pw unknown, NO funds; has a locked unknown-pw wallet = perfect auto-recreate test): build binary (ARCHIPELAGO_TARGET=archipelago@192.168.1.228 scripts/deploy-to-target.sh or per reference_deploy_to_nodes), deploy, restart, confirm wallet auto-recreates+unlocks, lncli state RPC_ACTIVE, lnd.newaddress returns an address. Run os-audit against .228 → lnd check PASS.
- Merge lnd-wallet-password-fix → main, then cut + publish v1.7.93-alpha (carries the LND fix). Ship ritual: create-release.sh 1.7.93-alpha → add CHANGELOG (≥3 layman bullets) → run sync-whats-new.py (the new What's-New gate will require it) → publish-release-assets.sh gitea-vps2 → push origin/gitea-vps2 + tags → verify live manifest==1.7.93-alpha. Heads-up: create-release leaves core/Cargo.lock version-bump uncommitted (commit it as a chore, both .91 and .92 hit this).
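The recovery rule in the first step (try every unlock candidate, and only wipe and re-init when all of them fail) can be sketched as a small decision function. This is a minimal Python illustration, not the code in container/lnd.rs: the WalletBackend interface, its method names, and FakeBackend are hypothetical stand-ins for the real LND REST calls and the host_sudo file operations.

```python
from dataclasses import dataclass, field
from typing import Iterable, List, Optional, Protocol


class WalletBackend(Protocol):
    """Hypothetical interface; the real code talks to LND over REST."""

    def wallet_exists(self) -> bool: ...
    def try_unlock(self, password: str) -> bool: ...
    def wipe(self) -> None: ...
    def init_fresh(self, password: str) -> None: ...


def ensure_wallet_initialized(
    backend: WalletBackend,
    node_secret: str,
    legacy_candidates: Iterable[str],
    allow_destructive_recovery: bool,
) -> str:
    """Return the path taken: 'created', 'unlocked', or 'recreated'."""
    if not backend.wallet_exists():
        backend.init_fresh(node_secret)
        return "created"

    # Try the per-node secret first, then any known legacy passwords.
    for candidate in [node_secret, *legacy_candidates]:
        if backend.try_unlock(candidate):
            return "unlocked"

    # Every candidate failed: the wallet is unreachable with what we know.
    if not allow_destructive_recovery:
        raise RuntimeError("wallet locked with unknown password; refusing to wipe")

    backend.wipe()
    backend.init_fresh(node_secret)
    return "recreated"


@dataclass
class FakeBackend:
    """In-memory stand-in used only to exercise the decision logic."""

    password: Optional[str] = None  # None means there is no wallet.db
    calls: List[str] = field(default_factory=list)

    def wallet_exists(self) -> bool:
        return self.password is not None

    def try_unlock(self, password: str) -> bool:
        return password == self.password

    def wipe(self) -> None:
        self.calls.append("wipe")
        self.password = None

    def init_fresh(self, password: str) -> None:
        self.calls.append("init")
        self.password = password
```

The explicit allow_destructive_recovery flag is an assumption of this sketch: it keeps the wipe path opt-in, which matches the "alpha, NO funds" decision without making deletion the default. The sketch also leaves out the login-time ChangePassword migration for wallets that unlock with a legacy candidate.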
Context: how we got here (this session, all on node .116)
- Shipped v1.7.91-alpha (bitcoinReceive TS2538 build fix) and v1.7.92-alpha (ElectrumX overlay-during-sync fix; L3 reboot os-audit gate; What's-New sync gate + 8-version backfill); both LIVE on vps2. Restored .116-local nginx /lnd-connect-info route (was dropped 2026-06-10).
- Triaged user symptoms: ElectrumX "can't connect" = electrs syncing / Bitcoin verifying (not a regression); .228 "5/14 apps after reboot" = normal ~5min staggered startup (all 14 came up).
- LND lock bug found + detection gate shipped + forward fix & migration implemented (this section).
✔ DONE PASS — v1.7.91-alpha + v1.7.92-alpha (2026-06-14)
Outcome (both releases PUBLISHED + LIVE on vps2)
- v1.7.91-alpha: bitcoinReceive.ts TS2538 build-blocker fixed; cut, published, verified live (manifest.version==1.7.91-alpha), tag v1.7.91-alpha on vps2. The fleet OTA'd to it (confirmed on .116 + .198).
- v1.7.92-alpha: cut, published, verified live (manifest.version==1.7.92-alpha), tag on vps2, main@d462e444. Carries:
  - fix(ui) ElectrumX overlay-during-sync bug: the "App not reachable / retry" overlay no longer paints over the ElectrumX sync screen (AppSessionFrame.vue gated on !electrsSync).
  - test(resilience) L3 per-boot health gate: batch_host_reboot now runs os-audit.sh after reboot (RPC/OTA/all-apps/FM-guards), not just container-set equality. os-audit validated 11/0/0 green on .116.
  - feat(release) What's New sync gate: scripts/sync-whats-new.py + whats-new-sync stage in tests/release/run.sh. Backfilled the 8 missing modal blocks (v1.7.85→.92); the gate fails any release whose CHANGELOG version isn't in the Settings modal.
- .116 node fix (not shipped, local config): restored the /lnd-connect-info nginx proxy route that a 2026-06-10 "before-116-routing" change had dropped (fell through to SPA). Backup at /etc/nginx/conf.d/rpc.tx1138.com.conf.bak-lndconnect-*. Shipped template already has the route.
- User symptoms triaged (none were .91/.92 regressions): receive-generate "unchanged" = .91's receive change was a behavior-preserving build guard; ElectrumX "can't connect" on .198 = Bitcoin node mid-"Verifying blocks…" (-28) so electrs was "waiting for Bitcoin node"; on .116 electrs was ~59% mid-sync. The overlay UX bug is fixed regardless.
Known follow-ups (not blockers)
- gitea-local mirror push fails (localhost:3000 → redirect to /login, token auth). vps2 is the OTA source and is fine; gitea-local secondary mirror is stale. Diagnose the local Gitea token.
- sync-whats-new.py only inserts missing versions; it does not rewrite a block when CHANGELOG bullets for an already-present version change (had to delete+resync the .92 block by hand to pick up its 3rd bullet). Fine for the forward case; enhance to idempotently re-render if needed.
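The idempotent re-render mentioned in the second follow-up could look roughly like the sketch below. It is not taken from scripts/sync-whats-new.py: the bracketed start/end markers, the function names, and the plain-text block layout are all hypothetical, chosen only to show an upsert that replaces an existing version block instead of skipping it.

```python
import re
from typing import List, Tuple


def _markers(version: str) -> Tuple[str, str]:
    # Hypothetical delimiters; the real modal blocks may look different.
    return f"[whats-new {version} start]", f"[whats-new {version} end]"


def render_block(version: str, bullets: List[str]) -> str:
    """Render one version block from its CHANGELOG bullets."""
    start, end = _markers(version)
    return "\n".join([start, *[f"- {b}" for b in bullets], end])


def upsert_block(document: str, version: str, bullets: List[str]) -> str:
    """Insert the block if missing, otherwise re-render it in place."""
    start, end = _markers(version)
    block = render_block(version, bullets)
    pattern = re.compile(re.escape(start) + r".*?" + re.escape(end), re.DOTALL)

    if pattern.search(document):
        # A function replacement keeps backslashes in bullets literal.
        return pattern.sub(lambda _match: block, document, count=1)

    separator = "" if not document or document.endswith("\n") else "\n"
    return f"{document}{separator}{block}\n"
```

Because an existing block is always re-rendered from the current bullets, running the sync twice with the same CHANGELOG leaves the document unchanged, and a later third bullet is picked up without a manual delete and resync.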
What happened this session
- scripts/create-release.sh 1.7.91-alpha was running; its release gate PASSED all 7 checks, backend built clean (7m22s), then it FAILED at step [4/8] frontend build with: src/utils/bitcoinReceive.ts(23,24): error TS2538: Type 'undefined' cannot be used as an index type. Cause: noUncheckedIndexedAccess; codeMatch[1] is string | undefined and was used directly to index RECEIVE_CODE_MESSAGES. FIXED → const code = message.match(/\[([A-Z_]+)\]/)?.[1] then if (code && RECEIVE_CODE_MESSAGES[code]). npx vue-tsc --noEmit is now clean (exit 0). The failed run aborted BEFORE bumping the manifest (still 1.7.90) or tagging (no v1.7.91 tag), but it HAD already partial-bumped Cargo.toml/package.json/locks to 1.7.91; those partial bumps are reverted (create-release.sh re-owns the bump); only the genuine TS fix + harness are committed.
- Built a new OS-wide health harness tests/lifecycle/os-audit.sh (non-destructive, one scorecard): Section A backend/RPC health, Section B all-apps lifecycle audit (delegates to remote-lifecycle.sh), Section C FM-guards (port-drift + secret-completeness bats, orphan-container sweep). Section A validated all-PASS on .116. Fixed a jq bug in the FM12 OTA-wedge check: // treats a legit false as empty and fell through to "unknown"; now uses has(). Section B is slow (~3 min) and opaque while running because output is captured (out=$(...)) not streamed (minor wart, TODO).
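The jq pitfall behind the FM12 fix is that the alternative operator // falls back whenever the left side is false or null, so a legitimate false looks like a missing value. The same trap exists with Python's or, which makes for a compact illustration. The function names below are hypothetical, and this is an analogy to the jq change rather than the os-audit.sh code itself.

```python
import json
from typing import Any, Dict


def read_flag_buggy(state: Dict[str, Any]) -> Any:
    # Same shape as jq's `.update_in_progress // "unknown"`:
    # a legitimate False is falsy, so it falls through to the default.
    return state.get("update_in_progress") or "unknown"


def read_flag_fixed(state: Dict[str, Any]) -> Any:
    # Same shape as jq's `has("update_in_progress")`:
    # test for presence of the key, not truthiness of the value.
    if "update_in_progress" in state:
        return state["update_in_progress"]
    return "unknown"


healthy = json.loads('{"update_in_progress": false}')
missing = json.loads("{}")

print(read_flag_buggy(healthy))  # falls through to the default
print(read_flag_fixed(healthy))  # reports the real value
print(read_flag_fixed(missing))  # default only when the key is absent
```

Checking for presence first keeps three states distinct (true, false, absent), which is what the audit needs in order to report PASS instead of WARN on a healthy node.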
EXACT NEXT STEPS — v1.7.91 (in order)
- Confirm clean tree + on main (git status; create-release.sh requires git diff --quiet HEAD). The TS fix + os-audit.sh are committed & pushed; version-bump artifacts reverted to 1.7.90.
- Re-run the release: scripts/create-release.sh 1.7.91-alpha. Backend is cached (only a .ts changed) so it's fast; the frontend build now passes. It bumps versions, builds, writes releases/manifest.json (→1.7.91-alpha), commits, and tags v1.7.91-alpha.
  - Memory guards: grep the staged frontend tarball for "1.7.91-alpha" before shipping (silent vue-tsc failures); tarball must be flat (tar -C web/dist/neode-ui .).
- Publish: scripts/publish-release-assets.sh 1.7.91-alpha gitea-vps2, then git push origin main && git push origin --tags (origin pushes to BOTH gitea-local + vps2).
- Verify manifest LIVE (this is "published"): curl -fsS http://146.59.87.168:3000/lfg2025/archy/raw/branch/main/releases/manifest.json | jq .version must show 1.7.91-alpha. Then notify the user (they asked to be told when 1.7.91 publishes).
- os-audit harness: run a full green pass on .116 (ARCHY_HOST=127.0.0.1 ARCHY_SCHEME=http ARCHY_PASSWORD='ThisIsWeb54321@' tests/lifecycle/os-audit.sh), confirm Section A FM12 now reads update_in_progress=false (PASS not WARN), review B + C findings, then wire os-audit.sh into the reboot-survival (L3) loop as the per-boot gate.
─ HISTORY — v1.7.89-alpha pass (2026-06-12), superseded ─
Last updated: 2026-06-12 ~17:45 EDT (session on node .116)
RESUME PROMPT (paste into a fresh session)
Continue the v1.7.89-alpha release pass from /home/archipelago/Projects/archy on node .116. Read docs/WEEKLY_RELEASE_TRACKER.md fully first — it has root causes, fixes already made, and exact next steps. Do NOT redo: AIUI revert (done, validated), updater fixes in core/archipelago/src/update.rs (done, uncommitted), .116 OTA unwedge (done). Resume at "EXACT NEXT STEPS" below.
EXACT NEXT STEPS (in order)
- Backend focused tests were running in background: cd core && timeout 1500 cargo test -p archipelago -- update:: lnd container::image_versions scanner (log: /tmp/claude-.../tasks/bds4jk19e.output; if lost, just rerun the command; first attempt died at 400s timeout during test compile, 1500s is the right budget). Need: all green.
- RESOLVED before session end: vitest recheck passed clean (EXIT=0, 79 files / 645 tests), even while cargo test was compiling. The earlier harness ui-unit-tests FAIL was load/flake (machine saturated by the parallel cargo test compile), not a real failure. On resume just rerun tests/release/run.sh --quick WITHOUT a parallel cargo build to confirm green; if it ever fails again, the failing test name is in the stage output (drop --silent).
- Run full harness: tests/release/run.sh (static+frontend+backend). Then commit ALL working-tree changes (one commit, e.g. "fix: harden OTA updates, AIUI desktop gap, LND no-proxy"; CHANGELOG v1.7.89 section is already curated).
- Cut release: scripts/create-release.sh 1.7.89-alpha (needs clean tree, on main, validates CHANGELOG section exists, which it does). Then tests/release/run.sh --manifest should pass, and grep the staged frontend tarball for 1.7.89-alpha (memory: silent build failures).
- Publish: scripts/publish-release-assets.sh 1.7.89-alpha gitea-vps2, then git push origin main && git push origin --tags and push gitea-local + tags too. Verify manifest live on http://146.59.87.168:3000/lfg2025/archy/raw/branch/main/releases/manifest.json
- Verify OTA on THIS node (.116): schedule is auto_apply; either wait for the scheduler or trigger via UI. Confirm /var/lib/archipelago/update_state.json current_version becomes 1.7.89-alpha, update_in_progress returns to false, web-ui + binary versions MATCH (this node currently has web-ui 1.7.84 / binary 1.7.85 mismatch; the OTA heals it), and journalctl shows "Post-OTA verification succeeded" (the new probe falls back to http://127.0.0.1/ which is what .116 serves).
- Update this tracker + docs/PROGRESS_MEMORY.md, mark tasks done.

Purpose: live tracker for this pass. Test everything shipped this week (v1.7.83→v1.7.89), build the release test harness, fix OTA updates on .116, make updates bulletproof, cut v1.7.89-alpha. If the session is cut off, resume from here.
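The "Verify OTA on THIS node" step in the list above combines three conditions, which can be sketched as a single check that returns a list of problems. The field names current_version and update_in_progress come from update_state.json as described in that step; the function name, and the idea that the web-ui and binary versions are passed in as strings, are assumptions of this sketch.

```python
from typing import Any, Dict, List


def ota_problems(
    state: Dict[str, Any],
    web_ui_version: str,
    binary_version: str,
    expected: str,
) -> List[str]:
    """Return human-readable problems; an empty list means the OTA looks healthy."""
    problems: List[str] = []

    current = state.get("current_version")
    if current != expected:
        problems.append(f"current_version is {current!r}, expected {expected!r}")

    # Require an explicit False: a missing or null flag is not proof of health.
    if state.get("update_in_progress") is not False:
        problems.append("update_in_progress is not false (wedged or unknown)")

    if web_ui_version != binary_version:
        problems.append(
            f"web-ui {web_ui_version} does not match binary {binary_version}"
        )

    return problems
```

Fed the pre-fix state of .116 (current 1.7.85-alpha, flag stuck true, web-ui 1.7.84 against binary 1.7.85), this would report all three problems at once rather than stopping at the first.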
Task status
| # | Task | Status |
|---|---|---|
| 1 | AIUI revert (mobile back/close gone, desktop gap fixed) | DONE — validated |
| 2 | Dev server on :8100 with embedded AIUI | DONE — see below |
| 3 | Inventory this week's release-log items | DONE — see checklist |
| 4 | Test harness covering this week + seed of system-wide harness | IN PROGRESS |
| 5 | Fix OTA updates on .116 + bulletproof updates | IN PROGRESS — diagnosis below |
| 6 | Cut v1.7.89-alpha release | PENDING (gates: 4, 5) |
State of the working tree
- HEAD = 495b9078 (v1.7.89 changelog + AIUI mobile restore committed).
- Uncommitted, intended for v1.7.89-alpha:
  - neode-ui/src/views/Dashboard.vue: chat route back to plain h-full (desktop bottom-gap fix). Validated.
  - core/.../rpc/lnd/* + container/lnd.rs: LND REST no-proxy + wallet readiness/unlock fixes.
  - Version bumps to 1.7.89-alpha (Cargo.toml, package.json, locks), CHANGELOG entry.
  - neode-ui/vite.config.ts: added /aiui dev proxy (keep; dev-only convenience).
AIUI validation (task 1) — DONE
- HEAD already removed the mobile back button and restored hideClose=true (495b9078).
- Working-tree Dashboard.vue removes dashboard-scroll-panel mobile-scroll-pad from the chat route (that padding caused the desktop bottom gap); mesh keeps its styling.
- Chat CSS verified byte-identical to last-good 34c4e87d (May 20).
- Playwright check (desktop 1440x900, mobile 390x844): chat fills full viewport, no bottom gap, no mobile back/close.
- npm run type-check + focused route tests + full vitest (645/645) pass.
Dev server on :8100 (task 2) — DONE
- Running: BACKEND_URL=http://127.0.0.1:5678 VITE_AIUI_URL=/aiui/ npx vite --host 0.0.0.0 --port 8100 from neode-ui/ (real local backend on 5678).
- AIUI now embeds in /dashboard/chat via new vite proxy /aiui → http://127.0.0.1:80 (the node's deployed AIUI), same-origin like production.
- Secondary throwaway instance for automated checks: :8101 against mock backend (node mock-backend.json 5959, password password123).
This week's shipped items (v1.7.83 → v1.7.89) — test checklist
Frontend (vitest/type-check/build cover most; full suite 645/645 green 2026-06-12)
- AIUI fast launch, no availability probe (v1.7.88) — covered by visual check + Chat.vue tests
- AIUI mobile layout restore (v1.7.89) — playwright visual check
- App-session launch metadata from manifests / typed interfaces (v1.7.83) — appSessionConfig tests
- OnlyOffice + Saleor removal (v1.7.83) — catalog tests
- Bitcoin receive UI flow end-to-end (v1.7.87/88) — needs live LND node check
- Fleet tab keeps node list/alerts during refresh, names not hashes (v1.7.85/86) — store tests?
- Credential interstitial full-screen overlay (v1.7.87) — visual
- Mobile federation/system-update buttons full width (v1.7.86) — visual
Backend (cargo)
- LND REST no-proxy client + GET newaddress p2wkh (v1.7.88/89) — unit tests + live check
- LND wallet readiness/unlock after restart (v1.7.89) — unit + live
- Bitcoin trusted-node relay rpcauth/txrelay (v1.7.84) — unit tests exist? check
- Container scanner RAII in-flight guard (v1.7.84) — cargo test
- ElectrumX health-check startup window + cache tuning (v1.7.85/86)
- Portainer pin 2.19.4 / bitcoin-ui image pin (v1.7.84/85) — image-versions tests
- Fleet telemetry name/hostname/URL fields (v1.7.85)
- Federation no self-import (v1.7.85)
- Kiosk safe-area + self-update refreshes kiosk files (v1.7.84)
- Wi-Fi scan error/retry/escaped SSID/open networks (v1.7.84)
OTA / updates (task 5)
- .116 stuck: current 1.7.85-alpha, update_in_progress: true since 1.7.88 attempt; diagnose+fix
- Updater hardening: stuck-in-progress recovery, resumable/atomic apply, verify post-restart version
OTA diagnosis on .116 — ROOT CAUSES FOUND + FIXED (code staged for v1.7.89)
Four bugs, all reproduced from the journal (Jun 12 03:45–04:33):
1. Post-OTA probe only tries https://127.0.0.1/; .116's nginx binds only :80 (443 is tailscale's) → connection refused × 18 → a GOOD 1.7.85 update was "rolled back". FIX: probe falls back to http://127.0.0.1/ on connect error (update.rs probe_frontend_once).
2. That rollback's binary restore did host_sudo cp onto the RUNNING binary → ETXTBSY exit 1 → binary stayed 1.7.85 while web-ui rolled back to 1.7.84 (mismatch confirmed live). FIX: rollback now cp→tmp→atomic mv, same pattern as apply (update.rs rollback_update).
3. The rollback chown'd update-backup/archipelago root:root IN PLACE → next apply's fs::copy (as service user) hit EACCES → "Failed to backup current binary" × 3 → 1.7.86/88 never applied. FIX: apply unlinks stale backup first; rollback chowns only its temp copy.
4. Failed apply left update_in_progress: true wedged (staging still populated so the stale-flag guard never fires). Unwedged operationally; fixed structurally by 1-3.
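The cp→tmp→atomic mv pattern from the second fix can be shown in a few lines. This is a Python sketch of the technique only, not the Rust in update.rs: atomic_install and the .install- temp prefix are hypothetical names. The point is that the new file is written next to the destination and then renamed over it, so the running executable is never opened for writing (which is what produces ETXTBSY on Linux).

```python
import os
import shutil
import tempfile


def atomic_install(source: str, destination: str, mode: int = 0o755) -> None:
    """Replace `destination` with a copy of `source` without writing in place."""
    dest_dir = os.path.dirname(os.path.abspath(destination))

    # The temp file must live on the same filesystem as the destination,
    # otherwise the final rename cannot be atomic.
    fd, tmp_path = tempfile.mkstemp(prefix=".install-", dir=dest_dir)
    os.close(fd)
    try:
        shutil.copyfile(source, tmp_path)
        os.chmod(tmp_path, mode)
        # rename(2) swaps the directory entry; a process still executing the
        # old file keeps its old inode and is not disturbed.
        os.replace(tmp_path, destination)
    except BaseException:
        # Clean up the partial temp copy; the destination is left untouched.
        if os.path.exists(tmp_path):
            os.unlink(tmp_path)
        raise
```

Setting ownership and permissions on the temp copy before the rename also mirrors the third fix: nothing is changed in place on a path that a later apply, running as a different user, still needs to manage.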
Operational cleanup DONE on .116 (2026-06-12 17:15): removed root-owned
update-backup/archipelago, stale update-staging/ (1.7.86), and the stale
update-pending-verify.json. Next state load clears update_in_progress.
NOTE: live web-ui is 1.7.84 / binary 1.7.85 (mismatch from bug 2). Not hand-patched —
the v1.7.89 OTA will resync both. Good 1.7.85 frontend is quarantined at
/opt/archipelago/web-ui.failed.1781250438247.
Verification plan: after v1.7.89 release, watch .116 auto-apply (schedule auto_apply),
confirm update_state.json.current_version == 1.7.89-alpha and web-ui version matches.
Test harness (task 4) — CREATED at tests/release/run.sh
- Stages: static (git diff --check, cargo fmt, catalog drift, optional --manifest), frontend (type-check, full vitest), optional --with-build (build + grep dist for version), backend (cargo check + focused cargo test: update:: lnd container::image_versions scanner, all wrapped in timeout), optional --live URL smoke (/, /aiui/, /rpc/v1).
- Results so far (2026-06-12): type-check PASS, full vitest 645/645 PASS, cargo fmt PASS, cargo check PASS, catalog drift PASS (3 pre-existing MISSING_CATALOG warnings, exit 0, identical on HEAD). Focused backend cargo tests running (first run hit the known slow test-compile on .116 at 400s timeout; rerunning with 1500s).
- AIUI embed verified end-to-end via playwright on :8101 (mock backend): iframe loads, ready handshake clears the loading overlay, hideClose honored.
- Release flow confirmed: commit all → scripts/create-release.sh 1.7.89-alpha (validates curated CHANGELOG section, builds, manifests, commits, tags) → scripts/publish-release-assets.sh 1.7.89-alpha gitea-vps2 → push origin main + tags. Tarball layout/perms safety is already inside create-release-manifest.sh.
- CHANGELOG v1.7.89 section rewritten layman-readable (updater fixes added).
Release gates for v1.7.89-alpha (task 6)
- All harness stages green locally.
- OTA fix for stuck update_in_progress included + .116 updates successfully to the new release.
- Frontend build: grep packaged tarball for "1.7.89-alpha" before shipping (memory: silent vue-tsc failures).
- Flat tarball layout (tar -C web/dist/neode-ui .).
- Commit, tag v1.7.89-alpha, push origin + gitea-local + tags, publish release assets, verify manifest + node OTA picks it up.
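The two tarball gates above (version string present, flat layout) can be checked together before publishing. The sketch below is a hypothetical helper, not part of create-release-manifest.sh, and it assumes that "flat" means index.html sits at the archive root, which is what tar -C web/dist/neode-ui . would normally produce for a built frontend.

```python
import io
import tarfile
from typing import List


def tarball_problems(data: bytes, version: str) -> List[str]:
    """Return problems found in a frontend tarball; empty list means OK."""
    needle = version.encode()
    has_root_index = False
    found_version = False

    with tarfile.open(fileobj=io.BytesIO(data), mode="r:*") as tar:
        for member in tar.getmembers():
            name = member.name
            if name.startswith("./"):
                name = name[2:]  # `tar -C dir .` prefixes entries with "./"
            if name == "index.html":
                has_root_index = True
            if member.isfile() and not found_version:
                handle = tar.extractfile(member)
                if handle is not None and needle in handle.read():
                    found_version = True

    problems: List[str] = []
    if not has_root_index:
        problems.append("index.html is not at the archive root (layout not flat)")
    if not found_version:
        problems.append(f"version string {version} not found in any file")
    return problems
```

Searching file contents for the version string is what catches a silent vue-tsc failure: a stale dist still produces a well-formed tarball, but it carries the previous version.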