Fixes from real fresh-install feedback (Framework node .81) + its log bundle:
Backend:
- websocket: subscribe before initial snapshot — broadcasts in the gap were
silently lost, stranding clients on stale state until a hard refresh
(the "everything needs ctrl-r" bug: My Apps stuck Loading, App Store
stuck Checking, containers-scanned never arriving)
- crash recovery: check the crash marker BEFORE writing our own PID —
recovery had never run on any node (always saw its own PID and skipped);
PID-reuse guard via /proc cmdline
- boot status: pending-boot-starts registry (recovery, stack recovery,
reconciler, adoption) — scanner overlays queued-but-down apps as
Restarting instead of Stopped after a reboot; scanner-authored
Restarting resolves immediately on a settled scan (no transitional wedge)
- install deps: bounded wait (36x5s) when a dependency is installed but
still starting ("Waiting for Bitcoin to start…") instead of instant
rejection; dependency-gate rejections remove the optimistic entry (no
phantom Stopped tile) and surface as a notification
- seed backup: auth.setup persists the onboarding mnemonic as the
encrypted seed backup (reveal previously failed on EVERY node — nothing
ever wrote master_seed.enc); seed.restore stashes too; error sanitizer
lets seed/2FA errors through instead of "Check server logs"
- lnd: bitcoind.rpchost resolved from the running Bitcoin variant
(hardcoded bitcoin-knots broke Core nodes); manifest uses derived_env
- bitcoin status: clean human message for connection-reset/startup; raw
URLs + os-error chains no longer reach the app card
- fedimint-clientd: chown /var/lib/archipelago/fmcd to 1000:1000 (root-
created dir crash-looped the rootless container, EACCES) — first-boot
script + pre-start self-heal
- log volume (>1GB/day on a day-old node): journald caps drop-in (ISO +
bootstrap self-heal), bitcoind -printtoconsole=0 everywhere (90% of the
journal was IBD UpdateTip spam), tracing default debug→info
Frontend:
- Login: Enter advances to confirm field then submits; submit always
clickable with inline errors (was silently disabled on mismatch);
Restart Onboarding needs a confirming second click (the mismatch →
"onboarding restarted" trap)
- sync store: 30s state reconciliation + refetch on re-entrant connect;
20s containers-scanned escape hatch so Checking can never show forever;
fresh empty node reaches the real "no apps yet" state
- intro video: CRF20 re-encode (SSIM 0.988) + faststart — moov was at EOF
so playback needed the full 15MB first (the intro lag)
- backgrounds: 10 heaviest JPEGs → WebP q90 (9.4MB→6.6MB); 7 stayed JPEG
(WebP larger on noisy sources)
- Web5ConnectedNodes: drop unused template ref that failed vue-tsc -b
ISO/kiosk:
- nginx: /assets/ 404s no longer cached immutable for a year; HTTPS block
gained the missing /assets/ location (served index.html as images)
- kiosk: launcher/service spliced from configs/ at ISO build (stale
heredoc force-disabled GPU); MemoryHigh/Max 1200/1500→2200/2800M (kiosk
rode the reclaim throttle = the lag); firmware-intel-graphics +
firmware-amd-graphics (trixie split DMC blobs out of misc-nonfree)
Verified: cargo test 898/898 green, npm run build green with dist
contents confirmed (webp refs, lnd.png, faststart video, new strings).
Handover for ISO build + deploy: docs/HANDOVER-2026-07-02-iso-feedback.md
Co-Authored-By: Claude Fable 5 <noreply@anthropic.com>
Context: when package update fails after remove-old-container but
before reconcile-recreate, the rollback path in update.rs tries to
restart the old container by name. If the container is already gone
(removed in step 3 of the update), rollback fails silently and the
node is left with no live container for that app but on-disk data
still intact. This is exactly the state .228 ended up in after the
reconcile-script-missing bug killed bitcoin-knots and lnd.
Reconcile was designed to only repair existing containers for
optional apps (SPEC_OPTIONAL=true): it skips "not installed" entries
on the assumption that the install RPC creates them. That safety
check is correct for normal operation but blocks recovery when an
optional-marked container has been destroyed by a failed update.
Fix: add --create-missing flag that overrides the SPEC_OPTIONAL skip.
When set, reconcile treats absent containers exactly the same as
broken containers — it creates them from the canonical spec using
the existing on-disk data directory. Narrow-scope override; the
default behaviour is unchanged.
Updated --help to document all four flags.
Verified on .228: after the failed bitcoin-core update took out both
bitcoin-knots and lnd, running reconcile --container=bitcoin-knots
--create-missing --force (as the archipelago user, not root —
podman is rootless) brought bitcoin-knots back using the pruned
chainstate at /var/lib/archipelago/bitcoin. Repeated for lnd. All
containers now running; electrumx reconnecting; UIs recovering.
Does NOT fix the underlying update-flow rollback hole (rollback
should be able to re-create a container from spec, not just restart
by name). That is a separate commit — this flag is the manual
recovery tool plus the primitive the improved rollback will call.
- Chromium kiosk: add --disable-gpu-compositing, --disable-gpu-rasterization,
--disable-software-rasterizer, --renderer-process-limit=1
drops GPU process from 64% to 12% CPU
- Container healthchecks: 30s to 120s interval in first-boot and reconcile
- AppCard: min-height on description so cards dont shift
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
- Dashboard.vue: move DashboardMobileNav outside <main> so position:fixed
isn't broken by will-change:transform on the perspective container
- Add container-specs.sh and reconcile-containers.sh utility scripts
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>