21 Commits

Author SHA1 Message Date
archipelago
c375ecc441 fix: fresh-ISO feedback bug-bash — onboarding, status truthfulness, recovery, kiosk, logs
Fixes from real fresh-install feedback (Framework node .81) + its log bundle:

Backend:
- websocket: subscribe before initial snapshot — broadcasts in the gap were
  silently lost, stranding clients on stale state until a hard refresh
  (the "everything needs ctrl-r" bug: My Apps stuck Loading, App Store
  stuck Checking, containers-scanned never arriving)
- crash recovery: check the crash marker BEFORE writing our own PID —
  recovery had never run on any node (always saw its own PID and skipped);
  PID-reuse guard via /proc cmdline
- boot status: pending-boot-starts registry (recovery, stack recovery,
  reconciler, adoption) — scanner overlays queued-but-down apps as
  Restarting instead of Stopped after a reboot; scanner-authored
  Restarting resolves immediately on a settled scan (no transitional wedge)
- install deps: bounded wait (36x5s) when a dependency is installed but
  still starting ("Waiting for Bitcoin to start…") instead of instant
  rejection; dependency-gate rejections remove the optimistic entry (no
  phantom Stopped tile) and surface as a notification
- seed backup: auth.setup persists the onboarding mnemonic as the
  encrypted seed backup (reveal previously failed on EVERY node — nothing
  ever wrote master_seed.enc); seed.restore stashes too; error sanitizer
  lets seed/2FA errors through instead of "Check server logs"
- lnd: bitcoind.rpchost resolved from the running Bitcoin variant
  (hardcoded bitcoin-knots broke Core nodes); manifest uses derived_env
- bitcoin status: clean human message for connection-reset/startup; raw
  URLs + os-error chains no longer reach the app card
- fedimint-clientd: chown /var/lib/archipelago/fmcd to 1000:1000 (root-
  created dir crash-looped the rootless container, EACCES) — first-boot
  script + pre-start self-heal
- log volume (>1GB/day on a day-old node): journald caps drop-in (ISO +
  bootstrap self-heal), bitcoind -printtoconsole=0 everywhere (90% of the
  journal was IBD UpdateTip spam), tracing default debug→info

Frontend:
- Login: Enter advances to confirm field then submits; submit always
  clickable with inline errors (was silently disabled on mismatch);
  Restart Onboarding needs a confirming second click (the mismatch →
  "onboarding restarted" trap)
- sync store: 30s state reconciliation + refetch on re-entrant connect;
  20s containers-scanned escape hatch so Checking can never show forever;
  fresh empty node reaches the real "no apps yet" state
- intro video: CRF20 re-encode (SSIM 0.988) + faststart — moov was at EOF
  so playback needed the full 15MB first (the intro lag)
- backgrounds: 10 heaviest JPEGs → WebP q90 (9.4MB→6.6MB); 7 stayed JPEG
  (WebP larger on noisy sources)
- Web5ConnectedNodes: drop unused template ref that failed vue-tsc -b

ISO/kiosk:
- nginx: /assets/ 404s no longer cached immutable for a year; HTTPS block
  gained the missing /assets/ location (served index.html as images)
- kiosk: launcher/service spliced from configs/ at ISO build (stale
  heredoc force-disabled GPU); MemoryHigh/Max 1200/1500→2200/2800M (kiosk
  rode the reclaim throttle = the lag); firmware-intel-graphics +
  firmware-amd-graphics (trixie split DMC blobs out of misc-nonfree)

Verified: cargo test 898/898 green, npm run build green with dist
contents confirmed (webp refs, lnd.png, faststart video, new strings).
Handover for ISO build + deploy: docs/HANDOVER-2026-07-02-iso-feedback.md

Co-Authored-By: Claude Fable 5 <noreply@anthropic.com>
2026-07-02 08:00:39 -04:00
archipelago
d6f108d818 chore: snapshot release workspace 2026-06-12 03:00:15 -04:00
archipelago
84b283f5b6 deploy: exclude archived image build outputs 2026-06-11 02:01:55 -04:00
archipelago
8f2e03df2a deploy: exclude codex scratch artifacts 2026-06-11 01:46:38 -04:00
archipelago
de60f7e21e app-platform: remove revoked onlyoffice app 2026-06-11 01:03:45 -04:00
archipelago
c393b96da3 backend: harden rootless app lifecycle orchestration 2026-06-11 00:24:32 -04:00
archipelago
d736364ad7 fix(apps): stabilize btcpay and public proxy launch flows 2026-05-19 09:26:43 -04:00
archipelago
7804223152 chore: release v1.7.57-alpha 2026-05-17 17:30:04 -04:00
Dorian
835c525218 chore(release): stage v1.7.55-alpha 2026-05-13 15:09:22 -04:00
archipelago
c0751e2551 chore(release): stage v1.7.54-alpha 2026-05-06 09:23:57 -04:00
Dorian
a802b2e478 fix: correct health check endpoints for fedimint, nextcloud, filebrowser
- Fedimint: check port 8175 (UI) not 8174 (websocket API)
- Nextcloud: check / not /status.php (returns 302 during setup)
- FileBrowser: check / not /health (endpoint doesn't exist)

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-03-23 00:47:49 +00:00
Dorian
de07b48876 fix: health check escaping for SSH heredoc context
- Remove || exit 1 from health-cmd (redundant, breaks SSH heredoc)
- Use --health-cmd 'cmd' format (space, not equals) for proper quoting
- Simplify bitcoin health check to bitcoin-cli getnetworkinfo (no creds needed)
- Fix MariaDB health check nested quote issue

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-03-22 23:45:32 +00:00
Dorian
47c42d4d1b fix: LND config escaping in SSH heredoc, Tailscale fallback for build source
- Fix shell escaping in LND config sync block (single-quoted SSH context
  doesn't need backslash-escaped dollars)
- deploy-tailscale.sh BUILD_SOURCE auto-detects Tailscale IP when LAN
  unreachable (fixes "No binary on .228" error)

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-03-22 17:01:02 +00:00
Dorian
c2ac572d9a fix: deploy credential sync, health checks, rootless port binding
- LND config always synced with secrets/bitcoin-rpc-password before
  starting (both deploy scripts) — fixes 401 auth errors on all nodes
- Replace eval "$DB_PASSWORDS" with safe individual SSH reads in
  deploy-tailscale.sh (eliminates command injection risk)
- Add MariaDB password sync step after container start (ALTER USER)
- Add --health-cmd to all 25 containers in deploy-tailscale.sh
- FileBrowser uses --user 0:0 for rootless port 80 binding (both scripts)
- Fedimint env var fixed: FM_REL_NOTES_ACK=0_4_xyz

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-03-22 14:16:11 +00:00
Dorian
e4e0ef4f11 bug fixing and deploy and build diagnostics 2026-03-22 03:30:21 +00:00
Dorian
23d67c0672 refactor: create shared script library, fix ISO image pinning, document planned splits
- S21: Create scripts/lib/common.sh with shared logging, SSH, health check, mem_limit functions
- S18: Source common.sh from deploy-to-target.sh, deploy-tailscale.sh, first-boot-containers.sh
- S16: Fix 2 hardcoded images in ISO build, add missing image variables
- S19: Document planned 7-module split of build-auto-installer-iso.sh
- S20: Document planned 8-module split of first-boot-containers.sh

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-03-21 03:06:29 +00:00
Dorian
d8b753e1e4 fix: deploy error visibility, trap cleanup, variable quoting, frontend resilience
- S10: Add warnings to silent health check failures in deploy scripts
- S11: Add trap cleanup for temp dirs in deploy and tailscale scripts
- S12: Quote 20+ critical unquoted variables across deploy scripts
- S13: Extract hardcoded IPs to deploy-config-defaults.sh
- S15: Add --memory=256m to UI container runs
- F16: Remove in-memory JWT, use cookie-only auth in filebrowser client
- F17: Add meta tag fallback for CSRF token in RPC client
- F19: Track and clear setTimeout in AppSession on unmount

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-03-21 02:06:08 +00:00
Dorian
0cca539a0f fix: WebSocket reconnect state refresh, listener leak fixes, pin container images
- F4: Fetch fresh server state after WebSocket reconnect
- F5: Guard message polling timer with auth check, stop on logout
- F6: Remove NIP-07 listener in appLauncher close()
- F7: Initialize audio player once to prevent listener stacking
- S3: Pin all container images to specific versions, create image-versions.sh

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-03-21 01:32:28 +00:00
Dorian
1d98de24d0 fix: WebSocket race conditions, Vue error handler, remove sudo podman, add container health checks
- F1: Guard connectWebSocket against concurrent calls with isWsConnecting flag
- F2: Serialize mesh send operations with sendQueue to prevent fetchMessages races
- F3: Add global Vue error handler with toast notification
- S1: Replace sudo podman with podman across all scripts (rootless Podman)
- S2: Add health-cmd to all 40 container run commands in first-boot-containers.sh

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-03-21 01:11:05 +00:00
Dorian
623c0fa954 feat: Discover view, Fleet dashboard, MeshMap, type fixes
- New Discover.vue (app store redesign)
- Fleet.vue dashboard for .228
- MeshMap.vue component
- Fixed Discover.vue type errors (unused var, type predicate)
- Various UI updates (Apps, Dashboard, Marketplace, Mesh, Web5)

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-03-19 16:12:01 +00:00
Dorian
0d3ff0d3a4 fix: resolve did:dht compilation errors
- Simplify DHT encoding: use JSON instead of DNS packets (drop simple-dns)
- Fix mainline crate API: SigningKey takes 32 bytes, get_mutable returns Result
- Add missing dht_did field to IdentityRecord constructor
- Store DID Document as JSON in DHT (DNS encoding deferred)

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-03-14 04:14:04 +00:00