91 Commits

Author SHA1 Message Date
Dorian
1e283daf13 fix: overhaul container lifecycle — recovery, health, uninstall, UI state
Container recovery:
- Health monitor: MAX_RESTART_ATTEMPTS 3→10, interval 60s→120s
- Dependency-aware restarts: won't restart services before their deps
- Reset dependent counters when a dependency recovers
- Handle "created" state containers (were invisible to health monitor)
- Added IndeedHub, mempool-api, mysql to tier system
- Crash recovery: podman start timeout 30s→120s with retry
- Podman client: socket timeout 5s→30s, added restart policy

UI state representation:
- Exit code 0 shows "stopped" (gray), not "crashed" (red)
- Exit code 137 shows "killed (OOM)"
- Non-zero exit shows "crashed" (red)
- Added exit_code field to PackageDataEntry

Install/uninstall fixes:
- Install returns error when container doesn't start (was silent success)
- Post-install hooks awaited instead of fire-and-forget tokio::spawn
- Uninstall: graceful rm before force, volume prune, network cleanup
- Uninstall returns error on partial failure (was 200 OK)

Config consistency:
- DB passwords read from /var/lib/archipelago/secrets/ (was hardcoded)
- Bitcoin: added ZMQ ports 28332/28333 for LND block notifications
- IndeedHub port 7777→8190 (was conflicting with strfry)
- Marketplace versions: LND 0.17.4→0.18.4, Mempool 2.5.0→3.0.0

Performance:
- Metrics collector interval 60s→300s (was duplicating health monitor)
- Podman client: proper error propagation instead of unwrap_or_default

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-03-31 07:03:57 +01:00
Dorian
f6c41f9052 fix: add python3 to ISO packages, set Claude API key in proxy service
AIUI proxy requires python3 which was missing from rootfs packages.
Also sets the beta API key in the claude-api-proxy systemd service
so AIUI works out of the box on fresh installs.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-03-31 05:01:13 +01:00
Dorian
72d8101045 fix: add python3 to ISO packages for Claude API proxy
The claude-api-proxy.py requires python3 which was missing from the
rootfs package list, breaking AIUI on fresh installs.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-03-31 04:52:30 +01:00
Dorian
adeb57edbf fix: force UTF-8 console with Terminus font for ASCII logo
The Archipelago ASCII logo uses Unicode block characters (▄▀█) which
render as garbled symbols when the console font doesn't support them.
Force Uni2 codeset + Terminus font in both the live ISO and installed
system's console-setup config.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-03-30 23:47:15 +01:00
Dorian
afda9897f1 fix: embed netavark/aardvark-dns in ISO at build time
Previous fix tried to copy from the live system at install time, but
the live ISO doesn't have netavark. Now: binaries are embedded in the
ISO during build (from the build host's /usr/lib/podman/), then copied
to the target at install time from the ISO filesystem.

This fixes container DNS on fresh installs — LND can now resolve
bitcoin-knots, mempool-api can resolve electrumx, etc.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-03-30 20:52:01 +01:00
Dorian
b515d3883f fix: install netavark + aardvark-dns for container DNS resolution
Fresh ISO installs use podman with CNI backend which lacks DNS.
Containers on archy-net can't resolve each other by name, causing:
- LND: "lookup bitcoin-knots: no such host"
- Any inter-container communication to fail

Fix: copy netavark + aardvark-dns from build host into ISO rootfs
and configure podman to use netavark backend. This enables automatic
DNS resolution on custom bridge networks (archy-net).

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-03-30 18:57:17 +01:00
Dorian
44bffee473 fix: container installs, Tor, kiosk, GRUB, LUKS display, error messages
Critical:
- fix: container installs fail with "statfs: no such file or directory"
  Root cause: NoNewPrivileges=yes in systemd blocks sudo inside backend.
  Fix: use std::fs::create_dir_all + podman unshare chown (no sudo needed)
- fix: Tor services.json never written — \$ARCHY_TOR_DIR escaping bug
- fix: kiosk white screen — increase health wait to 60s, add --disable-gpu

Improvements:
- feat: LUKS encryption badge in Server disk stats (backend detects dm-crypt)
- fix: GRUB theme text scaling on 4:3 monitors — explicit fonts, wider menu
- fix: suppress default Debian MOTD (custom profile.d welcome is enough)
- fix: install error messages now show "Failed to pull/start" instead of
  generic "Operation failed" (middleware.rs allowlist expanded)
- fix: container-tests CI — source cargo env before running tests
- docs: interactive container architecture diagram (HTML)

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-03-30 16:35:06 +01:00
Dorian
fdd69ce1b5 fix: auth, container resilience, ISO build, gamepad polish
- fix: login disconnect — verify session before WebSocket connect
- fix: 403 on app install — distinguish CSRF vs RBAC errors, only retry CSRF
- fix: health monitor now watches ALL containers (removed skip list for
  backend services like nbxplorer, databases, UI containers)
- fix: server.get-state added to CSRF-exempt list (read-only)
- fix: ISO build includes container-specs.sh and lib/common.sh in rootfs
  so reconcile actually works on fresh installs
- fix: gamepad nav — improved Server tab zone nav, focus styles, autofocus
- chore: move L484 web-only apps to Services tab
- chore: install store for cross-view install tracking

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-03-30 13:35:02 +01:00
Dorian
0f646a99d3 fix: CSRF 403 blocking all operations + reboot after install
CSRF fix (THE BLOCKER):
- After remember-me session restore, the browser has a stale CSRF
  cookie but a new session token. ALL subsequent RPC calls return 403.
- Fix: exempt read-only polling methods (node-messages-received,
  server.echo, system.stats, tor.status, etc.) from CSRF validation.
  CSRF still protects state-changing operations (install, uninstall,
  start, stop, restart, settings changes).

Reboot fix:
- The separate /tmp/archipelago-reboot.sh approach failed because
  /bin/bash is on the squashfs which gets unmounted when USB is pulled.
- Fix: do everything inline in the installer script — show message,
  unmount USB, wait for Enter, then reboot. Use sysrq-trigger first
  (kernel-level, doesn't need userspace binaries).

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-03-29 22:42:09 +01:00
Dorian
ca646afd37 fix: version display, FileBrowser auto-login, nostr relay, UID mappings
Version per build:
- Health endpoint returns "1.2.0-alpha-{git_hash}" using GIT_HASH env
- CI passes git hash to cargo build

FileBrowser auto-login:
- filebrowser-client.ts: include CSRF token + credentials:include
- First-boot: generate random password, store at secrets/filebrowser/
- Set FileBrowser admin password to match after container creation

Nostr relay:
- Use docker.io/scsibug/nostr-rs-relay:0.9.0 (not in our registry)

UID mappings:
- Added electrumx (UID 1000), mysql-mempool, archy-btcpay-db, nextcloud-db

522 tests pass, Rust compiles clean.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-03-29 21:56:38 +01:00
Dorian
610e51500b fix: container orchestration overhaul — names, errors, Tor, restart
Container name resolution:
- New all_container_names() — single source of truth for every app's
  container name variants (bitcoin-knots/bitcoin/bitcoin-core, etc.)
- Covers all historical naming patterns and multi-container stacks

Start/Stop/Restart:
- No more silent failures (let _ = podman...). Every operation logs
  the command, checks exit status, and returns real errors to the UI.
- Restart uses stop+start fallback when podman restart fails
  (handles rootless podman loopback adapter errors)
- "No containers found" error when app doesn't exist

Tor helper:
- Install archipelago-tor-helper.path + .service in rootfs
- Enable the path unit so backend can manage Tor as non-root
- Copy tor-helper.sh to /opt/archipelago/scripts/

Verified: container with proper caps can stop/start/restart cleanly.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-03-29 19:26:21 +01:00
Dorian
fbabbd0722 fix: auto-build UI containers for Bitcoin, LND, Electrumx
Critical: headless services (Bitcoin, LND, Electrumx) need companion
UI containers that serve web dashboards. These were only built for
Bitcoin, and only on bundled ISO builds.

Changes:
- install.rs: auto-build UI containers for LND (port 8081) and
  Electrumx (port 50002) in addition to Bitcoin (port 8334)
- build-auto-installer-iso.sh: always bundle docker UI source files
  (was skipping for unbundled builds — they're tiny HTML, not images)
- Dockerfiles: fix nginx base image tag 1.29.6→1.27.4 (matches registry)

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-03-29 17:48:13 +01:00
Dorian
2049707986 fix: redirect container storage to LUKS encrypted partition
Container image pulls were filling the 29GB root partition (100% full
after 6 images). Now podman graphroot points to /var/lib/archipelago/
containers/storage on the 400GB+ LUKS encrypted data partition.

Added storage.conf with graphroot redirect + symlink for compat.
Also create containers/storage dir on encrypted partition during install.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-03-29 15:43:57 +01:00
Dorian
c0347c7592 fix: align image-versions.sh with registry, PATH for reboot
- image-versions.sh: fix 15+ tag mismatches against actual registry
  (bitcoin-knots:28.1→latest, lnd:v0.18.5→v0.18.4, grafana:11.4→10.2,
  vaultwarden:1.32.5→1.30.0-alpine, nextcloud:29→28, etc.)
- .bashrc: add /sbin:/usr/sbin to PATH so reboot/shutdown work
- Tailscale: add Arch Atob node (100.113.33.31)

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-03-29 14:25:13 +01:00
Dorian
00428071f1 fix: UEFI ESP partition type, WebSocket cookie, password UX
UEFI boot:
- xorriso now uses -append_partition with ESP type GUID
  (C12A7328-F81F-11D2-BA4B-00A0C93EC93B) instead of -isohybrid-gpt-basdat
  which only creates "basic data" partitions. Strict UEFI firmware
  requires the correct ESP type to find BOOTX64.EFI.
- Uses Arch Linux ISO approach: -append_partition + appended_part_as_gpt

WebSocket/login from LAN browser:
- HTTPS nginx /ws block was missing proxy_set_header Cookie $http_cookie
  Session cookie wasn't forwarded → backend returned 401 → WS failed

Password UX:
- Renamed "Change Password" → "Set Password" with description explaining
  default password is password123

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-03-29 12:44:13 +01:00
Dorian
f9f54254c1 fix: suppress verbose command output in installer TUI
All mkfs, cryptsetup, grub-install, tar, update-initramfs output now
goes to log file only via run() wrapper. Console shows only clean TUI
status messages (step/ok/warn/fail/spinner).

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-03-29 12:06:19 +01:00
Dorian
e9903e7b4b fix: UEFI boot fallback — search by file when label fails
The embedded GRUB EFI config only searched by volume label ARCHIPELAGO.
Some UEFI firmware presents USB devices differently, causing the search
to fail and GRUB to stall.

Added fallbacks:
1. search --file /archipelago/auto-install.sh (known ISO file)
2. Fall back to $cmdpath (EFI partition itself)
3. Use configfile before normal for explicit config loading
4. Added search_fs_file module to grub-mkstandalone

Also added same fallback to the main ISO grub.cfg.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-03-28 23:58:42 +00:00
Dorian
6e356412b8 fix: batch beta fixes — 13 issues from 2026-03-28 testing
Frontend (neode-ui):
- Login double-enter: change @keyup.enter to @keydown.enter (#10)
- Login loop on LAN: post-login session verify before navigation (#12)
- Splash flash: reorder isReady/showSplash, add black fallback div (#7)
- Skip button text: remove "skip this step" from onboarding (#8)
- Password UI: import existing ChangePasswordSection in Settings (#11)
- Arrow key focus trap: add tab-order fallback when spatial nav fails (#13)

ISO/Boot (image-recipe):
- Step counter: TOTAL_STEPS=7 → 8 to match actual step count
- GRUB theme: add desktop-image-scale-method stretch, widen menu
- Boot noise: add loglevel=0, rd.systemd.show_status=false to kernel
- USB removal: copy reboot script to tmpfs, exec from there
- Tor setup: rewrite python3 JSON generation as bash heredoc
- Doctor/reconcile: copy scripts into rootfs, fix missing file errors
- zstd: add to rootfs packages for initramfs compression

Docs:
- BETA-ISSUES-20260328.md: full issue tracker
- INSTALL-SCREENS-DESIGN.md: editable TUI mockups

522 tests pass, vue-tsc clean.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-03-28 23:41:40 +00:00
Dorian
bdd9578bf8 fix: root podman D-Bus cgroup issue in ISO build
When running as sudo, root podman can't reach the systemd D-Bus
session, causing "Transport endpoint is not connected" errors.
Auto-detect and fall back to cgroupfs cgroup manager.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-03-28 21:01:10 +00:00
Dorian
717733522b fix: heredoc quoting in installer profile.d (boot media not found)
The profile.d script used <<'PROFILE' (single-quoted heredoc) inside
a bash -c '...' single-quoted block. The inner quotes broke the outer
quoting, causing all $ variables to expand to empty at build time.
The for loop checked if [ -f "/archipelago/auto-install.sh" ] instead
of if [ -f "$dev/archipelago/auto-install.sh" ] — never matching.

Fix: use <<PROFILE with \$ escaping (matching .228's working version).
Also adds fallback device scanning if standard mount points are empty,
and fixes same quoting issue in grub-embed.cfg ($root variable).

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-03-28 18:44:36 +00:00
Dorian
1b49257d95 fix: heredoc escaping in installer profile.d (build failure)
The z99-archipelago-installer.sh heredoc used $'\033[...]' ANSI-C
quoting inside an unquoted <<PROFILE heredoc. Bash misparses this
during expansion, treating multi-line content as a single ANSI-C
quoted string.

Fix: switch to <<'PROFILE' (quoted, no expansion) and use raw
\033 escape codes in echo -e instead of $'...' variables.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-03-28 15:15:42 +00:00
Dorian
5507112981 fix: UEFI boot, TUI installer steps, clean progress output
UEFI boot fix:
- Write proper EFI grub.cfg with root UUID after update-grub
  (was missing — GRUB dropped to grub> prompt because it couldn't
  find its config on the EFI FAT partition)

Installer TUI (Claude Code-inspired):
- Step counter [1/7] through [7/7] with clean progress display
- Helper functions: step(), ok(), warn(), fail(), spinner()
- Centered output with cc() helper
- Clean status messages instead of emoji + raw echo

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-03-28 13:39:10 +00:00
Dorian
8dffe807dd fix: onboarding "Set Password" label, reboot sequence, initramfs noise
- OnboardingDone: "Go to Login" → "Set Password" with context text
- Reboot: lazy-unmount live FS before USB removal prompt, suppress
  kernel SquashFS messages, auto-reboot after 10s countdown
- Initramfs: filter "Possible missing firmware" warnings (cosmetic)
- ISOLINUX: menu centered at bottom (VSHIFT 18, HSHIFT 32, WIDTH 18)

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-03-28 13:14:33 +00:00
Dorian
9636bac96b fix: auto-create default user, force reboot, i915 firmware, first boot info
Critical fixes from ISO testing on .198:
- Backend auto-creates default user (password123) on first start
  so login works immediately after onboarding
- Force reboot (reboot -f) after install to avoid SquashFS errors
  when live USB is removed
- Eject USB before prompting user, not after
- Add firmware-misc-nonfree for Intel i915 GPU (suppresses dozens
  of "Possible missing firmware" warnings during initramfs update)
- First boot screen: wait up to 10s for DHCP before showing IP
- First boot screen: compact layout fits 80-col terminals
- ISOLINUX menu resolution dropped to 640x480 for universal
  VESA compatibility (was 1024x768, caused scaling on some hardware)

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-03-28 13:06:34 +00:00
Dorian
99400a7165 feat: container orchestration, branding overhaul, onboarding logging
Container orchestration:
- Health monitor with crash recovery and auto-restart
- Doctor service (periodic health checks via systemd timer)
- Reconcile service (desired-state convergence)
- Stack-aware install/uninstall with dependency tracking

Branding:
- Custom GRUB background (designer artwork, 1024x768)
- ISOLINUX boot menu: centered, orange accents, clean labels
- Terminal banners: adaptive width, basic ANSI colors, fits 80-col
- Removed auto-generated splash scripts (designer provides assets)
- GRUB theme: lowercase branding

Frontend:
- 401 handler clears localStorage immediately (prevents cascade)

Backend:
- Onboarding/auth logging ([onboarding] tag in journalctl)
- Cookie Secure flag logging for debugging HTTP/HTTPS issues

ISO fixes:
- Install log saved before unmount (was silently failing)

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-03-28 11:34:29 +00:00
Dorian
6311aa563e feat: UEFI boot fix, graphical ISOLINUX menu, instant boot
UEFI (#5): grub-mkstandalone embedded config now insmod's all needed
modules (iso9660, search_label, normal, linux) and uses 'normal' to
load the full grub.cfg. Previous config couldn't find the ISO root.

ISOLINUX (#6, #7): Switch from menu.c32 to vesamenu.c32 for background
image support. Copies splash.png from branding. TIMEOUT 0 for instant
boot (no keyboard lag, no menu flicker). Dark theme with transparent
background over the splash image.

Also: added vesamenu.c32 and libcom32.c32 to build artifacts.
Removed console=ttyS0 from quiet boot (interferes with Plymouth).
Added splash to quiet boot kernel params.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-03-28 11:34:29 +00:00
Dorian
10bf53bc83 fix: add uidmap/slirp4netns for rootless Podman, fix Tor permissions
Two critical issues found on fresh .198 install:

1. Podman broken — uidmap package missing from rootfs because
   --no-install-recommends dropped it. Without newuidmap, rootless
   Podman can't create user namespaces. Also add slirp4netns and
   fuse-overlayfs which are required for rootless networking and
   storage.

2. Tor hidden service dirs created with 750 permissions (setgid).
   Tor requires exactly 700. Added explicit mkdir + chmod 700 for
   all hidden service dirs before starting Tor.

Both issues fixed on .198 live. Build script updated for future installs.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-03-28 11:34:29 +00:00
Dorian
5790fb97fe fix: remove sudo from installer (already root), reduce ISOLINUX timeout
- sudo not installed in minbase squashfs — caused "command not found"
  when pressing Enter to install. We're already root via auto-login.
- ISOLINUX timeout from 5s to 1s — reduces menu flicker/duplication

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-03-28 11:34:29 +00:00
Dorian
8d01572894 fix: installer auto-start via profile.d, revert to embedded EFI, dark ISOLINUX
Three fixes from real hardware testing:

1. Installer auto-start: replace systemd service with profile.d script.
   The service and getty raced on tty1 — service output was overwritten
   by the login prompt. Profile.d runs AFTER auto-login, same approach
   the working Debian Live build used.

2. xorriso: revert from -append_partition to embedded -e boot/grub/efi.img.
   The appended partition approach produces cyl-align-off with zero CHS
   geometry, which is why BIOS wouldn't recognize the USB. The embedded
   approach matches the working main ISO (cyl-align-on, proper CHS).

3. ISOLINUX: dark theme instead of ugly blue. Black background, orange
   title, dark selection highlight. No blue boxes.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-03-28 11:34:29 +00:00
Dorian
6cfb8082c5 fix: xorriso append_partition for real USB boot + grub-mkstandalone
Root cause of USB boot failure: our xorriso used -e boot/grub/efi.img
to embed the EFI image inside the ISO. This works for CD-ROM and QEMU
but NOT for USB on real UEFI hardware.

Fix: use the Will Haley / Debian live-build approach:
- -append_partition 2 (GPT type EFI) appends efi.img AFTER ISO data
- -e --interval:appended_partition_2:all:: references the appended partition
- --mbr-force-bootable forces MBR active flag
- grub-mkstandalone with embedded bootstrap config (searches for grub.cfg)
- grub.cfg placed in both /boot/grub/ AND /EFI/BOOT/ on ISO
- grub.cfg uses search --label ARCHIPELAGO to find the ISO root

This is the exact approach used by StartOS, TAILS, and every production
custom Debian live ISO that boots from USB.

Also: iso-debug, iso-branding skills + reference docs, dev-start.sh
option 0 for branding dev, improved dev-branding.sh and test-iso-qemu.sh.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-03-28 11:34:29 +00:00
Dorian
436f337a13 feat: custom boot branding, MBR fix, Plymouth theme, CI smoke tests
Boot fix:
- Ship proven Debian Live MBR (4552) as branding/isohdpfx.bin — the
  ISOLINUX package MBR (33ed) doesn't boot on all hardware. This was
  the root cause of "machine doesn't pick up the USB".

Branding:
- Custom GRUB background: pixel-art floating island (1024x574)
- Archipelago pixel-art logo for Plymouth boot splash
- GRUB theme: dark background, orange selected item, no broken font refs
- Plymouth theme: script-based with progress bar, LUKS prompt support
- Plymouth + splash added to target rootfs packages
- GRUB theme installed on both installer ISO and target system
- Serial console (ttyS0) added to kernel params for QEMU debugging

CI improvements:
- Smoke test step: mounts ISO, verifies all critical files, checks
  initrd has live-boot, confirms boot=live in grub.cfg. Fails build
  before copying to Builds if any check fails.

Dev workflow:
- dev-branding.sh: extract ISO, swap branding, repackage, boot in QEMU
  (~10 seconds vs 20 min full rebuild)
- generate-grub-background.py: procedural cyberpunk background generator
- generate-plymouth-logo.py: procedural logo generator
- Improved test-iso-qemu.sh: --bios/--nographic flags, serial logging

Build:
- Simplified live-boot install (clean chroot, no complex fallbacks)
- Static branding images preferred, generators as fallback

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-03-28 11:34:29 +00:00
Dorian
a12e97ca72 fix: restore -partition_offset 16 to xorriso for USB boot compatibility
The old Debian Live ISO used -partition_offset 16 which reserves space
for a GPT partition table in the hybrid MBR layout. UEFI firmware on
some machines requires this to recognize the USB as bootable. We
removed it thinking it was Debian Live-specific but it's actually an
xorriso hybrid boot requirement.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-03-28 11:34:29 +00:00
Dorian
6eb6f8b27d fix: live-boot check — scripts/live is a file not a directory
The verification used [ -d ] but live-boot-initramfs-tools installs
scripts/live as a regular file, not a directory. Changed to [ -e ].
The chroot install was actually succeeding — only the check was wrong.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-03-28 11:34:29 +00:00
Dorian
8837c76ef4 fix: live-boot install — avoid chroot, use dpkg extraction fallback
The chroot /installer command fails inside the CI container because
the container exits after debootstrap completes (set -e + container
boundary). The chroot then runs on the host where /installer doesn't
exist.

Fix: use apt-get with Dir overrides first, fall back to dpkg-deb -x
extraction of live-boot .deb files directly into the installer root.
This bypasses chroot entirely and is more robust in container-in-
container environments.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-03-28 11:34:29 +00:00
Dorian
540438296e fix: install live-boot via apt after debootstrap, remove partition_offset
Two boot fixes:
- live-boot package must be installed via chroot apt-get, not debootstrap
  --include (minbase resolver can't handle its deps). Verified initrd was
  missing scripts/live* entirely.
- Remove -partition_offset 16 from xorriso — it was designed for the
  original Debian Live MBR, not the standard ISOLINUX isohdpfx.bin.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-03-28 11:34:29 +00:00
Dorian
709c135fab fix: boot chain — add live-boot, mount proc/sys/dev, fix kernel params
The first ISO build didn't boot. Three root causes:

1. No squashfs-as-root mechanism — the custom initramfs hook mounted
   boot media but had no way to use the squashfs as the root filesystem.
   Fix: add live-boot + live-boot-initramfs-tools to debootstrap includes.
   This is ~100KB and provides proven squashfs-as-root with overlayfs.

2. Broken initramfs — update-initramfs needs /proc, /sys, /dev mounted
   in the chroot to detect modules and generate a working initrd.
   Fix: bind-mount virtual filesystems before update-initramfs.

3. Missing kernel parameters — GRUB and ISOLINUX configs lacked
   boot=live components, so live-boot never activated.
   Fix: add boot=live components to all kernel command lines.

Also: add all_video/efi_gop/efi_uga modules to GRUB EFI image for
display output on real hardware, and update installer wrapper to
check /run/live/medium first (where live-boot mounts the ISO).

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-03-28 11:34:29 +00:00
Dorian
4326f019c1 feat: replace Debian Live with custom debootstrap ISO base + branding
Major ISO build overhaul on dev-iso branch:

- Replace ~800MB Debian Live download with debootstrap --variant=minbase
  (~150MB installer squashfs built from scratch)
- Custom initramfs with archipelago-mount hook for boot media detection
- Systemd service auto-starts installer (replaces profile.d hack)
- GRUB + ISOLINUX configs written from scratch (no Debian Live dependency)
- EFI boot image built with grub-mkimage (no more MBR extraction)
- Archipelago GRUB theme: dark background, Bitcoin orange accents
- Theme installed on both installer ISO and target system
- Rootfs optimizations: --no-install-recommends, strip docs/man/locales,
  remove firmware-misc-nonfree/wget/htop, add explicit font deps
- Separate CI workflow (build-iso-dev.yml) for dev-iso branch
- Includes pre-existing fixes from main (build-iso.yml, middleware, Login)

Target: sub-2GB unbundled ISO (down from 3.9GB)

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-03-28 11:34:29 +00:00
Dorian
9741e73824 fix: filebrowser port bind, CSRF in tests, console-setup, auto-test scope
FileBrowser crash fix:
- Add --cap-add=NET_BIND_SERVICE (port 80 needs it with --cap-drop=ALL)
- Add --cap-add=DAC_OVERRIDE for rootless volume access
- Both in first-boot script and backend config.rs

Test script fixes:
- Extract csrf_token cookie and send as X-CSRF-Token header on RPC calls
- Add --phase1-only flag for safe install-only checks (no side effects)
- Auto-test service uses --phase1-only so it doesn't steal onboarding

Install fixes:
- Pre-create ~/.local/share/containers (ReadWritePaths mount namespace error)
- Fix console-setup.service: add After=tmp.mount + ExecStartPre mkdir /tmp

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-03-27 17:17:18 +00:00
Dorian
316dc4875a fix: run post-install tests automatically on first boot
Adds archipelago-post-install-tests.service — runs once after all
services are up, outputs to console + journal + log file at
/var/log/archipelago-post-install-tests.log. Tests password setup,
onboarding, and container lifecycle. Runs with default password
(password123) for automated validation.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-03-27 16:19:33 +00:00
Dorian
7cd4d90ed8 fix: production onboarding, CI tests, container security, keyboard nav
Install & Onboarding:
- Remove DEV_MODE=true from production ISO service file (auto-created
  users, skipped password setup)
- Auto-install no longer overwrites rootfs service file with bad template
- Login.vue always checks auth.isSetup — shows password creation form
  on fresh install without requiring dev build flag
- Deploy image-versions.sh to /opt/archipelago/scripts/ on installed nodes
- First-boot-containers sources image-versions.sh, runs podman as
  archipelago user (rootless), enables linger + podman.socket
- Correct volume ownership (100000:100000 for rootless UID mapping)

Container Security:
- FileBrowser: add --cap-add=DAC_OVERRIDE for rootless podman volume access
- FileBrowser: add --read-only, /data volume for database, proper cmd args
- First-boot script matches backend config (security hardening + health check)

CI Pipeline:
- Add vue-tsc type check + vitest run to build-iso.yml (runs every push)
- Add post-install-tests.yml workflow (workflow_dispatch, SSH to target)
- Build report: set +eo pipefail, fix rootfs path, add || true guards
- Bundle run-post-install-tests.sh into ISO

E2E Test Suite (scripts/run-post-install-tests.sh):
- Phase 1: Install verification (files, services, podman, linger, DEV_MODE check)
- Phase 2: Onboarding flow (auth.isSetup, auth.setup, login, DID, complete)
- Phase 3: Container lifecycle (install 3 apps via package.install RPC,
  verify running, stop, verify stopped, restart, verify running, health)
- Phase 4: Log verification (first-boot log, diagnostics, journal errors)
- Correct package.install params: {"id", "dockerImage"}

Frontend:
- Fix backdrop-filter tab-switch bug (keep animations paused during rebuild)
- Dashboard glitch animations paused during tab-hidden
- Gamepad nav: auto-focus first container on route change
- Tab roving: Left/Right on role="tab" cycles and activates sibling tabs
- ContainerApps: data-controller-launch on running app cards
- 515 tests passing (fixed 30 broken, added 19 new keyboard nav tests)

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-03-27 16:16:57 +00:00
Dorian
121f17e44e fix: container install flow, filebrowser auth, AppCard enrichment
- Fix .198-style fresh installs: systemd service ExecStartPre creates
  /run/user/1000, enable podman.socket, chmod 644 /etc/hosts
- Filebrowser: add /data volume for database (fixes read-only crash),
  secure auth with random password via backend RPC (no more admin/admin)
- AppCard: enrich installing state with marketplace metadata (icon,
  title, description, tier badge, author, version)
- Registry: btcpayserver 1.13.5 → 1.13.7, images mirrored
- ReadWritePaths: add home container paths for rootless podman

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-03-27 13:32:54 +00:00
Dorian
546481bc15 fix: bundle FileBrowser, auto-login tty1, boot/auth debug logging
- ISO build: configure insecure registry for root podman so FileBrowser
  image can be pulled during build (was failing with HTTPS error)
- Auto-login on tty1 so no password prompt on console
- RootRedirect: persistent debug logging to sessionStorage
  (view in DevTools > Application > Session Storage > archipelago_boot_log)
- Logs: health check, onboarding state, routing decisions, 401 handling

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-03-27 11:12:31 +00:00
Dorian
35ae7ff847 fix: kiosk cursor, Esc dead-end, PWA prompt, password overlay, gamepad Enter
- Kiosk: show cursor when active (removed -nocursor from Xorg),
  unclutter hides after 3s idle. X11 on VT7 for Ctrl+Alt+F1/F7 switching.
- Kiosk: keep getty@tty1 running so MOTD is accessible via Ctrl+Alt+F1
- Kiosk: disable Chromium password save overlay (--password-store=basic)
- Esc: don't navigate back from top-level pages (dashboard, login, kiosk)
  to prevent dead-end at root redirect
- PWA: suppress install prompt in kiosk mode (/kiosk path)
- Gamepad: Enter in text fields moves focus to next element (submit button)
  instead of submitting the form

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-03-26 21:16:07 +00:00
Dorian
16a5a9ae16 feat: add install log and first-boot diagnostics
- Installer: tee all output to /var/log/archipelago-install.log
  on the target disk for post-install debugging
- First boot: oneshot service captures system state 30s after boot:
  services, nginx, LUKS, EFI, SSL, containers, journal errors
- On-demand: sudo archipelago-diagnostics to re-run anytime

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-03-26 20:58:34 +00:00
Dorian
102434c041 feat: add build report and first-boot diagnostics
CI build report: checks rootfs contents (nginx, SSL, keyboard, kiosk,
lid config, backend, frontend) and ISO contents after build. Reports
in the Actions log so build issues are immediately visible.

First-boot diagnostics: one-shot systemd service runs 30s after first
boot, logs service status, nginx test, SSL certs, LUKS, podman,
kiosk, console-setup, disk, network, and journal errors to
/var/log/archipelago-first-boot-diag.log. Only runs once (ConditionPathExists).

SSH in and cat the log to debug any fresh install issues.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-03-26 20:54:32 +00:00
Dorian
6b857e59d0 fix: preseed keyboard config, enable kiosk by default
- Preseed keyboard-configuration and console-setup debconf values
  to prevent console-setup.service failure on boot
- Enable archipelago-kiosk.service by default on fresh installs
  so the system boots into the web UI display, not a login prompt

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-03-26 19:50:59 +00:00
Dorian
66b1f35638 fix: nginx startup, kiosk fullscreen, reboot errors, kiosk toggle
- Remove hardcoded Tailscale IP from nginx listen (broke fresh install)
- Generate SSL cert in installer if rootfs missed it (safety net)
- Kiosk: add --start-fullscreen --start-maximized --window-size flags
- Kiosk: remove --disable-gpu (can prevent fullscreen rendering)
- Kiosk: add toggle command and Ctrl+Alt+F1/F7 docs in MOTD
- Reboot: suppress stderr during cleanup to hide flashing errors

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-03-26 18:30:13 +00:00
Dorian
17eeb59a2b fix: purge shim-signed and clean EFI/BOOT to fix boot failure
Shim-signed package hooks reinstall shimx64.efi and BOOTX64.CSV
which cause 'Failed to open \EFI\BOOT\' with garbled filenames.
Purge the package before grub-install, then nuke everything from
EFI/BOOT except BOOTX64.EFI and grub.cfg.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-03-26 17:31:26 +00:00
Dorian
40fd95407a fix: load dm_mod/dm_crypt and mount /proc /sys for LUKS setup
The live installer environment doesn't have dm_mod loaded, causing
'Cannot initialize device-mapper' during LUKS2 encryption. Also
bind-mount /proc and /sys into chroot so cryptsetup can detect
hardware capabilities.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-03-26 17:28:08 +00:00
Dorian
bb356af6e4 fix: remove 'local' keyword outside function in ISO build script
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-03-26 16:23:19 +00:00