Container recovery:
- Health monitor: MAX_RESTART_ATTEMPTS 3→10, interval 60s→120s
- Dependency-aware restarts: won't restart services before their deps
- Reset dependent counters when a dependency recovers
- Handle "created" state containers (were invisible to health monitor)
- Added IndeedHub, mempool-api, mysql to tier system
- Crash recovery: podman start timeout 30s→120s with retry
- Podman client: socket timeout 5s→30s, added restart policy

UI state representation:
- Exit code 0 shows "stopped" (gray), not "crashed" (red)
- Exit code 137 shows "killed (OOM)"
- Non-zero exit shows "crashed" (red)
- Added exit_code field to PackageDataEntry

Install/uninstall fixes:
- Install returns error when container doesn't start (was silent success)
- Post-install hooks awaited instead of fire-and-forget tokio::spawn
- Uninstall: graceful rm before force, volume prune, network cleanup
- Uninstall returns error on partial failure (was 200 OK)

Config consistency:
- DB passwords read from /var/lib/archipelago/secrets/ (was hardcoded)
- Bitcoin: added ZMQ ports 28332/28333 for LND block notifications
- IndeedHub port 7777→8190 (was conflicting with strfry)
- Marketplace versions: LND 0.17.4→0.18.4, Mempool 2.5.0→3.0.0

Performance:
- Metrics collector interval 60s→300s (was duplicating health monitor)
- Podman client: proper error propagation instead of unwrap_or_default

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Plan: ISO Polish — Fix Everything for Beta Release
Context
A fresh ISO install on .198 revealed 11 issues, ranging from critical (app installs failing, Tor broken) to UX (GRUB scaling, boot splash, kiosk reliability). Goal: the next ISO build produces a flawless out-of-box experience.
Issues & Fixes (priority order)
1. CRITICAL: Tor services.json not written (escaping bug)
Symptom: setup-tor.sh: line 12: $ARCHY_TOR_DIR/services.json: No such file or directory
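Root cause: In build-auto-installer-iso.sh, the setup-tor heredoc escapes $ARCHY_TOR_DIR as \$ARCHY_TOR_DIR, and the generated script ends up with a path that is never expanded: the error message shows the literal string $ARCHY_TOR_DIR/services.json being used as the filename at runtime.
Fix: In the heredoc that generates setup-tor.sh (~line 1200), use unescaped $ARCHY_TOR_DIR. Because the heredoc uses <<TORSCRIPT (unquoted), the unescaped form expands when the script is generated rather than at runtime, while \$ARCHY_TOR_DIR is deferred to runtime. Check the quoting carefully and confirm the variable is set at whichever point it is meant to expand.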
File: image-recipe/build-auto-installer-iso.sh (setup-tor heredoc section)
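The quoting rules involved can be illustrated with a small sketch. This is not the actual build script: the function name generate_demo_script and the value /var/lib/demo/tor are hypothetical. In an unquoted heredoc, a bare $VAR is replaced while the file is being written, whereas \$VAR is written out as a literal $VAR and expands only when the generated script runs.

```shell
#!/bin/sh
# Hypothetical build-time value; the real build script may set this differently.
ARCHY_TOR_DIR=/var/lib/demo/tor

# Write a small script to the path given in $1 using an UNQUOTED heredoc.
generate_demo_script() {
    cat > "$1" <<TORSCRIPT
#!/bin/sh
# Bare variable: replaced by its build-time value when this file is written.
echo "build-time: $ARCHY_TOR_DIR/services.json"
# Escaped variable: written out literally and expanded when this file runs.
echo "runtime: \$ARCHY_TOR_DIR/services.json"
TORSCRIPT
}
```

If the generated script is expected to define ARCHY_TOR_DIR itself, the escaped form is the one that defers to it; if the value is known only to the build script, the bare form bakes it in.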
2. CRITICAL: App installs failing ("Operation failed")
Symptom: Screenshot shows "Failed: Error: Operation failed. Check server logs" + "Downloading..." stuck
Root cause: The tested ISO is the OLD build (pre-CSRF fix); the new build includes the fix. Separately, sanitize_error_message() in middleware.rs masks ALL real errors, so the underlying failure never reaches the user. The new build still needs to be verified.
Fix: Already fixed (auth.ts, rpc-client.ts, mod.rs). Verify on next ISO.
Also: Consider allowing "Failed to pull" errors through the sanitizer so users see meaningful install errors.
File: core/archipelago/src/api/rpc/middleware.rs
3. HIGH: Kiosk white screen / never loads on first boot
Symptom: First boot: black screen → white screen → kiosk never loads. Second boot works fine.
Root cause: The kiosk ExecStartPre health check polls 15x with 2s delay (30s max), but on first boot the backend may not be ready within 30s (first-boot-containers, Tor setup, etc. all running). Chromium opens http://localhost/kiosk before nginx/backend is fully up → white page. No retry logic in the launcher.
Fix: Increase health check to 30 attempts (60s). Add a loading page that Chromium shows while waiting (a simple HTML file served by nginx even when backend is down). Add --disable-gpu flag to Chromium (fixes some white screen issues on low-end GPUs).
File: image-recipe/build-auto-installer-iso.sh (kiosk launcher + ExecStartPre)
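The extended health check could look roughly like the loop below. It is a sketch under assumptions: the function name wait_for_backend and the injectable probe command are illustrative, and the real ExecStartPre may probe a different endpoint. Only the 30 attempts with a 2s delay come from the fix above.

```shell
#!/bin/sh
# Poll a probe command until it succeeds or the attempts run out.
# Usage: wait_for_backend MAX_ATTEMPTS DELAY_SECONDS PROBE_COMMAND [ARGS...]
# Note: plain sh has no local variables, so these names are global.
wait_for_backend() {
    max_attempts=$1
    delay=$2
    shift 2
    attempt=1
    while :; do
        # The probe is any command that exits 0 when the backend is ready.
        if "$@" >/dev/null 2>&1; then
            echo "backend ready after $attempt attempt(s)"
            return 0
        fi
        # Stop without sleeping once the last attempt has failed.
        if [ "$attempt" -ge "$max_attempts" ]; then
            break
        fi
        attempt=$((attempt + 1))
        sleep "$delay"
    done
    echo "backend not ready after $max_attempts attempt(s)" >&2
    return 1
}
```

For example, `wait_for_backend 30 2 curl -fsS http://localhost/kiosk` would wait up to roughly 60 seconds. Whether that URL is the right probe target is an assumption; a dedicated health endpoint may be a better fit.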
4. HIGH: GRUB theme text not scaling / cut off on 4:3 monitors
Symptom: Screenshot shows "Install (var" cut off, menu items barely readable on 1280x1024 Dell
Root cause: GRUB theme uses percentage-based layout but no font size control. GRUB defaults to a small bitmap font. The item_height = 40 is fixed pixels, too small at some resolutions. No explicit font loaded in theme.txt.
Fix: In grub.cfg, load a larger font (24px DejaVu or similar). Adjust theme.txt: increase item_height, move menu position up, ensure text fits. Add loadfont to grub.cfg.
Files: image-recipe/branding/grub-theme/theme.txt, image-recipe/build-auto-installer-iso.sh (grub.cfg generation)
5. HIGH: LUKS partition not showing in disk stats
Symptom: Server view doesn't show LUKS encryption status or the encrypted partition
Root cause: Backend system.disk-status uses df / or df /var/lib/archipelago but doesn't report LUKS status. No cryptsetup status call. Frontend only shows used/total/free/percent.
Fix: Add LUKS detection to the disk status RPC: check if /dev/mapper/archipelago* exists, read cryptsetup status. Return encrypted: true/false and encryption_cipher fields. Frontend: show a lock icon + "LUKS2 Encrypted" badge in disk stats.
Files: core/archipelago/src/api/rpc/system/handlers.rs, neode-ui/src/views/Server.vue
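The detection step could be prototyped in shell before porting it to the Rust handler. The sketch below parses text in the format printed by `cryptsetup status <name>`. The function name parse_luks_status and the JSON shape are assumptions; only the encrypted and encryption_cipher field names come from the fix above.

```shell
#!/bin/sh
# Read cryptsetup-status-style text on stdin and print a small JSON object.
# Reports encrypted=true only when a "type:" line starting with LUKS is present.
parse_luks_status() {
    awk '
        $1 == "type:"   { type = $2 }
        $1 == "cipher:" { cipher = $2 }
        END {
            if (type ~ /^LUKS/)
                printf "{\"encrypted\": true, \"encryption_cipher\": \"%s\"}\n", cipher
            else
                print "{\"encrypted\": false, \"encryption_cipher\": null}"
        }
    '
}
```

On a live system the input would come from running `cryptsetup status` against each /dev/mapper/archipelago* entry, which typically needs root privileges, so the backend would need a privileged way to make that call.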
6. MEDIUM: No Plymouth boot splash showing
Symptom: No animation between GRUB and login — just black screen with blinking cursor
Root cause: Plymouth theme files exist in image-recipe/branding/plymouth-theme/ but the ISO build doesn't copy the logo.png or install the theme properly. Also kernel cmdline needs splash quiet and plymouth-set-default-theme must be run.
Fix: Verify the ISO build copies plymouth theme + logo.png to rootfs, runs plymouth-set-default-theme archipelago, and kernel cmdline includes splash quiet.
File: image-recipe/build-auto-installer-iso.sh (plymouth setup section)
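The verification could be scripted as a pair of checks run against the built rootfs. This is a sketch with hypothetical function names. It assumes the theme lands under /usr/share/plymouth/themes/, the usual Debian location, and that the theme definition file is named after the theme.

```shell
#!/bin/sh
# Return 0 if the kernel command line in $1 contains both "splash" and "quiet".
cmdline_has_splash() {
    has_splash=no
    has_quiet=no
    for word in $1; do
        case "$word" in
            splash) has_splash=yes ;;
            quiet)  has_quiet=yes ;;
        esac
    done
    [ "$has_splash" = yes ] && [ "$has_quiet" = yes ]
}

# Return 0 if the theme definition and logo exist under rootfs $1 for theme $2.
# The /usr/share/plymouth/themes layout is assumed, not taken from the build script.
plymouth_theme_installed() {
    theme_dir="$1/usr/share/plymouth/themes/$2"
    [ -f "$theme_dir/$2.plymouth" ] && [ -f "$theme_dir/logo.png" ]
}
```

Both checks only confirm that the files and flags are present; they do not prove that plymouth-set-default-theme was run or that the initramfs was rebuilt with the theme.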
7. MEDIUM: No custom MOTD
Symptom: Default Debian MOTD on VT1 login
Fix: Add custom MOTD to ISO build that shows Archipelago ASCII logo, version, IP address, and useful commands (kiosk toggle, SSH info).
File: image-recipe/build-auto-installer-iso.sh (add MOTD generation)
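A minimal sketch of the MOTD body is shown below. The function name render_motd, the layout, and the wording are placeholders, and the ASCII logo is left out. The version and IP address are passed in as arguments because how the build or the running system obtains them is not specified here.

```shell
#!/bin/sh
# Print a MOTD body. Arguments: VERSION IP_ADDRESS
# The ASCII logo is omitted; layout and wording are placeholders.
render_motd() {
    cat <<MOTD
Archipelago $1
IP address: $2

Useful commands:
  archipelago-kiosk    toggle the kiosk display
  ssh <user>@$2        connect remotely
MOTD
}
```

Values such as the IP address can change between boots, so the output likely needs to be regenerated at boot or login rather than baked into the image once.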
8. MEDIUM: Onboarding intro needs double press
Symptom: Pressing the intro circle/button once resets it; a second press is needed to proceed.
Root cause: SplashScreen.vue has a 48-segment ring animation triggered on hover. The splash → intro transition may have a race condition with animation completion. OnboardingIntro.vue auto-focuses CTA after 2100ms delay — if user clicks before that, focus may steal the event.
Fix: Investigate SplashScreen.vue transition timing. Add click debounce or ensure single-click always proceeds.
Files: neode-ui/src/components/SplashScreen.vue, neode-ui/src/views/OnboardingIntro.vue
9. MEDIUM: No TUI animations in actual installer
Symptom: Installer is functional but plain — no bouncing Bitcoin, no glitch effects from demo
Root cause: scripts/install-tui-demo.sh has elaborate animations but the actual installer in the ISO build script is minimal (basic spinner + typewriter only).
Fix: Port key animations from install-tui-demo.sh into the actual installer: logo decrypt reveal, progress bars with percentage, phase transitions. Keep it lightweight but visually distinctive.
File: image-recipe/build-auto-installer-iso.sh (auto-install.sh section)
10. LOW: Container tests CI failing
Symptom: cargo: command not found in container-tests workflow
Fix: Add source $HOME/.cargo/env to test steps. Already staged locally.
File: .gitea/workflows/container-tests.yml
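A slightly more defensive form of that step is sketched below. The function name load_cargo_env is hypothetical. It uses the POSIX `.` command rather than `source`, which matters if the workflow step runs under plain sh, and it fails if cargo is still missing afterwards.

```shell
#!/bin/sh
# Load the rustup-provided environment file if present, then confirm that
# cargo is reachable on PATH. Returns non-zero when cargo is still missing.
load_cargo_env() {
    if [ -f "$HOME/.cargo/env" ]; then
        . "$HOME/.cargo/env"
    fi
    command -v cargo >/dev/null 2>&1
}
```

Calling `load_cargo_env || exit 1` at the top of each test step would turn a missing toolchain into a clear early failure instead of a later "cargo: command not found".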
11. LOW: Kiosk enable/disable command lacks visual feedback
Symptom: User runs command, MOTD changes but no immediate visual confirmation
Root cause: The archipelago-kiosk script DOES print feedback messages. The issue may be that VT auto-switches and the user doesn't see the output.
Fix: Add a brief sleep before VT switch so user sees the confirmation message. Consider adding a --quiet flag for scripted use.
File: image-recipe/build-auto-installer-iso.sh (kiosk toggle script)
Execution Order
- Tor fix (#1) — 5 min, critical
- Kiosk reliability (#3) — 15 min, high impact
- GRUB text scaling (#4) — 15 min, visible
- LUKS disk stats (#5) — 20 min, backend + frontend
- App install error messages (#2) — 10 min, verify + improve
- Plymouth boot splash (#6) — 15 min
- Custom MOTD (#7) — 10 min
- Intro double-press (#8) — 10 min
- TUI animations (#9) — 30 min (port from demo)
- CI fix (#10) — 2 min
- Kiosk feedback (#11) — 5 min
Verification
- Build new ISO on .228 via CI (push to main)
- Flash to USB, install on .198
- Check: GRUB readable → Plymouth splash → TUI installer animations → MOTD shows → Kiosk loads first time → Tor onion addresses visible → App installs work → Disk shows LUKS → Intro single-click works