Tor was configured in torrc but never started by first-boot-containers.sh.
Connect wallet and .onion services were broken on fresh installs.
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
- Chromium kiosk: add --disable-gpu-compositing, --disable-gpu-rasterization,
--disable-software-rasterizer, --renderer-process-limit=1
drops GPU process from 64% to 12% CPU
- Container healthchecks: 30s to 120s interval in first-boot and reconcile
- AppCard: min-height on description so cards dont shift
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Critical bug: the doctor runs as root but containers are rootless
under the archipelago user. When checking if a conmon process has an
associated container, the root podman database was queried (empty),
causing ALL conmon processes to be identified as orphaned and killed.
This terminated running containers every 30 minutes.
Fix: use sudo -u archipelago to query the rootless podman database.
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Nginx needs CHOWN, SETUID, SETGID to chown cache directories and drop
privileges on startup. LND UI additionally needs NET_BIND_SERVICE to
bind port 80 inside the container. Without these, cap-drop ALL causes
nginx to crash with "Operation not permitted" on chown or bind.
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
The doctor runs as root (for tor permissions, process cleanup) but
containers are rootless under the archipelago user. Use sudo -u to
switch user context for podman commands.
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Rootless Podman 4.x restart policies don't auto-restart containers
after crashes. The doctor (which runs on a timer) now checks for
exited core containers (tiers 0-2) and restarts them.
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
The health check command goes through multiple shell layers
(assignment → variable expansion → eval → podman → sh -c). Inner
double quotes need \\\" escaping to survive as literal " in Python.
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
The electrumx container image doesn't include curl. Replace the HTTP
health check with a Python socket connection test to the RPC port.
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
- electrumx: add DAC_OVERRIDE to SPEC_CAPS — rootless podman maps container
UID 0 to host UID 1000, but volumes are owned by host UID 100000; without
DAC_OVERRIDE the container can't write to its own data directory
- lnd: replace curl-based health check with lncli using readonly macaroon —
the REST API requires macaroon auth, so unauthenticated curl always fails
- grafana: add DAC_OVERRIDE to SPEC_CAPS for the same rootless volume issue
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
- Add orchestration_tests.rs + mock_podman.rs (container unit tests)
- Add container-tests.yml CI workflow
- Add dev-container-test.sh for local testing
- MASTER_PLAN.md: add TASK-49 (P0) with 6-phase plan
- Login.vue: minor fixes from user testing
- AppCard.vue: enter key handler fix
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
- image-versions.sh: fix 15+ tag mismatches against actual registry
(bitcoin-knots:28.1→latest, lnd:v0.18.5→v0.18.4, grafana:11.4→10.2,
vaultwarden:1.32.5→1.30.0-alpine, nextcloud:29→28, etc.)
- .bashrc: add /sbin:/usr/sbin to PATH so reboot/shutdown work
- Tailscale: add Arch Atob node (100.113.33.31)
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Option 0 in dev-start.sh launches the branding development workflow:
- Finds latest ISO on Desktop or results/
- Patches branding files into the ISO
- Boots in QEMU for immediate visual feedback
- Lists editable files if no ISO is available
Edit background.png, theme.txt, or Plymouth files, re-run option 0,
see changes in ~10 seconds without a full CI build.
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
FileBrowser crash fix:
- Add --cap-add=NET_BIND_SERVICE (port 80 needs it with --cap-drop=ALL)
- Add --cap-add=DAC_OVERRIDE for rootless volume access
- Both in first-boot script and backend config.rs
Test script fixes:
- Extract csrf_token cookie and send as X-CSRF-Token header on RPC calls
- Add --phase1-only flag for safe install-only checks (no side effects)
- Auto-test service uses --phase1-only so it doesn't steal onboarding
Install fixes:
- Pre-create ~/.local/share/containers (ReadWritePaths mount namespace error)
- Fix console-setup.service: add After=tmp.mount + ExecStartPre mkdir /tmp
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
All container image references now pull from 80.71.235.15:3000/archipelago/
instead of Docker Hub and ghcr.io. image-versions.sh is the single source
of truth; all scripts use $*_IMAGE variables instead of hardcoded refs.
Files updated:
- scripts/image-versions.sh: central ARCHY_REGISTRY variable
- core/*/config.rs: registry whitelist includes app registry
- core/*/stacks.rs: Immich + Penpot stack images
- scripts/{first-boot,deploy-to-target,container-specs}.sh: use variables
- docker/*/Dockerfile: nginx base image from registry
- image-recipe/: ISO build, podman config, menu script
- scripts/{container-doctor,deploy-bitcoin-knots,fix-indeedhub,validate-app-manifest}.sh
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
- Dashboard.vue: move DashboardMobileNav outside <main> so position:fixed
isn't broken by will-change:transform on the perspective container
- Add container-specs.sh and reconcile-containers.sh utility scripts
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
- Remove || exit 1 from health-cmd (redundant, breaks SSH heredoc)
- Use --health-cmd 'cmd' format (space, not equals) for proper quoting
- Simplify bitcoin health check to bitcoin-cli getnetworkinfo (no creds needed)
- Fix MariaDB health check nested quote issue
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
- Fix shell escaping in LND config sync block (single-quoted SSH context
doesn't need backslash-escaped dollars)
- deploy-tailscale.sh BUILD_SOURCE auto-detects Tailscale IP when LAN
unreachable (fixes "No binary on .228" error)
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
- Add --all as alias for --fleet
- Fleet deploy auto-detects Tailscale IP when LAN SSH fails
- Skip .198 gracefully when unreachable instead of failing
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
- Create scripts/smoke-test.sh for live server verification (7 checks)
- Document planned GitHub Actions CI/CD pipeline in docs/ci-cd-plan.md
- Integration tests deferred to future task (require test harness setup)
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
- S10: Add warnings to silent health check failures in deploy scripts
- S11: Add trap cleanup for temp dirs in deploy and tailscale scripts
- S12: Quote 20+ critical unquoted variables across deploy scripts
- S13: Extract hardcoded IPs to deploy-config-defaults.sh
- S15: Add --memory=256m to UI container runs
- F16: Remove in-memory JWT, use cookie-only auth in filebrowser client
- F17: Add meta tag fallback for CSRF token in RPC client
- F19: Track and clear setTimeout in AppSession on unmount
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
- I2: Add MemoryMax=4G, LimitNOFILE=65535, TasksMax=2048 to systemd service
- I3: Tor rotation keeps old service for 1h transition before cleanup
- R14: Replace .parse().unwrap() with .unwrap_or(localhost) in rate limiter
- R15: Replace 7 unwrap/expect in mesh protocol with proper error propagation
- R27: Add 10s timeouts to mesh Bitcoin RPC calls
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
- F4: Fetch fresh server state after WebSocket reconnect
- F5: Guard message polling timer with auth check, stop on logout
- F6: Remove NIP-07 listener in appLauncher close()
- F7: Initialize audio player once to prevent listener stacking
- S3: Pin all container images to specific versions, create image-versions.sh
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
- F1: Guard connectWebSocket against concurrent calls with isWsConnecting flag
- F2: Serialize mesh send operations with sendQueue to prevent fetchMessages races
- F3: Add global Vue error handler with toast notification
- S1: Replace sudo podman with podman across all scripts (rootless Podman)
- S2: Add health-cmd to all 40 container run commands in first-boot-containers.sh
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Major changes:
- Full Tor hidden service management via systemd path unit pattern
(tor-helper.sh + archipelago-tor-helper.path/service) — respects
NoNewPrivileges=yes, no sudo needed from backend
- Container doctor: prefer system Tor over container, remove archy-tor
- Deploy script: fix torrc generation (read correct services.json path),
web apps map port 80→local port, enable both tor and tor@default
- Federation: server rename pushes name to peers via background sync
- Server name: fix root-owned file, optimistic store update
- Mesh: local echo for sent messages, sendingArch loading state
- Web5: Message button → Mesh redirect, node name lookup in messages
- PeerFiles: show DID not onion in header
- Connected Nodes: flex-1 instead of fixed max-h
- Toast notifications route to Mesh
- Deploy script: fix single-quote syntax in SSH block
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Part 1 — DID Persistence:
- Deploy script creates /var/lib/archipelago/identity/ directory
- First-boot script creates identity dir with proper ownership
- Identity load now logs pubkey to confirm persistence across restarts
Part 2 — Node Names:
- NodeStateSnapshot includes node_name field
- build_local_state() passes server name to sync responses
- update_node_state() stores peer's announced name on the FederatedNode
- Names propagate automatically during federation.sync-state
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Ensures LND Connect works through every deployment path:
- Nginx: CORS $http_origin on /lnd-connect-info (both HTTP+HTTPS)
- Nginx: no cookie gate (backend is 127.0.0.1-only)
- LND UI source: fetch with credentials: 'include'
- Deploy: rebuilds LND UI with --no-cache every deploy
- First-boot: --restart unless-stopped + memory limits on UI containers
- Backend: bound to 127.0.0.1:5678 in systemd service
Root cause was CORS: LND UI on :8081 fetching :80 is cross-origin.
Browser blocked reading the 200 response without CORS headers.
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
- New Discover.vue (app store redesign)
- Fleet.vue dashboard for .228
- MeshMap.vue component
- Fixed Discover.vue type errors (unused var, type predicate)
- Various UI updates (Apps, Dashboard, Marketplace, Mesh, Web5)
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
LND was crash-looping because lnd.conf had 127.0.0.1:8332 (container
loopback, not reachable) and the old hardcoded password. Deploy script
now detects stale values and patches them to bitcoin-knots:8332 with
the current secrets file password. Fixes address generation failure.
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
TASK-17: Deploy script auto-tags successful clean deploys with next
alpha version number. Skips if commit already tagged or working tree
is dirty.
BUG-3: Updated IndeedHub submodule — removed dead nostrConfig with
hardcoded ws://localhost:7777 that caused WebSocket reconnection spam
in browser console. Relay detection via relay.ts (auto-detect /relay
proxy) is the active path.
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Switch docker-compose from regtest to signet, add standalone testnet stack
(docker-compose.testnet.yml) with Bitcoin+LND+ThunderHub+Fedimint. Mock
backend now auto-detects Podman/Docker sockets and includes full LND/Lightning
RPC mocks. Dev scripts refactored with boot mode, testnet option, and macOS
EAGAIN fix for port cleanup. Added dev faucet button to Home.vue.
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
- BUG-33: CPU load alert threshold increased from 2x to 4x core count
(8→16 on 4-core machine) to reduce false alerts during container ops
- TASK-27: Launch buttons for new-tab apps now show external link icon
(BTCPay, Grafana, PhotoPrism, Portainer, OnlyOffice, etc.)
- TASK-36: Iframe error screen now distinguishes between X-Frame-Options
blocked vs container not reachable, with appropriate messaging
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
- Bitcoin UI nginx: use __BITCOIN_RPC_AUTH__ placeholder, injected at
deploy time from secrets file (fixes auth prompt regression)
- Deploy script: sed-replaces placeholder with real base64 RPC creds
before building bitcoin-ui Docker image
- Container state: "created" → "stopped" (not "starting") so ollama/
tailscale show correctly
- Comprehensive INSTALLED_ALIASES for marketplace
All container credentials now flow from secrets files through the
deploy script. Manual container recreation is no longer needed.
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>