Hotfix: archipelago.service ExecStartPre now mkdirs /run/containers and
/var/lib/containers before the unit's mount-namespace setup tries to bind
them. Without this, fresh nodes that don't have /run/containers (e.g.
nodes provisioned without a prior podman session) fail at the namespace
step with:
Failed to set up mount namespacing: /run/containers: No such file or directory
Failed at step NAMESPACE spawning /bin/bash: No such file or directory
Existing nodes don't pick up systemd unit changes via OTA — they need a
one-time `systemctl edit archipelago` adding the same mkdir. ISO installs
from this version forward have the fix baked in.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Adds Delegate=memory pids cpu io to the archipelago.service unit.
Context: the service runs as User=archipelago under system.slice with
rootless podman. When podman creates transient libpod-*.scope units for
containers under user.slice, systemd needs the caller to hold
CAP_SYS_ADMIN on the target cgroup subtree \u2014 which happens iff
Delegate= lists the controllers we want to set. Without Delegate, any
future code path that goes through the podman CLI (runtime.rs) instead
of the libpod HTTP API (podman_client.rs) would hit MemoryMax
rejections that have exactly the same symptom as the bug I just fixed
in parse_memory_limit but with a completely different root cause.
Belt-and-braces: current production path uses PodmanClient and was
fixed in the preceding commit. But the DockerRuntime CLI path in
runtime.rs:262-268 (cmd.arg("--memory")) is still reachable via
AutoRuntime fallback on hosts without podman, and future rust
orchestrator code may legitimately need cgroup delegation. This
directive is no-op harmful on hosts that already delegate upstream
(systemd gracefully handles duplicate/nested delegation).
- Update all references from Debian 12 (Bookworm) to Debian 13 (Trixie)
- Enable SystemCallArchitectures, RestrictAddressFamilies, RestrictRealtime
in archipelago.service (safe on systemd 256+ which respects NoNewPrivileges=no)
- Update GLIBC compatibility checks from 2.36 to 2.40
- ISO filename, build container, and docs updated throughout
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
- VPN card: relay URLs, device management, invite QR, add participant
- Backend: vpn.invite, vpn.add-participant, vpn.peer-config RPCs
- nvpn v0.3.7 system service (fixes event processing bug in v0.3.4)
- First-boot: auto-configure nvpn with node identity and endpoint
- Service: AF_NETLINK for WireGuard, NoNewPrivileges=no for sudo wg
- TASK-50: networking stack reliability from first install
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
- Remove MemoryDenyWriteExecute=yes from archipelago.service — ring
(rustls) and secp256k1 (bitcoin/nostr) crypto libraries need
executable memory mappings that this restriction blocks
- Add + prefix to ExecStartPre so mkdir/chown run as root
- Use $HOME/archy instead of /home/archipelago/archy in CI workflows
so builds work on both .228 and VPS CI runners
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
The backend shuts down in <1s. The 660s timeout was left from when
Bitcoin Core was managed by this service. With 660s, systemctl stop
hangs for 11 minutes if the process is already dead but systemd
hasn't noticed, blocking all deploys and restarts.
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
The backend couldn't read Tor hidden service hostnames because the
systemd service only had SupplementaryGroups=dialout. Adding debian-tor
allows the backend to read /var/lib/tor/hidden_service_*/hostname
without needing sudo (which is blocked by NoNewPrivileges=yes).
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
- I2: Add MemoryMax=4G, LimitNOFILE=65535, TasksMax=2048 to systemd service
- I3: Tor rotation keeps old service for 1h transition before cleanup
- R14: Replace .parse().unwrap() with .unwrap_or(localhost) in rate limiter
- R15: Replace 7 unwrap/expect in mesh protocol with proper error propagation
- R27: Add 10s timeouts to mesh Bitcoin RPC calls
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
RestrictNamespaces and SystemCallFilter block rootless podman from
creating user namespaces needed for container isolation. Removed these
along with RestrictSUIDSGID (implied by NoNewPrivileges). ProtectHome
set to no (rootless podman needs ~/.local/share/containers writable).
Remaining active protections: NoNewPrivileges, ProtectSystem=strict,
ReadWritePaths, RestrictAddressFamilies, MemoryDenyWriteExecute,
RestrictRealtime, SystemCallArchitectures=native.
Also reduced initial scan delay from 15s to 3s for faster container
visibility after boot, and removed Ollama from auto-deploy.
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
The security hardening (NoNewPrivileges, RestrictAddressFamilies,
MemoryDenyWriteExecute, RestrictRealtime, ProtectSystem=strict) all
blocked podman container management via sudo. These are temporarily
disabled until TASK-11 (rootless podman migration) is complete.
Remaining active protections: ProtectSystem=true (/usr, /boot),
ProtectHome=yes, PrivateTmp=yes, PrivateDevices=no (mesh radio).
Also adds TASK-11 to MASTER_PLAN.md for tracking the rootless podman
migration that will allow re-enabling full security hardening.
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Root cause: systemd PrivateDevices=yes hid /dev/ttyUSB* from the service,
preventing .198 from connecting to its Heltec V3 after the security hardening.
Changes:
- Set PrivateDevices=no in systemd service (serial access needs physical devices;
other hardening layers remain: NoNewPrivileges, ProtectSystem, RestrictNamespaces)
- Add SupplementaryGroups=dialout for explicit serial permissions
- Add fallback auto-detect when configured serial path fails to open
- Add exponential backoff on reconnect (5s→60s cap) to reduce log spam
- Add pre-open device existence check with actionable error messages
- Add udev rule (99-mesh-radio.rules) for stable /dev/mesh-radio symlink
- Add /dev/mesh-radio to serial candidate list (checked first)
- Add Connect button per detected device in Mesh UI
- Deploy udev rule to both servers and ISO build
- Fix FEDI_HASH unbound variable in deploy script
- Fix deploy binary step to handle hung service stop gracefully
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Root cause: sd_notify::notify(true, ...) cleared NOTIFY_SOCKET env var,
so watchdog pings never reached systemd. Backend killed every 60s.
Fixes:
- Change sd_notify::notify first param to false (keep socket)
- Increase WatchdogSec from 60 to 300 (5min) for crash recovery
- Add TimeoutStartSec=300 for slow container startups
- Adjust watchdog ping interval to 120s
This was causing 47 restarts/day on .198 and blocking REBOOT-03,
FLEET-03, FLEET-04, VC-04.
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
MEM-01: OOM kill detection via dmesg checks every 5 minutes
MEM-03: Disk growth rate tracking (288 samples over 24h), warns at >1GB/day
MEM-04: Systemd watchdog (WatchdogSec=60, sd_notify::Watchdog every 30s)
Service Type=notify for proper startup notification
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
The My Apps page went blank after installing apps because pkg['static-files'].icon
was accessed without optional chaining on dynamically installed packages that lack
the static-files property.
- Make static-files optional in PackageDataEntry type
- Add defensive ?.icon access with fallback in Apps.vue and AppDetails.vue
- Add filebrowser to mock backend staticDevApps (enables Cloud page in demo)
- Expand portMappings and marketplaceMetadata for all marketplace apps
- installPackage now uses staticApp() format for consistent data shape
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
- Convert "Choose Your Path" screen to informative (read-only cards)
- Harden "Choose Your Setup" (gray out Coming Soon options, auto-select Fresh Start)
- Auto-fetch DID on mount with retry and auto-advance after success
- Improve backup download for mobile compatibility
- Add retry logic to verify step with graceful skip option
- Route verify → done → login for complete onboarding flow
- Add AIUI install confirmation via custom event (SEC-001)
- Add file path whitelist for AIUI file access (SEC-002)
- Add log redaction for container logs sent to AIUI (SEC-003)
- Add Secure flag to session cookie in production (SEC-004)
- Fix ISO build script to handle zstd compression errors gracefully
- Sync archipelago.service from live server
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
- Upgraded Fedimint version to v0.10.0 in docker-compose.yml and manifest.yml, adding support for the built-in Guardian UI.
- Modified .gitignore to exclude deploy-config.sh script.
- Enhanced onboarding process in AuthManager to persist onboarding state and validate password strength during user setup.
- Updated API to handle onboarding completion and password change requests, ensuring a smoother user experience.
- Improved configuration management to support Nostr discovery and Tor proxy settings, enhancing node identity features.
- Updated the Development-Workflow documentation to clarify deployment strategy, emphasizing direct deployment to the live system for testing.
- Added detailed instructions for the deployment command, including syncing code, building frontend and backend, and restarting services.
- Improved SSH key management section to assist with authentication issues.
- Expanded the testing workflow to include steps for checking logs and syncing changes back to the ISO build.
- Updated the ISO build integration section to ensure system-level changes are captured for future builds.
- Refactored various sections for clarity and completeness, including deployment paths and system configuration files.