BootReconciler (in-process, 30s interval, spawned from main.rs as of
Step 6 commit 48f08aa3) fully replaces the timer-driven bash
reconciliation path. Delete the systemd unit + timer and their
ISO-builder touchpoints.
Removed:
- image-recipe/configs/archipelago-reconcile.service
- image-recipe/configs/archipelago-reconcile.timer
- image-recipe/build-auto-installer-iso.sh L412-413 (COPY unit+timer)
- image-recipe/build-auto-installer-iso.sh L449 (systemctl enable)
- image-recipe/build-auto-installer-iso.sh L542-543 (cp to WORK_DIR)
Kept (intentionally):
- scripts/reconcile-containers.sh
- scripts/container-specs.sh
Reason: core/archipelago/src/api/rpc/package/update.rs still invokes
reconcile-containers.sh at two sites (OTA update + rollback paths).
Porting those call sites to ContainerOrchestrator::upgrade() requires
manifests for every container update.rs might touch — that scope
belongs in Step 8b. Until then the script stays on disk, just no
longer runs on a periodic timer.
No Rust code changes. cargo check -p archipelago clean, 6 pre-existing
warnings. Skipped full ISO rebuild validation per user decision —
edits are 5 textual deletions with zero behavioral ambiguity; Step 9
live hot-swap on .228 will catch any regression.
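For readers who have not seen the in-process path: one reconcile pass can be pictured as a diff between desired and running containers, repeated every 30s. This is a minimal sketch, not the BootReconciler source. `plan`, `Action`, and the "start whatever is missing" policy are assumptions for illustration; only the 30s interval comes from this message.

```rust
use std::collections::BTreeSet;
use std::time::Duration;

/// Interval stated in the commit message above.
pub const RECONCILE_INTERVAL: Duration = Duration::from_secs(30);

/// Hypothetical action type; the real reconciler may model this differently.
#[derive(Debug, PartialEq, Eq)]
pub enum Action {
    Start(String),
}

/// One reconcile pass (assumed policy): every desired container that is
/// not currently running gets a Start action. Nothing is stopped here.
pub fn plan(desired: &BTreeSet<String>, running: &BTreeSet<String>) -> Vec<Action> {
    desired
        .difference(running)
        .cloned()
        .map(Action::Start)
        .collect()
}
```

A caller would run `plan` once per `RECONCILE_INTERVAL` and apply the returned actions; because the pass is a pure function of the two sets, it can be unit-tested without podman.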
Discovered during Step 8 execution that first-boot-containers.sh
creates 30+ containers with per-container logic (wallet loads, DB
init, rpcauth derivations, post-create health waits) and does
substantial non-container setup (secret gen, rootless-podman subuid
chowns, Tor hostnames, WireGuard, firewall, nostr-relay). Only 3 of
the 30+ containers have manifests today (the UIs from Step 7).
Deleting the bash in a single step would brick first-boot on fresh
installs. Split into:
- 8a: delete reconcile-containers.sh + container-specs.sh + reconcile
systemd unit + timer. BootReconciler fully covers these. Safe,
atomic, no manifest porting required.
- 8b: port remaining ~25 containers into apps/<id>/manifest.yml. One
manifest per commit, validated against current bash behavior.
Multi-day scope.
- 8c: rename first-boot-containers.sh -> first-boot-setup.sh, strip
container ops, keep secret/dir/Tor/WG/firewall setup. Final
one-way door, requires 8b complete.
Records acceptance evidence for Steps 1-4 (container tests 21/21 pass, build
clean with expected unused-method warnings) and queues the BootReconciler
implementation for Step 5.
ContainerConfig.image is now Option<String>, mutually exclusive with a new
optional ContainerConfig.build: Option<BuildConfig>. Exactly one of image
or build must be present, enforced in AppManifest::validate.
Adds ResolvedSource enum (Pull | Build) and ContainerConfig::resolve +
::image_ref helpers so the orchestrator can treat pull and build uniformly.
All 26 existing pull-only manifests continue to parse unchanged
(covered by existing_pull_only_manifests_still_parse test).
Call sites updated: podman_client, runtime::DockerRuntime, dev_orchestrator.
Dev orchestrator errors out cleanly on Build sources until Step 2 lands
build_image support on the runtime trait.
Step 1 of docs/rust-orchestrator-migration.md. 10 new unit tests, all pass.
Also includes: docs/rust-orchestrator-migration.md (design spec) and
docs/STATUS.md resume section for the next session.
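The image/build exclusivity described above can be sketched as follows. This is an illustration under assumptions, not the committed code: the `BuildConfig` fields (`context`, `tag`) and the `String` error type are hypothetical, and the check is placed in `resolve` here even though the commit enforces it in AppManifest::validate.

```rust
/// Hypothetical build settings; the real BuildConfig fields are not
/// listed in the commit message.
#[derive(Debug, Clone, PartialEq)]
pub struct BuildConfig {
    pub context: String,
    pub tag: String,
}

#[derive(Debug, Clone, Default)]
pub struct ContainerConfig {
    pub image: Option<String>,
    pub build: Option<BuildConfig>,
}

/// What the orchestrator has to do to obtain the image.
#[derive(Debug, PartialEq)]
pub enum ResolvedSource<'a> {
    Pull(&'a str),
    Build(&'a BuildConfig),
}

impl ContainerConfig {
    /// Exactly one of `image` or `build` must be set.
    pub fn resolve(&self) -> Result<ResolvedSource<'_>, String> {
        match (&self.image, &self.build) {
            (Some(image), None) => Ok(ResolvedSource::Pull(image.as_str())),
            (None, Some(build)) => Ok(ResolvedSource::Build(build)),
            (Some(_), Some(_)) => Err("image and build are mutually exclusive".into()),
            (None, None) => Err("one of image or build is required".into()),
        }
    }

    /// Image reference to hand to the runtime, whichever source is used
    /// (assumption: a built image is referred to by its tag).
    pub fn image_ref(&self) -> Result<&str, String> {
        Ok(match self.resolve()? {
            ResolvedSource::Pull(image) => image,
            ResolvedSource::Build(build) => build.tag.as_str(),
        })
    }
}
```

Matching on the tuple of both options makes all four combinations explicit, so a manifest that sets neither field fails the same way as one that sets both.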
Closes failure mode FM5 from docs/bulletproof-containers.md: the v1.7.38 +
v1.7.39 rollouts left every affected node on an unreachable UI (nginx 500)
with no recovery path short of SSH. This release adds a self-check
guardrail to the update flow.
What changed:
- apply_update() writes a pending-verify marker with old+new version and
a 150s deadline immediately before scheduling the service restart.
- verify_pending_update() runs from main.rs startup. If the marker is
present and within its freshness window, the new binary waits 15s for
nginx + backend to settle, then probes https://127.0.0.1/ every 5s for
up to 90s (self-signed certs accepted).
- On any probe success within the window, the marker is cleared and
nothing else happens.
- On window-exhaust, the new binary:
1. Moves the broken /opt/archipelago/web-ui to web-ui.failed.<ts>
(quarantined, not deleted, so we can post-mortem).
2. Restores web-ui.bak on top of web-ui.
3. Calls rollback_update() to restore the previous binary.
4. Updates state.current_version to reflect the rollback.
5. systemctl --no-block restart archipelago so the OLD binary boots.
- Markers older than 10 minutes are treated as stale and cleared without
probing, so a crashed-during-startup marker from weeks ago cannot
spontaneously roll back a healthy node on a later reboot.
- rollback_update() binary copy now goes through host_sudo instead of
tokio::fs::copy, so it escapes the service's ProtectSystem=strict
mount namespace. Without this, the rollback silently failed with
EROFS on /usr/local/bin and the previous binary was never restored,
the exact opposite of what auto-rollback is for.
Tests: 4 new unit tests in update::tests covering marker round-trip,
absent-marker noop, no-panic on verify_pending_update with nothing to
verify, and an invariant assert that the 90s probe window stays below
the 600s stale threshold. All passing.
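The startup decision and the timing invariant can be sketched like this. It is a simplified model, not the code in update.rs: the `PendingVerify` fields and the `StartupAction` enum are hypothetical, the marker age is passed in rather than read from disk, and the 150s deadline is not modelled. The four constants are the values quoted above.

```rust
use std::time::Duration;

// Timings taken from the commit message above.
pub const SETTLE_DELAY: Duration = Duration::from_secs(15);
pub const PROBE_INTERVAL: Duration = Duration::from_secs(5);
pub const PROBE_WINDOW: Duration = Duration::from_secs(90);
pub const STALE_AFTER: Duration = Duration::from_secs(600);

/// Hypothetical marker shape; the on-disk format is not described here.
#[derive(Debug, Clone, PartialEq)]
pub struct PendingVerify {
    pub old_version: String,
    pub new_version: String,
    /// Time elapsed since the marker was written.
    pub age: Duration,
}

#[derive(Debug, PartialEq)]
pub enum StartupAction {
    /// No marker: normal boot.
    Nothing,
    /// Marker too old to trust: clear it without probing.
    ClearStale,
    /// Fresh marker: wait SETTLE_DELAY, then probe the UI.
    Probe,
}

pub fn startup_action(marker: Option<&PendingVerify>) -> StartupAction {
    match marker {
        None => StartupAction::Nothing,
        Some(m) if m.age > STALE_AFTER => StartupAction::ClearStale,
        Some(_) => StartupAction::Probe,
    }
}

/// Upper bound on probes in one window at the stated interval.
pub fn max_probes() -> u64 {
    PROBE_WINDOW.as_secs() / PROBE_INTERVAL.as_secs()
}
```

Keeping `PROBE_WINDOW` below `STALE_AFTER` means a marker cannot go stale while its own probe loop is still running, which is the property the invariant test asserts.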
Side fix: scripts/create-release-manifest.sh was dying with exit 141
(SIGPIPE from tar tvzf | head | awk) under set -euo pipefail.
Replaced with a single awk NR==1 that doesn't short-circuit the upstream
pipe, so the release-build flow is idempotent again.
- Add deploy_secondary() function for deploying to multiple LAN nodes
- --both now deploys to .198 and .253 (previously .198 only)
- Fleet deploy updated for 3 LAN nodes
- Mesh DM fixes: protocol frame format, DM-via-channel routing
- Federation pending requests, discover modal
- VPN status UI improvements
- Image versions and container specs updates
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
All hardcoded references to the old IP-based registry are replaced across
Rust backend, Vue frontend, shell scripts, Dockerfiles, CI, and docs.
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
- Update all references from Debian 12 (Bookworm) to Debian 13 (Trixie)
- Enable SystemCallArchitectures, RestrictAddressFamilies, RestrictRealtime
in archipelago.service (safe on systemd 256+ which respects NoNewPrivileges=no)
- Update GLIBC compatibility checks from 2.36 to 2.40
- ISO filename, build container, and docs updated throughout
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
- VPN card: relay URLs, device management, invite QR, add participant
- Backend: vpn.invite, vpn.add-participant, vpn.peer-config RPCs
- nvpn v0.3.7 system service (fixes event processing bug in v0.3.4)
- First-boot: auto-configure nvpn with node identity and endpoint
- Service: AF_NETLINK for WireGuard, NoNewPrivileges=no for sudo wg
- TASK-50: networking stack reliability from first install
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
When a node was already known (via link-node) but had an empty onion
address, the peer-joined handler returned early without updating the
onion. Now it patches missing onion/pubkey fields on existing nodes.
Also adds update_node() to federation storage and updates the
architecture comparison doc with system resources, StartOS/umbrelOS
tabs, Web5 section, and comparison view.
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
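The "patch instead of return early" behaviour for already-known nodes can be sketched as below. The struct is a hypothetical subset of the federation record and `patch_missing` is an illustrative name; the real handler and update_node() signature may differ.

```rust
/// Hypothetical subset of a federated node record.
#[derive(Debug, Clone, PartialEq, Default)]
pub struct FederatedNode {
    pub did: String,
    pub onion: String,
    pub pubkey: String,
}

/// Fills empty onion/pubkey fields from a join message and never
/// overwrites values that are already known.
/// Returns true when something changed and the node should be persisted.
pub fn patch_missing(node: &mut FederatedNode, onion: &str, pubkey: &str) -> bool {
    let mut changed = false;
    if node.onion.is_empty() && !onion.is_empty() {
        node.onion = onion.to_string();
        changed = true;
    }
    if node.pubkey.is_empty() && !pubkey.is_empty() {
        node.pubkey = pubkey.to_string();
        changed = true;
    }
    changed
}
```

Returning a flag lets the caller skip the storage write when the join message carries nothing new.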
Backend:
- Add --add-host host.containers.internal:host-gateway to LND and Bitcoin
Knots containers (fixes DNS resolution failure in rootless podman)
- Add --user 0:0 and DAC_OVERRIDE to nginx UI sidecar containers
(fixes chown crash in rootless podman for bitcoin-ui, electrs-ui, lnd-ui)
- Add hostadd to Rust Podman API client for web UI container installs
- Add Chromium privacy flags to kiosk launcher (disable telemetry)
Frontend:
- Fix onboarding reset on raw IP visits (trust localStorage as first-class
signal, skip boot screen when server is up but not onboarded)
- Fix seed regression: persist challenge indices in sessionStorage so going
back from Verify doesn't change which words are asked
- Remove glass container from seed Generate/Verify/Restore screens
- Add Back button to Restore from Seed screen
- Replace Network card: Tor (purple), VPN status (orange), Bitcoin sync (orange)
- Add ElectrumX to curated app list with correct .webp icon
- Install flow: navigate to My Apps immediately with toast, hide
installed/installing apps from marketplace and discover views
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Replace fragmented random key generation with a single 24-word BIP-39
mnemonic that deterministically derives all node keys: Ed25519 (DID),
secp256k1 (Nostr/Bitcoin), BIP-84 xprv (Bitcoin Core), and LND aezeed
entropy. New onboarding flow: seed generate → word verification → identity
naming. Restore path enabled via 24-word entry. Includes seed RPC handlers,
mock backend support, LND/Bitcoin Core wallet-from-seed integration, and
UI polish across settings and discover views.
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Critical:
- fix: container installs fail with "statfs: no such file or directory"
Root cause: NoNewPrivileges=yes in systemd blocks sudo inside backend.
Fix: use std::fs::create_dir_all + podman unshare chown (no sudo needed)
- fix: Tor services.json never written — \$ARCHY_TOR_DIR escaping bug
- fix: kiosk white screen — increase health wait to 60s, add --disable-gpu
Improvements:
- feat: LUKS encryption badge in Server disk stats (backend detects dm-crypt)
- fix: GRUB theme text scaling on 4:3 monitors — explicit fonts, wider menu
- fix: suppress default Debian MOTD (custom profile.d welcome is enough)
- fix: install error messages now show "Failed to pull/start" instead of
generic "Operation failed" (middleware.rs allowlist expanded)
- fix: container-tests CI — source cargo env before running tests
- docs: interactive container architecture diagram (HTML)
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
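The sudo-free directory setup from the first critical fix can be sketched as two steps: create the directory as the service user, then fix ownership from inside the user namespace. The function names and the uid/gid parameters are assumptions for illustration; `std::fs::create_dir_all` and `podman unshare` are the only pieces taken from the message. The second function only builds the command so the sketch can be checked without podman installed.

```rust
use std::fs;
use std::io;
use std::path::Path;
use std::process::Command;

/// Creates the data directory (and parents) as the current user; no sudo.
pub fn ensure_data_dir(dir: &Path) -> io::Result<()> {
    fs::create_dir_all(dir)
}

/// Builds, but does not run, `podman unshare chown <uid>:<gid> <dir>`.
/// Inside `podman unshare` the ids are the container-side ids of the
/// user namespace, so no host privileges are needed.
pub fn chown_in_userns(dir: &Path, uid: u32, gid: u32) -> Command {
    let mut cmd = Command::new("podman");
    cmd.arg("unshare")
        .arg("chown")
        .arg(format!("{}:{}", uid, gid))
        .arg(dir);
    cmd
}
```

A caller would run `ensure_data_dir` first and then call `.status()` on the returned command, treating a non-zero exit as an install error.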
- Add orchestration_tests.rs + mock_podman.rs (container unit tests)
- Add container-tests.yml CI workflow
- Add dev-container-test.sh for local testing
- MASTER_PLAN.md: add TASK-49 (P0) with 6-phase plan
- Login.vue: minor fixes from user testing
- AppCard.vue: enter key handler fix
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
- Create scripts/smoke-test.sh for live server verification (7 checks)
- Document planned GitHub Actions CI/CD pipeline in docs/ci-cd-plan.md
- Integration tests deferred to future task (require test harness setup)
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Part 1 — DID Persistence:
- Deploy script creates /var/lib/archipelago/identity/ directory
- First-boot script creates identity dir with proper ownership
- Identity load now logs pubkey to confirm persistence across restarts
Part 2 — Node Names:
- NodeStateSnapshot includes node_name field
- build_local_state() passes server name to sync responses
- update_node_state() stores peer's announced name on the FederatedNode
- Names propagate automatically during federation.sync-state
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Tests expected router.push but panel mode (now default) uses panelAppId
store state instead. Updated assertions to check panelAppId. Fixed
BTCPay app ID from 'btcpay' to 'btcpay-server'. All 515 tests pass.
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Health endpoint now returns JSON with version and service status instead
of plain "OK". Updated BETA-PROGRESS.md: BUG-1 done, TASK-8 done (12/12
+ code audit), FEATURE-4 at ~80%, overall at ~55%. Added session #5 log.
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
TASK-31: Cleaned up Apps page nav header structure (tabs + categories + search).
TASK-38: Added Bitcoin Core sync progress gauge to homepage System Stats card —
shows sync percentage, block height, and green/orange color coding. Only
appears when Bitcoin is running. Grid expands to 4 columns when visible.
Updated MASTER_PLAN.md — cleaned up completed sections, moved done items.
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
BUG-1 (P0): CSRF tokens now HMAC-derived from session token instead of
random — survives backend restarts, eliminates cookie/header race conditions.
Frontend retries 403s as belt-and-suspenders.
TASK-8 H2: federation.peer-joined verifies ed25519 signature on join messages.
TASK-8 H3: federation.peer-address-changed requires signed proof from known peer.
TASK-8 H4: Rust backend default bind 0.0.0.0 → 127.0.0.1 (nginx proxies all).
BUG-20: ElectrumX index estimate string fixed from ~55GB to ~130GB.
BUG-37: App card Start/Stop buttons split into loading vs interactive states
to prevent WebSocket state flicker during container scans.
BUG-40: Uninstall modal uses Teleport to body with z-[3000] for full overlay.
BUG-41: Uninstalling overlay on card + optimistic store removal.
Updated MASTER_PLAN.md and BETA-PROGRESS.md to reflect all completed work.
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
- BUG-33: CPU load alert threshold increased from 2x to 4x core count
(8→16 on 4-core machine) to reduce false alerts during container ops
- TASK-27: Launch buttons for new-tab apps now show external link icon
(BTCPay, Grafana, PhotoPrism, Portainer, OnlyOffice, etc.)
- TASK-36: Iframe error screen now distinguishes between X-Frame-Options
blocked vs container not reachable, with appropriate messaging
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
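The BUG-33 threshold arithmetic, as a small worked example. The function names are hypothetical and the sketch does not say which load average (1, 5, or 15 minute) the alert uses; only the 4x multiplier and the 4-core figure come from the message.

```rust
/// Load-average multiplier per core after BUG-33 (previously 2.0).
pub const LOAD_ALERT_FACTOR: f64 = 4.0;

/// Alert threshold for a machine with `cores` cores.
pub fn load_alert_threshold(cores: usize) -> f64 {
    LOAD_ALERT_FACTOR * cores as f64
}

/// True when the given load average exceeds the threshold.
pub fn should_alert(load_avg: f64, cores: usize) -> bool {
    load_avg > load_alert_threshold(cores)
}
```

On a 4-core machine the threshold is 16, so a load of 12 during a container pull, which would have alerted under the old 2x rule (threshold 8), no longer does.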
Boot screen (BootScreen.vue) is already fully production-integrated:
- RootRedirect health checks → shows boot screen if server down
- Polls /rpc/v1 until healthy → transitions to login/onboarding
- Kiosk launcher loads browser immediately, boot screen handles wait
- All audio/icon assets deployed to /opt/archipelago/web-ui/
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
My Apps/App Store/Services tabs, category filters, and search bar
now stay fixed at the top on scroll using sticky positioning with
glass-blur background. Applied to both Apps.vue and Marketplace.vue
desktop views.
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
- Correct off-by-one in UID mapping: container UID N → host UID
(100000 + N - 1), not (100000 + N)
- Deploy script auto-fixes UID ownership on every deploy
- Bitcoin UI nginx uses __BITCOIN_RPC_AUTH__ placeholder injected
from secrets at deploy time
- Container rules updated for rootless podman architecture
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
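The corrected mapping, as a worked example. This sketch assumes the default rootless podman mapping with a subordinate range starting at 100000, as in the message; `host_uid` and `SUBUID_START` are illustrative names. Container UID 0 is the invoking user itself, which is why the subordinate range starts at container UID 1 and the offset is N - 1.

```rust
/// Assumed start of the service user's subordinate UID range (/etc/subuid).
pub const SUBUID_START: u32 = 100_000;

/// Maps a container UID to the host UID under the default rootless mapping.
/// Container UID 0 is the invoking user, passed in as `host_user_uid`.
pub fn host_uid(container_uid: u32, host_user_uid: u32) -> u32 {
    if container_uid == 0 {
        host_user_uid
    } else {
        // Container UID 1 is the first subordinate UID, hence the -1.
        SUBUID_START + container_uid - 1
    }
}
```

For example, files written by container UID 101 show up on the host as UID 100100, not 100101, so a chown that uses the uncorrected offset makes the data unreadable to the container.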
The security hardening directives (NoNewPrivileges, RestrictAddressFamilies,
MemoryDenyWriteExecute, RestrictRealtime, ProtectSystem=strict) all
blocked podman container management via sudo. They are temporarily
disabled until TASK-11 (rootless podman migration) is complete.
Remaining active protections: ProtectSystem=true (/usr, /boot),
ProtectHome=yes, PrivateTmp=yes, PrivateDevices=no (mesh radio).
Also adds TASK-11 to MASTER_PLAN.md for tracking the rootless podman
migration that will allow re-enabling full security hardening.
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
- Y5-01: docs/community-growth-plan.md — 3 growth phases from
dev preview to 10K nodes, tracking via opt-in analytics
- Y5-04: docs/v3-release-checklist.md — prerequisites, release
steps (code freeze, ISO builds, checksums), post-release plan
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Per-container RAM/CPU/disk measurements from .228 baseline.
Three app tiers: Core (2.6GB), Recommended (+880MB), Optional (+2-5GB).
Four hardware tiers with cost estimates.
10K user distribution projection.
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>