The 0.4.11 edit affordance only lived on ServerConnectScreen, which a
connected user never sees. Add edit to NESMenu — the settings modal
reached via two-finger hold while connected: a ✎ pencil on each saved
server opens the form pre-populated (Edit Server header + Cancel),
persists via ServerPreferences.updateSavedServer(), and reconnects when
the edited server is the live one.
Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
Add an edit affordance to each saved server in ServerConnectScreen: a
pencil button loads the entry into the form (Edit Server mode) with
Save Changes / Cancel actions. Persisted via a new
ServerPreferences.updateSavedServer() that replaces by connection
identity (address/port/scheme) and keeps the active record in sync when
the edited server is the active one.
Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
Capture the 2026-06-26 lessons durably: ship via the hardened publish
script only, v1+v2+v3 signing is enforced by apksigner (AGP ignores
enableV1Signing at minSdk>=24), diagnose install failures with adb
install FIRST, signature-key changes force a one-time uninstall, and
keep all phone/adb work scoped to com.archipelago.app.debug.
Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
The published companion APK was v2-only (AGP silently ignores
enableV1Signing for minSdk>=24) and clean builds broke on stray
space-named resource dirs. Harden scripts/publish-companion-apk.sh:
clean build, remove/reject space-named res dirs, force v1+v2+v3 via
zipalign+apksigner, and abort unless all three schemes verify. Wire
ship-companion.sh to the shared script. Re-sign the served 0.4.10 APK.
Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
Active counterpart to the read-only all-apps-matrix.bats: drives
stop/start/restart for every installed app and, under
ARCHY_ALLOW_CASCADE_DESTRUCTIVE, a FULL teardown (uninstall →
no-ghost → reinstall) — the broad coverage F needs beyond the ~8 core
suites. App set is discovered from My Apps ∩ the node catalog; reinstall
spec comes from catalog.json {dockerImage, containerConfig}.
PROTECTED by default (never cycled or torn down): bitcoin*/electrum*
(expensive resync) AND lnd/btcpay*/fedimint* (teardown = irreversible
wallet/channel/guardian loss). The user asked to protect only
bitcoin+electrum; the wallet apps are added for safety and can be
removed via ARCHY_MATRIX_PROTECT. Heavy + destructive → a supervised
pass, not folded into run-gate. Validated on .228: discovery excludes
the 6 protected installed apps; lifecycle tier cycles a single app
(botfights) stop/start/restart green; teardown gated.
Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
AppCard's uninstall bar was hardcoded `w-full bg-red-400/60 animate-pulse`
— a solid, full-width, red, fake-pulsing block that never moved and read
as an error, no matter the actual teardown progress (the install bar, by
contrast, renders a real percentage). Derive a truthful percentage from
the backend's existing `uninstall-stage` label — "Stopping containers
(X/N)" → 10–50%, "Cleaning up volumes" → 70%, "Removing app data" → 90%
— and render it exactly like install: neutral fill, real width + percent,
shimmer (not a fake pulse) carrying motion when a stage has no number.
Frontend-only; the backend already broadcasts these stages.
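A minimal sketch of the stage-to-percentage mapping described above, written in Rust for illustration (the real change lives in the AppCard frontend). The function name `uninstall_percent` and the linear interpolation inside the 10-50% band are assumptions; only the stage labels and the 10-50 / 70 / 90 targets come from this commit.

```rust
/// Map a backend `uninstall-stage` label to a progress percentage.
/// Returns None when the label carries no usable number, in which case
/// the UI would rely on the shimmer alone to show motion.
fn uninstall_percent(stage: &str) -> Option<u8> {
    if let Some(rest) = stage.strip_prefix("Stopping containers (") {
        // Expect "X/N)" and scale X/N linearly into the 10..=50 band (assumed).
        let inner = rest.strip_suffix(')')?;
        let (x, n) = inner.split_once('/')?;
        let x: u32 = x.trim().parse().ok()?;
        let n: u32 = n.trim().parse().ok()?;
        if n == 0 {
            return None;
        }
        let x = x.min(n); // never report more than N of N
        return Some((10 + (40 * x) / n) as u8);
    }
    match stage {
        "Cleaning up volumes" => Some(70),
        "Removing app data" => Some(90),
        _ => None,
    }
}
```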
Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
run-gate.sh ran only the DESTRUCTIVE tier; the cascade-uninstall suite
(uninstall→no-ghost→reinstall, the #13/#14/uninstall-hang regression
guard) existed but was never enabled by the gate. Add an opt-in single
cascade pass after the 5× loop (ARCHY_GATE_CASCADE=1, requires
ARCHY_ALLOW_DESTRUCTIVE=1), counted into the pass/fail tally. Kept out
of the 5× loop deliberately — uninstall/reinstall every iteration would
balloon runtime and re-pull images; one pass guards the class. Default
gate behavior unchanged. Validated: cascade-uninstall.bats 7/7 on .228.
Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
Workstream F now in-progress: the immich/grafana uninstall hang →
ghost/stuck-bar/reinstall-block is root-caused (unbounded systemctl/
podman in quadlet::disable_remove) and fixed (71cc9ac4); cascade-
uninstall.bats 7/7 on .228. Records the remaining F items + the pending
gate-wiring decision.
Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
Uninstalling immich/grafana could hang with a frozen full-red progress
bar, leave a ghost entry stuck in My Apps, and then refuse reinstall.
Single root cause: quadlet::disable_remove() — called first in the
uninstall task (via companion + orchestrator teardown) — ran
`systemctl --user stop`, daemon-reload, and `podman rm -f` with NO
timeout. On rootless podman a generated unit can wedge in "deactivating"
while podman hangs underneath, so `systemctl stop` blocks forever. The
spawned uninstall task then never returns Ok or Err, so:
- set_uninstall_stage() (after the stop) never fires → progress frozen;
- remove_package_state_entry() never runs → entry stranded in
`Removing` → ghost in My Apps;
- the install guard rejects reinstall with "already Removing".
The spawn wrapper already reverts state on Err and removes the entry on
Ok — the only failure mode was a hang that returns neither. Bound the
teardown so it always terminates:
- systemctl stop → QUADLET_STOP_TIMEOUT, escalate to kill+reset-failed
on timeout (reuses the existing helpers);
- daemon_reload_user() → bounded systemctl_user_status (30s);
- defensive `podman rm -f` → wrapped in tokio timeout.
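The bounding idea above can be sketched with the standard library alone. This is not the shipped code: the fix uses a tokio timeout and the existing QUADLET_STOP_TIMEOUT helpers, while this sketch swaps in a plain polling loop over `Child::try_wait`. The name `run_bounded` is hypothetical.

```rust
use std::io;
use std::process::{Child, Command, ExitStatus, Stdio};
use std::thread::sleep;
use std::time::{Duration, Instant};

/// Run `cmd`, but never wait longer than `limit`.
/// Ok(Some(status)) = the command finished in time.
/// Ok(None)         = it timed out and was killed.
fn run_bounded(cmd: &mut Command, limit: Duration) -> io::Result<Option<ExitStatus>> {
    let mut child: Child = cmd.stdout(Stdio::null()).stderr(Stdio::null()).spawn()?;
    let deadline = Instant::now() + limit;
    loop {
        if let Some(status) = child.try_wait()? {
            return Ok(Some(status));
        }
        if Instant::now() >= deadline {
            // Escalate: kill the child, then reap it so no zombie process is left.
            let _ = child.kill();
            let _ = child.wait();
            return Ok(None);
        }
        sleep(Duration::from_millis(20));
    }
}
```

Either return value lets the caller proceed to Ok or Err, which is the property the uninstall task was missing.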
Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
§10b: replace per-app static launch-port map with a manifest-first +
non-HTTP-port-skipping heuristic (the gitea :2222 class).
§10c: generalize the un-pruned/archival Bitcoin install blocker from a
hardcoded requires_unpruned_bitcoin() match to a manifest-declared
dependency, with a clear pre-install UX.
Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
Banner + §8b: zombie-container guard (0a8db904, live-proven on .228) and
gitea launch-port fix (670ebb06) shipped in binary 040df5ce, rolled to
the fleet. Logs the mempool env-drift recreate-loop and nostr-rs-relay
follow-ups.
Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
Gitea publishes two host ports — SSH on 2222 and the web UI on 3001.
The launch URL comes from manifest_lan_address_for() (the manifest's
interfaces.main → 3001), but Gitea had no entry in the static
lan_address_for() fallback map. On a node where the gitea manifest is
absent or stale (no interfaces block), the lookup returns None and the
code falls through to extract_lan_address(), which returns whichever
port podman lists first — frequently the SSH port. Result: the app
launched at :2222 instead of :3001 (observed on tailscale node
100.82.34.38).
Add the canonical "gitea" => http://localhost:3001 entry to the static
map, matching every other core app, so the web UI is pinned regardless
of manifest presence.
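A simplified sketch of the lookup chain described above. Only the gitea entry is taken from this commit; `resolve_launch_url` is a hypothetical stand-in for the real call sites, and the other core-app entries are omitted rather than guessed.

```rust
/// Static fallback map. The real map also lists the other core apps.
fn lan_address_for(app_id: &str) -> Option<&'static str> {
    match app_id {
        "gitea" => Some("http://localhost:3001"),
        _ => None,
    }
}

/// Hypothetical glue: manifest address first, then the static map, and only
/// then whichever port podman happens to list first.
fn resolve_launch_url(
    app_id: &str,
    manifest_addr: Option<&str>,
    first_podman_port: Option<u16>,
) -> Option<String> {
    manifest_addr
        .map(str::to_owned)
        .or_else(|| lan_address_for(app_id).map(str::to_owned))
        .or_else(|| first_podman_port.map(|p| format!("http://localhost:{}", p)))
}
```

With the entry in place, a missing manifest no longer lets the SSH port win: `resolve_launch_url("gitea", None, Some(2222))` resolves to the :3001 address.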
Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
podman trusts its own state DB: when a container's conmon dies without
podman observing it (cgroup-cascade SIGKILL on archipelago.service
restart, a crash), `podman ps` keeps reporting it "Up" long after the
process is gone. The reconciler NoOp'd such a zombie forever, so a dead
dependency with no published host port never recovered.
Observed live on .228 (2026-06-25): netbird-dashboard reported "Up" with
a dead State.Pid → its nginx proxy 502'd → NetBird login broke
("Unauthenticated"). The dashboard publishes no host port, so the
Running branch had nothing to probe and never recreated it.
Add a zombie guard to the Running branch: verify the recorded State.Pid
is alive (its /proc entry exists) before trusting "running"; on a
concrete dead PID, stop+remove+install_fresh from the manifest.
Conservative by design — any uncertainty (inspect failed, PID
unparseable) assumes alive, so a transient podman hiccup never destroys
a healthy container. Unit test covers live/dead/out-of-range PIDs.
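A sketch of the conservative decision described above, with the /proc probe injected so the logic can be tested without a real container. The names `is_zombie` and `proc_entry_exists` are hypothetical, and treating a PID of 0 as "uncertain" is an assumption not stated in this commit.

```rust
use std::path::Path;

/// True when /proc/<pid> exists (Linux only).
fn proc_entry_exists(pid: u32) -> bool {
    Path::new("/proc").join(pid.to_string()).exists()
}

/// Decide whether a container that podman reports as "Up" is a zombie.
/// `inspect_pid` is the raw State.Pid text, or None if the inspect failed.
/// Only a concrete, parseable PID that the probe reports missing counts as dead.
fn is_zombie(inspect_pid: Option<&str>, pid_exists: impl Fn(u32) -> bool) -> bool {
    match inspect_pid.and_then(|raw| raw.trim().parse::<u32>().ok()) {
        Some(pid) if pid > 0 => !pid_exists(pid),
        // Inspect failed, PID unparseable, or PID 0 (assumed uncertain): assume alive.
        _ => false,
    }
}
```

In production the probe would be `proc_entry_exists`; a test can pass a closure instead.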
Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
Node apps (e.g. NetBird on :8087) terminate TLS with a self-signed cert
so the dashboard gets a secure context (OIDC / window.crypto.subtle, #15).
The WebView's default onReceivedSslError CANCELs untrusted certs, so those
apps rendered blank in the companion — exactly the netbird "won't load in
the webview" report. Override onReceivedSslError in both WebViewClients
(kiosk + in-app browser) to proceed() only when the failing cert's host
matches the connected node; reject everything else (no blanket trust).
Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
netbird is fully manifest-driven (apps/netbird-*/manifest.yml via the signed
catalog): install_stack_via_orchestrator renders the 3-member stack with
generated_certs (self-signed TLS for the #15 OIDC secure context), base64
generated_secrets, and templated config — and adopts the running stack by live
container name. The hardcoded `podman run` fallback was therefore dead code on
any node with the embedded catalog (verified live: .228 https:8087 -> 200).
Removes the per-app Rust installer anti-pattern the master plan calls out:
- install_netbird_stack: orchestrator -> adopt -> bail! (no in-Rust installer)
- deletes 6 now-dead helpers (write_netbird_config_files, ensure_netbird_tls_cert,
read_or_generate_b64_secret, netbird_net_resolver_ip, detect_netbird_public_host_ip,
wait_for_netbird_oidc_ready), 3 NETBIRD_*_IMAGE consts, unused base64::Engine import
- ~485 lines removed; prod_orchestrator doc-comments updated
Behavioural parity: the manifest path already executed on the fleet, so this
changes no live behavior. The legacy #10 OIDC-readiness wait was already bypassed
by the manifest path; if that race resurfaces, add an OIDC-ready gate to the
manifest rather than resurrecting the Rust fn.
Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
The 5x destructive gate on heavy nodes false-failed on transient windows
during stack recovery, not real regressions:
- immich.bats: lan_address port-publish probe 30s -> 90s. The postgres->redis
->server (DB migrations on boot) stack can take >30s to republish :2283 after
a churn-induced recreate; destructive-tier immich tests already allow 180-240s.
- mempool.bats: orphan-container check now polls to steady state (<=30s) instead
of a single-shot count, which caught a recreated member briefly visible
alongside its replacement mid-reconcile.
- run-gate.sh: settle cap 180s -> 300s and also gate on immich's :2283 when
installed, so the next iteration's read-only probe doesn't race a still-
recovering stack. Settle returns the instant every probe is green.
A genuinely unexposed/orphaned/unhealthy app still fails these checks; they only
absorb the transient recreate window under sustained churn.
Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
NOT yet validated on a node or fleet-deployed — cargo check passes, release build
+ .228 canary validation pending. Committed as a checkpoint so the work survives.
Two fixes the immich .198 incident exposed:
Fix A (reconcile_all_with_mode): a previously-running app whose container vanished
(e.g. a wedged podman teardown cleared by a reboot) was left absent on boot. Now,
when boot reconcile would leave an app 'absent' but it was running at the last
running-containers snapshot, recreate it (install_fresh). New
crash_recovery::load_last_running_names() reads the snapshot without the PID/crash
gate (+2 unit tests). Match is exact on compute_container_name (incl stack
members); user-stopped + uninstalled apps are already excluded, so no false
positives.
Fix B (ensure_bind_mount_dirs): a freshly-created bind dir was left root:root, so a
no-data_uid app running as container-root (→ host rootless user) hit EACCES and
crash-looped (the exact immich upload-dir failure). Now a newly-created bind dir
for a no-data_uid app is chowned via --reference=<parent> to match the rootless
data root — no host-uid guessing, only fresh dirs (no regression for existing
installs).
Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
Two console-noise fixes from a live error dump:
- remote-relay.ts reconnected on a FIXED 5s interval with no backoff, so when
the backend is briefly down it floods the console/network with failed-WS
attempts for the whole outage. It's a secondary feature (companion input), so
add exponential backoff 1s->30s (mirrors websocket.ts), reset on open/start.
- cryptpad's catalog/marketplace entries pointed at a non-existent
/assets/img/app-icons/cryptpad.webp -> a 404 on every marketplace render.
Point it at the existing default icon (handleImageError swapped to it anyway).
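The 1s->30s reconnect backoff from the first item can be sketched as below. This is an illustration in Rust, not the remote-relay.ts code; the `Backoff` type and its method names are hypothetical, and doubling per attempt is assumed from "exponential".

```rust
use std::time::Duration;

/// Reconnect delay that doubles per failed attempt and is capped at `max`.
struct Backoff {
    base: Duration,
    max: Duration,
    attempt: u32,
}

impl Backoff {
    fn new(base: Duration, max: Duration) -> Self {
        Self { base, max, attempt: 0 }
    }

    /// Delay before the next attempt: base * 2^attempt, never above `max`.
    fn next_delay(&mut self) -> Duration {
        let factor = 2u32.saturating_pow(self.attempt);
        self.attempt = self.attempt.saturating_add(1);
        self.base
            .checked_mul(factor)
            .map_or(self.max, |d| d.min(self.max))
    }

    /// Call on a successful open (or an explicit start) to begin again at `base`.
    fn reset(&mut self) {
        self.attempt = 0;
    }
}
```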
Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
The global error handler (Vue errorHandler + window error + unhandledrejection)
fired a red 'Something went wrong: <raw msg>' toast AND an auto on-device overlay
on every caught error — deliberately loud for bug-bash, but it surfaces benign,
non-actionable noise (e.g. a transient RPC rejection during a ws reconnect, or
the service worker failing to register over a self-signed cert) right in the
user's face.
Demote the catch-all to SILENT capture: keep console.error + the
window.__archyErrors ring buffer, and expose the screenshot-able overlay
on-demand via window.__archyShowErrors() — but never auto-pop. Components that
need to report a specific, actionable failure still call toast.error() directly.
Also filter known-benign environmental noise (PWA service-worker registration
failing over a self-signed cert — needs a trusted cert, #56) so it doesn't even
occupy a ring-buffer slot and push out real errors.
Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
The per-app suites cover ~8 core apps in depth; nothing covered the ~30 others
(jellyfin, vaultwarden, penpot, nextcloud, grafana, …). all-apps-matrix.bats
derives the app set from server.get-state package-data (no hardcoded list) and
asserts baseline health across EVERY installed app:
- settles to a non-transitional state within a window (the #13/#14 stuck-ghost
class, generalized fleet-wide — installing/removing that never settles)
- not in error/failed
- reports a recognized (non-garbage) state
- every running UI app (manifest ui=="true") exposes a non-null lan-address
(the immich/port-drift unreachable-UI failure, generalized to all UI apps)
Read-only, so it joins run.sh/run-gate.sh on every node and grows coverage as
nodes install more apps. Verified 5/5 on .228 (17 apps) and .116 (20 apps).
Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
The 5x gate is DESTRUCTIVE-only and never exercised uninstall/reinstall — where
the worst field bugs lived (#13 app ghosting in My Apps after uninstall, #14
reinstall stalling on stale state). New cascade-uninstall.bats drives the full
teardown path on a throwaway app (default grafana, precondition-skips if already
installed so it can't destroy real data) and asserts:
- fresh install reaches running via a truthful, non-silent progression
- uninstall makes the entry DISAPPEAR from server.get-state package-data
(the literal My Apps map) — no ghost, no stuck uninstall stage
- container + (on-node) data dir are gone
- reinstall returns to running
- node left as found
Opt-in via ARCHY_ALLOW_CASCADE_DESTRUCTIVE=1; not yet folded into the canonical
gate. Verified 7/7 against .228.
Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
ensure_running_container_ownership re-probed and re-attempted the in-container
chown on every reconcile pass. For a mount that can't be re-owned from inside the
userns (observed: mempool-api /data -> 'Operation not permitted'), this burned
CPU and logged a WARN on every pass, forever (~6x/30min on .228/.116).
Remember hard chown failures in a process-lifetime set keyed by (container-id,
dest) and skip the probe+chown for known-unrepairable mounts. Keyed by Id (not
name) so a recreated container gets a fresh repair attempt. Verified on .116:
one recorded failure at startup, then silent across subsequent reconciles.
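A minimal sketch of the failure memo described above. The type `ChownFailures` and its methods are hypothetical; the real set lives for the process lifetime inside the reconciler, and how it is shared across passes is not shown here.

```rust
use std::collections::HashSet;

/// Remembers (container-id, dest) pairs whose in-container chown failed hard.
#[derive(Default)]
struct ChownFailures {
    seen: HashSet<(String, String)>,
}

impl ChownFailures {
    /// False once a hard failure was recorded for this exact container id + mount.
    fn should_attempt(&self, container_id: &str, dest: &str) -> bool {
        !self.seen.contains(&(container_id.to_owned(), dest.to_owned()))
    }

    fn record_hard_failure(&mut self, container_id: &str, dest: &str) {
        self.seen.insert((container_id.to_owned(), dest.to_owned()));
    }
}
```

Because the key is the container id, a recreated container (new id, same name) misses the set and gets a fresh repair attempt.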
Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
The reconnect banner showed 'Connection lost'/'Reconnecting' instantly on every
socket close, even ones that recover in 100ms-2s (load spikes, Tailscale/relay
TCP resets). On a healthy node the drops are brief and self-healing, but each one
flashed a jarring banner, reading as constant instability.
Debounce the transient banner by 2.5s: only surface after the connection issue
persists past the grace window; hide immediately on recovery. Deliberate server
lifecycle transitions (restart/shutdown) bypass the debounce and still show at
once. A genuine persistent outage keeps isOffline true and surfaces after 2.5s.
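The debounce rule above reduces to a small pure decision, sketched here in Rust for illustration (the real logic is in the frontend). `banner_visible` and its parameters are hypothetical names; only the 2.5s grace window and the lifecycle bypass come from this commit.

```rust
use std::time::Duration;

const GRACE: Duration = Duration::from_millis(2500);

/// `offline_for`: how long the connection has been down, or None when connected.
/// `deliberate`: true for server lifecycle transitions (restart/shutdown).
fn banner_visible(offline_for: Option<Duration>, deliberate: bool) -> bool {
    match offline_for {
        None => false,                     // recovered: hide immediately
        Some(_) if deliberate => true,     // lifecycle transition: show at once
        Some(elapsed) => elapsed >= GRACE, // transient drop: wait out the grace window
    }
}
```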
Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
Force-add the gitignored releases/app-catalog.json so nodes resolve
146.59.87.168:3000/lfg2025/archy/raw/branch/main/releases/app-catalog.json
(currently HTTP 404 → disk-manifest fallback). Embedded-manifest delivery
is default-on; origin-wins overlay with disk as fallback. Unsigned (migration
window accepts unsigned). Includes netbird x3 manifests.
Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
handle_package_uninstall lumped every teardown failure into one `errors` vec
and returned Err on any of them BEFORE removing the package state entry — so a
non-fatal cleanup hiccup (a slow/failed `sudo rm -rf` of a large data dir, a
volume/network removal) left the app's containers gone but its entry in
package_data → a ghost in My Apps, and the spawned task reverted it to Installed.
Split the failures: container removal that even force-rm can't complete (app
genuinely still present) keeps the entry + returns Err; everything after the
containers are gone is best-effort. Remove the state entry as soon as the
containers are gone — BEFORE the slow volume/data teardown — so My Apps updates
immediately and residue can never ghost the app. set_uninstall_stage is a no-op
once the entry is gone (if-let guard), so the later stages don't re-create it.
Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
wait_for_manifest_host_ports TCP-connect-probed every published port, including
UDP/SCTP. netbird's 3478/udp STUN can never answer a TCP connect, so the probe
failed forever and drove an endless host-port repair/reconcile loop on .228
(netbird-server restarting ~every 60s). Filter to tcp (empty protocol = tcp).
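A sketch of the protocol filter, assuming a simplified port record. `PublishedPort` and `tcp_probe_targets` are hypothetical names, and the case-insensitive match is an assumption; the empty-protocol-means-tcp rule is from this commit.

```rust
/// Simplified stand-in for a manifest port entry.
#[derive(Debug, Clone, PartialEq)]
struct PublishedPort {
    host_port: u16,
    protocol: String, // "", "tcp", "udp", "sctp"
}

/// Keep only ports a TCP connect probe can ever succeed against.
fn tcp_probe_targets(ports: &[PublishedPort]) -> Vec<u16> {
    ports
        .iter()
        .filter(|p| p.protocol.is_empty() || p.protocol.eq_ignore_ascii_case("tcp"))
        .map(|p| p.host_port)
        .collect()
}
```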
Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
Turn on registry-distributed manifests for all apps: generate-app-catalog.sh now
embeds each apps/<id>/manifest.yml by default (EMBED_MANIFESTS opt-out), so nodes
install from the signed catalog (origin-wins overlay, disk = fallback) with no
OTA-shipped disk manifest. main.rs awaits a bounded (25s) refresh_catalog before
load_manifests so a fresh boot overlays the latest embedded catalog instead of a
restart later; offline/ISO boot falls through to disk and never hangs.
Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
image_exists ran `podman image inspect <image>` via .status() (inherits the
service stdout) with no --format, so every hit dumped the image's full ~249-line
manifest JSON into the journal — once per companion image, every reconcile pass
(.228: 21.6k journal lines / 10 min, 4131 inspect dumps). The service never
crashed (NRestarts=0); the sustained journald/IO flood starved the async runtime
and dropped the UI /ws/db websocket -> constant "connection lost"/reconnect.
Discard the child's stdout/stderr; only the exit status is used.
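The shape of the fix, as a std-only sketch: redirect the child's streams to null and look at the exit status only. `succeeds_quietly` is a hypothetical helper name, not the real `image_exists`.

```rust
use std::process::{Command, Stdio};

/// Run a command purely for its exit status; nothing it prints reaches
/// the parent's stdout/stderr (and therefore the journal).
fn succeeds_quietly(program: &str, args: &[&str]) -> bool {
    Command::new(program)
        .args(args)
        .stdin(Stdio::null())
        .stdout(Stdio::null())
        .stderr(Stdio::null())
        .status()
        .map(|s| s.success())
        .unwrap_or(false) // spawn failure counts as "not present"
}
```

An image check would then look like `succeeds_quietly("podman", &["image", "inspect", image])`.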
Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
It called bats-assert's `fail` (not loaded in this file) → "fail:
command not found"/127, masking the real reason. Emit+return instead,
bump the cold-restart RPC window 60s→120s (block-index reload), and
note a node mid-IBD legitimately can't serve getinfo (environmental
precondition, not a product regression).
Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
The 2026-06-23 5×-green gate is DESTRUCTIVE-tier / ~8 core apps only —
it skips uninstall/reinstall (cascade) and has no progress-UI or
all-apps coverage. Manual multinode testing found real bugs it never
ran (immich+grafana uninstall hangs at full-red bar + ghost in My Apps;
grafana reinstall stops; fedimint guardian "waiting for bitcoin sync").
Adds §4 row F, §6b post-deploy order (netbird→Phase-3→F), §6c scope +
observed bugs + definition-of-done, a §5 warning, and §10 backlog to
investigate TanStack-Query/push-based state management for neode-ui.
Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
Serve the companion download as a plain .apk so a phone installs it
straight from the link/QR with no unzip step. Repoint the in-app
download URL, the ship + publish scripts, and the pre-push hook at
archipelago-companion.apk, and drop the legacy .apk.zip.
Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
Companion WebView now supports file inputs and downloads, and apps
opened in the in-app tab get a proper loading splash and a footer
control bar matching the web app-session bar.
- onShowFileChooser wired to an ActivityResultLauncher so <input
type=file> opens the system file browser (kiosk + in-app tab)
- DownloadListener: http(s) via DownloadManager (forwarding session
cookies), blob: via JS->base64->MediaStore, data: decoded inline
- in-app tab: app-icon + progress loading splash (eager favicon
fetch, upgraded via onReceivedIcon)
- footer controls (back/forward/refresh/open/close) matched to the
web AppSession mobile bar, with the same SVG glyphs as drawables
- bump to 0.4.8 (versionCode 12)
Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
run-gate.sh 5/5 on .228. Reframe the TOP PRIORITY banner as
gate-green; keep the master plan as north-star source of truth; mark
the gate definition-of-done green and point at multinode as the next
exit criterion.
Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
run-gate.sh 5×-green on .228, 0 not-ok (gate-5x5.log). Records the
milestone in the header/banner, §4 workstream E, §6 sequence, and §8b;
demotes the priority banner per §6 item 6. Next: bundled testing deploy
(.116/.198 + UX frontend), multinode pass, workstreams B/C/D.
Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
- InAppBrowser now has a bottom control bar (back/forward/reload/open-in-browser/
close) mirroring the web mobile footer, plus a centered loading screen
(app favicon + progress bar) instead of a bare top bar over black.
- Commit a repo-dedicated debug keystore and pin signingConfigs.debug to it so
every machine — and the published companion download — signs debug builds with
the SAME key (fixes "App not installed" signature-mismatch on update). Force v1+v2.
- Bump versionCode 10→11, versionName 0.4.6→0.4.7.
Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
- Mobile launches use the store-driven panel (no route push) so the background
tab no longer changes and closing returns to where you launched from.
- Tab-only apps open directly (in-app WebView on companion / new tab on PWA) —
no "this app opens in a tab" interstitial.
- Shared AppLoadingScreen (app icon + progress bar) on the app session and the
legacy iframe overlay instead of a black screen.
- Pin the dashboard to 100dvh on mobile so the mesh chat/tools panes stop sliding
under the bottom tab bar in mobile browsers (no-op in the companion WebView).
- ElectrumX/electrs/electrs-ui ids now resolve to the real ElectrumX icon in My Apps.
- isMobile made reactive so overlay/footer/teleport decisions track the viewport.
Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
5× run #4 flaked iter4 on "immich exposes its web UI lan-address
(port 2283)": container-list returned lan_address=null because
immich_server was momentarily mid-recreate when the read-only tier
queried it (passed the other 4 iterations; immich_server does publish
0.0.0.0:2283->2283). Same single-shot-read class as the bitcoin-knots
state probe — poll <=30s for the exposed port instead of one read. A
genuinely unexposed immich never publishes 2283, so real port drift
is still caught.
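The poll-instead-of-single-read pattern can be sketched generically. The suite itself is bats; this Rust version is only an illustration, and `poll_until` is a hypothetical helper.

```rust
use std::thread::sleep;
use std::time::{Duration, Instant};

/// Call `probe` until it returns true or `window` elapses.
/// Returns true on the first success, false if the window expired.
fn poll_until<F: FnMut() -> bool>(window: Duration, interval: Duration, mut probe: F) -> bool {
    let deadline = Instant::now() + window;
    loop {
        if probe() {
            return true;
        }
        if Instant::now() >= deadline {
            return false;
        }
        sleep(interval);
    }
}
```

A state that never becomes true still fails once the window expires, so real drift is not masked; only the transient recreate window is absorbed.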
Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
Record the overnight 5× outcome (2/5) and the triage: all three
fails were distinct one-offs. iter1 #5 bitcoin-knots = pre-launch
churn (hardened anyway); iter2 #74 + iter5 #73 = one real
orchestrator bug (phantom stack-member injection in
ordered_containers_for_start), now fixed + live-verified on .228.
Update the resume check command to gate-5x4.log.
Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
package.restart resolved its container list via
ordered_containers_for_start, which injected every name from the
union startup_order list that wasn't already present — including
variant names not live on a given node (mysql-mempool,
archy-mempool-api, archy-mempool-web). The phantom mysql-mempool is
2nd in the mempool start order, so do_orchestrator_package_start hit
its unknown-app-id fallback, do_package_start failed the inspect
("no such object"), and the `?` aborted the whole start sequence —
leaving mempool-api + the frontend down until the health monitor
recovered them minutes later. That was the source of the 5× gate
flakes #73 (frontend not running in 180s) and #74 (api not queryable
in 300s); root-caused from the .228 journal
("Start failed: mysql-mempool").
Replace the inject-then-sort logic with a pure helper
order_present_containers that orders only the actually-present
containers and never adds phantom entries. startup_order remains a
union of name variants across install generations — it's now used
purely to order what's live, not to inject what isn't. +3 unit tests.
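A sketch of what an order-only helper could look like. The signature is an assumption (the real `order_present_containers` may differ), as is placing names missing from `startup_order` last in their original order.

```rust
/// Order only the containers that actually exist, using `startup_order`
/// as a ranking. Names absent from `present` are never added.
fn order_present_containers(present: &[&str], startup_order: &[&str]) -> Vec<String> {
    let rank = |name: &str| {
        startup_order
            .iter()
            .position(|n| *n == name)
            .unwrap_or(usize::MAX) // unknown names sort last (assumed)
    };
    let mut ordered: Vec<String> = present.iter().map(|n| n.to_string()).collect();
    // Stable sort: ties keep their original relative order.
    ordered.sort_by_key(|n| rank(n.as_str()));
    ordered
}
```

Given a union list that contains mysql-mempool, a node running only mempool-api and a frontend gets back exactly those two, in start order, with no phantom entry to trip the inspect.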
Also harden bitcoin-knots.bats "valid state" probe: poll ≤30s for a
settled state instead of a single-shot read, so a container caught
mid-reconcile (transient restarting/configured) can't flake a 20-min
iteration. A genuinely-stuck container never settles, so real
breakage is still caught.
Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
Rename run-20x.sh → run-gate.sh, default ARCHY_ITERATIONS 20→5, and scrub
20× references across CLAUDE.md, the master plan, TESTING.md, app-registry
status, the orchestrator/config doc-comments, and the bats suites. Also add
a minimal fail() helper to mempool.bats so guard failures report cleanly.
Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
The frontend nginx used a literal proxy_pass host with no resolver, so it
pinned mempool-api's IP at worker startup. When the backend restarts (gate,
OTA, crash, reboot re-IPAM) podman reassigns its IP and nginx keeps proxying
to the dead one -> /api hangs, websocket 502s, UI shows 'offline' until a
manual nginx reload. Same stale-upstream-IP class as the netbird 502.
Fix: mempool-frontend:v3.0.1 rewrites the generated nginx-mempool.conf to
re-resolve the backend per-request via 'resolver' + a variable proxy_pass.
Resolver address is read from /etc/resolv.conf (podman aardvark-dns answers
on the network gateway, not Docker's 127.0.0.11). Per-location path mapping
preserved (ws -> '/', /api/v1 identity via no-URI, /api/ -> /api/v1/ rewrite).
Proven on .228: backend IP change now auto-recovers with no reload; the
literal-host control still 502s. Migrated the manifest off the retired
tx1138 registry to vps2.
Also: mempool.bats #74 waited only 180s post-restart (the slow path) and
called an undefined 'fail' helper (status 127). Bumped to 300s to match the
passing parity probes and emit a real failure instead.
Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
Captures: .228 1x-GREEN (110/110); hardened 5x DETACHED on .228 (/tmp/gate-5x2.log,
nohup — survives terminal close) with the exact check-from-any-machine command; all
shipped code fixes (commits) + deploy state (.228 + .198); node-state fixes NOT in
repo (lnd nginx proxy 8081->18083, home-assistant orphan unit removed, electrumx
re-registered); the run-ON-the-node lesson; and remaining work.
Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
The 1x gate is green; the 5x failed iters 1-2 on readiness-under-churn (apps DO
recover — lnd synced, mempool just mid-restart when probed — but slower than the
windows when restarted back-to-back). Hardening:
- run-20x.sh: best-effort settle_stack() before each iteration (wait for
mempool-api/frontend + lnd RPC healthy, 180s, on-node, never fails the run).
- required containers present/running (80/81): wait-loops (180s) not single-shot.
- mempool api/frontend (87/88): retry ~180s not single-shot.
- mempool queryable (74): 60s->180s. lnd restart-running (64): 120s->240s.
lnd getinfo (60): 90s->240s retry.
Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
Companion app: open every app in the in-app WebView (not just non-iframeable),
carrying the mobile-iframe footer controls into the WebView. Mobile web (PWA):
open tab-apps directly in a new tab. No interstitial on either surface. Touch
points + prior commits (b5a9deb8, d1fbcd9b) noted.
Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
Per direction: the gate is now 5x green ON .228 only (run on the node, not via RPC).
Fleet/multinode verification (.198 + others) moved to a new docs/multinode-testing-plan.md
with the bootstrap recipe, per-node preconditions (synced archival bitcoin, no stale
nginx proxy targets, no orphan quadlet units), node roster, and cross-node suites.
Updated CLAUDE.md, master-plan §5/§6/§8b/WS-E, and TESTING.md release gates.
Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
On a proper on-node .228 run (synced bitcoin, 4-fix binary) the lifecycle matrix is
green; these 4 were test-harness issues:
- lnd 'recovers after restart' (65): bump retry window 90s->240s. lnd cold-restart
recovery (wallet unlock + bitcoind reconnect + graph sync) exceeds 90s on a loaded
node but DOES complete (synced_to_chain:true).
- bitcoin ui responds (89): retry ~120s instead of single-shot (companion nginx may
have just been recreated by the companion-survives test).
- probe_app_url (99 lnd proxy + all ui-coverage proxy probes): retry up to 90s for
post-restart proxy/UI readiness instead of single-shot.
- required endpoints after restart (94): :8081 is nginx-proxy-manager, an OPTIONAL
app (not in required_containers) — only assert it when NPM is installed; and make
the trailing lncli getinfo a retry.
Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
lnd's RPC isn't ready until its wallet auto-unlocks on (re)start, which lags the
container 'running' state — single-shot lncli getinfo raced that window and
false-failed (gate tests 60 + 85). Retry up to ~90s like a health probe. lnd is
functional (getinfo returns cleanly once ready).
Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
- immich restart: bump wait 120s->240s. Restart = ordered stop+start of the 3-
container stack (postgres->redis->server w/ DB migrations), so it needs at least
as long as the start test (180s) — the old 120s was inconsistent and false-failed
on loaded nodes. immich does return to running.
- fedimint orphan check: the unanchored 'total' regex (^fedimint) counts the
legitimate fedimint-clientd (dual-ecash bridge) but the anchored 'known' regex
omitted it -> total>known false orphan on every node running fedimint-clientd.
Add fedimint-clientd to known.
Both run as LOCAL podman/systemctl on the gate runner, so they test the runner node
(.116), not the RPC target — surfaced while driving the .228 gate green.
Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
Independent companion loop (452f05d8) validated on .228: deleted archy-electrs-ui
recreates in ~10s (was stuck 100s+). Also: companion-survives bats does LOCAL
rm/systemctl --user, so running it from .116 via RPC tests .116's companions with
.116's binary, NOT the remote target — must run ON the target node. Explains the
'failed on both nodes' runs (both silently tested .116).
Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
The companion-unit repair stage ran at the END of each boot-reconciler tick, after
reconcile_existing(). On a heavily loaded node that per-app pass takes >60-90s, so a
deleted/lost companion unit (electrs-ui, bitcoin-ui, …) wasn't repaired within any
reasonable window (gate test 31 'deleted unit recreated within one reconcile tick'
timed out at 90s on the 45-app .228 node). Detecting + rewriting a companion unit is
cheap, so spawn it as its own ~interval(30s) loop, independent of the slow app pass.
Handle is aborted when the main loop exits (shutdown uses notify_one, so a second
waiter would steal the wake permit). tick() is now app-reconcile only.
All 4 boot_reconciler cadence tests still green (companion_stage=false in tests).
Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
Last 2 .228 stragglers confirmed load/timing, not bugs: test 31 (companion recreate)
= contamination + ~108s reconcile cadence > 90s window; test 55 (immich restart) =
heavy stack restarts >120s under load but DOES return. Path to literally-green gate
is infra (bitcoin sync, re-quadletize .228) + minor test-window tuning. Optional
product improvement noted: independent ~30s companion-reconcile cadence.
Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
companion::reconcile only recreates a deleted companion unit when its parent
backend is in manifest_ids. On contaminated .228, electrumx ran as plain podman
and was NOT a tracked manifest install (manifest on disk but unloaded), so the
reconciler never iterated it -> archy-electrs-ui companion orphaned. Proven:
package.install electrumx re-registered it + restored the companion. Self-heal
logic is sound; test 31 clears on re-quadletize. electrumx on .228 de-contaminated.
Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
.228 104/110, .198 94/110 with the 3-fix binary. Every package.stop test passes on
healthy apps. 14 of .198's 16 failures trace to bitcoin in IBD (test 83: ~137k blocks
behind) cascading to lnd/btcpay/electrumx/mempool. 2 are node-independent: companion
recreate (31, both nodes), fedimint orphan pollution (44). Path to green 5x gate is
now infra (sync bitcoin, re-quadletize .228) + minor (test 31), not lifecycle bugs.
Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
Stop failure was 3 real product bugs (grace / reconcile-resurrection /
container-list user-stopped state), all fixed (2dad64b2, 760a32bc, 6e49ce6f) +
deployed. electrumx lifecycle suite 10/10 green (66s). fedimint 'crash loop' was
probe-induced churn (stable when left alone). Validating breadth next.
Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
A user-stopped backend (electrumx, bitcoin, lnd, fedimint) kept reading 'running'
in container-list because its UI companion (electrs-ui, …) still serves the launch
port, and the state-refresh upgrades any reachable launch port to 'running'. The
gate's wait_for_container_status <app> stopped therefore never saw 'stopped'.
Fix: load the user_stopped marker in handle_container_list and force 'stopped' for
those apps before the launch-port refresh. The reconcile guard keeps the backend
down, so the marker is authoritative. package.start clears it first, so a started
app reports 'running' normally.
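
A sketch of the ordering the fix relies on, assuming a simplified two-state model.
The State enum and reported_state function are hypothetical; the real
handle_container_list is not shown in this log.

```rust
use std::collections::HashSet;

#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum State {
    Running,
    Stopped,
}

/// Computes the state to report for one app.
/// The user_stopped marker is checked first, so a reachable launch port
/// (served by the UI companion) can no longer upgrade the app to Running.
pub fn reported_state(
    app: &str,
    raw: State,
    launch_port_reachable: bool,
    user_stopped: &HashSet<String>,
) -> State {
    if user_stopped.contains(app) {
        return State::Stopped;
    }
    if launch_port_reachable {
        State::Running
    } else {
        raw
    }
}
```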
Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
package.stop on a dependency (e.g. electrumx, a mempool dep) did not stick: the
reconciler restarted it within ~8s. The reconcile filter's dependency_required
override re-includes a user-stopped app that an active app depends on, and the
in-memory disabled set is wiped on manifest reload. So ensure_running runs, the
stopped app's unreachable ports look like a fault, and the host-port repair
restarts it (gate 'transitions to stopped' times out).
Fix: guard ensure_running_with_mode on the on-disk user_stopped marker (the single
choke point every reconcile flows through) → Left('user-stopped'). Explicit
install/start clear the marker first (added clear_user_stopped to orchestrator
install/start, symmetric with disabled.remove; start/restart RPC already cleared
it) so user actions are unaffected. The container itself already stopped correctly
— this stops the resurrection.
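
The guard-plus-clear pairing can be sketched like this. It is a toy model under
stated assumptions: the marker is an in-memory set here (the real one is on disk),
and the Orchestrator struct, Outcome enum and method names are hypothetical
stand-ins for the real orchestrator types.

```rust
use std::collections::HashSet;

#[derive(Debug, PartialEq, Eq)]
pub enum Outcome {
    /// Reconcile left the app alone, with the reason.
    Left(&'static str),
    /// Reconcile made sure the app is running.
    Ensured,
}

#[derive(Default)]
pub struct Orchestrator {
    user_stopped: HashSet<String>, // stand-in for the on-disk marker
    running: HashSet<String>,
}

impl Orchestrator {
    /// Explicit user stop: set the marker, then stop the app.
    pub fn stop(&mut self, app: &str) {
        self.user_stopped.insert(app.to_string());
        self.running.remove(app);
    }

    /// Explicit user start: clear the marker first, then start.
    pub fn start(&mut self, app: &str) {
        self.user_stopped.remove(app);
        self.running.insert(app.to_string());
    }

    /// The choke point every reconcile goes through.
    pub fn ensure_running(&mut self, app: &str) -> Outcome {
        if self.user_stopped.contains(app) {
            return Outcome::Left("user-stopped");
        }
        self.running.insert(app.to_string());
        Outcome::Ensured
    }

    pub fn is_running(&self, app: &str) -> bool {
        self.running.contains(app)
    }
}
```

Because the guard sits in the one function all reconcile paths call, the
dependency_required override and the host-port repair cannot bypass it.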
Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
Fix deployed to .198+.228, vaultwarden stops clean (no regression). But validation
showed the gate failures are multi-caused: (2) fedimint crash-looping/unhealthy on
both nodes can't be stopped; (3) host-listener repair watchdog restarts
port-unreachable containers fighting stop; (4) gate waits for 'stopped' but apps end
'exited'/'absent' (Exited->Stopped conversion key mismatch); (5) grace vs 60s
gate-timeout (electrumx 300s); (6) .228 contamination. Documented + re-sequenced
NEXT STEPS (fedimint health is the new top blocker).
Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
Reproduced live on CLEAN .198: package.stop fedimint -> 'podman stop -t 30
timed out after 30s' -> stop fails -> state reverts to running. Real fleet-wide
bug (NOT .228 contamination). stop_timeout_secs() per-app grace (bitcoin 600/lnd
330/electrumx 300/fedimint 60) is used by legacy stop paths but NOT the
orchestrator path: ContainerRuntime::stop_container hardcodes API ?t=10 / CLI
-t 30, and PODMAN_CLI_DEFAULT_TIMEOUT=30s == the -t grace so the await fires as
podman SIGKILLs. Fix = thread per-app grace + widen wrapper deadline; owner picks
table-based vs manifest-driven stop_grace_secs. Re-escalated to blocker.
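
A sketch of the table-based option, assuming the per-app values quoted above. The
30s default and the 15s wrapper margin are assumptions made for illustration, and
stop_plan is a hypothetical helper, not an existing function.

```rust
use std::time::Duration;

/// Extra time the wrapper waits beyond podman's own grace period, so the
/// await does not fire at the same moment podman escalates to SIGKILL.
/// (Assumed value.)
const WRAPPER_MARGIN_SECS: u64 = 15;

/// Per-app stop grace in seconds (values from the legacy stop paths).
pub fn stop_grace_secs(app_id: &str) -> u64 {
    match app_id {
        "bitcoin" => 600,
        "lnd" => 330,
        "electrumx" => 300,
        "fedimint" => 60,
        _ => 30, // assumed default, matching the current CLI -t 30
    }
}

/// Returns (grace passed to podman stop -t, deadline for the wrapper await).
pub fn stop_plan(app_id: &str) -> (u64, Duration) {
    let grace = stop_grace_secs(app_id);
    (grace, Duration::from_secs(grace + WRAPPER_MARGIN_SECS))
}
```

The point of the second value is that the wrapper deadline is always strictly
longer than the grace, which is the property the hardcoded 30s == 30s pair lacked.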
Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
.198 ground truth: backend apps ARE quadlet (.container files present) -> quadlet
is the intended runtime. .228's plain-podman state traced to my cascade-gate
uninstall + package.start restore (no quadlet regen). Two real robustness sub-bugs
remain (start should regen quadlet; stop podman-fallback gap). Next: canonical
gate on CLEAN .198 first to tell real-bug from contamination.
Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
5x gate run surfaced a real blocker: package.stop does not stop electrumx/
bitcoin-knots/btcpay/fedimint/immich (container stays running; gate stop-wait
times out). Root cause chain: these backend apps run as plain podman
--restart=unless-stopped, NOT quadlet units (PODMAN_SYSTEMD_UNIT empty; only UI
companions + home-assistant have .container files; bitcoin-core.container is
.disabled). orchestrator.stop() podman-fallback fires for filebrowser but not
electrumx -> suspect loaded()/is_unknown_app_id_error gap. stop->stopped state
reporting itself is correct (filebrowser proof, user_stopped guard).
Also: corrected the canonical gate invocation (DESTRUCTIVE only, not CASCADE);
restored .228 after my cascade-gate left apps stranded.
Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
On the loaded .198 the frontend churned (created → "unhealthy" → reconciler
recreates → loop). The http health check fetched / through nginx (SPA +
sub_filter) and false-failed under node load; the reconciler then treated the
frontend as wedged and recreated it. nginx binds 7777 at startup, so a tcp
liveness check passes immediately and stays green under load while still
catching a real "nginx not listening" failure. Generous retries/start_period.
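
For reference, a tcp liveness probe of this kind reduces to a bounded connect. A
minimal sketch using only the Rust standard library; tcp_alive is a hypothetical
name and the real check is declared in the manifest, not written like this.

```rust
use std::net::{SocketAddr, TcpStream};
use std::time::Duration;

/// Returns true when something accepts TCP connections on `addr` within
/// `timeout`. No HTTP request is made, so a slow SPA response under load
/// cannot fail the check; a process that is not listening still does.
pub fn tcp_alive(addr: SocketAddr, timeout: Duration) -> bool {
    TcpStream::connect_timeout(&addr, timeout).is_ok()
}
```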
Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
Live on .228 the post_install `exec` steps failed with "crun: write
cgroup.procs: Permission denied / OCI permission denied": a `podman exec`
launched from archipelago.service can't place its child in the container's
cgroup (under the service's own slice). Wrap `exec` in
`systemd-run --user --scope --quiet --collect podman exec …` so it gets its own
delegated cgroup — same trick as `podman_user_scope` for pasta starts.
`copy_from_host` (a host-side `cp`, no in-container process) stays direct.
Without this only copy_from_host worked; indeedhub happened to be unaffected
(its image pre-bakes the nginx config so the exec steps were no-ops), but the
hook capability is only generally useful with exec working. hooks unit tests
pass; live verify on .228 next.
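
The wrapped command line can be sketched as an argv builder. The flags are the
ones quoted above; the function name scoped_exec_argv is hypothetical.

```rust
/// Builds the argv for running `args` inside `container` via podman exec,
/// wrapped in a transient systemd user scope so the child gets its own
/// delegated cgroup.
pub fn scoped_exec_argv(container: &str, args: &[&str]) -> Vec<String> {
    let mut argv: Vec<String> = [
        "systemd-run", "--user", "--scope", "--quiet", "--collect",
        "podman", "exec",
    ]
    .iter()
    .map(|s| s.to_string())
    .collect();
    argv.push(container.to_string());
    argv.extend(args.iter().map(|s| s.to_string()));
    argv
}
```

Keeping the arguments as a vector (rather than a shell string) avoids quoting
issues when a hook step contains spaces or shell metacharacters.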
Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
Live fresh-create on .228 (post special-case removal) had nginx workers die
with "setgid(101) failed (Operation not permitted)" → workers exited code 2,
port published but nothing served (HTTP 000). The orchestrator does
--cap-drop=ALL, so unlike the legacy `podman run` (default caps) nginx's master
couldn't drop workers to the nginx user. Declare CHOWN/DAC_OVERRIDE/SETGID/SETUID
(SET* to drop the worker user, CHOWN+DAC_OVERRIDE for the tmpfs proxy cache).
Verified on .228: frontend fresh-creates, caps applied, nginx serves, UI 200
incl. /api/ and /nostr-provider.js.
Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
The fresh-create path was blocked by hardcoded indeedhub orchestrator logic
that predated and conflicted with the manifest migration:
- ensure_running routed app_id=="indeedhub" → reconcile_indeedhub_stack, which
REFUSED to create the frontend from its manifest (returned Left("stack-managed")).
- run_pre_start_hooks("indeedhub") → start_indeedhub_backends →
wait_for_indeedhub_dependencies_ready(120) — a DNS gate with a chicken-and-egg
bug (required the frontend's own alias present before the frontend could be
created), which failed install_fresh with "dependencies were not ready within
120s" and left the frontend down (caught live on .228).
Delete all of it (−382 lines): reconcile_indeedhub_stack, start_indeedhub_backends,
wait_for_indeedhub_dependencies_ready, indeedhub_api_dependency_dns_ready,
indeedhub_required_aliases_present, repair_indeedhub_network_aliases,
indeedhub_alias_present, patch_indeedhub_nostr_provider, and the INDEEDHUB_*
consts. The manifests now carry everything these did: network_aliases (short
hostnames), generated_secrets, dependencies, and the post_install nginx hook. So
"indeedhub" + every member flows through the generic install_fresh/reconcile path
— the frontend fresh-creates normally and runs its hook.
(crash_recovery.rs's frontend-after-deps ordering guard is kept — it's beneficial
startup ordering, not a blocker.) cargo check + release build green.
Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
Per user direction: the production test gate is 5x (ARCHY_ITERATIONS=5) on
.228 AND .198 for now, down from 20x. Restore to 20x before the final ship.
Updated CLAUDE.md, PRODUCTION-MASTER-PLAN.md, and tests/lifecycle/TESTING.md.
Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
Author the IndeedHub stack as 7 manifests (postgres/redis/minio/relay/api/
ffmpeg + frontend) and route install_indeedhub_stack through the
orchestrator first (immich pattern), falling back to the legacy installer
only when the manifests aren't deployed.
Data-preserving by construction — the manifests reproduce the live install
exactly so an existing node ADOPTS rather than recreates:
- container_name = the live hyphenated names the runtime already references
(health_monitor tiers/deps, crash_recovery).
- named volumes indeedhub-{postgres,redis,minio,relay}-data (not bind mounts).
- dedicated indeedhub-net + network_aliases [postgres|redis|minio|relay|api]
so the api/ffmpeg env hostnames and the frontend nginx upstreams resolve
unchanged.
- generated_secrets (indeedhub-db-password/-minio-password owned by their
backends, indeedhub-jwt by the api) reuse the live /var/lib/archipelago/
secrets values (ensure_one no-ops on existing files; postgres pw is fixed
at PGDATA init). minio user "indeeadmin" + AES_MASTER_SECRET literal kept.
The frontend carries the post_install hook (#20) that replaces the hardcoded
patch_indeedhub_nostr_provider: strip X-Frame-Options, refresh
nostr-provider.js from /opt/archipelago/web-ui, inject the <script> if
absent, reload nginx — defensive/idempotent since indeedhub:1.0.0 already
bakes these. Frontend manifest also corrected off its dead Next.js shape
(health check now nginx :7777, tmpfs /run + /var/cache/nginx).
Builds + unit-tested; live adoption/lifecycle verification on .228 next.
Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
Add `container.network_aliases: Vec<String>` (serde default, DNS-label
validated) so a stack member can answer to short hostnames its peers bake
in, beyond its own container name. Rendered in both runtime paths:
- podman_client: merged (deduped) into the custom-network aliases array.
- quadlet from_manifest: appended after the container name; emitted only
for Bridge networks (slirp/pasta reject aliases).
Needed for the indeedhub migration: its frontend nginx proxies to
`api:4000` / `minio:9000` / `relay:8080`, so those members declare
`network_aliases: [api|minio|relay]` to keep the short names resolvable on
the dedicated indeedhub-net (vs. colliding generic aliases on archy-net).
Also fixes 4 pre-existing from_manifest test failures (unrelated to this
change, surfaced now that the quadlet suite runs green): test manifests
used the long-invalid `network_policy: archy-net` (allowlist is
isolated/bridge/host → moved to network_policy: isolated + container.network)
and bind sources outside /var/lib/archipelago.
Tests: container crate 53 pass; archipelago quadlet+alias 47 pass.
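
A sketch of the validate-then-merge step. The exact DNS-label rule used by the
crate is not stated here, so the rule below (1-63 ASCII alphanumerics or hyphens,
no leading or trailing hyphen) is an assumption, as are the function names.

```rust
/// Assumed DNS-label rule: 1..=63 bytes, ASCII alphanumeric or '-',
/// and no hyphen at either end.
pub fn is_dns_label(s: &str) -> bool {
    let b = s.as_bytes();
    !b.is_empty()
        && b.len() <= 63
        && b.iter().all(|c| c.is_ascii_alphanumeric() || *c == b'-')
        && b[0] != b'-'
        && b[b.len() - 1] != b'-'
}

/// Container name first, then each declared alias once, in declaration order.
pub fn merged_aliases(container_name: &str, extra: &[&str]) -> Result<Vec<String>, String> {
    let mut out = vec![container_name.to_string()];
    for a in extra {
        if !is_dns_label(a) {
            return Err(format!("invalid network alias: {}", a));
        }
        // Dedupe against the container name and earlier aliases.
        if !out.iter().any(|x| x.as_str() == *a) {
            out.push(a.to_string());
        }
    }
    Ok(out)
}
```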
Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
Add container::hooks::run_post_install — runs an app's declarative
post_install hooks against its own running container:
- Exec -> podman exec <container> <args…> (60s timeout-bounded)
- CopyFromHost -> resolve src against allowlist roots (<data_dir>/<app>
and /opt/archipelago), canonicalise + prefix-check (defeats symlink
escape), then podman cp <abs-src> <container>:<dest>
Best-effort + idempotent: a failed step is warned and skipped, never
fails the install — matching the legacy patch_indeedhub_nostr_provider
behaviour this replaces. Wired into install_fresh after the container is
up, so it runs only on a freshly created container (not plain start), and
re-applies on recreate-after-drift.
5 unit tests on resolve_copy_src (accept in-data-dir, reject absolute /
traversal / missing / symlink-escape). cargo test -p archipelago green.
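
The canonicalise + prefix-check idea can be sketched with the standard library
alone. This is not the shipped resolve_copy_src; the signature and the choice to
return Option are assumptions for illustration.

```rust
use std::path::{Component, Path, PathBuf};

/// Resolves a relative `src` against the allowlisted `roots`.
/// Returns the canonical absolute path only if it stays inside a root.
pub fn resolve_copy_src(roots: &[PathBuf], src: &str) -> Option<PathBuf> {
    let rel = Path::new(src);
    // Reject empty, absolute, and anything with `..` or `.` components.
    if src.is_empty()
        || rel.is_absolute()
        || rel.components().any(|c| !matches!(c, Component::Normal(_)))
    {
        return None;
    }
    for root in roots {
        let root = match root.canonicalize() {
            Ok(r) => r,
            Err(_) => continue, // root itself missing on this node
        };
        // canonicalize() fails for missing files and resolves symlinks,
        // so a link pointing outside the root fails the prefix check.
        if let Ok(abs) = root.join(rel).canonicalize() {
            if abs.starts_with(&root) {
                return Some(abs);
            }
        }
    }
    None
}
```

Checking the prefix after canonicalisation is what defeats the symlink escape: a
purely lexical check on the joined path would accept a link that points elsewhere.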
Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
Add controlled post_install/pre_start hook schema to AppDefinition:
LifecycleHooks/HookStep (Exec | CopyFromHost)/HostCopy with allowlist
validation (relative src, no '..', absolute container dest, non-empty
exec). Re-exported from the crate root. Design: docs/manifest-hooks-design.md.
Also add the missing generated_secrets: vec![] field to three
pre-existing ContainerConfig test literals (the field was added to the
struct in 03a4ee1b but the container crate's own tests were never rerun,
so -p archipelago-container failed to compile). cargo test green: 53 pass.
Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
container-list reports stack apps package-level (.name="immich"), so the suite
checks the "immich" package (presence, valid state, :2283 lan-address) rather than
individual container names. Destructive tier fires async stop/start/restart and
asserts on the end state via wait_for_container_status.
KNOWN: the destructive tier is flaky for slow multi-container stacks — bats runs
ops back-to-back with no settling while immich's async stack ops take 30s+, and
a stopped app reports as "exited", not "stopped". The immich migration itself is verified
working (manual stop/start/restart succeed; all 3 containers healthy). Hardening
the harness for stack apps (inter-op settling + stopped|exited acceptance) is a
follow-up.
Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
package.stop/start/restart broke ("no containers found" / "no such object
immich_postgres") because the runtime hardcodes the immich stack's container names
as immich_server/immich_postgres/immich_redis (underscore) across 8 files
(lifecycle, health, crash-recovery, ports, config). The migration had named the
containers by app_id (hyphen), mismatching all of it.
Root cause of the earlier failed attempt: container_name was nested under an
`extensions:` block, but `app.extensions` is serde(flatten) — container_name must
be a TOP-LEVEL app key to be read by compute_container_name. Fixed: set
container_name: immich_server / immich_postgres / immich_redis at top level, and
point DB_HOSTNAME/REDIS_HOSTNAME at the underscore aliases. App ids stay hyphen
(immich/immich-postgres/immich-redis) so the catalog identity (title+icon) holds.
Manifest-only change — container names now match existing runtime references, no
code edits to the 8 files. (Deriving stack containers from manifests instead of
hardcoded lists remains a north-star follow-up.)
Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
RPC-based (host-agnostic) lifecycle coverage for the manifest-driven immich stack
(immich + immich-postgres + immich-redis): presence + valid state of all 3 members,
a guard that no legacy underscore containers exist (catches botched migration /
legacy-installer fallback), destructive stop/start/restart of the server with
postgres+redis staying up, and cascade uninstall/reinstall (preserve_data).
Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
Orchestrator-installed backends (immich, btcpay-db, …) run as plain podman
`--restart=unless-stopped` containers until the Phase-3 Quadlet rollout flips
use_quadlet_backends on. Nothing in the codebase enabled the user's
podman-restart.service, so those containers had NO reboot-survival mechanism.
Enable it (idempotent, best-effort) at orchestrator startup so unless-stopped
containers come back after a reboot. Already applied manually on .228 (covers
31 containers incl. immich + btcpay); this codifies it fleet-wide.
The deeper fix (render Quadlet for all orchestrator installs) remains the gated
Phase-3 Quadlet-everywhere rollout.
Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
After the manifest migration the launcher installed as "immich-server" (app_id),
which has no catalog entry → showed the raw id and no icon. Rename the server
manifest app_id immich-server→immich so it matches the catalog/curated "immich"
entry (title "Immich", icon immich.png) and is recognised as a known launcher app
(APP_CATEGORY_MAP) → stays in My Apps. immich_stack_app_ids now installs
[immich-postgres, immich-redis, immich]; orchestrator.install bypasses package
routing so there's no recursion with the "immich"→stack-installer mapping.
Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
- Classify databases/APIs/backends into Services (#10): add immich-postgres/redis
to SERVICE_NAMES; isServiceContainer matches -postgres/-redis/-valkey/-cache/-db
suffixes; isWebsitePackage final fallback now routes any no-UI, non-known package
to Services ("anything that isn't the frontend UI launcher").
- Services show their parent app's icon (#14): backends reuse the app logo
(immich-* → immich, archy-btcpay-db → btcpay, indeedhub-* → indeedhub, etc.)
via explicit APP_ICON_FALLBACKS + prefix map, instead of 404 → 📦.
- Categories sub-nav for Services (#12): getServiceCategory + buildServiceCategories
+ useServiceCategories; Services tab gets the same desktop/mobile category strips
(Databases/Caches/APIs/Backends), shown only for categories with items. Shared
selectedCategory resets to 'all' on tab switch.
- Mobile swipe (#11): the tab-swipe gesture is suppressed over .mobile-category-strip
so swiping the category chips scrolls them instead of changing tabs (covers both
My Apps and the new Services strip).
vue-tsc build clean.
Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
Completes the immich migration off the legacy hardcoded install_immich_stack
(podman run + sudo chown) to the registry-manifest + orchestrator path. Validated
live on .228 (clean single set, healthy v2.7.4, data dir ownership correct).
- install_immich_stack now tries install_stack_via_orchestrator(immich_stack_app_ids)
first; legacy remains only as the no-manifests fallback.
- immich-{postgres,redis,server} manifests corrected from live findings:
* named by app_id (dropped container_name override) — using container_name
spawned DUPLICATE containers (app_id-named install vs name-override reconcile)
on the same PGDATA, which corrupted a postgres cluster. Server reaches its
siblings via app_id aliases (DB_HOSTNAME=immich-postgres, REDIS=immich-redis).
* immich-postgres data_uid 100998:100998 (postgres drops to container 999 →
host 100998 under rootless; verified the fresh dir is chowned correctly).
* immich-server version "release"→"2.7.4" (manifest validation requires a digit;
the bad version made the manifest silently skip → partial orchestrator install
→ legacy fallback → the duplicate corruption above).
- HARDEN install_stack_via_orchestrator: only fall back to the legacy installer
when NOTHING was installed yet. An "unknown app_id" AFTER a member is up now
errors instead of double-creating containers on shared data (the corruption
root cause).
- Make the all-manifests round-trip test strict: fail (not skip) on any invalid
shipped manifest. This gap let the bad immich-server version through.
Known follow-up (pre-existing, platform-wide): orchestrator-installed backends
(immich, btcpay-db) run as podman --restart, not Quadlet, and podman-restart.service
is disabled on .228 → reboot-survival gap independent of this migration.
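
The hardened fallback rule can be sketched as below. It is a simplified model: the
install closure, the StackOutcome enum, and matching on the "unknown app_id" text
are assumptions, and whether other error kinds may also trigger the fallback is
not covered here.

```rust
#[derive(Debug, PartialEq, Eq)]
pub enum StackOutcome {
    /// Every member was installed through the orchestrator.
    Installed,
    /// Nothing was installed yet, so the legacy installer may run safely.
    FallBackToLegacy,
}

/// Installs stack members in order. Falls back to the legacy installer only
/// if the very first member is unknown; once any member is up, an unknown
/// app_id is a hard error so no second set of containers is created on the
/// same data.
pub fn install_stack(
    app_ids: &[&str],
    mut install: impl FnMut(&str) -> Result<(), String>,
) -> Result<StackOutcome, String> {
    let mut installed = 0usize;
    for &id in app_ids {
        if let Err(e) = install(id) {
            if installed == 0 && e.contains("unknown app_id") {
                return Ok(StackOutcome::FallBackToLegacy);
            }
            return Err(format!("{} failed after {} member(s) installed: {}", id, installed, e));
        }
        installed += 1;
    }
    Ok(StackOutcome::Installed)
}
```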
Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
immich becomes a manifest-driven stack (the legacy install_immich_stack — hardcoded
podman run + sudo chown — is the anti-pattern being retired). Three image-only
manifests modelled on the btcpay stack + the live .228 container config:
- immich-postgres / immich-redis / immich-server on archy-net; container_name set
to the underscore form (immich_postgres/_redis/_server) so the server's
DB_HOSTNAME/REDIS_HOSTNAME aliases resolve.
- generated_secrets: [immich-db-password] (idempotent — reuses the live secret on
existing nodes; postgres is already initialised with it).
- server depends on postgres+redis (install ordering); upload bind preserved.
Inert for now: not added to the UI catalog, and install_immich_stack is still the
default, so nothing installs these until the orchestrator wiring + on-node
ownership (data_uid) validation land. Schema validated by the all-manifests
round-trip test. See docs/PRODUCTION-MASTER-PLAN.md §6.
Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
generate-app-catalog.sh gains opt-in EMBED_MANIFESTS=1: embeds each
apps/<id>/manifest.yml into its catalog entry's `manifest` field (whole document,
top-level app: preserved — exactly what the Rust side deserializes). Default off
so routine catalog regen is unchanged during the migration window; turn on
deliberately, then sign via the existing release-root ceremony. Verified: default
embeds 0; EMBED_MANIFESTS=1 embeds 40 manifests (generated_secrets preserved).
Adds a round-trip guard test: every shipped apps/*/manifest.yml must deserialize
+ validate through catalog_manifest_to_overlay (image apps accepted, build apps
defer to disk) — catches schema drift between disk manifests and the catalog path.
Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
Workstream B phase 1 (node-side consume). The signed app-catalog can now carry a
full manifest per entry; the orchestrator overlays it over the disk manifest
(origin-wins) with disk as the migration fallback. Moves apps toward
registry-distributed manifests with no OTA-shipped disk file.
- app_catalog: `manifest: Option<Value>` on AppCatalogEntry (forward-compatible,
covered by the existing release-root signature over the raw JSON);
`catalog_manifest_values()` accessor.
- prod_orchestrator: `load_manifests` overlays catalog manifests after the disk
walk; `catalog_manifest_to_overlay()` returns None (→ disk fallback) on
unparseable value / app-id mismatch / failed validate() / build source
(build contexts aren't registry-distributed yet — phase 1 is image-only).
- manifest_dir stays PathBuf (build-only field); image-only apps never read it.
- 6 unit tests; compiles clean. No-op until a catalog embeds a manifest, so
existing nodes are unaffected.
See docs/registry-manifest-design.md.
Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
Single authoritative hub (docs/PRODUCTION-MASTER-PLAN.md) for the app-platform
north star: every app manifest-driven (zero OS-level reliance), manifests via the
signed registry, developer-ready external marketplace; rootless/secure/robust/
100%-uptime. Repo CLAUDE.md (auto-loaded each session) points agents at it until
the 20x lifecycle gate is green. New design doc registry-manifest-design.md.
Consolidated docs 56 -> 28: deleted dated handoffs/resumes/transcripts and
superseded trackers (content folded into the master plan or already in memory).
Kept all evergreen design/reference docs + ADRs (the master links them).
Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
Buyer-side paid downloads now persist: purchases are cached on disk
(content_owned.rs) keyed by (seller onion, content_id), the gallery shows
an "Owned" badge unblurred, and items view/play in-app from the local
cache with no re-payment or reliance on a browser download (which
silently failed on the mobile companion). New RPCs content.owned-list /
content.owned-get. Validated e2e .116<-.198 (paid 100 sats via Fedimint,
166KB jpeg returns, survives restart).
fedimint-clientd manifest: restore the standard container capability set
(CHOWN/DAC_OVERRIDE/FOWNER/SETUID/SETGID) so fmcd's startup chown of an
existing-federation /data succeeds instead of dying EPERM (#7). Confirmed
the orchestrator applies these to the running container.
FIPS perf: tighten the supervisor warm-path keepalive 45s -> 25s so peer
paths stay inside the ~30-60s NAT cold window. Dials now reliably land on
FIPS instead of re-punching and falling back to Tor. Measured to the same
peer: cloud browse 18-22s -> 0.4s; full Fedimint paid download 29s -> 11s
(residual is the seller-side guardian reissue round-trip).
Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
fmcd crash-looped "Operation not permitted (os error 1)" on .116 (kernel
6.12.74): the default rootless seccomp profile blocks a syscall its Mainline-DHT
/ iroh transport needs, so the REST API never came up (:8178 → HTTP 000) and
federations couldn't be joined. Verified: with seccomp=unconfined fmcd boots and
answers /v2/* (HTTP 401 instead of dead). fmcd works on other nodes, so this is
kernel/seccomp-specific — but the relaxation is safe for an outbound-networking
daemon and harmless where not needed.
- new `security.seccomp_unconfined` manifest flag (SecurityPolicy);
- libpod backend sets `seccomp_profile_path: "unconfined"` (== --security-opt
seccomp=unconfined); quadlet backend emits `SeccompProfile=unconfined`;
- enabled in apps/fedimint-clientd/manifest.yml.
NOTE: manifests live on-disk at /opt/archipelago/apps/<id>/manifest.yml, so the
node needs the updated manifest deployed + the fmcd container recreated to apply.
Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
- PeerFiles: new confirmation step after "pay from ecash" — shows the amount and
which wallet will be spent (Cashu/Fedimint) with balances, lets the user switch
backends, and a styled Confirm button. The chosen backend is passed to the
payment so it spends exactly what was confirmed.
- content.download-peer-paid: accept `method` (cashu|fedimint) to honor the
confirmed choice; log the backend + outcome; backend-specific rejection errors
("not in the same Fedimint federation" / "doesn't accept your Cashu mint").
- AUTO-REFUND: a minted token whose sale fails (peer unreachable, rejected, or
error) is now reclaimed (fedimint reissue / cashu receive) so the buyer no
longer loses the spent ecash — fixes the stuck-Fedimint-notes report.
- wallet.ecash-balance already reports cashu_sats/fedimint_sats/total_sats which
the confirm screen uses to pick/show the covering wallet.
Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
chrome://inspect isn't always reachable on the Android companion WebView, so the
real error stayed invisible. Add a plain-DOM, screenshot-able overlay (built
without Vue so it survives a crash in Vue itself) that shows the captured error
message + stack and a Copy button for the full window.__archyErrors buffer.
Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
Right after install the dashboard SPA opens and, if it loads before NetBird's
embedded OIDC provider is serving, caches a bad auth state — the user appears
logged-in but can't log out until it self-corrects. Container "running" != OIDC
ready, so gate the install's Done phase on the management server's
/oauth2/.well-known/openid-configuration answering (best-effort, 60s cap, never
fails the install since the stack is already up).
Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
On Cloud/files (and any scrolling view), the bottom of the list could sit behind
the fixed mobile tab bar. Cause: DashboardMobileNav measured the bar's
offsetHeight and wrote it to --mobile-tab-bar-height, but when the bar was hidden
or not yet laid out the measurement was 0 — and writing "0px" defeats the
", 88px" fallback in the .mobile-scroll-pad clearance calc (an explicit 0 is
still a set value), so the clearance collapsed and the ~88px bar overlapped the
last row.
- never write 0px: only set a real measured height, else remove the var so the
88px fallback applies.
- re-measure after first paint (rAF) and after the WebView safe-area injection,
so the clearance reflects the bar's final laid-out height.
Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
Paying for a peer file minted a Cashu-only token, so a node whose ecash balance
lived in Fedimint couldn't pay even with funds. Now both backends are tried:
- payer (content.download-peer-paid): mint a Cashu token first; on failure fall
back to spending Fedimint notes. Only error if BOTH backends can't cover it.
- seller (verify_and_receive_payment): accept Fedimint notes as well as Cashu —
anything not starting with "cashu" is redeemed via reissue_into_any.
- new fedimint_client::spend_from_any() — spend from whichever joined federation
has the balance, returning the notes + federation id (mirrors reissue_into_any).
- wallet.ecash-balance now also reports fedimint_sats + combined total_sats; the
pay-for-file pre-check uses the combined total so a Fedimint-funded node isn't
wrongly blocked.
Compiles (cargo check + vue-tsc). Live cross-node federation validation pending
(dual-ecash phase 6) — needs two nodes sharing a federation.
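
The payer-side fallback and the seller-side prefix dispatch can be sketched as
follows. The closures stand in for the real Cashu mint and
fedimint_client::spend_from_any calls; pay_with_fallback, token_backend and the
Backend enum are hypothetical names.

```rust
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum Backend {
    Cashu,
    Fedimint,
}

/// Payer: try Cashu first, fall back to Fedimint, and only fail when both
/// backends fail. Returns which backend paid plus the token/notes string.
pub fn pay_with_fallback(
    mint_cashu: impl FnOnce() -> Result<String, String>,
    spend_fedimint: impl FnOnce() -> Result<String, String>,
) -> Result<(Backend, String), String> {
    match mint_cashu() {
        Ok(token) => Ok((Backend::Cashu, token)),
        Err(cashu_err) => spend_fedimint()
            .map(|notes| (Backend::Fedimint, notes))
            .map_err(|fed_err| format!("cashu: {}; fedimint: {}", cashu_err, fed_err)),
    }
}

/// Seller: anything not starting with "cashu" is treated as Fedimint notes.
pub fn token_backend(token: &str) -> Backend {
    if token.starts_with("cashu") {
        Backend::Cashu
    } else {
        Backend::Fedimint
    }
}
```

The Fedimint closure is never invoked when the Cashu mint succeeds, so no notes
are spent unnecessarily.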
Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
The global Vue errorHandler swallowed every crash into "Something went wrong.
Please refresh the page." — which hides exactly what we need to diagnose the
companion-app (Android WebView) post-login crash. Now:
- the toast shows the real (truncated) error message;
- a 25-entry ring buffer is kept on window.__archyErrors for retrieval where
there's no console (companion WebView via chrome://inspect, or a debug view);
- window 'error' and 'unhandledrejection' listeners catch async/non-Vue errors
that Vue's errorHandler misses (e.g. a JS API absent in an older WebView).
Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
A node reachable both over LoRa and federation has two MeshPeer rows (radio
twin: low contact_id + firmware key; federation twin: high contact_id +
archipelago key), and its messages, keyed by peer_contact_id, are split across
the two ids, so opening one twin shows an empty thread (the .120->.89 symptom).
- backend: new group_peer_twins() helper groups peers by arch_pubkey_hex (set on
BOTH twins by bind_federation_twins), keeps the radio id as the mesh-first
send target, and unions messages across all twin ids. Wired into
conversations.list / conversations.messages / mesh.contacts-list. +3 unit tests.
- frontend: the live chat list merges client-side (mergedPeers) and matched twins
by the "Archy-z6Mk..." advert prefix, which the Meshtastic device rename broke
(radio now advertises the server name). Merge by arch_pubkey_hex instead, which
the backend reliably sets on both twins. Expose arch_pubkey_hex on MeshPeer.
- fix unrelated stale test: EcashTransaction test missing the new `kind` field.
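A rough sketch of the twin grouping described above, assuming a simplified
peer record. The Peer and PeerGroup types, the is_radio flag and the name
group_twins are illustrative only; the real group_peer_twins() operates on
MeshPeer and may differ in detail.

```rust
use std::collections::BTreeMap;

/// Simplified stand-in for a MeshPeer row (hypothetical fields).
#[derive(Debug, Clone)]
pub struct Peer {
    pub contact_id: u64,
    pub arch_pubkey_hex: Option<String>,
    pub is_radio: bool,
}

/// One logical contact: where to send, and which ids to union messages over.
#[derive(Debug, PartialEq)]
pub struct PeerGroup {
    pub send_target: u64,
    pub twin_ids: Vec<u64>,
}

pub fn group_twins(peers: &[Peer]) -> Vec<PeerGroup> {
    let mut by_key: BTreeMap<&str, Vec<&Peer>> = BTreeMap::new();
    let mut groups = Vec::new();
    for p in peers {
        match p.arch_pubkey_hex.as_deref() {
            // Peers sharing an archipelago key are twins of one node.
            Some(key) => by_key.entry(key).or_default().push(p),
            // No archipelago key: the peer stays a group of its own.
            None => groups.push(PeerGroup {
                send_target: p.contact_id,
                twin_ids: vec![p.contact_id],
            }),
        }
    }
    for twins in by_key.values() {
        // Prefer the radio twin as the mesh-first send target.
        let target = twins.iter().find(|p| p.is_radio).unwrap_or(&twins[0]).contact_id;
        let mut ids: Vec<u64> = twins.iter().map(|p| p.contact_id).collect();
        ids.sort_unstable();
        groups.push(PeerGroup { send_target: target, twin_ids: ids });
    }
    groups
}
```

A caller would load messages for every id in twin_ids and show them as one
thread, while sends go to send_target.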
Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
Resume notes for the 1.8.0 bug-bash mesh work: Meshtastic rename shipped +
verified; .120->.89 'non-delivery' traced to a duplicate-contact surfacing
bug (messages inject fine, split across federation/radio twin contact_ids);
design for the dedup fix (#12) and the netbird logout-race map (#10).
Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
Meshtastic device rename was a no-op — set_advert_name only updated an
in-memory field and never told the radio, so the device kept its firmware
default ('Meshtastic xxxx') and wasn't findable from external Meshtastic
apps. MeshCore already renamed correctly (CMD_SET_ADVERT_NAME); this brings
Meshtastic to parity.
Send an AdminMessage{set_owner=User{long_name,short_name}} to the locally
connected node (admin packet to our own node_num on the ADMIN_APP port).
Local serial admin needs no session passkey, matching the official client.
long_name = server name (<=39 chars); short_name = first 4 alphanumerics,
upper-cased. Verified on real hardware: .120 -> 'Archy-X250-EXP', .5 ->
'Archy-X250-Beta' (name read back from the radio after reconnect).
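The name derivation above, sketched as a standalone helper. owner_names is a
hypothetical name, and two details are assumptions: the 39 limit is counted in
characters here (the firmware limit may be in bytes), and "alphanumerics" is
taken to mean ASCII alphanumerics.

```rust
/// Derive (long_name, short_name) for the Meshtastic owner record
/// from the server name. Hypothetical helper, not the driver code.
pub fn owner_names(server_name: &str) -> (String, String) {
    // long_name: the server name, capped at 39 characters (assumption:
    // characters, not bytes).
    let long_name: String = server_name.chars().take(39).collect();
    // short_name: the first 4 ASCII alphanumerics, upper-cased.
    let short_name: String = server_name
        .chars()
        .filter(|c| c.is_ascii_alphanumeric())
        .take(4)
        .map(|c| c.to_ascii_uppercase())
        .collect();
    (long_name, short_name)
}
```

Under these assumptions 'Archy-X250-EXP' keeps its full long_name and gets the
short_name 'ARCH'.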
Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
Calibrated from a device home-screen screenshot: launcher3 crops less than the
App-info view, so the ring at 0.53 sat ~78% out. Scale 0.65 reaches the edge.
Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
Ring uses logo.svg's #000->#666 gradient (stroke 22.8834) pushed to scale 0.53
so it sits at the launcher's visible crop edge (calibrated from a device
screenshot). Grid at 0.55. versionCode 9 so launcher3 refreshes its icon cache.
Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
Device App-info screenshot showed the launcher only renders the central ~54%
of the adaptive icon, clipping the ring. Calibrated the ring to scale 0.50 so it
lands at the visible circle edge; grid to 0.55. Bump versionCode 8 so launcher3
refreshes its icon cache (it keys the cached bitmap by versionCode).
Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
Joining a Fedimint federation is heavy and routinely outlasts the default 15s
client timeout while still succeeding server-side, so the UI wrongly showed
failure. Bump the join timeout to 90s, and on any error re-check the list: if a
new federation appeared the join worked — show 'Federation joined.' instead of
a misleading error.
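The re-check can be sketched as a pure decision function. The real logic
lives in the UI, so this Rust form is only an illustration; resolve_join and
JoinOutcome are hypothetical names, and federations are represented by bare
id strings.

```rust
use std::collections::HashSet;

#[derive(Debug, PartialEq)]
pub enum JoinOutcome {
    Joined,
    Failed(String),
}

/// Decide what to show after a join attempt. `before` and `after` are the
/// federation ids listed before the call and after the re-check.
pub fn resolve_join(
    before: &[&str],
    after: &[&str],
    result: Result<(), String>,
) -> JoinOutcome {
    match result {
        Ok(()) => JoinOutcome::Joined,
        Err(err) => {
            let known: HashSet<&str> = before.iter().copied().collect();
            // A federation that was not listed before means the join
            // succeeded server-side even though the client saw an error.
            if after.iter().any(|id| !known.contains(id)) {
                JoinOutcome::Joined
            } else {
                JoinOutcome::Failed(err)
            }
        }
    }
}
```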
Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
A stock meshcore client (e.g. a phone) can't sign our typed envelopes, so it is
never 'authenticated' — which meant ticking it as an allowed assistant contact
had no effect and !ai stayed denied. The explicit per-contact allowlist is a
deliberate operator opt-in for a specific key, so match it regardless of
authentication, keyed on the asker's resolved identity (bound archipelago key,
else firmware routing key — how meshcore addresses the contact). The spoofable
federation-trust-list match still requires authentication.
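A minimal sketch of the resulting gate, assuming keys are plain strings. The
Asker struct and sender_allowed function are hypothetical and only model the
two rules stated above; the real is_sender_allowed has more inputs.

```rust
/// The asker as seen by the gate (hypothetical shape).
pub struct Asker<'a> {
    /// Bound archipelago key if known, else the firmware routing key.
    pub resolved_key: &'a str,
    /// Whether the inbound envelope signature verified.
    pub authenticated: bool,
}

pub fn sender_allowed(
    asker: &Asker,
    allowed_contacts: &[&str],
    federation_trusted: &[&str],
) -> bool {
    // Explicit operator allowlist: matches with or without authentication.
    if allowed_contacts.contains(&asker.resolved_key) {
        return true;
    }
    // Federation trust list is spoofable by name, so it still requires
    // an authenticated sender.
    asker.authenticated && federation_trusted.contains(&asker.resolved_key)
}
```

With this shape a stock phone client on the allowlist is admitted, while an
unauthenticated sender that merely appears on the federation trust list is
still denied.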
Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
- reissue_into_any now tries the UNION of the local registry AND fmcd's live
joined set (/v2/admin/info) before failing, so a valid Fedimint token isn't
wrongly rejected when the registry has drifted. When every attempt fails it
returns a friendly message that distinguishes notes already redeemed into this
wallet (funds safe) from notes that didn't match any connected federation.
- Unified transaction history: a local Fedimint tx log (recorded on each
successful redeem) is merged with the Cashu history in wallet.ecash-history,
newest-first, each tagged kind=cashu|fedimint. Previously a Fedimint receive
appeared nowhere.
- fedimint-clientd healthcheck -> type:tcp. It was probing /health, which fmcd
doesn't serve (only /v2/*), pinning the container in (starting) forever; the
TCP probe is skipped by the Quadlet renderer (host-side lifecycle verifies),
so it reports running. Cosmetic for ecash, which worked throughout.
Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
Put dark fill + inset metallic ring (0.88) + grid (0.58) all in the background
(renders to the mask edge, no safe-zone crop); transparent foreground. Matches
a locally-rendered, circle-masked preview so the ring is visible and uncut.
Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
Move the metallic ring into the background (renders to the mask edge, unlike the
foreground which is cropped to the safe zone) so the border is finally visible
at the circle's rim; shrink the grid to ~0.55 so the mark isn't too big.
Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
- Apps tab: a horizontal swipe that starts on an app icon no longer flips the
top tab — it lets the app-page scroll / icon tap win (swipe empty space to
change tab). Fixes the swipe conflict with two pages of apps.
- Files: file cover tiles are forced square on mobile (aspect driven by CSS,
not a Tailwind arbitrary class) so the grid is uniform and tappable.
- Files: scroll container gets bottom safe-area + tab-bar padding so the last
row clears the mobile back button / bottom nav.
Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
A !ai (or any typed message) from a trusted, federated node was denied when
it arrived over the radio. The radio half of a node that is also a federation
peer carried no archipelago identity (identity adverts are no longer broadcast
on the public channel), so the trusted_only gate and signature verification
had no key to check the asker against — and the same node showed up as two
contacts (a radio twin + a federation twin).
- bind_federation_twins(): correlate a radio contact with its federation twin
by exact, case-insensitive advert_name and copy the federation peer's
arch_pubkey_hex/did/x25519 onto the radio record. Called from
upsert_federation_peer and refresh_contacts. Ambiguous names (held by >1
federation peer) are skipped. This is only a CANDIDATE key — security is
unchanged: the inbound envelope signature must still verify against it.
- send_message now signs the typed Text envelope (new_signed) so a radio !ai
authenticates against the bound key. A meshcore node merely named like a
trusted node cannot forge the signature, so it is still denied.
Receiver-side verification (handle_typed_envelope_direct) and federation-trust
matching (is_sender_allowed) already existed; this supplies the missing key
binding and signature. Also resolves the radio/federation duplicate-contact
display for same-named nodes.
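The name correlation can be sketched like this, with simplified records.
FederationPeer, RadioContact and bind_twins are illustrative stand-ins, only
arch_pubkey_hex is copied (the real code also copies did/x25519), and
case-insensitivity is approximated with to_lowercase().

```rust
use std::collections::HashMap;

pub struct FederationPeer {
    pub advert_name: String,
    pub arch_pubkey_hex: String,
}

pub struct RadioContact {
    pub advert_name: String,
    pub arch_pubkey_hex: Option<String>,
}

/// Copy the federation key onto the radio contact with the same name.
/// The copied key is only a candidate: signatures must still verify.
pub fn bind_twins(radio: &mut [RadioContact], federation: &[FederationPeer]) {
    // Lower-cased name -> Some(key) if unique, None if held by >1 peer.
    let mut by_name: HashMap<String, Option<&str>> = HashMap::new();
    for f in federation {
        by_name
            .entry(f.advert_name.to_lowercase())
            .and_modify(|key| *key = None) // second holder: ambiguous
            .or_insert(Some(f.arch_pubkey_hex.as_str()));
    }
    for r in radio.iter_mut() {
        // Ambiguous or unknown names are skipped.
        if let Some(Some(key)) = by_name.get(&r.advert_name.to_lowercase()) {
            r.arch_pubkey_hex = Some((*key).to_string());
        }
    }
}
```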
Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
Revert to a pure adaptive icon (the bare round PNG was getting legacy-wrapped
onto a white circle by the launcher). One ring only, in the foreground, using
the SVG's dark #000->#666 gradient on a plain dark tile.
Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
Revert the brightened grey->white ring back to the original logo.svg gradient
(black->#666, stroke 22.8834) on both the round PNG icon and the adaptive
foreground.
Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
Render the full circular badge (bright grey->white ring + grid) to round-icon
PNGs at all densities and drop the adaptive round XML, so launchers that use
round icons show a real edge-to-edge circle instead of a mask-cropped coin.
Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
Scale the whole badge to ~0.64 so the bold grey->white ring isn't clipped at
the edge by the launcher mask; bigger, brighter ring. Background is plain dark.
Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
The ring at 0.96 sat in the adaptive-icon bleed zone (outer ~18dp cropped by the
launcher), so only the grid showed. Scale badge + grid to 0.68 so the ring lands
at the edge of the visible circle, and brighten it to grey->white.
Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
Move the badge ring into the background layer (brightened grey->white so it
reads on #0A0A0A) at ~0.96 so it sits at the masked-circle edge; foreground is
just the white grid. Also honor SHIP_COMPANION in the pre-push hook.
Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
scripts/publish-companion-apk.sh builds the debug APK and refreshes the served
download neode-ui/public/packages/archipelago-companion.apk.zip; .githooks/pre-push
runs it on every push to main that touches Android. Enable per clone with
git config core.hooksPath .githooks
Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
Adaptive icon foreground now draws the full badge (black→grey gradient ring +
white grid) scaled to ~0.94 so the ring reads as a clean border at the circle
edge. Adds ship-companion.sh: builds the debug APK and publishes it to
neode-ui/public/packages/archipelago-companion.apk.zip, then commits + pushes.
Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
The controller body/face were opaque, so the synthwave backdrop only peeked
out above/below the controller. Make the DARK palette surfaces translucent
(body/face/inlay) and drop the opaque shadow platform + the gradient's forced
0.95 alpha, so the backdrop reads through the controller as glass. CLASSIC
palette stays solid.
Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
Tapping a dashboard app icon now scales it down immediately (CSS :active)
and shows a per-icon spinner until the app overlay opens, so the tap is
acknowledged even while the app session spins up.
Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
- New circular badge logo (ic_logo) on Intro + Connect screens; launcher
icon rebuilt as dark circle + white grid.
- Reddish synthwave backdrop (bg-intro-2) behind Intro, Connect, and the
remote/gamepad (edge-to-edge with a light scrim); controllers no longer
paint an opaque fill over it.
- Server name: added to ServerEntry/prefs, the Connect form, the modal
add-form, and saved-server rows; removal now matches by connection
identity (rename- and legacy-format-safe).
- NESMenu modal restyled to glassmorphism #0A0A0A with centered, larger
fields. Connect-form glass cards given a darker base for legibility.
- Intro title/subtitle set to #FAFAFA.
- Deleting the last server clears the active server and returns to Connect.
- D-pad auto-repeat initial delay raised to 500ms so a tap sends one key
(fixes doubled nav sound).
Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
Messages to a federated peer that is out of LoRa range (e.g. on another
continent) were dropped into the radio with no fallback, or hung on a dead
FIPS path before reaching Tor — so they never arrived.
- Route a radio contact over the federation transport (FIPS->Tor) when it is
the same node as a federated peer (known archipelago identity -> onion) AND
it is not currently reachable over the radio. Reachable radio peers stay on
the mesh (preferred); oversized/file envelopes still always take federation.
- Resolve the onion via the archipelago identity key (arch_pubkey_hex), not
the firmware routing key, so a radio contact maps to its nodes.json onion.
- Add .fips_timeout(8s) to the federation message POST so an unreachable FIPS
overlay fast-fails to Tor (~3-5s) instead of burning the 120s budget.
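The routing rule above reduces to a small decision table. A sketch, with
hypothetical names (Transport, choose_transport) and three booleans standing
in for the real peer and envelope state:

```rust
#[derive(Debug, PartialEq)]
pub enum Transport {
    Radio,
    Federation, // FIPS first, then Tor
}

pub fn choose_transport(
    has_federation_identity: bool,
    radio_reachable: bool,
    oversized: bool,
) -> Transport {
    // Oversized/file envelopes always take federation.
    if oversized {
        return Transport::Federation;
    }
    // Same node as a federated peer AND not currently reachable over the
    // radio: fall back to the federation transport.
    if has_federation_identity && !radio_reachable {
        return Transport::Federation;
    }
    // Reachable radio peers, and radio-only contacts, stay on the mesh.
    Transport::Radio
}
```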
Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
- Companion app QR encoded a relative path (/packages/...apk.zip) which
can't resolve when scanned by a phone. Point it at the absolute 146
release-server URL so the download works from any device.
- Dashboard tab-swipe: guard tabs[next] (noUncheckedIndexedAccess) so the
frontend type-checks/builds.
Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
Federated nodes failed to converge to full-mesh across the LAN<->Tailscale
boundary: nodes were invisible to peers, sync 'took ages'/timed out, and
names only updated on a manual sync. Onions were healthy in both directions
(~3-5s); the failures were app-layer.
- B: federation dials fast-fail a dead FIPS path via .fips_timeout(6s) in
sync_with_peer + notify_join, so the Tor fallback isn't stuck behind the
full 30s FIPS budget when LAN and remote peers share no FIPS path.
- A: notify_join (peer-joined) now spawns with retries+backoff instead of a
single awaited best-effort POST, so the join RPC returns instantly (no
'Request timeout') and the inviter reliably learns the joiner (was
asymmetric).
- C: new 90s periodic federation auto-sync (none existed) so renamed nodes
and roster changes propagate without a manual Sync click.
- self-heal: each auto-sync re-asserts membership to any peer that doesn't
list us back, converging the fleet to full-mesh and healing pre-existing
asymmetry with no manual re-joins.
Validated live across 7 nodes: a previously fleet-invisible node became
fully meshed automatically (logs: 'auto-sync ... reasserted=1',
'peer-joined ... delivered').
Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
UI (this session):
- Global audio player now scales the whole interface into the space above it
on desktop (sidebar + main) and docks directly above the tab bar on mobile;
it stays visible while navigating.
- Mesh mobile redesign: floating Chat / BTC / Dead Man / AI / Map tab strip
with a single fixed, internally-scrolling pane (page no longer scrolls);
tabs hide while a conversation is open; floating back button; collapsible
Device panel (starts collapsed); keyboard-aware conversation sizing via
VisualViewport so the chat sits just above the keyboard.
- Cloud file grid: uniform 4/3 card heights (folders + images match).
- Swipe left/right switches tabs on the Apps and Web5 screens.
- Map tool fills its pane (no bottom gap); fix skewed Share Location toggle
on mobile (global min-height rule was deforming the switch).
- Trim redundant helper copy from the mesh AI tab.
Also bundles pre-existing in-progress work that was already in the tree:
mesh listener/session + wallet + container + bitcoin-status backend changes,
docker UI updates, and assorted other UI tweaks.
Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
Fedimint never appeared in Wallet > Settings > Fedimint because the fmcd
(fedimint-clientd) sidecar was never installed: ensure_default_federation()
needs the fmcd password to reach the daemon, found none, and silently
no-oped, leaving the registry empty.
- prod_orchestrator: add fedimint-clientd to the baseline auto-install
set so it self-heals onto every node and auto-joins the default
federation; generate the fmcd-password secret before secret_env
resolves.
- fedimint_client: ensure_fmcd_password (random hex, 0600) shared with
the container's secret_env; from_node reads the same secret (legacy
fmcd/password kept as fallback); reissue_into_any redeems received
notes into the first joined federation that accepts them.
- wallet.ecash-receive: dual-token — cashu* tokens redeem at the mint,
anything else is reissued via fmcd; returns the kind + federation_id.
- UI: receive box advertises "Cashu or Fedimint" and reports which kind.
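The dual-token dispatch in wallet.ecash-receive comes down to a prefix check.
A sketch, assuming the check is a case-sensitive "cashu" prefix on the raw
token; TokenKind and classify_token are hypothetical names.

```rust
#[derive(Debug, PartialEq)]
pub enum TokenKind {
    Cashu,
    Fedimint,
}

/// Tokens starting with "cashu" are redeemed at the mint; anything else
/// is treated as Fedimint notes and reissued via fmcd.
pub fn classify_token(token: &str) -> TokenKind {
    if token.starts_with("cashu") {
        TokenKind::Cashu
    } else {
        TokenKind::Fedimint
    }
}
```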
Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
Meshtastic DMs were falling back to a channel broadcast, so every node
on the LoRa channel saw a "direct" message. Send a directed MeshPacket
(to = node num, decoded from the synthetic pubkey's node-id bytes)
instead — the Meshtastic analog of the meshcore CMD_SEND_TXT_MSG fix.
DMs now reach only the recipient; firmware auto-PKC-encrypts them
end-to-end once NodeInfo keys are exchanged.
Capture E2E status at the driver level (no shared-type/UI change):
- learn each peer's real Curve25519 key from User.public_key (field 8)
and inbound MeshPacket.public_key (16), kept in a side-map separate
from the synthetic routing key so unicast routing is untouched
- detect inbound MeshPacket.pki_encrypted (17) to tell a true E2E DM
from a channel-PSK fallback
- peer_is_pkc_capable() seam for a future mesh-tab E2E badge
Hot-swap preserved: neither the dispatched MeshRadioDevice signatures nor the
shared ParsedContact changed, so meshcore and meshtastic stay
interchangeable behind the listener.
Adds tests/multinode/meshtastic.sh, a two/three-radio on-air parity
harness (detect, discover, DM round-trip, DM privacy, channel
broadcast, typed envelope, reachability).
Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
The offline/reconnecting banners were in-flow (mx-6 mt-6) and pushed the whole
dashboard down when shown. Teleport them to <body> as a fixed, top-centered
overlay with a fade/slide transition and safe-area inset, so they no longer
shift layout.
Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
applicationIdSuffix=".debug" + versionNameSuffix so a debug/test build
installs alongside the release app instead of failing on signature mismatch.
Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
When ArchipelagoNative is present (the Android companion app), openInNewTab()
now calls openInApp(url) so non-iframeable apps open in the in-app WebView
instead of a suppressed window.open popup. Falls back to window.open in a
plain mobile browser. Logic only; no visual change.
Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
The kiosk's "Open in new tab" used window.open(..., 'noopener,noreferrer'),
which the WebView suppresses, so launching apps that can't be iframed did
nothing. Route such node apps (same host) into a local in-app WebView overlay
instead, keeping the kiosk view alive underneath; genuinely external links
still go to the system browser. Wired through onCreateWindow,
shouldOverrideUrlLoading, and a new ArchipelagoNative.openInApp() bridge.
Perf (no visual change): enable setOffscreenPreRaster to stop scroll
checkerboarding, and enable WebView remote debugging on debuggable builds
for chrome://inspect profiling.
Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
Re-key FileGrid on the current folder path and wrap it in a cloud-zoom
Transition so the depth/zoom animation replays at every folder level; the
header + breadcrumb nav stay fixed.
Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
- Cashu default mint was the local Fedimint guardian (:8175), wrongly surfacing
a Fedimint URL in the Cashu mints list. Default is now Minibits
(https://mint.minibits.cash/Bitcoin) — Cashu and Fedimint are distinct
protocols (Fedimint lives under its own tab).
- Peer-file (buy) invoice creation: retry the LND REST call (3× / 400ms) so a
transient LND-REST blip (swap pressure / just-restarted / TLS race) no longer
hard-fails as an opaque 503, and surface the real error chain ({:#}) in the
response + logs instead of a generic "Failed to create invoice".
- Autojoined default federation now shows a friendly name ("Archipelago
Federation") in the Fedimint tab instead of a bare federation id.
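The LND REST retry (3 attempts, 400ms apart) follows the usual bounded-retry
shape. A blocking, std-only sketch; the real call is presumably async, and
the retry helper below is a hypothetical illustration rather than the code
that shipped.

```rust
use std::thread::sleep;
use std::time::Duration;

/// Run `op` up to `attempts` times, sleeping `delay` between failures.
/// Returns the first success, or the last error if every attempt fails.
pub fn retry<T, E>(
    attempts: u32,
    delay: Duration,
    mut op: impl FnMut() -> Result<T, E>,
) -> Result<T, E> {
    let mut last = op();
    for _ in 1..attempts {
        if last.is_ok() {
            break;
        }
        sleep(delay);
        last = op();
    }
    last
}
```

For the invoice path this would be called as retry(3,
Duration::from_millis(400), ...) with a closure that performs the REST
request, so the caller still sees the real error from the final attempt.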
Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
Sizes bitcoind -dbcache to host RAM (~1/16, floor 300MB, cap 4096) instead of a
fixed 2048/4096. A multi-GB UTXO cache on an 8GB node running the full app stack
pushed memory past physical RAM and triggered system-wide swap thrash: the disk
saturated, bitcoind could not answer its own RPC, and the dashboard backend's
sqlite reads stalled — surfacing as fleet-wide /rpc/v1 502s and a blank Bitcoin
UI. Applied in scripts/container-specs.sh (reconciler path) and the config.rs
bitcoin-core path.
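The sizing rule is a clamp on one sixteenth of host RAM. A sketch in Rust,
assuming RAM is already known in MB; dbcache_mb is a hypothetical name and
the shipped logic lives in scripts/container-specs.sh and config.rs.

```rust
/// -dbcache value in MB for a host with `host_ram_mb` MB of RAM:
/// roughly 1/16 of RAM, never below 300 and never above 4096.
pub fn dbcache_mb(host_ram_mb: u64) -> u64 {
    (host_ram_mb / 16).clamp(300, 4096)
}
```

Under this rule an 8GB node (8192 MB) gets 512 MB instead of the old fixed
2048/4096, which is what removes the swap pressure described above.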
Bitcoin status cache now polls every 5s (was 10/15) with an 8s timeout (was 20s)
and fetches the four RPCs concurrently, so the cached snapshot tracks bitcoind's
responsive windows during IBD and the UI stops dwelling on "reconnecting...".
Unifies the divergent discover AppGrid/FeaturedApps image-error handlers onto the
canonical placeholder fallback so missing app icons render the placeholder.
Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
- DMs now use native meshcore unicast (CMD_SEND_TXT_MSG) instead of @DM2 channel
broadcasts: private (E2E-encrypted to the recipient pubkey by firmware), off the
public channel, and decodable by stock clients. Plain text (split, not MC-chunked)
to non-archipelago contacts; typed envelopes to archy peers.
- !ai replies now DM the asker privately (RadioDm) instead of broadcasting on ch0.
- Auto contact-import: a heard advert (PUSH_CONTACT_ADVERT/0x80, 32-byte pubkey) is
added via CMD_ADD_UPDATE_CONTACT (0x09) so contacts appear without a flood advert.
- clear-all now DELETES firmware contacts via CMD_REMOVE_CONTACT (0x0F) instead of
blocklisting; blocking filter removed entirely. Wiped contacts return when reachable.
- Contact reachability: MeshPeer carries last_advert + reachable (path-based); UI shows
a reachability dot.
- Peers list: contact search box (filter by name/DID/npub/pubkey) with a clear button.
- send_message routes stock contacts as plain native text (fixes garbled envelopes).
Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
- mesh: stop broadcasting ARCHY:2 identity on the public channel (startup + every advert tick); receive path still parses inbound. No more public-channel spam.
- mesh assistant: trigger on !ai/!ask typed in 1:1 chat (was only the dead AssistQuery path + bare channel text); route the reply transport-aware via MeshService::send_message (Tor for federation peers, LoRa for radio) through a new AssistChatReply event consumed at the server layer — fixes replies never reaching federation askers.
- mesh assistant: per-contact !ai allowlist (allowed_contacts) bypassing trusted_only; config + RPC + is_sender_allowed.
- fedimint-clientd manifest: network_policy open -> bridge (invalid value made the loader skip the whole manifest, so fmcd never ran and federations never joined/listed).
- ui: AI panel — Claude model dropdown (Haiku/Sonnet/Opus presets) + allowlist contact picker.
- ui: Settings — App Updates + App Registry moved under Account.
- ui: mesh chat — overscroll-behavior: contain so chat scroll no longer bleeds to the contacts panel.
Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
Component tests mounted without main.ts's bootstrap, so the $ver global
template helper (app.config.globalProperties.$ver = displayVersion) was
undefined — AppSidebar/AppHeroSection/MarketplaceAppCard tests failed with
"_ctx.$ver is not a function", blocking the release gate's ui-unit-tests
stage. Add a vitest setup file that mirrors main.ts via config.global.mocks
and wire it into vitest.config.ts.
Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
Adds the assistant scheduler, MeshAssistantPanel UI, and the remaining
config-RPC / live-toggle / Ollama-detect wiring on top of Phase 1.x.
Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
Phase 2 backend. AssistantConfig is now live-updatable (RwLock) so the UI
toggle applies without a listener restart. New RPCs:
- mesh.assistant-status -> {enabled, model, trusted_only, default_model,
ollama_detected, models[]} (probes local Ollama :11434/api/tags)
- mesh.assistant-configure -> set enabled/model/trusted_only live + persist
MeshService::assistant_config / configure_assistant. Compiles clean.
Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
The ARCHY:2 identity broadcast (DID + ed25519 + x25519) was unwired dead
code on both send and receive. Wiring it lets a radio peer prove its
archipelago identity, so the assistant's trusted-only gate (and encrypted
DMs) work over meshcore AND Meshtastic — the latter otherwise only exposes
synthetic node keys.
- session.rs: broadcast ARCHY:2 as channel text at startup + each advert tick
- frames.rs: parse inbound ARCHY:2 on the channel path, dedupe-keyed by
archipelago pubkey (federation_peer_contact_id) so it MERGES with the
federation-seeded peer instead of duplicating; self-echo guarded
- threads our_x25519_secret into handle_channel_payload (was reserved)
Reuses the existing handle_identity_received verifier (ed/x25519 consistency
check + shared-secret derivation). Compiles clean. Needs a live 2-radio test
before trusting trusted-only over radio.
Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
A plain '!ai <q>' / '!ask <q>' on the channel is now answered by the node's
local model and broadcast back as plain text, so ANY client (bare meshcore
or Meshtastic) can ask. Generalised run_assist with an AssistReply target:
Typed chunks to a peer (archipelago UI path) vs plain channel-text (bare
clients). Trust/rate gate unchanged; asker identity is separate from reply
mode. Works over both radios.
Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
Headless containers (databases, APIs, backends without a UI) belong in a
tab labelled 'Services', not 'Websites'. The categorisation logic already
routes UI-less packages there (built under #45); this finishes the rename
of the user-facing label across Apps, Marketplace, Discover and the mobile
nav, and makes 'services' the canonical tab state/query param. Old
?tab=websites bookmarks still resolve (back-compat acceptor kept).
Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
Destructure the first 4 pubkey bytes into typed locals so vue-tsc's
noUncheckedIndexedAccess doesn't fail the build (the bytes.length<4 guard
doesn't narrow per-element access). No behaviour change.
Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
The IndeeHub API needs MinIO (object storage) up to serve, but the
health monitor's dependency map listed only postgres + redis, so it
would restart the API while MinIO was still starting — the "recovers
only after 1-2 container restarts" symptom. Add indeedhub-minio to the
API's deps; MinIO has no deps of its own so the monitor restarts it
first, no deadlock. (First-start ordering in the stack definition is a
deeper, separate follow-up.)
Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
federation.remove-node only edited nodes.json, so a removed/renamed node
(e.g. a stale "Arch HP") lingered in the mesh chat list with its old
thread. Capture the node's pubkey before removal, then purge its
synthetic mesh peer, shared secret, messages, presence, and persisted
contact entry via the new mesh::purge_federation_peer. Combined with the
#42 name refresh, stale federation contacts can now be fully cleaned from
a node.
Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
Per the rule that only front-end apps with a UI belong in "My Apps"
(databases/backends/headless go to Websites), make the manifest's
interfaces.main.ui the deciding signal. isWebsitePackage now treats any
package that declares a UI as an app even when it isn't in the curated
APP_CATEGORY_MAP, and lets headless LAN-reachable packages fall through to
Websites. Additive: service-by-name infra and curated known apps are
unchanged, so no currently-correct app moves.
Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
Settings showed the node-level Nostr key (HKDF derive_node_nostr_key,
read via node.nostr-pubkey) while Web5 > Identities showed the identity
record's own key — the mirrored "Node" identity stores nostr=None and
seed identities use a different BIP-32 NIP-06 key, so the two surfaces
disagreed.
Resolve the node-level Nostr key once in identity.list and override it
onto whichever identity record is the node's own
(ed25519 == server_info.pubkey). Display-only: no stored key is rewritten,
so it self-applies to existing nodes with no migration and the discovery
identity is unchanged.
Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
A peer accepted via invite is seeded into the mesh peer table with
name=None, so it shows as "Archipelago <pubkey8>" in chat. Federation
sync later learns the real name (update_node_state writes it to
nodes.json) and discovers transitive peers (merge_transitive_peers),
but nothing pushed those into the live mesh peer table — the chat list
stayed stale until the next mesh restart, and transitive peers never
appeared as contacts at all.
Add RpcHandler::refresh_federation_mesh_peers() (re-runs the idempotent,
onion-deduped seed_federation_peers_into_mesh) and call it after every
periodic sync cycle (server.rs) and after the manual federation.sync-all
RPC. Names now correct themselves and the full roster meshes within a
sync cycle, no restart needed.
Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
phase4-streaming-ecash-plan.md: design for ecash-paid swarm transport, paying
across different mints (§2a, Lightning-bridged swaps), networking-through-nodes
relay, and an IndeeHub "Archipelago" content source. Records the resolved
iroh-blobs paid-serving spike. dht-RESUME.md: task #12 + step F marked done.
Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
Adds a Settings control to the Networking Profits card that opens a new page
where the operator controls what their node charges sats for and how much.
Drives the existing streaming.list-services / streaming.configure-service RPCs;
"free everything" is the default (all priced services ship disabled, surfaced
with a reassurance banner). New route web5/networking-profits + common.settings
i18n (en/es).
Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
Phase 3 wiring (task #12):
- NostrSeedDiscovery: async ProviderDiscovery that queries relays for signed
seed adverts and parses endpoint ids (swarm/iroh_provider.rs, seed_advert.rs).
- seed_and_advertise publish path; dep-free fetch/publish helpers reuse the
node's Nostr identity (build_nostr_client/load_or_create_nostr_keys made
pub(crate)).
- swarm::init builds the IrohProvider once into a OnceLock runtime; providers()
returns it; announce_held_blob() is called from update.rs after a release
component passes both hash gates.
- config swarm_enabled (ARCHIPELAGO_SWARM_ENABLED, default off); server.rs init.
Paid swarm serving (Phase 4 step F):
- swarm/paid.rs gates the iroh-blobs provider through streaming::gate,
intercepting connect + GET (peer push hard-disabled). Free by default
(content-download service disabled); denies unpaid peers when enabled;
fails open on internal error so a payment fault never blocks distribution.
Wired into IrohProvider::new.
All iroh code behind the iroh-swarm feature; the default build is inert.
Default build clean; --features iroh-swarm: 11/11 swarm tests pass.
Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
bitcoin-core was missing from APP_CATEGORY_MAP, so isKnownApp() was false and
isWebsitePackage() fell through to 'has a runtime LAN address'. Once the running
container's LAN address (the bitcoind RPC port :8332) showed up ~a minute after
launch, Bitcoin Core was reclassified as a website: it dropped out of the Apps
tab and search, moved under Websites, and launching it opened :8332 (raw RPC)
instead of the :8334 custom UI that Knots opens.
Add 'bitcoin-core': 'money' alongside bitcoin-knots/bitcoin-ui so isKnownApp is
true, isWebsitePackage is false, and launchAppNow routes through openSession ->
resolveAppUrl (:8334 custom UI). Fixes search, category, and the launch URL.
Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
Messaging a federation-only peer (e.g. 'Arch Dev') failed with 'Missing
contact_id'. The UI gave federation-only rows a *negative* placeholder
contact_id derived from a DID hash, but the backend parses contact_id as u64,
so a negative value deserialized to None. The negative id also never matched
the positive federation-synthetic id that federation-routed messages are stored
under, so those threads looked empty.
- Frontend: derive the SAME positive federation-synthetic id the backend uses
(federationContactId mirrors federation_peer_contact_id) so mesh.send accepts
it and messages thread correctly.
- Backend: send_typed_wire now resolves a federation-synthetic contact_id from
nodes.json when it isn't in the live mesh peer table (radio-less node),
instead of bailing 'Unknown federation peer'.
Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
Large peer downloads (~178MB) failed with a generic 'Operation failed', and
the download path had three stacked problems:
- The FIPS reqwest client used a hard-coded 20s total timeout regardless of the
caller's .timeout(), so a big transfer over the mesh aborted at 20s before
the Tor fallback could help. Honor the per-request timeout (client_with_timeout).
- The peer-content proxy buffered the whole file into node memory via
resp.bytes() before sending a byte, and capped the transfer at 60s. Stream
the body through with hyper::Body::wrap_stream (constant memory) and raise the
timeout to 900s; bump the nginx peer-content read timeout to match.
- Free downloads pulled the file as base64 over RPC, doubling it in node memory
and the browser — fatal for large files. Download free files by streaming
from /api/peer-content straight to disk, after a 1-byte Range probe that
surfaces the real reason (peer offline on mesh and Tor) instead of a generic
failure. Paid downloads now return the real error through the {error} channel
the UI already displays.
Adds the reqwest 'stream' feature for bytes_stream().
Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
The prod orchestrator only checked whether a build-image tag was *present*
before deciding to skip the build. The local UI images (bitcoin-ui, lnd-ui,
electrs-ui) COPY a built neode-ui dist, so a UI update changed the source but
left the old tag in place and the new UI never shipped.
Gate the build on a content fingerprint of the build context (sorted relative
path + length + mtime, SHA-256) recorded in a per-tag stamp under data_dir.
Rebuild whenever the fingerprint differs from the one that produced the
existing image; podman's own COPY-layer cache keeps a no-op rebuild cheap.
Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
Single source of truth for picking the DHT work back up after a restart:
worktree/branch rules, all phase commits, the exact next task (#12 Phase 3
glue), build-time facts, and the Phase 0 go-live ceremony.
Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
Apps could fail install when a stack member exited on its first start
because a dependency (db/redis/the bitcoin node) was not ready yet — a
transient crash, not a broken install. wait_for_stack_containers now
restarts each exited/dead container up to 3 times before declaring the
install failed; the runtime supervisor keeps it alive afterwards.
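
The bounded restart described above can be sketched as follows. This is a minimal illustration, not the actual wait_for_stack_containers body: the ContainerState enum, the probe/restart closures and the function name are hypothetical stand-ins for the real podman calls.

```rust
/// Hypothetical, simplified view of a container's state.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
enum ContainerState {
    Running,
    Exited,
}

/// The commit message states "up to 3 times".
const MAX_RESTARTS: u32 = 3;

/// Polls `probe`; while the container is exited, calls `restart` at most
/// MAX_RESTARTS times. Returns the number of restarts used on success.
fn wait_with_restarts<P, R>(mut probe: P, mut restart: R) -> Result<u32, String>
where
    P: FnMut() -> ContainerState,
    R: FnMut(),
{
    let mut restarts = 0;
    loop {
        match probe() {
            ContainerState::Running => return Ok(restarts),
            ContainerState::Exited if restarts < MAX_RESTARTS => {
                restart();
                restarts += 1;
            }
            ContainerState::Exited => {
                return Err(format!("still exited after {} restarts", MAX_RESTARTS));
            }
        }
    }
}
```

A real implementation would presumably sleep between probes; that is left out here to keep the sketch dependency-free.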
Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
The discovery wire format that feeds the swarm's ProviderDiscovery seam: a
node announces 'I seed blake3 H from iroh endpoint E' as a signed NIP-33
addressable Nostr event. Scope is releases/catalog content ONLY (decided
2026-06-16) — never private user blobs.
- swarm/seed_advert.rs: kind 30081, d-tag = blake3 hex (one current advert
per author+hash, latest-replaces), content {"v":1,"endpoint_id":...}.
advertisement_builder / advertisement_filter / parse_endpoint_id /
endpoint_ids_from_events (dedup). Endpoint ids stay opaque strings so the
protocol is dep-light + unit-testable on the default build.
4/4 tests pass (sign->parse roundtrip, filter targeting, reject wrong-kind/
empty, dedup across nodes).
Next (task #12): gated NostrSeedDiscovery glue (query relays, parse ids ->
iroh::EndpointId), publish path, wire swarm::providers().
Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
Pulls iroh 1.0 + iroh-blobs 0.103 as OPTIONAL deps under the iroh-swarm
feature and implements a real BlobProvider over them. Verified: the full
iroh QUIC dep tree (260 pkgs) resolves and compiles against the pinned
bitcoin/nostr-sdk/reqwest-rustls stack; the provider compiles against the
0.103/1.0 API.
- swarm/iroh_provider.rs: IrohProvider::new binds a QUIC Endpoint, opens a
persistent FsStore (data_dir/iroh-blobs), and serves blobs via the
iroh-blobs protocol/Router — a node that fetches also SEEDS. try_fetch
maps ContentDigest -> iroh Hash, asks discovery for seed EndpointIds, then
downloader.download(hash, providers) (range-verified) + export to staging.
- ProviderDiscovery trait: the seam Phase 3 (signed Nostr advertisement
events) fills. discovery=None -> no seeds -> origin-only, so enabling the
feature is never worse than today.
- Default build untouched: iroh is optional, the module is cfg-gated, and
providers() stays empty until Phase 3 wires discovery in.
Build: cargo build --features iroh-swarm succeeds (dev). Default build +
44 swarm/update/content_hash/blobs tests unchanged.
Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
The backend already sends did in federation peer lists, but the Peer
type omitted it and federationNodeToPeer() dropped it when mapping. Add
did?: string to Peer and pass node.did through, so trusted/observer
node rows route to Federation/Mesh by their real DID (falling back to
pubkey/onion) instead of failing the build on a missing property.
Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
Lands the transport/swarm orchestration layer (the iroh engine attaches
later, behind a flag). The seam is fully exercised today with the origin
HTTP path; with no swarm providers registered the behaviour is byte-for-byte
identical to before.
- swarm/mod.rs: BlobProvider trait + fetch_content_addressed() — tries each
provider in order, VERIFIES peer-sourced bytes against the content digest
before accepting (untrusted seeds can't inject tampered bytes), falls back
to the origin closure if none serve. Returns Swarm|Origin.
- Cargo: iroh-swarm feature (off by default; heavy QUIC dep tree attaches
here). providers() is empty until enabled → every fetch hits origin.
- update.rs: components with a BLAKE3 digest route through the seam, using
the existing resumable HTTP downloader as the origin fallback; a swarm hit
is re-checked against the mandatory SHA-256 manifest gate (re-fetch from
origin on any disagreement). Components without blake3 take the original
path untouched.
44/44 swarm/update/content_hash/blobs tests pass (incl. swarm hit/miss,
tampered-bytes-rejected→origin, fall-through ordering).
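
A rough sketch of the seam's control flow, under stated assumptions: the trait and function names come from this message, but the signatures are guesses, digests are plain strings, and `verify` is a caller-supplied stand-in for the real BLAKE3 check.

```rust
/// Where the accepted bytes came from.
#[derive(Debug, PartialEq, Eq)]
enum Source {
    Swarm,
    Origin,
}

/// Assumed shape of a swarm provider; the real trait may be async.
trait BlobProvider {
    fn try_fetch(&self, digest: &str) -> Option<Vec<u8>>;
}

/// Tries each provider in order, accepts only bytes that pass `verify`,
/// and falls back to `origin` when no provider serves valid content.
fn fetch_content_addressed<V, O>(
    providers: &[&dyn BlobProvider],
    digest: &str,
    verify: V,
    origin: O,
) -> Result<(Vec<u8>, Source), String>
where
    V: Fn(&[u8], &str) -> bool,
    O: FnOnce() -> Result<Vec<u8>, String>,
{
    for provider in providers {
        if let Some(bytes) = provider.try_fetch(digest) {
            if verify(&bytes, digest) {
                return Ok((bytes, Source::Swarm));
            }
            // Untrusted seed returned bytes that do not match: drop them
            // and keep going.
        }
    }
    origin().map(|bytes| (bytes, Source::Origin))
}
```

With an empty provider slice the loop never runs, so every fetch goes straight to the origin closure, which matches the "no providers registered" behaviour described above.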
Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
Adds the iroh-native, range-verifiable hash next to the incumbent SHA-256
so the swarm can later fetch/verify by BLAKE3 with the registry/origin as
fallback. Non-breaking: SHA-256 stays the mandatory gate; BLAKE3 is verified
only when present.
- content_hash.rs: HashAlg + ContentDigest (parse/verify '<alg>:<hex>'
multihash strings), blake3_hex/sha256_hex; BLAKE3 known-answer test
- update.rs: ComponentUpdate.blake3 (serde-default); verified ALONGSIDE
SHA-256 in the resumable download loop, re-download on mismatch
- blobs.rs: BlobMeta.blake3 computed on put (on-disk path stays
SHA-256-keyed for back-compat; advertises the future swarm address)
Drive-by: fix a pre-existing stale test (test_save_and_load_state_roundtrip)
that never wrote the .download-complete marker #26 requires, so load_state's
self-heal cleared update_in_progress. Unrelated to BLAKE3 — surfaced by
running the full update:: suite.
40/40 content_hash/update/blobs tests pass.
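
For illustration, parsing the '<alg>:<hex>' form might look like the sketch below. The type names follow this message; the label strings "sha256" and "blake3", the error type and the matches_hex helper are assumptions, and the hashing itself is left out because it needs external crates.

```rust
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
enum HashAlg {
    Sha256,
    Blake3,
}

#[derive(Debug, Clone, PartialEq, Eq)]
struct ContentDigest {
    alg: HashAlg,
    hex: String,
}

impl ContentDigest {
    /// Parses "<alg>:<hex>". Both algorithms produce 32-byte digests,
    /// so the hex part must be exactly 64 hex characters.
    fn parse(s: &str) -> Result<Self, String> {
        let (alg, hex) = s
            .split_once(':')
            .ok_or_else(|| format!("missing ':' in {s:?}"))?;
        let alg = match alg {
            "sha256" => HashAlg::Sha256, // assumed label
            "blake3" => HashAlg::Blake3, // assumed label
            other => return Err(format!("unknown algorithm {other:?}")),
        };
        if hex.len() != 64 || !hex.bytes().all(|b| b.is_ascii_hexdigit()) {
            return Err("digest must be 64 hex characters".to_string());
        }
        Ok(ContentDigest {
            alg,
            hex: hex.to_ascii_lowercase(),
        })
    }

    /// Compares against a hex digest computed elsewhere (case-insensitive).
    fn matches_hex(&self, computed_hex: &str) -> bool {
        self.hex.eq_ignore_ascii_case(computed_hex)
    }
}
```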
Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
The #26 fix makes has_staged_update require the .download-complete
marker, so the state self-heal treats a marker-less staging dir as a
partial download and clears update_in_progress. The roundtrip test
staged a binary file but not the marker, so it began failing. Write
the marker to simulate a *complete* staged update.
Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
Completes the parked trust module and wires it into the live build:
- main.rs: register `mod trust`
- app_catalog::fetch_one: verify the release-root detached signature when
present (verify against raw JSON so forward-compat fields stay in the
signed preimage); accept unsigned during the migration window, hard-reject
a present-but-bad signature so a tampering mirror can't pass altered bytes
- seed: pin release-root Ed25519 known-answer test (priv+pub) for the
signing ceremony / pinned-anchor / external-verifier cross-check
- signed_doc: drop unused import
20/20 Phase 0 unit tests pass (trust::canonical/did/signed_doc/anchor,
seed release-root, app_catalog). Crate compiles clean.
Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
Moved here so main stays clean for the v1.7.98 release. Contains the trust/
module (canonical.rs, did.rs, signed_doc.rs) + seed::derive_release_root_ed25519.
Not wired into the build yet. Continue this work on this branch.
Captures the verified 2026-06-16 design: swarm-assist/origin-always-wins,
iroh-blobs as the swarm engine, BLAKE3 addressing, signed Nostr/release-root
authenticity, and the Phase 0-4 plan. Foundation doc for the dht branch.
Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
The kiosk chromium pinned ~92% of a core (software-compositing spin from
--enable-gpu-rasterization on a GPU-less/headless node), saturating the machine
and starving the backend + container builds — it caused the .198 receive timeout
and the deploy storms.
- archipelago-kiosk.service: CPUQuota=75% + MemoryMax/High + Delegate, so a
runaway kiosk can never take the whole node down.
- archipelago-kiosk-launcher.sh: detect /dev/dri — use GPU rasterization only
when a GPU exists, else --disable-gpu (avoids the headless spin).
- bootstrap::ensure_kiosk_hardened: OTA self-heal that installs the updated
unit+launcher on already-deployed nodes, daemon-reloads, and only try-restarts
a *running* kiosk (never re-enables an operator-disabled one).
cargo check clean; launcher bash -n clean; unit syntax valid.
Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
immich_server/redis/postgres + indeedhub-* are multi-container stack members
whose sub-container app_ids are NOT in package_data, so the health monitor skips
them as "orphans" and never restarts them when they exit — Immich/IndeedHub stay
down until the next reboot (the boot-only start_stopped_stack_containers was the
only recovery). Spawn a 120s supervisor that reuses that same recovery at
runtime. It cheaply skips already-running containers and honours the user-stopped
list (set on every container by package.stop), so it only revives genuinely
crashed members and never fights a user stop.
cargo check clean.
Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
- All four tabs (trusted/observers/messages/requests) capped at max-h-72 with
internal scroll, so the screen stays short instead of growing very long.
- Clicking a node row navigates to that node in the Federation screen
(?node=did); the Message button (stop-propagation) deep-links to that peer's
mesh chat (?peer=), using the Mesh.vue ?peer handler.
type-check clean.
Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
A resumable-but-failed download leaves partial component files in update-staging.
has_staged_update() treated ANY staged file as "install-ready", so the state
self-heal kept update_in_progress=true and the UI showed Install instead of
Download (no clean retry).
- update.rs: write a .download-complete marker only after EVERY component
downloads+verifies; has_staged_update() now checks that marker. Partial/failed
downloads (no marker) correctly read as not-staged → self-heal clears
update_in_progress → UI shows Download. Resume still works (partial files kept).
- SystemUpdate.vue: on a genuine download failure, reset downloaded/in_progress
and re-sync, so the user lands back on Download immediately.
cargo check + vue-tsc clean.
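
A minimal sketch of the marker gate, assuming the marker is an empty file inside the staging directory. has_staged_update and the marker name come from this message; mark_download_complete is a hypothetical helper name.

```rust
use std::fs;
use std::io;
use std::path::Path;

const COMPLETE_MARKER: &str = ".download-complete";

/// Called only after every component has downloaded and verified.
fn mark_download_complete(staging: &Path) -> io::Result<()> {
    fs::write(staging.join(COMPLETE_MARKER), b"")
}

/// A staging dir counts as install-ready only when the marker exists;
/// leftover partial files alone do not qualify.
fn has_staged_update(staging: &Path) -> bool {
    staging.join(COMPLETE_MARKER).is_file()
}
```

Partial files stay in place, so a resumed download can pick them up while the state still reads as not staged.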
Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
sendArchMessage looped over every federation node sequentially (await
sendMessageToPeer per node), so the spinner stayed up until the slowest/offline
node's Tor request finished — long after online peers had received the message.
Send to all peers concurrently (Promise.allSettled); the spinner now clears
after the slowest single delivery, not the sum.
Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
- DID: the Identity card read the DID only from localStorage('neode_did'), so
nodes/browsers that never cached it (e.g. .116/.228) showed no DID. Fall back
to the node.did RPC and cache it — the DID now shows everywhere.
- npub: add the node's seed-derived Nostr public key (npub) to the Identity card
next to the DID + onion, fetched from node.nostr-pubkey, with a copy button.
Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
Make each peer file card a flex column filling its grid cell (flex flex-col
h-full) and pin the body row (filename + Play/Download) with mt-auto, so cards
with a media preview and cards without line their footers up across the row.
Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
User: chat history (messages + mesh/Tor contacts) must persist and be
secure/encrypted per best practice. Root cause of the .198 loss was the B17
mount race writing empty stores over real data (B17 already fixes the trigger);
this hardens storage so it can never silently lose or expose data:
- storage_crypto: shared at-rest envelope mirroring credentials::store — key =
SHA-256(domain ‖ node identity key) (seed-derived, per-store domain
separation), ChaCha20-Poly1305 AEAD with a random 96-bit nonce, tamper-evident.
Transparent migration of legacy plaintext files. Unit-tested (round-trip,
wrong-key/tamper rejection, plaintext detection).
- messages.json: encrypted at rest + ATOMIC write (temp+rename) so a crash/
reboot mid-write cannot corrupt history; decrypt-with-migration on load; a
failed decrypt never overwrites the on-disk data.
- mesh contacts (alias/notes/pinned/blocked): were ONLY in memory and lost on
every restart — now persisted to mesh-contacts.json (encrypted, atomic),
loaded on MeshState startup, saved after contacts-save/contacts-block.
Explicit clear (mesh.clear-all) still wipes everything, as intended.
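
The temp+rename pattern mentioned above, sketched with std only. The function name and the ".tmp" suffix are assumptions, and encryption is left out because it needs an AEAD crate.

```rust
use std::fs::{self, File};
use std::io::{self, Write};
use std::path::{Path, PathBuf};

/// Writes `bytes` to a sibling temp file, flushes it to disk, then renames
/// it over `path`. A crash mid-write leaves the previous file untouched.
fn write_atomic(path: &Path, bytes: &[u8]) -> io::Result<()> {
    // Build "<path>.tmp" in the same directory so the rename stays on one
    // filesystem.
    let mut tmp_name = path.as_os_str().to_owned();
    tmp_name.push(".tmp");
    let tmp = PathBuf::from(tmp_name);

    let mut file = File::create(&tmp)?;
    file.write_all(bytes)?;
    file.sync_all()?; // make sure the data is on disk before the rename
    drop(file);

    fs::rename(&tmp, path)
}
```

On Unix, renaming over an existing file replaces it in a single step, which is what makes the write crash-safe.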
Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
User priority: FIPS is the main transport but it was unreliable and needed a
manual "Activate" button. Improvements (all in the FIPS dial/supervisor):
- Auto-activate: ensure_activated() installs the daemon config + starts the
service on its own once seed onboarding has materialised the key — no Activate
button needed. Idempotent; runs from the supervisor every 45s so a node that
onboards after boot still comes up automatically.
- Dial retry: try_fips_get/post now retry ONCE on a connect/timeout error. The
first dial to a peer triggers NAT hole-punching and often times out before the
path is up; the retry lands on the now-warm path — the main reason calls were
dropping to Tor despite the peer being FIPS-reachable.
- More patient connect_timeout (5s→8s) so a reachable-but-cold peer isn't
abandoned to Tor while hole-punching completes.
- Path warmer: spawn_fips_supervisor() keeps hole-punched paths to known
federation peers warm (every 45s, concurrent), so on-demand dials are fast and
land on FIPS.
- Confirmed the daemon config already enables BOTH udp + tcp transports
(render_config_yaml), so FIPS already uses TCP where UDP is blocked; the Tor
fallback was a path-establishment problem, addressed by the changes above.
cargo check + fmt clean. Backend — needs a binary rebuild+deploy to validate on
.116/.198 (watch last_transport flip fips, and FIPS coming up with no button).
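
The retry-once rule can be sketched like this. DialError and dial_with_one_retry are hypothetical names; the real try_fips_get/post are presumably async and work on reqwest errors.

```rust
/// Hypothetical error classes for a FIPS dial.
#[derive(Debug, Clone, PartialEq, Eq)]
enum DialError {
    Connect,
    Timeout,
    Http(u16),
}

/// Only connect/timeout failures are worth a second attempt: they are the
/// ones a cold, still hole-punching path produces.
fn is_retryable(e: &DialError) -> bool {
    matches!(e, DialError::Connect | DialError::Timeout)
}

/// Runs `dial`; on a retryable error runs it exactly once more and returns
/// whatever the second attempt produced.
fn dial_with_one_retry<T, F>(mut dial: F) -> Result<T, DialError>
where
    F: FnMut() -> Result<T, DialError>,
{
    match dial() {
        Err(e) if is_retryable(&e) => dial(),
        other => other,
    }
}
```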
Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
- Add a close (X) button to the message toast (closeToast, @click.stop) like the
system notifications.
- Carry the sender pubkey on the toast; clicking now deep-links to that
conversation (/dashboard/mesh?peer=<pubkey>) instead of the generic mesh page.
- Mesh.vue reads ?peer= on mount and opens the matching peer (by pubkey_hex/did),
gracefully falling back to the mesh list when no match (B1/B2 identity).
type-check clean; useMessageToast tests 11/11.
Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
Streaming a peer file connects over mesh/Tor before the first frame, so the
player sat blank. Add a loading state:
- PeerFiles video modal: spinner overlay ("Connecting to peer…") until the
<video> fires playing/canplay; an error overlay on failure instead of a
silent black box.
- useAudioPlayer: loading flag driven by loadstart/waiting vs canplay/playing;
GlobalAudioPlayer shows a spinner in the transport button while connecting.
- Fix the misleading audio error "Could not play audio. File Browser may not be
running." (wrong for peer content) → "Could not play this audio file. The peer
may be offline…" (B22).
type-check clean; useAudioPlayer tests 10/10.
Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
- docker/fedimint-ui/nginx.conf: the local /assets/ handler 404'd the real
fedimint guardian UI's own bundled CSS (bootstrap.min.css, style.css) →
unstyled app. B13 fixed our local icon; this adds a @guardian_assets proxy
fallback to :8177 so the guardian's own /assets/* resolve. Verified live on
.116: /app/fedimint/assets/bootstrap.min.css 404→200 text/css. (needs
archy-fedimint-ui image rebuild to persist on nodes.)
- Home.vue: Quick Start Goals card regained lg:col-span-2 so it fills its row
on desktop instead of sitting at half width.
Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
The B4 fix made listDirectory require a JSON content-type (to detect the
SPA-fallback HTML / 502 cases) and changed the non-OK error string, but its
tests still mocked headerless responses + the old message, so they failed —
which also polluted the run and tripped AppIconGrid's teardown. Give the JSON
mock a content-type, update the non-OK expectation, and add a test for the
guard's friendly-error path. Full suite now 667/667 green.
Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
On production nodes /var/lib/archipelago (the app data dir AND podman's
graphroot=/var/lib/archipelago/containers/storage) is a separate
device-mapper volume. archipelago.service ordered only
After=network-online.target, so on cold boots it (and its ExecStartPre) could
start BEFORE var-lib-archipelago.mount, write to the bare mountpoint on rootfs,
fail every
podman call, exit, and be restarted every 5s until the volume mounted — the
"~20x [FAILED] Failed to start over ~5min" boot flap. Proven live on .198:
"var-lib-archipelago.mount: Directory /var/lib/archipelago to mount over is
not empty, mounting anyway" — the service had written there pre-mount.
Fix: RequiresMountsFor=/var/lib/archipelago (adds Requires= + After= on the
mount unit).
- image-recipe/configs/archipelago.service: ships the directive on fresh ISOs.
- bootstrap::ensure_archipelago_mount_ordering(): self-heals already-deployed
nodes' installed unit + daemon-reload (boot-ordering only, effective next
reboot; never restarts the running service). Idempotent; harmless on rootfs
installs (maps to the always-mounted root).
Verified on .198: after applying, systemctl shows
After=var-lib-archipelago.mount and systemd-analyze verify is clean.
Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
The Home > System bitcoin tile is gated on bitcoinAvailable===true, so any
transient bitcoin.getinfo failure (RPC busy during heavy IBD, route-change
scan) could blank it even though the node is fine. Add a bitcoinStale flag:
- getinfo fails while the container is Running, or package data is momentarily
absent → retain the last-known value and mark it stale (tile stays, shows
"Updating…" instead of a frozen figure presented as live).
- container authoritatively Stopped/Exited → flip to not-available as before
(no stale-as-live).
- first-ever poll times out but container Running → show the tile as updating
rather than staying hidden on a syncing node.
Harness: src/stores/__tests__/homeStatus.test.ts (6 cases) — red before, green
after. type-check clean.
Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
CORE_RPC_HOST was hardcoded to bitcoin-knots in three env-render paths, so on a
bitcoin-core node (container named bitcoin-core) mempool-api could not reach
Bitcoin RPC. Both node variants are reachable on archy-net by container name —
only the name differs.
- Legacy direct-podman (stacks.rs) and config.rs::get_app_config now use a new
dependencies::detect_bitcoin_rpc_host() (pure, unit-tested pick_bitcoin_host).
- Quadlet/manifest path (the modern fleet default): add a {{BITCOIN_HOST}}
derived-env placeholder — HostFacts.bitcoin_host + resolve_derived_env render
it; prod_orchestrator detects Knots/Core via podman ps, resolved on demand
only for manifests that use the placeholder. mempool-api manifest moves
CORE_RPC_HOST from static env to derived_env: {{BITCOIN_HOST}}.
Tests: pick_bitcoin_host (5 cases incl. substring safety), container-crate
resolve_derived_env, and orchestrator mempool_core_rpc_host_follows_bitcoin_node
(core->bitcoin-core, knots->bitcoin-knots). No-regression confirmed: picker
returns bitcoin-knots live on .198. Live bitcoin-core validation pending (no
core node available). Sibling hardcodes (lnd/btcpay/electrumx/fedimint) tracked
as B12b.
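
One way a pure picker with "substring safety" could look. This is a guess at the behaviour, not the actual pick_bitcoin_host: exact-name matching, preferring Knots when both are present, and returning None when neither runs are all assumptions.

```rust
/// Picks the RPC host from the names of running containers.
/// Matching is exact, so "bitcoin-core-exporter" or "bitcoin-ui" never
/// count as a node (the "substring safety" case).
fn pick_bitcoin_host(running: &[&str]) -> Option<&'static str> {
    let mut core_running = false;
    for name in running {
        match *name {
            // Assumed priority: Knots wins if both happen to be running.
            "bitcoin-knots" => return Some("bitcoin-knots"),
            "bitcoin-core" => core_running = true,
            _ => {}
        }
    }
    if core_running {
        Some("bitcoin-core")
    } else {
        None
    }
}
```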
Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
The B13 template fix only fixed fresh ISOs. Already-deployed nodes keep their
old nginx config, where /app/fedimint/ proxies to :8175 without rewriting the
Guardian UI's root-rooted asset URLs (src="/assets/...", url("/assets/...")).
Those resolve against the SPA root: bg-network.jpg exists there by luck, but
app-icons/fedimint.jpg 404s (location /assets/ uses try_files =404) — the
visibly-broken icon.
bootstrap.rs::patch_nginx_conf now heals both paths on startup:
- Style A (main conf, HTTP): swaps the old single nostr-provider sub_filter tail
for the full reroot set; byte-matches the shipped template.
- Style B (HTTPS app-proxy snippet): the snippet's fedimint block has no
sub_filter and a per-node-varying trailing directive, so anchor on the unique
:8175 proxy_pass and insert the reroot set after it (nginx ignores directive
order). Snippet added to the bootstrap nginx loop (skipped on HTTP-only nodes).
missing_* flags are now gated on their splice anchors so the included snippet
neither attempts the main-conf-only patches nor logs warn-skips every boot.
Idempotent via the 'href="/' 'href="/app/fedimint/' marker.
Verified on .198 (both paths): fedimint app-icon 404 -> 200 image/jpeg; nginx -t
OK; containers survived restart (Quadlet); idempotent steady state, no warn spam.
Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
Fedimint UI HTML/CSS reference absolute /assets/* paths; under /app/fedimint/
those hit the main SPA, not the fedimint container, so the UI renders
unstyled. Add the proven sub_filter asset-rewrite pattern (as indeedhub/
botfights use) to the /app/fedimint/ block in the nginx template + https
snippet (also rewrites url(...) for the CSS background image). Bootstrap
self-heal for already-deployed nodes is the documented resume point.
Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
B15: Home system stats (incl. bitcoin sync %) polled every 30s — too slow;
now 10s so sync progress tracks the actual block height more closely.
B7: the ElectrumX sync overlay was gated only on status!=='synced', so if
the status never flipped to 'synced' (ElectrumX stale/disconnected) the loader
stayed on top forever. Now the overlay hides and the app iframe loads when
the sync status is stale (fail-open), while still showing during active
indexing. type-check EXIT 0.
Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
The B3 streaming proxy endpoint existed in the backend but nginx had no
location for /api/peer-content/*, so the browser's requests fell through to
the SPA (200 text/html) and media still wouldn't play. Add an
NGINX_PEER_CONTENT_BLOCK that bootstrap patches into every server block
(forwards Cookie for session auth + Range, proxy_buffering off). Idempotent;
covers fresh-ISO nodes too since bootstrap runs on every startup.
Verified on .198: after restart the async nginx patch lands and
/api/peer-content/<onion>/<id> returns 401 (reaches backend, auth-gated)
instead of the SPA; nginx block present in both server blocks.
Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
Peer media (music/video) wouldn't play: the frontend downloaded the whole
file via RPC as base64 and made a non-seekable Blob URL, so <video>/large
<audio> stalled and big files hit the RPC timeout.
Add GET /api/peer-content/<onion>/<id> — a same-origin, session-gated proxy
that forwards the browser's Range header to the peer's /content/<id> (which
already returns 206 Partial Content) and passes status + Content-Range +
Content-Type back. PeerFiles.playMedia() now points <video>/<audio> at this
streaming URL for free content instead of buffering a base64 blob, so the
player can seek and start immediately. Onion/id validated to prevent
SSRF/path traversal. (Paid preview keeps its existing flow.)
Verified: cargo build --release EXIT 0; vue-tsc --noEmit EXIT 0.
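
A sketch of the kind of validation the proxy needs before building the upstream URL. The function names and the content-id rules are assumptions. The onion rule relies on v3 onion hostnames being 56 base32 characters plus ".onion", and checks only length and alphabet, not the embedded checksum.

```rust
/// Accepts only a v3-shaped onion hostname, so the proxy can never be
/// pointed at localhost, LAN addresses or arbitrary domains (SSRF).
fn is_valid_onion(host: &str) -> bool {
    match host.strip_suffix(".onion") {
        Some(label) => {
            label.len() == 56
                && label.bytes().all(|b| matches!(b, b'a'..=b'z' | b'2'..=b'7'))
        }
        None => false,
    }
}

/// Accepts a conservative id alphabet (assumed): no '/', '.', '%' or
/// whitespace, so "../" style traversal cannot be expressed at all.
fn is_valid_content_id(id: &str) -> bool {
    !id.is_empty()
        && id.len() <= 128
        && id
            .bytes()
            .all(|b| b.is_ascii_alphanumeric() || b == b'-' || b == b'_')
}
```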
Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
content.browse-peer now returns the transport that actually reached the
peer (fips/tor/mesh/lan). PeerFiles shows it as a small coloured pill next
to the peer name (FIPS/Mesh green, LAN blue, Tor amber) and the loading
text no longer hardcodes "Connecting via Tor" (it was misleading when FIPS
was used). Pairs with B14 (transport recording).
Verified: cargo build --release EXIT 0; vue-tsc --noEmit EXIT 0.
Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
The B14 commit referenced crate::federation::storage::record_peer_transport
but `storage` is a private module — record_peer_transport is re-exported at
crate::federation::. E0603 broke the build. Use the re-exported path (as
load_nodes/fips_npub_for_onion already do). Verified: cargo build --release
EXIT 0. Also logs B21 (Tor/FIPS pill) plan.
Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
The 4 content peer handlers (browse, download, download_paid, preview)
captured the transport returned by PeerRequest::send_get() but discarded
it, so the federation node's last_transport was never updated for cloud
activity — the UI showed Tor/none even when FIPS was used. Call
record_peer_transport() after each successful fetch (same as sync does).
Note: live data shows FIPS still reaches only some peers (many genuinely
fall back to Tor) — tracked separately as B14b (FIPS reachability).
Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
B1/B2: the same physical node can linger in the federation list under two
dids (e.g. after a did/key change). An onion is a node's unique stable
identity, so two entries with the same onion are one node. This showed the
node twice in the trusted-node list (B1) and as two mesh chat contacts —
one by name+logo, one by raw did (B2).
- storage::load_nodes now collapses same-onion entries (keep first, merge
fips_npub/name/last_state) so every consumer (list + chat seed + sync)
sees one entry per node.
- federation::sync merge_transitive_peers also matches by onion (not just
did) so new transitive hints don't re-add a known node under a new did.
- mesh::seed_federation_peers_into_mesh skips already-seeded onions (belt
and suspenders).
- Unit tests for dedup_nodes_by_onion (collapse + onion-suffix handling).
B4: filebrowser-client.listDirectory only checked res.ok before res.json(),
so when File Browser is absent (nginx serves the SPA index.html, 200) or
down (502) the JSON parse threw the opaque "Unexpected token '<'". Now it
checks the content-type and throws a friendly "File Browser is not
available" the Cloud view already renders as an empty state.
Verified: dedup unit tests 2/2; live .198 (15 entries→13 distinct onions)
restarted healthy on new binary; B4 guard present in built bundle + deployed.
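
The same-onion collapse could be sketched as below. The Node struct is a cut-down hypothetical (last_state is omitted), and the merge rule "fill only fields the kept entry is missing" is an assumption about what "merge" means here.

```rust
use std::collections::HashMap;

#[derive(Debug, Clone, PartialEq, Eq)]
struct Node {
    did: String,
    onion: String,
    name: Option<String>,
    fips_npub: Option<String>,
}

/// Normalises an onion so "abc.onion" and "abc" compare equal.
fn onion_key(onion: &str) -> String {
    let o = onion.trim().to_ascii_lowercase();
    o.strip_suffix(".onion").unwrap_or(o.as_str()).to_string()
}

/// Keeps the first entry per onion and folds later duplicates into it.
fn dedup_nodes_by_onion(nodes: Vec<Node>) -> Vec<Node> {
    let mut out: Vec<Node> = Vec::new();
    let mut index: HashMap<String, usize> = HashMap::new();
    for node in nodes {
        let key = onion_key(&node.onion);
        let existing = index.get(&key).copied();
        match existing {
            Some(i) => {
                let kept = &mut out[i];
                if kept.name.is_none() {
                    kept.name = node.name;
                }
                if kept.fips_npub.is_none() {
                    kept.fips_npub = node.fips_npub;
                }
            }
            None => {
                index.insert(key, out.len());
                out.push(node);
            }
        }
    }
    out
}
```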
Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
The LND wallet UI (served on its own app port) fetches /lnd-connect-info
and /proxy/lnd/* cross-origin, so both need correct CORS headers.
(a) Older nginx configs add their own Access-Control-Allow-Origin in the
/lnd-connect-info location on top of the one the backend sets, yielding
a DUPLICATE header that browsers reject ("multiple values"). bootstrap
now strips that redundant nginx add_header (backend owns CORS).
(b) /proxy/lnd/* returned a 401 with no CORS headers when the session
check failed, so the browser saw an opaque CORS error instead of a
readable 401. Add unauthorized_cors() and use it on that path.
Adds tests/production-quality/ (bug tracker + lnd-cors-test.sh harness).
Verified: harness 4/4 on .116, .198, .103.
Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
The add-anchor form previously hardcoded transport=udp. Expose a
TCP/UDP selector (default tcp) so public internet anchors and
local-network anchors can both be added. Includes changelog + What's
New entry for v1.7.96-alpha.
Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
The kiosk attached-display showed a separate app-tile launcher grid
(Kiosk.vue at /kiosk) instead of the normal onboarding/login/dashboard.
The grid is auth-gated, so it only surfaced once the kiosk browser held a
persisted session; otherwise it bounced to login — masking the issue.
Remove the grid entirely. /kiosk now just persists kiosk mode + safe-area
insets and redirects to the root app. The launcher keeps pointing at
/kiosk (not directly at /) so the 'kiosk' localStorage flag is still set —
App.vue uses it to skip the remote relay, which would otherwise double
xdotool input on the kiosk display. Route made public so the auth guard
doesn't bounce it before the redirect runs.
Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
Adds tests/multinode/smoke.sh on the existing multinode.bash lib: an
assertion suite (pass/fail + non-zero exit) driving two real nodes through
login, onion + FIPS identity, FIPS anchor-connected, federation pairing
both directions, peer content browse over the mesh, and the removed-node
tombstone (with an optional 3rd node C for the transitive-reappear case).
Guards the v1.7.94/v1.7.95 fixes. Content-browse + tombstone checks
skip-with-note against peers older than v1.7.95.
Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
FIPS peer content browse over the mesh was failing with "Peer returned
error: 404 Not Found" and never falling back to Tor. `is_peer_allowed_path`
only allowed `/content/<id>` (item fetches) — the catalog endpoint is
exactly `/content` (no trailing slash), so it 404'd over the FIPS peer
listener. A FIPS 404 was also treated as a successful response, so the dial
never retried Tor. Fixes: allow `/content` over the mesh; add
`fips_should_fall_back()` so a FIPS 404/5xx in Auto mode falls back to Tor
(handles version-skew peers reaching a different route). Also correct the
reconnect hint text — the public anchor is TCP/8443, not UDP/8668.
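
A minimal sketch of the two rules described above, not the shipped code. The
function names come from this commit; the signatures, the `Transport` enum, and
the restriction to `/content` paths only are assumptions made for illustration
(the real allowlist may cover more routes).

```rust
/// Hypothetical dial preference; only `Auto` is allowed to fall back to Tor.
#[derive(Clone, Copy, PartialEq, Eq, Debug)]
pub enum Transport {
    Auto,
    FipsOnly,
}

/// Allow the catalog endpoint (exactly `/content`) and item fetches
/// (`/content/<id>` with one non-empty id segment). Everything else is denied.
pub fn is_peer_allowed_path(path: &str) -> bool {
    if path == "/content" {
        return true;
    }
    match path.strip_prefix("/content/") {
        Some(id) => !id.is_empty() && !id.contains('/'),
        None => false,
    }
}

/// Treat a FIPS 404 or 5xx as "retry over Tor", but only in Auto mode.
pub fn fips_should_fall_back(mode: Transport, status: u16) -> bool {
    mode == Transport::Auto && (status == 404 || (500..=599).contains(&status))
}
```

With these rules `/content` and `/content/abc` pass the allowlist, while a 404
from a version-skewed peer in Auto mode triggers the Tor retry instead of being
reported as a successful response.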
Federation: deleted nodes reappeared because transitive discovery
(`merge` of a peer's advertised trusted peers) re-added any unknown DID.
Add a tombstone store (`removed-nodes.json`): remove_node tombstones the
DID, transitive merge skips tombstoned DIDs, and a remote-triggered
peer-joined is ignored for a removed DID. Explicit local re-add (add_node)
clears the tombstone.
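
The tombstone behaviour can be sketched in memory as follows. This is an
illustration only: the `Federation` struct and its fields are hypothetical, and
persistence to `removed-nodes.json` is left out.

```rust
use std::collections::BTreeSet;

/// Hypothetical in-memory view of trusted peers plus removed-node tombstones.
#[derive(Default)]
pub struct Federation {
    trusted: BTreeSet<String>,
    tombstones: BTreeSet<String>,
}

impl Federation {
    /// Explicit local add: clears any tombstone, then trusts the DID.
    pub fn add_node(&mut self, did: &str) {
        self.tombstones.remove(did);
        self.trusted.insert(did.to_string());
    }

    /// Local removal: drops the DID and remembers that it was removed.
    pub fn remove_node(&mut self, did: &str) {
        self.trusted.remove(did);
        self.tombstones.insert(did.to_string());
    }

    /// Transitive discovery: adopt a peer's advertised DIDs, skipping tombstoned ones.
    pub fn merge(&mut self, advertised: &[&str]) {
        for &did in advertised {
            if !self.tombstones.contains(did) {
                self.trusted.insert(did.to_string());
            }
        }
    }

    pub fn is_trusted(&self, did: &str) -> bool {
        self.trusted.contains(did)
    }
}
```

The asymmetry is deliberate: only `add_node` can clear a tombstone, so a remote
peer that still lists the DID cannot bring it back.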
UI: the app credentials modal panel stretched edge-to-edge (height:100%,
max-width:none, items-stretch overlay). Constrain it to a centered card
(max-width 34rem, rounded, dimmed full-screen backdrop) matching the
AppIconGrid / wallet-receive modal.
Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
The whole fleet was silently never reaching the FIPS mesh: the default
public anchor was configured as fips.v0l.io:8668/udp, but the anchor only
answers on TCP/8443. Fix the default to 185.18.221.160:8443/tcp (IPv4
literal — the hostname resolves IPv6-first and the daemon binds v4-only,
which fails the handshake with EAFNOSUPPORT), and auto-seed it in
anchors::load() so every node dials it without operator action (removal
still persists). Proven live on .116: cold start → anchor_connected in
~400ms, anchor became mesh parent.
Wire fips::update::apply() against upstream GitHub releases (stable
channel only): resolve /releases/latest → SHA256-verify the .deb against
checksums-linux.txt → install → restart. dpkg runs via `systemd-run` to
escape archipelago's ProtectSystem=strict sandbox (else /var/lib/dpkg is
read-only), with --force-confold (archipelago manages /etc/fips conffiles)
and --force-downgrade (dev builds sort newer than the stable tag).
Validated live: .116 upgraded 0.3.0-dev -> stable v0.3.0.
Also: standalone fips-ui dashboard app (apps/fips-ui + docker/fips-ui,
static nginx proxying /rpc/v1 same-origin, copiable own-anchor address);
reserve UI port 8336; register fips/fips-ui as platform-managed. Includes
the Lightning wallet cross-origin (CORS) + LND proxy auth + nginx
self-healer fix so the wallet screen connects instead of "failed to fetch".
Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
When an existing LND wallet is locked and none of the candidate passwords
(per-node secret, legacy constant) open it, the node can never auto-unlock
unattended. unlock_existing_wallet now returns Ok(false) for "all candidates
actively rejected" (vs Err for transient "LND not ready"), and
ensure_wallet_initialized responds by recreating the wallet:
- mark the lnd container user-stopped so the health monitor won't
re-launch it (and re-open the wallet) mid-wipe,
- stop lnd, delete its wallet/chain/graph state as root,
- start lnd, wait for NON_EXISTING, re-init a fresh wallet on the
per-node secret, then clear the user-stopped flag.
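
The three-way outcome above can be sketched like this. The `Attempt` and
`Action` enums, the closure parameter, and `next_action` are hypothetical
stand-ins for the real LND REST call and the recreate path; only the
Ok(true) / Ok(false) / Err split is taken from this commit.

```rust
/// Result of one unlock attempt against LND (hypothetical classification).
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum Attempt {
    Unlocked,
    InvalidPassphrase,
    NotReady,
}

/// Ok(true): a candidate opened the wallet.
/// Ok(false): every candidate was actively rejected.
/// Err: transient condition, the caller should retry later.
pub fn unlock_existing_wallet<F>(candidates: &[&str], mut try_unlock: F) -> Result<bool, String>
where
    F: FnMut(&str) -> Attempt,
{
    for &password in candidates {
        match try_unlock(password) {
            Attempt::Unlocked => return Ok(true),
            Attempt::InvalidPassphrase => continue, // fail fast, try the next candidate
            Attempt::NotReady => return Err("LND not ready".to_string()),
        }
    }
    Ok(false)
}

/// What the init path does with each outcome (hypothetical names).
#[derive(Debug, PartialEq, Eq)]
pub enum Action {
    Proceed,
    RecreateWallet,
    RetryLater,
}

pub fn next_action(outcome: &Result<bool, String>) -> Action {
    match outcome {
        Ok(true) => Action::Proceed,
        Ok(false) => Action::RecreateWallet,
        Err(_) => Action::RetryLater,
    }
}
```

Keeping "rejected" separate from "not ready" is what makes the destructive
recreate safe to trigger: it only runs when LND answered and refused every
candidate.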
LND runs as a plain bridge-network podman container (not a Quadlet unit),
so it is restarted via `systemd-run --user --scope podman`, matching the
orchestrator/health-monitor path.
Alpha nodes hold no funds and a wallet locked with an unknown password is
already inaccessible, so the wipe loses nothing reachable. Completes the
forward fix from 91adc281 for nodes whose wallet pre-dates the per-node
secret and whose password is unrecorded (e.g. .116/.228).
Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
Replaces the fleet-wide hardcoded WALLET_PASSWORD='hellohello' that left wallets
LOCKED after OTA/reboot (auto-unlock used the wrong password).
Forward fix (both init paths unified; validated with cargo check and LND REST
mechanics on a scratch wallet):
- Per-node random 256-bit secret in secrets/lnd-wallet-password (0600), mirroring
secrets/bitcoin-rpc-password. read_wallet_password (no-gen) vs
ensure_wallet_password (gen at init only).
- container/lnd.rs init AND api/rpc/lnd/wallet.rs seed-derived init both use the
per-node secret (wallet.rs keeps recoverable derived entropy; password unified).
- Unlock tries [per-node secret, legacy 'hellohello']; single-attempt primitive
distinguishes invalid-passphrase (fail fast, try next) from not-ready (retry),
so a wrong password no longer hangs the boot path ~60s.
Migration (candidate-unlock + rotate, best-effort at login):
- change_wallet_password (WalletUnlocker.ChangePassword) + migrate_locked_wallet:
if LOCKED, try candidates as current pw and ChangePassword onto the per-node
secret so future boots auto-unlock. Hooked into auth.login (non-blocking) with
the just-verified password as the candidate.
NOT YET: seed-recovery fallback for wallets where no candidate matches (e.g.
.116/.228) — destructive, needs entropy-source/funds-safety handling; next pass.
NOT shipped: pending end-to-end validation on a real node.
Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
create-release.sh bumps Cargo.toml but not the lock's archipelago version line;
the cargo build regenerates it post-commit. Same as the 1.7.91 leftover — worth
fixing create-release.sh to stage Cargo.lock, tracked separately.
Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
A wrong/locked LND wallet password leaves the wallet LOCKED after every
restart/OTA, breaking all Bitcoin-receive + Lightning ops fleet-wide — and the
harness was blind to it: live-lnd-address-type treats 'wallet locked' as PASS,
os-audit treated lnd-unreachable as WARN, and the archipelago lnd.getinfo RPC
masks a locked wallet (returns all-zero success).
- tests/release/run.sh: new 'live-lnd-unlocked' stage polls LND's unauth
/v1/state and FAILs if still LOCKED after a 60s grace window.
- tests/lifecycle/os-audit.sh: probe lnd.newaddress (the real receive path,
which surfaces LND_WALLET_LOCKED) instead of lnd.getinfo; locked = hard FAIL,
not-installed = WARN.
Proven on .116 (genuinely locked): os-audit now reports
'[FAIL] lnd wallet unlocked (lnd.newaddress) wallet LOCKED'.
Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
create-release staging requires >=3 curated release-note bullets. The What's
New restoration is itself user-facing, so it's an honest third note; mirror it
into the modal's v1.7.92 block via sync-whats-new.py.
Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
The What's New modal (AccountInfoSection.vue) hardcodes one block per release
and had silently drifted: it sat at v1.7.84 while the fleet shipped through
v1.7.92, so eight releases of notes never reached users in Settings.
- scripts/sync-whats-new.py: renders a modal block from each CHANGELOG version
that's missing one (curated bullets, dev-process 'Validation…' lines dropped),
inserts newest-first; never touches older hand-written pre-CHANGELOG history.
--check mode lists anything missing and exits non-zero.
- tests/release/run.sh: new 'whats-new-sync' static gate runs --check, so a
release with an un-surfaced CHANGELOG entry fails before shipping.
- Backfilled the eight missing blocks (v1.7.85 … v1.7.92) into the modal.
Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
batch_host_reboot previously asserted only container-set equality after the
reboot. Add the os-audit.sh per-boot health gate: after rpc_login succeeds
post-reboot, run os-audit against the target (ARCHY_LOCAL=0, https) and record
host_reboot_osaudit PASS/FAIL. This asserts the node is actually healthy after
a reboot — RPC up, OTA not wedged (FM12), every app reachable with valid launch
metadata, FM-guards green — not just that the right containers exist. Validated
green on .116 (11 pass / 0 fail / 0 warn).
Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
When ElectrumX is still building its index (or waiting on the Bitcoin node),
AppSessionFrame shows a sync 'pre UI'. The iframe-blocked fallback ('App not
reachable / retrying') was not gated on electrsSync, so it painted over the
sync screen and read as a hard connection error. Gate it on !electrsSync,
mirroring the iframe's own guard.
Also harden the lifecycle health probe: container_health used jq '// "unknown"',
which only catches null/false — an empty-string health (a brief window under
load) rendered as a blank 'bad health: X is '. Map empty to 'unknown' so the
retry loop keeps waiting instead of failing on a transient.
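
The actual fix lives in the shell/jq probe. The same normalization rule,
written in Rust purely as an illustration (the function name is hypothetical),
looks like this:

```rust
/// Map a missing or empty health string to "unknown" so callers keep retrying
/// instead of reporting a blank status.
pub fn normalize_health(raw: Option<&str>) -> &str {
    match raw.map(str::trim) {
        Some(status) if !status.is_empty() => status,
        _ => "unknown",
    }
}
```

The point is that "absent" and "present but empty" must collapse to the same
sentinel; a null-only default covers the first case and misses the second.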
Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
create-release.sh bumps Cargo.toml; the lock's archipelago version line is
regenerated by the subsequent cargo build and was left uncommitted after the
v1.7.91-alpha release commit. The shipped binary is built from the bumped
Cargo.toml, so this is bookkeeping only.
Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
os-audit.sh: one non-destructive scorecard tying backend/RPC health, the
all-apps lifecycle audit (delegates to remote-lifecycle.sh), and the FM-guards
(port-drift, secret-completeness, orphan-container sweep, OTA-wedge). The
per-boot building block for the reboot-survival loop. The FM12 check uses jq
has() rather than // (// treats a legitimate false as absent). Section A
validated all-PASS on .116.
docs: v1.7.91 release-pass resume notes + the bitcoinReceive blocker writeup.
Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
codeMatch[1] is string|undefined under noUncheckedIndexedAccess; using it
directly as an index into RECEIVE_CODE_MESSAGES failed vue-tsc (TS2538) and
aborted create-release.sh at the frontend build step. Bind to a const and
narrow before indexing.
Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
Before/after on the live node confirms the launch_url_port fix:
jellyfin/btcpay/fedimint/gitea/portainer/botfights all went from
lan_address=None to a resolved http://localhost:PORT/ URL; harness
focused audit passed, exit 0. Also documents that archipelago.service
restarts are safe on .116 (containers run in the user-1000 slice, a
different cgroup, and survived the restart).
Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
reachable_lan_address() parsed the launch port with url.rsplit(':')
which yields "8096/" for manifest interfaces.main URLs that carry a
path (http://localhost:8096/). That fails to parse and silently drops
a perfectly reachable launch URL, so apps like jellyfin, btcpay-server,
fedimint, gitea, nextcloud and portainer showed running with no launch
link in the UI. New launch_url_port() reads digits after the final
colon (mirroring port_from_url in the RPC layer) and tolerates a
trailing path. Adds regression tests.
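
A minimal sketch of the parsing rule described above, assuming a signature of
`fn(&str) -> Option<u16>` (the real `launch_url_port` may differ). It takes the
text after the final colon and keeps only the leading digits, so a trailing
path no longer breaks the parse.

```rust
/// Read the port as the run of ASCII digits after the last ':' in the URL.
/// Returns None when there is no colon or no digits follow it.
pub fn launch_url_port(url: &str) -> Option<u16> {
    let (_, after_colon) = url.rsplit_once(':')?;
    let digits: String = after_colon
        .chars()
        .take_while(|c| c.is_ascii_digit())
        .collect();
    digits.parse().ok()
}
```

For `http://localhost:8096/` this yields port 8096, and a URL with no explicit
port such as `http://localhost/` yields `None`. A URL whose path itself
contains a colon would defeat this simple rule; that case is out of scope for
the sketch.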
Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
Fixes three Bitcoin/wallet failures observed across the fleet on v1.7.90-alpha
(all nodes were already on the latest build — these were live bugs, not stale
builds), plus the missing ElectrumX tile, and adds automated coverage so each
can't regress silently.
Receive address (".116 receive fails", ".228 false 'wallet is locked'"):
- LND publishes its REST API on a host port that can drift from the manifest
(a container created when the mapping was 8080 kept publishing 8080 after the
manifest moved to 18080). The in-process client connects to the manifest port,
gets connection-refused, and wallet init fails forever while the container
looks "Up". Add published-port drift detection to the reconciler
(container_ports_drifted / host_port_bindings_drifted) that recreates a
drifted backend even for restart-sensitive apps — a drifted container is
already broken, so leaving it "untouched" only perpetuates the failure.
- Receive errors now carry a stable [CODE] token (REST_UNREACHABLE, WALLET_LOCKED,
WALLET_UNINITIALIZED, SYNCING) and always start with "Bitcoin address" so they
survive the RPC error sanitizer instead of collapsing to the generic
"Operation failed". The UI maps the code instead of guessing wallet state from
substrings — so an unreachable REST endpoint is no longer mislabelled "locked".
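
The published-port drift check from the first bullet can be sketched as a set
comparison. The function name is taken from this commit, but the signature, the
`Binding` alias, and the choice to treat only missing manifest bindings as
drift are assumptions for illustration.

```rust
use std::collections::BTreeSet;

/// (host_port, container_port) pair; a hypothetical simplification of a port mapping.
pub type Binding = (u16, u16);

/// A container has drifted when any binding the manifest declares is not
/// among the bindings the running container actually publishes.
pub fn host_port_bindings_drifted(manifest: &[Binding], published: &[Binding]) -> bool {
    let want: BTreeSet<Binding> = manifest.iter().copied().collect();
    let have: BTreeSet<Binding> = published.iter().copied().collect();
    !want.is_subset(&have)
}
```

In the LND case above the manifest declares host port 18080 while the old
container still publishes 8080, so the check reports drift and the reconciler
recreates the container.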
Bitcoin install (".198 bitcoin gone / reinstall just stops"):
- bitcoin-knots requires the secret bitcoin-rpc-txrelay-rpcauth, which was only
generated by the tx-relay flow. Nodes that never used tx-relay lacked it, so
secret resolution hard-failed and the whole Bitcoin stack cascaded. Generate
it idempotently before bitcoin starts (ensure_app_secrets, reusing
ensure_txrelay_credentials), and name the missing secret in the error so a
genuine gap is actionable instead of a bare "IO error".
ElectrumX app tile missing on every node with it installed:
- The catalog generator dropped electrumx because the manifest had no
interfaces.main block, so the tile had no launch URL and was hidden. Declare
the companion UI port (50002) in the manifest, regenerate the catalog, and let
an app with a known launch URL stay launchable while its backend is still
"starting" (ElectrumX indexes for 10m+).
Test harness:
- New lifecycle bats suites: bitcoin-receive, port-drift, secret-completeness
(validated live; port-drift catches the real .116 drift).
- Rust unit tests for drift detection, the receive reason-code classifier, and
the named-missing-secret error; vitest for the UI code mapping.
- create-release.sh now runs tests/release/run.sh and aborts the release on
failure — previously it ran no tests at all.
Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
- LND wallet: request correct address type so receive-address generation
no longer 400s
- AIUI/app session: on-screen pointer can click + type into app content
(incl. app store search); "open in new tab" opens the phone browser;
mobile credential modal centered instead of full-height
(remote-relay.ts, AppSession.vue, AppSessionFrame.vue, AppIconGrid.vue,
openExternal.ts, WebViewScreen.kt) + remote-relay tests
- health_monitor: electrs auto-recovers from a corrupt index and shows a
percent/block-height progress screen while reindexing (useElectrsSync.ts)
- update.rs: drop retired tx1138 secondary mirror (one-time migration);
longer download timeout for slow connections
- CHANGELOG: v1.7.90-alpha notes
- tests/release/run.sh: harness tweaks
Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
| `UPDATE_INCOMPATIBLE … signatures do not match` | Old install signed with a **different key** (e.g. pre-shared-keystore per-machine key `58:31:12…`). | Uninstall the old package, then install. **One-time** per device after a key change. |
| `INVALID_APK` / parse error | Corrupt/incomplete download or bad signing. | Re-download; re-run the publish script. |
Polishes the mesh AI assistant and Fedimint, on top of all the v1.7.99 features (kept listed below so you can still see what's new).
- The off-grid mesh radio no longer posts cryptic identity codes to the shared public channel. Your node was announcing a line starting with "ARCHY:" to the public channel about once a minute, which everyone else on that channel saw as spam; that broadcast has been removed.
- You can now use your node's AI assistant straight from a normal chat. Send "!ai <your question>" in a direct message to an AI-enabled node and the answer comes right back in the same conversation, whether your message travelled over the internet or the LoRa radio. Before, the reply could be sent on the wrong path and never arrive.
- The Mesh AI Assistant panel is easier to set up: pick the Claude model from a dropdown (Haiku, Sonnet, or Opus) instead of typing it, and add specific contacts to an "always allow" list so chosen people can use "!ai" even when the assistant is set to trusted-nodes-only.
- Fedimint federations show up in Wallet Settings again. The Fedimint client app wasn't starting because of a configuration error, so the federation your node auto-joins never appeared; the client is fixed and runs again.
- In Settings, "App Updates" and "App Registry" now sit directly under your Account section for quicker access.
- In Mesh chat, scrolling the conversation no longer also scrolls the contact list behind it.
- Mesh direct messages are now private and end-to-end encrypted to the recipient — they're sent as real radio DMs instead of being broadcast on the public channel, so other people on the mesh no longer see them, and the answer arrives intact (even on standard meshcore phone apps).
- You can now message standard meshcore apps (like the phone companion) and they can message you — text shows up readable on both sides, and your node's AI answers come back as a private reply rather than on the public channel.
- New contacts you hear on the radio are added automatically, so people show up in your Peers list without any extra steps.
- "Clear All" now actually removes contacts (rather than hiding them forever); a contact comes back on its own the next time it's in range. Each contact also shows a reachability dot so you can see who's currently reachable.
- The Peers list has a search box (with a clear button) to quickly filter your contacts by name, DID, npub, or key.
All the v1.7.99-alpha features are included as well:
- Your node can now hold Fedimint ecash as well as Cashu, with tabbed Wallet Settings for each and both balances shown side by side on the home wallet card.
- You can buy files shared by another node right from their cloud, paying from this node's ecash, your Lightning wallet, on-chain, or by scanning a Lightning QR with any outside wallet.
- Your node can act as an AI assistant on the off-grid mesh: peers ask by starting a message with "!ai" and get an answer back over the radio, with a panel to turn it on or off.
- You can view your node's 24-word recovery phrase any time from Settings, behind a password (and 2FA) confirmation and a tap-to-show blur.
- Setting up a brand-new node is smoother: it waits and retries quietly instead of flashing errors, and shows a gentle "securing your private connection…" status that turns to "ready" on its own.
- The NetBird VPN app now logs in (it's served over HTTPS and opens in a browser tab).
- Phone remote-control of a node's screen now supports two-finger scrolling inside apps, and external-browser apps open on your phone.
- You can choose whether your node shares Bitcoin block headers over the mesh, and your choices are remembered.
- Version numbers display cleanly everywhere (no more doubled "v"), and "Back" buttons look and behave consistently across desktop and mobile.
- For advanced testing, Settings includes an optional update & app source choice between the usual trusted origin and an experimental peer-to-peer (DHT swarm) mode, with the trusted origin remaining the default.
## v1.7.99-alpha (2026-06-17)
- Your node can now hold Fedimint ecash as well as Cashu. Wallet Settings now has tabbed sections for each: keep your list of trusted Cashu mints, or paste a Fedimint invite code to join a federation, and the home wallet card shows both your Cashu and Fedimint balances side by side. A new "Fedimint Client" app in the catalog powers the federation side.
- You can now buy files shared by another node, right from their cloud. When you open a peer's paid file you get a simple "Buy this file" picker with several ways to pay — instantly from this node's ecash balance, from your node's own Lightning wallet, on-chain from your node, or by scanning a Lightning QR code with any outside wallet. Once payment settles, the file downloads automatically.
- Your node can now act as an AI assistant on the off-grid mesh radio network. If your node has a local AI model available (via Ollama), other people on the mesh can ask it a question by starting their message with "!ai" and get an answer back over the radio — handy where there's no internet. A new Mesh assistant panel lets you turn this on or off and shows whether a local AI model was detected.
- You can now view your node's 24-word recovery phrase whenever you need it. Settings has a new "Recovery phrase" option that, after you confirm your password (and 2FA code if you use one), reveals the words behind a tap-to-show blur with a copy button — so you can write them down and store them safely offline.
- Setting up a brand-new node is smoother and less alarming. If the node is still starting up while you generate or confirm your recovery phrase, it now quietly waits and retries instead of flashing a scary error, and offers a clear "Try again" button only when something genuinely goes wrong. The final setup screen also shows a gentle "securing your private connection…" status that turns to "ready" on its own, so you can tell the encrypted transport is coming up rather than stuck.
- The NetBird VPN app now actually logs in. It was failing to reach its sign-in screen because the dashboard needs a secure (HTTPS) connection that wasn't being provided; the node now serves it over HTTPS and opens it in a browser tab, so the login flow completes.
- When you use your phone to remote-control a node's attached screen, two-finger scrolling now works inside apps and panels, not just the main page. And tapping an app that's meant to open in an external browser now hands the link to your phone to open there, instead of trying to open it on the (often unattended) attached display.
- You can now choose whether your node shares Bitcoin block headers over the mesh. The Mesh Bitcoin panel has new switches to announce headers to peers and to accept headers from them, and your choices are remembered.
- Version numbers now display cleanly everywhere. In a few places the interface was showing a doubled "v" (like "vv1.7.98"); it now always shows a single, tidy version label.
- The "Back" buttons throughout the cloud and other detail screens now look and behave consistently on both desktop and mobile, including when browsing another node's files.
- For advanced testing, Settings now includes an optional "update & app source" choice between the usual trusted origin and an experimental peer-to-peer (DHT swarm) mode that pulls updates and app content from other nodes first, falling back to the origin automatically. The trusted origin remains the default.
## v1.7.98-alpha (2026-06-16)
- Apps that crash now recover on their own. Multi-part apps like Immich and IndeedHub could have one of their pieces stop and stay stopped until the whole node was rebooted; the node now checks every couple of minutes and restarts any crashed piece automatically (while still leaving apps you deliberately stopped alone).
- The on-screen kiosk display can no longer slow the whole node down. On machines without a graphics chip the kiosk browser could spin a CPU core at full tilt, starving everything else (including the wallet, which then timed out); it's now capped and uses lighter rendering on those machines.
- If an update download fails, you're taken back to the Download button to retry, instead of being stranded on an Install button for an update that didn't actually finish downloading.
- Your node's identity is clearer and always visible: Settings now shows your Node DID on every node (it previously only appeared if your browser had cached it) plus your node's npub, both with copy buttons. There's also a terminal tool to cryptographically prove all your node's keys come from your one seed phrase.
- The "all nodes over Tor" group chat sends quickly now — the "sending" spinner clears as soon as the reachable nodes have the message, instead of hanging on a slow or offline node.
- Message notifications now have a close button and open the relevant chat when tapped.
- The encrypted mesh transport (FIPS) turns itself on automatically after setup — no button to press — and connects to peers more reliably (it retries and keeps connections warm), so node-to-node features use the fast path more often instead of falling back to Tor.
- Your chat history with other nodes is saved reliably and now encrypted on disk, so it survives restarts and updates and can't be read from a stolen drive (only clearing chat removes it).
- Peer media shows a "connecting" loader before a video or audio file plays, and audio errors are accurate instead of blaming File Browser.
- The Fedimint app now displays with its proper styling, and the Connected Nodes screen stays compact — it shows a few nodes and scrolls, you can tap a node to jump to it in Federation, or tap Message to open its chat.
- App updates can now arrive on their own without waiting for a full system release, so individual apps can be improved and shipped faster.
## v1.7.97-alpha (2026-06-16)
- The Bitcoin sync status on the home screen no longer disappears for a moment when it refreshes. If the node was briefly busy, the panel used to vanish and pop back; it now stays put and simply shows "Updating…" until the next reading arrives, while a genuinely stopped node still correctly shows as not running.
- Bitcoin sync progress on the home screen now updates more promptly, so the percentage and block height keep pace with the node instead of lagging behind.
- The Lightning wallet "connect your wallet" screen loads its details and QR code again across all nodes, instead of failing to fetch them.
- Your list of trusted nodes is now clean: the same node no longer appears several times under different names, and removed nodes stay removed. In chat, a node that previously showed up as two separate contacts now appears just once.
- Browsing another node's cloud is smoother: music and video files from a peer now preview and play properly (including seeking partway through), and the connection now shows a small badge telling you whether it's using the fast encrypted mesh or the slower Tor network.
- Opening "My Folders" in the cloud now shows a clear, friendly message when the file app isn't running, instead of a confusing error.
- The Electrum server app opens on its own once it's ready, instead of sometimes leaving a loading spinner stuck on top of the screen.
- The Fedimint app now displays with its proper styling and icons, instead of appearing unstyled with a missing image.
- The Mempool app now connects to your Bitcoin node whether the node is Bitcoin Core or Bitcoin Knots, instead of only working with one of them.
- Nodes start up cleanly after a reboot. On some boots the node's main service was trying to start before its data drive had finished mounting, so it failed and retried about twenty times over roughly five minutes — showing a wall of "Failed to start" messages — before finally coming up. It now waits for the data drive to be ready first, so it starts on the first try.
- The background images throughout the interface now load faster — they've been made significantly smaller with no loss of quality.
## v1.7.96-alpha (2026-06-15)
- The screen attached to your node now shows the normal Archipelago interface and your dashboard after you sign in, instead of a separate, stripped-down grid of app icons that could appear in its place. That extra screen has been removed so the attached display matches what you see everywhere else.
- On a brand-new node, the attached screen now walks through the same welcome and setup steps you'd see on a phone or laptop, and shows the normal sign-in screen once the node is set up — so the on-device display always matches the rest of the interface.
- When adding a FIPS network anchor, you can now choose whether it connects over TCP (for a public anchor reached across the internet) or UDP (for one on your local network), instead of it always assuming the local-network option.
- Behind the scenes, a new automated two-node test now exercises real node-to-node features — browsing another node's shared files and handling a removed node — against live nodes before each release, so node-to-node problems are caught earlier.
## v1.7.95-alpha (2026-06-15)
- Browsing another node's shared files now works over the fast encrypted mesh. Opening a peer's cloud could fail with a generic "Operation failed" message because the request for their file list wasn't permitted over the mesh and came back as "not found" — and it never retried over Tor. The mesh now serves the file list directly, and if a peer can't answer over the mesh the node automatically falls back to Tor instead of giving up.
- Nodes you remove from your federation now stay removed. Previously a deleted node could quietly come back the next time you synced with another node that still listed it. Removed nodes are now remembered as removed and won't reappear on their own — only if you add them back yourself.
- The app credentials pop-up now appears as a normal centred box with a dimmed background over the whole screen, instead of stretching to fill the entire screen.
## v1.7.94-alpha (2026-06-15)
- Your node now joins the private encrypted mesh network on its own. A wrong built-in setting meant nodes were quietly never reaching the shared mesh meeting point, so everything between nodes fell back to the slower Tor network. Every node now connects to the mesh automatically on startup, so node-to-node features like file sharing use the faster encrypted mesh first and only fall back to Tor when a peer is genuinely offline. (Confirmed live: a node with its mesh setting wiped re-connected to the mesh by itself within a second of starting.)
- You can now bring the mesh networking software up to the latest stable version straight from the node, with one action — it fetches the new version, checks it's genuine before installing, and restarts the mesh on its own. (Confirmed live end to end: a node on an older build was upgraded to the current stable release and rejoined the mesh automatically.)
- The Lightning wallet screen connects again on nodes where it was showing a "failed to fetch" error instead of your balance and channels. The wallet app and the node now talk to each other correctly, and the connection quietly repairs itself if its details drift after a restart.
## v1.7.93-alpha (2026-06-14)
- Receiving Bitcoin and Lightning works again on nodes where the Lightning wallet was stuck locked. After some updates the wallet could come back locked with a password the node no longer had, so "generate a receive address" kept failing with a "wallet is locked" message that nothing could clear. The node now detects this and repairs itself automatically.
- Each node now secures its Lightning wallet with its own unique, randomly generated password instead of a shared built-in one, and remembers it safely so the wallet unlocks on its own after every restart or update — no more getting stuck locked.
- If a wallet is found locked with an unrecoverable password, the node rebuilds it cleanly so Bitcoin and Lightning start working again. (On these early-access nodes the wallet holds no funds, so nothing is lost — a wallet locked with an unknown password was already inaccessible.)
- The self-repair was validated end to end on live nodes: a stuck, locked wallet was detected, rebuilt, and came back unlocked on its own, and stayed unlocked across restarts.
## v1.7.92-alpha (2026-06-14)
- The Electrum server app no longer flashes a "can't connect, try again" error over its loading screen while it's still catching up. If ElectrumX is building its index or waiting on the Bitcoin node, you now just see the sync progress, and the app opens on its own once it's ready.
- Behind the scenes, the reboot-survival test now confirms the whole system is genuinely healthy after a restart — every app reachable, updates not stuck, core services answering — instead of only checking that containers came back, so update-related problems are caught before shipping.
- Settings → What's New now lists the notes for every recent release again. The screen had quietly fallen several versions behind, so the last eight releases of changes weren't showing up there — they're all back now, and a release check keeps it from drifting again.
## v1.7.91-alpha (2026-06-14)
- Apps you've installed now reliably show their "Open" button again. Some apps — including Jellyfin, BTCPay Server, Fedimint, Gitea and Portainer — were running fine but their launch link sometimes went missing, so there was no way to open them from the home screen. They now open correctly.
- Receiving Bitcoin is more dependable: if the wallet's internal connection details drift after a restart, it now repairs them on its own, and any error it does hit is reported clearly instead of as a generic failure or a misleading "wallet locked" message.
- Installing Bitcoin now sets itself up correctly without manual help — a security credential that could previously be missing and stop Bitcoin from starting is created automatically before it launches.
- The Electrum server app is back on the home screen and can be launched again.
- Behind the scenes, the release now runs an expanded automated test suite before shipping, so these kinds of issues are caught earlier.
## v1.7.90-alpha (2026-06-13)
- Generating a Bitcoin receive address works again — the wallet now requests the correct address type, fixing the "400 Bad Request" error when creating an address.
- In the companion app, the on-screen pointer can now click into apps and type — including the app store search box — instead of clicks and keystrokes not reaching app content.
- "Open in a new tab" from the companion app now opens the app in your phone's browser, instead of doing nothing. The normal mobile browser keeps working as before.
- The login/credentials pop-up on phones is once again a centered, properly sized window rather than stretching the full height of the screen.
- The Electrum server now recovers on its own if its index ever gets corrupted, and shows a clear progress screen (with percent complete and block height) while it builds its index, instead of a blank or broken page.
- Software updates are more reliable on slow internet connections — downloads are given much more time to finish before giving up.
## v1.7.89-alpha (2026-06-12)
- The AI assistant looks the way it always did again: no extra back button or close button on phones, and the desktop view fills the whole screen without a gap at the bottom.
- System updates are much more reliable: updates that previously got stuck partway or failed to install now complete cleanly, and a failed update can no longer block all future updates.
- After an update, the system now checks itself correctly on every node type, so working updates are no longer mistakenly undone.
- Generating a Bitcoin receive address works again on nodes where a network proxy previously got in the way.
- The Lightning wallet now recovers and unlocks itself properly after restarts.
## v1.7.88-alpha (2026-06-12)
- AIUI now loads immediately again instead of waiting on a production availability probe and cache-busted iframe URL, restoring the lighter launch behavior from before the regression.
- Bitcoin receive now uses LND's GET-based newaddress flow with the native SegWit address type, fixing the `501 Method Not Allowed` response from the previous POST attempt.
- Validation is still pending for the AIUI rollback; the rest of the release train is unchanged.
## v1.7.87-alpha (2026-06-12)
- Bitcoin receive now calls LND's on-chain address endpoint with the correct REST method, and backend failures keep the specific address-generation error instead of collapsing into the generic operation-failed message.
- App launch credential interstitials now render as true full-screen overlays, and the launcher loading indicator uses the neutral brand palette instead of a blue spinner.
- Validation passed with `git diff --check`, `npm run type-check`, and the focused frontend tests for `bitcoinReceive` and `AppIconGrid`.
## v1.7.86-alpha (2026-06-12)
- Fleet now preserves the last known node list, alerts, and selection locally while telemetry refreshes in the background, so the dashboard no longer blanks on tab switches or update scans.
- Connected nodes and identities now reuse their last loaded data instead of reloading the visible list every time the user revisits the tab.
- The Fleet matrix and detail views now show actual node names and host information instead of raw node id prefixes.
- The network map only redraws when its graph data actually changes, which stops the D3 scene from visually resetting on every refresh tick.
- Mobile federation and system-update actions now stack full width, and the ElectrumX app health check allows a long startup window so slow sync nodes do not restart mid-index.
- Validation passed with `git diff --check`, focused frontend tests, and `npm run type-check`.
## v1.7.85-alpha (2026-06-12)
- ElectrumX now runs with less cache pressure and more memory headroom, reducing the restart loop seen during sync catch-up.
- Portainer is pinned to `2.19.4` instead of `latest`, avoiding schema-drift restarts from surprise image updates.
- LND receive-address creation now asks for a native SegWit address and returns clearer wallet/readiness failures when an address is not available.
- Fleet telemetry now carries server name, hostname, and server URL, and the Fleet dashboard shows those names instead of hashed node ids.
- Trusted federation peers are still auto-added transitively, but the local node no longer imports itself back into the fleet list.
- Validation passed locally for the touched frontend helpers, `git diff --check`, and Rust formatting.
## v1.7.84-alpha (2026-06-11)
- Bitcoin trusted-node relay approvals now generate restricted `txrelay` RPC credentials when needed and restart the active Bitcoin backend so bitcoind loads the new `rpcauth` whitelist.
- Kiosk mode now includes a browser safe-area path for HDMI displays that crop edges, and self-update refreshes kiosk launcher/systemd files so display fixes ship to existing nodes. The experimental X11 scaling safe-area is opt-in to avoid stretching TV output.
- Wi-Fi setup now reports scan errors instead of showing an empty network list, supports retrying scans from the modal, parses escaped `nmcli` SSIDs correctly, and can join open networks without forcing a WPA password.
- Bitcoin Core now matches Bitcoin Knots for restricted relay RPC support, including the txrelay secret injection and transaction broadcast whitelist.
- The restricted Bitcoin relay whitelist now includes `submitpackage` and `gettxout`, covering newer wallet/package-relay broadcast flows without opening wallet/admin RPC.
- The Bitcoin UI companion image is pinned to `1.7.84-alpha` across release metadata and the Quadlet fallback path, avoiding stale `latest` detection during OTA updates.
- Container scanning now uses an RAII in-flight guard so timeout and error paths cannot leave the scanner stuck in a permanently busy state.
- Validation passed with `cargo fmt`, `cargo check -p archipelago`, `git diff --check`, and focused source review of the relay message/approval path.
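
The RAII in-flight guard mentioned for container scanning can be illustrated with a small sketch. The names `InFlightGuard` and `scan_once` are hypothetical and the scanner body is a stand-in; the point is only that the busy flag is cleared in `Drop`, so every early return releases it.

```rust
use std::sync::atomic::{AtomicBool, Ordering};

/// Marks a scan as in flight for as long as the guard is alive.
struct InFlightGuard<'a> {
    flag: &'a AtomicBool,
}

impl<'a> InFlightGuard<'a> {
    /// Returns `None` if another scan already holds the flag.
    fn try_acquire(flag: &'a AtomicBool) -> Option<Self> {
        flag.compare_exchange(false, true, Ordering::AcqRel, Ordering::Acquire)
            .ok()
            .map(|_| InFlightGuard { flag })
    }
}

impl Drop for InFlightGuard<'_> {
    /// Runs on every exit path, including early returns and panics that unwind.
    fn drop(&mut self) {
        self.flag.store(false, Ordering::Release);
    }
}

/// Hypothetical scan entry point; `simulate_error` stands in for a timeout or failure.
fn scan_once(busy: &AtomicBool, simulate_error: bool) -> Result<&'static str, &'static str> {
    let _guard = InFlightGuard::try_acquire(busy).ok_or("scan already in progress")?;
    if simulate_error {
        return Err("scan failed"); // the guard is dropped here and clears the flag
    }
    Ok("scan complete")
}
```

Because the flag is reset by the destructor rather than by code at the end of the function, an error path cannot leave the scanner permanently busy.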
## v1.7.83-alpha (2026-06-11)
- App launch metadata now derives more consistently from app manifests, with typed launch interfaces and catalog generation updates that keep packaged apps aligned with their runtime ports and launch surfaces.
- Revoked or unsupported app surfaces were removed from the catalog and release path, including OnlyOffice and the unvalidated Saleor surface, so the Marketplace no longer exposes apps that cannot be safely supported in this release.
- The frontend production build now passes strict TypeScript checks after tightening app details, Web5, cloud refresh, and credential test typing.
- Mobile and desktop app surfaces received release polish: improved mobile app layout, safer mesh desktop/tablet scrolling, and the Home system card now routes directly to monitoring.
- Bitcoin UI status rendering now avoids false stale/reconnecting states when fresh block snapshots advance, and guards optional DOM updates so the standalone Bitcoin UI is more resilient.
- Deploy tooling now excludes local Codex scratch output, archived image-build artifacts, and upload screenshots from target syncs, and bounds optional IndeedHub fixups so a stuck Podman helper cannot hold up the deploy.
- Validation passed with `npm run type-check`, production `npm run build`, backend `cargo build --release`, catalog/release manifest checks, focused frontend tests, and live `.198` deploy verification through the frontend/service restart phase.
## v1.7.82-alpha (2026-05-22)
- Saleor storefront proxying now forwards `X-Forwarded-Host`, fixing Next.js Server Actions requests that compared the browser origin with the internal `storefront-app:3000` upstream host.
- Saleor storefront media now routes `/thumbnail/` and `/media/` through the same `9011` proxy to the Saleor API, fixing product image optimizer failures caused by `localhost:8000` media URLs.
- The Saleor storefront container receives an explicit internal media origin so rewritten media URLs resolve inside the Podman network without exposing private API ports to browsers.
- Validation passed with `cargo fmt --all --check --manifest-path core/Cargo.toml`, `cargo check -p archipelago --manifest-path core/Cargo.toml`, and live checks on `100.114.134.21` for storefront HTML, static assets, GraphQL, media redirects, and optimized product images.
## v1.7.81-alpha (2026-05-21)
- Saleor storefront installs now use the prebuilt registry image instead of building the Next.js app on-device, avoiding Podman build failures during stack installation.
- Existing Saleor stacks are repaired on adoption by recreating missing storefront containers, forcing the storefront app to bind `0.0.0.0:3000`, and resolving nginx upstreams dynamically after container restarts.
- The shipped Saleor storefront image now includes public assets and omits Vercel-only Speed Insights injection, fixing broken static asset responses and the local `/_vercel/speed-insights/script.js` browser warning.
- Validation passed with `cargo fmt --all --check --manifest-path core/Cargo.toml`, `cargo check -p archipelago --manifest-path core/Cargo.toml`, and live checks on `100.114.134.21` for `9011` storefront, static assets, and proxied GraphQL.
## v1.7.80-alpha (2026-05-21)
- Saleor storefront proxying now falls back to the direct request scheme when no forwarded protocol header is present, fixing direct `http://node:9011` launches that could generate an invalid same-origin GraphQL URL.
- The Saleor storefront release path keeps public proxy support intact by still honoring forwarded HTTPS headers for Nginx Proxy Manager domains while repairing local/direct port launches.
- Validation passed with `cargo fmt --check` and `cargo check` for the Archipelago backend before release staging.
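
The scheme fallback described above can be sketched like this. It assumes the forwarded protocol arrives as an `X-Forwarded-Proto`-style value; the function names and the `/graphql/` URL builder are illustrative, not the proxy's real code.

```rust
/// Picks the scheme for same-origin URLs: a valid forwarded value wins,
/// otherwise the scheme of the direct request is used.
fn effective_scheme<'a>(forwarded_proto: Option<&'a str>, direct_scheme: &'a str) -> &'a str {
    forwarded_proto
        .and_then(|v| v.split(',').next()) // first hop if the header lists several
        .map(str::trim)
        .filter(|v| v.eq_ignore_ascii_case("http") || v.eq_ignore_ascii_case("https"))
        .unwrap_or(direct_scheme)
}

/// Builds the same-origin GraphQL URL for a request to `host`.
fn graphql_url(forwarded_proto: Option<&str>, direct_scheme: &str, host: &str) -> String {
    format!(
        "{}://{}/graphql/",
        effective_scheme(forwarded_proto, direct_scheme),
        host
    )
}
```

With no header, a direct `http://node:9011` launch yields an `http://` GraphQL URL, while a request forwarded by Nginx Proxy Manager with an HTTPS value keeps `https://`.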
## v1.7.79-alpha (2026-05-20)
- Saleor now installs the official Saleor Storefront as part of the stack, built from the pinned `saleor/storefront` source and served as the customer-facing shop on port `9011`.
- Saleor app launches now open the storefront while the admin dashboard remains available on port `9010` with the generated `admin@example.com` credentials shown in Archipelago.
- Public Nginx Proxy Manager hosts forwarding to the Saleor storefront also expose same-origin `/graphql/`, so public storefront domains can talk to the local Saleor API without mixed-content or private-LAN reachability failures.
- Saleor stack metadata, marketplace descriptions, catalog ports, scanner exclusions, and app-session routing now describe the storefront/dashboard/API split explicitly.
## v1.7.78-alpha (2026-05-20)
- Public Nginx Proxy Manager hosts for Saleor now keep browser GraphQL calls same-origin at `/graphql/` and proxy them to the local API on `8000`, fixing `Failed to fetch` when a public domain such as `noderunner.shop` was loaded from devices that cannot reach the node's private LAN/tailnet API address.
- Saleor's validated stack changes are now release-ready: dashboard origins on port `9010` are explicitly allowed for dashboard/API calls, preserving the working test-node install path for production nodes.
- NetBird launches now stay pinned to the unified dashboard/proxy origin on port `8087` instead of following stale runtime-discovered server URLs on `8086`.
- NetBird's local nginx proxy now routes browser API, OAuth, relay, and WebSocket traffic through `host.containers.internal:8086` instead of a hard-coded rootless Podman gateway IP, and includes the upstream `management.ProxyService` gRPC path.
- The mobile credentials interstitial now keeps credential lists scrollable and action buttons reachable in both My Apps and the mobile app icon grid.
- Android WebView popup windows now hand external popup URLs to the system browser, covering app login/signup flows that open secondary windows.
- Validation passed with `git diff --check`, `cargo check -p archipelago`, and the focused `npm test -- src/views/appSession/__tests__/appSessionConfig.test.ts` suite.
## v1.7.77-alpha (2026-05-20)
- Saleor first-use now exposes generated credentials through Archipelago instead of leaving users at an unexplained dashboard login: App Details shows copyable `admin@example.com` credentials, and My Apps/mobile icon launches show a pre-launch credentials modal.
- Saleor installs now create or repair the `admin@example.com` staff account idempotently after sample data loads, use the correct dashboard mount path, and re-check stack containers after startup so stopped containers are caught.
- NetBird embedded login now uses the upstream-compatible IdP signing-key behavior and sends ID tokens from the dashboard to the management API, fixing the post-signup `Unauthenticated` state while preserving the unified local proxy/logout routes.
- Transient unnamed Podman helper containers created during app install tasks are hidden from My Apps, so generated names like `eager_keldysh` no longer appear as user applications.
- Validation passed with catalog/release JSON checks, `npm run type-check`, and `cargo fmt --all --check --manifest-path core/Cargo.toml`; live checks on `100.114.134.21` confirmed Saleor dashboard/API availability, generated Saleor admin login, NetBird OAuth availability, and NetBird logout redirects.
## v1.7.76-alpha (2026-05-20)
- Saleor installs now use dashboard port `9010`, avoiding the existing Portainer `9000` binding on the test node while keeping API `8000`, Mailpit `8025`, and Jaeger `16686` unchanged.
- Saleor's Valkey cache no longer bind-mounts `/var/lib/archipelago/saleor-cache`, and the dashboard container has the minimal rootless nginx capabilities it needs to chown cache files, bind port 80 inside the container, and drop workers to the nginx user.
- NetBird's browser proxy now sends API, OAuth, relay, WebSocket, and management traffic through the stable host-published server port at `169.254.1.2:8086`, avoiding stale rootless Podman DNS/IPs after `netbird-server` restarts.
- Mobile App Store category chips now stay visible above the tab bar, Discover is available on mobile, and category selection updates the page route/query so the selected category is actually shown.
- Apps that require a real browser tab now open directly from the app icon tap instead of first entering an in-shell app-session route, including BTCPay, Grafana, Home Assistant, Vaultwarden, Nextcloud, Portainer, OnlyOffice, Tailscale, Uptime Kuma, Gitea, and Nginx Proxy Manager.
- Validation passed with catalog JSON checks, `npm run type-check`, `cargo fmt --all --check --manifest-path core/Cargo.toml`, and `cargo check -p archipelago --manifest-path core/Cargo.toml`; live checks on `100.70.96.88` confirmed Saleor dashboard `9010`/API `8000` and NetBird API/OAuth routes survive `netbird-server` restart.
## v1.7.75-alpha (2026-05-19)
- Saleor is now published as a recommended commerce app with catalog metadata, icon, direct app-session launch on port `9000`, scanner metadata, image pins, and a full stack installer for dashboard, API, worker, PostgreSQL, Valkey, Mailpit, and Jaeger.
- Existing NetBird installs are repaired more aggressively by rewriting unified-origin config, recreating the dashboard/proxy containers, restarting the server, preserving data, and handling exact `/api` and `/oauth2` routes plus dashboard logout redirects through the local proxy.
- Desktop dashboard scrolling now hands focus back from the sidebar to the main content when the pointer or wheel moves over the main pane, preventing the sidebar scroll area from trapping wheel input on short screens.
- Validation passed with catalog JSON checks, `npm run type-check`, `cargo fmt --all --check --manifest-path core/Cargo.toml`, and `cargo check -p archipelago --manifest-path core/Cargo.toml` before release.
## v1.7.74-alpha (2026-05-19)
- App-session right panels now re-focus the iframe after load and when the frame area is activated, so wheel/touch scrolling works immediately after switching tabs or selecting an app on shorter screens.
- NetBird now launches through a unified local origin on port `8087` that proxies the dashboard plus `/oauth2`, `/api`, relay, WebSocket, and gRPC routes to `netbird-server`, fixing the embedded login flow that previously ended in `Unauthenticated` or `404 page not found` after logout.
- Existing NetBird installs are repaired on adopt/start by rewriting `config.yaml`, `dashboard.env`, and the local nginx proxy config, then creating the missing `netbird-dashboard` and `netbird` proxy containers when needed while preserving NetBird data.
- Saleor is still pending and is not included in this release; its registry/installer work remains local until it can be validated separately.
- Validation passed with catalog JSON checks, `npm run type-check`, `cargo fmt --all --check --manifest-path core/Cargo.toml`, and `cargo check -p archipelago --manifest-path core/Cargo.toml`.
## v1.7.73-alpha (2026-05-19)
- Mobile app launches for iframe-blocked apps now open the direct app URL in a new browser tab immediately instead of landing in a broken in-shell webview that requires a second tap.
- Mobile My Apps/Websites tabs now react to route query changes, App Store pages label the mobile view as Discover, mobile filters have safe bottom spacing, and App Store search ignores the current category so searches cover all available apps.
- My Apps search now surfaces matching App Store entries when the app is not installed, making it possible to jump directly from a failed My Apps search to the installable app details.
- NetBird self-host installs now prefer a `100.x` tailnet/CGNAT address for dashboard, management, relay, STUN, and auth redirect origins when one is present; live repair on `100.89.209.89` updated the existing stack from LAN origins to `100.89.209.89` and restored `netbird-server`.
- App-session iframe frames now focus automatically and wrap the iframe in a scroll host so wheel/touch scrolling works in the active right frame without requiring an initial click.
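
The "prefer a `100.x` tailnet/CGNAT address" rule from the NetBird entry above could look roughly like the sketch below. It assumes "100.x" means the shared address range `100.64.0.0/10`, which is narrower than the wording in the note; `is_cgnat` and `preferred_origin` are hypothetical names.

```rust
use std::net::Ipv4Addr;

/// True for 100.64.0.0/10, the shared (CGNAT) range commonly used by tailnets.
fn is_cgnat(addr: Ipv4Addr) -> bool {
    let [a, b, _, _] = addr.octets();
    a == 100 && (b & 0b1100_0000) == 0b0100_0000
}

/// Returns the first CGNAT address if one is present, otherwise the first candidate.
fn preferred_origin(candidates: &[Ipv4Addr]) -> Option<Ipv4Addr> {
    candidates
        .iter()
        .copied()
        .find(|a| is_cgnat(*a))
        .or_else(|| candidates.first().copied())
}
```

Given a LAN address and `100.89.209.89`, the sketch selects the tailnet address; with only LAN addresses it falls back to the first one.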
## v1.7.72-alpha (2026-05-19)
- Settings What's New now includes the missing release notes for `v1.7.68-alpha` through `v1.7.71-alpha`, so the modal reflects the current OTA history instead of stopping at `v1.7.67-alpha`.
- The follow-up release carries the NetBird install fix, Gitea icon polish, mobile app-session fallback updates, and rounder app icon masks from `v1.7.71-alpha` with the Settings modal notes included.
- The local Cargo lockfile version metadata is kept in sync with the release bump after the previous release build updated it.
## v1.7.71-alpha (2026-05-19)
- NetBird stack installs now pre-create `/var/lib/archipelago/netbird/data` before binding it into `netbird-server`, fixing the failed install/start path seen on `100.70.96.88` where Podman rejected the missing host directory.
- NetBird start/restart ordering now starts `netbird-server` before the dashboard container so lifecycle actions bring the control plane up before the UI.
- App-session invalid IDs and panel-mode fallbacks now return to `/dashboard/apps`, avoiding the stale `/apps` route that could render a 404.
- Mobile launches for apps that block iframes now stay inside the Archipelago app-session fallback instead of automatically opening an external browser tab.
- Installed Gitea containers now report the packaged Gitea icon, and app icon masks use a rounder radius on mobile grids, app cards, and detail headers.
- Validation passed with `npm run type-check`, focused Vitest app-session/app-grid tests, `cargo fmt --all --check --manifest-path core/Cargo.toml`, and `cargo check -p archipelago --manifest-path core/Cargo.toml`.
## v1.7.70-alpha (2026-05-19)
- NetBird is being corrected from the peer/client daemon image to the self-hosted NetBird control-plane stack with a launchable dashboard on port `8087`, a combined management/signal/relay server on `8086`, and STUN on UDP `3478`.
"description":"Electrum protocol server. Index the blockchain for fast wallet lookups.",
"description":"Electrum server indexing Bitcoin chain data for lightweight wallet queries.",
"icon":"/assets/img/app-icons/electrumx.png",
"author":"Luke Childs",
"category":"money",
@@ -99,7 +99,7 @@
"id":"indeedhub",
"title":"IndeeHub",
"version":"1.0.0",
"description":"Bitcoin documentary streaming with Nostr identity.",
"description":"Bitcoin documentary streaming platform featuring God Bless Bitcoin and other educational content about Bitcoin, sovereignty, and decentralized technology. Sign in with your Nostr identity.",
"icon":"/assets/img/app-icons/indeedhub.png",
"author":"IndeeHub",
"category":"community",
@@ -110,49 +110,133 @@
"id":"botfights",
"title":"BotFights",
"version":"1.1.0",
"description":"Bot arena + 2-player arcade fighter with controller support and Adventure Mode.",
"description":"Bot competition arena with 2-player arcade fighting mode. AI bots battle in trivia challenges while humans duke it out with controllers. Built for Bitcoiners.",
"description":"Fedimint ecash client daemon (fmcd). Lets your node hold Fedimint ecash and join federations; the wallet talks to it over a local REST API.",
"tailscaled --tun=userspace-networking & for i in $(seq 1 30); do [ -S /var/run/tailscale/tailscaled.sock ] && break; sleep 1; done; tailscale web --listen 0.0.0.0:8240 & wait"
]
}
},
{
"id":"portainer",
"title":"Portainer",
"version":"2.19.4",
"description":"Container management web UI for the local Podman socket.",
"notes":"Installed as a two-container stack: netbird dashboard on 8087 and netbird-server control plane on 8086 plus UDP 3478. For production clients, publish a DNS name over HTTPS with gRPC/WebSocket routing."
This will build all apps that have Dockerfiles. Standard apps (bitcoin-core, lnd, etc.) will use their official images, while custom apps (router, did-wallet, web5-dwn) will be built from source.
This will build all apps that have Dockerfiles. Standard apps (bitcoin-core, lnd, etc.) will use their official images, while custom apps (router, did-wallet) will be built from source.
### Build Specific App
```bash
./build.sh router
./build.sh did-wallet
./build.sh web5-dwn
```
## Running Apps via Archipelago
@@ -64,7 +63,6 @@ In development mode, apps are accessible on offset ports:
- **Router**: http://localhost:18084
- **DID Wallet**: http://localhost:18083
- **Web5 DWN**: http://localhost:13000
- **Nostr RS Relay**: http://localhost:18081
- **Strfry**: http://localhost:18082
@@ -72,7 +70,7 @@ See [PORTS.md](./PORTS.md) for complete port mapping.
## Development Workflow
### For Custom Apps (router, did-wallet, web5-dwn)
### For Custom Apps (router, did-wallet)
1. **Make changes** to source code in `apps/<app-id>/src/`
description:Bot competition arena with 2-player arcade fighting mode. AI bots battle in trivia challenges while humans duke it out with controllers. Built for Bitcoiners.
description:Bitcoin documentary streaming platform featuring God Bless Bitcoin and other educational content about Bitcoin, sovereignty, and decentralized technology. Sign in with your Nostr identity.
category:media
category:community
# The user-facing launcher (app_id "indeedhub"). Container is named "indeedhub"
# (matches the runtime's per-app references + the live container, so the
# orchestrator adopts it). Its nginx (listen 7777) proxies to the backends by
# their short aliases on indeedhub-net: api:4000, minio:9000, relay:8080.
container_name:indeedhub
container:
image:146.59.87.168:3000/lfg2025/indeedhub:latest
pull_policy:always # Pull from registry; falls back to local build
image:146.59.87.168:3000/lfg2025/indeedhub:1.0.0
pull_policy:if-not-present
network:indeedhub-net
dependencies:
- app_id:indeedhub-api
- storage:1Gi
resources:
cpu_limit:2
memory_limit:512Mi
disk_limit:1Gi
security:
capabilities:[]
readonly_root:true
no_new_privileges:true
user:1001
seccomp_profile:default
network_policy:bridge
apparmor_profile:default
# nginx master runs as root and drops workers to the nginx user (uid/gid
# 101) — needs SET{UID,GID}; CHOWN + DAC_OVERRIDE let it own + write the
# proxy cache under the tmpfs /var/cache/nginx. The orchestrator does
# --cap-drop=ALL, so (unlike the legacy `podman run` default caps) these
# must be declared or nginx workers die with "setgid(101) failed".
description:NetBird combined management / signal / relay server with an embedded identity provider and STUN. Backend for the self-hosted NetBird mesh VPN.
category:networking
# Hyphen name matches the runtime references (crash_recovery / dependencies /
# config startup order) + the live container, so on an existing node the
# orchestrator ADOPTS the running server rather than recreating it (data +
# the sqlite store under /var/lib/netbird preserved). Alias `netbird-server`
# is the short hostname the proxy's nginx proxies/grpc-passes to.
container_name:netbird-server
container:
image:docker.io/netbirdio/netbird-server:0.71.2
pull_policy:if-not-present
network:netbird-net
network_aliases:[netbird-server]
# The relay authSecret and the sqlite store encryptionKey are base64 keys
# (the server base64-decodes them to recover raw bytes — hex would decode to
# the wrong value). Generated once and reused: ensure_generated_secrets
# no-ops when the file already exists, so a re-render of config.yaml on an
# adopted node keeps the same keys (regenerating would orphan the store).
generated_secrets:
- name:netbird-relay-auth-secret
kind:base64
- name:netbird-store-encryption-key
kind:base64
# Pass the rendered config explicitly, mirroring the legacy `--config` arg.
description:Self-hosted WireGuard mesh VPN control plane with dashboard, embedded identity provider, management API, signal, relay, and STUN. The user-facing entry point — a TLS proxy in front of the dashboard + server.
category:networking
# The user-facing launcher (app_id + container both "netbird", matching the
# runtime references + the live container so the orchestrator adopts it). This
# is the nginx that terminates TLS on 8087 and fans out to the dashboard +
# server by their short aliases on netbird-net.
container_name:netbird
container:
image:docker.io/library/nginx:1.27-alpine
pull_policy:if-not-present
network:netbird-net
# Self-signed TLS cert materialised before create — the dashboard needs a
# secure context (window.crypto.subtle / OIDC PKCE, issue #15), so the proxy
# serves HTTPS. Idempotent: kept as-is when crt+key already exist (a user
# accepts it once). SAN defaults to the host IP + 127.0.0.1 + localhost.
generated_certs:
- crt:/var/lib/archipelago/netbird/tls.crt
key:/var/lib/archipelago/netbird/tls.key
dependencies:
- app_id:netbird-server
- app_id:netbird-dashboard
- storage:1Gi
resources:
memory_limit:256Mi
security:
# cap-drop=ALL is applied by the orchestrator. nginx (master as root, drops
r#"{"error":"This file is shared with the host's federation peers only. Federate with that node (exchange invites) so it recognizes you, then try again."}"#,