Compare commits

...

177 Commits

Author SHA1 Message Date
Dorian
38d2bbf570 chore(android): update companion APK download [skip ci] 2026-06-26 13:08:37 +01:00
Dorian
a90fea80ed feat(android): edit server entries from in-app settings menu (NESMenu); bump to 0.4.12 (vc16)
The 0.4.11 edit affordance only lived on ServerConnectScreen, which a
connected user never sees. Add edit to NESMenu — the settings modal
reached via two-finger hold while connected: a ✎ pencil on each saved
server opens the form pre-populated (Edit Server header + Cancel),
persists via ServerPreferences.updateSavedServer(), and reconnects when
the edited server is the live one.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
2026-06-26 13:08:18 +01:00
Dorian
389e602097 chore(android): update companion APK download [skip ci] 2026-06-26 12:54:52 +01:00
Dorian
5677f9cca1 feat(android): edit saved server entries; bump companion to 0.4.11 (vc15)
Add an edit affordance to each saved server in ServerConnectScreen: a
pencil button loads the entry into the form (Edit Server mode) with
Save Changes / Cancel actions. Persisted via a new
ServerPreferences.updateSavedServer() that replaces by connection
identity (address/port/scheme) and keeps the active record in sync when
the edited server is the active one.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
2026-06-26 12:54:07 +01:00
archipelago
fc64b422e7 docs(master-plan): WS-F#3 first destructive run — 3 reinstall bugs found
Full all-apps-lifecycle pass on .228: lifecycle 11/11, teardown 8/11.
Surfaced (1) fresh-install bind-dir ownership root:root → reinstall
EACCES (jellyfin/netbird; Fix B misses the install path), (2) netbird
reinstall adopts leftover containers → skips manifest cert/file render,
(3) portainer image pin lfg2025/portainer:2.19.4 unpublished (manifest
unknown), pin overrides RPC dockerImage. .228 restored.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
2026-06-26 07:47:24 -04:00
Dorian
07b9b5a3aa docs(android): companion release + App-Not-Installed runbook
Capture the 2026-06-26 lessons durably: ship via the hardened publish
script only, v1+v2+v3 signing is enforced by apksigner (AGP ignores
enableV1Signing at minSdk>=24), diagnose install failures with adb
install FIRST, signature-key changes force a one-time uninstall, and
keep all phone/adb work scoped to com.archipelago.app.debug.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
2026-06-26 12:21:48 +01:00
Dorian
ac59771560 fix(android): force v1+v2+v3 signing & clean-build guards in companion publish
The published companion APK was v2-only (AGP silently ignores
enableV1Signing for minSdk>=24) and clean builds broke on stray
space-named resource dirs. Harden scripts/publish-companion-apk.sh:
clean build, remove/ýreject space-named res dirs, force v1+v2+v3 via
zipalign+apksigner, and abort unless all three schemes verify. Wire
ship-companion.sh to the shared script. Re-sign the served 0.4.10 APK.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
2026-06-26 11:53:25 +01:00
Dorian
d1f9e9ce88 chore(android): update companion apk download 2026-06-26 11:32:00 +01:00
Dorian
58847fc3d7 chore(android): bump companion to 0.4.10 (versionCode 14)
Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
2026-06-26 11:31:36 +01:00
archipelago
a3e09eab57 docs(master-plan): WS-F#3 — destructive all-apps lifecycle matrix landed (43934eef)
Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
2026-06-26 06:29:51 -04:00
archipelago
43934eefa5 test(gate): destructive all-apps lifecycle matrix (WS-F#3)
Active counterpart to the read-only all-apps-matrix.bats: drives
stop/start/restart for every installed app and, under
ARCHY_ALLOW_CASCADE_DESTRUCTIVE, a FULL teardown (uninstall →
no-ghost → reinstall) — the broad coverage F needs beyond the ~8 core
suites. App set is discovered from My Apps ∩ the node catalog; reinstall
spec comes from catalog.json {dockerImage, containerConfig}.

PROTECTED by default (never cycled or torn down): bitcoin*/electrum*
(expensive resync) AND lnd/btcpay*/fedimint* (teardown = irreversible
wallet/channel/guardian loss). The user asked to protect only
bitcoin+electrum; the wallet apps are added for safety and can be
removed via ARCHY_MATRIX_PROTECT. Heavy + destructive → a supervised
pass, not folded into run-gate. Validated on .228: discovery excludes
the 6 protected installed apps; lifecycle tier cycles a single app
(botfights) stop/start/restart green; teardown gated.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
2026-06-26 06:29:22 -04:00
archipelago
80146f4476 docs(master-plan): WS-F#2 — uninstall progress bar made truthful (9f17ba68)
Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
2026-06-26 06:15:11 -04:00
archipelago
9f17ba6867 fix(ui): truthful uninstall progress bar (was a solid full-red block)
AppCard's uninstall bar was hardcoded `w-full bg-red-400/60 animate-pulse`
— a solid, full-width, red, fake-pulsing block that never moved and read
as an error, no matter the actual teardown progress (the install bar, by
contrast, renders a real percentage). Derive a truthful percentage from
the backend's existing `uninstall-stage` label — "Stopping containers
(X/N)" → 10–50%, "Cleaning up volumes" → 70%, "Removing app data" → 90%
— and render it exactly like install: neutral fill, real width + percent,
shimmer (not a fake pulse) carrying motion when a stage has no number.
Frontend-only; the backend already broadcasts these stages.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
2026-06-26 06:04:48 -04:00
archipelago
67426c0d41 docs(master-plan): cascade tier wired into the gate (b7d92107)
Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
2026-06-26 05:24:07 -04:00
archipelago
b7d9210784 test(gate): optional ARCHY_GATE_CASCADE pass — wire the cascade tier in
run-gate.sh ran only the DESTRUCTIVE tier; the cascade-uninstall suite
(uninstall→no-ghost→reinstall, the #13/#14/uninstall-hang regression
guard) existed but was never enabled by the gate. Add an opt-in single
cascade pass after the 5× loop (ARCHY_GATE_CASCADE=1, requires
ARCHY_ALLOW_DESTRUCTIVE=1), counted into the pass/fail tally. Kept out
of the 5× loop deliberately — uninstall/reinstall every iteration would
balloon runtime and re-pull images; one pass guards the class. Default
gate behavior unchanged. Validated: cascade-uninstall.bats 7/7 on .228.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
2026-06-26 05:22:45 -04:00
archipelago
292a2650df docs(master-plan): WS-F — uninstall-hang root cause fixed + cascade validated
Workstream F now in-progress: the immich/grafana uninstall hang →
ghost/stuck-bar/reinstall-block is root-caused (unbounded systemctl/
podman in quadlet::disable_remove) and fixed (71cc9ac4); cascade-
uninstall.bats 7/7 on .228. Records the remaining F items + the pending
gate-wiring decision.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
2026-06-26 05:18:39 -04:00
archipelago
71cc9ac46a fix(uninstall): bound systemctl/podman teardown so uninstall can't hang
Uninstalling immich/grafana could hang with a frozen full-red progress
bar, leave a ghost entry stuck in My Apps, and then refuse reinstall.
Single root cause: quadlet::disable_remove() — called first in the
uninstall task (via companion + orchestrator teardown) — ran
`systemctl --user stop`, daemon-reload, and `podman rm -f` with NO
timeout. On rootless podman a generated unit can wedge in "deactivating"
while podman hangs underneath, so `systemctl stop` blocks forever. The
spawned uninstall task then never returns Ok or Err, so:
  - set_uninstall_stage() (after the stop) never fires → progress frozen;
  - remove_package_state_entry() never runs → entry stranded in
    `Removing` → ghost in My Apps;
  - the install guard rejects reinstall with "already Removing".

The spawn wrapper already reverts state on Err and removes the entry on
Ok — the only failure mode was a hang that returns neither. Bound the
teardown so it always terminates:
  - systemctl stop → QUADLET_STOP_TIMEOUT, escalate to kill+reset-failed
    on timeout (reuses the existing helpers);
  - daemon_reload_user() → bounded systemctl_user_status (30s);
  - defensive `podman rm -f` → wrapped in tokio timeout.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
2026-06-26 04:27:02 -04:00
archipelago
2ebcd8f9a8 docs(master-plan): backlog — smart launch-port selection + manifest-driven archival-node blocker
§10b: replace per-app static launch-port map with a manifest-first +
non-HTTP-port-skipping heuristic (the gitea :2222 class).
§10c: generalize the un-pruned/archival Bitcoin install blocker from a
hardcoded requires_unpruned_bitcoin() match to a manifest-declared
dependency, with a clear pre-install UX.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
2026-06-26 03:47:25 -04:00
archipelago
3515344800 docs(master-plan): session h — zombie guard + gitea launch-port fix
Banner + §8b: zombie-container guard (0a8db904, live-proven on .228) and
gitea launch-port fix (670ebb06) shipped in binary 040df5ce, rolled to
the fleet. Logs the mempool env-drift recreate-loop and nostr-rs-relay
follow-ups.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
2026-06-26 03:41:59 -04:00
archipelago
670ebb0666 fix(launcher): pin Gitea launch URL to web port 3001 (not SSH 2222)
Gitea publishes two host ports — SSH on 2222 and the web UI on 3001.
The launch URL comes from manifest_lan_address_for() (the manifest's
interfaces.main → 3001), but Gitea had no entry in the static
lan_address_for() fallback map. On a node where the gitea manifest is
absent or stale (no interfaces block), the lookup returns None and the
code falls through to extract_lan_address(), which returns whichever
port podman lists first — frequently the SSH port. Result: the app
launched at :2222 instead of :3001 (observed on tailscale node
100.82.34.38).

Add the canonical "gitea" => http://localhost:3001 entry to the static
map, matching every other core app, so the web UI is pinned regardless
of manifest presence.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
2026-06-26 03:16:41 -04:00
archipelago
0a8db9044f fix(orchestrator): recreate zombie "Up" containers whose process is dead
podman trusts its own state DB: when a container's conmon dies without
podman observing it (cgroup-cascade SIGKILL on archipelago.service
restart, a crash), `podman ps` keeps reporting it "Up" long after the
process is gone. The reconciler NoOp'd such a zombie forever, so a dead
dependency with no published host port never recovered.

Observed live on .228 (2026-06-25): netbird-dashboard reported "Up" with
a dead State.Pid → its nginx proxy 502'd → NetBird login broke
("Unauthenticated"). The dashboard publishes no host port, so the
Running branch had nothing to probe and never recreated it.

Add a zombie guard to the Running branch: verify the recorded State.Pid
is alive (its /proc entry exists) before trusting "running"; on a
concrete dead PID, stop+remove+install_fresh from the manifest.
Conservative by design — any uncertainty (inspect failed, PID
unparseable) assumes alive, so a transient podman hiccup never destroys
a healthy container. Unit test covers live/dead/out-of-range PIDs.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
2026-06-26 02:25:52 -04:00
archipelago
43e700498b fix(android): trust self-signed certs for the user's own node in WebView
Node apps (e.g. NetBird on :8087) terminate TLS with a self-signed cert
so the dashboard gets a secure context (OIDC / window.crypto.subtle, #15).
The WebView's default onReceivedSslError CANCELs untrusted certs, so those
apps rendered blank in the companion — exactly the netbird "won't load in
the webview" report. Override onReceivedSslError in both WebViewClients
(kiosk + in-app browser) to proceed() only when the failing cert's host
matches the connected node; reject everything else (no blanket trust).

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
2026-06-25 18:13:52 -04:00
archipelago
89d397bb74 refactor(netbird): delete legacy Rust installer — #20 ph4 (manifest-driven only)
netbird is fully manifest-driven (apps/netbird-*/manifest.yml via the signed
catalog): install_stack_via_orchestrator renders the 3-member stack with
generated_certs (self-signed TLS for the #15 OIDC secure context), base64
generated_secrets, and templated config — and adopts the running stack by live
container name. The hardcoded `podman run` fallback was therefore dead code on
any node with the embedded catalog (verified live: .228 https:8087 -> 200).

Removes the per-app Rust installer anti-pattern the master plan calls out:
- install_netbird_stack: orchestrator -> adopt -> bail! (no in-Rust installer)
- deletes 6 now-dead helpers (write_netbird_config_files, ensure_netbird_tls_cert,
  read_or_generate_b64_secret, netbird_net_resolver_ip, detect_netbird_public_host_ip,
  wait_for_netbird_oidc_ready), 3 NETBIRD_*_IMAGE consts, unused base64::Engine import
- ~485 lines removed; prod_orchestrator doc-comments updated

Behavioural parity: the manifest path already executed on the fleet, so this
changes no live behavior. The legacy #10 OIDC-readiness wait was already bypassed
by the manifest path; if that race resurfaces, add an OIDC-ready gate to the
manifest rather than resurrecting the Rust fn.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
2026-06-25 11:04:01 -04:00
archipelago
41e7f500f8 test(lifecycle): tolerate slow-but-healthy heavy-app recovery under 5x churn
The 5x destructive gate on heavy nodes false-failed on transient windows
during stack recovery, not real regressions:

- immich.bats: lan_address port-publish probe 30s -> 90s. The postgres->redis
  ->server (DB migrations on boot) stack can take >30s to republish :2283 after
  a churn-induced recreate; destructive-tier immich tests already allow 180-240s.
- mempool.bats: orphan-container check now polls to steady state (<=30s) instead
  of a single-shot count, which caught a recreated member briefly visible
  alongside its replacement mid-reconcile.
- run-gate.sh: settle cap 180s -> 300s and also gate on immich's :2283 when
  installed, so the next iteration's read-only probe doesn't race a still-
  recovering stack. Settle returns the instant every probe is green.

A genuinely unexposed/orphaned/unhealthy app still fails these checks; they only
absorb the transient recreate window under sustained churn.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
2026-06-25 09:18:34 -04:00
archipelago
a721532f55 feat(orchestrator): desired-state recovery + recreate volume-ownership [UNVALIDATED WIP]
NOT yet validated on a node or fleet-deployed — cargo check passes, release build
+ .228 canary validation pending. Committed as a checkpoint so the work survives.

Two fixes the immich .198 incident exposed:

Fix A (reconcile_all_with_mode): a previously-running app whose container vanished
(e.g. a wedged podman teardown cleared by a reboot) was left absent on boot. Now,
when boot reconcile would leave an app 'absent' but it was running at the last
running-containers snapshot, recreate it (install_fresh). New
crash_recovery::load_last_running_names() reads the snapshot without the PID/crash
gate (+2 unit tests). Match is exact on compute_container_name (incl stack
members); user-stopped + uninstalled apps are already excluded, so no false
positives.

Fix B (ensure_bind_mount_dirs): a freshly-created bind dir was left root:root, so a
no-data_uid app running as container-root (→ host rootless user) hit EACCES and
crash-looped (the exact immich upload-dir failure). Now a newly-created bind dir
for a no-data_uid app is chowned via --reference=<parent> to match the rootless
data root — no host-uid guessing, only fresh dirs (no regression for existing
installs).

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
2026-06-24 09:28:40 -04:00
archipelago
80f49cac1c fix(ui): backoff remote-relay reconnects + stop cryptpad icon 404
Two console-noise fixes from a live error dump:
- remote-relay.ts reconnected on a FIXED 5s interval with no backoff, so when
  the backend is briefly down it floods the console/network with failed-WS
  attempts for the whole outage. It's a secondary feature (companion input), so
  add exponential backoff 1s->30s (mirrors websocket.ts), reset on open/start.
- cryptpad's catalog/marketplace entries pointed at a non-existent
  /assets/img/app-icons/cryptpad.webp -> a 404 on every marketplace render.
  Point it at the existing default icon (handleImageError swapped to it anyway).

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
2026-06-24 08:41:04 -04:00
archipelago
2d8ade629b fix(ui): log global errors silently instead of popping a toast + overlay
The global error handler (Vue errorHandler + window error + unhandledrejection)
fired a red 'Something went wrong: <raw msg>' toast AND an auto on-device overlay
on every caught error — deliberately loud for bug-bash, but it surfaces benign,
non-actionable noise (e.g. a transient RPC rejection during a ws reconnect, or
the service worker failing to register over a self-signed cert) right in the
user's face.

Demote the catch-all to SILENT capture: keep console.error + the
window.__archyErrors ring buffer, and expose the screenshot-able overlay
on-demand via window.__archyShowErrors() — but never auto-pop. Components that
need to report a specific, actionable failure still call toast.error() directly.

Also filter known-benign environmental noise (PWA service-worker registration
failing over a self-signed cert — needs a trusted cert, #56) so it doesn't even
occupy a ring-buffer slot and push out real errors.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
2026-06-24 05:55:49 -04:00
archipelago
0406af522c test(lifecycle): add manifest-driven all-apps health matrix
The per-app suites cover ~8 core apps in depth; nothing covered the ~30 others
(jellyfin, vaultwarden, penpot, nextcloud, grafana, …). all-apps-matrix.bats
derives the app set from server.get-state package-data (no hardcoded list) and
asserts baseline health across EVERY installed app:
  - settles to a non-transitional state within a window (the #13/#14 stuck-ghost
    class, generalized fleet-wide — installing/removing that never settles)
  - not in error/failed
  - reports a recognized (non-garbage) state
  - every running UI app (manifest ui=="true") exposes a non-null lan-address
    (the immich/port-drift unreachable-UI failure, generalized to all UI apps)

Read-only, so it joins run.sh/run-gate.sh on every node and grows coverage as
nodes install more apps. Verified 5/5 on .228 (17 apps) and .116 (20 apps).

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
2026-06-24 05:27:10 -04:00
archipelago
57a69257c4 test(lifecycle): add CASCADE uninstall/reinstall tier (guards #13 ghost, #14 reinstall)
The 5x gate is DESTRUCTIVE-only and never exercised uninstall/reinstall — where
the worst field bugs lived (#13 app ghosting in My Apps after uninstall, #14
reinstall stalling on stale state). New cascade-uninstall.bats drives the full
teardown path on a throwaway app (default grafana, precondition-skips if already
installed so it can't destroy real data) and asserts:
  - fresh install reaches running via a truthful, non-silent progression
  - uninstall makes the entry DISAPPEAR from server.get-state package-data
    (the literal My Apps map) — no ghost, no stuck uninstall stage
  - container + (on-node) data dir are gone
  - reinstall returns to running
  - node left as found

Opt-in via ARCHY_ALLOW_CASCADE_DESTRUCTIVE=1; not yet folded into the canonical
gate. Verified 7/7 against .228.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
2026-06-24 05:13:53 -04:00
archipelago
d1cd42c821 fix(orchestrator): stop retrying unrepairable volume chowns every reconcile
ensure_running_container_ownership re-probed and re-attempted the in-container
chown on every reconcile pass. For a mount that can't be re-owned from inside the
userns (observed: mempool-api /data -> 'Operation not permitted'), this burned
CPU and logged a WARN on every pass, forever (~6x/30min on .228/.116).

Remember hard chown failures in a process-lifetime set keyed by (container-id,
dest) and skip the probe+chown for known-unrepairable mounts. Keyed by Id (not
name) so a recreated container gets a fresh repair attempt. Verified on .116:
one recorded failure at startup, then silent across subsequent reconciles.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
2026-06-24 04:58:57 -04:00
archipelago
3e3016f2bd fix(ui): debounce connection-lost banner so transient ws blips don't flash
The reconnect banner showed 'Connection lost'/'Reconnecting' instantly on every
socket close, even ones that recover in 100ms-2s (load spikes, Tailscale/relay
TCP resets). On a healthy node the drops are brief and self-healing, but each one
flashed a jarring banner, reading as constant instability.

Debounce the transient banner by 2.5s: only surface after the connection issue
persists past the grace window; hide immediately on recovery. Deliberate server
lifecycle transitions (restart/shutdown) bypass the debounce and still show at
once. A genuine persistent outage keeps isOffline true and surfaces after 2.5s.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
2026-06-24 04:58:54 -04:00
archipelago
7d89b4d8b2 chore(registry): publish embedded app-catalog.json (52 manifests) for fleet fetch
Force-add the gitignored releases/app-catalog.json so nodes resolve
146.59.87.168:3000/lfg2025/archy/raw/branch/main/releases/app-catalog.json
(currently HTTP 404 → disk-manifest fallback). Embedded-manifest delivery
is default-on; origin-wins overlay with disk as fallback. Unsigned (migration
window accepts unsigned). Includes netbird x3 manifests.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
2026-06-23 23:45:31 -04:00
archipelago
15f65428b8 docs(master-plan): §8b — uninstall fix deployed+live-verifying, #15 guardian resolved
Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
2026-06-23 18:07:41 -04:00
archipelago
36015a19fe docs(master-plan): §8b session-b state — connection-lost+netbird+UX-merge shipped to .228, uninstall ghost fix, workstream F in progress
Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
2026-06-23 15:26:17 -04:00
archipelago
e57514b690 fix(uninstall): never ghost a removed app in My Apps on cleanup residue
handle_package_uninstall lumped every teardown failure into one `errors` vec
and returned Err on any of them BEFORE removing the package state entry — so a
non-fatal cleanup hiccup (a slow/failed `sudo rm -rf` of a large data dir, a
volume/network removal) left the app's containers gone but its entry in
package_data → a ghost in My Apps, and the spawned task reverted it to Installed.

Split the failures: container removal that even force-rm can't complete (app
genuinely still present) keeps the entry + returns Err; everything after the
containers are gone is best-effort. Remove the state entry as soon as the
containers are gone — BEFORE the slow volume/data teardown — so My Apps updates
immediately and residue can never ghost the app. set_uninstall_stage is a no-op
once the entry is gone (if-let guard), so the later stages don't re-create it.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
2026-06-23 15:23:16 -04:00
archipelago
4346007d37 fix(orchestrator): only TCP host ports get reachability-probed
wait_for_manifest_host_ports TCP-connect-probed every published port, including
UDP/SCTP. netbird's 3478/udp STUN can never answer a TCP connect, so the probe
failed forever and drove an endless host-port repair/reconcile loop on .228
(netbird-server restarting ~every 60s). Filter to tcp (empty protocol = tcp).

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
2026-06-23 14:40:48 -04:00
archipelago
44f7af2017 merge: companion-mobile-ux UX (loader/store-driven launch/icons + android webview) into main
# Conflicts:
#	Android/app/build.gradle.kts
#	Android/app/src/main/java/com/archipelago/app/ui/screens/WebViewScreen.kt
#	neode-ui/src/views/apps/appsConfig.ts
2026-06-23 14:07:44 -04:00
archipelago
9670af62b6 feat(registry): deliver app manifests via the signed catalog (embed by default)
Turn on registry-distributed manifests for all apps: generate-app-catalog.sh now
embeds each apps/<id>/manifest.yml by default (EMBED_MANIFESTS opt-out), so nodes
install from the signed catalog (origin-wins overlay, disk = fallback) with no
OTA-shipped disk manifest. main.rs awaits a bounded (25s) refresh_catalog before
load_manifests so a fresh boot overlays the latest embedded catalog instead of a
restart later; offline/ISO boot falls through to disk and never hangs.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
2026-06-23 13:39:54 -04:00
archipelago
a8b9b0f5e8 feat(netbird): manifest-driven migration via reusable orchestrator primitives
Migrate the netbird stack (server/dashboard/proxy) off ~500 lines of per-app Rust
to 3 declarative manifests, adding 4 reusable primitives:
- SecretGenKind::Base64 (netbird relay authSecret + sqlite store encryptionKey)
- GeneratedCert schema + ensure_manifest_certs (self-signed TLS so the dashboard
  gets a secure context for OIDC PKCE — issue #15; https proxy on 8087 preserved)
- templated GeneratedFile render: {{HOST_IP}}/{{HOST_MDNS}}/{{NETWORK_GATEWAY}}
  (aardvark resolver for the #15 stale-IP fix) /{{secret:NAME}} (never logged)
- legacy create_container now honours port.protocol (3478/udp STUN)
install_netbird_stack routes via the orchestrator first (legacy kept as fallback,
mirroring indeedhub); launch URL derives https://{host_ip}:8087 from host facts.
Legacy Rust deletion deferred to post-live-verify.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
2026-06-23 13:39:53 -04:00
archipelago
3c36cf1c40 fix(companion): stop image_exists journal flood that drops the UI websocket
image_exists ran `podman image inspect <image>` via .status() (inherits the
service stdout) with no --format, so every hit dumped the image's full ~249-line
manifest JSON into the journal — once per companion image, every reconcile pass
(.228: 21.6k journal lines / 10 min, 4131 inspect dumps). The service never
crashed (NRestarts=0); the sustained journald/IO flood starved the async runtime
and dropped the UI /ws/db websocket -> constant "connection lost"/reconnect.
Discard the child's stdout/stderr; only the exit status is used.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
2026-06-23 13:39:19 -04:00
archipelago
c4cd5fdc90 docs(master-plan): §8b resume — gate green + 6-node deploy + APK fix + workstream F
Comprehensive resume for the session restart: single-node gate green
(5/5 .228), latest backend + UX + one-tap companion APK deployed to 6
nodes (table w/ creds + pending 100.64.83.15 cred), workstream-F bugs
from manual testing, agreed next order (netbird → Phase-3 → F →
multinode), and loose ends (untracked AppLoadingScreen.vue, broken
gitea-local mirror, don't-delete-bitcoin-data directive).

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
2026-06-23 06:56:54 -04:00
archipelago
ccb594fb85 test(gate): fix bitcoin-knots getinfo-after-restart helper + IBD note
It called bats-assert's `fail` (not loaded in this file) → "fail:
command not found"/127, masking the real reason. Emit+return instead,
bump the cold-restart RPC window 60s→120s (block-index reload), and
note a node mid-IBD legitimately can't serve getinfo (environmental
precondition, not a product regression).

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
2026-06-23 06:28:20 -04:00
archipelago
deff380191 docs(master-plan): workstream F (lifecycle perfection) + §10 state-mgmt backlog
The 2026-06-23 5×-green gate is DESTRUCTIVE-tier / ~8 core apps only —
it skips uninstall/reinstall (cascade) and has no progress-UI or
all-apps coverage. Manual multinode testing found real bugs it never
ran (immich+grafana uninstall hangs at full-red bar + ghost in My Apps;
grafana reinstall stops; fedimint guardian "waiting for bitcoin sync").
Adds §4 row F, §6b post-deploy order (netbird→Phase-3→F), §6c scope +
observed bugs + definition-of-done, a §5 warning, and §10 backlog to
investigate TanStack-Query/push-based state management for neode-ui.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
2026-06-23 06:28:19 -04:00
Dorian
5c43e12782 chore(android): publish companion as raw APK instead of zip
Serve the companion download as a plain .apk so a phone installs it
straight from the link/QR with no unzip step. Repoint the in-app
download URL, the ship + publish scripts, and the pre-push hook at
archipelago-companion.apk, and drop the legacy .apk.zip.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
2026-06-23 09:41:10 +01:00
Dorian
e825bbed73 feat(android): file upload/download + in-app tab redesign
Companion WebView now supports file inputs and downloads, and apps
opened in the in-app tab get a proper loading splash and a footer
control bar matching the web app-session bar.

- onShowFileChooser wired to an ActivityResultLauncher so <input
  type=file> opens the system file browser (kiosk + in-app tab)
- DownloadListener: http(s) via DownloadManager (forwarding session
  cookies), blob: via JS->base64->MediaStore, data: decoded inline
- in-app tab: app-icon + progress loading splash (eager favicon
  fetch, upgraded via onReceivedIcon)
- footer controls (back/forward/refresh/open/close) matched to the
  web AppSession mobile bar, with the same SVG glyphs as drawables
- bump to 0.4.8 (versionCode 12)

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
2026-06-23 09:41:10 +01:00
archipelago
0dd19f0721 docs(CLAUDE.md): single-node gate GREEN — demote priority banner
run-gate.sh 5/5 on .228. Reframe the TOP PRIORITY banner as
gate-green; keep the master plan as north-star source of truth; mark
the gate definition-of-done green and point at multinode as the next
exit criterion.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
2026-06-23 04:35:50 -04:00
archipelago
ae47897601 docs: single-node production gate GREEN (5/5 on .228) — demote banner
run-gate.sh 5×-green on .228, 0 not-ok (gate-5x5.log). Records the
milestone in the header/banner, §4 workstream E, §6 sequence, and §8b;
demotes the priority banner per §6 item 6. Next: bundled testing deploy
(.116/.198 + UX frontend), multinode pass, workstreams B/C/D.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
2026-06-23 04:27:36 -04:00
archipelago
256d354048 docs(master-plan): tick off §8 P1 mobile app-launch UX (code-complete)
Mobile launch UX is code-complete on branch `companion-mobile-ux` (store-driven
panel, no interstitial, in-app WebView footer + loader, mesh 100dvh, ElectrumX
icon, companion v0.4.7 + shared debug keystore). Marked code-complete pending
on-device/mobile-web verification and merge to main.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
2026-06-23 04:11:25 -04:00
archipelago
2a249b8a48 feat(android): companion in-app WebView footer controls + loader; shared debug key; v0.4.7
- InAppBrowser now has a bottom control bar (back/forward/reload/open-in-browser/
  close) mirroring the web mobile footer, plus a centered loading screen
  (app favicon + progress bar) instead of a bare top bar over black.
- Commit a repo-dedicated debug keystore and pin signingConfigs.debug to it so
  every machine — and the published companion download — signs debug builds with
  the SAME key (fixes "App not installed" signature-mismatch on update). Force v1+v2.
- Bump versionCode 10→11, versionName 0.4.6→0.4.7.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
2026-06-23 03:48:58 -04:00
archipelago
a7c7c44843 feat(neode-ui): mobile app-launch UX — store-driven panel, loader, ElectrumX icon
- Mobile launches use the store-driven panel (no route push) so the background
  tab no longer changes and closing returns to where you launched from.
- Tab-only apps open directly (in-app WebView on companion / new tab on PWA) —
  no "this app opens in a tab" interstitial.
- Shared AppLoadingScreen (app icon + progress bar) on the app session and the
  legacy iframe overlay instead of a black screen.
- Pin the dashboard to 100dvh on mobile so the mesh chat/tools panes stop sliding
  under the bottom tab bar in mobile browsers (no-op in the companion WebView).
- ElectrumX/electrs/electrs-ui ids now resolve to the real ElectrumX icon in My Apps.
- isMobile made reactive so overlay/footer/teleport decisions track the viewport.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
2026-06-23 03:48:57 -04:00
archipelago
2afd18c6de test(gate): poll immich lan_address to absorb mid-recreate churn
5× run #4 flaked iter4 on "immich exposes its web UI lan-address
(port 2283)": container-list returned lan_address=null because
immich_server was momentarily mid-recreate when the read-only tier
queried it (passed the other 4 iterations; immich_server does publish
0.0.0.0:2283->2283). Same single-shot-read class as the bitcoin-knots
state probe — poll <=30s for the exposed port instead of one read. A
genuinely unexposed immich never publishes 2283, so real port drift
is still caught.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
2026-06-23 03:20:18 -04:00
archipelago
6511754545 docs: master-plan §8b — 5× triage, mempool restart bug fixed
Record the overnight 5× outcome (2/5) and the triage: all three
fails were distinct one-offs. iter1 #5 bitcoin-knots = pre-launch
churn (hardened anyway); iter2 #74 + iter5 #73 = one real
orchestrator bug (phantom stack-member injection in
ordered_containers_for_start), now fixed + live-verified on .228.
Update the resume check command to gate-5x4.log.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
2026-06-23 02:23:07 -04:00
archipelago
92d7f52dd6 fix(orchestrator): order only live containers on package start/restart
package.restart resolved its container list via
ordered_containers_for_start, which injected every name from the
union startup_order list that wasn't already present — including
variant names not live on a given node (mysql-mempool,
archy-mempool-api, archy-mempool-web). The phantom mysql-mempool is
2nd in the mempool start order, so do_orchestrator_package_start hit
its unknown-app-id fallback, do_package_start failed the inspect
("no such object"), and the `?` aborted the whole start sequence —
leaving mempool-api + the frontend down until the health monitor
recovered them minutes later. That was the source of the 5× gate
flakes #73 (frontend not running in 180s) and #74 (api not queryable
in 300s); root-caused from the .228 journal
("Start failed: mysql-mempool").

Replace the inject-then-sort logic with a pure helper
order_present_containers that orders only the actually-present
containers and never adds phantom entries. startup_order remains a
union of name variants across install generations — it's now used
purely to order what's live, not to inject what isn't. +3 unit tests.

Also harden bitcoin-knots.bats "valid state" probe: poll ≤30s for a
settled state instead of a single-shot read, so a container caught
mid-reconcile (transient restarting/configured) can't flake a 20-min
iteration. A genuinely-stuck container never settles, so real
breakage is still caught.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
2026-06-23 02:22:50 -04:00
archipelago
57a013bc66 test(gate): make 5× the canonical gate, drop 20x naming
Rename run-20x.sh → run-gate.sh, default ARCHY_ITERATIONS 20→5, and scrub
20× references across CLAUDE.md, the master plan, TESTING.md, app-registry
status, the orchestrator/config doc-comments, and the bats suites. Also add
a minimal fail() helper to mempool.bats so guard failures report cleanly.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
2026-06-22 18:12:41 -04:00
archipelago
0f05f73a23 fix(mempool): self-healing nginx backend proxy (v3.0.1) + gate timeout
The frontend nginx used a literal proxy_pass host with no resolver, so it
pinned mempool-api's IP at worker startup. When the backend restarts (gate,
OTA, crash, reboot re-IPAM) podman reassigns its IP and nginx keeps proxying
to the dead one -> /api hangs, websocket 502s, UI shows 'offline' until a
manual nginx reload. Same stale-upstream-IP class as the netbird 502.

Fix: mempool-frontend:v3.0.1 rewrites the generated nginx-mempool.conf to
re-resolve the backend per-request via 'resolver' + a variable proxy_pass.
Resolver address is read from /etc/resolv.conf (podman aardvark-dns answers
on the network gateway, not Docker's 127.0.0.11). Per-location path mapping
preserved (ws -> '/', /api/v1 identity via no-URI, /api/ -> /api/v1/ rewrite).
Proven on .228: backend IP change now auto-recovers with no reload; the
literal-host control still 502s. Migrated the manifest off the retired
tx1138 registry to vps2.

Also: mempool.bats #74 waited only 180s post-restart (the slow path) and
called an undefined 'fail' helper (status 127). Bumped to 300s to match the
passing parity probes and emit a real failure instead.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
2026-06-22 18:07:07 -04:00
archipelago
c8acc84506 docs: §2 invariant single-node (.228); multinode → separate plan 2026-06-22 17:23:19 -04:00
archipelago
8355453a7e docs: exact cutoff-proof resume in master-plan SS8b (resume from any device)
Captures: .228 1x-GREEN (110/110); hardened 5x DETACHED on .228 (/tmp/gate-5x2.log,
nohup — survives terminal close) with the exact check-from-any-machine command; all
shipped code fixes (commits) + deploy state (.228 + .198); node-state fixes NOT in
repo (lnd nginx proxy 8081->18083, home-assistant orphan unit removed, electrumx
re-registered); the run-ON-the-node lesson; and remaining work.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
2026-06-22 17:22:29 -04:00
archipelago
98f4fa44a8 test(gate): harden readiness for sustained 5x churn + inter-iteration settle
The 1x gate is green; the 5x failed iters 1-2 on readiness-under-churn (apps DO
recover — lnd synced, mempool just mid-restart when probed — but slower than the
windows when restarted back-to-back). Hardening:
- run-20x.sh: best-effort settle_stack() before each iteration (wait for
  mempool-api/frontend + lnd RPC healthy, 180s, on-node, never fails the run).
- required containers present/running (80/81): wait-loops (180s) not single-shot.
- mempool api/frontend (87/88): retry ~180s not single-shot.
- mempool queryable (74): 60s->180s. lnd restart-running (64): 120s->240s.
  lnd getinfo (60): 90s->240s retry.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
2026-06-22 17:11:15 -04:00
archipelago
22b05de6d9 docs(roadmap): P1 mobile app-launch UX — drop 'opens in a tab' interstitial
Companion app: open every app in the in-app WebView (not just non-iframeable),
carrying the mobile-iframe footer controls into the WebView. Mobile web (PWA):
open tab-apps directly in a new tab. No interstitial on either surface. Touch
points + prior commits (b5a9deb8, d1fbcd9b) noted.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
2026-06-22 16:57:44 -04:00
archipelago
27299ea687 docs: make the production test gate a SINGLE-NODE (.228) criterion; split out multinode
Per direction: the gate is now 5x green ON .228 only (run on the node, not via RPC).
Fleet/multinode verification (.198 + others) moved to a new docs/multinode-testing-plan.md
with the bootstrap recipe, per-node preconditions (synced archival bitcoin, no stale
nginx proxy targets, no orphan quadlet units), node roster, and cross-node suites.
Updated CLAUDE.md, master-plan SS5/SS6/SS8b/WS-E, and TESTING.md release gates.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
2026-06-22 16:47:34 -04:00
archipelago
892ff083c4 test(gate): fix the last 4 readiness/config false-fails (none are product bugs)
On a proper on-node .228 run (synced bitcoin, 4-fix binary) the lifecycle matrix is
green; these 4 were test-harness issues:
- lnd 'recovers after restart' (65): bump retry window 90s->240s. lnd cold-restart
  recovery (wallet unlock + bitcoind reconnect + graph sync) exceeds 90s on a loaded
  node but DOES complete (synced_to_chain:true).
- bitcoin ui responds (89): retry ~120s instead of single-shot (companion nginx may
  have just been recreated by the companion-survives test).
- probe_app_url (99 lnd proxy + all ui-coverage proxy probes): retry up to 90s for
  post-restart proxy/UI readiness instead of single-shot.
- required endpoints after restart (94): :8081 is nginx-proxy-manager, an OPTIONAL
  app (not in required_containers) — only assert it when NPM is installed; and make
  the trailing lncli getinfo a retry.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
2026-06-22 15:43:51 -04:00
archipelago
8893055810 test(gate): retry lnd getinfo for RPC readiness (wallet-unlock lags 'running')
lnd's RPC isn't ready until its wallet auto-unlocks on (re)start, which lags the
container 'running' state — single-shot lncli getinfo raced that window and
false-failed (gate tests 60 + 85). Retry up to ~90s like a health probe. lnd is
functional (getinfo returns cleanly once ready).

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
2026-06-22 14:45:36 -04:00
archipelago
53b8e47f1d test(gate): fix two false-failing lifecycle tests (not product bugs)
- immich restart: bump wait 120s->240s. Restart = ordered stop+start of the 3-
  container stack (postgres->redis->server w/ DB migrations), so it needs at least
  as long as the start test (180s) — the old 120s was inconsistent and false-failed
  on loaded nodes. immich does return to running.
- fedimint orphan check: the unanchored 'total' regex (^fedimint) counts the
  legitimate fedimint-clientd (dual-ecash bridge) but the anchored 'known' regex
  omitted it -> total>known false orphan on every node running fedimint-clientd.
  Add fedimint-clientd to known.

Both run as LOCAL podman/systemctl on the gate runner, so they test the runner node
(.116), not the RPC target — surfaced while driving the .228 gate green.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
2026-06-22 14:11:35 -04:00
archipelago
f4727bfdb3 docs(gate): companion self-heal fix validated (10s) + test-31 harness caveat
Independent companion loop (452f05d8) validated on .228: deleted archy-electrs-ui
recreates in ~10s (was stuck 100s+). Also: companion-survives bats does LOCAL
rm/systemctl --user, so running it from .116 via RPC tests .116's companions with
.116's binary, NOT the remote target — must run ON the target node. Explains the
'failed on both nodes' runs (both silently tested .116).

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
2026-06-22 13:44:57 -04:00
archipelago
452f05d849 fix(reconciler): decouple companion self-heal onto its own cadence
The companion-unit repair stage ran at the END of each boot-reconciler tick, after
reconcile_existing(). On a heavily loaded node that per-app pass takes >60-90s, so a
deleted/lost companion unit (electrs-ui, bitcoin-ui, …) wasn't repaired within any
reasonable window (gate test 31 'deleted unit recreated within one reconcile tick'
timed out at 90s on the 45-app .228 node). Detecting + rewriting a companion unit is
cheap, so spawn it as its own ~interval(30s) loop, independent of the slow app pass.
Handle is aborted when the main loop exits (shutdown uses notify_one, so a second
waiter would steal the wake permit). tick() is now app-reconcile only.

All 4 boot_reconciler cadence tests still green (companion_stage=false in tests).

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
2026-06-22 13:04:28 -04:00
archipelago
de7d3d83dc docs(gate): final read — every failure fixed/explained, no lifecycle bugs remain
Last 2 .228 stragglers confirmed load/timing, not bugs: test 31 (companion recreate)
= contamination + ~108s reconcile cadence > 90s window; test 55 (immich restart) =
heavy stack restarts >120s under load but DOES return. Path to literally-green gate
is infra (bitcoin sync, re-quadletize .228) + minor test-window tuning. Optional
product improvement noted: independent ~30s companion-reconcile cadence.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
2026-06-22 12:36:03 -04:00
archipelago
76b23adcc0 docs(gate): test 31 root-caused = .228 contamination (not a product bug)
companion::reconcile only recreates a deleted companion unit when its parent
backend is in manifest_ids. On contaminated .228, electrumx ran as plain podman
and was NOT a tracked manifest install (manifest on disk but unloaded), so the
reconciler never iterated it -> archy-electrs-ui companion orphaned. Proven:
package.install electrumx re-registered it + restored the companion. Self-heal
logic is sound; test 31 clears on re-quadletize. electrumx on .228 de-contaminated.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
2026-06-22 11:34:55 -04:00
archipelago
47a5148865 docs(gate): two-node result — stop blocker FIXED; residual red is bitcoin-IBD + node prep
.228 104/110, .198 94/110 with the 3-fix binary. Every package.stop test passes on
healthy apps. .198's 14/16 failures trace to bitcoin in IBD (test 83: ~137k blocks
behind) cascading to lnd/btcpay/electrumx/mempool. 2 node-independent: companion
recreate (31, both nodes), fedimint orphan pollution (44). Path to green 5x gate is
now infra (sync bitcoin, re-quadletize .228) + minor (test 31), not lifecycle bugs.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
2026-06-22 11:09:12 -04:00
archipelago
b090235b04 docs(gate): 3 stop bugs FIXED, electrumx suite GREEN on .228
Stop failure was 3 real product bugs (grace / reconcile-resurrection /
container-list user-stopped state), all fixed (2dad64b2, 760a32bc, 6e49ce6f) +
deployed. electrumx lifecycle suite 10/10 green (66s). fedimint 'crash loop' was
probe-induced churn (stable when left alone). Validating breadth next.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
2026-06-22 09:49:45 -04:00
archipelago
6e49ce6f88 fix(container-list): report user-stopped apps as stopped despite live UI companion
A user-stopped backend (electrumx, bitcoin, lnd, fedimint) kept reading 'running'
in container-list because its UI companion (electrs-ui, …) still serves the launch
port, and the state-refresh upgrades any reachable launch port to 'running'. The
gate's wait_for_container_status <app> stopped therefore never saw 'stopped'.

Fix: load the user_stopped marker in handle_container_list and force 'stopped' for
those apps before the launch-port refresh. The reconcile guard keeps the backend
down, so the marker is authoritative. package.start clears it first, so a started
app reports 'running' normally.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
2026-06-22 09:26:30 -04:00
archipelago
760a32bccf fix(reconcile): keep user-stopped apps stopped (reconciler was resurrecting them)
package.stop a dependency (e.g. electrumx, a mempool dep) and the reconciler
restarts it within ~8s: the reconcile filter's dependency_required override
re-includes a user-stopped app that an active app depends on, and the in-memory
disabled set is wiped on manifest reload — so ensure_running runs, the stopped
app's unreachable ports look like a fault, the host-port repair restarts it, and
package.stop never sticks (gate 'transitions to stopped' times out).

Fix: guard ensure_running_with_mode on the on-disk user_stopped marker (the single
choke point every reconcile flows through) → Left('user-stopped'). Explicit
install/start clear the marker first (added clear_user_stopped to orchestrator
install/start, symmetric with disabled.remove; start/restart RPC already cleared
it) so user actions are unaffected. The container itself already stopped correctly
— this stops the resurrection.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
2026-06-22 09:04:02 -04:00
archipelago
29cd167894 docs(gate): stop-grace fix shipped+validated; gate is multi-caused (5 issues)
Fix deployed to .198+.228, vaultwarden stops clean (no regression). But validation
showed the gate failures are multi-caused: (2) fedimint crash-looping/unhealthy on
both nodes can't be stopped; (3) host-listener repair watchdog restarts
port-unreachable containers fighting stop; (4) gate waits for 'stopped' but apps end
'exited'/'absent' (Exited->Stopped conversion key mismatch); (5) grace vs 60s
gate-timeout (electrumx 300s); (6) .228 contamination. Documented + re-sequenced
NEXT STEPS (fedimint health is the new top blocker).

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
2026-06-22 08:07:43 -04:00
archipelago
2dad64b2ee fix(stop): honour per-app graceful-stop grace in orchestrator stop path
package.stop left slow-to-SIGTERM apps (fedimint/electrumx/bitcoin/btcpay/immich)
running: the orchestrator path hardcoded podman API ?t=10 / CLI -t 30 and the CLI
wrapper deadline (30s) equalled the -t grace, so the await fired exactly as podman
SIGKILLed -> stop reported failed -> state reverted to running. Reproduced live on
clean .198 (fedimint).

- container/runtime.rs: add ContainerRuntime::stop_container_with_grace (defaulted
  so mock/dev impls are unchanged); PodmanRuntime honours grace for API + CLI with
  deadline = grace + 15s buffer; AutoRuntime delegates. New canonical per-app table
  stop_grace_secs_for() + DEFAULT_STOP_GRACE_SECS / STOP_GRACE_DEADLINE_BUFFER_SECS.
- podman_client.rs: stop_container_with_grace uses ?t=<grace> + longer HTTP deadline.
- prod_orchestrator::stop: resolve grace = manifest stop_grace_secs (north-star) else
  the table; pass to quadlet::stop_service_with_timeout AND stop_container_with_grace.
- quadlet.rs: stop_service_with_timeout so slow apps aren't SIGKILLed at 45s.
- rpc/package/runtime.rs: doc-note its &str stop_timeout_secs mirrors the canonical table.
- tests: resolve_stop_grace_secs (manifest field wins / table fallback / default 30).

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
2026-06-22 06:59:40 -04:00
archipelago
470e3c649a docs(gate): ROOT-CAUSE the stop blocker — orchestrator ignores per-app stop grace
Reproduced live on CLEAN .198: package.stop fedimint -> 'podman stop -t 30
timed out after 30s' -> stop fails -> state reverts to running. Real fleet-wide
bug (NOT .228 contamination). stop_timeout_secs() per-app grace (bitcoin 600/lnd
330/electrumx 300/fedimint 60) is used by legacy stop paths but NOT the
orchestrator path: ContainerRuntime::stop_container hardcodes API ?t=10 / CLI
-t 30, and PODMAN_CLI_DEFAULT_TIMEOUT=30s == the -t grace so the await fires as
podman SIGKILLs. Fix = thread per-app grace + widen wrapper deadline; owner picks
table-based vs manifest-driven stop_grace_secs. Re-escalated to blocker.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
2026-06-22 06:17:23 -04:00
archipelago
a111d79a05 docs(gate): downgrade stop-blocker ⚠️ — .198 has quadlet units, .228 state was my contamination
.198 ground truth: backend apps ARE quadlet (.container files present) -> quadlet
is the intended runtime. .228's plain-podman state traced to my cascade-gate
uninstall + package.start restore (no quadlet regen). Two real robustness sub-bugs
remain (start should regen quadlet; stop podman-fallback gap). Next: canonical
gate on CLEAN .198 first to tell real-bug from contamination.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
2026-06-22 06:00:42 -04:00
archipelago
47026fae30 docs(gate): document package.stop blocker + quadlet-vs-podman finding (.228)
5x gate run surfaced a real blocker: package.stop does not stop electrumx/
bitcoin-knots/btcpay/fedimint/immich (container stays running; gate stop-wait
times out). Root cause chain: these backend apps run as plain podman
--restart=unless-stopped, NOT quadlet units (PODMAN_SYSTEMD_UNIT empty; only UI
companions + home-assistant have .container files; bitcoin-core.container is
.disabled). orchestrator.stop() podman-fallback fires for filebrowser but not
electrumx -> suspect loaded()/is_unknown_app_id_error gap. stop->stopped state
reporting itself is correct (filebrowser proof, user_stopped guard).

Also: corrected the canonical gate invocation (DESTRUCTIVE only, not CASCADE);
restored .228 after my cascade-gate left apps stranded.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
2026-06-22 05:47:11 -04:00
archipelago
d6fa262d69 docs(#20): consolidate master-plan resume — indeedhub migration 2-node verified (.228+.198); cutoff-proof next-steps + deploy facts
Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
2026-06-22 04:23:52 -04:00
archipelago
e2a012d086 fix(indeedhub): frontend health = tcp:7777 not http GET / (stops reconcile churn)
On the loaded .198 the frontend churned (created → "unhealthy" → reconciler
recreates → loop). The http health check fetched / through nginx (SPA +
sub_filter) and false-failed under node load; the reconciler then treated the
frontend as wedged and recreated it. nginx binds 7777 at startup, so a tcp
liveness check passes immediately and stays green under load while still
catching a real "nginx not listening" failure. Generous retries/start_period.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
2026-06-22 03:39:26 -04:00
archipelago
e4d3f94913 docs(#20): hook exec cgroup gap FIXED + verified on .228 (scoped exec)
Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
2026-06-21 17:57:17 -04:00
archipelago
ff78b31212 fix(hooks): run post_install exec in a transient user scope (fixes cgroup denial)
Live on .228 the post_install `exec` steps failed with "crun: write
cgroup.procs: Permission denied / OCI permission denied": a `podman exec`
launched from archipelago.service can't place its child in the container's
cgroup (under the service's own slice). Wrap `exec` in
`systemd-run --user --scope --quiet --collect podman exec …` so it gets its own
delegated cgroup — same trick as `podman_user_scope` for pasta starts.
`copy_from_host` (a host-side `cp`, no in-container process) stays direct.

Without this only copy_from_host worked; indeedhub happened to be unaffected
(its image pre-bakes the nginx config so the exec steps were no-ops), but the
hook capability is only generally useful with exec working. hooks unit tests
pass; live verify on .228 next.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
2026-06-21 17:38:23 -04:00
archipelago
fdb465f8ac docs(#20): indeedhub fresh-create FIXED + verified on .228 (special-cases deleted + nginx caps); hook exec cgroup gap noted
Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
2026-06-21 17:26:23 -04:00
archipelago
ff8f11b87e fix(indeedhub): frontend nginx needs SET{UID,GID}+CHOWN+DAC_OVERRIDE under cap-drop-ALL
Live fresh-create on .228 (post special-case removal) had nginx workers die
with "setgid(101) failed (Operation not permitted)" → workers exited code 2,
port published but nothing served (HTTP 000). The orchestrator does
--cap-drop=ALL, so unlike the legacy `podman run` (default caps) nginx's master
couldn't drop workers to the nginx user. Declare CHOWN/DAC_OVERRIDE/SETGID/SETUID
(SET* to drop the worker user, CHOWN+DAC_OVERRIDE for the tmpfs proxy cache).

Verified on .228: frontend fresh-creates, caps applied, nginx serves, UI 200
incl. /api/ and /nostr-provider.js.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
2026-06-21 17:24:34 -04:00
archipelago
b73084dbb0 refactor(indeedhub): delete orchestrator special-cases; use generic path (#20 phase 3)
The fresh-create path was blocked by hardcoded indeedhub orchestrator logic
that predated and conflicted with the manifest migration:
- ensure_running routed app_id=="indeedhub" → reconcile_indeedhub_stack, which
  REFUSED to create the frontend from its manifest (returned Left("stack-managed")).
- run_pre_start_hooks("indeedhub") → start_indeedhub_backends →
  wait_for_indeedhub_dependencies_ready(120) — a DNS gate with a chicken-and-egg
  bug (required the frontend's own alias present before the frontend could be
  created), which failed install_fresh with "dependencies were not ready within
  120s" and left the frontend down (caught live on .228).

Delete all of it (−382 lines): reconcile_indeedhub_stack, start_indeedhub_backends,
wait_for_indeedhub_dependencies_ready, indeedhub_api_dependency_dns_ready,
indeedhub_required_aliases_present, repair_indeedhub_network_aliases,
indeedhub_alias_present, patch_indeedhub_nostr_provider, and the INDEEDHUB_*
consts. The manifests now carry everything these did: network_aliases (short
hostnames), generated_secrets, dependencies, and the post_install nginx hook. So
"indeedhub" + every member flows through the generic install_fresh/reconcile path
— the frontend fresh-creates normally and runs its hook.

(crash_recovery.rs's frontend-after-deps ordering guard is kept — it's beneficial
startup ordering, not a blocker.) cargo check + release build green.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
2026-06-21 17:11:33 -04:00
archipelago
84031e6209 docs: temporarily reduce release lifecycle gate from 20x to 5x
Per user direction: the production test gate is 5x (ARCHY_ITERATIONS=5) on
.228 AND .198 for now, down from 20x. Restore to 20x before the final ship.
Updated CLAUDE.md, PRODUCTION-MASTER-PLAN.md, and tests/lifecycle/TESTING.md.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
2026-06-21 17:11:00 -04:00
archipelago
9c45f718a2 docs(#20): fresh-create path blocked by legacy indeedhub orchestrator special-cases; fix plan + .228 recovered
Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
2026-06-21 16:36:22 -04:00
archipelago
8bdc857911 docs(#20): indeedhub phase 3 adoption path live-verified on .228
Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
2026-06-21 16:23:09 -04:00
archipelago
d2f7c4abf3 docs(#20): phase 3 code-complete (indeedhub manifests + orchestrator-first); next = .228 live verify
Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
2026-06-21 15:48:18 -04:00
archipelago
b1eea8c053 feat(indeedhub): manifest-driven 7-member stack, orchestrator-first (#20 phase 3)
Author the IndeedHub stack as 7 manifests (postgres/redis/minio/relay/api/
ffmpeg + frontend) and route install_indeedhub_stack through the
orchestrator first (immich pattern), falling back to the legacy installer
only when the manifests aren't deployed.

Data-preserving by construction — the manifests reproduce the live install
exactly so an existing node ADOPTS rather than recreates:
- container_name = the live hyphenated names the runtime already references
  (health_monitor tiers/deps, crash_recovery).
- named volumes indeedhub-{postgres,redis,minio,relay}-data (not bind mounts).
- dedicated indeedhub-net + network_aliases [postgres|redis|minio|relay|api]
  so the api/ffmpeg env hostnames and the frontend nginx upstreams resolve
  unchanged.
- generated_secrets (indeedhub-db-password/-minio-password owned by their
  backends, indeedhub-jwt by the api) reuse the live /var/lib/archipelago/
  secrets values (ensure_one no-ops on existing files; postgres pw is fixed
  at PGDATA init). minio user "indeeadmin" + AES_MASTER_SECRET literal kept.

The frontend carries the post_install hook (#20) that replaces the hardcoded
patch_indeedhub_nostr_provider: strip X-Frame-Options, refresh
nostr-provider.js from /opt/archipelago/web-ui, inject the <script> if
absent, reload nginx — defensive/idempotent since indeedhub:1.0.0 already
bakes these. Frontend manifest also corrected off its dead Next.js shape
(health check now nginx :7777, tmpfs /run + /var/cache/nginx).

Builds + unit-tested; live adoption/lifecycle verification on .228 next.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
2026-06-21 15:46:26 -04:00
archipelago
b94b61f640 feat(manifest): network_aliases — extra DNS aliases on a container's network
Add `container.network_aliases: Vec<String>` (serde default, DNS-label
validated) so a stack member can answer to short hostnames its peers bake
in, beyond its own container name. Rendered in both runtime paths:
- podman_client: merged (deduped) into the custom-network aliases array.
- quadlet from_manifest: appended after the container name; emitted only
  for Bridge networks (slirp/pasta reject aliases).

Needed for the indeedhub migration: its frontend nginx proxies to
`api:4000` / `minio:9000` / `relay:8080`, so those members declare
`network_aliases: [api|minio|relay]` to keep the short names resolvable on
the dedicated indeedhub-net (vs. colliding generic aliases on archy-net).

Also fixes 4 pre-existing from_manifest test failures (unrelated to this
change, surfaced now that the quadlet suite runs green): test manifests
used the long-invalid `network_policy: archy-net` (allowlist is
isolated/bridge/host → moved to network_policy: isolated + container.network)
and bind sources outside /var/lib/archipelago.

Tests: container crate 53 pass; archipelago quadlet+alias 47 pass.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
2026-06-21 15:45:11 -04:00
archipelago
ccb5b7ca39 docs(#20): mark hook phases 1+2 done; resume notes point to phase 3 (indeedhub)
Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
2026-06-21 11:49:05 -04:00
archipelago
955c54b713 feat(hooks): post_install executor + install-path wiring (#20 phase 2)
Add container::hooks::run_post_install — runs an app's declarative
post_install hooks against its own running container:
- Exec  -> podman exec <container> <args…> (60s timeout-bounded)
- CopyFromHost -> resolve src against allowlist roots (<data_dir>/<app>
  and /opt/archipelago), canonicalise + prefix-check (defeats symlink
  escape), then podman cp <abs-src> <container>:<dest>

Best-effort + idempotent: a failed step is warned and skipped, never
fails the install — matching the legacy patch_indeedhub_nostr_provider
behaviour this replaces. Wired into install_fresh after the container is
up, so it runs only on a freshly created container (not plain start), and
re-applies on recreate-after-drift.

5 unit tests on resolve_copy_src (accept in-data-dir, reject absolute /
traversal / missing / symlink-escape). cargo test -p archipelago green.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
2026-06-21 11:45:28 -04:00
archipelago
4c1a4e5976 feat(hooks): manifest lifecycle-hooks schema (#20 phase 1) + fix container test literals
Add controlled post_install/pre_start hook schema to AppDefinition:
LifecycleHooks/HookStep (Exec | CopyFromHost)/HostCopy with allowlist
validation (relative src, no '..', absolute container dest, non-empty
exec). Re-exported from the crate root. Design: docs/manifest-hooks-design.md.

Also add the missing generated_secrets: vec![] field to three
pre-existing ContainerConfig test literals (the field was added to the
struct in 03a4ee1b but the container crate's own tests were never rerun,
so -p archipelago-container failed to compile). cargo test green: 53 pass.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
2026-06-21 11:07:00 -04:00
archipelago
b0b54a96fa test(lifecycle): immich suite — package-level checks, wait-based destructive tier
container-list reports stack apps package-level (.name="immich"), so the suite
checks the "immich" package (presence, valid state, :2283 lan-address) rather than
individual container names. Destructive tier fires async stop/start/restart and
asserts on the end state via wait_for_container_status.

KNOWN: the destructive tier is flaky for slow multi-container stacks — bats runs
ops back-to-back with no settling while immich's async stack ops take 30s+, and
stopped reports as "exited" not "stopped". The immich migration itself is verified
working (manual stop/start/restart succeed; all 3 containers healthy). Hardening
the harness for stack apps (inter-op settling + stopped|exited acceptance) is a
follow-up.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
2026-06-21 09:52:33 -04:00
archipelago
f0c6b79d1a fix(immich): name containers underscore to match runtime lifecycle code
package.stop/start/restart broke ("no containers found" / "no such object
immich_postgres") because the runtime hardcodes the immich stack's container names
as immich_server/immich_postgres/immich_redis (underscore) across 8 files
(lifecycle, health, crash-recovery, ports, config). The migration had named the
containers by app_id (hyphen), mismatching all of it.

Root cause of the earlier failed attempt: container_name was nested under an
`extensions:` block, but `app.extensions` is serde(flatten) — container_name must
be a TOP-LEVEL app key to be read by compute_container_name. Fixed: set
container_name: immich_server / immich_postgres / immich_redis at top level, and
point DB_HOSTNAME/REDIS_HOSTNAME at the underscore aliases. App ids stay hyphen
(immich/immich-postgres/immich-redis) so the catalog identity (title+icon) holds.

Manifest-only change — container names now match existing runtime references, no
code edits to the 8 files. (Deriving stack containers from manifests instead of
hardcoded lists remains a north-star follow-up.)

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
2026-06-21 09:20:38 -04:00
archipelago
b1f175b927 test(lifecycle): add immich stack lifecycle suite
RPC-based (host-agnostic) lifecycle coverage for the manifest-driven immich stack
(immich + immich-postgres + immich-redis): presence + valid state of all 3 members,
a guard that no legacy underscore containers exist (catches botched migration /
legacy-installer fallback), destructive stop/start/restart of the server with
postgres+redis staying up, and cascade uninstall/reinstall (preserve_data).

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
2026-06-21 09:01:19 -04:00
archipelago
c548705147 docs: master plan — mark registry-manifest phases 1-3 + immich + reboot-survival done
Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
2026-06-21 08:25:40 -04:00
archipelago
f160e0c404 fix(reboot): enable podman-restart.service at startup (--restart reboot-survival)
Orchestrator-installed backends (immich, btcpay-db, …) run as plain podman
`--restart=unless-stopped` containers until the Phase-3 Quadlet rollout flips
use_quadlet_backends on. Nothing in the codebase enabled the user's
podman-restart.service, so those containers had NO reboot-survival mechanism.
Enable it (idempotent, best-effort) at orchestrator startup so unless-stopped
containers come back after a reboot. Already applied manually on .228 (covers
31 containers incl. immich + btcpay); this codifies it fleet-wide.

The deeper fix (render Quadlet for all orchestrator installs) remains the gated
Phase-3 Quadlet-everywhere rollout.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
2026-06-21 08:23:19 -04:00
archipelago
d5ef45731a fix(immich): restore canonical app_id "immich" (title + icon)
After the manifest migration the launcher installed as "immich-server" (app_id),
which has no catalog entry → showed the raw id and no icon. Rename the server
manifest app_id immich-server→immich so it matches the catalog/curated "immich"
entry (title "Immich", icon immich.png) and is recognised as a known launcher app
(APP_CATEGORY_MAP) → stays in My Apps. immich_stack_app_ids now installs
[immich-postgres, immich-redis, immich]; orchestrator.install bypasses package
routing so there's no recursion with the "immich"→stack-installer mapping.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
2026-06-21 08:07:08 -04:00
archipelago
0860dfacc7 feat(ui): Services tab — backend classification, parent icons, categories sub-nav
- Classify databases/APIs/backends into Services (#10): add immich-postgres/redis
  to SERVICE_NAMES; isServiceContainer matches -postgres/-redis/-valkey/-cache/-db
  suffixes; isWebsitePackage final fallback now routes any no-UI, non-known package
  to Services ("anything that isn't the frontend UI launcher").
- Services show their parent app's icon (#14): backends reuse the app logo
  (immich-* → immich, archy-btcpay-db → btcpay, indeedhub-* → indeedhub, etc.)
  via explicit APP_ICON_FALLBACKS + prefix map, instead of 404 → 📦.
- Categories sub-nav for Services (#12): getServiceCategory + buildServiceCategories
  + useServiceCategories; Services tab gets the same desktop/mobile category strips
  (Databases/Caches/APIs/Backends), shown only for categories with items. Shared
  selectedCategory resets to 'all' on tab switch.
- Mobile swipe (#11): the tab-swipe gesture is suppressed over .mobile-category-strip
  so swiping the category chips scrolls them instead of changing tabs (covers both
  My Apps and the new Services strip).

vue-tsc build clean.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
2026-06-21 07:42:48 -04:00
archipelago
9e6c5370fc feat(immich): manifest-driven stack via orchestrator — live-migrated on .228
Completes the immich migration off the legacy hardcoded install_immich_stack
(podman run + sudo chown) to the registry-manifest + orchestrator path. Validated
live on .228 (clean single set, healthy v2.7.4, data dir ownership correct).

- install_immich_stack now tries install_stack_via_orchestrator(immich_stack_app_ids)
  first; legacy remains only as the no-manifests fallback.
- immich-{postgres,redis,server} manifests corrected from live findings:
  * named by app_id (dropped container_name override) — using container_name
    spawned DUPLICATE containers (app_id-named install vs name-override reconcile)
    on the same PGDATA, which corrupted a postgres cluster. Server reaches its
    siblings via app_id aliases (DB_HOSTNAME=immich-postgres, REDIS=immich-redis).
  * immich-postgres data_uid 100998:100998 (postgres drops to container 999 →
    host 100998 under rootless; verified the fresh dir is chowned correctly).
  * immich-server version "release"→"2.7.4" (manifest validation requires a digit;
    the bad version made the manifest silently skip → partial orchestrator install
    → legacy fallback → the duplicate corruption above).
- HARDEN install_stack_via_orchestrator: only fall back to the legacy installer
  when NOTHING was installed yet. An "unknown app_id" AFTER a member is up now
  errors instead of double-creating containers on shared data (the corruption
  root cause).
- Strict the all-manifests round-trip test: fail (not skip) on any invalid shipped
  manifest — this gap let the bad immich-server version through.

Known follow-up (pre-existing, platform-wide): orchestrator-installed backends
(immich, btcpay-db) run as podman --restart, not Quadlet, and podman-restart.service
is disabled on .228 → reboot-survival gap independent of this migration.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
2026-06-21 07:08:45 -04:00
archipelago
011081d180 feat(immich): scaffold registry manifests for postgres/redis/server (not yet live)
immich becomes a manifest-driven stack (the legacy install_immich_stack — hardcoded
podman run + sudo chown — is the anti-pattern being retired). Three image-only
manifests modelled on the btcpay stack + the live .228 container config:

- immich-postgres / immich-redis / immich-server on archy-net; container_name set
  to the underscore form (immich_postgres/_redis/_server) so the server's
  DB_HOSTNAME/REDIS_HOSTNAME aliases resolve.
- generated_secrets: [immich-db-password] (idempotent — reuses the live secret on
  existing nodes; postgres is already initialised with it).
- server depends on postgres+redis (install ordering); upload bind preserved.

Inert for now: not added to the UI catalog and install_immich_stack still the
default, so nothing installs these until the orchestrator wiring + on-node
ownership (data_uid) validation lands. Schema validated by the all-manifests
round-trip test. See docs/PRODUCTION-MASTER-PLAN.md §6.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
2026-06-21 05:53:58 -04:00
archipelago
7bfbe8fe40 feat(registry-manifest): phase 2 — publisher embeds manifests into signed catalog
generate-app-catalog.sh gains opt-in EMBED_MANIFESTS=1: embeds each
apps/<id>/manifest.yml into its catalog entry's `manifest` field (whole document,
top-level app: preserved — exactly what the Rust side deserializes). Default off
so routine catalog regen is unchanged during the migration window; turn on
deliberately, then sign via the existing release-root ceremony. Verified: default
embeds 0; EMBED_MANIFESTS=1 embeds 40 manifests (generated_secrets preserved).

Adds a round-trip guard test: every shipped apps/*/manifest.yml must deserialize
+ validate through catalog_manifest_to_overlay (image apps accepted, build apps
defer to disk) — catches schema drift between disk manifests and the catalog path.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
2026-06-21 05:46:17 -04:00
archipelago
220666d3a9 feat(registry-manifest): phase 1 — orchestrator consumes manifests from signed catalog
Workstream B phase 1 (node-side consume). The signed app-catalog can now carry a
full manifest per entry; the orchestrator overlays it over the disk manifest
(origin-wins) with disk as the migration fallback. Moves apps toward
registry-distributed manifests with no OTA-shipped disk file.

- app_catalog: `manifest: Option<Value>` on AppCatalogEntry (forward-compatible,
  covered by the existing release-root signature over the raw JSON);
  `catalog_manifest_values()` accessor.
- prod_orchestrator: `load_manifests` overlays catalog manifests after the disk
  walk; `catalog_manifest_to_overlay()` returns None (→ disk fallback) on
  unparseable value / app-id mismatch / failed validate() / build source
  (build contexts aren't registry-distributed yet — phase 1 is image-only).
- manifest_dir stays PathBuf (build-only field); image-only apps never read it.
- 6 unit tests; compiles clean. No-op until a catalog embeds a manifest, so
  existing nodes are unaffected.

See docs/registry-manifest-design.md.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
2026-06-21 05:30:38 -04:00
archipelago
192238cbb8 docs: consolidate into PRODUCTION-MASTER-PLAN, add CLAUDE.md, prune 25 stale docs
Single authoritative hub (docs/PRODUCTION-MASTER-PLAN.md) for the app-platform
north star: every app manifest-driven (zero OS-level reliance), manifests via the
signed registry, developer-ready external marketplace; rootless/secure/robust/
100%-uptime. Repo CLAUDE.md (auto-loaded each session) points agents at it until
the 20x lifecycle gate is green. New design doc registry-manifest-design.md.

Consolidated docs 56 -> 28: deleted dated handoffs/resumes/transcripts and
superseded trackers (content folded into the master plan or already in memory).
Kept all evergreen design/reference docs + ADRs (the master links them).

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
2026-06-21 05:11:32 -04:00
archipelago
03a4ee1b30 feat(container): manifest-declared generated secrets + companion/quadlet hardening
Generated-secrets system: apps declare `generated_secrets` in their manifest
(kinds hex16/hex32/bcrypt); `container::secrets::ensure_generated_secrets`
materialises them 0600/rootless in resolve_dynamic_env — idempotent and
self-healing (recovers wrongly root-owned secrets with no privilege). Replaces
per-app Rust (deletes ensure_fmcd_password). fedimint-clientd/gateway manifests
now declare fmcd-password / fedimint-gateway-hash.

companion.rs: rebuild the auto-built :latest image when its build context changes
(staleness check) so baked-in fixes (e.g. guardian-UI CSS) actually reach nodes.

quadlet.rs: skip PublishPort under Network=host (podman rejects the combo, exit
125) + regression tests.

UI: "Fedimint Guardian" rename, fedimint-clientd/nostr-rs-relay/meshtastic tagged
as Services (headless backends), gateway icon fallback.

Deployed + verified on .228 (generated-secrets fixed fedimint-gateway start;
grafana/strfry orphan crash-loop units removed).

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
2026-06-21 05:11:07 -04:00
archipelago
db7d424bff feat(content): owned-content persistence + Fedimint paid downloads, fmcd caps fix, FIPS warm-path perf
Buyer-side paid downloads now persist: purchases are cached on disk
(content_owned.rs) keyed by (seller onion, content_id), the gallery shows
an "Owned" badge unblurred, and items view/play in-app from the local
cache with no re-payment or reliance on a browser download (which
silently failed on the mobile companion). New RPCs content.owned-list /
content.owned-get. Validated e2e .116<-.198 (paid 100 sats via Fedimint,
166KB jpeg returns, survives restart).

fedimint-clientd manifest: restore the standard container capability set
(CHOWN/DAC_OVERRIDE/FOWNER/SETUID/SETGID) so fmcd's startup chown of an
existing-federation /data succeeds instead of dying EPERM (#7). Confirmed
the orchestrator applies these to the running container.

FIPS perf: tighten the supervisor warm-path keepalive 45s -> 25s so peer
paths stay inside the ~30-60s NAT cold window. Dials now reliably land on
FIPS instead of re-punching and falling back to Tor. Measured to the same
peer: cloud browse 18-22s -> 0.4s; full Fedimint paid download 29s -> 11s
(residual is the seller-side guardian reissue round-trip).

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
2026-06-20 18:58:52 -04:00
archipelago
b0c9bd2a0c docs: #7 exhaustive isolation — seccomp ruled out; fmcd runs standalone, orchestrator-managed fails (open)
Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
2026-06-20 14:39:33 -04:00
archipelago
63b98599e8 Revert "fix(fedimint): run fmcd with seccomp=unconfined so its DHT can start (#7)"
This reverts commit 409543c41e78025354acbdde5ffc6445895d4508.
2026-06-20 14:37:24 -04:00
archipelago
409543c41e fix(fedimint): run fmcd with seccomp=unconfined so its DHT can start (#7)
fmcd crash-looped "Operation not permitted (os error 1)" on .116 (kernel
6.12.74): the default rootless seccomp profile blocks a syscall its Mainline-DHT
/ iroh transport needs, so the REST API never came up (:8178 → HTTP 000) and
federations couldn't be joined. Verified: with seccomp=unconfined fmcd boots and
answers /v2/* (HTTP 401 instead of dead). fmcd works on other nodes, so this is
kernel/seccomp-specific — but the relaxation is safe for an outbound-networking
daemon and harmless where not needed.

- new `security.seccomp_unconfined` manifest flag (SecurityPolicy);
- libpod backend sets `seccomp_profile_path: "unconfined"` (== --security-opt
  seccomp=unconfined); quadlet backend emits `SeccompProfile=unconfined`;
- enabled in apps/fedimint-clientd/manifest.yml.

NOTE: manifests live on-disk at /opt/archipelago/apps/<id>/manifest.yml, so the
node needs the updated manifest deployed + the fmcd container recreated to apply.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
2026-06-20 13:08:13 -04:00
archipelago
d59cf6d299 docs: session 3 — ecash confirm+refund, #5 confirmed, #7 fmcd-on-.116 EPERM
Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
2026-06-20 12:28:24 -04:00
archipelago
12f54e390d feat(wallet): ecash pay confirmation screen + auto-refund on failed sale (#3)
- PeerFiles: new confirmation step after "pay from ecash" — shows the amount and
  which wallet will be spent (Cashu/Fedimint) with balances, lets the user switch
  backends, and a styled Confirm button. The chosen backend is passed to the
  payment so it spends exactly what was confirmed.
- content.download-peer-paid: accept `method` (cashu|fedimint) to honor the
  confirmed choice; log the backend + outcome; backend-specific rejection errors
  ("not in the same Fedimint federation" / "doesn't accept your Cashu mint").
- AUTO-REFUND: a minted token whose sale fails (peer unreachable, rejected, or
  error) is now reclaimed (fedimint reissue / cashu receive) so the buyer no
  longer loses the spent ecash — fixes the stuck-Fedimint-notes report.
- wallet.ecash-balance already reports cashu_sats/fedimint_sats/total_sats which
  the confirm screen uses to pick/show the covering wallet.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
2026-06-20 12:16:02 -04:00
archipelago
242baf5deb fix(ui): on-screen error overlay so companion crashes are visible without a console
chrome://inspect isn't always reachable on the Android companion WebView, so the
real error stayed invisible. Add a plain-DOM, screenshot-able overlay (built
without Vue so it survives a crash in Vue itself) that shows the captured error
message + stack and a Copy button for the full window.__archyErrors buffer.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
2026-06-20 10:23:59 -04:00
archipelago
0ab160b5c3 docs: deploy state — all 6 nodes on 4a8f2198 build (#12/#2/#3/#10)
Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
2026-06-20 10:15:59 -04:00
archipelago
a6957a48f7 fix(netbird): wait for OIDC discovery before reporting install done (#10)
Right after install the dashboard SPA opens and, if it loads before NetBird's
embedded OIDC provider is serving, caches a bad auth state — the user appears
logged-in but can't log out until it self-corrects. Container "running" != OIDC
ready, so gate the install's Done phase on the management server's
/oauth2/.well-known/openid-configuration answering (best-effort, 60s cap, never
fails the install since the stack is already up).

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
2026-06-20 08:57:37 -04:00
archipelago
2761f0d70f docs: handoff — session 2 progress (#12/#2/#3 code-complete, deploy held)
Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
2026-06-20 08:52:07 -04:00
archipelago
a8c668ee0a fix(ui): stop mobile tab bar covering last row of content (#2)
On Cloud/files (and any scrolling view), the bottom of the list could sit behind
the fixed mobile tab bar. Cause: DashboardMobileNav measured the bar's
offsetHeight and wrote it to --mobile-tab-bar-height, but when the bar was hidden
or not yet laid out the measurement was 0 — and writing "0px" defeats the
", 88px" fallback in the .mobile-scroll-pad clearance calc (an explicit 0 is
still a set value), so the clearance collapsed and the ~88px bar overlapped the
last row.

- never write 0px: only set a real measured height, else remove the var so the
  88px fallback applies.
- re-measure after first paint (rAF) and after the WebView safe-area injection,
  so the clearance reflects the bar's final laid-out height.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
2026-06-20 08:50:44 -04:00
archipelago
8f06d88fbf feat(wallet): pay for peer files from BOTH Cashu and Fedimint ecash (#3)
Paying for a peer file minted a Cashu-only token, so a node whose ecash balance
lived in Fedimint couldn't pay even with funds. Now both backends are tried:

- payer (content.download-peer-paid): mint a Cashu token first; on failure fall
  back to spending Fedimint notes. Only error if BOTH backends can't cover it.
- seller (verify_and_receive_payment): accept Fedimint notes as well as Cashu —
  anything not starting with "cashu" is redeemed via reissue_into_any.
- new fedimint_client::spend_from_any() — spend from whichever joined federation
  has the balance, returning the notes + federation id (mirrors reissue_into_any).
- wallet.ecash-balance now also reports fedimint_sats + combined total_sats; the
  pay-for-file pre-check uses the combined total so a Fedimint-funded node isn't
  wrongly blocked.

Compiles (cargo check + vue-tsc). Live cross-node federation validation pending
(dual-ecash phase 6) — needs two nodes sharing a federation.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
2026-06-20 08:13:23 -04:00
archipelago
b3633ec525 fix(ui): surface real error instead of generic toast + catch async errors
The global Vue errorHandler swallowed every crash into "Something went wrong.
Please refresh the page." — which hides exactly what we need to diagnose the
companion-app (Android WebView) post-login crash. Now:
- the toast shows the real (truncated) error message;
- a 25-entry ring buffer is kept on window.__archyErrors for retrieval where
  there's no console (companion WebView via chrome://inspect, or a debug view);
- window 'error' and 'unhandledrejection' listeners catch async/non-Vue errors
  that Vue's errorHandler misses (e.g. a JS API absent in an older WebView).

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
2026-06-20 08:05:51 -04:00
archipelago
f92e442bfc fix(mesh): collapse cross-transport twin contacts into one conversation (#12)
A node reachable both over LoRa and federation has two MeshPeer rows (radio
twin: low contact_id + firmware key; federation twin: high contact_id +
archipelago key), and messages key by peer_contact_id split across the two ids
— so opening one twin shows an empty thread (the .120->.89 symptom).

- backend: new group_peer_twins() helper groups peers by arch_pubkey_hex (set on
  BOTH twins by bind_federation_twins), keeps the radio id as the mesh-first
  send target, and unions messages across all twin ids. Wired into
  conversations.list / conversations.messages / mesh.contacts-list. +3 unit tests.
- frontend: the live chat list merges client-side (mergedPeers) and matched twins
  by the "Archy-z6Mk..." advert prefix, which the Meshtastic device rename broke
  (radio now advertises the server name). Merge by arch_pubkey_hex instead, which
  the backend reliably sets on both twins. Expose arch_pubkey_hex on MeshPeer.
- fix unrelated stale test: EcashTransaction test missing the new `kind` field.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
2026-06-20 08:01:14 -04:00
archipelago
5f7e8dca80 docs: handoff — mesh rename done, .120->.89 dup-contact diagnosis, netbird TODO
Resume notes for the 1.8.0 bug-bash mesh work: Meshtastic rename shipped +
verified; .120->.89 'non-delivery' diagnosed to a duplicate-contact surfacing
bug (messages inject fine, split across federation/radio twin contact_ids);
design for the dedup fix (#12) and the netbird logout-race map (#10).

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
2026-06-20 06:06:03 -04:00
archipelago
d00d1b20d7 fix(mesh): rename Meshtastic radio to the node's server name
Meshtastic device rename was a no-op — set_advert_name only updated an
in-memory field and never told the radio, so the device kept its firmware
default ('Meshtastic xxxx') and wasn't findable from external Meshtastic
apps. MeshCore already renamed correctly (CMD_SET_ADVERT_NAME); this brings
Meshtastic to parity.

Send an AdminMessage{set_owner=User{long_name,short_name}} to the locally
connected node (admin packet to our own node_num on the ADMIN_APP port).
Local serial admin needs no session passkey, matching the official client.
long_name = server name (<=39 chars); short_name = first 4 alphanumerics,
upper-cased. Verified on real hardware: .120 -> 'Archy-X250-EXP', .5 ->
'Archy-X250-Beta' (name read back from the radio after reconnect).

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
2026-06-20 06:04:22 -04:00
Dorian
b00c5247f5 chore(android): update companion apk download 2026-06-20 10:34:49 +01:00
Dorian
e39e0370e2 fix(android): push icon ring to home-screen visible edge (scale 0.65, v0.4.6)
Calibrated from a device home-screen screenshot: launcher3 crops less than the
App-info view, so the ring at 0.53 sat ~78% out. Scale 0.65 reaches the edge.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
2026-06-20 10:34:44 +01:00
Dorian
3b9eb35a37 chore(android): update companion apk download 2026-06-19 22:22:59 +01:00
Dorian
011f6559e1 fix(android): icon ring matching logo.svg gradient at visible edge (v0.4.5)
Ring uses logo.svg's #000->#666 gradient (stroke 22.8834) pushed to scale 0.53
so it sits at the launcher's visible crop edge (calibrated from a device
screenshot). Grid at 0.55. versionCode 9 so launcher3 refreshes its icon cache.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
2026-06-19 22:21:58 +01:00
Dorian
979e6525b7 fix(android): icon ring at visible crop edge (scale 0.50) + version 0.4.4
Device App-info screenshot showed the launcher only renders the central ~54%
of the adaptive icon, clipping the ring. Calibrated the ring to scale 0.50 so it
lands at the visible circle edge; grid to 0.55. Bump versionCode 8 so launcher3
refreshes its icon cache (it keys the cached bitmap by versionCode).

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
2026-06-19 22:21:58 +01:00
archipelago
af816c61a5 fix(ui): reliable federation-join feedback (90s timeout + re-check + success)
Joining a Fedimint federation is heavy and routinely outlasts the default 15s
client timeout while still succeeding server-side, so the UI wrongly showed
failure. Bump the join timeout to 90s, and on any error re-check the list: if a
new federation appeared the join worked — show 'Federation joined.' instead of
a misleading error.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
2026-06-19 16:43:30 -04:00
archipelago
63611a4453 fix(mesh): honour explicit !ai allowlist for unauthenticated stock clients
A stock meshcore client (e.g. a phone) can't sign our typed envelopes, so it is
never 'authenticated' — which meant ticking it as an allowed assistant contact
had no effect and !ai stayed denied. The explicit per-contact allowlist is a
deliberate operator opt-in for a specific key, so match it regardless of
authentication, keyed on the asker's resolved identity (bound archipelago key,
else firmware routing key — how meshcore addresses the contact). The spoofable
federation-trust-list match still requires authentication.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
2026-06-19 16:43:30 -04:00
archipelago
7831e68d13 fix(wallet): redeem across all federations, unified ecash history, fmcd healthcheck
- reissue_into_any now tries the UNION of the local registry AND fmcd's live
  joined set (/v2/admin/info) before failing, so a valid Fedimint token isn't
  wrongly rejected when the registry has drifted. On all-fail it returns a
  friendly message: notes already redeemed into this wallet (funds safe) vs
  didn't match any connected federation.
- Unified transaction history: a local Fedimint tx log (recorded on each
  successful redeem) is merged with the Cashu history in wallet.ecash-history,
  newest-first, each tagged kind=cashu|fedimint. Previously a Fedimint receive
  appeared nowhere.
- fedimint-clientd healthcheck -> type:tcp. It was probing /health, which fmcd
  doesn't serve (only /v2/*), pinning the container in (starting) forever; the
  TCP probe is skipped by the Quadlet renderer (host-side lifecycle verifies),
  so it reports running. Cosmetic for ecash, which worked throughout.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
2026-06-19 16:43:29 -04:00
Dorian
0f2e6f6aaf chore(android): update companion apk download 2026-06-19 21:28:29 +01:00
Dorian
5afe9e4aec fix(android): whole badge in background layer, ring inset to survive mask
Put dark fill + inset metallic ring (0.88) + grid (0.58) all in the background
(renders to the mask edge, no safe-zone crop); transparent foreground. Matches
a locally-rendered, circle-masked preview so the ring is visible and uncut.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
2026-06-19 21:28:26 +01:00
Dorian
857dc66240 chore(android): update companion apk download 2026-06-19 19:22:00 +01:00
Dorian
75f7020e3e fix(android): ring at circle edge (background layer) + smaller grid
Move the metallic ring into the background (renders to the mask edge, unlike the
foreground which is cropped to the safe zone) so the border is finally visible
at the circle's rim; shrink the grid to ~0.55 so the mark isn't too big.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
2026-06-19 19:21:57 +01:00
Dorian
75666cdc31 chore(android): update companion apk download 2026-06-19 19:20:21 +01:00
Dorian
8977ea92e8 fix(android): shrink icon grid within the ring for more margin
Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
2026-06-19 19:20:18 +01:00
Dorian
ca38f5d8f4 chore(android): update companion apk download 2026-06-19 19:05:57 +01:00
Dorian
d72cb57545 fix(android): brighter, thicker icon rim (#555->#A5A5A5, stroke 28)
Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
2026-06-19 19:05:55 +01:00
Dorian
dc2cdca549 chore(android): update companion apk download 2026-06-19 19:00:35 +01:00
Dorian
ee01ab9427 fix(android): make icon rim softly visible (#3A3A3A->#888)
Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
2026-06-19 19:00:35 +01:00
archipelago
cebbde7bde fix(ui): square mobile file tiles, files scroll clearance, apps-tab swipe guard
- Apps tab: a horizontal swipe that starts on an app icon no longer flips the
  top tab — it lets the app-page scroll / icon tap win (swipe empty space to
  change tab). Fixes the swipe conflict with two pages of apps.
- Files: file cover tiles are forced square on mobile (aspect driven by CSS,
  not a Tailwind arbitrary class) so the grid is uniform and tappable.
- Files: scroll container gets bottom safe-area + tab-bar padding so the last
  row clears the mobile back button / bottom nav.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
2026-06-19 13:57:51 -04:00
archipelago
a0b80dd27d fix(mesh): authenticate !ai over LoRa via federation-twin binding + signed Text
A !ai (or any typed message) from a trusted, federated node was denied when
it arrived over the radio. The radio half of a node that is also a federation
peer carried no archipelago identity (identity adverts are no longer broadcast
on the public channel), so the trusted_only gate and signature verification
had no key to check the asker against — and the same node showed up as two
contacts (a radio twin + a federation twin).

- bind_federation_twins(): correlate a radio contact with its federation twin
  by exact, case-insensitive advert_name and copy the federation peer's
  arch_pubkey_hex/did/x25519 onto the radio record. Called from
  upsert_federation_peer and refresh_contacts. Ambiguous names (held by >1
  federation peer) are skipped. This is only a CANDIDATE key — security is
  unchanged: the inbound envelope signature must still verify against it.
- send_message now signs the typed Text envelope (new_signed) so a radio !ai
  authenticates against the bound key. A meshcore node merely named like a
  trusted node cannot forge the signature, so it is still denied.

Receiver-side verification (handle_typed_envelope_direct) and federation-trust
matching (is_sender_allowed) already existed; this supplies the missing key
binding and signature. Also resolves the radio/federation duplicate-contact
display for same-named nodes.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
2026-06-19 13:57:50 -04:00
Dorian
839da80e0b chore(android): update companion apk download 2026-06-19 18:50:39 +01:00
Dorian
f0e9343d74 fix(android): drop white-wrapping round PNG, single SVG-matched icon ring
Revert to a pure adaptive icon (the bare round PNG was getting legacy-wrapped
onto a white circle by the launcher). One ring only, in the foreground, using
the SVG's dark #000->#666 gradient on a plain dark tile.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
2026-06-19 18:50:34 +01:00
Dorian
bf6d98195e chore(android): update companion apk download 2026-06-19 18:40:39 +01:00
Dorian
846b2d9646 fix(android): match icon ring to logo.svg gradient (#000->#666)
Revert the brightened grey->white ring back to the original logo.svg gradient
(black->#666, stroke 22.8834) on both the round PNG icon and the adaptive
foreground.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
2026-06-19 18:40:37 +01:00
Dorian
6df776b25a chore(android): update companion apk download 2026-06-19 18:32:00 +01:00
Dorian
1074f89c47 feat(android): true-circle round launcher icon (PNG badge)
Render the full circular badge (bright grey->white ring + grid) to round-icon
PNGs at all densities and drop the adaptive round XML, so launchers that use
round icons show a real edge-to-edge circle instead of a mask-cropped coin.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
2026-06-19 18:31:57 +01:00
Dorian
726cc132af chore(android): update companion apk download 2026-06-19 18:26:59 +01:00
Dorian
078c1793a9 fix(android): fit full badge (ring + grid) inside icon safe zone
Scale the whole badge to ~0.64 so the bold grey->white ring isn't clipped at
the edge by the launcher mask; bigger, brighter ring. Background is plain dark.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
2026-06-19 18:26:54 +01:00
Dorian
b83e2c2f37 chore(android): update companion apk download 2026-06-19 18:26:34 +01:00
Dorian
a2fa57456d fix(android): scale icon badge into safe zone so the ring is visible
The ring at 0.96 sat in the adaptive-icon bleed zone (outer ~18dp cropped by the
launcher), so only the grid showed. Scale badge + grid to 0.68 so the ring lands
at the edge of the visible circle, and brighten it to grey->white.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
2026-06-19 18:26:32 +01:00
Dorian
64937df8a2 chore(android): update companion apk download 2026-06-19 18:12:41 +01:00
Dorian
6527e66c07 fix(android): visible metallic icon ring at circle edge
Move the badge ring into the background layer (brightened grey->white so it
reads on #0A0A0A) at ~0.96 so it sits at the masked-circle edge; foreground is
just the white grid. Also honor SHIP_COMPANION in the pre-push hook.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
2026-06-19 18:12:38 +01:00
Dorian
07b611d07d chore(android): add companion APK auto-publish hook + script
scripts/publish-companion-apk.sh builds the debug APK and refreshes the served
download neode-ui/public/packages/archipelago-companion.apk.zip; .githooks/pre-push
runs it on every push to main that touches Android. Enable per clone with
  git config core.hooksPath .githooks

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
2026-06-19 17:53:38 +01:00
Dorian
dcedf9582a chore(android): update companion apk download 2026-06-19 17:46:44 +01:00
Dorian
f2c420d9c0 feat(android): app icon gradient ring border + companion publish script
Adaptive icon foreground now draws the full badge (black→grey gradient ring +
white grid) scaled to ~0.94 so the ring reads as a clean border at the circle
edge. Adds ship-companion.sh: builds the debug APK and publishes it to
neode-ui/public/packages/archipelago-companion.apk.zip, then commits + pushes.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
2026-06-19 17:46:41 +01:00
Dorian
68cd1c120a fix(android): translucent glass DARK controller so backdrop shows through
The controller body/face were opaque, so the synthwave backdrop only peeked
out above/below the controller. Make the DARK palette surfaces translucent
(body/face/inlay) and drop the opaque shadow platform + the gradient's forced
0.95 alpha, so the backdrop reads through the controller as glass. CLASSIC
palette stays solid.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
2026-06-19 16:52:02 +01:00
Dorian
993f30456f feat(neode-ui): instant press feedback + launching spinner on app icons
Tapping a dashboard app icon now scales it down immediately (CSS :active)
and shows a per-icon spinner until the app overlay opens, so the tap is
acknowledged even while the app session spins up.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
2026-06-19 16:21:48 +01:00
Dorian
aa95e42383 feat(android): circular logo, synthwave backgrounds, glass modal, server names + UX fixes
- New circular badge logo (ic_logo) on Intro + Connect screens; launcher
  icon rebuilt as dark circle + white grid.
- Reddish synthwave backdrop (bg-intro-2) behind Intro, Connect, and the
  remote/gamepad (edge-to-edge with a light scrim); controllers no longer
  paint an opaque fill over it.
- Server name: added to ServerEntry/prefs, the Connect form, the modal
  add-form, and saved-server rows; removal now matches by connection
  identity (rename- and legacy-format-safe).
- NESMenu modal restyled to glassmorphism #0A0A0A with centered, larger
  fields. Connect-form glass cards given a darker base for legibility.
- Intro title/subtitle set to #FAFAFA.
- Deleting the last server clears the active server and returns to Connect.
- D-pad auto-repeat initial delay raised to 500ms so a tap sends one key
  (fixes doubled nav sound).

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
2026-06-19 16:21:48 +01:00
archipelago
75e470bfa4 fix(mesh): mesh-preferred message routing with FIPS/Tor fallback
Messages to a federated peer that is out of LoRa range (e.g. on another
continent) were dropped into the radio with no fallback, or hung on a dead
FIPS path before reaching Tor — so they never arrived.

- Route a radio contact over the federation transport (FIPS->Tor) when it is
  the same node as a federated peer (known archipelago identity -> onion) AND
  it is not currently reachable over the radio. Reachable radio peers stay on
  the mesh (preferred); oversized/file envelopes still always take federation.
- Resolve the onion via the archipelago identity key (arch_pubkey_hex), not
  the firmware routing key, so a radio contact maps to its nodes.json onion.
- Add .fips_timeout(8s) to the federation message POST so an unreachable FIPS
  overlay fast-fails to Tor (~3-5s) instead of burning the 120s budget.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
2026-06-19 10:09:14 -04:00
archipelago
0ac67f5092 fix(ui): companion QR absolute 146 URL + Dashboard swipe type guard
- Companion app QR encoded a relative path (/packages/...apk.zip) which
  can't resolve when scanned by a phone. Point it at the absolute 146
  release-server URL so the download works from any device.
- Dashboard tab-swipe: guard tabs[next] (noUncheckedIndexedAccess) so the
  frontend type-checks/builds.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
2026-06-19 09:52:26 -04:00
archipelago
837cc02812 fix(federation): reliable symmetric auto-federation across LAN/Tor/FIPS
Federated nodes failed to converge to full-mesh across the LAN<->Tailscale
boundary: nodes were invisible to peers, sync 'took ages'/timed out, and
names only updated on a manual sync. Onions were healthy in both directions
(~3-5s); the failures were app-layer.

- B: federation dials fast-fail a dead FIPS path via .fips_timeout(6s) in
  sync_with_peer + notify_join, so the Tor fallback isn't stuck behind the
  full 30s FIPS budget when LAN and remote peers share no FIPS path.
- A: notify_join (peer-joined) now spawns with retries+backoff instead of a
  single awaited best-effort POST, so the join RPC returns instantly (no
  'Request timeout') and the inviter reliably learns the joiner (was
  asymmetric).
- C: new 90s periodic federation auto-sync (none existed) so renamed nodes
  and roster changes propagate without a manual Sync click.
- self-heal: each auto-sync re-asserts membership to any peer that doesn't
  list us back, converging the fleet to full-mesh and healing pre-existing
  asymmetry with no manual re-joins.

Validated live across 7 nodes: a previously fleet-invisible node became
fully meshed automatically (logs: 'auto-sync ... reasserted=1',
'peer-joined ... delivered').

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
2026-06-19 09:52:26 -04:00
archipelago
1bce694ebb feat(ui): mobile mesh tabs, AIUI-style audio player, cloud grid + map fixes
UI (this session):
- Global audio player now scales the whole interface into the space above it
  on desktop (sidebar + main) and docks directly above the tab bar on mobile;
  it stays visible while navigating.
- Mesh mobile redesign: floating Chat / BTC / Dead Man / AI / Map tab strip
  with a single fixed, internally-scrolling pane (page no longer scrolls);
  tabs hide while a conversation is open; floating back button; collapsible
  Device panel (starts collapsed); keyboard-aware conversation sizing via
  VisualViewport so the chat sits just above the keyboard.
- Cloud file grid: uniform 4/3 card heights (folders + images match).
- Swipe left/right switches tabs on the Apps and Web5 screens.
- Map tool fills its pane (no bottom gap); fix skewed Share Location toggle
  on mobile (global min-height rule was deforming the switch).
- Trim redundant helper copy from the mesh AI tab.

Also bundles pre-existing in-progress work that was already in the tree:
mesh listener/session + wallet + container + bitcoin-status backend changes,
docker UI updates, and assorted other UI tweaks.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
2026-06-19 09:52:26 -04:00
archipelago
c4855526fe feat(wallet): wire fmcd as core app + dual-ecash receive
Fedimint never appeared in Wallet > Settings > Fedimint because the
fmcd (fedimint-clientd) sidecar was never installed: ensure_default_
federation() needs the fmcd password to reach the daemon, found none,
and silently no-oped, leaving the registry empty.

- prod_orchestrator: add fedimint-clientd to the baseline auto-install
  set so it self-heals onto every node and auto-joins the default
  federation; generate the fmcd-password secret before secret_env
  resolves.
- fedimint_client: ensure_fmcd_password (random hex, 0600) shared with
  the container's secret_env; from_node reads the same secret (legacy
  fmcd/password kept as fallback); reissue_into_any redeems received
  notes into the first joined federation that accepts them.
- wallet.ecash-receive: dual-token — cashu* tokens redeem at the mint,
  anything else is reissued via fmcd; returns the kind + federation_id.
- UI: receive box advertises "Cashu or Fedimint" and reports which kind.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
2026-06-19 09:52:26 -04:00
archipelago
298595069d fix(mesh): native Meshtastic unicast DMs + driver-level E2E status
Meshtastic DMs were falling back to a channel broadcast, so every node
on the LoRa channel saw a "direct" message. Send a directed MeshPacket
(to = node num, decoded from the synthetic pubkey's node-id bytes)
instead — the Meshtastic analog of the meshcore CMD_SEND_TXT_MSG fix.
DMs now reach only the recipient; firmware auto-PKC-encrypts them
end-to-end once NodeInfo keys are exchanged.

Capture E2E status at the driver level (no shared-type/UI change):
- learn each peer's real Curve25519 key from User.public_key (field 8)
  and inbound MeshPacket.public_key (16), kept in a side-map separate
  from the synthetic routing key so unicast routing is untouched
- detect inbound MeshPacket.pki_encrypted (17) to tell a true E2E DM
  from a channel-PSK fallback
- peer_is_pkc_capable() seam for a future mesh-tab E2E badge

Hot-swap preserved: no dispatched MeshRadioDevice signature or the
shared ParsedContact changed, so meshcore and meshtastic stay
interchangeable behind the listener.

Adds tests/multinode/meshtastic.sh, a two/three-radio on-air parity
harness (detect, discover, DM round-trip, DM privacy, channel
broadcast, typed envelope, reachability).

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
2026-06-19 09:52:26 -04:00
Dorian
f636c5d505 fix(neode-ui): float connection banners as overlay
The offline/reconnecting banners were in-flow (mx-6 mt-6) and pushed the whole
dashboard down when shown. Teleport them to <body> as a fixed, top-centered
overlay with a fade/slide transition and safe-area inset, so they no longer
shift layout.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
2026-06-19 14:40:50 +01:00
Dorian
0f43870e6c chore(android): give debug build a .debug app id
applicationIdSuffix=".debug" + versionNameSuffix so a debug/test build
installs alongside the release app instead of failing on signature mismatch.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
2026-06-19 14:40:50 +01:00
Dorian
d1fbcd9b0a feat(neode-ui): route "open in browser" through native bridge in companion app
When ArchipelagoNative is present (the Android companion app), openInNewTab()
now calls openInApp(url) so non-iframeable apps open in the in-app WebView
instead of a suppressed window.open popup. Falls back to window.open in a
plain mobile browser. Logic only; no visual change.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
2026-06-19 11:28:48 +01:00
Dorian
b5a9deb815 feat(android): open non-iframeable apps in in-app webview + webview perf
The kiosk's "Open in new tab" used window.open(..., 'noopener,noreferrer'),
which the WebView suppresses, so launching apps that can't be iframed did
nothing. Route such node apps (same host) into a local in-app WebView overlay
instead, keeping the kiosk view alive underneath; genuinely external links
still go to the system browser. Wired through onCreateWindow,
shouldOverrideUrlLoading, and a new ArchipelagoNative.openInApp() bridge.

Perf (no visual change): enable setOffscreenPreRaster to stop scroll
checkerboarding, and enable WebView remote debugging on debuggable builds
for chrome://inspect profiling.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
2026-06-19 11:28:48 +01:00
archipelago
d0ca53501c feat(ui): cloud folder zoom transition on path change
Re-key FileGrid on the current folder path and wrap it in a cloud-zoom
Transition so the depth/zoom animation replays at every folder level; the
header + breadcrumb nav stay fixed.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
2026-06-18 09:40:16 -04:00
archipelago
790da4bd0f fix(wallet): Minibits default Cashu mint, resilient peer-file invoices, named default federation
- Cashu default mint was the local Fedimint guardian (:8175), wrongly surfacing
  a Fedimint URL in the Cashu mints list. Default is now Minibits
  (https://mint.minibits.cash/Bitcoin) — Cashu and Fedimint are distinct
  protocols (Fedimint lives under its own tab).
- Peer-file (buy) invoice creation: retry the LND REST call (3× / 400ms) so a
  transient LND-REST blip (swap pressure / just-restarted / TLS race) no longer
  hard-fails as an opaque 503, and surface the real error chain ({:#}) in the
  response + logs instead of a generic "Failed to create invoice".
- Autojoined default federation now shows a friendly name ("Archipelago
  Federation") in the Fedimint tab instead of a bare federation id.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
2026-06-18 09:23:56 -04:00
archipelago
cc2e055e09 fix(bitcoin,ui): RAM-aware dbcache to stop swap-thrash 502s + snappier status + icon placeholder
Sizes bitcoind -dbcache to host RAM (~1/16, floor 300MB, cap 4096) instead of a
fixed 2048/4096. A multi-GB UTXO cache on an 8GB node running the full app stack
pushed memory past physical RAM and triggered system-wide swap thrash: the disk
saturated, bitcoind could not answer its own RPC, and the dashboard backend's
sqlite reads stalled — surfacing as fleet-wide /rpc/v1 502s and a blank Bitcoin
UI. Applied in scripts/container-specs.sh (reconciler path) and the config.rs
bitcoin-core path.

Bitcoin status cache now polls every 5s (was 10/15) with an 8s timeout (was 20s)
and fetches the four RPCs concurrently, so the cached snapshot tracks bitcoind's
responsive windows during IBD and the UI stops dwelling on "reconnecting...".

Unifies the divergent discover AppGrid/FeaturedApps image-error handlers onto the
canonical placeholder fallback so missing app icons render the placeholder.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
2026-06-18 09:14:47 -04:00
archipelago
549c6180a2 chore(ui): sync What's New modal for v1.8.00-alpha
Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
2026-06-18 08:12:12 -04:00
archipelago
ec644ab90f docs: changelog v1.8.00-alpha — mesh DM privacy, contact import/search/reachability
Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
2026-06-18 08:10:29 -04:00
archipelago
f0fdc23cc9 feat(mesh): native-unicast DMs, contact import/remove, reachability, contact search
- DMs now use native meshcore unicast (CMD_SEND_TXT_MSG) instead of @DM2 channel
  broadcasts: private (E2E-encrypted to the recipient pubkey by firmware), off the
  public channel, and decodable by stock clients. Plain text (split, not MC-chunked)
  to non-archipelago contacts; typed envelopes to archy peers.
- !ai replies now DM the asker privately (RadioDm) instead of broadcasting on ch0.
- Auto contact-import: a heard advert (PUSH_CONTACT_ADVERT/0x80, 32-byte pubkey) is
  added via CMD_ADD_UPDATE_CONTACT (0x09) so contacts appear without a flood advert.
- clear-all now DELETES firmware contacts via CMD_REMOVE_CONTACT (0x0F) instead of
  blocklisting; blocking filter removed entirely. Wiped contacts return when reachable.
- Contact reachability: MeshPeer carries last_advert + reachable (path-based); UI shows
  a reachability dot.
- Peers list: contact search box (filter by name/DID/npub/pubkey) with a clear button.
- send_message routes stock contacts as plain native text (fixes garbled envelopes).

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
2026-06-18 08:08:52 -04:00
archipelago
9f2edf6b7a docs: changelog for v1.8.00-alpha (carry forward v1.7.99 features + mesh/fedimint fixes)
Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
2026-06-18 04:20:10 -04:00
archipelago
3a21243be7 fix(mesh,ui,fedimint): mesh-AI chat trigger + transport-aware reply, stop ARCHY:2 public-channel spam, AI allowlist + model dropdown, Fedimint client manifest, settings reorder, chat scroll
- mesh: stop broadcasting ARCHY:2 identity on the public channel (startup + every advert tick); receive path still parses inbound. No more public-channel spam.
- mesh assistant: trigger on !ai/!ask typed in 1:1 chat (was only the dead AssistQuery path + bare channel text); route the reply transport-aware via MeshService::send_message (Tor for federation peers, LoRa for radio) through a new AssistChatReply event consumed at the server layer — fixes replies never reaching federation askers.
- mesh assistant: per-contact !ai allowlist (allowed_contacts) bypassing trusted_only; config + RPC + is_sender_allowed.
- fedimint-clientd manifest: network_policy open -> bridge (invalid value made the loader skip the whole manifest, so fmcd never ran and federations never joined/listed).
- ui: AI panel — Claude model dropdown (Haiku/Sonnet/Opus presets) + allowlist contact picker.
- ui: Settings — App Updates + App Registry moved under Account.
- ui: mesh chat — overscroll-behavior: contain so chat scroll no longer bleeds to the contacts panel.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
2026-06-18 03:33:37 -04:00
207 changed files with 16727 additions and 10285 deletions

51
.githooks/pre-push Executable file
View File

@ -0,0 +1,51 @@
#!/usr/bin/env bash
# Keep the served companion APK in sync with main on every push.
#
# When a push to main includes Android changes, rebuild the APK, refresh
# neode-ui/public/packages/archipelago-companion.apk, commit it, and ask
# you to push again (so the refreshed APK rides along in the same push).
#
# Enable once per clone: git config core.hooksPath .githooks
set -euo pipefail
ROOT="$(git rev-parse --show-toplevel)"
cd "$ROOT"
# ship-companion.sh already (re)published the APK for this push — don't redo it.
[ -n "${SHIP_COMPANION:-}" ] && exit 0
PUSH_MAIN=0; RANGE_OLD=""; RANGE_NEW=""
while read -r _local_ref local_sha remote_ref remote_sha; do
if [ "${remote_ref##*/}" = "main" ]; then
PUSH_MAIN=1; RANGE_OLD="$remote_sha"; RANGE_NEW="$local_sha"
fi
done
[ "$PUSH_MAIN" = "1" ] || exit 0
# Loop-break: if the tip is already the auto APK commit, let the push proceed.
case "$(git log -1 --pretty=%s)" in
*"companion APK"*) exit 0 ;;
esac
# Only rebuild when this push actually touches the Android app.
ZEROS="0000000000000000000000000000000000000000"
if [ -z "$RANGE_OLD" ] || [ "$RANGE_OLD" = "$ZEROS" ]; then
ANDROID_CHANGED=1
elif git diff --quiet "$RANGE_OLD" "$RANGE_NEW" -- Android/ 2>/dev/null; then
ANDROID_CHANGED=0
else
ANDROID_CHANGED=1
fi
[ "$ANDROID_CHANGED" = "1" ] || exit 0
bash scripts/publish-companion-apk.sh || exit 0
DEST="neode-ui/public/packages/archipelago-companion.apk"
if git diff --cached --quiet -- "$DEST"; then
exit 0 # APK unchanged — nothing to do
fi
git commit -q -m "chore(android): update companion APK download [skip ci]"
echo "" >&2
echo "▶ Companion APK rebuilt and committed. Run your push again to include it." >&2
exit 1

5
Android/.gitignore vendored
View File

@ -14,3 +14,8 @@ local.properties
*.aab
*.jks
*.keystore
# Exception: the repo-dedicated *debug* keystore is committed on purpose so every
# machine (and the published companion download) signs debug builds identically —
# updates then install over the top without an uninstall. Debug keys are not
# secret (well-known password "android"); never commit a real release keystore.
!/app/debug.keystore

View File

@ -0,0 +1,94 @@
# Companion App — Build, Ship & "App Not Installed" Runbook
Canonical procedure for releasing the Archipelago Companion Android app and for
debugging install failures. Read this before touching the companion release flow.
Hard lessons from 2026-06-26 are baked in below — don't relearn them.
## Ship the companion (the only sanctioned way)
```bash
./Android/ship-companion.sh
```
This calls `scripts/publish-companion-apk.sh` (the single source of truth, also
used by the `.githooks/pre-push` hook), which:
1. **Removes/rejects resource dirs whose names contain spaces.** Empty stray
`mipmap-* NNN` dirs (left by icon-export tools) break a *clean* build with
`Invalid resource directory name`. Incremental builds hide them — clean builds
don't.
2. **Always does a CLEAN build** (`:app:clean :app:assembleDebug`).
3. **Forces v1 + v2 + v3 signing** via `zipalign` + `apksigner`.
4. **Verifies all three schemes** (`apksigner verify --min-sdk-version 21`) and
**aborts** if any is missing.
5. Stages the signed APK at `neode-ui/public/packages/archipelago-companion.apk`,
commits, and pushes with `SHIP_COMPANION=1` (the sanctioned pre-push bypass).
**Never** hand-roll `gradlew assembleDebug` + `cp` to the served path. That path
skips the clean build and the signature enforcement and is exactly how a broken
APK shipped.
### Bump the version first
Edit `Android/app/build.gradle.kts``versionCode` (must strictly increase) and
`versionName`. The committed value can drift AHEAD of what's actually built into
the served APK, so verify the served APK's real version after shipping:
`aapt2 dump badging neode-ui/public/packages/archipelago-companion.apk | grep version`.
## Signing facts (important)
- Debug builds are signed with the **committed** `Android/app/debug.keystore`
(store/key pass `android`, alias `androiddebugkey`) so every machine and the
served download share ONE signing key. Cert SHA-256: `D6:22:E0:7E:…:66:4D`.
- **AGP silently ignores `enableV1Signing = true` for `minSdk ≥ 24`**, so a plain
gradle build produces a **v2-only** APK. The `apksigner` step in the publish
script is what actually guarantees v1+v2+v3 — do not remove it.
- **Changing the signing key forces every existing install to be uninstalled
once.** Android blocks in-place upgrades across different signatures. Treat the
keystore as permanent; never regenerate it casually.
## Debugging "App Not Installed" — DIAGNOSE FIRST
Do **not** theorize about signing schemes / OEM quirks. Get the real reason:
```bash
adb install ~/Desktop/archipelago-companion-<ver>.apk
# -> Failure [INSTALL_FAILED_<REASON>: ...]
```
Map the reason:
| `INSTALL_FAILED_*` | Cause | Fix |
|---|---|---|
| `UPDATE_INCOMPATIBLE … signatures do not match` | Old install signed with a **different key** (e.g. pre-shared-keystore per-machine key `58:31:12…`). | Uninstall the old package, then install. **One-time** per device after a key change. |
| `INVALID_APK` / parse error | Corrupt/incomplete download or bad signing. | Re-download; re-run the publish script. |
| `INSUFFICIENT_STORAGE` | Storage. | Free space. |
| `OLDER_SDK` | Device below `minSdk` (26 = Android 8.0). | Unsupported device. |
> A manual uninstall on the phone may NOT clear `UPDATE_INCOMPATIBLE` if the
> package is registered under another user/profile — `pm path <pkg>` under user 0
> can show nothing while the conflict persists. `adb uninstall <pkg>` clears it
> across all users.
## Phone / adb safety (non-negotiable)
When acting on the user's physical phone, be surgical — the user once had all
home-screen app layouts wiped by an over-broad action.
- Default to **read-only** adb (`devices`, `getprop`, `pm path/list`, `dumpsys`).
- Mutations (`adb install`, `adb uninstall com.archipelago.app.debug`) only with
explicit go-ahead and **scoped to our exact package** — echo it first.
- **Never** run launcher/system resets: no `pm clear` on launchers, no
`reset-permissions`, no factory wipe, no uninstalling apps you didn't build.
## Verify the published download after shipping
The download served to nodes is Gitea raw-on-main. Confirm the live bytes match
what you built and signed:
```bash
SERVED=neode-ui/public/packages/archipelago-companion.apk
URL=http://146.59.87.168:3000/lfg2025/archy/raw/branch/main/$SERVED
curl -sS -o /tmp/live.apk "$URL"
shasum -a 256 "$SERVED" /tmp/live.apk # must match
apksigner verify -v --min-sdk-version 21 /tmp/live.apk | grep -i "scheme" # v1/v2/v3 = true
```

View File

@ -11,15 +11,41 @@ android {
applicationId = "com.archipelago.app"
minSdk = 26
targetSdk = 35
versionCode = 6
versionName = "0.4.2"
versionCode = 16
versionName = "0.4.12"
vectorDrawables {
useSupportLibrary = true
}
}
signingConfigs {
// Repo-dedicated debug keystore (committed at app/debug.keystore) so every
// machine — and the published companion download — signs debug builds with
// the SAME key. Without this, Gradle falls back to each machine's
// ~/.android/debug.keystore, so a build from a different machine has a
// different signature and the phone rejects the update ("App not installed").
getByName("debug") {
storeFile = file("debug.keystore")
storePassword = "android"
keyAlias = "androiddebugkey"
keyPassword = "android"
// Force both legacy JAR (v1) and APK Signature Scheme v2. AGP drops v1
// for minSdk>=24, but some OEM package installers (e.g. Samsung) reject
// a v2-only sideload with "App not installed" — keep v1 for max compat.
enableV1Signing = true
enableV2Signing = true
}
}
buildTypes {
debug {
// Separate app ID so a debug/test build installs alongside the
// release app instead of colliding on signature.
applicationIdSuffix = ".debug"
versionNameSuffix = "-debug"
signingConfig = signingConfigs.getByName("debug")
}
release {
isMinifyEnabled = true
isShrinkResources = true

BIN
Android/app/debug.keystore Normal file

Binary file not shown.

View File

@ -18,7 +18,11 @@ data class ServerEntry(
val useHttps: Boolean,
val port: String = "",
val password: String = "",
val name: String = "",
) {
/** Label to show in lists — the user-given name, or the address if unnamed. */
fun displayName(): String = name.ifBlank { address }
fun toUrl(): String {
val scheme = if (useHttps) "https" else "http"
val portSuffix = if (port.isNotBlank()) ":$port" else ""
@ -31,7 +35,9 @@ data class ServerEntry(
return "$scheme://$address$portSuffix"
}
fun serialize(): String = "$address|$useHttps|$port|$password"
// name is the trailing field so entries saved before naming existed
// (4 fields) still deserialize, with name defaulting to "".
fun serialize(): String = "$address|$useHttps|$port|$password|$name"
companion object {
fun deserialize(raw: String): ServerEntry? {
@ -42,6 +48,7 @@ data class ServerEntry(
useHttps = parts[1].toBooleanStrictOrNull() ?: false,
port = parts.getOrElse(2) { "" },
password = parts.getOrElse(3) { "" },
name = parts.getOrElse(4) { "" },
)
}
}
@ -53,6 +60,7 @@ class ServerPreferences(private val context: Context) {
private val activeHttpsKey = booleanPreferencesKey("active_https")
private val activePortKey = stringPreferencesKey("active_port")
private val activePasswordKey = stringPreferencesKey("active_password")
private val activeNameKey = stringPreferencesKey("active_name")
private val savedServersKey = stringSetPreferencesKey("saved_servers")
private val introSeenKey = booleanPreferencesKey("intro_seen")
@ -63,6 +71,7 @@ class ServerPreferences(private val context: Context) {
useHttps = prefs[activeHttpsKey] ?: false,
port = prefs[activePortKey] ?: "",
password = prefs[activePasswordKey] ?: "",
name = prefs[activeNameKey] ?: "",
)
}
@ -81,6 +90,7 @@ class ServerPreferences(private val context: Context) {
prefs[activeHttpsKey] = server.useHttps
prefs[activePortKey] = server.port
prefs[activePasswordKey] = server.password
prefs[activeNameKey] = server.name
}
addSavedServer(server)
}
@ -91,6 +101,7 @@ class ServerPreferences(private val context: Context) {
prefs.remove(activeHttpsKey)
prefs.remove(activePortKey)
prefs.remove(activePasswordKey)
prefs.remove(activeNameKey)
}
}
@ -101,10 +112,50 @@ class ServerPreferences(private val context: Context) {
}
}
/**
* Replace a saved server in place. Matches the existing entry by connection
* identity (address/port/scheme) so edits that change the name or password
* or that touch a legacy 4-field entry still update the right record. If the
* edited server is also the active one, the active record is kept in sync.
*/
suspend fun updateSavedServer(original: ServerEntry, updated: ServerEntry) {
context.dataStore.edit { prefs ->
val current = prefs[savedServersKey] ?: emptySet()
val filtered = current.filterNot { raw ->
val e = ServerEntry.deserialize(raw)
e != null &&
e.address == original.address &&
e.port == original.port &&
e.useHttps == original.useHttps
}.toSet()
prefs[savedServersKey] = filtered + updated.serialize()
val isActive = prefs[activeAddressKey] == original.address &&
(prefs[activePortKey] ?: "") == original.port &&
(prefs[activeHttpsKey] ?: false) == original.useHttps
if (isActive) {
prefs[activeAddressKey] = updated.address
prefs[activeHttpsKey] = updated.useHttps
prefs[activePortKey] = updated.port
prefs[activePasswordKey] = updated.password
prefs[activeNameKey] = updated.name
}
}
}
suspend fun removeSavedServer(server: ServerEntry) {
context.dataStore.edit { prefs ->
val current = prefs[savedServersKey] ?: emptySet()
prefs[savedServersKey] = current - server.serialize()
// Match by connection identity (address/port/scheme) rather than the
// exact serialized string, so a rename — or the legacy 4-field format
// saved before names existed — still removes the right entry.
prefs[savedServersKey] = current.filterNot { raw ->
val e = ServerEntry.deserialize(raw)
e != null &&
e.address == server.address &&
e.port == server.port &&
e.useHttps == server.useHttps
}.toSet()
}
}

View File

@ -108,7 +108,9 @@ private fun Btn(icon: ImageVector, key: String, onDir: (String) -> Unit) {
.pointerInput(key) {
detectTapGestures(onPress = {
p = true; onDir(key)
job = scope.launch { delay(350); while (true) { onDir(key); delay(100) } }
// 500ms initial delay so a normal tap sends one key, not two
// (a touch tap often exceeds 350ms → doubled nav sound).
job = scope.launch { delay(500); while (true) { onDir(key); delay(100) } }
tryAwaitRelease(); p = false; job?.cancel()
})
},

View File

@ -83,13 +83,16 @@ val ClassicPalette = NESPalette(
inlayBg = Color(0xFF080808), inlayBorder = Color(0xFF999999),
)
// Glassmorphism-black (OS design): translucent dark surfaces so the backdrop
// shows through the controller, subtle white-alpha borders, translucent-white
// buttons. Accents come from each button's ring.
val DarkPalette = NESPalette(
body = NES.DarkBody, face = NES.DarkFace, ridge = NES.DarkRidge,
label = NES.DarkLabel, labelMuted = NES.DarkLabelMuted,
dpad = Color(0xFF080808), dpadHi = Color(0xFF141418),
btn = NES.DarkButtonMain, btnPress = NES.DarkButtonMainPress,
capsule = Color(0xFF121216), capsulePress = Color(0xFF0A0A0C),
inlayBg = Color(0xFF060608), inlayBorder = Color(0xFF444448),
body = Color(0xA6121216), face = Color(0x8C0E0E12), ridge = Color(0x14FFFFFF),
label = Color(0xFF9A9A9A), labelMuted = Color(0xFF777777),
dpad = Color(0xFF202024), dpadHi = Color(0xFF33333A),
btn = Color(0x14FFFFFF), btnPress = Color(0x0AFFFFFF),
capsule = Color(0x12FFFFFF), capsulePress = Color(0x08FFFFFF),
inlayBg = Color(0x990A0A0A), inlayBorder = Color(0x1FFFFFFF),
)
fun paletteFor(style: ControllerStyle) = if (style == ControllerStyle.CLASSIC) ClassicPalette else DarkPalette
@ -113,20 +116,10 @@ fun NESController(
Box(
modifier = modifier
.fillMaxSize()
.background(Color(0xFF0C0C0C)) // Slightly lighter than black for shadow visibility
.twoFingerHold(onMenu)
.padding(horizontal = 40.dp, vertical = 24.dp),
contentAlignment = Alignment.Center,
) {
// Shadow platform
Box(
modifier = Modifier
.fillMaxWidth(0.86f)
.aspectRatio(2.3f)
.padding(top = 6.dp)
.clip(RoundedCornerShape(18.dp))
.background(Color(0xFF000000)),
)
// Controller body
Box(
Modifier
@ -135,7 +128,7 @@ fun NESController(
.shadow(32.dp, RoundedCornerShape(16.dp), ambientColor = Color(0xFF000000), spotColor = Color(0xFF000000))
.clip(RoundedCornerShape(16.dp))
.background(
Brush.verticalGradient(listOf(c.body, c.body.copy(alpha = 0.95f)))
Brush.verticalGradient(listOf(c.body, c.body))
)
.border(1.dp, Color.White.copy(alpha = if (isClassic) 0.08f else 0.04f), RoundedCornerShape(16.dp)),
) {
@ -193,13 +186,13 @@ fun NESController(
horizontalAlignment = Alignment.CenterHorizontally,
verticalArrangement = Arrangement.Center,
) {
// C on top (white)
ColorBtn(Color(0xFF888888), Color(0xFFAAAAAA), 44.dp) { onKey("c") }
// C on top
GlassFaceBtn("C", Color(0xFFBBBBBB), 44.dp) { onKey("c") }
Spacer(Modifier.height(6.dp))
// B + A on bottom row
Row(horizontalArrangement = Arrangement.spacedBy(12.dp)) {
ColorBtn(Color(0xFF3B82F6), Color(0xFF60A5FA), 44.dp) { onKey("b") }
ColorBtn(Color(0xFFEA580C), Color(0xFFFB923C), 44.dp) { onKey("a") }
GlassFaceBtn("B", Color(0xFF60A5FA), 44.dp) { onKey("b") }
GlassFaceBtn("A", Color(0xFFF7931A), 44.dp) { onKey("a") }
}
}
}
@ -264,7 +257,9 @@ fun OnePointDPad(c: NESPalette, size: Dp, onDir: (String) -> Unit) {
}
activeDir = dir; onDir(dir)
job?.cancel()
job = scope.launch { delay(300); while (true) { onDir(dir); delay(90) } }
// 500ms initial delay so a normal tap sends one key, not
// two (a touch tap often exceeds 300ms → doubled nav sound).
job = scope.launch { delay(500); while (true) { onDir(dir); delay(90) } }
tryAwaitRelease()
job?.cancel(); activeDir = null
},
@ -375,6 +370,28 @@ fun ColorBtn(color: Color, pressColor: Color, sz: Dp = 48.dp, onClick: () -> Uni
}
}
/** Glass face button — dark translucent fill, colored ring + letter (OS style) */
@Composable
fun GlassFaceBtn(label: String, accent: Color, sz: Dp = 44.dp, onClick: () -> Unit) {
var p by remember { mutableStateOf(false) }
Box(
Modifier
.size(sz)
.clip(CircleShape)
.background(
Brush.verticalGradient(
if (p) listOf(Color.White.copy(alpha = 0.05f), Color.White.copy(alpha = 0.02f))
else listOf(Color.White.copy(alpha = 0.10f), Color.White.copy(alpha = 0.03f))
)
)
.border(1.5.dp, accent.copy(alpha = if (p) 0.95f else 0.55f), CircleShape)
.pointerInput(Unit) { detectTapGestures(onPress = { p = true; onClick(); tryAwaitRelease(); p = false }) },
contentAlignment = Alignment.Center,
) {
Text(label, color = accent.copy(alpha = if (p) 1f else 0.85f), fontSize = 16.sp, fontWeight = FontWeight.Bold)
}
}
/** START/SELECT capsule */
@Composable
fun CapsuleBtn(label: String, c: NESPalette, w: Dp = 64.dp, h: Dp = 28.dp, onClick: () -> Unit) {

View File

@ -3,6 +3,8 @@ package com.archipelago.app.ui.components
import androidx.compose.animation.AnimatedVisibility
import androidx.compose.animation.fadeIn
import androidx.compose.animation.fadeOut
import androidx.compose.animation.scaleIn
import androidx.compose.animation.scaleOut
import androidx.compose.foundation.background
import androidx.compose.foundation.border
import androidx.compose.foundation.clickable
@ -34,17 +36,35 @@ import androidx.compose.ui.Alignment
import androidx.compose.ui.Modifier
import androidx.compose.ui.draw.clip
import androidx.compose.ui.graphics.Color
import androidx.compose.ui.text.TextStyle
import androidx.compose.ui.text.font.FontWeight
import androidx.compose.ui.text.input.ImeAction
import androidx.compose.ui.text.input.KeyboardType
import androidx.compose.ui.text.input.PasswordVisualTransformation
import androidx.compose.ui.text.style.TextAlign
import androidx.compose.ui.unit.dp
import androidx.compose.ui.unit.sp
import com.archipelago.app.data.ServerEntry
import com.archipelago.app.ui.theme.BitcoinOrange
import com.archipelago.app.ui.theme.ControllerStyle
import com.archipelago.app.ui.theme.NES
import com.archipelago.app.ui.theme.SurfaceDark
import com.archipelago.app.ui.theme.TextMuted
import com.archipelago.app.ui.theme.TextPrimary
/** NES-styled modal menu — dark blue panel with white borders */
// Glassmorphism palette (OS design): near-black surfaces, subtle white borders,
// Bitcoin-orange accent.
private val PanelBg = SurfaceDark // #0A0A0A
private val PanelBorder = Color.White.copy(alpha = 0.12f)
private val RowBg = Color.White.copy(alpha = 0.05f)
private val RowBorder = Color.White.copy(alpha = 0.08f)
private val FieldBg = Color.White.copy(alpha = 0.04f)
private val PANEL_R = 20.dp
private val ROW_R = 14.dp
private val ROW_H = 54.dp
private val FIELD_H = 58.dp
/** Glassmorphism modal menu — #0A0A0A surface, subtle white borders. */
@Composable
fun NESMenu(
visible: Boolean,
@ -55,6 +75,7 @@ fun NESMenu(
onDismiss: () -> Unit,
onSelectServer: (ServerEntry) -> Unit,
onAddServer: (ServerEntry) -> Unit,
onEditServer: (ServerEntry, ServerEntry) -> Unit,
onRemoveServer: (ServerEntry) -> Unit,
onToggleMode: () -> Unit,
onToggleStyle: () -> Unit,
@ -66,7 +87,9 @@ fun NESMenu(
.clickable(indication = null, interactionSource = remember { MutableInteractionSource() }) { onDismiss() },
contentAlignment = Alignment.Center,
) {
MenuPanel(servers, activeServer, isGamepadMode, controllerStyle, onDismiss, onSelectServer, onAddServer, onRemoveServer, onToggleMode, onToggleStyle, onBackToWebView)
AnimatedVisibility(visible = visible, enter = fadeIn() + scaleIn(initialScale = 0.95f), exit = fadeOut() + scaleOut(targetScale = 0.95f)) {
MenuPanel(servers, activeServer, isGamepadMode, controllerStyle, onDismiss, onSelectServer, onAddServer, onEditServer, onRemoveServer, onToggleMode, onToggleStyle, onBackToWebView)
}
}
}
}
@ -80,105 +103,160 @@ private fun MenuPanel(
onDismiss: () -> Unit,
onSelectServer: (ServerEntry) -> Unit,
onAddServer: (ServerEntry) -> Unit,
onEditServer: (ServerEntry, ServerEntry) -> Unit,
onRemoveServer: (ServerEntry) -> Unit,
onToggleMode: () -> Unit,
onToggleStyle: () -> Unit,
onBackToWebView: (() -> Unit)?,
) {
var showAdd by remember { mutableStateOf(false) }
// The saved server being edited, or null when adding a new one.
var editing by remember { mutableStateOf<ServerEntry?>(null) }
var nm by remember { mutableStateOf("") }
var addr by remember { mutableStateOf("") }
var pwd by remember { mutableStateOf("") }
fun resetForm() {
nm = ""; addr = ""; pwd = ""; showAdd = false; editing = null
}
fun startEdit(server: ServerEntry) {
editing = server
nm = server.name; addr = server.address; pwd = server.password
showAdd = false
}
fun submit() {
if (addr.isBlank()) return
val orig = editing
if (orig != null) {
// Preserve fields the compact form doesn't expose (scheme, port).
onEditServer(orig, orig.copy(address = addr, password = pwd, name = nm))
} else {
onAddServer(ServerEntry(addr, false, password = pwd, name = nm))
}
resetForm()
}
Column(
modifier = Modifier
.widthIn(max = 360.dp)
.clip(RoundedCornerShape(4.dp))
.background(NES.MenuPanel)
.border(3.dp, NES.MenuBorder, RoundedCornerShape(4.dp))
.widthIn(max = 420.dp)
.padding(horizontal = 20.dp)
.clip(RoundedCornerShape(PANEL_R))
.background(PanelBg)
.border(1.dp, PanelBorder, RoundedCornerShape(PANEL_R))
.clickable(indication = null, interactionSource = remember { MutableInteractionSource() }) {}
.padding(16.dp),
verticalArrangement = Arrangement.spacedBy(6.dp),
.padding(22.dp),
verticalArrangement = Arrangement.spacedBy(10.dp),
) {
// Title
Text("- MENU -", color = NES.MenuText, fontSize = 14.sp, fontWeight = FontWeight.Bold, letterSpacing = 4.sp,
modifier = Modifier.fillMaxWidth(), textAlign = androidx.compose.ui.text.style.TextAlign.Center)
Spacer(Modifier.height(4.dp))
Text(
"Menu",
color = TextPrimary,
fontSize = 18.sp,
fontWeight = FontWeight.SemiBold,
letterSpacing = 2.sp,
modifier = Modifier.fillMaxWidth(),
textAlign = TextAlign.Center,
)
Spacer(Modifier.height(2.dp))
// Servers
servers.forEach { server ->
val active = server.serialize() == activeServer?.serialize()
MenuItem(
label = (if (active) "\u25B6 " else " ") + server.address,
label = server.displayName(),
selected = active,
onClick = { onSelectServer(server) },
onEdit = { startEdit(server) },
onRemove = { onRemoveServer(server) },
)
}
if (servers.isEmpty()) {
Text(" NO SERVERS", color = NES.MenuMuted, fontSize = 11.sp, modifier = Modifier.padding(vertical = 4.dp))
Text("No servers", color = TextMuted, fontSize = 14.sp, modifier = Modifier.padding(vertical = 4.dp))
}
// Add server
if (showAdd) {
// Add / edit server
if (showAdd || editing != null) {
Column(
Modifier.fillMaxWidth().background(Color.Black.copy(alpha = 0.3f)).padding(8.dp),
verticalArrangement = Arrangement.spacedBy(6.dp),
Modifier
.fillMaxWidth()
.clip(RoundedCornerShape(ROW_R))
.background(FieldBg)
.border(1.dp, RowBorder, RoundedCornerShape(ROW_R))
.padding(12.dp),
verticalArrangement = Arrangement.spacedBy(8.dp),
) {
OutlinedTextField(
value = addr, onValueChange = { addr = it.trim() },
placeholder = { Text("192.168.1.100", color = NES.MenuMuted, fontSize = 11.sp) },
modifier = Modifier.fillMaxWidth().height(48.dp), singleLine = true,
textStyle = androidx.compose.ui.text.TextStyle(color = NES.MenuText, fontSize = 12.sp),
colors = nesFieldColors(),
shape = RoundedCornerShape(2.dp),
Row(
Modifier.fillMaxWidth(),
verticalAlignment = Alignment.CenterVertically,
horizontalArrangement = Arrangement.SpaceBetween,
) {
Text(
if (editing != null) "Edit Server" else "Add Server",
color = TextMuted,
fontSize = 13.sp,
letterSpacing = 1.sp,
fontWeight = FontWeight.Medium,
)
Text(
"Cancel",
color = TextMuted,
fontSize = 13.sp,
modifier = Modifier.clickable { resetForm() }.padding(start = 8.dp),
)
}
GlassField(
value = nm, onValueChange = { nm = it },
placeholder = "Name (optional)",
keyboardOptions = KeyboardOptions(keyboardType = KeyboardType.Text, imeAction = ImeAction.Next),
)
Row(horizontalArrangement = Arrangement.spacedBy(6.dp), verticalAlignment = Alignment.CenterVertically) {
OutlinedTextField(
GlassField(
value = addr, onValueChange = { addr = it.trim() },
placeholder = "192.168.1.100",
keyboardOptions = KeyboardOptions(keyboardType = KeyboardType.Uri, imeAction = ImeAction.Next),
)
Row(horizontalArrangement = Arrangement.spacedBy(8.dp), verticalAlignment = Alignment.CenterVertically) {
GlassField(
value = pwd, onValueChange = { pwd = it },
placeholder = { Text("PASSWORD", color = NES.MenuMuted, fontSize = 11.sp) },
modifier = Modifier.weight(1f).height(48.dp), singleLine = true,
placeholder = "Password",
modifier = Modifier.weight(1f),
visualTransformation = PasswordVisualTransformation(),
keyboardOptions = KeyboardOptions(keyboardType = KeyboardType.Password, imeAction = ImeAction.Go),
keyboardActions = KeyboardActions(onGo = {
if (addr.isNotBlank()) { onAddServer(ServerEntry(addr, false, password = pwd)); addr = ""; pwd = ""; showAdd = false }
}),
textStyle = androidx.compose.ui.text.TextStyle(color = NES.MenuText, fontSize = 12.sp),
colors = nesFieldColors(),
shape = RoundedCornerShape(2.dp),
keyboardActions = KeyboardActions(onGo = { submit() }),
)
Box(
Modifier.size(48.dp).clip(RoundedCornerShape(2.dp)).background(NES.MenuSelected)
.clickable {
if (addr.isNotBlank()) { onAddServer(ServerEntry(addr, false, password = pwd)); addr = ""; pwd = ""; showAdd = false }
},
Modifier.size(FIELD_H).clip(RoundedCornerShape(12.dp)).background(BitcoinOrange.copy(alpha = 0.15f))
.border(1.dp, BitcoinOrange.copy(alpha = 0.4f), RoundedCornerShape(12.dp))
.clickable { submit() },
contentAlignment = Alignment.Center,
) { Text("OK", color = NES.MenuText, fontSize = 10.sp, fontWeight = FontWeight.Bold) }
) { Text("OK", color = BitcoinOrange, fontSize = 14.sp, fontWeight = FontWeight.Bold) }
}
}
} else {
MenuItem(label = " ADD SERVER", onClick = { showAdd = true })
MenuItem(label = "Add Server", labelColor = BitcoinOrange, onClick = { showAdd = true })
}
Spacer(Modifier.height(2.dp))
Box(Modifier.fillMaxWidth().height(1.dp).background(NES.MenuBorder.copy(alpha = 0.3f)))
Box(Modifier.fillMaxWidth().height(1.dp).background(PanelBorder))
Spacer(Modifier.height(2.dp))
// Mode toggle
MenuItem(
label = if (isGamepadMode) " SWITCH TO KEYBOARD" else " SWITCH TO GAMEPAD",
label = if (isGamepadMode) "Switch to Keyboard" else "Switch to Gamepad",
onClick = onToggleMode,
)
// Style toggle
MenuItem(
label = if (controllerStyle == ControllerStyle.CLASSIC) " STYLE: CLASSIC" else " STYLE: DARK",
label = if (controllerStyle == ControllerStyle.CLASSIC) "Style: Classic" else "Style: Dark",
onClick = onToggleStyle,
)
// Back to dashboard
if (onBackToWebView != null) {
MenuItem(label = " BACK TO DASHBOARD", onClick = onBackToWebView)
MenuItem(label = "Back to Dashboard", onClick = onBackToWebView)
}
}
}
@ -187,32 +265,79 @@ private fun MenuPanel(
private fun MenuItem(
label: String,
selected: Boolean = false,
labelColor: Color = TextPrimary,
onClick: () -> Unit,
onEdit: (() -> Unit)? = null,
onRemove: (() -> Unit)? = null,
) {
Row(
Modifier
.fillMaxWidth()
.height(32.dp)
.background(if (selected) NES.MenuSelected.copy(alpha = 0.15f) else Color.Transparent)
.height(ROW_H)
.clip(RoundedCornerShape(ROW_R))
.background(if (selected) BitcoinOrange.copy(alpha = 0.12f) else RowBg)
.border(1.dp, if (selected) BitcoinOrange.copy(alpha = 0.4f) else RowBorder, RoundedCornerShape(ROW_R))
.clickable { onClick() }
.padding(horizontal = 8.dp),
.padding(horizontal = 16.dp),
verticalAlignment = Alignment.CenterVertically,
horizontalArrangement = Arrangement.SpaceBetween,
) {
Text(label, color = if (selected) NES.MenuSelected else NES.MenuText, fontSize = 11.sp, fontWeight = FontWeight.Medium)
Text(
label,
color = if (selected) BitcoinOrange else labelColor,
fontSize = 16.sp,
fontWeight = FontWeight.Medium,
modifier = Modifier.weight(1f),
)
if (onEdit != null) {
Text(
"",
color = TextMuted,
fontSize = 16.sp,
modifier = Modifier.clickable { onEdit() }.padding(horizontal = 8.dp),
)
}
if (onRemove != null) {
Text("\u2715", color = NES.MenuMuted, fontSize = 10.sp,
modifier = Modifier.clickable { onRemove() }.padding(horizontal = 8.dp))
Text(
"",
color = TextMuted,
fontSize = 16.sp,
modifier = Modifier.clickable { onRemove() }.padding(horizontal = 8.dp),
)
}
}
}
/** Glass text field with centered input text. */
@Composable
private fun nesFieldColors() = OutlinedTextFieldDefaults.colors(
focusedBorderColor = NES.MenuBorder,
unfocusedBorderColor = NES.MenuMuted,
cursorColor = NES.MenuText,
focusedTextColor = NES.MenuText,
unfocusedTextColor = NES.MenuText,
)
private fun GlassField(
value: String,
onValueChange: (String) -> Unit,
placeholder: String,
modifier: Modifier = Modifier,
visualTransformation: androidx.compose.ui.text.input.VisualTransformation = androidx.compose.ui.text.input.VisualTransformation.None,
keyboardOptions: KeyboardOptions = KeyboardOptions.Default,
keyboardActions: KeyboardActions = KeyboardActions.Default,
) {
OutlinedTextField(
value = value,
onValueChange = onValueChange,
placeholder = {
Text(placeholder, color = TextMuted, fontSize = 15.sp, modifier = Modifier.fillMaxWidth(), textAlign = TextAlign.Center)
},
modifier = modifier.fillMaxWidth().height(FIELD_H),
singleLine = true,
visualTransformation = visualTransformation,
keyboardOptions = keyboardOptions,
keyboardActions = keyboardActions,
textStyle = TextStyle(color = TextPrimary, fontSize = 16.sp, textAlign = TextAlign.Center),
colors = OutlinedTextFieldDefaults.colors(
focusedBorderColor = Color.White.copy(alpha = 0.3f),
unfocusedBorderColor = Color.White.copy(alpha = 0.12f),
cursorColor = BitcoinOrange,
focusedTextColor = TextPrimary,
unfocusedTextColor = TextPrimary,
),
shape = RoundedCornerShape(12.dp),
)
}

View File

@ -50,7 +50,6 @@ fun NESPortraitController(
Box(
Modifier
.fillMaxSize()
.background(Color(0xFF0C0C0C))
.twoFingerHold(onMenu)
.padding(horizontal = 40.dp, vertical = 24.dp),
contentAlignment = Alignment.Center,
@ -62,7 +61,7 @@ fun NESPortraitController(
.fillMaxSize()
.shadow(28.dp, RoundedCornerShape(20.dp), ambientColor = Color.Black, spotColor = Color.Black)
.clip(RoundedCornerShape(20.dp))
.background(Brush.verticalGradient(listOf(c.body, c.body.copy(alpha = 0.95f))))
.background(Brush.verticalGradient(listOf(c.body, c.body)))
.border(1.dp, Color.White.copy(alpha = if (isClassic) 0.08f else 0.04f), RoundedCornerShape(20.dp)),
) {
// Top highlight
@ -119,11 +118,11 @@ fun NESPortraitController(
Modifier.fillMaxWidth().padding(horizontal = 12.dp, vertical = 8.dp),
horizontalAlignment = Alignment.CenterHorizontally,
) {
ColorBtn(Color(0xFF888888), Color(0xFFAAAAAA), 46.dp) { onKey("c") }
GlassFaceBtn("C", Color(0xFFBBBBBB), 46.dp) { onKey("c") }
Spacer(Modifier.height(6.dp))
Row(horizontalArrangement = Arrangement.spacedBy(14.dp)) {
ColorBtn(Color(0xFF3B82F6), Color(0xFF60A5FA), 46.dp) { onKey("b") }
ColorBtn(Color(0xFFEA580C), Color(0xFFFB923C), 46.dp) { onKey("a") }
GlassFaceBtn("B", Color(0xFF60A5FA), 46.dp) { onKey("b") }
GlassFaceBtn("A", Color(0xFFF7931A), 46.dp) { onKey("a") }
}
}
}

View File

@ -23,6 +23,7 @@ import androidx.compose.foundation.layout.fillMaxWidth
import androidx.compose.foundation.layout.height
import androidx.compose.foundation.layout.padding
import androidx.compose.foundation.layout.safeDrawing
import androidx.compose.foundation.layout.size
import androidx.compose.foundation.layout.windowInsetsPadding
import androidx.compose.foundation.shape.RoundedCornerShape
import androidx.compose.material3.MaterialTheme
@ -41,7 +42,7 @@ import androidx.compose.ui.geometry.Offset
import androidx.compose.ui.geometry.Size
import androidx.compose.ui.graphics.Brush
import androidx.compose.ui.graphics.Color
import androidx.compose.ui.graphics.ColorFilter
import androidx.compose.ui.layout.ContentScale
import androidx.compose.ui.res.painterResource
import androidx.compose.ui.res.stringResource
import androidx.compose.ui.text.style.TextAlign
@ -67,26 +68,45 @@ fun IntroScreen(onContinue: () -> Unit) {
Box(
modifier = Modifier
.fillMaxSize()
.background(SurfaceBlack)
.windowInsetsPadding(WindowInsets.safeDrawing),
contentAlignment = Alignment.Center,
.background(SurfaceBlack),
) {
// Reddish synthwave backdrop
Image(
painter = painterResource(id = R.drawable.bg_synthwave),
contentDescription = null,
modifier = Modifier.fillMaxSize(),
contentScale = ContentScale.Crop,
)
// Dark scrim so the title/buttons stay legible over the art
Box(
modifier = Modifier
.fillMaxSize()
.background(
Brush.verticalGradient(
colors = listOf(
Color.Black.copy(alpha = 0.55f),
Color.Black.copy(alpha = 0.35f),
Color.Black.copy(alpha = 0.75f),
),
)
),
)
Column(
modifier = Modifier
.align(Alignment.Center)
.fillMaxWidth()
.windowInsetsPadding(WindowInsets.safeDrawing)
.padding(horizontal = 32.dp),
horizontalAlignment = Alignment.CenterHorizontally,
verticalArrangement = Arrangement.Center,
) {
// Wide pixel-art logo
// Circular badge logo
Image(
painter = painterResource(id = R.drawable.ic_logo_wide),
painter = painterResource(id = R.drawable.ic_logo),
contentDescription = "Archipelago",
modifier = Modifier
.fillMaxWidth()
.padding(horizontal = 8.dp)
.size(160.dp)
.alpha(logoAlpha.value),
colorFilter = ColorFilter.tint(Color.White),
)
Spacer(modifier = Modifier.height(48.dp))
@ -102,7 +122,7 @@ fun IntroScreen(onContinue: () -> Unit) {
Text(
text = stringResource(R.string.welcome_title),
style = MaterialTheme.typography.headlineLarge,
color = TextPrimary,
color = Color(0xFFFAFAFA),
textAlign = TextAlign.Center,
)
@ -111,7 +131,7 @@ fun IntroScreen(onContinue: () -> Unit) {
Text(
text = stringResource(R.string.welcome_subtitle),
style = MaterialTheme.typography.bodyLarge,
color = TextMuted,
color = Color(0xFFFAFAFA),
textAlign = TextAlign.Center,
lineHeight = 26.sp,
)

View File

@ -2,6 +2,7 @@ package com.archipelago.app.ui.screens
import android.content.res.Configuration
import androidx.activity.compose.BackHandler
import androidx.compose.foundation.Image
import androidx.compose.foundation.background
import androidx.compose.foundation.layout.Box
import androidx.compose.foundation.layout.Column
@ -24,13 +25,17 @@ import androidx.compose.runtime.setValue
import androidx.compose.ui.Alignment
import androidx.compose.ui.Modifier
import androidx.compose.ui.draw.clip
import androidx.compose.ui.graphics.Brush
import androidx.compose.ui.graphics.Color
import androidx.compose.ui.layout.ContentScale
import androidx.compose.ui.platform.LocalConfiguration
import androidx.compose.ui.platform.LocalContext
import androidx.compose.ui.platform.LocalLifecycleOwner
import androidx.lifecycle.Lifecycle
import androidx.lifecycle.LifecycleEventObserver
import androidx.compose.ui.res.painterResource
import androidx.compose.ui.unit.dp
import com.archipelago.app.R
import com.archipelago.app.data.ServerPreferences
import com.archipelago.app.network.ConnectionState
import com.archipelago.app.network.InputWebSocket
@ -58,7 +63,7 @@ fun RemoteInputScreen(onBack: () -> Unit) {
var isGamepadMode by remember { mutableStateOf(true) }
var showModal by remember { mutableStateOf(false) }
var controllerStyle by remember { mutableStateOf(ControllerStyle.CLASSIC) }
var controllerStyle by remember { mutableStateOf(ControllerStyle.DARK) }
var playerId by remember { mutableStateOf(0) } // 0 = broadcast, 1 = P1, 2 = P2
val ws = remember { InputWebSocket(scope) }
@ -113,9 +118,31 @@ fun RemoteInputScreen(onBack: () -> Unit) {
Box(
Modifier
.fillMaxSize()
.background(Color(0xFF0C0C0C))
.windowInsetsPadding(WindowInsets.safeDrawing),
.background(Color(0xFF0C0C0C)),
) {
// Reddish synthwave backdrop behind the controller
Image(
painter = painterResource(id = R.drawable.bg_synthwave),
contentDescription = null,
modifier = Modifier.fillMaxSize(),
contentScale = ContentScale.Crop,
)
// Light scrim — the controller body provides its own contrast, so keep
// this subtle and let the backdrop show through around it.
Box(
modifier = Modifier
.fillMaxSize()
.background(
Brush.verticalGradient(
colors = listOf(
Color.Black.copy(alpha = 0.4f),
Color.Black.copy(alpha = 0.25f),
Color.Black.copy(alpha = 0.45f),
),
)
),
)
Box(Modifier.fillMaxSize().windowInsetsPadding(WindowInsets.safeDrawing)) {
when {
isGamepadMode && isLandscape -> NESController(
style = controllerStyle,
@ -174,6 +201,7 @@ fun RemoteInputScreen(onBack: () -> Unit) {
}
),
)
}
NESMenu(
visible = showModal,
@ -188,7 +216,31 @@ fun RemoteInputScreen(onBack: () -> Unit) {
onAddServer = { server ->
scope.launch { prefs.addSavedServer(server); if (activeServer == null) prefs.setActiveServer(server) }
},
onRemoveServer = { server -> scope.launch { prefs.removeSavedServer(server) } },
onEditServer = { original, updated ->
scope.launch {
prefs.updateSavedServer(original, updated)
// If the edited server is the live one, reconnect with the new
// address/credentials so the change takes effect immediately.
if (original.serialize() == activeServer?.serialize()) {
ws.disconnect()
prefs.setActiveServer(updated)
}
}
},
onRemoveServer = { server ->
scope.launch {
prefs.removeSavedServer(server)
// Deleting the last server leaves nothing to control — drop the
// active server and return to the Connect screen.
val remaining = savedServers.count { it.serialize() != server.serialize() }
if (remaining == 0) {
ws.disconnect()
prefs.clearActiveServer()
showModal = false
onBack()
}
}
},
onToggleMode = { isGamepadMode = !isGamepadMode; showModal = false },
onToggleStyle = {
controllerStyle = if (controllerStyle == ControllerStyle.CLASSIC) ControllerStyle.DARK else ControllerStyle.CLASSIC

View File

@ -30,6 +30,7 @@ import androidx.compose.material.icons.filled.VisibilityOff
import androidx.compose.foundation.verticalScroll
import androidx.compose.material.icons.Icons
import androidx.compose.material.icons.filled.Close
import androidx.compose.material.icons.filled.Edit
import androidx.compose.material.icons.filled.Lock
import androidx.compose.material.icons.filled.LockOpen
import androidx.compose.material3.CircularProgressIndicator
@ -55,6 +56,7 @@ import androidx.compose.ui.draw.drawWithContent
import androidx.compose.ui.graphics.Brush
import androidx.compose.ui.graphics.Color
import androidx.compose.ui.graphics.ColorFilter
import androidx.compose.ui.layout.ContentScale
import androidx.compose.ui.platform.LocalContext
import androidx.compose.ui.platform.LocalSoftwareKeyboardController
import androidx.compose.ui.res.painterResource
@ -97,6 +99,7 @@ fun ServerConnectScreen(
val scope = rememberCoroutineScope()
val keyboard = LocalSoftwareKeyboardController.current
var name by remember { mutableStateOf("") }
var address by remember { mutableStateOf("") }
var port by remember { mutableStateOf("") }
var password by remember { mutableStateOf("") }
@ -104,9 +107,50 @@ fun ServerConnectScreen(
var useHttps by remember { mutableStateOf(false) }
var isConnecting by remember { mutableStateOf(false) }
var errorMessage by remember { mutableStateOf<String?>(null) }
// The saved server currently being edited, or null when adding/connecting.
var editingServer by remember { mutableStateOf<ServerEntry?>(null) }
val savedServers by prefs.savedServers.collectAsState(initial = emptyList())
fun clearForm() {
name = ""
address = ""
port = ""
password = ""
useHttps = false
passwordVisible = false
errorMessage = null
}
fun startEdit(server: ServerEntry) {
editingServer = server
name = server.name
address = server.address
port = server.port
password = server.password
useHttps = server.useHttps
passwordVisible = false
errorMessage = null
}
fun cancelEdit() {
editingServer = null
clearForm()
}
fun saveEdit() {
val original = editingServer ?: return
if (address.isBlank()) {
errorMessage = "Enter a server address"
return
}
val updated = ServerEntry(address, useHttps, port, password, name)
scope.launch {
prefs.updateSavedServer(original, updated)
cancelEdit()
}
}
fun connect(server: ServerEntry) {
if (isConnecting) return
if (server.address.isBlank()) {
@ -132,12 +176,33 @@ fun ServerConnectScreen(
Box(
modifier = Modifier
.fillMaxSize()
.background(SurfaceBlack)
.windowInsetsPadding(WindowInsets.safeDrawing),
.background(SurfaceBlack),
) {
// Reddish synthwave backdrop
Image(
painter = painterResource(id = R.drawable.bg_synthwave),
contentDescription = null,
modifier = Modifier.fillMaxSize(),
contentScale = ContentScale.Crop,
)
// Dark scrim so the form stays legible over the art
Box(
modifier = Modifier
.fillMaxSize()
.background(
Brush.verticalGradient(
colors = listOf(
Color.Black.copy(alpha = 0.6f),
Color.Black.copy(alpha = 0.45f),
Color.Black.copy(alpha = 0.8f),
),
)
),
)
Column(
modifier = Modifier
.fillMaxSize()
.windowInsetsPadding(WindowInsets.safeDrawing)
.verticalScroll(state = rememberScrollState())
.drawWithContent { drawContent() }
.padding(horizontal = 24.dp)
@ -145,20 +210,17 @@ fun ServerConnectScreen(
horizontalAlignment = Alignment.CenterHorizontally,
verticalArrangement = Arrangement.spacedBy(16.dp),
) {
// Wide logo
// Circular badge logo
Image(
painter = painterResource(id = R.drawable.ic_logo_wide),
painter = painterResource(id = R.drawable.ic_logo),
contentDescription = "Archipelago",
modifier = Modifier
.fillMaxWidth()
.padding(horizontal = 16.dp),
colorFilter = ColorFilter.tint(Color.White),
modifier = Modifier.size(96.dp),
)
Spacer(modifier = Modifier.height(4.dp))
Text(
text = "Connect to Server",
text = if (editingServer != null) stringResource(R.string.edit_server_title) else "Connect to Server",
style = MaterialTheme.typography.headlineMedium,
color = TextPrimary,
textAlign = TextAlign.Center,
@ -178,6 +240,7 @@ fun ServerConnectScreen(
modifier = Modifier
.fillMaxWidth()
.clip(RoundedCornerShape(16.dp))
.background(Color.Black.copy(alpha = 0.6f))
.background(
Brush.verticalGradient(
colors = listOf(
@ -190,6 +253,34 @@ fun ServerConnectScreen(
.padding(20.dp),
) {
Column {
OutlinedTextField(
value = name,
onValueChange = {
name = it
errorMessage = null
},
label = { Text(stringResource(R.string.server_name_label)) },
placeholder = { Text(stringResource(R.string.server_name_placeholder)) },
modifier = Modifier.fillMaxWidth(),
singleLine = true,
keyboardOptions = KeyboardOptions(
keyboardType = KeyboardType.Text,
imeAction = ImeAction.Next,
),
colors = OutlinedTextFieldDefaults.colors(
focusedBorderColor = Color.White.copy(alpha = 0.3f),
unfocusedBorderColor = Color.White.copy(alpha = 0.12f),
cursorColor = Color.White,
focusedLabelColor = Color.White.copy(alpha = 0.7f),
unfocusedLabelColor = TextMuted,
focusedTextColor = TextPrimary,
unfocusedTextColor = TextPrimary,
),
shape = RoundedCornerShape(12.dp),
)
Spacer(modifier = Modifier.height(12.dp))
OutlinedTextField(
value = address,
onValueChange = {
@ -275,7 +366,11 @@ fun ServerConnectScreen(
keyboardActions = KeyboardActions(
onGo = {
keyboard?.hide()
connect(ServerEntry(address, useHttps, port, password))
if (editingServer != null) {
saveEdit()
} else {
connect(ServerEntry(address, useHttps, port, password, name))
}
},
),
colors = OutlinedTextFieldDefaults.colors(
@ -340,15 +435,40 @@ fun ServerConnectScreen(
}
}
// Connect button — glass style
GlassButton(
text = if (isConnecting) stringResource(R.string.connecting) else stringResource(R.string.connect),
onClick = {
keyboard?.hide()
connect(ServerEntry(address, useHttps, port, password))
},
modifier = Modifier.fillMaxWidth().height(56.dp),
)
if (editingServer != null) {
// Save / Cancel while editing an existing saved server
Row(
modifier = Modifier.fillMaxWidth(),
horizontalArrangement = Arrangement.spacedBy(12.dp),
) {
GlassButton(
text = stringResource(R.string.cancel),
onClick = {
keyboard?.hide()
cancelEdit()
},
modifier = Modifier.weight(1f).height(56.dp),
)
GlassButton(
text = stringResource(R.string.save_changes),
onClick = {
keyboard?.hide()
saveEdit()
},
modifier = Modifier.weight(1f).height(56.dp),
)
}
} else {
// Connect button — glass style
GlassButton(
text = if (isConnecting) stringResource(R.string.connecting) else stringResource(R.string.connect),
onClick = {
keyboard?.hide()
connect(ServerEntry(address, useHttps, port, password, name))
},
modifier = Modifier.fillMaxWidth().height(56.dp),
)
}
if (isConnecting) {
CircularProgressIndicator(
@ -358,8 +478,8 @@ fun ServerConnectScreen(
)
}
// Saved servers
if (savedServers.isNotEmpty()) {
// Saved servers (hidden while editing one to keep focus on the form)
if (editingServer == null && savedServers.isNotEmpty()) {
Spacer(modifier = Modifier.height(8.dp))
Text(
text = stringResource(R.string.saved_servers),
@ -373,6 +493,7 @@ fun ServerConnectScreen(
SavedServerItem(
server = server,
onConnect = { connect(it) },
onEdit = { startEdit(it) },
onRemove = { scope.launch { prefs.removeSavedServer(it) } },
)
}
@ -385,12 +506,14 @@ fun ServerConnectScreen(
private fun SavedServerItem(
server: ServerEntry,
onConnect: (ServerEntry) -> Unit,
onEdit: (ServerEntry) -> Unit,
onRemove: (ServerEntry) -> Unit,
) {
Row(
modifier = Modifier
.fillMaxWidth()
.clip(RoundedCornerShape(12.dp))
.background(Color.Black.copy(alpha = 0.6f))
.background(
Brush.verticalGradient(
colors = listOf(
@ -414,12 +537,21 @@ private fun SavedServerItem(
)
Spacer(modifier = Modifier.width(12.dp))
Column {
Text(text = server.address, style = MaterialTheme.typography.bodyMedium, color = TextPrimary, maxLines = 1, overflow = TextOverflow.Ellipsis)
if (server.port.isNotBlank()) {
Text(text = "Port ${server.port}", style = MaterialTheme.typography.labelMedium, color = TextMuted)
Text(text = server.displayName(), style = MaterialTheme.typography.bodyMedium, color = TextPrimary, maxLines = 1, overflow = TextOverflow.Ellipsis)
val secondary = buildString {
if (server.name.isNotBlank()) append(server.address)
if (server.port.isNotBlank()) {
if (isNotEmpty()) append(":${server.port}") else append("Port ${server.port}")
}
}
if (secondary.isNotBlank()) {
Text(text = secondary, style = MaterialTheme.typography.labelMedium, color = TextMuted, maxLines = 1, overflow = TextOverflow.Ellipsis)
}
}
}
IconButton(onClick = { onEdit(server) }) {
Icon(imageVector = Icons.Default.Edit, contentDescription = stringResource(R.string.edit_server), modifier = Modifier.size(18.dp), tint = TextMuted)
}
IconButton(onClick = { onRemove(server) }) {
Icon(imageVector = Icons.Default.Close, contentDescription = stringResource(R.string.remove_server), modifier = Modifier.size(18.dp), tint = TextMuted)
}

View File

@ -2,6 +2,7 @@ package com.archipelago.app.ui.screens
import android.annotation.SuppressLint
import android.graphics.Bitmap
import android.graphics.BitmapFactory
import android.view.ViewGroup
import android.webkit.CookieManager
import android.webkit.WebChromeClient
@ -14,10 +15,12 @@ import androidx.activity.compose.BackHandler
import androidx.compose.animation.AnimatedVisibility
import androidx.compose.animation.fadeIn
import androidx.compose.animation.fadeOut
import androidx.compose.foundation.Image
import androidx.compose.foundation.background
import androidx.compose.foundation.layout.Arrangement
import androidx.compose.foundation.layout.Box
import androidx.compose.foundation.layout.Column
import androidx.compose.foundation.layout.Row
import androidx.compose.foundation.layout.Spacer
import androidx.compose.foundation.layout.WindowInsets
import androidx.compose.foundation.layout.fillMaxSize
@ -26,14 +29,24 @@ import androidx.compose.foundation.layout.height
import androidx.compose.foundation.layout.padding
import androidx.compose.foundation.layout.safeDrawing
import androidx.compose.foundation.layout.size
import androidx.compose.foundation.layout.width
import androidx.compose.foundation.layout.windowInsetsPadding
import androidx.compose.foundation.shape.RoundedCornerShape
import androidx.compose.material.icons.Icons
import androidx.compose.material.icons.automirrored.filled.ArrowBack
import androidx.compose.material.icons.automirrored.filled.ArrowForward
import androidx.compose.material.icons.filled.Close
import androidx.compose.material.icons.filled.CloudOff
import androidx.compose.material.icons.filled.OpenInBrowser
import androidx.compose.material.icons.filled.Refresh
import androidx.compose.material3.CircularProgressIndicator
import androidx.compose.material3.Icon
import androidx.compose.material3.IconButton
import androidx.compose.material3.LinearProgressIndicator
import androidx.compose.material3.MaterialTheme
import androidx.compose.material3.Text
import androidx.compose.runtime.Composable
import androidx.compose.runtime.LaunchedEffect
import androidx.compose.runtime.getValue
import androidx.compose.runtime.mutableIntStateOf
import androidx.compose.runtime.mutableStateOf
@ -41,8 +54,12 @@ import androidx.compose.runtime.remember
import androidx.compose.runtime.setValue
import androidx.compose.ui.Alignment
import androidx.compose.ui.Modifier
import androidx.compose.ui.draw.clip
import androidx.compose.ui.graphics.asImageBitmap
import androidx.compose.ui.platform.LocalContext
import androidx.compose.ui.res.stringResource
import androidx.compose.ui.text.style.TextAlign
import androidx.compose.ui.text.style.TextOverflow
import androidx.compose.ui.unit.dp
import androidx.compose.ui.viewinterop.AndroidView
import com.archipelago.app.R
@ -50,8 +67,70 @@ import com.archipelago.app.ui.theme.BitcoinOrange
import com.archipelago.app.ui.theme.SurfaceBlack
import com.archipelago.app.ui.theme.TextMuted
import com.archipelago.app.ui.theme.TextPrimary
import kotlinx.coroutines.Dispatchers
import kotlinx.coroutines.withContext
/** Open a URL in the phone's default browser (genuinely external links). */
private fun openExternalUrl(context: android.content.Context, url: String) {
try {
val intent = android.content.Intent(
android.content.Intent.ACTION_VIEW,
android.net.Uri.parse(url),
).apply {
// Required when launching from a non-Activity/binder thread
// (the JS bridge below can run off the UI thread).
addFlags(android.content.Intent.FLAG_ACTIVITY_NEW_TASK)
}
context.startActivity(intent)
} catch (_: Exception) {}
}
/** True when [url] points at the same host as the connected Archipelago node
* (ignoring port). Such URLs are node apps e.g. one that can't be iframed
* and should stay inside the app rather than bouncing out to the browser. */
private fun isSameHost(url: String, base: String): Boolean {
return try {
val a = android.net.Uri.parse(url).host ?: return false
val b = android.net.Uri.parse(base).host ?: return false
a.equals(b, ignoreCase = true)
} catch (_: Exception) {
false
}
}
/** Apply the WebView settings shared by the kiosk view and the in-app browser.
* These are tuned for SPA performance and parity with the mobile browser;
* none of them alter how a page renders visually. */
@SuppressLint("SetJavaScriptEnabled")
private fun WebView.applyArchipelagoSettings() {
// Pre-rasterize just outside the viewport so flinging the kiosk/app doesn't
// show blank checkerboarding — the single biggest scroll-smoothness win and
// a major part of the "feels slower than the browser" gap. (API 23+)
settings.setOffscreenPreRaster(true)
settings.apply {
javaScriptEnabled = true
domStorageEnabled = true
databaseEnabled = true
mediaPlaybackRequiresUserGesture = false
mixedContentMode = WebSettings.MIXED_CONTENT_COMPATIBILITY_MODE
useWideViewPort = true
loadWithOverviewMode = true
setSupportZoom(false)
builtInZoomControls = false
cacheMode = WebSettings.LOAD_DEFAULT
allowContentAccess = true
allowFileAccess = false
}
// chrome://inspect profiling on debuggable builds only — lets us measure the
// real in-page bottleneck rather than guess. No effect on release builds.
val debuggable = 0 != (context.applicationInfo.flags and
android.content.pm.ApplicationInfo.FLAG_DEBUGGABLE)
if (debuggable) WebView.setWebContentsDebuggingEnabled(true)
}
@SuppressLint("SetJavaScriptEnabled", "ClickableViewAccessibility")
@Composable
fun WebViewScreen(
serverUrl: String,
@ -63,7 +142,12 @@ fun WebViewScreen(
var hasError by remember { mutableStateOf(false) }
var webView by remember { mutableStateOf<WebView?>(null) }
BackHandler(enabled = webView?.canGoBack() == true) {
// A node app that refused iframing, opened in a local WebView overlay.
// null = no overlay. The kiosk WebView underneath stays alive (and warm)
// while this is shown, so closing it returns instantly with no reload.
var inAppUrl by remember { mutableStateOf<String?>(null) }
BackHandler(enabled = inAppUrl == null && webView?.canGoBack() == true) {
webView?.goBack()
}
@ -132,20 +216,6 @@ fun WebViewScreen(
AndroidView(
modifier = Modifier.fillMaxSize(),
factory = { context ->
fun openExternalUrl(url: String) {
try {
val intent = android.content.Intent(
android.content.Intent.ACTION_VIEW,
android.net.Uri.parse(url),
).apply {
// Required when launching from a non-Activity/binder
// thread (the JS bridge below runs off the UI thread).
addFlags(android.content.Intent.FLAG_ACTIVITY_NEW_TASK)
}
context.startActivity(intent)
} catch (_: Exception) {}
}
WebView(context).apply {
layoutParams = ViewGroup.LayoutParams(
ViewGroup.LayoutParams.MATCH_PARENT,
@ -159,19 +229,8 @@ fun WebViewScreen(
cookieManager.setAcceptCookie(true)
cookieManager.setAcceptThirdPartyCookies(this, true)
applyArchipelagoSettings()
settings.apply {
javaScriptEnabled = true
domStorageEnabled = true
databaseEnabled = true
mediaPlaybackRequiresUserGesture = false
mixedContentMode = WebSettings.MIXED_CONTENT_COMPATIBILITY_MODE
useWideViewPort = true
loadWithOverviewMode = true
setSupportZoom(false)
builtInZoomControls = false
cacheMode = WebSettings.LOAD_DEFAULT
allowContentAccess = true
allowFileAccess = false
setSupportMultipleWindows(true) // enables onCreateWindow for window.open
// Let JS open windows without a synchronous user-gesture
// chain; without this, window.open() from a Vue click
@ -179,18 +238,35 @@ fun WebViewScreen(
javaScriptCanOpenWindowsAutomatically = true
}
// Deterministic bridge for "open in the phone's browser".
// The web UI calls window.ArchipelagoNative.openExternal(url)
// when present (companion app), falling back to window.open
// in a plain mobile browser. This avoids relying on the
// window.open → onCreateWindow path, which noopener/noreferrer
// can suppress in the WebView.
val webViewRef = this
// Decide where an outbound URL goes:
// - same host as the node → in-app WebView overlay
// (this is the "open in browser" target for apps the
// kiosk couldn't iframe — keep the user inside the app)
// - different host → the phone's real browser
fun routeOutbound(url: String) {
if (isSameHost(url, serverUrl)) {
inAppUrl = url
} else {
openExternalUrl(context, url)
}
}
// JS bridge. The web UI calls:
// window.ArchipelagoNative.openExternal(url) — host-routed
// window.ArchipelagoNative.openInApp(url) — force in-app
// Falls back to window.open in a plain mobile browser.
addJavascriptInterface(
object {
@android.webkit.JavascriptInterface
fun openExternal(url: String) {
webViewRef.post { openExternalUrl(url) }
webViewRef.post { routeOutbound(url) }
}
@android.webkit.JavascriptInterface
fun openInApp(url: String) {
webViewRef.post { inAppUrl = url }
}
},
"ArchipelagoNative",
@ -247,15 +323,35 @@ fun WebViewScreen(
}
}
// Node apps (e.g. NetBird) terminate TLS with a
// self-signed cert — the dashboard needs a secure
// context for OIDC/window.crypto.subtle (#15). The
// WebView default is to CANCEL untrusted certs, so
// those apps render blank. The user explicitly trusts
// their own node, so proceed for same-host certs only;
// reject anything else (don't blanket-trust the web).
override fun onReceivedSslError(
view: WebView?,
handler: android.webkit.SslErrorHandler?,
error: android.net.http.SslError?,
) {
val u = error?.url
if (u != null && isSameHost(u, serverUrl)) {
handler?.proceed()
} else {
handler?.cancel()
}
}
override fun shouldOverrideUrlLoading(
view: WebView?,
request: WebResourceRequest?,
): Boolean {
val url = request?.url?.toString() ?: return false
// Keep navigation within the Archipelago server
// Keep kiosk navigation (same origin incl. port) in place
if (url.startsWith(serverUrl)) return false
// Open external URLs in the system browser
openExternalUrl(url)
// Same node (other port) → in-app; external → browser
routeOutbound(url)
return true
}
}
@ -265,7 +361,9 @@ fun WebViewScreen(
loadProgress = newProgress
}
// Handle window.open() — open in system browser
// window.open() — e.g. the kiosk's "Open in new tab"
// for an app that can't be iframed. Capture the target
// URL via a throwaway WebView and route it ourselves.
override fun onCreateWindow(
view: WebView?,
isDialog: Boolean,
@ -283,12 +381,12 @@ fun WebViewScreen(
request: WebResourceRequest?,
): Boolean {
val url = request?.url?.toString() ?: return true
openExternalUrl(url)
routeOutbound(url)
return true
}
override fun onPageStarted(view: WebView?, url: String?, favicon: Bitmap?) {
if (url != null) openExternalUrl(url)
if (url != null) routeOutbound(url)
view?.stopLoading()
}
}
@ -350,6 +448,255 @@ fun WebViewScreen(
)
}
// In-app browser overlay for non-iframeable node apps. Rendered last
// so it sits above the kiosk WebView, which stays alive underneath.
inAppUrl?.let { target ->
InAppBrowser(
url = target,
serverUrl = serverUrl,
onClose = { inAppUrl = null },
)
}
}
}
}
/** Best-effort fetch of the origin's /favicon.ico, so the launched app's icon
* can be shown on the loading screen before the WebView reports onReceivedIcon
* (which only fires once the page's <head> has parsed). Blocking call on IO. */
private fun fetchFavicon(pageUrl: String): Bitmap? {
return try {
val u = android.net.Uri.parse(pageUrl)
val scheme = u.scheme ?: return null
val host = u.host ?: return null
val portPart = if (u.port > 0) ":${u.port}" else ""
val conn = (java.net.URL("$scheme://$host$portPart/favicon.ico").openConnection()
as java.net.HttpURLConnection).apply {
connectTimeout = 4000
readTimeout = 4000
instanceFollowRedirects = true
}
conn.inputStream.use { BitmapFactory.decodeStream(it) }
} catch (_: Exception) {
null
}
}
/**
* Lightweight in-app browser used when the kiosk hands off an app that can't be
* shown in an iframe. Loads the app in a local WebView with a centered loading
* screen (app favicon + progress bar) and a BOTTOM control bar mirroring the
* web mobile-iframe footer (back / forward / reload / open-in-browser / close).
* Same-host navigation stays here; any genuinely external link escapes to the
* phone's browser.
*/
@SuppressLint("SetJavaScriptEnabled")
@Composable
private fun InAppBrowser(
url: String,
serverUrl: String,
onClose: () -> Unit,
) {
val context = LocalContext.current
var browser by remember { mutableStateOf<WebView?>(null) }
var title by remember { mutableStateOf(android.net.Uri.parse(url).host ?: url) }
var favicon by remember { mutableStateOf<Bitmap?>(null) }
var progress by remember { mutableIntStateOf(0) }
var loading by remember { mutableStateOf(true) }
var canGoBack by remember { mutableStateOf(false) }
var canGoForward by remember { mutableStateOf(false) }
// Seed the loading-screen icon immediately from a best-effort favicon
// pre-fetch (main's app-icon work), then onReceivedIcon upgrades it — so the
// loader shows an icon right away instead of staying blank until the page
// parses its <head> (which is what made the loader look stuck).
LaunchedEffect(url) {
val fetched = withContext(Dispatchers.IO) { fetchFavicon(url) }
if (fetched != null && favicon == null) favicon = fetched
}
// Back: walk the in-app history first, then close the overlay.
BackHandler {
val b = browser
if (b != null && b.canGoBack()) b.goBack() else onClose()
}
Column(
modifier = Modifier
.fillMaxSize()
.background(SurfaceBlack)
.windowInsetsPadding(WindowInsets.safeDrawing),
) {
// WebView + loading overlay fill the area above the bottom control bar.
Box(modifier = Modifier.weight(1f).fillMaxWidth()) {
AndroidView(
modifier = Modifier.fillMaxSize(),
factory = { ctx ->
WebView(ctx).apply {
layoutParams = ViewGroup.LayoutParams(
ViewGroup.LayoutParams.MATCH_PARENT,
ViewGroup.LayoutParams.MATCH_PARENT,
)
isVerticalScrollBarEnabled = false
isHorizontalScrollBarEnabled = false
CookieManager.getInstance().setAcceptThirdPartyCookies(this, true)
applyArchipelagoSettings()
webChromeClient = object : WebChromeClient() {
override fun onProgressChanged(view: WebView?, newProgress: Int) {
progress = newProgress
}
override fun onReceivedTitle(view: WebView?, t: String?) {
if (!t.isNullOrBlank()) title = t
}
override fun onReceivedIcon(view: WebView?, icon: Bitmap?) {
if (icon != null) favicon = icon
}
}
webViewClient = object : WebViewClient() {
override fun onPageStarted(view: WebView?, u: String?, favicon: Bitmap?) {
loading = true
}
override fun onPageFinished(view: WebView?, u: String?) {
loading = false
canGoBack = view?.canGoBack() == true
canGoForward = view?.canGoForward() == true
}
override fun doUpdateVisitedHistory(view: WebView?, u: String?, isReload: Boolean) {
canGoBack = view?.canGoBack() == true
canGoForward = view?.canGoForward() == true
}
// Self-signed TLS on the node's apps (e.g. NetBird on
// :8087) would otherwise be cancelled by the WebView
// and render blank. Proceed for the user's own node
// (same host); reject any other untrusted cert.
override fun onReceivedSslError(
view: WebView?,
handler: android.webkit.SslErrorHandler?,
error: android.net.http.SslError?,
) {
val u = error?.url
if (u != null && isSameHost(u, serverUrl)) {
handler?.proceed()
} else {
handler?.cancel()
}
}
override fun shouldOverrideUrlLoading(
view: WebView?,
request: WebResourceRequest?,
): Boolean {
val u = request?.url?.toString() ?: return false
// Stay in the overlay for same-node navigation;
// hand genuinely external links to the real browser.
if (isSameHost(u, serverUrl)) return false
openExternalUrl(ctx, u)
return true
}
}
browser = this
loadUrl(url)
}
},
)
// Centered loading screen — app favicon (or spinner) + title + bar.
if (loading) {
Column(
modifier = Modifier
.fillMaxSize()
.background(SurfaceBlack),
horizontalAlignment = Alignment.CenterHorizontally,
verticalArrangement = Arrangement.Center,
) {
Box(
modifier = Modifier.size(84.dp).clip(RoundedCornerShape(20.dp)),
contentAlignment = Alignment.Center,
) {
val fav = favicon
if (fav != null) {
Image(
bitmap = fav.asImageBitmap(),
contentDescription = title,
modifier = Modifier.fillMaxSize(),
)
} else {
CircularProgressIndicator(color = BitcoinOrange)
}
}
Spacer(modifier = Modifier.height(18.dp))
Text(
text = title,
style = MaterialTheme.typography.bodyLarge,
color = TextPrimary,
maxLines = 1,
overflow = TextOverflow.Ellipsis,
)
Spacer(modifier = Modifier.height(16.dp))
LinearProgressIndicator(
progress = { progress / 100f },
modifier = Modifier.width(220.dp),
color = BitcoinOrange,
trackColor = TextMuted.copy(alpha = 0.2f),
)
}
}
}
// Bottom control bar — mirrors the web mobile-iframe footer.
Row(
modifier = Modifier
.fillMaxWidth()
.height(56.dp)
.background(SurfaceBlack)
.padding(horizontal = 8.dp),
horizontalArrangement = Arrangement.SpaceAround,
verticalAlignment = Alignment.CenterVertically,
) {
IconButton(onClick = { browser?.goBack() }, enabled = canGoBack) {
Icon(
imageVector = Icons.AutoMirrored.Filled.ArrowBack,
contentDescription = "Back",
tint = if (canGoBack) TextPrimary else TextMuted.copy(alpha = 0.4f),
)
}
IconButton(onClick = { browser?.goForward() }, enabled = canGoForward) {
Icon(
imageVector = Icons.AutoMirrored.Filled.ArrowForward,
contentDescription = "Forward",
tint = if (canGoForward) TextPrimary else TextMuted.copy(alpha = 0.4f),
)
}
IconButton(onClick = { browser?.reload() }) {
Icon(
imageVector = Icons.Default.Refresh,
contentDescription = "Reload",
tint = TextPrimary,
)
}
IconButton(onClick = { openExternalUrl(context, browser?.url ?: url) }) {
Icon(
imageVector = Icons.Default.OpenInBrowser,
contentDescription = stringResource(R.string.open_in_browser),
tint = TextPrimary,
)
}
IconButton(onClick = onClose) {
Icon(
imageVector = Icons.Default.Close,
contentDescription = stringResource(R.string.close),
tint = TextPrimary,
)
}
}
}
}

Binary file not shown.

After

Width:  |  Height:  |  Size: 869 KiB

View File

@ -1,10 +1,53 @@
<?xml version="1.0" encoding="utf-8"?>
<!-- Whole badge lives here (background renders to the mask edge with no
safe-zone cropping, unlike the foreground): dark fill + metallic ring pulled
inward to ~0.88 so the mask can't clip it + grid at ~0.58. Matches the
locally-rendered preview. Foreground is transparent. -->
<vector xmlns:android="http://schemas.android.com/apk/res/android"
xmlns:aapt="http://schemas.android.com/aapt"
android:width="108dp"
android:height="108dp"
android:viewportWidth="108"
android:viewportHeight="108">
android:viewportWidth="752"
android:viewportHeight="752">
<path
android:fillColor="#030202"
android:pathData="M0,0h108v108H0z" />
android:fillColor="#0A0A0A"
android:pathData="M0,0h752v752H0z" />
<!-- Ring matching logo.svg's gradient (#000->#666). Scale 0.65 places it at
the home-screen's visible edge (calibrated from a device home screenshot;
launcher3 crops less than the Settings App-info view). -->
<group
android:pivotX="376"
android:pivotY="376"
android:scaleX="0.65"
android:scaleY="0.65">
<path
android:fillColor="#00000000"
android:strokeWidth="22.8834"
android:pathData="M11.441,375.669a364.227,364.227 0 1,0 728.454,0a364.227,364.227 0 1,0 -728.454,0z">
<aapt:attr name="android:strokeColor">
<gradient
android:type="linear"
android:startX="751.337"
android:startY="751.338"
android:endX="0"
android:endY="0.000976562">
<item android:offset="0" android:color="#FF000000" />
<item android:offset="1" android:color="#FF666666" />
</gradient>
</aapt:attr>
</path>
</group>
<!-- White Archipelago grid -->
<group
android:pivotX="376"
android:pivotY="376"
android:scaleX="0.55"
android:scaleY="0.55">
<path
android:fillColor="#FFFFFF"
android:pathData="M253.805,278.37V222.28H309.853V278.37H253.805ZM315.797,278.37V222.28H372.694V278.37H315.797ZM378.639,278.37V222.28H435.536V278.37H378.639ZM441.481,278.37V222.28H497.529V278.37H441.481ZM441.481,341.259V284.319H497.529V341.259H441.481ZM503.473,341.259V284.319H560.37V341.259H503.473ZM190.963,404.148V347.208H247.86V404.148H190.963ZM253.805,404.148V347.208H309.853V404.148H253.805ZM315.797,404.148V347.208H372.694V404.148H315.797ZM378.639,404.148V347.208H435.536V404.148H378.639ZM441.481,404.148V347.208H497.529V404.148H441.481ZM503.473,404.148V347.208H560.37V404.148H503.473ZM190.963,466.187V410.097H247.86V466.187H190.963ZM253.805,466.187V410.097H309.853V466.187H253.805ZM441.481,466.187V410.097H497.529V466.187H441.481ZM503.473,466.187V410.097H560.37V466.187H503.473ZM253.805,529.076V472.136H309.853V529.076H253.805ZM315.797,529.076V472.136H372.694V529.076H315.797ZM378.639,529.076V472.136H435.536V529.076H378.639ZM441.481,529.076V472.136H497.529V529.076H441.481Z" />
</group>
</vector>

View File

@ -1,45 +1,12 @@
<?xml version="1.0" encoding="utf-8"?>
<!-- Archipelago pixel-art "A" logo — scaled 90% and centered -->
<!-- Transparent — the whole badge (ring + grid) is in the background layer so it
renders to the mask edge without safe-zone cropping. -->
<vector xmlns:android="http://schemas.android.com/apk/res/android"
android:width="108dp"
android:height="108dp"
android:viewportWidth="1024"
android:viewportHeight="1024">
<group
android:pivotX="512"
android:pivotY="512"
android:scaleX="0.55"
android:scaleY="0.55">
<!-- Row 1: 4 blocks -->
<path android:fillColor="#FFFFFF" android:pathData="M357.614,318h71.007v70.936h-71.007z" />
<path android:fillColor="#FFFFFF" android:pathData="M436.152,318h72.082v70.936h-72.082z" />
<path android:fillColor="#FFFFFF" android:pathData="M515.766,318h72.082v70.936h-72.082z" />
<path android:fillColor="#FFFFFF" android:pathData="M595.379,318h71.007v70.936h-71.007z" />
<!-- Row 2: 2 blocks (right side) -->
<path android:fillColor="#FFFFFF" android:pathData="M595.379,396.46h71.007v72.011h-71.007z" />
<path android:fillColor="#FFFFFF" android:pathData="M673.917,396.46h72.083v72.011h-72.083z" />
<!-- Row 3: 6 blocks (full width) -->
<path android:fillColor="#FFFFFF" android:pathData="M278,475.994h72.083v72.012h-72.083z" />
<path android:fillColor="#FFFFFF" android:pathData="M357.614,475.994h71.007v72.012h-71.007z" />
<path android:fillColor="#FFFFFF" android:pathData="M436.152,475.994h72.082v72.012h-72.082z" />
<path android:fillColor="#FFFFFF" android:pathData="M515.766,475.994h72.082v72.012h-72.082z" />
<path android:fillColor="#FFFFFF" android:pathData="M595.379,475.994h71.007v72.012h-71.007z" />
<path android:fillColor="#FFFFFF" android:pathData="M673.917,475.994h72.083v72.012h-72.083z" />
<!-- Row 4: 4 blocks (sides only — the "A" gap) -->
<path android:fillColor="#FFFFFF" android:pathData="M278,555.529h72.083v70.936h-72.083z" />
<path android:fillColor="#FFFFFF" android:pathData="M357.614,555.529h71.007v70.936h-71.007z" />
<path android:fillColor="#FFFFFF" android:pathData="M595.379,555.529h71.007v70.936h-71.007z" />
<path android:fillColor="#FFFFFF" android:pathData="M673.917,555.529h72.083v70.936h-72.083z" />
<!-- Row 5: 4 blocks (bottom) -->
<path android:fillColor="#FFFFFF" android:pathData="M357.614,633.989h71.007v72.011h-71.007z" />
<path android:fillColor="#FFFFFF" android:pathData="M436.152,633.989h72.082v72.011h-72.082z" />
<path android:fillColor="#FFFFFF" android:pathData="M515.766,633.989h72.082v72.011h-72.082z" />
<path android:fillColor="#FFFFFF" android:pathData="M595.379,633.989h71.007v72.011h-71.007z" />
</group>
android:viewportWidth="108"
android:viewportHeight="108">
<path
android:fillColor="#00000000"
android:pathData="M0,0h108v108H0z" />
</vector>

View File

@ -0,0 +1,33 @@
<?xml version="1.0" encoding="utf-8"?>
<!-- Archipelago circular badge logo (from logo.svg):
dark circle with a black→grey gradient ring + white pixel-grid mark. -->
<vector xmlns:android="http://schemas.android.com/apk/res/android"
xmlns:aapt="http://schemas.android.com/aapt"
android:width="120dp"
android:height="120dp"
android:viewportWidth="752"
android:viewportHeight="752">
<!-- Ringed circle (circle converted to a path; stroke carries the gradient) -->
<path
android:fillColor="#0A0A0A"
android:strokeWidth="22.8834"
android:pathData="M11.441,375.669a364.227,364.227 0 1,0 728.454,0a364.227,364.227 0 1,0 -728.454,0z">
<aapt:attr name="android:strokeColor">
<gradient
android:type="linear"
android:startX="751.337"
android:startY="751.338"
android:endX="0"
android:endY="0">
<item android:offset="0" android:color="#FF000000" />
<item android:offset="1" android:color="#FF666666" />
</gradient>
</aapt:attr>
</path>
<!-- White Archipelago pixel grid -->
<path
android:fillColor="#FFFFFF"
android:pathData="M253.805,278.37V222.28H309.853V278.37H253.805ZM315.797,278.37V222.28H372.694V278.37H315.797ZM378.639,278.37V222.28H435.536V278.37H378.639ZM441.481,278.37V222.28H497.529V278.37H441.481ZM441.481,341.259V284.319H497.529V341.259H441.481ZM503.473,341.259V284.319H560.37V341.259H503.473ZM190.963,404.148V347.208H247.86V404.148H190.963ZM253.805,404.148V347.208H309.853V404.148H253.805ZM315.797,404.148V347.208H372.694V404.148H315.797ZM378.639,404.148V347.208H435.536V404.148H378.639ZM441.481,404.148V347.208H497.529V404.148H441.481ZM503.473,404.148V347.208H560.37V404.148H503.473ZM190.963,466.187V410.097H247.86V466.187H190.963ZM253.805,466.187V410.097H309.853V466.187H253.805ZM441.481,466.187V410.097H497.529V466.187H441.481ZM503.473,466.187V410.097H560.37V466.187H503.473ZM253.805,529.076V472.136H309.853V529.076H253.805ZM315.797,529.076V472.136H372.694V529.076H315.797ZM378.639,529.076V472.136H435.536V529.076H378.639ZM441.481,529.076V472.136H497.529V529.076H441.481Z" />
</vector>

View File

@ -0,0 +1,12 @@
<vector xmlns:android="http://schemas.android.com/apk/res/android"
android:width="24dp"
android:height="24dp"
android:viewportWidth="24"
android:viewportHeight="24">
<path
android:pathData="M15,19l-7,-7 7,-7"
android:strokeColor="#FFFFFF"
android:strokeWidth="2"
android:strokeLineCap="round"
android:strokeLineJoin="round" />
</vector>

View File

@ -0,0 +1,12 @@
<vector xmlns:android="http://schemas.android.com/apk/res/android"
android:width="24dp"
android:height="24dp"
android:viewportWidth="24"
android:viewportHeight="24">
<path
android:pathData="M6,18L18,6M6,6l12,12"
android:strokeColor="#FFFFFF"
android:strokeWidth="2"
android:strokeLineCap="round"
android:strokeLineJoin="round" />
</vector>

View File

@ -0,0 +1,12 @@
<vector xmlns:android="http://schemas.android.com/apk/res/android"
android:width="24dp"
android:height="24dp"
android:viewportWidth="24"
android:viewportHeight="24">
<path
android:pathData="M9,5l7,7 -7,7"
android:strokeColor="#FFFFFF"
android:strokeWidth="2"
android:strokeLineCap="round"
android:strokeLineJoin="round" />
</vector>

View File

@ -0,0 +1,12 @@
<vector xmlns:android="http://schemas.android.com/apk/res/android"
android:width="24dp"
android:height="24dp"
android:viewportWidth="24"
android:viewportHeight="24">
<path
android:pathData="M10,6H6a2,2 0,0 0,-2 2v10a2,2 0,0 0,2 2h10a2,2 0,0 0,2 -2v-4M14,4h6m0,0v6m0,-6L10,14"
android:strokeColor="#FFFFFF"
android:strokeWidth="2"
android:strokeLineCap="round"
android:strokeLineJoin="round" />
</vector>

View File

@ -0,0 +1,12 @@
<vector xmlns:android="http://schemas.android.com/apk/res/android"
android:width="24dp"
android:height="24dp"
android:viewportWidth="24"
android:viewportHeight="24">
<path
android:pathData="M4,4v6h6M20,20v-6h-6M5.64,15.36A8,8 0,0 0,18.36 18M18.36,8.64A8,8 0,0 0,5.64 6"
android:strokeColor="#FFFFFF"
android:strokeWidth="2"
android:strokeLineCap="round"
android:strokeLineJoin="round" />
</vector>

View File

@ -21,4 +21,15 @@
<string name="retry">Retry</string>
<string name="remote_input">Remote Control</string>
<string name="remote_input_hint">Use your phone as a keyboard and mouse for the kiosk</string>
<string name="close">Close</string>
<string name="open_in_browser">Open in browser</string>
<string name="back">Back</string>
<string name="forward">Forward</string>
<string name="refresh">Refresh</string>
<string name="server_name_label">Server Name (optional)</string>
<string name="server_name_placeholder">My Archipelago</string>
<string name="edit_server">Edit</string>
<string name="edit_server_title">Edit Server</string>
<string name="save_changes">Save Changes</string>
<string name="cancel">Cancel</string>
</resources>

10
Android/logo.svg Normal file
View File

@ -0,0 +1,10 @@
<svg width="752" height="752" viewBox="0 0 752 752" fill="none" xmlns="http://www.w3.org/2000/svg">
<circle cx="375.668" cy="375.669" r="364.227" fill="#0A0A0A" stroke="url(#paint0_linear_877_1990)" stroke-width="22.8834"/>
<path d="M253.805 278.37V222.28H309.853V278.37H253.805ZM315.797 278.37V222.28H372.694V278.37H315.797ZM378.639 278.37V222.28H435.536V278.37H378.639ZM441.481 278.37V222.28H497.529V278.37H441.481ZM441.481 341.259V284.319H497.529V341.259H441.481ZM503.473 341.259V284.319H560.37V341.259H503.473ZM190.963 404.148V347.208H247.86V404.148H190.963ZM253.805 404.148V347.208H309.853V404.148H253.805ZM315.797 404.148V347.208H372.694V404.148H315.797ZM378.639 404.148V347.208H435.536V404.148H378.639ZM441.481 404.148V347.208H497.529V404.148H441.481ZM503.473 404.148V347.208H560.37V404.148H503.473ZM190.963 466.187V410.097H247.86V466.187H190.963ZM253.805 466.187V410.097H309.853V466.187H253.805ZM441.481 466.187V410.097H497.529V466.187H441.481ZM503.473 466.187V410.097H560.37V466.187H503.473ZM253.805 529.076V472.136H309.853V529.076H253.805ZM315.797 529.076V472.136H372.694V529.076H315.797ZM378.639 529.076V472.136H435.536V529.076H378.639ZM441.481 529.076V472.136H497.529V529.076H441.481Z" fill="white"/>
<defs>
<linearGradient id="paint0_linear_877_1990" x1="751.337" y1="751.338" x2="0" y2="0.000976562" gradientUnits="userSpaceOnUse">
<stop/>
<stop offset="1" stop-color="#666666"/>
</linearGradient>
</defs>
</svg>

After

Width:  |  Height:  |  Size: 1.4 KiB

41
Android/ship-companion.sh Executable file
View File

@ -0,0 +1,41 @@
#!/usr/bin/env bash
#
# Build the Android companion app and publish it as the served download
# (neode-ui/public/packages/archipelago-companion.apk — a plain APK a phone can
# install straight from the link), then commit + push.
#
# Use this INSTEAD of `git push` when shipping the companion app, so the
# downloadable APK on the node always matches what's on main.
#
# ./Android/ship-companion.sh
#
# The actual build/sign/verify/stage is done by scripts/publish-companion-apk.sh
# (single source of truth, shared with the pre-push hook). It does a CLEAN build,
# forces v1+v2+v3 signing, and ABORTS if any signature scheme is missing — so a
# broken or v2-only APK can never be shipped.
set -euo pipefail
ROOT="$(cd "$(dirname "${BASH_SOURCE[0]}")/.." && pwd)"
cd "$ROOT"
export JAVA_HOME="${JAVA_HOME:-/opt/homebrew/opt/openjdk@17}"
export ANDROID_HOME="${ANDROID_HOME:-$HOME/Library/Android/sdk}"
DEST="neode-ui/public/packages/archipelago-companion.apk"
echo "==> Building + signing + verifying companion APK"
bash scripts/publish-companion-apk.sh
[ -f "$DEST" ] || { echo "ERROR: served APK not found at $DEST" >&2; exit 1; }
if git diff --cached --quiet -- "$DEST"; then
echo "==> Nothing to commit (APK unchanged)"
else
git commit -q -m "chore(android): update companion apk download"
echo "==> Committed"
fi
echo "==> Pushing $(git branch --show-current)"
# SHIP_COMPANION lets the pre-push guard know the APK was just refreshed.
SHIP_COMPANION=1 git push origin "$(git branch --show-current)"
echo "==> Done — companion APK published and pushed."

View File

@ -1,5 +1,34 @@
# Changelog
## v1.8.00-alpha (2026-06-18)
Polishes the mesh AI assistant and Fedimint, on top of all the v1.7.99 features (kept listed below so you can still see what's new).
- The off-grid mesh radio no longer posts cryptic identity codes to the shared public channel. Your node was announcing a line starting with "ARCHY:" to the public channel about once a minute, which everyone else on that channel saw as spam; that broadcast has been removed.
- You can now use your node's AI assistant straight from a normal chat. Send "!ai <your question>" in a direct message to an AI-enabled node and the answer comes right back in the same conversation — whether your message travelled over the internet or the LoRa radio. Before, the reply could be sent on the wrong path and never arrive.
- The Mesh AI Assistant panel is easier to set up: pick the Claude model from a dropdown (Haiku, Sonnet, or Opus) instead of typing it, and add specific contacts to an "always allow" list so chosen people can use "!ai" even when the assistant is set to trusted-nodes-only.
- Fedimint federations show up in Wallet Settings again. The Fedimint client app wasn't starting because of a configuration error, so the federation your node auto-joins never appeared; the client is fixed and runs again.
- In Settings, "App Updates" and "App Registry" now sit directly under your Account section for quicker access.
- In Mesh chat, scrolling the conversation no longer also scrolls the contact list behind it.
- Mesh direct messages are now private and end-to-end encrypted to the recipient — they're sent as real radio DMs instead of being broadcast on the public channel, so other people on the mesh no longer see them, and the answer arrives intact (even on standard meshcore phone apps).
- You can now message standard meshcore apps (like the phone companion) and they can message you — text shows up readable on both sides, and your node's AI answers come back as a private reply rather than on the public channel.
- New contacts you hear on the radio are added automatically, so people show up in your Peers list without any extra steps.
- "Clear All" now actually removes contacts (rather than hiding them forever); a contact comes back on its own the next time it's in range. Each contact also shows a reachability dot so you can see who's currently reachable.
- The Peers list has a search box (with a clear button) to quickly filter your contacts by name, DID, npub, or key.
All the v1.7.99-alpha features are included as well:
- Your node can now hold Fedimint ecash as well as Cashu, with tabbed Wallet Settings for each and both balances shown side by side on the home wallet card.
- You can buy files shared by another node right from their cloud, paying from this node's ecash, your Lightning wallet, on-chain, or by scanning a Lightning QR with any outside wallet.
- Your node can act as an AI assistant on the off-grid mesh: peers ask by starting a message with "!ai" and get an answer back over the radio, with a panel to turn it on or off.
- You can view your node's 24-word recovery phrase any time from Settings, behind a password (and 2FA) confirmation and a tap-to-show blur.
- Setting up a brand-new node is smoother: it waits and retries quietly instead of flashing errors, and shows a gentle "securing your private connection…" status that turns to "ready" on its own.
- The NetBird VPN app now logs in (it's served over HTTPS and opens in a browser tab).
- Phone remote-control of a node's screen now supports two-finger scrolling inside apps, and external-browser apps open on your phone.
- You can choose whether your node shares Bitcoin block headers over the mesh, and your choices are remembered.
- Version numbers display cleanly everywhere (no more doubled "v"), and "Back" buttons look and behave consistently across desktop and mobile.
- For advanced testing, Settings includes an optional update & app source choice between the usual trusted origin and an experimental peer-to-peer (DHT swarm) mode, with the trusted origin remaining the default.
## v1.7.99-alpha (2026-06-17)
- Your node can now hold Fedimint ecash as well as Cashu. Wallet Settings now has tabbed sections for each: keep your list of trusted Cashu mints, or paste a Fedimint invite code to join a federation, and the home wallet card shows both your Cashu and Fedimint balances side by side. A new "Fedimint Client" app in the catalog powers the federation side.

57
CLAUDE.md Normal file
View File

@ -0,0 +1,57 @@
# Archipelago — agent guide
## ✅ Single-node production gate is GREEN (2026-06-23)
`tests/lifecycle/run-gate.sh` is **5/5 on .228, 0 failures** — the single-node exit
criterion is met and the priority banner is demoted. Next exit-criteria: the
**multinode pass** (`docs/multinode-testing-plan.md`) and workstreams B/C/D.
**Read `docs/PRODUCTION-MASTER-PLAN.md` first** — it is still the authoritative plan
for the north star: a world-class, **developer-ready app platform** where every app
is manifest-driven, manifests ship via the **signed registry** (not OTA disk files),
and **third-party developers publish apps via an external/decentralized registry**
all rootless, secure, robust, and 100%-uptime-capable. It no longer overrides all
ad-hoc direction now that the gate is green, but it remains the source of truth for
sequencing the remaining workstreams.
Detailed sub-plans (all linked from the master):
- App platform / packaging phases + security model → `docs/APP-PACKAGING-MIGRATION-PLAN.md`
- Registry-distributed manifests (in progress) → `docs/registry-manifest-design.md`
- External/decentralized marketplace for devs → `docs/marketplace-protocol.md`
- Current per-app state → `docs/app-registry-status-2026-06-21.md`
- Production test gate (exit criterion) → `tests/lifecycle/TESTING.md`
## Invariants (never violate)
- **Rootless Podman only.** No rootful, no Docker-socket mounts, no privileged
containers unless explicitly approved.
- **No per-app Rust installers / no OS-level reliance.** Apps are declarative;
the orchestrator owns the lifecycle. `install_immich_stack` (hardcoded
`podman run` + `sudo chown`) is the anti-pattern being deleted, not a template.
- **Secrets are manifest-declared** (`generated_secrets`, materialised by
`container::secrets`, 0600/rootless) — never hardcoded, per-app, or logged.
- **Migrations never destroy data** — preserve `/var/lib/archipelago/<app>`,
secrets, credentials, ports, and adoption container names; keep a rollback path.
- **Verify on the real node .228 before any tag.** (Fleet-wide multinode
verification is a separate plan: `docs/multinode-testing-plan.md`.)
## Build / verify
- Rust workspace root is `core/` (no Cargo.toml at repo root). `cargo` from `core/`.
- If a `cargo test`/build hits `rust-lld: undefined hidden symbol`, it's
incremental-cache corruption — rebuild with `CARGO_INCREMENTAL=0`.
- Frontend: `neode-ui/``npm run build` outputs to `web/dist/neode-ui/`.
Grep the built bundle for new strings before shipping (build can silently no-op).
- App manifests load from disk on nodes at `/opt/archipelago/apps/*/manifest.yml`
(today); the goal is to distribute them via the signed catalog instead.
## Production test gate (definition of done)
`tests/lifecycle/run-gate.sh` green across install / UI / stop / start / restart /
reinstall / reboot-survive / archipelago-restart-survive / uninstall — **5× on
.228** (`ARCHY_ITERATIONS=5`). **Run the gate ON the node** (it uses local podman/systemctl/bitcoin
probes), not via RPC from another host. **✅ GREEN 2026-06-23 (5/5, 0 not-ok)** — keep it
green (re-run after orchestrator/lifecycle changes); regressions are top priority again.
**Multinode testing (.198 + the rest of the fleet) is a SEPARATE plan** —
`docs/multinode-testing-plan.md` — not part of this single-node gate criterion, and is
the next exit criterion now that single-node is green.

View File

@ -73,7 +73,7 @@
"author": "Mempool",
"category": "money",
"tier": "core",
"dockerImage": "146.59.87.168:3000/lfg2025/mempool-frontend:v3.0.0",
"dockerImage": "146.59.87.168:3000/lfg2025/mempool-frontend:v3.0.1",
"repoUrl": "https://github.com/mempool/mempool",
"requires": [
"bitcoin-knots",
@ -281,7 +281,7 @@
},
{
"id": "fedimint",
"title": "Fedimint",
"title": "Fedimint Guardian",
"version": "0.10.0",
"description": "Federated Bitcoin minting service with built-in Guardian UI. Privacy-preserving Bitcoin custody.",
"icon": "/assets/img/app-icons/fedimint.png",

View File

@ -1,12 +1,12 @@
app:
id: archy-mempool-web
name: Mempool Web
version: 3.0.0
version: 3.0.1
description: Frontend web UI for mempool explorer.
container_name: mempool
container:
image: git.tx1138.com/lfg2025/mempool-frontend:v3.0.0
image: 146.59.87.168:3000/lfg2025/mempool-frontend:v3.0.1
pull_policy: if-not-present
network: archy-net

View File

@ -16,6 +16,11 @@ app:
# fmcd and retries on join failure (fmcd needs >=1 federation to boot), so an
# unreachable default never crash-loops. All config comes from FMCD_* env
# below. Nodes can join more federations via wallet.fedimint-join.
# Auto-generated on first install (random hex, 0600, rootless-owned) so the
# app needs no host provisioning. The wallet bridge reads the same file.
generated_secrets:
- name: fmcd-password
kind: hex16
secret_env:
- key: FMCD_PASSWORD
secret_file: fmcd-password
@ -33,12 +38,22 @@ app:
disk_limit: 2Gi
security:
capabilities: []
# fmcd's `fmcd-run` launcher chowns its /data (existing federation DB) on
# every start. With the default `cap_drop: ALL` and no caps added back, that
# chown fails and fmcd dies "Operation not permitted (os error 1)" — but ONLY
# once /data holds a joined federation (a fresh/empty dir needs no chown, so
# it appeared to work). Restore the standard container capability set so the
# startup chown succeeds (#7). Verified by bisection on .116: these caps make
# fmcd boot + serve /v2/*; DAC_OVERRIDE or SETUID/SETGID alone do NOT.
capabilities: ["CHOWN", "DAC_OVERRIDE", "FOWNER", "SETUID", "SETGID"]
readonly_root: true
# NOT isolated: fmcd needs outbound UDP + Mainline DHT (port 6881) + iroh
# relays to reach iroh-transport federations. Lock down once the default
# federation's reachability model is finalized.
network_policy: open
# relays to reach iroh-transport federations. `bridge` gives NAT'd outbound
# (UDP/DHT/iroh hole-punch all work) plus the published 8178→8080 port the
# wallet bridge targets. ("open" is not a valid policy — it made the loader
# skip this whole manifest, so fmcd never ran and federations never joined.)
# Lock down once the default federation's reachability model is finalized.
network_policy: bridge
ports:
# fmcd REST bound to 8080 in-container; 8080 collides with LND REST on the
@ -66,10 +81,15 @@ app:
# join reliability from a real second node before relying on auto-bundle.
- FMCD_INVITE_CODE=fed11qgqyj3mfwfhksw309uuxywtxxfjrjc35xuexverpxdsnxcnrxucxvenzveskgc3kvvun2c34xp3k2ep38yunzdpexcekxe3hvd3rvvmx8pnrvdenx5mnzvtzqqqjqt0t6pc3s5z0ynqjw9s4njf6svwgu59kweawc0vvrddcjeemw6yyn4pcdp
# fmcd serves only authenticated /v2/* routes — there is no unauthenticated
# /health endpoint, so an http probe to /health 404s forever and pins the
# container in "(starting)". fmcd's own image also ships neither curl nor wget.
# Use a TCP probe: the Quadlet renderer skips it (no HealthCmd emitted) and the
# host-side lifecycle layer verifies reachability, so the container reports
# "running" instead of a perpetual false-negative "(starting)".
health_check:
type: http
endpoint: http://localhost:8080
path: /health
type: tcp
endpoint: localhost:8080
interval: 30s
timeout: 5s
retries: 3

View File

@ -16,6 +16,14 @@ app:
else
exec gatewayd --data-dir /data --listen 0.0.0.0:8176 --bcrypt-password-hash "$FEDI_HASH" --network bitcoin --bitcoind-url http://host.archipelago:8332 --bitcoind-username "$FM_BITCOIND_USERNAME" --bitcoind-password "$FM_BITCOIND_PASSWORD" ldk --ldk-lightning-port 9737 --ldk-alias archipelago-gateway;
fi
# The gateway's admin API is gated by a bcrypt password hash. Generate it on
# first install (random password + its bcrypt hash, both 0600 rootless-owned)
# so the app installs from its manifest alone — `fedimint-gateway-hash` holds
# the hash passed to gatewayd, `fedimint-gateway-hash.pw` the plaintext for
# any client that must authenticate. Self-heals a wrongly root-owned hash.
generated_secrets:
- name: fedimint-gateway-hash
kind: bcrypt
secret_env:
- key: FM_BITCOIND_PASSWORD
secret_file: bitcoin-rpc-password

View File

@ -1,6 +1,6 @@
app:
id: fedimint
name: Fedimint
name: Fedimint Guardian
version: 0.10.0
description: Federated Bitcoin minting service with built-in Guardian UI. Privacy-preserving Bitcoin custody.

View File

@ -0,0 +1,58 @@
app:
id: immich-postgres
name: Immich Postgres
version: "14-vectorchord0.4.3-pgvectors0.2.0"
description: Postgres (pgvecto.rs / vectorchord) backend for Immich.
# Container named immich_postgres (underscore) to match the runtime's existing
# per-app references (lifecycle/health/crash-recovery/config) and serve as the
# server's DB_HOSTNAME alias. Top-level key → serde(flatten) → extensions →
# compute_container_name.
container_name: immich_postgres
container:
image: 146.59.87.168:3000/lfg2025/immich-postgres:14-vectorchord0.4.3-pgvectors0.2.0
pull_policy: if-not-present
network: archy-net
# postgres drops to its own uid (container 999 → host 100998 under rootless),
# so the data dir must be owned by that mapped uid — mirrors archy-btcpay-db.
# Verified on .228: the live immich-db is owned 100998. Without this a FRESH
# install's dir would be service-user-owned and postgres would EACCES.
data_uid: "100998:100998"
generated_secrets:
- name: immich-db-password
kind: hex32
secret_env:
- key: POSTGRES_PASSWORD
secret_file: immich-db-password
dependencies:
- storage: 40Gi
resources:
memory_limit: 2Gi
disk_limit: 40Gi
security:
capabilities: [CHOWN, DAC_OVERRIDE, FOWNER, SETGID, SETUID]
readonly_root: false
network_policy: isolated
ports: []
volumes:
- type: bind
source: /var/lib/archipelago/immich-db
target: /var/lib/postgresql/data
options: [rw]
environment:
- POSTGRES_USER=postgres
- POSTGRES_DB=immich
health_check:
type: tcp
endpoint: localhost:5432
interval: 30s
timeout: 5s
retries: 3

View File

@ -0,0 +1,37 @@
app:
id: immich-redis
name: Immich Redis
version: "7-alpine"
description: Valkey (Redis-compatible) cache for Immich.
# Container named immich_redis (underscore) to match runtime per-app references
# and serve as the server's REDIS_HOSTNAME alias on archy-net.
container_name: immich_redis
container:
image: 146.59.87.168:3000/lfg2025/valkey:7-alpine
pull_policy: if-not-present
network: archy-net
dependencies: []
resources:
memory_limit: 128Mi
security:
capabilities: [SETGID, SETUID]
readonly_root: false
network_policy: isolated
ports: []
volumes: []
environment: []
health_check:
type: tcp
endpoint: localhost:6379
interval: 30s
timeout: 5s
retries: 3

74
apps/immich/manifest.yml Normal file
View File

@ -0,0 +1,74 @@
app:
id: immich
name: Immich
version: "2.7.4"
description: Self-hosted photo and video backup with mobile apps and search.
# app_id "immich" = the user-facing launcher (matches the catalog entry's title
# + icon). The container is named "immich_server" so it matches the runtime's
# existing per-app container references (lifecycle/health/crash-recovery/ports);
# `container_name` is a top-level app key (captured by serde(flatten) into
# extensions, read by compute_container_name). It reaches its backends by their
# underscore aliases on archy-net (DB_HOSTNAME / REDIS_HOSTNAME below).
container_name: immich_server
container:
image: 146.59.87.168:3000/lfg2025/immich-server:release
pull_policy: if-not-present
network: archy-net
secret_env:
- key: DB_PASSWORD
secret_file: immich-db-password
dependencies:
- app_id: immich-postgres
- app_id: immich-redis
- storage: 200Gi
resources:
memory_limit: 2Gi
disk_limit: 200Gi
security:
capabilities: []
readonly_root: false
network_policy: isolated
ports:
- host: 2283
container: 2283
protocol: tcp
volumes:
- type: bind
source: /var/lib/archipelago/immich
target: /usr/src/app/upload
options: [rw]
environment:
- DB_HOSTNAME=immich_postgres
- DB_USERNAME=postgres
- DB_DATABASE_NAME=immich
- REDIS_HOSTNAME=immich_redis
- UPLOAD_LOCATION=/usr/src/app/upload
health_check:
type: http
endpoint: http://localhost:2283
path: /api/server/ping
interval: 30s
timeout: 5s
retries: 20
interfaces:
main:
name: Web UI
description: Immich photo library
type: ui
port: 2283
protocol: http
path: /
metadata:
launch:
open_in_new_tab: true

View File

@ -0,0 +1,77 @@
app:
id: indeedhub-api
name: IndeedHub API
version: "1.0.0"
description: IndeedHub backend API (Nostr auth, media, payments).
category: community
# Hyphen name matches runtime references + the live container (adoption);
# alias `api` is the short hostname the frontend nginx proxies to
# (http://api:4000). Reaches its backends by their short aliases
# (postgres/redis/minio) on indeedhub-net — unchanged from the legacy installer.
container_name: indeedhub-api
container:
image: 146.59.87.168:3000/lfg2025/indeedhub-api:1.0.0
pull_policy: if-not-present
network: indeedhub-net
network_aliases: [api]
# The JWT signing secret is owned here (no backend container owns it); the
# db + minio passwords are owned by indeedhub-postgres / indeedhub-minio and
# only consumed here. ensure_generated_secrets no-ops when a file already
# exists, so live values on .228 are preserved (postgres pw is fixed at
# PGDATA init — regenerating would lock the API out).
generated_secrets:
- name: indeedhub-jwt
kind: hex32
secret_env:
- key: DATABASE_PASSWORD
secret_file: indeedhub-db-password
- key: AWS_SECRET_KEY
secret_file: indeedhub-minio-password
- key: NOSTR_JWT_SECRET
secret_file: indeedhub-jwt
dependencies:
- app_id: indeedhub-postgres
- app_id: indeedhub-redis
- app_id: indeedhub-minio
resources:
memory_limit: 2Gi
security:
capabilities: []
readonly_root: false
network_policy: isolated
ports: []
volumes: []
environment:
- PORT=4000
- DATABASE_HOST=postgres
- DATABASE_PORT=5432
- DATABASE_USER=indeedhub
- DATABASE_NAME=indeedhub
- QUEUE_HOST=redis
- QUEUE_PORT=6379
- S3_ENDPOINT=http://minio:9000
- AWS_REGION=us-east-1
- AWS_ACCESS_KEY=indeeadmin
- S3_PUBLIC_BUCKET_NAME=indeedhub-public
- S3_PRIVATE_BUCKET_NAME=indeedhub-private
- S3_PUBLIC_BUCKET_URL=/storage
- NOSTR_JWT_EXPIRES_IN=7d
# Fixed across the fleet (envelope-encryption master key baked by the legacy
# installer); not node-specific, so a plain env literal, not a secret.
- AES_MASTER_SECRET=0123456789abcdef0123456789abcdef
- ENVIRONMENT=production
health_check:
type: tcp
endpoint: localhost:4000
interval: 30s
timeout: 5s
retries: 10

View File

@ -0,0 +1,51 @@
app:
id: indeedhub-ffmpeg
name: IndeedHub FFmpeg Worker
version: "1.0.0"
description: IndeedHub background media transcoding worker.
category: community
# Hyphen name matches runtime references + the live container (adoption). No
# network_alias: nothing connects TO the worker — it only dials out to
# postgres/redis/minio (resolved by their aliases on indeedhub-net).
container_name: indeedhub-ffmpeg
container:
image: 146.59.87.168:3000/lfg2025/indeedhub-ffmpeg:1.0.0
pull_policy: if-not-present
network: indeedhub-net
secret_env:
- key: DATABASE_PASSWORD
secret_file: indeedhub-db-password
- key: AWS_SECRET_KEY
secret_file: indeedhub-minio-password
dependencies:
- app_id: indeedhub-api
resources:
memory_limit: 4Gi
security:
capabilities: []
readonly_root: false
network_policy: isolated
ports: []
volumes: []
environment:
- DATABASE_HOST=postgres
- DATABASE_PORT=5432
- DATABASE_USER=indeedhub
- DATABASE_NAME=indeedhub
- QUEUE_HOST=redis
- QUEUE_PORT=6379
- S3_ENDPOINT=http://minio:9000
- AWS_REGION=us-east-1
- AWS_ACCESS_KEY=indeeadmin
- S3_PUBLIC_BUCKET_NAME=indeedhub-public
- S3_PRIVATE_BUCKET_NAME=indeedhub-private
- ENVIRONMENT=production
- AES_MASTER_SECRET=0123456789abcdef0123456789abcdef

View File

@ -0,0 +1,60 @@
app:
id: indeedhub-minio
name: IndeedHub MinIO
version: "RELEASE.2024-11-07T00-52-20Z"
description: MinIO S3-compatible object storage for IndeedHub media.
category: community
# Hyphen name matches runtime references + the live container (adoption);
# alias `minio` is the short hostname the api/ffmpeg use (S3_ENDPOINT=
# http://minio:9000) AND the frontend nginx proxies to (http://minio:9000).
container_name: indeedhub-minio
container:
image: 146.59.87.168:3000/lfg2025/minio:RELEASE.2024-11-07T00-52-20Z
pull_policy: if-not-present
network: indeedhub-net
network_aliases: [minio]
# `server /data` — the minio entrypoint args from the legacy installer.
custom_args: [server, /data]
generated_secrets:
- name: indeedhub-minio-password
kind: hex32
secret_env:
- key: MINIO_ROOT_PASSWORD
secret_file: indeedhub-minio-password
dependencies:
- storage: 50Gi
resources:
memory_limit: 1Gi
disk_limit: 50Gi
security:
capabilities: []
readonly_root: false
network_policy: isolated
ports: []
# Named volume matches the live indeedhub-minio-data volume on .228.
volumes:
- type: volume
source: indeedhub-minio-data
target: /data
options: [rw]
# MINIO_ROOT_USER "indeeadmin" is the fixed admin identity baked by the legacy
# installer (api/ffmpeg use it as AWS_ACCESS_KEY); the password is the
# generated secret above. Not secret, so it stays a plain env value.
environment:
- MINIO_ROOT_USER=indeeadmin
health_check:
type: http
endpoint: http://localhost:9000
path: /minio/health/live
interval: 30s
timeout: 5s
retries: 5

View File

@ -0,0 +1,59 @@
app:
id: indeedhub-postgres
name: IndeedHub Postgres
version: "16.13-alpine"
description: Postgres database backend for IndeedHub.
category: community
# Container named indeedhub-postgres (hyphen) to match the runtime's existing
# per-app references (health_monitor tiers/deps, crash_recovery) and the live
# .228 install, so the orchestrator ADOPTS the running container instead of
# recreating it. `network_aliases: [postgres]` keeps the short hostname the
# api/ffmpeg/relay reach by (DATABASE_HOST=postgres) resolvable on
# indeedhub-net, reproducing the legacy `--network-alias postgres`.
container_name: indeedhub-postgres
container:
image: 146.59.87.168:3000/lfg2025/postgres:16.13-alpine
pull_policy: if-not-present
network: indeedhub-net
network_aliases: [postgres]
generated_secrets:
- name: indeedhub-db-password
kind: hex32
secret_env:
- key: POSTGRES_PASSWORD
secret_file: indeedhub-db-password
dependencies:
- storage: 10Gi
resources:
memory_limit: 1Gi
disk_limit: 10Gi
security:
capabilities: [CHOWN, DAC_OVERRIDE, FOWNER, SETGID, SETUID]
readonly_root: false
network_policy: isolated
ports: []
# Named podman volume (matches the live indeedhub-postgres-data volume on .228);
# preserves all existing database content across the migration.
volumes:
- type: volume
source: indeedhub-postgres-data
target: /var/lib/postgresql/data
options: [rw]
environment:
- POSTGRES_USER=indeedhub
- POSTGRES_DB=indeedhub
health_check:
type: tcp
endpoint: localhost:5432
interval: 30s
timeout: 5s
retries: 3

View File

@ -0,0 +1,45 @@
app:
id: indeedhub-redis
name: IndeedHub Redis
version: "7.4.8-alpine"
description: Redis queue/cache backend for IndeedHub.
category: community
# Hyphen name matches runtime references + the live container (adoption);
# alias `redis` is the short hostname the api/ffmpeg reach (QUEUE_HOST=redis).
container_name: indeedhub-redis
container:
image: 146.59.87.168:3000/lfg2025/redis:7.4.8-alpine
pull_policy: if-not-present
network: indeedhub-net
network_aliases: [redis]
dependencies:
- storage: 1Gi
resources:
memory_limit: 256Mi
security:
capabilities: [SETGID, SETUID]
readonly_root: false
network_policy: isolated
ports: []
# Named volume matches the live indeedhub-redis-data volume on .228.
volumes:
- type: volume
source: indeedhub-redis-data
target: /data
options: [rw]
environment: []
health_check:
type: tcp
endpoint: localhost:6379
interval: 30s
timeout: 5s
retries: 3

View File

@ -0,0 +1,47 @@
app:
id: indeedhub-relay
name: IndeedHub Nostr Relay
version: "0.9.0"
description: nostr-rs-relay backing IndeedHub's Nostr identity + comments.
category: community
# Hyphen name matches runtime references + the live container (adoption);
# alias `relay` is the short hostname the frontend nginx proxies to
# (http://relay:8080 for the /relay websocket).
container_name: indeedhub-relay
container:
image: 146.59.87.168:3000/lfg2025/nostr-rs-relay:0.9.0
pull_policy: if-not-present
network: indeedhub-net
network_aliases: [relay]
dependencies:
- storage: 2Gi
resources:
memory_limit: 256Mi
disk_limit: 2Gi
security:
capabilities: []
readonly_root: false
network_policy: isolated
ports: []
# Named volume matches the live indeedhub-relay-data volume on .228.
volumes:
- type: volume
source: indeedhub-relay-data
target: /usr/src/app/db
options: [rw]
environment: []
health_check:
type: tcp
endpoint: localhost:8080
interval: 30s
timeout: 5s
retries: 3

View File

@ -1,63 +1,84 @@
app:
id: indeedhub
name: IndeeHub
version: 1.0.0
version: "1.0.0"
description: Bitcoin documentary streaming platform featuring God Bless Bitcoin and other educational content about Bitcoin, sovereignty, and decentralized technology. Sign in with your Nostr identity.
category: community
# The user-facing launcher (app_id "indeedhub"). Container is named "indeedhub"
# (matches the runtime's per-app references + the live container, so the
# orchestrator adopts it). Its nginx (listen 7777) proxies to the backends by
# their short aliases on indeedhub-net: api:4000, minio:9000, relay:8080.
container_name: indeedhub
container:
image: 146.59.87.168:3000/lfg2025/indeedhub:1.0.0
pull_policy: always # Pull from registry; falls back to local build
pull_policy: if-not-present
network: indeedhub-net
dependencies:
- app_id: indeedhub-api
- storage: 1Gi
resources:
cpu_limit: 2
memory_limit: 512Mi
disk_limit: 1Gi
security:
capabilities: []
readonly_root: true
no_new_privileges: true
user: 1001
seccomp_profile: default
network_policy: bridge
apparmor_profile: default
# nginx master runs as root and drops workers to the nginx user (uid/gid
# 101) — needs SET{UID,GID}; CHOWN + DAC_OVERRIDE let it own + write the
# proxy cache under the tmpfs /var/cache/nginx. The orchestrator does
# --cap-drop=ALL, so (unlike the legacy `podman run` default caps) these
# must be declared or nginx workers die with "setgid(101) failed".
capabilities: [CHOWN, DAC_OVERRIDE, SETGID, SETUID]
readonly_root: false
network_policy: isolated
ports:
- host: 7778
container: 7777
protocol: tcp # Web UI. Port 7777 on the host is reserved for Nostr relay.
protocol: tcp # Web UI. Port 7777 on the host is reserved for the Nostr relay.
# Writable scratch the baked nginx needs; matches the legacy installer's
# --tmpfs /run + /var/cache/nginx.
volumes:
- type: tmpfs
target: /tmp
options: [rw,noexec,nosuid,size=64m]
- type: tmpfs
target: /app/.next/cache
options: [rw,noexec,nosuid,size=128m]
- type: tmpfs
target: /run
options: [rw,nosuid,nodev,size=16m]
options: [rw, nosuid, nodev, size=16m]
- type: tmpfs
target: /var/cache/nginx
options: [rw,nosuid,nodev,size=32m]
options: [rw, nosuid, nodev, size=32m]
environment:
- NODE_ENV=production
- NEXT_TELEMETRY_DISABLED=1
environment: []
# Defensive + idempotent. The current indeedhub:1.0.0 image already bakes the
# iframe-friendly nginx (X-Frame-Options omitted, nostr-provider.js present +
# <script> injected), so these are mostly no-ops on that tag — but they keep
# the app iframe-loadable + the provider script fresh for any image build that
# predates the bake. copy_from_host pulls /opt/archipelago/web-ui/nostr-provider.js
# (kept current by frontend OTA releases). Replaces the legacy hardcoded
# patch_indeedhub_nostr_provider() Rust hook.
hooks:
post_install:
- exec: ["sed", "-i", "/X-Frame-Options/d", "/etc/nginx/conf.d/default.conf"]
- copy_from_host:
src: "web-ui/nostr-provider.js"
dest: "/usr/share/nginx/html/nostr-provider.js"
- exec: ["sh", "-c", "grep -q nostr-provider /etc/nginx/conf.d/default.conf || sed -i 's#</head>#<script src=\"/nostr-provider.js\"></script></head>#' /etc/nginx/conf.d/default.conf"]
- exec: ["nginx", "-s", "reload"]
# TCP liveness on the nginx port, NOT an http GET of /. nginx binds 7777 at
# startup (before workers), so this passes immediately and stays green under
# load. An http check of / runs the SPA + sub_filter and false-fails when the
# node is busy → the reconciler then treats the frontend as wedged and
# recreates it in a loop (observed churning the frontend on the loaded .198).
health_check:
type: http
endpoint: http://localhost:3000
path: /
type: tcp
endpoint: localhost:7777
interval: 30s
timeout: 10s
retries: 3
start_period: 40s
timeout: 5s
retries: 5
start_period: 30s
interfaces:
main:

View File

@ -5,7 +5,7 @@ app:
description: Bitcoin mempool and blockchain explorer. Real-time transaction and block visualization.
container:
image: 146.59.87.168:3000/lfg2025/mempool-frontend:v3.0.0
image: 146.59.87.168:3000/lfg2025/mempool-frontend:v3.0.1
image_signature: cosign://...
pull_policy: if-not-present

View File

@ -1,5 +0,0 @@
# Meshtastic - uses official image
FROM meshtastic/meshtastic:latest
# Default configuration is in the image
# No additional setup needed

View File

@ -1,69 +0,0 @@
app:
id: meshtastic
name: Meshtastic
version: 2-daily-alpine
description: Open-source mesh networking for LoRa radios. Create decentralized communication networks.
container:
image: docker.io/meshtastic/meshtasticd:daily-alpine
pull_policy: if-not-present
dependencies:
- storage: 1Gi
resources:
cpu_limit: 1
memory_limit: 512Mi
disk_limit: 1Gi
security:
capabilities: [NET_ADMIN, SYS_ADMIN] # Required for LoRa radio access
readonly_root: false # Needs write access for device management
no_new_privileges: true
user: 1000
seccomp_profile: default
network_policy: host # Requires host network for radio access
apparmor_profile: meshtastic
ports:
- host: 4403
container: 4403
protocol: tcp # Meshtastic TCP API
devices:
- /dev/ttyUSB0 # LoRa radio device (if connected)
volumes:
- type: bind
source: /var/lib/archipelago/meshtastic
target: /var/lib/meshtasticd
options: [rw]
files:
- path: /var/lib/archipelago/meshtastic/config.yaml
content: |
General:
MACAddress: AA:BB:CC:DD:EE:01
Webserver:
Port: 4403
environment:
- MESHTASTIC_PORT=/dev/ttyUSB0
- MESHTASTIC_SERIAL=true
health_check:
type: cmd
endpoint: test -f /var/lib/meshtasticd/config.yaml
interval: 30s
timeout: 30s
retries: 5
networking:
mesh_enabled: true
local_network_access: true
metadata:
icon: /assets/img/app-icons/meshcore.svg
category: networking
tier: recommended
repo: https://github.com/meshtastic/firmware

View File

@ -0,0 +1,77 @@
app:
id: netbird-dashboard
name: NetBird Dashboard
version: "2.38.0"
description: NetBird management dashboard (SPA). Internal stack member served through the netbird proxy.
category: networking
# Hyphen name matches runtime references + the live container (adoption).
# Alias `netbird-dashboard` is the short hostname the proxy's nginx proxies to.
container_name: netbird-dashboard
container:
image: docker.io/netbirdio/dashboard:v2.38.0
pull_policy: if-not-present
network: netbird-net
network_aliases: [netbird-dashboard]
# The dashboard SPA bakes its API/OIDC base URL from these at container
# start. They must point at the proxy's public HTTPS origin (8087) so the
# browser uses a secure context (window.crypto.subtle / OIDC PKCE, #15).
# {{HOST_IP}} is the node's primary host IP, resolved at apply time.
derived_env:
- key: NETBIRD_MGMT_API_ENDPOINT
template: "https://{{HOST_IP}}:8087"
- key: NETBIRD_MGMT_GRPC_API_ENDPOINT
template: "https://{{HOST_IP}}:8087"
- key: AUTH_AUTHORITY
template: "https://{{HOST_IP}}:8087/oauth2"
dependencies:
- app_id: netbird-server
resources:
memory_limit: 256Mi
security:
# cap-drop=ALL is applied by the orchestrator. The dashboard image runs
# nginx (master as root, drops workers) binding :80 — needs the worker-drop
# caps + NET_BIND_SERVICE for the privileged port.
capabilities: [CHOWN, DAC_OVERRIDE, SETGID, SETUID, NET_BIND_SERVICE]
readonly_root: false
network_policy: isolated
# Internal only — reached container-to-container by the proxy via netbird-net.
ports: []
volumes: []
environment:
- AUTH_AUDIENCE=netbird-dashboard
- AUTH_CLIENT_ID=netbird-dashboard
- AUTH_CLIENT_SECRET=
- USE_AUTH0=false
- AUTH_SUPPORTED_SCOPES=openid profile email groups
- AUTH_REDIRECT_URI=/nb-auth
- AUTH_SILENT_REDIRECT_URI=/nb-silent-auth
- NETBIRD_TOKEN_SOURCE=idToken
- NGINX_SSL_PORT=443
- LETSENCRYPT_DOMAIN=none
health_check:
type: tcp
endpoint: localhost:80
interval: 30s
timeout: 5s
retries: 5
start_period: 20s
metadata:
author: NetBird
icon: /assets/img/app-icons/netbird.svg
website: https://netbird.io
repo: https://github.com/netbirdio/dashboard
license: BSD-3-Clause
tags:
- networking
- vpn
- dashboard

View File

@ -0,0 +1,122 @@
app:
id: netbird-server
name: NetBird Server
version: "0.71.2"
description: NetBird combined management / signal / relay server with an embedded identity provider and STUN. Backend for the self-hosted NetBird mesh VPN.
category: networking
# Hyphen name matches the runtime references (crash_recovery / dependencies /
# config startup order) + the live container, so on an existing node the
# orchestrator ADOPTS the running server rather than recreating it (data +
# the sqlite store under /var/lib/netbird preserved). Alias `netbird-server`
# is the short hostname the proxy's nginx proxies/grpc-passes to.
container_name: netbird-server
container:
image: docker.io/netbirdio/netbird-server:0.71.2
pull_policy: if-not-present
network: netbird-net
network_aliases: [netbird-server]
# The relay authSecret and the sqlite store encryptionKey are base64 keys
# (the server base64-decodes them to recover raw bytes — hex would decode to
# the wrong value). Generated once and reused: ensure_generated_secrets
# no-ops when the file already exists, so a re-render of config.yaml on an
# adopted node keeps the same keys (regenerating would orphan the store).
generated_secrets:
- name: netbird-relay-auth-secret
kind: base64
- name: netbird-store-encryption-key
kind: base64
# Pass the rendered config explicitly, mirroring the legacy `--config` arg.
custom_args: ["--config", "/etc/netbird/config.yaml"]
dependencies:
- storage: 1Gi
resources:
memory_limit: 1Gi
security:
# cap-drop=ALL is applied by the orchestrator. The server binds :80
# (management/signal/relay HTTP + gRPC) inside the container — a privileged
# port — so it needs NET_BIND_SERVICE. STUN is 3478/udp (unprivileged).
capabilities: [NET_BIND_SERVICE]
readonly_root: false
network_policy: isolated
ports:
- host: 8086
container: 80
protocol: tcp # management API + embedded OIDC issuer (/oauth2)
- host: 3478
container: 3478
protocol: udp # STUN — must be UDP; tcp here breaks relay discovery
volumes:
- type: bind
source: /var/lib/archipelago/netbird/data
target: /var/lib/netbird
options: [rw]
# The rendered config.yaml, read-only. Re-rendered on every reconcile from
# host facts + the base64 secrets; idempotent (stable bytes → no restart).
- type: bind
source: /var/lib/archipelago/netbird/config.yaml
target: /etc/netbird/config.yaml
options: [ro]
environment: []
# The server's config. {{HOST_IP}} is the node's primary host IP (the proxy's
# public origin is https on 8087 — the dashboard needs a secure context for
# OIDC PKCE, issue #15). {{secret:...}} are read 0600 from the secrets dir.
files:
- path: /var/lib/archipelago/netbird/config.yaml
overwrite: true
content: |
server:
listenAddress: ":80"
exposedAddress: "https://{{HOST_IP}}:8087"
stunPorts:
- 3478
metricsPort: 9090
healthcheckAddress: ":9000"
logLevel: "info"
logFile: "console"
authSecret: "{{secret:netbird-relay-auth-secret}}"
dataDir: "/var/lib/netbird"
auth:
issuer: "https://{{HOST_IP}}:8087/oauth2"
localAuthDisabled: false
signKeyRefreshEnabled: false
dashboardRedirectURIs:
- "https://{{HOST_IP}}:8087/nb-auth"
- "https://{{HOST_IP}}:8087/nb-silent-auth"
dashboardPostLogoutRedirectURIs:
- "https://{{HOST_IP}}:8087/"
cliRedirectURIs:
- "http://localhost:53000/"
store:
engine: "sqlite"
encryptionKey: "{{secret:netbird-store-encryption-key}}"
# TCP liveness on the management port. Binds at startup, stays green; an http
# check of /oauth2 would false-fail while the issuer warms up.
health_check:
type: tcp
endpoint: localhost:80
interval: 30s
timeout: 5s
retries: 10
start_period: 30s
metadata:
author: NetBird
icon: /assets/img/app-icons/netbird.svg
website: https://netbird.io
repo: https://github.com/netbirdio/netbird
license: BSD-3-Clause
tags:
- networking
- vpn
- wireguard
- mesh

182
apps/netbird/manifest.yml Normal file
View File

@ -0,0 +1,182 @@
app:
id: netbird
name: NetBird
version: "2.38.0"
description: Self-hosted WireGuard mesh VPN control plane with dashboard, embedded identity provider, management API, signal, relay, and STUN. The user-facing entry point — a TLS proxy in front of the dashboard + server.
category: networking
# The user-facing launcher (app_id + container both "netbird", matching the
# runtime references + the live container so the orchestrator adopts it). This
# is the nginx that terminates TLS on 8087 and fans out to the dashboard +
# server by their short aliases on netbird-net.
container_name: netbird
container:
image: docker.io/library/nginx:1.27-alpine
pull_policy: if-not-present
network: netbird-net
# Self-signed TLS cert materialised before create — the dashboard needs a
# secure context (window.crypto.subtle / OIDC PKCE, issue #15), so the proxy
# serves HTTPS. Idempotent: kept as-is when crt+key already exist (a user
# accepts it once). SAN defaults to the host IP + 127.0.0.1 + localhost.
generated_certs:
- crt: /var/lib/archipelago/netbird/tls.crt
key: /var/lib/archipelago/netbird/tls.key
dependencies:
- app_id: netbird-server
- app_id: netbird-dashboard
- storage: 1Gi
resources:
memory_limit: 256Mi
security:
# cap-drop=ALL is applied by the orchestrator. nginx (master as root, drops
# workers) binds :443 — needs the worker-drop caps + NET_BIND_SERVICE.
capabilities: [CHOWN, DAC_OVERRIDE, SETGID, SETUID, NET_BIND_SERVICE]
readonly_root: false
network_policy: isolated
ports:
# 8087 publishes the TLS listener (container :443). HTTPS is required for the
# dashboard's secure context (issue #15).
- host: 8087
container: 443
protocol: tcp
volumes:
- type: bind
source: /var/lib/archipelago/netbird/nginx.conf
target: /etc/nginx/conf.d/default.conf
options: [ro]
- type: bind
source: /var/lib/archipelago/netbird/tls.crt
target: /etc/nginx/tls.crt
options: [ro]
- type: bind
source: /var/lib/archipelago/netbird/tls.key
target: /etc/nginx/tls.key
options: [ro]
environment: []
# The proxy config. {{NETWORK_GATEWAY}} is the netbird-net bridge gateway =
# Podman's aardvark DNS. nginx uses it as an explicit `resolver` with VARIABLE
# upstreams so it re-resolves container names per request — without it nginx
# pins a container IP at startup and 502s forever once that IP moves on a
# restart/reboot (issue #15, observed live on .198). Every #15 fix below
# (CORS $http_origin reflect, grpc pass, nb-auth/nb-silent-auth rewrite to
# index.html, /relay websocket) is preserved verbatim from the legacy config.
files:
- path: /var/lib/archipelago/netbird/nginx.conf
overwrite: true
content: |
server {
listen 443 ssl;
server_name _;
# netbird's dashboard needs a secure context (window.crypto.subtle for
# OIDC PKCE), so the proxy terminates TLS with a self-signed cert (#15).
ssl_certificate /etc/nginx/tls.crt;
ssl_certificate_key /etc/nginx/tls.key;
# Rootless Podman can hand a container a new IP across restarts/reboots.
# nginx resolves a literal upstream name ONCE at startup and caches it,
# so after the IP moves every request 502s with "host unreachable"
# (issue #15, observed live on .198: nginx pinned to a dead
# netbird-dashboard IP). Fix: point `resolver` at the netbird-net
# gateway (Podman's aardvark DNS) and use VARIABLE upstreams, which
# forces nginx to re-resolve the container names at request time.
resolver {{NETWORK_GATEWAY}} valid=10s ipv6=off;
proxy_set_header Host $host;
proxy_set_header X-Real-IP $remote_addr;
proxy_set_header X-Forwarded-For $proxy_add_x_forwarded_for;
proxy_set_header X-Forwarded-Proto $scheme;
proxy_http_version 1.1;
location ~ ^/(relay|ws-proxy/) {
set $nb_server netbird-server;
proxy_pass http://$nb_server:80;
proxy_set_header Upgrade $http_upgrade;
proxy_set_header Connection "upgrade";
proxy_read_timeout 1d;
}
location ~ ^/(api|oauth2)(/|$) {
# The dashboard is a SPA whose API/OIDC base URL is baked at build
# time to one host:port. A single box is reached via several
# addresses, so those fetches are cross-origin and the browser
# blocks them with no Access-Control-Allow-Origin (#15, live on
# .198). Reflect the caller's Origin and answer the CORS preflight.
if ($request_method = OPTIONS) {
add_header Access-Control-Allow-Origin $http_origin always;
add_header Access-Control-Allow-Credentials true always;
add_header Access-Control-Allow-Methods "GET, POST, PUT, PATCH, DELETE, OPTIONS" always;
add_header Access-Control-Allow-Headers "Authorization, Content-Type, Accept" always;
add_header Access-Control-Max-Age 86400 always;
add_header Content-Length 0;
return 204;
}
add_header Access-Control-Allow-Origin $http_origin always;
add_header Access-Control-Allow-Credentials true always;
add_header Access-Control-Allow-Methods "GET, POST, PUT, PATCH, DELETE, OPTIONS" always;
add_header Access-Control-Allow-Headers "Authorization, Content-Type, Accept" always;
set $nb_server netbird-server;
proxy_pass http://$nb_server:80;
}
location ~ ^/(signalexchange\.SignalExchange|management\.ManagementService|management\.ProxyService)/ {
set $nb_server netbird-server;
grpc_pass grpc://$nb_server:80;
grpc_read_timeout 1d;
grpc_send_timeout 1d;
}
# OIDC callback routes are client-side SPA routes with NO prebuilt page
# in the dashboard bundle, so proxying them straight through 404s —
# which crashes the dashboard's auth init and shows "Unauthenticated"
# with dead buttons (#15, live on .198: /nb-auth + /nb-silent-auth
# returned 404). Serve index.html at these paths (URL unchanged) so
# react-oidc boots and completes the login / silent-SSO.
location ~ ^/(nb-auth|nb-silent-auth) {
set $nb_dashboard netbird-dashboard;
rewrite ^.*$ /index.html break;
proxy_pass http://$nb_dashboard:80;
}
location / {
set $nb_dashboard netbird-dashboard;
proxy_pass http://$nb_dashboard:80;
}
}
health_check:
type: tcp
endpoint: localhost:443
interval: 30s
timeout: 5s
retries: 5
start_period: 20s
interfaces:
main:
name: Dashboard
description: Manage your self-hosted NetBird mesh VPN
type: ui
port: 8087
protocol: https
path: /
metadata:
author: NetBird
icon: /assets/img/app-icons/netbird.svg
website: https://netbird.io
repo: https://github.com/netbirdio/netbird
license: BSD-3-Clause
tags:
- networking
- vpn
- wireguard
- mesh

View File

@ -146,7 +146,9 @@ impl ApiHandler {
Ok(content_server::ServeResult::Forbidden) => Ok(build_response(
StatusCode::FORBIDDEN,
"application/json",
hyper::Body::from(r#"{"error":"Access denied — federation peer required"}"#),
hyper::Body::from(
r#"{"error":"This file is shared with the host's federation peers only. Federate with that node (exchange invites) so it recognizes you, then try again."}"#,
),
)),
Ok(content_server::ServeResult::NotFound) | Err(_) => Ok(build_response(
StatusCode::NOT_FOUND,
@ -222,8 +224,12 @@ impl ApiHandler {
hyper::Body::from(r#"{"error":"Invoice missing payment hash"}"#),
)),
Err(e) => {
// Surface the FULL error chain ({:#}) — the generic top-level
// message hid the real cause (e.g. the LND REST connection
// failing), which made this 503 undiagnosable.
tracing::warn!("content invoice creation failed: {e:#}");
let body = serde_json::json!({
"error": format!("Could not create invoice: {e}")
"error": format!("Could not create invoice: {e:#}")
});
Ok(build_response(
StatusCode::SERVICE_UNAVAILABLE,

View File

@ -171,6 +171,13 @@ impl RpcHandler {
// than the WebSocket-delivered package_data, which caused apps to flicker
// between "installed" and "not-installed" in the UI.
let (data, _) = self.state_manager.get_snapshot().await;
// Apps the user explicitly stopped must read as "stopped" even though a
// UI companion (electrs-ui, bitcoin-ui, …) keeps serving the launch port:
// launch_port_reachable() below would otherwise upgrade an exited backend
// back to "running". The reconcile guard keeps these backends down, so the
// marker is authoritative here.
let user_stopped =
crate::crash_recovery::load_user_stopped(&self.config.data_dir).await;
if data.server_info.status_info.containers_scanned && !data.package_data.is_empty() {
let mut containers = Vec::with_capacity(data.package_data.len());
for (id, pkg) in &data.package_data {
@ -202,7 +209,11 @@ impl RpcHandler {
// Scanner backoff preserves cached package_data. Refresh stable
// states so callers do not see stale `running`/`exited` after
// health-monitor recovery or Quadlet --rm container removal.
if state == "running" && requires_launch_port_for_health(id) {
if user_stopped.contains(id) {
// User stopped it → authoritative "stopped". Do NOT let a
// still-running UI companion's launch port mark it running.
state = "stopped".to_string();
} else if state == "running" && requires_launch_port_for_health(id) {
if !self.cached_reachable_health(id).await?.is_some() {
state = live_state_for_app(id)
.await

View File

@ -19,6 +19,29 @@ fn is_valid_v3_onion(addr: &str) -> bool {
const FILE_CATALOG_PROTOCOL: &str = "https://archipelago.dev/protocols/file-catalog/v1";
/// Best-effort reclaim of an ecash payment token that was minted but the sale
/// didn't complete (seller unreachable or couldn't redeem it), so the buyer
/// doesn't lose the value. For Fedimint the spender can reissue its own
/// un-redeemed notes; for Cashu the proofs are received back. Fails silently if
/// the seller already claimed the token (then the value is genuinely gone).
async fn reclaim_spent_ecash(data_dir: &std::path::Path, token: &str, backend: &str) {
let res = match backend {
"fedimint" => crate::wallet::fedimint_client::reissue_into_any(data_dir, token)
.await
.map(|(sats, _fed)| sats),
_ => ecash::receive_token(data_dir, token).await,
};
match res {
Ok(sats) => tracing::info!(
"paid download: reclaimed {sats} sats of unspent {backend} ecash after a failed sale"
),
Err(e) => tracing::warn!(
"paid download: could not reclaim {backend} ecash (the peer may have already \
claimed it): {e:#}"
),
}
}
impl RpcHandler {
/// List content I'm sharing.
pub(super) async fn handle_content_list_mine(&self) -> Result<serde_json::Value> {
@ -260,6 +283,20 @@ impl RpcHandler {
}));
}
// A 403 carries an actionable reason in its JSON body (e.g. "shared with
// the host's federation peers only — federate first"). Surface that to
// the user instead of a bare "Peer returned: 403 Forbidden".
if response.status() == reqwest::StatusCode::FORBIDDEN {
let status = response.status();
let body: serde_json::Value = response.json().await.unwrap_or_default();
let msg = body
.get("error")
.and_then(|v| v.as_str())
.map(|s| s.to_string())
.unwrap_or_else(|| format!("Peer returned: {status}"));
return Err(anyhow::anyhow!(msg));
}
if !response.status().is_success() {
return Err(anyhow::anyhow!("Peer returned: {}", response.status()));
}
@ -369,10 +406,58 @@ impl RpcHandler {
return Err(anyhow::anyhow!("Invalid v3 onion address"));
}
// Mint ecash payment token
let token_str = ecash::send_token(&self.config.data_dir, price_sats)
.await
.context("Failed to create ecash payment token — check wallet balance")?;
// `method` pins the backend the user confirmed in the UI ("cashu" |
// "fedimint"); absent = auto (Cashu first, then Fedimint). The seller's
// verify_payment_token accepts either, so a node whose balance lives in
// one system can still pay (#3).
let method = params.get("method").and_then(|v| v.as_str());
let mint_cashu = || ecash::send_token(&self.config.data_dir, price_sats);
let mint_fedimint =
|| crate::wallet::fedimint_client::spend_from_any(&self.config.data_dir, price_sats);
let (token_str, used_backend) = match method {
Some("cashu") => match mint_cashu().await {
Ok(t) => (t, "cashu"),
Err(e) => {
tracing::warn!("paid download: cashu mint failed for {price_sats} sats: {e:#}");
return Ok(serde_json::json!({ "error": format!(
"Couldn't pay {price_sats} sats from your Cashu wallet: {e}. \
Fund it, or choose Fedimint."
) }));
}
},
Some("fedimint") => match mint_fedimint().await {
Ok((notes, fed)) => {
tracing::info!("paid download: spending {price_sats} sats Fedimint notes from {fed}");
(notes, "fedimint")
}
Err(e) => {
tracing::warn!("paid download: fedimint spend failed for {price_sats} sats: {e:#}");
return Ok(serde_json::json!({ "error": format!(
"Couldn't pay {price_sats} sats from your Fedimint wallet: {e}. \
Fund it, or choose Cashu."
) }));
}
},
_ => match mint_cashu().await {
Ok(t) => (t, "cashu"),
Err(cashu_err) => match mint_fedimint().await {
Ok((notes, _fed)) => (notes, "fedimint"),
Err(fedi_err) => {
tracing::warn!(
"paid download: no ecash backend could pay {price_sats} sats \
(cashu: {cashu_err:#}; fedimint: {fedi_err:#})"
);
return Ok(serde_json::json!({ "error": format!(
"Couldn't pay {price_sats} sats from your ecash wallet \
(Cashu or Fedimint). Fund either wallet and try again."
) }));
}
},
},
};
tracing::info!("paid download: paying {price_sats} sats to {onion} via {used_backend} ecash");
let (data, _) = self.state_manager.get_snapshot().await;
let local_did = crate::identity::did_key_from_pubkey_hex(&data.server_info.pubkey)?;
@ -389,7 +474,7 @@ impl RpcHandler {
)
.service(crate::settings::transport::PeerService::PeerFiles)
.header("X-Federation-DID", local_did)
.header("X-Payment-Token", token_str)
.header("X-Payment-Token", token_str.clone())
.timeout(std::time::Duration::from_secs(900))
.send_get()
.await
@ -397,8 +482,11 @@ impl RpcHandler {
Ok(v) => v,
Err(e) => {
tracing::warn!("paid peer download dial failed for {}: {:#}", onion, e);
// The token was already minted/spent — reclaim it so the buyer
// doesn't lose the value when the seller was simply unreachable.
reclaim_spent_ecash(&self.config.data_dir, &token_str, used_backend).await;
return Ok(serde_json::json!({
"error": "Could not reach the peer over mesh or Tor — it may be offline. Please try again."
"error": "Could not reach the peer over mesh or Tor — it may be offline. Your ecash was refunded to your wallet. Please try again."
}));
}
};
@ -412,30 +500,92 @@ impl RpcHandler {
.await;
if response.status() == reqwest::StatusCode::PAYMENT_REQUIRED {
// Payment was rejected — token is spent but content not received
// Payment was rejected by the seller. Surface the most likely cause
// per backend — for ecash both sides must share a redemption network
// (a Cashu mint, or a Fedimint federation).
let body = response.text().await.unwrap_or_default();
tracing::warn!(
"paid download: seller {onion} rejected {used_backend} payment of {price_sats} sats: {body}"
);
// Seller couldn't redeem the token — reclaim it so the buyer keeps
// their funds (the spent-but-unredeemed-notes case the user hit).
reclaim_spent_ecash(&self.config.data_dir, &token_str, used_backend).await;
let hint = match used_backend {
"fedimint" => "the seller isn't in the same Fedimint federation as you",
_ => "the seller doesn't accept your Cashu mint",
};
return Ok(serde_json::json!({
"error": "Payment rejected by peer — the token may have been insufficient or invalid."
"error": format!(
"Payment rejected by the seller — {hint}. Your ecash was refunded to \
your wallet. Try the other ecash type, or use a shared mint/federation."
)
}));
}
if !response.status().is_success() {
let status = response.status();
let body = response.text().await.unwrap_or_default();
tracing::warn!("paid download: seller {onion} returned {status}: {body}");
reclaim_spent_ecash(&self.config.data_dir, &token_str, used_backend).await;
return Ok(serde_json::json!({
"error": format!("Peer returned an error ({}).", response.status())
"error": format!("Peer returned an error ({status}). Your ecash was refunded to your wallet.")
}));
}
// Capture the content type BEFORE consuming the body so the local cache
// can render the right viewer (image vs video) later.
let mime_type = response
.headers()
.get(reqwest::header::CONTENT_TYPE)
.and_then(|v| v.to_str().ok())
.map(|s| s.split(';').next().unwrap_or(s).trim().to_string())
.filter(|s| !s.is_empty())
.unwrap_or_else(|| "application/octet-stream".to_string());
let bytes = response
.bytes()
.await
.context("Failed to read response body")?;
// Persist the purchase so it "stays unlocked" for this buyer: cache the
// bytes + metadata keyed by (onion, content_id). The gallery then renders
// it unblurred and views it in-app from this cache — no re-payment and no
// reliance on a browser download (which silently fails on the mobile
// companion, the original "paid but never unlocked" report). Best-effort:
// a cache-write failure must not fail an already-paid download.
let filename = params
.get("filename")
.and_then(|v| v.as_str())
.unwrap_or(content_id)
.to_string();
let purchased_at = chrono::Utc::now().to_rfc3339();
if let Err(e) = crate::content_owned::record_purchase(
&self.config.data_dir,
onion,
content_id,
&filename,
&mime_type,
&bytes,
price_sats,
used_backend,
&purchased_at,
)
.await
{
tracing::warn!("paid download: failed to cache purchased content (non-fatal): {e:#}");
}
use base64::Engine;
let encoded = base64::engine::general_purpose::STANDARD.encode(&bytes);
tracing::info!("paid download: received {} bytes from {onion} (paid {price_sats} sats via {used_backend})", bytes.len());
Ok(serde_json::json!({
"data": encoded,
"size": bytes.len(),
"paid_sats": price_sats,
"ecash_backend": used_backend,
"mime_type": mime_type,
"owned": true,
}))
}
@ -463,12 +613,16 @@ impl RpcHandler {
let local_did = crate::identity::did_key_from_pubkey_hex(&data.server_info.pubkey)?;
let fips_npub = crate::federation::fips_npub_for_onion(&self.config.data_dir, onion).await;
// Minting a bolt11 is a tiny request/response — keep it snappy. Cap the
// FIPS attempt hard so a cold overlay can't burn the whole budget, and
// give Tor a short-but-real window (onion circuits need a few seconds).
let path = format!("/content/{}/invoice", content_id);
let (response, _transport) =
match crate::fips::dial::PeerRequest::new(fips_npub.as_deref(), onion, &path)
.service(crate::settings::transport::PeerService::PeerFiles)
.header("X-Federation-DID", local_did)
.timeout(std::time::Duration::from_secs(60))
.timeout(std::time::Duration::from_secs(25))
.fips_timeout(std::time::Duration::from_secs(6))
.send_get()
.await
{
@ -524,11 +678,15 @@ impl RpcHandler {
}
let fips_npub = crate::federation::fips_npub_for_onion(&self.config.data_dir, onion).await;
// Settlement poll — runs repeatedly, so each call must be quick. Fast-fail
// FIPS and keep a short Tor window; an unreachable peer just reads as
// "not yet paid" and the UI polls again.
let path = format!("/content/{}/invoice-status/{}", content_id, payment_hash);
let (response, _transport) =
match crate::fips::dial::PeerRequest::new(fips_npub.as_deref(), onion, &path)
.service(crate::settings::transport::PeerService::PeerFiles)
.timeout(std::time::Duration::from_secs(30))
.timeout(std::time::Duration::from_secs(15))
.fips_timeout(std::time::Duration::from_secs(6))
.send_get()
.await
{
@ -652,12 +810,15 @@ impl RpcHandler {
let local_did = crate::identity::did_key_from_pubkey_hex(&data.server_info.pubkey)?;
let fips_npub = crate::federation::fips_npub_for_onion(&self.config.data_dir, onion).await;
// Issuing an address is a tiny request/response — fast-fail FIPS, short
// Tor window (same budget shape as the invoice path, #6).
let path = format!("/content/{}/onchain", content_id);
let (response, _transport) =
match crate::fips::dial::PeerRequest::new(fips_npub.as_deref(), onion, &path)
.service(crate::settings::transport::PeerService::PeerFiles)
.header("X-Federation-DID", local_did)
.timeout(std::time::Duration::from_secs(60))
.timeout(std::time::Duration::from_secs(25))
.fips_timeout(std::time::Duration::from_secs(6))
.send_get()
.await
{
@ -715,7 +876,8 @@ impl RpcHandler {
let (response, _transport) =
match crate::fips::dial::PeerRequest::new(fips_npub.as_deref(), onion, &path)
.service(crate::settings::transport::PeerService::PeerFiles)
.timeout(std::time::Duration::from_secs(30))
.timeout(std::time::Duration::from_secs(15))
.fips_timeout(std::time::Duration::from_secs(6))
.send_get()
.await
{
@ -895,4 +1057,43 @@ impl RpcHandler {
"preview_mode": is_preview,
}))
}
/// `content.owned-list` — every paid item this node has purchased, so the
/// gallery can render owned items unblurred/viewable without re-payment.
pub(super) async fn handle_content_owned_list(&self) -> Result<serde_json::Value> {
let items = crate::content_owned::list_owned(&self.config.data_dir).await;
Ok(serde_json::json!({ "items": items }))
}
/// `content.owned-get` — return a purchased item's bytes (base64) from the
/// local cache for in-app viewing/saving. No network, no re-payment.
pub(super) async fn handle_content_owned_get(
&self,
params: Option<serde_json::Value>,
) -> Result<serde_json::Value> {
let params = params.ok_or_else(|| anyhow::anyhow!("Missing params"))?;
let onion = params
.get("onion")
.and_then(|v| v.as_str())
.ok_or_else(|| anyhow::anyhow!("Missing onion address"))?;
let content_id = params
.get("content_id")
.and_then(|v| v.as_str())
.ok_or_else(|| anyhow::anyhow!("Missing content_id"))?;
match crate::content_owned::read_owned(&self.config.data_dir, onion, content_id).await {
Some((mime_type, bytes)) => {
use base64::Engine;
let encoded = base64::engine::general_purpose::STANDARD.encode(&bytes);
Ok(serde_json::json!({
"data": encoded,
"size": bytes.len(),
"mime_type": mime_type,
}))
}
None => Ok(serde_json::json!({
"error": "You don't own this item yet, or its cached copy is missing."
})),
}
}
}

View File

@ -276,6 +276,8 @@ impl RpcHandler {
"content.browse-peer" => self.handle_content_browse_peer(params).await,
"content.download-peer" => self.handle_content_download_peer(params).await,
"content.download-peer-paid" => self.handle_content_download_peer_paid(params).await,
"content.owned-list" => self.handle_content_owned_list().await,
"content.owned-get" => self.handle_content_owned_get(params).await,
"content.request-invoice" => self.handle_content_request_invoice(params).await,
"content.invoice-status" => self.handle_content_invoice_status(params).await,
"content.download-peer-invoice" => {

View File

@ -156,6 +156,35 @@ impl RpcHandler {
/// Shared helper used by both the `lnd.createinvoice` RPC and the seller-side
/// peer-file invoice flow (#46). LND returns `r_hash` as base64; we re-encode
/// it as hex so it can be used as a stable lookup key and passed in URLs.
/// Whether LND reports it's synced to its Bitcoin chain backend. Used to
/// fail invoice minting FAST with a clear reason while the node's Bitcoin
/// backend is still in initial block download — otherwise the `/v1/invoices`
/// POST hangs for the full client timeout (×3 retries ≈ 45s) and surfaces as
/// an opaque failure. `getinfo` answers in ~2s even mid-IBD. Returns
/// `Some(false)` only when LND is reachable AND explicitly not synced;
/// `None` when we couldn't tell (let the mint attempt proceed and report its
/// own error rather than guess "syncing").
pub(crate) async fn lnd_chain_synced(&self) -> Option<bool> {
let (client, macaroon_hex) = self.lnd_client().await.ok()?;
let resp = client
.get(format!("{LND_REST_BASE_URL}/v1/getinfo"))
.header("Grpc-Metadata-macaroon", &macaroon_hex)
.send()
.await
.ok()?;
let body: serde_json::Value = resp.json().await.ok()?;
body.get("synced_to_chain").and_then(|v| v.as_bool())
}
/// Error returned when the node can't mint a Lightning invoice because its
/// Bitcoin backend is still syncing. Kept as one string so every invoice
/// entry point surfaces the same clear, user-facing reason.
fn syncing_invoice_err() -> anyhow::Error {
anyhow::anyhow!(
"Your Bitcoin node is still syncing — Lightning invoices are unavailable until it finishes. Try again once the node is fully synced."
)
}
pub(crate) async fn create_invoice(
&self,
amount_sats: i64,
@ -173,13 +202,55 @@ impl RpcHandler {
"value": amount_sats.to_string(),
"memo": memo,
});
let resp = client
.post(format!("{LND_REST_BASE_URL}/v1/invoices"))
.header("Grpc-Metadata-macaroon", &macaroon_hex)
.json(&invoice_body)
.send()
.await
.context("Failed to create invoice")?;
// LND's REST endpoint can briefly drop/reset connections under load
// (swap pressure, just-restarted, TLS handshake races), which used to
// hard-fail the buy-file invoice with an opaque 503. Retry on a
// CONNECTION error with short backoff so a transient blip doesn't
// surface as a payment failure. A *timeout* is NOT retried: it means LND
// accepted the connection but isn't answering the mint (e.g. a degraded
// node), and retrying just multiplies the wait (3×15s ≈ 45s) — fail
// after the first hang and let the caller surface the real reason.
let mut last_err: Option<anyhow::Error> = None;
let mut resp = None;
for attempt in 0..3u32 {
match client
.post(format!("{LND_REST_BASE_URL}/v1/invoices"))
.header("Grpc-Metadata-macaroon", &macaroon_hex)
.json(&invoice_body)
.send()
.await
{
Ok(r) => {
resp = Some(r);
break;
}
Err(e) => {
let timed_out = e.is_timeout();
last_err = Some(anyhow::anyhow!(
"LND REST send failed (attempt {}): {e}",
attempt + 1
));
if timed_out {
break;
}
tokio::time::sleep(std::time::Duration::from_millis(400)).await;
}
}
}
let resp = match resp {
Some(r) => r,
None => {
// If LND is reachable but explicitly not synced to chain, say so —
// it's the most common reason a just-restored/syncing node can't
// mint. Otherwise surface the underlying transport error.
if self.lnd_chain_synced().await == Some(false) {
return Err(Self::syncing_invoice_err());
}
return Err(last_err.unwrap_or_else(|| {
anyhow::anyhow!("Failed to reach LND REST to create invoice")
}));
}
};
let status = resp.status();
let body: serde_json::Value = resp
@ -356,13 +427,23 @@ impl RpcHandler {
"memo": memo,
});
let resp = client
let resp = match client
.post(format!("{LND_REST_BASE_URL}/v1/invoices"))
.header("Grpc-Metadata-macaroon", &macaroon_hex)
.json(&invoice_body)
.send()
.await
.context("Failed to create invoice")?;
{
Ok(r) => r,
Err(e) => {
// A hung/failed mint while LND is explicitly not synced to chain
// gets a clear, user-facing reason instead of an opaque error.
if self.lnd_chain_synced().await == Some(false) {
return Err(Self::syncing_invoice_err());
}
return Err(anyhow::anyhow!(e).context("Failed to create invoice"));
}
};
let status = resp.status();
let body: serde_json::Value = resp

View File

@ -14,12 +14,12 @@ impl RpcHandler {
pub(in crate::api::rpc) async fn handle_mesh_assistant_status(
&self,
) -> Result<serde_json::Value> {
let cfg = {
let (cfg, denied_askers) = {
let service = self.mesh_service.read().await;
let svc = service
.as_ref()
.ok_or_else(|| anyhow::anyhow!("Mesh service not running"))?;
svc.assistant_config().await
(svc.assistant_config().await, svc.assistant_denied_askers().await)
};
let (ollama_detected, models) = detect_ollama().await;
@ -32,10 +32,12 @@ impl RpcHandler {
"model": cfg.model,
"trusted_only": cfg.trusted_only,
"backend": cfg.backend,
"allowed_contacts": cfg.allowed_contacts,
"default_model": DEFAULT_MODEL,
"ollama_detected": ollama_detected,
"claude_available": claude_available,
"models": models,
"denied_askers": denied_askers,
}))
}
@ -64,8 +66,18 @@ impl RpcHandler {
} else {
None
};
// allowed_contacts: present + array => replace the allowlist (pubkey hex
// strings); absent => leave unchanged.
let allowed_contacts = params
.get("allowed_contacts")
.and_then(|v| v.as_array())
.map(|arr| {
arr.iter()
.filter_map(|e| e.as_str().map(|s| s.to_string()))
.collect::<Vec<String>>()
});
svc.configure_assistant(enabled, model, trusted_only, backend)
svc.configure_assistant(enabled, model, trusted_only, backend, allowed_contacts)
.await?;
let cfg = svc.assistant_config().await;
Ok(serde_json::json!({
@ -73,6 +85,7 @@ impl RpcHandler {
"model": cfg.model,
"trusted_only": cfg.trusted_only,
"backend": cfg.backend,
"allowed_contacts": cfg.allowed_contacts,
}))
}

View File

@ -95,12 +95,17 @@ impl RpcHandler {
if let Some(svc) = service.as_ref() {
let peers = svc.peers().await;
let messages = svc.messages(None).await;
// Per-peer last message.
for peer in &peers {
// Collapse radio/federation twins into one conversation per identity
// so a node reachable both ways shows once, with its messages unioned
// across both twin contact_ids (#12).
let groups = mesh::group_peer_twins(&peers);
for group in &groups {
let peer = &group.canonical;
// Newest message across ALL twin contact_ids in this group.
let last = messages
.iter()
.rev()
.find(|m| m.peer_contact_id == peer.contact_id);
.find(|m| group.contact_ids.contains(&m.peer_contact_id));
let is_federation = peer.contact_id & 0x8000_0000 != 0;
conversations.push(serde_json::json!({
"id": format!("{}:{}", if is_federation { "federation" } else { "mesh" }, peer.contact_id),
@ -163,8 +168,16 @@ impl RpcHandler {
let filtered: Vec<_> = match kind {
"mesh" | "federation" => {
let contact_id: u32 = rest.parse().unwrap_or(0);
// Resolve this id's twin group and union messages across all of
// its contact_ids, so opening either twin shows the full thread
// (federation-injected + radio messages) (#12).
let ids: Vec<u32> = mesh::group_peer_twins(&svc.peers().await)
.into_iter()
.find(|g| g.contact_ids.contains(&contact_id))
.map(|g| g.contact_ids)
.unwrap_or_else(|| vec![contact_id]);
all.into_iter()
.filter(|m| m.peer_contact_id == contact_id)
.filter(|m| ids.contains(&m.peer_contact_id))
.collect()
}
"channel" => {
@ -258,43 +271,45 @@ impl RpcHandler {
if let Some(svc) = service.as_ref() {
let state = svc.state();
// Snapshot the firmware pubkeys we currently know about, then
// add them to the radio-contact blocklist. MeshCore's on-device
// contact table is persistent and reads back stale rows on the
// next refresh_contacts, so without this step `clear-all` only
// wipes the app view for a few seconds before the old entries
// reappear. The blocklist is also saved to disk so the filter
// survives a restart.
let firmware_pubkeys: Vec<String> = state
// NOTE: `clear-all` intentionally does NOT build a radio-contact
// blocklist. Permanently ignoring firmware contacts meant a cleared
// peer could never return even when it re-advertised (it also broke
// re-pairing a phone after a clear). Real per-contact blocking will
// be a separate, explicit feature. Here we just wipe the app-side
// view and ALSO clear any blocklist left over from older builds, so
// previously-hidden contacts can re-appear when next heard. The
// firmware's own contact table is the source of truth on refresh.
{
let mut set = state.radio_contact_blocklist.write().await;
set.clear();
}
let _ = crate::mesh::save_ignored_radio_contacts(&data_dir, &[]).await;
// Actually DELETE each radio contact from the firmware table (via
// CMD_REMOVE_CONTACT) so wiped peers don't just reappear on the next
// refresh. They come back only when they re-advertise (reachable).
// Federation-synthetic peers (high contact_id bit) aren't firmware
// contacts, so skip those.
let firmware_pubkeys: Vec<[u8; 32]> = state
.peers
.read()
.await
.values()
.filter_map(|p| {
// Federation-synthetic peers have their contact_id in the
// high half of u32 and carry the archipelago key — those
// aren't firmware contacts and must not go on the list.
if p.contact_id & 0x8000_0000 != 0 {
None
} else {
p.pubkey_hex.clone()
}
.filter(|p| p.contact_id & 0x8000_0000 == 0)
.filter_map(|p| p.pubkey_hex.as_deref())
.filter_map(|h| hex::decode(h).ok())
.filter(|b| b.len() == 32)
.map(|b| {
let mut k = [0u8; 32];
k.copy_from_slice(&b);
k
})
.collect();
{
let mut set = state.radio_contact_blocklist.write().await;
for pk in &firmware_pubkeys {
set.insert(pk.clone());
}
for pk in firmware_pubkeys {
let _ = state
.send_cmd(crate::mesh::listener::MeshCommand::RemoveContact { pubkey: pk })
.await;
}
let persisted: Vec<String> = state
.radio_contact_blocklist
.read()
.await
.iter()
.cloned()
.collect();
let _ = crate::mesh::save_ignored_radio_contacts(&data_dir, &persisted).await;
state.peers.write().await.clear();
state.messages.write().await.clear();

View File

@ -1133,9 +1133,13 @@ impl RpcHandler {
.ok_or_else(|| anyhow::anyhow!("Mesh service not running"))?;
let state = svc.shared_state();
let contacts = state.contacts.read().await;
let peers = state.peers.read().await;
let peer_vec: Vec<_> = state.peers.read().await.values().cloned().collect();
// Collapse radio/federation twins so a node reachable both ways shows as
// one contact instead of two (#12).
let groups = crate::mesh::group_peer_twins(&peer_vec);
let mut out: Vec<serde_json::Value> = Vec::new();
for peer in peers.values() {
for group in &groups {
let peer = &group.canonical;
if let Some(pk) = peer.pubkey_hex.as_ref() {
let entry = contacts.get(pk).cloned().unwrap_or_default();
out.push(serde_json::json!({

View File

@ -349,13 +349,37 @@ fn http_probe_cmd(url: &'static str) -> &'static str {
}
}
/// Bitcoin UTXO cache (`-dbcache`) in MB, sized to host RAM.
///
/// A fixed large dbcache on a small box pushes bitcoind + the ~20 app
/// containers past physical RAM and triggers system-wide swap thrash: the
/// disk saturates, bitcoind can't answer its own RPC, and the dashboard
/// backend's sqlite reads stall — surfacing as /rpc/v1 502s and a blank
/// Bitcoin UI. Budget ~1/16 of RAM for the cache (floor 300 MB — bitcoind's
/// own default is 450 — cap 4096 MB), mirroring scripts/container-specs.sh.
pub(super) fn bitcoin_dbcache_mb() -> u64 {
let total_mb = std::fs::read_to_string("/proc/meminfo")
.ok()
.and_then(|c| {
c.lines()
.find_map(|l| l.strip_prefix("MemTotal:"))
.and_then(|v| v.split_whitespace().next())
.and_then(|kb| kb.parse::<u64>().ok())
})
.map(|kb| kb / 1024)
.unwrap_or(16000); // assume a comfortable host if /proc/meminfo is unreadable
(total_mb / 16).clamp(300, 4096)
}
/// Get per-app memory limit.
pub(super) fn get_memory_limit(app_id: &str) -> &'static str {
match app_id {
// Heavy apps. Bitcoin: dbcache uses ~4GB; the daemon also needs
// headroom for mempool + connection buffers + script-verifier
// memory + I/O. 4g caused OOM-cascades during IBD. 8g is the
// floor; ideally this would be host-RAM aware (next pass).
// Heavy apps. Bitcoin: dbcache is now host-RAM-aware (see
// bitcoin_dbcache_mb), so the daemon's footprint scales with the box.
// This cgroup cap is an upper bound for mempool + connection buffers +
// script-verifier memory + I/O; a tight cap (4g) previously caused
// OOM-cascades during IBD, so keep 8g as a generous ceiling rather
// than a tight limit — swap thrash is prevented at the dbcache layer.
"bitcoin" | "bitcoin-core" | "bitcoin-knots" => "8g",
// ElectrumX indexing spikes above its cache size due Python,
// RocksDB, socket buffers, and reorg/history work. Keep cache
@ -674,9 +698,10 @@ pub(super) async fn get_app_config(
// RPC is reachable from the bitcoin-ui companion container.
//
// Sync-speed flags:
// -dbcache=4096 — UTXO set cache; 4GB is the sweet spot before
// diminishing returns. Container has --memory=8g now so
// there's headroom for mempool + connections.
// -dbcache — UTXO set cache, sized to host RAM via
// bitcoin_dbcache_mb() (see there). A fixed 4GB cache swap-
// thrashed small nodes into fleet-wide 502s; ~1/16 of RAM
// keeps headroom for mempool + connections + the app stack.
// -par=0 — use all available cores for script
// verification (defaults to NCPU-1 capped at 16). Was
// effectively pinned at 2 by --cpus=2 (now removed).
@ -689,7 +714,7 @@ pub(super) async fn get_app_config(
"-rpcport=8332".to_string(),
"-printtoconsole=1".to_string(),
"-datadir=/home/bitcoin/.bitcoin".to_string(),
"-dbcache=4096".to_string(),
format!("-dbcache={}", bitcoin_dbcache_mb()),
"-par=0".to_string(),
"-maxconnections=125".to_string(),
]),

View File

@ -376,16 +376,31 @@ pub(super) fn startup_order(package_id: &str) -> &'static [&'static str] {
/// order for the given app. Unknown containers sort to the end.
pub(super) async fn ordered_containers_for_start(package_id: &str) -> Result<Vec<String>> {
let containers = get_containers_for_app(package_id).await?;
Ok(order_present_containers(package_id, containers))
}
/// Order the *actually-present* containers of an app by its dependency-aware
/// startup order. Containers whose name is unknown to the order list sort to
/// the end, preserving their relative input order.
///
/// This deliberately does NOT inject order entries that aren't live
/// containers. `startup_order` is a union of container-name variants across
/// install generations (e.g. `mysql-mempool` vs `archy-mempool-db`), so any
/// single install only ever has a subset of those names. Injecting a phantom
/// name makes the start path fail on a "no such object" inspect — and because
/// `do_orchestrator_package_start` propagates the unknown-app-id fallback
/// error via `?`, every later member (the api + frontend) is then skipped,
/// leaving the stack down until the health monitor recovers it minutes later.
/// That was the source of mempool gate flakes #73 (frontend) / #74 (api).
fn order_present_containers(package_id: &str, containers: Vec<String>) -> Vec<String> {
if containers.is_empty() {
// Nothing is live under any known name. Fall back to the package id so
// a single-container app whose container matches its id still gets one
// start attempt; multi-container stacks with no live members are
// surfaced as "no containers" by the caller's emptiness check.
return vec![package_id.to_string()];
}
let order = startup_order(package_id);
if order.is_empty() && containers.is_empty() {
return Ok(vec![package_id.to_string()]);
}
let mut sorted = containers;
for required in order {
if !sorted.iter().any(|name| name == required) {
sorted.push((*required).to_string());
}
}
// If no special order is defined, fall back to mempool order for legacy
// multi-container names that may still be returned by config lookups.
let effective_order: &[&str] = if order.is_empty() {
@ -393,8 +408,14 @@ pub(super) async fn ordered_containers_for_start(package_id: &str) -> Result<Vec
} else {
order
};
sorted.sort_by_key(|c| effective_order.iter().position(|o| *o == c).unwrap_or(99));
Ok(sorted)
let mut sorted = containers;
sorted.sort_by_key(|c| {
effective_order
.iter()
.position(|o| *o == c)
.unwrap_or(usize::MAX)
});
sorted
}
/// Configure Fedimint Gateway to use LND instead of LDK.
@ -452,7 +473,48 @@ pub(super) fn configure_fedimint_lnd(
#[cfg(test)]
mod tests {
use super::{requires_unpruned_bitcoin, startup_order};
use super::{order_present_containers, requires_unpruned_bitcoin, startup_order};
#[test]
fn order_present_containers_never_injects_phantom_stack_members() {
// The live mempool stack on a node: db + api + frontend. These are the
// only real container names; the startup_order list also contains
// variant/legacy names (mysql-mempool, archy-mempool-api, ...) that are
// NOT live here and must never appear in the result — a phantom name in
// the start list aborts the orchestrator start mid-sequence (gate
// #73/#74).
let present = vec![
"mempool".to_string(),
"mempool-api".to_string(),
"archy-mempool-db".to_string(),
];
let ordered = order_present_containers("mempool", present);
// Dependency order: db -> api -> frontend.
assert_eq!(ordered, vec!["archy-mempool-db", "mempool-api", "mempool"]);
// No phantom variants leaked in.
for phantom in ["mysql-mempool", "archy-mempool-api", "archy-mempool-web"] {
assert!(
!ordered.iter().any(|c| c == phantom),
"phantom {phantom} must not be injected"
);
}
}
#[test]
fn order_present_containers_orders_known_before_unknown() {
let present = vec!["mempool".to_string(), "some-sidecar".to_string()];
let ordered = order_present_containers("mempool", present);
// The known frontend sorts ahead of an unknown sidecar.
assert_eq!(ordered, vec!["mempool", "some-sidecar"]);
}
#[test]
fn order_present_containers_empty_falls_back_to_package_id() {
assert_eq!(
order_present_containers("mempool", vec![]),
vec!["mempool".to_string()]
);
}
#[test]
fn btcpay_start_order_includes_required_stack_members() {

View File

@ -22,6 +22,11 @@ const PODMAN_LOG_TIMEOUT: Duration = Duration::from_secs(15);
/// Per-container graceful shutdown timeout in seconds.
/// Bitcoin Core needs 600s to flush UTXO set, LND 330s for channel state,
/// indexers 300s for index flush, databases 120s for WAL/transaction commit.
///
/// MIRRORS `archipelago_container::runtime::stop_grace_secs_for` (which returns
/// `u64` and is the canonical table used by the orchestrator stop path). This
/// `&str` variant exists for the legacy `podman stop -t <s>` call sites here —
/// keep the two tables in sync until those are migrated to the orchestrator.
pub fn stop_timeout_secs(container_name: &str) -> &'static str {
let id = container_name
.strip_prefix("archy-")
@ -307,7 +312,16 @@ impl RpcHandler {
let mut stopped = 0u32;
let mut removed = 0u32;
let mut errors = Vec::new();
// Two distinct failure classes, kept separate so they don't get
// conflated (the old single `errors` vec did, which caused the "ghost in
// My Apps" bug): `container_errors` means a container could NOT be
// removed (force-rm failed too) — the app is genuinely still present, so
// we keep its state entry and surface a hard error. `cleanup_errors`
// means volume/network/data-dir teardown left residue — the containers
// are already gone, so the app IS uninstalled and MUST disappear from My
// Apps; the residue is logged but never ghosts the app.
let mut container_errors: Vec<String> = Vec::new();
let mut cleanup_errors: Vec<String> = Vec::new();
self.set_uninstall_stage(
package_id,
@ -365,7 +379,7 @@ impl RpcHandler {
let msg =
format!("Failed to remove {}: {}; {}", name, stderr.trim(), e);
tracing::error!("Uninstall {}: {}", package_id, msg);
errors.push(msg);
container_errors.push(msg);
}
}
}
@ -374,12 +388,35 @@ impl RpcHandler {
Err(force_err) => {
let msg = format!("Failed to remove {}: {}; {}", name, e, force_err);
tracing::error!("Uninstall {}: {}", package_id, msg);
errors.push(msg);
container_errors.push(msg);
}
},
}
}
// A container that survived even force-remove means the app is NOT
// actually uninstalled — keep its state entry and fail so the spawned
// task reverts it to its prior state (and the user can retry), rather
// than orphaning a live container that's missing from My Apps.
if !container_errors.is_empty() {
tracing::error!(
"Uninstall {}: containers could not be removed: {:?}",
package_id,
container_errors
);
return Err(anyhow::anyhow!(
"Uninstall {} failed: {}",
package_id,
container_errors.join("; ")
));
}
// Containers are gone → the app is uninstalled. Remove its state entry
// NOW, before the (possibly slow, possibly fallible) volume/data
// teardown below, so My Apps updates immediately and a residue failure
// can never leave a ghost. Reinstall/scan no longer see a stale entry.
self.remove_package_state_entry(package_id).await;
self.set_uninstall_stage(package_id, "Cleaning up volumes")
.await;
// Avoid global Podman volume prune on production nodes: store-wide
@ -427,70 +464,73 @@ impl RpcHandler {
let stderr = String::from_utf8_lossy(&o.stderr);
let msg = format!("Failed to remove data {}: {}", dir, stderr.trim());
tracing::error!("Uninstall {}: {}", package_id, msg);
errors.push(msg);
cleanup_errors.push(msg);
}
Err(e) => {
let msg = format!("Failed to remove data {}: {}", dir, e);
tracing::error!("Uninstall {}: {}", package_id, msg);
errors.push(msg);
cleanup_errors.push(msg);
}
_ => {}
}
}
}
if !errors.is_empty() {
// The app is already gone from My Apps (entry removed above). Residual
// volume/data cleanup failures are logged but NEVER ghost the app — a
// reinstall and the next uninstall both tolerate leftover dirs.
if !cleanup_errors.is_empty() {
tracing::error!(
"Uninstall {} completed with errors: {:?}",
"Uninstall {} removed but left cleanup residue: {:?}",
package_id,
errors
cleanup_errors
);
return Err(anyhow::anyhow!(
"Uninstall {} partially failed: {}",
package_id,
errors.join("; ")
));
}
tracing::info!(
"Uninstall {} complete: stopped={}, removed={}",
"Uninstall {} complete: stopped={}, removed={}, cleanup_errors={}",
package_id,
stopped,
removed
removed,
cleanup_errors.len()
);
// Immediately remove from in-memory state so the UI updates without
// waiting for the scanner's absence threshold (3 scans × 60s each).
{
let (mut data, _rev) = self.state_manager.get_snapshot().await;
let before = data.package_data.len();
data.package_data.remove(package_id);
// Also remove any alias keys (e.g. "bitcoin-knots" vs "bitcoin")
let aliases: Vec<String> = data
.package_data
.keys()
.filter(|k| {
super::config::all_container_names(package_id)
.iter()
.any(|c| c.strip_prefix("archy-").unwrap_or(c) == k.as_str())
})
.cloned()
.collect();
for alias in &aliases {
data.package_data.remove(alias);
}
if data.package_data.len() < before {
self.state_manager.update_data(data).await;
}
}
Ok(serde_json::json!({
"status": "uninstalled",
"stopped": stopped,
"removed": removed,
"cleanup_warnings": cleanup_errors,
}))
}
/// Remove a package's entry (and any alias keys) from persisted state so it
/// disappears from My Apps immediately, without waiting for the scanner's
/// absence threshold (3 scans × 60s). Called as soon as an uninstall has
/// removed the app's containers — before the slower volume/data teardown —
/// so a residue failure can never leave a ghost entry behind.
async fn remove_package_state_entry(&self, package_id: &str) {
let (mut data, _rev) = self.state_manager.get_snapshot().await;
let before = data.package_data.len();
data.package_data.remove(package_id);
// Also remove any alias keys (e.g. "bitcoin-knots" vs "bitcoin").
let aliases: Vec<String> = data
.package_data
.keys()
.filter(|k| {
super::config::all_container_names(package_id)
.iter()
.any(|c| c.strip_prefix("archy-").unwrap_or(c) == k.as_str())
})
.cloned()
.collect();
for alias in &aliases {
data.package_data.remove(alias);
}
if data.package_data.len() < before {
self.state_manager.update_data(data).await;
}
}
/// Start a bundled app (create container from pre-loaded image if needed).
pub(in crate::api::rpc) async fn handle_bundled_app_start(
&self,

View File

@ -6,7 +6,6 @@
use crate::api::rpc::RpcHandler;
use crate::data_model::InstallPhase;
use anyhow::{Context, Result};
use base64::Engine;
use std::process::Output;
use std::time::Duration;
use tracing::info;
@ -620,16 +619,25 @@ async fn install_stack_via_orchestrator(
))
.await;
let mut installed = 0usize;
for app_id in app_ids {
match orchestrator.install(app_id).await {
Ok(container_name) => {
installed += 1;
install_log(&format!(
"INSTALL ORCH: {} stack — app {} installed as {}",
stack_name, app_id, container_name
))
.await;
}
Err(e) if e.to_string().contains("unknown app_id") => {
Err(e) if e.to_string().contains("unknown app_id") && installed == 0 => {
// None of the stack's manifests are known — the orchestrator
// can't render this stack at all, so defer to the legacy
// installer. Only safe when NOTHING was installed yet: once an
// earlier member is up, falling back would let the legacy path
// double-create containers on the same data dir (observed
// corrupting an immich postgres cluster — two postmasters, one
// PGDATA). A partial set means a deploy bug, not a legacy node.
install_log(&format!(
"INSTALL ORCH SKIP: {} stack — app {} unknown, falling back to legacy stack installer",
stack_name, app_id
@ -637,6 +645,17 @@ async fn install_stack_via_orchestrator(
.await;
return Ok(None);
}
Err(e) if e.to_string().contains("unknown app_id") => {
install_log(&format!(
"INSTALL ORCH FAIL: {} stack — app {} unknown AFTER {} installed; refusing legacy fallback (would double-create on shared data)",
stack_name, app_id, installed
))
.await;
return Err(e.context(format!(
"orchestrator stack install {} aborted: app {} has no manifest but {} member(s) already installed — deploy all stack manifests",
stack_name, app_id, installed
)));
}
Err(e) => {
install_log(&format!(
"INSTALL ORCH FAIL: {} stack — app {} failed: {}",
@ -668,11 +687,42 @@ fn mempool_stack_app_ids() -> &'static [&'static str] {
&["archy-mempool-db", "mempool-api", "archy-mempool-web"]
}
const REGISTRY: &str = "146.59.87.168:3000/lfg2025";
fn immich_stack_app_ids() -> &'static [&'static str] {
// Install order = dependency order: db + cache before the server. The server
// app_id is the user-facing "immich" (canonical name + icon); its install is
// handled here (not recursively) since orchestrator.install bypasses the
// package.install routing that maps "immich" → this stack installer.
&["immich-postgres", "immich-redis", "immich"]
}
const NETBIRD_DASHBOARD_IMAGE: &str = "docker.io/netbirdio/dashboard:v2.38.0";
const NETBIRD_SERVER_IMAGE: &str = "docker.io/netbirdio/netbird-server:0.71.2";
const NETBIRD_PROXY_IMAGE: &str = "docker.io/library/nginx:1.27-alpine";
fn netbird_stack_app_ids() -> &'static [&'static str] {
// Dependency/startup order: the combined management/signal/relay server
// first (it owns the base64 relay/store secrets + the sqlite store, and is
// the OIDC issuer the others point at), then the dashboard SPA, then the
// user-facing TLS proxy ("netbird", which carries the self-signed cert +
// the templated nginx.conf and is the launcher). Mirrors the netbird
// startup_order in dependencies.rs.
&["netbird-server", "netbird-dashboard", "netbird"]
}
fn indeedhub_stack_app_ids() -> &'static [&'static str] {
// Dependency order: backends + their generated secrets first, then the api
// (owns indeedhub-jwt; reads the db/minio secrets the backends materialised),
// then the ffmpeg worker, then the user-facing frontend ("indeedhub", which
// carries the post_install nginx hook). The frontend's nginx reaches the
// backends by their short network_aliases (api/minio/relay) on indeedhub-net.
&[
"indeedhub-postgres",
"indeedhub-redis",
"indeedhub-minio",
"indeedhub-relay",
"indeedhub-api",
"indeedhub-ffmpeg",
"indeedhub",
]
}
const REGISTRY: &str = "146.59.87.168:3000/lfg2025";
/// Pull an image with retry and exponential backoff (3 attempts).
async fn pull_image_with_retry(image: &str) -> Result<()> {
@ -734,6 +784,17 @@ async fn pull_image_with_retry(image: &str) -> Result<()> {
impl RpcHandler {
/// Install Immich stack (postgres + redis + server).
pub(super) async fn install_immich_stack(&self) -> Result<serde_json::Value> {
// Manifest-driven path (workstream B/C): render the stack from
// apps/immich-*/manifest.yml via the orchestrator (rootless Quadlet
// units, generated_secrets, reboot-survivable). Falls back to the legacy
// installer below only when the orchestrator doesn't know these app_ids
// (manifests not yet deployed). See docs/PRODUCTION-MASTER-PLAN.md.
if let Some(orchestrated) =
install_stack_via_orchestrator(self, "immich", immich_stack_app_ids()).await?
{
return Ok(orchestrated);
}
if let Some(adopted) = adopt_stack_if_exists(
"immich_server",
"immich",
@ -1383,6 +1444,20 @@ impl RpcHandler {
/// Install the IndeedHub multi-container stack.
pub(super) async fn install_indeedhub_stack(&self) -> Result<serde_json::Value> {
// Manifest-driven path (#20 phase 3): render the 7-member stack from
// apps/indeedhub-*/manifest.yml via the orchestrator (dedicated
// indeedhub-net + network_aliases, generated_secrets, the frontend's
// post_install nginx hook, reboot-survivable). The manifests use the exact
// live container names / named volumes, so on an existing node this ADOPTS
// the running stack rather than recreating it (data preserved). Falls back
// to the legacy installer below only when the orchestrator doesn't know
// these app_ids (manifests not yet deployed). See PRODUCTION-MASTER-PLAN.md.
if let Some(orchestrated) =
install_stack_via_orchestrator(self, "indeedhub", indeedhub_stack_app_ids()).await?
{
return Ok(orchestrated);
}
let registry = crate::container::registry::load_registries(&self.config.data_dir)
.await
.unwrap_or_default()
@ -1758,6 +1833,27 @@ impl RpcHandler {
/// Install self-hosted NetBird (dashboard + combined management/signal/relay server).
pub(super) async fn install_netbird_stack(&self) -> Result<serde_json::Value> {
// Manifest-driven path (#20 phase 4): render the 3-member stack from
// apps/netbird-*/manifest.yml via the orchestrator — dedicated
// netbird-net + network_aliases, base64 generated_secrets, a self-signed
// TLS cert (generated_certs) so the dashboard gets a secure context for
// OIDC PKCE (#15), and templated config.yaml/nginx.conf rendered from
// host facts + the netbird-net gateway. The manifests use the exact live
// container names, so on an existing node this ADOPTS the running stack
// rather than recreating it (the sqlite store + base64 keys are
// preserved — ensure_generated_secrets no-ops on existing files).
//
// #20 ph4: the legacy hardcoded `podman run` installer was DELETED — the
// signed catalog always ships apps/netbird-*/manifest.yml, so there is no
// in-Rust fallback. If the orchestrator doesn't know these app_ids and no
// running stack exists to adopt, install errors rather than silently
// diverging from the manifest contract.
if let Some(orchestrated) =
install_stack_via_orchestrator(self, "netbird", netbird_stack_app_ids()).await?
{
return Ok(orchestrated);
}
if let Some(adopted) = adopt_stack_if_exists(
"netbird",
"netbird",
@ -1768,452 +1864,12 @@ impl RpcHandler {
return Ok(adopted);
}
install_log("INSTALL START: netbird stack (dashboard + server)").await;
info!("Installing self-hosted NetBird stack");
self.set_install_phase("netbird", InstallPhase::PullingImage)
.await;
for (i, image) in [
NETBIRD_DASHBOARD_IMAGE,
NETBIRD_SERVER_IMAGE,
NETBIRD_PROXY_IMAGE,
]
.iter()
.enumerate()
{
self.set_install_progress("netbird", i as u64, 3).await;
pull_image_with_retry(image)
.await
.with_context(|| format!("Failed to pull NetBird image: {}", image))?;
}
self.set_install_progress("netbird", 3, 3).await;
for name in ["netbird", "netbird-dashboard", "netbird-server"] {
let _ = podman_stack_status(&["rm", "-f", name], PODMAN_STACK_PROBE_TIMEOUT).await;
}
let _ = podman_stack_status(
&["network", "rm", "-f", "netbird-net"],
PODMAN_STACK_PROBE_TIMEOUT,
anyhow::bail!(
"netbird manifests not available on this node — the signed catalog must provide apps/netbird-*/manifest.yml (legacy hardcoded installer removed in #20 ph4)"
)
.await;
self.set_install_phase("netbird", InstallPhase::CreatingContainer)
.await;
tokio::fs::create_dir_all("/var/lib/archipelago/netbird/data")
.await
.context("Failed to create NetBird data directory")?;
let host_ip = detect_netbird_public_host_ip()
.await
.unwrap_or_else(|| self.config.host_ip.clone());
// Create the network FIRST so we can read back the gateway it was
// assigned — that gateway is Podman's aardvark DNS, which the proxy's
// nginx needs as an explicit `resolver` to re-resolve container names
// (issue #15: without it nginx caches a container IP and 502s forever
// once that IP changes on restart/reboot).
let _ = podman_stack_status(
&["network", "create", "netbird-net"],
PODMAN_STACK_PROBE_TIMEOUT,
)
.await;
let resolver_ip = netbird_net_resolver_ip().await;
write_netbird_config_files(&host_ip, &self.config.host_ip, &resolver_ip).await?;
ensure_netbird_tls_cert(&host_ip).await?;
let mut server_cmd = tokio::process::Command::new("podman");
server_cmd.args([
"run",
"-d",
"--name",
"netbird-server",
"--network",
"netbird-net",
"--network-alias",
"netbird-server",
"--restart=unless-stopped",
"-p",
"8086:80",
"-p",
"3478:3478/udp",
"-v",
"/var/lib/archipelago/netbird/data:/var/lib/netbird",
"-v",
"/var/lib/archipelago/netbird/config.yaml:/etc/netbird/config.yaml:ro",
NETBIRD_SERVER_IMAGE,
"--config",
"/etc/netbird/config.yaml",
]);
run_required_stack_command("netbird", "create server", &mut server_cmd).await?;
self.set_install_phase("netbird", InstallPhase::StartingContainer)
.await;
tokio::time::sleep(std::time::Duration::from_secs(5)).await;
let mut dashboard_cmd = tokio::process::Command::new("podman");
dashboard_cmd.args([
"run",
"-d",
"--name",
"netbird-dashboard",
"--network",
"netbird-net",
// Explicit alias so the proxy can always resolve `netbird-dashboard`
// via Podman DNS — don't rely on implicit container-name aliasing.
"--network-alias",
"netbird-dashboard",
"--restart=unless-stopped",
"--env-file",
"/var/lib/archipelago/netbird/dashboard.env",
NETBIRD_DASHBOARD_IMAGE,
]);
run_required_stack_command("netbird", "create dashboard", &mut dashboard_cmd).await?;
let mut proxy_cmd = tokio::process::Command::new("podman");
proxy_cmd.args([
"run",
"-d",
"--name",
"netbird",
"--network",
"netbird-net",
"--restart=unless-stopped",
// 8087 publishes the TLS listener — netbird's dashboard requires a
// secure context (window.crypto.subtle / OIDC PKCE), issue #15.
"-p",
"8087:443",
"-v",
"/var/lib/archipelago/netbird/nginx.conf:/etc/nginx/conf.d/default.conf:ro",
"-v",
"/var/lib/archipelago/netbird/tls.crt:/etc/nginx/tls.crt:ro",
"-v",
"/var/lib/archipelago/netbird/tls.key:/etc/nginx/tls.key:ro",
NETBIRD_PROXY_IMAGE,
]);
run_required_stack_command("netbird", "create unified proxy", &mut proxy_cmd).await?;
wait_for_stack_containers(
"netbird",
&["netbird-server", "netbird-dashboard", "netbird"],
60,
)
.await?;
self.set_install_phase("netbird", InstallPhase::WaitingHealthy)
.await;
self.set_install_phase("netbird", InstallPhase::PostInstall)
.await;
self.set_install_phase("netbird", InstallPhase::Done).await;
self.clear_install_progress("netbird").await;
install_log("INSTALL OK: netbird stack").await;
info!("NetBird stack installed");
Ok(serde_json::json!({
"success": true,
"package_id": "netbird",
"message": "NetBird self-hosted stack installed",
}))
}
}
async fn read_or_generate_b64_secret(name: &str) -> String {
let path = format!("/var/lib/archipelago/secrets/{}", name);
if let Ok(val) = tokio::fs::read_to_string(&path).await {
let trimmed = val.trim().to_string();
if !trimmed.is_empty() {
return trimmed;
}
}
let mut buf = [0u8; 32];
rand::RngCore::fill_bytes(&mut rand::rngs::OsRng, &mut buf);
let secret = base64::engine::general_purpose::STANDARD.encode(buf);
let _ = tokio::fs::create_dir_all("/var/lib/archipelago/secrets").await;
let _ = tokio::fs::write(&path, &secret).await;
secret
}
/// Read the gateway of the `netbird-net` bridge. Podman runs its aardvark DNS
/// resolver on this address, so nginx can use it as an explicit `resolver` to
/// re-resolve container names at request time. Falls back to Podman's usual
/// first-pool gateway if the inspect fails (best effort — config is rewritten
/// on every (re)install).
async fn netbird_net_resolver_ip() -> String {
let out = tokio::process::Command::new("podman")
.args([
"network",
"inspect",
"netbird-net",
"--format",
"{{range .Subnets}}{{.Gateway}}{{end}}",
])
.output()
.await;
if let Ok(o) = out {
let gw = String::from_utf8_lossy(&o.stdout).trim().to_string();
if !gw.is_empty() && gw.parse::<std::net::IpAddr>().is_ok() {
return gw;
}
}
"10.89.0.1".to_string()
}
/// Generate a self-signed TLS cert for the netbird proxy if absent. The
/// dashboard needs a secure context (window.crypto.subtle / OIDC PKCE), so the
/// proxy serves HTTPS; a self-signed cert is sufficient (the user accepts it
/// once when opening netbird in a tab). SAN covers the LAN IP plus
/// localhost/127.0.0.1 so it's valid however the box is reached locally.
async fn ensure_netbird_tls_cert(host_ip: &str) -> Result<()> {
let dir = "/var/lib/archipelago/netbird";
let crt = format!("{dir}/tls.crt");
let key = format!("{dir}/tls.key");
if tokio::fs::metadata(&crt).await.is_ok() && tokio::fs::metadata(&key).await.is_ok() {
return Ok(());
}
let _ = tokio::fs::create_dir_all(dir).await;
let san = format!("subjectAltName=IP:{host_ip},IP:127.0.0.1,DNS:localhost");
let status = tokio::process::Command::new("openssl")
.args([
"req",
"-x509",
"-newkey",
"rsa:2048",
"-nodes",
"-keyout",
&key,
"-out",
&crt,
"-days",
"3650",
"-subj",
&format!("/CN={host_ip}"),
"-addext",
&san,
])
.status()
.await
.context("failed to run openssl for netbird TLS cert")?;
if !status.success() {
anyhow::bail!("openssl failed to generate netbird TLS cert");
}
Ok(())
}
async fn write_netbird_config_files(host_ip: &str, lan_ip: &str, resolver_ip: &str) -> Result<()> {
// netbird's dashboard uses window.crypto.subtle (OIDC PKCE), which browsers
// only expose in a SECURE context — so the proxy serves HTTPS and every
// origin here is https (issue #15: over plain http the dashboard threw
// "window.crypto.subtle is unavailable" and never reached login).
let public_origin = format!("https://{}:8087", host_ip);
let server_origin = format!("http://{}:8086", host_ip);
// A single box is reached via several addresses. Allow the OIDC login flow
// to redirect back to whichever origin the user actually used, otherwise
// post-login lands on the wrong host and the dashboard shows
// "Unauthenticated" (issue #15). The browser-side CORS is handled in the
// nginx proxy; this covers the redirect-URI allow-list.
let lan_origin = format!("https://{}:8087", lan_ip);
let mut redirect_origins = vec![public_origin.clone()];
if lan_origin != public_origin {
redirect_origins.push(lan_origin);
}
let dashboard_redirect_uris = redirect_origins
.iter()
.flat_map(|o| {
[
format!(" - \"{o}/nb-auth\""),
format!(" - \"{o}/nb-silent-auth\""),
]
})
.collect::<Vec<_>>()
.join("\n");
let dashboard_logout_uris = redirect_origins
.iter()
.map(|o| format!(" - \"{o}/\""))
.collect::<Vec<_>>()
.join("\n");
let relay_secret = read_or_generate_b64_secret("netbird-relay-auth-secret").await;
let encryption_key = read_or_generate_b64_secret("netbird-store-encryption-key").await;
let config = format!(
r#"server:
listenAddress: ":80"
exposedAddress: "{public_origin}"
stunPorts:
- 3478
metricsPort: 9090
healthcheckAddress: ":9000"
logLevel: "info"
logFile: "console"
authSecret: "{relay_secret}"
dataDir: "/var/lib/netbird"
auth:
issuer: "{public_origin}/oauth2"
localAuthDisabled: false
signKeyRefreshEnabled: false
dashboardRedirectURIs:
{dashboard_redirect_uris}
dashboardPostLogoutRedirectURIs:
{dashboard_logout_uris}
cliRedirectURIs:
- "http://localhost:53000/"
store:
engine: "sqlite"
encryptionKey: "{encryption_key}"
"#
);
tokio::fs::write("/var/lib/archipelago/netbird/config.yaml", config)
.await
.context("Failed to write NetBird config.yaml")?;
let dashboard_env = format!(
r#"NETBIRD_MGMT_API_ENDPOINT={public_origin}
NETBIRD_MGMT_GRPC_API_ENDPOINT={public_origin}
AUTH_AUDIENCE=netbird-dashboard
AUTH_CLIENT_ID=netbird-dashboard
AUTH_CLIENT_SECRET=
AUTH_AUTHORITY={public_origin}/oauth2
USE_AUTH0=false
AUTH_SUPPORTED_SCOPES=openid profile email groups
AUTH_REDIRECT_URI=/nb-auth
AUTH_SILENT_REDIRECT_URI=/nb-silent-auth
NETBIRD_TOKEN_SOURCE=idToken
NGINX_SSL_PORT=443
LETSENCRYPT_DOMAIN=none
"#
);
tokio::fs::write("/var/lib/archipelago/netbird/dashboard.env", dashboard_env)
.await
.context("Failed to write NetBird dashboard.env")?;
let nginx_conf = format!(
r#"server {{
listen 443 ssl;
server_name _;
# netbird's dashboard needs a secure context (window.crypto.subtle for OIDC
# PKCE), so the proxy terminates TLS with a self-signed cert (issue #15).
ssl_certificate /etc/nginx/tls.crt;
ssl_certificate_key /etc/nginx/tls.key;
# Rootless Podman can hand a container a new IP across restarts/reboots.
# nginx resolves a literal upstream name ONCE at startup and caches it, so
# after the IP moves every request 502s with "host unreachable" (issue #15,
# observed live on .198: nginx pinned to a dead netbird-dashboard IP). Fix:
# point `resolver` at the netbird-net gateway (Podman's aardvark DNS) and
# use VARIABLE upstreams, which forces nginx to re-resolve the container
# names at request time. Everything is reached container-to-container by
# name so nothing depends on host-published ports either.
resolver {resolver_ip} valid=10s ipv6=off;
proxy_set_header Host $host;
proxy_set_header X-Real-IP $remote_addr;
proxy_set_header X-Forwarded-For $proxy_add_x_forwarded_for;
proxy_set_header X-Forwarded-Proto $scheme;
proxy_http_version 1.1;
location ~ ^/(relay|ws-proxy/) {{
set $nb_server netbird-server;
proxy_pass http://$nb_server:80;
proxy_set_header Upgrade $http_upgrade;
proxy_set_header Connection "upgrade";
proxy_read_timeout 1d;
}}
location ~ ^/(api|oauth2)(/|$) {{
# The dashboard is a SPA whose API/OIDC base URL is baked at build time
# to one host:port. A single box is reached via several addresses (LAN
# IP, Tailscale 100.x, hostname), so those fetches are cross-origin and
# the browser blocks them with no Access-Control-Allow-Origin (issue
# #15, observed live on .198). Reflect the caller's Origin so the
# self-hosted management/OIDC API is reachable from any of them, and
# answer the CORS preflight here.
if ($request_method = OPTIONS) {{
add_header Access-Control-Allow-Origin $http_origin always;
add_header Access-Control-Allow-Credentials true always;
add_header Access-Control-Allow-Methods "GET, POST, PUT, PATCH, DELETE, OPTIONS" always;
add_header Access-Control-Allow-Headers "Authorization, Content-Type, Accept" always;
add_header Access-Control-Max-Age 86400 always;
add_header Content-Length 0;
return 204;
}}
add_header Access-Control-Allow-Origin $http_origin always;
add_header Access-Control-Allow-Credentials true always;
add_header Access-Control-Allow-Methods "GET, POST, PUT, PATCH, DELETE, OPTIONS" always;
add_header Access-Control-Allow-Headers "Authorization, Content-Type, Accept" always;
set $nb_server netbird-server;
proxy_pass http://$nb_server:80;
}}
location ~ ^/(signalexchange\.SignalExchange|management\.ManagementService|management\.ProxyService)/ {{
set $nb_server netbird-server;
grpc_pass grpc://$nb_server:80;
grpc_read_timeout 1d;
grpc_send_timeout 1d;
}}
# OIDC callback routes are client-side SPA routes with NO prebuilt page in
# the dashboard bundle, so proxying them straight through 404s which
# crashes the dashboard's auth init and shows "Unauthenticated" with dead
# buttons (issue #15, confirmed live on .198: /nb-auth + /nb-silent-auth
# returned 404). Serve the dashboard's index.html at these paths (URL
# unchanged) so react-oidc boots and completes the login / silent-SSO.
location ~ ^/(nb-auth|nb-silent-auth) {{
set $nb_dashboard netbird-dashboard;
rewrite ^.*$ /index.html break;
proxy_pass http://$nb_dashboard:80;
}}
location / {{
set $nb_dashboard netbird-dashboard;
proxy_pass http://$nb_dashboard:80;
}}
}}
# Direct server remains available for diagnostics at {server_origin}.
"#
);
tokio::fs::write("/var/lib/archipelago/netbird/nginx.conf", nginx_conf)
.await
.context("Failed to write NetBird nginx.conf")?;
Ok(())
}
async fn detect_netbird_public_host_ip() -> Option<String> {
let output = tokio::process::Command::new("hostname")
.args(["-I"])
.output()
.await
.ok()?;
let stdout = String::from_utf8_lossy(&output.stdout);
let ips: Vec<&str> = stdout
.split_whitespace()
.filter(|s| s.contains('.'))
.collect();
// Prefer the LAN address as the canonical origin — that's what users browse
// to on the local network. Baking the Tailscale 100.x address here broke
// LAN access with cross-origin/redirect mismatches (issue #15). Tailscale
// (100.64.0.0/10 CGNAT) is only a fallback for nodes with no LAN IP.
let is_private_lan = |ip: &str| {
ip.starts_with("192.168.")
|| ip.starts_with("10.")
|| (ip.starts_with("172.")
&& ip
.split('.')
.nth(1)
.and_then(|o| o.parse::<u8>().ok())
.map(|o| (16..=31).contains(&o))
.unwrap_or(false))
};
if let Some(lan) = ips.iter().find(|ip| is_private_lan(ip)) {
return Some(lan.to_string());
}
ips.iter()
.find(|ip| ip.starts_with("100."))
.map(|s| s.to_string())
}
#[cfg(test)]
mod tests {
use super::{btcpay_stack_app_ids, mempool_stack_app_ids};

View File

@ -1,12 +1,34 @@
use super::RpcHandler;
use crate::wallet::{ecash, profits};
use crate::wallet::{ecash, fedimint_client, profits};
use anyhow::Result;
/// A Cashu token (NUT-00 `cashuA`/`cashuB`, or our legacy `cashuSend_` form)
/// always starts with `cashu`. Fedimint ecash notes never do, so a non-`cashu`
/// string is routed to the Fedimint reissue path.
fn is_cashu_token(token: &str) -> bool {
token.trim_start().starts_with("cashu")
}
impl RpcHandler {
pub(super) async fn handle_wallet_ecash_balance(&self) -> Result<serde_json::Value> {
let wallet = ecash::load_wallet(&self.config.data_dir).await?;
let cashu_sats = wallet.balance();
// Spendable Fedimint balance too, so callers (e.g. the pay-for-file
// pre-check) see funds available across BOTH backends (#3). Best-effort:
// if fmcd isn't installed/joined this is just 0, never an error.
let fedimint_sats = match fedimint_client::FedimintClient::from_node(&self.config.data_dir)
.await
{
Ok(client) => client.total_balance_sats().await.unwrap_or(0),
Err(_) => 0,
};
Ok(serde_json::json!({
"balance_sats": wallet.balance(),
// `balance_sats` stays Cashu-only for back-compat; `total_sats` is the
// spendable amount across Cashu + Fedimint.
"balance_sats": cashu_sats,
"cashu_sats": cashu_sats,
"fedimint_sats": fedimint_sats,
"total_sats": cashu_sats + fedimint_sats,
"proof_count": wallet.proofs.iter().filter(|p| !p.spent && !p.reserved).count(),
"mint_url": wallet.mint_url,
}))
@ -129,18 +151,42 @@ impl RpcHandler {
let token = params
.get("token")
.and_then(|v| v.as_str())
.map(str::trim)
.filter(|s| !s.is_empty())
.ok_or_else(|| anyhow::anyhow!("Missing token"))?;
let amount = ecash::receive_token(&self.config.data_dir, token).await?;
// Dual-ecash: one "Receive ecash" box accepts either a Cashu token
// (redeemed at the mint) or Fedimint notes (reissued via the fmcd
// sidecar). Detect by prefix and route accordingly.
if is_cashu_token(token) {
let amount = ecash::receive_token(&self.config.data_dir, token).await?;
return Ok(serde_json::json!({
"received_sats": amount,
"kind": "cashu",
}));
}
let (amount, federation_id) =
fedimint_client::reissue_into_any(&self.config.data_dir, token).await?;
Ok(serde_json::json!({
"received_sats": amount,
"kind": "fedimint",
"federation_id": federation_id,
}))
}
pub(super) async fn handle_wallet_ecash_history(&self) -> Result<serde_json::Value> {
// Unified history: Cashu transactions (tagged kind="cashu") + the local
// Fedimint transaction log (kind="fedimint"), newest first. Previously
// only Cashu was returned, so a Fedimint receive showed up nowhere.
let wallet = ecash::load_wallet(&self.config.data_dir).await?;
let mut transactions = wallet.transactions;
transactions.extend(fedimint_client::load_fedimint_txs(&self.config.data_dir).await);
// Sort by RFC-3339 timestamp descending (string compare is valid for
// same-offset RFC-3339), newest first.
transactions.sort_by(|a, b| b.timestamp.cmp(&a.timestamp));
Ok(serde_json::json!({
"transactions": wallet.transactions,
"transactions": transactions,
}))
}

View File

@ -13,14 +13,32 @@ use std::time::{Duration, SystemTime, UNIX_EPOCH};
use tokio::sync::RwLock;
use tracing::{debug, warn};
const CACHE_REFRESH_SECS: u64 = 10;
const CACHE_ERROR_BACKOFF_SECS: u64 = 15;
// Poll frequently and recover fast so the cached snapshot tracks bitcoind's
// responsive windows during IBD. During heavy block-connection, getblockchaininfo
// can block briefly; a slow 10s/15s/20s cadence let one missed poll age the
// snapshot past the UI's 30s "stale" threshold, so the UI dwelled on
// "reconnecting…" long after bitcoind was answering again. Tight cadence + short
// timeout keeps last-known state fresh and clears the stale banner promptly.
const CACHE_REFRESH_SECS: u64 = 5;
const CACHE_ERROR_BACKOFF_SECS: u64 = 5;
// Grace window before a failing poll marks the snapshot "stale" for the UI.
// On a busy / swap-thrashing node (e.g. .198) getblockchaininfo intermittently
// exceeds the RPC timeout, so a single missed poll is normal and must NOT flip
// the UI to "reconnecting…". Only after the cached snapshot is genuinely old —
// several polls failed in a row — do we surface the banner.
const STALE_GRACE_MS: u64 = 20_000;
#[derive(Debug, Clone, Serialize)]
pub struct BitcoinNodeStatus {
pub ok: bool,
pub stale: bool,
pub updated_at_ms: u64,
// Server-computed age of the snapshot, filled in at serve time. The browser
// must not derive this itself (Date.now() - updated_at_ms) because that
// compares the browser clock against this node's clock — any skew made a
// fresh snapshot look stale and the "reconnecting…" banner never cleared.
pub age_ms: u64,
pub error: Option<String>,
pub blockchain_info: Option<serde_json::Value>,
pub network_info: Option<serde_json::Value>,
@ -34,6 +52,7 @@ impl Default for BitcoinNodeStatus {
ok: false,
stale: false,
updated_at_ms: 0,
age_ms: 0,
error: Some("Connecting to Bitcoin node...".to_string()),
blockchain_info: None,
network_info: None,
@ -122,7 +141,11 @@ pub fn spawn_status_cache() {
if cached.blockchain_info.is_some() {
cached.ok = false;
cached.stale = true;
// Only flip to "stale" once the last good snapshot is older
// than the grace window. A brief RPC gap on a busy node keeps
// showing last-known state silently instead of a banner flicker.
let snapshot_age_ms = now_ms().saturating_sub(cached.updated_at_ms);
cached.stale = snapshot_age_ms > STALE_GRACE_MS;
cached.error = Some(friendly_transient_error(true, &err_msg));
} else {
*cached = BitcoinNodeStatus {
@ -142,40 +165,46 @@ pub fn spawn_status_cache() {
}
pub async fn get_bitcoin_status() -> BitcoinNodeStatus {
cache().read().await.clone()
let mut status = cache().read().await.clone();
// Compute age here (server clock only) so the browser never has to subtract
// across clocks. A successful snapshot serves age_ms ≈ 0 → the UI clears the
// "reconnecting…" banner on its very next poll regardless of browser-clock skew.
if status.updated_at_ms > 0 {
status.age_ms = now_ms().saturating_sub(status.updated_at_ms);
}
status
}
async fn fetch_bitcoin_status() -> Result<BitcoinNodeStatus> {
// 12s (not 8s): on a swap-thrashing node getblockchaininfo can answer slowly
// but correctly; too tight a timeout turned working-but-slow polls into
// failures and tripped the "reconnecting…" banner. Stays under STALE_GRACE_MS.
let client = reqwest::Client::builder()
.timeout(Duration::from_secs(20))
.timeout(Duration::from_secs(12))
.build()
.context("build Bitcoin status HTTP client")?;
let blockchain_info = bitcoin_rpc_call(&client, "getblockchaininfo", serde_json::json!([]))
.await
.context("getblockchaininfo")?;
let network_info = bitcoin_rpc_call(&client, "getnetworkinfo", serde_json::json!([]))
.await
.context("getnetworkinfo")
.ok();
let index_info = bitcoin_rpc_call(&client, "getindexinfo", serde_json::json!([]))
.await
.context("getindexinfo")
.ok();
let zmq_notifications = bitcoin_rpc_call(&client, "getzmqnotifications", serde_json::json!([]))
.await
.context("getzmqnotifications")
.ok();
// Fetch all four calls concurrently: getblockchaininfo gates freshness, so a
// slow auxiliary call (network/index/zmq) must not delay the snapshot or block
// the next refresh. Only getblockchaininfo failing marks the status stale.
let (blockchain_info, network_info, index_info, zmq_notifications) = tokio::join!(
bitcoin_rpc_call(&client, "getblockchaininfo", serde_json::json!([])),
bitcoin_rpc_call(&client, "getnetworkinfo", serde_json::json!([])),
bitcoin_rpc_call(&client, "getindexinfo", serde_json::json!([])),
bitcoin_rpc_call(&client, "getzmqnotifications", serde_json::json!([])),
);
let blockchain_info = blockchain_info.context("getblockchaininfo")?;
Ok(BitcoinNodeStatus {
ok: true,
stale: false,
updated_at_ms: now_ms(),
age_ms: 0,
error: None,
blockchain_info: Some(blockchain_info),
network_info,
index_info,
zmq_notifications,
network_info: network_info.ok(),
index_info: index_info.ok(),
zmq_notifications: zmq_notifications.ok(),
})
}

View File

@ -66,7 +66,7 @@ pub struct Config {
/// through Quadlet (`.container` units in ~/.config/containers/systemd
/// + systemctl --user start) instead of `podman create + start`. Default
/// off so the legacy path stays the production path until the harness
/// at tests/lifecycle/run-20x.sh has gone green against the new path
/// at tests/lifecycle/run-gate.sh has gone green against the new path
/// on .228 + .198. See `project_v1_7_52_phase3_quadlet_design`.
#[serde(default)]
pub use_quadlet_backends: bool,
@ -487,7 +487,7 @@ mod tests {
#[test]
fn test_config_use_quadlet_backends_defaults_off() {
// Phase 3.2 of v1.7.52 — the new path stays gated until the 20×
// Phase 3.2 of v1.7.52 — the new path stays gated until the 5×
// harness goes green on .228 and .198. Flipping this default
// ahead of that would route every backend install through code
// we haven't fleet-validated yet.

View File

@ -86,6 +86,15 @@ pub struct AppCatalogEntry {
/// Optional human-readable changelog lines for this version.
#[serde(default, skip_serializing_if = "Vec::is_empty")]
pub changelog: Vec<String>,
/// Full app manifest, embedded so the app installs from the registry alone —
/// no OTA-shipped `apps/<id>/manifest.yml`. Carried as the raw value the
/// publisher signed (so it stays part of the verified preimage) and
/// deserialized into an `AppManifest` by the orchestrator at load time, where
/// it overrides the disk manifest (origin-wins). Absent during the migration
/// window => the node falls back to the disk manifest. See
/// `docs/registry-manifest-design.md`.
#[serde(default, skip_serializing_if = "Option::is_none")]
pub manifest: Option<serde_json::Value>,
}
/// Read-side cache file search order. Mirrors `image_versions.rs`: the running
@ -166,6 +175,18 @@ pub fn catalog_stack_images(app_id: &str) -> HashMap<String, String> {
entry_for(app_id).and_then(|e| e.images).unwrap_or_default()
}
/// All `(app_id, manifest-value)` pairs the registry catalog carries. The
/// orchestrator deserializes + validates each into an `AppManifest` and prefers
/// it over the disk manifest (origin-wins); disk remains the migration fallback.
/// Empty when the catalog is absent or no entry embeds a manifest.
pub fn catalog_manifest_values() -> Vec<(String, serde_json::Value)> {
load_catalog()
.apps
.into_iter()
.filter_map(|(id, e)| e.manifest.map(|m| (id, m)))
.collect()
}
/// Image override for the orchestrator's install/upgrade path. Returns the
/// catalog's primary image for `app_id` ONLY when it refers to the same
/// repository as the manifest's current image — a guard so a catalog typo can
@ -346,6 +367,30 @@ mod tests {
assert_eq!(e.digest.as_deref(), Some("blake3:deadbeef"));
}
#[test]
fn entry_carries_embedded_manifest() {
let json = r#"{
"schema": 1,
"apps": {
"demo": {
"version": "1.0.0",
"manifest": {
"app": {
"id": "demo",
"name": "Demo",
"version": "1.0.0",
"container": { "image": "registry/demo:1.0.0" }
}
}
}
}
}"#;
let cat: AppCatalog = serde_json::from_str(json).unwrap();
let e = cat.apps.get("demo").unwrap();
let m = e.manifest.as_ref().expect("manifest present");
assert_eq!(m["app"]["id"], "demo");
}
#[test]
fn empty_catalog_when_absent_is_default() {
let cat = AppCatalog::default();

View File

@ -96,6 +96,35 @@ impl BootReconciler {
}
}
// Companion self-heal runs on its OWN cadence, decoupled from the
// per-app reconcile pass. On a heavily loaded node `reconcile_existing`
// over dozens of apps can take well over a minute, which would delay a
// companion-unit repair (deleted/lost unit file) past any reasonable
// safety window. Detecting + rewriting a companion unit is cheap, so it
// gets a dedicated `interval` loop. The handle is aborted when the main
// loop exits (shutdown uses `notify_one`, so we must NOT add a second
// waiter on `self.shutdown` — it would steal the single wake permit).
let companion_handle = if self.companion_stage {
let orchestrator = self.orchestrator.clone();
let interval = self.interval;
Some(tokio::spawn(async move {
loop {
let installed = orchestrator.manifest_ids().await;
for (companion, err) in crate::container::companion::reconcile(&installed).await
{
tracing::warn!(
companion = %companion,
error = %err,
"companion reconcile failed"
);
}
time::sleep(interval).await;
}
}))
} else {
None
};
// Initial pass: no delay.
self.tick().await;
@ -111,23 +140,15 @@ impl BootReconciler {
}
}
}
if let Some(handle) = companion_handle {
handle.abort();
}
}
async fn tick(&self) {
let report = self.orchestrator.reconcile_existing().await;
Self::log_report(&report);
if !self.companion_stage {
return;
}
let installed = self.orchestrator.manifest_ids().await;
for (companion, err) in crate::container::companion::reconcile(&installed).await {
tracing::warn!(
companion = %companion,
error = %err,
"companion reconcile failed"
);
}
}
fn log_report(report: &ReconcileReport) {

View File

@ -102,8 +102,15 @@ const LND_UI: &[CompanionSpec] = &[CompanionSpec {
],
pre_start: None,
bind_mounts: &[],
ports: &[(18083, 80)],
host_network: false,
// Host networking so the app's own nginx can proxy the archipelago backend
// same-origin (127.0.0.1:5678), exactly like fips-ui / electrs-ui. The
// previous bridge + 18083→80 mapping forced the browser to fetch the
// backend cross-origin from the app's port, which depended on the host
// nginx route + a CORS Origin/Host match and broke on http-only nodes
// (e.g. .116: blank fields, QR "failed to fetch"). The app's nginx now
// listens on 18083 directly (NOT 80 — that would collide with host nginx).
ports: &[],
host_network: true,
}];
const ELECTRS_UI: &[CompanionSpec] = &[CompanionSpec {
@ -214,13 +221,26 @@ async fn ensure_image_present(spec: &CompanionSpec) -> Result<String> {
for dir in spec.build_dir_candidates {
let dockerfile = PathBuf::from(dir).join("Dockerfile");
if fs::try_exists(&dockerfile).await.unwrap_or(false) {
// `:local` is a deliberate manual override — never auto-rebuild it.
if image_exists(&local_image_compat).await {
return Ok(local_image_compat);
}
// Reuse the auto-built `:latest` only when the build context has NOT
// changed since it was built. Without this staleness check an
// already-present image is reused forever, so edits to the baked-in
// context (Dockerfile, nginx.conf, …) never reach the node — this is
// exactly why the guardian-CSS nginx fix never reached the fleet.
if image_exists(&local_image).await {
return Ok(local_image);
if !context_is_newer_than_image(dir, &local_image).await {
return Ok(local_image);
}
info!(
companion = spec.name,
"build context changed since image built; rebuilding {dir}"
);
} else {
info!(companion = spec.name, "building locally from {dir}");
}
info!(companion = spec.name, "building locally from {dir}");
let out = command_output_with_timeout(
Command::new("podman").args(["build", "-t", &local_image, dir]),
COMPANION_BUILD_TIMEOUT,
@ -265,7 +285,15 @@ async fn ensure_image_present(spec: &CompanionSpec) -> Result<String> {
async fn image_exists(image: &str) -> bool {
let mut cmd = Command::new("podman");
cmd.args(["image", "inspect", image]);
// Only the exit status matters. WITHOUT a `--format`, `podman image inspect`
// prints the image's full multi-KB manifest JSON; `.status()` inherits the
// service's stdout, so on a hit that whole blob lands in the journal — once
// per companion image, every reconcile pass. That flood spikes journald +
// IO and starves the async runtime (UI websocket then drops → "connection
// lost"/reconnect). Discard the child's stdout/stderr; we read neither.
cmd.args(["image", "inspect", image])
.stdout(std::process::Stdio::null())
.stderr(std::process::Stdio::null());
match tokio::time::timeout(COMPANION_IMAGE_CHECK_TIMEOUT, cmd.status()).await {
Ok(Ok(status)) => status.success(),
Ok(Err(err)) => {
@ -279,6 +307,73 @@ async fn image_exists(image: &str) -> bool {
}
}
/// Returns true if any file in the build context `dir` is newer than the
/// already-built `image`, signalling the cached image is stale and must be
/// rebuilt. Conservative: if either timestamp can't be determined we return
/// false (reuse the cache) to avoid rebuild storms on every reconcile pass.
async fn context_is_newer_than_image(dir: &str, image: &str) -> bool {
let image_created = match image_created_unix(image).await {
Some(t) => t,
None => return false,
};
match newest_mtime_unix(PathBuf::from(dir)).await {
Some(ctx) => ctx > image_created,
None => false,
}
}
/// Build timestamp of `image` as Unix seconds, via `podman image inspect`.
async fn image_created_unix(image: &str) -> Option<i64> {
let mut cmd = Command::new("podman");
cmd.args(["image", "inspect", "--format", "{{.Created.Unix}}", image]);
let out = command_output_with_timeout(
&mut cmd,
COMPANION_IMAGE_CHECK_TIMEOUT,
"podman image created time",
)
.await
.ok()?;
if !out.status.success() {
return None;
}
String::from_utf8_lossy(&out.stdout).trim().parse::<i64>().ok()
}
/// Newest modification time (Unix seconds) across all files under `dir`,
/// walked recursively. Runs on a blocking thread since it touches the fs.
async fn newest_mtime_unix(dir: PathBuf) -> Option<i64> {
tokio::task::spawn_blocking(move || newest_mtime_blocking(&dir))
.await
.ok()
.flatten()
}
fn newest_mtime_blocking(dir: &std::path::Path) -> Option<i64> {
let mut newest: Option<i64> = None;
let mut stack = vec![dir.to_path_buf()];
while let Some(p) = stack.pop() {
let entries = match std::fs::read_dir(&p) {
Ok(e) => e,
Err(_) => continue,
};
for entry in entries.flatten() {
let meta = match entry.metadata() {
Ok(m) => m,
Err(_) => continue,
};
if meta.is_dir() {
stack.push(entry.path());
} else if let Ok(modified) = meta.modified() {
if let Ok(dur) = modified.duration_since(std::time::UNIX_EPOCH) {
let secs = dur.as_secs() as i64;
newest = Some(newest.map_or(secs, |n| n.max(secs)));
}
}
}
}
newest
}
async fn command_output_with_timeout(
cmd: &mut Command,
timeout: Duration,
@ -439,12 +534,15 @@ mod tests {
}
#[test]
fn lnd_ui_uses_port_mapping_not_host_port_80() {
fn lnd_ui_uses_host_network_for_same_origin_backend_proxy() {
// lnd-ui is host-networked (its nginx listens on 18083 directly) so the
// app can proxy the archipelago backend same-origin instead of fetching
// it cross-origin from its app port — see the spec comment for why.
let spec = &LND_UI[0];
let u = build_unit(spec, "localhost/lnd-ui:latest");
assert_eq!(u.name, "archy-lnd-ui");
assert!(matches!(u.network, NetworkMode::Bridge(ref n) if n == "bridge"));
assert_eq!(u.ports, vec![(18083, 80, "tcp".into())]);
assert!(matches!(u.network, NetworkMode::Host));
assert!(u.ports.is_empty());
}
#[test]

View File

@ -365,6 +365,13 @@ fn get_app_metadata(app_id: &str) -> AppMetadata {
repo: "https://github.com/fedimint/fedimint".to_string(),
tier: "",
},
"fedimint-clientd" | "fmcd" => AppMetadata {
title: "Fedimint Client".to_string(),
description: "Fedimint ecash client daemon (fmcd) — lets your node hold Fedimint ecash and join federations".to_string(),
icon: "/assets/img/app-icons/fedimint.png".to_string(),
repo: "https://github.com/minmoto/fmcd".to_string(),
tier: "",
},
"morphos" | "morphos-server" => AppMetadata {
title: "Morphos".to_string(),
description: "Self-hosted file converter".to_string(),
@ -684,16 +691,37 @@ fn extract_lan_address(ports: &[String]) -> Option<String> {
None
}
/// netbird's dashboard launch URL: HTTPS on 8087 (the proxy terminates TLS —
/// the dashboard needs a secure context for OIDC PKCE, issue #15) at the node's
/// primary host IP so it's reachable from the LAN. Manifest-driven netbird no
/// longer writes `dashboard.env`, so this is derived from host facts (the same
/// `{{HOST_IP}}` the orchestrator bakes into the cert/config); it falls back to
/// the static localhost mapping when the host IP can't be read. URL shape is
/// identical to the legacy installer's, so the existing https reachability
/// wrapper still applies.
async fn netbird_configured_launch_url() -> Option<String> {
let env = tokio::fs::read_to_string("/var/lib/archipelago/netbird/dashboard.env")
if let Some(ip) = first_host_ip().await {
return Some(format!("https://{ip}:8087"));
}
PodmanClient::lan_address_for("netbird")
}
/// First address from `hostname -I` — the node's primary host IP. Mirrors the
/// orchestrator's `detect_host_ip` so launch URLs match the cert/config the
/// orchestrator renders for `{{HOST_IP}}`.
async fn first_host_ip() -> Option<String> {
let out = tokio::process::Command::new("hostname")
.arg("-I")
.output()
.await
.ok()?;
env.lines()
.find_map(|line| line.strip_prefix("NETBIRD_MGMT_API_ENDPOINT="))
.map(str::trim)
.filter(|s| !s.is_empty())
if !out.status.success() {
return None;
}
String::from_utf8_lossy(&out.stdout)
.split_whitespace()
.next()
.map(ToOwned::to_owned)
.or_else(|| PodmanClient::lan_address_for("netbird"))
}
async fn reachable_lan_address(app_id: &str, candidate: Option<String>) -> Option<String> {

View File

@ -0,0 +1,203 @@
//! Manifest-driven lifecycle hook executor (Task #20).
//!
//! Runs an app's declarative `post_install` hooks against its **own** running
//! container. Hooks are an allowlisted, reviewed escape hatch — NOT arbitrary
//! host scripts:
//!
//! - `exec` runs *inside the container* (`podman exec`), never on the host, and
//! inherits the container's (already dropped) capabilities.
//! - `copy_from_host.src` is resolved against an allowlist root, canonicalised,
//! and rejected on any escape; only then is it `podman cp`'d into the container.
//! - Execution is **best-effort + idempotent**: each step is logged, a failure is
//! warned and the remaining steps still run, so a transient hook error never
//! bricks an install. Authors must make steps safe to re-run (e.g. `grep -q … ||`).
//!
//! See `docs/manifest-hooks-design.md`.
use std::path::{Path, PathBuf};
use std::time::Duration;
use anyhow::{bail, Result};
use archipelago_container::{AppManifest, HookStep};
/// Upper bound on a single hook command. Generous — config rewrites + nginx
/// reloads are fast, but an image with a hung entrypoint shouldn't wedge install.
const HOOK_TIMEOUT: Duration = Duration::from_secs(60);
/// Roots a `copy_from_host.src` may resolve within. A src is joined onto each
/// root, canonicalised, and accepted only if it stays inside that root:
/// - the app's own data dir (`<data_dir>/<app_id>`), and
/// - `/opt/archipelago` (covers the orchestrator's bundled `web-ui/` assets,
/// e.g. indeedhub's `web-ui/nostr-provider.js`).
fn allowlist_roots(app_id: &str, data_dir: &Path) -> Vec<PathBuf> {
vec![data_dir.join(app_id), PathBuf::from("/opt/archipelago")]
}
/// Resolve a hook copy source against the allowlist. Returns the canonical
/// absolute path iff it exists and lies within an allowlist root. Defence in
/// depth: `AppManifest::validate` already rejects absolute / `..` srcs, but we
/// re-check here and canonicalise so a symlink inside a root can't escape it.
fn resolve_copy_src(src: &str, app_id: &str, data_dir: &Path) -> Result<PathBuf> {
if src.is_empty() || src.starts_with('/') || src.contains("..") {
bail!("hook copy src '{src}' is not an allowlisted relative path");
}
for root in allowlist_roots(app_id, data_dir) {
let Ok(root_canon) = root.canonicalize() else {
continue;
};
let Ok(canon) = root.join(src).canonicalize() else {
continue;
};
if canon.starts_with(&root_canon) {
return Ok(canon);
}
}
bail!("hook copy src '{src}' did not resolve inside an allowlist root")
}
/// Run an app's declarative `post_install` hooks against its running container.
/// Best-effort: never returns an error — a failed step is warned and skipped.
/// Called from the install path after the container is created + running, and
/// only when a fresh container was created (see `install_fresh`).
pub async fn run_post_install(manifest: &AppManifest, container_name: &str, data_dir: &Path) {
let steps = &manifest.app.hooks.post_install;
if steps.is_empty() {
return;
}
let app_id = &manifest.app.id;
tracing::info!(
app_id = %app_id,
container = %container_name,
steps = steps.len(),
"running manifest post_install hooks"
);
for (i, step) in steps.iter().enumerate() {
match run_step(step, container_name, app_id, data_dir).await {
Ok(()) => tracing::debug!(app_id = %app_id, step = i, "post_install hook step ok"),
Err(err) => tracing::warn!(
app_id = %app_id,
container = %container_name,
step = i,
error = %err,
"post_install hook step failed (continuing best-effort)"
),
}
}
}
async fn run_step(
step: &HookStep,
container: &str,
app_id: &str,
data_dir: &Path,
) -> Result<()> {
match step {
HookStep::Exec { exec } => {
let mut args: Vec<&str> = Vec::with_capacity(exec.len() + 2);
args.push("exec");
args.push(container);
args.extend(exec.iter().map(String::as_str));
// `exec` spawns a process INSIDE the container's cgroup. When the
// container was started by archipelago.service, that cgroup is under
// the service's slice and a bare `podman exec` from the service can't
// write its `cgroup.procs` ("crun: ... Permission denied / OCI
// permission denied"). Run it in a transient user scope (its own
// delegated cgroup) — mirrors `podman_user_scope` for pasta starts.
run_podman(&args, /* scoped */ true).await
}
HookStep::CopyFromHost { copy_from_host } => {
let abs = resolve_copy_src(&copy_from_host.src, app_id, data_dir)?;
let abs = abs.to_string_lossy().into_owned();
let dest = format!("{container}:{}", copy_from_host.dest);
// `cp` is a host-side copy (no in-container process), so no scope needed.
run_podman(&["cp", &abs, &dest], /* scoped */ false).await
}
}
}
/// Run a podman command, optionally inside a transient systemd user scope. The
/// scope gives the invocation its own delegated cgroup so `podman exec` can
/// place its child process — without it, an exec launched from the service's
/// own cgroup is denied write to the container's `cgroup.procs`.
async fn run_podman(args: &[&str], scoped: bool) -> Result<()> {
let rendered = args.join(" ");
let mut cmd = if scoped {
let mut c = tokio::process::Command::new("systemd-run");
c.args(["--user", "--scope", "--quiet", "--collect", "podman"]);
c.args(args);
c
} else {
let mut c = tokio::process::Command::new("podman");
c.args(args);
c
};
let out = tokio::time::timeout(HOOK_TIMEOUT, cmd.output())
.await
.map_err(|_| anyhow::anyhow!("podman {rendered} timed out after {:?}", HOOK_TIMEOUT))?
.map_err(|e| anyhow::anyhow!("podman {rendered}: {e}"))?;
if !out.status.success() {
bail!(
"podman {rendered} exited {}: {}",
out.status,
String::from_utf8_lossy(&out.stderr).trim()
);
}
Ok(())
}
#[cfg(test)]
mod tests {
use super::*;
#[test]
fn resolve_copy_src_accepts_file_in_app_data_dir() {
let tmp = tempfile::tempdir().unwrap();
let data_dir = tmp.path();
let app_dir = data_dir.join("myapp/web-ui");
std::fs::create_dir_all(&app_dir).unwrap();
std::fs::write(app_dir.join("provider.js"), b"x").unwrap();
let got = resolve_copy_src("web-ui/provider.js", "myapp", data_dir).unwrap();
assert!(got.ends_with("myapp/web-ui/provider.js"));
assert!(got.is_absolute());
}
#[test]
fn resolve_copy_src_rejects_absolute() {
let tmp = tempfile::tempdir().unwrap();
assert!(resolve_copy_src("/etc/passwd", "myapp", tmp.path()).is_err());
}
#[test]
fn resolve_copy_src_rejects_traversal() {
let tmp = tempfile::tempdir().unwrap();
assert!(resolve_copy_src("web-ui/../../etc/shadow", "myapp", tmp.path()).is_err());
}
#[test]
fn resolve_copy_src_rejects_missing_file() {
// Inside the allowlist shape but the file doesn't exist → canonicalize fails.
let tmp = tempfile::tempdir().unwrap();
std::fs::create_dir_all(tmp.path().join("myapp")).unwrap();
assert!(resolve_copy_src("nope.js", "myapp", tmp.path()).is_err());
}
#[test]
fn resolve_copy_src_rejects_symlink_escape() {
// A symlink inside the app dir pointing outside it must be rejected by
// the post-canonicalisation prefix check.
let tmp = tempfile::tempdir().unwrap();
let app_dir = tmp.path().join("myapp");
std::fs::create_dir_all(&app_dir).unwrap();
let secret = tmp.path().join("secret.txt");
std::fs::write(&secret, b"s").unwrap();
let link = app_dir.join("link.js");
if std::os::unix::fs::symlink(&secret, &link).is_ok() {
// `secret.txt` lives in the tmp root, NOT under <data_dir>/myapp, so
// the canonical target escapes the app-data root. It also isn't under
// /opt/archipelago. Must be rejected.
assert!(resolve_copy_src("link.js", "myapp", tmp.path()).is_err());
}
}
}

View File

@ -6,11 +6,13 @@ pub mod data_manager;
pub mod dev_orchestrator;
pub mod docker_packages;
pub mod filebrowser;
pub mod hooks;
pub mod image_versions;
pub mod lnd;
pub mod prod_orchestrator;
pub mod quadlet;
pub mod registry;
pub mod secrets;
pub mod traits;
pub use boot_reconciler::{BootReconciler, DEFAULT_INTERVAL as RECONCILER_DEFAULT_INTERVAL};

File diff suppressed because it is too large Load Diff

View File

@ -227,13 +227,20 @@ impl QuadletUnit {
mode
);
}
for (host, container, proto) in &self.ports {
let p = if proto.is_empty() {
"tcp"
} else {
proto.as_str()
};
let _ = writeln!(s, "PublishPort={host}:{container}/{p}");
// Host networking exposes the container's ports on the host directly.
// Podman rejects PublishPort combined with Network=host ("published
// ports cannot be used with host network") and the unit crash-loops
// (exit 125). Skip publishing in host mode — matches the NetworkMode
// doc note that Podman discards port mappings under host networking.
if !matches!(self.network, NetworkMode::Host) {
for (host, container, proto) in &self.ports {
let p = if proto.is_empty() {
"tcp"
} else {
proto.as_str()
};
let _ = writeln!(s, "PublishPort={host}:{container}/{p}");
}
}
for env in &self.environment {
// env entries already arrive shaped as "KEY=VALUE"; quadlet
@ -403,7 +410,18 @@ impl QuadletUnit {
environment: app.environment.clone(),
devices: app.devices.clone(),
add_hosts: vec![("host.archipelago".into(), "10.89.0.1".into())],
network_aliases: vec![name.to_string()],
// Container always answers to its own name; manifest extras add the
// short hostnames peers bake in (e.g. indeedhub api/minio/relay).
// Only emitted for Bridge networks (slirp/pasta reject aliases).
network_aliases: {
let mut a = vec![name.to_string()];
for extra in &app.container.network_aliases {
if !a.iter().any(|x| x == extra) {
a.push(extra.clone());
}
}
a
},
entrypoint: app.container.entrypoint.clone(),
command: app.container.custom_args.clone(),
read_only_root: app.security.readonly_root,
@ -563,11 +581,12 @@ pub async fn write_if_changed(unit: &QuadletUnit, dir: &Path) -> Result<bool> {
/// Reload the user systemd manager. Required after any quadlet write
/// or removal so systemd picks up the generated `.service` translation.
pub async fn daemon_reload_user() -> Result<()> {
let status = Command::new("systemctl")
.args(["--user", "daemon-reload"])
.status()
// Bounded: a wedged user manager (e.g. a unit stuck "deactivating" while
// podman hangs) could otherwise block daemon-reload indefinitely and freeze
// any caller — notably uninstall teardown.
let status = systemctl_user_status(&["daemon-reload"], Duration::from_secs(30))
.await
.context("spawn systemctl --user daemon-reload")?;
.context("systemctl --user daemon-reload")?;
if !status.success() {
return Err(anyhow!("systemctl --user daemon-reload exited {status}"));
}
@ -624,7 +643,17 @@ pub async fn restart_service(service: &str) -> Result<()> {
/// Stop a generated Quadlet service without removing its unit file.
pub async fn stop_service(service: &str) -> Result<()> {
match systemctl_user_status(&["stop", service], QUADLET_STOP_TIMEOUT).await {
stop_service_with_timeout(service, QUADLET_STOP_TIMEOUT).await
}
/// Stop a user service, waiting up to `timeout` for a graceful stop before
/// force-killing the app-scoped unit. Slow-to-SIGTERM apps (bitcoin-core ~600s,
/// lnd ~330s) must not be SIGKILLed at the default 45s — that risks data
/// corruption — so the orchestrator passes the per-app grace here. Never waits
/// less than `QUADLET_STOP_TIMEOUT`.
pub async fn stop_service_with_timeout(service: &str, timeout: Duration) -> Result<()> {
let timeout = timeout.max(QUADLET_STOP_TIMEOUT);
match systemctl_user_status(&["stop", service], timeout).await {
Ok(status) if status.success() => Ok(()),
Ok(status) => Err(anyhow!("systemctl --user stop {service} exited {status}")),
Err(err) => {
@ -759,11 +788,19 @@ fn directive_values(unit_body: &str, prefix: &str) -> Vec<String> {
/// that systemd no longer knows about.
pub async fn disable_remove(unit_name: &str, dir: &Path) -> Result<()> {
let svc = format!("{unit_name}.service");
// Stop first; ignore failure (unit may already be down).
let _ = Command::new("systemctl")
.args(["--user", "stop", &svc])
.status()
.await;
// Stop first; ignore failure (unit may already be down). BOUNDED — on
// rootless podman a generated unit can wedge in "deactivating" while
// `podman rm -f` hangs underneath it, and an unbounded `systemctl stop`
// would block the entire uninstall forever: the progress bar freezes and
// the package entry is stranded in `Removing` (a ghost in My Apps that also
// blocks reinstall). If the graceful stop times out, escalate to
// SIGKILL + reset-failed so teardown always proceeds.
if systemctl_user_status(&["stop", &svc], QUADLET_STOP_TIMEOUT)
.await
.is_err()
{
let _ = kill_and_reset_service(&svc).await;
}
let path = dir.join(format!("{unit_name}.container"));
if fs::try_exists(&path).await.unwrap_or(false) {
match fs::remove_file(&path).await {
@ -774,10 +811,15 @@ pub async fn disable_remove(unit_name: &str, dir: &Path) -> Result<()> {
}
daemon_reload_user().await.ok();
// Defensive: kill the actual container too, in case quadlet left it.
let _ = Command::new("podman")
.args(["rm", "-f", unit_name])
.status()
.await;
// Bounded so a hung podman store can't re-introduce the stall this function
// exists to avoid.
let _ = tokio::time::timeout(
QUADLET_STOP_TIMEOUT,
Command::new("podman")
.args(["rm", "-f", unit_name])
.status(),
)
.await;
Ok(())
}
@ -852,6 +894,26 @@ mod tests {
assert!(!s.contains("Network=host"));
}
#[test]
fn render_host_network_omits_publish_ports() {
// Podman rejects PublishPort with Network=host (crash-loop exit 125).
let mut u = sample_unit();
u.network = NetworkMode::Host;
u.ports = vec![(3000, 3000, "tcp".into())];
let s = u.render();
assert!(s.contains("Network=host"));
assert!(!s.contains("PublishPort"));
}
#[test]
fn render_non_host_network_emits_publish_ports() {
let mut u = sample_unit();
u.network = NetworkMode::Bridge("archy-net".into());
u.ports = vec![(3000, 3000, "tcp".into())];
let s = u.render();
assert!(s.contains("PublishPort=3000:3000/tcp"));
}
#[test]
fn unit_filename_and_service_name_are_consistent() {
let u = sample_unit();
@ -1033,6 +1095,7 @@ app:
version: 1.0.0
container:
image: registry/bitcoin-knots:1.0
network: archy-net
entrypoint: ["/usr/local/bin/bitcoind"]
custom_args: ["-server=1", "-rpcbind=0.0.0.0"]
ports:
@ -1053,7 +1116,7 @@ app:
security:
capabilities: ["NET_BIND_SERVICE"]
readonly_root: true
network_policy: archy-net
network_policy: isolated
"#;
let m = AppManifest::parse(yaml).expect("manifest must parse");
let u = QuadletUnit::from_manifest(&m, "bitcoin-knots");
@ -1193,7 +1256,7 @@ app:
image: x:latest
volumes:
- type: bind
source: /etc/host-conf
source: /var/lib/archipelago/x-conf
target: /etc/conf
options: ["ro"]
"#;
@ -1217,7 +1280,7 @@ app:
target: /tmp
tmpfs_options: "rw,size=64m"
- type: bind
source: /var/lib/x
source: /var/lib/archipelago/x
target: /data
options: []
"#;
@ -1225,7 +1288,7 @@ app:
let u = QuadletUnit::from_manifest(&m, "x");
// tmpfs entry is dropped from bind_mounts; bind entry survives.
assert_eq!(u.bind_mounts.len(), 1);
assert_eq!(u.bind_mounts[0].host, PathBuf::from("/var/lib/x"));
assert_eq!(u.bind_mounts[0].host, PathBuf::from("/var/lib/archipelago/x"));
}
#[test]
@ -1404,6 +1467,31 @@ app:
assert!(!publish_ports_changed(new, new));
}
#[test]
fn from_manifest_appends_manifest_network_aliases_for_bridge() {
let yaml = r#"
app:
id: indeedhub-api
name: IndeedHub API
version: 1.0.0
container:
image: registry/indeedhub-api:1.0.0
network: indeedhub-net
network_aliases: [api]
security:
capabilities: []
network_policy: isolated
"#;
let m = AppManifest::parse(yaml).expect("manifest must parse");
let u = QuadletUnit::from_manifest(&m, "indeedhub-api");
assert!(matches!(u.network, NetworkMode::Bridge(ref n) if n == "indeedhub-net"));
// Own name first, then the baked-in short alias the frontend nginx uses.
assert_eq!(u.network_aliases, vec!["indeedhub-api", "api"]);
let s = u.render();
assert!(s.contains("NetworkAlias=api"));
assert!(s.contains("PodmanArgs=--network-alias=api"));
}
#[test]
fn network_aliases_changed_detects_service_discovery_drift() {
let old = "[Container]\nNetwork=archy-net\n";
@ -1462,6 +1550,7 @@ app:
version: 1.0.0
container:
image: registry/lnd:latest
network: archy-net
ports:
- host: 10009
container: 10009
@ -1477,7 +1566,7 @@ app:
memory_limit: 1g
security:
capabilities: []
network_policy: archy-net
network_policy: isolated
"#;
let m = AppManifest::parse(yaml).unwrap();
let body = QuadletUnit::from_manifest(&m, "lnd").render();

View File

@ -0,0 +1,208 @@
//! Declarative, self-healing generation of app secrets.
//!
//! An app declares `generated_secrets` in its manifest; this module materialises
//! them just before `secret_env` is resolved. That keeps the migration's
//! data-driven bar: an app installs from its manifest alone — no host
//! provisioning and no per-app Rust — and every secret lands `0600`, owned by
//! the unprivileged (rootless) service user.
//!
//! Two properties make it safe to call on every install/reconcile tick:
//!
//! * **Idempotent** — a target file that already exists, is readable and
//! non-empty is left untouched, so values are stable across ticks.
//! * **Self-healing without privilege** — a target file that exists but is
//! *unreadable* (the classic `root:root`-owned secret left by some earlier
//! path) is unlinked and rewritten. Unlinking needs write on the
//! service-owned secrets dir, not on the file, so this recovers the broken
//! state with no `chown` and no root — exactly what a rootless node needs.
use anyhow::{Context, Result};
use archipelago_container::{AppManifest, GeneratedSecret, SecretGenKind};
use rand::RngCore;
use std::fs;
use std::io::Write;
use std::os::unix::fs::OpenOptionsExt;
use std::path::Path;
/// Plaintext-password length (bytes of entropy) for [`SecretGenKind::Bcrypt`].
const BCRYPT_PASSWORD_BYTES: usize = 24;
/// Materialise every declared generated secret for `manifest` under
/// `secrets_dir`. No-op when the manifest declares none. Safe to call on every
/// reconcile/install tick (idempotent + self-healing).
pub fn ensure_generated_secrets(secrets_dir: &Path, manifest: &AppManifest) -> Result<()> {
let specs = &manifest.app.container.generated_secrets;
if specs.is_empty() {
return Ok(());
}
fs::create_dir_all(secrets_dir)
.with_context(|| format!("creating secrets dir {}", secrets_dir.display()))?;
for gs in specs {
ensure_one(secrets_dir, gs).with_context(|| format!("generating secret '{}'", gs.name))?;
}
Ok(())
}
fn ensure_one(dir: &Path, gs: &GeneratedSecret) -> Result<()> {
let files = gs.target_files();
// Idempotent fast path: every target file present, readable and non-empty.
if files.iter().all(|f| readable_nonempty(&dir.join(f))) {
return Ok(());
}
// Self-heal: drop any stale/unreadable target so the write below recreates
// it owned by us. Unlinking uses the (service-owned) dir's write bit, so a
// wrongly root-owned secret is recovered with no privilege escalation.
for f in &files {
let p = dir.join(f);
if p.exists() && !readable_nonempty(&p) {
tracing::warn!("regenerating unreadable/stale secret {}", p.display());
fs::remove_file(&p)
.with_context(|| format!("removing stale secret {}", p.display()))?;
}
}
match gs.kind {
SecretGenKind::Hex16 => write_secret(&dir.join(&gs.name), &random_hex(16))?,
SecretGenKind::Hex32 => write_secret(&dir.join(&gs.name), &random_hex(32))?,
SecretGenKind::Base64 => write_secret(&dir.join(&gs.name), &random_base64(32))?,
SecretGenKind::Bcrypt => {
let password = random_hex(BCRYPT_PASSWORD_BYTES);
let hash = bcrypt::hash(&password, bcrypt::DEFAULT_COST)
.context("bcrypt-hashing generated password")?;
// Primary (server-facing hash) first, then the plaintext sibling.
write_secret(&dir.join(&gs.name), &hash)?;
write_secret(&dir.join(format!("{}.pw", gs.name)), &password)?;
}
}
Ok(())
}
/// True when `path` exists, is readable by this process, and is non-empty after
/// trimming. Any error (missing, permission denied, empty) reads as false.
fn readable_nonempty(path: &Path) -> bool {
fs::read_to_string(path)
.map(|s| !s.trim().is_empty())
.unwrap_or(false)
}
fn random_hex(bytes: usize) -> String {
let mut buf = vec![0u8; bytes];
rand::thread_rng().fill_bytes(&mut buf);
hex::encode(buf)
}
/// `bytes` of entropy, standard base64 (with padding). For keys that a service
/// base64-decodes to recover the raw bytes (e.g. netbird's store encryptionKey).
fn random_base64(bytes: usize) -> String {
use base64::Engine as _;
let mut buf = vec![0u8; bytes];
rand::thread_rng().fill_bytes(&mut buf);
base64::engine::general_purpose::STANDARD.encode(buf)
}
/// Atomically write a `0600` secret: a temp file in the same dir (so the rename
/// is atomic), fsynced, then renamed over the target.
fn write_secret(path: &Path, value: &str) -> Result<()> {
let dir = path
.parent()
.context("secret path has no parent directory")?;
let name = path
.file_name()
.and_then(|n| n.to_str())
.context("secret path has no filename")?;
let tmp = dir.join(format!(".{name}.tmp"));
let mut f = fs::OpenOptions::new()
.write(true)
.create(true)
.truncate(true)
.mode(0o600)
.open(&tmp)
.with_context(|| format!("creating temp secret {}", tmp.display()))?;
f.write_all(value.as_bytes())
.with_context(|| format!("writing temp secret {}", tmp.display()))?;
f.sync_all()
.with_context(|| format!("fsync temp secret {}", tmp.display()))?;
drop(f);
fs::rename(&tmp, path)
.with_context(|| format!("renaming {} -> {}", tmp.display(), path.display()))?;
Ok(())
}
#[cfg(test)]
mod tests {
use super::*;
use archipelago_container::SecretGenKind;
use std::os::unix::fs::PermissionsExt;
fn manifest_with(secrets: Vec<GeneratedSecret>) -> AppManifest {
let mut m: AppManifest = serde_yaml::from_str(
"app:\n id: t\n name: t\n version: 1.0.0\n container:\n image: x:y\n",
)
.unwrap();
m.app.container.generated_secrets = secrets;
m
}
fn gs(name: &str, kind: SecretGenKind) -> GeneratedSecret {
GeneratedSecret {
name: name.to_string(),
kind,
}
}
#[test]
fn generates_hex_and_bcrypt_with_0600() {
let dir = tempfile::tempdir().unwrap();
let m = manifest_with(vec![
gs("tok", SecretGenKind::Hex16),
gs("admin", SecretGenKind::Bcrypt),
]);
ensure_generated_secrets(dir.path(), &m).unwrap();
let tok = std::fs::read_to_string(dir.path().join("tok")).unwrap();
assert_eq!(tok.trim().len(), 32, "hex16 = 16 bytes = 32 hex chars");
let hash = std::fs::read_to_string(dir.path().join("admin")).unwrap();
let pw = std::fs::read_to_string(dir.path().join("admin.pw")).unwrap();
assert!(hash.starts_with("$2"), "bcrypt hash shape");
assert!(bcrypt::verify(pw.trim(), hash.trim()).unwrap(), "pw matches hash");
for f in ["tok", "admin", "admin.pw"] {
let mode = std::fs::metadata(dir.path().join(f))
.unwrap()
.permissions()
.mode()
& 0o777;
assert_eq!(mode, 0o600, "{f} must be 0600");
}
}
#[test]
fn idempotent_value_is_stable() {
let dir = tempfile::tempdir().unwrap();
let m = manifest_with(vec![gs("tok", SecretGenKind::Hex32)]);
ensure_generated_secrets(dir.path(), &m).unwrap();
let first = std::fs::read_to_string(dir.path().join("tok")).unwrap();
ensure_generated_secrets(dir.path(), &m).unwrap();
let second = std::fs::read_to_string(dir.path().join("tok")).unwrap();
assert_eq!(first, second, "a present readable secret is never rewritten");
}
#[test]
fn self_heals_unreadable_secret() {
// Simulate the root-owned case: a present-but-unreadable file. We can't
// chmod-away read as the owner in a unit test, so emulate "unreadable"
// via the empty-file branch (readable_nonempty == false), which drives
// the same unlink+regenerate path.
let dir = tempfile::tempdir().unwrap();
std::fs::write(dir.path().join("tok"), "").unwrap();
let m = manifest_with(vec![gs("tok", SecretGenKind::Hex16)]);
ensure_generated_secrets(dir.path(), &m).unwrap();
let v = std::fs::read_to_string(dir.path().join("tok")).unwrap();
assert_eq!(v.trim().len(), 32, "stale/empty secret was regenerated");
}
}

View File

@ -0,0 +1,165 @@
//! Buyer-side store of paid content the node has purchased.
//!
//! A paid peer download used to be ephemeral: the bytes were handed to the
//! browser as a one-shot `<a download>` and then thrown away. On the mobile
//! companion that download silently fails, so the item appeared to never
//! "unlock" even though the ecash was spent. This module persists every
//! successful purchase — bytes + metadata — keyed by (seller onion, content_id),
//! so the gallery can render owned items unblurred and play/view them in-app
//! from the local cache, with no re-payment and no reliance on a browser
//! download. The buyer can still save the file later from the cached copy.
use anyhow::{Context, Result};
use serde::{Deserialize, Serialize};
use std::path::{Path, PathBuf};
use tokio::fs;
const OWNED_DIR: &str = "purchased-content";
const OWNED_INDEX: &str = "owned.json";
/// One purchased item. `onion` + `content_id` are the identity; everything else
/// is display/metadata captured at purchase time.
#[derive(Debug, Clone, Serialize, Deserialize)]
pub struct OwnedItem {
pub onion: String,
pub content_id: String,
pub filename: String,
pub mime_type: String,
pub size_bytes: u64,
pub paid_sats: u64,
pub ecash_backend: String,
/// RFC3339 timestamp; best-effort, empty if the clock was unavailable.
pub purchased_at: String,
}
#[derive(Debug, Default, Serialize, Deserialize)]
struct OwnedIndex {
items: Vec<OwnedItem>,
}
fn owned_root(data_dir: &Path) -> PathBuf {
data_dir.join(OWNED_DIR)
}
fn index_path(data_dir: &Path) -> PathBuf {
owned_root(data_dir).join(OWNED_INDEX)
}
/// Sanitize an onion into a safe directory component (it's already [a-z2-7].onion
/// for valid v3, but be defensive against path traversal regardless).
fn sanitize(component: &str) -> String {
component
.chars()
.map(|c| {
if c.is_ascii_alphanumeric() || c == '-' || c == '_' || c == '.' {
c
} else {
'_'
}
})
.collect()
}
fn bytes_path(data_dir: &Path, onion: &str, content_id: &str) -> PathBuf {
owned_root(data_dir)
.join(sanitize(onion))
.join(sanitize(content_id))
}
async fn load_index(data_dir: &Path) -> OwnedIndex {
match fs::read_to_string(index_path(data_dir)).await {
Ok(s) => serde_json::from_str(&s).unwrap_or_default(),
Err(_) => OwnedIndex::default(),
}
}
async fn save_index(data_dir: &Path, index: &OwnedIndex) -> Result<()> {
let root = owned_root(data_dir);
fs::create_dir_all(&root)
.await
.with_context(|| format!("creating {}", root.display()))?;
let content = serde_json::to_string_pretty(index).context("serializing owned index")?;
fs::write(index_path(data_dir), content)
.await
.context("writing owned index")
}
/// Persist a successful purchase: write the bytes to disk and upsert the index
/// entry. Idempotent on (onion, content_id) — re-buying overwrites with the
/// latest copy/metadata rather than duplicating.
pub async fn record_purchase(
data_dir: &Path,
onion: &str,
content_id: &str,
filename: &str,
mime_type: &str,
bytes: &[u8],
paid_sats: u64,
ecash_backend: &str,
purchased_at: &str,
) -> Result<()> {
let path = bytes_path(data_dir, onion, content_id);
if let Some(parent) = path.parent() {
fs::create_dir_all(parent)
.await
.with_context(|| format!("creating {}", parent.display()))?;
}
fs::write(&path, bytes)
.await
.with_context(|| format!("writing purchased bytes to {}", path.display()))?;
let mut index = load_index(data_dir).await;
let entry = OwnedItem {
onion: onion.to_string(),
content_id: content_id.to_string(),
filename: filename.to_string(),
mime_type: mime_type.to_string(),
size_bytes: bytes.len() as u64,
paid_sats,
ecash_backend: ecash_backend.to_string(),
purchased_at: purchased_at.to_string(),
};
if let Some(existing) = index
.items
.iter_mut()
.find(|i| i.onion == onion && i.content_id == content_id)
{
*existing = entry;
} else {
index.items.push(entry);
}
save_index(data_dir, &index).await
}
/// Every item this node owns.
pub async fn list_owned(data_dir: &Path) -> Vec<OwnedItem> {
load_index(data_dir).await.items
}
/// True if the node has already purchased this (onion, content_id).
#[allow(dead_code)] // used by the upcoming seller-side signed-entitlement path (#8)
pub async fn is_owned(data_dir: &Path, onion: &str, content_id: &str) -> bool {
bytes_path(data_dir, onion, content_id).exists()
&& load_index(data_dir)
.await
.items
.iter()
.any(|i| i.onion == onion && i.content_id == content_id)
}
/// Read a purchased item's bytes + mime type from the local cache, if present.
pub async fn read_owned(
data_dir: &Path,
onion: &str,
content_id: &str,
) -> Option<(String, Vec<u8>)> {
let bytes = fs::read(bytes_path(data_dir, onion, content_id)).await.ok()?;
let mime = load_index(data_dir)
.await
.items
.into_iter()
.find(|i| i.onion == onion && i.content_id == content_id)
.map(|i| i.mime_type)
.unwrap_or_else(|| "application/octet-stream".to_string());
Some((mime, bytes))
}

View File

@ -61,6 +61,22 @@ pub async fn load_user_stopped(data_dir: &Path) -> std::collections::HashSet<Str
}
}
/// Names of the containers that were running at the last periodic snapshot
/// (`running-containers.json`, saved every ~120s by `save_container_snapshot`).
/// Unlike `check_for_crash`, this reads the snapshot unconditionally (no PID/crash
/// gate) — it's the durable "what was running" signal the boot reconciler uses to
/// recreate a previously-running app whose container vanished. Empty if absent.
pub async fn load_last_running_names(data_dir: &Path) -> std::collections::HashSet<String> {
let path = data_dir.join(CONTAINER_STATE_FILE);
match fs::read_to_string(&path).await {
Ok(content) => match serde_json::from_str::<ContainerSnapshot>(&content) {
Ok(snapshot) => snapshot.containers.into_iter().map(|c| c.name).collect(),
Err(_) => std::collections::HashSet::new(),
},
Err(_) => std::collections::HashSet::new(),
}
}
/// Save the set of user-stopped containers to disk.
pub async fn save_user_stopped(data_dir: &Path, stopped: &std::collections::HashSet<String>) {
let path = data_dir.join(USER_STOPPED_FILE);
@ -898,6 +914,43 @@ mod tests {
assert_eq!(containers[1].name, "archy-mempool-web");
}
#[tokio::test]
async fn test_load_last_running_names_reads_snapshot_without_pid_gate() {
let tmp = TempDir::new().unwrap();
// No PID file written — load_last_running_names must NOT require a crash.
let snapshot = ContainerSnapshot {
timestamp: 1000,
containers: vec![
RunningContainerRecord {
name: "immich_server".to_string(),
image: "immich:2.7".to_string(),
},
RunningContainerRecord {
name: "immich_postgres".to_string(),
image: "postgres:16".to_string(),
},
],
};
fs::write(
tmp.path().join(CONTAINER_STATE_FILE),
serde_json::to_string(&snapshot).unwrap(),
)
.await
.unwrap();
let names = load_last_running_names(tmp.path()).await;
assert_eq!(names.len(), 2);
assert!(names.contains("immich_server"));
assert!(names.contains("immich_postgres"));
assert!(!names.contains("immich_redis"));
}
#[tokio::test]
async fn test_load_last_running_names_empty_when_absent() {
let tmp = TempDir::new().unwrap();
assert!(load_last_running_names(tmp.path()).await.is_empty());
}
#[tokio::test]
async fn test_write_and_remove_pid_marker() {
let tmp = TempDir::new().unwrap();

View File

@ -254,11 +254,57 @@ pub(crate) async fn notify_join(
"params": params,
});
let _ = crate::fips::dial::PeerRequest::new(remote_fips_npub, remote_onion, "/rpc/v1")
.service(crate::settings::transport::PeerService::Federation)
.timeout(std::time::Duration::from_secs(30))
.send_json(&body)
.await;
// Deliver the notification in the BACKGROUND with retries, and return
// immediately. Two reasons:
// 1. The join RPC must not block on this. Awaiting a cold FIPS overlay
// (no shared FIPS path between LAN and remote/Tailscale peers) stalled
// the whole join until FIPS timed out, surfacing as "Request timeout"
// in the UI even though the local membership was already saved.
// 2. If this single best-effort POST failed, the inviter never learned
// about us → asymmetric federation (they couldn't see us). Retrying in
// the background until it lands makes federation converge to symmetric.
// `fips_timeout` fast-fails a dead FIPS path so the Tor fallback (which
// answers an onion in ~3-5s) is reached quickly on each attempt.
let remote_onion = remote_onion.to_string();
let remote_fips_npub = remote_fips_npub.map(|s| s.to_string());
tokio::spawn(async move {
// ~5 attempts with linear backoff: 0s, 10s, 20s, 30s, 40s — covers a
// peer that is briefly unreachable (restarting, publishing its onion)
// without hammering it.
for attempt in 1..=5u32 {
let res = crate::fips::dial::PeerRequest::new(
remote_fips_npub.as_deref(),
&remote_onion,
"/rpc/v1",
)
.service(crate::settings::transport::PeerService::Federation)
.timeout(std::time::Duration::from_secs(30))
.fips_timeout(std::time::Duration::from_secs(6))
.send_json(&body)
.await;
match res {
Ok((resp, transport)) if resp.status().is_success() => {
tracing::info!(
attempt,
transport = %transport,
"peer-joined notification delivered to inviter"
);
return;
}
Ok((resp, _)) => tracing::warn!(
attempt,
status = %resp.status(),
"peer-joined notification rejected; will retry"
),
Err(e) => tracing::warn!(attempt, error = %e, "peer-joined notification failed; will retry"),
}
tokio::time::sleep(std::time::Duration::from_secs(10 * attempt as u64)).await;
}
tracing::warn!(
onion = %remote_onion,
"peer-joined notification gave up after retries — peer may not see us until next sync"
);
});
Ok(())
}

View File

@ -12,6 +12,9 @@ mod types;
// Re-export all public items so `crate::federation::*` continues to work.
pub use invites::{accept_invite, create_invite};
// Crate-internal: used by the periodic federation auto-sync to re-assert
// membership to peers that don't list us back (asymmetry self-heal).
pub(crate) use invites::notify_join;
#[allow(unused_imports)]
pub use storage::{
add_node, fips_npub_for_onion, load_nodes, load_removed_dids, record_peer_transport,

View File

@ -33,6 +33,12 @@ pub async fn sync_with_peer(
.header("X-Federation-Sig", signature)
.header("X-Federation-Timestamp", timestamp)
.timeout(std::time::Duration::from_secs(30))
// Fast-fail a cold/unreachable FIPS overlay (common between LAN and
// remote/Tailscale peers that share no FIPS path) so the Tor fallback —
// which answers an onion in ~3-5s — isn't stuck behind the full 30s FIPS
// budget. Without this, a state sync to a FIPS-unreachable peer "took
// ages" and join/sync appeared to time out even though Tor was healthy.
.fips_timeout(std::time::Duration::from_secs(6))
.send_json(&body)
.await
.context("Failed to reach federated peer")?;

View File

@ -308,6 +308,14 @@ pub struct PeerRequest<'a> {
pub path: &'a str,
pub headers: Vec<(&'a str, String)>,
pub timeout: std::time::Duration,
/// Optional shorter cap on the FIPS *attempt* only. When set, a cold or hung
/// FIPS overlay fails fast within this budget so the Tor fallback still gets
/// its full `timeout` — without it, a stuck FIPS dial can consume the whole
/// caller budget (e.g. a 60s frontend RPC) and the request "times out" even
/// though Tor would have answered (#6, the Pay-with-QR invoice request).
/// `None` keeps the legacy behavior (FIPS uses the full `timeout`), which a
/// large content download needs so its long FIPS transfer isn't truncated.
pub fips_timeout: Option<std::time::Duration>,
pub service: Option<crate::settings::transport::PeerService>,
}
@ -319,10 +327,26 @@ impl<'a> PeerRequest<'a> {
path,
headers: Vec::new(),
timeout: std::time::Duration::from_secs(30),
fips_timeout: None,
service: None,
}
}
/// Cap the FIPS attempt to a shorter budget than the overall `timeout`, so a
/// cold/hung overlay path fails fast and the Tor fallback keeps its full
/// budget. Use on short request/response calls (invoice, status); leave
/// unset for large downloads that legitimately need a long FIPS transfer.
pub fn fips_timeout(mut self, t: std::time::Duration) -> Self {
self.fips_timeout = Some(t);
self
}
/// Timeout to apply to the FIPS attempt — the explicit cap if set, else the
/// overall request timeout.
fn fips_attempt_timeout(&self) -> std::time::Duration {
self.fips_timeout.unwrap_or(self.timeout)
}
/// Tie this request to a user-configurable service preference. If
/// the user has set that service to `Fips` or `Tor`, the builder
/// respects it.
@ -423,7 +447,7 @@ impl<'a> PeerRequest<'a> {
}
};
let url = format!("{}{}", base, self.path);
let c = client_with_timeout(self.timeout);
let c = client_with_timeout(self.fips_attempt_timeout());
let mut rb = c.post(&url).json(body);
for (k, v) in &self.headers {
rb = rb.header(*k, v);
@ -456,7 +480,7 @@ impl<'a> PeerRequest<'a> {
}
};
let url = format!("{}{}", base, self.path);
let c = client_with_timeout(self.timeout);
let c = client_with_timeout(self.fips_attempt_timeout());
let mut rb = c.get(&url);
for (k, v) in &self.headers {
rb = rb.header(*k, v);

View File

@ -60,14 +60,23 @@ pub async fn ensure_activated(data_dir: &std::path::Path) {
tracing::info!("FIPS auto-activated");
}
/// Spawn the FIPS supervisor: every 45s it (1) auto-activates FIPS if onboarding
/// Spawn the FIPS supervisor: every 25s it (1) auto-activates FIPS if onboarding
/// is done but the service is down — so it comes up with zero user interaction,
/// and (2) keeps hole-punched paths to known federation peers warm, so on-demand
/// dials land on FIPS instead of falling back to Tor. Warms peers concurrently
/// so one slow/offline peer doesn't delay the rest.
///
/// The interval MUST be shorter than the NAT/hole-punch cold window
/// (`warm_path` docs it at ~30-60s). The previous 45s sat at the edge of that
/// window: a path that went cold at ~30s stayed cold until the next 45s tick,
/// so real peer dials in that gap hit a cold path and fell back to Tor (~18s
/// onion latency instead of FIPS's ~2-3s). 25s keeps every path refreshed
/// inside the minimum cold window, which is what actually makes FIPS — not Tor —
/// the transport peer requests land on. Measured: warm FIPS browse ~2.6s vs a
/// cold-path fallback browse ~18-22s over Tor to the same peer.
pub fn spawn_fips_supervisor(data_dir: std::path::PathBuf) {
tokio::spawn(async move {
let mut tick = tokio::time::interval(std::time::Duration::from_secs(45));
let mut tick = tokio::time::interval(std::time::Duration::from_secs(25));
loop {
tick.tick().await;
// Bring FIPS up on its own once onboarding has materialised the key.

View File

@ -39,6 +39,7 @@ mod constants;
mod container;
mod content_hash;
mod content_invoice;
mod content_owned;
mod content_server;
mod crash_recovery;
mod credentials;
@ -197,14 +198,53 @@ async fn main() -> Result<()> {
(Some(trait_obj), Some(dev))
} else {
let prod = Arc::new(ProdContainerOrchestrator::new(config.clone()).await?);
// Pull the freshest signed app-catalog BEFORE loading manifests, so any
// registry-embedded manifest (the origin-wins overlay in load_manifests)
// is in place on THIS boot — not a restart later. Without this the boot
// would overlay the previous run's cached catalog and a newly-published
// app (e.g. a registry-only install) wouldn't appear until the next
// restart. Bounded + best-effort: on timeout/unreachable origin the
// last-cached catalog (or the disk manifests) still load — registry is
// an overlay on top of disk, never a hard dependency.
match tokio::time::timeout(
std::time::Duration::from_secs(25),
crate::container::app_catalog::refresh_catalog(&config.data_dir),
)
.await
{
Ok(Ok(n)) => info!("🛰️ app-catalog refreshed before manifest load ({n} apps)"),
Ok(Err(e)) => tracing::debug!("app-catalog pre-load refresh failed (using cache): {e}"),
Err(_) => tracing::debug!("app-catalog pre-load refresh timed out (using cache)"),
}
// Best-effort manifest load; a missing /opt/archipelago/apps is
// logged inside load_manifests and not fatal.
match prod.load_manifests().await {
Ok(n) => info!("📦 Loaded {n} app manifest(s) from disk"),
Ok(n) => info!("📦 Loaded {n} app manifest(s) (disk + registry catalog)"),
Err(e) => {
tracing::error!(error = %e, "prod orchestrator: load_manifests failed at startup");
}
}
// Reboot-survival safety net for the podman `--restart` path: ensure the
// user's podman-restart.service is enabled so `unless-stopped` containers
// come back after a reboot even when the Quadlet backend path is off
// (orchestrator-installed backends like immich/btcpay run as plain podman
// containers until the Phase-3 Quadlet rollout). Idempotent + best-effort.
{
let out = tokio::process::Command::new("systemctl")
.args(["--user", "enable", "--now", "podman-restart.service"])
.output()
.await;
match out {
Ok(o) if o.status.success() => {
info!("🔁 podman-restart.service enabled (reboot-survival for --restart containers)")
}
Ok(o) => tracing::debug!(
"podman-restart.service enable skipped: {}",
String::from_utf8_lossy(&o.stderr).trim()
),
Err(e) => tracing::debug!("podman-restart.service enable skipped: {e}"),
}
}
// Adoption pass: link existing podman containers back to their
// manifests so the reconciler doesn't recreate them.
match tokio::time::timeout(Duration::from_secs(35), prod.adopt_existing()).await {

View File

@ -8,6 +8,7 @@
//! asker is limited to one in-flight query.
use super::super::message_types::{self, AssistResponsePayload, MeshMessageType};
use super::super::types::MeshEvent;
use super::bitcoin::send_to_peer;
use super::{MeshCommand, MeshState};
use crate::federation::TrustLevel;
@ -42,28 +43,46 @@ pub(super) enum AssistReply {
/// Plain-text broadcast on a mesh channel — the bare `!ai` path, so any
/// client (including non-archipelago meshcore/Meshtastic nodes) sees it.
ChannelText { channel: u8 },
/// Normal `Text` chat bubble sent back into the 1:1 thread — the
/// archipelago `!ai`-in-chat path. The asker typed `!ai …` as a regular
/// direct message, so the answer lands inline in that same conversation
/// (encrypted, peer-addressed) rather than as a separate widget.
ChatText { contact_id: u32 },
/// Plain-text NATIVE direct message back to the asker's radio contact —
/// the bare `!ai` path for a stock meshcore client (e.g. a phone). The
/// answer goes as a real unicast DM (not a public-channel broadcast), so
/// only the asker sees it and a stock client can read it.
RadioDm { dest_prefix: [u8; 6] },
}
/// Entry point: gate the query, run the model, send the answer back via the
/// requested reply path. Spawned off the radio loop so it never blocks.
#[allow(clippy::too_many_arguments)]
pub(super) async fn run_assist(
prompt: String,
model_override: Option<String>,
req_id: u64,
asker_contact_id: u32,
sender_name: String,
// Whether the asker's message was cryptographically authenticated (a
// verified signature, or arrival over the federation transport). Required
// for any identity-based allow under `trusted_only`/the allowlist.
authenticated: bool,
reply: AssistReply,
state: Arc<MeshState>,
) {
let asker = asker_contact_id;
// Trust + block gate.
if !is_sender_allowed(&state, asker).await {
if !is_sender_allowed(&state, asker, authenticated).await {
warn!(
from = asker,
name = %sender_name,
"AssistQuery denied — sender not permitted by assistant policy"
);
// Record who was turned away so the operator can find + allow them from
// the UI (the silent-on-wire denial otherwise only shows in the journal).
record_denied(&state, asker, &sender_name).await;
// Silent on the wire (no airtime spent on denials); surface to the UI.
let _ = state
.event_tx
@ -144,13 +163,25 @@ pub(super) async fn run_assist(
}
/// Whether `sender_contact_id` may invoke the assistant under the node's policy.
/// Always denies user-blocked contacts. With `trusted_only`, requires a
/// federation-Trusted match on the peer's pubkey or DID.
async fn is_sender_allowed(state: &Arc<MeshState>, sender_contact_id: u32) -> bool {
///
/// Always denies user-blocked contacts. Identity-based allows (the per-contact
/// allowlist and the federation-Trusted match) require `authenticated == true` —
/// i.e. the asker's message carried a signature that verified against its known
/// key (or it arrived over the federation transport, which verifies upstream).
/// A bare radio packet can CLAIM any key or DID, so without that proof the
/// allowlist and trust list are spoofable; only the explicit "anyone on the
/// mesh" policy (`trusted_only == false`) admits an unauthenticated asker.
async fn is_sender_allowed(
state: &Arc<MeshState>,
sender_contact_id: u32,
authenticated: bool,
) -> bool {
let (pubkey_hex, did) = {
let peers = state.peers.read().await;
match peers.get(&sender_contact_id) {
Some(p) => (p.pubkey_hex.clone(), p.did.clone()),
// Match identity on the bound archipelago key (stable, advert/
// federation-verified), not the firmware routing key.
Some(p) => (p.identity_pubkey_hex().map(|s| s.to_string()), p.did.clone()),
None => (None, None),
}
};
@ -169,11 +200,35 @@ async fn is_sender_allowed(state: &Arc<MeshState>, sender_contact_id: u32) -> bo
}
}
// Explicit per-contact allowlist: the operator deliberately ticked THIS
// contact, so honour it even for an unauthenticated radio asker. A stock
// meshcore client (e.g. a phone) can't sign our typed envelopes, so it can
// never be `authenticated` — gating the allowlist on authentication made
// ticking such a contact have no effect. We match the asker's resolved
// identity key: the bound archipelago key if we know it, else the firmware
// routing key (`pubkey_hex`), which is how meshcore addresses the contact
// and what the UI adds to the allowlist for a keyless radio peer. This is a
// narrow, explicit opt-in for a specific key — the spoofable federation-
// trust-list match below still requires authentication.
if let Some(ref pk) = pubkey_hex {
let allowed = state.assistant.read().await.allowed_contacts.clone();
if allowed.iter().any(|a| a.eq_ignore_ascii_case(pk)) {
return true;
}
}
if !state.assistant.read().await.trusted_only {
return true;
}
// Trusted-only: match against the federation trust list.
// Trusted-only from here: an unauthenticated asker can never match the trust
// list (it could otherwise just claim a trusted node's public key/DID).
if !authenticated {
return false;
}
// Match against the federation trust list by the asker's verified archipelago
// pubkey or DID (a radio peer gets these from its signed identity advert).
let nodes = crate::federation::load_nodes(&state.data_dir)
.await
.unwrap_or_default();
@ -183,6 +238,36 @@ async fn is_sender_allowed(state: &Arc<MeshState>, sender_contact_id: u32) -> bo
})
}
/// Newest-first cap on the denied-asker buffer — enough to surface the people
/// who recently tried, without unbounded growth from a spammer.
const MAX_DENIED_ASKERS: usize = 25;
/// Record a turned-away `!ai` asker so the UI can offer a one-click "Allow".
/// Dedupes by contact id (moves an existing entry to the front and refreshes its
/// timestamp/name) so repeated denials from one device don't flood the list.
async fn record_denied(state: &Arc<MeshState>, asker_contact_id: u32, sender_name: &str) {
// Capture the bound archipelago identity key (NOT the firmware routing key):
// one-click "Allow" adds this to the allowlist, which the gate matches on the
// archipelago key. A peer with no advert has no arch key → None → the UI shows
// "no key" (only the "anyone on the mesh" policy can admit it).
let pubkey_hex = {
let peers = state.peers.read().await;
peers
.get(&asker_contact_id)
.and_then(|p| p.arch_pubkey_hex.clone())
};
let entry = super::DeniedAsker {
contact_id: asker_contact_id,
name: sender_name.to_string(),
pubkey_hex,
at: chrono::Utc::now().to_rfc3339(),
};
let mut denied = state.assist_denied.write().await;
denied.retain(|d| d.contact_id != asker_contact_id);
denied.push_front(entry);
denied.truncate(MAX_DENIED_ASKERS);
}
/// Cap the answer to `MAX_REPLY_CHARS`, appending a marker when truncated.
/// Returns (text_to_send, was_truncated).
fn cap_reply(answer: &str) -> (String, bool) {
@ -205,6 +290,19 @@ async fn send_reply(state: &Arc<MeshState>, reply: &AssistReply, req_id: u64, an
let text = cap_channel(answer);
send_channel_text(state, *channel, &text).await;
}
AssistReply::ChatText { contact_id } => {
let (text, _) = cap_reply(answer);
send_chat_text(state, *contact_id, &text).await;
}
AssistReply::RadioDm { dest_prefix } => {
let text = cap_channel(answer);
let _ = state
.send_cmd(MeshCommand::SendNativeText {
dest_pubkey_prefix: *dest_prefix,
payload: text.into_bytes(),
})
.await;
}
}
}
@ -224,6 +322,17 @@ async fn send_failure(state: &Arc<MeshState>, reply: &AssistReply, req_id: u64,
AssistReply::ChannelText { channel } => {
send_channel_text(state, *channel, &format!("AI: {msg}")).await;
}
AssistReply::ChatText { contact_id } => {
send_chat_text(state, *contact_id, &format!("AI: {msg}")).await;
}
AssistReply::RadioDm { dest_prefix } => {
let _ = state
.send_cmd(MeshCommand::SendNativeText {
dest_pubkey_prefix: *dest_prefix,
payload: format!("AI: {msg}").into_bytes(),
})
.await;
}
}
}
@ -272,6 +381,23 @@ async fn send_typed_response(
}
}
/// Send the answer back into the 1:1 chat thread as a normal chat bubble.
/// Used for the `!ai`-in-chat path. We emit an `AssistChatReply` event rather
/// than sending here, because the reply must be routed transport-aware:
/// `!ai` can arrive over LoRa OR over federation (Tor), and only
/// `MeshService::send_message` (which owns the signing key + Tor client) knows
/// to POST over the peer's onion for a federation-synthetic contact_id. The
/// radio-only path used to drop the reply for federation askers — the answer
/// showed on the answering node but never reached the asker. A server-layer
/// consumer fulfils this event via `send_message`, which also records the
/// Sent bubble and allocates the seq.
async fn send_chat_text(state: &Arc<MeshState>, contact_id: u32, text: &str) {
let _ = state.event_tx.send(MeshEvent::AssistChatReply {
contact_id,
text: text.to_string(),
});
}
/// Broadcast a plain-text answer on a channel for bare `!ai` clients.
async fn send_channel_text(state: &Arc<MeshState>, channel: u8, text: &str) {
let _ = state

View File

@ -353,26 +353,40 @@ pub(super) async fn store_plain_message(
state.status.write().await.messages_received += 1;
let _ = state.event_tx.send(MeshEvent::MessageReceived(msg));
// Mesh-AI assistant (issue #50): a plain `!ai`/`!ask <question>` on the
// channel is answered by this node's local model when the assistant is on.
// Reply goes back as plain channel text so bare (non-archipelago) clients
// see it. The trust/rate gate lives in run_assist.
// Mesh-AI assistant (issue #50): a plain `!ai`/`!ask <question>` is answered
// by this node's local model when the assistant is on. The trust/rate gate
// lives in run_assist. The reply goes back as a private NATIVE DM to the
// asker whenever we know its radio pubkey (so it does NOT land on the public
// channel and a stock meshcore client can read it); we only fall back to a
// channel reply if the sender has no resolvable pubkey (rare).
if state.assistant.read().await.enabled {
if let Some(prompt) = strip_ai_trigger(text) {
if !prompt.is_empty() {
let reply = {
let peers = state.peers.read().await;
peers
.get(&contact_id)
.and_then(|p| p.pubkey_hex.clone())
.filter(|h| h.len() >= 12)
.and_then(|h| hex::decode(&h[..12]).ok())
.filter(|b| b.len() == 6)
.map(|b| {
let mut pre = [0u8; 6];
pre.copy_from_slice(&b);
super::assist::AssistReply::RadioDm { dest_prefix: pre }
})
.unwrap_or(super::assist::AssistReply::ChannelText { channel: 0 })
};
let req_id = state.next_id().await;
let prompt = prompt.to_string();
let name = peer_name.to_string();
let st = Arc::clone(state);
tokio::spawn(async move {
// A bare plain-text channel `!ai` carries no signature, so it
// is NOT authenticated — under trusted_only it'll be denied,
// and it can only be answered under the "anyone" policy.
super::assist::run_assist(
prompt,
None,
req_id,
contact_id,
name,
super::assist::AssistReply::ChannelText { channel: 0 },
st,
prompt, None, req_id, contact_id, name, false, reply, st,
)
.await;
});
@ -383,7 +397,7 @@ pub(super) async fn store_plain_message(
/// Recognise a `!ai`/`!ask ` command prefix (case-insensitive) and return the
/// trimmed question after it, or `None` if the text isn't an AI command.
fn strip_ai_trigger(text: &str) -> Option<&str> {
pub(super) fn strip_ai_trigger(text: &str) -> Option<&str> {
let t = text.trim_start();
for p in ["!ai ", "!ask "] {
if t.len() >= p.len() && t[..p.len()].eq_ignore_ascii_case(p) {
@ -475,11 +489,18 @@ pub(super) async fn handle_identity_received(
advert_name: format!("Archy-{}", &did[8..16.min(did.len())]),
did: Some(did.to_string()),
pubkey_hex: Some(ed_pubkey_hex.to_string()),
// The advert signature was verified above, so this is an authenticated
// archipelago identity. Bind it separately so a later refresh_contacts
// (which rewrites pubkey_hex to the firmware routing key) can't drop it.
arch_pubkey_hex: Some(ed_pubkey_hex.to_string()),
x25519_pubkey: Some(x25519_bytes),
rssi: Some(rssi),
snr: None,
last_heard: chrono::Utc::now().to_rfc3339(),
hops: 0,
last_advert: 0,
// We just heard this peer's identity advert, so it's reachable.
reachable: true,
};
let is_new = {

View File

@ -83,14 +83,22 @@ pub(crate) async fn handle_typed_envelope_direct(
sender_name: &str,
envelope: TypedEnvelope,
) {
// Verify envelope signature if present, using the sender's known Ed25519 key
// Verify the envelope signature (if present) against the sender's known
// Ed25519 key, and record whether the sender is cryptographically
// authenticated. A federation peer (synthetic high-half contact_id) arrived
// over the Tor relay, which verifies the sender signature upstream before
// injecting here, so it counts as authenticated. This flag gates the
// identity-based `!ai` allows (allowlist / federation-trust) downstream.
let mut authenticated = sender_contact_id >= crate::mesh::FEDERATION_CONTACT_ID_BASE;
if envelope.sig.is_some() {
let peer_pubkey = state
.peers
.read()
.await
.get(&sender_contact_id)
.and_then(|p| p.pubkey_hex.as_ref())
// Verify against the bound archipelago identity key, not the
// firmware routing key — only the former is what the peer signs with.
.and_then(|p| p.identity_pubkey_hex())
.and_then(|hex_str| hex::decode(hex_str).ok())
.and_then(|bytes| {
if bytes.len() == 32 {
@ -103,7 +111,9 @@ pub(crate) async fn handle_typed_envelope_direct(
});
if let Some(vk) = peer_pubkey {
match envelope.verify_signature(&vk) {
Ok(true) => {}
Ok(true) => {
authenticated = true;
}
Ok(false) => {
warn!(
peer = sender_contact_id,
@ -679,6 +689,37 @@ pub(crate) async fn handle_typed_envelope_direct(
Some(envelope.seq),
)
.await;
// Mesh-AI assistant (issue #50): a `!ai`/`!ask <question>` typed in
// the normal 1:1 chat triggers this node's assistant, with the
// answer sent back as a chat bubble in the same thread. The typed
// DM carries the peer's federation identity (via sender_contact_id),
// so the `trusted_only` gate in run_assist resolves correctly —
// unlike the bare channel-text path, which only knows the radio key.
if state.assistant.read().await.enabled {
if let Some(prompt) = super::decode::strip_ai_trigger(&text) {
if !prompt.is_empty() {
let req_id = state.next_id().await;
let prompt = prompt.to_string();
let name = sender_name.to_string();
let cid = sender_contact_id;
let st = Arc::clone(state);
tokio::spawn(async move {
super::assist::run_assist(
prompt,
None,
req_id,
cid,
name,
authenticated,
super::assist::AssistReply::ChatText { contact_id: cid },
st,
)
.await;
});
}
}
}
}
Some(MeshMessageType::AssistQuery) => {
@ -718,6 +759,7 @@ pub(crate) async fn handle_typed_envelope_direct(
query.req_id,
sender_contact_id,
name,
authenticated,
super::assist::AssistReply::Typed {
contact_id: sender_contact_id,
},

View File

@ -22,8 +22,33 @@ pub(super) async fn handle_frame(
protocol::PUSH_NEW_CONTACT | protocol::PUSH_CONTACT_ADVERT => {
info!(
code = frame.code,
data_len = frame.data.len(),
"Contact discovery event — refreshing contacts"
);
// Auto-import: a PUSH_CONTACT_ADVERT (0x80) carries the 32-byte
// pubkey of a node we just heard. If it isn't already a contact,
// add it to the firmware table so it shows up immediately — no
// flood-advert dance required. (PUSH_NEW_CONTACT/0x8A is already
// added by the firmware, so we skip it.)
if frame.code == protocol::PUSH_CONTACT_ADVERT && frame.data.len() >= 32 {
let mut pubkey = [0u8; 32];
pubkey.copy_from_slice(&frame.data[..32]);
let pk_hex = hex::encode(pubkey);
let known = state
.peers
.read()
.await
.values()
.any(|p| p.pubkey_hex.as_deref() == Some(pk_hex.as_str()));
if !known {
let _ = state
.send_cmd(super::MeshCommand::AddContact {
pubkey,
name: String::new(),
})
.await;
}
}
return true; // Signal caller to fetch contacts
}

View File

@ -63,6 +63,14 @@ pub enum MeshCommand {
dest_pubkey_prefix: [u8; 6],
payload: Vec<u8>,
},
/// Send PLAIN text as one or more native meshcore DMs to a stock client
/// (e.g. a phone). Long text is split into multiple readable plain messages
/// — never MC-chunked — because stock clients can't reassemble archy's
/// chunk framing. Used for chat/AI replies to non-archipelago contacts.
SendNativeText {
dest_pubkey_prefix: [u8; 6],
payload: Vec<u8>,
},
/// Broadcast pre-encoded binary on a mesh channel.
BroadcastChannel {
channel: u8,
@ -71,6 +79,16 @@ pub enum MeshCommand {
SendAdvert,
/// Re-fetch contact list from the radio device.
RefreshContacts,
/// Delete a contact from the firmware table (clear-all / unreachable wipe).
RemoveContact {
pubkey: [u8; 32],
},
/// Import/add a heard advert as a firmware contact so it shows up without
/// needing a flood advert. Name may be empty (firmware fills from advert).
AddContact {
pubkey: [u8; 32],
name: String,
},
}
/// Shared state for the mesh listener, accessible from RPC handlers.
@ -135,6 +153,28 @@ pub struct MeshState {
/// Contact-ids with an AI query currently being answered. Caps each asker to
/// one in-flight query so a peer can't flood the node's compute / airtime.
pub assist_inflight: RwLock<HashSet<u32>>,
/// Recently-denied `!ai` askers (newest first, capped). When `trusted_only`
/// rejects a sender — typically a radio (meshcore) device that presents a
/// firmware key rather than an archipelago DID — we record who tried so the
/// UI can surface them and let the operator one-click allow their key.
/// Silent on the wire (no airtime spent), visible to the operator here.
pub assist_denied: RwLock<VecDeque<DeniedAsker>>,
}
/// A `!ai` asker that the assistant policy turned away. Surfaced to the UI so
/// the operator can add their key to the allowlist without hunting the journal.
#[derive(Debug, Clone, Serialize)]
pub struct DeniedAsker {
/// Meshcore contact id of the asker.
pub contact_id: u32,
/// Best-known display name (advert name) at denial time.
pub name: String,
/// The asker's ed25519 pubkey hex, if known. `None` for a raw radio device
/// that hasn't advertised an archipelago key — such a sender can only be
/// admitted by switching the policy to "anyone", not via the allowlist.
pub pubkey_hex: Option<String>,
/// ISO-8601 timestamp of the (most recent) denial.
pub at: String,
}
/// Mesh-AI assistant configuration, snapshotted from `MeshConfig` at startup.
@ -148,6 +188,10 @@ pub struct AssistantConfig {
pub trusted_only: bool,
/// AI backend: "claude" (shared proxy token) or "ollama" (local model).
pub backend: String,
/// Per-contact allowlist (ed25519 pubkey hex) permitted to use `!ai`
/// regardless of `trusted_only`. Empty → only the `trusted_only` policy
/// applies. A user-blocked contact is always denied even if listed here.
pub allowed_contacts: Vec<String>,
}
/// Contact metadata kept alongside MeshState.peers. Pinned contacts sort to
@ -226,6 +270,7 @@ impl MeshState {
assistant: RwLock::new(assistant),
data_dir,
assist_inflight: RwLock::new(HashSet::new()),
assist_denied: RwLock::new(VecDeque::new()),
});
(state, rx, cmd_rx)
}

View File

@ -53,6 +53,43 @@ impl MeshRadioDevice {
}
}
async fn send_text_msg(&mut self, dest_pubkey_prefix: &[u8; 6], payload: &[u8]) -> Result<()> {
match self {
Self::Meshcore(device) => device.send_text_msg(dest_pubkey_prefix, payload).await,
Self::Meshtastic(device) => device.send_text_msg(dest_pubkey_prefix, payload).await,
}
}
async fn remove_contact(&mut self, pubkey: &[u8; 32]) -> Result<()> {
match self {
Self::Meshcore(device) => device.remove_contact(pubkey).await,
Self::Meshtastic(device) => device.remove_contact(pubkey).await,
}
}
async fn add_contact(
&mut self,
pubkey: &[u8; 32],
contact_type: u8,
flags: u8,
out_path_len: u8,
name: &str,
last_advert: u32,
) -> Result<()> {
match self {
Self::Meshcore(device) => {
device
.add_contact(pubkey, contact_type, flags, out_path_len, name, last_advert)
.await
}
Self::Meshtastic(device) => {
device
.add_contact(pubkey, contact_type, flags, out_path_len, name, last_advert)
.await
}
}
}
async fn get_contacts(&mut self) -> Result<Vec<super::super::protocol::ParsedContact>> {
match self {
Self::Meshcore(device) => device.get_contacts().await,
@ -151,6 +188,7 @@ pub(super) const DM_V1_MARKER: &str = "@DM:";
/// route inbound DMs to the correct contact_id thread.
pub(super) const DM_V2_MARKER: &str = "@DM2:";
#[allow(dead_code)] // legacy @DM2-over-channel wrapper; kept for reference now that DMs are native unicast
fn wrap_dm_for_channel(
dest_pubkey_prefix: &[u8; 6],
sender_arch_prefix: &[u8; 6],
@ -169,6 +207,7 @@ fn wrap_dm_for_channel(
/// `[0u8; 6]` if the stored hex is malformed (which would only happen if a
/// caller constructed `MeshState` with a bad value — empty string yields
/// all-zero, which won't match any real peer on the receiver side).
#[allow(dead_code)] // was used by the @DM2 wrapper; native unicast doesn't need it
fn our_sender_prefix(state: &Arc<MeshState>) -> [u8; 6] {
let mut out = [0u8; 6];
if state.our_ed_pubkey_hex.len() >= 12 {
@ -195,39 +234,42 @@ async fn send_dm_via_channel(
consecutive_write_failures: &mut u32,
) {
use base64::Engine;
let sender_prefix = our_sender_prefix(state);
// First try a single frame with the raw payload directly wrapped.
// This keeps small plain-text messages at minimal overhead.
let single = wrap_dm_for_channel(dest_pubkey_prefix, &sender_prefix, payload);
if single.len() <= 140 {
match device.send_channel_text(0, single.as_bytes()).await {
let _ = state; // native unicast carries no separate sender prefix
// NATIVE meshcore unicast (CMD_SEND_TXT_MSG): a real direct message to the
// contact, NOT a broadcast on the shared public channel. This is the fix
// for the long-standing public-channel pollution — archy used to tunnel
// every DM/relay/receipt as an `@DM2:` blob on channel 0, which (a) every
// mesh participant saw as spam and (b) stock meshcore clients (e.g. a
// phone) couldn't decode. A native DM is private and decodes everywhere.
// The receive side handles these via the existing RESP_CONTACT_MSG path.
//
// Small payloads send in one frame; larger ones are base64 + MC-chunked
// and reassembled by the receiver (try_chunk_reassemble).
if payload.len() <= 140 {
match device.send_text_msg(dest_pubkey_prefix, payload).await {
Ok(()) => {
*consecutive_write_failures = 0;
info!(
dest = %hex::encode(dest_pubkey_prefix),
len = payload.len(),
wire_len = single.len(),
"Sent mesh message (DM via channel)"
"Sent mesh DM (native unicast)"
);
}
Err(e) => {
*consecutive_write_failures += 1;
warn!(
failures = *consecutive_write_failures,
"Failed to send DM via channel: {}", e
"Failed to send native DM: {}", e
);
}
}
return;
}
// Payload too large for one wrap — base64 then MC-chunk. Receiver
// reassembles base64 chunks and routes the decoded bytes back through
// the typed-envelope ladder in handle_channel_payload.
let encoded = base64::engine::general_purpose::STANDARD.encode(payload);
static CHUNK_MSG_ID: std::sync::atomic::AtomicU8 = std::sync::atomic::AtomicU8::new(0);
let msg_id = CHUNK_MSG_ID.fetch_add(1, std::sync::atomic::Ordering::Relaxed);
let chunk_data_size = 80;
let chunk_data_size = 100;
let chunks: Vec<&str> = encoded
.as_bytes()
.chunks(chunk_data_size)
@ -239,18 +281,20 @@ async fn send_dm_via_channel(
raw_len = payload.len(),
b64_len = encoded.len(),
chunks = total,
"Sending chunked mesh message (DM via channel)"
"Sending chunked mesh DM (native unicast)"
);
let mut any_err = false;
for (idx, chunk) in chunks.iter().enumerate() {
let frame = format!("MC{:02x}{:02x}{:02x}{}", msg_id, idx as u8, total, chunk);
let wrapped = wrap_dm_for_channel(dest_pubkey_prefix, &sender_prefix, frame.as_bytes());
if let Err(e) = device.send_channel_text(0, wrapped.as_bytes()).await {
if let Err(e) = device
.send_text_msg(dest_pubkey_prefix, frame.as_bytes())
.await
{
*consecutive_write_failures += 1;
warn!(
failures = *consecutive_write_failures,
chunk = idx,
"Chunk DM-via-channel send failed: {}",
"Chunk native DM send failed: {}",
e
);
any_err = true;
@ -263,20 +307,72 @@ async fn send_dm_via_channel(
}
}
/// Send PLAIN text to a stock meshcore client as one or more native DMs.
/// Unlike `send_dm_via_channel`, this never uses MC-chunk framing (stock
/// clients can't reassemble it) — if the text exceeds one LoRa frame it is
/// split into multiple readable plain messages on UTF-8 char boundaries.
async fn send_plain_native_text(
device: &mut MeshRadioDevice,
dest_pubkey_prefix: &[u8; 6],
text: &[u8],
consecutive_write_failures: &mut u32,
) {
// Split on char boundaries so we never break a multi-byte UTF-8 sequence.
const FRAME: usize = 150; // under MAX_MESSAGE_LEN (160), leaves header room
let s = String::from_utf8_lossy(text);
let mut parts: Vec<String> = Vec::new();
let mut cur = String::new();
for ch in s.chars() {
if cur.len() + ch.len_utf8() > FRAME {
parts.push(std::mem::take(&mut cur));
}
cur.push(ch);
}
if !cur.is_empty() || parts.is_empty() {
parts.push(cur);
}
let total = parts.len();
for (idx, part) in parts.iter().enumerate() {
match device
.send_text_msg(dest_pubkey_prefix, part.as_bytes())
.await
{
Ok(()) => {
*consecutive_write_failures = 0;
info!(
dest = %hex::encode(dest_pubkey_prefix),
part = idx + 1,
total,
"Sent plain native DM"
);
}
Err(e) => {
*consecutive_write_failures += 1;
warn!(
failures = *consecutive_write_failures,
"Plain native DM send failed: {}", e
);
break;
}
}
if total > 1 {
tokio::time::sleep(Duration::from_millis(400)).await;
}
}
}
/// Fetch the contacts list from the device and update the peer cache.
async fn refresh_contacts(device: &mut MeshRadioDevice, state: &Arc<MeshState>) {
match device.get_contacts().await {
Ok(contacts) => {
// Skip firmware contacts the user has explicitly wiped via
// mesh.clear-all. MeshCore keeps its own persistent contact
// table the app can't remove from, so we filter on read to
// keep cleared entries out of the chat list.
let blocklist = state.radio_contact_blocklist.read().await.clone();
// Contact blocking is intentionally NOT applied here. A read-time
// blocklist meant a wiped/re-paired contact could never come back
// even when it re-advertised (it broke phone re-pairing after a
// clear). Per-contact blocking will return later as an explicit,
// user-controlled feature; until then every firmware contact is
// surfaced. `radio_contact_blocklist` is retained but unused.
let mut peers = state.peers.write().await;
for (idx, contact) in contacts.iter().enumerate() {
if blocklist.contains(&contact.public_key_hex) {
continue;
}
let contact_id = idx as u32;
let existing = peers.get(&contact_id);
let peer = super::super::types::MeshPeer {
@ -284,14 +380,31 @@ async fn refresh_contacts(device: &mut MeshRadioDevice, state: &Arc<MeshState>)
advert_name: contact.advert_name.clone(),
did: existing.and_then(|p| p.did.clone()),
pubkey_hex: Some(contact.public_key_hex.clone()),
// Preserve any archipelago identity bound by an earlier
// identity advert — NEVER overwrite it with the firmware
// contact key, or a signed `!ai` query from this peer would
// fail authentication after the next contact refresh.
arch_pubkey_hex: existing.and_then(|p| p.arch_pubkey_hex.clone()),
x25519_pubkey: existing.and_then(|p| p.x25519_pubkey),
rssi: None,
snr: None,
last_heard: chrono::Utc::now().to_rfc3339(),
hops: 0,
last_advert: contact.last_advert,
// A non-zero path_len means the firmware has a route (direct
// or flood) to this contact — i.e. we can deliver to it.
reachable: contact.path_len != 0,
};
peers.insert(contact_id, peer);
}
// A radio contact that shares an exact advert_name with a known
// federation peer is the same physical node — bind the federation
// peer's archipelago identity onto the radio record so a signed
// `!ai`/typed message over LoRa authenticates (and the contact stops
// showing as a radio/federation duplicate). Security is unchanged:
// the bound key is only a candidate the inbound signature must still
// verify against. See `bind_federation_twins`.
super::super::bind_federation_twins(&mut peers);
drop(peers);
state.update_peer_count().await;
if !contacts.is_empty() {
@ -424,21 +537,16 @@ pub(super) async fn run_mesh_session(
warn!("Failed to send initial advert: {}", e);
}
// Archipelago identity advert (`ARCHY:2:{ed}:{x25519}`): broadcast as channel
// text so peers can bind our radio presence to our DID + keys. The firmware
// advert alone carries the meshcore key (and nothing on Meshtastic), so this
// is what makes trust-gating + encrypted DMs work across BOTH transports.
let identity_advert = super::super::protocol::encode_identity_broadcast(
our_did,
our_ed_pubkey_hex,
our_x25519_pubkey_hex,
);
if let Err(e) = device
.send_channel_text(0, identity_advert.as_bytes())
.await
{
warn!("Failed to broadcast archipelago identity: {}", e);
}
// NOTE: Archipelago identity adverts (`ARCHY:2:{ed}:{x25519}`) are intentionally
// NOT broadcast on the shared public channel (channel 0). Doing so spams every
// participant on that channel — including plain Meshtastic/meshcore users who
// just see raw `ARCHY:2:…` text — on startup and again on every advert tick.
// The inbound parser in frames.rs still accepts these from any legacy peer that
// sends them, so trust-binding keeps working when a peer advertises; we simply
// don't pollute the public channel ourselves. A dedicated control channel (or a
// DM-targeted handshake) is the proper transport for this and is tracked
// separately. See encode_identity_broadcast / parse_identity_broadcast.
let _ = (our_did, our_ed_pubkey_hex, our_x25519_pubkey_hex);
// Fetch existing contacts from the device
refresh_contacts(&mut device, state).await;
@ -507,11 +615,9 @@ pub(super) async fn run_mesh_session(
} else {
consecutive_write_failures = 0;
}
// Re-broadcast archipelago identity so peers that joined since
// startup (or missed it) can bind our DID/keys.
if let Err(e) = device.send_channel_text(0, identity_advert.as_bytes()).await {
warn!("Failed to re-broadcast archipelago identity: {}", e);
}
// (Identity re-broadcast on the public channel intentionally
// removed — see the note at session startup. It spammed the
// shared channel every advert tick.)
refresh_contacts(&mut device, state).await;
}
@ -562,6 +668,18 @@ async fn handle_send_command(
)
.await;
}
MeshCommand::SendNativeText {
dest_pubkey_prefix,
payload,
} => {
send_plain_native_text(
device,
&dest_pubkey_prefix,
&payload,
consecutive_write_failures,
)
.await;
}
MeshCommand::SendRaw {
dest_pubkey_prefix,
payload,
@ -615,5 +733,22 @@ async fn handle_send_command(
MeshCommand::RefreshContacts => {
refresh_contacts(device, state).await;
}
MeshCommand::RemoveContact { pubkey } => {
if let Err(e) = device.remove_contact(&pubkey).await {
warn!(pubkey = %hex::encode(pubkey), "remove_contact failed: {}", e);
} else {
info!(pubkey = %hex::encode(&pubkey[..6]), "Removed firmware contact");
}
}
MeshCommand::AddContact { pubkey, name } => {
// type=1 (chat/user), flags=0, out_path_len=0 (firmware will flood
// until a path is learned). last_advert=0 lets the firmware keep its
// own advert timestamp.
if let Err(e) = device.add_contact(&pubkey, 1, 0, 0, &name, 0).await {
warn!(pubkey = %hex::encode(&pubkey[..6]), "add_contact failed: {}", e);
} else {
info!(pubkey = %hex::encode(&pubkey[..6]), "Imported advert as contact");
}
}
}
}

View File

@ -22,6 +22,13 @@ const START2: u8 = 0xc3;
const TO_RADIO_MAX: usize = 512;
const BROADCAST_NUM: u32 = 0xffff_ffff;
const TEXT_MESSAGE_APP: u32 = 1;
/// Meshtastic PortNum for admin (config) packets.
const ADMIN_APP: u32 = 6;
/// AdminMessage.set_owner oneof field number (carries a `User`).
const ADMIN_SET_OWNER_FIELD: u64 = 32;
/// Meshtastic firmware caps long_name at ~40 bytes and short_name at 4 bytes.
const MESHTASTIC_LONG_NAME_MAX: usize = 39;
const MESHTASTIC_SHORT_NAME_MAX: usize = 4;
const TO_RADIO_PACKET: u64 = 1;
const TO_RADIO_WANT_CONFIG_ID: u64 = 3;
@ -42,6 +49,14 @@ pub struct MeshtasticDevice {
long_name: Option<String>,
short_name: Option<String>,
contacts: HashMap<u32, ParsedContact>,
/// Real Curve25519 public keys, keyed by node-num, as learned from NodeInfo
/// (`User.public_key`) or PKC-encrypted inbound packets (`MeshPacket
/// .public_key`). Kept SEPARATE from `contacts[*].public_key_hex`, which is
/// the synthetic node-num-derived routing key that `send_text_msg` relies
/// on — we must not overwrite that or unicast routing breaks. This map only
/// records which peers are PKC-capable, so we can tell a true end-to-end
/// (PKI) DM from a channel-PSK fallback.
peer_pubkeys: HashMap<u32, Vec<u8>>,
device_path: String,
}
@ -68,6 +83,7 @@ impl MeshtasticDevice {
long_name: None,
short_name: None,
contacts: HashMap::new(),
peer_pubkeys: HashMap::new(),
device_path: path.to_string(),
})
}
@ -134,8 +150,56 @@ impl MeshtasticDevice {
})
}
/// Rename the connected Meshtastic radio to match the node's server name so
/// it's findable from external Meshtastic apps (phone/desktop) on the same
/// mesh. Previously this only updated the in-memory field and never told the
/// device — so the radio kept its firmware-default name ("Meshtastic xxxx").
///
/// We push an `AdminMessage { set_owner: User { long_name, short_name } }` to
/// the locally-connected node (an admin packet addressed to our own
/// `node_num`, on the ADMIN_APP port). Local admin over the serial link needs
/// no session passkey, so this is the same path the official phone/CLI client
/// uses for "set owner".
pub async fn set_advert_name(&mut self, name: &str) -> Result<()> {
self.long_name = Some(name.to_string());
let long_name: String = name.chars().take(MESHTASTIC_LONG_NAME_MAX).collect();
let short_name = derive_short_name(name).unwrap_or_else(|| {
self.short_name
.clone()
.unwrap_or_else(|| "NODE".to_string())
});
let Some(node_num) = self.node_num else {
// No local node number yet (initialize() not completed) — can't
// address a local admin packet. Record the intent so advert_name()
// still reflects it, but skip the device write.
warn!("Meshtastic set_advert_name: node_num unknown, skipping device write");
self.long_name = Some(long_name);
self.short_name = Some(short_name);
return Ok(());
};
// User { id?(1), long_name(2), short_name(3) }. Echo back the existing id
// when known so the firmware keeps the node's stable `!xxxxxxxx` id.
let mut user = Vec::new();
if let Some(id) = &self.user_id {
encode_len_field(1, id.as_bytes(), &mut user);
}
encode_len_field(2, long_name.as_bytes(), &mut user);
encode_len_field(3, short_name.as_bytes(), &mut user);
// AdminMessage { set_owner(32): User }
let mut admin = Vec::new();
encode_len_field(ADMIN_SET_OWNER_FIELD, &user, &mut admin);
// Admin packet to ourselves on the ADMIN_APP port.
let packet = encode_mesh_packet(node_num, ADMIN_APP, &admin);
self.send_to_radio(&encode_to_radio_variant(TO_RADIO_PACKET, &packet))
.await
.context("Failed to send Meshtastic set_owner admin packet")?;
info!(node_num, long_name = %long_name, short_name = %short_name, "Set Meshtastic device owner");
self.long_name = Some(long_name);
self.short_name = Some(short_name);
Ok(())
}
@ -150,6 +214,52 @@ impl MeshtasticDevice {
.await
}
/// Native Meshtastic unicast DM. Our synthetic Meshtastic pubkeys carry the
/// numeric node-id in their first 4 bytes (little-endian, see
/// `synthetic_pubkey`), so `dest_pubkey_prefix` directly yields the
/// destination node number. We send a directed MeshPacket (`to` = node num)
/// rather than a `BROADCAST_NUM` channel blast — this is the Meshtastic
/// analog of the meshcore `CMD_SEND_TXT_MSG` fix: the message is delivered
/// as a real DM (only the recipient's client surfaces it) instead of
/// polluting the shared primary channel where every node would see it.
///
/// If the prefix decodes to node 0 / broadcast (e.g. a non-Meshtastic
/// synthetic key routed here by mistake), fall back to a channel send so the
/// device interface stays uniform and the message still goes out.
pub async fn send_text_msg(&mut self, dest_pubkey_prefix: &[u8; 6], msg: &[u8]) -> Result<()> {
let node_num = u32::from_le_bytes([
dest_pubkey_prefix[0],
dest_pubkey_prefix[1],
dest_pubkey_prefix[2],
dest_pubkey_prefix[3],
]);
if node_num == 0 || node_num == BROADCAST_NUM {
return self.send_channel_text(0, msg).await;
}
let text = String::from_utf8_lossy(msg);
let packet = encode_mesh_packet(node_num, TEXT_MESSAGE_APP, text.as_bytes());
self.send_to_radio(&encode_to_radio_variant(TO_RADIO_PACKET, &packet))
.await
}
/// Meshtastic has no meshcore-style contact table; these are no-ops so the
/// device interface stays uniform.
pub async fn remove_contact(&mut self, _pubkey: &[u8; 32]) -> Result<()> {
Ok(())
}
pub async fn add_contact(
&mut self,
_pubkey: &[u8; 32],
_contact_type: u8,
_flags: u8,
_out_path_len: u8,
_name: &str,
_last_advert: u32,
) -> Result<()> {
Ok(())
}
pub async fn get_contacts(&mut self) -> Result<Vec<ParsedContact>> {
if self.contacts.is_empty() {
self.send_to_radio(&encode_want_config()).await?;
@ -188,6 +298,19 @@ impl MeshtasticDevice {
Ok(self.handle_from_radio(&frame))
}
/// Whether we've learned `node_num`'s real PKI (Curve25519) key — from a
/// NodeInfo `public_key` or an inbound PKC DM — meaning the firmware can
/// deliver DMs to/from it end-to-end encrypted instead of falling back to
/// the channel PSK. Driver-internal for now; lets a future mesh-tab badge
/// distinguish a true E2E DM from a channel-encrypted one without changing
/// the shared device interface (which would break meshcore hot-swap).
#[allow(dead_code)] // seam: consumed when the mesh-tab E2E badge lands
pub fn peer_is_pkc_capable(&self, node_num: u32) -> bool {
self.peer_pubkeys
.get(&node_num)
.is_some_and(|k| !k.is_empty())
}
pub fn advert_name(&self) -> Option<String> {
self.long_name
.clone()
@ -260,6 +383,15 @@ impl MeshtasticDevice {
fn update_node_info(&mut self, data: &[u8]) {
if let Some(node) = parse_node_info(data) {
if let Some(pk) = node.public_key.as_ref() {
if self.peer_pubkeys.insert(node.num, pk.clone()).is_none() {
debug!(
node = node.num,
key_len = pk.len(),
"Meshtastic peer is PKC-capable (NodeInfo public_key)"
);
}
}
let key = synthetic_pubkey(node.num);
let name = node
.long_name
@ -292,6 +424,18 @@ impl MeshtasticDevice {
if Some(from) == self.node_num {
return None;
}
// Record E2E status: a `pki_encrypted` packet (or one carrying the
// sender's `public_key`) proves this DM arrived end-to-end encrypted via
// the PKI, not the shared channel PSK. We learn the sender's key here too
// — but keep it OUT of the routing `public_key_hex` (synthetic) so the
// device interface stays identical to meshcore's and the two remain
// hot-swappable behind the mesh listener.
if let Some(pk) = packet.public_key.as_ref() {
self.peer_pubkeys.entry(from).or_insert_with(|| pk.clone());
}
if packet.pki_encrypted {
debug!(node = from, "Meshtastic DM received end-to-end encrypted (PKI)");
}
let from_key = synthetic_pubkey(from);
self.contacts.entry(from).or_insert_with(|| ParsedContact {
public_key_hex: hex::encode(synthetic_pubkey(from)),
@ -339,6 +483,23 @@ fn encode_want_config() -> Vec<u8> {
encode_varint_field(TO_RADIO_WANT_CONFIG_ID, 1)
}
/// Derive a Meshtastic short_name (≤4 chars, the label shown on node icons) from
/// the human node name: the first few alphanumeric characters, upper-cased.
/// Returns `None` when the name has no usable alphanumeric characters.
fn derive_short_name(name: &str) -> Option<String> {
let short: String = name
.chars()
.filter(|c| c.is_alphanumeric())
.take(MESHTASTIC_SHORT_NAME_MAX)
.collect::<String>()
.to_uppercase();
if short.is_empty() {
None
} else {
Some(short)
}
}
fn encode_heartbeat() -> Vec<u8> {
encode_to_radio_variant(TO_RADIO_HEARTBEAT, &[])
}
@ -418,6 +579,7 @@ struct ParsedNode {
long_name: Option<String>,
short_name: Option<String>,
last_heard: Option<u32>,
public_key: Option<Vec<u8>>,
}
fn parse_node_info(data: &[u8]) -> Option<ParsedNode> {
@ -428,6 +590,7 @@ fn parse_node_info(data: &[u8]) -> Option<ParsedNode> {
long_name: None,
short_name: None,
last_heard: None,
public_key: None,
};
while idx < data.len() {
let (field, value, next) = next_field(data, idx)?;
@ -440,6 +603,7 @@ fn parse_node_info(data: &[u8]) -> Option<ParsedNode> {
node.id = user.id;
node.long_name = user.long_name;
node.short_name = user.short_name;
node.public_key = user.public_key;
}
}
(5, FieldValue::Fixed32(v)) => node.last_heard = Some(v),
@ -457,6 +621,7 @@ struct ParsedUser {
id: Option<String>,
long_name: Option<String>,
short_name: Option<String>,
public_key: Option<Vec<u8>>,
}
fn parse_user(data: &[u8]) -> Option<ParsedUser> {
@ -465,6 +630,7 @@ fn parse_user(data: &[u8]) -> Option<ParsedUser> {
id: None,
long_name: None,
short_name: None,
public_key: None,
};
while idx < data.len() {
let (field, value, next) = next_field(data, idx)?;
@ -473,6 +639,9 @@ fn parse_user(data: &[u8]) -> Option<ParsedUser> {
(1, FieldValue::Bytes(b)) => user.id = string_field(b),
(2, FieldValue::Bytes(b)) => user.long_name = string_field(b),
(3, FieldValue::Bytes(b)) => user.short_name = string_field(b),
// User.public_key (field 8): the peer's Curve25519 key. Its presence
// means the radio can PKC-encrypt DMs to this node end-to-end.
(8, FieldValue::Bytes(b)) if !b.is_empty() => user.public_key = Some(b.to_vec()),
_ => {}
}
}
@ -483,18 +652,28 @@ struct ParsedPacket {
from: Option<u32>,
portnum: u32,
payload: Vec<u8>,
/// MeshPacket.pki_encrypted (field 17): the firmware decrypted this packet
/// with the PKI (Curve25519) key, i.e. it arrived end-to-end encrypted
/// rather than via the shared channel PSK.
pki_encrypted: bool,
/// MeshPacket.public_key (field 16): the sender's key, carried on PKC DMs.
public_key: Option<Vec<u8>>,
}
fn parse_mesh_packet(data: &[u8]) -> Option<ParsedPacket> {
let mut idx = 0;
let mut from = None;
let mut decoded = None;
let mut pki_encrypted = false;
let mut public_key = None;
while idx < data.len() {
let (field, value, next) = next_field(data, idx)?;
idx = next;
match (field, value) {
(1, FieldValue::Fixed32(v)) => from = Some(v),
(4, FieldValue::Bytes(b)) => decoded = Some(b),
(16, FieldValue::Bytes(b)) if !b.is_empty() => public_key = Some(b.to_vec()),
(17, FieldValue::Varint(v)) => pki_encrypted = v != 0,
_ => {}
}
}
@ -515,6 +694,8 @@ fn parse_mesh_packet(data: &[u8]) -> Option<ParsedPacket> {
from,
portnum,
payload,
pki_encrypted,
public_key,
})
}

View File

@ -46,6 +46,12 @@ const MESH_CONTACTS_FILE: &str = "mesh-contacts.json";
/// high half of u32 space to avoid collision. Both the receive path
/// (`inject_typed_from_federation`) and the startup pre-seed use this
/// formula so they always produce the same id for the same peer.
/// Mesh contacts at or above this id are synthetic federation peers (the high
/// half of the u32 space). Meshcore radio contacts use the firmware's low-int id
/// space, so this bit cleanly distinguishes "arrived over the authenticated
/// federation transport" from "heard over the radio".
pub(crate) const FEDERATION_CONTACT_ID_BASE: u32 = 0x8000_0000;
pub(crate) fn federation_peer_contact_id(archipelago_pubkey_hex: &str) -> u32 {
let bytes = hex::decode(archipelago_pubkey_hex).unwrap_or_default();
if bytes.len() < 4 {
@ -55,6 +61,157 @@ pub(crate) fn federation_peer_contact_id(archipelago_pubkey_hex: &str) -> u32 {
0x8000_0000 | (low & 0x7FFF_FFFF)
}
/// Bind radio (LoRa) contacts to their federation twin's archipelago identity.
///
/// The same physical node commonly appears twice in the peer table: a radio
/// contact (low `contact_id`, firmware routing key only, `arch_pubkey_hex ==
/// None`) and a federation peer (high `contact_id`, `arch_pubkey_hex` set). The
/// radio half carries no archipelago identity because identity adverts are no
/// longer broadcast on the public channel (anti-spam), so the `!ai` trust gate
/// and envelope signature verification have no key to check a radio asker
/// against — a `!ai` from a trusted node over LoRa is therefore denied, and the
/// node shows up as two separate contacts.
///
/// We correlate the two halves by exact, case-insensitive `advert_name` and copy
/// the federation peer's `arch_pubkey_hex`/`did`/`x25519` onto the radio peer.
/// This only supplies a CANDIDATE identity key; it does NOT bypass
/// authentication. A radio envelope must still carry an Ed25519 signature that
/// verifies against this bound key (see `handle_typed_envelope_direct`), so a
/// meshcore node merely *named* like a trusted node cannot impersonate it — it
/// cannot produce the signature. The candidate key comes from the authenticated
/// federation handshake (`nodes.json`), never from anything the radio packet
/// claims. Names held by more than one federation peer are treated as ambiguous
/// and skipped so a duplicate name can't bind the wrong identity.
pub(crate) fn bind_federation_twins(peers: &mut std::collections::HashMap<u32, MeshPeer>) {
// name (lowercased) -> federation identity; `None` marks an ambiguous name
// (seen on more than one federation peer) which we must not bind.
type FedIdentity = (String, Option<String>, Option<[u8; 32]>);
let mut fed_by_name: std::collections::HashMap<String, Option<FedIdentity>> =
std::collections::HashMap::new();
for p in peers.values() {
if p.contact_id < FEDERATION_CONTACT_ID_BASE {
continue;
}
let Some(arch) = p.arch_pubkey_hex.clone() else {
continue;
};
let name = p.advert_name.trim().to_ascii_lowercase();
if name.is_empty() {
continue;
}
fed_by_name
.entry(name)
.and_modify(|e| *e = None) // a second federation peer with this name → ambiguous
.or_insert(Some((arch, p.did.clone(), p.x25519_pubkey)));
}
if fed_by_name.is_empty() {
return;
}
for p in peers.values_mut() {
if p.contact_id >= FEDERATION_CONTACT_ID_BASE || p.arch_pubkey_hex.is_some() {
continue;
}
let name = p.advert_name.trim().to_ascii_lowercase();
if name.is_empty() {
continue;
}
if let Some(Some((arch, did, x25519))) = fed_by_name.get(&name) {
p.arch_pubkey_hex = Some(arch.clone());
if p.did.is_none() {
p.did = did.clone();
}
if p.x25519_pubkey.is_none() {
p.x25519_pubkey = *x25519;
}
}
}
}
/// One logical contact after collapsing cross-transport twins (see
/// [`group_peer_twins`]). A node reachable both over LoRa and over federation
/// has two `MeshPeer` rows (different `contact_id`s) but is one conversation.
pub(crate) struct PeerGroup {
/// The peer row the UI should address. The radio twin when one exists (so
/// `send_typed_wire` stays mesh-first — LoRa if reachable, else federation
/// via the bound arch key), otherwise the federation row. Gap-healed so its
/// name / `arch_pubkey_hex` / `did` are populated from whichever twin had
/// them, and `reachable` is the OR across the group.
pub canonical: MeshPeer,
/// Every `contact_id` in the group. The conversation's messages are the
/// union of those keyed by any of these ids — federation-injected messages
/// land on the federation twin's id, radio messages on the radio twin's.
pub contact_ids: Vec<u32>,
}
/// Collapse cross-transport twin peers into one conversation per identity.
///
/// The same node commonly appears twice in the peer table: a radio twin (low
/// `contact_id`, firmware routing key) and a federation twin (high
/// `contact_id`, archipelago key), correlated by [`bind_federation_twins`]
/// which copies `arch_pubkey_hex` onto the radio twin but leaves both rows.
/// Messages are keyed by `peer_contact_id`, so they split across the two ids:
/// a federation-injected message sits on the federation row while the user may
/// open the radio row and see an empty thread (the `.120`→`.89` symptom).
///
/// Group peers by `arch_pubkey_hex` when set, else treat each peer as its own
/// singleton group keyed by `contact_id`. Grouping is done ONLY here at surface
/// time — never re-keyed at bind time — so outbound routing keeps the distinct
/// per-twin `contact_id`s and stays mesh-first. First-seen order is preserved
/// for stable downstream sorting.
pub(crate) fn group_peer_twins(peers: &[MeshPeer]) -> Vec<PeerGroup> {
let mut order: Vec<String> = Vec::new();
let mut groups: std::collections::HashMap<String, Vec<MeshPeer>> =
std::collections::HashMap::new();
for p in peers {
let key = match p.arch_pubkey_hex.as_deref() {
Some(arch) if !arch.is_empty() => format!("arch:{}", arch.to_ascii_lowercase()),
_ => format!("cid:{}", p.contact_id),
};
if !groups.contains_key(&key) {
order.push(key.clone());
}
groups.entry(key).or_default().push(p.clone());
}
let mut out = Vec::with_capacity(order.len());
for key in order {
let members = match groups.remove(&key) {
Some(m) if !m.is_empty() => m,
_ => continue,
};
let contact_ids: Vec<u32> = members.iter().map(|m| m.contact_id).collect();
// Canonical = the radio twin (lowest id below the federation base) when
// one exists, else the lowest id overall (a federation-only peer).
let canonical_src = members
.iter()
.filter(|m| m.contact_id < FEDERATION_CONTACT_ID_BASE)
.min_by_key(|m| m.contact_id)
.or_else(|| members.iter().min_by_key(|m| m.contact_id))
.expect("non-empty members");
let mut canonical = canonical_src.clone();
// Heal gaps from the twin: a radio row may lack the advert name, arch
// identity, or did that only the federation row carries.
if canonical.advert_name.trim().is_empty() {
if let Some(named) = members.iter().find(|m| !m.advert_name.trim().is_empty()) {
canonical.advert_name = named.advert_name.clone();
}
}
if canonical.arch_pubkey_hex.is_none() {
canonical.arch_pubkey_hex = members.iter().find_map(|m| m.arch_pubkey_hex.clone());
}
if canonical.did.is_none() {
canonical.did = members.iter().find_map(|m| m.did.clone());
}
// Reachable if ANY twin is reachable (radio path or off-radio federation).
canonical.reachable = members.iter().any(|m| m.reachable);
out.push(PeerGroup {
canonical,
contact_ids,
});
}
out
}
/// Upsert a mesh peer record representing a federation node so the UI can
/// address it as a chat and `mesh.send-content` can route ContentRef to it.
/// Existing entries (same contact_id) are updated in place, preserving any
@ -77,13 +234,24 @@ pub(crate) async fn upsert_federation_peer(
advert_name: display_name,
did: Some(did.to_string()),
pubkey_hex: Some(archipelago_pubkey_hex.to_string()),
// Federation peers are authenticated by the Tor relay upstream; their
// archipelago key is known, so bind it as the identity key too.
arch_pubkey_hex: Some(archipelago_pubkey_hex.to_string()),
x25519_pubkey: existing.as_ref().and_then(|p| p.x25519_pubkey),
rssi: existing.as_ref().and_then(|p| p.rssi),
snr: existing.as_ref().and_then(|p| p.snr),
last_heard: chrono::Utc::now().to_rfc3339(),
hops: existing.as_ref().map(|p| p.hops).unwrap_or(0),
last_advert: existing.as_ref().map(|p| p.last_advert).unwrap_or(0),
// Federation peers are reachable off-radio (Tor/FIPS), so always true.
reachable: true,
};
peers.insert(contact_id, peer);
// A radio twin of this node (same advert_name, no arch identity yet) can now
// inherit this federation peer's archipelago key — so a signed `!ai`/typed
// message arriving over LoRa from it authenticates and the duplicate radio
// contact resolves to the same identity.
bind_federation_twins(&mut peers);
drop(peers);
state.update_peer_count().await;
contact_id
@ -197,6 +365,10 @@ pub struct MeshConfig {
/// local GPU) or "ollama" (a local model on this node).
#[serde(default = "default_assistant_backend")]
pub assistant_backend: String,
/// Per-contact allowlist (ed25519 pubkey hex) permitted to use `!ai` even
/// when `assistant_trusted_only` is on and they aren't federation-Trusted.
#[serde(default)]
pub assistant_allowed_contacts: Vec<String>,
}
fn default_assistant_backend() -> String {
@ -224,6 +396,7 @@ impl Default for MeshConfig {
assistant_model: None,
assistant_trusted_only: true,
assistant_backend: default_assistant_backend(),
assistant_allowed_contacts: Vec::new(),
}
}
}
@ -401,6 +574,7 @@ impl MeshService {
model: config.assistant_model.clone(),
trusted_only: config.assistant_trusted_only,
backend: config.assistant_backend.clone(),
allowed_contacts: config.assistant_allowed_contacts.clone(),
},
data_dir.to_path_buf(),
);
@ -578,6 +752,7 @@ impl MeshService {
let mut interval = tokio::time::interval(Duration::from_secs(30));
interval.tick().await; // skip first
let mut last_announced_height: u64 = 0;
let mut last_announce_at: Option<std::time::Instant> = None;
let client = match reqwest::Client::builder()
.timeout(Duration::from_secs(10))
.build()
@ -595,6 +770,18 @@ impl MeshService {
// Poll Bitcoin Core for latest block
match bitcoin_rpc_getblockcount(&client).await {
Ok(height) if height > last_announced_height => {
// Advance the tip baseline immediately so a fast Bitcoin
// catch-up (a new block every poll) doesn't re-fire each tick.
last_announced_height = height;
// Throttle: at most one announcement per ~9 min. Real ~10 min
// blocks still propagate, but a rapid catch-up can no longer
// flood the shared LoRa channel.
if last_announce_at
.map(|t| t.elapsed() < Duration::from_secs(540))
.unwrap_or(false)
{
continue;
}
if let Ok(header) = bitcoin_rpc_getblockheader_by_height(&client, height).await {
// Store in cache
let payload = message_types::BlockHeaderPayload {
@ -640,30 +827,15 @@ impl MeshService {
}
}
}
// Second pass: any peer if no Archy nodes found
if sent == 0 {
for peer in peers.values() {
if sent >= max_peers { break; }
if let Some(ref pk) = peer.pubkey_hex {
if let Ok(pk_bytes) = hex::decode(pk) {
if pk_bytes.len() >= 6 {
let mut prefix = [0u8; 6];
prefix.copy_from_slice(&pk_bytes[..6]);
let _ = bha_state.send_cmd(
listener::MeshCommand::SendRaw {
dest_pubkey_prefix: prefix,
payload: wire.clone(),
},
).await;
sent += 1;
}
}
}
}
}
// NOTE: intentionally NO fallback to arbitrary
// peers. Block headers go ONLY to known Archy
// (federated) nodes — never to random meshcore
// devices on the shared public channel.
drop(peers);
last_announced_height = height;
info!(height, hash = %header.hash, peers = sent, "Announced block header to Archy peers");
if sent > 0 {
last_announce_at = Some(std::time::Instant::now());
info!(height, hash = %header.hash, peers = sent, "Announced block header to Archy peers");
}
}
Err(e) => warn!("Failed to build block announcement: {}", e),
}
@ -917,7 +1089,25 @@ impl MeshService {
// over Tor; otherwise the send falls through to LoRa.
let is_federation_synthetic = contact_id & 0x8000_0000 != 0;
let exceeds_lora = wire.len() > protocol::MAX_MESSAGE_LEN;
if is_federation_synthetic || exceeds_lora {
// Mesh-preferred routing with a federation fallback. A normal radio
// contact is delivered over LoRa (preferred — free, local, no internet).
// But if that contact is the same node as a federated peer — we know its
// archipelago identity (`arch_pubkey_hex`) → onion — AND it is NOT
// currently reachable over the radio (out of LoRa range, e.g. a peer on
// another continent), route the message over the federation transport
// (FIPS→Tor) instead of handing it to a radio that physically cannot
// deliver it. Reachable radio peers stay on the mesh; oversized
// envelopes (file shares etc.) always take the federation path.
let radio_federated_unreachable = !is_federation_synthetic
&& !exceeds_lora
&& {
let peers = self.state.peers.read().await;
peers
.get(&contact_id)
.map(|p| !p.reachable && p.arch_pubkey_hex.is_some())
.unwrap_or(false)
};
if is_federation_synthetic || exceeds_lora || radio_federated_unreachable {
// Resolve the peer's pubkey/did. Prefer the live mesh peer table,
// but fall back to federation storage for federation-synthetic ids
// that were never seeded into `state.peers` — e.g. a radio-less
@ -926,9 +1116,15 @@ impl MeshService {
// even though we know its onion from nodes.json.
let from_table = {
let peers = self.state.peers.read().await;
peers
.get(&contact_id)
.map(|p| (p.pubkey_hex.clone(), p.did.clone()))
peers.get(&contact_id).map(|p| {
// Resolve via the archipelago IDENTITY key (not the firmware
// routing key) — that's what matches the peer's onion entry
// in nodes.json for the federation lookup below.
(
p.arch_pubkey_hex.clone().or_else(|| p.pubkey_hex.clone()),
p.did.clone(),
)
})
};
let (peer_pubkey, peer_did) = match from_table {
Some(v) => v,
@ -1062,7 +1258,14 @@ impl MeshService {
"/archipelago/mesh-typed",
)
.service(crate::settings::transport::PeerService::Messaging)
.timeout(std::time::Duration::from_secs(120));
.timeout(std::time::Duration::from_secs(120))
// Fast-fail a FIPS path the peer isn't reachable on (the common case
// for remote/Tailscale peers that share no FIPS overlay with us) so
// the Tor fallback delivers the message in ~3-5s instead of the send
// hanging on FIPS. FIPS-reachable peers connect in <1s and still use
// it; only an unreachable FIPS path is short-circuited. Matches the
// federation-sync fix. 8s ≈ the FIPS connect_timeout headroom.
.fips_timeout(std::time::Duration::from_secs(8));
match req.send_json(&body).await {
Ok((resp, transport)) if resp.status().is_success() => {
tracing::debug!(contact_id, transport = %transport, "Federation envelope delivered");
@ -1267,13 +1470,57 @@ impl MeshService {
pub async fn send_message(&self, contact_id: u32, text: &str) -> Result<MeshMessage> {
use crate::mesh::message_types::{MeshMessageType, TypedEnvelope};
let seq = self.state.next_send_seq(contact_id).await;
// Stock (non-archipelago) radio contacts — e.g. a phone running the
// MeshCore app — can't decode our typed envelope and would render it as
// garbled bytes. Send them the raw text as a plain native DM instead.
// Archipelago peers still get the typed envelope (seq/reply/reaction
// addressing + encryption).
if !self.is_archy_peer(contact_id).await {
let dest_prefix = self.peer_dest_prefix(contact_id).await?;
self.state
.send_cmd(listener::MeshCommand::SendNativeText {
dest_pubkey_prefix: dest_prefix,
payload: text.as_bytes().to_vec(),
})
.await
.map_err(|_| anyhow::anyhow!("Mesh listener not running"))?;
return Ok(self
.record_sent_typed(contact_id, "text", text, None, seq)
.await);
}
// Sign the envelope with our archipelago identity key so the receiver
// can authenticate us over LoRa (it verifies against our bound
// `arch_pubkey_hex`). This is what lets a `!ai` typed in chat to a
// trusted node pass the receiver's `trusted_only` gate over the radio —
// an unsigned radio packet can never authenticate. The signature is
// optional on the wire and ignored by peers that don't know our key, so
// it stays backward compatible. (Federation/Tor sends already sign in
// `send_typed_wire_via_federation`.) `with_seq` is applied after signing
// — seq is not covered by the signature.
let envelope =
TypedEnvelope::new(MeshMessageType::Text, text.as_bytes().to_vec()).with_seq(seq);
TypedEnvelope::new_signed(MeshMessageType::Text, text.as_bytes().to_vec(), &self.signing_key)
.with_seq(seq);
let wire = envelope.to_wire()?;
self.send_typed_wire(contact_id, wire, "text", text, None, seq)
.await
}
/// Whether `contact_id` is an archipelago peer (vs a stock meshcore client).
/// Federation-synthetic ids are always archy; radio contacts count as archy
/// only once we've learned their archipelago identity (DID or x25519 key,
/// from federation seeding or an identity exchange). Stock clients have
/// neither, so we send them plain text rather than typed envelopes.
async fn is_archy_peer(&self, contact_id: u32) -> bool {
if contact_id & 0x8000_0000 != 0 {
return true;
}
let peers = self.state.peers.read().await;
peers
.get(&contact_id)
.map(|p| p.did.is_some() || p.x25519_pubkey.is_some())
.unwrap_or(false)
}
/// Record a Sent MeshMessage for a typed envelope that has already been
/// transmitted by the caller. Used by the RPC layer after sending
/// invoice/coordinate/alert/etc. so the UI gets a proper rich Sent card
@ -1392,6 +1639,12 @@ impl MeshService {
self.state.assistant.read().await.clone()
}
/// Recently-denied `!ai` askers (newest first) so the UI can offer to allow
/// them. Cleared implicitly as new denials rotate older ones out.
pub async fn assistant_denied_askers(&self) -> Vec<listener::DeniedAsker> {
self.state.assist_denied.read().await.iter().cloned().collect()
}
/// Update the mesh-AI assistant settings live (no listener restart) and
/// persist them to the mesh config. `model: Some(None)` clears the override
/// (falls back to the built-in default); `None` leaves a field unchanged.
@ -1401,6 +1654,7 @@ impl MeshService {
model: Option<Option<String>>,
trusted_only: Option<bool>,
backend: Option<String>,
allowed_contacts: Option<Vec<String>>,
) -> Result<()> {
{
let mut a = self.state.assistant.write().await;
@ -1416,6 +1670,9 @@ impl MeshService {
if let Some(b) = backend {
a.backend = b;
}
if let Some(list) = allowed_contacts {
a.allowed_contacts = list;
}
}
// Persist by updating the on-disk config (the in-memory `self.config`
// snapshot stays as-is; the live `state.assistant` is the runtime
@ -1427,6 +1684,7 @@ impl MeshService {
cfg.assistant_model = a.model.clone();
cfg.assistant_trusted_only = a.trusted_only;
cfg.assistant_backend = a.backend.clone();
cfg.assistant_allowed_contacts = a.allowed_contacts.clone();
}
save_config(&self.data_dir, &cfg).await?;
Ok(())
@ -1587,6 +1845,57 @@ async fn bitcoin_rpc_getblockheader_by_height(
mod tests {
use super::*;
fn mk_peer(contact_id: u32, name: &str, arch: Option<&str>, reachable: bool) -> MeshPeer {
MeshPeer {
contact_id,
advert_name: name.to_string(),
did: None,
pubkey_hex: Some(format!("fw{contact_id}")),
arch_pubkey_hex: arch.map(|s| s.to_string()),
x25519_pubkey: None,
rssi: None,
snr: None,
last_heard: String::new(),
hops: 0,
last_advert: 0,
reachable,
}
}
#[test]
fn test_group_peer_twins_collapses_radio_and_federation() {
let radio = mk_peer(42, "Archy-X250-EXP", Some("ABCD"), false);
let fed = mk_peer(0x8000_0001, "Archy-X250-EXP", Some("abcd"), true);
let groups = group_peer_twins(&[radio, fed]);
assert_eq!(groups.len(), 1, "twins must collapse to one conversation");
let g = &groups[0];
// Canonical = the radio twin (mesh-first send), but reachability is the
// OR across twins (federation is reachable off-radio).
assert_eq!(g.canonical.contact_id, 42);
assert!(g.canonical.reachable);
// Both ids retained so messages can be unioned across them.
assert!(g.contact_ids.contains(&42) && g.contact_ids.contains(&0x8000_0001));
}
#[test]
fn test_group_peer_twins_keeps_distinct_identities_and_unbound_radio() {
// Two different identities + one radio peer that was never bound to a
// federation twin (arch = None) → three separate conversations.
let a = mk_peer(1, "Alice", Some("aa"), false);
let b = mk_peer(2, "Bob", Some("bb"), true);
let lonely = mk_peer(3, "Carol-radio", None, false);
let groups = group_peer_twins(&[a, b, lonely]);
assert_eq!(groups.len(), 3);
}
#[test]
fn test_group_peer_twins_federation_only_uses_federation_id() {
let fed = mk_peer(0x8000_00ff, "Arch Dev", Some("dead"), true);
let groups = group_peer_twins(&[fed]);
assert_eq!(groups.len(), 1);
assert_eq!(groups[0].canonical.contact_id, 0x8000_00ff);
}
#[test]
fn test_mesh_config_default() {
let config = MeshConfig::default();

View File

@ -30,6 +30,13 @@ pub const CMD_SYNC_NEXT_MESSAGE: u8 = 0x0A;
/// known" — without this, the firmware silently drops outbound TXT_MSG
/// frames to such contacts.
pub const CMD_RESET_PATH: u8 = 0x0D;
/// CMD_ADD_UPDATE_CONTACT (0x09): add or update a contact in the firmware
/// table. 144-byte frame (see `build_add_contact`).
pub const CMD_ADD_UPDATE_CONTACT: u8 = 0x09;
/// CMD_REMOVE_CONTACT (0x0F): `[0x0F][pub_key:32]` — delete a contact from the
/// firmware's persistent table (used by clear-all so wiped contacts actually
/// go away and only return when they re-advertise).
pub const CMD_REMOVE_CONTACT: u8 = 0x0F;
pub const CMD_SET_RADIO_PARAMS: u8 = 0x0B;
pub const CMD_SET_RADIO_TX_POWER: u8 = 0x0C;
pub const CMD_SET_TUNING_PARAMS: u8 = 0x15;
@ -258,6 +265,45 @@ pub fn build_reset_path(pubkey: &[u8; 32]) -> Vec<u8> {
encode_frame(&data)
}
/// CMD_REMOVE_CONTACT (0x0F): `[0x0F][pub_key:32]`. Removes the contact from
/// the firmware's persistent contact table.
pub fn build_remove_contact(pubkey: &[u8; 32]) -> Vec<u8> {
let mut data = vec![CMD_REMOVE_CONTACT];
data.extend_from_slice(pubkey);
encode_frame(&data)
}
/// CMD_ADD_UPDATE_CONTACT (0x09): add/update a contact. 144-byte body:
/// `[0x09][pub_key:32][type:1][flags:1][out_path_len:1][out_path:64][name:32]
/// [last_advert:4 LE][adv_lat:4 LE][adv_lon:4 LE]`.
/// `name` is zero-padded to 32 bytes (the firmware fills it from the heard
/// advert on its side too, so an empty name still resolves on get-contacts).
pub fn build_add_contact(
pubkey: &[u8; 32],
contact_type: u8,
flags: u8,
out_path_len: u8,
name: &str,
last_advert: u32,
) -> Vec<u8> {
let mut data = Vec::with_capacity(144);
data.push(CMD_ADD_UPDATE_CONTACT);
data.extend_from_slice(pubkey); // 32
data.push(contact_type); // 1
data.push(flags); // 1
data.push(out_path_len); // 1
data.extend_from_slice(&[0u8; 64]); // out_path (64)
let mut name_buf = [0u8; 32];
let nb = name.as_bytes();
let n = nb.len().min(32);
name_buf[..n].copy_from_slice(&nb[..n]);
data.extend_from_slice(&name_buf); // name (32)
data.extend_from_slice(&last_advert.to_le_bytes()); // last_advert (4)
data.extend_from_slice(&0i32.to_le_bytes()); // adv_lat (4)
data.extend_from_slice(&0i32.to_le_bytes()); // adv_lon (4)
encode_frame(&data)
}
/// CMD_SYNC_NEXT_MESSAGE (0x0A): Retrieve the next queued message.
pub fn build_sync_next_message() -> Vec<u8> {
encode_frame(&[CMD_SYNC_NEXT_MESSAGE])

View File

@ -206,6 +206,24 @@ impl MeshcoreDevice {
Ok(())
}
/// Send a NATIVE meshcore direct message (CMD_SEND_TXT_MSG) to a contact,
/// addressed by the first 6 bytes of its public key. Unlike the
/// `@DM2`-over-channel path, this is a real unicast — it does not appear on
/// the public channel, and a stock meshcore client receives it as a normal
/// DM. The contact must already exist in the firmware table (with a path).
pub async fn send_text_msg(&mut self, dest_pubkey_prefix: &[u8; 6], msg: &[u8]) -> Result<()> {
let frame_data = protocol::build_send_text(dest_pubkey_prefix, msg)?;
self.send_raw(&frame_data).await?;
let frame = self.recv_frame_timeout(READ_TIMEOUT).await?;
if frame.code == protocol::RESP_ERR {
anyhow::bail!(
"Direct text send failed: {}",
protocol::parse_error(&frame.data)
);
}
Ok(())
}
/// Clear the stored routing path for a contact so the firmware flood-
/// routes future messages instead of dropping them when path_len=0.
pub async fn reset_contact_path(&mut self, pubkey: &[u8; 32]) -> Result<()> {
@ -217,6 +235,47 @@ impl MeshcoreDevice {
Ok(())
}
/// Delete a contact from the firmware's persistent contact table.
pub async fn remove_contact(&mut self, pubkey: &[u8; 32]) -> Result<()> {
self.send_raw(&protocol::build_remove_contact(pubkey))
.await?;
let frame = self.recv_frame_timeout(READ_TIMEOUT).await?;
if frame.code == protocol::RESP_ERR {
anyhow::bail!(
"Remove contact failed: {}",
protocol::parse_error(&frame.data)
);
}
Ok(())
}
/// Add/update a contact in the firmware table (CMD_ADD_UPDATE_CONTACT).
/// Used to import a heard advert so it shows up as a contact immediately.
pub async fn add_contact(
&mut self,
pubkey: &[u8; 32],
contact_type: u8,
flags: u8,
out_path_len: u8,
name: &str,
last_advert: u32,
) -> Result<()> {
self.send_raw(&protocol::build_add_contact(
pubkey,
contact_type,
flags,
out_path_len,
name,
last_advert,
))
.await?;
let frame = self.recv_frame_timeout(READ_TIMEOUT).await?;
if frame.code == protocol::RESP_ERR {
anyhow::bail!("Add contact failed: {}", protocol::parse_error(&frame.data));
}
Ok(())
}
/// Get the list of known contacts from the device.
/// Protocol: CMD_GET_CONTACTS -> CONTACT_START(count) -> N×CONTACT -> CONTACT_END
pub async fn get_contacts(&mut self) -> Result<Vec<protocol::ParsedContact>> {

View File

@ -32,8 +32,18 @@ pub struct MeshPeer {
pub advert_name: String,
/// Archipelago DID (did:key:z...) if identity was received.
pub did: Option<String>,
/// Ed25519 public key hex if identity was received.
/// Routing key hex. For a radio (meshcore) peer this is the firmware
/// contact public key used to address outbound DMs; for a federation-
/// seeded peer it is the archipelago ed25519 key. Used for delivery, NOT
/// for authentication — see `arch_pubkey_hex`.
pub pubkey_hex: Option<String>,
/// Verified archipelago ed25519 identity key hex, bound from a signed
/// identity advert (`handle_identity_received`) or federation seeding.
/// Unlike `pubkey_hex`, this is NEVER overwritten by `refresh_contacts`
/// with the firmware routing key, so it stays stable for the `!ai` auth
/// gate, envelope signature verification, and federation-trust matching.
#[serde(default)]
pub arch_pubkey_hex: Option<String>,
/// X25519 public key (32 bytes) for key agreement.
#[serde(skip)]
pub x25519_pubkey: Option<[u8; 32]>,
@ -45,6 +55,28 @@ pub struct MeshPeer {
pub last_heard: String,
/// Number of hops to reach this peer.
pub hops: u8,
/// Firmware advert timestamp (unix secs) of the contact's last advert, or
/// 0 if unknown. Used to gauge reachability/recency in the UI.
#[serde(default)]
pub last_advert: u32,
/// Best-effort "currently reachable" flag: the radio has a route to this
/// contact (or it's a federation/identity peer reachable off-radio). A
/// contact with no path and no recent advert is shown as unreachable.
#[serde(default)]
pub reachable: bool,
}
impl MeshPeer {
/// The key to use when AUTHENTICATING this peer (`!ai` trust/allowlist,
/// envelope signature verification): the verified archipelago identity key
/// if one is bound, otherwise the routing key. Never use the firmware
/// routing key for auth when an archipelago identity is known — a radio
/// peer's firmware key won't match its `nodes.json` archipelago key.
pub fn identity_pubkey_hex(&self) -> Option<&str> {
self.arch_pubkey_hex
.as_deref()
.or(self.pubkey_hex.as_deref())
}
}
/// Direction of a mesh message.
@ -172,4 +204,61 @@ pub enum MeshEvent {
to_contact_id: u32,
error: Option<String>,
},
/// A local-AI answer to a `!ai`-in-chat query, to be delivered back into
/// the 1:1 thread via the transport-aware `MeshService::send_message`
/// (Tor for federation peers, LoRa for radio peers). The mesh listener
/// emits this because it can't route over federation itself — the signing
/// key and Tor client live on MeshService. Consumed at the server layer.
AssistChatReply {
contact_id: u32,
text: String,
},
}
#[cfg(test)]
mod tests {
use super::*;
fn peer(arch: Option<&str>, routing: Option<&str>) -> MeshPeer {
MeshPeer {
contact_id: 1,
advert_name: "Test".into(),
did: None,
pubkey_hex: routing.map(|s| s.to_string()),
arch_pubkey_hex: arch.map(|s| s.to_string()),
x25519_pubkey: None,
rssi: None,
snr: None,
last_heard: String::new(),
hops: 0,
last_advert: 0,
reachable: false,
}
}
#[test]
fn identity_prefers_bound_archipelago_key_over_firmware_routing_key() {
// A radio peer that sent an identity advert: routing key is the firmware
// contact key, but auth must use the bound archipelago key.
let p = peer(Some("archkey"), Some("firmwarekey"));
assert_eq!(p.identity_pubkey_hex(), Some("archkey"));
}
#[test]
fn identity_falls_back_to_routing_key_when_no_advert() {
// A plain peer with no archipelago identity bound: fall back to whatever
// key we have (federation peers carry the arch key in pubkey_hex).
let p = peer(None, Some("firmwarekey"));
assert_eq!(p.identity_pubkey_hex(), Some("firmwarekey"));
assert_eq!(peer(None, None).identity_pubkey_hex(), None);
}
#[test]
fn refresh_style_routing_update_does_not_change_identity() {
// Simulates refresh_contacts: pubkey_hex (routing) is rewritten to a new
// firmware key while arch_pubkey_hex (identity) is preserved.
let mut p = peer(Some("archkey"), Some("firmware-old"));
p.pubkey_hex = Some("firmware-new".into());
assert_eq!(p.identity_pubkey_hex(), Some("archkey"));
}
}

View File

@ -305,6 +305,47 @@ impl Server {
.rpc_handler()
.set_mesh_service(mesh_service)
.await;
// Mesh-AI assistant (#50): deliver `!ai`-in-chat answers via
// the transport-aware send path. The listener can't route
// over federation itself (send_message needs the signing key
// + Tor client on MeshService), so it emits AssistChatReply
// and we fulfil it here through the shared MeshService —
// which POSTs over Tor for federation askers and falls back
// to LoRa for radio askers, recording the Sent bubble.
{
let mesh_arc = api_handler.rpc_handler().mesh_service_arc();
let mut reply_rx = {
let guard = mesh_arc.read().await;
guard.as_ref().map(|svc| svc.state().event_tx.subscribe())
};
if let Some(mut rx) = reply_rx.take() {
tokio::spawn(async move {
loop {
match rx.recv().await {
Ok(crate::mesh::MeshEvent::AssistChatReply {
contact_id,
text,
}) => {
let guard = mesh_arc.read().await;
if let Some(svc) = guard.as_ref() {
if let Err(e) =
svc.send_message(contact_id, &text).await
{
warn!("AI chat reply send failed: {}", e);
}
}
}
Ok(_) => {}
Err(tokio::sync::broadcast::error::RecvError::Lagged(
_,
)) => continue,
Err(_) => break, // sender dropped → mesh stopped
}
}
});
}
}
info!("📡 Mesh service initialized");
}
Err(e) => {
@ -387,6 +428,99 @@ impl Server {
});
}
// Periodic federation auto-sync. Pulls every federated peer's state on a
// timer so renamed nodes and roster changes propagate WITHOUT a manual
// "Sync" click. Each sync now fast-fails a dead FIPS path and falls back
// to Tor (~3-5s), so a full pass over a handful of peers is quick.
{
let data_dir = config.data_dir.clone();
let state = state_manager.clone();
tokio::spawn(async move {
// Delay the first pass so Tor/onion publishing settles after boot.
tokio::time::sleep(Duration::from_secs(20)).await;
let mut interval = tokio::time::interval(Duration::from_secs(90));
loop {
interval.tick().await;
let nodes = match crate::federation::load_nodes(&data_dir).await {
Ok(n) if !n.is_empty() => n,
_ => continue,
};
let (snap, _) = state.get_snapshot().await;
let local_did =
match crate::identity::did_key_from_pubkey_hex(&snap.server_info.pubkey) {
Ok(d) => d,
Err(_) => continue,
};
let identity_dir = data_dir.join("identity");
let node_identity =
match crate::identity::NodeIdentity::load_or_create(&identity_dir).await {
Ok(id) => id,
Err(_) => continue,
};
// Our own identity, for re-asserting membership to any peer
// that doesn't list us back (asymmetry self-heal, below).
let local_onion = snap.server_info.tor_address.clone().unwrap_or_default();
let local_pubkey = snap.server_info.pubkey.clone();
let local_name = snap.server_info.name.clone();
let local_fips_npub =
crate::identity::fips_npub(&identity_dir).await.unwrap_or(None);
let mut ok = 0usize;
let mut healed = 0usize;
for node in &nodes {
if node.trust_level == crate::federation::TrustLevel::Untrusted {
continue;
}
match crate::federation::sync_with_peer(&data_dir, node, &local_did, |b| {
node_identity.sign(b)
})
.await
{
Ok(state) => {
ok += 1;
// Asymmetry self-heal: if this peer's exported
// trusted list doesn't include us, our original
// peer-joined never landed (e.g. it was sent
// before the reliable-notify fix, or the peer was
// down). Re-assert membership over the now
// FIPS-fast-failing/Tor path so they add us back.
// Without this, a node that joined everyone stays
// invisible to the whole fleet until a manual
// re-add (the "X250-EXP missing everywhere" case).
let they_list_us = state
.federated_peers
.iter()
.any(|h| h.did == local_did);
if !they_list_us && !local_onion.is_empty() {
crate::federation::notify_join(
&node.onion,
node.fips_npub.as_deref(),
&local_did,
&local_onion,
&local_pubkey,
local_fips_npub.as_deref(),
local_name.as_deref(),
|b| node_identity.sign(b),
)
.await
.ok();
healed += 1;
}
}
Err(e) => {
debug!(peer = %node.did, error = %e, "federation auto-sync (non-fatal)")
}
}
}
debug!(
synced = ok,
reasserted = healed,
total = nodes.len(),
"federation auto-sync pass complete"
);
}
});
}
// Initialize container scanner — discovers installed apps from Podman/Docker
{
let scanner = create_docker_scanner(&config).await?;

View File

@ -42,6 +42,16 @@ pub struct EcashTransaction {
/// Peer identifier (DID, pubkey, or onion) if applicable.
#[serde(default)]
pub peer: String,
/// Which ecash system this entry belongs to: "cashu" or "fedimint". Lets the
/// unified history/UI label each transaction. Defaults to "cashu" so legacy
/// stored entries (written before dual-ecash) read back correctly.
#[serde(default = "default_tx_kind")]
pub kind: String,
}
/// Default `kind` for transactions persisted before the dual-ecash split.
pub fn default_tx_kind() -> String {
"cashu".to_string()
}
/// A stored proof with metadata.
@ -190,6 +200,7 @@ impl WalletState {
description: description.to_string(),
mint_url: mint_url.to_string(),
peer: peer.to_string(),
kind: "cashu".to_string(),
});
}
@ -1107,6 +1118,24 @@ pub async fn verify_and_receive_payment(
return Ok(received);
}
// Fedimint notes (#3): a buyer whose balance is in Fedimint pays with notes
// rather than a Cashu token. Cashu tokens all start with "cashu" (cashuA/
// cashuB, or the legacy form handled above), so anything else is treated as
// Fedimint notes and redeemed into a joined federation. reissue_into_any
// verifies the notes are unspent and credits this node's wallet.
if !token_str.starts_with("cashu") {
let (received, _fed_id) =
crate::wallet::fedimint_client::reissue_into_any(data_dir, token_str).await?;
if received < required_sats {
anyhow::bail!(
"Insufficient payment: {} sats, need {} sats",
received,
required_sats
);
}
return Ok(received);
}
// Parse and validate cashuA token
let token = CashuToken::deserialize(token_str)?;
let total = token.total_amount();
@ -1175,8 +1204,13 @@ pub async fn get_balance(data_dir: &Path) -> Result<u64> {
}
/// Default mint URL (local Fedimint).
/// Default Cashu mint. Minibits is a well-known public Cashu mint — note this
/// is a CASHU mint, distinct from the local Fedimint guardian (:8175), which is
/// a separate ecash protocol managed under the Fedimint Federations tab. The
/// old default pointed at :8175, which incorrectly surfaced the Fedimint URL in
/// the Cashu mints list.
fn default_mint_url() -> String {
"http://127.0.0.1:8175".to_string()
"https://mint.minibits.cash/Bitcoin".to_string()
}
#[cfg(test)]
@ -1359,11 +1393,11 @@ mod tests {
async fn test_save_and_load_wallet_roundtrip() {
let tmp = TempDir::new().unwrap();
let mut wallet = WalletState {
mint_url: "http://127.0.0.1:8175".into(),
mint_url: "https://mint.minibits.cash/Bitcoin".into(),
..Default::default()
};
wallet.add_proofs(
"http://127.0.0.1:8175",
"https://mint.minibits.cash/Bitcoin",
vec![Proof {
amount: 42,
id: "ks1".into(),
@ -1375,7 +1409,7 @@ mod tests {
TransactionType::Mint,
42,
"Test mint",
"http://127.0.0.1:8175",
"https://mint.minibits.cash/Bitcoin",
"",
);
@ -1491,6 +1525,7 @@ mod tests {
description: "test".into(),
mint_url: String::new(),
peer: String::new(),
kind: default_tx_kind(),
};
let json = serde_json::to_string(&tx).unwrap();
assert!(json.contains("\"streamingpayment\""));
@ -1498,7 +1533,7 @@ mod tests {
#[test]
fn test_default_mint_url() {
assert_eq!(default_mint_url(), "http://127.0.0.1:8175");
assert_eq!(default_mint_url(), "https://mint.minibits.cash/Bitcoin");
}
#[test]
@ -1517,9 +1552,11 @@ mod tests {
.await
.unwrap());
// Trailing slash on the home URL still matches.
assert!(is_mint_trusted(tmp.path(), "http://127.0.0.1:8175/")
.await
.unwrap());
assert!(
is_mint_trusted(tmp.path(), "https://mint.minibits.cash/Bitcoin/")
.await
.unwrap()
);
}
#[tokio::test]
@ -1553,7 +1590,7 @@ mod tests {
let err = swap_between_mints(
tmp.path(),
&default_mint_url(),
"http://127.0.0.1:8175/",
"https://mint.minibits.cash/Bitcoin/",
100,
10,
)

View File

@ -49,6 +49,13 @@ pub struct FederationRegistry {
const REGISTRY_FILE: &str = "wallet/fedimint_federations.json";
/// Shared HTTP-Basic password between the fmcd container and this bridge. The
/// fedimint-clientd manifest generates it via `generated_secrets: [fmcd-password]`
/// and injects it through `secret_env`; the bridge reads the same file in
/// `from_node`. (Generation lives in `container::secrets`, not here — it's a
/// generic, manifest-declared concern, not fedimint-specific.)
const FMCD_PASSWORD_SECRET: &str = "fmcd-password";
pub async fn load_registry(data_dir: &Path) -> Result<FederationRegistry> {
let path = data_dir.join(REGISTRY_FILE);
if !path.exists() {
@ -72,6 +79,62 @@ pub async fn save_registry(data_dir: &Path, reg: &FederationRegistry) -> Result<
Ok(())
}
/// Local Fedimint transaction log. fmcd has no per-node history API, so we
/// record each redeem/spend ourselves and merge it with the Cashu history in
/// `wallet.ecash-history` — otherwise a Fedimint receive shows nowhere.
const FEDIMINT_TX_FILE: &str = "wallet/fedimint_transactions.json";
/// Load the local Fedimint transaction log (newest entries last). Empty on any
/// error — history is best-effort and must never block a wallet operation.
pub async fn load_fedimint_txs(data_dir: &Path) -> Vec<crate::wallet::ecash::EcashTransaction> {
match fs::read_to_string(data_dir.join(FEDIMINT_TX_FILE)).await {
Ok(s) => serde_json::from_str(&s).unwrap_or_default(),
Err(_) => Vec::new(),
}
}
/// Append a Fedimint transaction to the local log so it appears in unified
/// history with meaningful data. Best-effort: write failures are logged, not
/// propagated, so they never fail the redeem/spend that produced the funds.
pub async fn record_fedimint_tx(
data_dir: &Path,
tx_type: crate::wallet::ecash::TransactionType,
amount_sats: u64,
federation_id: &str,
description: &str,
) {
let mut txs = load_fedimint_txs(data_dir).await;
txs.push(crate::wallet::ecash::EcashTransaction {
id: uuid::Uuid::new_v4().to_string(),
tx_type,
amount_sats,
timestamp: chrono::Utc::now().to_rfc3339(),
description: description.to_string(),
// Cashu uses mint_url; for Fedimint we record the federation id in `peer`
// so the UI can show which federation the funds moved through.
mint_url: String::new(),
peer: federation_id.to_string(),
kind: "fedimint".to_string(),
});
// Cap the log so it can't grow unbounded.
let len = txs.len();
if len > 500 {
txs.drain(0..len - 500);
}
if let Err(e) = fs::create_dir_all(data_dir.join("wallet")).await {
tracing::warn!("fedimint tx log: could not create wallet dir: {e}");
return;
}
match serde_json::to_string_pretty(&txs) {
Ok(content) => {
if let Err(e) = fs::write(data_dir.join(FEDIMINT_TX_FILE), content).await {
tracing::warn!("fedimint tx log: write failed: {e}");
}
}
Err(e) => tracing::warn!("fedimint tx log: serialize failed: {e}"),
}
}
/// Idempotently ensure the node has joined the default federation and that it
/// is tracked in the local registry. Best-effort: silently no-ops if clientd
/// isn't installed/running yet. Joining is idempotent on the clientd side.
@ -80,6 +143,34 @@ pub async fn ensure_default_federation(data_dir: &Path) -> Result<()> {
Ok(c) => c,
Err(_) => return Ok(()), // clientd not configured yet
};
// Fast path: if fmcd already reports a joined federation, do NOT re-issue the
// POST /v2/admin/join. That call re-syncs federation config against the
// guardians and adds seconds of latency — and ensure_default_federation runs
// on every wallet.fedimint-list / spend / reissue, so the join was being paid
// on each balance refresh (the "mints take ages to load" report). The cheap
// GET /v2/admin/info is enough to confirm membership; just reconcile the local
// registry against the live joined set and return.
let joined = client.joined_federation_ids().await;
if !joined.is_empty() {
let mut reg = load_registry(data_dir).await?;
let mut changed = false;
for id in joined {
if !reg.federations.iter().any(|f| f.federation_id == id) {
reg.federations.push(JoinedFederation {
federation_id: id,
name: Some("Archipelago Federation".to_string()),
});
changed = true;
}
}
if changed {
save_registry(data_dir, &reg).await?;
}
return Ok(());
}
// Cold start only: nothing joined yet, so join the default federation once.
let federation_id = match client.join(DEFAULT_FEDERATION_INVITE).await {
Ok(id) => id,
Err(e) => {
@ -95,13 +186,165 @@ pub async fn ensure_default_federation(data_dir: &Path) -> Result<()> {
{
reg.federations.push(JoinedFederation {
federation_id,
name: None,
name: Some("Archipelago Federation".to_string()),
});
save_registry(data_dir, &reg).await?;
}
Ok(())
}
/// Spend `amount_sats` of Fedimint ecash from whichever joined federation can
/// cover it, returning the serialized notes (the X-Payment-Token a buyer hands
/// the seller) and the federation id that minted them. Federations are tried in
/// registry order (default first); only one with sufficient balance is used so
/// the resulting notes redeem cleanly on the other side. Errors clearly when no
/// federation is joined or none has the balance — the caller falls back to (or
/// from) the Cashu path.
pub async fn spend_from_any(data_dir: &Path, amount_sats: u64) -> Result<(String, String)> {
if amount_sats == 0 {
anyhow::bail!("payment amount must be greater than zero");
}
let _ = ensure_default_federation(data_dir).await;
let client = FedimintClient::from_node(data_dir).await?;
// Same union-of-sources approach as reissue_into_any: the persisted registry
// and what fmcd actually reports joined can drift, so consider both.
let mut fed_ids: Vec<String> = Vec::new();
if let Ok(reg) = load_registry(data_dir).await {
for f in reg.federations {
if !fed_ids.contains(&f.federation_id) {
fed_ids.push(f.federation_id);
}
}
}
for id in client.joined_federation_ids().await {
if !fed_ids.contains(&id) {
fed_ids.push(id);
}
}
if fed_ids.is_empty() {
anyhow::bail!("No Fedimint federation joined to spend from");
}
let mut last_err = None;
for fed_id in &fed_ids {
// Skip federations that can't cover the amount so we don't mint a
// partial/failed spend and leave dangling reserved notes.
match client.federation_balance_sats(fed_id).await {
Ok(bal) if bal >= amount_sats => {}
Ok(_) => continue,
Err(e) => {
last_err = Some(e);
continue;
}
}
match client.spend(fed_id, amount_sats).await {
Ok(notes) => {
record_fedimint_tx(
data_dir,
crate::wallet::ecash::TransactionType::Send,
amount_sats,
fed_id,
"Sent Fedimint ecash",
)
.await;
return Ok((notes, fed_id.clone()));
}
Err(e) => last_err = Some(e),
}
}
Err(last_err
.map(|e| anyhow::anyhow!("Fedimint spend failed across all federations: {e}"))
.unwrap_or_else(|| {
anyhow::anyhow!(
"No joined Fedimint federation has {amount_sats} sats available"
)
}))
}
/// Redeem received Fedimint notes into a joined federation. fmcd's reissue is
/// per-federation, but a token only validates against the federation that
/// minted it, so we try each joined federation (default first) and return the
/// first that accepts the notes, along with its id. Errors clearly when the
/// fmcd sidecar isn't installed or no federation is joined — the Cashu path is
/// handled separately by the caller.
pub async fn reissue_into_any(data_dir: &Path, notes: &str) -> Result<(u64, String)> {
// Make sure at least the default federation is tracked before we try.
let _ = ensure_default_federation(data_dir).await;
let client = FedimintClient::from_node(data_dir).await?;
// Build the set of federations to try: the locally-persisted registry PLUS
// every federation the fmcd sidecar actually reports joined. The two can
// drift (a federation joined directly, before tracking, or a registry that
// wasn't written), and a note only validates against the federation that
// minted it — so we must try EVERY connected federation before giving up,
// or a perfectly valid token is wrongly reported as failed.
let mut fed_ids: Vec<String> = Vec::new();
if let Ok(reg) = load_registry(data_dir).await {
for f in reg.federations {
if !fed_ids.contains(&f.federation_id) {
fed_ids.push(f.federation_id);
}
}
}
for id in client.joined_federation_ids().await {
if !fed_ids.contains(&id) {
fed_ids.push(id);
}
}
if fed_ids.is_empty() {
anyhow::bail!("No Fedimint federation joined to redeem these notes into");
}
let mut already_redeemed = false;
let mut last_err = None;
for fed_id in &fed_ids {
match client.reissue(fed_id, notes).await {
Ok(sats) => {
// Record the receive so it appears in unified ecash history.
record_fedimint_tx(
data_dir,
crate::wallet::ecash::TransactionType::Receive,
sats,
fed_id,
"Received Fedimint ecash",
)
.await;
return Ok((sats, fed_id.clone()));
}
Err(e) => {
let msg = e.to_string().to_ascii_lowercase();
// fmcd reports already-claimed notes as "We already reissued
// these notes" (or "already spent"). That means the funds were
// already redeemed INTO this node's wallet — they're safe, just
// not new — so surface that clearly instead of a raw 500.
if msg.contains("already reissued") || msg.contains("already spent") {
already_redeemed = true;
}
last_err = Some(e);
}
}
}
if already_redeemed {
anyhow::bail!(
"This ecash was already redeemed into your wallet — the notes are \
already claimed, so no new balance was added. Check your Fedimint balance."
);
}
Err(last_err
.map(|e| {
anyhow::anyhow!(
"These notes didn't match any of your {} connected Fedimint \
federation(s). You may need to join the federation that issued \
them first. (last error: {e})",
fed_ids.len()
)
})
.unwrap_or_else(|| anyhow::anyhow!("Fedimint reissue failed")))
}
/// HTTP client for a `fedimint-clientd` instance.
pub struct FedimintClient {
base_url: String,
@ -135,14 +378,25 @@ impl FedimintClient {
let password = match std::env::var("FMCD_PASSWORD") {
Ok(p) if !p.is_empty() => p,
_ => {
let secret = data_dir.join("fmcd").join("password");
fs::read_to_string(&secret)
.await
.map(|s| s.trim().to_string())
.context(
"Fedimint client not configured (no FMCD_PASSWORD and no \
fmcd/password secret). Install the Fedimint client app.",
)?
// The shared secret the fmcd container also reads (manifest
// secret_env: fmcd-password, resolved from <data_dir>/secrets).
// Legacy <data_dir>/fmcd/password kept as a fallback.
let shared = data_dir.join("secrets").join(FMCD_PASSWORD_SECRET);
let legacy = data_dir.join("fmcd").join("password");
let mut found = None;
for candidate in [shared, legacy] {
if let Ok(s) = fs::read_to_string(&candidate).await {
let s = s.trim().to_string();
if !s.is_empty() {
found = Some(s);
break;
}
}
}
found.context(
"Fedimint client not configured (no FMCD_PASSWORD and no \
fmcd-password secret). Install the Fedimint client app.",
)?
}
};
Self::new(&base_url, &password)
@ -193,6 +447,20 @@ impl FedimintClient {
self.get("/v2/admin/info").await
}
/// Every federation id the fmcd sidecar currently reports joined, read from
/// `/v2/admin/info` (the authoritative live set — the locally-persisted
/// registry can drift from it). Returns an empty vec on any error so callers
/// can fall back to the registry rather than fail outright.
pub async fn joined_federation_ids(&self) -> Vec<String> {
match self.info().await {
Ok(info) => info
.as_object()
.map(|m| m.keys().cloned().collect())
.unwrap_or_default(),
Err(_) => Vec::new(),
}
}
/// `POST /v2/admin/join` — join a federation by invite code; returns its federationId.
pub async fn join(&self, invite_code: &str) -> Result<String> {
let res = self

View File

@ -8,9 +8,11 @@ pub mod runtime;
pub use bitcoin_simulator::{BitcoinSimulationMode, BitcoinSimulator};
pub use health_monitor::HealthMonitor;
pub use manifest::{
AppInterface, AppManifest, BuildConfig, ContainerConfig, Dependency, DerivedEnv, GeneratedFile,
HealthCheck, HostFacts, ManifestError, ResolvedSource, ResourceLimits, SecretEnv,
SecretsProvider, SecurityPolicy, Volume,
AppInterface, AppManifest, BuildConfig, ContainerConfig, Dependency, DerivedEnv, GeneratedCert,
GeneratedFile, GeneratedSecret, HealthCheck, HookStep, HostCopy, HostFacts, LifecycleHooks,
ManifestError,
ResolvedSource, ResourceLimits, SecretEnv, SecretGenKind, SecretsProvider, SecurityPolicy,
Volume,
};
pub use podman_client::{
image_uses_insecure_registry, ContainerState, ContainerStatus, PodmanClient,

View File

@ -57,10 +57,88 @@ pub struct AppDefinition {
#[serde(default)]
pub interfaces: HashMap<String, AppInterface>,
/// Controlled post-install / pre-start lifecycle hooks. Declarative,
/// allowlisted operations run against the app's OWN container — never the
/// host. See `docs/manifest-hooks-design.md`.
#[serde(default)]
pub hooks: LifecycleHooks,
#[serde(flatten)]
pub extensions: HashMap<String, serde_yaml::Value>,
}
/// Declarative lifecycle hooks for an app. Absent = none (forward-compatible).
#[derive(Debug, Clone, Default, Serialize, Deserialize, PartialEq, Eq)]
pub struct LifecycleHooks {
/// Run once after a successful install, with the container created + running.
#[serde(default)]
pub post_install: Vec<HookStep>,
/// Run before each start (repair/ownership). Reserved; not yet executed.
#[serde(default)]
pub pre_start: Vec<HookStep>,
}
/// A single controlled hook operation. Each list item is a one-key map, e.g.
/// `- exec: [...]` or `- copy_from_host: { src, dest }`.
#[derive(Debug, Clone, Serialize, Deserialize, PartialEq, Eq)]
#[serde(untagged)]
pub enum HookStep {
/// Run a command vector INSIDE the app's container (`podman exec`). Never on
/// the host; inherits the container's (already dropped) capabilities.
Exec { exec: Vec<String> },
/// Copy a file from an allowlisted host root into the container. `src` is
/// relative to the allowlist (data dir / web-ui) — no absolute paths, no `..`.
CopyFromHost {
#[serde(rename = "copy_from_host")]
copy_from_host: HostCopy,
},
}
#[derive(Debug, Clone, Serialize, Deserialize, PartialEq, Eq)]
pub struct HostCopy {
pub src: String,
pub dest: String,
}
impl LifecycleHooks {
fn validate(&self) -> Result<(), ManifestError> {
for step in self.post_install.iter().chain(self.pre_start.iter()) {
step.validate()?;
}
Ok(())
}
}
impl HookStep {
fn validate(&self) -> Result<(), ManifestError> {
match self {
HookStep::Exec { exec } => {
if exec.is_empty() {
return Err(ManifestError::Invalid(
"hooks: exec must be a non-empty command vector".to_string(),
));
}
}
HookStep::CopyFromHost { copy_from_host } => {
let s = &copy_from_host.src;
if s.is_empty() || s.starts_with('/') || s.contains("..") {
return Err(ManifestError::Invalid(format!(
"hooks: copy_from_host.src must be a relative allowlisted path \
(no leading '/', no '..'), got '{s}'"
)));
}
if copy_from_host.dest.is_empty() || !copy_from_host.dest.starts_with('/') {
return Err(ManifestError::Invalid(format!(
"hooks: copy_from_host.dest must be an absolute container path, got '{}'",
copy_from_host.dest
)));
}
}
}
Ok(())
}
}
#[derive(Debug, Clone, Serialize, Deserialize, Default)]
pub struct ContainerConfig {
/// Pull source. Mutually exclusive with `build`. Exactly one of the two must be present.
@ -92,6 +170,17 @@ pub struct ContainerConfig {
#[serde(default)]
pub network: Option<String>,
/// Extra DNS aliases the container answers to on its `network`, in addition
/// to its own container name (which is always added). Mirrors podman
/// `--network-alias`. Used by multi-container stacks whose images reference
/// peers by a short baked-in hostname — e.g. indeedhub's frontend nginx
/// proxies to `api:4000` / `minio:9000` / `relay:8080`, so the api/minio/relay
/// members declare `network_aliases: [api]` / `[minio]` / `[relay]` to keep
/// those short names resolvable on the dedicated `indeedhub-net`. Ignored for
/// slirp4netns/pasta (podman rejects aliases there).
#[serde(default)]
pub network_aliases: Vec<String>,
/// Extra positional arguments appended to the container command
/// after the image. Mirrors `SPEC_CUSTOM_ARGS` in
/// `scripts/container-specs.sh` (bitcoin-knots prune/dbcache flags,
@ -122,6 +211,31 @@ pub struct ContainerConfig {
#[serde(default)]
pub secret_env: Vec<SecretEnv>,
/// Secrets the orchestrator generates on first use when absent, so an app
/// installs from its manifest alone — no host provisioning, no per-app Rust.
/// Materialised before `secret_env` is resolved, written `0600` and owned by
/// the unprivileged (rootless) service user. Idempotent and self-healing: a
/// file that already exists and is readable is left untouched; one that is
/// present-but-unreadable (e.g. wrongly created `root`-owned) is recreated
/// in place via the service-owned secrets dir — no `chown`, no privilege.
///
/// Example: `- { name: fmcd-password, kind: hex16 }`
#[serde(default)]
pub generated_secrets: Vec<GeneratedSecret>,
/// Self-signed TLS certificates the orchestrator materialises before the
/// container is created (so a bind-mounted cert path resolves to a real
/// file, not a stale/missing path). Like `generated_secrets`, this keeps an
/// app data-driven: a service that needs a secure context (e.g. netbird's
/// dashboard — OIDC PKCE / `window.crypto.subtle` only works over HTTPS,
/// issue #15) declares the cert here instead of relying on per-app Rust.
/// Idempotent: an entry whose `crt` and `key` already exist is left
/// untouched. SAN/CN templates are rendered against host facts at apply time.
///
/// Example: `- { crt: /var/lib/archipelago/netbird/tls.crt, key: /var/lib/archipelago/netbird/tls.key }`
#[serde(default)]
pub generated_certs: Vec<GeneratedCert>,
/// Rootless-mapped UID:GID applied to the container's data directory
/// (the `bind`-mounted host path with `target` inside the container's
/// data root) before creation. Mirrors `SPEC_DATA_UID`.
@ -151,6 +265,66 @@ pub struct SecretEnv {
pub secret_file: String,
}
/// How a [`GeneratedSecret`] is produced. Each kind is deterministic in shape
/// (so the orchestrator knows which files to expect) but random in value.
#[derive(Debug, Clone, Copy, Serialize, Deserialize, PartialEq, Eq)]
#[serde(rename_all = "snake_case")]
pub enum SecretGenKind {
/// 16 random bytes, lowercase hex (32 chars). Service passwords/API tokens.
Hex16,
/// 32 random bytes, lowercase hex (64 chars). Longer keys/cookies.
Hex32,
/// 32 random bytes, standard base64 (44 chars incl. padding). For services
/// that require a base64-encoded key rather than hex — e.g. netbird's relay
/// `authSecret` and the SQLite store `encryptionKey`, which base64-decode
/// their configured value (hex would decode to the wrong bytes).
Base64,
/// A random password and its bcrypt hash. `<name>` holds the bcrypt hash
/// (what a server is configured with); the plaintext is stored alongside as
/// `<name>.pw` for any client that must authenticate. `secret_env` injects
/// whichever file it references.
Bcrypt,
}
/// A secret materialised by the orchestrator on demand. See
/// [`ContainerConfig::generated_secrets`]. `name` is a bare filename under the
/// secrets dir — validated (no `/`, no `..`) at [`AppManifest::validate`] time.
#[derive(Debug, Clone, Serialize, Deserialize, PartialEq, Eq)]
pub struct GeneratedSecret {
pub name: String,
pub kind: SecretGenKind,
}
impl GeneratedSecret {
/// Every file this secret materialises, in the order they should be written
/// (primary first). A consumer references one of these via `secret_env`.
pub fn target_files(&self) -> Vec<String> {
match self.kind {
SecretGenKind::Hex16 | SecretGenKind::Hex32 | SecretGenKind::Base64 => {
vec![self.name.clone()]
}
SecretGenKind::Bcrypt => vec![self.name.clone(), format!("{}.pw", self.name)],
}
}
}
/// A self-signed TLS certificate materialised by the orchestrator. See
/// [`ContainerConfig::generated_certs`]. `crt`/`key` are absolute host paths
/// (typically under `/var/lib/archipelago/<app>/`) that the container
/// bind-mounts read-only. `common_name` and `sans` are rendered against host
/// facts (`{{HOST_IP}}`) at apply time; when omitted they default to the
/// node's host IP plus `IP:127.0.0.1,DNS:localhost` so the cert is valid for
/// however the box is reached locally.
#[derive(Debug, Clone, Serialize, Deserialize, PartialEq, Eq)]
pub struct GeneratedCert {
pub crt: String,
pub key: String,
#[serde(default)]
pub common_name: Option<String>,
#[serde(default)]
pub sans: Vec<String>,
}
fn default_pull_policy() -> String {
"if-not-present".to_string()
}
@ -413,6 +587,25 @@ impl AppManifest {
}
}
// network_aliases: each must be a non-empty DNS label (lowercase
// alphanumeric + hyphen, no leading/trailing hyphen) so it renders as a
// valid podman --network-alias / aardvark-dns name.
for (i, alias) in self.app.container.network_aliases.iter().enumerate() {
let ok = !alias.is_empty()
&& alias.len() <= 63
&& alias
.chars()
.all(|c| c.is_ascii_lowercase() || c.is_ascii_digit() || c == '-')
&& !alias.starts_with('-')
&& !alias.ends_with('-');
if !ok {
return Err(ManifestError::Invalid(format!(
"container.network_aliases[{i}] '{alias}' must be a non-empty DNS label \
(lowercase a-z, 0-9, '-'; no leading/trailing '-')"
)));
}
}
// custom_args: no empty strings (would inject literal "" into
// the podman command line and confuse downstream parsing).
for (i, a) in self.app.container.custom_args.iter().enumerate() {
@ -487,6 +680,40 @@ impl AppManifest {
}
}
// generated_secrets: bare-filename names, unique across every file the
// set materialises (so a Bcrypt's `.pw` sibling can't collide with
// another secret). Path-safety mirrors secret_env.
{
let mut names: std::collections::HashSet<String> = std::collections::HashSet::new();
for (i, g) in self.app.container.generated_secrets.iter().enumerate() {
if g.name.is_empty() || g.name.contains('/') || g.name.contains("..") {
return Err(ManifestError::Invalid(format!(
"container.generated_secrets[{}].name must be a bare filename (no '/', no '..'), got '{}'",
i, g.name
)));
}
for f in g.target_files() {
if !names.insert(f.clone()) {
return Err(ManifestError::Invalid(format!(
"container.generated_secrets produces duplicate file '{f}'"
)));
}
}
}
}
// generated_certs: crt/key must be non-empty absolute paths with no
// traversal (they become bind-mount sources, same safety bar as files).
for (i, c) in self.app.container.generated_certs.iter().enumerate() {
for (field, val) in [("crt", &c.crt), ("key", &c.key)] {
if val.is_empty() || !val.starts_with('/') || val.contains("..") {
return Err(ManifestError::Invalid(format!(
"container.generated_certs[{i}].{field} must be an absolute path with no '..', got '{val}'"
)));
}
}
}
// data_uid: if set, must look like "NNNNN:NNNNN".
if let Some(u) = &self.app.container.data_uid {
let parts: Vec<&str> = u.split(':').collect();
@ -587,6 +814,10 @@ impl AppManifest {
}
}
// Lifecycle hooks: declarative, allowlisted (no host exec, no absolute /
// `..` copy sources). See docs/manifest-hooks-design.md.
self.app.hooks.validate()?;
Ok(())
}
}
@ -1002,6 +1233,57 @@ mod tests {
use std::fs;
use std::path::{Path, PathBuf};
#[test]
fn hooks_parse_and_validate() {
let yaml = r#"
app:
id: indeedhub
name: IndeedHub
version: 1.0.0
container:
image: test/indeedhub:1.0.0
hooks:
post_install:
- exec: ["sed", "-i", "/X-Frame-Options/d", "/etc/nginx/conf.d/default.conf"]
- copy_from_host:
src: "web-ui/nostr-provider.js"
dest: "/usr/share/nginx/html/nostr-provider.js"
"#;
let m = AppManifest::parse(yaml).unwrap();
assert_eq!(m.app.hooks.post_install.len(), 2);
match &m.app.hooks.post_install[0] {
HookStep::Exec { exec } => assert_eq!(exec[0], "sed"),
_ => panic!("expected exec step"),
}
match &m.app.hooks.post_install[1] {
HookStep::CopyFromHost { copy_from_host } => {
assert_eq!(copy_from_host.dest, "/usr/share/nginx/html/nostr-provider.js")
}
_ => panic!("expected copy_from_host step"),
}
m.validate().unwrap();
}
#[test]
fn hooks_reject_absolute_or_traversal_copy_src() {
for bad in ["/etc/passwd", "../../etc/shadow", "web-ui/../../etc/x"] {
let yaml = format!(
"app:\n id: a\n name: a\n version: 1.0.0\n container:\n image: x:y\n \
hooks:\n post_install:\n - copy_from_host:\n src: \"{bad}\"\n dest: \"/x\"\n"
);
assert!(
AppManifest::parse(&yaml).is_err(),
"src '{bad}' must be rejected"
);
}
}
#[test]
fn hooks_reject_empty_exec() {
let yaml = "app:\n id: a\n name: a\n version: 1.0.0\n container:\n image: x:y\n hooks:\n post_install:\n - exec: []\n";
assert!(AppManifest::parse(yaml).is_err());
}
#[test]
fn test_manifest_parse() {
let yaml = r#"
@ -1459,6 +1741,7 @@ app:
pull_policy: "if-not-present".to_string(),
build: None,
network: None,
network_aliases: vec![],
custom_args: vec![],
entrypoint: None,
derived_env: vec![
@ -1476,6 +1759,8 @@ app:
},
],
secret_env: vec![],
generated_secrets: vec![],
generated_certs: vec![],
data_uid: None,
};
let facts = HostFacts {
@ -1512,6 +1797,7 @@ app:
pull_policy: "if-not-present".to_string(),
build: None,
network: None,
network_aliases: vec![],
custom_args: vec![],
entrypoint: None,
derived_env: vec![],
@ -1525,6 +1811,8 @@ app:
secret_file: "fedimint-gateway-password".to_string(),
},
],
generated_secrets: vec![],
generated_certs: vec![],
data_uid: None,
};
let p = MapSecretsProvider {
@ -1553,6 +1841,7 @@ app:
pull_policy: "if-not-present".to_string(),
build: None,
network: None,
network_aliases: vec![],
custom_args: vec![],
entrypoint: None,
derived_env: vec![],
@ -1560,6 +1849,8 @@ app:
key: "BITCOIN_RPC_PASS".to_string(),
secret_file: "bitcoin-rpc-password".to_string(),
}],
generated_secrets: vec![],
generated_certs: vec![],
data_uid: None,
};
let p = MapSecretsProvider {

View File

@ -121,10 +121,16 @@ impl PodmanClient {
"cryptpad" => "http://localhost:3003",
"penpot" => "http://localhost:9001",
"immich_server" | "immich" => "http://localhost:2283",
// Gitea publishes SSH (2222) and web (3001). Without a manifest on
// disk, extract_lan_address() returns whichever podman lists first —
// which can be the SSH port, breaking the launch. Pin the web UI.
"gitea" => "http://localhost:3001",
"nginx-proxy-manager" => "http://localhost:8081",
"fedimint-gateway" => "http://localhost:8176",
"endurain" => "http://localhost:8080",
"netbird" => "http://localhost:8087",
// HTTPS: netbird's dashboard needs a secure context for OIDC PKCE
// (window.crypto.subtle), so the proxy serves TLS on 8087 (issue #15).
"netbird" => "https://localhost:8087",
"electrs" | "archy-electrs-ui" => "http://localhost:50002",
_ => return None,
};
@ -275,10 +281,18 @@ impl PodmanClient {
// Build the container spec for the API
let mut port_mappings = Vec::new();
for port in &manifest.app.ports {
// Honour the manifest's protocol (default tcp). netbird's STUN port
// is 3478/udp; forcing tcp here would publish the wrong protocol and
// silently break relay discovery.
let protocol = match port.protocol.to_ascii_lowercase().as_str() {
"udp" => "udp",
"sctp" => "sctp",
_ => "tcp",
};
port_mappings.push(serde_json::json!({
"container_port": port.container,
"host_port": port.host,
"protocol": "tcp",
"protocol": protocol,
}));
}
@ -385,11 +399,21 @@ impl PodmanClient {
},
});
if let Some(network) = custom_network {
// The container always answers to its own name; manifest
// network_aliases add extra short hostnames peers may bake in
// (e.g. indeedhub's api/minio/relay). Dedup so a manifest that
// redundantly lists its own name doesn't double it.
let mut aliases = vec![name.to_string()];
for a in &manifest.app.container.network_aliases {
if !aliases.iter().any(|x| x == a) {
aliases.push(a.clone());
}
}
body.as_object_mut()
.expect("container create body is a JSON object")
.insert(
"networks".to_string(),
serde_json::json!({ network: { "aliases": [name] } }),
serde_json::json!({ network: { "aliases": aliases } }),
);
}
@ -412,11 +436,22 @@ impl PodmanClient {
}
pub async fn stop_container(&self, name: &str) -> Result<()> {
self.stop_container_with_grace(name, 10).await
}
/// Stop via libpod honouring a per-app grace (seconds). The HTTP deadline is
/// kept above the grace so the post-grace SIGKILL lands before we give up —
/// otherwise slow-to-SIGTERM apps (fedimint, bitcoin-core, electrumx…) time
/// out at exactly the grace boundary and the stop is reported as failed.
pub async fn stop_container_with_grace(&self, name: &str, grace_secs: u64) -> Result<()> {
let deadline = std::time::Duration::from_secs(
grace_secs + crate::runtime::STOP_GRACE_DEADLINE_BUFFER_SECS,
);
self.api_request(
"POST",
&format!("libpod/containers/{}/stop?t=10", name),
&format!("libpod/containers/{}/stop?t={}", name, grace_secs),
None,
DEFAULT_TIMEOUT,
deadline,
)
.await
.map(|_| ())

View File

@ -10,6 +10,35 @@ const PODMAN_CLI_DEFAULT_TIMEOUT: Duration = Duration::from_secs(30);
const PODMAN_CLI_IMAGE_CHECK_TIMEOUT: Duration = Duration::from_secs(10);
const PODMAN_CLI_BUILD_TIMEOUT: Duration = Duration::from_secs(900);
/// Default graceful-stop grace (seconds) when a caller doesn't supply a per-app
/// value. Mirrors the historical `podman stop -t 30`.
pub const DEFAULT_STOP_GRACE_SECS: u64 = 30;
/// Headroom added to a stop grace to form the await/HTTP deadline, so podman's
/// post-grace SIGKILL completes before the wrapper times out.
pub const STOP_GRACE_DEADLINE_BUFFER_SECS: u64 = 15;
/// Canonical per-app graceful-stop grace (seconds), keyed by container name.
/// Slow-to-SIGTERM apps need far longer than the 30s default: bitcoin-core
/// flushes its chainstate, lnd closes channels, electrumx finishes indexing,
/// stack DBs checkpoint. Used as the fallback when a manifest doesn't declare
/// `stop_grace_secs`. NOTE: the RPC layer's `stop_timeout_secs` mirrors this
/// (returns the same values as `&str` for legacy `podman stop -t` call sites) —
/// keep the two in sync until that path is retired.
pub fn stop_grace_secs_for(container_name: &str) -> u64 {
let id = container_name
.strip_prefix("archy-")
.unwrap_or(container_name);
match id {
"bitcoin-knots" | "bitcoin-core" | "bitcoin" => 600,
"lnd" => 330,
"electrumx" | "electrs" | "mempool-electrs" => 300,
"btcpay-db" | "mempool-db" | "penpot-postgres" | "immich_postgres" | "nextcloud-db"
| "endurain-db" => 120,
"btcpay-server" | "nbxplorer" | "fedimint" | "fedimint-gateway" => 60,
_ => DEFAULT_STOP_GRACE_SECS,
}
}
#[async_trait]
pub trait ContainerRuntime: Send + Sync {
async fn pull_image(&self, image: &str, signature: Option<&str>) -> Result<()>;
@ -21,6 +50,19 @@ pub trait ContainerRuntime: Send + Sync {
) -> Result<String>;
async fn start_container(&self, name: &str) -> Result<()>;
async fn stop_container(&self, name: &str) -> Result<()>;
/// Stop a container honouring a per-app graceful-shutdown grace (seconds).
///
/// Slow-to-SIGTERM apps (bitcoin-core, lnd, electrumx, fedimint, immich…)
/// need a longer `podman stop -t` than the default 30s, or `podman stop`
/// returns before the container exits and the orchestrator treats the stop
/// as failed (the container keeps running). The wrapping deadline is always
/// kept strictly greater than `grace_secs` so podman's post-grace SIGKILL
/// lands inside the await. The default impl ignores the grace and calls
/// `stop_container` — only the real podman runtime honours it.
async fn stop_container_with_grace(&self, name: &str, grace_secs: u64) -> Result<()> {
let _ = grace_secs;
self.stop_container(name).await
}
async fn remove_container(&self, name: &str) -> Result<()>;
async fn get_container_status(&self, name: &str) -> Result<ContainerStatus>;
async fn get_container_logs(&self, name: &str, lines: u32) -> Result<Vec<String>>;
@ -122,10 +164,23 @@ impl ContainerRuntime for PodmanRuntime {
}
async fn stop_container(&self, name: &str) -> Result<()> {
match self.client.stop_container(name).await {
self.stop_container_with_grace(name, DEFAULT_STOP_GRACE_SECS)
.await
}
async fn stop_container_with_grace(&self, name: &str, grace_secs: u64) -> Result<()> {
match self.client.stop_container_with_grace(name, grace_secs).await {
Ok(()) => Ok(()),
Err(api_err) => {
let output = self.podman_cli(&["stop", "-t", "30", name]).await?;
// CLI fallback. Keep the wrapper deadline strictly above the
// `-t` grace so podman's post-grace SIGKILL completes before the
// await gives up (otherwise a deadline == grace races the kill
// and reports a spurious timeout).
let grace = grace_secs.to_string();
let deadline = Duration::from_secs(grace_secs + STOP_GRACE_DEADLINE_BUFFER_SECS);
let output = self
.podman_cli_timeout(&["stop", "-t", &grace, name], deadline)
.await?;
if output.status.success() {
Ok(())
} else {
@ -841,6 +896,10 @@ impl ContainerRuntime for AutoRuntime {
self.runtime.stop_container(name).await
}
async fn stop_container_with_grace(&self, name: &str, grace_secs: u64) -> Result<()> {
self.runtime.stop_container_with_grace(name, grace_secs).await
}
async fn remove_container(&self, name: &str) -> Result<()> {
self.runtime.remove_container(name).await
}

View File

@ -737,6 +737,15 @@
return response.json();
}
// Snapshot age in ms. Prefer the server-computed age_ms (single clock, no
// skew). Fall back to the old browser-vs-server subtraction only for an
// older backend that doesn't send age_ms. Mixing clocks was why the
// "reconnecting…" banner could stick on nodes whose clock drifted.
function snapshotAgeMs(status) {
if (typeof status.age_ms === 'number') return status.age_ms;
return status.updated_at_ms ? Date.now() - status.updated_at_ms : Number.POSITIVE_INFINITY;
}
function cookieValue(name) {
return document.cookie
.split('; ')
@ -1127,7 +1136,7 @@
const rpcEl = document.getElementById('settingsRpc');
if (rpcEl) {
const port = chain === 'main' ? 8332 : (chain === 'test' ? 18332 : (chain === 'signet' ? 38332 : 18443));
const statusAgeMs = status.updated_at_ms ? Date.now() - status.updated_at_ms : Number.POSITIVE_INFINITY;
const statusAgeMs = snapshotAgeMs(status);
const displayStale = status.stale === true && statusAgeMs > 30000;
rpcEl.textContent = displayStale
? `Reconnecting on port ${port}`
@ -1143,7 +1152,7 @@
const diskSize = formatBytes(blockchainInfo.size_on_disk || 0);
const appearsToBeReindexing = initialBlockDownload && blocks === 0 && headers > 0 && (blockchainInfo.size_on_disk || 0) > 1024 * 1024 * 1024;
const previousBlockCount = lastBlockCount;
const statusAgeMs = status.updated_at_ms ? Date.now() - status.updated_at_ms : Number.POSITIVE_INFINITY;
const statusAgeMs = snapshotAgeMs(status);
const snapshotAdvanced = previousBlockCount > 0 && blocks > previousBlockCount;
const displayStale = status.stale === true && !snapshotAdvanced && statusAgeMs > 30000;

Some files were not shown because too many files have changed in this diff Show More