631 Commits

Author SHA1 Message Date
archipelago
760a32bccf fix(reconcile): keep user-stopped apps stopped (reconciler was resurrecting them)
package.stop a dependency (e.g. electrumx, a mempool dep) and the reconciler
restarts it within ~8s: the reconcile filter's dependency_required override
re-includes a user-stopped app that an active app depends on, and the in-memory
disabled set is wiped on manifest reload — so ensure_running runs, the stopped
app's unreachable ports look like a fault, the host-port repair restarts it, and
package.stop never sticks (gate 'transitions to stopped' times out).

Fix: guard ensure_running_with_mode on the on-disk user_stopped marker (the single
choke point every reconcile flows through) → Left('user-stopped'). Explicit
install/start clear the marker first (added clear_user_stopped to orchestrator
install/start, symmetric with disabled.remove; start/restart RPC already cleared
it) so user actions are unaffected. The container itself already stopped correctly
— this stops the resurrection.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
2026-06-22 09:04:02 -04:00
archipelago
2dad64b2ee fix(stop): honour per-app graceful-stop grace in orchestrator stop path
package.stop left slow-to-SIGTERM apps (fedimint/electrumx/bitcoin/btcpay/immich)
running: the orchestrator path hardcoded podman API ?t=10 / CLI -t 30 and the CLI
wrapper deadline (30s) equalled the -t grace, so the await fired exactly as podman
SIGKILLed -> stop reported failed -> state reverted to running. Reproduced live on
clean .198 (fedimint).

- container/runtime.rs: add ContainerRuntime::stop_container_with_grace (defaulted
  so mock/dev impls are unchanged); PodmanRuntime honours grace for API + CLI with
  deadline = grace + 15s buffer; AutoRuntime delegates. New canonical per-app table
  stop_grace_secs_for() + DEFAULT_STOP_GRACE_SECS / STOP_GRACE_DEADLINE_BUFFER_SECS.
- podman_client.rs: stop_container_with_grace uses ?t=<grace> + longer HTTP deadline.
- prod_orchestrator::stop: resolve grace = manifest stop_grace_secs (north-star) else
  the table; pass to quadlet::stop_service_with_timeout AND stop_container_with_grace.
- quadlet.rs: stop_service_with_timeout so slow apps aren't SIGKILLed at 45s.
- rpc/package/runtime.rs: doc-note its &str stop_timeout_secs mirrors the canonical table.
- tests: resolve_stop_grace_secs (manifest field wins / table fallback / default 30).

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
2026-06-22 06:59:40 -04:00
archipelago
ff78b31212 fix(hooks): run post_install exec in a transient user scope (fixes cgroup denial)
Live on .228 the post_install `exec` steps failed with "crun: write
cgroup.procs: Permission denied / OCI permission denied": a `podman exec`
launched from archipelago.service can't place its child in the container's
cgroup (under the service's own slice). Wrap `exec` in
`systemd-run --user --scope --quiet --collect podman exec …` so it gets its own
delegated cgroup — same trick as `podman_user_scope` for pasta starts.
`copy_from_host` (a host-side `cp`, no in-container process) stays direct.

Without this only copy_from_host worked; indeedhub happened to be unaffected
(its image pre-bakes the nginx config so the exec steps were no-ops), but the
hook capability is only generally useful with exec working. hooks unit tests
pass; live verify on .228 next.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
2026-06-21 17:38:23 -04:00
archipelago
b73084dbb0 refactor(indeedhub): delete orchestrator special-cases; use generic path (#20 phase 3)
The fresh-create path was blocked by hardcoded indeedhub orchestrator logic
that predated and conflicted with the manifest migration:
- ensure_running routed app_id=="indeedhub" → reconcile_indeedhub_stack, which
  REFUSED to create the frontend from its manifest (returned Left("stack-managed")).
- run_pre_start_hooks("indeedhub") → start_indeedhub_backends →
  wait_for_indeedhub_dependencies_ready(120) — a DNS gate with a chicken-and-egg
  bug (required the frontend's own alias present before the frontend could be
  created), which failed install_fresh with "dependencies were not ready within
  120s" and left the frontend down (caught live on .228).

Delete all of it (−382 lines): reconcile_indeedhub_stack, start_indeedhub_backends,
wait_for_indeedhub_dependencies_ready, indeedhub_api_dependency_dns_ready,
indeedhub_required_aliases_present, repair_indeedhub_network_aliases,
indeedhub_alias_present, patch_indeedhub_nostr_provider, and the INDEEDHUB_*
consts. The manifests now carry everything these did: network_aliases (short
hostnames), generated_secrets, dependencies, and the post_install nginx hook. So
"indeedhub" + every member flows through the generic install_fresh/reconcile path
— the frontend fresh-creates normally and runs its hook.

(crash_recovery.rs's frontend-after-deps ordering guard is kept — it's beneficial
startup ordering, not a blocker.) cargo check + release build green.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
2026-06-21 17:11:33 -04:00
archipelago
b1eea8c053 feat(indeedhub): manifest-driven 7-member stack, orchestrator-first (#20 phase 3)
Author the IndeedHub stack as 7 manifests (postgres/redis/minio/relay/api/
ffmpeg + frontend) and route install_indeedhub_stack through the
orchestrator first (immich pattern), falling back to the legacy installer
only when the manifests aren't deployed.

Data-preserving by construction — the manifests reproduce the live install
exactly so an existing node ADOPTS rather than recreates:
- container_name = the live hyphenated names the runtime already references
  (health_monitor tiers/deps, crash_recovery).
- named volumes indeedhub-{postgres,redis,minio,relay}-data (not bind mounts).
- dedicated indeedhub-net + network_aliases [postgres|redis|minio|relay|api]
  so the api/ffmpeg env hostnames and the frontend nginx upstreams resolve
  unchanged.
- generated_secrets (indeedhub-db-password/-minio-password owned by their
  backends, indeedhub-jwt by the api) reuse the live /var/lib/archipelago/
  secrets values (ensure_one no-ops on existing files; postgres pw is fixed
  at PGDATA init). minio user "indeeadmin" + AES_MASTER_SECRET literal kept.

The frontend carries the post_install hook (#20) that replaces the hardcoded
patch_indeedhub_nostr_provider: strip X-Frame-Options, refresh
nostr-provider.js from /opt/archipelago/web-ui, inject the <script> if
absent, reload nginx — defensive/idempotent since indeedhub:1.0.0 already
bakes these. Frontend manifest also corrected off its dead Next.js shape
(health check now nginx :7777, tmpfs /run + /var/cache/nginx).

Builds + unit-tested; live adoption/lifecycle verification on .228 next.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
2026-06-21 15:46:26 -04:00
archipelago
b94b61f640 feat(manifest): network_aliases — extra DNS aliases on a container's network
Add `container.network_aliases: Vec<String>` (serde default, DNS-label
validated) so a stack member can answer to short hostnames its peers bake
in, beyond its own container name. Rendered in both runtime paths:
- podman_client: merged (deduped) into the custom-network aliases array.
- quadlet from_manifest: appended after the container name; emitted only
  for Bridge networks (slirp/pasta reject aliases).

Needed for the indeedhub migration: its frontend nginx proxies to
`api:4000` / `minio:9000` / `relay:8080`, so those members declare
`network_aliases: [api|minio|relay]` to keep the short names resolvable on
the dedicated indeedhub-net (vs. colliding generic aliases on archy-net).

Also fixes 4 pre-existing from_manifest test failures (unrelated to this
change, surfaced now that the quadlet suite runs green): test manifests
used the long-invalid `network_policy: archy-net` (allowlist is
isolated/bridge/host → moved to network_policy: isolated + container.network)
and bind sources outside /var/lib/archipelago.

Tests: container crate 53 pass; archipelago quadlet+alias 47 pass.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
2026-06-21 15:45:11 -04:00
archipelago
955c54b713 feat(hooks): post_install executor + install-path wiring (#20 phase 2)
Add container::hooks::run_post_install — runs an app's declarative
post_install hooks against its own running container:
- Exec  -> podman exec <container> <args…> (60s timeout-bounded)
- CopyFromHost -> resolve src against allowlist roots (<data_dir>/<app>
  and /opt/archipelago), canonicalise + prefix-check (defeats symlink
  escape), then podman cp <abs-src> <container>:<dest>

Best-effort + idempotent: a failed step is warned and skipped, never
fails the install — matching the legacy patch_indeedhub_nostr_provider
behaviour this replaces. Wired into install_fresh after the container is
up, so it runs only on a freshly created container (not plain start), and
re-applies on recreate-after-drift.

5 unit tests on resolve_copy_src (accept in-data-dir, reject absolute /
traversal / missing / symlink-escape). cargo test -p archipelago green.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
2026-06-21 11:45:28 -04:00
archipelago
4c1a4e5976 feat(hooks): manifest lifecycle-hooks schema (#20 phase 1) + fix container test literals
Add controlled post_install/pre_start hook schema to AppDefinition:
LifecycleHooks/HookStep (Exec | CopyFromHost)/HostCopy with allowlist
validation (relative src, no '..', absolute container dest, non-empty
exec). Re-exported from the crate root. Design: docs/manifest-hooks-design.md.

Also add the missing generated_secrets: vec![] field to three
pre-existing ContainerConfig test literals (the field was added to the
struct in 03a4ee1b but the container crate's own tests were never rerun,
so -p archipelago-container failed to compile). cargo test green: 53 pass.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
2026-06-21 11:07:00 -04:00
archipelago
f160e0c404 fix(reboot): enable podman-restart.service at startup (--restart reboot-survival)
Orchestrator-installed backends (immich, btcpay-db, …) run as plain podman
`--restart=unless-stopped` containers until the Phase-3 Quadlet rollout flips
use_quadlet_backends on. Nothing in the codebase enabled the user's
podman-restart.service, so those containers had NO reboot-survival mechanism.
Enable it (idempotent, best-effort) at orchestrator startup so unless-stopped
containers come back after a reboot. Already applied manually on .228 (covers
31 containers incl. immich + btcpay); this codifies it fleet-wide.

The deeper fix (render Quadlet for all orchestrator installs) remains the gated
Phase-3 Quadlet-everywhere rollout.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
2026-06-21 08:23:19 -04:00
archipelago
d5ef45731a fix(immich): restore canonical app_id "immich" (title + icon)
After the manifest migration the launcher installed as "immich-server" (app_id),
which has no catalog entry → showed the raw id and no icon. Rename the server
manifest app_id immich-server→immich so it matches the catalog/curated "immich"
entry (title "Immich", icon immich.png) and is recognised as a known launcher app
(APP_CATEGORY_MAP) → stays in My Apps. immich_stack_app_ids now installs
[immich-postgres, immich-redis, immich]; orchestrator.install bypasses package
routing so there's no recursion with the "immich"→stack-installer mapping.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
2026-06-21 08:07:08 -04:00
archipelago
9e6c5370fc feat(immich): manifest-driven stack via orchestrator — live-migrated on .228
Completes the immich migration off the legacy hardcoded install_immich_stack
(podman run + sudo chown) to the registry-manifest + orchestrator path. Validated
live on .228 (clean single set, healthy v2.7.4, data dir ownership correct).

- install_immich_stack now tries install_stack_via_orchestrator(immich_stack_app_ids)
  first; legacy remains only as the no-manifests fallback.
- immich-{postgres,redis,server} manifests corrected from live findings:
  * named by app_id (dropped container_name override) — using container_name
    spawned DUPLICATE containers (app_id-named install vs name-override reconcile)
    on the same PGDATA, which corrupted a postgres cluster. Server reaches its
    siblings via app_id aliases (DB_HOSTNAME=immich-postgres, REDIS=immich-redis).
  * immich-postgres data_uid 100998:100998 (postgres drops to container 999 →
    host 100998 under rootless; verified the fresh dir is chowned correctly).
  * immich-server version "release"→"2.7.4" (manifest validation requires a digit;
    the bad version made the manifest silently skip → partial orchestrator install
    → legacy fallback → the duplicate corruption above).
- HARDEN install_stack_via_orchestrator: only fall back to the legacy installer
  when NOTHING was installed yet. An "unknown app_id" AFTER a member is up now
  errors instead of double-creating containers on shared data (the corruption
  root cause).
- Strict the all-manifests round-trip test: fail (not skip) on any invalid shipped
  manifest — this gap let the bad immich-server version through.

Known follow-up (pre-existing, platform-wide): orchestrator-installed backends
(immich, btcpay-db) run as podman --restart, not Quadlet, and podman-restart.service
is disabled on .228 → reboot-survival gap independent of this migration.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
2026-06-21 07:08:45 -04:00
archipelago
7bfbe8fe40 feat(registry-manifest): phase 2 — publisher embeds manifests into signed catalog
generate-app-catalog.sh gains opt-in EMBED_MANIFESTS=1: embeds each
apps/<id>/manifest.yml into its catalog entry's `manifest` field (whole document,
top-level app: preserved — exactly what the Rust side deserializes). Default off
so routine catalog regen is unchanged during the migration window; turn on
deliberately, then sign via the existing release-root ceremony. Verified: default
embeds 0; EMBED_MANIFESTS=1 embeds 40 manifests (generated_secrets preserved).

Adds a round-trip guard test: every shipped apps/*/manifest.yml must deserialize
+ validate through catalog_manifest_to_overlay (image apps accepted, build apps
defer to disk) — catches schema drift between disk manifests and the catalog path.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
2026-06-21 05:46:17 -04:00
archipelago
220666d3a9 feat(registry-manifest): phase 1 — orchestrator consumes manifests from signed catalog
Workstream B phase 1 (node-side consume). The signed app-catalog can now carry a
full manifest per entry; the orchestrator overlays it over the disk manifest
(origin-wins) with disk as the migration fallback. Moves apps toward
registry-distributed manifests with no OTA-shipped disk file.

- app_catalog: `manifest: Option<Value>` on AppCatalogEntry (forward-compatible,
  covered by the existing release-root signature over the raw JSON);
  `catalog_manifest_values()` accessor.
- prod_orchestrator: `load_manifests` overlays catalog manifests after the disk
  walk; `catalog_manifest_to_overlay()` returns None (→ disk fallback) on
  unparseable value / app-id mismatch / failed validate() / build source
  (build contexts aren't registry-distributed yet — phase 1 is image-only).
- manifest_dir stays PathBuf (build-only field); image-only apps never read it.
- 6 unit tests; compiles clean. No-op until a catalog embeds a manifest, so
  existing nodes are unaffected.

See docs/registry-manifest-design.md.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
2026-06-21 05:30:38 -04:00
archipelago
03a4ee1b30 feat(container): manifest-declared generated secrets + companion/quadlet hardening
Generated-secrets system: apps declare `generated_secrets` in their manifest
(kinds hex16/hex32/bcrypt); `container::secrets::ensure_generated_secrets`
materialises them 0600/rootless in resolve_dynamic_env — idempotent and
self-healing (recovers wrongly root-owned secrets with no privilege). Replaces
per-app Rust (deletes ensure_fmcd_password). fedimint-clientd/gateway manifests
now declare fmcd-password / fedimint-gateway-hash.

companion.rs: rebuild the auto-built :latest image when its build context changes
(staleness check) so baked-in fixes (e.g. guardian-UI CSS) actually reach nodes.

quadlet.rs: skip PublishPort under Network=host (podman rejects the combo, exit
125) + regression tests.

UI: "Fedimint Guardian" rename, fedimint-clientd/nostr-rs-relay/meshtastic tagged
as Services (headless backends), gateway icon fallback.

Deployed + verified on .228 (generated-secrets fixed fedimint-gateway start;
grafana/strfry orphan crash-loop units removed).

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
2026-06-21 05:11:07 -04:00
archipelago
db7d424bff feat(content): owned-content persistence + Fedimint paid downloads, fmcd caps fix, FIPS warm-path perf
Buyer-side paid downloads now persist: purchases are cached on disk
(content_owned.rs) keyed by (seller onion, content_id), the gallery shows
an "Owned" badge unblurred, and items view/play in-app from the local
cache with no re-payment or reliance on a browser download (which
silently failed on the mobile companion). New RPCs content.owned-list /
content.owned-get. Validated e2e .116<-.198 (paid 100 sats via Fedimint,
166KB jpeg returns, survives restart).

fedimint-clientd manifest: restore the standard container capability set
(CHOWN/DAC_OVERRIDE/FOWNER/SETUID/SETGID) so fmcd's startup chown of an
existing-federation /data succeeds instead of dying EPERM (#7). Confirmed
the orchestrator applies these to the running container.

FIPS perf: tighten the supervisor warm-path keepalive 45s -> 25s so peer
paths stay inside the ~30-60s NAT cold window. Dials now reliably land on
FIPS instead of re-punching and falling back to Tor. Measured to the same
peer: cloud browse 18-22s -> 0.4s; full Fedimint paid download 29s -> 11s
(residual is the seller-side guardian reissue round-trip).

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
2026-06-20 18:58:52 -04:00
archipelago
63b98599e8 Revert "fix(fedimint): run fmcd with seccomp=unconfined so its DHT can start (#7)"
This reverts commit 409543c41e78025354acbdde5ffc6445895d4508.
2026-06-20 14:37:24 -04:00
archipelago
409543c41e fix(fedimint): run fmcd with seccomp=unconfined so its DHT can start (#7)
fmcd crash-looped "Operation not permitted (os error 1)" on .116 (kernel
6.12.74): the default rootless seccomp profile blocks a syscall its Mainline-DHT
/ iroh transport needs, so the REST API never came up (:8178 → HTTP 000) and
federations couldn't be joined. Verified: with seccomp=unconfined fmcd boots and
answers /v2/* (HTTP 401 instead of dead). fmcd works on other nodes, so this is
kernel/seccomp-specific — but the relaxation is safe for an outbound-networking
daemon and harmless where not needed.

- new `security.seccomp_unconfined` manifest flag (SecurityPolicy);
- libpod backend sets `seccomp_profile_path: "unconfined"` (== --security-opt
  seccomp=unconfined); quadlet backend emits `SeccompProfile=unconfined`;
- enabled in apps/fedimint-clientd/manifest.yml.

NOTE: manifests live on-disk at /opt/archipelago/apps/<id>/manifest.yml, so the
node needs the updated manifest deployed + the fmcd container recreated to apply.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
2026-06-20 13:08:13 -04:00
archipelago
12f54e390d feat(wallet): ecash pay confirmation screen + auto-refund on failed sale (#3)
- PeerFiles: new confirmation step after "pay from ecash" — shows the amount and
  which wallet will be spent (Cashu/Fedimint) with balances, lets the user switch
  backends, and a styled Confirm button. The chosen backend is passed to the
  payment so it spends exactly what was confirmed.
- content.download-peer-paid: accept `method` (cashu|fedimint) to honor the
  confirmed choice; log the backend + outcome; backend-specific rejection errors
  ("not in the same Fedimint federation" / "doesn't accept your Cashu mint").
- AUTO-REFUND: a minted token whose sale fails (peer unreachable, rejected, or
  error) is now reclaimed (fedimint reissue / cashu receive) so the buyer no
  longer loses the spent ecash — fixes the stuck-Fedimint-notes report.
- wallet.ecash-balance already reports cashu_sats/fedimint_sats/total_sats which
  the confirm screen uses to pick/show the covering wallet.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
2026-06-20 12:16:02 -04:00
archipelago
a6957a48f7 fix(netbird): wait for OIDC discovery before reporting install done (#10)
Right after install the dashboard SPA opens and, if it loads before NetBird's
embedded OIDC provider is serving, caches a bad auth state — the user appears
logged-in but can't log out until it self-corrects. Container "running" != OIDC
ready, so gate the install's Done phase on the management server's
/oauth2/.well-known/openid-configuration answering (best-effort, 60s cap, never
fails the install since the stack is already up).

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
2026-06-20 08:57:37 -04:00
archipelago
8f06d88fbf feat(wallet): pay for peer files from BOTH Cashu and Fedimint ecash (#3)
Paying for a peer file minted a Cashu-only token, so a node whose ecash balance
lived in Fedimint couldn't pay even with funds. Now both backends are tried:

- payer (content.download-peer-paid): mint a Cashu token first; on failure fall
  back to spending Fedimint notes. Only error if BOTH backends can't cover it.
- seller (verify_and_receive_payment): accept Fedimint notes as well as Cashu —
  anything not starting with "cashu" is redeemed via reissue_into_any.
- new fedimint_client::spend_from_any() — spend from whichever joined federation
  has the balance, returning the notes + federation id (mirrors reissue_into_any).
- wallet.ecash-balance now also reports fedimint_sats + combined total_sats; the
  pay-for-file pre-check uses the combined total so a Fedimint-funded node isn't
  wrongly blocked.

Compiles (cargo check + vue-tsc). Live cross-node federation validation pending
(dual-ecash phase 6) — needs two nodes sharing a federation.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
2026-06-20 08:13:23 -04:00
archipelago
f92e442bfc fix(mesh): collapse cross-transport twin contacts into one conversation (#12)
A node reachable both over LoRa and federation has two MeshPeer rows (radio
twin: low contact_id + firmware key; federation twin: high contact_id +
archipelago key), and messages key by peer_contact_id split across the two ids
— so opening one twin shows an empty thread (the .120->.89 symptom).

- backend: new group_peer_twins() helper groups peers by arch_pubkey_hex (set on
  BOTH twins by bind_federation_twins), keeps the radio id as the mesh-first
  send target, and unions messages across all twin ids. Wired into
  conversations.list / conversations.messages / mesh.contacts-list. +3 unit tests.
- frontend: the live chat list merges client-side (mergedPeers) and matched twins
  by the "Archy-z6Mk..." advert prefix, which the Meshtastic device rename broke
  (radio now advertises the server name). Merge by arch_pubkey_hex instead, which
  the backend reliably sets on both twins. Expose arch_pubkey_hex on MeshPeer.
- fix unrelated stale test: EcashTransaction test missing the new `kind` field.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
2026-06-20 08:01:14 -04:00
archipelago
d00d1b20d7 fix(mesh): rename Meshtastic radio to the node's server name
Meshtastic device rename was a no-op — set_advert_name only updated an
in-memory field and never told the radio, so the device kept its firmware
default ('Meshtastic xxxx') and wasn't findable from external Meshtastic
apps. MeshCore already renamed correctly (CMD_SET_ADVERT_NAME); this brings
Meshtastic to parity.

Send an AdminMessage{set_owner=User{long_name,short_name}} to the locally
connected node (admin packet to our own node_num on the ADMIN_APP port).
Local serial admin needs no session passkey, matching the official client.
long_name = server name (<=39 chars); short_name = first 4 alphanumerics,
upper-cased. Verified on real hardware: .120 -> 'Archy-X250-EXP', .5 ->
'Archy-X250-Beta' (name read back from the radio after reconnect).

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
2026-06-20 06:04:22 -04:00
archipelago
63611a4453 fix(mesh): honour explicit !ai allowlist for unauthenticated stock clients
A stock meshcore client (e.g. a phone) can't sign our typed envelopes, so it is
never 'authenticated' — which meant ticking it as an allowed assistant contact
had no effect and !ai stayed denied. The explicit per-contact allowlist is a
deliberate operator opt-in for a specific key, so match it regardless of
authentication, keyed on the asker's resolved identity (bound archipelago key,
else firmware routing key — how meshcore addresses the contact). The spoofable
federation-trust-list match still requires authentication.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
2026-06-19 16:43:30 -04:00
archipelago
7831e68d13 fix(wallet): redeem across all federations, unified ecash history, fmcd healthcheck
- reissue_into_any now tries the UNION of the local registry AND fmcd's live
  joined set (/v2/admin/info) before failing, so a valid Fedimint token isn't
  wrongly rejected when the registry has drifted. On all-fail it returns a
  friendly message: notes already redeemed into this wallet (funds safe) vs
  didn't match any connected federation.
- Unified transaction history: a local Fedimint tx log (recorded on each
  successful redeem) is merged with the Cashu history in wallet.ecash-history,
  newest-first, each tagged kind=cashu|fedimint. Previously a Fedimint receive
  appeared nowhere.
- fedimint-clientd healthcheck -> type:tcp. It was probing /health, which fmcd
  doesn't serve (only /v2/*), pinning the container in (starting) forever; the
  TCP probe is skipped by the Quadlet renderer (host-side lifecycle verifies),
  so it reports running. Cosmetic for ecash, which worked throughout.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
2026-06-19 16:43:29 -04:00
archipelago
a0b80dd27d fix(mesh): authenticate !ai over LoRa via federation-twin binding + signed Text
A !ai (or any typed message) from a trusted, federated node was denied when
it arrived over the radio. The radio half of a node that is also a federation
peer carried no archipelago identity (identity adverts are no longer broadcast
on the public channel), so the trusted_only gate and signature verification
had no key to check the asker against — and the same node showed up as two
contacts (a radio twin + a federation twin).

- bind_federation_twins(): correlate a radio contact with its federation twin
  by exact, case-insensitive advert_name and copy the federation peer's
  arch_pubkey_hex/did/x25519 onto the radio record. Called from
  upsert_federation_peer and refresh_contacts. Ambiguous names (held by >1
  federation peer) are skipped. This is only a CANDIDATE key — security is
  unchanged: the inbound envelope signature must still verify against it.
- send_message now signs the typed Text envelope (new_signed) so a radio !ai
  authenticates against the bound key. A meshcore node merely named like a
  trusted node cannot forge the signature, so it is still denied.

Receiver-side verification (handle_typed_envelope_direct) and federation-trust
matching (is_sender_allowed) already existed; this supplies the missing key
binding and signature. Also resolves the radio/federation duplicate-contact
display for same-named nodes.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
2026-06-19 13:57:50 -04:00
archipelago
75e470bfa4 fix(mesh): mesh-preferred message routing with FIPS/Tor fallback
Messages to a federated peer that is out of LoRa range (e.g. on another
continent) were dropped into the radio with no fallback, or hung on a dead
FIPS path before reaching Tor — so they never arrived.

- Route a radio contact over the federation transport (FIPS->Tor) when it is
  the same node as a federated peer (known archipelago identity -> onion) AND
  it is not currently reachable over the radio. Reachable radio peers stay on
  the mesh (preferred); oversized/file envelopes still always take federation.
- Resolve the onion via the archipelago identity key (arch_pubkey_hex), not
  the firmware routing key, so a radio contact maps to its nodes.json onion.
- Add .fips_timeout(8s) to the federation message POST so an unreachable FIPS
  overlay fast-fails to Tor (~3-5s) instead of burning the 120s budget.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
2026-06-19 10:09:14 -04:00
archipelago
837cc02812 fix(federation): reliable symmetric auto-federation across LAN/Tor/FIPS
Federated nodes failed to converge to full-mesh across the LAN<->Tailscale
boundary: nodes were invisible to peers, sync 'took ages'/timed out, and
names only updated on a manual sync. Onions were healthy in both directions
(~3-5s); the failures were app-layer.

- B: federation dials fast-fail a dead FIPS path via .fips_timeout(6s) in
  sync_with_peer + notify_join, so the Tor fallback isn't stuck behind the
  full 30s FIPS budget when LAN and remote peers share no FIPS path.
- A: notify_join (peer-joined) now spawns with retries+backoff instead of a
  single awaited best-effort POST, so the join RPC returns instantly (no
  'Request timeout') and the inviter reliably learns the joiner (was
  asymmetric).
- C: new 90s periodic federation auto-sync (none existed) so renamed nodes
  and roster changes propagate without a manual Sync click.
- self-heal: each auto-sync re-asserts membership to any peer that doesn't
  list us back, converging the fleet to full-mesh and healing pre-existing
  asymmetry with no manual re-joins.

Validated live across 7 nodes: a previously fleet-invisible node became
fully meshed automatically (logs: 'auto-sync ... reasserted=1',
'peer-joined ... delivered').

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
2026-06-19 09:52:26 -04:00
archipelago
1bce694ebb feat(ui): mobile mesh tabs, AIUI-style audio player, cloud grid + map fixes
UI (this session):
- Global audio player now scales the whole interface into the space above it
  on desktop (sidebar + main) and docks directly above the tab bar on mobile;
  it stays visible while navigating.
- Mesh mobile redesign: floating Chat / BTC / Dead Man / AI / Map tab strip
  with a single fixed, internally-scrolling pane (page no longer scrolls);
  tabs hide while a conversation is open; floating back button; collapsible
  Device panel (starts collapsed); keyboard-aware conversation sizing via
  VisualViewport so the chat sits just above the keyboard.
- Cloud file grid: uniform 4/3 card heights (folders + images match).
- Swipe left/right switches tabs on the Apps and Web5 screens.
- Map tool fills its pane (no bottom gap); fix skewed Share Location toggle
  on mobile (global min-height rule was deforming the switch).
- Trim redundant helper copy from the mesh AI tab.

Also bundles pre-existing in-progress work that was already in the tree:
mesh listener/session + wallet + container + bitcoin-status backend changes,
docker UI updates, and assorted other UI tweaks.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
2026-06-19 09:52:26 -04:00
archipelago
c4855526fe feat(wallet): wire fmcd as core app + dual-ecash receive
Fedimint never appeared in Wallet > Settings > Fedimint because the
fmcd (fedimint-clientd) sidecar was never installed: ensure_default_
federation() needs the fmcd password to reach the daemon, found none,
and silently no-oped, leaving the registry empty.

- prod_orchestrator: add fedimint-clientd to the baseline auto-install
  set so it self-heals onto every node and auto-joins the default
  federation; generate the fmcd-password secret before secret_env
  resolves.
- fedimint_client: ensure_fmcd_password (random hex, 0600) shared with
  the container's secret_env; from_node reads the same secret (legacy
  fmcd/password kept as fallback); reissue_into_any redeems received
  notes into the first joined federation that accepts them.
- wallet.ecash-receive: dual-token — cashu* tokens redeem at the mint,
  anything else is reissued via fmcd; returns the kind + federation_id.
- UI: receive box advertises "Cashu or Fedimint" and reports which kind.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
2026-06-19 09:52:26 -04:00
archipelago
298595069d fix(mesh): native Meshtastic unicast DMs + driver-level E2E status
Meshtastic DMs were falling back to a channel broadcast, so every node
on the LoRa channel saw a "direct" message. Send a directed MeshPacket
(to = node num, decoded from the synthetic pubkey's node-id bytes)
instead — the Meshtastic analog of the meshcore CMD_SEND_TXT_MSG fix.
DMs now reach only the recipient; firmware auto-PKC-encrypts them
end-to-end once NodeInfo keys are exchanged.

Capture E2E status at the driver level (no shared-type/UI change):
- learn each peer's real Curve25519 key from User.public_key (field 8)
  and inbound MeshPacket.public_key (16), kept in a side-map separate
  from the synthetic routing key so unicast routing is untouched
- detect inbound MeshPacket.pki_encrypted (17) to tell a true E2E DM
  from a channel-PSK fallback
- peer_is_pkc_capable() seam for a future mesh-tab E2E badge

Hot-swap preserved: no dispatched MeshRadioDevice signature or the
shared ParsedContact changed, so meshcore and meshtastic stay
interchangeable behind the listener.

Adds tests/multinode/meshtastic.sh, a two/three-radio on-air parity
harness (detect, discover, DM round-trip, DM privacy, channel
broadcast, typed envelope, reachability).

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
2026-06-19 09:52:26 -04:00
archipelago
790da4bd0f fix(wallet): Minibits default Cashu mint, resilient peer-file invoices, named default federation
- Cashu default mint was the local Fedimint guardian (:8175), wrongly surfacing
  a Fedimint URL in the Cashu mints list. Default is now Minibits
  (https://mint.minibits.cash/Bitcoin) — Cashu and Fedimint are distinct
  protocols (Fedimint lives under its own tab).
- Peer-file (buy) invoice creation: retry the LND REST call (3× / 400ms) so a
  transient LND-REST blip (swap pressure / just-restarted / TLS race) no longer
  hard-fails as an opaque 503, and surface the real error chain ({:#}) in the
  response + logs instead of a generic "Failed to create invoice".
- Autojoined default federation now shows a friendly name ("Archipelago
  Federation") in the Fedimint tab instead of a bare federation id.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
2026-06-18 09:23:56 -04:00
archipelago
cc2e055e09 fix(bitcoin,ui): RAM-aware dbcache to stop swap-thrash 502s + snappier status + icon placeholder
Sizes bitcoind -dbcache to host RAM (~1/16, floor 300MB, cap 4096) instead of a
fixed 2048/4096. A multi-GB UTXO cache on an 8GB node running the full app stack
pushed memory past physical RAM and triggered system-wide swap thrash: the disk
saturated, bitcoind could not answer its own RPC, and the dashboard backend's
sqlite reads stalled — surfacing as fleet-wide /rpc/v1 502s and a blank Bitcoin
UI. Applied in scripts/container-specs.sh (reconciler path) and the config.rs
bitcoin-core path.

Bitcoin status cache now polls every 5s (was 10/15) with an 8s timeout (was 20s)
and fetches the four RPCs concurrently, so the cached snapshot tracks bitcoind's
responsive windows during IBD and the UI stops dwelling on "reconnecting...".

Unifies the divergent discover AppGrid/FeaturedApps image-error handlers onto the
canonical placeholder fallback so missing app icons render the placeholder.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
2026-06-18 09:14:47 -04:00
archipelago
f0fdc23cc9 feat(mesh): native-unicast DMs, contact import/remove, reachability, contact search
- DMs now use native meshcore unicast (CMD_SEND_TXT_MSG) instead of @DM2 channel
  broadcasts: private (E2E-encrypted to the recipient pubkey by firmware), off the
  public channel, and decodable by stock clients. Plain text (split, not MC-chunked)
  to non-archipelago contacts; typed envelopes to archy peers.
- !ai replies now DM the asker privately (RadioDm) instead of broadcasting on ch0.
- Auto contact-import: a heard advert (PUSH_CONTACT_ADVERT/0x80, 32-byte pubkey) is
  added via CMD_ADD_UPDATE_CONTACT (0x09) so contacts appear without a flood advert.
- clear-all now DELETES firmware contacts via CMD_REMOVE_CONTACT (0x0F) instead of
  blocklisting; blocking filter removed entirely. Wiped contacts return when reachable.
- Contact reachability: MeshPeer carries last_advert + reachable (path-based); UI shows
  a reachability dot.
- Peers list: contact search box (filter by name/DID/npub/pubkey) with a clear button.
- send_message routes stock contacts as plain native text (fixes garbled envelopes).

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
2026-06-18 08:08:52 -04:00
archipelago
3a21243be7 fix(mesh,ui,fedimint): mesh-AI chat trigger + transport-aware reply, stop ARCHY:2 public-channel spam, AI allowlist + model dropdown, Fedimint client manifest, settings reorder, chat scroll
- mesh: stop broadcasting ARCHY:2 identity on the public channel (startup + every advert tick); receive path still parses inbound. No more public-channel spam.
- mesh assistant: trigger on !ai/!ask typed in 1:1 chat (was only the dead AssistQuery path + bare channel text); route the reply transport-aware via MeshService::send_message (Tor for federation peers, LoRa for radio) through a new AssistChatReply event consumed at the server layer — fixes replies never reaching federation askers.
- mesh assistant: per-contact !ai allowlist (allowed_contacts) bypassing trusted_only; config + RPC + is_sender_allowed.
- fedimint-clientd manifest: network_policy open -> bridge (invalid value made the loader skip the whole manifest, so fmcd never ran and federations never joined/listed).
- ui: AI panel — Claude model dropdown (Haiku/Sonnet/Opus presets) + allowlist contact picker.
- ui: Settings — App Updates + App Registry moved under Account.
- ui: mesh chat — overscroll-behavior: contain so chat scroll no longer bleeds to the contacts panel.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
2026-06-18 03:33:37 -04:00
archipelago
2a017623e9 chore: release v1.7.99-alpha 2026-06-18 01:00:24 -04:00
archipelago
83bb589ea6 style: cargo fmt for v1.7.99-alpha release gate
Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
2026-06-17 19:50:46 -04:00
archipelago
5b2a11b8c7 Merge meshroller-50: mesh-AI assistant (#50) into release train 2026-06-17 19:22:11 -04:00
archipelago
bd567cd165 feat(wallet,content,seed): Fedimint dual-ecash, paid content streaming, seed ceremony
- Fedimint ecash alongside Cashu: fedimint-clientd (fmcd) HTTP bridge,
  fedimint_client, fedimint RPC, wallet wiring
- Paid peer content: content invoices + streaming content server + content RPCs
- Seed-phrase ceremony/reveal RPCs and CLI ceremony tool
- LND wallet, mesh status/messaging, app-stack (netbird HTTPS), and
  decoupled-update wiring; Fedimint Client core app in catalog

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
2026-06-17 19:21:07 -04:00
archipelago
7a76d32e4b feat(mesh): mesh-AI assistant scheduler + config panel (#50)
Adds the assistant scheduler, MeshAssistantPanel UI, and the remaining
config-RPC / live-toggle / Ollama-detect wiring on top of Phase 1.x.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
2026-06-17 19:19:32 -04:00
archipelago
0947ecee11 feat(mesh): assistant config RPCs + live toggle + Ollama detect (#50)
Phase 2 backend. AssistantConfig is now live-updatable (RwLock) so the UI
toggle applies without a listener restart. New RPCs:
- mesh.assistant-status  -> {enabled, model, trusted_only, default_model,
  ollama_detected, models[]} (probes local Ollama :11434/api/tags)
- mesh.assistant-configure -> set enabled/model/trusted_only live + persist

MeshService::assistant_config / configure_assistant. Compiles clean.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
2026-06-17 18:29:36 -04:00
archipelago
ef601c6d26 feat(mesh): wire ARCHY identity broadcast for trust over both radios (#50)
The ARCHY:2 identity broadcast (DID + ed25519 + x25519) was unwired dead
code on both send and receive. Wiring it lets a radio peer prove its
archipelago identity, so the assistant's trusted-only gate (and encrypted
DMs) work over meshcore AND Meshtastic — the latter otherwise only exposes
synthetic node keys.

- session.rs: broadcast ARCHY:2 as channel text at startup + each advert tick
- frames.rs: parse inbound ARCHY:2 on the channel path, dedupe-keyed by
  archipelago pubkey (federation_peer_contact_id) so it MERGES with the
  federation-seeded peer instead of duplicating; self-echo guarded
- threads our_x25519_secret into handle_channel_payload (was reserved)

Reuses the existing handle_identity_received verifier (ed/x25519 consistency
check + shared-secret derivation). Compiles clean. Needs a live 2-radio test
before trusting trusted-only over radio.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
2026-06-17 18:20:12 -04:00
archipelago
87d0d53205 feat(mesh): assistant Phase 1.5 — !ai channel trigger (issue #50)
A plain '!ai <q>' / '!ask <q>' on the channel is now answered by the node's
local model and broadcast back as plain text, so ANY client (bare meshcore
or Meshtastic) can ask. Generalised run_assist with an AssistReply target:
Typed chunks to a peer (archipelago UI path) vs plain channel-text (bare
clients). Trust/rate gate unchanged; asker identity is separate from reply
mode. Works over both radios.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
2026-06-17 17:59:03 -04:00
archipelago
d8d014bfd9 feat(mesh): mesh-AI assistant — Phase 1.1-1.4 (issue #50)
Rust-native lift of Meshroller's LLM bridge. Adds typed AssistQuery/
AssistResponse mesh messages, a trust-gated inbound handler that answers
with the node's local Ollama model, and airtime discipline (reply cap,
chunking, one in-flight query per asker). Works over both meshcore and
Meshtastic radios via the existing MeshRadioDevice abstraction.

- message_types: AssistQuery=24 / AssistResponse=25 + payloads
- listener/assist.rs: run_assist (gate -> Ollama -> chunked reply)
- listener/dispatch.rs: AssistQuery/AssistResponse arms
- MeshConfig: assistant_enabled / assistant_model / assistant_trusted_only
- MeshState: AssistantConfig + data_dir + in-flight guard

Compiles clean (cargo check). Off by default.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
2026-06-17 17:41:15 -04:00
archipelago
3ca1fadfea chore: reconcile Cargo.lock after DHT merge
Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
2026-06-17 07:50:25 -04:00
archipelago
7c458ede8e Merge agent-trust-wip (DHT Phases 0–4) into main
Integrates the DHT/peer-distribution line with the v1.7.98-alpha release
fixes:
- Phase 0 signed-catalog trust + release-root key (KAT-pinned)
- Phase 1 BLAKE3 content addressing alongside SHA-256
- Phase 2 swarm-assist fetch seam (origin always wins) + iroh-blobs
  provider — heavy iroh deps stay behind the off-by-default `iroh-swarm`
  feature, so the default build/deploy is unaffected
- Phase 3 signed Nostr seed-advertisement + discovery glue + paid swarm
  serving + "Networking Profits" Settings page
- Phase 4 paid swarm streaming (cross-mint ecash, Shape-A paid ALPN,
  streaming.prepare-payment), also iroh-swarm-gated

Conflicts resolved: seed.rs (kept release-root KAT tests), update.rs
(comment-only, OTA logic identical), Cargo.lock (regenerated against the
merged Cargo.toml). Default-feature build is clean.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
2026-06-17 07:50:06 -04:00
archipelago
27a6199939 feat(dht): Phase 4 — paid swarm streaming (cross-mint ecash + Shape-A ALPN)
Fetch-side auto-pay decision layer (payment.rs), Shape-A paid-blobs
negotiation ALPN (paid_alpn.rs), cross-mint ecash swap + payer auto-swap
builder + idempotent resume/liquidity cache (ecash.rs), and the
streaming.prepare-payment RPC. All gated behind the iroh-swarm feature
(off by default). 91/91 tests pass, both build configs clean.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
2026-06-17 07:36:31 -04:00
archipelago
d4c0587df0 fix(health): IndeeHub API waits for MinIO before restart (#41)
The IndeeHub API needs MinIO (object storage) up to serve, but the
health monitor's dependency map listed only postgres + redis, so it
would restart the API while MinIO was still starting — the "recovers
only after 1-2 container restarts" symptom. Add indeedhub-minio to the
API's deps; MinIO has no deps of its own so the monitor restarts it
first, no deadlock. (First-start ordering in the stack definition is a
deeper, separate follow-up.)

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
2026-06-17 06:33:04 -04:00
archipelago
ab56054aeb fix(federation): remove-node also purges the mesh contact/thread (#2)
federation.remove-node only edited nodes.json, so a removed/renamed node
(e.g. a stale "Arch HP") lingered in the mesh chat list with its old
thread. Capture the node's pubkey before removal, then purge its
synthetic mesh peer, shared secret, messages, presence, and persisted
contact entry via the new mesh::purge_federation_peer. Combined with the
#42 name refresh, stale federation contacts can now be fully cleaned from
a node.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
2026-06-17 06:12:56 -04:00
archipelago
56752ebfc0 fix(identity): Node npub in Web5 Identities matches Settings (#49)
Settings shows the node-level Nostr key (HKDF derive_node_nostr_key,
read via node.nostr-pubkey) while Web5 > Identities showed the identity
record's own key — the mirrored "Node" identity stores nostr=None and
seed identities use a different BIP-32 NIP-06 key, so the two surfaces
disagreed.

Resolve the node-level Nostr key once in identity.list and override it
onto whichever identity record is the node's own (ed25519 == server_info
.pubkey). Display-only — no stored key is rewritten, so it self-applies
to existing nodes with no migration and the discovery identity is
unchanged.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
2026-06-17 06:03:25 -04:00
archipelago
6de8173d18 fix(mesh): refresh federation chat names + roster after sync without restart (#42)
A peer accepted via invite is seeded into the mesh peer table with
name=None, so it shows as "Archipelago <pubkey8>" in chat. Federation
sync later learns the real name (update_node_state writes it to
nodes.json) and discovers transitive peers (merge_transitive_peers),
but nothing pushed those into the live mesh peer table — the chat list
stayed stale until the next mesh restart, and transitive peers never
appeared as contacts at all.

Add RpcHandler::refresh_federation_mesh_peers() (re-runs the idempotent,
onion-deduped seed_federation_peers_into_mesh) and call it after every
periodic sync cycle (server.rs) and after the manual federation.sync-all
RPC. Names now correct themselves and the full roster meshes within a
sync cycle, no restart needed.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
2026-06-17 05:52:41 -04:00