Destructure the first 4 pubkey bytes into typed locals so vue-tsc's
noUncheckedIndexedAccess doesn't fail the build (the bytes.length<4 guard
doesn't narrow per-element access). No behaviour change.
Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
The IndeeHub API needs MinIO (object storage) up to serve, but the
health monitor's dependency map listed only postgres + redis, so it
would restart the API while MinIO was still starting — the "recovers
only after 1-2 container restarts" symptom. Add indeedhub-minio to the
API's deps; MinIO has no deps of its own so the monitor restarts it
first, no deadlock. (First-start ordering in the stack definition is a
deeper, separate follow-up.)
Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
federation.remove-node only edited nodes.json, so a removed/renamed node
(e.g. a stale "Arch HP") lingered in the mesh chat list with its old
thread. Capture the node's pubkey before removal, then purge its
synthetic mesh peer, shared secret, messages, presence, and persisted
contact entry via the new mesh::purge_federation_peer. Combined with the
#42 name refresh, stale federation contacts can now be fully cleaned from
a node.
Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
Per the rule that only front-end apps with a UI belong in "My Apps"
(databases/backends/headless go to Websites), make the manifest's
interfaces.main.ui the deciding signal. isWebsitePackage now treats any
package that declares a UI as an app even when it isn't in the curated
APP_CATEGORY_MAP, and falls through headless LAN-reachable packages to
Websites. Additive — service-by-name infra and curated known apps are
unchanged, so no currently-correct app moves.
Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
Settings shows the node-level Nostr key (HKDF derive_node_nostr_key,
read via node.nostr-pubkey) while Web5 > Identities showed the identity
record's own key — the mirrored "Node" identity stores nostr=None and
seed identities use a different BIP-32 NIP-06 key, so the two surfaces
disagreed.
Resolve the node-level Nostr key once in identity.list and override it
onto whichever identity record is the node's own (ed25519 == server_info
.pubkey). Display-only — no stored key is rewritten, so it self-applies
to existing nodes with no migration and the discovery identity is
unchanged.
Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
A peer accepted via invite is seeded into the mesh peer table with
name=None, so it shows as "Archipelago <pubkey8>" in chat. Federation
sync later learns the real name (update_node_state writes it to
nodes.json) and discovers transitive peers (merge_transitive_peers),
but nothing pushed those into the live mesh peer table — the chat list
stayed stale until the next mesh restart, and transitive peers never
appeared as contacts at all.
Add RpcHandler::refresh_federation_mesh_peers() (re-runs the idempotent,
onion-deduped seed_federation_peers_into_mesh) and call it after every
periodic sync cycle (server.rs) and after the manual federation.sync-all
RPC. Names now correct themselves and the full roster meshes within a
sync cycle, no restart needed.
Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
bitcoin-core was missing from APP_CATEGORY_MAP, so isKnownApp() was false and
isWebsitePackage() fell through to 'has a runtime LAN address'. Once the running
container's LAN address (the bitcoind RPC port :8332) showed up ~a minute after
launch, Bitcoin Core was reclassified as a website: it dropped out of the Apps
tab and search, moved under Websites, and launching it opened :8332 (raw RPC)
instead of the :8334 custom UI that Knots opens.
Add 'bitcoin-core': 'money' alongside bitcoin-knots/bitcoin-ui so isKnownApp is
true, isWebsitePackage is false, and launchAppNow routes through openSession ->
resolveAppUrl (:8334 custom UI). Fixes search, category, and the launch URL.
Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
Messaging a federation-only peer (e.g. 'Arch Dev') failed with 'Missing
contact_id'. The UI gave federation-only rows a *negative* placeholder
contact_id derived from a DID hash, but the backend parses contact_id as u64,
so a negative value deserialized to None. The negative id also never matched
the positive federation-synthetic id that federation-routed messages are stored
under, so those threads looked empty.
- Frontend: derive the SAME positive federation-synthetic id the backend uses
(federationContactId mirrors federation_peer_contact_id) so mesh.send accepts
it and messages thread correctly.
- Backend: send_typed_wire now resolves a federation-synthetic contact_id from
nodes.json when it isn't in the live mesh peer table (radio-less node),
instead of bailing 'Unknown federation peer'.
Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
Large peer downloads (~178MB) failed with a generic 'Operation failed', and
the download path had three stacked problems:
- The FIPS reqwest client used a hard-coded 20s total timeout regardless of the
caller's .timeout(), so a big transfer over the mesh aborted at 20s before
the Tor fallback could help. Honor the per-request timeout (client_with_timeout).
- The peer-content proxy buffered the whole file into node memory via
resp.bytes() before sending a byte, and capped the transfer at 60s. Stream
the body through with hyper::Body::wrap_stream (constant memory) and raise the
timeout to 900s; bump the nginx peer-content read timeout to match.
- Free downloads pulled the file as base64 over RPC, doubling it in node memory
and the browser — fatal for large files. Download free files by streaming
from /api/peer-content straight to disk, after a 1-byte Range probe that
surfaces the real reason (peer offline on mesh and Tor) instead of a generic
failure. Paid downloads now return the real error through the {error} channel
the UI already displays.
Adds the reqwest 'stream' feature for bytes_stream().
Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
The prod orchestrator only checked whether a build-image tag was *present*
before deciding to skip the build. The local UI images (bitcoin-ui, lnd-ui,
electrs-ui) COPY a built neode-ui dist, so a UI update changed the source but
left the old tag in place and the new UI never shipped.
Gate the build on a content fingerprint of the build context (sorted relative
path + length + mtime, SHA-256) recorded in a per-tag stamp under data_dir.
Rebuild whenever the fingerprint differs from the one that produced the
existing image; podman's own COPY-layer cache keeps a no-op rebuild cheap.
Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
Apps could fail install when a stack member exited on its first start
because a dependency (db/redis/the bitcoin node) was not ready yet — a
transient crash, not a broken install. wait_for_stack_containers now
restarts each exited/dead container up to 3 times before declaring the
install failed; the runtime supervisor keeps it alive afterwards.
Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
The backend already sends did in federation peer lists, but the Peer
type omitted it and federationNodeToPeer() dropped it when mapping. Add
did?: string to Peer and pass node.did through, so trusted/observer
node rows route to Federation/Mesh by their real DID (falling back to
pubkey/onion) instead of failing the build on a missing property.
Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
The #26 fix makes has_staged_update require the .download-complete
marker, so the state self-heal treats a marker-less staging dir as a
partial download and clears update_in_progress. The roundtrip test
staged a binary file but not the marker, so it began failing. Write
the marker to simulate a *complete* staged update.
Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
The kiosk chromium pinned ~92% of a core (software-compositing spin from
--enable-gpu-rasterization on a GPU-less/headless node), saturating the machine
and starving the backend + container builds — it caused the .198 receive timeout
and the deploy storms.
- archipelago-kiosk.service: CPUQuota=75% + MemoryMax/High + Delegate, so a
runaway kiosk can never take the whole node down.
- archipelago-kiosk-launcher.sh: detect /dev/dri — use GPU rasterization only
when a GPU exists, else --disable-gpu (avoids the headless spin).
- bootstrap::ensure_kiosk_hardened: OTA self-heal that installs the updated
unit+launcher on already-deployed nodes, daemon-reloads, and only try-restarts
a *running* kiosk (never re-enables an operator-disabled one).
cargo check clean; launcher bash -n clean; unit syntax valid.
Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
immich_server/redis/postgres + indeedhub-* are multi-container stack members
whose sub-container app_ids are NOT in package_data, so the health monitor skips
them as "orphans" and never restarts them when they exit — Immich/IndeedHub stay
down until the next reboot (the boot-only start_stopped_stack_containers was the
only recovery). Spawn a 120s supervisor that reuses that same recovery at
runtime. It cheaply skips already-running containers and honours the user-stopped
list (set on every container by package.stop), so it only revives genuinely
crashed members and never fights a user stop.
cargo check clean.
Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
- All four tabs (trusted/observers/messages/requests) capped at max-h-72 with
internal scroll, so the screen stays short instead of growing very long.
- Clicking a node row navigates to that node in the Federation screen
(?node=did); the Message button (stop-propagation) deep-links to that peer\047s
mesh chat (?peer=), using the Mesh.vue ?peer handler.
type-check clean.
Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
A resumable-but-failed download leaves partial component files in update-staging.
has_staged_update() treated ANY staged file as "install-ready", so the state
self-heal kept update_in_progress=true and the UI showed Install instead of
Download (no clean retry).
- update.rs: write a .download-complete marker only after EVERY component
downloads+verifies; has_staged_update() now checks that marker. Partial/failed
downloads (no marker) correctly read as not-staged → self-heal clears
update_in_progress → UI shows Download. Resume still works (partial files kept).
- SystemUpdate.vue: on a genuine download failure, reset downloaded/in_progress
and re-sync, so the user lands back on Download immediately.
cargo check + vue-tsc clean.
Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
sendArchMessage looped over every federation node sequentially (await
sendMessageToPeer per node), so the spinner stayed up until the slowest/offline
node's Tor request finished — long after online peers had received the message.
Send to all peers concurrently (Promise.allSettled); the spinner now clears
after the slowest single delivery, not the sum.
Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
- DID: the Identity card read the DID only from localStorage('neode_did'), so
nodes/browsers that never cached it (e.g. .116/.228) showed no DID. Fall back
to the node.did RPC and cache it — the DID now shows everywhere.
- npub: add the node's seed-derived Nostr public key (npub) to the Identity card
next to the DID + onion, fetched from node.nostr-pubkey, with a copy button.
Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
Make each peer file card a flex column filling its grid cell (flex flex-col
h-full) and pin the body row (filename + Play/Download) with mt-auto, so cards
with a media preview and cards without line their footers up across the row.
Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
User: chat history (messages + mesh/Tor contacts) must persist and be
secure/encrypted per best practice. Root cause of the .198 loss was the B17
mount race writing empty stores over real data (B17 already fixes the trigger);
this hardens storage so it can never silently lose or expose data:
- storage_crypto: shared at-rest envelope mirroring credentials::store — key =
SHA-256(domain ‖ node identity key) (seed-derived, per-store domain
separation), ChaCha20-Poly1305 AEAD with a random 96-bit nonce, tamper-evident.
Transparent migration of legacy plaintext files. Unit-tested (round-trip,
wrong-key/tamper rejection, plaintext detection).
- messages.json: encrypted at rest + ATOMIC write (temp+rename) so a crash/
reboot mid-write cannot corrupt history; decrypt-with-migration on load; a
failed decrypt never overwrites the on-disk data.
- mesh contacts (alias/notes/pinned/blocked): were ONLY in memory and lost on
every restart — now persisted to mesh-contacts.json (encrypted, atomic),
loaded on MeshState startup, saved after contacts-save/contacts-block.
Explicit clear (mesh.clear-all) still wipes everything, as intended.
Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
User priority: FIPS is the main transport but it was unreliable and needed a
manual "Activate" button. Improvements (all in the FIPS dial/supervisor):
- Auto-activate: ensure_activated() installs the daemon config + starts the
service on its own once seed onboarding has materialised the key — no Activate
button needed. Idempotent; runs from the supervisor every 45s so a node that
onboards after boot still comes up automatically.
- Dial retry: try_fips_get/post now retry ONCE on a connect/timeout error. The
first dial to a peer triggers NAT hole-punching and often times out before the
path is up; the retry lands on the now-warm path — the main reason calls were
dropping to Tor despite the peer being FIPS-reachable.
- More patient connect_timeout (5s→8s) so a reachable-but-cold peer isn't
abandoned to Tor while hole-punching completes.
- Path warmer: spawn_fips_supervisor() keeps hole-punched paths to known
federation peers warm (every 45s, concurrent), so on-demand dials are fast and
land on FIPS.
- Confirmed the daemon config already enables BOTH udp + tcp transports
(render_config_yaml), so FIPS already uses TCP where UDP is blocked; the Tor
fallback was path-establishment, addressed above.
cargo check + fmt clean. Backend — needs a binary rebuild+deploy to validate on
.116/.198 (watch last_transport flip fips, and FIPS coming up with no button).
Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
- Add a close (X) button to the message toast (closeToast, @click.stop) like the
system notifications.
- Carry the sender pubkey on the toast; clicking now deep-links to that
conversation (/dashboard/mesh?peer=<pubkey>) instead of the generic mesh page.
- Mesh.vue reads ?peer= on mount and opens the matching peer (by pubkey_hex/did),
gracefully falling back to the mesh list when no match (B1/B2 identity).
type-check clean; useMessageToast tests 11/11.
Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
Streaming a peer file connects over mesh/Tor before the first frame, so the
player sat blank. Add a loading state:
- PeerFiles video modal: spinner overlay ("Connecting to peer…") until the
<video> fires playing/canplay; an error overlay on failure instead of a
silent black box.
- useAudioPlayer: loading flag driven by loadstart/waiting vs canplay/playing;
GlobalAudioPlayer shows a spinner in the transport button while connecting.
- Fix the misleading audio error "Could not play audio. File Browser may not be
running." (wrong for peer content) → "Could not play this audio file. The peer
may be offline…" (B22).
type-check clean; useAudioPlayer tests 10/10.
Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
- docker/fedimint-ui/nginx.conf: the local /assets/ handler 404'd the real
fedimint guardian UI's own bundled CSS (bootstrap.min.css, style.css) →
unstyled app. B13 fixed our local icon; this adds a @guardian_assets proxy
fallback to :8177 so the guardian's own /assets/* resolve. Verified live on
.116: /app/fedimint/assets/bootstrap.min.css 404→200 text/css. (needs
archy-fedimint-ui image rebuild to persist on nodes.)
- Home.vue: Quick Start Goals card regained lg:col-span-2 so it fills its row
on desktop instead of sitting at half width.
Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
The B4 fix made listDirectory require a JSON content-type (to detect the
SPA-fallback HTML / 502 cases) and changed the non-OK error string, but its
tests still mocked headerless responses + the old message, so they failed —
which also polluted the run and tripped AppIconGrid's teardown. Give the JSON
mock a content-type, update the non-OK expectation, and add a test for the
guard's friendly-error path. Full suite now 667/667 green.
Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
On production nodes /var/lib/archipelago (the app data dir AND podman's
graphroot=/var/lib/archipelago/containers/storage) is a separate
device-mapper volume. archipelago.service ordered only After=network-online
.target, so on cold boots it (and its ExecStartPre) could start BEFORE
var-lib-archipelago.mount, write to the bare mountpoint on rootfs, fail every
podman call, exit, and be restarted every 5s until the volume mounted — the
"~20x [FAILED] Failed to start over ~5min" boot flap. Proven live on .198:
"var-lib-archipelago.mount: Directory /var/lib/archipelago to mount over is
not empty, mounting anyway" — the service had written there pre-mount.
Fix: RequiresMountsFor=/var/lib/archipelago (adds Requires= + After= on the
mount unit).
- image-recipe/configs/archipelago.service: ships the directive on fresh ISOs.
- bootstrap::ensure_archipelago_mount_ordering(): self-heals already-deployed
nodes' installed unit + daemon-reload (boot-ordering only, effective next
reboot; never restarts the running service). Idempotent; harmless on rootfs
installs (maps to the always-mounted root).
Verified on .198: after applying, systemctl shows After=var-lib-archipelago
.mount and systemd-analyze verify is clean.
Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
The Home > System bitcoin tile is gated on bitcoinAvailable===true, so any
transient bitcoin.getinfo failure (RPC busy during heavy IBD, route-change
scan) could blank it even though the node is fine. Add a bitcoinStale flag:
- getinfo fails while the container is Running, or package data is momentarily
absent → retain the last-known value and mark it stale (tile stays, shows
"Updating…" instead of a frozen figure presented as live).
- container authoritatively Stopped/Exited → flip to not-available as before
(no stale-as-live).
- first-ever poll times out but container Running → show the tile as updating
rather than staying hidden on a syncing node.
Harness: src/stores/__tests__/homeStatus.test.ts (6 cases) — red before, green
after. type-check clean.
Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
CORE_RPC_HOST was hardcoded to bitcoin-knots in three env-render paths, so on a
bitcoin-core node (container named bitcoin-core) mempool-api could not reach
Bitcoin RPC. Both node variants are reachable on archy-net by container name —
only the name differs.
- Legacy direct-podman (stacks.rs) and config.rs::get_app_config now use a new
dependencies::detect_bitcoin_rpc_host() (pure, unit-tested pick_bitcoin_host).
- Quadlet/manifest path (the modern fleet default): add a {{BITCOIN_HOST}}
derived-env placeholder — HostFacts.bitcoin_host + resolve_derived_env render
it; prod_orchestrator detects Knots/Core via podman ps, resolved on demand
only for manifests that use the placeholder. mempool-api manifest moves
CORE_RPC_HOST from static env to derived_env: {{BITCOIN_HOST}}.
Tests: pick_bitcoin_host (5 cases incl. substring safety), container-crate
resolve_derived_env, and orchestrator mempool_core_rpc_host_follows_bitcoin_node
(core->bitcoin-core, knots->bitcoin-knots). No-regression confirmed: picker
returns bitcoin-knots live on .198. Live bitcoin-core validation pending (no
core node available). Sibling hardcodes (lnd/btcpay/electrumx/fedimint) tracked
as B12b.
Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
The B13 template fix only fixed fresh ISOs. Already-deployed nodes keep their
old nginx config, where /app/fedimint/ proxies to :8175 without rewriting the
Guardian UI's root-rooted asset URLs (src="/assets/...", url("/assets/...")).
Those resolve against the SPA root: bg-network.jpg exists there by luck, but
app-icons/fedimint.jpg 404s (location /assets/ uses try_files =404) — the
visibly-broken icon.
bootstrap.rs::patch_nginx_conf now heals both paths on startup:
- Style A (main conf, HTTP): swaps the old single nostr-provider sub_filter tail
for the full reroot set; byte-matches the shipped template.
- Style B (HTTPS app-proxy snippet): the snippet's fedimint block has no
sub_filter and a per-node-varying trailing directive, so anchor on the unique
:8175 proxy_pass and insert the reroot set after it (nginx ignores directive
order). Snippet added to the bootstrap nginx loop (skipped on HTTP-only nodes).
missing_* flags are now gated on their splice anchors so the included snippet
neither attempts the main-conf-only patches nor logs warn-skips every boot.
Idempotent via the 'href="/' 'href="/app/fedimint/' marker.
Verified on .198 (both paths): fedimint app-icon 404 -> 200 image/jpeg; nginx -t
OK; containers survived restart (Quadlet); idempotent steady state, no warn spam.
Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
Fedimint UI HTML/CSS reference absolute /assets/* paths; under /app/fedimint/
those hit the main SPA, not the fedimint container, so the UI renders
unstyled. Add the proven sub_filter asset-rewrite pattern (as indeedhub/
botfights use) to the /app/fedimint/ block in the nginx template + https
snippet (also rewrites url(...) for the CSS background image). Bootstrap
self-heal for already-deployed nodes is the documented resume point.
Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
B15: Home system stats (incl. bitcoin sync %) polled every 30s — too slow;
now 10s so sync progress tracks the actual block height more closely.
B7: the ElectrumX sync overlay was gated only on status!=='synced', so if
the status never flips to 'synced' (ElectrumX stale/disconnected) the loader
stuck on top forever. Now the overlay hides and the app iframe loads when
the sync status is stale (fail-open), while still showing during active
indexing. type-check EXIT 0.
Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>