lfg2025/archy

Author	SHA1	Message	Date
archipelago	d4c0587df0	fix(health): IndeeHub API waits for MinIO before restart (#41) The IndeeHub API needs MinIO (object storage) up to serve, but the health monitor's dependency map listed only postgres + redis, so it would restart the API while MinIO was still starting — the "recovers only after 1-2 container restarts" symptom. Add indeedhub-minio to the API's deps; MinIO has no deps of its own so the monitor restarts it first, no deadlock. (First-start ordering in the stack definition is a deeper, separate follow-up.) Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>	2026-06-17 06:33:04 -04:00
archipelago	ab56054aeb	fix(federation): remove-node also purges the mesh contact/thread (#2) federation.remove-node only edited nodes.json, so a removed/renamed node (e.g. a stale "Arch HP") lingered in the mesh chat list with its old thread. Capture the node's pubkey before removal, then purge its synthetic mesh peer, shared secret, messages, presence, and persisted contact entry via the new mesh::purge_federation_peer. Combined with the #42 name refresh, stale federation contacts can now be fully cleaned from a node. Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>	2026-06-17 06:12:56 -04:00
archipelago	56752ebfc0	fix(identity): Node npub in Web5 Identities matches Settings (#49) Settings shows the node-level Nostr key (HKDF derive_node_nostr_key, read via node.nostr-pubkey) while Web5 > Identities showed the identity record's own key — the mirrored "Node" identity stores nostr=None and seed identities use a different BIP-32 NIP-06 key, so the two surfaces disagreed. Resolve the node-level Nostr key once in identity.list and override it onto whichever identity record is the node's own (ed25519 == server_info.pubkey). Display-only — no stored key is rewritten, so it self-applies to existing nodes with no migration and the discovery identity is unchanged. Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>	2026-06-17 06:03:25 -04:00
archipelago	6de8173d18	fix(mesh): refresh federation chat names + roster after sync without restart (#42) A peer accepted via invite is seeded into the mesh peer table with name=None, so it shows as "Archipelago <pubkey8>" in chat. Federation sync later learns the real name (update_node_state writes it to nodes.json) and discovers transitive peers (merge_transitive_peers), but nothing pushed those into the live mesh peer table — the chat list stayed stale until the next mesh restart, and transitive peers never appeared as contacts at all. Add RpcHandler::refresh_federation_mesh_peers() (re-runs the idempotent, onion-deduped seed_federation_peers_into_mesh) and call it after every periodic sync cycle (server.rs) and after the manual federation.sync-all RPC. Names now correct themselves and the full roster meshes within a sync cycle, no restart needed. Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>	2026-06-17 05:52:41 -04:00
archipelago	1ea3f8d65c	fix(mesh): message federation contacts without a radio (fixes 'Missing contact_id') Messaging a federation-only peer (e.g. 'Arch Dev') failed with 'Missing contact_id'. The UI gave federation-only rows a negative placeholder contact_id derived from a DID hash, but the backend parses contact_id as u64, so a negative value deserialized to None. The negative id also never matched the positive federation-synthetic id that federation-routed messages are stored under, so those threads looked empty. - Frontend: derive the SAME positive federation-synthetic id the backend uses (federationContactId mirrors federation_peer_contact_id) so mesh.send accepts it and messages thread correctly. - Backend: send_typed_wire now resolves a federation-synthetic contact_id from nodes.json when it isn't in the live mesh peer table (radio-less node), instead of bailing 'Unknown federation peer'. Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>	2026-06-17 03:24:34 -04:00
archipelago	e456c9701b	fix(peer-files): stream large cloud downloads + surface real errors (#30, #38) Large peer downloads (~178MB) failed with a generic 'Operation failed', and the download path had three stacked problems: - The FIPS reqwest client used a hard-coded 20s total timeout regardless of the caller's .timeout(), so a big transfer over the mesh aborted at 20s before the Tor fallback could help. Honor the per-request timeout (client_with_timeout). - The peer-content proxy buffered the whole file into node memory via resp.bytes() before sending a byte, and capped the transfer at 60s. Stream the body through with hyper::Body::wrap_stream (constant memory) and raise the timeout to 900s; bump the nginx peer-content read timeout to match. - Free downloads pulled the file as base64 over RPC, doubling it in node memory and the browser — fatal for large files. Download free files by streaming from /api/peer-content straight to disk, after a 1-byte Range probe that surfaces the real reason (peer offline on mesh and Tor) instead of a generic failure. Paid downloads now return the real error through the {error} channel the UI already displays. Adds the reqwest 'stream' feature for bytes_stream(). Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>	2026-06-17 03:10:21 -04:00
archipelago	3aea8c5bfa	fix(orchestrator): rebuild local UI images when source changes (#34) The prod orchestrator only checked whether a build-image tag was present before deciding to skip the build. The local UI images (bitcoin-ui, lnd-ui, electrs-ui) COPY a built neode-ui dist, so a UI update changed the source but left the old tag in place and the new UI never shipped. Gate the build on a content fingerprint of the build context (sorted relative path + length + mtime, SHA-256) recorded in a per-tag stamp under data_dir. Rebuild whenever the fingerprint differs from the one that produced the existing image; podman's own COPY-layer cache keeps a no-op rebuild cheap. Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>	2026-06-17 03:09:56 -04:00
archipelago	1843739e0c	fix(install): restart stack containers that crash on first start (#25) Apps could fail install when a stack member exited on its first start because a dependency (db/redis/the bitcoin node) was not ready yet — a transient crash, not a broken install. wait_for_stack_containers now restarts each exited/dead container up to 3 times before declaring the install failed; the runtime supervisor keeps it alive afterwards. Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>	2026-06-16 15:14:09 -04:00
archipelago	83b77796fc	chore: release v1.7.98-alpha	2026-06-16 14:07:49 -04:00
archipelago	7e84434ff6	test(update): stage .download-complete marker in roundtrip test The #26 fix makes has_staged_update require the .download-complete marker, so the state self-heal treats a marker-less staging dir as a partial download and clears update_in_progress. The roundtrip test staged a binary file but not the marker, so it began failing. Write the marker to simulate a complete staged update. Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>	2026-06-16 12:41:18 -04:00
archipelago	981a86cc26	style: cargo fmt (update.rs has_staged_update + #16/#36 changes)	2026-06-16 11:30:51 -04:00
archipelago	45ac9be965	fix(kiosk): cap chromium resources + drop GPU rasterization when headless (#36) The kiosk chromium pinned ~92% of a core (software-compositing spin from --enable-gpu-rasterization on a GPU-less/headless node), saturating the machine and starving the backend + container builds — it caused the .198 receive timeout and the deploy storms. - archipelago-kiosk.service: CPUQuota=75% + MemoryMax/High + Delegate, so a runaway kiosk can never take the whole node down. - archipelago-kiosk-launcher.sh: detect /dev/dri — use GPU rasterization only when a GPU exists, else --disable-gpu (avoids the headless spin). - bootstrap::ensure_kiosk_hardened: OTA self-heal that installs the updated unit+launcher on already-deployed nodes, daemon-reloads, and only try-restarts a running kiosk (never re-enables an operator-disabled one). cargo check clean; launcher bash -n clean; unit syntax valid. Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>	2026-06-16 11:10:26 -04:00
archipelago	ab6fcef6f3	fix(containers): periodically restart crashed stack members at runtime (#16/#17) immich_server/redis/postgres + indeedhub-* are multi-container stack members whose sub-container app_ids are NOT in package_data, so the health monitor skips them as "orphans" and never restarts them when they exit — Immich/IndeedHub stay down until the next reboot (the boot-only start_stopped_stack_containers was the only recovery). Spawn a 120s supervisor that reuses that same recovery at runtime. It cheaply skips already-running containers and honours the user-stopped list (set on every container by package.stop), so it only revives genuinely crashed members and never fights a user stop. cargo check clean. Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>	2026-06-16 10:49:36 -04:00
archipelago	82cfc8ccba	fix(update): failed download returns to Download, not Install (#26) A resumable-but-failed download leaves partial component files in update-staging. has_staged_update() treated ANY staged file as "install-ready", so the state self-heal kept update_in_progress=true and the UI showed Install instead of Download (no clean retry). - update.rs: write a .download-complete marker only after EVERY component downloads+verifies; has_staged_update() now checks that marker. Partial/failed downloads (no marker) correctly read as not-staged → self-heal clears update_in_progress → UI shows Download. Resume still works (partial files kept). - SystemUpdate.vue: on a genuine download failure, reset downloaded/in_progress and re-sync, so the user lands back on Download immediately. cargo check + vue-tsc clean. Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>	2026-06-16 10:31:12 -04:00
archipelago	3a9d1db763	feat(identity): seed-derivation verifier + KAT; rename "Your DID"→"Node DID" - scripts/verify-seed-derivation.py: stdlib-only tool to cryptographically prove a node's on-disk keys (node_key→DID, nostr_secret→npub, fips_key) are derived from its onboarding seed exactly as seed.rs documents (BIP-39 → PBKDF2-HMAC-SHA512 → HKDF-SHA256 with per-key domain separation). - seed.rs: known-answer regression test cross-checking Rust node_key + nostr bytes against the Python verifier (locks the derivation). - en.json: "Your DID" → "Node DID". Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>	2026-06-16 10:17:29 -04:00
archipelago	aa9e0f02b7	fix(cloud): pin peer file-card filename + action buttons to the bottom (#11) Make each peer file card a flex column filling its grid cell (flex flex-col h-full) and pin the body row (filename + Play/Download) with mt-auto, so cards with a media preview and cards without line their footers up across the row. Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>	2026-06-16 09:27:29 -04:00
archipelago	edd03e542d	feat(storage): encrypt chat history + mesh contacts at rest, atomic writes, persist contacts (#12) User: chat history (messages + mesh/Tor contacts) must persist and be secure/encrypted per best practice. Root cause of the .198 loss was the B17 mount race writing empty stores over real data (B17 already fixes the trigger); this hardens storage so it can never silently lose or expose data: - storage_crypto: shared at-rest envelope mirroring credentials::store — key = SHA-256(domain ‖ node identity key) (seed-derived, per-store domain separation), ChaCha20-Poly1305 AEAD with a random 96-bit nonce, tamper-evident. Transparent migration of legacy plaintext files. Unit-tested (round-trip, wrong-key/tamper rejection, plaintext detection). - messages.json: encrypted at rest + ATOMIC write (temp+rename) so a crash/reboot mid-write cannot corrupt history; decrypt-with-migration on load; a failed decrypt never overwrites the on-disk data. - mesh contacts (alias/notes/pinned/blocked): were ONLY in memory and lost on every restart — now persisted to mesh-contacts.json (encrypted, atomic), loaded on MeshState startup, saved after contacts-save/contacts-block. Explicit clear (mesh.clear-all) still wipes everything, as intended. Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>	2026-06-16 08:54:37 -04:00
archipelago	774ca28847	feat(fips): auto-activate + reliability (retry, warm paths) — make FIPS the robust primary (B14b/#27) User priority: FIPS is the main transport but it was unreliable and needed a manual "Activate" button. Improvements (all in the FIPS dial/supervisor): - Auto-activate: ensure_activated() installs the daemon config + starts the service on its own once seed onboarding has materialised the key — no Activate button needed. Idempotent; runs from the supervisor every 45s so a node that onboards after boot still comes up automatically. - Dial retry: try_fips_get/post now retry ONCE on a connect/timeout error. The first dial to a peer triggers NAT hole-punching and often times out before the path is up; the retry lands on the now-warm path — the main reason calls were dropping to Tor despite the peer being FIPS-reachable. - More patient connect_timeout (5s→8s) so a reachable-but-cold peer isn't abandoned to Tor while hole-punching completes. - Path warmer: spawn_fips_supervisor() keeps hole-punched paths to known federation peers warm (every 45s, concurrent), so on-demand dials are fast and land on FIPS. - Confirmed the daemon config already enables BOTH udp + tcp transports (render_config_yaml), so FIPS already uses TCP where UDP is blocked; the Tor fallback was path-establishment, addressed above. cargo check + fmt clean. Backend — needs a binary rebuild+deploy to validate on .116/.198 (watch last_transport flip fips, and FIPS coming up with no button). Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>	2026-06-16 08:16:02 -04:00
archipelago	47c16971a7	chore: release v1.7.97-alpha	2026-06-16 04:16:13 -04:00
archipelago	34b1fdc1a3	fix(boot): order archipelago.service after the data volume mount (B17) On production nodes /var/lib/archipelago (the app data dir AND podman's graphroot=/var/lib/archipelago/containers/storage) is a separate device-mapper volume. archipelago.service ordered only After=network-online.target, so on cold boots it (and its ExecStartPre) could start BEFORE var-lib-archipelago.mount, write to the bare mountpoint on rootfs, fail every podman call, exit, and be restarted every 5s until the volume mounted — the "~20x [FAILED] Failed to start over ~5min" boot flap. Proven live on .198: "var-lib-archipelago.mount: Directory /var/lib/archipelago to mount over is not empty, mounting anyway" — the service had written there pre-mount. Fix: RequiresMountsFor=/var/lib/archipelago (adds Requires= + After= on the mount unit). - image-recipe/configs/archipelago.service: ships the directive on fresh ISOs. - bootstrap::ensure_archipelago_mount_ordering(): self-heals already-deployed nodes' installed unit + daemon-reload (boot-ordering only, effective next reboot; never restarts the running service). Idempotent; harmless on rootfs installs (maps to the always-mounted root). Verified on .198: after applying, systemctl shows After=var-lib-archipelago.mount and systemd-analyze verify is clean. Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>	2026-06-16 03:33:29 -04:00
archipelago	2943fd0c5e	style(core): cargo fmt (B1/B3/B13 follow-up — satisfy release fmt gate)	2026-06-16 03:09:18 -04:00
archipelago	bf24bbc15a	fix(mempool): resolve CORE_RPC_HOST to the actual bitcoin node (Knots/Core) (B12) CORE_RPC_HOST was hardcoded to bitcoin-knots in three env-render paths, so on a bitcoin-core node (container named bitcoin-core) mempool-api could not reach Bitcoin RPC. Both node variants are reachable on archy-net by container name — only the name differs. - Legacy direct-podman (stacks.rs) and config.rs::get_app_config now use a new dependencies::detect_bitcoin_rpc_host() (pure, unit-tested pick_bitcoin_host). - Quadlet/manifest path (the modern fleet default): add a {{BITCOIN_HOST}} derived-env placeholder — HostFacts.bitcoin_host + resolve_derived_env render it; prod_orchestrator detects Knots/Core via podman ps, resolved on demand only for manifests that use the placeholder. mempool-api manifest moves CORE_RPC_HOST from static env to derived_env: {{BITCOIN_HOST}}. Tests: pick_bitcoin_host (5 cases incl. substring safety), container-crate resolve_derived_env, and orchestrator mempool_core_rpc_host_follows_bitcoin_node (core->bitcoin-core, knots->bitcoin-knots). No-regression confirmed: picker returns bitcoin-knots live on .198. Live bitcoin-core validation pending (no core node available). Sibling hardcodes (lnd/btcpay/electrumx/fedimint) tracked as B12b. Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>	2026-06-16 02:07:39 -04:00
archipelago	987a961f4a	fix(nginx): self-heal fedimint asset rewrite on deployed nodes — HTTP + HTTPS (B13) The B13 template fix only fixed fresh ISOs. Already-deployed nodes keep their old nginx config, where /app/fedimint/ proxies to :8175 without rewriting the Guardian UI's root-rooted asset URLs (src="/assets/...", url("/assets/...")). Those resolve against the SPA root: bg-network.jpg exists there by luck, but app-icons/fedimint.jpg 404s (location /assets/ uses try_files =404) — the visibly-broken icon. bootstrap.rs::patch_nginx_conf now heals both paths on startup: - Style A (main conf, HTTP): swaps the old single nostr-provider sub_filter tail for the full reroot set; byte-matches the shipped template. - Style B (HTTPS app-proxy snippet): the snippet's fedimint block has no sub_filter and a per-node-varying trailing directive, so anchor on the unique :8175 proxy_pass and insert the reroot set after it (nginx ignores directive order). Snippet added to the bootstrap nginx loop (skipped on HTTP-only nodes). missing_* flags are now gated on their splice anchors so the included snippet neither attempts the main-conf-only patches nor logs warn-skips every boot. Idempotent via the 'href="/' 'href="/app/fedimint/' marker. Verified on .198 (both paths): fedimint app-icon 404 -> 200 image/jpeg; nginx -t OK; containers survived restart (Quadlet); idempotent steady state, no warn spam. Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>	2026-06-15 18:03:04 -04:00
archipelago	602b9cd3df	fix(nginx): route /api/peer-content/* to the backend for B3 streaming The B3 streaming proxy endpoint existed in the backend but nginx had no location for /api/peer-content/*, so the browser's requests fell through to the SPA (200 text/html) and media still wouldn't play. Add an NGINX_PEER_CONTENT_BLOCK that bootstrap patches into every server block (forwards Cookie for session auth + Range, proxy_buffering off). Idempotent; covers fresh-ISO nodes too since bootstrap runs on every startup. Verified on .198: after restart the async nginx patch lands and /api/peer-content/<onion>/<id> returns 401 (reaches backend, auth-gated) instead of the SPA; nginx block present in both server blocks. Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>	2026-06-15 14:07:39 -04:00
archipelago	5c8707432b	fix(cloud): Range-streaming proxy for peer media so it plays/seeks (B3) Peer media (music/video) wouldn't play: the frontend downloaded the whole file via RPC as base64 and made a non-seekable Blob URL, so <video>/large <audio> stalled and big files hit the RPC timeout. Add GET /api/peer-content/<onion>/<id> — a same-origin, session-gated proxy that forwards the browser's Range header to the peer's /content/<id> (which already returns 206 Partial Content) and passes status + Content-Range + Content-Type back. PeerFiles.playMedia() now points <video>/<audio> at this streaming URL for free content instead of buffering a base64 blob, so the player can seek and start immediately. Onion/id validated to prevent SSRF/path traversal. (Paid preview keeps its existing flow.) Verified: cargo build --release EXIT 0; vue-tsc --noEmit EXIT 0. Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>	2026-06-15 13:46:51 -04:00
archipelago	0801dd6632	feat(cloud): show Tor/FIPS transport pill on peer browse (B21) content.browse-peer now returns the transport that actually reached the peer (fips/tor/mesh/lan). PeerFiles shows it as a small coloured pill next to the peer name (FIPS/Mesh green, LAN blue, Tor amber) and the loading text no longer hardcodes "Connecting via Tor" (it was misleading when FIPS was used). Pairs with B14 (transport recording). Verified: cargo build --release EXIT 0; vue-tsc --noEmit EXIT 0. Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>	2026-06-15 13:25:39 -04:00
archipelago	1c6dc153ce	fix(content): use re-exported federation::record_peer_transport path (repair build) The B14 commit referenced crate::federation::storage::record_peer_transport but `storage` is a private module — record_peer_transport is re-exported at crate::federation::. E0603 broke the build. Use the re-exported path (as load_nodes/fips_npub_for_onion already do). Verified: cargo build --release EXIT 0. Also logs B21 (Tor/FIPS pill) plan. Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>	2026-06-15 13:15:01 -04:00
archipelago	f2e3710c28	fix(content): record peer transport on cloud browse/download/preview (B14) The 4 content peer handlers (browse, download, download_paid, preview) captured the transport returned by PeerRequest::send_get() but discarded it, so the federation node's last_transport was never updated for cloud activity — the UI showed Tor/none even when FIPS was used. Call record_peer_transport() after each successful fetch (same as sync does). Note: live data shows FIPS still reaches only some peers (many genuinely fall back to Tor) — tracked separately as B14b (FIPS reachability). Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>	2026-06-15 13:02:13 -04:00
archipelago	ed4931064b	fix(federation,cloud): dedup trusted nodes + chat contacts by onion; guard cloud my-folders (B1,B2,B4) B1/B2: the same physical node can linger in the federation list under two dids (e.g. after a did/key change). An onion is a node's unique stable identity, so two entries with the same onion are one node. This showed the node twice in the trusted-node list (B1) and as two mesh chat contacts — one by name+logo, one by raw did (B2). - storage::load_nodes now collapses same-onion entries (keep first, merge fips_npub/name/last_state) so every consumer (list + chat seed + sync) sees one entry per node. - federation::sync merge_transitive_peers also matches by onion (not just did) so new transitive hints don't re-add a known node under a new did. - mesh::seed_federation_peers_into_mesh skips already-seeded onions (belt and suspenders). - Unit tests for dedup_nodes_by_onion (collapse + onion-suffix handling). B4: filebrowser-client.listDirectory only checked res.ok before res.json(), so when File Browser is absent (nginx serves the SPA index.html, 200) or down (502) the JSON parse threw the opaque "Unexpected token '<'". Now it checks the content-type and throws a friendly "File Browser is not available" the Cloud view already renders as an empty state. Verified: dedup unit tests 2/2; live .198 (15 entries→13 distinct onions) restarted healthy on new binary; B4 guard present in built bundle + deployed. Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>	2026-06-15 12:29:12 -04:00
archipelago	1db720af13	fix(lnd): repair fleet-wide CORS on LND connect-wallet endpoints (B5) The LND wallet UI (served on its own app port) fetches /lnd-connect-info and /proxy/lnd/* cross-origin, so both need correct CORS headers. (a) Older nginx configs add their own Access-Control-Allow-Origin in the /lnd-connect-info location on top of the one the backend sets, yielding a DUPLICATE header that browsers reject ("multiple values"). bootstrap now strips that redundant nginx add_header (backend owns CORS). (b) /proxy/lnd/* returned a 401 with no CORS headers when the session check failed, so the browser saw an opaque CORS error instead of a readable 401. Add unauthorized_cors() and use it on that path. Adds tests/production-quality/ (bug tracker + lnd-cors-test.sh harness). Verified: harness 4/4 on .116, .198, .103. Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>	2026-06-15 11:31:14 -04:00
archipelago	8c3c79543e	chore: sync core/Cargo.lock to 1.7.96-alpha (release leftover) Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>	2026-06-15 10:15:24 -04:00
archipelago	7aa1ca013f	chore: release v1.7.96-alpha	2026-06-15 10:14:05 -04:00
archipelago	790ad154f3	chore: sync core/Cargo.lock to 1.7.95-alpha (release leftover) Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>	2026-06-15 09:04:30 -04:00
archipelago	e2c2f942c2	chore: release v1.7.95-alpha	2026-06-15 08:48:22 -04:00
archipelago	937ba7e115	chore: sync core/Cargo.lock to 1.7.94-alpha (release leftover) Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>	2026-06-15 08:09:55 -04:00
archipelago	e056c2477b	fix(fips,federation,ui): mesh content browse, removed-node tombstones, modal sizing FIPS peer content browse over the mesh was failing with "Peer returned error: 404 Not Found" and never falling back to Tor. `is_peer_allowed_path` only allowed `/content/<id>` (item fetches) — the catalog endpoint is exactly `/content` (no trailing slash), so it 404'd over the FIPS peer listener. A FIPS 404 was also treated as a successful response, so the dial never retried Tor. Fixes: allow `/content` over the mesh; add `fips_should_fall_back()` so a FIPS 404/5xx in Auto mode falls back to Tor (handles version-skew peers reaching a different route). Also correct the reconnect hint text — the public anchor is TCP/8443, not UDP/8668. Federation: deleted nodes reappeared because transitive discovery (`merge` of a peer's advertised trusted peers) re-added any unknown DID. Add a tombstone store (`removed-nodes.json`): remove_node tombstones the DID, transitive merge skips tombstoned DIDs, and a remote-triggered peer-joined is ignored for a removed DID. Explicit local re-add (add_node) clears the tombstone. UI: the app credentials modal panel stretched edge-to-edge (height:100%, max-width:none, items-stretch overlay). Constrain it to a centered card (max-width 34rem, rounded, dimmed full-screen backdrop) matching the AppIconGrid / wallet-receive modal. Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>	2026-06-15 08:09:26 -04:00
archipelago	7bd22f1f80	chore: release v1.7.94-alpha	2026-06-15 07:09:58 -04:00
archipelago	95f9a805b1	feat(fips): connect to public mesh anchor over TCP + wire daemon updates The whole fleet was silently never reaching the FIPS mesh: the default public anchor was configured as fips.v0l.io:8668/udp, but the anchor only answers on TCP/8443. Fix the default to 185.18.221.160:8443/tcp (IPv4 literal — the hostname resolves IPv6-first and the daemon binds v4-only, which fails the handshake with EAFNOSUPPORT), and auto-seed it in anchors::load() so every node dials it without operator action (removal still persists). Proven live on .116: cold start → anchor_connected in ~400ms, anchor became mesh parent. Wire fips::update::apply() against upstream GitHub releases (stable channel only): resolve /releases/latest → SHA256-verify the .deb against checksums-linux.txt → install → restart. dpkg runs via `systemd-run` to escape archipelago's ProtectSystem=strict sandbox (else /var/lib/dpkg is read-only), with --force-confold (archipelago manages /etc/fips conffiles) and --force-downgrade (dev builds sort newer than the stable tag). Validated live: .116 upgraded 0.3.0-dev -> stable v0.3.0. Also: standalone fips-ui dashboard app (apps/fips-ui + docker/fips-ui, static nginx proxying /rpc/v1 same-origin, copiable own-anchor address); reserve UI port 8336; register fips/fips-ui as platform-managed. Includes the Lightning wallet cross-origin (CORS) + LND proxy auth + nginx self-healer fix so the wallet screen connects instead of "failed to fetch". Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>	2026-06-15 06:41:48 -04:00
archipelago	640dc87a5f	chore: sync core/Cargo.lock to 1.7.93-alpha (release leftover) Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>	2026-06-14 15:21:07 -04:00
archipelago	327a4e34dd	chore: release v1.7.93-alpha	2026-06-14 15:18:34 -04:00
archipelago	1973d76427	style: rustfmt lnd migrate_locked_wallet matches! call Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>	2026-06-14 14:41:40 -04:00
archipelago	3214d6aff3	fix(lnd): self-heal unrecoverable locked wallet via wipe+recreate When an existing LND wallet is locked and none of the candidate passwords (per-node secret, legacy constant) open it, the node can never auto-unlock unattended. unlock_existing_wallet now returns Ok(false) for "all candidates actively rejected" (vs Err for transient "LND not ready"), and ensure_wallet_initialized responds by recreating the wallet: - mark the lnd container user-stopped so the health monitor won't re-launch it (and re-open the wallet) mid-wipe, - stop lnd, delete its wallet/chain/graph state as root, - start lnd, wait for NON_EXISTING, re-init a fresh wallet on the per-node secret, then clear the user-stopped flag. LND runs as a plain bridge-network podman container (not a Quadlet unit), so it is restarted via `systemd-run --user --scope podman`, matching the orchestrator/health-monitor path. Alpha nodes hold no funds and a wallet locked with an unknown password is already inaccessible, so the wipe loses nothing reachable. Completes the forward fix from 91adc281 for nodes whose wallet pre-dates the per-node secret and whose password is unrecorded (e.g. .116/.228). Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>	2026-06-14 14:08:33 -04:00
archipelago	91adc281ca	fix(lnd): per-node wallet password + locked-wallet self-heal on login Replaces the fleet-wide hardcoded WALLET_PASSWORD='hellohello' that left wallets LOCKED after OTA/reboot (auto-unlock used the wrong password fleet-wide). Forward fix (both init paths unified, validated cargo check + LND REST mechanics on a scratch wallet): - Per-node random 256-bit secret in secrets/lnd-wallet-password (0600), mirroring secrets/bitcoin-rpc-password. read_wallet_password (no-gen) vs ensure_wallet_password (gen at init only). - container/lnd.rs init AND api/rpc/lnd/wallet.rs seed-derived init both use the per-node secret (wallet.rs keeps recoverable derived entropy; password unified). - Unlock tries [per-node secret, legacy 'hellohello']; single-attempt primitive distinguishes invalid-passphrase (fail fast, try next) from not-ready (retry), so a wrong password no longer hangs the boot path ~60s. Migration (candidate-unlock + rotate, best-effort at login): - change_wallet_password (WalletUnlocker.ChangePassword) + migrate_locked_wallet: if LOCKED, try candidates as current pw and ChangePassword onto the per-node secret so future boots auto-unlock. Hooked into auth.login (non-blocking) with the just-verified password as the candidate. NOT YET: seed-recovery fallback for wallets where no candidate matches (e.g. .116/.228) — destructive, needs entropy-source/funds-safety handling; next pass. NOT shipped: pending end-to-end validation on a real node. Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>	2026-06-14 11:19:56 -04:00
archipelago	a9c4e54023	chore: sync core/Cargo.lock to 1.7.92-alpha (release leftover) create-release.sh bumps Cargo.toml but not the lock's archipelago version line; the cargo build regenerates it post-commit. Same as the 1.7.91 leftover — worth fixing create-release.sh to stage Cargo.lock, tracked separately. Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>	2026-06-14 10:42:13 -04:00
archipelago	d462e44453	chore: release v1.7.92-alpha	2026-06-14 09:09:57 -04:00
archipelago	60fe761def	chore: sync core/Cargo.lock to 1.7.91-alpha (release leftover) create-release.sh bumps Cargo.toml; the lock's archipelago version line is regenerated by the subsequent cargo build and was left uncommitted after the v1.7.91-alpha release commit. The shipped binary is built from the bumped Cargo.toml, so this is bookkeeping only. Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>	2026-06-14 07:58:03 -04:00
archipelago	9b9fa9cdee	chore: release v1.7.91-alpha	2026-06-14 05:32:38 -04:00
archipelago	a483fe4baa	fix: derive launch port from URL authority, not naive rsplit reachable_lan_address() parsed the launch port with url.rsplit(':') which yields "8096/" for manifest interfaces.main URLs that carry a path (http://localhost:8096/). That fails to parse and silently drops a perfectly reachable launch URL, so apps like jellyfin, btcpay-server, fedimint, gitea, nextcloud and portainer showed running with no launch link in the UI. New launch_url_port() reads digits after the final colon (mirroring port_from_url in the RPC layer) and tolerates a trailing path. Adds regression tests. Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>	2026-06-14 03:35:19 -04:00
archipelago	0ed892a412	fix: wallet receive reliability, bitcoin install self-heal, ElectrumX app tile Fixes three Bitcoin/wallet failures observed across the fleet on v1.7.90-alpha (all nodes were already on the latest build — these were live bugs, not stale builds), plus the missing ElectrumX tile, and adds automated coverage so each can't regress silently. Receive address (".116 receive fails", ".228 false 'wallet is locked'"): - LND publishes its REST API on a host port that can drift from the manifest (a container created when the mapping was 8080 kept publishing 8080 after the manifest moved to 18080). The in-process client connects to the manifest port, gets connection-refused, and wallet init fails forever while the container looks "Up". Add published-port drift detection to the reconciler (container_ports_drifted / host_port_bindings_drifted) that recreates a drifted backend even for restart-sensitive apps — a drifted container is already broken, so leaving it "untouched" only perpetuates the failure. - Receive errors now carry a stable [CODE] token (REST_UNREACHABLE, WALLET_LOCKED, WALLET_UNINITIALIZED, SYNCING) and always start with "Bitcoin address" so they survive the RPC error sanitizer instead of collapsing to the generic "Operation failed". The UI maps the code instead of guessing wallet state from substrings — so an unreachable REST endpoint is no longer mislabelled "locked". Bitcoin install (".198 bitcoin gone / reinstall just stops"): - bitcoin-knots requires the secret bitcoin-rpc-txrelay-rpcauth, which was only generated by the tx-relay flow. Nodes that never used tx-relay lacked it, so secret resolution hard-failed and the whole Bitcoin stack cascaded. Generate it idempotently before bitcoin starts (ensure_app_secrets, reusing ensure_txrelay_credentials), and name the missing secret in the error so a genuine gap is actionable instead of a bare "IO error". 
ElectrumX app tile missing on every node with it installed: - The catalog generator dropped electrumx because the manifest had no interfaces.main block, so the tile had no launch URL and was hidden. Declare the companion UI port (50002) in the manifest, regenerate the catalog, and let an app with a known launch URL stay launchable while its backend is still "starting" (ElectrumX indexes for 10m+). Test harness: - New lifecycle bats suites: bitcoin-receive, port-drift, secret-completeness (validated live; port-drift catches the real .116 drift). - Rust unit tests for drift detection, the receive reason-code classifier, and the named-missing-secret error; vitest for the UI code mapping. - create-release.sh now runs tests/release/run.sh and aborts the release on failure — previously it ran no tests at all. Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>	2026-06-14 03:12:56 -04:00
archipelago	bb808df89a	chore: release v1.7.90-alpha	2026-06-13 05:05:14 -04:00
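
The port parsing described in a483fe4baa can be sketched as follows. This is a minimal illustration, not the repository's code: only the name `launch_url_port` and the "digits after the final colon, tolerating a trailing path" behaviour come from the commit message, and the signature and return type are assumptions.

```rust
/// Hypothetical sketch of the behaviour described in a483fe4baa: read the
/// digits that follow the final colon and ignore any trailing path.
/// The signature and the Option<u16> return type are assumptions.
fn launch_url_port(url: &str) -> Option<u16> {
    // Everything after the last ':' ("8096/" for "http://localhost:8096/").
    let (_, tail) = url.rsplit_once(':')?;
    // Keep only the leading run of ASCII digits, dropping "/" and any path.
    let digits: String = tail.chars().take_while(|c| c.is_ascii_digit()).collect();
    // An empty or out-of-range digit run means there is no usable port.
    digits.parse().ok()
}

/// The naive approach the commit replaces: parse whatever follows the last
/// colon as-is, which fails as soon as the URL carries a path.
fn naive_port(url: &str) -> Option<u16> {
    url.rsplit(':').next()?.parse().ok()
}
```

With these definitions, `naive_port("http://localhost:8096/")` yields `None` because `"8096/"` is not a number, while `launch_url_port` returns `Some(8096)` for the same input. A final-colon scan like this still misreads URLs whose path contains a colon or whose host is an IPv6 literal without a port; a full URL parser would be the sturdier choice where those can occur.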


578 Commits
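
The three-way unlock result introduced in 91adc281ca and 3214d6aff3 (unlocked, every candidate rejected, LND not ready) can be sketched roughly as below. The `Attempt` enum, the `unlock_with_candidates` function, and the closure standing in for the LND REST call are all hypothetical; only the `Ok(false)` versus `Err` distinction is taken from the commit messages.

```rust
/// Outcome of one unlock attempt (hypothetical type, not from the repository).
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
enum Attempt {
    Unlocked,
    InvalidPassphrase, // wrong password: fail fast and try the next candidate
    NotReady,          // transient: LND is not accepting unlock requests yet
}

/// Try each candidate password in order.
/// Ok(true)  = a candidate opened the wallet.
/// Ok(false) = every candidate was actively rejected (caller may wipe and recreate).
/// Err(_)    = transient "not ready" (caller should retry later, never wipe).
fn unlock_with_candidates<F>(candidates: &[&str], mut try_unlock: F) -> Result<bool, String>
where
    F: FnMut(&str) -> Attempt, // stand-in for the real single-attempt unlock call
{
    for &password in candidates {
        match try_unlock(password) {
            Attempt::Unlocked => return Ok(true),
            Attempt::InvalidPassphrase => continue,
            Attempt::NotReady => return Err("LND not ready".to_string()),
        }
    }
    Ok(false)
}
```

Keeping "rejected" and "not ready" as separate results is what makes the destructive path safe to automate in this sketch: a caller would only consider recreating the wallet on `Ok(false)`, and would treat `Err` as a reason to wait and try again.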