1521 Commits

Author SHA1 Message Date
archipelago
9cc288521d docs: multinode pass — cleared .198 preconditions, hit a real hardware blocker
Reset-failed 2 stale dead-unit records on .198, confirmed nginx lnd proxy
target is correct. Hit a genuine blocker needing a user decision: .198's
448GB disk is below the 1TB archival threshold so it runs pruned bitcoin,
currently only 21% through IBD — the multinode plan's precondition requires
pruned:false + fully synced.

Co-Authored-By: Claude Sonnet 5 <noreply@anthropic.com>
2026-07-01 12:45:41 -04:00
archipelago
0323310c91 docs: close out Tier 1 tracker items — all 3 turned out non-issues
immich is already fully Quadlet-migrated (verified live on .228, same
install_stack_via_orchestrator primitive as netbird/btcpay). TanStack Query
spike recommends not adopting — no cache/staleness bugs, WS push already
covers hot data. Netbird reinstall adoption-skips-cert-render is correct by
design (adoption only fires when no manifest exists to render from anyway).

Co-Authored-By: Claude Sonnet 5 <noreply@anthropic.com>
2026-07-01 12:37:23 -04:00
archipelago
79bbcca964 docs: consolidate OTA 1.8.0 + master-plan open items into one priority-ordered tracker
docs/UNIFIED-TASK-TRACKER.md replaces hunting across SESSION-1.8.0-OTA-PROGRESS.md
and PRODUCTION-MASTER-PLAN.md for "what's left" — fastest/simplest tasks first.
Verified against live code/nodes rather than trusting doc text: several previously
"open" items (bind-dir chown, netbird legacy installer, launch-port fallback,
archival-bitcoin manifest field, progress-UI monotonicity, all-apps coverage,
fedimint test coverage, changelog backfill, portainer image pin, grafana quadlet
activation) turned out already shipped or non-issues, and are closed out here.
TESTING.md's release-gate checklist updated to match reality (cargo warnings,
5x gate, changelog already green; multinode/backend-default-flip/tag genuinely open).

Co-Authored-By: Claude Sonnet 5 <noreply@anthropic.com>
2026-07-01 12:29:26 -04:00
archipelago
177b8a4338 feat(mesh): show federated Archipelago nodes on the Mesh Map
Peers that opt in via a new "Share Location" toggle in Settings
(server.set-location RPC) get plotted on other trusted peers' Mesh Map
with a distinct Archy-logo marker, separate from raw LoRa radio peers.
Location is persisted locally, carried in NodeStateSnapshot, and
propagated through federation sync/delta like other node state.

Co-Authored-By: Claude Sonnet 5 <noreply@anthropic.com>
2026-07-01 12:04:31 -04:00
archipelago
e3baaa5de3 docs: record fleet-deploy ENOSPC bug + fix + cleanup outcome
Co-Authored-By: Claude Sonnet 5 <noreply@anthropic.com>
2026-07-01 11:01:27 -04:00
archipelago
84d35b3b68 fix(deploy): also exclude .venv from the rsync payload
reticulum-daemon/.venv (a local Python virtualenv bundling PyInstaller +
esptool + Qt hooks, several hundred MB) was also being synced to deploy
targets uncached -- same class of bug as the releases/ exclude just added.

Co-Authored-By: Claude Sonnet 5 <noreply@anthropic.com>
2026-07-01 10:54:34 -04:00
archipelago
aa849849e8 fix(deploy): exclude releases/ from the rsync payload
releases/ (the local repo's own historical build artifacts -- dozens of
versioned binaries + frontend tarballs, 7-10GB) was never excluded, so every
deploy synced it to the target's root disk. Filled .198 (29GB disk) to 100%
mid-deploy and .228 to 100% right after a "successful" deploy -- the target
node never needs its own copy of the release archive, only the built
binary+frontend actually get installed into system paths.

Co-Authored-By: Claude Sonnet 5 <noreply@anthropic.com>
2026-07-01 10:44:07 -04:00
archipelago
bebf3bae10 fix(mesh): Reticulum garbage-text + reconnect churn + signal bars + node naming/HTTPS
- reticulum.rs: send_text_msg was lossy-UTF8-mangling binary CBOR control
  envelopes (ReadReceipt etc.) before sending as LXMF text; base64-encode
  with a marker instead, decoded losslessly on receive.
- typed_messages.rs: mesh.send-read-receipt fired automatically on every
  chat view with no is_archy_peer gate, so viewing a message from a stock
  (non-archy) LXMF peer auto-sent it an undecodable control envelope,
  surfacing as garbage text right after whatever it just sent. Now a no-op
  for non-archy peers.
- mesh/listener/mod.rs: RX_STALL_TIMEOUT was 300s and forced a full
  auto-detect reconnect on any otherwise-healthy but quiet mesh link
  (visible as "Connecting..." flapping); this also wiped Reticulum's
  in-memory peer-address table every cycle, breaking messaging with peers
  who hadn't re-announced in the window. Bumped to 1800s.
- reticulum.rs: persist the peer prefix/dest-hash/display-name table to
  disk so a restart doesn't force every peer back to "Anonymous Peer"
  until they re-announce.
- decode.rs/frames.rs: Meshcore was discarding the SNR its wire format
  carries; wire it onto the peer record. Mesh.vue's signalBars() now falls
  back to SNR-based bars when RSSI is unavailable (always true for
  Meshcore); Reticulum has neither and correctly stays at 0/"no data".
- system/handlers.rs, dispatcher.rs: new system.get-hostname RPC + cert
  regeneration (with a proper SAN) whenever server.set-name changes the
  hostname, so HTTPS doesn't add a mismatch warning on top of the
  self-signed one after a rename.
- AccountInfoSection.vue: surface the mDNS hostname + http/https links in
  Settings (HTTPS needed for mic/camera secure-context features) — never
  forced, both keep working.
- build-auto-installer-iso.sh: ship avahi-daemon so .local names actually
  resolve on the LAN, and give the self-signed cert a real SAN instead of
  a bare CN, both at image-build and install-time-fallback.
- Mesh.vue/MediaLightbox.vue/mesh-styles.css: mic/attach-stack no longer
  closes on a plain hover-past; mesh images open in the shared lightbox
  and have a real download button; lightbox close button moves to
  bottom-center on mobile instead of under the status bar; mesh device
  panel gets the same height/padding as its sibling tabs.

Verified: 108/108 mesh unit tests, deployed + confirmed healthy on
.116/.198/.228 (matching binary hash across all three), live Reticulum
messaging confirmed working end-to-end post-deploy.

Co-Authored-By: Claude Sonnet 5 <noreply@anthropic.com>
2026-07-01 10:42:20 -04:00
archipelago
99cd82ab0a fix(ui): catch useAudioPlayer's play() rejection instead of leaving it unhandled
play() on the underlying <audio> element rejects independently of its
'error' event (e.g. NotSupportedError when a peer-content request 404s and
there's no decodable source) — the 'error' listener already sets a friendly
message, but the unawaited play() promise still surfaced as a raw unhandled
rejection in the console. Follow-up from the .116->.228 peer-content
investigation (2026-07-01).

Co-Authored-By: Claude Sonnet 5 <noreply@anthropic.com>
2026-07-01 09:53:50 -04:00
archipelago
5269d50039 docs: record .198 cleanup outcome + .228 fedimint-guardian clarification
Co-Authored-By: Claude Sonnet 5 <noreply@anthropic.com>
2026-07-01 09:20:15 -04:00
archipelago
09d42cbbf7 fix(orchestrator): immich uninstall must disable its sibling app_ids too
orchestrator_uninstall_app_ids("immich") only disabled the "immich" app_id
itself; "immich-postgres" and "immich-redis" (separate orchestrator-tracked
manifests, same pattern as mempool-api/archy-mempool-db) stayed enabled, so
the boot reconciler kept restarting their leftover stopped containers
forever after the generic uninstall path stopped them (.198, 2026-07-01 --
found while uninstalling immich to relieve disk I/O pressure competing with
a slow Bitcoin IBD).

Co-Authored-By: Claude Sonnet 5 <noreply@anthropic.com>
2026-07-01 09:12:13 -04:00
archipelago
d0710e7491 fix(orchestrator,content): bound repair-recreate loops; self-heal stale content catalog entries
- prod_orchestrator.rs: the boot reconciler's zombie-guard and start-failed
  recreate paths (Created/Stopped/Exited states) had no attempt cap, unlike
  health_monitor's independent restart tracker. A container whose entrypoint
  fatally crashes right after `podman start` succeeds got stop+remove+
  install_fresh'd every ~30s reconcile tick forever (portainer on .198,
  2026-07-01: a DB schema newer than the pinned binary could read -- no
  amount of recreating fixes that). Added a 5-attempts/30-minute circuit
  breaker; once exhausted the container is left alone with an error! log
  instead of looping, and an explicit install/start clears the counter.
- content_server.rs: serve_content now prunes a catalog entry whose backing
  file is missing on disk, instead of leaving it advertised to every peer
  forever with no way to distinguish "gone" from "transient failure."

Co-Authored-By: Claude Sonnet 5 <noreply@anthropic.com>
2026-07-01 08:19:54 -04:00
archipelago
d414ae3daa fix(orchestrator,ui): stop crash-looping orphan stack members; dedupe Electrum launch overlay
- crash_recovery.rs: stack boot/runtime recovery (immich/indeedhub/netbird) now
  requires the stack's core dependency container to exist before touching any
  sibling, instead of firing on any leftover container. Fixes an infinite
  120s-interval crash loop where orphan debris from a partial/failed install
  (indeedhub-api with no indeedhub-postgres ever created) was repeatedly
  force-restarted against a dependency that doesn't exist, which also blocked
  a real reinstall via container name conflicts.
- AppSessionFrame.vue: the generic app-loading overlay and the ElectrumX
  sync-in-progress overlay could render simultaneously (same z-index) during
  launch. The sync screen is strictly more informative, so it now takes
  precedence instead of the two stacking on top of each other.

Co-Authored-By: Claude Sonnet 5 <noreply@anthropic.com>
2026-07-01 07:02:01 -04:00
archipelago
5b7cd5d5d0 fix(orchestrator): durable uninstall marker for baseline apps + archival-bitcoin/version-report gaps
- mempool-api now declares dependencies:[bitcoin:archival] directly, closing a
  gap where installing it standalone (a legitimate direct orchestrator-install
  target) bypassed the mempool umbrella's pruning gate entirely.
- New durable user-uninstalled marker (crash_recovery.rs, mirrors user_stopped)
  fixes required-baseline-app self-heal (bitcoin-knots/electrumx/lnd/mempool/
  etc.) resurrecting itself after an explicit uninstall survives a restart or
  reboot, since the in-memory disabled set is wiped by every load_manifests().
- installed_version() (set_config.rs) no longer trusts a floating image tag
  ("latest") as the reported running version -- a stale local :latest cache
  reported "latest" forever regardless of what latest had moved on to. Now
  falls back to asking the Bitcoin backend directly via `bitcoind --version`
  when the tag is floating.

Co-Authored-By: Claude Sonnet 5 <noreply@anthropic.com>
2026-07-01 06:29:11 -04:00
archipelago
de8b2bb812 fix(iso): raise /tmp tmpfs cap above systemd's 50%-of-RAM default
PyInstaller-based daemons (e.g. the Reticulum mesh daemon) self-extract
to /tmp on every launch and leak their extraction dir on abnormal exit;
combined with release-build staging dirs this exhausted the default cap
on .116 and silently broke Reticulum ("no space left on device" during
self-extraction, masquerading as a connect failure). Ship a tmp.mount.d
override (75% of RAM) in the installer image so fresh installs don't
inherit the same ceiling.

Co-Authored-By: Claude Sonnet 5 <noreply@anthropic.com>
2026-07-01 04:53:52 -04:00
archipelago
306b6356ee fix(orchestrator): generalize launch-port fallback + archival-bitcoin dependency gating
Master-plan backlog §10b/§10c: replace two per-app-hardcoded lookups with
generic, manifest-driven behavior so future apps are covered automatically
instead of needing a code edit.

- extract_lan_address (docker_packages.rs) now skips container-side ports
  that are known non-HTTP (SSH, FTP, common DB ports) instead of blindly
  taking podman's first-listed port. Fixes the whole class of bug the gitea
  SSH-before-web static override was a one-off patch for.
- requires_unpruned_bitcoin (dependencies.rs) now checks the app's own
  manifest for a `bitcoin:archival` dependency declaration first, falling
  back to the old hardcoded id list. electrumx and mempool manifests now
  declare it explicitly as the proof case.

869/869 Rust tests green, catalog drift clean.

Co-Authored-By: Claude Sonnet 5 <noreply@anthropic.com>
2026-07-01 03:59:00 -04:00
archipelago
46dae75a0f feat(mesh): device onboarding modal (backlog #6)
Guided prompt that pops up when a mesh radio is detected but not yet
connected -- wraps the existing Connect action (mesh.configure with
device_path) rather than building a new setup engine. Dismissible per
device path (won't re-prompt for the same undismissed-but-ignored device on
every poll tick). Not the whole-app identity/seed onboarding system
(useOnboarding.ts) -- confirmed unrelated, this is mesh-specific only.

Co-Authored-By: Claude Sonnet 5 <noreply@anthropic.com>
2026-06-30 23:24:12 -04:00
archipelago
712df2278f feat(mesh): Meshtastic provisioning robustness (backlog #12)
Three fixes:
1. Modem-preset authoritative: parse_config_lora_region now also decodes
   modem_preset (field 2) alongside region, tracked as current_modem_preset.
   ensure_lora_region's "region already set, don't touch it" branch (correct,
   unchanged) now ALSO re-asserts LONG_FAST when a real observed preset has
   drifted -- previously modem_preset only ever got written when region was
   UNSET, so a radio with the right region but wrong preset was never fixed.
   Only acts on an actually-observed wrong value (never speculative), so it
   can't reboot-loop.
2. RX-stall watchdog: run_mesh_session now bails (triggering the existing
   auto-reconnect path) if no frame has been successfully received in 5
   minutes -- the existing consecutive_write_failures counter is blind to a
   receive-only stall (writes can keep succeeding while inbound streaming is
   wedged).
3. Hot-swap detection: spawn_mesh_listener now compares self_node_id across
   session restarts and logs clearly when the physical radio itself changed
   (not just an ordinary reconnect of the same board). Per-session device
   state (contacts, current_region, etc.) was already naturally isolated
   per-session (fresh struct each reconnect) -- nothing else needed clearing.

107/107 mesh tests pass (2 new: modem_preset decode + the
absent-field-defaults-to-LONG_FAST case).

Co-Authored-By: Claude Sonnet 5 <noreply@anthropic.com>
2026-06-30 23:21:29 -04:00
archipelago
494f272815 feat(mesh): Device settings tab (backlog #8)
New MeshDevicePanel.vue, added as a 4th/5th tab entry to activeTab/toolsTab/
mobileTab following the exact existing pattern (chat/bitcoin/deadman/
assistant/map). Shows firmware version, node ID, advert name, LoRa region,
channel, and device type -- firmware_version/self_node_id were already
server-side but never rendered; region is new (composed into MeshStatus from
MeshConfig.lora_region at read time, not part of the live session state).
Reboot button wired to the already-working mesh.reboot-radio RPC.

Co-Authored-By: Claude Sonnet 5 <noreply@anthropic.com>
2026-06-30 23:03:09 -04:00
archipelago
4a309a3ee4 feat(mesh): RSSI/SNR dBm tooltip on the existing signal-bars indicator
The bars UI (signalBars/.mesh-signal-bars) was already built and wired to
mp.primary_rssi -- it just needed real backend data, which the previous
commit provides. Adds primary_snr alongside primary_rssi in MergedPeer and a
hover tooltip showing exact dBm/SNR values.

Co-Authored-By: Claude Sonnet 5 <noreply@anthropic.com>
2026-06-30 22:54:51 -04:00
archipelago
02b6b52a8c feat(mesh): Meshtastic RSSI/SNR + peer-location map wiring (backlog #14/#15, part 1)
Backend: parse_mesh_packet now decodes MeshPacket.rx_snr (field 8, float) and
rx_rssi (field 12, int32), and a new POSITION_APP branch decodes Position.
latitude_i/longitude_i (fields 1/2, sfixed32) -- all field numbers confirmed
against the canonical meshtastic/protobufs mesh.proto, not guessed. Threaded
through ParsedContact -> refresh_contacts -> MeshPeer (mirroring how
pkc_capable was wired for #17), so mesh.peers now surfaces real rssi/snr/lat/
lon instead of always-null. Fixed a real bug found along the way:
update_node_info's unconditional contact replace would have silently wiped
any already-tracked signal/position data on the next NodeInfo packet -- now
preserves it.

Frontend: mesh.ts's updateNodePositionsFromPeers() feeds real position data
into the SAME nodePositions map MeshMap.vue already renders from (parallel to
the existing Coordinate/Alert-message path) -- MeshMap.vue itself needed zero
changes, it was already built for this.

105/105 mesh tests pass (4 new: rx_snr/rx_rssi decode, position decode +
incomplete-field handling, full packet_to_inbound_frame integration).

Co-Authored-By: Claude Sonnet 5 <noreply@anthropic.com>
2026-06-30 22:52:42 -04:00
archipelago
dfca007949 wip(mesh): parse MeshPacket rx_snr/rx_rssi fields (Meshtastic backlog #14, part 1/many)
Field numbers confirmed against the canonical meshtastic/protobufs mesh.proto
(rx_snr=8 float, rx_rssi=12 int32), not guessed. Not yet threaded through to
ParsedContact/MeshPeer/mesh.peers — that's the next step. Part of the
Meshtastic 1.8.0 backlog plan (RSSI/SNR indicator, peer-location map, Device
tab, provisioning robustness, onboarding modal) — see
.claude/plans/floofy-riding-seahorse.md for the full plan.

Co-Authored-By: Claude Sonnet 5 <noreply@anthropic.com>
2026-06-30 22:27:54 -04:00
archipelago
0eb5c258f5 fix(mesh): Meshtastic 3ccc pkc_capable pill + Sideband image interop + critical CBOR wire-bloat fix
Merges in the meshtastic agent's now-finished work alongside this session's
continuation: stock-peer (3ccc) PKI-capability is now stamped through
get_contacts -> refresh_contacts -> MeshPeer.pkc_capable, so a directed DM to/from
a PKC-capable stock Meshtastic peer correctly shows the E2E pill on the Sent row,
not just received messages. Confirmed live: .198 sees "Meshtastic 3ccc" with
pkc_capable=true.

Also fixes two real interop/correctness bugs found while live-testing the
Reticulum <-> Sideband link:
  - Receive: the daemon only ever read LXMF's plain-text content, silently
    dropping native FIELD_IMAGE/FIELD_FILE_ATTACHMENTS fields — a stock
    Sideband/NomadNet photo vanished into a blank-space message. Now decoded
    into the same ContentInline typed envelope our own attachments use.
  - Send: images to a non-archy (stock) peer now use native LXMF FIELD_IMAGE
    instead of our own opaque CBOR wire format, which Sideband can't decode.
  - Root cause of a garbled MC-chunk-fragment bug: TypedEnvelope.v/.sig (the
    OUTER wrapper every message type uses) serialized raw bytes as a CBOR
    array-of-integers instead of a native byte string, bloating every
    message on the wire ~2-3.5x — enough to push even a tiny ReadReceipt
    over the 140-byte single-frame chunking threshold. Root-caused by
    reading ciborium's deserializer source directly (deserialize_bytes only
    works within its internal scratch buffer; deserialize_byte_buf streams
    unbounded).

Frontend: consolidated the attach/record buttons into a single animated "+"
menu (was overflowing the compose row).

857/857 tests pass. Verified live across all 5 deploy-roster nodes.

Co-Authored-By: Claude Sonnet 5 <noreply@anthropic.com>
2026-06-30 22:07:45 -04:00
archipelago
f54c853128 feat(mesh): Reticulum LoRa hardware gates pass + RNS Resource transfer + image/voice attachments
Phase 0 gates #2/#3 (two-node LXMF-over-LoRa, external Sideband interop) passed
on real hardware (.116's flashed Heltec V3 RNode <-> a phone-flashed RNode running
Sideband) — RNS announce, encrypted DM round-trip, and contact binding all verified
live. Fixed two bugs found in the process: the Reticulum send path wasn't stamping
outbound messages as E2E despite LXMF being unconditionally encrypted, and the
per-message transport pill collapsed Meshcore/Meshtastic into one generic "lora"
color instead of distinguishing the three radio transports.

Built on top of that link: a Columba-style image/file send experience —
compression-quality presets with a real transfer-time estimate (mesh.transport-advice,
now device-throughput-aware), receive-side thumbnail previews + auto-render for
already-local attachments, and async voice messages, all reusing the existing
ContentRef/ContentInline attachment pipeline. The headline addition is genuine RNS
Resource transfer support (daemon-side RNS.Link + RNS.Resource, Rust-side
send_resource/resource_recv plumbing, a new "resource-mesh" transport-advice tier)
so compressed photos up to 2MB now actually transfer over LoRa for Reticulum peers
instead of always falling back to Tor past the small inline-chunk cap.

Co-Authored-By: Claude Sonnet 5 <noreply@anthropic.com>
2026-06-30 19:57:01 -04:00
archipelago
12e7990b10 fix(mesh): route Meshtastic public-channel text to the channel thread, not DMs
Inbound Meshtastic text addressed to BROADCAST_NUM (the default public
LongFast channel, or any channel slot) was filed into a per-sender 1:1 DM
thread, so public-channel messages polluted individual people's DM chats
and appeared as if sent directly to the user.

packet_to_inbound_frame now detects `to == BROADCAST_NUM` and emits a new
synthetic RESP_MESHTASTIC_CHANNEL_TEXT frame
([channel_idx][sender_prefix(6)][text]) that the listener files under the
channel thread (contact_id = u32::MAX - idx) while still attributing the
message to its real sender. Directed text (to == our node) still routes to
the DM thread — a regression test locks that split in.

send_channel_text now sets MeshPacket.channel (field 3) so archy actually
transmits on channel 0 (public) instead of ignoring the slot. Mesh.vue keeps
the synthetic "Meshtastic !xxxx" sender id when that is the best identity
available for a stock public-channel device.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
2026-06-30 14:33:30 -04:00
archipelago
f392670e2a feat(mesh): show sender identity on received channel messages
Received messages snapshot peer_name at receive time, so a Meshtastic
text that arrived before its sender's NodeInfo was stuck showing the
synthetic "Meshtastic !xxxx" id forever, and channel/group bubbles
showed no sender at all. Add a per-bubble sender label for received
messages in multi-sender views (mesh + Archipelago channels), resolved
LIVE from the peer table so it always shows the current archy identity
(e.g. "Arch Optiplex") the moment NodeInfo is learned. Falls back to
"Unknown sender" rather than echoing a Channel/synthetic placeholder.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
2026-06-30 13:04:41 -04:00
archipelago
a57ae388ec fix(mesh): restore Meshtastic inbound stream after radio reboot
archy went deaf to inbound LoRa packets after every config write.
A config write (region/channel/owner) reboots the radio, which resets
the firmware PhoneAPI to STATE_SEND_NOTHING; it won't stream received
packets again until the client re-sends want_config. archy ignored
FromRadio.rebooted (field 8) so never resubscribed — which is why old
messages only arrived after a full restart (restart = fresh want_config).

- meshtastic.rs: handle FROM_RADIO_REBOOTED -> set pending_reinit;
  try_recv_frame re-sends want_config to resubscribe the packet stream.
  Add send_keepalive (bare heartbeat) and pin modem_preset=LONG_FAST in
  set_lora_region so all radios share frequency.
- listener/session.rs: MeshRadioDevice::send_keepalive; 10s sync_timer
  sends a keepalive each tick (insurance vs 15-min idle serial close).
- mod.rs send_message: device-aware send — Meshtastic archy peers get a
  plain TEXT_MESSAGE_APP DM (firmware PKC E2E); Meshcore archy peers keep
  the typed envelope (no meshcore regression).

Verified: .198->.228 directed DM arrives as RECEIVED enc=True
peer="Arch Optiplex"; all 3 nodes (.116/.198/.228) + 3ccc hear each
other. Binary 737b16c3 deployed+active on all three.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
2026-06-30 12:44:31 -04:00
archipelago
fbfeeeb0f5 fix(mesh): native E2E DM for archy↔archy text + software radio-reboot
- send_message now sends archy↔archy plain text as a native TEXT_MESSAGE_APP
  DM (firmware PKC-encrypts E2E), not wrapped in the binary typed envelope
  that silently broke archy↔archy LoRa delivery. Archy peers' Sent rows are
  marked encrypted so the E2E pill shows; rich typed msgs still use the
  typed-wire path.
- Add a software radio-reboot to recover a wedged/RX-deaf radio without
  physical access (and for the Device-tab settings panel): driver reboot()
  via AdminMessage reboot_seconds=97 (verified vs meshtastic/protobufs),
  MeshCommand::RebootRadio, MeshService::reboot_radio, RPC mesh.reboot-radio.
- Handoff doc: docs/SESSION-1.8.0-OTA-PROGRESS.md "RESUME HERE" — RF link is
  the proven blocker (radios not hearing each other); modem_preset mismatch
  is the prime suspect; on-device Meshtastic-app check + fix plan documented.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
2026-06-30 10:39:34 -04:00
archipelago
b4531bb4fc fix(mesh): enforce LoRa-only off-grid labels 2026-06-30 06:22:45 -04:00
archipelago
2ac0711f8e fix(ui): refresh mesh transport labels after send 2026-06-30 06:05:41 -04:00
archipelago
a91814641e fix(mesh): set Meshtastic hop limit and show LoRa pill 2026-06-30 05:59:53 -04:00
archipelago
c2c4b5af7d merge: demo build updates
# Conflicts:
#	neode-ui/src/stores/appLauncher.ts
#	neode-ui/src/views/AppSession.vue
2026-06-30 05:22:42 -04:00
archipelago
daf750688d merge: mesh multiversion and transport pills
# Conflicts:
#	core/archipelago/src/mesh/listener/decode.rs
#	core/archipelago/src/mesh/meshtastic.rs
2026-06-30 05:19:58 -04:00
archipelago
4b7cbf2b5e merge: bitcoin version bulletproof and OTA work 2026-06-30 05:08:27 -04:00
archipelago
df9d3a55be integration: preserve deployed 1.8.0 OTA work 2026-06-30 05:08:17 -04:00
archipelago
7b0748c868 fix(mesh): respect the radio's flashed LoRa region (don't force ours)
ensure_lora_region previously force-overrode the device's region with the
mesh-config region (EU_868) whenever they differed — which would shove a US/ANZ
user's radio onto EU_868: an illegal band that also cuts it off from its local
mesh. Off-the-shelf interop must respect whatever region the user flashed.

Now: a radio that already reports a REAL region (US, EU_868, ANZ, …) is left
untouched. We only set a region when the device reports UNSET (a fresh radio is
RF-silent and can't mesh at all), using the operator-configured region as the
fallback. Unknown/None (never reported) is also left alone. Pairs with the
default-channel change so a meshtastic archy node behaves like a stock device.

cargo check green (built into the same binary as the channel fix).

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
2026-06-29 08:36:04 -04:00
archipelago
810127fd3e feat(mesh): meshtastic off-the-shelf interop — default channel + private archipelago
Make a meshtastic-equipped archy node work like a stock Meshtastic device AND
keep the private archy group, instead of being isolated on a custom primary:
- slot 0 (PRIMARY)  = the DEFAULT public channel (empty name + default key) →
  interoperates with every off-the-shelf device on LongFast and picks up
  default-channel users; our NodeInfo broadcasts ride here like normal.
- slot 1 (SECONDARY) = "archipelago" (deterministic psk) → private archy↔archy.

Previously the driver set "archipelago" as the PRIMARY, isolating archy from the
public mesh. Now ensure_channel writes at most one channel per call (default
primary first, then archipelago secondary), reusing the existing reboot→
reconnect→re-check loop so it converges in ≤2 cycles without reboot-looping;
primary_is_default() accepts the default key in 1-byte or expanded form so a
stock radio is never needlessly rewritten. set_channel generalized to
(index, name, psk, role); want_config parse tracks both slots.

MeshCore needs no change — it never overrides channels (ensure_channel is a
no-op) and already rides MeshCore's default Public channel off the shelf.

cargo check green. NEEDS radio verify on .116/.198 (default-channel RX + archy
group on the secondary). Channel provision cap (3) covers the 2-write migration.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
2026-06-29 07:40:10 -04:00
archipelago
067002b04b Merge branch 'bitcoin-version-bulletproof' into mesh-multiversion-integration 2026-06-29 06:45:50 -04:00
archipelago
20f762cb2c feat(fips): auto-peer LAN-discovered federation nodes directly over FIPS
Mesh/federation messages between co-located nodes were always falling back to
Tor because the FIPS overlay had no direct peering — every node depended on the
global anchor's spanning tree, and when that anchor link flaps a node is
isolated and all FIPS dials time out. (Diagnosed live on .116/.198: pure-FIPS
direct peering over UDP 8668 fixes it — 2.5ms vs timeout.)

Generalize the manual fix: in the existing 5-min FIPS seed-anchor apply loop,
also auto-connect every federation peer the PeerRegistry knows both a LAN
address AND a FIPS npub for, dialing its FIPS UDP transport (port 8668) at its
LAN IP via the same idempotent `fipsctl connect` path (new
anchors::lan_fips_anchors). This is FIPS's own transport over the LAN — NOT
Tailscale, NOT the HTTP/LAN messaging port. Transient (recomputed each tick from
live mDNS discovery, never persisted) so changing IPs self-correct. Remote peers
with no LAN address are untouched (still routed via the anchor).

Registry Arc hoisted out of the transport-init block so the loop can read
all_peers(). cargo check green.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
2026-06-29 06:42:18 -04:00
archipelago
11155055aa feat(mesh): meshtastic PKI E2E pill — surface pki_encrypted on received DMs
The synthetic meshcore-style frame the meshtastic driver builds can't carry the
radio's PKI-encryption status, so received meshtastic DMs never lit the E2E pill.
Thread it out-of-band: the device records `last_rx_encrypted` (= packet
pki_encrypted) when it yields a text frame; the session loop reads it via
`take_rx_encrypted()` right after dispatch and stamps the just-stored received
message E2E (dispatch::stamp_received_encrypted, monotonic-id keyed). Meshcore
returns false here (its E2E is derived in the frames decrypt path). Pure
out-of-band signal — no change to the shared meshcore wire format.

Built + deployed live in binary d937814e on .116/.198. cargo check green.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
2026-06-29 06:25:01 -04:00
archipelago
f4f45c1a09 docs: mark .228 reindex finish/verify as other-agent owned
Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
2026-06-29 06:04:01 -04:00
archipelago
ed1352d3a3 docs+catalog: bitcoin multi-version rollout handoff + reproducible generator
- generate-app-catalog.sh: VERSIONS map now lists the full Knots set
  (29.3.knots20260508/20260507/20260210 + 29.2.knots20251110) and Core
  (adds 29.2 + a `latest` entry → newest); generator forces top-level
  `version` == the default entry's version (the 169ff2e2 invariant) so
  regeneration is reproducible. releases/app-catalog.json regenerated.
- docs/bitcoin-version-bulletproof-rollout.md: full handoff — root causes,
  fixes, current .228 state, the coordinated fleet-rollout steps (incl.
  :latest repoint sequencing / fleet-safety), reindex finish procedure, and
  the switch-matrix test plan.
- PRODUCTION-MASTER-PLAN.md: link the rollout doc (§6b-bis).

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
2026-06-29 06:02:24 -04:00
archipelago
095a76cd20 fix(bitcoin): bulletproof multi-version switching (Knots & Core)
Three stacked bugs made "switch version" silently fail / crash-loop, and
the data-access mismatch corrupted a node's index during recovery attempts.

Backend renderer:
- sync_quadlet_unit ignored the per-app pinned version and re-rendered the
  quadlet with the manifest's :latest every reconcile tick, reverting any
  switch. Factor the install-time catalog/pin resolution into a shared
  resolve_catalog_image() and call it in BOTH install_fresh and
  sync_quadlet_unit.
- The renderer folded manifest `entrypoint: ["sh","-lc"]` into Exec=, which
  only worked when the image entrypoint was a passthrough shell wrapper. The
  versioned images use ENTRYPOINT ["bitcoind"], so Exec=sh -lc ... became
  `bitcoind sh -lc ...` and crash-looped. Emit a real Entrypoint= override;
  exec_changed now also compares Entrypoint=.

Images:
- Build all bitcoin images (Core + Knots, every version) as container-root
  (USER removed) like the legacy :latest image. Chain data is owned by the
  data_uid (container uid 102); root reads it via CAP_DAC_OVERRIDE (granted in
  the manifest). A non-root USER (the previous uid 1000) can't read existing
  chain data → "Error initializing block database". Still fully rootless:
  container-root maps to the unprivileged host service user.

Catalog:
- bitcoin-knots versions[]: 29.3.knots20260508/20260507/20260210 +
  29.2.knots20251110, "latest" tracking newest.
- bitcoin-core versions[]: add 29.2 + a "latest" entry. All images rebuilt
  root and published to the mirror.

Frontend:
- AppSidebar version dropdown: rename the latest option to "Always use the
  latest version" (no v prefix), fix right padding, and guarantee the current
  selection matches a real option (was rendering blank).
- New InstallVersionModal: full-screen version chooser shown from the App
  Store / Discover install button for multi-version apps (Bitcoin Knots/Core),
  app icon + "Install <name>", latest pre-selected.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
2026-06-29 05:46:04 -04:00
archipelago
3c7c04a662 fix(mesh): meshtastic receive — drain frame batch per poll + rx diagnostics
Addresses the open Meshtastic parity bug (project_meshtastic_parity): the
running driver received nothing (`mesh.messages` stayed []) though the radio
got the packets and sends worked.

Root-cause candidate: `try_recv_frame` decoded ONE serial frame per poll and
returned Ok(None) for every non-text FromRadio frame, so the session loop slept
50ms between frames. Under Meshtastic's frequent NodeInfo/telemetry stream a
received text packet queued behind them, and read_from_radio's 64KB buffer cap
could drain (drop) it before it was ever decoded — reception silently dead while
sends kept working.

- try_recv_frame now drains a bounded batch (64) per poll, processing each
  frame's side effects and returning the first inbound text frame, so a text
  packet is decoded the same poll it arrives and the buffer never grows enough
  to hit the lossy cap. Bounded so a continuous flood still yields to select!.
- packet_to_inbound_frame logs every decoded packet (from/portnum/payload_len)
  and a "did not parse (dropped)" case, so one live radio pass is conclusive.

The rest of the decode path was verified correct by inspection (FROM_RADIO_PACKET
=2, wire-type-5 handled, parse_mesh_packet sound, 60s heartbeat present) — not a
parse bug. cargo check green. NEEDS a live radio pass on a rig that isn't .228
(off-limits: bitcoin testing) to confirm.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
2026-06-29 05:04:09 -04:00
archipelago
11038cdcc9 feat(mesh,ui): per-message transport pill (Mesh/FIPS/Tor) + fix E2E pill
Adds a per-message transport badge to archy↔archy mesh chats and fixes the
long-broken E2E badge — both meshcore and meshtastic, styled like the existing
E2E pill.

Transport pill:
- New `MeshMessage.transport` ("lora"/"fips"/"tor"), surfaced in the UI beside
  the E2E badge (Mesh.vue transportLabel() → Mesh/FIPS/Tor, mesh-styles.css).
- Sent LoRa → "lora"; sent federation → finalized to the real leg ("fips"/"tor")
  once the background send resolves (req.send_json transport), via an id-keyed
  store update.
- Received: a post-dispatch stamp on handle_typed_envelope_direct's output
  (monotonic ids) tags both transports without threading through all 20 typed-
  dispatch sites — radio wrapper stamps "lora", federation injector stamps the
  peer's last_transport ("fips"/"tor", default tor; the inbound HTTP carries no
  FIPS-vs-Tor signal).
- Plain native/channel LoRa frames → "lora"; channel broadcasts stay non-E2E.

E2E pill fix:
- `encrypted` was hardcoded false at every MeshMessage construction site, so the
  UI badge (Mesh.vue `v-if="msg.encrypted"`) never showed. Now: federation
  envelopes are E2E (identity-signed over an encrypted transport); the meshcore
  native-DM receive path already had a real `encrypted` flag (now also tagged
  with transport). meshtastic-PKI radio E2E flag threading is a noted follow-up.

Backend cargo check + frontend vue-tsc build both green. Needs a live radio +
multi-transport pass on .116/.228 to confirm end-to-end (see
project_transport_pill / project_meshtastic_parity).

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
2026-06-29 04:29:25 -04:00
archipelago
169ff2e2cd fix(bitcoin): knots catalog default must equal top-level version
The knots versions[] marked 29.3.knots20260508 as default while the
top-level catalog version is the floating 'latest' tag — violating the
generator's own invariant (default:true MUST equal the top-level version
so selecting it un-pins / tracks latest). Live effect via package.versions:
catalog_default_version='latest' so the UI-highlighted default actually
PINS+recreates (opposite of un-pin) and 'latest' was unreachable from the
Version & Updates card.

Add a 'latest' default entry (== the manifest's floating tag) and keep
29.3.knots20260508 as a pinnable option. Verified on .228: package.versions
now returns default=latest with 2 selectable versions.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
2026-06-28 19:56:49 -04:00
archipelago
da20f67462 Merge bitcoin-multi-version: multi-version support for Core & Knots
Integrate the bitcoin-multi-version feature (commit 6aa74c73): per-node
choice/pin/switch of Bitcoin Core & Knots versions with auto-update toggle —
catalog versions[] schema, install-time selection, package.versions +
package.set-config RPCs, hourly per-app auto-update tick, build-bitcoin-image.sh
(GPG+SHA verified rootless image builder), and UI (version select + Version &
Updates card). Catalog regenerated; preserves the mempool 127.0.0.1 health fix.

Not yet live-verified on .228 — gate any tagged release on that per CLAUDE.md.
2026-06-28 18:48:38 -04:00
archipelago
6aa74c7386 feat(bitcoin): multi-version support for Core & Knots (install/switch/pin/auto-update)
Lets a node runner choose which Bitcoin Core / Knots version to install
(latest pre-selected), then switch, pin, or opt into auto-update from the
app's interface — all manifest/catalog-driven, rootless, signed-registry,
zero-data-loss. Motivated by upcoming BIP-110 signalling: runners need a
real choice of software version.

Backend:
- version_config.rs: per-app pin + auto-update persistence (atomic, merge-
  preserving), downgrade detection, auto-update enumeration (+ unit tests).
- app_catalog.rs: CatalogVersion / versions[] schema, catalog_versions(),
  catalog_image_for_version() (same-repo guard); a pin suppresses the update
  badge.
- prod_orchestrator.rs: pinned version wins over the catalog default on every
  install/recreate.
- install.rs: install-time `version` param persisted (default = unpinned).
- set_config.rs: package.versions (read) + package.set-config (write) RPCs;
  downgrade is gated behind explicit confirm (warn + confirm + allow).
- update.rs/main.rs: hourly per-app auto-update tick via the orchestrator
  (opt-in, pin-respecting); fix handle_package_update to be non-fatal for
  orchestrator-managed apps lacking a catalog primary image (bitcoin-core).

UI:
- MarketplaceAppDetails.vue: install-time version selector (shown when an app
  offers >=2 versions).
- appDetails/AppSidebar.vue: "Version & Updates" card (switch / pin / auto-
  update toggle / downgrade warning), per app.
- rpc-client.ts + en.json: RPC methods, types, strings.

Phase 0 image pipeline:
- scripts/build-bitcoin-image.sh: download official tarball + SHA256SUMS(.asc),
  verify SHA-256 + pinned-maintainer OpenPGP signature (fail-closed), build a
  minimal rootless image, smoke-test, tag + push.
- apps/bitcoin-core/Dockerfile rewritten (drops stale community base);
  apps/bitcoin-knots/Dockerfile added.
- generate-app-catalog.sh: emit curated versions[]; published + catalog now
  offers Core 25.2/26.2/27.2/28.4/29.3/30.2/31.0 + Knots 29.3.knots20260508.

docs/bitcoin-multi-version-design.md: live progress tracker.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
2026-06-28 18:46:17 -04:00
archipelago
3cea7dd6c5 test(phase3): fix Phase-3 quadlet gates — define fail(), drop stale Notify=healthy assert
Two Phase-3 bats suites used `fail` (a bats-assert helper) but bats-assert
isn't installed on the alpha fleet (only bats-core), so every tripped
assertion crashed with `fail: command not found` (status 127) instead of
reporting a real pass/fail. Define the same minimal `fail() { echo ...;
return 1; }` the other suites already use (see mempool.bats). Without this
the gates were silently non-functional.

Also rewrite the obsolete "HealthCmd= implies Notify=healthy" assertion in
use-quadlet-backends-install.bats. Phase 3.4's Notify=healthy was
deliberately reverted: gating `systemctl start` on health hung boot
reconciliation for dependency-waiting apps (fedimint idles until Bitcoin
IBD; lnd until macaroon unlock), leaving units stuck "deactivating". The
renderer now emits HealthCmd= for Podman's health state but TimeoutStartSec=0
and NO Notify=healthy (quadlet.rs render() + contains_stale_health_gate()).
The test now asserts the current invariant: no backend unit gates start on
health.

Verified on the .228 canary node (ARCHIPELAGO_USE_QUADLET_BACKENDS=1):
use-quadlet-backends-install 6/6, backend-survives-archipelago-restart 3/3.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
2026-06-28 16:09:05 -04:00
archipelago
d7c6f8c348 fix(mempool): health-check 127.0.0.1 not localhost (stops false-unhealthy loop)
The archy-mempool-web health_check endpoint used http://localhost:8080.
Inside the frontend image, wget resolves `localhost` to ::1 (IPv6) first,
but nginx binds 0.0.0.0:8080 (IPv4) only -> the baked HealthCmd gets
"connection refused" every probe -> container is perpetually unhealthy ->
the reconciler recreates it forever (observed on .228: mempool container
re-Started every ~3 min, Health=unhealthy). Proven live: in-container
`wget http://localhost:8080/` = refused, `wget http://127.0.0.1:8080/` = OK.

Pin the probe to 127.0.0.1 so it matches nginx's IPv4 bind. Updated both
the source manifest and the embedded copy in releases/app-catalog.json
(the catalog overlay wins over the disk manifest on fleet nodes, so the
catalog copy is the one that actually reaches .228).

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
2026-06-28 15:09:34 -04:00