Merges in the meshtastic agent's now-finished work alongside this session's
continuation: stock-peer (3ccc) PKI-capability is now stamped through
get_contacts -> refresh_contacts -> MeshPeer.pkc_capable, so a directed DM to/from
a PKC-capable stock Meshtastic peer correctly shows the E2E pill on the Sent row,
not just received messages. Confirmed live: .198 sees "Meshtastic 3ccc" with
pkc_capable=true.
Also fixes two real interop/correctness bugs found while live-testing the
Reticulum <-> Sideband link:
- Receive: the daemon only ever read LXMF's plain-text content, silently
dropping native FIELD_IMAGE/FIELD_FILE_ATTACHMENTS fields — a stock
Sideband/NomadNet photo vanished into a blank-space message. Now decoded
into the same ContentInline typed envelope our own attachments use.
- Send: images to a non-archy (stock) peer now use native LXMF FIELD_IMAGE
instead of our own opaque CBOR wire format, which Sideband can't decode.
- Root cause of a garbled MC-chunk-fragment bug: TypedEnvelope.v/.sig (the
OUTER wrapper every message type uses) serialized raw bytes as a CBOR
array-of-integers instead of a native byte string, bloating every
message on the wire ~2-3.5x — enough to push even a tiny ReadReceipt
over the 140-byte single-frame chunking threshold. Root-caused by
reading ciborium's deserializer source directly (deserialize_bytes only
works within its internal scratch buffer; deserialize_byte_buf streams
unbounded).
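The size difference between the two CBOR encodings can be sketched with plain arithmetic. This is a minimal illustration that follows the CBOR head rules (RFC 8949) and does not use ciborium or the real TypedEnvelope type; the function names are hypothetical. It models only the byte field itself, so it does not reproduce the whole-message ratio quoted above.

```rust
/// Size in bytes of a CBOR head (major type plus length/value argument)
/// for an unsigned argument `n`.
fn cbor_head_len(n: u64) -> usize {
    match n {
        0..=23 => 1,
        24..=0xFF => 2,
        0x100..=0xFFFF => 3,
        0x1_0000..=0xFFFF_FFFF => 5,
        _ => 9,
    }
}

/// Encoded size when `data` is written as a native byte string (major type 2).
fn size_as_byte_string(data: &[u8]) -> usize {
    cbor_head_len(data.len() as u64) + data.len()
}

/// Encoded size when `data` is written as an array of unsigned integers
/// (major type 4 wrapping one major-type-0 item per byte).
fn size_as_int_array(data: &[u8]) -> usize {
    let items: usize = data.iter().map(|&b| cbor_head_len(b as u64)).sum();
    cbor_head_len(data.len() as u64) + items
}
```

Any byte value of 24 or more costs two bytes as an array element but one byte inside a byte string, which is why signature-like data roughly doubles.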
Frontend: consolidated the attach/record buttons into a single animated "+"
menu (was overflowing the compose row).
857/857 tests pass. Verified live across all 5 deploy-roster nodes.
Co-Authored-By: Claude Sonnet 5 <noreply@anthropic.com>
1.8.0 OTA Session Progress
Updated: 2026-06-30
▶️▶️▶️▶️ LIVE CHECKPOINT 2026-06-30 (evening) — #17 deployed + verified on .198/.228
#17 (3ccc / stock-peer E2E pill) is now built, deployed, and live-verified on .198 and
.228 only (.116 skipped per the hardware notice below — its radio is mid-reflash to RNode).
- Built release binary sha b1d695fc626a7382 from the working tree (cargo check + cargo test -p archipelago mesh:: both green, 99 passed/0 failed/1 ignored, right before building; the tree was settled, no collision with the Reticulum agent's concurrent edits).
- Deployed via stop/swap/start to .198 (192.168.1.198) and .228 (192.168.1.228); sha256 confirmed matching on both, systemctl is-active = active on both (.228 took its usual couple of minutes to converge: heavy resilience node, unrelated bitcoind/fedimint container startup noise in the logs during that window, no mesh errors).
- Live-verified the actual fix, not just the deploy: on .198, mesh.peers shows "advert_name":"Meshtastic 3ccc", "pkc_capable":true, and mesh.send to 3ccc (contact_id:1128152268) now returns "encrypted":true. This confirms the archy || peer_pkc_capable(contact_id) TX fix is live, not just compiled.
- .228's RPC password in memory (password123) was stale; the user confirmed the correct password is ThisIsWeb54321@ (same as .198/.116, i.e. fully unified now). Re-verified via RPC: mesh.peers shows 3ccc pkc_capable:true, and mesh.send to 3ccc returns "encrypted":true, so #17 is confirmed live on .228 too, not just .198.
NOT yet done: push commit to gitea-vps2 (still uncommitted in the working tree, by design — shares the tree with the Reticulum agent's uncommitted work); user on-device confirmation that the E2E pill actually renders in the Mesh UI for 3ccc.
🛠️ HARDWARE NOTICE 2026-06-30 (~16:30) — .116's Heltec V3 is being repurposed
The Reticulum agent is reflashing .116's Heltec V3 (the board on /dev/ttyUSB0, currently
.116's live Meshtastic radio) to RNode firmware, with explicit user approval, to unblock the
Reticulum Phase-0 hardware gates (real RNode needed; see docs/RETICULUM-TRANSPORT-PROGRESS.md).
This was user-confirmed specifically because it takes .116 offline as a Meshtastic radio.
Effect on this workstream: do all on-device Meshtastic testing on .198 and .228 only — .116 no
longer has a Meshtastic-firmware radio attached once this lands. cargo check/cargo test -p archipelago were both confirmed clean (99/99 mesh tests) right before the reflash started, so
the earlier "wait for their edit to settle" blocker above is cleared — software-side it's safe to
build/test/deploy; only .116's physical radio role changed.
▶️▶️▶️ LIVE CHECKPOINT 2026-06-30 (later PM, ~15:50) — READ THIS FIRST IF RESUMING
#17 (3ccc / stock-peer E2E pill) is CODE-COMPLETE in the working tree, isolated
to meshtastic.rs/protocol.rs/types.rs/mod.rs as planned (no session.rs
transport-plumbing changes from this side):
- ParsedContact.pkc_capable (protocol.rs) + MeshPeer.pkc_capable (types.rs), both #[serde(default)] / defaulted false at every construction site.
- MeshtasticDevice::get_contacts() now stamps pkc_capable per contact from the existing peer_is_pkc_capable(node_num) seam (de-allow(dead_code)'d).
- listener/session.rs::refresh_contacts ORs the new value into MeshPeer.pkc_capable (capability only grows, never cleared by a transient refresh). This IS a touch of session.rs, but additive/non-colliding with the Reticulum device-enum match arms already there; did not touch transport plumbing/routing.
- mod.rs::MeshService::send_message now does archy || self.peer_pkc_capable(contact_id) for the Sent-row encrypted flag (was archy-only before).
- Verified via cargo check -p archipelago --bin archipelago (clean, exit 0) before the other agent's latest edit landed.
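The two rules described above (capability is OR-ed in on refresh, and the Sent-row flag is archy || pkc_capable) can be sketched as follows. The struct is a simplified stand-in, not the real MeshPeer; the archy field and both function names are assumptions made for illustration.

```rust
/// Simplified stand-in for the peer record; the real MeshPeer has more fields.
#[derive(Debug, Default, Clone, PartialEq)]
struct MeshPeer {
    contact_id: u32,
    /// Hypothetical flag: the peer runs Archipelago.
    archy: bool,
    /// Stamped from the radio's NodeInfo public key.
    pkc_capable: bool,
}

/// Merge a refreshed capability reading into the stored peer.
/// OR-ing means a transient refresh that lacks the key never clears the flag.
fn merge_refresh(peer: &mut MeshPeer, refreshed_pkc_capable: bool) {
    peer.pkc_capable |= refreshed_pkc_capable;
}

/// Whether the Sent row should carry the E2E pill.
fn sent_row_encrypted(peer: &MeshPeer) -> bool {
    peer.archy || peer.pkc_capable
}
```

The monotonic merge trades the ability to notice a lost key for stability of the pill across refreshes, which matches the "only grows" intent stated above.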
NOT YET DONE: rebuild release binary → redeploy 5 nodes → push → user on-device test (same as #16, both still pending live verification).
⚠️ BLOCKED right now — do not build/deploy/push until this clears: the Reticulum
agent is actively mid-edit in the same working tree. A cargo test run right after
the clean cargo check above failed with a real (but transient, not mine) signature
mismatch: session.rs::auto_detect_and_open / run_mesh_session were observed with a
new device_kind: Option<DeviceType> param that listener/mod.rs's call site didn't
have yet — a normal in-flight snapshot of their work, not a regression to fix here.
Action on resume: re-run cargo check first; if it's clean, the other agent's edit
has settled and it's safe to proceed to build/test/deploy. If still broken, wait —
do not stash, revert, or patch their in-progress session.rs/listener/mod.rs changes
(see memory feedback_concurrent_agent_tree.md). Also: building/deploying right now
would bundle their not-yet-finished reticulum.rs wiring into the binary — confirm
with the user before shipping a combined build, since only the meshtastic #17 piece
has been asked for/owned by this session.
▶️▶️ LIVE CHECKPOINT 2026-06-30 (late PM) — READ THIS FIRST
Fleet state: all 5 test nodes on binary 38c456b0bacec3c4 + frontend
Mesh-CAkPgvLo.js, archipelago active on each:
.116, .198, .228 (LAN, archipelago@ + ~/.ssh/archipelago-deploy),
100.72.136.5, 100.89.209.89 (Tailscale, same key — installed this session;
SSH user archipelago / pw ThisIsWeb54321@; NOPASSWD sudo on all 5).
Shipped this session (commit 12e7990b on main, pushed to gitea-vps2):
- ✅ #16 public-channel routing: inbound Meshtastic text to BROADCAST_NUM now files under the public channel thread (contact_id u32::MAX - idx), attributed to its real sender, instead of polluting per-sender DM threads. Directed text (to == our node) still routes to the DM thread (regression test packet_to_inbound_frame_directed_dm_stays_a_contact_message). send_channel_text now sets MeshPacket.channel so archy TXs on channel 0 (public). Code: meshtastic.rs (packet_to_inbound_frame, parse_mesh_packet to/channel, send_channel_text), protocol.rs (RESP_MESHTASTIC_CHANNEL_TEXT = 0x70), listener/frames.rs (handler + sender attribution), Mesh.vue (senderLabelFor). Tests green (95 mesh tests). Pending: user on-device test with the radios.
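The routing rule for #16 can be sketched as a pure function. This is an illustration under assumptions, not the code in packet_to_inbound_frame: the Thread enum, the function name, the Ignored arm, and the idea that a DM thread is keyed by the sender's node number are all hypothetical. BROADCAST_NUM is taken to be Meshtastic's 0xFFFFFFFF broadcast address.

```rust
/// Meshtastic broadcast destination address.
const BROADCAST_NUM: u32 = 0xFFFF_FFFF;

/// Where an inbound text should be filed (hypothetical type).
#[derive(Debug, PartialEq)]
enum Thread {
    /// Public channel thread, attributed to the real sender.
    Channel { contact_id: u32, sender: u32 },
    /// Per-sender DM thread.
    Direct { contact_id: u32 },
    /// Addressed to some other node (assumed to be dropped).
    Ignored,
}

fn route_inbound_text(from: u32, to: u32, channel_idx: u32, our_node: u32) -> Thread {
    if to == BROADCAST_NUM {
        // Channel threads count down from u32::MAX by channel index.
        Thread::Channel { contact_id: u32::MAX - channel_idx, sender: from }
    } else if to == our_node {
        Thread::Direct { contact_id: from }
    } else {
        Thread::Ignored
    }
}
```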
Push access: main is a PROTECTED branch on gitea-vps2. Direct push uses the
dedicated ai account via remote gitea-ai (git push gitea-ai main).
See memory reference_gitea_ai_push_account.md.
Coordination: another agent owns Reticulum (reticulum-daemon/ + Rust
transport wiring). DO NOT touch mesh/listener/session.rs transport plumbing or
mod.rs routing in ways that collide. Keep #17 work isolated to meshtastic.rs
RX/TX + (if needed) the sent-row encrypted flag.
✅ CODE-COMPLETE (not yet deployed/tested live) — #17 (3ccc / stock-peer E2E pill)
Goal: DMs to and from a PKC-capable stock peer (3ccc, NodeInfo public_key key_len=32 confirmed) must show the E2E pill.
- RX side is already correct: parse_mesh_packet reads public_key (field 16) and pki_encrypted (field 17) per the MeshPacket proto; the directed-DM RX path promotes to RESP_CONTACT_MSG_V3_E2E when pki_encrypted. (Verify live.)
- TX bug (root cause), FIXED: mod.rs::send_message now records the Sent row with encrypted = archy || peer_pkc_capable(contact_id). peer_is_pkc_capable (meshtastic.rs) is wired out via get_contacts() → ParsedContact.pkc_capable → refresh_contacts (session.rs) → MeshPeer.pkc_capable → MeshService::peer_pkc_capable. See the LIVE CHECKPOINT at the top of this file for the exact touch points.
- NEXT STEP when resuming: confirm cargo check is clean (the other agent's Reticulum work shares this tree and may be mid-edit; see top checkpoint), then rebuild → redeploy 5 nodes → push → user test (same pending step as #16).
Remaining open after #17: #12 (provisioning robustness — HOLD, session.rs churn
risks reticulum collision), #8 (Device-tab settings panel + reboot button — RPC
mesh.reboot-radio already exists), #6 (onboarding modal), #7 (.116 re-verify),
#14 (RSSI/SNR per-contact indicator), #15 (peer-location map, POSITION_APP portnum=3).
▶️ RESUME HERE — archy↔archy LoRa (2026-06-30 PM) — READ FIRST
Goal: archy↔archy text over Meshtastic LoRa must DELIVER and show the E2E pill,
identical in off-grid and normal mode. Test bed = .116 / .198 / .228 (all EU_868).
Don't touch the federation/FIPS path.
✅✅✅ SOLVED 2026-06-30 — archy↔archy LoRa WORKS (delivery + E2E pill + identity)
VERIFIED: .198→.228 directed DM → .228 row RECEIVED enc=True peer="Arch Optiplex".
All three nodes (.116/.198/.228) now hear each other + stock peer 3ccc. Deployed binary
737b16c3235b active on all three. Fix source COMMITTED as a57ae388 on main
(not yet pushed to gitea-vps2/origin).
THE fix (receive stream): archy ignored FromRadio.rebooted (field 8). Every config
write reboots the radio → firmware PhoneAPI resets to STATE_SEND_NOTHING and stops
streaming received packets until the client re-sends want_config. archy never did →
went deaf to inbound (that's why old messages only arrived after a full restart = fresh
want_config). Fix: handle FROM_RADIO_REBOOTED → set pending_reinit → re-send
want_config; plus a 10s keepalive heartbeat (insurance vs 15-min idle serial close) and
a pinned modem_preset=LONG_FAST so all radios share frequency. Combined with the earlier
E2E send fix (plain TEXT_MESSAGE_APP DM, firmware PKC) this closes archy↔archy LoRa.
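The reboot-recovery logic described above (a rebooted notice sets pending_reinit, the next tick re-sends want_config, and a 10s keepalive runs otherwise) can be sketched as a small state machine. The type names, the Action enum, and the tick-style API are assumptions for illustration; the real session loop is not shown in this document.

```rust
use std::time::{Duration, Instant};

/// Keepalive period taken from the notes above.
const KEEPALIVE_INTERVAL: Duration = Duration::from_secs(10);

/// What the session loop should write to the radio next (hypothetical).
#[derive(Debug, PartialEq)]
enum Action {
    SendWantConfig,
    SendHeartbeat,
    Nothing,
}

/// Coarse view of a decoded FromRadio message (hypothetical).
#[allow(dead_code)]
enum FromRadioEvent {
    Rebooted,
    Packet,
    Other,
}

struct Session {
    pending_reinit: bool,
    last_tx: Instant,
}

impl Session {
    fn new(now: Instant) -> Self {
        Session { pending_reinit: false, last_tx: now }
    }

    /// Record a reboot notice; the actual resubscribe happens on the next tick.
    fn on_event(&mut self, ev: &FromRadioEvent) {
        if matches!(ev, FromRadioEvent::Rebooted) {
            self.pending_reinit = true;
        }
    }

    /// Decide what to send at time `now`. Resubscribing takes priority
    /// over the keepalive, and either write resets the keepalive timer.
    fn next_action(&mut self, now: Instant) -> Action {
        if self.pending_reinit {
            self.pending_reinit = false;
            self.last_tx = now;
            Action::SendWantConfig
        } else if now.duration_since(self.last_tx) >= KEEPALIVE_INTERVAL {
            self.last_tx = now;
            Action::SendHeartbeat
        } else {
            Action::Nothing
        }
    }
}
```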
Open follow-ups: #A surface received msgs under archy identity in all UI views; #6
device-onboarding modal; #8 Device-tab settings panel; #7 re-verify .116 in rotation;
#12 make modem_preset authoritative + hot-swap re-binding + RX-stall watchdog;
#14 signal-strength (RSSI/SNR) indicator per contact (from MeshPacket rx_rssi/rx_snr);
#15 map view plotting peer locations where shared (Meshtastic POSITION_APP portnum=3
lat/lon). See the resume memory project_session_resume_2026_06_30_lora.md for the full
task list.
(historical) earlier TL;DR — RF-layer suspicion, now RESOLVED by the reboot-recovery fix
The archy software is correct and deployed. The blocker was at the radio/RF layer: the three radios are not hearing each other over the air at all. No amount of archy code change will fix that until the radios actually RF-link. Resume by testing the radios directly at home (Meshtastic phone app over Bluetooth) — see "DO THIS FIRST AT HOME" below. ← this turned out to be the want_config resubscribe bug above.
What is DONE and deployed (commit pending — see below)
- E2E send fix (core/archipelago/src/mesh/mod.rs send_message, ~L1542): archy↔archy plain chat text is now sent as a native TEXT_MESSAGE_APP DM (firmware PKC-encrypts it E2E), NOT wrapped in our binary typed envelope. Archy peers' Sent rows are marked encrypted=true so the pill shows. Rich typed msgs still use send_typed_wire. This was the original root-cause fix (envelope-wrapped text silently broke archy↔archy LoRa).
- NEW: software radio-reboot end-to-end, so a wedged/RX-deaf radio can be rebooted without physical access (and for the Device-tab settings panel the user requested):
  - meshtastic.rs: reboot(seconds) driver method + ADMIN_REBOOT_SECONDS_FIELD = 97 (verified vs meshtastic/protobufs admin.proto; set_owner=32/set_channel=33/set_config=34 matched our existing constants, confirming the proto read).
  - listener/mod.rs: MeshCommand::RebootRadio { seconds }.
  - listener/session.rs: device-enum reboot() dispatch (Meshtastic only) + handler arm.
  - mesh/mod.rs: MeshService::reboot_radio(seconds).
  - api/rpc/mesh/messaging.rs: handle_mesh_reboot_radio → RPC mesh.reboot-radio {seconds?} (default 2); dispatcher arm in api/rpc/dispatcher.rs.
  - cargo check passes. Built release sha ba4aed590027690d and DEPLOYED + active on .116/.198/.228. The RPC works ({"reboot":true,"seconds":2}).
- ⚠️ Caveat: when called, archy logged "Sent Meshtastic radio reboot" but the radio did not visibly reboot afterward (no config re-stream). Either field 97 is still off, or newer firmware requires an admin session passkey even over local serial, or the USB serial stayed open through the 2s reboot so no reconnect was logged. Needs on-device verification.
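For checking the caveat above against a serial capture, it helps to know what bytes a varint field 97 produces. The sketch below hand-encodes a protobuf varint field using the standard wire format; the helper names are hypothetical and this is not the driver's actual encoder. It covers only the bare field, not the surrounding AdminMessage/ToRadio framing, and only non-negative values.

```rust
/// Field number quoted in the notes above.
const ADMIN_REBOOT_SECONDS_FIELD: u32 = 97;

/// Append `v` as a protobuf base-128 varint.
fn put_varint(mut v: u64, out: &mut Vec<u8>) {
    while v >= 0x80 {
        out.push((v as u8 & 0x7F) | 0x80);
        v >>= 7;
    }
    out.push(v as u8);
}

/// Encode one varint-typed field: tag = (field << 3) | wire type 0.
fn encode_varint_field(field: u32, value: u64) -> Vec<u8> {
    let mut out = Vec::new();
    put_varint(u64::from(field) << 3, &mut out);
    put_varint(value, &mut out);
    out
}
```

With field 97 and a value of 2, the tag is 776, which spans two varint bytes, so the field is three bytes on the wire.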
The hard evidence (why "nothing works")
- Directed DM tests .198→.228 AND .116→.228 (neither path reflashed): sender logs Sent plain native DM dest=30d258436d65 part=1 total=1 and RPC returns sent:true, encrypted:true, but .228 logs nothing; the packet never reaches archy from the radio.
- A raw broadcast from .198 (mesh.broadcast) was accepted by its radio but not heard by .228/.116.
- In an 8-minute window, all three nodes received 0 inbound OTA packets from any other node. Each only logs its OWN once-a-minute Broadcast Meshtastic NodeInfo advert + local TX field=11 queue-status. .228 mesh.status = messages_received:1 total.
- .198's radio is alive and transmitting NodeInfo every 60s, so it's not dead; it's that reception is broken on the receivers. A radio cannot drop a broadcast AND a unicast to its own node number while config matches, unless it simply isn't on the same airwaves.
- archy provisioning is correct & identical across nodes (read back from device): PRIMARY = public LongFast (name="" psk_len=1), SECONDARY = archipelago, region=3 (EU_868). Admin field constants verified. The send path hands the radio a correct unicast MeshPacket (to=node, want_ack, hop_limit=3, plaintext decoded for the firmware to PKC-encrypt).
PRIME SUSPECT (software-fixable) — modem-preset / frequency mismatch
archy only ever writes region + use_preset and never explicitly pins modem_preset
(it parses region but not preset; set_lora_region relies on the LongFast default). If ANY
radio has a non-default modem preset / frequency slot persisted (e.g. set via the Meshtastic
app, or a different factory default after the .198 reflash), the radios are on different
airwaves despite identical channel name + region, and archy would never correct it.
DO THIS FIRST AT HOME (decisive, ~2 min, only the user can do it)
Open the Meshtastic phone app over Bluetooth (works alongside archy's USB serial) on each
of .116/.198/.228 and check:
- Do the 3 nodes see each other in the node list (recent "heard")? → if NO, they're not RF-reaching (preset/freq/antenna/range).
- Do all 3 show the same Modem preset (LongFast), Region (EU_868), Frequency slot, and the same PRIMARY channel? → any difference = the cause. This single test separates "archy misconfigures the radios" from "radios physically can't reach each other."
THEN — the archy fix to apply (if preset/config differs)
Make archy authoritatively write the full LoRaConfig and force re-provision so all radios
converge: in core/archipelago/src/mesh/meshtastic.rs::set_lora_region (and its
caller/guard ensure_lora_region ~L304), explicitly set modem_preset = LONG_FAST (0) as a
field in the LoRaConfig (it's currently omitted/defaulted), and make the startup provision
path rewrite LoRa config when the preset doesn't match, then reboot the radio (use the new
mesh.reboot-radio). Also verify the mesh.reboot-radio actually reboots the radio
on-device (the caveat above).
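The "rewrite when the read-back config does not match" guard proposed above could look roughly like this. The LoRaSettings struct and function names are hypothetical stand-ins for whatever set_lora_region / ensure_lora_region actually use; the numeric values (region 3 for EU_868, LONG_FAST = 0) are the ones quoted in this file.

```rust
/// Hypothetical subset of the radio's LoRaConfig that archy would pin.
#[derive(Debug, Clone, Copy, PartialEq)]
struct LoRaSettings {
    region: u32,
    use_preset: bool,
    modem_preset: u32,
}

const REGION_EU_868: u32 = 3;
const MODEM_PRESET_LONG_FAST: u32 = 0;

/// The configuration every test-bed radio should converge on.
fn desired_settings() -> LoRaSettings {
    LoRaSettings {
        region: REGION_EU_868,
        use_preset: true,
        modem_preset: MODEM_PRESET_LONG_FAST,
    }
}

/// True when the startup provision path should rewrite LoRa config
/// (and then reboot the radio). An unreadable config counts as a mismatch.
fn needs_rewrite(read_back: Option<LoRaSettings>) -> bool {
    read_back != Some(desired_settings())
}
```

Comparing the whole struct rather than only the region is the point of the change: a radio with the right region but a different preset would then be corrected instead of left alone.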
TEST RECIPE (works on each node)
- RPC helper used this session: a node-side rpc.sh that logs in (password ThisIsWeb54321@), grabs the csrf_token cookie, echoes it as X-CSRF-Token, and POSTs to http://127.0.0.1:5678/rpc/v1. Recreate it or run archy's RPC directly. Methods: mesh.peers, mesh.status, mesh.messages, mesh.send {contact_id,message}, mesh.broadcast, mesh.reboot-radio {seconds}.
- LoRa contact ids: .116 = 1135977788 (prefix 3ca5b543), .198 = 3677050140 (db2b551c), .228 = 1129894448 (prefix 30d25843), stock 3ccc = 1128152268.
- Link health check (run on each node): look for inbound from=Some("!...") lines in journalctl -u archipelago that are NOT the node's own Broadcast ... NodeInfo advert. If zero across all nodes → RF link is down (the current state).
- E2E success criteria: send .198→.228, the marker appears in .228 mesh.messages as an inbound row with encrypted:true / transport:"lora", AND .116↔.228 likewise.
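When matching the contact ids above against log lines, note that the hex forms listed are not all in the same byte order: by arithmetic, the "prefix" values for .116 and .228 are the node number's little-endian bytes, while the value given for .198 is the ordinary big-endian hex. A small sketch for converting either way (function names are hypothetical):

```rust
/// Node number as ordinary (big-endian) hex, e.g. as in a "!xxxxxxxx" id.
fn node_hex_be(node: u32) -> String {
    format!("{:08x}", node)
}

/// Node number as the hex of its little-endian bytes, which is how a
/// fixed32 field appears in a raw protobuf dump.
fn node_hex_le(node: u32) -> String {
    node.to_le_bytes().iter().map(|b| format!("{:02x}", b)).collect()
}
```

For example, the stock peer 1128152268 is 433e3ccc in big-endian hex, which is where the short name "3ccc" comes from.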
DEPLOY / BUILD RECIPE
- Build: from core/, CARGO_TARGET_DIR=/tmp/archy-hotfix-target CARGO_INCREMENTAL=0 cargo build --release -p archipelago --bin archipelago. (If rust-lld: undefined hidden symbol, it's the incremental cache; CARGO_INCREMENTAL=0 fixes it.)
- SSH key ~/.ssh/archipelago-deploy is authorized on .116/.198/.228. SSH/UI/RPC password ThisIsWeb54321@. Per node: scp the binary, sudo systemctl stop archipelago → kill -9 $(pgrep -x archipelago) → install -m0755 to /usr/local/bin/archipelago → systemctl start archipelago. Verify by sha256sum match + systemctl is-active.
- Current deployed sha on all 3 = ba4aed590027690d (the reboot-enabled build).
Fleet state (as of 2026-06-30 PM)
- All 3 nodes on binary ba4aed59, active. Off-grid mode currently OFF (mesh_only:false).
- .198 radio was reflashed to factory firmware-heltec-v3-2.7.26 (recovered from corrupt NVS); region EU_868 persists. Its archy identity is NOT re-bound on .228 (.228 shows .198 as raw radio "Meshtastic 551c", arch_pubkey_hex absent) because .228 hasn't heard .198's identity broadcast: a downstream symptom of the dead RF link, not a separate bug.
- The radios are powered & each transmitting; they are simply not hearing each other.
Deferred UI (after LoRa works)
- Device-tab settings panel (gear/desktop): host the "Reboot radio" button there; calls mesh.reboot-radio. Scoping done: add to the Mesh.vue actions row (mirrors Broadcast/Off-Grid buttons) + a rebootRadio() method in neode-ui/src/stores/mesh.ts. See Mesh.vue ~L1484 actions row and mesh.ts ~L373 broadcastIdentity() pattern.
- Device-onboarding modal (detect plugged-in radio).
Current scope:
- Preserve existing mesh work: E2E indicators, FIPS/Tor transport indicators, typed-message paths, Meshtastic region/channel provisioning, and dirty Meshtastic receive-attempt changes.
- Take over the 3ccc stock Meshtastic peer bug: LoRa text from 3ccc to Archipelago .116 does not surface in mesh.messages.
- Keep release-gate fixes already made in this session.
Local gate status so far:
- cargo test -p archipelago --bin archipelago: green, 849/849 after Meshtastic fixes.
- python3 scripts/check-app-catalog-drift.py --release --strict: green.
- npm run type-check: green.
Key changes made so far:
- Added cascade uninstall progress truthfulness assertion to tests/lifecycle/bats/cascade-uninstall.bats.
- Fixed release catalog drift filters and regenerated catalog metadata.
- Fixed invalid apps/fedimint-clientd/manifest.yml cpu_limit schema value.
- Updated stale/tight Rust tests without changing production behavior.
Remaining non-automatable / operational gates:
- Workstream B signing is blocked on the offline RELEASE_MASTER_MNEMONIC; code + runbook exist, but the publisher must pin/sign the release-root catalog.
- Phase-3 Quadlet backend rollout is implemented behind use_quadlet_backends and default-off. The gate skip-passes until explicitly enabled on a node; flipping it fleet-wide requires a coordinated flag rollout plus backend reinstall/migration verification.
- .116 read-only use-quadlet-backends-install.bats: 6/6 skip-clean; no backend .container units, so Phase-3 is not active on that node.
- Release metadata still says 1.7.99-alpha in releases/manifest.json; changelog top is v1.8.00-alpha. Cutting an actual 1.8.0 OTA requires an explicit version/manifest update.
Do not discard:
- core/archipelago/src/mesh/listener/decode.rs
- core/archipelago/src/mesh/listener/session.rs
- core/archipelago/src/mesh/meshtastic.rs
3ccc bug current hypothesis:
- The prior attempted Meshtastic fix added a hard stale-packet filter using rx_time.
- Stock Meshtastic radios without GPS/RTC can report tiny nonzero epoch values until time sync.
- That would make live 3ccc packets look older than 10 minutes and get dropped before mesh.messages.
- Current patch treats implausibly early rx_time values as unknown rather than stale.
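The hypothesis above amounts to a three-way classification of rx_time instead of a two-way one. A minimal sketch, assuming a 10-minute staleness window (from the notes) and a hypothetical plausibility cutoff of 2020-01-01 UTC; the actual patch's threshold and names are not shown in this file.

```rust
/// Staleness window mentioned in the hypothesis above.
const STALE_AFTER_SECS: u64 = 10 * 60;

/// Hypothetical cutoff: 2020-01-01T00:00:00Z. Anything earlier is assumed
/// to come from a radio that has not yet synced its clock.
const EARLIEST_PLAUSIBLE_EPOCH: u64 = 1_577_836_800;

#[derive(Debug, PartialEq)]
enum RxAge {
    Fresh,
    Stale,
    /// Timestamp is not trustworthy; the packet should not be dropped for age.
    Unknown,
}

fn classify_rx_time(rx_time: u64, now: u64) -> RxAge {
    if rx_time < EARLIEST_PLAUSIBLE_EPOCH {
        RxAge::Unknown
    } else if now.saturating_sub(rx_time) > STALE_AFTER_SECS {
        RxAge::Stale
    } else {
        RxAge::Fresh
    }
}
```

The saturating subtraction also keeps a timestamp slightly ahead of the local clock from being treated as stale.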
.116 live validation after 2026-06-30 hotfix:
- .116 reachable by SSH; archipelago active; /dev/mesh-radio -> ttyUSB0 attached.
- Current canary deploy is commit b4531bb4; backend sha 4ab53e539d89679ef664401a9a57996267772fed02327abc2912c3e77543acbf; frontend bundle index-YOAeJF7w.js / Mesh-BSAo88jN.js.
- main pushed to gitea-vps2.
- RPC on .116:
  - transport.status currently reports mesh_only:false (off-grid mode is not enabled unless the user toggles it).
  - mesh.status reports Meshtastic connected: device_type:"meshtastic", self_node_id:1135977788, peer_count:13.
  - Recent .116->3ccc sent rows are stored with real 2026 timestamps and transport:"lora".
- UI/backend fixes included in b4531bb4:
  - transportLabel("lora") displays LoRa.
  - mesh sends refetch messages after send so transport pills settle without browser refresh.
  - off-grid mode blocks the mesh-chat FIPS/Tor federation fallback and forces LoRa-only sends; banner text is Tor/FIPS disabled - LoRa only.
  - empty mesh-chat placeholder opacity reduced.
- Meshtastic diagnostics now identify the remaining blocker:
  - 3ccc NodeInfo is discovered: Meshtastic peer is PKC-capable (NodeInfo public_key) node=1128152268 key_len=32.
  - Bytes from stock Meshtastic text reach .116, but the custom parser rejects the packet: Meshtastic FromRadio.packet did not parse into a decoded MeshPacket len=73 head=0dcc3c3e43153ca5b5432a16df56cbed.
  - Non-text packets decode and are ignored with port numbers (portnum=3/4/5), so the serial read path is alive. Resume inside core/archipelago/src/mesh/meshtastic.rs::parse_mesh_packet.
- LoRa is therefore not fully fixed yet: stock 3ccc->.116 text does not surface in mesh.messages, and .116->3ccc still needs user-visible confirmation in the Meshtastic app.
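As a starting point for the parse_mesh_packet investigation, the logged head bytes can be walked by hand with the standard protobuf wire format. The sketch below is a throwaway scanner, not the project's parser; all names are hypothetical, and it handles only the two wire types that appear in this head.

```rust
/// One top-level field seen in the scanned prefix (hypothetical type).
#[derive(Debug, PartialEq)]
enum Field {
    Fixed32 { number: u32, value: u32 },
    LenDelimited { number: u32, declared_len: usize },
}

/// Decode a hex string such as the `head=` value from the log line.
fn hex_to_bytes(hex: &str) -> Vec<u8> {
    (0..hex.len() / 2)
        .map(|i| u8::from_str_radix(&hex[2 * i..2 * i + 2], 16).expect("valid hex"))
        .collect()
}

/// Read one base-128 varint, advancing `pos`; None if the buffer ends first.
fn read_varint(buf: &[u8], pos: &mut usize) -> Option<u64> {
    let mut value = 0u64;
    let mut shift = 0u32;
    while *pos < buf.len() && shift < 64 {
        let b = buf[*pos];
        *pos += 1;
        value |= u64::from(b & 0x7F) << shift;
        if b & 0x80 == 0 {
            return Some(value);
        }
        shift += 7;
    }
    None
}

/// Walk top-level fields until the first length-delimited one, whose
/// payload may be truncated in a log head, or until an unhandled wire type.
fn scan_head(buf: &[u8]) -> Vec<Field> {
    let mut fields = Vec::new();
    let mut pos = 0;
    while let Some(tag) = read_varint(buf, &mut pos) {
        let number = (tag >> 3) as u32;
        match tag & 7 {
            5 if pos + 4 <= buf.len() => {
                let value =
                    u32::from_le_bytes([buf[pos], buf[pos + 1], buf[pos + 2], buf[pos + 3]]);
                pos += 4;
                fields.push(Field::Fixed32 { number, value });
            }
            2 => {
                if let Some(len) = read_varint(buf, &mut pos) {
                    fields.push(Field::LenDelimited { number, declared_len: len as usize });
                }
                break;
            }
            _ => break,
        }
    }
    fields
}
```

Applied to the head above, this yields field 1 = 1128152268 (3ccc), field 2 = 1135977788 (.116), then a 22-byte length-delimited field 5. If the upstream mesh.proto numbering applies (decoded = 4, encrypted = 5, an assumption to check against the proto), the packet may be arriving still encrypted rather than malformed, which would be worth ruling out before changing the decoder.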