Resume notes for the 1.8.0 bug-bash mesh work: Meshtastic rename shipped + verified; .120->.89 'non-delivery' diagnosed to a duplicate-contact surfacing bug (messages inject fine, split across federation/radio twin contact_ids); design for the dedup fix (#12) and the netbird logout-race map (#10). Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
Handoff — Mesh device rename, mesh routing, duplicate contacts, netbird logout (2026-06-20)
Session is a test-build iteration toward the 1.8.0 bug-bash release — sideload patched binaries
to test nodes, NO version bump / NO OTA release (manifest stays 1.7.99-alpha). Because the version
string never changes, verify a deploy by sha256-matching the deployed binary, not by current_version.
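A minimal sketch of that sha256 check, assuming the digest arrives as one line of `sha256sum` output from the node and the expected value may be an abbreviated prefix. The names `parse_sha256sum_line` and `deploy_verified` are hypothetical and not part of deploy-node.sh.

```rust
/// Extract the hex digest from one line of `sha256sum` output,
/// which has the shape "<64 hex chars>  <path>".
fn parse_sha256sum_line(line: &str) -> Option<String> {
    let digest = line.split_whitespace().next()?;
    let well_formed = digest.len() == 64 && digest.chars().all(|c| c.is_ascii_hexdigit());
    if well_formed {
        Some(digest.to_ascii_lowercase())
    } else {
        None
    }
}

/// True when the deployed binary's digest starts with the expected digest.
/// `expected_prefix` may be abbreviated (hypothetical convention for this sketch).
fn deploy_verified(sha256sum_line: &str, expected_prefix: &str) -> bool {
    let expected = expected_prefix.trim().to_ascii_lowercase();
    if expected.is_empty() {
        // An empty prefix would match anything, so treat it as a failed check.
        return false;
    }
    match parse_sha256sum_line(sha256sum_line) {
        Some(actual) => actual.starts_with(&expected),
        None => false,
    }
}
```

An abbreviated prefix is only as strong as its length, so compare the full 64-character digest when it is available.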
Test node roster (creds in the operator's local notes / agent memory — NOT in this repo)
- .116: 192.168.1.116 (archi-thinkpad), this build host, dev/validation.
- .198: 192.168.1.198 and .228: 192.168.1.228, LAN resilience nodes.
- .5: Tailscale 100.72.136.5 (archy-x250-beta), Meshtastic radio.
- .120: Tailscale 100.66.157.120 (archy-x250-exp), Meshtastic radio.
- .89: Tailscale 100.89.209.89 (archy-x250-pa), dual radio: ttyACM0 Meshtastic (probe FAILS), ttyUSB0 MeshCore (active). Configured device_path = ttyACM0. Runs netbird (v2.38.0).
Deploy driver used this session: /tmp/archy-deploy/deploy-node.sh <user@host> <pw> <label>
(scp binary + stream web/dist/neode-ui + sudo swap /usr/local/bin/archipelago, preserve aiui +
claude-login.html, chown 1000:1000, restart, verify sha256+health). Recreate from this doc if /tmp is gone.
Deploy state (binary sha) at handoff
- b5183dfc… (HEAD d00d1b20, includes Meshtastic rename) → on .5 and .120 (verified).
- f702b4f1… (the 3 wallet/mesh/ui fixes, pre-rename) → on .116, .198, .228.
- 7c17a96… (OLD, pre-f702b4f1) → .89 is STALE; update before re-testing .120→.89.
DONE
- Meshtastic device rename → server name: committed d00d1b20 (pushed to gitea-vps2/main). meshtastic.rs set_advert_name was a no-op (in-memory only). Now sends AdminMessage{set_owner=User{long_name,short_name}} to the local node on ADMIN_APP port (6), set_owner field = 32. long_name = server name (≤39), short_name = first 4 alphanumerics upper-cased. Hardware-verified: .120 radio now reads back Archy-X250-EXP, .5 reads back Archy-X250-Beta. MeshCore already renamed (CMD_SET_ADVERT_NAME, serial.rs:147): unchanged, now at parity.
- Routing priority confirmed = Mesh → FIPS → Tor. send_typed_wire (mesh/mod.rs:1007): reachable radio peer → LoRa; federation-synthetic OR (!reachable && arch_pubkey_hex.is_some()) → federation. send_typed_wire_via_federation (mod.rs:1124): FIPS first w/ .fips_timeout (8s), Tor fallback.
- .120→.89 "non-delivery" diagnosed: it is NOT a delivery failure. .120 sends to .89's federation contact_id 3027572739, logs Federation envelope delivered transport=tor (gated on HTTP 2xx, mod.rs:1185). The receiver returns 2xx ONLY after ed25519-verify + successful inject_typed_from_federation (node_message.rs:217-263). Identity matches (.89 pubkey 031875b4…). .89→.120 works. So .120's messages ARE injected into .89's state under contact_id 2679725907 = federation_peer_contact_id(.120 pubkey 535fb91f…), name "Archy-X250-EXP". It's a duplicate-contact SURFACING problem (user confirmed doubles).
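The name derivation in the rename item could be sketched as below. This is not the code in meshtastic.rs: `owner_names` and `LONG_NAME_MAX` are hypothetical names, and two things are assumed because the notes do not state them: the ≤39 limit counts bytes, and "alphanumerics" means ASCII letters and digits taken from the untruncated server name.

```rust
/// Longest long_name per the notes (assumed here to be a byte limit).
const LONG_NAME_MAX: usize = 39;

/// Derive (long_name, short_name) for the set_owner payload from the server name.
fn owner_names(server_name: &str) -> (String, String) {
    // Back off to a char boundary so a multi-byte character is never split.
    let mut end = server_name.len().min(LONG_NAME_MAX);
    while !server_name.is_char_boundary(end) {
        end -= 1;
    }
    let long_name = server_name[..end].to_string();

    // First four ASCII alphanumerics, upper-cased; punctuation is skipped.
    let short_name: String = server_name
        .chars()
        .filter(|c| c.is_ascii_alphanumeric())
        .take(4)
        .map(|c| c.to_ascii_uppercase())
        .collect();

    (long_name, short_name)
}
```

With these assumptions, `owner_names("Archy-X250-EXP")` yields long_name "Archy-X250-EXP" and short_name "ARCH". Every node whose server name starts with "Archy" gets the same short_name, which may be worth keeping in mind when reading radio node lists.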
TODO (resume here)
#12 Fix duplicate mesh contacts ← user chose this NEXT
Root cause: handle_mesh_contacts_list (api/rpc/mesh/typed_messages.rs:1126) and
handle_conversations_list (api/rpc/mesh/status.rs:89) emit one row per state.peers entry with
no cross-transport dedup. A node can have TWO peers: a radio peer (low contact_id, firmware key)
and a federation peer (high contact_id ≥ 0x8000_0000, archipelago key). bind_federation_twins
(mesh/mod.rs:85) correlates them by exact advert_name and copies arch_pubkey_hex onto the radio
twin, but LEAVES BOTH ROWS. Messages are keyed by peer_contact_id (split across the two ids), so
the federation-injected messages sit on the federation row while the user may open the radio row → empty.
Design constraint (important): the two twins have DIFFERENT routing. Collapsing must NOT break
"mesh-first": the canonical SEND contact_id should be the RADIO twin when one exists (so send_typed_wire
routes LoRa-if-reachable, else federation via the bound arch key), else the federation id. The merged
THREAD must union messages from ALL twin contact_ids (group by arch_pubkey_hex). Apply the dedup in:
- handle_conversations_list (status.rs:89): one conversation per identity group; last msg = newest across twins.
- handle_mesh_contacts_list (typed_messages.rs:1126).
- handle_conversations_messages (status.rs ~146): when asked for a contact_id, resolve its group's twin ids and filter messages by ANY of them.
Add a shared helper (e.g. group peers by arch_pubkey_hex when Some, else singleton by contact_id). Do NOT merge/re-key at bind_federation_twins time; that would force federation routing and break mesh-first.
MeshPeer struct: mesh/types.rs:28 (fields: contact_id, advert_name, did, pubkey_hex, arch_pubkey_hex, reachable…).
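A possible shape for the shared helper, as a sketch under stated assumptions rather than the final design. `Peer` is a trimmed stand-in for MeshPeer, and `ContactGroup`, `group_peers`, `thread_ids` and `FEDERATION_ID_BASE` are hypothetical names. It assumes contact_id fits in a u32 and that any id below 0x8000_0000 is a radio peer.

```rust
use std::collections::BTreeMap;

/// Federation-synthetic contact ids start here, per the notes.
const FEDERATION_ID_BASE: u32 = 0x8000_0000;

/// Trimmed stand-in for MeshPeer; only the fields the grouping needs.
#[derive(Debug, Clone)]
struct Peer {
    contact_id: u32,
    arch_pubkey_hex: Option<String>,
}

/// One logical contact: all twin ids plus the id that sends should use.
#[derive(Debug, PartialEq)]
struct ContactGroup {
    send_contact_id: u32,
    twin_ids: Vec<u32>,
}

/// Group peers by arch_pubkey_hex when Some, else singleton by contact_id.
fn group_peers(peers: &[Peer]) -> Vec<ContactGroup> {
    let mut by_key: BTreeMap<String, Vec<u32>> = BTreeMap::new();
    for p in peers {
        let key = match &p.arch_pubkey_hex {
            Some(k) => format!("arch:{}", k),
            None => format!("id:{}", p.contact_id),
        };
        by_key.entry(key).or_default().push(p.contact_id);
    }
    by_key
        .into_values()
        .map(|mut ids| {
            ids.sort_unstable();
            ids.dedup();
            // Mesh-first: prefer a radio twin as the send id, else the federation id.
            let send_contact_id = ids
                .iter()
                .copied()
                .find(|id| *id < FEDERATION_ID_BASE)
                .unwrap_or(ids[0]);
            ContactGroup { send_contact_id, twin_ids: ids }
        })
        .collect()
}

/// Contact ids whose messages belong in the thread opened for `contact_id`.
fn thread_ids(groups: &[ContactGroup], contact_id: u32) -> Vec<u32> {
    groups
        .iter()
        .find(|g| g.twin_ids.contains(&contact_id))
        .map(|g| g.twin_ids.clone())
        .unwrap_or_else(|| vec![contact_id])
}
```

The list handlers would emit one row per `ContactGroup`, and the messages handler would filter by `thread_ids`, so opening either twin shows the same merged thread. Stored messages keep their original peer_contact_id. Only the read path changes, so routing is left untouched.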
Before testing #12: update .89 to the current build (it's on stale 7c17a96), then re-check whether
.120 ("Archy-X250-EXP") shows once with its messages. NB: .89 had 0 journal mentions of "Archy-X250-EXP"
and no radio contact for .120 — so its specific double may be a stale-binary artifact; confirm on fresh build.
#10 Netbird logout race
Symptom: right after install netbird shows logged-in but can't log out; self-corrects after a while.
Map: install stacks.rs install_netbird_stack (~1760-1918): 3 containers (netbird-server :8086, dashboard,
nginx proxy :8087→443 self-signed TLS). wait_for_stack_containers waits for "running", NOT OIDC-ready.
Dashboard is netbird's own SPA, opened in a NEW TAB (appLauncher.ts ~52-60, secure-context/crypto.subtle).
Hypothesis: startup race — dashboard loads before netbird-server's OIDC provider is ready, caches a bad auth
state; logout endpoint not ready. Likely fix: gate install completion / launch on netbird-server OIDC
readiness (poll an endpoint) rather than container "running". Repro on .89 (has netbird running).
Prior note: AccountInfoSection.vue ~602 release note claims a previous unified-origin fix for the 404
logout/login loop — the initial-state race remains.
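The "gate on readiness, not on running" idea could be sketched as a generic poll loop. The probe is deliberately left as a closure because the notes do not say which netbird-server endpoint signals OIDC readiness. `wait_until_ready` is a hypothetical helper, not an existing function in stacks.rs.

```rust
use std::thread::sleep;
use std::time::{Duration, Instant};

/// Call `probe` until it returns true or `timeout` elapses.
/// Returns true if the service became ready in time.
fn wait_until_ready<F>(mut probe: F, timeout: Duration, interval: Duration) -> bool
where
    F: FnMut() -> bool,
{
    let deadline = Instant::now() + timeout;
    loop {
        if probe() {
            return true;
        }
        if Instant::now() >= deadline {
            return false;
        }
        sleep(interval);
    }
}
```

Install completion (or the dashboard launch) would then wait on something like `wait_until_ready(|| oidc_probe(), ...)`, where `oidc_probe` stands for whatever check is eventually chosen. The probe always runs at least once, so a zero timeout still performs a single check.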
Mesh parity directive
MeshCore "works great"; Meshtastic must reach the SAME parity (rename done; duplicate-contact + routing fallback shared across both). Meshtastic↔MeshCore are INCOMPATIBLE over-the-air, so cross-protocol federated peers (.120↔.89) rely entirely on the FIPS/Tor fallback.
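Why cross-protocol peers always end up on the fallback can be shown with a small model of the routing rule recorded under DONE. This is an illustrative sketch, not the code in mesh/mod.rs: `PeerView`, `Route`, `pick_route` and `send_via_federation` are hypothetical. The `Unroutable` case is an assumption, since the notes do not say what happens to an unreachable peer with no archipelago key. The 8s FIPS timeout is assumed to live inside the `try_fips` closure.

```rust
#[derive(Debug, PartialEq)]
enum Route {
    LoRa,
    Federation,
    Unroutable, // assumption: the notes do not cover this case
}

/// The three facts the routing rule looks at (stand-in for MeshPeer).
struct PeerView {
    reachable: bool,
    federation_synthetic: bool,
    has_arch_pubkey: bool,
}

/// Mesh first: a reachable radio peer goes over LoRa, otherwise federation.
fn pick_route(p: &PeerView) -> Route {
    if !p.federation_synthetic && p.reachable {
        Route::LoRa
    } else if p.federation_synthetic || p.has_arch_pubkey {
        Route::Federation
    } else {
        Route::Unroutable
    }
}

/// Federation order: FIPS first, Tor only if FIPS fails.
/// Returns the name of the transport that delivered.
fn send_via_federation<F, T>(try_fips: F, try_tor: T) -> Result<&'static str, String>
where
    F: FnOnce() -> Result<(), String>,
    T: FnOnce() -> Result<(), String>,
{
    if try_fips().is_ok() {
        return Ok("fips");
    }
    try_tor().map(|_| "tor")
}
```

A Meshtastic node never hears a MeshCore node over the air, so the radio twin for such a peer is either absent or never `reachable`. `pick_route` therefore returns `Federation` for it whenever an archipelago key is bound.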