archy/docs/meshroller-integration-design.md

Meshroller → Rust-native mesh assistant (issue #50)

Decision (2026-06-17): seam (a) — lift Meshroller's behaviors into our Rust mesh stack as typed message kinds. We do NOT package the Python/Meshtastic daemon. Meshroller rides Meshtastic-serial + a local Ollama; our radio is meshcore (Heltec V3) and the meshtastic Python module cannot drive it. So we reimplement its four behaviors natively against core/archipelago/src/mesh/, drop the Python + Meshtastic dependency, and reuse our existing event/transport seams.

Meshroller's behaviors (from the Phase-0 review of meshroller.py):

  1. LLM bridge — relay an inbound mesh message to a local LLM, send the reply back on the mesh.
  2. Trusted-node auth — only trusted senders may invoke commands.
  3. Scheduled / queued messaging — send messages at a future time; queue for peers that are currently offline.
  4. On-channel command parser — recognise commands in channel traffic.

Where this plugs in (verified seam map)

| Concern | File / type | Anchor (lines) |
| --- | --- | --- |
| Wire message kinds | mesh/message_types.rs MeshMessageType (#[repr(u8)]) | 28-73 |
| Envelope (CBOR, 0x02 marker, seq, sig) | mesh/message_types.rs TypedEnvelope | 183-197 |
| Inbound dispatch match | mesh/listener/dispatch.rs handle_typed_envelope_direct() | 80-691 |
| Outbound send | mesh/mod.rs send_typed_wire() / send_channel_typed_wire() | 848 / 1152 |
| Radio I/O command channel | mesh/listener/mod.rs MeshCommand (SendText/BroadcastChannel) | 55-73 |
| Frame chunking (≤160 B/frame, transparent) | mesh/listener/session.rs send_dm_via_channel() | |
| UI push | mesh/types.rs MeshEvent (broadcast on state.event_tx, cap 64) | 125-164 |
| Trust gate | federation/types.rs TrustLevel::Trusted on FederatedNode; federation::load_nodes() | 552 |
| Block on user-blocklist | mesh/listener/mod.rs ContactEntry.blocked (state.contacts) | 110 |
| Local model | Ollama container, port 11434 (port_allocator.rs:11); call via reqwest (already a dep) | |

No in-Rust LLM exists yet; we call the local Ollama HTTP API (the same model Meshroller used) so nothing new is baked into the binary.


Phase 1 — the assistant on the wire

1.1 New typed message kinds (message_types.rs)

Add two variants (next free tag = 24):

AssistQuery    = 24,   // "ask the node's AI" — prompt + optional model
AssistResponse = 25,   // reply — request_id + text + done flag

Wire the four spots the enum requires (from_u8 76-104, from_label 109-137, label() 139-166, plus the variant itself), mirroring the Invoice variant exactly.
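As a rough illustration of those four spots, here is a minimal sketch of the enum reduced to the two new variants. Only the tags 24/25 and the "assist_response" label appear in this document; the "assist_query" label and the Option-returning signatures are assumptions, and the real MeshMessageType carries many more variants.

```rust
/// Reduced stand-in for the real enum: only the two new variants are shown.
#[repr(u8)]
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum MeshMessageType {
    AssistQuery = 24,
    AssistResponse = 25,
}

impl MeshMessageType {
    /// Spot 1: wire tag -> variant.
    pub fn from_u8(tag: u8) -> Option<Self> {
        match tag {
            24 => Some(Self::AssistQuery),
            25 => Some(Self::AssistResponse),
            _ => None,
        }
    }

    /// Spot 2: text label -> variant ("assist_query" is an assumed label).
    pub fn from_label(label: &str) -> Option<Self> {
        match label {
            "assist_query" => Some(Self::AssistQuery),
            "assist_response" => Some(Self::AssistResponse),
            _ => None,
        }
    }

    /// Spot 3: variant -> text label.
    pub fn label(self) -> &'static str {
        match self {
            Self::AssistQuery => "assist_query",
            Self::AssistResponse => "assist_response",
        }
    }
}
```

Keeping from_label and label as exact inverses is what lets a label string be passed to the send path and still round-trip on the receiving side.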

Payloads (CBOR via encode_payload/decode_payload):

pub struct AssistQueryPayload  { pub req_id: u64, pub prompt: String, pub model: Option<String> }
pub struct AssistResponsePayload { pub req_id: u64, pub text: String, pub seq: u16, pub done: bool }

seq/done let a long reply span multiple AssistResponse messages without relying solely on frame reassembly (radio airtime is scarce — see §1.4 cap).
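To make the seq/done contract concrete, the receiving side could reassemble a reply roughly as follows. This is a sketch only: ReplyAssembler is a hypothetical helper, seq is assumed to start at 0, and a caller would keep one assembler per req_id.

```rust
use std::collections::BTreeMap;

/// Mirrors the AssistResponsePayload fields listed above.
#[derive(Debug, Clone)]
pub struct AssistResponsePayload {
    pub req_id: u64,
    pub text: String,
    pub seq: u16,
    pub done: bool,
}

/// Hypothetical collector for the chunks of a single request.
#[derive(Default)]
pub struct ReplyAssembler {
    chunks: BTreeMap<u16, String>,
    last_seq: Option<u16>,
}

impl ReplyAssembler {
    /// Store a chunk; return the full text once every seq in 0..=last is present.
    /// Chunks may arrive out of order; a duplicate seq overwrites the earlier copy.
    pub fn push(&mut self, p: AssistResponsePayload) -> Option<String> {
        if p.done {
            self.last_seq = Some(p.seq);
        }
        self.chunks.insert(p.seq, p.text);
        let last = self.last_seq?;
        if (0..=last).all(|s| self.chunks.contains_key(&s)) {
            Some((0..=last).map(|s| self.chunks[&s].as_str()).collect())
        } else {
            None
        }
    }
}
```

The done flag marks the final chunk rather than completion, so a reply is only surfaced once the gaps before it have been filled.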

1.2 Inbound handler (listener/dispatch.rs)

Add a match arm for AssistQuery, mirroring the TxRelay arm (169-207): validate → gate → spawn background work (never block the radio loop).

Some(MeshMessageType::AssistQuery) => {
    let payload = decode_payload::<AssistQueryPayload>(&envelope.v)?;
    if !assistant_enabled(state) { return; }                 // kill switch (config)
    if !sender_is_allowed(state, sender_contact_id).await { warn!(..); return; }
    if !rate_limit_ok(state, sender_contact_id).await { return; } // 1 in-flight / sender
    // Notify the UI before the (slow) LLM call starts.
    let _ = state.event_tx.send(MeshEvent::AssistQueryReceived {
        from_contact_id: sender_contact_id,
        prompt: payload.prompt.clone(),
    });
    let st = Arc::clone(state);
    // Background task: the radio loop must never wait on Ollama.
    tokio::spawn(async move { run_assist(&st, sender_contact_id, payload).await; });
}

run_assist: POST http://localhost:11434/api/generate ({model, prompt, stream:false}), cap + chunk the response (§1.4), and emit each chunk back to the sender via send_typed_wire(contact_id, …, "assist_response", …). Also store via the existing store_typed_message path so it lands in history, and emit MeshEvent::AssistResponseReady.
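The rate_limit_ok gate in the handler above could be backed by something like the following. This is a hedged sketch using only the standard library: InFlight and InFlightGuard are hypothetical names, and the real state would likely live on the shared mesh state rather than in a standalone struct.

```rust
use std::collections::HashSet;
use std::sync::{Arc, Mutex};

/// Hypothetical tracker: at most one in-flight query per sender contact id.
#[derive(Clone, Default)]
pub struct InFlight {
    senders: Arc<Mutex<HashSet<u32>>>,
}

/// Frees the sender's slot when dropped, so an early return or a failed
/// LLM call cannot leave a sender locked out.
pub struct InFlightGuard {
    senders: Arc<Mutex<HashSet<u32>>>,
    contact_id: u32,
}

impl InFlight {
    /// Returns a guard if the sender has no query running, otherwise None (deny).
    pub fn try_acquire(&self, contact_id: u32) -> Option<InFlightGuard> {
        let mut set = self.senders.lock().unwrap();
        if set.insert(contact_id) {
            Some(InFlightGuard { senders: Arc::clone(&self.senders), contact_id })
        } else {
            None
        }
    }
}

impl Drop for InFlightGuard {
    fn drop(&mut self) {
        if let Ok(mut set) = self.senders.lock() {
            set.remove(&self.contact_id);
        }
    }
}
```

Moving the guard into the spawned task ties the slot's lifetime to the Ollama call itself, which is the behaviour the "1 in-flight / sender" comment asks for.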

1.3 Trust gate (sender_is_allowed)

Reuse the federation trust list — no new store:

let nodes = federation::load_nodes(&data_dir).await.unwrap_or_default();
let peer  = state.peers.read().await.get(&sender_contact_id).cloned();
let trusted = peer.and_then(|p| nodes.iter().find(|n|
    Some(&n.pubkey) == p.pubkey_hex.as_ref() || Some(&n.did) == p.did.as_ref())
    .map(|n| n.trust_level == TrustLevel::Trusted)).unwrap_or(false);

Plus honour ContactEntry.blocked. Config picks the policy: trusted-only (default) | specific contacts | anyone on channel (opt-in).
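One way to express the three policies plus the blocklist override is a small pure function that the async gate calls after it has looked up the sender. The names AssistPolicy, SenderFacts and policy_allows are illustrative, not existing types in the codebase.

```rust
use std::collections::HashSet;

/// The three policies named above; variant names are illustrative.
pub enum AssistPolicy {
    TrustedOnly,
    Contacts(HashSet<u32>),
    AnyoneOnChannel,
}

/// Facts about the sender, gathered by the caller from the federation
/// trust list and the contact list.
pub struct SenderFacts {
    pub contact_id: u32,
    pub trusted: bool,
    pub blocked: bool,
}

/// A blocked contact is denied under every policy, including the opt-in one.
pub fn policy_allows(policy: &AssistPolicy, sender: &SenderFacts) -> bool {
    if sender.blocked {
        return false;
    }
    match policy {
        AssistPolicy::TrustedOnly => sender.trusted,
        AssistPolicy::Contacts(ids) => ids.contains(&sender.contact_id),
        AssistPolicy::AnyoneOnChannel => true,
    }
}
```

Keeping the decision free of I/O makes each policy easy to unit-test without a federation store or a radio.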

1.4 Airtime discipline (meshcore reality)

Frames are ≤160 B and reassembly is automatic, but bandwidth is tiny. So:

  • Cap the reply (default ~480 chars / ≤3 AssistResponse chunks); append …(truncated — reply '!more') and keep the tail server-side for a !more.
  • Rate-limit: one in-flight query per sender; drop/deny extras.
  • Timeout the Ollama call (e.g. 60 s) and reply with a short error on failure (MeshEvent::AssistResponseReady { error }).
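The cap-and-chunk rule above can be sketched as a pure function. The numbers are assumptions derived from the defaults in the list (160 characters per chunk, 3 chunks, so 480 characters), the truncation wording is a placeholder, and the split counts characters rather than bytes, so multi-byte text may still span more than one radio frame.

```rust
/// Assumed defaults: 160 chars per chunk and 3 chunks, giving a 480-char cap.
pub const CHUNK_CHARS: usize = 160;
pub const MAX_CHUNKS: usize = 3;
/// Placeholder wording for the truncation marker.
pub const TRUNCATION_NOTE: &str = " …(truncated, reply '!more')";

/// Split `reply` into at most MAX_CHUNKS chunks.
/// Returns the chunks to send now and, if the reply was cut, the tail to keep
/// server-side for a later !more.
pub fn cap_and_chunk(reply: &str) -> (Vec<String>, Option<String>) {
    let chars: Vec<char> = reply.chars().collect();
    let cap = CHUNK_CHARS * MAX_CHUNKS;
    let (head, tail): (&[char], Option<String>) = if chars.len() <= cap {
        (&chars[..], None)
    } else {
        // Leave room for the marker so the sent text still fits the cap.
        let keep = cap - TRUNCATION_NOTE.chars().count();
        (&chars[..keep], Some(chars[keep..].iter().collect()))
    };
    let mut body: String = head.iter().collect();
    if tail.is_some() {
        body.push_str(TRUNCATION_NOTE);
    }
    let body_chars: Vec<char> = body.chars().collect();
    let chunks = body_chars
        .chunks(CHUNK_CHARS)
        .map(|c| c.iter().collect::<String>())
        .collect::<Vec<String>>();
    (chunks, tail)
}
```

Each returned chunk would map to one AssistResponse, with seq set to its index and done set on the last one.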

1.5 Channel command parser

The killer entry point is a plain channel message, not a typed one. In the inbound Text path, when a channel-0/1 message starts with the trigger (default !ai / !ask), synthesise an AssistQuery from the remainder and run the same gated run_assist. This means any meshcore client (even a bare Meshtastic-style sender) can ask, while typed AssistQuery is the rich path our own UI uses. Both the trigger word and the enable switch are config.
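A minimal sketch of the trigger check, assuming the match is case-sensitive and that a trigger with no prompt after it is ignored; parse_trigger is a hypothetical helper name.

```rust
/// Return the prompt if `text` starts with one of the trigger words followed
/// by whitespace and a non-empty remainder; otherwise None.
pub fn parse_trigger<'a>(text: &'a str, triggers: &[&str]) -> Option<&'a str> {
    let text = text.trim_start();
    for trigger in triggers {
        if let Some(rest) = text.strip_prefix(*trigger) {
            // Require a word boundary so "!aircraft" does not match "!ai".
            if rest.starts_with(char::is_whitespace) {
                let prompt = rest.trim();
                if !prompt.is_empty() {
                    return Some(prompt);
                }
            }
        }
    }
    None
}
```

Only messages that begin with the trigger match, so ordinary chat that merely mentions !ai mid-sentence never reaches the model.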

1.6 UI events (types.rs)

AssistQueryReceived  { from_contact_id: u32, prompt: String },
AssistResponseReady  { req_id: u64, to_contact_id: u32, error: Option<String> },
ScheduledMessageFired { message_id: u64 },   // for §1.7

Subscribers already flow through the single event_tx broadcast — no extra wiring.

1.7 Scheduled / queued messaging

A small AssistScheduler owned by MeshService (sits beside relay_tracker / dead_man_switch in mod.rs):

  • Persisted queue { id, contact_id|channel, wire, fire_at, attempts } under data_dir/mesh/scheduled.json.
  • A tokio task wakes at the earliest fire_at, sends via the normal send_typed_wire / MeshCommand::SendText path, emits ScheduledMessageFired.
  • Offline queue: on send failure (peer unreachable) keep the item and retry when a PeerDiscovered / PeerUpdated event names that peer.
  • RPC: mesh.schedule-message { contact_id|channel, body, fire_at }, mesh.list-scheduled, mesh.cancel-scheduled.
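The queue behaviour in the list above (wake at the earliest fire_at, park on failure, retry when the peer reappears) could be modelled in memory like this. It is a sketch under assumptions: fire_at is taken to be Unix seconds, wire an opaque encoded payload, channel targets and JSON persistence are left out, and ScheduleQueue and its methods are hypothetical names.

```rust
use std::collections::BTreeMap;

/// One queued message; fields follow the persisted-queue bullet above.
#[derive(Debug, Clone, PartialEq)]
pub struct Scheduled {
    pub id: u64,
    pub contact_id: u32,
    pub wire: Vec<u8>,
    pub fire_at: u64,
    pub attempts: u32,
}

#[derive(Default)]
pub struct ScheduleQueue {
    // Keyed by (fire_at, id) so the earliest item is always first.
    pending: BTreeMap<(u64, u64), Scheduled>,
    // Items whose send failed, parked until the peer is seen again.
    parked: Vec<Scheduled>,
}

impl ScheduleQueue {
    pub fn add(&mut self, item: Scheduled) {
        self.pending.insert((item.fire_at, item.id), item);
    }

    /// The time at which the timer task should wake next, if anything is queued.
    pub fn next_wake(&self) -> Option<u64> {
        self.pending.keys().next().map(|(t, _)| *t)
    }

    /// Remove and return every item due at or before `now`, earliest first.
    pub fn take_due(&mut self, now: u64) -> Vec<Scheduled> {
        let later = self.pending.split_off(&(now.saturating_add(1), 0));
        std::mem::replace(&mut self.pending, later).into_values().collect()
    }

    /// Park an item after a failed send (peer unreachable).
    pub fn park(&mut self, mut item: Scheduled) {
        item.attempts += 1;
        self.parked.push(item);
    }

    /// On PeerDiscovered / PeerUpdated: hand back parked items for that peer.
    pub fn release_for_peer(&mut self, contact_id: u32) -> Vec<Scheduled> {
        let (ready, keep): (Vec<Scheduled>, Vec<Scheduled>) =
            std::mem::take(&mut self.parked)
                .into_iter()
                .partition(|i| i.contact_id == contact_id);
        self.parked = keep;
        ready
    }
}
```

A tokio task would sleep until next_wake(), call take_due(now), attempt each send, and park whatever fails; the peer-event handler would call release_for_peer and retry the result.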

Phase 2 — killer Mesh-tab UX (ties into project_mesh_telegram_plan)

Onboarding (one screen, three steps):

  1. Model — detect Ollama on :11434. If absent, a single "Install AI (Ollama)" button deep-links to the App Store entry; if present, pick the model (default the one already pulled).
  2. Who can ask — Trusted nodes only (default) · Pick contacts · Anyone on the mesh channel (with a clear "uses your node's compute / airtime" warning).
  3. Trigger word — default !ai; toggle the whole feature on.

Usage (Mesh tab):

  • An Assistant card: on/off, model, policy, trigger; live feed driven by AssistQueryReceived / AssistResponseReady.
  • Composer gains two actions: Ask the mesh AI (sends a typed AssistQuery) and Send later (date/time → mesh.schedule-message), with a "Scheduled" list (mesh.list-scheduled, cancel).

The 1-2 killer actions: ask the island's AI from any radio, and queue a message that sends itself when a peer comes back in range.


Verification

Needs 2 radios (the .116 meshcore + a second) + Ollama running on the answering node:

  1. From radio B send !ai what's the block height? → if A trusts B, node A answers on the channel; if B is untrusted, the query is silently denied.
  2. Typed AssistQuery from our UI → chunked AssistResponse renders in the feed.
  3. Long reply → truncation + !more continues.
  4. Schedule a message to an out-of-range peer → it fires when the peer reappears.

Effort & order

Multi-day. Land in this order so each step is testable alone: 1.1 enum + payloads → 1.2/1.3/1.4 gated bridge → 1.5 channel trigger → 1.6 events → 1.7 scheduler → Phase 2 UI. Steps 1.1-1.4 are the minimum demoable slice (ask over the mesh, get an answer).