# Remaining issues — implementation plans
Written 2026-06-17. Covers the open Gitea issues not closeable in the single-box
dev env. Each plan lists the files to touch, the approach, and how to verify
(most need .116 + .198, a companion phone, or funded wallets). Issues #3 (VPN)
and #5 (OpenWRT/TollGate) are intentionally out of scope per the user.
Status of the rest at time of writing:
- **#31** group chat over Tor — dedup-by-`msg_id` fix already shipped (open only
for a 2-node Tor confirmation). See its Gitea comment.
- **#43** install on .70 — blocked: .70 unreachable. Plan below is a code-side
hardening that doesn't depend on .70's logs.
---
## #46 — Pay for peer files (local wallet OR invoice+QR to seller)
> **Status (2026-06-17): Phase 1 DONE & compiles** (LN invoice + QR + release).
> Seller: `content_invoice.rs` entitlement store, `GET /content/{id}/invoice`
> + `/invoice-status/{hash}`, invoice-paid path in `serve_content`
> (`X-Invoice-Hash`), LND `create_invoice`/`invoice_is_settled`. Buyer:
> `content.request-invoice` / `.invoice-status` / `.download-peer-invoice` +
> `PeerFiles.vue` picker modal + QR + poll. Phases 2 (on-chain) and 3 (local
> LN/on-chain methods) remain; needs live funded-wallet verify. Issue left open.
**Goal.** At the paid-download step in Cloud → peer files, let the buyer choose
how to pay: (a) their local wallet (ecash today; LN/on-chain later), or (b) get
an invoice with a QR drawn on the **selling** node's wallet, pay from any
external wallet, and have the file release on confirmation.
**What exists already**
- Buyer ecash auto-pay: `content.download-peer-paid` (mints ecash, downloads
atomically) — wired in `neode-ui/src/views/PeerFiles.vue` `downloadFile()`.
- Payer-side builder: `streaming.prepare-payment` RPC + `wallet/ecash.rs`
(`build_payment_token`, cross-mint), `swarm/payment.rs`.
- Free streaming download: `/api/peer-content/:onion/:id` (Range-capable).
- LND invoice RPC: `lnd.createinvoice`; ecash balance: `wallet.ecash-balance`.
**Backend work**
1. **Seller-side invoice RPC** (new), e.g. `content.request-invoice`
`{ onion, content_id }` → asks the *selling* node (over the existing
`/archipelago/...` peer transport, same path machinery as
`content.download-peer-paid`) to produce a payment request for `price_sats`:
- LN: `lnd.createinvoice` on the seller, return `bolt11` + `payment_hash`.
- on-chain: `lnd.newaddress` on the seller, return `address` + `amount`.
- Seller records a pending entitlement keyed by `payment_hash`/address →
content_id → buyer.
2. **Payment confirmation + release**: seller polls its own LND
(`lnd.lookup-invoice` / address watch); on settle, marks the entitlement
paid. Buyer side polls `content.invoice-status { payment_hash }` → when paid,
downloads via the existing `/api/peer-content` (gate now passes because the
entitlement is satisfied). Reuse the streaming gate in `streaming/` — add an
"invoice-paid" path alongside the ecash-token path.
3. Keep `content.download-peer-paid` (local-ecash) as the (a) fast path.
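The seller-side bookkeeping in steps 1 and 2 can be pictured with a minimal in-memory sketch. It is written in TypeScript for brevity and is not the shipped `content_invoice.rs` store: `EntitlementStore`, its method names, and the `pending`/`paid` states are hypothetical.

```typescript
// Hypothetical sketch only; names and shapes are illustrative and do not
// mirror the real Rust entitlement store.
type EntitlementState = "pending" | "paid";

interface Entitlement {
  contentId: string;
  buyer: string;
  state: EntitlementState;
}

class EntitlementStore {
  private byPaymentHash = new Map<string, Entitlement>();

  // Called when the seller issues an invoice for a piece of content.
  recordPending(paymentHash: string, contentId: string, buyer: string): void {
    // Never downgrade an entitlement that is already paid.
    if (this.byPaymentHash.get(paymentHash)?.state === "paid") return;
    this.byPaymentHash.set(paymentHash, { contentId, buyer, state: "pending" });
  }

  // Called when the seller's LND reports the invoice as settled.
  // Returns false for unknown hashes so the caller can log them.
  markPaid(paymentHash: string): boolean {
    const entitlement = this.byPaymentHash.get(paymentHash);
    if (!entitlement) return false;
    entitlement.state = "paid";
    return true;
  }

  // Gate check for the download path: the hash must be paid AND must have
  // been issued for this exact content id.
  allowsDownload(paymentHash: string, contentId: string): boolean {
    const entitlement = this.byPaymentHash.get(paymentHash);
    return (
      entitlement !== undefined &&
      entitlement.state === "paid" &&
      entitlement.contentId === contentId
    );
  }
}
```

Matching on `contentId` in `allowsDownload` keeps a paid hash for one file from unlocking another; whether the real gate also binds the buyer identity is left open here.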
**Frontend work** (`PeerFiles.vue`)
1. Before a paid download, open a small **payment-method picker** modal:
- "Pay from this node's wallet" → existing ecash flow (show balance; if
insufficient, offer the local LN/on-chain options once those land).
- "Pay from another wallet (QR)" → call `content.request-invoice`, render the
`bolt11`/address as a **QR** (add a tiny QR lib or reuse one already in the
bundle — check `package.json`), show amount + a live "waiting for
payment…" state polling `content.invoice-status`, then auto-download.
2. Reuse the existing `purchaseError`/`downloading` state + `triggerDownload`.
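The "waiting for payment…" state could be driven by a small polling helper like the one below. This is a sketch under assumptions: the `{ paid: boolean }` response shape, the `Rpc` type, and `waitForInvoicePaid` are illustrative. Only the `content.invoice-status` method name and its `payment_hash` parameter come from the plan above.

```typescript
// Assumed response shape; the real `content.invoice-status` payload may differ.
interface InvoiceStatus {
  paid: boolean;
}

// Hypothetical RPC caller; stands in for whatever client PeerFiles.vue uses.
type Rpc = (method: string, params: Record<string, unknown>) => Promise<InvoiceStatus>;

interface PollOptions {
  intervalMs: number;
  timeoutMs: number;
  // Injected so the loop can be tested without real timers.
  sleep: (ms: number) => Promise<void>;
}

// Resolves true once the seller reports the invoice as paid, or false when
// the accumulated wait reaches timeoutMs.
async function waitForInvoicePaid(
  rpc: Rpc,
  paymentHash: string,
  opts: PollOptions
): Promise<boolean> {
  let waitedMs = 0;
  while (true) {
    const status = await rpc("content.invoice-status", { payment_hash: paymentHash });
    if (status.paid) return true;
    if (waitedMs >= opts.timeoutMs) return false;
    await opts.sleep(opts.intervalMs);
    waitedMs += opts.intervalMs;
  }
}
```

In the browser, `sleep` would typically be `(ms) => new Promise((r) => setTimeout(r, ms))`; on `true` the view would call the existing `triggerDownload`, and on `false` it could set `purchaseError`.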
**Verify**: .116 (seller) + .198 (buyer), a funded regtest/LN wallet. Buyer
picks QR, pays from a 3rd wallet, file releases. Then the local-ecash path.
**Effort**: large (multi-day). Phase it: (1) LN-invoice + QR + release, (2)
on-chain, (3) local LN/on-chain methods.
---
## #18 — Companion app: "open in external browser" apps don't work
> **Status (2026-06-17): DONE & compiles (Rust + TS); Android unbuilt here.**
> Reverse relay hop added: `external_open_tx` channel, kiosk publishes
> `{"t":"o","url"}` on `/ws/remote-relay` (URL-validated), forwarded to the
> companion's `/ws/remote-input`. `requestExternalOpen()` in `remote-relay.ts`
> wired into all four `appLauncher.ts` external-open sites; `InputWebSocket.kt`
> + `RemoteInputScreen.kt` open it via `ACTION_VIEW`. Issue closed; live pairing
> test pending.
**Goal.** Apps configured to open in a new/external browser should launch on the
**phone** when driven from the companion controller, using the
phone-default-browser request pattern.
**What exists**
- Relay protocol in `neode-ui/src/api/remote-relay.ts` — message cases `m`
(move cursor), `c` (click), `s` (scroll, just fixed in #7). Click resolves the
element under the virtual cursor via `deepElementFromPoint`.
- The kiosk side runs the dashboard; "open external" apps currently try to
`window.open` on the **kiosk**, which the phone never sees.
**Approach**
1. **Detect external-open intent on the kiosk**: when a click lands on an
element that would open externally (anchor with `target=_blank` / an app
flagged `opensExternally`, or an intercepted `window.open`), instead of
opening locally, send a new relay message to the phone:
`{ t: 'open-url', url }` over the `/ws/remote-relay` channel (the kiosk is the
relay server side — find where it sends frames back to the companion).
2. **Companion (phone) side** handles `open-url` by doing `window.open(url,
'_blank')` / `location.href = url` so it opens in the phone's default browser.
- If the companion is the **Android APK** (separate codebase, see
`Android/` + memory `feedback_companion_apk_not_in_update`), add an
intent-based handler there; if it's a mobile web client, handle in JS.
3. Intercept `window.open` on the kiosk dashboard globally (a small shim that,
when remote-relay is active, forwards to the phone instead of opening).
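The `window.open` shim in step 3 might look roughly like this. It is a sketch, not the shipped `requestExternalOpen()` code: `installExternalOpenShim`, `OpenCapable`, and `RelayLike` are hypothetical names, and the frame uses the `{ t: 'open-url', url }` shape proposed in step 1.

```typescript
// Minimal structural types so the sketch does not depend on the DOM.
interface OpenCapable {
  open: (url?: string, target?: string) => unknown;
}

// Hypothetical view of the relay client; the real API may differ.
interface RelayLike {
  isActive(): boolean;
  send(frame: { t: "open-url"; url: string }): void;
}

// Only absolute http(s) URLs are forwarded; anything else (relative paths,
// javascript: URLs, and so on) falls through to the original open().
const isHttpUrl = (url: string): boolean => /^https?:\/\/[^\s]+$/i.test(url);

// Replaces win.open with a forwarding version and returns a function that
// restores the original.
function installExternalOpenShim(win: OpenCapable, relay: RelayLike): () => void {
  const originalOpen = win.open;
  win.open = (url?: string, target?: string): unknown => {
    if (relay.isActive() && typeof url === "string" && isHttpUrl(url)) {
      relay.send({ t: "open-url", url });
      return null; // nothing was opened on the kiosk
    }
    return originalOpen.call(win, url, target);
  };
  return () => {
    win.open = originalOpen;
  };
}
```

Returning a restore function lets the dashboard remove the shim when the relay session ends, so local `window.open` behavior is unchanged outside companion control.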
**Verify**: phone + kiosk paired; tap an "open external" app from the companion;
it opens in the phone browser.
**Effort**: medium; needs the companion device + possibly an APK change.
---
## #50 — Integrate Meshroller into our mesh features
> **Decision made 2026-06-17: seam (a) — Rust-native lift.** Full design with
> verified seam anchors (message types, dispatch, send API, event/trust gates,
> Ollama call) is in **`docs/meshroller-integration-design.md`**. Summary below.
Source: https://gitea.l484.com/clasko/Meshroller
**Phase 0 — review (DONE 2026-06-17)**
- Reviewed. Meshroller is a single ~29KB Python script (`meshroller.py`): a
daemon that bridges a **Meshtastic** radio (via the `meshtastic` Python serial
module, `SerialInterface`) to an **Ollama** LLM (`qwen2.5-coder`). It has
trusted-node auth, scheduled/queued messaging, and command handling on mesh
channels. It is a **daemon**, not firmware or a library.
- **License**: in-house (our own developer) — no third-party license blocker.
- **Hardware/transport reality**: it rides **Meshtastic serial + a local
Ollama**. Our radio is **Meshcore** (Heltec V3) and our mesh stack targets
meshcore. The `meshtastic` module does NOT speak meshcore, so the script
cannot drive our radio unmodified.
- **Decision needed (architecture)**: per user, integration **must work with
meshcore**. Two seams:
- (a) Lift Meshroller's *behaviors* (LLM bridge, trusted-node auth, scheduled
messaging, command parser) into our Rust mesh stack as typed message kinds —
native to meshcore, no Python/Meshtastic dependency. Preferred for meshcore.
- (b) Package the Python daemon as a container app and add a meshcore serial
backend to it (keeps the script, but requires writing meshcore I/O the
`meshtastic` module doesn't provide).
This choice was the remaining gate and is now settled in favor of seam (a) (see
the decision note above); the rest of Phase 1 below stands.
**Phase 1 — choose the seam**
- Our mesh stack: `core/archipelago/src/mesh/` (`mod.rs` `MeshService`,
`listener/`, `protocol.rs`, `types.rs`). Decide:
- If Meshroller is a *protocol/feature on the same radio* → implement it as a
typed message kind in our `MeshMessageType` + `listener/dispatch.rs`
(mirrors how block headers / alerts are handled).
- If it's a *separate transport/daemon* → wrap it behind our transport router
(`transport/`) like FIPS/LAN/Tor.
- Reuse the event seam (`MeshEvent`) so the UI gets pushes (same path we just
wired for #48).
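To make the "typed message kind plus dispatch" idea concrete, here is a small sketch of how Meshroller's behaviors could be modeled as a tagged union behind a trusted-node gate. It is written in TypeScript for illustration only; the real variants would live in the Rust `MeshMessageType`, and every kind, field, and function name below is hypothetical.

```typescript
// Hypothetical message kinds lifted from Meshroller's behaviors.
type MeshrollerMessage =
  | { kind: "llm-prompt"; from: string; prompt: string }
  | { kind: "schedule"; from: string; sendAtUnix: number; text: string }
  | { kind: "command"; from: string; name: string; args: string[] };

// What the dispatcher decides to do with a message.
type DispatchResult =
  | { action: "rejected"; reason: string }
  | { action: "ask-llm"; prompt: string }
  | { action: "enqueue"; sendAtUnix: number; text: string }
  | { action: "run-command"; name: string; args: string[] };

// The trust gate runs first, so untrusted nodes never reach the LLM bridge,
// the scheduler, or the command handler.
function dispatchMeshroller(
  msg: MeshrollerMessage,
  trustedNodes: ReadonlySet<string>
): DispatchResult {
  if (!trustedNodes.has(msg.from)) {
    return { action: "rejected", reason: "untrusted node" };
  }
  switch (msg.kind) {
    case "llm-prompt":
      return { action: "ask-llm", prompt: msg.prompt };
    case "schedule":
      return { action: "enqueue", sendAtUnix: msg.sendAtUnix, text: msg.text };
    case "command":
      return { action: "run-command", name: msg.name, args: msg.args };
  }
}
```

Because the `switch` is exhaustive over `kind`, adding a new message kind without a matching case is a compile error, which is the same property a Rust `match` over an enum would give.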
**Phase 2 — UX** (ties into `project_mesh_telegram_plan`)
- A dead-simple onboarding + usage flow in the Mesh tab. Define the 1-2 killer
actions and design the setup wizard.
**Verify**: 2 radios (the .116 Meshcore + a second).
**Effort**: multi-day. The gates named in the original plan (the Phase 0 review
and the license/architecture decision) were cleared on 2026-06-17.
---
## #15 — netbird app doesn't work (LOW PRIORITY)
> **Status (2026-06-17): DIAGNOSED LIVE on .198 + FIXED (option A shipped); login works.**
> THE real blocker: the dashboard needs a **secure context** —
> `window.crypto.subtle is unavailable` over plain http, so OIDC PKCE threw
> before login. Fix: proxy now serves **HTTPS** (self-signed cert at install,
> `8087:443`, all origins `https://`); frontend opens netbird in a **new tab**
> (self-signed-HTTPS iframe is blocked). Layered fixes also in `stacks.rs`:
> nginx `resolver <gateway>` + variable upstreams (IP-cache 502; `resolver
> local=on`/`${NGINX_LOCAL_RESOLVERS}` FAIL on nginx:1.27-alpine), LAN-IP
> canonical origin + CORS + multi-origin redirect URIs, `/nb-auth`+`/nb-silent-auth`
> SPA fallback (were 404), and a stale-store note (wipe to re-init). Also found:
> `conmon died` zombie containers (recreate fixes; #53). Validated on .198,
> registration+login succeed. Trusted-cert/iframe (option B) = #56;
> registry-app migration = #52. Existing nodes need a clean reinstall.
**Diagnose first** (likely a container/config issue, like other app fixes):
1. On a node: `podman logs <netbird container>` — capture the actual failure.
2. Check the app manifest + install path (`container/` install, env, ports,
the four iframe-sync places per memory `feedback_gitea_iframe_setup` if it
has a UI).
3. netbird needs a management URL / setup key — confirm whether the app expects
config we don't provide, or a host capability (TUN device / NET_ADMIN) the
rootless-podman setup lacks.
**Likely fix**: either supply the missing env/setup-key UI, or add the required
container capability. Low priority — schedule after the above.
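The secure-context failure described in the status note can be detected up front, before an OIDC PKCE flow throws. The guard below is a sketch: `canRunPkce` and `BrowserEnv` are hypothetical names, and in a browser the argument would be `window`.

```typescript
// Structural subset of `window`, so the check can run outside a browser too.
interface BrowserEnv {
  isSecureContext?: boolean;
  crypto?: { subtle?: unknown };
}

// PKCE needs SHA-256 from the Web Crypto API, which browsers only expose in
// secure contexts (HTTPS, or http://localhost).
function canRunPkce(env: BrowserEnv): boolean {
  return env.isSecureContext === true && env.crypto?.subtle !== undefined;
}

// Returns a human-readable reason when login cannot work, or null when it can.
function pkceBlocker(env: BrowserEnv): string | null {
  if (canRunPkce(env)) return null;
  return "window.crypto.subtle is unavailable; serve the dashboard over HTTPS";
}
```

A check like this would have turned the silent pre-login throw into an explicit message pointing at the plain-http origin.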
---
## #43 — Install errors at DID-creation + password screens (.70); FIPS slow
`.70` is unreachable, so we can't read its logs. Code-side hardening that helps
regardless:
> **Status (2026-06-17): hardening DONE & compiles.** Root cause was a
> non-idempotent `seed.generate` that overwrote node keys under the client's
> retry storm on slow first boot. Fixed: idempotent generate + retry-safe
> verify (`seed_rpc.rs`), transient-vs-genuine error handling in
> `OnboardingSeedGenerate/Verify.vue`, and a non-blocking FIPS status on
> `OnboardingDone.vue`. Issue closed; full closure wants a fresh install on a
> reachable node + re-test on .70.
1. **Onboarding error surfacing** — in the seed/DID + password onboarding views
(`OnboardingSeed*`, the password step) and their RPC handlers
(`seed.generate` / `seed.verify` / `auth.setup`), make a *successful*
operation never show an error toast, and make genuinely-failed ops show the
real message + a retry — so cosmetic errors (op actually succeeded) stop
alarming users. Audit the promise/catch paths for races where a slow backend
resolves after a timeout fires.
2. **FIPS start delay** — confirm `spawn_post_onboarding_fips_activate`
(`api/rpc/seed_rpc.rs`) isn't blocking onboarding; it already runs detached.
Consider surfacing "FIPS starting…" status instead of letting it look stuck.
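The idempotency fix named in the status note can be illustrated with a short sketch: a second `generate` call returns the existing seed instead of overwriting it, so a client retry storm cannot rotate node keys. This is TypeScript for illustration and not the code in `seed_rpc.rs`; `SeedService`, `SeedState`, and the injected `makeSeed` are hypothetical.

```typescript
interface SeedState {
  seed: string;
}

class SeedService {
  private state: SeedState | null = null;

  // makeSeed is injected so the sketch stays deterministic and testable; a
  // real implementation would derive the seed from a secure RNG.
  constructor(private readonly makeSeed: () => string) {}

  // Idempotent: only the first call creates a seed. Retries get the same one.
  generate(): SeedState {
    if (this.state === null) {
      this.state = { seed: this.makeSeed() };
    }
    return this.state;
  }

  // Retry-safe verify: comparing against stored state has no side effects,
  // so calling it repeatedly is harmless.
  verify(candidate: string): boolean {
    return this.state !== null && this.state.seed === candidate;
  }
}
```

With this shape, a slow first boot that makes the client retry `seed.generate` several times still ends with one seed, and the UI can treat the later responses as successes rather than errors.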
**Verify**: a fresh ISO install on a reachable node (.198 or a scratch box),
watch the DID + password screens; then re-test on .70 once reachable.
**Effort**: small to medium (the hardening); full closure needs a repro node.