fix(federation,cloud): dedup trusted nodes + chat contacts by onion; guard cloud my-folders (B1,B2,B4)
B1/B2: the same physical node can linger in the federation list under two dids (e.g. after a did/key change). An onion is a node's unique stable identity, so two entries with the same onion are one node. This showed the node twice in the trusted-node list (B1) and as two mesh chat contacts — one by name+logo, one by raw did (B2).

- storage::load_nodes now collapses same-onion entries (keep first, merge fips_npub/name/last_state) so every consumer (list + chat seed + sync) sees one entry per node.
- federation::sync merge_transitive_peers also matches by onion (not just did) so new transitive hints don't re-add a known node under a new did.
- mesh::seed_federation_peers_into_mesh skips already-seeded onions (belt and suspenders).
- Unit tests for dedup_nodes_by_onion (collapse + onion-suffix handling).

B4: filebrowser-client.listDirectory only checked res.ok before res.json(), so when File Browser is absent (nginx serves the SPA index.html, 200) or down (502) the JSON parse threw the opaque "Unexpected token '<'". Now it checks the content-type and throws a friendly "File Browser is not available" the Cloud view already renders as an empty state.

Verified: dedup unit tests 2/2; live .198 (15 entries→13 distinct onions) restarted healthy on new binary; B4 guard present in built bundle + deployed.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
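The three Rust call sites in this commit each normalize the onion with the same `trim_end_matches(".onion")` expression. As a minimal sketch of what that normalization does and does not cover, the helpers below wrap it in one place. The names `onion_key` and `same_node` are hypothetical and do not appear in the commit; only the `trim_end_matches` call and the "empty onion never matches" rule are taken from the diff.

```rust
/// Hypothetical helper (not in the commit): the key used to decide whether
/// two federation entries are the same physical node.
fn onion_key(onion: &str) -> &str {
    // Strips the ".onion" suffix. `trim_end_matches` removes every trailing
    // repetition of the pattern and compares case-sensitively.
    onion.trim_end_matches(".onion")
}

/// Hypothetical helper: two entries are the same node when their keys match.
/// An empty key never matches, mirroring the commit's handling of entries
/// that have no onion.
fn same_node(a: &str, b: &str) -> bool {
    let (ka, kb) = (onion_key(a), onion_key(b));
    !ka.is_empty() && ka == kb
}
```

Under this sketch `same_node("abc", "abc.onion")` holds, while an upper-case suffix such as `"abc.ONION"` is left untouched and would not match `"abc"`. Whether stored onions can differ in case is not stated in the commit.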
parent 1db720af13
commit ed4931064b
@@ -58,7 +58,43 @@ pub async fn load_nodes(data_dir: &Path) -> Result<Vec<FederatedNode>> {
         .await
         .context("Failed to read federation nodes")?;
     let file: NodesFile = serde_json::from_str(&content).unwrap_or_default();
-    Ok(file.nodes)
+    Ok(dedup_nodes_by_onion(file.nodes))
+}
+
+/// Collapse entries that share an onion. An onion is a node's stable, unique
+/// network identity, so two entries with the same onion are the SAME physical
+/// node lingering under two dids (e.g. after a did/key change). Returning both
+/// duplicates the node in the trusted-node list (B1) and the chat list (B2).
+/// Keep the first occurrence and merge any missing fips_npub/name/last_state
+/// from the duplicates into it, then drop them. Non-destructive to disk; the
+/// deduped list persists the next time nodes are saved (add/sync).
+fn dedup_nodes_by_onion(nodes: Vec<FederatedNode>) -> Vec<FederatedNode> {
+    use std::collections::HashMap;
+    let mut by_onion: HashMap<String, usize> = HashMap::new();
+    let mut out: Vec<FederatedNode> = Vec::with_capacity(nodes.len());
+    for node in nodes {
+        let key = node.onion.trim_end_matches(".onion").to_string();
+        if key.is_empty() {
+            out.push(node);
+            continue;
+        }
+        if let Some(&idx) = by_onion.get(&key) {
+            let kept = &mut out[idx];
+            if kept.fips_npub.is_none() {
+                kept.fips_npub = node.fips_npub;
+            }
+            if kept.name.is_none() {
+                kept.name = node.name;
+            }
+            if kept.last_state.is_none() {
+                kept.last_state = node.last_state;
+            }
+            continue;
+        }
+        by_onion.insert(key, out.len());
+        out.push(node);
+    }
+    out
 }

 /// Look up a federated peer's FIPS npub given their onion address.
@@ -314,6 +350,40 @@ mod tests {
         }
     }

+    #[test]
+    fn test_dedup_nodes_by_onion_collapses_same_onion() {
+        // Two entries share an onion (same physical node under two dids) — must
+        // collapse to one, keeping the first did and merging fips_npub/name (B1/B2).
+        let mut dup = make_node("did:key:zDUP", "shared.onion");
+        dup.fips_npub = Some("npub1merged".to_string());
+        dup.name = Some("Sapien".to_string());
+        let nodes = vec![
+            make_node("did:key:zKEEP", "shared.onion"),
+            dup,
+            make_node("did:key:zOTHER", "other.onion"),
+        ];
+        let out = dedup_nodes_by_onion(nodes);
+        assert_eq!(out.len(), 2, "two distinct onions remain");
+        let kept = out.iter().find(|n| n.onion == "shared.onion").unwrap();
+        assert_eq!(kept.did, "did:key:zKEEP", "keeps first did for the onion");
+        assert_eq!(
+            kept.fips_npub.as_deref(),
+            Some("npub1merged"),
+            "merges fips_npub from the dropped duplicate"
+        );
+        assert_eq!(kept.name.as_deref(), Some("Sapien"), "merges name from the dup");
+    }
+
+    #[test]
+    fn test_dedup_onion_suffix_insensitive() {
+        // The ".onion" suffix must not affect the match.
+        let nodes = vec![
+            make_node("did:key:z1", "abc"),
+            make_node("did:key:z2", "abc.onion"),
+        ];
+        assert_eq!(dedup_nodes_by_onion(nodes).len(), 1);
+    }
+
     #[tokio::test]
     async fn test_load_nodes_empty_when_no_file() {
         let dir = tempfile::tempdir().unwrap();
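The merge step in `dedup_nodes_by_onion` fills a field on the kept entry only when that field is missing. The same "keep mine, else take theirs" rule can be written with `Option::or`, sketched here on a stand-in struct. `Node` and `merge_missing` are hypothetical names used for illustration; the real `FederatedNode` has more fields (for example `last_state`, whose type is not shown in this diff).

```rust
/// Stand-in for FederatedNode with only the fields needed for the example.
/// This is an assumption, not the real type.
#[derive(Debug, Clone, PartialEq)]
struct Node {
    did: String,
    fips_npub: Option<String>,
    name: Option<String>,
}

/// Fill fields that are `None` on `kept` from `dup`; never overwrite a value
/// that `kept` already has, and never change `kept.did`.
fn merge_missing(kept: &mut Node, dup: Node) {
    // `take()` moves the current value out so `or` can choose between the
    // kept value and the duplicate's value without cloning.
    kept.fips_npub = kept.fips_npub.take().or(dup.fips_npub);
    kept.name = kept.name.take().or(dup.name);
}
```

This behaves like the explicit `is_none()` checks in the diff; it is a matter of style, not a change in semantics.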
@@ -145,6 +145,27 @@ async fn merge_transitive_peers(
             }
             continue;
         }
+        // Same physical node advertised under a DIFFERENT did? Match on the
+        // onion (its stable network identity). Without this, a node that
+        // appears under two dids (e.g. after a key/did change) gets added
+        // twice — showing up duplicated in the trusted-node list (B1) and as
+        // two separate mesh chat contacts (B2). Merge into the existing entry.
+        let hint_onion = hint.onion.trim_end_matches(".onion");
+        if !hint_onion.is_empty() {
+            if let Some(existing) = nodes
+                .iter_mut()
+                .find(|n| n.onion.trim_end_matches(".onion") == hint_onion)
+            {
+                if existing.fips_npub.is_none() && hint.fips_npub.is_some() {
+                    existing.fips_npub = hint.fips_npub.clone();
+                }
+                if existing.name.is_none() && hint.name.is_some() {
+                    existing.name = hint.name.clone();
+                }
+                refreshed += 1;
+                continue;
+            }
+        }
         nodes.push(FederatedNode {
             did: hint.did.clone(),
             pubkey: hint.pubkey.clone(),
@@ -99,7 +99,17 @@ pub(crate) async fn seed_federation_peers_into_mesh(
         Ok(n) => n,
         Err(_) => return,
     };
+    // Skip nodes whose onion we've already seeded: the same physical node can
+    // linger in the federation list under two dids (see B1/B2). Seeding both
+    // would create two chat contacts for one node — one by name+logo and one
+    // by raw did. One onion → one mesh contact.
+    let mut seen_onions = std::collections::HashSet::new();
     for node in nodes {
+        let onion_key = node.onion.trim_end_matches(".onion").to_string();
+        if !onion_key.is_empty() && !seen_onions.insert(onion_key) {
+            tracing::debug!(did = %node.did, onion = %node.onion, "skipping duplicate federation node (onion already seeded)");
+            continue;
+        }
         upsert_federation_peer(state, &node.pubkey, &node.did, node.name.as_deref()).await;
     }
 }
@@ -102,7 +102,15 @@ class FileBrowserClient {
     const res = await fetch(`${this.baseUrl}/api/resources${safePath}`, {
       headers: this.headers(),
     })
-    if (!res.ok) throw new Error(`Failed to list directory: ${res.status}`)
+    if (!res.ok) throw new Error(`File Browser is not available (HTTP ${res.status})`)
+    // When File Browser isn't installed, nginx falls through to the SPA and
+    // returns index.html (200, text/html); when it's down it returns 502.
+    // Either way res.json() would throw the opaque "Unexpected token '<'"
+    // error, so detect a non-JSON body and surface a friendly message instead.
+    const contentType = res.headers.get('content-type') || ''
+    if (!contentType.includes('application/json')) {
+      throw new Error('File Browser is not available — install or start the File Browser app to use your folders')
+    }
     const data: FileBrowserListResponse = await res.json()
     return (data.items || []).map((item) => ({
       ...item,
@@ -47,19 +47,19 @@ Two distinct root causes (confirmed live):

 ## 🔴 PRIORITY — cloud / federation / mesh

-### B1 — Trusted-node list not clean — TODO
+### B1 — Trusted-node list not clean — PASSED (onion-dedup; unit test 2/2; live .198 15→13 distinct, healthy). UI visual-confirm recommended.
 Dupes, erroneous names, and non-convergent group membership across nodes. Expected: trusted nodes form a transitive group (every node connects to any newly-added trusted node; all nodes show the same set). `.103` has a long/dirty list.

-### B2 — Duplicate chat contact for one node — TODO
+### B2 — Duplicate chat contact for one node — PASSED (resolved by load-dedup feeding mesh seed; unit-tested). UI visual-confirm recommended.
 Federated peer "sapien" shows TWO chats: one "sapien" WITHOUT archy logo (looks non-federated) + one named by raw DID `did:key:z6MkoSbN5CM7fBaQg2nWbCymEkFXsHnuXvec9Mjo5RtJf9dQ`. Same node keyed by both federated identity and raw DID → merge to one. Code: core/archipelago/src/mesh + mesh/typed_messages.rs (note :233 — meshcore adverts don't carry archy pubkey).

-### B3 — Cloud peer media won't preview/play — TODO
+### B3 — Cloud peer media won't preview/play — ROOT-CAUSED (plan ready: streaming proxy endpoint)
 Music/video preview files on peer nodes' cloud don't play (streaming/range/content-type over mesh+Tor peer fetch).

-### B4 — Cloud "my folders" fails (JSON parse / 502) — TODO
+### B4 — Cloud "my folders" fails (JSON parse / 502) — PASSED (content-type guard; built, guard in bundle, deployed .198). UI visual-confirm recommended.
 `Unexpected token '<', "<!doctype"` when FileBrowser absent (`/app/filebrowser/api/resources` → SPA index.html), and **502** when FileBrowser is down (seen on .103). filebrowser-client.ts:102/:106. Fix: detect FileBrowser unavailable, friendly prompt; consider nginx returning JSON 404/502 for missing `/app/<app>/` instead of SPA shell. Handle BOTH absent + down.

-### B14 — Trusted/peer cloud browse uses Tor not FIPS — TODO (priority)
+### B14 — Trusted/peer cloud browse uses Tor not FIPS — ROOT-CAUSED (plan ready: record_peer_transport in 4 handlers; VERIFY actual transport)
 Browsing trusted/peer nodes in the Cloud tab connects over Tor instead of FIPS (should prefer FIPS like the rest of mesh; same for peer browsing). cf project_fips_integration, project_tor_node_to_node_works (last_transport should be fips/mesh).

 ---
@@ -102,6 +102,9 @@ Many apps install but immediately stop, requiring a manual Start — or become u
 ### B19 — Failed download-update lands on Install button (should be Download) — TODO
 When an update download fails, the UI sometimes shows the Install button instead of returning to the Download button — a big UX issue (user can't retry the download cleanly). Check the SystemUpdate state machine's error/failure transition.

+### B20 — Surface bitcoin-headers-over-mesh broadcast (send/receive toggles) — TODO (feature-adjacent, surfacing existing work)
+We previously broadcast bitcoin block headers over mesh to archipelago nodes but never fully surfaced it. Want two switches: "send headers" (you broadcast) and "receive headers" (you accept). NOTE: this is feature-adjacent — surfacing existing functionality; the user added it during the no-new-features push, so treat as low-priority polish until the bug list is clear. Code: mesh block-headers (mesh.block-headers RPC seen in logs; core/archipelago/src/mesh).
+
 ### B8 — netbird app doesn't work — TODO (LOW / much later)

 (RETRACTED: CryptPad placeholder-icon — user says cryptpad is fine.)
@@ -122,5 +125,20 @@ When an update download fails, the UI sometimes shows the Install button instead
 ## Gitea issue mapping (vps2 lfg2025/archy)
 All backlog bugs now mirrored as Gitea issues: B1→#8, B2→#9, B3→#10, B4→#11, B5→#12, B6→#13, B7→#14, B8→#15, B9→#16, B10→#17, B11→#18, B12→#19, B13→#20, B14→#21, B15→#22, B16→#23, B17→#24, B18→#25, B19→#26. (Pre-existing G#1–7 remain; some overlap, e.g. G#1 strange-peer ≈ B1.) Close the Gitea issue when a bug is verified+shipped.

+## INVESTIGATION FINDINGS 2026-06-15 (B1/B2/B3/B4/B14) — cutoff insurance
+
+**B1 trusted-node divergence** — ROOT-CAUSED. `federation/sync.rs` `merge_transitive_peers()` (~:140) dedupes ONLY by DID; the SAME physical node appears under multiple DIDs (same `onion` + `fips_npub`) → duplicate entries ("Arch Dev" ×2, "Sapien" ×2). No background convergence → lists diverge (.103=16 nodes, .116/.198=15). Model: `federation/types.rs:24` FederatedNode (PK=did); storage `federation/storage.rs` nodes.json; add_node dedupes by DID only (:125). FIX: in merge_transitive_peers add a SECOND match arm — if no DID match, match by normalized `onion` (trim .onion); if found, treat as same node (merge fips_npub/name, don't add). Same dedup on add_node. Plus a one-time cleanup of existing dup DIDs (remove-node the stale one). TEST: after sync, all 3 nodes have identical node set, no two entries share an onion.
+
+**B2 duplicate chat contact** — ROOT-CAUSED (same root as B1). Two federation DIDs (same onion/fips_npub, e.g. "Sapien" dids z6MkoSbN… + z6MkeYMU…) get seeded as TWO mesh contacts: `mesh/mod.rs` `seed_federation_peers_into_mesh()` (~:94) upserts per-pubkey contact_id; frontend `Mesh.vue` `mergeKeyForPeer()` (~:492) keys by DID so two DIDs = two rows. FIX: (backend) in seed, skip a node whose onion was already seeded (HashSet of onions); (frontend) Mesh.vue merge by onion when DIDs differ but onion matches. Fixing B1's onion-dedup largely resolves this too. TEST: one "Sapien" row; `mesh.peers` has one contact for the shared onion.
+
+**B3 peer media won't play** — ROOT-CAUSED. `PeerFiles.vue` `playMedia()`/`loadPreview()` (~:358,:508) fetch the WHOLE file via RPC `content.preview-peer`/`content.download-peer` (`api/rpc/content.rs` :393,:213) which base64-encodes the entire file; frontend makes a Blob URL → browser can't Range-seek → video/large-audio won't play (+ 30/120s timeouts truncate big files). The peer's HTTP `/content/<id>` handler (`api/handler/content.rs` :49) ALREADY supports Range/206 + Accept-Ranges. FIX (bigger): add a local streaming proxy endpoint `/api/peer-content/{onion}/{id}` in `api/handler/mod.rs` that forwards the browser's Range header to the peer's `/content/<id>` (via fips::dial PeerRequest) and streams back 206 + Content-Range + Content-Type; frontend sets `<video>/<audio>` src to that URL (not a blob). TEST: curl Range on the new endpoint → 206 + Content-Range; video seeks/plays.
+
+**B4 cloud my-folders <!doctype/502** — ROOT-CAUSED. `filebrowser-client.ts` `listDirectory()` (:99) does `res.json()` (:106) after only an `res.ok` check; when FileBrowser is ABSENT nginx serves SPA index.html (200, '<!doctype') → JSON crash; when DOWN → 502. FIX (frontend, low-risk): guard res content-type !== application/json → throw typed "FileBrowser unavailable" handled by Cloud.vue/CloudFolder.vue empty-state; same guard in login() (:71) + getUsage() (:215). OPTIONAL nginx: add `error_page 502 503 = @filebrowser_unavailable` returning JSON in the /app/filebrowser/ block (image-recipe/configs/nginx-archipelago.conf ~:411). TEST: stop filebrowser on .116/.198 → Cloud shows friendly state, no doctype crash.
+
+**B14 cloud browse Tor-not-FIPS** — ROOT-CAUSED (nuance). FIPS-first logic WORKS (`fips/dial.rs` send_get :331 tries FIPS, falls back to Tor on 404/5xx; v1.7.94 fix). BUT the 4 content handlers in `api/rpc/content.rs` (browse :297, download :237, download_paid :356, preview :421) capture `_transport` and NEVER call `record_peer_transport()` → UI badge shows Tor/null even when FIPS used. FIX: add `record_peer_transport(data_dir, None, Some(onion), &transport.to_string())` after each successful send_get (storage.rs:84 has the fn). ⚠️ VERIFY on nodes whether FIPS is ACTUALLY used or genuinely falling back to Tor (if genuinely Tor, deeper FIPS-reachability issue beyond recording). TEST: after browse, last_transport = fips (when peer FIPS-reachable).
+
 ## Progress log
-- 2026-06-15: tracker created. v1.7.96-alpha shipped. B5 (LND CORS) root-caused → fixed in code → fix (b) verified on .116 (harness 4/4). All 19 bugs filed as Gitea issues #8–#26. vps2 feature issues (G#3/5/6) deferred (no new features). Next: deploy to .103 to verify fix (a) (nginx dup strip).
+- 2026-06-15: tracker created. v1.7.96-alpha shipped. All 19 bugs filed as Gitea issues #8–#26. vps2 feature issues (G#3/5/6) deferred (no new features).
+- 2026-06-15: **B5 (LND CORS) ✅ DONE** — root-caused, both fixes implemented, verified on .116/.198/.103 (harness 4/4 each), committed `1db720af`, pushed to vps2 main. Will bundle into .97 (Gitea #12 to close on .97 ship).
+- Validation nodes: .116 + .198 (pw ThisIsWeb54321@). Runtime is podman (docker not in non-interactive PATH). Sideload binary → /usr/local/bin/archipelago + restart (containers survive on these nodes).
+- **NEXT: B1–B4 + B14 (cloud/federation/mesh priority block).**
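The B3 finding in the tracker plans a local proxy that forwards the browser's Range header to the peer and relays the peer's 206 response headers back. A minimal sketch of just the header-selection part follows, using plain `(name, value)` pairs and no HTTP framework. The two allow-lists and the function name `pick_headers` are assumptions for illustration; the tracker names only Range, Content-Range, Content-Type and Accept-Ranges, and does not show how `fips::dial` represents headers.

```rust
/// Request headers passed from the browser to the peer (assumed list).
const FORWARD_TO_PEER: [&str; 1] = ["range"];

/// Response headers relayed from the peer back to the browser (assumed list;
/// Content-Length is added here as an assumption, the tracker does not name it).
const RELAY_TO_BROWSER: [&str; 4] =
    ["content-range", "content-type", "accept-ranges", "content-length"];

/// Keep only the headers whose name is on the allow-list. HTTP header names
/// are case-insensitive, so the comparison ignores ASCII case.
fn pick_headers(headers: &[(String, String)], allowed: &[&str]) -> Vec<(String, String)> {
    headers
        .iter()
        .filter(|(name, _)| allowed.iter().any(|&a| name.eq_ignore_ascii_case(a)))
        .cloned()
        .collect()
}
```

A proxy handler would call `pick_headers(&browser_headers, &FORWARD_TO_PEER)` before dialing the peer and `pick_headers(&peer_headers, &RELAY_TO_BROWSER)` before answering the browser, passing the peer's status code (200 or 206) through unchanged. An allow-list keeps cookies and other browser headers from leaking to the peer.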