Mesh/federation messages between co-located nodes always fell back to Tor because the FIPS overlay had no direct peering: every node depended on the global anchor's spanning tree, so whenever that anchor link flapped, the node was isolated and all FIPS dials timed out. (Diagnosed live on .116/.198: pure-FIPS direct peering over UDP 8668 fixes it, 2.5ms vs timeout.)

Generalize the manual fix: in the existing 5-min FIPS seed-anchor apply loop, also auto-connect every federation peer for which the PeerRegistry knows both a LAN address AND a FIPS npub, dialing its FIPS UDP transport (port 8668) at its LAN IP via the same idempotent `fipsctl connect` path (new anchors::lan_fips_anchors).

This is FIPS's own transport over the LAN, NOT Tailscale and NOT the HTTP/LAN messaging port. The entries are transient (recomputed each tick from live mDNS discovery, never persisted), so changing IPs self-correct. Remote peers with no LAN address are untouched (still routed via the anchor).

Registry Arc hoisted out of the transport-init block so the loop can read all_peers(). cargo check green.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
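The address derivation described above (reuse only the IP of the peer's LAN address and target FIPS UDP port 8668) can be sketched as follows. This is a minimal illustration using only the standard library; the helper name `fips_udp_addr` is hypothetical and does not exist in the module, which does the same thing inline in `lan_fips_anchors`.

```rust
use std::net::SocketAddr;

/// FIPS UDP transport port, as stated in the commit message.
const FIPS_UDP_PORT: u16 = 8668;

/// Hypothetical helper (an assumption for illustration, not the module's API):
/// take a peer's LAN socket address ("ip:port" of its HTTP/LAN endpoint),
/// keep only the IP, and return the address of its FIPS UDP transport.
/// Returns `None` when the input is not a valid IP-literal socket address.
fn fips_udp_addr(lan_address: &str) -> Option<String> {
    let sa: SocketAddr = lan_address.parse().ok()?;
    // `SocketAddr`'s `Display` impl brackets IPv6 addresses, so the result
    // is directly usable as a dial target for either IP family.
    Some(SocketAddr::new(sa.ip(), FIPS_UDP_PORT).to_string())
}
```

Because `SocketAddr` parsing accepts only IP literals, an address such as `example.test:8080` or a bare IP without a port yields `None`, which corresponds to the peer being skipped rather than dialed.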
338 lines
13 KiB
Rust
//! Seed-anchor management for FIPS bootstrap.
//!
//! A freshly-installed node can't reach the global mesh via npub
//! routing until it's connected to at least one peer that's already in
//! the DHT. Upstream `fips` solves this by dialing a public anchor
//! (e.g. `fips.v0l.io`) on first start. That's a single point of
//! failure and doesn't help nodes behind restrictive firewalls or
//! intermittent networks: archipelago operators reported fresh
//! installs failing to reach any public anchor.
//!
//! This module adds a local, operator-editable seed-anchor list. Each
//! entry is a `{npub, address, transport}` triple that archipelago
//! pushes into the running daemon via `fipsctl connect` on startup and
//! periodically thereafter. If one anchor falls over, the next one
//! seeds the DHT instead. A well-configured cluster (e.g. a VPS
//! running fips in anchor mode + a couple of home nodes) stops
//! depending on the global anchor entirely.
//!
//! The list is persisted at `<data_dir>/seed-anchors.json`. The
//! archipelago service user owns that directory, so no sudo is needed
//! to read or write it.

use anyhow::{Context, Result};
use serde::{Deserialize, Serialize};
use std::path::{Path, PathBuf};
use tokio::process::Command;

/// On-disk filename under `data_dir/`.
const SEED_ANCHORS_FILE: &str = "seed-anchors.json";

/// Public anchor (`fips.v0l.io`) carried as a default seed for every
/// node: it bootstraps DHT routing so a fresh node isn't isolated.
/// Operators can remove it from the UI once their own cluster has
/// independent anchors (removal persists, see `load`/`remove`).
///
/// IMPORTANT transport details, learned the hard way (see git history /
/// the 2026-06-15 debugging on .116):
/// - The anchor answers ONLY on **TCP port 8443**. UDP 8668 is dead
///   (host pings on both IP families but never completes a UDP FIPS
///   handshake). `fips/config.rs` always knew this; the old default
///   here (`fips.v0l.io:8668`/udp) silently never connected fleet-wide.
/// - We use the **IPv4 literal** rather than the `fips.v0l.io` hostname
///   on purpose: the hostname resolves IPv6-first, but the daemon binds
///   its transports IPv4-only (`0.0.0.0:8443`), so a v6 target makes the
///   daemon fail to send the handshake with `EAFNOSUPPORT (os error 97)`.
///   An IPv4 literal sidesteps the resolver entirely.
pub const DEFAULT_PUBLIC_ANCHOR_NPUB: &str =
    "npub1zv58cn7v83mxvttl70w5fwjwuclfmntv9cnmv5wmz2nzz88u5urqvdx96n";
pub const DEFAULT_PUBLIC_ANCHOR_ADDR: &str = "185.18.221.160:8443";
pub const DEFAULT_PUBLIC_ANCHOR_TRANSPORT: &str = "tcp";

/// The default public anchor as a ready-to-apply `SeedAnchor`. Carried
/// implicitly by `load()` on nodes that have never edited their anchor
/// list, so every node dials it without operator action.
pub fn default_public_anchor() -> SeedAnchor {
    SeedAnchor {
        npub: DEFAULT_PUBLIC_ANCHOR_NPUB.to_string(),
        address: DEFAULT_PUBLIC_ANCHOR_ADDR.to_string(),
        transport: DEFAULT_PUBLIC_ANCHOR_TRANSPORT.to_string(),
        label: "Public anchor (fips.v0l.io)".to_string(),
    }
}

/// One seed-anchor entry. `address` must be directly dialable (IP or
/// resolvable hostname, plus the port of the chosen transport);
/// `transport` is one of "udp", "tcp", "tor", "ethernet" (the values
/// upstream `fipsctl connect` accepts).
#[derive(Debug, Clone, Serialize, Deserialize, PartialEq, Eq)]
pub struct SeedAnchor {
    /// Bech32 `npub1...` of the anchor's FIPS identity.
    pub npub: String,
    /// Directly-dialable transport address, e.g. `192.168.1.116:8668`.
    pub address: String,
    /// Transport to use. Defaults to `"udp"` when omitted from the JSON;
    /// the default public anchor is the notable exception (`"tcp"`).
    #[serde(default = "default_transport")]
    pub transport: String,
    /// Human-readable note shown in the UI (e.g. "Home anchor", "VPS").
    #[serde(default)]
    pub label: String,
}

/// Serde default for `SeedAnchor::transport`.
fn default_transport() -> String {
    "udp".to_string()
}

/// Full path of the persisted anchor list under `data_dir`.
fn anchors_path(data_dir: &Path) -> PathBuf {
    data_dir.join(SEED_ANCHORS_FILE)
}

/// Load the seed-anchor list. A node that has never edited its anchor
/// list (no file yet) gets the default public anchor so it can bootstrap
/// the mesh out of the box. Once the operator edits anchors (including
/// removing the default), a file exists and is authoritative, so removal
/// persists and we never silently re-add it.
pub async fn load(data_dir: &Path) -> Result<Vec<SeedAnchor>> {
    let path = anchors_path(data_dir);
    if !path.exists() {
        return Ok(vec![default_public_anchor()]);
    }
    let bytes = tokio::fs::read(&path)
        .await
        .with_context(|| format!("read {}", path.display()))?;
    let anchors: Vec<SeedAnchor> =
        serde_json::from_slice(&bytes).with_context(|| format!("parse {}", path.display()))?;
    Ok(anchors)
}

/// Persist the list. Overwrites atomically via write-then-rename so a
/// crashed archipelago never leaves a half-written config.
pub async fn save(data_dir: &Path, anchors: &[SeedAnchor]) -> Result<()> {
    tokio::fs::create_dir_all(data_dir)
        .await
        .with_context(|| format!("mkdir -p {}", data_dir.display()))?;
    let path = anchors_path(data_dir);
    let tmp = path.with_extension("json.tmp");
    let json = serde_json::to_vec_pretty(anchors).context("serialize seed anchors")?;
    tokio::fs::write(&tmp, json)
        .await
        .with_context(|| format!("write {}", tmp.display()))?;
    tokio::fs::rename(&tmp, &path)
        .await
        .with_context(|| format!("rename {} -> {}", tmp.display(), path.display()))?;
    Ok(())
}

/// Add (or update) one anchor, keyed by npub. Returns the resulting list.
pub async fn add(data_dir: &Path, anchor: SeedAnchor) -> Result<Vec<SeedAnchor>> {
    let mut list = load(data_dir).await?;
    if let Some(existing) = list.iter_mut().find(|a| a.npub == anchor.npub) {
        *existing = anchor;
    } else {
        list.push(anchor);
    }
    save(data_dir, &list).await?;
    Ok(list)
}

/// Remove an anchor by npub. Returns the resulting list.
pub async fn remove(data_dir: &Path, npub: &str) -> Result<Vec<SeedAnchor>> {
    let mut list = load(data_dir).await?;
    list.retain(|a| a.npub != npub);
    save(data_dir, &list).await?;
    Ok(list)
}

/// Apply the seed anchors to the running FIPS daemon. For each entry,
/// asks `fipsctl connect` to dial the peer. Errors are logged but don't
/// fail the whole operation: a single unreachable anchor shouldn't
/// block the others.
///
/// `fipsctl connect` is idempotent-ish: calling it for an already-
/// connected peer is a no-op at the protocol layer, so re-applying on
/// a timer is safe. Returns a list of per-anchor results for logging.
///
/// Invoked through `sudo -n`: the upstream daemon's control socket
/// (`/run/fips/control.sock`) is owned `root:fips` 0660, and the
/// archipelago service user is not in the `fips` group, so a bare
/// `fipsctl connect` fails with EACCES. This matches the privileged
/// `sudo -n fipsctl show peers` call in `service::peer_connectivity_summary`.
/// Without it, seed anchors persist to disk but never actually dial,
/// leaving `anchor_connected=false` and every peer dial falling back to
/// a slow Tor timeout.
pub async fn apply(anchors: &[SeedAnchor]) -> Vec<ApplyResult> {
    let mut results = Vec::with_capacity(anchors.len());
    for anchor in anchors {
        let out = Command::new("sudo")
            .args([
                "-n",
                "fipsctl",
                "connect",
                &anchor.npub,
                &anchor.address,
                &anchor.transport,
            ])
            .output()
            .await;
        let result = match out {
            Ok(o) if o.status.success() => ApplyResult {
                npub: anchor.npub.clone(),
                ok: true,
                message: String::from_utf8_lossy(&o.stdout).trim().to_string(),
            },
            Ok(o) => ApplyResult {
                npub: anchor.npub.clone(),
                ok: false,
                message: format!(
                    "sudo fipsctl connect exited {}: {}",
                    o.status,
                    String::from_utf8_lossy(&o.stderr).trim()
                ),
            },
            Err(e) => ApplyResult {
                npub: anchor.npub.clone(),
                ok: false,
                message: format!("sudo fipsctl launch failed: {}", e),
            },
        };
        if result.ok {
            tracing::debug!(npub = %result.npub, "Seed anchor applied");
        } else {
            tracing::warn!(
                npub = %result.npub,
                message = %result.message,
                "Seed anchor apply failed (non-fatal)"
            );
        }
        results.push(result);
    }
    results
}

/// Outcome of a single `fipsctl connect` call.
#[derive(Debug, Clone)]
pub struct ApplyResult {
    pub npub: String,
    pub ok: bool,
    pub message: String,
}

/// FIPS UDP transport port (matches `transports.udp.bind_addr` in the generated
/// `fips.yaml`). Direct peer links dial this, NOT the HTTP/LAN messaging port.
const FIPS_UDP_PORT: u16 = 8668;

/// Build transient seed-anchor entries that dial LAN-discovered federation peers
/// directly over their FIPS UDP transport. For each peer for which the registry
/// knows both a LAN socket address AND a FIPS npub, point a `udp` anchor at
/// `<lan-ip>:8668`. This lets co-located federation nodes form a DIRECT FIPS link
/// instead of depending on the global anchor's spanning tree to route between
/// them (the cause of every dial falling back to Tor when the anchor link flaps).
///
/// This is FIPS's own UDP transport over the LAN, not Tailscale and not the LAN
/// HTTP messaging port. NOT persisted to `seed-anchors.json`: recomputed each
/// apply tick from live LAN discovery, so a peer's changing IP self-corrects and
/// stale entries never accumulate. `fipsctl connect` is idempotent, so
/// re-applying just keeps the link warm.
pub fn lan_fips_anchors(peers: &[crate::transport::PeerRecord]) -> Vec<SeedAnchor> {
    let mut out = Vec::new();
    for p in peers {
        let (Some(lan), Some(npub)) = (p.lan_address.as_deref(), p.fips_npub.as_deref()) else {
            continue;
        };
        // lan_address is the peer's HTTP/LAN socket ("ip:port"); reuse only its IP
        // and target the FIPS UDP port. SocketAddr::new(...).to_string() formats
        // IPv6 with brackets correctly.
        let Ok(sa) = lan.parse::<std::net::SocketAddr>() else {
            continue;
        };
        out.push(SeedAnchor {
            npub: npub.to_string(),
            address: std::net::SocketAddr::new(sa.ip(), FIPS_UDP_PORT).to_string(),
            transport: "udp".to_string(),
            label: "LAN federation peer (direct FIPS)".to_string(),
        });
    }
    out
}

#[cfg(test)]
mod tests {
    use super::*;

    fn mk(npub: &str) -> SeedAnchor {
        SeedAnchor {
            npub: npub.to_string(),
            address: "example.test:8668".to_string(),
            transport: "udp".to_string(),
            label: "test".to_string(),
        }
    }

    #[tokio::test]
    async fn load_missing_seeds_default_public_anchor() {
        // A node that has never edited its anchor list should still get
        // the public anchor so it can bootstrap the mesh out of the box.
        let dir = tempfile::tempdir().unwrap();
        let got = load(dir.path()).await.unwrap();
        assert_eq!(got, vec![default_public_anchor()]);
        // ...and the default must be the TCP/8443 form, not the dead udp:8668.
        assert_eq!(got[0].transport, "tcp");
        assert!(got[0].address.ends_with(":8443"));
    }

    #[tokio::test]
    async fn removing_default_persists_as_empty() {
        // Once the operator removes the default, a file exists and is
        // authoritative: we must not silently re-seed it on next load.
        let dir = tempfile::tempdir().unwrap();
        let list = remove(dir.path(), DEFAULT_PUBLIC_ANCHOR_NPUB)
            .await
            .unwrap();
        assert!(list.is_empty());
        let got = load(dir.path()).await.unwrap();
        assert!(got.is_empty(), "default must stay removed once edited");
    }

    #[tokio::test]
    async fn save_and_load_roundtrip() {
        let dir = tempfile::tempdir().unwrap();
        let a = mk("npub1aaa");
        let b = mk("npub1bbb");
        save(dir.path(), &[a.clone(), b.clone()]).await.unwrap();
        let got = load(dir.path()).await.unwrap();
        assert_eq!(got, vec![a, b]);
    }

    #[tokio::test]
    async fn add_replaces_existing_by_npub() {
        let dir = tempfile::tempdir().unwrap();
        let mut a = mk("npub1aaa");
        save(dir.path(), &[a.clone()]).await.unwrap();
        a.address = "newhost:8668".to_string();
        let list = add(dir.path(), a.clone()).await.unwrap();
        assert_eq!(list.len(), 1);
        assert_eq!(list[0].address, "newhost:8668");
    }

    #[tokio::test]
    async fn remove_by_npub() {
        let dir = tempfile::tempdir().unwrap();
        save(
            dir.path(),
            &[mk("npub1aaa"), mk("npub1bbb"), mk("npub1ccc")],
        )
        .await
        .unwrap();
        let list = remove(dir.path(), "npub1bbb").await.unwrap();
        assert_eq!(list.len(), 2);
        assert!(list.iter().all(|a| a.npub != "npub1bbb"));
    }

    #[test]
    fn seed_anchor_uses_udp_by_default() {
        let json = r#"{"npub":"npub1x","address":"h:8668"}"#;
        let a: SeedAnchor = serde_json::from_str(json).unwrap();
        assert_eq!(a.transport, "udp");
        assert_eq!(a.label, "");
    }
}