archipelago 837cc02812 fix(federation): reliable symmetric auto-federation across LAN/Tor/FIPS
Federated nodes failed to converge to a full mesh across the LAN<->Tailscale
boundary: nodes were invisible to peers, syncs 'took ages' or timed out, and
names only updated on a manual sync. Onions were healthy in both directions
(~3-5s), so the failures were at the application layer.

- B: federation dials fast-fail a dead FIPS path via .fips_timeout(6s) in
  sync_with_peer + notify_join, so the Tor fallback isn't stuck behind the
  full 30s FIPS budget when LAN and remote peers share no FIPS path.
- A: notify_join (peer-joined) now spawns with retries+backoff instead of a
  single awaited best-effort POST, so the join RPC returns instantly (no
  'Request timeout') and the inviter reliably learns the joiner (was
  asymmetric).
- C: new 90s periodic federation auto-sync (none existed) so renamed nodes
  and roster changes propagate without a manual Sync click.
- self-heal: each auto-sync re-asserts membership to any peer that doesn't
  list us back, converging the fleet to full-mesh and healing pre-existing
  asymmetry with no manual re-joins.
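
A minimal sketch of the spawn-with-retries pattern described in item A,
under stated assumptions: it uses std threads and blocking sleeps as a
stand-in for whatever runtime the crate actually uses, and the names
`backoff_delay`, `retry_with_backoff` and `spawn_notify`, as well as the
attempt count and delays, are hypothetical and not taken from this commit.

```rust
use std::thread;
use std::time::Duration;

/// Delay before retry number `attempt` (0-based): base * 2^attempt, capped at `max`.
fn backoff_delay(base: Duration, max: Duration, attempt: u32) -> Duration {
    let factor = 2u32.saturating_pow(attempt);
    base.saturating_mul(factor).min(max)
}

/// Calls `op` at least once and at most `max_attempts` times, sleeping with
/// exponential backoff between failures. Returns the 1-based number of the
/// attempt that succeeded, or the error from the final attempt.
fn retry_with_backoff<E>(
    max_attempts: u32,
    base: Duration,
    max: Duration,
    mut op: impl FnMut() -> Result<(), E>,
) -> Result<u32, E> {
    let mut attempt = 0;
    loop {
        match op() {
            Ok(()) => return Ok(attempt + 1),
            Err(e) if attempt + 1 >= max_attempts => return Err(e),
            Err(_) => {
                thread::sleep(backoff_delay(base, max, attempt));
                attempt += 1;
            }
        }
    }
}

/// Runs the retries on a background thread so the caller (here, a stand-in
/// for the join RPC handler) returns immediately instead of awaiting delivery.
/// The attempt count and delays are illustrative values only.
fn spawn_notify(
    op: impl FnMut() -> Result<(), String> + Send + 'static,
) -> thread::JoinHandle<Result<u32, String>> {
    thread::spawn(move || {
        retry_with_backoff(4, Duration::from_millis(10), Duration::from_millis(100), op)
    })
}
```

The caller can drop the returned handle for fire-and-forget behaviour, or
join it in tests to observe which attempt delivered the notification.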

Validated live across 7 nodes: a previously fleet-invisible node became
fully meshed automatically (logs: 'auto-sync ... reasserted=1',
'peer-joined ... delivered').
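
The self-heal step above can be sketched as a pure roster comparison. This
is an illustration only: the roster map, the `did:*` strings and the
function names `peers_missing_us` and `reassert_membership` are
hypothetical, and the sketch models a peer accepting the re-assertion by
inserting into an in-memory set rather than sending any request.

```rust
use std::collections::{BTreeMap, BTreeSet};

/// Returns the peers (map keys) whose roster does not contain `self_did`,
/// in sorted key order. `rosters` maps each peer to the set of DIDs it lists.
fn peers_missing_us<'a>(
    self_did: &str,
    rosters: &'a BTreeMap<String, BTreeSet<String>>,
) -> Vec<&'a str> {
    rosters
        .iter()
        .filter(|(_, roster)| !roster.contains(self_did))
        .map(|(peer, _)| peer.as_str())
        .collect()
}

/// Models one self-heal pass: adds `self_did` to every roster that lacks it
/// and returns how many rosters changed. A second pass returns 0, which is
/// the convergence property the periodic auto-sync relies on.
fn reassert_membership(
    self_did: &str,
    rosters: &mut BTreeMap<String, BTreeSet<String>>,
) -> usize {
    let mut reasserted = 0;
    for roster in rosters.values_mut() {
        // `insert` returns true only when the value was not already present.
        if roster.insert(self_did.to_string()) {
            reasserted += 1;
        }
    }
    reasserted
}
```

Because the pass is idempotent, running it on every auto-sync tick is safe:
peers that already list the local node are left untouched.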

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
2026-06-19 09:52:26 -04:00

//! Node federation: trusted multi-node clusters with state sync.
//!
//! Nodes federate by exchanging invite codes containing DID + onion address.
//! Trust is bilateral: both sides must agree. Federated nodes periodically
//! sync container status, health metrics, and availability.
mod invites;
pub mod pending;
mod storage;
mod sync;
mod types;
// Re-export all public items so `crate::federation::*` continues to work.
pub use invites::{accept_invite, create_invite};
// Crate-internal: used by the periodic federation auto-sync to re-assert
// membership to peers that don't list us back (asymmetry self-heal).
pub(crate) use invites::notify_join;
#[allow(unused_imports)]
pub use storage::{
    add_node, fips_npub_for_onion, load_nodes, load_removed_dids, record_peer_transport,
    remove_node, save_nodes, set_trust_level, update_node,
};
pub use sync::{build_local_state, deploy_to_peer, sync_with_peer, sync_with_peer_by_did};
pub use types::{AppStatus, FederatedNode, NodeStateSnapshot, TrustLevel};