feat(dht): Phase 2 engine — real iroh-blobs provider behind iroh-swarm

Pulls iroh 1.0 + iroh-blobs 0.103 as OPTIONAL deps under the iroh-swarm
feature and implements a real BlobProvider over them. Verified: the full
iroh QUIC dep tree (260 pkgs) resolves and compiles against the pinned
bitcoin/nostr-sdk/reqwest-rustls stack; the provider compiles against the
0.103/1.0 API.

- swarm/iroh_provider.rs: IrohProvider::new binds a QUIC Endpoint, opens a
  persistent FsStore (data_dir/iroh-blobs), and serves blobs via the
  iroh-blobs protocol/Router — a node that fetches also SEEDS. try_fetch
  maps ContentDigest -> iroh Hash, asks discovery for seed EndpointIds, then
  calls downloader.download(hash, providers) (range-verified) and exports the
  verified blob to staging.
- ProviderDiscovery trait: the seam Phase 3 (signed Nostr advertisement
  events) fills. discovery=None -> no seeds -> origin-only, so enabling the
  feature is never worse than today.
- Default build untouched: iroh is optional, the module is cfg-gated, and
  providers() stays empty until Phase 3 wires discovery in.
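
The fallback rule described above (no discovery or no seeds -> origin, swarm error -> origin) can be sketched as follows. This is a simplified, synchronous illustration that uses only the standard library: `Served`, `Discovery`, `StaticDiscovery`, `SwarmProvider` and `fetch` are hypothetical stand-ins, not the types in this commit, and hashes and seed ids are plain strings rather than iroh `Hash` / `EndpointId` values.

```rust
use std::collections::HashMap;

/// Which path served the content (hypothetical names for this sketch).
#[derive(Debug, PartialEq)]
enum Served {
    Swarm,
    Origin,
}

/// Stand-in for the discovery seam: content hash -> candidate seed ids.
trait Discovery {
    fn providers_for(&self, hash: &str) -> Vec<String>;
}

/// In-memory table standing in for the Phase 3 advertisements.
struct StaticDiscovery(HashMap<String, Vec<String>>);

impl Discovery for StaticDiscovery {
    fn providers_for(&self, hash: &str) -> Vec<String> {
        self.0.get(hash).cloned().unwrap_or_default()
    }
}

/// Simplified provider. `fail` simulates a transient swarm failure.
struct SwarmProvider {
    discovery: Option<Box<dyn Discovery>>,
    fail: bool,
}

impl SwarmProvider {
    /// Ok(true) = fetched, Ok(false) = no known seeds, Err = swarm failure.
    fn try_fetch(&self, hash: &str) -> Result<bool, String> {
        let seeds = match &self.discovery {
            Some(d) => d.providers_for(hash),
            None => Vec::new(),
        };
        if seeds.is_empty() {
            return Ok(false);
        }
        if self.fail {
            return Err("simulated swarm failure".to_string());
        }
        Ok(true)
    }
}

/// Only a successful swarm fetch avoids the origin path.
fn fetch(provider: &SwarmProvider, hash: &str) -> Served {
    match provider.try_fetch(hash) {
        Ok(true) => Served::Swarm,
        Ok(false) | Err(_) => Served::Origin,
    }
}

fn main() {
    let no_discovery = SwarmProvider { discovery: None, fail: false };
    println!("{:?}", fetch(&no_discovery, "abc")); // prints "Origin"
}
```

Because every non-success outcome collapses to the origin path, a provider with `discovery: None` behaves like a build without the feature.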

Build: cargo build --features iroh-swarm succeeds (dev). Default build +
44 swarm/update/content_hash/blobs tests unchanged.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
archipelago 2026-06-16 14:33:31 -04:00
parent 2523c9e3dd
commit 082946aa30
4 changed files with 3157 additions and 81 deletions

3095
core/Cargo.lock generated

File diff suppressed because it is too large

@ -17,7 +17,7 @@ default = []
# is empty and every fetch goes straight to the origin HTTP path (today's
# behaviour). Attach the optional iroh / iroh-blobs deps to this feature when
# wiring the IrohProvider.
-iroh-swarm = []
+iroh-swarm = ["dep:iroh", "dep:iroh-blobs"]
[dependencies]
# Core dependencies
@ -117,6 +117,12 @@ sd-notify = "0.4"
# Trait objects for async methods (container orchestrator trait, Step 4)
async-trait = "0.1"
# DHT Phase 2: iroh-blobs peer swarm engine. OPTIONAL — only pulled in by the
# `iroh-swarm` feature (off by default). Heavy QUIC dep tree; kept behind the
# flag so the default fleet build is unaffected until the PoC is measured.
iroh = { version = "1", optional = true }
iroh-blobs = { version = "0.103", optional = true }
[dev-dependencies]
tokio-test = "0.4"
tempfile = "3.10"
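
The `dep:` entries above mean the optional crates are only pulled in when the `iroh-swarm` feature is enabled, and code can be gated on the same feature name. A minimal sketch of that gating, using only the standard library: the `iroh_provider` module body and the `swarm_compiled_in` helper are hypothetical illustrations, not code from this commit.

```rust
/// Stand-in for the real provider module; only compiled with the feature on.
#[cfg(feature = "iroh-swarm")]
mod iroh_provider {
    pub const NAME: &str = "iroh";
}

/// Reports whether the optional swarm code was compiled into this build.
fn swarm_compiled_in() -> bool {
    cfg!(feature = "iroh-swarm")
}

fn main() {
    // This statement exists only in builds with the feature enabled.
    #[cfg(feature = "iroh-swarm")]
    println!("swarm module present: {}", iroh_provider::NAME);

    println!("iroh-swarm compiled in: {}", swarm_compiled_in());
}
```

In a default build the gated module and statement are removed before type checking, which is why the heavy dependency tree never needs to resolve there.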

@ -0,0 +1,132 @@
//! iroh-blobs swarm provider — the DHT Phase 2 engine, gated behind the
//! `iroh-swarm` feature (heavy QUIC dep tree, off by default).
//!
//! Stands up a real iroh node: binds a QUIC [`Endpoint`], opens a persistent
//! blob [`FsStore`] under `data_dir/iroh-blobs`, and serves blobs over the
//! iroh-blobs protocol — so a node that *fetches* content also *seeds* it
//! afterwards. Content is addressed by BLAKE3 ([`Hash`]) and range-verified by
//! iroh on arrival.
//!
//! This provider is an optimization beneath the origin HTTP path: the [`super`]
//! swarm seam falls back to origin whenever [`try_fetch`](IrohProvider::try_fetch)
//! returns `Ok(false)` (no known seeds) or `Err` (transient swarm failure).
//!
//! ## Discovery boundary (Phase 3)
//! Downloading needs the [`EndpointId`]s of peers that hold the hash. That
//! discovery — design Phase 3, *signed Nostr advertisement events* mapping
//! `{content-hash → provider endpoint}` — is injected via [`ProviderDiscovery`].
//! Until it is wired, discovery yields nothing and every fetch defers to origin,
//! so enabling the feature is safe (never worse than today).

use std::path::Path;
use std::sync::Arc;

use anyhow::Result;
use async_trait::async_trait;
use iroh::{endpoint::presets, protocol::Router, Endpoint, EndpointId};
use iroh_blobs::{store::fs::FsStore, BlobsProtocol, Hash};

use super::BlobProvider;
use crate::content_hash::{ContentDigest, HashAlg};

/// Resolves which peers are believed to hold a given content hash.
///
/// Phase 3 (signed Nostr advertisement events) provides the production impl;
/// `None` discovery means "origin-only", a safe default.
pub trait ProviderDiscovery: Send + Sync {
    /// Candidate seed endpoints for `hash` (may be empty).
    fn providers_for(&self, hash: &Hash) -> Vec<EndpointId>;
}

/// Fetches content-addressed blobs from the iroh swarm, and seeds what it has.
#[allow(dead_code)] // constructed once Phase 3 discovery is wired into providers()
pub struct IrohProvider {
    endpoint: Endpoint,
    store: FsStore,
    /// Kept alive so the node keeps accepting blob-protocol connections (seeds).
    _router: Router,
    discovery: Option<Arc<dyn ProviderDiscovery>>,
}

#[allow(dead_code)]
impl IrohProvider {
    /// Bind an iroh endpoint, open the persistent blob store at
    /// `data_dir/iroh-blobs`, and start serving blobs (seed capability).
    pub async fn new(
        data_dir: &Path,
        discovery: Option<Arc<dyn ProviderDiscovery>>,
    ) -> Result<Self> {
        let root = data_dir.join("iroh-blobs");
        tokio::fs::create_dir_all(&root).await.ok();
        let store = FsStore::load(&root)
            .await
            .map_err(|e| anyhow::anyhow!("open iroh blob store: {e}"))?;

        let endpoint = Endpoint::bind(presets::N0)
            .await
            .map_err(|e| anyhow::anyhow!("bind iroh endpoint: {e}"))?;

        // Serve blobs: a node that fetches a blob can then seed it to others.
        let blobs = BlobsProtocol::new(&store, None);
        let router = Router::builder(endpoint.clone())
            .accept(iroh_blobs::ALPN, blobs)
            .spawn();

        Ok(Self {
            endpoint,
            store,
            _router: router,
            discovery,
        })
    }

    /// This node's iroh endpoint id: what Phase 3 advertises as a seed address.
    pub fn endpoint_id(&self) -> EndpointId {
        self.endpoint.id()
    }
}

#[async_trait]
impl BlobProvider for IrohProvider {
    fn name(&self) -> &str {
        "iroh"
    }

    async fn try_fetch(&self, digest: &ContentDigest, dest: &Path) -> Result<bool> {
        // iroh addresses content by BLAKE3. A sha256-only digest isn't fetchable
        // from the swarm, so defer to origin.
        if digest.alg != HashAlg::Blake3 {
            return Ok(false);
        }
        let raw = hex::decode(&digest.hex).map_err(|e| anyhow::anyhow!("digest hex: {e}"))?;
        let arr: [u8; 32] = raw
            .as_slice()
            .try_into()
            .map_err(|_| anyhow::anyhow!("blake3 digest must be 32 bytes"))?;
        let hash = Hash::from_bytes(arr);

        // Who has it? Without discovery (Phase 3) this is empty → origin wins.
        let providers = match &self.discovery {
            Some(d) => d.providers_for(&hash),
            None => Vec::new(),
        };
        if providers.is_empty() {
            return Ok(false);
        }

        // Fetch (range-verified by iroh) then export the verified blob to the
        // staging path the caller expects. The seam re-verifies the digest.
        let downloader = self.store.downloader(&self.endpoint);
        downloader
            .download(hash, providers)
            .await
            .map_err(|e| anyhow::anyhow!("iroh swarm download: {e}"))?;
        self.store
            .blobs()
            .export(hash, dest)
            .await
            .map_err(|e| anyhow::anyhow!("export blob to staging: {e}"))?;
        Ok(true)
    }
}
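
The digest conversion at the top of `try_fetch` (hex string -> exactly 32 bytes -> hash) can be exercised in isolation. The sketch below is a standard-library stand-in for that step, not the code in this commit: `decode_digest_hex` is a hypothetical helper, and it replaces the `hex` crate's `hex::decode` with a hand-written decoder so the example has no dependencies.

```rust
/// Decode a hex string into exactly 32 bytes, or explain why that is impossible.
/// Hypothetical helper: the commit itself uses `hex::decode` plus `try_into`.
fn decode_digest_hex(hex: &str) -> Result<[u8; 32], String> {
    // Rejecting non-hex input up front also guarantees the string is ASCII,
    // so the byte-offset slicing below cannot split a character.
    if !hex.bytes().all(|b| b.is_ascii_hexdigit()) {
        return Err("digest hex: invalid character".to_string());
    }
    if hex.len() != 64 {
        return Err(format!(
            "blake3 digest must be 32 bytes, got {} hex chars",
            hex.len()
        ));
    }
    let mut out = [0u8; 32];
    for (i, slot) in out.iter_mut().enumerate() {
        let pair = &hex[2 * i..2 * i + 2];
        *slot = u8::from_str_radix(pair, 16).map_err(|e| format!("digest hex: {e}"))?;
    }
    Ok(out)
}

fn main() {
    let hex = "ab".repeat(32);
    match decode_digest_hex(&hex) {
        Ok(bytes) => println!("decoded {} bytes", bytes.len()),
        Err(e) => println!("rejected: {e}"),
    }
}
```

Validating the length before building the array mirrors the original's `try_into` check: a digest of the wrong size is reported as an error rather than silently truncated or padded.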

@ -27,6 +27,9 @@ use tracing::{debug, info, warn};
use crate::content_hash::ContentDigest;
#[cfg(feature = "iroh-swarm")]
pub mod iroh_provider;
/// Which source ultimately served the content.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum FetchSource {