archy/docs/did-dht-integration.md
Dorian 0f6df9a021 docs: did:dht integration architecture and DWN protocol schemas
- DHT-01: docs/did-dht-integration.md — did:dht spec analysis, DNS packet
  encoding, mainline crate, publication/resolution flows, security notes
- SCHEMA-01: docs/dwn-protocols.md — 4 DWN protocol definitions with JSON
  schemas: node-identity, file-catalog, federation, app-deploy

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-03-14 02:59:16 +00:00

# did:dht Integration Architecture
## Overview
Archipelago currently uses `did:key` for node identities. This document describes integrating `did:dht` as a **complementary** DID method that makes node identities discoverable via the BitTorrent Mainline DHT, without relying on centralized registries, Nostr relays, or Tor hidden services.
**Goal**: Each Archipelago node has two DID types:
- `did:key` — Offline, self-certifying, works without network (primary identity)
- `did:dht` — Published to Mainline DHT for decentralized discovery (optional, discoverable)
## What is did:dht?
The `did:dht` method stores DID Documents in the [BitTorrent Mainline DHT](https://www.bittorrent.org/beps/bep_0005.html) as [BEP-44](https://www.bittorrent.org/beps/bep_0044.html) mutable items. Key properties:
- **No server needed**: Uses the public DHT network (~15M+ nodes)
- **Ed25519 keypair**: Same key type Archipelago already uses
- **DNS-encoded**: DID Document stored as a DNS packet (compact, standardized)
- **Mutable**: Documents can be updated by the key holder
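- **TTL-based**: DHT nodes drop records that are not re-announced, so publications must be refreshed periodically (this document plans a 2-hour interval; BEP-44 describes items expiring after roughly 2 hours without re-announcement, so a shorter interval would leave more margin)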
- **Identifier format**: `did:dht:{z-base-32-encoded-ed25519-pubkey}`
## Architecture
### DID Relationship
```
Node Identity (Ed25519 keypair)
├── did:key:z6Mk... (derived from pubkey, offline, stable)
└── did:dht:{z-base-32 pubkey} (published to DHT, discoverable, same key)
```
Both DIDs use the same underlying Ed25519 keypair. The `did:dht` identifier is the z-base-32 encoding of the 32-byte public key. This means the same keypair produces both DID types — no additional key management.
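
The identifier derivation can be sketched in a few lines. This is a minimal, dependency-free illustration rather than the planned implementation (the plan is to use the `zbase32` crate); the function names `zbase32_encode` and `did_dht_from_pubkey` are hypothetical.

```rust
/// z-base-32 alphabet (32 symbols, 5 bits per character).
const ZBASE32: &[u8; 32] = b"ybndrfg8ejkmcpqxot1uwisza345h769";

/// Encode bytes as z-base-32, most significant bits first.
/// Hypothetical helper for illustration only.
pub fn zbase32_encode(data: &[u8]) -> String {
    let mut out = String::with_capacity((data.len() * 8 + 4) / 5);
    let mut buffer: u32 = 0; // bit accumulator; only the low `bits` bits are meaningful
    let mut bits: u32 = 0;
    for &byte in data {
        buffer = (buffer << 8) | u32::from(byte);
        bits += 8;
        while bits >= 5 {
            bits -= 5;
            out.push(ZBASE32[((buffer >> bits) & 0x1f) as usize] as char);
        }
    }
    if bits > 0 {
        // Pad the final group with zero bits on the right.
        out.push(ZBASE32[((buffer << (5 - bits)) & 0x1f) as usize] as char);
    }
    out
}

/// Build the did:dht identifier from a raw 32-byte Ed25519 public key.
/// 256 bits encode to 52 z-base-32 characters.
pub fn did_dht_from_pubkey(pubkey: &[u8; 32]) -> String {
    format!("did:dht:{}", zbase32_encode(pubkey))
}
```

Because the input is only the public key, the identifier can be computed without any network access; only publication requires the DHT.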
### DHT Publication Flow
```
1. Node generates Ed25519 keypair (already exists)
2. Node creates DID Document with:
   - Ed25519 verification key (signing)
   - X25519 key agreement key (derived)
   - Service endpoints (optional: Tor onion, federation endpoint)
3. DID Document encoded as DNS packet (RFC 1035)
4. DNS packet signed with Ed25519 key (BEP-44 mutable item)
5. Published to Mainline DHT under the public key
6. Refreshed every 2 hours to maintain availability
```
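
Step 4 can be made concrete. BEP-44 does not sign the raw value; it signs a short bencoded fragment containing the sequence number and the value (plus an optional salt). The sketch below builds that byte string. The function name `bep44_signable` and the `MAX_BENCODED_VALUE` constant are hypothetical, the 1000-byte limit is the size BEP-44 lets storing nodes enforce on the bencoded value, and the actual Ed25519 signing is left to the `mainline` crate.

```rust
/// Size limit on the bencoded value that BEP-44 storing nodes may enforce.
const MAX_BENCODED_VALUE: usize = 1000;

/// Build the byte string that a BEP-44 mutable item signature covers:
/// `[4:salt<len>:<salt>]3:seqi<seq>e1:v<len>:<value>`.
/// Hypothetical helper; `value` would be the encoded DNS packet.
pub fn bep44_signable(seq: i64, value: &[u8], salt: Option<&[u8]>) -> Result<Vec<u8>, String> {
    // Bencode the value as a byte string: "<len>:<bytes>".
    let mut encoded_v = format!("{}:", value.len()).into_bytes();
    encoded_v.extend_from_slice(value);
    if encoded_v.len() > MAX_BENCODED_VALUE {
        return Err(format!(
            "bencoded value is {} bytes, limit is {}",
            encoded_v.len(),
            MAX_BENCODED_VALUE
        ));
    }

    let mut out = Vec::new();
    if let Some(salt) = salt {
        out.extend_from_slice(format!("4:salt{}:", salt.len()).as_bytes());
        out.extend_from_slice(salt);
    }
    out.extend_from_slice(format!("3:seqi{seq}e1:v").as_bytes());
    out.extend_from_slice(&encoded_v);
    Ok(out)
}
```

The size check matters for this design: the whole DID Document, including any service endpoints, has to fit in one DNS packet under that limit.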
### DNS Packet Encoding
The DID Document maps to DNS resource records:
| Record | TXT data | Purpose |
|--------|----------|---------|
| TXT `_did.` | `vm=k0` | Verification method: key 0 (Ed25519) |
| TXT `_did.` | `auth=k0` | Authentication uses key 0 |
| TXT `_did.` | `asm=k0` | Assertion method uses key 0 |
| TXT `_k0._did.` | `id=0;t=0;k={base64url_pubkey}` | Key 0: Ed25519 public key |
| TXT `_s0._did.` | `id=tor;t=TorHiddenService;se={onion}` | Service endpoint (optional) |
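
As an illustration of the key and service rows, the sketch below builds the TXT record name and data as plain strings. It is an assumption-laden example, not the planned `dns_packet.rs`: the helper names are hypothetical, base64url is implemented inline (unpadded) to keep the block dependency-free, and packing the strings into an actual DNS packet is left to `simple-dns` or a similar crate.

```rust
/// URL-safe base64 alphabet (RFC 4648, section 5).
const B64URL: &[u8; 64] = b"ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz0123456789-_";

/// Unpadded base64url encoding. Hypothetical helper for illustration.
pub fn base64url_nopad(bytes: &[u8]) -> String {
    let mut out = String::with_capacity((bytes.len() * 4 + 2) / 3);
    for chunk in bytes.chunks(3) {
        let b1 = *chunk.get(1).unwrap_or(&0);
        let b2 = *chunk.get(2).unwrap_or(&0);
        let n = (u32::from(chunk[0]) << 16) | (u32::from(b1) << 8) | u32::from(b2);
        // A chunk of k input bytes yields k + 1 output characters.
        for i in 0..=chunk.len() {
            out.push(B64URL[((n >> (18 - 6 * i)) & 0x3f) as usize] as char);
        }
    }
    out
}

/// (name, data) for the identity key record: key 0, type 0 (Ed25519).
pub fn identity_key_record(pubkey: &[u8; 32]) -> (String, String) {
    (
        "_k0._did.".to_string(),
        format!("id=0;t=0;k={}", base64url_nopad(pubkey)),
    )
}

/// (name, data) for the service record at position `index`.
pub fn service_record(index: usize, id: &str, kind: &str, endpoint: &str) -> (String, String) {
    (
        format!("_s{index}._did."),
        format!("id={id};t={kind};se={endpoint}"),
    )
}
```

For example, `service_record(0, "tor", "TorHiddenService", onion)` yields the optional last row of the table.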
### Resolution Flow
```
1. Receive did:dht:{identifier}
2. Decode z-base-32 → 32-byte Ed25519 public key
3. Query Mainline DHT for BEP-44 mutable item under that key
4. Verify signature on the DHT payload
5. Parse DNS packet → reconstruct DID Document
6. Cache for 1 hour (reduce DHT load)
```
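
Steps 1 and 2 of the resolution flow amount to strict parsing of the identifier. The following sketch, with the hypothetical name `parse_did_dht`, rejects anything that is not exactly 52 z-base-32 characters and returns the 32-byte key to query the DHT with. DHT lookup, signature verification, and caching are out of scope here.

```rust
/// z-base-32 alphabet (32 symbols, 5 bits per character).
const ZBASE32: &[u8; 32] = b"ybndrfg8ejkmcpqxot1uwisza345h769";

/// Parse `did:dht:<id>` into the raw 32-byte Ed25519 public key.
/// Hypothetical helper for illustration only.
pub fn parse_did_dht(did: &str) -> Result<[u8; 32], String> {
    let id = did
        .strip_prefix("did:dht:")
        .ok_or("not a did:dht identifier")?;
    // 32 bytes = 256 bits, which needs 52 five-bit characters.
    if id.len() != 52 {
        return Err(format!("expected 52 characters, got {}", id.len()));
    }

    let mut key = [0u8; 32];
    let mut written = 0;
    let mut buffer: u32 = 0; // bit accumulator; only the low `bits` bits are meaningful
    let mut bits: u32 = 0;
    for c in id.bytes() {
        let value = ZBASE32
            .iter()
            .position(|&symbol| symbol == c)
            .ok_or_else(|| format!("invalid z-base-32 character {:?}", c as char))?;
        buffer = (buffer << 5) | value as u32;
        bits += 5;
        if bits >= 8 {
            bits -= 8;
            key[written] = (buffer >> bits) as u8;
            written += 1;
        }
    }
    // 52 * 5 = 260 bits: the last 4 bits are padding and must be zero.
    if buffer & 0x0f != 0 {
        return Err("non-zero padding bits in identifier".to_string());
    }
    Ok(key)
}
```

Rejecting non-zero padding keeps the mapping between identifiers and keys one-to-one, so a cache keyed by the DID string cannot hold two entries for the same key.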
## Implementation Plan
### Rust Crate: `mainline`
Use the `mainline` crate for Mainline DHT access. It provides:
- `Dht::client()` for resolution-only nodes
- `Dht::server()` for full DHT participation
- `MutableItem` for BEP-44 put/get operations
- Ed25519 signing compatible with `ed25519-dalek`
Additional crates needed:
- `simple-dns` or `trust-dns-proto` (since renamed `hickory-proto`) for DNS packet encoding/decoding
- `zbase32` for z-base-32 encoding (did:dht identifier format)
### New Files
```
core/archipelago/src/identity/
├── did_dht.rs — did:dht creation, publication, resolution
└── dns_packet.rs — DID Document ↔ DNS packet encoding
```
### New RPC Endpoints
| Endpoint | Description |
|----------|-------------|
| `identity.create-dht-did` | Publish current node's DID to DHT |
| `identity.resolve-dht-did` | Resolve a did:dht from the DHT |
| `identity.refresh-dht-did` | Force refresh the DHT publication |
| `identity.dht-status` | Check if node's did:dht is published and resolvable |
### Integration Points
1. **Server startup**: Optionally publish did:dht in background (non-blocking)
2. **Identity manager**: Store did:dht alongside did:key in identity records
3. **Federation**: Accept did:dht in peer join/discovery
4. **Web5 UI**: Display both DID types, add publish/resolve buttons
5. **Credentials**: Accept did:dht as issuer/subject in VCs
### Background Refresh
A background tokio task refreshes the DHT publication every 2 hours:
```
spawn background task:
    loop {
        publish_to_dht(keypair, did_document_dns_packet)
        sleep(2 hours)
    }
```
If the node is offline when the TTL expires, the record drops from the DHT. It gets re-published when the node comes back online.
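
The loop above can be sketched with a clean shutdown path. To stay dependency-free, this example uses a standard-library thread and channel instead of a tokio task; a tokio version would follow the same shape with `tokio::spawn` and a timer. The `Refresher` type and the `publish` closure are hypothetical stand-ins for the real publication call.

```rust
use std::sync::mpsc::{self, RecvTimeoutError, Sender};
use std::thread::{self, JoinHandle};
use std::time::Duration;

/// Handle to a background refresh loop. Hypothetical type for illustration.
pub struct Refresher {
    stop: Sender<()>,
    handle: JoinHandle<u64>,
}

impl Refresher {
    /// Publish immediately, then again after every `interval`, until stopped.
    pub fn spawn<F>(interval: Duration, mut publish: F) -> Self
    where
        F: FnMut() -> Result<(), String> + Send + 'static,
    {
        let (stop, stop_rx) = mpsc::channel::<()>();
        let handle = thread::spawn(move || {
            let mut successes = 0u64;
            loop {
                // A failed publish is logged and retried on the next cycle.
                match publish() {
                    Ok(()) => successes += 1,
                    Err(e) => eprintln!("did:dht publish failed: {e}"),
                }
                // Waiting on the channel doubles as an interruptible sleep.
                match stop_rx.recv_timeout(interval) {
                    Err(RecvTimeoutError::Timeout) => continue,
                    _ => break, // stop requested, or the handle was dropped
                }
            }
            successes
        });
        Refresher { stop, handle }
    }

    /// Stop the loop and return the number of successful publications.
    pub fn shutdown(self) -> u64 {
        let _ = self.stop.send(());
        self.handle.join().expect("refresher thread panicked")
    }
}
```

Publishing before the first wait means a node that restarts after an outage re-publishes right away rather than one interval later.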
## Security Considerations
1. **Same key for both DIDs**: No new key material to protect. The Ed25519 key already in `/var/lib/archipelago/identity/node_key` is used for both.
2. **DHT is public**: Publishing to the DHT makes the node's DID Document visible to anyone querying the DHT. This is intentional for discoverability. Sensitive information (Tor addresses) should only be included in the service endpoints if the user explicitly opts in.
3. **No Tor address by default**: The DID Document published to DHT should NOT include Tor hidden service addresses by default (per the security rule about not publishing onion addresses publicly). Tor addresses are exchanged privately via federation.
4. **BEP-44 signature verification**: All DHT records are signed with Ed25519. Resolvers verify the signature, preventing tampering.
5. **Sybil resistance**: did:dht identifiers are derived from public keys, and keypairs are cheap to generate, so the method itself does not limit fake identities. The federation trust system already handles this via trust levels.
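
Points 2 and 3 translate into a default-deny rule for the onion service endpoint. The sketch below shows one way to express it; `DhtPublishConfig`, `ServiceEndpoint`, and `onion_service` are hypothetical names, not existing Archipelago types.

```rust
/// Publication settings. `Default` leaves every opt-in switched off.
#[derive(Debug, Clone, Default)]
pub struct DhtPublishConfig {
    /// Explicit user opt-in for publishing the Tor address.
    pub publish_onion: bool,
}

/// One service entry of the published DID Document.
#[derive(Debug, Clone, PartialEq, Eq)]
pub struct ServiceEndpoint {
    pub id: String,
    pub kind: String,
    pub endpoint: String,
}

/// Return the Tor service entry only if the user opted in and an address exists.
pub fn onion_service(config: &DhtPublishConfig, onion: Option<&str>) -> Option<ServiceEndpoint> {
    if !config.publish_onion {
        return None;
    }
    onion.map(|address| ServiceEndpoint {
        id: "tor".to_string(),
        kind: "TorHiddenService".to_string(),
        endpoint: address.to_string(),
    })
}
```

Keeping the check in one function makes the rule easy to audit: no other code path should add an onion address to the published document.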
## Comparison: did:key vs did:dht
| Property | did:key | did:dht |
|----------|---------|---------|
| Offline creation | Yes | Identifier: yes; publication needs DHT access |
| Discoverable | No (must share manually) | Yes (query by identifier) |
| Persistence | Permanent (derived from key) | TTL-based (needs refresh) |
| Network requirement | None | UDP to DHT peers |
| Resolution | Local computation only | DHT query (~1-5s) |
| Privacy | Key not published anywhere | Key is on the DHT |
| Conforms to W3C DID Core | Yes | Yes |
## Timeline
1. **DHT-02**: Implement did:dht creation + publication (~2 days)
2. **DHT-03**: Implement did:dht resolution + caching (~1 day)
3. **DHT-04**: Web5 UI integration (~1 day)
4. **Testing**: Cross-node resolution via DHT (separate from Tor) (~1 day)
## References
- [did:dht Method Specification](https://did-dht.com/)
- [BEP-44: Storing arbitrary data in the DHT](https://www.bittorrent.org/beps/bep_0044.html)
- [Mainline DHT crate](https://crates.io/crates/mainline)
- [W3C DID Core 1.0](https://www.w3.org/TR/did-core/)