28 Commits

Author SHA1 Message Date
Dorian
19dcfd4f31 feat: BIP-39 master seed for unified key derivation
Replace fragmented random key generation with a single 24-word BIP-39
mnemonic that deterministically derives all node keys: Ed25519 (DID),
secp256k1 (Nostr/Bitcoin), BIP-84 xprv (Bitcoin Core), and LND aezeed
entropy. New onboarding flow: seed generate → word verification → identity
naming. Restore path enabled via 24-word entry. Includes seed RPC handlers,
mock backend support, LND/Bitcoin Core wallet-from-seed integration, and
UI polish across settings and discover views.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-03-31 01:41:24 +01:00
Dorian
967af7d96f fix: password setup, CSRF 403, reboot after install
Critical fixes:
- Remove ensure_default_user() — no more auto-creating user with
  password123. Login page now shows "Create Password" form on first
  boot. User sets their own password during onboarding flow.
- CSRF 403: increased retry delay from 300ms to 500ms for stale
  cookie recovery after remember-me session restore.
- Reboot: multiple fallback methods (/sbin/reboot, sysrq, kill init)
  when USB is pulled and /usr/sbin isn't available.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-03-29 22:44:46 +01:00
Dorian
9636bac96b fix: auto-create default user, force reboot, i915 firmware, first boot info
Critical fixes from ISO testing on .198:
- Backend auto-creates default user (password123) on first start
  so login works immediately after onboarding
- Force reboot (reboot -f) after install to avoid SquashFS errors
  when live USB is removed
- Eject USB before prompting user, not after
- Add firmware-misc-nonfree for Intel i915 GPU (suppresses dozens
  of "Possible missing firmware" warnings during initramfs update)
- First boot screen: wait up to 10s for DHCP before showing IP
- First boot screen: compact layout fits 80-col terminals
- ISOLINUX menu resolution dropped to 640x480 for universal
  VESA compatibility (was 1024x768, caused scaling on some hardware)

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-03-28 13:06:34 +00:00
Dorian
99400a7165 feat: container orchestration, branding overhaul, onboarding logging
Container orchestration:
- Health monitor with crash recovery and auto-restart
- Doctor service (periodic health checks via systemd timer)
- Reconcile service (desired-state convergence)
- Stack-aware install/uninstall with dependency tracking

Branding:
- Custom GRUB background (designer artwork, 1024x768)
- ISOLINUX boot menu: centered, orange accents, clean labels
- Terminal banners: adaptive width, basic ANSI colors, fits 80-col
- Removed auto-generated splash scripts (designer provides assets)
- GRUB theme: lowercase branding

Frontend:
- 401 handler clears localStorage immediately (prevents cascade)

Backend:
- Onboarding/auth logging ([onboarding] tag in journalctl)
- Cookie Secure flag logging for debugging HTTP/HTTPS issues

ISO fixes:
- Install log saved before unmount (was silently failing)

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-03-28 11:34:29 +00:00
Dorian
e4e0ef4f11 bug fixing and deploy and build diagnostics 2026-03-22 03:30:21 +00:00
Dorian
94f2de4a64 refactor: centralize constants, eliminate unwraps, remove dead code, resolve TODOs
- R13+R16: Replace .expect() with .context()? in main.rs and identity.rs
- R17+R18+R19: Fix unwrap() calls in helpers and js-engine
- R20+R21: Remove #[allow(dead_code)] annotations and delete truly dead code
- R22-R26: Create constants.rs module, replace 21 hardcoded values across 12 files
- R28+R29: LND/DWN timeouts already present — verified
- R30-R33: Remove TODO comments, implement marketplace payment check

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-03-21 01:54:35 +00:00
Dorian
b57ca4f171 fix: add health RPC handler, Nostr connect timeouts, atomic backup restore, nginx rate limits
- R1: Add health RPC endpoint with crash recovery status, uptime, and version
- R2: Wrap all 5 Nostr client.connect() calls in 10s timeout
- R3: Make backup restore atomic with staging dir and rollback on failure
- I1: Add rate limiting, body size, and proxy timeouts to unauthenticated nginx endpoints

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-03-21 01:02:16 +00:00
Dorian
84a56c80de security+feat: v1.3.0 — pentest remediation, container reliability, UI overhaul
Security (33 pentest findings addressed):
- CRITICAL: backend binds 127.0.0.1, path traversal in tor.rs/dwn fixed
- HIGH: federation requires signatures, XSS login redirect, RBAC viewer restricted
- HIGH: tar slip prevention, S3 SSRF validation, backup ID validation
- MEDIUM: remember-me random secret, TOTP session rotation, password re-auth
- LOW: CSP unsafe-inline removed, CORS dev-only, onion/webhook validation

Container reliability:
- Memory limits on all 37 containers (OOM prevention)
- Exited vs stopped state distinction with health-aware status badges
- Crash recovery coordination (no more restart cascade)
- User-stopped tracking survives reboots
- Tiered boot recovery (databases → core → services → apps)

UI:
- Wallet TransactionsModal, health-aware app status badges
- Restart button on containers, exited/crashed red state
- Mesh view overhaul, glass button updates, BaseModal/ToggleSwitch
- Apps sticky header removed, dev faucet, mutable mock wallet

Infrastructure:
- LND REST port 8080 exposed over Tor (LND Connect fix)
- Nginx cookie_session fix, deploy script Tor config updated
- Dev environment: podman auto-start, boot mode simulation

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-03-19 12:44:31 +00:00
Dorian
909ad5f019 feat: Phase 1 — per-installation credential generation, eliminate hardcoded passwords
Generate unique random passwords at first boot for Bitcoin RPC, all database
services (mempool, btcpay, immich, penpot, mysql-root), and Fedimint gateway.
Credentials stored in /var/lib/archipelago/secrets/ with 600 permissions.

Scripts: first-boot-containers.sh, deploy-to-target.sh, deploy-bitcoin-knots.sh,
container-doctor.sh all read from secrets files instead of hardcoded values.

Rust backend: new bitcoin_rpc module reads password from secrets file, env var,
or dev fallback. All .basic_auth() calls and container config strings now use
the shared credential reader instead of hardcoded "archipelago123".

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-03-18 00:39:52 +00:00
Dorian
253c305cc8 backup commit 2026-03-17 00:03:08 +00:00
Dorian
8143f6871f feat: hardware compatibility, TPM attestation, security audit prep
- Y2-01: docs/hardware-compatibility.md — 2 certified platforms,
  4 planned, minimum requirements, known quirks
- Y3-04: tpm.rs — TPM 2.0 attestation types (TpmStatus, TpmAttestation,
  detect_tpm), ready for tss-esapi integration
- Y5-03: docs/security-audit-prep.md — audit scope, completed internal
  audits, recommended firms, budget estimates

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-03-14 05:57:32 +00:00
Dorian
f8b29cd03d feat: add cluster HA module stub and mark PWA mobile companion done
- Y3-03: cluster.rs with Raft types (ClusterRole, ClusterState,
  AppPlacement, ClusterConfig). Ready for openraft integration.
- Y2-04: Existing PWA already serves as mobile companion (installable,
  read-only dashboard works on mobile via HTTPS).

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-03-14 05:55:03 +00:00
Dorian
7776191fac fix: watchdog killing backend every 60s on .198 (47 restarts/day)
Root cause: sd_notify::notify(true, ...) cleared NOTIFY_SOCKET env var,
so watchdog pings never reached systemd. Backend killed every 60s.

Fixes:
- Change sd_notify::notify first param to false (keep socket)
- Increase WatchdogSec from 60 to 300 (5min) for crash recovery
- Add TimeoutStartSec=300 for slow container startups
- Adjust watchdog ping interval to 120s

This was causing 47 restarts/day on .198 and blocking REBOOT-03,
FLEET-03, FLEET-04, VC-04.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-03-14 04:30:57 +00:00
Dorian
2ce1b8661d perf: move crash recovery to background for instant health endpoint
Crash recovery (check_for_crash + recover_containers +
start_stopped_containers) now runs in a background tokio task.
The health endpoint is available immediately on startup instead of
blocking for 260+ seconds while containers restart sequentially.

This directly fixes the .198 boot recovery timeout issue where the
backend took 260s to become healthy after restart.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-03-14 03:44:33 +00:00
Dorian
e3e279331f feat: add systemd watchdog, OOM detection, disk growth alerting
MEM-01: OOM kill detection via dmesg checks every 5 minutes
MEM-03: Disk growth rate tracking (288 samples over 24h), warns at >1GB/day
MEM-04: Systemd watchdog (WatchdogSec=60, sd_notify::Watchdog every 30s)
        Service Type=notify for proper startup notification

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-03-14 02:54:59 +00:00
Dorian
62f37eca00 feat: auto-start stopped containers on boot, add failure recovery tests
Added start_stopped_containers() to crash_recovery.rs that starts all
exited/created containers on backend startup, fixing the issue where
containers didn't come back after clean reboot (PID marker removed by
systemd stop). Created test-failure-recovery.sh covering 5 failure
scenarios: container crash, backend restart, Tor restart, full reboot,
and Tor traffic block (UPTIME-02).

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-03-13 03:55:14 +00:00
Dorian
6fee6befed refactor: update dependencies and remove unused code
- Added new dependencies: `adler2`, `crc32fast`, `flate2`, `miniz_oxide`, and `libredox`.
- Updated existing dependencies: `tokio-rustls` to version 0.26.4 and `filetime` to version 0.2.27.
- Removed the `backup.rs` file as it is no longer needed.
- Introduced tests for configuration and credential management.
- Enhanced the `identity` module to generate W3C compliant DID documents.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-03-12 00:19:30 +00:00
Dorian
7fc170f50e feat: add webhook notification system with Settings UI (REMOTE-03)
Webhook module with HTTP delivery, HMAC-SHA256 signing, and event
filtering. RPC handlers for get-config, configure, and test endpoints.
Settings page gains webhook configuration section with URL, secret,
event toggles, and test button.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-03-11 12:55:13 +00:00
Dorian
baeeb72f27 feat: add real-time metrics collection with ring buffer storage (MON-01)
Implements monitoring/collector.rs that collects per-container CPU/RAM/network/disk,
system-wide metrics, RPC latency, and WebSocket connection count every 60 seconds.
Data stored in dual ring buffers: 1-min resolution (24h) and 15-min resolution (7d).
Three new RPC endpoints: monitoring.current, monitoring.history, monitoring.containers.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-03-11 11:11:02 +00:00
Dorian
e3aa95a103 fix: prevent tokio runtime deadlock in credential issue/verify
The credential issuance and verification handlers used
Handle::block_on() directly inside the tokio runtime, causing a
deadlock. Wrapped with block_in_place() to properly yield the
runtime thread.

Also completed full feature verification across all 25 test groups
(~175 checks) on live server.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-03-09 07:43:12 +00:00
Dorian
e55fd3baf0 feat: add TOTP 2FA, API key switcher, login progress bar, and alpha hardening plan
- TOTP 2FA: full setup/confirm/disable/login flow with Argon2id + ChaCha20-Poly1305
  encrypted secret storage, QR code generation, and bcrypt-hashed backup codes
- API key switcher: OAuth vs personal API key toggle in AIUI chat settings with
  status indicator, key validation, and help text
- Login progress bar: server startup detection with health check polling, form
  disabled until server is ready
- AI quarantine docs: comprehensive HTML page documenting all 6 security layers
- Settings: AI Data Access permission toggles with per-category control
- Alpha hardening plan: 28-task overnight automation plan across 7 phases
  (onboarding, login, app install, AIUI, UI polish, security, ISO build)
- Backlog: node discovery spatial map feature for alpha demo

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-03-06 12:23:57 +00:00
Dorian
6656d2f1d9 fix: implement 22 security pentest remediation fixes
Server-side session management with SHA-256 hashed tokens and HttpOnly
cookies. Auth middleware gating all RPC/WS/proxy routes with method
allowlist. Login rate limiting (5/60s per IP). CORS restricted to
config origin. Docker registry allowlist. App ID and path validation.
P2P message sanitization (HTML + log injection). Onion address and
known-peer validation. Nginx security headers (CSP, X-Frame-Options,
etc.) and AIUI proxy auth. Systemd hardening (non-root, NoNewPrivileges,
ProtectSystem).

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-03-06 03:26:56 +00:00
Dorian
62d6c13764 Implement onboarding reset functionality and enhance backup features
- Added a new method to reset the onboarding state, allowing users to re-initiate the onboarding process.
- Integrated backup creation functionality, enabling users to create encrypted backups of their node identity.
- Updated API endpoints to handle onboarding reset and backup creation requests.
- Enhanced UI components to support the new onboarding reset and backup features, including error handling and user feedback.
- Introduced new dependencies for cryptographic operations and data encoding.
2026-03-02 08:34:13 +00:00
Dorian
1073d9fd2c Update Fedimint configuration and enhance onboarding process
- Upgraded Fedimint version to v0.10.0 in docker-compose.yml and manifest.yml, adding support for the built-in Guardian UI.
- Modified .gitignore to exclude deploy-config.sh script.
- Enhanced onboarding process in AuthManager to persist onboarding state and validate password strength during user setup.
- Updated API to handle onboarding completion and password change requests, ensuring a smoother user experience.
- Improved configuration management to support Nostr discovery and Tor proxy settings, enhancing node identity features.
2026-02-17 15:03:34 +00:00
Dorian
3b3f70276f Integrate Docker support into Archipelago and Neode UI
- Added StateManager and data_model modules to manage application state.
- Updated ApiHandler to utilize StateManager for WebSocket connections.
- Enhanced Server initialization to include StateManager.
- Implemented Docker container querying in Neode UI to populate app data dynamically.
- Removed temporary dummy app configurations in favor of real Docker-based applications.
- Improved WebSocket reconnection logic and error handling in the UI.
- Updated package.json and package-lock.json to include dockerode dependency.
2026-01-27 23:06:18 +00:00
Dorian
4126aa0b33 untrack 2026-01-27 22:37:08 +00:00
Dorian
1c024c5d64 Update archipelago: API, auth, container, parmanode, performance, security
- API handler, RPC, and server updates
- Auth and coding rules
- Container data manager, dev orchestrator, health monitor, podman client
- Parmanode script runner
- Performance resource manager
- Security container policies and secrets manager
- Add build scripts and documentation
2026-01-27 22:27:17 +00:00
zazawowow
731cd67cfb mid coding commit 2026-01-24 22:59:20 +00:00