87 Commits

Author SHA1 Message Date
archipelago
6bbe1b96cf refactor: drop dead code surfaced by cargo
cargo check was showing five real warnings, all genuinely dead:

* container/mod.rs   — re-exports compute_container_name, AdoptionReport,
                       ReconcileAction, ReconcileReport were unused outside
                       prod_orchestrator. Drop from the pub use line.
* prod_orchestrator  — with_runtime + insert_manifest_for_test only exist
                       for the test module in the same file. Mark them
                       #[cfg(test)] so they don't appear in release builds.
* async_lifecycle    — remove_package_entry has no callers; doc claims
                       "used for install-failure cleanup" but nothing
                       cleans up. Delete (10 lines).
* registry.rs        — `use tracing::{debug, info};` had no consumers.
* fips.rs            — unused-assignment chain on last_status. The poll
                       loop always sets it on every break path, so the
                       initial `None` and the unwrap_or_else fallback
                       were both dead. Refactored to `let after = loop
                       { ...; break s; };`.

cargo check is now clean. cargo test --workspace --bins: 614 passed.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-01 15:34:02 -04:00
archipelago
6603227874 fix(install): auto-clean stuck OTHER-variant bitcoin container
If bitcoin-core was installed but never started (e.g. port 8332 already
bound by bitcoin-knots), the container sticks in `created` state forever.
The old conflict check refused EVERY future bitcoin install — including
re-install of the running variant — leaving no UI path to recovery.

Now the check distinguishes states:
  - missing                       → no conflict, continue
  - running                       → real conflict, refuse install
  - created/exited/configured/... → stuck; auto-remove and continue

Volumes are untouched; only the dead container record goes away.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-01 14:59:11 -04:00
archipelago
27ff1d5b52 fix(install): generate bitcoin RPC password before orchestrator install
Bitcoin containers were exiting in ms after start because the orchestrator
install path skipped the credential-materialisation step the legacy path
did. resolve_secret_env then failed to read
/var/lib/archipelago/secrets/bitcoin-rpc-password, the container started
with no password, and bitcoind crashed before logs were useful.

Two changes:

1. install.rs — call bitcoin_rpc_credentials() for bitcoin/bitcoin-core/
   bitcoin-knots before any install branch runs. The function generates +
   persists on first call (OnceCell-cached), so this is idempotent.

2. manifest.rs::resolve_secret_env — return ManifestError::Invalid when a
   resolved secret trims to empty, instead of silently producing
   `KEY=` env vars that crash auth.

Adds a unit test for the empty-secret rejection.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-01 14:39:56 -04:00
archipelago
f9e34fd0c6 refactor(install): route orchestrator-managed apps through orchestrator first
Phase 3a of the install path consolidation. Two coupled changes:

1. install.rs handle_package_install: gate the legacy "container exists →
   adopt + return" probe on !orchestrator_managed. Apps the orchestrator
   knows about (bitcoin-knots, bitcoin-core, lnd, electrumx, fedimint,
   filebrowser, btcpay-server stack apps, mempool stack apps, plus the
   companion UIs that just moved to Quadlet) skip the legacy probe and
   fall straight into the orchestrator branch.

   The legacy adopt block was returning success on a bare `podman start`
   exit-0 — even when the process inside the container crashed seconds
   later. That's the .228 "running but unreachable" failure mode. The
   orchestrator's ensure_running honors the manifest's health check and
   pre-start hooks (e.g. re-renders bitcoin-ui's nginx.conf if the RPC
   password rotated), so this is a behavioral upgrade, not just a
   refactor.

2. ProdContainerOrchestrator::install: make idempotent. Previously it
   blindly called install_fresh which would fail on `podman create` if
   the container name already existed. Now it delegates to ensure_running:
     - Container Running + healthy → no-op (refresh hooks, restart if
       config rewritten)
     - Container Stopped/Exited → start (with hook refresh)
     - Container missing → install_fresh
     - Container in wedged state (Created/Paused/Unknown) → force-recreate

   Without this, change #1 would regress every "container already exists"
   case for the 18 orchestrator-managed app IDs. With it, install becomes
   the single source of truth for "make app X be in the desired state."

Tests: 654 passed across the workspace (614 unit + 37 orchestration + 3
rpc), 0 failures. The 20 prod_orchestrator tests cover the install /
ensure_running / reconcile paths the new install delegates through.

Net delta: install.rs grows by ~30 lines (gating wrapper + comments),
prod_orchestrator.rs grows by ~30 lines (idempotent install body). Both
are temporary — the larger deletions (~1700 lines) come once every app
has been verified through the orchestrator path in subsequent phases.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-01 11:12:52 -04:00
archipelago
23c4e7441f refactor(container): move companion UIs to systemd via Quadlet
Companion UI containers (archy-bitcoin-ui, archy-lnd-ui,
archy-electrs-ui) used to be launched as fire-and-forget tokio::spawn
blocks from install.rs. If archipelago crashed mid-spawn or the
container's cgroup was reaped, companions vanished from podman ps -a
and only a manual rm/run could bring them back (the .228 incident).

Now each companion is rendered as a Quadlet .container unit under
~/.config/containers/systemd/, daemon-reloaded, and started via
systemctl --user. systemd owns supervision from that point on:

- archipelago can crash, restart, or be uninstalled without touching
  any companion.
- Quadlet's Restart=always + RestartSec=10 handles container exits.
- A 30s reconcile tick in boot_reconciler enumerates expected
  companion units and re-installs any whose unit file or service
  vanished — defense-in-depth against external tampering.

New module layout:
- container/quadlet.rs: pure unit renderer + atomic write_if_changed
  + systemctl helpers (daemon_reload_user / enable_now / disable_remove
  / is_active). 6 unit tests, no I/O in the renderer.
- container/companion.rs: per-app companion specs, install/remove/
  reconcile, image presence (build local first, fall back to insecure
  registry only via image_uses_insecure_registry whitelist). 2 tests.

install.rs handle_package_install now ends with a single call to
companion::install_for(package_id), replacing 287 lines of spawn-and-
hope shellouts plus a ~120-line nginx auth-injector helper that worked
around per-node RPC password baking. The helper is gone too — the
pre-start hook renders the per-node nginx.conf to /var/lib/archipelago/
bitcoin-ui/nginx.conf and the Quadlet unit bind-mounts it read-only.

runtime.rs handle_package_uninstall now disables companions before
the container rm loop. Otherwise systemd's Restart=always would
respawn each companion within ~10s of removal.

Tests: 53 container tests pass, including 6 quadlet renderer tests
(host network, bridge network, capability set, atomic write idempotence)
and 2 companion specs (per-app companion lookup, build_unit shape).
boot_reconciler tests gain a #[cfg(test)] without_companion_stage()
flag so the paused-clock fixtures don't race the real systemctl I/O.

A bats regression test (companion-survives-archipelago-restart.bats,
gated on ARCHY_ALLOW_DESTRUCTIVE=1) asserts the .228 failure mode
cannot recur: every installed companion has a unit file, services
stay active across systemctl --user restart archipelago, and a
deleted unit file is recreated within one reconcile tick.

Net delta: +941 / -363, but the +941 is mostly tests (~440 lines)
and the new declarative layer; the imperative tokio::spawn block and
its nginx-auth helper are gone, removing two failure classes
(orphan companions on archipelago crash, and post-start exec races
under tightly-confined cgroups) that previously needed manual SSH
recovery.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-01 10:45:07 -04:00
archipelago
2bf8181110 refactor(security): tighten capability + TLS-bypass surface
Three small, focused tightenings:

- core/container/src/podman_client.rs: drop the legacy Hetzner
  23.182.128.160:3000 mirror from image_uses_insecure_registry().
  It was decommissioned in v1.7.x and is stripped from active
  registry config at load time; leaving it in the bypass list let
  a stale config still skip TLS. Replace the inline match with a
  named INSECURE_REGISTRY_HOSTS slice so future entries are one
  line. Test now also pins the spoofing-immune semantics
  ("evil.example/146.59.87.168:3000/x" must NOT match).

- core/archipelago/src/api/rpc/package/config.rs: split bitcoin
  from lnd in get_app_capabilities(). bitcoind never opens raw
  sockets — drop CAP_NET_RAW from bitcoin/bitcoin-core/bitcoin-knots.
  lnd/fedimint/fedimint-gateway keep it because they enumerate
  network interfaces during cert generation.

- core/archipelago/src/bootstrap.rs: tighten_secrets_dir()
  enforces 0700 on /var/lib/archipelago/secrets and 0600 on every
  file inside on each startup. The dir-mode is the load-bearing
  isolation boundary against rootless container escapes (their UID
  maps to >=100000, can't traverse uid=1000/0700). The per-file
  sweep is defense-in-depth against any installer that wrote 0644.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-01 08:59:11 -04:00
archipelago
0684491072 chore: baseline codex hardening before lifecycle refactor
Snapshots the in-flight hardening work so subsequent reconcile/Quadlet
phases land on a clean before/after diff.

Changes:
- core/container/src/podman_client.rs: image_uses_insecure_registry()
  whitelist for the OVH (146.59.87.168:3000) and legacy Hetzner
  (23.182.128.160:3000) HTTP mirrors; podman_network_settings() lifts
  custom networks into the Networks map so containers can join them.
- core/archipelago/src/container/prod_orchestrator.rs:
  ensure_container_network() creates per-manifest networks on demand;
  apply_data_uid() now goes through host_sudo for mkdir -p + chown so
  bind-mount roots get created and chowned without password prompts.
- core/archipelago/src/api/rpc/package/{install,update,stacks}.rs:
  podman pull adds --tls-verify=false only for whitelisted registries.
- core/archipelago/src/bootstrap.rs: removes stale dev-mode systemd
  override on startup (live nodes carried it from old installers).
- core/archipelago/src/config.rs: ignore ARCHIPELAGO_DEV_MODE in prod
  binaries — it had been silently rerouting volumes to /tmp.
- apps/bitcoin-{core,knots}/manifest.yml: locate bitcoind at runtime
  so image-layout differences don't break entrypoint.
- scripts/app-catalog-image-smoke-test.py: production catalog/image
  smoke test that probes a target node before users click Install.
- .gitignore: cover .codex, .pnpm-store, __pycache__, *.bak.

Removes filebrowser.rs.bak and two stale catalog.json.bak files
(verified identical to live counterparts).

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-01 08:52:29 -04:00
archipelago
05e6c2e738 fix: release v1.7.51-alpha install hardening 2026-05-01 05:02:39 -04:00
archipelago
7ab788d178 chore: release v1.7.49-alpha 2026-04-30 16:37:54 -04:00
archipelago
8a2899ab4a chore: release v1.7.47-alpha
Sync-perf tuning for bitcoin/bitcoin-core/bitcoin-knots/electrumx.

- Drop the --cpus=2 cap on bitcoin/electrumx variants. Script verification
  is parallelizable; the cap halved IBD speed on 4-8 core machines.
- Bump bitcoin --memory 4g→8g so dbcache=4096 has headroom for mempool +
  connection buffers + I/O. 4g was OOM-prone during heavy IBD.
- Bump electrumx --memory 1g→2g + add CACHE_MB=2048 + MAX_SEND=10MB.
- bitcoin-core CLI args gain -dbcache=4096 -par=0 -maxconnections=125.
- bitcoin-knots manifest matched (1024MB pruned / 4096MB full + par=0).

Future v2: host-RAM-aware dbcache scaling.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-04-29 15:47:51 -04:00
archipelago
992b673b20 chore: release v1.7.46-alpha
Follow-up to v1.7.45-alpha closing the remaining tasks identified by the
resilience sweeps + the new bitcoin orphan / install-fail-vanish bugs.

User-visible:
- Health monitor: stop paging on orphaned containers from variant switches
- Install fail: card stays visible (was vanishing) with error message
- Stack pull progress: interpolate 20→70% (was stuck at 20%)
- docker.io → lfg2025 mirror: bitcoin/gitea/nextcloud/valkey

Internal:
- Resilience harness — install-wait uses expected_containers_for, ui+auth
  probes retry with 60s backoff, dep-snapshot fix
- InstallProgress gains optional `message` field (frontend renders it
  when phase is None)

binary  $(stat -c %s releases/v1.7.46-alpha/archipelago)  sha256:$(sha256sum releases/v1.7.46-alpha/archipelago | awk '{print $1}')
tarball $(stat -c %s releases/v1.7.46-alpha/archipelago-frontend-1.7.46-alpha.tar.gz)  sha256:$(sha256sum releases/v1.7.46-alpha/archipelago-frontend-1.7.46-alpha.tar.gz | awk '{print $1}')

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-04-29 14:50:33 -04:00
archipelago
4ec6ca98c1 chore: release v1.7.45-alpha
Resilience-validated release. Three full sweeps of the new resilience
harness against .228 confirm no shipstoppers.

Big user-visible:
- Bitcoin RPC auth durably correct via host-rendered nginx.conf bind-mount,
  replaces fragile post-start exec that failed under restricted-cap rootless
  podman ("crun: write cgroup.procs: Permission denied")
- Multi-container stack installs (indeedhub, immich, btcpay, mempool) now
  emit phase events at every boundary so the progress bar advances
- Apps no longer vanish from the dashboard mid-install (absent-scanner skips
  packages in transitional states)
- Indeedhub fresh installs work end-to-end (was 8500+ restart loop): five
  missing env vars (DATABASE_PORT, QUEUE_HOST, QUEUE_PORT,
  S3_PRIVATE_BUCKET_NAME, AES_MASTER_SECRET) added to install code
- Tailscale install fixed: --entrypoint string was being passed as a single
  shell-line arg; switched to custom_args array
- Catalog cleaned of broken entries (dwn, endurain, ollama removed; nextcloud
  restored on docker.io)
- Bitcoin Core update path uses correct image (was looking for nonexistent
  lfg2025/bitcoin:28.4)
- ISO installs now allocate swap on the encrypted data partition

Infra:
- New resilience harness (scripts/resilience/) — black-box state-machine
  tester, every app × every transition. Run before each release.

Sweep #3 final: PASS 107 / FAIL 12 / SKIP 14. The 12 fails are 1 cosmetic
(homeassistant trusted_hosts), 8 harness/timing false-positives, and 3
non-shipstopper tracked items. Down from 23 in baseline sweep #1.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-04-29 12:31:45 -04:00
archipelago
8f83b37d51 feat(orchestrator): complete container migration and release hardening 2026-04-28 15:00:58 -04:00
archipelago
cd6f8bad70 fix(install-log): pre-create /var/log/archipelago/ so non-root backend can write
The backend runs as `archipelago` and calls `install_log()` to append
audit lines to the install log on every install / update / remove /
start / stop / restart. Target path was /var/log/archipelago-container-installs.log,
which does not exist and cannot be created by the service because
/var/log/ is root-owned. OpenOptions errors were silently swallowed,
so the log was never written on any node.

Ship a tmpfiles.d rule that pre-creates /var/log/archipelago/ and
container-installs.log with archipelago:archipelago ownership. Move
the const path to match, keeping logs inside the directory logrotate
already rotates (image-recipe/configs/logrotate.conf). Install the
rule from both the ISO build and self-update, and apply it
immediately on self-update so existing nodes get a working log
without needing a reboot.

Verified on .228: file created, backend user can write, backend
binary rebuilt with new const.
2026-04-23 12:02:46 -04:00
archipelago
694e5b0a9d fix(update): pass --create-missing when rollback recreates a destroyed container
The update flow removes the old container before starting the new
one. If the update fails after removal, the rollback path tries
`podman start <name>` first, then falls back to reconcile. But
reconcile without --create-missing treats the now-absent container
as an optional one that the install flow will (re)create later,
and skips it. Result: container stays destroyed until someone
notices and runs reconcile manually.

Add --create-missing to the rollback reconcile invocation so the
fallback actually rebuilds the container from its canonical spec.

Fixes the failure mode observed on .228 where a bitcoin-knots
update left the node with no bitcoin-knots container at all.
2026-04-23 10:06:55 -04:00
archipelago
d9d5fa65e5 chore: retire .23 VPS mirror, promote .168 OVH to primary
The Hetzner VPS at 23.182.128.160 was decommissioned. Replace it
everywhere with the OVH VPS at 146.59.87.168, which was previously
the tertiary mirror.

  - update.rs: drop DEFAULT_TERTIARY_MIRROR_URL, promote .168 into
    the secondary slot as "Server 1 (OVH)"; tx1138 becomes Server 2.
    Default mirror list shrinks from 3 to 2.
  - container/registry.rs: default RegistryConfig drops .23, promotes
    .168 to Server 1 / priority 0, tx1138 stays Server 2 / priority 10.
  - api/rpc/package/config.rs: trusted-registry allowlist swaps .23
    for .168.
  - api/handler/mod.rs: app-catalog fallback URL uses .168.
  - neode-ui/views/marketplace/marketplaceData.ts: REGISTRY uses .168.
  - scripts/image-versions.sh: ARCHY_REGISTRY_FALLBACK uses .168.
  - image-recipe/build-auto-installer-iso.sh: installer ISO registries
    use .168 (both podman registries.conf and backend registries.json).

Tests updated to assert on the new 2-entry default lists (registry +
mirror). URL-parser fixture tests in update.rs retain .23 strings —
they exercise string-parsing logic, not mirror policy.

Git remotes: dropped `gitea-vps` and the .23 push URL on the `origin`
multi-push alias (not part of this commit — pure working-copy change).
2026-04-23 08:22:32 -04:00
archipelago
980c1b25f4 fix(install): kick scanner post-install so Launch button appears immediately
After install completes, the async-spawn wrapper wrote state=Running
but the skeletal install-time manifest (interfaces: None) persisted
until the next scheduled 60s scan. The frontend saw state=running but
hasUI=false and hid the Launch button for up to a full minute.

Add a shared Notify/watch pair between RpcHandler and the scan loop:
  - scan_kick (Notify): scan loop selects! between the 60s interval
    and this notify, running immediately on either.
  - scan_tick (watch<u64>): scan loop bumps the counter after each
    completed scan so callers can await completion.

Install and update success paths now call kick_scanner_and_wait before
flipping to Running. The scan merges via merge_preserving_transitional
(state stays Installing/Updating, manifest refreshed from live podman
with interfaces.main.ui populated from real port bindings). 2s timeout
falls back to pre-fix behavior on slow podman — no regression.
2026-04-23 07:59:03 -04:00
archipelago
7e62ea07f7 feat(install): phase-based progress bar replaces unparseable pull bytes
Podman emits zero parseable progress when stderr is piped (no TTY), so
the old byte-counter regex never matched in real installs. Users saw
0% for the whole pull, then a jump to 95%, then silence through
create-container, health-check, and post-install hooks.

Replace with 7 explicit lifecycle phases wired through install.rs and
update.rs: Preparing (5%), PullingImage (20%), CreatingContainer (70%),
StartingContainer (80%), WaitingHealthy (88%), PostInstall (95%),
Done (100%). Each maps to a fixed UI progress and status message.

Frontend PHASE_INFO mapper in stores/server.ts prioritizes phase when
present, falls back to byte-counter for legacy. A Math.max forward-only
guard ensures the bar never regresses. Deleted the duplicate watcher
in Discover.vue that was fighting the store's watcher with stale byte
logic. Added shimmer CSS on the fill (with prefers-reduced-motion
opt-out) so the bar looks alive during long phases.
2026-04-23 07:58:43 -04:00
archipelago
49b98e0271 fix(rpc): empty icon in transient install entry to avoid broken-image flicker
create_installing_entry hardcoded /assets/img/app-icons/<id>.png for
every new install. About half the app icons ship as .svg or .webp
(lnd.svg, vaultwarden.webp, bitcoin-knots.webp, mempool.webp), so the
browser 404s on the wrong extension and renders the default broken-image
glyph for the 10-30s window before the scanner refreshes with real
manifest data.

Send empty icon. The frontend's icon computed in AppCard.vue falls
through to curatedMap which has correct extensions for bundled apps,
and handleImageError still guards any remaining misses with a
placeholder SVG.
2026-04-23 06:58:12 -04:00
archipelago
1ad889608f feat(rpc): async-spawn install/uninstall/update lifecycle
Extend the async-spawn treatment previously shipped for Stop/Start/Restart
to the three remaining long-running lifecycle RPCs. Each wrapper validates
params, rejects duplicate in-flight ops, flips state to the transitional
variant (Installing/Removing/Updating), then spawns the existing inner
handler on tokio. RPC returns immediately with { status, package_id }; the
spawn task owns the terminal state write.

Install and update success arms explicitly set state=Running. The scan
loop merge (merge_preserving_transitional) refuses to overwrite
transitional states, so the spawn task must write the terminal state.
Uninstall's inner handler removes the entry entirely, so no explicit
terminal write is needed there.

Dispatcher and handler now thread self as Arc<Self> / &Arc<Self> so
spawned tasks can hold their own Arc without extra field cloning.

Transient install entry uses empty icon string. Hardcoding
/assets/img/app-icons/<id>.png 404s for apps that ship .svg or .webp
assets, which produces a broken-image flicker until the scanner refreshes
with manifest data. Empty string causes the frontend's icon computed to
fall through to the curated map, which has correct extensions.

Removed the inner "already updating" guard in update.rs — the wrapper
now owns duplicate-op detection for all three operations.
2026-04-23 06:57:50 -04:00
archipelago
39dd1d9dcc fix(rpc): async container stop/start/restart; widen state mapping
RPC handlers no longer block on podman operations. container-stop on
bitcoin-core used to hold the connection for up to 600s while the UI
showed a frozen spinner; it now returns in under a second with
{status: stopping} after flipping the package state to Stopping and
broadcasting over WebSocket. Same treatment for container-start and
the new container-restart route.

Widens container-list state mapping to emit the transitional variants
(stopping, starting, restarting, installing, updating, removing,
installed, and the backup states) instead of collapsing them to
"unknown". Keeps the mapping in sync with the UI ContainerStatus.state
union so the dashboard can render the right transitional label.

Mirrors the treatment in package/runtime.rs for package.start,
package.stop, and package.restart. The body of each handler is lifted
into pure do_package_* helpers that the background task runs; state
flipping is bracketed around the spawn with revert on error. The
pre-existing post-start exit-check verification and restart stop+start
fallback run inside the spawned task, not the RPC body.

Adds container-restart route to the dispatcher. mark_user_stopped
continues to run BEFORE the spawn, preserving the ordering contract
with the crash recovery layer at runtime.rs:145-148.
2026-04-23 04:59:45 -04:00
archipelago
5baced5f5b feat(rpc): spawn_transitional helper for async lifecycle ops
Introduces a new RPC-layer helper that bridges the synchronous
ContainerOrchestrator trait with RPC handlers that must return in <1s.

The helper flips the package state to a transitional variant
(Stopping / Starting / Restarting) in the StateManager so WebSocket
clients see the live label immediately, then tokio::spawns the
actual orchestrator call. On success it writes the final state; on
error it reverts to the pre-transition state and logs via
install_log().

The ContainerOrchestrator trait stays synchronous so the reconciler,
boot flow, unit tests, and chaos harness keep deterministic
behaviour. Async only lives in the RPC layer.

Not wired to any handler yet — Commit 2 consumes this helper.
Widens install_log visibility from pub(super) to
pub(in crate::api::rpc) so the new sibling module can reach it.
2026-04-23 04:55:52 -04:00
Dorian
36a6101026 release(v1.7.38-alpha): onboarding auto-heal + silent returning logins + app-store trim
- auth.rs now infers onboarding-complete from setup_complete + password_hash so
  nodes stop bouncing users through the intro wizard after browser clear / update
  / reboot; the flag self-heals to disk on next check
- frontend: "backend uncertain" no longer defaults to /onboarding/intro —
  useOnboarding returns null + callers poll / retry instead of flashing the wizard
- login sounds (synthwave, welcome voice, pop, whoosh, oomph) gated by
  isFirstInstallPhase(); typing sounds unaffected
- removed FIPS app, Nostr Relay, Nostr VPN, Routstr, Penpot from catalog,
  frontend config, Rust AppMetadata + install dispatch + install_penpot_stack;
  docker/fips-ui + docker/nostr-vpn-ui + apps/penpot dirs and 5 icons deleted;
  15 image versions deleted from tx1138, .168, gitea-local registries (.160
  Gitea was 502 at release time — follow-up)
- AIUI baked into frontend release tarball via demo/aiui/; deploy-to-target
  falls back to demo/aiui/ when the AIUI sibling checkout is missing
- prebuild hook syncs app-catalog/catalog.json → public/catalog.json so the
  two copies can no longer drift (was the source of the "apps still visible"
  bug — public/ had stale data)

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-04-22 13:02:24 -04:00
Dorian
cfc98c600e release(v1.7.37-alpha): bitcoin-core install fixes + dynamic node UI + full-archive default
Install flow
- api/rpc/package/install.rs: always append the literal image URL as a
  last-resort pull candidate in do_pull_image, so images not carried by
  any configured mirror (docker.io/bitcoin/bitcoin:28.4) still install
  instead of masquerading as a generic pull failure across every mirror.
- api/rpc/package/install.rs: write_bitcoin_conf now skips on any stat
  error, not just "file exists". Once bitcoin-knots' first-boot chowns
  /var/lib/archipelago/bitcoin into the container's user namespace (700
  perms, UID 100100/100101), the archipelago daemon can't even traverse
  in — try_exists returns Err which unwrap_or(false) treated as "not
  present" and drove a doomed write. Now errors out of the directory
  traversal are treated as "conf already owned by container user" and
  the write is skipped. Mirrors the lnd.conf pattern.
- api/rpc/package/install.rs: drop the hardcoded `prune=550` from the
  conf default. Operators with multi-TB drives shouldn't be silently
  pruned; users who want a pruned node can set it in bitcoin.conf
  themselves. Full archive is the only honest default.
- api/rpc/package/config.rs: bitcoin-core now passes explicit
  -server/-rpcbind/-rpcallowip/-rpcport/-printtoconsole/-datadir CLI
  args. Vanilla bitcoin/bitcoin:28.4 has no entrypoint wrapper and
  reads conf + argv only; without these the RPC listens on 127.0.0.1
  inside the container and rootlessport can't reach it, so the
  bitcoin-ui companion gets 502 on every /bitcoin-rpc/ call.
  Bitcoin Knots keeps its own entrypoint-driven defaults.
- container/docker_packages.rs: split bitcoin-core out of the shared
  AppMetadata arm. bitcoin-core now surfaces as "Bitcoin Core" with
  bitcoin-core.svg and a Reference-implementation description; the
  bitcoin + bitcoin-knots ids keep the Knots branding. Fixes the home
  card showing "Bitcoin Knots" for a Core install.

Bitcoin node UI (docker/bitcoin-ui)
- index.html: impl name/tagline/logo now dynamic. applyImplBranding()
  reads subversion from getnetworkinfo — /Satoshi:X/Knots:Y/ resolves
  to Bitcoin Knots, plain /Satoshi:X/ resolves to Bitcoin Core. Both
  get their own icon and subtitle. Settings modal replaced its
  hardcoded Regtest/txindex=1/port-18443 placeholders with live values
  from getblockchaininfo + getindexinfo + getzmqnotifications.
- index.html: new Storage info card (Full Archive · X GB /
  Pruned · X GB from blockchainInfo.pruned + size_on_disk) visible on
  the main dashboard, same level as Network. Settings modal mirrors it
  with the prune height when applicable.
- Dockerfile + assets/: bitcoin-core.svg, bitcoin-knots.webp, and the
  bg-network.jpg used by the dashboard are now COPY'd into the image
  under /usr/share/nginx/html/assets. Previously the <img src> pointed
  at paths that 404'd into the SPA fallback and the onerror handler
  hid the broken logo silently.

Frontend
- appSession/appSessionConfig.ts: add bitcoin-core to APP_PORTS (8334),
  HTTPS_PROXY_PATHS (/app/bitcoin-ui/), and APP_TITLES (Bitcoin Core).
  Without these the AppSessionFrame showed "No URL found for
  bitcoin-core" and the home/app-list title fell through to the raw id.
- settings/AccountInfoSection.vue: backfill What's New entries for
  v1.7.31 through v1.7.37 that had been missed in earlier cuts.

Release plumbing
- releases/v1.7.37-alpha/: binary + frontend tarball.
- releases/manifest.json: v1.7.37-alpha, sha256/size refreshed.
- Cargo.toml / package.json: version bumps.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-04-22 11:03:47 -04:00
Dorian
682b93f2d6 release(v1.7.31-alpha): idempotent IndeedHub install + auto-merge default mirrors/registries + 3rd OVH update mirror
- Backend: install.rs registry reachability probe now strips the
  `host[:port]/namespace` suffix before appending `/v2/` (the Docker
  V2 API lives at the host root, not under the namespace) and accepts
  HTTP 405 in addition to 200/401 as "registry daemon alive". This
  fixes false "unreachable" reports on the Test button for Gitea and
  other registries that protect their /v2/ endpoint.
- Backend: stacks.rs install_indeedhub_stack now force-removes any
  leftover indeedhub-* containers and indeedhub-net before creating
  the stack. A partial install (or the old first-boot stub racing the
  installer) used to leave containers around that blocked re-install
  with "name already in use". Re-running the App Store install now
  self-heals.
- Backend: registry.rs load_registries auto-merges any default
  registry URLs missing from the saved config (appended with priority
  max+10+i, persisted). Lets new default mirrors (e.g. Server 3 OVH)
  roll out to existing nodes without manual config edits. Explicit
  removals still stick — URLs absent from disk AND absent from
  defaults stay gone.
- Backend: update.rs adds DEFAULT_TERTIARY_MIRROR_URL at
  http://146.59.87.168:3000/ (Server 3 OVH) to default_mirrors, with
  the same auto-merge-on-load behavior as registries. Test updated
  for 3-mirror default (.160, tx1138, .168).
- Scripts: dropped the first-boot IndeedHub stub (~38 lines in
  first-boot-containers.sh §8b). It predated the proper stack
  installer, raced it, and was the main source of the name-conflict
  mess the stacks.rs cleanup above now also guards against.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-04-22 03:26:09 -04:00
Dorian
18f0929614 release(v1.7.30-alpha): live install/uninstall progress + cleaner pull waterfall
- Backend: unified pull-progress streaming across primary AND fallback
  registries. Earlier code only streamed for the primary attempt; if it
  failed fast (VPS 404, etc.) the UI froze at 0% until the fallback
  finished. The waterfall now uses a single shared helper that streams
  podman stderr through update_install_progress for every URL tried.
- Backend: PackageDataEntry gains uninstall_stage, set at each phase of
  handle_package_uninstall ("Stopping containers (i/total)",
  "Cleaning up volumes", "Removing app data"). State flips to Removing
  during the pipeline.
- Frontend: MarketplaceAppCard renders the live progress bar with byte
  counts during installs, matching the System Update download bar style.
- Frontend: AppCard renders the live uninstall stage label per app.
  Modal closes immediately on confirm so concurrent uninstalls each
  show their own progress on their own card.
- Cleanup: removed dead helpers (image_candidates, rewrite_for_primary,
  primary_image_url, pull_from_registries_with_skip) made unused by
  the install.rs refactor.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-04-21 19:11:36 -04:00
Dorian
1709149ebd release(v1.7.29-alpha): VPS as default app registry + settings UI
- New Settings → App registries page (/dashboard/settings/registries)
  that mirrors the update-mirrors experience: list of configured
  registries, test reachability, set primary, add/remove. New
  registry.set-primary RPC; existing registry.{list,add,remove,test}
  reused.
- Default RegistryConfig flipped: VPS (23.182.128.160:3000/lfg2025) is
  now Server 1 (primary), tx1138 is Server 2 (fallback).
- Install pipeline now rewrites the first pull to the primary registry
  URL before attempting it. Before this, installs always hit whichever
  registry the image was hardcoded to, so changing the primary didn't
  actually affect where images came from. On failure, the existing
  fallback walk skips the primary (already tried) and walks the rest.
- App catalog proxy UPSTREAMS order flipped so the catalog follows the
  same VPS-first rule.
- Reboot overlay: animated "a" logo now sits in the center of the ring
  (matches the screensaver composition). Extracted the logo-wrapper
  pattern inline.

7/7 registry tests pass.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-04-21 15:54:07 -04:00
Dorian
46350f48b6 chore(fmt): rustfmt drift cleanup across misc crates
Pure formatter output — no semantic changes. Sweeping these into their
own commit so the FIPS integration diff that follows stays scoped to
the actual feature.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-04-18 22:57:14 -04:00
Dorian
b614c5c694 chore(ci): rustfmt + clippy clean-up to unblock the Rust CI job
The .github/workflows/ci.yml Rust job runs cargo fmt --check, clippy
with -D warnings, and tests. All three were failing. This commit:

- Applies rustfmt across the tree (the bulk of the diff — untouched
  since the last toolchain bump, so a wide sweep was unavoidable).
- Fixes the correctness-level clippy errors:
    container/bitcoin_simulator.rs wildcard-in-or-pattern
    container/manifest.rs from_str rename to parse (reserved name)
    container/podman_client.rs .get(0) -> .first()
    container/runtime.rs manual += collapse
    archipelago/src/constants.rs doc-comment → module-doc
    api/rpc/package/install.rs stray /// comment above a non-item
    container/docker_packages.rs redundant field init
    streaming/advertisement.rs missing Metric import in tests
    tests/orchestration_tests.rs `vec!` in non-Vec contexts
    mesh/listener/dispatch.rs unused store_plain_message import
    api/rpc/tor/mod.rs and mesh/steganography.rs: push-after-new → vec!
- Quiets wide legacy surfaces with crate-level allows in main.rs for
  stylistic lints (too_many_arguments, type_complexity, doc indent,
  enum variant prefix, wildcard-in-or, assertions-on-constants,
  drop_non_drop, unused_io_amount, ptr_arg) — these fired in dozens
  of places with no correctness payoff and have been churning every
  toolchain bump.
- Tags intentional-dead-code helpers: wallet/ and streaming/ modules
  are WIP, mesh::send_chunked_payload and DM_V1_MARKER are kept for
  rollback compatibility, vpn::get_nostr_vpn_status is surface-area
  for a not-yet-landed RPC.

cargo fmt --check, cargo clippy --all-targets --all-features
-- -D warnings, and cargo test --all-features now all pass locally.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-04-18 17:23:46 -04:00
Dorian
0c02d06a66 feat: deploy-to-target supports .253 + mesh/federation/VPN updates
- Add deploy_secondary() function for deploying to multiple LAN nodes
- --both now deploys to .198 and .253 (previously .198 only)
- Fleet deploy updated for 3 LAN nodes
- Mesh DM fixes: protocol frame format, DM-via-channel routing
- Federation pending requests, discover modal
- VPN status UI improvements
- Image versions and container specs updates

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-18 11:07:08 -04:00
Dorian
0f71013952 fix: registry fallback skips dead primary, WireGuard first-boot, Gitea port 3001
Registry fallback now only tries DIFFERENT registries (skips original
that already failed). 120s timeout per fallback attempt. WireGuard
keys generated on unbundled first-boot. Gitea ROOT_URL uses port 3001.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-12 11:40:52 -04:00
Dorian
a1f70f9c18 feat: IndeedHub multi-container stack installer
Installs all 7 containers (postgres, redis, minio, relay, api,
ffmpeg, frontend) on indeedhub-net with proper env vars and volumes.
Fixes pull timeout to cover stderr reader. Catalog registry set to
23.182.128.160:3000.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-12 10:37:18 -04:00
Dorian
2d11f262dd fix: pull timeout covers entire operation, swap registry priority
Timeout now wraps stderr reader + wait (was only wrapping wait, so
hung pulls were never killed). 23.182.128.160:3000 is now primary
registry since git.tx1138.com is unreachable.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-12 10:18:24 -04:00
Dorian
877b3e4168 fix: image pull timeout actually triggers fallback
Previous timeout used ExitStatus::default() which is success on Linux,
so the fallback never triggered. Now properly kills process, awaits
exit, and forces fallback path on timeout.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-12 10:08:22 -04:00
Dorian
b0656b068f fix: 60s timeout on image pull, gitea port 3001, wireguard first-boot
Image pulls now timeout after 60s and fall through to dynamic registry
fallback instead of hanging forever when primary is unreachable.
Gitea external port corrected to 3001. WireGuard key generation
added to first-boot for fresh installs.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-12 10:00:06 -04:00
Dorian
3078d4b69e feat: dynamic app catalog, Gitea app polish, registry sync
App catalog served from Gitea repos (app-catalog) with 35 apps.
Nodes fetch catalog dynamically — new apps appear without frontend
rebuild. Test app added and removed to verify pipeline.

Gitea manifest updated with internal_port/nginx_proxy for iframe.
Updated catalog.json, nginx configs, app session configs.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-12 08:20:18 -04:00
Dorian
1147dbd882 feat: dynamic container registry with fallback
Configurable registry list persisted to config/registries.json.
Image pulls try all registries in priority order — if primary fails,
fallback registries are attempted automatically. RPC endpoints:
registry.list, registry.add, registry.remove, registry.test.

Replaces hardcoded fallback logic with extensible registry system.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-12 08:09:14 -04:00
Dorian
a97128bfd2 feat: fallback container registry at 23.182.128.160:3000
When primary registry (git.tx1138.com) fails, image pull automatically
retries from Gitea registry at 23.182.128.160:3000. Tags pulled image
with original name so install continues seamlessly. Gitea added as
external app in app session config.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-12 06:38:34 -04:00
Dorian
6cd67df575 feat: add Gitea as Archipelago app with container registry
Gitea app manifest, marketplace entry, nginx proxy, app session config,
image version, package install config. Container registry enabled on
Gitea for fallback image hosting. Trusted registries updated.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-12 06:10:56 -04:00
Dorian
bb14490fb7 feat: botfights, discover, mobile gamepad, content handler, package config updates
Miscellaneous improvements: botfights manifest, discover page curated
apps, mobile gamepad enhancements, content HTTP handler, package
install config updates, health monitor tweaks, shared content UI,
container specs and image version updates.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-11 23:11:41 -04:00
Dorian
24f122f35a fix: allow Fedimint install without local Bitcoin node
Fedimint can use a remote Bitcoin RPC (e.g., over Tailscale or Tor).
Dependency check now logs info instead of blocking installation.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-11 23:00:06 -04:00
Dorian
abf6ca000d feat: botfights container app + mobile gamepad + indeedhub fixes
- Promote botfights from external proxy to container app (port 9100)
- Add /app/botfights/ nginx proxy rules (HTTP + HTTPS)
- Add ARCHY_EMBEDDED env var to botfights container config
- Add BOTFIGHTS_IMAGE to image-versions.sh
- Add mobile gamepad overlay (D-pad + A/B + START/SELECT) for botfights
  arcade mode, sends postMessage arcade-input to iframe
- Remove old /ext/botfights/ and port 8901 external proxy blocks
- IndeeHub: add post-install nginx patching for NIP-07 provider injection
- IndeeHub: fix docker image references to registry (was localhost)
- IndeeHub: update port 7777 -> 7778

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-11 16:47:54 -04:00
Dorian
548107eb8b feat: add botfights app config and update container registry
- Add git.tx1138.com to trusted registries (replaces old 80.71.235.15)
- Add botfights app config: port 9100, data volume, JWT_SECRET auto-gen, fight loop

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-04-11 20:01:14 +01:00
Dorian
401a44b40a fix: IndeedHub port 7778, podman registries v2 format
- IndeedHub container port changed from 7777 to 7778 (7777 used by nostr-relay)
- Nginx proxy updated to route to 7778
- Backend config.rs port mapping updated
- Podman registries.conf switched to v2 format (fixes mixed v1/v2 error)

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-11 12:32:32 -04:00
Dorian
a147db9b70 refactor: migrate container registry from 80.71.235.15:3000 to git.tx1138.com/lfg2025
All hardcoded references to the old IP-based registry replaced across
Rust backend, Vue frontend, shell scripts, Dockerfiles, CI, and docs.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-11 09:33:10 -04:00
Dorian
067a3ed106 fix: ISO boot, container installs, VPN, nginx, companion input
- LUKS auto-unlock: initramfs hook + systemd service + nofail fstab
- Rootfs packages: add passt, aardvark-dns, netavark, nftables for Podman 5.x
- nginx: resolver + variable proxy_pass for external domains (DNS at boot)
- Boot: loglevel=0 suppresses kernel warnings, serial console for QEMU
- Container installs: write configs before chown, sudo chown for LUKS volumes
- Container installs: build UI sidecars locally (not from registry) for auth injection
- Bitcoin UI: inject RPC auth from secrets file, --no-cache rebuild
- Secrets: chown to archipelago user in first-boot (backend needs read access)
- Podman: image_copy_tmp_dir for read-only /var/tmp in user namespace
- NostrVPN: enable service in auto-install, always include public relays
- NostrVPN: read tunnel IP from nvpn status (not just config file)
- VPN invite: v2 base64 no-pad format matching phone app
- Companion input: relay always active, kiosk skips relay listener (prevents double input)
- dev-start.sh: production build includes AIUI deployment

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-10 03:10:49 -04:00
Dorian
2517379ac3 chore: Debian 12 → 13 (Trixie) migration, service hardening
- Update all references from Debian 12 (Bookworm) to Debian 13 (Trixie)
- Enable SystemCallArchitectures, RestrictAddressFamilies, RestrictRealtime
  in archipelago.service (safe on systemd 256+ which respects NoNewPrivileges=no)
- Update GLIBC compatibility checks from 2.36 to 2.40
- ISO filename, build container, and docs updated throughout

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-04-09 21:32:08 +02:00
Dorian
a8c6a36cd1 fix: netavark GLIBC mismatch in ISO, container adopt, app updates
ISO build no longer copies netavark from build host (Debian 13/GLIBC 2.41)
which broke container networking on Debian 12 targets. Rootfs already
installs netavark from Debian 12 repos — just configure the backend.

Install RPC now adopts existing containers (from first-boot) instead of
erroring on duplicates. Container scanner extracts real versions from
image tags and detects available updates against pinned versions.

Frontend shows update button with version info when updates are available.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-04-09 11:47:35 +02:00
Dorian
07dff3e4ca fix: container stack installers, DNS resolution, uninstall cleanup
- Replace aardvark-dns container names with host.containers.internal
  for all cross-app connections (LND→Bitcoin, ElectrumX→Bitcoin,
  Mempool→ElectrumX, Fedimint→Bitcoin, NBXplorer→Bitcoin P2P+RPC)
- Add BTCPay multi-container stack installer (postgres + nbxplorer +
  btcpay-server) with proper secrets, data dir ownership, NOAUTH
- Add Mempool multi-container stack installer (mariadb + mempool-api +
  mempool-frontend) with host.containers.internal for RPC
- Immediately remove apps from state on uninstall (no 3-min ghost delay)
- Include archy-bitcoin-ui in bitcoin uninstall container list
- Fix LND UI port 8081 (was 8080, conflicting with LND gRPC)
- Fix ElectrumX UI: proxy /electrs-status to backend, cache-busting
  headers, graceful fallback when backend returns HTML
- Add Tor hidden services for ElectrumX and LND in torrc template
- Remove unused detect_bitcoin_container_name() (replaced by
  host.containers.internal)

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-04-08 23:29:50 +02:00
Dorian
8e094c7ce9 fix: install/uninstall UI state, progress bar, auto-Tor hidden services
- Install progress bar replaces action buttons (no overlay)
- Hide status badge during install/uninstall
- Uninstall keeps progress state until container disappears from WebSocket
- Uninstall RPC timeout increased to 660s (Bitcoin UTXO flush)
- Installing apps appear in My Apps immediately as placeholders
- Auto-configure Tor hidden service for every app on install
- Widen Tor module visibility for install hooks
- Only clear stale install entries on error status

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-08 09:20:18 +02:00