68 Commits

Author SHA1 Message Date
archipelago
05e6c2e738 fix: release v1.7.51-alpha install hardening 2026-05-01 05:02:39 -04:00
archipelago
7ab788d178 chore: release v1.7.49-alpha 2026-04-30 16:37:54 -04:00
archipelago
8f83b37d51 feat(orchestrator): complete container migration and release hardening 2026-04-28 15:00:58 -04:00
archipelago
2843cc1e84 fix(container/image_versions): reject entries that are not image references
The parser retained any key ending in _IMAGE, so a harmless-looking
variable like NOT_AN_IMAGE="something" would be treated as a pinned
container image. Add a value-shape check: the value must contain both
a registry separator (/) and a tag separator (:) to qualify.
2026-04-23 13:02:15 -04:00
archipelago
12f93cc15e fix(image-versions): locate image-versions.sh at its actual deployed path
The Rust search path listed /opt/archipelago/image-versions.sh and
scripts/image-versions.sh (repo-relative for dev), but the image
recipe deploys the file to /opt/archipelago/scripts/image-versions.sh.
Production nodes therefore silently failed every lookup: find_file
returned None, load_image_versions returned an empty HashMap, and
both pinned_image_for_app and pinned_images_for_stack returned no
matches.

Symptom on deployed nodes: every container scan emitted
"image-versions.sh not found in any search path" at DEBUG level, and
the version-comparison logic in docker_packages.rs plus the
update-check logic in api/rpc/package/update.rs silently degraded to
no-op — users would not see update-available badges and upgrade RPCs
could not resolve pinned targets.

Fix: put the canonical deployed path first in PATHS, keep the older
/opt/archipelago/image-versions.sh as a fallback for not-yet-updated
nodes, and retain scripts/image-versions.sh as the dev-repo-relative
fallback. Verified on .228: backend now logs "Parsed 57 image
versions from /opt/archipelago/scripts/image-versions.sh" on scan.

Pre-existing test_parse_image_versions failure in this module is
unrelated (the NOT_AN_IMAGE assertion was broken before this change
because the parser's _IMAGE-suffix retain keeps it). Leaving that for
the general cargo-test cleanup pass.
2026-04-23 09:29:15 -04:00
archipelago
28e38a36a9 fix(config): auto-purge decommissioned .23 VPS from saved registry/mirror configs
load_registries + load_mirrors normally only ADD missing defaults to
the persisted JSON — explicit removals stick. After retiring the .23
Hetzner VPS we need the opposite: existing nodes have .23 baked into
their saved configs and would spend seconds per install/update timing
out against a dead host until the operator manually removes it via
the Settings UI.

Add a targeted one-time migration in both loaders: if any saved entry
has 23.182.128.160 in its URL, drop it on load and rewrite the file.
This is an exception to the usual "explicit removals stick" rule —
the user never chose to add this mirror, it was a default.

Narrow-scope migration (one hardcoded IP match, no schema version)
because the cost/benefit of a general migration system isn't worth
it for a single decommissioned host. Future retirements can follow
the same pattern.
2026-04-23 08:51:26 -04:00
archipelago
d9d5fa65e5 chore: retire .23 VPS mirror, promote .168 OVH to primary
The Hetzner VPS at 23.182.128.160 was decommissioned. Replace it
everywhere with the OVH VPS at 146.59.87.168, which was previously
the tertiary mirror.

  - update.rs: drop DEFAULT_TERTIARY_MIRROR_URL, promote .168 into
    the secondary slot as "Server 1 (OVH)"; tx1138 becomes Server 2.
    Default mirror list shrinks from 3 to 2.
  - container/registry.rs: default RegistryConfig drops .23, promotes
    .168 to Server 1 / priority 0, tx1138 stays Server 2 / priority 10.
  - api/rpc/package/config.rs: trusted-registry allowlist swaps .23
    for .168.
  - api/handler/mod.rs: app-catalog fallback URL uses .168.
  - neode-ui/views/marketplace/marketplaceData.ts: REGISTRY uses .168.
  - scripts/image-versions.sh: ARCHY_REGISTRY_FALLBACK uses .168.
  - image-recipe/build-auto-installer-iso.sh: installer ISO registries
    use .168 (both podman registries.conf and backend registries.json).

Tests updated to assert on the new 2-entry default lists (registry +
mirror). URL-parser fixture tests in update.rs retain .23 strings —
they exercise string-parsing logic, not mirror policy.

Git remotes: dropped `gitea-vps` and the .23 push URL on the `origin`
multi-push alias (not part of this commit — pure working-copy change).
2026-04-23 08:22:32 -04:00
archipelago
3e9c192b48 feat(container): bitcoin-ui pre-start hook renders nginx.conf from embedded template
Replaces the first-boot-containers.sh sed/envsubst approach with a
Rust-native render step bound into the ContainerOrchestrator lifecycle.

- New container::bitcoin_ui module: embeds the nginx.conf template via
  include_str!, reads the plaintext RPC password from
  /var/lib/archipelago/secrets/bitcoin-rpc-password, substitutes
  {{BITCOIN_RPC_AUTH}} with base64(archipelago:<password>), and atomic-
  writes (tmp + rename) to /var/lib/archipelago/bitcoin-ui/nginx.conf.
  Idempotent: byte-compares before writing so unchanged input is a
  no-op (no inode churn, no restart cascade).
- ProdContainerOrchestrator gains run_pre_start_hooks(app_id) returning
  HookOutcome::{Rewritten, Unchanged}. Fires in install_fresh before
  create_container, and in ensure_running: on Running + Rewritten
  triggers a restart; on Stopped re-renders then starts.
- bitcoin-ui Dockerfile no longer COPYs a default.conf; the file now
  arrives via runtime bind-mount of the rendered config. If the bind-
  mount is ever missing, nginx starts with no site configured and
  returns 404 everywhere — safe failure vs. serving upstream RPC with
  a stale Authorization header.
- apps/{bitcoin,electrs,lnd}-ui/manifest.yml land as first-class
  manifests. bitcoin-ui declares the bind-mount target and a dependency
  on bitcoin-core; electrs-ui and lnd-ui declare their own deps and
  health checks.
- 8 new unit tests on the render fn (idempotency, rotation, trimming,
  missing/empty secret, template invariants) plus an integration test
  asserting install(bitcoin-ui) actually lands a substituted nginx.conf
  on disk via the hook. 39/39 container:: tests pass
  (test_parse_image_versions pre-existing failure unchanged, out of
  scope).
2026-04-23 02:19:52 -04:00
archipelago
81c1613040 feat(container): BootReconciler — periodic reconcile loop for prod orchestrator
Step 5 of the rust-orchestrator migration. New file boot_reconciler.rs holds a
small Tokio task that calls ProdContainerOrchestrator::reconcile_all() on a
30-second cadence (answered design Q3).

  * BootReconciler::new(orch, interval, shutdown) — shutdown is an Arc<Notify>
    so callers can trigger a graceful exit without pulling in tokio-util.
  * run_forever(self) — does one reconcile immediately, then loops on
    tokio::select! { sleep_until | shutdown.notified() }. Shutdown interrupts
    the sleep but never an in-flight reconcile_all call.
  * Per-pass outcomes are logged at debug/warn; failures never propagate out
    because reconcile_all already absorbs per-app errors into ReconcileReport.

Four tokio::test(start_paused = true) tests verify the loop cadence against a
CountingRuntime test double:
  * initial_pass_fires_immediately — first reconcile runs with no delay
  * second_pass_fires_after_interval — second pass fires after exactly
    interval elapses in paused-clock time
  * shutdown_terminates_loop — notify_one() lets run_forever return
  * failure_in_one_pass_does_not_stop_loop — the loop keeps ticking even when
    the first pass had to install a missing container

Not wired into main.rs yet — that is Step 6. Re-exported from container::mod
as BootReconciler + RECONCILER_DEFAULT_INTERVAL for the wire-up step.
2026-04-22 19:04:34 -04:00
archipelago
40a6eaca72 feat(container): ContainerOrchestrator trait, RpcHandler uses it in prod
Step 4 of the rust-orchestrator migration. Unifies the container lifecycle
surface behind a single trait so the RPC layer stops caring whether it is
talking to the dev or prod orchestrator.

  * New trait core/archipelago/src/container/traits.rs: ContainerOrchestrator
    with install / start / stop / restart / remove / upgrade / status / list /
    logs / health, all keyed by app_id. Every method is async_trait-based.

  * ProdContainerOrchestrator: the lifecycle methods are moved from inherent
    impl into the trait impl (avoids name-shadowing recursion). Adoption and
    reconcile remain inherent since only main.rs / BootReconciler call them.

  * DevContainerOrchestrator: new trait impl that forwards to the existing
    Dev-named methods, applying the dev container-name + port-offset rules
    internally. New load_manifest_for() helper resolves app_id to
    <data_dir>/apps/<app_id>/manifest.yml so trait-level install(app_id)
    works in dev too. install_container(manifest, path) stays inherent for
    the manifest-path RPC shape.

  * RpcHandler now holds Option<Arc<dyn ContainerOrchestrator>> and, when in
    dev mode, a separate Option<Arc<DevContainerOrchestrator>> for the
    manifest_path install RPC. In prod mode RpcHandler::new() constructs a
    ProdContainerOrchestrator and calls load_manifests() at startup.

  * All seven container-* RPC guards no longer say dev mode required.
    container-install still requires dev mode because its manifest_path
    argument has no prod meaning; every other container RPC now works in both
    modes via the trait.

BOOT STILL DOES NOT USE THIS. main.rs wire-up (Step 6) and BootReconciler
(Step 5) come next. Until then the prod orchestrator is constructed but nothing
populates /opt/archipelago/apps so it has zero manifests to manage, matching
the pre-Step-4 behaviour.

Verification: cargo build -p archipelago clean (11 expected unused method
warnings for methods not yet wired from main.rs). cargo test -p archipelago:
all 21 container::* tests pass (16 prod_orchestrator + 5 others). 24 other
test failures are pre-existing and unrelated (identity_manager / session /
wallet / mesh / credentials — all independently flaky on file-backed state).
2026-04-22 18:56:52 -04:00
archipelago
e103925a4e feat(container): ProdContainerOrchestrator with build-or-pull, adoption, reconcile
Step 3 of the rust-orchestrator-migration. New file prod_orchestrator.rs (999 LOC)
implements the full public surface that will replace scripts/first-boot-containers.sh:

  * install / start / stop / restart / remove / upgrade / status / list / logs / health
  * adopt_existing: read-only scan that claims containers matching our manifests by
    name, without recreating — preserves the v1.7.42 fixture on .116.
  * reconcile_all: level-triggered, per-app failures collected rather than aborting.
  * install_fresh: build-or-pull (Step 2 trait methods), relative build contexts
    resolved against the manifest directory.

Naming rule (answered design Q1): UI app IDs (bitcoin-ui/electrs-ui/lnd-ui) get the
archy- prefix; backends keep their bare ID. An explicit extensions.container_name
always wins. Codified in compute_container_name() with unit tests for all three tiers.

Concurrency (answered design Q4): per-app tokio::sync::Mutex<()> created lazily,
protecting every mutating op against the reconciler loop. Acquiring the per-app
lock only needs a read lock on the map, so independent apps do not serialize.

16 tests: 3 sync naming rule tests + 13 tokio async tests covering install (pull,
build-absent, build-present, relative-context), reconcile (noop/exited/missing/
mixed-failure), adopt-by-name, upgrade sequence ordering, list filtering, health
state mapping, and unknown-app-id rejection. All pass.

Not wired into main.rs yet — that is Step 6. Crate builds clean with expected
unused warnings for the new re-exports.
2026-04-22 18:32:31 -04:00
archipelago
919055f3f1 feat(container): add build source to manifest schema
ContainerConfig.image is now Option<String>, mutually exclusive with a new
optional ContainerConfig.build: Option<BuildConfig>. Exactly one of image
or build must be present, enforced in AppManifest::validate.

Adds ResolvedSource enum (Pull | Build) and ContainerConfig::resolve +
::image_ref helpers so the orchestrator can treat pull and build uniformly.
All 26 existing pull-only manifests continue to parse unchanged
(covered by existing_pull_only_manifests_still_parse test).

Call sites updated: podman_client, runtime::DockerRuntime, dev_orchestrator.
Dev orchestrator errors out cleanly on Build sources until Step 2 lands
build_image support on the runtime trait.

Step 1 of docs/rust-orchestrator-migration.md. 10 new unit tests, all pass.

Also includes: docs/rust-orchestrator-migration.md (design spec) and
docs/STATUS.md resume section for the next session.
2026-04-22 17:46:36 -04:00
Dorian
36a6101026 release(v1.7.38-alpha): onboarding auto-heal + silent returning logins + app-store trim
- auth.rs now infers onboarding-complete from setup_complete + password_hash so
  nodes stop bouncing users through the intro wizard after browser clear / update
  / reboot; the flag self-heals to disk on next check
- frontend: "backend uncertain" no longer defaults to /onboarding/intro —
  useOnboarding returns null + callers poll / retry instead of flashing the wizard
- login sounds (synthwave, welcome voice, pop, whoosh, oomph) gated by
  isFirstInstallPhase(); typing sounds unaffected
- removed FIPS app, Nostr Relay, Nostr VPN, Routstr, Penpot from catalog,
  frontend config, Rust AppMetadata + install dispatch + install_penpot_stack;
  docker/fips-ui + docker/nostr-vpn-ui + apps/penpot dirs and 5 icons deleted;
  15 image versions deleted from tx1138, .168, gitea-local registries (.160
  Gitea was 502 at release time — follow-up)
- AIUI baked into frontend release tarball via demo/aiui/; deploy-to-target
  falls back to demo/aiui/ when the AIUI sibling checkout is missing
- prebuild hook syncs app-catalog/catalog.json → public/catalog.json so the
  two copies can no longer drift (was the source of the "apps still visible"
  bug — public/ had stale data)

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-04-22 13:02:24 -04:00
Dorian
cfc98c600e release(v1.7.37-alpha): bitcoin-core install fixes + dynamic node UI + full-archive default
Install flow
- api/rpc/package/install.rs: always append the literal image URL as a
  last-resort pull candidate in do_pull_image, so images not carried by
  any configured mirror (docker.io/bitcoin/bitcoin:28.4) still install
  instead of masquerading as a generic pull failure across every mirror.
- api/rpc/package/install.rs: write_bitcoin_conf now skips on any stat
  error, not just "file exists". Once bitcoin-knots' first-boot chowns
  /var/lib/archipelago/bitcoin into the container's user namespace (700
  perms, UID 100100/100101), the archipelago daemon can't even traverse
  in — try_exists returns Err which unwrap_or(false) treated as "not
  present" and drove a doomed write. Now errors out of the directory
  traversal are treated as "conf already owned by container user" and
  the write is skipped. Mirrors the lnd.conf pattern.
- api/rpc/package/install.rs: drop the hardcoded `prune=550` from the
  conf default. Operators with multi-TB drives shouldn't be silently
  pruned; users who want a pruned node can set it in bitcoin.conf
  themselves. Full archive is the only honest default.
- api/rpc/package/config.rs: bitcoin-core now passes explicit
  -server/-rpcbind/-rpcallowip/-rpcport/-printtoconsole/-datadir CLI
  args. Vanilla bitcoin/bitcoin:28.4 has no entrypoint wrapper and
  reads conf + argv only; without these the RPC listens on 127.0.0.1
  inside the container and rootlessport can't reach it, so the
  bitcoin-ui companion gets 502 on every /bitcoin-rpc/ call.
  Bitcoin Knots keeps its own entrypoint-driven defaults.
- container/docker_packages.rs: split bitcoin-core out of the shared
  AppMetadata arm. bitcoin-core now surfaces as "Bitcoin Core" with
  bitcoin-core.svg and a Reference-implementation description; the
  bitcoin + bitcoin-knots ids keep the Knots branding. Fixes the home
  card showing "Bitcoin Knots" for a Core install.

Bitcoin node UI (docker/bitcoin-ui)
- index.html: impl name/tagline/logo now dynamic. applyImplBranding()
  reads subversion from getnetworkinfo — /Satoshi:X/Knots:Y/ resolves
  to Bitcoin Knots, plain /Satoshi:X/ resolves to Bitcoin Core. Both
  get their own icon and subtitle. Settings modal replaced its
  hardcoded Regtest/txindex=1/port-18443 placeholders with live values
  from getblockchaininfo + getindexinfo + getzmqnotifications.
- index.html: new Storage info card (Full Archive · X GB /
  Pruned · X GB from blockchainInfo.pruned + size_on_disk) visible on
  the main dashboard, same level as Network. Settings modal mirrors it
  with the prune height when applicable.
- Dockerfile + assets/: bitcoin-core.svg, bitcoin-knots.webp, and the
  bg-network.jpg used by the dashboard are now COPY'd into the image
  under /usr/share/nginx/html/assets. Previously the <img src> pointed
  at paths that 404'd into the SPA fallback and the onerror handler
  hid the broken logo silently.

Frontend
- appSession/appSessionConfig.ts: add bitcoin-core to APP_PORTS (8334),
  HTTPS_PROXY_PATHS (/app/bitcoin-ui/), and APP_TITLES (Bitcoin Core).
  Without these the AppSessionFrame showed "No URL found for
  bitcoin-core" and the home/app-list title fell through to the raw id.
- settings/AccountInfoSection.vue: backfill What's New entries for
  v1.7.31 through v1.7.37 that had been missed in earlier cuts.

Release plumbing
- releases/v1.7.37-alpha/: binary + frontend tarball.
- releases/manifest.json: v1.7.37-alpha, sha256/size refreshed.
- Cargo.toml / package.json: version bumps.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-04-22 11:03:47 -04:00
Dorian
682b93f2d6 release(v1.7.31-alpha): idempotent IndeedHub install + auto-merge default mirrors/registries + 3rd OVH update mirror
- Backend: install.rs registry reachability probe now strips the
  `host[:port]/namespace` suffix before appending `/v2/` (the Docker
  V2 API lives at the host root, not under the namespace) and accepts
  HTTP 405 in addition to 200/401 as "registry daemon alive". This
  fixes false "unreachable" reports on the Test button for Gitea and
  other registries that protect their /v2/ endpoint.
- Backend: stacks.rs install_indeedhub_stack now force-removes any
  leftover indeedhub-* containers and indeedhub-net before creating
  the stack. A partial install (or the old first-boot stub racing the
  installer) used to leave containers around that blocked re-install
  with "name already in use". Re-running the App Store install now
  self-heals.
- Backend: registry.rs load_registries auto-merges any default
  registry URLs missing from the saved config (appended with priority
  max+10+i, persisted). Lets new default mirrors (e.g. Server 3 OVH)
  roll out to existing nodes without manual config edits. Explicit
  removals still stick — URLs absent from disk AND absent from
  defaults stay gone.
- Backend: update.rs adds DEFAULT_TERTIARY_MIRROR_URL at
  http://146.59.87.168:3000/ (Server 3 OVH) to default_mirrors, with
  the same auto-merge-on-load behavior as registries. Test updated
  for 3-mirror default (.160, tx1138, .168).
- Scripts: dropped the first-boot IndeedHub stub (~38 lines in
  first-boot-containers.sh §8b). It predated the proper stack
  installer, raced it, and was the main source of the name-conflict
  mess the stacks.rs cleanup above now also guards against.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-04-22 03:26:09 -04:00
Dorian
18f0929614 release(v1.7.30-alpha): live install/uninstall progress + cleaner pull waterfall
- Backend: unified pull-progress streaming across primary AND fallback
  registries. Earlier code only streamed for the primary attempt; if it
  failed fast (VPS 404, etc.) the UI froze at 0% until the fallback
  finished. The waterfall now uses a single shared helper that streams
  podman stderr through update_install_progress for every URL tried.
- Backend: PackageDataEntry gains uninstall_stage, set at each phase of
  handle_package_uninstall ("Stopping containers (i/total)",
  "Cleaning up volumes", "Removing app data"). State flips to Removing
  during the pipeline.
- Frontend: MarketplaceAppCard renders the live progress bar with byte
  counts during installs, matching the System Update download bar style.
- Frontend: AppCard renders the live uninstall stage label per app.
  Modal closes immediately on confirm so concurrent uninstalls each
  show their own progress on their own card.
- Cleanup: removed dead helpers (image_candidates, rewrite_for_primary,
  primary_image_url, pull_from_registries_with_skip) made unused by
  the install.rs refactor.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-04-21 19:11:36 -04:00
Dorian
1709149ebd release(v1.7.29-alpha): VPS as default app registry + settings UI
- New Settings → App registries page (/dashboard/settings/registries)
  that mirrors the update-mirrors experience: list of configured
  registries, test reachability, set primary, add/remove. New
  registry.set-primary RPC; existing registry.{list,add,remove,test}
  reused.
- Default RegistryConfig flipped: VPS (23.182.128.160:3000/lfg2025) is
  now Server 1 (primary), tx1138 is Server 2 (fallback).
- Install pipeline now rewrites the first pull to the primary registry
  URL before attempting it. Before this, installs always hit whichever
  registry the image was hardcoded to, so changing the primary didn't
  actually affect where images came from. On failure, the existing
  fallback walk skips the primary (already tried) and walks the rest.
- App catalog proxy UPSTREAMS order flipped so the catalog follows the
  same VPS-first rule.
- Reboot overlay: animated "a" logo now sits in the center of the ring
  (matches the screensaver composition). Extracted the logo-wrapper
  pattern inline.

7/7 registry tests pass.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-04-21 15:54:07 -04:00
Dorian
46350f48b6 chore(fmt): rustfmt drift cleanup across misc crates
Pure formatter output — no semantic changes. Sweeping these into their
own commit so the FIPS integration diff that follows stays scoped to
the actual feature.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-04-18 22:57:14 -04:00
Dorian
b614c5c694 chore(ci): rustfmt + clippy clean-up to unblock the Rust CI job
The .github/workflows/ci.yml Rust job runs cargo fmt --check, clippy
with -D warnings, and tests. All three were failing. This commit:

- Applies rustfmt across the tree (the bulk of the diff — untouched
  since the last toolchain bump, so a wide sweep was unavoidable).
- Fixes the correctness-level clippy errors:
    container/bitcoin_simulator.rs wildcard-in-or-pattern
    container/manifest.rs from_str rename to parse (reserved name)
    container/podman_client.rs .get(0) -> .first()
    container/runtime.rs manual += collapse
    archipelago/src/constants.rs doc-comment → module-doc
    api/rpc/package/install.rs stray /// comment above a non-item
    container/docker_packages.rs redundant field init
    streaming/advertisement.rs missing Metric import in tests
    tests/orchestration_tests.rs `vec!` in non-Vec contexts
    mesh/listener/dispatch.rs unused store_plain_message import
    api/rpc/tor/mod.rs and mesh/steganography.rs: push-after-new → vec!
- Quiets wide legacy surfaces with crate-level allows in main.rs for
  stylistic lints (too_many_arguments, type_complexity, doc indent,
  enum variant prefix, wildcard-in-or, assertions-on-constants,
  drop_non_drop, unused_io_amount, ptr_arg) — these fired in dozens
  of places with no correctness payoff and have been churning every
  toolchain bump.
- Tags intentional-dead-code helpers: wallet/ and streaming/ modules
  are WIP, mesh::send_chunked_payload and DM_V1_MARKER are kept for
  rollback compatibility, vpn::get_nostr_vpn_status is surface-area
  for a not-yet-landed RPC.

cargo fmt --check, cargo clippy --all-targets --all-features
-- -D warnings, and cargo test --all-features now all pass locally.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-04-18 17:23:46 -04:00
Dorian
0c02d06a66 feat: deploy-to-target supports .253 + mesh/federation/VPN updates
- Add deploy_secondary() function for deploying to multiple LAN nodes
- --both now deploys to .198 and .253 (previously .198 only)
- Fleet deploy updated for 3 LAN nodes
- Mesh DM fixes: protocol frame format, DM-via-channel routing
- Federation pending requests, discover modal
- VPN status UI improvements
- Image versions and container specs updates

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-18 11:07:08 -04:00
Dorian
0f71013952 fix: registry fallback skips dead primary, WireGuard first-boot, Gitea port 3001
Registry fallback now only tries DIFFERENT registries (skips original
that already failed). 120s timeout per fallback attempt. WireGuard
keys generated on unbundled first-boot. Gitea ROOT_URL uses port 3001.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-12 11:40:52 -04:00
Dorian
2d11f262dd fix: pull timeout covers entire operation, swap registry priority
Timeout now wraps stderr reader + wait (was only wrapping wait, so
hung pulls were never killed). 23.182.128.160:3000 is now primary
registry since git.tx1138.com is unreachable.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-12 10:18:24 -04:00
Dorian
1147dbd882 feat: dynamic container registry with fallback
Configurable registry list persisted to config/registries.json.
Image pulls try all registries in priority order — if primary fails,
fallback registries are attempted automatically. RPC endpoints:
registry.list, registry.add, registry.remove, registry.test.

Replaces hardcoded fallback logic with extensible registry system.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-12 08:09:14 -04:00
Dorian
a147db9b70 refactor: migrate container registry from 80.71.235.15:3000 to git.tx1138.com/lfg2025
All hardcoded references to the old IP-based registry replaced across
Rust backend, Vue frontend, shell scripts, Dockerfiles, CI, and docs.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-11 09:33:10 -04:00
Dorian
a8c6a36cd1 fix: netavark GLIBC mismatch in ISO, container adopt, app updates
ISO build no longer copies netavark from build host (Debian 13/GLIBC 2.41)
which broke container networking on Debian 12 targets. Rootfs already
installs netavark from Debian 12 repos — just configure the backend.

Install RPC now adopts existing containers (from first-boot) instead of
erroring on duplicates. Container scanner extracts real versions from
image tags and detects available updates against pinned versions.

Frontend shows update button with version info when updates are available.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-04-09 11:47:35 +02:00
Dorian
b3a6d2103a fix: container stability, OnlyOffice removal, node bootstrapping, UI fixes
Container orchestration:
- Add --network-alias to all archy-net containers (fixes Podman DNS)
- Fix bitcoin-knots health check: expand $BITCOIN_RPC_PASS at creation
- Increase bitcoin-knots memory limit to 4g, reduce dbcache to 2048
- Enable podman-restart.service in ISO for auto-start on boot
- Fix UI container Dockerfiles: ENTRYPOINT [], user root for rootless

App changes:
- Remove OnlyOffice (incompatible with rootless Podman)
- Replace with CryptPad reference (single-process, e2e encrypted)
- Fix NPM port mapping: 8181 → 81
- Fix OnlyOffice port mapping: 8044 → 9980 (now CryptPad: 3003)

AIUI & proxy:
- Add MODEL_MAP to claude-api-proxy (ISO + deploy)
- Map legacy model IDs (claude-haiku-4.5 → claude-haiku-4-5-20251001)

Kiosk:
- Move chromium-kiosk data dir to /var/lib/archipelago (data partition)
- Remove --metrics-recording-only (contradicted --disable-metrics)

Node bootstrapping:
- Add bootstrap-switchover.sh for live node updates
- ElectrumX UI improvements and nginx proxy fixes
- LND UI nginx config updates

Backend:
- Bitcoin health check uses .cookie auth (no plaintext creds)
- ElectrumX status endpoint improvements
- Network alias flag in install.rs for DNS reliability

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-02 16:15:04 +01:00
Dorian
7a263851f2 fix: add bitcoin, electrumx, filebrowser to tor_service_name mapping
These services had hidden services configured in torrc but their
app IDs weren't mapped in tor_service_name(), so read_tor_address()
returned None and the UI showed them as having no Tor service.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-01 18:22:30 +01:00
Dorian
9a556d7819 fix: CSRF race condition, UI containers, Tor ordering, seed layout
- session.rs: use OnceCell for remember_secret to prevent concurrent
  requests on first boot from generating different HMAC secrets, which
  caused CSRF token mismatch on every state-changing RPC call (app
  install, start, stop all failed with "CSRF token missing or invalid")

- install.rs: write lnd.conf with Bitcoin RPC credentials before LND
  container starts (prevents "bitcoin.mainnet must be specified" crash);
  inject Bitcoin RPC auth into bitcoin-ui nginx.conf; add proper error
  logging to UI container build/run steps; fix UI containers to use
  --network=host (they proxy to localhost backend/bitcoin RPC)

- Tor: remove After=tor.service from archipelago-tor-helper.path to
  break systemd ordering cycle that prevented Tor from starting on boot

- Seed screen: compact grid layout (2 cols mobile, 4 cols sm+) with
  tighter padding to fit kiosk displays without scrolling

- Dockerfiles: remove nonexistent assets/ COPY from bitcoin-ui, fix
  electrs-ui to COPY qrcode.js and EXPOSE 50002 (matches nginx.conf)

- image-versions.sh: add UI container image variables for registry

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-03-31 11:06:19 +01:00
Dorian
1e283daf13 fix: overhaul container lifecycle — recovery, health, uninstall, UI state
Container recovery:
- Health monitor: MAX_RESTART_ATTEMPTS 3→10, interval 60s→120s
- Dependency-aware restarts: won't restart services before their deps
- Reset dependent counters when a dependency recovers
- Handle "created" state containers (were invisible to health monitor)
- Added IndeedHub, mempool-api, mysql to tier system
- Crash recovery: podman start timeout 30s→120s with retry
- Podman client: socket timeout 5s→30s, added restart policy

UI state representation:
- Exit code 0 shows "stopped" (gray), not "crashed" (red)
- Exit code 137 shows "killed (OOM)"
- Non-zero exit shows "crashed" (red)
- Added exit_code field to PackageDataEntry

Install/uninstall fixes:
- Install returns error when container doesn't start (was silent success)
- Post-install hooks awaited instead of fire-and-forget tokio::spawn
- Uninstall: graceful rm before force, volume prune, network cleanup
- Uninstall returns error on partial failure (was 200 OK)

Config consistency:
- DB passwords read from /var/lib/archipelago/secrets/ (was hardcoded)
- Bitcoin: added ZMQ ports 28332/28333 for LND block notifications
- IndeedHub port 7777→8190 (was conflicting with strfry)
- Marketplace versions: LND 0.17.4→0.18.4, Mempool 2.5.0→3.0.0

Performance:
- Metrics collector interval 60s→300s (was duplicating health monitor)
- Podman client: proper error propagation instead of unwrap_or_default

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-03-31 07:03:57 +01:00
Dorian
19dcfd4f31 feat: BIP-39 master seed for unified key derivation
Replace fragmented random key generation with a single 24-word BIP-39
mnemonic that deterministically derives all node keys: Ed25519 (DID),
secp256k1 (Nostr/Bitcoin), BIP-84 xprv (Bitcoin Core), and LND aezeed
entropy. New onboarding flow: seed generate → word verification → identity
naming. Restore path enabled via 24-word entry. Includes seed RPC handlers,
mock backend support, LND/Bitcoin Core wallet-from-seed integration, and
UI polish across settings and discover views.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-03-31 01:41:24 +01:00
Dorian
46c50961c2 feat: TASK-49 container reliability — tests, orchestration, MASTER_PLAN
- Add orchestration_tests.rs + mock_podman.rs (container unit tests)
- Add container-tests.yml CI workflow
- Add dev-container-test.sh for local testing
- MASTER_PLAN.md: add TASK-49 (P0) with 6-phase plan
- Login.vue: minor fixes from user testing
- AppCard.vue: enter key handler fix

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-03-29 17:15:56 +01:00
Dorian
94f2de4a64 refactor: centralize constants, eliminate unwraps, remove dead code, resolve TODOs
- R13+R16: Replace .expect() with .context()? in main.rs and identity.rs
- R17+R18+R19: Fix unwrap() calls in helpers and js-engine
- R20+R21: Remove #[allow(dead_code)] annotations and delete truly dead code
- R22-R26: Create constants.rs module, replace 21 hardcoded values across 12 files
- R28+R29: LND/DWN timeouts already present — verified
- R30-R33: Remove TODO comments, implement marketplace payment check

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-03-21 01:54:35 +00:00
Dorian
2443ae6bba refactor: replace blocking std::fs and TCP I/O with async tokio equivalents
- R6: Convert 6 std::fs calls in session.rs to tokio::fs async
- R7: Convert std::fs::read_to_string in docker_packages.rs to async
- R8: Convert 3 std::fs calls in port_allocator.rs to async, switch to tokio::sync::Mutex
- R9+R10+R11: Fix blocking I/O in node_message.rs and nostr_discovery.rs
- R12: Convert electrs_status.rs from sync TCP to async tokio::net with 5s timeouts
- R4+R5: Spawn periodic cleanup tasks for endpoint and login rate limiters

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-03-21 01:21:08 +00:00
Dorian
84a56c80de security+feat: v1.3.0 — pentest remediation, container reliability, UI overhaul
Security (33 pentest findings addressed):
- CRITICAL: backend binds 127.0.0.1, path traversal in tor.rs/dwn fixed
- HIGH: federation requires signatures, XSS login redirect, RBAC viewer restricted
- HIGH: tar slip prevention, S3 SSRF validation, backup ID validation
- MEDIUM: remember-me random secret, TOTP session rotation, password re-auth
- LOW: CSP unsafe-inline removed, CORS dev-only, onion/webhook validation

Container reliability:
- Memory limits on all 37 containers (OOM prevention)
- Exited vs stopped state distinction with health-aware status badges
- Crash recovery coordination (no more restart cascade)
- User-stopped tracking survives reboots
- Tiered boot recovery (databases → core → services → apps)

UI:
- Wallet TransactionsModal, health-aware app status badges
- Restart button on containers, exited/crashed red state
- Mesh view overhaul, glass button updates, BaseModal/ToggleSwitch
- Apps sticky header removed, dev faucet, mutable mock wallet

Infrastructure:
- LND REST port 8080 exposed over Tor (LND Connect fix)
- Nginx cookie_session fix, deploy script Tor config updated
- Dev environment: podman auto-start, boot mode simulation

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-03-19 12:44:31 +00:00
Dorian
5e7a9dfa68 fix(TASK-26): Rename fedimintd to "Fedimint Guardian"
Added fedimintd to the metadata map with title "Fedimint Guardian"
and description clarifying it's the federation consensus node.
Shares the fedimint.png icon.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-03-18 17:56:45 +00:00
Dorian
c4d447c22d fix: import PodmanClient for lan_address_for fallback
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-03-18 16:35:12 +00:00
Dorian
ccffaa3562 fix: use PodmanClient::lan_address_for as static fallback for port mapping
Dynamic port extraction from container bindings, falling back to the
static PodmanClient address map for apps without port bindings (e.g.
host-network containers).

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-03-18 16:32:39 +00:00
Dorian
d05ed33528 fix: dynamic port detection + electrumx sync + rootless infra
Backend:
- Remove most hardcoded port overrides from docker_packages.rs, use
  dynamic port extraction from actual container bindings with fallback
  to static map in PodmanClient
- Fix OnlyOffice (8044), NginxPM (8181), Fedimint (8174) port mappings
- Remove Tailscale fake web UI port (no web UI)
- ElectrumX: detect "Connection reset" as syncing state (not error)

Deploy script:
- Auto-configure sysctl unprivileged_port_start=80 for rootless
- Auto-enable loginctl linger for container persistence
- Auto-enable podman.socket for Portainer

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-03-18 16:29:03 +00:00
Dorian
208bb608f3 fix: container state mapping + marketplace install aliases
- Created containers now show as "stopped" not "starting" (fixes
  ollama/tailscale perpetual "starting" state)
- Comprehensive INSTALLED_ALIASES map: fedimint, electrumx, grafana,
  jellyfin, vaultwarden, searxng, homeassistant, photoprism, lnd,
  filebrowser, tailscale, ollama — prevents marketplace showing
  "Install" for already-installed containers

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-03-18 15:18:43 +00:00
Dorian
d37ec1dea5 feat: v1.2.0-alpha — E2E encrypted mesh relay, steganography, relay status polling
Phase 5 mesh networking:
- E2E encrypted TX relay (X25519 + ChaCha20-Poly1305) — non-Archy nodes
  relay encrypted blobs transparently via Meshcore native routing
- Steganographic encoding modes (WeatherStation, SensorNetwork) — traffic
  looks like sensor data on the wire, 0xAA marker, configurable per-node
- Pre-flight Bitcoin Core health check on relay node — specific error codes
  (bitcoin_unreachable, bitcoin_syncing, tx_rejected) instead of generic fails
- mesh.relay-status RPC endpoint — frontend polls for relay result every 3s
- On-Chain / Lightning tabs in Off-Grid Bitcoin panel
- Archy Peers vs Mesh Broadcast relay mode selector
- Mesh view fills viewport (no page scroll), internal panel scrolling
- Version bump to 1.2.0-alpha

Also includes: deploy hardening, container fixes, IndeedHub updates,
boot screen, dashboard improvements, MASTER_PLAN task tracking

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-03-17 23:56:37 +00:00
Dorian
367b483a72 feat: bitcoin-ui CSS fix, HTTPS proxy support, deploy script improvements
Bitcoin UI:
- Replace cdn.tailwindcss.com with locally bundled tailwind.css (CSP blocks external scripts)
- Make all asset paths relative for nginx proxy compatibility
- Add bitcoin-ui build/deploy to deploy-to-target.sh (was missing entirely)
- Use --network host (bitcoin-ui proxies Bitcoin RPC at 127.0.0.1:8332)

HTTPS mixed content fix:
- Add HTTPS_PROXY_PATHS in AppSession.vue — when parent page is HTTPS,
  iframe loads through nginx proxy instead of direct HTTP port
- Prevents browser blocking HTTP iframes inside HTTPS pages
- All Tailscale servers use HTTPS, this was breaking all app iframes

Deploy & first-boot improvements:
- first-boot-containers.sh auto-detects disk size for pruning vs txindex
- first-boot-containers.sh checks fallback source path for UI containers
- Added mempool-electrs to APP_PORTS mapping
- ElectrumX container creation in first-boot
- Podman doctor/fix/uptime skills added

Also includes: session persistence, identity management, LND transactions,
ElectrumX status UI, nostr-provider improvements, Web5 enhancements

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-03-16 12:58:35 +00:00
Dorian
4149337c1d fix: correct IndeedHub port mapping from 8190 to 7777
Backend metadata and manifest now match the actual running config
and the frontend port mapping.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-03-15 04:28:18 +00:00
Dorian
fdf9415786 fix: consolidate IndeedHub icon to indeedhub.png and fix all references
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-03-15 04:01:58 +00:00
Dorian
a42e922000 fix: correct PhotoPrism icon filename typo in backend metadata
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-03-15 04:01:12 +00:00
Dorian
b786f68e7a bug fixes from sxsw 2026-03-14 17:12:41 +00:00
Dorian
c8d1049eb6 feat: add Monero and Liquid Network container support
- AppMetadata for monerod/monero and elementsd/liquid in docker_packages
- Marketplace entries with pinned images from trusted registries
- Monero: sethforprivacy/simple-monerod:v0.18.3.4
- Liquid: vulpemventures/elements:23.2.2

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-03-14 05:53:41 +00:00
Dorian
d538ad0581 fix: restore get_app_tier function signature
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-03-14 03:39:17 +00:00
Dorian
74d205dfe0 fix: remove duplicate tier fields in AppMetadata
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-03-14 03:37:51 +00:00
Dorian
d71bc2a46c fix: add missing tier field to all AppMetadata, fix build errors
- Add tier: "" to all AppMetadata match arms (was missing from 30+ arms)
- Use std:🧵:available_parallelism() instead of num_cpus crate
- Remove unused num_cpus dependency
- Fix unused variable warning in health_monitor.rs

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-03-14 03:36:44 +00:00
Dorian
224a0db76c feat: add app tier system — core/recommended/optional (SCALE-02, SCALE-03)
get_app_tier() classifies all apps:
- core: Bitcoin, LND, Electrs, Mempool, BTCPay, DWN, FileBrowser
- recommended: Fedimint, Grafana, Vaultwarden, Kuma, SearXNG, etc.
- optional: everything else

Tier field added to Manifest struct (data_model.rs) and exposed
via WebSocket package data for frontend tier badges.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-03-14 03:27:51 +00:00