620 Commits

Author SHA1 Message Date
archipelago
f0bd49d03d fix(apps): repair netbird install and app icons 2026-05-19 17:20:32 -04:00
archipelago
dd8a6cd9d7 chore: release v1.7.70-alpha 2026-05-19 16:10:43 -04:00
archipelago
ab96c97cb9 fix(apps): self-host netbird and stabilize app sessions 2026-05-19 16:02:35 -04:00
archipelago
20bc9f250c chore: release v1.7.69-alpha 2026-05-19 14:39:15 -04:00
archipelago
87be717f40 fix(apps): keep slow installs visible 2026-05-19 14:29:20 -04:00
archipelago
ab27fb97f8 chore: release v1.7.68-alpha 2026-05-19 09:37:47 -04:00
archipelago
d736364ad7 fix(apps): stabilize btcpay and public proxy launch flows 2026-05-19 09:26:43 -04:00
archipelago
b25d41c5c6 chore: release v1.7.67-alpha 2026-05-18 11:54:57 -04:00
archipelago
32902d3891 fix(ui): stabilize system status metrics 2026-05-18 11:47:12 -04:00
archipelago
6240064acf chore: release v1.7.66-alpha 2026-05-18 10:15:56 -04:00
archipelago
ec36ac7e2c chore: release v1.7.65-alpha 2026-05-18 09:31:41 -04:00
archipelago
76288f541e chore: release v1.7.64-alpha 2026-05-17 23:24:39 -04:00
archipelago
8191d92bed chore: release v1.7.63-alpha 2026-05-17 23:03:06 -04:00
archipelago
d91b858d9b chore: release v1.7.62-alpha 2026-05-17 22:40:36 -04:00
archipelago
a992abcd06 chore: release v1.7.61-alpha 2026-05-17 22:13:21 -04:00
archipelago
4d6b4f76af chore: release v1.7.60-alpha 2026-05-17 20:45:56 -04:00
archipelago
0a94c0097f chore: release v1.7.59-alpha 2026-05-17 19:44:54 -04:00
archipelago
413d50116e fix(apps): restore mobile and website launching 2026-05-17 19:22:18 -04:00
archipelago
e05e356d64 chore: release v1.7.58-alpha 2026-05-17 18:40:50 -04:00
archipelago
7804223152 chore: release v1.7.57-alpha 2026-05-17 17:30:04 -04:00
archipelago
30505f41ff chore(release): refresh v1.7.56-alpha notes and artifacts 2026-05-15 17:54:32 -04:00
Dorian
5818541721 chore: release v1.7.56-alpha 2026-05-14 09:13:58 -04:00
Dorian
f95e9a1cd0 fix: quote quadlet environment values 2026-05-14 01:15:22 -04:00
Dorian
2ff47f88a7 fix: harden container reconcile and launch behavior 2026-05-13 22:59:55 -04:00
Dorian
835c525218 chore(release): stage v1.7.55-alpha 2026-05-13 15:09:22 -04:00
archipelago
c0751e2551 chore(release): stage v1.7.54-alpha 2026-05-06 09:23:57 -04:00
archipelago
1a0d8a432c chore(release): stage v1.7.53-alpha 2026-05-05 13:59:50 -04:00
archipelago
745cb1c626 chore(release): stage v1.7.52-alpha 2026-05-05 11:29:18 -04:00
archipelago
05e6c2e738 fix: release v1.7.51-alpha install hardening 2026-05-01 05:02:39 -04:00
archipelago
be9f9528c3 fix: release v1.7.50-alpha OTA runtime repair 2026-05-01 03:14:07 -04:00
archipelago
7ab788d178 chore: release v1.7.49-alpha 2026-04-30 16:37:54 -04:00
archipelago
f507b847ef chore: release v1.7.48-alpha
Hotfix: archipelago.service ExecStartPre now mkdirs /run/containers and
/var/lib/containers before the unit's mount-namespace setup tries to bind
them. Without this, fresh nodes that don't have /run/containers (e.g.
nodes provisioned without a prior podman session) fail at the namespace
step with:

  Failed to set up mount namespacing: /run/containers: No such file or directory
  Failed at step NAMESPACE spawning /bin/bash: No such file or directory

Existing nodes don't pick up systemd unit changes via OTA — they need a
one-time `systemctl edit archipelago` adding the same mkdir. ISO installs
from this version forward have the fix baked in.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-04-29 16:27:22 -04:00
archipelago
8a2899ab4a chore: release v1.7.47-alpha
Sync-perf tuning for bitcoin/bitcoin-core/bitcoin-knots/electrumx.

- Drop the --cpus=2 cap on bitcoin/electrumx variants. Script verification
  is parallelizable; the cap halved IBD speed on 4-8 core machines.
- Bump bitcoin --memory 4g→8g so dbcache=4096 has headroom for mempool +
  connection buffers + I/O. 4g was OOM-prone during heavy IBD.
- Bump electrumx --memory 1g→2g + add CACHE_MB=2048 + MAX_SEND=10MB.
- bitcoin-core CLI args gain -dbcache=4096 -par=0 -maxconnections=125.
- bitcoin-knots manifest matched (1024MB pruned / 4096MB full + par=0).

Future v2: host-RAM-aware dbcache scaling.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-04-29 15:47:51 -04:00
archipelago
992b673b20 chore: release v1.7.46-alpha
Follow-up to v1.7.45-alpha closing the remaining tasks identified by the
resilience sweeps + the new bitcoin orphan / install-fail-vanish bugs.

User-visible:
- Health monitor: stop paging on orphaned containers from variant switches
- Install fail: card stays visible (was vanishing) with error message
- Stack pull progress: interpolate 20→70% (was stuck at 20%)
- docker.io → lfg2025 mirror: bitcoin/gitea/nextcloud/valkey

Internal:
- Resilience harness — install-wait uses expected_containers_for, ui+auth
  probes retry with 60s backoff, dep-snapshot fix
- InstallProgress gains optional `message` field (frontend renders it
  when phase is None)

binary  $(stat -c %s releases/v1.7.46-alpha/archipelago)  sha256:$(sha256sum releases/v1.7.46-alpha/archipelago | awk '{print $1}')
tarball $(stat -c %s releases/v1.7.46-alpha/archipelago-frontend-1.7.46-alpha.tar.gz)  sha256:$(sha256sum releases/v1.7.46-alpha/archipelago-frontend-1.7.46-alpha.tar.gz | awk '{print $1}')

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-04-29 14:50:33 -04:00
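A minimal sketch of the 20→70% stack-pull interpolation mentioned in the v1.7.46-alpha notes above. The commit does not say what drives the interpolation; this assumes it is the number of images pulled so far out of the stack total. The function name `pull_progress` and its parameters are hypothetical.

```rust
/// Map "images pulled so far" onto the 20..=70 band of the progress bar.
/// ASSUMPTION: the real code may interpolate on something else (time, bytes).
pub fn pull_progress(images_done: u32, images_total: u32) -> u8 {
    const START: u32 = 20; // percent shown when the pull phase begins
    const END: u32 = 70; // percent shown when container creation begins

    if images_total == 0 {
        // Nothing to pull: stay at the start of the band.
        return START as u8;
    }
    // Clamp so a miscounted caller can never push the bar past 70%.
    let done = images_done.min(images_total);
    (START + (END - START) * done / images_total) as u8
}
```

With a four-image stack this moves the bar in 12-13 point steps instead of leaving it parked at 20% until the last image lands.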
archipelago
4ec6ca98c1 chore: release v1.7.45-alpha
Resilience-validated release. Three full sweeps of the new resilience
harness against .228 confirm no shipstoppers.

Big user-visible:
- Bitcoin RPC auth durably correct via host-rendered nginx.conf bind-mount,
  replaces fragile post-start exec that failed under restricted-cap rootless
  podman ("crun: write cgroup.procs: Permission denied")
- Multi-container stack installs (indeedhub, immich, btcpay, mempool) now
  emit phase events at every boundary so the progress bar advances
- Apps no longer vanish from the dashboard mid-install (absent-scanner skips
  packages in transitional states)
- Indeedhub fresh installs work end-to-end (was 8500+ restart loop): five
  missing env vars (DATABASE_PORT, QUEUE_HOST, QUEUE_PORT,
  S3_PRIVATE_BUCKET_NAME, AES_MASTER_SECRET) added to install code
- Tailscale install fixed: --entrypoint string was being passed as a single
  shell-line arg; switched to custom_args array
- Catalog cleaned of broken entries (dwn, endurain, ollama removed; nextcloud
  restored on docker.io)
- Bitcoin Core update path uses correct image (was looking for nonexistent
  lfg2025/bitcoin:28.4)
- ISO installs now allocate swap on the encrypted data partition

Infra:
- New resilience harness (scripts/resilience/) — black-box state-machine
  tester, every app × every transition. Run before each release.

Sweep #3 final: PASS 107 / FAIL 12 / SKIP 14. The 12 fails are 1 cosmetic
(homeassistant trusted_hosts), 8 harness/timing false-positives, and 3
non-shipstopper tracked items. Down from 23 in baseline sweep #1.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-04-29 12:31:45 -04:00
archipelago
dffa7e99bb chore: release v1.7.44-alpha 2026-04-28 15:03:04 -04:00
archipelago
8f83b37d51 feat(orchestrator): complete container migration and release hardening 2026-04-28 15:00:58 -04:00
archipelago
0bd4e49a8c docs(release-notes): v1.7.43-alpha bullet for AIUI preservation fix 2026-04-23 13:22:28 -04:00
archipelago
310c709aba chore(release): bump version to 1.7.43-alpha 2026-04-23 13:21:58 -04:00
archipelago
2572688468 docs(release-notes): v1.7.43-alpha bullets for chunking, avatar, outbox, parser
Four production-code fixes merit user-visible mention: the transport
chunking data-corruption fix (real user-affecting bug for multi-chunk
mesh payloads), the avatar u16 overflow panic (backend crash on certain
seeds), the outbox TTL boundary, and the image-versions parser hardening.
2026-04-23 13:03:49 -04:00
archipelago
c4efb30382 docs(release-notes): v1.7.43-alpha bullet for install-log fix; prune stale RESUME note 2026-04-23 12:04:20 -04:00
archipelago
9f3d66e24e docs(release-notes): v1.7.43-alpha bullet for self-update script refresh
Document that OTA updates now refresh the reconcile helper scripts,
closing the deploy gap that kept fixes to those scripts from
reaching existing nodes.
2026-04-23 11:51:04 -04:00
archipelago
0f1ad47aec docs(release-notes): v1.7.43-alpha bullets for disk-detection and rollback recovery
Add two user-facing release notes for fixes shipped this round:
- Full-archive Bitcoin nodes no longer silently get pruned on reconcile
  because the disk-size check was reading the OS partition.
- Failed updates can now recover via reconcile --create-missing instead
  of leaving a destroyed container behind.
2026-04-23 10:02:32 -04:00
archipelago
353825b66c docs: release-note image-versions fix, add marketplace QA tracker, update RESUME
- AccountInfoSection.vue: append 5th bullet to v1.7.43-alpha entry
  explaining that update-available badges and version comparisons
  work again now that the pinned-image catalog is found at the
  correct deployed path.

- docs/MARKETPLACE-QA.md: new tracker for the upcoming app-by-app
  install walk on .228. Documents the per-app fix workflow, the
  four layers we might need to fix at (app recipe, registry image,
  backend orchestrator, frontend), status-key table for tracking
  each catalog entry, and the release-notes policy for the walk.

- docs/RESUME.md: refresh with a9908597 commit, updated binary md5
  on .228, and split Immediate Next Step into Phase 1 (browser
  verification) and Phase 2 (marketplace walk) with a pointer to
  the new tracker.
2026-04-23 09:32:41 -04:00
archipelago
6c8cb50679 docs(changelog): add v1.7.43-alpha entry covering async lifecycle + .23 retirement
Four release-note bullets describing the user-visible changes shipped
in this round:

- async-spawn install/update/uninstall (UI no longer freezes)
- phase-based install progress bar (Preparing through Finalizing)
- scanner kick post-install (Launch button appears immediately)
- .23 Hetzner VPS retired, .168 OVH promoted to Server 1 with
  auto-purge migration for existing nodes

Matches the tone of existing changelog entries: what changed from the
operator's perspective, not internal implementation detail.
2026-04-23 09:07:29 -04:00
archipelago
d9d5fa65e5 chore: retire .23 VPS mirror, promote .168 OVH to primary
The Hetzner VPS at 23.182.128.160 was decommissioned. Replace it
everywhere with the OVH VPS at 146.59.87.168, which was previously
the tertiary mirror.

  - update.rs: drop DEFAULT_TERTIARY_MIRROR_URL, promote .168 into
    the secondary slot as "Server 1 (OVH)"; tx1138 becomes Server 2.
    Default mirror list shrinks from 3 to 2.
  - container/registry.rs: default RegistryConfig drops .23, promotes
    .168 to Server 1 / priority 0, tx1138 stays Server 2 / priority 10.
  - api/rpc/package/config.rs: trusted-registry allowlist swaps .23
    for .168.
  - api/handler/mod.rs: app-catalog fallback URL uses .168.
  - neode-ui/views/marketplace/marketplaceData.ts: REGISTRY uses .168.
  - scripts/image-versions.sh: ARCHY_REGISTRY_FALLBACK uses .168.
  - image-recipe/build-auto-installer-iso.sh: installer ISO registries
    use .168 (both podman registries.conf and backend registries.json).

Tests updated to assert on the new 2-entry default lists (registry +
mirror). URL-parser fixture tests in update.rs retain .23 strings —
they exercise string-parsing logic, not mirror policy.

Git remotes: dropped `gitea-vps` and the .23 push URL on the `origin`
multi-push alias (not part of this commit — pure working-copy change).
2026-04-23 08:22:32 -04:00
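An illustrative sketch of the two-entry default registry list described in the commit above, assuming a lower `priority` value means "try first". The struct, field, and function names are hypothetical and are not taken from container/registry.rs. Hosts and ports are left out on purpose because the commit does not give the tx1138 address.

```rust
/// One registry/mirror entry (hypothetical shape).
#[derive(Debug, Clone, PartialEq)]
pub struct RegistryEntry {
    pub label: &'static str,
    pub priority: u32,
}

/// The post-retirement defaults: .168 at priority 0, tx1138 at priority 10.
/// Deliberately listed out of order to show that sorting does the work.
pub fn default_registries() -> Vec<RegistryEntry> {
    vec![
        RegistryEntry { label: "tx1138 (Server 2)", priority: 10 },
        RegistryEntry { label: ".168 OVH (Server 1)", priority: 0 },
    ]
}

/// Order entries so the lowest priority number comes first (assumed policy).
pub fn by_priority(mut entries: Vec<RegistryEntry>) -> Vec<RegistryEntry> {
    entries.sort_by_key(|e| e.priority);
    entries
}

/// Walk the ordered list and return the first entry the probe accepts.
pub fn first_reachable<F>(entries: &[RegistryEntry], mut probe: F) -> Option<&RegistryEntry>
where
    F: FnMut(&RegistryEntry) -> bool,
{
    entries.iter().find(|e| probe(*e))
}
```

A test asserting on the "new 2-entry default list" would then check both the length and that the first entry after sorting is the priority-0 one.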
archipelago
7e62ea07f7 feat(install): phase-based progress bar replaces unparseable pull bytes
Podman emits zero parseable progress when stderr is piped (no TTY), so
the old byte-counter regex never matched in real installs. Users saw
0% for the whole pull, then a jump to 95%, then silence through
create-container, health-check, and post-install hooks.

Replace with 7 explicit lifecycle phases wired through install.rs and
update.rs: Preparing (5%), PullingImage (20%), CreatingContainer (70%),
StartingContainer (80%), WaitingHealthy (88%), PostInstall (95%),
Done (100%). Each maps to a fixed UI progress and status message.

Frontend PHASE_INFO mapper in stores/server.ts prioritizes phase when
present, falls back to byte-counter for legacy. A Math.max forward-only
guard ensures the bar never regresses. Deleted the duplicate watcher
in Discover.vue that was fighting the store's watcher with stale byte
logic. Added shimmer CSS on the fill (with prefers-reduced-motion
opt-out) so the bar looks alive during long phases.
2026-04-23 07:58:43 -04:00
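A sketch of the phase-to-percent mapping and the forward-only guard described in the commit above. The phase names and percentages come from the commit message; the type names and the `ProgressBar` helper are hypothetical, and the real guard lives in the TypeScript store (a `Math.max`), not in Rust.

```rust
/// The seven lifecycle phases named in the commit message.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum InstallPhase {
    Preparing,
    PullingImage,
    CreatingContainer,
    StartingContainer,
    WaitingHealthy,
    PostInstall,
    Done,
}

impl InstallPhase {
    /// Fixed UI progress for each phase, as listed in the commit message.
    pub fn percent(self) -> u8 {
        match self {
            InstallPhase::Preparing => 5,
            InstallPhase::PullingImage => 20,
            InstallPhase::CreatingContainer => 70,
            InstallPhase::StartingContainer => 80,
            InstallPhase::WaitingHealthy => 88,
            InstallPhase::PostInstall => 95,
            InstallPhase::Done => 100,
        }
    }
}

/// Hypothetical stand-in for the frontend's forward-only guard.
#[derive(Debug, Default)]
pub struct ProgressBar {
    shown: u8,
}

impl ProgressBar {
    /// Apply a phase event; a late or out-of-order event never moves the bar back.
    pub fn update(&mut self, phase: InstallPhase) -> u8 {
        self.shown = self.shown.max(phase.percent());
        self.shown
    }
}
```

Because every phase maps to a constant, the bar no longer depends on parsing podman's output, which is the failure the commit set out to remove.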
archipelago
702b5d64d3 fix(ui): shorten install/uninstall/update timeouts for async RPCs
With the backend flipped to async-spawn, install/uninstall/update return
immediately with a { status, package_id } envelope. Client timeouts of
45m/11m were a leftover from synchronous handlers and masked real RPC
failures.

Drop all install/uninstall/update RPC timeouts to 15s. Progress and
terminal state still arrive through the live state stream — the RPC
only needs to confirm the spawn was accepted.

Return-type annotations updated in rpc-client.ts and stores/server.ts.
Five direct rpcClient.call sites across Marketplace.vue, Discover.vue,
and MarketplaceAppDetails.vue updated with the shorter timeout.
2026-04-23 06:58:02 -04:00
archipelago
a8158b1ef5 fix(ui): single-button lifecycle control with transitional labels
The app card and details view previously used a pair of Start/Stop
buttons whose labels were driven off isAppLoading(), a client-side
"I just clicked the button" flag. When the backend's graceful stop
took longer than the RPC round-trip (up to 600s on bitcoin-core),
the flag cleared while the container was still shutting down, the
UI flipped back to "Running" as soon as the next 10s scan saw the
still-alive container, and the user had no indication the stop was
still in flight.

Now that the backend flips PackageState to Stopping / Starting /
Restarting / Installing / Updating / Removing for the duration of
each lifecycle operation and the scan loop preserves those states,
the UI can drive its label off the container state itself. A single
full-width primary button replaces the Start/Stop pair. Its label,
color, and disabled state come from getAppVisualState(), which
collapses resting states (exited/created/paused/installed) into
"stopped" and passes transitional states through untouched.

Changes:

- container-client.ts: widen ContainerStatus.state union to include
  the six transitional variants plus "installed". Add
  restartContainer() calling the new container-restart RPC.
- stores/container.ts: add getAppVisualState() computed and the
  restartContainer() action.
- ContainerApps.vue: single primary button (Start / Stop / Starting
  / Stopping / Restarting etc.) plus a separate circular Restart
  button visible only when running. Critically, handleStartApp and
  handleStopApp now route through store.startContainer and
  stopContainer (which call container-start / container-stop, the
  async RPCs) instead of the legacy synchronous bundled-app-start /
  bundled-app-stop path. Transitional-state polling widened from
  just "created" to the full set of transitional variants.
- ContainerAppDetails.vue: same single-button pattern, Restart
  button now calls container-restart instead of the old
  stop-sleep-start sequence, added 2s polling interval for
  transitional states.
- components/ContainerStatus.vue: widen state prop to match the
  shared union, render transitional labels with a trailing ellipsis
  and a yellow dot.

No new tests — this is presentation logic. Manual verification on
.228 will confirm the end-to-end async path: click Stop on LND,
button becomes "Stopping" in under a second, stays that way for
roughly 5 minutes, then flips to "Start" with a grey dot. The UI
must never revert to "Running" mid-stop.
2026-04-23 05:20:15 -04:00
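A rough sketch of the state-collapsing rule that the commit above attributes to getAppVisualState(): resting states fold into "stopped" and everything else passes through. The real function is a computed value in the TypeScript store; this Rust version, the lowercase state strings, and the `button_label` helper are assumptions made for illustration.

```rust
/// Collapse resting container states into "stopped"; pass anything else
/// (running, stopping, starting, restarting, ...) through unchanged.
pub fn visual_state(state: &str) -> &str {
    match state {
        "exited" | "created" | "paused" | "installed" => "stopped",
        other => other,
    }
}

/// Hypothetical label for the single primary button.
/// Stopped offers "Start", running offers "Stop", and a transitional
/// state shows its own name capitalized (for example "Stopping").
pub fn button_label(state: &str) -> String {
    match visual_state(state) {
        "stopped" => "Start".to_string(),
        "running" => "Stop".to_string(),
        other => {
            let mut chars = other.chars();
            match chars.next() {
                Some(first) => first.to_uppercase().collect::<String>() + chars.as_str(),
                None => String::new(),
            }
        }
    }
}
```

Driving the label from the reported state rather than a click flag is what keeps the button on "Stopping" for the whole graceful shutdown.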
archipelago
0ac673deb4 release(v1.7.42-alpha): bitcoin RPC retry wrapper so syncing nodes stop flashing red
Closes a failure mode adjacent to FM3 (docs/bulletproof-containers.md): on
a syncing pruned node, bitcoind's RPC thread blocks for 5-10s during block
validation. The old 10s client-side timeout was rejecting roughly 30% of
UI calls even though the node was perfectly healthy. 20x stress test on
the live .116 node (caught in IBD catch-up at block 797k) used to drop
10 of 20 calls; now drops 0 of 20.

What changed:
- core/archipelago/src/api/rpc/bitcoin.rs: bitcoin_rpc_call now makes up
  to 3 attempts with 500ms and 1500ms backoffs between them. Only
  transient transport errors (timeout, connect refused, send/recv IO)
  trigger retry. A well-formed bitcoind error response is surfaced
  immediately - real RPC bugs are never masked.
- Per-attempt hard deadline (tokio::time::timeout, 15s) layered on top
  of reqwest's own timeout, so DNS starvation or TLS wedging can't
  steal the entire retry budget.
- handle_bitcoin_getinfo client builder gained a 3s connect_timeout
  so a dead bitcoind is fast-failed inside the first attempt instead
  of eating the whole 15s.
- Retry policy extracted into a RetryConfig struct so tests can dial
  down timeouts to ~100ms per attempt. Production defaults live in
  RetryConfig::production().

Not changed (tracked as follow-up):
- mesh/mod.rs bitcoin_rpc_getblockcount and related helpers use the
  same 10s-timeout pattern. Not migrated to the new wrapper in this
  release; scheduled for v1.7.43 alongside the render_bitcoin_conf
  work.
- lnd/info.rs and electrs_status have similar 10s/15s timeouts but
  different failure profiles - audit first, migrate only the ones
  that actually exhibit the bug.

Tests: 6 new unit tests under api::rpc::bitcoin::tests, all passing.
Uses an in-process hyper server (already a transitive dep) to simulate
bitcoind responses; no new crates required.
  - happy_path_first_attempt: no retry when first attempt succeeds
  - retries_on_timeout_then_succeeds: first attempt times out, second
    succeeds, returns OK (uses a short-timeout RetryConfig so the test
    runs in <1s instead of 15s)
  - retries_exhausted_on_persistent_connect_refused: all attempts fail
    against a closed port, error bubbles up, elapsed time confirms
    backoffs actually ran
  - does_not_retry_on_rpc_level_error: bitcoind-returned error body is
    surfaced immediately, no retry
  - does_not_retry_parse_errors: non-JSON response (e.g. 503 with html
    body) is NOT retried - guards against the tempting "retry all
    non-2xx" mistake that would mask real bitcoind misconfig
  - retry_budget_invariants: asserts total wall-time ceiling stays
    under 60s so a bumped constant can't silently hang a UI call
    forever

Validated live on .116: 20/20 bitcoin.getinfo calls succeed during IBD
catch-up (chain at block 797419 -> 797464), vs ~40% baseline under the
old 10s timeout. Worst-case latency was 48.9s during peak validation;
happy-path latency (cached result) remains 28-77ms.
2026-04-22 16:46:28 -04:00
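A simplified sketch of the retry policy described in the commit above: up to three attempts, 500ms and 1500ms backoffs, and retries only for transient transport errors. This is not the code in bitcoin.rs. It is synchronous, uses `std::thread::sleep` instead of tokio, and does not model the 15s per-attempt deadline. `CallError`, `call_with_retry`, and the field layout of `RetryConfig` are hypothetical; only the name `RetryConfig::production()` appears in the commit message.

```rust
use std::thread::sleep;
use std::time::Duration;

/// How one attempt failed (hypothetical classification).
#[derive(Debug, PartialEq)]
pub enum CallError {
    /// Timeout, connection refused, send/recv IO: worth another attempt.
    Transient(String),
    /// Well-formed RPC error or unparseable body: surface immediately.
    Fatal(String),
}

/// Retry policy; the number of attempts is `backoffs.len() + 1`.
pub struct RetryConfig {
    pub backoffs: Vec<Duration>,
}

impl RetryConfig {
    /// Values quoted in the commit message: 3 attempts, 500ms then 1500ms.
    pub fn production() -> Self {
        Self {
            backoffs: vec![Duration::from_millis(500), Duration::from_millis(1500)],
        }
    }
}

/// Run `attempt` until it succeeds, fails fatally, or the budget is spent.
/// The closure receives the zero-based attempt index.
pub fn call_with_retry<T, F>(cfg: &RetryConfig, mut attempt: F) -> Result<T, CallError>
where
    F: FnMut(usize) -> Result<T, CallError>,
{
    let max_attempts = cfg.backoffs.len() + 1;
    let mut last = CallError::Transient("no attempt made".to_string());
    for n in 0..max_attempts {
        match attempt(n) {
            Ok(value) => return Ok(value),
            // Never retry a real error response: that would mask misconfiguration.
            Err(CallError::Fatal(msg)) => return Err(CallError::Fatal(msg)),
            Err(transient) => last = transient,
        }
        // Sleep only between attempts, not after the final one.
        if let Some(delay) = cfg.backoffs.get(n) {
            sleep(*delay);
        }
    }
    Err(last)
}
```

Keeping the policy in a struct is what lets tests shrink the backoffs to a few milliseconds, as the commit notes for its own short-timeout test config.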