83 Commits

Author SHA1 Message Date
archipelago
bb808df89a chore: release v1.7.90-alpha 2026-06-13 05:05:14 -04:00
archipelago
c49e8fcacd fix: harden OTA updates, AIUI desktop gap, LND no-proxy
- update.rs: post-OTA probe falls back to http://127.0.0.1/ on connect
  error (nginx binds :80, not :443) so good updates are no longer rolled
  back; recover stuck update_in_progress; avoid ETXTBSY on running binary
- LND: REST client bypasses proxy, GET newaddress p2wkh, wallet
  readiness/unlock after restart
- Dashboard.vue: chat route back to plain h-full (desktop bottom-gap fix)
- vite.config.ts: dev-only /aiui proxy
- tests/release/run.sh: release gate harness (static+frontend+backend)
- CHANGELOG: v1.7.89-alpha notes

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
2026-06-13 01:23:32 -04:00
archipelago
b8ac68d844 fix: restore aiui and bitcoin receive before release 2026-06-12 05:10:03 -04:00
archipelago
6fd1cf9ba7 chore: release v1.7.87-alpha 2026-06-12 04:49:58 -04:00
archipelago
b11c6c17d1 chore: release v1.7.86-alpha 2026-06-12 04:21:18 -04:00
archipelago
00c32688f8 chore: release v1.7.85-alpha 2026-06-12 03:14:59 -04:00
archipelago
6a30ff11bd chore: release v1.7.84-alpha 2026-06-11 04:44:58 -04:00
archipelago
22df3f8f5f chore: release v1.7.83-alpha 2026-06-11 03:03:32 -04:00
archipelago
136eda16c9 chore: release v1.7.82-alpha 2026-05-22 17:19:45 -04:00
archipelago
853d51ae14 chore: release v1.7.81-alpha 2026-05-21 21:44:14 -04:00
archipelago
bdd5a2c43e chore: release v1.7.80-alpha 2026-05-21 00:38:57 -04:00
archipelago
7be7420c4f chore: release v1.7.79-alpha 2026-05-20 23:11:54 -04:00
archipelago
e61c757633 chore: release v1.7.78-alpha 2026-05-20 20:53:23 -04:00
archipelago
0898c54765 chore: bump version to v1.7.77-alpha 2026-05-20 00:38:26 -04:00
archipelago
92c58141af fix(apps): stabilize saleor and netbird launch 2026-05-19 21:45:17 -04:00
archipelago
e65e76cd9d chore: release v1.7.75-alpha 2026-05-19 20:19:24 -04:00
archipelago
bd69ef41d5 fix(apps): repair netbird login and iframe focus 2026-05-19 19:21:43 -04:00
archipelago
eeb08fc78f chore: release v1.7.73-alpha 2026-05-19 18:40:10 -04:00
archipelago
3e01e57c8d chore: release v1.7.72-alpha 2026-05-19 17:42:11 -04:00
archipelago
5859ef77e7 chore: release v1.7.71-alpha 2026-05-19 17:30:20 -04:00
archipelago
dd8a6cd9d7 chore: release v1.7.70-alpha 2026-05-19 16:10:43 -04:00
archipelago
20bc9f250c chore: release v1.7.69-alpha 2026-05-19 14:39:15 -04:00
archipelago
ab27fb97f8 chore: release v1.7.68-alpha 2026-05-19 09:37:47 -04:00
archipelago
b25d41c5c6 chore: release v1.7.67-alpha 2026-05-18 11:54:57 -04:00
archipelago
6240064acf chore: release v1.7.66-alpha 2026-05-18 10:15:56 -04:00
archipelago
ec36ac7e2c chore: release v1.7.65-alpha 2026-05-18 09:31:41 -04:00
archipelago
76288f541e chore: release v1.7.64-alpha 2026-05-17 23:24:39 -04:00
archipelago
8191d92bed chore: release v1.7.63-alpha 2026-05-17 23:03:06 -04:00
archipelago
d91b858d9b chore: release v1.7.62-alpha 2026-05-17 22:40:36 -04:00
archipelago
a992abcd06 chore: release v1.7.61-alpha 2026-05-17 22:13:21 -04:00
archipelago
4d6b4f76af chore: release v1.7.60-alpha 2026-05-17 20:45:56 -04:00
archipelago
0a94c0097f chore: release v1.7.59-alpha 2026-05-17 19:44:54 -04:00
archipelago
e05e356d64 chore: release v1.7.58-alpha 2026-05-17 18:40:50 -04:00
archipelago
7804223152 chore: release v1.7.57-alpha 2026-05-17 17:30:04 -04:00
Dorian
5818541721 chore: release v1.7.56-alpha 2026-05-14 09:13:58 -04:00
Dorian
835c525218 chore(release): stage v1.7.55-alpha 2026-05-13 15:09:22 -04:00
archipelago
c0751e2551 chore(release): stage v1.7.54-alpha 2026-05-06 09:23:57 -04:00
archipelago
1a0d8a432c chore(release): stage v1.7.53-alpha 2026-05-05 13:59:50 -04:00
archipelago
745cb1c626 chore(release): stage v1.7.52-alpha 2026-05-05 11:29:18 -04:00
archipelago
05e6c2e738 fix: release v1.7.51-alpha install hardening 2026-05-01 05:02:39 -04:00
archipelago
be9f9528c3 fix: release v1.7.50-alpha OTA runtime repair 2026-05-01 03:14:07 -04:00
archipelago
7ab788d178 chore: release v1.7.49-alpha 2026-04-30 16:37:54 -04:00
archipelago
f507b847ef chore: release v1.7.48-alpha
Hotfix: archipelago.service ExecStartPre now mkdirs /run/containers and
/var/lib/containers before the unit's mount-namespace setup tries to bind
them. Without this, fresh nodes that don't have /run/containers (e.g.
nodes provisioned without a prior podman session) fail at the namespace
step with:

  Failed to set up mount namespacing: /run/containers: No such file or directory
  Failed at step NAMESPACE spawning /bin/bash: No such file or directory

Existing nodes don't pick up systemd unit changes via OTA — they need a
one-time `systemctl edit archipelago` adding the same mkdir. ISO installs
from this version forward have the fix baked in.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-04-29 16:27:22 -04:00
archipelago
8a2899ab4a chore: release v1.7.47-alpha
Sync-perf tuning for bitcoin/bitcoin-core/bitcoin-knots/electrumx.

- Drop the --cpus=2 cap on bitcoin/electrumx variants. Script verification
  is parallelizable; the cap halved IBD speed on 4-8 core machines.
- Bump bitcoin --memory 4g→8g so dbcache=4096 has headroom for mempool +
  connection buffers + I/O. 4g was OOM-prone during heavy IBD.
- Bump electrumx --memory 1g→2g + add CACHE_MB=2048 + MAX_SEND=10MB.
- bitcoin-core CLI args gain -dbcache=4096 -par=0 -maxconnections=125.
- bitcoin-knots manifest matched (1024MB pruned / 4096MB full + par=0).

Future v2: host-RAM-aware dbcache scaling.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-04-29 15:47:51 -04:00
archipelago
992b673b20 chore: release v1.7.46-alpha
Follow-up to v1.7.45-alpha closing the remaining tasks identified by the
resilience sweeps + the new bitcoin orphan / install-fail-vanish bugs.

User-visible:
- Health monitor: stop paging on orphaned containers from variant switches
- Install fail: card stays visible (was vanishing) with error message
- Stack pull progress: interpolate 20→70% (was stuck at 20%)
- docker.io → lfg2025 mirror: bitcoin/gitea/nextcloud/valkey

Internal:
- Resilience harness — install-wait uses expected_containers_for, ui+auth
  probes retry with 60s backoff, dep-snapshot fix
- InstallProgress gains optional `message` field (frontend renders it
  when phase is None)

binary  $(stat -c %s releases/v1.7.46-alpha/archipelago)  sha256:$(sha256sum releases/v1.7.46-alpha/archipelago | awk '{print $1}')
tarball $(stat -c %s releases/v1.7.46-alpha/archipelago-frontend-1.7.46-alpha.tar.gz)  sha256:$(sha256sum releases/v1.7.46-alpha/archipelago-frontend-1.7.46-alpha.tar.gz | awk '{print $1}')

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-04-29 14:50:33 -04:00
archipelago
4ec6ca98c1 chore: release v1.7.45-alpha
Resilience-validated release. Three full sweeps of the new resilience
harness against .228 confirm no shipstoppers.

Big user-visible:
- Bitcoin RPC auth durably correct via host-rendered nginx.conf bind-mount,
  replaces fragile post-start exec that failed under restricted-cap rootless
  podman ("crun: write cgroup.procs: Permission denied")
- Multi-container stack installs (indeedhub, immich, btcpay, mempool) now
  emit phase events at every boundary so the progress bar advances
- Apps no longer vanish from the dashboard mid-install (absent-scanner skips
  packages in transitional states)
- Indeedhub fresh installs work end-to-end (was 8500+ restart loop): five
  missing env vars (DATABASE_PORT, QUEUE_HOST, QUEUE_PORT,
  S3_PRIVATE_BUCKET_NAME, AES_MASTER_SECRET) added to install code
- Tailscale install fixed: --entrypoint string was being passed as a single
  shell-line arg; switched to custom_args array
- Catalog cleaned of broken entries (dwn, endurain, ollama removed; nextcloud
  restored on docker.io)
- Bitcoin Core update path uses correct image (was looking for nonexistent
  lfg2025/bitcoin:28.4)
- ISO installs now allocate swap on the encrypted data partition

Infra:
- New resilience harness (scripts/resilience/) — black-box state-machine
  tester, every app × every transition. Run before each release.

Sweep #3 final: PASS 107 / FAIL 12 / SKIP 14. The 12 fails are 1 cosmetic
(homeassistant trusted_hosts), 8 harness/timing false-positives, and 3
non-shipstopper tracked items. Down from 23 in baseline sweep #1.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-04-29 12:31:45 -04:00
archipelago
dffa7e99bb chore: release v1.7.44-alpha 2026-04-28 15:03:04 -04:00
archipelago
310c709aba chore(release): bump version to 1.7.43-alpha 2026-04-23 13:21:58 -04:00
archipelago
0ac673deb4 release(v1.7.42-alpha): bitcoin RPC retry wrapper so syncing nodes stop flashing red
Closes failure mode adjacent to FM3 (docs/bulletproof-containers.md): on
a syncing pruned node, bitcoind's RPC thread blocks for 5-10s during block
validation. The old 10s client-side timeout was rejecting roughly 30% of
UI calls even though the node was perfectly healthy. 20x stress test on
the live .116 node (caught in IBD catch-up at block 797k) used to drop
10 of 20 calls; now drops 0 of 20.

What changed:
- core/archipelago/src/api/rpc/bitcoin.rs: bitcoin_rpc_call now retries up
  to 3 times with 500ms and 1500ms backoffs between attempts. Only
  transient transport errors (timeout, connect refused, send/recv IO)
  trigger retry. A well-formed bitcoind error response is surfaced
  immediately - real RPC bugs are never masked.
- Per-attempt hard deadline (tokio::time::timeout, 15s) layered on top
  of reqwest's own timeout, so DNS starvation or TLS wedging can't
  steal the entire retry budget.
- handle_bitcoin_getinfo client builder gained a 3s connect_timeout
  so a dead bitcoind is fast-failed inside the first attempt instead
  of eating the whole 15s.
- Retry policy extracted into a RetryConfig struct so tests can dial
  down timeouts to ~100ms per attempt. Production defaults live in
  RetryConfig::production().

Not changed (tracked as follow-up):
- mesh/mod.rs bitcoin_rpc_getblockcount and related helpers use the
  same 10s-timeout pattern. Not migrated to the new wrapper in this
  release; scheduled for v1.7.43 alongside the render_bitcoin_conf
  work.
- lnd/info.rs and electrs_status have similar 10s/15s timeouts but
  different failure profiles - audit first, migrate only the ones
  that actually exhibit the bug.

Tests: 6 new unit tests under api::rpc::bitcoin::tests, all passing.
Uses an in-process hyper server (already a transitive dep) to simulate
bitcoind responses; no new crates required.
  - happy_path_first_attempt: no retry when first attempt succeeds
  - retries_on_timeout_then_succeeds: first attempt times out, second
    succeeds, returns OK (uses a short-timeout RetryConfig so the test
    runs in <1s instead of 15s)
  - retries_exhausted_on_persistent_connect_refused: all attempts fail
    against a closed port, error bubbles up, elapsed time confirms
    backoffs actually ran
  - does_not_retry_on_rpc_level_error: bitcoind-returned error body is
    surfaced immediately, no retry
  - does_not_retry_parse_errors: non-JSON response (e.g. 503 with html
    body) is NOT retried - guards against the tempting "retry all
    non-2xx" mistake that would mask real bitcoind misconfig
  - retry_budget_invariants: asserts total wall-time ceiling stays
    under 60s so a bumped constant can't silently hang a UI call
    forever

Validated live on .116: 20/20 bitcoin.getinfo calls succeed during IBD
catch-up (chain at block 797419 -> 797464), vs ~40% baseline under the
old 10s timeout. Worst-case latency was 48.9s during peak validation;
happy-path latency (cached result) remains 28-77ms.
2026-04-22 16:46:28 -04:00
archipelago
d1bcf271f9 release(v1.7.41-alpha): post-OTA auto-rollback so a bad release cannot strand the fleet
Closes failure mode FM5 from docs/bulletproof-containers.md: the v1.7.38 +
v1.7.39 rollouts left every affected node on an unreachable UI (nginx 500)
with no recovery path short of SSH. This release adds a self-check
guardrail to the update flow.

What changed:
- apply_update() writes a pending-verify marker with old+new version and
  a 150s deadline immediately before scheduling the service restart.
- verify_pending_update() runs from main.rs startup. If the marker is
  present and within its freshness window, the new binary waits 15s for
  nginx + backend to settle, then probes https://127.0.0.1/ every 5s for
  up to 90s (self-signed certs accepted).
- On any probe success within the window, the marker is cleared and
  nothing else happens.
- On window-exhaust, the new binary:
    1. Moves the broken /opt/archipelago/web-ui to web-ui.failed.<ts>
       (quarantined, not deleted, so we can post-mortem).
    2. Restores web-ui.bak on top of web-ui.
    3. Calls rollback_update() to restore the previous binary.
    4. Updates state.current_version to reflect the rollback.
    5. systemctl --no-block restart archipelago so the OLD binary boots.
- Markers older than 10 minutes are treated as stale and cleared without
  probing, so a crashed-during-startup marker from weeks ago cannot
  spontaneously roll back a healthy node on a later reboot.
- rollback_update() binary copy now goes through host_sudo instead of
  tokio::fs::copy, so it escapes the service's ProtectSystem=strict
  mount namespace. Without this, the rollback silently failed with
  EROFS on /usr/local/bin and orphaned the rollback - the exact
  opposite of what auto-rollback is for.

Tests: 4 new unit tests in update::tests covering marker round-trip,
absent-marker noop, no-panic on verify_pending_update with nothing to
verify, and an invariant assert that the 90s probe window stays below
the 600s stale threshold. All passing.

Side fix: scripts/create-release-manifest.sh was dying with exit 141
(SIGPIPE from tar tvzf pipe head pipe awk) under set -euo pipefail.
Replaced with a single awk NR==1 that doesn't short-circuit the upstream
pipe, so the release-build flow is idempotent again.
2026-04-22 16:14:35 -04:00