archy

lfg2025/archy

Author	SHA1	Message	Date
archipelago	b8ac68d844	fix: restore aiui and bitcoin receive before release	2026-06-12 05:10:03 -04:00
archipelago	6fd1cf9ba7	chore: release v1.7.87-alpha	2026-06-12 04:49:58 -04:00
archipelago	b11c6c17d1	chore: release v1.7.86-alpha	2026-06-12 04:21:18 -04:00
archipelago	00c32688f8	chore: release v1.7.85-alpha	2026-06-12 03:14:59 -04:00
archipelago	6a30ff11bd	chore: release v1.7.84-alpha	2026-06-11 04:44:58 -04:00
archipelago	22df3f8f5f	chore: release v1.7.83-alpha	2026-06-11 03:03:32 -04:00
archipelago	136eda16c9	chore: release v1.7.82-alpha	2026-05-22 17:19:45 -04:00
archipelago	853d51ae14	chore: release v1.7.81-alpha	2026-05-21 21:44:14 -04:00
archipelago	bdd5a2c43e	chore: release v1.7.80-alpha	2026-05-21 00:38:57 -04:00
archipelago	7be7420c4f	chore: release v1.7.79-alpha	2026-05-20 23:11:54 -04:00
archipelago	e61c757633	chore: release v1.7.78-alpha	2026-05-20 20:53:23 -04:00
archipelago	0898c54765	chore: bump version to v1.7.77-alpha	2026-05-20 00:38:26 -04:00
archipelago	92c58141af	fix(apps): stabilize saleor and netbird launch	2026-05-19 21:45:17 -04:00
archipelago	e65e76cd9d	chore: release v1.7.75-alpha	2026-05-19 20:19:24 -04:00
archipelago	bd69ef41d5	fix(apps): repair netbird login and iframe focus	2026-05-19 19:21:43 -04:00
archipelago	eeb08fc78f	chore: release v1.7.73-alpha	2026-05-19 18:40:10 -04:00
archipelago	3e01e57c8d	chore: release v1.7.72-alpha	2026-05-19 17:42:11 -04:00
archipelago	5859ef77e7	chore: release v1.7.71-alpha	2026-05-19 17:30:20 -04:00
archipelago	dd8a6cd9d7	chore: release v1.7.70-alpha	2026-05-19 16:10:43 -04:00
archipelago	20bc9f250c	chore: release v1.7.69-alpha	2026-05-19 14:39:15 -04:00
archipelago	ab27fb97f8	chore: release v1.7.68-alpha	2026-05-19 09:37:47 -04:00
archipelago	b25d41c5c6	chore: release v1.7.67-alpha	2026-05-18 11:54:57 -04:00
archipelago	6240064acf	chore: release v1.7.66-alpha	2026-05-18 10:15:56 -04:00
archipelago	ec36ac7e2c	chore: release v1.7.65-alpha	2026-05-18 09:31:41 -04:00
archipelago	76288f541e	chore: release v1.7.64-alpha	2026-05-17 23:24:39 -04:00
archipelago	8191d92bed	chore: release v1.7.63-alpha	2026-05-17 23:03:06 -04:00
archipelago	d91b858d9b	chore: release v1.7.62-alpha	2026-05-17 22:40:36 -04:00
archipelago	a992abcd06	chore: release v1.7.61-alpha	2026-05-17 22:13:21 -04:00
archipelago	4d6b4f76af	chore: release v1.7.60-alpha	2026-05-17 20:45:56 -04:00
archipelago	0a94c0097f	chore: release v1.7.59-alpha	2026-05-17 19:44:54 -04:00
archipelago	e05e356d64	chore: release v1.7.58-alpha	2026-05-17 18:40:50 -04:00
archipelago	7804223152	chore: release v1.7.57-alpha	2026-05-17 17:30:04 -04:00
Dorian	5818541721	chore: release v1.7.56-alpha	2026-05-14 09:13:58 -04:00
Dorian	835c525218	chore(release): stage v1.7.55-alpha	2026-05-13 15:09:22 -04:00
archipelago	c0751e2551	chore(release): stage v1.7.54-alpha	2026-05-06 09:23:57 -04:00
archipelago	1a0d8a432c	chore(release): stage v1.7.53-alpha	2026-05-05 13:59:50 -04:00
archipelago	745cb1c626	chore(release): stage v1.7.52-alpha	2026-05-05 11:29:18 -04:00
archipelago	05e6c2e738	fix: release v1.7.51-alpha install hardening	2026-05-01 05:02:39 -04:00
archipelago	be9f9528c3	fix: release v1.7.50-alpha OTA runtime repair	2026-05-01 03:14:07 -04:00
archipelago	7ab788d178	chore: release v1.7.49-alpha	2026-04-30 16:37:54 -04:00
archipelago	f507b847ef	chore: release v1.7.48-alpha Hotfix: archipelago.service ExecStartPre now mkdirs /run/containers and /var/lib/containers before the unit's mount-namespace setup tries to bind them. Without this, fresh nodes that don't have /run/containers (e.g. nodes provisioned without a prior podman session) fail at the namespace step with: Failed to set up mount namespacing: /run/containers: No such file or directory Failed at step NAMESPACE spawning /bin/bash: No such file or directory Existing nodes don't pick up systemd unit changes via OTA — they need a one-time `systemctl edit archipelago` adding the same mkdir. ISO installs from this version forward have the fix baked in. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>	2026-04-29 16:27:22 -04:00
archipelago	8a2899ab4a	chore: release v1.7.47-alpha Sync-perf tuning for bitcoin/bitcoin-core/bitcoin-knots/electrumx. - Drop the --cpus=2 cap on bitcoin/electrumx variants. Script verification is parallelizable; the cap halved IBD speed on 4-8 core machines. - Bump bitcoin --memory 4g→8g so dbcache=4096 has headroom for mempool + connection buffers + I/O. 4g was OOM-prone during heavy IBD. - Bump electrumx --memory 1g→2g + add CACHE_MB=2048 + MAX_SEND=10MB. - bitcoin-core CLI args gain -dbcache=4096 -par=0 -maxconnections=125. - bitcoin-knots manifest matched (1024MB pruned / 4096MB full + par=0). Future v2: host-RAM-aware dbcache scaling. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>	2026-04-29 15:47:51 -04:00
archipelago	992b673b20	chore: release v1.7.46-alpha Follow-up to v1.7.45-alpha closing the remaining tasks identified by the resilience sweeps + the new bitcoin orphan / install-fail-vanish bugs. User-visible: - Health monitor: stop paging on orphaned containers from variant switches - Install fail: card stays visible (was vanishing) with error message - Stack pull progress: interpolate 20→70% (was stuck at 20%) - docker.io → lfg2025 mirror: bitcoin/gitea/nextcloud/valkey Internal: - Resilience harness — install-wait uses expected_containers_for, ui+auth probes retry with 60s backoff, dep-snapshot fix - InstallProgress gains optional `message` field (frontend renders it when phase is None) binary $(stat -c %s releases/v1.7.46-alpha/archipelago) sha256:$(sha256sum releases/v1.7.46-alpha/archipelago | awk '{print $1}') tarball $(stat -c %s releases/v1.7.46-alpha/archipelago-frontend-1.7.46-alpha.tar.gz) sha256:$(sha256sum releases/v1.7.46-alpha/archipelago-frontend-1.7.46-alpha.tar.gz | awk '{print $1}') Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>	2026-04-29 14:50:33 -04:00
archipelago	4ec6ca98c1	chore: release v1.7.45-alpha Resilience-validated release. Three full sweeps of the new resilience harness against .228 confirm no shipstoppers. Big user-visible: - Bitcoin RPC auth durably correct via host-rendered nginx.conf bind-mount, replaces fragile post-start exec that failed under restricted-cap rootless podman ("crun: write cgroup.procs: Permission denied") - Multi-container stack installs (indeedhub, immich, btcpay, mempool) now emit phase events at every boundary so the progress bar advances - Apps no longer vanish from the dashboard mid-install (absent-scanner skips packages in transitional states) - Indeedhub fresh installs work end-to-end (was 8500+ restart loop): five missing env vars (DATABASE_PORT, QUEUE_HOST, QUEUE_PORT, S3_PRIVATE_BUCKET_NAME, AES_MASTER_SECRET) added to install code - Tailscale install fixed: --entrypoint string was being passed as a single shell-line arg; switched to custom_args array - Catalog cleaned of broken entries (dwn, endurain, ollama removed; nextcloud restored on docker.io) - Bitcoin Core update path uses correct image (was looking for nonexistent lfg2025/bitcoin:28.4) - ISO installs now allocate swap on the encrypted data partition Infra: - New resilience harness (scripts/resilience/) — black-box state-machine tester, every app × every transition. Run before each release. Sweep #3 final: PASS 107 / FAIL 12 / SKIP 14. The 12 fails are 1 cosmetic (homeassistant trusted_hosts), 8 harness/timing false-positives, and 3 non-shipstopper tracked items. Down from 23 in baseline sweep #1. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>	2026-04-29 12:31:45 -04:00
archipelago	dffa7e99bb	chore: release v1.7.44-alpha	2026-04-28 15:03:04 -04:00
archipelago	310c709aba	chore(release): bump version to 1.7.43-alpha	2026-04-23 13:21:58 -04:00
archipelago	0ac673deb4	release(v1.7.42-alpha): bitcoin RPC retry wrapper so syncing nodes stop flashing red Closes failure mode adjacent to FM3 (docs/bulletproof-containers.md): on a syncing pruned node, bitcoind's RPC thread blocks for 5-10s during block validation. The old 10s client-side timeout was rejecting roughly 30% of UI calls even though the node was perfectly healthy. 20x stress test on the live .116 node (caught in IBD catch-up at block 797k) used to drop 10 of 20 calls; now drops 0 of 20. What changed: - core/archipelago/src/api/rpc/bitcoin.rs: bitcoin_rpc_call now retries up to 3 times with 500ms and 1500ms backoffs between attempts. Only transient transport errors (timeout, connect refused, send/recv IO) trigger retry. A well-formed bitcoind error response is surfaced immediately - real RPC bugs are never masked. - Per-attempt hard deadline (tokio::time::timeout, 15s) layered on top of reqwest's own timeout, so DNS starvation or TLS wedging can't steal the entire retry budget. - handle_bitcoin_getinfo client builder gained a 3s connect_timeout so a dead bitcoind is fast-failed inside the first attempt instead of eating the whole 15s. - Retry policy extracted into a RetryConfig struct so tests can dial down timeouts to ~100ms per attempt. Production defaults live in RetryConfig::production(). Not changed (tracked as follow-up): - mesh/mod.rs bitcoin_rpc_getblockcount and related helpers use the same 10s-timeout pattern. Not migrated to the new wrapper in this release; scheduled for v1.7.43 alongside the render_bitcoin_conf work. - lnd/info.rs and electrs_status have similar 10s/15s timeouts but different failure profiles - audit first, migrate only the ones that actually exhibit the bug. Tests: 6 new unit tests under api::rpc::bitcoin::tests, all passing. Uses an in-process hyper server (already a transitive dep) to simulate bitcoind responses; no new crates required. 
- happy_path_first_attempt: no retry when first attempt succeeds - retries_on_timeout_then_succeeds: first attempt times out, second succeeds, returns OK (uses a short-timeout RetryConfig so the test runs in <1s instead of 15s) - retries_exhausted_on_persistent_connect_refused: all attempts fail against a closed port, error bubbles up, elapsed time confirms backoffs actually ran - does_not_retry_on_rpc_level_error: bitcoind-returned error body is surfaced immediately, no retry - does_not_retry_parse_errors: non-JSON response (e.g. 503 with html body) is NOT retried - guards against the tempting "retry all non-2xx" mistake that would mask real bitcoind misconfig - retry_budget_invariants: asserts total wall-time ceiling stays under 60s so a bumped constant can't silently hang a UI call forever Validated live on .116: 20/20 bitcoin.getinfo calls succeed during IBD catch-up (chain at block 797419 -> 797464), vs ~40% baseline under the old 10s timeout. Worst-case latency was 48.9s during peak validation; happy-path latency (cached result) remains 28-77ms.	2026-04-22 16:46:28 -04:00
archipelago	d1bcf271f9	release(v1.7.41-alpha): post-OTA auto-rollback so a bad release cannot strand the fleet Closes failure mode FM5 from docs/bulletproof-containers.md: the v1.7.38 + v1.7.39 rollouts left every affected node on an unreachable UI (nginx 500) with no recovery path short of SSH. This release adds a self-check guardrail to the update flow. What changed: - apply_update() writes a pending-verify marker with old+new version and a 150s deadline immediately before scheduling the service restart. - verify_pending_update() runs from main.rs startup. If the marker is present and within its freshness window, the new binary waits 15s for nginx + backend to settle, then probes https://127.0.0.1/ every 5s for up to 90s (self-signed certs accepted). - On any probe success within the window, the marker is cleared and nothing else happens. - On window-exhaust, the new binary: 1. Moves the broken /opt/archipelago/web-ui to web-ui.failed.<ts> (quarantined, not deleted, so we can post-mortem). 2. Restores web-ui.bak on top of web-ui. 3. Calls rollback_update() to restore the previous binary. 4. Updates state.current_version to reflect the rollback. 5. systemctl --no-block restart archipelago so the OLD binary boots. - Markers older than 10 minutes are treated as stale and cleared without probing, so a crashed-during-startup marker from weeks ago cannot spontaneously roll back a healthy node on a later reboot. - rollback_update() binary copy now goes through host_sudo instead of tokio::fs::copy, so it escapes the service's ProtectSystem=strict mount namespace. Without this, the rollback silently failed with EROFS on /usr/local/bin and orphaned the rollback - the exact opposite of what auto-rollback is for. Tests: 4 new unit tests in update::tests covering marker round-trip, absent-marker noop, no-panic on verify_pending_update with nothing to verify, and an invariant assert that the 90s probe window stays below the 600s stale threshold. All passing. 
Side fix: scripts/create-release-manifest.sh was dying with exit 141 (SIGPIPE from tar tvzf pipe head pipe awk) under set -euo pipefail. Replaced with a single awk NR==1 that doesn't short-circuit the upstream pipe, so the release-build flow is idempotent again.	2026-04-22 16:14:35 -04:00
Dorian	85417de952	release(v1.7.40-alpha): fix tarball root perms at source so OTA can't 500 again v1.7.38 and v1.7.39 both shipped with `./` inside the frontend tarball marked drwx------ (700). Tar extraction preserves archive perms, so every node that pulled the OTA landed with /opt/archipelago/web-ui at 700, nginx (www-data) returned 500 "permission denied" on every page, and the browser showed "Internal Server Error nginx". .116 hit this on both v1.7.38 and v1.7.39 rollouts. The v1.7.39 runtime self-heal in main.rs was the wrong layer — systemd's ReadOnlyPaths namespace made /opt/archipelago read-only from inside the archipelago service, so chmod from there returned EROFS. Root cause: create-release-manifest.sh used mktemp -d (700 default umask) for staging, then tar preserved that 700 in the archive's root entry. Fix the archive itself: - chmod 755 staging dir + `find -type d -exec chmod 755` + `-type f chmod 644` before tar, so the on-disk entries are correct. - tar --owner=0 --group=0 --mode='u=rwX,go=rX' to normalize archive perms belt-and-braces in case file-mode drift ever reappears. - Post-tar verify: `tar tvzf | head -1` must show drwxr-xr-x at root, or the release script aborts before the manifest is even generated. Binary unchanged semantically — the main.rs self-heal stays in as a last-resort belt (can't hurt on nodes whose FS isn't namespace-isolated), and the update.rs in-extractor chmod stays in so v1.7.40-onwards extractors are double-safe. The authoritative fix is the archive. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>	2026-04-22 13:54:44 -04:00
Dorian	b8d084368e	release(v1.7.39-alpha): hotfix web-ui perms after OTA (nginx 500) + startup self-heal v1.7.38 shipped with an OTA bug: the tar-extracted staging dir inherited 700 perms and nginx (www-data) returned 500/403 on every request after the swap. .116 hit this on rollout; had to chmod by hand to recover. - update.rs: after extraction, explicitly chmod 755 dirs + 644 files on the new staging dir before the mv into place, so nginx can stat/serve them. - main.rs: self-heal on startup — if /opt/archipelago/web-ui is not world-readable, run `sudo chmod -R u=rwX,go=rX` to repair. This is what rescues nodes upgrading from v1.7.37/v1.7.38, since their extractor (running on the old binary) doesn't have the chmod fix yet — the new binary's first boot fixes the mess before nginx serves a single request. Everything v1.7.38 shipped is still in this release: - auth.rs auto-heals is_onboarding_complete() from setup_complete + password_hash so nodes don't bounce back to /onboarding/intro after browser clear / reboot / update - useOnboarding tri-state: backend-unreachable no longer defaults to intro - login sounds gated by isFirstInstallPhase() — silent after onboarding, typing sounds unaffected - FIPS app / Nostr Relay / Nostr VPN / Routstr / Penpot removed from catalog + frontend + Rust + docker + icons; 15 image versions deleted from tx1138, .168, gitea-local - AIUI baked into release tarball via demo/aiui/ - prebuild hook syncs app-catalog/catalog.json → public/catalog.json Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>	2026-04-22 13:26:54 -04:00
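
The retry policy described in the v1.7.42-alpha commit message (up to 3 attempts, 500ms and 1500ms backoffs, retry only on transient transport errors, surface RPC-level and parse errors immediately) could be sketched roughly as follows. This is a minimal, synchronous illustration using only the standard library, not the code in core/archipelago/src/api/rpc/bitcoin.rs. The `CallError` enum, the `RetryConfig` fields, and `call_with_retry` are hypothetical names chosen for the example, and the per-attempt 15s deadline and 3s connect timeout from the commit are left out.

```rust
use std::thread::sleep;
use std::time::Duration;

/// Hypothetical error classification for this sketch.
#[derive(Debug, PartialEq)]
enum CallError {
    /// Timeout, connection refused, send/recv I/O: eligible for retry.
    Transient(String),
    /// Well-formed RPC error or unparseable body: returned immediately.
    Permanent(String),
}

/// Illustrative stand-in for the RetryConfig named in the commit message.
/// The field names are assumptions, not the project's actual struct.
struct RetryConfig {
    max_attempts: usize,
    /// backoffs[i] is the pause after attempt i fails (if another attempt follows).
    backoffs: Vec<Duration>,
}

impl RetryConfig {
    /// Values taken from the commit message: 3 attempts, 500ms then 1500ms.
    #[allow(dead_code)]
    fn production() -> Self {
        RetryConfig {
            max_attempts: 3,
            backoffs: vec![Duration::from_millis(500), Duration::from_millis(1500)],
        }
    }
}

/// Runs `attempt` until it succeeds, fails permanently, or attempts run out.
/// The closure receives the zero-based attempt index.
fn call_with_retry<T, F>(cfg: &RetryConfig, mut attempt: F) -> Result<T, CallError>
where
    F: FnMut(usize) -> Result<T, CallError>,
{
    let mut last = CallError::Transient("no attempts made".to_string());
    for i in 0..cfg.max_attempts {
        match attempt(i) {
            Ok(value) => return Ok(value),
            // Never mask a real error behind retries.
            Err(CallError::Permanent(msg)) => return Err(CallError::Permanent(msg)),
            Err(transient) => last = transient,
        }
        // Sleep only between attempts, never after the final one.
        if i + 1 < cfg.max_attempts {
            if let Some(delay) = cfg.backoffs.get(i) {
                sleep(*delay);
            }
        }
    }
    Err(last)
}
```

Passing the config by reference lets a test substitute millisecond-scale backoffs, in the spirit of the commit's note that tests dial timeouts down so they finish quickly.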


81 Commits