Hotfix: archipelago.service ExecStartPre now mkdirs /run/containers and
/var/lib/containers before the unit's mount-namespace setup tries to bind
them. Without this, fresh nodes that don't have /run/containers (e.g.
nodes provisioned without a prior podman session) fail at the namespace
step with:
Failed to set up mount namespacing: /run/containers: No such file or directory
Failed at step NAMESPACE spawning /bin/bash: No such file or directory
Existing nodes don't pick up systemd unit changes via OTA — they need a
one-time `systemctl edit archipelago` adding the same mkdir. ISO installs
from this version forward have the fix baked in.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Sync-perf tuning for bitcoin/bitcoin-core/bitcoin-knots/electrumx.
- Drop the --cpus=2 cap on bitcoin/electrumx variants. Script verification
is parallelizable; the cap halved IBD speed on 4-8 core machines.
- Bump bitcoin --memory 4g→8g so dbcache=4096 has headroom for mempool +
connection buffers + I/O. 4g was OOM-prone during heavy IBD.
- Bump electrumx --memory 1g→2g + add CACHE_MB=2048 + MAX_SEND=10MB.
- bitcoin-core CLI args gain -dbcache=4096 -par=0 -maxconnections=125.
- bitcoin-knots manifest matched (1024MB pruned / 4096MB full + par=0).
Future v2: host-RAM-aware dbcache scaling.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Resilience-validated release. Three full sweeps of the new resilience
harness against .228 confirm no shipstoppers.
Big user-visible:
- Bitcoin RPC auth durably correct via host-rendered nginx.conf bind-mount,
replaces fragile post-start exec that failed under restricted-cap rootless
podman ("crun: write cgroup.procs: Permission denied")
- Multi-container stack installs (indeedhub, immich, btcpay, mempool) now
emit phase events at every boundary so the progress bar advances
- Apps no longer vanish from the dashboard mid-install (absent-scanner skips
packages in transitional states)
- Indeedhub fresh installs work end-to-end (was 8500+ restart loop): five
missing env vars (DATABASE_PORT, QUEUE_HOST, QUEUE_PORT,
S3_PRIVATE_BUCKET_NAME, AES_MASTER_SECRET) added to install code
- Tailscale install fixed: --entrypoint string was being passed as a single
shell-line arg; switched to custom_args array
- Catalog cleaned of broken entries (dwn, endurain, ollama removed; nextcloud
restored on docker.io)
- Bitcoin Core update path uses correct image (was looking for nonexistent
lfg2025/bitcoin:28.4)
- ISO installs now allocate swap on the encrypted data partition
Infra:
- New resilience harness (scripts/resilience/) — black-box state-machine
tester, every app × every transition. Run before each release.
Sweep #3 final: PASS 107 / FAIL 12 / SKIP 14. The 12 fails are 1 cosmetic
(homeassistant trusted_hosts), 8 harness/timing false-positives, and 3
non-shipstopper tracked items. Down from 23 in baseline sweep #1.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Four production-code fixes merit user-visible mention: the transport
chunking data-corruption fix (real user-affecting bug for multi-chunk
mesh payloads), the avatar u16 overflow panic (backend crash on certain
seeds), the outbox TTL boundary, and the image-versions parser hardening.
Document that OTA updates now refresh the reconcile helper scripts,
closing the deploy gap that kept fixes to those scripts from
reaching existing nodes.
Add two user-facing release notes for fixes shipped this round:
- Full-archive Bitcoin nodes no longer silently get pruned on reconcile
because the disk-size check was reading the OS partition.
- Failed updates can now recover via reconcile --create-missing instead
of leaving a destroyed container behind.
- AccountInfoSection.vue: append 5th bullet to v1.7.43-alpha entry
explaining that update-available badges and version comparisons
work again now that the pinned-image catalog is found at the
correct deployed path.
- docs/MARKETPLACE-QA.md: new tracker for the upcoming app-by-app
install walk on .228. Documents the per-app fix workflow, the
four layers we might need to fix at (app recipe, registry image,
backend orchestrator, frontend), status-key table for tracking
each catalog entry, and the release-notes policy for the walk.
- docs/RESUME.md: refresh with a9908597 commit, updated binary md5
on .228, and split Immediate Next Step into Phase 1 (browser
verification) and Phase 2 (marketplace walk) with a pointer to
the new tracker.
Four release-note bullets describing the user-visible changes shipped
in this round:
- async-spawn install/update/uninstall (UI no longer freezes)
- phase-based install progress bar (Preparing through Finalizing)
- scanner kick post-install (Launch button appears immediately)
- .23 Hetzner VPS retired, .168 OVH promoted to Server 1 with
auto-purge migration for existing nodes
Matches the tone of existing changelog entries: what changed from the
operator's perspective, not internal implementation detail.
The Hetzner VPS at 23.182.128.160 was decommissioned. Replace it
everywhere with the OVH VPS at 146.59.87.168, which was previously
the tertiary mirror.
- update.rs: drop DEFAULT_TERTIARY_MIRROR_URL, promote .168 into
the secondary slot as "Server 1 (OVH)"; tx1138 becomes Server 2.
Default mirror list shrinks from 3 to 2.
- container/registry.rs: default RegistryConfig drops .23, promotes
.168 to Server 1 / priority 0, tx1138 stays Server 2 / priority 10.
- api/rpc/package/config.rs: trusted-registry allowlist swaps .23
for .168.
- api/handler/mod.rs: app-catalog fallback URL uses .168.
- neode-ui/views/marketplace/marketplaceData.ts: REGISTRY uses .168.
- scripts/image-versions.sh: ARCHY_REGISTRY_FALLBACK uses .168.
- image-recipe/build-auto-installer-iso.sh: installer ISO registries
use .168 (both podman registries.conf and backend registries.json).
Tests updated to assert on the new 2-entry default lists (registry +
mirror). URL-parser fixture tests in update.rs retain .23 strings —
they exercise string-parsing logic, not mirror policy.
Git remotes: dropped `gitea-vps` and the .23 push URL on the `origin`
multi-push alias (not part of this commit — pure working-copy change).
Podman emits zero parseable progress when stderr is piped (no TTY), so
the old byte-counter regex never matched in real installs. Users saw
0% for the whole pull, then a jump to 95%, then silence through
create-container, health-check, and post-install hooks.
Replace with 7 explicit lifecycle phases wired through install.rs and
update.rs: Preparing (5%), PullingImage (20%), CreatingContainer (70%),
StartingContainer (80%), WaitingHealthy (88%), PostInstall (95%),
Done (100%). Each maps to a fixed UI progress and status message.
Frontend PHASE_INFO mapper in stores/server.ts prioritizes phase when
present, falls back to byte-counter for legacy. A Math.max forward-only
guard ensures the bar never regresses. Deleted the duplicate watcher
in Discover.vue that was fighting the store's watcher with stale byte
logic. Added shimmer CSS on the fill (with prefers-reduced-motion
opt-out) so the bar looks alive during long phases.
With the backend flipped to async-spawn, install/uninstall/update return
immediately with a { status, package_id } envelope. Client timeouts of
45m/11m were a leftover from synchronous handlers and masked real RPC
failures.
Drop all install/uninstall/update RPC timeouts to 15s. Progress and
terminal state still arrive through the live state stream — the RPC
only needs to confirm the spawn was accepted.
Return-type annotations updated in rpc-client.ts and stores/server.ts.
Five direct rpcClient.call sites across Marketplace.vue, Discover.vue,
and MarketplaceAppDetails.vue updated with the shorter timeout.
The app card and details view previously used a pair of Start/Stop
buttons whose labels were driven off isAppLoading(), a client-side
"I just clicked the button" flag. When the backend's graceful stop
took longer than the RPC round-trip (up to 600s on bitcoin-core),
the flag cleared while the container was still shutting down, the
UI flipped back to "Running" as soon as the next 10s scan saw the
still-alive container, and the user had no indication the stop was
still in flight.
Now that the backend flips PackageState to Stopping / Starting /
Restarting / Installing / Updating / Removing for the duration of
each lifecycle operation and the scan loop preserves those states,
the UI can drive its label off the container state itself. A single
full-width primary button replaces the Start/Stop pair. Its label,
color, and disabled state come from getAppVisualState(), which
collapses resting states (exited/created/paused/installed) into
"stopped" and passes transitional states through untouched.
Changes:
- container-client.ts: widen ContainerStatus.state union to include
the six transitional variants plus "installed". Add
restartContainer() calling the new container-restart RPC.
- stores/container.ts: add getAppVisualState() computed and the
restartContainer() action.
- ContainerApps.vue: single primary button (Start / Stop / Starting
/ Stopping / Restarting etc.) plus a separate circular Restart
button visible only when running. Critically, handleStartApp and
handleStopApp now route through store.startContainer and
stopContainer (which call container-start / container-stop, the
async RPCs) instead of the legacy synchronous bundled-app-start /
bundled-app-stop path. Transitional-state polling widened from
just "created" to the full set of transitional variants.
- ContainerAppDetails.vue: same single-button pattern, Restart
button now calls container-restart instead of the old
stop-sleep-start sequence, added 2s polling interval for
transitional states.
- components/ContainerStatus.vue: widen state prop to match the
shared union, render transitional labels with a trailing ellipsis
and a yellow dot.
No new tests — this is presentation logic. Manual verification on
.228 will confirm the end-to-end async path: click Stop on LND,
button becomes "Stopping" in under a second, stays that way for
roughly 5 minutes, then flips to "Start" with a grey dot. The UI
must never revert to "Running" mid-stop.
Closes failure mode adjacent to FM3 (docs/bulletproof-containers.md): on
a syncing pruned node, bitcoind's RPC thread blocks for 5-10s during block
validation. The old 10s client-side timeout was rejecting roughly 30% of
UI calls even though the node was perfectly healthy. 20x stress test on
the live .116 node (caught in IBD catch-up at block 797k) used to drop
10 of 20 calls; now drops 0 of 20.
What changed:
- core/archipelago/src/api/rpc/bitcoin.rs: bitcoin_rpc_call now retries up
to 3 times with 500ms and 1500ms backoffs between attempts. Only
transient transport errors (timeout, connect refused, send/recv IO)
trigger retry. A well-formed bitcoind error response is surfaced
immediately - real RPC bugs are never masked.
- Per-attempt hard deadline (tokio::time::timeout, 15s) layered on top
of reqwest's own timeout, so DNS starvation or TLS wedging can't
steal the entire retry budget.
- handle_bitcoin_getinfo client builder gained a 3s connect_timeout
so a dead bitcoind is fast-failed inside the first attempt instead
of eating the whole 15s.
- Retry policy extracted into a RetryConfig struct so tests can dial
down timeouts to ~100ms per attempt. Production defaults live in
RetryConfig::production().
Not changed (tracked as follow-up):
- mesh/mod.rs bitcoin_rpc_getblockcount and related helpers use the
same 10s-timeout pattern. Not migrated to the new wrapper in this
release; scheduled for v1.7.43 alongside the render_bitcoin_conf
work.
- lnd/info.rs and electrs_status have similar 10s/15s timeouts but
different failure profiles - audit first, migrate only the ones
that actually exhibit the bug.
Tests: 6 new unit tests under api::rpc::bitcoin::tests, all passing.
Uses an in-process hyper server (already a transitive dep) to simulate
bitcoind responses; no new crates required.
- happy_path_first_attempt: no retry when first attempt succeeds
- retries_on_timeout_then_succeeds: first attempt times out, second
succeeds, returns OK (uses a short-timeout RetryConfig so the test
runs in <1s instead of 15s)
- retries_exhausted_on_persistent_connect_refused: all attempts fail
against a closed port, error bubbles up, elapsed time confirms
backoffs actually ran
- does_not_retry_on_rpc_level_error: bitcoind-returned error body is
surfaced immediately, no retry
- does_not_retry_parse_errors: non-JSON response (e.g. 503 with html
body) is NOT retried - guards against the tempting "retry all
non-2xx" mistake that would mask real bitcoind misconfig
- retry_budget_invariants: asserts total wall-time ceiling stays
under 60s so a bumped constant can't silently hang a UI call
forever
Validated live on .116: 20/20 bitcoin.getinfo calls succeed during IBD
catch-up (chain at block 797419 -> 797464), vs ~40% baseline under the
old 10s timeout. Worst-case latency was 48.9s during peak validation;
happy-path latency (cached result) remains 28-77ms.
Closes failure mode FM5 from docs/bulletproof-containers.md: the v1.7.38 +
v1.7.39 rollouts left every affected node on an unreachable UI (nginx 500)
with no recovery path short of SSH. This release adds a self-check
guardrail to the update flow.
What changed:
- apply_update() writes a pending-verify marker with old+new version and
a 150s deadline immediately before scheduling the service restart.
- verify_pending_update() runs from main.rs startup. If the marker is
present and within its freshness window, the new binary waits 15s for
nginx + backend to settle, then probes https://127.0.0.1/ every 5s for
up to 90s (self-signed certs accepted).
- On any probe success within the window, the marker is cleared and
nothing else happens.
- On window-exhaust, the new binary:
1. Moves the broken /opt/archipelago/web-ui to web-ui.failed.<ts>
(quarantined, not deleted, so we can post-mortem).
2. Restores web-ui.bak on top of web-ui.
3. Calls rollback_update() to restore the previous binary.
4. Updates state.current_version to reflect the rollback.
5. systemctl --no-block restart archipelago so the OLD binary boots.
- Markers older than 10 minutes are treated as stale and cleared without
probing, so a crashed-during-startup marker from weeks ago cannot
spontaneously roll back a healthy node on a later reboot.
- rollback_update() binary copy now goes through host_sudo instead of
tokio::fs::copy, so it escapes the service's ProtectSystem=strict
mount namespace. Without this, the rollback silently failed with
EROFS on /usr/local/bin and orphaned the rollback - the exact
opposite of what auto-rollback is for.
Tests: 4 new unit tests in update::tests covering marker round-trip,
absent-marker noop, no-panic on verify_pending_update with nothing to
verify, and an invariant assert that the 90s probe window stays below
the 600s stale threshold. All passing.
Side fix: scripts/create-release-manifest.sh was dying with exit 141
(SIGPIPE from tar tvzf pipe head pipe awk) under set -euo pipefail.
Replaced with a single awk NR==1 that doesn't short-circuit the upstream
pipe, so the release-build flow is idempotent again.
v1.7.38 and v1.7.39 both shipped with `./` inside the frontend tarball marked
drwx------ (700). Tar extraction preserves archive perms, so every node that
pulled the OTA landed with /opt/archipelago/web-ui at 700, nginx (www-data)
returned 500 "permission denied" on every page, and the browser showed
"Internal Server Error nginx". .116 hit this on both v1.7.38 and v1.7.39
rollouts. The v1.7.39 runtime self-heal in main.rs was the wrong layer —
systemd's ReadOnlyPaths namespace made /opt/archipelago read-only from inside
the archipelago service, so chmod from there returned EROFS.
Root cause: create-release-manifest.sh used mktemp -d (700 default umask) for
staging, then tar preserved that 700 in the archive's root entry.
Fix the archive itself:
- chmod 755 staging dir + `find -type d -exec chmod 755` + `-type f chmod 644`
before tar, so the on-disk entries are correct.
- tar --owner=0 --group=0 --mode='u=rwX,go=rX' to normalize archive perms
belt-and-braces in case file-mode drift ever reappears.
- Post-tar verify: `tar tvzf | head -1` must show drwxr-xr-x at root, or
the release script aborts before the manifest is even generated.
Binary unchanged semantically — the main.rs self-heal stays in as a last-
resort belt (can't hurt on nodes whose FS isn't namespace-isolated), and the
update.rs in-extractor chmod stays in so v1.7.40-onwards extractors are
double-safe. The authoritative fix is the archive.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
v1.7.38 shipped with an OTA bug: the tar-extracted staging dir inherited 700
perms and nginx (www-data) returned 500/403 on every request after the swap.
.116 hit this on rollout; had to chmod by hand to recover.
- update.rs: after extraction, explicitly chmod 755 dirs + 644 files on the
new staging dir before the mv into place, so nginx can stat/serve them.
- main.rs: self-heal on startup — if /opt/archipelago/web-ui is not
world-readable, run `sudo chmod -R u=rwX,go=rX` to repair. This is what
rescues nodes upgrading from v1.7.37/v1.7.38, since their extractor
(running on the old binary) doesn't have the chmod fix yet — the new
binary's first boot fixes the mess before nginx serves a single request.
Everything v1.7.38 shipped is still in this release:
- auth.rs auto-heals is_onboarding_complete() from setup_complete +
password_hash so nodes don't bounce back to /onboarding/intro after
browser clear / reboot / update
- useOnboarding tri-state: backend-unreachable no longer defaults to intro
- login sounds gated by isFirstInstallPhase() — silent after onboarding,
typing sounds unaffected
- FIPS app / Nostr Relay / Nostr VPN / Routstr / Penpot removed from
catalog + frontend + Rust + docker + icons; 15 image versions deleted
from tx1138, .168, gitea-local
- AIUI baked into release tarball via demo/aiui/
- prebuild hook syncs app-catalog/catalog.json → public/catalog.json
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
- auth.rs now infers onboarding-complete from setup_complete + password_hash so
nodes stop bouncing users through the intro wizard after browser clear / update
/ reboot; the flag self-heals to disk on next check
- frontend: "backend uncertain" no longer defaults to /onboarding/intro —
useOnboarding returns null + callers poll / retry instead of flashing the wizard
- login sounds (synthwave, welcome voice, pop, whoosh, oomph) gated by
isFirstInstallPhase(); typing sounds unaffected
- removed FIPS app, Nostr Relay, Nostr VPN, Routstr, Penpot from catalog,
frontend config, Rust AppMetadata + install dispatch + install_penpot_stack;
docker/fips-ui + docker/nostr-vpn-ui + apps/penpot dirs and 5 icons deleted;
15 image versions deleted from tx1138, .168, gitea-local registries (.160
Gitea was 502 at release time — follow-up)
- AIUI baked into frontend release tarball via demo/aiui/; deploy-to-target
falls back to demo/aiui/ when the AIUI sibling checkout is missing
- prebuild hook syncs app-catalog/catalog.json → public/catalog.json so the
two copies can no longer drift (was the source of the "apps still visible"
bug — public/ had stale data)
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Install flow
- api/rpc/package/install.rs: always append the literal image URL as a
last-resort pull candidate in do_pull_image, so images not carried by
any configured mirror (docker.io/bitcoin/bitcoin:28.4) still install
instead of masquerading as a generic pull failure across every mirror.
- api/rpc/package/install.rs: write_bitcoin_conf now skips on any stat
error, not just "file exists". Once bitcoin-knots' first-boot chowns
/var/lib/archipelago/bitcoin into the container's user namespace (700
perms, UID 100100/100101), the archipelago daemon can't even traverse
in — try_exists returns Err which unwrap_or(false) treated as "not
present" and drove a doomed write. Now errors out of the directory
traversal are treated as "conf already owned by container user" and
the write is skipped. Mirrors the lnd.conf pattern.
- api/rpc/package/install.rs: drop the hardcoded `prune=550` from the
conf default. Operators with multi-TB drives shouldn't be silently
pruned; users who want a pruned node can set it in bitcoin.conf
themselves. Full archive is the only honest default.
- api/rpc/package/config.rs: bitcoin-core now passes explicit
-server/-rpcbind/-rpcallowip/-rpcport/-printtoconsole/-datadir CLI
args. Vanilla bitcoin/bitcoin:28.4 has no entrypoint wrapper and
reads conf + argv only; without these the RPC listens on 127.0.0.1
inside the container and rootlessport can't reach it, so the
bitcoin-ui companion gets 502 on every /bitcoin-rpc/ call.
Bitcoin Knots keeps its own entrypoint-driven defaults.
- container/docker_packages.rs: split bitcoin-core out of the shared
AppMetadata arm. bitcoin-core now surfaces as "Bitcoin Core" with
bitcoin-core.svg and a Reference-implementation description; the
bitcoin + bitcoin-knots ids keep the Knots branding. Fixes the home
card showing "Bitcoin Knots" for a Core install.
Bitcoin node UI (docker/bitcoin-ui)
- index.html: impl name/tagline/logo now dynamic. applyImplBranding()
reads subversion from getnetworkinfo — /Satoshi:X/Knots:Y/ resolves
to Bitcoin Knots, plain /Satoshi:X/ resolves to Bitcoin Core. Both
get their own icon and subtitle. Settings modal replaced its
hardcoded Regtest/txindex=1/port-18443 placeholders with live values
from getblockchaininfo + getindexinfo + getzmqnotifications.
- index.html: new Storage info card (Full Archive · X GB /
Pruned · X GB from blockchainInfo.pruned + size_on_disk) visible on
the main dashboard, same level as Network. Settings modal mirrors it
with the prune height when applicable.
- Dockerfile + assets/: bitcoin-core.svg, bitcoin-knots.webp, and the
bg-network.jpg used by the dashboard are now COPY'd into the image
under /usr/share/nginx/html/assets. Previously the <img src> pointed
at paths that 404'd into the SPA fallback and the onerror handler
hid the broken logo silently.
Frontend
- appSession/appSessionConfig.ts: add bitcoin-core to APP_PORTS (8334),
HTTPS_PROXY_PATHS (/app/bitcoin-ui/), and APP_TITLES (Bitcoin Core).
Without these the AppSessionFrame showed "No URL found for
bitcoin-core" and the home/app-list title fell through to the raw id.
- settings/AccountInfoSection.vue: backfill What's New entries for
v1.7.31 through v1.7.37 that had been missed in earlier cuts.
Release plumbing
- releases/v1.7.37-alpha/: binary + frontend tarball.
- releases/manifest.json: v1.7.37-alpha, sha256/size refreshed.
- Cargo.toml / package.json: version bumps.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
The trusted-registry allowlist in api/rpc/package/config.rs splits the
image on '/' and matches the first segment against a fixed set (docker.io,
ghcr.io, git.tx1138.com, 23.182.128.160:3000, ghcr.io, localhost). A bare
'bitcoin/bitcoin:28.4' splits to registry="bitcoin" which isn't on the
list, so the install RPC was returning 'Invalid Docker image format'.
Live catalogs on .160 and gitea-local already hotfixed directly; these
static copies keep ISO builds and the final hardcoded fallback in sync.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
- neode-ui/public/assets/img/app-icons/bitcoin-core.svg (NEW): 256×256
Umbrel community Bitcoin icon sourced from getumbrel.github.io/
umbrel-apps-gallery/bitcoin/icon.svg. Referenced by the static
catalog, the curated fallback, and the upstream lfg2025/app-catalog
entry so every surface shows the same image.
- app-catalog/catalog.json + neode-ui/public/catalog.json: add
bitcoin-core (v28.4) entry pointing at bitcoin/bitcoin:28.4. Same
entry pushed to the lfg2025/app-catalog repo on .160 and the local
gitea mirror so nodes see it without needing a full archipelago
update. Sovereignty Stack entry added to FEATURED_DEFINITIONS with
a description that frames it as a Knots alternative, not a rival.
- core/archipelago/src/api/handler/mod.rs: handle_app_catalog_proxy
is now instance-scoped (&self) and derives its upstream list from
load_registries — each active container registry contributes one
`<scheme>://<reg.url>/app-catalog/raw/branch/main/catalog.json` URL
in priority order (scheme follows tls_verify). When the operator
switches mirrors in Settings, the App Store now follows. Falls back
to the legacy hardcoded .160/tx1138 pair only when registry config
can't be loaded, so the App Store still renders on nodes that
haven't persisted one yet.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
- core/archipelago/src/bootstrap.rs (NEW): embed scripts/container-doctor.sh
and image-recipe/configs/archipelago-doctor.{service,timer} via
include_str! and sync to disk + enable the timer on every archipelago
startup. Idempotent (content-hash compare), dev-box symlink guard keeps
the git checkout untouched, best-effort (warn-only on failure) so
bootstrap never blocks server readiness. Wired in main.rs as a
background tokio task.
- scripts/container-doctor.sh: add fix_rootless_netns_egress(). Detects
when the rootless-netns has lost its pasta tap (container-to-container
still works but outbound DNS/TCP fails) via an nsenter probe into
aardvark-dns; with a two-probe 10s debounce to rule out transients and
a host-precheck that bails out if the host itself is offline. When the
rootless-netns is truly broken, does a graceful podman stop --all /
start --all so pasta + aardvark-dns rebuild the netns from scratch.
Bitcoin-knots and every other outbound container recover in one cycle.
- core/archipelago/src/update.rs: host_sudo → pub(crate) so bootstrap.rs
can reuse the existing systemd-run escape hatch.
- apps/bitcoin-core/manifest.yml: bump app version 24.0.0 → 28.4.0 and
image bitcoin/bitcoin:24.0 → bitcoin/bitcoin:28.4. Resources aligned
with the real container-specs.sh large-disk tune (4 GiB memory cap,
cpu_limit: 0 so bitcoind can run -par=auto across every core).
- neode-ui/src/views/apps/AppCard.vue + Apps.vue: add an Update button
+ Updating spinner to every app card that has available-update set.
Wires through serverStore.updatePackage(id) — the same RPC the detail
view already calls. common.update / common.updating i18n keys added in
en.json and es.json.
- core/archipelago/src/identity_manager.rs: add create_from_signing_key()
that mirrors an existing Ed25519 key as a manager-level identity with
a deterministic id (`node-<pubkey16>`). Idempotent across restarts,
gets the hex-SVG master avatar.
- core/archipelago/src/server.rs: the auto-create path on first boot now
mirrors the node's own signing_key (seed-derived on onboarded installs)
as a "Node" identity instead of generating a random "Default" keypair.
Once this ships, the DID on the Web5 DID Status card (via node.did
RPC), the Node entry on the Identities page (via identity.list), and
the DID used for peer-to-peer connects (via server_info.pubkey) all
resolve to the same seed-derived pubkey.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
- useOnboarding.ts: when the backend gives a definitive answer
(true/false, not a null retry failure), re-seed the
neode_onboarding_complete localStorage flag accordingly. Fixes the
case where a user clears site data on an already-onboarded node —
OnboardingWrapper's useVideoBackground computed reads localStorage
synchronously, so without this re-seed the intro video would fire
again on /login even though RootRedirect correctly sent them
straight to /login.
- OnboardingWrapper.vue: login background now rotates through
bg-intro-1..6 on each /login mount, with the current index
persisted to localStorage (neode_login_bg_idx) so subsequent
logouts advance rather than repeat the same image.
- Dashboard.vue: subsequent-login branch drops the 1.2s showZoomIn
entirely. Only the first dashboard entry after onboarding plays
the full zoom + glitch reveal; every re-login now just fades in
with the welcome typing (~300ms).
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
- useOnboarding.ts: prefer the backend over localStorage when checking
onboarding completion. The old order (localStorage first) meant any
browser that had ever onboarded a node would treat every new fresh
node as already-onboarded and skip the wizard, dumping the user
straight at the inline set-password form. Backend is now authoritative;
localStorage stays as the offline fallback.
- OnboardingWrapper.vue: skip the intro video on `/login` once
`neode_onboarding_complete` is set. Returning logged-out users now
get the static lock-screen background + glitch overlay instead of
replaying the full intro on every logout.
- RootRedirect.vue: when the health check fails, only show the full
BootScreen if the node was never onboarded. For already-onboarded
nodes (i.e. an OTA-update blip), keep the spinner and poll the
health endpoint every 2s for up to 60s before falling back to the
boot screen. Fixes the "fake boot loader" / "server starting up"
screens flashing on every successful update.
- loginTransition store: new `justCompletedOnboarding` flag distinct
from `justLoggedIn`. Set true only by the inline setup-password
flow (handleSetup). Dashboard.vue branches on it: full glitch+zoom
reveal for the post-onboarding entry, quick zoom + welcome typing
on every other login (no triple glitch flashes, ~1.2s vs 8s).
- vite.config.ts: bump assets cache from `assets-cache-v2` to
`assets-cache-v3` so service workers running the previous bundle
invalidate their cache and pick up the new UI cleanly.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
- Backend: unified pull-progress streaming across primary AND fallback
registries. Earlier code only streamed for the primary attempt; if it
failed fast (VPS 404, etc.) the UI froze at 0% until the fallback
finished. The waterfall now uses a single shared helper that streams
podman stderr through update_install_progress for every URL tried.
- Backend: PackageDataEntry gains uninstall_stage, set at each phase of
handle_package_uninstall ("Stopping containers (i/total)",
"Cleaning up volumes", "Removing app data"). State flips to Removing
during the pipeline.
- Frontend: MarketplaceAppCard renders the live progress bar with byte
counts during installs, matching the System Update download bar style.
- Frontend: AppCard renders the live uninstall stage label per app.
Modal closes immediately on confirm so concurrent uninstalls each
show their own progress on their own card.
- Cleanup: removed dead helpers (image_candidates, rewrite_for_primary,
primary_image_url, pull_from_registries_with_skip) made unused by
the install.rs refactor.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
- New Settings → App registries page (/dashboard/settings/registries)
that mirrors the update-mirrors experience: list of configured
registries, test reachability, set primary, add/remove. New
registry.set-primary RPC; existing registry.{list,add,remove,test}
reused.
- Default RegistryConfig flipped: VPS (23.182.128.160:3000/lfg2025) is
now Server 1 (primary), tx1138 is Server 2 (fallback).
- Install pipeline now rewrites the first pull to the primary registry
URL before attempting it. Before this, installs always hit whichever
registry the image was hardcoded to, so changing the primary didn't
actually affect where images came from. On failure, the existing
fallback walk skips the primary (already tried) and walks the rest.
- App catalog proxy UPSTREAMS order flipped so the catalog follows the
same VPS-first rule.
- Reboot overlay: animated "a" logo now sits in the center of the ring
(matches the screensaver composition). Extracted the logo-wrapper
pattern inline.
7/7 registry tests pass.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
- New reboot progress overlay: full-screen black with the screensaver's
pulsing ring, rebooting → reconnecting → back-online → stalled stages,
elapsed counter, auto-reload on health-check success, manual reload
button at 3 min stall. Mirrors the existing update overlay.
- Ring extracted from Screensaver.vue into a reusable ScreensaverRing
component so the reboot overlay reuses the same animation.
- default_mirrors() now puts the VPS as Server 1 (primary) and tx1138 as
Server 2 — new nodes fetch manifests from VPS first; existing nodes
keep whatever mirror order they've customized.
- What's New entry prepended for v1.7.28-alpha.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
- New "Served by {mirror}" line on the System Update page so operators can see
which mirror actually served the available manifest (vs. which is configured
primary). Backend threads the served URL through UpdateState.manifest_mirror.
- New update.test-mirror RPC + per-row lightning-bolt button that pings a
mirror and renders reachable/latency or error inline under the URL.
- UI polish on the mirrors section: Set Primary, Remove, and the new Test
action are compact icon buttons; add-mirror form moved into a dialog.
- "What's New" block prepended for v1.7.27-alpha.
21/21 update module tests pass. vue-tsc + vite build clean.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Adds a multi-mirror manifest fetch. `check_for_updates` walks a
configurable list (data_dir/update-mirrors.json) in priority order
and falls through to the next mirror on any HTTP / parse / timeout
failure. Two defaults bake in: Server 1 (git.tx1138.com) and Server 2
(23.182.128.160:3000).
Critical fix: after parsing a manifest, rewrite every component's
`download_url` so its origin matches the manifest URL we fetched.
Before this, the manifest hard-coded absolute URLs pointing at one
specific server — so even when a node fetched the manifest from a
faster mirror, the actual 200MB download went back to the slow
original. Now the faster mirror wins end-to-end.
New RPCs: update.list-mirrors, update.add-mirror, update.remove-mirror,
update.set-primary-mirror. New UI section on the System Update page
for operator management. 5 new unit tests for origin parsing and
manifest rewriting (21/21 green).
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>