Compare commits

16 Commits

Author SHA1 Message Date
archipelago
5b75310e0b docs(demo): comprehensive build info, deploy steps, gotchas
Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
2026-06-22 16:50:32 -04:00
archipelago
7efebb4a8c feat(demo): per-folder media merge + AIUI seed-chats bootstrap
- Curated files loader now MERGES per top-level folder: dropping real files into
  demo/files/Music/ swaps only Music and keeps the sample Documents/Photos/Videos
  (verified). Media plays with the Range support already in place.
- AIUI index.html: a ?seed bootstrap pre-loads the example "Content Showcase"
  conversation into AIUI's IndexedDB by calling the bundle's own
  seedPromptsToConversation() (identical to its /seed command), so the chat
  history isn't empty when the demo points users to "previous chats". Guarded by
  try/catch + an existence check; no-op without ?seed.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
2026-06-22 16:45:26 -04:00
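The per-folder merge described in 7efebb4a8c can be sketched roughly as follows. This is a minimal illustration, not the loader's actual code: `mergeByFolder` and the folder-to-file-list map shape are assumptions made for the example.

```javascript
// Hypothetical helper: `sample` and `curated` map a top-level folder name
// to an array of file names.
function mergeByFolder(sample, curated) {
  const merged = { ...sample };
  for (const [folder, files] of Object.entries(curated)) {
    // Only a folder that actually contains curated files replaces its sample set.
    if (files.length > 0) merged[folder] = files;
  }
  return merged;
}
```

With this shape, dropping files into `Music` replaces only the `Music` entry and leaves the sample `Documents`, `Photos` and `Videos` entries untouched.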
archipelago
445f08a5c1 feat(demo): iframe asset-rewrite proxy, AIUI mockArchy, QR 2s, dummy mints
- IndeeHub + Mempool: nginx reverse-proxy + strip X-Frame-Options/CSP + sub_filter
  rewrite of absolute asset paths so the frame-busting SPAs load in the iframe
  (mempool.space remains best-effort — third-party CSP/ws may still limit it).
- AIUI iframe gets ?mockArchy in demo → its built-in mock node data loads.
- Pay-with-mobile QR: invoice settles after ~2s (backend gate keyed by
  payment_hash) and the poll tightened to 1s, so the QR is visible before auto-pay.
- Wallet settings: dummy Cashu mints (4) + Fedimint federations (2, 222,500 sats),
  interactive per session (streaming.list/configure-mints, wallet.fedimint-list/
  join/balance).

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
2026-06-22 16:34:12 -04:00
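One way to read the "settles after ~2s, keyed by payment_hash" gate from 445f08a5c1 is sketched below. The names `registerInvoice`, `isSettled` and `SETTLE_DELAY_MS` are hypothetical, and the real backend may track invoices differently.

```javascript
// Assumed delay, taken from the "~2s" in the commit message.
const SETTLE_DELAY_MS = 2000;
// payment_hash -> creation time in milliseconds (hypothetical store).
const invoiceCreatedAt = new Map();

function registerInvoice(paymentHash, now = Date.now()) {
  invoiceCreatedAt.set(paymentHash, now);
}

// An invoice reports as settled only once the delay has elapsed,
// so a client polling every second sees the QR code first.
function isSettled(paymentHash, now = Date.now()) {
  const created = invoiceCreatedAt.get(paymentHash);
  return created !== undefined && now - created >= SETTLE_DELAY_MS;
}
```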
archipelago
1b7335f4ac fix(demo): nostr-rs-relay icon (nostr.svg missing → nostrudel.svg)
The catalog pointed at a non-existent nostr.svg (handleImageError only falls
back .png→.svg, so an .svg miss stays broken). Point it at the existing nostr
icon. fedimint icon already uses fedimint.png (exists); the stale fedimint.jpg
request is resolved by /api/app-catalog now serving the local catalog.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
2026-06-22 15:23:25 -04:00
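The reason an `.svg` miss stays broken, as noted in 1b7335f4ac, can be illustrated with a one-way fallback like the one below. This is only a sketch of the behaviour the message describes; `fallbackIconPath` is a hypothetical name, not the frontend's `handleImageError`.

```javascript
// Returns the path to retry after a failed icon load, or null when no fallback exists.
function fallbackIconPath(src) {
  // Only .png has a fallback (to .svg); a missing .svg has nowhere left to go.
  return src.endsWith('.png') ? src.slice(0, -'.png'.length) + '.svg' : null;
}
```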
archipelago
c991e61a8f feat(demo): network/wallet dummy data — profits, federation, VPN, nostr, visibility
- wallet.networking-profits = 5,231,978 sats (content 3,180,000 / routing
  1,281,978 / relay 770,000); 6 labelled profit transactions added to the wallet
  history (1-2 per type: content sale, routing fee, file/mesh relay) — labels are
  production-ready.
- federation.list (the Web5 Federation container's method) now returns the 12
  demo nodes (was unhandled → empty).
- vpn.status: connected WireGuard with peers + traffic.
- nostr.list-relays / nostr.get-stats: 5 relays (3 connected).
- network.get/set-visibility: interactive, persisted per demo session.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
2026-06-22 15:18:29 -04:00
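The profit breakdown quoted in c991e61a8f can be checked with a small sum. The object shape and the `totalProfits` helper are illustrative assumptions; only the three figures come from the commit message.

```javascript
// Hypothetical helper: adds up the per-category profit figures (in sats).
function totalProfits(parts) {
  return Object.values(parts).reduce((sum, sats) => sum + sats, 0);
}

// Figures from the commit message.
const demoProfits = { content: 3180000, routing: 1281978, relay: 770000 };
console.log(totalProfits(demoProfits)); // 5231978
```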
archipelago
b99c4a604f fix(demo): iframe mempool+indeehub directly, serve real UIs statically, AIUI canned
- Mempool and IndeeHub load their real site directly in the iframe (reverted the
  proxy/new-tab — per request "use https://indee.tx1138.com/").
- Real app UIs now served as whole static dirs under /app/<id>/ (express.static)
  so their bundled assets (qrcode.js, css, bg images) resolve; /app/<id>/assets/*
  redirect to the frontend's shared assets. Fixes the console 404 cascade.
- Bitcoin Core/Knots: register rpc/v1 + bitcoin-rpc on their paths (relay-status
  no longer 404s); per-impl bitcoin-status preserved.
- AIUI chat returns a fixed line in demo ("Not available in demo, check out the
  previous chats to experience AIUI") instead of calling Claude — no key spend.
- Add /api/app-catalog (serves the baked catalog) to stop that 404.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
2026-06-22 14:45:04 -04:00
archipelago
cf5f6d021a feat(demo): real registry UIs, IndeeHub iframe proxy, mempool tab, media Range
- App UIs now use the real registry shells with dummy data: bitcoin-ui for
  Bitcoin Core (Satoshi subversion) and Bitcoin Knots (Knots subversion) via
  per-path /app/bitcoin-{core,knots}/bitcoin-status; the real lnd-ui (mock
  /proxy/lnd/v1/getinfo+channels, /lnd-connect-info, /api/container/logs); the
  static fedimint-ui. ElectrumX already on the real electrs-ui. Custom mock UIs
  dropped — accurate UX.
- IndeeHub loads in the iframe: nginx reverse-proxies /app/indeedhub/ →
  indee.tx1138.com and strips X-Frame-Options/CSP (it blocked framing before).
- Mempool opens in a new tab (mempool.space can't be iframed).
- Cloud media playback: HTTP Range support in the curated-file server so audio/
  video can stream and seek (needs real files dropped into demo/files/).
- Dockerfile/.dockerignore copy docker/lnd-ui + docker/fedimint-ui.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
2026-06-22 14:19:38 -04:00
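The HTTP Range support mentioned in cf5f6d021a typically starts with parsing the `Range` header. The sketch below handles single byte ranges only and is an assumption about how such a server might do it; `parseRange` is a hypothetical name, not the curated-file server's actual function.

```javascript
// Parse "bytes=start-end", "bytes=start-" or "bytes=-suffix" against a file size.
// Returns { start, end } (inclusive) or null if the header is missing or unsatisfiable.
function parseRange(header, size) {
  const m = /^bytes=(\d*)-(\d*)$/.exec(header || '');
  if (!m || (m[1] === '' && m[2] === '')) return null;
  let start;
  let end;
  if (m[1] === '') {
    // Suffix form: the last N bytes of the file.
    const n = Number(m[2]);
    if (n === 0) return null;
    start = Math.max(size - n, 0);
    end = size - 1;
  } else {
    start = Number(m[1]);
    end = m[2] === '' ? size - 1 : Math.min(Number(m[2]), size - 1);
  }
  if (start > end || start >= size) return null;
  return { start, end };
}
```

A server using this would normally answer a satisfiable range with status 206 and a `Content-Range: bytes start-end/size` header, which is what lets audio and video players seek.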
archipelago
a0f70b3949 feat(demo): black-theme app UIs w/ icons, real ElectrumX UI, Core/Knots split
- Mock app UIs (ElectrumX, LND, Fedimint, Bitcoin Core) + the "Not available"
  notice now use the Archipelago black theme and show the app's My-Apps icon.
- Bitcoin Core gets its own UI (/app/bitcoin-core/) so it no longer shows Bitcoin
  Knots branding; the Knots-branded bitcoin-ui shell is reserved for Bitcoin Knots.
- ElectrumX now serves the real electrs-ui shell (+ qrcode.js + a dummy
  /electrs-status) with the correct ElectrumX icon; "Electrs" renamed to ElectrumX.
- My Apps: pre-install Bitcoin Knots again, drop ThunderHub, rename Electrs→ElectrumX.
- App store no longer shows "Checking…" forever in demo — non-demoable apps show
  "No demo" immediately (skip the container-scan state).
- Relay endpoint no longer reveals a real domain (randomised host).
- Dockerfile/.dockerignore copy docker/electrs-ui into the backend image.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
2026-06-22 13:55:50 -04:00
archipelago
4cc808c73e fix(demo): /app proxy (fixes 404s), mempool iframe, LND UI, icons
- nginx-demo.conf + vite proxy now route every /app/<id>/ to the mock backend, so
  the per-app mock UIs and the generic "Not available in the demo" notice render
  (previously only /app/filebrowser was proxied → most apps 404'd).
- Mempool and IndeeHub now load in the in-app iframe (not a new tab).
- Add an LND Lightning mock UI (channels, balances, routing) with dummy data;
  lnd/thunderhub are demoable. Notice page reworded to "Not available in the demo".
- Fix missing icons: Bitcoin Core → bitcoin-core.png, Mempool → mempool.webp.
- Pre-install only Bitcoin Core (drop duplicate Bitcoin Knots; still installable).

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
2026-06-22 12:39:33 -04:00
archipelago
c9341baa35 fix(demo): un-ignore docker/bitcoin-ui in build context
The backend COPY of docker/bitcoin-ui failed in Portainer because .dockerignore
(* + whitelist) excluded it. Re-include docker/ then exclude its contents except
bitcoin-ui, so the build context contains the Bitcoin UI mock shell. demo/files is
already covered by !demo/.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
2026-06-22 11:16:31 -04:00
archipelago
79c3769542 feat(demo): curated cloud files drop-in + fix backend asset copies
- demo/files/<Folder>/<file> becomes the cloud's content for every visitor
  (read-only; "private login" = git/repo access). Text inlined, binaries streamed
  from disk; empty folder falls back to the built-in seeded set.
- Dockerfile.backend now copies docker/bitcoin-ui and demo/files into the image
  (they live outside neode-ui/) — this also fixes the Bitcoin UI mock, which the
  backend reads from /docker/bitcoin-ui and was previously absent in the container.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
2026-06-22 11:11:40 -04:00
archipelago
df2ae3d7d8 feat(demo): ground AIUI chat in the node's mock state
The Claude proxy injects a system-prompt describing this node (version, signet
chain + height, wallet balances, installed apps, 5 FIPS peers / 12 trusted nodes)
into every demo chat request. The assistant answers local-node and Bitcoin
questions with the node's real-looking data automatically — no /seed needed.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
2026-06-22 10:58:58 -04:00
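The system-prompt injection described in df2ae3d7d8 might look something like this. It is a hedged sketch: `buildSystemPrompt` and every field name on `node` are hypothetical, and the real proxy's wording is not shown in this diff.

```javascript
// Hypothetical shape of the node's mock state; field names are illustrative only.
function buildSystemPrompt(node) {
  return [
    `You are the assistant for an Archipelago node (version ${node.version}).`,
    `Chain: ${node.chain}, block height ${node.height}.`,
    `Wallet: ${node.onchainSats} sats on-chain, ${node.lightningSats} sats Lightning.`,
    `Installed apps: ${node.apps.join(', ')}.`,
    `Mesh: ${node.fipsPeers} FIPS peers, ${node.trustedNodes} trusted nodes.`,
  ].join('\n');
}
```

The proxy would then send this string as the system prompt of each demo chat request, so answers reflect the node's mock data without a separate seeding step.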
archipelago
3f411c1d10 feat(demo): mock FIPS as active (status, seed anchors, reconnect, install)
fips.status reports installed+active with 5 authenticated peers and an anchor
connection; list/add/remove/apply seed-anchors and reconnect/install all resolve
to working states so the FIPS Mesh + Seed Anchors cards light green in the demo.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
2026-06-22 10:55:13 -04:00
archipelago
4d0c2d6717 feat(demo): real testnet tx links + interactive buy-files flow
- Tx/explorer links open mempool.space/testnet/tx/<id>; the backend hydrates the
  wallet's transactions with REAL recent testnet txids at startup (best-effort,
  falls back to mock hashes offline). Mempool app + demo-external apps open in a
  new tab; deep-link paths are carried through.
- Add the content.* paid-download handlers the buy flow needs (owned-list,
  preview-peer, download-peer-{paid,invoice,onchain}, request-invoice,
  invoice-status, request-onchain, onchain-status) — every path resolves to a
  success state with testnet receive addresses / bolt11 invoices so visitors can
  walk the full buy → unlock journey.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
2026-06-22 10:53:05 -04:00
archipelago
2cffa79d9d feat(demo): app launch UIs, "No demo" gating, onboarding skip, 12 nodes
App launching (DEMO):
- resolveAppUrl routes every app to its demo target: mock UIs for Bitcoin Core,
  ElectrumX, Fedimint (served by the backend), IndeeHub → iframe indee.tx1138.com,
  Mempool → mempool.space/testnet (new tab); all others → a generic "Demo preview"
  notice page.
- Non-demoable apps show a disabled "No demo" install button (marketplace details,
  app grid, featured apps).

Onboarding:
- Demo treats the visitor as fully set up so the onboarding WIZARD (seed/identity)
  is never forced; the welcome intro still replays per day. Intro CTA goes straight
  to login; wizard entry points + login restart-onboarding link hidden in demo.

Network:
- federation.list-nodes now returns 12 trusted/federated nodes (9 trusted, 3
  observer); transport.peers already at 5.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
2026-06-22 10:26:35 -04:00
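The routing rules listed for app launching in 2cffa79d9d can be summarised as a small lookup. This is an illustrative sketch rather than the real `resolveAppUrl`: `resolveDemoTarget`, the app ids, and the notice path are assumptions; only the two external hosts come from the commit message.

```javascript
// Hypothetical app ids for the apps that get a backend-served mock UI.
const MOCK_UI_APPS = new Set(['bitcoin-core', 'electrumx', 'fedimint']);

function resolveDemoTarget(appId) {
  if (MOCK_UI_APPS.has(appId)) return { url: `/app/${appId}/`, newTab: false };
  if (appId === 'indeehub') return { url: 'https://indee.tx1138.com/', newTab: false };
  if (appId === 'mempool') return { url: 'https://mempool.space/testnet', newTab: true };
  // Everything else lands on a generic notice page (hypothetical path).
  return { url: '/app/demo-notice/', newTab: false };
}
```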
archipelago
2715f2d847 feat(demo): public multi-visitor demo sandbox for Portainer
Turn the mock backend + UI into a public, click-to-play demo deployable as a
Portainer stack, gated behind DEMO=1 (classic single-user mock unchanged when off).

Backend (neode-ui/mock-backend.js):
- Per-session state isolation via AsyncLocalStorage + Proxy: every visitor gets
  an isolated, deep-cloned copy of mockData/walletState/userState/etc., keyed by
  a demo_sid cookie. Per-session WebSocket fan-out, idle reaper, session cap.
- Real per-session file storage (upload/folder/rename/delete) with a 50MB quota,
  replacing the no-op filebrowser handlers; adds the missing app.filebrowser-token RPC.
- Force simulation mode (never touch a host Docker/Podman socket).
- Testnet (signet) flavor; shared login password "entertoexit".
- Report the real app version suffixed with -demo.

Frontend:
- VITE_DEMO build flag (useDemoIntro.ts): replay the intro once per calendar day
  per browser; prefill + show the "entertoexit" login hint.

Deploy:
- docker-compose.demo.yml wired for DEMO, UI on :2100 (build-from-repo).
- demo-deploy/ thin stack (prebuilt :demo image refs + .env.example + README).
- .github/workflows/demo-images.yml builds/pushes archy-demo-{web,backend} images.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
2026-06-22 09:28:05 -04:00
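The "AsyncLocalStorage + Proxy" isolation described in 2715f2d847 can be sketched as below. This is a minimal illustration under stated assumptions, not the code in `neode-ui/mock-backend.js`: `sessionStore`, `sessions`, `baseState` and `runInSession` are hypothetical names, and the idle reaper and session cap are omitted.

```javascript
const { AsyncLocalStorage } = require('node:async_hooks');

// Hypothetical names throughout; only the technique comes from the commit message.
const sessionStore = new AsyncLocalStorage();
const sessions = new Map(); // demo_sid -> deep-cloned state
const baseState = { wallet: { balanceSats: 1000 }, apps: ['bitcoin-core'] };

// Run `fn` with the state belonging to session `sid`, cloning it on first use.
function runInSession(sid, fn) {
  if (!sessions.has(sid)) sessions.set(sid, structuredClone(baseState));
  return sessionStore.run(sessions.get(sid), fn);
}

// Handlers keep using `mockData` as before; the Proxy forwards each access to the
// current session's copy, or to the shared base state outside any session.
const mockData = new Proxy({}, {
  get: (_target, key) => (sessionStore.getStore() ?? baseState)[key],
  set: (_target, key, value) => {
    (sessionStore.getStore() ?? baseState)[key] = value;
    return true;
  },
});
```

In a server, each request would be wrapped in `runInSession(sid, handler)` with `sid` read from the `demo_sid` cookie, so one visitor's changes never show up for another.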
116 changed files with 2717 additions and 8949 deletions


@ -7,6 +7,14 @@
# Allow demo assets (AIUI pre-built dist)
!demo/
# Allow the Bitcoin UI + ElectrumX UI mock shells (served from /docker/*)
!docker/
docker/*
!docker/bitcoin-ui/
!docker/electrs-ui/
!docker/lnd-ui/
!docker/fedimint-ui/
# Allow backend source for ISO source builds
!core/
!scripts/


@ -2,7 +2,7 @@
# Keep the served companion APK in sync with main on every push.
#
# When a push to main includes Android changes, rebuild the APK, refresh
-# neode-ui/public/packages/archipelago-companion.apk, commit it, and ask
+# neode-ui/public/packages/archipelago-companion.apk.zip, commit it, and ask
# you to push again (so the refreshed APK rides along in the same push).
#
# Enable once per clone: git config core.hooksPath .githooks
@ -40,7 +40,7 @@ fi
bash scripts/publish-companion-apk.sh || exit 0
-DEST="neode-ui/public/packages/archipelago-companion.apk"
+DEST="neode-ui/public/packages/archipelago-companion.apk.zip"
if git diff --cached --quiet -- "$DEST"; then
exit 0 # APK unchanged — nothing to do
fi

.github/workflows/demo-images.yml (new file, 67 lines)

@ -0,0 +1,67 @@
name: Demo images

# Builds and pushes the public-demo images on every change to the UI / mock
# backend, so the separated `archy-demo` Portainer stack auto-tracks the real
# code (see demo-deploy/ and docs/demo-deployment-design.md).
#
# Required repo configuration:
#   vars.DEMO_REGISTRY           e.g. 146.59.87.168:3000/lfg2025
#   secrets.DEMO_REGISTRY_USER
#   secrets.DEMO_REGISTRY_TOKEN
# Optional:
#   secrets.PORTAINER_WEBHOOK    redeploy hook called after a successful push

on:
  push:
    branches: [main]
    paths:
      - 'neode-ui/**'
      - 'docker-compose.demo.yml'
      - '.github/workflows/demo-images.yml'
  workflow_dispatch:

jobs:
  build:
    name: Build & push demo images
    runs-on: ubuntu-latest
    # Skip cleanly on forks / before registry config is set.
    if: ${{ vars.DEMO_REGISTRY != '' }}
    steps:
      - name: Checkout
        uses: actions/checkout@v4

      - name: Set up Docker Buildx
        uses: docker/setup-buildx-action@v3

      - name: Log in to registry
        uses: docker/login-action@v3
        with:
          registry: ${{ vars.DEMO_REGISTRY_HOST || vars.DEMO_REGISTRY }}
          username: ${{ secrets.DEMO_REGISTRY_USER }}
          password: ${{ secrets.DEMO_REGISTRY_TOKEN }}

      - name: Build & push backend
        uses: docker/build-push-action@v6
        with:
          context: .
          file: neode-ui/Dockerfile.backend
          push: true
          tags: |
            ${{ vars.DEMO_REGISTRY }}/archy-demo-backend:demo
            ${{ vars.DEMO_REGISTRY }}/archy-demo-backend:${{ github.sha }}

      - name: Build & push web
        uses: docker/build-push-action@v6
        with:
          context: .
          file: neode-ui/Dockerfile.web
          push: true
          build-args: |
            VITE_DEMO=1
          tags: |
            ${{ vars.DEMO_REGISTRY }}/archy-demo-web:demo
            ${{ vars.DEMO_REGISTRY }}/archy-demo-web:${{ github.sha }}

      - name: Trigger Portainer redeploy
        if: ${{ success() && secrets.PORTAINER_WEBHOOK != '' }}
        run: curl -fsS -X POST "${{ secrets.PORTAINER_WEBHOOK }}"

Android/.gitignore (5 lines removed)

@ -14,8 +14,3 @@ local.properties
*.aab
*.jks
*.keystore
# Exception: the repo-dedicated *debug* keystore is committed on purpose so every
# machine (and the published companion download) signs debug builds identically —
# updates then install over the top without an uninstall. Debug keys are not
# secret (well-known password "android"); never commit a real release keystore.
!/app/debug.keystore


@ -1,94 +0,0 @@
# Companion App — Build, Ship & "App Not Installed" Runbook
Canonical procedure for releasing the Archipelago Companion Android app and for
debugging install failures. Read this before touching the companion release flow.
Hard lessons from 2026-06-26 are baked in below — don't relearn them.
## Ship the companion (the only sanctioned way)
```bash
./Android/ship-companion.sh
```
This calls `scripts/publish-companion-apk.sh` (the single source of truth, also
used by the `.githooks/pre-push` hook), which:
1. **Removes/rejects resource dirs whose names contain spaces.** Empty stray
`mipmap-* NNN` dirs (left by icon-export tools) break a *clean* build with
`Invalid resource directory name`. Incremental builds hide them — clean builds
don't.
2. **Always does a CLEAN build** (`:app:clean :app:assembleDebug`).
3. **Forces v1 + v2 + v3 signing** via `zipalign` + `apksigner`.
4. **Verifies all three schemes** (`apksigner verify --min-sdk-version 21`) and
**aborts** if any is missing.
5. Stages the signed APK at `neode-ui/public/packages/archipelago-companion.apk`,
commits, and pushes with `SHIP_COMPANION=1` (the sanctioned pre-push bypass).
**Never** hand-roll `gradlew assembleDebug` + `cp` to the served path. That path
skips the clean build and the signature enforcement and is exactly how a broken
APK shipped.
### Bump the version first
Edit `Android/app/build.gradle.kts` → `versionCode` (must strictly increase) and
`versionName`. The committed value can drift AHEAD of what's actually built into
the served APK, so verify the served APK's real version after shipping:
`aapt2 dump badging neode-ui/public/packages/archipelago-companion.apk | grep version`.
## Signing facts (important)
- Debug builds are signed with the **committed** `Android/app/debug.keystore`
(store/key pass `android`, alias `androiddebugkey`) so every machine and the
served download share ONE signing key. Cert SHA-256: `D6:22:E0:7E:…:66:4D`.
- **AGP silently ignores `enableV1Signing = true` for `minSdk ≥ 24`**, so a plain
gradle build produces a **v2-only** APK. The `apksigner` step in the publish
script is what actually guarantees v1+v2+v3 — do not remove it.
- **Changing the signing key forces every existing install to be uninstalled
once.** Android blocks in-place upgrades across different signatures. Treat the
keystore as permanent; never regenerate it casually.
## Debugging "App Not Installed" — DIAGNOSE FIRST
Do **not** theorize about signing schemes / OEM quirks. Get the real reason:
```bash
adb install ~/Desktop/archipelago-companion-<ver>.apk
# -> Failure [INSTALL_FAILED_<REASON>: ...]
```
Map the reason:
| `INSTALL_FAILED_*` | Cause | Fix |
|---|---|---|
| `UPDATE_INCOMPATIBLE … signatures do not match` | Old install signed with a **different key** (e.g. pre-shared-keystore per-machine key `58:31:12…`). | Uninstall the old package, then install. **One-time** per device after a key change. |
| `INVALID_APK` / parse error | Corrupt/incomplete download or bad signing. | Re-download; re-run the publish script. |
| `INSUFFICIENT_STORAGE` | Storage. | Free space. |
| `OLDER_SDK` | Device below `minSdk` (26 = Android 8.0). | Unsupported device. |
> A manual uninstall on the phone may NOT clear `UPDATE_INCOMPATIBLE` if the
> package is registered under another user/profile — `pm path <pkg>` under user 0
> can show nothing while the conflict persists. `adb uninstall <pkg>` clears it
> across all users.
## Phone / adb safety (non-negotiable)
When acting on the user's physical phone, be surgical — the user once had all
home-screen app layouts wiped by an over-broad action.
- Default to **read-only** adb (`devices`, `getprop`, `pm path/list`, `dumpsys`).
- Mutations (`adb install`, `adb uninstall com.archipelago.app.debug`) only with
explicit go-ahead and **scoped to our exact package** — echo it first.
- **Never** run launcher/system resets: no `pm clear` on launchers, no
`reset-permissions`, no factory wipe, no uninstalling apps you didn't build.
## Verify the published download after shipping
The download served to nodes is Gitea raw-on-main. Confirm the live bytes match
what you built and signed:
```bash
SERVED=neode-ui/public/packages/archipelago-companion.apk
URL=http://146.59.87.168:3000/lfg2025/archy/raw/branch/main/$SERVED
curl -sS -o /tmp/live.apk "$URL"
shasum -a 256 "$SERVED" /tmp/live.apk # must match
apksigner verify -v --min-sdk-version 21 /tmp/live.apk | grep -i "scheme" # v1/v2/v3 = true
```


@ -11,40 +11,20 @@ android {
applicationId = "com.archipelago.app"
minSdk = 26
targetSdk = 35
-versionCode = 16
-versionName = "0.4.12"
+versionCode = 10
+versionName = "0.4.6"
vectorDrawables {
useSupportLibrary = true
}
}
signingConfigs {
// Repo-dedicated debug keystore (committed at app/debug.keystore) so every
// machine — and the published companion download — signs debug builds with
// the SAME key. Without this, Gradle falls back to each machine's
// ~/.android/debug.keystore, so a build from a different machine has a
// different signature and the phone rejects the update ("App not installed").
getByName("debug") {
storeFile = file("debug.keystore")
storePassword = "android"
keyAlias = "androiddebugkey"
keyPassword = "android"
// Force both legacy JAR (v1) and APK Signature Scheme v2. AGP drops v1
// for minSdk>=24, but some OEM package installers (e.g. Samsung) reject
// a v2-only sideload with "App not installed" — keep v1 for max compat.
enableV1Signing = true
enableV2Signing = true
}
}
buildTypes {
debug {
// Separate app ID so a debug/test build installs alongside the
// release app instead of colliding on signature.
applicationIdSuffix = ".debug"
versionNameSuffix = "-debug"
signingConfig = signingConfigs.getByName("debug")
}
release {
isMinifyEnabled = true

Binary file not shown.


@ -112,37 +112,6 @@ class ServerPreferences(private val context: Context) {
}
}
/**
* Replace a saved server in place. Matches the existing entry by connection
* identity (address/port/scheme) so edits that change the name or password
* or that touch a legacy 4-field entry still update the right record. If the
* edited server is also the active one, the active record is kept in sync.
*/
suspend fun updateSavedServer(original: ServerEntry, updated: ServerEntry) {
context.dataStore.edit { prefs ->
val current = prefs[savedServersKey] ?: emptySet()
val filtered = current.filterNot { raw ->
val e = ServerEntry.deserialize(raw)
e != null &&
e.address == original.address &&
e.port == original.port &&
e.useHttps == original.useHttps
}.toSet()
prefs[savedServersKey] = filtered + updated.serialize()
val isActive = prefs[activeAddressKey] == original.address &&
(prefs[activePortKey] ?: "") == original.port &&
(prefs[activeHttpsKey] ?: false) == original.useHttps
if (isActive) {
prefs[activeAddressKey] = updated.address
prefs[activeHttpsKey] = updated.useHttps
prefs[activePortKey] = updated.port
prefs[activePasswordKey] = updated.password
prefs[activeNameKey] = updated.name
}
}
}
suspend fun removeSavedServer(server: ServerEntry) {
context.dataStore.edit { prefs ->
val current = prefs[savedServersKey] ?: emptySet()


@ -75,7 +75,6 @@ fun NESMenu(
onDismiss: () -> Unit,
onSelectServer: (ServerEntry) -> Unit,
onAddServer: (ServerEntry) -> Unit,
onEditServer: (ServerEntry, ServerEntry) -> Unit,
onRemoveServer: (ServerEntry) -> Unit,
onToggleMode: () -> Unit,
onToggleStyle: () -> Unit,
@ -88,7 +87,7 @@ fun NESMenu(
contentAlignment = Alignment.Center,
) {
AnimatedVisibility(visible = visible, enter = fadeIn() + scaleIn(initialScale = 0.95f), exit = fadeOut() + scaleOut(targetScale = 0.95f)) {
-MenuPanel(servers, activeServer, isGamepadMode, controllerStyle, onDismiss, onSelectServer, onAddServer, onEditServer, onRemoveServer, onToggleMode, onToggleStyle, onBackToWebView)
+MenuPanel(servers, activeServer, isGamepadMode, controllerStyle, onDismiss, onSelectServer, onAddServer, onRemoveServer, onToggleMode, onToggleStyle, onBackToWebView)
}
}
}
@ -103,39 +102,21 @@ private fun MenuPanel(
onDismiss: () -> Unit,
onSelectServer: (ServerEntry) -> Unit,
onAddServer: (ServerEntry) -> Unit,
onEditServer: (ServerEntry, ServerEntry) -> Unit,
onRemoveServer: (ServerEntry) -> Unit,
onToggleMode: () -> Unit,
onToggleStyle: () -> Unit,
onBackToWebView: (() -> Unit)?,
) {
var showAdd by remember { mutableStateOf(false) }
// The saved server being edited, or null when adding a new one.
var editing by remember { mutableStateOf<ServerEntry?>(null) }
var nm by remember { mutableStateOf("") }
var addr by remember { mutableStateOf("") }
var pwd by remember { mutableStateOf("") }
fun resetForm() {
nm = ""; addr = ""; pwd = ""; showAdd = false; editing = null
}
fun startEdit(server: ServerEntry) {
editing = server
nm = server.name; addr = server.address; pwd = server.password
showAdd = false
}
fun submit() {
if (addr.isBlank()) return
val orig = editing
if (orig != null) {
// Preserve fields the compact form doesn't expose (scheme, port).
onEditServer(orig, orig.copy(address = addr, password = pwd, name = nm))
} else {
if (addr.isNotBlank()) {
onAddServer(ServerEntry(addr, false, password = pwd, name = nm))
nm = ""; addr = ""; pwd = ""; showAdd = false
}
resetForm()
}
Column(
@ -168,7 +149,6 @@ private fun MenuPanel(
label = server.displayName(),
selected = active,
onClick = { onSelectServer(server) },
onEdit = { startEdit(server) },
onRemove = { onRemoveServer(server) },
)
}
@ -177,8 +157,8 @@ private fun MenuPanel(
Text("No servers", color = TextMuted, fontSize = 14.sp, modifier = Modifier.padding(vertical = 4.dp))
}
-// Add / edit server
-if (showAdd || editing != null) {
+// Add server
+if (showAdd) {
Column(
Modifier
.fillMaxWidth()
@ -188,25 +168,6 @@ private fun MenuPanel(
.padding(12.dp),
verticalArrangement = Arrangement.spacedBy(8.dp),
) {
Row(
Modifier.fillMaxWidth(),
verticalAlignment = Alignment.CenterVertically,
horizontalArrangement = Arrangement.SpaceBetween,
) {
Text(
if (editing != null) "Edit Server" else "Add Server",
color = TextMuted,
fontSize = 13.sp,
letterSpacing = 1.sp,
fontWeight = FontWeight.Medium,
)
Text(
"Cancel",
color = TextMuted,
fontSize = 13.sp,
modifier = Modifier.clickable { resetForm() }.padding(start = 8.dp),
)
}
GlassField(
value = nm, onValueChange = { nm = it },
placeholder = "Name (optional)",
@ -267,7 +228,6 @@ private fun MenuItem(
selected: Boolean = false,
labelColor: Color = TextPrimary,
onClick: () -> Unit,
onEdit: (() -> Unit)? = null,
onRemove: (() -> Unit)? = null,
) {
Row(
@ -287,16 +247,7 @@ private fun MenuItem(
color = if (selected) BitcoinOrange else labelColor,
fontSize = 16.sp,
fontWeight = FontWeight.Medium,
modifier = Modifier.weight(1f),
)
if (onEdit != null) {
Text(
"",
color = TextMuted,
fontSize = 16.sp,
modifier = Modifier.clickable { onEdit() }.padding(horizontal = 8.dp),
)
}
if (onRemove != null) {
Text(
"",


@ -216,17 +216,6 @@ fun RemoteInputScreen(onBack: () -> Unit) {
onAddServer = { server ->
scope.launch { prefs.addSavedServer(server); if (activeServer == null) prefs.setActiveServer(server) }
},
onEditServer = { original, updated ->
scope.launch {
prefs.updateSavedServer(original, updated)
// If the edited server is the live one, reconnect with the new
// address/credentials so the change takes effect immediately.
if (original.serialize() == activeServer?.serialize()) {
ws.disconnect()
prefs.setActiveServer(updated)
}
}
},
onRemoveServer = { server ->
scope.launch {
prefs.removeSavedServer(server)


@ -30,7 +30,6 @@ import androidx.compose.material.icons.filled.VisibilityOff
import androidx.compose.foundation.verticalScroll
import androidx.compose.material.icons.Icons
import androidx.compose.material.icons.filled.Close
import androidx.compose.material.icons.filled.Edit
import androidx.compose.material.icons.filled.Lock
import androidx.compose.material.icons.filled.LockOpen
import androidx.compose.material3.CircularProgressIndicator
@ -107,50 +106,9 @@ fun ServerConnectScreen(
var useHttps by remember { mutableStateOf(false) }
var isConnecting by remember { mutableStateOf(false) }
var errorMessage by remember { mutableStateOf<String?>(null) }
// The saved server currently being edited, or null when adding/connecting.
var editingServer by remember { mutableStateOf<ServerEntry?>(null) }
val savedServers by prefs.savedServers.collectAsState(initial = emptyList())
fun clearForm() {
name = ""
address = ""
port = ""
password = ""
useHttps = false
passwordVisible = false
errorMessage = null
}
fun startEdit(server: ServerEntry) {
editingServer = server
name = server.name
address = server.address
port = server.port
password = server.password
useHttps = server.useHttps
passwordVisible = false
errorMessage = null
}
fun cancelEdit() {
editingServer = null
clearForm()
}
fun saveEdit() {
val original = editingServer ?: return
if (address.isBlank()) {
errorMessage = "Enter a server address"
return
}
val updated = ServerEntry(address, useHttps, port, password, name)
scope.launch {
prefs.updateSavedServer(original, updated)
cancelEdit()
}
}
fun connect(server: ServerEntry) {
if (isConnecting) return
if (server.address.isBlank()) {
@ -220,7 +178,7 @@ fun ServerConnectScreen(
Spacer(modifier = Modifier.height(4.dp))
Text(
-text = if (editingServer != null) stringResource(R.string.edit_server_title) else "Connect to Server",
+text = "Connect to Server",
style = MaterialTheme.typography.headlineMedium,
color = TextPrimary,
textAlign = TextAlign.Center,
@ -366,11 +324,7 @@ fun ServerConnectScreen(
keyboardActions = KeyboardActions(
onGo = {
keyboard?.hide()
if (editingServer != null) {
saveEdit()
} else {
connect(ServerEntry(address, useHttps, port, password, name))
}
connect(ServerEntry(address, useHttps, port, password, name))
},
),
colors = OutlinedTextFieldDefaults.colors(
@ -435,40 +389,15 @@ fun ServerConnectScreen(
}
}
if (editingServer != null) {
// Save / Cancel while editing an existing saved server
Row(
modifier = Modifier.fillMaxWidth(),
horizontalArrangement = Arrangement.spacedBy(12.dp),
) {
GlassButton(
text = stringResource(R.string.cancel),
onClick = {
keyboard?.hide()
cancelEdit()
},
modifier = Modifier.weight(1f).height(56.dp),
)
GlassButton(
text = stringResource(R.string.save_changes),
onClick = {
keyboard?.hide()
saveEdit()
},
modifier = Modifier.weight(1f).height(56.dp),
)
}
} else {
// Connect button — glass style
GlassButton(
text = if (isConnecting) stringResource(R.string.connecting) else stringResource(R.string.connect),
onClick = {
keyboard?.hide()
connect(ServerEntry(address, useHttps, port, password, name))
},
modifier = Modifier.fillMaxWidth().height(56.dp),
)
}
// Connect button — glass style
GlassButton(
text = if (isConnecting) stringResource(R.string.connecting) else stringResource(R.string.connect),
onClick = {
keyboard?.hide()
connect(ServerEntry(address, useHttps, port, password, name))
},
modifier = Modifier.fillMaxWidth().height(56.dp),
)
if (isConnecting) {
CircularProgressIndicator(
@ -478,8 +407,8 @@ fun ServerConnectScreen(
)
}
-// Saved servers (hidden while editing one to keep focus on the form)
-if (editingServer == null && savedServers.isNotEmpty()) {
+// Saved servers
+if (savedServers.isNotEmpty()) {
Spacer(modifier = Modifier.height(8.dp))
Text(
text = stringResource(R.string.saved_servers),
@ -493,7 +422,6 @@ fun ServerConnectScreen(
SavedServerItem(
server = server,
onConnect = { connect(it) },
onEdit = { startEdit(it) },
onRemove = { scope.launch { prefs.removeSavedServer(it) } },
)
}
@ -506,7 +434,6 @@ fun ServerConnectScreen(
private fun SavedServerItem(
server: ServerEntry,
onConnect: (ServerEntry) -> Unit,
onEdit: (ServerEntry) -> Unit,
onRemove: (ServerEntry) -> Unit,
) {
Row(
@ -549,9 +476,6 @@ private fun SavedServerItem(
}
}
}
IconButton(onClick = { onEdit(server) }) {
Icon(imageVector = Icons.Default.Edit, contentDescription = stringResource(R.string.edit_server), modifier = Modifier.size(18.dp), tint = TextMuted)
}
IconButton(onClick = { onRemove(server) }) {
Icon(imageVector = Icons.Default.Close, contentDescription = stringResource(R.string.remove_server), modifier = Modifier.size(18.dp), tint = TextMuted)
}


@ -2,7 +2,6 @@ package com.archipelago.app.ui.screens
import android.annotation.SuppressLint
import android.graphics.Bitmap
import android.graphics.BitmapFactory
import android.view.ViewGroup
import android.webkit.CookieManager
import android.webkit.WebChromeClient
@ -15,7 +14,6 @@ import androidx.activity.compose.BackHandler
import androidx.compose.animation.AnimatedVisibility
import androidx.compose.animation.fadeIn
import androidx.compose.animation.fadeOut
import androidx.compose.foundation.Image
import androidx.compose.foundation.background
import androidx.compose.foundation.layout.Arrangement
import androidx.compose.foundation.layout.Box
@ -29,24 +27,17 @@ import androidx.compose.foundation.layout.height
import androidx.compose.foundation.layout.padding
import androidx.compose.foundation.layout.safeDrawing
import androidx.compose.foundation.layout.size
import androidx.compose.foundation.layout.width
import androidx.compose.foundation.layout.windowInsetsPadding
import androidx.compose.foundation.shape.RoundedCornerShape
import androidx.compose.material.icons.Icons
import androidx.compose.material.icons.automirrored.filled.ArrowBack
import androidx.compose.material.icons.automirrored.filled.ArrowForward
import androidx.compose.material.icons.filled.Close
import androidx.compose.material.icons.filled.CloudOff
import androidx.compose.material.icons.filled.OpenInBrowser
import androidx.compose.material.icons.filled.Refresh
import androidx.compose.material3.CircularProgressIndicator
import androidx.compose.material3.Icon
import androidx.compose.material3.IconButton
import androidx.compose.material3.LinearProgressIndicator
import androidx.compose.material3.MaterialTheme
import androidx.compose.material3.Text
import androidx.compose.runtime.Composable
import androidx.compose.runtime.LaunchedEffect
import androidx.compose.runtime.getValue
import androidx.compose.runtime.mutableIntStateOf
import androidx.compose.runtime.mutableStateOf
@ -54,8 +45,6 @@ import androidx.compose.runtime.remember
import androidx.compose.runtime.setValue
import androidx.compose.ui.Alignment
import androidx.compose.ui.Modifier
import androidx.compose.ui.draw.clip
import androidx.compose.ui.graphics.asImageBitmap
import androidx.compose.ui.platform.LocalContext
import androidx.compose.ui.res.stringResource
import androidx.compose.ui.text.style.TextAlign
@ -67,8 +56,6 @@ import com.archipelago.app.ui.theme.BitcoinOrange
import com.archipelago.app.ui.theme.SurfaceBlack
import com.archipelago.app.ui.theme.TextMuted
import com.archipelago.app.ui.theme.TextPrimary
import kotlinx.coroutines.Dispatchers
import kotlinx.coroutines.withContext
/** Open a URL in the phone's default browser (genuinely external links). */
private fun openExternalUrl(context: android.content.Context, url: String) {
@ -323,26 +310,6 @@ fun WebViewScreen(
}
}
// Node apps (e.g. NetBird) terminate TLS with a
// self-signed cert — the dashboard needs a secure
// context for OIDC/window.crypto.subtle (#15). The
// WebView default is to CANCEL untrusted certs, so
// those apps render blank. The user explicitly trusts
// their own node, so proceed for same-host certs only;
// reject anything else (don't blanket-trust the web).
override fun onReceivedSslError(
view: WebView?,
handler: android.webkit.SslErrorHandler?,
error: android.net.http.SslError?,
) {
val u = error?.url
if (u != null && isSameHost(u, serverUrl)) {
handler?.proceed()
} else {
handler?.cancel()
}
}
override fun shouldOverrideUrlLoading(
view: WebView?,
request: WebResourceRequest?,
@ -461,34 +428,11 @@ fun WebViewScreen(
}
}
/** Best-effort fetch of the origin's /favicon.ico, so the launched app's icon
* can be shown on the loading screen before the WebView reports onReceivedIcon
* (which only fires once the page's <head> has parsed). Blocking call on IO. */
private fun fetchFavicon(pageUrl: String): Bitmap? {
return try {
val u = android.net.Uri.parse(pageUrl)
val scheme = u.scheme ?: return null
val host = u.host ?: return null
val portPart = if (u.port > 0) ":${u.port}" else ""
val conn = (java.net.URL("$scheme://$host$portPart/favicon.ico").openConnection()
as java.net.HttpURLConnection).apply {
connectTimeout = 4000
readTimeout = 4000
instanceFollowRedirects = true
}
conn.inputStream.use { BitmapFactory.decodeStream(it) }
} catch (_: Exception) {
null
}
}
/**
* Lightweight in-app browser used when the kiosk hands off an app that can't be
* shown in an iframe. Loads the app in a local WebView with a centered loading
* screen (app favicon + progress bar) and a BOTTOM control bar mirroring the
* web mobile-iframe footer (back / forward / reload / open-in-browser / close).
* Same-host navigation stays here; any genuinely external link escapes to the
* phone's browser.
* shown in an iframe. Loads the app in a local WebView with a minimal top bar
* (close + title + escalate-to-real-browser). Same-host navigation stays here;
* any genuinely external link escapes to the phone's browser.
*/
@SuppressLint("SetJavaScriptEnabled")
@Composable
@ -500,20 +444,8 @@ private fun InAppBrowser(
val context = LocalContext.current
var browser by remember { mutableStateOf<WebView?>(null) }
var title by remember { mutableStateOf(android.net.Uri.parse(url).host ?: url) }
var favicon by remember { mutableStateOf<Bitmap?>(null) }
var progress by remember { mutableIntStateOf(0) }
var loading by remember { mutableStateOf(true) }
var canGoBack by remember { mutableStateOf(false) }
var canGoForward by remember { mutableStateOf(false) }
// Seed the loading-screen icon immediately from a best-effort favicon
// pre-fetch (main's app-icon work), then onReceivedIcon upgrades it — so the
// loader shows an icon right away instead of staying blank until the page
// parses its <head> (which is what made the loader look stuck).
LaunchedEffect(url) {
val fetched = withContext(Dispatchers.IO) { fetchFavicon(url) }
if (fetched != null && favicon == null) favicon = fetched
}
// Back: walk the in-app history first, then close the overlay.
BackHandler {
@ -527,169 +459,13 @@ private fun InAppBrowser(
.background(SurfaceBlack)
.windowInsetsPadding(WindowInsets.safeDrawing),
) {
// WebView + loading overlay fill the area above the bottom control bar.
Box(modifier = Modifier.weight(1f).fillMaxWidth()) {
AndroidView(
modifier = Modifier.fillMaxSize(),
factory = { ctx ->
WebView(ctx).apply {
layoutParams = ViewGroup.LayoutParams(
ViewGroup.LayoutParams.MATCH_PARENT,
ViewGroup.LayoutParams.MATCH_PARENT,
)
isVerticalScrollBarEnabled = false
isHorizontalScrollBarEnabled = false
CookieManager.getInstance().setAcceptThirdPartyCookies(this, true)
applyArchipelagoSettings()
webChromeClient = object : WebChromeClient() {
override fun onProgressChanged(view: WebView?, newProgress: Int) {
progress = newProgress
}
override fun onReceivedTitle(view: WebView?, t: String?) {
if (!t.isNullOrBlank()) title = t
}
override fun onReceivedIcon(view: WebView?, icon: Bitmap?) {
if (icon != null) favicon = icon
}
}
webViewClient = object : WebViewClient() {
override fun onPageStarted(view: WebView?, u: String?, favicon: Bitmap?) {
loading = true
}
override fun onPageFinished(view: WebView?, u: String?) {
loading = false
canGoBack = view?.canGoBack() == true
canGoForward = view?.canGoForward() == true
}
override fun doUpdateVisitedHistory(view: WebView?, u: String?, isReload: Boolean) {
canGoBack = view?.canGoBack() == true
canGoForward = view?.canGoForward() == true
}
// Self-signed TLS on the node's apps (e.g. NetBird on
// :8087) would otherwise be cancelled by the WebView
// and render blank. Proceed for the user's own node
// (same host); reject any other untrusted cert.
override fun onReceivedSslError(
view: WebView?,
handler: android.webkit.SslErrorHandler?,
error: android.net.http.SslError?,
) {
val u = error?.url
if (u != null && isSameHost(u, serverUrl)) {
handler?.proceed()
} else {
handler?.cancel()
}
}
override fun shouldOverrideUrlLoading(
view: WebView?,
request: WebResourceRequest?,
): Boolean {
val u = request?.url?.toString() ?: return false
// Stay in the overlay for same-node navigation;
// hand genuinely external links to the real browser.
if (isSameHost(u, serverUrl)) return false
openExternalUrl(ctx, u)
return true
}
}
browser = this
loadUrl(url)
}
},
)
// Centered loading screen — app favicon (or spinner) + title + bar.
if (loading) {
Column(
modifier = Modifier
.fillMaxSize()
.background(SurfaceBlack),
horizontalAlignment = Alignment.CenterHorizontally,
verticalArrangement = Arrangement.Center,
) {
Box(
modifier = Modifier.size(84.dp).clip(RoundedCornerShape(20.dp)),
contentAlignment = Alignment.Center,
) {
val fav = favicon
if (fav != null) {
Image(
bitmap = fav.asImageBitmap(),
contentDescription = title,
modifier = Modifier.fillMaxSize(),
)
} else {
CircularProgressIndicator(color = BitcoinOrange)
}
}
Spacer(modifier = Modifier.height(18.dp))
Text(
text = title,
style = MaterialTheme.typography.bodyLarge,
color = TextPrimary,
maxLines = 1,
overflow = TextOverflow.Ellipsis,
)
Spacer(modifier = Modifier.height(16.dp))
LinearProgressIndicator(
progress = { progress / 100f },
modifier = Modifier.width(220.dp),
color = BitcoinOrange,
trackColor = TextMuted.copy(alpha = 0.2f),
)
}
}
}
// Bottom control bar — mirrors the web mobile-iframe footer.
Row(
modifier = Modifier
.fillMaxWidth()
.height(56.dp)
.background(SurfaceBlack)
.padding(horizontal = 8.dp),
horizontalArrangement = Arrangement.SpaceAround,
.height(48.dp)
.padding(horizontal = 4.dp),
verticalAlignment = Alignment.CenterVertically,
) {
IconButton(onClick = { browser?.goBack() }, enabled = canGoBack) {
Icon(
imageVector = Icons.AutoMirrored.Filled.ArrowBack,
contentDescription = "Back",
tint = if (canGoBack) TextPrimary else TextMuted.copy(alpha = 0.4f),
)
}
IconButton(onClick = { browser?.goForward() }, enabled = canGoForward) {
Icon(
imageVector = Icons.AutoMirrored.Filled.ArrowForward,
contentDescription = "Forward",
tint = if (canGoForward) TextPrimary else TextMuted.copy(alpha = 0.4f),
)
}
IconButton(onClick = { browser?.reload() }) {
Icon(
imageVector = Icons.Default.Refresh,
contentDescription = "Reload",
tint = TextPrimary,
)
}
IconButton(onClick = { openExternalUrl(context, browser?.url ?: url) }) {
Icon(
imageVector = Icons.Default.OpenInBrowser,
contentDescription = stringResource(R.string.open_in_browser),
tint = TextPrimary,
)
}
IconButton(onClick = onClose) {
Icon(
imageVector = Icons.Default.Close,
@ -697,6 +473,82 @@ private fun InAppBrowser(
tint = TextPrimary,
)
}
Text(
text = title,
style = MaterialTheme.typography.bodyMedium,
color = TextPrimary,
maxLines = 1,
overflow = TextOverflow.Ellipsis,
modifier = Modifier.weight(1f),
)
IconButton(onClick = { openExternalUrl(context, browser?.url ?: url) }) {
Icon(
imageVector = Icons.Default.OpenInBrowser,
contentDescription = stringResource(R.string.open_in_browser),
tint = TextMuted,
)
}
}
AnimatedVisibility(visible = loading, enter = fadeIn(), exit = fadeOut()) {
LinearProgressIndicator(
progress = { progress / 100f },
modifier = Modifier.fillMaxWidth(),
color = BitcoinOrange,
trackColor = SurfaceBlack,
)
}
AndroidView(
modifier = Modifier.fillMaxSize(),
factory = { ctx ->
WebView(ctx).apply {
layoutParams = ViewGroup.LayoutParams(
ViewGroup.LayoutParams.MATCH_PARENT,
ViewGroup.LayoutParams.MATCH_PARENT,
)
isVerticalScrollBarEnabled = false
isHorizontalScrollBarEnabled = false
CookieManager.getInstance().setAcceptThirdPartyCookies(this, true)
applyArchipelagoSettings()
webChromeClient = object : WebChromeClient() {
override fun onProgressChanged(view: WebView?, newProgress: Int) {
progress = newProgress
}
override fun onReceivedTitle(view: WebView?, t: String?) {
if (!t.isNullOrBlank()) title = t
}
}
webViewClient = object : WebViewClient() {
override fun onPageStarted(view: WebView?, u: String?, favicon: Bitmap?) {
loading = true
}
override fun onPageFinished(view: WebView?, u: String?) {
loading = false
}
override fun shouldOverrideUrlLoading(
view: WebView?,
request: WebResourceRequest?,
): Boolean {
val u = request?.url?.toString() ?: return false
// Stay in the overlay for same-node navigation;
// hand genuinely external links to the real browser.
if (isSameHost(u, serverUrl)) return false
openExternalUrl(ctx, u)
return true
}
}
browser = this
loadUrl(url)
}
},
)
}
}

@ -1,12 +0,0 @@
<vector xmlns:android="http://schemas.android.com/apk/res/android"
android:width="24dp"
android:height="24dp"
android:viewportWidth="24"
android:viewportHeight="24">
<path
android:pathData="M15,19l-7,-7 7,-7"
android:strokeColor="#FFFFFF"
android:strokeWidth="2"
android:strokeLineCap="round"
android:strokeLineJoin="round" />
</vector>

@ -1,12 +0,0 @@
<vector xmlns:android="http://schemas.android.com/apk/res/android"
android:width="24dp"
android:height="24dp"
android:viewportWidth="24"
android:viewportHeight="24">
<path
android:pathData="M6,18L18,6M6,6l12,12"
android:strokeColor="#FFFFFF"
android:strokeWidth="2"
android:strokeLineCap="round"
android:strokeLineJoin="round" />
</vector>

@ -1,12 +0,0 @@
<vector xmlns:android="http://schemas.android.com/apk/res/android"
android:width="24dp"
android:height="24dp"
android:viewportWidth="24"
android:viewportHeight="24">
<path
android:pathData="M9,5l7,7 -7,7"
android:strokeColor="#FFFFFF"
android:strokeWidth="2"
android:strokeLineCap="round"
android:strokeLineJoin="round" />
</vector>

@ -1,12 +0,0 @@
<vector xmlns:android="http://schemas.android.com/apk/res/android"
android:width="24dp"
android:height="24dp"
android:viewportWidth="24"
android:viewportHeight="24">
<path
android:pathData="M10,6H6a2,2 0,0 0,-2 2v10a2,2 0,0 0,2 2h10a2,2 0,0 0,2 -2v-4M14,4h6m0,0v6m0,-6L10,14"
android:strokeColor="#FFFFFF"
android:strokeWidth="2"
android:strokeLineCap="round"
android:strokeLineJoin="round" />
</vector>

@ -1,12 +0,0 @@
<vector xmlns:android="http://schemas.android.com/apk/res/android"
android:width="24dp"
android:height="24dp"
android:viewportWidth="24"
android:viewportHeight="24">
<path
android:pathData="M4,4v6h6M20,20v-6h-6M5.64,15.36A8,8 0,0 0,18.36 18M18.36,8.64A8,8 0,0 0,5.64 6"
android:strokeColor="#FFFFFF"
android:strokeWidth="2"
android:strokeLineCap="round"
android:strokeLineJoin="round" />
</vector>

@ -23,13 +23,6 @@
<string name="remote_input_hint">Use your phone as a keyboard and mouse for the kiosk</string>
<string name="close">Close</string>
<string name="open_in_browser">Open in browser</string>
<string name="back">Back</string>
<string name="forward">Forward</string>
<string name="refresh">Refresh</string>
<string name="server_name_label">Server Name (optional)</string>
<string name="server_name_placeholder">My Archipelago</string>
<string name="edit_server">Edit</string>
<string name="edit_server_title">Edit Server</string>
<string name="save_changes">Save Changes</string>
<string name="cancel">Cancel</string>
</resources>

@ -1,18 +1,13 @@
#!/usr/bin/env bash
#
# Build the Android companion app and publish it as the served download
# (neode-ui/public/packages/archipelago-companion.apk — a plain APK a phone can
# install straight from the link), then commit + push.
# (neode-ui/public/packages/archipelago-companion.apk.zip), then commit + push.
#
# Use this INSTEAD of `git push` when shipping the companion app, so the
# downloadable APK on the node always matches what's on main.
#
# ./Android/ship-companion.sh
#
# The actual build/sign/verify/stage is done by scripts/publish-companion-apk.sh
# (single source of truth, shared with the pre-push hook). It does a CLEAN build,
# forces v1+v2+v3 signing, and ABORTS if any signature scheme is missing — so a
# broken or v2-only APK can never be shipped.
set -euo pipefail
ROOT="$(cd "$(dirname "${BASH_SOURCE[0]}")/.." && pwd)"
@ -21,15 +16,21 @@ cd "$ROOT"
export JAVA_HOME="${JAVA_HOME:-/opt/homebrew/opt/openjdk@17}"
export ANDROID_HOME="${ANDROID_HOME:-$HOME/Library/Android/sdk}"
DEST="neode-ui/public/packages/archipelago-companion.apk"
APK="Android/app/build/outputs/apk/debug/app-debug.apk"
DEST="neode-ui/public/packages/archipelago-companion.apk.zip"
echo "==> Building + signing + verifying companion APK"
bash scripts/publish-companion-apk.sh
echo "==> Building debug APK"
( cd Android && ./gradlew :app:assembleDebug --console=plain -q )
[ -f "$APK" ] || { echo "ERROR: APK not found at $APK" >&2; exit 1; }
[ -f "$DEST" ] || { echo "ERROR: served APK not found at $DEST" >&2; exit 1; }
echo "==> Publishing -> $DEST"
mkdir -p "$(dirname "$DEST")"
rm -f "$DEST"
( cd "$(dirname "$APK")" && zip -j -q "$ROOT/$DEST" "$(basename "$APK")" )
if git diff --cached --quiet -- "$DEST"; then
echo "==> Nothing to commit (APK unchanged)"
git add "$DEST"
if git diff --cached --quiet; then
echo "==> Nothing to commit (working tree + APK unchanged)"
else
git commit -q -m "chore(android): update companion apk download"
echo "==> Committed"
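
The script's header comment says the build, signing and verification are delegated to scripts/publish-companion-apk.sh, which this diff does not show. As a rough sketch of what "abort if any signature scheme is missing" could look like, the hypothetical helper below reads the report printed by `apksigner verify --verbose` on stdin and fails unless v1, v2 and v3 are all reported as verified. The function name and the assumed report line format ("Verified using vN scheme (...): true") are illustrative assumptions, not the project's actual code.

```shell
#!/usr/bin/env bash
# Hypothetical helper (not from the repository): reads an
# `apksigner verify --verbose` report on stdin and returns non-zero
# unless the v1, v2 and v3 schemes are all reported as verified.
require_all_schemes() {
  local report scheme missing=0
  report="$(cat)"
  for scheme in v1 v2 v3; do
    # Assumed report line: "Verified using v2 scheme (APK Signature Scheme v2): true"
    if ! grep -q "^Verified using ${scheme} scheme .*: true\$" <<<"$report"; then
      echo "ERROR: ${scheme} signature missing" >&2
      missing=1
    fi
  done
  return "$missing"
}
```

A publish step might then run `apksigner verify --verbose "$APK" | require_all_schemes` before copying the APK to `$DEST`, so a v2-only build stops the script under `set -euo pipefail`.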

@ -1,18 +1,13 @@
# Archipelago — agent guide
## ✅ Single-node production gate is GREEN (2026-06-23)
## 🚩 TOP PRIORITY (until production testing passes)
`tests/lifecycle/run-gate.sh` is **5/5 on .228, 0 failures** — the single-node exit
criterion is met and the priority banner is demoted. Next exit-criteria: the
**multinode pass** (`docs/multinode-testing-plan.md`) and workstreams B/C/D.
**Read `docs/PRODUCTION-MASTER-PLAN.md` first** — it is still the authoritative plan
for the north star: a world-class, **developer-ready app platform** where every app
is manifest-driven, manifests ship via the **signed registry** (not OTA disk files),
and **third-party developers publish apps via an external/decentralized registry**,
all rootless, secure, robust, and 100%-uptime-capable. It no longer overrides all
ad-hoc direction now that the gate is green, but it remains the source of truth for
sequencing the remaining workstreams.
**Read `docs/PRODUCTION-MASTER-PLAN.md` first.** It is the authoritative plan and
overrides ad-hoc direction until the production test gate is green. Goal: a
world-class, **developer-ready app platform** where every app is manifest-driven,
manifests ship via the **signed registry** (not OTA disk files), and **third-party
developers publish apps via an external/decentralized registry** — all rootless,
secure, robust, and 100%-uptime-capable.
Detailed sub-plans (all linked from the master):
- App platform / packaging phases + security model → `docs/APP-PACKAGING-MIGRATION-PLAN.md`
@ -32,8 +27,7 @@ Detailed sub-plans (all linked from the master):
`container::secrets`, 0600/rootless) — never hardcoded, per-app, or logged.
- **Migrations never destroy data** — preserve `/var/lib/archipelago/<app>`,
secrets, credentials, ports, and adoption container names; keep a rollback path.
- **Verify on the real node .228 before any tag.** (Fleet-wide multinode
verification is a separate plan: `docs/multinode-testing-plan.md`.)
- **Verify on a real node (.228, then .198) before any tag.**
## Build / verify
@ -47,11 +41,7 @@ Detailed sub-plans (all linked from the master):
## Production test gate (definition of done)
`tests/lifecycle/run-gate.sh` green across install / UI / stop / start / restart /
`tests/lifecycle/run-20x.sh` green across install / UI / stop / start / restart /
reinstall / reboot-survive / archipelago-restart-survive / uninstall — **5× on
.228** (`ARCHY_ITERATIONS=5`). **Run the gate ON the node** (it uses local podman/systemctl/bitcoin
probes), not via RPC from another host. **✅ GREEN 2026-06-23 (5/5, 0 not-ok)** — keep it
green (re-run after orchestrator/lifecycle changes); regressions are top priority again.
**Multinode testing (.198 + the rest of the fleet) is a SEPARATE plan** —
`docs/multinode-testing-plan.md` — not part of this single-node gate criterion, and is
the next exit criterion now that single-node is green.
.228 AND .198 for now** (`ARCHY_ITERATIONS=5`; temporarily reduced from 20×,
restore to 20× before the final ship). Until green, the master plan is the priority.
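
The gate is described as an N-iteration run controlled by `ARCHY_ITERATIONS`, reported as a pass count such as 5/5. The gate scripts themselves are not part of this diff, so the following is only a minimal sketch of that shape: a hypothetical `run_iterations` wrapper that repeats a command, counts the passes, and succeeds only when every iteration passed.

```shell
#!/usr/bin/env bash
# Hypothetical wrapper (not the real gate script): runs the given command
# ARCHY_ITERATIONS times (default 5), prints "<passed>/<total>", and
# returns success only if every iteration passed.
run_iterations() {
  local iterations="${ARCHY_ITERATIONS:-5}" pass=0 i
  for ((i = 1; i <= iterations; i++)); do
    if "$@"; then
      pass=$((pass + 1))
    fi
  done
  echo "${pass}/${iterations}"
  [ "$pass" -eq "$iterations" ]
}
```

For example, `ARCHY_ITERATIONS=20 run_iterations ./one-lifecycle-pass.sh` (the script name is a placeholder) would restore the 20× setting mentioned above without editing the wrapper.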

@ -73,7 +73,7 @@
"author": "Mempool",
"category": "money",
"tier": "core",
"dockerImage": "146.59.87.168:3000/lfg2025/mempool-frontend:v3.0.1",
"dockerImage": "146.59.87.168:3000/lfg2025/mempool-frontend:v3.0.0",
"repoUrl": "https://github.com/mempool/mempool",
"requires": [
"bitcoin-knots",
@ -195,7 +195,7 @@
"title": "Nostr Relay (Rust)",
"version": "0.8.0",
"description": "High-performance Nostr relay written in Rust. Host your own decentralized social media relay and earn networking profits.",
"icon": "/assets/img/app-icons/nostr.svg",
"icon": "/assets/img/app-icons/nostrudel.svg",
"author": "Nostr RS Relay",
"category": "community",
"tier": "recommended",

@ -1,12 +1,12 @@
app:
id: archy-mempool-web
name: Mempool Web
version: 3.0.1
version: 3.0.0
description: Frontend web UI for mempool explorer.
container_name: mempool
container:
image: 146.59.87.168:3000/lfg2025/mempool-frontend:v3.0.1
image: git.tx1138.com/lfg2025/mempool-frontend:v3.0.0
pull_policy: if-not-present
network: archy-net

@ -5,7 +5,7 @@ app:
description: Bitcoin mempool and blockchain explorer. Real-time transaction and block visualization.
container:
image: 146.59.87.168:3000/lfg2025/mempool-frontend:v3.0.1
image: 146.59.87.168:3000/lfg2025/mempool-frontend:v3.0.0
image_signature: cosign://...
pull_policy: if-not-present

@ -0,0 +1,5 @@
# Meshtastic - uses official image
FROM meshtastic/meshtastic:latest
# Default configuration is in the image
# No additional setup needed

@ -0,0 +1,69 @@
app:
id: meshtastic
name: Meshtastic
version: 2-daily-alpine
description: Open-source mesh networking for LoRa radios. Create decentralized communication networks.
container:
image: docker.io/meshtastic/meshtasticd:daily-alpine
pull_policy: if-not-present
dependencies:
- storage: 1Gi
resources:
cpu_limit: 1
memory_limit: 512Mi
disk_limit: 1Gi
security:
capabilities: [NET_ADMIN, SYS_ADMIN] # Required for LoRa radio access
readonly_root: false # Needs write access for device management
no_new_privileges: true
user: 1000
seccomp_profile: default
network_policy: host # Requires host network for radio access
apparmor_profile: meshtastic
ports:
- host: 4403
container: 4403
protocol: tcp # Meshtastic TCP API
devices:
- /dev/ttyUSB0 # LoRa radio device (if connected)
volumes:
- type: bind
source: /var/lib/archipelago/meshtastic
target: /var/lib/meshtasticd
options: [rw]
files:
- path: /var/lib/archipelago/meshtastic/config.yaml
content: |
General:
MACAddress: AA:BB:CC:DD:EE:01
Webserver:
Port: 4403
environment:
- MESHTASTIC_PORT=/dev/ttyUSB0
- MESHTASTIC_SERIAL=true
health_check:
type: cmd
endpoint: test -f /var/lib/meshtasticd/config.yaml
interval: 30s
timeout: 30s
retries: 5
networking:
mesh_enabled: true
local_network_access: true
metadata:
icon: /assets/img/app-icons/meshcore.svg
category: networking
tier: recommended
repo: https://github.com/meshtastic/firmware

@ -1,77 +0,0 @@
app:
id: netbird-dashboard
name: NetBird Dashboard
version: "2.38.0"
description: NetBird management dashboard (SPA). Internal stack member served through the netbird proxy.
category: networking
# Hyphen name matches runtime references + the live container (adoption).
# Alias `netbird-dashboard` is the short hostname the proxy's nginx proxies to.
container_name: netbird-dashboard
container:
image: docker.io/netbirdio/dashboard:v2.38.0
pull_policy: if-not-present
network: netbird-net
network_aliases: [netbird-dashboard]
# The dashboard SPA bakes its API/OIDC base URL from these at container
# start. They must point at the proxy's public HTTPS origin (8087) so the
# browser uses a secure context (window.crypto.subtle / OIDC PKCE, #15).
# {{HOST_IP}} is the node's primary host IP, resolved at apply time.
derived_env:
- key: NETBIRD_MGMT_API_ENDPOINT
template: "https://{{HOST_IP}}:8087"
- key: NETBIRD_MGMT_GRPC_API_ENDPOINT
template: "https://{{HOST_IP}}:8087"
- key: AUTH_AUTHORITY
template: "https://{{HOST_IP}}:8087/oauth2"
dependencies:
- app_id: netbird-server
resources:
memory_limit: 256Mi
security:
# cap-drop=ALL is applied by the orchestrator. The dashboard image runs
# nginx (master as root, drops workers) binding :80 — needs the worker-drop
# caps + NET_BIND_SERVICE for the privileged port.
capabilities: [CHOWN, DAC_OVERRIDE, SETGID, SETUID, NET_BIND_SERVICE]
readonly_root: false
network_policy: isolated
# Internal only — reached container-to-container by the proxy via netbird-net.
ports: []
volumes: []
environment:
- AUTH_AUDIENCE=netbird-dashboard
- AUTH_CLIENT_ID=netbird-dashboard
- AUTH_CLIENT_SECRET=
- USE_AUTH0=false
- AUTH_SUPPORTED_SCOPES=openid profile email groups
- AUTH_REDIRECT_URI=/nb-auth
- AUTH_SILENT_REDIRECT_URI=/nb-silent-auth
- NETBIRD_TOKEN_SOURCE=idToken
- NGINX_SSL_PORT=443
- LETSENCRYPT_DOMAIN=none
health_check:
type: tcp
endpoint: localhost:80
interval: 30s
timeout: 5s
retries: 5
start_period: 20s
metadata:
author: NetBird
icon: /assets/img/app-icons/netbird.svg
website: https://netbird.io
repo: https://github.com/netbirdio/dashboard
license: BSD-3-Clause
tags:
- networking
- vpn
- dashboard

@ -1,122 +0,0 @@
app:
id: netbird-server
name: NetBird Server
version: "0.71.2"
description: NetBird combined management / signal / relay server with an embedded identity provider and STUN. Backend for the self-hosted NetBird mesh VPN.
category: networking
# Hyphen name matches the runtime references (crash_recovery / dependencies /
# config startup order) + the live container, so on an existing node the
# orchestrator ADOPTS the running server rather than recreating it (data +
# the sqlite store under /var/lib/netbird preserved). Alias `netbird-server`
# is the short hostname the proxy's nginx proxies/grpc-passes to.
container_name: netbird-server
container:
image: docker.io/netbirdio/netbird-server:0.71.2
pull_policy: if-not-present
network: netbird-net
network_aliases: [netbird-server]
# The relay authSecret and the sqlite store encryptionKey are base64 keys
# (the server base64-decodes them to recover raw bytes — hex would decode to
# the wrong value). Generated once and reused: ensure_generated_secrets
# no-ops when the file already exists, so a re-render of config.yaml on an
# adopted node keeps the same keys (regenerating would orphan the store).
generated_secrets:
- name: netbird-relay-auth-secret
kind: base64
- name: netbird-store-encryption-key
kind: base64
# Pass the rendered config explicitly, mirroring the legacy `--config` arg.
custom_args: ["--config", "/etc/netbird/config.yaml"]
dependencies:
- storage: 1Gi
resources:
memory_limit: 1Gi
security:
# cap-drop=ALL is applied by the orchestrator. The server binds :80
# (management/signal/relay HTTP + gRPC) inside the container — a privileged
# port — so it needs NET_BIND_SERVICE. STUN is 3478/udp (unprivileged).
capabilities: [NET_BIND_SERVICE]
readonly_root: false
network_policy: isolated
ports:
- host: 8086
container: 80
protocol: tcp # management API + embedded OIDC issuer (/oauth2)
- host: 3478
container: 3478
protocol: udp # STUN — must be UDP; tcp here breaks relay discovery
volumes:
- type: bind
source: /var/lib/archipelago/netbird/data
target: /var/lib/netbird
options: [rw]
# The rendered config.yaml, read-only. Re-rendered on every reconcile from
# host facts + the base64 secrets; idempotent (stable bytes → no restart).
- type: bind
source: /var/lib/archipelago/netbird/config.yaml
target: /etc/netbird/config.yaml
options: [ro]
environment: []
# The server's config. {{HOST_IP}} is the node's primary host IP (the proxy's
# public origin is https on 8087 — the dashboard needs a secure context for
# OIDC PKCE, issue #15). {{secret:...}} are read 0600 from the secrets dir.
files:
- path: /var/lib/archipelago/netbird/config.yaml
overwrite: true
content: |
server:
listenAddress: ":80"
exposedAddress: "https://{{HOST_IP}}:8087"
stunPorts:
- 3478
metricsPort: 9090
healthcheckAddress: ":9000"
logLevel: "info"
logFile: "console"
authSecret: "{{secret:netbird-relay-auth-secret}}"
dataDir: "/var/lib/netbird"
auth:
issuer: "https://{{HOST_IP}}:8087/oauth2"
localAuthDisabled: false
signKeyRefreshEnabled: false
dashboardRedirectURIs:
- "https://{{HOST_IP}}:8087/nb-auth"
- "https://{{HOST_IP}}:8087/nb-silent-auth"
dashboardPostLogoutRedirectURIs:
- "https://{{HOST_IP}}:8087/"
cliRedirectURIs:
- "http://localhost:53000/"
store:
engine: "sqlite"
encryptionKey: "{{secret:netbird-store-encryption-key}}"
# TCP liveness on the management port. Binds at startup, stays green; an http
# check of /oauth2 would false-fail while the issuer warms up.
health_check:
type: tcp
endpoint: localhost:80
interval: 30s
timeout: 5s
retries: 10
start_period: 30s
metadata:
author: NetBird
icon: /assets/img/app-icons/netbird.svg
website: https://netbird.io
repo: https://github.com/netbirdio/netbird
license: BSD-3-Clause
tags:
- networking
- vpn
- wireguard
- mesh
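
The netbird-server manifest above notes that the relay auth secret and the store encryption key must be base64 because the server base64-decodes them, and that a hex string "would decode to the wrong value". The sketch below illustrates that point with two hypothetical helpers. The 32-byte key length is an assumption made for the example, and this is not how `ensure_generated_secrets` is implemented.

```shell
#!/usr/bin/env bash
# Hypothetical helpers, for illustration only.

# Generate 32 random bytes and print them base64-encoded (44 characters).
gen_base64_key() {
  head -c 32 /dev/urandom | base64
}

# Print how many raw bytes a string yields when it is base64-decoded.
decoded_len() {
  printf '%s' "$1" | base64 -d | wc -c | tr -d ' '
}

# A base64 key decodes back to the 32 bytes that were generated.
decoded_len "$(gen_base64_key)"   # prints 32

# A 64-character hex string also uses only base64-alphabet characters, so it
# decodes without error, but to 48 unrelated bytes instead of 32.
hex_key="0123456789abcdef0123456789abcdef0123456789abcdef0123456789abcdef"
decoded_len "$hex_key"            # prints 48
```

The silent length mismatch is the hazard: nothing fails at decode time, the server simply ends up with different key material than the operator intended.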

@ -1,182 +0,0 @@
app:
id: netbird
name: NetBird
version: "2.38.0"
description: Self-hosted WireGuard mesh VPN control plane with dashboard, embedded identity provider, management API, signal, relay, and STUN. The user-facing entry point — a TLS proxy in front of the dashboard + server.
category: networking
# The user-facing launcher (app_id + container both "netbird", matching the
# runtime references + the live container so the orchestrator adopts it). This
# is the nginx that terminates TLS on 8087 and fans out to the dashboard +
# server by their short aliases on netbird-net.
container_name: netbird
container:
image: docker.io/library/nginx:1.27-alpine
pull_policy: if-not-present
network: netbird-net
# Self-signed TLS cert materialised before create — the dashboard needs a
# secure context (window.crypto.subtle / OIDC PKCE, issue #15), so the proxy
# serves HTTPS. Idempotent: kept as-is when crt+key already exist (a user
# accepts it once). SAN defaults to the host IP + 127.0.0.1 + localhost.
generated_certs:
- crt: /var/lib/archipelago/netbird/tls.crt
key: /var/lib/archipelago/netbird/tls.key
dependencies:
- app_id: netbird-server
- app_id: netbird-dashboard
- storage: 1Gi
resources:
memory_limit: 256Mi
security:
# cap-drop=ALL is applied by the orchestrator. nginx (master as root, drops
# workers) binds :443 — needs the worker-drop caps + NET_BIND_SERVICE.
capabilities: [CHOWN, DAC_OVERRIDE, SETGID, SETUID, NET_BIND_SERVICE]
readonly_root: false
network_policy: isolated
ports:
# 8087 publishes the TLS listener (container :443). HTTPS is required for the
# dashboard's secure context (issue #15).
- host: 8087
container: 443
protocol: tcp
volumes:
- type: bind
source: /var/lib/archipelago/netbird/nginx.conf
target: /etc/nginx/conf.d/default.conf
options: [ro]
- type: bind
source: /var/lib/archipelago/netbird/tls.crt
target: /etc/nginx/tls.crt
options: [ro]
- type: bind
source: /var/lib/archipelago/netbird/tls.key
target: /etc/nginx/tls.key
options: [ro]
environment: []
# The proxy config. {{NETWORK_GATEWAY}} is the netbird-net bridge gateway =
# Podman's aardvark DNS. nginx uses it as an explicit `resolver` with VARIABLE
# upstreams so it re-resolves container names per request — without it nginx
# pins a container IP at startup and 502s forever once that IP moves on a
# restart/reboot (issue #15, observed live on .198). Every #15 fix below
# (CORS $http_origin reflect, grpc pass, nb-auth/nb-silent-auth rewrite to
# index.html, /relay websocket) is preserved verbatim from the legacy config.
files:
- path: /var/lib/archipelago/netbird/nginx.conf
overwrite: true
content: |
server {
listen 443 ssl;
server_name _;
# netbird's dashboard needs a secure context (window.crypto.subtle for
# OIDC PKCE), so the proxy terminates TLS with a self-signed cert (#15).
ssl_certificate /etc/nginx/tls.crt;
ssl_certificate_key /etc/nginx/tls.key;
# Rootless Podman can hand a container a new IP across restarts/reboots.
# nginx resolves a literal upstream name ONCE at startup and caches it,
# so after the IP moves every request 502s with "host unreachable"
# (issue #15, observed live on .198: nginx pinned to a dead
# netbird-dashboard IP). Fix: point `resolver` at the netbird-net
# gateway (Podman's aardvark DNS) and use VARIABLE upstreams, which
# forces nginx to re-resolve the container names at request time.
resolver {{NETWORK_GATEWAY}} valid=10s ipv6=off;
proxy_set_header Host $host;
proxy_set_header X-Real-IP $remote_addr;
proxy_set_header X-Forwarded-For $proxy_add_x_forwarded_for;
proxy_set_header X-Forwarded-Proto $scheme;
proxy_http_version 1.1;
location ~ ^/(relay|ws-proxy/) {
set $nb_server netbird-server;
proxy_pass http://$nb_server:80;
proxy_set_header Upgrade $http_upgrade;
proxy_set_header Connection "upgrade";
proxy_read_timeout 1d;
}
location ~ ^/(api|oauth2)(/|$) {
# The dashboard is a SPA whose API/OIDC base URL is baked at build
# time to one host:port. A single box is reached via several
# addresses, so those fetches are cross-origin and the browser
# blocks them with no Access-Control-Allow-Origin (#15, live on
# .198). Reflect the caller's Origin and answer the CORS preflight.
if ($request_method = OPTIONS) {
add_header Access-Control-Allow-Origin $http_origin always;
add_header Access-Control-Allow-Credentials true always;
add_header Access-Control-Allow-Methods "GET, POST, PUT, PATCH, DELETE, OPTIONS" always;
add_header Access-Control-Allow-Headers "Authorization, Content-Type, Accept" always;
add_header Access-Control-Max-Age 86400 always;
add_header Content-Length 0;
return 204;
}
add_header Access-Control-Allow-Origin $http_origin always;
add_header Access-Control-Allow-Credentials true always;
add_header Access-Control-Allow-Methods "GET, POST, PUT, PATCH, DELETE, OPTIONS" always;
add_header Access-Control-Allow-Headers "Authorization, Content-Type, Accept" always;
set $nb_server netbird-server;
proxy_pass http://$nb_server:80;
}
location ~ ^/(signalexchange\.SignalExchange|management\.ManagementService|management\.ProxyService)/ {
set $nb_server netbird-server;
grpc_pass grpc://$nb_server:80;
grpc_read_timeout 1d;
grpc_send_timeout 1d;
}
# OIDC callback routes are client-side SPA routes with NO prebuilt page
# in the dashboard bundle, so proxying them straight through 404s —
# which crashes the dashboard's auth init and shows "Unauthenticated"
# with dead buttons (#15, live on .198: /nb-auth + /nb-silent-auth
# returned 404). Serve index.html at these paths (URL unchanged) so
# react-oidc boots and completes the login / silent-SSO.
location ~ ^/(nb-auth|nb-silent-auth) {
set $nb_dashboard netbird-dashboard;
rewrite ^.*$ /index.html break;
proxy_pass http://$nb_dashboard:80;
}
location / {
set $nb_dashboard netbird-dashboard;
proxy_pass http://$nb_dashboard:80;
}
}
health_check:
type: tcp
endpoint: localhost:443
interval: 30s
timeout: 5s
retries: 5
start_period: 20s
interfaces:
main:
name: Dashboard
description: Manage your self-hosted NetBird mesh VPN
type: ui
port: 8087
protocol: https
path: /
metadata:
author: NetBird
icon: /assets/img/app-icons/netbird.svg
website: https://netbird.io
repo: https://github.com/netbirdio/netbird
license: BSD-3-Clause
tags:
- networking
- vpn
- wireguard
- mesh
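
The manifest's nginx.conf leaves a `{{NETWORK_GATEWAY}}` placeholder in the `resolver` line. As a minimal sketch of how such a placeholder might be filled in (assuming a plain string substitution; `render_network_gateway` is a hypothetical name, not the orchestrator's actual renderer):

```rust
use std::net::IpAddr;

/// Hypothetical helper: substitute the `{{NETWORK_GATEWAY}}` placeholder in a
/// templated config. Accepts only a literal IP address, since that is what a
/// bridge-gateway lookup is expected to yield.
fn render_network_gateway(template: &str, gateway: &str) -> Result<String, String> {
    let ip: IpAddr = gateway
        .trim()
        .parse()
        .map_err(|e| format!("invalid gateway {gateway:?}: {e}"))?;
    // Replace every occurrence; a template without the placeholder is
    // returned unchanged.
    Ok(template.replace("{{NETWORK_GATEWAY}}", &ip.to_string()))
}
```

Rejecting a non-IP gateway up front keeps a bad lookup from silently producing a config that nginx would refuse to load.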

@@ -171,13 +171,6 @@ impl RpcHandler {
// than the WebSocket-delivered package_data, which caused apps to flicker
// between "installed" and "not-installed" in the UI.
let (data, _) = self.state_manager.get_snapshot().await;
// Apps the user explicitly stopped must read as "stopped" even though a
// UI companion (electrs-ui, bitcoin-ui, …) keeps serving the launch port:
// launch_port_reachable() below would otherwise upgrade an exited backend
// back to "running". The reconcile guard keeps these backends down, so the
// marker is authoritative here.
let user_stopped =
crate::crash_recovery::load_user_stopped(&self.config.data_dir).await;
if data.server_info.status_info.containers_scanned && !data.package_data.is_empty() {
let mut containers = Vec::with_capacity(data.package_data.len());
for (id, pkg) in &data.package_data {
@@ -209,11 +202,7 @@ impl RpcHandler {
// Scanner backoff preserves cached package_data. Refresh stable
// states so callers do not see stale `running`/`exited` after
// health-monitor recovery or Quadlet --rm container removal.
if user_stopped.contains(id) {
// User stopped it → authoritative "stopped". Do NOT let a
// still-running UI companion's launch port mark it running.
state = "stopped".to_string();
} else if state == "running" && requires_launch_port_for_health(id) {
if state == "running" && requires_launch_port_for_health(id) {
if !self.cached_reachable_health(id).await?.is_some() {
state = live_state_for_app(id)
.await
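
The comments in this hunk describe a precedence rule: a user-stopped marker wins over a reachable launch port. A simplified sketch of that rule, assuming the inputs are already gathered (`resolve_state` and its parameters are hypothetical and stand in for the handler's async lookups):

```rust
use std::collections::HashSet;

/// Hypothetical, simplified state resolution.
/// 1. An app the user explicitly stopped always reads as "stopped".
/// 2. Otherwise a reachable launch port upgrades a non-running state to
///    "running" (a UI companion may be serving that port).
/// 3. Otherwise the scanned state is reported as-is.
fn resolve_state(
    id: &str,
    scanned_state: &str,
    user_stopped: &HashSet<String>,
    launch_port_reachable: bool,
) -> String {
    if user_stopped.contains(id) {
        return "stopped".to_string();
    }
    if scanned_state != "running" && launch_port_reachable {
        return "running".to_string();
    }
    scanned_state.to_string()
}
```

Checking the marker first is what keeps a still-running companion from masking a deliberate stop.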

@@ -376,31 +376,16 @@ pub(super) fn startup_order(package_id: &str) -> &'static [&'static str] {
/// order for the given app. Unknown containers sort to the end.
pub(super) async fn ordered_containers_for_start(package_id: &str) -> Result<Vec<String>> {
let containers = get_containers_for_app(package_id).await?;
Ok(order_present_containers(package_id, containers))
}
/// Order the *actually-present* containers of an app by its dependency-aware
/// startup order. Containers whose name is unknown to the order list sort to
/// the end, preserving their relative input order.
///
/// This deliberately does NOT inject order entries that aren't live
/// containers. `startup_order` is a union of container-name variants across
/// install generations (e.g. `mysql-mempool` vs `archy-mempool-db`), so any
/// single install only ever has a subset of those names. Injecting a phantom
/// name makes the start path fail on a "no such object" inspect — and because
/// `do_orchestrator_package_start` propagates the unknown-app-id fallback
/// error via `?`, every later member (the api + frontend) is then skipped,
/// leaving the stack down until the health monitor recovers it minutes later.
/// That was the source of mempool gate flakes #73 (frontend) / #74 (api).
fn order_present_containers(package_id: &str, containers: Vec<String>) -> Vec<String> {
if containers.is_empty() {
// Nothing is live under any known name. Fall back to the package id so
// a single-container app whose container matches its id still gets one
// start attempt; multi-container stacks with no live members are
// surfaced as "no containers" by the caller's emptiness check.
return vec![package_id.to_string()];
}
let order = startup_order(package_id);
if order.is_empty() && containers.is_empty() {
return Ok(vec![package_id.to_string()]);
}
let mut sorted = containers;
for required in order {
if !sorted.iter().any(|name| name == required) {
sorted.push((*required).to_string());
}
}
// If no special order is defined, fall back to mempool order for legacy
// multi-container names that may still be returned by config lookups.
let effective_order: &[&str] = if order.is_empty() {
@@ -408,14 +393,8 @@ fn order_present_containers(package_id: &str, containers: Vec<String>) -> Vec<St
} else {
order
};
let mut sorted = containers;
sorted.sort_by_key(|c| {
effective_order
.iter()
.position(|o| *o == c)
.unwrap_or(usize::MAX)
});
sorted
sorted.sort_by_key(|c| effective_order.iter().position(|o| *o == c).unwrap_or(99));
Ok(sorted)
}
/// Configure Fedimint Gateway to use LND instead of LDK.
@@ -473,48 +452,7 @@ pub(super) fn configure_fedimint_lnd(
#[cfg(test)]
mod tests {
use super::{order_present_containers, requires_unpruned_bitcoin, startup_order};
#[test]
fn order_present_containers_never_injects_phantom_stack_members() {
// The live mempool stack on a node: db + api + frontend. These are the
// only real container names; the startup_order list also contains
// variant/legacy names (mysql-mempool, archy-mempool-api, ...) that are
// NOT live here and must never appear in the result — a phantom name in
// the start list aborts the orchestrator start mid-sequence (gate
// #73/#74).
let present = vec![
"mempool".to_string(),
"mempool-api".to_string(),
"archy-mempool-db".to_string(),
];
let ordered = order_present_containers("mempool", present);
// Dependency order: db -> api -> frontend.
assert_eq!(ordered, vec!["archy-mempool-db", "mempool-api", "mempool"]);
// No phantom variants leaked in.
for phantom in ["mysql-mempool", "archy-mempool-api", "archy-mempool-web"] {
assert!(
!ordered.iter().any(|c| c == phantom),
"phantom {phantom} must not be injected"
);
}
}
#[test]
fn order_present_containers_orders_known_before_unknown() {
let present = vec!["mempool".to_string(), "some-sidecar".to_string()];
let ordered = order_present_containers("mempool", present);
// The known frontend sorts ahead of an unknown sidecar.
assert_eq!(ordered, vec!["mempool", "some-sidecar"]);
}
#[test]
fn order_present_containers_empty_falls_back_to_package_id() {
assert_eq!(
order_present_containers("mempool", vec![]),
vec!["mempool".to_string()]
);
}
use super::{requires_unpruned_bitcoin, startup_order};
#[test]
fn btcpay_start_order_includes_required_stack_members() {
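
The "order only what is present" behaviour described in the doc comment above can be sketched on its own. This is a standalone illustration: `ASSUMED_MEMPOOL_ORDER` is an invented order list (the real `startup_order` contents are not shown in this diff), and `order_present` takes the order as a parameter rather than looking it up.

```rust
/// Assumed order list for illustration only: a union of container-name
/// variants, of which any single install has just a subset.
const ASSUMED_MEMPOOL_ORDER: &[&str] = &[
    "mysql-mempool",
    "archy-mempool-db",
    "archy-mempool-api",
    "mempool-api",
    "archy-mempool-web",
    "mempool",
];

/// Sort the containers that actually exist by their position in `order`.
/// Names missing from `order` sort last; `sort_by_key` is stable, so they
/// keep their relative input order. Nothing absent is ever injected.
fn order_present(order: &[&str], package_id: &str, present: Vec<String>) -> Vec<String> {
    if present.is_empty() {
        // Single fallback attempt under the package id itself.
        return vec![package_id.to_string()];
    }
    let mut sorted = present;
    sorted.sort_by_key(|c| {
        order
            .iter()
            .position(|o| *o == c.as_str())
            .unwrap_or(usize::MAX)
    });
    sorted
}
```

Using `usize::MAX` as the sentinel, rather than a small constant, keeps unknown names last however long the order list grows.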

@@ -312,16 +312,7 @@ impl RpcHandler {
let mut stopped = 0u32;
let mut removed = 0u32;
// Two distinct failure classes, kept separate so they don't get
// conflated (the old single `errors` vec did, which caused the "ghost in
// My Apps" bug): `container_errors` means a container could NOT be
// removed (force-rm failed too) — the app is genuinely still present, so
// we keep its state entry and surface a hard error. `cleanup_errors`
// means volume/network/data-dir teardown left residue — the containers
// are already gone, so the app IS uninstalled and MUST disappear from My
// Apps; the residue is logged but never ghosts the app.
let mut container_errors: Vec<String> = Vec::new();
let mut cleanup_errors: Vec<String> = Vec::new();
let mut errors = Vec::new();
self.set_uninstall_stage(
package_id,
@@ -379,7 +370,7 @@ impl RpcHandler {
let msg =
format!("Failed to remove {}: {}; {}", name, stderr.trim(), e);
tracing::error!("Uninstall {}: {}", package_id, msg);
container_errors.push(msg);
errors.push(msg);
}
}
}
@@ -388,35 +379,12 @@ impl RpcHandler {
Err(force_err) => {
let msg = format!("Failed to remove {}: {}; {}", name, e, force_err);
tracing::error!("Uninstall {}: {}", package_id, msg);
container_errors.push(msg);
errors.push(msg);
}
},
}
}
// A container that survived even force-remove means the app is NOT
// actually uninstalled — keep its state entry and fail so the spawned
// task reverts it to its prior state (and the user can retry), rather
// than orphaning a live container that's missing from My Apps.
if !container_errors.is_empty() {
tracing::error!(
"Uninstall {}: containers could not be removed: {:?}",
package_id,
container_errors
);
return Err(anyhow::anyhow!(
"Uninstall {} failed: {}",
package_id,
container_errors.join("; ")
));
}
// Containers are gone → the app is uninstalled. Remove its state entry
// NOW, before the (possibly slow, possibly fallible) volume/data
// teardown below, so My Apps updates immediately and a residue failure
// can never leave a ghost. Reinstall/scan no longer see a stale entry.
self.remove_package_state_entry(package_id).await;
self.set_uninstall_stage(package_id, "Cleaning up volumes")
.await;
// Avoid global Podman volume prune on production nodes: store-wide
@@ -464,73 +432,70 @@ impl RpcHandler {
let stderr = String::from_utf8_lossy(&o.stderr);
let msg = format!("Failed to remove data {}: {}", dir, stderr.trim());
tracing::error!("Uninstall {}: {}", package_id, msg);
cleanup_errors.push(msg);
errors.push(msg);
}
Err(e) => {
let msg = format!("Failed to remove data {}: {}", dir, e);
tracing::error!("Uninstall {}: {}", package_id, msg);
cleanup_errors.push(msg);
errors.push(msg);
}
_ => {}
}
}
}
// The app is already gone from My Apps (entry removed above). Residual
// volume/data cleanup failures are logged but NEVER ghost the app — a
// reinstall and the next uninstall both tolerate leftover dirs.
if !cleanup_errors.is_empty() {
if !errors.is_empty() {
tracing::error!(
"Uninstall {} removed but left cleanup residue: {:?}",
"Uninstall {} completed with errors: {:?}",
package_id,
cleanup_errors
errors
);
return Err(anyhow::anyhow!(
"Uninstall {} partially failed: {}",
package_id,
errors.join("; ")
));
}
tracing::info!(
"Uninstall {} complete: stopped={}, removed={}, cleanup_errors={}",
"Uninstall {} complete: stopped={}, removed={}",
package_id,
stopped,
removed,
cleanup_errors.len()
removed
);
// Immediately remove from in-memory state so the UI updates without
// waiting for the scanner's absence threshold (3 scans × 60s each).
{
let (mut data, _rev) = self.state_manager.get_snapshot().await;
let before = data.package_data.len();
data.package_data.remove(package_id);
// Also remove any alias keys (e.g. "bitcoin-knots" vs "bitcoin")
let aliases: Vec<String> = data
.package_data
.keys()
.filter(|k| {
super::config::all_container_names(package_id)
.iter()
.any(|c| c.strip_prefix("archy-").unwrap_or(c) == k.as_str())
})
.cloned()
.collect();
for alias in &aliases {
data.package_data.remove(alias);
}
if data.package_data.len() < before {
self.state_manager.update_data(data).await;
}
}
Ok(serde_json::json!({
"status": "uninstalled",
"stopped": stopped,
"removed": removed,
"cleanup_warnings": cleanup_errors,
}))
}
/// Remove a package's entry (and any alias keys) from persisted state so it
/// disappears from My Apps immediately, without waiting for the scanner's
/// absence threshold (3 scans × 60s). Called as soon as an uninstall has
/// removed the app's containers — before the slower volume/data teardown —
/// so a residue failure can never leave a ghost entry behind.
async fn remove_package_state_entry(&self, package_id: &str) {
let (mut data, _rev) = self.state_manager.get_snapshot().await;
let before = data.package_data.len();
data.package_data.remove(package_id);
// Also remove any alias keys (e.g. "bitcoin-knots" vs "bitcoin").
let aliases: Vec<String> = data
.package_data
.keys()
.filter(|k| {
super::config::all_container_names(package_id)
.iter()
.any(|c| c.strip_prefix("archy-").unwrap_or(c) == k.as_str())
})
.cloned()
.collect();
for alias in &aliases {
data.package_data.remove(alias);
}
if data.package_data.len() < before {
self.state_manager.update_data(data).await;
}
}
/// Start a bundled app (create container from pre-loaded image if needed).
pub(in crate::api::rpc) async fn handle_bundled_app_start(
&self,
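
The split between `container_errors` and `cleanup_errors` in this hunk amounts to a two-way classification of the uninstall result. A compact sketch of that decision, assuming both error lists have already been collected (`UninstallOutcome` and `classify_uninstall` are hypothetical names; the handler itself returns early instead of building an enum):

```rust
/// Hypothetical outcome type for illustration.
#[derive(Debug, PartialEq)]
enum UninstallOutcome {
    /// A container survived even force-remove: the app is still present, so
    /// its state entry is kept and the call fails.
    StillInstalled { container_errors: Vec<String> },
    /// Containers are gone: the app is uninstalled and leaves My Apps;
    /// leftover volume/data residue is reported as warnings only.
    Uninstalled { cleanup_warnings: Vec<String> },
}

fn classify_uninstall(
    container_errors: Vec<String>,
    cleanup_errors: Vec<String>,
) -> UninstallOutcome {
    if !container_errors.is_empty() {
        UninstallOutcome::StillInstalled { container_errors }
    } else {
        UninstallOutcome::Uninstalled {
            cleanup_warnings: cleanup_errors,
        }
    }
}
```

With a single shared `errors` list the two cases are indistinguishable, which is how residue failures could leave a ghost entry.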

@@ -6,6 +6,7 @@
use crate::api::rpc::RpcHandler;
use crate::data_model::InstallPhase;
use anyhow::{Context, Result};
use base64::Engine;
use std::process::Output;
use std::time::Duration;
use tracing::info;
@@ -695,16 +696,6 @@ fn immich_stack_app_ids() -> &'static [&'static str] {
&["immich-postgres", "immich-redis", "immich"]
}
fn netbird_stack_app_ids() -> &'static [&'static str] {
// Dependency/startup order: the combined management/signal/relay server
// first (it owns the base64 relay/store secrets + the sqlite store, and is
// the OIDC issuer the others point at), then the dashboard SPA, then the
// user-facing TLS proxy ("netbird", which carries the self-signed cert +
// the templated nginx.conf and is the launcher). Mirrors the netbird
// startup_order in dependencies.rs.
&["netbird-server", "netbird-dashboard", "netbird"]
}
fn indeedhub_stack_app_ids() -> &'static [&'static str] {
// Dependency order: backends + their generated secrets first, then the api
// (owns indeedhub-jwt; reads the db/minio secrets the backends materialised),
@@ -724,6 +715,10 @@ fn indeedhub_stack_app_ids() -> &'static [&'static str] {
const REGISTRY: &str = "146.59.87.168:3000/lfg2025";
const NETBIRD_DASHBOARD_IMAGE: &str = "docker.io/netbirdio/dashboard:v2.38.0";
const NETBIRD_SERVER_IMAGE: &str = "docker.io/netbirdio/netbird-server:0.71.2";
const NETBIRD_PROXY_IMAGE: &str = "docker.io/library/nginx:1.27-alpine";
/// Pull an image with retry and exponential backoff (3 attempts).
async fn pull_image_with_retry(image: &str) -> Result<()> {
let exists = podman_stack_status(&["image", "exists", image], PODMAN_STACK_PROBE_TIMEOUT).await;
@@ -1833,27 +1828,6 @@ impl RpcHandler {
/// Install self-hosted NetBird (dashboard + combined management/signal/relay server).
pub(super) async fn install_netbird_stack(&self) -> Result<serde_json::Value> {
// Manifest-driven path (#20 phase 4): render the 3-member stack from
// apps/netbird-*/manifest.yml via the orchestrator — dedicated
// netbird-net + network_aliases, base64 generated_secrets, a self-signed
// TLS cert (generated_certs) so the dashboard gets a secure context for
// OIDC PKCE (#15), and templated config.yaml/nginx.conf rendered from
// host facts + the netbird-net gateway. The manifests use the exact live
// container names, so on an existing node this ADOPTS the running stack
// rather than recreating it (the sqlite store + base64 keys are
// preserved — ensure_generated_secrets no-ops on existing files).
//
// #20 ph4: the legacy hardcoded `podman run` installer was DELETED — the
// signed catalog always ships apps/netbird-*/manifest.yml, so there is no
// in-Rust fallback. If the orchestrator doesn't know these app_ids and no
// running stack exists to adopt, install errors rather than silently
// diverging from the manifest contract.
if let Some(orchestrated) =
install_stack_via_orchestrator(self, "netbird", netbird_stack_app_ids()).await?
{
return Ok(orchestrated);
}
if let Some(adopted) = adopt_stack_if_exists(
"netbird",
"netbird",
@@ -1864,12 +1838,491 @@ impl RpcHandler {
return Ok(adopted);
}
anyhow::bail!(
"netbird manifests not available on this node — the signed catalog must provide apps/netbird-*/manifest.yml (legacy hardcoded installer removed in #20 ph4)"
install_log("INSTALL START: netbird stack (dashboard + server)").await;
info!("Installing self-hosted NetBird stack");
self.set_install_phase("netbird", InstallPhase::PullingImage)
.await;
for (i, image) in [
NETBIRD_DASHBOARD_IMAGE,
NETBIRD_SERVER_IMAGE,
NETBIRD_PROXY_IMAGE,
]
.iter()
.enumerate()
{
self.set_install_progress("netbird", i as u64, 3).await;
pull_image_with_retry(image)
.await
.with_context(|| format!("Failed to pull NetBird image: {}", image))?;
}
self.set_install_progress("netbird", 3, 3).await;
for name in ["netbird", "netbird-dashboard", "netbird-server"] {
let _ = podman_stack_status(&["rm", "-f", name], PODMAN_STACK_PROBE_TIMEOUT).await;
}
let _ = podman_stack_status(
&["network", "rm", "-f", "netbird-net"],
PODMAN_STACK_PROBE_TIMEOUT,
)
.await;
self.set_install_phase("netbird", InstallPhase::CreatingContainer)
.await;
tokio::fs::create_dir_all("/var/lib/archipelago/netbird/data")
.await
.context("Failed to create NetBird data directory")?;
let host_ip = detect_netbird_public_host_ip()
.await
.unwrap_or_else(|| self.config.host_ip.clone());
// Create the network FIRST so we can read back the gateway it was
// assigned — that gateway is Podman's aardvark DNS, which the proxy's
// nginx needs as an explicit `resolver` to re-resolve container names
// (issue #15: without it nginx caches a container IP and 502s forever
// once that IP changes on restart/reboot).
let _ = podman_stack_status(
&["network", "create", "netbird-net"],
PODMAN_STACK_PROBE_TIMEOUT,
)
.await;
let resolver_ip = netbird_net_resolver_ip().await;
write_netbird_config_files(&host_ip, &self.config.host_ip, &resolver_ip).await?;
ensure_netbird_tls_cert(&host_ip).await?;
let mut server_cmd = tokio::process::Command::new("podman");
server_cmd.args([
"run",
"-d",
"--name",
"netbird-server",
"--network",
"netbird-net",
"--network-alias",
"netbird-server",
"--restart=unless-stopped",
"-p",
"8086:80",
"-p",
"3478:3478/udp",
"-v",
"/var/lib/archipelago/netbird/data:/var/lib/netbird",
"-v",
"/var/lib/archipelago/netbird/config.yaml:/etc/netbird/config.yaml:ro",
NETBIRD_SERVER_IMAGE,
"--config",
"/etc/netbird/config.yaml",
]);
run_required_stack_command("netbird", "create server", &mut server_cmd).await?;
self.set_install_phase("netbird", InstallPhase::StartingContainer)
.await;
tokio::time::sleep(std::time::Duration::from_secs(5)).await;
let mut dashboard_cmd = tokio::process::Command::new("podman");
dashboard_cmd.args([
"run",
"-d",
"--name",
"netbird-dashboard",
"--network",
"netbird-net",
// Explicit alias so the proxy can always resolve `netbird-dashboard`
// via Podman DNS — don't rely on implicit container-name aliasing.
"--network-alias",
"netbird-dashboard",
"--restart=unless-stopped",
"--env-file",
"/var/lib/archipelago/netbird/dashboard.env",
NETBIRD_DASHBOARD_IMAGE,
]);
run_required_stack_command("netbird", "create dashboard", &mut dashboard_cmd).await?;
let mut proxy_cmd = tokio::process::Command::new("podman");
proxy_cmd.args([
"run",
"-d",
"--name",
"netbird",
"--network",
"netbird-net",
"--restart=unless-stopped",
// 8087 publishes the TLS listener — netbird's dashboard requires a
// secure context (window.crypto.subtle / OIDC PKCE), issue #15.
"-p",
"8087:443",
"-v",
"/var/lib/archipelago/netbird/nginx.conf:/etc/nginx/conf.d/default.conf:ro",
"-v",
"/var/lib/archipelago/netbird/tls.crt:/etc/nginx/tls.crt:ro",
"-v",
"/var/lib/archipelago/netbird/tls.key:/etc/nginx/tls.key:ro",
NETBIRD_PROXY_IMAGE,
]);
run_required_stack_command("netbird", "create unified proxy", &mut proxy_cmd).await?;
wait_for_stack_containers(
"netbird",
&["netbird-server", "netbird-dashboard", "netbird"],
60,
)
.await?;
self.set_install_phase("netbird", InstallPhase::WaitingHealthy)
.await;
// Containers being "running" is NOT the same as the embedded OIDC
// provider being ready (#10). The dashboard SPA opens right after install
// and, if it loads before /oauth2/.well-known is served, caches a bad
// auth state — the user appears logged-in but can't log out until it
// self-corrects. Wait (best-effort) for OIDC discovery to answer before
// we report Done, so the first dashboard load sees a ready provider.
wait_for_netbird_oidc_ready(Duration::from_secs(60)).await;
self.set_install_phase("netbird", InstallPhase::PostInstall)
.await;
self.set_install_phase("netbird", InstallPhase::Done).await;
self.clear_install_progress("netbird").await;
install_log("INSTALL OK: netbird stack").await;
info!("NetBird stack installed");
Ok(serde_json::json!({
"success": true,
"package_id": "netbird",
"message": "NetBird self-hosted stack installed",
}))
}
}
/// Best-effort wait for NetBird's embedded OIDC provider to start serving its
/// discovery document. The management server publishes 8086:80 on the host and
/// is the issuer at `/oauth2`, so its `.well-known/openid-configuration` is the
/// signal that the dashboard's login/logout flow will work. Polls until a 2xx
/// or the timeout — NEVER fails the install (the stack is already running; this
/// only narrows the post-install race window in #10).
async fn wait_for_netbird_oidc_ready(timeout: Duration) {
let url = "http://127.0.0.1:8086/oauth2/.well-known/openid-configuration";
let client = match reqwest::Client::builder()
.timeout(Duration::from_secs(5))
.build()
{
Ok(c) => c,
Err(_) => return,
};
let deadline = tokio::time::Instant::now() + timeout;
loop {
if let Ok(resp) = client.get(url).send().await {
if resp.status().is_success() {
info!("NetBird OIDC discovery is ready");
return;
}
}
if tokio::time::Instant::now() >= deadline {
info!("NetBird OIDC discovery not ready within timeout — proceeding anyway");
return;
}
tokio::time::sleep(Duration::from_secs(2)).await;
}
}
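
The probe / deadline / sleep loop above is a general best-effort wait. A blocking, std-only sketch of the same shape follows. It swaps tokio and reqwest for `std::thread::sleep` and a caller-supplied closure, and `poll_until_ready` is a hypothetical name:

```rust
use std::time::{Duration, Instant};

/// Call `probe` until it reports ready or `timeout` elapses.
/// Returns true if the probe succeeded, false on timeout. The probe runs at
/// least once, and the caller decides whether a timeout is an error.
fn poll_until_ready<F: FnMut() -> bool>(
    mut probe: F,
    timeout: Duration,
    interval: Duration,
) -> bool {
    let deadline = Instant::now() + timeout;
    loop {
        if probe() {
            return true;
        }
        if Instant::now() >= deadline {
            return false;
        }
        std::thread::sleep(interval);
    }
}
```

Returning a plain `bool` mirrors the never-fails contract: the caller can log a timeout and carry on.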
async fn read_or_generate_b64_secret(name: &str) -> String {
let path = format!("/var/lib/archipelago/secrets/{}", name);
if let Ok(val) = tokio::fs::read_to_string(&path).await {
let trimmed = val.trim().to_string();
if !trimmed.is_empty() {
return trimmed;
}
}
let mut buf = [0u8; 32];
rand::RngCore::fill_bytes(&mut rand::rngs::OsRng, &mut buf);
let secret = base64::engine::general_purpose::STANDARD.encode(buf);
let _ = tokio::fs::create_dir_all("/var/lib/archipelago/secrets").await;
let _ = tokio::fs::write(&path, &secret).await;
secret
}
/// Read the gateway of the `netbird-net` bridge. Podman runs its aardvark DNS
/// resolver on this address, so nginx can use it as an explicit `resolver` to
/// re-resolve container names at request time. Falls back to Podman's usual
/// first-pool gateway if the inspect fails (best effort — config is rewritten
/// on every (re)install).
async fn netbird_net_resolver_ip() -> String {
let out = tokio::process::Command::new("podman")
.args([
"network",
"inspect",
"netbird-net",
"--format",
"{{range .Subnets}}{{.Gateway}}{{end}}",
])
.output()
.await;
if let Ok(o) = out {
let gw = String::from_utf8_lossy(&o.stdout).trim().to_string();
if !gw.is_empty() && gw.parse::<std::net::IpAddr>().is_ok() {
return gw;
}
}
"10.89.0.1".to_string()
}
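
The validation step in `netbird_net_resolver_ip` can be isolated as a pure function, which makes the fallback easy to exercise. A sketch, assuming the inspect output is passed in as a string (`gateway_or_fallback` is a hypothetical name):

```rust
use std::net::IpAddr;

/// Accept the inspect output only if it parses as exactly one IP address;
/// otherwise use the fallback gateway from the function above.
fn gateway_or_fallback(inspect_stdout: &str) -> String {
    let gw = inspect_stdout.trim();
    if !gw.is_empty() && gw.parse::<IpAddr>().is_ok() {
        gw.to_string()
    } else {
        "10.89.0.1".to_string()
    }
}
```

Because the format string ranges over `.Subnets` with no separator, a network with more than one subnet would print its gateways run together. That output fails the parse and takes the fallback rather than ending up in the config.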
/// Generate a self-signed TLS cert for the netbird proxy if absent. The
/// dashboard needs a secure context (window.crypto.subtle / OIDC PKCE), so the
/// proxy serves HTTPS; a self-signed cert is sufficient (the user accepts it
/// once when opening netbird in a tab). SAN covers the LAN IP plus
/// localhost/127.0.0.1 so it's valid however the box is reached locally.
async fn ensure_netbird_tls_cert(host_ip: &str) -> Result<()> {
let dir = "/var/lib/archipelago/netbird";
let crt = format!("{dir}/tls.crt");
let key = format!("{dir}/tls.key");
if tokio::fs::metadata(&crt).await.is_ok() && tokio::fs::metadata(&key).await.is_ok() {
return Ok(());
}
let _ = tokio::fs::create_dir_all(dir).await;
let san = format!("subjectAltName=IP:{host_ip},IP:127.0.0.1,DNS:localhost");
let status = tokio::process::Command::new("openssl")
.args([
"req",
"-x509",
"-newkey",
"rsa:2048",
"-nodes",
"-keyout",
&key,
"-out",
&crt,
"-days",
"3650",
"-subj",
&format!("/CN={host_ip}"),
"-addext",
&san,
])
.status()
.await
.context("failed to run openssl for netbird TLS cert")?;
if !status.success() {
anyhow::bail!("openssl failed to generate netbird TLS cert");
}
Ok(())
}
async fn write_netbird_config_files(host_ip: &str, lan_ip: &str, resolver_ip: &str) -> Result<()> {
// netbird's dashboard uses window.crypto.subtle (OIDC PKCE), which browsers
// only expose in a SECURE context — so the proxy serves HTTPS and every
// origin here is https (issue #15: over plain http the dashboard threw
// "window.crypto.subtle is unavailable" and never reached login).
let public_origin = format!("https://{}:8087", host_ip);
let server_origin = format!("http://{}:8086", host_ip);
// A single box is reached via several addresses. Allow the OIDC login flow
// to redirect back to whichever origin the user actually used, otherwise
// post-login lands on the wrong host and the dashboard shows
// "Unauthenticated" (issue #15). The browser-side CORS is handled in the
// nginx proxy; this covers the redirect-URI allow-list.
let lan_origin = format!("https://{}:8087", lan_ip);
let mut redirect_origins = vec![public_origin.clone()];
if lan_origin != public_origin {
redirect_origins.push(lan_origin);
}
let dashboard_redirect_uris = redirect_origins
.iter()
.flat_map(|o| {
[
format!(" - \"{o}/nb-auth\""),
format!(" - \"{o}/nb-silent-auth\""),
]
})
.collect::<Vec<_>>()
.join("\n");
let dashboard_logout_uris = redirect_origins
.iter()
.map(|o| format!(" - \"{o}/\""))
.collect::<Vec<_>>()
.join("\n");
let relay_secret = read_or_generate_b64_secret("netbird-relay-auth-secret").await;
let encryption_key = read_or_generate_b64_secret("netbird-store-encryption-key").await;
let config = format!(
r#"server:
listenAddress: ":80"
exposedAddress: "{public_origin}"
stunPorts:
- 3478
metricsPort: 9090
healthcheckAddress: ":9000"
logLevel: "info"
logFile: "console"
authSecret: "{relay_secret}"
dataDir: "/var/lib/netbird"
auth:
issuer: "{public_origin}/oauth2"
localAuthDisabled: false
signKeyRefreshEnabled: false
dashboardRedirectURIs:
{dashboard_redirect_uris}
dashboardPostLogoutRedirectURIs:
{dashboard_logout_uris}
cliRedirectURIs:
- "http://localhost:53000/"
store:
engine: "sqlite"
encryptionKey: "{encryption_key}"
"#
);
tokio::fs::write("/var/lib/archipelago/netbird/config.yaml", config)
.await
.context("Failed to write NetBird config.yaml")?;
let dashboard_env = format!(
r#"NETBIRD_MGMT_API_ENDPOINT={public_origin}
NETBIRD_MGMT_GRPC_API_ENDPOINT={public_origin}
AUTH_AUDIENCE=netbird-dashboard
AUTH_CLIENT_ID=netbird-dashboard
AUTH_CLIENT_SECRET=
AUTH_AUTHORITY={public_origin}/oauth2
USE_AUTH0=false
AUTH_SUPPORTED_SCOPES=openid profile email groups
AUTH_REDIRECT_URI=/nb-auth
AUTH_SILENT_REDIRECT_URI=/nb-silent-auth
NETBIRD_TOKEN_SOURCE=idToken
NGINX_SSL_PORT=443
LETSENCRYPT_DOMAIN=none
"#
);
tokio::fs::write("/var/lib/archipelago/netbird/dashboard.env", dashboard_env)
.await
.context("Failed to write NetBird dashboard.env")?;
let nginx_conf = format!(
r#"server {{
listen 443 ssl;
server_name _;
# netbird's dashboard needs a secure context (window.crypto.subtle for OIDC
# PKCE), so the proxy terminates TLS with a self-signed cert (issue #15).
ssl_certificate /etc/nginx/tls.crt;
ssl_certificate_key /etc/nginx/tls.key;
# Rootless Podman can hand a container a new IP across restarts/reboots.
# nginx resolves a literal upstream name ONCE at startup and caches it, so
# after the IP moves every request 502s with "host unreachable" (issue #15,
# observed live on .198: nginx pinned to a dead netbird-dashboard IP). Fix:
# point `resolver` at the netbird-net gateway (Podman's aardvark DNS) and
# use VARIABLE upstreams, which forces nginx to re-resolve the container
# names at request time. Everything is reached container-to-container by
# name so nothing depends on host-published ports either.
resolver {resolver_ip} valid=10s ipv6=off;
proxy_set_header Host $host;
proxy_set_header X-Real-IP $remote_addr;
proxy_set_header X-Forwarded-For $proxy_add_x_forwarded_for;
proxy_set_header X-Forwarded-Proto $scheme;
proxy_http_version 1.1;
location ~ ^/(relay|ws-proxy/) {{
set $nb_server netbird-server;
proxy_pass http://$nb_server:80;
proxy_set_header Upgrade $http_upgrade;
proxy_set_header Connection "upgrade";
proxy_read_timeout 1d;
}}
location ~ ^/(api|oauth2)(/|$) {{
# The dashboard is a SPA whose API/OIDC base URL is baked at build time
# to one host:port. A single box is reached via several addresses (LAN
# IP, Tailscale 100.x, hostname), so those fetches are cross-origin and
# the browser blocks them with no Access-Control-Allow-Origin (issue
# #15, observed live on .198). Reflect the caller's Origin so the
# self-hosted management/OIDC API is reachable from any of them, and
# answer the CORS preflight here.
if ($request_method = OPTIONS) {{
add_header Access-Control-Allow-Origin $http_origin always;
add_header Access-Control-Allow-Credentials true always;
add_header Access-Control-Allow-Methods "GET, POST, PUT, PATCH, DELETE, OPTIONS" always;
add_header Access-Control-Allow-Headers "Authorization, Content-Type, Accept" always;
add_header Access-Control-Max-Age 86400 always;
add_header Content-Length 0;
return 204;
}}
add_header Access-Control-Allow-Origin $http_origin always;
add_header Access-Control-Allow-Credentials true always;
add_header Access-Control-Allow-Methods "GET, POST, PUT, PATCH, DELETE, OPTIONS" always;
add_header Access-Control-Allow-Headers "Authorization, Content-Type, Accept" always;
set $nb_server netbird-server;
proxy_pass http://$nb_server:80;
}}
location ~ ^/(signalexchange\.SignalExchange|management\.ManagementService|management\.ProxyService)/ {{
set $nb_server netbird-server;
grpc_pass grpc://$nb_server:80;
grpc_read_timeout 1d;
grpc_send_timeout 1d;
}}
# OIDC callback routes are client-side SPA routes with NO prebuilt page in
# the dashboard bundle, so proxying them straight through 404s, which
# crashes the dashboard's auth init and shows "Unauthenticated" with dead
# buttons (issue #15, confirmed live on .198: /nb-auth + /nb-silent-auth
# returned 404). Serve the dashboard's index.html at these paths (URL
# unchanged) so react-oidc boots and completes the login / silent-SSO.
location ~ ^/(nb-auth|nb-silent-auth) {{
set $nb_dashboard netbird-dashboard;
rewrite ^.*$ /index.html break;
proxy_pass http://$nb_dashboard:80;
}}
location / {{
set $nb_dashboard netbird-dashboard;
proxy_pass http://$nb_dashboard:80;
}}
}}
# Direct server remains available for diagnostics at {server_origin}.
"#
);
tokio::fs::write("/var/lib/archipelago/netbird/nginx.conf", nginx_conf)
.await
.context("Failed to write NetBird nginx.conf")?;
Ok(())
}
async fn detect_netbird_public_host_ip() -> Option<String> {
let output = tokio::process::Command::new("hostname")
.args(["-I"])
.output()
.await
.ok()?;
let stdout = String::from_utf8_lossy(&output.stdout);
let ips: Vec<&str> = stdout
.split_whitespace()
.filter(|s| s.contains('.'))
.collect();
// Prefer the LAN address as the canonical origin — that's what users browse
// to on the local network. Baking the Tailscale 100.x address here broke
// LAN access with cross-origin/redirect mismatches (issue #15). Tailscale
// (100.64.0.0/10 CGNAT) is only a fallback for nodes with no LAN IP.
let is_private_lan = |ip: &str| {
ip.starts_with("192.168.")
|| ip.starts_with("10.")
|| (ip.starts_with("172.")
&& ip
.split('.')
.nth(1)
.and_then(|o| o.parse::<u8>().ok())
.map(|o| (16..=31).contains(&o))
.unwrap_or(false))
};
if let Some(lan) = ips.iter().find(|ip| is_private_lan(ip)) {
return Some(lan.to_string());
}
ips.iter()
.find(|ip| ip.starts_with("100."))
.map(|s| s.to_string())
}
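The address preference described in the comments above (LAN first, Tailscale CGNAT only as a fallback) can be sketched with typed addresses instead of string prefixes. This is a minimal illustration, not the project's code: `pick_canonical_host_ip` and `is_cgnat` are hypothetical names, and unlike the `starts_with("100.")` fallback in the hunk, the sketch only accepts the 100.64.0.0/10 range.

```rust
use std::net::Ipv4Addr;

/// RFC 1918 private ranges: 10/8, 172.16/12, 192.168/16.
fn is_private_lan(ip: Ipv4Addr) -> bool {
    ip.is_private()
}

/// Shared address space 100.64.0.0/10 (the CGNAT range Tailscale uses).
fn is_cgnat(ip: Ipv4Addr) -> bool {
    let [a, b, _, _] = ip.octets();
    a == 100 && (b & 0b1100_0000) == 64
}

/// Hypothetical helper: takes `hostname -I` style output, prefers a LAN
/// address, and falls back to a CGNAT address only if no LAN address exists.
fn pick_canonical_host_ip(hostname_i_output: &str) -> Option<Ipv4Addr> {
    let ips: Vec<Ipv4Addr> = hostname_i_output
        .split_whitespace()
        .filter_map(|s| s.parse::<Ipv4Addr>().ok()) // IPv6 entries are skipped
        .collect();
    ips.iter()
        .copied()
        .find(|ip| is_private_lan(*ip))
        .or_else(|| ips.iter().copied().find(|ip| is_cgnat(*ip)))
}
```

Parsing into `Ipv4Addr` also rejects malformed tokens that a plain `contains('.')` filter would let through.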
#[cfg(test)]
mod tests {
use super::{btcpay_stack_app_ids, mempool_stack_app_ids};


@ -66,7 +66,7 @@ pub struct Config {
/// through Quadlet (`.container` units in ~/.config/containers/systemd
/// + systemctl --user start) instead of `podman create + start`. Default
/// off so the legacy path stays the production path until the harness
/// at tests/lifecycle/run-gate.sh has gone green against the new path
/// at tests/lifecycle/run-20x.sh has gone green against the new path
/// on .228 + .198. See `project_v1_7_52_phase3_quadlet_design`.
#[serde(default)]
pub use_quadlet_backends: bool,
@ -487,7 +487,7 @@ mod tests {
#[test]
fn test_config_use_quadlet_backends_defaults_off() {
// Phase 3.2 of v1.7.52 — the new path stays gated until the 5×
// Phase 3.2 of v1.7.52 — the new path stays gated until the 20×
// harness goes green on .228 and .198. Flipping this default
// ahead of that would route every backend install through code
// we haven't fleet-validated yet.


@ -96,35 +96,6 @@ impl BootReconciler {
}
}
// Companion self-heal runs on its OWN cadence, decoupled from the
// per-app reconcile pass. On a heavily loaded node `reconcile_existing`
// over dozens of apps can take well over a minute, which would delay a
// companion-unit repair (deleted/lost unit file) past any reasonable
// safety window. Detecting + rewriting a companion unit is cheap, so it
// gets a dedicated `interval` loop. The handle is aborted when the main
// loop exits (shutdown uses `notify_one`, so we must NOT add a second
// waiter on `self.shutdown` — it would steal the single wake permit).
let companion_handle = if self.companion_stage {
let orchestrator = self.orchestrator.clone();
let interval = self.interval;
Some(tokio::spawn(async move {
loop {
let installed = orchestrator.manifest_ids().await;
for (companion, err) in crate::container::companion::reconcile(&installed).await
{
tracing::warn!(
companion = %companion,
error = %err,
"companion reconcile failed"
);
}
time::sleep(interval).await;
}
}))
} else {
None
};
// Initial pass: no delay.
self.tick().await;
@ -140,15 +111,23 @@ impl BootReconciler {
}
}
}
if let Some(handle) = companion_handle {
handle.abort();
}
}
async fn tick(&self) {
let report = self.orchestrator.reconcile_existing().await;
Self::log_report(&report);
if !self.companion_stage {
return;
}
let installed = self.orchestrator.manifest_ids().await;
for (companion, err) in crate::container::companion::reconcile(&installed).await {
tracing::warn!(
companion = %companion,
error = %err,
"companion reconcile failed"
);
}
}
fn log_report(report: &ReconcileReport) {


@ -285,15 +285,7 @@ async fn ensure_image_present(spec: &CompanionSpec) -> Result<String> {
async fn image_exists(image: &str) -> bool {
let mut cmd = Command::new("podman");
// Only the exit status matters. WITHOUT a `--format`, `podman image inspect`
// prints the image's full multi-KB manifest JSON; `.status()` inherits the
// service's stdout, so on a hit that whole blob lands in the journal — once
// per companion image, every reconcile pass. That flood spikes journald +
// IO and starves the async runtime (UI websocket then drops → "connection
// lost"/reconnect). Discard the child's stdout/stderr; we read neither.
cmd.args(["image", "inspect", image])
.stdout(std::process::Stdio::null())
.stderr(std::process::Stdio::null());
cmd.args(["image", "inspect", image]);
match tokio::time::timeout(COMPANION_IMAGE_CHECK_TIMEOUT, cmd.status()).await {
Ok(Ok(status)) => status.success(),
Ok(Err(err)) => {


@ -691,37 +691,16 @@ fn extract_lan_address(ports: &[String]) -> Option<String> {
None
}
/// netbird's dashboard launch URL: HTTPS on 8087 (the proxy terminates TLS —
/// the dashboard needs a secure context for OIDC PKCE, issue #15) at the node's
/// primary host IP so it's reachable from the LAN. Manifest-driven netbird no
/// longer writes `dashboard.env`, so this is derived from host facts (the same
/// `{{HOST_IP}}` the orchestrator bakes into the cert/config); it falls back to
/// the static localhost mapping when the host IP can't be read. URL shape is
/// identical to the legacy installer's, so the existing https reachability
/// wrapper still applies.
async fn netbird_configured_launch_url() -> Option<String> {
if let Some(ip) = first_host_ip().await {
return Some(format!("https://{ip}:8087"));
}
PodmanClient::lan_address_for("netbird")
}
/// First address from `hostname -I` — the node's primary host IP. Mirrors the
/// orchestrator's `detect_host_ip` so launch URLs match the cert/config the
/// orchestrator renders for `{{HOST_IP}}`.
async fn first_host_ip() -> Option<String> {
let out = tokio::process::Command::new("hostname")
.arg("-I")
.output()
let env = tokio::fs::read_to_string("/var/lib/archipelago/netbird/dashboard.env")
.await
.ok()?;
if !out.status.success() {
return None;
}
String::from_utf8_lossy(&out.stdout)
.split_whitespace()
.next()
env.lines()
.find_map(|line| line.strip_prefix("NETBIRD_MGMT_API_ENDPOINT="))
.map(str::trim)
.filter(|s| !s.is_empty())
.map(ToOwned::to_owned)
.or_else(|| PodmanClient::lan_address_for("netbird"))
}
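The `dashboard.env` lookup in this hunk is a small key/value scan. A minimal sketch of the same chain as a pure function, assuming the file contents are already in memory (`env_value` is a hypothetical name, not part of the codebase):

```rust
/// Hypothetical helper: return the trimmed, non-empty value of the first
/// `KEY=...` line in env-file text, mirroring the
/// `find_map(strip_prefix) -> trim -> filter(non-empty)` chain above.
fn env_value<'a>(env: &'a str, key: &str) -> Option<&'a str> {
    env.lines()
        .find_map(|line| line.strip_prefix(key)?.strip_prefix('='))
        .map(str::trim)
        .filter(|s| !s.is_empty())
}
```

As in the hunk, an empty value on the first matching line yields `None`, which is what lets the caller fall through to its static fallback.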
async fn reachable_lan_address(app_id: &str, candidate: Option<String>) -> Option<String> {


@ -26,7 +26,7 @@
use anyhow::{Context, Result};
use archipelago_container::{
AppManifest, ContainerRuntime as ContainerRuntimeTrait, ContainerState, ContainerStatus,
Dependency, HostFacts, ManifestError, ResolvedSource, SecretsProvider,
Dependency, GeneratedFile, HostFacts, ManifestError, ResolvedSource, SecretsProvider,
};
use async_trait::async_trait;
use std::collections::{HashMap, HashSet};
@ -294,20 +294,6 @@ async fn chown_for_rootless_container(uid_gid: &str, path: &str) -> Result<()> {
))
}
/// `(container-id, mount-dest)` pairs whose in-container chown returned a hard,
/// permanent failure (e.g. "Operation not permitted" on a mount that can't be
/// re-owned from inside the userns). Remembered for the life of the process so
/// the per-reconcile repair stops re-attempting them — otherwise a single
/// unrepairable mount (observed: mempool-api `/data`) burns CPU + floods the
/// journal on every pass. Keyed by Id so a recreated container retries afresh.
fn unrepairable_ownership() -> &'static std::sync::Mutex<std::collections::HashSet<(String, String)>>
{
static SET: std::sync::OnceLock<
std::sync::Mutex<std::collections::HashSet<(String, String)>>,
> = std::sync::OnceLock::new();
SET.get_or_init(|| std::sync::Mutex::new(std::collections::HashSet::new()))
}
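The process-lifetime "remember permanent failures" set above follows a common `OnceLock<Mutex<HashSet<_>>>` pattern. A minimal standalone sketch of how such a set is recorded and queried; `mark_unrepairable` and `is_unrepairable` are hypothetical wrappers added for illustration:

```rust
use std::collections::HashSet;
use std::sync::{Mutex, OnceLock};

/// Lazily initialised, process-wide set of (container-id, mount-dest) pairs.
fn unrepairable() -> &'static Mutex<HashSet<(String, String)>> {
    static SET: OnceLock<Mutex<HashSet<(String, String)>>> = OnceLock::new();
    SET.get_or_init(|| Mutex::new(HashSet::new()))
}

/// Hypothetical wrapper: record a permanent failure.
/// Returns true only the first time the pair is seen.
fn mark_unrepairable(container_id: &str, dest: &str) -> bool {
    unrepairable()
        .lock()
        .map(|mut s| s.insert((container_id.to_string(), dest.to_string())))
        .unwrap_or(false) // a poisoned lock is treated as "not recorded"
}

/// Hypothetical wrapper: should the repair be skipped for this pair?
/// An empty id never matches, so an inspect failure cannot suppress repairs.
fn is_unrepairable(container_id: &str, dest: &str) -> bool {
    !container_id.is_empty()
        && unrepairable()
            .lock()
            .map(|s| s.contains(&(container_id.to_string(), dest.to_string())))
            .unwrap_or(false)
}
```

Because the key is the container Id rather than its name, a recreated container gets a new Id and is no longer matched by old entries.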
/// App-agnostic, userns-mapping-proof volume-ownership repair for a RUNNING
/// container.
///
@ -346,13 +332,6 @@ async fn ensure_running_container_ownership(name: &str) -> bool {
.filter(|g| !g.is_empty())
.unwrap_or_else(|| uid.clone());
// Stable identity of THIS container instance — used to remember mounts whose
// chown is hard-unrepairable so we stop hammering them every reconcile. Keyed
// by Id (not name) so a recreated container gets a fresh repair attempt.
let cid = podman_stdout(&["inspect", name, "--format", "{{.Id}}"])
.await
.unwrap_or_default();
// Writable bind-mount destinations only.
let dests = match podman_stdout(&[
"inspect",
@ -380,19 +359,6 @@ async fn ensure_running_container_ownership(name: &str) -> bool {
continue;
}
// Known hard-unrepairable for this container instance (a previous chown
// returned a permanent error like "Operation not permitted"). Skip the
// probe+chown entirely — retrying every reconcile only burns CPU and
// floods the journal; it will never succeed for this instance.
if !cid.is_empty()
&& unrepairable_ownership()
.lock()
.map(|s| s.contains(&(cid.clone(), dest.to_string())))
.unwrap_or(false)
{
continue;
}
// Drift check: can the service user write here already?
let probe = format!(
"t=\"{dest}/.archy-wtest.$$\"; touch \"$t\" 2>/dev/null && rm -f \"$t\" 2>/dev/null"
@ -429,21 +395,11 @@ async fn ensure_running_container_ownership(name: &str) -> bool {
"repaired unwritable volume ownership (in-container chown)"
);
}
Ok(o) => {
// Permanent failure (e.g. "Operation not permitted" on a mount
// that simply can't be re-owned from inside the userns). Record
// it so we don't re-attempt every reconcile — log once, loudly.
if !cid.is_empty() {
if let Ok(mut s) = unrepairable_ownership().lock() {
s.insert((cid.clone(), dest.to_string()));
}
}
tracing::warn!(
container = %name, dest,
"volume ownership repair failed (won't retry for this container instance): {}",
String::from_utf8_lossy(&o.stderr).trim()
)
}
Ok(o) => tracing::warn!(
container = %name, dest,
"volume ownership repair failed: {}",
String::from_utf8_lossy(&o.stderr).trim()
),
Err(e) => {
tracing::warn!(container = %name, dest, "volume ownership repair errored: {e}")
}
@ -513,18 +469,7 @@ async fn http_host_port_ready(port: u16, path: &str) -> bool {
}
async fn wait_for_manifest_host_ports(manifest: &AppManifest, timeout_secs: u64) -> Result<()> {
// Only TCP host ports are reachability-probed: the probe is a TCP connect,
// which a UDP/SCTP listener (e.g. netbird's 3478/udp STUN) can never answer,
// so probing it would always "fail" and drive an endless host-port repair
// loop (observed on .228 after netbird's manifest deploy). Default protocol
// (empty) is tcp.
for port in manifest
.app
.ports
.iter()
.filter(|p| matches!(p.protocol.to_ascii_lowercase().as_str(), "" | "tcp"))
.map(|p| p.host)
{
for port in manifest.app.ports.iter().map(|p| p.host) {
let ready = match manifest.app.id.as_str() {
"uptime-kuma" => wait_for_http_host_port(port, "/", timeout_secs).await,
_ => wait_for_host_port(port, timeout_secs).await,
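The protocol filter discussed in this hunk can be exercised in isolation. In the sketch below, `PortSpec` is a hypothetical stand-in for the manifest's port entry (only the `host` and `protocol` fields named in the hunk are modelled), and `tcp_probe_ports` is an illustrative name:

```rust
/// Hypothetical stand-in for a manifest port entry.
struct PortSpec {
    host: u16,
    /// Empty means the default protocol, tcp.
    protocol: String,
}

/// Keep only the host ports a TCP connect probe can meaningfully test.
fn tcp_probe_ports(ports: &[PortSpec]) -> Vec<u16> {
    ports
        .iter()
        .filter(|p| matches!(p.protocol.to_ascii_lowercase().as_str(), "" | "tcp"))
        .map(|p| p.host)
        .collect()
}
```

A UDP listener such as a STUN port never completes a TCP handshake, so leaving it in the probe list would make the readiness wait time out on every pass.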
@ -701,49 +646,6 @@ async fn remove_stale_podman_socket_path(socket_path: &str) {
}
}
/// True when `pid` names a live process (its `/proc/<pid>` entry exists).
/// `pid <= 0` is never alive. (Best-effort: a reused PID can read as alive, but
/// that only delays zombie detection a cycle — it never recreates a healthy one.)
fn pid_is_alive(pid: i32) -> bool {
pid > 0 && Path::new(&format!("/proc/{pid}")).exists()
}
/// Whether the process backing a podman **"running"** container is actually alive.
///
/// Podman trusts its own state DB: if a container's conmon dies without podman
/// observing it (a cgroup-cascade SIGKILL when `archipelago.service` restarts, a
/// crash), `podman ps` keeps reporting the container **"Up"** long after the
/// process is gone — a ZOMBIE. It serves nothing (its port is dead), yet the
/// reconciler NoOps it forever because the state says Running. Verify the
/// recorded main PID is alive so the caller can recreate a zombie rather than
/// trust the stale "running".
///
/// Conservative by design: any uncertainty (inspect failed, PID unparseable)
/// returns `true` (assume alive) so a transient podman hiccup never destroys a
/// healthy container. Only a concrete, dead PID returns `false`.
///
/// Observed live on .228 (2026-06-25): `netbird-dashboard` reported "Up" with
/// `State.Pid` 1394766 already gone → its nginx proxy 502'd → NetBird login
/// broke ("Unauthenticated"). The reconciler never recovered it because the
/// dashboard publishes no host port, so the Running branch had nothing to probe.
async fn container_running_process_alive(name: &str) -> bool {
let out = match tokio::process::Command::new("podman")
.args(["inspect", "--format", "{{.State.Pid}}", name])
.output()
.await
{
Ok(o) if o.status.success() => o,
_ => return true, // can't determine — don't destabilize a healthy app
};
match String::from_utf8_lossy(&out.stdout).trim().parse::<i32>() {
// A genuinely running container always has a supervised PID > 0 whose
// /proc entry exists. A dead PID (or PID <= 0 alongside state "running")
// is the anomaly we're catching.
Ok(pid) => pid_is_alive(pid),
Err(_) => true, // unparseable (older podman / odd output) — assume alive
}
}
async fn wait_for_container_stable_running(
runtime: &dyn ContainerRuntimeTrait,
name: &str,
@ -992,7 +894,7 @@ pub struct ProdContainerOrchestrator {
/// Quadlet `.container` unit and starts it via systemctl --user
/// instead of shelling out to `podman create + start`. Default
/// false so the legacy path remains the production path until the
/// 5× lifecycle harness goes green against the new path.
/// 20× lifecycle harness goes green against the new path.
use_quadlet_backends: bool,
#[cfg(test)]
test_disk_gb: Option<u64>,
@ -1305,11 +1207,6 @@ impl ProdContainerOrchestrator {
async fn reconcile_all_with_mode(&self, mode: ReconcileMode) -> ReconcileReport {
let user_stopped = crate::crash_recovery::load_user_stopped(&self.data_dir).await;
// Durable desired-state signal: the container names that were running at
// the last periodic snapshot. Used below to recreate a previously-running
// app whose container vanished (e.g. a wedged teardown cleared by a
// reboot) instead of leaving it down. See the immich .198 incident.
let was_running = crate::crash_recovery::load_last_running_names(&self.data_dir).await;
let manifests: Vec<LoadedManifest> = {
let state = self.state.read().await;
let dependency_required = dependency_manifests_required_by_active_apps(
@ -1343,34 +1240,6 @@ impl ProdContainerOrchestrator {
continue;
}
match self.ensure_running_with_mode(&lm, mode).await {
// Desired-state recovery: the app has no container and was left
// "absent" by boot reconcile, BUT it was running at the last
// snapshot — so its container vanished unexpectedly (a wedged
// teardown cleared by a reboot, a lost container record after a
// crash). It isn't user-stopped (those are filtered out of
// `manifests` above) and it's still installed (manifest present),
// so recreate it rather than leave a previously-running app down.
// Match is exact: compute_container_name == the snapshot's podman
// name (incl. each stack member), so no false positives. The only
// "absent" Left reason is the optional-missing case, so this never
// fires for paused/unknown states.
Ok(ReconcileAction::Left(reason))
if mode == ReconcileMode::ExistingOnly
&& reason == "absent"
&& was_running.contains(&compute_container_name(&lm.manifest)) =>
{
tracing::warn!(
app_id = %app_id,
"previously-running app has no container after boot — recreating (desired-state recovery)"
);
match self.install_fresh(&lm).await {
Ok(()) => report.record(&app_id, ReconcileAction::Installed),
Err(e) => {
tracing::error!(app_id = %app_id, error = %e, "desired-state recovery (recreate) failed");
report.failures.push((app_id, e.to_string()));
}
}
}
Ok(action) => report.record(&app_id, action),
Err(e) => {
tracing::error!(app_id = %app_id, error = %e, "reconcile failed");
@ -1457,27 +1326,6 @@ impl ProdContainerOrchestrator {
self.resolve_dynamic_env(&mut resolved_manifest)?;
let name = compute_container_name(&lm.manifest);
// An explicitly user-stopped app MUST stay stopped. The reconcile filter
// already drops user-stopped apps, but its `dependency_required` override
// re-includes a stopped app that an *active* app depends on (e.g. mempool
// keeps electrumx in the list), and the in-memory `disabled` set is wiped
// on manifest reload — so reconcile would resurrect it: its now-unreachable
// ports look like a fault, the host-port "repair" restarts it, and
// package.stop never sticks. Honour the on-disk marker here, the single
// choke point every reconcile flows through. Explicit install/start/restart
// clear the marker BEFORE calling this, so they are unaffected.
{
let user_stopped = crate::crash_recovery::load_user_stopped(&self.data_dir).await;
if user_stopped.contains(&app_id) || user_stopped.contains(&name) {
tracing::debug!(
app_id = %app_id,
container = %name,
"reconcile skipped — app is user-stopped (must stay stopped)"
);
return Ok(ReconcileAction::Left("user-stopped".into()));
}
}
match self.runtime.get_container_status(&name).await {
Ok(status) => {
// Phase 3.3: migrate pre-Phase-3 containers in place, but only
@ -1493,26 +1341,6 @@ impl ProdContainerOrchestrator {
}
match status.state {
ContainerState::Running => {
// Zombie guard: podman can report a container "running"
// after its process has died (conmon SIGKILLed in a
// cgroup cascade on archipelago restart, etc.). Such a
// container serves nothing yet would be NoOp'd forever.
// Recreate it from the manifest. This is the ONLY path
// that recovers a dead dependency with no published host
// port (netbird-dashboard on .228, 2026-06-25 — stale
// "Up" → proxy 502 → NetBird login broke). Conservative:
// only fires on a concrete dead PID, never on uncertainty.
if !container_running_process_alive(&name).await {
tracing::warn!(
app_id = %app_id,
container = %name,
"container reported running but its process is dead (zombie) — recreating"
);
let _ = self.runtime.stop_container(&name).await;
let _ = self.runtime.remove_container(&name).await;
self.install_fresh(lm).await?;
return Ok(ReconcileAction::Installed);
}
// App-specific hooks get a chance to refresh bind-mounted
// config. bitcoin-ui: re-render nginx.conf if the RPC
// password rotated (or template changed via OTA). If
@ -1889,7 +1717,7 @@ impl ProdContainerOrchestrator {
} else {
self.remove_quadlet_unit_if_present(&name).await?;
ensure_user_podman_socket().await?;
// Legacy path. Production until tests/lifecycle/run-gate.sh
// Legacy path. Production until tests/lifecycle/run-20x.sh
// goes green against the Quadlet path.
self.runtime
.create_container(&resolved_manifest, &name, 0)
@ -1960,9 +1788,6 @@ impl ProdContainerOrchestrator {
self.run_pre_start_hooks(&manifest.app.id).await?;
self.ensure_bind_mount_sockets(manifest).await?;
self.ensure_bind_mount_dirs(manifest).await?;
// Certs before files: a templated file may not need the cert, but the
// container's bind-mounts expect both present before create_container.
self.ensure_manifest_certs(manifest).await?;
self.ensure_manifest_files(manifest).await?;
self.apply_data_uid(manifest).await?;
self.run_post_data_uid_hooks(&manifest.app.id).await?;
@ -2870,10 +2695,6 @@ impl ProdContainerOrchestrator {
continue;
}
// Whether the bind source already existed BEFORE we (root) create it,
// so the ownership fix-up below only touches a dir we just made.
let source_existed = Path::new(&volume.source).exists();
let mkdir_status = host_sudo(&["mkdir", "-p", &volume.source])
.await
.with_context(|| format!("mkdir {}", volume.source))?;
@ -2884,43 +2705,6 @@ impl ProdContainerOrchestrator {
mkdir_status.code()
));
}
// A bind dir we JUST created is owned root:root (mkdir ran via sudo).
// An app that declares no `data_uid` runs as its own root inside the
// container, which rootless Podman maps to the host user running
// archipelago — so a root:root dir is UNWRITABLE from inside and the
// app EACCES-crash-loops the moment it tries to create a subdir
// (observed: immich upload dir `/var/lib/archipelago/immich` after a
// recreate). The in-container ownership self-heal only runs on RUNNING
// containers, so it never fires for an app that crashes on startup.
// Match the new dir to its parent's owner — the rootless data root
// (`/var/lib/archipelago`, owned by the service user) — via
// `--reference`, so there's no host-uid guessing. Only on fresh
// creation, and only when apply_data_uid won't already chown it.
if !source_existed && manifest.app.container.data_uid.is_none() {
if let Some(parent) = Path::new(&volume.source)
.parent()
.map(|p| p.display().to_string())
{
match host_sudo(&[
"chown",
&format!("--reference={parent}"),
&volume.source,
])
.await
{
Ok(s) if s.success() => {}
Ok(s) => tracing::warn!(
app_id = %manifest.app.id, dir = %volume.source,
"bind-dir ownership match exited {:?} (app may EACCES)", s.code()
),
Err(e) => tracing::warn!(
app_id = %manifest.app.id, dir = %volume.source,
"bind-dir ownership match failed (non-fatal): {e}"
),
}
}
}
}
Ok(())
}
@ -2945,14 +2729,7 @@ impl ProdContainerOrchestrator {
async fn ensure_manifest_files(&self, manifest: &AppManifest) -> Result<HookOutcome> {
let mut outcome = HookOutcome::Unchanged;
for file in &manifest.app.files {
// Render templated placeholders before comparing/writing so the
// idempotency check is against the FINAL bytes (not the template),
// otherwise a rendered file would be rewritten every reconcile.
let rendered = self
.render_file_placeholders(manifest, &file.content)
.await
.with_context(|| format!("rendering manifest file {}", file.path))?;
if ensure_rendered_file(&file.path, &rendered, file.overwrite)
if ensure_generated_file(file)
.await
.with_context(|| format!("ensure manifest file {}", file.path))?
== HookOutcome::Rewritten
@ -2962,186 +2739,23 @@ impl ProdContainerOrchestrator {
}
Ok(outcome)
}
/// Substitute the allow-listed placeholders a manifest `GeneratedFile` may
/// carry. Keeps runtime-derived config (netbird's `config.yaml`/`nginx.conf`)
/// declarative instead of generated by per-app Rust:
/// - `{{HOST_IP}}` / `{{HOST_MDNS}}` — host facts (`hostname -I` / `.local`).
/// - `{{NETWORK_GATEWAY}}` — the gateway of the app's podman network, i.e.
/// aardvark's DNS address. nginx uses it as an explicit `resolver` so it
/// re-resolves container names per request instead of pinning a stale IP
/// and 502-ing after a restart/reboot (issue #15). The network is ensured
/// to exist first so the gateway is readable on a fresh install (this runs
/// before `install_fresh`'s own `ensure_container_network`; both idempotent).
/// - `{{secret:NAME}}` — a `0600` secret read from the service-owned secrets
/// dir (e.g. netbird's base64 relay/store keys). NEVER logged.
async fn render_file_placeholders(
&self,
manifest: &AppManifest,
content: &str,
) -> Result<String> {
let mut out = content.to_string();
if out.contains("{{HOST_IP}}") || out.contains("{{HOST_MDNS}}") {
let facts = self.detect_host_facts();
out = out
.replace("{{HOST_IP}}", &facts.host_ip)
.replace("{{HOST_MDNS}}", &facts.host_mdns);
}
if out.contains("{{NETWORK_GATEWAY}}") {
self.ensure_container_network(manifest).await?;
let gw = self.network_gateway(manifest).await?;
out = out.replace("{{NETWORK_GATEWAY}}", &gw);
}
out = self.render_secret_placeholders(&out).await?;
Ok(out)
}
/// Replace every `{{secret:NAME}}` with the trimmed contents of
/// `<secrets_dir>/NAME`. `NAME` must be a bare filename (the same safety bar
/// as `secret_env`). The secret value is never placed in an error or log.
async fn render_secret_placeholders(&self, content: &str) -> Result<String> {
const OPEN: &str = "{{secret:";
let mut out = String::with_capacity(content.len());
let mut rest = content;
while let Some(start) = rest.find(OPEN) {
out.push_str(&rest[..start]);
let after = &rest[start + OPEN.len()..];
let end = after
.find("}}")
.ok_or_else(|| anyhow::anyhow!("unterminated {{secret:...}} placeholder"))?;
let name = &after[..end];
if name.is_empty() || name.contains('/') || name.contains("..") {
anyhow::bail!("invalid secret placeholder name '{name}' (must be a bare filename)");
}
let value = tokio::fs::read_to_string(self.secrets_dir.join(name))
.await
.map_err(|_| {
// Do not surface the path-with-value or io detail beyond the name.
anyhow::anyhow!("secret '{name}' referenced by a manifest file is missing")
})?;
out.push_str(value.trim());
rest = &after[end + 2..];
}
out.push_str(rest);
Ok(out)
}
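To make the placeholder rules above easy to test without touching the filesystem, the same scan can be run against an in-memory map. This is a sketch under that assumption: the `HashMap` lookup replaces the secrets-dir read, and errors are plain strings rather than `anyhow` errors.

```rust
use std::collections::HashMap;

/// Sketch: substitute every `{{secret:NAME}}` from an in-memory map.
/// NAME must be a bare filename; error messages carry the name, never the value.
fn render_secret_placeholders(
    content: &str,
    secrets: &HashMap<String, String>,
) -> Result<String, String> {
    const OPEN: &str = "{{secret:";
    let mut out = String::with_capacity(content.len());
    let mut rest = content;
    while let Some(start) = rest.find(OPEN) {
        out.push_str(&rest[..start]);
        let after = &rest[start + OPEN.len()..];
        let end = after
            .find("}}")
            .ok_or_else(|| "unterminated secret placeholder".to_string())?;
        let name = &after[..end];
        if name.is_empty() || name.contains('/') || name.contains("..") {
            return Err(format!("invalid secret placeholder name '{name}'"));
        }
        let value = secrets
            .get(name)
            .ok_or_else(|| format!("secret '{name}' is missing"))?;
        out.push_str(value.trim());
        rest = &after[end + 2..];
    }
    out.push_str(rest);
    Ok(out)
}
```

Rejecting `/` and `..` in the name is what keeps a manifest from reading files outside the secrets directory.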
/// The gateway IP of the app's podman network — aardvark's DNS resolver
/// address. (Generalised from the old per-app netbird resolver helper,
/// deleted in #20 ph4.) Falls back to
/// podman's usual first-pool gateway if the inspect can't be parsed (the
/// network was just ensured to exist, so this is a belt-and-braces default).
async fn network_gateway(&self, manifest: &AppManifest) -> Result<String> {
let network = manifest
.app
.container
.network
.as_deref()
.filter(|n| !n.is_empty() && !is_builtin_network_mode(n))
.ok_or_else(|| {
anyhow::anyhow!("{{NETWORK_GATEWAY}} used but app has no dedicated network")
})?;
let out = tokio::process::Command::new("podman")
.args([
"network",
"inspect",
network,
"--format",
"{{range .Subnets}}{{.Gateway}}{{end}}",
])
.output()
.await
.with_context(|| format!("inspecting podman network {network} for gateway"))?;
let gw = String::from_utf8_lossy(&out.stdout).trim().to_string();
if !gw.is_empty() && gw.parse::<std::net::IpAddr>().is_ok() {
return Ok(gw);
}
tracing::warn!(
network,
"could not read network gateway; falling back to 10.89.0.1"
);
Ok("10.89.0.1".to_string())
}
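The parse-or-fallback step at the end of `network_gateway` can be sketched as a pure function over the inspect output. `gateway_or_fallback` is a hypothetical name; the fallback address is the one used in the hunk.

```rust
use std::net::IpAddr;

const FALLBACK_GATEWAY: &str = "10.89.0.1";

/// Accept the inspect output only if it is exactly one IP address;
/// anything else (empty, garbage, several gateways run together) falls back.
fn gateway_or_fallback(inspect_stdout: &str) -> String {
    let gw = inspect_stdout.trim();
    if gw.parse::<IpAddr>().is_ok() {
        gw.to_string()
    } else {
        FALLBACK_GATEWAY.to_string()
    }
}
```

One consequence worth noting: a format template that ranges over subnets with no separator would emit concatenated gateways for a network with more than one subnet, which fails the parse and silently takes the fallback path.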
/// Materialise manifest-declared self-signed TLS certs before the container
/// is created (so a bind-mounted cert path resolves to a real file). Skips an
/// entry whose crt+key already exist (idempotent / data-preserving). CN and
/// SAN templates are rendered against host facts; when omitted they default
/// to the node's host IP plus `127.0.0.1`/`localhost` so the cert is valid
/// however the box is reached locally. (Generalised from the old per-app
/// netbird TLS helper, deleted in #20 ph4: rsa:2048, 10-year, no per-app Rust.)
async fn ensure_manifest_certs(&self, manifest: &AppManifest) -> Result<()> {
let facts = self.detect_host_facts();
let render = |s: &str| {
s.replace("{{HOST_IP}}", &facts.host_ip)
.replace("{{HOST_MDNS}}", &facts.host_mdns)
};
for cert in &manifest.app.container.generated_certs {
if tokio::fs::metadata(&cert.crt).await.is_ok()
&& tokio::fs::metadata(&cert.key).await.is_ok()
{
continue;
}
if let Some(parent) = Path::new(&cert.crt).parent() {
create_dir_all_or_sudo(parent).await?;
}
if let Some(parent) = Path::new(&cert.key).parent() {
create_dir_all_or_sudo(parent).await?;
}
let cn = render(cert.common_name.as_deref().unwrap_or("{{HOST_IP}}"));
let san = if cert.sans.is_empty() {
format!("IP:{},IP:127.0.0.1,DNS:localhost", facts.host_ip)
} else {
cert.sans
.iter()
.map(|s| render(s))
.collect::<Vec<_>>()
.join(",")
};
let status = tokio::process::Command::new("openssl")
.args([
"req",
"-x509",
"-newkey",
"rsa:2048",
"-nodes",
"-keyout",
&cert.key,
"-out",
&cert.crt,
"-days",
"3650",
"-subj",
&format!("/CN={cn}"),
"-addext",
&format!("subjectAltName={san}"),
])
.status()
.await
.with_context(|| format!("running openssl for manifest cert {}", cert.crt))?;
if !status.success() {
anyhow::bail!("openssl failed to generate manifest cert {}", cert.crt);
}
}
Ok(())
}
}
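The SAN defaulting described for `ensure_manifest_certs` can be checked without running `openssl`. A minimal sketch, assuming host facts are passed in as plain strings (`render` and `subject_alt_name` are illustrative free functions, not the project's API):

```rust
/// Substitute the two host-fact placeholders in a CN/SAN template.
fn render(template: &str, host_ip: &str, host_mdns: &str) -> String {
    template
        .replace("{{HOST_IP}}", host_ip)
        .replace("{{HOST_MDNS}}", host_mdns)
}

/// Build the `subjectAltName` value: explicit entries are rendered and
/// comma-joined; an empty list defaults to host IP + loopback + localhost.
fn subject_alt_name(sans: &[&str], host_ip: &str, host_mdns: &str) -> String {
    if sans.is_empty() {
        format!("IP:{host_ip},IP:127.0.0.1,DNS:localhost")
    } else {
        sans.iter()
            .map(|s| render(s, host_ip, host_mdns))
            .collect::<Vec<_>>()
            .join(",")
    }
}
```

The resulting string is what would follow `subjectAltName=` in the `-addext` argument shown in the hunk.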
async fn ensure_rendered_file(path: &str, content: &str, overwrite: bool) -> Result<HookOutcome> {
let p = Path::new(path);
if let Ok(existing) = tokio::fs::read_to_string(p).await {
if existing == content || !overwrite {
async fn ensure_generated_file(file: &GeneratedFile) -> Result<HookOutcome> {
let path = Path::new(&file.path);
if let Ok(existing) = tokio::fs::read_to_string(path).await {
if existing == file.content || !file.overwrite {
return Ok(HookOutcome::Unchanged);
}
} else if p.exists() && !overwrite {
} else if path.exists() && !file.overwrite {
return Ok(HookOutcome::Unchanged);
}
let parent = p
let parent = path
.parent()
.ok_or_else(|| anyhow::anyhow!("generated file path has no parent: {}", path))?;
.ok_or_else(|| anyhow::anyhow!("generated file path has no parent: {}", file.path))?;
create_dir_all_or_sudo(parent).await?;
write_generated_file_atomically(p, content).await?;
write_generated_file_atomically(path, &file.content).await?;
Ok(HookOutcome::Rewritten)
}
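The idempotency rule in this function reduces to a small decision table. The sketch below separates that decision from the I/O so it can be tested directly; `OnDisk` and `decide` are hypothetical names, and the local `HookOutcome` is a stand-in for the project's type of the same name.

```rust
/// Local stand-in for the project's `HookOutcome`.
#[derive(Debug, PartialEq)]
enum HookOutcome {
    Unchanged,
    Rewritten,
}

/// What the filesystem check found at the target path.
enum OnDisk<'a> {
    Missing,
    Unreadable, // exists, but could not be read as UTF-8 text
    Text(&'a str),
}

/// Decide whether the file must be (re)written.
fn decide(on_disk: OnDisk<'_>, desired: &str, overwrite: bool) -> HookOutcome {
    match on_disk {
        // Same bytes, or the manifest asked us not to clobber an existing file.
        OnDisk::Text(current) if current == desired || !overwrite => HookOutcome::Unchanged,
        OnDisk::Unreadable if !overwrite => HookOutcome::Unchanged,
        // Missing, drifted with overwrite=true, or unreadable with overwrite=true.
        _ => HookOutcome::Rewritten,
    }
}
```

Comparing against the final bytes is what keeps a reconcile pass from rewriting an unchanged file every time.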
@ -3225,11 +2839,6 @@ impl ContainerOrchestrator for ProdContainerOrchestrator {
let mut state = self.state.write().await;
state.disabled.remove(app_id);
}
// Installing is an explicit "I want this running" action — clear the
// user-stopped marker so the new reconcile guard in
// `ensure_running_with_mode` doesn't skip the very container we're
// installing. (start/restart RPC handlers clear it on their side too.)
crate::crash_recovery::clear_user_stopped(&self.data_dir, app_id).await;
// Idempotent: if the container is already up and healthy, just
// refresh hooks and return. If it's stopped, start it. If it's
// missing or in a wedged state, install fresh.
@ -3273,10 +2882,6 @@ impl ContainerOrchestrator for ProdContainerOrchestrator {
let mut state = self.state.write().await;
state.disabled.remove(app_id);
}
// Explicit start clears the user-stopped marker so the reconcile guard in
// `ensure_running_with_mode` doesn't skip this container (symmetric with
// install; the start/restart RPC handlers also clear it).
crate::crash_recovery::clear_user_stopped(&self.data_dir, app_id).await;
let lm = self.loaded(app_id).await?;
let action = self.ensure_running(&lm).await?;
match action {
@ -4892,17 +4497,4 @@ app:
)
);
}
#[test]
fn pid_is_alive_detects_live_and_dead_pids() {
// Our own process is alive.
assert!(pid_is_alive(std::process::id() as i32));
// Non-positive PIDs are never alive (a "running" container with PID 0 is
// exactly the zombie case).
assert!(!pid_is_alive(0));
assert!(!pid_is_alive(-1));
// A PID far above the kernel's pid_max can't name a live process, so the
// zombie guard reports it dead → the reconciler recreates.
assert!(!pid_is_alive(2_000_000_000));
}
}


@ -581,12 +581,11 @@ pub async fn write_if_changed(unit: &QuadletUnit, dir: &Path) -> Result<bool> {
/// Reload the user systemd manager. Required after any quadlet write
/// or removal so systemd picks up the generated `.service` translation.
pub async fn daemon_reload_user() -> Result<()> {
// Bounded: a wedged user manager (e.g. a unit stuck "deactivating" while
// podman hangs) could otherwise block daemon-reload indefinitely and freeze
// any caller — notably uninstall teardown.
let status = systemctl_user_status(&["daemon-reload"], Duration::from_secs(30))
let status = Command::new("systemctl")
.args(["--user", "daemon-reload"])
.status()
.await
.context("systemctl --user daemon-reload")?;
.context("spawn systemctl --user daemon-reload")?;
if !status.success() {
return Err(anyhow!("systemctl --user daemon-reload exited {status}"));
}
@ -788,19 +787,11 @@ fn directive_values(unit_body: &str, prefix: &str) -> Vec<String> {
/// that systemd no longer knows about.
pub async fn disable_remove(unit_name: &str, dir: &Path) -> Result<()> {
let svc = format!("{unit_name}.service");
// Stop first; ignore failure (unit may already be down). BOUNDED — on
// rootless podman a generated unit can wedge in "deactivating" while
// `podman rm -f` hangs underneath it, and an unbounded `systemctl stop`
// would block the entire uninstall forever: the progress bar freezes and
// the package entry is stranded in `Removing` (a ghost in My Apps that also
// blocks reinstall). If the graceful stop times out, escalate to
// SIGKILL + reset-failed so teardown always proceeds.
if systemctl_user_status(&["stop", &svc], QUADLET_STOP_TIMEOUT)
.await
.is_err()
{
let _ = kill_and_reset_service(&svc).await;
}
// Stop first; ignore failure (unit may already be down).
let _ = Command::new("systemctl")
.args(["--user", "stop", &svc])
.status()
.await;
let path = dir.join(format!("{unit_name}.container"));
if fs::try_exists(&path).await.unwrap_or(false) {
match fs::remove_file(&path).await {
@ -811,15 +802,10 @@ pub async fn disable_remove(unit_name: &str, dir: &Path) -> Result<()> {
}
daemon_reload_user().await.ok();
// Defensive: kill the actual container too, in case quadlet left it.
// Bounded so a hung podman store can't re-introduce the stall this function
// exists to avoid.
let _ = tokio::time::timeout(
QUADLET_STOP_TIMEOUT,
Command::new("podman")
.args(["rm", "-f", unit_name])
.status(),
)
.await;
let _ = Command::new("podman")
.args(["rm", "-f", unit_name])
.status()
.await;
Ok(())
}
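The bounded-teardown idea in this hunk (never let a hung `systemctl` or `podman` call block the caller forever) can be illustrated without an async runtime. The sketch below swaps in a different technique from the hunk: it polls `try_wait` on a blocking `std::process::Child` instead of wrapping a future in `tokio::time::timeout`. `status_with_timeout` is a hypothetical helper.

```rust
use std::io;
use std::process::{Command, ExitStatus, Stdio};
use std::thread;
use std::time::{Duration, Instant};

/// Run a command with its output discarded and wait at most `limit`.
/// Returns Ok(Some(status)) on exit, Ok(None) if the child was killed on timeout,
/// and Err if the process could not be spawned or waited on.
fn status_with_timeout(cmd: &mut Command, limit: Duration) -> io::Result<Option<ExitStatus>> {
    // Only the exit status matters, so the child's stdout/stderr go to /dev/null
    // rather than being inherited by the parent.
    let mut child = cmd.stdout(Stdio::null()).stderr(Stdio::null()).spawn()?;
    let deadline = Instant::now() + limit;
    loop {
        if let Some(status) = child.try_wait()? {
            return Ok(Some(status));
        }
        if Instant::now() >= deadline {
            let _ = child.kill(); // escalate: the graceful wait ran out
            let _ = child.wait(); // reap so no zombie process is left behind
            return Ok(None);
        }
        thread::sleep(Duration::from_millis(50));
    }
}
```

A caller would treat `Ok(None)` the way the hunk treats a timed-out stop: log it and continue with teardown rather than stalling. Inside an async service the non-blocking `tokio::time::timeout` form is the better fit, since this version occupies a thread while it polls.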


@ -66,7 +66,6 @@ fn ensure_one(dir: &Path, gs: &GeneratedSecret) -> Result<()> {
match gs.kind {
SecretGenKind::Hex16 => write_secret(&dir.join(&gs.name), &random_hex(16))?,
SecretGenKind::Hex32 => write_secret(&dir.join(&gs.name), &random_hex(32))?,
SecretGenKind::Base64 => write_secret(&dir.join(&gs.name), &random_base64(32))?,
SecretGenKind::Bcrypt => {
let password = random_hex(BCRYPT_PASSWORD_BYTES);
let hash = bcrypt::hash(&password, bcrypt::DEFAULT_COST)
@ -93,15 +92,6 @@ fn random_hex(bytes: usize) -> String {
hex::encode(buf)
}
/// `bytes` of entropy, standard base64 (with padding). For keys that a service
/// base64-decodes to recover the raw bytes (e.g. netbird's store encryptionKey).
fn random_base64(bytes: usize) -> String {
use base64::Engine as _;
let mut buf = vec![0u8; bytes];
rand::thread_rng().fill_bytes(&mut buf);
base64::engine::general_purpose::STANDARD.encode(buf)
}
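
To illustrate why the encoding matters for a service that base64-decodes its key, here is a minimal sketch that encodes the same 32 bytes both ways. It is written against std only, so `to_hex` and `to_base64` are hypothetical stand-ins for the `hex` and `base64` crates the real code uses.

```rust
/// Lowercase hex: two characters per byte.
fn to_hex(bytes: &[u8]) -> String {
    bytes.iter().map(|b| format!("{b:02x}")).collect()
}

/// Standard base64 with `=` padding (RFC 4648 alphabet).
fn to_base64(bytes: &[u8]) -> String {
    const ALPHABET: &[u8; 64] =
        b"ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz0123456789+/";
    let mut out = String::with_capacity((bytes.len() + 2) / 3 * 4);
    for chunk in bytes.chunks(3) {
        // Pack up to three bytes into a 24-bit group, zero-filling the tail.
        let b0 = chunk[0] as u32;
        let b1 = *chunk.get(1).unwrap_or(&0) as u32;
        let b2 = *chunk.get(2).unwrap_or(&0) as u32;
        let n = (b0 << 16) | (b1 << 8) | b2;
        out.push(ALPHABET[(n >> 18) as usize & 63] as char);
        out.push(ALPHABET[(n >> 12) as usize & 63] as char);
        out.push(if chunk.len() > 1 { ALPHABET[(n >> 6) as usize & 63] as char } else { '=' });
        out.push(if chunk.len() > 2 { ALPHABET[n as usize & 63] as char } else { '=' });
    }
    out
}
```

For 32 bytes of entropy, hex yields 64 characters and padded base64 yields 44. A consumer that base64-decodes a 64-character hex string would recover 48 unrelated bytes rather than the original 32.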
/// Atomically write a `0600` secret: a temp file in the same dir (so the rename
/// is atomic), fsynced, then renamed over the target.
fn write_secret(path: &Path, value: &str) -> Result<()> {


@ -61,22 +61,6 @@ pub async fn load_user_stopped(data_dir: &Path) -> std::collections::HashSet<Str
}
}
/// Names of the containers that were running at the last periodic snapshot
/// (`running-containers.json`, saved every ~120s by `save_container_snapshot`).
/// Unlike `check_for_crash`, this reads the snapshot unconditionally (no PID/crash
/// gate) — it's the durable "what was running" signal the boot reconciler uses to
/// recreate a previously-running app whose container vanished. Empty if absent.
pub async fn load_last_running_names(data_dir: &Path) -> std::collections::HashSet<String> {
let path = data_dir.join(CONTAINER_STATE_FILE);
match fs::read_to_string(&path).await {
Ok(content) => match serde_json::from_str::<ContainerSnapshot>(&content) {
Ok(snapshot) => snapshot.containers.into_iter().map(|c| c.name).collect(),
Err(_) => std::collections::HashSet::new(),
},
Err(_) => std::collections::HashSet::new(),
}
}
/// Save the set of user-stopped containers to disk.
pub async fn save_user_stopped(data_dir: &Path, stopped: &std::collections::HashSet<String>) {
let path = data_dir.join(USER_STOPPED_FILE);
@ -914,43 +898,6 @@ mod tests {
assert_eq!(containers[1].name, "archy-mempool-web");
}
#[tokio::test]
async fn test_load_last_running_names_reads_snapshot_without_pid_gate() {
let tmp = TempDir::new().unwrap();
// No PID file written — load_last_running_names must NOT require a crash.
let snapshot = ContainerSnapshot {
timestamp: 1000,
containers: vec![
RunningContainerRecord {
name: "immich_server".to_string(),
image: "immich:2.7".to_string(),
},
RunningContainerRecord {
name: "immich_postgres".to_string(),
image: "postgres:16".to_string(),
},
],
};
fs::write(
tmp.path().join(CONTAINER_STATE_FILE),
serde_json::to_string(&snapshot).unwrap(),
)
.await
.unwrap();
let names = load_last_running_names(tmp.path()).await;
assert_eq!(names.len(), 2);
assert!(names.contains("immich_server"));
assert!(names.contains("immich_postgres"));
assert!(!names.contains("immich_redis"));
}
#[tokio::test]
async fn test_load_last_running_names_empty_when_absent() {
let tmp = TempDir::new().unwrap();
assert!(load_last_running_names(tmp.path()).await.is_empty());
}
#[tokio::test]
async fn test_write_and_remove_pid_marker() {
let tmp = TempDir::new().unwrap();


@ -198,24 +198,6 @@ async fn main() -> Result<()> {
(Some(trait_obj), Some(dev))
} else {
let prod = Arc::new(ProdContainerOrchestrator::new(config.clone()).await?);
// Pull the freshest signed app-catalog BEFORE loading manifests, so any
// registry-embedded manifest (the origin-wins overlay in load_manifests)
// is in place on THIS boot — not a restart later. Without this the boot
// would overlay the previous run's cached catalog and a newly-published
// app (e.g. a registry-only install) wouldn't appear until the next
// restart. Bounded + best-effort: on timeout/unreachable origin the
// last-cached catalog (or the disk manifests) still load — registry is
// an overlay on top of disk, never a hard dependency.
match tokio::time::timeout(
std::time::Duration::from_secs(25),
crate::container::app_catalog::refresh_catalog(&config.data_dir),
)
.await
{
Ok(Ok(n)) => info!("🛰️ app-catalog refreshed before manifest load ({n} apps)"),
Ok(Err(e)) => tracing::debug!("app-catalog pre-load refresh failed (using cache): {e}"),
Err(_) => tracing::debug!("app-catalog pre-load refresh timed out (using cache)"),
}
// Best-effort manifest load; a missing /opt/archipelago/apps is
// logged inside load_manifests and not fatal.
match prod.load_manifests().await {


@ -8,9 +8,8 @@ pub mod runtime;
pub use bitcoin_simulator::{BitcoinSimulationMode, BitcoinSimulator};
pub use health_monitor::HealthMonitor;
pub use manifest::{
AppInterface, AppManifest, BuildConfig, ContainerConfig, Dependency, DerivedEnv, GeneratedCert,
GeneratedFile, GeneratedSecret, HealthCheck, HookStep, HostCopy, HostFacts, LifecycleHooks,
ManifestError,
AppInterface, AppManifest, BuildConfig, ContainerConfig, Dependency, DerivedEnv, GeneratedFile,
GeneratedSecret, HealthCheck, HookStep, HostCopy, HostFacts, LifecycleHooks, ManifestError,
ResolvedSource, ResourceLimits, SecretEnv, SecretGenKind, SecretsProvider, SecurityPolicy,
Volume,
};


@ -223,19 +223,6 @@ pub struct ContainerConfig {
#[serde(default)]
pub generated_secrets: Vec<GeneratedSecret>,
/// Self-signed TLS certificates the orchestrator materialises before the
/// container is created (so a bind-mounted cert path resolves to a real
/// file, not a stale/missing path). Like `generated_secrets`, this keeps an
/// app data-driven: a service that needs a secure context (e.g. netbird's
/// dashboard — OIDC PKCE / `window.crypto.subtle` only works over HTTPS,
/// issue #15) declares the cert here instead of relying on per-app Rust.
/// Idempotent: an entry whose `crt` and `key` already exist is left
/// untouched. SAN/CN templates are rendered against host facts at apply time.
///
/// Example: `- { crt: /var/lib/archipelago/netbird/tls.crt, key: /var/lib/archipelago/netbird/tls.key }`
#[serde(default)]
pub generated_certs: Vec<GeneratedCert>,
/// Rootless-mapped UID:GID applied to the container's data directory
/// (the `bind`-mounted host path with `target` inside the container's
/// data root) before creation. Mirrors `SPEC_DATA_UID`.
@ -274,11 +261,6 @@ pub enum SecretGenKind {
Hex16,
/// 32 random bytes, lowercase hex (64 chars). Longer keys/cookies.
Hex32,
/// 32 random bytes, standard base64 (44 chars incl. padding). For services
/// that require a base64-encoded key rather than hex — e.g. netbird's relay
/// `authSecret` and the SQLite store `encryptionKey`, which base64-decode
/// their configured value (hex would decode to the wrong bytes).
Base64,
/// A random password and its bcrypt hash. `<name>` holds the bcrypt hash
/// (what a server is configured with); the plaintext is stored alongside as
/// `<name>.pw` for any client that must authenticate. `secret_env` injects
@ -300,31 +282,12 @@ impl GeneratedSecret {
/// (primary first). A consumer references one of these via `secret_env`.
pub fn target_files(&self) -> Vec<String> {
match self.kind {
SecretGenKind::Hex16 | SecretGenKind::Hex32 | SecretGenKind::Base64 => {
vec![self.name.clone()]
}
SecretGenKind::Hex16 | SecretGenKind::Hex32 => vec![self.name.clone()],
SecretGenKind::Bcrypt => vec![self.name.clone(), format!("{}.pw", self.name)],
}
}
}
/// A self-signed TLS certificate materialised by the orchestrator. See
/// [`ContainerConfig::generated_certs`]. `crt`/`key` are absolute host paths
/// (typically under `/var/lib/archipelago/<app>/`) that the container
/// bind-mounts read-only. `common_name` and `sans` are rendered against host
/// facts (`{{HOST_IP}}`) at apply time; when omitted they default to the
/// node's host IP plus `IP:127.0.0.1,DNS:localhost` so the cert is valid for
/// however the box is reached locally.
#[derive(Debug, Clone, Serialize, Deserialize, PartialEq, Eq)]
pub struct GeneratedCert {
pub crt: String,
pub key: String,
#[serde(default)]
pub common_name: Option<String>,
#[serde(default)]
pub sans: Vec<String>,
}
fn default_pull_policy() -> String {
"if-not-present".to_string()
}
@ -702,18 +665,6 @@ impl AppManifest {
}
}
// generated_certs: crt/key must be non-empty absolute paths with no
// traversal (they become bind-mount sources, same safety bar as files).
for (i, c) in self.app.container.generated_certs.iter().enumerate() {
for (field, val) in [("crt", &c.crt), ("key", &c.key)] {
if val.is_empty() || !val.starts_with('/') || val.contains("..") {
return Err(ManifestError::Invalid(format!(
"container.generated_certs[{i}].{field} must be an absolute path with no '..', got '{val}'"
)));
}
}
}
// data_uid: if set, must look like "NNNNN:NNNNN".
if let Some(u) = &self.app.container.data_uid {
let parts: Vec<&str> = u.split(':').collect();
@ -1760,7 +1711,6 @@ app:
],
secret_env: vec![],
generated_secrets: vec![],
generated_certs: vec![],
data_uid: None,
};
let facts = HostFacts {
@ -1812,7 +1762,6 @@ app:
},
],
generated_secrets: vec![],
generated_certs: vec![],
data_uid: None,
};
let p = MapSecretsProvider {
@ -1850,7 +1799,6 @@ app:
secret_file: "bitcoin-rpc-password".to_string(),
}],
generated_secrets: vec![],
generated_certs: vec![],
data_uid: None,
};
let p = MapSecretsProvider {


@ -121,16 +121,10 @@ impl PodmanClient {
"cryptpad" => "http://localhost:3003",
"penpot" => "http://localhost:9001",
"immich_server" | "immich" => "http://localhost:2283",
// Gitea publishes SSH (2222) and web (3001). Without a manifest on
// disk, extract_lan_address() returns whichever podman lists first —
// which can be the SSH port, breaking the launch. Pin the web UI.
"gitea" => "http://localhost:3001",
"nginx-proxy-manager" => "http://localhost:8081",
"fedimint-gateway" => "http://localhost:8176",
"endurain" => "http://localhost:8080",
// HTTPS: netbird's dashboard needs a secure context for OIDC PKCE
// (window.crypto.subtle), so the proxy serves TLS on 8087 (issue #15).
"netbird" => "https://localhost:8087",
"netbird" => "http://localhost:8087",
"electrs" | "archy-electrs-ui" => "http://localhost:50002",
_ => return None,
};
@ -281,18 +275,10 @@ impl PodmanClient {
// Build the container spec for the API
let mut port_mappings = Vec::new();
for port in &manifest.app.ports {
// Honour the manifest's protocol (default tcp). netbird's STUN port
// is 3478/udp; forcing tcp here would publish the wrong protocol and
// silently break relay discovery.
let protocol = match port.protocol.to_ascii_lowercase().as_str() {
"udp" => "udp",
"sctp" => "sctp",
_ => "tcp",
};
port_mappings.push(serde_json::json!({
"container_port": port.container,
"host_port": port.host,
"protocol": protocol,
"protocol": "tcp",
}));
}

demo-deploy/.env.example Normal file

@ -0,0 +1,18 @@
# Copy to .env and adjust. Used by demo-deploy/docker-compose.yml.
# Registry host + namespace that holds the prebuilt demo images.
REGISTRY=146.59.87.168:3000/lfg2025
# Image tag to deploy (CI publishes :demo and :<git-sha>).
IMAGE_TAG=demo
# Host port for the demo UI.
DEMO_WEB_PORT=2100
# Optional — enables the in-app AI chat panel. Leave blank to disable.
ANTHROPIC_API_KEY=
# Optional sandbox tuning (defaults shown).
DEMO_SESSION_TTL_MS=2700000 # 45 min idle before a visitor session is reaped
DEMO_MAX_SESSIONS=500 # concurrent visitor cap
DEMO_FILE_QUOTA_BYTES=52428800 # 50 MB uploads per visitor
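
The numeric defaults above follow from simple unit arithmetic, which the sketch below spells out. How the demo backend actually reads these variables is not shown in this diff, so `parse_or_default` is a hypothetical helper illustrating one reasonable fallback rule, not the backend's implementation.

```rust
/// Idle time before a visitor session is reaped: 45 minutes in milliseconds.
pub const DEMO_SESSION_TTL_MS: u64 = 45 * 60 * 1000;
/// Per-visitor upload quota: 50 MB counted as 50 * 1024 * 1024 bytes.
pub const DEMO_FILE_QUOTA_BYTES: u64 = 50 * 1024 * 1024;

/// Hypothetical helper: parse an optional numeric setting, falling back to
/// `default` when the variable is unset, blank, or not an unsigned integer.
pub fn parse_or_default(raw: Option<&str>, default: u64) -> u64 {
    raw.map(str::trim)
        .filter(|s| !s.is_empty())
        .and_then(|s| s.parse().ok())
        .unwrap_or(default)
}
```

The constants evaluate to 2700000 and 52428800, matching the values in the file. Falling back on a blank value matters here because the compose file passes `ANTHROPIC_API_KEY` through as an empty string when it is unset, and a tuning variable could arrive the same way.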

demo-deploy/README.md Normal file

@ -0,0 +1,33 @@
# Archipelago — Public Demo deploy
A click-to-play demo of the Archipelago UI, backed entirely by a mock backend.
Every visitor gets an **isolated, ephemeral sandbox** (own apps, wallet, files),
real container runtimes are never touched, and Bitcoin runs on **signet** test
coins. **Login password: `entertoexit`** (shown on the login screen).
This directory is the full contents of the public `archy-demo` repo. It holds no
source — only this compose file that pulls prebuilt `:demo` images.
## Deploy in Portainer
1. **Stacks → Add stack → Repository** (or paste `docker-compose.yml` into the web editor).
2. Set environment variables (see `.env.example`) — at minimum `REGISTRY`, and
`ANTHROPIC_API_KEY` if you want the AI chat panel.
3. Deploy. The UI is served on `:2100` (override with `DEMO_WEB_PORT`).
To pick up a new build, redeploy the stack (or wire the CI Portainer webhook).
## How it stays current
The images are built from the Archipelago monorepo by
`.github/workflows/demo-images.yml` on every change to `neode-ui/`, tagged `:demo`
and `:<git-sha>`, and pushed to `REGISTRY`. Editing the real UI → CI rebuilds →
redeploy here. No source lives in this repo.
## What's mocked
- **Per-visitor isolation** — state keyed by a `demo_sid` cookie, idle-reaped.
- **Apps** — install/uninstall/start/stop are simulated (no real Docker).
- **Wallet/Bitcoin** — signet-flavored; use the in-UI faucet for test sats.
- **Files** — real per-session upload/rename/delete, 50 MB quota, wiped on reap.
- **Intro** — replays once per calendar day per browser.
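
The per-visitor isolation described above (state keyed by a session id, a concurrent cap, idle reaping) can be sketched as a small in-memory store. This is an illustration only: the demo backend's real data structures are not part of this diff, and `SessionStore`, `Sandbox`, `touch`, and `reap` are hypothetical names.

```rust
use std::collections::HashMap;
use std::time::{Duration, Instant};

/// Hypothetical per-visitor state; the real session shape is not shown here.
#[derive(Default)]
pub struct Sandbox {
    pub uploaded_bytes: u64,
}

pub struct SessionStore {
    ttl: Duration,
    max_sessions: usize,
    sessions: HashMap<String, (Instant, Sandbox)>,
}

impl SessionStore {
    pub fn new(ttl: Duration, max_sessions: usize) -> Self {
        Self { ttl, max_sessions, sessions: HashMap::new() }
    }

    /// Drop every session idle for at least `ttl`; returns how many were reaped.
    pub fn reap(&mut self, now: Instant) -> usize {
        let before = self.sessions.len();
        let ttl = self.ttl;
        self.sessions
            .retain(|_, (last_seen, _)| now.duration_since(*last_seen) < ttl);
        before - self.sessions.len()
    }

    /// Look up (or create) the sandbox for `sid` and mark it as just used.
    /// Returns `None` when a new visitor would exceed the concurrent cap.
    pub fn touch(&mut self, sid: &str, now: Instant) -> Option<&mut Sandbox> {
        if !self.sessions.contains_key(sid) && self.sessions.len() >= self.max_sessions {
            return None;
        }
        let entry = self
            .sessions
            .entry(sid.to_string())
            .or_insert_with(|| (now, Sandbox::default()));
        entry.0 = now;
        Some(&mut entry.1)
    }

    pub fn len(&self) -> usize {
        self.sessions.len()
    }
}
```

With the documented defaults, `ttl` would be 2700000 ms and `max_sessions` 500; `sid` corresponds to the value of the `demo_sid` cookie. Reaping a session drops its `Sandbox`, which is where the "wiped on reap" behaviour for files would hook in.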


@ -0,0 +1,49 @@
# Archipelago Public Demo — thin deploy stack
#
# This is the ENTIRE contents intended for the public `archy-demo` repo. It holds
# NO source — it pulls prebuilt `:demo` images that CI builds from the monorepo on
# every neode-ui change (see .github/workflows/demo-images.yml). Deploy this in
# Portainer ("deploy from repository" or paste into the web editor).
#
# Demo login password: entertoexit
# Access on http://<host>:2100
#
# Configure via a .env file (see .env.example):
# REGISTRY registry host/namespace holding the demo images
# IMAGE_TAG image tag to pull (default: demo)
# ANTHROPIC_API_KEY optional — enables the AI chat panel
# DEMO_WEB_PORT host port for the UI (default 2100)
services:
neode-backend:
image: ${REGISTRY:-146.59.87.168:3000/lfg2025}/archy-demo-backend:${IMAGE_TAG:-demo}
container_name: archy-demo-backend
environment:
DEMO: "1"
ANTHROPIC_API_KEY: ${ANTHROPIC_API_KEY:-}
NODE_OPTIONS: "--dns-result-order=ipv4first"
DEMO_SESSION_TTL_MS: ${DEMO_SESSION_TTL_MS:-2700000}
DEMO_MAX_SESSIONS: ${DEMO_MAX_SESSIONS:-500}
DEMO_FILE_QUOTA_BYTES: ${DEMO_FILE_QUOTA_BYTES:-52428800}
expose:
- "5959"
dns:
- 8.8.8.8
- 1.1.1.1
restart: unless-stopped
healthcheck:
test: ["CMD", "wget", "-q", "--spider", "http://127.0.0.1:5959/health"]
interval: 30s
timeout: 10s
retries: 3
neode-web:
image: ${REGISTRY:-146.59.87.168:3000/lfg2025}/archy-demo-web:${IMAGE_TAG:-demo}
container_name: archy-demo-web
ports:
- "${DEMO_WEB_PORT:-2100}:80"
environment:
ANTHROPIC_API_KEY: ${ANTHROPIC_API_KEY:-}
depends_on:
- neode-backend
restart: unless-stopped


@ -0,0 +1,22 @@
# Curated demo files
Drop real files into `demo/files/` to make them the cloud's content for **every**
demo visitor (read-only — visitors can browse, download, and "buy" them, but only
maintainers add them). This is the "private login": the only way to add files is
to commit them here, which requires repo access.
```
demo/files/
Documents/whitepaper.pdf
Photos/rig.jpg
Music/track.mp3
```
- Folder structure becomes the cloud's folders.
- Text files (`.md .txt .json .csv …`, < 1 MB) are inlined; everything else is
streamed from disk on download.
- If `demo/files/` is empty, the demo falls back to the built-in seeded set
(Documents/Photos/Music/Videos with sample content).
After adding files, commit and push — CI rebuilds the `:demo` image and Portainer
redeploys. Keep the total size modest, because these files are bundled into the demo image.
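
The folder and inlining rules above can be sketched as two small functions. The loader itself is not shown in this diff, so the names here (`Delivery`, `delivery_for`, `top_level_folder`) are hypothetical. The extension list covers only the four the README names, and "1 MB" is assumed to mean 1024 * 1024 bytes.

```rust
use std::path::Path;

/// How a curated file is served.
#[derive(Debug, PartialEq, Eq)]
pub enum Delivery {
    Inlined,
    Streamed,
}

/// Only the four extensions the README names; its list is open-ended.
const TEXT_EXTENSIONS: [&str; 4] = ["md", "txt", "json", "csv"];
/// Assumption: "1 MB" is taken as 1024 * 1024 bytes.
const INLINE_LIMIT_BYTES: u64 = 1024 * 1024;

/// Text files under the limit are inlined; everything else is streamed.
pub fn delivery_for(path: &Path, size_bytes: u64) -> Delivery {
    let is_text = path
        .extension()
        .and_then(|e| e.to_str())
        .map(|e| TEXT_EXTENSIONS.contains(&e.to_ascii_lowercase().as_str()))
        .unwrap_or(false);
    if is_text && size_bytes < INLINE_LIMIT_BYTES {
        Delivery::Inlined
    } else {
        Delivery::Streamed
    }
}

/// The cloud folder a curated file appears under: its first path component
/// relative to `demo/files/`, or `None` for a file at the top level.
pub fn top_level_folder(relative: &Path) -> Option<String> {
    let mut parts = relative.components();
    let first = parts.next()?;
    parts.next()?; // a lone component is a file, not a folder
    Some(first.as_os_str().to_string_lossy().into_owned())
}
```

For the example tree, `Music/track.mp3` maps to the `Music` folder and is streamed, while a small `Documents/notes.md` would be inlined.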


@ -14,6 +14,31 @@
<link rel="icon" href="/aiui/favicon.svg" type="image/svg+xml" />
<link rel="apple-touch-icon" href="/aiui/apple-touch-icon-180x180.png" />
<title>AIUI</title>
<!-- Demo (?seed): pre-load the example "Content Showcase" conversation into
AIUI's IndexedDB so the chat history isn't empty (live chat is disabled
in the demo and points users to these previous chats). Mirrors the app's
own /seed exactly by calling its seedPromptsToConversation(). -->
<script type="module">
(async () => {
try {
if (!new URLSearchParams(location.search).has('seed')) return;
const db = await new Promise((res, rej) => {
const r = indexedDB.open('aiui-store', 1);
r.onupgradeneeded = (e) => { const d = e.target.result; if (!d.objectStoreNames.contains('conversations')) d.createObjectStore('conversations', { keyPath: 'id' }); };
r.onsuccess = () => res(r.result); r.onerror = () => rej(r.error);
});
const exists = await new Promise((res) => {
try { const q = db.transaction('conversations', 'readonly').objectStore('conversations').getKey('seed-all'); q.onsuccess = () => res(!!q.result); q.onerror = () => res(false); }
catch { res(false); }
});
if (exists) return;
const { seedPromptsToConversation } = await import('/aiui/assets/seedPrompts-CLWaUv28.js');
const conv = seedPromptsToConversation();
await new Promise((res, rej) => { const t = db.transaction('conversations', 'readwrite'); t.objectStore('conversations').put(conv); t.oncomplete = () => res(); t.onerror = () => rej(t.error); });
try { localStorage.setItem('aiui-active-conversation', conv.id); } catch {}
} catch (e) { console.warn('[demo] AIUI seed bootstrap failed', e); }
})();
</script>
<script type="module" crossorigin src="/aiui/assets/index-Lh5NfTCq.js"></script>
<link rel="stylesheet" crossorigin href="/aiui/assets/index-CHQ7uqBj.css">
<link rel="manifest" href="/aiui/manifest.webmanifest"><script id="vite-plugin-pwa:register-sw" src="/aiui/registerSW.js"></script></head>

demo/files/.gitkeep Normal file

@ -1,6 +1,13 @@
# Archipelago Demo Stack - Mock backend + Vue UI + AIUI Chat
# Deploy via Portainer: Web editor -> paste this, or deploy from repo
# Access at http://localhost:4848
# Archipelago Public Demo Stack - Mock backend + Vue UI + AIUI Chat
# Deploy via Portainer: Web editor -> paste this, or deploy from repo (build).
# Access at http://localhost:2100
#
# This builds the demo images from source. For the separated, auto-updating
# deploy that pulls prebuilt :demo images, see demo-deploy/docker-compose.yml.
#
# DEMO=1 turns on the public multi-visitor sandbox: each visitor gets an
# isolated, ephemeral copy of all state; real container runtimes are never
# touched; the shared login password is "entertoexit".
#
# Required: Set ANTHROPIC_API_KEY in environment or .env file for chat to work
# IndeedHub is deployed as a separate Portainer stack (indee-demo repo)
@ -12,9 +19,13 @@ services:
dockerfile: neode-ui/Dockerfile.backend
container_name: archy-demo-backend
environment:
VITE_DEV_MODE: "existing"
DEMO: "1"
ANTHROPIC_API_KEY: ${ANTHROPIC_API_KEY:-}
NODE_OPTIONS: "--dns-result-order=ipv4first"
# Optional tuning (defaults shown):
# DEMO_SESSION_TTL_MS: "2700000" # 45 min idle before a session is reaped
# DEMO_MAX_SESSIONS: "500" # concurrent visitor cap
# DEMO_FILE_QUOTA_BYTES: "52428800" # 50 MB uploads per visitor
expose:
- "5959"
dns:
@ -31,9 +42,11 @@ services:
build:
context: .
dockerfile: neode-ui/Dockerfile.web
args:
VITE_DEMO: "1"
container_name: archy-demo-web
ports:
- "4848:80"
- "2100:80"
depends_on:
- neode-backend
restart: unless-stopped


@ -1,14 +0,0 @@
# Archipelago mempool frontend — adds a resilient nginx backend proxy.
#
# The only delta vs the upstream image is /patch/entrypoint.sh, which rewrites
# the generated nginx-mempool.conf to use `resolver` + a variable proxy_pass so
# the frontend re-resolves the backend (mempool-api) via DNS on every request.
# Without this, nginx pins the backend IP at startup and serves 502 / "offline"
# after any backend restart (podman reassigns the IP). See the script header.
ARG BASE=146.59.87.168:3000/lfg2025/mempool-frontend:v3.0.0
FROM ${BASE}
# --chmod keeps the exec bit (build runs as USER 1000, plain COPY lands root:0644
# → "not executable"). Base USER/ENTRYPOINT/CMD (1000 / /patch/entrypoint.sh /
# nginx -g "daemon off;") are inherited unchanged.
COPY --chmod=0755 entrypoint.sh /patch/entrypoint.sh


@ -1,137 +0,0 @@
#!/bin/sh
__MEMPOOL_BACKEND_MAINNET_HTTP_HOST__=${BACKEND_MAINNET_HTTP_HOST:=127.0.0.1}
__MEMPOOL_BACKEND_MAINNET_HTTP_PORT__=${BACKEND_MAINNET_HTTP_PORT:=8999}
__MEMPOOL_FRONTEND_HTTP_PORT__=${FRONTEND_HTTP_PORT:=8080}
CONF=/etc/nginx/conf.d/nginx-mempool.conf
# ─── archipelago patch ────────────────────────────────────────────────────
# The stock frontend writes `proxy_pass http://<backend>:8999` with a literal
# hostname and NO resolver, so nginx resolves the backend IP ONCE at worker
# start and caches it for the process lifetime. Podman reassigns the backend
# container's IP whenever it is restarted/recreated (gate, OTA, crash, reboot
# re-IPAM), after which nginx keeps proxying to the dead IP → /api hangs, the
# websocket 502s, and the mempool UI shows "offline" until nginx is reloaded.
#
# Fix: force per-request DNS re-resolution via `resolver` + a variable in
# proxy_pass. Because a variable in proxy_pass disables nginx's automatic
# location→URI rewriting, each block is rewritten to preserve its original
# path mapping exactly:
# /api/v1/ws, /ws → "/" (var + "/" replaces the whole URI)
# /api/v1 → identity (no-URI proxy_pass passes $uri unchanged)
# /api/ → /api/v1/$1 (explicit rewrite, then no-URI proxy_pass)
# Operates on the __PLACEHOLDER__ tokens so the host/port sed below fills in
# the concrete values (incl. the `set $mp_backend` line). Idempotent.
# Resolver address: podman's aardvark-dns answers on the network gateway
# (e.g. 10.89.0.1), NOT Docker's 127.0.0.11. Read it from resolv.conf so this
# works on any podman network/subnet (and still falls back for Docker).
ARCHY_RESOLVER=$(awk '/^nameserver/ { print $2; exit }' /etc/resolv.conf 2>/dev/null)
ARCHY_RESOLVER=${ARCHY_RESOLVER:-127.0.0.11}
if ! grep -q 'set \$mp_backend' "$CONF"; then
awk -v res_addr="$ARCHY_RESOLVER" '
BEGIN { res = 0 }
/^[[:space:]]*location / && res == 0 {
print "\tresolver " res_addr " valid=10s ipv6=off;"
res = 1
}
/proxy_pass http:\/\/__MEMPOOL_BACKEND_MAINNET_HTTP_HOST__:__MEMPOOL_BACKEND_MAINNET_HTTP_PORT__\/;/ {
print "\t\tset $mp_backend __MEMPOOL_BACKEND_MAINNET_HTTP_HOST__;"
print "\t\tproxy_pass http://$mp_backend:__MEMPOOL_BACKEND_MAINNET_HTTP_PORT__/;"
next
}
/proxy_pass http:\/\/__MEMPOOL_BACKEND_MAINNET_HTTP_HOST__:__MEMPOOL_BACKEND_MAINNET_HTTP_PORT__\/api\/v1\/;/ {
print "\t\tset $mp_backend __MEMPOOL_BACKEND_MAINNET_HTTP_HOST__;"
print "\t\trewrite ^/api/(.*)$ /api/v1/$1 break;"
print "\t\tproxy_pass http://$mp_backend:__MEMPOOL_BACKEND_MAINNET_HTTP_PORT__;"
next
}
/proxy_pass http:\/\/__MEMPOOL_BACKEND_MAINNET_HTTP_HOST__:__MEMPOOL_BACKEND_MAINNET_HTTP_PORT__\/api\/v1;/ {
print "\t\tset $mp_backend __MEMPOOL_BACKEND_MAINNET_HTTP_HOST__;"
print "\t\tproxy_pass http://$mp_backend:__MEMPOOL_BACKEND_MAINNET_HTTP_PORT__;"
next
}
{ print }
' "$CONF" > "$CONF.archy" && mv "$CONF.archy" "$CONF"
fi
# ─── end archipelago patch ────────────────────────────────────────────────
sed -i "s/__MEMPOOL_BACKEND_MAINNET_HTTP_HOST__/${__MEMPOOL_BACKEND_MAINNET_HTTP_HOST__}/g" /etc/nginx/conf.d/nginx-mempool.conf
sed -i "s/__MEMPOOL_BACKEND_MAINNET_HTTP_PORT__/${__MEMPOOL_BACKEND_MAINNET_HTTP_PORT__}/g" /etc/nginx/conf.d/nginx-mempool.conf
cp /etc/nginx/nginx.conf /patch/nginx.conf
sed -i "s/__MEMPOOL_FRONTEND_HTTP_PORT__/${__MEMPOOL_FRONTEND_HTTP_PORT__}/g" /patch/nginx.conf
cat /patch/nginx.conf > /etc/nginx/nginx.conf
if [ "${LIGHTNING_DETECTED_PORT}" != "" ];then
export LIGHTNING=true
fi
# Runtime overrides - read env vars defined in docker compose
__MAINNET_ENABLED__=${MAINNET_ENABLED:=true}
__TESTNET_ENABLED__=${TESTNET_ENABLED:=false}
__TESTNET4_ENABLED__=${TESTNET_ENABLED:=false}
__SIGNET_ENABLED__=${SIGNET_ENABLED:=false}
__LIQUID_ENABLED__=${LIQUID_ENABLED:=false}
__LIQUID_TESTNET_ENABLED__=${LIQUID_TESTNET_ENABLED:=false}
__ITEMS_PER_PAGE__=${ITEMS_PER_PAGE:=10}
__KEEP_BLOCKS_AMOUNT__=${KEEP_BLOCKS_AMOUNT:=8}
__NGINX_PROTOCOL__=${NGINX_PROTOCOL:=http}
__NGINX_HOSTNAME__=${NGINX_HOSTNAME:=localhost}
__NGINX_PORT__=${NGINX_PORT:=8999}
__BLOCK_WEIGHT_UNITS__=${BLOCK_WEIGHT_UNITS:=4000000}
__MEMPOOL_BLOCKS_AMOUNT__=${MEMPOOL_BLOCKS_AMOUNT:=8}
__BASE_MODULE__=${BASE_MODULE:=mempool}
__ROOT_NETWORK__=${ROOT_NETWORK:=}
__MEMPOOL_WEBSITE_URL__=${MEMPOOL_WEBSITE_URL:=https://mempool.space}
__LIQUID_WEBSITE_URL__=${LIQUID_WEBSITE_URL:=https://liquid.network}
__MINING_DASHBOARD__=${MINING_DASHBOARD:=true}
__LIGHTNING__=${LIGHTNING:=false}
__AUDIT__=${AUDIT:=false}
__MAINNET_BLOCK_AUDIT_START_HEIGHT__=${MAINNET_BLOCK_AUDIT_START_HEIGHT:=0}
__TESTNET_BLOCK_AUDIT_START_HEIGHT__=${TESTNET_BLOCK_AUDIT_START_HEIGHT:=0}
__SIGNET_BLOCK_AUDIT_START_HEIGHT__=${SIGNET_BLOCK_AUDIT_START_HEIGHT:=0}
__ACCELERATOR__=${ACCELERATOR:=false}
__ACCELERATOR_BUTTON__=${ACCELERATOR_BUTTON:=true}
__SERVICES_API__=${SERVICES_API:=https://mempool.space/api/v1/services}
__PUBLIC_ACCELERATIONS__=${PUBLIC_ACCELERATIONS:=false}
__HISTORICAL_PRICE__=${HISTORICAL_PRICE:=true}
__ADDITIONAL_CURRENCIES__=${ADDITIONAL_CURRENCIES:=false}
# Export as environment variables to be used by envsubst
export __MAINNET_ENABLED__
export __TESTNET_ENABLED__
export __TESTNET4_ENABLED__
export __SIGNET_ENABLED__
export __LIQUID_ENABLED__
export __LIQUID_TESTNET_ENABLED__
export __ITEMS_PER_PAGE__
export __KEEP_BLOCKS_AMOUNT__
export __NGINX_PROTOCOL__
export __NGINX_HOSTNAME__
export __NGINX_PORT__
export __BLOCK_WEIGHT_UNITS__
export __MEMPOOL_BLOCKS_AMOUNT__
export __BASE_MODULE__
export __ROOT_NETWORK__
export __MEMPOOL_WEBSITE_URL__
export __LIQUID_WEBSITE_URL__
export __MINING_DASHBOARD__
export __LIGHTNING__
export __AUDIT__
export __MAINNET_BLOCK_AUDIT_START_HEIGHT__
export __TESTNET_BLOCK_AUDIT_START_HEIGHT__
export __SIGNET_BLOCK_AUDIT_START_HEIGHT__
export __ACCELERATOR__
export __ACCELERATOR_BUTTON__
export __SERVICES_API__
export __PUBLIC_ACCELERATIONS__
export __HISTORICAL_PRICE__
export __ADDITIONAL_CURRENCIES__
folder=$(find /var/www/mempool -name "config.js" | xargs dirname)
echo ${folder}
envsubst < ${folder}/config.template.js > ${folder}/config.js
exec "$@"


@ -1,13 +1,11 @@
# PRODUCTION MASTER PLAN — Archipelago App Platform & Registry
# 🚩 PRODUCTION MASTER PLAN — Archipelago App Platform & Registry
> **✅ SINGLE-NODE PRODUCTION GATE IS GREEN (2026-06-23): `run-gate.sh` 5/5 on .228, 0 failures.**
> This remains the authoritative plan for the broader north star (manifest-driven
> platform, registry-distributed manifests, external marketplace), but it is no
> longer a hard priority banner blocking all other work. Remaining workstreams are
> in §6 / §8b. Next exit-criteria: multinode (`docs/multinode-testing-plan.md`) +
> workstreams B/C/D.
> **THIS IS THE AUTHORITATIVE PLAN. Agents: read this first and keep it open until
> the production test gate (§5) is green.** It overrides ad-hoc direction and
> supersedes all prior roadmap/handoff/status docs. When the gate passes, remove
> the priority banner and demote this doc.
>
> Last updated: 2026-06-26 · zombie-container guard + gitea launch-port fix shipped, binary `040df5ce` rolled to the fleet (see §8b SESSION h). Prior: orchestrator Fix A+B (`a721532f`/`e0343137`) deployed + proven.
> Last updated: 2026-06-22 · Binary: v1.7.99-alpha · See §8b for the live resume.
---
@ -42,8 +40,7 @@ real nodes. Until then, this plan is the priority.
- **Migrations never destroy data.** Preserve `/var/lib/archipelago/<app>`,
generated secrets, displayed credentials, public ports, and adoption container
names. Always provide a rollback path. Stop/recreate only when necessary.
- **Verify on the real node .228 before any tag.** (Fleet/multinode verification is
a separate pass → `docs/multinode-testing-plan.md`.)
- **Verify on a real node (.228, then .198) before any tag.**
## 3. Current state (2026-06-21)
@ -59,7 +56,7 @@ real nodes. Until then, this plan is the priority.
- **The 4 companions** (`archy-bitcoin-ui`, `-lnd-ui`, `-electrs-ui`,
`-fedimint-ui`) build from `docker/<name>` contexts via `companion.rs`, not the
manifest registry — a later phase folds them in.
- **No app has passed the formal production gate.** That is the blocker.
- **No app has passed the formal production gate (5× for now, was 20×).** That is the blocker.
## 4. Workstreams (each links its authoritative detail doc)
@ -69,8 +66,7 @@ real nodes. Until then, this plan is the priority.
| B | **Registry-distributed manifests** — catalog carries full signed manifest; orchestrator installs from registry; disk = migration fallback | `registry-manifest-design.md` | **phases 1+2 done** (node consume + opt-in publisher embed); not yet flipped on for the fleet |
| C | **Developer-ready external registry** — 3rd-party DID-signed manifests, decentralized Nostr discovery (NIP-78 kind 30078) + trust score, `archy app …` tooling | `marketplace-protocol.md`, `app-developer-guide.md` | design exists; tooling + trust UX pending |
| D | **Distribution backbone** — signed catalog, BLAKE3 content-addressing, iroh swarm (origin-always-wins) | `dht-distribution-design.md` | phases 0–2 code-complete (worktree) |
| E | **Production test gate** — 5× lifecycle on **.228**, per-app L1/L2 matrix; multinode is split out → `multinode-testing-plan.md` | `tests/lifecycle/TESTING.md`, `bulletproof-containers.md` | **✅ .228 5×-GREEN (110/110 ×5, 0 not-ok, 2026-06-23)** — but this is DESTRUCTIVE-tier / ~8 core apps only; see §6c for the coverage gaps |
| F | **Lifecycle perfection — cascade + progress + ALL apps** — extend the gate to uninstall/reinstall (cascade), real install/uninstall progress UI, and EVERY installed app (not just the 8 core). The "insanely-perfect OS/container environment" bar. | §6c (below), `tests/lifecycle/TESTING.md` | **IN PROGRESS (2026-06-26)** — root bug FIXED: uninstall could hang → ghost/stuck-bar/reinstall-block (`71cc9ac4`, unbounded systemctl/podman in `quadlet::disable_remove`); `cascade-uninstall.bats` **7/7 green on .228** w/ binary `ae349a75`. Remaining: wire CASCADE into the canonical gate run, progress-UI truthfulness, all-apps matrix, guardian/IBD state. |
| E | **Production test gate** — 5× lifecycle on .228 + .198 (for now; was 20×), per-app L1/L2 matrix | `tests/lifecycle/TESTING.md`, `bulletproof-containers.md` | **never green — exit criterion** |
**Orchestrator architecture** (foundation for A/B): `rust-orchestrator-migration.md`
(ProdContainerOrchestrator, BootReconciler 30s level-triggered reconcile, adoption
@ -79,23 +75,13 @@ modes FM1–FM6 + the desired-state-first reconciler that fixes them).
## 5. Production test gate (exit criterion)
An app is **production-ready** only when `tests/lifecycle/run-gate.sh` is green
An app is **production-ready** only when `tests/lifecycle/run-20x.sh` is green
across the full matrix — install / UI-reachable / stop / start / restart /
reinstall / **reboot-survive** / **archipelago-restart-survive** / uninstall —
**5× on .228** (`ARCHY_ITERATIONS=5`). **The gate runs ON the node** (it uses local
podman/systemctl/bitcoin probes; running it via RPC from another host silently
tests the runner). **Multinode / fleet verification (.198 + others) is a SEPARATE
plan — `docs/multinode-testing-plan.md` — NOT part of this single-node criterion.**
Coverage today: L0 unit (631 ●), L1 RPC ● for 6 core apps, L2 UI ● dashboard +
proxies; L3 survival ◐; ~30 apps have zero automated coverage.
> ⚠️ **The 2026-06-23 5×-green is NOT the full bar.** `run-gate.sh` runs only the
> **DESTRUCTIVE tier** (stop/start/restart/survive) over ~8 core apps; it **skips
> uninstall/reinstall** (CASCADE is gated behind `ARCHY_ALLOW_CASCADE_DESTRUCTIVE`,
> never set by the gate) and tests no install/uninstall **progress UI**. Real
> uninstall/reinstall/progress bugs (immich + grafana) were found in manual testing
> right after — see **§6c (workstream F)** for the gap and the expanded-gate plan.
> The true "every app, fully" criterion is F's definition-of-done, not this run.
**5× on .228 AND .198 for now** (`ARCHY_ITERATIONS=5`; temporarily reduced from
20× — restore to 20× before the final ship). All 8 gate checkboxes in `tests/lifecycle/TESTING.md`
are currently unchecked. Coverage today: L0 unit (631 ●), L1 RPC ● for 6 core apps,
L2 UI ● dashboard + proxies; L3 survival ◐; ~30 apps have zero automated coverage.
## 6. Immediate sequence (live workstream)
@ -111,118 +97,14 @@ proxies; L3 survival ◐; ~30 apps have zero automated coverage.
data_uid 100998. Canonical app_id `immich` (title+icon). *(9e6c5370, d5ef4573)*
4. ✅ **Reboot-survival** — podman-restart.service enabled (startup, fleet-wide)
for the podman-`--restart` path. *(f160e0c4)*
5. ✅ **E** — 5× gate on **.228** (`ARCHY_ITERATIONS=5`) is **GREEN: 5/5, 0 not-ok**
(2026-06-23). Two real orchestrator bugs were found + fixed en route (package.stop
per-app grace; package.restart phantom stack-member injection → `order_present_containers`,
commit 92d7f52d) plus two single-shot-read probes hardened (bitcoin-knots state, immich
lan_address). The single-node criterion is met.
6. ✅ Banner demoted (this doc, 2026-06-23). Next: multinode pass + workstreams B/C/D.
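
A minimal sketch of the idea behind the `order_present_containers` fix named in item 5: keep the union start order, but drop any name that is not live on this node, so a phantom such as `mysql-mempool` can no longer abort the start sequence. Only the function name comes from the commit; the signature and body here are assumptions for illustration, not the orchestrator's actual code.

```rust
/// Hypothetical signature: filters the union `startup_order` down to the
/// containers that actually exist on this node, preserving the start order.
pub fn order_present_containers(startup_order: &[&str], present: &[&str]) -> Vec<String> {
    startup_order
        .iter()
        .copied()
        // Exact-name membership only; variant names that are not live are skipped.
        .filter(|name| present.contains(name))
        .map(str::to_owned)
        .collect()
}
```

With the phantoms filtered out before the start loop, a failed inspect on a non-existent name never gets the chance to short-circuit the members that come later in the order.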
**Multinode / fleet verification (.198 and the rest) is split into its own plan:**
`docs/multinode-testing-plan.md`. Do it AFTER the .228 single-node gate is green.
5. ◻ **Verify on .198** (immich migration validated on .228 only so far).
6. ◻ **E** — run the 5× gate (`ARCHY_ITERATIONS=5`, was 20×); fix until green.
7. ◻ Demote this banner.
**Not yet done / deliberate follow-ups:** flip `EMBED_MANIFESTS` on for the
published catalog (then sign) to actually distribute manifests via the registry;
Phase-3 `use_quadlet_backends` rollout so orchestrator backends are Quadlet (not
just podman-`--restart`).
## 6b. Post-deploy task order (agreed 2026-06-23)
After the 2026-06-23 multinode test deploy (latest backend + UX frontend to .116/.198/.228
+ Tailscale testers), do these IN ORDER:
1. **netbird #20 ph4** — the last real manifest migration (workstream A).
2. **Phase-3 `use_quadlet_backends`** — orchestrator backends become Quadlet units.
3. **§6c Lifecycle perfection** (workstream F) — the comprehensive uninstall/reinstall +
progress-UI + all-apps gate expansion below.
## 6c. Lifecycle perfection — what "green" MISSED (workstream F, the perfection bar)
**Why this exists:** the 2026-06-23 single-node gate went 5×-green but is **NOT** the
"every app fully lifecycle-tested" guarantee a user reasonably assumes. The canonical gate
(`run-gate.sh`) only runs the **DESTRUCTIVE tier** (stop / start / restart / survive) over
**~8 core apps** (bitcoin-knots, btcpay, electrumx, lnd, mempool, immich, fedimint,
filebrowser). It explicitly **SKIPS uninstall/reinstall** (the CASCADE tier is gated behind
`ARCHY_ALLOW_CASCADE_DESTRUCTIVE`, which `run-gate.sh` never sets) and has **zero coverage**
for the other ~30 apps (grafana, jellyfin, vaultwarden, penpot, nextcloud, photoprism,
uptime-kuma, homeassistant, … — see `app-registry-status-2026-06-21.md`). So uninstall,
reinstall, install-progress UI, and most apps were never under test.
**Real bugs found in manual multinode testing on .198 (2026-06-23) — the motivating evidence:**
- **Uninstall is broken for immich + grafana:** takes very long, the progress bar sits at a
**solid full-red with no real progression**, and the app **does not actually uninstall**:
it still appears in **My Apps** afterward (ghost entry / state not cleared).
- **grafana reinstall just stops** partway (no completion, no clear error).
- **fedimint guardian** suddenly showed **"starting up — Guardian opens a wait page until
Bitcoin finishes initial sync" / "starting"** on that node — verify this is correct
wait-for-IBD behavior vs a stuck/false state (it's a backend that depends on bitcoin sync).
**✅ 2026-06-26 — root cause of the immich/grafana uninstall trio FOUND + FIXED (`71cc9ac4`).**
Single cause: `quadlet::disable_remove()` (first op in uninstall teardown, via companion +
orchestrator) ran `systemctl --user stop` / `daemon-reload` / `podman rm -f` with **no timeout**.
On rootless podman a generated unit can wedge "deactivating" while podman hangs → `systemctl stop`
blocks forever → the spawned uninstall task returns neither Ok nor Err, so (a) `set_uninstall_stage`
never fires → **frozen full-red bar**, (b) `remove_package_state_entry` never runs → **ghost stuck in
`Removing`**, (c) the install guard rejects reinstall (`already Removing`). The spawn wrapper already
reverts state on Err/removes on Ok — only a *hang* stranded it. Fix bounds all three calls
(stop→`QUADLET_STOP_TIMEOUT` + SIGKILL/reset-failed escalation; daemon-reload→30s; podman rm→timeout).
**Validated live: `cascade-uninstall.bats` 7/7 on .228** (binary `ae349a75`) — grafana install →
uninstall (no ghost, data dir gone) → reinstall → running → cleanup. NOTE: proves the happy path +
no-regression; the original hang was load/timing-induced and not separately reproduced.
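
The bounding pattern described above can be sketched with the standard library alone. This is an illustration under assumptions, not the code in `71cc9ac4`: `run_bounded` and `Bounded` are hypothetical names, and the sketch only bounds the client call (it does not perform the SIGKILL/reset-failed escalation on the unit itself).

```rust
use std::io;
use std::process::{Command, ExitStatus, Stdio};
use std::thread;
use std::time::{Duration, Instant};

/// Outcome of a bounded command: it exited on its own, or hit the deadline.
#[derive(Debug, PartialEq)]
pub enum Bounded {
    Exited(ExitStatus),
    TimedOut,
}

/// Runs `program args...` and polls until it exits or `timeout` elapses.
/// On timeout the child is killed and reaped, so the caller always gets a
/// value back and can report Ok or Err instead of hanging.
pub fn run_bounded(program: &str, args: &[&str], timeout: Duration) -> io::Result<Bounded> {
    let mut child = Command::new(program)
        .args(args)
        .stdin(Stdio::null())
        .stdout(Stdio::null())
        .stderr(Stdio::null())
        .spawn()?;
    let deadline = Instant::now() + timeout;
    loop {
        if let Some(status) = child.try_wait()? {
            return Ok(Bounded::Exited(status));
        }
        if Instant::now() >= deadline {
            // Ignore the kill result: the child may have exited in the meantime.
            let _ = child.kill();
            child.wait()?;
            return Ok(Bounded::TimedOut);
        }
        thread::sleep(Duration::from_millis(20));
    }
}
```

A caller would treat `Bounded::TimedOut` as the trigger for escalation (for example, a follow-up kill of the unit), then continue teardown so the uninstall task still reaches its stage updates and state cleanup.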
**Workstream F scope — the gate must grow to (in priority order):**
1. **CASCADE tier in the canonical gate:** uninstall → verify the app is GONE from My Apps /
`container-list` / package state (no ghost), data preserved per policy, then reinstall →
verify it returns healthy. Catch the immich/grafana ghost + reinstall-stops bugs.
*(✅ DONE `b7d92107`: `run-gate.sh` now runs ONE cascade pass after the 5× loop when
`ARCHY_GATE_CASCADE=1` (+`ARCHY_ALLOW_DESTRUCTIVE=1`), counted into the tally — opt-in so default
behavior is unchanged, and deliberately NOT folded into all 5 iterations. `cascade-uninstall.bats`
7/7 on .228. Next: extend cascade coverage beyond the single throwaway app to the multi-container
stacks, e.g. an immich/btcpay cascade variant.)*
2. **Progress-UI assertions:** install AND uninstall must report monotonic, truthful progress
(not a stuck full-red bar); a long op must surface a real stage/percentage and a terminal
success/failure — no silent hang. (Likely both a backend progress-event fix AND a UI fix.)
*(✅ 2026-06-26 `9f17ba68`: the "stuck full-red bar" was `AppCard.vue` hardcoding the uninstall
bar to `w-full bg-red-400/60 animate-pulse` — solid, full, red, fake-pulse. Now derives a real
percentage from the backend's existing `uninstall-stage` label ("Stopping containers (X/N)"→10-50%,
"Cleaning up volumes"→70%, "Removing app data"→90%) and renders like install (neutral fill, real
width+%, shimmer). FE built `index-DtZyZomC.js`, rolled to .228/.116/.198/.89 (+.88/.5/.120).
STILL TODO: a bats/UI assertion that the bar is monotonic + lands on a terminal state; possibly a
backend numeric-progress field so the UI doesn't parse stage strings.)*
3. **ALL-apps coverage:** a generic per-app lifecycle matrix (install / UI-reach / stop / start /
restart / uninstall / reinstall / reboot-survive) driven by the manifest set, so grafana and
the ~30 uncovered apps are gated too — not just the 8 core. Manifest-driven, so new apps are
covered automatically.
*(✅ 2026-06-26 `43934eef`: `bats/all-apps-lifecycle.bats` — DESTRUCTIVE counterpart to the
read-only `all-apps-matrix.bats`. Discovers the app set from My Apps ∩ the node `catalog.json`;
drives stop/start/restart for every app and, under `ARCHY_ALLOW_CASCADE_DESTRUCTIVE`, a FULL
teardown (uninstall→no-ghost→reinstall) with the catalog `{dockerImage, containerConfig}` as the
reinstall spec. PROTECTED (never touched): bitcoin*/electrum* (resync cost) + lnd/btcpay*/fedimint*
(irreversible wallet loss — user asked to protect only bitcoin+electrum; wallet apps added for
safety, override via `ARCHY_MATRIX_PROTECT`). Validated on .228 (discovery + 1-app lifecycle
green). HEAVY/destructive → a supervised pass on LAN nodes (.116/.198/.228), NOT folded into
run-gate. Invoke: `ARCHY_ALLOW_DESTRUCTIVE=1 ARCHY_ALLOW_CASCADE_DESTRUCTIVE=1 ARCHY_PASSWORD=…
ARCHY_SCHEME=https bats bats/all-apps-lifecycle.bats`.)*
**✅ FIRST FULL DESTRUCTIVE RUN on .228 (2026-06-26):** lifecycle **11/11 clean**; teardown
**8/11** (immich 3-container stack incl.) — and it surfaced **3 real reinstall bugs** (the payoff):
1. **fresh-install bind-dir ownership = root:root** → EACCES on reinstall (jellyfin `/config`
denied exit 139; netbird-server can't open its SQLite store). Fix B's chown-to-parent only
runs on the reconcile path, **not** `package.install`. The important orchestrator fix.
2. **netbird reinstall adopts leftover containers → skips the manifest cert/file render**
(tls.crt/key/nginx.conf never written → proxy can't start → app reads absent). Only a fully
clean reinstall renders them.
3. **portainer image pin `lfg2025/portainer:2.19.4` is `manifest unknown`** (never pushed to the
registry) and the pin OVERRIDES the RPC dockerImage → portainer is un(re)installable
fleet-wide. Registry/catalog data bug (push the image or change the pin).
.228 restored (jellyfin+netbird via manual chown / clean reinstall; all installed apps running,
28 ctrs; portainer left uninstalled — uninstallable until #3 fixed). TODO: fix #1 (extend chown
to install path) + #2 + #3; add reboot-survive + UI-reach per app to the matrix.
4. **Guardian/IBD-dependent states:** assert that "waiting for bitcoin sync"-style states are a
legitimate, surfaced wait (with a path to ready) and never a permanent stuck state.
**Definition of done for F:** the expanded gate (CASCADE + progress + all-apps) is 5×-green on
.228, then re-verified across the multinode fleet — i.e. an *insanely-perfect* OS/container
environment where every app installs, runs, updates, uninstalls, and reinstalls cleanly with
honest progress, no ghosts, no data loss, reboot-survivable.
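
For the "backend numeric-progress field" idea raised under the progress-UI item, a rough sketch of deriving a monotonic percentage from the existing stage labels follows. The label strings are the ones quoted in that item; the function names, the assumption that "Stopping containers (X/N)" spans 10% to 50%, and the linear interpolation inside that band are all illustrative guesses, not the shipped `AppCard.vue` logic.

```rust
/// Hypothetical helper: maps an `uninstall-stage` label to a percentage.
/// Returns None for labels it does not recognise.
pub fn uninstall_percent(stage: &str) -> Option<u8> {
    if let Some(rest) = stage.strip_prefix("Stopping containers (") {
        let inner = rest.strip_suffix(')')?;
        let (done, total) = inner.split_once('/')?;
        let done: u32 = done.trim().parse().ok()?;
        let total: u32 = total.trim().parse().ok()?;
        if total == 0 {
            return None;
        }
        // Clamp so a label like (5/4) cannot exceed the 50% ceiling.
        let done = done.min(total);
        // Assumed mapping: interpolate linearly from 10% to 50%.
        let pct = 10 + 40 * u64::from(done) / u64::from(total);
        return Some(pct as u8);
    }
    match stage {
        "Cleaning up volumes" => Some(70),
        "Removing app data" => Some(90),
        _ => None,
    }
}

/// Keeps the displayed value from ever moving backwards; unknown labels
/// leave the previous value in place.
pub fn monotonic(previous: u8, stage: &str) -> u8 {
    uninstall_percent(stage).map_or(previous, |pct| pct.max(previous))
}
```

Emitting the number from the backend (rather than parsing strings in the UI) would also give a bats assertion something concrete to check for monotonicity.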
just podman-`--restart`); immich on .198.
## 7. Release blockers & operational gotchas (durable)
@ -259,32 +141,6 @@ Beta Live (public). Hardening priorities feeding the gate:
- **P1** LUKS2 full-partition encryption for `/var/lib/archipelago/`
(AES-256-XTS, Argon2id, key from setup password + hardware salt).
- **P1** Meshtastic plug-and-play parity with MeshCore.
- **P1 ✅ CODE-COMPLETE** (branch `companion-mobile-ux`, 2026-06-23; needs
on-device + mobile-web verification before merge to `main`) — Mobile app-launch
UX — drop the "this app opens in a tab" interstitial.
Two surfaces (both: no interstitial screen, launch the app directly):
- **Companion app (Android):** open **every** app in the **in-app WebView**
(not just non-iframeable ones) — *and* carry the current mobile-iframe footer
controls into the WebView (back/forward/reload/close — good, useful UX).
- **Mobile web browser (PWA):** open tab-apps directly in a **new browser tab**.
Touch points: `neode-ui/src/stores/appLauncher.ts`, `AppLauncherOverlay.vue`,
the Android in-app WebView bridge, and the mesh-mobile iframe footer controls.
(Reference prior work: `b5a9deb8` in-app webview for non-iframeable apps,
`d1fbcd9b` "open in browser" via native bridge.)
- **✅ Done (branch `companion-mobile-ux`):** mobile launches now use the
store-driven panel (no route push) so the background tab no longer changes and
closing returns you where you launched; tab-only apps open directly (in-app
WebView on companion via `openInApp`, new browser tab on PWA) with **no
interstitial**; the Android `InAppBrowser` (`WebViewScreen.kt`) gained a bottom
footer bar (back/forward/reload/open-in-browser/close) + a centered loading
screen (favicon + progress); a shared `AppLoadingScreen` (icon + progress)
replaced the black/spinner loaders on the app session **and** legacy iframe
overlay; the dashboard is pinned to `100dvh` on mobile so the mesh chat/tools
panes stop sliding under the tab bar in mobile browsers (no-op in companion);
ElectrumX shows its real icon in My Apps. Companion APK bumped to **v0.4.7**
(versionCode 11) with a committed shared debug keystore so updates install
without an uninstall. **Not yet:** merge to `main`; publish the 0.4.7 companion
download (deferred until the gate work lands so they ship together).
**Post-beta (deferred — do not start until gate is green):** P2P encrypted
voice/video (WebRTC over federation via Tor); watch-only wallet + mesh BTC
@ -292,271 +148,14 @@ hardening; paid swarm streaming + IndeeHub source (`phase4-streaming-ecash-plan.
Meshroller Rust-native mesh AI (`meshroller-integration-design.md`); dual-ecash
phases 2-6 (`dual-ecash-design.md`).
## 8b. SESSION STATE + RESUME (updated 2026-06-26) — READ §8b "CURRENT STATE + RESUME" FIRST
### ▶ SESSION h (2026-06-26) — LATEST, RESUME FROM HERE
**Canonical resume detail: memory `project_session_resume_2026_06_23b` (▶️ top of MEMORY.md).**
Local main = `670ebb06` (3 commits past the previously-pushed `43e70049`: `0a8db904` zombie
guard + `670ebb06` gitea launch-port fix; `43e70049` webview was already pushed). **Combined
release binary `040df5ce2551d17b` rolled to the fleet.** Binary+FE not in git — rebuild on a
fresh machine (`cd core && CARGO_INCREMENTAL=0 cargo build --release -p archipelago`).
**DONE this session:**
1. ✅ **Zombie-container guard** (`0a8db904`) — the reconciler's Running branch now verifies a
container's `State.Pid` is alive (`/proc/<pid>` exists) before trusting podman's "Up"; on a
concrete dead PID it stop+remove+`install_fresh` from the manifest. Conservative: any
uncertainty (inspect fail / unparseable PID) assumes alive, so a transient hiccup never
destroys a healthy container. Fixes the class that broke NetBird login on .228 (dashboard
"Up" w/ dead PID → proxy 502, no host port → reconciler never recovered it). Unit test +
**live-proven on .228**: synthetic zombie on `jellyfin` (killed conmon+PID → podman still
"Up") → guard logged `…process is dead (zombie) — recreating app_id=jellyfin` → recreated →
settled to NoOp. **Zero false-positives across the other 33 healthy containers.**
2. ✅ **Gitea launch-port fix** (`670ebb06`) — gitea launched at **:2222 (SSH)** instead of
**:3001 (web)** on nodes without the gitea manifest on disk (`manifest_lan_address_for`
returns None → fell through to `extract_lan_address`, which returns podman's first-listed
port; podman lists `2222->22` before `3001->3000`). Added `"gitea" => http://localhost:3001`
to the static `lan_address_for` map (`core/container/src/podman_client.rs`) like every other
core app. Reported on tailscale node **100.82.34.38** — that node still needs the new binary
(or a refreshed gitea manifest) to pick it up.
3. ✅ **Rolled `040df5ce`** to .228/.116/.198/.89 (verified sha+active); .88/.5/.120 rolling.
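
The zombie-guard decision in item 1 reduces to a three-way liveness check where only a concrete dead PID triggers a recreate. A minimal sketch, assuming hypothetical names (`Liveness`, `pid_liveness`, `should_recreate`) and assuming that a PID of 0 counts as "uncertain"; the `/proc` root is a parameter purely so the logic can be exercised against a scratch directory.

```rust
use std::path::Path;

#[derive(Debug, PartialEq)]
pub enum Liveness {
    Alive,
    Dead,
    Unknown,
}

/// `pid_field` is the raw `State.Pid` text from an inspect call, or None when
/// the inspect itself failed. `proc_root` is normally `/proc`.
pub fn pid_liveness(pid_field: Option<&str>, proc_root: &Path) -> Liveness {
    let pid = match pid_field.and_then(|raw| raw.trim().parse::<u32>().ok()) {
        Some(pid) if pid > 0 => pid,
        // Inspect failure, unparseable text, or PID 0 (assumption): uncertain.
        _ => return Liveness::Unknown,
    };
    match proc_root.join(pid.to_string()).try_exists() {
        Ok(true) => Liveness::Alive,
        Ok(false) => Liveness::Dead,
        // An I/O error while checking is also treated as uncertainty.
        Err(_) => Liveness::Unknown,
    }
}

/// Conservative policy: recreate only on a concrete dead PID, so a transient
/// hiccup never destroys a healthy container.
pub fn should_recreate(pid_field: Option<&str>, proc_root: &Path) -> bool {
    pid_liveness(pid_field, proc_root) == Liveness::Dead
}
```

In the reconciler this check would sit in the Running branch, with `should_recreate(..) == true` leading to the stop, remove, and fresh install from the manifest.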
**OPEN follow-ups (logged, NOT regressions):**
- **mempool env-drift recreate-loop on .228** — reconciler logs `container env drift detected —
recreating app_id=mempool` every ~30-90s, never converges (pre-existing; the known mempool
nginx stale-IP class, [[project_mempool_nginx_stale_ip_fix]]). mempool stays running but churns.
- **nostr-rs-relay** stuck "Stopping" + ~2s create-loop on .228 (from session g).
**NEXT:** finish .88/.5/.120 roll → push main to gitea-vps2 → Phase-3 quadlet / Workstream F /
multinode. SSH/sudo pw `ThisIsWeb54321@` (**.88 = `ThisIsWeb54321!`**); UI/RPC .228/.198 =
`ThisIsWeb54321@`. Reusable tooling in scratchpad: `deploy-bin.sh`/`remote-apply.sh` (EXPECT_SHA
= `040df5ce…`), `rpc.sh`.
---
### ▶ SESSION g (2026-06-25) — earlier, historical
**Canonical resume detail: memory `project_session_resume_2026_06_23b` + `project_netbird_ph4_legacy_deletion_map` + `project_workstream_f_lifecycle_perfection`.**
`gitea-vps2/main = a721532f` (pushed). **Local main = `89d397bb`** (2 new commits this session, NOT pushed/deployed: `41e7f500` harness tolerance + `89d397bb` netbird ph4 legacy delete). Binary+FE are NOT in git — rebuild on a fresh machine.
**TL;DR (SESSION g, 2026-06-25) — everything below DONE this session:**
1. ✅ **Rolled** `e0343137` + fresh FE (`index-a75rd6Hy.js`) to **7 nodes** (.116/.198/.228/.89/.88/.5/.120), all verified. **.15 SKIPPED** (auth rejected — creds don't match).
2. ✅ **Harness tolerance fixes COMMITTED** `41e7f500` (run-gate settle/immich + immich.bats 90s + mempool.bats poll).
3. ✅ **mempool RESOLVED** fleet-wide — see mempool note below.
4. ✅ **netbird #20 ph4 DONE** — legacy Rust installer DELETED, committed `89d397bb` (492 lines gone, manifest-driven only, `cargo check` clean). Release binary BUILDING for the .228 live-verify (build left running — check after).
**NEXT (resume here):** (a) check the release build, deploy the `89d397bb` binary to .228, live-verify netbird adopts via manifest (https:8087→200, no `bail!`); (b) roll `89d397bb` to the rest of the fleet (behavior-neutral — manifest path already executed); (c) **push local main → gitea-vps2** (2 commits ahead); then **Phase-3 `use_quadlet_backends` → Workstream F → multinode**.
**ROLL RESULTS (2026-06-25, binary `e0343137b99bf066` + fresh FE bundled):**
| Node | Result |
|------|--------|
| .228 | ✅ already on `e0343137` (prior session, binary-only) |
| .116 (local) | ✅ binary + fresh FE; 36 containers survived restart; UI 200; `index-a75rd6Hy.js` live |
| .198 (LAN) | ✅ binary + fresh FE; 38 containers up; UI 200 |
| .89 (100.89.209.89) | ✅ binary + fresh FE; service active |
| .88 (100.70.96.88, pw `ThisIsWeb54321!`) | ✅ binary + fresh FE; service active |
| .5 (100.72.136.5) | ⏳ attempted — see resume note (cellular x250) |
| .120 (100.66.157.120) | ⏳ attempted — see resume note (cellular x250) |
| .15 (100.64.83.15, archy-dev-pa) | ❌ SKIPPED — `archipelago@` + `ThisIsWeb54321@` rejected (`Permission denied (publickey,password)`); node creds unknown |
Deploy tooling (reusable): scratchpad `deploy-bin.sh <label> <local\|ssh\|ts> <host> <pw>` + `remote-apply.sh` (mv binary avoids ETXTBSY, atomic FE swap preserving `aiui`/APK/`claude-login.html`, chown 1000:1000, restart, sha+health verify). Frontend tarball = `tar -C web/dist/neode-ui -czf neode-ui.tgz .` (flat). Full sha `e0343137b99bf06642c45da67bb092e9a411190ff59eda8e5177c2a06b6f6e89`.
**Focus: validate the two UNVALIDATED-WIP orchestrator fixes (commit `a721532f`) on the .228 canary, then roll to the 7-node fleet.**
- **Fix A** — desired-state recovery: a was-running app that vanished (e.g. lost through a failed teardown + reboot) auto-recreates on reconcile, via new `crash_recovery::load_last_running_names` (reads `running-containers.json` sans PID gate) + exact container-name match in `reconcile_all_with_mode`. Zero false-positives (uninstalled/user-stopped excluded).
- **Fix B** — recreate volume-ownership: a freshly-created bind dir for a NO-`data_uid` app gets `chown --reference=<parent>` so container-root can write → kills the immich-class recreate EACCES crash-loop. Only fresh dirs (zero regression for existing installs).
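
Fix A's selection rule (was running, now has no container, and was neither uninstalled nor user-stopped) can be sketched as a set difference with exact name matching. `apps_to_recreate` and its parameters are hypothetical; the real logic lives in `reconcile_all_with_mode` and reads `running-containers.json`, neither of which is reproduced here.

```rust
use std::collections::BTreeSet;

/// Returns the names from `last_running` that have no container in `present`
/// and are not in `excluded` (uninstalled or user-stopped apps).
/// Matching is exact: `jellyfin-old` being present does not satisfy `jellyfin`.
pub fn apps_to_recreate(
    last_running: &[&str],
    present: &[&str],
    excluded: &[&str],
) -> Vec<String> {
    let present: BTreeSet<&str> = present.iter().copied().collect();
    let excluded: BTreeSet<&str> = excluded.iter().copied().collect();
    let mut seen = BTreeSet::new();
    last_running
        .iter()
        .copied()
        .filter(|name| !present.contains(name) && !excluded.contains(name))
        // Drop duplicates while keeping the original order.
        .filter(|name| seen.insert(*name))
        .map(str::to_owned)
        .collect()
}
```

Exact matching plus the exclusion set is what keeps false positives at zero in this sketch: an app the user removed on purpose is never resurrected.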
VALIDATION PROGRESS (sessions e→f):
1. ✅ Release binary built — sha16 `e0343137b99bf066` (differs from pre-fix `f2aa2fab` → fixes compiled in).
2. ✅ `cargo test -p archipelago crash_recovery`: **13/13 green**, incl. the two new Fix A tests.
3. ✅ Deployed new binary to **.228 canary** (binary-only; FE unchanged at `435b9f92`). Verified live sha `e0343137`, active, RPC OK. Container cgroup confirmed in `user@1000.service` (NOT archipelago.service) → `systemctl stop` is container-safe on .228.
4. ✅ **Fix A PROVEN**: `podman rm -f jellyfin` (non-baseline, no-data_uid) → periodic ExistingOnly reconciler (30s) recreated it; journal: `previously-running app has no container after boot — recreating (desired-state recovery) app_id=jellyfin`.
5. ✅ **Fix B PROVEN** — fresh `package.install uptime-kuma` (no-data_uid, no prior data dir) → bind dir chowned to parent owner `1000:1000` (NOT root:root), state=running, RestartCount=0, no EACCES, app wrote its own subdirs → clean uninstall (container+data-dir gone). all-apps matrix read-only **5/5 (17 apps)**.
6. 🟡 **5× DESTRUCTIVE gate on .228 — NOT yet 5/5, but failures are HARNESS-TOLERANCE FLAKES, NOT Fix A/B regressions** (proven: Fix A logged **0** desired-state-recovery firings during the failures; immich/lnd `RestartCount: 0`, no crashes). Under sustained 5× churn on this 34-app node a *different* heavy-app recovery probe slips each iteration:
- immich `lan_address` (test 64): 30s probe too tight after archipelago-restart recovery. **FIXED** (settle_stack now waits on immich :2283 when present, cap 180→300s; test 64 deadline 30→90s). Went **ok/ok/ok 3×** after fix.
- mempool orphan count (test 82): single-shot count caught a transient extra container mid-recreate (clears to 3=3). **FIXED locally** (poll for steady-state ≤30s) — fix is in local `tests/lifecycle/bats/mempool.bats`, NOT yet re-gated.
- lnd `getinfo recovers after restart` (test 77): already has a generous 240s deadline; peak concurrent load occasionally beats it. lnd itself **HEALTHY** (wallet unlocked — "wallet already unlocked, WalletUnlocker no longer available", RestartCount 0). Likely needs deadline bump or lnd added to within-iteration tolerance. **NOT yet fixed.**
- NOTE: the 300s settle bump made iterations very long (iter2=1062s) and a diagnostic run wedged in iter3; killed it. Re-think settle (maybe per-app readiness with shorter caps) before the next run.
7. ✅ **DECISION RESOLVED (2026-06-25):** user chose **(B) roll now** AND bundle the fresh UX frontend (per `feedback_deploy_targets_and_ux_bundle`). Gate load-robustness deferred to a separate hardening pass.
8. ✅ **ROLLED** `e0343137` + fresh FE (`index-a75rd6Hy.js`) to .116/.198/.89/.88/.5/.120 (.228 already on it) — all verified `sha=e0343137`, service active. **.15 skipped** (auth reject). See roll table above.
9. ✅ **Harness fixes COMMITTED** `41e7f500` (no longer uncommitted).
10. ✅ **netbird #20 ph4 — legacy installer DELETED**, committed `89d397bb`. `install_netbird_stack` is now orchestrator-manifest → adopt → `bail!` (no in-Rust installer); removed 6 dead helpers + 3 `NETBIRD_*_IMAGE` consts + unused import (~492 lines). `cargo check` clean (0 warnings). Manifest path verified live pre-delete (.228 https:8087→200). **Release binary BUILT: sha `cccb7cfd9c38a651`** (`core/target/release/archipelago`, supersedes `e0343137`) — NOT yet deployed; deploy to .228 + live-verify then roll. Map+rationale: memory `project_netbird_ph4_legacy_deletion_map`. **Pre-existing follow-up (NOT introduced by delete): the manifest path lacks an active #10 OIDC-readiness gate — if that login race resurfaces, add an OIDC-ready gate to the netbird manifest.**
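
The harness fixes in item 6 share one pattern: replace a single-shot read with a poll that waits for a steady state up to a deadline. The real fixes are in the bats files; the Rust sketch below only illustrates the pattern, and `poll_until` is a hypothetical helper.

```rust
use std::thread;
use std::time::{Duration, Instant};

/// Calls `probe` repeatedly until `accept` approves the value or `timeout`
/// elapses. Returns Ok(value) on success and Err(last_value) on timeout, so a
/// failing test can print what it last observed.
pub fn poll_until<T, F, P>(
    timeout: Duration,
    interval: Duration,
    mut probe: F,
    accept: P,
) -> Result<T, T>
where
    F: FnMut() -> T,
    P: Fn(&T) -> bool,
{
    let deadline = Instant::now() + timeout;
    loop {
        let value = probe();
        if accept(&value) {
            return Ok(value);
        }
        if Instant::now() >= deadline {
            return Err(value);
        }
        thread::sleep(interval);
    }
}
```

For the mempool orphan count, the probe would return the current container count and `accept` would check it against the expected 3, with a 30s timeout; a transient extra container mid-recreate then no longer fails the iteration.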
**✅ 2026-06-25 — STRAY 13h GATE on .228 found + killed; mempool RESOLVED.** A `setsid` gate run from session-e was still churning .228 ~13h later (pathologically slow — only reached test 71/lnd; the 300s settle bump is the suspect). Killed its process group (note: `pkill -f bats` self-matches the ssh command's own argv → kill by numeric PID/PGID instead). After kill, `crash_recovery` (Fix A) auto-recovered the immich/indeedhub/netbird stacks — **good live exercise of Fix A**. **mempool fallout RESOLVED:** the gate churn left .228's podman **overlay storage corrupt** (mempool frontend crash-looped — container couldn't write `/etc/nginx`, same image serves fine on .116) → **fixed by rebooting .228** (clears overlay corruption; Fix A staggered-recovered all apps; mempool stable 200). **.198 is PRUNED** bitcoin → mempool requires archival (install correctly refused) → **cleanly uninstalled** the orphan mempool-db. All nodes now correct. LESSON: never leave the gate running unsupervised; reconsider the 300s settle before re-running.
Fleet on `e0343137` + FE `index-a75rd6Hy.js` on .116/.198/.228/.89/.88/.5/.120 (.15 still old). **`89d397bb` (netbird-delete) binary NOT yet deployed anywhere — verify on .228 then roll.** SSH/sudo pw UNIFORM `ThisIsWeb54321@` (**.88 = `ThisIsWeb54321!`**); **UI/RPC: .228=`ThisIsWeb54321@`, .198=`ThisIsWeb54321@`.** Reusable tooling in scratchpad: `deploy-bin.sh`/`remote-apply.sh` (binary+FE swap), `rpc.sh <host> <pw> <method> [params]` (auth.login→call). Gate harness at `~/lifecycle/lifecycle` on .228 — **CHECK it isn't already running/wedged before re-launching**.
---
### ▶ SESSION b (2026-06-23 PM) — earlier, historical
**Canonical resume detail: memory `project_session_resume_2026_06_23b` (▶️ top of MEMORY.md).**
`gitea-vps2/main = 4346007d` pushed; local HEAD `e57514b6` (uninstall fix, committed, **not pushed/deployed**).
Shipped + verified live on .228 (all in 4346007d):
- **Connection-lost FULLY fixed** — companion `image_exists` journal-flood (Stdio::null) + netbird UDP-port reconcile churn (`wait_for_manifest_host_ports` tcp-only). .228: flood→0, ws/db→0 disconnects, load 3.95→2.26.
- **netbird → manifest-driven** (#20 ph4) — 3 manifests + 4 orchestrator primitives (base64 secret, GeneratedCert+`ensure_manifest_certs`, templated-file render `{{HOST_IP}}/{{NETWORK_GATEWAY}}/{{secret:}}`, udp port protocol). Live: https 8087→200, OIDC→200, resolver=gateway. Legacy-Rust delete deferred to post-full-verify.
- **registry-manifest flip (code)**: `EMBED_MANIFESTS` default-on, `main.rs` bounded pre-load `refresh_catalog`. Catalog regenerated w/ 52 embedded manifests but **NOT published** (gitignored + never committed; publish = force-add to gitea-vps2 main). Do after fleet binary roll.
- **UX regression root-caused + fixed** — the mobile/desktop UX (loader/AppLoadingScreen, store-driven launch, app icons, android webview footer) was on `companion-mobile-ux` and **never merged to main**, so any main build silently dropped it. **Merged → main**, frontend redeployed to .228. Android 0.4.9/code13 pushed for user to build APK elsewhere.
In progress — **Workstream F lifecycle bugs** (this §, user-picked next):
- **uninstall ghost — FIXED + pushed (e57514b6) + DEPLOYED to .228.** `handle_package_uninstall` returned Err on any cleanup-residue failure *before* removing the package state entry → ghost in My Apps + revert-to-Installed. Now: split container vs cleanup errors; remove state entry as soon as containers gone (before slow data rm). **LIVE-VERIFY IN PROGRESS:** fresh grafana (not previously installed → no data risk) install→uninstall→reinstall on .228; install was mid image-pull at handoff. RPC recipe + caution in memory `project_session_resume_2026_06_23b`.
- **#15 fedimint guardian — RESOLVED, not stuck** (legit `until` IBD-gate → setup wizard now bitcoin synced; no code change).
- #14 grafana reinstall-stops — verify in the same grafana test (likely same root cause as #13).
Next: finish grafana uninstall/reinstall live-verify on .228 → roll the new binary to the rest of the fleet (.116/.198/.5/.120 still on old binary) → publish embedded catalog (#8) → finish Workstream F (gate CASCADE+progress+all-apps expansion) → Phase 3 Quadlet → multinode.
WATCH: main.rs pre-load `refresh_catalog` (≤25s) slows startup — sanity-check startup→RPC-ready isn't egregious on the fleet roll.
---
### ▶ CURRENT STATE + RESUME (2026-06-23) — earlier session-a baseline (historical)
**✅ HEADLINE (2026-06-23): single-node gate GREEN (`run-gate.sh` 5/5 on .228, 0 not-ok) +
multinode test deploy DONE to 6 nodes.** The exit criterion (§5) is met. Green took fixing **two real
orchestrator bugs** (package.stop per-app grace, 2026-06-22; package.restart phantom stack-member
injection, 2026-06-23 — `order_present_containers`, commit 92d7f52d) plus hardening two single-shot
probes (bitcoin-knots state, immich lan_address). All work is **committed + PUSHED to `gitea-vps2`
(146) `main` @ `ccb594fb`** — the local-only state is resolved. Binary = release sha `5472c575…`.
**▶ DEPLOY STATE (latest backend `5472c575` + UX frontend + one-tap companion APK) — 2026-06-23:**
| Node | Pw | Done | Notes |
|------|----|----|-------|
| .116 (local, http:80) | `ThisIsWeb54321@` | ✅ | dev node: bitcoin mid-IBD + http-only |
| .198 | `archipelago` | ✅ | resilience; user manual-testing here |
| .228 | `archipelago` | ✅ | canonical gate node (5×-green) |
| 100.82.34.38 (archipelago-1) | `archipelago` | ✅ | |
| 100.89.209.89 (archy-x250-pa) | `ThisIsWeb54321@` | ✅ | |
| 100.70.96.88 (archipelago node) | `ThisIsWeb54321!` | ✅ | note the `!` |
| 100.64.83.15 (archy-dev-pa) | ? | ⏳ | UP (tailscale ping ok) but `ThisIsWeb54321@` REJECTED — **need correct pw** |
| 100.66.157.120 (archy-x250-exp) | `ThisIsWeb54321@` | ⏭️ | DOWN — user said leave it |
Deploy scripts saved in scratchpad: `deploy-node.sh` (full binary+FE, sha+health verify) and
`fe-only.sh` (FE-only, no archipelago restart). Reusable: `bash deploy-node.sh <host> <pw> <scheme> 127.0.0.1`.
**▶ COMPANION APK fixed (other agent's commit `5c43e127` + my reconcile):** QR + download were a
zip-wrapped `.apk.zip` (forced unzip). Now serve raw `archipelago-companion.apk` (one-tap) from the
146 raw URL; `CompanionIntroOverlay.vue` + ship/publish scripts repointed; old `.zip` dropped. The
OLD `.apk.zip` URL now 404s, so EVERY node was FE-refreshed to the new build (all 6 verified
`/ : 200` + bundle references `archipelago-companion.apk`).
**▶ MANUAL-TEST BUGS FOUND on .198 → workstream F (§4/§6c).** The green gate is DESTRUCTIVE-tier /
~8 core apps; it SKIPS uninstall/reinstall and has no progress-UI / all-apps coverage. Real bugs:
immich+grafana **uninstall hangs at a solid full-red bar + leaves a ghost in My Apps** (doesn't
actually remove); grafana **reinstall stops**; fedimint guardian shows "waiting for bitcoin sync"
(verify legit vs stuck). These motivate **workstream F** (cascade + progress + all-apps gate).
Also added **§10**: investigate TanStack-Query/push-based state mgmt for neode-ui (the state-drift
root cause behind the stuck bar + ghosts).
**▶ NEXT — agreed task order (do IN ORDER, see §6b):**
1. **netbird #20 ph4** — last real manifest migration.
2. **Phase-3 `use_quadlet_backends`** — orchestrator backends → Quadlet units.
3. **§6c workstream F** — cascade/uninstall + progress-UI + ALL-apps gate; fix the immich/grafana
uninstall + ghost-My-Apps + reinstall-stops bugs to a 5×-green; then §10 state-mgmt investigation.
4. **Multinode pass**: `docs/multinode-testing-plan.md` (the 6 deployed nodes are ready for manual
testing now).
**▶ LOOSE ENDS / gotchas for the resuming session:**
- **`neode-ui/src/components/AppLoadingScreen.vue` is UNTRACKED** on .116 — the other agent created it
but NO committed code imports it (orphan, not in `e825bbed`). Left in place; decide whether to wire
it in or delete. Not deployed (committed UX doesn't reference it).
- **gitea-local mirror (`localhost:3000`) push is BROKEN** (token redirects to `/login`); push to
`gitea-vps2` works and is primary. Reconcile the local mirror token if you need it.
- **Don't delete bitcoin/electrum data** (user directive) — run only the DESTRUCTIVE gate
(`run-gate.sh` default; never set `ARCHY_ALLOW_CASCADE_DESTRUCTIVE` on real nodes with synced chains).
- **.198 gate not run this session** (user was manual-testing there + restarting). .116 gate ran but
failed 12 tests — ALL environmental (.116 is http-only → ui-coverage hardcodes `https://`; + bitcoin
mid-IBD → bitcoin/lnd preconditions). NOT product regressions. `gate-116.log` on .116.
**(historical resume notes for the 5× chase below — superseded by the green result above)**
**Headline (2026-06-22):** the production gate's `package.stop` blocker is **FIXED**; **`.228` is 1×-GREEN
(110/110)**; a **fresh 5× run is IN PROGRESS on `.228`** (the single-node exit criterion) after a
real mempool bug found + fixed (below). The gate is now single-node (.228); multinode is split out
(`docs/multinode-testing-plan.md`). The gate is canonically **5×** now — `run-gate.sh` (the `20x`
naming/script was removed 2026-06-22, commit `57a013bc`).
**2026-06-22 (late) — mempool stale-IP bug FOUND + FIXED (real production bug, not a flake):**
The 1st 5× attempt failed iteration 1 on `#74 mempool api backend remains queryable`. Root cause was
NOT timing — the frontend nginx pinned mempool-api's IP at startup (no `resolver`); after the gate
restarts mempool-api (new podman IP) nginx 502s and the UI shows "offline". Fixed in
`mempool-frontend:v3.0.1` (resolver+variable proxy_pass; see `[[project_mempool_nginx_stale_ip_fix]]`
/ `docker/mempool-frontend/`), pushed to vps2, manifests bumped 3.0.0→3.0.1, deployed + resilience-
verified live on .228 (backend restart now auto-recovers). Also fixed the test itself (`mempool.bats`
#74: 180s→300s + real `fail` helper). Commits `0f05f73a` (fix) `57a013bc` (gate rename).
**THE 5× RUN IS DETACHED ON .228 — survives terminal/session close. Check it from any machine:**
```
sshpass -p archipelago ssh archipelago@192.168.1.228 \
'grep -E "iteration [0-9]+: (PASS|FAIL)|RESULTS|passed:|failed:" /tmp/gate-5x3.log; \
echo "running pid: $(pgrep -f run-gate.sh$ || echo DONE)"; grep "^not ok" /tmp/gate-5x3.log | sort -u'
```
- Log: `/tmp/gate-5x3.log` on .228 · launched `nohup` · `ARCHY_ITERATIONS=5 ARCHY_ALLOW_DESTRUCTIVE=1`,
run **ON the node** from `/tmp/lifecycle-run/tests/lifecycle` via `./run-gate.sh` (ARCHY_HOST=127.0.0.1).
`bats` 1.11.1 + static `jq` 1.7.1 are installed on .228.
- **If all 5 iterations PASS → .228 has met the single-node criterion → demote the banner.**
- If it flakes again: readiness-under-churn (lnd/mempool); hardening in `98f4fa44` (inter-iteration
`settle_stack()` + readiness windows). Re-copy repo `tests/lifecycle` to /tmp/lifecycle-run, relaunch.
**▶ 2026-06-23 (morning) — 5× FINISHED 2/5; both mempool fails ROOT-CAUSED to ONE real
orchestrator bug (NOT flakes) + FIXED:** the overnight run finished `passed: 2 / failed: 3` on
`gate-5x3.log`, three *distinct one-off* fails, none repeating:
- iter1 `#5 container-list valid state for bitcoin-knots` — pre-launch churn (as predicted); didn't
repeat. **Hardened anyway:** the probe was a single-shot read; now polls ≤30s for a settled valid
state so a momentary `restarting`/transient can't flake a 20-min iteration (`bitcoin-knots.bats`).
- iter2 `#74 mempool api queryable` + iter5 `#73 mempool stack running` — **SAME root cause.**
`package.restart mempool` resolves its container list via `ordered_containers_for_start`, which was
**injecting phantom stack-member names** (`mysql-mempool`, `archy-mempool-api`, `archy-mempool-web`
— variant names from the union `startup_order` list that aren't live on this node). The phantom
`mysql-mempool` is 2nd in the start order; `do_orchestrator_package_start` hits its unknown-app-id
fallback → `do_package_start` inspect fails "no such object" → the `?` **aborts the whole start
sequence**, so `mempool-api` (pos 5) + `mempool` frontend (pos 8) never start. They then sat down
~6 min until the health monitor independently recovered them → #73 (frontend not running in 180s)
and #74 (api not queryable in 300s) both flake. Journal proof on .228: `package.restart mempool
failed: Start failed: mysql-mempool: ... no such object`, 23:27:32.
**Fix:** `ordered_containers_for_start` now orders only the *actually-present* containers and never
injects phantom order entries (new pure helper `order_present_containers` + 3 unit tests,
`dependencies.rs`). This is the SAME class as the mempool nginx bug — a hardcoded-name/reality
mismatch — and is exactly the manifest-driven-lifecycle anti-pattern the master plan targets.
- **Deploy + relaunch:** built release binary on .116, swapped `/usr/local/bin/archipelago` on .228
(containers live under `user@1000.service`, NOT the `archipelago.service` cgroup, so a service
restart does NOT kill them — verified via conmon cgroup paths). Manually verified mempool restart
keeps the stack up, then relaunched a clean 5× → see `gate-5x4.log` (check cmd above, swap the
filename). Expectation: all three fixed → 5/5 green → demote the banner.
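
A minimal sketch of the ordering rule behind the `ordered_containers_for_start` fix, assuming the helper takes the union `startup_order` list plus the container names that really exist on the node. The signature below is an assumption for illustration, not the actual `order_present_containers` in `dependencies.rs`:

```rust
use std::collections::HashSet;

/// Orders only the containers that are actually present on this node.
/// `startup_order` may name variants that are not installed here; those
/// entries are skipped instead of being injected into the start sequence.
/// (Hypothetical signature; the real helper may differ.)
pub fn order_present_containers(startup_order: &[&str], present: &[&str]) -> Vec<String> {
    let present_set: HashSet<&str> = present.iter().copied().collect();
    let mut seen: HashSet<&str> = HashSet::new();
    let mut ordered = Vec::new();

    // Names from the order list first, but only if they exist on the node.
    for &name in startup_order {
        if present_set.contains(name) && seen.insert(name) {
            ordered.push(name.to_string());
        }
    }

    // Present containers the order list does not mention go last,
    // in the order they were reported.
    for &name in present {
        if seen.insert(name) {
            ordered.push(name.to_string());
        }
    }

    ordered
}
```

With this shape a phantom such as `mysql-mempool` never reaches the start loop, so a failed inspect on a non-existent container cannot abort the rest of the sequence.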
**Code fixes shipped this session (all on `main`, built + DEPLOYED to .228 AND .198):**
- `2dad64b2` stop honours per-app grace (was `-t 30` deadline racing SIGKILL).
- `760a32bc` reconciler stops resurrecting user-stopped apps (dep-override + host-port watchdog).
- `6e49ce6f` container-list reports user-stopped apps as `stopped` despite a live UI companion.
- `452f05d8` companion self-heal on its own ~30s loop (was gated behind the slow per-app pass).
- Test-harness hardening: `88930558` `53b8e47f` `892ff083` `98f4fa44` (readiness retries, immich/
fedimint/NPM/lnd windows, inter-iteration settle). Binary built on .116
`core/target/release/archipelago` (4-fix); deploy = stop archipelago, cp to /usr/local/bin, start.
**NODE-STATE fixes on .228 NOT in the repo (re-apply if .228 is reset/reimaged):**
- nginx `/app/lnd/` proxy target was stale `8081` → fixed to `18083` (sed in
/etc/nginx/sites-{available,enabled}/archipelago + snippets, then `nginx -s reload`). Repo code is
correct (18083); old node config was stale.
- Removed a stale orphan `~/.config/containers/systemd/home-assistant.container` (ContainerName
`home-assistant` ≠ the real `homeassistant` container; it was stuck "activating"). Real app fine.
- electrumx was re-installed (`package.install` w/ image `146.59.87.168:3000/lfg2025/electrumx:v1.18.0`)
to re-register it as a tracked manifest app (it had become adopted plain-podman).
**KEY LESSON:** run the lifecycle gate **ON the node**, not via RPC from .116 — its bitcoin/companion/
orphan/endpoint tests use local `podman`/`systemctl`/`bitcoin-cli`/`curl`, so a remote run silently
tests the *runner* (this is why earlier runs from .116 falsely showed "bitcoin in IBD" etc.).
**Remaining (after 5× green):** netbird migration (#20 ph4 — the one real migration left) + btcpay/
mempool stack polish; Phase-3 `use_quadlet_backends`; B flip-on (EMBED_MANIFESTS+sign); per-app test
coverage (~30 apps unwritten); the mobile app-launch UX (§8 Roadmap P1). Multinode → its own plan.
---
## 8b. SESSION STATE + RESUME (updated 2026-06-22) — READ THIS FIRST ON RESUME
### Where we are — Task #20 (manifest lifecycle hooks) + indeedhub migration: DONE & 2-node verified
Manifest-driven lifecycle hooks + the IndeedHub stack migration are **complete and
live-verified on BOTH .228 and .198** (adoption + fresh-create + post_install hook
exec, stable under load). 15 commits this session: `4c1a4e59`..`e2a012d0`. Working
tree clean. The release lifecycle gate is **5×** (`ARCHY_ITERATIONS=5`).
tree clean. The release lifecycle gate is temporarily **5×** (was 20×; `ARCHY_ITERATIONS=5`).
**Shipped (all on `main`, newest first):**
- `e2a012d0` indeedhub frontend health → `tcp:7777` (was http GET `/`; the http check
@ -648,78 +247,30 @@ regenerate, matching .198) → re-run the canonical gate (DESTRUCTIVE only).
regression suite green (37/37). **Validated:** healthy app `vaultwarden` stops cleanly on .198
(running→exited→removed) — no regression; the deployed binary's stop path works.
**The gate stop-failure was MULTI-CAUSED (3 real product bugs) — all 3 now FIXED + the electrumx
lifecycle suite is GREEN (10/10, 66s) on .228:**
1. ✅ **Stop ignored per-app grace** (`podman stop -t 30` spurious 30s timeout) — commit `2dad64b2`.
Orchestrator now uses manifest `stop_grace_secs` → `stop_grace_secs_for()` table; deadline =
grace + 15s; applied to quadlet stop + API + CLI.
2. ✅ **Reconciler resurrected user-stopped apps** — commit `760a32bc`. The reconcile filter's
`dependency_required` override re-included a user-stopped dependency (electrumx ← active mempool),
the in-memory `disabled` set is wiped on manifest reload, and the host-port "repair" then restarted
the stopped backend within ~8s. Fix: `ensure_running_with_mode` now bails `Left("user-stopped")`
when the on-disk `user_stopped` marker is set (the single choke point all reconcile flows through);
install/start clear the marker first so user actions are unaffected.
3. ✅ **container-list reported user-stopped apps as `running`** — commit `6e49ce6f`. The backend was
Exited but its UI companion (electrs-ui/bitcoin-ui/…) kept serving the launch port, and the
state-refresh upgraded any reachable launch port to `running`. Fix: `handle_container_list` forces
`stopped` for `user_stopped` apps before the launch-port refresh.
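
The stop deadline and the state-reporting order from the three fixes above can be sketched as follows. This is an illustration under assumptions: the enum, function names, and boolean inputs are hypothetical and do not mirror the real `handle_container_list` or orchestrator types.

```rust
/// Extra seconds added on top of the per-app grace before the stop is
/// treated as timed out ("deadline = grace + 15s").
pub const STOP_DEADLINE_MARGIN_SECS: u64 = 15;

pub fn stop_deadline_secs(grace_secs: u64) -> u64 {
    grace_secs + STOP_DEADLINE_MARGIN_SECS
}

#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum ReportedState {
    Running,
    Stopped,
    Exited,
}

/// The user-stopped marker is checked first, so a UI companion that still
/// serves the launch port cannot upgrade a stopped app back to `Running`.
pub fn reported_state(
    user_stopped: bool,
    backend_running: bool,
    launch_port_reachable: bool,
) -> ReportedState {
    if user_stopped {
        return ReportedState::Stopped;
    }
    if backend_running || launch_port_reachable {
        ReportedState::Running
    } else {
        ReportedState::Exited
    }
}
```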
**But validation revealed the gate failures are MULTI-CAUSED — the grace bug is only one of ~5:**
1. ✅ FIXED — orchestrator ignored per-app stop grace (`podman stop -t 30` spurious 30s timeout).
2. ⛔ **`fedimint` is crash-looping / unhealthy on BOTH nodes** (`health_monitor: Auto-restarting
unhealthy container: fedimint`, attempt 6/10). An app that won't stay up can't be cleanly
stopped — fedimint was a *confounded* test case. Needs a fedimint-health investigation
(why is its container unhealthy / why does host port 8173 not become reachable).
`health_monitor` DOES respect `user_stopped` (health_monitor.rs:983) so that part is correct.
3. ⛔ **Host-listener repair watchdog** (`prod_orchestrator`: "host listener disappeared after
startup; restarting container app_id=fedimint") restarts containers whose launch port isn't
reachable — fights any stop of a port-unreachable app.
4. ⚠️ **State-model nuance:** `vaultwarden` showed `exited`→`absent`, never `stopped`; the gate waits
for exactly `"stopped"` (`wait_for_container_status … stopped`). The `Exited→Stopped` conversion
(server.rs:1191, needs `user_stopped.contains(id)`) isn't always firing — likely an id-vs-name
key mismatch. The gate may need to accept `exited`/`absent` as terminal, or the conversion fixed.
5. ⚠️ **Grace vs gate-timeout:** `electrumx` grace is 300s; if it ignores SIGQUIT the container
only dies at the 300s SIGKILL — far past the gate's 60s wait. `-t` is a *ceiling*, so a HEALTHY
electrumx that honours SIGQUIT stops fast; an unhealthy/ignoring one blows the gate window.
Decide: trim graces, make the gate's per-app stop-wait ≥ grace, or both.
6. ⚠️ **.228 contamination** (plain podman, no quadlet units) — my cascade-gate; re-quadletize.
**Earlier theories now RESOLVED/superseded:** "fedimint crash-looping" was **probe-induced churn**:
left alone, fedimint is stable (Up 48 min, 0 watchdog restarts/30 min); its restarts during testing
were the host-port watchdog firing while I rapid-cycled stop/start (fixed by #2). "Exited→Stopped
key mismatch" was actually the live-UI-companion launch-port issue (#3). "Grace vs gate-timeout"
(electrumx 300s) was moot — a healthy electrumx honours SIGQUIT and stops in <1s.
**TWO-NODE GATE RESULT (1×, DESTRUCTIVE, both with the 3-fix binary):**
- **.228: 104/110.** All previously-failing `package.stop` tests now PASS (bitcoin/btcpay/electrumx/
fedimint/immich). Remaining 6: test 31 (companion recreate), 44 (fedimint orphan — probe
pollution), 55 (immich restart timing), 83 (bitcoin not archival-synced), 94/99 (endpoint/lnd-proxy
cascade from 83).
- **.198: 94/110.** **14 of 16 failures are one root cause: bitcoin is in IBD** (test 83 says
`blocks=817652 headers=954850` — ~137k behind). Everything chained to bitcoin cascades: lnd
(16,85), btcpay (22,23,103), electrumx (37), mempool stack (71,72,73,101), endpoints (94),
bitcoin.getinfo (7,12). The other 2 are node-independent: **31** (companion recreate) and **44**
(fedimint orphan pollution).
**CONCLUSION: the lifecycle-stop blocker is FIXED and validated on both nodes.** The residual red is
NOT lifecycle bugs — it is (a) **bitcoin still syncing (IBD)** on the test nodes [test 83 is an
explicit precondition; nothing electrumx/lnd/btcpay/mempool can pass until it finishes], (b) **.228
plain-podman contamination** (my cascade-gate), and (c) two minor items: **test 31** companion-unit
recreate (both nodes — likely the 90s window vs reconcile tick + image step; investigate) and **test
44** orphan fedimint container left by my probing.
**EVERY gate failure is now FIXED or explained — NO lifecycle code bugs remain.** Final read:
- ✅ `package.stop` (the blocker): 3 bugs fixed (`2dad64b2`/`760a32bc`/`6e49ce6f`), green both nodes.
- **bitcoin-IBD cascade** (most of .198's red): environmental — bitcoin syncing (test 83 precondition).
- **test 31** companion-recreate: NOT a product bug. Two things: (a) **FIXED** — the companion
reconcile stage was gated behind the slow per-app pass; now it runs on its own ~30s loop
(`452f05d8`). Validated on .228 with the new binary: a deleted `archy-electrs-ui` unit self-heals
in **~10s** (was stuck 100s+), journal: `companion not active, repairing → wrote quadlet unit →
companion started`. (b) **HARNESS CAVEAT** — the companion-survives bats does LOCAL `rm`/`systemctl
--user` (no ssh), so running the gate from .116 against a remote node actually tests **.116's**
companions with **.116's** (old) binary, not the RPC target. ⇒ the companion-survives suite must be
run ON the target node (or with the new binary on .116) to be meaningful. This explains the
"failed on both nodes" runs — both were silently testing .116.
- **test 55** immich restart: NOT a bug — the heavy 3-container stack (postgres+redis+server) restarts
in >120s under load; immich DOES return to running. *Optional:* bump the immich restart wait.
- **test 44** fedimint orphan: my probe pollution; a teardown clears it.
**To reach a literally-green 5× gate (now infra/node-prep + minor test-window tuning, not lifecycle code):**
1. Let bitcoin finish IBD on a test node (or point the gate at an archival-synced bitcoin).
2. Re-quadletize .228 (reinstall its backends so `.container` units regenerate, matching .198).
electrumx done; bitcoin/btcpay/fedimint/immich/etc. remain. (Most backends ARE in manifest_ids
already; this is about regenerating quadlet units + clearing adopted plain-podman state.)
3. Optional: faster companion-reconcile cadence (test 31) + longer immich-restart wait (test 55) +
clear the test-44 orphan — or simply run the gate on a less-loaded, bitcoin-synced node.
4. ✅ **test 31 ROOT-CAUSED = contamination + load (NOT a product bug).** `companion::reconcile` only
recreates a deleted companion unit (e.g. `archy-electrs-ui`) when its PARENT backend (electrumx)
is in `manifest_ids`. On contaminated .228 electrumx ran as plain podman and was NOT a tracked
manifest install (its `/opt/.../electrumx/manifest.yml` exists on disk but wasn't loaded), so the
reconciler never iterated it → companion orphaned. **Proven fix:** `package.install electrumx`
re-registered it (now `reconcile action app_id=electrumx` fires) AND restored the companion (unit
present, service active). The companion self-heal logic is correct. ⇒ test 31 clears once .228 is
re-quadletized (step 2). electrumx on .228 is now de-contaminated. Still: clear test-44 orphans.
4. Then run `ARCHY_ITERATIONS=5 ARCHY_ALLOW_DESTRUCTIVE=1` on the synced+quadlet node, then the other.
**Bottom line:** the grace fix is correct and shipped, but **the gate will not go green until #2 through #6
are addressed**. These are pre-existing product/health issues the gate is correctly surfacing, not
regressions from this work. They need owner prioritization (esp. fedimint health, the watchdog-vs-
stop interaction, and the gate's terminal-state acceptance).
**Quadlet context (still true, but SEPARATE from the bug above):** quadlet IS the intended backend
runtime — .198 has the backend `.container` files (bitcoin-knots/btcpay-server/fedimint/filebrowser/
@ -736,7 +287,7 @@ bug is purely "container never stops", not "state not reported".
### MY-SESSION ERRATA (own it on resume)
- I ran the gate with `ARCHY_ALLOW_CASCADE_DESTRUCTIVE=1`, which is **NOT** the canonical gate (that
is `ARCHY_ALLOW_DESTRUCTIVE=1` only — stop/start/restart, no uninstall/reinstall; see run-gate.sh
is `ARCHY_ALLOW_DESTRUCTIVE=1` only — stop/start/restart, no uninstall/reinstall; see run-20x.sh
"Suggested release-gate invocation"). Cascade ran uninstall/reinstall on every app and, when I
killed the run mid-iteration, left bitcoin-knots/electrumx/btcpay/fedimint/immich uninstalled or
stranded. **I fully restored .228** (reinstalled bitcoin-knots with the correct image
@ -745,22 +296,30 @@ bug is purely "container never stops", not "state not reported".
- Reinstall gotcha: `package.install` needs a REAL image ref in `dockerImage`; a bare app name
fails with `Invalid Docker image format`.
### NEXT STEPS (in order) — SINGLE-NODE (.228) criterion
1. ✅ **DONE** — 4 stop/reconcile bugs fixed + deployed (`2dad64b2` grace, `760a32bc`
reconcile-resurrection guard, `6e49ce6f` container-list user-stopped, `452f05d8` companion
cadence). Plus test-harness fixes (lnd/immich/fedimint/NPM readiness + config).
2. ✅ **DONE** — gate run **ON .228** (synced bitcoin): **110/110 GREEN** (1×). Key lesson:
**run the gate on the node**, not via RPC from .116 (local podman/systemctl/bitcoin probes).
3. ◧ **5× run on .228 in progress** (`ARCHY_ITERATIONS=5 ARCHY_ALLOW_DESTRUCTIVE=1`, on the node).
5 consecutive clean iterations = the single-node gate criterion → demote the banner.
4. **netbird migration (#20 phase 4)** — the one real migration left; assess setup steps first (TLS
cert gen, config files, resolver IP — may need host-file-write hooks beyond exec/copy_from_host;
legacy is install_netbird_stack in stacks.rs). Then btcpay/mempool stack polish.
5. Hardening: `package.start` should regenerate a missing quadlet unit, not fall back to bare podman.
**Multinode / fleet (.198 + the rest) → `docs/multinode-testing-plan.md` (separate, after .228 green).**
Carry-over notes for that plan: .198 bitcoin was mid-IBD; the lnd `/app/lnd/` nginx proxy had a
stale `8081` target on .228 (repo code is correct at 18083 — re-check on other nodes).
### NEXT STEPS (in order)
1. ✅ **DONE** — root-caused the stop-grace bug, fixed it (commit `2dad64b2`), unit-tested,
release-built, **deployed to .198 + .228**, validated no-regression (vaultwarden stops on .198).
2. ⛔ **fedimint health** — why is its container unhealthy on both nodes (health_monitor restart
6/10; host port 8173 unreachable)? A crash-looping app can't pass the lifecycle gate. Likely the
real top blocker now. Same lens for any other unhealthy app surfaced by the gate.
3. ⛔ **Host-listener repair vs user-stop** — the launch-port watchdog
(`prod_orchestrator`: "host listener disappeared after startup; restarting container") must NOT
restart a container the user just stopped. Check it consults `disabled`/`user_stopped`.
4. ⚠️ **Gate terminal-state acceptance** — apps end `exited`/`absent`, not always `stopped`
(Exited→Stopped conversion at server.rs:1191 needs a matching `user_stopped` key). Either fix the
conversion (id-vs-name) or have `wait_for_container_status … stopped` accept exited/absent.
5. ⚠️ **Grace vs gate-timeout** — trim over-long graces (electrumx 300s) and/or make the gate's
per-app stop-wait ≥ the app's grace.
6. **Re-quadletize .228** (backend `.container` files wiped by my cascade-gate; reinstall its apps so
units regenerate, matching .198; verify `.container` + `PODMAN_SYSTEMD_UNIT`).
7. **Run the canonical gate** `ARCHY_ALLOW_DESTRUCTIVE=1 ARCHY_ITERATIONS=5` (NO cascade; never kill
mid-iteration) on .198 then .228. Green = Step-2-of-plan done.
8. Hardening: `package.start` should regenerate a missing quadlet unit, not fall back to bare podman;
re-survey the status doc's quadlet % from `.container`-file presence.
9. **netbird migration (#20 phase 4)** — same pattern; assess setup steps first (TLS cert gen,
config files, resolver IP — may need host-file-write hooks beyond exec/copy_from_host; legacy is
install_netbird_stack in stacks.rs).
10. Then single-container legacy apps onto the orchestrator install flow; then demote the banner.
### KNOWN ISSUES / WATCH-OUTS
- **.198 is a weak/loaded node** (load avg ~35). The generic reconcile recreates
@ -815,74 +374,3 @@ This master plan is the hub. Authoritative standalone docs (linked above), kept:
All dated handoffs/resumes/transcripts/superseded trackers were consolidated here
and removed (recoverable via git) on 2026-06-21.
## 10. Backlog — investigate frontend state management (2026-06-23)
**Investigate adopting a real client-state/data-fetching layer for `neode-ui`** instead of
the current hand-rolled Pinia stores + ad-hoc fetch/poll patterns. Motivation: lifecycle/UX
bugs like the stuck "full-red" install/uninstall progress bar and ghost **My Apps** entries
(see §6c) are partly a *state-sync* problem — the UI's view of package state drifts from the
backend and isn't reliably invalidated/refetched. A principled query/cache layer (request
dedup, background refetch, cache invalidation on mutation, optimistic updates, retry/stale
handling) would make these classes of bug structurally hard.
**Research → recommend → (maybe) adopt:**
- Evaluate **TanStack Query** (Vue Query) as the leading candidate, plus alternatives
(Pinia Colada, vue-query alternatives, plain Pinia + a disciplined invalidation layer, or
an SSE/WebSocket push model for package-state events instead of polling).
- Criteria: fit with the existing Pinia/RPC architecture, bundle-size cost, offline/PWA
behaviour, how cleanly it models long-running mutations (install/uninstall with progress),
and whether a push channel for package-state changes is the better root-cause fix.
- Deliverable: a short design note + a recommendation, then a scoped migration of the
package-lifecycle surfaces (My Apps / install / uninstall / update progress) as the proof
case — sequence AFTER workstream F (it informs F's progress-UI fix and vice-versa).
## 10b. Backlog — intelligent launch-port selection (2026-06-26)
**Replace the per-app static launch-port map with a smart, manifest-first heuristic.** Gitea
launched at **:2222 (SSH)** instead of **:3001 (web)** on a node missing the gitea manifest on
disk: `manifest_lan_address_for` returned None → the code fell through to `extract_lan_address`,
which returns podman's **first-listed** published port, and podman lists `2222->22` before
`3001->3000`. Patched 2026-06-26 (`670ebb06`) with a static `"gitea" => 3001` entry in
`lan_address_for` (`core/container/src/podman_client.rs`) — but that's a per-app band-aid (the
anti-pattern CLAUDE.md warns against; the map already carries bitcoin/lnd/mempool/immich/… by hand).
**Real fix (do this, then delete the static entries):**
- **Primary** is already correct — derive the launch URL from the manifest's declared
`interfaces.main` port. The failure was only the *fallback*. The north-star cure is
registry-distributed manifests (workstream B) so the manifest is always present and we never
guess.
- **Smart fallback** — make `extract_lan_address` stop returning the blind first port: **skip
container-side ports that are known non-HTTP (22/SSH, etc.) and prefer the published port whose
container side matches the manifest `health_check` endpoint / a known web port.** Fixes the whole
multi-port-app class generically (no per-app hardcoding), and lets us drop the static map.
- ~20-line change to one function + unit tests; rides the next fleet roll. NOT a free-port
remap (that's `port_allocator.rs`, which already resolves host-port *collisions* — a different
problem; gitea's web UI was never in conflict).
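
One possible shape for the smart fallback described above, as a sketch only. The `PublishedPort` type, the function name, and both port lists are assumptions; the source only names 22/SSH as a port to skip.

```rust
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub struct PublishedPort {
    pub host: u16,
    pub container: u16,
}

// Container-side ports assumed never to serve a web UI.
const NON_HTTP_CONTAINER_PORTS: &[u16] = &[22];
// Container-side ports assumed to be typical web ports (illustrative list).
const KNOWN_WEB_CONTAINER_PORTS: &[u16] = &[80, 443, 3000, 8080];

/// Picks the host port to launch, in this order:
/// 1. the port whose container side matches the manifest health-check port,
/// 2. a port whose container side is a known web port,
/// 3. the first port that is not known non-HTTP.
pub fn pick_launch_port(
    published: &[PublishedPort],
    health_check_port: Option<u16>,
) -> Option<u16> {
    let candidates: Vec<PublishedPort> = published
        .iter()
        .copied()
        .filter(|p| !NON_HTTP_CONTAINER_PORTS.contains(&p.container))
        .collect();

    if let Some(hc) = health_check_port {
        if let Some(p) = candidates.iter().find(|p| p.container == hc) {
            return Some(p.host);
        }
    }
    if let Some(p) = candidates
        .iter()
        .find(|p| KNOWN_WEB_CONTAINER_PORTS.contains(&p.container))
    {
        return Some(p.host);
    }
    candidates.first().map(|p| p.host)
}
```

For the gitea case, `2222->22` is filtered out before any choice is made, so `3001->3000` wins even when podman lists the SSH mapping first.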
## 10c. Backlog — generalize the archival/full-node install blocker (2026-06-26)
**Make "this app needs an un-pruned (archival, txindex) Bitcoin node" a manifest-declared
dependency, applied to every app that needs it — using the electrumX/mempool blocker as the
reference behavior.** Today the gate works but is **hardcoded**: `requires_unpruned_bitcoin()` in
`core/archipelago/src/api/rpc/package/dependencies.rs` is a literal `matches!(package_id, "electrumx"
| "electrs" | "mempool-electrs" | "mempool" | "mempool-web")`, and install `bail!`s with
`archival_bitcoin_required_message` when `bitcoin.pruned` is true or disk < `ARCHIVAL_BITCOIN_DISK_GB`
(1 TB). That's the same per-app-hardcoding anti-pattern as the gitea static map (§10b) and the
`install_*_stack` Rust — any new app needing a full node is silently *un*-gated until someone edits
this match.
**Do:**
- **Declare it in the manifest** — e.g. `requires: { bitcoin: archival }` (or a
`dependencies.bitcoin.pruned: false` constraint) so the install pre-flight reads the requirement
from the manifest set instead of a hardcoded list. Covers future apps automatically (manifest-driven
north star).
- **Audit coverage** — confirm EVERY archival-dependent app is gated (electrumX, electrs,
mempool + its electrs, and any BTC-indexer/explorer added later); add a unit test asserting the
manifest constraint ⇒ blocker fires.
- **UX** — the blocker must be a clear, surfaced **pre-install** state in the UI (not just an RPC
`bail!` string): explain *why* (pruned node / insufficient disk), what to do (add ~1 TB, resync
un-pruned with txindex), and keep the app visibly "requires archival node" rather than a confusing
generic failure. Pairs with workstream F's honest-progress/blocker UX.
- Reference: the existing `package-install-prune-check` dependency descriptor (dependencies.rs:208)
is the seam to make data-driven.
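
A sketch of what a manifest-driven pre-flight could look like, replacing the hardcoded `matches!` list. All types and names below are hypothetical, and the 1000 GB constant is an assumed reading of "1 TB"; the real `ARCHIVAL_BITCOIN_DISK_GB` value and the manifest field shape are not confirmed by this document.

```rust
/// Assumed value for the 1 TB threshold; the real constant may differ.
pub const ARCHIVAL_BITCOIN_DISK_GB: u64 = 1000;

/// What a manifest could declare, e.g. `requires: { bitcoin: archival }`.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum BitcoinRequirement {
    Any,
    Archival,
}

#[derive(Debug, Clone, Copy)]
pub struct NodeState {
    pub bitcoin_pruned: bool,
    pub disk_gb: u64,
}

/// A structured blocker the UI can render as a pre-install state,
/// instead of a bare error string.
#[derive(Debug, PartialEq, Eq)]
pub enum InstallBlocker {
    PrunedNode,
    InsufficientDisk { needed_gb: u64, available_gb: u64 },
}

/// Returns the blocker for an app, driven only by what its manifest
/// declares. Apps with no archival requirement are never blocked here.
pub fn archival_blocker(
    requirement: Option<BitcoinRequirement>,
    node: NodeState,
) -> Option<InstallBlocker> {
    if requirement != Some(BitcoinRequirement::Archival) {
        return None;
    }
    if node.bitcoin_pruned {
        return Some(InstallBlocker::PrunedNode);
    }
    if node.disk_gb < ARCHIVAL_BITCOIN_DISK_GB {
        return Some(InstallBlocker::InsufficientDisk {
            needed_gb: ARCHIVAL_BITCOIN_DISK_GB,
            available_gb: node.disk_gb,
        });
    }
    None
}
```

Returning a typed blocker rather than a message keeps the "why" (pruned vs. disk) available to the UI, which is what the UX bullet asks for.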


@ -103,10 +103,10 @@ Notes:
## 4. Test-gate reality
**No app has passed the formal release gate.** The gate is `run-gate.sh` green
**No app has passed the formal release gate.** The gate is `run-20x.sh` green
across the full lifecycle matrix (install / UI reachable / stop / start /
restart / reinstall / reboot-survive / archipelago-restart-survive / uninstall),
**5× on .228 AND .198**. All 8 release-gate checkboxes in
**20× on .228 AND .198**. All 8 release-gate checkboxes in
`tests/lifecycle/TESTING.md` are **unchecked (☐)**.
What exists today:
@ -132,7 +132,7 @@ failure): `bitcoin-receive.bats`, `port-drift.bats`, `secret-completeness.bats`.
1. **immich** is the last legacy (in-cgroup) app — migrate to Quadlet to finish Pillar 1.
2. **grafana / strfry** Quadlet units stuck *activating* with no container — investigate. (onlyoffice removed 2026-06-21.)
3. **fedimint-gateway / fedimint-clientd** (this session) now run but have no lifecycle test coverage.
4. The formal **5× release gate has never been green** — it is the blocker for the v1.7.52 tag.
4. The formal **20× release gate has never been green** — it is the blocker for the v1.7.52 tag.
---


@ -1,215 +0,0 @@
# Bitcoin Multi-Version Support — Design
**Status:** design (2026-06-22)
**Goal:** let a user choose *which* version of Bitcoin Core / Bitcoin Knots to
install (latest pre-selected, older versions in a dropdown), and later switch
versions or opt into auto-update — all manifest/catalog-driven, all served from
**our signed registry**, rootless, with **zero data loss** across version
changes.
See also: [`docs/registry-manifest-design.md`](registry-manifest-design.md)
(catalog distribution + signing this builds on),
[`docs/PRODUCTION-MASTER-PLAN.md`](PRODUCTION-MASTER-PLAN.md) (gate that must be
green first), `MEMORY → project_decoupled_app_updates`,
`MEMORY → project_manifest_driven_north_star`.
> **Scheduling:** this is net-new scope. It lands **after** the production test
> gate (`tests/lifecycle/run-20x.sh`) is green on `.228` + `.198`. The data-
> preservation invariant (downgrade vs. chainstate) is the highest risk here.
---
## 1. Where we are today
### Image source / build
| Thing | Today |
|-------|-------|
| `apps/bitcoin-core/Dockerfile` | `FROM bitcoin/bitcoin:24.0` — a **community** image, **stale** (manifest says 28.4), no project-official Docker image exists |
| `apps/bitcoin-knots/` | **no Dockerfile**; `:latest` is built/pushed by hand |
| Registry | `scripts/image-versions.sh` → `ARCHY_REGISTRY="146.59.87.168:3000/lfg2025"`; only `BITCOIN_KNOTS_IMAGE=…/bitcoin-knots:latest` pinned, no Core pin |
| Tags in registry | **one tag per image**. No historical versions. |
### Version pinning
- `apps/bitcoin-core/manifest.yml` → `…/bitcoin:28.4` (pinned).
- `apps/bitcoin-knots/manifest.yml` → `…/bitcoin-knots:latest` (**floating**: a
liability for reproducibility and for "switch back to the version I had").
- `core/archipelago/src/container/app_catalog.rs` + `app-catalog/catalog.json`:
signed, hourly-fetched, carries `version` (badge text) + `image`.
`catalog_image_override()` overrides the manifest image **only if same-repo**.
`available_update_for_app()` already ignores floating tags for update
detection.
### Install path
- `prod_orchestrator.rs::install_fresh()` resolves the image as
**manifest image → catalog override → pull**. There is **no per-install
version parameter** — `orchestrator.install(app_id)` takes only the id.
- RPC `package.install` (`api/rpc/package/install.rs`) *accepts* `dockerImage` /
`version` params but for orchestrator-managed apps (bitcoin-core / bitcoin-knots
are allowlisted) it **ignores them** and lets the orchestrator resolve.
- **Conflict guard** (`prod_orchestrator.rs` ~1306–1325): core and knots may not
run simultaneously. Must be preserved by everything below.
### UI
- Install is **one-click, no modal** (`MarketplaceAppDetails.vue::installApp()`).
- Update badge + "Update to X" already exist (`appDetails/AppHeroSection.vue`,
RPC `package.update`).
- **No** Bitcoin-specific settings panel; all apps share `AppSidebar.vue`.
- Per-app config persisted **only at install time** as `containerConfig` →
`/var/lib/archipelago/app-configs/<id>.json`. **No post-install set-config RPC.**
---
## 2. Source-of-truth decision: official upstream → our registry
We use the **official releases** as upstream provenance, but nodes only ever pull
from our registry. Nodes do **not** fetch bitcoin.org / GitHub at install time —
that would break rootless/offline installs and the signed-registry trust model,
and neither project publishes an official Docker image anyway.
**Official sources (verified):**
| Impl | Index | Per-version asset pattern |
|------|-------|---------------------------|
| Bitcoin Core | [bitcoincore.org/en/releases](https://bitcoincore.org/en/releases/) · [github bitcoin/bitcoin](https://github.com/bitcoin/bitcoin/releases) | `https://bitcoincore.org/bin/bitcoin-core-<ver>/bitcoin-<ver>-x86_64-linux-gnu.tar.gz` + `SHA256SUMS` + `SHA256SUMS.asc` |
| Bitcoin Knots | [github bitcoinknots/bitcoin](https://github.com/bitcoinknots/bitcoin/releases) · [bitcoinknots.org/files](https://bitcoinknots.org/) | `https://bitcoinknots.org/files/<maj>.x/<ver>/bitcoin-<ver>-x86_64-linux-gnu.tar.gz` (`<ver>` e.g. `29.3.knots20260508`) |
Both ship **signed binary tarballs** with multi-builder Guix attestations
(`SHA256SUMS.asc`). The build pipeline verifies these **once, at build**; our DHT
Phase 0 registry signature then carries provenance to the fleet.
> Knots version strings embed a build date (`29.3.knots20260508`). Treat the full
> string as the tag; surface a friendly `29.3` + date in the UI.
---
## 3. Design
### Phase 0 — Reproducible, verified image pipeline *(prerequisite)*
New `scripts/build-bitcoin-image.sh <impl> <version>` that, per version:
1. Downloads the official tarball + `SHA256SUMS(.asc)` (GitHub release assets are
an identical mirror → fallback).
2. Verifies SHA256 **and** the Guix/builder GPG signatures. **Fail closed.**
3. Builds a minimal **rootless** image: pin a small base, unpack
`bitcoind`/`bitcoin-cli`. Keep the existing entrypoint probe
(`command -v bitcoind || find /opt -path '*/bin/bitcoind'`) so per-version
layout differences don't break startup.
4. Tags + pushes `:<version>` **and** updates the default pin (`:latest` /
`:28.4`-style) to the registry.
**Curate, don't mirror everything.** Publish a bounded set (proposal: current +
last ~3 majors), e.g. Core `31.0, 30.0, 29.3, 28.4, 27.2` and Knots
`29.3.knots…, 28.1.knots…, 27.1.knots…`. **`log` / document dropped versions** —
silent truncation reads as "all versions supported" when it isn't.
Also fixes existing debt: replaces the stale community `FROM bitcoin/bitcoin:24.0`
and gives Knots a real Dockerfile + non-floating tags.
### Phase 1 — Version catalog (signed, registry-distributed)
Extend `AppCatalogEntry` (forward-compatible — no `deny_unknown_fields`, old nodes
ignore it):
```jsonc
"bitcoin-core": {
"version": "31.0", // default / latest (existing field)
"image": "…/bitcoin:31.0", // existing
"versions": [ // NEW
{ "version": "31.0", "image": "…/bitcoin:31.0", "default": true },
{ "version": "30.0", "image": "…/bitcoin:30.0" },
{ "version": "28.4", "image": "…/bitcoin:28.4", "deprecated": true, "eol": "2026-...." }
]
}
```
Published to `releases/app-catalog.json`, signed by the existing release-root
mechanism. This is the **single source of truth** the UI reads for "what can I
install / switch to," and third-party-registry apps inherit the capability for
free. `version`/`image` stay as the default for back-compat.
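
To illustrate how a reader of the extended catalog might resolve versions, here is a sketch using plain structs. The field and method names are assumptions (the real `AppCatalogEntry` and its deserialization are not shown in this document), and JSON parsing is left out to keep the example dependency-free.

```rust
#[derive(Debug, Clone, PartialEq, Eq)]
pub struct VersionEntry {
    pub version: String,
    pub image: String,
    pub is_default: bool,
    pub deprecated: bool,
}

#[derive(Debug, Clone)]
pub struct AppCatalogEntry {
    /// Existing top-level default, kept for back-compat.
    pub version: String,
    pub image: String,
    /// New optional list; empty on catalogs that predate it.
    pub versions: Vec<VersionEntry>,
}

impl AppCatalogEntry {
    /// The (version, image) pair to pre-select in an install dialog:
    /// the entry flagged default, else the top-level fields.
    pub fn preselected(&self) -> (&str, &str) {
        match self.versions.iter().find(|v| v.is_default) {
            Some(v) => (v.version.as_str(), v.image.as_str()),
            None => (self.version.as_str(), self.image.as_str()),
        }
    }

    /// Resolves a user-chosen version to its image, if the catalog lists it.
    pub fn image_for(&self, version: &str) -> Option<&str> {
        if let Some(v) = self.versions.iter().find(|v| v.version == version) {
            return Some(v.image.as_str());
        }
        if self.version == version {
            Some(self.image.as_str())
        } else {
            None
        }
    }
}
```

Falling back to the top-level `version`/`image` means an old catalog without `versions` still resolves to exactly what it does today.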
### Phase 2 — Install-time version selection
- **Orchestrator:** add `install_with_image(app_id, Option<image_tag>)` (or an
optional arg on `install`). When a tag is supplied, **validate same-repo**
against the manifest (reuse `image_without_registry_or_tag()`), then override in
`install_fresh()`. Default path unchanged. Preserve the core/knots conflict
guard.
- **RPC:** thread the selected version/image from `package.install` into the
orchestrator for the allowlisted apps (the param is already received — just not
forwarded).
- **UI:** the first **install modal** in the app — latest pre-selected, dropdown
of `versions[]`, deprecated/EOL badges on old entries. On confirm, pass the
chosen version to `package.install`.
### Phase 3 — In-app version switch + auto-update toggle
- **UI:** a Bitcoin **"Version & Updates"** card (conditional in `AppSidebar.vue`
for `bitcoin-core` / `bitcoin-knots`): current version, a switch dropdown, and
an **auto-update-to-latest** toggle.
- **Switch = controlled re-pull/recreate** reusing the `package.update`
machinery but targeting an arbitrary (incl. older) tag → effectively
`package.set-version`.
- **Persistence:** new `package.set-config` RPC writing the existing
`app-configs/<id>.json` (`{ pinnedVersion, autoUpdate }`).
- **Auto-update:** the existing hourly catalog check, when `autoUpdate:true`,
triggers `package.update` to the catalog default. A pinned version **suppresses
the update badge**.
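The update decision described in this phase can be sketched as a pure function. This is an illustration under stated assumptions: `decideUpdate` and `UpdateAction` are hypothetical names, and letting a pin take precedence over `autoUpdate` is an assumption (the plan only says a pin suppresses the update badge).

```typescript
// Shape of app-configs/<id>.json as proposed: { pinnedVersion, autoUpdate }.
interface AppConfig {
  pinnedVersion?: string;
  autoUpdate?: boolean;
}

type UpdateAction =
  | { kind: "none" }
  | { kind: "badge"; target: string }
  | { kind: "auto-update"; target: string };

function decideUpdate(
  installed: string,
  catalogDefault: string,
  config: AppConfig,
): UpdateAction {
  // A pinned version suppresses the badge (and, by assumption, auto-update).
  if (config.pinnedVersion) return { kind: "none" };
  // Already on the catalog default: nothing to do.
  if (installed === catalogDefault) return { kind: "none" };
  // Opted in: the hourly check would trigger package.update to the default.
  if (config.autoUpdate) return { kind: "auto-update", target: catalogDefault };
  // Otherwise only advertise the update.
  return { kind: "badge", target: catalogDefault };
}
```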
---
## 4. Invariants & safety rails
- **Rootless only.** Pipeline images and run path stay rootless; no Docker-socket,
no privileged.
- **No data loss across version change.** Preserve `/var/lib/archipelago/bitcoin`,
secrets (`bitcoin-rpc-password`, `…-rpcauth`), ports, and the adoption container
name on every install / switch / update.
- **⚠️ Downgrade vs. chainstate (highest risk).** Bitcoin Core refuses to start on
a chainstate written by a *newer* version unless reindexed (expensive, or data
loss on a pruned node). The UI **must** warn loudly on downgrade; the
orchestrator should gate/confirm it and never silently wipe. Pruned nodes can't
simply `-reindex`.
- **Core ⇄ Knots switch** stays governed by the existing conflict guard; treat an
impl switch as distinct from a version switch.
- **Floating tags** (`latest`) are never advertised as a selectable "version" and
never counted as an available update (already handled by
`available_update_for_app`).
- **Verify on a real node** (`.228` then `.198`) and pass `run-20x` before any
tag.
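A sketch of the downgrade check behind the chainstate rail above, assuming versions compare by their leading numeric components and that a Knots suffix such as `knots20260508` is ignored within the same implementation. All names here are hypothetical, and the pruned-node outcome is left as a separate result because that policy is still an open question.

```typescript
// Leading numeric components only: "29.3.knots20260508" -> [29, 3].
function numericParts(version: string): number[] {
  const parts: number[] = [];
  for (const piece of version.split(".")) {
    if (!/^\d+$/.test(piece)) break; // stop at a non-numeric suffix
    parts.push(Number(piece));
  }
  return parts;
}

// Returns -1, 0, or 1; missing components count as 0.
function compareVersions(a: string, b: string): number {
  const pa = numericParts(a);
  const pb = numericParts(b);
  const length = Math.max(pa.length, pb.length);
  for (let i = 0; i < length; i++) {
    const diff = (pa[i] ?? 0) - (pb[i] ?? 0);
    if (diff !== 0) return diff < 0 ? -1 : 1;
  }
  return 0;
}

type SwitchCheck = "ok" | "confirm-downgrade" | "pruned-downgrade";

// Any lower target is treated as a downgrade; the caller decides whether
// "pruned-downgrade" blocks outright or asks for explicit confirmation.
function checkSwitch(current: string, target: string, pruned: boolean): SwitchCheck {
  if (compareVersions(target, current) >= 0) return "ok";
  return pruned ? "pruned-downgrade" : "confirm-downgrade";
}
```

This covers version switches only; a Core/Knots implementation switch would still go through the existing conflict guard.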
---
## 5. Files / seams (no code yet)
| Concern | File |
|---------|------|
| Image build/push | new `scripts/build-bitcoin-image.sh`; `apps/bitcoin-core/Dockerfile`; new `apps/bitcoin-knots/Dockerfile`; `scripts/image-versions.sh` |
| Catalog schema | `core/archipelago/src/container/app_catalog.rs`; `releases/app-catalog.json` (+ `app-catalog/catalog.json`) |
| Install override | `core/archipelago/src/container/prod_orchestrator.rs` (`install` / `install_fresh`); `api/rpc/package/install.rs`; `api/rpc/dispatcher.rs` |
| Switch / set-config RPC | `api/rpc/package/update.rs`; new `package.set-config` handler; `app-configs/<id>.json` |
| Install modal | `neode-ui/src/views/MarketplaceAppDetails.vue`; new `…/marketplace/AppInstallModal.vue` |
| Version & Updates card | `neode-ui/src/views/appDetails/AppSidebar.vue`; `neode-ui/src/api/rpc-client.ts`; `neode-ui/src/types/api.ts` |
---
## 6. Open questions
1. **Curated version set** — how many majors back do we host, and storage budget
on the registry?
2. **Multi-arch** — fleet is x86_64 today; do any nodes need arm64 images?
3. **Pruned-node downgrade policy** — block outright, or allow with an explicit
"this will require re-sync / may lose pruned data" confirmation?
4. **Auto-update default** — off (opt-in) for a consensus-critical app like
Bitcoin? (Recommended: **off**, explicit opt-in.)
5. **Knots date-suffix UX** — how to display `29.3.knots20260508` cleanly.
---
## Sources
- [Bitcoin Core releases](https://bitcoincore.org/en/releases/)
- [bitcoin/bitcoin releases](https://github.com/bitcoin/bitcoin/releases)
- [bitcoinknots/bitcoin releases](https://github.com/bitcoinknots/bitcoin/releases)
- [Bitcoin Knots](https://bitcoinknots.org/)
- [bitcoin.org version history](https://bitcoin.org/en/version-history)

109
docs/demo-build-info.md Normal file

@ -0,0 +1,109 @@
# Archipelago Public Demo — build info & status
**Status:** implemented & deployable (2026-06-22)
**Branch:** `demo-build` (worktree `../archy-demo-build`), pushed to
`gitea-vps2` = `http://146.59.87.168:3000/lfg2025/archy.git`.
**Main/prod is untouched** — all demo work lives only on `demo-build`.
A public, click-to-play demo of the Archipelago UI, 100% mock-data driven,
multi-visitor, deployed via Portainer. See also `docs/demo-deployment-design.md`
(original design) and `demo-deploy/` (thin prebuilt-image stack).
---
## Deploy (Portainer)
Build-from-repo (works today, no registry needed):
| Field | Value |
|-------|-------|
| Repository URL | `http://146.59.87.168:3000/lfg2025/archy.git` |
| Reference | `refs/heads/demo-build` |
| Compose path | `docker-compose.demo.yml` |
| Auth | user `lfg2025`, password = Gitea token |
| UI port | **2100** · Login password: **`entertoexit`** |
Redeploy after each push. `docker-compose.demo.yml` builds two images
(`neode-ui/Dockerfile.backend` = mock server, `neode-ui/Dockerfile.web` = nginx+UI).
The thin `demo-deploy/docker-compose.yml` pulls prebuilt `:demo` images instead
(needs the CI image pipeline / registry wired — `.github/workflows/demo-images.yml`).
### Flags / env
- Backend: `DEMO=1` (compose sets it) → multi-session sandbox, no real runtime.
- Web build: `VITE_DEMO=1` (Dockerfile.web ARG, default 1) → inlined demo UI behaviour.
- Optional: `ANTHROPIC_API_KEY` (NOT needed — AIUI chat is canned in demo),
`DEMO_SESSION_TTL_MS` (45m), `DEMO_MAX_SESSIONS` (500), `DEMO_FILE_QUOTA_BYTES` (50MB).
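A small sketch of reading the optional limits with the defaults listed above. `readDemoLimits` and `positiveInt` are hypothetical helpers, and interpreting 50MB as 50 × 1024 × 1024 bytes is an assumption.

```typescript
interface DemoLimits {
  sessionTtlMs: number;
  maxSessions: number;
  fileQuotaBytes: number;
}

// Accept only positive integers; anything else falls back to the default.
function positiveInt(raw: string | undefined, fallback: number): number {
  const parsed = Number(raw);
  return raw !== undefined && Number.isInteger(parsed) && parsed > 0 ? parsed : fallback;
}

// `env` is passed in (e.g. process.env) so the function stays testable.
function readDemoLimits(env: Record<string, string | undefined>): DemoLimits {
  return {
    sessionTtlMs: positiveInt(env["DEMO_SESSION_TTL_MS"], 45 * 60 * 1000),
    maxSessions: positiveInt(env["DEMO_MAX_SESSIONS"], 500),
    fileQuotaBytes: positiveInt(env["DEMO_FILE_QUOTA_BYTES"], 50 * 1024 * 1024),
  };
}
```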
---
## Architecture
Everything is gated behind `DEMO` (off = classic single-user dev mock, unchanged).
- **`neode-ui/mock-backend.js`** — the entire fake backend (Node/Express, ~95+ RPCs).
- **Per-session isolation:** `AsyncLocalStorage` + Proxy. Globals (`mockData`,
`walletState`, `userState`, `mockState`, `bitcoinRelayMockState`) are Proxies
that resolve to the current request's store, keyed by a `demo_sid` cookie.
Deep-cloned from `SEED_*` on first hit; idle-reaped; per-session WS fan-out.
- **Files:** per-session in-memory store + curated disk files (see below).
- Forces simulation mode in DEMO (`docker=null`).
- **`neode-ui/src/composables/useDemoIntro.ts`** — the frontend demo switch
(`IS_DEMO`), per-day intro gate, `DEMO_PASSWORD`, app demoability + launch URLs.
- **`neode-ui/docker/nginx-demo.conf`** — routes `/rpc`, `/ws`, `/app/*`,
`/electrs-status`, `/proxy/`, `/lnd-connect-info`, the IndeeHub/Mempool
reverse-proxies, and the SPA.
- **`docker/{bitcoin-ui,electrs-ui,lnd-ui,fedimint-ui}/`** — the REAL registry app
UIs, served statically under `/app/<id>/` with mocked data endpoints.
- **`demo/aiui/`** — prebuilt AIUI dist (chat is canned; `?mockArchy&seed`).
- **`demo/files/`** — curated cloud files drop-in (see below).
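The per-session isolation above (a global that is really a Proxy resolving to the current session's store) can be illustrated in a few lines. This is a simplified sketch, not the code in `mock-backend.js`: a synchronous `runInSession` helper stands in for `AsyncLocalStorage`, and `SEED`, `currentStore`, and the store shape are hypothetical.

```typescript
interface WalletState {
  balanceSats: number;
}
interface SessionStore {
  walletState: WalletState;
}

// Pristine seed; every new session gets its own deep clone.
const SEED: SessionStore = { walletState: { balanceSats: 1000 } };
const sessions = new Map<string, SessionStore>();

// Stand-in for AsyncLocalStorage: tracks the session of the running call.
let currentSid: string | undefined;

function runInSession<T>(sid: string, fn: () => T): T {
  const previous = currentSid;
  currentSid = sid;
  try {
    return fn();
  } finally {
    currentSid = previous;
  }
}

function currentStore(): SessionStore {
  const sid = currentSid;
  if (sid === undefined) throw new Error("no active demo session");
  let store = sessions.get(sid);
  if (!store) {
    store = JSON.parse(JSON.stringify(SEED)) as SessionStore; // deep clone on first hit
    sessions.set(sid, store);
  }
  return store;
}

// Handlers keep using a "global" walletState; reads and writes are routed
// to whichever session is current.
const walletState = new Proxy({} as WalletState, {
  get: (_target, prop) => Reflect.get(currentStore().walletState, prop),
  set: (_target, prop, value) => Reflect.set(currentStore().walletState, prop, value),
});
```

The appeal of this pattern is that existing handler code written against the globals does not need to change; only the request entry point has to establish the session context.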
## Demo features (all implemented)
Per-session sandbox · per-session file upload (Range streaming) · testnet/signet
flavor · per-day intro replay · `entertoexit` login (prefilled + hint) · version
`<real>-demo` · onboarding wizard skipped (intro kept) · "No demo" install gating ·
real app UIs (Bitcoin Core vs Knots by subversion, ElectrumX, LND, Fedimint;
Mempool/IndeeHub iframed) · 12 federation nodes / 5 peers · FIPS active · interactive
buy flow (testnet addresses, bolt11, 2s QR) · real testnet tx links (mempool.space) ·
networking profits 5,231,978 sats + labelled wallet txs · VPN · Nostr relays ·
node-visibility toggle · dummy Cashu mints + Fedimint federations · AIUI canned
reply + `?mockArchy` mock data + `?seed` pre-loaded "Content Showcase" chat.
---
## Curated cloud files (`demo/files/`)
Drop real files into `demo/files/<Folder>/<file>` and commit — they become the
cloud content for every visitor (read-only; git access = the "private login").
Loader **merges per top-level folder**: adding `Music/` swaps only Music and keeps
the sample Documents/Photos/Videos. Empty → built-in seeds. Text inlined; binaries
streamed from disk with HTTP Range (seek). The backend reads `/demo/files`.
**Dockerfile.backend COPYs it; `.dockerignore` must allow it.**
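The per-folder merge rule can be sketched as a pure function over folder listings. `FolderMap` and `mergeCuratedFolders` are hypothetical names, and treating an empty curated folder as "keep the samples" is an assumption that extends the "Empty → built-in seeds" rule to individual folders.

```typescript
// Top-level folder name -> file names inside it.
type FolderMap = Record<string, string[]>;

// A curated folder with files replaces the sample folder of the same name;
// folders that are absent (or empty) in the curated set keep their samples.
function mergeCuratedFolders(samples: FolderMap, curated: FolderMap): FolderMap {
  const merged: FolderMap = {};
  for (const folder of Object.keys(samples)) {
    merged[folder] = (samples[folder] ?? []).slice();
  }
  for (const folder of Object.keys(curated)) {
    const files = curated[folder] ?? [];
    if (files.length > 0) merged[folder] = files.slice();
  }
  return merged;
}
```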
---
## Gotchas (READ before editing)
- **Sibling dirs need both the Dockerfile COPY and a `.dockerignore` allow.**
`docker/bitcoin-ui`, `docker/electrs-ui`, `docker/lnd-ui`, `docker/fedimint-ui`,
`demo/files` are outside `neode-ui/`; they're copied into the backend image and
un-ignored in `.dockerignore` (`*` + `!docker/` + `docker/*` + `!docker/<ui>/`).
Forgetting either → Portainer build "not found" or runtime 500/404.
- **Real app UIs assume root-serving** — served via `express.static('/app/<id>')`
+ `/app/<id>/assets/*` → `/assets/*` redirect + per-path data endpoints
(`bitcoin-status`, `rpc/v1`, `bitcoin-rpc/`, `/proxy/lnd/*`, `/electrs-status`).
- **Uploaded-via-UI files are ephemeral** (per-session, lost on redeploy/reap).
Only `demo/files/` persists.
- **Mempool iframe is best-effort** (third-party CSP/websockets). **IndeeHub** is
reverse-proxied with header-strip + `sub_filter` asset rewrite; if still black,
it's indee's own `X-Frame-Options` (fix on that server).
- **AIUI `?seed` bootstrap hardcodes the current AIUI bundle hash**
(`/aiui/assets/seedPrompts-CLWaUv28.js`) — re-paste if AIUI is rebuilt. Tiny
first-load IndexedDB race (one refresh shows the chat).
- **Running mock-backend.js locally in the sandbox is flaky:** start backgrounded,
`sleep 5+`, then curl; NEVER `pkill -f mock-backend` (it matches & kills the
shell) — use `pkill -x node`.
- **Delete-405** seen pre-redeploy was nginx/stale; backend DELETE returns 200.
---
## Commit trail (demo-build, newest last)
`2715f2d8` sandbox → … → `7efebb4a` media merge + AIUI seed. ~14 commits, all
`feat(demo)/fix(demo)`.


@ -1,169 +0,0 @@
# Public Demo Deployment — Design
**Status:** design (2026-06-22)
**Goal:** a public, click-to-play demo of the Archipelago UI that **auto-tracks
the real code** yet stays **separated** from the private monorepo and its
secrets/backend. Deployed via **Portainer**, mock-data driven, with working file
storage and a testnet-flavored Bitcoin sandbox so visitors can play freely.
See also: `neode-ui/mock-backend.js` (existing mock), `docker-compose.demo.yml`
(existing demo stack), `MEMORY → reference_neode_ui_dev_testing`,
`MEMORY → reference_ovh_168_mirror` (Portainer/registry host).
---
## 1. What already exists (the 70%)
The demo is mostly built. Inventory:
| Asset | Path | State |
|-------|------|-------|
| Mock backend (Node/Express + ws) | `neode-ui/mock-backend.js` (~3,862 lines) | 95+ JSON-RPC methods: auth, package lifecycle, Bitcoin/LND wallet, mesh, federation, identity, monitoring, mock filebrowser |
| Mock data | `mockData` / `walletState` / `MOCK_FILES` in `mock-backend.js` | rich; 10 pre-installed apps, 30+ marketplace apps, wallet balances, seeded files (Music/Documents/Photos/Videos) |
| Demo compose | `docker-compose.demo.yml` | `neode-backend` (mock, `:5959`) + `neode-web` (nginx, `:4848`); header already says "Deploy via Portainer" |
| Backend image | `neode-ui/Dockerfile.backend` | Node 22 Alpine → `node mock-backend.js` |
| Web image | `neode-ui/Dockerfile.web` | multi-stage `vite build` → nginx |
| Demo nginx | `neode-ui/docker/nginx-demo.conf` | proxies `/rpc/v1`, `/ws`, `/app/*` to the mock backend |
| Precedent | `indee-demo` Portainer stack | separate stack referencing a **pre-built image** — the pattern we extend |
**Gaps for a *public* (not dev) demo:** state is global (visitors collide),
uploads are no-ops, Bitcoin block height is hardcoded, no CI image pipeline, no
separated public deploy repo.
---
## 2. Architecture: source in monorepo, demo ships as images, public repo is thin
The tension — "must update as I update the real code" **and** "sort of
separated" — is resolved by separating at the **deploy layer, not the source
layer**.
```
monorepo (private — single source of truth)
neode-ui/ + mock-backend.js
│ push to main
CI: build archy-demo-web + archy-demo-backend
│ push :demo / :latest
registry (146.59.87.168:3000 / vps2)
│ Portainer webhook / re-pull
archy-demo (public repo — tiny)
docker-compose.yml ──referencing pre-built images──▶ Portainer ▶ demo.<host>
.env.example
```
- **Single source of truth = the monorepo.** `neode-ui/` and `mock-backend.js`
stay where they are, so the demo tracks real code automatically — no fork to
sync, no drift.
- **Separation = the public repo never holds source.** `archy-demo` contains only
a `docker-compose.yml` (image refs) + `.env.example` + README. No Rust backend,
no secrets, no UI source. Safe to make public.
- **Auto-update flow:** edit code → push → CI rebuilds demo images → Portainer
redeploys. The public compose file is touched rarely (only when service shape
changes).
**Why not a true fork / `git subtree split`?** It works but needs a sync job
*and* re-exposes UI source publicly. The image pipeline gives stronger
separation (zero source leak) **and** zero manual sync. (Decided 2026-06-22.)
---
## 3. Work items
### 3.1 CI image pipeline
- On push to `main` (path filter: `neode-ui/**`), build:
- `archy-demo-backend` from `neode-ui/Dockerfile.backend`
- `archy-demo-web` from `neode-ui/Dockerfile.web` (`build:docker`)
- Tag `:demo` + `:<git-sha>`, push to the registry.
- Trigger Portainer redeploy (stack webhook) on success.
### 3.2 Public `archy-demo` repo
- `docker-compose.yml` mirroring `docker-compose.demo.yml` but **`image:`
references instead of `build:`** (pull `:demo`, no build context).
- `.env.example` (`ANTHROPIC_API_KEY`, `VITE_DEV_MODE=existing`, session TTL,
upload quota).
- README: one-paragraph "deploy in Portainer → web editor paste / deploy from
repo," access on `:4848`.
- No source. This is the only public surface.
### 3.3 Multi-user: per-session sandbox (reset on idle) ⟵ *decided*
The biggest code change. Today `mockData` / `walletState` / `MOCK_FILES` are
**global singletons** → visitors corrupt each other's view.
- Issue a `demo-session` cookie on first hit (the mock already sets a session on
login; extend it to anonymous visitors).
- Key state by session id: `sessions[sid] = { mockData, walletState, files }`,
each **deep-cloned from a pristine seed** on creation.
- Reap on idle (e.g. 30 min no activity) + hard cap concurrent sessions; on reap,
free memory + temp dir.
- RPC dispatch + WS patches resolve the per-session state instead of the global.
- Keeps the demo a true playground: install/uninstall/spend freely, reset by
reconnecting.
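The idle reap and the concurrency cap above can be sketched with a plain map of sessions. `admit` and `reapIdle` are hypothetical names; freeing a session's memory and temp dir is reduced here to dropping the map entry.

```typescript
interface Session {
  lastSeenMs: number;
}

// Touch an existing session, or create one if the cap allows it.
// Returns false when the visitor must be turned away.
function admit(
  sessions: Map<string, Session>,
  sid: string,
  nowMs: number,
  maxSessions: number,
): boolean {
  const existing = sessions.get(sid);
  if (existing) {
    existing.lastSeenMs = nowMs;
    return true;
  }
  if (sessions.size >= maxSessions) return false;
  sessions.set(sid, { lastSeenMs: nowMs });
  return true;
}

// Drop every session idle for longer than ttlMs; returns the reaped ids so
// the caller can clean up per-session resources.
function reapIdle(sessions: Map<string, Session>, nowMs: number, ttlMs: number): string[] {
  const reaped: string[] = [];
  sessions.forEach((session, sid) => {
    if (nowMs - session.lastSeenMs > ttlMs) {
      sessions.delete(sid);
      reaped.push(sid);
    }
  });
  return reaped;
}
```

In a server this would typically run from a timer, with `nowMs` taken from `Date.now()`; passing the clock in keeps the logic deterministic for tests.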
### 3.4 File storage: persisted per session ⟵ *decided*
Today filebrowser upload/delete/rename are 200-OK no-ops.
- Back each session with a temp dir (e.g. `/tmp/demo/<sid>/`), seeded from
`MOCK_FILES`.
- Make `POST/DELETE/PATCH /app/filebrowser/api/resources/*` and `GET …/raw/*`
read/write that dir. Enforce a per-session quota (e.g. 50 MB) and reject
oversize/odd MIME.
- Cleaned when the session is reaped — no standing public writable volume, no real
filebrowser container to harden.
### 3.5 Bitcoin: testnet-flavored mock ⟵ *decided*
- Relabel wallet/chain as **testnet/signet**: `tb1q…` addresses, "testnet" chain
in `bitcoin.getinfo`, scripted-but-plausible block height + confirmations.
- Keep `dev.faucet` as the in-UI "get test sats" button (instant, free).
- No real `bitcoind` → no sync, no disk, no public RPC attack surface.
- *Future upgrade path:* swap to a real signet node + LND in the stack if we ever
want movable real test sats (out of scope now).
### 3.6 Mock containers / app lifecycle
- The mock already simulates `package.install/uninstall/start/stop/restart`
asynchronously. For the demo, **force simulation mode** (never touch a real
Docker socket — rootless/safe and host-independent). Confirm no path in
`mock-backend.js` reaches for a real runtime when `DEMO=1`.
### 3.7 Mock-data refresh
- Update `mockData` static apps + marketplace to current app set/versions, refresh
wallet figures, seeded mesh messages, and files so the demo feels current. This
is ongoing and rides the same image pipeline.
---
## 4. Invariants / guardrails (public exposure)
- **No real secrets, no real backend, no real Docker socket** in the demo image or
public repo. Mock password stays a known demo credential, clearly labeled.
- **Per-session isolation** is a hard requirement before going public — without it
the demo is unusable for strangers.
- **Resource caps:** session count, per-session memory + upload quota, idle reap;
the box can't be DoS'd into OOM by upload spam or session churn.
- **`ANTHROPIC_API_KEY`** (chat) is injected via Portainer env, never committed;
rate-limit / budget-cap demo chat usage.
- **Read-only registry creds** for the Portainer host to pull `:demo`.
---
## 5. Files / seams
| Concern | Where |
|---------|-------|
| Per-session state, file persistence, testnet labels, sim-mode | `neode-ui/mock-backend.js` |
| Build contexts (reused as-is) | `neode-ui/Dockerfile.backend`, `neode-ui/Dockerfile.web`, `neode-ui/docker/nginx-demo.conf` |
| Demo stack (in-repo, dev) | `docker-compose.demo.yml` (keep `build:`) |
| Public stack (new repo) | `archy-demo/docker-compose.yml` (`image:` refs), `.env.example`, README |
| CI pipeline | new workflow (path filter `neode-ui/**` → build + push `:demo` → Portainer webhook) |
---
## 6. Open questions
1. **Demo host** — which Portainer instance (OVH `.168`? a dedicated VPS)? Public
DNS + TLS for `demo.<domain>`?
2. **Registry for `:demo` images** — `146.59.87.168:3000` vs vps2; public-pull or
creds baked into Portainer?
3. **Session TTL + concurrency cap** — concrete numbers (30 min / N sessions / 50 MB)?
4. **Chat in the demo** — enable Claude chat (needs key + budget cap) or stub it?
5. **Sync cadence** — rebuild `:demo` on every `neode-ui/**` push, or nightly?


@ -1,69 +0,0 @@
# Multinode / Fleet Testing Plan (separate from the single-node gate)
> **Scope split (2026-06-22):** the production test gate (`docs/PRODUCTION-MASTER-PLAN.md` §5,
> `tests/lifecycle/TESTING.md`) is now a **single-node criterion on .228**. Verifying the same
> lifecycle matrix across the rest of the fleet (.198 and the other testers) lives HERE and is run
> **after** the .228 single-node gate is green. This is intentionally NOT a blocker on the .228 gate.
## Why split it out
The lifecycle gate must be **run ON the node under test** — its bitcoin/companion/orphan/endpoint
checks use local `podman`/`systemctl`/`bitcoin-cli`/`curl`, not RPC to a remote host. Running it from
one host against another silently tests the *runner*. So "multinode" isn't "point the harness at N
hosts" — it's "run the on-node gate on each host," plus the genuinely cross-node concerns (federation,
mesh, transport, sync) that a single node can't exercise.
## How to run the gate on another node
Bats + jq usually aren't installed on ISO nodes. Bootstrap (one-time per node):
```
# from a host that has them (e.g. .116):
dpkg -L bats | grep -E '^/usr/(bin|lib|libexec)' | tar czf /tmp/bats.tgz -P -T - $(which jq)
tar czf /tmp/tests.tgz -C <repo> tests/lifecycle
scp /tmp/bats.tgz /tmp/tests.tgz <node>:/tmp/
# on the node:
sudo tar xzf /tmp/bats.tgz -P -C / # bats (jq here is dynamically linked — may need libs)
sudo curl -fsSL -o /usr/local/bin/jq \
https://github.com/jqlang/jq/releases/download/jq-1.7.1/jq-linux-amd64 && sudo chmod +x /usr/local/bin/jq
mkdir -p /tmp/lifecycle-run && tar xzf /tmp/tests.tgz -C /tmp/lifecycle-run
cd /tmp/lifecycle-run/tests/lifecycle
ARCHY_HOST=127.0.0.1 ARCHY_SCHEME=https ARCHY_PASSWORD=<node pw> \
ARCHY_ALLOW_DESTRUCTIVE=1 ARCHY_ITERATIONS=5 nohup ./run-gate.sh > /tmp/gate.log 2>&1 &
```
## Per-node preconditions (learned on .228)
- **Bitcoin must be fully synced + archival** (`initialblockdownload:false`, `pruned:false`).
Test 83 reads the *real* `getblockchaininfo`, not the UI's headers-height. A node mid-IBD will
cascade-fail electrumx/lnd/btcpay/mempool even though the apps run.
- **Backends should be proper installs** (in `manifest_ids`), not adopted plain-podman left over
from ad-hoc `package.start`/cascade churn — otherwise companion self-heal and quadlet checks skew.
- **No stale per-app nginx proxy targets.** e.g. `/app/lnd/` must point at the lnd-ui port (18083),
not a stale `8081`. Repo code is correct; old node configs may be stale — re-check + regenerate.
- **No orphan quadlet units** (e.g. a `home-assistant.container` whose ContainerName ≠ the real
`homeassistant` container) — these wedge `systemctl --user` "activating" and fail the quadlet checks.
## Node roster (carry-over)
| Node | Role | Notes |
|------|------|-------|
| .228 | **single-node gate** (primary) | 14-app resilience node; bitcoin synced archival; gate GREEN. |
| .198 | fleet verify | was weak/loaded (load ~35) + **bitcoin mid-IBD** at split time → must finish syncing first; sshd wedges under concurrent SSH (use ONE session; gate uses HTTPS RPC so fine). |
| .5 / .120 | x250 testers (Tailscale) | flaky cellular; SSH via `tailscale nc` ProxyCommand. |
| .116 | dev/validation | local repo; its own bitcoin may be mid-IBD — do NOT treat as a gate target unless synced. |
## Cross-node concerns (only a multinode setup can test)
- Federation sync (Tor/FIPS transports), DID/contact federation, peer file fetch.
- Mesh (Meshtastic/MeshCore) + mesh-AI gating.
- Dual-ecash federation validation + networking-sats routing.
- DHT / iroh swarm distribution (origin-always-wins) once that dep lands.
## Sequence
1. Get the **.228 single-node gate green 5×** (master plan §5/§6) — DONE/in progress.
2. THEN: bring each fleet node to the preconditions above; run the on-node gate 5× per node.
3. THEN: the cross-node suites (federation/mesh/transport), tracked here.
This plan does not gate the v1.7.x single-node criterion; it is the next layer.


@ -14,6 +14,14 @@ RUN npm install
# Copy application code
COPY neode-ui/ ./
# Sibling assets the mock backend reads relative to /app (../docker, ../demo):
# the Bitcoin UI mock shell and any curated cloud files dropped into demo/files.
COPY docker/bitcoin-ui /docker/bitcoin-ui
COPY docker/electrs-ui /docker/electrs-ui
COPY docker/lnd-ui /docker/lnd-ui
COPY docker/fedimint-ui /docker/fedimint-ui
COPY demo/files /demo/files
# Expose port
EXPOSE 5959


@ -20,6 +20,12 @@ RUN find public/assets -name "*backup*" -type f -delete || true && \
ENV DOCKER_BUILD=true
ENV NODE_ENV=production
# Public-demo build flag — inlined into the bundle (import.meta.env.VITE_DEMO).
# Enables the per-day intro replay, the "entertoexit" login hint, and other
# demo-only UI affordances. Override with --build-arg VITE_DEMO=0 for a plain build.
ARG VITE_DEMO=1
ENV VITE_DEMO=$VITE_DEMO
# Use npm script which handles build better
RUN npm run build:docker || (echo "Build failed! Listing files:" && ls -la && echo "Checking vite config:" && cat vite.config.ts && exit 1)


@ -62,6 +62,28 @@ http {
proxy_set_header X-Real-IP $remote_addr;
}
# ElectrumX UI status (polled by the electrs-ui shell)
location /electrs-status {
proxy_pass http://neode-backend:5959;
proxy_http_version 1.1;
proxy_set_header Host $host;
proxy_set_header X-Real-IP $remote_addr;
}
# LND UI endpoints (polled by the lnd-ui shell)
location /proxy/ {
proxy_pass http://neode-backend:5959;
proxy_http_version 1.1;
proxy_set_header Host $host;
proxy_set_header X-Real-IP $remote_addr;
}
location /lnd-connect-info {
proxy_pass http://neode-backend:5959;
proxy_http_version 1.1;
proxy_set_header Host $host;
proxy_set_header X-Real-IP $remote_addr;
}
# Proxy FileBrowser API to mock backend (demo mode)
location /app/filebrowser/ {
client_max_body_size 10G;
@ -72,6 +94,59 @@ http {
proxy_request_buffering off;
}
# IndeeHub: reverse-proxy the real site same-origin, strip framing headers,
# and rewrite its absolute asset paths (/assets, /, src, href) to the
# /app/indeedhub/ prefix so the SPA loads inside the iframe.
location /app/indeedhub/ {
proxy_pass https://indee.tx1138.com/;
proxy_http_version 1.1;
proxy_set_header Host indee.tx1138.com;
proxy_set_header Accept-Encoding "";
proxy_ssl_server_name on;
proxy_hide_header X-Frame-Options;
proxy_hide_header Content-Security-Policy;
proxy_hide_header Content-Security-Policy-Report-Only;
sub_filter_types text/html text/css application/javascript application/json;
sub_filter_once off;
sub_filter 'href="/' 'href="/app/indeedhub/';
sub_filter 'src="/' 'src="/app/indeedhub/';
sub_filter "href='/" "href='/app/indeedhub/";
sub_filter "src='/" "src='/app/indeedhub/";
sub_filter 'from"/' 'from"/app/indeedhub/';
sub_filter 'url(/' 'url(/app/indeedhub/';
}
# Mempool: same approach. NOTE mempool.space is a strict third-party app —
# its data/websocket calls may still be blocked; iframe is best-effort.
location /app/mempool/ {
proxy_pass https://mempool.space/;
proxy_http_version 1.1;
proxy_set_header Host mempool.space;
proxy_set_header Accept-Encoding "";
proxy_ssl_server_name on;
proxy_hide_header X-Frame-Options;
proxy_hide_header Content-Security-Policy;
proxy_hide_header Content-Security-Policy-Report-Only;
sub_filter_types text/html text/css application/javascript application/json;
sub_filter_once off;
sub_filter 'href="/' 'href="/app/mempool/';
sub_filter 'src="/' 'src="/app/mempool/';
sub_filter "href='/" "href='/app/mempool/";
sub_filter "src='/" "src='/app/mempool/";
sub_filter 'from"/' 'from"/app/mempool/';
sub_filter 'url(/' 'url(/app/mempool/';
}
# Proxy every other app UI (/app/<id>/) to the mock backend, which serves
# the per-app mock UIs (bitcoin-ui, electrumx, lnd, fedimint) and the
# generic "Not available in the demo" notice for the rest.
location /app/ {
proxy_pass http://neode-backend:5959;
proxy_http_version 1.1;
proxy_set_header Host $host;
proxy_set_header X-Real-IP $remote_addr;
}
# Serve AIUI SPA
location /aiui/ {
alias /usr/share/nginx/html/aiui/;

File diff suppressed because it is too large


@ -73,7 +73,7 @@
"author": "Mempool",
"category": "money",
"tier": "core",
"dockerImage": "146.59.87.168:3000/lfg2025/mempool-frontend:v3.0.1",
"dockerImage": "146.59.87.168:3000/lfg2025/mempool-frontend:v3.0.0",
"repoUrl": "https://github.com/mempool/mempool",
"requires": [
"bitcoin-knots",
@ -195,7 +195,7 @@
"title": "Nostr Relay (Rust)",
"version": "0.8.0",
"description": "High-performance Nostr relay written in Rust. Host your own decentralized social media relay and earn networking profits.",
"icon": "/assets/img/app-icons/nostr.svg",
"icon": "/assets/img/app-icons/nostrudel.svg",
"author": "Nostr RS Relay",
"category": "community",
"tier": "recommended",


@ -38,13 +38,6 @@ export const companionInputActive = ref(false)
let ws: WebSocket | null = null
let shouldReconnect = true
let reconnectTimer: ReturnType<typeof setTimeout> | null = null
// Exponential backoff for the relay socket. It's a secondary feature (companion
// input), so when the backend is down it must NOT hammer a fixed-interval
// reconnect — that floods the console/network with failed-WS noise for the whole
// outage. Back off 1s → 30s, reset on a successful open. (Mirrors websocket.ts.)
let relayReconnectAttempts = 0
const RELAY_RECONNECT_BASE_MS = 1000
const RELAY_RECONNECT_MAX_MS = 30_000
let cursorEl: HTMLDivElement | null = null
let companionTimeout: ReturnType<typeof setTimeout> | null = null
let inputFlickerTimeout: ReturnType<typeof setTimeout> | null = null
@ -339,7 +332,6 @@ function doConnect() {
ws.onopen = () => {
relayConnected.value = true
relayReconnectAttempts = 0 // healthy again — reset backoff
if (import.meta.env.DEV) console.log('[RemoteRelay] Connected')
}
@ -351,12 +343,7 @@ function doConnect() {
relayConnected.value = false
ws = null
if (shouldReconnect) {
const delay = Math.min(
RELAY_RECONNECT_BASE_MS * 2 ** relayReconnectAttempts,
RELAY_RECONNECT_MAX_MS,
)
relayReconnectAttempts++
reconnectTimer = setTimeout(doConnect, delay)
reconnectTimer = setTimeout(doConnect, 5000)
}
}
@ -392,7 +379,6 @@ export function requestExternalOpen(url: string): boolean {
/** Start the remote relay listener. Connects to /ws/remote-relay. */
export function startRemoteRelay() {
shouldReconnect = true
relayReconnectAttempts = 0
doConnect()
}


@ -69,12 +69,12 @@
<div class="relative flex-1 min-h-0 bg-black/40 overflow-hidden">
<!-- Loading indicator -->
<Transition name="content-fade">
<AppLoadingScreen
v-if="iframeLoading"
:icon="overlayIcon"
:title="store.title || 'App'"
:progress="loadProgress"
/>
<div v-if="iframeLoading" class="absolute inset-0 z-10 flex items-center justify-center bg-black/40">
<svg class="animate-spin h-8 w-8 text-white/70" xmlns="http://www.w3.org/2000/svg" fill="none" viewBox="0 0 24 24">
<circle class="opacity-25" cx="12" cy="12" r="10" stroke="currentColor" stroke-width="4"></circle>
<path class="opacity-75" fill="currentColor" d="M4 12a8 8 0 018-8V0C5.373 0 0 5.373 0 12h4zm2 5.291A7.962 7.962 0 014 12H0c0 3.042 1.135 5.824 3 7.938l3-2.647z"></path>
</svg>
</div>
</Transition>
<iframe
ref="iframeRef"
@ -184,12 +184,10 @@
</template>
<script setup lang="ts">
import { ref, computed, watch, onMounted, onBeforeUnmount } from 'vue'
import { ref, watch, onMounted, onBeforeUnmount } from 'vue'
import { useAppLauncherStore } from '@/stores/appLauncher'
import NostrSignConsent from '@/components/NostrSignConsent.vue'
import NostrIdentityPicker from '@/components/NostrIdentityPicker.vue'
import AppLoadingScreen from '@/components/AppLoadingScreen.vue'
import { DEFAULT_APP_ICON } from '@/views/apps/appsConfig'
import { rpcClient } from '@/api/rpc-client'
interface PaymentRequest {
@ -209,39 +207,6 @@ const isRefreshing = ref(false)
const iframeLoading = ref(true)
const iframeBlocked = ref(false)
// Best-guess icon for the loading screen, resolved from the /app/{id}/ path
// when present; AppLoadingScreen's <img> falls back to the default icon if the
// guessed asset 404s.
const overlayIcon = computed(() => {
const url = store.url
if (!url) return DEFAULT_APP_ICON
try {
const m = new URL(url, window.location.origin).pathname.match(/^\/app\/([a-z0-9._-]+)/i)
if (m?.[1]) return `/assets/img/app-icons/${m[1].toLowerCase()}.png`
} catch { /* not a parseable URL */ }
return DEFAULT_APP_ICON
})
// Faux load progress (cross-origin iframes give no real progress events): ease
// toward ~92% while loading, snap to 100% on load.
const loadProgress = ref(0)
let progressTimer: ReturnType<typeof setInterval> | null = null
function stopProgress() {
if (progressTimer) { clearInterval(progressTimer); progressTimer = null }
}
function startProgress() {
stopProgress()
loadProgress.value = 8
progressTimer = setInterval(() => {
loadProgress.value += Math.max(0.4, (92 - loadProgress.value) * 0.08)
if (loadProgress.value >= 92) { loadProgress.value = 92; stopProgress() }
}, 180)
}
watch(iframeLoading, (loading) => {
if (loading) startProgress()
else { stopProgress(); loadProgress.value = 100 }
}, { immediate: true })
// Nostr identity picker state
const showIdentityPicker = ref(false)
const IDENTITY_STORAGE_KEY = 'archipelago_app_identity_'
@ -608,7 +573,6 @@ onMounted(() => {
onBeforeUnmount(() => {
clearTimers()
stopProgress()
window.removeEventListener('keydown', onKeyDown, true)
window.removeEventListener('message', onMessage)
})


@ -1,81 +0,0 @@
<template>
<div class="app-loading-screen absolute inset-0 z-10 flex flex-col items-center justify-center">
<div class="app-loading-icon">
<img :src="icon" :alt="title" @error="handleImageError" />
</div>
<p class="app-loading-title">{{ title }}</p>
<div class="app-loading-bar">
<div class="app-loading-fill" :style="{ width: `${clampedProgress}%` }"></div>
</div>
<p class="app-loading-hint">{{ hint }}</p>
</div>
</template>
<script setup lang="ts">
import { computed } from 'vue'
import { handleImageError } from '@/views/apps/appsConfig'
const props = withDefaults(defineProps<{
icon: string
title: string
progress: number
hint?: string
}>(), {
hint: 'Loading…',
})
const clampedProgress = computed(() => Math.min(100, Math.max(0, props.progress)))
</script>
<style scoped>
.app-loading-screen {
gap: 18px;
background: #0b0d12;
}
.app-loading-icon {
width: 84px;
height: 84px;
border-radius: 20px;
overflow: hidden;
display: flex;
align-items: center;
justify-content: center;
background: rgba(255, 255, 255, 0.05);
border: 1px solid rgba(255, 255, 255, 0.08);
box-shadow: 0 12px 32px rgba(0, 0, 0, 0.45);
animation: app-loading-pulse 1.8s ease-in-out infinite;
}
.app-loading-icon img {
width: 100%;
height: 100%;
object-fit: cover;
}
.app-loading-title {
margin: 0;
font-size: 1rem;
font-weight: 600;
color: rgba(255, 255, 255, 0.9);
}
.app-loading-bar {
width: min(240px, 60vw);
height: 4px;
border-radius: 999px;
background: rgba(255, 255, 255, 0.1);
overflow: hidden;
}
.app-loading-fill {
height: 100%;
border-radius: 999px;
background: linear-gradient(90deg, #fb923c, #f59e0b);
transition: width 0.3s ease;
}
.app-loading-hint {
margin: 0;
font-size: 0.75rem;
color: rgba(255, 255, 255, 0.4);
}
@keyframes app-loading-pulse {
0%, 100% { transform: scale(1); opacity: 1; }
50% { transform: scale(1.05); opacity: 0.85; }
}
</style>

@ -82,7 +82,7 @@ const STORAGE_KEY = 'neode_companion_intro_seen'
// Absolute URL so the QR works when scanned by a phone (a relative path has no
// host to resolve). Points at the companion APK hosted on the 146 release server
// (publicly reachable) rather than the local node's /packages copy.
const DEFAULT_DOWNLOAD_URL = 'http://146.59.87.168:3000/lfg2025/archy/raw/branch/main/neode-ui/public/packages/archipelago-companion.apk'
const DEFAULT_DOWNLOAD_URL = 'http://146.59.87.168:3000/lfg2025/archy/raw/branch/main/neode-ui/public/packages/archipelago-companion.apk.zip'
const visible = ref(false)
const qrDataUrl = ref('')

@ -0,0 +1,96 @@
/**
* Public-demo helpers.
*
* The demo build (VITE_DEMO=1) replays the intro/onboarding on each visit, but
* only once per calendar day per browser (tracked in localStorage, so it
* survives the short-lived backend session). Also exposes the shared demo
* credentials shown on the login screen.
*/
export const IS_DEMO =
import.meta.env.VITE_DEMO === '1' || import.meta.env.VITE_DEMO === 'true'
/** Memorable shared password for the public demo (must match the mock backend). */
export const DEMO_PASSWORD = 'entertoexit'
const INTRO_DATE_KEY = 'demo_intro_date'
function todayKey(): string {
// Local calendar day, e.g. "2026-06-22".
const d = new Date()
return `${d.getFullYear()}-${String(d.getMonth() + 1).padStart(2, '0')}-${String(d.getDate()).padStart(2, '0')}`
}
/** True if this browser already watched the intro earlier today. */
export function demoIntroSeenToday(): boolean {
try {
return localStorage.getItem(INTRO_DATE_KEY) === todayKey()
} catch {
return false
}
}
/** Record that the intro has been seen today, so it won't replay until tomorrow. */
export function markDemoIntroSeen(): void {
try {
localStorage.setItem(INTRO_DATE_KEY, todayKey())
} catch {
/* ignore (private mode / storage disabled) */
}
}
/** Forget today's "seen" marker so the intro plays again (e.g. "Replay Intro"). */
export function clearDemoIntroSeen(): void {
try {
localStorage.removeItem(INTRO_DATE_KEY)
} catch {
/* ignore */
}
}
// ── Demoable apps ───────────────────────────────────────────────────────────
// Only these apps actually do something in the demo (a mock UI or a real
// external site). Everything else shows "No demo" on a disabled install button
// and is not launchable.
const DEMO_EXTERNAL_URLS: Record<string, string> = {}
// Apps loaded in the in-app iframe via a same-origin path. IndeeHub and Mempool
// are reverse-proxied by nginx (X-Frame-Options/CSP stripped + asset paths
// rewritten) so the frame-busting real sites can be embedded.
const DEMO_MOCK_UI: Record<string, string> = {
indeedhub: '/app/indeedhub/',
mempool: '/app/mempool/',
'mempool-web': '/app/mempool/',
'bitcoin-knots': '/app/bitcoin-knots/',
'bitcoin-core': '/app/bitcoin-core/',
bitcoin: '/app/bitcoin-core/',
'bitcoin-ui': '/app/bitcoin-ui/',
electrs: '/app/electrumx/',
electrumx: '/app/electrumx/',
'archy-electrs-ui': '/app/electrumx/',
lnd: '/app/lnd/',
'lnd-ui': '/app/lnd/',
'archy-lnd-ui': '/app/lnd/',
thunderhub: '/app/lnd/',
fedimint: '/app/fedimint/',
fedimintd: '/app/fedimint/',
filebrowser: '/app/filebrowser/',
}
/**
* Whether a demo app opens in a new tab. Currently none do: IndeeHub and Mempool
* both load their real site (via the nginx reverse proxy) in the in-app iframe.
*/
export function isDemoExternal(_appId: string): boolean {
return false
}
/** Can this app be launched/installed in the demo? */
export function isDemoApp(appId: string): boolean {
return appId in DEMO_EXTERNAL_URLS || appId in DEMO_MOCK_UI
}
/** Resolve the demo launch URL for an app, or null if it isn't demoable. */
export function demoAppUrl(appId: string): string | null {
return DEMO_EXTERNAL_URLS[appId] ?? DEMO_MOCK_UI[appId] ?? null
}
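
Review note on the lookup helpers above: `appId in DEMO_MOCK_UI` also matches keys inherited from `Object.prototype` (for example `'toString'`), which is unlikely to matter for real app IDs but is easy to rule out. The sketch below shows one alternative using a `Map`; `MOCK_UI`, `isDemoable` and `demoUrl` are hypothetical names, and only three of the entries from the diff are copied for brevity.

```typescript
// A Map has no inherited keys, so membership checks only see real entries.
const MOCK_UI: ReadonlyMap<string, string> = new Map([
  ['mempool', '/app/mempool/'],
  ['mempool-web', '/app/mempool/'],
  ['lnd', '/app/lnd/'],
])

// Can this app be launched in the demo?
function isDemoable(appId: string): boolean {
  return MOCK_UI.has(appId)
}

// Resolve the demo launch URL, or null when the app is not demoable.
function demoUrl(appId: string): string | null {
  return MOCK_UI.get(appId) ?? null
}
```

Keeping the plain object and switching the check to `Object.prototype.hasOwnProperty.call(DEMO_MOCK_UI, appId)` would have the same effect with a smaller change.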

@ -23,6 +23,8 @@ if (!navigator.clipboard) {
},
})
}
import { useToast } from '@/composables/useToast'
const app = createApp(App)
const pinia = createPinia()
@ -95,20 +97,14 @@ function recordError(source: string, err: unknown, info?: string) {
const entry: ArchyErrorEntry = { when: new Date().toISOString(), source, message, info, stack: e?.stack }
errorLog.push(entry)
if (errorLog.length > 25) errorLog.shift()
// Log SILENTLY: a global handler error is almost always something we should
// fix at the source, not interrupt the user for. Keep the full record on the
// console + the window.__archyErrors ring buffer, and make the screenshot-able
// overlay available ON DEMAND (window.__archyShowErrors(), or the debug view)
// — but do NOT auto-pop a red toast / overlay over the UI. Components that
// need to tell the user about a *specific, actionable* failure still call
// toast.error() directly; this catch-all stays out of the way.
console.error(`[${source}]`, err, info ?? '')
}
// Expose the on-demand error overlay + ring buffer so a crash that only repros
// in a runtime without a console (Android companion WebView) is still
// retrievable: call `window.__archyShowErrors()` to screenshot/Copy them.
;(window as unknown as { __archyShowErrors?: () => void }).__archyShowErrors = () => {
// Surface the real message (truncated) instead of a generic toast — this is a
// test/bug-bash build, and "Something went wrong" hides exactly what we need.
const short = message.length > 140 ? `${message.slice(0, 140)}` : message
try {
useToast().error(`Something went wrong: ${short}`)
} catch { /* toast itself failed — the console + ring buffer still have it */ }
// Always show the on-device overlay so the error is visible without a console.
try { showErrorOverlay() } catch { /* overlay is best-effort */ }
}
@ -137,28 +133,15 @@ function reloadOnceForStaleChunk(err: unknown): boolean {
return true
}
// Known-benign environmental noise — expected on some deployments and not
// actionable by the user or us, so it shouldn't even occupy a ring-buffer slot
// (which would push out real errors). The PWA service worker can't register
// over a self-signed cert (it needs a trusted cert or localhost); on those
// nodes the SW/offline cache simply doesn't run, which is fine. Logged at debug
// only. (A trusted cert is the real fix — tracked separately, #56.)
function isBenignEnvironmentError(err: unknown): boolean {
const msg = (err as { message?: string })?.message ?? String(err ?? '')
return /Failed to register a ServiceWorker|ServiceWorker.*(SSL|certificate|SecurityError)|An SSL certificate error occurred when fetching the script/i.test(msg)
}
// Vue's errorHandler only catches errors raised synchronously inside Vue's
// lifecycle/reactivity. Async rejections and plain runtime errors (e.g. a JS
// API missing in an older WebView) slip past it, so catch those too.
window.addEventListener('error', (ev) => {
if (reloadOnceForStaleChunk(ev.error ?? ev.message)) return
if (isBenignEnvironmentError(ev.error ?? ev.message)) { console.debug('[benign]', ev.message); return }
recordError('window.error', ev.error ?? ev.message)
})
window.addEventListener('unhandledrejection', (ev) => {
if (reloadOnceForStaleChunk(ev.reason)) return
if (isBenignEnvironmentError(ev.reason)) { console.debug('[benign]', ev.reason); return }
recordError('unhandledrejection', ev.reason)
})
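
Review note on the error handling above: the two ideas in this hunk, a capped ring buffer and a filter for known-benign ServiceWorker noise, can be exercised without a browser. The sketch below is an assumption-laden miniature, not the code in `main.ts`: `ErrorEntry`, `isBenign` and `record` are hypothetical names, the cap of 25 and the regular expression are copied from the diff.

```typescript
interface ErrorEntry { when: string; source: string; message: string }

const MAX_ENTRIES = 25 // same cap as the errorLog ring buffer in the diff

// Same pattern as isBenignEnvironmentError in the hunk above.
const BENIGN =
  /Failed to register a ServiceWorker|ServiceWorker.*(SSL|certificate|SecurityError)|An SSL certificate error occurred when fetching the script/i

// Extract a message from an Error-like object, a string, or anything else.
function messageOf(err: unknown): string {
  return (err as { message?: string } | null)?.message ?? String(err ?? '')
}

function isBenign(err: unknown): boolean {
  return BENIGN.test(messageOf(err))
}

// Append to the buffer unless the error is benign; drop the oldest entry
// once the cap is exceeded. Returns whether the error was recorded.
function record(log: ErrorEntry[], source: string, err: unknown): boolean {
  if (isBenign(err)) return false
  log.push({ when: new Date().toISOString(), source, message: messageOf(err) })
  if (log.length > MAX_ENTRIES) log.shift()
  return true
}
```

Filtering before the push is what keeps benign noise from evicting real errors, which is the reason the comment in the hunk gives for the filter.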

@ -55,7 +55,7 @@ describe('useAppLauncherStore', () => {
expect(mockWindowOpen).not.toHaveBeenCalled()
})
it('uses the store-driven panel on mobile (no route change, no background swap)', () => {
it('uses route-based app sessions on mobile instead of panel mode', () => {
Object.defineProperty(window, 'innerWidth', {
value: 390,
writable: true,
@ -65,10 +65,8 @@ describe('useAppLauncherStore', () => {
store.openSession('indeedhub')
// Mobile now uses the store-driven panel (like desktop panel mode), so the
// underlying page/tab never changes and closing returns to the origin.
expect(store.panelAppId).toBe('indeedhub')
expect(mockPush).not.toHaveBeenCalled()
expect(store.panelAppId).toBe(null)
expect(mockPush).toHaveBeenCalledWith({ name: 'app-session', params: { appId: 'indeedhub' }, query: { returnTo: '/dashboard/apps' } })
})
it('normalizes localhost launch URLs to current host before resolving', () => {
@ -119,7 +117,7 @@ describe('useAppLauncherStore', () => {
)
})
it('opens tab-only apps directly on mobile (new tab in PWA, no interstitial)', () => {
it('routes desktop new-tab apps into app session on mobile', () => {
Object.defineProperty(window, 'innerWidth', {
value: 390,
writable: true,
@ -129,17 +127,10 @@ describe('useAppLauncherStore', () => {
store.open({ url: 'http://192.168.1.228:8081', title: 'Nginx Proxy Manager' })
// Tab-only app on mobile-web: open directly in a new browser tab (the
// companion would use the in-app WebView). No session, no route push, no
// "this app opens in a tab" interstitial.
expect(store.isOpen).toBe(false)
expect(store.panelAppId).toBe(null)
expect(mockPush).not.toHaveBeenCalled()
expect(mockWindowOpen).toHaveBeenCalledWith(
'http://192.168.1.228:8081',
'_blank',
'noopener,noreferrer',
)
expect(mockWindowOpen).not.toHaveBeenCalled()
expect(mockPush).toHaveBeenCalledWith({ name: 'app-session', params: { appId: 'nginx-proxy-manager' }, query: { returnTo: '/dashboard/apps' } })
})
it('opens Nginx Proxy Manager in new tab using title hint when URL is path-only', () => {
@ -273,7 +264,7 @@ describe('useAppLauncherStore', () => {
)
})
it('opens prepackaged websites in the store-driven panel on mobile', () => {
it('routes prepackaged websites into app session on mobile', () => {
Object.defineProperty(window, 'innerWidth', {
value: 390,
writable: true,
@ -283,12 +274,9 @@ describe('useAppLauncherStore', () => {
store.open({ url: 'https://present.l484.com', title: 'Arch Presentation', openInNewTab: true })
// Iframeable prepackaged sites stay in-app via the store panel (no route
// change, no background swap), just like every other mobile launch.
expect(store.isOpen).toBe(false)
expect(store.panelAppId).toBe('arch-presentation')
expect(mockWindowOpen).not.toHaveBeenCalled()
expect(mockPush).not.toHaveBeenCalled()
expect(mockPush).toHaveBeenCalledWith({ name: 'app-session', params: { appId: 'arch-presentation' }, query: { returnTo: '/dashboard/apps' } })
})
it('routes HTTPS same-host apps via session view', () => {

@ -4,7 +4,7 @@ import { rpcClient } from '@/api/rpc-client'
import router from '@/router'
import { recordAppLaunch } from '@/utils/appUsage'
import { requestExternalOpen } from '@/api/remote-relay'
import { openInAppOrNewTab } from '@/utils/openExternal'
import { IS_DEMO, isDemoExternal, demoAppUrl } from '@/composables/useDemoIntro'
/**
* Open a URL in a new browser tab, but if a companion (phone) is currently
@ -223,25 +223,20 @@ export const useAppLauncherStore = defineStore('appLauncher', () => {
function openSession(appId: string) {
recordAppLaunch(appId)
const mobile = isMobileViewport()
// Tab-only apps (set X-Frame-Options, can't be iframed). No interstitial:
// desktop opens a new browser tab; mobile opens the in-app WebView (Android
// companion) or a new browser tab (PWA) — see openInAppOrNewTab.
if (NEW_TAB_APP_IDS.has(appId)) {
const launchUrl = directAppUrl(appId)
if (launchUrl) {
if (mobile) openInAppOrNewTab(launchUrl)
else openExternal(launchUrl)
return
}
// Demo: apps flagged by isDemoExternal() would open in a new tab, but it
// currently returns false for every app (IndeeHub and Mempool are proxied into
// the iframe), so everything demoable renders in the in-app session.
if (IS_DEMO && isDemoExternal(appId)) {
const ext = demoAppUrl(appId)
if (ext) { openExternal(ext); return }
}
const launchUrl = NEW_TAB_APP_IDS.has(appId) ? directAppUrl(appId) : null
if (launchUrl && !mobile) {
openExternal(launchUrl)
return
}
// Iframeable apps. Mobile and desktop-panel mode both use the store-driven
// panel so the underlying page/tab never changes (no background swap) and
// closing returns the user to wherever they launched from. Only desktop
// overlay/fullscreen modes use a routed session.
const mode = localStorage.getItem(DISPLAY_MODE_KEY) || 'panel'
if (mobile || mode === 'panel') {
if (mode === 'panel' && !mobile) {
panelAppId.value = appId
} else {
panelAppId.value = null
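
Review note on `openSession` above: the branch order in the variant that checks `launchUrl && !mobile` and `mode === 'panel' && !mobile` can be summarised as a pure decision function. This is a sketch under assumptions: `launchTarget`, `LaunchTarget` and the option names are hypothetical, the `'overlay'` mode name is inferred from the comments, and the routed-session outcome is inferred from the updated tests that expect `router.push` with `app-session`.

```typescript
type DisplayMode = 'panel' | 'overlay' | 'fullscreen'
type LaunchTarget = 'external-tab' | 'panel' | 'routed-session'

interface LaunchOptions {
  isNewTabApp: boolean // app id is in NEW_TAB_APP_IDS
  hasLaunchUrl: boolean // directAppUrl() returned a URL
  mobile: boolean // viewport narrower than the mobile breakpoint
  mode: DisplayMode // stored display-mode preference
}

function launchTarget(o: LaunchOptions): LaunchTarget {
  // Tab-only apps leave the app on desktop only.
  if (o.isNewTabApp && o.hasLaunchUrl && !o.mobile) return 'external-tab'
  // The inline panel is a desktop-only presentation.
  if (o.mode === 'panel' && !o.mobile) return 'panel'
  // Everything else, including every mobile launch, becomes a routed session.
  return 'routed-session'
}
```

Written this way, each of the mobile test cases in the spec file maps to a single call with `mobile: true` and an expected result of `'routed-session'`.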

@ -164,20 +164,6 @@ select:focus-visible {
/* Mobile: override with tab bar clearance */
@media (max-width: 767px) {
/* Mobile web browsers report 100vh taller than the visible area (the dynamic
URL/toolbar chrome). The dashboard is the containing block for the fixed,
container-relative panes (the mesh chat/tools panes), so a 100vh-tall
container pushes their `bottom` offset below the visible viewport: they
slide under the bottom tab bar (which is body-teleported and viewport-fixed,
so it stays put). Pin the dashboard to the *dynamic* viewport so the two
reference frames line up. No-op in the companion WebView (no browser chrome,
so dvh == vh), so its layout is unchanged. Doubled class beats Tailwind's
`.min-h-screen` (100vh) utility on specificity. */
.dashboard-view.dashboard-view {
height: 100dvh;
min-height: 100dvh;
}
.mobile-scroll-pad {
padding-bottom: calc(var(--mobile-tab-bar-height, 88px) + var(--safe-area-bottom, env(safe-area-inset-bottom, 0px)) + var(--audio-player-height, 0px) + 16px);
}

@ -11,37 +11,15 @@
*/
interface ArchipelagoNativeBridge {
openExternal?: (url: string) => void
openInApp?: (url: string) => void
}
function nativeBridge(): ArchipelagoNativeBridge | undefined {
return (window as unknown as { ArchipelagoNative?: ArchipelagoNativeBridge }).ArchipelagoNative
}
export function openExternalUrl(url: string): void {
if (!url) return
const native = nativeBridge()
const native = (window as unknown as { ArchipelagoNative?: ArchipelagoNativeBridge })
.ArchipelagoNative
if (native && typeof native.openExternal === 'function') {
native.openExternal(url)
return
}
window.open(url, '_blank', 'noopener,noreferrer')
}
/**
* Launch an app that can't be embedded in an iframe (X-Frame-Options) from a
* mobile surface, with NO "this app opens in a tab" interstitial.
*
* - Android companion: hand it to the in-app WebView (`openInApp`) so it stays
* inside Archipelago with the native back/forward/reload/close controls.
* - Plain mobile browser (PWA): open directly in a new browser tab.
*/
export function openInAppOrNewTab(url: string): void {
if (!url) return
const native = nativeBridge()
if (native && typeof native.openInApp === 'function') {
native.openInApp(url)
return
}
window.open(url, '_blank', 'noopener,noreferrer')
}
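
Review note on `openExternalUrl` above: the "native bridge first, `window.open` as fallback" pattern is easier to test when the host object is passed in rather than read from the global `window`. The sketch below assumes exactly that; `HostWindow`, `openExternalVia` and the string return values are hypothetical and exist only to make the chosen path observable.

```typescript
interface NativeBridge {
  openExternal?: (url: string) => void
}

// The subset of window that the helper needs, so a fake can stand in for it.
interface HostWindow {
  ArchipelagoNative?: NativeBridge
  open(url: string, target: string, features: string): unknown
}

function openExternalVia(host: HostWindow, url: string): 'native' | 'tab' | 'skipped' {
  if (!url) return 'skipped'
  const native = host.ArchipelagoNative
  // Prefer the companion app's bridge when it exposes the method.
  if (native && typeof native.openExternal === 'function') {
    native.openExternal(url)
    return 'native'
  }
  // Plain browser: open a new tab without leaking the opener or referrer.
  host.open(url, '_blank', 'noopener,noreferrer')
  return 'tab'
}
```

In the app this would presumably be called with the real `window` cast to `HostWindow`, while a spec can pass a small fake object.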

@ -1,6 +1,6 @@
<template>
<div class="app-session-root">
<Teleport to="body" :disabled="isInlinePanel && !isMobile">
<Teleport to="body" :disabled="isInlinePanel">
<div
:class="backdropClasses"
@click.self="handleBackdropClick"
@ -27,7 +27,6 @@
:app-url="appUrl"
:app-id="appId"
:app-title="appTitle"
:app-icon="appIcon"
:loading="loading"
:iframe-blocked="iframeBlocked"
:must-open-new-tab="mustOpenNewTab"
@ -105,11 +104,12 @@ import {
type DisplayMode, DISPLAY_MODE_KEY, NEW_TAB_APPS, IFRAME_BLOCKED_APPS,
resolveAppUrl, resolveAppTitle,
} from './appSession/appSessionConfig'
import { launchBlockedReason, resolveAppIcon } from './apps/appsConfig'
import { launchBlockedReason } from './apps/appsConfig'
import { useAppIdentity } from './appSession/useAppIdentity'
import { useNostrBridge } from './appSession/useNostrBridge'
import { openExternalUrl, openInAppOrNewTab } from '@/utils/openExternal'
import { openExternalUrl } from '@/utils/openExternal'
import { useElectrsSync } from '@/composables/useElectrsSync'
import { IS_DEMO, isDemoExternal } from '@/composables/useDemoIntro'
const props = defineProps<{
appIdProp?: string
@ -155,18 +155,14 @@ const appId = computed(() => {
const appTitle = computed(() => resolveAppTitle(appId.value))
const packageEntry = computed(() => store.data?.['package-data']?.[appId.value] || null)
const appIcon = computed(() =>
packageEntry.value
? resolveAppIcon(appId.value, packageEntry.value)
: `/assets/img/app-icons/${appId.value}.png`
)
const blockedReason = computed(() => launchBlockedReason(appId.value, packageEntry.value))
const blockedTitle = computed(() => appId.value === 'fedimint' || appId.value === 'fedimintd' ? 'Waiting for Bitcoin sync' : 'App not ready')
// Reactive so the overlay/teleport/footer/animation decisions track the live
// viewport (and match the CSS `md` breakpoint) instead of a stale one-shot read.
const isMobile = ref(typeof window !== 'undefined' && window.innerWidth < 768)
function updateIsMobile() { isMobile.value = window.innerWidth < 768 }
const mustOpenNewTab = computed(() => NEW_TAB_APPS.has(appId.value))
const isMobile = typeof window !== 'undefined' && window.innerWidth < 768
// In the demo, apps flagged by isDemoExternal() open in a new tab rather than
// the in-app session frame (it currently returns false for every app, since
// mempool.space is proxied into the iframe).
const mustOpenNewTab = computed(() =>
NEW_TAB_APPS.has(appId.value) || (IS_DEMO && isDemoExternal(appId.value))
)
// ElectrumX shows a sync screen before its real UI (the Electrum server only
// serves clients once its index is built). Poll /electrs-status while this is
@ -250,18 +246,16 @@ function setMode(mode: DisplayMode) {
}
}
// Reactive classes based on display mode. On mobile the store-driven panel
// renders as a full-screen overlay (teleported to body) so it covers the nav
// and the underlying page never changes; desktop keeps the inline panel.
// Reactive classes based on display mode
const backdropClasses = computed(() => {
if (isInlinePanel.value && !isMobile.value) return 'app-session-backdrop-inline'
if (isInlinePanel.value) return 'app-session-backdrop-inline'
return 'app-session-backdrop-overlay'
})
const panelClasses = computed(() => {
const base = 'app-session-panel glass-card'
if (isInlinePanel.value && !isMobile.value) return `${base} app-session-inline`
if (displayMode.value === 'fullscreen' && !isMobile.value) return `${base} app-session-fullscreen`
if (isInlinePanel.value) return `${base} app-session-inline`
if (displayMode.value === 'fullscreen') return `${base} app-session-fullscreen`
return `${base} app-session-overlay`
})
@ -381,13 +375,10 @@ watch(displayMode, (mode) => {
})
onMounted(() => {
// Apps that block iframes (X-Frame-Options) can't be shown in the session.
// Open them directly instead of showing a "this app opens in a tab"
// interstitial: new browser tab on desktop; in-app WebView (companion) or
// new tab (PWA) on mobile. Then dismiss the (empty) session surface.
if (mustOpenNewTab.value && appUrl.value) {
if (isMobile.value) openInAppOrNewTab(appUrl.value)
else window.open(appUrl.value, '_blank', 'noopener,noreferrer')
// Apps that block iframes open externally on desktop. On mobile, keep the
// session surface visible so launcher taps do not bounce straight out.
if (mustOpenNewTab.value && appUrl.value && !isMobile) {
window.open(appUrl.value, '_blank', 'noopener,noreferrer')
if (isInlinePanel.value) emit('close')
else closeRouteSession()
return
@ -395,9 +386,8 @@ onMounted(() => {
window.addEventListener('keydown', onKeyDown, true)
window.addEventListener('message', onMessage)
window.addEventListener('resize', updateIsMobile)
document.addEventListener('fullscreenchange', onFullscreenChange)
if (IFRAME_BLOCKED_APPS.has(appId.value)) {
if (IFRAME_BLOCKED_APPS.has(appId.value) || (mustOpenNewTab.value && isMobile)) {
loading.value = false
iframeBlocked.value = true
} else {
@ -419,7 +409,6 @@ onBeforeUnmount(() => {
if (iframeCheckId) clearTimeout(iframeCheckId)
window.removeEventListener('keydown', onKeyDown, true)
window.removeEventListener('message', onMessage)
window.removeEventListener('resize', updateIsMobile)
document.removeEventListener('fullscreenchange', onFullscreenChange)
screensaverStore.resume(screensaverReason.value)
if (document.fullscreenElement) document.exitFullscreen().catch(() => {})

@ -62,6 +62,7 @@ import { ref, computed, onMounted, onBeforeUnmount } from 'vue'
import { useRouter } from 'vue-router'
import { useI18n } from 'vue-i18n'
import { ContextBroker } from '@/services/contextBroker'
import { IS_DEMO } from '@/composables/useDemoIntro'
const { t } = useI18n()
@ -71,9 +72,12 @@ const aiuiConnected = ref(false)
let broker: ContextBroker | null = null
const aiuiUrl = computed(() => {
// Demo: ?mockArchy makes AIUI use its built-in mock node data (apps, system,
// network, wallet, bitcoin, files) and &seed pre-loads the example chats.
const demo = IS_DEMO ? '&mockArchy=1&seed=1' : ''
const envUrl = import.meta.env.VITE_AIUI_URL
if (envUrl) return `${envUrl}?embedded=true&hideClose=true`
if (import.meta.env.PROD) return '/aiui/?embedded=true&hideClose=true'
if (envUrl) return `${envUrl}?embedded=true&hideClose=true${demo}`
if (import.meta.env.PROD || IS_DEMO) return `/aiui/?embedded=true&hideClose=true${demo}`
return ''
})
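
Review note on `aiuiUrl` above: the string concatenation assumes `VITE_AIUI_URL` never carries its own query string. A possible variant that tolerates one is sketched below using the standard `URLSearchParams` class. `buildAiuiUrl` and its option names are hypothetical; the parameter names and the `/aiui/` default are taken from the diff.

```typescript
interface AiuiUrlOptions {
  envUrl?: string // value of VITE_AIUI_URL, if set
  prod: boolean // import.meta.env.PROD
  demo: boolean // IS_DEMO
}

function buildAiuiUrl(o: AiuiUrlOptions): string {
  // Same precedence as the computed above: env override, then the bundled path.
  const base = o.envUrl || (o.prod || o.demo ? '/aiui/' : '')
  if (!base) return ''

  const params = new URLSearchParams({ embedded: 'true', hideClose: 'true' })
  if (o.demo) {
    params.set('mockArchy', '1') // use AIUI's built-in mock node data
    params.set('seed', '1') // pre-load the example chats
  }

  // Append with "&" when the base already has a query string.
  const separator = base.includes('?') ? '&' : '?'
  return `${base}${separator}${params.toString()}`
}
```

For bases without a query string this produces the same URLs as the template literals in the diff.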

@ -156,6 +156,11 @@
<!-- Normal Login Mode -->
<template v-else>
<!-- Demo credential hint -->
<div v-if="isDemo" class="mb-4 p-3 bg-orange-500/15 border border-orange-400/30 rounded-lg text-orange-100 text-sm text-center">
🎮 Demo mode · Password: <span class="font-mono font-semibold">{{ DEMO_PASSWORD }}</span>
</div>
<div class="mb-6">
<label for="login-password" class="block text-sm font-medium text-white/80 mb-2">
{{ t('login.password') }}
@ -203,14 +208,16 @@
>
{{ t('login.replayIntro') }}
</button>
<span class="text-white/30">|</span>
<button
@click="restartOnboarding"
:disabled="isResettingOnboarding"
class="text-xs text-white/50 hover:text-white/70 transition-colors underline-offset-2 hover:underline disabled:opacity-50 disabled:cursor-not-allowed"
>
{{ isResettingOnboarding ? t('login.resetting') : t('login.onboarding') }}
</button>
<template v-if="!isDemo">
<span class="text-white/30">|</span>
<button
@click="restartOnboarding"
:disabled="isResettingOnboarding"
class="text-xs text-white/50 hover:text-white/70 transition-colors underline-offset-2 hover:underline disabled:opacity-50 disabled:cursor-not-allowed"
>
{{ isResettingOnboarding ? t('login.resetting') : t('login.onboarding') }}
</button>
</template>
</div>
</div>
</div>
@ -228,6 +235,7 @@ const { t } = useI18n()
import { useLoginTransitionStore } from '../stores/loginTransition'
import { rpcClient } from '../api/rpc-client'
import { resumeAudioContext, startSynthwave, stopSynthwave, playLoginSuccessWhoosh, playPop } from '@/composables/useLoginSounds'
import { IS_DEMO, DEMO_PASSWORD, clearDemoIntroSeen } from '@/composables/useDemoIntro'
const router = useRouter()
const currentRoute = useRoute()
@ -241,7 +249,8 @@ const loginRedirectTo = computed(() => {
const store = useAppStore()
const loginTransition = useLoginTransitionStore()
const password = ref('')
const isDemo = IS_DEMO
const password = ref(IS_DEMO ? DEMO_PASSWORD : '')
const confirmPassword = ref('')
const loading = ref(false)
const error = ref<string | null>(null)
@ -520,6 +529,8 @@ async function handleTotpVerify() {
function replayIntro() {
// Clear the intro seen flag
localStorage.removeItem('neode_intro_seen')
// Demo: also clear the per-day gate so the intro plays again now.
if (IS_DEMO) clearDemoIntroSeen()
// Navigate to root to trigger splash screen
window.location.href = '/'
}

@ -63,8 +63,8 @@
<button
v-else
@click="installApp"
:disabled="installing || (!installBlockedReason && !app.manifestUrl && !app.dockerImage)"
:title="installBlockedReason || undefined"
:disabled="demoNoInstall || installing || (!installBlockedReason && !app.manifestUrl && !app.dockerImage)"
:title="demoNoInstall ? 'Not available in the demo' : (installBlockedReason || undefined)"
class="glass-button glass-button-sm px-6 py-2.5 rounded-lg text-sm font-semibold flex items-center gap-2 disabled:opacity-50 disabled:cursor-not-allowed"
>
<svg v-if="installing" class="animate-spin h-4 w-4" xmlns="http://www.w3.org/2000/svg" fill="none" viewBox="0 0 24 24">
@ -74,7 +74,7 @@
<svg v-else class="w-4 h-4" fill="none" stroke="currentColor" viewBox="0 0 24 24">
<path stroke-linecap="round" stroke-linejoin="round" stroke-width="2" d="M4 16v1a3 3 0 003 3h10a3 3 0 003-3v-1m-4-4l-4 4m0 0l-4-4m4 4V4" />
</svg>
{{ installBlockedReason ? 'Bitcoin Pruned' : installing ? t('common.installing') : t('common.install') }}
{{ demoNoInstall ? 'No demo' : installBlockedReason ? 'Bitcoin Pruned' : installing ? t('common.installing') : t('common.install') }}
</button>
</div>
</div>
@ -129,8 +129,8 @@
<button
v-else
@click="installApp"
:disabled="installing || (!installBlockedReason && !app.manifestUrl && !app.dockerImage)"
:title="installBlockedReason || undefined"
:disabled="demoNoInstall || installing || (!installBlockedReason && !app.manifestUrl && !app.dockerImage)"
:title="demoNoInstall ? 'Not available in the demo' : (installBlockedReason || undefined)"
class="glass-button glass-button-sm px-4 py-2.5 rounded-lg text-sm font-semibold flex items-center justify-center gap-2 disabled:opacity-50 disabled:cursor-not-allowed col-span-2"
>
<svg v-if="installing" class="animate-spin h-4 w-4" xmlns="http://www.w3.org/2000/svg" fill="none" viewBox="0 0 24 24">
@ -140,7 +140,7 @@
<svg v-else class="w-4 h-4" fill="none" stroke="currentColor" viewBox="0 0 24 24">
<path stroke-linecap="round" stroke-linejoin="round" stroke-width="2" d="M4 16v1a3 3 0 003 3h10a3 3 0 003-3v-1m-4-4l-4 4m0 0l-4-4m4 4V4" />
</svg>
{{ installBlockedReason ? 'Bitcoin Pruned' : installing ? t('common.installing') : t('common.install') }}
{{ demoNoInstall ? 'No demo' : installBlockedReason ? 'Bitcoin Pruned' : installing ? t('common.installing') : t('common.install') }}
</button>
</div>
@ -351,6 +351,7 @@
<script setup lang="ts">
import { ref, computed, onMounted, onBeforeUnmount } from 'vue'
import { IS_DEMO, isDemoApp } from '@/composables/useDemoIntro'
import { useRouter, useRoute } from 'vue-router'
import { useI18n } from 'vue-i18n'
import { useAppStore } from '../stores/app'
@ -486,6 +487,9 @@ const installBlockedReason = computed(() => {
return electrumxArchiveWarning
})
// Demo: only demoable apps can be installed; the rest show "No demo".
const demoNoInstall = computed(() => IS_DEMO && !!app.value?.id && !isDemoApp(app.value.id))
let pendingRedirect: ReturnType<typeof setTimeout> | null = null
onMounted(() => {

@ -22,27 +22,30 @@
@click="goToOptions"
class="glass-button px-6 py-3 sm:px-8 sm:py-4 rounded-lg text-base sm:text-lg font-medium transition-all hover:bg-black/70 hover:border-white/30 onb-cta"
>
Unlock your sovereignty
{{ isDemo ? 'Enter the demo →' : 'Unlock your sovereignty →' }}
</button>
<a
tabindex="0"
role="button"
class="text-white/50 hover:text-white/80 underline text-sm cursor-pointer mt-4 block text-center onb-cta"
@click="goToRestore"
@keydown.enter="goToRestore"
>
Restore from seed phrase
</a>
<a
tabindex="0"
role="button"
class="text-white/50 hover:text-white/80 underline text-sm cursor-pointer mt-2 block text-center onb-cta"
@click="goToLogin"
@keydown.enter="goToLogin"
>
Already set up? Log in
</a>
<!-- Onboarding wizard entry points are hidden in the demo (no seed/identity setup) -->
<template v-if="!isDemo">
<a
tabindex="0"
role="button"
class="text-white/50 hover:text-white/80 underline text-sm cursor-pointer mt-4 block text-center onb-cta"
@click="goToRestore"
@keydown.enter="goToRestore"
>
Restore from seed phrase
</a>
<a
tabindex="0"
role="button"
class="text-white/50 hover:text-white/80 underline text-sm cursor-pointer mt-2 block text-center onb-cta"
@click="goToLogin"
@keydown.enter="goToLogin"
>
Already set up? Log in
</a>
</template>
</div>
</div>
</div>
@ -53,11 +56,16 @@ import { ref, onMounted } from 'vue'
import { useRouter } from 'vue-router'
import AnimatedLogo from '@/components/AnimatedLogo.vue'
import { playNavSound } from '@/composables/useNavSounds'
import { IS_DEMO, markDemoIntroSeen } from '@/composables/useDemoIntro'
const router = useRouter()
const ctaButton = ref<HTMLButtonElement | null>(null)
const isDemo = IS_DEMO
onMounted(() => {
// Demo: once the visitor has seen the intro today, don't auto-replay it again
// until tomorrow (they can still use "Replay Intro" on the login screen).
if (IS_DEMO) markDemoIntroSeen()
// Auto-focus after entry animation completes (1.4s animation delay + 0.6s duration)
setTimeout(() => {
ctaButton.value?.focus({ preventScroll: true })
@ -66,6 +74,13 @@ onMounted(() => {
function goToOptions() {
playNavSound('action')
// Demo: skip the onboarding wizard (seed/identity setup) entirely and go
// straight to login, which is prefilled with the demo password.
if (isDemo) {
localStorage.setItem('neode_onboarding_complete', '1')
router.push('/login').catch(() => {})
return
}
router.push('/onboarding/path').catch(() => {})
}

@ -1304,7 +1304,7 @@ async function payWithLightning() {
function scheduleInvoicePoll() {
if (invoicePollTimer) clearTimeout(invoicePollTimer)
invoicePollTimer = setTimeout(pollInvoice, 3000)
invoicePollTimer = setTimeout(pollInvoice, 1000)
}
async function pollInvoice() {

@ -16,11 +16,22 @@
import { ref, onMounted } from 'vue'
import { useRouter } from 'vue-router'
import { isOnboardingComplete } from '@/composables/useOnboarding'
import { IS_DEMO, demoIntroSeenToday } from '@/composables/useDemoIntro'
import BootScreen from '@/components/BootScreen.vue'
const router = useRouter()
const showBootScreen = ref(false)
/**
* Public demo: replay the intro on every visit, but at most once per calendar
* day per browser. If already seen today, go straight to login; otherwise
* show the intro.
*/
function demoRoute() {
const dest = demoIntroSeenToday() ? '/login' : '/onboarding/intro'
log('demoRoute', { dest })
router.replace(dest).catch(() => {})
}
function log(msg: string, data?: unknown) {
const ts = new Date().toISOString()
const entry = `[RootRedirect ${ts}] ${msg}` + (data !== undefined ? ` ${JSON.stringify(data)}` : '')
@ -68,6 +79,10 @@ async function checkOnboarded(): Promise<boolean> {
}
async function proceedToApp() {
if (IS_DEMO) {
demoRoute()
return
}
const devMode = import.meta.env.VITE_DEV_MODE
if (devMode === 'setup' || devMode === 'existing') {
log('proceedToApp devMode', { devMode })
@ -121,6 +136,11 @@ onMounted(async () => {
log('production flow', { isUp })
if (isUp) {
// Demo: per-day intro gate instead of server-side onboarding state.
if (IS_DEMO) {
demoRoute()
return
}
const onboarded = await checkOnboarded()
if (onboarded) {
log('server up + onboarded → proceedToApp')
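
Review note on the per-day intro gate used by `demoRoute` above: the gate depends on `localStorage` and the current date, so a version that takes both as parameters can be checked outside a browser. The sketch below is one way to do that, not the code in `useDemoIntro`. `KeyValueStore`, `memoryStore`, `dayKey`, `seenToday`, `markSeen` and `introDestination` are hypothetical names; the storage key and the two routes are copied from the diff.

```typescript
// Subset of the DOM Storage interface, so an in-memory fake can replace it.
interface KeyValueStore {
  getItem(key: string): string | null
  setItem(key: string, value: string): void
}

function memoryStore(): KeyValueStore {
  const data = new Map<string, string>()
  return {
    getItem: (key) => data.get(key) ?? null,
    setItem: (key, value) => { data.set(key, value) },
  }
}

const INTRO_DATE_KEY = 'demo_intro_date'

// Local calendar day as a zero-padded YYYY-MM-DD string.
function dayKey(d: Date): string {
  const month = String(d.getMonth() + 1).padStart(2, '0')
  const day = String(d.getDate()).padStart(2, '0')
  return `${d.getFullYear()}-${month}-${day}`
}

function seenToday(store: KeyValueStore, now: Date): boolean {
  return store.getItem(INTRO_DATE_KEY) === dayKey(now)
}

function markSeen(store: KeyValueStore, now: Date): void {
  store.setItem(INTRO_DATE_KEY, dayKey(now))
}

// Same decision as demoRoute: login if seen today, otherwise the intro.
function introDestination(store: KeyValueStore, now: Date): '/login' | '/onboarding/intro' {
  return seenToday(store, now) ? '/login' : '/onboarding/intro'
}
```

Because the key is built from local date parts, the gate resets at local midnight rather than 24 hours after the last visit.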

@ -3,8 +3,8 @@ import { beforeEach, describe, expect, it, vi } from 'vitest'
import AppSession from '../AppSession.vue'
const { mockReplace, mockPush, mockWindowOpen, mockSuppress, mockResume } = vi.hoisted(() => ({
mockReplace: vi.fn(() => Promise.resolve()),
mockPush: vi.fn(() => Promise.resolve()),
mockReplace: vi.fn(),
mockPush: vi.fn(),
mockWindowOpen: vi.fn(),
mockSuppress: vi.fn(),
mockResume: vi.fn(),
@ -62,7 +62,7 @@ describe('AppSession mobile new-tab apps', () => {
})
})
it('opens tab-only apps directly on mobile instead of showing an interstitial', async () => {
it('keeps iframe-blocked apps inside the mobile session instead of auto-opening a tab', async () => {
const wrapper = mount(AppSession, {
global: {
stubs: {
@ -75,11 +75,9 @@ describe('AppSession mobile new-tab apps', () => {
})
await flushPromises()
// Tab-only app (gitea) on mobile-web: open directly in a new browser tab
// (no native bridge in the test) and dismiss the empty session — no
// "this app opens in a tab" interstitial.
expect(mockWindowOpen).toHaveBeenCalled()
expect(mockReplace).toHaveBeenCalled()
expect(wrapper.text()).not.toContain('This app opens in a new tab')
expect(mockWindowOpen).not.toHaveBeenCalled()
expect(mockReplace).not.toHaveBeenCalled()
expect(wrapper.text()).toContain('This app opens in a new tab')
expect(wrapper.text()).toContain('Open in new tab')
})
})


@ -1,7 +1,12 @@
<template>
<div class="relative flex-1 min-h-0 bg-black/40 overflow-hidden app-session-frame-safe">
<Transition name="content-fade">
<AppLoadingScreen v-if="loading" :icon="appIcon" :title="appTitle" :progress="loadProgress" />
<div v-if="loading" class="absolute inset-0 z-10 flex items-center justify-center bg-black/40">
<svg class="animate-spin h-8 w-8 text-blue-400" viewBox="0 0 24 24" fill="none">
<circle class="opacity-25" cx="12" cy="12" r="10" stroke="currentColor" stroke-width="4" />
<path class="opacity-75" fill="currentColor" d="M4 12a8 8 0 018-8V0C5.373 0 0 5.373 0 12h4zm2 5.291A7.962 7.962 0 014 12H0c0 3.042 1.135 5.824 3 7.938l3-2.647z" />
</svg>
</div>
</Transition>
<!-- ElectrumX sync screen shown before the real UI while the on-chain
@ -111,15 +116,13 @@
</template>
<script setup lang="ts">
import { nextTick, onBeforeUnmount, ref, watch } from 'vue'
import { nextTick, ref, watch } from 'vue'
import type { ElectrsSyncStatus } from '@/composables/useElectrsSync'
import AppLoadingScreen from '@/components/AppLoadingScreen.vue'
const props = defineProps<{
appUrl: string
appId: string
appTitle: string
appIcon: string
loading: boolean
iframeBlocked: boolean
mustOpenNewTab: boolean
@ -141,40 +144,6 @@ const emit = defineEmits<{
const iframeRef = ref<HTMLIFrameElement | null>(null)
// Faux load progress for the loading screen. Cross-origin iframes give no real
// progress events, so ease toward ~92% while loading and snap to 100% on load.
// Far better UX than a black screen with a bare spinner.
const loadProgress = ref(0)
let progressTimer: ReturnType<typeof setInterval> | null = null
function stopProgress() {
if (progressTimer) { clearInterval(progressTimer); progressTimer = null }
}
function startProgress() {
stopProgress()
loadProgress.value = 8
progressTimer = setInterval(() => {
// Decelerate as it approaches the cap so it never visually "finishes" early.
const remaining = 92 - loadProgress.value
loadProgress.value += Math.max(0.4, remaining * 0.08)
if (loadProgress.value >= 92) { loadProgress.value = 92; stopProgress() }
}, 180)
}
watch(() => props.loading, (isLoading) => {
if (isLoading) {
startProgress()
} else {
stopProgress()
loadProgress.value = 100
}
}, { immediate: true })
watch(() => props.refreshKey, () => { if (props.loading) startProgress() })
onBeforeUnmount(stopProgress)
function focusIframe() {
iframeRef.value?.focus({ preventScroll: true })
}


@ -1,6 +1,7 @@
/** Static configuration maps for app session routing and display */
import { GENERATED_APP_PORTS, GENERATED_APP_TITLES, GENERATED_NEW_TAB_APPS } from './generatedAppSessionConfig'
import { IS_DEMO, demoAppUrl } from '@/composables/useDemoIntro'
export type DisplayMode = 'panel' | 'overlay' | 'fullscreen'
@ -76,6 +77,15 @@ export const IFRAME_BLOCKED_APPS = new Set<string>([])
/** Resolve app URL using direct port mapping (source of truth) */
export function resolveAppUrl(id: string, routeQueryPath?: string, runtimeUrl?: string): string {
// Demo: route to the app's mock UI or real external site (mempool.space,
// indee.tx1138.com). Carry through a deep-link path (e.g. /tx/<hash> for
// mempool). Non-demoable apps fall through to a generic notice page.
if (IS_DEMO) {
const base = demoAppUrl(id)
if (base) return routeQueryPath ? base + routeQueryPath : base
return `/app/${id}/`
}
// External HTTPS apps
const ext = EXTERNAL_URLS[id]
if (ext) return ext

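The demo branch of `resolveAppUrl` can be sketched in isolation as follows. The lookup table and the `joinPath` helper are hypothetical (the real mapping is `demoAppUrl()` in `useDemoIntro`, not shown here), and the URLs are placeholders based only on the comment above. Note one deliberate difference: the diff concatenates `base + routeQueryPath` directly, whereas this sketch normalizes the slash between the two parts.

```typescript
// Hypothetical demo URL table; the real lookup is demoAppUrl() in useDemoIntro.
const DEMO_APP_URLS: Record<string, string> = {
  mempool: 'https://mempool.space',
  lnd: '/app/lnd/',
}

// Stand-in for demoAppUrl(): undefined means the app has no demo target.
function demoAppUrlSketch(id: string): string | undefined {
  return DEMO_APP_URLS[id]
}

// Join a base URL and a deep-link path with exactly one slash between them.
// (An addition for illustration; the diff uses plain concatenation.)
function joinPath(base: string, path: string): string {
  return base.replace(/\/+$/, '') + '/' + path.replace(/^\/+/, '')
}

// Demo resolution: mapped apps keep their deep link, everything else falls
// through to the generic notice page at /app/<id>/.
function resolveDemoUrl(id: string, routeQueryPath?: string): string {
  const base = demoAppUrlSketch(id)
  if (!base) return `/app/${id}/`
  return routeQueryPath ? joinPath(base, routeQueryPath) : base
}
```

For example, `resolveDemoUrl('mempool', '/tx/abc')` yields `https://mempool.space/tx/abc`, while an unmapped id such as `resolveDemoUrl('gitea')` yields `/app/gitea/`.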

@ -102,23 +102,17 @@
</div>
</div>
<!-- Uninstalling progress: truthful stage-driven bar (mirrors install) -->
<!-- Uninstalling progress: live stage label from backend -->
<div v-else-if="isUninstalling" class="mt-4">
<div class="flex items-center justify-between mb-1.5">
<span class="text-xs text-white/70 flex items-center gap-1.5">
<svg class="animate-spin h-3 w-3" fill="none" viewBox="0 0 24 24">
<circle class="opacity-25" cx="12" cy="12" r="10" stroke="currentColor" stroke-width="4"></circle>
<path class="opacity-75" fill="currentColor" d="M4 12a8 8 0 018-8V0C5.373 0 0 5.373 0 12h4zm2 5.291A7.962 7.962 0 014 12H0c0 3.042 1.135 5.824 3 7.938l3-2.647z"></path>
</svg>
{{ uninstallStageLabel }}
</span>
<span v-if="uninstallProgress !== null" class="text-xs text-white/50">{{ uninstallProgress }}%</span>
<div class="flex items-center gap-1.5">
<svg class="animate-spin h-3 w-3 text-red-400" fill="none" viewBox="0 0 24 24">
<circle class="opacity-25" cx="12" cy="12" r="10" stroke="currentColor" stroke-width="4"></circle>
<path class="opacity-75" fill="currentColor" d="M4 12a8 8 0 018-8V0C5.373 0 0 5.373 0 12h4zm2 5.291A7.962 7.962 0 014 12H0c0 3.042 1.135 5.824 3 7.938l3-2.647z"></path>
</svg>
<span class="text-xs text-red-300 truncate">{{ uninstallStageLabel }}</span>
</div>
<div class="w-full h-1.5 bg-white/10 rounded-full overflow-hidden">
<div
class="install-progress-fill h-full bg-white/60 rounded-full transition-all duration-500"
:style="{ width: `${Math.max(uninstallProgress ?? 8, 4)}%` }"
></div>
<div class="mt-1.5 w-full h-1.5 bg-white/10 rounded-full overflow-hidden">
<div class="h-full bg-red-400/60 rounded-full animate-pulse w-full"></div>
</div>
</div>
@ -288,29 +282,6 @@ const uninstallStageLabel = computed(() => {
return raw ? raw : `${t('common.uninstalling')}`
})
// Map the backend's uninstall-stage label to a truthful percentage so the bar
// progresses through the teardown instead of sitting at a solid full(-red)
// block. Backend stages (set_uninstall_stage):
// "Stopping containers (X/N)" -> 10 to 50% (linear over the stack)
// "Cleaning up volumes" -> 70%
// "Removing app data" -> 90%
// Unknown/between pushes -> null: the bar parks low and the shimmer overlay
// (install-progress-fill) carries the motion, exactly like a fixed install phase.
const uninstallProgress = computed<number | null>(() => {
const raw = props.pkg['uninstall-stage'] || ''
const m = raw.match(/\((\d+)\s*\/\s*(\d+)\)/)
if (m) {
const done = Number(m[1])
const total = Number(m[2])
if (total > 0) {
return Math.round(10 + Math.min(done / total, 1) * 40)
}
}
if (/volume/i.test(raw)) return 70
if (/data/i.test(raw)) return 90
return null
})
const isTransitioning = computed(() => {
const s = props.pkg.state
const h = props.pkg.health


@ -239,16 +239,6 @@ const APP_ICON_FALLBACKS: Record<string, string> = {
'archy-bitcoin-ui': '/assets/img/app-icons/bitcoin-knots.webp',
'archy-lnd-ui': '/assets/img/app-icons/lnd.svg',
'archy-electrs-ui': '/assets/img/app-icons/electrumx.png',
// ElectrumX ships under a few historical ids (the backend was renamed
// electrs → electrumx). Without an explicit map, an `electrs`-keyed install
// falls through to the default `/assets/img/app-icons/electrs.png`, which
// doesn't exist → handleImageError swaps .png→.svg and lands on electrs.svg
// (the "Electrs in Rust" logo) instead of the real ElectrumX icon. Pin the
// whole family to the ElectrumX icon so My Apps shows the right logo no
// matter which id the node has it installed under.
'electrs': '/assets/img/app-icons/electrumx.png',
'electrs-ui': '/assets/img/app-icons/electrumx.png',
'electrumx': '/assets/img/app-icons/electrumx.png',
}
// Parent-app icon by prefix, for stack members not listed explicitly above


@ -1,12 +1,9 @@
<template>
<Teleport to="body">
<!-- Lifecycle / Offline Banner.
Server restart/shutdown is deliberate, so it is shown immediately. A plain
connection blip is debounced (showConnIssue) so transient sub-grace
reconnects don't flash. -->
<!-- Offline Banner -->
<Transition name="conn-banner">
<div
v-if="(showLifecycle || showConnectionLost)"
v-if="isOffline && !store.isReconnecting && store.isAuthenticated"
class="conn-banner-overlay"
>
<div class="path-option-card px-6 py-3 border-l-4 border-yellow-500 inline-flex items-center gap-2 text-yellow-200 shadow-2xl">
@ -20,10 +17,10 @@
</div>
</Transition>
<!-- Reconnecting Banner (debounced) -->
<!-- Reconnecting Banner -->
<Transition name="conn-banner">
<div
v-if="showReconnecting"
v-if="store.isReconnecting && store.isAuthenticated"
class="conn-banner-overlay"
>
<div class="path-option-card px-6 py-3 border-l-4 border-blue-500 inline-flex items-center gap-2 text-blue-200 shadow-2xl">
@ -38,7 +35,7 @@
</template>
<script setup lang="ts">
import { computed, ref, watch, onUnmounted } from 'vue'
import { computed } from 'vue'
import { useAppStore } from '@/stores/app'
const store = useAppStore()
@ -46,58 +43,6 @@ const store = useAppStore()
const isOffline = computed(() => store.isOffline)
const isRestarting = computed(() => store.isRestarting)
const isShuttingDown = computed(() => store.isShuttingDown)
// A deliberate server lifecycle transition (restart/shutdown) is real and
// user-initiated, so surface it immediately with no debounce.
const isLifecycleTransition = computed(() => isRestarting.value || isShuttingDown.value)
const showLifecycle = computed(() => isLifecycleTransition.value && store.isAuthenticated)
// A plain connection blip (offline or reconnecting, not a lifecycle transition).
// The overwhelming majority recover within a second or two (load spikes,
// Tailscale/relay TCP resets), so showing the banner instantly makes a healthy
// node read as unstable. Debounce: only surface after the issue persists past a
// grace window; hide immediately on recovery.
const hasConnIssue = computed(
() => (store.isReconnecting || isOffline.value) && !isLifecycleTransition.value
)
const SHOW_DELAY_MS = 2500
const showConnIssue = ref(false)
let pendingTimer: ReturnType<typeof setTimeout> | null = null
function clearTimer() {
if (pendingTimer) {
clearTimeout(pendingTimer)
pendingTimer = null
}
}
watch(
hasConnIssue,
(issue) => {
clearTimer()
if (issue) {
pendingTimer = setTimeout(() => {
showConnIssue.value = true
pendingTimer = null
}, SHOW_DELAY_MS)
} else {
// Recovered before the grace window elapsed: hide at once.
showConnIssue.value = false
}
},
{ immediate: true }
)
onUnmounted(clearTimer)
// Debounced visual states the template renders.
const showReconnecting = computed(
() => showConnIssue.value && store.isReconnecting && store.isAuthenticated
)
const showConnectionLost = computed(
() => showConnIssue.value && isOffline.value && !store.isReconnecting && store.isAuthenticated
)
</script>
<style scoped>


@ -143,10 +143,9 @@ const mobileTabBar = ref<HTMLElement | null>(null)
const MOBILE_LAYOUT_MAX_WIDTH = 920
const viewportWidth = ref(typeof window === 'undefined' ? 1024 : window.innerWidth)
// App sessions own their mobile controls, so the nav hides while one is open.
// Mobile launches now use the store-driven panel (no route change) to keep the
// background tab intact, so treat an active panel the same as a routed session.
const isAppSessionActive = computed(() => route.name === 'app-session' || !!appLauncher.panelAppId)
// App sessions own their mobile controls. Normal mobile launches use the route
// session; keeping this guard also protects any desktop-panel state on resize.
const isAppSessionActive = computed(() => route.name === 'app-session')
// Show persistent tabs for Apps/Marketplace on mobile
const showAppsTabs = computed(() => {


@ -102,9 +102,9 @@
@click.stop="$emit('launch', app)"
class="px-4 py-2 glass-button glass-button-sm rounded-lg text-sm font-medium"
>Launch</button>
<!-- Scanning -->
<!-- Scanning (skipped in demo: there are no real containers to scan) -->
<span
v-else-if="!containersScanned && (app.source === 'local' || app.dockerImage)"
v-else-if="!IS_DEMO && !containersScanned && (app.source === 'local' || app.dockerImage)"
class="flex-1 px-4 py-2 rounded-lg text-white/50 text-sm font-medium text-center cursor-default relative overflow-hidden"
>
<span class="discover-shimmer-bg"></span>
@ -116,6 +116,12 @@
Checking...
</span>
</span>
<!-- Demo: app not demoable -->
<button
v-else-if="IS_DEMO && !isInstalled(app.id) && !isDemoApp(app.id)"
disabled
class="flex-1 px-4 py-2 bg-white/10 rounded-lg text-white/40 text-sm font-medium cursor-not-allowed"
>No demo</button>
<!-- Install button -->
<button
v-else-if="!isInstalled(app.id) && (app.source === 'local' || app.dockerImage)"
@ -158,6 +164,7 @@
<script setup lang="ts">
import type { MarketplaceApp } from './types'
import { handleImageError } from '@/views/apps/appsConfig'
import { IS_DEMO, isDemoApp } from '@/composables/useDemoIntro'
defineProps<{
filteredApps: MarketplaceApp[]

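The order of the `v-else-if` branches touched above matters, so here is a small sketch of the precedence as a pure function. It models only the three branches visible in this hunk (Checking, No demo, Install); earlier branches such as Launch are collapsed into `'other'` because their conditions are not shown. The `ButtonInputs` shape and the `installable` flag (standing in for `app.source === 'local' || app.dockerImage`) are assumptions made for illustration.

```typescript
// Hypothetical model of the visible button branches; field names are assumed.
type ButtonState = 'checking' | 'no-demo' | 'install' | 'other'

interface ButtonInputs {
  isDemo: boolean            // IS_DEMO
  containersScanned: boolean // containersScanned
  installed: boolean         // isInstalled(app.id)
  demoable: boolean          // isDemoApp(app.id)
  installable: boolean       // app.source === 'local' || app.dockerImage
}

// Evaluate the branches in template order; the first match wins.
function buttonState(i: ButtonInputs): ButtonState {
  // Scanning is skipped entirely in demo: there are no containers to scan.
  if (!i.isDemo && !i.containersScanned && i.installable) return 'checking'
  // Demo-only: apps without a mock UI get a disabled "No demo" button.
  if (i.isDemo && !i.installed && !i.demoable) return 'no-demo'
  // Otherwise fall back to the regular install button.
  if (!i.installed && i.installable) return 'install'
  return 'other'
}
```

Written this way, it is easy to see that a demoable app in demo mode reaches the Install button immediately, even though `containersScanned` never becomes true there.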

@ -64,7 +64,7 @@
Starting...
</span>
<button
v-else-if="!containersScanned && app.dockerImage"
v-else-if="!IS_DEMO && !containersScanned && app.dockerImage"
disabled
class="text-white/40 text-sm flex items-center gap-2"
>
@ -74,6 +74,11 @@
</svg>
Checking...
</button>
<button
v-else-if="IS_DEMO && !isInstalled(app.id) && !isDemoApp(app.id)"
disabled
class="glass-button glass-button-sm rounded-lg text-sm font-medium opacity-50 cursor-not-allowed"
>No demo</button>
<button
v-else-if="!isInstalled(app.id) && app.dockerImage"
data-controller-install-btn
@ -99,6 +104,7 @@
<script setup lang="ts">
import type { FeaturedApp, MarketplaceApp } from './types'
import { handleImageError } from '@/views/apps/appsConfig'
import { IS_DEMO, isDemoApp } from '@/composables/useDemoIntro'
defineProps<{
featuredApps: FeaturedApp[]


@ -85,7 +85,7 @@ export function getCuratedAppList(): MarketplaceApp[] {
{ id: 'grafana', title: 'Grafana', version: '10.2.0', description: 'Analytics and monitoring platform. Dashboards for your node metrics and system health.', icon: '/assets/img/app-icons/grafana.png', author: 'Grafana Labs', dockerImage: `${R}/grafana:10.2.0`, repoUrl: 'https://github.com/grafana/grafana' },
{ id: 'searxng', title: 'SearXNG', version: '2024.1.0', description: 'Privacy-respecting metasearch engine. Search the internet without being tracked or profiled.', icon: '/assets/img/app-icons/searxng.png', author: 'SearXNG', dockerImage: `${R}/searxng:latest`, repoUrl: 'https://github.com/searxng/searxng' },
{ id: 'ollama', title: 'Ollama', version: '0.5.4', description: 'Run AI models locally. Llama, Mistral, and more — on your hardware, completely private.', icon: '/assets/img/app-icons/ollama.png', author: 'Ollama', dockerImage: `${R}/ollama:latest`, repoUrl: 'https://github.com/ollama/ollama' },
{ id: 'cryptpad', title: 'CryptPad', version: '2024.12.0', description: 'End-to-end encrypted documents, spreadsheets, and presentations. Zero-knowledge collaboration.', icon: '/assets/icon/favico-black-v2.svg', author: 'XWiki SAS', dockerImage: `${R}/cryptpad:2024.12.0`, repoUrl: 'https://github.com/cryptpad/cryptpad' },
{ id: 'cryptpad', title: 'CryptPad', version: '2024.12.0', description: 'End-to-end encrypted documents, spreadsheets, and presentations. Zero-knowledge collaboration.', icon: '/assets/img/app-icons/cryptpad.webp', author: 'XWiki SAS', dockerImage: `${R}/cryptpad:2024.12.0`, repoUrl: 'https://github.com/cryptpad/cryptpad' },
{ id: 'nextcloud', title: 'Nextcloud', version: '29', description: 'Your own private cloud. File sync, calendars, contacts — all on your hardware.', icon: '/assets/img/app-icons/nextcloud.webp', author: 'Nextcloud', dockerImage: `${R}/nextcloud:29`, repoUrl: 'https://github.com/nextcloud/server' },
{ id: 'vaultwarden', title: 'Vaultwarden', version: '1.30.0', description: 'Self-hosted password vault. Bitwarden-compatible with zero-knowledge encryption.', icon: '/assets/img/app-icons/vaultwarden.webp', author: 'Vaultwarden', dockerImage: `${R}/vaultwarden:1.30.0-alpine`, repoUrl: 'https://github.com/dani-garcia/vaultwarden' },
{ id: 'jellyfin', title: 'Jellyfin', version: '10.8.13', description: 'Free media server. Stream your movies, music, and photos to any device.', icon: '/assets/img/app-icons/jellyfin.webp', author: 'Jellyfin', dockerImage: `${R}/jellyfin:10.8.13`, repoUrl: 'https://github.com/jellyfin/jellyfin' },


@ -234,7 +234,7 @@ export function getCuratedAppList(): MarketplaceApp[] {
title: 'CryptPad',
version: '2024.12.0',
description: 'End-to-end encrypted documents, spreadsheets, and presentations. Zero-knowledge collaboration.',
icon: '/assets/icon/favico-black-v2.svg',
icon: '/assets/img/app-icons/cryptpad.webp',
author: 'XWiki SAS',
dockerImage: `${REGISTRY}/cryptpad:2024.12.0`,
manifestUrl: undefined,


@ -151,6 +151,16 @@ export default defineConfig({
changeOrigin: true,
secure: false,
},
// Demo mock app UIs (electrumx, lnd, fedimint) + generic notice page.
'/app/electrumx': { target: process.env.BACKEND_URL || 'http://localhost:5959', changeOrigin: true, secure: false },
'/app/electrs': { target: process.env.BACKEND_URL || 'http://localhost:5959', changeOrigin: true, secure: false },
'/app/lnd': { target: process.env.BACKEND_URL || 'http://localhost:5959', changeOrigin: true, secure: false },
'/app/fedimint': { target: process.env.BACKEND_URL || 'http://localhost:5959', changeOrigin: true, secure: false },
'/app/bitcoin-core': { target: process.env.BACKEND_URL || 'http://localhost:5959', changeOrigin: true, secure: false },
'/app/bitcoin-knots': { target: process.env.BACKEND_URL || 'http://localhost:5959', changeOrigin: true, secure: false },
'/electrs-status': { target: process.env.BACKEND_URL || 'http://localhost:5959', changeOrigin: true, secure: false },
'/proxy': { target: process.env.BACKEND_URL || 'http://localhost:5959', changeOrigin: true, secure: false },
'/lnd-connect-info': { target: process.env.BACKEND_URL || 'http://localhost:5959', changeOrigin: true, secure: false },
// Serve the node's deployed AIUI same-origin like production (set VITE_AIUI_URL=/aiui/)
'/aiui': {
target: process.env.AIUI_PROXY_TARGET || 'http://127.0.0.1:80',

File diff suppressed because it is too large


@ -80,7 +80,7 @@ fi
# runs the release gate harness (cargo fmt/check, catalog drift, vitest, and
# the focused cargo suites — incl. the receive/port-drift/secret regressions).
# Skipped on --dry-run, or set SKIP_RELEASE_TESTS=1 to bypass in an emergency.
# The lifecycle bats harness (tests/lifecycle/run-gate.sh) still runs separately
# The lifecycle bats harness (tests/lifecycle/run-20x.sh) still runs separately
# against live nodes — see tests/lifecycle/TESTING.md.
if ! $DRY_RUN; then
if [ "${SKIP_RELEASE_TESTS:-0}" = "1" ]; then


@ -14,16 +14,16 @@
#
# Usage:
# scripts/generate-app-catalog.sh [output-path]
# EMBED_MANIFESTS=0 scripts/generate-app-catalog.sh # version/image only (legacy)
# EMBED_MANIFESTS=1 scripts/generate-app-catalog.sh # also embed full manifests
# # then publish: push releases/app-catalog.json to the OVH gitea (raw URL).
#
# EMBED_MANIFESTS (default ON, 2026-06-23): embed each app's full
# apps/<id>/manifest.yml into its catalog entry's `manifest` field, so nodes
# EMBED_MANIFESTS (opt-in, default off): also embed each app's full
# apps/<id>/manifest.yml into its catalog entry's `manifest` field, so nodes can
# install from the signed registry alone (no OTA-shipped disk manifest). Consumed
# by container::app_catalog + the orchestrator's load_manifests overlay
# (origin-wins, disk = fallback). See docs/registry-manifest-design.md. The
# migration window is over — every regen now embeds; set EMBED_MANIFESTS=0 only
# to reproduce the old version/image-only catalog.
# (origin-wins, disk = fallback). See docs/registry-manifest-design.md. Kept
# opt-in during the migration window so a routine catalog regen never changes
# what phase-1 nodes install until we deliberately turn it on.
set -euo pipefail
ROOT="$(cd "$(dirname "${BASH_SOURCE[0]}")/.." && pwd)"
@ -36,7 +36,7 @@ source "$ROOT/scripts/image-versions.sh"
set +a
UPDATED="$(date -u +%Y-%m-%d)" OUT="$OUT" APPS_DIR="$ROOT/apps" \
EMBED_MANIFESTS="${EMBED_MANIFESTS:-1}" python3 - <<'PY'
EMBED_MANIFESTS="${EMBED_MANIFESTS:-}" python3 - <<'PY'
import glob
import json, os


@ -20,7 +20,7 @@ ELECTRUMX_IMAGE="$ARCHY_REGISTRY/electrumx:v1.18.0"
# Mempool stack
MEMPOOL_BACKEND_IMAGE="$ARCHY_REGISTRY/mempool-backend:v3.0.0"
MEMPOOL_WEB_IMAGE="$ARCHY_REGISTRY/mempool-frontend:v3.0.1"
MEMPOOL_WEB_IMAGE="$ARCHY_REGISTRY/mempool-frontend:v3.0.0"
MARIADB_IMAGE="$ARCHY_REGISTRY/mariadb:11.4.10"
# BTCPay


@ -1,19 +1,8 @@
#!/usr/bin/env bash
# Build the Archipelago companion debug APK and stage it as the served download
# at neode-ui/public/packages/archipelago-companion.apk (a plain APK, so a phone
# can install it straight from the link — no unzip step).
# at neode-ui/public/packages/archipelago-companion.apk.zip.
#
# Run manually, or automatically via the pre-push hook (.githooks/pre-push).
#
# Hardened (2026-06-26) so a broken APK can never ship again:
# 1. Aborts on stray resource dirs whose names contain spaces (these break a
# clean build with "Invalid resource directory name"). Empty ones — junk
# left by some icon-export tools — are auto-removed; non-empty ones error.
# 2. Always a CLEAN build (incremental builds masked the bad resource dirs).
# 3. Forces v1 + v2 + v3 signing with zipalign + apksigner. AGP's
# `enableV1Signing = true` flag is silently ignored for minSdk>=24, which
# shipped a v2-only APK that some OEM installers reject ("App not installed").
# 4. VERIFIES all three schemes and ABORTS if any is missing — no silent ship.
set -euo pipefail
ROOT="$(git rev-parse --show-toplevel)"
@ -27,68 +16,20 @@ if [ ! -x "$JAVA/bin/java" ] || [ ! -d "$SDK" ]; then
echo " (set JAVA_HOME and ANDROID_HOME to build the companion APK)" >&2
exit 0
fi
export JAVA_HOME="$JAVA"
export PATH="$JAVA/bin:$PATH"
RES="Android/app/src/main/res"
echo "publish-companion-apk: building debug APK…" >&2
( cd Android && JAVA_HOME="$JAVA" ANDROID_HOME="$SDK" ./gradlew -q :app:assembleDebug )
APK="Android/app/build/outputs/apk/debug/app-debug.apk"
SIGNED="Android/app/build/outputs/apk/debug/app-debug-signed.apk"
DEST="neode-ui/public/packages/archipelago-companion.apk"
OLD_ZIP="neode-ui/public/packages/archipelago-companion.apk.zip"
KS="Android/app/debug.keystore"
# 1. Guard against resource dirs with spaces (Android forbids them; a clean
# build aborts on them). Empty ones are removed; non-empty ones are fatal.
while IFS= read -r d; do
[ -n "$d" ] || continue
if [ -n "$(ls -A "$d" 2>/dev/null)" ]; then
echo "publish-companion-apk: ERROR — resource dir with a space is not empty:" >&2
echo " $d" >&2
echo " Rename it (Android resource dir names cannot contain spaces)." >&2
exit 1
fi
rmdir "$d" && echo "publish-companion-apk: removed stray empty resource dir: $d" >&2
done < <(find "$RES" -type d -name '* *' 2>/dev/null)
# 2. Clean build.
echo "publish-companion-apk: clean build of debug APK…" >&2
( cd Android && ./gradlew -q --console=plain :app:clean :app:assembleDebug )
[ -f "$APK" ] || { echo "publish-companion-apk: ERROR — APK not produced at $APK" >&2; exit 1; }
# 3. Force v1 + v2 + v3 signing (AGP's enableV1Signing flag is ignored here).
BT="$(ls -d "$SDK"/build-tools/*/ | sort -V | tail -1)"
ZIPALIGN="${BT}zipalign"; APKSIGNER="${BT}apksigner"
[ -x "$ZIPALIGN" ] && [ -x "$APKSIGNER" ] || {
echo "publish-companion-apk: ERROR — zipalign/apksigner not found under $BT" >&2; exit 1; }
[ -f "$KS" ] || { echo "publish-companion-apk: ERROR — keystore missing at $KS" >&2; exit 1; }
echo "publish-companion-apk: zipalign + sign (v1+v2+v3)…" >&2
"$ZIPALIGN" -p -f 4 "$APK" "$SIGNED"
"$APKSIGNER" sign \
--ks "$KS" --ks-pass pass:android \
--ks-key-alias androiddebugkey --key-pass pass:android \
--v1-signing-enabled true --v2-signing-enabled true --v3-signing-enabled true \
"$SIGNED"
# 4. Verify all three schemes (min-sdk 21 forces the v1 path to be exercised).
VERIFY="$("$APKSIGNER" verify -v --min-sdk-version 21 "$SIGNED" 2>&1)"
for scheme in "v1 scheme" "v2 scheme" "v3 scheme"; do
if ! printf '%s\n' "$VERIFY" | grep -iq "$scheme.*: true"; then
echo "publish-companion-apk: ERROR — $scheme NOT present after signing. Aborting." >&2
printf '%s\n' "$VERIFY" | grep -iE "scheme" >&2
exit 1
fi
done
echo "publish-companion-apk: verified v1 + v2 + v3 signatures." >&2
# 5. Publish.
DEST="neode-ui/public/packages/archipelago-companion.apk.zip"
mkdir -p "$(dirname "$DEST")"
cp "$SIGNED" "$DEST"
# Drop the legacy zipped artifact so the served download is the raw APK only.
if [ -f "$OLD_ZIP" ]; then
git rm -q --ignore-unmatch "$OLD_ZIP" 2>/dev/null || rm -f "$OLD_ZIP"
fi
TMP="$(mktemp -d)"
cp "$APK" "$TMP/app-debug.apk"
# -X drops platform-specific extra fields for a stabler archive.
( cd "$TMP" && zip -q -X archipelago-companion.apk.zip app-debug.apk )
cp "$TMP/archipelago-companion.apk.zip" "$DEST"
rm -rf "$TMP"
git add "$DEST"
echo "publish-companion-apk: staged $DEST" >&2


@ -26,9 +26,8 @@ The migration's aim, restated as **five pillars** (every app must satisfy all fi
desired→current from manifests + secrets. Self-healing, not edge-triggered.
3. **Lifecycle bulletproof** — every app passes the full matrix
(install / UI reachable / stop / start / restart / reinstall / reboot-survive
/ archipelago-restart-survive / uninstall) **5× green on .228** — run ON the node
(`ARCHY_ITERATIONS=5`).
(Multinode / fleet → `docs/multinode-testing-plan.md`, separate.)
/ archipelago-restart-survive / uninstall) **5× green on .228 AND .198 for now**
(`ARCHY_ITERATIONS=5`; temporarily reduced from 20×, restore before final ship)
before any release.
4. **Data-driven apps** — install/uninstall needs only the app's manifest +
catalog entry. **No host OS changes** (no apt, no /etc, no host units) and
@ -41,10 +40,9 @@ The migration's aim, restated as **five pillars** (every app must satisfy all fi
owned by the service user. Security is king.
**Per-app definition of done:** all five pillars hold → lifecycle matrix 5×
green on .228 (run ON the node) → catalog/registry updated (`app-catalog/catalog.json`
(for now; was 20×) green on .228 then .198 → catalog/registry updated (`app-catalog/catalog.json`
+ `releases/app-catalog.json`, rebuilt image pushed to the mirror) → tracker
cell ticked. Only then move to the next app. (Fleet/multinode verification is a
separate pass → `docs/multinode-testing-plan.md`.)
cell ticked. Only then move to the next app.
**.228 testing constraint:** do NOT touch `bitcoin-knots`, `electrumx`, or
`lnd` on .228 — they are synced and healthy; destructive cycles there would
@ -80,7 +78,7 @@ cost hours of resync.
archipelago` → `cp` binary → `start`.
4. Validate: install fedimint-gateway → assert `fedimint-gateway-hash` (0600,
archipelago-owned) + `.pw` generated → container starts healthy.
5. Run `tests/lifecycle/run-gate.sh` for the gateway (do NOT touch knots/electrumx/lnd).
5. Run `tests/lifecycle/run-20x.sh` for the gateway (do NOT touch knots/electrumx/lnd).
6. Frontend fixes (separate from binary): see icon/rename below; rebuild neode-ui,
ship `dist + catalog.json + assets` to `/opt/archipelago/web-ui` (chown 1000:1000).
@ -123,9 +121,8 @@ cost hours of resync.
| L5 — Chaos / failure-path | Failure modes recover gracefully (corrupt config, deleted bolt DB, network partition) | bats (chaos-gated) | ~120s per scenario |
| L6 — Performance | Cold install latency, reconcile-tick cost, podman call count per lifecycle event | timed bats + Prometheus (TBD) | ~60s per benchmark |
Release gate: **L0+L1+L2+L3 green × 20 iterations** on .228 (run ON the node; 5× for
now). Multinode/fleet → `docs/multinode-testing-plan.md`. L4+L5+L6 are quality gates
we add as they mature; not blocking the v1.7.52 tag.
Release gate: **L0+L1+L2+L3 green × 20 iterations** on .228 AND .198. L4+L5+L6 are
quality gates we add as they mature; not blocking the v1.7.52 tag.
## Coverage matrix — current state
@ -168,7 +165,7 @@ v1.7.52 tags.
Three production failures shipped on v1.7.90-alpha despite the existing harness,
because nothing exercised the receive path, port-mapping drift, or secret
completeness on a live node. New suites close those gaps (all run on the archy
host, read-only, so they join `run.sh`/`run-gate.sh` automatically):
host, read-only, so they join `run.sh`/`run-20x.sh` automatically):
| Suite | Failure it guards | Asserts |
|---|---|---|
@ -196,47 +193,11 @@ ARCHY_PASSWORD=password123 tests/lifecycle/run.sh
# Full + destructive (for the verification fleet):
ARCHY_PASSWORD=password123 ARCHY_ALLOW_DESTRUCTIVE=1 tests/lifecycle/run.sh
# 5× release-gate run:
# 5× release-gate run (for now; was 20× — restore before final ship):
ARCHY_PASSWORD=password123 ARCHY_ALLOW_DESTRUCTIVE=1 ARCHY_ITERATIONS=5 \
tests/lifecycle/run-gate.sh
# CASCADE tier (uninstall → no-ghost → reinstall) — opt-in, NOT in the canonical
# gate. Installs/uninstalls a THROWAWAY app (default grafana; skips if already
# installed). Run on-node to also assert data-dir removal:
ARCHY_PASSWORD=password123 ARCHY_ALLOW_CASCADE_DESTRUCTIVE=1 \
tests/lifecycle/run.sh cascade-uninstall
tests/lifecycle/run-20x.sh
```
### CASCADE tier — uninstall/reinstall regression guard (Workstream F)
The 5× gate is DESTRUCTIVE-only (stop/start/restart/survive); it never exercised
uninstall/reinstall, where the worst lifecycle bugs lived. `cascade-uninstall.bats`
closes that gap and encodes the fixes for two field bugs:
| Suite | Failure it guards | Asserts |
|---|---|---|
| `cascade-uninstall.bats` | **#13 uninstall ghost** (immich/grafana stayed in My Apps after uninstall) and **#14 reinstall stops** (stalled on stale state/data) | fresh install reaches `running` via a truthful (non-silent) progression; uninstall makes the entry **disappear from `server.get-state` package-data** (no ghost, no stuck uninstall stage) + removes the container + (on-node) the data dir; reinstall returns to `running`; node left as found |
Throwaway-app + precondition-skip (won't touch an app that's already installed),
so it's safe on a populated node. Override the app via `ARCHY_CASCADE_APP` /
`ARCHY_CASCADE_IMAGE` / `ARCHY_CASCADE_CONFIG` / `ARCHY_CASCADE_DATA_DIR`.
Gated on `ARCHY_ALLOW_CASCADE_DESTRUCTIVE=1`. Verified 7/7 on .228 (2026-06-24).
### All-apps lifecycle matrix (Workstream F)
The per-app suites cover ~8 core apps in depth; `all-apps-matrix.bats` covers
**every installed app in breadth, automatically** — it derives the app set from
`server.get-state` package-data (no hardcoded list) and grows coverage as nodes
install more apps. **Read-only**, so it joins `run.sh`/`run-gate.sh` on every node.
| Suite | Guards (fleet-wide) | Asserts (per installed app) |
|---|---|---|
| `all-apps-matrix.bats` | apps STUCK transitional (the #13/#14 ghost generalized), error/failed apps, unreachable UI apps (port-drift generalized) | settles to a non-transitional state within a window; not error/failed; recognized (non-garbage) state; every **running UI app** (manifest `ui=="true"`) exposes a non-null lan-address |
Tunables: `ARCHY_MATRIX_SETTLE_SECS` (45), `ARCHY_MATRIX_UI_SECS` (30),
`ARCHY_MATRIX_ALLOW_STOPPED` (ids allowed non-running). Verified 5/5 on .228
(17 apps) and .116 (20 apps incl. grafana/nextcloud/photoprism/gitea), 2026-06-24.
To exercise the Phase 3.2 Quadlet-backend path on a target node without
editing config.json (which would require an archipelago restart and
trigger FM3 until 3.5 ships), set the env var on `archipelago.service`:
@ -264,7 +225,7 @@ Goal: minimum-viable container subsystem.
| `core/container/src/bitcoin_simulator.rs` | 219 | 0 | -219 | ○ couples with dev_orchestrator |
| `core/container/src/port_manager.rs` | 175 | 0 | -175 | ○ couples with dev_orchestrator |
| `core/archipelago/src/api/rpc/package/install.rs::install_bitcoincoin_rpc_repair` | ~150 | 0 | -150 | ◐ pending fold into orchestrator pre-start |
| imperative `install_fresh` in prod_orchestrator | ~120 | 0 | -120 | ◐ Phase 3.2 wired behind `use_quadlet_backends` flag (default off); 3.3 in-place migration ✅; 3.4 health-gated startup (`Notify=healthy`) ✅ + `TimeoutStartSec=600` race fix ✅; 3.4a unit drift-sync each reconcile ✅; flip default after 5× green |
| imperative `install_fresh` in prod_orchestrator | ~120 | 0 | -120 | ◐ Phase 3.2 wired behind `use_quadlet_backends` flag (default off); 3.3 in-place migration ✅; 3.4 health-gated startup (`Notify=healthy`) ✅ + `TimeoutStartSec=600` race fix ✅; 3.4a unit drift-sync each reconcile ✅; flip default after 20× green |
**Today: -270 LoC committed. Outstanding deletes possible: ~1,616 LoC** (if Phase 3 ships fully + dev_mode resolved).
@ -287,8 +248,8 @@ We don't have a performance harness yet. Add as L6 lands:
v1.7.52 ships only when ALL of:
1. ☐ Bitcoin-stops fix verified live on a fresh node (tests/lifecycle/bats/bitcoin-knots.bats fully ● after a cold install)
2. ☐ `ARCHY_ITERATIONS=5 tests/lifecycle/run-gate.sh` returns 0 **run ON .228** (5× for now; full suite, ARCHY_ALLOW_DESTRUCTIVE=1) — 1× is GREEN (110/110), 5× in progress
3. ☐ Multinode/fleet (.198 + others) — tracked separately in `docs/multinode-testing-plan.md`, NOT a v1.7.52 single-node gate item
2. ☐ `ARCHY_ITERATIONS=5 tests/lifecycle/run-20x.sh` returns 0 against .228 (5× for now; full suite, ARCHY_ALLOW_DESTRUCTIVE=1)
3. ☐ `ARCHY_ITERATIONS=5 tests/lifecycle/run-20x.sh` returns 0 against .198 (same)
4. ☐ The L3 `backend-survives-archipelago-restart` suite passes (= Phase 3 Quadlet shipped for backends)
5. ☐ Cargo: 0 warnings, 0 unused, all tests green (sustained ✓ since 1c0df95f)
6. ☐ LoC: at least one of {Phase 3 Quadlet, dev_mode resolution} merged

Some files were not shown because too many files have changed in this diff