# PRODUCTION MASTER PLAN — Archipelago App Platform & Registry

> **✅ SINGLE-NODE PRODUCTION GATE IS GREEN (2026-06-23): `run-gate.sh` 5/5 on .228, 0 failures.**
> This remains the authoritative plan for the broader north star (manifest-driven
> platform, registry-distributed manifests, external marketplace), but it is no
> longer a hard priority banner blocking all other work. Remaining workstreams are
> in §6 / §8b. Next exit-criteria: multinode (`docs/multinode-testing-plan.md`) +
> workstreams B/C/D.
>
> Last updated: 2026-06-26 · zombie-container guard + gitea launch-port fix shipped, binary `040df5ce` rolled to the fleet (see §8b SESSION h). Prior: orchestrator Fix A+B (`a721532f`/`e0343137`) deployed + proven.

---

## 1. The North Star

Make Archipelago a **world-class, developer-ready app platform** where:

1. **Every app is manifest-driven** — install/run/update/uninstall needs only the app's manifest (+ catalog entry). **Zero OS-level code reliance**: no per-app Rust installers, no `sudo mkdir/chown`, no host provisioning.
2. **Manifests are distributed via the (signed) registry**, not baked into the binary OTA as disk files. Bumping/adding an app = a signed catalog change.
3. **Third-party developers can build and ship apps via an external registry** — a decentralized marketplace (DID-signed manifests, Nostr discovery, reputation), not a gatekept central store. `archy app validate/render/install/test` tooling.
4. The platform stays **rootless, secure-by-default, elegant, robust, and 100%-uptime-capable** (reboot-survivable, self-healing, no data loss on migrate).

**Definition of done:** the production test gate (§5) is green for the app set on real nodes. Until then, this plan is the priority.

## 2. Invariants (never violate)

- **Rootless Podman only.** No rootful, no Docker-socket mounts, no privileged containers unless explicitly approved. (ADR-001, ADR-009.)
- **No app-specific business logic in the Rust backend.** The orchestrator owns the lifecycle state machine; apps are declarative. Legacy `install_immich_stack` (hardcoded `podman run` + `sudo chown`) is the anti-pattern being deleted.
- **Secrets are manifest-declared** (`generated_secrets`, materialised by `container::secrets` 0600/rootless, idempotent + self-healing) — never hardcoded, per-app, or logged. Replaces the deleted `ensure_fmcd_password`.
- **Migrations never destroy data.** Preserve `/var/lib/archipelago/`, generated secrets, displayed credentials, public ports, and adoption container names. Always provide a rollback path. Stop/recreate only when necessary.
- **Verify on the real node .228 before any tag.** (Fleet/multinode verification is a separate pass → `docs/multinode-testing-plan.md`.)

## 3. Current state (2026-06-21)

- **~40 apps are manifest-based and Quadlet-migrated** (survive `archipelago.service` restart + reboot). Exhaustive per-app table: `docs/app-registry-status-2026-06-21.md`.
- **Legacy holdout: immich** — the one app with **no manifest** and a hardcoded Rust stack installer (in-cgroup, not Quadlet). 3 containers, healthy, live data. The migration proof case.
- **Manifests still travel by OTA disk rsync** (`apps/ → /opt/archipelago/apps`). The signed catalog (`app-catalog.json`) currently distributes **only image overrides** — not full manifests. Gap closed by workstream B.
- **The 4 companions** (`archy-bitcoin-ui`, `-lnd-ui`, `-electrs-ui`, `-fedimint-ui`) build from `docker/` contexts via `companion.rs`, not the manifest registry — a later phase folds them in.
- **No app has passed the formal production gate.** That is the blocker.

## 4. Workstreams (each links its authoritative detail doc)

| # | Workstream | Detail doc | Status |
|---|-----------|-----------|--------|
| A | **Manifest-driven app platform** — packaging contract, single/multi-container runtime, routing, controlled hooks, dev tooling (6 phases, security model, migration rules) | `APP-PACKAGING-MIGRATION-PLAN.md` | mostly done; immich + multi-container polish remain |
| B | **Registry-distributed manifests** — catalog carries full signed manifest; orchestrator installs from registry; disk = migration fallback | `registry-manifest-design.md` | **phases 1+2 done** (node consume + opt-in publisher embed); not yet flipped on for the fleet |
| C | **Developer-ready external registry** — 3rd-party DID-signed manifests, decentralized Nostr discovery (NIP-78 kind 30078) + trust score, `archy app …` tooling | `marketplace-protocol.md`, `app-developer-guide.md` | design exists; tooling + trust UX pending |
| D | **Distribution backbone** — signed catalog, BLAKE3 content-addressing, iroh swarm (origin-always-wins) | `dht-distribution-design.md` | phases 0–2 code-complete (worktree) |
| E | **Production test gate** — 5× lifecycle on **.228**, per-app L1/L2 matrix; multinode is split out → `multinode-testing-plan.md` | `tests/lifecycle/TESTING.md`, `bulletproof-containers.md` | **✅ .228 5×-GREEN (110/110 ×5, 0 not-ok, 2026-06-23)** — but this is DESTRUCTIVE-tier / ~8 core apps only; see §6c for the coverage gaps |
| F | **Lifecycle perfection — cascade + progress + ALL apps** — extend the gate to uninstall/reinstall (cascade), real install/uninstall progress UI, and EVERY installed app (not just the 8 core). The "insanely-perfect OS/container environment" bar. | §6c (below), `tests/lifecycle/TESTING.md` | **IN PROGRESS (2026-06-26)** — root bug FIXED: uninstall could hang → ghost/stuck-bar/reinstall-block (`71cc9ac4`, unbounded systemctl/podman in `quadlet::disable_remove`); `cascade-uninstall.bats` **7/7 green on .228** w/ binary `ae349a75`. Remaining: wire CASCADE into the canonical gate run, progress-UI truthfulness, all-apps matrix, guardian/IBD state. |

**Orchestrator architecture** (foundation for A/B): `rust-orchestrator-migration.md` (ProdContainerOrchestrator, BootReconciler 30s level-triggered reconcile, adoption scan, Quadlet rendering) and `bulletproof-containers.md` (the six container failure modes FM1–FM6 + the desired-state-first reconciler that fixes them).

## 5. Production test gate (exit criterion)

An app is **production-ready** only when `tests/lifecycle/run-gate.sh` is green across the full matrix — install / UI-reachable / stop / start / restart / reinstall / **reboot-survive** / **archipelago-restart-survive** / uninstall — **5× on .228** (`ARCHY_ITERATIONS=5`). **The gate runs ON the node** (it uses local podman/systemctl/bitcoin probes; running it via RPC from another host silently tests the runner).

**Multinode / fleet verification (.198 + others) is a SEPARATE plan — `docs/multinode-testing-plan.md` — NOT part of this single-node criterion.**

Coverage today: L0 unit (631 ●), L1 RPC ● for 6 core apps, L2 UI ● dashboard + proxies; L3 survival ◐; ~30 apps have zero automated coverage.

> ⚠️ **The 2026-06-23 5×-green is NOT the full bar.** `run-gate.sh` runs only the
> **DESTRUCTIVE tier** (stop/start/restart/survive) over ~8 core apps; it **skips
> uninstall/reinstall** (CASCADE is gated behind `ARCHY_ALLOW_CASCADE_DESTRUCTIVE`,
> never set by the gate) and tests no install/uninstall **progress UI**. Real
> uninstall/reinstall/progress bugs (immich + grafana) were found in manual testing
> right after — see **§6c (workstream F)** for the gap and the expanded-gate plan.
> The true "every app, fully" criterion is F's definition-of-done, not this run.

## 6. Immediate sequence (live workstream)

1. ✅ **B-phase 1** — `manifest` field on `AppCatalogEntry`; `load_manifests` catalog-wins merge; `manifest_dir` kept (build-source catalog manifests skipped in phase 1); unit tests.
   *(commit 220666d3)*
2. ✅ **B-phase 2** — `EMBED_MANIFESTS` publisher generator + round-trip guard. *(7bfbe8fe; signing via existing ceremony — not yet flipped on for the fleet.)*
3. ✅ **C immich proof** — immich is a manifest-driven stack (immich + immich-postgres + immich-redis) installed via `install_stack_via_orchestrator`; legacy installer is now fallback-only. Live-migrated + verified on .228. Found+fixed: container_name duplicate-on-shared-PGDATA, version-digit validation, partial-fallback hardening, data_uid 100998. Canonical app_id `immich` (title+icon). *(9e6c5370, d5ef4573)*
4. ✅ **Reboot-survival** — podman-restart.service enabled (startup, fleet-wide) for the podman-`--restart` path. *(f160e0c4)*
5. ✅ **E** — 5× gate on **.228** (`ARCHY_ITERATIONS=5`) is **GREEN: 5/5, 0 not-ok** (2026-06-23). Two real orchestrator bugs were found + fixed en route (package.stop per-app grace; package.restart phantom stack-member injection → `order_present_containers`, commit 92d7f52d) plus two single-shot-read probes hardened (bitcoin-knots state, immich lan_address). The single-node criterion is met.
6. ✅ Banner demoted (this doc, 2026-06-23). Next: multinode pass + workstreams B/C/D.

**Multinode / fleet verification (.198 and the rest) is split into its own plan:** `docs/multinode-testing-plan.md`. Do it AFTER the .228 single-node gate is green.

**Not yet done / deliberate follow-ups:** flip `EMBED_MANIFESTS` on for the published catalog (then sign) to actually distribute manifests via the registry; Phase-3 `use_quadlet_backends` rollout so orchestrator backends are Quadlet (not just podman-`--restart`).

## 6b. Post-deploy task order (agreed 2026-06-23)

After the 2026-06-23 multinode test deploy (latest backend + UX frontend to .116/.198/.228 + Tailscale testers), do these IN ORDER:

1. **netbird #20 ph4** — the last real manifest migration (workstream A).
2. **Phase-3 `use_quadlet_backends`** — orchestrator backends become Quadlet units.
3. **§6c Lifecycle perfection** (workstream F) — the comprehensive uninstall/reinstall + progress-UI + all-apps gate expansion below.

## 6b-bis. Bitcoin multi-version bulletproofing (2026-06-29) — READY TO MERGE + DEPLOY

Branch `bitcoin-version-bulletproof` (base `095a76cd`). Fixes the "switch version silently fails / crash-loops" class + a data-access mismatch that can corrupt a node's index. All code + images + catalog + frontend DONE; **.228** carries it (Knots chainstate mid-reindex recovery). The **coordinated fleet rollout** (OTA binary+frontend, mirror catalog publish, `:latest` repoint sequencing, full switch-matrix test) is the remaining work — fold it into the next release. **Authoritative detail + exact remaining steps + test matrix → `docs/bitcoin-version-bulletproof-rollout.md`.** Pairs with `docs/bitcoin-multi-version-design.md`.

## 6c. Lifecycle perfection — what "green" MISSED (workstream F, the perfection bar)

**Why this exists:** the 2026-06-23 single-node gate went 5×-green but is **NOT** the "every app fully lifecycle-tested" guarantee a user reasonably assumes. The canonical gate (`run-gate.sh`) only runs the **DESTRUCTIVE tier** (stop / start / restart / survive) over **~8 core apps** (bitcoin-knots, btcpay, electrumx, lnd, mempool, immich, fedimint, filebrowser). It explicitly **SKIPS uninstall/reinstall** (the CASCADE tier is gated behind `ARCHY_ALLOW_CASCADE_DESTRUCTIVE`, which `run-gate.sh` never sets) and has **zero coverage** for the other ~30 apps (grafana, jellyfin, vaultwarden, penpot, nextcloud, photoprism, uptime-kuma, homeassistant, … — see `app-registry-status-2026-06-21.md`). So uninstall, reinstall, install-progress UI, and most apps were never under test.
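
The coverage gap described above can be modelled in a few lines. This is a minimal sketch under stated assumptions: `steps_for_env`, `uncovered`, the step lists, and the abbreviated `SAMPLE_INSTALLED` list are hypothetical names made up for illustration, and the `== "1"` check is an assumed convention. Only the environment variable name and the app names come from this plan; none of this is the actual content of `run-gate.sh`.

```python
# Hypothetical model of the gate's tier selection as described in this section.
# It is NOT the real run-gate.sh logic.
CORE_APPS = ["bitcoin-knots", "btcpay", "electrumx", "lnd",
             "mempool", "immich", "fedimint", "filebrowser"]

# Abbreviated, illustrative sample of installed apps (the real set is ~40).
SAMPLE_INSTALLED = CORE_APPS + ["grafana", "jellyfin", "vaultwarden", "penpot"]

DESTRUCTIVE_STEPS = ["stop", "start", "restart", "survive"]
CASCADE_STEPS = ["uninstall", "reinstall"]


def steps_for_env(env):
    """Return the lifecycle steps that would run for a given environment."""
    steps = list(DESTRUCTIVE_STEPS)
    # Assumption: CASCADE runs only when the variable is explicitly "1".
    # The canonical gate never sets it, so these steps are skipped by default.
    if env.get("ARCHY_ALLOW_CASCADE_DESTRUCTIVE") == "1":
        steps += CASCADE_STEPS
    return steps


def uncovered(installed, covered):
    """Apps that are installed but never exercised by the gate."""
    return sorted(set(installed) - set(covered))


if __name__ == "__main__":
    print(steps_for_env({}))                       # default gate environment
    print(uncovered(SAMPLE_INSTALLED, CORE_APPS))  # apps with no coverage
```

With an empty environment the sketch yields only the four DESTRUCTIVE steps, which is why a 5×-green run says nothing about uninstall or reinstall, and every app outside the core list shows up as uncovered.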

**Real bugs found in manual multinode testing on .198 (2026-06-23) — the motivating evidence:**

- **Uninstall is broken for immich + grafana:** takes very long, the progress bar sits at a **solid full-red with no real progression**, and the app **does not actually uninstall** — it still appears in **My Apps** afterward (ghost entry / state not cleared).
- **grafana reinstall just stops** partway (no completion, no clear error).
- **fedimint guardian** suddenly showed **"starting up — Guardian opens a wait page until Bitcoin finishes initial sync" / "starting"** on that node — verify this is correct wait-for-IBD behavior vs a stuck/false state (it's a backend that depends on bitcoin sync).

**✅ 2026-06-26 — root cause of the immich/grafana uninstall trio FOUND + FIXED (`71cc9ac4`).** Single cause: `quadlet::disable_remove()` (first op in uninstall teardown, via companion + orchestrator) ran `systemctl --user stop` / `daemon-reload` / `podman rm -f` with **no timeout**. On rootless podman a generated unit can wedge "deactivating" while podman hangs → `systemctl stop` blocks forever → the spawned uninstall task returns neither Ok nor Err, so (a) `set_uninstall_stage` never fires → **frozen full-red bar**, (b) `remove_package_state_entry` never runs → **ghost stuck in `Removing`**, (c) the install guard rejects reinstall (`already Removing`). The spawn wrapper already reverts state on Err/removes on Ok — only a *hang* stranded it. Fix bounds all three calls (stop→`QUADLET_STOP_TIMEOUT` + SIGKILL/reset-failed escalation; daemon-reload→30s; podman rm→timeout). **Validated live: `cascade-uninstall.bats` 7/7 on .228** (binary `ae349a75`) — grafana install → uninstall (no ghost, data dir gone) → reinstall → running → cleanup. NOTE: proves the happy path + no-regression; the original hang was load/timing-induced and not separately reproduced.

**Workstream F scope — the gate must grow to (in priority order):**

1. **CASCADE tier in the canonical gate:** uninstall → verify the app is GONE from My Apps / `container-list` / package state (no ghost), data preserved per policy, then reinstall → verify it returns healthy. Catch the immich/grafana ghost + reinstall-stops bugs.
   *(✅ DONE `b7d92107`: `run-gate.sh` now runs ONE cascade pass after the 5× loop when `ARCHY_GATE_CASCADE=1` (+`ARCHY_ALLOW_DESTRUCTIVE=1`), counted into the tally — opt-in so default behavior is unchanged, and deliberately NOT folded into all 5 iterations. `cascade-uninstall.bats` 7/7 on .228. Next: extend cascade coverage beyond the single throwaway app to the multi-container stacks, e.g. an immich/btcpay cascade variant.)*
2. **Progress-UI assertions:** install AND uninstall must report monotonic, truthful progress (not a stuck full-red bar); a long op must surface a real stage/percentage and a terminal success/failure — no silent hang. (Likely both a backend progress-event fix AND a UI fix.)
   *(✅ 2026-06-26 `9f17ba68`: the "stuck full-red bar" was `AppCard.vue` hardcoding the uninstall bar to `w-full bg-red-400/60 animate-pulse` — solid, full, red, fake-pulse. Now derives a real percentage from the backend's existing `uninstall-stage` label ("Stopping containers (X/N)"→10–50%, "Cleaning up volumes"→70%, "Removing app data"→90%) and renders like install (neutral fill, real width+%, shimmer). FE built `index-DtZyZomC.js`, rolled to .228/.116/.198/.89 (+.88/.5/.120). STILL TODO: a bats/UI assertion that the bar is monotonic + lands on a terminal state; possibly a backend numeric-progress field so the UI doesn't parse stage strings.)*
3. **ALL-apps coverage:** a generic per-app lifecycle matrix (install / UI-reach / stop / start / restart / uninstall / reinstall / reboot-survive) driven by the manifest set, so grafana and the ~30 uncovered apps are gated too — not just the 8 core. Manifest-driven, so new apps are covered automatically.
   *(✅ 2026-06-26 `43934eef`: `bats/all-apps-lifecycle.bats` — DESTRUCTIVE counterpart to the read-only `all-apps-matrix.bats`. Discovers the app set from My Apps ∩ the node `catalog.json`; drives stop/start/restart for every app and, under `ARCHY_ALLOW_CASCADE_DESTRUCTIVE`, a FULL teardown (uninstall→no-ghost→reinstall) with the catalog `{dockerImage, containerConfig}` as the reinstall spec. PROTECTED (never touched): bitcoin*/electrum* (resync cost) + lnd/btcpay*/fedimint* (irreversible wallet loss — user asked to protect only bitcoin+electrum; wallet apps added for safety, override via `ARCHY_MATRIX_PROTECT`). Validated on .228 (discovery + 1-app lifecycle green). HEAVY/destructive → a supervised pass on LAN nodes (.116/.198/.228), NOT folded into run-gate. Invoke: `ARCHY_ALLOW_DESTRUCTIVE=1 ARCHY_ALLOW_CASCADE_DESTRUCTIVE=1 ARCHY_PASSWORD=… ARCHY_SCHEME=https bats bats/all-apps-lifecycle.bats`.)*

   **✅ FIRST FULL DESTRUCTIVE RUN on .228 (2026-06-26):** lifecycle **11/11 clean**; teardown **8/11** (immich 3-container stack incl.) — and it surfaced **3 real reinstall bugs** (the payoff):

   1. **fresh-install bind-dir ownership = root:root** → EACCES on reinstall (jellyfin `/config` denied exit 139; netbird-server can't open its SQLite store). Fix B's chown-to-parent only runs on the reconcile path, **not** `package.install`. The important orchestrator fix.
   2. **netbird reinstall adopts leftover containers → skips the manifest cert/file render** (tls.crt/key/nginx.conf never written → proxy can't start → app reads absent). Only a fully clean reinstall renders them.
   3. **portainer image pin `lfg2025/portainer:2.19.4` is `manifest unknown`** (never pushed to the registry) and the pin OVERRIDES the RPC dockerImage → portainer is un(re)installable fleet-wide. Registry/catalog data bug (push the image or change the pin).

   .228 restored (jellyfin+netbird via manual chown / clean reinstall; all installed apps running, 28 ctrs; portainer left uninstalled, since it cannot be reinstalled until #3 is fixed). TODO: fix #1 (extend chown to install path) + #2 + #3; add reboot-survive + UI-reach per app to the matrix.
4. **Guardian/IBD-dependent states:** assert that "waiting for bitcoin sync"-style states are a legitimate, surfaced wait (with a path to ready) and never a permanent stuck state.

**Definition of done for F:** the expanded gate (CASCADE + progress + all-apps) is 5×-green on .228, then re-verified across the multinode fleet — i.e. an *insanely-perfect* OS/container environment where every app installs, runs, updates, uninstalls, and reinstalls cleanly with honest progress, no ghosts, no data loss, reboot-survivable.

## 7. Release blockers & operational gotchas (durable)

Carried forward from prior handoffs (deduped against persistent memory):

- **Rootless control-plane responsiveness** — slow `podman ps`/store cleanup at startup must not surface a false "no apps installed" UI. **My Apps must preserve last-known apps during scanner backoff**, never show empty during a transient.
- **Reboot survival** — gate on ≥3 (prefer 5) consecutive clean post-reboot lifecycle passes. Quadlet units under `user.slice` survive `archipelago.service` restart; legacy in-cgroup containers get SIGKILLed and reconciled back.
- **Startup patterns** — wait on a socket/health, never `sleep`. Tailscale waits for its socket; Fedimint Guardian waits for Bitcoin RPC `initialblockdownload:false` before launching fedimintd (proxy/wait companion on :8175 during IBD).
- **Bitcoin must run full** (`txindex=1`, non-pruned) for ElectrumX/mempool.
- **Adoption** — match existing containers by name and adopt without recreate; record a migration version in app state; preserve Nostr signer bridges (IndeeHub needs `/nostr-provider.js` served, not just port reachability).
- **Image presence** — use bounded targeted `podman image inspect`, not `podman image exists` (avoids store-walk stalls).
- **Companion rebuilds** — `companion.rs` must rebuild `:latest` when the build context changes (staleness check), else baked-in fixes (e.g. guardian CSS) never reach nodes. `:local` is a manual override, never auto-rebuilt.

## 8. Roadmap

**Pipeline:** Feature Testing (internal) → User Testing (controlled hardware) → Beta Live (public).

Hardening priorities feeding the gate:

- **P0** Container app reliability — bulletproof install/health/restart/uninstall across all apps, dependency chains, multi-container stacks.
- **P0** Networking stack first-install → reboot-proof (WireGuard/NetBird, Tor hidden services, LND Connect).
- **P1** LUKS2 full-partition encryption for `/var/lib/archipelago/` (AES-256-XTS, Argon2id, key from setup password + hardware salt).
- **P1** Meshtastic plug-and-play parity with MeshCore.
- **P1 ✅ CODE-COMPLETE** (branch `companion-mobile-ux`, 2026-06-23; needs on-device + mobile-web verification before merge to `main`) — Mobile app-launch UX — drop the "this app opens in a tab" interstitial. Two surfaces (both: no interstitial screen, launch the app directly):
  - **Companion app (Android):** open **every** app in the **in-app WebView** (not just non-iframeable ones) — *and* carry the current mobile-iframe footer controls into the WebView (back/forward/reload/close — good, useful UX).
  - **Mobile web browser (PWA):** open tab-apps directly in a **new browser tab**.

  Touch points: `neode-ui/src/stores/appLauncher.ts`, `AppLauncherOverlay.vue`, the Android in-app WebView bridge, and the mesh-mobile iframe footer controls. (Reference prior work: `b5a9deb8` in-app webview for non-iframeable apps, `d1fbcd9b` "open in browser" via native bridge.)
  - **✅ Done (branch `companion-mobile-ux`):** mobile launches now use the store-driven panel (no route push) so the background tab no longer changes and closing returns you where you launched; tab-only apps open directly (in-app WebView on companion via `openInApp`, new browser tab on PWA) with **no interstitial**; the Android `InAppBrowser` (`WebViewScreen.kt`) gained a bottom footer bar (back/forward/reload/open-in-browser/close) + a centered loading screen (favicon + progress); a shared `AppLoadingScreen` (icon + progress) replaced the black/spinner loaders on the app session **and** legacy iframe overlay; the dashboard is pinned to `100dvh` on mobile so the mesh chat/tools panes stop sliding under the tab bar in mobile browsers (no-op in companion); ElectrumX shows its real icon in My Apps. Companion APK bumped to **v0.4.7** (versionCode 11) with a committed shared debug keystore so updates install without an uninstall. **Not yet:** merge to `main`; publish the 0.4.7 companion download (deferred until the gate work lands so they ship together).

**Post-beta (deferred — do not start until gate is green):** P2P encrypted voice/video (WebRTC over federation via Tor); watch-only wallet + mesh BTC hardening; paid swarm streaming + IndeeHub source (`phase4-streaming-ecash-plan.md`); Meshroller Rust-native mesh AI (`meshroller-integration-design.md`); dual-ecash phases 2–6 (`dual-ecash-design.md`).

## 8b. SESSION STATE + RESUME (updated 2026-06-26) — READ §8b "CURRENT STATE + RESUME" FIRST

### ▶ SESSION i (2026-06-30) — CURRENT HANDOFF / 1.8.0 OTA RESUME

**Branch/worktree:** currently on `bitcoin-version-bulletproof`, not `main`. Worktree is dirty. Do **not** discard mesh changes: they include E2E/transport indicator plumbing and the Meshtastic receive-path fixes below. Separate recovery note: `docs/SESSION-1.8.0-OTA-PROGRESS.md`.

**What was done this session:**

1. ✅ **Local Rust release gate fixed and green.** `cargo test -p archipelago --bin archipelago` is green: **849/849** after fixing stale tests and the invalid `fedimint-clientd` manifest (`cpu_limit` was `0.25`, invalid for the current schema; now integer). `cargo check -p archipelago` also green after mesh edits.
2. ✅ **Catalog/release static gates green.** `python3 scripts/check-app-catalog-drift.py --release --strict` is green. `scripts/check-release-manifest.sh` is green for the currently staged `1.7.99-alpha` manifest/artifacts. `npm run build` and `npm run type-check` are green.
3. ✅ **Frontend unit gate fixed.** `npx vitest run --silent` now green: **81 files / 668 tests**. Fixes were test-only: add `router.onError` to the login test router mock and update the `AppIconGrid` mobile unresolved-new-tab expectation to match current app-launcher behavior.
4. ✅ **Workstream F harness gap closed.** `tests/lifecycle/bats/cascade-uninstall.bats` now asserts uninstall progress truthfulness via backend `uninstall-stage`: stage must be parseable, monotonic, below 100 before terminal absence, and present before the app disappears. Non-destructive skip-mode parse check is green: `ARCHY_PASSWORD=dummy bats tests/lifecycle/bats/cascade-uninstall.bats` → 7 skip-ok.
5. ✅ **3ccc → .116 Meshtastic receive bug taken over and partially live-validated.** Context: `3ccc` is the stock/non-Archy Meshtastic peer. The bug was LoRa text from `3ccc` not surfacing in `.116` `mesh.messages`. Root causes/fixes:
   - The prior attempted fix dropped any packet older than 10 minutes by `rx_time`; live `.116` logs showed `FromRadio.packet` from `!433e3ccc` being dropped as stale (`rx_time` about an hour old). The window is now **24h**, so recent radio FIFO/store-forward backlog surfaces instead of vanishing.
   - Radios with unset clocks can report tiny nonzero epoch values; those are now treated as unknown, not stale.
   - Serial prevalidation was rejecting valid `FromRadio.queueStatus` frames (`field 11`, live bytes like `5a04100e1810`) as corrupt payloads; field 11 and other modern non-message `FromRadio` variants are now accepted/ignored instead of poisoning the stream.
   - Focused Meshtastic tests green: **8/8**, including `packet_to_inbound_frame_accepts_recent_meshtastic_backlog` and `packet_to_inbound_frame_accepts_stock_peer_with_unset_clock`.
   - Deployed patched binary to **.116**: sha256 `028ec6ff9a60ca8970c081987457d78ed1c517cd81f7089f51b9a01745b5c3c4` at `/usr/local/bin/archipelago`. Service active. Post-deploy checked window showed `FromRadio field=11` accepted and no new `Dropping stale ... !433e3ccc` entries.
   - There are stale other-agent `RXDIAG` shell watcher processes on `.116`; leave them unless they actively interfere.
6. ✅ **Phase-3 Quadlet read-only check on .116 skip-clean.** Copied lifecycle tests to `.116` and ran `bats bats/use-quadlet-backends-install.bats`: **6/6 skip-clean** because no backend `.container` units exist. This confirms `use_quadlet_backends` is not active on `.116`; Phase-3 remains a rollout gate.

**Commands/results worth trusting:**

- `cargo test -p archipelago --bin archipelago` → 849/849 green.
- `npx vitest run --silent` from `neode-ui/` → 81 files / 668 tests green.
- `npm run build` from `neode-ui/` → green, bundle `index-CYaDgfX3.js`.
- `python3 scripts/check-app-catalog-drift.py --release --strict` → green.
- `scripts/check-release-manifest.sh` → green for **v1.7.99-alpha** staged artifacts.
- `tests/release/run.sh --manifest` was rerun after `cargo fmt`; it previously reached frontend tests, which are now fixed. Re-run it from scratch as the next static gate.

**Remaining blockers / decisions before 1.8.0 OTA:**

1. **Release version metadata is not 1.8.0 yet.** `releases/manifest.json`, Cargo, and npm still say `1.7.99-alpha`; `CHANGELOG.md` top says `v1.8.00-alpha` (note double zero).
   Do not silently publish until the release version naming is decided (`1.8.0-alpha` vs `1.8.00-alpha` vs `1.8.0`).
2. **Workstream B signing is blocked on the offline release-root mnemonic.** `docs/workstream-b-signing-runbook.md` says catalog distribution/embedded manifests are live, but authenticity requires the publisher to pin `RELEASE_ROOT_PUBKEY_HEX` and sign `releases/app-catalog.json` with `RELEASE_MASTER_MNEMONIC`. This cannot be automated by an agent without the offline mnemonic.
3. **Phase-3 `use_quadlet_backends` is implemented but default-off.** Completing this requires explicit node/fleet flag rollout plus backend reinstall/migration verification. `.116` currently skip-clean only.
4. **Bitcoin multi-version coordinated rollout is still separately owned/blocked by its runbook.** See `docs/bitcoin-version-bulletproof-rollout.md`; do not repoint `bitcoin-knots:latest` before fixed binary is fleet-wide.
5. **True RF validation of 3ccc requires either a live 3ccc send or waiting for another FIFO/backlog packet.** Parser/unit coverage and `.116` logs strongly validate the drop-path fix, but no human was available to send a fresh 3ccc message during this session.

**Immediate next steps for the next agent:**

1. Run `tests/release/run.sh --manifest` from repo root again; frontend unit failures are fixed, so expect it to pass or continue from the next failing stage.
2. If `.116` is still the canary, monitor logs after any 3ccc activity: `journalctl -u archipelago --since "