From 79bbcca964ff028dd1bd49aee62ac1746ef28921 Mon Sep 17 00:00:00 2001
From: archipelago
Date: Wed, 1 Jul 2026 12:29:26 -0400
Subject: [PATCH] docs: consolidate OTA 1.8.0 + master-plan open items into one priority-ordered tracker
MIME-Version: 1.0
Content-Type: text/plain; charset=UTF-8
Content-Transfer-Encoding: 8bit

docs/UNIFIED-TASK-TRACKER.md replaces hunting across SESSION-1.8.0-OTA-PROGRESS.md
and PRODUCTION-MASTER-PLAN.md for "what's left" — fastest/simplest tasks first.
Verified against live code/nodes rather than trusting doc text: several
previously "open" items (bind-dir chown, netbird legacy installer, launch-port
fallback, archival-bitcoin manifest field, progress-UI monotonicity, all-apps
coverage, fedimint test coverage, changelog backfill, portainer image pin,
grafana quadlet activation) turned out already shipped or non-issues, and are
closed out here.

TESTING.md's release-gate checklist updated to match reality (cargo warnings,
5x gate, changelog already green; multinode/backend-default-flip/tag genuinely
open).

Co-Authored-By: Claude Sonnet 5
---
 CLAUDE.md                          |   5 ++
 docs/PRODUCTION-MASTER-PLAN.md     |   5 ++
 docs/SESSION-1.8.0-OTA-PROGRESS.md |   5 ++
 docs/UNIFIED-TASK-TRACKER.md       | 108 +++++++++++++++++++++++++++++
 tests/lifecycle/TESTING.md         |  27 +++++---
 5 files changed, 141 insertions(+), 9 deletions(-)
 create mode 100644 docs/UNIFIED-TASK-TRACKER.md

diff --git a/CLAUDE.md b/CLAUDE.md
index 78b11433..22ec728d 100644
--- a/CLAUDE.md
+++ b/CLAUDE.md
@@ -6,6 +6,11 @@
 criterion is met and the priority banner is demoted. Next exit-criteria: the
 **multinode pass** (`docs/multinode-testing-plan.md`) and workstreams B/C/D.
 
+**For day-to-day work, use `docs/UNIFIED-TASK-TRACKER.md`** — the consolidated,
+priority-ordered "what's left" list across the 1.8.0 OTA and master-plan docs
+(fastest/simplest tasks first). It supersedes hunting through the two source docs
+below for open items; those remain the narrative/history.
+
 **Read `docs/PRODUCTION-MASTER-PLAN.md` first** — it is still the authoritative
 plan for the north star: a world-class, **developer-ready app platform** where every
 app is manifest-driven, manifests ship via the **signed registry** (not OTA disk files),
diff --git a/docs/PRODUCTION-MASTER-PLAN.md b/docs/PRODUCTION-MASTER-PLAN.md
index 20592d68..e894570d 100644
--- a/docs/PRODUCTION-MASTER-PLAN.md
+++ b/docs/PRODUCTION-MASTER-PLAN.md
@@ -1,5 +1,10 @@
 # PRODUCTION MASTER PLAN — Archipelago App Platform & Registry
 
+> **📋 Live day-to-day task tracker: `docs/UNIFIED-TASK-TRACKER.md`.** This doc remains
+> the authoritative north-star narrative and detailed workstream history, but for
+> "what's left, in priority order" work off the unified tracker instead of hunting
+> through §6/§8b here.
+>
 > **✅ SINGLE-NODE PRODUCTION GATE IS GREEN (2026-06-23): `run-gate.sh` 5/5 on .228, 0 failures.**
 > This remains the authoritative plan for the broader north star (manifest-driven
 > platform, registry-distributed manifests, external marketplace), but it is no
diff --git a/docs/SESSION-1.8.0-OTA-PROGRESS.md b/docs/SESSION-1.8.0-OTA-PROGRESS.md
index aa76288d..50cc108d 100644
--- a/docs/SESSION-1.8.0-OTA-PROGRESS.md
+++ b/docs/SESSION-1.8.0-OTA-PROGRESS.md
@@ -2,6 +2,11 @@
 
 Updated: 2026-06-30
 
+> **📋 Live day-to-day task tracker: `docs/UNIFIED-TASK-TRACKER.md`.** This doc is kept
+> as the historical session-by-session log; open items were consolidated into the
+> unified tracker on 2026-07-01 (several turned out already shipped — see that doc for
+> current status instead of re-deriving it from the log below).
+
 ---
 
 ## ▶️▶️▶️▶️ LIVE CHECKPOINT 2026-06-30 (evening) — #17 deployed + verified on .198/.228
diff --git a/docs/UNIFIED-TASK-TRACKER.md b/docs/UNIFIED-TASK-TRACKER.md
new file mode 100644
index 00000000..66c94106
--- /dev/null
+++ b/docs/UNIFIED-TASK-TRACKER.md
@@ -0,0 +1,108 @@
+# Unified Task Tracker — OTA 1.8.0 + Master Plan
+
+Single working list for everything left before 1.8.0 ships and the next master-plan
+exit criteria (multinode + workstreams B/C/D) are met. Supersedes the open-task
+sections of `docs/SESSION-1.8.0-OTA-PROGRESS.md` and `docs/PRODUCTION-MASTER-PLAN.md`
+as the day-to-day tracker — those docs remain the historical record / detailed
+narrative and are still linked from here where useful. **Ordered fastest/simplest
+first** so we work top-down instead of hunting across docs.
+
+Verified against actual code state on 2026-07-01 (not just doc text — several
+items the source docs still listed as "open" turned out to already be shipped;
+those are marked ✅ below with the commit that did it, so we stop re-litigating them).
+
+---
+
+## Tier 0 — Quick / mechanical, no blockers
+
+- [ ] **Update `tests/lifecycle/TESTING.md`'s stale Release Gates checklist** (lines
+  289–296) — several boxes are unchecked but actually true now:
+  - #1 bitcoin-stops: covered by `tests/lifecycle/bats/bitcoin-knots.bats` stop/restart
+    tier, included in the 5/5 green gate run.
+  - #2 `ARCHY_ITERATIONS=5` on .228: **GREEN 2026-06-23 per CLAUDE.md** — check the box.
+  - #5 cargo 0 warnings: confirmed 0 warnings on `cargo build --release` (2026-07-01).
+  - #7 layman changelog: `CHANGELOG.md` is backfilled with layman-readable entries
+    through v1.8.00-alpha — check the box.
+  - Leave #3 (multinode), #4 (backend-survives-restart / Phase-3 default-on), #6
+    (LoC decision), #8 (tag pushed) unchecked — genuinely still open, see Tier 2/3.
+- [x] ~~Finish the archival/full-node manifest generalization~~ — investigated 2026-07-01:
+  the hardcoded fallback names in `dependencies.rs:48-52` (`electrs`, `mempool-electrs`,
+  `mempool-web`) are legacy **alias** ids for `electrumx`/`mempool`, resolved via
+  id-mapping in a dozen other places (`install.rs`, `runtime.rs`, `config.rs`, etc.),
+  not separate un-migrated apps with their own manifests. `electrumx` and `mempool`
+  themselves already declare `bitcoin:archival`. The fallback is correct as-is —
+  not tech debt, closing this item rather than risk breaking alias resolution.
+- [x] ~~Confirm/close the Portainer image-pin item~~ — confirmed 2026-07-01:
+  `146.59.87.168:3000/lfg2025/portainer:2.19.4` is present in `podman images` on
+  all 3 LAN nodes (.116/.198/.228), i.e. actually resolvable/pulled from the mirror.
+  Not a live bug.
+- [x] ~~grafana Quadlet "stuck activating"~~ — checked live on .116 (2026-07-01):
+  `grafana.service` is `active (running)`, container `Up 2 hours (healthy)`. The
+  2026-06-21 report is stale for grafana. **strfry still unconfirmed** — not
+  installed on any of .116/.198/.228 to check directly; low priority until someone
+  actually needs it installed.
+
+## Tier 1 — Medium effort, unblocked
+
+- [ ] **immich → Quadlet migration** — last legacy in-cgroup app; manifest still
+  declares plain `immich-postgres`/`immich-redis` containers instead of Quadlet
+  units (`apps/immich/manifest.yml`). Blocks deleting the legacy `create_container`
+  path (see Phase 3.5 in master plan).
+- [ ] **Netbird reinstall adoption path** — verify whether the adopted-container path
+  actually skips cert/file rendering on reinstall (unconfirmed either way;
+  `wait_for_adopted_container` in `install.rs:326` exists but no dedicated
+  cert-render check was found). Reproduce on .228 (has netbird installed) before fixing.
+- [ ] **TanStack Query (or equivalent) investigation** — still just a backlog idea, no
+  code. Time-box a spike: is neode-ui's current fetch/store pattern actually
+  causing real bugs, or is this nice-to-have? Decide before spending real time.
+
+## Tier 2 — High effort, unblocked (the actual next exit criteria)
+
+- [ ] **Multinode test pass** (`docs/multinode-testing-plan.md`) — per-node
+  preconditions on .198 + rest of fleet, then 5× on-node gate per node, then
+  cross-node federation/mesh/transport suites. This is the literal "next exit
+  criterion" called out in `CLAUDE.md`.
+- [ ] **Phase-3 Quadlet default-flip** — code is validated + opt-in via
+  `ARCHIPELAGO_USE_QUADLET_BACKENDS=true` on .228/.198 already (confirmed live
+  2026-07-01). Flip needs: re-test on a healthy idle legacy node, then flip the
+  default, then multinode gate re-run.
+- [ ] **Per-app test coverage for the ~30 apps with zero automated coverage** —
+  framework exists (bats + reusable helpers), just needs per-app suites written.
+- [ ] **Convert remaining multi-container legacy stacks to the manifest-owned model**
+  (workstream A tail) — netbird's legacy installer is already deleted (`89d397bb`);
+  immich (see Tier 1) and any other multi-container stacks are what's left.
+- [ ] **Developer tooling CLI suite** (validate/render/local-install/lifecycle-test) —
+  APP-PACKAGING-MIGRATION-PLAN.md step 5, needed before external devs can publish.
+
+## Tier 3 — Blocked on a decision or resource only you can supply
+
+- [ ] **Version naming decision (1.7.99-alpha → 1.8.0 vs 1.8.00-alpha)** — code is
+  otherwise ready to tag; this is a one-line decision, then a mechanical bump +
+  tag + push. **Needs your call**, not more engineering.
+- [ ] **Workstream B signing ceremony** — `core/archipelago/src/trust/anchor.rs:21`
+  still has `RELEASE_ROOT_PUBKEY_HEX = None`. Needs the offline
+  `RELEASE_MASTER_MNEMONIC` to run `docs/workstream-b-signing-runbook.md`'s
+  4-step ceremony — can't be automated by me.
+- [ ] **Bitcoin multi-version fleet-wide OTA** — `.228` fully working on branch,
+  per your prior gating this rollout is explicitly held for your decision on
+  timing (`docs/bitcoin-version-bulletproof-rollout.md`).
+- [ ] **3ccc stock-Meshtastic RF validation** — needs a live send/receive test with
+  physical radios in your hands; code fix is in place, just unverified live.
+
+## Backlog — deferred, no scope decided, low priority
+
+- [ ] **Marketplace protocol (workstream C)** — design-only (`docs/marketplace-protocol.md`),
+  no tooling/trust UX built. Future work, not urgent.
+- [ ] **DHT distribution (workstream D)** — confirmed design-only, no code
+  (`docs/dht-distribution-design.md` explicitly says "Status: Design (no code yet)");
+  an experimental iroh provider skeleton exists behind a feature flag for future
+  PoC measurement, nothing fleet-facing.
+- [ ] **Custom live voice-call protocol** — deprioritized 2026-07-01 per user request;
+  scope not yet decided. Revisit after the tiers above are worked down.
+
+---
+
+*Historical narrative and detailed per-session logs remain in
+`docs/SESSION-1.8.0-OTA-PROGRESS.md` and `docs/PRODUCTION-MASTER-PLAN.md` §6/§8b —
+this doc is the live "what's left, in priority order" list. Update it (don't just
+append to the old docs) as items close or new ones surface.*
diff --git a/tests/lifecycle/TESTING.md b/tests/lifecycle/TESTING.md
index 488f4bd6..166753c3 100644
--- a/tests/lifecycle/TESTING.md
+++ b/tests/lifecycle/TESTING.md
@@ -284,16 +284,25 @@ We don't have a performance harness yet. Add as L6 lands:
 
 ## Release gates
 
-v1.7.52 ships only when ALL of:
+1.8.0 ships only when ALL of (see `docs/UNIFIED-TASK-TRACKER.md` for the live
+priority-ordered list of what's still open across these):
 
-1. ☐ Bitcoin-stops fix verified live on a fresh node (tests/lifecycle/bats/bitcoin-knots.bats fully ● after a cold install)
-2. ☐ `ARCHY_ITERATIONS=5 tests/lifecycle/run-gate.sh` returns 0 **run ON .228** (5× for now; full suite, ARCHY_ALLOW_DESTRUCTIVE=1) — 1× is GREEN (110/110), 5× in progress
-3. ☐ Multinode/fleet (.198 + others) — tracked separately in `docs/multinode-testing-plan.md`, NOT a v1.7.52 single-node gate item
-4. ☐ The L3 `backend-survives-archipelago-restart` suite passes (= Phase 3 Quadlet shipped for backends)
-5. ☐ Cargo: 0 warnings, 0 unused, all tests green (sustained ✓ since 1c0df95f)
-6. ☐ LoC: at least one of {Phase 3 Quadlet, dev_mode resolution} merged
-7. ☐ Layman-readable changelog (per `feedback_changelog_layman.md`)
-8. ☐ Tag pushed to origin + gitea-local + gitea-vps2 (per `feedback_ship_ritual.md`)
+1. ☑ Bitcoin-stops fix verified live on a fresh node (`tests/lifecycle/bats/bitcoin-knots.bats`
+   stop/restart tier, part of the green single-node gate)
+2. ☑ `ARCHY_ITERATIONS=5 tests/lifecycle/run-gate.sh` returns 0 **run ON .228** — GREEN 2026-06-23, 5/5, 0 failures
+3. ☐ Multinode/fleet (.198 + others) — tracked separately in `docs/multinode-testing-plan.md`,
+   the actual next exit criterion, NOT satisfied yet
+4. ☐ The L3 `backend-survives-archipelago-restart` suite passes fleet-wide default-on
+   (Phase 3 Quadlet is merged + validated but still opt-in via `ARCHIPELAGO_USE_QUADLET_BACKENDS`
+   on .228/.198 only — not the default)
+5. ☑ Cargo: 0 warnings, 0 unused (confirmed 2026-07-01 release build); full test suite green
+   per last confirmed run
+6. ☑ LoC: Phase 3 Quadlet merged (opt-in) — satisfies the "at least one of" bar; default-flip
+   itself is tracked as its own item in the unified tracker
+7. ☑ Layman-readable changelog — `CHANGELOG.md` backfilled through v1.8.00-alpha
+   (per `feedback_changelog_layman.md`)
+8. ☐ Tag pushed to origin + gitea-local + gitea-vps2 (per `feedback_ship_ritual.md`) —
+   blocked on the version-naming decision, see unified tracker Tier 3
 
 ## How to update this document