docs: consolidate OTA 1.8.0 + master-plan open items into one priority-ordered tracker

docs/UNIFIED-TASK-TRACKER.md replaces hunting across SESSION-1.8.0-OTA-PROGRESS.md
and PRODUCTION-MASTER-PLAN.md for "what's left" — fastest/simplest tasks first.
Verified against live code/nodes rather than trusting doc text: several previously
"open" items (bind-dir chown, netbird legacy installer, launch-port fallback,
archival-bitcoin manifest field, progress-UI monotonicity, all-apps coverage,
fedimint test coverage, changelog backfill, portainer image pin, grafana quadlet
activation) turned out already shipped or non-issues, and are closed out here.
TESTING.md's release-gate checklist updated to match reality (cargo warnings,
5x gate, changelog already green; multinode/backend-default-flip/tag genuinely open).

Co-Authored-By: Claude Sonnet 5 <noreply@anthropic.com>
archipelago 2026-07-01 12:29:26 -04:00
parent 177b8a4338
commit 79bbcca964
5 changed files with 141 additions and 9 deletions

@ -6,6 +6,11 @@
criterion is met and the priority banner is demoted. Next exit-criteria: the
**multinode pass** (`docs/multinode-testing-plan.md`) and workstreams B/C/D.
**For day-to-day work, use `docs/UNIFIED-TASK-TRACKER.md`** — the consolidated,
priority-ordered "what's left" list across the 1.8.0 OTA and master-plan docs
(fastest/simplest tasks first). It supersedes hunting through the two source docs
below for open items; those remain the narrative/history.
**Read `docs/PRODUCTION-MASTER-PLAN.md` first** — it is still the authoritative plan
for the north star: a world-class, **developer-ready app platform** where every app
is manifest-driven, manifests ship via the **signed registry** (not OTA disk files),

@ -1,5 +1,10 @@
# PRODUCTION MASTER PLAN — Archipelago App Platform & Registry
> **📋 Live day-to-day task tracker: `docs/UNIFIED-TASK-TRACKER.md`.** This doc remains
> the authoritative north-star narrative and detailed workstream history, but for
> "what's left, in priority order" work off the unified tracker instead of hunting
> through §6/§8b here.
>
> **✅ SINGLE-NODE PRODUCTION GATE IS GREEN (2026-06-23): `run-gate.sh` 5/5 on .228, 0 failures.**
> This remains the authoritative plan for the broader north star (manifest-driven
> platform, registry-distributed manifests, external marketplace), but it is no

@ -2,6 +2,11 @@
Updated: 2026-06-30
> **📋 Live day-to-day task tracker: `docs/UNIFIED-TASK-TRACKER.md`.** This doc is kept
> as the historical session-by-session log; open items were consolidated into the
> unified tracker on 2026-07-01 (several turned out already shipped — see that doc for
> current status instead of re-deriving it from the log below).
---
## ▶️▶️▶️▶️ LIVE CHECKPOINT 2026-06-30 (evening) — #17 deployed + verified on .198/.228

@ -0,0 +1,108 @@
# Unified Task Tracker — OTA 1.8.0 + Master Plan
Single working list for everything left before 1.8.0 ships and the next master-plan
exit criteria (multinode + workstreams B/C/D) are met. Supersedes the open-task
sections of `docs/SESSION-1.8.0-OTA-PROGRESS.md` and `docs/PRODUCTION-MASTER-PLAN.md`
as the day-to-day tracker — those docs remain the historical record / detailed
narrative and are still linked from here where useful. **Ordered fastest/simplest
first** so we work top-down instead of hunting across docs.
Verified against actual code state on 2026-07-01 (not just doc text — several
items the source docs still listed as "open" turned out to already be shipped;
those are marked ✅ below with the commit that did it, so we stop re-litigating them).
---
## Tier 0 — Quick / mechanical, no blockers
- [ ] **Update `tests/lifecycle/TESTING.md`'s stale Release Gates checklist** (lines
289-296). Several boxes are unchecked but actually true now:
- #1 bitcoin-stops: covered by `tests/lifecycle/bats/bitcoin-knots.bats` stop/restart
tier, included in the 5/5 green gate run.
- #2 `ARCHY_ITERATIONS=5` on .228: **GREEN 2026-06-23 per CLAUDE.md** — check the box.
- #5 cargo 0 warnings: confirmed 0 warnings on `cargo build --release` (2026-07-01).
- #7 layman changelog: `CHANGELOG.md` is backfilled with layman-readable entries
through v1.8.00-alpha — check the box.
- Leave #3 (multinode), #4 (backend-survives-restart / Phase-3 default-on), #6
(LoC decision), #8 (tag pushed) unchecked — genuinely still open, see Tier 2/3.
- [x] ~~Finish the archival/full-node manifest generalization~~ — investigated 2026-07-01:
the hardcoded fallback names in `dependencies.rs:48-52` (`electrs`, `mempool-electrs`,
`mempool-web`) are legacy **alias** ids for `electrumx`/`mempool`, resolved via
id-mapping in a dozen other places (`install.rs`, `runtime.rs`, `config.rs`, etc.),
not separate un-migrated apps with their own manifests. `electrumx` and `mempool`
themselves already declare `bitcoin:archival`. The fallback is correct as-is —
not tech debt, closing this item rather than risk breaking alias resolution.
- [x] ~~Confirm/close the Portainer image-pin item~~ — confirmed 2026-07-01:
`146.59.87.168:3000/lfg2025/portainer:2.19.4` is present in `podman images` on
all 3 LAN nodes (.116/.198/.228), i.e. actually resolvable/pulled from the mirror.
Not a live bug.
- [x] ~~grafana Quadlet "stuck activating"~~ — checked live on .116 (2026-07-01):
`grafana.service` is `active (running)`, container `Up 2 hours (healthy)`. The
2026-06-21 report is stale for grafana. **strfry still unconfirmed** — not
installed on any of .116/.198/.228 to check directly; low priority until someone
actually needs it installed.
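
The alias reasoning in the archival-manifest item above can be sketched roughly as follows. This is an illustrative sketch only, not the code in `dependencies.rs`: the function names `canonical_app_id` and `requires_archival_bitcoin` are hypothetical, and the pairing of each legacy alias to a canonical id is an assumption made for the example.

```rust
/// Hypothetical id-mapping: legacy alias id -> canonical app id.
/// ASSUMPTION: the specific pairings below are guesses for illustration,
/// not copied from the real id-mapping in the codebase.
fn canonical_app_id(id: &str) -> &str {
    match id {
        "electrs" | "mempool-electrs" => "electrumx",
        "mempool-web" => "mempool",
        // Anything that is not a known alias is already canonical.
        other => other,
    }
}

/// Hypothetical check: after alias resolution, does the app need an
/// archival bitcoin node? Per the tracker, `electrumx` and `mempool`
/// declare `bitcoin:archival` in their manifests.
fn requires_archival_bitcoin(id: &str) -> bool {
    matches!(canonical_app_id(id), "electrumx" | "mempool")
}
```

Under these assumptions, a lookup by a legacy alias and a lookup by the canonical id give the same answer, which is why removing the fallback names could break installs that still carry the old ids.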
## Tier 1 — Medium effort, unblocked
- [ ] **immich → Quadlet migration** — last legacy in-cgroup app; manifest still
declares plain `immich-postgres`/`immich-redis` containers instead of Quadlet
units (`apps/immich/manifest.yml`). Blocks deleting the legacy `create_container`
path (see Phase 3.5 in master plan).
- [ ] **Netbird reinstall adoption path** — verify whether the adopted-container path
actually skips cert/file rendering on reinstall (unconfirmed either way;
`wait_for_adopted_container` in `install.rs:326` exists but no dedicated
cert-render check was found). Reproduce on .228 (has netbird installed) before fixing.
- [ ] **TanStack Query (or equivalent) investigation** — still just a backlog idea, no
code. Time-box a spike: is neode-ui's current fetch/store pattern actually
causing real bugs, or is this nice-to-have? Decide before spending real time.
## Tier 2 — High effort, unblocked (the actual next exit criteria)
- [ ] **Multinode test pass** (`docs/multinode-testing-plan.md`) — per-node
preconditions on .198 + rest of fleet, then 5× on-node gate per node, then
cross-node federation/mesh/transport suites. This is the literal "next exit
criterion" called out in `CLAUDE.md`.
- [ ] **Phase-3 Quadlet default-flip** — code is validated + opt-in via
`ARCHIPELAGO_USE_QUADLET_BACKENDS=true` on .228/.198 already (confirmed live
2026-07-01). Flip needs: re-test on a healthy idle legacy node, then flip the
default, then multinode gate re-run.
- [ ] **Per-app test coverage for the ~30 apps with zero automated coverage**: the
framework exists (bats + reusable helpers); it just needs per-app suites written.
- [ ] **Convert remaining multi-container legacy stacks to the manifest-owned model**
(workstream A tail) — netbird's legacy installer is already deleted (`89d397bb`);
immich (see Tier 1) and any other multi-container stacks are what's left.
- [ ] **Developer tooling CLI suite** (validate/render/local-install/lifecycle-test) —
APP-PACKAGING-MIGRATION-PLAN.md step 5, needed before external devs can publish.
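
For the Phase-3 default-flip item above, a minimal sketch of what "opt-in today, default-on after the flip" could look like. Only the environment variable name `ARCHIPELAGO_USE_QUADLET_BACKENDS` comes from this tracker; the constant `QUADLET_BACKENDS_DEFAULT`, the helper names, and the accepted spellings (`true`/`1`, `false`/`0`) are assumptions, and the real parsing in the codebase may differ.

```rust
use std::env;

/// Hypothetical default. Flipping Phase 3 to default-on would mean
/// changing this one constant to `true`.
const QUADLET_BACKENDS_DEFAULT: bool = false;

/// Interpret an optional raw flag value, falling back to `default`
/// when the variable is unset or not a recognised spelling.
fn parse_flag(raw: Option<&str>, default: bool) -> bool {
    let normalized = raw.map(|v| v.trim().to_ascii_lowercase());
    match normalized.as_deref() {
        Some("true") | Some("1") => true,
        Some("false") | Some("0") => false,
        _ => default,
    }
}

/// Read the opt-in flag from the process environment.
fn use_quadlet_backends() -> bool {
    let raw = env::var("ARCHIPELAGO_USE_QUADLET_BACKENDS").ok();
    parse_flag(raw.as_deref(), QUADLET_BACKENDS_DEFAULT)
}
```

Keeping the default in one place means the flip itself is a one-line change, and nodes that set the variable explicitly (in either direction) keep their current behaviour through the flip.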
## Tier 3 — Blocked on a decision or resource only you can supply
- [ ] **Version naming decision (1.7.99-alpha → 1.8.0 vs 1.8.00-alpha)** — code is
otherwise ready to tag; this is a one-line decision, then a mechanical bump +
tag + push. **Needs your call**, not more engineering.
- [ ] **Workstream B signing ceremony**: `core/archipelago/src/trust/anchor.rs:21`
still has `RELEASE_ROOT_PUBKEY_HEX = None`. Needs the offline
`RELEASE_MASTER_MNEMONIC` to run `docs/workstream-b-signing-runbook.md`'s
4-step ceremony — can't be automated by me.
- [ ] **Bitcoin multi-version fleet-wide OTA**: `.228` is fully working on the branch;
per your prior gating, this rollout is explicitly held for your decision on
timing (`docs/bitcoin-version-bulletproof-rollout.md`).
- [ ] **3ccc stock-Meshtastic RF validation** — needs a live send/receive test with
physical radios in your hands; code fix is in place, just unverified live.
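
To make the signing-ceremony blocker above concrete, here is a hedged sketch of how an unset trust anchor could be told apart from a malformed or usable one. Only the constant name `RELEASE_ROOT_PUBKEY_HEX` and its current `None` value come from this tracker; the `Option<&str>` type, the `AnchorStatus` enum, the `anchor_status` helper, and the assumption of a 32-byte key written as 64 hex characters are all hypothetical.

```rust
/// Mirrors the constant named in the tracker.
/// ASSUMPTION: the `Option<&str>` type is a guess at the real declaration.
const RELEASE_ROOT_PUBKEY_HEX: Option<&str> = None;

/// Hypothetical classification of the anchor's state.
#[derive(Debug, PartialEq)]
enum AnchorStatus {
    /// No key pinned yet (the state before the ceremony has been run).
    Unset,
    /// A value is present but is not 64 hex characters.
    Malformed,
    /// A plausibly well-formed key is pinned.
    Ready,
}

/// Hypothetical helper. ASSUMPTION: a 32-byte public key, hex-encoded.
fn anchor_status(pubkey_hex: Option<&str>) -> AnchorStatus {
    match pubkey_hex {
        None => AnchorStatus::Unset,
        Some(h) if h.len() == 64 && h.chars().all(|c| c.is_ascii_hexdigit()) => {
            AnchorStatus::Ready
        }
        Some(_) => AnchorStatus::Malformed,
    }
}
```

A check of this shape would report `Unset` for the current value of the constant, which matches the tracker's point that nothing short of running the offline ceremony can close this item.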
## Backlog — deferred, no scope decided, low priority
- [ ] **Marketplace protocol (workstream C)** — design-only (`docs/marketplace-protocol.md`),
no tooling/trust UX built. Future work, not urgent.
- [ ] **DHT distribution (workstream D)** — confirmed design-only, no code
(`docs/dht-distribution-design.md` explicitly says "Status: Design (no code yet)");
an experimental iroh provider skeleton exists behind a feature flag for future
PoC measurement, nothing fleet-facing.
- [ ] **Custom live voice-call protocol** — deprioritized 2026-07-01 per user request;
scope not yet decided. Revisit after the tiers above are worked down.
---
*Historical narrative and detailed per-session logs remain in
`docs/SESSION-1.8.0-OTA-PROGRESS.md` and `docs/PRODUCTION-MASTER-PLAN.md` §6/§8b —
this doc is the live "what's left, in priority order" list. Update it (don't just
append to the old docs) as items close or new ones surface.*

@ -284,16 +284,25 @@ We don't have a performance harness yet. Add as L6 lands:
## Release gates
-v1.7.52 ships only when ALL of:
+1.8.0 ships only when ALL of (see `docs/UNIFIED-TASK-TRACKER.md` for the live
+priority-ordered list of what's still open across these):
-1. ☐ Bitcoin-stops fix verified live on a fresh node (tests/lifecycle/bats/bitcoin-knots.bats fully ● after a cold install)
-2. ☐ `ARCHY_ITERATIONS=5 tests/lifecycle/run-gate.sh` returns 0 **run ON .228** (5× for now; full suite, ARCHY_ALLOW_DESTRUCTIVE=1) — 1× is GREEN (110/110), 5× in progress
-3. ☐ Multinode/fleet (.198 + others) — tracked separately in `docs/multinode-testing-plan.md`, NOT a v1.7.52 single-node gate item
-4. ☐ The L3 `backend-survives-archipelago-restart` suite passes (= Phase 3 Quadlet shipped for backends)
-5. ☐ Cargo: 0 warnings, 0 unused, all tests green (sustained ✓ since 1c0df95f)
-6. ☐ LoC: at least one of {Phase 3 Quadlet, dev_mode resolution} merged
-7. ☐ Layman-readable changelog (per `feedback_changelog_layman.md`)
-8. ☐ Tag pushed to origin + gitea-local + gitea-vps2 (per `feedback_ship_ritual.md`)
+1. ☑ Bitcoin-stops fix verified live on a fresh node (`tests/lifecycle/bats/bitcoin-knots.bats`
+stop/restart tier, part of the green single-node gate)
+2. ☑ `ARCHY_ITERATIONS=5 tests/lifecycle/run-gate.sh` returns 0 **run ON .228** — GREEN 2026-06-23, 5/5, 0 failures
+3. ☐ Multinode/fleet (.198 + others) — tracked separately in `docs/multinode-testing-plan.md`,
+the actual next exit criterion, NOT satisfied yet
+4. ☐ The L3 `backend-survives-archipelago-restart` suite passes fleet-wide default-on
+(Phase 3 Quadlet is merged + validated but still opt-in via `ARCHIPELAGO_USE_QUADLET_BACKENDS`
+on .228/.198 only — not the default)
+5. ☑ Cargo: 0 warnings, 0 unused (confirmed 2026-07-01 release build); full test suite green
+per last confirmed run
+6. ☑ LoC: Phase 3 Quadlet merged (opt-in) — satisfies the "at least one of" bar; default-flip
+itself is tracked as its own item in the unified tracker
+7. ☑ Layman-readable changelog — `CHANGELOG.md` backfilled through v1.8.00-alpha
+(per `feedback_changelog_layman.md`)
+8. ☐ Tag pushed to origin + gitea-local + gitea-vps2 (per `feedback_ship_ritual.md`) —
+blocked on the version-naming decision, see unified tracker Tier 3
## How to update this document