docs: temporarily reduce release lifecycle gate from 20x to 5x

Per user direction: the production test gate is 5x (ARCHY_ITERATIONS=5) on
.228 AND .198 for now, down from 20x. Restore to 20x before the final ship.
Updated CLAUDE.md, PRODUCTION-MASTER-PLAN.md, and tests/lifecycle/TESTING.md.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
This commit is contained in:
archipelago 2026-06-21 17:11:00 -04:00
parent 9c45f718a2
commit 84031e6209
3 changed files with 16 additions and 13 deletions

View File

@ -42,5 +42,6 @@ Detailed sub-plans (all linked from the master):
## Production test gate (definition of done)
`tests/lifecycle/run-20x.sh` green across install / UI / stop / start / restart /
reinstall / reboot-survive / archipelago-restart-survive / uninstall — **20× on
.228 AND .198**. Until green, the master plan is the priority.
reinstall / reboot-survive / archipelago-restart-survive / uninstall — **5× on
.228 AND .198 for now** (`ARCHY_ITERATIONS=5`; temporarily reduced from 20×
restore to 20× before the final ship). Until green, the master plan is the priority.

View File

@ -56,7 +56,7 @@ real nodes. Until then, this plan is the priority.
- **The 4 companions** (`archy-bitcoin-ui`, `-lnd-ui`, `-electrs-ui`,
`-fedimint-ui`) build from `docker/<name>` contexts via `companion.rs`, not the
manifest registry — a later phase folds them in.
- **No app has passed the formal 20× production gate.** That is the blocker.
- **No app has passed the formal production gate (5× for now, was 20×).** That is the blocker.
## 4. Workstreams (each links its authoritative detail doc)
@ -66,7 +66,7 @@ real nodes. Until then, this plan is the priority.
| B | **Registry-distributed manifests** — catalog carries full signed manifest; orchestrator installs from registry; disk = migration fallback | `registry-manifest-design.md` | **phases 1+2 done** (node consume + opt-in publisher embed); not yet flipped on for the fleet |
| C | **Developer-ready external registry** — 3rd-party DID-signed manifests, decentralized Nostr discovery (NIP-78 kind 30078) + trust score, `archy app …` tooling | `marketplace-protocol.md`, `app-developer-guide.md` | design exists; tooling + trust UX pending |
| D | **Distribution backbone** — signed catalog, BLAKE3 content-addressing, iroh swarm (origin-always-wins) | `dht-distribution-design.md` | phases 02 code-complete (worktree) |
| E | **Production test gate**20× lifecycle on .228 + .198, per-app L1/L2 matrix | `tests/lifecycle/TESTING.md`, `bulletproof-containers.md` | **never green — exit criterion** |
| E | **Production test gate**5× lifecycle on .228 + .198 (for now; was 20×), per-app L1/L2 matrix | `tests/lifecycle/TESTING.md`, `bulletproof-containers.md` | **never green — exit criterion** |
**Orchestrator architecture** (foundation for A/B): `rust-orchestrator-migration.md`
(ProdContainerOrchestrator, BootReconciler 30s level-triggered reconcile, adoption
@ -78,7 +78,8 @@ modes FM1FM6 + the desired-state-first reconciler that fixes them).
An app is **production-ready** only when `tests/lifecycle/run-20x.sh` is green
across the full matrix — install / UI-reachable / stop / start / restart /
reinstall / **reboot-survive** / **archipelago-restart-survive** / uninstall —
**20× on .228 AND .198**. All 8 gate checkboxes in `tests/lifecycle/TESTING.md`
**5× on .228 AND .198 for now** (`ARCHY_ITERATIONS=5`; temporarily reduced from
20× — restore to 20× before the final ship). All 8 gate checkboxes in `tests/lifecycle/TESTING.md`
are currently unchecked. Coverage today: L0 unit (631 ●), L1 RPC ● for 6 core apps,
L2 UI ● dashboard + proxies; L3 survival ◐; ~30 apps have zero automated coverage.
@ -97,7 +98,7 @@ L2 UI ● dashboard + proxies; L3 survival ◐; ~30 apps have zero automated cov
4. ✅ **Reboot-survival** — podman-restart.service enabled (startup, fleet-wide)
for the podman-`--restart` path. *(f160e0c4)*
5. ◻ **Verify on .198** (immich migration validated on .228 only so far).
6. ◻ **E** — run the 20× gate; fix until green.
6. ◻ **E** — run the 5× gate (`ARCHY_ITERATIONS=5`, was 20×); fix until green.
7. ◻ Demote this banner.
**Not yet done / deliberate follow-ups:** flip `EMBED_MANIFESTS` on for the

View File

@ -26,7 +26,8 @@ The migration's aim, restated as **five pillars** (every app must satisfy all fi
desired→current from manifests + secrets. Self-healing, not edge-triggered.
3. **Lifecycle bulletproof** — every app passes the full matrix
(install / UI reachable / stop / start / restart / reinstall / reboot-survive
/ archipelago-restart-survive / uninstall) **20× green on .228 AND .198**
/ archipelago-restart-survive / uninstall) **5× green on .228 AND .198 for now**
(`ARCHY_ITERATIONS=5`; temporarily reduced from 20×, restore before final ship)
before any release.
4. **Data-driven apps** — install/uninstall needs only the app's manifest +
catalog entry. **No host OS changes** (no apt, no /etc, no host units) and
@ -38,8 +39,8 @@ The migration's aim, restated as **five pillars** (every app must satisfy all fi
drop-all-caps + add-back only what a manifest declares. Secrets are `0600`,
owned by the service user. Security is king.
**Per-app definition of done:** all five pillars hold → lifecycle matrix 20×
green on .228 then .198 → catalog/registry updated (`app-catalog/catalog.json`
**Per-app definition of done:** all five pillars hold → lifecycle matrix 5×
(for now; was 20×) green on .228 then .198 → catalog/registry updated (`app-catalog/catalog.json`
+ `releases/app-catalog.json`, rebuilt image pushed to the mirror) → tracker
cell ticked. Only then move to the next app.
@ -192,8 +193,8 @@ ARCHY_PASSWORD=password123 tests/lifecycle/run.sh
# Full + destructive (for the verification fleet):
ARCHY_PASSWORD=password123 ARCHY_ALLOW_DESTRUCTIVE=1 tests/lifecycle/run.sh
# 20× release-gate run (the actual v1.7.52 ship gate):
ARCHY_PASSWORD=password123 ARCHY_ALLOW_DESTRUCTIVE=1 \
# 5× release-gate run (for now; was 20× — restore before final ship):
ARCHY_PASSWORD=password123 ARCHY_ALLOW_DESTRUCTIVE=1 ARCHY_ITERATIONS=5 \
tests/lifecycle/run-20x.sh
```
@ -247,8 +248,8 @@ We don't have a performance harness yet. Add as L6 lands:
v1.7.52 ships only when ALL of:
1. ☐ Bitcoin-stops fix verified live on a fresh node (tests/lifecycle/bats/bitcoin-knots.bats fully ● after a cold install)
2. ☐ `tests/lifecycle/run-20x.sh` returns 0 against .228 (full suite, ARCHY_ALLOW_DESTRUCTIVE=1)
3. ☐ `tests/lifecycle/run-20x.sh` returns 0 against .198 (same)
2. ☐ `ARCHY_ITERATIONS=5 tests/lifecycle/run-20x.sh` returns 0 against .228 (5× for now; full suite, ARCHY_ALLOW_DESTRUCTIVE=1)
3. ☐ `ARCHY_ITERATIONS=5 tests/lifecycle/run-20x.sh` returns 0 against .198 (same)
4. ☐ The L3 `backend-survives-archipelago-restart` suite passes (= Phase 3 Quadlet shipped for backends)
5. ☐ Cargo: 0 warnings, 0 unused, all tests green (sustained ✓ since 1c0df95f)
6. ☐ LoC: at least one of {Phase 3 Quadlet, dev_mode resolution} merged