diff --git a/docs/PRODUCTION-MASTER-PLAN.md b/docs/PRODUCTION-MASTER-PLAN.md index 2b855c61..6a481d4f 100644 --- a/docs/PRODUCTION-MASTER-PLAN.md +++ b/docs/PRODUCTION-MASTER-PLAN.md @@ -158,10 +158,32 @@ phases 2–6 (`dual-ecash-design.md`). **Up 4 days (NOT recreated)** — zero data/credential disruption. UI green: frontend :7778 → 200, nostr-provider.js → 200, **/api/ → 200 (proves network_aliases: frontend nginx `http://api:4000` resolved on indeedhub-net)**. - Fleet healthy (36 containers, none down). **STILL TODO: fresh-create path** — - adoption is NoOp so install_fresh (→ post_install hook + alias rendering on a NEW - container) was NOT exercised live; validate via destructive lifecycle - (uninstall→reinstall or recreate one member) on .228, then .198, then the gate. + Fleet healthy (36 containers, none down). + **FRESH-CREATE PATH = BLOCKED (found live 2026-06-21).** Removed the stateless + frontend + reinstalled to exercise install_fresh → it FAILED: + `orchestrator stack install indeedhub failed at app indeedhub: IndeedHub + dependencies were not ready within 120s (indeedhub-api dependency DNS not ready)`, + and the frontend was left down. Recovered manually on .228 (podman run w/ alias + indeedhub on indeedhub-net; UI 200). ROOT CAUSE = hardcoded indeedhub orchestrator + special-cases that predate + conflict with the manifest path: + - prod_orchestrator `ensure_running` ~L1377: `app_id=="indeedhub"` → + `reconcile_indeedhub_stack`, which REFUSES manifest creation when the frontend + is absent (returns Left("stack-managed")). + - `run_pre_start_hooks("indeedhub")` ~L2324 → `start_indeedhub_backends` → + `wait_for_indeedhub_dependencies_ready(120)` — the gate that blocked install_fresh + (`indeedhub_api_dependency_dns_ready` returns false while the frontend's own alias + is absent + a getent transiently fails). + - also `repair_indeedhub_network_aliases`, `patch_indeedhub_nostr_provider`, the + "frontend did not stay reachable; restart" path (~L2474), `INDEEDHUB_BACKEND_*` + consts, and a crash_recovery.rs indeedhub special-case. + **FIX (next, its own build/deploy/test cycle):** delete these special-cases now + that the manifest carries dependencies/network_aliases/post_install — route + "indeedhub" through the GENERIC install_fresh + reconcile path so the frontend + fresh-creates normally (hook fires). Then re-run the destructive lifecycle on .228 + (frontend recreate must succeed + run the hook), then .198, then the gate. + NOTE: .228 currently runs v1.7.99-alpha (these special-cases still present) — the + running stack is fine (adoption NoOp); only a frontend-absent event re-triggers the + bug, and the frontend is up. - `b1eea8c0` indeedhub (#20) **phase 3 — CODE COMPLETE, unit-tested.** 7 manifests (apps/indeedhub-{postgres,redis,minio,relay,api, ffmpeg} + apps/indeedhub frontend) + install_indeedhub_stack orchestrator-first (immich pattern). Data-preserving by construction = ADOPTION on .228: exact live