# Step 8b Port Audit — container-specs.sh → apps/*/manifest.yml

Last updated: 2026-04-23

This audit is the scope-lock for Step 8b of `docs/rust-orchestrator-migration.md`. Every container currently declared in `scripts/container-specs.sh:ALL_CONTAINER_SPECS` must be port-faithful to `apps/<id>/manifest.yml` before Step 8c can delete the bash scripts.

Findings in short:

- `scripts/container-specs.sh` lists **30 containers** across 5 tiers.
- `apps/*/manifest.yml` exists for **27 app ids**, but the overlap is partial and most of the overlapping manifests are **aspirational stubs written in the original design phase, never reconciled against production behavior**. The image references, container names, network topology, env, and health checks disagree with what actually runs on `.116` and `.228`.
- Only the three UI apps (`bitcoin-ui`, `electrs-ui`, `lnd-ui`) plus `aiui` are truly ported (Step 7 scope).
- The Rust schema (`core/container/src/manifest.rs::AppManifest`) is **missing** several fields needed for a faithful port: `archy-net` network selection, `custom_args`, `entrypoint` override, derived host env (e.g. `HOST_MDNS`), secret-file env injection, and data-dir UID/GID mapping.

---

## Table — every spec, mapped

Legend for **Status**:

- ✅ PORTED — manifest exists and matches reality (Step 7 done).
- ⚠ STUB — `apps/<id>/manifest.yml` exists but disagrees with `container-specs.sh` (image, name, network, env, or health wrong).
- ❌ MISSING — no manifest file on disk.
- — N/A — intentionally out of Step 8b (optional app with no spec, or already managed by a different system).

| Tier | Spec name (container-specs.sh) | Actual container name | Image source | apps/<id>/ matches? | Status | Notes |
|-----:|----------------------------------|-----------------------|-------------------------------------|---------------------|--------|-------|
| 0 | archy-mempool-db | archy-mempool-db | `$MARIADB_IMAGE` | mempool/ | ⚠ | Existing manifest (if any) targets mempool combined stack, not the DB sidecar. Likely a companion of `apps/mempool`. |
| 0 | archy-btcpay-db | archy-btcpay-db | `$BTCPAY_POSTGRES_IMAGE` | btcpay-server/ | ⚠ | Existing manifest describes only the app container. DB is a silent companion in the current model. |
| 0 | immich_postgres | immich_postgres | `$IMMICH_POSTGRES_IMAGE` | (none) | ❌ | Optional. No `apps/immich/` dir. |
| 0 | immich_redis | immich_redis | `$VALKEY_IMAGE` | (none) | ❌ | Optional. No `apps/immich/` dir. |
| 1 | bitcoin-knots | bitcoin-knots | `$BITCOIN_KNOTS_IMAGE` | bitcoin-core/ | ⚠ | `apps/bitcoin-core/manifest.yml` references `bitcoin/bitcoin:28.4`; production runs Bitcoin **Knots** at `$ARCHY_REGISTRY/bitcoin-knots:latest`. App id mismatch: spec is `bitcoin-knots`, manifest is `bitcoin-core`. Decide: rename spec or rename app id. |
| 1 | electrumx | electrumx | `$ELECTRUMX_IMAGE` | (none) | ❌ | Separate from `electrs-ui`. No `apps/electrumx/` dir. |
| 2 | lnd | lnd | `$LND_IMAGE` | lnd/ | ⚠ | Manifest exists; needs verification against current env/ports/caps. |
| 2 | mempool-api | mempool-api | `$MEMPOOL_BACKEND_IMAGE` | mempool/ | ⚠ | Companion of `apps/mempool`. May need dedicated manifest or stack-form. |
| 2 | archy-mempool-web | archy-mempool-web | `$MEMPOOL_WEB_IMAGE` | mempool/ | ⚠ | Companion. |
| 2 | archy-nbxplorer | archy-nbxplorer | `$NBXPLORER_IMAGE` | btcpay-server/ | ⚠ | Companion of BTCPay. |
| 2 | btcpay-server | btcpay-server | `$BTCPAY_IMAGE` | btcpay-server/ | ⚠ | Stub; env, ports, deps need reconciliation. |
| 2 | fedimint | fedimint | `$FEDIMINT_IMAGE` | fedimint/ | ⚠ | **This is the bug from yesterday.** Stub references wrong image (`fedimint/fedimintd:v0.10.0` instead of `$ARCHY_REGISTRY/fedimintd:v0.10.0`), wrong RPC target (`bitcoin-core:8332` instead of `bitcoin-knots:8332`), missing `HOST_MDNS` env, missing `archy-net`, missing `FM_BIND_P2P`/`FM_BIND_API`, missing gateway ports etc. |
| 2 | fedimint-gateway | fedimint-gateway | `$FEDIMINT_GATEWAY_IMAGE` | (none) | ❌ | No manifest. Has complex LND-aware entrypoint in `container-specs.sh:load_spec_fedimint-gateway`. |
| 2 | immich_server | immich_server | `$IMMICH_SERVER_IMAGE` | (none) | ❌ | Optional. |
| 3 | homeassistant | homeassistant | `$HOMEASSISTANT_IMAGE` | home-assistant/ | ⚠ | id mismatch: `homeassistant` vs `home-assistant`. |
| 3 | grafana | grafana | `$GRAFANA_IMAGE` | grafana/ | ⚠ | Stub. |
| 3 | uptime-kuma | uptime-kuma | `$UPTIME_KUMA_IMAGE` | (none) | ❌ | Optional. |
| 3 | jellyfin | jellyfin | `$JELLYFIN_IMAGE` | (none) | ❌ | Optional. |
| 3 | photoprism | photoprism | `$PHOTOPRISM_IMAGE` | (none) | ❌ | Optional. |
| 3 | vaultwarden | vaultwarden | `$VAULTWARDEN_IMAGE` | (none) | ❌ | Optional. Known-bad container on `.228` (see STATUS.md). |
| 3 | nextcloud | nextcloud | `$NEXTCLOUD_IMAGE` | (none) | ❌ | Optional. |
| 3 | searxng | searxng | `$SEARXNG_IMAGE` | searxng/ | ⚠ | Stub. |
| 3 | onlyoffice | onlyoffice | `$ONLYOFFICE_IMAGE` | onlyoffice/ | ⚠ | Stub. |
| 3 | filebrowser | filebrowser | `$FILEBROWSER_IMAGE` | (none) | ❌ | **Critical** — this is Archipelago baseline (bootstrapped by first-boot), not an optional app. Lost `.filebrowser.json` yesterday. Must have a manifest. |
| 3 | nginx-proxy-manager | nginx-proxy-manager | `$NPM_IMAGE` | (none) | ❌ | Optional. |
| 3 | portainer | portainer | `$PORTAINER_IMAGE` | (none) | ❌ | Optional. |
| 3 | ollama | ollama | `$OLLAMA_IMAGE` | ollama/ | ⚠ | Stub. |
| 4 | archy-bitcoin-ui | archy-bitcoin-ui | `localhost/bitcoin-ui:local` | bitcoin-ui/ | ✅ | Step 7 done. |
| 4 | archy-lnd-ui | archy-lnd-ui | `localhost/lnd-ui:local` | lnd-ui/ | ✅ | Step 7 done. |
| 4 | archy-electrs-ui | archy-electrs-ui | `localhost/electrs-ui:local` | electrs-ui/ | ✅ | Step 7 done. |

### Non-spec apps that already have manifests (outside `container-specs.sh`)

These are managed entirely by the install RPC today and already have adoption paths in the Rust orchestrator. They are **not** in 8b scope:

- `aiui`, `botfights`, `core-lightning`, `did-wallet`, `endurain`, `gitea`, `indeedhub`, `lightning-stack` (stack), `meshtastic`, `morphos-server`, `nostr-rs-relay`, `router`, `strfry`, `web5-dwn`.

---

## Schema gaps blocking faithful ports

`core/container/src/manifest.rs::AppManifest` currently supports:

- `container.image` OR `container.build` (mutually exclusive, validated).
- `dependencies: Vec<Dependency>`, `resources: {cpu_limit, memory_limit, disk_limit}`.
- `security: { capabilities, readonly_root, network_policy: string, apparmor_profile }`.
- `ports: Vec<{host, container, protocol}>`, `volumes: Vec<{type, source, target, options}>`.
- `environment: Vec<String>` (each `"KEY=VALUE"`).
- `health_check: {type, endpoint, path, interval, timeout, retries}`.
- `devices: Vec<String>`, `extensions: HashMap<String, Value>` (flatten).

What `container-specs.sh` uses that the schema **does not** express first-class:

| Need | Example from bash | Proposed schema addition |
|---|---|---|
| Join the named `archy-net` bridge | `SPEC_NETWORK="archy-net"` | `container.network: Option<String>` (Some("archy-net"), or None for `isolated`, or "host"). Existing `security.network_policy` left as-is for policy knobs (e.g. firewall isolation layer); this new field is literally the podman `--network` value. |
| Extra args / custom flags | `SPEC_CUSTOM_ARGS="-server=1 -prune=550 ..."` | `container.custom_args: Vec<String>`. |
| Entrypoint override | `SPEC_ENTRYPOINT="gatewayd --data-dir /data ... lnd --lnd-rpc-host lnd:10009"` | `container.entrypoint: Option<Vec<String>>`. |
| Host-derived env (mDNS hostname, host IP) | `FM_P2P_URL=fedimint://$HOST_MDNS:8173` | `container.derived_env: Vec<{key, template}>` with a small allow-list of `{{HOST_MDNS}}`, `{{HOST_IP}}`, `{{DISK_GB}}` substitutions resolved at apply time. |
| Secret-file env (read from `/var/lib/archipelago/secrets/<name>`) | `FM_BITCOIND_PASSWORD=$BITCOIN_RPC_PASS` (from secret file in bash) | `container.secret_env: Vec<{key, secret_file}>`, secret_file relative to `$SECRETS_DIR`. Never logged. |
| Data dir UID/GID (for rootless mapped chown) | `SPEC_DATA_UID="100070:100070"` | `container.data_uid: Option<String>` (e.g. `"100070:100070"`). Applied as `chown -R` before container create. |
| Exec health check | `SPEC_HEALTH_CMD="bitcoin-cli ..."` | Extend `HealthCheck` so `type: exec` + `command: Vec<String>` works end-to-end; confirm the runtime honors it. |
| Optional/skip-when-not-installed semantics | `SPEC_OPTIONAL="true"` | Already covered: `BootReconciler` only installs if an `AppManifest` is registered. For baseline-on-first-boot containers (filebrowser), we use the same install path. No schema change. |
| Local-image flag (don't pull) | `SPEC_LOCAL_IMAGE="true"` | Already covered: `container.build` vs `container.image`. |

Everything else (tier ordering, dependency tree, readonly_root, tmpfs mounts) is either already in the schema or folded into `custom_args` cleanly.

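
To make the `derived_env` row concrete, here is a minimal sketch of allow-listed template substitution. The `HostFacts` field names follow the struct named in commit 2 of the proposed sequence; the field types, the `DerivedEnv` struct, the `render_template` helper, and the decision to reject unknown placeholders are assumptions made for illustration, not the orchestrator's actual code. Output uses the `"KEY=VALUE"` form that `environment` already expects.

```rust
/// Host facts available for substitution. Field names follow the audit's
/// proposed `HostFacts`; the concrete types are an assumption.
#[derive(Debug, Clone)]
pub struct HostFacts {
    pub host_ip: String,
    pub host_mdns: String,
    pub disk_gb: u64,
}

/// One `derived_env` entry: an env key plus a template string (hypothetical shape).
#[derive(Debug, Clone)]
pub struct DerivedEnv {
    pub key: String,
    pub template: String,
}

/// Expand the three allow-listed placeholders in one template.
/// Any `{{` still present afterwards is reported as an error, so an
/// unknown placeholder never reaches the container verbatim.
pub fn render_template(template: &str, facts: &HostFacts) -> Result<String, String> {
    let rendered = template
        .replace("{{HOST_MDNS}}", &facts.host_mdns)
        .replace("{{HOST_IP}}", &facts.host_ip)
        .replace("{{DISK_GB}}", &facts.disk_gb.to_string());
    if rendered.contains("{{") {
        return Err(format!("unknown placeholder in template: {}", template));
    }
    Ok(rendered)
}

/// Resolve every entry into a `"KEY=VALUE"` string; the first failing
/// template aborts the whole resolution.
pub fn resolve_env(entries: &[DerivedEnv], facts: &HostFacts) -> Result<Vec<String>, String> {
    entries
        .iter()
        .map(|e| render_template(&e.template, facts).map(|v| format!("{}={}", e.key, v)))
        .collect()
}
```
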
### tmpfs

`SPEC_TMPFS="/tmp:rw,noexec,nosuid,size=256m ..."` is used by `grafana`, `searxng`, and `ollama`. The schema currently has no first-class field for it. Proposed: either `volumes[].type: tmpfs` with a new `tmpfs_options` field on `Volume`, or a dedicated `container.tmpfs: Vec<{target, options}>`. Either works; the `Volume` variant keeps all mount declarations in one place.

---

## Proposed commit sequence

Each item is a separate commit. None recreates a container on the fleet.

**8b.0 — schema extensions, no manifest changes, no orchestrator changes**

1. `feat(container/manifest): add network, custom_args, entrypoint, derived_env, secret_env, data_uid, tmpfs fields` — add fields to `ContainerConfig`/`SecurityPolicy`/`Volume`, update `validate()`, add unit tests per new field. Backwards-compat: every existing `apps/*/manifest.yml` must still parse (verify with a `parse_every_real_manifest` test that walks `apps/*/manifest.yml` in the repo).

2. `feat(container/manifest): resolve derived_env against host facts` — add `HostFacts { host_ip, host_mdns, disk_gb }` struct and `resolve_env(facts) -> Vec<String>` method; unit test with a fixed `HostFacts`.

3. `feat(container/manifest): resolve secret_env against a SecretsProvider` — add trait `SecretsProvider { fn read(&self, name: &str) -> Result<String>; }`, stub `FileSecretsProvider` rooted at `/var/lib/archipelago/secrets`, unit test with a tmpdir provider.

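
Commit 3 could look roughly like the sketch below. The trait shape and the `FileSecretsProvider` name come from the item above; using `std::io::Result`, the `SecretEnv` struct, the `resolve_secret_env` helper, the plain-file-name check, and the trailing-newline trimming are all assumptions for illustration. The provider deliberately has no `Debug` impl that would print values, in keeping with the "never logged" requirement. In a unit test, `FileSecretsProvider::new` would be pointed at a temporary directory instead of `/var/lib/archipelago/secrets`.

```rust
use std::fs;
use std::io;
use std::path::{Component, Path, PathBuf};

/// Trait shape taken from the audit; `io::Result` is an assumption
/// (the audit only writes `Result<String>`).
pub trait SecretsProvider {
    fn read(&self, name: &str) -> io::Result<String>;
}

/// File-backed provider rooted at a secrets directory.
pub struct FileSecretsProvider {
    root: PathBuf,
}

impl FileSecretsProvider {
    pub fn new(root: impl Into<PathBuf>) -> Self {
        Self { root: root.into() }
    }
}

impl SecretsProvider for FileSecretsProvider {
    fn read(&self, name: &str) -> io::Result<String> {
        // Accept only a single plain file name, so `../x` or an absolute
        // path cannot escape the secrets root.
        let mut parts = Path::new(name).components();
        let is_plain = matches!(
            (parts.next(), parts.next()),
            (Some(Component::Normal(_)), None)
        );
        if !is_plain {
            return Err(io::Error::new(
                io::ErrorKind::InvalidInput,
                "secret name must be a plain file name",
            ));
        }
        let raw = fs::read_to_string(self.root.join(name))?;
        // Secret files often end with a newline; strip it so the env value is exact.
        Ok(raw.trim_end_matches(|c: char| c == '\n' || c == '\r').to_string())
    }
}

/// One `secret_env` entry (hypothetical struct mirroring `{key, secret_file}`).
pub struct SecretEnv {
    pub key: String,
    pub secret_file: String,
}

/// Resolve entries to `"KEY=VALUE"` strings; the first unreadable secret
/// aborts the whole resolution.
pub fn resolve_secret_env(
    provider: &dyn SecretsProvider,
    entries: &[SecretEnv],
) -> io::Result<Vec<String>> {
    entries
        .iter()
        .map(|e| provider.read(&e.secret_file).map(|v| format!("{}={}", e.key, v)))
        .collect()
}
```
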
**8b.1 — orchestrator honors the new fields**

4. `feat(prod_orchestrator): honor network/custom_args/entrypoint on create` — thread the new `ResolvedContainerConfig` into the runtime's create call. Mock-runtime unit tests for each field.
5. `feat(prod_orchestrator): chown data dir to data_uid before create` — called from `install_fresh`. Unit test with a tmpdir.
6. `feat(prod_orchestrator): resolve derived_env + secret_env before create` — wire in `HostFacts` + `SecretsProvider`. Unit test.

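
For commit 4, one possible mapping from the new fields to a create call is sketched below, assuming the runtime ultimately builds a `podman create` argument list. `CreateSpec` and `create_args` are hypothetical names, not the real `ResolvedContainerConfig`. Two choices here are assumptions rather than settled design: a `None` network simply omits `--network`, and a multi-element entrypoint is split so that its first element goes to `--entrypoint` while the rest follow the image as command arguments, ahead of `custom_args`. A pure function like this is easy to cover with the mock-runtime tests the item calls for.

```rust
/// Subset of a resolved container config relevant to create-time flags.
/// Field names mirror the proposed schema; the struct itself is hypothetical.
#[derive(Debug, Clone, Default)]
pub struct CreateSpec {
    pub name: String,
    pub image: String,
    pub network: Option<String>,
    pub entrypoint: Option<Vec<String>>,
    pub custom_args: Vec<String>,
}

/// Build the argument list for `podman create` (assumed CLI-based runtime).
pub fn create_args(spec: &CreateSpec) -> Vec<String> {
    let mut args = vec![
        "create".to_string(),
        "--name".to_string(),
        spec.name.clone(),
    ];

    // `network` is passed through verbatim as the `--network` value.
    // With `None` the flag is omitted; what `None` should map to is
    // left open in this sketch.
    if let Some(net) = &spec.network {
        args.push("--network".to_string());
        args.push(net.clone());
    }

    // Split an entrypoint override: the program goes to `--entrypoint`,
    // its remaining arguments must come after the image.
    let mut command: Vec<String> = Vec::new();
    if let Some(ep) = &spec.entrypoint {
        if let Some((program, rest)) = ep.split_first() {
            args.push("--entrypoint".to_string());
            args.push(program.clone());
            command.extend(rest.iter().cloned());
        }
    }

    args.push(spec.image.clone());
    args.extend(command);
    args.extend(spec.custom_args.iter().cloned());
    args
}
```
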
**8b.2 — first real backend port: fedimint**

7. `feat(apps/fedimint): port manifest from container-specs.sh with mDNS URLs + archy-net` — rewrites `apps/fedimint/manifest.yml` using the new schema. Includes `container_name: fedimint` (no prefix), `network: archy-net`, `derived_env: [FM_P2P_URL, FM_API_URL]`, `secret_env: [FM_BITCOIND_PASSWORD, ...]`.
8. `feat(apps/fedimint-gateway): new manifest with LND-aware entrypoint` — creates `apps/fedimint-gateway/manifest.yml`. Dynamic entrypoint is a 2-case template resolved by a derived field `{{LND_AVAILABLE}}` (presence of `/var/lib/archipelago/lnd/tls.cert`). May require a second commit to add that derived fact — scope-judge at write time.
9. `test(lifecycle): fedimint adoption + fresh-install` — bats scaffold per `docs/bulletproof-containers.md§Test harness`.

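
The 2-case entrypoint in commit 8 might reduce to something like the following. This is only a sketch: `lnd_available`, `LndAwareEntrypoint`, and `resolve` are hypothetical names, and the two argument vectors are left for the manifest to supply because the full gateway command lines are not reproduced in this audit. The derived fact is computed from the presence of the TLS cert file, as the item describes, and the path is passed in so a test can use a temporary directory.

```rust
use std::path::Path;

/// Derived fact: LND counts as available when its TLS cert file exists.
/// In production the path would be `/var/lib/archipelago/lnd/tls.cert`.
pub fn lnd_available(tls_cert: &Path) -> bool {
    tls_cert.is_file()
}

/// Two-case entrypoint template (hypothetical schema shape).
/// Both vectors come from the manifest; nothing is hardcoded here.
pub struct LndAwareEntrypoint {
    pub with_lnd: Vec<String>,
    pub without_lnd: Vec<String>,
}

impl LndAwareEntrypoint {
    /// Pick the entrypoint variant that matches the derived fact.
    pub fn resolve(&self, lnd_available: bool) -> &[String] {
        if lnd_available {
            &self.with_lnd
        } else {
            &self.without_lnd
        }
    }
}
```
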
**8b.3 — remaining critical backends (one per commit)**

10. `feat(apps/filebrowser): new manifest — baseline Archipelago service` (fixes yesterday's `.filebrowser.json` loss by regenerating via `custom_args: ["--config", "/data/.filebrowser.json"]` + `caps: [..., NET_BIND_SERVICE]`).
11. `feat(apps/electrumx): new manifest`.
12. `feat(apps/bitcoin-knots): rename-or-merge with apps/bitcoin-core/manifest.yml` — decide naming once, update everywhere. Recommend: keep `apps/bitcoin-core/` dir (it's the user-visible app name) and use `extensions.container_name: bitcoin-knots` to preserve adoption.
13. `feat(apps/lnd): reconcile stub against spec`.
14. `feat(apps/btcpay-server + companions): multi-container stack` — reuse the existing stack path in `api/rpc/package/stacks.rs` OR decide to add `container.companions: Vec<ContainerConfig>`. Defer decision until 10–13 land.

**8b.4 — mempool stack, optional apps**

Continue one-at-a-time until every ⚠ or ❌ row above is ✅.

**8b.5 — port `core/archipelago/src/api/rpc/package/update.rs`**

Replace `reconcile-containers.sh` calls with `ContainerOrchestrator::upgrade(app_id)`. Unblocks 8c.

**8c — delete bash scripts** (per `docs/rust-orchestrator-migration.md`).

---

## Runtime-only drift on `.116` — write it into manifests, not scripts

Per `docs/RESUME.md§Runtime-only fixes on .116`, yesterday's patches are:

1. `~archipelago/.config/containers/containers.conf` (`image_copy_tmp_dir = "storage"`) → lands in `first-boot-setup.sh` (renamed in Step 8c) OR in a Rust startup-side prereq hook. Not a per-manifest concern.
2. Secrets ownership `archipelago:archipelago` → Rust orchestrator's `ensure_secrets` path (already exists; verify it chowns).
3. `/var/lib/archipelago/filebrowser-data/.filebrowser.json` → handled by filebrowser's `custom_args: ["--config", "/data/.filebrowser.json"]` plus a pre-start hook (mirrors `bitcoin_ui` precedent) that writes the file if absent. Details in 8b.3 commit 10.
4. Fedimint data dir chown → handled by `container.data_uid: "100000:100000"` in the fedimint manifest.

All runtime-only fixes end up expressed as manifest fields or Rust-side hooks. None survives as bash.

---

## Open decisions (lock before writing code)

1. **`bitcoin-knots` vs `bitcoin-core` naming.** Recommend: app id stays `bitcoin-core` (user-facing), container name becomes `bitcoin-knots` via `extensions.container_name`, image is Knots. Alternatively, rename both to `bitcoin-knots` for honesty. Pick one and apply everywhere.
2. **`archy-` prefix rule.** Currently `UI_APP_IDS` in `prod_orchestrator.rs` hardcodes `["bitcoin-ui", "electrs-ui", "lnd-ui"]` → `archy-`. Several backends use `archy-` too (`archy-mempool-db`, `archy-mempool-web`, `archy-nbxplorer`, `archy-btcpay-db`). Recommend: drop the hardcoded list, rely on `extensions.container_name` everywhere, and audit all existing manifests to set it explicitly so adoption doesn't orphan running containers.
3. **Companions (mempool-api + mempool-web + mempool-db, btcpay-server + nbxplorer + btcpay-db).** Two options: (a) one manifest per container with explicit deps and an "app group" id; (b) extend `ContainerConfig` with `companions: Vec<…>`. The already-shipped `apps/lightning-stack/manifest.yml` probably sets a precedent; check its shape before deciding.
4. **Keep `container-specs.sh` as the source of truth until 8b is fully ported?** Yes. `BootReconciler` only acts on what's in `apps/*/manifest.yml`; anything not ported stays on the bash path until its commit lands. Zero-downtime migration.

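
Decision 2 can be pictured as swapping one naming function for another. In the sketch below, `UI_APP_IDS` and its three ids are taken from the item above, while `legacy_container_name`, `container_name`, and `name_changes` are hypothetical helpers, and the override parameter stands in for whatever `extensions.container_name` deserializes to. `name_changes` flags any manifest whose resolved name differs between the two rules, which is the set that would need an explicit `extensions.container_name` (or a deliberate rename) before the hardcoded list is dropped.

```rust
/// Ids that currently receive the `archy-` prefix, as listed in the audit.
const UI_APP_IDS: [&str; 3] = ["bitcoin-ui", "electrs-ui", "lnd-ui"];

/// Current rule: only the hardcoded UI ids are prefixed.
pub fn legacy_container_name(app_id: &str) -> String {
    if UI_APP_IDS.contains(&app_id) {
        format!("archy-{}", app_id)
    } else {
        app_id.to_string()
    }
}

/// Recommended rule: an explicit `extensions.container_name` wins,
/// otherwise the app id is used unchanged.
pub fn container_name(app_id: &str, override_name: Option<&str>) -> String {
    override_name.unwrap_or(app_id).to_string()
}

/// True when the two rules disagree for this manifest, i.e. the
/// orchestrator would look for a differently named container after the switch.
pub fn name_changes(app_id: &str, override_name: Option<&str>) -> bool {
    container_name(app_id, override_name) != legacy_container_name(app_id)
}
```
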
---

## Where to resume

After the user approves this plan, resume at commit 1 in 8b.0 (schema extensions + tests, no orchestrator or manifest changes). It is the smallest possible diff with the highest leverage, and it unblocks every subsequent port.

## Validation Snapshot - 2026-04-28

- Runtime cleanup: removed orphan `bold_lichterman` duplicate; retained managed `filebrowser`.
- Launch policy alignment: local app launches are port-based; iframe-blocked apps (including `gitea`) are forced to new-tab.
- App icon reliability: image fallback now retries `.svg` when `.png` does not exist.
- Required stack verification on `.116`:
  - `tests/lifecycle/bats/required-stack.bats` -> PASS
  - `ARCHY_ALLOW_DESTRUCTIVE=1 tests/lifecycle/bats/required-stack-destructive.bats` -> PASS
- Broad host-port probe confirms HTTP 200 responses for user-facing app UIs on mapped ports; non-HTTP ports intentionally excluded from HTTP pass/fail semantics.