---
name: Deploy session 2026-03-22 findings
description: Comprehensive deploy/build fixes made overnight (container issues, image tags, script improvements, remaining work)
type: project
---

## Session Summary (2026-03-22 overnight)

Massive deploy infrastructure overhaul across all 5 nodes (.228, .198, Arch 1/2/3).

### Fixed in deploy-tailscale.sh

- **Image tags**: Bitcoin Knots `28.1` (not `v28.1`), BTCPay `1.13.7` (not `1.14.5`), SearXNG `2026.3.20-6c7e9c197`
- **Removed Immich** (3 containers) and **Penpot** (5 containers) from deploy + build
- **Fedimint**: `FM_REL_NOTES_ACK=0_4_xyz` env var (NOT `FM_SKIP_REL_NOTES_ACK` or `FM_REQ_RELEASE_NOTES_ACK_V0_4`)
- **Fedimint-gateway**: `--password` instead of `--bcrypt-password-hash` (v0.5.1 CLI change)
- **FileBrowser**: added `--cap-add NET_BIND_SERVICE` for port 80 binding (replaced by `--user 0:0` in Session 2, see the rootless port 80 rule)
- **SearXNG**: added `/var/lib/archipelago/searxng:/etc/searxng` volume mount + caps
- **Postgres**: pinned to `postgres:15` (data initialized with 15, incompatible with 16)
- **Migration**: one-time flag file `/var/lib/archipelago/.rootless-migrated`
- **Recreate-if-broken pattern**: containers that exist but are stopped get deleted and recreated
- **Arch 2 hostname**: fixed from hardcoded hostname to `$TAILSCALE_ARCH2`
- **Custom UI images**: graceful skip if not available, source extracted to repo (`docker/bitcoin-ui/`, `docker/electrs-ui/`)
- **AIUI tar xattr**: silenced with `--no-xattrs` (only in deploy-tailscale.sh, NOT deploy-to-target.sh yet)
- **Nginx MIME warning**: removed `text/html` from `sub_filter_types`

### Added

- `--fleet` flag in deploy-to-target.sh: deploys .228 → .198 → Arch 1/2/3
- `--both` lock fix: releases lock before recursive `--live` call
- Container verification step (Step 26b): restarts exited containers, fixes permissions, checks Tor
- IndeedHub backend stack rebuilt on .228 (7 containers)
- IndeedHub nginx patched with direct IPs (podman DNS doesn't work with nginx resolver)

### Frontend changes

- Replaced Immich with FileBrowser on Setup homescreen (`goals.ts`, `EasyHome.vue`)
- `MEMPOOL_API_IMAGE` renamed to `MEMPOOL_BACKEND_IMAGE` in image-versions.sh
- Nextcloud downgraded from 30 to 29 (one major version upgrade at a time)

### Session 2 fixes (same day)

**Critical pattern found: Container credential mismatches**

- Deploy generates random passwords stored in `secrets/`. MariaDB/Postgres only use env vars on FIRST init; subsequent restarts ignore them. Container recreation with new passwords → auth failures → crash loops.
- 50,000+ cumulative container restarts across fleet from this single root cause.

**Fixes applied to all nodes:**

1. LND: `lnd.conf` rpcpass synced from `secrets/bitcoin-rpc-password` (was hardcoded `archipelago123`)
2. MariaDB mempool: data dirs wiped + reinitialized (password mismatch unrecoverable)
3. BTCPay Postgres: `ALTER USER` to sync password with secrets
4. FileBrowser: `--user 0:0` instead of `--cap-add NET_BIND_SERVICE` (rootless port 80 fix)
5. Nextcloud: same `--user 0:0` fix
6. Tailscale container on .228: removed (2,685 restarts; unauthenticated, host already has TS)

**Deploy script fixes:**

- `deploy-tailscale.sh`: LND config always synced before start, `eval "$DB_PASSWORDS"` → safe individual reads, MariaDB password sync step, filebrowser `--user 0:0`
- `deploy-to-target.sh`: LND stale config check now compares passwords (not just cookie/localhost), filebrowser `--user 0:0`

**Rootless port 80 rule**: Containers binding port 80 MUST use `--user 0:0`. `NET_BIND_SERVICE` cap doesn't work in rootless (UID 0 → host 100000, unprivileged).
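The switch from `eval "$DB_PASSWORDS"` to individual reads noted under the deploy script fixes could look roughly like the sketch below. It is not the code in `deploy-tailscale.sh`: the helper name `read_secret` and its argument layout are hypothetical, and only the `secrets/bitcoin-rpc-password` file name comes from these notes. The point is that the secret is treated as data and never passed through `eval`, so shell metacharacters in a generated password cannot be executed.

```bash
#!/usr/bin/env bash
# Hypothetical helper (name and arguments are assumptions, not from the repo):
# read one secret from <dir>/<name> without eval.
read_secret() {
  local dir="$1" name="$2" file value=""
  file="$dir/$name"
  if [ ! -r "$file" ]; then
    echo "missing secret: $name" >&2
    return 1
  fi
  # Read the first line verbatim; "|| true" tolerates a file with no trailing newline.
  IFS= read -r value < "$file" || true
  if [ -z "$value" ]; then
    echo "empty secret: $name" >&2
    return 1
  fi
  printf '%s' "$value"
}

# Example use (directory layout assumed):
#   BITCOIN_RPC_PASS="$(read_secret secrets bitcoin-rpc-password)" || exit 1
```

Failing loudly on a missing or empty file is a deliberate choice here: an empty password silently written into a config is the same class of mismatch that caused the crash loops described above.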
### Session 3 fixes (2026-03-22 to 2026-03-24)

**Additional container fixes applied live:**

- PhotoPrism: recreated with proper `/photoprism/storage`, `/photoprism/originals`, `/photoprism/import` volume mounts (all 3 nodes)
- Vaultwarden/Jellyfin: recreated with `--user 0:0` + health checks (Arch 1/2)
- Nextcloud: downgraded image to v29 (data initialized with v28, can't skip to v30)
- Fedimint: upgraded v0.5.1 → v0.10.0 on all Tailscale nodes
- Fedimint-gateway: bcrypt hash passed via file mount (shell escaping workaround)
- SearXNG: recreated with proper caps on Arch 2
- Arch 3 right-sized: stopped immich (3), jellyfin, vaultwarden, nbxplorer (7.3GB RAM)

**Deploy script improvements (6 commits pushed):**

1. `d37165ca`: Credential sync, health checks, rootless port binding
2. `f5714a5b`: Fleet deploy falls back to Tailscale when LAN unreachable, `--all` alias
3. `028248df`: Suppress tar xattr spam in AIUI deploy (`--no-xattrs`)
4. `f5802f9e`: Fix LND config SSH escaping, Tailscale fallback for BUILD_SOURCE
5. `06d85e1d`: Fix health check escaping for SSH heredoc (`--health-cmd 'cmd'` not `"cmd"`)
6. `a7920de8`: Correct health check endpoints (fedimint→8175, nextcloud→`/`, filebrowser→`/`)

**Health checks added to deploy-tailscale.sh:**

- 25 containers now have `--health-cmd` in deploy-tailscale.sh (was zero)
- Key corrections: fedimint checks port 8175 (UI) not 8174 (websocket), nextcloud/filebrowser check `/` not custom endpoints

**Fleet status at end of session:**

| Node | Status | Notes |
|------|--------|-------|
| .228 | 36/36, 0 unhealthy, load 1.0 | Fully stable |
| Arch 1 | 25/25, 0 unhealthy, load 0.5 | Fully stable |
| Arch 2 | 25/25, 0 unhealthy, load 0.2 | Fully stable |
| Arch 3 | 24/28, 0 unhealthy, load 7.7 | Right-sized for 7.3GB RAM, Bitcoin IBD at 97.8% |
| .198 | Bitcoin chain data empty (4KB) | Needs full IBD, will take days. Not pruned. |

### Remaining for next session

- **.198**: Bitcoin doing full IBD from scratch (chain data was lost/empty). No prune flag set. Will take days.
- **Arch 3**: Bitcoin IBD was at 97.8%. Check if complete, then start LND/nbxplorer
- **Tor config Python syntax errors** in deploy-to-target.sh step 33 (cosmetic, falls back to system Tor)
- **deploy-to-target.sh** still missing health checks (only deploy-tailscale.sh has them)
- **first-boot-containers.sh** needs same rootless fixes (filebrowser `--user 0:0`, credential sync)
- **Fedimint guardian setup** not done on any node; all in "Setup UI" mode
- User needs to `git pull && ./scripts/deploy-to-target.sh --all` to deploy latest fixes to Tailscale nodes
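For the Arch 3 item above (check whether IBD has finished before starting LND/nbxplorer), one possible check is sketched below. It relies on the `initialblockdownload` field that `bitcoin-cli getblockchaininfo` reports; the function name `ibd_complete` and the container name in the usage comment are assumptions, not taken from the deploy scripts. A plain `grep` is used so the check does not depend on `jq` being installed on the node.

```bash
#!/usr/bin/env bash
# Sketch: succeed only when getblockchaininfo JSON on stdin says IBD has finished.
# Any other input (IBD still running, empty output, RPC error text) fails the check.
ibd_complete() {
  grep -Eq '"initialblockdownload"[[:space:]]*:[[:space:]]*false'
}

# Hypothetical usage; the container name "bitcoin" is an assumption:
#   if podman exec bitcoin bitcoin-cli getblockchaininfo | ibd_complete; then
#     echo "IBD complete, safe to start LND/nbxplorer"
#   else
#     echo "IBD still in progress (or RPC unavailable), leave LND/nbxplorer stopped"
#   fi
```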