| name | description | type |
|---|---|---|
| Deploy session 2026-03-22 findings | Comprehensive deploy/build fixes made overnight — container issues, image tags, script improvements, remaining work | project |
Session Summary (2026-03-22 overnight)
Massive deploy infrastructure overhaul across all 5 nodes (.228, .198, Arch 1/2/3).
Fixed in deploy-tailscale.sh
- Image tags: Bitcoin Knots `28.1` (not `v28.1`), BTCPay `1.13.7` (not `1.14.5`), SearXNG `2026.3.20-6c7e9c197`
- Removed Immich (3 containers) and Penpot (5 containers) from deploy + build
- Fedimint: `FM_REL_NOTES_ACK=0_4_xyz` env var (NOT `FM_SKIP_REL_NOTES_ACK` or `FM_REQ_RELEASE_NOTES_ACK_V0_4`)
- Fedimint-gateway: `--password` instead of `--bcrypt-password-hash` (v0.5.1 CLI change)
- FileBrowser: added `--cap-add NET_BIND_SERVICE` for port 80 binding
- SearXNG: added `/var/lib/archipelago/searxng:/etc/searxng` volume mount + caps
- Postgres: pinned to `postgres:15` (data initialized with 15, incompatible with 16)
- Migration: one-time flag file `/var/lib/archipelago/.rootless-migrated`
- Recreate-if-broken pattern: containers that exist but are stopped get deleted and recreated
- Arch 2 hostname: fixed from hardcoded hostname to `$TAILSCALE_ARCH2`
- Custom UI images: graceful skip if not available, source extracted to repo (`docker/bitcoin-ui/`, `docker/electrs-ui/`)
- AIUI tar xattr: silenced with `--no-xattrs` (only in deploy-tailscale.sh, NOT deploy-to-target.sh yet)
- Nginx MIME warning: removed `text/html` from `sub_filter_types`
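
The recreate-if-broken pattern from the list above could look roughly like the sketch below. It is not the code in deploy-tailscale.sh: `CONTAINER_CLI`, `container_state`, `decide_action`, and `ensure_container` are hypothetical names, and podman is assumed as the container CLI.

```shell
# Sketch only: all function and variable names here are hypothetical.
CONTAINER_CLI="${CONTAINER_CLI:-podman}"

# Print the container's state ("running", "exited", ...) or "missing".
container_state() {
  if ! "$CONTAINER_CLI" container exists "$1"; then
    echo missing
    return 0
  fi
  "$CONTAINER_CLI" inspect --format '{{.State.Status}}' "$1"
}

# Map a state to an action: keep running containers, create missing ones,
# and recreate anything that exists but is not running.
decide_action() {
  case "$1" in
    running) echo keep ;;
    missing) echo create ;;
    *)       echo recreate ;;
  esac
}

# ensure_container NAME CMD...: CMD is the full "run" command for NAME.
ensure_container() {
  local name="$1"; shift
  case "$(decide_action "$(container_state "$name")")" in
    keep)     return 0 ;;
    recreate) "$CONTAINER_CLI" rm -f "$name" >/dev/null ;;
  esac
  "$@"
}
```

A call would pass the container name followed by the complete run command, so a stopped container is removed first and then started fresh with the current flags.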
Added
- `--fleet` flag in deploy-to-target.sh: deploys .228 → .198 → Arch 1/2/3
- `--both` lock fix: releases lock before recursive `--live` call
- Container verification step (Step 26b): restarts exited containers, fixes permissions, checks Tor
- IndeedHub backend stack rebuilt on .228 (7 containers)
- IndeedHub nginx patched with direct IPs (podman DNS doesn't work with nginx resolver)
Frontend changes
- Replaced Immich with FileBrowser on Setup homescreen (`goals.ts`, `EasyHome.vue`)
- `MEMPOOL_API_IMAGE` renamed to `MEMPOOL_BACKEND_IMAGE` in image-versions.sh
- Nextcloud downgraded from 30 to 29 (one major version upgrade at a time)
Session 2 fixes (same day)
Critical pattern found: Container credential mismatches
- Deploy generates random passwords stored in `secrets/`. MariaDB/Postgres only use env vars on FIRST init; subsequent restarts ignore them. Container recreation with new passwords → auth failures → crash loops.
- 50,000+ cumulative container restarts across fleet from this single root cause.
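
One way to avoid this class of mismatch is to generate each secret only once and reuse it on every later deploy. The sketch below illustrates that idea under assumptions: `ensure_secret` is a hypothetical helper, not a function from the deploy scripts, and the 16-byte hex format is arbitrary.

```shell
# ensure_secret FILE (hypothetical helper): print the stored password,
# generating one only if FILE is missing or empty, so that recreating a
# container never rotates the credential its database was initialized with.
ensure_secret() {
  local file="$1"
  if [ ! -s "$file" ]; then
    mkdir -p "$(dirname "$file")"
    # Subshell keeps the restrictive umask local to this write.
    ( umask 077; od -An -N16 -tx1 /dev/urandom | tr -d ' \n' > "$file" )
  fi
  cat "$file"
}
```

Because the function is idempotent, calling it before every container start returns the same value the database saw on first init.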
Fixes applied to all nodes:
- LND: `lnd.conf` rpcpass synced from `secrets/bitcoin-rpc-password` (was hardcoded `archipelago123`)
- MariaDB mempool: data dirs wiped + reinitialized (password mismatch unrecoverable)
- BTCPay Postgres: `ALTER USER` to sync password with secrets
- FileBrowser: `--user 0:0` instead of `--cap-add NET_BIND_SERVICE` (rootless port 80 fix)
- Nextcloud: same `--user 0:0` fix
- Tailscale container on .228: removed (2,685 restarts; unauthenticated, host already has TS)
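
The LND password sync can be sketched as below. This is an illustration, not the deployed code: `sync_lnd_rpcpass` is a hypothetical name, and the sketch assumes the password lives on a `bitcoind.rpcpass=` line in `lnd.conf`.

```shell
# sync_lnd_rpcpass CONF SECRET_FILE (hypothetical helper): rewrite the
# bitcoind.rpcpass line in CONF to match SECRET_FILE.
# Prints "updated", "unchanged", or "missing" (key not present in CONF).
sync_lnd_rpcpass() {
  local conf="$1" secret_file="$2" want have tmp
  want="$(cat "$secret_file")" || return 1
  if ! grep -q '^bitcoind\.rpcpass=' "$conf"; then
    echo missing
    return 1
  fi
  have="$(sed -n 's/^bitcoind\.rpcpass=//p' "$conf" | head -n 1)"
  if [ "$have" = "$want" ]; then
    echo unchanged
    return 0
  fi
  tmp="$(mktemp)" || return 1
  # ENVIRON passes the password verbatim: no sed metacharacter or
  # awk -v backslash processing to worry about.
  if WANT="$want" awk '
       /^bitcoind\.rpcpass=/ { print "bitcoind.rpcpass=" ENVIRON["WANT"]; next }
       { print }
     ' "$conf" > "$tmp" && cat "$tmp" > "$conf"; then
    rm -f "$tmp"
    echo updated
  else
    rm -f "$tmp"
    return 1
  fi
}
```

Writing back with `cat "$tmp" > "$conf"` keeps the original file's owner and mode, which matters when the container reads the config as a different user.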
Deploy script fixes:
- `deploy-tailscale.sh`: LND config always synced before start, `eval "$DB_PASSWORDS"` → safe individual reads, MariaDB password sync step, filebrowser `--user 0:0`
- `deploy-to-target.sh`: LND stale config check now compares passwords (not just cookie/localhost), filebrowser `--user 0:0`
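
Replacing `eval "$DB_PASSWORDS"` with individual reads might look like the sketch below. It assumes `DB_PASSWORDS` holds `KEY=VALUE` lines; `get_password` and the key names in the usage note are hypothetical.

```shell
# get_password KEY TEXT (hypothetical helper): print the value of KEY from
# KEY=VALUE lines in TEXT. The value is treated as plain data and is never
# evaluated, so shell metacharacters in a password cannot run commands.
get_password() {
  printf '%s\n' "$2" | while IFS= read -r line; do
    case "$line" in
      "$1="*)
        # Strip everything up to the first "=", keeping any later "=" signs.
        printf '%s\n' "${line#*=}"
        break
        ;;
    esac
  done
}
```

Usage would be one assignment per secret, for example `MARIADB_PASSWORD="$(get_password MARIADB_PASSWORD "$DB_PASSWORDS")"`, so only the expected variables are ever set.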
Rootless port 80 rule: Containers binding port 80 MUST use --user 0:0. NET_BIND_SERVICE cap doesn't work in rootless (UID 0 → host 100000, unprivileged).
Session 3 fixes (2026-03-22 to 2026-03-24)
Additional container fixes applied live:
- PhotoPrism: recreated with proper `/photoprism/storage`, `/photoprism/originals`, `/photoprism/import` volume mounts (all 3 nodes)
- Vaultwarden/Jellyfin: recreated with `--user 0:0` + health checks (Arch 1/2)
- Nextcloud: downgraded image to v29 (data initialized with v28, can't skip to v30)
- Fedimint: upgraded v0.5.1 → v0.10.0 on all Tailscale nodes
- Fedimint-gateway: bcrypt hash passed via file mount (shell escaping workaround)
- SearXNG: recreated with proper caps on Arch 2
- Arch 3 right-sized: stopped immich (3), jellyfin, vaultwarden, nbxplorer (7.3GB RAM)
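
The Nextcloud constraint above (data at v28, so the image can be v29 but not v30) amounts to clamping each upgrade to one major version. A minimal sketch, with `next_major` as a hypothetical helper name:

```shell
# next_major CURRENT TARGET (hypothetical helper): print the major version to
# deploy next when upgrades may only advance one major version at a time.
next_major() {
  local current="$1" target="$2"
  if [ "$target" -le "$current" ]; then
    echo "$current"            # never go below the version the data is on
  elif [ "$target" -gt $((current + 1)) ]; then
    echo $((current + 1))      # clamp to a single-major step
  else
    echo "$target"
  fi
}
```

With data on 28 and a desired image of 30, `next_major 28 30` prints `29`, which matches the downgrade applied in this session.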
Deploy script improvements (6 commits pushed):
- `d37165ca`: Credential sync, health checks, rootless port binding
- `f5714a5b`: Fleet deploy falls back to Tailscale when LAN unreachable, `--all` alias
- `028248df`: Suppress tar xattr spam in AIUI deploy (`--no-xattrs`)
- `f5802f9e`: Fix LND config SSH escaping, Tailscale fallback for BUILD_SOURCE
- `06d85e1d`: Fix health check escaping for SSH heredoc (`--health-cmd 'cmd'` not `"cmd"`)
- `a7920de8`: Correct health check endpoints (fedimint→8175, nextcloud→/, filebrowser→/)
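
The LAN-to-Tailscale fallback from `f5714a5b` could be sketched as follows. This is an assumption about the approach, not the committed code: `reachable` and `pick_host` are hypothetical names, and the probe uses Linux iputils `ping` flags.

```shell
# reachable HOST (hypothetical): true if HOST answers one ping within 2 seconds.
# -W is a timeout in seconds on Linux iputils ping; other platforms differ.
reachable() {
  ping -c 1 -W 2 "$1" >/dev/null 2>&1
}

# pick_host LAN_HOST TAILSCALE_HOST (hypothetical): prefer the LAN address,
# fall back to the Tailscale address, fail if neither responds.
pick_host() {
  if reachable "$1"; then
    echo "$1"
  elif reachable "$2"; then
    echo "$2"
  else
    return 1
  fi
}
```

Keeping the probe in its own function makes it easy to swap ping for an SSH connection test if ICMP is filtered on a node.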
Health checks added to deploy-tailscale.sh:
- 25 containers now have `--health-cmd` in deploy-tailscale.sh (was zero)
- Key corrections: fedimint checks port 8175 (UI) not 8174 (websocket), nextcloud/filebrowser check `/` not custom endpoints
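
A helper for emitting these flags with single quotes, so that a remote shell does not expand anything inside the check command, might look like this. `health_flags` is a hypothetical name, the interval and retry values are placeholders, and the sketch assumes `curl` is available inside the container image.

```shell
# health_flags PORT PATH (hypothetical helper): print podman health-check
# flags with the check command wrapped in single quotes.
# Assumes curl exists inside the container; interval/retries are placeholders.
health_flags() {
  printf "%s 'curl -fsS http://localhost:%s%s || exit 1' %s\n" \
    "--health-cmd" "$1" "$2" "--health-interval 30s --health-retries 3"
}
```

For the corrected endpoints, `health_flags 8175 /` would cover the Fedimint UI port, and the same call with a different port would cover containers that are checked at `/`.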
Fleet status at end of session:
| Node | Status | Notes |
|---|---|---|
| .228 | 36/36, 0 unhealthy, load 1.0 | Fully stable |
| Arch 1 | 25/25, 0 unhealthy, load 0.5 | Fully stable |
| Arch 2 | 25/25, 0 unhealthy, load 0.2 | Fully stable |
| Arch 3 | 24/28, 0 unhealthy, load 7.7 | Right-sized for 7.3GB RAM, Bitcoin IBD at 97.8% |
| .198 | Bitcoin chain data empty (4KB) | Needs full IBD — will take days. Not pruned. |
Remaining for next session
- .198: Bitcoin doing full IBD from scratch (chain data was lost/empty). No prune flag set. Will take days.
- Arch 3: Bitcoin IBD was at 97.8% — check if complete, then start LND/nbxplorer
- Tor config Python syntax errors in deploy-to-target.sh step 33 (cosmetic, falls back to system Tor)
- deploy-to-target.sh still missing health checks (only deploy-tailscale.sh has them)
- first-boot-containers.sh needs same rootless fixes (filebrowser `--user 0:0`, credential sync)
- Fedimint guardian setup not done on any node; all in "Setup UI" mode
- User needs to `git pull && ./scripts/deploy-to-target.sh --all` to deploy latest fixes to Tailscale nodes