archy/.claude/plans/tailscale-migration.md
Dorian f20f0650cf feat: Discover view, Fleet dashboard, MeshMap, type fixes
2026-03-19 16:12:01 +00:00

Plan: Seamless Tailscale Migration for Alpha Testers

Context

Tailscale nodes (Arch 1/2/3) are alpha tester machines. They need a full deployment (binary, frontend, infrastructure, and containers) with zero friction. Currently, deploy-tailscale.sh (85 lines) deploys only the binary and frontend; it skips all of the infrastructure that deploy-to-target.sh --live provides (rootless prereqs, UID mapping, containers, nginx, Tor, HTTPS, dev mode, UFW, etc.).

These nodes may also have old rootful containers that need migrating to rootless.

Approach

Don't refactor the 1615-line deploy-to-target.sh — too risky during beta freeze. Instead:

  1. Rewrite deploy-tailscale.sh as a full-deploy script with split-mode SSH resilience
  2. Add --tailscale flag to deploy-to-target.sh as a convenience wrapper
  3. Add rootful→rootless migration as an automatic pre-step
  4. Fix first-boot-containers.sh for rootless (separate concern, for ISO builds)

Changes

1. Rewrite scripts/deploy-tailscale.sh (~400 lines)

Currently 85 lines doing only binary+frontend. Rewrite to be a full deploy for any node, using split-mode SSH (each step = separate short SSH session) for Tailscale stability.
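A minimal sketch of what one "split-mode" step could look like, assuming a hypothetical helper named `run_step`. `SSH_CMD`, `STEP_RETRIES` and `STEP_RETRY_DELAY` are illustrative names, not taken from the existing scripts. It relies on the documented ssh behaviour of exiting with 255 for its own (connection) errors and otherwise returning the remote command's status, so only dropped sessions are retried.

```shell
# Hypothetical helper: run one deploy step in its own short SSH session.
# All variable and function names here are illustrative assumptions.
SSH_CMD=${SSH_CMD:-ssh -o BatchMode=yes -o ConnectTimeout=10 -o ServerAliveInterval=15}
STEP_RETRIES=${STEP_RETRIES:-3}
STEP_RETRY_DELAY=${STEP_RETRY_DELAY:-5}

run_step() {
    local target="$1" label="$2" remote_cmd="$3" attempt=1 rc=0
    while [ "$attempt" -le "$STEP_RETRIES" ]; do
        echo "[$label] attempt $attempt/$STEP_RETRIES" >&2
        rc=0
        # SSH_CMD is intentionally unquoted so it can carry options.
        $SSH_CMD "$target" "$remote_cmd" || rc=$?
        # 255 means ssh itself failed (e.g. the link dropped); anything else
        # is the remote command's own exit status and is returned as-is.
        if [ "$rc" -ne 255 ]; then
            return "$rc"
        fi
        attempt=$((attempt + 1))
        [ "$attempt" -le "$STEP_RETRIES" ] && sleep "$STEP_RETRY_DELAY"
    done
    echo "[$label] SSH failed after $STEP_RETRIES attempts" >&2
    return 255
}

# Example (not executed here):
#   run_step "$TARGET" "restart services" 'sudo systemctl restart nginx'
```

Retrying is only safe when the step is idempotent, since a session can drop after the remote command has already run.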

Steps the new script will run (each as its own SSH session):

  1. SSH connectivity check
  2. Install prerequisites (rsync, node, npm) if missing
  3. Rsync code to target
  4. Rootful→rootless migration (detect sudo podman ps -a, stop & remove old rootful containers)
  5. Build frontend (nohup + poll, or skip if copy-only node)
  6. Build backend (nohup + poll, or skip if copy-only node)
  7. Create rollback backup
  8. Deploy binary (build locally or copy from .228)
  9. Deploy frontend (build locally or copy from .228)
  10. Deploy AIUI
  11. Sync nginx config + HTTPS snippets
  12. Sync systemd service
  13. Setup rootless prereqs (sysctl, linger, podman.socket)
  14. Create data dirs + UID mapping (full chown table from deploy-to-target.sh:670-689)
  15. Dev mode (ARCHIPELAGO_DEV_MODE=true for HTTP cookies over Tailscale)
  16. Deploy nostr-provider.js
  17. Deploy Claude API proxy (if ANTHROPIC_API_KEY available)
  18. Setup NTP + swap
  19. Restart services
  20. Setup HTTPS (with node's own IP in SAN)
  21. Read Bitcoin RPC credentials from server secrets
  22. Create all containers (Bitcoin, Mempool, BTCPay, ElectrumX, LND, Fedimint, Immich, HA, Grafana, Jellyfin, Vaultwarden, SearXNG, FileBrowser)
  23. Setup Tor hidden services
  24. Fix UFW forward policy
  25. Fix IndeedHub NIP-07 (if running)
  26. Transfer custom images for copy-only nodes (individual tarballs, never combined)
  27. Run container doctor
  28. Write deploy manifest
  29. Post-deploy health check
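The "nohup + poll" pattern in steps 5 and 6 could be sketched as follows. This is an illustration under assumptions: the helper names, the `/tmp/<tag>.status` and `/tmp/<tag>.log` paths, and the default intervals are all hypothetical. The build is started detached from the SSH session, and later short sessions only read a status file.

```shell
# Hypothetical helpers; names and /tmp paths are illustrative assumptions.
SSH_CMD=${SSH_CMD:-ssh -o BatchMode=yes -o ConnectTimeout=10}

# Start a long-running command on the target, detached from the SSH session.
# Its exit status is written to /tmp/<tag>.status when it finishes.
# Limitation: the command must not contain single quotes.
start_remote_job() {
    local target="$1" tag="$2" cmd="$3"
    $SSH_CMD "$target" "rm -f /tmp/$tag.status; nohup sh -c '($cmd) >/tmp/$tag.log 2>&1; echo \$? >/tmp/$tag.status' </dev/null >/dev/null 2>&1 &"
}

# Poll for the status file with short SSH sessions.
# Returns the job's exit status, or 124 if it has not finished in time.
poll_remote_job() {
    local target="$1" tag="$2" interval="${3:-10}" max_polls="${4:-180}" i=0 status
    while [ "$i" -lt "$max_polls" ]; do
        status=$($SSH_CMD "$target" "cat /tmp/$tag.status 2>/dev/null") || status=
        case $status in
            ''|*[!0-9]*) ;;            # not finished yet, or SSH hiccup: keep polling
            *) return "$status" ;;
        esac
        i=$((i + 1))
        sleep "$interval"
    done
    return 124
}

# Example (not executed here; the build command is a placeholder):
#   start_remote_job "$TARGET" backend-build 'cd ~/archy && cargo build --release'
#   poll_remote_job  "$TARGET" backend-build 10 180
```

A failed poll is treated the same as "not finished yet", so a brief Tailscale drop during the build only delays the result.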

Copy-only mode: When a target can't build (Arch 1/3), the script detects that cargo/npm are missing on the target and copies pre-built artifacts from .228 through an SSH pipe.
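Copy-only detection and the SSH pipe could look roughly like this. The function names are hypothetical, and treating a node as buildable only when both cargo and npm are present is an assumption about the intended rule.

```shell
# Hypothetical helpers; names are illustrative assumptions.
SSH_CMD=${SSH_CMD:-ssh -o BatchMode=yes -o ConnectTimeout=10}

# Prints "build" if both cargo and npm are on the target's PATH,
# otherwise "copy-only". An unreachable target also reports "copy-only",
# so run the connectivity check first.
detect_mode() {
    local target="$1"
    if $SSH_CMD "$target" 'command -v cargo >/dev/null 2>&1 && command -v npm >/dev/null 2>&1'; then
        echo build
    else
        echo copy-only
    fi
}

# Stream one pre-built file from the build host to the target through the
# local machine, without needing direct connectivity between the two nodes.
copy_via_pipe() {
    local build_host="$1" src="$2" target="$3" dest="$4"
    $SSH_CMD "$build_host" "cat '$src'" | $SSH_CMD "$target" "cat > '$dest'"
}
```

Non-interactive SSH sessions often have a shorter PATH than login shells, so a cargo installed under `~/.cargo/bin` may be reported as missing unless the PATH is set explicitly. The pipeline's exit status reflects only the receiving side, so comparing a checksum on both ends afterwards is a reasonable extra check.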

Key sections to port from deploy-to-target.sh:

  • Lines 646-689 — rootless prereqs + UID mapping
  • Lines 629-641 — dev mode
  • Lines 839-1474 — all container creation
  • Lines 1143-1234 — Tor setup
  • Lines 1477-1485 — UFW fix
  • Lines 1487-1545 — IndeedHub NIP-07

2. Add --tailscale flag to deploy-to-target.sh (~30 lines)

Wrapper that calls deploy-tailscale.sh for each node sequentially. Also add --tailscale-node=arch1|arch2|arch3 for single-node targeting.
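One possible shape for the flag handling, as a sketch only. The function name `select_tailscale_nodes` and the `TAILSCALE_NODES` variable are hypothetical, and passing the node name as the first argument to deploy-tailscale.sh is an assumption about its interface.

```shell
# Hypothetical flag handling; names are illustrative assumptions.
TAILSCALE_NODES="arch1 arch2 arch3"

# Prints the selected nodes, one per line. Prints nothing when no
# Tailscale flag is present. Returns 2 on an unknown node name.
select_tailscale_nodes() {
    local arg node
    for arg in "$@"; do
        case $arg in
            --tailscale)
                # Unquoted on purpose: split the list into one node per line.
                printf '%s\n' $TAILSCALE_NODES
                ;;
            --tailscale-node=*)
                node="${arg#--tailscale-node=}"
                case " $TAILSCALE_NODES " in
                    *" $node "*) printf '%s\n' "$node" ;;
                    *) echo "unknown Tailscale node: $node" >&2; return 2 ;;
                esac
                ;;
        esac
    done
}

# Example wrapper loop (not executed here), stopping at the first failure:
#   nodes=$(select_tailscale_nodes "$@") || exit 2
#   for node in $nodes; do
#       ./scripts/deploy-tailscale.sh "$node" || exit 1
#   done
```

Stopping at the first failed node keeps a broken build from being rolled out to the remaining testers.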

3. Rootful→rootless migration (in deploy-tailscale.sh step 4)

Auto-detect and handle:

ssh TARGET 'if [ -n "$(sudo podman ps -aq 2>/dev/null)" ]; then sudo podman stop --all; sudo podman rm --all; fi'

Data safe — /var/lib/archipelago/ never deleted, only ownership fixed by UID mapping step.

4. Fix scripts/first-boot-containers.sh (5 targeted edits)

  • Line 15: Change root check → archipelago user check (UID 1000)
  • Line 140: Change 10.88.0.0/16 → 0.0.0.0/0 (match deploy-to-target.sh)
  • After line 111: Add rootless prereqs (sysctl, linger, podman.socket)
  • After line 113: Add full UID mapping block
  • Pin :latest tags: photoprism, ollama, searxng, nginx-proxy-manager, penpot
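Two of these edits lend themselves to small helpers, sketched here under assumptions. The function names are hypothetical, and the real checks in first-boot-containers.sh may be structured differently.

```shell
# Hypothetical helpers; names are illustrative assumptions.

# Replacement for the root check: succeed only for UID 1000 (the
# archipelago user). The UID can be passed in for testing.
require_archipelago_user() {
    local uid="${1:-$(id -u)}"
    if [ "$uid" -ne 1000 ]; then
        echo "must run as the archipelago user (UID 1000), not UID $uid" >&2
        return 1
    fi
}

# List lines in a script that still reference an image with a :latest tag,
# prefixed with their line numbers. Returns non-zero when none are found.
find_latest_tags() {
    grep -nE '[[:alnum:]_./-]+:latest' "$1"
}

# Example (not executed here):
#   require_archipelago_user || exit 1
#   find_latest_tags scripts/first-boot-containers.sh
```

Running the second helper after the edit gives a quick confirmation that no unpinned tags remain.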

5. Update scripts/setup-https-dev.sh

Dynamic SAN — detect node's own IPs (including Tailscale interface) instead of hardcoding .228/.198.
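A sketch of how the SAN list could be built dynamically. The function names are hypothetical, and always including `localhost` and `127.0.0.1` is an assumption. Address discovery uses `ip -4 -o addr show scope global`, which normally covers the Tailscale interface too. `tailscale ip -4` is added as a fallback.

```shell
# Hypothetical helpers; names are illustrative assumptions.

# Print the node's global IPv4 addresses, one per line, de-duplicated.
list_node_ips() {
    {
        ip -4 -o addr show scope global 2>/dev/null | awk '{print $4}' | cut -d/ -f1
        command -v tailscale >/dev/null 2>&1 && tailscale ip -4 2>/dev/null
    } | grep -E '^[0-9]+(\.[0-9]+){3}$' | sort -u
}

# Build a subjectAltName value from the IPs given as arguments.
build_san() {
    local san="DNS:localhost,IP:127.0.0.1" ip
    for ip in "$@"; do
        san="$san,IP:$ip"
    done
    printf '%s\n' "$san"
}

# Example (not executed here; requires OpenSSL 1.1.1+ for -addext):
#   san=$(build_san $(list_node_ips))
#   openssl req -x509 -newkey rsa:2048 -nodes -days 365 \
#       -keyout dev.key -out dev.crt -subj "/CN=archipelago-dev" \
#       -addext "subjectAltName=$san"
```

Whether setup-https-dev.sh generates its certificate with `openssl req` is not stated here, so the example invocation and its file names are placeholders.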

Files Modified

| File | Change | ~Lines |
| --- | --- | --- |
| scripts/deploy-tailscale.sh | Full rewrite: complete deploy with split-mode SSH | ~400 |
| scripts/deploy-to-target.sh | Add --tailscale / --tailscale-node flags | ~30 |
| scripts/first-boot-containers.sh | Fix for rootless (subnet, UID mapping, prereqs) | ~40 |
| scripts/setup-https-dev.sh | Dynamic SAN with Tailscale IPs | ~15 |
| docs/BETA-PROGRESS.md | Update TASK-11 status | ~5 |

Auth State Preservation

Deploys never delete or overwrite the user state in /var/lib/archipelago/:

  • sessions.json, user.json, identities/, secrets/, federation/
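One way to check this claim during testing is to fingerprint the state directory before and after a deploy. The sketch below uses a hypothetical helper, `state_fingerprint`, and assumes GNU `find`, `sort`, `xargs` and `sha256sum` on the node. It hashes file contents only, so the ownership changes made by the UID mapping step do not alter the result.

```shell
# Hypothetical helper; the name is an illustrative assumption.
# Prints one SHA-256 digest summarising the contents and relative paths
# of every regular file under the given directory.
state_fingerprint() {
    local dir="$1"
    (
        cd "$dir" || exit 1
        find . -type f -print0 | LC_ALL=C sort -z | xargs -0 -r sha256sum | sha256sum | cut -d' ' -f1
    )
}

# Example (not executed here; run as a user that can read secrets/):
#   before=$(state_fingerprint /var/lib/archipelago/identities)
#   ... deploy ...
#   after=$(state_fingerprint /var/lib/archipelago/identities)
#   [ "$before" = "$after" ] || echo "identities changed during deploy" >&2
```

Files that legitimately change while the service runs, such as sessions.json after a login, need to be compared with that in mind.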

Verification

  1. Deploy to Arch 2 first (has build tools, safest test)
  2. Then Arch 1/3 (copy-only mode)
  3. For each node: podman ps shows containers, curl /health returns 200, UI loads, login works
  4. Run container doctor — 0 fixes needed
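The automatable parts of step 3 could be scripted along these lines. The helper name `verify_node`, the `CURL_CMD` override, and the base URL passed in are assumptions. UI loading and login remain manual checks.

```shell
# Hypothetical helper; names are illustrative assumptions.
SSH_CMD=${SSH_CMD:-ssh -o BatchMode=yes -o ConnectTimeout=10}
CURL_CMD=${CURL_CMD:-curl}

# Succeeds only if <base_url>/health returns 200 and the target has at
# least one running container under the SSH user's rootless podman.
verify_node() {
    local target="$1" base_url="$2" code running failures=0
    code=$($CURL_CMD -s -o /dev/null -w '%{http_code}' --max-time 10 "$base_url/health") || code=000
    if [ "$code" != 200 ]; then
        echo "FAIL $target: /health returned $code" >&2
        failures=$((failures + 1))
    fi
    running=$($SSH_CMD "$target" 'podman ps -q | wc -l | tr -d " "') || running=0
    if [ "${running:-0}" -eq 0 ]; then
        echo "FAIL $target: no running containers" >&2
        failures=$((failures + 1))
    fi
    [ "$failures" -eq 0 ]
}

# Example (not executed here; target and URL are placeholders):
#   verify_node "$TARGET" "http://$TARGET_IP" && echo "$TARGET OK"
```

Both checks run even if the first one fails, so a single pass reports every problem on the node.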

Order

  1. Rewrite deploy-tailscale.sh (main deliverable)
  2. Add --tailscale flags to deploy-to-target.sh
  3. Fix first-boot-containers.sh
  4. Update setup-https-dev.sh
  5. Test: Arch 2 → Arch 1 → Arch 3
  6. Update BETA-PROGRESS.md