94 lines
3.9 KiB
Markdown
Raw Normal View History

fix: overhaul container lifecycle — recovery, health, uninstall, UI state Container recovery: - Health monitor: MAX_RESTART_ATTEMPTS 3→10, interval 60s→120s - Dependency-aware restarts: won't restart services before their deps - Reset dependent counters when a dependency recovers - Handle "created" state containers (were invisible to health monitor) - Added IndeedHub, mempool-api, mysql to tier system - Crash recovery: podman start timeout 30s→120s with retry - Podman client: socket timeout 5s→30s, added restart policy UI state representation: - Exit code 0 shows "stopped" (gray), not "crashed" (red) - Exit code 137 shows "killed (OOM)" - Non-zero exit shows "crashed" (red) - Added exit_code field to PackageDataEntry Install/uninstall fixes: - Install returns error when container doesn't start (was silent success) - Post-install hooks awaited instead of fire-and-forget tokio::spawn - Uninstall: graceful rm before force, volume prune, network cleanup - Uninstall returns error on partial failure (was 200 OK) Config consistency: - DB passwords read from /var/lib/archipelago/secrets/ (was hardcoded) - Bitcoin: added ZMQ ports 28332/28333 for LND block notifications - IndeedHub port 7777→8190 (was conflicting with strfry) - Marketplace versions: LND 0.17.4→0.18.4, Mempool 2.5.0→3.0.0 Performance: - Metrics collector interval 60s→300s (was duplicating health monitor) - Podman client: proper error propagation instead of unwrap_or_default Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-03-31 07:03:57 +01:00
# Rootless Podman UID Mapping Reference
## How Rootless UID Mapping Works
When Podman runs as the `archipelago` user (UID 1000), container processes don't run as their "apparent" UID on the host. Instead, Linux user namespaces remap UIDs.
**Mapping formula**: `host_uid = 100000 + container_uid`
This is configured in `/etc/subuid` and `/etc/subgid`:
```
archipelago:100000:65536
```
This means:
- Container UID 0 (root inside container) → Host UID 100000 (unprivileged on host)
- Container UID 70 (postgres) → Host UID 100070
- Container UID 101 (bitcoin) → Host UID 100101
- etc.
## Why This Matters
Volume directories (bind mounts) on the host must be owned by the **mapped** UID, not the container UID. If Bitcoin runs as UID 101 inside its container, the host directory must be owned by UID 100101.
If ownership is wrong, the container gets `permission denied` when trying to read/write its data.
## Complete UID Mapping Table
| Container UID | Host UID | Containers | Fix Command |
|---|---|---|---|
| 0 (root) | 100000 | lnd, fedimint, fedimint-gateway, homeassistant, jellyfin, vaultwarden, photoprism, ollama, filebrowser, electrumx, btcpay-server, nbxplorer, immich, nostr-rs-relay, strfry, nextcloud, searxng, onlyoffice, tailscale, uptime-kuma | `sudo chown -R 100000:100000 /var/lib/archipelago/{app}` |
| 70 | 100070 | postgres (btcpay-db, immich-db, penpot-postgres) | `sudo chown -R 100070:100070 /var/lib/archipelago/postgres-*` |
| 101 | 100101 | bitcoin-knots, bitcoin-core | `sudo chown -R 100101:100101 /var/lib/archipelago/bitcoin` |
| 472 | 100472 | grafana | `sudo chown -R 100472:100472 /var/lib/archipelago/grafana` |
| 999 | 100999 | MariaDB (mysql-mempool) | `sudo chown -R 100999:100999 /var/lib/archipelago/mysql-mempool` |
## How to Find a Container's UID
If you encounter a new container with permission issues:
```bash
# Check what user the container runs as
podman inspect CONTAINER_NAME --format "{{.Config.User}}"
# If empty, it runs as root (UID 0) → host UID 100000
# If it shows a username, find the UID inside the image
podman run --rm IMAGE_NAME id
# Then calculate: host_uid = 100000 + container_uid
```
## Fix Script
Run this after any fresh install, migration, or when containers have permission errors:
```bash
#!/bin/bash
# Fix all rootless podman volume ownership
# UID 0 → 100000 (most containers)
for dir in lnd fedimint fedimint-gateway homeassistant jellyfin vaultwarden photoprism \
ollama filebrowser electrumx btcpay nbxplorer immich nostr-rs-relay nextcloud \
searxng onlyoffice uptime-kuma; do
[ -d "/var/lib/archipelago/$dir" ] && sudo chown -R 100000:100000 "/var/lib/archipelago/$dir"
done
# UID 101 → 100101 (Bitcoin)
[ -d "/var/lib/archipelago/bitcoin" ] && sudo chown -R 100101:100101 /var/lib/archipelago/bitcoin
# UID 70 → 100070 (PostgreSQL)
for dir in /var/lib/archipelago/postgres-* /var/lib/archipelago/btcpay-db /var/lib/archipelago/immich-db; do
[ -d "$dir" ] && sudo chown -R 100070:100070 "$dir"
done
# UID 999 → 100999 (MariaDB)
[ -d "/var/lib/archipelago/mysql-mempool" ] && sudo chown -R 100999:100999 /var/lib/archipelago/mysql-mempool
# UID 472 → 100472 (Grafana)
[ -d "/var/lib/archipelago/grafana" ] && sudo chown -R 100472:100472 /var/lib/archipelago/grafana
```
## Rootful vs Rootless Comparison
| Aspect | Rootful (old) | Rootless (current) |
|--------|---------------|-------------------|
| Podman command | `sudo podman` | `podman` (as archipelago user) |
| Container storage | `/var/lib/containers/storage` | `~/.local/share/containers/storage` |
| Container subnet | `10.88.0.0/16` | `10.89.0.0/16` |
| Volume ownership | Container UID directly | Mapped UID (100000 + container_uid) |
| Requires root? | Yes | No (except fixing volume ownership) |
| XDG_RUNTIME_DIR | Not needed | Required: `/run/user/1000` |
| User lingering | Not needed | Required: `loginctl enable-linger` |
| Systemd restrictions | All can be enabled | Must disable: RestrictNamespaces, SystemCallFilter |