#!/bin/bash
#
# Bitcoin Core Setup Script for Archipelago
# Sets up Bitcoin Core in a Podman container
#
# Exit on errors, on use of unset variables, and on failed pipeline stages
set -euo pipefail
echo ""
echo "╔═══════════════════════════════════════════════════════════╗"
echo "║ ₿ BITCOIN CORE SETUP ║"
echo "╚═══════════════════════════════════════════════════════════╝"
echo ""
# Warn when running as root: the systemd --user service created below targets rootless Podman
if [ "$EUID" -eq 0 ]; then
    echo "⚠️ Running as root. For rootless Podman, run as a regular user."
fi
# Check for Podman and install it if missing (the install path uses apt-get)
if ! command -v podman >/dev/null 2>&1; then
    echo "📦 Installing Podman..."
    sudo apt-get update
    sudo apt-get install -y podman podman-compose
fi
# Create data directory
BITCOIN_DATA="${HOME}/.bitcoin"
mkdir -p "$BITCOIN_DATA"
echo "📍 Bitcoin data directory: $BITCOIN_DATA"
echo ""
# Ask for configuration
echo "Bitcoin Core Configuration:"
echo ""
# -r keeps backslashes in the answers literal instead of treating them as escapes
read -r -p "Enable pruning? (saves disk space) [y/N]: " PRUNE
read -r -p "Enable txindex? (required for some apps) [y/N]: " TXINDEX
read -r -p "RPC username [bitcoin]: " RPC_USER
RPC_USER="${RPC_USER:-bitcoin}"
read -r -p "RPC password [randomly generated]: " RPC_PASS
if [ -z "$RPC_PASS" ]; then
    # 16 random bytes, printed as 32 hex characters
    RPC_PASS=$(openssl rand -hex 16)
    echo " Generated RPC password: $RPC_PASS"
fi
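# Sketch only, not called anywhere in this script: the generated-password path
# above depends on openssl being installed. generate_rpc_password is a
# hypothetical helper that falls back to /dev/urandom when openssl is missing.
generate_rpc_password() {
    if command -v openssl >/dev/null 2>&1; then
        openssl rand -hex 16
    else
        # Read 16 random bytes and print them as 32 hex characters
        od -An -N16 -tx1 /dev/urandom | tr -d ' \n'
        echo
    fi
}
# Possible use: RPC_PASS=$(generate_rpc_password)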
# Create bitcoin.conf
cat > "$BITCOIN_DATA/bitcoin.conf" <<EOF
# Archipelago Bitcoin Core Configuration
# Network
server=1
listen=1
# RPC
rpcuser=$RPC_USER
rpcpassword=$RPC_PASS
rpcbind=0.0.0.0
rpcallowip=10.0.0.0/8
rpcallowip=172.16.0.0/12
rpcallowip=192.168.0.0/16
# Performance
dbcache=450
maxmempool=300
EOF
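if [[ "$PRUNE" =~ ^[Yy]$ ]]; then
    # prune=550 is the smallest target Bitcoin Core accepts (value is in MiB)
    echo "prune=550" >> "$BITCOIN_DATA/bitcoin.conf"
    echo " Pruning enabled (550 MiB)"
fi
if [[ "$TXINDEX" =~ ^[Yy]$ ]]; then
    if [[ "$PRUNE" =~ ^[Yy]$ ]]; then
        # bitcoind refuses to start when prune and txindex are both set
        echo " ⚠️ txindex is incompatible with pruning; txindex not enabled"
    else
        echo "txindex=1" >> "$BITCOIN_DATA/bitcoin.conf"
        echo " Transaction index enabled"
    fi
fi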
echo ""
echo "📋 Created bitcoin.conf"
echo ""
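# Sketch only, not called anywhere in this script: a way to review the
# generated bitcoin.conf without echoing the RPC password to the terminal.
# show_conf_redacted is a hypothetical helper name, not part of Archipelago.
show_conf_redacted() {
    # Print the file given as $1 with the rpcpassword value masked
    sed -E 's/^(rpcpassword=).*/\1********/' "$1"
}
# Possible use: show_conf_redacted "$BITCOIN_DATA/bitcoin.conf"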
# Pull Bitcoin Core image
echo "🐳 Pulling Bitcoin Core container image..."
podman pull docker.io/lncm/bitcoind:v27.0
# Create systemd user service for Bitcoin Core
mkdir -p ~/.config/systemd/user
cat > ~/.config/systemd/user/bitcoind.service <<EOF
[Unit]
Description=Bitcoin Core (Podman)
After=network-online.target
Wants=network-online.target
[Service]
Type=simple
ExecStartPre=-/usr/bin/podman stop bitcoind
ExecStartPre=-/usr/bin/podman rm bitcoind
ExecStart=/usr/bin/podman run --name bitcoind \\
--rm \\
-v ${BITCOIN_DATA}:/data/.bitcoin:Z \\
-p 8332:8332 \\
-p 8333:8333 \\
docker.io/lncm/bitcoind:v27.0
ExecStop=/usr/bin/podman stop bitcoind
Restart=always
RestartSec=10
[Install]
WantedBy=default.target
EOF
echo "📋 Created systemd service"
echo ""
# Enable and start service
systemctl --user daemon-reload
systemctl --user enable bitcoind.service
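# Sketch only, not called anywhere in this script: a unit enabled with
# WantedBy=default.target starts when the user's systemd instance starts,
# which by default happens at login. Enabling lingering starts that instance
# at boot instead. enable_boot_start and the LOGINCTL override are
# hypothetical names; whether lingering is wanted on a given node is an
# assumption to check before using this.
enable_boot_start() {
    # $1: user to enable lingering for (defaults to the current user)
    "${LOGINCTL:-loginctl}" enable-linger "${1:-$(id -un)}"
}
# Possible use: enable_boot_start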
echo ""
echo "╔═══════════════════════════════════════════════════════════╗"
echo "║ ✅ BITCOIN CORE SETUP COMPLETE! ║"
echo "╚═══════════════════════════════════════════════════════════╝"
echo ""
echo "Commands:"
echo " Start: systemctl --user start bitcoind"
echo " Stop: systemctl --user stop bitcoind"
echo " Status: systemctl --user status bitcoind"
echo " Logs: podman logs -f bitcoind"
echo ""
echo "Bitcoin CLI:"
echo " podman exec bitcoind bitcoin-cli -getinfo"
echo ""
echo "RPC Credentials (save these!):"
echo " User: $RPC_USER"
echo " Pass: $RPC_PASS"
echo ""
read -r -p "Start Bitcoin Core now? [Y/n]: " START_NOW
# Anything other than n/N (including an empty answer) starts the service
if [[ ! "$START_NOW" =~ ^[Nn]$ ]]; then
    echo ""
    echo "🚀 Starting Bitcoin Core..."
    systemctl --user start bitcoind
    echo ""
    echo "Bitcoin Core is syncing. The initial sync can take several hours to several days."
    echo "Monitor with: podman logs -f bitcoind"
fi
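# Sketch only, not called anywhere in this script: after the service starts,
# bitcoind needs some time before it answers RPC calls. wait_for_rpc is a
# hypothetical helper that polls the same bitcoin-cli command printed above;
# the PODMAN override and the default attempt count and delay are assumptions.
wait_for_rpc() {
    # $1: maximum number of attempts, $2: seconds to sleep between attempts
    local attempts="${1:-30}" delay="${2:-2}" i
    for ((i = 1; i <= attempts; i++)); do
        if "${PODMAN:-podman}" exec bitcoind bitcoin-cli -getinfo >/dev/null 2>&1; then
            return 0
        fi
        sleep "$delay"
    done
    return 1
}
# Possible use: wait_for_rpc 30 2 && echo "RPC is up" || echo "RPC not reachable yet"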