archy/neode-ui/start-dev.sh


release(v1.7.41-alpha): post-OTA auto-rollback so a bad release cannot strand the fleet

Closes failure mode FM5 from docs/bulletproof-containers.md: the v1.7.38 + v1.7.39 rollouts left every affected node on an unreachable UI (nginx 500) with no recovery path short of SSH. This release adds a self-check guardrail to the update flow.

What changed:

- apply_update() writes a pending-verify marker with old+new version and a 150s deadline immediately before scheduling the service restart.
- verify_pending_update() runs from main.rs startup. If the marker is present and within its freshness window, the new binary waits 15s for nginx + backend to settle, then probes https://127.0.0.1/ every 5s for up to 90s (self-signed certs accepted).
- On any probe success within the window, the marker is cleared and nothing else happens.
- On window-exhaust, the new binary:
  1. Moves the broken /opt/archipelago/web-ui to web-ui.failed.<ts> (quarantined, not deleted, so we can post-mortem).
  2. Restores web-ui.bak on top of web-ui.
  3. Calls rollback_update() to restore the previous binary.
  4. Updates state.current_version to reflect the rollback.
  5. systemctl --no-block restart archipelago so the OLD binary boots.
- Markers older than 10 minutes are treated as stale and cleared without probing, so a crashed-during-startup marker from weeks ago cannot spontaneously roll back a healthy node on a later reboot.
- rollback_update() binary copy now goes through host_sudo instead of tokio::fs::copy, so it escapes the service's ProtectSystem=strict mount namespace. Without this, the rollback silently failed with EROFS on /usr/local/bin and orphaned the rollback - the exact opposite of what auto-rollback is for.

Tests: 4 new unit tests in update::tests covering marker round-trip, absent-marker noop, no-panic on verify_pending_update with nothing to verify, and an invariant assert that the 90s probe window stays below the 600s stale threshold. All passing.

Side fix: scripts/create-release-manifest.sh was dying with exit 141 (SIGPIPE from tar tvzf pipe head pipe awk) under set -euo pipefail. Replaced with a single awk NR==1 that doesn't short-circuit the upstream pipe, so the release-build flow is idempotent again.
2026-04-22 16:14:35 -04:00
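
The probe window described in the commit message (15s settle, one probe every 5s, 90s budget) can be sketched in shell. This is only an illustration under stated assumptions: the real logic lives in the Rust function verify_pending_update(), and the names `probe_until_healthy` and `probe_ui` are hypothetical. Treating any HTTP error status as a failed probe is also an assumption.

```shell
#!/usr/bin/env bash
# Hypothetical sketch of the post-update probe window. The timings come from
# the commit message; the function names are illustrative, not the real code.

# probe_until_healthy SETTLE INTERVAL WINDOW CMD...
# Sleeps SETTLE seconds, then runs CMD every INTERVAL seconds.
# Returns 0 on the first successful probe, or 1 once WINDOW seconds of
# probing have elapsed without a success. INTERVAL must be at least 1.
probe_until_healthy() {
    local settle=$1 interval=$2 window=$3
    shift 3
    sleep "$settle"
    local elapsed=0
    while [ "$elapsed" -lt "$window" ]; do
        if "$@"; then
            return 0
        fi
        sleep "$interval"
        elapsed=$((elapsed + interval))
    done
    return 1
}

# Assumed probe: -k accepts self-signed certs, -f turns HTTP errors
# (such as an nginx 500) into a non-zero exit status.
probe_ui() {
    curl -ksf --max-time 5 -o /dev/null https://127.0.0.1/
}

# With the commit's timings, a caller could run:
#   probe_until_healthy 15 5 90 probe_ui || echo "window exhausted, roll back"
```

Passing the probe command as an argument keeps the timing loop separate from the HTTP check, so the loop can be exercised with `true` or `false` and short intervals.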
#!/usr/bin/env bash
# Neode Development Server Startup Script
# This script starts both the mock backend and Vite dev server
set -e
# Colors for output
RED='\033[0;31m'
GREEN='\033[0;32m'
YELLOW='\033[1;33m'
BLUE='\033[0;34m'
NC='\033[0m' # No Color
echo ""
echo -e "${BLUE}╔════════════════════════════════════════════════════════════╗${NC}"
echo -e "${BLUE}║${NC}                                                            ${BLUE}║${NC}"
echo -e "${BLUE}║${NC}         ${GREEN}🚀 Starting Neode Development Environment${NC}          ${BLUE}║${NC}"
echo -e "${BLUE}║${NC}                                                            ${BLUE}║${NC}"
echo -e "${BLUE}╚════════════════════════════════════════════════════════════╝${NC}"
echo ""
# Function to check if a port is in use
check_port() {
    lsof -ti:"$1" > /dev/null 2>&1
}
# Function to kill process on a port (avoids xargs which causes EAGAIN on macOS)
kill_port() {
    local port=$1
    local pids
    pids=$(lsof -ti:"$port" 2>/dev/null) || true
    if [ -n "$pids" ]; then
        echo -e "${YELLOW} Port $port in use, cleaning up...${NC}"
        echo "$pids" | while read -r pid; do
            kill -9 "$pid" 2>/dev/null || true
        done
        sleep 1
    fi
}
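# Clean up function
cleanup() {
    # Remember the exit status that triggered the trap, then disarm the traps
    # so that calling exit below does not run cleanup a second time.
    local status=$?
    trap - SIGINT SIGTERM EXIT
    echo ""
    echo -e "${YELLOW}🛑 Shutting down servers...${NC}"
    # Kill server processes whose PIDs were recorded (both are unset by default)
    if [ -n "${BACKEND_PID:-}" ]; then
        kill "$BACKEND_PID" 2>/dev/null || true
    fi
    if [ -n "${VITE_PID:-}" ]; then
        kill "$VITE_PID" 2>/dev/null || true
    fi
    # Also kill concurrently if it's running
    pkill -f "concurrently" 2>/dev/null || true
    echo -e "${GREEN}✅ Servers stopped${NC}"
    # Propagate the original status so a failed step is not reported as success
    exit "$status"
}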
# Set up cleanup trap
trap cleanup SIGINT SIGTERM EXIT
# Check for required ports and clean them up if needed
echo -e "${BLUE}🔍 Checking ports...${NC}"
kill_port 5959 # Mock backend
kill_port 8100 # Vite dev server
kill_port 8101 # Potential fallback port
kill_port 8102 # Another fallback port
kill_port 5173 # AIUI dev server
echo -e "${GREEN}✅ Ports cleared${NC}"
echo ""
# Mock backend handles app simulation — no Docker required for dev
echo -e "${GREEN} Mock backend will simulate all apps${NC}"
echo ""
# Check if node_modules exists
if [ ! -d "node_modules" ]; then
    echo -e "${YELLOW}⚠️ node_modules not found. Running npm install...${NC}"
    npm install
    echo ""
fi
# Start the servers using npm script
echo -e "${BLUE}🚀 Starting servers...${NC}"
echo ""
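# Use npm run dev:mock (includes AIUI dev server automatically).
# Run it as a child process rather than via exec: exec would replace this
# shell and discard the cleanup trap set above, so cleanup would never run.
npm run dev:mock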
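
The script defines check_port but never calls it. As a minimal sketch of one possible use, the helper below polls until a port is in use. `wait_for_port` is a hypothetical name and is not part of start-dev.sh; the 30-second default timeout is likewise an assumption.

```shell
#!/usr/bin/env bash
# Sketch only: wait_for_port is a hypothetical helper, not part of start-dev.sh.

# Same check as the script's check_port: succeeds if any process has the port open.
check_port() {
    lsof -ti:"$1" > /dev/null 2>&1
}

# wait_for_port PORT [TIMEOUT_SECONDS]
# Polls once per second. Returns 0 as soon as the port is in use, or 1 if
# it is still free after TIMEOUT_SECONDS (assumed default: 30).
wait_for_port() {
    local port=$1 timeout=${2:-30} waited=0
    until check_port "$port"; do
        if [ "$waited" -ge "$timeout" ]; then
            return 1
        fi
        sleep 1
        waited=$((waited + 1))
    done
    return 0
}
```

Because `npm run dev:mock` blocks, a caller would have to start it in the background before something like `wait_for_port 5959 30` could report that the mock backend is up.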