archy/scripts/audit-secrets.sh

release(v1.7.41-alpha): post-OTA auto-rollback so a bad release cannot strand the fleet

Closes failure mode FM5 from docs/bulletproof-containers.md: the v1.7.38 + v1.7.39 rollouts left every affected node on an unreachable UI (nginx 500) with no recovery path short of SSH. This release adds a self-check guardrail to the update flow.

What changed:
- apply_update() writes a pending-verify marker with old+new version and a 150s deadline immediately before scheduling the service restart.
- verify_pending_update() runs from main.rs startup. If the marker is present and within its freshness window, the new binary waits 15s for nginx + backend to settle, then probes https://127.0.0.1/ every 5s for up to 90s (self-signed certs accepted).
- On any probe success within the window, the marker is cleared and nothing else happens.
- On window-exhaust, the new binary:
  1. Moves the broken /opt/archipelago/web-ui to web-ui.failed.<ts> (quarantined, not deleted, so we can post-mortem).
  2. Restores web-ui.bak on top of web-ui.
  3. Calls rollback_update() to restore the previous binary.
  4. Updates state.current_version to reflect the rollback.
  5. systemctl --no-block restart archipelago so the OLD binary boots.
- Markers older than 10 minutes are treated as stale and cleared without probing, so a crashed-during-startup marker from weeks ago cannot spontaneously roll back a healthy node on a later reboot.
- rollback_update() binary copy now goes through host_sudo instead of tokio::fs::copy, so it escapes the service's ProtectSystem=strict mount namespace. Without this, the rollback silently failed with EROFS on /usr/local/bin and orphaned the rollback - the exact opposite of what auto-rollback is for.

Tests: 4 new unit tests in update::tests covering marker round-trip, absent-marker noop, no-panic on verify_pending_update with nothing to verify, and an invariant assert that the 90s probe window stays below the 600s stale threshold. All passing.

Side fix: scripts/create-release-manifest.sh was dying with exit 141 (SIGPIPE from tar tvzf pipe head pipe awk) under set -euo pipefail. Replaced with a single awk NR==1 that doesn't short-circuit the upstream pipe, so the release-build flow is idempotent again.
2026-04-22 16:14:35 -04:00
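The SIGPIPE failure described in the side fix can be reproduced in isolation. The sketch below is illustrative only and is not taken from scripts/create-release-manifest.sh: `seq` stands in for `tar tvzf`, and both function names are hypothetical. It contrasts a reader that quits early (`head`) with one that drains the whole stream (`awk`).

```bash
#!/bin/bash
# Illustrative sketch only: `seq` stands in for a long-running producer such as
# `tar tvzf`, and both function names are hypothetical.

# Prints the exit status of a pipeline whose reader (head) quits early.
# The producer is still writing when head exits, so it is killed by SIGPIPE
# and pipefail surfaces that failure as the pipeline's status.
status_with_head() {
  local rc=0
  ( set -o pipefail; seq 1 200000 | head -n 1 >/dev/null ) 2>/dev/null || rc=$?
  echo "$rc"
}

# Prints the exit status when awk consumes the whole stream instead.
# awk keeps reading to end of input, so the producer finishes normally.
status_with_awk() {
  local rc=0
  ( set -o pipefail; seq 1 200000 | awk 'NR == 1' >/dev/null ) 2>/dev/null || rc=$?
  echo "$rc"
}
```

Calling `status_with_head` should print a non-zero status (typically 141, which is 128 plus the SIGPIPE signal number), while `status_with_awk` should print 0. Under `set -e` the non-zero status is what would abort the calling script.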
#!/bin/bash
set -euo pipefail
# SEC-202: Secrets audit — checks for hardcoded credentials in the codebase.
# Scans source files for common secret patterns.
REPO_ROOT="$(cd "$(dirname "$0")/.." && pwd)"
PASS=0
FAIL=0
RESULTS=()
log() { echo -e "\033[1;34m[AUDIT]\033[0m $*"; }
pass() { echo -e "\033[1;32m[PASS]\033[0m $*"; PASS=$((PASS + 1)); RESULTS+=("PASS: $*"); }
fail() { echo -e "\033[1;31m[FAIL]\033[0m $*"; FAIL=$((FAIL + 1)); RESULTS+=("FAIL: $*"); }
# Patterns to search for (case insensitive)
PATTERNS=(
"password\s*=\s*['\"][^'\"]*['\"]"
"api_key\s*=\s*['\"][^'\"]*['\"]"
"secret\s*=\s*['\"][^'\"]*['\"]"
"private_key\s*=\s*['\"][^'\"]*['\"]"
"sk-ant-"
"AKIA[A-Z0-9]{16}"
"ghp_[a-zA-Z0-9]{36}"
"glpat-[a-zA-Z0-9_-]{20}"
)
# Allowed files (config templates, docs, test fixtures)
ALLOW_PATTERNS="test|mock|example|template|CLAUDE.md|deploy-config|\.md$|node_modules|dist|target|default\)|grep.*rpc|audit-secrets"
main() {
log "=== Secrets Audit ==="
echo ""
# 1. Check for .env files in version control
log "1. Checking for .env files in git..."
local env_files
env_files=$(cd "$REPO_ROOT" && git ls-files '*.env' '.env*' 2>/dev/null || echo "")
if [ -z "$env_files" ]; then
pass "No .env files tracked in git"
else
fail "Found .env files in git: $env_files"
fi
# 2. Check .gitignore includes sensitive patterns
log "2. Checking .gitignore coverage..."
local gitignore="$REPO_ROOT/.gitignore"
if [ -f "$gitignore" ]; then
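local has_env has_key
# grep -c prints 0 and exits 1 when nothing matches, so use `|| true`;
# `|| echo 0` would produce "0" twice and break the -gt test below.
has_env=$(grep -c '\.env' "$gitignore" || true)
has_key=$(grep -c 'credentials\|\.key\|\.pem' "$gitignore" || true)
if [ "$has_env" -gt 0 ]; then
pass ".gitignore covers .env files"
else
fail ".gitignore missing .env pattern"
fi
if [ "$has_key" -gt 0 ]; then
pass ".gitignore covers credential/key files"
else
fail ".gitignore missing credentials/.key/.pem pattern"
fi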
else
fail "No .gitignore found"
fi
# 3. Scan source for hardcoded credentials
log "3. Scanning source for hardcoded secrets..."
local found_secrets=0
for pattern in "${PATTERNS[@]}"; do
local matches
matches=$(cd "$REPO_ROOT" && grep -rniE "$pattern" \
--include='*.rs' --include='*.ts' --include='*.vue' --include='*.js' \
--include='*.json' --include='*.sh' --include='*.py' \
2>/dev/null | grep -vE "$ALLOW_PATTERNS" || echo "")
if [ -n "$matches" ]; then
# Filter out false positives (empty strings, variable declarations, etc.)
local real_matches
# grep -E does not interpret \x27, so match empty single quotes literally.
real_matches=$(echo "$matches" | grep -vE "\"\"|''|None|null|undefined|TODO|placeholder|example|Option<" || echo "")
if [ -n "$real_matches" ]; then
echo " WARNING: Pattern '$pattern' found:"
# awk reads all input, so a long match list cannot trigger SIGPIPE under pipefail.
echo "$real_matches" | awk 'NR <= 5 { print " " $0 }'
found_secrets=$((found_secrets + 1))
fi
fi
done
if [ "$found_secrets" -eq 0 ]; then
pass "No hardcoded secrets found in source"
else
fail "Found $found_secrets secret pattern matches (review above)"
fi
# 4. Check deploy-config is gitignored
log "4. Checking deploy-config.sh is gitignored..."
if git -C "$REPO_ROOT" check-ignore -q scripts/deploy-config.sh 2>/dev/null; then
pass "scripts/deploy-config.sh is gitignored"
elif [ -f "$REPO_ROOT/scripts/deploy-config.sh" ]; then
fail "scripts/deploy-config.sh exists but is NOT gitignored"
else
pass "scripts/deploy-config.sh does not exist (using env vars)"
fi
# 5. Check for credential files in repo
log "5. Checking for credential files..."
local cred_files
cred_files=$(cd "$REPO_ROOT" && git ls-files '*.pem' '*.key' '*macaroon*' 2>/dev/null | grep -v '\.rs$' | grep -v '\.ts$' || echo "")
if [ -z "$cred_files" ]; then
pass "No credential files tracked in git"
else
fail "Credential files in git: $cred_files"
fi
echo ""
log "=== RESULTS ==="
for r in "${RESULTS[@]}"; do
echo " $r"
done
echo ""
log "Pass: $PASS | Fail: $FAIL"
[ "$FAIL" -gt 0 ] && exit 1
exit 0
}
main "$@"
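
To sanity-check the regular expressions without scanning a whole repository, a single line of text can be tested against them directly. The following is a minimal sketch under stated assumptions: `looks_like_secret` and `SAMPLE_PATTERNS` are hypothetical names that do not exist in audit-secrets.sh, and the patterns are a subset adapted from `PATTERNS` (with `\s` written as `[[:space:]]` for portability).

```bash
#!/bin/bash
# Hypothetical helper, not part of audit-secrets.sh: reports whether a single
# line of text matches any of a few patterns adapted from PATTERNS.
SAMPLE_PATTERNS=(
  "password[[:space:]]*=[[:space:]]*['\"][^'\"]*['\"]"
  "AKIA[A-Z0-9]{16}"
  "ghp_[a-zA-Z0-9]{36}"
)

# Usage: looks_like_secret "<line>"
# Exit status 0 means at least one pattern matched, 1 means none did.
looks_like_secret() {
  local line="$1" pattern
  for pattern in "${SAMPLE_PATTERNS[@]}"; do
    # -e keeps a pattern that starts with "-" from being parsed as an option.
    if grep -qiE -e "$pattern" <<<"$line"; then
      return 0
    fi
  done
  return 1
}
```

For example, `looks_like_secret 'password = "hunter2"'` succeeds, while `looks_like_secret 'let count = 3;'` fails. Like the audit itself, this matches case-insensitively, so it only flags candidates for manual review.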