archy/core/models/Cargo.toml

40 lines
1.1 KiB
TOML
Raw Normal View History

release(v1.7.41-alpha): post-OTA auto-rollback so a bad release cannot strand the fleet Closes failure mode FM5 from docs/bulletproof-containers.md: the v1.7.38 + v1.7.39 rollouts left every affected node on an unreachable UI (nginx 500) with no recovery path short of SSH. This release adds a self-check guardrail to the update flow. What changed: - apply_update() writes a pending-verify marker with old+new version and a 150s deadline immediately before scheduling the service restart. - verify_pending_update() runs from main.rs startup. If the marker is present and within its freshness window, the new binary waits 15s for nginx + backend to settle, then probes https://127.0.0.1/ every 5s for up to 90s (self-signed certs accepted). - On any probe success within the window, the marker is cleared and nothing else happens. - On window-exhaust, the new binary: 1. Moves the broken /opt/archipelago/web-ui to web-ui.failed.<ts> (quarantined, not deleted, so we can post-mortem). 2. Restores web-ui.bak on top of web-ui. 3. Calls rollback_update() to restore the previous binary. 4. Updates state.current_version to reflect the rollback. 5. systemctl --no-block restart archipelago so the OLD binary boots. - Markers older than 10 minutes are treated as stale and cleared without probing, so a crashed-during-startup marker from weeks ago cannot spontaneously roll back a healthy node on a later reboot. - rollback_update() binary copy now goes through host_sudo instead of tokio::fs::copy, so it escapes the service's ProtectSystem=strict mount namespace. Without this, the rollback silently failed with EROFS on /usr/local/bin and orphaned the rollback - the exact opposite of what auto-rollback is for. Tests: 4 new unit tests in update::tests covering marker round-trip, absent-marker noop, no-panic on verify_pending_update with nothing to verify, and an invariant assert that the 90s probe window stays below the 600s stale threshold. All passing. Side fix: scripts/create-release-manifest.sh was dying with exit 141 (SIGPIPE from tar tvzf pipe head pipe awk) under set -euo pipefail. Replaced with a single awk NR==1 that doesn't short-circuit the upstream pipe, so the release-build flow is idempotent again.
2026-04-22 16:14:35 -04:00
[package]
name = "models"
version = "0.1.0"
edition = "2021"
# See more keys and their definitions at https://doc.rust-lang.org/cargo/reference/manifest.html
[dependencies]
base64 = "0.21.4"
color-eyre = "0.6.2"
ed25519-dalek = { version = "2.0.0", features = ["serde"] }
lazy_static = "1.4"
mbrman = "0.5.2"
emver = { git = "https://github.com/Start9Labs/emver-rs.git", rev = "61cf0bc96711b4d6f3f30df8efef025e0cc02bad", features = [
"serde",
] }
ipnet = "2.8.0"
openssl = { version = "0.10.57", features = ["vendored"] }
patch-db = { version = "*", path = "../../patch-db/patch-db", features = [
"trace",
] }
rand = "0.8.5"
regex = "1.10.2"
reqwest = "0.11.22"
rpc-toolkit = "0.2.2"
serde = { version = "1.0", features = ["derive", "rc"] }
serde_json = "1.0"
sqlx = { version = "0.7.2", features = [
"chrono",
"runtime-tokio-rustls",
"postgres",
] }
ssh-key = "0.6.2"
thiserror = "1.0"
tokio = { version = "1", features = ["full"] }
torut = "0.2.1"
tracing = "0.1.39"
# Pinned via [patch] to vendored 0.1.6 (edition 2021). 0.1.6+ on crates.io use edition2024, which Cargo 1.75 doesn't support.
yasi = "0.1.6"