Fixes three Bitcoin/wallet failures observed across the fleet on v1.7.90-alpha
(all nodes were already on the latest build — these were live bugs, not stale
builds), plus the missing ElectrumX tile, and adds automated coverage so each
can't regress silently.

Receive address (".116 receive fails", ".228 false 'wallet is locked'"):
- LND publishes its REST API on a host port that can drift from the manifest
  (a container created when the mapping was 8080 kept publishing 8080 after the
  manifest moved to 18080). The in-process client connects to the manifest port,
  gets connection-refused, and wallet init fails forever while the container
  looks "Up". Add published-port drift detection to the reconciler
  (container_ports_drifted / host_port_bindings_drifted) that recreates a
  drifted backend even for restart-sensitive apps: a drifted container is
  already broken, so leaving it "untouched" only perpetuates the failure.
- Receive errors now carry a stable [CODE] token (REST_UNREACHABLE, WALLET_LOCKED,
  WALLET_UNINITIALIZED, SYNCING) and always start with "Bitcoin address" so they
  survive the RPC error sanitizer instead of collapsing to the generic
  "Operation failed". The UI maps the code instead of guessing wallet state from
  substrings, so an unreachable REST endpoint is no longer mislabelled "locked".
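The code-based mapping described above can be sketched in shell. This is an illustration under assumptions, not the shipped classifier: the function names, the sample message shape, and the UI labels are hypothetical; only the "Bitcoin address" prefix and the four code names come from this change.

```bash
# Sketch only: receive_error_code / receive_error_label are hypothetical names.

# Extract the stable reason code from a receive error message.
# Messages without the agreed prefix or without a known [CODE] are UNKNOWN.
receive_error_code() {
    case "$1" in
        "Bitcoin address"*) ;;
        *) echo UNKNOWN; return 0 ;;
    esac
    case "$1" in
        *"[REST_UNREACHABLE]"*)     echo REST_UNREACHABLE ;;
        *"[WALLET_LOCKED]"*)        echo WALLET_LOCKED ;;
        *"[WALLET_UNINITIALIZED]"*) echo WALLET_UNINITIALIZED ;;
        *"[SYNCING]"*)              echo SYNCING ;;
        *)                          echo UNKNOWN ;;
    esac
}

# Map the code to a user-facing label (labels are made up for illustration).
receive_error_label() {
    case "$(receive_error_code "$1")" in
        REST_UNREACHABLE)     echo "Lightning node unreachable" ;;
        WALLET_LOCKED)        echo "Wallet is locked" ;;
        WALLET_UNINITIALIZED) echo "Wallet not set up yet" ;;
        SYNCING)              echo "Still syncing" ;;
        *)                    echo "Could not get an address" ;;
    esac
}
```

The point of the design is that a message which merely contains the word "locked" is never classified as locked; only the bracketed token decides.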

Bitcoin install (".198 bitcoin gone / reinstall just stops"):
- bitcoin-knots requires the secret bitcoin-rpc-txrelay-rpcauth, which was only
  generated by the tx-relay flow. Nodes that never used tx-relay lacked it, so
  secret resolution hard-failed and the failure cascaded through the whole
  Bitcoin stack. Generate it idempotently before bitcoin starts
  (ensure_app_secrets, reusing ensure_txrelay_credentials), and name the
  missing secret in the error so a genuine gap is actionable instead of a bare
  "IO error".
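The two behaviours described above (generate only if absent, and name the secret when it is missing) can be sketched as follows. This is a minimal illustration that assumes secrets are stored as one file per name in a directory; the helper names `ensure_secret` and `require_secret`, the file layout, and the random hex value are all hypothetical and do not reproduce the real rpcauth format.

```bash
# Sketch only: assumes one file per secret under a secrets directory.

# Create the secret if it does not exist yet; never overwrite an existing one.
ensure_secret() {
    local dir="$1" name="$2"
    [ -s "$dir/$name" ] && return 0          # already present: leave untouched
    mkdir -p "$dir" || return 1
    # Placeholder value: 32 random bytes as hex, written with owner-only perms.
    ( umask 077; head -c 32 /dev/urandom | od -An -tx1 | tr -d ' \n' > "$dir/$name" )
}

# Fail with an error that names the missing secret instead of a bare I/O error.
require_secret() {
    local dir="$1" name="$2"
    [ -s "$dir/$name" ] && return 0
    echo "missing secret: $name (expected at $dir/$name)" >&2
    return 1
}
```

Calling `ensure_secret` twice leaves the first value in place, which is what makes it safe to run on every start.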

ElectrumX app tile missing on every node with it installed:
- The catalog generator dropped electrumx because the manifest had no
  interfaces.main block, so the tile had no launch URL and was hidden. Declare
  the companion UI port (50002) in the manifest, regenerate the catalog, and let
  an app with a known launch URL stay launchable while its backend is still
  "starting" (ElectrumX indexes for 10m+).

Test harness:
- New lifecycle bats suites: bitcoin-receive, port-drift, secret-completeness
  (validated live; port-drift catches the real .116 drift).
- Rust unit tests for drift detection, the receive reason-code classifier, and
  the named-missing-secret error; vitest for the UI code mapping.
- create-release.sh now runs tests/release/run.sh and aborts the release on
  failure; previously it ran no tests at all.

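The release gate in the last item can be sketched as a small wrapper. This is an assumption about the shape of the check, not the contents of create-release.sh: the function name `run_release_gate` is hypothetical, and only the tests/release/run.sh path comes from this change.

```bash
# Sketch only: run the test runner and refuse to continue if it fails.
run_release_gate() {
    local runner="$1"
    if [ ! -x "$runner" ]; then
        echo "release aborted: test runner missing or not executable: $runner" >&2
        return 1
    fi
    if ! "$runner"; then
        echo "release aborted: $runner failed" >&2
        return 1
    fi
}
```

A release script would call it before any tagging or upload step, for example `run_release_gate tests/release/run.sh || exit 1`. A missing runner is treated as a failure so the gate cannot be bypassed by deleting the script.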
Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
#!/usr/bin/env bats
# tests/lifecycle/bats/port-drift.bats
#
# Regression guard for the .116 failure class: a backend container that is
# "Up" but publishes its ports to the WRONG host ports because the manifest
# changed after the container was created (e.g. lnd REST stuck on host 8080
# while the manifest, and every in-process client, expects 18080).
#
# This mirrors the orchestrator's `host_port_bindings_drifted` check, but from
# the outside: it compares the live `podman inspect` PortBindings against the
# manifest `ports:` for each installed backend. Runs on the archy host.
#
# Tiers: read-only.

_apps_dir() {
    local d
    for d in "${ARCHIPELAGO_APPS_DIR:-}" /opt/archipelago/apps \
             "$BATS_TEST_DIRNAME/../../../apps"; do
        [[ -n "$d" && -d "$d" ]] && { echo "$d"; return 0; }
    done
    return 1
}

_manifest_for() {
    local app="$1" dir
    dir=$(_apps_dir) || return 1
    local mf
    for mf in "$dir/$app/manifest.yml" "$dir/$app/manifest.yaml"; do
        [[ -r "$mf" ]] && { echo "$mf"; return 0; }
    done
    return 1
}

# Emit "host container" pairs from a manifest's ports: block.
_manifest_ports() {
    awk '
        /^[[:space:]]*ports:/ { inports=1; next }
        inports && /^[[:space:]]*[a-z_]+:[[:space:]]*$/ && !/protocol:|host:|container:/ { inports=0 }
        inports && /- host:/ { host=$3 }
        inports && /container:/ { print host, $2 }
    ' "$1"
}

# For a given container + (host,container) port, emit a "DRIFT: …" line on
# mismatch (and nothing otherwise). Stays silent for unpublished / host-net
# ports; those are handled elsewhere and must never be treated as drift.
_drift_line() {
    local cname="$1" want_host="$2" cport="$3"
    local bindings actual
    bindings=$(podman inspect "$cname" --format '{{json .HostConfig.PortBindings}}' 2>/dev/null) || return 0
    actual=$(echo "$bindings" | jq -r --arg k "${cport}/tcp" '.[$k][]?.HostPort // empty' 2>/dev/null)
    [[ -n "$actual" ]] || return 0
    echo "$actual" | grep -qx "$want_host" && return 0
    echo "DRIFT: $cname container-port $cport published on host [$actual] but manifest wants $want_host"
}

@test "backend containers publish ports that match their manifest" {
    command -v podman >/dev/null 2>&1 || skip "podman not available"
    local checked=0 violations="" app cname mf line
    # container-name : manifest-app-id
    for pair in "lnd:lnd" "bitcoin-knots:bitcoin-knots" "electrumx:electrumx"; do
        cname="${pair%%:*}"; app="${pair##*:}"
        podman container exists "$cname" 2>/dev/null || continue
        mf=$(_manifest_for "$app") || continue
        while read -r host cport; do
            [[ -n "$host" && -n "$cport" ]] || continue
            checked=$((checked + 1))
            line=$(_drift_line "$cname" "$host" "$cport")
            [[ -n "$line" ]] && violations+="${line}"$'\n'
        done < <(_manifest_ports "$mf")
    done
    [[ "$checked" -gt 0 ]] || skip "no installed backend containers with published ports to check"
    if [[ -n "$violations" ]]; then
        echo "published-port drift detected:" >&2
        echo "$violations" >&2
        return 1
    fi
}
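To see what the suite's `_manifest_ports` helper emits, the same awk program can be run against a small manifest fragment. The fragment below is hypothetical (the real manifest schema is not shown here and may carry more keys); the parser assumes each list item opens with `- host:` and that `container:` follows it on a later line.

```bash
# Same awk program as the suite's _manifest_ports helper.
_manifest_ports() {
    awk '
        /^[[:space:]]*ports:/ { inports=1; next }
        inports && /^[[:space:]]*[a-z_]+:[[:space:]]*$/ && !/protocol:|host:|container:/ { inports=0 }
        inports && /- host:/ { host=$3 }
        inports && /container:/ { print host, $2 }
    ' "$1"
}

# Hypothetical manifest fragment, for illustration only.
sample_manifest() {
    cat <<'EOF'
id: lnd
ports:
  - host: 18080
    container: 8080
    protocol: tcp
  - host: 9735
    container: 9735
    protocol: tcp
volumes:
  - data
EOF
}

tmp=$(mktemp)
sample_manifest > "$tmp"
_manifest_ports "$tmp"     # prints "18080 8080" then "9735 9735"
rm -f "$tmp"
```

The `volumes:` line ends the `ports:` block because it is a bare key with no value, so the list item under it is ignored.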