26 Commits

Author SHA1 Message Date
archipelago
329e7811eb test(lifecycle): add os-audit OS-wide health gate; docs: v1.7.91 resume notes
os-audit.sh: one non-destructive scorecard tying backend/RPC health, the
all-apps lifecycle audit (delegates to remote-lifecycle.sh), and the FM-guards
(port-drift, secret-completeness, orphan-container sweep, OTA-wedge). The
per-boot building block for the reboot-survival loop. FM12 check uses jq has()
not // (// treats a legit false as empty). Section A validated all-PASS on .116.

docs: v1.7.91 release-pass resume notes + the bitcoinReceive blocker writeup.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
2026-06-14 04:36:06 -04:00
archipelago
0ed892a412 fix: wallet receive reliability, bitcoin install self-heal, ElectrumX app tile
Fixes three Bitcoin/wallet failures observed across the fleet on v1.7.90-alpha
(all nodes were already on the latest build — these were live bugs, not stale
builds), plus the missing ElectrumX tile, and adds automated coverage so each
can't regress silently.

Receive address (".116 receive fails", ".228 false 'wallet is locked'"):
- LND publishes its REST API on a host port that can drift from the manifest
  (a container created when the mapping was 8080 kept publishing 8080 after the
  manifest moved to 18080). The in-process client connects to the manifest port,
  gets connection-refused, and wallet init fails forever while the container
  looks "Up". Add published-port drift detection to the reconciler
  (container_ports_drifted / host_port_bindings_drifted) that recreates a
  drifted backend even for restart-sensitive apps — a drifted container is
  already broken, so leaving it "untouched" only perpetuates the failure.
- Receive errors now carry a stable [CODE] token (REST_UNREACHABLE, WALLET_LOCKED,
  WALLET_UNINITIALIZED, SYNCING) and always start with "Bitcoin address" so they
  survive the RPC error sanitizer instead of collapsing to the generic
  "Operation failed". The UI maps the code instead of guessing wallet state from
  substrings — so an unreachable REST endpoint is no longer mislabelled "locked".

Bitcoin install (".198 bitcoin gone / reinstall just stops"):
- bitcoin-knots requires the secret bitcoin-rpc-txrelay-rpcauth, which was only
  generated by the tx-relay flow. Nodes that never used tx-relay lacked it, so
  secret resolution hard-failed and the whole Bitcoin stack cascaded. Generate
  it idempotently before bitcoin starts (ensure_app_secrets, reusing
  ensure_txrelay_credentials), and name the missing secret in the error so a
  genuine gap is actionable instead of a bare "IO error".

ElectrumX app tile missing on every node with it installed:
- The catalog generator dropped electrumx because the manifest had no
  interfaces.main block, so the tile had no launch URL and was hidden. Declare
  the companion UI port (50002) in the manifest, regenerate the catalog, and let
  an app with a known launch URL stay launchable while its backend is still
  "starting" (ElectrumX indexes for 10m+).

Test harness:
- New lifecycle bats suites: bitcoin-receive, port-drift, secret-completeness
  (validated live; port-drift catches the real .116 drift).
- Rust unit tests for drift detection, the receive reason-code classifier, and
  the named-missing-secret error; vitest for the UI code mapping.
- create-release.sh now runs tests/release/run.sh and aborts the release on
  failure — previously it ran no tests at all.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
2026-06-14 03:12:56 -04:00
archipelago
c800293f1f fix: bitcoin receive, AIUI pointer input, electrs self-heal, OTA timeout
- LND wallet: request correct address type so receive-address generation
  no longer 400s
- AIUI/app session: on-screen pointer can click + type into app content
  (incl. app store search); "open in new tab" opens the phone browser;
  mobile credential modal centered instead of full-height
  (remote-relay.ts, AppSession.vue, AppSessionFrame.vue, AppIconGrid.vue,
  openExternal.ts, WebViewScreen.kt) + remote-relay tests
- health_monitor: electrs auto-recovers from a corrupt index and shows a
  percent/block-height progress screen while reindexing (useElectrsSync.ts)
- update.rs: drop retired tx1138 secondary mirror (one-time migration);
  longer download timeout for slow connections
- CHANGELOG: v1.7.90-alpha notes
- tests/release/run.sh: harness tweaks

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
2026-06-13 04:49:32 -04:00
archipelago
c49e8fcacd fix: harden OTA updates, AIUI desktop gap, LND no-proxy
- update.rs: post-OTA probe falls back to http://127.0.0.1/ on connect
  error (nginx binds :80, not :443) so good updates are no longer rolled
  back; recover stuck update_in_progress; avoid ETXTBSY on running binary
- LND: REST client bypasses proxy, GET newaddress p2wkh, wallet
  readiness/unlock after restart
- Dashboard.vue: chat route back to plain h-full (desktop bottom-gap fix)
- vite.config.ts: dev-only /aiui proxy
- tests/release/run.sh: release gate harness (static+frontend+backend)
- CHANGELOG: v1.7.89-alpha notes

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
2026-06-13 01:23:32 -04:00
archipelago
d6f108d818 chore: snapshot release workspace 2026-06-12 03:00:15 -04:00
archipelago
c393b96da3 backend: harden rootless app lifecycle orchestration 2026-06-11 00:24:32 -04:00
archipelago
d736364ad7 fix(apps): stabilize btcpay and public proxy launch flows 2026-05-19 09:26:43 -04:00
archipelago
7804223152 chore: release v1.7.57-alpha 2026-05-17 17:30:04 -04:00
Dorian
835c525218 chore(release): stage v1.7.55-alpha 2026-05-13 15:09:22 -04:00
archipelago
745cb1c626 chore(release): stage v1.7.52-alpha 2026-05-05 11:29:18 -04:00
archipelago
10fbb8f87c docs(testing): track Phase 3.4 race fix + drift-sync hook
* L0 unit count: 630 → 631 (translate_health_check_http_does_not_double_prefix_scheme)
* Phase 3 row: add TimeoutStartSec=600 race fix (44f275ed) + drift-sync hook (0889367d)

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-02 11:53:18 -04:00
archipelago
bd96c0475d feat(config): ARCHIPELAGO_USE_QUADLET_BACKENDS env override
Adds an env-var lever for Phase 3.2's use_quadlet_backends flag so the
20× harness can flip the path on per-node without a config.json edit
(which would require an archipelago.service restart — and that triggers
FM3 cgroup cascade until Phase 3.5 ships, so we can't ask anyone to
reconfigure live nodes that way today).

Truthy parsing centralised in `parse_truthy_env` (1, true, yes, on —
case-insensitive, whitespace-trimmed). Anything else is false. The
helper is unit-tested so future env-var flags can reuse the same shape.

Also adds a default-off regression test for use_quadlet_backends so
flipping the default ahead of the 20× verification fires immediately.

TESTING.md documents the Environment= snippet for the systemd drop-in
so the next operator can flip the flag on a debug node without
re-deriving the recipe.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-02 05:44:09 -04:00
archipelago
9a89a000d4 test(lifecycle): post-condition gate for use_quadlet_backends path
A six-test bats suite that validates what install_via_quadlet (Phase 3.2)
is supposed to leave behind:

  * `.container` unit on disk in $XDG_CONFIG_HOME/containers/systemd/
    with [Container] / [Service] / [Install] sections, Image= present,
    and Restart=on-failure (the backend invariant — companions use Always)
  * Phase 3.4 cross-check: any unit with HealthCmd= must also emit
    Notify=healthy, otherwise systemctl start won't gate on health
  * `systemctl --user is-active` returns 0 for the .service
  * podman shows the container running
  * the container's cgroup is under user.slice/, NOT under
    archipelago.service — the kernel-level proof that FM3 cgroup
    cascade SIGKILL is structurally fixed for this container

Auto-skips on every test when no backend Quadlet units exist (today's
default state, use_quadlet_backends=false) — so the suite is a no-op
on current fleet boxes and turns into a hard regression gate the
moment anyone flips the flag and reinstalls.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-02 05:34:47 -04:00
archipelago
97ce23d773 feat(quadlet): Phase 3.4 — health-gated startup via Notify=healthy
QuadletUnit gains an optional HealthSpec; from_manifest translates the
manifest's health_check (tcp/http/cmd) into a HealthCmd= directive and
emits Notify=healthy alongside it. systemctl start <unit>.service then
blocks until the container's first green probe — eliminating the
"container up but RPC not ready" race the orchestrator currently papers
over with post-start polling.

Translation policy:
* tcp,  endpoint "host:port"        -> nc -z host port
* http, endpoint "host:port", path  -> curl -fsS -m 5 http://endpoint<path>
* cmd,  endpoint "<shell command>"  -> verbatim
* unknown type / malformed endpoint -> None (skip Notify=healthy rather
  than emit a HealthCmd that hangs the unit start forever)

Companion units leave health: None and remain byte-identical to before
this PR — the renderer only emits the Health* / Notify= block when set.

+4 quadlet unit tests (19 total). Dropped a never-used test setter that
was generating a dead_code warning.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-02 05:21:57 -04:00
archipelago
65576bd755 feat(orchestrator): Phase 3.3 — in-place migration to Quadlet
When use_quadlet_backends flips from off → on, existing fleet boxes
have backend containers parented under archipelago.service's cgroup
(the bad shape that triggers FM3 cascade SIGKILL on every archipelago
restart). ensure_running now notices and corrects this:

* If there's already a `<name>.container` unit on disk → no-op
  (subsequent reconcile ticks take this fast path).
* Else if a podman container with that name exists → it's a pre-3.3
  artifact. Stop+remove it (volumes survive — bind mounts are not
  touched by `podman rm`), then write the Quadlet unit, daemon-reload,
  and start the new managed service.
* Else → fall through to install_fresh, which already routes through
  install_via_quadlet when the flag is on.

The migration is idempotent and self-healing: if a fleet box is
half-migrated (unit on disk but no service active, or service active
but stale unit), the next reconcile tick converges. Bitcoin chain
data, lnd wallet state, and electrumx index all live on host bind
mounts and are unaffected by the container-record swap.

Volume safety audited per backend in `uses_orchestrator_install_flow`
allowlist — every entry mounts its data dir as a host bind mount.

Default still off. To migrate a node:
  /etc/archipelago/config.toml: use_quadlet_backends = true
followed by `systemctl restart archipelago` — the next reconcile tick
walks every managed app and migrates each in turn.

Tests: 624 passing, 0 cargo warnings.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-01 17:27:59 -04:00
archipelago
5b2e02bd43 feat(orchestrator): Phase 3.2 — wire Quadlet path behind feature flag
prod_orchestrator::install_fresh now branches on the new
Config::use_quadlet_backends flag (default false):

* off (today's production behavior) — unchanged: runtime.create_container
  + start_container, container parented under archipelago.service's
  cgroup, FM3 cascade SIGKILL on every archipelago restart.
* on  — install_via_quadlet renders the manifest as a Quadlet unit via
  QuadletUnit::from_manifest, writes it atomically into
  ~/.config/containers/systemd/, calls daemon-reload, and starts the
  generated <name>.service. Container ends up under user.slice — no
  more cgroup parented under archipelago, so archipelago restarts
  don't touch the container's lifetime.

Default off so this commit is structurally safe to ship: nothing
changes at runtime until an operator opts in. Flip the default once
tests/lifecycle/run-20x.sh has gone green against the new path on
.228 + .198 (the v1.7.52 release gate).

Plumbing:
* config.rs — `use_quadlet_backends: bool` w/ Default false
* prod_orchestrator.rs — flag stored on the struct, threaded through
  new(), with set_use_quadlet_backends(bool) test setter
* prod_orchestrator.rs — install_via_quadlet helper
* dropped the Phase-3.1 #[allow(dead_code)] markers on from_manifest /
  parse_memory_mib / RestartPolicy::OnFailure now that the call path
  exists; if a future revert removes the wiring, the warnings come back.

Tests: 624 passing, cargo check clean (0 warnings). Existing companion
behavior unaffected — render_skips_backend_directives_when_default
still passes byte-equal to before quadlet.rs grew the new fields.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-01 17:22:10 -04:00
archipelago
9becafafd3 feat(quadlet): backend-manifest renderer (Phase 3.1 of v1.7.52)
The QuadletUnit struct now covers everything a backend manifest needs
(ports, environment, devices, add_hosts, entrypoint+command, read-only
root, no_new_privileges, cpu_quota, restart policy choice). Adds
QuadletUnit::from_manifest(&AppManifest, name) that translates a parsed
manifest into a unit, plus parse_memory_mib for "1g"/"512m"/raw-MiB
forms. The renderer skips empty/false directives so existing companion
units render byte-identically — no behavior change for shipping
companions; the backend renderer is dead code until Phase 3.2 wires it
into the orchestrator.

Eight new unit tests cover:
* parse_memory_mib forms (1024, 512m, 2g, garbage)
* shell_join quoting (whitespace, embedded quotes)
* RestartPolicy → systemd string mapping
* render emits backend directives when set
* render skips them when defaulted (companion regression gate)
* from_manifest happy path on a bitcoin-knots-shaped manifest
* from_manifest read-only volume detection
* from_manifest tmpfs filtering
* end-to-end manifest → render bytes assertion

Tests: 615 → 624 (+9 net; one pre-existing parse_memory_mib path was
implicitly covered before but is now explicit). Cargo warnings: 0.

`from_manifest`, `parse_memory_mib`, and `RestartPolicy::OnFailure` are
marked allow(dead_code) with explicit references to Phase 3.2 — if
3.2 doesn't wire them, the dead-code warning resurfaces.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-01 17:09:50 -04:00
archipelago
5074572373 test(lifecycle): add btcpay + fedimint + mempool suites
Brings L1 (RPC API) + L3 (lifecycle survival) parity coverage to the
three multi-app stacks that were previously only touched by
required-stack.bats. Combined with bitcoin-knots / lnd / electrumx
already shipping, the six core apps now have dedicated bats files.

Each suite is shaped like the existing single-container suites
(bitcoin-knots / lnd / electrumx) and gates every assertion on the
backing container actually being present, so a node without the stack
installed gets clean skip messages instead of false fails.

* btcpay.bats — 9 tests, including stack-wide presence and a
  "supporting containers don't cascade-restart" guard
* fedimint.bats — 8 tests, single container
* mempool.bats — 9 tests, mixed legacy + orchestrator-managed stack;
  reuses the :8999 mempool-api probe from required-stack for parity

Total bats now: 88 (was 53 → +35).
TESTING.md matrix advances 23 → 50 of 110 cells.
UI URL coverage for these three apps already lives in
ui-coverage.bats, so this PR doesn't duplicate proxy-path probes.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-01 16:55:31 -04:00
archipelago
ec1dce93a9 docs(testing): canonical scorecard for container subsystem testing
Single source of truth for "where are we, where are we going" on the
v1.7.52 container excellence work. Replaces ad-hoc tracking in chat.

Sections:
* Test layers L0..L6 with toolchain + per-iteration latency
* Per-app × per-state coverage matrix (23 of 110 cells today; goal 110)
* Layer-by-layer status (L0+L1+L2 ●; L3 ◐; L4..L6 ○)
* Run commands (single suite / full suite / 20×)
* LoC budget — -270 committed, ~1,616 more possible if Phase 3 ships
* Performance KPIs (TBD — measure first, target second)
* Release gates — 8 boxes that must tick before v1.7.52 ships

The file lives in-repo so PR diffs to it answer "what did this commit
improve?". If you can't tick the box, the change isn't ready.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-01 16:52:42 -04:00
archipelago
b9eb6eb18a test(lifecycle): add UI surface coverage — HTTPS proxy + iframe URLs
Closes the coverage gap where existing bats suites would report green
on a node whose dashboard tiles 502 because the proxy upstream is dead.
First pass against .198 caught real prod issues immediately:
  /app/lnd/       → 502 (lnd container exited)
  /app/mempool/   → 502 (mempool container exited)
  /app/fedimint/  → 502 (fedimint container exited)
while existing tests reported only "container is up: false" with no
404/502 distinction.

* lib/ui-probes.bash — sourced helper. probe_https_200,
  probe_app_url (skip-if-container-down else assert-200),
  probe_dashboard_shell (asserts the Vue SPA HTML, not nginx default —
  catches the layout regression from feedback_release_tarball_layout.md),
  probe_dashboard_catalog (asserts /catalog.json non-empty).
* bats/ui-coverage.bats — 9 @test cases covering the dashboard +
  bitcoin-ui :8334 + the seven HTTPS_PROXY_PATHS most users hit
  (lnd, electrumx, mempool, fedimint, btcpay, filebrowser).

URL list mirrors HTTPS_PROXY_PATHS in
neode-ui/src/views/appSession/appSessionConfig.ts. Divergence between
the two is the exact bug class we're guarding against.

Loops clean under run-20x.sh. Container-state oracle is via local
podman inspect, so the suite must run on the archy host (same as
companion-survives-archipelago-restart.bats).

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-01 16:49:30 -04:00
archipelago
01f416ae5d test(lifecycle): regression gate for FM3 cgroup-cascade SIGKILL
Sister suite to companion-survives-archipelago-restart.bats. That one
tests the same property for UI companions, which already ship via
Quadlet (commit 6e716f68) and so already pass.

This new suite tests the property for backend containers (bitcoin-knots
/ bitcoin-core / lnd / electrumx). Until v1.7.52 Phase 3 ships these
under Quadlet too, the suite is EXPECTED TO FAIL on fleet boxes — it's
the executable definition of "FM3 fixed".

Observed live on .198 on 2026-05-01: `sudo systemctl stop archipelago`
killed every container in archipelago.service's cgroup. The dedicated
"backends survive archipelago restart" test catches exactly that, and
also verifies the SAME container instance survives (compares pre/post
.Id), so an orchestrator that recreates a fresh container after the
SIGKILL doesn't read as pass.

Three @test cases:
* destructive gate (skip-marker for the suite)
* baseline: at least one backend installed + running
* backends survive: same .Id pre + post archipelago restart

Don't gate releases on this passing until Phase 3 lands; before then
treat it as a "expected to fail / shows progress" indicator.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-01 16:17:27 -04:00
archipelago
f80daff8ba test(lifecycle): add dedicated electrumx.bats suite
Same shape as bitcoin-knots.bats and lnd.bats so the 20× release-gate
exercises electrumx through the same state matrix it uses for the other
two core apps. electrumx previously had a single TCP-port check inside
required-stack.bats; this adds destructive + cascade-destructive tiers.

10 @test cases:
* read-only: presence, valid state, TCP port (50001) reachable, no
  orphan containers beyond {electrumx, archy-electrs-ui}
* destructive: stop, start, restart, TCP port recovers within 120s of
  cold restart (longer than bitcoind because electrumx replays its
  index against bitcoind on start)
* cascade: uninstall, reinstall (240s timeout for index rebuild)

With this suite, the three single-container core apps (bitcoin-knots,
lnd, electrumx) now have parity coverage. Multi-container stacks
(btcpay, mempool, fedimint) come next.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-01 16:11:02 -04:00
archipelago
1103c2c710 test(lifecycle): add dedicated lnd.bats suite
Mirrors bitcoin-knots.bats so the 20× release-gate run exercises lnd
through the same state matrix. lnd previously had only a single
read-only check inside required-stack.bats; this adds the destructive
and cascade-destructive tiers that match what we already test for
bitcoin-knots.

10 @test cases:
* read-only: presence, valid state, lncli getinfo, no orphan containers
* destructive (ARCHY_ALLOW_DESTRUCTIVE=1): stop, start, restart,
  RPC recovers within 90s of cold restart (longer than bitcoind
  because the wallet has to unlock first)
* cascade (ARCHY_ALLOW_CASCADE_DESTRUCTIVE=1): uninstall, reinstall

Reuses the same lncli invocation as required-stack.bats so divergence
shows up clearly if either test breaks.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-01 16:09:43 -04:00
archipelago
1b6c500657 test(lifecycle): add setup-teardown + run-20x harness scaffolding
Phase 4 of the v1.7.52 container excellence plan: a release-gate harness
that loops the bats suite N times in a row, with teardown between
iterations, and reports a pass/fail tally.

* setup-teardown.sh — clears /tmp/archy-rpc-session-* between runs so
  iteration N+1 doesn't reuse a logged-out cookie from iteration N.
  Idempotent; safe to run anytime. Designed to grow as we add suites
  that leave other transient state.
* run-20x.sh — wraps run.sh in a loop of ARCHY_ITERATIONS (default 20).
  Tracks per-iteration pass/fail with wall-clock timing, prints a
  results block, exits non-zero on any failure. Honors ARCHY_FAIL_FAST
  for short-circuit during dev.

Suggested release-gate command:
  ARCHY_PASSWORD=password123 ARCHY_ALLOW_DESTRUCTIVE=1 \
    tests/lifecycle/run-20x.sh

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-01 16:06:09 -04:00
archipelago
23c4e7441f refactor(container): move companion UIs to systemd via Quadlet
Companion UI containers (archy-bitcoin-ui, archy-lnd-ui,
archy-electrs-ui) used to be launched as fire-and-forget tokio::spawn
blocks from install.rs. If archipelago crashed mid-spawn or the
container's cgroup was reaped, companions vanished from podman ps -a
and only a manual rm/run could bring them back (the .228 incident).

Now each companion is rendered as a Quadlet .container unit under
~/.config/containers/systemd/, daemon-reloaded, and started via
systemctl --user. systemd owns supervision from that point on:

- archipelago can crash, restart, or be uninstalled without touching
  any companion.
- Quadlet's Restart=always + RestartSec=10 handles container exits.
- A 30s reconcile tick in boot_reconciler enumerates expected
  companion units and re-installs any whose unit file or service
  vanished — defense-in-depth against external tampering.

New module layout:
- container/quadlet.rs: pure unit renderer + atomic write_if_changed
  + systemctl helpers (daemon_reload_user / enable_now / disable_remove
  / is_active). 6 unit tests, no I/O in the renderer.
- container/companion.rs: per-app companion specs, install/remove/
  reconcile, image presence (build local first, fall back to insecure
  registry only via image_uses_insecure_registry whitelist). 2 tests.

install.rs handle_package_install now ends with a single call to
companion::install_for(package_id), replacing 287 lines of spawn-and-
hope shellouts plus a ~120-line nginx auth-injector helper that worked
around per-node RPC password baking. The helper is gone too — the
pre-start hook renders the per-node nginx.conf to /var/lib/archipelago/
bitcoin-ui/nginx.conf and the Quadlet unit bind-mounts it read-only.

runtime.rs handle_package_uninstall now disables companions before
the container rm loop. Otherwise systemd's Restart=always would
respawn each companion within ~10s of removal.

Tests: 53 container tests pass, including 6 quadlet renderer tests
(host network, bridge network, capability set, atomic write idempotence)
and 2 companion specs (per-app companion lookup, build_unit shape).
boot_reconciler tests gain a #[cfg(test)] without_companion_stage()
flag so the paused-clock fixtures don't race the real systemctl I/O.

A bats regression test (companion-survives-archipelago-restart.bats,
gated on ARCHY_ALLOW_DESTRUCTIVE=1) asserts the .228 failure mode
cannot recur: every installed companion has a unit file, services
stay active across systemctl --user restart archipelago, and a
deleted unit file is recreated within one reconcile tick.

Net delta: +941 / -363, but the +941 is mostly tests (~440 lines)
and the new declarative layer; the imperative tokio::spawn block and
its nginx-auth helper are gone, removing two failure classes
(orphan companions on archipelago crash, and post-start exec races
under tightly-confined cgroups) that previously needed manual SSH
recovery.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-01 10:45:07 -04:00
archipelago
8f83b37d51 feat(orchestrator): complete container migration and release hardening 2026-04-28 15:00:58 -04:00