archy/CLAUDE.md
archipelago 27299ea687 docs: make the production test gate a SINGLE-NODE (.228) criterion; split out multinode
Per direction: the gate is now 5x green ON .228 only (run on the node, not via RPC).
Fleet/multinode verification (.198 + others) moved to a new docs/multinode-testing-plan.md
with the bootstrap recipe, per-node preconditions (synced archival bitcoin, no stale
nginx proxy targets, no orphan quadlet units), node roster, and cross-node suites.
Updated CLAUDE.md, master-plan SS5/SS6/SS8b/WS-E, and TESTING.md release gates.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
2026-06-22 16:47:34 -04:00

3.0 KiB
Raw Blame History

Archipelago — agent guide

🚩 TOP PRIORITY (until production testing passes)

Read docs/PRODUCTION-MASTER-PLAN.md first. It is the authoritative plan and overrides ad-hoc direction until the production test gate is green. Goal: a world-class, developer-ready app platform where every app is manifest-driven, manifests ship via the signed registry (not OTA disk files), and third-party developers publish apps via an external/decentralized registry — all rootless, secure, robust, and 100%-uptime-capable.

Detailed sub-plans (all linked from the master):

  • App platform / packaging phases + security model → docs/APP-PACKAGING-MIGRATION-PLAN.md
  • Registry-distributed manifests (in progress) → docs/registry-manifest-design.md
  • External/decentralized marketplace for devs → docs/marketplace-protocol.md
  • Current per-app state → docs/app-registry-status-2026-06-21.md
  • Production test gate (exit criterion) → tests/lifecycle/TESTING.md

Invariants (never violate)

  • Rootless Podman only. No rootful, no Docker-socket mounts, no privileged containers unless explicitly approved.
  • No per-app Rust installers / no OS-level reliance. Apps are declarative; the orchestrator owns the lifecycle. install_immich_stack (hardcoded podman run + sudo chown) is the anti-pattern being deleted, not a template.
  • Secrets are manifest-declared (generated_secrets, materialised by container::secrets, 0600/rootless) — never hardcoded, per-app, or logged.
  • Migrations never destroy data — preserve /var/lib/archipelago/<app>, secrets, credentials, ports, and adoption container names; keep a rollback path.
  • Verify on the real node .228 before any tag. (Fleet-wide multinode verification is a separate plan: docs/multinode-testing-plan.md.)

Build / verify

  • Rust workspace root is core/ (no Cargo.toml at repo root). cargo from core/.
  • If a cargo test/build hits rust-lld: undefined hidden symbol, it's incremental-cache corruption — rebuild with CARGO_INCREMENTAL=0.
  • Frontend: neode-ui/npm run build outputs to web/dist/neode-ui/. Grep the built bundle for new strings before shipping (build can silently no-op).
  • App manifests load from disk on nodes at /opt/archipelago/apps/*/manifest.yml (today); the goal is to distribute them via the signed catalog instead.

Production test gate (definition of done)

tests/lifecycle/run-20x.sh green across install / UI / stop / start / restart / reinstall / reboot-survive / archipelago-restart-survive / uninstall — 5× on .228 (ARCHY_ITERATIONS=5; temporarily reduced from 20× — restore to 20× before the final ship). Run the gate ON the node (it uses local podman/systemctl/bitcoin probes), not via RPC from another host. Until green, the master plan is the priority. Multinode testing (.198 + the rest of the fleet) is a SEPARATE plandocs/multinode-testing-plan.md — not part of this single-node gate criterion.