archy/CLAUDE.md
archipelago 1977bdefb5 feat(trust): pin release-root anchor + ship signed app-catalog
Pin RELEASE_ROOT_PUBKEY_HEX from the 2026-07-02 release-root signing ceremony
(signer did:key:z6MkkidEnEpo6qHMCNSZoNKWtvQvxq3whnaME9wGgEFhq7ur) so nodes verify
the publisher identity of the app-catalog. Sign releases/app-catalog.json in place.

Fix two floats that made the catalog unsignable: archy-btcpay-db manifest version
-> string, fedimint-clientd cpu_limit 0.25 -> 1 (u32). Add scripts/sign-catalog.sh
helper, the 1.8.0 release-hardening plan/tracker, and the commit-and-push project
rule in CLAUDE.md.

Backward-compatible: old binaries still accept the signed catalog; the pinned-anchor
binary ships in the next build/OTA.

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
2026-07-02 09:15:43 -04:00


Archipelago — agent guide

Single-node production gate is GREEN (2026-06-23)

tests/lifecycle/run-gate.sh is 5/5 on .228 with 0 failures, so the single-node exit criterion is met and the priority banner is demoted. Next exit criteria: the multinode pass (docs/multinode-testing-plan.md) and workstreams B/C/D.

For day-to-day work, use docs/UNIFIED-TASK-TRACKER.md — the consolidated, priority-ordered "what's left" list across the 1.8.0 OTA and master-plan docs (fastest/simplest tasks first). It supersedes hunting through the two source docs below for open items; those remain the narrative/history.

Read docs/PRODUCTION-MASTER-PLAN.md first — it is still the authoritative plan for the north star: a world-class, developer-ready app platform where every app is manifest-driven, manifests ship via the signed registry (not OTA disk files), and third-party developers publish apps via an external/decentralized registry — all rootless, secure, robust, and 100%-uptime-capable. It no longer overrides all ad-hoc direction now that the gate is green, but it remains the source of truth for sequencing the remaining workstreams.

Detailed sub-plans (all linked from the master):

  • App platform / packaging phases + security model → docs/APP-PACKAGING-MIGRATION-PLAN.md
  • Registry-distributed manifests (in progress) → docs/registry-manifest-design.md
  • External/decentralized marketplace for devs → docs/marketplace-protocol.md
  • Current per-app state → docs/app-registry-status-2026-06-21.md
  • Production test gate (exit criterion) → tests/lifecycle/TESTING.md

Commit & push every unit of work (never violate)

The #1 process rule: work is not "done" until it is committed AND pushed. This exists because finished work has been lost/clobbered by sitting uncommitted in the shared tree across agents and sessions. To prevent that:

  • Commit each feature/fix the moment it works — one focused, self-contained commit per logical change (it compiles and its targeted tests pass). Do not let unrelated changes accumulate uncommitted.
  • Push immediately after committing so nothing lives only on one machine. main is protected → push via git push gitea-ai main (account ai, see the memory note); feature branches push to their own remote.
  • Never leave a stack of finished work uncommitted overnight or when handing off between agents — if you must pause mid-change, commit a clearly-labelled WIP checkpoint rather than leaving it dirty.
  • Stage explicitly by path (git add <paths>) when another agent's uncommitted work shares the tree — never git add -A / git commit -a, which clobbers or entangles their changes.
  • Never commit or push secrets (mnemonics, private keys, API tokens). Signing is done offline; artifacts (catalog/manifest) are signed, not the keys.
  • Commit messages end with the Co-Authored-By: Claude … trailer.
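
The rules above can be sketched as a small shell helper. This is a minimal illustration under stated assumptions, not a script that ships in the repo: the function name commit_unit and the ARCHY_REMOTE, ARCHY_BRANCH, and ARCHY_TRAILER variables are hypothetical. The trailer is read from ARCHY_TRAILER rather than hardcoded, because its exact text is not spelled out in this guide.

```shell
#!/usr/bin/env bash
# Hypothetical helper: commit one logical change by explicit path, then push.
# Usage: commit_unit "<commit subject>" <path> [<path>...]
commit_unit() {
  local msg="${1:-}"
  [ "$#" -gt 0 ] && shift
  # Refuse to run without explicit paths: staging is never "everything".
  if [ -z "$msg" ] || [ "$#" -eq 0 ]; then
    echo "usage: commit_unit <message> <path>..." >&2
    return 2
  fi
  # The trailer text is supplied by the caller (assumption: set in the environment).
  if [ -z "${ARCHY_TRAILER:-}" ]; then
    echo "ARCHY_TRAILER is not set" >&2
    return 2
  fi
  # Stage only the named paths (never git add -A / git commit -a).
  git add -- "$@" || return 1
  # Passing the paths to git commit limits the commit to them, so anything
  # another agent has staged stays out. The second -m becomes the final
  # paragraph of the message, which is where the trailer belongs.
  git commit -q -m "$msg" -m "$ARCHY_TRAILER" -- "$@" || return 1
  # Push right away so the commit does not live on one machine only.
  git push -q "${ARCHY_REMOTE:-gitea-ai}" "${ARCHY_BRANCH:-main}"
}
```

Usage would look like commit_unit "fix(catalog): cpu_limit as u32" path/to/changed-file, with one call per logical change. Passing the paths to both git add and git commit is deliberate: it keeps another agent's staged or dirty files out of the commit.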

Invariants (never violate)

  • Rootless Podman only. No rootful, no Docker-socket mounts, no privileged containers unless explicitly approved.
  • No per-app Rust installers / no OS-level reliance. Apps are declarative; the orchestrator owns the lifecycle. install_immich_stack (hardcoded podman run + sudo chown) is the anti-pattern being deleted, not a template.
  • Secrets are manifest-declared (generated_secrets, materialised by container::secrets, 0600/rootless) — never hardcoded, per-app, or logged.
  • Migrations never destroy data — preserve /var/lib/archipelago/<app>, secrets, credentials, ports, and adoption container names; keep a rollback path.
  • Verify on the real node .228 before any tag. (Fleet-wide multinode verification is a separate plan: docs/multinode-testing-plan.md.)
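
The rootless invariant lends itself to a quick preflight check. A minimal sketch, assuming podman is on PATH: podman info exposes a rootless flag through the Go-template field .Host.Security.Rootless. The function name assert_rootless is hypothetical and is not part of the orchestrator.

```shell
#!/usr/bin/env bash
# Hypothetical preflight: succeed only when podman reports rootless mode.
assert_rootless() {
  local rootless
  # podman info prints "true" or "false" for this template field.
  rootless="$(podman info --format '{{.Host.Security.Rootless}}' 2>/dev/null)" || {
    echo "podman info failed; cannot confirm rootless mode" >&2
    return 1
  }
  if [ "$rootless" != "true" ]; then
    echo "refusing to continue: podman is not running rootless" >&2
    return 1
  fi
}
```

A script that touches containers could call assert_rootless || exit 1 before doing anything else, so a rootful environment fails fast instead of partway through a lifecycle operation.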

Build / verify

  • Rust workspace root is core/ (no Cargo.toml at repo root). Run cargo from core/.
  • If a cargo test/build hits rust-lld: undefined hidden symbol, it's incremental-cache corruption — rebuild with CARGO_INCREMENTAL=0.
  • Frontend: in neode-ui/, npm run build outputs to web/dist/neode-ui/. Grep the built bundle for new strings before shipping (the build can silently no-op).
  • App manifests load from disk on nodes at /opt/archipelago/apps/*/manifest.yml (today); the goal is to distribute them via the signed catalog instead.
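
Two of the checks above can be sketched as shell helpers. This is an illustration only: the function names cargo_in_core and bundle_contains and the ARCHY_REPO variable (path to the repo checkout, defaulting to the current directory) are hypothetical and do not exist in the repo.

```shell
#!/usr/bin/env bash
# Hypothetical helpers. ARCHY_REPO (repo checkout path, default ".") is an assumption.

# Run cargo from the workspace root core/. If it fails and the output mentions
# the rust-lld "undefined hidden symbol" error, retry once with incremental
# builds disabled. First-attempt output (stdout and stderr) is echoed to stdout.
cargo_in_core() {
  local core="${ARCHY_REPO:-.}/core" log status=0
  log="$(mktemp)" || return 1
  (cd "$core" && cargo "$@") >"$log" 2>&1 || status=$?
  cat "$log"
  if [ "$status" -ne 0 ] && grep -q 'undefined hidden symbol' "$log"; then
    echo "incremental-cache corruption suspected; retrying with CARGO_INCREMENTAL=0" >&2
    status=0
    (cd "$core" && CARGO_INCREMENTAL=0 cargo "$@") || status=$?
  fi
  rm -f "$log"
  return "$status"
}

# Succeed only if the built frontend bundle contains the given literal string.
bundle_contains() {
  local dist="${ARCHY_REPO:-.}/web/dist/neode-ui"
  grep -rqF -- "$1" "$dist"
}
```

For example, after changing a UI label, bundle_contains "New label text" || echo "build did not pick up the change" would catch a silent no-op build before shipping.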

Production test gate (definition of done)

tests/lifecycle/run-gate.sh green across install / UI / stop / start / restart / reinstall / reboot-survive / archipelago-restart-survive / uninstall, 5× on .228 (ARCHY_ITERATIONS=5). Run the gate ON the node (it uses local podman/systemctl/bitcoin probes), not via RPC from another host. GREEN 2026-06-23 (5/5, 0 not-ok): keep it green by re-running after orchestrator/lifecycle changes; regressions are top priority again. Multinode testing (.198 + the rest of the fleet) is a SEPARATE plan (docs/multinode-testing-plan.md); it is not part of this single-node gate criterion, and it is the next exit criterion now that single-node is green.
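
Running the gate as described can be sketched as a thin wrapper. This is a minimal sketch under assumptions: the function name run_gate and the ARCHY_REPO variable are hypothetical, and running the script from the repo root is assumed rather than documented. The preflight only checks that podman and systemctl are available on this host, as a rough stand-in for "you are on the node".

```shell
#!/usr/bin/env bash
# Hypothetical wrapper around the gate. ARCHY_REPO (repo checkout path) is an assumption.
run_gate() {
  local repo="${ARCHY_REPO:-.}" tool
  # The gate uses local podman/systemctl probes, so both must exist on this host.
  for tool in podman systemctl; do
    if ! command -v "$tool" >/dev/null 2>&1; then
      echo "missing $tool: run the gate on the node itself" >&2
      return 1
    fi
  done
  # Five iterations unless the caller overrides ARCHY_ITERATIONS.
  (cd "$repo" && ARCHY_ITERATIONS="${ARCHY_ITERATIONS:-5}" tests/lifecycle/run-gate.sh)
}
```

The wrapper returns the gate script's own exit status, so run_gate || echo "gate regressed" is enough to flag a regression after an orchestrator or lifecycle change.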