Per user direction: the production test gate is 5x (ARCHY_ITERATIONS=5) on
.228 AND .198 for now, down from 20x. Restore to 20x before the final ship.
Updated CLAUDE.md, PRODUCTION-MASTER-PLAN.md, and tests/lifecycle/TESTING.md.
Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
Add controlled post_install/pre_start hook schema to AppDefinition:
LifecycleHooks/HookStep (Exec | CopyFromHost)/HostCopy with allowlist
validation (relative src, no '..', absolute container dest, non-empty
exec). Re-exported from the crate root. Design: docs/manifest-hooks-design.md.
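The allowlist rules listed above can be sketched roughly as follows. This is a minimal illustration, not the crate's code: the field names on HostCopy, the HookError enum, and the validate_step function are all hypothetical; only the type names HookStep/HostCopy and the four rules come from this commit.

```rust
use std::path::{Component, Path};

// Hypothetical shapes: the real definitions live in the crate and may differ.
#[derive(Debug)]
pub struct HostCopy {
    pub src: String,  // host-side path, must be relative
    pub dest: String, // path inside the container, must be absolute
}

#[derive(Debug)]
pub enum HookStep {
    Exec(Vec<String>),
    CopyFromHost(HostCopy),
}

// Hypothetical error type, one variant per rule in the commit message.
#[derive(Debug, PartialEq)]
pub enum HookError {
    EmptyExec,
    SrcNotRelative,
    SrcParentDir,
    DestNotAbsolute,
}

pub fn validate_step(step: &HookStep) -> Result<(), HookError> {
    match step {
        HookStep::Exec(argv) => {
            // "non-empty exec": there must be a program name to run.
            if argv.is_empty() || argv[0].trim().is_empty() {
                return Err(HookError::EmptyExec);
            }
            Ok(())
        }
        HookStep::CopyFromHost(copy) => {
            let src = Path::new(&copy.src);
            // "relative src": reject anything rooted at '/'.
            if src.has_root() {
                return Err(HookError::SrcNotRelative);
            }
            // "no '..'": reject any parent-directory component.
            if src.components().any(|c| matches!(c, Component::ParentDir)) {
                return Err(HookError::SrcParentDir);
            }
            // "absolute container dest": container paths are Unix-style.
            if !copy.dest.starts_with('/') {
                return Err(HookError::DestNotAbsolute);
            }
            Ok(())
        }
    }
}
```

Checking path components rather than searching for the substring ".." keeps a file name such as "notes..txt" valid while still rejecting "conf/../secret".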
Also add the missing generated_secrets: vec![] field to three
pre-existing ContainerConfig test literals (the field was added to the
struct in 03a4ee1b but the container crate's own tests were never rerun,
so -p archipelago-container failed to compile). cargo test green: 53 pass.
Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
Completes the immich migration off the legacy hardcoded install_immich_stack
(podman run + sudo chown) to the registry-manifest + orchestrator path. Validated
live on .228 (clean single set, healthy v2.7.4, data dir ownership correct).
- install_immich_stack now tries install_stack_via_orchestrator(immich_stack_app_ids)
first; legacy remains only as the no-manifests fallback.
- immich-{postgres,redis,server} manifests corrected from live findings:
* named by app_id (dropped container_name override) — using container_name
spawned DUPLICATE containers (app_id-named install vs name-override reconcile)
on the same PGDATA, which corrupted a postgres cluster. Server reaches its
siblings via app_id aliases (DB_HOSTNAME=immich-postgres, REDIS=immich-redis).
* immich-postgres data_uid 100998:100998 (postgres drops to container 999 →
host 100998 under rootless; verified the fresh dir is chowned correctly).
* immich-server version "release"→"2.7.4" (manifest validation requires a digit;
the bad version made the manifest silently skip → partial orchestrator install
→ legacy fallback → the duplicate corruption above).
- HARDEN install_stack_via_orchestrator: only fall back to the legacy installer
when NOTHING was installed yet. An "unknown app_id" AFTER a member is up now
errors instead of double-creating containers on shared data (the corruption
root cause).
- Make the all-manifests round-trip test strict: fail (not skip) on any invalid
shipped manifest. This gap let the bad immich-server version through.
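The hardened fallback rule can be sketched as below. This is an illustration under assumptions, not the actual install_stack_via_orchestrator: the InstallError and StackOutcome enums, the install_stack name, and the closure-based installer are hypothetical stand-ins. Only the rule itself (legacy fallback is allowed solely when nothing has been installed yet) comes from this commit.

```rust
// Hypothetical per-member install error.
#[derive(Debug, PartialEq)]
pub enum InstallError {
    UnknownApp,
    Other(String),
}

// Hypothetical outcome of a stack install attempt.
#[derive(Debug, PartialEq)]
pub enum StackOutcome {
    Installed,
    FallBackToLegacy,
}

/// Install every member in order. `install_one` stands in for the
/// orchestrator call that installs a single app_id.
pub fn install_stack<F>(app_ids: &[&str], mut install_one: F) -> Result<StackOutcome, String>
where
    F: FnMut(&str) -> Result<(), InstallError>,
{
    let mut installed = 0usize;
    for id in app_ids {
        match install_one(id) {
            Ok(()) => installed += 1,
            // No member is up yet, so the legacy installer cannot collide
            // with orchestrator-created containers.
            Err(InstallError::UnknownApp) if installed == 0 => {
                return Ok(StackOutcome::FallBackToLegacy);
            }
            // A member is already up: falling back now would create a second
            // set of containers on the same data directories.
            Err(InstallError::UnknownApp) => {
                return Err(format!(
                    "unknown app_id {} after {} member(s) installed; refusing legacy fallback",
                    id, installed
                ));
            }
            Err(InstallError::Other(msg)) => return Err(msg),
        }
    }
    Ok(StackOutcome::Installed)
}
```

With this shape, a manifest that is skipped for the last member of a stack surfaces as a hard error instead of silently triggering a second installer.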
Known follow-up (pre-existing, platform-wide): orchestrator-installed backends
(immich, btcpay-db) run as podman --restart, not Quadlet, and podman-restart.service
is disabled on .228 → reboot-survival gap independent of this migration.
Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
Workstream B phase 1 (node-side consume). The signed app-catalog can now carry a
full manifest per entry; the orchestrator overlays it over the disk manifest
(origin-wins) with disk as the migration fallback. Moves apps toward
registry-distributed manifests with no OTA-shipped disk file.
- app_catalog: `manifest: Option<Value>` on AppCatalogEntry (forward-compatible,
covered by the existing release-root signature over the raw JSON);
`catalog_manifest_values()` accessor.
- prod_orchestrator: `load_manifests` overlays catalog manifests after the disk
walk; `catalog_manifest_to_overlay()` returns None (→ disk fallback) on
unparseable value / app-id mismatch / failed validate() / build source
(build contexts aren't registry-distributed yet — phase 1 is image-only).
- manifest_dir stays PathBuf (build-only field); image-only apps never read it.
- 6 unit tests; compiles clean. No-op until a catalog embeds a manifest, so
existing nodes are unaffected.
See docs/registry-manifest-design.md.
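A minimal sketch of the origin-wins overlay with disk fallback, under stated assumptions: the Manifest and Source types, overlay_candidate, and merge_manifests are hypothetical simplifications (the real code works on AppCatalogEntry and serde Values), and the validate() rule here only mirrors the "version requires a digit" check mentioned elsewhere in this log.

```rust
use std::collections::HashMap;

// Hypothetical, simplified manifest model.
#[derive(Debug, Clone, PartialEq)]
pub enum Source {
    Image(String),
    Build(String),
}

#[derive(Debug, Clone, PartialEq)]
pub struct Manifest {
    pub app_id: String,
    pub version: String,
    pub source: Source,
}

impl Manifest {
    // Stand-in validation: non-empty id and a version containing a digit.
    pub fn validate(&self) -> bool {
        !self.app_id.is_empty() && self.version.chars().any(|c| c.is_ascii_digit())
    }
}

/// Returns None (meaning "keep the disk manifest") when the catalog value was
/// unparseable (`parsed` is None), names a different app, fails validation,
/// or uses a build source (phase 1 is image-only).
pub fn overlay_candidate(entry_app_id: &str, parsed: Option<Manifest>) -> Option<Manifest> {
    let manifest = parsed?;
    if manifest.app_id != entry_app_id || !manifest.validate() {
        return None;
    }
    if matches!(manifest.source, Source::Build(_)) {
        return None;
    }
    Some(manifest)
}

/// Start from the disk walk, then let accepted catalog manifests win.
pub fn merge_manifests(
    disk: HashMap<String, Manifest>,
    catalog: Vec<(String, Option<Manifest>)>,
) -> HashMap<String, Manifest> {
    let mut merged = disk;
    for (app_id, parsed) in catalog {
        if let Some(manifest) = overlay_candidate(&app_id, parsed) {
            merged.insert(app_id, manifest);
        }
    }
    merged
}
```

An empty catalog list leaves the disk map untouched, which matches the no-op behaviour described for nodes whose catalog embeds no manifest.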
Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
Single authoritative hub (docs/PRODUCTION-MASTER-PLAN.md) for the app-platform
north star: every app manifest-driven (zero OS-level reliance), manifests via the
signed registry, developer-ready external marketplace; rootless/secure/robust/
100%-uptime. Repo CLAUDE.md (auto-loaded each session) points agents at it until
the 20x lifecycle gate is green. New design doc registry-manifest-design.md.
Consolidated docs 56 -> 28: deleted dated handoffs/resumes/transcripts and
superseded trackers (content folded into the master plan or already in memory).
Kept all evergreen design/reference docs + ADRs (the master links them).
Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
Resume notes for the 1.8.0 bug-bash mesh work: Meshtastic rename shipped +
verified; .120->.89 'non-delivery' traced to a duplicate-contact surfacing
bug (messages inject fine but are split across federation/radio twin contact_ids);
design for the dedup fix (#12) and the netbird logout-race map (#10).
Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
UI (this session):
- Global audio player now scales the whole interface into the space above it
on desktop (sidebar + main) and docks directly above the tab bar on mobile;
it stays visible while navigating.
- Mesh mobile redesign: floating Chat / BTC / Dead Man / AI / Map tab strip
with a single fixed, internally-scrolling pane (page no longer scrolls);
tabs hide while a conversation is open; floating back button; collapsible
Device panel (starts collapsed); keyboard-aware conversation sizing via
VisualViewport so the chat sits just above the keyboard.
- Cloud file grid: uniform 4/3 card heights (folders + images match).
- Swipe left/right switches tabs on the Apps and Web5 screens.
- Map tool fills its pane (no bottom gap); fix skewed Share Location toggle
on mobile (global min-height rule was deforming the switch).
- Trim redundant helper copy from the mesh AI tab.
Also bundles pre-existing in-progress work that was already in the tree:
mesh listener/session + wallet + container + bitcoin-status backend changes,
docker UI updates, and assorted other UI tweaks.
Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
phase4-streaming-ecash-plan.md: design for ecash-paid swarm transport, paying
across different mints (§2a, Lightning-bridged swaps), networking-through-nodes
relay, and an IndeeHub "Archipelago" content source. Records the resolved
iroh-blobs paid-serving spike. dht-RESUME.md: task #12 + step F marked done.
Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
Single source of truth for picking the DHT work back up after a restart:
worktree/branch rules, all phase commits, the exact next task (#12 Phase 3
glue), build-time facts, and the Phase 0 go-live ceremony.
Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
Captures the verified 2026-06-16 design: swarm-assist/origin-always-wins,
iroh-blobs as the swarm engine, BLAKE3 addressing, signed Nostr/release-root
authenticity, and the Phase 0-4 plan. Foundation doc for the dht branch.
Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
Make each peer file card a flex column filling its grid cell (flex flex-col
h-full) and pin the body row (filename + Play/Download) with mt-auto, so cards
with a media preview and cards without line their footers up across the row.
Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
os-audit.sh: one non-destructive scorecard tying backend/RPC health, the
all-apps lifecycle audit (delegates to remote-lifecycle.sh), and the FM-guards
(port-drift, secret-completeness, orphan-container sweep, OTA-wedge). The
per-boot building block for the reboot-survival loop. The FM12 check uses jq
has() rather than the // alternative operator, because // treats a legitimate
false value as missing. Section A validated all-PASS on .116.
docs: v1.7.91 release-pass resume notes + the bitcoinReceive blocker writeup.
Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
Before/after on the live node confirms the launch_url_port fix:
jellyfin/btcpay/fedimint/gitea/portainer/botfights all went from
lan_address=None to a resolved http://localhost:PORT/ URL; harness
focused audit passed, exit 0. Also documents that archipelago.service
restarts are safe on .116 (containers run in the user-1000 slice, a
different cgroup, and survived the restart).
Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
reachable_lan_address() parsed the launch port with url.rsplit(':')
which yields "8096/" for manifest interfaces.main URLs that carry a
path (http://localhost:8096/). That fails to parse and silently drops
a perfectly reachable launch URL, so apps like jellyfin, btcpay-server,
fedimint, gitea, nextcloud and portainer showed running with no launch
link in the UI. New launch_url_port() reads digits after the final
colon (mirroring port_from_url in the RPC layer) and tolerates a
trailing path. Adds regression tests.
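The parsing approach described above can be sketched as follows. This is a minimal illustration, assuming a u16 return type; the real launch_url_port signature and its edge-case handling are not shown in this commit and may differ.

```rust
/// Read the digits that follow the final ':' in `url`, ignoring anything
/// after them (such as a trailing "/" or a longer path).
/// Returns None when there is no colon, no digits, or the number does not
/// fit in a u16.
pub fn launch_url_port(url: &str) -> Option<u16> {
    // Split at the last colon; for "http://localhost:8096/" the tail is "8096/".
    let (_, tail) = url.rsplit_once(':')?;
    // Keep only the leading run of ASCII digits, dropping the path.
    let digits: String = tail.chars().take_while(|c| c.is_ascii_digit()).collect();
    // An empty string or an out-of-range value fails to parse and yields None.
    digits.parse().ok()
}
```

For a URL without an explicit port, such as http://localhost/, the text after the final colon is "//localhost/", which has no leading digits, so the function returns None rather than a bogus port. This sketch does not attempt to handle bracketed IPv6 hosts.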
Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
Releases no longer ship as bootable ISOs. Archipelago updates are
distributed as the backend binary plus a frontend tarball referenced by
releases/manifest.json. Nodes OTA-update via scripts/self-update.sh.
Filebrowser and AIUI remain bundled inside the frontend tarball and are
deployed atomically; both were verified present in the v1.7.43-alpha release
artifact (189 AIUI files, filebrowser-client bundle).
Archived under image-recipe/_archived/ (resurrectable if ISO distribution
is reintroduced):
- build-auto-installer-iso.sh
- build-unbundled-iso.sh
- test-iso-qemu.sh
- scripts/convert-iso-to-disk.sh
- BUILD-ISO-STATUS.md, ISO-BUILD-CHECKLIST.md
- branding/isohdpfx.bin
- .gitea/workflows/build-iso-dev.yml
Updated release process docs to drop ISO references:
- scripts/create-release.sh (next-steps text)
- docs/BETA-RELEASE-CHECKLIST.md
- docs/hotfix-process.md
- README.md
- AccountInfoSection.vue: append 5th bullet to v1.7.43-alpha entry
explaining that update-available badges and version comparisons
work again now that the pinned-image catalog is found at the
correct deployed path.
- docs/MARKETPLACE-QA.md: new tracker for the upcoming app-by-app
install walk on .228. Documents the per-app fix workflow, the
four layers we might need to fix at (app recipe, registry image,
backend orchestrator, frontend), status-key table for tracking
each catalog entry, and the release-notes policy for the walk.
- docs/RESUME.md: refresh with a9908597 commit, updated binary md5
on .228, and split Immediate Next Step into Phase 1 (browser
verification) and Phase 2 (marketplace walk) with a pointer to
the new tracker.
Consolidated single-file snapshot of plan + progress for a fresh
OpenCode session to pick up the install UX polish work:
- Where we are: v1.7.43-alpha shipped, 5 commits on main, deployed
to .228, browser verification in progress.
- Immediate next step: await user's verification results from
https://192.168.1.228/ browser checklist.
- Working layout: SSHFS mount, ssh archy / archy228, deploy recipes.
- Architecture patterns: async-spawn lifecycle, phase-based install
progress, scanner kick, .23 auto-purge migration.
- Backlog: Vaultwarden exit-on-start, install log perms, 22 stale
cargo test failures, historical changelog entries left intact.
- User preferences: "best long-term first", one-by-one, no push,
Bitcoin-only, conventional commits.
Complements STATUS.md (which remains the engineering log) with a
tighter resume-the-work narrative focused on the current round.
Adds a new top section to STATUS.md covering v1.7.43-alpha:
- Round 3: phase-based install progress bar
- Round 4: post-install scanner kick for instant Launch button
- Round 5: .23 VPS retirement, .168 promoted to Server 1
- Config migration: auto-purge .23 from saved registry/mirror JSONs
- Changelog: new v1.7.43-alpha entry in AccountInfoSection
All 5 commits, deployment md5, verification notes, and git remote
cleanup captured. Round 2 rollback command still valid for the full
stack since backups predate every round in this session.
Records the four landed commits, the .228 deploy (binary + frontend
paths, backups, md5), the manual LND Stop verification, and the
rollback incantation. Leaves the older "NEXT SESSION" design block
in place as historical reference with a note that it's stale.
Adds a follow-ups list: chaos matrix is now unblocked, bundled-app
RPCs are still sync (deprecate or mirror-async?), transitional_since
is in-memory only, and there are 22 pre-existing test failures in
unrelated modules that should get their own cleanup pass.
Dedicated section covering the file-ops-via-mount + git/cargo-via-ssh
split that makes this dev setup work. Includes:
- Exact running mount command (pulled from ps)
- macFUSE + sshfs-mac brew install path
- Health check + recovery sequence for when mount hangs (it will)
- Full which-path-for-which-operation table
- Don't-do list (cargo from mount, rsync without AppleDouble exclude, etc)
- Cache caveat and inode-sharing note between mount and SSH views
No code change.
Captures full design for the next session:
- Full bug sequence (5.5min blocking RPC + 30s scan clobbering transitional state)
- 4-commit implementation order with exact file:line targets
- Single-button UI spec with full label table
- Verification gates including manual LND stop test on .228
- Architectural decision: spawn lives in RPC layer, orchestrator trait stays sync
No code change yet; next session implements.
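The architectural decision above (spawn in the RPC layer, synchronous orchestrator trait) might look roughly like this. It is a sketch under assumptions: std::thread stands in for whatever runtime the backend actually uses, and the Orchestrator trait, AppState enum, States alias, rpc_stop, and NoopOrchestrator are all hypothetical names.

```rust
use std::collections::HashMap;
use std::sync::{Arc, Mutex};
use std::thread::{self, JoinHandle};

// Hypothetical lifecycle states; Stopping is the transitional one.
#[derive(Debug, Clone, PartialEq)]
pub enum AppState {
    Running,
    Stopping,
    Stopped,
    Failed(String),
}

// The orchestrator trait stays synchronous: stop() may block for minutes.
pub trait Orchestrator: Send + Sync + 'static {
    fn stop(&self, app_id: &str) -> Result<(), String>;
}

pub type States = Arc<Mutex<HashMap<String, AppState>>>;

/// RPC-layer handler: record the transitional state, hand the blocking call
/// to a background thread, and return immediately. The JoinHandle is returned
/// only so callers (or tests) can wait for completion if they want to.
pub fn rpc_stop<O: Orchestrator>(orch: Arc<O>, states: States, app_id: &str) -> JoinHandle<()> {
    states
        .lock()
        .unwrap()
        .insert(app_id.to_string(), AppState::Stopping);
    let id = app_id.to_string();
    thread::spawn(move || {
        // The slow, synchronous work happens off the RPC path.
        let next = match orch.stop(&id) {
            Ok(()) => AppState::Stopped,
            Err(msg) => AppState::Failed(msg),
        };
        states.lock().unwrap().insert(id, next);
    })
}

// Trivial orchestrator used only to exercise the sketch.
pub struct NoopOrchestrator;

impl Orchestrator for NoopOrchestrator {
    fn stop(&self, _app_id: &str) -> Result<(), String> {
        Ok(())
    }
}
```

Keeping the spawn in the handler means orchestrator implementations need no async machinery, at the cost of the handler owning the state bookkeeping. The sketch does not address the periodic scan overwriting the transitional state, which is a separate item in the bug sequence.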