RESUME — Rust orchestrator migration, Step 8b
Last updated: 2026-04-23 (evening, post-architecture-audit)
Read this first if you're a fresh OpenCode session resuming work. Paste the "Resume prompt" below verbatim.
Resume prompt (paste this into a new opencode session)
We are mid-migration:
docs/rust-orchestrator-migration.md + docs/bulletproof-containers.md are the plan, Steps 1–7 + 8a are shipped on main, Step 8b is next. Read docs/RESUME.md + docs/STEP-8B-PORT-AUDIT.md in full. Do NOT run any container mutations or edit scripts/container-specs.sh, scripts/first-boot-containers.sh, or scripts/reconcile-containers.sh; those are dead code scheduled for deletion in Step 8c. Work happens in core/container/src/manifest.rs, core/archipelago/src/container/prod_orchestrator.rs, and apps/<id>/manifest.yml. Summarize back to me what you understand the current state to be, and wait for approval before touching anything.
Standing directive from the user
Please get back to a well architected, minimal as possible, perfect working container architecture. If we've gone off track and the system is getting complex rather than elegant and perfect best containers ever then we need to review all the current state of the system and get back to making the best container system ever and according to our projects goals. We will be working on this until it's perfect.
Interpretation (validated with the user): resume the Rust orchestrator migration. Stop patching bash scripts. The bash scripts were supposed to be deleted three months of commits ago and we drifted into maintaining them by accident.
Latest user comment (must be followed)
please continue, please state my last comment in the resume doc and first before making this plan to adhere to
Adherence rule for this session:
- Before proposing or executing a plan, first record the user's latest directive in docs/RESUME.md.
- Keep work aligned to Step 8 migration goals and avoid off-scope drift.
Most recent directive:
And we need to get every container working on .116 and tested before we release
Release gate update:
- .116 must have all required containers healthy and tested before release is allowed.
- Treat runtime stabilization on .116 as immediate priority while continuing Step 8 migration work.
Where we actually are
Shipped (Steps 1–7 + 8a)
Commits on main (unpushed to origin/tx1138 until release gate; user-visible history):
| Step | Commit | What |
|---|---|---|
| 1 | (schema in place from earlier commits) | ContainerConfig.image ⊕ ContainerConfig.build - mutually exclusive pull-or-build source |
| 2 | 34af4d9d | ContainerRuntime trait gains image_exists + build_image; PodmanRuntime impl |
| 3 | b6a04d31 | ProdContainerOrchestrator with build-or-pull + adoption + reconcile |
| 4 | e8a59c93 | ContainerOrchestrator trait; RpcHandler uses it in prod |
| 5 | fc39b04b | BootReconciler - periodic reconcile loop |
| 6 | 48f08aa3 | Wire both into main.rs |
| 7 | 069bc4a5 | bitcoin-ui pre-start hook renders nginx.conf from embedded template (the pattern for "derived config" at apply time) |
| 8a | a0707f4d, 1c81a739 | Retire archipelago-reconcile systemd timer; split Step 8 into 8a/8b/8c |
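
For orientation, the Step 1 "mutually exclusive pull-or-build source" rule could be enforced roughly as below. This is a minimal sketch, not the shipped code: only the image and build field names come from the table, while BuildSpec, Source, and the source() method are hypothetical names introduced for illustration.

```rust
/// Hypothetical build description; the real schema may carry more fields.
#[derive(Debug, Clone, PartialEq)]
pub struct BuildSpec {
    pub context: String,
}

/// Only the two fields named in the table; all other fields are omitted.
#[derive(Debug, Clone, Default)]
pub struct ContainerConfig {
    pub image: Option<String>,
    pub build: Option<BuildSpec>,
}

/// Resolved image source: exactly one of pull or build.
#[derive(Debug, PartialEq)]
pub enum Source<'a> {
    Pull(&'a str),
    Build(&'a BuildSpec),
}

impl ContainerConfig {
    /// Returns the single configured source, or an error when both or
    /// neither of `image` and `build` are set.
    pub fn source(&self) -> Result<Source<'_>, String> {
        match (&self.image, &self.build) {
            (Some(img), None) => Ok(Source::Pull(img.as_str())),
            (None, Some(spec)) => Ok(Source::Build(spec)),
            (Some(_), Some(_)) => Err("image and build are mutually exclusive".to_string()),
            (None, None) => Err("one of image or build is required".to_string()),
        }
    }
}
```

Resolving the source once, up front, means later orchestrator code never has to re-check both options.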
Three manifests under apps/*/manifest.yml are genuinely ported and running under the Rust orchestrator on .116 + .228: bitcoin-ui, electrs-ui, lnd-ui (Step 7).
Where we drifted (the session that produced the previous RESUME.md)
On 2026-04-23 a fedimint outage on .116 pulled a session into patching scripts/reconcile-containers.sh, scripts/container-specs.sh, scripts/first-boot-containers.sh — files that Step 8c is scheduled to delete. Five bugs deep, the user halted the session. That cluster of bugs is a symptom of running two incompatible codepaths in parallel (bash first-boot/reconcile + Rust BootReconciler), which is exactly the condition Step 8c fixes by deleting the bash half.
Discard-of-scope decision: the uncommitted bash edits on .116 (listed in the previous RESUME.md's "Uncommitted script changes" section) are not going to be committed. The fedimint mDNS-URLs fix, the filebrowser custom-args fix, the bcrypt-escape fix — these all land as changes to apps/<id>/manifest.yml + the Rust orchestrator in Steps 8b.0 – 8b.3. See docs/STEP-8B-PORT-AUDIT.md for the exact mapping.
Current container state on .116
Running but drifted. See the "Current container state" section in the previous RESUME.md. Decision (approved by user): accept .116 is limping until 8b.3 lands. Do not run scripts/reconcile-containers.sh or any mutations; all rescues go through the Rust orchestrator or wait for the manifest port.
.228 is happier — it's already adopted by the Rust orchestrator for the three UI apps.
Next step — Step 8b.0
Concretely: schema extensions to core/container/src/manifest.rs + unit tests. No orchestrator changes, no manifest changes, no container mutations.
Fields to add (justified in docs/STEP-8B-PORT-AUDIT.md § Schema gaps):
- container.network: Option<String> - podman --network value ("archy-net", "host", or None = isolated default).
- container.custom_args: Vec<String> - appended to the container command.
- container.entrypoint: Option<Vec<String>> - override.
- container.derived_env: Vec<{key, template}> - template strings resolved against HostFacts { host_ip, host_mdns, disk_gb } at apply time.
- container.secret_env: Vec<{key, secret_file}> - read from /var/lib/archipelago/secrets/<file> at apply time.
- container.data_uid: Option<String> - "NNNNN:NNNNN" applied via chown -R before container create.
- Volume.volume_type: "tmpfs" + Volume.tmpfs_options: String - OR a new container.tmpfs: Vec<{target, options}>. Pick one at implementation time.
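
As a rough picture of the additions, the field list above might translate into something like the following. This is a sketch under assumptions: the struct name ContainerConfigExt, the DerivedEnv and SecretEnv helper types, and the network_args and parse_data_uid helpers are all hypothetical, and the tmpfs field is left out because its shape is still undecided. The real fields would be added to the existing ContainerConfig with whatever deserialization attributes manifest.rs already uses.

```rust
/// One derived_env entry: `template` is resolved against host facts at apply time.
#[derive(Debug, Clone, PartialEq)]
pub struct DerivedEnv {
    pub key: String,
    pub template: String,
}

/// One secret_env entry: the value is read from a file in the secrets directory.
#[derive(Debug, Clone, PartialEq)]
pub struct SecretEnv {
    pub key: String,
    pub secret_file: String,
}

/// Only the proposed Step 8b.0 additions; existing fields are omitted.
/// Every field defaults to "absent" so older manifests keep their meaning.
#[derive(Debug, Clone, Default, PartialEq)]
pub struct ContainerConfigExt {
    pub network: Option<String>,
    pub custom_args: Vec<String>,
    pub entrypoint: Option<Vec<String>>,
    pub derived_env: Vec<DerivedEnv>,
    pub secret_env: Vec<SecretEnv>,
    pub data_uid: Option<String>,
}

impl ContainerConfigExt {
    /// Podman arguments implied by `network`; `None` adds nothing, so the
    /// runtime default applies.
    pub fn network_args(&self) -> Vec<String> {
        match &self.network {
            Some(net) => vec!["--network".to_string(), net.clone()],
            None => Vec::new(),
        }
    }
}

/// Parses a "UID:GID" string such as the "NNNNN:NNNNN" form of data_uid.
pub fn parse_data_uid(s: &str) -> Result<(u32, u32), String> {
    let (uid, gid) = s
        .split_once(':')
        .ok_or_else(|| format!("data_uid {s:?} is not UID:GID"))?;
    let uid = uid.parse::<u32>().map_err(|e| format!("bad uid in {s:?}: {e}"))?;
    let gid = gid.parse::<u32>().map_err(|e| format!("bad gid in {s:?}: {e}"))?;
    Ok((uid, gid))
}
```

Deriving Default on the whole struct is one way to satisfy the backwards-compatibility requirement: a manifest that names none of the new fields behaves exactly as before.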
Tests (block the commit until green):
- Every existing apps/*/manifest.yml still parses (parse_every_real_manifest test).
- Each new field parses correctly with sensible defaults.
- validate() rejects: empty custom_args elements, empty entrypoint elements, duplicate derived_env keys, derived_env templates referencing unknown host facts, secret_env with .. or / in secret_file (path-traversal guard).
- resolve_env(HostFacts) returns expected strings for each supported placeholder.
- resolve_secret_env(SecretsProvider) returns expected strings; missing secret file is a hard error.
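
The rejection rules in the validate() bullet could be sketched as follows. Assumptions are labeled: the {name} placeholder syntax, the free-function signature, and the use of plain tuples instead of the real manifest types are all illustrative choices, not the audited design.

```rust
use std::collections::HashSet;

/// Host facts named in the field list; any other placeholder is rejected.
const KNOWN_FACTS: [&str; 3] = ["host_ip", "host_mdns", "disk_gb"];

/// Extracts `{name}` placeholders from a template (assumed syntax).
/// An unclosed `{` is ignored in this sketch.
fn placeholders(template: &str) -> Vec<&str> {
    let mut out = Vec::new();
    let mut rest = template;
    while let Some(start) = rest.find('{') {
        let after = &rest[start + 1..];
        match after.find('}') {
            Some(end) => {
                out.push(&after[..end]);
                rest = &after[end + 1..];
            }
            None => break,
        }
    }
    out
}

/// Applies the rejection rules listed above. `derived_env` is a list of
/// (key, template) pairs; `secret_files` are the secret_file values.
pub fn validate(
    custom_args: &[String],
    entrypoint: Option<&[String]>,
    derived_env: &[(String, String)],
    secret_files: &[String],
) -> Result<(), String> {
    if custom_args.iter().any(|a| a.is_empty()) {
        return Err("custom_args contains an empty element".to_string());
    }
    if entrypoint.map_or(false, |e| e.iter().any(|a| a.is_empty())) {
        return Err("entrypoint contains an empty element".to_string());
    }
    let mut seen = HashSet::new();
    for (key, template) in derived_env {
        if !seen.insert(key.as_str()) {
            return Err(format!("duplicate derived_env key {key}"));
        }
        for name in placeholders(template) {
            if !KNOWN_FACTS.contains(&name) {
                return Err(format!("derived_env {key} references unknown host fact {name}"));
            }
        }
    }
    // Path-traversal guard: a secret_file must be a bare file name.
    for file in secret_files {
        if file.contains("..") || file.contains('/') {
            return Err(format!("secret_file {file:?} must not contain .. or /"));
        }
    }
    Ok(())
}
```

Returning on the first failure keeps the sketch short; a real validate() may prefer to collect every error so one pass reports them all.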
This is the smallest useful commit and unblocks every port in 8b.1+.
Project ground rules (standing)
- archy SSH alias = .116. archy228 = .228. Do not swap.
- SSHFS at /Users/dorian/mnt/archy-thinkpad/ = archy:Projects/archy/.
- .116 sudo password: ThisIsWeb54321@; works passwordless in-session via sudo -nS after first use. .228 has NOPASSWD.
- Git commits on .116 MUST use git commit -F /tmp/tmp-msg.txt over ssh archy; git commit through SSHFS hangs.
- Never push except current release (granted: gitea-local + gitea-vps2).
- No em-dashes. Conventional Commits.
- No altcoin mentions, Bitcoin-only.
Recommended next action for the fresh session
- Read this file + docs/STEP-8B-PORT-AUDIT.md + the "Open decisions" section of the audit.
- Answer the four open decisions (or confirm the recommended defaults).
- Implement 8b.0 commit 1: add network, custom_args, entrypoint, derived_env, secret_env, data_uid fields to ContainerConfig + validation + unit tests. Backwards-compat: every existing apps/*/manifest.yml must still parse.
- Commit + cargo test -p archipelago-container + stop.
Do not touch scripts/*.sh. Do not run reconcile-containers.sh. Do not live-test on .116 or .228 until the schema + orchestrator pieces in 8b.0 + 8b.1 are both in.
Recent release (out of scope, for grep context)
v1.7.43-alpha shipped yesterday: tarball-only OTA, async install/uninstall/update lifecycle, install UX polish, .23 VPS retirement. Manifest at gitea-local + gitea-vps2. .228 on the new binary. See docs/STATUS.md for the full rundown.
Earlier session notes (container rescue on .116, "never fails" directive, env-drift detector experiment) are obsolete — superseded by this file. The directive ("never fails") is honored by the Step 8 migration itself: a declarative manifest regenerated on every reconcile tick can't bake stale IPs into consensus data because the env comes from derived/secret sources that are re-resolved every apply.
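
To make the "re-resolved every apply" argument concrete, here is a minimal sketch of env resolution. It assumes a {name} placeholder syntax and free functions over (key, value) pairs; the HostFacts field types, the SecretsProvider trait shape, and the MemorySecrets test double are hypothetical, and a real provider would read files under /var/lib/archipelago/secrets/.

```rust
use std::collections::HashMap;

/// Host facts named in the schema notes; the field types are assumptions.
#[derive(Debug, Clone)]
pub struct HostFacts {
    pub host_ip: String,
    pub host_mdns: String,
    pub disk_gb: u64,
}

/// Resolves `{host_ip}`, `{host_mdns}` and `{disk_gb}` in each template.
/// Nothing is cached, so each call reflects the facts passed in.
pub fn resolve_env(derived: &[(String, String)], facts: &HostFacts) -> Vec<(String, String)> {
    derived
        .iter()
        .map(|(key, template)| {
            let value = template
                .replace("{host_ip}", &facts.host_ip)
                .replace("{host_mdns}", &facts.host_mdns)
                .replace("{disk_gb}", &facts.disk_gb.to_string());
            (key.clone(), value)
        })
        .collect()
}

/// Hypothetical abstraction over the secrets directory.
pub trait SecretsProvider {
    fn read(&self, file: &str) -> Result<String, String>;
}

/// In-memory provider for tests.
pub struct MemorySecrets(pub HashMap<String, String>);

impl SecretsProvider for MemorySecrets {
    fn read(&self, file: &str) -> Result<String, String> {
        self.0
            .get(file)
            .cloned()
            .ok_or_else(|| format!("missing secret file {file}"))
    }
}

/// Resolves (key, secret_file) pairs; the first missing file aborts the
/// whole resolution, matching the "hard error" requirement.
pub fn resolve_secret_env(
    secret_env: &[(String, String)],
    secrets: &dyn SecretsProvider,
) -> Result<Vec<(String, String)>, String> {
    secret_env
        .iter()
        .map(|(key, file)| secrets.read(file).map(|value| (key.clone(), value)))
        .collect()
}
```

Calling resolve_env again after the host's IP changes yields the new address without any manifest edit, which is the property the paragraph above relies on.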