# Current Agent Handoff - Bitcoin UI Recovery And `1.8-alpha` Resume

Last updated: 2026-06-10 05:33 EDT

## Read This First

This is a separate handoff from `docs/NEXT_TERMINAL_HANDOFF.md`. That file tracks
an older/broader plan. For the next agent resuming this machine-switch pause,
read this file first, then read:

- `docs/RESUME.md`
- `docs/1.8-alpha-improvements-tracker.md`
- `docs/CONTAINER_LIFECYCLE_HANDOFF.md`
- `docs/MIGRATION_STATUS_REPORT.md`

Do not assume `docs/NEXT_TERMINAL_HANDOFF.md` is the current short-term plan.

## Current Goal

Cut Archipelago `1.8-alpha`, including a ready-to-test ISO image.

The release goal is not just "apps launch once"; the app/container system needs
to be developer-ready and production-release ready:

- manifests and docs must describe the real runtime contract;
- apps must install, start, stop, restart, uninstall, reinstall, survive reboot,
  report truthful status, and show useful progress;
- My Apps must preserve last-known truth during Podman/scanner backoff instead
  of showing false empty/no-app states;
- Bitcoin-dependent apps must explain sync/wallet readiness instead of looking
  broken;
- final validation must run the focused lifecycle pass, then the broad
  non-destructive lifecycle pass, then repeated reboot checks before the ISO
  cut and smoke test.

## Current Estimate

As of this pause:

- Credible release candidate: roughly `87-91%`.
- Production-quality release developers will love: roughly `73-79%`.
- Calendar estimate if the remaining systemic lifecycle issues are bounded:
  `1-2 focused engineering days` for a release candidate, then additional
  reboot/ISO smoke time.
- The biggest remaining risk is not catalog wiring; it is rootless Podman
  control-plane responsiveness, stale scanner state, lifecycle progress UX, and
  reboot validation.

## Validation Host

- Host: `192.168.1.198`
- SSH user: `archipelago`
- Password used in this session: `password123`
- Active Bitcoin app on this host: `bitcoin-knots`, not `bitcoin-core`
- Keep `archipelago-doctor.timer` and `archipelago-reconcile.timer` inactive
  for deterministic validation unless intentionally testing them.
- Preserve app data.
- Avoid broad Podman store/image cleanup commands on `.198`.

## Bitcoin UI Incident Summary

The user reported the Bitcoin custom UI showing:

`Bitcoin node is starting or busy syncing; retrying automatically. Detail:
getblockchaininfo: Bitcoin RPC request failed ... operation timed out`

After the listener repair, the message progressed through:

- `Connection refused`
- `Verifying blocks...`
- then the user reported it looked fine again.

What happened:

- The node is a `bitcoin-knots` node.
- During live debugging, the wrong alias, `bitcoin-core`, was started/stopped.
- `bitcoin-core` and `bitcoin-knots` compete for the same Bitcoin RPC/P2P ports.
- That action left the real `bitcoin-knots` service active but without the host
  `8332` rootlessport listener for a while.
- Stopping the stray `bitcoin-core.service` and restarting only
  `bitcoin-knots.service` recreated listeners on `8332` and `8333`.
- After restart, bitcoind entered the normal `-28 Verifying blocks...` phase.
- The user later reported the Bitcoin UI looked fine again.

Known live state observed during recovery:

- `bitcoin-knots.service`: active
- `bitcoin-core.service`: inactive
- `archy-bitcoin-ui.service`: active
- listeners present after repair:
  - `8332` via `rootlessport`
  - `8333` via `rootlessport`
  - `8334` via nginx/Bitcoin UI
- `bitcoin-knots` logs showed active IBD around height `4137xx` and progress
  about `0.09438`.

Do not restart Bitcoin again unless there is a fresh confirmed service/listener
failure. If checking status, prefer read-only probes and avoid starting the
wrong variant.

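
A read-only listener check could look like the sketch below. It only parses `ss -ltnH` output and reports which expected ports are missing; it never starts, stops, or restarts a unit. The helper names `listening_ports` and `missing_ports` are hypothetical and do not exist in the repo.

```bash
# Read-only listener probe: inspects `ss` output, never touches a service.

# Print the local port of every socket line read from stdin.
# Expects `ss -ltnH` format, where column 4 is "address:port".
listening_ports() {
  awk '{ n = split($4, parts, ":"); print parts[n] }'
}

# Print each expected port (arguments) that is absent from the `ss`
# output on stdin. Returns non-zero if any port is missing.
missing_ports() {
  local seen port missing=0
  seen="$(listening_ports)"
  for port in "$@"; do
    if ! printf '%s\n' "$seen" | grep -qx "$port"; then
      echo "$port"
      missing=1
    fi
  done
  return "$missing"
}

# Example, run on the validation host:
#   ss -ltnH | missing_ports 8332 8333 8334
```

Unit state can be read the same way with `systemctl is-active bitcoin-knots.service bitcoin-core.service`. Add `--user` if the units are user-level; whether they are is an assumption not confirmed in this handoff.
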
## Source Fixes Made Locally

These local edits were made after live Bitcoin recovered. They are not deployed
yet and were not fully validated before the user paused.

### `core/archipelago/src/bitcoin_status.rs`

Changed Bitcoin status cache behavior and copy:

- refresh interval changed from `5s` to `10s`;
- transient error backoff added at `15s`;
- RPC client timeout increased from `8s` to `20s`;
- error context now uses the full `anyhow` chain via `{e:#}`;
- transient classifications now include common overloaded/backend states;
- user-facing copy now distinguishes:
  - `verifying blocks after restart`;
  - `waiting for the Bitcoin RPC listener`;
  - `busy and not answering RPC before the timeout`;
  - generic `starting or busy syncing`;
- added unit tests for the three specific user-visible states above.

Intent: stop collapsing distinct backend states into the same stale
"starting or busy syncing" timeout message.

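
The classification idea can be illustrated with a small shell sketch. The real logic lives in Rust in `bitcoin_status.rs` and is not reproduced here; the function name `classify_bitcoin_rpc_error` and the match patterns are assumptions based only on the error strings seen during the incident.

```bash
# Map an RPC error detail string to one of the user-facing states.
# Patterns are illustrative assumptions, not the Rust implementation.
classify_bitcoin_rpc_error() {
  local detail="$1"
  case "$detail" in
    *"Verifying blocks"*)   echo "verifying blocks after restart" ;;
    *"Connection refused"*) echo "waiting for the Bitcoin RPC listener" ;;
    *"timed out"*)          echo "busy and not answering RPC before the timeout" ;;
    *)                      echo "starting or busy syncing" ;;
  esac
}

# Example:
#   classify_bitcoin_rpc_error "Bitcoin RPC request failed: Connection refused"
```

The order of the arms matters: the most specific states are tested first, and the generic copy is only the fallback.
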
### `core/archipelago/src/api/rpc/package/update.rs`

Narrow Bitcoin alias fix added:

- `orchestrator_update_app_id("bitcoin-knots")` now remains
  `"bitcoin-knots"` instead of mapping to `"bitcoin-core"`;
- candidate app IDs for a Bitcoin container now prefer `bitcoin-knots` before
  `bitcoin-core`;
- tests updated to lock this behavior.

Intent: `bitcoin-core` and `bitcoin-knots` can be dependency/status aliases,
but must not be interchangeable lifecycle/update targets on a node that has a
specific installed variant.

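
The preference order can be sketched in shell as follows. This is only an illustration of "prefer `bitcoin-knots`, fall back to `bitcoin-core`, never act on a variant that is not installed"; the helper `pick_installed_bitcoin` is hypothetical and is not the Rust code in `update.rs`.

```bash
# Print the first Bitcoin variant, in preference order, that appears in
# the list of installed app IDs passed as arguments.
# Returns non-zero when no Bitcoin variant is installed.
pick_installed_bitcoin() {
  local candidate installed
  for candidate in bitcoin-knots bitcoin-core; do
    for installed in "$@"; do
      if [ "$candidate" = "$installed" ]; then
        echo "$candidate"
        return 0
      fi
    done
  done
  return 1
}

# Example:
#   pick_installed_bitcoin lnd bitcoin-knots
```

Resolving the target from the installed set, rather than from a fixed alias, is what keeps a lifecycle action from landing on the wrong variant.
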
Important: this file already contained other uncommitted update/pull
timeout changes from prior work. Do not assume every diff in this file came
from this interruption.

## Validation Status At Pause

Completed:

- `cargo fmt --manifest-path core/Cargo.toml --all` passed after the local
  Bitcoin edits.

Attempted but not completed:

- Targeted Cargo tests were first launched in three separate `/tmp` target dirs
  and failed because `/tmp` filled up (`No space left on device`).
- Those temporary dirs were removed:
  - `/tmp/archy-cargo-bitcoin-status`
  - `/tmp/archy-cargo-update-alias`
  - `/tmp/archy-cargo-container-candidates`
- A second run using `CARGO_TARGET_DIR=.codex-tmp/cargo-bitcoin-fix` was still
  compiling when the user paused. It was terminated for handoff.
- No successful Rust test result exists yet for the new Bitcoin status/alias
  tests.

Recommended validation after resume:

```bash
git diff --check -- core/archipelago/src/bitcoin_status.rs core/archipelago/src/api/rpc/package/update.rs docs/CURRENT_AGENT_HANDOFF.md
CARGO_TARGET_DIR=.codex-tmp/cargo-bitcoin-fix CARGO_BUILD_JOBS=2 cargo test --manifest-path core/Cargo.toml -p archipelago bitcoin_status::tests
CARGO_TARGET_DIR=.codex-tmp/cargo-bitcoin-fix CARGO_BUILD_JOBS=2 cargo test --manifest-path core/Cargo.toml -p archipelago update_aliases_map_to_manifest_app_ids
CARGO_TARGET_DIR=.codex-tmp/cargo-bitcoin-fix CARGO_BUILD_JOBS=2 cargo test --manifest-path core/Cargo.toml -p archipelago container_name_candidates_cover_common_aliases
```

If Cargo target locking appears stale, check for real `cargo`/`rustc` workers
before deleting anything. Prefer workspace-local target dirs under `.codex-tmp`
over new cold `/tmp` targets.

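
Because the first test run failed when `/tmp` filled up, a free-space check before a cold build may save a wasted compile. This is a minimal sketch, assuming `df -Pk` is available; the helpers `available_kib` and `has_room` and the 10 GiB threshold are illustrative, not measured requirements.

```bash
# Print the KiB available on the filesystem that holds the given directory.
# `df -P` forces the POSIX one-line-per-filesystem format, so column 4
# of the second line is "Available".
available_kib() {
  df -Pk "$1" | awk 'NR == 2 { print $4 }'
}

# Succeed when at least <min_kib> KiB are free under <dir>.
has_room() {
  [ "$(available_kib "$1")" -ge "$2" ]
}

# Example: require roughly 10 GiB (illustrative) before building.
#   has_room . $((10 * 1024 * 1024)) || echo "not enough space for a cold target dir"
```
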
## Immediate Next Steps

1. Confirm no lingering Cargo process:

   ```bash
   pgrep -af "cargo|rustc|cargo-bitcoin-fix"
   ```

2. Validate the local Bitcoin source fixes listed above.

3. If validation passes, build/deploy the backend to `.198` only after
   confirming the user still wants deployment.

4. Recheck live Bitcoin non-destructively:

   - `bitcoin-knots.service` active;
   - `bitcoin-core.service` inactive;
   - listeners on `8332`, `8333`, `8334`;
   - Bitcoin UI loads on `8334`;
   - `/bitcoin-status` returns useful copy if backend is busy.

5. Resume release backlog:

   - rootless Podman lifecycle/control-plane responsiveness;
   - My Apps last-known-state truthfulness during scanner backoff;
   - progress UX for install/uninstall/start/stop/restart;
   - remaining tracker rows in `docs/1.8-alpha-improvements-tracker.md`;
   - focused lifecycle matrix on `.198`;
   - broad non-destructive lifecycle;
   - 3 clean reboot validations minimum, 5 preferred;
   - ISO cut and ISO smoke test.

## Cautions For Next Agent

- Do not start `bitcoin-core` on `.198` unless intentionally migrating variants.
- Treat `bitcoin-knots` as the installed Bitcoin variant.
- Do not run broad Podman prune/store cleanup.
- Do not revert unrelated dirty worktree changes.
- `docs/NEXT_TERMINAL_HANDOFF.md` exists but is not the short-term handoff for
  this pause.
- Many repo files are dirty from broader release hardening. Read diffs before
  attributing changes.