Chat Transcript And Working Notes
Date: 2026-05-02
This file captures the current chat context, decisions, progress, and next steps so work can continue from another device/session.
User Request
The user asked to continue hardening the Archipelago app/container lifecycle, then asked several times to save the plan, progress, and next steps, and finally to save the entire chat to Markdown.
Key user constraints and corrections:
- Continue if next steps are clear; ask only if blocked.
- Exhaustively harden app/container lifecycle before release.
- Preserve data during destructive lifecycle testing unless explicitly instructed otherwise.
- Do not rely on /app/... proxy paths for app launch/testing. The user corrected: “we never use paths only ports.”
- LND/Electrum wallet-connect tests must validate real connection details and QR, including Tor.
Earlier Progress Summary
Before the latest work, the project already had substantial lifecycle hardening in progress:
- Remote lifecycle harness exists at tests/lifecycle/remote-lifecycle.sh.
- .198 SSH works with /home/archipelago/.ssh/id_ed25519.
- .228 RPC works, but SSH is blocked with Permission denied (publickey,password).
- Multiple backend release binaries were built and deployed to .198 with backups in /usr/local/bin/archipelago.bak-*.
- Fixed stale package scanner state recovery from Removing -> Running when a container is actually live.
- Fixed startup ordering so crash recovery runs before BootReconciler.
- Removed dangerous automatic Podman runtime directory deletion on podman info failure.
- Narrowed generic crash recovery to safe legacy containers.
- Fixed companion reconciliation on install/start/restart.
- Fixed uninstall/reinstall behavior so uninstall disables manifest apps instead of deleting manifest availability, and reinstall re-enables them.
- Fixed LND config generation/repair:
  - bitcoin.active=true
  - bitcoin.mainnet=true
  - bitcoin.node=bitcoind
  - bitcoind.rpchost=bitcoin-knots:8332
  - sudo fallback for writing container-owned config paths.
- .198 had previously passed focused lifecycle for filebrowser, bitcoin-knots, and a looser LND launch test.
Major Files Touched In This Session
- docs/CONTAINER_LIFECYCLE_HANDOFF.md
- docs/CHAT_TRANSCRIPT_2026-05-02.md
- tests/lifecycle/remote-lifecycle.sh
- core/archipelago/src/container/lnd.rs
- core/archipelago/src/container/companion.rs
- core/archipelago/src/container/prod_orchestrator.rs
- core/archipelago/src/container/docker_packages.rs
- core/container/src/podman_client.rs
- core/archipelago/src/port_allocator.rs
- apps/lnd-ui/manifest.yml
- neode-ui/src/views/appSession/appSessionConfig.ts
- neode-ui/src/stores/container.ts
- neode-ui/src/stores/appLauncher.ts
- neode-ui/src/views/appDetails/appDetailsData.ts
- nginx config/snippet files under scripts/ and image-recipe/
LND Wallet Bootstrap Investigation
Initial strict LND probe failed because /lnd-connect-info could not read admin.macaroon:
Failed to read LND admin macaroon — is LND installed?
direct: Permission denied (os error 13)
sudo: cat: /var/lib/archipelago/lnd/data/chain/bitcoin/mainnet/admin.macaroon: No such file or directory
LND logs showed the wallet was uninitialized/locked:
Waiting for wallet encryption password. Use lncli create...
Tests showed lncli create is interactive and does not support --stdin:
[lncli] flag provided but not defined: -stdin
lncli unlock --stdin is supported, so the final approach was:
- Use LND REST unlocker endpoints for new wallet creation.
- Use lncli unlock --stdin only for an existing wallet.
- Treat “wallet already exists” from REST as a signal to unlock.
- Use sudo-aware checks/reads for wallet artifacts because LND data directories are container-owned and 0700.
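The decision order described above can be sketched as a small pure function. This is only an illustration: the real logic lives in Rust in lnd.rs, and the function name wallet_bootstrap_action and its two inputs are hypothetical, not taken from the codebase.

```bash
# Hypothetical helper (not from the codebase): pick the wallet bootstrap step.
# $1: "yes" if wallet artifacts already exist (checked as root), otherwise "no"
# $2: error text from a REST wallet-init attempt, empty if none was returned
wallet_bootstrap_action() {
  local wallet_exists="$1" rest_error="${2:-}"
  if [ "$wallet_exists" = "yes" ]; then
    echo "unlock"            # existing wallet: unlock it, never re-create it
    return 0
  fi
  case "$rest_error" in
    "")                        echo "init" ;;    # no wallet yet: create via REST
    *"wallet already exists"*) echo "unlock" ;;  # REST says it exists: unlock instead
    *)                         echo "fail" ;;    # any other error is a real failure
  esac
}
```

Keeping the decision separate from the side effects makes the "already exists means unlock" rule easy to test without a live LND container.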
Implemented in core/archipelago/src/container/lnd.rs:
- ensure_wallet_initialized()
- file_exists_as_root()
- read_file_as_root()
- init_wallet_via_rest()
- get_lnd_unlocker_json()
- post_lnd_unlocker_json()
- unlock_existing_wallet()
- wait_for_admin_macaroon()
- lnd_getinfo_ready()
Focused Rust test passes:
cd /home/archipelago/Projects/archy/core
cargo test -p archipelago --bin archipelago lnd
Result:
7 passed; 0 failed
LND UI Port Collision
The strict LND UI test then failed with 502.
Investigation found a real port collision:
- nostr-rs-relay uses host 8081.
- Old archy-lnd-ui also used host 8081.
- nginx /app/lnd/ proxy also pointed at 8081.
Fix implemented:
- Move LND UI companion to host port 18083, container port 80.
- Keep nostr-rs-relay on 8081.
- Update app metadata/routing to 18083.
- Update tests to expect direct port launch.
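A collision like this one can be caught mechanically before deployment. The sketch below is not part of the harness; find_port_collisions is a hypothetical helper that assumes its input is one "container host_port" pair per line.

```bash
# Hypothetical helper: read "container host_port" pairs on stdin and print
# every host port that more than one container claims.
find_port_collisions() {
  awk 'NF >= 2 {
         if ($2 in owners) owners[$2] = owners[$2] "," $1
         else owners[$2] = $1
         count[$2]++
       }
       END {
         for (p in count) if (count[p] > 1) print p ": " owners[p]
       }' | sort
}
```

Feeding it the pre-fix mapping (nostr-rs-relay and archy-lnd-ui both on 8081) reports the shared port; with archy-lnd-ui moved to 18083 it prints nothing.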
Important correction from user:
we never use paths only ports, how many times do you need to be told
Action taken after correction:
- Stop validating through /app/lnd/ and /app/electrumx/ in the lifecycle harness.
- Switch launch_url_for() to direct app ports.
- Switch app session resolver to direct http://host:port launch, even from HTTPS parent pages.
- Remove use of HTTPS_PROXY_PATHS[id] in resolveAppUrl().
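A direct-port resolver in the harness could look roughly like the following. This is a sketch rather than the actual launch_url_for() body: the ports come from these notes, while the bitcoin-knots app id for the Bitcoin UI port and the error handling for unknown apps are assumptions.

```bash
# Sketch of a ports-only resolver; never emits /app/... proxy paths.
launch_url_for() {
  local app="$1" host="${ARCHY_HOST:?ARCHY_HOST must be set}"
  case "$app" in
    lnd)           echo "http://${host}:18083/" ;;
    electrumx)     echo "http://${host}:50002/" ;;
    bitcoin-knots) echo "http://${host}:8334/" ;;  # Bitcoin UI port; app id assumed
    *)
      # Assumption: fail loudly rather than fall back to a path proxy.
      echo "no direct port known for ${app}" >&2
      return 1
      ;;
  esac
}
```

Returning an error for unmapped apps keeps a missing port from silently reintroducing path-based launch.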
Direct-port LND audit command:
ARCHY_HOST=192.168.1.198 ARCHY_PASSWORD=password123 ARCHY_APPS=lnd tests/lifecycle/remote-lifecycle.sh
Result:
### 192.168.1.198 iteration 1 / 1 ###
lnd state=running
all checks passed
The audit now validates http://192.168.1.198:18083/, not /app/lnd/.
Lifecycle Harness Changes
tests/lifecycle/remote-lifecycle.sh changes made:
- Normalize package states with ascii_downcase because API returned Running.
- Direct port launch URLs:
  - LND: http://${ARCHY_HOST}:18083/
  - Electrum/Electrs: http://${ARCHY_HOST}:50002/
  - Bitcoin UI: http://${ARCHY_HOST}:8334/
  - Other apps mapped to direct ports where known.
- LND probe checks:
  - Connect Your Wallet
  - id="lndQrBox"
  - id="connHost"
  - value="rest-tor"
  - value="grpc-tor"
  - value="rest-local"
  - value="grpc-local"
  - Copy lndconnect URI
  - /lnd-connect-info cert, macaroon, ports, and Tor onion.
- Electrum probe checks:
  - local QR container and address field
  - Tor QR container and onion field
  - port 50001
  - QR renderer
  - direct http://${ARCHY_HOST}:50002/qrcode.js
  - /electrs-status Tor onion.
- Full lifecycle now fails immediately on any failed phase with || return 1 so a later reinstall cannot mask a failed restart/probe.
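The marker probes and the fail-fast rule can be illustrated with two small functions. Both names (probe_page_markers, run_phases) are hypothetical and the real harness may be structured differently; the sketch assumes the page body has already been saved to a file.

```bash
# Succeed only if every marker string appears literally in the saved page body.
probe_page_markers() {
  local body_file="$1"
  shift
  local marker
  for marker in "$@"; do
    if ! grep -qF -- "$marker" "$body_file"; then
      echo "missing marker: ${marker}" >&2
      return 1
    fi
  done
}

# Run each phase (a command or function name) in order and stop at the first
# failure, so a later phase cannot mask an earlier one.
run_phases() {
  local phase
  for phase in "$@"; do
    echo "== ${phase} =="
    "$phase" || return 1
  done
}
```

A call such as probe_page_markers body.html 'Connect Your Wallet' 'id="lndQrBox"' 'value="rest-tor"' would cover part of the LND checklist; fixed-string matching (grep -F) avoids treating the quotes or dots in a marker as regex syntax.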
Deployments To .198
Several release builds were made and deployed:
cd /home/archipelago/Projects/archy/core
cargo build -p archipelago --bin archipelago --release
Deploy pattern:
scp -i /home/archipelago/.ssh/id_ed25519 -o StrictHostKeyChecking=no \
/home/archipelago/Projects/archy/core/target/release/archipelago \
archipelago@192.168.1.198:/tmp/archipelago.new
ssh -i /home/archipelago/.ssh/id_ed25519 -o StrictHostKeyChecking=no \
archipelago@192.168.1.198 \
"sudo cp /usr/local/bin/archipelago /usr/local/bin/archipelago.bak-<timestamp> && \
sudo install -m 0755 /tmp/archipelago.new /usr/local/bin/archipelago && \
sudo systemctl restart archipelago.service && \
systemctl is-active archipelago.service"
Latest deploy returned:
active
.198 Current Observations
After forcing LND package restart, companion reconciliation succeeded:
nostr-rs-relay Up ... 0.0.0.0:8081->8080/tcp
lnd Up ... 0.0.0.0:8080->8080/tcp, 0.0.0.0:9735->9735/tcp, 0.0.0.0:10009->10009/tcp
archy-lnd-ui Up ... 0.0.0.0:18083->80/tcp
Direct UI test from .198 returned 200:
curl -i http://127.0.0.1:18083/
Strict direct-port LND audit is green:
lnd state=running
all checks passed
Full LND Lifecycle Status
Full direct-port lifecycle was started:
ARCHY_HOST=192.168.1.198 ARCHY_PASSWORD=password123 ARCHY_APPS=lnd ARCHY_FULL_LIFECYCLE=1 tests/lifecycle/remote-lifecycle.sh
It reached:
### 192.168.1.198 iteration 1 / 1 ###
== lnd: install ==
== lnd: stop ==
Then the user aborted the command while asking to save memory/transcript.
The next continuation point is to rerun full LND direct-port lifecycle from scratch and inspect the stop phase if it hangs/fails.
Handoff File
A durable handoff file was also created:
docs/CONTAINER_LIFECYCLE_HANDOFF.md
It contains the plan, progress, current blockers, and next steps.
Immediate Next Steps
- Rerun full strict LND direct-port lifecycle:
ARCHY_HOST=192.168.1.198 ARCHY_PASSWORD=password123 ARCHY_APPS=lnd ARCHY_FULL_LIFECYCLE=1 tests/lifecycle/remote-lifecycle.sh
- If it hangs/fails at stop, inspect package runtime stop path and logs:
ssh -i /home/archipelago/.ssh/id_ed25519 -o StrictHostKeyChecking=no archipelago@192.168.1.198 \
'journalctl -u archipelago.service -n 260 --no-pager | egrep -i "package\.(stop|start|restart|install|uninstall)|lnd|companion|error|failed" | sed -n "1,220p"; podman ps -a --format "{{.Names}} {{.Status}} {{.Ports}}" | egrep "lnd|nostr" || true'
- If stop is unreliable, inspect/fix:
  - core/archipelago/src/api/rpc/package/runtime.rs
  - core/archipelago/src/container/prod_orchestrator.rs
Likely causes to check:
- Reconciler restarting LND while stop is expected.
- State scanner reporting stale running.
- Companion handling interfering with parent app state.
- Async lifecycle returning before actual stop completes.
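To tell an asynchronous stop that returns early apart from a genuinely stuck one, the harness could poll the reported state after issuing stop. The sketch below is an assumption about how that might look: wait_for_state is a hypothetical helper, the state-reporting command is supplied by the caller, and the state name stopped is assumed rather than confirmed by the API.

```bash
# Hypothetical helper: poll a caller-supplied command until it prints the
# wanted state (compared case-insensitively) or the attempts run out.
# Usage: wait_for_state <want> <attempts> <delay_seconds> <command> [args...]
wait_for_state() {
  local want="$1" attempts="$2" delay="$3"
  shift 3
  local state="" i=0
  while [ "$i" -lt "$attempts" ]; do
    # Lowercase the output because the API has returned capitalized states.
    state="$("$@" 2>/dev/null | tr '[:upper:]' '[:lower:]')"
    if [ "$state" = "$want" ]; then
      return 0
    fi
    sleep "$delay"
    i=$((i + 1))
  done
  echo "timed out waiting for ${want}; last state=${state}" >&2
  return 1
}
```

For example, wait_for_state stopped 30 2 get_package_state lnd, where get_package_state stands in for whatever RPC wrapper the harness uses. If the state flips back to running during the wait, that points at the reconciler or the scanner rather than at the stop call itself.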
- Once LND full lifecycle is green, run Electrum strict lifecycle with direct port 50002:
ARCHY_HOST=192.168.1.198 ARCHY_PASSWORD=password123 ARCHY_APPS=electrumx ARCHY_FULL_LIFECYCLE=1 tests/lifecycle/remote-lifecycle.sh
- Continue with app groups after LND/Electrum:
  - filebrowser
  - bitcoin-knots
  - lnd
  - electrumx
  - mempool
  - btcpay-server
  - fedimint
  - remaining catalog apps.
Important Instruction To Preserve
Use ports only for app launch/testing. Do not add or rely on /app/... path proxy launch behavior unless the user explicitly changes this requirement.