archy/docs/SESSION-1.8.0-OTA-PROGRESS.md
2026-06-30 05:08:17 -04:00

# 1.8.0 OTA Session Progress
Updated: 2026-06-29
Current scope:
- Preserve existing mesh work: E2E indicators, FIPS/Tor transport indicators, typed-message paths, Meshtastic region/channel provisioning, and dirty Meshtastic receive-attempt changes.
- Take over the `3ccc` stock Meshtastic peer bug: LoRa text from `3ccc` to Archipelago `.116` does not surface in `mesh.messages`.
- Keep release-gate fixes already made in this session.
Local gate status so far:
- `cargo test -p archipelago --bin archipelago`: green, 849/849 after Meshtastic fixes.
- `python3 scripts/check-app-catalog-drift.py --release --strict`: green.
- `npm run type-check`: green.
Key changes made so far:
- Added cascade uninstall progress truthfulness assertion to `tests/lifecycle/bats/cascade-uninstall.bats`.
- Fixed release catalog drift filters and regenerated catalog metadata.
- Fixed invalid `apps/fedimint-clientd/manifest.yml` `cpu_limit` schema value.
- Updated stale/tight Rust tests without changing production behavior.
Remaining non-automatable / operational gates:
- Workstream B signing is blocked on the offline `RELEASE_MASTER_MNEMONIC`; code + runbook exist, but the publisher must pin/sign the release-root catalog.
- Phase-3 Quadlet backend rollout is implemented behind `use_quadlet_backends` and is default-off. The gate skips (and therefore passes) until the flag is explicitly enabled on a node; flipping it fleet-wide requires a coordinated flag rollout plus backend reinstall/migration verification.
- Read-only run of `use-quadlet-backends-install.bats` on `.116`: 6/6 skipped cleanly; no backend `.container` units exist, so Phase-3 is not active on that node.
- Release metadata in `releases/manifest.json` still says `1.7.99-alpha`, while the top changelog entry is `v1.8.00-alpha`. Cutting an actual 1.8.0 OTA requires an explicit version/manifest update.
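
The manifest/changelog version mismatch noted above could be caught mechanically before cutting the OTA. The following is a minimal sketch, not the project's release tooling: `Version`, `parse_version`, and `versions_agree` are hypothetical names, and the parser is deliberately lenient (it accepts a leading `v` and a zero-padded patch such as `00`, which strict SemVer would reject).

```rust
/// Parsed `major.minor.patch[-pre]` version. Hypothetical helper type.
#[derive(Debug, PartialEq, Eq)]
pub struct Version {
    pub major: u64,
    pub minor: u64,
    pub patch: u64,
    pub pre: Option<String>,
}

/// Leniently parse a version string such as `1.7.99-alpha` or `v1.8.00-alpha`.
/// Returns `None` unless there are exactly three numeric components.
pub fn parse_version(raw: &str) -> Option<Version> {
    let raw = raw.trim();
    // Accept an optional leading `v`, as used in the changelog heading.
    let raw = raw.strip_prefix('v').unwrap_or(raw);
    let (core, pre) = match raw.split_once('-') {
        Some((core, pre)) => (core, Some(pre.to_string())),
        None => (raw, None),
    };
    let mut parts = core.split('.').map(|p| p.parse::<u64>().ok());
    let version = Version {
        major: parts.next()??,
        minor: parts.next()??,
        patch: parts.next()??,
        pre,
    };
    // Reject a fourth component such as `1.8.0.1`.
    if parts.next().is_some() {
        return None;
    }
    Some(version)
}

/// True only when both strings parse and describe the same version.
pub fn versions_agree(manifest: &str, changelog: &str) -> bool {
    match (parse_version(manifest), parse_version(changelog)) {
        (Some(a), Some(b)) => a == b,
        _ => false,
    }
}
```

With the values recorded above, `versions_agree("1.7.99-alpha", "v1.8.00-alpha")` is `false`, which is the condition that should block the release cut.
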
Do not discard:
- `core/archipelago/src/mesh/listener/decode.rs`
- `core/archipelago/src/mesh/listener/session.rs`
- `core/archipelago/src/mesh/meshtastic.rs`
3ccc bug current hypothesis:
- The prior attempted Meshtastic fix added a hard stale-packet filter using `rx_time`.
- Stock Meshtastic radios without GPS/RTC can report small nonzero epoch timestamps (near 1970) until they sync time.
- If so, live `3ccc` packets would look older than the 10-minute staleness window and be dropped before reaching `mesh.messages`.
- Current patch treats implausibly early `rx_time` values as unknown rather than stale.
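
The hypothesis above amounts to a three-way classification of `rx_time` instead of a fresh/stale split. A minimal sketch of that idea follows; it is not the code in `core/archipelago/src/mesh/`. The names `RxTimeStatus` and `classify_rx_time` are hypothetical, and the plausibility floor (2020-01-01 UTC) is an assumed value chosen only for illustration. The 10-minute window comes from the notes above.

```rust
/// Outcome of checking a packet's `rx_time` against the local clock.
#[derive(Debug, PartialEq, Eq)]
pub enum RxTimeStatus {
    /// Sender has no trustworthy clock; the staleness filter should not apply.
    Unknown,
    /// Timestamp is plausible and within the staleness window.
    Fresh,
    /// Timestamp is plausible but older than the staleness window.
    Stale,
}

/// Assumed plausibility floor: 2020-01-01T00:00:00Z (illustrative value).
pub const MIN_PLAUSIBLE_EPOCH_SECS: u64 = 1_577_836_800;
/// Staleness window of 10 minutes, as described in the notes.
pub const STALE_AFTER_SECS: u64 = 10 * 60;

/// Classify `rx_time` (seconds since the Unix epoch, as reported by the radio)
/// relative to `now_secs` (local Unix time in seconds).
pub fn classify_rx_time(rx_time: u32, now_secs: u64) -> RxTimeStatus {
    let rx = u64::from(rx_time);
    // Zero or implausibly early values mean "no time sync yet", not "old".
    if rx < MIN_PLAUSIBLE_EPOCH_SECS {
        return RxTimeStatus::Unknown;
    }
    // saturating_sub: a timestamp slightly in the future counts as age 0.
    if now_secs.saturating_sub(rx) > STALE_AFTER_SECS {
        RxTimeStatus::Stale
    } else {
        RxTimeStatus::Fresh
    }
}
```

Under this sketch only `Stale` packets would be dropped; `Unknown` packets pass through to `mesh.messages`, which is the behavior change the current patch is meant to achieve.
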
.116 live validation:
- `.116` reachable by SSH; `archipelago` active; `/dev/mesh-radio -> ttyUSB0` attached.
- Recent logs show repeated `FromRadio.queueStatus` frames (`field 11`, bytes like `5a04100e1810`) being rejected by the serial frame prevalidator as invalid payloads.
- Current patch accepts `FromRadio.queueStatus` as a valid ignored frame so non-message status frames no longer look like corrupt serial data.
- Focused Meshtastic tests: green, 7/7.
- Updated patch deployed to `.116` as binary sha `028ec6ff9a60ca8970c081987457d78ed1c517cd81f7089f51b9a01745b5c3c4`.
- After redeploy, logs show `FromRadio field=11` accepted and no new `Dropping stale ... !433e3ccc` entries in the checked post-deploy window.
- Stale shell watcher processes left by another agent are still running on `.116` and reference `RXDIAG`; leave them alone unless they interfere.
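
The rejected bytes `5a04100e1810` quoted above are well-formed protobuf, which a small hand decoder can confirm. The sketch below is illustrative only and is not the serial frame prevalidator; `read_varint`, `length_delimited_field`, and `varint_fields` are hypothetical helper names. It relies only on the standard protobuf wire format: a tag is `(field_number << 3) | wire_type`, wire type 2 is length-delimited, and wire type 0 is a varint.

```rust
/// Read a base-128 varint; returns (value, bytes consumed).
fn read_varint(buf: &[u8]) -> Option<(u64, usize)> {
    let mut value = 0u64;
    // A 64-bit varint occupies at most 10 bytes.
    for (i, &byte) in buf.iter().enumerate().take(10) {
        value |= u64::from(byte & 0x7f) << (7 * i);
        if byte & 0x80 == 0 {
            return Some((value, i + 1));
        }
    }
    None
}

/// Parse one length-delimited field; returns (field number, payload).
/// Returns `None` for other wire types or a truncated payload.
pub fn length_delimited_field(buf: &[u8]) -> Option<(u64, &[u8])> {
    let (tag, tag_len) = read_varint(buf)?;
    if tag & 0x7 != 2 {
        return None;
    }
    let (len, len_len) = read_varint(&buf[tag_len..])?;
    let start = tag_len + len_len;
    let end = start.checked_add(len as usize)?;
    let payload = buf.get(start..end)?;
    Some((tag >> 3, payload))
}

/// Parse a payload made up only of varint fields into (field number, value) pairs.
pub fn varint_fields(mut buf: &[u8]) -> Option<Vec<(u64, u64)>> {
    let mut fields = Vec::new();
    while !buf.is_empty() {
        let (tag, tag_len) = read_varint(buf)?;
        if tag & 0x7 != 0 {
            return None;
        }
        let (value, value_len) = read_varint(&buf[tag_len..])?;
        fields.push((tag >> 3, value));
        buf = &buf[tag_len + value_len..];
    }
    Some(fields)
}
```

Applied to `[0x5a, 0x04, 0x10, 0x0e, 0x18, 0x10]`, the outer tag `0x5a` decodes to field 11 with wire type 2 and a 4-byte payload, and the payload holds varint fields 2 = 14 and 3 = 16. In the upstream Meshtastic protobufs these appear to correspond to `QueueStatus.free` and `QueueStatus.maxlen`, but that mapping should be checked against the `.proto` files rather than taken from this sketch.
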