diff --git a/apps/meshtastic/Dockerfile b/apps/meshtastic/Dockerfile deleted file mode 100644 index 190cc4f0..00000000 --- a/apps/meshtastic/Dockerfile +++ /dev/null @@ -1,5 +0,0 @@ -# Meshtastic - uses official image -FROM meshtastic/meshtastic:latest - -# Default configuration is in the image -# No additional setup needed diff --git a/apps/meshtastic/manifest.yml b/apps/meshtastic/manifest.yml deleted file mode 100644 index d8aecdd4..00000000 --- a/apps/meshtastic/manifest.yml +++ /dev/null @@ -1,69 +0,0 @@ -app: - id: meshtastic - name: Meshtastic - version: 2-daily-alpine - description: Open-source mesh networking for LoRa radios. Create decentralized communication networks. - - container: - image: docker.io/meshtastic/meshtasticd:daily-alpine - pull_policy: if-not-present - - dependencies: - - storage: 1Gi - - resources: - cpu_limit: 1 - memory_limit: 512Mi - disk_limit: 1Gi - - security: - capabilities: [NET_ADMIN, SYS_ADMIN] # Required for LoRa radio access - readonly_root: false # Needs write access for device management - no_new_privileges: true - user: 1000 - seccomp_profile: default - network_policy: host # Requires host network for radio access - apparmor_profile: meshtastic - - ports: - - host: 4403 - container: 4403 - protocol: tcp # Meshtastic TCP API - - devices: - - /dev/ttyUSB0 # LoRa radio device (if connected) - - volumes: - - type: bind - source: /var/lib/archipelago/meshtastic - target: /var/lib/meshtasticd - options: [rw] - - files: - - path: /var/lib/archipelago/meshtastic/config.yaml - content: | - General: - MACAddress: AA:BB:CC:DD:EE:01 - Webserver: - Port: 4403 - - environment: - - MESHTASTIC_PORT=/dev/ttyUSB0 - - MESHTASTIC_SERIAL=true - - health_check: - type: cmd - endpoint: test -f /var/lib/meshtasticd/config.yaml - interval: 30s - timeout: 30s - retries: 5 - - networking: - mesh_enabled: true - local_network_access: true - - metadata: - icon: /assets/img/app-icons/meshcore.svg - category: networking - tier: recommended - repo: https://github.com/meshtastic/firmware diff --git a/docs/PRODUCTION-MASTER-PLAN.md b/docs/PRODUCTION-MASTER-PLAN.md index 5f971f7d..2345510a 100644 --- a/docs/PRODUCTION-MASTER-PLAN.md +++ b/docs/PRODUCTION-MASTER-PLAN.md @@ -70,7 +70,7 @@ real nodes. Until then, this plan is the priority. | C | **Developer-ready external registry** — 3rd-party DID-signed manifests, decentralized Nostr discovery (NIP-78 kind 30078) + trust score, `archy app …` tooling | `marketplace-protocol.md`, `app-developer-guide.md` | design exists; tooling + trust UX pending | | D | **Distribution backbone** — signed catalog, BLAKE3 content-addressing, iroh swarm (origin-always-wins) | `dht-distribution-design.md` | phases 0–2 code-complete (worktree) | | E | **Production test gate** — 5× lifecycle on **.228**, per-app L1/L2 matrix; multinode is split out → `multinode-testing-plan.md` | `tests/lifecycle/TESTING.md`, `bulletproof-containers.md` | **✅ .228 5×-GREEN (110/110 ×5, 0 not-ok, 2026-06-23)** — but this is DESTRUCTIVE-tier / ~8 core apps only; see §6c for the coverage gaps | -| F | **Lifecycle perfection — cascade + progress + ALL apps** — extend the gate to uninstall/reinstall (cascade), real install/uninstall progress UI, and EVERY installed app (not just the 8 core). The "insanely-perfect OS/container environment" bar. | §6c (below), `tests/lifecycle/TESTING.md` | **NEW (2026-06-23)** — real bugs already found in manual multinode testing; sequenced after netbird + Phase-3 | +| F | **Lifecycle perfection — cascade + progress + ALL apps** — extend the gate to uninstall/reinstall (cascade), real install/uninstall progress UI, and EVERY installed app (not just the 8 core). The "insanely-perfect OS/container environment" bar. | §6c (below), `tests/lifecycle/TESTING.md` | **IN PROGRESS (2026-06-26)** — root bug FIXED: uninstall could hang → ghost/stuck-bar/reinstall-block (`71cc9ac4`, unbounded systemctl/podman in `quadlet::disable_remove`); `cascade-uninstall.bats` **7/7 green on .228** w/ binary `ae349a75`. Remaining: wire CASCADE into the canonical gate run, progress-UI truthfulness, all-apps matrix, guardian/IBD state. | **Orchestrator architecture** (foundation for A/B): `rust-orchestrator-migration.md` (ProdContainerOrchestrator, BootReconciler 30s level-triggered reconcile, adoption @@ -156,10 +156,27 @@ reinstall, install-progress UI, and most apps were never under test. Bitcoin finishes initial sync" / "starting"** on that node — verify this is correct wait-for-IBD behavior vs a stuck/false state (it's a backend that depends on bitcoin sync). +**✅ 2026-06-26 — root cause of the immich/grafana uninstall trio FOUND + FIXED (`71cc9ac4`).** +Single cause: `quadlet::disable_remove()` (first op in uninstall teardown, via companion + +orchestrator) ran `systemctl --user stop` / `daemon-reload` / `podman rm -f` with **no timeout**. +On rootless podman a generated unit can wedge "deactivating" while podman hangs → `systemctl stop` +blocks forever → the spawned uninstall task returns neither Ok nor Err, so (a) `set_uninstall_stage` +never fires → **frozen full-red bar**, (b) `remove_package_state_entry` never runs → **ghost stuck in +`Removing`**, (c) the install guard rejects reinstall (`already Removing`). The spawn wrapper already +reverts state on Err/removes on Ok — only a *hang* stranded it. Fix bounds all three calls +(stop→`QUADLET_STOP_TIMEOUT` + SIGKILL/reset-failed escalation; daemon-reload→30s; podman rm→timeout). +**Validated live: `cascade-uninstall.bats` 7/7 on .228** (binary `ae349a75`) — grafana install → +uninstall (no ghost, data dir gone) → reinstall → running → cleanup. NOTE: proves the happy path + +no-regression; the original hang was load/timing-induced and not separately reproduced. + **Workstream F scope — the gate must grow to (in priority order):** 1. **CASCADE tier in the canonical gate:** uninstall → verify the app is GONE from My Apps / `container-list` / package state (no ghost), data preserved per policy, then reinstall → verify it returns healthy. Catch the immich/grafana ghost + reinstall-stops bugs. + *(Test EXISTS + passes — `bats/cascade-uninstall.bats`, gated on `ARCHY_ALLOW_CASCADE_DESTRUCTIVE`; + `run-gate.sh` still never sets it. DECISION PENDING: run a single cascade pass alongside the 5× + destructive loop vs. a dedicated cascade gate — do NOT fold uninstall/reinstall into all 5 + iterations, it balloons runtime.)* 2. **Progress-UI assertions:** install AND uninstall must report monotonic, truthful progress (not a stuck full-red bar); a long op must surface a real stage/percentage and a terminal success/failure — no silent hang. (Likely both a backend progress-event fix AND a UI fix.)