Compare commits

No commits in common. "main" and "demo-build" have entirely different histories.

284 changed files with 2160 additions and 24625 deletions

@ -2,7 +2,7 @@
# Keep the served companion APK in sync with main on every push. # Keep the served companion APK in sync with main on every push.
# #
# When a push to main includes Android changes, rebuild the APK, refresh # When a push to main includes Android changes, rebuild the APK, refresh
# neode-ui/public/packages/archipelago-companion.apk, commit it, and ask # neode-ui/public/packages/archipelago-companion.apk.zip, commit it, and ask
# you to push again (so the refreshed APK rides along in the same push). # you to push again (so the refreshed APK rides along in the same push).
# #
# Enable once per clone: git config core.hooksPath .githooks # Enable once per clone: git config core.hooksPath .githooks
@ -40,7 +40,7 @@ fi
bash scripts/publish-companion-apk.sh || exit 0 bash scripts/publish-companion-apk.sh || exit 0
DEST="neode-ui/public/packages/archipelago-companion.apk" DEST="neode-ui/public/packages/archipelago-companion.apk.zip"
if git diff --cached --quiet -- "$DEST"; then if git diff --cached --quiet -- "$DEST"; then
exit 0 # APK unchanged — nothing to do exit 0 # APK unchanged — nothing to do
fi fi

Android/.gitignore

@ -14,8 +14,3 @@ local.properties
*.aab *.aab
*.jks *.jks
*.keystore *.keystore
# Exception: the repo-dedicated *debug* keystore is committed on purpose so every
# machine (and the published companion download) signs debug builds identically —
# updates then install over the top without an uninstall. Debug keys are not
# secret (well-known password "android"); never commit a real release keystore.
!/app/debug.keystore

@ -1,94 +0,0 @@
# Companion App — Build, Ship & "App Not Installed" Runbook
Canonical procedure for releasing the Archipelago Companion Android app and for
debugging install failures. Read this before touching the companion release flow.
Hard lessons from 2026-06-26 are baked in below — don't relearn them.
## Ship the companion (the only sanctioned way)
```bash
./Android/ship-companion.sh
```
This calls `scripts/publish-companion-apk.sh` (the single source of truth, also
used by the `.githooks/pre-push` hook), which:
1. **Removes/rejects resource dirs whose names contain spaces.** Empty stray
`mipmap-* NNN` dirs (left by icon-export tools) break a *clean* build with
`Invalid resource directory name`. Incremental builds hide them — clean builds
don't.
2. **Always does a CLEAN build** (`:app:clean :app:assembleDebug`).
3. **Forces v1 + v2 + v3 signing** via `zipalign` + `apksigner`.
4. **Verifies all three schemes** (`apksigner verify --min-sdk-version 21`) and
**aborts** if any is missing.
5. Stages the signed APK at `neode-ui/public/packages/archipelago-companion.apk`,
commits, and pushes with `SHIP_COMPANION=1` (the sanctioned pre-push bypass).
**Never** hand-roll `gradlew assembleDebug` + `cp` to the served path. That path
skips the clean build and the signature enforcement and is exactly how a broken
APK shipped.
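
Step 1 above can be sketched as a small check. This is a minimal illustration, not the contents of `scripts/publish-companion-apk.sh` (which this diff does not show); the function name `find_spaced_res_dirs` is hypothetical, and the `res` directory is passed in rather than assumed.

```bash
#!/usr/bin/env bash
# Hypothetical helper: list resource directories whose names contain a space.
# Stray "mipmap-* NNN" dirs like these are what break a clean build.
find_spaced_res_dirs() {
  local res_dir="$1"
  # Only direct children of res/ are resource directories, so limit the depth.
  find "$res_dir" -mindepth 1 -maxdepth 1 -type d -name '* *' | sort
}
```

Called as `find_spaced_res_dirs <path-to-res>`, it prints nothing when no such directory exists, so a caller can abort the build whenever the output is non-empty.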
### Bump the version first
In `Android/app/build.gradle.kts`, bump `versionCode` (must strictly increase) and
`versionName`. The committed value can drift AHEAD of what's actually built into
the served APK, so verify the served APK's real version after shipping:
`aapt2 dump badging neode-ui/public/packages/archipelago-companion.apk | grep version`.
## Signing facts (important)
- Debug builds are signed with the **committed** `Android/app/debug.keystore`
(store/key pass `android`, alias `androiddebugkey`) so every machine and the
served download share ONE signing key. Cert SHA-256: `D6:22:E0:7E:…:66:4D`.
- **AGP silently ignores `enableV1Signing = true` for `minSdk ≥ 24`**, so a plain
gradle build produces a **v2-only** APK. The `apksigner` step in the publish
script is what actually guarantees v1+v2+v3 — do not remove it.
- **Changing the signing key forces every existing install to be uninstalled
once.** Android blocks in-place upgrades across different signatures. Treat the
keystore as permanent; never regenerate it casually.
## Debugging "App Not Installed" — DIAGNOSE FIRST
Do **not** theorize about signing schemes / OEM quirks. Get the real reason:
```bash
adb install ~/Desktop/archipelago-companion-<ver>.apk
# -> Failure [INSTALL_FAILED_<REASON>: ...]
```
Map the reason:
| `INSTALL_FAILED_*` | Cause | Fix |
|---|---|---|
| `UPDATE_INCOMPATIBLE … signatures do not match` | Old install signed with a **different key** (e.g. pre-shared-keystore per-machine key `58:31:12…`). | Uninstall the old package, then install. **One-time** per device after a key change. |
| `INVALID_APK` / parse error | Corrupt/incomplete download or bad signing. | Re-download; re-run the publish script. |
| `INSUFFICIENT_STORAGE` | Storage. | Free space. |
| `OLDER_SDK` | Device below `minSdk` (26 = Android 8.0). | Unsupported device. |
> A manual uninstall on the phone may NOT clear `UPDATE_INCOMPATIBLE` if the
> package is registered under another user/profile — `pm path <pkg>` under user 0
> can show nothing while the conflict persists. `adb uninstall <pkg>` clears it
> across all users.
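
The table above can be turned into a lookup so the reason drives the next step. A minimal sketch: the function name `install_failure_hint` is hypothetical, and it only pattern-matches the failure line that `adb install` prints, using the four reasons listed in the table.

```bash
#!/usr/bin/env bash
# Hypothetical helper: map an `adb install` failure line to the fix from the table.
install_failure_hint() {
  case "$1" in
    *INSTALL_FAILED_UPDATE_INCOMPATIBLE*)
      echo "uninstall the old package, then install" ;;
    *INSTALL_FAILED_INVALID_APK*)
      echo "re-download; re-run the publish script" ;;
    *INSTALL_FAILED_INSUFFICIENT_STORAGE*)
      echo "free space" ;;
    *INSTALL_FAILED_OLDER_SDK*)
      echo "unsupported device (below minSdk)" ;;
    *)
      # Anything else is outside the table: read the full adb output.
      echo "unrecognized reason; read the full adb output" ;;
  esac
}
```

The helper is read-only: it never runs `adb` itself, which keeps it inside the safety rules for the user's phone.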
## Phone / adb safety (non-negotiable)
When acting on the user's physical phone, be surgical — the user once had all
home-screen app layouts wiped by an over-broad action.
- Default to **read-only** adb (`devices`, `getprop`, `pm path/list`, `dumpsys`).
- Mutations (`adb install`, `adb uninstall com.archipelago.app.debug`) only with
explicit go-ahead and **scoped to our exact package** — echo it first.
- **Never** run launcher/system resets: no `pm clear` on launchers, no
`reset-permissions`, no factory wipe, no uninstalling apps you didn't build.
## Verify the published download after shipping
The download served to nodes is Gitea raw-on-main. Confirm the live bytes match
what you built and signed:
```bash
SERVED=neode-ui/public/packages/archipelago-companion.apk
URL=http://146.59.87.168:3000/lfg2025/archy/raw/branch/main/$SERVED
curl -sS -o /tmp/live.apk "$URL"
shasum -a 256 "$SERVED" /tmp/live.apk # must match
apksigner verify -v --min-sdk-version 21 /tmp/live.apk | grep -i "scheme" # v1/v2/v3 = true
```
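
The "must match" comparison above relies on reading two hashes by eye. A minimal sketch of a check that fails loudly instead; the names `sha256_of` and `same_bytes` are hypothetical, and the fallback between `sha256sum` and `shasum` is an assumption about which tool the machine has.

```bash
#!/usr/bin/env bash
# Hypothetical helpers: compare two files by SHA-256 and return non-zero on mismatch.
sha256_of() {
  # Prefer sha256sum (common on Linux); fall back to shasum (common on macOS).
  if command -v sha256sum >/dev/null 2>&1; then
    sha256sum "$1" | awk '{print $1}'
  else
    shasum -a 256 "$1" | awk '{print $1}'
  fi
}

same_bytes() {
  [ "$(sha256_of "$1")" = "$(sha256_of "$2")" ]
}
```

With the variables from the block above, `same_bytes "$SERVED" /tmp/live.apk || echo "served APK differs from local build"` makes a mismatch hard to miss.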

@ -11,40 +11,20 @@ android {
applicationId = "com.archipelago.app" applicationId = "com.archipelago.app"
minSdk = 26 minSdk = 26
targetSdk = 35 targetSdk = 35
versionCode = 16 versionCode = 10
versionName = "0.4.12" versionName = "0.4.6"
vectorDrawables { vectorDrawables {
useSupportLibrary = true useSupportLibrary = true
} }
} }
signingConfigs {
// Repo-dedicated debug keystore (committed at app/debug.keystore) so every
// machine — and the published companion download — signs debug builds with
// the SAME key. Without this, Gradle falls back to each machine's
// ~/.android/debug.keystore, so a build from a different machine has a
// different signature and the phone rejects the update ("App not installed").
getByName("debug") {
storeFile = file("debug.keystore")
storePassword = "android"
keyAlias = "androiddebugkey"
keyPassword = "android"
// Force both legacy JAR (v1) and APK Signature Scheme v2. AGP drops v1
// for minSdk>=24, but some OEM package installers (e.g. Samsung) reject
// a v2-only sideload with "App not installed" — keep v1 for max compat.
enableV1Signing = true
enableV2Signing = true
}
}
buildTypes { buildTypes {
debug { debug {
// Separate app ID so a debug/test build installs alongside the // Separate app ID so a debug/test build installs alongside the
// release app instead of colliding on signature. // release app instead of colliding on signature.
applicationIdSuffix = ".debug" applicationIdSuffix = ".debug"
versionNameSuffix = "-debug" versionNameSuffix = "-debug"
signingConfig = signingConfigs.getByName("debug")
} }
release { release {
isMinifyEnabled = true isMinifyEnabled = true

Binary file not shown.

@ -112,37 +112,6 @@ class ServerPreferences(private val context: Context) {
} }
} }
/**
* Replace a saved server in place. Matches the existing entry by connection
* identity (address/port/scheme) so edits that change the name or password
* or that touch a legacy 4-field entry still update the right record. If the
* edited server is also the active one, the active record is kept in sync.
*/
suspend fun updateSavedServer(original: ServerEntry, updated: ServerEntry) {
context.dataStore.edit { prefs ->
val current = prefs[savedServersKey] ?: emptySet()
val filtered = current.filterNot { raw ->
val e = ServerEntry.deserialize(raw)
e != null &&
e.address == original.address &&
e.port == original.port &&
e.useHttps == original.useHttps
}.toSet()
prefs[savedServersKey] = filtered + updated.serialize()
val isActive = prefs[activeAddressKey] == original.address &&
(prefs[activePortKey] ?: "") == original.port &&
(prefs[activeHttpsKey] ?: false) == original.useHttps
if (isActive) {
prefs[activeAddressKey] = updated.address
prefs[activeHttpsKey] = updated.useHttps
prefs[activePortKey] = updated.port
prefs[activePasswordKey] = updated.password
prefs[activeNameKey] = updated.name
}
}
}
suspend fun removeSavedServer(server: ServerEntry) { suspend fun removeSavedServer(server: ServerEntry) {
context.dataStore.edit { prefs -> context.dataStore.edit { prefs ->
val current = prefs[savedServersKey] ?: emptySet() val current = prefs[savedServersKey] ?: emptySet()

@ -75,7 +75,6 @@ fun NESMenu(
onDismiss: () -> Unit, onDismiss: () -> Unit,
onSelectServer: (ServerEntry) -> Unit, onSelectServer: (ServerEntry) -> Unit,
onAddServer: (ServerEntry) -> Unit, onAddServer: (ServerEntry) -> Unit,
onEditServer: (ServerEntry, ServerEntry) -> Unit,
onRemoveServer: (ServerEntry) -> Unit, onRemoveServer: (ServerEntry) -> Unit,
onToggleMode: () -> Unit, onToggleMode: () -> Unit,
onToggleStyle: () -> Unit, onToggleStyle: () -> Unit,
@ -88,7 +87,7 @@ fun NESMenu(
contentAlignment = Alignment.Center, contentAlignment = Alignment.Center,
) { ) {
AnimatedVisibility(visible = visible, enter = fadeIn() + scaleIn(initialScale = 0.95f), exit = fadeOut() + scaleOut(targetScale = 0.95f)) { AnimatedVisibility(visible = visible, enter = fadeIn() + scaleIn(initialScale = 0.95f), exit = fadeOut() + scaleOut(targetScale = 0.95f)) {
MenuPanel(servers, activeServer, isGamepadMode, controllerStyle, onDismiss, onSelectServer, onAddServer, onEditServer, onRemoveServer, onToggleMode, onToggleStyle, onBackToWebView) MenuPanel(servers, activeServer, isGamepadMode, controllerStyle, onDismiss, onSelectServer, onAddServer, onRemoveServer, onToggleMode, onToggleStyle, onBackToWebView)
} }
} }
} }
@ -103,39 +102,21 @@ private fun MenuPanel(
onDismiss: () -> Unit, onDismiss: () -> Unit,
onSelectServer: (ServerEntry) -> Unit, onSelectServer: (ServerEntry) -> Unit,
onAddServer: (ServerEntry) -> Unit, onAddServer: (ServerEntry) -> Unit,
onEditServer: (ServerEntry, ServerEntry) -> Unit,
onRemoveServer: (ServerEntry) -> Unit, onRemoveServer: (ServerEntry) -> Unit,
onToggleMode: () -> Unit, onToggleMode: () -> Unit,
onToggleStyle: () -> Unit, onToggleStyle: () -> Unit,
onBackToWebView: (() -> Unit)?, onBackToWebView: (() -> Unit)?,
) { ) {
var showAdd by remember { mutableStateOf(false) } var showAdd by remember { mutableStateOf(false) }
// The saved server being edited, or null when adding a new one.
var editing by remember { mutableStateOf<ServerEntry?>(null) }
var nm by remember { mutableStateOf("") } var nm by remember { mutableStateOf("") }
var addr by remember { mutableStateOf("") } var addr by remember { mutableStateOf("") }
var pwd by remember { mutableStateOf("") } var pwd by remember { mutableStateOf("") }
fun resetForm() {
nm = ""; addr = ""; pwd = ""; showAdd = false; editing = null
}
fun startEdit(server: ServerEntry) {
editing = server
nm = server.name; addr = server.address; pwd = server.password
showAdd = false
}
fun submit() { fun submit() {
if (addr.isBlank()) return if (addr.isNotBlank()) {
val orig = editing
if (orig != null) {
// Preserve fields the compact form doesn't expose (scheme, port).
onEditServer(orig, orig.copy(address = addr, password = pwd, name = nm))
} else {
onAddServer(ServerEntry(addr, false, password = pwd, name = nm)) onAddServer(ServerEntry(addr, false, password = pwd, name = nm))
nm = ""; addr = ""; pwd = ""; showAdd = false
} }
resetForm()
} }
Column( Column(
@ -168,7 +149,6 @@ private fun MenuPanel(
label = server.displayName(), label = server.displayName(),
selected = active, selected = active,
onClick = { onSelectServer(server) }, onClick = { onSelectServer(server) },
onEdit = { startEdit(server) },
onRemove = { onRemoveServer(server) }, onRemove = { onRemoveServer(server) },
) )
} }
@ -177,8 +157,8 @@ private fun MenuPanel(
Text("No servers", color = TextMuted, fontSize = 14.sp, modifier = Modifier.padding(vertical = 4.dp)) Text("No servers", color = TextMuted, fontSize = 14.sp, modifier = Modifier.padding(vertical = 4.dp))
} }
// Add / edit server // Add server
if (showAdd || editing != null) { if (showAdd) {
Column( Column(
Modifier Modifier
.fillMaxWidth() .fillMaxWidth()
@ -188,25 +168,6 @@ private fun MenuPanel(
.padding(12.dp), .padding(12.dp),
verticalArrangement = Arrangement.spacedBy(8.dp), verticalArrangement = Arrangement.spacedBy(8.dp),
) { ) {
Row(
Modifier.fillMaxWidth(),
verticalAlignment = Alignment.CenterVertically,
horizontalArrangement = Arrangement.SpaceBetween,
) {
Text(
if (editing != null) "Edit Server" else "Add Server",
color = TextMuted,
fontSize = 13.sp,
letterSpacing = 1.sp,
fontWeight = FontWeight.Medium,
)
Text(
"Cancel",
color = TextMuted,
fontSize = 13.sp,
modifier = Modifier.clickable { resetForm() }.padding(start = 8.dp),
)
}
GlassField( GlassField(
value = nm, onValueChange = { nm = it }, value = nm, onValueChange = { nm = it },
placeholder = "Name (optional)", placeholder = "Name (optional)",
@ -267,7 +228,6 @@ private fun MenuItem(
selected: Boolean = false, selected: Boolean = false,
labelColor: Color = TextPrimary, labelColor: Color = TextPrimary,
onClick: () -> Unit, onClick: () -> Unit,
onEdit: (() -> Unit)? = null,
onRemove: (() -> Unit)? = null, onRemove: (() -> Unit)? = null,
) { ) {
Row( Row(
@ -287,16 +247,7 @@ private fun MenuItem(
color = if (selected) BitcoinOrange else labelColor, color = if (selected) BitcoinOrange else labelColor,
fontSize = 16.sp, fontSize = 16.sp,
fontWeight = FontWeight.Medium, fontWeight = FontWeight.Medium,
modifier = Modifier.weight(1f),
) )
if (onEdit != null) {
Text(
"",
color = TextMuted,
fontSize = 16.sp,
modifier = Modifier.clickable { onEdit() }.padding(horizontal = 8.dp),
)
}
if (onRemove != null) { if (onRemove != null) {
Text( Text(
"", "",

@ -216,17 +216,6 @@ fun RemoteInputScreen(onBack: () -> Unit) {
onAddServer = { server -> onAddServer = { server ->
scope.launch { prefs.addSavedServer(server); if (activeServer == null) prefs.setActiveServer(server) } scope.launch { prefs.addSavedServer(server); if (activeServer == null) prefs.setActiveServer(server) }
}, },
onEditServer = { original, updated ->
scope.launch {
prefs.updateSavedServer(original, updated)
// If the edited server is the live one, reconnect with the new
// address/credentials so the change takes effect immediately.
if (original.serialize() == activeServer?.serialize()) {
ws.disconnect()
prefs.setActiveServer(updated)
}
}
},
onRemoveServer = { server -> onRemoveServer = { server ->
scope.launch { scope.launch {
prefs.removeSavedServer(server) prefs.removeSavedServer(server)

@ -30,7 +30,6 @@ import androidx.compose.material.icons.filled.VisibilityOff
import androidx.compose.foundation.verticalScroll import androidx.compose.foundation.verticalScroll
import androidx.compose.material.icons.Icons import androidx.compose.material.icons.Icons
import androidx.compose.material.icons.filled.Close import androidx.compose.material.icons.filled.Close
import androidx.compose.material.icons.filled.Edit
import androidx.compose.material.icons.filled.Lock import androidx.compose.material.icons.filled.Lock
import androidx.compose.material.icons.filled.LockOpen import androidx.compose.material.icons.filled.LockOpen
import androidx.compose.material3.CircularProgressIndicator import androidx.compose.material3.CircularProgressIndicator
@ -107,50 +106,9 @@ fun ServerConnectScreen(
var useHttps by remember { mutableStateOf(false) } var useHttps by remember { mutableStateOf(false) }
var isConnecting by remember { mutableStateOf(false) } var isConnecting by remember { mutableStateOf(false) }
var errorMessage by remember { mutableStateOf<String?>(null) } var errorMessage by remember { mutableStateOf<String?>(null) }
// The saved server currently being edited, or null when adding/connecting.
var editingServer by remember { mutableStateOf<ServerEntry?>(null) }
val savedServers by prefs.savedServers.collectAsState(initial = emptyList()) val savedServers by prefs.savedServers.collectAsState(initial = emptyList())
fun clearForm() {
name = ""
address = ""
port = ""
password = ""
useHttps = false
passwordVisible = false
errorMessage = null
}
fun startEdit(server: ServerEntry) {
editingServer = server
name = server.name
address = server.address
port = server.port
password = server.password
useHttps = server.useHttps
passwordVisible = false
errorMessage = null
}
fun cancelEdit() {
editingServer = null
clearForm()
}
fun saveEdit() {
val original = editingServer ?: return
if (address.isBlank()) {
errorMessage = "Enter a server address"
return
}
val updated = ServerEntry(address, useHttps, port, password, name)
scope.launch {
prefs.updateSavedServer(original, updated)
cancelEdit()
}
}
fun connect(server: ServerEntry) { fun connect(server: ServerEntry) {
if (isConnecting) return if (isConnecting) return
if (server.address.isBlank()) { if (server.address.isBlank()) {
@ -220,7 +178,7 @@ fun ServerConnectScreen(
Spacer(modifier = Modifier.height(4.dp)) Spacer(modifier = Modifier.height(4.dp))
Text( Text(
text = if (editingServer != null) stringResource(R.string.edit_server_title) else "Connect to Server", text = "Connect to Server",
style = MaterialTheme.typography.headlineMedium, style = MaterialTheme.typography.headlineMedium,
color = TextPrimary, color = TextPrimary,
textAlign = TextAlign.Center, textAlign = TextAlign.Center,
@ -366,11 +324,7 @@ fun ServerConnectScreen(
keyboardActions = KeyboardActions( keyboardActions = KeyboardActions(
onGo = { onGo = {
keyboard?.hide() keyboard?.hide()
if (editingServer != null) { connect(ServerEntry(address, useHttps, port, password, name))
saveEdit()
} else {
connect(ServerEntry(address, useHttps, port, password, name))
}
}, },
), ),
colors = OutlinedTextFieldDefaults.colors( colors = OutlinedTextFieldDefaults.colors(
@ -435,40 +389,15 @@ fun ServerConnectScreen(
} }
} }
if (editingServer != null) { // Connect button — glass style
// Save / Cancel while editing an existing saved server GlassButton(
Row( text = if (isConnecting) stringResource(R.string.connecting) else stringResource(R.string.connect),
modifier = Modifier.fillMaxWidth(), onClick = {
horizontalArrangement = Arrangement.spacedBy(12.dp), keyboard?.hide()
) { connect(ServerEntry(address, useHttps, port, password, name))
GlassButton( },
text = stringResource(R.string.cancel), modifier = Modifier.fillMaxWidth().height(56.dp),
onClick = { )
keyboard?.hide()
cancelEdit()
},
modifier = Modifier.weight(1f).height(56.dp),
)
GlassButton(
text = stringResource(R.string.save_changes),
onClick = {
keyboard?.hide()
saveEdit()
},
modifier = Modifier.weight(1f).height(56.dp),
)
}
} else {
// Connect button — glass style
GlassButton(
text = if (isConnecting) stringResource(R.string.connecting) else stringResource(R.string.connect),
onClick = {
keyboard?.hide()
connect(ServerEntry(address, useHttps, port, password, name))
},
modifier = Modifier.fillMaxWidth().height(56.dp),
)
}
if (isConnecting) { if (isConnecting) {
CircularProgressIndicator( CircularProgressIndicator(
@ -478,8 +407,8 @@ fun ServerConnectScreen(
) )
} }
// Saved servers (hidden while editing one to keep focus on the form) // Saved servers
if (editingServer == null && savedServers.isNotEmpty()) { if (savedServers.isNotEmpty()) {
Spacer(modifier = Modifier.height(8.dp)) Spacer(modifier = Modifier.height(8.dp))
Text( Text(
text = stringResource(R.string.saved_servers), text = stringResource(R.string.saved_servers),
@ -493,7 +422,6 @@ fun ServerConnectScreen(
SavedServerItem( SavedServerItem(
server = server, server = server,
onConnect = { connect(it) }, onConnect = { connect(it) },
onEdit = { startEdit(it) },
onRemove = { scope.launch { prefs.removeSavedServer(it) } }, onRemove = { scope.launch { prefs.removeSavedServer(it) } },
) )
} }
@ -506,7 +434,6 @@ fun ServerConnectScreen(
private fun SavedServerItem( private fun SavedServerItem(
server: ServerEntry, server: ServerEntry,
onConnect: (ServerEntry) -> Unit, onConnect: (ServerEntry) -> Unit,
onEdit: (ServerEntry) -> Unit,
onRemove: (ServerEntry) -> Unit, onRemove: (ServerEntry) -> Unit,
) { ) {
Row( Row(
@ -549,9 +476,6 @@ private fun SavedServerItem(
} }
} }
} }
IconButton(onClick = { onEdit(server) }) {
Icon(imageVector = Icons.Default.Edit, contentDescription = stringResource(R.string.edit_server), modifier = Modifier.size(18.dp), tint = TextMuted)
}
IconButton(onClick = { onRemove(server) }) { IconButton(onClick = { onRemove(server) }) {
Icon(imageVector = Icons.Default.Close, contentDescription = stringResource(R.string.remove_server), modifier = Modifier.size(18.dp), tint = TextMuted) Icon(imageVector = Icons.Default.Close, contentDescription = stringResource(R.string.remove_server), modifier = Modifier.size(18.dp), tint = TextMuted)
} }

@ -2,7 +2,6 @@ package com.archipelago.app.ui.screens
import android.annotation.SuppressLint import android.annotation.SuppressLint
import android.graphics.Bitmap import android.graphics.Bitmap
import android.graphics.BitmapFactory
import android.view.ViewGroup import android.view.ViewGroup
import android.webkit.CookieManager import android.webkit.CookieManager
import android.webkit.WebChromeClient import android.webkit.WebChromeClient
@ -15,7 +14,6 @@ import androidx.activity.compose.BackHandler
import androidx.compose.animation.AnimatedVisibility import androidx.compose.animation.AnimatedVisibility
import androidx.compose.animation.fadeIn import androidx.compose.animation.fadeIn
import androidx.compose.animation.fadeOut import androidx.compose.animation.fadeOut
import androidx.compose.foundation.Image
import androidx.compose.foundation.background import androidx.compose.foundation.background
import androidx.compose.foundation.layout.Arrangement import androidx.compose.foundation.layout.Arrangement
import androidx.compose.foundation.layout.Box import androidx.compose.foundation.layout.Box
@ -29,24 +27,17 @@ import androidx.compose.foundation.layout.height
import androidx.compose.foundation.layout.padding import androidx.compose.foundation.layout.padding
import androidx.compose.foundation.layout.safeDrawing import androidx.compose.foundation.layout.safeDrawing
import androidx.compose.foundation.layout.size import androidx.compose.foundation.layout.size
import androidx.compose.foundation.layout.width
import androidx.compose.foundation.layout.windowInsetsPadding import androidx.compose.foundation.layout.windowInsetsPadding
import androidx.compose.foundation.shape.RoundedCornerShape
import androidx.compose.material.icons.Icons import androidx.compose.material.icons.Icons
import androidx.compose.material.icons.automirrored.filled.ArrowBack
import androidx.compose.material.icons.automirrored.filled.ArrowForward
import androidx.compose.material.icons.filled.Close import androidx.compose.material.icons.filled.Close
import androidx.compose.material.icons.filled.CloudOff import androidx.compose.material.icons.filled.CloudOff
import androidx.compose.material.icons.filled.OpenInBrowser import androidx.compose.material.icons.filled.OpenInBrowser
import androidx.compose.material.icons.filled.Refresh
import androidx.compose.material3.CircularProgressIndicator
import androidx.compose.material3.Icon import androidx.compose.material3.Icon
import androidx.compose.material3.IconButton import androidx.compose.material3.IconButton
import androidx.compose.material3.LinearProgressIndicator import androidx.compose.material3.LinearProgressIndicator
import androidx.compose.material3.MaterialTheme import androidx.compose.material3.MaterialTheme
import androidx.compose.material3.Text import androidx.compose.material3.Text
import androidx.compose.runtime.Composable import androidx.compose.runtime.Composable
import androidx.compose.runtime.LaunchedEffect
import androidx.compose.runtime.getValue import androidx.compose.runtime.getValue
import androidx.compose.runtime.mutableIntStateOf import androidx.compose.runtime.mutableIntStateOf
import androidx.compose.runtime.mutableStateOf import androidx.compose.runtime.mutableStateOf
@ -54,8 +45,6 @@ import androidx.compose.runtime.remember
import androidx.compose.runtime.setValue import androidx.compose.runtime.setValue
import androidx.compose.ui.Alignment import androidx.compose.ui.Alignment
import androidx.compose.ui.Modifier import androidx.compose.ui.Modifier
import androidx.compose.ui.draw.clip
import androidx.compose.ui.graphics.asImageBitmap
import androidx.compose.ui.platform.LocalContext import androidx.compose.ui.platform.LocalContext
import androidx.compose.ui.res.stringResource import androidx.compose.ui.res.stringResource
import androidx.compose.ui.text.style.TextAlign import androidx.compose.ui.text.style.TextAlign
@ -67,8 +56,6 @@ import com.archipelago.app.ui.theme.BitcoinOrange
import com.archipelago.app.ui.theme.SurfaceBlack import com.archipelago.app.ui.theme.SurfaceBlack
import com.archipelago.app.ui.theme.TextMuted import com.archipelago.app.ui.theme.TextMuted
import com.archipelago.app.ui.theme.TextPrimary import com.archipelago.app.ui.theme.TextPrimary
import kotlinx.coroutines.Dispatchers
import kotlinx.coroutines.withContext
/** Open a URL in the phone's default browser (genuinely external links). */ /** Open a URL in the phone's default browser (genuinely external links). */
private fun openExternalUrl(context: android.content.Context, url: String) { private fun openExternalUrl(context: android.content.Context, url: String) {
@ -323,26 +310,6 @@ fun WebViewScreen(
} }
} }
// Node apps (e.g. NetBird) terminate TLS with a
// self-signed cert — the dashboard needs a secure
// context for OIDC/window.crypto.subtle (#15). The
// WebView default is to CANCEL untrusted certs, so
// those apps render blank. The user explicitly trusts
// their own node, so proceed for same-host certs only;
// reject anything else (don't blanket-trust the web).
override fun onReceivedSslError(
view: WebView?,
handler: android.webkit.SslErrorHandler?,
error: android.net.http.SslError?,
) {
val u = error?.url
if (u != null && isSameHost(u, serverUrl)) {
handler?.proceed()
} else {
handler?.cancel()
}
}
override fun shouldOverrideUrlLoading( override fun shouldOverrideUrlLoading(
view: WebView?, view: WebView?,
request: WebResourceRequest?, request: WebResourceRequest?,
@ -461,34 +428,11 @@ fun WebViewScreen(
} }
} }
/** Best-effort fetch of the origin's /favicon.ico, so the launched app's icon
* can be shown on the loading screen before the WebView reports onReceivedIcon
* (which only fires once the page's <head> has parsed). Blocking call on IO. */
private fun fetchFavicon(pageUrl: String): Bitmap? {
return try {
val u = android.net.Uri.parse(pageUrl)
val scheme = u.scheme ?: return null
val host = u.host ?: return null
val portPart = if (u.port > 0) ":${u.port}" else ""
val conn = (java.net.URL("$scheme://$host$portPart/favicon.ico").openConnection()
as java.net.HttpURLConnection).apply {
connectTimeout = 4000
readTimeout = 4000
instanceFollowRedirects = true
}
conn.inputStream.use { BitmapFactory.decodeStream(it) }
} catch (_: Exception) {
null
}
}
/** /**
* Lightweight in-app browser used when the kiosk hands off an app that can't be * Lightweight in-app browser used when the kiosk hands off an app that can't be
* shown in an iframe. Loads the app in a local WebView with a centered loading * shown in an iframe. Loads the app in a local WebView with a minimal top bar
* screen (app favicon + progress bar) and a BOTTOM control bar mirroring the * (close + title + escalate-to-real-browser). Same-host navigation stays here;
* web mobile-iframe footer (back / forward / reload / open-in-browser / close). * any genuinely external link escapes to the phone's browser.
* Same-host navigation stays here; any genuinely external link escapes to the
* phone's browser.
*/ */
@SuppressLint("SetJavaScriptEnabled") @SuppressLint("SetJavaScriptEnabled")
@Composable @Composable
@ -500,20 +444,8 @@ private fun InAppBrowser(
val context = LocalContext.current val context = LocalContext.current
var browser by remember { mutableStateOf<WebView?>(null) } var browser by remember { mutableStateOf<WebView?>(null) }
var title by remember { mutableStateOf(android.net.Uri.parse(url).host ?: url) } var title by remember { mutableStateOf(android.net.Uri.parse(url).host ?: url) }
var favicon by remember { mutableStateOf<Bitmap?>(null) }
var progress by remember { mutableIntStateOf(0) } var progress by remember { mutableIntStateOf(0) }
var loading by remember { mutableStateOf(true) } var loading by remember { mutableStateOf(true) }
var canGoBack by remember { mutableStateOf(false) }
var canGoForward by remember { mutableStateOf(false) }
// Seed the loading-screen icon immediately from a best-effort favicon
// pre-fetch (main's app-icon work), then onReceivedIcon upgrades it — so the
// loader shows an icon right away instead of staying blank until the page
// parses its <head> (which is what made the loader look stuck).
LaunchedEffect(url) {
val fetched = withContext(Dispatchers.IO) { fetchFavicon(url) }
if (fetched != null && favicon == null) favicon = fetched
}
// Back: walk the in-app history first, then close the overlay. // Back: walk the in-app history first, then close the overlay.
BackHandler { BackHandler {
@ -527,169 +459,13 @@ private fun InAppBrowser(
.background(SurfaceBlack) .background(SurfaceBlack)
.windowInsetsPadding(WindowInsets.safeDrawing), .windowInsetsPadding(WindowInsets.safeDrawing),
) { ) {
// WebView + loading overlay fill the area above the bottom control bar.
Box(modifier = Modifier.weight(1f).fillMaxWidth()) {
AndroidView(
modifier = Modifier.fillMaxSize(),
factory = { ctx ->
WebView(ctx).apply {
layoutParams = ViewGroup.LayoutParams(
ViewGroup.LayoutParams.MATCH_PARENT,
ViewGroup.LayoutParams.MATCH_PARENT,
)
isVerticalScrollBarEnabled = false
isHorizontalScrollBarEnabled = false
CookieManager.getInstance().setAcceptThirdPartyCookies(this, true)
applyArchipelagoSettings()
webChromeClient = object : WebChromeClient() {
override fun onProgressChanged(view: WebView?, newProgress: Int) {
progress = newProgress
}
override fun onReceivedTitle(view: WebView?, t: String?) {
if (!t.isNullOrBlank()) title = t
}
override fun onReceivedIcon(view: WebView?, icon: Bitmap?) {
if (icon != null) favicon = icon
}
}
webViewClient = object : WebViewClient() {
override fun onPageStarted(view: WebView?, u: String?, favicon: Bitmap?) {
loading = true
}
override fun onPageFinished(view: WebView?, u: String?) {
loading = false
canGoBack = view?.canGoBack() == true
canGoForward = view?.canGoForward() == true
}
override fun doUpdateVisitedHistory(view: WebView?, u: String?, isReload: Boolean) {
canGoBack = view?.canGoBack() == true
canGoForward = view?.canGoForward() == true
}
// Self-signed TLS on the node's apps (e.g. NetBird on
// :8087) would otherwise be cancelled by the WebView
// and render blank. Proceed for the user's own node
// (same host); reject any other untrusted cert.
override fun onReceivedSslError(
view: WebView?,
handler: android.webkit.SslErrorHandler?,
error: android.net.http.SslError?,
) {
val u = error?.url
if (u != null && isSameHost(u, serverUrl)) {
handler?.proceed()
} else {
handler?.cancel()
}
}
override fun shouldOverrideUrlLoading(
view: WebView?,
request: WebResourceRequest?,
): Boolean {
val u = request?.url?.toString() ?: return false
// Stay in the overlay for same-node navigation;
// hand genuinely external links to the real browser.
if (isSameHost(u, serverUrl)) return false
openExternalUrl(ctx, u)
return true
}
}
browser = this
loadUrl(url)
}
},
)
// Centered loading screen — app favicon (or spinner) + title + bar.
if (loading) {
Column(
modifier = Modifier
.fillMaxSize()
.background(SurfaceBlack),
horizontalAlignment = Alignment.CenterHorizontally,
verticalArrangement = Arrangement.Center,
) {
Box(
modifier = Modifier.size(84.dp).clip(RoundedCornerShape(20.dp)),
contentAlignment = Alignment.Center,
) {
val fav = favicon
if (fav != null) {
Image(
bitmap = fav.asImageBitmap(),
contentDescription = title,
modifier = Modifier.fillMaxSize(),
)
} else {
CircularProgressIndicator(color = BitcoinOrange)
}
}
Spacer(modifier = Modifier.height(18.dp))
Text(
text = title,
style = MaterialTheme.typography.bodyLarge,
color = TextPrimary,
maxLines = 1,
overflow = TextOverflow.Ellipsis,
)
Spacer(modifier = Modifier.height(16.dp))
LinearProgressIndicator(
progress = { progress / 100f },
modifier = Modifier.width(220.dp),
color = BitcoinOrange,
trackColor = TextMuted.copy(alpha = 0.2f),
)
}
}
}
// Bottom control bar — mirrors the web mobile-iframe footer.
Row( Row(
modifier = Modifier modifier = Modifier
.fillMaxWidth() .fillMaxWidth()
.height(56.dp) .height(48.dp)
.background(SurfaceBlack) .padding(horizontal = 4.dp),
.padding(horizontal = 8.dp),
horizontalArrangement = Arrangement.SpaceAround,
verticalAlignment = Alignment.CenterVertically, verticalAlignment = Alignment.CenterVertically,
) { ) {
IconButton(onClick = { browser?.goBack() }, enabled = canGoBack) {
Icon(
imageVector = Icons.AutoMirrored.Filled.ArrowBack,
contentDescription = "Back",
tint = if (canGoBack) TextPrimary else TextMuted.copy(alpha = 0.4f),
)
}
IconButton(onClick = { browser?.goForward() }, enabled = canGoForward) {
Icon(
imageVector = Icons.AutoMirrored.Filled.ArrowForward,
contentDescription = "Forward",
tint = if (canGoForward) TextPrimary else TextMuted.copy(alpha = 0.4f),
)
}
IconButton(onClick = { browser?.reload() }) {
Icon(
imageVector = Icons.Default.Refresh,
contentDescription = "Reload",
tint = TextPrimary,
)
}
IconButton(onClick = { openExternalUrl(context, browser?.url ?: url) }) {
Icon(
imageVector = Icons.Default.OpenInBrowser,
contentDescription = stringResource(R.string.open_in_browser),
tint = TextPrimary,
)
}
IconButton(onClick = onClose) { IconButton(onClick = onClose) {
Icon( Icon(
imageVector = Icons.Default.Close, imageVector = Icons.Default.Close,
@ -697,6 +473,82 @@ private fun InAppBrowser(
tint = TextPrimary, tint = TextPrimary,
) )
} }
Text(
text = title,
style = MaterialTheme.typography.bodyMedium,
color = TextPrimary,
maxLines = 1,
overflow = TextOverflow.Ellipsis,
modifier = Modifier.weight(1f),
)
IconButton(onClick = { openExternalUrl(context, browser?.url ?: url) }) {
Icon(
imageVector = Icons.Default.OpenInBrowser,
contentDescription = stringResource(R.string.open_in_browser),
tint = TextMuted,
)
}
} }
AnimatedVisibility(visible = loading, enter = fadeIn(), exit = fadeOut()) {
LinearProgressIndicator(
progress = { progress / 100f },
modifier = Modifier.fillMaxWidth(),
color = BitcoinOrange,
trackColor = SurfaceBlack,
)
}
AndroidView(
modifier = Modifier.fillMaxSize(),
factory = { ctx ->
WebView(ctx).apply {
layoutParams = ViewGroup.LayoutParams(
ViewGroup.LayoutParams.MATCH_PARENT,
ViewGroup.LayoutParams.MATCH_PARENT,
)
isVerticalScrollBarEnabled = false
isHorizontalScrollBarEnabled = false
CookieManager.getInstance().setAcceptThirdPartyCookies(this, true)
applyArchipelagoSettings()
webChromeClient = object : WebChromeClient() {
override fun onProgressChanged(view: WebView?, newProgress: Int) {
progress = newProgress
}
override fun onReceivedTitle(view: WebView?, t: String?) {
if (!t.isNullOrBlank()) title = t
}
}
webViewClient = object : WebViewClient() {
override fun onPageStarted(view: WebView?, u: String?, favicon: Bitmap?) {
loading = true
}
override fun onPageFinished(view: WebView?, u: String?) {
loading = false
}
override fun shouldOverrideUrlLoading(
view: WebView?,
request: WebResourceRequest?,
): Boolean {
val u = request?.url?.toString() ?: return false
// Stay in the overlay for same-node navigation;
// hand genuinely external links to the real browser.
if (isSameHost(u, serverUrl)) return false
openExternalUrl(ctx, u)
return true
}
}
browser = this
loadUrl(url)
}
},
)
} }
} }

@ -1,12 +0,0 @@
<vector xmlns:android="http://schemas.android.com/apk/res/android"
android:width="24dp"
android:height="24dp"
android:viewportWidth="24"
android:viewportHeight="24">
<path
android:pathData="M15,19l-7,-7 7,-7"
android:strokeColor="#FFFFFF"
android:strokeWidth="2"
android:strokeLineCap="round"
android:strokeLineJoin="round" />
</vector>

@ -1,12 +0,0 @@
<vector xmlns:android="http://schemas.android.com/apk/res/android"
android:width="24dp"
android:height="24dp"
android:viewportWidth="24"
android:viewportHeight="24">
<path
android:pathData="M6,18L18,6M6,6l12,12"
android:strokeColor="#FFFFFF"
android:strokeWidth="2"
android:strokeLineCap="round"
android:strokeLineJoin="round" />
</vector>

@ -1,12 +0,0 @@
<vector xmlns:android="http://schemas.android.com/apk/res/android"
android:width="24dp"
android:height="24dp"
android:viewportWidth="24"
android:viewportHeight="24">
<path
android:pathData="M9,5l7,7 -7,7"
android:strokeColor="#FFFFFF"
android:strokeWidth="2"
android:strokeLineCap="round"
android:strokeLineJoin="round" />
</vector>

@ -1,12 +0,0 @@
<vector xmlns:android="http://schemas.android.com/apk/res/android"
android:width="24dp"
android:height="24dp"
android:viewportWidth="24"
android:viewportHeight="24">
<path
android:pathData="M10,6H6a2,2 0,0 0,-2 2v10a2,2 0,0 0,2 2h10a2,2 0,0 0,2 -2v-4M14,4h6m0,0v6m0,-6L10,14"
android:strokeColor="#FFFFFF"
android:strokeWidth="2"
android:strokeLineCap="round"
android:strokeLineJoin="round" />
</vector>

@ -1,12 +0,0 @@
<vector xmlns:android="http://schemas.android.com/apk/res/android"
android:width="24dp"
android:height="24dp"
android:viewportWidth="24"
android:viewportHeight="24">
<path
android:pathData="M4,4v6h6M20,20v-6h-6M5.64,15.36A8,8 0,0 0,18.36 18M18.36,8.64A8,8 0,0 0,5.64 6"
android:strokeColor="#FFFFFF"
android:strokeWidth="2"
android:strokeLineCap="round"
android:strokeLineJoin="round" />
</vector>

@ -23,13 +23,6 @@
<string name="remote_input_hint">Use your phone as a keyboard and mouse for the kiosk</string> <string name="remote_input_hint">Use your phone as a keyboard and mouse for the kiosk</string>
<string name="close">Close</string> <string name="close">Close</string>
<string name="open_in_browser">Open in browser</string> <string name="open_in_browser">Open in browser</string>
<string name="back">Back</string>
<string name="forward">Forward</string>
<string name="refresh">Refresh</string>
<string name="server_name_label">Server Name (optional)</string> <string name="server_name_label">Server Name (optional)</string>
<string name="server_name_placeholder">My Archipelago</string> <string name="server_name_placeholder">My Archipelago</string>
<string name="edit_server">Edit</string>
<string name="edit_server_title">Edit Server</string>
<string name="save_changes">Save Changes</string>
<string name="cancel">Cancel</string>
</resources> </resources>

@ -1,18 +1,13 @@
#!/usr/bin/env bash #!/usr/bin/env bash
# #
# Build the Android companion app and publish it as the served download # Build the Android companion app and publish it as the served download
# (neode-ui/public/packages/archipelago-companion.apk — a plain APK a phone can # (neode-ui/public/packages/archipelago-companion.apk.zip), then commit + push.
# install straight from the link), then commit + push.
# #
# Use this INSTEAD of `git push` when shipping the companion app, so the # Use this INSTEAD of `git push` when shipping the companion app, so the
# downloadable APK on the node always matches what's on main. # downloadable APK on the node always matches what's on main.
# #
# ./Android/ship-companion.sh # ./Android/ship-companion.sh
# #
# The actual build/sign/verify/stage is done by scripts/publish-companion-apk.sh
# (single source of truth, shared with the pre-push hook). It does a CLEAN build,
# forces v1+v2+v3 signing, and ABORTS if any signature scheme is missing — so a
# broken or v2-only APK can never be shipped.
set -euo pipefail set -euo pipefail
ROOT="$(cd "$(dirname "${BASH_SOURCE[0]}")/.." && pwd)" ROOT="$(cd "$(dirname "${BASH_SOURCE[0]}")/.." && pwd)"
@ -21,15 +16,21 @@ cd "$ROOT"
export JAVA_HOME="${JAVA_HOME:-/opt/homebrew/opt/openjdk@17}" export JAVA_HOME="${JAVA_HOME:-/opt/homebrew/opt/openjdk@17}"
export ANDROID_HOME="${ANDROID_HOME:-$HOME/Library/Android/sdk}" export ANDROID_HOME="${ANDROID_HOME:-$HOME/Library/Android/sdk}"
DEST="neode-ui/public/packages/archipelago-companion.apk" APK="Android/app/build/outputs/apk/debug/app-debug.apk"
DEST="neode-ui/public/packages/archipelago-companion.apk.zip"
echo "==> Building + signing + verifying companion APK" echo "==> Building debug APK"
bash scripts/publish-companion-apk.sh ( cd Android && ./gradlew :app:assembleDebug --console=plain -q )
[ -f "$APK" ] || { echo "ERROR: APK not found at $APK" >&2; exit 1; }
[ -f "$DEST" ] || { echo "ERROR: served APK not found at $DEST" >&2; exit 1; } echo "==> Publishing -> $DEST"
mkdir -p "$(dirname "$DEST")"
rm -f "$DEST"
( cd "$(dirname "$APK")" && zip -j -q "$ROOT/$DEST" "$(basename "$APK")" )
if git diff --cached --quiet -- "$DEST"; then git add "$DEST"
echo "==> Nothing to commit (APK unchanged)" if git diff --cached --quiet; then
echo "==> Nothing to commit (working tree + APK unchanged)"
else else
git commit -q -m "chore(android): update companion apk download" git commit -q -m "chore(android): update companion apk download"
echo "==> Committed" echo "==> Committed"
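
Reviewer sketch (not part of either branch): the header comment removed on the main side says the shared publish script aborts when any of the v1, v2 or v3 signature schemes is missing. One minimal way to express such a check is shown below. It assumes the report format printed by `apksigner verify --verbose`; the function name `require_all_schemes` is hypothetical, and `scripts/publish-companion-apk.sh` may well implement the check differently.

```shell
#!/usr/bin/env bash
# Sketch only: fail closed unless a verification report confirms v1, v2 and v3.
# Assumes report lines shaped like "Verified using v2 scheme (...): true".

# Reads a verification report on stdin; returns non-zero if a scheme is missing.
require_all_schemes() {
  local report scheme
  report="$(cat)"
  for scheme in v1 v2 v3; do
    if ! printf '%s\n' "$report" \
        | grep -Eq "^Verified using ${scheme} scheme .*: true\$"; then
      echo "ERROR: ${scheme} signature missing or not verified" >&2
      return 1
    fi
  done
}

# Hypothetical usage against a real build (needs Android build-tools on PATH):
#   apksigner verify --verbose "$APK" | require_all_schemes || exit 1
```

Keeping the parser separate from the `apksigner` call means the abort logic can be exercised with canned reports, without an Android SDK installed.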

@ -1,23 +1,13 @@
# Archipelago — agent guide # Archipelago — agent guide
## ✅ Single-node production gate is GREEN (2026-06-23) ## 🚩 TOP PRIORITY (until production testing passes)
`tests/lifecycle/run-gate.sh` is **5/5 on .228, 0 failures** — the single-node exit **Read `docs/PRODUCTION-MASTER-PLAN.md` first.** It is the authoritative plan and
criterion is met and the priority banner is demoted. Next exit-criteria: the overrides ad-hoc direction until the production test gate is green. Goal: a
**multinode pass** (`docs/multinode-testing-plan.md`) and workstreams B/C/D. world-class, **developer-ready app platform** where every app is manifest-driven,
manifests ship via the **signed registry** (not OTA disk files), and **third-party
**For day-to-day work, use `docs/UNIFIED-TASK-TRACKER.md`** — the consolidated, developers publish apps via an external/decentralized registry** — all rootless,
priority-ordered "what's left" list across the 1.8.0 OTA and master-plan docs secure, robust, and 100%-uptime-capable.
(fastest/simplest tasks first). It supersedes hunting through the two source docs
below for open items; those remain the narrative/history.
**Read `docs/PRODUCTION-MASTER-PLAN.md` first** — it is still the authoritative plan
for the north star: a world-class, **developer-ready app platform** where every app
is manifest-driven, manifests ship via the **signed registry** (not OTA disk files),
and **third-party developers publish apps via an external/decentralized registry**,
all rootless, secure, robust, and 100%-uptime-capable. It no longer overrides all
ad-hoc direction now that the gate is green, but it remains the source of truth for
sequencing the remaining workstreams.
Detailed sub-plans (all linked from the master): Detailed sub-plans (all linked from the master):
- App platform / packaging phases + security model → `docs/APP-PACKAGING-MIGRATION-PLAN.md` - App platform / packaging phases + security model → `docs/APP-PACKAGING-MIGRATION-PLAN.md`
@ -26,28 +16,6 @@ Detailed sub-plans (all linked from the master):
- Current per-app state → `docs/app-registry-status-2026-06-21.md` - Current per-app state → `docs/app-registry-status-2026-06-21.md`
- Production test gate (exit criterion) → `tests/lifecycle/TESTING.md` - Production test gate (exit criterion) → `tests/lifecycle/TESTING.md`
## Commit & push every unit of work (never violate)
**The #1 process rule: work is not "done" until it is committed AND pushed.** This
exists because finished work has been lost/clobbered by sitting uncommitted in the
shared tree across agents and sessions. To prevent that:
- **Commit each feature/fix the moment it works** — one focused, self-contained
commit per logical change (it compiles and its targeted tests pass). Do not let
unrelated changes accumulate uncommitted.
- **Push immediately after committing** so nothing lives only on one machine. `main`
is protected → push via `git push gitea-ai main` (account `ai`, see the memory
note); feature branches push to their own remote.
- **Never leave a stack of finished work uncommitted** overnight or when handing off
between agents — if you must pause mid-change, commit a clearly-labelled WIP
checkpoint rather than leaving it dirty.
- **Stage explicitly by path** (`git add <paths>`) when another agent's uncommitted
work shares the tree — never `git add -A` / `git commit -a`, which clobbers or
entangles their changes.
- **Never commit or push secrets** (mnemonics, private keys, API tokens). Signing is
done offline; artifacts (catalog/manifest) are signed, not the keys.
- Commit messages end with the `Co-Authored-By: Claude …` trailer.
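
Reviewer sketch (not part of either branch): the "stage explicitly by path" rule above can be illustrated as follows. The helper name `commit_paths` is hypothetical; only the `gitea-ai` remote name comes from the guide itself.

```shell
#!/usr/bin/env bash
# Sketch only: stage exactly the named paths, then commit them.
# Anything else in the shared working tree stays untouched.

# commit_paths MESSAGE PATH...
commit_paths() {
  local message="$1"
  shift
  # "--" stops option parsing, so odd file names cannot be read as flags.
  git add -- "$@" && git commit -q -m "$message"
}

# Per the guide, the push step that follows a successful commit would be:
#   git push gitea-ai main
```

Because nothing is staged implicitly, another agent's uncommitted files remain uncommitted and unmodified after the call.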
## Invariants (never violate) ## Invariants (never violate)
- **Rootless Podman only.** No rootful, no Docker-socket mounts, no privileged - **Rootless Podman only.** No rootful, no Docker-socket mounts, no privileged
@ -59,8 +27,7 @@ shared tree across agents and sessions. To prevent that:
`container::secrets`, 0600/rootless) — never hardcoded, per-app, or logged. `container::secrets`, 0600/rootless) — never hardcoded, per-app, or logged.
- **Migrations never destroy data** — preserve `/var/lib/archipelago/<app>`, - **Migrations never destroy data** — preserve `/var/lib/archipelago/<app>`,
secrets, credentials, ports, and adoption container names; keep a rollback path. secrets, credentials, ports, and adoption container names; keep a rollback path.
- **Verify on the real node .228 before any tag.** (Fleet-wide multinode - **Verify on a real node (.228, then .198) before any tag.**
verification is a separate plan: `docs/multinode-testing-plan.md`.)
## Build / verify ## Build / verify
@ -74,11 +41,7 @@ shared tree across agents and sessions. To prevent that:
## Production test gate (definition of done) ## Production test gate (definition of done)
`tests/lifecycle/run-gate.sh` green across install / UI / stop / start / restart / `tests/lifecycle/run-20x.sh` green across install / UI / stop / start / restart /
reinstall / reboot-survive / archipelago-restart-survive / uninstall — **5× on reinstall / reboot-survive / archipelago-restart-survive / uninstall — **5× on
.228** (`ARCHY_ITERATIONS=5`). **Run the gate ON the node** (it uses local podman/systemctl/bitcoin .228 AND .198 for now** (`ARCHY_ITERATIONS=5`; temporarily reduced from 20×
probes), not via RPC from another host. **✅ GREEN 2026-06-23 (5/5, 0 not-ok)** — keep it restore to 20× before the final ship). Until green, the master plan is the priority.
green (re-run after orchestrator/lifecycle changes); regressions are top priority again.
**Multinode testing (.198 + the rest of the fleet) is a SEPARATE plan** —
`docs/multinode-testing-plan.md` — not part of this single-node gate criterion, and is
the next exit criterion now that single-node is green.

@ -73,7 +73,7 @@
"author": "Mempool", "author": "Mempool",
"category": "money", "category": "money",
"tier": "core", "tier": "core",
"dockerImage": "146.59.87.168:3000/lfg2025/mempool-frontend:v3.0.1", "dockerImage": "146.59.87.168:3000/lfg2025/mempool-frontend:v3.0.0",
"repoUrl": "https://github.com/mempool/mempool", "repoUrl": "https://github.com/mempool/mempool",
"requires": [ "requires": [
"bitcoin-knots", "bitcoin-knots",
@ -214,6 +214,31 @@
] ]
} }
}, },
{
"id": "meshtastic",
"title": "Meshtastic",
"version": "2-daily-alpine",
"description": "Open-source mesh networking for LoRa radios. Create decentralized communication networks.",
"icon": "/assets/img/app-icons/meshcore.svg",
"author": "Meshtastic",
"category": "networking",
"tier": "recommended",
"dockerImage": "docker.io/meshtastic/meshtasticd:daily-alpine",
"repoUrl": "https://github.com/meshtastic/firmware",
"containerConfig": {
"ports": [
"4403:4403"
],
"volumes": [
"/var/lib/archipelago/meshtastic:/var/lib/meshtasticd"
],
"env": [
"MESHTASTIC_PORT=/dev/ttyUSB0",
"MESHTASTIC_SERIAL=true"
],
"notes": "Requires a LoRa radio device at /dev/ttyUSB0. The config file is rendered from the app manifest before container start."
}
},
{ {
"id": "vaultwarden", "id": "vaultwarden",
"title": "Vaultwarden", "title": "Vaultwarden",
@ -269,12 +294,12 @@
"id": "fedimint-clientd", "id": "fedimint-clientd",
"title": "Fedimint Client", "title": "Fedimint Client",
"version": "0.8.0", "version": "0.8.0",
"description": "Fedimint ecash client daemon (fmcd). Lets the node hold Fedimint ecash and join federations; the wallet talks to it over a local REST API.", "description": "Fedimint ecash client daemon (fmcd). Lets your node hold Fedimint ecash and join federations; the wallet talks to it over a local REST API.",
"icon": "/assets/img/app-icons/fedimint.png", "icon": "/assets/img/app-icons/fedimint.png",
"author": "Fedimint", "author": "Fedimint",
"category": "money", "category": "money",
"tier": "core", "tier": "core",
"dockerImage": "146.59.87.168:3000/lfg2025/fmcd:0.8.1", "dockerImage": "146.59.87.168:3000/lfg2025/fmcd:0.8.0",
"repoUrl": "https://github.com/minmoto/fmcd" "repoUrl": "https://github.com/minmoto/fmcd"
}, },
{ {
@ -321,8 +346,8 @@
{ {
"id": "immich", "id": "immich",
"title": "Immich", "title": "Immich",
"version": "2.7.4", "version": "1.90.0",
"description": "Self-hosted photo and video backup with mobile apps and search.", "description": "High-performance photo and video backup with ML.",
"icon": "/assets/img/app-icons/immich.png", "icon": "/assets/img/app-icons/immich.png",
"author": "Immich", "author": "Immich",
"category": "data", "category": "data",
@ -428,13 +453,13 @@
{ {
"id": "netbird", "id": "netbird",
"title": "NetBird", "title": "NetBird",
"version": "2.38.0", "version": "0.71.2",
"description": "Self-hosted WireGuard mesh VPN control plane with dashboard, embedded identity provider, management API, signal, relay, and STUN. The user-facing entry point — a TLS proxy in front of the dashboard + server.", "description": "Self-hosted WireGuard mesh VPN control plane with dashboard, embedded identity provider, management API, signal, relay, and STUN service.",
"icon": "/assets/img/app-icons/netbird.svg", "icon": "/assets/img/app-icons/netbird.svg",
"author": "NetBird", "author": "NetBird",
"category": "networking", "category": "networking",
"tier": "recommended", "tier": "recommended",
"dockerImage": "docker.io/library/nginx:1.27-alpine", "dockerImage": "docker.io/netbirdio/dashboard:v2.38.0",
"repoUrl": "https://github.com/netbirdio/netbird", "repoUrl": "https://github.com/netbirdio/netbird",
"containerConfig": { "containerConfig": {
"ports": [ "ports": [

@ -1,7 +1,7 @@
 app:
 id: archy-btcpay-db
 name: BTCPay Postgres
-version: "15.17"
+version: 15.17
 description: Postgres backend for BTCPay and NBXplorer.
container: container:

@ -1,12 +1,12 @@
 app:
 id: archy-mempool-web
 name: Mempool Web
-version: 3.0.1
+version: 3.0.0
 description: Frontend web UI for mempool explorer.
 container_name: mempool
 container:
-image: 146.59.87.168:3000/lfg2025/mempool-frontend:v3.0.1
+image: git.tx1138.com/lfg2025/mempool-frontend:v3.0.0
 pull_policy: if-not-present
 network: archy-net
@ -33,10 +33,7 @@ app:
health_check: health_check:
type: http type: http
# 127.0.0.1 not localhost: the image's wget resolves localhost to ::1 (IPv6) endpoint: http://localhost:8080
# first, but nginx binds 0.0.0.0:8080 (IPv4) only -> localhost probe gets
# "connection refused" -> perpetual unhealthy -> health_monitor restart loop.
endpoint: http://127.0.0.1:8080
path: / path: /
interval: 30s interval: 30s
timeout: 5s timeout: 5s
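
Reviewer sketch (not part of either branch): the main-side comment explains that a `localhost` probe can resolve to `::1` first while nginx listens on IPv4 only, so the endpoint is pinned to `127.0.0.1`. The same rewrite could be applied mechanically to other endpoints. The helper name `pin_ipv4_loopback` is hypothetical and is not something the repository is known to contain.

```shell
#!/usr/bin/env bash
# Sketch only: rewrite a health-check endpoint to target IPv4 loopback.

# pin_ipv4_loopback URL: replace a bare "localhost" host with 127.0.0.1.
pin_ipv4_loopback() {
  local endpoint="$1"
  # Match "localhost" only when it is the whole host, i.e. followed by
  # ':' (port), '/' (path) or end of string. \1 keeps the scheme.
  printf '%s\n' "$endpoint" \
    | sed -E 's#^(https?://)localhost(:|/|$)#\1127.0.0.1\2#'
}

pin_ipv4_loopback "http://localhost:8080"
```

Hosts that merely start with `localhost`, such as `localhost.example.com`, are left unchanged because the match requires a port, a path or the end of the string.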

@ -1,34 +1,5 @@
# Bitcoin Core — minimal rootless image built from the OFFICIAL upstream release. # Bitcoin Core - uses official image
# FROM bitcoin/bitcoin:24.0
# The CANONICAL, verified build path is scripts/build-bitcoin-image.sh, which
# downloads the upstream tarball, verifies SHA-256 + the OpenPGP signature # Default user is already 'bitcoin'
# (fail-closed), and tags/pushes <registry>/bitcoin:<version>. This Dockerfile # No additional setup needed
# mirrors that image for a manual/local build and replaces the old stale
# community base (`FROM bitcoin/bitcoin:24.0`).
#
# Build (binaries must be pre-fetched + verified into ./bin — see the script):
# scripts/build-bitcoin-image.sh core 31.0
FROM debian:bookworm-slim
ARG BITCOIN_VERSION=31.0
RUN set -eux; \
apt-get update; \
apt-get install -y --no-install-recommends ca-certificates; \
rm -rf /var/lib/apt/lists/*; \
useradd -m -u 1000 -s /bin/bash bitcoin; \
mkdir -p /home/bitcoin/.bitcoin; \
chown -R bitcoin:bitcoin /home/bitcoin
# bin/ holds the SHA-256 + GPG-verified bitcoind / bitcoin-cli (Guix-built,
# x86_64-linux-gnu) extracted from the official release tarball.
COPY bin/bitcoind /usr/local/bin/bitcoind
COPY bin/bitcoin-cli /usr/local/bin/bitcoin-cli
RUN chmod 0755 /usr/local/bin/bitcoind /usr/local/bin/bitcoin-cli
# Run as (container) root, like the legacy hand-built :latest image. Rootless
# Podman maps container-root to the unprivileged host service user; the manifest
# grants CAP_DAC_OVERRIDE so bitcoind can read its data dir, which the
# orchestrator chowns to the data_uid (host 100101 / container uid 102), not to
# this image's `bitcoin` user. A non-root USER can't read existing chain data and
# bitcoind crash-loops with "Error initializing block database".
WORKDIR /home/bitcoin
VOLUME ["/home/bitcoin/.bitcoin"]
EXPOSE 8332 8333
ENTRYPOINT ["bitcoind"]
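
Reviewer sketch (not part of either branch): the main-side Dockerfile comment describes a fail-closed SHA-256 check before binaries are copied into `bin/`. A minimal version of that digest step might look like the following. The function name `verify_sha256` is hypothetical, the file names in the usage comment are illustrative, and the real `scripts/build-bitcoin-image.sh` is not shown in this diff.

```shell
#!/usr/bin/env bash
# Sketch only: succeed only when a file's SHA-256 digest matches exactly.

# verify_sha256 FILE EXPECTED_HEX
verify_sha256() {
  local file="$1" expected="$2" actual
  # sha256sum prints "<hex>  <file>"; keep only the hex digest.
  actual="$(sha256sum "$file" | cut -d' ' -f1)"
  if [ "$actual" != "$expected" ]; then
    echo "ERROR: SHA-256 mismatch for $file" >&2
    return 1
  fi
}

# Illustrative usage; the signature check would be a separate gpg step:
#   gpg --verify SHA256SUMS.asc SHA256SUMS || exit 1
#   verify_sha256 "$TARBALL" "$EXPECTED_HEX" || exit 1
```

Returning non-zero on any mismatch lets a caller running under `set -e` stop before the image is built or pushed.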

@ -17,13 +17,6 @@ app:
# the IBD sweet spot - 4GB on full nodes, 1GB on pruned. Container # the IBD sweet spot - 4GB on full nodes, 1GB on pruned. Container
# --memory=8g (config.rs::get_memory_limit) leaves headroom for # --memory=8g (config.rs::get_memory_limit) leaves headroom for
# mempool + connections. # mempool + connections.
#
# -printtoconsole=0: foreground bitcoind defaults console logging ON,
# which pushed every IBD "UpdateTip" line through conmon into journald
# (>1 GB/day on a fresh node). bitcoind still writes debug.log in the
# datadir (/var/lib/archipelago/bitcoin/debug.log, self-shrunk on
# restart) — use that for deep debugging; podman logs only carries
# entrypoint/startup errors.
- >- - >-
BITCOIND="$(command -v bitcoind || true)"; BITCOIND="$(command -v bitcoind || true)";
if [ -z "$BITCOIND" ]; then if [ -z "$BITCOIND" ]; then
@ -43,9 +36,9 @@ app:
RPC_TXRELAY_FLAGS="$RPC_TXRELAY_FLAGS -rpcauth=$RPC_TXRELAY_AUTH -rpcwhitelist=txrelay:sendrawtransaction,submitpackage,testmempoolaccept,getmempoolinfo,getrawmempool,getmempoolentry,getnetworkinfo,getblockchaininfo,getblockcount,getblockhash,getblock,getblockheader,getrawtransaction,gettxout,gettxspendingprevout,decoderawtransaction,decodescript,estimatesmartfee,uptime,ping,getconnectioncount,getpeerinfo,getindexinfo,getdeploymentinfo,getchaintips"; RPC_TXRELAY_FLAGS="$RPC_TXRELAY_FLAGS -rpcauth=$RPC_TXRELAY_AUTH -rpcwhitelist=txrelay:sendrawtransaction,submitpackage,testmempoolaccept,getmempoolinfo,getrawmempool,getmempoolentry,getnetworkinfo,getblockchaininfo,getblockcount,getblockhash,getblock,getblockheader,getrawtransaction,gettxout,gettxspendingprevout,decoderawtransaction,decodescript,estimatesmartfee,uptime,ping,getconnectioncount,getpeerinfo,getindexinfo,getdeploymentinfo,getchaintips";
fi; fi;
if [ "${DISK_GB_VALUE:-0}" -lt 1000 ]; then if [ "${DISK_GB_VALUE:-0}" -lt 1000 ]; then
exec "$BITCOIND" -datadir=/home/bitcoin/.bitcoin -noconf -printtoconsole=0 -server=1 -prune=550 -rpcallowip=0.0.0.0/0 -rpcbind=0.0.0.0:8332 -listen=1 -bind=0.0.0.0:8333 -dbcache=1024 -par=0 -maxconnections=125 $RPC_HEADROOM $RPC_TXRELAY_FLAGS -rpcuser="$RPC_USER" -rpcpassword="$RPC_PASS"; exec "$BITCOIND" -datadir=/home/bitcoin/.bitcoin -noconf -server=1 -prune=550 -rpcallowip=0.0.0.0/0 -rpcbind=0.0.0.0:8332 -listen=1 -bind=0.0.0.0:8333 -dbcache=1024 -par=0 -maxconnections=125 $RPC_HEADROOM $RPC_TXRELAY_FLAGS -rpcuser="$RPC_USER" -rpcpassword="$RPC_PASS";
else else
exec "$BITCOIND" -datadir=/home/bitcoin/.bitcoin -noconf -printtoconsole=0 -server=1 -txindex=1 -rpcallowip=0.0.0.0/0 -rpcbind=0.0.0.0:8332 -listen=1 -bind=0.0.0.0:8333 -dbcache=4096 -par=0 -maxconnections=125 $RPC_HEADROOM $RPC_TXRELAY_FLAGS -rpcuser="$RPC_USER" -rpcpassword="$RPC_PASS"; exec "$BITCOIND" -datadir=/home/bitcoin/.bitcoin -noconf -server=1 -txindex=1 -rpcallowip=0.0.0.0/0 -rpcbind=0.0.0.0:8332 -listen=1 -bind=0.0.0.0:8333 -dbcache=4096 -par=0 -maxconnections=125 $RPC_HEADROOM $RPC_TXRELAY_FLAGS -rpcuser="$RPC_USER" -rpcpassword="$RPC_PASS";
fi fi
derived_env: derived_env:
- key: DISK_GB - key: DISK_GB
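
Reviewer sketch (not part of either branch): the entrypoint above switches between a pruned and a full-index configuration at a 1000 GB disk threshold. Isolated from the other flags, the decision reduces to the function below. The name `select_storage_flags` is hypothetical; the flag values are copied from this manifest's two `exec` lines.

```shell
#!/usr/bin/env bash
# Sketch only: choose the storage-related bitcoind flags from the disk size.

# select_storage_flags DISK_GB
select_storage_flags() {
  local disk_gb="${1:-0}"
  if [ "$disk_gb" -lt 1000 ]; then
    # Small disk: prune to the 550 MiB minimum and use a smaller cache.
    echo "-prune=550 -dbcache=1024"
  else
    # Large disk: keep the full chain with a transaction index.
    echo "-txindex=1 -dbcache=4096"
  fi
}

select_storage_flags 500
select_storage_flags 2000
```

An unset disk size defaults to 0, matching the `${DISK_GB_VALUE:-0}` fallback in the manifest, so an unknown disk is treated as small and pruned.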

@ -1,35 +0,0 @@
# Bitcoin Knots — minimal rootless image built from the OFFICIAL upstream release.
#
# Knots previously had NO Dockerfile (the :latest tag was built/pushed by hand).
# The CANONICAL, verified build path is scripts/build-bitcoin-image.sh, which
# downloads the upstream tarball, verifies SHA-256 + the OpenPGP signature
# (fail-closed, Luke-Jr release key), and tags/pushes
# <registry>/bitcoin-knots:<version>. Knots version strings embed a build date,
# e.g. 29.3.knots20260508 — the full string is the tag.
#
# Build (binaries must be pre-fetched + verified into ./bin — see the script):
# scripts/build-bitcoin-image.sh knots 29.3.knots20260508
FROM debian:bookworm-slim
ARG KNOTS_VERSION=29.3.knots20260508
RUN set -eux; \
apt-get update; \
apt-get install -y --no-install-recommends ca-certificates; \
rm -rf /var/lib/apt/lists/*; \
useradd -m -u 1000 -s /bin/bash bitcoin; \
mkdir -p /home/bitcoin/.bitcoin; \
chown -R bitcoin:bitcoin /home/bitcoin
# bin/ holds the SHA-256 + GPG-verified bitcoind / bitcoin-cli (Knots, Guix-built,
# x86_64-linux-gnu) extracted from the official release tarball.
COPY bin/bitcoind /usr/local/bin/bitcoind
COPY bin/bitcoin-cli /usr/local/bin/bitcoin-cli
RUN chmod 0755 /usr/local/bin/bitcoind /usr/local/bin/bitcoin-cli
# Run as (container) root, like the legacy hand-built :latest image. Rootless
# Podman maps container-root to the unprivileged host service user; the manifest
# grants CAP_DAC_OVERRIDE so bitcoind can read its data dir, which the
# orchestrator chowns to the data_uid (host 100101 / container uid 102), not to
# this image's `bitcoin` user. A non-root USER can't read existing chain data and
# bitcoind crash-loops with "Error initializing block database".
WORKDIR /home/bitcoin
VOLUME ["/home/bitcoin/.bitcoin"]
EXPOSE 8332 8333
ENTRYPOINT ["bitcoind"]

@ -17,13 +17,6 @@ app:
# the IBD sweet spot - 4GB on full nodes, 1GB on pruned. Container # the IBD sweet spot - 4GB on full nodes, 1GB on pruned. Container
# --memory=8g (config.rs::get_memory_limit) leaves headroom for # --memory=8g (config.rs::get_memory_limit) leaves headroom for
# mempool + connections. # mempool + connections.
#
# -printtoconsole=0: foreground bitcoind defaults console logging ON,
# which pushed every IBD "UpdateTip" line through conmon into journald
# (>1 GB/day on a fresh node). bitcoind still writes debug.log in the
# datadir (/var/lib/archipelago/bitcoin/debug.log, self-shrunk on
# restart) — use that for deep debugging; podman logs only carries
# entrypoint/startup errors.
- >- - >-
BITCOIND="$(command -v bitcoind || true)"; BITCOIND="$(command -v bitcoind || true)";
if [ -z "$BITCOIND" ]; then if [ -z "$BITCOIND" ]; then
@ -43,9 +36,9 @@ app:
RPC_TXRELAY_FLAGS="$RPC_TXRELAY_FLAGS -rpcauth=$RPC_TXRELAY_AUTH -rpcwhitelist=txrelay:sendrawtransaction,submitpackage,testmempoolaccept,getmempoolinfo,getrawmempool,getmempoolentry,getnetworkinfo,getblockchaininfo,getblockcount,getblockhash,getblock,getblockheader,getrawtransaction,gettxout,gettxspendingprevout,decoderawtransaction,decodescript,estimatesmartfee,uptime,ping,getconnectioncount,getpeerinfo,getindexinfo,getdeploymentinfo,getchaintips"; RPC_TXRELAY_FLAGS="$RPC_TXRELAY_FLAGS -rpcauth=$RPC_TXRELAY_AUTH -rpcwhitelist=txrelay:sendrawtransaction,submitpackage,testmempoolaccept,getmempoolinfo,getrawmempool,getmempoolentry,getnetworkinfo,getblockchaininfo,getblockcount,getblockhash,getblock,getblockheader,getrawtransaction,gettxout,gettxspendingprevout,decoderawtransaction,decodescript,estimatesmartfee,uptime,ping,getconnectioncount,getpeerinfo,getindexinfo,getdeploymentinfo,getchaintips";
fi; fi;
if [ "${DISK_GB_VALUE:-0}" -lt 1000 ]; then if [ "${DISK_GB_VALUE:-0}" -lt 1000 ]; then
exec "$BITCOIND" -datadir=/home/bitcoin/.bitcoin -noconf -printtoconsole=0 -server=1 -prune=550 -rpcallowip=0.0.0.0/0 -rpcbind=0.0.0.0:8332 -listen=1 -bind=0.0.0.0:8333 -dbcache=2048 -par=0 -maxconnections=125 $RPC_HEADROOM $RPC_TXRELAY_FLAGS -rpcuser="$RPC_USER" -rpcpassword="$RPC_PASS"; exec "$BITCOIND" -datadir=/home/bitcoin/.bitcoin -noconf -server=1 -prune=550 -rpcallowip=0.0.0.0/0 -rpcbind=0.0.0.0:8332 -listen=1 -bind=0.0.0.0:8333 -dbcache=2048 -par=0 -maxconnections=125 $RPC_HEADROOM $RPC_TXRELAY_FLAGS -rpcuser="$RPC_USER" -rpcpassword="$RPC_PASS";
else else
exec "$BITCOIND" -datadir=/home/bitcoin/.bitcoin -noconf -printtoconsole=0 -server=1 -txindex=1 -rpcallowip=0.0.0.0/0 -rpcbind=0.0.0.0:8332 -listen=1 -bind=0.0.0.0:8333 -dbcache=4096 -par=0 -maxconnections=125 $RPC_HEADROOM $RPC_TXRELAY_FLAGS -rpcuser="$RPC_USER" -rpcpassword="$RPC_PASS"; exec "$BITCOIND" -datadir=/home/bitcoin/.bitcoin -noconf -server=1 -txindex=1 -rpcallowip=0.0.0.0/0 -rpcbind=0.0.0.0:8332 -listen=1 -bind=0.0.0.0:8333 -dbcache=4096 -par=0 -maxconnections=125 $RPC_HEADROOM $RPC_TXRELAY_FLAGS -rpcuser="$RPC_USER" -rpcpassword="$RPC_PASS";
fi fi
derived_env: derived_env:
- key: DISK_GB - key: DISK_GB

@ -22,7 +22,6 @@ app:
- app_id: bitcoin-knots - app_id: bitcoin-knots
version: ">=26.0" version: ">=26.0"
- storage: 50Gi - storage: 50Gi
- bitcoin:archival
resources: resources:
cpu_limit: 0 cpu_limit: 0

@ -9,7 +9,7 @@ app:
# 0.8.2 — iroh-capable). No usable upstream image exists, so we build + push # 0.8.2 — iroh-capable). No usable upstream image exists, so we build + push
# this to the node registry. Pin the tag to match the REST shapes coded in # this to the node registry. Pin the tag to match the REST shapes coded in
# core/archipelago/src/wallet/fedimint_client.rs (validated against 0.8.2). # core/archipelago/src/wallet/fedimint_client.rs (validated against 0.8.2).
image: 146.59.87.168:3000/lfg2025/fmcd:0.8.1 image: 146.59.87.168:3000/lfg2025/fmcd:0.8.0
pull_policy: if-not-present pull_policy: if-not-present
network: archy-net network: archy-net
# No entrypoint override: the image's resilient `fmcd-run` launcher loops # No entrypoint override: the image's resilient `fmcd-run` launcher loops
@ -33,11 +33,6 @@ app:
- storage: 2Gi - storage: 2Gi
resources: resources:
# fmcd's embedded iroh networking can hot-loop on relay/hole-punch retries
# on NAT'd nodes that reach the federation neither directly nor via iroh's
# public relays, pegging its whole allotment. Cap it low so a stuck instance
# can't starve the node (steady-state is <3% of a core; joins are brief);
# the fmcd-run watchdog additionally restarts a sustained-hot process.
cpu_limit: 1 cpu_limit: 1
memory_limit: 1Gi memory_limit: 1Gi
disk_limit: 2Gi disk_limit: 2Gi

@ -8,13 +8,6 @@ app:
image: 146.59.87.168:3000/lfg2025/lnd:v0.18.4-beta image: 146.59.87.168:3000/lfg2025/lnd:v0.18.4-beta
pull_policy: if-not-present pull_policy: if-not-present
network: archy-net network: archy-net
# BITCOIND_HOST must follow the node's actual Bitcoin container — Knots or
# Core — resolved at apply time from host facts. Hardcoding either breaks
# LND's chain backend connection on the other (lnd.conf is likewise
# resolved in lnd::ensure_config).
derived_env:
- key: BITCOIND_HOST
template: "{{BITCOIN_HOST}}"
secret_env: secret_env:
- key: BITCOIND_RPCPASS - key: BITCOIND_RPCPASS
secret_file: bitcoin-rpc-password secret_file: bitcoin-rpc-password
@ -52,6 +45,7 @@ app:
options: [rw] options: [rw]
environment: environment:
- BITCOIND_HOST=bitcoin-knots
- BITCOIND_RPCUSER=archipelago - BITCOIND_RPCUSER=archipelago
- NETWORK=mainnet - NETWORK=mainnet
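
Review note: the `derived_env` entry in this hunk resolves `{{BITCOIN_HOST}}` from host facts at apply time, while the static `environment` entry pins `bitcoin-knots`. A minimal sketch of that kind of placeholder substitution follows. `render_template` and `bitcoin_host_facts` are hypothetical names for illustration, not the orchestrator's actual API, and the real resolver may behave differently for unknown keys.

```rust
use std::collections::HashMap;

/// Hypothetical helper: substitute `{{KEY}}` placeholders from host facts.
/// Unknown keys are left untouched so a missing fact stays visible.
fn render_template(template: &str, facts: &HashMap<&str, &str>) -> String {
    let mut out = String::with_capacity(template.len());
    let mut rest = template;
    while let Some(start) = rest.find("{{") {
        // Copy the literal text before the placeholder.
        out.push_str(&rest[..start]);
        let after = &rest[start + 2..];
        match after.find("}}") {
            Some(end) => {
                let key = &after[..end];
                match facts.get(key) {
                    Some(value) => out.push_str(value),
                    None => {
                        // Assumption: keep unresolved placeholders verbatim.
                        out.push_str("{{");
                        out.push_str(key);
                        out.push_str("}}");
                    }
                }
                rest = &after[end + 2..];
            }
            None => {
                // Unterminated placeholder: copy the remainder verbatim.
                out.push_str(&rest[start..]);
                rest = "";
            }
        }
    }
    out.push_str(rest);
    out
}

/// Hypothetical: host facts for a node whose Bitcoin container is `container`
/// (for example "bitcoin-knots" or "bitcoin-core").
fn bitcoin_host_facts(container: &'static str) -> HashMap<&'static str, &'static str> {
    let mut facts = HashMap::new();
    facts.insert("BITCOIN_HOST", container);
    facts
}
```

With facts built for a Core node, `render_template("{{BITCOIN_HOST}}", &facts)` yields the Core container name, which is the case the hardcoded `BITCOIND_HOST=bitcoin-knots` value cannot cover.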

@ -27,7 +27,6 @@ app:
version: ">=1.18.0" version: ">=1.18.0"
- app_id: archy-mempool-db - app_id: archy-mempool-db
version: ">=11.4.10" version: ">=11.4.10"
- bitcoin:archival
resources: resources:
memory_limit: 2Gi memory_limit: 2Gi

@ -5,7 +5,7 @@ app:
description: Bitcoin mempool and blockchain explorer. Real-time transaction and block visualization. description: Bitcoin mempool and blockchain explorer. Real-time transaction and block visualization.
container: container:
image: 146.59.87.168:3000/lfg2025/mempool-frontend:v3.0.1 image: 146.59.87.168:3000/lfg2025/mempool-frontend:v3.0.0
image_signature: cosign://... image_signature: cosign://...
pull_policy: if-not-present pull_policy: if-not-present
@ -13,7 +13,6 @@ app:
- app_id: bitcoin-core - app_id: bitcoin-core
version: ">=24.0" version: ">=24.0"
- storage: 20Gi - storage: 20Gi
- bitcoin:archival
resources: resources:
cpu_limit: 2 cpu_limit: 2
@ -31,7 +30,7 @@ app:
ports: ports:
- host: 4080 - host: 4080
container: 8080 # mempool-frontend nginx listens on 8080 (FRONTEND_HTTP_PORT=8080) container: 4080
protocol: tcp # Web UI protocol: tcp # Web UI
volumes: volumes:

@ -0,0 +1,5 @@
# Meshtastic - uses official image
FROM meshtastic/meshtastic:latest
# Default configuration is in the image
# No additional setup needed

@ -0,0 +1,69 @@
app:
id: meshtastic
name: Meshtastic
version: 2-daily-alpine
description: Open-source mesh networking for LoRa radios. Create decentralized communication networks.
container:
image: docker.io/meshtastic/meshtasticd:daily-alpine
pull_policy: if-not-present
dependencies:
- storage: 1Gi
resources:
cpu_limit: 1
memory_limit: 512Mi
disk_limit: 1Gi
security:
capabilities: [NET_ADMIN, SYS_ADMIN] # Required for LoRa radio access
readonly_root: false # Needs write access for device management
no_new_privileges: true
user: 1000
seccomp_profile: default
network_policy: host # Requires host network for radio access
apparmor_profile: meshtastic
ports:
- host: 4403
container: 4403
protocol: tcp # Meshtastic TCP API
devices:
- /dev/ttyUSB0 # LoRa radio device (if connected)
volumes:
- type: bind
source: /var/lib/archipelago/meshtastic
target: /var/lib/meshtasticd
options: [rw]
files:
- path: /var/lib/archipelago/meshtastic/config.yaml
content: |
General:
MACAddress: AA:BB:CC:DD:EE:01
Webserver:
Port: 4403
environment:
- MESHTASTIC_PORT=/dev/ttyUSB0
- MESHTASTIC_SERIAL=true
health_check:
type: cmd
endpoint: test -f /var/lib/meshtasticd/config.yaml
interval: 30s
timeout: 30s
retries: 5
networking:
mesh_enabled: true
local_network_access: true
metadata:
icon: /assets/img/app-icons/meshcore.svg
category: networking
tier: recommended
repo: https://github.com/meshtastic/firmware

@ -1,77 +0,0 @@
app:
id: netbird-dashboard
name: NetBird Dashboard
version: "2.38.0"
description: NetBird management dashboard (SPA). Internal stack member served through the netbird proxy.
category: networking
# Hyphen name matches runtime references + the live container (adoption).
# Alias `netbird-dashboard` is the short hostname the proxy's nginx proxies to.
container_name: netbird-dashboard
container:
image: docker.io/netbirdio/dashboard:v2.38.0
pull_policy: if-not-present
network: netbird-net
network_aliases: [netbird-dashboard]
# The dashboard SPA bakes its API/OIDC base URL from these at container
# start. They must point at the proxy's public HTTPS origin (8087) so the
# browser uses a secure context (window.crypto.subtle / OIDC PKCE, #15).
# {{HOST_IP}} is the node's primary host IP, resolved at apply time.
derived_env:
- key: NETBIRD_MGMT_API_ENDPOINT
template: "https://{{HOST_IP}}:8087"
- key: NETBIRD_MGMT_GRPC_API_ENDPOINT
template: "https://{{HOST_IP}}:8087"
- key: AUTH_AUTHORITY
template: "https://{{HOST_IP}}:8087/oauth2"
dependencies:
- app_id: netbird-server
resources:
memory_limit: 256Mi
security:
# cap-drop=ALL is applied by the orchestrator. The dashboard image runs
# nginx (master as root, drops workers) binding :80 — needs the worker-drop
# caps + NET_BIND_SERVICE for the privileged port.
capabilities: [CHOWN, DAC_OVERRIDE, SETGID, SETUID, NET_BIND_SERVICE]
readonly_root: false
network_policy: isolated
# Internal only — reached container-to-container by the proxy via netbird-net.
ports: []
volumes: []
environment:
- AUTH_AUDIENCE=netbird-dashboard
- AUTH_CLIENT_ID=netbird-dashboard
- AUTH_CLIENT_SECRET=
- USE_AUTH0=false
- AUTH_SUPPORTED_SCOPES=openid profile email groups
- AUTH_REDIRECT_URI=/nb-auth
- AUTH_SILENT_REDIRECT_URI=/nb-silent-auth
- NETBIRD_TOKEN_SOURCE=idToken
- NGINX_SSL_PORT=443
- LETSENCRYPT_DOMAIN=none
health_check:
type: tcp
endpoint: localhost:80
interval: 30s
timeout: 5s
retries: 5
start_period: 20s
metadata:
author: NetBird
icon: /assets/img/app-icons/netbird.svg
website: https://netbird.io
repo: https://github.com/netbirdio/dashboard
license: BSD-3-Clause
tags:
- networking
- vpn
- dashboard

@ -1,122 +0,0 @@
app:
id: netbird-server
name: NetBird Server
version: "0.71.2"
description: NetBird combined management / signal / relay server with an embedded identity provider and STUN. Backend for the self-hosted NetBird mesh VPN.
category: networking
# Hyphen name matches the runtime references (crash_recovery / dependencies /
# config startup order) + the live container, so on an existing node the
# orchestrator ADOPTS the running server rather than recreating it (data +
# the sqlite store under /var/lib/netbird preserved). Alias `netbird-server`
# is the short hostname the proxy's nginx proxies/grpc-passes to.
container_name: netbird-server
container:
image: docker.io/netbirdio/netbird-server:0.71.2
pull_policy: if-not-present
network: netbird-net
network_aliases: [netbird-server]
# The relay authSecret and the sqlite store encryptionKey are base64 keys
# (the server base64-decodes them to recover raw bytes — hex would decode to
# the wrong value). Generated once and reused: ensure_generated_secrets
# no-ops when the file already exists, so a re-render of config.yaml on an
# adopted node keeps the same keys (regenerating would orphan the store).
generated_secrets:
- name: netbird-relay-auth-secret
kind: base64
- name: netbird-store-encryption-key
kind: base64
# Pass the rendered config explicitly, mirroring the legacy `--config` arg.
custom_args: ["--config", "/etc/netbird/config.yaml"]
dependencies:
- storage: 1Gi
resources:
memory_limit: 1Gi
security:
# cap-drop=ALL is applied by the orchestrator. The server binds :80
# (management/signal/relay HTTP + gRPC) inside the container — a privileged
# port — so it needs NET_BIND_SERVICE. STUN is 3478/udp (unprivileged).
capabilities: [NET_BIND_SERVICE]
readonly_root: false
network_policy: isolated
ports:
- host: 8086
container: 80
protocol: tcp # management API + embedded OIDC issuer (/oauth2)
- host: 3478
container: 3478
protocol: udp # STUN — must be UDP; tcp here breaks relay discovery
volumes:
- type: bind
source: /var/lib/archipelago/netbird/data
target: /var/lib/netbird
options: [rw]
# The rendered config.yaml, read-only. Re-rendered on every reconcile from
# host facts + the base64 secrets; idempotent (stable bytes → no restart).
- type: bind
source: /var/lib/archipelago/netbird/config.yaml
target: /etc/netbird/config.yaml
options: [ro]
environment: []
# The server's config. {{HOST_IP}} is the node's primary host IP (the proxy's
# public origin is https on 8087 — the dashboard needs a secure context for
# OIDC PKCE, issue #15). {{secret:...}} are read 0600 from the secrets dir.
files:
- path: /var/lib/archipelago/netbird/config.yaml
overwrite: true
content: |
server:
listenAddress: ":80"
exposedAddress: "https://{{HOST_IP}}:8087"
stunPorts:
- 3478
metricsPort: 9090
healthcheckAddress: ":9000"
logLevel: "info"
logFile: "console"
authSecret: "{{secret:netbird-relay-auth-secret}}"
dataDir: "/var/lib/netbird"
auth:
issuer: "https://{{HOST_IP}}:8087/oauth2"
localAuthDisabled: false
signKeyRefreshEnabled: false
dashboardRedirectURIs:
- "https://{{HOST_IP}}:8087/nb-auth"
- "https://{{HOST_IP}}:8087/nb-silent-auth"
dashboardPostLogoutRedirectURIs:
- "https://{{HOST_IP}}:8087/"
cliRedirectURIs:
- "http://localhost:53000/"
store:
engine: "sqlite"
encryptionKey: "{{secret:netbird-store-encryption-key}}"
# TCP liveness on the management port. Binds at startup, stays green; an http
# check of /oauth2 would false-fail while the issuer warms up.
health_check:
type: tcp
endpoint: localhost:80
interval: 30s
timeout: 5s
retries: 10
start_period: 30s
metadata:
author: NetBird
icon: /assets/img/app-icons/netbird.svg
website: https://netbird.io
repo: https://github.com/netbirdio/netbird
license: BSD-3-Clause
tags:
- networking
- vpn
- wireguard
- mesh

@ -1,182 +0,0 @@
app:
id: netbird
name: NetBird
version: "2.38.0"
description: Self-hosted WireGuard mesh VPN control plane with dashboard, embedded identity provider, management API, signal, relay, and STUN. The user-facing entry point — a TLS proxy in front of the dashboard + server.
category: networking
# The user-facing launcher (app_id + container both "netbird", matching the
# runtime references + the live container so the orchestrator adopts it). This
# is the nginx that terminates TLS on 8087 and fans out to the dashboard +
# server by their short aliases on netbird-net.
container_name: netbird
container:
image: docker.io/library/nginx:1.27-alpine
pull_policy: if-not-present
network: netbird-net
# Self-signed TLS cert materialised before create — the dashboard needs a
# secure context (window.crypto.subtle / OIDC PKCE, issue #15), so the proxy
# serves HTTPS. Idempotent: kept as-is when crt+key already exist (a user
# accepts it once). SAN defaults to the host IP + 127.0.0.1 + localhost.
generated_certs:
- crt: /var/lib/archipelago/netbird/tls.crt
key: /var/lib/archipelago/netbird/tls.key
dependencies:
- app_id: netbird-server
- app_id: netbird-dashboard
- storage: 1Gi
resources:
memory_limit: 256Mi
security:
# cap-drop=ALL is applied by the orchestrator. nginx (master as root, drops
# workers) binds :443 — needs the worker-drop caps + NET_BIND_SERVICE.
capabilities: [CHOWN, DAC_OVERRIDE, SETGID, SETUID, NET_BIND_SERVICE]
readonly_root: false
network_policy: isolated
ports:
# 8087 publishes the TLS listener (container :443). HTTPS is required for the
# dashboard's secure context (issue #15).
- host: 8087
container: 443
protocol: tcp
volumes:
- type: bind
source: /var/lib/archipelago/netbird/nginx.conf
target: /etc/nginx/conf.d/default.conf
options: [ro]
- type: bind
source: /var/lib/archipelago/netbird/tls.crt
target: /etc/nginx/tls.crt
options: [ro]
- type: bind
source: /var/lib/archipelago/netbird/tls.key
target: /etc/nginx/tls.key
options: [ro]
environment: []
# The proxy config. {{NETWORK_GATEWAY}} is the netbird-net bridge gateway =
# Podman's aardvark DNS. nginx uses it as an explicit `resolver` with VARIABLE
# upstreams so it re-resolves container names per request — without it nginx
# pins a container IP at startup and 502s forever once that IP moves on a
# restart/reboot (issue #15, observed live on .198). Every #15 fix below
# (CORS $http_origin reflect, grpc pass, nb-auth/nb-silent-auth rewrite to
# index.html, /relay websocket) is preserved verbatim from the legacy config.
files:
- path: /var/lib/archipelago/netbird/nginx.conf
overwrite: true
content: |
server {
listen 443 ssl;
server_name _;
# netbird's dashboard needs a secure context (window.crypto.subtle for
# OIDC PKCE), so the proxy terminates TLS with a self-signed cert (#15).
ssl_certificate /etc/nginx/tls.crt;
ssl_certificate_key /etc/nginx/tls.key;
# Rootless Podman can hand a container a new IP across restarts/reboots.
# nginx resolves a literal upstream name ONCE at startup and caches it,
# so after the IP moves every request 502s with "host unreachable"
# (issue #15, observed live on .198: nginx pinned to a dead
# netbird-dashboard IP). Fix: point `resolver` at the netbird-net
# gateway (Podman's aardvark DNS) and use VARIABLE upstreams, which
# forces nginx to re-resolve the container names at request time.
resolver {{NETWORK_GATEWAY}} valid=10s ipv6=off;
proxy_set_header Host $host;
proxy_set_header X-Real-IP $remote_addr;
proxy_set_header X-Forwarded-For $proxy_add_x_forwarded_for;
proxy_set_header X-Forwarded-Proto $scheme;
proxy_http_version 1.1;
location ~ ^/(relay|ws-proxy/) {
set $nb_server netbird-server;
proxy_pass http://$nb_server:80;
proxy_set_header Upgrade $http_upgrade;
proxy_set_header Connection "upgrade";
proxy_read_timeout 1d;
}
location ~ ^/(api|oauth2)(/|$) {
# The dashboard is a SPA whose API/OIDC base URL is baked at build
# time to one host:port. A single box is reached via several
# addresses, so those fetches are cross-origin and the browser
# blocks them with no Access-Control-Allow-Origin (#15, live on
# .198). Reflect the caller's Origin and answer the CORS preflight.
if ($request_method = OPTIONS) {
add_header Access-Control-Allow-Origin $http_origin always;
add_header Access-Control-Allow-Credentials true always;
add_header Access-Control-Allow-Methods "GET, POST, PUT, PATCH, DELETE, OPTIONS" always;
add_header Access-Control-Allow-Headers "Authorization, Content-Type, Accept" always;
add_header Access-Control-Max-Age 86400 always;
add_header Content-Length 0;
return 204;
}
add_header Access-Control-Allow-Origin $http_origin always;
add_header Access-Control-Allow-Credentials true always;
add_header Access-Control-Allow-Methods "GET, POST, PUT, PATCH, DELETE, OPTIONS" always;
add_header Access-Control-Allow-Headers "Authorization, Content-Type, Accept" always;
set $nb_server netbird-server;
proxy_pass http://$nb_server:80;
}
location ~ ^/(signalexchange\.SignalExchange|management\.ManagementService|management\.ProxyService)/ {
set $nb_server netbird-server;
grpc_pass grpc://$nb_server:80;
grpc_read_timeout 1d;
grpc_send_timeout 1d;
}
# OIDC callback routes are client-side SPA routes with NO prebuilt page
# in the dashboard bundle, so proxying them straight through 404s —
# which crashes the dashboard's auth init and shows "Unauthenticated"
# with dead buttons (#15, live on .198: /nb-auth + /nb-silent-auth
# returned 404). Serve index.html at these paths (URL unchanged) so
# react-oidc boots and completes the login / silent-SSO.
location ~ ^/(nb-auth|nb-silent-auth) {
set $nb_dashboard netbird-dashboard;
rewrite ^.*$ /index.html break;
proxy_pass http://$nb_dashboard:80;
}
location / {
set $nb_dashboard netbird-dashboard;
proxy_pass http://$nb_dashboard:80;
}
}
health_check:
type: tcp
endpoint: localhost:443
interval: 30s
timeout: 5s
retries: 5
start_period: 20s
interfaces:
main:
name: Dashboard
description: Manage your self-hosted NetBird mesh VPN
type: ui
port: 8087
protocol: https
path: /
metadata:
author: NetBird
icon: /assets/img/app-icons/netbird.svg
website: https://netbird.io
repo: https://github.com/netbirdio/netbird
license: BSD-3-Clause
tags:
- networking
- vpn
- wireguard
- mesh

core/Cargo.lock generated
@ -99,7 +99,6 @@ version = "1.7.99-alpha"
dependencies = [ dependencies = [
"anyhow", "anyhow",
"archipelago-container", "archipelago-container",
"archipelago-openwrt",
"archipelago-performance", "archipelago-performance",
"archipelago-security", "archipelago-security",
"argon2", "argon2",
@ -129,7 +128,6 @@ dependencies = [
"hyper-ws-listener", "hyper-ws-listener",
"iroh", "iroh",
"iroh-blobs", "iroh-blobs",
"libc",
"mainline", "mainline",
"mdns-sd", "mdns-sd",
"nostr-sdk", "nostr-sdk",
@ -182,22 +180,6 @@ dependencies = [
"uuid", "uuid",
] ]
[[package]]
name = "archipelago-openwrt"
version = "0.1.0"
dependencies = [
"anyhow",
"async-trait",
"reqwest 0.11.27",
"serde",
"serde_json",
"ssh2",
"thiserror 1.0.69",
"tokio",
"tokio-test",
"tracing",
]
[[package]] [[package]]
name = "archipelago-performance" name = "archipelago-performance"
version = "0.1.0" version = "0.1.0"
@ -2857,32 +2839,6 @@ dependencies = [
"redox_syscall 0.7.3", "redox_syscall 0.7.3",
] ]
[[package]]
name = "libssh2-sys"
version = "0.3.1"
source = "registry+https://github.com/rust-lang/crates.io-index"
checksum = "220e4f05ad4a218192533b300327f5150e809b54c4ec83b5a1d91833601811b9"
dependencies = [
"cc",
"libc",
"libz-sys",
"openssl-sys",
"pkg-config",
"vcpkg",
]
[[package]]
name = "libz-sys"
version = "1.1.29"
source = "registry+https://github.com/rust-lang/crates.io-index"
checksum = "85bc9657773828b90eeb625adff10eeac83cc21bbfd8e23a03eaa8a33c9e28d9"
dependencies = [
"cc",
"libc",
"pkg-config",
"vcpkg",
]
[[package]] [[package]]
name = "linux-raw-sys" name = "linux-raw-sys"
version = "0.11.0" version = "0.11.0"
@ -3624,18 +3580,6 @@ version = "0.2.1"
source = "registry+https://github.com/rust-lang/crates.io-index" source = "registry+https://github.com/rust-lang/crates.io-index"
checksum = "7c87def4c32ab89d880effc9e097653c8da5d6ef28e6b539d313baaacfbafcbe" checksum = "7c87def4c32ab89d880effc9e097653c8da5d6ef28e6b539d313baaacfbafcbe"
[[package]]
name = "openssl-sys"
version = "0.9.117"
source = "registry+https://github.com/rust-lang/crates.io-index"
checksum = "b47e7e6bb2c38cd930d25a23b40fa52e068c10e85f3e03a7f5ba5aaca5713695"
dependencies = [
"cc",
"libc",
"pkg-config",
"vcpkg",
]
[[package]] [[package]]
name = "papaya" name = "papaya"
version = "0.2.4" version = "0.2.4"
@ -3814,12 +3758,6 @@ dependencies = [
"spki 0.8.0", "spki 0.8.0",
] ]
[[package]]
name = "pkg-config"
version = "0.3.33"
source = "registry+https://github.com/rust-lang/crates.io-index"
checksum = "19f132c84eca552bf34cab8ec81f1c1dcc229b811638f9d283dceabe58c5569e"
[[package]] [[package]]
name = "plain" name = "plain"
version = "0.2.3" version = "0.2.3"
@ -5050,18 +4988,6 @@ dependencies = [
"der 0.8.0", "der 0.8.0",
] ]
[[package]]
name = "ssh2"
version = "0.9.5"
source = "registry+https://github.com/rust-lang/crates.io-index"
checksum = "2f84d13b3b8a0d4e91a2629911e951db1bb8671512f5c09d7d4ba34500ba68c8"
dependencies = [
"bitflags 2.13.0",
"libc",
"libssh2-sys",
"parking_lot 0.12.5",
]
[[package]] [[package]]
name = "stable_deref_trait" name = "stable_deref_trait"
version = "1.2.1" version = "1.2.1"
@ -5849,12 +5775,6 @@ version = "0.1.1"
source = "registry+https://github.com/rust-lang/crates.io-index" source = "registry+https://github.com/rust-lang/crates.io-index"
checksum = "ba73ea9cf16a25df0c8caa16c51acb937d5712a8429db78a3ee29d5dcacd3a65" checksum = "ba73ea9cf16a25df0c8caa16c51acb937d5712a8429db78a3ee29d5dcacd3a65"
[[package]]
name = "vcpkg"
version = "0.2.15"
source = "registry+https://github.com/rust-lang/crates.io-index"
checksum = "accd4ea62f7bb7a82fe23066fb0957d48ef677f6eeb8215f372f52e48bb32426"
[[package]] [[package]]
name = "vergen" name = "vergen"
version = "9.1.0" version = "9.1.0"

@ -4,7 +4,6 @@ resolver = "2"
members = [ members = [
"archipelago", "archipelago",
"container", "container",
"openwrt",
"performance", "performance",
"security", "security",
] ]

@ -22,7 +22,6 @@ iroh-swarm = ["dep:iroh", "dep:iroh-blobs"]
[dependencies] [dependencies]
# Core dependencies # Core dependencies
tokio = { version = "1", features = ["full"] } tokio = { version = "1", features = ["full"] }
libc = "0.2" # process-group signalling for the supervised reticulum daemon
serde = { version = "1.0", features = ["derive"] } serde = { version = "1.0", features = ["derive"] }
serde_json = "1.0" serde_json = "1.0"
anyhow = "1.0" anyhow = "1.0"
@ -43,7 +42,6 @@ futures-util = "0.3"
# Our modules # Our modules
archipelago-container = { path = "../container" } archipelago-container = { path = "../container" }
archipelago-openwrt = { path = "../openwrt" }
archipelago-security = { path = "../security" } archipelago-security = { path = "../security" }
archipelago-performance = { path = "../performance" } archipelago-performance = { path = "../performance" }

@ -48,17 +48,6 @@ impl ApiHandler {
.get("x-blob-filename") .get("x-blob-filename")
.and_then(|v| v.to_str().ok()) .and_then(|v| v.to_str().ok())
.map(|s| s.to_string()); .map(|s| s.to_string());
// Optional caller-supplied thumbnail (small, base64) — e.g. the mesh
// chat's image-quality picker generates a tiny client-side preview so
// a ContentRef receiver can render something before fetching the full
// blob. Best-effort: a malformed header is just ignored, not fatal.
let thumb_bytes = headers
.get("x-blob-thumb")
.and_then(|v| v.to_str().ok())
.and_then(|b64| {
use base64::{engine::general_purpose::STANDARD, Engine as _};
STANDARD.decode(b64).ok()
});
let bytes = body.to_vec(); let bytes = body.to_vec();
// Uploads through /api/blob come from the node owner's session and // Uploads through /api/blob come from the node owner's session and
@ -66,7 +55,7 @@ impl ApiHandler {
// pictures, banners). Store them public so `/blob/<cid>` serves // pictures, banners). Store them public so `/blob/<cid>` serves
// without a capability check — external Nostr clients fetching a // without a capability check — external Nostr clients fetching a
// kind-0 `picture` URL have no cap and can't get one. // kind-0 `picture` URL have no cap and can't get one.
match store.put(&bytes, &mime, filename, thumb_bytes, true).await { match store.put(&bytes, &mime, filename, None, true).await {
Ok(meta) => { Ok(meta) => {
let exp = let exp =
(chrono::Utc::now().timestamp() as u64) + crate::blobs::DEFAULT_CAP_TTL_SECS; (chrono::Utc::now().timestamp() as u64) + crate::blobs::DEFAULT_CAP_TTL_SECS;

@ -39,17 +39,6 @@ impl ApiHandler {
let (mut tx, mut rx) = ws_stream.split(); let (mut tx, mut rx) = ws_stream.split();
// Subscribe BEFORE taking the initial snapshot. Messages are full
// data dumps keyed by a monotonic revision, so a broadcast that
// races the snapshot is at worst a harmless duplicate/newer dump
// delivered right after — but subscribing after the snapshot send
// (the old order) let any update in that window vanish forever,
// since a tokio broadcast channel never delivers sends that
// predate subscribe(). That silently stuck clients (e.g. a fresh
// install's post-boot container scan) on a stale initial snapshot
// until a full page reload opened a new connection past the race.
let mut state_rx = state_manager.subscribe();
let initial_msg = state_manager.get_initial_message().await; let initial_msg = state_manager.get_initial_message().await;
if let Ok(json_msg) = serde_json::to_string(&initial_msg) { if let Ok(json_msg) = serde_json::to_string(&initial_msg) {
if let Err(e) = tx.send(Message::Text(json_msg)).await { if let Err(e) = tx.send(Message::Text(json_msg)).await {
@ -58,6 +47,8 @@ impl ApiHandler {
} }
debug!("Sent initial data dump at revision {}", initial_msg.rev); debug!("Sent initial data dump at revision {}", initial_msg.rev);
} }
let mut state_rx = state_manager.subscribe();
let ping_interval = tokio::time::interval(tokio::time::Duration::from_secs(30)); let ping_interval = tokio::time::interval(tokio::time::Duration::from_secs(30));
tokio::pin!(ping_interval); tokio::pin!(ping_interval);
let mut last_client_activity = Instant::now(); let mut last_client_activity = Instant::now();
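
Review note: the comment in this hunk explains why `subscribe()` should come before the initial snapshot is sent: a broadcast channel never delivers sends that predate the subscription. The sketch below reproduces that race with a toy, std-only bus. `Bus` and `revisions_seen` are illustrative names, not the project's `state_manager` or tokio's `broadcast` API, and the interleaving is simulated on a single thread.

```rust
use std::sync::mpsc::{channel, Receiver, Sender};

/// Toy broadcast bus: a subscriber only sees messages published
/// after it subscribed, as with a real broadcast channel.
struct Bus {
    subscribers: Vec<Sender<u64>>,
}

impl Bus {
    fn new() -> Self {
        Bus { subscribers: Vec::new() }
    }

    fn subscribe(&mut self) -> Receiver<u64> {
        let (tx, rx) = channel();
        self.subscribers.push(tx);
        rx
    }

    fn publish(&mut self, rev: u64) {
        // Deliver to every live subscriber; drop those whose receiver is gone.
        self.subscribers.retain(|tx| tx.send(rev).is_ok());
    }
}

/// Simulate a client connecting while revision 2 is published right after
/// the snapshot (revision 1) is taken. Returns the revisions the client sees.
fn revisions_seen(subscribe_first: bool) -> Vec<u64> {
    let mut bus = Bus::new();
    let mut seen = Vec::new();
    let snapshot_rev = 1;
    if subscribe_first {
        let rx = bus.subscribe();
        seen.push(snapshot_rev); // initial dump
        bus.publish(2); // update races the snapshot send
        seen.extend(rx.try_iter()); // still delivered
    } else {
        seen.push(snapshot_rev); // initial dump
        bus.publish(2); // nobody is subscribed yet: lost
        let rx = bus.subscribe();
        seen.extend(rx.try_iter()); // nothing to receive
    }
    seen
}
```

In the subscribe-first order the client ends at revision 2; in the snapshot-first order it stays at revision 1 until some later update arrives, which matches the stale-snapshot symptom the comment describes.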

@ -141,19 +141,6 @@ impl RpcHandler {
self.auth_manager.setup_user(password).await?; self.auth_manager.setup_user(password).await?;
tracing::info!("[onboarding] user setup complete"); tracing::info!("[onboarding] user setup complete");
// Persist the pending onboarding seed as the encrypted backup now that
// a passphrase (the login password) finally exists — otherwise "Reveal
// recovery phrase" has nothing to decrypt on this node, ever.
// Best-effort: a failure here must not break password setup.
match super::seed_rpc::save_pending_seed_encrypted(&self.config.data_dir, password).await {
Ok(true) => tracing::info!("[onboarding] encrypted seed backup saved"),
Ok(false) => tracing::info!(
"[onboarding] no pending mnemonic to back up (restored earlier or legacy node)"
),
Err(e) => tracing::warn!("[onboarding] encrypted seed backup failed: {e:#}"),
}
Ok(serde_json::json!(true)) Ok(serde_json::json!(true))
} }

@ -171,12 +171,6 @@ impl RpcHandler {
// than the WebSocket-delivered package_data, which caused apps to flicker // than the WebSocket-delivered package_data, which caused apps to flicker
// between "installed" and "not-installed" in the UI. // between "installed" and "not-installed" in the UI.
let (data, _) = self.state_manager.get_snapshot().await; let (data, _) = self.state_manager.get_snapshot().await;
// Apps the user explicitly stopped must read as "stopped" even though a
// UI companion (electrs-ui, bitcoin-ui, …) keeps serving the launch port:
// launch_port_reachable() below would otherwise upgrade an exited backend
// back to "running". The reconcile guard keeps these backends down, so the
// marker is authoritative here.
let user_stopped = crate::crash_recovery::load_user_stopped(&self.config.data_dir).await;
if data.server_info.status_info.containers_scanned && !data.package_data.is_empty() { if data.server_info.status_info.containers_scanned && !data.package_data.is_empty() {
let mut containers = Vec::with_capacity(data.package_data.len()); let mut containers = Vec::with_capacity(data.package_data.len());
for (id, pkg) in &data.package_data { for (id, pkg) in &data.package_data {
@ -208,11 +202,7 @@ impl RpcHandler {
// Scanner backoff preserves cached package_data. Refresh stable // Scanner backoff preserves cached package_data. Refresh stable
// states so callers do not see stale `running`/`exited` after // states so callers do not see stale `running`/`exited` after
// health-monitor recovery or Quadlet --rm container removal. // health-monitor recovery or Quadlet --rm container removal.
if user_stopped.contains(id) { if state == "running" && requires_launch_port_for_health(id) {
// User stopped it → authoritative "stopped". Do NOT let a
// still-running UI companion's launch port mark it running.
state = "stopped".to_string();
} else if state == "running" && requires_launch_port_for_health(id) {
if !self.cached_reachable_health(id).await?.is_some() { if !self.cached_reachable_health(id).await?.is_some() {
state = live_state_for_app(id) state = live_state_for_app(id)
.await .await
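
Review note: the `user_stopped` branch in this hunk makes an explicit user stop take precedence over launch-port reachability, since a UI companion can keep the port open after the backend has exited. A minimal sketch of that precedence follows, assuming a simplified model in which reachability can upgrade any non-running state. `effective_state` is a hypothetical function, not the handler's real code, and the app ids used with it are only examples.

```rust
use std::collections::HashSet;

/// Hypothetical precedence rule for the state reported to callers:
/// 1. an app the user stopped is always "stopped";
/// 2. otherwise a reachable launch port upgrades a non-running state;
/// 3. otherwise the scanned state is reported unchanged.
fn effective_state(
    app_id: &str,
    scanned_state: &str,
    user_stopped: &HashSet<String>,
    launch_port_reachable: bool,
) -> String {
    if user_stopped.contains(app_id) {
        // Authoritative: do not let a companion's open port override it.
        return "stopped".to_string();
    }
    if scanned_state != "running" && launch_port_reachable {
        return "running".to_string();
    }
    scanned_state.to_string()
}
```

Without the first check, an app stopped by the user whose companion still serves the launch port would be reported as running.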

@ -429,15 +429,11 @@ impl RpcHandler {
}, },
Some("fedimint") => match mint_fedimint().await { Some("fedimint") => match mint_fedimint().await {
Ok((notes, fed)) => { Ok((notes, fed)) => {
tracing::info!( tracing::info!("paid download: spending {price_sats} sats Fedimint notes from {fed}");
"paid download: spending {price_sats} sats Fedimint notes from {fed}"
);
(notes, "fedimint") (notes, "fedimint")
} }
Err(e) => { Err(e) => {
tracing::warn!( tracing::warn!("paid download: fedimint spend failed for {price_sats} sats: {e:#}");
"paid download: fedimint spend failed for {price_sats} sats: {e:#}"
);
return Ok(serde_json::json!({ "error": format!( return Ok(serde_json::json!({ "error": format!(
"Couldn't pay {price_sats} sats from your Fedimint wallet: {e}. \ "Couldn't pay {price_sats} sats from your Fedimint wallet: {e}. \
Fund it, or choose Cashu." Fund it, or choose Cashu."
@ -461,9 +457,7 @@ impl RpcHandler {
}, },
}, },
}; };
tracing::info!( tracing::info!("paid download: paying {price_sats} sats to {onion} via {used_backend} ecash");
"paid download: paying {price_sats} sats to {onion} via {used_backend} ecash"
);
let (data, _) = self.state_manager.get_snapshot().await; let (data, _) = self.state_manager.get_snapshot().await;
let local_did = crate::identity::did_key_from_pubkey_hex(&data.server_info.pubkey)?; let local_did = crate::identity::did_key_from_pubkey_hex(&data.server_info.pubkey)?;

@ -57,8 +57,6 @@ impl RpcHandler {
"package.uninstall" => self.clone().spawn_package_uninstall(params).await, "package.uninstall" => self.clone().spawn_package_uninstall(params).await,
"package.update" => self.clone().spawn_package_update(params).await, "package.update" => self.clone().spawn_package_update(params).await,
"package.check-updates" => self.handle_package_check_updates(params).await, "package.check-updates" => self.handle_package_check_updates(params).await,
"package.versions" => self.handle_package_versions(params).await,
"package.set-config" => self.clone().handle_package_set_config(params).await,
"package.credentials" => self.handle_package_credentials(params).await, "package.credentials" => self.handle_package_credentials(params).await,
"app.filebrowser-token" => self.handle_filebrowser_token().await, "app.filebrowser-token" => self.handle_filebrowser_token().await,
@ -223,7 +221,6 @@ impl RpcHandler {
"network.list-interfaces" => self.handle_network_list_interfaces().await, "network.list-interfaces" => self.handle_network_list_interfaces().await,
"network.scan-wifi" => self.handle_network_scan_wifi().await, "network.scan-wifi" => self.handle_network_scan_wifi().await,
"network.configure-wifi" => self.handle_network_configure_wifi(params).await, "network.configure-wifi" => self.handle_network_configure_wifi(params).await,
"network.set-wifi-radio" => self.handle_network_set_wifi_radio(params).await,
"network.configure-ethernet" => self.handle_network_configure_ethernet(params).await, "network.configure-ethernet" => self.handle_network_configure_ethernet(params).await,
"network.dns-status" => self.handle_network_dns_status().await, "network.dns-status" => self.handle_network_dns_status().await,
"network.configure-dns" => self.handle_network_configure_dns(params).await, "network.configure-dns" => self.handle_network_configure_dns(params).await,
@ -231,13 +228,6 @@ impl RpcHandler {
"router.info" => self.handle_router_info().await, "router.info" => self.handle_router_info().await,
"router.configure" => self.handle_router_configure(params).await, "router.configure" => self.handle_router_configure(params).await,
// OpenWrt / TollGate
"openwrt.scan" => self.handle_openwrt_scan(params).await,
"openwrt.get-status" => self.handle_openwrt_get_status(params).await,
"openwrt.provision-tollgate" => self.handle_openwrt_provision_tollgate(params).await,
"openwrt.scan-wifi" => self.handle_openwrt_scan_wifi(params).await,
"openwrt.configure-wan" => self.handle_openwrt_configure_wan(params).await,
// Ecash wallet // Ecash wallet
"wallet.ecash-balance" => self.handle_wallet_ecash_balance().await, "wallet.ecash-balance" => self.handle_wallet_ecash_balance().await,
"wallet.ecash-mint" => self.handle_wallet_ecash_mint(params).await, "wallet.ecash-mint" => self.handle_wallet_ecash_mint(params).await,
@ -374,7 +364,6 @@ impl RpcHandler {
"mesh.send" => self.handle_mesh_send(params).await, "mesh.send" => self.handle_mesh_send(params).await,
"mesh.send-channel" => self.handle_mesh_send_channel(params).await, "mesh.send-channel" => self.handle_mesh_send_channel(params).await,
"mesh.broadcast" => self.handle_mesh_broadcast().await, "mesh.broadcast" => self.handle_mesh_broadcast().await,
"mesh.reboot-radio" => self.handle_mesh_reboot_radio(params).await,
"mesh.configure" => self.handle_mesh_configure(params).await, "mesh.configure" => self.handle_mesh_configure(params).await,
"mesh.send-invoice" => self.handle_mesh_send_invoice(params).await, "mesh.send-invoice" => self.handle_mesh_send_invoice(params).await,
"mesh.send-coordinate" => self.handle_mesh_send_coordinate(params).await, "mesh.send-coordinate" => self.handle_mesh_send_coordinate(params).await,
@ -427,10 +416,8 @@ impl RpcHandler {
// Server settings // Server settings
"server.set-name" => self.handle_server_set_name(params).await, "server.set-name" => self.handle_server_set_name(params).await,
"server.set-location" => self.handle_server_set_location(params).await,
// System monitoring // System monitoring
"system.get-hostname" => self.handle_system_get_hostname().await,
"system.stats" => self.handle_system_stats().await, "system.stats" => self.handle_system_stats().await,
"system.processes" => self.handle_system_processes().await, "system.processes" => self.handle_system_processes().await,
"system.temperature" => self.handle_system_temperature().await, "system.temperature" => self.handle_system_temperature().await,


@ -454,12 +454,6 @@ impl RpcHandler {
.flatten(), .flatten(),
}; };
let shared_location = if data.server_info.share_location {
data.server_info.lat.zip(data.server_info.lon)
} else {
None
};
let state = federation::build_local_state( let state = federation::build_local_state(
apps, apps,
0.0, 0.0,
@ -473,7 +467,6 @@ impl RpcHandler {
nostr_npub, nostr_npub,
own_fips_npub, own_fips_npub,
&federated_peers, &federated_peers,
shared_location,
); );
Ok(serde_json::to_value(&state)?) Ok(serde_json::to_value(&state)?)


@ -18,24 +18,6 @@ impl RpcHandler {
Ok(serde_json::json!({ "networks": networks })) Ok(serde_json::json!({ "networks": networks }))
} }
/// network.set-wifi-radio — turn the wifi adapter fully on or off (not just
/// disconnect from a network). Params: `{ "enabled": bool }`.
pub(super) async fn handle_network_set_wifi_radio(
&self,
params: Option<serde_json::Value>,
) -> Result<serde_json::Value> {
let params = params.ok_or_else(|| anyhow::anyhow!("Missing params"))?;
let enabled = params
.get("enabled")
.and_then(|v| v.as_bool())
.ok_or_else(|| anyhow::anyhow!("Missing required parameter: enabled"))?;
tracing::info!(enabled, "Setting wifi radio state");
set_wifi_radio(enabled).await?;
Ok(serde_json::json!({ "ok": true, "enabled": enabled }))
}
/// network.configure-wifi — connect to a WiFi network. /// network.configure-wifi — connect to a WiFi network.
pub(super) async fn handle_network_configure_wifi( pub(super) async fn handle_network_configure_wifi(
&self, &self,
@ -345,27 +327,6 @@ fn split_nmcli_escaped(line: &str, limit: usize) -> Vec<String> {
fields fields
} }
/// Turn the wifi radio fully on or off using nmcli (a rfkill-level toggle, not
/// just disconnecting from the current network — the adapter stops scanning/
/// associating entirely until switched back on).
async fn set_wifi_radio(enabled: bool) -> Result<()> {
let state = if enabled { "on" } else { "off" };
let output = tokio::process::Command::new("nmcli")
.args(["radio", "wifi", state])
.output()
.await
.context("Failed to run nmcli radio wifi")?;
if !output.status.success() {
anyhow::bail!(
"nmcli radio wifi {} failed: {}",
state,
String::from_utf8_lossy(&output.stderr)
);
}
Ok(())
}
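Review note: the removed `set_wifi_radio` helper only maps a boolean onto `nmcli radio wifi on|off`. The sketch below isolates that mapping as a pure function so it can be checked on a machine without NetworkManager. The name `wifi_radio_args` is hypothetical and does not appear in the diff.

```rust
/// Hypothetical helper (not part of the diff): build the argument list that
/// the removed `set_wifi_radio` passed to `nmcli`.
fn wifi_radio_args(enabled: bool) -> [&'static str; 3] {
    // Same on/off mapping as the removed function.
    let state = if enabled { "on" } else { "off" };
    ["radio", "wifi", state]
}

/// Render the full command line, e.g. for a log message or a dry run.
fn wifi_radio_command_line(enabled: bool) -> String {
    format!("nmcli {}", wifi_radio_args(enabled).join(" "))
}
```

Keeping argument construction separate from `tokio::process::Command` is one possible way to unit-test the handler; the original code does not do this.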
/// Connect to a WiFi network using nmcli. /// Connect to a WiFi network using nmcli.
async fn connect_wifi(ssid: &str, password: &str) -> Result<()> { async fn connect_wifi(ssid: &str, password: &str) -> Result<()> {
let conn_name = format!("archipelago-wifi-{ssid}"); let conn_name = format!("archipelago-wifi-{ssid}");


@ -19,10 +19,7 @@ impl RpcHandler {
let svc = service let svc = service
.as_ref() .as_ref()
.ok_or_else(|| anyhow::anyhow!("Mesh service not running"))?; .ok_or_else(|| anyhow::anyhow!("Mesh service not running"))?;
( (svc.assistant_config().await, svc.assistant_denied_askers().await)
svc.assistant_config().await,
svc.assistant_denied_askers().await,
)
}; };
let (ollama_detected, models) = detect_ollama().await; let (ollama_detected, models) = detect_ollama().await;


@ -86,29 +86,6 @@ impl RpcHandler {
Ok(serde_json::json!({ "broadcast": true })) Ok(serde_json::json!({ "broadcast": true }))
} }
/// mesh.reboot-radio — Reboot the locally-connected radio firmware to
/// recover a wedged / RX-deaf radio. Optional `seconds` delay (default 2).
pub(in crate::api::rpc) async fn handle_mesh_reboot_radio(
&self,
params: Option<serde_json::Value>,
) -> Result<serde_json::Value> {
let seconds = params
.as_ref()
.and_then(|p| p.get("seconds"))
.and_then(|v| v.as_i64())
.unwrap_or(2);
let service = self.mesh_service.read().await;
let svc = service
.as_ref()
.ok_or_else(|| anyhow::anyhow!("Mesh service not running. Enable mesh first."))?;
svc.reboot_radio(seconds).await?;
info!(seconds, "Mesh radio reboot requested via RPC");
Ok(serde_json::json!({ "reboot": true, "seconds": seconds }))
}
/// mesh.configure — Enable/disable mesh and set device path. /// mesh.configure — Enable/disable mesh and set device path.
pub(in crate::api::rpc) async fn handle_mesh_configure( pub(in crate::api::rpc) async fn handle_mesh_configure(
&self, &self,


@ -5,7 +5,6 @@ use crate::mesh::message_types::{
Coordinate, DeletePayload, EditPayload, ForwardPayload, InvoicePayload, MeshMessageType, Coordinate, DeletePayload, EditPayload, ForwardPayload, InvoicePayload, MeshMessageType,
MessageKey, PsbtHashPayload, ReactionPayload, ReadReceiptPayload, ReplyPayload, TypedEnvelope, MessageKey, PsbtHashPayload, ReactionPayload, ReadReceiptPayload, ReplyPayload, TypedEnvelope,
}; };
use crate::mesh::types::radio_transport_label;
use anyhow::Result; use anyhow::Result;
use tracing::info; use tracing::info;
@ -392,24 +391,9 @@ impl RpcHandler {
// Hard ceiling matching the chunked-send capacity (~20 chunks * 152 // Hard ceiling matching the chunked-send capacity (~20 chunks * 152
// b64 chars after MCIIXXTT framing). Anything larger must go via // b64 chars after MCIIXXTT framing). Anything larger must go via
// ContentRef over Tor — UNLESS the active device is Reticulum, which // ContentRef over Tor.
// can carry up to RETICULUM_RESOURCE_MAX directly over LoRa via a
// native RNS Resource transfer (keep this ceiling in sync with
// `mesh.transport-advice`'s `"resource-mesh"` tier, the source of
// truth the frontend consults before ever reaching this size).
const INLINE_HARD_MAX: usize = 2300; const INLINE_HARD_MAX: usize = 2300;
const RETICULUM_RESOURCE_MAX: usize = 2 * 1024 * 1024; if bytes.len() > INLINE_HARD_MAX {
let service = self.mesh_service.read().await;
let svc = service
.as_ref()
.ok_or_else(|| anyhow::anyhow!("Mesh service not running"))?;
let device_type = svc.shared_state().status.read().await.device_type;
let use_resource_transfer = bytes.len() > INLINE_HARD_MAX
&& device_type == crate::mesh::types::DeviceType::Reticulum
&& bytes.len() <= RETICULUM_RESOURCE_MAX;
if bytes.len() > INLINE_HARD_MAX && !use_resource_transfer {
anyhow::bail!( anyhow::bail!(
"Payload {} bytes exceeds inline max {} — use mesh.send-content (ContentRef) instead", "Payload {} bytes exceeds inline max {} — use mesh.send-content (ContentRef) instead",
bytes.len(), bytes.len(),
@ -430,6 +414,22 @@ impl RpcHandler {
.put(&bytes, &mime, filename.clone(), None, false) .put(&bytes, &mime, filename.clone(), None, false)
.await?; .await?;
let service = self.mesh_service.read().await;
let svc = service
.as_ref()
.ok_or_else(|| anyhow::anyhow!("Mesh service not running"))?;
let content = ContentInlinePayload {
mime: mime.clone(),
filename: filename.clone(),
caption: caption.clone(),
bytes,
};
let seq = svc.next_send_seq(contact_id).await;
let payload = message_types::encode_payload(&content)?;
let envelope = TypedEnvelope::new(MeshMessageType::ContentInline, payload).with_seq(seq);
let wire = envelope.to_wire()?;
let display = match (&filename, &caption) { let display = match (&filename, &caption) {
(Some(f), Some(c)) => format!("📎 {}{}", f, c), (Some(f), Some(c)) => format!("📎 {}{}", f, c),
(Some(f), None) => format!("📎 {}", f), (Some(f), None) => format!("📎 {}", f),
@ -437,8 +437,7 @@ impl RpcHandler {
(None, None) => format!("📎 {} ({} bytes)", mime, meta.size), (None, None) => format!("📎 {} ({} bytes)", mime, meta.size),
}; };
// Render as a content_ref card on the sender side (UI already knows // Render as a content_ref card on the sender side (UI already knows
// how to draw it from cid + mime + filename + size) regardless of // how to draw it from cid + mime + filename + size).
// which wire format actually goes out — this is a local-only mirror.
let typed_json = serde_json::json!({ let typed_json = serde_json::json!({
"cid": meta.cid, "cid": meta.cid,
"size": meta.size, "size": meta.size,
@ -447,67 +446,22 @@ impl RpcHandler {
"caption": caption, "caption": caption,
"inline": true, "inline": true,
}); });
let seq = svc.next_send_seq(contact_id).await;
// A stock (non-archy) peer can't decode our typed-envelope wire let msg = svc
// format — send images to them via LXMF's native FIELD_IMAGE .send_typed_wire(
// instead, so they actually see the photo (Sideband/NomadNet).
let is_archy = svc.is_archy_peer(contact_id).await;
let native_image = !is_archy
&& device_type == crate::mesh::types::DeviceType::Reticulum
&& mime.starts_with("image/");
let msg = if native_image {
svc.send_native_image(contact_id, &mime, bytes, caption.clone())
.await?;
svc.record_sent_typed(
contact_id, contact_id,
wire,
"content_ref", "content_ref",
&display, &display,
Some(typed_json), Some(typed_json),
seq, seq,
Some(radio_transport_label(device_type).to_string()),
true, // Reticulum/LXMF is unconditionally E2E on every send
) )
.await .await?;
} else {
let content = ContentInlinePayload {
mime: mime.clone(),
filename: filename.clone(),
caption: caption.clone(),
bytes,
};
let payload = message_types::encode_payload(&content)?;
let envelope = TypedEnvelope::new(MeshMessageType::ContentInline, payload).with_seq(seq);
let wire = envelope.to_wire()?;
if use_resource_transfer {
svc.send_content_resource(
contact_id,
wire,
"content_ref",
&display,
Some(typed_json),
seq,
)
.await?
} else {
svc.send_typed_wire(
contact_id,
wire,
"content_ref",
&display,
Some(typed_json),
seq,
)
.await?
}
};
info!( info!(
contact_id, contact_id,
size = meta.size, size = meta.size,
cid = %meta.cid, cid = %meta.cid,
via_resource = use_resource_transfer,
"Sent content_inline over mesh" "Sent content_inline over mesh"
); );
Ok(serde_json::json!({ Ok(serde_json::json!({
@ -538,19 +492,8 @@ impl RpcHandler {
// Knobs — keep in sync with the frontend modal copy. // Knobs — keep in sync with the frontend modal copy.
const MESH_AUTO_MAX: u64 = 1024; const MESH_AUTO_MAX: u64 = 1024;
const MESH_HARD_MAX: u64 = 2300; const MESH_HARD_MAX: u64 = 2300;
// Reticulum-only: above the small inline-chunk cap, a real RNS Resource
// transfer can still carry the payload directly over LoRa (native
// chunked transfer with retries) instead of falling back to Tor. Capped
// well under TOR_LARGE_WARN to keep worst-case LoRa transfer time
// bounded — comfortably covers the HIGH image preset (512KB target).
const RETICULUM_RESOURCE_MAX: u64 = 2 * 1024 * 1024;
const TOR_LARGE_WARN: u64 = 5 * 1024 * 1024; const TOR_LARGE_WARN: u64 = 5 * 1024 * 1024;
// Meshcore/Meshtastic effective LoRa throughput after retries/FEC is much const LORA_BYTES_PER_SEC: u64 = 50;
// lower than the raw radio bitrate. Reticulum's RNodeInterface reports its
// real bitrate (e.g. ~3125 bps ≈ 390 B/s observed live), so estimates for it
// would be wildly pessimistic at the generic 50 B/s figure.
const LORA_BYTES_PER_SEC_DEFAULT: u64 = 50;
const LORA_BYTES_PER_SEC_RETICULUM: u64 = 390;
// Resolve peer Tor reachability via federation node list. // Resolve peer Tor reachability via federation node list.
let service = self.mesh_service.read().await; let service = self.mesh_service.read().await;
@ -558,12 +501,6 @@ impl RpcHandler {
.as_ref() .as_ref()
.ok_or_else(|| anyhow::anyhow!("Mesh service not running"))?; .ok_or_else(|| anyhow::anyhow!("Mesh service not running"))?;
let state = svc.shared_state(); let state = svc.shared_state();
let device_type = state.status.read().await.device_type;
let lora_bytes_per_sec = if device_type == crate::mesh::types::DeviceType::Reticulum {
LORA_BYTES_PER_SEC_RETICULUM
} else {
LORA_BYTES_PER_SEC_DEFAULT
};
let (peer_pubkey_hex, peer_did) = { let (peer_pubkey_hex, peer_did) = {
let peers = state.peers.read().await; let peers = state.peers.read().await;
match peers.get(&contact_id) { match peers.get(&contact_id) {
@ -583,10 +520,8 @@ impl RpcHandler {
.map(|d| nodes.iter().any(|n| &n.did == d)) .map(|d| nodes.iter().any(|n| &n.did == d))
.unwrap_or(false); .unwrap_or(false);
let est_seconds = let est_seconds = (size.saturating_add(LORA_BYTES_PER_SEC - 1) / LORA_BYTES_PER_SEC).max(1);
(size.saturating_add(lora_bytes_per_sec - 1) / lora_bytes_per_sec).max(1);
let is_reticulum = device_type == crate::mesh::types::DeviceType::Reticulum;
let (tier, reason) = if size <= MESH_AUTO_MAX { let (tier, reason) = if size <= MESH_AUTO_MAX {
("auto-mesh", "Small enough to send inline over mesh") ("auto-mesh", "Small enough to send inline over mesh")
} else if size <= MESH_HARD_MAX { } else if size <= MESH_HARD_MAX {
@ -595,8 +530,6 @@ impl RpcHandler {
} else { } else {
("auto-mesh", "No Tor path — sending inline over mesh") ("auto-mesh", "No Tor path — sending inline over mesh")
} }
} else if is_reticulum && size <= RETICULUM_RESOURCE_MAX {
("resource-mesh", "Sending directly over LoRa via a Reticulum resource transfer")
} else if size <= TOR_LARGE_WARN { } else if size <= TOR_LARGE_WARN {
if has_tor { if has_tor {
("tor-only", "Too large for mesh — Tor only") ("tor-only", "Too large for mesh — Tor only")
@ -741,6 +674,18 @@ impl RpcHandler {
.as_str() .as_str()
.ok_or_else(|| anyhow::anyhow!("Missing cid"))? .ok_or_else(|| anyhow::anyhow!("Missing cid"))?
.to_string(); .to_string();
let sender_onion = params["sender_onion"]
.as_str()
.ok_or_else(|| anyhow::anyhow!("Missing sender_onion"))?
.trim_end_matches('/')
.to_string();
let cap_token = params["cap_token"]
.as_str()
.ok_or_else(|| anyhow::anyhow!("Missing cap_token"))?
.to_string();
let cap_exp = params["cap_exp"]
.as_u64()
.ok_or_else(|| anyhow::anyhow!("Missing cap_exp"))?;
let mime_hint = params["mime"] let mime_hint = params["mime"]
.as_str() .as_str()
.unwrap_or("application/octet-stream") .unwrap_or("application/octet-stream")
@ -764,12 +709,7 @@ impl RpcHandler {
}; };
// Short-circuit if we already hold the blob — still issue a fresh // Short-circuit if we already hold the blob — still issue a fresh
// self-cap so the UI gets a displayable local URL. Checked BEFORE the // self-cap so the UI gets a displayable local URL.
// sender_onion/cap_token/cap_exp params are required below: an inline
// ContentInline attachment (mesh.send-content-inline) is written to
// our own BlobStore the moment it's received/sent (dispatch.rs), so
// its typed_payload never carries those fields at all — only a
// ContentRef fetched from a remote peer needs them.
if blob_store.has(&cid).await { if blob_store.has(&cid).await {
let local_exp = (chrono::Utc::now().timestamp() as u64) + DEFAULT_CAP_TTL_SECS; let local_exp = (chrono::Utc::now().timestamp() as u64) + DEFAULT_CAP_TTL_SECS;
let local_cap = blob_store.issue_capability(&cid, &self_pubkey_hex, local_exp); let local_cap = blob_store.issue_capability(&cid, &self_pubkey_hex, local_exp);
@ -785,19 +725,6 @@ impl RpcHandler {
})); }));
} }
let sender_onion = params["sender_onion"]
.as_str()
.ok_or_else(|| anyhow::anyhow!("Missing sender_onion"))?
.trim_end_matches('/')
.to_string();
let cap_token = params["cap_token"]
.as_str()
.ok_or_else(|| anyhow::anyhow!("Missing cap_token"))?
.to_string();
let cap_exp = params["cap_exp"]
.as_u64()
.ok_or_else(|| anyhow::anyhow!("Missing cap_exp"))?;
// Reach the sender: FIPS preferred when the sender is federated // Reach the sender: FIPS preferred when the sender is federated
// and has advertised a FIPS npub, Tor fallback otherwise. // and has advertised a FIPS npub, Tor fallback otherwise.
// Cap/exp/peer in the query string match what the sender signed in // Cap/exp/peer in the query string match what the sender signed in
@ -933,15 +860,6 @@ impl RpcHandler {
let svc = service let svc = service
.as_ref() .as_ref()
.ok_or_else(|| anyhow::anyhow!("Mesh service not running"))?; .ok_or_else(|| anyhow::anyhow!("Mesh service not running"))?;
// Read receipts are fired automatically just by viewing a chat (no
// explicit user action), unlike every other typed send here — so a
// stock (non-archy) peer that can't decode a TypedEnvelope at all
// (e.g. a phone running plain Sideband) would otherwise get a raw
// control envelope shoved at it the moment its message is viewed,
// surfacing as garbage text right after whatever it just sent.
if !svc.is_archy_peer(contact_id).await {
return Ok(serde_json::json!({ "sent": false, "reason": "not an archy peer" }));
}
let seq = svc.next_send_seq(contact_id).await; let seq = svc.next_send_seq(contact_id).await;
let payload = message_types::encode_payload(&receipt)?; let payload = message_types::encode_payload(&receipt)?;
let envelope = TypedEnvelope::new(MeshMessageType::ReadReceipt, payload).with_seq(seq); let envelope = TypedEnvelope::new(MeshMessageType::ReadReceipt, payload).with_seq(seq);
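Review note: the `est_seconds` expression in the `mesh.transport-advice` hunk above is a ceiling division with a floor of one second. A minimal sketch of the same arithmetic follows. The function name and signature are assumptions made for illustration, and `bytes_per_sec` is assumed to be non-zero, as both constants in the diff (50 and 390) are.

```rust
/// Ceiling-divide `size` by `bytes_per_sec`, never reporting less than 1 s.
/// Mirrors the `est_seconds` expression in the diff; the wrapper function
/// itself is hypothetical. `bytes_per_sec` must be greater than zero.
fn est_seconds(size: u64, bytes_per_sec: u64) -> u64 {
    // saturating_add keeps the "+ (divisor - 1)" rounding trick from
    // overflowing when `size` is close to u64::MAX.
    (size.saturating_add(bytes_per_sec - 1) / bytes_per_sec).max(1)
}
```

With the generic 50 B/s figure a 2300-byte payload is estimated at 46 s. With the 390 B/s Reticulum figure from the other side of the diff, the same payload is estimated at 6 s.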


@ -64,32 +64,6 @@ pub(super) fn sanitize_error_message(msg: &str) -> String {
"Container", "Container",
"Image", "Image",
"Bitcoin address", "Bitcoin address",
"No router",
"No OpenWrt",
"No space left",
"Not enough flash",
"Not enough space",
"TollGate installation failed",
"No pre-built TollGate",
"opkg not found",
"apk update failed",
"No wireless interface",
"No wireless radio",
"WiFi radio enabled but",
"Missing required field",
// seed.reveal / auth flows — user-actionable, no internals to leak.
// Without these the sanitizer collapsed every reveal failure into
// "Operation failed. Check server logs." (which isn't even a crash).
"Incorrect",
"This node has no encrypted seed",
"A 2FA code is required",
"2FA is enabled but",
"Could not decrypt the saved seed",
"Could not unlock 2FA",
"No mnemonic available",
"No pending seed generation",
"Submitted words",
"Already set up",
]; ];
for prefix in &user_facing_prefixes { for prefix in &user_facing_prefixes {
if msg.starts_with(prefix) { if msg.starts_with(prefix) {
@ -109,43 +83,6 @@ pub(super) fn sanitize_error_message(msg: &str) -> String {
"Operation failed. Check server logs for details.".to_string() "Operation failed. Check server logs for details.".to_string()
} }
#[cfg(test)]
mod sanitize_tests {
use super::sanitize_error_message;
#[test]
fn seed_reveal_errors_pass_through() {
// Every user-actionable seed.reveal failure must reach the user —
// masking them as "Check server logs" sent a real user hunting a
// crash that never happened.
for msg in [
"Incorrect password",
"This node has no encrypted seed backup, so the recovery phrase cannot be shown. It was only displayed once during setup.",
"A 2FA code is required to reveal the recovery phrase",
"2FA is enabled but no TOTP data found",
"Could not decrypt the saved seed. If you set a separate backup passphrase during setup, enter that passphrase.",
"Could not unlock 2FA with this password",
"No mnemonic available. Generate or restore a seed first.",
"Submitted words do not match generated seed",
"Already set up. Use auth.changePassword to change.",
] {
assert_ne!(
sanitize_error_message(msg),
"Operation failed. Check server logs for details.",
"masked: {msg}"
);
}
}
#[test]
fn internal_errors_stay_generic() {
assert_eq!(
sanitize_error_message("thread panicked at src/foo.rs:42"),
"Operation failed. Check server logs for details."
);
}
}
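Review note: `sanitize_error_message` works as a prefix allow-list. Messages starting with a known user-facing prefix reach the client, and everything else collapses to a generic string. The sketch below shows only that core idea. It assumes a matching message is passed through unchanged, which the visible hunks suggest but do not fully show, and it leaves out whatever other handling sits in the elided lines. The names `sanitize` and `GENERIC` are hypothetical.

```rust
/// Fallback text, copied from the diff.
const GENERIC: &str = "Operation failed. Check server logs for details.";

/// Hypothetical reduction of `sanitize_error_message`: pass a message through
/// only if it starts with one of the allowed prefixes (assumed behaviour).
fn sanitize(msg: &str, allowed_prefixes: &[&str]) -> String {
    if allowed_prefixes.iter().any(|p| msg.starts_with(p)) {
        msg.to_string()
    } else {
        GENERIC.to_string()
    }
}
```

Because the match is on prefixes, removing an entry such as `"Incorrect"` from the list silently turns `"Incorrect password"` into the generic message, which is the regression the removed `sanitize_tests` module guarded against.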
/// Derive a CSRF token from the session token via HMAC. /// Derive a CSRF token from the session token via HMAC.
/// Deterministic: same session token always produces the same CSRF token. /// Deterministic: same session token always produces the same CSRF token.
/// Survives backend restarts because it depends only on the session token /// Survives backend restarts because it depends only on the session token


@ -23,7 +23,6 @@ mod names;
mod network; mod network;
mod node; mod node;
mod nostr; mod nostr;
mod openwrt;
mod package; mod package;
mod peers; mod peers;
mod response; mod response;


@ -1,353 +0,0 @@
use super::RpcHandler;
use anyhow::Result;
use archipelago_openwrt::{
detect,
router::Router,
tollgate::{self, TollGateConfig},
wan,
wifi_scan,
};
use crate::network::router as net_router;
/// Default port for the local Cashu mint (nutshell / cashu-mint app).
const LOCAL_MINT_PORT: u16 = 3338;
impl RpcHandler {
/// Scan the local subnet for OpenWrt routers.
///
/// Params: `{ "subnet": "192.168.1.0", "prefix": 24,
/// "ssh_user": "root", "ssh_password": "" }`
pub(super) async fn handle_openwrt_scan(
&self,
params: Option<serde_json::Value>,
) -> Result<serde_json::Value> {
let p = params.unwrap_or_default();
let subnet: [u8; 4] = parse_ipv4(
p.get("subnet").and_then(|v| v.as_str()).unwrap_or("192.168.1.0"),
)?;
let prefix = p.get("prefix").and_then(|v| v.as_u64()).unwrap_or(24) as u8;
let ssh_user = p
.get("ssh_user")
.and_then(|v| v.as_str())
.unwrap_or("root")
.to_string();
let ssh_password = p
.get("ssh_password")
.and_then(|v| v.as_str())
.unwrap_or("")
.to_string();
let routers = detect::scan_subnet(subnet, prefix, &ssh_user, &ssh_password).await;
let ips: Vec<String> = routers.iter().map(|ip| ip.to_string()).collect();
Ok(serde_json::json!({ "routers": ips }))
}
/// Read current settings from a saved or ad-hoc OpenWrt router via SSH/UCI.
///
/// Params (all optional): `{ "host": "...", "ssh_user": "root", "ssh_password": "" }`
/// If params are omitted the saved `router_config.json` credentials are used.
pub(super) async fn handle_openwrt_get_status(
&self,
params: Option<serde_json::Value>,
) -> Result<serde_json::Value> {
let saved = net_router::load_router_config(&self.config.data_dir).await?;
let p = params.unwrap_or_default();
let host_from_params = p.get("host").and_then(|v| v.as_str()).is_some();
let host = p
.get("host")
.and_then(|v| v.as_str())
.map(|s| s.to_string())
.or_else(|| if saved.configured { Some(saved.address.clone()) } else { None })
.ok_or_else(|| anyhow::anyhow!("No router configured — provide host or call router.configure first"))?;
let ssh_user = p
.get("ssh_user")
.and_then(|v| v.as_str())
.map(|s| s.to_string())
.or_else(|| saved.username.clone())
.unwrap_or_else(|| "root".to_string());
let ssh_password = p
.get("ssh_password")
.and_then(|v| v.as_str())
.map(|s| s.to_string())
.or_else(|| saved.password.clone())
.unwrap_or_default();
let router = Router::connect_password(&host, 22, &ssh_user, &ssh_password)?;
router.verify_openwrt()?;
// Persist the connection so other views (e.g. the Home dashboard's
// Network tile) can poll `openwrt.get-status` with no params instead
// of every caller needing to carry host/credentials around. Only do
// this when the host actually came from params — otherwise every
// no-args poll would re-save the same thing it just read.
if host_from_params {
let _ = net_router::configure_router(
&self.config.data_dir,
net_router::RouterType::OpenWrt,
&host,
None,
Some(&ssh_user),
Some(&ssh_password),
).await;
}
// System info
let release = router.run_ok("cat /etc/openwrt_release").unwrap_or_default();
let hostname = router
.uci_get("system.@system[0].hostname")
.unwrap_or_else(|_| "unknown".into());
let uptime_secs: u64 = router
.run_ok("cat /proc/uptime")
.unwrap_or_default()
.split_whitespace()
.next()
.and_then(|s| s.split('.').next())
.and_then(|s| s.parse().ok())
.unwrap_or(0);
// TollGate — check via opkg (≤24.x) or binary presence (25.x apk-native).
// The service binary is /usr/bin/tollgate-wrt (per its init.d script),
// not /usr/bin/tollgate-module-basic-go — that's only the opkg/apk
// *package* name, never an on-disk filename.
let tollgate_installed = router
.run("/usr/bin/opkg list-installed 2>/dev/null | grep -q '^tollgate-module-basic-go ' || \
test -f /usr/bin/tollgate-wrt 2>/dev/null")
.map(|(_, code)| code == 0)
.unwrap_or(false);
let tollgate = if tollgate_installed {
serde_json::json!({
"installed": true,
"enabled": router.uci_get("tollgate.main.enabled").map(|v| v == "1").unwrap_or(false),
"metric": router.uci_get("tollgate.main.metric").unwrap_or_default(),
"step_size_ms": router.uci_get("tollgate.main.step_size").ok().and_then(|v| v.parse::<u64>().ok()).unwrap_or(0),
"price_per_step":router.uci_get("tollgate.main.price_per_step").ok().and_then(|v| v.parse::<u64>().ok()).unwrap_or(0),
"min_steps": router.uci_get("tollgate.main.min_steps").ok().and_then(|v| v.parse::<u32>().ok()).unwrap_or(1),
"currency": router.uci_get("tollgate.main.currency").unwrap_or_default(),
"mint_url": router.uci_get("tollgate.main.mint_url").unwrap_or_default(),
})
} else {
serde_json::json!({ "installed": false })
};
// WiFi interfaces
let wifi_raw = router.run_ok("uci show wireless").unwrap_or_default();
let wifi_interfaces = parse_wifi_interfaces(&wifi_raw);
let wan_status = wan::get_wan_status(&router);
Ok(serde_json::json!({
"host": host,
"hostname": hostname,
"uptime_secs": uptime_secs,
"release": parse_release(&release),
"tollgate": tollgate,
"wifi_interfaces": wifi_interfaces,
"wan": wan_status,
}))
}
/// Provision TollGate on an OpenWrt router and create the "archipelago" SSID.
///
/// Params: `{ "host": "192.168.1.1", "ssh_user": "root", "ssh_password": "",
/// "price_sats": 10, "step_size_ms": 60000, "min_steps": 1,
/// "mint_url": "<optional override>" }`
///
/// `mint_url` defaults to `http://<this node's IP>:3338` — the local Cashu
/// mint that must be running as an Archy app before calling this endpoint.
pub(super) async fn handle_openwrt_provision_tollgate(
&self,
params: Option<serde_json::Value>,
) -> Result<serde_json::Value> {
let saved = net_router::load_router_config(&self.config.data_dir).await?;
let p = params.unwrap_or_default();
let host = p
.get("host")
.and_then(|v| v.as_str())
.map(|s| s.to_string())
.or_else(|| if saved.configured { Some(saved.address.clone()) } else { None })
.ok_or_else(|| anyhow::anyhow!("No router configured — provide host or call router.configure first"))?;
let ssh_user = p
.get("ssh_user")
.and_then(|v| v.as_str())
.map(|s| s.to_string())
.or_else(|| saved.username.clone())
.unwrap_or_else(|| "root".to_string());
let ssh_password = p
.get("ssh_password")
.and_then(|v| v.as_str())
.map(|s| s.to_string())
.or_else(|| saved.password.clone())
.unwrap_or_default();
let default_mint_url = format!("http://{}:{}", self.config.host_ip, LOCAL_MINT_PORT);
let mint_url = p
.get("mint_url")
.and_then(|v| v.as_str())
.unwrap_or(&default_mint_url)
.to_string();
let config = TollGateConfig {
ssid: "archipelago".to_string(),
mint_url,
price_sats: p.get("price_sats").and_then(|v| v.as_u64()).unwrap_or(10),
step_size_ms: p
.get("step_size_ms")
.and_then(|v| v.as_u64())
.unwrap_or(60_000),
min_steps: p
.get("min_steps")
.and_then(|v| v.as_u64())
.unwrap_or(1) as u32,
enabled: p.get("enabled").and_then(|v| v.as_bool()).unwrap_or(true),
};
let router = Router::connect_password(&host, 22, &ssh_user, &ssh_password)?;
router.verify_openwrt()?;
tollgate::provision(&router, &config).await?;
Ok(serde_json::json!({
"ok": true,
"host": host,
"ssid": config.ssid,
"mint_url": config.mint_url,
}))
}
/// Scan for visible WiFi networks from the router's radio.
///
/// Params: same host/credentials as other openwrt methods.
pub(super) async fn handle_openwrt_scan_wifi(
&self,
params: Option<serde_json::Value>,
) -> Result<serde_json::Value> {
let saved = net_router::load_router_config(&self.config.data_dir).await?;
let p = params.unwrap_or_default();
let host = p.get("host").and_then(|v| v.as_str()).map(|s| s.to_string())
.or_else(|| if saved.configured { Some(saved.address.clone()) } else { None })
.ok_or_else(|| anyhow::anyhow!("No router configured — provide host or call router.configure first"))?;
let ssh_user = p.get("ssh_user").and_then(|v| v.as_str()).map(|s| s.to_string())
.or_else(|| saved.username.clone()).unwrap_or_else(|| "root".to_string());
let ssh_password = p.get("ssh_password").and_then(|v| v.as_str()).map(|s| s.to_string())
.or_else(|| saved.password.clone()).unwrap_or_default();
let router = Router::connect_password(&host, 22, &ssh_user, &ssh_password)?;
router.verify_openwrt()?;
let networks = wifi_scan::scan_networks(&router)?;
let result: Vec<serde_json::Value> = networks
.iter()
.map(|n| serde_json::json!({
"ssid": n.ssid,
"bssid": n.bssid,
"signal": n.signal,
"channel": n.channel,
"encryption": n.encryption,
}))
.collect();
Ok(serde_json::json!({ "networks": result }))
}
/// Configure WAN/WISP — connect the router to an upstream WiFi network.
///
/// Params: host/credentials + `{ "ssid": "...", "password": "...", "encryption": "psk2" }`
pub(super) async fn handle_openwrt_configure_wan(
&self,
params: Option<serde_json::Value>,
) -> Result<serde_json::Value> {
let saved = net_router::load_router_config(&self.config.data_dir).await?;
let p = params.unwrap_or_default();
let host = p.get("host").and_then(|v| v.as_str()).map(|s| s.to_string())
.or_else(|| if saved.configured { Some(saved.address.clone()) } else { None })
.ok_or_else(|| anyhow::anyhow!("No router configured — provide host or call router.configure first"))?;
let ssh_user = p.get("ssh_user").and_then(|v| v.as_str()).map(|s| s.to_string())
.or_else(|| saved.username.clone()).unwrap_or_else(|| "root".to_string());
let ssh_password = p.get("ssh_password").and_then(|v| v.as_str()).map(|s| s.to_string())
.or_else(|| saved.password.clone()).unwrap_or_default();
let ssid = p.get("ssid").and_then(|v| v.as_str())
.ok_or_else(|| anyhow::anyhow!("Missing required field: ssid"))?.to_string();
let password = p.get("password").and_then(|v| v.as_str()).unwrap_or("").to_string();
let encryption = p.get("encryption").and_then(|v| v.as_str()).unwrap_or("psk2").to_string();
let dhcp_start = p.get("dhcp_start").and_then(|v| v.as_u64()).unwrap_or(100) as u32;
let dhcp_limit = p.get("dhcp_limit").and_then(|v| v.as_u64()).unwrap_or(150) as u32;
let masq = p.get("masq").and_then(|v| v.as_bool()).unwrap_or(true);
let router = Router::connect_password(&host, 22, &ssh_user, &ssh_password)?;
router.verify_openwrt()?;
let config = wan::WispConfig { ssid: ssid.clone(), password, encryption, dhcp_start, dhcp_limit, masq };
wan::configure_wisp(&router, &config)?;
Ok(serde_json::json!({ "ok": true, "host": host, "ssid": ssid }))
}
}
/// Parse /etc/openwrt_release key=value pairs into a JSON object.
fn parse_release(raw: &str) -> serde_json::Value {
let mut m = serde_json::Map::new();
for line in raw.lines() {
if let Some((k, v)) = line.split_once('=') {
m.insert(
k.to_lowercase(),
serde_json::Value::String(v.trim_matches('"').to_string()),
);
}
}
serde_json::Value::Object(m)
}
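Review note: `parse_release` strips double quotes from each value. As far as I know, OpenWrt writes `/etc/openwrt_release` with single-quoted values, in which case the quotes would survive in the JSON; treat that as an assumption to verify on a real router. The sketch below is a std-only variant that strips both quote styles. It returns a `HashMap` instead of a `serde_json::Value`, and the name `parse_release_pairs` is hypothetical.

```rust
use std::collections::HashMap;

/// Hypothetical std-only variant of `parse_release`: lower-case the keys and
/// strip either quote style from the values. Lines without '=' are skipped.
fn parse_release_pairs(raw: &str) -> HashMap<String, String> {
    let mut pairs = HashMap::new();
    for line in raw.lines() {
        if let Some((key, value)) = line.split_once('=') {
            let value = value.trim_matches(|c: char| c == '"' || c == '\'');
            pairs.insert(key.to_lowercase(), value.to_string());
        }
    }
    pairs
}
```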
/// Extract AP wifi-iface sections from `uci show wireless` output.
fn parse_wifi_interfaces(raw: &str) -> Vec<serde_json::Value> {
use std::collections::HashMap;
let mut sections: HashMap<String, HashMap<String, String>> = HashMap::new();
for line in raw.lines() {
if let Some((lhs, rhs)) = line.trim().split_once('=') {
let parts: Vec<&str> = lhs.splitn(3, '.').collect();
if parts.len() == 3 && parts[0] == "wireless" {
sections
.entry(parts[1].to_string())
.or_default()
.insert(parts[2].to_string(), rhs.trim_matches('\'').to_string());
}
}
}
let mut ifaces: Vec<serde_json::Value> = sections
.into_iter()
.filter(|(_, f)| f.get("mode").map(|m| m == "ap").unwrap_or(false))
.map(|(name, f)| serde_json::json!({
"section": name,
"ssid": f.get("ssid").cloned().unwrap_or_default(),
"device": f.get("device").cloned().unwrap_or_default(),
"encryption": f.get("encryption").cloned().unwrap_or_else(|| "none".into()),
"network": f.get("network").cloned().unwrap_or_default(),
"disabled": f.get("disabled").map(|v| v == "1").unwrap_or(false),
}))
.collect();
ifaces.sort_by_key(|v| v["section"].as_str().unwrap_or("").to_string());
ifaces
}
fn parse_ipv4(s: &str) -> Result<[u8; 4]> {
let parts: Vec<&str> = s.split('.').collect();
if parts.len() != 4 {
anyhow::bail!("Invalid IPv4: {}", s);
}
Ok([
parts[0].parse()?,
parts[1].parse()?,
parts[2].parse()?,
parts[3].parse()?,
])
}
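
A note on `parse_ipv4` above: the standard library already has a dotted-quad parser, so the same result can be reached without splitting by hand. The following is a minimal sketch, not the code in this diff. The name `parse_ipv4_octets` and the `String` error type are assumptions made for illustration (the original returns an `anyhow` result).

```rust
use std::net::Ipv4Addr;

/// Hypothetical std-only variant of `parse_ipv4`: delegate to `Ipv4Addr`'s
/// `FromStr` impl and return the four octets.
fn parse_ipv4_octets(s: &str) -> Result<[u8; 4], String> {
    s.parse::<Ipv4Addr>()
        // `octets()` yields the address as `[u8; 4]` in network order.
        .map(|addr| addr.octets())
        // Keep the "Invalid IPv4" wording of the original error message.
        .map_err(|e| format!("Invalid IPv4: {s} ({e})"))
}
```

One behavioural difference to keep in mind: `Ipv4Addr` parsing is stricter than four independent `u8` parses, so inputs the hand-rolled version tolerates may be rejected here.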


@ -114,31 +114,6 @@ impl RpcHandler {
Err(e) => { Err(e) => {
error!("package.install {} failed: {:#}", package_id_spawn, e); error!("package.install {} failed: {:#}", package_id_spawn, e);
install_log(&format!("INSTALL FAIL: {}{:#}", package_id_spawn, e)).await; install_log(&format!("INSTALL FAIL: {}{:#}", package_id_spawn, e)).await;
// Dependency-gate rejections happen BEFORE any resource
// (container/image/data dir) exists for this package, so
// keeping the optimistic entry would leave a phantom
// "Stopped" tile whose Start fails with `no such object`
// (the log-confirmed LND fresh-install failure). Remove
// the entry so the card reverts to installable, and
// surface the reason as a notification instead.
if let Some(gate) = e.downcast_ref::<super::dependencies::DependencyGateError>()
{
let (mut data, _) = handler.state_manager.get_snapshot().await;
data.package_data.remove(&package_id_spawn);
data.notifications.push(crate::data_model::Notification {
id: format!("install-deps-{package_id_spawn}"),
level: crate::data_model::NotificationLevel::Error,
title: format!("Could not install {package_id_spawn}"),
message: gate.to_string(),
timestamp: chrono::Utc::now().to_rfc3339(),
app_id: Some(package_id_spawn.clone()),
});
while data.notifications.len() > 20 {
data.notifications.remove(0);
}
handler.state_manager.update_data(data).await;
return;
}
// Don't remove the entry — that's what made the card // Don't remove the entry — that's what made the card
// vanish from My Apps mid-install / between retry-loop // vanish from My Apps mid-install / between retry-loop
// attempts (e.g. tailscale's entrypoint failure). Leave // attempts (e.g. tailscale's entrypoint failure). Leave


@ -707,17 +707,12 @@ pub(super) async fn get_app_config(
// effectively pinned at 2 by --cpus=2 (now removed). // effectively pinned at 2 by --cpus=2 (now removed).
// -maxconnections=125 — default but explicit, so ops can // -maxconnections=125 — default but explicit, so ops can
// tune downward on bandwidth-constrained nodes. // tune downward on bandwidth-constrained nodes.
// Log volume: -printtoconsole=0 — bitcoind already writes
// debug.log in the datadir (self-shrunk on restart); echoing it
// to stdout too pushed every IBD "UpdateTip" line through
// conmon into journald (>1 GB/day on a fresh node). Deep
// debugging uses /var/lib/archipelago/bitcoin/debug.log.
Some(vec![ Some(vec![
"-server=1".to_string(), "-server=1".to_string(),
"-rpcbind=0.0.0.0".to_string(), "-rpcbind=0.0.0.0".to_string(),
"-rpcallowip=0.0.0.0/0".to_string(), "-rpcallowip=0.0.0.0/0".to_string(),
"-rpcport=8332".to_string(), "-rpcport=8332".to_string(),
"-printtoconsole=0".to_string(), "-printtoconsole=1".to_string(),
"-datadir=/home/bitcoin/.bitcoin".to_string(), "-datadir=/home/bitcoin/.bitcoin".to_string(),
format!("-dbcache={}", bitcoin_dbcache_mb()), format!("-dbcache={}", bitcoin_dbcache_mb()),
"-par=0".to_string(), "-par=0".to_string(),


@ -1,8 +1,6 @@
use super::config::get_containers_for_app; use super::config::get_containers_for_app;
use super::runtime::manifest_apps_dirs;
use crate::data_model::{PackageDataEntry, PackageState}; use crate::data_model::{PackageDataEntry, PackageState};
use anyhow::{Context, Result}; use anyhow::{Context, Result};
use archipelago_container::{AppManifest, Dependency};
use std::collections::HashMap; use std::collections::HashMap;
use tracing::info; use tracing::info;
@ -13,38 +11,7 @@ const BITCOIN_NAMES: &[&str] = &["bitcoin-knots", "bitcoin-core", "bitcoin"];
const ELECTRUM_NAMES: &[&str] = &["electrumx", "mempool-electrs", "electrs"]; const ELECTRUM_NAMES: &[&str] = &["electrumx", "mempool-electrs", "electrs"];
const ARCHIVAL_BITCOIN_DISK_GB: u64 = 1000; const ARCHIVAL_BITCOIN_DISK_GB: u64 = 1000;
/// The manifest string dependency that declares "needs an archival
/// (unpruned + txindex) Bitcoin node" — see `manifest_declares_archival_bitcoin`.
const ARCHIVAL_BITCOIN_DEPENDENCY: &str = "bitcoin:archival";
/// Whether `package_id`'s own on-disk manifest declares
/// `dependencies: [bitcoin:archival]`. Manifest-driven alternative to the
/// hardcoded id list below — a new app just declares the dependency instead
/// of needing a code change here.
fn manifest_declares_archival_bitcoin(package_id: &str) -> bool {
for apps_dir in manifest_apps_dirs() {
let path = apps_dir.join(package_id).join("manifest.yml");
let Ok(contents) = std::fs::read_to_string(&path) else {
continue;
};
let Ok(manifest) = AppManifest::parse(&contents) else {
continue;
};
return dependency_list_declares_archival_bitcoin(&manifest.app.dependencies);
}
false
}
fn dependency_list_declares_archival_bitcoin(deps: &[Dependency]) -> bool {
deps.iter()
.any(|dep| matches!(dep, Dependency::Simple(s) if s == ARCHIVAL_BITCOIN_DEPENDENCY))
}
fn requires_unpruned_bitcoin(package_id: &str) -> bool { fn requires_unpruned_bitcoin(package_id: &str) -> bool {
if manifest_declares_archival_bitcoin(package_id) {
return true;
}
// Fallback for apps not yet migrated to the manifest declaration above.
matches!( matches!(
package_id, package_id,
"electrumx" | "mempool-electrs" | "electrs" | "mempool" | "mempool-web" "electrumx" | "mempool-electrs" | "electrs" | "mempool" | "mempool-web"
@ -58,7 +25,6 @@ fn archival_bitcoin_required_message(package_id: &str) -> String {
} }
/// Snapshot of which dependency services are currently running. /// Snapshot of which dependency services are currently running.
#[derive(Debug)]
pub(super) struct RunningDeps { pub(super) struct RunningDeps {
pub has_bitcoin: bool, pub has_bitcoin: bool,
pub has_electrumx: bool, pub has_electrumx: bool,
@ -228,190 +194,6 @@ pub(super) fn check_install_deps(package_id: &str, deps: &RunningDeps) -> Result
} }
} }
// ---------------------------------------------------------------------------
// Bounded dependency wait (install race fix)
// ---------------------------------------------------------------------------
//
// Confirmed race on fresh nodes: the user clicks "Install LND" while
// bitcoin-knots is itself still installing/starting. `check_install_deps`
// rejected instantly ("LND requires a running Bitcoin node…") even though
// Bitcoin came up 55s later. The fix: when the dependency is INSTALLED
// (container exists in `podman ps -a`, or the package state knows about it)
// but not Running yet, poll for up to DEP_WAIT_MAX_ATTEMPTS × DEP_WAIT_INTERVAL
// (~3 minutes) before failing, surfacing "Waiting for X to start…" via the
// install-progress message. If the dependency is not installed at all, fail
// fast with the canonical `check_install_deps` message — waiting can't help.
/// Poll interval while waiting for an installed dependency to start.
pub(super) const DEP_WAIT_INTERVAL: std::time::Duration = std::time::Duration::from_secs(5);
/// 36 × 5s = 3 minutes of bounded waiting.
pub(super) const DEP_WAIT_MAX_ATTEMPTS: u32 = 36;
/// Marker error: the install was rejected by the dependency gate BEFORE any
/// resource (container, image, data dir) was created for the package. The
/// async install wrapper (`async_lifecycle.rs`) downcasts to this to remove
/// the optimistic `Installing` state entry instead of leaving a phantom
/// "Stopped" tile whose Start fails with `no such object`.
#[derive(Debug)]
pub(in crate::api::rpc) struct DependencyGateError(pub String);
impl std::fmt::Display for DependencyGateError {
fn fmt(&self, f: &mut std::fmt::Formatter<'_>) -> std::fmt::Result {
f.write_str(&self.0)
}
}
impl std::error::Error for DependencyGateError {}
/// One unsatisfied install dependency: a user-facing label plus the container
/// name variants that would satisfy it.
struct MissingDep {
label: &'static str,
containers: &'static [&'static str],
}
/// Which dependencies `check_install_deps` would reject `package_id` over.
/// Must stay in lockstep with the match arms in `check_install_deps` (the
/// wait loop re-runs `check_install_deps` for the canonical error message).
fn missing_install_deps(package_id: &str, deps: &RunningDeps) -> Vec<MissingDep> {
const BITCOIN: MissingDep = MissingDep {
label: "Bitcoin",
containers: BITCOIN_NAMES,
};
const ELECTRUM: MissingDep = MissingDep {
label: "ElectrumX",
containers: ELECTRUM_NAMES,
};
let mut missing = Vec::new();
match package_id {
"electrumx" | "mempool-electrs" | "electrs" | "lnd" | "btcpay-server" | "btcpayserver" => {
if !deps.has_bitcoin {
missing.push(BITCOIN);
}
}
"mempool" | "mempool-web" => {
if !deps.has_bitcoin {
missing.push(BITCOIN);
}
if !deps.has_electrumx {
missing.push(ELECTRUM);
}
}
// fedimint deliberately absent: check_install_deps allows it without
// a local Bitcoin node (remote RPC configured in guardian setup).
_ => {}
}
missing
}
fn join_dep_labels(missing: &[MissingDep]) -> String {
missing
.iter()
.map(|d| d.label)
.collect::<Vec<_>>()
.join(" and ")
}
/// One snapshot of the dependency world, fed to [`wait_for_install_deps`].
pub(super) struct DepProbe {
/// Which dependency services are currently Running.
pub running: RunningDeps,
/// Container/package names that EXIST in any state — installed, but
/// possibly not running yet (from `podman ps -a` or package-state entries).
pub existing: Vec<String>,
}
/// All container names known to podman in any state (`podman ps -a`).
/// Conservative on probe failure: returns an empty list, which makes the
/// wait loop fall back to the pre-fix fail-fast behavior.
pub(super) async fn detect_existing_containers() -> Vec<String> {
let out = tokio::time::timeout(
std::time::Duration::from_secs(30),
tokio::process::Command::new("podman")
.args(["ps", "-a", "--format", "{{.Names}}"])
.output(),
)
.await;
match out {
Ok(Ok(o)) if o.status.success() => String::from_utf8_lossy(&o.stdout)
.lines()
.map(|l| l.trim().to_string())
.filter(|l| !l.is_empty())
.collect(),
_ => Vec::new(),
}
}
/// Bounded dependency gate. Returns the (satisfied) `RunningDeps` snapshot,
/// or a [`DependencyGateError`]:
/// - immediately, when a missing dependency is not installed at all
/// (canonical `check_install_deps` message), or
/// - after `max_attempts × interval`, when an installed dependency never
/// reached Running.
///
/// `probe` and `on_waiting` are injected so unit tests can drive the loop
/// without a podman runtime; production wires them to
/// `RpcHandler::dep_probe_for_install` / `set_install_message`.
pub(super) async fn wait_for_install_deps<P, PF, L, LF>(
package_id: &str,
mut probe: P,
mut on_waiting: L,
max_attempts: u32,
interval: std::time::Duration,
) -> Result<RunningDeps>
where
P: FnMut() -> PF,
PF: std::future::Future<Output = Result<DepProbe>>,
L: FnMut(String) -> LF,
LF: std::future::Future<Output = ()>,
{
let mut waited_attempts = 0u32;
loop {
let DepProbe { running, existing } = probe().await?;
let missing = missing_install_deps(package_id, &running);
if missing.is_empty() {
// Keep behavior in lockstep with the canonical gate (covers any
// future arm added there but not mirrored in missing_install_deps).
check_install_deps(package_id, &running)?;
return Ok(running);
}
// Fail fast if any missing dependency has no installed container
// under any name variant — waiting cannot satisfy it.
let some_dep_not_installed = missing
.iter()
.any(|dep| !dep.containers.iter().any(|c| existing.iter().any(|e| e == c)));
if some_dep_not_installed {
let msg = match check_install_deps(package_id, &running) {
Err(e) => e.to_string(),
Ok(()) => format!("{package_id} dependencies are not running"),
};
return Err(anyhow::Error::new(DependencyGateError(msg)));
}
if waited_attempts >= max_attempts {
let labels = join_dep_labels(&missing);
return Err(anyhow::Error::new(DependencyGateError(format!(
"{labels} is installed but did not reach the running state within \
{} seconds. Start {labels}, then install {package_id} again.",
u64::from(max_attempts) * interval.as_secs()
))));
}
waited_attempts += 1;
let labels = join_dep_labels(&missing);
if waited_attempts == 1 {
info!(
"Install {package_id}: dependency {labels} installed but not running yet — \
waiting up to {}s for it to start",
u64::from(max_attempts) * interval.as_secs()
);
}
on_waiting(format!("Waiting for {labels} to start…")).await;
tokio::time::sleep(interval).await;
}
}
/// ElectrumX and Mempool's Electrum backend need historical blocks from an /// ElectrumX and Mempool's Electrum backend need historical blocks from an
/// unpruned node while building their indexes. A pruned Bitcoin node can be /// unpruned node while building their indexes. A pruned Bitcoin node can be
/// running and RPC-reachable but still leave them stuck with closed ports. /// running and RPC-reachable but still leave them stuck with closed ports.
@ -594,31 +376,16 @@ pub(super) fn startup_order(package_id: &str) -> &'static [&'static str] {
/// order for the given app. Unknown containers sort to the end. /// order for the given app. Unknown containers sort to the end.
pub(super) async fn ordered_containers_for_start(package_id: &str) -> Result<Vec<String>> { pub(super) async fn ordered_containers_for_start(package_id: &str) -> Result<Vec<String>> {
let containers = get_containers_for_app(package_id).await?; let containers = get_containers_for_app(package_id).await?;
Ok(order_present_containers(package_id, containers))
}
/// Order the *actually-present* containers of an app by its dependency-aware
/// startup order. Containers whose name is unknown to the order list sort to
/// the end, preserving their relative input order.
///
/// This deliberately does NOT inject order entries that aren't live
/// containers. `startup_order` is a union of container-name variants across
/// install generations (e.g. `mysql-mempool` vs `archy-mempool-db`), so any
/// single install only ever has a subset of those names. Injecting a phantom
/// name makes the start path fail on a "no such object" inspect — and because
/// `do_orchestrator_package_start` propagates the unknown-app-id fallback
/// error via `?`, every later member (the api + frontend) is then skipped,
/// leaving the stack down until the health monitor recovers it minutes later.
/// That was the source of mempool gate flakes #73 (frontend) / #74 (api).
fn order_present_containers(package_id: &str, containers: Vec<String>) -> Vec<String> {
if containers.is_empty() {
// Nothing is live under any known name. Fall back to the package id so
// a single-container app whose container matches its id still gets one
// start attempt; multi-container stacks with no live members are
// surfaced as "no containers" by the caller's emptiness check.
return vec![package_id.to_string()];
}
let order = startup_order(package_id); let order = startup_order(package_id);
if order.is_empty() && containers.is_empty() {
return Ok(vec![package_id.to_string()]);
}
let mut sorted = containers;
for required in order {
if !sorted.iter().any(|name| name == required) {
sorted.push((*required).to_string());
}
}
// If no special order is defined, fall back to mempool order for legacy // If no special order is defined, fall back to mempool order for legacy
// multi-container names that may still be returned by config lookups. // multi-container names that may still be returned by config lookups.
let effective_order: &[&str] = if order.is_empty() { let effective_order: &[&str] = if order.is_empty() {
@ -626,14 +393,8 @@ fn order_present_containers(package_id: &str, containers: Vec<String>) -> Vec<St
} else { } else {
order order
}; };
let mut sorted = containers; sorted.sort_by_key(|c| effective_order.iter().position(|o| *o == c).unwrap_or(99));
sorted.sort_by_key(|c| { Ok(sorted)
effective_order
.iter()
.position(|o| *o == c)
.unwrap_or(usize::MAX)
});
sorted
} }
/// Configure Fedimint Gateway to use LND instead of LDK. /// Configure Fedimint Gateway to use LND instead of LDK.
@ -691,52 +452,7 @@ pub(super) fn configure_fedimint_lnd(
#[cfg(test)] #[cfg(test)]
mod tests { mod tests {
use super::{ use super::{requires_unpruned_bitcoin, startup_order};
dependency_list_declares_archival_bitcoin, manifest_declares_archival_bitcoin,
order_present_containers, requires_unpruned_bitcoin, startup_order,
};
use archipelago_container::Dependency;
#[test]
fn order_present_containers_never_injects_phantom_stack_members() {
// The live mempool stack on a node: db + api + frontend. These are the
// only real container names; the startup_order list also contains
// variant/legacy names (mysql-mempool, archy-mempool-api, ...) that are
// NOT live here and must never appear in the result — a phantom name in
// the start list aborts the orchestrator start mid-sequence (gate
// #73/#74).
let present = vec![
"mempool".to_string(),
"mempool-api".to_string(),
"archy-mempool-db".to_string(),
];
let ordered = order_present_containers("mempool", present);
// Dependency order: db -> api -> frontend.
assert_eq!(ordered, vec!["archy-mempool-db", "mempool-api", "mempool"]);
// No phantom variants leaked in.
for phantom in ["mysql-mempool", "archy-mempool-api", "archy-mempool-web"] {
assert!(
!ordered.iter().any(|c| c == phantom),
"phantom {phantom} must not be injected"
);
}
}
#[test]
fn order_present_containers_orders_known_before_unknown() {
let present = vec!["mempool".to_string(), "some-sidecar".to_string()];
let ordered = order_present_containers("mempool", present);
// The known frontend sorts ahead of an unknown sidecar.
assert_eq!(ordered, vec!["mempool", "some-sidecar"]);
}
#[test]
fn order_present_containers_empty_falls_back_to_package_id() {
assert_eq!(
order_present_containers("mempool", vec![]),
vec!["mempool".to_string()]
);
}
#[test] #[test]
fn btcpay_start_order_includes_required_stack_members() { fn btcpay_start_order_includes_required_stack_members() {
@ -769,272 +485,4 @@ mod tests {
assert!(!requires_unpruned_bitcoin(package_id), "{package_id}"); assert!(!requires_unpruned_bitcoin(package_id), "{package_id}");
} }
} }
#[test]
fn dependency_matcher_finds_the_archival_marker_among_other_deps() {
let deps = vec![
Dependency::App {
app_id: "bitcoin-knots".to_string(),
version: Some(">=26.0".to_string()),
},
Dependency::Storage {
storage: "50Gi".to_string(),
},
Dependency::Simple("bitcoin:archival".to_string()),
];
assert!(dependency_list_declares_archival_bitcoin(&deps));
}
#[test]
fn dependency_matcher_false_when_marker_absent() {
let deps = vec![Dependency::App {
app_id: "bitcoin-knots".to_string(),
version: Some(">=26.0".to_string()),
}];
assert!(!dependency_list_declares_archival_bitcoin(&deps));
assert!(!dependency_list_declares_archival_bitcoin(&[]));
}
#[test]
fn manifest_declared_archival_bitcoin_covers_a_new_app_without_a_code_change() {
// electrumx and mempool declare `dependencies: [..., bitcoin:archival]`
// on disk (apps/electrumx/manifest.yml, apps/mempool/manifest.yml) —
// this is the manifest-driven path working end-to-end, not the
// hardcoded id list. A future app only needs this manifest line, no
// edit to `requires_unpruned_bitcoin`.
assert!(manifest_declares_archival_bitcoin("electrumx"));
assert!(manifest_declares_archival_bitcoin("mempool"));
// An app whose manifest exists but never declares the marker.
assert!(!manifest_declares_archival_bitcoin("bitcoin-knots"));
// An id with no manifest on disk at all.
assert!(!manifest_declares_archival_bitcoin("does-not-exist"));
}
mod dep_wait {
use super::super::{wait_for_install_deps, DepProbe, DependencyGateError, RunningDeps};
use std::sync::atomic::{AtomicU32, Ordering};
use std::sync::{Arc, Mutex};
use std::time::Duration;
fn deps(has_bitcoin: bool, has_electrumx: bool) -> RunningDeps {
RunningDeps {
has_bitcoin,
has_electrumx,
has_lnd: false,
}
}
fn probe(has_bitcoin: bool, has_electrumx: bool, existing: &[&str]) -> DepProbe {
DepProbe {
running: deps(has_bitcoin, has_electrumx),
existing: existing.iter().map(|s| s.to_string()).collect(),
}
}
/// Collects "Waiting for X to start…" labels emitted during the wait.
fn label_sink() -> (Arc<Mutex<Vec<String>>>, impl FnMut(String) -> std::future::Ready<()>)
{
let labels = Arc::new(Mutex::new(Vec::new()));
let sink = {
let labels = Arc::clone(&labels);
move |msg: String| {
labels.lock().unwrap().push(msg);
std::future::ready(())
}
};
(labels, sink)
}
#[tokio::test]
async fn passes_immediately_when_dependency_is_running() {
let (labels, sink) = label_sink();
let result = wait_for_install_deps(
"lnd",
|| async { Ok(probe(true, false, &["bitcoin-knots"])) },
sink,
3,
Duration::ZERO,
)
.await;
assert!(result.is_ok());
assert!(labels.lock().unwrap().is_empty(), "no waiting expected");
}
#[tokio::test]
async fn fails_fast_when_dependency_not_installed_at_all() {
let calls = AtomicU32::new(0);
let (labels, sink) = label_sink();
let err = wait_for_install_deps(
"lnd",
|| {
calls.fetch_add(1, Ordering::SeqCst);
async { Ok(probe(false, false, &["uptime-kuma"])) }
},
sink,
36,
Duration::ZERO,
)
.await
.unwrap_err();
// Single probe — no polling when waiting cannot help.
assert_eq!(calls.load(Ordering::SeqCst), 1);
assert!(labels.lock().unwrap().is_empty());
// Canonical check_install_deps message, wrapped in the gate marker
// so async_lifecycle removes the optimistic Installing entry.
assert!(err.downcast_ref::<DependencyGateError>().is_some());
assert!(
err.to_string().contains("LND requires a running Bitcoin node"),
"unexpected message: {err}"
);
}
#[tokio::test]
async fn waits_while_installed_dependency_starts_then_passes() {
// Bitcoin container exists (installing/starting) but only reports
// Running from the 3rd probe onward — the log-confirmed LND race.
let calls = Arc::new(AtomicU32::new(0));
let (labels, sink) = label_sink();
let probe_calls = Arc::clone(&calls);
let result = wait_for_install_deps(
"lnd",
move || {
let n = probe_calls.fetch_add(1, Ordering::SeqCst);
async move { Ok(probe(n >= 2, false, &["bitcoin-knots"])) }
},
sink,
36,
Duration::ZERO,
)
.await;
assert!(result.is_ok(), "{result:?}");
assert_eq!(calls.load(Ordering::SeqCst), 3);
let labels = labels.lock().unwrap();
assert_eq!(labels.len(), 2, "one waiting label per polling attempt");
assert!(labels.iter().all(|l| l == "Waiting for Bitcoin to start…"));
}
#[tokio::test]
async fn times_out_when_installed_dependency_never_runs() {
let (labels, sink) = label_sink();
let err = wait_for_install_deps(
"lnd",
|| async { Ok(probe(false, false, &["bitcoin-knots"])) },
sink,
4,
Duration::ZERO,
)
.await
.unwrap_err();
assert!(err.downcast_ref::<DependencyGateError>().is_some());
assert!(
err.to_string()
.contains("did not reach the running state within 0 seconds"),
"unexpected message: {err}"
);
assert_eq!(labels.lock().unwrap().len(), 4);
}
#[tokio::test]
async fn mempool_waits_on_both_bitcoin_and_electrumx() {
let calls = Arc::new(AtomicU32::new(0));
let (labels, sink) = label_sink();
let probe_calls = Arc::clone(&calls);
let result = wait_for_install_deps(
"mempool",
move || {
let n = probe_calls.fetch_add(1, Ordering::SeqCst);
// Bitcoin comes up on probe 2, electrumx on probe 3.
async move { Ok(probe(n >= 1, n >= 2, &["bitcoin-knots", "electrumx"])) }
},
sink,
36,
Duration::ZERO,
)
.await;
assert!(result.is_ok(), "{result:?}");
let labels = labels.lock().unwrap();
assert_eq!(
labels.as_slice(),
&[
"Waiting for Bitcoin and ElectrumX to start…".to_string(),
"Waiting for ElectrumX to start…".to_string(),
]
);
}
#[tokio::test]
async fn mempool_fails_fast_when_one_dep_is_not_installed() {
// Bitcoin is installed (waiting could help) but ElectrumX is not
// installed at all — waiting can never satisfy the gate, so fail
// fast with the canonical message.
let (labels, sink) = label_sink();
let err = wait_for_install_deps(
"mempool",
|| async { Ok(probe(false, false, &["bitcoin-knots"])) },
sink,
36,
Duration::ZERO,
)
.await
.unwrap_err();
assert!(err.downcast_ref::<DependencyGateError>().is_some());
assert!(labels.lock().unwrap().is_empty());
assert!(
err.to_string().contains("Mempool requires"),
"unexpected message: {err}"
);
}
#[tokio::test]
async fn variant_container_names_count_as_installed() {
// bitcoin-core (not just bitcoin-knots) satisfies the "installed"
// check for the wait path.
let calls = Arc::new(AtomicU32::new(0));
let (_labels, sink) = label_sink();
let probe_calls = Arc::clone(&calls);
let result = wait_for_install_deps(
"electrumx",
move || {
let n = probe_calls.fetch_add(1, Ordering::SeqCst);
async move { Ok(probe(n >= 1, false, &["bitcoin-core"])) }
},
sink,
36,
Duration::ZERO,
)
.await;
assert!(result.is_ok(), "{result:?}");
}
#[tokio::test]
async fn apps_without_dependency_gate_pass_untouched() {
let (labels, sink) = label_sink();
let result = wait_for_install_deps(
"uptime-kuma",
|| async { Ok(probe(false, false, &[])) },
sink,
36,
Duration::ZERO,
)
.await;
assert!(result.is_ok());
assert!(labels.lock().unwrap().is_empty());
}
}
#[test]
fn mempool_api_is_directly_installable_and_covered_by_the_archival_gate() {
// `mempool-api` is a legitimate direct `package.install` target
// (`uses_orchestrator_install_flow` in install.rs), reachable without
// going through the `mempool`/`mempool-web` umbrella id that the old
// hardcoded fallback list only recognized. It was missing from that
// list, so installing/repairing it directly skipped the archival
// Bitcoin gate entirely. Its manifest now declares `bitcoin:archival`
// directly, closing the gap the manifest-driven path exists for.
assert!(requires_unpruned_bitcoin("mempool-api"));
assert!(manifest_declares_archival_bitcoin("mempool-api"));
// `archy-mempool-web` has no direct Bitcoin RPC access
// (bitcoin_integration.rpc_access: none) and correctly stays excluded.
assert!(!requires_unpruned_bitcoin("archy-mempool-web"));
}
} }
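
The ordering rule documented for `order_present_containers` in this file (sort only the containers that are actually present, never inject names from the order list, unknown names last) can be sketched in isolation. This is an illustrative sketch, not the function from the diff: `order_present` is a hypothetical name, and the order list is passed in as a parameter rather than looked up through `startup_order`.

```rust
/// Sort the containers that actually exist by their position in `order`.
/// Names that do not appear in `order` sort to the end.
fn order_present(order: &[&str], present: Vec<String>) -> Vec<String> {
    let mut sorted = present;
    // `sort_by_key` is a stable sort, so unknown names (all keyed
    // `usize::MAX`) keep their relative input order at the end.
    sorted.sort_by_key(|name| {
        order
            .iter()
            .position(|known| *known == name.as_str())
            .unwrap_or(usize::MAX)
    });
    // Nothing from `order` is ever appended: the result is a permutation
    // of the input, so no phantom container name can reach the start path.
    sorted
}
```

Because the result is always a permutation of the input, a variant name that exists only in the order list can never end up in the start sequence.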


@ -3,10 +3,9 @@ use super::config::{
is_readonly_compatible, is_valid_docker_image, is_readonly_compatible, is_valid_docker_image,
}; };
use super::dependencies::{ use super::dependencies::{
check_bitcoin_pruning_compatibility, configure_fedimint_lnd, detect_existing_containers, check_bitcoin_pruning_compatibility, check_install_deps, configure_fedimint_lnd,
detect_running_deps, detect_running_deps_from_package_data, log_optional_dep_info, detect_running_deps, detect_running_deps_from_package_data, log_optional_dep_info,
needs_archy_net, wait_for_install_deps, DepProbe, RunningDeps, DEP_WAIT_INTERVAL, needs_archy_net, RunningDeps,
DEP_WAIT_MAX_ATTEMPTS,
}; };
use super::progress::parse_pull_progress; use super::progress::parse_pull_progress;
use super::validation::validate_app_id; use super::validation::validate_app_id;
@ -244,17 +243,6 @@ impl RpcHandler {
} }
} }
// Multi-version support: honor an install-time version selection for the
// orchestrator-managed Bitcoin apps. Selecting the catalog default (or
// omitting `version`) leaves the app unpinned (tracks latest); selecting
// an older version pins it so install_fresh resolves that image and the
// update badge stays suppressed. See docs/bitcoin-multi-version-design.md.
if matches!(package_id, "bitcoin-core" | "bitcoin-knots") {
if let Some(version) = params.get("version").and_then(|v| v.as_str()) {
persist_install_version_selection(package_id, version).await;
}
}
// Phase: Preparing — emit BEFORE the stack dispatch so multi-container // Phase: Preparing — emit BEFORE the stack dispatch so multi-container
// stacks also flip state to Installing immediately. Without this, the // stacks also flip state to Installing immediately. Without this, the
// backend's package state for stack apps stayed empty until the first // backend's package state for stack apps stayed empty until the first
@ -266,7 +254,8 @@ impl RpcHandler {
.await; .await;
if matches!(package_id, "mempool" | "mempool-web") { if matches!(package_id, "mempool" | "mempool-web") {
self.gate_install_deps(package_id).await?; let deps = self.running_deps_for_install(package_id).await?;
check_install_deps(package_id, &deps)?;
check_bitcoin_pruning_compatibility(package_id).await?; check_bitcoin_pruning_compatibility(package_id).await?;
} }
@ -289,11 +278,9 @@ impl RpcHandler {
// Dependency checks. Prefer the scanner's cached package state so a // Dependency checks. Prefer the scanner's cached package state so a
// congested Podman API does not turn an already-running dependency into // congested Podman API does not turn an already-running dependency into
// a false install failure. Fall back to a bounded direct Podman probe // a false install failure. Fall back to a bounded direct Podman probe
// only when the cache does not show the dependency. When the dependency // only when the cache does not show the dependency.
// is installed but not Running yet (the "clicked Install LND 55s before let deps = self.running_deps_for_install(package_id).await?;
// Bitcoin was up" race), wait up to ~3 minutes for it instead of check_install_deps(package_id, &deps)?;
// failing instantly.
let deps = self.gate_install_deps(package_id).await?;
check_bitcoin_pruning_compatibility(package_id).await?; check_bitcoin_pruning_compatibility(package_id).await?;
log_optional_dep_info(package_id, &deps); log_optional_dep_info(package_id, &deps);
let repaired_bitcoin_conf = let repaired_bitcoin_conf =
@ -947,27 +934,6 @@ impl RpcHandler {
} }
} }
/// Bounded dependency gate for installs: passes immediately when deps are
/// running, fails fast (with the phantom-tile marker) when a dependency
/// isn't installed at all, and otherwise waits up to
/// `DEP_WAIT_MAX_ATTEMPTS × DEP_WAIT_INTERVAL` for an installed-but-
/// starting dependency, surfacing "Waiting for X to start…" on the card.
pub(super) async fn gate_install_deps(&self, package_id: &str) -> Result<RunningDeps> {
wait_for_install_deps(
package_id,
|| async {
Ok(DepProbe {
running: self.running_deps_for_install(package_id).await?,
existing: detect_existing_containers().await,
})
},
|msg| async move { self.set_install_message(package_id, &msg).await },
DEP_WAIT_MAX_ATTEMPTS,
DEP_WAIT_INTERVAL,
)
.await
}
// -- Private helpers for install -- // -- Private helpers for install --
/// Pull the image from a registry or verify a local image exists. /// Pull the image from a registry or verify a local image exists.
@ -1318,11 +1284,6 @@ impl RpcHandler {
// Default to full archive — operators with 2TB+ drives shouldn't be // Default to full archive — operators with 2TB+ drives shouldn't be
// silently pruned down to 550 MB. Users who want a pruned node can // silently pruned down to 550 MB. Users who want a pruned node can
// set `prune=N` in bitcoin.conf themselves after install. // set `prune=N` in bitcoin.conf themselves after install.
//
// printtoconsole=0: bitcoind already writes debug.log in the datadir
// (self-shrunk on restart); duplicating it to stdout pushed every IBD
// "UpdateTip" line through conmon into journald (>1 GB/day). Deep
// debugging uses /var/lib/archipelago/bitcoin/debug.log.
let bitcoin_conf = format!( let bitcoin_conf = format!(
"\ "\
# rpcauth: salted hash only - no plaintext password in config or CLI\n\ # rpcauth: salted hash only - no plaintext password in config or CLI\n\
@ -1332,7 +1293,7 @@ rpcallowip=0.0.0.0/0\n\
listen=1\n\ listen=1\n\
rpcthreads=16\n\ rpcthreads=16\n\
rpcworkqueue=256\n\ rpcworkqueue=256\n\
printtoconsole=0\n", printtoconsole=1\n",
rpcauth_line rpcauth_line
); );
tokio::fs::create_dir_all(bitcoin_dir) tokio::fs::create_dir_all(bitcoin_dir)
@ -2466,36 +2427,6 @@ exit 2
} }
} }
/// Persist an install-time version selection for a multi-version app. Selecting
/// the catalog default (or a version equal to it) un-pins so the app tracks
/// latest; selecting any other version pins it. Best-effort: a write failure
/// just means the app installs at the catalog default.
async fn persist_install_version_selection(app_id: &str, version: &str) {
use crate::container::version_config::{read, write, AppVersionConfig};
let is_default = crate::container::app_catalog::catalog_default_version(app_id)
.map(|d| d == version)
.unwrap_or(false);
let existing = read(app_id);
let cfg = AppVersionConfig {
pinned_version: if is_default {
None
} else {
Some(version.to_string())
},
auto_update: existing.auto_update,
};
if let Err(e) = write(app_id, &cfg) {
tracing::warn!(app_id, version, error = %e, "failed to persist install-time version selection");
} else {
tracing::info!(
app_id,
version,
pinned = !is_default,
"persisted install-time version selection"
);
}
}
fn should_try_orchestrator_install(package_id: &str, orchestrator_available: bool) -> bool { fn should_try_orchestrator_install(package_id: &str, orchestrator_available: bool) -> bool {
orchestrator_available && uses_orchestrator_install_flow(package_id) orchestrator_available && uses_orchestrator_install_flow(package_id)
} }


@ -5,7 +5,6 @@ mod install;
mod lifecycle; mod lifecycle;
mod progress; mod progress;
mod runtime; mod runtime;
mod set_config;
mod stacks; mod stacks;
mod update; mod update;
mod validation; mod validation;


@ -61,31 +61,6 @@ impl RpcHandler {
self.state_manager.update_data(data).await; self.state_manager.update_data(data).await;
} }
/// Set a user-facing install status message (e.g. "Waiting for Bitcoin
/// to start…") without disturbing the current phase/byte counters.
pub(super) async fn set_install_message(&self, package_id: &str, message: &str) {
let (mut data, _rev) = self.state_manager.get_snapshot().await;
let entry = data
.package_data
.entry(package_id.to_string())
.or_insert_with(|| create_installing_entry(package_id));
if entry.state != PackageState::Updating {
entry.state = PackageState::Installing;
}
let (size, downloaded, phase) = entry
.install_progress
.as_ref()
.map(|p| (p.size, p.downloaded, p.phase))
.unwrap_or((0, 0, None));
entry.install_progress = Some(InstallProgress {
size,
downloaded,
phase,
message: Some(message.to_string()),
});
self.state_manager.update_data(data).await;
}
/// Clear install progress after pull completes or fails. /// Clear install progress after pull completes or fails.
pub(super) async fn clear_install_progress(&self, package_id: &str) { pub(super) async fn clear_install_progress(&self, package_id: &str) {
let (mut data, _rev) = self.state_manager.get_snapshot().await; let (mut data, _rev) = self.state_manager.get_snapshot().await;


@ -312,16 +312,7 @@ impl RpcHandler {
let mut stopped = 0u32; let mut stopped = 0u32;
let mut removed = 0u32; let mut removed = 0u32;
// Two distinct failure classes, kept separate so they don't get let mut errors = Vec::new();
// conflated (the old single `errors` vec did, which caused the "ghost in
// My Apps" bug): `container_errors` means a container could NOT be
// removed (force-rm failed too) — the app is genuinely still present, so
// we keep its state entry and surface a hard error. `cleanup_errors`
// means volume/network/data-dir teardown left residue — the containers
// are already gone, so the app IS uninstalled and MUST disappear from My
// Apps; the residue is logged but never ghosts the app.
let mut container_errors: Vec<String> = Vec::new();
let mut cleanup_errors: Vec<String> = Vec::new();
self.set_uninstall_stage( self.set_uninstall_stage(
package_id, package_id,
@ -379,7 +370,7 @@ impl RpcHandler {
let msg = let msg =
format!("Failed to remove {}: {}; {}", name, stderr.trim(), e); format!("Failed to remove {}: {}; {}", name, stderr.trim(), e);
tracing::error!("Uninstall {}: {}", package_id, msg); tracing::error!("Uninstall {}: {}", package_id, msg);
container_errors.push(msg); errors.push(msg);
} }
} }
} }
@ -388,35 +379,12 @@ impl RpcHandler {
Err(force_err) => { Err(force_err) => {
let msg = format!("Failed to remove {}: {}; {}", name, e, force_err); let msg = format!("Failed to remove {}: {}; {}", name, e, force_err);
tracing::error!("Uninstall {}: {}", package_id, msg); tracing::error!("Uninstall {}: {}", package_id, msg);
container_errors.push(msg); errors.push(msg);
} }
}, },
} }
} }
// A container that survived even force-remove means the app is NOT
// actually uninstalled — keep its state entry and fail so the spawned
// task reverts it to its prior state (and the user can retry), rather
// than orphaning a live container that's missing from My Apps.
if !container_errors.is_empty() {
tracing::error!(
"Uninstall {}: containers could not be removed: {:?}",
package_id,
container_errors
);
return Err(anyhow::anyhow!(
"Uninstall {} failed: {}",
package_id,
container_errors.join("; ")
));
}
// Containers are gone → the app is uninstalled. Remove its state entry
// NOW, before the (possibly slow, possibly fallible) volume/data
// teardown below, so My Apps updates immediately and a residue failure
// can never leave a ghost. Reinstall/scan no longer see a stale entry.
self.remove_package_state_entry(package_id).await;
self.set_uninstall_stage(package_id, "Cleaning up volumes") self.set_uninstall_stage(package_id, "Cleaning up volumes")
.await; .await;
// Avoid global Podman volume prune on production nodes: store-wide // Avoid global Podman volume prune on production nodes: store-wide
@ -464,73 +432,70 @@ impl RpcHandler {
let stderr = String::from_utf8_lossy(&o.stderr); let stderr = String::from_utf8_lossy(&o.stderr);
let msg = format!("Failed to remove data {}: {}", dir, stderr.trim()); let msg = format!("Failed to remove data {}: {}", dir, stderr.trim());
tracing::error!("Uninstall {}: {}", package_id, msg); tracing::error!("Uninstall {}: {}", package_id, msg);
cleanup_errors.push(msg); errors.push(msg);
} }
Err(e) => { Err(e) => {
let msg = format!("Failed to remove data {}: {}", dir, e); let msg = format!("Failed to remove data {}: {}", dir, e);
tracing::error!("Uninstall {}: {}", package_id, msg); tracing::error!("Uninstall {}: {}", package_id, msg);
cleanup_errors.push(msg); errors.push(msg);
} }
_ => {} _ => {}
} }
} }
} }
// The app is already gone from My Apps (entry removed above). Residual if !errors.is_empty() {
// volume/data cleanup failures are logged but NEVER ghost the app — a
// reinstall and the next uninstall both tolerate leftover dirs.
if !cleanup_errors.is_empty() {
tracing::error!( tracing::error!(
"Uninstall {} removed but left cleanup residue: {:?}", "Uninstall {} completed with errors: {:?}",
package_id, package_id,
cleanup_errors errors
); );
return Err(anyhow::anyhow!(
"Uninstall {} partially failed: {}",
package_id,
errors.join("; ")
));
} }
tracing::info!( tracing::info!(
"Uninstall {} complete: stopped={}, removed={}, cleanup_errors={}", "Uninstall {} complete: stopped={}, removed={}",
package_id, package_id,
stopped, stopped,
removed, removed
cleanup_errors.len()
); );
// Immediately remove from in-memory state so the UI updates without
// waiting for the scanner's absence threshold (3 scans × 60s each).
{
let (mut data, _rev) = self.state_manager.get_snapshot().await;
let before = data.package_data.len();
data.package_data.remove(package_id);
// Also remove any alias keys (e.g. "bitcoin-knots" vs "bitcoin")
let aliases: Vec<String> = data
.package_data
.keys()
.filter(|k| {
super::config::all_container_names(package_id)
.iter()
.any(|c| c.strip_prefix("archy-").unwrap_or(c) == k.as_str())
})
.cloned()
.collect();
for alias in &aliases {
data.package_data.remove(alias);
}
if data.package_data.len() < before {
self.state_manager.update_data(data).await;
}
}
Ok(serde_json::json!({ Ok(serde_json::json!({
"status": "uninstalled", "status": "uninstalled",
"stopped": stopped, "stopped": stopped,
"removed": removed, "removed": removed,
"cleanup_warnings": cleanup_errors,
})) }))
} }
/// Remove a package's entry (and any alias keys) from persisted state so it
/// disappears from My Apps immediately, without waiting for the scanner's
/// absence threshold (3 scans × 60s). Called as soon as an uninstall has
/// removed the app's containers — before the slower volume/data teardown —
/// so a residue failure can never leave a ghost entry behind.
async fn remove_package_state_entry(&self, package_id: &str) {
let (mut data, _rev) = self.state_manager.get_snapshot().await;
let before = data.package_data.len();
data.package_data.remove(package_id);
// Also remove any alias keys (e.g. "bitcoin-knots" vs "bitcoin").
let aliases: Vec<String> = data
.package_data
.keys()
.filter(|k| {
super::config::all_container_names(package_id)
.iter()
.any(|c| c.strip_prefix("archy-").unwrap_or(c) == k.as_str())
})
.cloned()
.collect();
for alias in &aliases {
data.package_data.remove(alias);
}
if data.package_data.len() < before {
self.state_manager.update_data(data).await;
}
}
/// Start a bundled app (create container from pre-loaded image if needed). /// Start a bundled app (create container from pre-loaded image if needed).
pub(in crate::api::rpc) async fn handle_bundled_app_start( pub(in crate::api::rpc) async fn handle_bundled_app_start(
&self, &self,
@ -1603,7 +1568,7 @@ fn manifest_host_ports(container_name: &str) -> Vec<u16> {
Vec::new() Vec::new()
} }
pub(super) fn manifest_apps_dirs() -> Vec<std::path::PathBuf> { fn manifest_apps_dirs() -> Vec<std::path::PathBuf> {
let mut dirs = Vec::new(); let mut dirs = Vec::new();
if let Ok(manifest_dir) = std::env::var("CARGO_MANIFEST_DIR") { if let Ok(manifest_dir) = std::env::var("CARGO_MANIFEST_DIR") {
dirs.push(Path::new(&manifest_dir).join("../../apps")); dirs.push(Path::new(&manifest_dir).join("../../apps"));
@ -1947,17 +1912,6 @@ pub(super) fn orchestrator_uninstall_app_ids(package_id: &str) -> Vec<String> {
"archy-btcpay-db".into(), "archy-btcpay-db".into(),
], ],
"fedimint" => vec!["fedimint".into(), "fedimint-gateway".into()], "fedimint" => vec!["fedimint".into(), "fedimint-gateway".into()],
// Immich: multi-container stack, mirrors `immich_stack_app_ids` in
// stacks.rs. Without this, uninstalling "immich" only disabled the
// orchestrator-tracked "immich" app_id — "immich-postgres" and
// "immich-redis" stayed enabled, so the boot reconciler kept
// restarting their leftover stopped containers forever after the
// generic uninstall path stopped them (`.198`, 2026-07-01).
"immich" => vec![
"immich-postgres".into(),
"immich-redis".into(),
"immich".into(),
],
_ => vec![package_id.to_string()], _ => vec![package_id.to_string()],
} }
} }
@ -1977,19 +1931,4 @@ mod tests {
fn runtime_host_ports_preserve_legacy_extra_ports() { fn runtime_host_ports_preserve_legacy_extra_ports() {
assert_eq!(runtime_host_ports("gitea"), vec![3001, 2222, 3000]); assert_eq!(runtime_host_ports("gitea"), vec![3001, 2222, 3000]);
} }
#[test]
fn immich_uninstall_covers_every_sibling_orchestrator_app_id() {
// Regression: uninstalling "immich" used to only disable the
// "immich" app_id itself, leaving immich-postgres/immich-redis
// enabled — the boot reconciler kept restarting their leftover
// stopped containers forever (.198, 2026-07-01).
let ids = orchestrator_uninstall_app_ids("immich");
for expected in ["immich-postgres", "immich-redis", "immich"] {
assert!(
ids.iter().any(|id| id == expected),
"missing {expected} in {ids:?}"
);
}
}
} }
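
The uninstall hunks above contrast a single `errors` vec with two separate failure classes (`container_errors` and `cleanup_errors`). A minimal sketch of that classification follows, assuming a hypothetical `UninstallOutcome` enum and `classify_uninstall` helper that do not appear in the diff; it only illustrates the decision rule described in the comments, not the actual handler.

```rust
/// Hypothetical outcome type, introduced only for this illustration.
#[derive(Debug, PartialEq)]
enum UninstallOutcome {
    /// At least one container survived removal: the app is still present,
    /// so its state entry should be kept and a hard error surfaced.
    StillInstalled(String),
    /// All containers are gone: the app counts as uninstalled even if
    /// volume or data-dir teardown left residue behind.
    Removed { cleanup_warnings: Vec<String> },
}

/// Hypothetical helper mirroring the rule in the comments: container
/// failures are fatal, cleanup failures are reported as warnings only.
fn classify_uninstall(
    container_errors: &[String],
    cleanup_errors: &[String],
) -> UninstallOutcome {
    if !container_errors.is_empty() {
        return UninstallOutcome::StillInstalled(container_errors.join("; "));
    }
    UninstallOutcome::Removed {
        cleanup_warnings: cleanup_errors.to_vec(),
    }
}
```

Keeping the two classes apart means a leftover data directory can never cause an already-removed app to be reported as still installed, which is the "ghost in My Apps" case the comments describe.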


@ -1,352 +0,0 @@
//! Multi-version support — version listing + in-app version switch / pin /
//! auto-update toggle (`docs/bitcoin-multi-version-design.md` §3 Phase 3).
//!
//! Two RPCs:
//! - `package.versions` — read the selectable versions for an app plus the
//! runner's current pin / auto-update preference and (best-effort) the
//! version actually running. Drives the install modal + "Version & Updates"
//! card.
//! - `package.set-config` — persist a version pin (or un-pin to track latest)
//! and/or the auto-update toggle, then recreate the app at the chosen image
//! when the version actually changed. A DOWNGRADE (older release over a
//! newer chainstate — the highest-risk operation, design §4) is refused
//! unless the caller passes `confirm: true`, so the UI can warn first.
use super::config::get_containers_for_app;
use super::install::install_log;
use super::validation::validate_app_id;
use crate::api::rpc::RpcHandler;
use crate::container::{app_catalog, version_config};
use anyhow::Result;
use std::sync::Arc;
use tracing::{info, warn};
/// Apps that participate in multi-version selection today. Kept narrow on
/// purpose: version switching recreates the container, which is only safe for
/// the single-container, orchestrator-managed Bitcoin backends whose data and
/// downgrade semantics we understand. Any app the catalog gives a `versions[]`
/// list also qualifies (third-party registry apps inherit the capability).
fn supports_versions(app_id: &str) -> bool {
matches!(app_id, "bitcoin-core" | "bitcoin-knots")
|| !app_catalog::catalog_versions(app_id).is_empty()
}
/// Extract the tag from a full image reference, leaving a `registry:port/repo`
/// host-port colon intact (only a colon AFTER the last `/` is a tag).
fn image_tag(image: &str) -> Option<String> {
let after_slash = image.rsplit_once('/').map(|(_, r)| r).unwrap_or(image);
after_slash
.rsplit_once(':')
.map(|(_, tag)| tag.to_string())
.filter(|t| !t.is_empty())
}
/// Best-effort: the version tag of the backend container actually running for
/// `app_id`, by inspecting its image. `None` when not installed or unreadable.
async fn installed_version(app_id: &str) -> Option<String> {
let containers = get_containers_for_app(app_id).await.ok()?;
// Prefer the backend container (exact id / `archy-<id>`) over UI companions.
let name = containers
.iter()
.find(|n| n.as_str() == app_id || n.as_str() == format!("archy-{app_id}"))
.or_else(|| containers.first())?;
let out = tokio::process::Command::new("podman")
.args(["inspect", name, "--format", "{{.ImageName}}"])
.output()
.await
.ok()?;
if !out.status.success() {
return None;
}
let image = String::from_utf8_lossy(&out.stdout).trim().to_string();
let tag = image_tag(&image)?;
// A floating tag (latest/stable/...) names the reference used to CREATE the
// container, not what's actually running — podman never re-resolves it once
// cached, so a stale local `:latest` reports "latest" even when the real
// `latest` moved on months ago (.228, 2026-07-01: ran a 4-month-old cached
// image while a newer one already sat locally, unused). Ask the Bitcoin
// backends directly instead of trusting the tag literal in that case.
if is_floating_tag(&tag) {
if let Some(real) = bitcoind_reported_version(app_id, name).await {
return Some(real);
}
}
Some(tag)
}
fn is_floating_tag(tag: &str) -> bool {
matches!(tag, "latest" | "stable" | "release" | "main")
}
/// Best-effort: ask the running bitcoind binary for its own version, trimmed to
/// the catalog's version-tag format (e.g. `29.3.knots20260210`, `29.2`). `None`
/// for apps other than the Bitcoin backends (no generic way to introspect a
/// third-party image's content version this way) or if the exec fails.
async fn bitcoind_reported_version(app_id: &str, container_name: &str) -> Option<String> {
if !matches!(app_id, "bitcoin-core" | "bitcoin-knots") {
return None;
}
let out = tokio::process::Command::new("podman")
.args(["exec", container_name, "bitcoind", "--version"])
.output()
.await
.ok()?;
if !out.status.success() {
return None;
}
parse_bitcoind_version_output(&String::from_utf8_lossy(&out.stdout))
}
/// Parses e.g. "Bitcoin Knots daemon version v29.3.knots20260210\n..." or
/// "Bitcoin Core version v29.2.0\n..." down to the version tag after `version v`.
fn parse_bitcoind_version_output(output: &str) -> Option<String> {
let first_line = output.lines().next()?;
let (_, version) = first_line.rsplit_once("version v")?;
let version = version.trim();
if version.is_empty() {
return None;
}
Some(version.to_string())
}
impl RpcHandler {
/// `package.versions` — what a runner can install / switch to for this app,
/// plus their current preference and the running version.
pub(in crate::api::rpc) async fn handle_package_versions(
&self,
params: Option<serde_json::Value>,
) -> Result<serde_json::Value> {
let params = params.ok_or_else(|| anyhow::anyhow!("Missing params"))?;
let app_id = params
.get("id")
.and_then(|v| v.as_str())
.ok_or_else(|| anyhow::anyhow!("Missing package id"))?;
validate_app_id(app_id)?;
let versions = app_catalog::catalog_versions(app_id);
let default = app_catalog::catalog_default_version(app_id);
let cfg = version_config::read(app_id);
let installed = installed_version(app_id).await;
Ok(serde_json::json!({
"id": app_id,
"supportsVersions": supports_versions(app_id),
"default": default,
"installedVersion": installed,
"pinnedVersion": cfg.pinned_version,
"autoUpdate": cfg.auto_update,
"versions": versions.iter().map(|v| serde_json::json!({
"version": v.version,
"default": v.default,
"deprecated": v.deprecated,
"eol": v.eol,
})).collect::<Vec<_>>(),
}))
}
/// `package.set-config` — persist version pin + auto-update preference and
/// recreate on an actual version change. Downgrades require `confirm:true`.
pub(in crate::api::rpc) async fn handle_package_set_config(
self: Arc<Self>,
params: Option<serde_json::Value>,
) -> Result<serde_json::Value> {
let params = params.ok_or_else(|| anyhow::anyhow!("Missing params"))?;
let app_id = params
.get("id")
.and_then(|v| v.as_str())
.ok_or_else(|| anyhow::anyhow!("Missing package id"))?
.to_string();
validate_app_id(&app_id)?;
if !supports_versions(&app_id) {
return Err(anyhow::anyhow!(
"{} has no selectable versions in the catalog",
app_id
));
}
let confirm = params
.get("confirm")
.and_then(|v| v.as_bool())
.unwrap_or(false);
let existing = version_config::read(&app_id);
let default = app_catalog::catalog_default_version(&app_id);
// ---- Resolve the requested pin (if a version was supplied) ----------
// Absent `version` => leave the pin unchanged (an auto-update-only edit).
// `version == default` => un-pin (track latest). Any other version must
// exist in the catalog and resolve to a same-repo image, else reject.
let version_param = params
.get("version")
.and_then(|v| v.as_str())
.map(str::to_string);
let mut new_pin = existing.pinned_version.clone();
let mut version_changed = false;
if let Some(req) = version_param.as_deref() {
let resolved_pin = if default.as_deref() == Some(req) {
None // selecting the default un-pins
} else {
// Validate the version is real + same-repo before pinning.
if !app_catalog::catalog_versions(&app_id)
.iter()
.any(|v| v.version == req)
{
return Err(anyhow::anyhow!(
"version {} is not offered for {}",
req,
app_id
));
}
Some(req.to_string())
};
version_changed = resolved_pin != existing.pinned_version;
new_pin = resolved_pin;
}
let new_auto_update = params
.get("autoUpdate")
.and_then(|v| v.as_bool())
.unwrap_or(existing.auto_update);
// ---- Downgrade gate (design §4: warn + confirm + allow) -------------
// "Current" = what wrote the on-disk chainstate: the running version if
// we can read it, else the existing pin, else the catalog default.
if version_changed {
let target = version_param.as_deref().unwrap_or_default();
let current = installed_version(&app_id)
.await
.or_else(|| existing.pinned_version.clone())
.or_else(|| default.clone());
if let Some(current) = current {
if version_config::is_downgrade(&current, target) && !confirm {
warn!(
"set-config {}: refusing un-confirmed downgrade {} -> {}",
app_id, current, target
);
return Ok(serde_json::json!({
"status": "confirm_required",
"kind": "downgrade",
"id": app_id,
"currentVersion": current,
"targetVersion": target,
"warning": format!(
"Switching {app_id} from {current} down to {target} is a \
downgrade. Bitcoin may refuse to start on a chainstate \
written by the newer version without a full reindex, and \
a pruned node can lose block data. Re-confirm to proceed."
),
}));
}
}
}
// ---- Persist preference --------------------------------------------
version_config::write(
&app_id,
&version_config::AppVersionConfig {
pinned_version: new_pin.clone(),
auto_update: new_auto_update,
},
)?;
install_log(&format!(
"SET-CONFIG {}: pinned={:?} autoUpdate={} (version_changed={})",
app_id, new_pin, new_auto_update, version_changed
))
.await;
info!(
app_id = %app_id,
pinned = ?new_pin,
auto_update = new_auto_update,
version_changed,
"package.set-config applied"
);
// ---- Recreate when the version actually changed + app is installed --
// The orchestrator's install/recreate path reads the pin we just wrote
// (prod_orchestrator image resolution), so reusing the update machinery
// pulls + recreates at the chosen image. An auto-update-only edit, or a
// change to a not-installed app, just persists the preference.
let mut recreating = false;
if version_changed {
let installed = get_containers_for_app(&app_id)
.await
.map(|c| !c.is_empty())
.unwrap_or(false);
if installed {
recreating = true;
// Fire the existing async update flow; it flips state to
// Updating and recreates honoring the new pin. The UI polls.
self.clone()
.spawn_package_update(Some(serde_json::json!({ "id": app_id })))
.await?;
}
}
Ok(serde_json::json!({
"status": "ok",
"id": app_id,
"pinnedVersion": new_pin,
"autoUpdate": new_auto_update,
"versionChanged": version_changed,
"recreating": recreating,
}))
}
}
#[cfg(test)]
mod tests {
use super::{image_tag, is_floating_tag, parse_bitcoind_version_output};
#[test]
fn floating_tag_detects_generic_channel_names() {
for tag in ["latest", "stable", "release", "main"] {
assert!(is_floating_tag(tag), "{tag}");
}
for tag in ["29.3.knots20260508", "28.4", "v29.2.0"] {
assert!(!is_floating_tag(tag), "{tag}");
}
}
#[test]
fn parses_knots_version_line() {
assert_eq!(
parse_bitcoind_version_output(
"Bitcoin Knots daemon version v29.3.knots20260210\nCopyright...\n"
)
.as_deref(),
Some("29.3.knots20260210")
);
}
#[test]
fn parses_core_version_line() {
assert_eq!(
parse_bitcoind_version_output("Bitcoin Core version v29.2.0\n").as_deref(),
Some("29.2.0")
);
}
#[test]
fn parse_returns_none_when_output_has_no_version_marker() {
assert_eq!(parse_bitcoind_version_output("garbage output\n"), None);
assert_eq!(parse_bitcoind_version_output(""), None);
}
#[test]
fn image_tag_keeps_registry_port_colon() {
assert_eq!(
image_tag("146.59.87.168:3000/lfg2025/bitcoin:28.4").as_deref(),
Some("28.4")
);
assert_eq!(
image_tag("146.59.87.168:3000/lfg2025/bitcoin-knots:29.3.knots20260508").as_deref(),
Some("29.3.knots20260508")
);
// No tag => None (don't mistake the registry port for a tag).
assert_eq!(image_tag("146.59.87.168:3000/lfg2025/bitcoin"), None);
assert_eq!(
image_tag("docker.io/library/redis:7"),
Some("7".to_string())
);
}
}
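
The downgrade gate above calls `version_config::is_downgrade`, whose body is not part of this diff. The following is a rough sketch of how such a comparison could work, assuming version tags shaped like `29.3.knots20260210` or `v29.2.0`; `numeric_prefix` and `is_downgrade_sketch` are hypothetical names, and the real implementation may differ.

```rust
/// Hypothetical helper: the leading numeric components of a version tag,
/// e.g. "29.3.knots20260210" -> [29, 3] and "v29.2.0" -> [29, 2].
/// Trailing zeros are dropped so "29.2" and "29.2.0" compare as equal.
fn numeric_prefix(version: &str) -> Vec<u64> {
    let mut parts: Vec<u64> = version
        .trim_start_matches('v')
        .split('.')
        // Stop at the first non-numeric component (e.g. "knots20260210").
        .map_while(|part| part.parse::<u64>().ok())
        .collect();
    while parts.last() == Some(&0) {
        parts.pop();
    }
    parts
}

/// Hypothetical stand-in for `version_config::is_downgrade`: true when the
/// target's numeric prefix sorts strictly below the current one.
fn is_downgrade_sketch(current: &str, target: &str) -> bool {
    numeric_prefix(target) < numeric_prefix(current)
}
```

This sketch ignores non-numeric suffixes, so two Knots builds that differ only in their date stamp compare as equal. Whether the real function treats that case as a downgrade is not visible in the diff.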


@ -6,6 +6,7 @@
use crate::api::rpc::RpcHandler; use crate::api::rpc::RpcHandler;
use crate::data_model::InstallPhase; use crate::data_model::InstallPhase;
use anyhow::{Context, Result}; use anyhow::{Context, Result};
use base64::Engine;
use std::process::Output; use std::process::Output;
use std::time::Duration; use std::time::Duration;
use tracing::info; use tracing::info;
@ -695,16 +696,6 @@ fn immich_stack_app_ids() -> &'static [&'static str] {
&["immich-postgres", "immich-redis", "immich"] &["immich-postgres", "immich-redis", "immich"]
} }
fn netbird_stack_app_ids() -> &'static [&'static str] {
// Dependency/startup order: the combined management/signal/relay server
// first (it owns the base64 relay/store secrets + the sqlite store, and is
// the OIDC issuer the others point at), then the dashboard SPA, then the
// user-facing TLS proxy ("netbird", which carries the self-signed cert +
// the templated nginx.conf and is the launcher). Mirrors the netbird
// startup_order in dependencies.rs.
&["netbird-server", "netbird-dashboard", "netbird"]
}
fn indeedhub_stack_app_ids() -> &'static [&'static str] { fn indeedhub_stack_app_ids() -> &'static [&'static str] {
// Dependency order: backends + their generated secrets first, then the api // Dependency order: backends + their generated secrets first, then the api
// (owns indeedhub-jwt; reads the db/minio secrets the backends materialised), // (owns indeedhub-jwt; reads the db/minio secrets the backends materialised),
@ -724,6 +715,10 @@ fn indeedhub_stack_app_ids() -> &'static [&'static str] {
const REGISTRY: &str = "146.59.87.168:3000/lfg2025"; const REGISTRY: &str = "146.59.87.168:3000/lfg2025";
const NETBIRD_DASHBOARD_IMAGE: &str = "docker.io/netbirdio/dashboard:v2.38.0";
const NETBIRD_SERVER_IMAGE: &str = "docker.io/netbirdio/netbird-server:0.71.2";
const NETBIRD_PROXY_IMAGE: &str = "docker.io/library/nginx:1.27-alpine";
/// Pull an image with retry and exponential backoff (3 attempts). /// Pull an image with retry and exponential backoff (3 attempts).
async fn pull_image_with_retry(image: &str) -> Result<()> { async fn pull_image_with_retry(image: &str) -> Result<()> {
let exists = podman_stack_status(&["image", "exists", image], PODMAN_STACK_PROBE_TIMEOUT).await; let exists = podman_stack_status(&["image", "exists", image], PODMAN_STACK_PROBE_TIMEOUT).await;
@ -1009,9 +1004,9 @@ impl RpcHandler {
return Ok(adopted); return Ok(adopted);
} }
// Dependency check: Bitcoin must be running. Bounded wait covers the // Dependency check: Bitcoin must be running
// "installed but still starting" race instead of failing instantly. let deps = super::dependencies::detect_running_deps().await?;
self.gate_install_deps("btcpay-server").await?; super::dependencies::check_install_deps("btcpay-server", &deps)?;
install_log("INSTALL START: btcpay-server (stack: postgres + nbxplorer + btcpay)").await; install_log("INSTALL START: btcpay-server (stack: postgres + nbxplorer + btcpay)").await;
@ -1833,27 +1828,6 @@ impl RpcHandler {
/// Install self-hosted NetBird (dashboard + combined management/signal/relay server). /// Install self-hosted NetBird (dashboard + combined management/signal/relay server).
pub(super) async fn install_netbird_stack(&self) -> Result<serde_json::Value> { pub(super) async fn install_netbird_stack(&self) -> Result<serde_json::Value> {
// Manifest-driven path (#20 phase 4): render the 3-member stack from
// apps/netbird-*/manifest.yml via the orchestrator — dedicated
// netbird-net + network_aliases, base64 generated_secrets, a self-signed
// TLS cert (generated_certs) so the dashboard gets a secure context for
// OIDC PKCE (#15), and templated config.yaml/nginx.conf rendered from
// host facts + the netbird-net gateway. The manifests use the exact live
// container names, so on an existing node this ADOPTS the running stack
// rather than recreating it (the sqlite store + base64 keys are
// preserved — ensure_generated_secrets no-ops on existing files).
//
// #20 ph4: the legacy hardcoded `podman run` installer was DELETED — the
// signed catalog always ships apps/netbird-*/manifest.yml, so there is no
// in-Rust fallback. If the orchestrator doesn't know these app_ids and no
// running stack exists to adopt, install errors rather than silently
// diverging from the manifest contract.
if let Some(orchestrated) =
install_stack_via_orchestrator(self, "netbird", netbird_stack_app_ids()).await?
{
return Ok(orchestrated);
}
if let Some(adopted) = adopt_stack_if_exists( if let Some(adopted) = adopt_stack_if_exists(
"netbird", "netbird",
"netbird", "netbird",
@ -1864,12 +1838,491 @@ impl RpcHandler {
return Ok(adopted); return Ok(adopted);
} }
anyhow::bail!( install_log("INSTALL START: netbird stack (dashboard + server)").await;
"netbird manifests not available on this node — the signed catalog must provide apps/netbird-*/manifest.yml (legacy hardcoded installer removed in #20 ph4)" info!("Installing self-hosted NetBird stack");
self.set_install_phase("netbird", InstallPhase::PullingImage)
.await;
for (i, image) in [
NETBIRD_DASHBOARD_IMAGE,
NETBIRD_SERVER_IMAGE,
NETBIRD_PROXY_IMAGE,
]
.iter()
.enumerate()
{
self.set_install_progress("netbird", i as u64, 3).await;
pull_image_with_retry(image)
.await
.with_context(|| format!("Failed to pull NetBird image: {}", image))?;
}
self.set_install_progress("netbird", 3, 3).await;
for name in ["netbird", "netbird-dashboard", "netbird-server"] {
let _ = podman_stack_status(&["rm", "-f", name], PODMAN_STACK_PROBE_TIMEOUT).await;
}
let _ = podman_stack_status(
&["network", "rm", "-f", "netbird-net"],
PODMAN_STACK_PROBE_TIMEOUT,
) )
.await;
self.set_install_phase("netbird", InstallPhase::CreatingContainer)
.await;
tokio::fs::create_dir_all("/var/lib/archipelago/netbird/data")
.await
.context("Failed to create NetBird data directory")?;
let host_ip = detect_netbird_public_host_ip()
.await
.unwrap_or_else(|| self.config.host_ip.clone());
// Create the network FIRST so we can read back the gateway it was
// assigned — that gateway is Podman's aardvark DNS, which the proxy's
// nginx needs as an explicit `resolver` to re-resolve container names
// (issue #15: without it nginx caches a container IP and 502s forever
// once that IP changes on restart/reboot).
let _ = podman_stack_status(
&["network", "create", "netbird-net"],
PODMAN_STACK_PROBE_TIMEOUT,
)
.await;
let resolver_ip = netbird_net_resolver_ip().await;
write_netbird_config_files(&host_ip, &self.config.host_ip, &resolver_ip).await?;
ensure_netbird_tls_cert(&host_ip).await?;
let mut server_cmd = tokio::process::Command::new("podman");
server_cmd.args([
"run",
"-d",
"--name",
"netbird-server",
"--network",
"netbird-net",
"--network-alias",
"netbird-server",
"--restart=unless-stopped",
"-p",
"8086:80",
"-p",
"3478:3478/udp",
"-v",
"/var/lib/archipelago/netbird/data:/var/lib/netbird",
"-v",
"/var/lib/archipelago/netbird/config.yaml:/etc/netbird/config.yaml:ro",
NETBIRD_SERVER_IMAGE,
"--config",
"/etc/netbird/config.yaml",
]);
run_required_stack_command("netbird", "create server", &mut server_cmd).await?;
self.set_install_phase("netbird", InstallPhase::StartingContainer)
.await;
tokio::time::sleep(std::time::Duration::from_secs(5)).await;
let mut dashboard_cmd = tokio::process::Command::new("podman");
dashboard_cmd.args([
"run",
"-d",
"--name",
"netbird-dashboard",
"--network",
"netbird-net",
// Explicit alias so the proxy can always resolve `netbird-dashboard`
// via Podman DNS — don't rely on implicit container-name aliasing.
"--network-alias",
"netbird-dashboard",
"--restart=unless-stopped",
"--env-file",
"/var/lib/archipelago/netbird/dashboard.env",
NETBIRD_DASHBOARD_IMAGE,
]);
run_required_stack_command("netbird", "create dashboard", &mut dashboard_cmd).await?;
let mut proxy_cmd = tokio::process::Command::new("podman");
proxy_cmd.args([
"run",
"-d",
"--name",
"netbird",
"--network",
"netbird-net",
"--restart=unless-stopped",
// 8087 publishes the TLS listener — netbird's dashboard requires a
// secure context (window.crypto.subtle / OIDC PKCE), issue #15.
"-p",
"8087:443",
"-v",
"/var/lib/archipelago/netbird/nginx.conf:/etc/nginx/conf.d/default.conf:ro",
"-v",
"/var/lib/archipelago/netbird/tls.crt:/etc/nginx/tls.crt:ro",
"-v",
"/var/lib/archipelago/netbird/tls.key:/etc/nginx/tls.key:ro",
NETBIRD_PROXY_IMAGE,
]);
run_required_stack_command("netbird", "create unified proxy", &mut proxy_cmd).await?;
wait_for_stack_containers(
"netbird",
&["netbird-server", "netbird-dashboard", "netbird"],
60,
)
.await?;
self.set_install_phase("netbird", InstallPhase::WaitingHealthy)
.await;
// Containers being "running" is NOT the same as the embedded OIDC
// provider being ready (#10). The dashboard SPA opens right after install
// and, if it loads before /oauth2/.well-known is served, caches a bad
// auth state — the user appears logged-in but can't log out until it
// self-corrects. Wait (best-effort) for OIDC discovery to answer before
// we report Done, so the first dashboard load sees a ready provider.
wait_for_netbird_oidc_ready(Duration::from_secs(60)).await;
self.set_install_phase("netbird", InstallPhase::PostInstall)
.await;
self.set_install_phase("netbird", InstallPhase::Done).await;
self.clear_install_progress("netbird").await;
install_log("INSTALL OK: netbird stack").await;
info!("NetBird stack installed");
Ok(serde_json::json!({
"success": true,
"package_id": "netbird",
"message": "NetBird self-hosted stack installed",
}))
} }
} }
/// Best-effort wait for NetBird's embedded OIDC provider to start serving its
/// discovery document. The management server publishes 8086:80 on the host and
/// is the issuer at `/oauth2`, so its `.well-known/openid-configuration` is the
/// signal that the dashboard's login/logout flow will work. Polls until a 2xx
/// or the timeout — NEVER fails the install (the stack is already running; this
/// only narrows the post-install race window in #10).
async fn wait_for_netbird_oidc_ready(timeout: Duration) {
let url = "http://127.0.0.1:8086/oauth2/.well-known/openid-configuration";
let client = match reqwest::Client::builder()
.timeout(Duration::from_secs(5))
.build()
{
Ok(c) => c,
Err(_) => return,
};
let deadline = tokio::time::Instant::now() + timeout;
loop {
if let Ok(resp) = client.get(url).send().await {
if resp.status().is_success() {
info!("NetBird OIDC discovery is ready");
return;
}
}
if tokio::time::Instant::now() >= deadline {
info!("NetBird OIDC discovery not ready within timeout — proceeding anyway");
return;
}
tokio::time::sleep(Duration::from_secs(2)).await;
}
}
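The best-effort wait above boils down to a poll-with-deadline loop. A minimal synchronous sketch of that pattern follows, assuming a hypothetical `poll_until` helper and a caller-supplied probe closure; the code in this diff uses `tokio` and `reqwest` instead, so this only illustrates the control flow.

```rust
use std::thread::sleep;
use std::time::{Duration, Instant};

/// Hypothetical helper (not part of the diff): call `probe` until it
/// returns true or `timeout` elapses. Returns whether the probe ever
/// succeeded; it never errors, mirroring the "never fail the install" rule.
fn poll_until<F: FnMut() -> bool>(mut probe: F, timeout: Duration, interval: Duration) -> bool {
    let deadline = Instant::now() + timeout;
    loop {
        // Probe first so an already-ready service returns immediately.
        if probe() {
            return true;
        }
        // Check the deadline after the probe, so at least one attempt runs.
        if Instant::now() >= deadline {
            return false;
        }
        sleep(interval);
    }
}
```

Because the deadline is checked after each probe, the total wait can exceed `timeout` by up to one probe duration plus one interval, which is also true of the async version above.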
async fn read_or_generate_b64_secret(name: &str) -> String {
let path = format!("/var/lib/archipelago/secrets/{}", name);
if let Ok(val) = tokio::fs::read_to_string(&path).await {
let trimmed = val.trim().to_string();
if !trimmed.is_empty() {
return trimmed;
}
}
let mut buf = [0u8; 32];
rand::RngCore::fill_bytes(&mut rand::rngs::OsRng, &mut buf);
let secret = base64::engine::general_purpose::STANDARD.encode(buf);
let _ = tokio::fs::create_dir_all("/var/lib/archipelago/secrets").await;
let _ = tokio::fs::write(&path, &secret).await;
secret
}
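The read-or-generate pattern above can be sketched with the standard library alone. This is an illustrative variant, not the code in the diff: `read_or_init` and its `generate` closure are hypothetical names, and unlike the original it propagates write errors and (on Unix) creates the file with mode 0600.

```rust
use std::fs;
use std::io::{self, Write};
use std::path::Path;

/// Hypothetical helper: return the trimmed contents of `path` if it holds a
/// non-empty value, otherwise call `generate`, persist the result, and return it.
fn read_or_init<F: FnOnce() -> String>(path: &Path, generate: F) -> io::Result<String> {
    if let Ok(existing) = fs::read_to_string(path) {
        let trimmed = existing.trim();
        if !trimmed.is_empty() {
            return Ok(trimmed.to_string());
        }
    }
    if let Some(parent) = path.parent() {
        fs::create_dir_all(parent)?;
    }
    let secret = generate();
    let mut opts = fs::OpenOptions::new();
    opts.write(true).create(true).truncate(true);
    #[cfg(unix)]
    {
        use std::os::unix::fs::OpenOptionsExt;
        // Assumption: owner-only access is wanted. The mode only applies
        // when the file is newly created, not to an existing file.
        opts.mode(0o600);
    }
    let mut file = opts.open(path)?;
    file.write_all(secret.as_bytes())?;
    Ok(secret)
}
```

Passing the generator as a closure keeps the sketch free of external crates; in the diff the value comes from `OsRng` and base64 encoding.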
/// Read the gateway of the `netbird-net` bridge. Podman runs its aardvark DNS
/// resolver on this address, so nginx can use it as an explicit `resolver` to
/// re-resolve container names at request time. Falls back to Podman's usual
/// first-pool gateway if the inspect fails (best effort — config is rewritten
/// on every (re)install).
async fn netbird_net_resolver_ip() -> String {
let out = tokio::process::Command::new("podman")
.args([
"network",
"inspect",
"netbird-net",
"--format",
"{{range .Subnets}}{{.Gateway}}{{end}}",
])
.output()
.await;
if let Ok(o) = out {
let gw = String::from_utf8_lossy(&o.stdout).trim().to_string();
if !gw.is_empty() && gw.parse::<std::net::IpAddr>().is_ok() {
return gw;
}
}
"10.89.0.1".to_string()
}
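The parse-or-fall-back step above can be isolated into a pure function, which makes the fallback behaviour easy to test. A minimal sketch, assuming a hypothetical `parse_gateway` helper that takes the raw `podman network inspect` output:

```rust
use std::net::IpAddr;

/// Hypothetical helper: accept the command output only if it is exactly one
/// IP address after trimming; otherwise use the fallback.
fn parse_gateway(stdout: &str, fallback: &str) -> String {
    match stdout.trim().parse::<IpAddr>() {
        Ok(ip) => ip.to_string(),
        Err(_) => fallback.to_string(),
    }
}
```

One consequence worth noting: the `{{range .Subnets}}{{.Gateway}}{{end}}` template emits gateways with no separator, so if the network ever had more than one subnet the concatenated string would not parse and the fallback would be used.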
/// Generate a self-signed TLS cert for the netbird proxy if absent. The
/// dashboard needs a secure context (window.crypto.subtle / OIDC PKCE), so the
/// proxy serves HTTPS; a self-signed cert is sufficient (the user accepts it
/// once when opening netbird in a tab). SAN covers the LAN IP plus
/// localhost/127.0.0.1 so it's valid however the box is reached locally.
async fn ensure_netbird_tls_cert(host_ip: &str) -> Result<()> {
let dir = "/var/lib/archipelago/netbird";
let crt = format!("{dir}/tls.crt");
let key = format!("{dir}/tls.key");
if tokio::fs::metadata(&crt).await.is_ok() && tokio::fs::metadata(&key).await.is_ok() {
return Ok(());
}
let _ = tokio::fs::create_dir_all(dir).await;
let san = format!("subjectAltName=IP:{host_ip},IP:127.0.0.1,DNS:localhost");
let status = tokio::process::Command::new("openssl")
.args([
"req",
"-x509",
"-newkey",
"rsa:2048",
"-nodes",
"-keyout",
&key,
"-out",
&crt,
"-days",
"3650",
"-subj",
&format!("/CN={host_ip}"),
"-addext",
&san,
])
.status()
.await
.context("failed to run openssl for netbird TLS cert")?;
if !status.success() {
anyhow::bail!("openssl failed to generate netbird TLS cert");
}
Ok(())
}
async fn write_netbird_config_files(host_ip: &str, lan_ip: &str, resolver_ip: &str) -> Result<()> {
// netbird's dashboard uses window.crypto.subtle (OIDC PKCE), which browsers
// only expose in a SECURE context — so the proxy serves HTTPS and every
// origin here is https (issue #15: over plain http the dashboard threw
// "window.crypto.subtle is unavailable" and never reached login).
let public_origin = format!("https://{}:8087", host_ip);
let server_origin = format!("http://{}:8086", host_ip);
// A single box is reached via several addresses. Allow the OIDC login flow
// to redirect back to whichever origin the user actually used, otherwise
// post-login lands on the wrong host and the dashboard shows
// "Unauthenticated" (issue #15). The browser-side CORS is handled in the
// nginx proxy; this covers the redirect-URI allow-list.
let lan_origin = format!("https://{}:8087", lan_ip);
let mut redirect_origins = vec![public_origin.clone()];
if lan_origin != public_origin {
redirect_origins.push(lan_origin);
}
let dashboard_redirect_uris = redirect_origins
.iter()
.flat_map(|o| {
[
format!(" - \"{o}/nb-auth\""),
format!(" - \"{o}/nb-silent-auth\""),
]
})
.collect::<Vec<_>>()
.join("\n");
let dashboard_logout_uris = redirect_origins
.iter()
.map(|o| format!(" - \"{o}/\""))
.collect::<Vec<_>>()
.join("\n");
let relay_secret = read_or_generate_b64_secret("netbird-relay-auth-secret").await;
let encryption_key = read_or_generate_b64_secret("netbird-store-encryption-key").await;
let config = format!(
r#"server:
listenAddress: ":80"
exposedAddress: "{public_origin}"
stunPorts:
- 3478
metricsPort: 9090
healthcheckAddress: ":9000"
logLevel: "info"
logFile: "console"
authSecret: "{relay_secret}"
dataDir: "/var/lib/netbird"
auth:
issuer: "{public_origin}/oauth2"
localAuthDisabled: false
signKeyRefreshEnabled: false
dashboardRedirectURIs:
{dashboard_redirect_uris}
dashboardPostLogoutRedirectURIs:
{dashboard_logout_uris}
cliRedirectURIs:
- "http://localhost:53000/"
store:
engine: "sqlite"
encryptionKey: "{encryption_key}"
"#
);
tokio::fs::write("/var/lib/archipelago/netbird/config.yaml", config)
.await
.context("Failed to write NetBird config.yaml")?;
let dashboard_env = format!(
r#"NETBIRD_MGMT_API_ENDPOINT={public_origin}
NETBIRD_MGMT_GRPC_API_ENDPOINT={public_origin}
AUTH_AUDIENCE=netbird-dashboard
AUTH_CLIENT_ID=netbird-dashboard
AUTH_CLIENT_SECRET=
AUTH_AUTHORITY={public_origin}/oauth2
USE_AUTH0=false
AUTH_SUPPORTED_SCOPES=openid profile email groups
AUTH_REDIRECT_URI=/nb-auth
AUTH_SILENT_REDIRECT_URI=/nb-silent-auth
NETBIRD_TOKEN_SOURCE=idToken
NGINX_SSL_PORT=443
LETSENCRYPT_DOMAIN=none
"#
);
tokio::fs::write("/var/lib/archipelago/netbird/dashboard.env", dashboard_env)
.await
.context("Failed to write NetBird dashboard.env")?;
let nginx_conf = format!(
r#"server {{
listen 443 ssl;
server_name _;
# netbird's dashboard needs a secure context (window.crypto.subtle for OIDC
# PKCE), so the proxy terminates TLS with a self-signed cert (issue #15).
ssl_certificate /etc/nginx/tls.crt;
ssl_certificate_key /etc/nginx/tls.key;
# Rootless Podman can hand a container a new IP across restarts/reboots.
# nginx resolves a literal upstream name ONCE at startup and caches it, so
# after the IP moves every request 502s with "host unreachable" (issue #15,
# observed live on .198: nginx pinned to a dead netbird-dashboard IP). Fix:
# point `resolver` at the netbird-net gateway (Podman's aardvark DNS) and
# use VARIABLE upstreams, which forces nginx to re-resolve the container
# names at request time. Everything is reached container-to-container by
# name so nothing depends on host-published ports either.
resolver {resolver_ip} valid=10s ipv6=off;
proxy_set_header Host $host;
proxy_set_header X-Real-IP $remote_addr;
proxy_set_header X-Forwarded-For $proxy_add_x_forwarded_for;
proxy_set_header X-Forwarded-Proto $scheme;
proxy_http_version 1.1;
location ~ ^/(relay|ws-proxy/) {{
set $nb_server netbird-server;
proxy_pass http://$nb_server:80;
proxy_set_header Upgrade $http_upgrade;
proxy_set_header Connection "upgrade";
proxy_read_timeout 1d;
}}
location ~ ^/(api|oauth2)(/|$) {{
# The dashboard is a SPA whose API/OIDC base URL is baked at build time
# to one host:port. A single box is reached via several addresses (LAN
# IP, Tailscale 100.x, hostname), so those fetches are cross-origin and
# the browser blocks them with no Access-Control-Allow-Origin (issue
# #15, observed live on .198). Reflect the caller's Origin so the
# self-hosted management/OIDC API is reachable from any of them, and
# answer the CORS preflight here.
if ($request_method = OPTIONS) {{
add_header Access-Control-Allow-Origin $http_origin always;
add_header Access-Control-Allow-Credentials true always;
add_header Access-Control-Allow-Methods "GET, POST, PUT, PATCH, DELETE, OPTIONS" always;
add_header Access-Control-Allow-Headers "Authorization, Content-Type, Accept" always;
add_header Access-Control-Max-Age 86400 always;
add_header Content-Length 0;
return 204;
}}
add_header Access-Control-Allow-Origin $http_origin always;
add_header Access-Control-Allow-Credentials true always;
add_header Access-Control-Allow-Methods "GET, POST, PUT, PATCH, DELETE, OPTIONS" always;
add_header Access-Control-Allow-Headers "Authorization, Content-Type, Accept" always;
set $nb_server netbird-server;
proxy_pass http://$nb_server:80;
}}
location ~ ^/(signalexchange\.SignalExchange|management\.ManagementService|management\.ProxyService)/ {{
set $nb_server netbird-server;
grpc_pass grpc://$nb_server:80;
grpc_read_timeout 1d;
grpc_send_timeout 1d;
}}
# OIDC callback routes are client-side SPA routes with NO prebuilt page in
# the dashboard bundle, so proxying them straight through returns 404, which
# crashes the dashboard's auth init and shows "Unauthenticated" with dead
# buttons (issue #15, confirmed live on .198: /nb-auth + /nb-silent-auth
# returned 404). Serve the dashboard's index.html at these paths (URL
# unchanged) so react-oidc boots and completes the login / silent-SSO.
location ~ ^/(nb-auth|nb-silent-auth) {{
set $nb_dashboard netbird-dashboard;
rewrite ^.*$ /index.html break;
proxy_pass http://$nb_dashboard:80;
}}
location / {{
set $nb_dashboard netbird-dashboard;
proxy_pass http://$nb_dashboard:80;
}}
}}
# Direct server remains available for diagnostics at {server_origin}.
"#
);
tokio::fs::write("/var/lib/archipelago/netbird/nginx.conf", nginx_conf)
.await
.context("Failed to write NetBird nginx.conf")?;
Ok(())
}
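The redirect-URI allow-list built near the top of this function can be sketched on its own. The function name `dashboard_redirect_uris` is hypothetical; the port 8087 and the `/nb-auth` and `/nb-silent-auth` paths are taken from the code above. The sketch returns plain URIs rather than pre-indented YAML list items.

```rust
/// Hypothetical helper: one callback pair per distinct origin, so login can
/// redirect back to whichever address the user actually browsed to.
fn dashboard_redirect_uris(host_ip: &str, lan_ip: &str) -> Vec<String> {
    let public_origin = format!("https://{host_ip}:8087");
    let lan_origin = format!("https://{lan_ip}:8087");
    let mut origins = vec![public_origin.clone()];
    // Skip the LAN origin when it is the same as the public one.
    if lan_origin != public_origin {
        origins.push(lan_origin);
    }
    origins
        .iter()
        .flat_map(|o| [format!("{o}/nb-auth"), format!("{o}/nb-silent-auth")])
        .collect()
}
```

For example, `dashboard_redirect_uris("192.168.1.5", "192.168.1.5")` yields two entries, while passing two different addresses yields four.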
async fn detect_netbird_public_host_ip() -> Option<String> {
let output = tokio::process::Command::new("hostname")
.args(["-I"])
.output()
.await
.ok()?;
let stdout = String::from_utf8_lossy(&output.stdout);
let ips: Vec<&str> = stdout
.split_whitespace()
.filter(|s| s.contains('.'))
.collect();
// Prefer the LAN address as the canonical origin — that's what users browse
// to on the local network. Baking the Tailscale 100.x address here broke
// LAN access with cross-origin/redirect mismatches (issue #15). Tailscale
// (100.64.0.0/10 CGNAT) is only a fallback for nodes with no LAN IP.
let is_private_lan = |ip: &str| {
ip.starts_with("192.168.")
|| ip.starts_with("10.")
|| (ip.starts_with("172.")
&& ip
.split('.')
.nth(1)
.and_then(|o| o.parse::<u8>().ok())
.map(|o| (16..=31).contains(&o))
.unwrap_or(false))
};
if let Some(lan) = ips.iter().find(|ip| is_private_lan(ip)) {
return Some(lan.to_string());
}
ips.iter()
.find(|ip| ip.starts_with("100."))
.map(|s| s.to_string())
}
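The address selection above works on string prefixes, so the `100.` fallback also matches addresses outside 100.64.0.0/10. A stricter variant can be sketched with `std::net::Ipv4Addr`; the names `is_cgnat` and `pick_host_ip` are hypothetical, and this is an alternative under that assumption, not the code in the diff.

```rust
use std::net::Ipv4Addr;

/// True for 100.64.0.0/10 (the CGNAT range Tailscale allocates from).
fn is_cgnat(ip: Ipv4Addr) -> bool {
    let o = ip.octets();
    // A /10 prefix fixes the first octet and the top two bits of the second.
    o[0] == 100 && (o[1] & 0b1100_0000) == 64
}

/// Hypothetical helper: prefer an RFC 1918 LAN address, fall back to a
/// CGNAT address, and ignore anything that is not valid IPv4.
fn pick_host_ip(candidates: &[&str]) -> Option<Ipv4Addr> {
    let ips: Vec<Ipv4Addr> = candidates
        .iter()
        .filter_map(|s| s.parse::<Ipv4Addr>().ok())
        .collect();
    ips.iter()
        .copied()
        .find(|ip| ip.is_private())
        .or_else(|| ips.iter().copied().find(|ip| is_cgnat(*ip)))
}
```

`Ipv4Addr::is_private` covers 10.0.0.0/8, 172.16.0.0/12 and 192.168.0.0/16, which matches the three ranges the closure above checks by hand.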
#[cfg(test)] #[cfg(test)]
mod tests { mod tests {
use super::{btcpay_stack_app_ids, mempool_stack_app_ids}; use super::{btcpay_stack_app_ids, mempool_stack_app_ids};

@ -32,27 +32,19 @@ impl RpcHandler {
.ok_or_else(|| anyhow::anyhow!("Missing package id"))?; .ok_or_else(|| anyhow::anyhow!("Missing package id"))?;
validate_app_id(package_id)?; validate_app_id(package_id)?;
// Resolve the target image. Prefer the remote app catalog (decoupled // Verify an update is actually available. Prefer the remote app catalog
// from the binary OTA), falling back to the image-versions.sh pin. This // (decoupled from the binary OTA), falling back to the image-versions.sh
// is OPTIONAL for orchestrator-managed apps: the orchestrator resolves // pin when the catalog is absent or doesn't cover this app.
// the image itself (manifest + catalog + version_config pin) in its
// upgrade path, so an app the catalog doesn't carry a primary image for
// (e.g. bitcoin-core, image lives in the embedded manifest + versions[])
// still upgrades. Only the legacy/stack path below hard-requires it.
let pinned = crate::container::app_catalog::catalog_primary_image(package_id) let pinned = crate::container::app_catalog::catalog_primary_image(package_id)
.or_else(|| image_versions::pinned_image_for_app(package_id)); .or_else(|| image_versions::pinned_image_for_app(package_id))
.ok_or_else(|| anyhow::anyhow!("No pinned image found for {}", package_id))?;
// Note: the `already updating` guard lives in `spawn_package_update` // Note: the `already updating` guard lives in `spawn_package_update`
// (the async wrapper that dispatch actually routes to). By the time // (the async wrapper that dispatch actually routes to). By the time
// this inner function runs, the wrapper has already flipped state to // this inner function runs, the wrapper has already flipped state to
// `Updating`, so duplicating the check here would be a false positive. // `Updating`, so duplicating the check here would be a false positive.
install_log(&format!( install_log(&format!("UPDATE: {} → {}", package_id, pinned)).await;
"UPDATE: {} → {}",
package_id,
pinned.as_deref().unwrap_or("(orchestrator-resolved)")
))
.await;
// Set state to Updating // Set state to Updating
{ {
@ -122,16 +114,6 @@ impl RpcHandler {
} }
} }
// Legacy/stack path hard-requires a concrete primary image (the
// orchestrator path above already returned for apps it manages).
let pinned = match pinned {
Some(p) => p,
None => {
self.clear_update_state(package_id).await;
return Err(anyhow::anyhow!("No pinned image found for {}", package_id));
}
};
// Resolve images to pull — either a stack or single container // Resolve images to pull — either a stack or single container
let images_to_pull = self.resolve_images_to_pull(package_id, &pinned); let images_to_pull = self.resolve_images_to_pull(package_id, &pinned);

@ -26,36 +26,6 @@ impl Drop for OnboardingMnemonicState {
const MNEMONIC_TTL: std::time::Duration = std::time::Duration::from_secs(600); // 10 minutes const MNEMONIC_TTL: std::time::Duration = std::time::Duration::from_secs(600); // 10 minutes
/// Persist the pending onboarding mnemonic as `identity/master_seed.enc`,
/// encrypted with `passphrase`. Called from `auth.setup` — the first moment a
/// user password exists — so "Reveal recovery phrase" works after onboarding
/// without the frontend having to remember a separate save step (it never
/// did, which left every onboarded node with no encrypted seed backup).
///
/// Deliberately ignores MNEMONIC_TTL: the mnemonic stays in memory until
/// overwritten regardless, so using it here widens nothing, and onboarding
/// legitimately takes longer than 10 minutes when the user carefully writes
/// down 24 words. Clears the in-memory copy on success — password setup is
/// the end of onboarding, so the plaintext no longer needs to linger.
///
/// Returns Ok(true) if a seed was saved, Ok(false) if none was pending.
pub(in crate::api::rpc) async fn save_pending_seed_encrypted(
data_dir: &std::path::Path,
passphrase: &str,
) -> Result<bool> {
let mut state = ONBOARDING_MNEMONIC.lock().await;
let Some(pending) = state.as_ref() else {
return Ok(false);
};
let mnemonic: bip39::Mnemonic = pending
.words
.parse()
.context("Invalid mnemonic in memory")?;
crate::seed::save_seed_encrypted(data_dir, &mnemonic, passphrase).await?;
*state = None;
Ok(true)
}
/// Best-effort: install fips.yaml + start archipelago-fips.service after the /// Best-effort: install fips.yaml + start archipelago-fips.service after the
/// seed onboarding has written the fips_key to disk. Runs in a detached task /// seed onboarding has written the fips_key to disk. Runs in a detached task
/// so the user-facing RPC returns immediately — the systemctl calls can take /// so the user-facing RPC returns immediately — the systemctl calls can take
@ -238,17 +208,6 @@ impl RpcHandler {
let phrase = words.join(" "); let phrase = words.join(" ");
let (_mnemonic, seed) = crate::seed::MasterSeed::from_mnemonic_words(&phrase)?; let (_mnemonic, seed) = crate::seed::MasterSeed::from_mnemonic_words(&phrase)?;
// Stash the restored words like seed.generate does, so auth.setup can
// persist the encrypted backup once the user's password exists and
// "Reveal recovery phrase" works on restored nodes too.
{
let mut state = ONBOARDING_MNEMONIC.lock().await;
*state = Some(OnboardingMnemonicState {
words: phrase.clone(),
created_at: std::time::Instant::now(),
});
}
// Derive and write node Ed25519 key. // Derive and write node Ed25519 key.
let identity_dir = self.config.data_dir.join("identity"); let identity_dir = self.config.data_dir.join("identity");
crate::identity::NodeIdentity::from_seed(&identity_dir, &seed).await?; crate::identity::NodeIdentity::from_seed(&identity_dir, &seed).await?;

@ -47,17 +47,6 @@ impl RpcHandler {
} }
}; };
// Keep the self-signed HTTPS cert's SAN in sync with the new hostname —
// best-effort, never blocks the rename itself. Without this the cert
// stays pinned to whatever name was set at install time, so browsers
// hit a hostname-mismatch warning on top of the usual self-signed one
// the moment a node is renamed.
if hostname_updated {
if let Err(e) = regenerate_tls_cert(&hostname).await {
warn!(hostname = %hostname, "TLS cert regen after rename failed: {}", e);
}
}
info!("Server name updated to: {}", name); info!("Server name updated to: {}", name);
// Push the new name to federation peers in background // Push the new name to federation peers in background
@ -77,70 +66,6 @@ impl RpcHandler {
})) }))
} }
/// server.set-location — Set this node's own lat/lon + whether to share
/// it with trusted federation peers (for the Mesh Map). `lat`/`lon` are
/// optional so a caller can flip `share` off without clearing the saved
/// position, or clear the position by passing nulls.
pub(in crate::api::rpc) async fn handle_server_set_location(
&self,
params: Option<serde_json::Value>,
) -> Result<serde_json::Value> {
let params = params.ok_or_else(|| anyhow::anyhow!("Missing params"))?;
let lat = params.get("lat").and_then(|v| v.as_f64());
let lon = params.get("lon").and_then(|v| v.as_f64());
let share_location = params
.get("share")
.and_then(|v| v.as_bool())
.ok_or_else(|| anyhow::anyhow!("Missing required parameter: share"))?;
if let (Some(lat), Some(lon)) = (lat, lon) {
if !(-90.0..=90.0).contains(&lat) || !(-180.0..=180.0).contains(&lon) {
anyhow::bail!("Invalid lat/lon");
}
}
let location_file = self.config.data_dir.join("server-location.json");
let payload = serde_json::json!({ "lat": lat, "lon": lon, "share_location": share_location });
tokio::fs::write(&location_file, serde_json::to_vec(&payload)?)
.await
.context("Failed to write server location")?;
let (mut data, _) = self.state_manager.get_snapshot().await;
data.server_info.lat = lat;
data.server_info.lon = lon;
data.server_info.share_location = share_location;
self.state_manager.update_data(data).await;
info!(share_location, "Server location updated");
// Push the new location to federation peers in background, same as
// a rename — trusted peers' next state sync picks it up.
let data_dir = self.config.data_dir.clone();
let state_manager = self.state_manager.clone();
tokio::spawn(async move {
if let Err(e) = push_name_to_peers(&data_dir, &state_manager).await {
debug!("Federation location push (non-fatal): {}", e);
}
});
Ok(serde_json::json!({ "lat": lat, "lon": lon, "share_location": share_location }))
}
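The handler above range-checks coordinates only when both `lat` and `lon` are present, so a half-specified pair is stored as given. A stricter variant is sketched below; `validate_coords` is a hypothetical name, and rejecting half-specified pairs is an assumption about the desired behaviour, not something the diff does.

```rust
/// Hypothetical validator: accept "both absent" (clears the position) or
/// "both present and in range"; reject everything else.
fn validate_coords(lat: Option<f64>, lon: Option<f64>) -> Result<(), String> {
    match (lat, lon) {
        (None, None) => Ok(()),
        (Some(lat), Some(lon)) => {
            // NaN fails both range checks, so it is rejected here too.
            if (-90.0..=90.0).contains(&lat) && (-180.0..=180.0).contains(&lon) {
                Ok(())
            } else {
                Err("Invalid lat/lon".to_string())
            }
        }
        // Assumption: a lone lat or lon is a caller mistake.
        _ => Err("lat and lon must be provided together".to_string()),
    }
}
```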
/// system.get-hostname — Current OS hostname + the mDNS `.local` name it
/// resolves to on the LAN (avahi-daemon advertises `<hostname>.local`).
/// Lets Settings show users where to reach this node over HTTPS for
/// features (mic/camera access) that require a secure context.
pub(in crate::api::rpc) async fn handle_system_get_hostname(&self) -> Result<serde_json::Value> {
let hostname = tokio::fs::read_to_string("/etc/hostname")
.await
.map(|s| s.trim().to_string())
.unwrap_or_else(|_| "archipelago".to_string());
Ok(serde_json::json!({
"hostname": hostname,
"mdns_hostname": format!("{hostname}.local"),
}))
}
/// system.stats — CPU usage, RAM used/total, disk used/total, uptime, load average /// system.stats — CPU usage, RAM used/total, disk used/total, uptime, load average
pub(in crate::api::rpc) async fn handle_system_stats(&self) -> Result<serde_json::Value> { pub(in crate::api::rpc) async fn handle_system_stats(&self) -> Result<serde_json::Value> {
debug!("Getting system stats"); debug!("Getting system stats");
@ -394,63 +319,6 @@ async fn set_system_hostname(hostname: &str) -> Result<()> {
Ok(()) Ok(())
} }
/// Regenerate the self-signed HTTPS cert (`/etc/archipelago/ssl/archipelago.{crt,key}`)
/// with a SAN covering `hostname`, `hostname.local`, `localhost`, and 127.0.0.1, then
/// reload nginx so it picks up the new cert. Still self-signed (browsers will warn
/// on first visit regardless), but avoids stacking a hostname-mismatch warning on
/// top once a node has been renamed away from the install-time default.
async fn regenerate_tls_cert(hostname: &str) -> Result<()> {
let subj = format!("/C=XX/ST=Bitcoin/L=Node/O=Archipelago/CN={hostname}");
let san = format!("subjectAltName=DNS:{hostname},DNS:{hostname}.local,DNS:localhost,IP:127.0.0.1");
let output = tokio::process::Command::new("/usr/bin/sudo")
.args([
"-n",
"/usr/bin/openssl",
"req",
"-x509",
"-nodes",
"-days",
"3650",
"-newkey",
"rsa:2048",
"-keyout",
"/etc/archipelago/ssl/archipelago.key",
"-out",
"/etc/archipelago/ssl/archipelago.crt",
"-subj",
&subj,
"-addext",
&san,
])
.output()
.await
.context("Failed to run openssl")?;
if !output.status.success() {
let stderr = String::from_utf8_lossy(&output.stderr).trim().to_string();
anyhow::bail!(
"{}",
if stderr.is_empty() {
"openssl cert regen failed".to_string()
} else {
stderr
}
);
}
let reload = tokio::process::Command::new("/usr/bin/sudo")
.args(["-n", "/usr/bin/systemctl", "reload", "nginx"])
.output()
.await
.context("Failed to reload nginx")?;
if !reload.status.success() {
let stderr = String::from_utf8_lossy(&reload.stderr).trim().to_string();
anyhow::bail!("nginx reload failed: {}", stderr);
}
Ok(())
}
impl RpcHandler { impl RpcHandler {
/// system.factory-reset — Wipe all user data, remove containers, and restart. /// system.factory-reset — Wipe all user data, remove containers, and restart.
/// Only preserves the data_dir itself (recreated empty on restart). /// Only preserves the data_dir itself (recreated empty on restart).

@ -16,11 +16,12 @@ impl RpcHandler {
// Spendable Fedimint balance too, so callers (e.g. the pay-for-file // Spendable Fedimint balance too, so callers (e.g. the pay-for-file
// pre-check) see funds available across BOTH backends (#3). Best-effort: // pre-check) see funds available across BOTH backends (#3). Best-effort:
// if fmcd isn't installed/joined this is just 0, never an error. // if fmcd isn't installed/joined this is just 0, never an error.
let fedimint_sats = let fedimint_sats = match fedimint_client::FedimintClient::from_node(&self.config.data_dir)
match fedimint_client::FedimintClient::from_node(&self.config.data_dir).await { .await
Ok(client) => client.total_balance_sats().await.unwrap_or(0), {
Err(_) => 0, Ok(client) => client.total_balance_sats().await.unwrap_or(0),
}; Err(_) => 0,
};
Ok(serde_json::json!({ Ok(serde_json::json!({
// `balance_sats` stays Cashu-only for back-compat; `total_sats` is the // `balance_sats` stays Cashu-only for back-compat; `total_sats` is the
// spendable amount across Cashu + Fedimint. // spendable amount across Cashu + Fedimint.

@ -101,45 +101,19 @@ fn friendly_transient_error(has_cached_state: bool, err_msg: &str) -> String {
.trim_end_matches('.'); .trim_end_matches('.');
let lower = detail.to_lowercase(); let lower = detail.to_lowercase();
let state = if lower.contains("verifying blocks") { let state = if lower.contains("verifying blocks") {
Some("verifying blocks after restart") "verifying blocks after restart"
} else if lower.contains("connection reset") {
Some("starting up and not yet accepting RPC connections")
} else if lower.contains("connection refused") || lower.contains("tcp connect error") { } else if lower.contains("connection refused") || lower.contains("tcp connect error") {
Some("waiting for the Bitcoin RPC listener") "waiting for the Bitcoin RPC listener"
} else if lower.contains("timed out") || lower.contains("timeout") { } else if lower.contains("timed out") || lower.contains("timeout") {
Some("busy and not answering RPC before the timeout") "busy and not answering RPC before the timeout"
} else { } else {
None "starting or busy syncing"
}; };
// Recognized transient causes get a clean human sentence only — the raw if has_cached_state {
// transport error (URLs, repeated "os error 104" chains) is operator format!("Bitcoin node is {state}; showing last known state and retrying. Detail: {detail}")
// noise that was ending up verbatim on the app card. Unrecognized errors
// keep a bounded detail so a genuinely new failure stays diagnosable.
let (state, detail) = match state {
Some(state) => (state, None),
None => (
"starting or busy syncing",
Some(if detail.len() > 120 {
let mut cut = 120;
while !detail.is_char_boundary(cut) {
cut -= 1;
}
format!("{}", &detail[..cut])
} else {
detail.to_string()
}),
),
};
let base = if has_cached_state {
format!("Bitcoin node is {state}; showing last known state and retrying.")
} else { } else {
format!("Bitcoin node is {state}; retrying automatically.") format!("Bitcoin node is {state}; retrying automatically. Detail: {detail}")
};
match detail {
Some(detail) => format!("{base} Detail: {detail}"),
None => base,
} }
} }
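The bounded-detail branch above truncates at a byte limit and then backs up to a UTF-8 character boundary so the slice cannot panic. That step can be sketched as a standalone function; `truncate_on_char_boundary` is a hypothetical name used only for illustration.

```rust
/// Hypothetical helper: return the longest prefix of `s` that is at most
/// `max_bytes` long and ends on a character boundary.
fn truncate_on_char_boundary(s: &str, max_bytes: usize) -> &str {
    if s.len() <= max_bytes {
        return s;
    }
    let mut cut = max_bytes;
    // Index 0 is always a boundary, so this loop terminates.
    while !s.is_char_boundary(cut) {
        cut -= 1;
    }
    &s[..cut]
}
```

Slicing a `str` at a non-boundary index panics at runtime, which is why the backward scan is needed whenever the detail text may contain multi-byte characters.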
@ -304,39 +278,4 @@ mod tests {
assert!(msg.contains("busy and not answering RPC before the timeout")); assert!(msg.contains("busy and not answering RPC before the timeout"));
} }
#[test]
fn connection_reset_gets_clean_message_without_raw_detail() {
// The exact string a fresh install showed on the app card: the raw
// reqwest chain (URL + repeated "os error 104") must not surface.
let msg = friendly_transient_error(
false,
"getblockchaininfo: Bitcoin RPC request failed: error sending request for url (http://127.0.0.1:8332/): connection error: Connection reset by peer (os error 104): connection error: Connection reset by peer (os error 104): Connection reset by peer (os error 104)",
);
assert!(msg.contains("starting up and not yet accepting RPC connections"));
assert!(!msg.contains("os error"));
assert!(!msg.contains("127.0.0.1"));
assert!(!msg.contains("Detail:"));
}
#[test]
fn recognized_causes_omit_detail_entirely() {
for raw in [
"x: Connection refused (os error 111)",
"x: operation timed out",
r#"x: {"error":{"code":-28,"message":"Verifying blocks..."}}"#,
] {
let msg = friendly_transient_error(false, raw);
assert!(!msg.contains("Detail:"), "leaked detail for: {raw}");
}
}
#[test]
fn unknown_errors_keep_bounded_detail() {
let long = format!("weird new failure {}", "x".repeat(300));
let msg = friendly_transient_error(false, &long);
assert!(msg.contains("Detail: weird new failure"));
assert!(msg.len() < 260);
}
} }

@ -39,16 +39,6 @@ const KIOSK_LAUNCHER: &str =
const KIOSK_SERVICE_PATH: &str = "/etc/systemd/system/archipelago-kiosk.service"; const KIOSK_SERVICE_PATH: &str = "/etc/systemd/system/archipelago-kiosk.service";
const KIOSK_LAUNCHER_PATH: &str = "/usr/local/bin/archipelago-kiosk-launcher"; const KIOSK_LAUNCHER_PATH: &str = "/usr/local/bin/archipelago-kiosk-launcher";
// Journald log-volume policy (size cap + per-service rate limit). Fresh ISOs
// write the identical file at build time (image-recipe/_archived/
// build-auto-installer-iso.sh); this heals already-deployed nodes via OTA.
// A fresh node produced >1 GB/day of journal (bitcoind IBD console spam plus
// debug-level backend logging) — the cap bounds disk use and the rate limit
// keeps one chatty service from drowning everything else.
const JOURNALD_DROPIN: &str =
include_str!("../../../image-recipe/configs/journald-archipelago.conf");
const JOURNALD_DROPIN_PATH: &str = "/etc/systemd/journald.conf.d/10-archipelago-persistent.conf";
const NGINX_CONF_PATH: &str = "/etc/nginx/sites-available/archipelago"; const NGINX_CONF_PATH: &str = "/etc/nginx/sites-available/archipelago";
const NGINX_ENABLED_CONF_PATH: &str = "/etc/nginx/sites-enabled/archipelago"; const NGINX_ENABLED_CONF_PATH: &str = "/etc/nginx/sites-enabled/archipelago";
/// Per-app proxy snippet included by the HTTPS (:443) server block. Carries its /// Per-app proxy snippet included by the HTTPS (:443) server block. Carries its
@ -130,11 +120,6 @@ pub async fn ensure_doctor_installed() {
Ok(false) => debug!("Bitcoin RPC bind settings already usable"), Ok(false) => debug!("Bitcoin RPC bind settings already usable"),
Err(e) => warn!("Bitcoin RPC repair failed (non-fatal): {:#}", e), Err(e) => warn!("Bitcoin RPC repair failed (non-fatal): {:#}", e),
} }
match run_journald_dropin().await {
Ok(true) => info!("Installed journald log-volume policy drop-in"),
Ok(false) => debug!("journald log-volume policy already in place"),
Err(e) => warn!("journald drop-in bootstrap failed (non-fatal): {:#}", e),
}
match tighten_secrets_dir().await { match tighten_secrets_dir().await {
Ok(n) if n > 0 => info!(tightened = n, "Tightened mode on secret files"), Ok(n) if n > 0 => info!(tightened = n, "Tightened mode on secret files"),
Ok(_) => debug!("Secrets directory already at expected mode"), Ok(_) => debug!("Secrets directory already at expected mode"),
@ -423,14 +408,6 @@ ensure_line() {
ensure_line server=1 ensure_line server=1
ensure_line rpcallowip=0.0.0.0/0 ensure_line rpcallowip=0.0.0.0/0
ensure_line listen=1 ensure_line listen=1
# Log-volume fix: printtoconsole=1 duplicated every log line (incl. per-block
# IBD "UpdateTip" spam) into journald via conmon on top of the datadir
# debug.log bitcoind already writes. Console off; debug.log stays (bitcoind
# self-shrinks it on restart).
if grep -q '^printtoconsole=1' "$conf"; then
sed -i 's/^printtoconsole=1$/printtoconsole=0/' "$conf"
changed=1
fi
[ "$changed" -eq 0 ] && exit 0 [ "$changed" -eq 0 ] && exit 0
exit 2 exit 2
"#; "#;
@ -451,44 +428,6 @@ exit 2
} }
} }
/// Install the journald log-volume policy drop-in (JOURNALD_DROPIN) so nodes
/// deployed before the ISO shipped it get the size cap + rate limit via OTA.
/// Idempotent; restarts journald only when the file actually changed (safe:
/// the sockets are held by pid1, so at most a few messages queue briefly).
async fn run_journald_dropin() -> Result<bool> {
// Same dev-box guards as the doctor bootstrap: never touch /etc on
// contributors' laptops (symlinked or absent /home/archipelago/archy).
let home_archy = Path::new("/home/archipelago/archy");
if fs::symlink_metadata(home_archy)
.await
.map(|m| m.file_type().is_symlink())
.unwrap_or(false)
{
debug!("/home/archipelago/archy is a symlink — skipping journald bootstrap (dev box)");
return Ok(false);
}
if fs::metadata(home_archy).await.is_err() {
debug!("/home/archipelago/archy missing — skipping journald bootstrap");
return Ok(false);
}
let dropin_dir = "/etc/systemd/journald.conf.d";
let status = host_sudo(&["mkdir", "-p", dropin_dir])
.await
.with_context(|| format!("mkdir {}", dropin_dir))?;
if !status.success() {
anyhow::bail!("mkdir {} exited with {}", dropin_dir, status);
}
let changed = write_root_if_needed(JOURNALD_DROPIN_PATH, JOURNALD_DROPIN).await?;
if changed {
if let Err(e) = host_sudo(&["systemctl", "restart", "systemd-journald"]).await {
warn!("journald restart after drop-in update failed: {:#}", e);
}
}
Ok(changed)
}
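The drop-in installer above relies on a "write only if different" step that reports whether anything changed, which is what makes it safe to run on every start and restart journald only when needed. The body of `write_root_if_needed` is not shown in this diff, so the following is a plain, unprivileged sketch of the idea under the hypothetical name `write_if_changed`.

```rust
use std::fs;
use std::io;
use std::path::Path;

/// Hypothetical helper: write `contents` to `path` unless the file already
/// holds exactly that text. Returns true when a write happened.
fn write_if_changed(path: &Path, contents: &str) -> io::Result<bool> {
    match fs::read_to_string(path) {
        // Identical content: nothing to do, report "unchanged".
        Ok(existing) if existing == contents => return Ok(false),
        // Missing, unreadable, or different: fall through and rewrite.
        _ => {}
    }
    fs::write(path, contents)?;
    Ok(true)
}
```

A caller can then gate the service restart on the returned flag, as the function above does with `changed`.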
async fn run() -> Result<bool> { async fn run() -> Result<bool> {
// Dev-box guard: on contributors' laptops `/home/archipelago/archy` is // Dev-box guard: on contributors' laptops `/home/archipelago/archy` is
// typically a symlink into the git checkout, and writing through it // typically a symlink into the git checkout, and writing through it

@ -19,11 +19,6 @@
//! Sign a JSON document (e.g. releases/app-catalog.json) in place: insert //! Sign a JSON document (e.g. releases/app-catalog.json) in place: insert
//! `signature` + `signed_by` over the canonical form, matching exactly //! `signature` + `signed_by` over the canonical form, matching exactly
//! what `trust::verify_detached` recomputes on every node. //! what `trust::verify_detached` recomputes on every node.
//!
//! archipelago ceremony verify <file.json>
//! Verify a signed JSON document against the compiled-in release-root
//! anchor. Exits non-zero unless the signature verifies AND the signer
//! is the pinned anchor. Needs no mnemonic — used as the publish gate.
//! ``` //! ```
use anyhow::{bail, Context, Result}; use anyhow::{bail, Context, Result};
@ -52,15 +47,9 @@ pub fn run() -> Result<()> {
.context("usage: archipelago ceremony sign <file.json>")?; .context("usage: archipelago ceremony sign <file.json>")?;
cmd_sign(&file) cmd_sign(&file)
} }
"verify" => {
let file = std::env::args()
.nth(3)
.context("usage: archipelago ceremony verify <file.json>")?;
cmd_verify(&file)
}
other => { other => {
bail!( bail!(
"unknown ceremony subcommand {:?}; expected gen | pubkey | sign <file> | verify <file>", "unknown ceremony subcommand {:?}; expected gen | pubkey | sign <file>",
other other
) )
} }
@ -118,33 +107,6 @@ fn cmd_sign(path: &str) -> Result<()> {
Ok(()) Ok(())
} }
fn cmd_verify(path: &str) -> Result<()> {
let body = std::fs::read_to_string(path).with_context(|| format!("read {path}"))?;
let value: serde_json::Value =
serde_json::from_str(&body).with_context(|| format!("parse {path} as JSON"))?;
match signed_doc::verify_detached(&value)? {
signed_doc::SignatureStatus::Verified {
signer_did,
anchored: true,
} => {
eprintln!("{path} verified — signed by the pinned release root");
eprintln!(" signed_by: {signer_did}");
Ok(())
}
signed_doc::SignatureStatus::Verified {
signer_did,
anchored: false,
} => {
// Only reachable if no anchor is compiled in/overridden — the
// signature is self-consistent but proves nothing about identity.
bail!("{path} signed by {signer_did}, but no release-root anchor is pinned to compare against")
}
signed_doc::SignatureStatus::Unsigned => {
bail!("{path} is NOT signed (no `signature` field)")
}
}
}
/// Derive the release-root signing key from the mnemonic in env/stdin. /// Derive the release-root signing key from the mnemonic in env/stdin.
fn load_release_root_key() -> Result<SigningKey> { fn load_release_root_key() -> Result<SigningKey> {
let phrase = read_mnemonic()?; let phrase = read_mnemonic()?;


@ -66,7 +66,7 @@ pub struct Config {
/// through Quadlet (`.container` units in ~/.config/containers/systemd /// through Quadlet (`.container` units in ~/.config/containers/systemd
/// + systemctl --user start) instead of `podman create + start`. Default /// + systemctl --user start) instead of `podman create + start`. Default
/// off so the legacy path stays the production path until the harness /// off so the legacy path stays the production path until the harness
/// at tests/lifecycle/run-gate.sh has gone green against the new path /// at tests/lifecycle/run-20x.sh has gone green against the new path
/// on .228 + .198. See `project_v1_7_52_phase3_quadlet_design`. /// on .228 + .198. See `project_v1_7_52_phase3_quadlet_design`.
#[serde(default)] #[serde(default)]
pub use_quadlet_backends: bool, pub use_quadlet_backends: bool,
@ -487,7 +487,7 @@ mod tests {
#[test] #[test]
fn test_config_use_quadlet_backends_defaults_off() { fn test_config_use_quadlet_backends_defaults_off() {
// Phase 3.2 of v1.7.52 — the new path stays gated until the 5× // Phase 3.2 of v1.7.52 — the new path stays gated until the 20×
// harness goes green on .228 and .198. Flipping this default // harness goes green on .228 and .198. Flipping this default
// ahead of that would route every backend install through code // ahead of that would route every backend install through code
// we haven't fleet-validated yet. // we haven't fleet-validated yet.


@ -86,12 +86,6 @@ pub struct AppCatalogEntry {
/// Optional human-readable changelog lines for this version. /// Optional human-readable changelog lines for this version.
#[serde(default, skip_serializing_if = "Vec::is_empty")] #[serde(default, skip_serializing_if = "Vec::is_empty")]
pub changelog: Vec<String>, pub changelog: Vec<String>,
/// Multi-version support (`docs/bitcoin-multi-version-design.md`): the bounded
/// set of versions a user may install or switch to for this app. Empty for
/// single-version apps; `version`/`image` above remain the default/latest for
/// back-compat. Old nodes ignore this field (no `deny_unknown_fields`).
#[serde(default, skip_serializing_if = "Vec::is_empty")]
pub versions: Vec<CatalogVersion>,
/// Full app manifest, embedded so the app installs from the registry alone — /// Full app manifest, embedded so the app installs from the registry alone —
/// no OTA-shipped `apps/<id>/manifest.yml`. Carried as the raw value the /// no OTA-shipped `apps/<id>/manifest.yml`. Carried as the raw value the
/// publisher signed (so it stays part of the verified preimage) and /// publisher signed (so it stays part of the verified preimage) and
@ -103,29 +97,6 @@ pub struct AppCatalogEntry {
pub manifest: Option<serde_json::Value>, pub manifest: Option<serde_json::Value>,
} }
/// One selectable version in an app's `versions[]` list. The catalog carries a
/// curated, bounded set (current + a few majors back); see
/// `docs/bitcoin-multi-version-design.md` §3 Phase 1.
#[derive(Debug, Clone, Serialize, Deserialize, Default, PartialEq, Eq)]
pub struct CatalogVersion {
/// User-facing + tag-matching version string (e.g. `31.0`,
/// `29.3.knots20260508`). Treated as the image tag.
pub version: String,
/// Concrete image reference for this version. When omitted the orchestrator
/// falls back to composing `<default-repo>:<version>` from the entry image.
#[serde(default, skip_serializing_if = "Option::is_none")]
pub image: Option<String>,
/// Marks the default / latest version pre-selected in the install modal.
#[serde(default, skip_serializing_if = "std::ops::Not::not")]
pub default: bool,
/// Deprecated versions are still installable but badged in the UI.
#[serde(default, skip_serializing_if = "std::ops::Not::not")]
pub deprecated: bool,
/// Optional end-of-life date (YYYY-MM-DD), surfaced in the UI.
#[serde(default, skip_serializing_if = "Option::is_none")]
pub eol: Option<String>,
}
/// Read-side cache file search order. Mirrors `image_versions.rs`: the running /// Read-side cache file search order. Mirrors `image_versions.rs`: the running
/// daemon's data dir first (via env for dev), then the canonical runtime path. /// daemon's data dir first (via env for dev), then the canonical runtime path.
fn cache_paths() -> Vec<PathBuf> { fn cache_paths() -> Vec<PathBuf> {
@ -216,66 +187,6 @@ pub fn catalog_manifest_values() -> Vec<(String, serde_json::Value)> {
.collect() .collect()
} }
/// The catalog's default/latest version string for an app (the top-level
/// `version` field), if covered. Used to decide whether an install-time
/// selection should pin (older) or track-latest (default).
pub fn catalog_default_version(app_id: &str) -> Option<String> {
entry_for(app_id)
.map(|e| e.version)
.filter(|v| !v.is_empty())
}
/// Curated, selectable versions for an app per the remote catalog. Empty when
/// the catalog is absent or the app is single-version. The default entry (if
/// any) sorts first so callers can pre-select it.
pub fn catalog_versions(app_id: &str) -> Vec<CatalogVersion> {
let mut versions = entry_for(app_id).map(|e| e.versions).unwrap_or_default();
versions.sort_by_key(|v| !v.default); // default first, stable otherwise
versions
}
/// Resolve the image for a specific selectable `version` of `app_id`, validated
/// same-repo against `manifest_image` (the same guard `catalog_image_override`
/// applies). The version's explicit `image` is used when present; otherwise the
/// repo of `manifest_image` is retagged with `version`. Returns `None` when the
/// version is unknown or would point at a different repository — the caller then
/// keeps the default resolution and the switch is refused upstream.
pub fn catalog_image_for_version(
app_id: &str,
version: &str,
manifest_image: &str,
) -> Option<String> {
let entry = catalog_versions(app_id)
.into_iter()
.find(|v| v.version == version)?;
let manifest_repo =
crate::container::image_versions::image_without_registry_or_tag(manifest_image);
let candidate = match entry.image {
Some(img) => img,
None => {
// Retag the manifest's full registry/repo with the requested version.
let repo = manifest_image
.rsplit_once(':')
// keep registry:port colons intact: only strip a tag after the last '/'
.filter(|(left, _)| left.contains('/'))
.map(|(left, _)| left)
.unwrap_or(manifest_image);
format!("{repo}:{version}")
}
};
let same_repo = crate::container::image_versions::image_without_registry_or_tag(&candidate)
== manifest_repo;
if same_repo {
Some(candidate)
} else {
warn!(
"app-catalog: ignoring version {} for {} — repo mismatch (candidate={}, manifest={})",
version, app_id, candidate, manifest_image
);
None
}
}
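
The retagging step in `catalog_image_for_version` has to replace a trailing tag without mistaking a `registry:port` colon for one. A minimal sketch of that idea follows. `retag` is a hypothetical name, and the check is a variant of the one in the diff: it tests that the text after the last colon contains no `/`, which also covers a bare `name:tag` reference. Digest references (`repo@sha256:...`) are out of scope here.

```rust
/// Replace the tag on an image reference, or append one if it has none.
/// A colon counts as a tag separator only when nothing after it looks like
/// a path, so `localhost:5000/app` keeps its port.
/// Hypothetical helper for illustration; not the function from the diff.
pub fn retag(image: &str, version: &str) -> String {
    let repo = image
        .rsplit_once(':')
        // "5000/app" after the colon means the colon belonged to a port.
        .filter(|(_, right)| !right.contains('/'))
        .map(|(left, _)| left)
        .unwrap_or(image);
    format!("{repo}:{version}")
}
```

For example, `retag("registry.example:5000/bitcoin/knots:29.0", "31.0")` keeps the port and swaps only the final tag.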
/// Image override for the orchestrator's install/upgrade path. Returns the /// Image override for the orchestrator's install/upgrade path. Returns the
/// catalog's primary image for `app_id` ONLY when it refers to the same /// catalog's primary image for `app_id` ONLY when it refers to the same
/// repository as the manifest's current image — a guard so a catalog typo can /// repository as the manifest's current image — a guard so a catalog typo can
@ -303,12 +214,6 @@ pub fn catalog_image_override(app_id: &str, manifest_image: &str) -> Option<Stri
/// newer catalog, nor vice-versa). Falls back to the deployed pin only when the /// newer catalog, nor vice-versa). Falls back to the deployed pin only when the
/// catalog is missing or doesn't cover the app. /// catalog is missing or doesn't cover the app.
pub fn available_update_for_app(app_id: &str, running_image: &str) -> Option<String> { pub fn available_update_for_app(app_id: &str, running_image: &str) -> Option<String> {
// A runner-pinned version is an explicit "stay here" choice — never advertise
// an update over it (design §3 Phase 3). Auto-update, when enabled, ignores
// the pin and is driven by the catalog tick, not this badge.
if crate::container::version_config::pinned_version(app_id).is_some() {
return None;
}
if let Some(catalog_image) = catalog_primary_image(app_id) { if let Some(catalog_image) = catalog_primary_image(app_id) {
// Catalog covers this app with a concrete image -> authoritative. // Catalog covers this app with a concrete image -> authoritative.
return crate::container::image_versions::available_update_for_images( return crate::container::image_versions::available_update_for_images(


@ -96,35 +96,6 @@ impl BootReconciler {
} }
} }
// Companion self-heal runs on its OWN cadence, decoupled from the
// per-app reconcile pass. On a heavily loaded node `reconcile_existing`
// over dozens of apps can take well over a minute, which would delay a
// companion-unit repair (deleted/lost unit file) past any reasonable
// safety window. Detecting + rewriting a companion unit is cheap, so it
// gets a dedicated `interval` loop. The handle is aborted when the main
// loop exits (shutdown uses `notify_one`, so we must NOT add a second
// waiter on `self.shutdown` — it would steal the single wake permit).
let companion_handle = if self.companion_stage {
let orchestrator = self.orchestrator.clone();
let interval = self.interval;
Some(tokio::spawn(async move {
loop {
let installed = orchestrator.manifest_ids().await;
for (companion, err) in crate::container::companion::reconcile(&installed).await
{
tracing::warn!(
companion = %companion,
error = %err,
"companion reconcile failed"
);
}
time::sleep(interval).await;
}
}))
} else {
None
};
// Initial pass: no delay. // Initial pass: no delay.
self.tick().await; self.tick().await;
@ -140,15 +111,23 @@ impl BootReconciler {
} }
} }
} }
if let Some(handle) = companion_handle {
handle.abort();
}
} }
async fn tick(&self) { async fn tick(&self) {
let report = self.orchestrator.reconcile_existing().await; let report = self.orchestrator.reconcile_existing().await;
Self::log_report(&report); Self::log_report(&report);
if !self.companion_stage {
return;
}
let installed = self.orchestrator.manifest_ids().await;
for (companion, err) in crate::container::companion::reconcile(&installed).await {
tracing::warn!(
companion = %companion,
error = %err,
"companion reconcile failed"
);
}
} }
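
The comment in this hunk about running companion self-heal on its own cadence describes a general pattern: cheap periodic work gets a dedicated loop so a slow main pass cannot delay it. The sketch below is a standard-library analogue only, not the tokio code from the diff. It uses a thread and an `mpsc` channel for shutdown, whereas the diff spawns a task and aborts its handle; `spawn_periodic` is a hypothetical name.

```rust
use std::sync::mpsc::{self, RecvTimeoutError, Sender};
use std::thread::{self, JoinHandle};
use std::time::Duration;

/// Run `work` immediately and then once per `interval` on its own thread.
/// Sending `()` on the returned channel (or dropping the sender) stops the
/// loop; the join handle yields the number of completed passes.
pub fn spawn_periodic<F>(interval: Duration, mut work: F) -> (Sender<()>, JoinHandle<u32>)
where
    F: FnMut() + Send + 'static,
{
    let (stop_tx, stop_rx) = mpsc::channel::<()>();
    let handle = thread::spawn(move || {
        let mut passes = 0u32;
        loop {
            work();
            passes += 1;
            // The wait doubles as the shutdown check, so a stop request
            // never has to sit out a full interval.
            match stop_rx.recv_timeout(interval) {
                Err(RecvTimeoutError::Timeout) => continue,
                Ok(()) | Err(RecvTimeoutError::Disconnected) => return passes,
            }
        }
    });
    (stop_tx, handle)
}
```

Giving the worker its own stop channel sidesteps the concern raised in the comment: it never competes with the main loop for a single shutdown wake-up.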
fn log_report(report: &ReconcileReport) { fn log_report(report: &ReconcileReport) {
@ -294,7 +273,7 @@ mod tests {
} }
async fn wait_for_status_calls(rt: &CountingRuntime, expected: u32) -> u32 { async fn wait_for_status_calls(rt: &CountingRuntime, expected: u32) -> u32 {
for _ in 0..1000 { for _ in 0..100 {
let count = rt.status_call_count(); let count = rt.status_call_count();
if count >= expected { if count >= expected {
return count; return count;
@ -341,10 +320,11 @@ mod tests {
assert_eq!(wait_for_status_calls(&rt, 1).await, 1); assert_eq!(wait_for_status_calls(&rt, 1).await, 1);
tokio::time::sleep(Duration::from_millis(20)).await; tokio::time::sleep(Duration::from_millis(20)).await;
let count = wait_for_status_calls(&rt, 2).await; wait_for_status_calls(&rt, 2).await;
assert!( assert_eq!(
count >= 2, rt.status_call_count(),
2,
"a second reconcile pass should fire after one interval" "a second reconcile pass should fire after one interval"
); );
@ -402,7 +382,9 @@ mod tests {
assert!(first >= 1, "initial pass should have touched the runtime"); assert!(first >= 1, "initial pass should have touched the runtime");
tokio::time::sleep(Duration::from_millis(20)).await; tokio::time::sleep(Duration::from_millis(20)).await;
let second = wait_for_status_calls(&rt, first + 1).await; tokio::task::yield_now().await;
tokio::task::yield_now().await;
let second = rt.status_call_count();
assert!( assert!(
second > first, second > first,
"loop should have fired a second pass after the interval" "loop should have fired a second pass after the interval"


@ -285,15 +285,7 @@ async fn ensure_image_present(spec: &CompanionSpec) -> Result<String> {
async fn image_exists(image: &str) -> bool { async fn image_exists(image: &str) -> bool {
let mut cmd = Command::new("podman"); let mut cmd = Command::new("podman");
// Only the exit status matters. WITHOUT a `--format`, `podman image inspect` cmd.args(["image", "inspect", image]);
// prints the image's full multi-KB manifest JSON; `.status()` inherits the
// service's stdout, so on a hit that whole blob lands in the journal — once
// per companion image, every reconcile pass. That flood spikes journald +
// IO and starves the async runtime (UI websocket then drops → "connection
// lost"/reconnect). Discard the child's stdout/stderr; we read neither.
cmd.args(["image", "inspect", image])
.stdout(std::process::Stdio::null())
.stderr(std::process::Stdio::null());
match tokio::time::timeout(COMPANION_IMAGE_CHECK_TIMEOUT, cmd.status()).await { match tokio::time::timeout(COMPANION_IMAGE_CHECK_TIMEOUT, cmd.status()).await {
Ok(Ok(status)) => status.success(), Ok(Ok(status)) => status.success(),
Ok(Err(err)) => { Ok(Err(err)) => {
@ -336,10 +328,7 @@ async fn image_created_unix(image: &str) -> Option<i64> {
if !out.status.success() { if !out.status.success() {
return None; return None;
} }
String::from_utf8_lossy(&out.stdout) String::from_utf8_lossy(&out.stdout).trim().parse::<i64>().ok()
.trim()
.parse::<i64>()
.ok()
} }
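
The `image_exists` comment in this file explains why a probe that only needs an exit status should discard the child's output. A minimal synchronous sketch of that pattern, using `std::process` instead of the tokio `Command` in the diff; `exits_ok` is a hypothetical helper and the usage below assumes a Unix `sh`.

```rust
use std::process::{Command, Stdio};

/// Run a command purely for its exit status. stdout and stderr are sent to
/// the null device so nothing the child prints reaches the parent's log.
/// A spawn failure (for example, a missing binary) is treated as "not ok".
pub fn exits_ok(program: &str, args: &[&str]) -> bool {
    Command::new(program)
        .args(args)
        .stdout(Stdio::null())
        .stderr(Stdio::null())
        .status()
        .map(|status| status.success())
        .unwrap_or(false)
}
```

With this shape, `exits_ok("sh", &["-c", "echo noisy; exit 0"])` reports success while the `noisy` line is dropped instead of being inherited by the caller's stdout.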
/// Newest modification time (Unix seconds) across all files under `dir`, /// Newest modification time (Unix seconds) across all files under `dir`,


@ -382,7 +382,7 @@ fn get_app_metadata(app_id: &str) -> AppMetadata {
"lnd" | "lightning-stack" => AppMetadata { "lnd" | "lightning-stack" => AppMetadata {
title: "LND".to_string(), title: "LND".to_string(),
description: "Lightning Network Daemon".to_string(), description: "Lightning Network Daemon".to_string(),
icon: "/assets/img/app-icons/lnd.png".to_string(), icon: "/assets/img/app-icons/lnd.svg".to_string(),
repo: "https://github.com/lightningnetwork/lnd".to_string(), repo: "https://github.com/lightningnetwork/lnd".to_string(),
tier: "", tier: "",
}, },
@ -396,7 +396,7 @@ fn get_app_metadata(app_id: &str) -> AppMetadata {
"electrumx" | "mempool-electrs" | "electrs" => AppMetadata { "electrumx" | "mempool-electrs" | "electrs" => AppMetadata {
title: "ElectrumX".to_string(), title: "ElectrumX".to_string(),
description: "ElectrumX server — full Electrum protocol indexer for Bitcoin. Powers Mempool and Electrum wallets.".to_string(), description: "ElectrumX server — full Electrum protocol indexer for Bitcoin. Powers Mempool and Electrum wallets.".to_string(),
icon: "/assets/img/app-icons/electrumx.png".to_string(), icon: "/assets/img/app-icons/electrs.svg".to_string(),
repo: "https://github.com/spesmilo/electrumx".to_string(), repo: "https://github.com/spesmilo/electrumx".to_string(),
tier: "", tier: "",
}, },
@ -677,76 +677,30 @@ pub async fn read_tor_address(app_id: &str) -> Option<String> {
.filter(|s| s.ends_with(".onion") && !s.is_empty()) .filter(|s| s.ends_with(".onion") && !s.is_empty())
} }
/// Container-side ports that are essentially never a web UI, even when
/// published alongside one — e.g. gitea publishes SSH (`2222->22`) before its
/// web port (`3001->3000`), and podman's port list order isn't guaranteed to
/// put the UI port first. Skipping these lets launch-URL guessing work for
/// any future multi-port app without a per-app static override.
const NON_HTTP_CONTAINER_PORTS: &[&str] = &["22", "21", "3306", "5432", "6379", "27017"];
fn extract_lan_address(ports: &[String]) -> Option<String> { fn extract_lan_address(ports: &[String]) -> Option<String> {
let mut first_candidate = None;
for port_str in ports { for port_str in ports {
// Parse port strings like "0.0.0.0:18443->18443/tcp" or "0.0.0.0:18443-18444->18443-18444/tcp" // Parse port strings like "0.0.0.0:18443->18443/tcp" or "0.0.0.0:18443-18444->18443-18444/tcp"
let Some(public_part) = port_str.split("->").next() else { if let Some(public_part) = port_str.split("->").next() {
continue; if let Some(port_part) = public_part.split(':').nth(1) {
}; // Extract just the first port if it's a range (e.g., "18443-18444" -> "18443")
let Some(port_part) = public_part.split(':').nth(1) else { let single_port = port_part.split('-').next().unwrap_or(port_part);
continue; return Some(format!("http://localhost:{}", single_port));
}; }
// Extract just the first port if it's a range (e.g., "18443-18444" -> "18443")
let host_port = port_part.split('-').next().unwrap_or(port_part);
let candidate = format!("http://localhost:{}", host_port);
if first_candidate.is_none() {
first_candidate = Some(candidate.clone());
} }
let container_port = port_str
.split("->")
.nth(1)
.and_then(|s| s.split('/').next())
.map(|s| s.split('-').next().unwrap_or(s));
if container_port.is_some_and(|p| NON_HTTP_CONTAINER_PORTS.contains(&p)) {
continue;
}
return Some(candidate);
} }
// Nothing looked HTTP-like — fall back to whatever was published first None
// rather than reporting no launch URL at all.
first_candidate
} }
/// netbird's dashboard launch URL: HTTPS on 8087 (the proxy terminates TLS —
/// the dashboard needs a secure context for OIDC PKCE, issue #15) at the node's
/// primary host IP so it's reachable from the LAN. Manifest-driven netbird no
/// longer writes `dashboard.env`, so this is derived from host facts (the same
/// `{{HOST_IP}}` the orchestrator bakes into the cert/config); it falls back to
/// the static localhost mapping when the host IP can't be read. URL shape is
/// identical to the legacy installer's, so the existing https reachability
/// wrapper still applies.
async fn netbird_configured_launch_url() -> Option<String> { async fn netbird_configured_launch_url() -> Option<String> {
if let Some(ip) = first_host_ip().await { let env = tokio::fs::read_to_string("/var/lib/archipelago/netbird/dashboard.env")
return Some(format!("https://{ip}:8087"));
}
PodmanClient::lan_address_for("netbird")
}
/// First address from `hostname -I` — the node's primary host IP. Mirrors the
/// orchestrator's `detect_host_ip` so launch URLs match the cert/config the
/// orchestrator renders for `{{HOST_IP}}`.
async fn first_host_ip() -> Option<String> {
let out = tokio::process::Command::new("hostname")
.arg("-I")
.output()
.await .await
.ok()?; .ok()?;
if !out.status.success() { env.lines()
return None; .find_map(|line| line.strip_prefix("NETBIRD_MGMT_API_ENDPOINT="))
} .map(str::trim)
String::from_utf8_lossy(&out.stdout) .filter(|s| !s.is_empty())
.split_whitespace()
.next()
.map(ToOwned::to_owned) .map(ToOwned::to_owned)
.or_else(|| PodmanClient::lan_address_for("netbird"))
} }
async fn reachable_lan_address(app_id: &str, candidate: Option<String>) -> Option<String> { async fn reachable_lan_address(app_id: &str, candidate: Option<String>) -> Option<String> {
@ -883,54 +837,3 @@ mod launch_url_port_tests {
assert_eq!(launch_url_port("http://localhost/"), None); assert_eq!(launch_url_port("http://localhost/"), None);
} }
} }
#[cfg(test)]
mod extract_lan_address_tests {
use super::extract_lan_address;
#[test]
fn skips_ssh_port_when_web_port_is_published() {
// gitea: SSH published before the web port, in podman's list order.
let ports = vec![
"0.0.0.0:2222->22/tcp".to_string(),
"0.0.0.0:3001->3000/tcp".to_string(),
];
assert_eq!(
extract_lan_address(&ports).as_deref(),
Some("http://localhost:3001")
);
}
#[test]
fn falls_back_to_first_port_when_nothing_looks_like_http() {
let ports = vec!["0.0.0.0:2222->22/tcp".to_string()];
assert_eq!(
extract_lan_address(&ports).as_deref(),
Some("http://localhost:2222")
);
}
#[test]
fn single_http_port_still_resolves() {
let ports = vec!["0.0.0.0:8096->8096/tcp".to_string()];
assert_eq!(
extract_lan_address(&ports).as_deref(),
Some("http://localhost:8096")
);
}
#[test]
fn handles_port_ranges() {
let ports = vec!["0.0.0.0:18443-18444->18443-18444/tcp".to_string()];
assert_eq!(
extract_lan_address(&ports).as_deref(),
Some("http://localhost:18443")
);
}
#[test]
fn no_ports_returns_none() {
let ports: Vec<String> = vec![];
assert_eq!(extract_lan_address(&ports), None);
}
}


@ -85,7 +85,12 @@ pub async fn run_post_install(manifest: &AppManifest, container_name: &str, data
} }
} }
async fn run_step(step: &HookStep, container: &str, app_id: &str, data_dir: &Path) -> Result<()> { async fn run_step(
step: &HookStep,
container: &str,
app_id: &str,
data_dir: &Path,
) -> Result<()> {
match step { match step {
HookStep::Exec { exec } => { HookStep::Exec { exec } => {
let mut args: Vec<&str> = Vec::with_capacity(exec.len() + 2); let mut args: Vec<&str> = Vec::with_capacity(exec.len() + 2);


@ -43,11 +43,7 @@ pub enum EnsureOutcome {
Unchanged, Unchanged,
} }
pub async fn ensure_config( pub async fn ensure_config(paths: &EnsurePaths, rpc_pass: &str) -> Result<EnsureOutcome> {
paths: &EnsurePaths,
rpc_pass: &str,
bitcoin_host: &str,
) -> Result<EnsureOutcome> {
fs::create_dir_all(&paths.data_dir) fs::create_dir_all(&paths.data_dir)
.await .await
.with_context(|| format!("creating {}", paths.data_dir.display()))?; .with_context(|| format!("creating {}", paths.data_dir.display()))?;
@ -56,7 +52,7 @@ pub async fn ensure_config(
let existing = fs::read_to_string(&paths.conf_path) let existing = fs::read_to_string(&paths.conf_path)
.await .await
.with_context(|| format!("reading {}", paths.conf_path.display()))?; .with_context(|| format!("reading {}", paths.conf_path.display()))?;
if has_required_lnd_flags(&existing, rpc_pass, bitcoin_host) { if has_required_lnd_flags(&existing, rpc_pass) {
return Ok(EnsureOutcome::Unchanged); return Ok(EnsureOutcome::Unchanged);
} }
} }
@ -72,11 +68,12 @@ restlisten=0.0.0.0:8080\n\
bitcoin.active=true\n\ bitcoin.active=true\n\
bitcoin.mainnet=true\n\ bitcoin.mainnet=true\n\
bitcoin.node=bitcoind\n\ bitcoin.node=bitcoind\n\
bitcoind.rpchost={bitcoin_host}:8332\n\ bitcoind.rpchost=bitcoin-knots:8332\n\
bitcoind.rpcuser=archipelago\n\ bitcoind.rpcuser=archipelago\n\
bitcoind.rpcpass={rpc_pass}\n\ bitcoind.rpcpass={}\n\
bitcoind.rpcpolling=true\n\ bitcoind.rpcpolling=true\n\
bitcoind.estimatemode=ECONOMICAL\n" bitcoind.estimatemode=ECONOMICAL\n",
rpc_pass
); );
write_config_atomically(paths, &conf).await?; write_config_atomically(paths, &conf).await?;
@ -656,14 +653,13 @@ fn shell_quote(s: &str) -> String {
s.replace('\'', "'\\''") s.replace('\'', "'\\''")
} }
fn has_required_lnd_flags(conf: &str, rpc_pass: &str, bitcoin_host: &str) -> bool { fn has_required_lnd_flags(conf: &str, rpc_pass: &str) -> bool {
let rpc_pass_line = format!("bitcoind.rpcpass={rpc_pass}"); let rpc_pass_line = format!("bitcoind.rpcpass={rpc_pass}");
let rpc_host_line = format!("bitcoind.rpchost={bitcoin_host}:8332");
[ [
"bitcoin.active=true", "bitcoin.active=true",
"bitcoin.mainnet=true", "bitcoin.mainnet=true",
"bitcoin.node=bitcoind", "bitcoin.node=bitcoind",
rpc_host_line.as_str(), "bitcoind.rpchost=bitcoin-knots:8332",
rpc_pass_line.as_str(), rpc_pass_line.as_str(),
] ]
.iter() .iter()
@ -682,7 +678,7 @@ mod tests {
conf_path: tmp.path().join("lnd/lnd.conf"), conf_path: tmp.path().join("lnd/lnd.conf"),
}; };
let out = ensure_config(&paths, "secret", "bitcoin-knots").await.unwrap(); let out = ensure_config(&paths, "secret").await.unwrap();
assert_eq!(out, EnsureOutcome::Written); assert_eq!(out, EnsureOutcome::Written);
let conf = fs::read_to_string(&paths.conf_path).await.unwrap(); let conf = fs::read_to_string(&paths.conf_path).await.unwrap();
assert!(conf.contains("bitcoin.active=true")); assert!(conf.contains("bitcoin.active=true"));
@ -701,46 +697,17 @@ mod tests {
}; };
assert_eq!( assert_eq!(
ensure_config(&paths, "first", "bitcoin-knots").await.unwrap(), ensure_config(&paths, "first").await.unwrap(),
EnsureOutcome::Written EnsureOutcome::Written
); );
assert_eq!( assert_eq!(
ensure_config(&paths, "second", "bitcoin-knots").await.unwrap(), ensure_config(&paths, "second").await.unwrap(),
EnsureOutcome::Written EnsureOutcome::Written
); );
let conf = fs::read_to_string(&paths.conf_path).await.unwrap(); let conf = fs::read_to_string(&paths.conf_path).await.unwrap();
assert!(conf.contains("bitcoind.rpcpass=second")); assert!(conf.contains("bitcoind.rpcpass=second"));
} }
#[tokio::test]
async fn ensure_config_repairs_bitcoin_host_drift() {
// A conf written against bitcoin-knots must be rewritten when the
// node's Bitcoin variant is bitcoin-core, or LND dials a hostname
// that doesn't exist on archy-net and dies on startup.
let tmp = tempfile::TempDir::new().unwrap();
let paths = EnsurePaths {
data_dir: tmp.path().join("lnd"),
conf_path: tmp.path().join("lnd/lnd.conf"),
};
assert_eq!(
ensure_config(&paths, "pw", "bitcoin-knots").await.unwrap(),
EnsureOutcome::Written
);
assert_eq!(
ensure_config(&paths, "pw", "bitcoin-core").await.unwrap(),
EnsureOutcome::Written
);
let conf = fs::read_to_string(&paths.conf_path).await.unwrap();
assert!(conf.contains("bitcoind.rpchost=bitcoin-core:8332"));
assert!(!conf.contains("bitcoind.rpchost=bitcoin-knots:8332"));
assert_eq!(
ensure_config(&paths, "pw", "bitcoin-core").await.unwrap(),
EnsureOutcome::Unchanged
);
}
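
The drift repair exercised by the test above depends on a "does the existing config still contain every required line" check. The body of `has_required_lnd_flags` is cut off in this diff after `.iter()`, so the sketch below is an assumption about its shape: it matches whole trimmed lines, which may differ from the real predicate. `has_required_flags` is a hypothetical name.

```rust
/// Return true only when every required setting appears as its own line.
/// Any miss (wrong password, wrong Bitcoin host, dropped flag) means the
/// config should be rewritten. Whole-line matching is an assumption here.
pub fn has_required_flags(conf: &str, rpc_pass: &str, bitcoin_host: &str) -> bool {
    let required = [
        "bitcoin.active=true".to_string(),
        "bitcoin.mainnet=true".to_string(),
        "bitcoin.node=bitcoind".to_string(),
        format!("bitcoind.rpchost={bitcoin_host}:8332"),
        format!("bitcoind.rpcpass={rpc_pass}"),
    ];
    required
        .iter()
        .all(|flag| conf.lines().any(|line| line.trim() == flag))
}
```

Matching whole lines rather than substrings avoids a false positive where one password is a prefix of another (`secret` inside `secret2`).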
#[tokio::test] #[tokio::test]
async fn ensure_config_repairs_incomplete_existing_config() { async fn ensure_config_repairs_incomplete_existing_config() {
let tmp = tempfile::TempDir::new().unwrap(); let tmp = tempfile::TempDir::new().unwrap();
@ -754,7 +721,7 @@ mod tests {
.unwrap(); .unwrap();
assert_eq!( assert_eq!(
ensure_config(&paths, "repaired", "bitcoin-knots").await.unwrap(), ensure_config(&paths, "repaired").await.unwrap(),
EnsureOutcome::Written EnsureOutcome::Written
); );
let conf = fs::read_to_string(&paths.conf_path).await.unwrap(); let conf = fs::read_to_string(&paths.conf_path).await.unwrap();


@ -14,7 +14,6 @@ pub mod quadlet;
pub mod registry; pub mod registry;
pub mod secrets; pub mod secrets;
pub mod traits; pub mod traits;
pub mod version_config;
pub use boot_reconciler::{BootReconciler, DEFAULT_INTERVAL as RECONCILER_DEFAULT_INTERVAL}; pub use boot_reconciler::{BootReconciler, DEFAULT_INTERVAL as RECONCILER_DEFAULT_INTERVAL};
pub use dev_orchestrator::DevContainerOrchestrator; pub use dev_orchestrator::DevContainerOrchestrator;

File diff suppressed because it is too large


@ -268,21 +268,14 @@ impl QuadletUnit {
let _ = writeln!(s, "HealthTimeout={}", h.timeout); let _ = writeln!(s, "HealthTimeout={}", h.timeout);
let _ = writeln!(s, "HealthRetries={}", h.retries); let _ = writeln!(s, "HealthRetries={}", h.retries);
} }
if let Some((first, rest)) = self.entrypoint.as_deref().and_then(<[String]>::split_first) { if let Some(ep) = &self.entrypoint {
// Quadlet's Exec= sets only the command (the args passed to the // Quadlet's Exec= replaces the image entrypoint+cmd. When
// image's ENTRYPOINT) — it does NOT replace the entrypoint. So a // the manifest provides both entrypoint and command we
// manifest entrypoint like `sh -lc` must be emitted as a real // concatenate; if only command is set we'll emit that on
// Entrypoint= override; otherwise it gets appended to whatever // its own below.
// ENTRYPOINT the image baked in (e.g. the versioned bitcoind let mut parts: Vec<String> = ep.clone();
// images use `ENTRYPOINT ["bitcoind"]`, which turned the wrapper
// into `bitcoind sh -lc ...` and crash-looped). Emitting
// Entrypoint= makes the unit independent of the image's entrypoint.
let _ = writeln!(s, "Entrypoint={first}");
let mut parts: Vec<String> = rest.to_vec();
parts.extend(self.command.iter().cloned()); parts.extend(self.command.iter().cloned());
if !parts.is_empty() { let _ = writeln!(s, "Exec={}", shell_join(&parts));
let _ = writeln!(s, "Exec={}", shell_join(&parts));
}
} else if !self.command.is_empty() { } else if !self.command.is_empty() {
let _ = writeln!(s, "Exec={}", shell_join(&self.command)); let _ = writeln!(s, "Exec={}", shell_join(&self.command));
} }
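
The two renderings in this hunk differ in how a manifest entrypoint is emitted. The variant that splits it, as its comments describe, puts the first element in `Entrypoint=` and the remainder plus the command in `Exec=`. A self-contained sketch of that split follows. `render_run_directives` is a hypothetical name, and plain space-joining stands in for the project's `shell_join`, so arguments that need quoting are not handled here.

```rust
/// Produce the unit lines that define what the container runs.
/// With an entrypoint: its first element becomes `Entrypoint=`, and the
/// remaining elements followed by `command` become `Exec=` (omitted if empty).
/// Without one: only `Exec=` is emitted, and only for a non-empty command.
pub fn render_run_directives(entrypoint: Option<&[String]>, command: &[String]) -> Vec<String> {
    let mut lines = Vec::new();
    if let Some((first, rest)) = entrypoint.and_then(<[String]>::split_first) {
        lines.push(format!("Entrypoint={first}"));
        let mut parts: Vec<String> = rest.to_vec();
        parts.extend(command.iter().cloned());
        if !parts.is_empty() {
            // Simplification: no shell quoting, unlike `shell_join` in the diff.
            lines.push(format!("Exec={}", parts.join(" ")));
        }
    } else if !command.is_empty() {
        lines.push(format!("Exec={}", command.join(" ")));
    }
    lines
}
```

An entrypoint of `["sh", "-lc"]` with command `["run-node"]` renders as `Entrypoint=sh` followed by `Exec=-lc run-node`; an empty entrypoint list falls through to the command-only branch.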
@ -588,12 +581,11 @@ pub async fn write_if_changed(unit: &QuadletUnit, dir: &Path) -> Result<bool> {
/// Reload the user systemd manager. Required after any quadlet write /// Reload the user systemd manager. Required after any quadlet write
/// or removal so systemd picks up the generated `.service` translation. /// or removal so systemd picks up the generated `.service` translation.
pub async fn daemon_reload_user() -> Result<()> { pub async fn daemon_reload_user() -> Result<()> {
// Bounded: a wedged user manager (e.g. a unit stuck "deactivating" while let status = Command::new("systemctl")
// podman hangs) could otherwise block daemon-reload indefinitely and freeze .args(["--user", "daemon-reload"])
// any caller — notably uninstall teardown. .status()
let status = systemctl_user_status(&["daemon-reload"], Duration::from_secs(30))
.await .await
.context("systemctl --user daemon-reload")?; .context("spawn systemctl --user daemon-reload")?;
if !status.success() { if !status.success() {
return Err(anyhow!("systemctl --user daemon-reload exited {status}")); return Err(anyhow!("systemctl --user daemon-reload exited {status}"));
} }
@ -776,11 +768,9 @@ pub fn network_aliases_changed(old_body: &str, new_body: &str) -> bool {
} }
pub fn exec_changed(old_body: &str, new_body: &str) -> bool { pub fn exec_changed(old_body: &str, new_body: &str) -> bool {
// Entrypoint= and Exec= together define what the container runs, so a drift let old_exec = directive_values(old_body, "Exec=");
// in either must recreate the container (e.g. when this renderer first let new_exec = directive_values(new_body, "Exec=");
// splits a folded `Exec=sh -lc ...` into `Entrypoint=sh` + `Exec=-lc ...`). old_exec != new_exec
directive_values(old_body, "Exec=") != directive_values(new_body, "Exec=")
|| directive_values(old_body, "Entrypoint=") != directive_values(new_body, "Entrypoint=")
} }
fn directive_values(unit_body: &str, prefix: &str) -> Vec<String> { fn directive_values(unit_body: &str, prefix: &str) -> Vec<String> {
@ -797,19 +787,11 @@ fn directive_values(unit_body: &str, prefix: &str) -> Vec<String> {
/// that systemd no longer knows about. /// that systemd no longer knows about.
pub async fn disable_remove(unit_name: &str, dir: &Path) -> Result<()> { pub async fn disable_remove(unit_name: &str, dir: &Path) -> Result<()> {
let svc = format!("{unit_name}.service"); let svc = format!("{unit_name}.service");
// Stop first; ignore failure (unit may already be down). BOUNDED — on // Stop first; ignore failure (unit may already be down).
// rootless podman a generated unit can wedge in "deactivating" while let _ = Command::new("systemctl")
// `podman rm -f` hangs underneath it, and an unbounded `systemctl stop` .args(["--user", "stop", &svc])
// would block the entire uninstall forever: the progress bar freezes and .status()
// the package entry is stranded in `Removing` (a ghost in My Apps that also .await;
// blocks reinstall). If the graceful stop times out, escalate to
// SIGKILL + reset-failed so teardown always proceeds.
if systemctl_user_status(&["stop", &svc], QUADLET_STOP_TIMEOUT)
.await
.is_err()
{
let _ = kill_and_reset_service(&svc).await;
}
let path = dir.join(format!("{unit_name}.container")); let path = dir.join(format!("{unit_name}.container"));
if fs::try_exists(&path).await.unwrap_or(false) { if fs::try_exists(&path).await.unwrap_or(false) {
match fs::remove_file(&path).await { match fs::remove_file(&path).await {
@ -820,15 +802,10 @@ pub async fn disable_remove(unit_name: &str, dir: &Path) -> Result<()> {
} }
daemon_reload_user().await.ok(); daemon_reload_user().await.ok();
// Defensive: kill the actual container too, in case quadlet left it. // Defensive: kill the actual container too, in case quadlet left it.
// Bounded so a hung podman store can't re-introduce the stall this function let _ = Command::new("podman")
// exists to avoid. .args(["rm", "-f", unit_name])
let _ = tokio::time::timeout( .status()
QUADLET_STOP_TIMEOUT, .await;
Command::new("podman")
.args(["rm", "-f", unit_name])
.status(),
)
.await;
Ok(()) Ok(())
} }
@ -1072,10 +1049,7 @@ mod tests {
assert!(s.contains("ReadOnly=true")); assert!(s.contains("ReadOnly=true"));
assert!(s.contains("NoNewPrivileges=true")); assert!(s.contains("NoNewPrivileges=true"));
assert!(s.contains("PodmanArgs=--cpus=2")); assert!(s.contains("PodmanArgs=--cpus=2"));
// Manifest entrypoint becomes a real Entrypoint= override (not folded assert!(s.contains("Exec=/usr/local/bin/bitcoind -server=1 -rpcbind=0.0.0.0"));
// into Exec=), so the unit doesn't depend on the image's own ENTRYPOINT.
assert!(s.contains("Entrypoint=/usr/local/bin/bitcoind"));
assert!(s.contains("Exec=-server=1 -rpcbind=0.0.0.0"));
assert!(s.contains("Restart=on-failure")); assert!(s.contains("Restart=on-failure"));
assert!(s.contains("Network=archy-net")); assert!(s.contains("Network=archy-net"));
} }
@ -1300,10 +1274,7 @@ app:
let u = QuadletUnit::from_manifest(&m, "x"); let u = QuadletUnit::from_manifest(&m, "x");
// tmpfs entry is dropped from bind_mounts; bind entry survives. // tmpfs entry is dropped from bind_mounts; bind entry survives.
assert_eq!(u.bind_mounts.len(), 1); assert_eq!(u.bind_mounts.len(), 1);
assert_eq!( assert_eq!(u.bind_mounts[0].host, PathBuf::from("/var/lib/archipelago/x"));
u.bind_mounts[0].host,
PathBuf::from("/var/lib/archipelago/x")
);
} }
#[test] #[test]


@ -66,7 +66,6 @@ fn ensure_one(dir: &Path, gs: &GeneratedSecret) -> Result<()> {
match gs.kind { match gs.kind {
SecretGenKind::Hex16 => write_secret(&dir.join(&gs.name), &random_hex(16))?, SecretGenKind::Hex16 => write_secret(&dir.join(&gs.name), &random_hex(16))?,
SecretGenKind::Hex32 => write_secret(&dir.join(&gs.name), &random_hex(32))?, SecretGenKind::Hex32 => write_secret(&dir.join(&gs.name), &random_hex(32))?,
SecretGenKind::Base64 => write_secret(&dir.join(&gs.name), &random_base64(32))?,
SecretGenKind::Bcrypt => { SecretGenKind::Bcrypt => {
let password = random_hex(BCRYPT_PASSWORD_BYTES); let password = random_hex(BCRYPT_PASSWORD_BYTES);
let hash = bcrypt::hash(&password, bcrypt::DEFAULT_COST) let hash = bcrypt::hash(&password, bcrypt::DEFAULT_COST)
@ -93,15 +92,6 @@ fn random_hex(bytes: usize) -> String {
hex::encode(buf) hex::encode(buf)
} }
/// `bytes` of entropy, standard base64 (with padding). For keys that a service
/// base64-decodes to recover the raw bytes (e.g. netbird's store encryptionKey).
fn random_base64(bytes: usize) -> String {
use base64::Engine as _;
let mut buf = vec![0u8; bytes];
rand::thread_rng().fill_bytes(&mut buf);
base64::engine::general_purpose::STANDARD.encode(buf)
}
/// Atomically write a `0600` secret: a temp file in the same dir (so the rename /// Atomically write a `0600` secret: a temp file in the same dir (so the rename
/// is atomic), fsynced, then renamed over the target. /// is atomic), fsynced, then renamed over the target.
fn write_secret(path: &Path, value: &str) -> Result<()> { fn write_secret(path: &Path, value: &str) -> Result<()> {
@ -169,10 +159,7 @@ mod tests {
let hash = std::fs::read_to_string(dir.path().join("admin")).unwrap(); let hash = std::fs::read_to_string(dir.path().join("admin")).unwrap();
let pw = std::fs::read_to_string(dir.path().join("admin.pw")).unwrap(); let pw = std::fs::read_to_string(dir.path().join("admin.pw")).unwrap();
assert!(hash.starts_with("$2"), "bcrypt hash shape"); assert!(hash.starts_with("$2"), "bcrypt hash shape");
assert!( assert!(bcrypt::verify(pw.trim(), hash.trim()).unwrap(), "pw matches hash");
bcrypt::verify(pw.trim(), hash.trim()).unwrap(),
"pw matches hash"
);
for f in ["tok", "admin", "admin.pw"] { for f in ["tok", "admin", "admin.pw"] {
let mode = std::fs::metadata(dir.path().join(f)) let mode = std::fs::metadata(dir.path().join(f))
@ -192,10 +179,7 @@ mod tests {
let first = std::fs::read_to_string(dir.path().join("tok")).unwrap(); let first = std::fs::read_to_string(dir.path().join("tok")).unwrap();
ensure_generated_secrets(dir.path(), &m).unwrap(); ensure_generated_secrets(dir.path(), &m).unwrap();
let second = std::fs::read_to_string(dir.path().join("tok")).unwrap(); let second = std::fs::read_to_string(dir.path().join("tok")).unwrap();
assert_eq!( assert_eq!(first, second, "a present readable secret is never rewritten");
first, second,
"a present readable secret is never rewritten"
);
} }
#[test] #[test]


@ -1,272 +0,0 @@
//! Per-app version preferences — the persistence layer for multi-version support.
//!
//! Multi-version support (`docs/bitcoin-multi-version-design.md`) lets a node
//! runner pin Bitcoin Core / Knots to a specific version and opt into
//! auto-update-to-latest. Both choices live in the existing per-app config file
//! at `/var/lib/archipelago/app-configs/<id>.json` as two keys:
//!
//! ```jsonc
//! { "pinnedVersion": "29.3.knots20260508", "autoUpdate": false }
//! ```
//!
//! This is the single source of truth the orchestrator's install path reads to
//! resolve the image, and that the auto-update tick + "available update" badge
//! consult. Reads/writes are merge-preserving so they never clobber any
//! `containerConfig` (ports/volumes/env) a generic app may also store here.
//!
//! Platform-managed apps (bitcoin-core/knots/…) never use the
//! `containerConfig`-style keys (see `config.rs::dynamic_app_config`, which
//! returns early for them), so adding these keys to their file is collision-free.
use serde_json::{Map, Value};
use std::path::PathBuf;
/// Resolved version preferences for one app. Defaults: no pin, auto-update off
/// (consensus-critical apps opt in explicitly — design open-question #4).
#[derive(Debug, Clone, Default, PartialEq, Eq)]
pub struct AppVersionConfig {
/// The version string the runner pinned, if any. Suppresses the update badge
/// and overrides the catalog default at install/recreate time.
pub pinned_version: Option<String>,
/// When true, the hourly catalog tick updates this app to the catalog
/// default automatically. Ignored while a version is pinned.
pub auto_update: bool,
}
fn config_dir() -> PathBuf {
let base = std::env::var("ARCHIPELAGO_DATA_DIR")
.unwrap_or_else(|_| "/var/lib/archipelago".to_string());
PathBuf::from(base).join("app-configs")
}
fn config_path(app_id: &str) -> PathBuf {
config_dir().join(format!("{app_id}.json"))
}
/// App ids that have opted into auto-update-to-latest AND are not pinned (a pin
/// is an explicit "stay here"). Drives the hourly per-app auto-update tick. The
/// app id is the config file stem. Returns empty when the dir is absent.
pub fn auto_update_apps() -> Vec<String> {
let mut out = Vec::new();
let Ok(entries) = std::fs::read_dir(config_dir()) else {
return out;
};
for entry in entries.flatten() {
let path = entry.path();
if path.extension().and_then(|e| e.to_str()) != Some("json") {
continue;
}
let Some(app_id) = path.file_stem().and_then(|s| s.to_str()) else {
continue;
};
let cfg = read(app_id);
if cfg.auto_update && cfg.pinned_version.is_none() {
out.push(app_id.to_string());
}
}
out
}
fn read_raw(app_id: &str) -> Map<String, Value> {
let path = config_path(app_id);
match std::fs::read_to_string(&path) {
Ok(s) => serde_json::from_str::<Value>(&s)
.ok()
.and_then(|v| v.as_object().cloned())
.unwrap_or_default(),
Err(_) => Map::new(),
}
}
/// Read the version preferences for `app_id`. Returns defaults when the file is
/// absent or the keys are unset.
pub fn read(app_id: &str) -> AppVersionConfig {
let obj = read_raw(app_id);
AppVersionConfig {
pinned_version: obj
.get("pinnedVersion")
.and_then(Value::as_str)
.filter(|s| !s.is_empty())
.map(String::from),
auto_update: obj
.get("autoUpdate")
.and_then(Value::as_bool)
.unwrap_or(false),
}
}
/// The pinned version for `app_id`, if set. Convenience for the hot path.
pub fn pinned_version(app_id: &str) -> Option<String> {
read(app_id).pinned_version
}
/// Parse the leading numeric `major.minor.patch` of a version string into a
/// comparable tuple. Each of the first three dotted components contributes its
/// leading digit run, and a component with no leading digits counts as 0, so
/// Bitcoin Core (`31.0`, `28.4`) and the Knots date-suffixed form
/// (`29.3.knots20260508` → `(29, 3, 0)`) both compare on their
/// consensus-relevant major/minor. The Knots build-date suffix is intentionally
/// ignored: a same-major.minor Knots rebuild is not a chainstate downgrade.
fn version_key(version: &str) -> (u64, u64, u64) {
let mut it = version.split('.').map(|c| {
// Take the leading digit run of each dotted component (`knots20260508`
// yields no leading digits → 0; `3` → 3).
c.chars()
.take_while(|ch| ch.is_ascii_digit())
.collect::<String>()
.parse::<u64>()
.unwrap_or(0)
});
(
it.next().unwrap_or(0),
it.next().unwrap_or(0),
it.next().unwrap_or(0),
)
}
/// True when installing `candidate` over `current` is a DOWNGRADE — an older
/// Bitcoin release over a chainstate written by a newer one. This is the
/// highest-risk operation (Core refuses to start on a newer chainstate without
/// an expensive reindex; pruned nodes can lose data), so the UI must warn and
/// the switch must be explicitly confirmed (design §4). Equal or newer → false.
pub fn is_downgrade(current: &str, candidate: &str) -> bool {
version_key(candidate) < version_key(current)
}
/// Merge `cfg` into the on-disk config, preserving every other key. A
/// `pinned_version` of `None` removes the `pinnedVersion` key (un-pins / "track
/// latest"). Creates the directory and file on first write.
pub fn write(app_id: &str, cfg: &AppVersionConfig) -> std::io::Result<()> {
let path = config_path(app_id);
let mut obj = read_raw(app_id);
match &cfg.pinned_version {
Some(v) => {
obj.insert("pinnedVersion".to_string(), Value::String(v.clone()));
}
None => {
obj.remove("pinnedVersion");
}
}
obj.insert("autoUpdate".to_string(), Value::Bool(cfg.auto_update));
if let Some(parent) = path.parent() {
std::fs::create_dir_all(parent)?;
}
let serialized = serde_json::to_string_pretty(&Value::Object(obj))
.map_err(|e| std::io::Error::new(std::io::ErrorKind::InvalidData, e))?;
// Atomic-ish write: temp + rename so a crash mid-write can't truncate config.
let tmp = path.with_extension("json.tmp");
std::fs::write(&tmp, serialized.as_bytes())?;
std::fs::rename(&tmp, &path)
}
#[cfg(test)]
mod tests {
use super::*;
// `ARCHIPELAGO_DATA_DIR` is process-global, so the write/read tests must not
// run concurrently — serialize them and give each a unique dir. Without this
// lock, parallel `cargo test` races on the env var (poisoning is fine: a
// panicking test still releases a usable guard).
static ENV_LOCK: std::sync::Mutex<u64> = std::sync::Mutex::new(0);
fn with_tmp_data_dir<F: FnOnce()>(f: F) {
let mut counter = ENV_LOCK.lock().unwrap_or_else(|e| e.into_inner());
*counter += 1;
let dir =
std::env::temp_dir().join(format!("archy-vc-test-{}-{}", std::process::id(), *counter));
let _ = std::fs::remove_dir_all(&dir);
std::fs::create_dir_all(&dir).unwrap();
std::env::set_var("ARCHIPELAGO_DATA_DIR", &dir);
f();
std::env::remove_var("ARCHIPELAGO_DATA_DIR");
let _ = std::fs::remove_dir_all(&dir);
// `counter` guard drops here, releasing the lock for the next test.
}
#[test]
fn defaults_when_absent() {
with_tmp_data_dir(|| {
let cfg = read("bitcoin-core");
assert_eq!(cfg.pinned_version, None);
assert!(!cfg.auto_update);
});
}
#[test]
fn write_then_read_roundtrips() {
with_tmp_data_dir(|| {
write(
"bitcoin-knots",
&AppVersionConfig {
pinned_version: Some("29.3.knots20260508".into()),
auto_update: false,
},
)
.unwrap();
let cfg = read("bitcoin-knots");
assert_eq!(cfg.pinned_version.as_deref(), Some("29.3.knots20260508"));
assert!(!cfg.auto_update);
});
}
#[test]
fn write_preserves_existing_keys() {
with_tmp_data_dir(|| {
// Simulate a generic app's containerConfig already on disk.
let path = config_path("someapp");
std::fs::create_dir_all(path.parent().unwrap()).unwrap();
std::fs::write(&path, r#"{"ports":["80:80"],"autoUpdate":false}"#).unwrap();
write(
"someapp",
&AppVersionConfig {
pinned_version: Some("1.2.3".into()),
auto_update: true,
},
)
.unwrap();
let raw = read_raw("someapp");
assert!(raw.contains_key("ports"), "ports key must survive");
assert_eq!(raw.get("pinnedVersion").unwrap(), "1.2.3");
assert_eq!(raw.get("autoUpdate").unwrap(), &Value::Bool(true));
});
}
#[test]
fn downgrade_detection() {
// Older over newer = downgrade.
assert!(is_downgrade("31.0", "30.0"));
assert!(is_downgrade("28.4", "27.2"));
// Same or newer = not a downgrade.
assert!(!is_downgrade("30.0", "31.0"));
assert!(!is_downgrade("28.4", "28.4"));
// Knots date-suffixed strings compare on major.minor only.
assert!(is_downgrade("29.3.knots20260508", "28.1.knots20251010"));
assert!(!is_downgrade("29.3.knots20260101", "29.3.knots20260508"));
}
#[test]
fn unpin_removes_key() {
with_tmp_data_dir(|| {
write(
"bitcoin-core",
&AppVersionConfig {
pinned_version: Some("31.0".into()),
auto_update: true,
},
)
.unwrap();
write(
"bitcoin-core",
&AppVersionConfig {
pinned_version: None,
auto_update: true,
},
)
.unwrap();
let raw = read_raw("bitcoin-core");
assert!(!raw.contains_key("pinnedVersion"));
assert_eq!(read("bitcoin-core").pinned_version, None);
assert!(read("bitcoin-core").auto_update);
});
}
}
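
The version comparison in the removed file above can be exercised on its own. The following is a minimal standalone sketch that mirrors `version_key` and `is_downgrade` as they appear in the diff (only local variable names differ), so the edge cases are easy to check in isolation; it is an illustration, not a replacement for the module.

```rust
/// Leading numeric `major.minor.patch` of a version string as a comparable
/// tuple. A dotted component with no leading digits contributes 0.
fn version_key(version: &str) -> (u64, u64, u64) {
    let mut parts = version.split('.').map(|component| {
        component
            .chars()
            .take_while(|ch| ch.is_ascii_digit())
            .collect::<String>()
            .parse::<u64>()
            .unwrap_or(0)
    });
    (
        parts.next().unwrap_or(0),
        parts.next().unwrap_or(0),
        parts.next().unwrap_or(0),
    )
}

/// True when `candidate` sorts strictly below `current`.
fn is_downgrade(current: &str, candidate: &str) -> bool {
    version_key(candidate) < version_key(current)
}
```

One consequence worth noting: because only the leading digit run of each component is read, a string with a non-digit prefix such as `v31.0` parses to `(0, 0, 0)`. Callers would presumably need to pass bare version strings for the comparison to be meaningful.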


@ -153,9 +153,7 @@ pub async fn read_owned(
onion: &str, onion: &str,
content_id: &str, content_id: &str,
) -> Option<(String, Vec<u8>)> { ) -> Option<(String, Vec<u8>)> {
let bytes = fs::read(bytes_path(data_dir, onion, content_id)) let bytes = fs::read(bytes_path(data_dir, onion, content_id)).await.ok()?;
.await
.ok()?;
let mime = load_index(data_dir) let mime = load_index(data_dir)
.await .await
.items .items


@ -7,7 +7,7 @@ use anyhow::{Context, Result};
use serde::{Deserialize, Serialize}; use serde::{Deserialize, Serialize};
use std::path::{Path, PathBuf}; use std::path::{Path, PathBuf};
use tokio::fs; use tokio::fs;
use tracing::{debug, warn}; use tracing::debug;
const CATALOG_FILE: &str = "content/catalog.json"; const CATALOG_FILE: &str = "content/catalog.json";
const CONTENT_DIR: &str = "content/files"; const CONTENT_DIR: &str = "content/files";
@ -86,22 +86,6 @@ pub async fn save_catalog(data_dir: &Path, catalog: &ContentCatalog) -> Result<(
Ok(()) Ok(())
} }
/// Removes `id` from the on-disk catalog. Best-effort: a failure here just
/// means the entry gets pruned again next time it's requested, so errors are
/// logged rather than propagated.
async fn prune_missing_content_entry(data_dir: &Path, id: &str) {
let Ok(mut catalog) = load_catalog(data_dir).await else {
return;
};
let before = catalog.items.len();
catalog.items.retain(|i| i.id != id);
if catalog.items.len() != before {
if let Err(e) = save_catalog(data_dir, &catalog).await {
warn!(error = %e, content_id = %id, "failed to save catalog after pruning missing content entry");
}
}
}
/// Get the full filesystem path for a content item. /// Get the full filesystem path for a content item.
/// Checks the dedicated content/files/ directory first, then falls back to the /// Checks the dedicated content/files/ directory first, then falls back to the
/// FileBrowser data directory (where users manage files via the web UI). /// FileBrowser data directory (where users manage files via the web UI).
@ -284,19 +268,6 @@ pub async fn serve_content(
let file_path = content_file_path(data_dir, item); let file_path = content_file_path(data_dir, item);
if !file_path.exists() { if !file_path.exists() {
// The catalog entry survived (it's a separate JSON file) but its
// backing file is gone — most likely lost in an unrelated data-dir
// reset (a shared filebrowser file, 2026-07-01: two catalog entries
// outlived a filebrowser reinstall that wiped the files themselves).
// Leaving the entry in place would keep advertising it as available
// to every peer forever, each hitting the exact same dead end this
// one just did. Prune it so it stops being offered.
warn!(
content_id = %id,
filename = %item.filename,
"content catalog entry's file is missing on disk — pruning the stale entry"
);
prune_missing_content_entry(data_dir, id).await;
return Ok(ServeResult::NotFound); return Ok(ServeResult::NotFound);
} }
@ -584,95 +555,3 @@ mod faststart_tests {
assert_eq!(mp4_is_faststart(&p).await, Some(false)); assert_eq!(mp4_is_faststart(&p).await, Some(false));
} }
} }
#[cfg(test)]
mod prune_missing_content_tests {
use super::*;
#[tokio::test]
async fn serve_content_prunes_catalog_entry_whose_file_is_missing() {
// Simulates a catalog entry that outlived its backing file (a shared
// filebrowser file lost in an unrelated data-dir reset, 2026-07-01) —
// every peer request for it would otherwise 404 forever with no way
// to tell it apart from a transient failure.
let dir = tempfile::tempdir().unwrap();
let data_dir = dir.path();
let item = ContentItem {
id: "missing-item".to_string(),
filename: "gone.mp4".to_string(),
mime_type: "video/mp4".to_string(),
size_bytes: 123,
description: String::new(),
access: AccessControl::Free,
availability: Availability::AllPeers,
added_at: "2026-01-01T00:00:00Z".to_string(),
};
save_catalog(
data_dir,
&ContentCatalog {
items: vec![item],
},
)
.await
.unwrap();
// File was never written to disk under content/files/ or filebrowser/.
let result = serve_content(data_dir, "missing-item", None, None, None, None)
.await
.unwrap();
assert!(matches!(result, ServeResult::NotFound));
let reloaded = load_catalog(data_dir).await.unwrap();
assert!(
reloaded.items.is_empty(),
"stale entry should have been pruned after the 404"
);
}
#[tokio::test]
async fn serve_content_leaves_other_entries_untouched_when_pruning() {
let dir = tempfile::tempdir().unwrap();
let data_dir = dir.path();
let missing = ContentItem {
id: "missing-item".to_string(),
filename: "gone.mp4".to_string(),
mime_type: "video/mp4".to_string(),
size_bytes: 123,
description: String::new(),
access: AccessControl::Free,
availability: Availability::AllPeers,
added_at: "2026-01-01T00:00:00Z".to_string(),
};
let present = ContentItem {
id: "present-item".to_string(),
filename: "here.mp4".to_string(),
mime_type: "video/mp4".to_string(),
size_bytes: 4,
description: String::new(),
access: AccessControl::Free,
availability: Availability::AllPeers,
added_at: "2026-01-01T00:00:00Z".to_string(),
};
save_catalog(
data_dir,
&ContentCatalog {
items: vec![missing, present],
},
)
.await
.unwrap();
let content_dir = data_dir.join("content").join("files");
tokio::fs::create_dir_all(&content_dir).await.unwrap();
tokio::fs::write(content_dir.join("here.mp4"), b"data")
.await
.unwrap();
let _ = serve_content(data_dir, "missing-item", None, None, None, None)
.await
.unwrap();
let reloaded = load_catalog(data_dir).await.unwrap();
assert_eq!(reloaded.items.len(), 1);
assert_eq!(reloaded.items[0].id, "present-item");
}
}
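
The pruning step that this diff removes follows a "retain, then save only if something changed" pattern. Below is a minimal in-memory sketch of that pattern. The `Item` and `Catalog` types and the `prune_entry` name are hypothetical stand-ins for `ContentItem`, `ContentCatalog` and `prune_missing_content_entry`, and the disk I/O (`load_catalog` / `save_catalog`) is deliberately left out.

```rust
/// Hypothetical stand-in for a catalog entry; only the fields needed here.
#[derive(Debug, Clone, PartialEq)]
struct Item {
    id: String,
    filename: String,
}

/// Hypothetical stand-in for the on-disk catalog.
#[derive(Debug, Default)]
struct Catalog {
    items: Vec<Item>,
}

/// Drop every entry whose id equals `id`. Returns true when the catalog
/// actually changed, i.e. when the caller would need to persist it.
fn prune_entry(catalog: &mut Catalog, id: &str) -> bool {
    let before = catalog.items.len();
    catalog.items.retain(|item| item.id != id);
    catalog.items.len() != before
}
```

Returning whether anything was removed lets the caller skip the write when the entry was already gone, which matches the `before` / `len()` comparison in the removed function.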


@ -20,7 +20,6 @@ use tracing::{info, warn};
const PID_FILE: &str = "archipelago.pid"; const PID_FILE: &str = "archipelago.pid";
const CONTAINER_STATE_FILE: &str = "running-containers.json"; const CONTAINER_STATE_FILE: &str = "running-containers.json";
const USER_STOPPED_FILE: &str = "user-stopped.json"; const USER_STOPPED_FILE: &str = "user-stopped.json";
const USER_UNINSTALLED_FILE: &str = "user-uninstalled.json";
/// Shared flag: true once boot recovery is complete. Health monitor should wait for this. /// Shared flag: true once boot recovery is complete. Health monitor should wait for this.
pub static RECOVERY_COMPLETE: AtomicBool = AtomicBool::new(false); pub static RECOVERY_COMPLETE: AtomicBool = AtomicBool::new(false);
@ -49,46 +48,6 @@ pub fn is_recovery_complete() -> bool {
RECOVERY_COMPLETE.load(Ordering::SeqCst) RECOVERY_COMPLETE.load(Ordering::SeqCst)
} }
// ── Pending boot-start tracking ─────────────────────────────────────────
// Containers that boot recovery / the reconciler is about to start (or is
// starting right now). The package scanner overlays these as `Restarting`
// instead of the raw podman `Stopped`/`Exited`, so a freshly rebooted node
// doesn't tell the user their apps are "Stopped" while the sequential
// recovery pass (3s stagger + up to minutes for heavyweights like bitcoin)
// is still working through the queue. Writers register names when a pass
// begins and remove each name once its start attempt finishes, whatever
// the outcome — a container that truly failed goes back to showing its
// real state on the next scan.
static PENDING_BOOT_STARTS: std::sync::LazyLock<std::sync::RwLock<std::collections::HashSet<String>>> =
std::sync::LazyLock::new(|| std::sync::RwLock::new(std::collections::HashSet::new()));
/// Register container/app names an active recovery or reconcile pass
/// intends to start.
pub fn pending_boot_starts_add<I: IntoIterator<Item = String>>(names: I) {
if let Ok(mut set) = PENDING_BOOT_STARTS.write() {
set.extend(names);
}
}
/// A start attempt for `name` finished (success or failure) — stop
/// overlaying it.
pub fn pending_boot_start_done(name: &str) {
if let Ok(mut set) = PENDING_BOOT_STARTS.write() {
set.remove(name);
}
}
/// Whether `name` (a container name or scanner app id) is queued for a
/// boot/reconcile start. Container names may carry an `archy-` prefix the
/// scanner strips when deriving app ids, so check both forms.
pub fn is_pending_boot_start(name: &str) -> bool {
let Ok(set) = PENDING_BOOT_STARTS.read() else {
return false;
};
set.contains(name) || set.contains(&format!("archy-{name}"))
}
// ── User-stopped tracking ─────────────────────────────────────────────── // ── User-stopped tracking ───────────────────────────────────────────────
// When a user explicitly stops a container via the UI, we record it here // When a user explicitly stops a container via the UI, we record it here
// so crash recovery and health monitor don't auto-restart it. // so crash recovery and health monitor don't auto-restart it.
@ -102,22 +61,6 @@ pub async fn load_user_stopped(data_dir: &Path) -> std::collections::HashSet<Str
} }
} }
/// Names of the containers that were running at the last periodic snapshot
/// (`running-containers.json`, saved every ~120s by `save_container_snapshot`).
/// Unlike `check_for_crash`, this reads the snapshot unconditionally (no PID/crash
/// gate) — it's the durable "what was running" signal the boot reconciler uses to
/// recreate a previously-running app whose container vanished. Empty if absent.
pub async fn load_last_running_names(data_dir: &Path) -> std::collections::HashSet<String> {
let path = data_dir.join(CONTAINER_STATE_FILE);
match fs::read_to_string(&path).await {
Ok(content) => match serde_json::from_str::<ContainerSnapshot>(&content) {
Ok(snapshot) => snapshot.containers.into_iter().map(|c| c.name).collect(),
Err(_) => std::collections::HashSet::new(),
},
Err(_) => std::collections::HashSet::new(),
}
}
/// Save the set of user-stopped containers to disk. /// Save the set of user-stopped containers to disk.
pub async fn save_user_stopped(data_dir: &Path, stopped: &std::collections::HashSet<String>) { pub async fn save_user_stopped(data_dir: &Path, stopped: &std::collections::HashSet<String>) {
let path = data_dir.join(USER_STOPPED_FILE); let path = data_dir.join(USER_STOPPED_FILE);
@ -141,51 +84,6 @@ pub async fn clear_user_stopped(data_dir: &Path, name: &str) {
} }
} }
// ── User-uninstalled tracking ───────────────────────────────────────────
// Baseline apps (bitcoin-knots, electrumx, lnd, mempool, ...) self-heal when
// their container is missing — see `is_required_baseline_app` in
// prod_orchestrator.rs — because they're expected to exist from first boot.
// That self-heal has no way to distinguish "container vanished after a
// crash" from "user explicitly uninstalled this," and the in-memory
// `disabled` set the orchestrator otherwise uses is wiped by every
// `load_manifests()` call (once per archipelago startup). Without a durable
// marker, uninstalling a baseline app only "sticks" until the next reboot or
// archipelago restart, at which point the boot reconciler resurrects it.
// This mirrors `user_stopped` exactly, just for uninstall instead of stop.
/// Load the set of explicitly user-uninstalled app/container names from disk.
pub async fn load_user_uninstalled(data_dir: &Path) -> std::collections::HashSet<String> {
let path = data_dir.join(USER_UNINSTALLED_FILE);
match fs::read_to_string(&path).await {
Ok(content) => serde_json::from_str(&content).unwrap_or_default(),
Err(_) => std::collections::HashSet::new(),
}
}
/// Save the set of user-uninstalled app/container names to disk.
pub async fn save_user_uninstalled(data_dir: &Path, uninstalled: &std::collections::HashSet<String>) {
let path = data_dir.join(USER_UNINSTALLED_FILE);
if let Ok(json) = serde_json::to_string_pretty(uninstalled) {
let _ = fs::write(&path, json).await;
}
}
/// Mark a name as user-uninstalled (won't be self-healed by the baseline-app
/// reconciler across restarts/reboots).
pub async fn mark_user_uninstalled(data_dir: &Path, name: &str) {
let mut uninstalled = load_user_uninstalled(data_dir).await;
uninstalled.insert(name.to_string());
save_user_uninstalled(data_dir, &uninstalled).await;
}
/// Clear the user-uninstalled flag (app was explicitly (re)installed/started).
pub async fn clear_user_uninstalled(data_dir: &Path, name: &str) {
let mut uninstalled = load_user_uninstalled(data_dir).await;
if uninstalled.remove(name) {
save_user_uninstalled(data_dir, &uninstalled).await;
}
}
#[derive(Debug, Clone, Serialize, Deserialize)] #[derive(Debug, Clone, Serialize, Deserialize)]
pub struct RunningContainerRecord { pub struct RunningContainerRecord {
pub name: String, pub name: String,
@ -218,17 +116,10 @@ pub async fn check_for_crash(data_dir: &Path) -> Result<Option<Vec<RunningContai
old_pid old_pid
); );
// Check if that PID is actually still running (zombie/stuck process). // Check if that PID is actually still running (zombie/stuck process)
// Guard against PID reuse: after a reboot the old PID often belongs to an
// unrelated process (or, before the main.rs ordering fix, to OURSELVES) —
// only treat it as "previous instance still alive" if it's a live process
// that is not us and whose cmdline looks like the archipelago binary.
if !old_pid.is_empty() { if !old_pid.is_empty() {
if let Ok(pid) = old_pid.parse::<u32>() { if let Ok(pid) = old_pid.parse::<u32>() {
if pid != std::process::id() if is_process_running(pid) {
&& is_process_running(pid)
&& process_is_archipelago(pid)
{
warn!( warn!(
"Previous process (PID {}) is still running — not a crash, skipping recovery", "Previous process (PID {}) is still running — not a crash, skipping recovery",
pid pid
@ -358,8 +249,6 @@ pub async fn recover_containers(containers: &[RunningContainerRecord]) -> Recove
failed: Vec::new(), failed: Vec::new(),
}; };
pending_boot_starts_add(containers.iter().map(|r| r.name.clone()));
for (i, record) in containers.iter().enumerate() { for (i, record) in containers.iter().enumerate() {
info!( info!(
"Recovering container: {} (image: {})", "Recovering container: {} (image: {})",
@ -422,7 +311,6 @@ pub async fn recover_containers(containers: &[RunningContainerRecord]) -> Recove
if !started { if !started {
report.failed.push(record.name.clone()); report.failed.push(record.name.clone());
} }
pending_boot_start_done(&record.name);
} }
report report
@ -441,16 +329,6 @@ fn is_process_running(pid: u32) -> bool {
std::path::Path::new(&format!("/proc/{}", pid)).exists() std::path::Path::new(&format!("/proc/{}", pid)).exists()
} }
/// Whether the process at `pid` looks like an archipelago instance. Used to
/// tell "the previous instance is genuinely still alive" apart from PID
/// reuse by an unrelated process after a reboot.
fn process_is_archipelago(pid: u32) -> bool {
match std::fs::read(format!("/proc/{pid}/cmdline")) {
Ok(cmdline) => String::from_utf8_lossy(&cmdline).contains("archipelago"),
Err(_) => false,
}
}
/// Start all stopped containers that were previously installed. /// Start all stopped containers that were previously installed.
/// Runs on every startup to ensure containers come back after clean reboots. /// Runs on every startup to ensure containers come back after clean reboots.
/// The crash recovery (PID-based) handles dirty shutdowns; this handles clean ones. /// The crash recovery (PID-based) handles dirty shutdowns; this handles clean ones.
@ -475,7 +353,7 @@ async fn start_stopped_app_stacks(data_dir: &Path) -> RecoveryReport {
}; };
for stack in stack_recovery_specs() { for stack in stack_recovery_specs() {
if !stack_anchor_container_exists(stack).await { if !stack_has_any_container(stack).await {
continue; continue;
} }
@ -485,34 +363,16 @@ async fn start_stopped_app_stacks(data_dir: &Path) -> RecoveryReport {
); );
repair_stack_network_aliases(stack).await; repair_stack_network_aliases(stack).await;
// Register the whole stack up front: the per-member dependency waits
// below can take minutes, and the UI should say "Restarting", not
// "Stopped", for members still queued behind them.
pending_boot_starts_add(
stack
.containers
.iter()
.filter(|c| !user_stopped.contains(**c))
.map(|c| (*c).to_string()),
);
for container in stack.containers { for container in stack.containers {
if user_stopped.contains(*container) { if user_stopped.contains(*container) {
info!("Skipping user-stopped container: {}", container); info!("Skipping user-stopped container: {}", container);
continue; continue;
} }
let state = container_state(container).await; match container_state(container).await {
match state { Some(state) if state == "running" => continue,
Some(state) if state == "running" => {
pending_boot_start_done(container);
continue;
}
Some(_) => {} Some(_) => {}
None => { None => continue,
pending_boot_start_done(container);
continue;
}
} }
repair_stack_network_aliases(stack).await; repair_stack_network_aliases(stack).await;
@ -524,7 +384,6 @@ async fn start_stopped_app_stacks(data_dir: &Path) -> RecoveryReport {
} else { } else {
report.failed.push((*container).to_string()); report.failed.push((*container).to_string());
} }
pending_boot_start_done(container);
} }
} }
@ -698,11 +557,6 @@ struct StackRecoverySpec {
network: &'static str, network: &'static str,
aliases: &'static [(&'static str, &'static str)], aliases: &'static [(&'static str, &'static str)],
containers: &'static [&'static str], containers: &'static [&'static str],
/// The stack's core dependency (its DB / server container) — every other
/// member depends on this being present. Used to distinguish "a genuinely
/// installed stack has a crashed member" from "orphan debris from a
/// partial/failed install" (see `stack_anchor_container_exists`).
anchor: &'static str,
} }
fn stack_recovery_specs() -> &'static [StackRecoverySpec] { fn stack_recovery_specs() -> &'static [StackRecoverySpec] {
@ -716,7 +570,6 @@ fn stack_recovery_specs() -> &'static [StackRecoverySpec] {
("immich_server", "immich_server"), ("immich_server", "immich_server"),
], ],
containers: &["immich_postgres", "immich_redis", "immich_server"], containers: &["immich_postgres", "immich_redis", "immich_server"],
anchor: "immich_postgres",
}, },
StackRecoverySpec { StackRecoverySpec {
name: "indeedhub", name: "indeedhub",
@ -738,7 +591,6 @@ fn stack_recovery_specs() -> &'static [StackRecoverySpec] {
"indeedhub-ffmpeg", "indeedhub-ffmpeg",
"indeedhub", "indeedhub",
], ],
anchor: "indeedhub-postgres",
}, },
StackRecoverySpec { StackRecoverySpec {
name: "netbird", name: "netbird",
@ -749,20 +601,17 @@ fn stack_recovery_specs() -> &'static [StackRecoverySpec] {
("netbird", "netbird"), ("netbird", "netbird"),
], ],
containers: &["netbird-server", "netbird-dashboard", "netbird"], containers: &["netbird-server", "netbird-dashboard", "netbird"],
anchor: "netbird-server",
}, },
] ]
} }
/// Whether the stack's core dependency container exists at all (running or async fn stack_has_any_container(stack: &StackRecoverySpec) -> bool {
/// not — existence, not health, is what matters here). `false` means any for container in stack.containers {
/// other stack member still lying around is orphan debris from a partial or if container_state(container).await.is_some() {
/// already-uninstalled install, not a legitimately-installed-but-crashed return true;
/// stack — blindly restarting those siblings just crash-loops them forever }
/// against a dependency that was never created (indeedhub-api on `.116`, }
/// 2026-07-01: retried every 120s against a nonexistent indeedhub-postgres). false
async fn stack_anchor_container_exists(stack: &StackRecoverySpec) -> bool {
container_state(stack.anchor).await.is_some()
} }
async fn repair_stack_network_aliases(stack: &StackRecoverySpec) { async fn repair_stack_network_aliases(stack: &StackRecoverySpec) {
@ -1049,43 +898,6 @@ mod tests {
assert_eq!(containers[1].name, "archy-mempool-web"); assert_eq!(containers[1].name, "archy-mempool-web");
} }
#[tokio::test]
async fn test_load_last_running_names_reads_snapshot_without_pid_gate() {
let tmp = TempDir::new().unwrap();
// No PID file written — load_last_running_names must NOT require a crash.
let snapshot = ContainerSnapshot {
timestamp: 1000,
containers: vec![
RunningContainerRecord {
name: "immich_server".to_string(),
image: "immich:2.7".to_string(),
},
RunningContainerRecord {
name: "immich_postgres".to_string(),
image: "postgres:16".to_string(),
},
],
};
fs::write(
tmp.path().join(CONTAINER_STATE_FILE),
serde_json::to_string(&snapshot).unwrap(),
)
.await
.unwrap();
let names = load_last_running_names(tmp.path()).await;
assert_eq!(names.len(), 2);
assert!(names.contains("immich_server"));
assert!(names.contains("immich_postgres"));
assert!(!names.contains("immich_redis"));
}
#[tokio::test]
async fn test_load_last_running_names_empty_when_absent() {
let tmp = TempDir::new().unwrap();
assert!(load_last_running_names(tmp.path()).await.is_empty());
}
#[tokio::test] #[tokio::test]
async fn test_write_and_remove_pid_marker() { async fn test_write_and_remove_pid_marker() {
let tmp = TempDir::new().unwrap(); let tmp = TempDir::new().unwrap();
@ -1148,27 +960,4 @@ mod tests {
true true
)); ));
} }
#[test]
fn stack_recovery_anchor_is_the_stacks_own_core_dependency() {
// Every stack's anchor must be one of its own containers (typically
// the DB/server the rest depend on) — a typo here would silently
// disable orphan-debris protection for that stack.
for stack in stack_recovery_specs() {
assert!(
stack.containers.contains(&stack.anchor),
"{}: anchor {} not among its own containers",
stack.name,
stack.anchor
);
}
assert_eq!(
stack_recovery_specs()
.iter()
.find(|s| s.name == "indeedhub")
.unwrap()
.anchor,
"indeedhub-postgres"
);
}
} }
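
Note on the two stack predicates in this file: one side of the diff gates recovery on `stack_anchor_container_exists`, the other on `stack_has_any_container`. The sketch below is a simplified illustration of how they differ on orphan debris. It assumes container existence is available as a plain set of names; `StackSpec`, `has_any_container`, and `anchor_exists` are hypothetical stand-ins, not the project's real async functions.

```rust
use std::collections::HashSet;

/// Hypothetical, trimmed-down stand-in for the diff's `StackRecoverySpec`.
struct StackSpec {
    containers: &'static [&'static str],
    /// Core dependency (DB / server) that every other member needs.
    anchor: &'static str,
}

/// Looser check: true as soon as any member of the stack exists.
fn has_any_container(stack: &StackSpec, existing: &HashSet<&str>) -> bool {
    stack.containers.iter().any(|c| existing.contains(*c))
}

/// Stricter check: true only if the anchor container itself exists
/// (existence, not health, is what is tested).
fn anchor_exists(stack: &StackSpec, existing: &HashSet<&str>) -> bool {
    existing.contains(stack.anchor)
}
```

With only `indeedhub-api` left over from a failed install, the looser check still reports the stack as present, so recovery would keep restarting a member whose database was never created. The anchor check skips the stack in that situation.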


@ -61,18 +61,6 @@ pub struct ServerInfo {
/// True if this node's keys are derived from a BIP-39 seed. /// True if this node's keys are derived from a BIP-39 seed.
#[serde(rename = "seed-backed", default)] #[serde(rename = "seed-backed", default)]
pub seed_backed: bool, pub seed_backed: bool,
/// This node's own physical location, for the Mesh Map — opt-in only
/// (see `share_location`), set via `server.set-location`. `None` until
/// the user sets one, regardless of `share_location`.
#[serde(default)]
pub lat: Option<f64>,
#[serde(default)]
pub lon: Option<f64>,
/// Whether `lat`/`lon` should be included in the state snapshot we send
/// to trusted federation peers (so they can plot us on their Mesh Map).
/// Defaults to false — never shared unless explicitly turned on.
#[serde(rename = "share-location", default)]
pub share_location: bool,
} }
#[derive(Debug, Clone, Serialize, Deserialize, PartialEq)] #[derive(Debug, Clone, Serialize, Deserialize, PartialEq)]
@ -359,9 +347,6 @@ impl DataModel {
wifi_ssids: vec![], wifi_ssids: vec![],
zram_enabled: false, zram_enabled: false,
seed_backed: false, seed_backed: false,
lat: None,
lon: None,
share_location: false,
}, },
package_data: HashMap::new(), package_data: HashMap::new(),
peer_health: HashMap::new(), peer_health: HashMap::new(),


@ -296,9 +296,7 @@ pub(crate) async fn notify_join(
status = %resp.status(), status = %resp.status(),
"peer-joined notification rejected; will retry" "peer-joined notification rejected; will retry"
), ),
Err(e) => { Err(e) => tracing::warn!(attempt, error = %e, "peer-joined notification failed; will retry"),
tracing::warn!(attempt, error = %e, "peer-joined notification failed; will retry")
}
} }
tokio::time::sleep(std::time::Duration::from_secs(10 * attempt as u64)).await; tokio::time::sleep(std::time::Duration::from_secs(10 * attempt as u64)).await;
} }


@ -506,8 +506,6 @@ mod tests {
nostr_npub: None, nostr_npub: None,
own_fips_npub: None, own_fips_npub: None,
federated_peers: Vec::new(), federated_peers: Vec::new(),
lat: None,
lon: None,
}; };
update_node_state(dir.path(), "did:key:z1", state) update_node_state(dir.path(), "did:key:z1", state)


@ -208,7 +208,6 @@ async fn merge_transitive_peers(
/// and route directly over FIPS from now on). Only peers we trust are /// and route directly over FIPS from now on). Only peers we trust are
/// shared — an Untrusted/Observer node should not be re-exported /// shared — an Untrusted/Observer node should not be re-exported
/// through us to the network. /// through us to the network.
#[allow(clippy::too_many_arguments)]
pub fn build_local_state( pub fn build_local_state(
apps: Vec<AppStatus>, apps: Vec<AppStatus>,
cpu: f64, cpu: f64,
@ -222,9 +221,6 @@ pub fn build_local_state(
nostr_npub: Option<String>, nostr_npub: Option<String>,
own_fips_npub: Option<String>, own_fips_npub: Option<String>,
federated_peers: &[FederatedNode], federated_peers: &[FederatedNode],
// Only Some when the node has opted in via server.set-location's
// `share` flag — see NodeStateSnapshot::lat/lon's doc comment.
shared_location: Option<(f64, f64)>,
) -> NodeStateSnapshot { ) -> NodeStateSnapshot {
let hints = federated_peers let hints = federated_peers
.iter() .iter()
@ -252,8 +248,6 @@ pub fn build_local_state(
nostr_npub, nostr_npub,
own_fips_npub, own_fips_npub,
federated_peers: hints, federated_peers: hints,
lat: shared_location.map(|(lat, _)| lat),
lon: shared_location.map(|(_, lon)| lon),
} }
} }
@ -347,14 +341,12 @@ mod tests {
None, None,
None, None,
&[], &[],
None,
); );
assert_eq!(state.apps.len(), 1); assert_eq!(state.apps.len(), 1);
assert_eq!(state.cpu_usage_percent, Some(25.5)); assert_eq!(state.cpu_usage_percent, Some(25.5));
assert_eq!(state.tor_active, Some(true)); assert_eq!(state.tor_active, Some(true));
assert_eq!(state.node_name, Some("Test Node".to_string())); assert_eq!(state.node_name, Some("Test Node".to_string()));
assert!(state.federated_peers.is_empty()); assert!(state.federated_peers.is_empty());
assert_eq!(state.lat, None);
} }
#[test] #[test]
@ -400,7 +392,7 @@ mod tests {
last_transport_at: None, last_transport_at: None,
}, },
]; ];
let state = build_local_state(vec![], 0.0, 0, 0, 0, 0, 0, true, None, None, None, &peers, None); let state = build_local_state(vec![], 0.0, 0, 0, 0, 0, 0, true, None, None, None, &peers);
assert_eq!(state.federated_peers.len(), 1); assert_eq!(state.federated_peers.len(), 1);
assert_eq!(state.federated_peers[0].did, "did:key:zTrusted"); assert_eq!(state.federated_peers[0].did, "did:key:zTrusted");
assert_eq!( assert_eq!(


@ -93,14 +93,6 @@ pub struct NodeStateSnapshot {
/// re-export them in her own state snapshots). /// re-export them in her own state snapshots).
#[serde(default)] #[serde(default)]
pub federated_peers: Vec<FederationPeerHint>, pub federated_peers: Vec<FederationPeerHint>,
/// This node's own location, for the Mesh Map — only present when the
/// sender has opted in via `server.set-location`'s `share` flag. Absent
/// (not just null) for nodes that haven't opted in, so older receivers
/// and the map's "no location shared" state both fall out naturally.
#[serde(default)]
pub lat: Option<f64>,
#[serde(default)]
pub lon: Option<f64>,
} }
/// Minimal peer summary shared via `NodeStateSnapshot.federated_peers`. /// Minimal peer summary shared via `NodeStateSnapshot.federated_peers`.


@ -216,44 +216,6 @@ pub struct ApplyResult {
pub message: String, pub message: String,
} }
/// FIPS UDP transport port (matches `transports.udp.bind_addr` in the generated
/// `fips.yaml`). Direct peer links dial this, NOT the HTTP/LAN messaging port.
const FIPS_UDP_PORT: u16 = 8668;
/// Build transient seed-anchor entries that dial LAN-discovered federation peers
/// directly over their FIPS UDP transport. For each peer the registry knows both
/// a LAN socket address AND a FIPS npub for, point a `udp` anchor at
/// `<lan-ip>:8668`. This lets co-located federation nodes form a DIRECT FIPS link
/// instead of depending on the global anchor's spanning tree to route between
/// them (the cause of every dial falling back to Tor when the anchor link flaps).
///
/// This is FIPS's own UDP transport over the LAN — not Tailscale, not the LAN
/// HTTP messaging port. NOT persisted to `seed-anchors.json`: recomputed each
/// apply tick from live LAN discovery, so a peer's changing IP self-corrects and
/// stale entries never accumulate. `fipsctl connect` is idempotent, so
/// re-applying just keeps the link warm.
pub fn lan_fips_anchors(peers: &[crate::transport::PeerRecord]) -> Vec<SeedAnchor> {
let mut out = Vec::new();
for p in peers {
let (Some(lan), Some(npub)) = (p.lan_address.as_deref(), p.fips_npub.as_deref()) else {
continue;
};
// lan_address is the peer's HTTP/LAN socket ("ip:port"); reuse only its IP
// and target the FIPS UDP port. SocketAddr::new(...).to_string() formats
// IPv6 with brackets correctly.
let Ok(sa) = lan.parse::<std::net::SocketAddr>() else {
continue;
};
out.push(SeedAnchor {
npub: npub.to_string(),
address: std::net::SocketAddr::new(sa.ip(), FIPS_UDP_PORT).to_string(),
transport: "udp".to_string(),
label: "LAN federation peer (direct FIPS)".to_string(),
});
}
out
}
#[cfg(test)] #[cfg(test)]
mod tests { mod tests {
use super::*; use super::*;
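
Note on `lan_fips_anchors`: the address rewrite it describes (keep the peer's LAN IP, swap in the FIPS UDP port) can be sketched with the standard library alone. The helper name `fips_udp_address` is an assumption made for illustration; the port value is the `FIPS_UDP_PORT` constant shown in the diff.

```rust
use std::net::SocketAddr;

/// FIPS UDP transport port, as declared by `FIPS_UDP_PORT` in the diff.
const FIPS_UDP_PORT: u16 = 8668;

/// Hypothetical helper: take a peer's HTTP/LAN socket ("ip:port"), keep only
/// the IP, and retarget it at the FIPS UDP port. Returns `None` when the
/// input is not a valid socket address, mirroring the `continue` in the loop.
fn fips_udp_address(lan_address: &str) -> Option<String> {
    let sa: SocketAddr = lan_address.parse().ok()?;
    // Formatting a SocketAddr puts brackets around IPv6 addresses.
    Some(SocketAddr::new(sa.ip(), FIPS_UDP_PORT).to_string())
}
```

Going through `SocketAddr` rather than splitting on `:` is what keeps IPv6 peers correct, since an IPv6 literal contains colons of its own.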


@ -1358,14 +1358,6 @@ mod tests {
host_port_ready: None, host_port_ready: None,
healthy: true, healthy: true,
}, },
ContainerHealth {
name: "indeedhub-minio".into(),
app_id: "indeedhub-minio".into(),
state: "running".into(),
podman_health: None,
host_port_ready: None,
healthy: true,
},
ContainerHealth { ContainerHealth {
name: "indeedhub-api".into(), name: "indeedhub-api".into(),
app_id: "indeedhub-api".into(), app_id: "indeedhub-api".into(),


@ -98,15 +98,11 @@ async fn main() -> Result<()> {
let startup_start = std::time::Instant::now(); let startup_start = std::time::Instant::now();
crash_recovery::init_start_time(); crash_recovery::init_start_time();
// Initialize tracing. Default to `info`: production units don't set // Initialize tracing
// RUST_LOG, and the old `archipelago=debug` default flooded journald
// with per-request debug lines ("RPC method: …", cookie-flag notes) —
// part of a >1 GB/day journal on a fresh node. Set RUST_LOG (e.g.
// RUST_LOG=archipelago=debug) to get debug logs back when debugging.
tracing_subscriber::fmt() tracing_subscriber::fmt()
.with_env_filter( .with_env_filter(
tracing_subscriber::EnvFilter::try_from_default_env() tracing_subscriber::EnvFilter::try_from_default_env()
.unwrap_or_else(|_| "info".into()), .unwrap_or_else(|_| "archipelago=debug,info".into()),
) )
.init(); .init();
@ -153,18 +149,13 @@ async fn main() -> Result<()> {
); );
} }
// Check for a crash marker BEFORE writing our own. The old order wrote // Write PID marker early so we can detect crashes on next startup
// the marker first, so the check always read the CURRENT process's PID,
// found it alive, and skipped recovery — on every boot, forever.
let crash_containers = crash_recovery::check_for_crash(&config.data_dir).await;
// Now mark this instance as running so the next startup can detect a crash.
crash_recovery::write_pid_marker(&config.data_dir).await?; crash_recovery::write_pid_marker(&config.data_dir).await?;
// Run crash recovery before starting the manifest reconciler. Both paths // Run crash recovery before starting the manifest reconciler. Both paths
// mutate Podman; running them concurrently can corrupt transient runtime // mutate Podman; running them concurrently can corrupt transient runtime
// state and leave netavark/conmon unable to start containers. // state and leave netavark/conmon unable to start containers.
match crash_containers { match crash_recovery::check_for_crash(&config.data_dir).await {
Ok(Some(containers)) => { Ok(Some(containers)) => {
info!( info!(
"🔧 Recovering {} containers from previous crash...", "🔧 Recovering {} containers from previous crash...",
@ -207,24 +198,6 @@ async fn main() -> Result<()> {
(Some(trait_obj), Some(dev)) (Some(trait_obj), Some(dev))
} else { } else {
let prod = Arc::new(ProdContainerOrchestrator::new(config.clone()).await?); let prod = Arc::new(ProdContainerOrchestrator::new(config.clone()).await?);
// Pull the freshest signed app-catalog BEFORE loading manifests, so any
// registry-embedded manifest (the origin-wins overlay in load_manifests)
// is in place on THIS boot — not a restart later. Without this the boot
// would overlay the previous run's cached catalog and a newly-published
// app (e.g. a registry-only install) wouldn't appear until the next
// restart. Bounded + best-effort: on timeout/unreachable origin the
// last-cached catalog (or the disk manifests) still load — registry is
// an overlay on top of disk, never a hard dependency.
match tokio::time::timeout(
std::time::Duration::from_secs(25),
crate::container::app_catalog::refresh_catalog(&config.data_dir),
)
.await
{
Ok(Ok(n)) => info!("🛰️ app-catalog refreshed before manifest load ({n} apps)"),
Ok(Err(e)) => tracing::debug!("app-catalog pre-load refresh failed (using cache): {e}"),
Err(_) => tracing::debug!("app-catalog pre-load refresh timed out (using cache)"),
}
// Best-effort manifest load; a missing /opt/archipelago/apps is // Best-effort manifest load; a missing /opt/archipelago/apps is
// logged inside load_manifests and not fatal. // logged inside load_manifests and not fatal.
match prod.load_manifests().await { match prod.load_manifests().await {
@ -297,9 +270,7 @@ async fn main() -> Result<()> {
// via auth.setup RPC. The Login page detects is_setup=false and shows // via auth.setup RPC. The Login page detects is_setup=false and shows
// "Create Password" form instead of login form. // "Create Password" form instead of login form.
// Create server. Keep a clone of the orchestrator handle for the background // Create server
// update scheduler (per-app auto-update applies via the orchestrator).
let update_orchestrator = orchestrator.clone();
let server = Server::new(config.clone(), orchestrator, dev_orchestrator).await?; let server = Server::new(config.clone(), orchestrator, dev_orchestrator).await?;
// Start server // Start server
@ -324,12 +295,10 @@ async fn main() -> Result<()> {
}); });
} }
// Spawn background update scheduler. Pass the orchestrator so the scheduler // Spawn background update scheduler
// can apply per-app auto-update-to-latest (multi-version support) via the
// safe orchestrator upgrade path; None in dev mode disables it.
let update_data_dir = config.data_dir.clone(); let update_data_dir = config.data_dir.clone();
tokio::spawn(async move { tokio::spawn(async move {
update::run_update_scheduler(update_data_dir, update_orchestrator).await; update::run_update_scheduler(update_data_dir).await;
}); });
// Synchronize host-side doctor artifacts (script + systemd units) with // Synchronize host-side doctor artifacts (script + systemd units) with
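
Note on the crash-marker ordering discussed in this file: the comment explains that writing the PID marker before checking it makes the check read the current process's own PID. The sketch below reproduces that effect with a plain file. The function names, the marker format (a bare PID), and the injected `is_alive` probe are all assumptions for illustration; the real `crash_recovery` module is async and may store more than a PID.

```rust
use std::fs;
use std::io;
use std::path::Path;

/// Hypothetical marker writer: stores the given PID as text.
fn write_pid_marker(marker: &Path, pid: u32) -> io::Result<()> {
    fs::write(marker, pid.to_string())
}

/// Hypothetical crash check: a crash is assumed when a marker exists and
/// names a PID that is no longer alive. `is_alive` is injected so the
/// sketch stays portable and testable.
fn previous_run_crashed<F: Fn(u32) -> bool>(marker: &Path, is_alive: F) -> bool {
    match fs::read_to_string(marker) {
        Ok(text) => match text.trim().parse::<u32>() {
            Ok(pid) => !is_alive(pid),
            Err(_) => false, // unreadable marker: treat as "no crash"
        },
        Err(_) => false, // no marker: clean shutdown or first boot
    }
}
```

If the marker is overwritten with the current PID first, `previous_run_crashed` can only ever see a live process and never reports a crash. Checking first and writing afterwards preserves the stale PID long enough to be noticed.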


@ -181,10 +181,7 @@ async fn is_sender_allowed(
match peers.get(&sender_contact_id) { match peers.get(&sender_contact_id) {
// Match identity on the bound archipelago key (stable, advert/ // Match identity on the bound archipelago key (stable, advert/
// federation-verified), not the firmware routing key. // federation-verified), not the firmware routing key.
Some(p) => ( Some(p) => (p.identity_pubkey_hex().map(|s| s.to_string()), p.did.clone()),
p.identity_pubkey_hex().map(|s| s.to_string()),
p.did.clone(),
),
None => (None, None), None => (None, None),
} }
}; };


@ -314,82 +314,17 @@ pub(super) async fn try_chunk_reassemble(
/// Look up a peer by pubkey hex prefix. Returns (contact_id, display_name). /// Look up a peer by pubkey hex prefix. Returns (contact_id, display_name).
pub(super) async fn resolve_peer(state: &Arc<MeshState>, sender_prefix: &str) -> (u32, String) { pub(super) async fn resolve_peer(state: &Arc<MeshState>, sender_prefix: &str) -> (u32, String) {
{ let peers = state.peers.read().await;
let peers = state.peers.read().await; peers
if let Some(peer) = peers.values().find(|p| { .values()
.find(|p| {
p.pubkey_hex p.pubkey_hex
.as_ref() .as_ref()
.map(|k| k.starts_with(sender_prefix)) .map(|k| k.starts_with(sender_prefix))
.unwrap_or(false) .unwrap_or(false)
}) { })
return (peer.contact_id, peer.advert_name.clone()); .map(|p| (p.contact_id, p.advert_name.clone()))
} .unwrap_or((0, sender_prefix.to_string()))
}
if let Some((node_num, pubkey_hex, name)) = meshtastic_peer_from_prefix(sender_prefix) {
let peer = MeshPeer {
contact_id: node_num,
advert_name: name.clone(),
did: None,
pubkey_hex: Some(pubkey_hex),
arch_pubkey_hex: None,
x25519_pubkey: None,
rssi: None,
snr: None,
last_heard: chrono::Utc::now().to_rfc3339(),
hops: 0xff,
last_advert: 0,
reachable: true,
// Stamped fresh from `peer_pubkeys` in `get_contacts` once a real
// contact refresh runs; unknown at synthesis time here.
pkc_capable: false,
lat: None,
lon: None,
};
let is_new = {
let mut peers = state.peers.write().await;
peers.insert(node_num, peer.clone()).is_none()
};
state.update_peer_count().await;
let _ = state.event_tx.send(if is_new {
MeshEvent::PeerDiscovered(peer)
} else {
MeshEvent::PeerUpdated(peer)
});
return (node_num, name);
}
(0, sender_prefix.to_string())
}
fn meshtastic_peer_from_prefix(sender_prefix: &str) -> Option<(u32, String, String)> {
if sender_prefix.len() < 12 {
return None;
}
let bytes = hex::decode(&sender_prefix[..12]).ok()?;
if bytes.len() != 6 || bytes[4] != b'm' || bytes[5] != b'e' {
return None;
}
let node_num = u32::from_le_bytes([bytes[0], bytes[1], bytes[2], bytes[3]]);
if node_num == 0 || node_num == u32::MAX {
return None;
}
let mut full_key = [0u8; 32];
full_key[..4].copy_from_slice(&node_num.to_le_bytes());
full_key[4..15].copy_from_slice(b"meshtastic:");
let name = format!("Meshtastic !{:08x}", node_num);
Some((node_num, hex::encode(full_key), name))
}
/// Stamp the SNR carried in a Meshcore v3 contact-message frame onto the
/// sender's peer record so the signal-bars indicator has real data (Meshcore
/// has no per-packet RSSI like Meshtastic, only this 1-byte SNR — see
/// `protocol::parse_contact_msg_v3_raw`).
pub(super) async fn update_peer_snr(state: &Arc<MeshState>, contact_id: u32, snr: f32) {
let mut peers = state.peers.write().await;
if let Some(peer) = peers.get_mut(&contact_id) {
peer.snr = Some(snr);
}
} }
/// Store a plain-text (non-typed) message and emit an event. /// Store a plain-text (non-typed) message and emit an event.
@ -398,19 +333,8 @@ pub(super) async fn store_plain_message(
contact_id: u32, contact_id: u32,
peer_name: &str, peer_name: &str,
text: &str, text: &str,
) {
store_plain_message_with_encryption(state, contact_id, peer_name, text, false).await;
}
pub(super) async fn store_plain_message_with_encryption(
state: &Arc<MeshState>,
contact_id: u32,
peer_name: &str,
text: &str,
encrypted: bool,
) { ) {
let msg_id = state.next_id().await; let msg_id = state.next_id().await;
let radio_transport = radio_transport_label(state.status.read().await.device_type);
let msg = MeshMessage { let msg = MeshMessage {
id: msg_id, id: msg_id,
direction: MessageDirection::Received, direction: MessageDirection::Received,
@ -419,8 +343,7 @@ pub(super) async fn store_plain_message_with_encryption(
plaintext: text.to_string(), plaintext: text.to_string(),
timestamp: chrono::Utc::now().to_rfc3339(), timestamp: chrono::Utc::now().to_rfc3339(),
delivered: true, delivered: true,
encrypted, encrypted: false,
transport: Some(radio_transport.to_string()),
message_type: "text".to_string(), message_type: "text".to_string(),
typed_payload: None, typed_payload: None,
sender_pubkey: None, sender_pubkey: None,
@ -578,11 +501,6 @@ pub(super) async fn handle_identity_received(
last_advert: 0, last_advert: 0,
// We just heard this peer's identity advert, so it's reachable. // We just heard this peer's identity advert, so it's reachable.
reachable: true, reachable: true,
// PKC capability is tracked by the radio driver's get_contacts(), not
// known at identity-advert time.
pkc_capable: false,
lat: None,
lon: None,
}; };
let is_new = { let is_new = {
@ -649,7 +567,6 @@ pub(super) async fn handle_received_message(
.map(|p| p.advert_name.clone()); .map(|p| p.advert_name.clone());
let msg_id = state.next_id().await; let msg_id = state.next_id().await;
let radio_transport = radio_transport_label(state.status.read().await.device_type);
let msg = MeshMessage { let msg = MeshMessage {
id: msg_id, id: msg_id,
direction: MessageDirection::Received, direction: MessageDirection::Received,
@ -659,7 +576,6 @@ pub(super) async fn handle_received_message(
timestamp: chrono::Utc::now().to_rfc3339(), timestamp: chrono::Utc::now().to_rfc3339(),
delivered: true, delivered: true,
encrypted, encrypted,
transport: Some(radio_transport.to_string()),
message_type: "text".to_string(), message_type: "text".to_string(),
typed_payload: None, typed_payload: None,
sender_pubkey: None, sender_pubkey: None,
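
Note on `meshtastic_peer_from_prefix` in this file: it recognises a synthetic sender prefix made of a little-endian node number followed by the ASCII bytes `me`. The sketch below shows that decoding step using only the standard library (the diff itself uses the `hex` crate). `decode_hex` and `meshtastic_node_from_prefix` are hypothetical names, and the sketch returns only the node number and display name, not the padded 32-byte key.

```rust
/// Decode an even-length hex string into bytes (std-only stand-in for `hex::decode`).
fn decode_hex(s: &str) -> Option<Vec<u8>> {
    if s.len() % 2 != 0 || !s.bytes().all(|b| b.is_ascii_hexdigit()) {
        return None;
    }
    (0..s.len())
        .step_by(2)
        .map(|i| u8::from_str_radix(&s[i..i + 2], 16).ok())
        .collect()
}

/// Hypothetical simplification: extract (node number, display name) from a
/// sender prefix, or `None` if it is not a synthetic Meshtastic prefix.
fn meshtastic_node_from_prefix(sender_prefix: &str) -> Option<(u32, String)> {
    // First 12 hex characters = 6 bytes: 4-byte node number + "me".
    let head = sender_prefix.get(..12)?;
    let bytes = decode_hex(head)?;
    if bytes[4] != b'm' || bytes[5] != b'e' {
        return None;
    }
    let node_num = u32::from_le_bytes([bytes[0], bytes[1], bytes[2], bytes[3]]);
    // 0 and u32::MAX are rejected, as in the diff.
    if node_num == 0 || node_num == u32::MAX {
        return None;
    }
    Some((node_num, format!("Meshtastic !{:08x}", node_num)))
}
```

For example, the prefix `785634126d65` carries the bytes `78 56 34 12` (node `0x12345678` in little-endian order) followed by `6d 65`, which is `me` in ASCII.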


@ -34,10 +34,7 @@ async fn store_typed_message(
plaintext: display_text.to_string(), plaintext: display_text.to_string(),
timestamp: chrono::Utc::now().to_rfc3339(), timestamp: chrono::Utc::now().to_rfc3339(),
delivered: true, delivered: true,
// transport + E2E are stamped post-dispatch by
// handle_typed_envelope_direct, which alone knows the receive transport.
encrypted: false, encrypted: false,
transport: None,
message_type: type_label.to_string(), message_type: type_label.to_string(),
typed_payload, typed_payload,
sender_pubkey, sender_pubkey,
@ -73,69 +70,7 @@ pub(super) async fn handle_typed_message(
return; return;
} }
}; };
// Radio-delivered → the active device's transport label ("lora" or
// "reticulum"). Stamp after dispatch (see stamp helper).
let before = max_message_id(state).await;
handle_typed_envelope_direct(state, sender_contact_id, sender_name, envelope).await; handle_typed_envelope_direct(state, sender_contact_id, sender_name, envelope).await;
let radio_transport = radio_transport_label(state.status.read().await.device_type);
stamp_received_transport(state, sender_contact_id, before, radio_transport, false).await;
}
/// Highest stored message id right now. Paired with `stamp_received_transport`
/// to identify messages a dispatch call just stored (ids are monotonic).
pub(crate) async fn max_message_id(state: &Arc<MeshState>) -> u64 {
state
.messages
.read()
.await
.iter()
.map(|m| m.id)
.max()
.unwrap_or(0)
}
/// Stamp the per-message transport pill (and E2E flag) onto every RECEIVED
/// message from `sender_contact_id` stored since `after_id` — i.e. the ones the
/// just-completed `handle_typed_envelope_direct` produced. This is how both the
/// radio path ("lora") and the federation path ("fips"/"tor") tag inbound
/// messages without threading transport through all 20 typed-dispatch sites.
/// `encrypted` only ever sets the flag true (a federation envelope is E2E),
/// never clears a true set elsewhere.
pub(crate) async fn stamp_received_transport(
state: &Arc<MeshState>,
sender_contact_id: u32,
after_id: u64,
transport: &str,
encrypted: bool,
) {
let mut messages = state.messages.write().await;
for m in messages.iter_mut() {
if m.id > after_id
&& matches!(m.direction, MessageDirection::Received)
&& m.peer_contact_id == sender_contact_id
{
if m.transport.is_none() {
m.transport = Some(transport.to_string());
}
if encrypted {
m.encrypted = true;
}
}
}
}
/// Mark every RECEIVED message stored since `after_id` as end-to-end encrypted.
/// Used by the session loop to stamp the E2E pill on a meshtastic frame the radio
/// reported PKI-encrypted (the synthetic frame can't carry that flag, and the
/// typed-dispatch store path defaults `encrypted` to false). One inbound frame
/// yields at most one received message, so no sender filter is needed.
pub(crate) async fn stamp_received_encrypted(state: &Arc<MeshState>, after_id: u64) {
let mut messages = state.messages.write().await;
for m in messages.iter_mut() {
if m.id > after_id && matches!(m.direction, MessageDirection::Received) {
m.encrypted = true;
}
}
} }
/// Dispatch a pre-decoded TypedEnvelope. Shared between the radio receive /// Dispatch a pre-decoded TypedEnvelope. Shared between the radio receive
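
Note on `max_message_id` / `stamp_received_transport` in this file: the pattern is to record the highest message id before dispatch, then tag whatever was stored afterwards. A synchronous sketch of that idea follows. The `Message` and `Direction` types are simplified assumptions (the real `MeshMessage` has more fields and sits behind an async lock).

```rust
#[derive(Debug, Clone, Copy, PartialEq)]
enum Direction {
    Sent,
    Received,
}

/// Hypothetical, trimmed-down message record.
struct Message {
    id: u64,
    direction: Direction,
    peer_contact_id: u32,
    transport: Option<String>,
    encrypted: bool,
}

/// Highest stored id, or 0 when the store is empty (ids are assumed monotonic).
fn max_message_id(messages: &[Message]) -> u64 {
    messages.iter().map(|m| m.id).max().unwrap_or(0)
}

/// Tag every received message from `sender` stored after `after_id`.
/// An existing transport label is kept, and `encrypted` is only ever raised.
fn stamp_received(messages: &mut [Message], sender: u32, after_id: u64, transport: &str, encrypted: bool) {
    for m in messages.iter_mut() {
        if m.id > after_id && m.direction == Direction::Received && m.peer_contact_id == sender {
            if m.transport.is_none() {
                m.transport = Some(transport.to_string());
            }
            if encrypted {
                m.encrypted = true;
            }
        }
    }
}
```

The appeal of this design, as the doc comment in the diff puts it, is that the transport does not have to be threaded through every typed-dispatch site. The cost is that it relies on monotonic ids and on no unrelated message from the same sender being stored in between.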


@ -4,8 +4,7 @@ use super::super::message_types::TypedEnvelope;
use super::super::protocol; use super::super::protocol;
use super::decode::{ use super::decode::{
handle_identity_received, is_mc_chunk_frame, resolve_peer, store_plain_message, handle_identity_received, is_mc_chunk_frame, resolve_peer, store_plain_message,
store_plain_message_with_encryption, try_base64_typed, try_chunk_reassemble, try_base64_typed, try_chunk_reassemble, try_decrypt_base64, try_decrypt_ratchet_base64,
try_decrypt_base64, try_decrypt_ratchet_base64, update_peer_snr,
}; };
use super::dispatch::handle_typed_message; use super::dispatch::handle_typed_message;
use super::MeshState; use super::MeshState;
@ -63,14 +62,12 @@ pub(super) async fn handle_frame(
return true; // Signal caller to sync immediately return true; // Signal caller to sync immediately
} }
protocol::RESP_CONTACT_MSG_V3 | protocol::RESP_CONTACT_MSG_V3_E2E => { protocol::RESP_CONTACT_MSG_V3 => {
// Direct message received (v3 format) — check for typed envelope first // Direct message received (v3 format) — check for typed envelope first
match protocol::parse_contact_msg_v3_raw(&frame.data) { match protocol::parse_contact_msg_v3_raw(&frame.data) {
Ok((sender_prefix, payload, snr)) => { Ok((sender_prefix, payload, _snr)) => {
if !payload.is_empty() { if !payload.is_empty() {
let encrypted = frame.code == protocol::RESP_CONTACT_MSG_V3_E2E;
let (contact_id, name) = resolve_peer(state, &sender_prefix).await; let (contact_id, name) = resolve_peer(state, &sender_prefix).await;
update_peer_snr(state, contact_id, snr as f32).await;
if TypedEnvelope::is_typed(&payload) { if TypedEnvelope::is_typed(&payload) {
handle_typed_message(&payload, contact_id, &name, state).await; handle_typed_message(&payload, contact_id, &name, state).await;
} else if let Some(decoded) = try_base64_typed(&payload) { } else if let Some(decoded) = try_base64_typed(&payload) {
@ -89,10 +86,7 @@ pub(super) async fn handle_frame(
handle_typed_message(&decoded, contact_id, &name, state).await; handle_typed_message(&decoded, contact_id, &name, state).await;
} else if !payload.starts_with(b"MC") { } else if !payload.starts_with(b"MC") {
let text = String::from_utf8_lossy(&payload).to_string(); let text = String::from_utf8_lossy(&payload).to_string();
store_plain_message_with_encryption( store_plain_message(state, contact_id, &name, &text).await;
state, contact_id, &name, &text, encrypted,
)
.await;
info!(from = %sender_prefix, "Received mesh DM (v3)"); info!(from = %sender_prefix, "Received mesh DM (v3)");
} }
} }
@ -139,14 +133,8 @@ pub(super) async fn handle_frame(
match protocol::parse_channel_msg_v3_raw(&frame.data) { match protocol::parse_channel_msg_v3_raw(&frame.data) {
Ok((channel_idx, payload)) => { Ok((channel_idx, payload)) => {
if !payload.is_empty() { if !payload.is_empty() {
handle_channel_payload( handle_channel_payload(state, channel_idx, &payload, our_x25519_secret)
state, .await;
channel_idx,
&payload,
our_x25519_secret,
None,
)
.await;
} }
} }
Err(e) => warn!("Failed to parse v3 channel message: {}", e), Err(e) => warn!("Failed to parse v3 channel message: {}", e),
@ -158,44 +146,14 @@ pub(super) async fn handle_frame(
match protocol::parse_channel_msg_v1_raw(&frame.data) { match protocol::parse_channel_msg_v1_raw(&frame.data) {
Ok((channel_idx, payload)) => { Ok((channel_idx, payload)) => {
if !payload.is_empty() { if !payload.is_empty() {
handle_channel_payload( handle_channel_payload(state, channel_idx, &payload, our_x25519_secret)
state, .await;
channel_idx,
&payload,
our_x25519_secret,
None,
)
.await;
} }
} }
Err(e) => warn!("Failed to parse channel message: {}", e), Err(e) => warn!("Failed to parse channel message: {}", e),
} }
} }
// Synthetic Meshtastic channel broadcast that carries its sender:
// `[channel_idx: u8][sender_pubkey_prefix: 6 bytes][text…]`. Resolve the
// sender to a friendly name, then file the message under the channel
// thread attributed to them — this is what makes the default public
// LongFast channel actually show inbound traffic (and who sent it).
protocol::RESP_MESHTASTIC_CHANNEL_TEXT => {
if frame.data.len() > 7 {
let channel_idx = frame.data[0];
let sender_prefix_hex = hex::encode(&frame.data[1..7]);
let payload = frame.data[7..].to_vec();
if !payload.is_empty() {
let (_cid, name) = resolve_peer(state, &sender_prefix_hex).await;
handle_channel_payload(
state,
channel_idx,
&payload,
our_x25519_secret,
Some(name),
)
.await;
}
}
}
protocol::PUSH_LOG_DATA | protocol::PUSH_PATH_UPDATE | protocol::PUSH_RAW_DATA => { protocol::PUSH_LOG_DATA | protocol::PUSH_PATH_UPDATE | protocol::PUSH_RAW_DATA => {
// Internal device logging/path data — safe to ignore // Internal device logging/path data — safe to ignore
} }
@ -219,12 +177,6 @@ async fn handle_channel_payload(
channel_idx: u8, channel_idx: u8,
payload: &[u8], payload: &[u8],
our_x25519_secret: &[u8; 32], our_x25519_secret: &[u8; 32],
// When the transport knows who sent this channel broadcast (Meshtastic
// packets carry the originating node), the plain-text/typed message is filed
// under the channel thread but attributed to this sender name. Meshcore
// channel frames carry no sender, so they pass `None` and fall back to a
// generic "Channel N" label.
sender_name: Option<String>,
) { ) {
// DM-via-channel wrapper (text form): the channel text carries an // DM-via-channel wrapper (text form): the channel text carries an
// ASCII "@DM:<base64>" token somewhere in the body. We locate the // ASCII "@DM:<base64>" token somewhere in the body. We locate the
@ -433,18 +385,15 @@ async fn handle_channel_payload(
} }
} }
// Regular channel broadcast (not DM-wrapped). File it under the channel // Regular channel broadcast (not DM-wrapped)
// thread (contact_id = u32::MAX - idx) but label it with the real sender
// when the transport gave us one (Meshtastic), so the channel view shows who
// said what. Meshcore frames have no sender → generic "Channel N".
let chan_contact_id = u32::MAX - (channel_idx as u32); let chan_contact_id = u32::MAX - (channel_idx as u32);
let chan_name = sender_name.unwrap_or_else(|| format!("Channel {}", channel_idx)); let chan_name = format!("Channel {}", channel_idx);
if TypedEnvelope::is_typed(payload) { if TypedEnvelope::is_typed(payload) {
handle_typed_message(payload, chan_contact_id, &chan_name, state).await; handle_typed_message(payload, chan_contact_id, &chan_name, state).await;
} else { } else {
let text = String::from_utf8_lossy(payload).to_string(); let text = String::from_utf8_lossy(payload).to_string();
store_plain_message(state, chan_contact_id, &chan_name, &text).await; store_plain_message(state, chan_contact_id, &chan_name, &text).await;
info!(channel = channel_idx, sender = %chan_name, "Received mesh channel message"); info!(channel = channel_idx, "Received mesh channel message");
} }
} }
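
Reviewer note: the two expressions that differ between the sides of this hunk are small enough to check by hand. The sketch below restates them as free functions; `channel_contact_id` and `channel_label` are hypothetical helper names, not functions from the codebase.

```rust
/// Channel threads sit at the top of the `u32` contact-id space:
/// channel 0 maps to `u32::MAX`, channel 1 to `u32::MAX - 1`, and so on.
fn channel_contact_id(channel_idx: u8) -> u32 {
    u32::MAX - u32::from(channel_idx)
}

/// Label for a channel message: the sender's name when the transport
/// supplied one, otherwise the generic "Channel N" fallback.
fn channel_label(channel_idx: u8, sender_name: Option<String>) -> String {
    sender_name.unwrap_or_else(|| format!("Channel {}", channel_idx))
}
```

With `sender_name` always `None`, `channel_label` reduces to the plain `format!("Channel {}", channel_idx)` seen on the other side of the diff.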


@ -28,26 +28,6 @@ const ADVERT_INTERVAL: Duration = Duration::from_secs(60);
/// How often to poll for queued messages when no push notifications. /// How often to poll for queued messages when no push notifications.
const SYNC_INTERVAL: Duration = Duration::from_secs(10); const SYNC_INTERVAL: Duration = Duration::from_secs(10);
/// Backlog #12 (provisioning robustness): if we haven't successfully received
/// ANY frame in this long, treat the serial link as stalled and force a
/// reconnect — the write-side `consecutive_write_failures` counter is blind
/// to a receive-only stall (writes can keep succeeding while the radio's
/// stopped streaming inbound, e.g. the FROM_RADIO_REBOOTED-without-recovery
/// case meshtastic.rs already has a targeted, immediate fix for — this
/// watchdog is just the backstop for a device that goes silent WITHOUT
/// emitting that notification).
///
/// 5 minutes was originally chosen on the (wrong) assumption that the 60s
/// advert / 10s sync cadence implies *received* traffic — those are our own
/// OUTBOUND cadences and say nothing about what peers send us. A quiet mesh
/// (no peer transmitting, or Reticulum/LXMF's point-to-point store-and-
/// forward model with no broadcast echo) can be legitimately RX-silent for
/// long stretches with the link perfectly healthy; at 300s this forced a
/// full auto-detect reconnect (visible in the UI as "Connecting…") every
/// ~5 minutes on otherwise-idle nodes. 30 minutes still catches a wedged
/// device in reasonable time without false-triggering on normal mesh quiet.
const RX_STALL_TIMEOUT: Duration = Duration::from_secs(1800);
/// Maximum stored messages (circular buffer). /// Maximum stored messages (circular buffer).
const MAX_MESSAGES: usize = 100; const MAX_MESSAGES: usize = 100;
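
Reviewer note: the receive-side watchdog described in the `RX_STALL_TIMEOUT` comment can be sketched as below. This is an illustration under assumptions, not the session loop's actual code: `RxWatchdog` and its methods are hypothetical, and whether the real comparison is `>` or `>=` is not visible in this diff.

```rust
use std::time::{Duration, Instant};

/// 30 minutes, as in the removed constant.
const RX_STALL_TIMEOUT: Duration = Duration::from_secs(1800);

/// Tracks when the last inbound frame arrived (hypothetical helper).
struct RxWatchdog {
    last_rx: Instant,
}

impl RxWatchdog {
    fn new(now: Instant) -> Self {
        Self { last_rx: now }
    }

    /// Call whenever any frame is successfully received.
    fn note_frame(&mut self, now: Instant) {
        self.last_rx = now;
    }

    /// True once no frame has been received for the full timeout.
    /// Successful writes do not count, which is what lets this catch a
    /// receive-only stall that a write-failure counter would miss.
    fn is_stalled(&self, now: Instant) -> bool {
        now.duration_since(self.last_rx) >= RX_STALL_TIMEOUT
    }
}
```

Passing `now` in explicitly keeps the logic testable without sleeping; a real loop would pass `Instant::now()`.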
@ -83,25 +63,6 @@ pub enum MeshCommand {
dest_pubkey_prefix: [u8; 6], dest_pubkey_prefix: [u8; 6],
payload: Vec<u8>, payload: Vec<u8>,
}, },
/// Send pre-encoded binary over a dedicated Reticulum RNS Resource
/// transfer instead of the small inline-chunk path — Reticulum-only, see
/// `MeshRadioDevice::send_resource`. Used for large attachments
/// (compressed photos, voice messages) that exceed the small-message cap
/// but fit a sane LoRa-Resource budget; routing decision is made by the
/// RPC layer (`mesh.transport-advice`'s `"resource-mesh"` tier).
SendResource {
dest_pubkey_prefix: [u8; 6],
payload: Vec<u8>,
},
/// Native LXMF `FIELD_IMAGE` send — Reticulum-only, for a stock
/// (non-archy) peer that can't decode our typed envelope. See
/// `MeshRadioDevice::send_native_image`.
SendNativeImage {
dest_pubkey_prefix: [u8; 6],
mime: String,
bytes: Vec<u8>,
caption: Option<String>,
},
/// Send PLAIN text as one or more native meshcore DMs to a stock client /// Send PLAIN text as one or more native meshcore DMs to a stock client
/// (e.g. a phone). Long text is split into multiple readable plain messages /// (e.g. a phone). Long text is split into multiple readable plain messages
/// — never MC-chunked — because stock clients can't reassemble archy's /// — never MC-chunked — because stock clients can't reassemble archy's
@ -116,11 +77,6 @@ pub enum MeshCommand {
payload: Vec<u8>, payload: Vec<u8>,
}, },
SendAdvert, SendAdvert,
/// Reboot the locally-connected radio firmware to recover a wedged /
/// RX-deaf radio. Meshtastic-only; meshcore ignores it.
RebootRadio {
seconds: i64,
},
/// Re-fetch contact list from the radio device. /// Re-fetch contact list from the radio device.
RefreshContacts, RefreshContacts,
/// Delete a contact from the firmware table (clear-all / unreachable wipe). /// Delete a contact from the firmware table (clear-all / unreachable wipe).
@ -295,7 +251,6 @@ impl MeshState {
channel_name: channel_name.to_string(), channel_name: channel_name.to_string(),
messages_sent: 0, messages_sent: 0,
messages_received: 0, messages_received: 0,
region: None,
}), }),
event_tx: tx, event_tx: tx,
next_message_id: RwLock::new(1), next_message_id: RwLock::new(1),
@ -412,16 +367,12 @@ impl MeshState {
/// 4. Reconnect on disconnect /// 4. Reconnect on disconnect
pub fn spawn_mesh_listener( pub fn spawn_mesh_listener(
state: Arc<MeshState>, state: Arc<MeshState>,
data_dir: std::path::PathBuf,
device_path: Option<String>, device_path: Option<String>,
our_did: String, our_did: String,
our_ed_pubkey_hex: String, our_ed_pubkey_hex: String,
our_x25519_secret: [u8; 32], our_x25519_secret: [u8; 32],
our_x25519_pubkey_hex: String, our_x25519_pubkey_hex: String,
server_name: Option<String>, server_name: Option<String>,
lora_region: Option<String>,
channel_name: Option<String>,
device_kind: Option<super::types::DeviceType>,
shutdown: tokio::sync::watch::Receiver<bool>, shutdown: tokio::sync::watch::Receiver<bool>,
cmd_rx: mpsc::Receiver<MeshCommand>, cmd_rx: mpsc::Receiver<MeshCommand>,
) -> tokio::task::JoinHandle<()> { ) -> tokio::task::JoinHandle<()> {
@ -429,15 +380,6 @@ pub fn spawn_mesh_listener(
let mut shutdown = shutdown; let mut shutdown = shutdown;
let mut cmd_rx = cmd_rx; let mut cmd_rx = cmd_rx;
let mut reconnect_delay = RECONNECT_DELAY_INIT; let mut reconnect_delay = RECONNECT_DELAY_INIT;
// Backlog #12 hot-swap re-binding: each run_mesh_session call already
// builds a fresh device struct (contacts/current_region/etc. all
// start empty), so per-device session state is naturally isolated
// across reconnects — there's no stale in-memory state to clear here.
// What's worth doing is detecting when the *physical radio itself*
// changed (a genuine hot-swap, not just the same radio reconnecting)
// so it's visible in logs rather than silently treated the same as
// an ordinary reconnect.
let mut last_self_node_id: Option<u32> = None;
loop { loop {
if *shutdown.borrow() { if *shutdown.borrow() {
info!("Mesh listener shutting down"); info!("Mesh listener shutting down");
@ -446,16 +388,12 @@ pub fn spawn_mesh_listener(
match session::run_mesh_session( match session::run_mesh_session(
&state, &state,
&data_dir,
device_path.as_deref(), device_path.as_deref(),
&our_did, &our_did,
&our_ed_pubkey_hex, &our_ed_pubkey_hex,
&our_x25519_secret, &our_x25519_secret,
&our_x25519_pubkey_hex, &our_x25519_pubkey_hex,
server_name.as_deref(), server_name.as_deref(),
lora_region.as_deref(),
channel_name.as_deref(),
device_kind,
&mut shutdown, &mut shutdown,
&mut cmd_rx, &mut cmd_rx,
) )
@ -476,25 +414,6 @@ pub fn spawn_mesh_listener(
} }
} }
// Hot-swap detection: compare this session's self_node_id against
// the last one we saw. A change means the physical radio itself
// was swapped (not just a reconnect of the same board).
{
let current_self_node_id = state.status.read().await.self_node_id;
if let (Some(prev), Some(cur)) = (last_self_node_id, current_self_node_id) {
if prev != cur {
info!(
previous_node_id = prev,
new_node_id = cur,
"Local mesh radio identity changed — treating as a hot-swapped device"
);
}
}
if current_self_node_id.is_some() {
last_self_node_id = current_self_node_id;
}
}
// Update status to disconnected // Update status to disconnected
{ {
let mut status = state.status.write().await; let mut status = state.status.write().await;
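
Reviewer note: the hot-swap check in the removed block is a small state machine over `Option<u32>`. The sketch below extracts it into a pure function so the edge cases are easy to see; `detect_hot_swap` is a hypothetical name, and the real code logs via `info!` instead of returning a value.

```rust
/// Compare this session's node id with the last one seen.
/// Returns `Some((previous, current))` only when both are known and differ.
/// The remembered id is updated only when the current session reported one,
/// so a session that never learned its id does not erase the history.
fn detect_hot_swap(last: &mut Option<u32>, current: Option<u32>) -> Option<(u32, u32)> {
    let swapped = match (*last, current) {
        (Some(prev), Some(cur)) if prev != cur => Some((prev, cur)),
        _ => None,
    };
    if current.is_some() {
        *last = current;
    }
    swapped
}
```

The first session never reports a swap, because there is no previous id to compare against.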


@ -1,24 +1,20 @@
//! Mesh session lifecycle: connect, initialize, main loop. //! Mesh session lifecycle: connect, initialize, main loop.
use super::super::meshtastic::MeshtasticDevice; use super::super::meshtastic::MeshtasticDevice;
use super::super::reticulum::ReticulumLink;
use super::super::serial::MeshcoreDevice; use super::super::serial::MeshcoreDevice;
use super::super::types::*; use super::super::types::*;
use super::{ use super::{
dispatch, frames, MeshCommand, MeshState, ADVERT_INTERVAL, MAX_CONSECUTIVE_WRITE_FAILURES, frames, MeshCommand, MeshState, ADVERT_INTERVAL, MAX_CONSECUTIVE_WRITE_FAILURES, SYNC_INTERVAL,
RX_STALL_TIMEOUT, SYNC_INTERVAL,
}; };
use anyhow::{Context, Result}; use anyhow::{Context, Result};
use std::path::Path;
use std::sync::Arc; use std::sync::Arc;
use std::time::{Duration, Instant}; use std::time::Duration;
use tokio::sync::mpsc; use tokio::sync::mpsc;
use tracing::{debug, error, info, warn}; use tracing::{debug, error, info, warn};
enum MeshRadioDevice { enum MeshRadioDevice {
Meshcore(MeshcoreDevice), Meshcore(MeshcoreDevice),
Meshtastic(MeshtasticDevice), Meshtastic(MeshtasticDevice),
Reticulum(ReticulumLink),
} }
impl MeshRadioDevice { impl MeshRadioDevice {
@ -26,7 +22,6 @@ impl MeshRadioDevice {
match self { match self {
Self::Meshcore(_) => DeviceType::Meshcore, Self::Meshcore(_) => DeviceType::Meshcore,
Self::Meshtastic(_) => DeviceType::Meshtastic, Self::Meshtastic(_) => DeviceType::Meshtastic,
Self::Reticulum(_) => DeviceType::Reticulum,
} }
} }
@ -34,7 +29,6 @@ impl MeshRadioDevice {
match self { match self {
Self::Meshcore(device) => device.advert_name.clone(), Self::Meshcore(device) => device.advert_name.clone(),
Self::Meshtastic(device) => device.advert_name(), Self::Meshtastic(device) => device.advert_name(),
Self::Reticulum(device) => device.advert_name(),
} }
} }
@ -42,37 +36,6 @@ impl MeshRadioDevice {
match self { match self {
Self::Meshcore(device) => device.set_advert_name(name).await, Self::Meshcore(device) => device.set_advert_name(name).await,
Self::Meshtastic(device) => device.set_advert_name(name).await, Self::Meshtastic(device) => device.set_advert_name(name).await,
Self::Reticulum(device) => device.set_advert_name(name).await,
}
}
/// Provision the operator-configured LoRa region. Meshcore radios manage
/// their own band on the device, so this is a no-op for them; Meshtastic
/// radios ship region-UNSET (RF-silent) and must be set or they never mesh.
/// Returns `Ok(true)` when a region was written (the device reboots to
/// apply, so the caller should restart the session). No-op for Reticulum:
/// the daemon's RNodeInterface config carries its own LoRa profile, not
/// driven through this firmware-admin path.
async fn ensure_lora_region(&mut self, region: Option<&str>) -> Result<bool> {
match self {
Self::Meshcore(_) => Ok(false),
Self::Meshtastic(device) => device.ensure_lora_region(region).await,
Self::Reticulum(_) => Ok(false),
}
}
/// Provision the shared archy primary channel so all nodes can decode each
/// other. No-op for meshcore (it joins its channel by name on the device);
/// Meshtastic radios can sit on mismatched channels otherwise and silently
/// drop every packet as undecryptable. Returns `Ok(true)` when a channel was
/// written (device reboots; caller should restart the session). No-op for
/// Reticulum: RNS has no shared-PSK channel concept (see
/// `ReticulumLink::send_channel_text`).
async fn ensure_channel(&mut self, channel_name: Option<&str>) -> Result<bool> {
match self {
Self::Meshcore(_) => Ok(false),
Self::Meshtastic(device) => device.ensure_channel(channel_name).await,
Self::Reticulum(_) => Ok(false),
} }
} }
@ -80,33 +43,6 @@ impl MeshRadioDevice {
match self { match self {
Self::Meshcore(device) => device.send_self_advert().await, Self::Meshcore(device) => device.send_self_advert().await,
Self::Meshtastic(device) => device.send_self_advert().await, Self::Meshtastic(device) => device.send_self_advert().await,
Self::Reticulum(device) => device.send_self_advert().await,
}
}
/// Lightweight serial keepalive (Meshtastic only). Keeps the firmware
/// streaming RECEIVED packets to our serial client — without it the radio
/// can mark a quiet client gone and deliver only our own queue-status.
/// Meshcore/Reticulum need no such ping (Reticulum's "serial" traffic is
/// the daemon's own RNS link, not a firmware queue we poll).
async fn send_keepalive(&mut self) -> Result<()> {
match self {
Self::Meshcore(_) => Ok(()),
Self::Meshtastic(device) => device.send_keepalive().await,
Self::Reticulum(_) => Ok(()),
}
}
/// Actively advertise our identity over the air. Meshcore already does this
/// inside `send_self_advert` (CMD_SEND_SELF_ADVERT), so this is a no-op for
/// it; Meshtastic needs an explicit NodeInfo broadcast or peers never learn
/// about an already-running node. No-op for Reticulum: its `announce` (via
/// `send_self_advert`) already covers discovery.
async fn send_nodeinfo_advert(&mut self, want_response: bool) -> Result<()> {
match self {
Self::Meshcore(_) => Ok(()),
Self::Meshtastic(device) => device.send_nodeinfo_broadcast(want_response).await,
Self::Reticulum(_) => Ok(()),
} }
} }
@ -114,7 +50,6 @@ impl MeshRadioDevice {
match self { match self {
Self::Meshcore(device) => device.send_channel_text(channel, payload).await, Self::Meshcore(device) => device.send_channel_text(channel, payload).await,
Self::Meshtastic(device) => device.send_channel_text(channel, payload).await, Self::Meshtastic(device) => device.send_channel_text(channel, payload).await,
Self::Reticulum(device) => device.send_channel_text(channel, payload).await,
} }
} }
@ -122,54 +57,6 @@ impl MeshRadioDevice {
match self { match self {
Self::Meshcore(device) => device.send_text_msg(dest_pubkey_prefix, payload).await, Self::Meshcore(device) => device.send_text_msg(dest_pubkey_prefix, payload).await,
Self::Meshtastic(device) => device.send_text_msg(dest_pubkey_prefix, payload).await, Self::Meshtastic(device) => device.send_text_msg(dest_pubkey_prefix, payload).await,
Self::Reticulum(device) => device.send_text_msg(dest_pubkey_prefix, payload).await,
}
}
/// Send an image via native LXMF `FIELD_IMAGE` — Reticulum-only, for a
/// stock (non-archy) peer that can't decode our typed envelope. See
/// `ReticulumLink::send_native_image`.
async fn send_native_image(
&mut self,
dest_pubkey_prefix: &[u8; 6],
mime: &str,
bytes: &[u8],
caption: Option<&str>,
) -> Result<()> {
match self {
Self::Meshcore(_) | Self::Meshtastic(_) => {
anyhow::bail!("Native image send is Reticulum-only")
}
Self::Reticulum(device) => {
device.send_native_image(dest_pubkey_prefix, mime, bytes, caption).await
}
}
}
/// Send `data` over a dedicated RNS Resource transfer instead of the
/// small-payload "content" path — only Reticulum has anything resembling
/// this (a native large-binary transfer protocol over a `RNS.Link`).
/// Meshcore/Meshtastic have no equivalent in our driver; callers must
/// check `device_type() == DeviceType::Reticulum` before reaching for
/// this (see `mesh.transport-advice`'s `"resource-mesh"` tier, which is
/// Reticulum-only), so an Err here means the caller's gating is wrong,
/// not a legitimate no-op.
async fn send_resource(&mut self, dest_pubkey_prefix: &[u8; 6], data: &[u8]) -> Result<()> {
match self {
Self::Meshcore(_) | Self::Meshtastic(_) => {
anyhow::bail!("Resource transfer is Reticulum-only")
}
Self::Reticulum(device) => device.send_resource(dest_pubkey_prefix, data).await,
}
}
async fn reboot(&mut self, seconds: i64) -> Result<()> {
match self {
// Meshcore/Reticulum have no equivalent local-admin reboot in our
// driver; the RX-deaf recovery this targets is Meshtastic-specific.
Self::Meshcore(_) => Ok(()),
Self::Meshtastic(device) => device.reboot(seconds).await,
Self::Reticulum(_) => Ok(()),
} }
} }
@ -177,7 +64,6 @@ impl MeshRadioDevice {
match self { match self {
Self::Meshcore(device) => device.remove_contact(pubkey).await, Self::Meshcore(device) => device.remove_contact(pubkey).await,
Self::Meshtastic(device) => device.remove_contact(pubkey).await, Self::Meshtastic(device) => device.remove_contact(pubkey).await,
Self::Reticulum(device) => device.remove_contact(pubkey).await,
} }
} }
@ -201,11 +87,6 @@ impl MeshRadioDevice {
.add_contact(pubkey, contact_type, flags, out_path_len, name, last_advert) .add_contact(pubkey, contact_type, flags, out_path_len, name, last_advert)
.await .await
} }
Self::Reticulum(device) => {
device
.add_contact(pubkey, contact_type, flags, out_path_len, name, last_advert)
.await
}
} }
} }
@ -213,7 +94,6 @@ impl MeshRadioDevice {
match self { match self {
Self::Meshcore(device) => device.get_contacts().await, Self::Meshcore(device) => device.get_contacts().await,
Self::Meshtastic(device) => device.get_contacts().await, Self::Meshtastic(device) => device.get_contacts().await,
Self::Reticulum(device) => device.get_contacts().await,
} }
} }
@ -221,8 +101,6 @@ impl MeshRadioDevice {
match self { match self {
Self::Meshcore(device) => device.reset_contact_path(pubkey).await, Self::Meshcore(device) => device.reset_contact_path(pubkey).await,
Self::Meshtastic(device) => device.reset_contact_path(pubkey).await, Self::Meshtastic(device) => device.reset_contact_path(pubkey).await,
// RNS does its own pathfinding — no firmware path table to reset.
Self::Reticulum(_) => Ok(()),
} }
} }
@ -230,7 +108,6 @@ impl MeshRadioDevice {
match self { match self {
Self::Meshcore(device) => device.sync_messages().await, Self::Meshcore(device) => device.sync_messages().await,
Self::Meshtastic(device) => device.sync_messages().await, Self::Meshtastic(device) => device.sync_messages().await,
Self::Reticulum(device) => device.sync_messages().await,
} }
} }
@ -238,89 +115,37 @@ impl MeshRadioDevice {
match self { match self {
Self::Meshcore(device) => device.try_recv_frame().await, Self::Meshcore(device) => device.try_recv_frame().await,
Self::Meshtastic(device) => device.try_recv_frame().await, Self::Meshtastic(device) => device.try_recv_frame().await,
Self::Reticulum(device) => device.try_recv_frame().await,
}
}
/// PKI-E2E status of the last inbound frame (meshtastic only; meshcore's
/// per-message E2E is derived in the frames decrypt path). Reticulum/LXMF
/// is unconditionally E2E (no plaintext mode), so it always reports true.
/// Take-and-clear.
fn take_rx_encrypted(&mut self) -> bool {
match self {
Self::Meshcore(_) => false,
Self::Meshtastic(device) => device.take_rx_encrypted(),
Self::Reticulum(device) => device.take_rx_encrypted(),
} }
} }
} }
/// Scan all candidate serial ports and open the first supported mesh device found. /// Scan all candidate serial ports and open the first supported mesh device found.
/// async fn auto_detect_and_open() -> Result<(String, MeshRadioDevice, DeviceInfo)> {
/// `device_kind`, when set, pins the expected firmware (operator-confirmed via
/// `MeshConfig.device_kind` — see the plan's §2c reflashable-board note): only
/// that one device's probe runs, so a non-matching firmware's init bytes are
/// never injected into the port. `None` keeps the strict
/// Meshcore→Meshtastic→Reticulum probe order.
async fn auto_detect_and_open(
data_dir: &Path,
our_ed_pubkey_hex: &str,
our_x25519_pubkey_hex: &str,
device_kind: Option<DeviceType>,
) -> Result<(String, MeshRadioDevice, DeviceInfo)> {
let paths = super::super::serial::detect_serial_devices().await; let paths = super::super::serial::detect_serial_devices().await;
if paths.is_empty() { if paths.is_empty() {
anyhow::bail!("No serial devices found in /dev"); anyhow::bail!("No serial devices found in /dev");
} }
for path in &paths { for path in &paths {
debug!(path = %path, "Probing for mesh radio device"); debug!(path = %path, "Probing for mesh radio device");
if device_kind.is_none_or(|k| k == DeviceType::Meshcore) { match MeshcoreDevice::open(path).await {
match MeshcoreDevice::open(path).await { Ok(mut dev) => match dev.initialize().await {
Ok(mut dev) => match dev.initialize().await { Ok(info) => {
Ok(info) => { info!(path = %path, firmware = %info.firmware_version, "Found Meshcore device via auto-detect");
info!(path = %path, firmware = %info.firmware_version, "Found Meshcore device via auto-detect"); return Ok((path.clone(), MeshRadioDevice::Meshcore(dev), info));
return Ok((path.clone(), MeshRadioDevice::Meshcore(dev), info)); }
} Err(e) => debug!(path = %path, error = %e, "Not a Meshcore device"),
Err(e) => debug!(path = %path, error = %e, "Not a Meshcore device"), },
}, Err(e) => debug!(path = %path, error = %e, "Could not open serial port"),
Err(e) => debug!(path = %path, error = %e, "Could not open serial port"),
}
} }
if device_kind.is_none_or(|k| k == DeviceType::Meshtastic) { match MeshtasticDevice::open(path).await {
match MeshtasticDevice::open(path).await { Ok(mut dev) => match dev.initialize().await {
Ok(mut dev) => match dev.initialize().await { Ok(info) => {
Ok(info) => { info!(path = %path, firmware = %info.firmware_version, "Found Meshtastic device via auto-detect");
info!(path = %path, firmware = %info.firmware_version, "Found Meshtastic device via auto-detect"); return Ok((path.clone(), MeshRadioDevice::Meshtastic(dev), info));
return Ok((path.clone(), MeshRadioDevice::Meshtastic(dev), info)); }
} Err(e) => debug!(path = %path, error = %e, "Not a Meshtastic device"),
Err(e) => debug!(path = %path, error = %e, "Not a Meshtastic device"), },
}, Err(e) => debug!(path = %path, error = %e, "Could not open serial port for Meshtastic"),
Err(e) => debug!(path = %path, error = %e, "Could not open serial port for Meshtastic"),
}
}
// Tried LAST: the same reflashable board (e.g. Heltec V3) can run
// Meshcore, Meshtastic, or RNode firmware, so each probe must fail
// strictly before the next is attempted. The RNode KISS-detect probe
// is the most expensive (spawns the supervised daemon on a match), so
// it goes after the two cheap firmware-specific handshakes above.
if device_kind.is_none_or(|k| k == DeviceType::Reticulum) {
match ReticulumLink::open(
path,
data_dir,
Some(our_ed_pubkey_hex),
Some(our_x25519_pubkey_hex),
)
.await
{
Ok(mut dev) => match dev.initialize().await {
Ok(info) => {
info!(path = %path, "Found Reticulum (RNode) device via auto-detect");
return Ok((path.clone(), MeshRadioDevice::Reticulum(dev), info));
}
Err(e) => debug!(path = %path, error = %e, "Reticulum daemon failed to initialize"),
},
Err(e) => debug!(path = %path, error = %e, "Not a Reticulum RNode"),
}
} }
} }
anyhow::bail!( anyhow::bail!(
@ -330,57 +155,7 @@ async fn auto_detect_and_open(
) )
} }
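
Reviewer note: the `device_kind` pinning on one side of this hunk amounts to filtering a fixed probe order. The sketch below shows that selection logic alone, with a stand-in `DeviceType` enum (the real one also has an `Unknown` variant) and a hypothetical `probes_to_run` helper. It uses `map_or(true, ..)` in place of `Option::is_none_or` so it builds on older toolchains; the two are equivalent here.

```rust
/// Stand-in for the crate's `DeviceType` (assumption: simplified for this sketch).
#[derive(Debug, Clone, Copy, PartialEq)]
enum DeviceType {
    Meshcore,
    Meshtastic,
    Reticulum,
}

/// Strict probe order: cheap firmware handshakes first, the RNode probe last.
const PROBE_ORDER: [DeviceType; 3] = [
    DeviceType::Meshcore,
    DeviceType::Meshtastic,
    DeviceType::Reticulum,
];

/// Which probes run for a port. `None` keeps the full order; a pinned kind
/// runs only its own probe, so no other firmware's init bytes hit the port.
fn probes_to_run(device_kind: Option<DeviceType>) -> Vec<DeviceType> {
    PROBE_ORDER
        .iter()
        .copied()
        .filter(|k| device_kind.map_or(true, |pinned| pinned == *k))
        .collect()
}
```

On the side of the diff without `device_kind`, the behaviour corresponds to always probing Meshcore and then Meshtastic, with no Reticulum step.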
async fn open_preferred_path( async fn open_preferred_path(path: &str) -> Result<(MeshRadioDevice, DeviceInfo)> {
path: &str,
data_dir: &Path,
our_ed_pubkey_hex: &str,
our_x25519_pubkey_hex: &str,
device_kind: Option<DeviceType>,
) -> Result<(MeshRadioDevice, DeviceInfo)> {
// Pinned: try only the configured firmware and surface its own error —
// never fall through to (and inject probe bytes into) another firmware's
// handshake on this port.
if let Some(kind) = device_kind {
return match kind {
DeviceType::Meshcore => {
let mut dev = MeshcoreDevice::open(path)
.await
.context("Could not open preferred path as Meshcore")?;
let info = dev
.initialize()
.await
.context("Preferred path is not a working Meshcore device")?;
Ok((MeshRadioDevice::Meshcore(dev), info))
}
DeviceType::Meshtastic => {
let mut dev = MeshtasticDevice::open(path)
.await
.context("Could not open preferred path as Meshtastic")?;
let info = dev
.initialize()
.await
.context("Preferred path is not a working Meshtastic device")?;
Ok((MeshRadioDevice::Meshtastic(dev), info))
}
DeviceType::Reticulum => {
let mut dev = ReticulumLink::open(
path,
data_dir,
Some(our_ed_pubkey_hex),
Some(our_x25519_pubkey_hex),
)
.await
.context("Could not open preferred path as Reticulum")?;
let info = dev
.initialize()
.await
.context("Preferred path is not a working Reticulum RNode")?;
Ok((MeshRadioDevice::Reticulum(dev), info))
}
DeviceType::Unknown => anyhow::bail!("device_kind cannot be Unknown"),
};
}
match MeshcoreDevice::open(path).await { match MeshcoreDevice::open(path).await {
Ok(mut dev) => match dev.initialize().await { Ok(mut dev) => match dev.initialize().await {
Ok(info) => return Ok((MeshRadioDevice::Meshcore(dev), info)), Ok(info) => return Ok((MeshRadioDevice::Meshcore(dev), info)),
@ -390,24 +165,10 @@ async fn open_preferred_path(
} }
match MeshtasticDevice::open(path).await { match MeshtasticDevice::open(path).await {
Ok(mut dev) => match dev.initialize().await { Ok(mut dev) => match dev.initialize().await {
Ok(info) => return Ok((MeshRadioDevice::Meshtastic(dev), info)), Ok(info) => Ok((MeshRadioDevice::Meshtastic(dev), info)),
Err(e) => debug!(path = %path, error = %e, "Preferred path is not Meshtastic"), Err(e) => Err(e).context("Preferred path is not Meshtastic"),
}, },
Err(e) => debug!(path = %path, error = %e, "Could not open preferred path as Meshtastic"), Err(e) => Err(e).context("Could not open preferred path as Meshtastic"),
}
match ReticulumLink::open(
path,
data_dir,
Some(our_ed_pubkey_hex),
Some(our_x25519_pubkey_hex),
)
.await
{
Ok(mut dev) => match dev.initialize().await {
Ok(info) => Ok((MeshRadioDevice::Reticulum(dev), info)),
Err(e) => Err(e).context("Preferred path is not a working Reticulum RNode"),
},
Err(e) => Err(e).context("Could not open preferred path as Reticulum"),
} }
} }
@ -611,16 +372,8 @@ async fn refresh_contacts(device: &mut MeshRadioDevice, state: &Arc<MeshState>)
// user-controlled feature; until then every firmware contact is // user-controlled feature; until then every firmware contact is
// surfaced. `radio_contact_blocklist` is retained but unused. // surfaced. `radio_contact_blocklist` is retained but unused.
let mut peers = state.peers.write().await; let mut peers = state.peers.write().await;
let is_meshtastic = matches!(device.device_type(), DeviceType::Meshtastic);
let is_reticulum = matches!(device.device_type(), DeviceType::Reticulum);
for (idx, contact) in contacts.iter().enumerate() { for (idx, contact) in contacts.iter().enumerate() {
let contact_id = if is_meshtastic { let contact_id = idx as u32;
meshtastic_contact_id(&contact.public_key_hex).unwrap_or(idx as u32)
} else if is_reticulum {
reticulum_contact_id(&contact.public_key_hex).unwrap_or(idx as u32)
} else {
idx as u32
};
let existing = peers.get(&contact_id); let existing = peers.get(&contact_id);
let peer = super::super::types::MeshPeer { let peer = super::super::types::MeshPeer {
contact_id, contact_id,
@ -633,29 +386,14 @@ async fn refresh_contacts(device: &mut MeshRadioDevice, state: &Arc<MeshState>)
// fail authentication after the next contact refresh. // fail authentication after the next contact refresh.
arch_pubkey_hex: existing.and_then(|p| p.arch_pubkey_hex.clone()), arch_pubkey_hex: existing.and_then(|p| p.arch_pubkey_hex.clone()),
x25519_pubkey: existing.and_then(|p| p.x25519_pubkey), x25519_pubkey: existing.and_then(|p| p.x25519_pubkey),
// Meshtastic-only today (see ParsedContact) — falls back to rssi: None,
// whatever was already known if this refresh's contact snr: None,
// snapshot doesn't carry a fresher reading (it always does
// for Meshtastic, since packet_to_inbound_frame updates the
// live contacts map on every heard packet; this fallback
// just avoids flapping to None on a transitional refresh).
rssi: contact.rssi.or_else(|| existing.and_then(|p| p.rssi)),
snr: contact.snr.or_else(|| existing.and_then(|p| p.snr)),
last_heard: chrono::Utc::now().to_rfc3339(), last_heard: chrono::Utc::now().to_rfc3339(),
hops: 0, hops: 0,
last_advert: contact.last_advert, last_advert: contact.last_advert,
// A non-zero path_len means the firmware has a route (direct // A non-zero path_len means the firmware has a route (direct
// or flood) to this contact — i.e. we can deliver to it. // or flood) to this contact — i.e. we can deliver to it.
reachable: contact.path_len != 0, reachable: contact.path_len != 0,
// E2E capability only grows (once the radio learns a peer's
// PKI key it stays known), so OR with any prior value rather
// than letting a transient contact refresh clear the pill.
pkc_capable: contact.pkc_capable
|| existing.map(|p| p.pkc_capable).unwrap_or(false),
// Position only ever improves to a fresher fix; never clear
// it just because a refresh's snapshot didn't carry one.
lat: contact.lat.or_else(|| existing.and_then(|p| p.lat)),
lon: contact.lon.or_else(|| existing.and_then(|p| p.lon)),
}; };
peers.insert(contact_id, peer); peers.insert(contact_id, peer);
} }
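
Reviewer note: the removed field initialisers all follow one merge rule, "prefer the fresh reading, otherwise keep what was known". A minimal sketch of that rule, using hypothetical helper names, may make the behavioural difference from the plain `rssi: None` / `snr: None` side easier to judge.

```rust
/// Keep the fresh value when the refresh carried one, else the previous one.
/// Matches the `contact.x.or_else(|| existing.and_then(|p| p.x))` pattern.
fn merge_reading<T>(fresh: Option<T>, previous: Option<T>) -> Option<T> {
    fresh.or(previous)
}

/// A capability flag that only grows: once true, a refresh cannot clear it.
/// `previous` is `None` when the peer was not known before this refresh.
fn merge_capability(fresh: bool, previous: Option<bool>) -> bool {
    fresh || previous.unwrap_or(false)
}
```

Under this rule a transitional refresh that lacks a signal reading or position leaves the stored value untouched, whereas the other side of the diff resets `rssi` and `snr` to `None` on every refresh.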
@ -709,30 +447,6 @@ async fn refresh_contacts(device: &mut MeshRadioDevice, state: &Arc<MeshState>)
} }
} }
fn meshtastic_contact_id(public_key_hex: &str) -> Option<u32> {
let bytes = hex::decode(public_key_hex).ok()?;
if bytes.len() < 15 || &bytes[4..15] != b"meshtastic:" {
return None;
}
let node_num = u32::from_le_bytes([bytes[0], bytes[1], bytes[2], bytes[3]]);
if node_num == 0 || node_num == u32::MAX {
None
} else {
Some(node_num)
}
}
/// Stable `u32` contact id derived from a Reticulum contact's `public_key_hex`
/// (hex of the 16-byte RNS destination hash). Delegates to the canonical
/// derivation in `reticulum.rs` so there is exactly one masking rule (must
/// stay below `FEDERATION_CONTACT_ID_BASE`, mod.rs:53) shared with
/// `ReticulumLink::initialize()`'s reported `node_id`.
fn reticulum_contact_id(public_key_hex: &str) -> Option<u32> {
let bytes = hex::decode(public_key_hex).ok()?;
let hash: [u8; 16] = bytes.try_into().ok()?;
Some(super::super::reticulum::reticulum_contact_id_from_hash(&hash))
}
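
Reviewer note: a worked example of the `meshtastic_contact_id` derivation may help when reviewing its removal. The sketch below takes the already-decoded key bytes (skipping the `hex::decode` step so it has no dependencies); `meshtastic_contact_id_from_bytes` is a hypothetical name for this illustration.

```rust
/// Derive a stable contact id from a synthetic Meshtastic key:
/// bytes 0..4 are the little-endian node number, bytes 4..15 the
/// ASCII tag "meshtastic:". Node numbers 0 and `u32::MAX` are rejected.
fn meshtastic_contact_id_from_bytes(bytes: &[u8]) -> Option<u32> {
    if bytes.len() < 15 || &bytes[4..15] != b"meshtastic:" {
        return None;
    }
    let node_num = u32::from_le_bytes([bytes[0], bytes[1], bytes[2], bytes[3]]);
    if node_num == 0 || node_num == u32::MAX {
        None
    } else {
        Some(node_num)
    }
}
```

For a key starting with `78 56 34 12` followed by the tag, the derived id is `0x12345678`. Keys without the tag fall back to the positional index in the original caller.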
/// Drain any queued messages from the device. /// Drain any queued messages from the device.
/// Returns `true` if a write/communication error occurred (for failure tracking). /// Returns `true` if a write/communication error occurred (for failure tracking).
async fn sync_queued_messages( async fn sync_queued_messages(
@ -757,62 +471,32 @@ async fn sync_queued_messages(
} }
} }
/// How many times we will try to write the LoRa region across reconnects before
/// giving up. A healthy radio accepts it on the first try (the reboot-and-verify
/// resolves on the next session). A radio that silently refuses to persist
/// config — corrupt/full flash, managed mode, etc. — would otherwise reboot-loop
/// forever; after this many attempts we stop, log, and run without it.
const MAX_REGION_PROVISION_ATTEMPTS: u32 = 3;
/// Process-global count of LoRa-region writes attempted (one radio per process).
/// Reset to 0 whenever the radio reports the desired region, so genuine later
/// drift re-provisions but a broken radio doesn't loop.
static REGION_PROVISION_ATTEMPTS: std::sync::atomic::AtomicU32 =
std::sync::atomic::AtomicU32::new(0);
/// Same retry-cap idea as the region, for the shared-channel write.
static CHANNEL_PROVISION_ATTEMPTS: std::sync::atomic::AtomicU32 =
std::sync::atomic::AtomicU32::new(0);
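
Reviewer note: the retry cap described above can be sketched with a single atomic counter. This is an assumption-laden illustration: the diff only shows a `load` of the counter, so the increment site is not visible, and the `fetch_update` form below is a substitute that checks and counts in one step. `try_begin_attempt` and `note_confirmed` are hypothetical names.

```rust
use std::sync::atomic::{AtomicU32, Ordering};

const MAX_REGION_PROVISION_ATTEMPTS: u32 = 3;

/// Returns true and counts the attempt if the cap has not been reached;
/// returns false (leaving the counter unchanged) once it has.
fn try_begin_attempt(counter: &AtomicU32) -> bool {
    counter
        .fetch_update(Ordering::Relaxed, Ordering::Relaxed, |n| {
            if n < MAX_REGION_PROVISION_ATTEMPTS {
                Some(n + 1)
            } else {
                None
            }
        })
        .is_ok()
}

/// Reset once the radio reports the desired setting, so later genuine
/// drift can be re-provisioned.
fn note_confirmed(counter: &AtomicU32) {
    counter.store(0, Ordering::Relaxed);
}
```

A radio that never persists its config stops being rewritten after three attempts instead of reboot-looping, while a healthy one resets the counter on its next session.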
/// Run a single mesh session (connect, initialize, main loop). /// Run a single mesh session (connect, initialize, main loop).
pub(super) async fn run_mesh_session( pub(super) async fn run_mesh_session(
state: &Arc<MeshState>, state: &Arc<MeshState>,
data_dir: &Path,
preferred_path: Option<&str>, preferred_path: Option<&str>,
our_did: &str, our_did: &str,
our_ed_pubkey_hex: &str, our_ed_pubkey_hex: &str,
our_x25519_secret: &[u8; 32], our_x25519_secret: &[u8; 32],
our_x25519_pubkey_hex: &str, our_x25519_pubkey_hex: &str,
server_name: Option<&str>, server_name: Option<&str>,
lora_region: Option<&str>,
channel_name: Option<&str>,
device_kind: Option<DeviceType>,
shutdown: &mut tokio::sync::watch::Receiver<bool>, shutdown: &mut tokio::sync::watch::Receiver<bool>,
cmd_rx: &mut mpsc::Receiver<MeshCommand>, cmd_rx: &mut mpsc::Receiver<MeshCommand>,
) -> Result<()> { ) -> Result<()> {
// Detect device — try preferred path first, fall back to auto-detect // Detect device — try preferred path first, fall back to auto-detect
let (device_path, mut device, device_info) = if let Some(path) = preferred_path { let (device_path, mut device, device_info) = if let Some(path) = preferred_path {
match open_preferred_path( match open_preferred_path(path).await {
path,
data_dir,
our_ed_pubkey_hex,
our_x25519_pubkey_hex,
device_kind,
)
.await
{
Ok((dev, info)) => (path.to_string(), dev, info), Ok((dev, info)) => (path.to_string(), dev, info),
Err(e) => { Err(e) => {
warn!( warn!(
"Preferred path {} probe failed: {} — trying auto-detect", "Preferred path {} probe failed: {} — trying auto-detect",
path, e path, e
); );
auto_detect_and_open(data_dir, our_ed_pubkey_hex, our_x25519_pubkey_hex, device_kind) auto_detect_and_open().await?
.await?
} }
} }
} else { } else {
auto_detect_and_open(data_dir, our_ed_pubkey_hex, our_x25519_pubkey_hex, device_kind).await? auto_detect_and_open().await?
}; };
// Update status // Update status
@ -828,73 +512,6 @@ pub(super) async fn run_mesh_session(
let _ = state.event_tx.send(MeshEvent::DeviceConnected(device_info)); let _ = state.event_tx.send(MeshEvent::DeviceConnected(device_info));
// Provision the LoRa region before anything else. A fresh Meshtastic radio
// is region-UNSET and therefore RF-silent — it can neither hear nor be
// heard, so contact discovery and DMs would all silently fail. If we write
// a new region the firmware reboots to apply it; restart the session so we
// re-handshake the freshly-rebooted radio (and then set its name on the
// reconnect, where the region already matches and no reboot occurs).
use std::sync::atomic::Ordering;
let region_attempts = REGION_PROVISION_ATTEMPTS.load(Ordering::Relaxed);
if region_attempts < MAX_REGION_PROVISION_ATTEMPTS {
match device.ensure_lora_region(lora_region).await {
Ok(true) => {
REGION_PROVISION_ATTEMPTS.fetch_add(1, Ordering::Relaxed);
info!(
region = lora_region.unwrap_or(""),
attempt = region_attempts + 1,
max = MAX_REGION_PROVISION_ATTEMPTS,
"Provisioned LoRa region — radio rebooting, restarting mesh session"
);
// Give the radio time to reboot before the reconnect re-opens it.
tokio::time::sleep(Duration::from_secs(10)).await;
return Ok(());
}
// Radio reports the desired region (or none configured): clear the
// attempt counter so a future genuine drift re-provisions cleanly.
Ok(false) => REGION_PROVISION_ATTEMPTS.store(0, Ordering::Relaxed),
Err(e) => warn!("Failed to provision LoRa region: {}", e),
}
} else if lora_region.is_some() {
warn!(
region = lora_region.unwrap_or(""),
attempts = MAX_REGION_PROVISION_ATTEMPTS,
"Radio did not persist the configured LoRa region after repeated \
attempts; continuing without it. The radio likely needs a manual \
factory reset / reflash; mesh discovery stays offline until its \
region is set."
);
}
// Provision the shared primary channel (after the region, since both reboot
// the radio). Without a matching channel two same-region radios still can't
// decode each other's traffic. Same retry-cap + restart-on-change pattern.
let channel_attempts = CHANNEL_PROVISION_ATTEMPTS.load(Ordering::Relaxed);
if channel_attempts < MAX_REGION_PROVISION_ATTEMPTS {
match device.ensure_channel(channel_name).await {
Ok(true) => {
CHANNEL_PROVISION_ATTEMPTS.fetch_add(1, Ordering::Relaxed);
info!(
channel = channel_name.unwrap_or(""),
attempt = channel_attempts + 1,
max = MAX_REGION_PROVISION_ATTEMPTS,
"Provisioned shared mesh channel — radio rebooting, restarting mesh session"
);
tokio::time::sleep(Duration::from_secs(10)).await;
return Ok(());
}
Ok(false) => CHANNEL_PROVISION_ATTEMPTS.store(0, Ordering::Relaxed),
Err(e) => warn!("Failed to provision mesh channel: {}", e),
}
} else if channel_name.is_some() {
warn!(
channel = channel_name.unwrap_or(""),
attempts = MAX_REGION_PROVISION_ATTEMPTS,
"Radio did not persist the shared mesh channel after repeated \
attempts; continuing without it. The radio may need a manual reset."
);
}
// Set advert name to the server's human-readable name (e.g. "ThinkPad"), // Set advert name to the server's human-readable name (e.g. "ThinkPad"),
// falling back to the DID fragment if no name is configured. // falling back to the DID fragment if no name is configured.
let advert_name = if let Some(name) = server_name { let advert_name = if let Some(name) = server_name {
@ -919,13 +536,6 @@ pub(super) async fn run_mesh_session(
if let Err(e) = device.send_self_advert().await { if let Err(e) = device.send_self_advert().await {
warn!("Failed to send initial advert: {}", e); warn!("Failed to send initial advert: {}", e);
} }
// Actively announce our identity over the air with want_response, so any
// already-running neighbour both learns about us and replies with its own
// NodeInfo — immediate two-way discovery instead of waiting for the radio's
// multi-hour NodeInfo cycle. (No-op for meshcore.)
if let Err(e) = device.send_nodeinfo_advert(true).await {
warn!("Failed to send initial NodeInfo advert: {}", e);
}
// NOTE: Archipelago identity adverts (`ARCHY:2:{ed}:{x25519}`) are intentionally // NOTE: Archipelago identity adverts (`ARCHY:2:{ed}:{x25519}`) are intentionally
// NOT broadcast on the shared public channel (channel 0). Doing so spams every // NOT broadcast on the shared public channel (channel 0). Doing so spams every
@ -950,11 +560,6 @@ pub(super) async fn run_mesh_session(
advert_timer.tick().await; // skip first immediate tick advert_timer.tick().await; // skip first immediate tick
sync_timer.tick().await; sync_timer.tick().await;
let mut consecutive_write_failures: u32 = 0; let mut consecutive_write_failures: u32 = 0;
// Backlog #12 RX-stall watchdog — see RX_STALL_TIMEOUT's doc comment.
// Reset on the very first frame check too (not just successful reads),
// so a session that never receives anything still gets a full timeout
// window from startup rather than an immediately-stale clock.
let mut last_rx_at = Instant::now();
loop { loop {
// If too many consecutive writes have failed, the serial port is dead — // If too many consecutive writes have failed, the serial port is dead —
@ -969,39 +574,19 @@ pub(super) async fn run_mesh_session(
consecutive_write_failures consecutive_write_failures
); );
} }
if last_rx_at.elapsed() >= RX_STALL_TIMEOUT {
error!(
stalled_for_secs = last_rx_at.elapsed().as_secs(),
"No mesh frames received for too long — triggering reconnection"
);
anyhow::bail!(
"RX stalled for over {}s — forcing reconnect",
RX_STALL_TIMEOUT.as_secs()
);
}
tokio::select! { tokio::select! {
// Check for incoming frames // Check for incoming frames
frame_result = device.try_recv_frame() => { frame_result = device.try_recv_frame() => {
match frame_result { match frame_result {
Ok(Some(frame)) => { Ok(Some(frame)) => {
// Successful read resets the failure counter and the // Successful read resets the failure counter
// RX-stall watchdog.
consecutive_write_failures = 0; consecutive_write_failures = 0;
last_rx_at = Instant::now();
// For meshtastic, the PKI-E2E status of this frame can't
// ride the synthetic meshcore frame — snapshot the message
// id high-water mark, dispatch, then stamp the E2E pill on
// whatever received message this frame produced.
let before_id = dispatch::max_message_id(state).await;
let should_action = frames::handle_frame( let should_action = frames::handle_frame(
&frame, &frame,
state, state,
our_x25519_secret, our_x25519_secret,
).await; ).await;
if device.take_rx_encrypted() {
dispatch::stamp_received_encrypted(state, before_id).await;
}
if should_action { if should_action {
// Contact discovery or messages waiting — sync both // Contact discovery or messages waiting — sync both
refresh_contacts(&mut device, state).await; refresh_contacts(&mut device, state).await;
@ -1030,13 +615,6 @@ pub(super) async fn run_mesh_session(
} else { } else {
consecutive_write_failures = 0; consecutive_write_failures = 0;
} }
// Periodic over-air identity beacon (no want_response, to avoid
// reply storms) so peers that come online later still discover
// us between the radio's own infrequent NodeInfo broadcasts.
// No-op for meshcore (its self-advert above already goes out).
if let Err(e) = device.send_nodeinfo_advert(false).await {
debug!("Periodic NodeInfo advert failed: {}", e);
}
// (Identity re-broadcast on the public channel intentionally // (Identity re-broadcast on the public channel intentionally
// removed — see the note at session startup. It spammed the // removed — see the note at session startup. It spammed the
// shared channel every advert tick.) // shared channel every advert tick.)
@ -1048,14 +626,8 @@ pub(super) async fn run_mesh_session(
handle_send_command(cmd, &mut device, state, &mut consecutive_write_failures).await; handle_send_command(cmd, &mut device, state, &mut consecutive_write_failures).await;
} }
// Periodic message sync + serial keepalive // Periodic message sync
_ = sync_timer.tick() => { _ = sync_timer.tick() => {
// Keep the radio streaming inbound packets to our serial client
// (best-effort — a failed keepalive shouldn't trip the reconnect
// counter on its own; a truly dead port is caught by real writes).
if let Err(e) = device.send_keepalive().await {
debug!("Mesh keepalive failed: {}", e);
}
if sync_queued_messages(&mut device, state, our_x25519_secret).await { if sync_queued_messages(&mut device, state, our_x25519_secret).await {
consecutive_write_failures += 1; consecutive_write_failures += 1;
debug!(failures = consecutive_write_failures, "Message sync failed"); debug!(failures = consecutive_write_failures, "Message sync failed");
@ -1135,53 +707,6 @@ async fn handle_send_command(
) )
.await; .await;
} }
MeshCommand::SendResource {
dest_pubkey_prefix,
payload,
} => {
// No MC-chunk framing here — RNS Resources do their own native
// chunked transfer at the link layer, so the payload goes through
// as-is (the receiving daemon hands back the complete blob in one
// `resource_recv` event).
if let Err(e) = device.send_resource(&dest_pubkey_prefix, &payload).await {
*consecutive_write_failures += 1;
warn!(
failures = *consecutive_write_failures,
"Failed to send Reticulum resource: {}", e
);
} else {
*consecutive_write_failures = 0;
info!(
dest = %hex::encode(dest_pubkey_prefix),
len = payload.len(),
"Sent Reticulum resource transfer"
);
}
}
MeshCommand::SendNativeImage {
dest_pubkey_prefix,
mime,
bytes,
caption,
} => {
if let Err(e) = device
.send_native_image(&dest_pubkey_prefix, &mime, &bytes, caption.as_deref())
.await
{
*consecutive_write_failures += 1;
warn!(
failures = *consecutive_write_failures,
"Failed to send native image: {}", e
);
} else {
*consecutive_write_failures = 0;
info!(
dest = %hex::encode(dest_pubkey_prefix),
len = bytes.len(),
"Sent native LXMF image"
);
}
}
MeshCommand::BroadcastChannel { channel, payload } => { MeshCommand::BroadcastChannel { channel, payload } => {
if let Err(e) = device.send_channel_text(channel, &payload).await { if let Err(e) = device.send_channel_text(channel, &payload).await {
*consecutive_write_failures += 1; *consecutive_write_failures += 1;
@ -1205,13 +730,6 @@ async fn handle_send_command(
*consecutive_write_failures = 0; *consecutive_write_failures = 0;
} }
} }
MeshCommand::RebootRadio { seconds } => {
if let Err(e) = device.reboot(seconds).await {
warn!("Failed to reboot radio: {}", e);
} else {
info!(seconds, "Radio reboot command sent to device");
}
}
MeshCommand::RefreshContacts => { MeshCommand::RefreshContacts => {
refresh_contacts(device, state).await; refresh_contacts(device, state).await;
} }
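
The region and channel provisioning removed in this diff share one pattern: count config writes in a process-global atomic, end the session after each write so the rebooted radio is re-handshaked, reset the counter once the radio reports the desired value, and stop writing once the cap is reached. A minimal sketch of that pattern follows. It assumes a hypothetical `provision_step` helper and `Step` enum (neither exists in the repository), and a closure stands in for `ensure_lora_region` / `ensure_channel`.

```rust
use std::sync::atomic::{AtomicU32, Ordering};

/// Cap on config writes before giving up (plays the role of
/// MAX_REGION_PROVISION_ATTEMPTS in the diff).
const MAX_ATTEMPTS: u32 = 3;

/// What the session should do after one provisioning step (hypothetical type).
#[derive(Debug, PartialEq)]
enum Step {
    /// A write was issued; the device reboots, so end the session and reconnect.
    Restart,
    /// Nothing was written (value already matches, or the write errored).
    Continue,
    /// The cap was reached earlier; skip the write and run without the setting.
    GaveUp,
}

/// `write` returns Ok(true) if it changed the device config and Ok(false) if
/// the device already reported the desired value.
fn provision_step<F>(attempts: &AtomicU32, write: F) -> Step
where
    F: FnOnce() -> Result<bool, String>,
{
    if attempts.load(Ordering::Relaxed) >= MAX_ATTEMPTS {
        return Step::GaveUp;
    }
    match write() {
        Ok(true) => {
            attempts.fetch_add(1, Ordering::Relaxed);
            Step::Restart
        }
        Ok(false) => {
            // Desired state confirmed: reset so later drift can re-provision.
            attempts.store(0, Ordering::Relaxed);
            Step::Continue
        }
        Err(_) => Step::Continue,
    }
}
```

With a radio that never persists the setting, three calls return `Step::Restart` and the fourth returns `Step::GaveUp`, which is the point where the original code logs a warning and continues the session.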

File diff suppressed because it is too large


@ -192,28 +192,16 @@ pub struct MessageKey {
// ─── Wire Envelope ────────────────────────────────────────────────────── // ─── Wire Envelope ──────────────────────────────────────────────────────
/// CBOR wire envelope wrapping any typed message. /// CBOR wire envelope wrapping any typed message.
///
/// `v`/`sig` MUST use `compact_bytes`/`compact_bytes_opt` — this is the
/// envelope EVERY message type wraps its payload in, so plain derived
/// `Vec<u8>` encoding here (one CBOR integer per byte instead of a native
/// byte string) bloats every single message on the wire, not just
/// attachments. Root-caused live: a small ReadReceipt (tiny inner payload)
/// crossed the 140-byte single-frame threshold purely from this envelope's
/// own array-of-ints tax on `v`, triggering MC-chunked send to a Reticulum
/// peer whose chunks then failed to reassemble — surfaced as raw
/// `MC000...` fragments in the chat instead of a receipt. Fix this here,
/// not just on individual payload structs like `ContentInlinePayload`.
#[derive(Debug, Clone, Serialize, Deserialize)] #[derive(Debug, Clone, Serialize, Deserialize)]
pub struct TypedEnvelope { pub struct TypedEnvelope {
/// Message type. /// Message type.
pub t: u8, pub t: u8,
/// Payload bytes (type-specific CBOR or raw data). /// Payload bytes (type-specific CBOR or raw data).
#[serde(with = "compact_bytes")]
pub v: Vec<u8>, pub v: Vec<u8>,
/// Unix timestamp (seconds since epoch). /// Unix timestamp (seconds since epoch).
pub ts: u32, pub ts: u32,
/// Optional Ed25519 signature of (t || v || ts_bytes) — for signed messages. /// Optional Ed25519 signature of (t || v || ts_bytes) — for signed messages.
#[serde(default, skip_serializing_if = "Option::is_none", with = "compact_bytes_opt")] #[serde(default, skip_serializing_if = "Option::is_none")]
pub sig: Option<Vec<u8>>, pub sig: Option<Vec<u8>>,
/// Message sequence number (per-sender, monotonically increasing). /// Message sequence number (per-sender, monotonically increasing).
#[serde(default)] #[serde(default)]
@ -493,29 +481,6 @@ pub struct ReactionPayload {
pub emoji: String, pub emoji: String,
} }
/// `Option<Vec<u8>>` <-> base64 string, for fields that need to survive a JSON
/// round-trip to the frontend readably (plain serde would emit/expect a JSON
/// array of numbers for `Vec<u8>`, which isn't what `data:` URLs want). CBOR
/// wire encoding pays a small (~33%) size tax for this on `thumb_bytes`
/// specifically — negligible given thumbnails are capped at ~60 bytes.
mod base64_opt_bytes {
use base64::{engine::general_purpose::STANDARD, Engine as _};
use serde::{Deserialize, Deserializer, Serializer};
pub fn serialize<S: Serializer>(v: &Option<Vec<u8>>, s: S) -> Result<S::Ok, S::Error> {
match v {
Some(bytes) => s.serialize_str(&STANDARD.encode(bytes)),
None => s.serialize_none(),
}
}
pub fn deserialize<'de, D: Deserializer<'de>>(d: D) -> Result<Option<Vec<u8>>, D::Error> {
let opt: Option<String> = Option::deserialize(d)?;
opt.map(|s| STANDARD.decode(&s).map_err(serde::de::Error::custom))
.transpose()
}
}
/// Content/attachment reference: points at a blob held by the sender that /// Content/attachment reference: points at a blob held by the sender that
/// recipients fetch out-of-band via `GET {sender_onion}/blob/{cid}?cap=..&exp=..&peer=..`. /// recipients fetch out-of-band via `GET {sender_onion}/blob/{cid}?cap=..&exp=..&peer=..`.
/// Thumb bytes (≤60B) may be inlined for immediate display; full blob is lazy. /// Thumb bytes (≤60B) may be inlined for immediate display; full blob is lazy.
@ -526,7 +491,7 @@ pub struct ContentRefPayload {
pub mime: String, pub mime: String,
#[serde(default, skip_serializing_if = "Option::is_none")] #[serde(default, skip_serializing_if = "Option::is_none")]
pub filename: Option<String>, pub filename: Option<String>,
#[serde(default, skip_serializing_if = "Option::is_none", with = "base64_opt_bytes")] #[serde(default, skip_serializing_if = "Option::is_none")]
pub thumb_bytes: Option<Vec<u8>>, pub thumb_bytes: Option<Vec<u8>>,
#[serde(default, skip_serializing_if = "Option::is_none")] #[serde(default, skip_serializing_if = "Option::is_none")]
pub caption: Option<String>, pub caption: Option<String>,
@ -538,86 +503,6 @@ pub struct ContentRefPayload {
pub cap_exp: u64, pub cap_exp: u64,
} }
/// Serde's blanket `Serialize`/`Deserialize` for `Vec<u8>` goes through
/// `serialize_seq`/one CBOR integer per byte, NOT CBOR's native byte-string
/// type — measured ~3.5x wire bloat on a real attachment send (4746 raw
/// bytes -> 16638-byte CBOR envelope) before this fix. `serialize_bytes`
/// maps to CBOR major type 2 (compact byte string) instead. Only apply this
/// to fields that never need JSON round-tripping to the frontend (this one
/// is CBOR-wire-only — the frontend gets `cid`/`size`/`mime` metadata built
/// by hand, never the raw bytes, see typed_messages.rs's `typed_json`).
mod compact_bytes {
use serde::{Deserializer, Serializer};
use std::fmt;
pub fn serialize<S: Serializer>(v: &[u8], s: S) -> Result<S::Ok, S::Error> {
s.serialize_bytes(v)
}
struct BytesVisitor;
impl<'de> serde::de::Visitor<'de> for BytesVisitor {
type Value = Vec<u8>;
fn expecting(&self, f: &mut fmt::Formatter) -> fmt::Result {
f.write_str("a byte string")
}
fn visit_bytes<E: serde::de::Error>(self, v: &[u8]) -> Result<Vec<u8>, E> {
Ok(v.to_vec())
}
fn visit_borrowed_bytes<E: serde::de::Error>(self, v: &'de [u8]) -> Result<Vec<u8>, E> {
Ok(v.to_vec())
}
fn visit_byte_buf<E: serde::de::Error>(self, v: Vec<u8>) -> Result<Vec<u8>, E> {
Ok(v)
}
// ciborium's non-self-describing byte-string decode path visits a
// seq of u8 in some configurations rather than calling visit_bytes
// directly — accept that too so this is robust to the reader mode.
fn visit_seq<A: serde::de::SeqAccess<'de>>(self, mut seq: A) -> Result<Vec<u8>, A::Error> {
let mut out = Vec::with_capacity(seq.size_hint().unwrap_or(0));
while let Some(byte) = seq.next_element::<u8>()? {
out.push(byte);
}
Ok(out)
}
}
pub fn deserialize<'de, D: Deserializer<'de>>(d: D) -> Result<Vec<u8>, D::Error> {
// NOT deserialize_bytes: ciborium's deserialize_bytes only succeeds
// when the byte string fits its small internal scratch buffer —
// anything bigger (any real attachment) falls through to an
// "invalid type: bytes, expected bytes" error despite the CBOR
// header being genuinely Bytes. deserialize_byte_buf streams
// segments into an unbounded Vec instead (confirmed against
// ciborium 0.2.2's de/mod.rs — deserialize_bytes's `Header::Bytes(Some(len))
// if len <= self.scratch.len()` guard vs deserialize_byte_buf's
// unconditional `Header::Bytes(len)` streaming path).
d.deserialize_byte_buf(BytesVisitor)
}
}
/// `Option<Vec<u8>>` variant of `compact_bytes` — for wire-only optional byte
/// fields (e.g. `TypedEnvelope.sig`) that never need JSON round-tripping.
/// Not the same as `base64_opt_bytes` above, which exists specifically
/// because `ContentRefPayload.thumb_bytes` DOES need a JSON-friendly (string)
/// form for the frontend's `data:` URL — this one stays fully binary.
mod compact_bytes_opt {
use serde::{Deserialize, Deserializer, Serializer};
pub fn serialize<S: Serializer>(v: &Option<Vec<u8>>, s: S) -> Result<S::Ok, S::Error> {
match v {
Some(bytes) => s.serialize_bytes(bytes),
None => s.serialize_none(),
}
}
pub fn deserialize<'de, D: Deserializer<'de>>(d: D) -> Result<Option<Vec<u8>>, D::Error> {
#[derive(Deserialize)]
struct Wrapper(#[serde(with = "super::compact_bytes")] Vec<u8>);
let opt: Option<Wrapper> = Option::deserialize(d)?;
Ok(opt.map(|w| w.0))
}
}
/// Inline attachment payload — file bytes carried directly in the envelope. /// Inline attachment payload — file bytes carried directly in the envelope.
/// Used when the file is small enough to chunk over LoRa and the peer has no /// Used when the file is small enough to chunk over LoRa and the peer has no
/// Tor path. Receiver writes `bytes` to its local BlobStore on reassembly /// Tor path. Receiver writes `bytes` to its local BlobStore on reassembly
@ -629,7 +514,6 @@ pub struct ContentInlinePayload {
pub filename: Option<String>, pub filename: Option<String>,
#[serde(default, skip_serializing_if = "Option::is_none")] #[serde(default, skip_serializing_if = "Option::is_none")]
pub caption: Option<String>, pub caption: Option<String>,
#[serde(with = "compact_bytes")]
pub bytes: Vec<u8>, pub bytes: Vec<u8>,
} }
@ -723,59 +607,6 @@ pub fn decode_payload<T: for<'a> Deserialize<'a>>(data: &[u8]) -> Result<T> {
mod tests { mod tests {
use super::*; use super::*;
#[test]
fn typed_envelope_of_a_small_payload_stays_under_single_frame_budget() {
// Regression test: a ReadReceipt (tiny inner payload — one MessageKey)
// wrapped in TypedEnvelope crossed the 140-byte single-LoRa-frame
// threshold purely from the OUTER envelope's own `v: Vec<u8>` field
// using array-of-ints CBOR encoding, live-observed forcing an
// unnecessary MC-chunked send whose chunks then failed to reassemble
// over Reticulum (surfaced as raw `MC000...` garbage in the chat).
let receipt = ReadReceiptPayload {
up_to: MessageKey {
sender_pubkey: "b550de818bb907047aad60d368668b3815ce2fcb9fc35d8040bb21c5c6217ccc"
.to_string(),
sender_seq: 42,
},
};
let payload = encode_payload(&receipt).unwrap();
let envelope = TypedEnvelope::new(MeshMessageType::ReadReceipt, payload).with_seq(1);
let wire = envelope.to_wire().unwrap();
assert!(
wire.len() < 140,
"a ReadReceipt envelope should fit one LoRa frame (<140B), got {} bytes — \
TypedEnvelope.v is bloating again",
wire.len()
);
let decoded = TypedEnvelope::from_wire(&wire).unwrap();
let decoded_receipt: ReadReceiptPayload = decode_payload(&decoded.v).unwrap();
assert_eq!(decoded_receipt.up_to, receipt.up_to);
}
#[test]
fn content_inline_bytes_use_compact_cbor_encoding() {
// Regression test: Vec<u8> without #[serde(with = "compact_bytes")]
// serializes as one CBOR integer per byte (~3.5x bloat, measured on
// a real send: 4746 raw bytes -> 16638-byte wire envelope). Compact
// encoding should stay close to the raw size, not balloon with it.
let raw = vec![0xABu8; 4746];
let payload = ContentInlinePayload {
mime: "image/jpeg".to_string(),
filename: None,
caption: None,
bytes: raw.clone(),
};
let encoded = encode_payload(&payload).unwrap();
assert!(
encoded.len() < raw.len() + 200,
"expected compact encoding close to {} raw bytes, got {} wire bytes",
raw.len(),
encoded.len()
);
let decoded: ContentInlinePayload = decode_payload(&encoded).unwrap();
assert_eq!(decoded.bytes, raw);
}
#[test] #[test]
fn test_typed_envelope_wire_roundtrip() { fn test_typed_envelope_wire_roundtrip() {
let envelope = TypedEnvelope::new(MeshMessageType::Text, b"hello mesh".to_vec()); let envelope = TypedEnvelope::new(MeshMessageType::Text, b"hello mesh".to_vec());
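
The size difference that the removed `compact_bytes` comments describe can be estimated without a CBOR library. The sketch below computes the encoded length of the same bytes as a definite-length array of unsigned integers (what a derived `Vec<u8>` produces through `serialize_seq`) and as a byte string (major type 2). The function names are hypothetical and the arithmetic follows the CBOR head rules only; it does not call ciborium.

```rust
/// Length in bytes of a CBOR head (major type plus argument) for argument `n`.
fn head_len(n: u64) -> usize {
    match n {
        0..=23 => 1,
        24..=0xFF => 2,
        0x100..=0xFFFF => 3,
        0x1_0000..=0xFFFF_FFFF => 5,
        _ => 9,
    }
}

/// Encoded size when the bytes go out as an array of unsigned integers:
/// one array head, then one integer item per byte (1 or 2 bytes each).
fn int_array_len(data: &[u8]) -> usize {
    head_len(data.len() as u64)
        + data.iter().map(|&b| head_len(u64::from(b))).sum::<usize>()
}

/// Encoded size as a definite-length byte string: one head, then the raw bytes.
fn byte_string_len(data: &[u8]) -> usize {
    head_len(data.len() as u64) + data.len()
}
```

For the 4746-byte `0xAB` buffer used in the removed regression test, `int_array_len` gives 9495 bytes and `byte_string_len` gives 4749. When an inner payload encoded as an integer array is wrapped again in an envelope field that is also an integer array, the overhead is applied twice, which is in line with the larger bloat figure quoted in the removed comments.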


@ -14,7 +14,6 @@ pub mod message_types;
pub mod outbox; pub mod outbox;
pub mod protocol; pub mod protocol;
pub mod ratchet; pub mod ratchet;
pub mod reticulum;
pub mod scheduler; pub mod scheduler;
pub mod serial; pub mod serial;
pub mod session; pub mod session;
@ -246,11 +245,6 @@ pub(crate) async fn upsert_federation_peer(
last_advert: existing.as_ref().map(|p| p.last_advert).unwrap_or(0), last_advert: existing.as_ref().map(|p| p.last_advert).unwrap_or(0),
// Federation peers are reachable off-radio (Tor/FIPS), so always true. // Federation peers are reachable off-radio (Tor/FIPS), so always true.
reachable: true, reachable: true,
// Off-radio E2E (federation) is handled by the archy-peer path; preserve
// any radio PKI capability learned for a twinned contact.
pkc_capable: existing.as_ref().map(|p| p.pkc_capable).unwrap_or(false),
lat: existing.as_ref().and_then(|p| p.lat),
lon: existing.as_ref().and_then(|p| p.lon),
}; };
peers.insert(contact_id, peer); peers.insert(contact_id, peer);
// A radio twin of this node (same advert_name, no arch identity yet) can now // A radio twin of this node (same advert_name, no arch identity yet) can now
@ -332,14 +326,6 @@ pub struct MeshConfig {
/// Channel name for broadcasts. /// Channel name for broadcasts.
#[serde(default)] #[serde(default)]
pub channel_name: Option<String>, pub channel_name: Option<String>,
/// Meshtastic LoRa region (e.g. "EU_868", "US", "ANZ"). Fresh-flashed
/// Meshtastic radios ship region-UNSET and are RF-silent until a region is
/// set, so archy provisions this region on connect to bring every node onto
/// the same band automatically (the parity equivalent of a meshcore radio
/// coming up on its configured band). Ignored for meshcore devices and when
/// unset/None.
#[serde(default)]
pub lora_region: Option<String>,
/// Whether to periodically broadcast our identity. /// Whether to periodically broadcast our identity.
#[serde(default)] #[serde(default)]
pub broadcast_identity: bool, pub broadcast_identity: bool,
@ -383,15 +369,6 @@ pub struct MeshConfig {
/// when `assistant_trusted_only` is on and they aren't federation-Trusted. /// when `assistant_trusted_only` is on and they aren't federation-Trusted.
#[serde(default)] #[serde(default)]
pub assistant_allowed_contacts: Vec<String>, pub assistant_allowed_contacts: Vec<String>,
/// Pin the expected firmware on `device_path`/auto-detected ports. A
/// reflashable board (e.g. Heltec V3) can run Meshcore, Meshtastic, or
/// RNode firmware, so probe order alone is best-effort — set this when an
/// operator knows which one is plugged in. When `Some`, only that
/// device's probe runs (no other firmware's init bytes are ever injected
/// into the port); `None` keeps today's Meshcore→Meshtastic→Reticulum
/// strict-probe auto-detect.
#[serde(default)]
pub device_kind: Option<types::DeviceType>,
} }
fn default_assistant_backend() -> String { fn default_assistant_backend() -> String {
@ -408,7 +385,6 @@ impl Default for MeshConfig {
enabled: false, enabled: false,
device_path: None, device_path: None,
channel_name: Some("archipelago".to_string()), channel_name: Some("archipelago".to_string()),
lora_region: None,
broadcast_identity: true, broadcast_identity: true,
advert_name: None, advert_name: None,
mesh_only_mode: None, mesh_only_mode: None,
@ -421,7 +397,6 @@ impl Default for MeshConfig {
assistant_trusted_only: true, assistant_trusted_only: true,
assistant_backend: default_assistant_backend(), assistant_backend: default_assistant_backend(),
assistant_allowed_contacts: Vec::new(), assistant_allowed_contacts: Vec::new(),
device_kind: None,
} }
} }
} }
@ -694,16 +669,12 @@ impl MeshService {
let handle = listener::spawn_mesh_listener( let handle = listener::spawn_mesh_listener(
Arc::clone(&self.state), Arc::clone(&self.state),
self.data_dir.clone(),
self.config.device_path.clone(), self.config.device_path.clone(),
self.our_did.clone(), self.our_did.clone(),
self.our_ed_pubkey_hex.clone(), self.our_ed_pubkey_hex.clone(),
self.our_x25519_secret, self.our_x25519_secret,
self.our_x25519_pubkey_hex.clone(), self.our_x25519_pubkey_hex.clone(),
self.server_name.clone(), self.server_name.clone(),
self.config.lora_region.clone(),
self.config.channel_name.clone(),
self.config.device_kind,
shutdown_rx, shutdown_rx,
cmd_rx, cmd_rx,
); );
@ -939,13 +910,7 @@ impl MeshService {
/// Get current mesh status. /// Get current mesh status.
pub async fn status(&self) -> MeshStatus { pub async fn status(&self) -> MeshStatus {
let mut status = self.state.status.read().await.clone(); self.state.status.read().await.clone()
// The operator-configured LoRa region isn't part of the live session
// state (it's config, read once at session start) — compose it in
// here rather than threading it through the session's shared status
// writes, for the Device tab (#8) to display.
status.region = self.config.lora_region.clone();
status
} }
/// Get a reference to the shared mesh state. /// Get a reference to the shared mesh state.
@ -1133,21 +1098,16 @@ impl MeshService {
// (FIPS→Tor) instead of handing it to a radio that physically cannot // (FIPS→Tor) instead of handing it to a radio that physically cannot
// deliver it. Reachable radio peers stay on the mesh; oversized // deliver it. Reachable radio peers stay on the mesh; oversized
// envelopes (file shares etc.) always take the federation path. // envelopes (file shares etc.) always take the federation path.
let radio_federated_unreachable = !is_federation_synthetic && !exceeds_lora && { let radio_federated_unreachable = !is_federation_synthetic
let peers = self.state.peers.read().await; && !exceeds_lora
peers && {
.get(&contact_id) let peers = self.state.peers.read().await;
.map(|p| !p.reachable && p.arch_pubkey_hex.is_some()) peers
.unwrap_or(false) .get(&contact_id)
}; .map(|p| !p.reachable && p.arch_pubkey_hex.is_some())
let mesh_only_mode = load_config(&self.data_dir) .unwrap_or(false)
.await };
.ok() if is_federation_synthetic || exceeds_lora || radio_federated_unreachable {
.and_then(|cfg| cfg.mesh_only_mode)
.unwrap_or(false);
if !mesh_only_mode
&& (is_federation_synthetic || exceeds_lora || radio_federated_unreachable)
{
// Resolve the peer's pubkey/did. Prefer the live mesh peer table, // Resolve the peer's pubkey/did. Prefer the live mesh peer table,
// but fall back to federation storage for federation-synthetic ids // but fall back to federation storage for federation-synthetic ids
// that were never seeded into `state.peers` — e.g. a radio-less // that were never seeded into `state.peers` — e.g. a radio-less
@ -1216,21 +1176,8 @@ impl MeshService {
// (`send_dm_via_channel` in listener/session.rs) handles both // (`send_dm_via_channel` in listener/session.rs) handles both
// single-frame and chunked transmission internally; we must NOT // single-frame and chunked transmission internally; we must NOT
// pre-chunk here as well or the receiver sees garbage. // pre-chunk here as well or the receiver sees garbage.
} else if mesh_only_mode
&& (is_federation_synthetic || exceeds_lora || radio_federated_unreachable)
{
tracing::info!(
contact_id,
bytes = wire.len(),
is_federation_synthetic,
exceeds_lora,
radio_federated_unreachable,
"Off-grid mode active; forcing mesh message over LoRa only"
);
} }
self.send_raw_payload(contact_id, wire).await?; self.send_raw_payload(contact_id, wire).await?;
let device_type = self.state.status.read().await.device_type;
let radio_transport = radio_transport_label(device_type);
Ok(self Ok(self
.record_sent_typed( .record_sent_typed(
contact_id, contact_id,
@ -1238,98 +1185,6 @@ impl MeshService {
display_text, display_text,
typed_payload, typed_payload,
sender_seq, sender_seq,
Some(radio_transport.to_string()),
// Archy↔archy typed envelopes over LoRa are identity-signed; the
// radio E2E flag (meshtastic PKI / meshcore session) isn't
// threaded to the send side yet, so don't over-claim E2E here —
// except Reticulum/LXMF, which is unconditionally E2E on every
// send regardless of peer/session state (see send_message).
device_type == DeviceType::Reticulum,
)
.await)
}
/// Send an image via native LXMF `FIELD_IMAGE` instead of our own typed
/// envelope — for a stock (non-archy) peer that can't decode our CBOR
/// wire format. Caller (the RPC layer) gates this on
/// `!is_archy_peer(contact_id)`; low-level "just send the bytes" shape
/// mirroring `send_raw_payload` — does NOT record a Sent MeshMessage
/// itself, callers use `record_sent_typed` same as the typed-envelope
/// paths so the Sent card renders identically regardless of which wire
/// format actually went out.
pub async fn send_native_image(
&self,
contact_id: u32,
mime: &str,
bytes: Vec<u8>,
caption: Option<String>,
) -> Result<()> {
let status = self.state.status.read().await;
if !status.device_connected {
anyhow::bail!("No mesh device connected");
}
drop(status);
let dest_prefix = self.peer_dest_prefix(contact_id).await?;
self.state
.send_cmd(listener::MeshCommand::SendNativeImage {
dest_pubkey_prefix: dest_prefix,
mime: mime.to_string(),
bytes,
caption,
})
.await
.map_err(|_| anyhow::anyhow!("Mesh listener not running"))?;
Ok(())
}
/// Send a typed envelope over a dedicated Reticulum RNS Resource transfer
/// (`MeshCommand::SendResource`) instead of the small inline-chunk path
/// `send_typed_wire`/`send_raw_payload` uses. Callers (the `mesh.send-content-inline`
/// RPC handler) are responsible for only reaching this when the active
/// device is actually Reticulum and the payload fits the
/// `RETICULUM_RESOURCE_MAX` budget — see `mesh.transport-advice`'s
/// `"resource-mesh"` tier, the single source of truth for that decision.
/// Mirrors `send_typed_wire`'s signature/return shape so RPC call sites
/// can switch between the two paths without restructuring.
pub async fn send_content_resource(
&self,
contact_id: u32,
wire: Vec<u8>,
type_label: &str,
display_text: &str,
typed_payload: Option<serde_json::Value>,
sender_seq: u64,
) -> Result<MeshMessage> {
let status = self.state.status.read().await;
if !status.device_connected {
anyhow::bail!("No mesh device connected");
}
drop(status);
let dest_prefix = self.peer_dest_prefix(contact_id).await?;
self.state
.send_cmd(listener::MeshCommand::SendResource {
dest_pubkey_prefix: dest_prefix,
payload: wire,
})
.await
.map_err(|_| anyhow::anyhow!("Mesh listener not running"))?;
let device_type = self.state.status.read().await.device_type;
let radio_transport = radio_transport_label(device_type);
Ok(self
.record_sent_typed(
contact_id,
type_label,
display_text,
typed_payload,
sender_seq,
Some(radio_transport.to_string()),
// Reticulum/LXMF is unconditionally E2E on every send — same
// reasoning as send_message's native-text path. This method
// is Reticulum-only by construction (callers gate on
// device_type before reaching it), so this is never wrong.
true,
) )
.await) .await)
} }
@ -1385,11 +1240,6 @@ impl MeshService {
display_text, display_text,
typed_payload, typed_payload,
sender_seq, sender_seq,
// Transport is finalized below once the background send resolves
// FIPS vs Tor; mark E2E now — a federation envelope is
// identity-signed and rides an encrypted transport.
None,
true,
) )
.await; .await;
@ -1399,10 +1249,6 @@ impl MeshService {
// MeshMessage and the UI's delivery indicator tracks the receipt. // MeshMessage and the UI's delivery indicator tracks the receipt.
let peer_onion_owned = peer_onion.to_string(); let peer_onion_owned = peer_onion.to_string();
let data_dir_owned = self.data_dir.clone(); let data_dir_owned = self.data_dir.clone();
// Finalize the Sent record's transport pill once we know which leg
// (FIPS/Tor) actually delivered it.
let state_for_transport = self.state.clone();
let sent_msg_id = msg.id;
tokio::spawn(async move { tokio::spawn(async move {
let fips_npub = let fips_npub =
crate::federation::fips_npub_for_onion(&data_dir_owned, &peer_onion_owned).await; crate::federation::fips_npub_for_onion(&data_dir_owned, &peer_onion_owned).await;
@ -1423,12 +1269,6 @@ impl MeshService {
match req.send_json(&body).await { match req.send_json(&body).await {
Ok((resp, transport)) if resp.status().is_success() => { Ok((resp, transport)) if resp.status().is_success() => {
tracing::debug!(contact_id, transport = %transport, "Federation envelope delivered"); tracing::debug!(contact_id, transport = %transport, "Federation envelope delivered");
// Tag the Sent bubble with the leg that delivered it (the
// transport pill: "fips" / "tor").
let mut messages = state_for_transport.messages.write().await;
if let Some(m) = messages.iter_mut().find(|m| m.id == sent_msg_id) {
m.transport = Some(transport.to_string());
}
} }
Ok((resp, transport)) => warn!( Ok((resp, transport)) => warn!(
contact_id, contact_id,
@ -1493,22 +1333,6 @@ impl MeshService {
Some(&display_name), Some(&display_name),
) )
.await; .await;
// The inbound HTTP gives no FIPS-vs-Tor signal, so label the message
// with the leg most recently used with this peer (federation storage's
// `last_transport`), defaulting to Tor. Federation envelopes are E2E
// (identity-signed over an encrypted transport).
let transport_label = {
let nodes = crate::federation::load_nodes(&self.data_dir)
.await
.unwrap_or_default();
nodes
.iter()
.find(|n| n.pubkey == from_pubkey_hex)
.and_then(|n| n.last_transport.clone())
.filter(|t| t == "fips" || t == "tor")
.unwrap_or_else(|| "tor".to_string())
};
let before = listener::dispatch::max_message_id(&self.state).await;
listener::dispatch::handle_typed_envelope_direct( listener::dispatch::handle_typed_envelope_direct(
&self.state, &self.state,
contact_id, contact_id,
@ -1516,14 +1340,6 @@ impl MeshService {
envelope, envelope,
) )
.await; .await;
listener::dispatch::stamp_received_transport(
&self.state,
contact_id,
before,
&transport_label,
true,
)
.await;
Ok(()) Ok(())
} }
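The receive-side labelling removed in the hunk above falls back to Tor whenever the peer has no recorded `last_transport`, or the stored value is neither `fips` nor `tor`. A minimal standalone sketch of that fallback chain follows. The `Node` struct here is a simplified, hypothetical stand-in for the real federation node record, and `transport_label` as a free function is an assumption for illustration, not the project's API.

```rust
/// Hypothetical, simplified stand-in for a federation node record.
/// Only the two fields the fallback chain reads are modelled.
struct Node {
    pubkey: String,
    last_transport: Option<String>,
}

/// Choose the transport label for an inbound federation message.
/// Returns the peer's last recorded leg if it is "fips" or "tor",
/// and "tor" in every other case (unknown peer, no record, or an
/// unrecognised value).
fn transport_label(nodes: &[Node], from_pubkey_hex: &str) -> String {
    nodes
        .iter()
        .find(|n| n.pubkey == from_pubkey_hex)
        .and_then(|n| n.last_transport.clone())
        .filter(|t| t == "fips" || t == "tor")
        .unwrap_or_else(|| "tor".to_string())
}
```

Because every miss collapses to the same default, a caller cannot distinguish "peer last used Tor" from "nothing known about this peer"; that matches the removed comment, which describes the label as a best guess rather than a measured value.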
@ -1625,7 +1441,6 @@ impl MeshService {
let chan_contact_id = u32::MAX - (channel as u32); let chan_contact_id = u32::MAX - (channel as u32);
let chan_name = format!("Channel {}", channel); let chan_name = format!("Channel {}", channel);
let msg_id = self.state.next_id().await; let msg_id = self.state.next_id().await;
let radio_transport = radio_transport_label(self.state.status.read().await.device_type);
let msg = MeshMessage { let msg = MeshMessage {
id: msg_id, id: msg_id,
direction: MessageDirection::Sent, direction: MessageDirection::Sent,
@ -1634,10 +1449,7 @@ impl MeshService {
plaintext: display_text.to_string(), plaintext: display_text.to_string(),
timestamp: chrono::Utc::now().to_rfc3339(), timestamp: chrono::Utc::now().to_rfc3339(),
delivered: false, delivered: false,
// Channel broadcasts use the shared channel PSK, not per-identity
// E2E — so not an E2E message, but it does travel over the radio.
encrypted: false, encrypted: false,
transport: Some(radio_transport.to_string()),
message_type: type_label.to_string(), message_type: type_label.to_string(),
typed_payload, typed_payload,
sender_pubkey: Some(self.our_ed_pubkey_hex.clone()), sender_pubkey: Some(self.our_ed_pubkey_hex.clone()),
@ -1658,78 +1470,39 @@ impl MeshService {
pub async fn send_message(&self, contact_id: u32, text: &str) -> Result<MeshMessage> { pub async fn send_message(&self, contact_id: u32, text: &str) -> Result<MeshMessage> {
use crate::mesh::message_types::{MeshMessageType, TypedEnvelope}; use crate::mesh::message_types::{MeshMessageType, TypedEnvelope};
let seq = self.state.next_send_seq(contact_id).await; let seq = self.state.next_send_seq(contact_id).await;
-        let device_type = self.state.status.read().await.device_type;
-        let archy = self.is_archy_peer(contact_id).await;
-
-        // Transport choice is DEVICE-AWARE so we fix Meshtastic without regressing
-        // Meshcore:
-        // • Meshtastic (any peer) → plain text native DM on TEXT_MESSAGE_APP. The
-        // firmware end-to-end (PKC/Curve25519) encrypts a directed DM to any
-        // peer whose public key it knows (archy peers exchange them via
-        // NodeInfo), so it's delivered E2E and shows as chat on every client.
-        // Meshtastic firmware 2.7.x will NOT deliver our opaque binary typed
-        // envelope as a message (PRIVATE_APP is opaque app-data; a base64
-        // envelope overflows one LoRa frame and chunk-fails) — wrapping text
-        // is exactly what silently broke archy↔archy Meshtastic LoRa.
-        // • Meshcore/Reticulum archy peer → keep the rich signed typed envelope.
-        // Meshcore frames are binary-safe (no UTF-8 mangling) and Reticulum/LXMF
-        // is binary-safe and high-capacity too; both carry their own transport
-        // E2E plus our signature for `!ai` auth / seq reply addressing, so the
-        // envelope works there and we must not drop it.
-        // • Meshcore stock client → plain text (can't decode our envelope).
-        // Rich typed messages (invoice/coordinate/reaction/…) always use the
-        // typed-wire path via `send_typed_wire`; only plain Text is routed here.
-        let use_typed_envelope =
-            archy && matches!(device_type, DeviceType::Meshcore | DeviceType::Reticulum);
-        if use_typed_envelope {
-            // Sign with our archipelago identity so the receiver can authenticate
-            // us over LoRa (verifies against our bound `arch_pubkey_hex`). `with_seq`
-            // is applied after signing — seq is not covered by the signature.
-            let envelope = TypedEnvelope::new_signed(
-                MeshMessageType::Text,
-                text.as_bytes().to_vec(),
-                &self.signing_key,
-            )
-            .with_seq(seq);
-            let wire = envelope.to_wire()?;
-            return self
-                .send_typed_wire(contact_id, wire, "text", text, None, seq)
-                .await;
+        // Stock (non-archipelago) radio contacts — e.g. a phone running the
+        // MeshCore app — can't decode our typed envelope and would render it as
+        // garbled bytes. Send them the raw text as a plain native DM instead.
+        // Archipelago peers still get the typed envelope (seq/reply/reaction
+        // addressing + encryption).
+        if !self.is_archy_peer(contact_id).await {
+            let dest_prefix = self.peer_dest_prefix(contact_id).await?;
+            self.state
+                .send_cmd(listener::MeshCommand::SendNativeText {
+                    dest_pubkey_prefix: dest_prefix,
+                    payload: text.as_bytes().to_vec(),
+                })
+                .await
+                .map_err(|_| anyhow::anyhow!("Mesh listener not running"))?;
+            return Ok(self
+                .record_sent_typed(contact_id, "text", text, None, seq)
+                .await);
         }
-
-        let dest_prefix = self.peer_dest_prefix(contact_id).await?;
-        self.state
-            .send_cmd(listener::MeshCommand::SendNativeText {
-                dest_pubkey_prefix: dest_prefix,
-                payload: text.as_bytes().to_vec(),
-            })
+        // Sign the envelope with our archipelago identity key so the receiver
+        // can authenticate us over LoRa (it verifies against our bound
+        // `arch_pubkey_hex`). This is what lets a `!ai` typed in chat to a
+        // trusted node pass the receiver's `trusted_only` gate over the radio —
+        // an unsigned radio packet can never authenticate. The signature is
+        // optional on the wire and ignored by peers that don't know our key, so
+        // it stays backward compatible. (Federation/Tor sends already sign in
+        // `send_typed_wire_via_federation`.) `with_seq` is applied after signing
+        // — seq is not covered by the signature.
+        let envelope =
+            TypedEnvelope::new_signed(MeshMessageType::Text, text.as_bytes().to_vec(), &self.signing_key)
+                .with_seq(seq);
+        let wire = envelope.to_wire()?;
+        self.send_typed_wire(contact_id, wire, "text", text, None, seq)
             .await
-            .map_err(|_| anyhow::anyhow!("Mesh listener not running"))?;
-        // The firmware PKI-encrypts a directed DM to any peer whose key it knows;
-        // archy peers always exchange keys, so mark those Sent rows E2E so the
-        // pill shows immediately. A non-archy stock peer (e.g. 3ccc) can also be
-        // PKC-capable once we've learned its NodeInfo public key — OR that in too
-        // so the pill isn't archy-only. (The receiver independently stamps E2E
-        // from the radio's `pki_encrypted` flag, so an inbound row is accurate
-        // regardless.)
-        //
-        // Reticulum/LXMF has no such conditional: every send is encrypted to the
-        // destination's identity key by the LXMF router itself, archy peer or
-        // not — so it's unconditionally E2E rather than gated on `archy`/`pkc_capable`
-        // (which is a Meshtastic-only concept; Reticulum contacts never set it).
-        let pkc_capable = self.peer_pkc_capable(contact_id).await;
-        let encrypted = device_type == DeviceType::Reticulum || archy || pkc_capable;
-        Ok(self
-            .record_sent_typed(
-                contact_id,
-                "text",
-                text,
-                None,
-                seq,
-                Some(radio_transport_label(device_type).to_string()),
-                encrypted,
-            )
-            .await)
} }
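The `main` side of the `send_message` hunk above routes plain text by device type as well as by peer kind, whereas the `demo-build` side gates only on `is_archy_peer`. The removed decision can be sketched as two pure functions. This is a sketch under stated assumptions: `TextRoute`, `route_text` and `native_text_encrypted` are hypothetical names introduced for illustration, and `DeviceType` is reduced to the three variants the diff mentions (the real enum may have more).

```rust
/// Simplified stand-in for the crate's `DeviceType`; only the three
/// variants named in the diff are modelled (assumption).
#[derive(Clone, Copy, Debug, PartialEq, Eq)]
enum DeviceType {
    Meshtastic,
    Meshcore,
    Reticulum,
}

/// Hypothetical label for the wire format a plain-text DM takes.
#[derive(Debug, PartialEq, Eq)]
enum TextRoute {
    /// Signed typed envelope via the typed-wire path.
    TypedEnvelope,
    /// Raw text as a native DM.
    NativeText,
}

/// Mirror of the removed `use_typed_envelope` condition: only archy
/// peers on Meshcore or Reticulum get the typed envelope. Meshtastic
/// and all stock peers get native text.
fn route_text(device: DeviceType, archy: bool) -> TextRoute {
    if archy && matches!(device, DeviceType::Meshcore | DeviceType::Reticulum) {
        TextRoute::TypedEnvelope
    } else {
        TextRoute::NativeText
    }
}

/// Mirror of the removed `encrypted` expression for the native-text
/// path: Reticulum is always marked E2E, otherwise the flag depends on
/// the peer being archy or having a known radio PKI key.
fn native_text_encrypted(device: DeviceType, archy: bool, pkc_capable: bool) -> bool {
    device == DeviceType::Reticulum || archy || pkc_capable
}
```

For example, `route_text(DeviceType::Meshtastic, true)` yields `TextRoute::NativeText`, which is the case the removed comment calls out: an archy peer on Meshtastic still receives plain text. Note that `native_text_encrypted` is only consulted on the native-text path, so its Reticulum arm applies to stock Reticulum peers; archy Reticulum peers take the typed-envelope route instead.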
/// Whether `contact_id` is an archipelago peer (vs a stock meshcore client). /// Whether `contact_id` is an archipelago peer (vs a stock meshcore client).
@ -1737,7 +1510,7 @@ impl MeshService {
/// only once we've learned their archipelago identity (DID or x25519 key, /// only once we've learned their archipelago identity (DID or x25519 key,
/// from federation seeding or an identity exchange). Stock clients have /// from federation seeding or an identity exchange). Stock clients have
/// neither, so we send them plain text rather than typed envelopes. /// neither, so we send them plain text rather than typed envelopes.
-    pub(crate) async fn is_archy_peer(&self, contact_id: u32) -> bool {
+    async fn is_archy_peer(&self, contact_id: u32) -> bool {
if contact_id & 0x8000_0000 != 0 { if contact_id & 0x8000_0000 != 0 {
return true; return true;
} }
@ -1748,21 +1521,6 @@ impl MeshService {
.unwrap_or(false) .unwrap_or(false)
} }
/// Whether `contact_id`'s real radio PKI (Curve25519) key is known, so the
/// firmware delivers a directed DM to it end-to-end encrypted even though
/// it's not an archipelago peer (e.g. stock Meshtastic peer 3ccc). Stamped
/// onto `MeshPeer::pkc_capable` by `refresh_contacts` from the driver's
/// `get_contacts()`.
async fn peer_pkc_capable(&self, contact_id: u32) -> bool {
self.state
.peers
.read()
.await
.get(&contact_id)
.map(|p| p.pkc_capable)
.unwrap_or(false)
}
/// Record a Sent MeshMessage for a typed envelope that has already been /// Record a Sent MeshMessage for a typed envelope that has already been
/// transmitted by the caller. Used by the RPC layer after sending /// transmitted by the caller. Used by the RPC layer after sending
/// invoice/coordinate/alert/etc. so the UI gets a proper rich Sent card /// invoice/coordinate/alert/etc. so the UI gets a proper rich Sent card
@ -1774,8 +1532,6 @@ impl MeshService {
display_text: &str, display_text: &str,
typed_payload: Option<serde_json::Value>, typed_payload: Option<serde_json::Value>,
sender_seq: u64, sender_seq: u64,
transport: Option<String>,
encrypted: bool,
) -> MeshMessage { ) -> MeshMessage {
let msg_id = self.state.next_id().await; let msg_id = self.state.next_id().await;
let peer_name = self let peer_name = self
@ -1793,8 +1549,7 @@ impl MeshService {
plaintext: display_text.to_string(), plaintext: display_text.to_string(),
timestamp: chrono::Utc::now().to_rfc3339(), timestamp: chrono::Utc::now().to_rfc3339(),
delivered: false, delivered: false,
-            encrypted,
-            transport,
+            encrypted: false,
message_type: type_label.to_string(), message_type: type_label.to_string(),
typed_payload, typed_payload,
sender_pubkey: Some(self.our_ed_pubkey_hex.clone()), sender_pubkey: Some(self.our_ed_pubkey_hex.clone()),
@ -1836,7 +1591,6 @@ impl MeshService {
let chan_contact_id = u32::MAX - (channel as u32); let chan_contact_id = u32::MAX - (channel as u32);
let chan_name = format!("Channel {}", channel); let chan_name = format!("Channel {}", channel);
let msg_id = self.state.next_id().await; let msg_id = self.state.next_id().await;
let radio_transport = radio_transport_label(self.state.status.read().await.device_type);
let msg = MeshMessage { let msg = MeshMessage {
id: msg_id, id: msg_id,
@ -1846,9 +1600,7 @@ impl MeshService {
plaintext: text.to_string(), plaintext: text.to_string(),
timestamp: chrono::Utc::now().to_rfc3339(), timestamp: chrono::Utc::now().to_rfc3339(),
delivered: false, delivered: false,
// Plain channel broadcast over the radio (shared PSK, not E2E).
encrypted: false, encrypted: false,
transport: Some(radio_transport.to_string()),
message_type: "text".to_string(), message_type: "text".to_string(),
typed_payload: None, typed_payload: None,
sender_pubkey: None, sender_pubkey: None,
@ -1882,26 +1634,6 @@ impl MeshService {
Ok(()) Ok(())
} }
/// Reboot the locally-connected radio firmware to recover a wedged /
/// RX-deaf radio (one that has stopped hearing the mesh while still able to
/// transmit). The device reconnects via the listener's reboot→reconnect
/// loop. `seconds` is the firmware reboot delay.
pub async fn reboot_radio(&self, seconds: i64) -> Result<()> {
let status = self.state.status.read().await;
if !status.device_connected {
anyhow::bail!("No mesh device connected. Check USB connection.");
}
drop(status);
self.state
.send_cmd(listener::MeshCommand::RebootRadio { seconds })
.await
.map_err(|_| anyhow::anyhow!("Mesh listener not running"))?;
info!(seconds, "Mesh radio reboot triggered");
Ok(())
}
/// Current mesh-AI assistant settings (issue #50). /// Current mesh-AI assistant settings (issue #50).
pub async fn assistant_config(&self) -> listener::AssistantConfig { pub async fn assistant_config(&self) -> listener::AssistantConfig {
self.state.assistant.read().await.clone() self.state.assistant.read().await.clone()
@ -1910,13 +1642,7 @@ impl MeshService {
/// Recently-denied `!ai` askers (newest first) so the UI can offer to allow /// Recently-denied `!ai` askers (newest first) so the UI can offer to allow
/// them. Cleared implicitly as new denials rotate older ones out. /// them. Cleared implicitly as new denials rotate older ones out.
pub async fn assistant_denied_askers(&self) -> Vec<listener::DeniedAsker> { pub async fn assistant_denied_askers(&self) -> Vec<listener::DeniedAsker> {
-        self.state
-            .assist_denied
-            .read()
-            .await
-            .iter()
-            .cloned()
-            .collect()
+        self.state.assist_denied.read().await.iter().cloned().collect()
} }
/// Update the mesh-AI assistant settings live (no listener restart) and /// Update the mesh-AI assistant settings live (no listener restart) and
@ -2133,9 +1859,6 @@ mod tests {
hops: 0, hops: 0,
last_advert: 0, last_advert: 0,
reachable, reachable,
pkc_capable: false,
lat: None,
lon: None,
} }
} }

Some files were not shown because too many files have changed in this diff