
feat(container): ContainerOrchestrator trait, RpcHandler uses it in prod

Step 4 of the rust-orchestrator migration. Unifies the container lifecycle surface behind a single trait so the RPC layer stops caring whether it is talking to the dev or prod orchestrator.

* New trait core/archipelago/src/container/traits.rs: ContainerOrchestrator with install / start / stop / restart / remove / upgrade / status / list / logs / health, all keyed by app_id. Every method is async_trait-based.
* ProdContainerOrchestrator: the lifecycle methods are moved from inherent impl into the trait impl (avoids name-shadowing recursion). Adoption and reconcile remain inherent since only main.rs / BootReconciler call them.
* DevContainerOrchestrator: new trait impl that forwards to the existing Dev-named methods, applying the dev container-name + port-offset rules internally. New load_manifest_for() helper resolves app_id to <data_dir>/apps/<app_id>/manifest.yml so trait-level install(app_id) works in dev too. install_container(manifest, path) stays inherent for the manifest-path RPC shape.
* RpcHandler now holds Option<Arc<dyn ContainerOrchestrator>> and, when in dev mode, a separate Option<Arc<DevContainerOrchestrator>> for the manifest_path install RPC. In prod mode RpcHandler::new() constructs a ProdContainerOrchestrator and calls load_manifests() at startup.
* All seven container-* RPC guards no longer say dev mode required. container-install still requires dev mode because its manifest_path argument has no prod meaning; every other container RPC now works in both modes via the trait.

BOOT STILL DOES NOT USE THIS. main.rs wire-up (Step 6) and BootReconciler (Step 5) come next. Until then the prod orchestrator is constructed but nothing populates /opt/archipelago/apps so it has zero manifests to manage, matching the pre-Step-4 behaviour.

Verification:

* cargo build -p archipelago: clean (11 expected unused method warnings for methods not yet wired from main.rs).
* cargo test -p archipelago: all 21 container::* tests pass (16 prod_orchestrator + 5 others). 24 other test failures are pre-existing and unrelated (identity_manager / session / wallet / mesh / credentials, all independently flaky on file-backed state).
2026-04-22 18:56:52 -04:00
//! Orchestrator trait — the shared surface the RPC layer talks to.
//!
//! Step 4 of the rust-orchestrator migration. Unifies the container lifecycle
//! surface of `DevContainerOrchestrator` and `ProdContainerOrchestrator` so
//! `RpcHandler` can hold `Arc<dyn ContainerOrchestrator>` and stop caring
//! which mode it is in.
//!
//! The trait takes `app_id: &str` everywhere (never a manifest path). Dev and
//! Prod both resolve app_id → manifest internally. The legacy
//! `container-install { manifest_path }` RPC shape is preserved as a concrete
//! `install_container_from_path` method on `DevContainerOrchestrator` only,
//! since that ad-hoc workflow is a dev convenience and has no prod meaning.
//!
//! See `docs/rust-orchestrator-migration.md`.

use anyhow::Result;
use archipelago_container::ContainerStatus;
use async_trait::async_trait;

/// Lifecycle + query operations every orchestrator exposes to the RPC layer.
#[async_trait]
pub trait ContainerOrchestrator: Send + Sync {
    /// Build-or-pull the image, create the container, and start it. Returns the
    /// podman container name that was created. Assumes the app_id corresponds
    /// to a manifest the orchestrator already knows about.
    async fn install(&self, app_id: &str) -> Result<String>;

    /// Start an already-created container.
    async fn start(&self, app_id: &str) -> Result<()>;

    /// Stop a running container. No-op on Prod if already stopped.
    async fn stop(&self, app_id: &str) -> Result<()>;

    /// Stop-then-start. Best-effort: ignores stop failure.
    async fn restart(&self, app_id: &str) -> Result<()>;

    /// Remove the container. `preserve_data = true` keeps the volumes; `false`
    /// is honored on a best-effort basis (Dev cleans, Prod leaves the volume
    /// management to the data layer).
    async fn remove(&self, app_id: &str, preserve_data: bool) -> Result<()>;

    /// Pull/rebuild the image and recreate the container from scratch.
    async fn upgrade(&self, app_id: &str) -> Result<()>;

    /// Current state of a single container.
    async fn status(&self, app_id: &str) -> Result<ContainerStatus>;

    /// All containers this orchestrator knows about.
    async fn list(&self) -> Result<Vec<ContainerStatus>>;

    /// Tail the container's stdout+stderr.
    async fn logs(&self, app_id: &str, lines: u32) -> Result<Vec<String>>;

    /// Coarse health summary: "healthy", "unhealthy", "starting", "paused", "unknown".
    async fn health(&self, app_id: &str) -> Result<String>;
}
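
// Illustrative sketch only, not part of this module. It shows the two ideas the
// trait docs describe: a caller holding the orchestrator as a trait object, and
// `restart` as a best-effort stop-then-start that ignores a failed stop.
// Assumptions: `Lifecycle`, `FakeOrchestrator`, `as_shared` and the app id
// "demo-app" are hypothetical names invented for this example. To stay
// std-only, the sketch is synchronous and uses `Result<_, String>` rather than
// `async_trait` and `anyhow::Result`; the real Dev/Prod impls may differ.

```rust
use std::collections::HashMap;
use std::sync::{Arc, Mutex};

/// Hypothetical synchronous stand-in for the lifecycle part of the trait.
pub trait Lifecycle: Send + Sync {
    fn start(&self, app_id: &str) -> Result<(), String>;
    fn stop(&self, app_id: &str) -> Result<(), String>;

    /// Stop-then-start. A failed stop is ignored; only the result of
    /// `start` is reported to the caller.
    fn restart(&self, app_id: &str) -> Result<(), String> {
        let _ = self.stop(app_id);
        self.start(app_id)
    }
}

/// In-memory fake that maps app_id to a "running" flag.
pub struct FakeOrchestrator {
    running: Mutex<HashMap<String, bool>>,
}

impl FakeOrchestrator {
    /// Register the given app ids, all initially stopped.
    pub fn with_apps(app_ids: &[&str]) -> Self {
        let running = app_ids.iter().map(|id| (id.to_string(), false)).collect();
        Self { running: Mutex::new(running) }
    }

    /// `None` if the app id is unknown, otherwise its running flag.
    pub fn is_running(&self, app_id: &str) -> Option<bool> {
        self.running.lock().unwrap().get(app_id).copied()
    }
}

impl Lifecycle for FakeOrchestrator {
    fn start(&self, app_id: &str) -> Result<(), String> {
        let mut map = self.running.lock().unwrap();
        match map.get_mut(app_id) {
            Some(flag) => {
                *flag = true;
                Ok(())
            }
            None => Err(format!("unknown app_id: {}", app_id)),
        }
    }

    fn stop(&self, app_id: &str) -> Result<(), String> {
        let mut map = self.running.lock().unwrap();
        match map.get_mut(app_id) {
            Some(flag) => {
                // This fake treats stopping a stopped app as an error so the
                // "ignore stop failure" path in `restart` is exercised.
                if !*flag {
                    return Err(format!("{} is not running", app_id));
                }
                *flag = false;
                Ok(())
            }
            None => Err(format!("unknown app_id: {}", app_id)),
        }
    }
}

/// Erase the concrete type, the way a handler would hold an orchestrator.
pub fn as_shared(orchestrator: Arc<FakeOrchestrator>) -> Arc<dyn Lifecycle> {
    orchestrator
}
```

// With this fake, `restart` on a stopped but known app succeeds even though the
// inner `stop` fails, while `restart` on an unknown app id surfaces the error
// from `start`.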