archipelago 40a6eaca72 feat(container): ContainerOrchestrator trait, RpcHandler uses it in prod
Step 4 of the rust-orchestrator migration. Unifies the container lifecycle
surface behind a single trait so the RPC layer stops caring whether it is
talking to the dev or prod orchestrator.

  * New trait core/archipelago/src/container/traits.rs: ContainerOrchestrator
    with install / start / stop / restart / remove / upgrade / status / list /
    logs / health, all keyed by app_id. Every method is async_trait-based.

  * ProdContainerOrchestrator: the lifecycle methods are moved from inherent
    impl into the trait impl (avoids name-shadowing recursion). Adoption and
    reconcile remain inherent since only main.rs / BootReconciler call them.

  * DevContainerOrchestrator: new trait impl that forwards to the existing
    Dev-named methods, applying the dev container-name + port-offset rules
    internally. New load_manifest_for() helper resolves app_id to
    <data_dir>/apps/<app_id>/manifest.yml so trait-level install(app_id)
    works in dev too. install_container(manifest, path) stays inherent for
    the manifest-path RPC shape.

  * RpcHandler now holds Option<Arc<dyn ContainerOrchestrator>> and, when in
    dev mode, a separate Option<Arc<DevContainerOrchestrator>> for the
    manifest_path install RPC. In prod mode RpcHandler::new() constructs a
    ProdContainerOrchestrator and calls load_manifests() at startup.

  * The seven container-* RPC guards no longer fail with a blanket "dev mode
    required" error. container-install still requires dev mode because its
    manifest_path argument has no prod meaning; every other container RPC now
    works in both modes via the trait.
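
A minimal sketch of the handler shape described in the bullets above, under
loud assumptions: the trait here is a synchronous, one-method stand-in (the
real one is async via async_trait and has ten methods), and Orchestrator,
ProdStub, DevStub, Handler and the "dev-" name prefix are all hypothetical
names invented for illustration. It is only meant to show why one trait object
plus one concrete dev handle is enough.

```rust
use std::sync::Arc;

/// Synchronous one-method stand-in for the real async trait (assumption:
/// simplified so the sketch needs no external crates).
trait Orchestrator: Send + Sync {
    fn start(&self, app_id: &str) -> Result<String, String>;
}

/// Hypothetical prod stub.
struct ProdStub;
/// Hypothetical dev stub; the "dev-" prefix is a made-up naming rule.
struct DevStub;

impl Orchestrator for ProdStub {
    fn start(&self, app_id: &str) -> Result<String, String> {
        Ok(format!("started {app_id}"))
    }
}

impl Orchestrator for DevStub {
    fn start(&self, app_id: &str) -> Result<String, String> {
        Ok(format!("started dev-{app_id}"))
    }
}

/// One trait object for the shared lifecycle surface, plus a concrete dev
/// handle that exists only in dev mode for the manifest-path install.
struct Handler {
    orchestrator: Option<Arc<dyn Orchestrator>>,
    dev: Option<Arc<DevStub>>,
}

impl Handler {
    fn new(dev_mode: bool) -> Self {
        if dev_mode {
            let dev = Arc::new(DevStub);
            // The same Arc is shared: once as a trait object, once concretely.
            let shared: Arc<dyn Orchestrator> = dev.clone();
            Handler { orchestrator: Some(shared), dev: Some(dev) }
        } else {
            let shared: Arc<dyn Orchestrator> = Arc::new(ProdStub);
            Handler { orchestrator: Some(shared), dev: None }
        }
    }

    /// Works in both modes: the handler never inspects which impl it holds.
    fn container_start(&self, app_id: &str) -> Result<String, String> {
        let orch = self.orchestrator.as_ref().ok_or("no orchestrator configured")?;
        orch.start(app_id)
    }

    /// Still dev-only: a manifest path has no prod meaning.
    fn container_install(&self, manifest_path: &str) -> Result<String, String> {
        let _dev = self.dev.as_ref().ok_or("dev mode required")?;
        Ok(format!("installed from {manifest_path}"))
    }
}

fn main() {
    let prod = Handler::new(false);
    let dev = Handler::new(true);
    println!("{:?}", prod.container_start("demo"));
    println!("{:?}", dev.container_start("demo"));
    println!("{:?}", prod.container_install("/tmp/manifest.yml"));
    println!("{:?}", dev.container_install("/tmp/manifest.yml"));
}
```

In this sketch the only mode check left in the handler is the presence of the
concrete dev handle; everything else goes through the trait object.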

BOOT STILL DOES NOT USE THIS. main.rs wire-up (Step 6) and BootReconciler
(Step 5) come next. Until then the prod orchestrator is constructed but nothing
populates /opt/archipelago/apps so it has zero manifests to manage, matching
the pre-Step-4 behaviour.
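
To illustrate why an unpopulated apps directory means zero manifests, here is
a rough sketch of manifest discovery. It assumes the
<data_dir>/apps/<app_id>/manifest.yml layout mentioned above for dev also
applies to the directory being scanned, and manifest_path_for and
discover_app_ids are hypothetical helpers, not the actual load_manifests()
implementation.

```rust
use std::fs;
use std::path::{Path, PathBuf};

/// Resolve an app_id to <data_dir>/apps/<app_id>/manifest.yml.
fn manifest_path_for(data_dir: &Path, app_id: &str) -> PathBuf {
    data_dir.join("apps").join(app_id).join("manifest.yml")
}

/// Hypothetical discovery: list app_ids that have a manifest on disk.
/// A missing or unreadable apps directory yields an empty list, not an error.
fn discover_app_ids(data_dir: &Path) -> Vec<String> {
    let entries = match fs::read_dir(data_dir.join("apps")) {
        Ok(entries) => entries,
        Err(_) => return Vec::new(),
    };
    let mut ids: Vec<String> = entries
        .filter_map(|entry| entry.ok())
        .filter_map(|entry| entry.file_name().into_string().ok())
        .filter(|id| manifest_path_for(data_dir, id).is_file())
        .collect();
    ids.sort();
    ids
}

fn main() {
    // Scratch directory under the system temp dir; nothing is created yet.
    let root = std::env::temp_dir().join(format!("orch-sketch-{}", std::process::id()));
    println!("before: {:?}", discover_app_ids(&root));

    // Populate one app, then scan again.
    fs::create_dir_all(root.join("apps").join("demo")).unwrap();
    fs::write(manifest_path_for(&root, "demo"), "id: demo\n").unwrap();
    println!("after: {:?}", discover_app_ids(&root));

    fs::remove_dir_all(&root).unwrap();
}
```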

Verification: cargo build -p archipelago is clean apart from 11 expected
unused-method warnings (methods not yet wired from main.rs). cargo test -p
archipelago: all 21 container::* tests pass (16 prod_orchestrator + 5 others).
The 24 other test failures are pre-existing and unrelated (identity_manager /
session / wallet / mesh / credentials, all independently flaky on file-backed
state).
2026-04-22 18:56:52 -04:00


//! Orchestrator trait — the shared surface the RPC layer talks to.
//!
//! Step 4 of the rust-orchestrator migration. Unifies the container lifecycle
//! surface of `DevContainerOrchestrator` and `ProdContainerOrchestrator` so
//! `RpcHandler` can hold `Arc<dyn ContainerOrchestrator>` and stop caring
//! which mode it is in.
//!
//! The trait takes `app_id: &str` everywhere (never a manifest path). Dev and
//! Prod both resolve app_id → manifest internally. The legacy
//! `container-install { manifest_path }` RPC shape is preserved as a concrete
//! `install_container_from_path` method on `DevContainerOrchestrator` only,
//! since that ad-hoc workflow is a dev convenience and has no prod meaning.
//!
//! See `docs/rust-orchestrator-migration.md`.

use anyhow::Result;
use archipelago_container::ContainerStatus;
use async_trait::async_trait;

/// Lifecycle + query operations every orchestrator exposes to the RPC layer.
#[async_trait]
pub trait ContainerOrchestrator: Send + Sync {
    /// Build-or-pull the image, create the container, and start it. Returns the
    /// podman container name that was created. Assumes the app_id corresponds
    /// to a manifest the orchestrator already knows about.
    async fn install(&self, app_id: &str) -> Result<String>;

    /// Start an already-created container.
    async fn start(&self, app_id: &str) -> Result<()>;

    /// Stop a running container. No-op on Prod if already stopped.
    async fn stop(&self, app_id: &str) -> Result<()>;

    /// Stop-then-start. Best-effort: ignores stop failure.
    async fn restart(&self, app_id: &str) -> Result<()>;

    /// Remove the container. `preserve_data = true` keeps the volumes; `false`
    /// is honored on a best-effort basis (Dev cleans, Prod leaves the volume
    /// management to the data layer).
    async fn remove(&self, app_id: &str, preserve_data: bool) -> Result<()>;

    /// Pull/rebuild the image and recreate the container from scratch.
    async fn upgrade(&self, app_id: &str) -> Result<()>;

    /// Current state of a single container.
    async fn status(&self, app_id: &str) -> Result<ContainerStatus>;

    /// All containers this orchestrator knows about.
    async fn list(&self) -> Result<Vec<ContainerStatus>>;

    /// Tail the container's stdout+stderr.
    async fn logs(&self, app_id: &str, lines: u32) -> Result<Vec<String>>;

    /// Coarse health summary: "healthy", "unhealthy", "starting", "paused", "unknown".
    async fn health(&self, app_id: &str) -> Result<String>;
}
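
The `restart` contract documented in the trait ("Stop-then-start. Best-effort:
ignores stop failure") can be sketched as follows. This is an illustration
only: `Lifecycle` is a synchronous stand-in for the async trait, the default
method is just one possible way to get the behaviour (the trait in this file
declares `restart` without a default body), and `FlakyStop` is a hypothetical
fake whose `stop` always fails.

```rust
use std::cell::RefCell;

/// Synchronous stand-in for the start/stop/restart part of the trait
/// (assumption: simplified so the sketch needs no external crates).
trait Lifecycle {
    fn start(&self, app_id: &str) -> Result<(), String>;
    fn stop(&self, app_id: &str) -> Result<(), String>;

    /// Stop-then-start; a stop failure is deliberately discarded so that an
    /// already-stopped container can still be restarted.
    fn restart(&self, app_id: &str) -> Result<(), String> {
        let _ = self.stop(app_id);
        self.start(app_id)
    }
}

/// Hypothetical fake that records calls and always fails on `stop`.
struct FlakyStop {
    calls: RefCell<Vec<String>>,
}

impl Lifecycle for FlakyStop {
    fn start(&self, app_id: &str) -> Result<(), String> {
        self.calls.borrow_mut().push(format!("start {app_id}"));
        Ok(())
    }

    fn stop(&self, app_id: &str) -> Result<(), String> {
        self.calls.borrow_mut().push(format!("stop {app_id}"));
        Err("container not running".to_string())
    }
}

/// Run one restart against the fake and return the result plus the call log.
fn restart_trace(app_id: &str) -> (Result<(), String>, Vec<String>) {
    let fake = FlakyStop { calls: RefCell::new(Vec::new()) };
    let result = fake.restart(app_id);
    (result, fake.calls.into_inner())
}

fn main() {
    let (result, calls) = restart_trace("demo");
    println!("{result:?} {calls:?}");
}
```

The call log shows both `stop` and `start` being attempted, with the overall
result reflecting only the `start` outcome.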