6 Commits

Author SHA1 Message Date
Dorian
d71bc2a46c fix: add missing tier field to all AppMetadata, fix build errors
- Add tier: "" to all AppMetadata match arms (was missing from 30+ arms)
- Use std:🧵:available_parallelism() instead of num_cpus crate
- Remove unused num_cpus dependency
- Fix unused variable warning in health_monitor.rs

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-03-14 03:36:44 +00:00
Dorian
8302b0b357 feat: add CPU load alert, lower disk/RAM thresholds (SCALE-04)
- Add CpuLoad alert rule: fires when 5min load > 2x core count
- Lower disk usage alert from 90% to 80%
- Lower RAM usage alert from 90% to 80%
- Add num_cpus dependency for runtime core detection

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-03-14 03:29:29 +00:00
Dorian
a47b6cbcb7 fix: add webhook delivery for monitoring alerts
DiskUsage and ContainerCrash alerts now fire webhooks via
send_webhook() after pushing WebSocket notifications. Added
data_dir parameter to spawn_metrics_collector for webhook config
access.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-03-12 22:32:19 +00:00
Dorian
8fad8d6681 patches on sxsw ai working api key working container hardened plus many more 2026-03-12 22:19:04 +00:00
Dorian
6945bc4daf feat: add alerting system with configurable rules and UI (MON-02, MON-03)
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-03-11 12:28:44 +00:00
Dorian
baeeb72f27 feat: add real-time metrics collection with ring buffer storage (MON-01)
Implements monitoring/collector.rs that collects per-container CPU/RAM/network/disk,
system-wide metrics, RPC latency, and WebSocket connection count every 60 seconds.
Data stored in dual ring buffers: 1-min resolution (24h) and 15-min resolution (7d).
Three new RPC endpoints: monitoring.current, monitoring.history, monitoring.containers.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-03-11 11:11:02 +00:00