archy/docs/hotfix-process.md
Dorian 5495fc5c3e docs: set up post-release monitoring and hotfix process (LAUNCH-02)
Uptime monitor timer running every 5min, 30-day soak test active,
hotfix process documented. 100% uptime so far.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-03-11 18:24:56 +00:00

# Hotfix Process
For critical bugs discovered after the v1.0.0 release.
## Severity Classification
| Level | Response Time | Examples |
|-------|--------------|---------|
| P0 Critical | < 4 hours | Data loss, security vulnerability, node bricked |
| P1 High | < 24 hours | App won't start, auth broken, major UI failure |
| P2 Medium | < 72 hours | Non-critical feature broken, performance regression |
| P3 Low | Next release | Cosmetic, minor UX, edge cases |
## Hotfix Workflow
### 1. Triage
- Reproduce the issue on the dev server (192.168.1.228)
- Classify severity (P0-P3)
- P0/P1: proceed with a hotfix immediately. P2/P3: add to the v1.1 roadmap.
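The severity table and the triage rule can be combined into one lookup. The following is a minimal sketch, not part of the existing tooling: the function name `triage` and its `window|action` output format are hypothetical, chosen only for illustration.

```shell
# Hypothetical helper: map a severity level to its response window and
# next step, as listed in the severity table and triage rule.
# Prints "<window>|<action>"; returns 1 for an unknown level.
triage() {
  case "$1" in
    P0) echo "4h|hotfix now" ;;
    P1) echo "24h|hotfix now" ;;
    P2) echo "72h|add to v1.1 roadmap" ;;
    P3) echo "next release|add to v1.1 roadmap" ;;
    *)  echo "unknown severity: $1" >&2; return 1 ;;
  esac
}
```

For example, `triage P1` prints `24h|hotfix now`, which could be pasted into the issue when it is filed.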
### 2. Fix
- Create branch: `hotfix/v1.0.1-description`
- Fix the issue with the smallest code change that resolves it
- Run the full test suite: `cd neode-ui && npm test && npm run type-check`
- Deploy to the dev server: `./scripts/deploy-to-target.sh --live`
- Verify the fix on the live server
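To keep branch names in the `hotfix/v1.0.1-description` form, the name can be derived from the target version and a short description. This is a sketch under assumptions: `hotfix_branch` is a hypothetical helper, and the lowercase, hyphen-separated slug is one reasonable reading of "description", not a documented rule.

```shell
# Hypothetical helper: build a branch name of the form
# hotfix/v<version>-<slug> from a version and a free-text description.
hotfix_branch() {
  hb_version="$1"; shift
  # Lowercase, collapse every run of non-alphanumerics into one hyphen,
  # then trim leading and trailing hyphens.
  hb_slug=$(printf '%s' "$*" \
    | tr '[:upper:]' '[:lower:]' \
    | tr -cs 'a-z0-9' '-' \
    | sed 's/^-*//; s/-*$//')
  printf 'hotfix/v%s-%s\n' "$hb_version" "$hb_slug"
}

# Usage (not executed here):
#   git checkout -b "$(hotfix_branch 1.0.1 "Fix auth redirect")"
#   (cd neode-ui && npm test && npm run type-check)
#   ./scripts/deploy-to-target.sh --live
```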
### 3. Release
- Merge the hotfix branch into `main`
- Tag the merge commit: `v1.0.1` (increment the patch version)
- Build an ISO if needed: `sudo ./image-recipe/build-auto-installer-iso.sh`
- Update the release manifest for OTA updates
- Copy the ISO to the FileBrowser Builds folder
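Incrementing the patch version by hand is easy to get wrong when several hotfixes ship in a row. A minimal sketch of the bump follows, assuming tags always have the plain `vMAJOR.MINOR.PATCH` form with no pre-release suffix. `next_patch_tag` is a hypothetical helper, and the annotated tag with a message shown in the usage comment is an assumption about how tags are created here.

```shell
# Hypothetical helper: given a tag such as v1.0.0, print the next
# patch-level tag (v1.0.1). Assumes plain vMAJOR.MINOR.PATCH input.
next_patch_tag() {
  npt_ver="${1#v}"              # strip the leading "v"
  npt_major="${npt_ver%%.*}"    # text before the first dot
  npt_rest="${npt_ver#*.}"      # text after the first dot
  npt_minor="${npt_rest%%.*}"
  npt_patch="${npt_rest#*.}"
  printf 'v%s.%s.%s\n' "$npt_major" "$npt_minor" "$((npt_patch + 1))"
}

# Usage (not executed here):
#   git checkout main && git merge --no-ff hotfix/v1.0.1-description
#   git tag -a "$(next_patch_tag v1.0.0)" -m "Hotfix release"
```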
### 4. Communicate
- Update RELEASE-NOTES with the hotfix details
- Add an entry to CHANGELOG.md
## Monitoring Dashboards
- **Uptime monitor**: `/var/lib/archipelago/uptime-monitor/summary.json`
- **Soak test**: `/tmp/stability-test-*.log` on dev server
- **Health endpoint**: `http://192.168.1.228/health`
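The health endpoint can also be polled from a terminal after a deploy. The sketch below assumes the endpoint answers with HTTP 200 when the node is healthy, which this document does not state, so adjust the check if the endpoint behaves differently. `http_status` and `is_healthy` are hypothetical helper names.

```shell
# Print only the HTTP status code for a URL.
# curl reports 000 when the request fails (timeout, refused connection).
http_status() {
  curl -s -o /dev/null -w '%{http_code}' --max-time 5 "$1"
}

# Assumption: a 200 response means the node is healthy.
is_healthy() {
  [ "$1" = "200" ]
}

# Usage (not executed here):
#   if is_healthy "$(http_status http://192.168.1.228/health)"; then
#     echo "healthy"
#   else
#     echo "unhealthy" >&2
#   fi
```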
## Rollback
If a hotfix causes regressions, there are three ways to roll back:
1. Use the OTA system's rollback to the previous version
2. Reflash with the previous ISO (app data is preserved on a separate partition)
3. Restore the backend binary from the backup at `/usr/local/bin/archipelago.bak`
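Restoring the backend binary from its backup could look like the sketch below. It rests on assumptions: `restore_backup` is a hypothetical helper, and the install path `/usr/local/bin/archipelago` is inferred by dropping the `.bak` suffix, not confirmed by this document. Restarting the backend afterwards is not shown, because the service name is not given here.

```shell
# Hypothetical helper: copy a backup file over a target path, preserving
# mode and timestamps. Refuses to run if the backup does not exist.
restore_backup() {
  rb_backup="$1"
  rb_target="$2"
  [ -f "$rb_backup" ] || { echo "no backup at $rb_backup" >&2; return 1; }
  cp -p "$rb_backup" "$rb_target"
}

# Usage (as root, not executed here; the target path is an assumption):
#   restore_backup /usr/local/bin/archipelago.bak /usr/local/bin/archipelago
```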