# Hotfix Process
This process covers critical bugs discovered after the v1.0.0 release.
## Severity Classification
| Level | Response Time | Examples |
|-------|--------------|---------|
| P0 — Critical | < 4 hours | Data loss, security vulnerability, node bricked |
| P1 — High | < 24 hours | App won't start, auth broken, major UI failure |
| P2 — Medium | < 72 hours | Non-critical feature broken, performance regression |
| P3 — Low | Next release | Cosmetic, minor UX, edge cases |
## Hotfix Workflow
### 1. Triage
- Reproduce the issue on the dev server (192.168.1.228)
- Classify severity (P0-P3)
- P0/P1: proceed immediately. P2/P3: add to v1.1 roadmap.
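
The routing rule above can be sketched as a small shell function. The name `triage_action` and its output strings are hypothetical and exist only for illustration; nothing like it is known to be in the repository.

```shell
# Hypothetical helper: map a severity level to the triage outcome listed above.
triage_action() {
  case "$1" in
    P0|P1) echo "hotfix" ;;    # proceed immediately with the hotfix workflow
    P2|P3) echo "roadmap" ;;   # add to the v1.1 roadmap
    *) echo "unknown severity: $1" >&2; return 1 ;;
  esac
}
```

For example, `triage_action P1` prints `hotfix`, while an unrecognised level prints an error and returns a non-zero status.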
### 2. Fix
- Create branch: `hotfix/v1.0.1-description`
- Fix the issue with minimal code changes
- Run full test suite: `cd neode-ui && npm test && npm run type-check`
- Deploy to dev server: `./scripts/deploy-to-target.sh --live`
- Verify the fix on the server it was deployed to
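
A minimal sketch of the Fix steps, assuming the hotfix branches off `main` and that the commands run from the repository root. The function names `hotfix_branch`, `start_hotfix`, and `check_and_deploy` are hypothetical; only the `npm` and deploy commands come from the steps above.

```shell
# Hypothetical helper: build a branch name of the form hotfix/<version>-<description>.
hotfix_branch() {
  version="$1"
  # Lowercase the description and turn each run of other characters into one hyphen.
  slug=$(printf '%s' "$2" | tr '[:upper:]' '[:lower:]' | tr -cs 'a-z0-9' '-' | sed 's/^-//; s/-$//')
  printf 'hotfix/%s-%s\n' "$version" "$slug"
}

# Create the hotfix branch (assumption: it is based on main).
start_hotfix() {
  git checkout -b "$(hotfix_branch "$1" "$2")" main
}

# After committing the fix: run the full test suite, then deploy.
check_and_deploy() {
  (cd neode-ui && npm test && npm run type-check) || return 1
  ./scripts/deploy-to-target.sh --live
}
```

For example, `hotfix_branch v1.0.1 "Auth broken on login"` prints `hotfix/v1.0.1-auth-broken-on-login`. Running the tests in a subshell keeps the caller's working directory unchanged for the deploy step.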
### 3. Release
- Merge hotfix branch to `main`
- Tag: `v1.0.1` (increment patch version)
- Update release manifest for OTA updates (`releases/manifest.json`)
- Push to both Gitea mirrors so nodes can pull via `self-update.sh`
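
The release steps could look roughly like the sketch below. The remote names `mirror-a` and `mirror-b` are placeholders for the two Gitea mirrors, and `next_patch` and `release_hotfix` are hypothetical helpers. The manifest edit is left as a manual step because the schema of `releases/manifest.json` is not described here.

```shell
# Hypothetical helper: increment the patch component of a vMAJOR.MINOR.PATCH tag.
next_patch() {
  v="${1#v}"            # drop the leading "v"
  major="${v%%.*}"
  rest="${v#*.}"
  minor="${rest%%.*}"
  patch="${rest#*.}"
  printf 'v%s.%s.%s\n' "$major" "$minor" "$((patch + 1))"
}

# Merge, tag, and push. Defined only; call it as: release_hotfix <branch> <tag>
release_hotfix() {
  branch="$1"
  tag="$2"
  git checkout main || return 1
  git merge --no-ff "$branch" || return 1
  git tag -a "$tag" -m "Hotfix $tag" || return 1
  # Update releases/manifest.json by hand at this point (schema not shown here).
  for remote in mirror-a mirror-b; do   # placeholder remote names
    git push "$remote" main "$tag" || return 1
  done
}
```

For example, `next_patch v1.0.0` prints `v1.0.1`. The helper assumes plain numeric components without leading zeros or pre-release suffixes.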
### 4. Communicate
- Update RELEASE-NOTES with hotfix details
- Note in CHANGELOG.md
## Monitoring Dashboards
- **Uptime monitor**: `/var/lib/archipelago/uptime-monitor/summary.json`
- **Soak test**: `/tmp/stability-test-*.log` on dev server
- **Health endpoint**: `http://192.168.1.228/health`
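
A quick check of the health endpoint might look like this. It assumes that a healthy node answers with HTTP 200, which is not stated above, and `check_health` is a hypothetical name.

```shell
# Hypothetical helper: succeed only when the health endpoint returns HTTP 200.
# Assumption: a 200 status means healthy; the response body is ignored.
check_health() {
  url="${1:-http://192.168.1.228/health}"
  # -s silences progress output, -o discards the body, -w prints only the status code.
  code=$(curl -s -o /dev/null -w '%{http_code}' --max-time 5 "$url")
  [ "$code" = "200" ]
}
```

It can be used as `check_health && echo "healthy" || echo "unhealthy"`. If the server is unreachable, `curl` reports status `000`, so the check fails.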
## Rollback
If a hotfix causes regressions:
1. Roll the node back to the previous version through the OTA system: `scripts/self-update.sh --rollback`
2. Point `releases/manifest.json` back at the last-known-good version and push to mirrors
3. If the backend binary itself needs restoring, a backup is kept at `/usr/local/bin/archipelago.bak`
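
Restoring the backend binary from its backup could be sketched as follows. The helper `restore_binary` is hypothetical, and the path of the live binary is not given above, so it is a required argument rather than a default. Stopping and restarting the service is out of scope for this sketch.

```shell
# Hypothetical helper: copy the backup binary over the live one.
# Usage: restore_binary <live-binary-path> [backup-path]
restore_binary() {
  target="$1"
  backup="${2:-/usr/local/bin/archipelago.bak}"
  [ -n "$target" ] || { echo "usage: restore_binary <target> [backup]" >&2; return 2; }
  [ -f "$backup" ] || { echo "no backup found at $backup" >&2; return 1; }
  cp -p "$backup" "$target"   # -p preserves mode and timestamps
}
```

The function refuses to run without a target and fails early when the backup file is missing, so a mistyped path cannot silently overwrite anything.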