Recurring checks for production Sovra operators.
- Verify core health endpoints:
  - Web: `GET /api/health` returns `status: "ok"` and no `missing_config` checks
  - Worker: `GET /health`
- Review error monitoring (Sentry) for new high-severity issues.
- Review security signals:
  - Secret scanning alerts
  - Code scanning alerts
  - Dependency alerts
- Confirm background workflows are green (`CI`, `Security`, `Deploy`).
- Run dependency hygiene checks:
  - `pnpm audit --prod`
  - Review Dependabot updates
- Validate DB migration and policy state in staging + production.
- Spot-check tenant isolation in critical read/write paths.
- Review worker logs for auth failures and broadcast errors.
- Rotate and validate shared secrets:
  - `INTERNAL_API_SECRET`
  - `SUPABASE_JWT_SECRET`
- Review release process and rollback readiness.
- Run a recovery drill:
  - Re-deploy from a clean commit
  - Validate health + core flows
- Check docs drift:
  - `README.md`
  - `docs/environment-variables.md`
  - `docs/deployment.md`
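
The health endpoint check at the top of this checklist lends itself to scripting. The following is a minimal Python sketch, not Sovra's actual tooling. It assumes the health response is JSON with a top-level `status` field and an optional `checks` mapping of check name to result string, where the value `"missing_config"` marks an unconfigured dependency. That payload shape, along with the helper names `evaluate_health` and `fetch_health`, is hypothetical; adjust them to whatever the endpoints really return.

```python
import json
import urllib.request


def evaluate_health(payload: dict) -> list[str]:
    """Return a list of problems found in a health payload (empty means healthy)."""
    problems = []
    if payload.get("status") != "ok":
        problems.append(f"status is {payload.get('status')!r}, expected 'ok'")
    # ASSUMPTION: "checks" maps a check name to a result string, and
    # "missing_config" is the value used for an unconfigured dependency.
    checks = payload.get("checks") or {}
    for name, result in sorted(checks.items()):
        if result == "missing_config":
            problems.append(f"check {name!r} reports missing_config")
    return problems


def fetch_health(base_url: str, path: str, timeout: float = 5.0) -> dict:
    """GET base_url + path and decode the JSON body.

    urllib raises HTTPError on non-2xx responses, which is the desired
    behavior for a health probe.
    """
    url = base_url.rstrip("/") + path
    with urllib.request.urlopen(url, timeout=timeout) as response:
        return json.loads(response.read().decode("utf-8"))
```

A run against the web app might look like `evaluate_health(fetch_health(web_base_url, "/api/health"))`, where `web_base_url` is a placeholder for the deployment's URL. Keeping the evaluation separate from the HTTP call makes the rule easy to test without network access.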

| Priority | Typical impact | Target response |
|---|---|---|
| P1 | Full outage, security incident, or cross-tenant data risk | Immediate |
| P2 | Major degradation with business impact | < 4 hours |
| P3 | Partial degradation with workaround | < 1 business day |
| P4 | Low-impact bug or docs issue | Next planned cycle |
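
The priority table can be encoded as a small lookup for use in alerting or triage scripts. This is a hedged Python sketch: the names `TARGET_RESPONSE`, `FIXED_WINDOWS`, and `response_deadline` are illustrative, not part of Sovra. Only P1 and P2 get a computed deadline, because the table does not define a business calendar for P3 or a planning cadence for P4.

```python
from datetime import datetime, timedelta, timezone
from typing import Optional

# Target responses copied from the priority table.
TARGET_RESPONSE = {
    "P1": "Immediate",
    "P2": "< 4 hours",
    "P3": "< 1 business day",
    "P4": "Next planned cycle",
}

# ASSUMPTION: only targets with an unambiguous clock duration are computed.
# "Immediate" is modeled as a zero-length window.
FIXED_WINDOWS = {
    "P1": timedelta(0),
    "P2": timedelta(hours=4),
}


def response_deadline(priority: str, opened_at: datetime) -> Optional[datetime]:
    """Return the latest acceptable first-response time, or None if the
    target depends on a business calendar or planning cycle."""
    if priority not in TARGET_RESPONSE:
        raise ValueError(f"unknown priority: {priority!r}")
    window = FIXED_WINDOWS.get(priority)
    return None if window is None else opened_at + window
```

For example, a P2 incident opened at 09:00 UTC yields a deadline of 13:00 UTC the same day, while a P3 incident returns `None` and is left to the team's own business-day rules.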

- Security incidents: [email protected]
- Production support: [email protected]
- Community issues/discussion: GitHub issues + discussions