docs/MAINTENANCE_MODE.md

# Maintenance Mode & Update Management

## Overview

AtlasOS provides a maintenance mode that lets administrators safely disable user operations during system updates or maintenance. While maintenance mode is enabled, all mutating operations (create, update, delete) are blocked, except for users who are explicitly allowed.

## Features

- **Maintenance Mode**: Disable user operations during maintenance
- **Automatic Backup**: Optionally create a backup before entering maintenance
- **Allowed Users**: Specify users who can operate during maintenance
- **Health Check Integration**: Maintenance status is included in health checks
- **Audit Logging**: All maintenance mode changes are logged

## API Endpoints

### Get Maintenance Status

**GET** `/api/v1/maintenance`

Returns the current maintenance mode status.

**Response:**
```json
{
  "enabled": false,
  "enabled_at": "2024-12-20T10:30:00Z",
  "enabled_by": "admin",
  "reason": "System update",
  "allowed_users": ["admin"],
  "last_backup_id": "backup-1703123456"
}
```
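
A client may want to decode this response into a typed value before acting on it. The following is a minimal sketch in Python, not part of AtlasOS: the `MaintenanceStatus` class and `parse_status` helper are hypothetical names, and the sketch assumes that every field other than `enabled` may be missing from the response.

```python
import json
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class MaintenanceStatus:
    """Hypothetical client-side view of the GET /api/v1/maintenance response."""

    enabled: bool
    enabled_at: Optional[str] = None
    enabled_by: Optional[str] = None
    reason: Optional[str] = None
    allowed_users: List[str] = field(default_factory=list)
    last_backup_id: Optional[str] = None


def parse_status(body: str) -> MaintenanceStatus:
    """Decode a JSON response body; missing optional fields become None or []."""
    data = json.loads(body)
    return MaintenanceStatus(
        enabled=bool(data["enabled"]),
        enabled_at=data.get("enabled_at"),
        enabled_by=data.get("enabled_by"),
        reason=data.get("reason"),
        allowed_users=list(data.get("allowed_users") or []),
        last_backup_id=data.get("last_backup_id"),
    )
```

Keeping the timestamp as a string avoids guessing at the exact date format the server emits.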

### Enable Maintenance Mode

**POST** `/api/v1/maintenance`

Enables maintenance mode. Requires the administrator role.

**Request Body:**
```json
{
  "reason": "System update to v1.1.0",
  "allowed_users": ["admin"],
  "create_backup": true
}
```

**Fields:**
- `reason` (string, required): Reason for entering maintenance mode
- `allowed_users` (array of strings, optional): User IDs allowed to operate during maintenance
- `create_backup` (boolean, optional): Create an automatic backup before entering maintenance

**Response:**
```json
{
  "message": "maintenance mode enabled",
  "status": {
    "enabled": true,
    "enabled_at": "2024-12-20T10:30:00Z",
    "enabled_by": "admin",
    "reason": "System update to v1.1.0",
    "allowed_users": ["admin"],
    "last_backup_id": "backup-1703123456"
  },
  "backup_id": "backup-1703123456"
}
```
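
The request body can be assembled programmatically so that the required `reason` is checked before the call is made. This is a sketch only: `build_enable_request` is a hypothetical helper, and dropping optional fields that were not supplied is an assumption about what the server accepts, based on the field list above.

```python
import json
from typing import Iterable, Optional


def build_enable_request(
    reason: str,
    allowed_users: Optional[Iterable[str]] = None,
    create_backup: Optional[bool] = None,
) -> str:
    """Return a JSON body for POST /api/v1/maintenance.

    `reason` is required; optional fields are left out when not supplied.
    """
    if not reason or not reason.strip():
        raise ValueError("reason is required")
    payload = {"reason": reason}
    if allowed_users is not None:
        payload["allowed_users"] = list(allowed_users)
    if create_backup is not None:
        payload["create_backup"] = bool(create_backup)
    return json.dumps(payload)
```

Calling `build_enable_request("System update to v1.1.0", ["admin"], True)` yields a body equivalent to the request example above.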

### Disable Maintenance Mode

**POST** `/api/v1/maintenance/disable`

Disables maintenance mode. Requires administrator role.

**Response:**
```json
{
  "message": "maintenance mode disabled"
}
```

## Usage Examples

### Enable Maintenance Mode with Backup

```bash
curl -X POST http://localhost:8080/api/v1/maintenance \
  -H "Authorization: Bearer $TOKEN" \
  -H "Content-Type: application/json" \
  -d '{
    "reason": "System update to v1.1.0",
    "allowed_users": ["admin"],
    "create_backup": true
  }'
```

### Check Maintenance Status

```bash
curl http://localhost:8080/api/v1/maintenance \
  -H "Authorization: Bearer $TOKEN"
```

### Disable Maintenance Mode

```bash
curl -X POST http://localhost:8080/api/v1/maintenance/disable \
  -H "Authorization: Bearer $TOKEN"
```

## Behavior

### When Maintenance Mode is Enabled

1. **Read Operations**: All GET requests continue to work normally
2. **Mutating Operations**: All POST, PUT, PATCH, DELETE requests are blocked
3. **Allowed Users**: Users in the `allowed_users` list can still perform mutating operations
4. **Public Endpoints**: Public endpoints (login, health checks) continue to work
5. **Error Response**: Blocked operations return `503 Service Unavailable` with the following body:
   ```json
   {
     "code": "SERVICE_UNAVAILABLE",
     "message": "system is in maintenance mode",
     "details": "the system is currently in maintenance mode and user operations are disabled"
   }
   ```
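
The five rules above can be read as a single decision function. The sketch below is illustrative and not the AtlasOS middleware: `should_block` and `PUBLIC_PATHS` are hypothetical names, `/health` is the only public path taken from this document, and the login path is a placeholder.

```python
from typing import Iterable, Optional

MUTATING_METHODS = {"POST", "PUT", "PATCH", "DELETE"}

# Assumed public paths: "/health" appears in this document,
# the login path is a placeholder for illustration only.
PUBLIC_PATHS = {"/health", "/api/v1/auth/login"}


def should_block(
    method: str,
    path: str,
    user: Optional[str],
    maintenance_enabled: bool,
    allowed_users: Iterable[str],
) -> bool:
    """Return True when the request should receive 503 Service Unavailable."""
    if not maintenance_enabled:
        return False
    if path in PUBLIC_PATHS:
        return False  # public endpoints keep working
    if method.upper() not in MUTATING_METHODS:
        return False  # read operations are unaffected
    if user is not None and user in set(allowed_users):
        return False  # explicitly allowed users may still mutate
    return True
```

Checking the cheap conditions first (mode off, public path, read method) means the allow list is only consulted for requests that would otherwise be rejected.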

### Middleware Order

The maintenance mode middleware is applied late in the chain, next to authentication and before the routes:

1. CORS
2. Compression
3. Security headers
4. Request size limit
5. Content-Type validation
6. Rate limiting
7. Caching
8. Error recovery
9. Request ID
10. Logging
11. Audit
12. **Maintenance mode** ← Blocks operations
13. Authentication
14. Routes

## Health Check Integration

The health check endpoint (`/health`) includes maintenance mode status:

```json
{
  "status": "maintenance",
  "timestamp": "2024-12-20T10:30:00Z",
  "checks": {
    "zfs": "healthy",
    "database": "healthy",
    "smb": "healthy",
    "nfs": "healthy",
    "iscsi": "healthy",
    "maintenance": "enabled"
  }
}
```

When maintenance mode is enabled:
- The top-level `status` may change from "healthy" to "maintenance"
- `checks.maintenance` will be "enabled"
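
A monitoring script can read these fields as sketched below. Both helper names are hypothetical. Because the top-level `status` only *may* change, the sketch relies on `checks.maintenance` to detect maintenance mode, and it assumes that any component value other than "healthy" indicates a problem.

```python
import json
from typing import List


def maintenance_from_health(body: str) -> bool:
    """Return True if a /health response reports maintenance mode as enabled."""
    data = json.loads(body)
    checks = data.get("checks") or {}
    return checks.get("maintenance") == "enabled"


def unhealthy_checks(body: str) -> List[str]:
    """List component checks whose value is not "healthy" (an assumed convention)."""
    data = json.loads(body)
    checks = data.get("checks") or {}
    return [
        name
        for name, value in checks.items()
        if name != "maintenance" and value != "healthy"
    ]
```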

## Automatic Backup

When `create_backup: true` is specified:

1. A backup is created automatically before entering maintenance
2. The backup ID is stored in maintenance status
3. The backup includes:
   - All user accounts
   - All SMB shares
   - All NFS exports
   - All iSCSI targets
   - All snapshot policies
   - System configuration

## Best Practices

### Before System Updates

1. **Create Backup**: Always set `create_backup: true`
2. **Notify Users**: Inform users about the maintenance window
3. **Allow Administrators**: Include admin users in `allowed_users`
4. **Document Reason**: Provide a clear reason for the maintenance

### During Maintenance

1. **Monitor Status**: Check `/api/v1/maintenance` periodically
2. **Verify Backup**: Confirm the backup was created successfully
3. **Perform Updates**: Execute system updates or maintenance tasks
4. **Test Operations**: Verify system functionality

### After Maintenance

1. **Disable Maintenance**: Use `/api/v1/maintenance/disable`
2. **Verify Services**: Check that all services are running
3. **Test Operations**: Verify that normal operations work
4. **Review Logs**: Check audit logs for any issues

## Security Considerations

1. **Administrator Only**: Only administrators can enable/disable maintenance mode
2. **Audit Logging**: All maintenance mode changes are logged
3. **Allowed Users**: Only specified users can operate during maintenance
4. **Token Validation**: Maintenance mode respects authentication

## Error Handling

### Maintenance Mode Already Enabled

```json
{
  "code": "INTERNAL_ERROR",
  "message": "failed to enable maintenance mode",
  "details": "maintenance mode is already enabled"
}
```

### Maintenance Mode Not Enabled

```json
{
  "code": "INTERNAL_ERROR",
  "message": "failed to disable maintenance mode",
  "details": "maintenance mode is not enabled"
}
```

### Backup Creation Failure

If backup creation fails, maintenance mode is not enabled:

```json
{
  "code": "INTERNAL_ERROR",
  "message": "failed to create backup",
  "details": "error details..."
}
```
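
The three error cases correspond to a small set of state transitions. The sketch below models them in memory to make the ordering explicit, in particular that the backup runs before the flag is set. It is not the AtlasOS implementation: `MaintenanceState`, `MaintenanceError`, and the `backup_fn` callback are hypothetical names introduced for illustration.

```python
from typing import Callable, List, Optional, Sequence


class MaintenanceError(Exception):
    """Raised when a transition is rejected; the text mirrors the `details` field."""


class MaintenanceState:
    """In-memory model of the enable/disable transitions (illustrative only)."""

    def __init__(self) -> None:
        self.enabled = False
        self.reason: Optional[str] = None
        self.allowed_users: List[str] = []
        self.last_backup_id: Optional[str] = None

    def enable(
        self,
        reason: str,
        allowed_users: Sequence[str] = (),
        create_backup: bool = False,
        backup_fn: Optional[Callable[[], str]] = None,
    ) -> Optional[str]:
        if self.enabled:
            raise MaintenanceError("maintenance mode is already enabled")
        backup_id = None
        if create_backup:
            if backup_fn is None:
                raise ValueError("create_backup requires a backup_fn")
            # Run the backup first: if it raises, the state below is never touched.
            backup_id = backup_fn()
            self.last_backup_id = backup_id
        self.enabled = True
        self.reason = reason
        self.allowed_users = list(allowed_users)
        return backup_id

    def disable(self) -> None:
        if not self.enabled:
            raise MaintenanceError("maintenance mode is not enabled")
        self.enabled = False
```

Letting the backup exception propagate before any field is modified is what guarantees that a failed backup leaves the system out of maintenance mode.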

## Integration with Update Process

### Recommended Update Workflow

1. **Enable Maintenance Mode**:
   ```bash
   curl -X POST http://localhost:8080/api/v1/maintenance \
     -H "Authorization: Bearer $TOKEN" \
     -H "Content-Type: application/json" \
     -d '{
       "reason": "Updating to v1.1.0",
       "allowed_users": ["admin"],
       "create_backup": true
     }'
   ```

2. **Verify Backup** (using the `backup_id` returned in step 1):
   ```bash
   curl "http://localhost:8080/api/v1/backups/$BACKUP_ID" \
     -H "Authorization: Bearer $TOKEN"
   ```

3. **Perform System Update**:
   - Stop services if needed
   - Update binaries/configurations
   - Restart services

4. **Verify System Health**:
   ```bash
   curl http://localhost:8080/health
   ```

5. **Disable Maintenance Mode**:
   ```bash
   curl -X POST http://localhost:8080/api/v1/maintenance/disable \
     -H "Authorization: Bearer $TOKEN"
   ```

6. **Test Operations**:
   - Verify normal operations work
   - Check service status
   - Review logs
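
Steps 1 through 5 of this workflow can be scripted. The sketch below takes the HTTP transport as a caller-supplied function so that it stays independent of any particular client library. `run_update`, `ApiCall`, and `perform_update` are hypothetical names, and raising on an unhealthy check while leaving maintenance mode on is an assumption, not documented AtlasOS behavior.

```python
from typing import Any, Callable, Dict, Optional

# (method, path, json_body) -> decoded JSON response; supplied by the caller.
ApiCall = Callable[[str, str, Optional[Dict[str, Any]]], Dict[str, Any]]


def run_update(call: ApiCall, perform_update: Callable[[], None], reason: str) -> str:
    """Run workflow steps 1-5 in order and return the backup ID."""
    # 1. Enable maintenance mode and request a backup.
    enabled = call("POST", "/api/v1/maintenance", {
        "reason": reason,
        "allowed_users": ["admin"],
        "create_backup": True,
    })
    backup_id = enabled["backup_id"]

    # 2. Verify the backup exists before changing anything.
    call("GET", f"/api/v1/backups/{backup_id}", None)

    # 3. Perform the update itself (caller-supplied).
    perform_update()

    # 4. Check component health while still in maintenance mode.
    health = call("GET", "/health", None)
    failing = [
        name
        for name, value in health.get("checks", {}).items()
        if name != "maintenance" and value != "healthy"
    ]
    if failing:
        # Assumption: stay in maintenance mode so an operator can investigate.
        raise RuntimeError(f"unhealthy after update: {failing}")

    # 5. Disable maintenance mode.
    call("POST", "/api/v1/maintenance/disable", None)
    return backup_id
```

Step 6 (testing normal operations) is left to the operator, since it depends on which services were updated.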

## Limitations

1. **No Automatic Disable**: Maintenance mode must be manually disabled
2. **No Scheduled Maintenance**: Maintenance mode must be enabled manually
3. **No Maintenance History**: Only current status is available
4. **No Notifications**: No automatic notifications to users

## Future Enhancements

1. **Scheduled Maintenance**: Schedule maintenance windows
2. **Maintenance History**: Track maintenance mode history
3. **User Notifications**: Notify users when maintenance starts/ends
4. **Automatic Disable**: Auto-disable after specified duration
5. **Maintenance Templates**: Predefined maintenance scenarios
6. **Rollback Support**: Automatic rollback on update failure