atlas/docs/MAINTENANCE_MODE.md
othman.suseno 9779b30a65
add maintenance mode
2025-12-15 01:11:51 +07:00

Maintenance Mode & Update Management

Overview

AtlasOS provides a maintenance mode feature that allows administrators to safely disable user operations during system updates or maintenance. When maintenance mode is enabled, all mutating operations (create, update, delete) are blocked except for users explicitly allowed.

Features

  • Maintenance Mode: Disable user operations during maintenance
  • Automatic Backup: Optionally create backup before entering maintenance
  • Allowed Users: Specify users who can operate during maintenance
  • Health Check Integration: Maintenance status included in health checks
  • Audit Logging: All maintenance mode changes are logged

API Endpoints

Get Maintenance Status

GET /api/v1/maintenance

Returns the current maintenance mode status.

Response:

{
  "enabled": false,
  "enabled_at": "2024-12-20T10:30:00Z",
  "enabled_by": "admin",
  "reason": "System update",
  "allowed_users": ["admin"],
  "last_backup_id": "backup-1703123456"
}
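A client would typically decode this response into a typed structure. The following is a minimal sketch, assuming the field names shown above; the `MaintenanceStatus` class and `parse_status` helper are hypothetical names, and treating every field except `enabled` as optional is an assumption, since the document does not say which fields are present while maintenance mode is off.

```python
import json
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class MaintenanceStatus:
    """Mirror of the JSON fields in the status response (hypothetical type)."""
    enabled: bool
    enabled_at: Optional[str] = None
    enabled_by: Optional[str] = None
    reason: Optional[str] = None
    allowed_users: List[str] = field(default_factory=list)
    last_backup_id: Optional[str] = None


def parse_status(body: str) -> MaintenanceStatus:
    """Decode a status response, tolerating missing or null optional fields."""
    data = json.loads(body)
    return MaintenanceStatus(
        enabled=bool(data.get("enabled", False)),
        enabled_at=data.get("enabled_at"),
        enabled_by=data.get("enabled_by"),
        reason=data.get("reason"),
        # Assumption: a null or absent list means "no allowed users".
        allowed_users=list(data.get("allowed_users") or []),
        last_backup_id=data.get("last_backup_id"),
    )
```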

Enable Maintenance Mode

POST /api/v1/maintenance

Enables maintenance mode. Requires administrator role.

Request Body:

{
  "reason": "System update to v1.1.0",
  "allowed_users": ["admin"],
  "create_backup": true
}

Fields:

  • reason (string, required): Reason for entering maintenance mode
  • allowed_users (array of strings, optional): User IDs allowed to operate during maintenance
  • create_backup (boolean, optional): Create automatic backup before entering maintenance
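The field rules above can be checked on the client before the request is sent. This is a sketch only: `build_enable_body` is a hypothetical helper, and the server's own validation may differ from what is shown here.

```python
import json
from typing import Iterable, Optional


def build_enable_body(reason: str,
                      allowed_users: Optional[Iterable[str]] = None,
                      create_backup: Optional[bool] = None) -> str:
    """Build the JSON body for the enable request, omitting unset optional fields."""
    if not isinstance(reason, str) or not reason.strip():
        raise ValueError("reason is required and must be a non-empty string")
    body = {"reason": reason}
    if allowed_users is not None:
        if isinstance(allowed_users, str):
            # Guard against passing "admin" instead of ["admin"].
            raise ValueError("allowed_users must be a list of strings")
        users = list(allowed_users)
        if not all(isinstance(u, str) for u in users):
            raise ValueError("allowed_users must contain only strings")
        body["allowed_users"] = users
    if create_backup is not None:
        body["create_backup"] = bool(create_backup)
    return json.dumps(body)
```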

Response:

{
  "message": "maintenance mode enabled",
  "status": {
    "enabled": true,
    "enabled_at": "2024-12-20T10:30:00Z",
    "enabled_by": "admin",
    "reason": "System update to v1.1.0",
    "allowed_users": ["admin"],
    "last_backup_id": "backup-1703123456"
  },
  "backup_id": "backup-1703123456"
}

Disable Maintenance Mode

POST /api/v1/maintenance/disable

Disables maintenance mode. Requires administrator role.

Response:

{
  "message": "maintenance mode disabled"
}

Usage Examples

Enable Maintenance Mode with Backup

curl -X POST http://localhost:8080/api/v1/maintenance \
  -H "Authorization: Bearer $TOKEN" \
  -H "Content-Type: application/json" \
  -d '{
    "reason": "System update to v1.1.0",
    "allowed_users": ["admin"],
    "create_backup": true
  }'

Check Maintenance Status

curl http://localhost:8080/api/v1/maintenance \
  -H "Authorization: Bearer $TOKEN"

Disable Maintenance Mode

curl -X POST http://localhost:8080/api/v1/maintenance/disable \
  -H "Authorization: Bearer $TOKEN"
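The same three calls can be prepared from Python with the standard library. The sketch below only builds the requests and leaves sending to the caller; `maintenance_request` and `BASE_URL` are illustrative names, and the host and port are simply the ones used in the curl examples.

```python
import json
import urllib.request
from typing import Optional

BASE_URL = "http://localhost:8080"  # same host as the curl examples

# Method and path for each action, taken from the API Endpoints section.
ROUTES = {
    "status": ("GET", "/api/v1/maintenance"),
    "enable": ("POST", "/api/v1/maintenance"),
    "disable": ("POST", "/api/v1/maintenance/disable"),
}


def maintenance_request(token: str, action: str,
                        body: Optional[dict] = None) -> urllib.request.Request:
    """Build (but do not send) an authenticated request for one action."""
    method, path = ROUTES[action]
    headers = {"Authorization": f"Bearer {token}"}
    data = None
    if body is not None:
        data = json.dumps(body).encode("utf-8")
        headers["Content-Type"] = "application/json"
    return urllib.request.Request(BASE_URL + path, data=data,
                                  headers=headers, method=method)


# Sending is left to the caller, for example:
#   with urllib.request.urlopen(maintenance_request(token, "status")) as resp:
#       print(json.load(resp))
```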

Behavior

When Maintenance Mode is Enabled

  1. Read Operations: All GET requests continue to work normally
  2. Mutating Operations: POST, PUT, PATCH, and DELETE requests are blocked for every user who is not in the allowed_users list
  3. Allowed Users: Users in the allowed_users list can still perform mutating operations
  4. Public Endpoints: Public endpoints (login, health checks) continue to work
  5. Error Response: Blocked operations return 503 Service Unavailable with message:
    {
      "code": "SERVICE_UNAVAILABLE",
      "message": "system is in maintenance mode",
      "details": "the system is currently in maintenance mode and user operations are disabled"
    }
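The five rules above amount to a small decision function. The following sketch is not the AtlasOS middleware; it is an illustration under stated assumptions: `is_blocked` and `blocked_response` are hypothetical names, and `PUBLIC_PATHS` is a placeholder set, since the document names only "login, health checks" without giving their paths (apart from /health).

```python
from typing import Dict, Iterable, Optional, Tuple

MUTATING_METHODS = {"POST", "PUT", "PATCH", "DELETE"}

# Placeholder paths: only /health is named in this document;
# the login path is an assumption made for illustration.
PUBLIC_PATHS = {"/health", "/api/v1/auth/login"}


def is_blocked(enabled: bool, method: str, path: str,
               user: Optional[str], allowed_users: Iterable[str]) -> bool:
    """Return True when a request should be rejected during maintenance."""
    if not enabled:
        return False                      # maintenance mode is off
    if method.upper() not in MUTATING_METHODS:
        return False                      # rule 1: reads continue to work
    if path in PUBLIC_PATHS:
        return False                      # rule 4: public endpoints
    if user is not None and user in set(allowed_users):
        return False                      # rule 3: allowed users
    return True                           # rule 2: everything else


def blocked_response() -> Tuple[int, Dict[str, str]]:
    """Status code and body for a blocked request (rule 5)."""
    return 503, {
        "code": "SERVICE_UNAVAILABLE",
        "message": "system is in maintenance mode",
        "details": "the system is currently in maintenance mode "
                   "and user operations are disabled",
    }
```

How the server treats the disable endpoint itself for an administrator who is not in allowed_users is not stated here, so the sketch does not special-case it.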
    

Middleware Order

Maintenance mode middleware is applied after authentication but before routes:

  1. CORS
  2. Compression
  3. Security headers
  4. Request size limit
  5. Content-Type validation
  6. Rate limiting
  7. Caching
  8. Error recovery
  9. Request ID
  10. Logging
  11. Audit
  12. Authentication
  13. Maintenance mode ← Blocks operations
  14. Routes

Health Check Integration

The health check endpoint (/health) includes maintenance mode status:

{
  "status": "maintenance",
  "timestamp": "2024-12-20T10:30:00Z",
  "checks": {
    "zfs": "healthy",
    "database": "healthy",
    "smb": "healthy",
    "nfs": "healthy",
    "iscsi": "healthy",
    "maintenance": "enabled"
  }
}

When maintenance mode is enabled:

  • The top-level status may change from "healthy" to "maintenance"
  • checks.maintenance will be "enabled"
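Because the top-level status only "may" change, a monitoring script is probably safer keying off checks.maintenance. A minimal sketch, assuming the response shape shown above; `in_maintenance` and `failing_checks` are hypothetical helper names.

```python
import json
from typing import List


def in_maintenance(health_body: str) -> bool:
    """True when the health response reports maintenance mode as enabled."""
    checks = json.loads(health_body).get("checks") or {}
    return checks.get("maintenance") == "enabled"


def failing_checks(health_body: str) -> List[str]:
    """Names of service checks that are not "healthy".

    The maintenance entry is skipped because it reports a mode,
    not the health of a service.
    """
    checks = json.loads(health_body).get("checks") or {}
    return sorted(name for name, state in checks.items()
                  if name != "maintenance" and state != "healthy")
```

With this split, an alerting rule can stay quiet for a planned maintenance window while still reporting a service that is actually down.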

Automatic Backup

When create_backup: true is specified:

  1. A backup is created automatically before entering maintenance
  2. The backup ID is stored in maintenance status
  3. The backup includes:
    • All user accounts
    • All SMB shares
    • All NFS exports
    • All iSCSI targets
    • All snapshot policies
    • System configuration

Best Practices

Before System Updates

  1. Create Backup: Always enable create_backup: true
  2. Notify Users: Inform users about maintenance window
  3. Allow Administrators: Include admin users in allowed_users
  4. Document Reason: Provide clear reason for maintenance

During Maintenance

  1. Monitor Status: Check /api/v1/maintenance periodically
  2. Verify Backup: Confirm backup was created successfully
  3. Perform Updates: Execute system updates or maintenance tasks
  4. Test Operations: Verify system functionality
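Periodic status checks can be scripted with a small polling loop. This is a sketch only: `wait_for_enabled` is a hypothetical helper, `fetch_status` stands in for whatever function performs GET /api/v1/maintenance and returns the decoded JSON, and the attempt count and delay are arbitrary defaults.

```python
import time
from typing import Callable, Dict


def wait_for_enabled(fetch_status: Callable[[], Dict],
                     want: bool,
                     attempts: int = 10,
                     delay: float = 5.0,
                     sleep: Callable[[float], None] = time.sleep) -> bool:
    """Poll the status until "enabled" equals `want` or attempts run out."""
    for attempt in range(attempts):
        if bool(fetch_status().get("enabled")) == want:
            return True
        if attempt < attempts - 1:
            sleep(delay)  # wait before the next check, not after the last one
    return False
```

Injecting `sleep` keeps the loop easy to exercise in tests without real waiting.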

After Maintenance

  1. Disable Maintenance: Send POST /api/v1/maintenance/disable
  2. Verify Services: Check all services are running
  3. Test Operations: Verify normal operations work
  4. Review Logs: Check audit logs for any issues

Security Considerations

  1. Administrator Only: Only administrators can enable/disable maintenance mode
  2. Audit Logging: All maintenance mode changes are logged
  3. Allowed Users: Only specified users can operate during maintenance
  4. Token Validation: Maintenance mode does not bypass authentication; allowed users must still present a valid token

Error Handling

Maintenance Mode Already Enabled

{
  "code": "INTERNAL_ERROR",
  "message": "failed to enable maintenance mode",
  "details": "maintenance mode is already enabled"
}

Maintenance Mode Not Enabled

{
  "code": "INTERNAL_ERROR",
  "message": "failed to disable maintenance mode",
  "details": "maintenance mode is not enabled"
}

Backup Creation Failure

If backup creation fails, maintenance mode is not enabled:

{
  "code": "INTERNAL_ERROR",
  "message": "failed to create backup",
  "details": "error details..."
}
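The three error cases can be modelled as a small state object. This is an illustrative model, not the server's implementation: `MaintenanceState` and `MaintenanceError` are hypothetical names, `backup_fn` stands in for the backup step, and checking "already enabled" before attempting the backup is an assumption about ordering.

```python
from typing import Callable, Optional


class MaintenanceError(Exception):
    """Carries the message/details pair used in the error bodies above."""

    def __init__(self, message: str, details: str) -> None:
        super().__init__(f"{message}: {details}")
        self.message = message
        self.details = details


class MaintenanceState:
    def __init__(self) -> None:
        self.enabled = False
        self.last_backup_id: Optional[str] = None

    def enable(self, backup_fn: Optional[Callable[[], str]] = None) -> Optional[str]:
        """Enable maintenance; pass backup_fn to model create_backup: true."""
        if self.enabled:
            raise MaintenanceError("failed to enable maintenance mode",
                                   "maintenance mode is already enabled")
        backup_id = None
        if backup_fn is not None:
            try:
                backup_id = backup_fn()
            except Exception as exc:
                # A failed backup leaves maintenance mode disabled.
                raise MaintenanceError("failed to create backup", str(exc)) from exc
            self.last_backup_id = backup_id
        self.enabled = True
        return backup_id

    def disable(self) -> None:
        if not self.enabled:
            raise MaintenanceError("failed to disable maintenance mode",
                                   "maintenance mode is not enabled")
        self.enabled = False
```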

Integration with Update Process

  1. Enable Maintenance Mode:

    POST /api/v1/maintenance
    {
      "reason": "Updating to v1.1.0",
      "allowed_users": ["admin"],
      "create_backup": true
    }
    
  2. Verify Backup:

    GET /api/v1/backups/{backup_id}
    
  3. Perform System Update:

    • Stop services if needed
    • Update binaries/configurations
    • Restart services
  4. Verify System Health:

    GET /health
    
  5. Disable Maintenance Mode:

    POST /api/v1/maintenance/disable
    
  6. Test Operations:

    • Verify normal operations work
    • Check service status
    • Review logs
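Steps 1 to 5 above can be sketched as one orchestration function. Everything here is hypothetical scaffolding: the `Client` protocol stands in for an HTTP wrapper around the endpoints listed, `get_backup` is assumed to raise if the backup cannot be found, and `perform_update` is whatever callable carries out the actual update. On a failed health check the sketch leaves maintenance mode enabled, because the document describes no automatic rollback.

```python
from typing import Callable, Protocol


class Client(Protocol):
    """Hypothetical wrapper around the endpoints used in the steps above."""
    def enable(self, body: dict) -> dict: ...
    def get_backup(self, backup_id: str) -> dict: ...
    def health(self) -> dict: ...
    def disable(self) -> dict: ...


def run_update(client: Client, version: str,
               perform_update: Callable[[], None]) -> str:
    """Run the update sequence and return the backup ID that was created."""
    # Step 1: enable maintenance mode with a backup.
    resp = client.enable({
        "reason": f"Updating to {version}",
        "allowed_users": ["admin"],
        "create_backup": True,
    })
    backup_id = resp["backup_id"]

    # Step 2: verify the backup (assumed to raise if it is missing).
    client.get_backup(backup_id)

    # Step 3: perform the update itself.
    perform_update()

    # Step 4: verify health, ignoring the maintenance entry.
    checks = client.health().get("checks") or {}
    failing = sorted(name for name, state in checks.items()
                     if name != "maintenance" and state != "healthy")
    if failing:
        # Maintenance mode stays on so an operator can investigate.
        raise RuntimeError("unhealthy after update: " + ", ".join(failing))

    # Step 5: disable maintenance mode.
    client.disable()
    return backup_id
```

Step 6 (testing normal operations and reviewing logs) is left to the operator, since it depends on the deployment.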

Limitations

  1. No Automatic Disable: Maintenance mode must be manually disabled
  2. No Scheduled Maintenance: Maintenance mode must be enabled manually
  3. No Maintenance History: Only current status is available
  4. No Notifications: No automatic notifications to users

Future Enhancements

  1. Scheduled Maintenance: Schedule maintenance windows
  2. Maintenance History: Track maintenance mode history
  3. User Notifications: Notify users when maintenance starts/ends
  4. Automatic Disable: Auto-disable after specified duration
  5. Maintenance Templates: Predefined maintenance scenarios
  6. Rollback Support: Automatic rollback on update failure