adding snapshot function

2025-12-14 23:17:26 +07:00
parent 461edbc970
commit ed96137bad
8 changed files with 1075 additions and 20 deletions
--- a/docs/BACKGROUND_JOBS.md
+++ b/docs/BACKGROUND_JOBS.md
@@ -0,0 +1,125 @@
+# Background Job System
+
+The atlasOS API includes a background job system that automatically executes snapshot policies and manages long-running operations.
+
+## Architecture
+
+### Components
+
+1. **Job Manager** (`internal/job/manager.go`)
+   - Tracks job lifecycle (pending, running, completed, failed, cancelled)
+   - Stores job metadata and progress
+   - Thread-safe job operations
+
+2. **Snapshot Scheduler** (`internal/snapshot/scheduler.go`)
+   - Automatically creates snapshots based on policies
+   - Prunes old snapshots based on retention rules
+   - Runs every 15 minutes (the interval is currently hardcoded; see Configuration)
+
+3. **Integration**
+   - Scheduler starts automatically when API server starts
+   - Gracefully stops on server shutdown
+   - Jobs are accessible via API endpoints
+
+## How It Works
+
+### Snapshot Creation
+
+Every 15 minutes, the scheduler checks all enabled snapshot policies and, for each policy, creates:
+
+1. **Frequent snapshots**: every 15 minutes if `frequent > 0`
+2. **Hourly snapshots**: every hour if `hourly > 0`
+3. **Daily snapshots**: once a day if `daily > 0`
+4. **Weekly snapshots**: once a week if `weekly > 0`
+5. **Monthly snapshots**: once a month if `monthly > 0`
+6. **Yearly snapshots**: once a year if `yearly > 0`
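+
+The per-type check above could be sketched roughly as follows. This is an illustration, not the scheduler's actual code: `Policy`, `snapshotType`, and `dueTypes` are hypothetical names, and the fixed 30-day month and 365-day year are simplifying assumptions.
+
+```go
+package main
+
+import (
+	"fmt"
+	"time"
+)
+
+// Policy mirrors the retention counts named in this document.
+// The struct itself is an assumption made for illustration.
+type Policy struct {
+	Frequent, Hourly, Daily, Weekly, Monthly, Yearly int
+}
+
+// snapshotType pairs a type name with its retention count and interval.
+type snapshotType struct {
+	Name     string
+	Keep     int
+	Interval time.Duration
+}
+
+func (p Policy) types() []snapshotType {
+	day := 24 * time.Hour
+	return []snapshotType{
+		{"frequent", p.Frequent, 15 * time.Minute},
+		{"hourly", p.Hourly, time.Hour},
+		{"daily", p.Daily, day},
+		{"weekly", p.Weekly, 7 * day},
+		{"monthly", p.Monthly, 30 * day}, // simplification: fixed 30-day month
+		{"yearly", p.Yearly, 365 * day},  // simplification: fixed 365-day year
+	}
+}
+
+// dueTypes returns the snapshot types that should be created at now.
+// last maps a type name to the time of its most recent snapshot;
+// a missing entry means no snapshot of that type exists yet.
+func dueTypes(p Policy, last map[string]time.Time, now time.Time) []string {
+	var due []string
+	for _, t := range p.types() {
+		if t.Keep <= 0 {
+			continue // type disabled in this policy
+		}
+		prev, ok := last[t.Name]
+		if !ok || now.Sub(prev) >= t.Interval {
+			due = append(due, t.Name)
+		}
+	}
+	return due
+}
+
+func main() {
+	now := time.Date(2024, 12, 14, 14, 30, 0, 0, time.UTC)
+	last := map[string]time.Time{
+		"hourly": now.Add(-20 * time.Minute),
+		"daily":  now.Add(-25 * time.Hour),
+	}
+	fmt.Println(dueTypes(Policy{Hourly: 24, Daily: 7}, last, now))
+}
+```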
+
+Snapshot names follow the pattern `{type}-{timestamp}`, with the timestamp written as `YYYYMMDD-HHMMSS` (e.g., `hourly-20241214-143000`).
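+
+A name of this shape can be produced with Go's reference-time layout. The sketch below is a minimal illustration; `snapshotName` is a hypothetical helper, and the layout string is inferred from the example name rather than taken from the scheduler source.
+
+```go
+package main
+
+import (
+	"fmt"
+	"time"
+)
+
+// snapshotName builds a "{type}-{timestamp}" name.
+// The layout is inferred from the example "hourly-20241214-143000".
+func snapshotName(kind string, t time.Time) string {
+	return kind + "-" + t.Format("20060102-150405")
+}
+
+func main() {
+	t := time.Date(2024, 12, 14, 14, 30, 0, 0, time.UTC)
+	fmt.Println(snapshotName("hourly", t)) // prints "hourly-20241214-143000"
+}
+```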
+
+### Snapshot Pruning
+
+When `autoprune` is enabled, the scheduler:
+
+1. Groups snapshots by type (frequent, hourly, daily, etc.)
+2. Sorts each group by creation time (newest first)
+3. Keeps the newest N snapshots in each group, where N is the count set in the policy for that type
+4. Deletes the remaining, older snapshots in that group
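+
+The four pruning steps might look like the sketch below. `Snapshot` and `toDelete` are hypothetical names, the type is read from the name prefix before the first `-`, and treating a retention count of zero as "leave untouched" is an assumption rather than documented behaviour.
+
+```go
+package main
+
+import (
+	"fmt"
+	"sort"
+	"strings"
+	"time"
+)
+
+// Snapshot is a hypothetical record; only the name and creation time matter here.
+type Snapshot struct {
+	Name    string
+	Created time.Time
+}
+
+// toDelete returns the snapshots that exceed the retention count of their type.
+// keep maps a type prefix such as "hourly" to the number of snapshots to retain.
+// Types with no positive count in keep are left untouched (an assumption).
+func toDelete(snaps []Snapshot, keep map[string]int) []Snapshot {
+	// Step 1: group by type, taken from the prefix before the first "-".
+	groups := make(map[string][]Snapshot)
+	for _, s := range snaps {
+		kind, _, ok := strings.Cut(s.Name, "-")
+		if ok && keep[kind] > 0 {
+			groups[kind] = append(groups[kind], s)
+		}
+	}
+	var doomed []Snapshot
+	for kind, group := range groups {
+		// Step 2: sort newest first.
+		sort.Slice(group, func(i, j int) bool {
+			return group[i].Created.After(group[j].Created)
+		})
+		// Steps 3 and 4: keep the first n, mark the rest for deletion.
+		if n := keep[kind]; len(group) > n {
+			doomed = append(doomed, group[n:]...)
+		}
+	}
+	return doomed
+}
+
+func main() {
+	now := time.Date(2024, 12, 14, 14, 30, 0, 0, time.UTC)
+	snaps := []Snapshot{
+		{"hourly-20241214-143000", now},
+		{"hourly-20241214-133000", now.Add(-time.Hour)},
+		{"hourly-20241214-123000", now.Add(-2 * time.Hour)},
+	}
+	for _, s := range toDelete(snaps, map[string]int{"hourly": 2}) {
+		fmt.Println("delete", s.Name)
+	}
+}
+```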
+
+### Job Tracking
+
+Every snapshot operation creates a job that tracks:
+- Status (pending → running → completed/failed/cancelled)
+- Progress (0-100%)
+- Error messages (if failed)
+- Timestamps (created, started, completed)
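+
+To make the tracked fields concrete, here is a minimal sketch of a mutex-guarded job store. It is not the code in `internal/job/manager.go`: every type, field, and method name below is an assumption, and cancellation is left out for brevity.
+
+```go
+package main
+
+import (
+	"errors"
+	"fmt"
+	"sync"
+	"time"
+)
+
+type Status string
+
+const (
+	Pending   Status = "pending"
+	Running   Status = "running"
+	Completed Status = "completed"
+	Failed    Status = "failed"
+)
+
+// Job holds the fields listed above.
+type Job struct {
+	ID          string
+	Status      Status
+	Progress    int    // 0-100
+	Error       string // set only when Status is Failed
+	CreatedAt   time.Time
+	StartedAt   time.Time
+	CompletedAt time.Time
+}
+
+var errBadState = errors.New("job not found or in the wrong state")
+
+// Manager guards all job access with a single mutex.
+type Manager struct {
+	mu   sync.Mutex
+	jobs map[string]*Job
+	seq  int
+}
+
+func NewManager() *Manager { return &Manager{jobs: make(map[string]*Job)} }
+
+// Create registers a new pending job and returns its ID.
+func (m *Manager) Create() string {
+	m.mu.Lock()
+	defer m.mu.Unlock()
+	m.seq++
+	id := fmt.Sprintf("job-%d", m.seq)
+	m.jobs[id] = &Job{ID: id, Status: Pending, CreatedAt: time.Now()}
+	return id
+}
+
+// update applies fn under the lock, but only if the job is in state from.
+func (m *Manager) update(id string, from Status, fn func(*Job)) error {
+	m.mu.Lock()
+	defer m.mu.Unlock()
+	j, ok := m.jobs[id]
+	if !ok || j.Status != from {
+		return errBadState
+	}
+	fn(j)
+	return nil
+}
+
+// Start moves a pending job to running.
+func (m *Manager) Start(id string) error {
+	return m.update(id, Pending, func(j *Job) {
+		j.Status, j.StartedAt = Running, time.Now()
+	})
+}
+
+// Finish moves a running job to completed, or to failed when err is non-nil.
+func (m *Manager) Finish(id string, err error) error {
+	return m.update(id, Running, func(j *Job) {
+		j.CompletedAt = time.Now()
+		if err != nil {
+			j.Status, j.Error = Failed, err.Error()
+			return
+		}
+		j.Status, j.Progress = Completed, 100
+	})
+}
+
+// Get returns a copy so callers cannot mutate shared state.
+func (m *Manager) Get(id string) (Job, bool) {
+	m.mu.Lock()
+	defer m.mu.Unlock()
+	j, ok := m.jobs[id]
+	if !ok {
+		return Job{}, false
+	}
+	return *j, true
+}
+
+func main() {
+	m := NewManager()
+	id := m.Create()
+	_ = m.Start(id)
+	_ = m.Finish(id, errors.New("snapshot failed"))
+	j, _ := m.Get(id)
+	fmt.Println(j.ID, j.Status, j.Error)
+}
+```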
+
+## API Endpoints
+
+### List Jobs
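+```bash
+# GET /api/v1/jobs
+curl http://localhost:8080/api/v1/jobs
+# GET /api/v1/jobs?status=running
+curl "http://localhost:8080/api/v1/jobs?status=running"
+```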
+
+### Get Job
+```bash
+# GET /api/v1/jobs/{id} (JOB_ID is a placeholder for the job's ID)
+curl "http://localhost:8080/api/v1/jobs/$JOB_ID"
+```
+
+### Cancel Job
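+```bash
+# POST /api/v1/jobs/{id}/cancel (JOB_ID is a placeholder for the job's ID)
+curl -X POST "http://localhost:8080/api/v1/jobs/$JOB_ID/cancel"
+```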
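+
+From Go, the list endpoint could be called along these lines. This is a sketch under assumptions: `jobsURL` and `listJobs` are hypothetical helpers, the base URL is the one used elsewhere in this document, and the response body is returned undecoded because its JSON schema is not described here.
+
+```go
+package main
+
+import (
+	"context"
+	"fmt"
+	"io"
+	"net/http"
+	"net/url"
+	"strings"
+	"time"
+)
+
+// jobsURL builds the list-jobs URL, adding ?status=... when a filter is given.
+func jobsURL(base, status string) string {
+	u := strings.TrimRight(base, "/") + "/api/v1/jobs"
+	if status != "" {
+		u += "?" + url.Values{"status": {status}}.Encode()
+	}
+	return u
+}
+
+// listJobs performs the GET request and returns the raw response body.
+func listJobs(ctx context.Context, client *http.Client, base, status string) ([]byte, error) {
+	req, err := http.NewRequestWithContext(ctx, http.MethodGet, jobsURL(base, status), nil)
+	if err != nil {
+		return nil, err
+	}
+	resp, err := client.Do(req)
+	if err != nil {
+		return nil, err
+	}
+	defer resp.Body.Close()
+	body, err := io.ReadAll(resp.Body)
+	if err != nil {
+		return nil, err
+	}
+	if resp.StatusCode != http.StatusOK {
+		return nil, fmt.Errorf("list jobs: unexpected status %s", resp.Status)
+	}
+	return body, nil
+}
+
+func main() {
+	ctx, cancel := context.WithTimeout(context.Background(), 5*time.Second)
+	defer cancel()
+	body, err := listJobs(ctx, http.DefaultClient, "http://localhost:8080", "running")
+	if err != nil {
+		fmt.Println("error:", err)
+		return
+	}
+	fmt.Println(string(body))
+}
+```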
+
+## Configuration
+
+The scheduler interval is hardcoded to 15 minutes. To change it, modify:
+
+```go
+// In internal/httpapp/app.go
+scheduler.Start(15 * time.Minute)  // Change interval here
+```
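+
+If editing the source is undesirable, one option is to read the interval from the environment and fall back to 15 minutes. The sketch below is only a suggestion: `parseInterval` and the `ATLAS_SNAPSHOT_INTERVAL` variable are hypothetical and do not exist in the project as described here.
+
+```go
+package main
+
+import (
+	"fmt"
+	"os"
+	"time"
+)
+
+const defaultInterval = 15 * time.Minute
+
+// parseInterval returns the duration encoded in raw (e.g. "5m", "1h").
+// It falls back to the default when raw is empty, invalid, or not positive.
+func parseInterval(raw string) time.Duration {
+	if raw == "" {
+		return defaultInterval
+	}
+	d, err := time.ParseDuration(raw)
+	if err != nil || d <= 0 {
+		return defaultInterval
+	}
+	return d
+}
+
+func main() {
+	// ATLAS_SNAPSHOT_INTERVAL is a hypothetical variable name.
+	interval := parseInterval(os.Getenv("ATLAS_SNAPSHOT_INTERVAL"))
+	fmt.Println("scheduler interval:", interval)
+	// The result would then replace the hardcoded argument to scheduler.Start.
+}
+```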
+
+## Example Workflow
+
+1. **Create a snapshot policy:**
+   ```bash
+   curl -X POST http://localhost:8080/api/v1/snapshot-policies \
+     -H "Content-Type: application/json" \
+     -d '{
+       "dataset": "pool/dataset",
+       "hourly": 24,
+       "daily": 7,
+       "autosnap": true,
+       "autoprune": true
+     }'
+   ```
+
+2. **Scheduler automatically:**
+   - Creates hourly snapshots (keeps 24)
+   - Creates daily snapshots (keeps 7)
+   - Prunes old snapshots beyond retention
+
+3. **Monitor jobs:**
+   ```bash
+   curl http://localhost:8080/api/v1/jobs
+   ```
+
+## Job Statuses
+
+- `pending`: Job has been created but has not started yet
+- `running`: Job is currently executing
+- `completed`: Job finished successfully
+- `failed`: Job encountered an error
+- `cancelled`: Job was cancelled by a user
+
+## Notes
+
+- Jobs are stored in memory and are lost when the server restarts
+- Scheduler runs in a background goroutine
+- Snapshot operations are synchronous (blocking)
+- For production, consider:
+  - Database persistence for jobs
+  - Async job execution with worker pool
+  - Job history retention policies
+  - Metrics/alerting for failed jobs
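+
+As a starting point for the worker-pool suggestion, a bounded pool can be built from a channel and a `sync.WaitGroup`. This is a generic sketch, not part of atlasOS: `runPool` is a hypothetical helper, and the tasks stand in for snapshot operations.
+
+```go
+package main
+
+import (
+	"fmt"
+	"sync"
+)
+
+// runPool executes tasks on at most n workers and returns one error slot per task.
+func runPool(n int, tasks []func() error) []error {
+	if n < 1 {
+		n = 1 // always run at least one worker so sends cannot block forever
+	}
+	errs := make([]error, len(tasks))
+	idx := make(chan int)
+	var wg sync.WaitGroup
+	for w := 0; w < n; w++ {
+		wg.Add(1)
+		go func() {
+			defer wg.Done()
+			for i := range idx {
+				errs[i] = tasks[i]() // each index is written by exactly one worker
+			}
+		}()
+	}
+	for i := range tasks {
+		idx <- i
+	}
+	close(idx)
+	wg.Wait()
+	return errs
+}
+
+func main() {
+	tasks := []func() error{
+		func() error { return nil },
+		func() error { return fmt.Errorf("snapshot failed") },
+		func() error { return nil },
+	}
+	for i, err := range runPool(2, tasks) {
+		fmt.Println("task", i, "error:", err)
+	}
+}
+```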