othman.suseno/atlas

othman.suseno ed96137bad

adding snapshot function

2025-12-14 23:17:26 +07:00

3.2 KiB

Background Job System

The atlasOS API includes a background job system that automatically executes snapshot policies and manages long-running operations.

Architecture

Components

Job Manager (internal/job/manager.go)
  - Tracks job lifecycle (pending, running, completed, failed, cancelled)
  - Stores job metadata and progress
  - Thread-safe job operations

Snapshot Scheduler (internal/snapshot/scheduler.go)
  - Automatically creates snapshots based on policies
  - Prunes old snapshots based on retention rules
  - Runs every 15 minutes by default

Integration
  - Scheduler starts automatically when API server starts
  - Gracefully stops on server shutdown
  - Jobs are accessible via API endpoints
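
The start/stop behaviour described above can be sketched as a ticker loop that exits when its context is cancelled. This is only an illustration: the Scheduler type, NewScheduler, and demoTicks are hypothetical names, and the real code in internal/snapshot/scheduler.go may be organised differently.

```go
package main

import (
	"context"
	"sync"
	"time"
)

// Scheduler is a hypothetical stand-in for the real scheduler type.
type Scheduler struct {
	mu     sync.Mutex
	run    func()             // work performed on every tick
	cancel context.CancelFunc // non-nil while the loop is running
	done   chan struct{}      // closed once the loop has exited
}

func NewScheduler(run func()) *Scheduler { return &Scheduler{run: run} }

// Start launches the loop in a background goroutine. Calling it twice is a no-op.
func (s *Scheduler) Start(interval time.Duration) {
	s.mu.Lock()
	defer s.mu.Unlock()
	if s.cancel != nil || interval <= 0 {
		return
	}
	ctx, cancel := context.WithCancel(context.Background())
	done := make(chan struct{})
	s.cancel, s.done = cancel, done
	go func() {
		defer close(done)
		ticker := time.NewTicker(interval)
		defer ticker.Stop()
		for {
			select {
			case <-ctx.Done():
				return
			case <-ticker.C:
				s.run()
			}
		}
	}()
}

// Stop cancels the loop and waits for any in-flight run to finish.
func (s *Scheduler) Stop() {
	s.mu.Lock()
	cancel, done := s.cancel, s.done
	s.cancel, s.done = nil, nil
	s.mu.Unlock()
	if cancel == nil {
		return
	}
	cancel()
	<-done
}

// demoTicks runs a scheduler briefly and reports how often the callback fired.
func demoTicks() int {
	n := 0 // written only by the loop goroutine; read after Stop has returned
	s := NewScheduler(func() { n++ })
	s.Start(5 * time.Millisecond)
	time.Sleep(100 * time.Millisecond)
	s.Stop()
	return n
}
```

Waiting on the done channel in Stop is what makes the shutdown graceful: the server does not exit while a snapshot run is still in progress.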

How It Works

Snapshot Creation

Every 15 minutes, the scheduler checks all enabled snapshot policies and, for each policy, creates the snapshot types that are due:

Frequent snapshots: Creates every 15 minutes if frequent > 0
Hourly snapshots: Creates every hour if hourly > 0
Daily snapshots: Creates daily if daily > 0
Weekly snapshots: Creates weekly if weekly > 0
Monthly snapshots: Creates monthly if monthly > 0
Yearly snapshots: Creates yearly if yearly > 0

Snapshot names follow the pattern: {type}-{timestamp} (e.g., hourly-20241214-143000)
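
As a rough sketch of the rules above, the snippet below picks the snapshot types a policy enables and formats a name in the {type}-{timestamp} pattern. The Policy struct and both function names are assumptions made for illustration, and the sketch does not decide whether timestamps are local or UTC; it uses whatever location the time value carries.

```go
package main

import "time"

// Policy mirrors the counters named in the text; the struct itself is an assumption.
type Policy struct {
	Frequent, Hourly, Daily, Weekly, Monthly, Yearly int
}

// EnabledTypes returns the snapshot types whose counter is greater than zero,
// ordered from most to least frequent.
func EnabledTypes(p Policy) []string {
	all := []struct {
		name  string
		count int
	}{
		{"frequent", p.Frequent},
		{"hourly", p.Hourly},
		{"daily", p.Daily},
		{"weekly", p.Weekly},
		{"monthly", p.Monthly},
		{"yearly", p.Yearly},
	}
	var out []string
	for _, t := range all {
		if t.count > 0 {
			out = append(out, t.name)
		}
	}
	return out
}

// SnapshotName formats {type}-{timestamp} using Go's reference-time layout.
func SnapshotName(kind string, t time.Time) string {
	return kind + "-" + t.Format("20060102-150405")
}

// exampleName reproduces the example name given in the text.
func exampleName() string {
	return SnapshotName("hourly", time.Date(2024, time.December, 14, 14, 30, 0, 0, time.UTC))
}
```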

Snapshot Pruning

When autoprune is enabled, the scheduler:

Groups snapshots by type (frequent, hourly, daily, etc.)
Sorts by creation time (newest first)
Keeps only the number specified in the policy
Deletes older snapshots that exceed the retention count
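
The four pruning steps can be sketched as a single selection function. The Snapshot struct and the function names are hypothetical, and one choice here is a guess: types with no positive retention count are left untouched rather than deleted.

```go
package main

import (
	"sort"
	"time"
)

// Snapshot is a hypothetical record; the real scheduler may model this differently.
type Snapshot struct {
	Name    string
	Type    string
	Created time.Time
}

// ToPrune returns the snapshots that exceed the per-type retention count.
func ToPrune(snaps []Snapshot, keep map[string]int) []Snapshot {
	// Step 1: group by type.
	byType := make(map[string][]Snapshot)
	for _, s := range snaps {
		byType[s.Type] = append(byType[s.Type], s)
	}
	// Visit types in a fixed order so the result is deterministic.
	types := make([]string, 0, len(byType))
	for t := range byType {
		types = append(types, t)
	}
	sort.Strings(types)

	var doomed []Snapshot
	for _, t := range types {
		n := keep[t]
		if n <= 0 {
			continue // assumption: no retention count means "do not touch"
		}
		group := byType[t]
		// Step 2: newest first.
		sort.Slice(group, func(i, j int) bool { return group[i].Created.After(group[j].Created) })
		// Steps 3 and 4: keep the first n, mark the rest for deletion.
		if len(group) > n {
			doomed = append(doomed, group[n:]...)
		}
	}
	return doomed
}

// demoPrune builds four hourly snapshots and one daily, then keeps two hourly.
func demoPrune() []string {
	day := time.Date(2024, time.December, 14, 0, 0, 0, 0, time.UTC)
	snaps := []Snapshot{{Name: "daily-" + day.Format("20060102-150405"), Type: "daily", Created: day}}
	for h := 10; h <= 13; h++ {
		t := day.Add(time.Duration(h) * time.Hour)
		snaps = append(snaps, Snapshot{Name: "hourly-" + t.Format("20060102-150405"), Type: "hourly", Created: t})
	}
	var names []string
	for _, s := range ToPrune(snaps, map[string]int{"hourly": 2, "daily": 7}) {
		names = append(names, s.Name)
	}
	return names
}
```

In the demo, the 12:00 and 13:00 hourly snapshots survive, and the 11:00 and 10:00 ones are returned for deletion.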

Job Tracking

Every snapshot operation creates a job that tracks:

Status (pending → running → completed, failed, or cancelled)
Progress (0-100%)
Error messages (if failed)
Timestamps (created, started, completed)
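
A minimal sketch of how such a job record and a thread-safe in-memory store could look is shown below. All type, field, and method names are assumptions; the actual internal/job/manager.go may expose a different API. Cancellation is left out to keep the sketch short.

```go
package main

import (
	"fmt"
	"sync"
	"time"
)

// Status values match the lifecycle states listed in this document.
type Status string

const (
	Pending   Status = "pending"
	Running   Status = "running"
	Completed Status = "completed"
	Failed    Status = "failed"
	Cancelled Status = "cancelled"
)

// Job holds the tracked fields; the field names are assumptions.
type Job struct {
	ID          string
	Status      Status
	Progress    int    // 0-100
	Error       string // set only when Status is Failed
	CreatedAt   time.Time
	StartedAt   time.Time // zero until the job starts
	CompletedAt time.Time // zero until the job ends
}

// Manager is a hypothetical in-memory store guarded by a mutex.
type Manager struct {
	mu   sync.Mutex
	jobs map[string]*Job
	seq  int
}

func NewManager() *Manager { return &Manager{jobs: make(map[string]*Job)} }

// Create registers a new pending job and returns a copy of it.
func (m *Manager) Create() Job {
	m.mu.Lock()
	defer m.mu.Unlock()
	m.seq++
	j := &Job{ID: fmt.Sprintf("job-%d", m.seq), Status: Pending, CreatedAt: time.Now()}
	m.jobs[j.ID] = j
	return *j
}

// Start moves a pending job to running.
func (m *Manager) Start(id string) bool {
	m.mu.Lock()
	defer m.mu.Unlock()
	j, ok := m.jobs[id]
	if !ok || j.Status != Pending {
		return false
	}
	j.Status, j.StartedAt = Running, time.Now()
	return true
}

// Finish ends a running job: completed when errMsg is empty, failed otherwise.
func (m *Manager) Finish(id, errMsg string) bool {
	m.mu.Lock()
	defer m.mu.Unlock()
	j, ok := m.jobs[id]
	if !ok || j.Status != Running {
		return false
	}
	j.CompletedAt = time.Now()
	if errMsg != "" {
		j.Status, j.Error = Failed, errMsg
		return true
	}
	j.Status, j.Progress = Completed, 100
	return true
}

// Get returns a copy so callers cannot mutate shared state.
func (m *Manager) Get(id string) (Job, bool) {
	m.mu.Lock()
	defer m.mu.Unlock()
	j, ok := m.jobs[id]
	if !ok {
		return Job{}, false
	}
	return *j, true
}
```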

API Endpoints

List Jobs

GET /api/v1/jobs
GET /api/v1/jobs?status=running

Get Job

GET /api/v1/jobs/{id}

Cancel Job

POST /api/v1/jobs/{id}/cancel
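
For callers written in Go, the three endpoints above could be addressed with small URL helpers like the ones sketched here. The function names are hypothetical, and nothing is assumed about the response format.

```go
package main

import (
	"net/url"
	"strings"
)

const jobsPath = "/api/v1/jobs"

// ListJobsURL builds the list endpoint, adding ?status=... when a filter is given.
func ListJobsURL(base, status string) string {
	u := strings.TrimRight(base, "/") + jobsPath
	if status != "" {
		u += "?" + url.Values{"status": {status}}.Encode()
	}
	return u
}

// JobURL builds the endpoint for a single job; the ID is path-escaped.
func JobURL(base, id string) string {
	return strings.TrimRight(base, "/") + jobsPath + "/" + url.PathEscape(id)
}

// CancelJobURL builds the cancel endpoint, which expects a POST request.
func CancelJobURL(base, id string) string {
	return JobURL(base, id) + "/cancel"
}
```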

Configuration

The scheduler interval is hardcoded to 15 minutes. To change it, modify:

// In internal/httpapp/app.go
scheduler.Start(15 * time.Minute)  // Change interval here
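
If editing the source is inconvenient, one possible approach is to parse the interval from a string and fall back to the 15-minute default. This is a sketch only: IntervalFrom is a hypothetical helper, and no such setting exists in the project today.

```go
package main

import "time"

// defaultInterval matches the hardcoded value described above.
const defaultInterval = 15 * time.Minute

// IntervalFrom parses a duration such as "5m" or "1h".
// Empty, malformed, or non-positive values fall back to the default.
func IntervalFrom(raw string) time.Duration {
	d, err := time.ParseDuration(raw)
	if err != nil || d <= 0 {
		return defaultInterval
	}
	return d
}
```

The call in internal/httpapp/app.go could then become scheduler.Start(IntervalFrom(os.Getenv("ATLAS_SNAPSHOT_INTERVAL"))), where the environment variable name is purely illustrative.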

Example Workflow

Create a snapshot policy:

curl -X POST http://localhost:8080/api/v1/snapshot-policies \
  -H "Content-Type: application/json" \
  -d '{
    "dataset": "pool/dataset",
    "hourly": 24,
    "daily": 7,
    "autosnap": true,
    "autoprune": true
  }'

The scheduler then automatically:
- Creates hourly snapshots (keeps 24)
- Creates daily snapshots (keeps 7)
- Prunes old snapshots beyond retention

Monitor jobs:

curl http://localhost:8080/api/v1/jobs
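
The policy body sent in the first step can also be built from Go. The sketch below assumes the JSON field names shown in the curl example plus the other counters named earlier in this document. The struct and helper names are hypothetical, and whether the server accepts omitted counters is an assumption.

```go
package main

import "encoding/json"

// SnapshotPolicy mirrors the request body from the curl example.
type SnapshotPolicy struct {
	Dataset   string `json:"dataset"`
	Frequent  int    `json:"frequent,omitempty"`
	Hourly    int    `json:"hourly,omitempty"`
	Daily     int    `json:"daily,omitempty"`
	Weekly    int    `json:"weekly,omitempty"`
	Monthly   int    `json:"monthly,omitempty"`
	Yearly    int    `json:"yearly,omitempty"`
	Autosnap  bool   `json:"autosnap"`
	Autoprune bool   `json:"autoprune"`
}

// EncodePolicy returns the JSON body to POST to /api/v1/snapshot-policies.
func EncodePolicy(p SnapshotPolicy) (string, error) {
	b, err := json.Marshal(p)
	if err != nil {
		return "", err
	}
	return string(b), nil
}
```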

Job Statuses

pending: Job created but not started
running: Job is currently executing
completed: Job finished successfully
failed: Job encountered an error
cancelled: Job was cancelled by user

Notes

Jobs are stored in memory and are lost when the server restarts.
The scheduler runs in a background goroutine.
Snapshot operations are synchronous: each one blocks until it completes.
For production, consider:
- Database persistence for jobs
- Async job execution with a worker pool
- Job history retention policies
- Metrics/alerting for failed jobs
