Hypervisor Control Plane (HCP)
Provider-Agnostic Compute API Gateway
Hypervisor Control Plane (HCP) is a compute control-plane service that provides a provider-agnostic API for managing the Virtual Machine (VM) lifecycle across multiple hypervisors, including:
- Proxmox VE
- VMware vSphere / ESXi
- KVM / QEMU (libvirt)
- Microsoft Hyper-V
HCP acts as a bridge between the Central API Gateway / Management Console and the hypervisor infrastructure, using a desired state + asynchronous job orchestration approach.
🎯 Main Goals
- Provide a consistent compute API across hypervisors
- Support multi-tenant & multi-project isolation
- Provide an enterprise-grade control plane:
  - async jobs
  - idempotency
  - retry & reconciliation
  - auditability
- Serve as a long-term foundation for expanding compute features without a rewrite
🧠 Architecture Concepts
HCP is not just a reverse proxy in front of hypervisor APIs.
HCP is a stateful control plane that:
- Stores the desired state of each VM
- Manages the lifecycle through a job system
- Abstracts hypervisor differences behind provider adapters
- Produces audit & metering events
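Storing desired state implies comparing it against what the hypervisor actually reports. The following is a minimal Python sketch of that comparison, not the HCP implementation: the state names (RUNNING, STOPPED, DELETED), the action names, and the `reconcile` function are all hypothetical and chosen only for illustration.

```python
def reconcile(desired, observed):
    """Return (vm_id, action) pairs that move observed state toward desired.

    desired:  vm_id -> wanted state (hypothetical: RUNNING, STOPPED, DELETED)
    observed: vm_id -> state reported by the provider; missing means "not found"
    """
    actions = []
    for vm_id, want in sorted(desired.items()):
        have = observed.get(vm_id)
        if want == "DELETED":
            # Only delete VMs that still exist on the provider side.
            if have is not None:
                actions.append((vm_id, "delete"))
        elif have is None:
            actions.append((vm_id, "create"))
        elif want != have:
            actions.append((vm_id, "start" if want == "RUNNING" else "stop"))
    return actions


desired = {"a": "RUNNING", "b": "STOPPED", "c": "DELETED", "d": "RUNNING"}
observed = {"a": "STOPPED", "b": "STOPPED", "c": "RUNNING"}
plan = reconcile(desired, observed)
```

In a real control plane each planned action would become a job rather than a direct provider call, so retries and auditing apply to reconciliation as well.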
Control Plane vs Data Plane
- Control Plane: HCP API + Job orchestration + State
- Data Plane: Hypervisor & provider-specific execution (via adapter/agent)
🧩 Main Components
1. HCP API Service
- Northbound REST API (tenant & ops)
- AuthN/AuthZ enforcement (JWT + RBAC)
- Request validation & idempotency
- Persist VM & job state
- Publish jobs to the queue
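The idempotency responsibility listed above can be illustrated with a small Python sketch. It assumes clients send an idempotency key with each mutating request; the key, the `IdempotencyStore` class, the conflict behavior, and the `create_vm` handler are hypothetical names for illustration. The in-memory dictionary stands in for a persistent table.

```python
import hashlib
import json
import uuid


class IdempotencyConflict(Exception):
    """The same key was reused with a different request body."""


class IdempotencyStore:
    """In-memory stand-in for a persistent idempotency table (hypothetical)."""

    def __init__(self):
        self._seen = {}  # (tenant_id, key) -> (body_hash, response)

    def submit(self, tenant_id, key, body, handler):
        body_hash = hashlib.sha256(
            json.dumps(body, sort_keys=True).encode("utf-8")
        ).hexdigest()
        slot = (tenant_id, key)
        if slot in self._seen:
            stored_hash, response = self._seen[slot]
            if stored_hash != body_hash:
                raise IdempotencyConflict(key)
            return response  # replay: the handler is not called again
        response = handler(body)
        self._seen[slot] = (body_hash, response)
        return response


def create_vm(body):
    # Hypothetical handler: a real one would persist state and publish a job.
    return {"vm_id": str(uuid.uuid4()), "job_id": str(uuid.uuid4())}
```

Scoping the key by tenant keeps one tenant's retries from colliding with another tenant's keys.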
2. HCP Worker
- Consume jobs from the queue
- Run the workflow state machine
- Call the provider adapter
- Update VM & job state
- Emit audit & metering events
3. Provider Adapter Layer
- Per-hypervisor driver implementation
- Mapping:
  - generic VM spec → provider API
  - provider error → HCP error taxonomy
- Does not expose provider details to the northbound API
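One way to picture the adapter layer is an abstract contract plus a driver that translates native errors into a shared taxonomy. The Python sketch below is an assumption-heavy illustration: `VMSpec`, `ProviderAdapter`, the error classes and codes, and the `FakeAdapter` driver are all hypothetical and do not reflect any real hypervisor SDK.

```python
from abc import ABC, abstractmethod
from dataclasses import dataclass


@dataclass(frozen=True)
class VMSpec:
    """Generic, provider-neutral VM spec (field names are illustrative)."""
    name: str
    image: str
    flavor: str
    zone: str


class HCPError(Exception):
    """Base of a hypothetical HCP error taxonomy."""
    code = "INTERNAL"
    retryable = False


class ProviderUnavailable(HCPError):
    code = "PROVIDER_UNAVAILABLE"
    retryable = True


class QuotaExceeded(HCPError):
    code = "QUOTA_EXCEEDED"


class NativeNoCapacity(Exception):
    """Stand-in for an error raised by a provider's own client library."""


class ProviderAdapter(ABC):
    @abstractmethod
    def create_vm(self, spec: VMSpec) -> str:
        """Create the VM and return an opaque provider reference."""

    @abstractmethod
    def delete_vm(self, provider_ref: str) -> None:
        """Delete the VM identified by the provider reference."""


class FakeAdapter(ProviderAdapter):
    """Toy in-memory driver used in place of a real hypervisor client."""

    def __init__(self, capacity=1):
        self._capacity = capacity
        self._vms = {}

    def _native_create(self, spec):
        if len(self._vms) >= self._capacity:
            raise NativeNoCapacity("no capacity left")
        ref = f"fake-{len(self._vms) + 1}"
        self._vms[ref] = spec
        return ref

    def create_vm(self, spec):
        # Translate native errors so callers only ever see HCPError subclasses.
        try:
            return self._native_create(spec)
        except ConnectionError as exc:
            raise ProviderUnavailable(str(exc)) from exc
        except NativeNoCapacity as exc:
            raise QuotaExceeded(str(exc)) from exc

    def delete_vm(self, provider_ref):
        self._vms.pop(provider_ref, None)
```

The worker can then decide whether to retry by inspecting `retryable` instead of knowing anything about the provider that raised the error.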
4. Datastore & Queue
- PostgreSQL: VM state, job state, catalog, placement
- Queue/Stream: NATS JetStream / RabbitMQ
- Audit/Event Store: append-only
🔌 Provider-Agnostic Design
Northbound API (Stable)
The HCP API never exposes:
- hypervisor node / host
- provider vmid / moid / UUID
- datastore / resource pool
Tenants interact only with:
- image
- flavor
- network attachment
- placement (zone/location)
- metadata/tags
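A request validator is one place where this boundary could be enforced. The following Python sketch rejects any field outside the tenant-facing set; the exact field names (`name`, `networks`, `placement`, `metadata`) and which fields are required are assumptions, not the documented request schema.

```python
# Tenant-facing fields only; names are illustrative assumptions.
ALLOWED_FIELDS = {"name", "image", "flavor", "networks", "placement", "metadata"}
REQUIRED_FIELDS = {"image", "flavor"}


def validate_create_request(body: dict) -> dict:
    """Return a normalized spec, or raise ValueError for invalid input."""
    unknown = set(body) - ALLOWED_FIELDS
    if unknown:
        # Provider-specific keys such as "node" or "datastore" end up here.
        raise ValueError(f"unsupported fields: {sorted(unknown)}")
    missing = REQUIRED_FIELDS - set(body)
    if missing:
        raise ValueError(f"missing fields: {sorted(missing)}")
    return {
        "name": body.get("name", ""),
        "image": body["image"],
        "flavor": body["flavor"],
        "networks": list(body.get("networks", [])),
        "placement": dict(body.get("placement", {})),
        "metadata": dict(body.get("metadata", {})),
    }
```

Rejecting unknown fields outright, rather than silently dropping them, makes it obvious to a caller that provider-level placement is not part of the contract.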
Southbound Providers (Pluggable)
Each hypervisor is integrated through a provider adapter with a consistent contract.
🔁 Job & Workflow Model
All operations that affect infrastructure run as async jobs.
Examples:
- create VM
- start / stop / reboot
- delete VM
- request console
Standard Pattern
- API request is received
- Desired state is stored (PENDING)
- Job is created & published
- Worker executes via the provider
- State is updated (ACTIVE / ERROR)
- Audit & events are emitted
The API returns:
202 Accepted
{
"resource_id": "...",
"job_id": "..."
}
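The whole flow can be sketched end to end in Python with in-memory stand-ins. Here dictionaries replace PostgreSQL, `queue.Queue` replaces the message queue, and a list replaces the audit store. The job status names (QUEUED, SUCCEEDED, FAILED) and both function names are assumptions; only PENDING, ACTIVE, and ERROR come from the pattern described here.

```python
import queue
import uuid

vms = {}                   # vm_id -> {"state": ..., "spec": ...}
jobs = {}                  # job_id -> {"vm_id": ..., "status": ...}
audit_log = []             # append-only list of audit events
job_queue = queue.Queue()  # stand-in for the real queue/stream


def api_create_vm(spec):
    """API side: store desired state, create and publish a job, return 202."""
    vm_id, job_id = str(uuid.uuid4()), str(uuid.uuid4())
    vms[vm_id] = {"state": "PENDING", "spec": spec}
    jobs[job_id] = {"vm_id": vm_id, "status": "QUEUED"}
    job_queue.put(job_id)
    return 202, {"resource_id": vm_id, "job_id": job_id}


def worker_run_once(provision):
    """Worker side: take one job, call the provider, record the outcome."""
    job_id = job_queue.get_nowait()
    job = jobs[job_id]
    vm = vms[job["vm_id"]]
    try:
        provision(vm["spec"])  # provider adapter call in a real worker
        vm["state"], job["status"] = "ACTIVE", "SUCCEEDED"
    except Exception as exc:
        vm["state"], job["status"] = "ERROR", "FAILED"
        job["error"] = str(exc)
    audit_log.append(
        {"job_id": job_id, "vm_id": job["vm_id"], "result": job["status"]}
    )
```

The API call returns before any provider work happens, which is why the response carries a `job_id` for the caller to track.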
🌐 API Namespace
Calling through the Central API Gateway is recommended, but HCP still enforces its own guardrails.
Tenant
/api/hcp/tenant/v1/...
Operations / Provider
/api/hcp/ops/v1/...
Common (read-only)
/api/hcp/common/v1/...
🧪 Key Operation Examples
Create VM
POST /api/hcp/tenant/v1/projects/{projectId}/vms
Response:
202 Accepted
{
"vm_id": "...",
"job_id": "..."
}
Get Job Status
GET /api/hcp/tenant/v1/jobs/{jobId}
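Because operations are asynchronous, a client typically polls this endpoint until the job finishes. The Python sketch below shows one way to do that with a timeout. It assumes the job response contains a `status` field with terminal values SUCCEEDED and FAILED, which this README does not specify, and `fetch_job` is a hypothetical callable that performs the GET request.

```python
import time

TERMINAL = {"SUCCEEDED", "FAILED"}  # assumed terminal status names


def wait_for_job(fetch_job, job_id, timeout=60.0, interval=1.0, sleep=time.sleep):
    """Poll fetch_job(job_id) until the job is terminal or the timeout expires."""
    deadline = time.monotonic() + timeout
    while True:
        job = fetch_job(job_id)
        if job["status"] in TERMINAL:
            return job
        if time.monotonic() >= deadline:
            raise TimeoutError(f"job {job_id} still {job['status']}")
        sleep(interval)
```

Passing `fetch_job` and `sleep` as parameters keeps the helper independent of any HTTP library and easy to exercise in tests.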
Request Console
POST /api/hcp/tenant/v1/projects/{projectId}/vms/{vmId}:console
Response:
{
"type": "vnc | spice | web | rdp",
"url": "...",
"expires_at": "..."
}
🧠 Capability Negotiation
Each provider/cluster has capability flags, for example:
- supports_cloud_init
- supports_snapshot
- supports_live_migration
- supports_console_vnc
- supports_gpu_passthrough
If a feature is not available, the API returns:
409 FEATURE_NOT_SUPPORTED
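A capability check of this kind might look like the following Python sketch. The cluster names, the flag sets assigned to them, the operation names, and the response body shape are invented example data, not statements about what any real hypervisor supports.

```python
# Example data only: which flags each (hypothetical) cluster advertises.
CLUSTER_CAPABILITIES = {
    "cluster-a": {"supports_cloud_init", "supports_snapshot", "supports_console_vnc"},
    "cluster-b": {"supports_live_migration"},
}

# Hypothetical mapping from an API operation to the flag it requires.
REQUIRED_CAPABILITY = {
    "snapshot": "supports_snapshot",
    "live_migrate": "supports_live_migration",
    "console_vnc": "supports_console_vnc",
}


def check_capability(cluster, operation):
    """Return None if allowed, else a (status, body) error tuple."""
    flag = REQUIRED_CAPABILITY[operation]
    if flag not in CLUSTER_CAPABILITIES.get(cluster, set()):
        return 409, {"error": "FEATURE_NOT_SUPPORTED", "capability": flag}
    return None
```

Running the check in the API layer, before a job is created, avoids queuing work that the target cluster can never complete.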
🔐 Security Model
- JWT-based authentication
- RBAC enforcement at the API level
- Strict tenant vs ops boundary
- Encrypted provider credentials
- Short-lived console sessions
- All actions are recorded in the audit log
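The tenant vs ops boundary can be illustrated with a small authorization check over the API namespaces. This Python sketch assumes the JWT has already been verified and decoded into a `claims` dictionary; the role names, the `role` and `projects` claim names, and the `authorize` function are hypothetical.

```python
# Hypothetical role-to-namespace mapping.
ROLE_NAMESPACES = {
    "tenant_user": {"tenant", "common"},
    "ops_admin": {"ops", "common"},
}


def authorize(claims, method, path):
    """Return True if verified JWT claims permit this method and path."""
    parts = path.strip("/").split("/")
    # Expected shape: api/hcp/<namespace>/v1/...
    if len(parts) < 4 or parts[:2] != ["api", "hcp"]:
        return False
    namespace = parts[2]
    if namespace not in ROLE_NAMESPACES.get(claims.get("role"), set()):
        return False
    if namespace == "common" and method not in ("GET", "HEAD"):
        return False  # the common namespace is read-only
    if namespace == "tenant" and len(parts) > 5 and parts[4] == "projects":
        # Project isolation: the caller must belong to the project in the path.
        return parts[5] in claims.get("projects", [])
    return True
```

The check denies by default: an unknown role, an unexpected path shape, or a missing claim all result in `False`.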
📊 Observability
HCP provides:
- Metrics: request rate, latency, job success/failure
- Structured logs: trace_id, job_id, vm_id
- Distributed tracing (OpenTelemetry-ready)
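Structured logs carrying `trace_id`, `job_id`, and `vm_id` can be produced with the Python standard library alone. The sketch below is one possible approach, not HCP's logging setup; the `JsonFormatter` class, the `build_logger` helper, the logger name, and the JSON layout are assumptions.

```python
import io
import json
import logging


class JsonFormatter(logging.Formatter):
    """Render each record as one JSON object with correlation fields."""

    FIELDS = ("trace_id", "job_id", "vm_id")

    def format(self, record):
        entry = {"level": record.levelname, "message": record.getMessage()}
        for field in self.FIELDS:
            # Values passed through `extra=` become attributes on the record.
            value = getattr(record, field, None)
            if value is not None:
                entry[field] = value
        return json.dumps(entry, sort_keys=True)


def build_logger(stream):
    logger = logging.getLogger("hcp.worker")  # hypothetical logger name
    logger.handlers.clear()
    handler = logging.StreamHandler(stream)
    handler.setFormatter(JsonFormatter())
    logger.addHandler(handler)
    logger.setLevel(logging.INFO)
    logger.propagate = False
    return logger


# Usage: write one entry to an in-memory stream and read it back.
stream = io.StringIO()
log = build_logger(stream)
log.info("vm created", extra={"trace_id": "t-1", "job_id": "j-1", "vm_id": "v-1"})
line = stream.getvalue().strip()
```

Emitting the same three identifiers on every line lets a single job be followed across the API service and the worker.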
🗂️ Repository Structure (Suggested)
hcp/
├── cmd/
│ ├── api/
│ └── worker/
├── internal/
│ ├── api/
│ ├── auth/
│ ├── jobs/
│ ├── providers/
│ │ ├── proxmox/
│ │ ├── vsphere/
│ │ ├── libvirt/
│ │ └── hyperv/
│ ├── reconcile/
│ └── audit/
├── pkg/
│ └── models/
├── docs/
│ ├── HCP_SRS_v1.md
│ └── HCP_SDS_v1.md
└── README.md
📄 Documentation
- SRS: docs/HCP_SRS_v1.md
- SDS: docs/HCP_SDS_v1.md
These documents are the authoritative reference for HCP design and implementation.
🚀 Brief Roadmap
V1
- Proxmox provider
- VM lifecycle
- Async job & audit
- Console abstraction
Post-V1
- vSphere provider
- Libvirt/KVM provider
- Hyper-V provider
- Snapshot & migration
- Policy-based placement
🧭 Philosophy
HCP is not built to support one hypervisor.
HCP is built so that hypervisors can come and go.