add srs and sds documents

Othman H. Suseno
2025-12-30 13:00:47 +07:00
commit eefa9d7035
4 changed files with 716 additions and 0 deletions

srs-sds/SDS_v1.md Normal file

@@ -0,0 +1,160 @@
# Cloud Infrastructure Management Platform
## Software Design Specification (SDS)
**Version: 1.0 (V1 Enterprise Foundation)**
---
## 1. Architectural Overview
The platform adopts a Control Plane and Data Plane architecture.
- The Control Plane manages APIs, identity, orchestration, policy, and state.
- The Data Plane executes infrastructure operations via agents and providers.
---
## 2. High-Level Components
### 2.1 Management Layer
- Tenant Management Console
- Provider / Operations Console

In Version 1, both consoles MAY be implemented as a single UI with strict role-based access control.

---
### 2.2 API Gateway
Responsibilities:
- Authentication and authorization
- API namespace separation
- Request validation and rate limiting
- Centralized audit logging hook
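
The responsibilities above can be sketched as a single request pipeline. This is a minimal illustration, not the platform's gateway: the `TokenBucket` limiter, the `handle_request` function, the order of the checks, and the status codes are all assumptions made for the example.

```python
import time
from typing import Callable, Dict, List


class TokenBucket:
    """Hypothetical rate limiter: holds up to `capacity` tokens, refilled at `rate` per second."""

    def __init__(self, capacity: float, rate: float,
                 clock: Callable[[], float] = time.monotonic):
        self.capacity = capacity
        self.rate = rate
        self.clock = clock
        self.tokens = capacity
        self.last = clock()

    def allow(self) -> bool:
        now = self.clock()
        # Refill in proportion to elapsed time, never above capacity.
        self.tokens = min(self.capacity, self.tokens + (now - self.last) * self.rate)
        self.last = now
        if self.tokens >= 1:
            self.tokens -= 1
            return True
        return False


def handle_request(token: str, path: str, valid_tokens: Dict[str, str],
                   bucket: TokenBucket, audit_log: List[dict]) -> int:
    """Return an HTTP-style status code and record every decision in the audit log."""
    principal = valid_tokens.get(token)
    if principal is None:
        status = 401          # authentication failed
    elif not path.startswith("/api/"):
        status = 400          # request validation failed
    elif not bucket.allow():
        status = 429          # rate limit exceeded
    else:
        status = 200          # would be forwarded to a core service
    # Centralized audit hook: one entry per request, accepted or rejected.
    audit_log.append({"principal": principal, "path": path, "status": status})
    return status
```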
---
### 2.3 Core Services
| Service | Responsibility |
|-------|----------------|
| Identity Service | Users, roles, RBAC |
| Resource Manager | Projects, quotas, metadata |
| Compute Service | Virtual machine lifecycle |
| Network Service | Virtual network management |
| Storage Service | Volume or object storage |
| Job Service | Workflow orchestration and retries |
| Audit Service | Append-only audit logging |
| Metering Service | Usage aggregation |
---
## 3. Data Model Overview
### Core Entities
- Organization
- Project
- User
- Role
- Role Binding
- Virtual Machine
- Network
- Volume / Bucket
- Job
- Audit Event
- Quota
- Provider
### Common Resource Attributes
```
id
organization_id
project_id
name
status
labels
provider_reference
created_at
updated_at
```
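One way to carry these attributes in code is a shared base record. The sketch below is illustrative only: the field names come from the list above, while the types, the default `PENDING` status, the UUID identifier, and the `touch` helper are assumptions.

```python
import uuid
from dataclasses import dataclass, field
from datetime import datetime, timezone
from typing import Dict, Optional


def _utc_now() -> datetime:
    return datetime.now(timezone.utc)


@dataclass
class Resource:
    """Common attributes shared by resources such as VMs, networks, and volumes."""
    organization_id: str
    project_id: str
    name: str
    status: str = "PENDING"                       # assumed initial status
    labels: Dict[str, str] = field(default_factory=dict)
    provider_reference: Optional[str] = None      # set once the provider creates it
    id: str = field(default_factory=lambda: str(uuid.uuid4()))
    created_at: datetime = field(default_factory=_utc_now)
    updated_at: datetime = field(default_factory=_utc_now)

    def touch(self, status: str) -> None:
        """Hypothetical helper: change status and bump updated_at together."""
        self.status = status
        self.updated_at = _utc_now()
```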
---
## 4. API Design Principles
- REST-based APIs
- Versioned endpoints
- Clear separation between tenant and provider APIs
### Namespace Examples
- /api/tenant/v1/*
- /api/ops/v1/*
- /api/common/v1/*
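
As a rough illustration of namespace separation, a request path can be split into namespace, version, and remainder before any role check. The role names and the `ALLOWED_ROLES` mapping below are hypothetical; only the three namespace prefixes come from this document.

```python
import re
from typing import Optional, Tuple

# Matches the three namespaces listed above, e.g. /api/tenant/v1/vms/42
_NAMESPACE = re.compile(r"^/api/(tenant|ops|common)/v(\d+)/(.+)$")

# Hypothetical role names; the specification does not define them.
ALLOWED_ROLES = {
    "tenant": {"tenant_user", "tenant_admin"},
    "ops": {"operator"},
    "common": {"tenant_user", "tenant_admin", "operator"},
}


def parse_path(path: str) -> Optional[Tuple[str, int, str]]:
    """Return (namespace, version, remainder), or None for an unknown path."""
    match = _NAMESPACE.match(path)
    if match is None:
        return None
    return match.group(1), int(match.group(2)), match.group(3)


def may_access(role: str, path: str) -> bool:
    """Allow a request only if the role is permitted in the path's namespace."""
    parsed = parse_path(path)
    if parsed is None:
        return False
    namespace, _version, _rest = parsed
    return role in ALLOWED_ROLES[namespace]
```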
---
## 5. Job & Workflow Design
### Job Lifecycle States
- PENDING
- RUNNING
- SUCCEEDED
- FAILED
- RETRYING
### Design Characteristics
- Idempotent create operations
- Retry for transient failures only
- Persistent job state storage
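
The lifecycle and design characteristics above can be sketched as a small state machine. Everything beyond the five state names is an assumption: the allowed transitions, the `TransientError` marker, the attempt limit, and the in-memory `JobStore`, which stands in for persistent storage.

```python
from dataclasses import dataclass
from enum import Enum
from typing import Any, Callable, Dict


class JobState(Enum):
    PENDING = "PENDING"
    RUNNING = "RUNNING"
    SUCCEEDED = "SUCCEEDED"
    FAILED = "FAILED"
    RETRYING = "RETRYING"


# Assumed transition table; the specification lists states but not edges.
TRANSITIONS = {
    JobState.PENDING: {JobState.RUNNING},
    JobState.RUNNING: {JobState.SUCCEEDED, JobState.FAILED, JobState.RETRYING},
    JobState.RETRYING: {JobState.RUNNING, JobState.FAILED},
    JobState.SUCCEEDED: set(),
    JobState.FAILED: set(),
}


class TransientError(Exception):
    """Hypothetical marker for failures worth retrying, such as a timeout."""


@dataclass
class Job:
    key: str
    state: JobState = JobState.PENDING
    attempts: int = 0
    result: Any = None

    def move(self, new_state: JobState) -> None:
        if new_state not in TRANSITIONS[self.state]:
            raise ValueError(f"illegal transition {self.state.name} -> {new_state.name}")
        self.state = new_state


class JobStore:
    """In-memory stand-in for persistent job state storage."""

    def __init__(self) -> None:
        self._jobs: Dict[str, Job] = {}

    def create(self, key: str) -> Job:
        # Idempotent create: the same key always yields the same job.
        return self._jobs.setdefault(key, Job(key))


def run(job: Job, operation: Callable[[], Any], max_attempts: int = 3) -> Job:
    """Run the operation, retrying transient failures only."""
    job.move(JobState.RUNNING)
    while True:
        job.attempts += 1
        try:
            job.result = operation()
        except TransientError:
            if job.attempts >= max_attempts:
                break                      # retries exhausted
            job.move(JobState.RETRYING)
            job.move(JobState.RUNNING)
        except Exception:
            break                          # permanent failure: no retry
        else:
            job.move(JobState.SUCCEEDED)
            return job
    job.move(JobState.FAILED)
    return job
```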
---
## 6. Provider & Agent Architecture
### Provider Interfaces
- Compute Provider
- Network Provider
- Storage Provider
### Agent Responsibilities
- Execute infrastructure-level operations
- Report actual state to the control plane
- Emit audit and telemetry data
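
A provider interface and an agent that wraps it might look like the following. This is a sketch under stated assumptions: the method names `create_vm` and `get_vm_status`, the `InMemoryComputeProvider` fake, and the callback-based reporting are invented for illustration and are not defined by this specification.

```python
from abc import ABC, abstractmethod
from typing import Callable, Dict


class ComputeProvider(ABC):
    """Assumed shape of the Compute Provider interface."""

    @abstractmethod
    def create_vm(self, name: str) -> str:
        """Create a VM and return the provider's reference for it."""

    @abstractmethod
    def get_vm_status(self, provider_reference: str) -> str:
        """Return the actual status of the VM as seen by the provider."""


class InMemoryComputeProvider(ComputeProvider):
    """Fake provider used only to make the sketch runnable."""

    def __init__(self) -> None:
        self._vms: Dict[str, str] = {}

    def create_vm(self, name: str) -> str:
        ref = f"fake-{len(self._vms) + 1}"
        self._vms[ref] = "RUNNING"
        return ref

    def get_vm_status(self, provider_reference: str) -> str:
        return self._vms[provider_reference]


class Agent:
    """Executes operations, reports actual state, and emits audit events."""

    def __init__(self, provider: ComputeProvider,
                 report_state: Callable[[dict], None],
                 emit_event: Callable[[dict], None]) -> None:
        self.provider = provider
        self.report_state = report_state
        self.emit_event = emit_event

    def create_vm(self, name: str) -> str:
        ref = self.provider.create_vm(name)
        status = self.provider.get_vm_status(ref)
        # Report what the provider actually says, not what was requested.
        self.report_state({"provider_reference": ref, "status": status})
        self.emit_event({"action": "vm.create", "provider_reference": ref})
        return ref
```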
---
## 7. Reconciliation Mechanism
- Periodic reconciliation loop
- Desired state vs actual state comparison
- Drift handling via:
  - Automated correction
  - Operator alert and incident escalation
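
One pass of such a loop can be sketched as below. The data shapes are assumptions: desired and actual state are modeled as plain dictionaries from resource ID to status, and `correct` and `alert` are hypothetical callbacks. A scheduler would call `reconcile_once` at a fixed interval; resources present only in the actual state are ignored in this sketch.

```python
from typing import Callable, Dict, List, Tuple

# (resource_id, desired_status, actual_status); "MISSING" marks an absent resource.
Drift = Tuple[str, str, str]


def find_drift(desired: Dict[str, str], actual: Dict[str, str]) -> List[Drift]:
    """Compare desired state with actual state and list every mismatch."""
    drifts: List[Drift] = []
    for resource_id, want in sorted(desired.items()):
        have = actual.get(resource_id, "MISSING")
        if have != want:
            drifts.append((resource_id, want, have))
    return drifts


def reconcile_once(desired: Dict[str, str], actual: Dict[str, str],
                   correct: Callable[[Drift], bool],
                   alert: Callable[[Drift], None]) -> List[Drift]:
    """Try automated correction first; alert an operator when it fails.

    Returns the drifts that could not be corrected automatically.
    """
    unresolved: List[Drift] = []
    for drift in find_drift(desired, actual):
        if not correct(drift):
            alert(drift)
            unresolved.append(drift)
    return unresolved
```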
---
## 8. Security Architecture
- Token-based authentication
- RBAC enforcement across services
- Encrypted secret storage
- Distributed request tracing
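
Token-based authentication and RBAC enforcement can be illustrated with the standard library alone. This is a deliberately simplified sketch: the token format (`user_id.signature`), the role names, and the permission strings are assumptions, and the token carries no expiry, which a production design would need (typically via an established token format rather than a hand-rolled one).

```python
import hashlib
import hmac
from typing import Optional, Set, Tuple

# Hypothetical roles and permissions for illustration only.
ROLE_PERMISSIONS = {
    "viewer": {"vm:read"},
    "editor": {"vm:read", "vm:write"},
}

# A role binding ties a user to a role within one project.
RoleBinding = Tuple[str, str, str]  # (user_id, project_id, role)


def _signature(user_id: str, secret: bytes) -> str:
    return hmac.new(secret, user_id.encode(), hashlib.sha256).hexdigest()


def issue_token(user_id: str, secret: bytes) -> str:
    """Return a token of the assumed form '<user_id>.<hex HMAC-SHA256>'."""
    return f"{user_id}.{_signature(user_id, secret)}"


def verify_token(token: str, secret: bytes) -> Optional[str]:
    """Return the user ID if the signature is valid, otherwise None."""
    user_id, _, signature = token.rpartition(".")
    if not user_id:
        return None
    expected = _signature(user_id, secret)
    # Constant-time comparison avoids leaking how many characters matched.
    if hmac.compare_digest(signature.encode(), expected.encode()):
        return user_id
    return None


def is_allowed(bindings: Set[RoleBinding], user_id: str,
               project_id: str, permission: str) -> bool:
    """Check whether any of the user's roles in the project grants the permission."""
    return any(
        permission in ROLE_PERMISSIONS.get(role, set())
        for (bound_user, bound_project, role) in bindings
        if bound_user == user_id and bound_project == project_id
    )
```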
---
## 9. Deployment Model (V1)
- Stateless API services
- PostgreSQL as primary datastore
- Message queue for job distribution
- Agent deployment per infrastructure cluster
---
## 10. Future Evolution
- Multi-cluster federation
- Kubernetes services
- Policy-as-Code
- Billing and invoicing
- Application marketplace
---