feat(lab-03): complete Phase 4 - Compute & EC2 lab

Phase Plans (5 files):
- 04-RESEARCH.md: Domain research on Docker limits, healthchecks, EC2 parallels
- 04-VALIDATION.md: Success criteria and validation strategy
- 04-01-PLAN.md: Test infrastructure (RED phase)
- 04-02-PLAN.md: Diátaxis documentation
- 04-03-PLAN.md: Infrastructure implementation (GREEN phase)

Test Scripts (7 files, 1300+ lines):
- 01-resource-limits-test.sh: Validate INF-03 compliance
- 02-healthcheck-test.sh: Validate healthcheck configuration
- 03-enforcement-test.sh: Verify resource limits with docker stats
- 04-verify-infrastructure.sh: Infrastructure verification
- 99-final-verification.sh: End-to-end student verification
- run-all-tests.sh: Test orchestration with fail-fast
- quick-test.sh: Fast validation (<30s)

Documentation (11 files, 2500+ lines):
Tutorials (3):
- 01-set-resource-limits.md: EC2 instance types, Docker limits syntax
- 02-implement-healthchecks.md: ELB health check parallels
- 03-dependencies-with-health.md: depends_on with service_healthy

How-to Guides (4):
- check-resource-usage.md: docker stats monitoring
- test-limits-enforcement.md: Stress testing CPU/memory
- custom-healthcheck.md: HTTP, TCP, database healthchecks
- instance-type-mapping.md: Docker limits → EC2 mapping

Reference (3):
- compose-resources-syntax.md: Complete deploy.resources reference
- healthcheck-syntax.md: All healthcheck parameters
- ec2-instance-mapping.md: Instance type mapping table

Explanation (1):
- compute-ec2-parallels.md: Container=EC2, Limits=Instance Type, Healthcheck=ELB

Infrastructure:
- docker-compose.yml: 5 services (web, app, worker, db, stress-test)
  All services: INF-03 compliant (cpus + memory limits)
  All services: healthcheck configured
  EC2 parallels: t2.nano, t2.micro, t2.small, t2.medium, m5.large
- Dockerfile: Alpine 3.19 + stress tools + non-root user

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Luca Sacchi Ricciardi
2026-04-03 15:16:58 +02:00
parent 39b9a56850
commit 23a9ffe443
26 changed files with 5457 additions and 1 deletions

@@ -0,0 +1,209 @@
# Phase 4 Research - Lab 03: Compute & EC2
## Domain Research: Docker Resource Limits & Healthchecks
### 1. Docker Compose Resource Limits
**CPU Limits:**
```yaml
services:
  app:
    deploy:
      resources:
        limits:
          cpus: '0.5'    # 50% of 1 CPU core
          # cpus: '2'    # alternative: 2 full CPU cores
```
**Memory Limits:**
```yaml
services:
  app:
    deploy:
      resources:
        limits:
          memory: 512M    # 512 MB
          # memory: 2G    # alternative: 2 GB
```
**Memory Limit with Reservation:**
```yaml
services:
  app:
    deploy:
      resources:
        limits:
          memory: 512M    # hard ceiling
        reservations:
          memory: 256M    # soft guarantee
```
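Docker reads these size suffixes as binary multiples, which matters when a Compose value is compared with the byte count that `docker inspect` reports. A minimal sketch of the conversion follows; the helper name `mem_to_bytes` is hypothetical and it handles only a single-letter `K`, `M`, or `G` suffix.

```shell
# Convert a Compose-style memory value (e.g. 512M, 2G) to bytes.
# Docker treats these suffixes as binary multiples (1M = 1024 * 1024 bytes).
# mem_to_bytes is a hypothetical helper name, not part of Docker.
mem_to_bytes() (
  value=$1
  number=${value%[KkMmGg]}   # strip one trailing unit letter, if present
  case $value in
    *[Kk]) echo $((number * 1024)) ;;
    *[Mm]) echo $((number * 1024 * 1024)) ;;
    *[Gg]) echo $((number * 1024 * 1024 * 1024)) ;;
    *)     echo "$number" ;;  # no suffix: value is already in bytes
  esac
)

mem_to_bytes 512M   # prints 536870912
mem_to_bytes 2G     # prints 2147483648
```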
### 2. EC2 Instance Types Parallel
| Docker Limits | EC2 Parallel | Instance Family |
|---------------|--------------|-----------------|
| cpus: '0.5', memory: 512M | t2.nano (1 vCPU, 0.5 GiB) | Burstable |
| cpus: '1', memory: 1G | t2.micro (1 vCPU, 1 GiB) | Burstable |
| cpus: '1', memory: 2G | t2.small (1 vCPU, 2 GiB) | Burstable |
| cpus: '2', memory: 4G | t2.medium (2 vCPU, 4 GiB) | Burstable |
| cpus: '2', memory: 8G | m5.large (2 vCPU, 8 GiB) | General Purpose |
| cpus: '4', memory: 16G | m5.xlarge (4 vCPU, 16 GiB) | General Purpose |
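**Key Parallels:**
- Docker CPU limits ≈ AWS vCPU count (fractional limits have no exact EC2 equivalent; t2.nano has 1 burstable vCPU, so `cpus: '0.5'` is an approximation)
- Docker memory limits ≈ AWS instance memory
- A hard memory ceiling with no swap to spill into ≈ the fixed RAM of an instance type
- Resource reservations ≈ the capacity an instance type guarantees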
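The table above can be expressed as a small lookup, which is handy when a test needs to print the EC2 parallel for a service. This is only a sketch: `ec2_parallel` is a hypothetical helper, and it knows just the six pairings listed in this section.

```shell
# Look up the EC2 instance type this lab pairs with a cpus/memory limit.
# ec2_parallel is a hypothetical helper; the pairings come from the table above.
ec2_parallel() (
  case "$1/$2" in
    0.5/512M) echo t2.nano ;;
    1/1G)     echo t2.micro ;;
    1/2G)     echo t2.small ;;
    2/4G)     echo t2.medium ;;
    2/8G)     echo m5.large ;;
    4/16G)    echo m5.xlarge ;;
    *)        echo "no mapping for cpus=$1 memory=$2" >&2; exit 1 ;;
  esac
)

ec2_parallel 0.5 512M   # prints t2.nano
ec2_parallel 2 8G       # prints m5.large
```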
### 3. Healthcheck Implementation
**Docker Compose Healthcheck:**
```yaml
services:
  web:
    image: nginx:alpine
    healthcheck:
      test: ["CMD", "wget", "--quiet", "--tries=1", "--spider", "http://localhost:80"]
      interval: 10s      # Check every 10 seconds
      timeout: 5s        # Timeout after 5 seconds
      retries: 3         # Mark unhealthy after 3 failures
      start_period: 10s  # Grace period on startup
```
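The `interval`, `timeout`, and `retries` values together determine how long a failing service can go unnoticed. As a rough upper bound, assuming every failing check runs until its timeout and checks are spaced one interval apart, the delay is about `retries * (interval + timeout)` seconds. The helper name `detection_window` is hypothetical.

```shell
# Rough upper bound, in seconds, before a failing container is marked unhealthy.
# Assumes each failing check uses its full timeout; start_period is not included.
# detection_window is a hypothetical helper name.
detection_window() (
  interval=$1
  timeout=$2
  retries=$3
  echo $((retries * (interval + timeout)))
)

detection_window 10 5 3   # prints 45 for the web example above
```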
**Healthcheck with curl:**
```yaml
healthcheck:
  test: ["CMD-SHELL", "curl -f http://localhost/ || exit 1"]
  interval: 30s
  timeout: 10s
  retries: 3
  start_period: 5s
```
**Database Healthcheck:**
```yaml
db:
  image: postgres:16-alpine
  healthcheck:
    test: ["CMD-SHELL", "pg_isready -U postgres || exit 1"]
    interval: 10s
    timeout: 5s
    retries: 5
    start_period: 10s
```
**Application Healthcheck:**
```yaml
app:
  image: myapp:latest
  healthcheck:
    test: ["CMD-SHELL", "node healthcheck.js || exit 1"]
    interval: 15s
    timeout: 3s
    retries: 3
    start_period: 30s
```
### 4. Service Dependencies with Health
**Wait for healthy service:**
```yaml
services:
  app:
    depends_on:
      db:
        condition: service_healthy
      redis:
        condition: service_started
```
**Conditions:**
1. `service_started`: The dependency's container has started (default)
2. `service_healthy`: The dependency's healthcheck is passing (requires a `healthcheck` section on that service)
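Outside Compose, the `service_healthy` wait can be approximated by polling the health status that `docker inspect` exposes. The sketch below assumes a container name such as `lab03-db` (not defined in this document) and a one-second polling interval; `wait_healthy` is a hypothetical helper.

```shell
# Poll a container's health status until it is healthy or attempts run out.
# wait_healthy is a hypothetical helper; it polls once per second.
wait_healthy() (
  name=$1
  attempts=${2:-30}
  while [ "$attempts" -gt 0 ]; do
    status=$(docker inspect --format '{{.State.Health.Status}}' "$name" 2>/dev/null)
    [ "$status" = "healthy" ] && exit 0
    attempts=$((attempts - 1))
    sleep 1
  done
  echo "$name did not become healthy (last status: ${status:-unknown})" >&2
  exit 1
)

# Example (container name is an assumption):
#   wait_healthy lab03-db 60 && echo "db is ready"
```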
### 5. Resource Monitoring
**docker stats:**
```bash
docker stats --no-stream # Single snapshot
docker stats lab03-app --format "table {{.Container}}\t{{.CPUPerc}}\t{{.MemUsage}}"
```
**Inspect limits:**
```bash
docker inspect lab03-app --format '{{.HostConfig.Memory}}' # Memory limit in bytes
docker inspect lab03-app --format '{{.HostConfig.NanoCpus}}' # CPU quota (1e9 = 1 CPU)
```
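Both values come back in raw units, so a test usually converts them before comparing against the Compose file. A small sketch of that conversion, with hypothetical helper names `nano_to_cpus` and `bytes_to_mib`:

```shell
# Convert the NanoCpus value from docker inspect to a CPU count (1e9 = 1 CPU).
nano_to_cpus() (
  awk -v n="$1" 'BEGIN { printf "%.2f\n", n / 1000000000 }'
)

# Convert the Memory value from docker inspect (bytes) to whole MiB.
bytes_to_mib() (
  echo $(($1 / 1024 / 1024))
)

nano_to_cpus 500000000   # prints 0.50
bytes_to_mib 536870912   # prints 512
```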
## Testing Strategy
### Test Scenarios
1. **Resource Limit Enforcement:**
   - Deploy container with CPU limit (e.g., 0.5 CPU)
   - Run CPU-intensive task
   - Verify with `docker stats` that CPU usage doesn't exceed limit
2. **Memory Limit Enforcement:**
   - Deploy container with memory limit (e.g., 512M)
   - Run memory allocation task
   - Verify container is OOM killed when exceeding limit
3. **Healthcheck Validation:**
   - Deploy service with healthcheck
   - Verify status transitions: starting → healthy
   - Verify `depends_on: condition: service_healthy` waits
4. **Resource Verification:**
   - Parse docker-compose.yml for `deploy.resources.limits`
   - Verify all services have mandatory limits
   - Report missing limits
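For the first scenario, the `CPUPerc` column of `docker stats` is relative to one core, so a `cpus: '0.5'` limit should hold readings near 50%. The sketch below checks one reading against a limit; `cpu_within_limit` is a hypothetical helper, and the default tolerance of 5 percentage points is an assumption to absorb sampling noise.

```shell
# Succeed when a CPUPerc reading (e.g. "49.87%") is at or below the configured
# cpus limit, allowing a tolerance in percentage points (default 5, an assumption).
cpu_within_limit() (
  reading=${1%"%"}   # drop the trailing percent sign
  limit_cpus=$2
  tolerance=${3:-5}
  awk -v r="$reading" -v l="$limit_cpus" -v t="$tolerance" \
    'BEGIN { exit !(r <= l * 100 + t) }'
)

# Example against a running container:
#   cpu_within_limit "$(docker stats --no-stream --format '{{.CPUPerc}}' lab03-app)" 0.5
```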
## Security Requirements
**INF-03: Mandatory Resource Limits**
- Every container MUST have `cpus` limit specified
- Every container MUST have `memory` limit specified
- No container can run with unlimited resources (security risk)
**Safety First:**
- Resource limits prevent DoS from runaway processes
- Memory limits prevent host OOM
- CPU limits ensure fair resource sharing
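INF-03 can also be checked against running containers, since `docker inspect` reports `0` for `NanoCpus` or `Memory` when no limit is set. A minimal sketch, assuming the container is already running; `has_limits` is a hypothetical helper and not one of the lab's test scripts.

```shell
# Report whether a running container has both a CPU and a memory limit.
# Docker reports 0 for either value when no limit is set.
# has_limits is a hypothetical helper name.
has_limits() (
  name=$1
  nano_cpus=$(docker inspect --format '{{.HostConfig.NanoCpus}}' "$name") || exit 2
  memory=$(docker inspect --format '{{.HostConfig.Memory}}' "$name") || exit 2
  if [ "$nano_cpus" -gt 0 ] && [ "$memory" -gt 0 ]; then
    echo "$name: OK (NanoCpus=$nano_cpus, Memory=$memory)"
  else
    echo "$name: INF-03 violation (NanoCpus=$nano_cpus, Memory=$memory)" >&2
    exit 1
  fi
)

# Example: has_limits lab03-app
```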
## Cloud Parallels
### Docker → AWS EC2
| Docker | AWS EC2 |
|--------|---------|
| `cpus: '0.5'` | t2.nano in this lab's mapping (1 burstable vCPU) |
| `memory: 512M` | 512 MB RAM |
| `healthcheck` | EC2 Status Checks + ELB Health |
| `docker stats` | CloudWatch Metrics |
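| `OOM kill` | Kernel OOM killer ending a process when instance RAM is exhausted (running out of CPU credits throttles a burstable instance; it does not terminate it) |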
| `depends_on: healthy` | Auto Scaling Group health checks |
### Instance Type Selection
**Burstable Instances (t2/t3):**
- Credit-based CPU
- Good for dev/test
- Docker: Small limits with occasional bursts
**General Purpose (m5):**
- Balanced compute/memory
- Docker: Medium limits (2-4 vCPU, 8-16 GB)
**Compute Optimized (c5):**
- High CPU-to-memory ratio
- Docker: High CPU limits (4+ vCPU, lower memory)
## Sources
- [Docker Compose Resources](https://docs.docker.com/compose/compose-file/compose-file-v3/#resources)
- [Docker Healthcheck](https://docs.docker.com/engine/reference/builder/#healthcheck)
- [AWS EC2 Instance Types](https://aws.amazon.com/ec2/instance-types/)
- [Docker Stats](https://docs.docker.com/engine/reference/commandline/stats/)