# Phase 4 Research - Lab 03: Compute & EC2

## Domain Research: Docker Resource Limits & Healthchecks

### 1. Docker Compose Resource Limits

**CPU Limits:**
```yaml
services:
  app:
    deploy:
      resources:
        limits:
          cpus: '0.5'    # 50% of one CPU core
          # cpus: '2'    # Alternative: 2 full CPU cores
```

**Memory Limits:**
```yaml
services:
  app:
    deploy:
      resources:
        limits:
          memory: 512M    # 512 MB
          # memory: 2G    # Alternative: 2 GB
```

**Memory Limit with Reservation:**
```yaml
services:
  app:
    deploy:
      resources:
        limits:
          memory: 512M    # Hard cap: the container cannot exceed this
        reservations:
          memory: 256M    # Soft guarantee: amount set aside for the container
```

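Compose memory values use a suffix form (`512M`, `2G`), while `docker inspect` reports plain bytes, so a small converter makes the two comparable. The following is a minimal sketch, assuming Docker's binary interpretation of the suffixes (1M = 1024 × 1024 bytes) and integer values only. `mem_to_bytes` is a hypothetical helper name, not part of Docker.

```shell
#!/usr/bin/env bash
# mem_to_bytes VALUE
# Convert a Compose-style memory value (e.g. 512M, 2G) to bytes.
# Assumption: suffixes are binary multiples and the number is an integer.
mem_to_bytes() {
  local value=$1 number unit
  number=${value%[bBkKmMgG]}   # strip one trailing unit letter, if any
  unit=${value#"$number"}      # whatever was stripped is the unit
  case $number in
    ''|*[!0-9]*) echo "invalid memory value: $value" >&2; return 1 ;;
  esac
  case $unit in
    ''|b|B) echo "$number" ;;
    k|K)    echo $((number * 1024)) ;;
    m|M)    echo $((number * 1024 * 1024)) ;;
    g|G)    echo $((number * 1024 * 1024 * 1024)) ;;
  esac
}
```

For example, `mem_to_bytes 512M` prints `536870912`, which can be compared with the byte value that `docker inspect` returns for the same container.
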

### 2. EC2 Instance Types Parallel

| Docker Limits | Closest EC2 Instance | Instance Family |
|---------------|----------------------|-----------------|
| cpus: '0.5', memory: 512M | t2.nano (1 burstable vCPU, 0.5 GiB) | Burstable |
| cpus: '1', memory: 1G | t2.micro (1 vCPU, 1 GiB) | Burstable |
| cpus: '1', memory: 2G | t2.small (1 vCPU, 2 GiB) | Burstable |
| cpus: '2', memory: 4G | t2.medium (2 vCPU, 4 GiB) | Burstable |
| cpus: '2', memory: 8G | m5.large (2 vCPU, 8 GiB) | General Purpose |
| cpus: '4', memory: 16G | m5.xlarge (4 vCPU, 16 GiB) | General Purpose |

AWS does not offer fractional vCPUs: a t2.nano has 1 vCPU with a low credit-based baseline, so `cpus: '0.5'` is a lab approximation rather than an exact match.

**Key Parallelism:**
- Docker CPU limits ≈ AWS vCPU count
- Docker memory limits ≈ AWS instance memory
- No swap for containers ≈ EC2 instances, which typically boot with no swap configured
- Resource reservations ≈ the capacity an instance type guarantees

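The mapping table above can also be kept as a lookup, so scripts can derive the Docker limits for a given instance type. This is only a sketch: `limits_for_instance` is a hypothetical helper, and its values are copied from the Docker Limits column of the table.

```shell
#!/usr/bin/env bash
# limits_for_instance TYPE
# Print the Docker limits this lab pairs with an EC2 instance type.
# Values mirror the mapping table; they are lab approximations.
limits_for_instance() {
  case $1 in
    t2.nano)   echo "cpus=0.5 memory=512M" ;;
    t2.micro)  echo "cpus=1 memory=1G" ;;
    t2.small)  echo "cpus=1 memory=2G" ;;
    t2.medium) echo "cpus=2 memory=4G" ;;
    m5.large)  echo "cpus=2 memory=8G" ;;
    m5.xlarge) echo "cpus=4 memory=16G" ;;
    *) echo "unknown instance type: $1" >&2; return 1 ;;
  esac
}
```

Calling `limits_for_instance t2.small` prints `cpus=1 memory=2G`; an unlisted type returns a non-zero status so a calling script can stop early.
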

### 3. Healthcheck Implementation

**Docker Compose Healthcheck:**
```yaml
services:
  web:
    image: nginx:alpine
    healthcheck:
      test: ["CMD", "wget", "--quiet", "--tries=1", "--spider", "http://localhost:80"]
      interval: 10s       # Check every 10 seconds
      timeout: 5s         # Fail the check if it takes longer than 5 seconds
      retries: 3          # Mark unhealthy after 3 consecutive failures
      start_period: 10s   # Grace period on startup
```

**Healthcheck with curl:**
```yaml
healthcheck:
  test: ["CMD-SHELL", "curl -f http://localhost/ || exit 1"]
  interval: 30s
  timeout: 10s
  retries: 3
  start_period: 5s
```

**Database Healthcheck:**
```yaml
db:
  image: postgres:16-alpine
  healthcheck:
    test: ["CMD-SHELL", "pg_isready -U postgres || exit 1"]
    interval: 10s
    timeout: 5s
    retries: 5
    start_period: 10s
```

**Application Healthcheck:**
```yaml
app:
  image: myapp:latest
  healthcheck:
    test: ["CMD-SHELL", "node healthcheck.js || exit 1"]
    interval: 15s
    timeout: 3s
    retries: 3
    start_period: 30s
```

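To observe a healthcheck from a script, the container's status can be polled with `docker inspect --format '{{.State.Health.Status}}'`. The function below is a minimal sketch of that idea. `wait_healthy` is a hypothetical helper name, and the one-second poll interval and 60-second default timeout are assumptions, not values taken from the lab.

```shell
#!/usr/bin/env bash
# wait_healthy CONTAINER [TIMEOUT_SECONDS]
# Poll a container's health status until it reports "healthy".
# Returns 0 when healthy, 1 when unhealthy or when the timeout expires.
wait_healthy() {
  local name=$1 timeout=${2:-60} status="unknown" elapsed=0
  while [ "$elapsed" -lt "$timeout" ]; do
    status=$(docker inspect --format '{{.State.Health.Status}}' "$name" 2>/dev/null) \
      || status="unknown"
    case $status in
      healthy)   echo "$name is healthy after ${elapsed}s"; return 0 ;;
      unhealthy) echo "$name is unhealthy" >&2; return 1 ;;
    esac
    sleep 1
    elapsed=$((elapsed + 1))
  done
  echo "timed out waiting for $name (last status: $status)" >&2
  return 1
}
```

A call such as `wait_healthy lab03-app 30` blocks while the status is `starting` and returns as soon as it becomes `healthy` or `unhealthy`.
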

### 4. Service Dependencies with Health

**Wait for healthy service:**
```yaml
services:
  app:
    depends_on:
      db:
        condition: service_healthy
      redis:
        condition: service_started
```

**Lifecycle:**
1. `service_started`: Container has started (default)
2. `service_healthy`: Healthcheck is passing (requires a `healthcheck` section on the dependency)

### 5. Resource Monitoring

**docker stats:**
```bash
docker stats --no-stream    # Single snapshot
docker stats lab03-app --format "table {{.Container}}\t{{.CPUPerc}}\t{{.MemUsage}}"
```

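`docker stats` reports CPU as a percentage in which 100% corresponds to one full core, so a container limited to `cpus: '0.5'` should hover near 50%. The check below sketches that comparison. `cpu_within_limit` is a hypothetical helper, and the default tolerance of 10 percentage points is an assumption chosen to absorb sampling noise.

```shell
#!/usr/bin/env bash
# cpu_within_limit CPU_PERCENT CPU_LIMIT [TOLERANCE_POINTS]
# CPU_PERCENT is a docker stats value such as "49.87%".
# CPU_LIMIT is the Compose cpus value such as 0.5.
# Succeeds when the measured percentage is at or below limit*100 + tolerance.
cpu_within_limit() {
  local perc=${1%\%} limit=$2 tol=${3:-10}
  awk -v perc="$perc" -v limit="$limit" -v tol="$tol" \
    'BEGIN { exit !(perc + 0 <= limit * 100 + tol) }'
}
```

It can be fed directly from a snapshot, for example `cpu_within_limit "$(docker stats --no-stream --format '{{.CPUPerc}}' lab03-app)" 0.5`.
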

**Inspect limits:**
```bash
docker inspect lab03-app --format '{{.HostConfig.Memory}}'      # Memory limit in bytes
docker inspect lab03-app --format '{{.HostConfig.NanoCpus}}'    # CPU quota (1e9 = 1 CPU)
```

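The raw values returned by `docker inspect` are easier to compare with the Compose file after converting them to CPUs and MiB. A minimal sketch follows; `nanocpus_to_cpus` and `bytes_to_mib` are hypothetical helper names.

```shell
#!/usr/bin/env bash
# nanocpus_to_cpus NANOCPUS
# HostConfig.NanoCpus uses 1e9 units per CPU; print the CPU count with 2 decimals.
nanocpus_to_cpus() {
  awk -v n="$1" 'BEGIN { printf "%.2f\n", n / 1000000000 }'
}

# bytes_to_mib BYTES
# HostConfig.Memory is in bytes; print whole MiB (1 MiB = 1048576 bytes).
bytes_to_mib() {
  awk -v b="$1" 'BEGIN { printf "%d\n", b / 1048576 }'
}
```

For a container started with `cpus: '0.5'` and `memory: 512M`, `nanocpus_to_cpus 500000000` prints `0.50` and `bytes_to_mib 536870912` prints `512`.
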

## Testing Strategy

### Test Scenarios

1. **Resource Limit Enforcement:**
   - Deploy a container with a CPU limit (e.g., 0.5 CPU)
   - Run a CPU-intensive task
   - Verify with `docker stats` that CPU usage stays at or below the limit (about 50% for 0.5 CPU)

2. **Memory Limit Enforcement:**
   - Deploy a container with a memory limit (e.g., 512M)
   - Run a memory allocation task
   - Verify the container is OOM-killed when it exceeds the limit

3. **Healthcheck Validation:**
   - Deploy a service with a healthcheck
   - Verify status transitions: starting → healthy
   - Verify that a service using `depends_on` with `condition: service_healthy` waits until its dependency is healthy

4. **Resource Verification:**
   - Parse docker-compose.yml for `deploy.resources.limits`
   - Verify all services have mandatory limits
   - Report missing limits

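The resource verification scenario can be sketched with a small `awk` scan of the Compose file. This is an illustration under stated assumptions, not the lab's actual test script: `check_limits` is a hypothetical name, and the scan expects a top-level `services:` key, two-space indentation, and block-style (not inline) mappings. A real check would be more robust if it parsed the YAML properly.

```shell
#!/usr/bin/env bash
# check_limits COMPOSE_FILE
# Print one line per service that lacks a cpus or memory entry under
# deploy.resources.limits; exit non-zero if any service is missing one.
check_limits() {
  awk '
    # Report the current service if either limit is missing.
    function report() {
      if (svc != "" && !(has_cpu && has_mem)) {
        printf "MISSING limits: %s (cpus=%s memory=%s)\n", svc, (has_cpu ? "ok" : "no"), (has_mem ? "ok" : "no")
        bad = 1
      }
    }
    # Any top-level key closes the current service.
    /^[^ #]/ {
      report(); svc = ""; in_limits = 0
      in_services = ($0 ~ /^services:/)
      next
    }
    # A key indented by exactly two spaces under services: starts a service.
    in_services && /^  [A-Za-z0-9_.-]+:[ ]*$/ {
      report()
      svc = $1; sub(/:$/, "", svc)
      has_cpu = 0; has_mem = 0; in_limits = 0
      next
    }
    # Remember the indentation of limits: so its child keys can be recognised.
    svc != "" && /^ +limits:/ {
      in_limits = 1; lim_indent = match($0, /[^ ]/)
      next
    }
    in_limits {
      indent = match($0, /[^ ]/)
      if (indent > 0 && indent <= lim_indent) in_limits = 0   # left the limits block
      else if ($1 == "cpus:") has_cpu = 1
      else if ($1 == "memory:") has_mem = 1
    }
    END { report(); exit bad + 0 }
  ' "$1"
}
```

Running `check_limits docker-compose.yml` prints nothing and returns 0 when every service declares both limits, which makes it usable as a fail-fast step in a test runner. Keys under `reservations:` are deliberately not counted as limits.
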

## Security Requirements

**INF-03: Mandatory Resource Limits**
- Every container MUST have `cpus` limit specified
- Every container MUST have `memory` limit specified
- No container can run with unlimited resources (security risk)

**Safety First:**
- Resource limits prevent DoS from runaway processes
- Memory limits prevent host OOM
- CPU limits ensure fair resource sharing

## Cloud Parallels

### Docker → AWS EC2

| Docker | AWS EC2 |
|--------|---------|
| `cpus: '0.5'` | Fraction of a vCPU (closest: t2.nano, 1 burstable vCPU) |
| `memory: 512M` | 512 MiB RAM |
| `healthcheck` | EC2 Status Checks + ELB Health Checks |
| `docker stats` | CloudWatch Metrics |
| OOM kill | Guest OS OOM killer inside the instance (exhausting CPU credits throttles a burstable instance, it does not terminate it) |
| `depends_on` with `condition: service_healthy` | Auto Scaling Group health checks |

### Instance Type Selection

**Burstable Instances (t2/t3):**
- Credit-based CPU
- Good for dev/test
- Docker: Small limits with occasional bursts

**General Purpose (m5):**
- Balanced compute/memory
- Docker: Medium limits (2-4 vCPU, 8-16 GB)

**Compute Optimized (c5):**
- High CPU ratio
- Docker: High CPU limits (4+ vCPU, lower memory)

## Sources

- [Docker Compose Resources](https://docs.docker.com/compose/compose-file/compose-file-v3/#resources)
- [Docker Healthcheck](https://docs.docker.com/engine/reference/builder/#healthcheck)
- [AWS EC2 Instance Types](https://aws.amazon.com/ec2/instance-types/)
- [Docker Stats](https://docs.docker.com/engine/reference/commandline/stats/)