feat(lab-03): complete Phase 4 - Compute & EC2 lab

Phase Plans (5 files):
- 04-RESEARCH.md: Domain research on Docker limits, healthchecks, EC2 parallels
- 04-VALIDATION.md: Success criteria and validation strategy
- 04-01-PLAN.md: Test infrastructure (RED phase)
- 04-02-PLAN.md: Diátaxis documentation
- 04-03-PLAN.md: Infrastructure implementation (GREEN phase)

Test Scripts (7 files, 1300+ lines):
- 01-resource-limits-test.sh: Validate INF-03 compliance
- 02-healthcheck-test.sh: Validate healthcheck configuration
- 03-enforcement-test.sh: Verify resource limits with docker stats
- 04-verify-infrastructure.sh: Infrastructure verification
- 99-final-verification.sh: End-to-end student verification
- run-all-tests.sh: Test orchestration with fail-fast
- quick-test.sh: Fast validation (<30s)

Documentation (11 files, 2500+ lines):
Tutorials (3):
- 01-set-resource-limits.md: EC2 instance types, Docker limits syntax
- 02-implement-healthchecks.md: ELB health check parallels
- 03-dependencies-with-health.md: depends_on with service_healthy

How-to Guides (4):
- check-resource-usage.md: docker stats monitoring
- test-limits-enforcement.md: Stress testing CPU/memory
- custom-healthcheck.md: HTTP, TCP, database healthchecks
- instance-type-mapping.md: Docker limits → EC2 mapping

Reference (3):
- compose-resources-syntax.md: Complete deploy.resources reference
- healthcheck-syntax.md: All healthcheck parameters
- ec2-instance-mapping.md: Instance type mapping table

Explanation (1):
- compute-ec2-parallels.md: Container=EC2, Limits=Instance Type, Healthcheck=ELB

Infrastructure:
- docker-compose.yml: 5 services (web, app, worker, db, stress-test)
  All services: INF-03 compliant (cpus + memory limits)
  All services: healthcheck configured
  EC2 parallels: t2.nano, t2.micro, t2.small, t2.medium, m5.large
- Dockerfile: Alpine 3.19 + stress tools + non-root user

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Luca Sacchi Ricciardi
2026-04-03 15:16:58 +02:00
parent 39b9a56850
commit 23a9ffe443
26 changed files with 5457 additions and 1 deletions

@@ -0,0 +1,347 @@
# Tutorial 2: Implementing Healthchecks
In this tutorial you will learn to configure healthchecks for Docker containers, simulating AWS **ELB Health Checks**.
## Learning Objectives
By the end of this tutorial you will be able to:
- Understand what healthchecks are and why they matter
- Configure HTTP, CMD, and custom healthchecks
- Map Docker healthchecks to ELB Health Checks
- Monitor container health status
---
## Prerequisites
- Completion of Tutorial 1: Configuring Resource Limits
- A docker-compose.yml with configured services
- Containers that expose HTTP ports or other monitorable services
---
## Part 1: Healthchecks - Core Concepts
### What Is a Healthcheck?
A **healthcheck** is a command that Docker runs periodically inside a container to verify that the container is "healthy".
**Container states:**
1. **created** - Container created but not started
2. **starting** - Container starting up (healthcheck in progress)
3. **healthy** - Healthcheck passing (container OK)
4. **unhealthy** - Healthcheck failing (container has a problem)
5. **exited** - Container stopped
### Why Do They Matter?
- **Failover:** An orchestrator can replace unhealthy containers
- **Zero-downtime:** Only healthy containers are put into service behind the load balancer
- **Dependencies:** Other services wait until the container becomes healthy
- **Monitoring:** Problems are flagged automatically through the health status
### Parallel: ELB Health Checks
| Docker | AWS ELB |
|--------|---------|
| healthcheck.test | Health check path/protocol |
| healthcheck.interval | Health check interval (default 30s) |
| healthcheck.timeout | Health check timeout (default 5s) |
| healthcheck.retries | Unhealthy threshold |
| healthcheck.start_period | Grace period |
---
## Part 2: Healthcheck Syntax in Docker Compose
### Basic Syntax
```yaml
services:
  web:
    image: nginx:alpine
    healthcheck:
      test: ["CMD", "wget", "--quiet", "--tries=1", "--spider", "http://localhost/"]
      interval: 10s
      timeout: 5s
      retries: 3
      start_period: 5s
```
### Parameters Explained
| Parameter | Default | Description |
|-----------|---------|-------------|
| test | - | Command to run (required) |
| interval | 30s | Time between checks |
| timeout | 30s | Maximum time a single check may take |
| retries | 3 | Consecutive failures before the container is marked unhealthy |
| start_period | 0s | Grace period at startup |
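
To see how these parameters interact, the sketch below estimates how long a failure can go unnoticed. It assumes Docker waits `interval` after each check finishes and that every failing check runs for the full `timeout`. The function name `max_detection_seconds` is hypothetical, and the result is an approximate upper bound rather than an exact figure.

```bash
#!/bin/sh
# Rough upper bound (in seconds) on the time between a service breaking
# and Docker marking the container unhealthy.
# Assumption: each failed check costs at most interval + timeout seconds.
# Usage: max_detection_seconds INTERVAL TIMEOUT RETRIES  (plain integers)
max_detection_seconds() {
  interval=$1
  timeout=$2
  retries=$3
  echo $(( retries * (interval + timeout) ))
}

# Values from the basic example: interval 10s, timeout 5s, retries 3
max_detection_seconds 10 5 3   # prints 45
```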
---
## Part 3: Practice - HTTP Healthcheck for a Web Server
### Step 1: Add a healthcheck to the web service
Edit `docker-compose.yml`:
```yaml
  web:
    image: nginx:alpine
    container_name: lab03-web
    hostname: web
    deploy:
      resources:
        limits:
          cpus: '1'
          memory: 1G
    # HTTP Healthcheck
    healthcheck:
      test: ["CMD", "wget", "--quiet", "--tries=1", "--spider", "http://localhost/"]
      interval: 10s
      timeout: 5s
      retries: 3
      start_period: 5s
    ports:
      - "127.0.0.1:8080:80"
    restart: unless-stopped
```
### Step 2: Start and verify
```bash
docker compose up -d web
```
### Step 3: Monitor health status
```bash
# Show health status
docker ps
# Output:
# CONTAINER   IMAGE          STATUS
# lab03-web   nginx:alpine   Up 30 seconds (healthy)
```
### Step 4: Inspect healthcheck details
```bash
docker inspect lab03-web --format '{{json .State.Health}}' | jq
```
**JSON output:**
```json
{
  "Status": "healthy",
  "FailingStreak": 0,
  "Log": [
    {
      "Start": "2024-03-25T10:00:00Z",
      "End": "2024-03-25T10:00:00Z",
      "ExitCode": 0,
      "Output": ""
    }
  ]
}
```
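When only the current status is needed, the full JSON is more than necessary. As a sketch, the helper below (the name `health_status` is made up for this example) asks `docker inspect` for two fields via a Go template, so `jq` is not required. It assumes the container defines a healthcheck; without one, `.State.Health` is absent and the template fails.

```bash
# Print the health status and the current failing streak on one line.
# Usage: health_status CONTAINER_NAME
health_status() {
  docker inspect \
    --format '{{.State.Health.Status}} (failing streak: {{.State.Health.FailingStreak}})' \
    "$1"
}

# Example: health_status lab03-web
```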
---
## Part 4: Practice - Database Healthcheck
### Step 1: Add a database service
```yaml
  db:
    image: postgres:16-alpine
    container_name: lab03-db
    hostname: db
    environment:
      POSTGRES_DB: lab03_db
      POSTGRES_USER: lab03_user
      POSTGRES_PASSWORD: lab03_password
    deploy:
      resources:
        limits:
          cpus: '2'
          memory: 4G
    # Database Healthcheck
    healthcheck:
      test: ["CMD-SHELL", "pg_isready -U lab03_user -d lab03_db || exit 1"]
      interval: 10s
      timeout: 5s
      retries: 5
      start_period: 10s
    volumes:
      - db-data:/var/lib/postgresql/data
    restart: unless-stopped
```
Note that the database:
- Has more `retries` (5 vs 3) - databases start more slowly
- Has a longer `start_period` (10s vs 5s) - extended grace period
### Step 2: Verify that the database becomes healthy
```bash
docker compose up -d db
# Wait for it to become healthy
watch -n 2 'docker ps --filter "name=lab03-db" --format "table {{.Names}}\t{{.Status}}"'
```
Press `Ctrl+C` when you see `(healthy)`.
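
For scripts, where nobody is watching the terminal, the same wait can be automated. The following is a minimal sketch, not part of the lab files: `wait_for_healthy` is a hypothetical helper that polls `docker inspect` and gives up after a fixed number of attempts. Stopping as soon as the container reports `unhealthy` is a design choice made here to fail fast.

```bash
# Poll a container's health status until it is healthy or time runs out.
# Usage: wait_for_healthy CONTAINER [MAX_ATTEMPTS] [DELAY_SECONDS]
# Returns 0 once healthy; 1 if unhealthy or not healthy after MAX_ATTEMPTS.
wait_for_healthy() {
  container=$1
  max_attempts=${2:-30}
  delay=${3:-2}
  attempt=1
  while [ "$attempt" -le "$max_attempts" ]; do
    status=$(docker inspect --format '{{.State.Health.Status}}' "$container" 2>/dev/null)
    case "$status" in
      healthy)   echo "$container is healthy"; return 0 ;;
      unhealthy) echo "$container is unhealthy" >&2; return 1 ;;
    esac
    sleep "$delay"
    attempt=$((attempt + 1))
  done
  echo "$container did not become healthy in time" >&2
  return 1
}

# Example: wait_for_healthy lab03-db 30 2
```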
---
## Part 5: Practice - CMD-SHELL Healthcheck
For more complex commands, use `CMD-SHELL`:
```yaml
  app:
    image: myapp:latest
    container_name: lab03-app
    hostname: app
    deploy:
      resources:
        limits:
          cpus: '1'
          memory: 2G
    # Custom healthcheck with shell
    healthcheck:
      test: ["CMD-SHELL", "curl -f http://localhost/health || exit 1"]
      interval: 15s
      timeout: 3s
      retries: 3
      start_period: 30s
    restart: unless-stopped
```
### Healthcheck Examples
**HTTP with curl:**
```yaml
test: ["CMD-SHELL", "curl -f http://localhost:8080/health || exit 1"]
```
**TCP connection:**
```yaml
test: ["CMD-SHELL", "nc -z localhost 8080 || exit 1"]
```
**File existence:**
```yaml
test: ["CMD-SHELL", "test -f /tmp/ready || exit 1"]
```
**Simple always-succeed:**
```yaml
test: ["CMD-SHELL", "exit 0"]
```
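The one-liners above can also be combined. The sketch below outlines a hypothetical `healthcheck.sh` that would be copied into the image and referenced as `test: ["CMD", "/healthcheck.sh"]`. The script path, the `/tmp/ready` marker, and port 8080 are assumptions carried over from the examples, and `nc` must exist in the image. It follows Docker's convention that exit code 0 means healthy and 1 means unhealthy.

```bash
#!/bin/sh
# Hypothetical /healthcheck.sh: healthy only if every check passes.

# Check 1: the application has written its readiness marker file.
check_ready_file() {
  test -f "${1:-/tmp/ready}"
}

# Check 2: something is listening on the TCP port (requires nc).
check_tcp_port() {
  nc -z localhost "${1:-8080}"
}

# Run all checks; return 0 (healthy) or 1 (unhealthy).
# Usage: run_healthcheck [MARKER_FILE] [PORT]
run_healthcheck() {
  if check_ready_file "$1" && check_tcp_port "$2"; then
    return 0
  fi
  return 1
}

# In the real script the last line would be: run_healthcheck || exit 1
```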
---
## Part 6: ELB Health Check Parallel
### Complete Mapping
| Docker Healthcheck | ELB Health Check | AWS Default | Docker Default |
|--------------------|------------------|-------------|----------------|
| test | Protocol + Path | TCP:80 | None (must be set) |
| interval | Interval | 30s | 30s |
| timeout | Timeout | 5s | 30s |
| retries | Unhealthy Threshold | 2 | 3 |
| start_period | - | - | 0s |
### Equivalent AWS Configuration
**Docker:**
```yaml
healthcheck:
  test: ["CMD", "wget", "--spider", "-q", "http://localhost/health"]
  interval: 30s
  timeout: 5s
  retries: 2
```
**ELB (equivalent):**
```json
{
  "TargetGroup": {
    "HealthCheckProtocol": "HTTP",
    "HealthCheckPath": "/health",
    "HealthCheckIntervalSeconds": 30,
    "HealthCheckTimeoutSeconds": 5,
    "UnhealthyThresholdCount": 2
  }
}
```
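For comparison, the same settings can be expressed as AWS CLI flags. The sketch below only prints an `aws elbv2 modify-target-group` command rather than running it. The function names are hypothetical, and `<target-group-arn>` is a placeholder to replace with a real ARN.

```bash
# Strip the trailing "s" from a Compose duration such as "30s" -> "30".
# Assumption: durations are whole seconds (no "1m30s" style values).
to_seconds() {
  echo "${1%s}"
}

# Print the AWS CLI command matching a Docker healthcheck.
# Usage: elb_health_check_cmd PATH INTERVAL TIMEOUT RETRIES
elb_health_check_cmd() {
  printf 'aws elbv2 modify-target-group --target-group-arn %s' "<target-group-arn>"
  printf ' --health-check-protocol HTTP --health-check-path %s' "$1"
  printf ' --health-check-interval-seconds %s' "$(to_seconds "$2")"
  printf ' --health-check-timeout-seconds %s' "$(to_seconds "$3")"
  printf ' --unhealthy-threshold-count %s\n' "$4"
}

# Values from the Docker example: /health, 30s, 5s, 2 retries
elb_health_check_cmd /health 30s 5s 2
```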
---
## Part 7: Debugging Healthchecks
### Problem: Container stays in "starting"
**Cause:** The healthcheck does not pass within `start_period`
**Solution:**
- Increase `start_period`
- Reduce `interval` for more frequent checks
- Verify that the service is actually ready
### Problem: Container becomes "unhealthy"
**Debug:**
```bash
# View the healthcheck log
docker inspect lab03-web --format '{{range .State.Health.Log}}{{.Start}} - {{.ExitCode}} - {{.Output}}{{"\n"}}{{end}}'
# Run the command manually
docker exec lab03-web wget --spider -q http://localhost/
```
### Problem: Healthcheck too slow
**Optimize:**
- Use lightweight HTTP checks (not heavy pages)
- Use TCP checks instead of HTTP where possible
- Reduce `timeout` if the check is fast
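Running the check by hand is most useful when its exit code is visible, since that code is what Docker evaluates. This is a minimal sketch: `debug_healthcheck` is a hypothetical name, and the check command is passed in by hand, so it must match the `test` entry in your compose file.

```bash
# Run a healthcheck command inside a container and report its exit code.
# Usage: debug_healthcheck CONTAINER COMMAND [ARGS...]
debug_healthcheck() {
  container=$1
  shift
  if docker exec "$container" "$@"; then
    code=0
  else
    code=$?
  fi
  if [ "$code" -eq 0 ]; then
    echo "check passed (exit code 0)"
  else
    echo "check failed (exit code $code)"
  fi
  return "$code"
}

# Example: debug_healthcheck lab03-web wget --spider -q http://localhost/
```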
---
## Summary
In this tutorial you learned:
- **Concept:** Healthchecks verify the health status of containers
- **Syntax:** The parameters test, interval, timeout, retries, start_period
- **Types:** HTTP, CMD-SHELL, custom healthchecks
- **Parallel:** Docker healthcheck → ELB Health Check
- **Monitoring:** `docker ps` shows the status (healthy/unhealthy)
---
## Next Steps
In the next tutorial you will learn to:
- Configure **dependencies** between services
- Use `depends_on` with `condition: service_healthy`
- Implement ordered startup for multi-tier applications

Continue with **Tutorial 3: Dependencies with Healthchecks**