laboratori-cloud/labs/lab-03-compute/tutorial/02-implement-healthchecks.md
Luca Sacchi Ricciardi 23a9ffe443 feat(lab-03): complete Phase 4 - Compute & EC2 lab
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-04-03 15:16:58 +02:00


Tutorial 2: Implementing Healthchecks

In this tutorial you will learn how to configure healthchecks for Docker containers, mirroring AWS ELB health checks.

Learning Objectives

By the end of this tutorial you will be able to:

  • Understand what healthchecks are and why they matter
  • Configure HTTP, CMD and custom healthchecks
  • Map Docker healthchecks to ELB health checks
  • Monitor the health status of containers

Prerequisites

  • Completion of Tutorial 1: Setting Resource Limits
  • A docker-compose.yml with configured services
  • Containers that expose HTTP ports or other services that can be probed

Part 1: Healthchecks - Core Concepts

What Is a Healthcheck?

A healthcheck is a command that Docker runs periodically inside a container to determine whether the container is "healthy".

Container States:

  1. created - Container created but not started
  2. starting - Container running, no healthcheck has passed yet
  3. healthy - Healthcheck passing (container OK)
  4. unhealthy - Healthcheck failing (container has a problem)
  5. exited - Container stopped

States 2 to 4 are health statuses that Docker reports alongside the running state.

Why Are They Important?

  • Failover: An orchestrator or watchdog can replace unhealthy containers
  • Zero-downtime: Only healthy containers are put behind the load balancer
  • Dependencies: Other services wait until the container becomes healthy
  • Monitoring: The health status surfaces problems without manual checks

Parallel: ELB Health Checks

| Docker | AWS ELB |
|---|---|
| healthcheck.test | Health check path/protocol |
| healthcheck.interval | Health check interval (default 30s) |
| healthcheck.timeout | Health check timeout (default 5s) |
| healthcheck.retries | Unhealthy threshold |
| healthcheck.start_period | No direct ELB setting (closest: Auto Scaling health check grace period) |

Part 2: Healthcheck Syntax in Docker Compose

Basic Syntax

services:
  web:
    image: nginx:alpine
    healthcheck:
      test: ["CMD", "wget", "--quiet", "--tries=1", "--spider", "http://localhost/"]
      interval: 10s
      timeout: 5s
      retries: 3
      start_period: 5s

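Parameters Explained

| Parameter | Default | Description |
|---|---|---|
| test | - | Command to run (required) |
| interval | 30s | Time between checks |
| timeout | 30s | Maximum time a single check may take |
| retries | 3 | Consecutive failures before the container is marked unhealthy |
| start_period | 0s | Startup grace period during which failures are not counted |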
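The parameters combine into a detection delay: once a service breaks, Docker reports it as unhealthy only after `retries` consecutive failed checks. The sketch below estimates a rough upper bound for that delay. It assumes every failing check uses its full timeout and that all values are whole seconds. The function name `unhealthy_after` is made up for this example.

```shell
# Rough upper bound, in seconds, on how long a container that has started
# failing can still be reported as healthy.
# Assumption: each failing check runs for its full timeout, and the next
# check starts one interval after the previous one finishes.
# Arguments: interval timeout retries (whole seconds; retries is a count).
unhealthy_after() (
  interval=$1
  timeout=$2
  retries=$3
  echo $(( retries * (interval + timeout) ))
)

# Values from the basic syntax example: interval 10s, timeout 5s, retries 3.
unhealthy_after 10 5 3
```

With those values the estimate is 45 seconds. With Docker's defaults (30s, 30s, 3) it grows to 180 seconds, which is one reason to tune the parameters per service.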

Part 3: Practice - HTTP Healthcheck for a Web Server

Step 1: Add a healthcheck to the web service

Edit docker-compose.yml:

  web:
    image: nginx:alpine
    container_name: lab03-web
    hostname: web

    deploy:
      resources:
        limits:
          cpus: '1'
          memory: 1G

    # HTTP Healthcheck
    healthcheck:
      test: ["CMD", "wget", "--quiet", "--tries=1", "--spider", "http://localhost/"]
      interval: 10s
      timeout: 5s
      retries: 3
      start_period: 5s

    ports:
      - "127.0.0.1:8080:80"

    restart: unless-stopped

Step 2: Start and verify

docker compose up -d web

Step 3: Monitor the health status

# Show the health status
docker ps

# Output:
# CONTAINER   IMAGE        STATUS
# lab03-web   nginx:alpine Up 30 seconds (healthy)

Step 4: Inspect the healthcheck details

docker inspect lab03-web --format '{{json .State.Health}}' | jq

Output JSON:

{
  "Status": "healthy",
  "FailingStreak": 0,
  "Log": [
    {
      "Start": "2024-03-25T10:00:00Z",
      "End": "2024-03-25T10:00:00Z",
      "ExitCode": 0,
      "Output": ""
    }
  ]
}
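When only the status and the failing streak matter, a Go template can pull those two fields directly, so jq is not needed. This is a minimal sketch. The function name `health_summary` is hypothetical, and it assumes the container defines a healthcheck.

```shell
# Print the health status and failing streak of one container on one line.
# Uses only `docker inspect` with a Go template, so jq is not required.
# Assumption: the container has a healthcheck configured.
# Argument: container name or ID.
health_summary() (
  docker inspect \
    --format 'status={{.State.Health.Status}} failing_streak={{.State.Health.FailingStreak}}' \
    "$1"
)

# Usage: health_summary lab03-web
```

For the JSON shown above, this would print a line of the form `status=healthy failing_streak=0`.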

Part 4: Practice - Database Healthcheck

Step 1: Add a database service

  db:
    image: postgres:16-alpine
    container_name: lab03-db
    hostname: db

    environment:
      POSTGRES_DB: lab03_db
      POSTGRES_USER: lab03_user
      POSTGRES_PASSWORD: lab03_password

    deploy:
      resources:
        limits:
          cpus: '2'
          memory: 4G

    # Database Healthcheck
    healthcheck:
      test: ["CMD-SHELL", "pg_isready -U lab03_user -d lab03_db || exit 1"]
      interval: 10s
      timeout: 5s
      retries: 5
      start_period: 10s

    volumes:
      - db-data:/var/lib/postgresql/data

    restart: unless-stopped

Note that the database:

  • Uses more retries (5 vs 3) because databases take longer to start
  • Uses a longer start_period (10s vs 5s) for an extended grace period

Step 2: Verify that the database becomes healthy

docker compose up -d db

# Wait for it to become healthy
watch -n 2 'docker ps --filter "name=lab03-db" --format "table {{.Names}}\t{{.Status}}"'

Press Ctrl+C when you see (healthy).
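In a script there is nobody to press Ctrl+C, so the same wait can be written as a polling loop with a deadline. The sketch below is one possible way to do it. The function name `wait_healthy`, the 2-second polling step and the 60-second default are all assumptions of this example.

```shell
# Poll a container's health status until it reports "healthy" or a deadline
# passes. Returns 0 when healthy, 1 on "unhealthy" or timeout.
# Arguments: container name, timeout in seconds (default 60, an assumption).
wait_healthy() (
  name=$1
  remaining=${2:-60}
  while [ "$remaining" -gt 0 ]; do
    # Treat an inspect failure (e.g. container not created yet) as "unknown".
    status=$(docker inspect --format '{{.State.Health.Status}}' "$name" 2>/dev/null) || status=unknown
    case "$status" in
      healthy)   echo "$name is healthy"; exit 0 ;;
      unhealthy) echo "$name is unhealthy" >&2; exit 1 ;;
    esac
    sleep 2
    remaining=$(( remaining - 2 ))
  done
  echo "timed out waiting for $name" >&2
  exit 1
)

# Usage: docker compose up -d db && wait_healthy lab03-db 60
```

The function body runs in a subshell, so `exit` only ends the function and its variables do not leak into the calling shell.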


Part 5: Practice - CMD-SHELL Healthcheck

For more complex commands, use CMD-SHELL, which passes the string to the container's shell so that operators such as || are available:

  app:
    image: myapp:latest
    container_name: lab03-app
    hostname: app

    deploy:
      resources:
        limits:
          cpus: '1'
          memory: 2G

    # Custom healthcheck with shell
    healthcheck:
      test: ["CMD-SHELL", "curl -f http://localhost/health || exit 1"]
      interval: 15s
      timeout: 3s
      retries: 3
      start_period: 30s

    restart: unless-stopped

Healthcheck Examples

HTTP with curl:

test: ["CMD-SHELL", "curl -f http://localhost:8080/health || exit 1"]

TCP connection:

test: ["CMD-SHELL", "nc -z localhost 8080 || exit 1"]

File existence:

test: ["CMD-SHELL", "test -f /tmp/ready || exit 1"]

Simple always-succeed:

test: ["CMD-SHELL", "exit 0"]
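Most of the examples above end with `|| exit 1`. Docker's documentation defines exit status 0 as healthy and 1 as unhealthy, and reserves 2. Tools often return other codes; `curl -f`, for example, exits with 22 on an HTTP error. The sketch below shows the same normalization as a reusable wrapper. The name `normalize_check` is hypothetical.

```shell
# Run a check command and collapse its exit status to the two values a
# healthcheck should return: 0 (healthy) or 1 (unhealthy).
# Arguments: the command to run, followed by its own arguments.
normalize_check() (
  if "$@" >/dev/null 2>&1; then
    exit 0
  fi
  exit 1
)

# A passing command maps to 0; any failure, whatever its code, maps to 1.
normalize_check true && echo "true -> $?"
normalize_check sh -c 'exit 22' || echo "exit 22 -> $?"
```

Running the block prints `true -> 0` and `exit 22 -> 1`.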

Part 6: ELB Health Check Parallels

Complete Mapping

| Docker Healthcheck | ELB Health Check | AWS Default | Docker Default |
|---|---|---|---|
| test | Protocol + Path | TCP:80 | None (must be configured) |
| interval | Interval | 30s | 30s |
| timeout | Timeout | 5s | 30s |
| retries | Unhealthy Threshold | 2 | 3 |
| start_period | - | - | 0s |

Equivalent AWS Configuration

Docker:

healthcheck:
  test: ["CMD", "wget", "--spider", "-q", "http://localhost/health"]
  interval: 30s
  timeout: 5s
  retries: 2

ELB (equivalent):

{
  "TargetGroup": {
    "HealthCheckProtocol": "HTTP",
    "HealthCheckPath": "/health",
    "HealthCheckIntervalSeconds": 30,
    "HealthCheckTimeoutSeconds": 5,
    "UnhealthyThresholdCount": 2
  }
}
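To make the mapping concrete, the sketch below builds the AWS CLI call that would apply the same values to an existing target group. It only prints the command and does not run it. The function name `elb_health_check_cmd` is hypothetical, and the ARN in the example is a placeholder rather than a real resource.

```shell
# Print (not run) an AWS CLI command that applies Docker-style healthcheck
# values to an existing target group.
# Arguments: target group ARN, path, interval (s), timeout (s), retries.
# Assumption: the target group already exists and uses HTTP health checks.
elb_health_check_cmd() (
  printf 'aws elbv2 modify-target-group --target-group-arn %s' "$1"
  printf ' --health-check-protocol HTTP --health-check-path %s' "$2"
  printf ' --health-check-interval-seconds %s' "$3"
  printf ' --health-check-timeout-seconds %s' "$4"
  printf ' --unhealthy-threshold-count %s\n' "$5"
)

# Same values as the Docker healthcheck above; the ARN is a placeholder.
elb_health_check_cmd "<target-group-arn>" /health 30 5 2
```

Printing instead of executing keeps the example safe to run without AWS credentials. The output can be reviewed before it is pasted into a terminal.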

Part 7: Debugging Healthchecks

Problem: Container stays in "starting"

Cause: No check has succeeded yet. Failures during start_period are not counted, so the container remains in "starting" until a check passes.

Solution:

  • Increase start_period
  • Reduce interval for more frequent checks
  • Verify that the service is actually ready

Problem: Container becomes "unhealthy"

Debug:

# Show the healthcheck log
docker inspect lab03-web --format '{{range .State.Health.Log}}{{.Start}} - {{.ExitCode}} - {{.Output}}{{"\n"}}{{end}}'

# Run the check command manually
docker exec lab03-web wget --spider -q http://localhost/
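When several services are running, it helps to find every failing container first and then print its check log. The sketch below uses the `health` filter of `docker ps` to do that. Both function names are made up for this example.

```shell
# List the names of all running containers whose health status is "unhealthy".
list_unhealthy() (
  docker ps --filter health=unhealthy --format '{{.Names}}'
)

# Print the recorded healthcheck log (exit code and output of each check)
# for every unhealthy container.
dump_unhealthy_logs() (
  for name in $(list_unhealthy); do
    echo "== $name =="
    docker inspect "$name" \
      --format '{{range .State.Health.Log}}{{.ExitCode}} {{.Output}}{{"\n"}}{{end}}'
  done
)

# Usage: dump_unhealthy_logs
```

Docker keeps only the most recent check results in this log, so it is best read soon after the container turns unhealthy.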

Problem: Healthcheck is too slow

Optimize:

  • Use lightweight HTTP checks (not heavy pages)
  • Use TCP checks instead of HTTP where possible
  • Reduce timeout if the check is fast

In this tutorial you learned:

✓ Concept: Healthchecks verify the health status of containers
✓ Syntax: The test, interval, timeout, retries and start_period parameters
✓ Types: HTTP, CMD-SHELL and custom healthchecks
✓ Parallel: Docker healthcheck → ELB health check
✓ Monitoring: docker ps shows the status (healthy/unhealthy)


Next Steps

In the next tutorial you will learn how to:

  • Configure dependencies between services
  • Use depends_on with condition: service_healthy
  • Implement ordered startup for multi-tier applications

Continue with Tutorial 3: Dependencies with Healthchecks