mockupAWS/export/architecture.md
Luca Sacchi Ricciardi cd6f8ad166 docs: complete architecture specifications and project planning
Add comprehensive technical specifications for mockupAWS v0.2.0:

- export/architecture.md: Complete system architecture with:
  * Layered architecture diagram (Client → API → Service → Repository → DB)
  * Full database schema with DDL SQL (5 tables, indexes, constraints)
  * API specifications (OpenAPI format) for all endpoints
  * Security architecture (auth, PII detection, rate limiting)
  * Data flow diagrams (ingestion, cost calculation, state machine)
  * Technology stack details (backend, frontend, infrastructure)
  * Project structure for backend and frontend
  * 4 Architecture Decision Records (DEC-001 to DEC-004)

- export/kanban.md: Task breakdown with 32 tasks organized in:
  * Database setup (DB-001 to DB-007)
  * Backend models/schemas (BE-001 to BE-003)
  * Backend repositories (BE-004 to BE-008)
  * Backend services (BE-009 to BE-014)
  * Backend API (BE-015 to BE-020)
  * Testing (QA-001 to QA-003)

- export/progress.md: Project tracking initialized with:
  * Current status: 0% complete, Phase 1 setup
  * Sprint planning and metrics
  * Resource links and team assignments

All specifications follow 'Little Often' principle with tasks < 2 hours.
2026-04-07 13:10:12 +02:00


Architecture - mockupAWS

1. Overview

mockupAWS is an AWS cost simulation platform that profiles log traffic and calculates the cost drivers (SQS, Lambda, Bedrock/LLM) before a production deployment.

Architecture: Layered architecture with the Repository and Service Layer patterns
Paradigm: Async-first (FastAPI + SQLAlchemy async)
Deployment: Container-based (Docker Compose)


2. System Architecture

2.1 High-Level Architecture

┌─────────────────────────────────────────────────────────────────────────────┐
│                               CLIENT LAYER                                   │
│  ┌──────────────────┐  ┌──────────────────┐  ┌──────────────────────────┐  │
│  │   Logstash       │  │   React Web UI   │  │   API Consumers          │  │
│  │   (Log Source)   │  │   (Dashboard)    │  │   (CI/CD, Scripts)       │  │
│  └────────┬─────────┘  └────────┬─────────┘  └───────────┬──────────────┘  │
└───────────┼─────────────────────┼────────────────────────┼───────────────────┘
            │                     │                        │
            │ HTTP POST           │ HTTPS                  │ API Key + JWT
            │ /ingest             │ /api/v1/*              │ /api/v1/*
            ▼                     ▼                        ▼
┌─────────────────────────────────────────────────────────────────────────────┐
│                                API LAYER                                     │
│                         FastAPI + Uvicorn (ASGI)                             │
│  ┌──────────────────────────────────────────────────────────────────────┐   │
│  │  Middleware Stack                                                    │   │
│  │  ├── CORS                                                            │   │
│  │  ├── Rate Limiting (slowapi)                                         │   │
│  │  ├── Authentication (JWT / API Key)                                  │   │
│  │  ├── Request Validation (Pydantic)                                   │   │
│  │  └── Error Handling                                                  │   │
│  └──────────────────────────────────────────────────────────────────────┘   │
│                                                                              │
│  ┌──────────────┐ ┌──────────────┐ ┌──────────────┐ ┌──────────────────┐   │
│  │ /scenarios   │ │ /ingest      │ │ /reports     │ │ /pricing         │   │
│  │   CRUD       │ │   (log       │ │   generate   │ │   (admin)        │   │
│  │              │ │    intake)   │ │   download   │ │                  │   │
│  └──────┬───────┘ └──────┬───────┘ └──────┬───────┘ └────────┬─────────┘   │
└─────────┼────────────────┼────────────────┼──────────────────┼─────────────┘
          │                │                │                  │
          ▼                ▼                ▼                  ▼
┌─────────────────────────────────────────────────────────────────────────────┐
│                             SERVICE LAYER                                    │
│  ┌──────────────────┐ ┌──────────────────┐ ┌──────────────────────────────┐ │
│  │  ScenarioService │ │  IngestService   │ │  CostCalculator              │ │
│  │  ─────────────── │ │  ──────────────  │ │  ─────────────               │ │
│  │  • create()      │ │  • ingest_log()  │ │  • calculate_sqs_cost()      │ │
│  │  • update()      │ │  • batch_process()│ │  • calculate_lambda_cost()   │ │
│  │  • delete()      │ │  • deduplicate() │ │  • calculate_bedrock_cost()  │ │
│  │  • lifecycle()   │ │  • persist()     │ │  • get_total_cost()          │ │
│  └──────────────────┘ └──────────────────┘ └──────────────────────────────┘ │
│  ┌──────────────────┐ ┌──────────────────┐ ┌──────────────────────────────┐ │
│  │  ReportService   │ │  PIIDetector     │ │  TokenizerService            │ │
│  │  ──────────────  │ │  ───────────     │ │  ───────────────             │ │
│  │  • generate_csv()│ │  • detect_email()│ │  • count_tokens()            │ │
│  │  • generate_pdf()│ │  • scan_patterns()│ │  • encode()                  │ │
│  │  • compile()     │ │  • report()      │ │  • get_encoding()            │ │
│  └──────────────────┘ └──────────────────┘ └──────────────────────────────┘ │
└─────────┬──────────────────────────────────────────────────────┬────────────┘
          │                                                      │
          ▼                                                      ▼
┌─────────────────────────────────────────────────────────────────────────────┐
│                           REPOSITORY LAYER                                   │
│  ┌──────────────────┐ ┌──────────────────┐ ┌──────────────────────────────┐ │
│  │  ScenarioRepo    │ │  LogRepo         │ │  PricingRepo                 │ │
│  │  ─────────────   │ │  ───────         │ │  ──────────                  │ │
│  │  • get_by_id()   │ │  • save()        │ │  • get_by_service_region()   │ │
│  │  • list()        │ │  • list_by_      │ │  • list_active()             │ │
│  │  • create()      │ │    scenario()    │ │  • update()                  │ │
│  │  • update()      │ │  • count_by_     │ │  • bulk_insert()             │ │
│  │  • delete()      │ │    hash()        │ │                              │ │
│  └──────────────────┘ └──────────────────┘ └──────────────────────────────┘ │
│  ┌──────────────────┐ ┌──────────────────┐                                  │
│  │  MetricRepo      │ │  ReportRepo      │                                  │
│  │  ──────────      │ │  ──────────      │                                  │
│  │  • save()        │ │  • save()        │                                  │
│  │  • get_aggregated│ │  • list()        │                                  │
│  │  • list_by_type()│ │  • delete()      │                                  │
│  └──────────────────┘ └──────────────────┘                                  │
└─────────────────────────────────────────────────────────────────────────────┘
          │
          │ SQLAlchemy 2.0 Async
          │ asyncpg driver
          ▼
┌─────────────────────────────────────────────────────────────────────────────┐
│                           DATABASE LAYER                                     │
│                              PostgreSQL 15+                                  │
│  ┌──────────────────┐ ┌──────────────────┐ ┌──────────────────────────────┐ │
│  │  scenarios       │ │  scenario_logs   │ │  aws_pricing                 │ │
│  │  ─────────       │ │  ─────────────   │ │  ───────────                 │ │
│  │  • metadata      │ │  • logs storage  │ │  • service prices            │ │
│  │  • state machine │ │  • hash for dedup│ │  • history tracking          │ │
│  │  • cost totals   │ │  • PII flags     │ │  • region-specific           │ │
│  └──────────────────┘ └──────────────────┘ └──────────────────────────────┘ │
│  ┌──────────────────┐ ┌──────────────────┐                                  │
│  │  scenario_metrics│ │  reports         │                                  │
│  │  ─────────────── │ │  ────────        │                                  │
│  │  • time-series   │ │  • generated     │                                  │
│  │  • aggregates    │ │  • metadata      │                                  │
│  │  • cost breakdown│ │  • file refs     │                                  │
│  └──────────────────┘ └──────────────────┘                                  │
└─────────────────────────────────────────────────────────────────────────────┘

2.2 Layer Responsibilities

| Layer | Responsibility | Technologies |
|-------|----------------|--------------|
| Client | User interaction, log ingestion | Browser, Logstash, curl |
| API | Routing, validation, auth, middleware | FastAPI, Pydantic, slowapi |
| Service | Business logic, orchestration | Python async/await |
| Repository | Data access, query abstraction | SQLAlchemy 2.0, Repository pattern |
| Database | Persistence, ACID, queries | PostgreSQL 15+ |
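
The split between the Service and Repository layers can be illustrated with a minimal sketch. It assumes an in-memory repository in place of SQLAlchemy; the `Scenario` dataclass, `InMemoryScenarioRepo`, and `demo()` are hypothetical names used only for illustration, while `ScenarioService.create()` and `get_by_id()` follow the method names in the diagram above.

```python
import asyncio
import uuid
from dataclasses import dataclass, field
from typing import Dict, Optional


@dataclass
class Scenario:
    name: str
    region: str
    status: str = "draft"
    id: uuid.UUID = field(default_factory=uuid.uuid4)


class InMemoryScenarioRepo:
    """Repository layer: hides storage behind a narrow async interface.

    Assumption: a dict stands in for the PostgreSQL-backed repository.
    """

    def __init__(self) -> None:
        self._rows: Dict[uuid.UUID, Scenario] = {}

    async def create(self, scenario: Scenario) -> Scenario:
        self._rows[scenario.id] = scenario
        return scenario

    async def get_by_id(self, scenario_id: uuid.UUID) -> Optional[Scenario]:
        return self._rows.get(scenario_id)


class ScenarioService:
    """Service layer: business rules only, no storage details."""

    def __init__(self, repo: InMemoryScenarioRepo) -> None:
        self._repo = repo

    async def create(self, name: str, region: str) -> Scenario:
        # Mirrors the chk_name_not_empty constraint at the service level.
        if not name.strip():
            raise ValueError("name must not be empty")
        return await self._repo.create(Scenario(name=name.strip(), region=region))


async def demo() -> Scenario:
    service = ScenarioService(InMemoryScenarioRepo())
    return await service.create("checkout-logs", "eu-west-1")
```

Because the service only depends on the repository's interface, the in-memory version can be swapped for a database-backed one in tests without touching business logic.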

3. Database Schema

3.1 Entity Relationship Diagram

┌─────────────────────────────────────────────────────────────────────────┐
│                              SCHEMA ERD                                  │
└─────────────────────────────────────────────────────────────────────────┘

┌─────────────────────┐         ┌─────────────────────┐
│     scenarios       │         │   aws_pricing       │
├─────────────────────┤         ├─────────────────────┤
│ PK id: UUID         │         │ PK id: UUID         │
│    name: VARCHAR(255)│         │    service: VARCHAR │
│    description: TEXT│         │    region: VARCHAR  │
│    tags: JSONB      │         │    tier: VARCHAR    │
│    status: ENUM     │         │    price: DECIMAL   │
│    region: VARCHAR  │         │    unit: VARCHAR    │
│    created_at: TS   │         │    effective_from: D│
│    updated_at: TS   │         │    effective_to: D  │
│    completed_at: TS │         │    is_active: BOOL  │
│    total_requests: INT│       │    source_url: TEXT │
│    total_cost: DEC  │         └─────────────────────┘
└──────────┬──────────┘
           │
           │ 1:N
           ▼
┌─────────────────────┐         ┌─────────────────────┐
│   scenario_logs     │         │  scenario_metrics   │
├─────────────────────┤         ├─────────────────────┤
│ PK id: UUID         │         │ PK id: UUID         │
│ FK scenario_id: UUID│         │ FK scenario_id: UUID│
│    received_at: TS  │         │    timestamp: TS    │
│    message_hash: V64│         │    metric_type: VAR │
│    message_preview  │         │    metric_name: VAR │
│    source: VARCHAR  │         │    value: DECIMAL   │
│    size_bytes: INT  │         │    unit: VARCHAR    │
│    has_pii: BOOL    │         │    metadata: JSONB  │
│    token_count: INT │         └─────────────────────┘
│    sqs_blocks: INT  │
└─────────────────────┘
           │
           │ 1:N (optional)
           ▼
┌─────────────────────┐
│      reports        │
├─────────────────────┤
│ PK id: UUID         │
│ FK scenario_id: UUID│
│    format: ENUM     │
│    file_path: TEXT  │
│    generated_at: TS │
│    metadata: JSONB  │
└─────────────────────┘

3.2 DDL - Schema Definition

-- ============================================
-- EXTENSIONS
-- ============================================
CREATE EXTENSION IF NOT EXISTS "uuid-ossp";
CREATE EXTENSION IF NOT EXISTS "pg_trgm"; -- For text search

-- ============================================
-- ENUMS
-- ============================================
CREATE TYPE scenario_status AS ENUM ('draft', 'running', 'completed', 'archived');
CREATE TYPE report_format AS ENUM ('pdf', 'csv');

-- ============================================
-- TABLE: scenarios
-- ============================================
CREATE TABLE scenarios (
    id UUID PRIMARY KEY DEFAULT uuid_generate_v4(),
    name VARCHAR(255) NOT NULL,
    description TEXT,
    tags JSONB DEFAULT '[]'::jsonb,
    status scenario_status NOT NULL DEFAULT 'draft',
    region VARCHAR(50) NOT NULL DEFAULT 'us-east-1',
    created_at TIMESTAMP WITH TIME ZONE NOT NULL DEFAULT NOW(),
    updated_at TIMESTAMP WITH TIME ZONE NOT NULL DEFAULT NOW(),
    completed_at TIMESTAMP WITH TIME ZONE,
    started_at TIMESTAMP WITH TIME ZONE,
    total_requests INTEGER NOT NULL DEFAULT 0,
    total_cost_estimate DECIMAL(12, 6) NOT NULL DEFAULT 0.000000,
    
    -- Constraints
    CONSTRAINT chk_name_not_empty CHECK (char_length(trim(name)) > 0),
    CONSTRAINT chk_region_not_empty CHECK (char_length(trim(region)) > 0)
);

-- Indexes
CREATE INDEX idx_scenarios_status ON scenarios(status);
CREATE INDEX idx_scenarios_region ON scenarios(region);
CREATE INDEX idx_scenarios_created_at ON scenarios(created_at DESC);
CREATE INDEX idx_scenarios_tags ON scenarios USING GIN(tags);

-- Trigger for updated_at
CREATE OR REPLACE FUNCTION update_updated_at_column()
RETURNS TRIGGER AS $$
BEGIN
    NEW.updated_at = NOW();
    RETURN NEW;
END;
$$ LANGUAGE plpgsql;

CREATE TRIGGER update_scenarios_updated_at
    BEFORE UPDATE ON scenarios
    FOR EACH ROW
    EXECUTE FUNCTION update_updated_at_column();

-- ============================================
-- TABLE: scenario_logs
-- ============================================
CREATE TABLE scenario_logs (
    id UUID PRIMARY KEY DEFAULT uuid_generate_v4(),
    scenario_id UUID NOT NULL REFERENCES scenarios(id) ON DELETE CASCADE,
    received_at TIMESTAMP WITH TIME ZONE NOT NULL DEFAULT NOW(),
    message_hash VARCHAR(64) NOT NULL, -- SHA256
    message_preview VARCHAR(500),
    source VARCHAR(100) DEFAULT 'unknown',
    size_bytes INTEGER NOT NULL DEFAULT 0,
    has_pii BOOLEAN NOT NULL DEFAULT FALSE,
    token_count INTEGER NOT NULL DEFAULT 0,
    sqs_blocks INTEGER NOT NULL DEFAULT 1,
    
    -- Constraints
    CONSTRAINT chk_size_positive CHECK (size_bytes >= 0),
    CONSTRAINT chk_token_positive CHECK (token_count >= 0),
    CONSTRAINT chk_blocks_positive CHECK (sqs_blocks >= 1)
);

-- Indexes
CREATE INDEX idx_logs_scenario_id ON scenario_logs(scenario_id);
CREATE INDEX idx_logs_received_at ON scenario_logs(received_at DESC);
CREATE INDEX idx_logs_message_hash ON scenario_logs(message_hash);
CREATE INDEX idx_logs_has_pii ON scenario_logs(has_pii) WHERE has_pii = TRUE;

-- ============================================
-- TABLE: scenario_metrics
-- ============================================
CREATE TABLE scenario_metrics (
    id UUID PRIMARY KEY DEFAULT uuid_generate_v4(),
    scenario_id UUID NOT NULL REFERENCES scenarios(id) ON DELETE CASCADE,
    timestamp TIMESTAMP WITH TIME ZONE NOT NULL DEFAULT NOW(),
    metric_type VARCHAR(50) NOT NULL, -- 'sqs', 'lambda', 'bedrock', 'safety'
    metric_name VARCHAR(100) NOT NULL,
    value DECIMAL(15, 6) NOT NULL DEFAULT 0.000000,
    unit VARCHAR(20) NOT NULL, -- 'count', 'bytes', 'tokens', 'usd', 'invocations'
    metadata JSONB DEFAULT '{}'::jsonb
);

-- Indexes
CREATE INDEX idx_metrics_scenario_id ON scenario_metrics(scenario_id);
CREATE INDEX idx_metrics_timestamp ON scenario_metrics(timestamp DESC);
CREATE INDEX idx_metrics_type ON scenario_metrics(metric_type);
CREATE INDEX idx_metrics_scenario_type ON scenario_metrics(scenario_id, metric_type);

-- ============================================
-- TABLE: aws_pricing
-- ============================================
CREATE TABLE aws_pricing (
    id UUID PRIMARY KEY DEFAULT uuid_generate_v4(),
    service VARCHAR(50) NOT NULL, -- 'sqs', 'lambda', 'bedrock'
    region VARCHAR(50) NOT NULL,
    tier VARCHAR(50) NOT NULL DEFAULT 'standard',
    price_per_unit DECIMAL(15, 10) NOT NULL,
    unit VARCHAR(20) NOT NULL, -- 'per_million_requests', 'per_gb_second', 'per_1k_tokens'
    effective_from DATE NOT NULL DEFAULT CURRENT_DATE,
    effective_to DATE,
    is_active BOOLEAN NOT NULL DEFAULT TRUE,
    source_url VARCHAR(500),
    description TEXT,
    
    -- Constraints
    CONSTRAINT chk_price_positive CHECK (price_per_unit >= 0),
    CONSTRAINT chk_valid_dates CHECK (effective_to IS NULL OR effective_to >= effective_from)
);

-- Uniqueness among active rows only. PostgreSQL does not accept a WHERE
-- clause on a table-level UNIQUE constraint, so a partial unique index is used.
CREATE UNIQUE INDEX uq_pricing_unique_active
    ON aws_pricing(service, region, tier, effective_from)
    WHERE is_active = TRUE;

-- Indexes
CREATE INDEX idx_pricing_service ON aws_pricing(service);
CREATE INDEX idx_pricing_region ON aws_pricing(region);
CREATE INDEX idx_pricing_active ON aws_pricing(service, region, tier) WHERE is_active = TRUE;

-- ============================================
-- TABLE: reports
-- ============================================
CREATE TABLE reports (
    id UUID PRIMARY KEY DEFAULT uuid_generate_v4(),
    scenario_id UUID NOT NULL REFERENCES scenarios(id) ON DELETE CASCADE,
    format report_format NOT NULL,
    file_path VARCHAR(500) NOT NULL,
    file_size_bytes INTEGER,
    generated_at TIMESTAMP WITH TIME ZONE NOT NULL DEFAULT NOW(),
    generated_by VARCHAR(100), -- user_id or api_key_id
    metadata JSONB DEFAULT '{}'::jsonb
);

-- Indexes
CREATE INDEX idx_reports_scenario_id ON reports(scenario_id);
CREATE INDEX idx_reports_generated_at ON reports(generated_at DESC);
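
The `message_hash` (SHA-256, 64 hex characters) and `message_preview` (at most 500 characters) columns of `scenario_logs` could be derived roughly as follows. This is a sketch under assumptions: the function names `fingerprint` and `is_duplicate` are hypothetical, and the in-memory set stands in for a lookup against `idx_logs_message_hash`.

```python
import hashlib
from typing import Set, Tuple

PREVIEW_LIMIT = 500  # matches message_preview VARCHAR(500)


def fingerprint(message: str) -> Tuple[str, str]:
    """Return (message_hash, message_preview) for a raw log message."""
    digest = hashlib.sha256(message.encode("utf-8")).hexdigest()
    return digest, message[:PREVIEW_LIMIT]


def is_duplicate(seen_hashes: Set[str], digest: str) -> bool:
    """Record the hash and report whether it was already present."""
    if digest in seen_hashes:
        return True
    seen_hashes.add(digest)
    return False
```

A SHA-256 hex digest is always 64 characters long, which is why `VARCHAR(64)` is sufficient for the hash column.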

3.3 Key Queries

-- Query: Get scenario with aggregated metrics
SELECT 
    s.*,
    COUNT(DISTINCT sl.id) as total_logs,
    COUNT(DISTINCT CASE WHEN sl.has_pii THEN sl.id END) as pii_violations,
    SUM(sl.token_count) as total_tokens,
    SUM(sl.sqs_blocks) as total_sqs_blocks
FROM scenarios s
LEFT JOIN scenario_logs sl ON s.id = sl.scenario_id
WHERE s.id = :scenario_id
GROUP BY s.id;

-- Query: Get cost breakdown by service
SELECT 
    metric_type,
    SUM(value) as total_value,
    unit
FROM scenario_metrics
WHERE scenario_id = :scenario_id
  AND metric_name LIKE '%cost%'
GROUP BY metric_type, unit;

-- Query: Get active pricing for service/region
SELECT *
FROM aws_pricing
WHERE service = :service
  AND region = :region
  AND is_active = TRUE
  AND (effective_to IS NULL OR effective_to >= CURRENT_DATE)
ORDER BY effective_from DESC
LIMIT 1;
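
For unit tests that should not depend on a database, the selection logic of the active-pricing query can be mirrored in plain Python. The sketch below assumes a hypothetical `PriceRow` dataclass carrying a subset of the `aws_pricing` columns and a hypothetical helper `pick_active_price`; it applies the same filters and the same "latest `effective_from` wins" ordering as the SQL.

```python
from dataclasses import dataclass
from datetime import date
from decimal import Decimal
from typing import Iterable, Optional


@dataclass(frozen=True)
class PriceRow:
    service: str
    region: str
    price_per_unit: Decimal
    effective_from: date
    effective_to: Optional[date] = None
    is_active: bool = True


def pick_active_price(
    rows: Iterable[PriceRow], service: str, region: str, today: date
) -> Optional[PriceRow]:
    """Return the newest active price for service/region, or None."""
    candidates = [
        row
        for row in rows
        if row.service == service
        and row.region == region
        and row.is_active
        and (row.effective_to is None or row.effective_to >= today)
    ]
    # ORDER BY effective_from DESC LIMIT 1
    return max(candidates, key=lambda row: row.effective_from, default=None)
```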

4. API Specifications

4.1 OpenAPI Overview

openapi: 3.0.0
info:
  title: mockupAWS API
  version: 0.2.0
  description: AWS Cost Simulation Platform API

servers:
  - url: http://localhost:8000/api/v1
    description: Development server

security:
  - BearerAuth: []
  - ApiKeyAuth: []

4.2 Endpoints

Scenarios API

# POST /scenarios - Create new scenario
request:
  content:
    application/json:
      schema:
        type: object
        required: [name, region]
        properties:
          name:
            type: string
            minLength: 1
            maxLength: 255
          description:
            type: string
          tags:
            type: array
            items:
              type: string
          region:
            type: string
            enum: [us-east-1, us-west-2, eu-west-1, eu-central-1]
          tier:
            type: string
            enum: [standard, on-demand]
            default: standard

response:
  201:
    content:
      application/json:
        schema:
          $ref: '#/components/schemas/Scenario'

# GET /scenarios - List scenarios
parameters:
  - name: status
    in: query
    schema:
      type: string
      enum: [draft, running, completed, archived]
  - name: region
    in: query
    schema:
      type: string
  - name: page
    in: query
    schema:
      type: integer
      default: 1
  - name: page_size
    in: query
    schema:
      type: integer
      default: 20
      maximum: 100

response:
  200:
    content:
      application/json:
        schema:
          type: object
          properties:
            items:
              type: array
              items:
                $ref: '#/components/schemas/Scenario'
            total:
              type: integer
            page:
              type: integer
            page_size:
              type: integer

# GET /scenarios/{id} - Get scenario details
# PUT /scenarios/{id} - Update scenario
# DELETE /scenarios/{id} - Delete scenario
# POST /scenarios/{id}/start - Start scenario
# POST /scenarios/{id}/stop - Stop scenario
# POST /scenarios/{id}/archive - Archive scenario
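
The `page` and `page_size` parameters of `GET /scenarios` map onto a simple slice. The following sketch assumes the full result list is already in memory (a real repository would translate this to `LIMIT`/`OFFSET`) and clamps out-of-range values instead of rejecting them, which is a simplification; `paginate` is a hypothetical helper name.

```python
from typing import Any, Dict, List

DEFAULT_PAGE_SIZE = 20
MAX_PAGE_SIZE = 100  # "maximum: 100" in the parameter schema


def paginate(
    items: List[Any], page: int = 1, page_size: int = DEFAULT_PAGE_SIZE
) -> Dict[str, Any]:
    """Build the list response body: items, total, page, page_size."""
    page = max(page, 1)
    page_size = min(max(page_size, 1), MAX_PAGE_SIZE)
    start = (page - 1) * page_size
    return {
        "items": items[start:start + page_size],
        "total": len(items),
        "page": page,
        "page_size": page_size,
    }
```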

Ingest API

# POST /ingest - Ingest log
headers:
  X-Scenario-ID:
    required: true
    schema:
      type: string
      format: uuid

request:
  content:
    application/json:
      schema:
        type: object
        required: [message]
        properties:
          message:
            type: string
            minLength: 1
          source:
            type: string
            default: unknown

response:
  202:
    description: Log accepted
    content:
      application/json:
        schema:
          type: object
          properties:
            status:
              type: string
              example: accepted
            log_id:
              type: string
              format: uuid
            estimated_cost_impact:
              type: number

  400:
    description: Invalid scenario or scenario not running
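
A client call to the ingest endpoint can be assembled with the standard library alone. This sketch only builds the request and does not send it; `build_ingest_request` is a hypothetical helper, and the base URL is left as a parameter because the path prefix in front of `/ingest` depends on the deployment.

```python
import json
import urllib.request
import uuid


def build_ingest_request(
    base_url: str, scenario_id: uuid.UUID, message: str, source: str = "unknown"
) -> urllib.request.Request:
    """Build a POST /ingest request with the X-Scenario-ID header."""
    if not message:
        # The request schema requires message with minLength: 1.
        raise ValueError("message must not be empty")
    body = json.dumps({"message": message, "source": source}).encode("utf-8")
    return urllib.request.Request(
        url=f"{base_url}/ingest",
        data=body,
        method="POST",
        headers={
            "Content-Type": "application/json",
            "X-Scenario-ID": str(scenario_id),
        },
    )
```

Passing the returned object to `urllib.request.urlopen()` would send it; a `202` response means the log was accepted.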

Metrics API

# GET /scenarios/{id}/metrics - Get scenario metrics
response:
  200:
    content:
      application/json:
        schema:
          type: object
          properties:
            scenario_id:
              type: string
            summary:
              type: object
              properties:
                total_requests:
                  type: integer
                total_cost_usd:
                  type: number
                sqs_blocks:
                  type: integer
                lambda_invocations:
                  type: integer
                llm_tokens:
                  type: integer
                pii_violations:
                  type: integer
            cost_breakdown:
              type: array
              items:
                type: object
                properties:
                  service:
                    type: string
                  cost_usd:
                    type: number
                  percentage:
                    type: number
            timeseries:
              type: array
              items:
                type: object
                properties:
                  timestamp:
                    type: string
                    format: date-time
                  metric_type:
                    type: string
                  value:
                    type: number
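
The `cost_breakdown` array in the metrics response pairs each service's cost with its share of the total. A minimal sketch of that computation, assuming per-service costs are already aggregated as `Decimal` values (the function name and the two-decimal rounding are assumptions):

```python
from decimal import Decimal
from typing import Dict, List


def cost_breakdown(costs: Dict[str, Decimal]) -> List[dict]:
    """Turn {service: cost} into cost_breakdown entries with percentages."""
    total = sum(costs.values(), Decimal("0"))
    breakdown = []
    for service, cost in costs.items():
        # Avoid division by zero when no cost has accrued yet.
        share = (cost / total * 100) if total else Decimal("0")
        breakdown.append(
            {
                "service": service,
                "cost_usd": float(cost),
                "percentage": round(float(share), 2),
            }
        )
    return breakdown
```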

Reports API

# POST /scenarios/{id}/reports - Generate report
request:
  content:
    application/json:
      schema:
        type: object
        required: [format]
        properties:
          format:
            type: string
            enum: [pdf, csv]
          include_logs:
            type: boolean
            default: false
          date_from:
            type: string
            format: date-time
          date_to:
            type: string
            format: date-time

response:
  202:
    description: Report generation started
    content:
      application/json:
        schema:
          type: object
          properties:
            report_id:
              type: string
            status:
              type: string
              enum: [pending, processing, completed]
            download_url:
              type: string

# GET /reports/{id}/download - Download report
# GET /reports/{id}/status - Check report status

Pricing API (Admin)

# GET /pricing - List pricing
# POST /pricing - Create pricing entry
# PUT /pricing/{id} - Update pricing
# DELETE /pricing/{id} - Delete pricing (soft delete)

4.3 Schemas

components:
  schemas:
    Scenario:
      type: object
      properties:
        id:
          type: string
          format: uuid
        name:
          type: string
        description:
          type: string
        tags:
          type: array
          items:
            type: string
        status:
          type: string
          enum: [draft, running, completed, archived]
        region:
          type: string
        created_at:
          type: string
          format: date-time
        updated_at:
          type: string
          format: date-time
        completed_at:
          type: string
          format: date-time
        total_requests:
          type: integer
        total_cost_estimate:
          type: number

    LogEntry:
      type: object
      properties:
        id:
          type: string
          format: uuid
        scenario_id:
          type: string
          format: uuid
        received_at:
          type: string
          format: date-time
        message_hash:
          type: string
        message_preview:
          type: string
        source:
          type: string
        size_bytes:
          type: integer
        has_pii:
          type: boolean
        token_count:
          type: integer
        sqs_blocks:
          type: integer

  securitySchemes:
    BearerAuth:
      type: http
      scheme: bearer
      bearerFormat: JWT
    ApiKeyAuth:
      type: apiKey
      in: header
      name: X-API-Key

5. Data Flow

5.1 Log Ingestion Flow

┌──────────┐     POST /ingest        ┌──────────────┐
│  Client  │ ───────────────────────>│  FastAPI     │
│(Logstash)│  Headers:               │  Middleware  │
│          │   X-Scenario-ID: uuid   │              │
└──────────┘                         └──────┬───────┘
                                            │
                                            │ 1. Validate scenario exists & running
                                            │ 2. Parse JSON payload
                                            ▼
                                     ┌──────────────┐
                                     │  Ingest      │
                                     │  Service     │
                                     └──────┬───────┘
                                            │
                    ┌───────────────────────┼───────────────────────┐
                    │                       │                       │
                    ▼                       ▼                       ▼
            ┌──────────────┐       ┌──────────────┐       ┌──────────────┐
            │ PII Detector │       │ SQS Calculator│      │ Tokenizer    │
            │ • check email│       │ • calc blocks │      │ • count      │
            └──────┬───────┘       └──────┬───────┘      └──────┬───────┘
                   │                      │                     │
                   │ has_pii: bool        │ sqs_blocks: int     │ tokens: int
                   └──────────────────────┼─────────────────────┘
                                          │
                                          ▼
                                   ┌──────────────┐
                                   │  LogRepo     │
                                   │  save()      │
                                   └──────┬───────┘
                                          │
                                          ▼
                                   ┌──────────────┐
                                   │  PostgreSQL  │
                                   │ scenario_logs│
                                   └──────────────┘
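
The three-way fan-out in the ingestion flow could be run concurrently with `asyncio.gather`. The sketch below makes several assumptions that are not stated in this document: SQS blocks are taken to be 64 KiB billing chunks, only the email pattern is checked, and whitespace splitting stands in for the tiktoken-based tokenizer. All function names are hypothetical.

```python
import asyncio
import math
import re

SQS_BLOCK_BYTES = 64 * 1024  # assumption: one billed request per 64 KiB chunk
EMAIL_RE = re.compile(r"\b[A-Za-z0-9._%+-]+@[A-Za-z0-9.-]+\.[A-Za-z]{2,}\b")


async def detect_pii(message: str) -> bool:
    return EMAIL_RE.search(message) is not None


async def calc_sqs_blocks(message: str) -> int:
    size_bytes = len(message.encode("utf-8"))
    return max(1, math.ceil(size_bytes / SQS_BLOCK_BYTES))


async def count_tokens(message: str) -> int:
    # Placeholder: a real tokenizer would replace this whitespace split.
    return len(message.split())


async def analyze(message: str) -> dict:
    """Run the three analyzers concurrently and merge their results."""
    has_pii, sqs_blocks, token_count = await asyncio.gather(
        detect_pii(message), calc_sqs_blocks(message), count_tokens(message)
    )
    return {
        "size_bytes": len(message.encode("utf-8")),
        "has_pii": has_pii,
        "sqs_blocks": sqs_blocks,
        "token_count": token_count,
    }
```

The resulting dict carries the fields that `LogRepo.save()` would persist to `scenario_logs`.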

5.2 Scenario State Machine

                    ┌─────────────────────────────────────────────────────────┐
                    │                                                         │
                    ▼                                                         │
              ┌──────────┐     POST /start     ┌──────────┐                  │
     ┌───────│  DRAFT   │────────────────────>│ RUNNING  │                  │
     │       └──────────┘                     └────┬─────┘                  │
     │            ▲                               │                        │
     │            │                               │ POST /stop             │
     │            │ POST /archive                 ▼                        │
     │            │                          ┌──────────┐                  │
     │       ┌────┴────┐<────────────────────│COMPLETED │──────────────────┘
     │       │ARCHIVED │                     └──────────┘
     └──────>└─────────┘
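
One way to enforce the lifecycle is a transition table keyed by current status and action. The sketch below covers only the transitions that correspond to the `/start`, `/stop`, and `/archive` endpoints; which states `/archive` may be called from is an interpretation of the diagram, and its remaining unlabeled edges are deliberately left out. `Status`, `TRANSITIONS`, and `apply_action` are hypothetical names.

```python
from enum import Enum


class Status(str, Enum):
    DRAFT = "draft"
    RUNNING = "running"
    COMPLETED = "completed"
    ARCHIVED = "archived"


# (current status, action) -> next status
TRANSITIONS = {
    (Status.DRAFT, "start"): Status.RUNNING,
    (Status.RUNNING, "stop"): Status.COMPLETED,
    (Status.COMPLETED, "archive"): Status.ARCHIVED,  # assumption
}


def apply_action(current: Status, action: str) -> Status:
    """Return the next status or raise if the transition is not allowed."""
    try:
        return TRANSITIONS[(current, action)]
    except KeyError:
        raise ValueError(
            f"cannot {action!r} a scenario in status {current.value!r}"
        ) from None
```

Keeping the rules in one table makes it straightforward to reject, for example, ingestion into a scenario that is not `running`.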

5.3 Cost Calculation Flow

┌─────────────────────────────────────────────────────────────────────────┐
│                         COST CALCULATION PIPELINE                        │
└─────────────────────────────────────────────────────────────────────────┘

Input: scenario_logs row
├─ sqs_blocks
├─ token_count
└─ (future: lambda_gb_seconds)
         │
         ▼
┌─────────────────┐
│ Pricing Service │
│ • get_active()  │
└────────┬────────┘
         │ Query: SELECT * FROM aws_pricing
         │ WHERE service IN ('sqs', 'lambda', 'bedrock')
         │   AND region = :scenario_region
         │   AND is_active = TRUE
         ▼
┌─────────────────────────────────────────────────────────────────────────┐
│                              COST FORMULAS                               │
├─────────────────────────────────────────────────────────────────────────┤
│                                                                         │
│  SQS Cost:                                                              │
│    cost = blocks × price_per_million / 1,000,000                        │
│    Example: 100 blocks × $0.40 / 1M = $0.00004                          │
│                                                                         │
│  Lambda Cost:                                                           │
│    request_cost = invocations × price_per_million / 1,000,000           │
│    compute_cost = gb_seconds × price_per_gb_second                      │
│    total = request_cost + compute_cost                                  │
│    Example: 1M invoc × $0.20/1M + 10GBs × $0.00001667 = $0.20 + $0.00017│
│                                                                         │
│  Bedrock Cost:                                                          │
│    input_cost = input_tokens × price_per_1k_input / 1,000               │
│    output_cost = output_tokens × price_per_1k_output / 1,000            │
│    total = input_cost + output_cost                                     │
│    Example: 1000 tokens × $0.003/1K = $0.003                            │
│                                                                         │
└─────────────────────────────────────────────────────────────────────────┘
         │
         ▼
┌─────────────────┐
│  Update         │
│  scenarios      │
│  total_cost     │
└─────────────────┘
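
The formulas above translate directly into code. The sketch uses `Decimal` to avoid floating-point drift on very small amounts; the function names follow the `CostCalculator` methods in the architecture diagram, while the signatures are assumptions.

```python
from decimal import Decimal

MILLION = Decimal("1000000")
THOUSAND = Decimal("1000")


def calculate_sqs_cost(blocks: int, price_per_million: Decimal) -> Decimal:
    return Decimal(blocks) * price_per_million / MILLION


def calculate_lambda_cost(
    invocations: int,
    gb_seconds: Decimal,
    price_per_million: Decimal,
    price_per_gb_second: Decimal,
) -> Decimal:
    request_cost = Decimal(invocations) * price_per_million / MILLION
    compute_cost = gb_seconds * price_per_gb_second
    return request_cost + compute_cost


def calculate_bedrock_cost(
    input_tokens: int,
    output_tokens: int,
    price_per_1k_input: Decimal,
    price_per_1k_output: Decimal,
) -> Decimal:
    input_cost = Decimal(input_tokens) * price_per_1k_input / THOUSAND
    output_cost = Decimal(output_tokens) * price_per_1k_output / THOUSAND
    return input_cost + output_cost
```

With the example figures from the formulas, `calculate_sqs_cost(100, Decimal("0.40"))` evaluates to 0.00004 and `calculate_lambda_cost(1_000_000, Decimal("10"), Decimal("0.20"), Decimal("0.00001667"))` to 0.2001667.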

6. Security Architecture

6.1 Authentication & Authorization

┌─────────────────────────────────────────────────────────────────┐
│                      AUTHENTICATION LAYERS                       │
├─────────────────────────────────────────────────────────────────┤
│                                                                  │
│  Layer 1: API Key (Programmatic Access)                         │
│  ├─ Header: X-API-Key: <key>                                    │
│  ├─ Rate limiting: 1000 req/min                                 │
│  └─ Scope: /ingest, /metrics (read-only on other resources)     │
│                                                                  │
│  Layer 2: JWT Token (Web UI Access)                             │
│  ├─ Header: Authorization: Bearer <jwt>                         │
│  ├─ Expiration: 24h                                             │
│  ├─ Refresh token: 7d                                           │
│  └─ Scope: Full access based on roles                           │
│                                                                  │
│  Layer 3: Role-Based Access Control (RBAC)                      │
│  ├─ admin: Full access                                          │
│  ├─ user: CRUD own scenarios, read pricing                      │
│  └─ readonly: View only                                         │
│                                                                  │
└─────────────────────────────────────────────────────────────────┘
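
The role rules in Layer 3 can be expressed as a small authorization check. This is a sketch of one possible reading: `readonly` may read anything, `user` may read pricing and perform any action on scenarios they own, and `admin` is unrestricted. The `Principal` type, the action and resource strings, and `is_allowed` are all hypothetical.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class Principal:
    user_id: str
    role: str  # "admin", "user", or "readonly"


def is_allowed(
    principal: Principal,
    action: str,  # "create", "read", "update", or "delete"
    resource: str,  # "scenario" or "pricing"
    owner_id: Optional[str] = None,
) -> bool:
    """Decide whether the principal may perform action on resource."""
    if principal.role == "admin":
        return True
    if principal.role == "readonly":
        return action == "read"
    if principal.role == "user":
        if resource == "pricing":
            return action == "read"
        if resource == "scenario":
            # CRUD only on scenarios the user owns.
            return owner_id == principal.user_id
    return False
```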

6.2 Data Security

| Layer | Measure | Implementation |
|-------|---------|----------------|
| Transport | TLS 1.3 | Nginx reverse proxy |
| Storage | Hashing | SHA-256 for message_hash |
| PII | Detection + Truncation | Email regex, 500 char preview limit |
| API | Rate Limiting | slowapi: 100/min public, 1000/min authenticated |
| DB | Parameterized Queries | SQLAlchemy ORM (no raw SQL) |
| Secrets | Environment Variables | python-dotenv, Docker secrets |
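
To make the two rate-limit tiers concrete, here is a framework-free sketch of a fixed-window counter. It does not use or reproduce the slowapi API; `FixedWindowLimiter`, `LIMITS`, and the injectable `clock` are hypothetical names chosen for illustration, and the fixed-window strategy itself is an assumption about how the limits might be enforced.

```python
import time
from collections import defaultdict

# Requests per minute, as listed in the table above.
LIMITS = {"public": 100, "authenticated": 1000}
WINDOW_SECONDS = 60


class FixedWindowLimiter:
    """Counts requests per client within fixed 60-second windows (sketch)."""

    def __init__(self, clock=time.monotonic):
        self._clock = clock
        # Old windows are never evicted here; a real limiter would expire them.
        self._counters = defaultdict(int)

    def allow(self, client_id: str, tier: str) -> bool:
        window = int(self._clock() // WINDOW_SECONDS)
        key = (client_id, window)
        if self._counters[key] >= LIMITS[tier]:
            return False  # the caller would answer with HTTP 429
        self._counters[key] += 1
        return True
```

Injecting the clock keeps the limiter deterministic in tests, since the window can be advanced without sleeping.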

6.3 PII Detection Strategy

import re

# Pattern matching for common PII
def detect_pii(message: str) -> dict:
    patterns = {
        # Note: "|" inside a character class is a literal pipe, so use [A-Za-z]
        'email': r'\b[A-Za-z0-9._%+-]+@[A-Za-z0-9.-]+\.[A-Za-z]{2,}\b',
        'ssn': r'\b\d{3}-\d{2}-\d{4}\b',
        'credit_card': r'\b(?:\d[ -]*?){13,16}\b',
        'phone': r'\b\d{3}[-.]?\d{3}[-.]?\d{4}\b'
    }
    
    results = {}
    for pii_type, pattern in patterns.items():
        matches = re.findall(pattern, message)
        if matches:
            results[pii_type] = len(matches)
    
    return {
        'has_pii': len(results) > 0,
        'pii_types': list(results.keys()),
        'total_matches': sum(results.values())
    }
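
The "Detection + Truncation" measure from section 6.2 could be combined into a single preview step, sketched below. The function name `build_preview` and the `[EMAIL]` placeholder are assumptions; the document only specifies an email regex and a 500 character preview limit, so masking the match is one possible way to use the detection result.

```python
import re

EMAIL_RE = re.compile(r'\b[A-Za-z0-9._%+-]+@[A-Za-z0-9.-]+\.[A-Za-z]{2,}\b')
PREVIEW_LIMIT = 500  # characters, per the data security table


def build_preview(message: str, limit: int = PREVIEW_LIMIT) -> str:
    """Mask email addresses, then truncate the message to the preview limit."""
    # Mask first so that truncation can never cut an address in half
    # and leave a partial, unmasked fragment behind.
    redacted = EMAIL_RE.sub("[EMAIL]", message)
    return redacted[:limit]
```

Masking before truncating is a deliberate ordering choice: truncating first could split an address at the 500 character boundary and let part of it through.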

7. Technology Stack

7.1 Backend

| Component | Technology | Version | Purpose |
|-----------|------------|---------|---------|
| Framework | FastAPI | ≥0.110 | Web framework |
| Server | Uvicorn | ≥0.29 | ASGI server |
| Validation | Pydantic | ≥2.7 | Data validation |
| ORM | SQLAlchemy | ≥2.0 | Database ORM |
| Migrations | Alembic | latest | DB migrations |
| Driver | asyncpg | latest | Async PostgreSQL |
| Tokenizer | tiktoken | ≥0.6 | Token counting |
| Rate Limit | slowapi | latest | API rate limiting |
| Auth | python-jose | latest | JWT handling |
| Testing | pytest | ≥8.1 | Test framework |
| HTTP Client | httpx | ≥0.27 | Async HTTP |

7.2 Frontend

| Component | Technology | Version | Purpose |
|-----------|------------|---------|---------|
| Framework | React | ≥18 | UI library |
| Language | TypeScript | ≥5.0 | Type safety |
| Build | Vite | latest | Build tool |
| Styling | Tailwind CSS | ≥3.4 | CSS framework |
| Components | shadcn/ui | latest | UI components |
| Charts | Recharts | latest | Data viz |
| State | React Query | ≥5.0 | Server state |
| HTTP | Axios | latest | HTTP client |
| Routing | React Router | ≥6.0 | Navigation |

7.3 Infrastructure

| Component | Technology | Purpose |
|-----------|------------|---------|
| Container | Docker | Application containers |
| Orchestration | Docker Compose | Multi-container dev |
| Database | PostgreSQL 15+ | Primary data store |
| Reverse Proxy | Nginx | SSL, static files |
| Process Manager | systemd / PM2 | Production process mgmt |

8. Project Structure

mockupAWS/
├── backend/
│   ├── src/
│   │   ├── __init__.py
│   │   ├── main.py                 # FastAPI app entry
│   │   ├── config.py               # Settings & env vars
│   │   ├── dependencies.py         # FastAPI dependencies
│   │   ├── models/                 # SQLAlchemy models
│   │   │   ├── __init__.py
│   │   │   ├── base.py             # Base model
│   │   │   ├── scenario.py
│   │   │   ├── scenario_log.py
│   │   │   ├── scenario_metric.py
│   │   │   ├── aws_pricing.py
│   │   │   └── report.py
│   │   ├── schemas/                # Pydantic schemas
│   │   │   ├── __init__.py
│   │   │   ├── scenario.py
│   │   │   ├── log.py
│   │   │   ├── metric.py
│   │   │   ├── pricing.py
│   │   │   └── report.py
│   │   ├── api/                    # API routes
│   │   │   ├── __init__.py
│   │   │   ├── deps.py             # Dependencies
│   │   │   └── v1/
│   │   │       ├── __init__.py
│   │   │       ├── scenarios.py    # /scenarios/*
│   │   │       ├── ingest.py       # /ingest
│   │   │       ├── metrics.py      # /metrics
│   │   │       ├── reports.py      # /reports
│   │   │       └── pricing.py      # /pricing
│   │   ├── services/               # Business logic
│   │   │   ├── __init__.py
│   │   │   ├── scenario_service.py
│   │   │   ├── ingest_service.py
│   │   │   ├── cost_calculator.py
│   │   │   ├── report_service.py
│   │   │   └── pii_detector.py
│   │   ├── repositories/           # Data access
│   │   │   ├── __init__.py
│   │   │   ├── base.py
│   │   │   ├── scenario_repo.py
│   │   │   ├── log_repo.py
│   │   │   ├── metric_repo.py
│   │   │   └── pricing_repo.py
│   │   ├── core/                   # Core utilities
│   │   │   ├── __init__.py
│   │   │   ├── security.py         # Auth, JWT
│   │   │   ├── database.py         # DB connection
│   │   │   └── exceptions.py       # Custom exceptions
│   │   └── utils/                  # Utilities
│   │       ├── __init__.py
│   │       └── hashing.py          # SHA-256 utils
│   ├── alembic/                    # Database migrations
│   │   ├── versions/               # Migration files
│   │   ├── env.py
│   │   └── alembic.ini
│   ├── tests/
│   │   ├── __init__.py
│   │   ├── conftest.py             # pytest fixtures
│   │   ├── unit/
│   │   │   ├── test_services.py
│   │   │   └── test_cost_calculator.py
│   │   ├── integration/
│   │   │   ├── test_api_scenarios.py
│   │   │   ├── test_api_ingest.py
│   │   │   └── test_api_metrics.py
│   │   └── e2e/
│   │       └── test_full_flow.py
│   ├── Dockerfile
│   ├── pyproject.toml
│   └── requirements.txt
│
├── frontend/
│   ├── src/
│   │   ├── components/
│   │   │   ├── ui/                 # shadcn/ui components
│   │   │   ├── layout/
│   │   │   │   ├── Header.tsx
│   │   │   │   ├── Sidebar.tsx
│   │   │   │   └── Layout.tsx
│   │   │   ├── scenarios/
│   │   │   │   ├── ScenarioList.tsx
│   │   │   │   ├── ScenarioCard.tsx
│   │   │   │   ├── ScenarioForm.tsx
│   │   │   │   └── ScenarioDetail.tsx
│   │   │   ├── metrics/
│   │   │   │   ├── MetricCard.tsx
│   │   │   │   ├── CostChart.tsx
│   │   │   │   └── MetricsDashboard.tsx
│   │   │   └── reports/
│   │   │       ├── ReportGenerator.tsx
│   │   │       └── ReportDownload.tsx
│   │   ├── pages/
│   │   │   ├── Dashboard.tsx
│   │   │   ├── ScenariosPage.tsx
│   │   │   ├── ScenarioCreate.tsx
│   │   │   ├── ScenarioDetail.tsx
│   │   │   ├── Compare.tsx
│   │   │   ├── Reports.tsx
│   │   │   └── Settings.tsx
│   │   ├── hooks/
│   │   │   ├── useScenarios.ts
│   │   │   ├── useMetrics.ts
│   │   │   └── useReports.ts
│   │   ├── services/
│   │   │   ├── api.ts              # Axios config
│   │   │   ├── scenarioApi.ts
│   │   │   └── metricApi.ts
│   │   ├── types/
│   │   │   ├── scenario.ts
│   │   │   ├── metric.ts
│   │   │   └── api.ts
│   │   ├── context/
│   │   │   └── ThemeContext.tsx
│   │   ├── App.tsx
│   │   └── main.tsx
│   ├── public/
│   ├── index.html
│   ├── Dockerfile
│   ├── package.json
│   ├── tsconfig.json
│   ├── tailwind.config.js
│   └── vite.config.ts
│
├── docker-compose.yml
├── nginx.conf
├── .env.example
├── .env
├── .gitignore
└── README.md

9. Architecture Decisions

DEC-001: Async-First Architecture

Decision: Use Python async/await across the whole stack (FastAPI, SQLAlchemy, asyncpg)

Motivation:

  • High throughput required (>1000 RPS)
  • I/O-bound operations (DB, tokenizer)
  • Better resource utilization than sync

Alternatives considered:

  • Sync + ThreadPool: Simpler but less efficient
  • Celery + Redis: Too complex for this use case

Consequences:

  • Learning curve for async
  • More complex debugging
  • Better scalability
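
The throughput argument for async can be illustrated with a toy example: ten simulated I/O waits run concurrently finish in roughly the time of one. This is only a sketch; `fake_io` stands in for a database call and is a hypothetical name, and `asyncio.sleep` simulates the wait.

```python
import asyncio
import time


async def fake_io(delay: float) -> float:
    """Simulate an I/O-bound call (for example a DB query) by sleeping."""
    await asyncio.sleep(delay)
    return delay


async def main() -> tuple[list[float], float]:
    start = time.perf_counter()
    # All ten waits overlap on a single thread instead of running back to back.
    results = await asyncio.gather(*(fake_io(0.1) for _ in range(10)))
    return list(results), time.perf_counter() - start


results, elapsed = asyncio.run(main())
```

Run sequentially, the same ten calls would take at least one second. The overlap only helps while the work is waiting on I/O, so CPU-heavy steps do not benefit in the same way.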

DEC-002: Repository Pattern

Decision: Implement the Repository Pattern for data access

Motivation:

  • Separation between business logic and data access
  • Easy testing with mock repositories
  • Possibility of changing the DB in the future

Structure:

from typing import Generic, TypeVar
from uuid import UUID

T = TypeVar("T")  # the model type handled by a concrete repository

class BaseRepository(Generic[T]):
    async def get(self, id: UUID) -> T | None: ...
    async def list(self, **filters) -> list[T]: ...
    async def create(self, obj: T) -> T: ...
    async def update(self, id: UUID, data: dict) -> T: ...
    async def delete(self, id: UUID) -> bool: ...

DEC-003: Single Table Instead of Separate Database per Scenario

Decision: Use a single scenario_logs table with a scenario_id FK instead of separate databases

Motivation:

  • Simpler to manage
  • Cross-scenario queries possible (comparisons)
  • Simpler backup/restore

Alternatives considered:

  • Schema per scenario: Too much overhead
  • Separate DBs: Too complex for the MVP

DEC-004: Message Hashing for Deduplication

Decision: Use a SHA-256 hash of the message for deduplication

Motivation:

  • Privacy: Do not store full messages
  • Performance: O(1) hash lookup
  • Storage: Space savings

Implementation:

import hashlib
message_hash = hashlib.sha256(message.encode()).hexdigest()
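
Building on the hash above, deduplication can be sketched as counting occurrences per hash so that only the digest, never the full message, needs to be kept. The function names `hash_message` and `count_unique` are hypothetical, and counting duplicates (rather than simply dropping them) is an assumption about what the ingestion step needs.

```python
import hashlib


def hash_message(message: str) -> str:
    """Return the SHA-256 hex digest of a message (64 hex characters)."""
    return hashlib.sha256(message.encode("utf-8")).hexdigest()


def count_unique(messages: list[str]) -> dict[str, int]:
    """Map each distinct message hash to the number of times it was seen."""
    counts: dict[str, int] = {}
    for message in messages:
        digest = hash_message(message)
        # Dictionary lookup by digest is the in-memory analogue of the hash lookup.
        counts[digest] = counts.get(digest, 0) + 1
    return counts


counts = count_unique(["timeout on db", "disk full", "timeout on db"])
```

Here `counts` holds two entries, one of which has a count of 2.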

DEC-005: Time-Series Metrics

Decision: Store metrics as time series in scenario_metrics

Motivation:

  • Trend analysis possible
  • Flexible aggregations
  • Audit trail

Trade-off:

  • More storage than aggregated fields
  • More complex but indexed queries
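
The "flexible aggregations" benefit can be illustrated by rolling raw time-series points up into hourly totals. This is a sketch: the tuple layout `(timestamp, metric_name, value)` is a hypothetical simplification of a scenario_metrics row, and in practice the aggregation would more likely run in SQL.

```python
from collections import defaultdict
from datetime import datetime, timezone


def hourly_totals(points):
    """Sum metric values per (hour bucket, metric name)."""
    totals = defaultdict(float)
    for timestamp, name, value in points:
        # Truncate the timestamp to the start of its hour.
        bucket = timestamp.replace(minute=0, second=0, microsecond=0)
        totals[(bucket, name)] += value
    return dict(totals)


points = [
    (datetime(2026, 4, 7, 10, 5, tzinfo=timezone.utc), "llm_cost", 0.5),
    (datetime(2026, 4, 7, 10, 40, tzinfo=timezone.utc), "llm_cost", 0.25),
    (datetime(2026, 4, 7, 11, 1, tzinfo=timezone.utc), "llm_cost", 1.0),
]
totals = hourly_totals(points)
```

Because the raw points are kept, the same data can later be re-bucketed by day or by metric without a schema change, which is the trade-off described above.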

10. Performance Considerations

10.1 Database Optimization

| Optimization | Implementation | Benefit |
|--------------|----------------|---------|
| Indexes | B-tree on foreign keys, timestamps | Fast lookups |
| GIN | tags (JSONB) | Fast array search |
| Partitioning | scenario_logs by date | Query pruning |
| Connection Pool | asyncpg pool (20-50) | Concurrency |

10.2 Caching Strategy (Future)

Layer 1: In-memory (FastAPI state)
├─ Active scenario metadata
└─ AWS pricing (rarely changes)

Layer 2: Redis (future)
├─ Session storage
├─ Rate limiting counters
└─ Report generation status
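
Layer 1 could be as simple as a value that is reloaded after a time-to-live, which suits rarely changing data such as AWS pricing. The sketch below is an assumption about one way to do this: `TTLCache`, the `loader` callback, and the 300 second TTL in the usage line are all hypothetical.

```python
import time
from typing import Callable, Generic, TypeVar

T = TypeVar("T")


class TTLCache(Generic[T]):
    """Caches a single loaded value and reloads it once the TTL has elapsed."""

    def __init__(
        self,
        loader: Callable[[], T],
        ttl_seconds: float,
        clock: Callable[[], float] = time.monotonic,
    ) -> None:
        self._loader = loader
        self._ttl = ttl_seconds
        self._clock = clock
        self._value = None
        self._loaded_at = None  # None means "never loaded"

    def get(self) -> T:
        now = self._clock()
        if self._loaded_at is None or now - self._loaded_at >= self._ttl:
            self._value = self._loader()
            self._loaded_at = now
        return self._value


# Hypothetical usage: the loader would query the pricing table.
pricing_cache = TTLCache(lambda: {"bedrock_input_per_1k": 0.003}, ttl_seconds=300)
```

This version is per-process, so each worker holds its own copy; that limitation is what the Redis layer listed above would address.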

10.3 Query Optimization

  • Use selectinload for relationships
  • Batch inserts for logs (asyncpg copy_records_to_table; copy_expert is a psycopg2 API)
  • Materialized views for reports
  • Async tasks for heavy operations
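
Batch inserts start with grouping incoming log rows into fixed-size chunks, so that each chunk becomes one bulk write instead of one statement per row. The helper below is a sketch; `chunked` and the batch size are hypothetical, and the actual write call is left out.

```python
from itertools import islice
from typing import Iterable, Iterator, TypeVar

T = TypeVar("T")


def chunked(items: Iterable[T], size: int) -> Iterator[list[T]]:
    """Yield consecutive lists of at most `size` items."""
    if size <= 0:
        raise ValueError("size must be positive")
    iterator = iter(items)
    # islice pulls the next `size` items; an empty list means the input is exhausted.
    while batch := list(islice(iterator, size)):
        yield batch


batches = list(chunked(range(5), 2))
```

Because the helper consumes any iterable lazily, it also works on a stream of logs that does not fit in memory.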

11. Error Handling Strategy

11.1 Exception Hierarchy

class AppException(Exception):
    """Base application exception"""
    status_code: int = 500
    code: str = "internal_error"

class NotFoundException(AppException):
    status_code = 404
    code = "not_found"

class ValidationException(AppException):
    status_code = 400
    code = "validation_error"

class ConflictException(AppException):
    status_code = 409
    code = "conflict"

class RateLimitException(AppException):
    status_code = 429
    code = "rate_limited"

11.2 Global Exception Handler

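from datetime import datetime, timezone

from fastapi import Request
from fastapi.responses import JSONResponse

# `app` is the FastAPI instance created in main.py
@app.exception_handler(AppException)
async def app_exception_handler(request: Request, exc: AppException):
    return JSONResponse(
        status_code=exc.status_code,
        content={
            "error": exc.code,
            "message": str(exc),
            # datetime.utcnow() is deprecated since Python 3.12; use an aware UTC time
            "timestamp": datetime.now(timezone.utc).isoformat()
        }
    )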
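
The mapping from exception to response body can be checked without starting the web framework. The sketch below repeats two classes from section 11.1 so that it is self-contained; `to_error_response` is a hypothetical helper, not part of the project.

```python
from datetime import datetime, timezone


class AppException(Exception):
    """Base application exception"""
    status_code: int = 500
    code: str = "internal_error"


class NotFoundException(AppException):
    status_code = 404
    code = "not_found"


def to_error_response(exc: AppException) -> tuple[int, dict]:
    """Build the (status, body) pair that the global handler would return."""
    return exc.status_code, {
        "error": exc.code,
        "message": str(exc),  # the text passed when the exception was raised
        "timestamp": datetime.now(timezone.utc).isoformat(),
    }


status, body = to_error_response(NotFoundException("Scenario not found"))
```

Here `status` is 404 and `body["error"]` is "not_found", because subclasses override only the class-level attributes.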

12. Deployment Architecture

12.1 Docker Compose (Development)

version: '3.8'

services:
  postgres:
    image: postgres:15-alpine
    environment:
      POSTGRES_DB: mockupaws
      POSTGRES_USER: app
      POSTGRES_PASSWORD: ${DB_PASSWORD}
    volumes:
      - postgres_data:/var/lib/postgresql/data
    ports:
      - "5432:5432"
    healthcheck:
      test: ["CMD-SHELL", "pg_isready -U app -d mockupaws"]

  backend:
    build: ./backend
    environment:
      DATABASE_URL: postgresql+asyncpg://app:${DB_PASSWORD}@postgres:5432/mockupaws
    ports:
      - "8000:8000"
    depends_on:
      postgres:
        condition: service_healthy

  frontend:
    build: ./frontend
    ports:
      - "3000:80"
    depends_on:
      - backend

volumes:
  postgres_data:

12.2 Production Considerations

  • Use managed PostgreSQL (AWS RDS, Azure PostgreSQL)
  • Nginx as reverse proxy with SSL
  • Environment-specific configuration
  • Log aggregation (ELK or similar)
  • Monitoring (Prometheus + Grafana)
  • Health checks and readiness probes

Document created by @spec-architect
Version: 1.0
Date: 2026-04-07