# Architecture - mockupAWS

## 1. Overview

mockupAWS is an AWS cost-simulation platform that profiles log traffic and computes the cost drivers (SQS, Lambda, Bedrock/LLM) before a production deployment.

**Architecture:** Layered Architecture with the Repository and Service Layer patterns

**Paradigm:** Async-first (FastAPI + SQLAlchemy async)

**Deployment:** Container-based (Docker Compose)

---

## 2. System Architecture

### 2.1 High-Level Architecture

```
┌─────────────────────────────────────────────────────────────────────────────┐
│                                CLIENT LAYER                                 │
│  ┌──────────────────┐  ┌──────────────────┐  ┌──────────────────────────┐   │
│  │     Logstash     │  │   React Web UI   │  │      API Consumers       │   │
│  │   (Log Source)   │  │   (Dashboard)    │  │     (CI/CD, Scripts)     │   │
│  └────────┬─────────┘  └────────┬─────────┘  └───────────┬──────────────┘   │
└───────────┼─────────────────────┼────────────────────────┼───────────────────┘
            │                     │                        │
            │ HTTP POST           │ HTTPS                  │ API Key + JWT
            │ /ingest             │ /api/v1/*              │ /api/v1/*
            ▼                     ▼                        ▼
┌─────────────────────────────────────────────────────────────────────────────┐
│                                  API LAYER                                  │
│                          FastAPI + Uvicorn (ASGI)                           │
│  ┌──────────────────────────────────────────────────────────────────────┐   │
│  │                           Middleware Stack                           │   │
│  │  ├── CORS                                                            │   │
│  │  ├── Rate Limiting (slowapi)                                         │   │
│  │  ├── Authentication (JWT / API Key)                                  │   │
│  │  ├── Request Validation (Pydantic)                                   │   │
│  │  └── Error Handling                                                  │   │
│  └──────────────────────────────────────────────────────────────────────┘   │
│                                                                             │
│  ┌──────────────┐ ┌──────────────┐ ┌──────────────┐ ┌──────────────────┐    │
│  │  /scenarios  │ │   /ingest    │ │   /reports   │ │     /pricing     │    │
│  │     CRUD     │ │    (log      │ │   generate   │ │     (admin)      │    │
│  │              │ │   intake)    │ │   download   │ │                  │    │
│  └──────┬───────┘ └──────┬───────┘ └──────┬───────┘ └────────┬─────────┘    │
└─────────┼────────────────┼────────────────┼──────────────────┼─────────────┘
          │                │                │                  │
          ▼                ▼                ▼                  ▼
┌─────────────────────────────────────────────────────────────────────────────┐
│                                SERVICE LAYER                                │
│  ┌──────────────────┐ ┌──────────────────┐
                                             ┌──────────────────────────────┐ │
│  │ ScenarioService  │ │ IngestService    │ │ CostCalculator               │ │
│  │ ───────────────  │ │ ──────────────   │ │ ─────────────                │ │
│  │ • create()       │ │ • ingest_log()   │ │ • calculate_sqs_cost()       │ │
│  │ • update()       │ │ • batch_process()│ │ • calculate_lambda_cost()    │ │
│  │ • delete()       │ │ • deduplicate()  │ │ • calculate_bedrock_cost()   │ │
│  │ • lifecycle()    │ │ • persist()      │ │ • get_total_cost()           │ │
│  └──────────────────┘ └──────────────────┘ └──────────────────────────────┘ │
│  ┌──────────────────┐ ┌──────────────────┐ ┌──────────────────────────────┐ │
│  │ ReportService    │ │ PIIDetector      │ │ TokenizerService             │ │
│  │ ──────────────   │ │ ───────────      │ │ ───────────────              │ │
│  │ • generate_csv() │ │ • detect_email() │ │ • count_tokens()             │ │
│  │ • generate_pdf() │ │ • scan_patterns()│ │ • encode()                   │ │
│  │ • compile()      │ │ • report()       │ │ • get_encoding()             │ │
│  └──────────────────┘ └──────────────────┘ └──────────────────────────────┘ │
└─────────┬──────────────────────────────────────────────────────┬────────────┘
          │                                                      │
          ▼                                                      ▼
┌─────────────────────────────────────────────────────────────────────────────┐
│                              REPOSITORY LAYER                               │
│  ┌──────────────────┐ ┌──────────────────┐ ┌──────────────────────────────┐ │
│  │ ScenarioRepo     │ │ LogRepo          │ │ PricingRepo                  │ │
│  │ ─────────────    │ │ ───────          │ │ ──────────                   │ │
│  │ • get_by_id()    │ │ • save()         │ │ • get_by_service_region()    │ │
│  │ • list()         │ │ • list_by_       │ │ • list_active()              │ │
│  │ • create()       │ │   scenario()     │ │ • update()                   │ │
│  │ • update()       │ │ • count_by_      │ │ • bulk_insert()              │ │
│  │ • delete()       │ │   hash()         │ │                              │ │
│  └──────────────────┘ └──────────────────┘ └──────────────────────────────┘ │
│  ┌──────────────────┐ ┌──────────────────┐                                  │
│  │ MetricRepo       │ │ ReportRepo       │                                  │
│  │ ──────────       │ │ ──────────       │                                  │
│  │ • save()         │ │ • save()         │                                  │
│  │ • get_aggregated │ │ • list()         │                                  │
│  │ • list_by_type() │ │ • delete()       │                                  │
│  └──────────────────┘ └──────────────────┘                                  │
└─────────────────────────────────────────────────────────────────────────────┘
                                      │
                                      │ SQLAlchemy 2.0 Async
                                      │ asyncpg driver
                                      ▼
┌─────────────────────────────────────────────────────────────────────────────┐
│                               DATABASE LAYER                                │
│                               PostgreSQL 15+                                │
│  ┌──────────────────┐ ┌──────────────────┐ ┌──────────────────────────────┐ │
│  │ scenarios        │ │ scenario_logs    │ │ aws_pricing                  │ │
│  │ ─────────        │ │ ─────────────    │ │ ───────────                  │ │
│  │ • metadata       │ │ • logs storage   │ │ • service prices             │ │
│  │ • state machine  │ │ • hash for dedup │ │ • history tracking           │ │
│  │ • cost totals    │ │ • PII flags      │ │ • region-specific            │ │
│  └──────────────────┘ └──────────────────┘ └──────────────────────────────┘ │
│  ┌──────────────────┐ ┌──────────────────┐                                  │
│  │ scenario_metrics │ │ reports          │                                  │
│  │ ───────────────  │ │ ───────          │                                  │
│  │ • time-series    │ │ • generated      │                                  │
│  │ • aggregates     │ │ • metadata       │                                  │
│  │ • cost breakdown │ │ • file refs      │                                  │
│  └──────────────────┘ └──────────────────┘                                  │
└─────────────────────────────────────────────────────────────────────────────┘
```

### 2.2 Layer Responsibilities

| Layer | Responsibilities | Technologies |
|-------|------------------|--------------|
| **Client** | User interaction, log ingestion | Browser, Logstash, curl |
| **API** | Routing, validation, auth, middleware | FastAPI, Pydantic, slowapi |
| **Service** | Business logic, orchestration | Python async/await |
| **Repository** | Data access, query abstraction | SQLAlchemy 2.0 Repository pattern |
| **Database** | Persistence, ACID, queries | PostgreSQL 15+ |

---

## 3.
Database Schema ### 3.1 Entity Relationship Diagram ``` ┌─────────────────────────────────────────────────────────────────────────┐ │ SCHEMA ERD │ └─────────────────────────────────────────────────────────────────────────┘ ┌─────────────────────┐ ┌─────────────────────┐ │ scenarios │ │ aws_pricing │ ├─────────────────────┤ ├─────────────────────┤ │ PK id: UUID │ │ PK id: UUID │ │ name: VARCHAR(255)│ │ service: VARCHAR │ │ description: TEXT│ │ region: VARCHAR │ │ tags: JSONB │ │ tier: VARCHAR │ │ status: ENUM │ │ price: DECIMAL │ │ region: VARCHAR │ │ unit: VARCHAR │ │ created_at: TS │ │ effective_from: D│ │ updated_at: TS │ │ effective_to: D │ │ completed_at: TS │ │ is_active: BOOL │ │ total_requests: INT│ │ source_url: TEXT │ │ total_cost: DEC │ └─────────────────────┘ └──────────┬──────────┘ │ │ 1:N ▼ ┌─────────────────────┐ ┌─────────────────────┐ │ scenario_logs │ │ scenario_metrics │ ├─────────────────────┤ ├─────────────────────┤ │ PK id: UUID │ │ PK id: UUID │ │ FK scenario_id: UUID│ │ FK scenario_id: UUID│ │ received_at: TS │ │ timestamp: TS │ │ message_hash: V64│ │ metric_type: VAR │ │ message_preview │ │ metric_name: VAR │ │ source: VARCHAR │ │ value: DECIMAL │ │ size_bytes: INT │ │ unit: VARCHAR │ │ has_pii: BOOL │ │ metadata: JSONB │ │ token_count: INT │ └─────────────────────┘ │ sqs_blocks: INT │ └─────────────────────┘ │ │ 1:N (optional) ▼ ┌─────────────────────┐ │ reports │ ├─────────────────────┤ │ PK id: UUID │ │ FK scenario_id: UUID│ │ format: ENUM │ │ file_path: TEXT │ │ generated_at: TS │ │ metadata: JSONB │ └─────────────────────┘ ``` ### 3.2 DDL - Schema Definition ```sql -- ============================================ -- EXTENSIONS -- ============================================ CREATE EXTENSION IF NOT EXISTS "uuid-ossp"; CREATE EXTENSION IF NOT EXISTS "pg_trgm"; -- For text search -- ============================================ -- ENUMS -- ============================================ CREATE TYPE scenario_status AS ENUM ('draft', 'running', 
    'completed', 'archived');
CREATE TYPE report_format AS ENUM ('pdf', 'csv');

-- ============================================
-- TABLE: scenarios
-- ============================================
CREATE TABLE scenarios (
    id UUID PRIMARY KEY DEFAULT uuid_generate_v4(),
    name VARCHAR(255) NOT NULL,
    description TEXT,
    tags JSONB DEFAULT '[]'::jsonb,
    status scenario_status NOT NULL DEFAULT 'draft',
    region VARCHAR(50) NOT NULL DEFAULT 'us-east-1',
    created_at TIMESTAMP WITH TIME ZONE NOT NULL DEFAULT NOW(),
    updated_at TIMESTAMP WITH TIME ZONE NOT NULL DEFAULT NOW(),
    completed_at TIMESTAMP WITH TIME ZONE,
    started_at TIMESTAMP WITH TIME ZONE,
    total_requests INTEGER NOT NULL DEFAULT 0,
    total_cost_estimate DECIMAL(12, 6) NOT NULL DEFAULT 0.000000,

    -- Constraints
    CONSTRAINT chk_name_not_empty CHECK (char_length(trim(name)) > 0),
    CONSTRAINT chk_region_not_empty CHECK (char_length(trim(region)) > 0)
);

-- Indexes
CREATE INDEX idx_scenarios_status ON scenarios(status);
CREATE INDEX idx_scenarios_region ON scenarios(region);
CREATE INDEX idx_scenarios_created_at ON scenarios(created_at DESC);
CREATE INDEX idx_scenarios_tags ON scenarios USING GIN(tags);

-- Trigger for updated_at
CREATE OR REPLACE FUNCTION update_updated_at_column()
RETURNS TRIGGER AS $$
BEGIN
    NEW.updated_at = NOW();
    RETURN NEW;
END;
$$ LANGUAGE plpgsql;

CREATE TRIGGER update_scenarios_updated_at
    BEFORE UPDATE ON scenarios
    FOR EACH ROW EXECUTE FUNCTION update_updated_at_column();

-- ============================================
-- TABLE: scenario_logs
-- ============================================
CREATE TABLE scenario_logs (
    id UUID PRIMARY KEY DEFAULT uuid_generate_v4(),
    scenario_id UUID NOT NULL REFERENCES scenarios(id) ON DELETE CASCADE,
    received_at TIMESTAMP WITH TIME ZONE NOT NULL DEFAULT NOW(),
    message_hash VARCHAR(64) NOT NULL, -- SHA256
    message_preview VARCHAR(500),
    source VARCHAR(100) DEFAULT 'unknown',
    size_bytes INTEGER NOT NULL DEFAULT 0,
    has_pii BOOLEAN NOT NULL DEFAULT FALSE,
    token_count INTEGER NOT
    NULL DEFAULT 0,
    sqs_blocks INTEGER NOT NULL DEFAULT 1,

    -- Constraints
    CONSTRAINT chk_size_positive CHECK (size_bytes >= 0),
    CONSTRAINT chk_token_positive CHECK (token_count >= 0),
    CONSTRAINT chk_blocks_positive CHECK (sqs_blocks >= 1)
);

-- Indexes
CREATE INDEX idx_logs_scenario_id ON scenario_logs(scenario_id);
CREATE INDEX idx_logs_received_at ON scenario_logs(received_at DESC);
CREATE INDEX idx_logs_message_hash ON scenario_logs(message_hash);
CREATE INDEX idx_logs_has_pii ON scenario_logs(has_pii) WHERE has_pii = TRUE;

-- ============================================
-- TABLE: scenario_metrics
-- ============================================
CREATE TABLE scenario_metrics (
    id UUID PRIMARY KEY DEFAULT uuid_generate_v4(),
    scenario_id UUID NOT NULL REFERENCES scenarios(id) ON DELETE CASCADE,
    timestamp TIMESTAMP WITH TIME ZONE NOT NULL DEFAULT NOW(),
    metric_type VARCHAR(50) NOT NULL, -- 'sqs', 'lambda', 'bedrock', 'safety'
    metric_name VARCHAR(100) NOT NULL,
    value DECIMAL(15, 6) NOT NULL DEFAULT 0.000000,
    unit VARCHAR(20) NOT NULL, -- 'count', 'bytes', 'tokens', 'usd', 'invocations'
    metadata JSONB DEFAULT '{}'::jsonb
);

-- Indexes
CREATE INDEX idx_metrics_scenario_id ON scenario_metrics(scenario_id);
CREATE INDEX idx_metrics_timestamp ON scenario_metrics(timestamp DESC);
CREATE INDEX idx_metrics_type ON scenario_metrics(metric_type);
CREATE INDEX idx_metrics_scenario_type ON scenario_metrics(scenario_id, metric_type);

-- ============================================
-- TABLE: aws_pricing
-- ============================================
CREATE TABLE aws_pricing (
    id UUID PRIMARY KEY DEFAULT uuid_generate_v4(),
    service VARCHAR(50) NOT NULL, -- 'sqs', 'lambda', 'bedrock'
    region VARCHAR(50) NOT NULL,
    tier VARCHAR(50) NOT NULL DEFAULT 'standard',
    price_per_unit DECIMAL(15, 10) NOT NULL,
    unit VARCHAR(20) NOT NULL, -- 'per_million_requests', 'per_gb_second', 'per_1k_tokens'
    effective_from DATE NOT NULL DEFAULT CURRENT_DATE,
    effective_to DATE,
    is_active BOOLEAN NOT NULL DEFAULT
    TRUE,
    source_url VARCHAR(500),
    description TEXT,

    -- Constraints
    CONSTRAINT chk_price_positive CHECK (price_per_unit >= 0),
    CONSTRAINT chk_valid_dates CHECK (effective_to IS NULL OR effective_to >= effective_from)
);

-- A table-level UNIQUE constraint cannot carry a WHERE clause in PostgreSQL,
-- so uniqueness among active rows is enforced with a partial unique index.
CREATE UNIQUE INDEX uq_pricing_unique_active
    ON aws_pricing(service, region, tier, effective_from)
    WHERE is_active = TRUE;

-- Indexes
CREATE INDEX idx_pricing_service ON aws_pricing(service);
CREATE INDEX idx_pricing_region ON aws_pricing(region);
CREATE INDEX idx_pricing_active ON aws_pricing(service, region, tier) WHERE is_active = TRUE;

-- ============================================
-- TABLE: reports
-- ============================================
CREATE TABLE reports (
    id UUID PRIMARY KEY DEFAULT uuid_generate_v4(),
    scenario_id UUID NOT NULL REFERENCES scenarios(id) ON DELETE CASCADE,
    format report_format NOT NULL,
    file_path VARCHAR(500) NOT NULL,
    file_size_bytes INTEGER,
    generated_at TIMESTAMP WITH TIME ZONE NOT NULL DEFAULT NOW(),
    generated_by VARCHAR(100), -- user_id or api_key_id
    metadata JSONB DEFAULT '{}'::jsonb
);

-- Indexes
CREATE INDEX idx_reports_scenario_id ON reports(scenario_id);
CREATE INDEX idx_reports_generated_at ON reports(generated_at DESC);
```

### 3.3 Key Queries

```sql
-- Query: Get scenario with aggregated metrics
SELECT
    s.*,
    COUNT(DISTINCT sl.id) as total_logs,
    COUNT(DISTINCT CASE WHEN sl.has_pii THEN sl.id END) as pii_violations,
    SUM(sl.token_count) as total_tokens,
    SUM(sl.sqs_blocks) as total_sqs_blocks
FROM scenarios s
LEFT JOIN scenario_logs sl ON s.id = sl.scenario_id
WHERE s.id = :scenario_id
GROUP BY s.id;

-- Query: Get cost breakdown by service
SELECT
    metric_type,
    SUM(value) as total_value,
    unit
FROM scenario_metrics
WHERE scenario_id = :scenario_id
  AND metric_name LIKE '%cost%'
GROUP BY metric_type, unit;

-- Query: Get active pricing for service/region
SELECT *
FROM aws_pricing
WHERE service = :service
  AND region = :region
  AND is_active = TRUE
  AND (effective_to IS NULL OR effective_to >= CURRENT_DATE)
ORDER BY effective_from DESC
LIMIT 1;
```

---

## 4. API Specifications

### 4.1 OpenAPI Overview

```yaml
openapi: 3.0.0
info:
  title: mockupAWS API
  version: 0.3.0
  description: AWS Cost Simulation Platform API

servers:
  - url: http://localhost:8000/api/v1
    description: Development server

security:
  - BearerAuth: []
  - ApiKeyAuth: []
```

### 4.2 Endpoints

#### Scenarios API

```yaml
# POST /scenarios - Create new scenario
request:
  content:
    application/json:
      schema:
        type: object
        required: [name, region]
        properties:
          name:
            type: string
            minLength: 1
            maxLength: 255
          description:
            type: string
          tags:
            type: array
            items:
              type: string
          region:
            type: string
            enum: [us-east-1, us-west-2, eu-west-1, eu-central-1]
          tier:
            type: string
            enum: [standard, on-demand]
            default: standard
response:
  201:
    content:
      application/json:
        schema:
          $ref: '#/components/schemas/Scenario'

# GET /scenarios - List scenarios
parameters:
  - name: status
    in: query
    schema:
      type: string
      enum: [draft, running, completed, archived]
  - name: region
    in: query
    schema:
      type: string
  - name: page
    in: query
    schema:
      type: integer
      default: 1
  - name: page_size
    in: query
    schema:
      type: integer
      default: 20
      maximum: 100
response:
  200:
    content:
      application/json:
        schema:
          type: object
          properties:
            items:
              type: array
              items:
                $ref: '#/components/schemas/Scenario'
            total:
              type: integer
            page:
              type: integer
            page_size:
              type: integer

# GET /scenarios/{id} - Get scenario details
# PUT /scenarios/{id} - Update scenario
# DELETE /scenarios/{id} - Delete scenario
# POST /scenarios/{id}/start - Start scenario
# POST /scenarios/{id}/stop - Stop scenario
# POST /scenarios/{id}/archive - Archive scenario
```

#### Ingest API

```yaml
# POST /ingest - Ingest log
headers:
  X-Scenario-ID:
    required: true
    schema:
      type: string
      format: uuid
request:
  content:
    application/json:
      schema:
        type: object
        required: [message]
        properties:
          message:
            type: string
            minLength: 1
          source:
            type: string
            default: unknown
response:
  202:
    description: Log accepted
    content:
      application/json:
        schema:
          type: object
          properties:
            status:
              type: string
              example: accepted
            log_id:
              type: string
              format: uuid
            estimated_cost_impact:
              type: number
  400:
    description: Invalid scenario or scenario not running
```

#### Metrics API

```yaml
# GET /scenarios/{id}/metrics - Get scenario metrics
response:
  200:
    content:
      application/json:
        schema:
          type: object
          properties:
            scenario_id:
              type: string
            summary:
              type: object
              properties:
                total_requests:
                  type: integer
                total_cost_usd:
                  type: number
                sqs_blocks:
                  type: integer
                lambda_invocations:
                  type: integer
                llm_tokens:
                  type: integer
                pii_violations:
                  type: integer
            cost_breakdown:
              type: array
              items:
                type: object
                properties:
                  service:
                    type: string
                  cost_usd:
                    type: number
                  percentage:
                    type: number
            timeseries:
              type: array
              items:
                type: object
                properties:
                  timestamp:
                    type: string
                    format: date-time
                  metric_type:
                    type: string
                  value:
                    type: number
```

#### Reports API

```yaml
# POST /scenarios/{id}/reports - Generate report
request:
  content:
    application/json:
      schema:
        type: object
        required: [format]
        properties:
          format:
            type: string
            enum: [pdf, csv]
          include_logs:
            type: boolean
            default: false
          date_from:
            type: string
            format: date-time
          date_to:
            type: string
            format: date-time
response:
  202:
    description: Report generation started
    content:
      application/json:
        schema:
          type: object
          properties:
            report_id:
              type: string
            status:
              type: string
              enum: [pending, processing, completed]
            download_url:
              type: string

# GET /reports/{id}/download - Download report
# GET /reports/{id}/status - Check report status
```

#### Pricing API (Admin)

```yaml
# GET /pricing - List pricing
# POST /pricing - Create pricing entry
# PUT /pricing/{id} - Update pricing
# DELETE /pricing/{id} - Delete pricing (soft delete)
```

### 4.3 Schemas

```yaml
components:
  schemas:
    Scenario:
      type: object
      properties:
        id:
          type: string
          format: uuid
        name:
          type: string
        description:
          type: string
        tags:
          type: array
          items:
            type: string
        status:
          type: string
          enum: [draft, running, completed, archived]
        region:
          type: string
        created_at:
          type: string
          format:
date-time updated_at: type: string format: date-time completed_at: type: string format: date-time total_requests: type: integer total_cost_estimate: type: number LogEntry: type: object properties: id: type: string format: uuid scenario_id: type: string format: uuid received_at: type: string format: date-time message_hash: type: string message_preview: type: string source: type: string size_bytes: type: integer has_pii: type: boolean token_count: type: integer sqs_blocks: type: integer securitySchemes: BearerAuth: type: http scheme: bearer bearerFormat: JWT ApiKeyAuth: type: apiKey in: header name: X-API-Key ``` --- ## 5. Data Flow ### 5.1 Log Ingestion Flow ``` ┌──────────┐ POST /ingest ┌──────────────┐ │ Client │ ───────────────────────>│ FastAPI │ │(Logstash)│ Headers: │ Middleware │ │ │ X-Scenario-ID: uuid │ │ └──────────┘ └──────┬───────┘ │ │ 1. Validate scenario exists & running │ 2. Parse JSON payload ▼ ┌──────────────┐ │ Ingest │ │ Service │ └──────┬───────┘ │ ┌───────────────────────┼───────────────────────┐ │ │ │ ▼ ▼ ▼ ┌──────────────┐ ┌──────────────┐ ┌──────────────┐ │ PII Detector │ │ SQS Calculator│ │ Tokenizer │ │ • check email│ │ • calc blocks │ │ • count │ └──────┬───────┘ └──────┬───────┘ └──────┬───────┘ │ │ │ │ has_pii: bool │ sqs_blocks: int │ tokens: int └──────────────────────┼─────────────────────┘ │ ▼ ┌──────────────┐ │ LogRepo │ │ save() │ └──────┬───────┘ │ ▼ ┌──────────────┐ │ PostgreSQL │ │ scenario_logs│ └──────────────┘ ``` ### 5.2 Scenario State Machine ``` ┌─────────────────────────────────────────────────────────┐ │ │ ▼ │ ┌──────────┐ POST /start ┌──────────┐ │ ┌───────│ DRAFT │────────────────────>│ RUNNING │ │ │ └──────────┘ └────┬─────┘ │ │ ▲ │ │ │ │ │ POST /stop │ │ │ POST /archive ▼ │ │ │ ┌──────────┐ │ │ ┌────┴────┐<────────────────────│COMPLETED │──────────────────┘ │ │ARCHIVED │ └──────────┘ └──────>└─────────┘ ``` ### 5.3 Cost Calculation Flow ``` ┌─────────────────────────────────────────────────────────────────────────┐ │ 
                         COST CALCULATION PIPELINE                        │
└─────────────────────────────────────────────────────────────────────────┘

Input: scenario_logs row
├─ sqs_blocks
├─ token_count
└─ (future: lambda_gb_seconds)
         │
         ▼
┌─────────────────┐
│ Pricing Service │
│ • get_active()  │
└────────┬────────┘
         │  Query: SELECT * FROM aws_pricing
         │         WHERE service IN ('sqs', 'lambda', 'bedrock')
         │           AND region = :scenario_region
         │           AND is_active = TRUE
         ▼
┌─────────────────────────────────────────────────────────────────────────┐
│                              COST FORMULAS                              │
├─────────────────────────────────────────────────────────────────────────┤
│                                                                         │
│  SQS Cost:                                                              │
│    cost = blocks × price_per_million / 1,000,000                        │
│    Example: 100 blocks × $0.40 / 1M = $0.00004                          │
│                                                                         │
│  Lambda Cost:                                                           │
│    request_cost = invocations × price_per_million / 1,000,000           │
│    compute_cost = gb_seconds × price_per_gb_second                      │
│    total = request_cost + compute_cost                                  │
│    Example: 1M invoc × $0.20/1M + 10GBs × $0.00001667 = $0.20 + $0.00017│
│                                                                         │
│  Bedrock Cost:                                                          │
│    input_cost = input_tokens × price_per_1k_input / 1,000               │
│    output_cost = output_tokens × price_per_1k_output / 1,000            │
│    total = input_cost + output_cost                                     │
│    Example: 1000 tokens × $0.003/1K = $0.003                            │
│                                                                         │
└─────────────────────────────────────────────────────────────────────────┘
         │
         ▼
┌─────────────────┐
│ Update          │
│ scenarios       │
│ total_cost      │
└─────────────────┘
```

---

## 6.
Security Architecture

### 6.1 Authentication & Authorization

```
┌─────────────────────────────────────────────────────────────────┐
│                      AUTHENTICATION LAYERS                      │
├─────────────────────────────────────────────────────────────────┤
│                                                                 │
│  Layer 1: API Key (Programmatic Access)                         │
│  ├─ Header: X-API-Key:                                          │
│  ├─ Rate limiting: 1000 req/min                                 │
│  └─ Scope: /ingest, /metrics (read-only on other resources)     │
│                                                                 │
│  Layer 2: JWT Token (Web UI Access)                             │
│  ├─ Header: Authorization: Bearer                               │
│  ├─ Expiration: 24h                                             │
│  ├─ Refresh token: 7d                                           │
│  └─ Scope: Full access based on roles                           │
│                                                                 │
│  Layer 3: Role-Based Access Control (RBAC)                      │
│  ├─ admin: Full access                                          │
│  ├─ user: CRUD own scenarios, read pricing                      │
│  └─ readonly: View only                                         │
│                                                                 │
└─────────────────────────────────────────────────────────────────┘
```

### 6.2 Data Security

| Layer | Measure | Implementation |
|-------|---------|----------------|
| **Transport** | TLS 1.3 | Nginx reverse proxy |
| **Storage** | Hashing | SHA-256 for message_hash |
| **PII** | Detection + Truncation | Email regex, 500 char preview limit |
| **API** | Rate Limiting | slowapi: 100/min public, 1000/min authenticated |
| **DB** | Parameterized Queries | SQLAlchemy ORM (no raw SQL) |
| **Secrets** | Environment Variables | python-dotenv, Docker secrets |

### 6.3 PII Detection Strategy

```python
import re

# Pattern matching for common PII
def detect_pii(message: str) -> dict:
    patterns = {
        # Inside a character class "|" is a literal pipe, so the TLD class is [A-Za-z].
        'email': r'\b[A-Za-z0-9._%+-]+@[A-Za-z0-9.-]+\.[A-Za-z]{2,}\b',
        'ssn': r'\b\d{3}-\d{2}-\d{4}\b',
        'credit_card': r'\b(?:\d[ -]*?){13,16}\b',
        'phone': r'\b\d{3}[-.]?\d{3}[-.]?\d{4}\b'
    }

    # Count the matches for each PII type found in the message
    results = {}
    for pii_type, pattern in patterns.items():
        matches = re.findall(pattern, message)
        if matches:
            results[pii_type] = len(matches)

    return {
        'has_pii': len(results) > 0,
        'pii_types': list(results.keys()),
        'total_matches': sum(results.values())
    }
```

---

## 7.
Technology Stack

### 7.1 Backend

| Component | Technology | Version | Purpose |
|-----------|------------|---------|---------|
| Framework | FastAPI | ≥0.110 | Web framework |
| Server | Uvicorn | ≥0.29 | ASGI server |
| Validation | Pydantic | ≥2.7 | Data validation |
| ORM | SQLAlchemy | ≥2.0 | Database ORM |
| Migrations | Alembic | latest | DB migrations |
| Driver | asyncpg | latest | Async PostgreSQL |
| Tokenizer | tiktoken | ≥0.6 | Token counting |
| Rate Limit | slowapi | latest | API rate limiting |
| Auth | python-jose | latest | JWT handling |
| Testing | pytest | ≥8.1 | Test framework |
| HTTP Client | httpx | ≥0.27 | Async HTTP |

### 7.2 Frontend (v0.4.0 Implemented)

| Component | Technology | Version | Purpose | Status |
|-----------|------------|---------|---------|--------|
| Framework | React | ≥18 | UI library | ✅ Implemented |
| Language | TypeScript | ≥5.0 | Type safety | ✅ Implemented |
| Build | Vite | ≥5.0 | Build tool | ✅ Implemented |
| Styling | Tailwind CSS | ≥3.4 | CSS framework | ✅ Implemented |
| Components | shadcn/ui | latest | UI components | ✅ 15+ components |
| Icons | Lucide React | latest | Icon library | ✅ Implemented |
| State | TanStack Query | ≥5.0 | Server state | ✅ React Query v5 |
| HTTP | Axios | ≥1.6 | HTTP client | ✅ With interceptors |
| Routing | React Router | ≥6.0 | Navigation | ✅ Implemented |
| Charts | Recharts | ≥2.0 | Data viz | ✅ Implemented v0.4.0 |
| Theme | next-themes | latest | Dark/Light mode | ✅ Implemented v0.4.0 |
| E2E Testing | Playwright | ≥1.40 | Browser testing | ✅ 100 tests v0.4.0 |

**Note v0.4.0:**
- ✅ 5 pages complete: Dashboard, ScenarioDetail, ScenarioEdit, Compare, Reports
- ✅ 15+ shadcn/ui components integrated
- ✅ Recharts visualization (CostBreakdown, TimeSeries, Comparison charts)
- ✅ Dark/Light mode with system preference detection
- ✅ React Query for data fetching with caching
- ✅ Axios with error interceptors and toast notifications
- ✅ Responsive design with Tailwind CSS
- ✅ E2E
testing with Playwright (100 test cases) ### 7.3 Infrastructure (v0.4.0 Status) | Component | Technology | Purpose | Status | |-----------|------------|---------|--------| | Container | Docker | Application containers | ✅ PostgreSQL | | Orchestration | Docker Compose | Multi-container dev | ✅ Dev setup | | Database | PostgreSQL 15+ | Primary data store | ✅ Running | | E2E Testing | Playwright | Browser automation | ✅ 100 tests | | Reverse Proxy | Nginx | SSL, static files | 🔄 Planned v1.0.0 | | Process Manager | systemd / PM2 | Production process mgmt | 🔄 Planned v1.0.0 | **Docker Services:** ```yaml # Current (v0.4.0) - postgres: PostgreSQL 15 with healthcheck Status: ✅ Tested and running Ports: 5432:5432 Volume: postgres_data (persistent) # Planned (v1.0.0) - backend: FastAPI production image - frontend: Nginx serving React build - nginx: Reverse proxy with SSL ``` --- ## 8. Project Structure (v0.3.0 - Implemented) ``` mockupAWS/ ├── src/ # Backend FastAPI (Root level) │ ├── main.py # FastAPI app entry │ ├── core/ # Core utilities │ │ ├── config.py # Settings & env vars │ │ ├── database.py # SQLAlchemy async config │ │ └── exceptions.py # Custom exception handlers │ ├── models/ # SQLAlchemy models (v0.2.0) │ │ ├── __init__.py │ │ ├── scenario.py │ │ ├── scenario_log.py │ │ ├── scenario_metric.py │ │ ├── aws_pricing.py │ │ └── report.py │ ├── schemas/ # Pydantic schemas │ │ ├── __init__.py │ │ ├── scenario.py │ │ ├── scenario_log.py │ │ └── scenario_metric.py │ ├── api/ # API routes │ │ ├── deps.py # FastAPI dependencies (get_db) │ │ └── v1/ │ │ ├── __init__.py # API router aggregation │ │ ├── scenarios.py # CRUD endpoints (v0.2.0) │ │ ├── ingest.py # Log ingestion (v0.2.0) │ │ └── metrics.py # Metrics endpoints (v0.2.0) │ ├── repositories/ # Repository pattern (v0.2.0) │ │ ├── __init__.py │ │ ├── base.py │ │ ├── scenario.py │ │ ├── scenario_log.py │ │ ├── scenario_metric.py │ │ └── aws_pricing.py │ └── services/ # Business logic (v0.2.0) │ ├── __init__.py │ ├── 
pii_detector.py # PII detection service │ ├── cost_calculator.py # AWS cost calculation │ └── ingest_service.py # Log ingestion orchestration │ ├── frontend/ # Frontend React (v0.4.0) │ ├── src/ │ │ ├── App.tsx # Root component with routing │ │ ├── main.tsx # React entry point │ │ ├── components/ │ │ │ ├── layout/ # Layout components │ │ │ │ ├── Header.tsx # With theme toggle (v0.4.0) │ │ │ │ ├── Sidebar.tsx │ │ │ │ └── Layout.tsx │ │ │ ├── ui/ # shadcn/ui components (v0.3.0) │ │ │ │ ├── button.tsx │ │ │ │ ├── card.tsx │ │ │ │ ├── dialog.tsx │ │ │ │ ├── input.tsx │ │ │ │ ├── label.tsx │ │ │ │ ├── table.tsx │ │ │ │ ├── textarea.tsx │ │ │ │ ├── toast.tsx │ │ │ │ ├── toaster.tsx │ │ │ │ ├── sonner.tsx │ │ │ │ ├── tabs.tsx # v0.4.0 │ │ │ │ ├── checkbox.tsx # v0.4.0 │ │ │ │ └── select.tsx # v0.4.0 │ │ │ ├── charts/ # Recharts components (v0.4.0) │ │ │ │ ├── CostBreakdownChart.tsx │ │ │ │ ├── TimeSeriesChart.tsx │ │ │ │ └── ComparisonBarChart.tsx │ │ │ ├── comparison/ # Comparison feature (v0.4.0) │ │ │ │ ├── ScenarioComparisonTable.tsx │ │ │ │ └── ComparisonMetrics.tsx │ │ │ └── reports/ # Report generation UI (v0.4.0) │ │ │ ├── ReportGenerator.tsx │ │ │ └── ReportList.tsx │ │ ├── pages/ # Page components (v0.4.0) │ │ │ ├── Dashboard.tsx # Scenarios list │ │ │ ├── ScenarioDetail.tsx # Scenario view/edit with charts │ │ │ ├── ScenarioEdit.tsx # Create/edit form │ │ │ ├── Compare.tsx # Compare scenarios (v0.4.0) │ │ │ └── Reports.tsx # Reports page (v0.4.0) │ │ ├── hooks/ # React Query hooks (v0.4.0) │ │ │ ├── useScenarios.ts │ │ │ ├── useCreateScenario.ts │ │ │ ├── useUpdateScenario.ts │ │ │ ├── useComparison.ts # v0.4.0 │ │ │ └── useReports.ts # v0.4.0 │ │ ├── lib/ # Utilities │ │ │ ├── api.ts # Axios client config │ │ │ ├── utils.ts # Utility functions │ │ │ ├── queryClient.ts # React Query config │ │ │ └── theme-provider.tsx # Dark mode (v0.4.0) │ │ └── types/ │ │ └── api.ts # TypeScript types │ ├── e2e/ # E2E tests (v0.4.0) │ │ ├── tests/ │ │ │ ├── scenarios.spec.ts 
│ │ │ ├── reports.spec.ts │ │ │ ├── comparison.spec.ts │ │ │ └── dark-mode.spec.ts │ │ ├── fixtures/ │ │ └── TEST-RESULTS.md │ ├── package.json │ ├── vite.config.ts │ ├── tsconfig.json │ ├── tailwind.config.js │ ├── playwright.config.ts # E2E config (v0.4.0) │ ├── components.json # shadcn/ui config │ └── Dockerfile # Production build │ ├── alembic/ # Database migrations (v0.2.0) │ ├── versions/ # 6 migrations implemented │ │ ├── 8c29fdcbbf85_create_scenarios_table.py │ │ ├── e46de4b0264a_create_scenario_logs_table.py │ │ ├── 5e247ed57b77_create_scenario_metrics_table.py │ │ ├── 48f2231e7c12_create_aws_pricing_table.py │ │ ├── e80c6eef58b2_create_reports_table.py │ │ └── 0892c44b2a58_seed_aws_pricing_data.py │ ├── env.py │ └── alembic.ini │ ├── export/ # Project documentation │ ├── prd.md # Product Requirements │ ├── architecture.md # This file │ ├── kanban.md # Task breakdown │ └── progress.md # Progress tracking │ ├── .opencode/ # OpenCode team config │ └── agents/ # 6 agent configurations │ ├── spec-architect.md │ ├── backend-dev.md │ ├── db-engineer.md │ ├── frontend-dev.md │ ├── devops-engineer.md │ └── qa-engineer.md │ ├── docker-compose.yml # PostgreSQL service ├── Dockerfile.backend # Backend production image ├── pyproject.toml # Python dependencies (uv) ├── uv.lock # Locked dependencies ├── .env # Environment variables ├── .gitignore # Git ignore rules └── README.md # Project documentation ``` --- ## 9. 
Architectural Decisions

### DEC-001: Async-First Architecture

**Decision:** Use Python async/await across the whole stack (FastAPI, SQLAlchemy, asyncpg)

**Rationale:**
- High throughput is required (>1000 RPS)
- I/O-bound operations (DB, tokenizer)
- Better resource utilization than a sync stack

**Alternatives considered:**
- Sync + ThreadPool: simpler but less efficient
- Celery + Redis: too complex for this use case

**Consequences:**
- Learning curve for async
- More complex debugging
- Better scalability

---

### DEC-002: Repository Pattern

**Decision:** Implement the Repository Pattern for data access

**Rationale:**
- Separation between business logic and data access
- Easy testing with mock repositories
- Option to change the database in the future

**Structure:**

```python
from typing import Generic, TypeVar
from uuid import UUID

T = TypeVar("T")  # entity type handled by a concrete repository

class BaseRepository(Generic[T]):
    async def get(self, id: UUID) -> T | None: ...
    async def list(self, **filters) -> list[T]: ...
    async def create(self, obj: T) -> T: ...
    async def update(self, id: UUID, data: dict) -> T: ...
    async def delete(self, id: UUID) -> bool: ...
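
# --- Illustrative sketch, not part of the codebase -----------------------
# One possible way to get the "easy testing with mock repositories" benefit:
# a dict-backed stand-in that mirrors the BaseRepository method names above.
# InMemoryRepository, FakeScenario and demo() are hypothetical names, and the
# sketch assumes every entity exposes an `id` attribute.
import asyncio
from dataclasses import dataclass, field
from typing import Generic, TypeVar
from uuid import UUID, uuid4

ItemT = TypeVar("ItemT")


@dataclass
class FakeScenario:
    name: str
    region: str = "us-east-1"
    id: UUID = field(default_factory=uuid4)


class InMemoryRepository(Generic[ItemT]):
    def __init__(self) -> None:
        self._items = {}  # maps entity id -> entity

    async def get(self, id: UUID) -> ItemT | None:
        return self._items.get(id)

    async def list(self, **filters) -> list[ItemT]:
        # Keep only entities whose attributes equal every filter value.
        return [
            obj for obj in self._items.values()
            if all(getattr(obj, key, None) == value for key, value in filters.items())
        ]

    async def create(self, obj: ItemT) -> ItemT:
        self._items[obj.id] = obj
        return obj

    async def update(self, id: UUID, data: dict) -> ItemT:
        obj = self._items[id]  # raises KeyError for unknown ids
        for key, value in data.items():
            setattr(obj, key, value)
        return obj

    async def delete(self, id: UUID) -> bool:
        return self._items.pop(id, None) is not None


async def demo() -> list[str]:
    repo = InMemoryRepository()
    await repo.create(FakeScenario(name="us-run"))
    await repo.create(FakeScenario(name="eu-run", region="eu-west-1"))
    return [s.name for s in await repo.list(region="eu-west-1")]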
```

---

### DEC-003: Single Shared Table Instead of a Separate Database per Scenario

**Decision:** Use a single `scenario_logs` table with a `scenario_id` FK instead of separate databases

**Rationale:**
- Simpler to manage
- Cross-scenario queries are possible (comparisons)
- Simpler backup/restore

**Alternatives considered:**
- Schema per scenario: too much overhead
- Separate databases: too complex for an MVP

---

### DEC-004: Message Hashing for Deduplication

**Decision:** Use the SHA-256 hash of the message for deduplication

**Rationale:**
- Privacy: full messages are not stored
- Performance: fixed-length hash lookup through the `idx_logs_message_hash` index
- Storage: saves space

**Implementation:**

```python
import hashlib

message_hash = hashlib.sha256(message.encode()).hexdigest()
```

---

### DEC-005: Time-Series Metrics

**Decision:** Store metrics as a time series in `scenario_metrics`

**Rationale:**
- Trend analysis becomes possible
- Flexible aggregations
- Audit trail

**Trade-off:**
- More storage than pre-aggregated fields
- More complex queries, but backed by indexes

---

## 10. Performance Considerations

### 10.1 Database Optimization

| Optimization | Implementation | Benefit |
|--------------|----------------|---------|
| Indexes | B-tree on foreign keys, timestamps | Fast lookups |
| GIN | tags (JSONB) | Fast array search |
| Partitioning | scenario_logs by date | Query pruning |
| Connection Pool | asyncpg pool (20-50) | Concurrency |

### 10.2 Caching Strategy (Future)

```
Layer 1: In-memory (FastAPI state)
├─ Active scenario metadata
└─ AWS pricing (rarely changes)

Layer 2: Redis (future)
├─ Session storage
├─ Rate limiting counters
└─ Report generation status
```

### 10.3 Query Optimization

- Use `selectinload` for relationships
- Batch inserts for logs via PostgreSQL `COPY` (with asyncpg this is `copy_records_to_table`; `copy_expert` is the psycopg2 API)
- Materialized views for reports
- Async tasks for heavy operations

---

## 11.
Error Handling Strategy ### 11.1 Exception Hierarchy ```python class AppException(Exception): """Base application exception""" status_code: int = 500 code: str = "internal_error" class NotFoundException(AppException): status_code = 404 code = "not_found" class ValidationException(AppException): status_code = 400 code = "validation_error" class ConflictException(AppException): status_code = 409 code = "conflict" class RateLimitException(AppException): status_code = 429 code = "rate_limited" ``` ### 11.2 Global Exception Handler ```python @app.exception_handler(AppException) async def app_exception_handler(request: Request, exc: AppException): return JSONResponse( status_code=exc.status_code, content={ "error": exc.code, "message": str(exc), "timestamp": datetime.utcnow().isoformat() } ) ``` --- ## 12. Deployment Architecture ### 12.1 Docker Compose (Development) ```yaml version: '3.8' services: postgres: image: postgres:15-alpine environment: POSTGRES_DB: mockupaws POSTGRES_USER: app POSTGRES_PASSWORD: ${DB_PASSWORD} volumes: - postgres_data:/var/lib/postgresql/data ports: - "5432:5432" healthcheck: test: ["CMD-SHELL", "pg_isready -U app -d mockupaws"] backend: build: ./backend environment: DATABASE_URL: postgresql+asyncpg://app:${DB_PASSWORD}@postgres:5432/mockupaws ports: - "8000:8000" depends_on: postgres: condition: service_healthy frontend: build: ./frontend ports: - "3000:80" depends_on: - backend volumes: postgres_data: ``` ### 12.2 Production Considerations - Use managed PostgreSQL (AWS RDS, Azure PostgreSQL) - Nginx as reverse proxy with SSL - Environment-specific configuration - Log aggregation (ELK or similar) - Monitoring (Prometheus + Grafana) - Health checks and readiness probes --- ## 13. 
Implementation Status & Changelog ### v0.2.0 - Backend Core ✅ COMPLETED **Database Layer:** - ✅ PostgreSQL 15 with 5 tables (scenarios, logs, metrics, pricing, reports) - ✅ 6 Alembic migrations (including AWS pricing seed data) - ✅ SQLAlchemy 2.0 async models with relationships - ✅ Indexes and constraints optimized **Backend API:** - ✅ FastAPI application with structured routing - ✅ Scenario CRUD endpoints (POST, GET, PUT, DELETE) - ✅ Ingest API with PII detection - ✅ Metrics API with cost calculation - ✅ Repository pattern implementation - ✅ Service layer (PII detector, Cost calculator, Ingest service) - ✅ Exception handlers and validation **Data Processing:** - ✅ SHA-256 message hashing for deduplication - ✅ Email PII detection with regex - ✅ AWS cost calculation (SQS, Lambda, Bedrock) - ✅ Token counting with tiktoken ### v0.3.0 - Frontend Implementation ✅ COMPLETED **React Application:** - ✅ Vite + TypeScript + React 18 setup - ✅ Tailwind CSS integration - ✅ shadcn/ui components (Button, Card, Dialog, Input, Label, Table, Textarea, Toast) - ✅ Lucide React icons **State Management:** - ✅ TanStack Query (React Query) v5 for server state - ✅ Axios HTTP client with interceptors - ✅ Error handling with toast notifications **Pages & Routing:** - ✅ Dashboard - Scenarios list with pagination - ✅ ScenarioDetail - View and edit scenarios - ✅ ScenarioEdit - Create and edit form - ✅ React Router v6 navigation **API Integration:** - ✅ TypeScript types for all API responses - ✅ Custom hooks for data fetching (useScenarios, useCreateScenario, useUpdateScenario) - ✅ Loading states and error boundaries - ✅ Responsive design **Docker & DevOps:** - ✅ Docker Compose with PostgreSQL service - ✅ Health checks for database - ✅ Dockerfile for backend (production ready) - ✅ Dockerfile for frontend (multi-stage build) - ✅ Environment configuration ### v0.4.0 - Reports, Charts & Comparison ✅ COMPLETATA (2026-04-07) **Backend Features:** - ✅ Report generation (PDF/CSV) with ReportLab and 
Pandas - ✅ Report storage and download API - ✅ Rate limiting for report downloads (10/min) - ✅ Automatic cleanup of old reports **Frontend Features:** - ✅ Interactive charts with Recharts (Pie, Area, Bar) - ✅ Cost Breakdown chart in Scenario Detail - ✅ Time Series chart for metrics - ✅ Comparison Bar Chart for scenario compare - ✅ Dark/Light mode toggle with system preference detection - ✅ Scenario comparison page (2-4 scenarios side-by-side) - ✅ Comparison tables with delta indicators - ✅ Report generation UI (PDF/CSV) **Testing:** - ✅ E2E testing suite with Playwright - ✅ 100 test cases covering all features - ✅ Multi-browser support (Chromium, Firefox) - ✅ Visual regression testing **Technical:** - ✅ next-themes for theme management - ✅ Tailwind dark mode configuration - ✅ Radix UI components (Tabs, Checkbox, Select) - ✅ Responsive charts with theme adaptation ### v1.0.0 - Production Ready ⏳ PLANNED **Security:** - ⏳ JWT authentication - ⏳ API key management - ⏳ Role-based access control **Infrastructure:** - ⏳ Full Docker Compose stack (backend + frontend + nginx) - ⏳ SSL/TLS configuration - ⏳ Database backup automation - ⏳ Monitoring and logging **Documentation:** - ⏳ Complete OpenAPI specification - ⏳ User guide - ⏳ API reference --- ## 14. 
Testing Status ### Current Coverage (v0.4.0) | Layer | Type | Status | Coverage | |-------|------|--------|----------| | Backend Unit | pytest | ✅ Implemented | ~60% | | Backend Integration | pytest | ✅ Implemented | All endpoints | | Frontend Unit | Vitest | 🔄 Partial | Key components | | E2E | Playwright | ✅ Implemented | 100 tests | **E2E Test Results:** - Total tests: 100 - Passing: 100 - Browsers: Chromium, Firefox - Features covered: Scenarios, Reports, Comparison, Dark Mode ### Test Files ``` tests/ ├── __init__.py ├── conftest.py # Fixtures ├── unit/ │ ├── test_main.py # Basic app tests (v0.1) │ ├── test_services.py # Service logic tests (planned) │ └── test_cost_calculator.py ├── integration/ │ ├── test_api_scenarios.py │ ├── test_api_ingest.py │ └── test_api_metrics.py └── e2e/ └── test_full_flow.py # Complete user journey ``` --- ## 15. Known Limitations & Technical Debt ### Current (v0.4.0) 1. **No Authentication**: API is open (JWT planned v0.5.0) 2. **No Caching**: Every request hits database (Redis planned v1.0.0) 3. **Limited Frontend Unit Tests**: Vitest coverage partial ### Resolved in v0.4.0 - ✅ Report generation with PDF/CSV export - ✅ Interactive charts with Recharts - ✅ Scenario comparison feature - ✅ Dark/Light mode toggle - ✅ E2E testing with Playwright (100 tests) - ✅ Rate limiting for report downloads ### Resolved in v0.3.0 - ✅ Database connection pooling - ✅ Async SQLAlchemy implementation - ✅ React Query for efficient data fetching - ✅ Error handling with user-friendly messages - ✅ Docker setup for consistent development --- *Documento creato da @spec-architect* *Versione: 1.2* *Ultimo aggiornamento: 2026-04-07* *Stato: v0.4.0 Completata*