# Architecture - mockupAWS

## 1. Overview

mockupAWS is an AWS cost-simulation platform that profiles log traffic and calculates the cost drivers (SQS, Lambda, Bedrock/LLM) before a production deployment.

**Architecture:** Layered architecture with the Repository and Service Layer patterns

**Paradigm:** Async-first (FastAPI + SQLAlchemy async)

**Deployment:** Container-based (Docker Compose)

---

## 2. System Architecture

### 2.1 High-Level Architecture

```
┌─────────────────────────────────────────────────────────────────────────────┐
│                                CLIENT LAYER                                 │
│  ┌──────────────────┐  ┌──────────────────┐  ┌──────────────────────────┐   │
│  │    Logstash      │  │   React Web UI   │  │      API Consumers       │   │
│  │   (Log Source)   │  │   (Dashboard)    │  │     (CI/CD, Scripts)     │   │
│  └────────┬─────────┘  └────────┬─────────┘  └───────────┬──────────────┘   │
└───────────┼─────────────────────┼────────────────────────┼──────────────────┘
            │                     │                        │
            │ HTTP POST           │ HTTPS                  │ API Key + JWT
            │ /ingest             │ /api/v1/*              │ /api/v1/*
            ▼                     ▼                        ▼
┌─────────────────────────────────────────────────────────────────────────────┐
│                                  API LAYER                                  │
│                          FastAPI + Uvicorn (ASGI)                           │
│  ┌──────────────────────────────────────────────────────────────────────┐   │
│  │                           Middleware Stack                           │   │
│  │  ├── CORS                                                            │   │
│  │  ├── Rate Limiting (slowapi)                                         │   │
│  │  ├── Authentication (JWT / API Key)                                  │   │
│  │  ├── Request Validation (Pydantic)                                   │   │
│  │  └── Error Handling                                                  │   │
│  └──────────────────────────────────────────────────────────────────────┘   │
│                                                                             │
│  ┌──────────────┐ ┌──────────────┐ ┌──────────────┐ ┌──────────────────┐    │
│  │ /scenarios   │ │ /ingest      │ │ /reports     │ │ /pricing         │    │
│  │ CRUD         │ │ (log         │ │ generate     │ │ (admin)          │    │
│  │              │ │ intake)      │ │ download     │ │                  │    │
│  └──────┬───────┘ └──────┬───────┘ └──────┬───────┘ └────────┬─────────┘    │
└─────────┼────────────────┼────────────────┼──────────────────┼──────────────┘
          │                │                │                  │
          ▼                ▼                ▼                  ▼
┌─────────────────────────────────────────────────────────────────────────────┐
│                                SERVICE LAYER                                │
│  ┌──────────────────┐ ┌──────────────────┐
┌──────────────────────────────┐ │
│  │ ScenarioService  │ │ IngestService    │ │ CostCalculator               │ │
│  │ ───────────────  │ │ ──────────────   │ │ ─────────────                │ │
│  │ • create()       │ │ • ingest_log()   │ │ • calculate_sqs_cost()       │ │
│  │ • update()       │ │ • batch_process()│ │ • calculate_lambda_cost()    │ │
│  │ • delete()       │ │ • deduplicate()  │ │ • calculate_bedrock_cost()   │ │
│  │ • lifecycle()    │ │ • persist()      │ │ • get_total_cost()           │ │
│  └──────────────────┘ └──────────────────┘ └──────────────────────────────┘ │
│  ┌──────────────────┐ ┌──────────────────┐ ┌──────────────────────────────┐ │
│  │ ReportService    │ │ PIIDetector      │ │ TokenizerService             │ │
│  │ ──────────────   │ │ ───────────      │ │ ───────────────              │ │
│  │ • generate_csv() │ │ • detect_email() │ │ • count_tokens()             │ │
│  │ • generate_pdf() │ │ • scan_patterns()│ │ • encode()                   │ │
│  │ • compile()      │ │ • report()       │ │ • get_encoding()             │ │
│  └──────────────────┘ └──────────────────┘ └──────────────────────────────┘ │
└─────────┬──────────────────────────────────────────────────────┬────────────┘
          │                                                      │
          ▼                                                      ▼
┌─────────────────────────────────────────────────────────────────────────────┐
│                              REPOSITORY LAYER                               │
│  ┌──────────────────┐ ┌──────────────────┐ ┌──────────────────────────────┐ │
│  │ ScenarioRepo     │ │ LogRepo          │ │ PricingRepo                  │ │
│  │ ─────────────    │ │ ───────          │ │ ──────────                   │ │
│  │ • get_by_id()    │ │ • save()         │ │ • get_by_service_region()    │ │
│  │ • list()         │ │ • list_by_       │ │ • list_active()              │ │
│  │ • create()       │ │   scenario()     │ │ • update()                   │ │
│  │ • update()       │ │ • count_by_      │ │ • bulk_insert()              │ │
│  │ • delete()       │ │   hash()         │ │                              │ │
│  └──────────────────┘ └──────────────────┘ └──────────────────────────────┘ │
│  ┌──────────────────┐ ┌──────────────────┐                                  │
│  │ MetricRepo       │ │ ReportRepo       │                                  │
│  │ ──────────       │ │ ──────────       │                                  │
│  │ • save()         │ │ • save()         │                                  │
│  │ • get_aggregated │ │ • list()         │                                  │
│  │ • list_by_type() │ │ • delete()       │                                  │
│  └──────────────────┘ └──────────────────┘                                  │
└─────────────────────────────────────────────────────────────────────────────┘
                                      │
                                      │ SQLAlchemy 2.0 Async
                                      │ asyncpg driver
                                      ▼
┌─────────────────────────────────────────────────────────────────────────────┐
│                               DATABASE LAYER                                │
│                               PostgreSQL 15+                                │
│  ┌──────────────────┐ ┌──────────────────┐ ┌──────────────────────────────┐ │
│  │ scenarios        │ │ scenario_logs    │ │ aws_pricing                  │ │
│  │ ─────────        │ │ ─────────────    │ │ ───────────                  │ │
│  │ • metadata       │ │ • logs storage   │ │ • service prices             │ │
│  │ • state machine  │ │ • hash for dedup │ │ • history tracking           │ │
│  │ • cost totals    │ │ • PII flags      │ │ • region-specific            │ │
│  └──────────────────┘ └──────────────────┘ └──────────────────────────────┘ │
│  ┌──────────────────┐ ┌──────────────────┐                                  │
│  │ scenario_metrics │ │ reports          │                                  │
│  │ ───────────────  │ │ ────────         │                                  │
│  │ • time-series    │ │ • generated      │                                  │
│  │ • aggregates     │ │ • metadata       │                                  │
│  │ • cost breakdown │ │ • file refs      │                                  │
│  └──────────────────┘ └──────────────────┘                                  │
└─────────────────────────────────────────────────────────────────────────────┘
```

### 2.2 Layer Responsibilities

| Layer | Responsibility | Technologies |
|-------|----------------|--------------|
| **Client** | User interaction, log ingestion | Browser, Logstash, curl |
| **API** | Routing, validation, auth, middleware | FastAPI, Pydantic, slowapi |
| **Service** | Business logic, orchestration | Python async/await |
| **Repository** | Data access, query abstraction | SQLAlchemy 2.0 Repository pattern |
| **Database** | Persistence, ACID, queries | PostgreSQL 15+ |

---

## 3.
Database Schema ### 3.1 Entity Relationship Diagram ``` ┌─────────────────────────────────────────────────────────────────────────┐ │ SCHEMA ERD │ └─────────────────────────────────────────────────────────────────────────┘ ┌─────────────────────┐ ┌─────────────────────┐ │ users │ │ aws_pricing │ ├─────────────────────┤ ├─────────────────────┤ │ PK id: UUID │ │ PK id: UUID │ │ email: VARCHAR │ │ service: VARCHAR │ │ password_hash: V │ │ region: VARCHAR │ │ full_name: VAR │ │ tier: VARCHAR │ │ is_active: BOOL │ │ price: DECIMAL │ │ is_superuser: B │ │ unit: VARCHAR │ │ created_at: TS │ │ effective_from: D│ │ updated_at: TS │ │ effective_to: D │ │ last_login: TS │ │ is_active: BOOL │ └──────────┬──────────┘ │ source_url: TEXT │ │ └─────────────────────┘ │ 1:N ▼ ┌─────────────────────┐ ┌─────────────────────┐ │ api_keys │ │ scenarios │ ├─────────────────────┤ ├─────────────────────┤ │ PK id: UUID │ │ PK id: UUID │ │ FK user_id: UUID │ │ name: VARCHAR │ │ key_hash: V(255) │ │ description: TEXT│ │ key_prefix: V(8) │ │ tags: JSONB │ │ name: VARCHAR │ │ status: ENUM │ │ scopes: JSONB │ │ region: VARCHAR │ │ last_used_at: TS │ │ created_at: TS │ │ expires_at: TS │ │ updated_at: TS │ │ is_active: BOOL │ │ completed_at: TS │ │ created_at: TS │ │ total_requests: I│ └─────────────────────┘ │ total_cost: DEC │ │ └──────────┬──────────┘ │ │ │ 1:N │ 1:N ▼ ▼ ┌─────────────────────┐ ┌─────────────────────┐ │ report_schedules │ │ scenario_logs │ ├─────────────────────┤ ├─────────────────────┤ │ PK id: UUID │ │ PK id: UUID │ │ FK user_id: UUID │ │ FK scenario_id: UUID│ │ FK scenario_id: UUID│ │ received_at: TS │ │ name: VARCHAR │ │ message_hash: V64│ │ frequency: ENUM │ │ message_preview │ │ day_of_week: INT │ │ source: VARCHAR │ │ day_of_month: INT│ │ size_bytes: INT │ │ hour: INT │ │ has_pii: BOOL │ │ minute: INT │ │ token_count: INT │ │ format: ENUM │ │ sqs_blocks: INT │ │ email_to: TEXT[] │ └─────────────────────┘ │ is_active: BOOL │ │ │ last_run_at: TS │ │ 1:N │ next_run_at: TS │ 
                                        ▼
│    created_at: TS   │       ┌─────────────────────┐
└─────────────────────┘       │  scenario_metrics   │
                              ├─────────────────────┤
                              │ PK id: UUID         │
                              │ FK scenario_id: UUID│
                              │    timestamp: TS    │
                              │    metric_type: VAR │
                              │    metric_name: VAR │
                              │    value: DECIMAL   │
                              │    unit: VARCHAR    │
                              │    metadata: JSONB  │
                              └─────────────────────┘
                                        │
                                        │ 1:N (optional)
                                        ▼
                              ┌─────────────────────┐
                              │       reports       │
                              ├─────────────────────┤
                              │ PK id: UUID         │
                              │ FK scenario_id: UUID│
                              │    format: ENUM     │
                              │    file_path: TEXT  │
                              │    generated_at: TS │
                              │    generated_by: VAR│
                              │    metadata: JSONB  │
                              └─────────────────────┘
```

### 3.2 DDL - Schema Definition

```sql
-- ============================================
-- EXTENSIONS
-- ============================================
CREATE EXTENSION IF NOT EXISTS "uuid-ossp";
CREATE EXTENSION IF NOT EXISTS "pg_trgm"; -- For text search

-- ============================================
-- ENUMS
-- ============================================
CREATE TYPE scenario_status AS ENUM ('draft', 'running', 'completed', 'archived');
CREATE TYPE report_format AS ENUM ('pdf', 'csv');

-- ============================================
-- TABLE: scenarios
-- ============================================
CREATE TABLE scenarios (
    id UUID PRIMARY KEY DEFAULT uuid_generate_v4(),
    name VARCHAR(255) NOT NULL,
    description TEXT,
    tags JSONB DEFAULT '[]'::jsonb,
    status scenario_status NOT NULL DEFAULT 'draft',
    region VARCHAR(50) NOT NULL DEFAULT 'us-east-1',
    created_at TIMESTAMP WITH TIME ZONE NOT NULL DEFAULT NOW(),
    updated_at TIMESTAMP WITH TIME ZONE NOT NULL DEFAULT NOW(),
    completed_at TIMESTAMP WITH TIME ZONE,
    started_at TIMESTAMP WITH TIME ZONE,
    total_requests INTEGER NOT NULL DEFAULT 0,
    total_cost_estimate DECIMAL(12, 6) NOT NULL DEFAULT 0.000000,

    -- Constraints
    CONSTRAINT chk_name_not_empty CHECK (char_length(trim(name)) > 0),
    CONSTRAINT chk_region_not_empty CHECK (char_length(trim(region)) > 0)
);

-- Indexes
CREATE INDEX idx_scenarios_status ON scenarios(status);
CREATE INDEX idx_scenarios_region ON scenarios(region);
CREATE INDEX
    idx_scenarios_created_at ON scenarios(created_at DESC);
CREATE INDEX idx_scenarios_tags ON scenarios USING GIN(tags);

-- Trigger for updated_at
CREATE OR REPLACE FUNCTION update_updated_at_column()
RETURNS TRIGGER AS $$
BEGIN
    NEW.updated_at = NOW();
    RETURN NEW;
END;
$$ LANGUAGE plpgsql;

CREATE TRIGGER update_scenarios_updated_at
    BEFORE UPDATE ON scenarios
    FOR EACH ROW
    EXECUTE FUNCTION update_updated_at_column();

-- ============================================
-- TABLE: scenario_logs
-- ============================================
CREATE TABLE scenario_logs (
    id UUID PRIMARY KEY DEFAULT uuid_generate_v4(),
    scenario_id UUID NOT NULL REFERENCES scenarios(id) ON DELETE CASCADE,
    received_at TIMESTAMP WITH TIME ZONE NOT NULL DEFAULT NOW(),
    message_hash VARCHAR(64) NOT NULL, -- SHA256
    message_preview VARCHAR(500),
    source VARCHAR(100) DEFAULT 'unknown',
    size_bytes INTEGER NOT NULL DEFAULT 0,
    has_pii BOOLEAN NOT NULL DEFAULT FALSE,
    token_count INTEGER NOT NULL DEFAULT 0,
    sqs_blocks INTEGER NOT NULL DEFAULT 1,

    -- Constraints
    CONSTRAINT chk_size_positive CHECK (size_bytes >= 0),
    CONSTRAINT chk_token_positive CHECK (token_count >= 0),
    CONSTRAINT chk_blocks_positive CHECK (sqs_blocks >= 1)
);

-- Indexes
CREATE INDEX idx_logs_scenario_id ON scenario_logs(scenario_id);
CREATE INDEX idx_logs_received_at ON scenario_logs(received_at DESC);
CREATE INDEX idx_logs_message_hash ON scenario_logs(message_hash);
CREATE INDEX idx_logs_has_pii ON scenario_logs(has_pii) WHERE has_pii = TRUE;

-- ============================================
-- TABLE: scenario_metrics
-- ============================================
CREATE TABLE scenario_metrics (
    id UUID PRIMARY KEY DEFAULT uuid_generate_v4(),
    scenario_id UUID NOT NULL REFERENCES scenarios(id) ON DELETE CASCADE,
    timestamp TIMESTAMP WITH TIME ZONE NOT NULL DEFAULT NOW(),
    metric_type VARCHAR(50) NOT NULL, -- 'sqs', 'lambda', 'bedrock', 'safety'
    metric_name VARCHAR(100) NOT NULL,
    value DECIMAL(15, 6) NOT NULL DEFAULT 0.000000,
    unit VARCHAR(20)
        NOT NULL, -- 'count', 'bytes', 'tokens', 'usd', 'invocations'
    metadata JSONB DEFAULT '{}'::jsonb
);

-- Indexes
CREATE INDEX idx_metrics_scenario_id ON scenario_metrics(scenario_id);
CREATE INDEX idx_metrics_timestamp ON scenario_metrics(timestamp DESC);
CREATE INDEX idx_metrics_type ON scenario_metrics(metric_type);
CREATE INDEX idx_metrics_scenario_type ON scenario_metrics(scenario_id, metric_type);

-- ============================================
-- TABLE: aws_pricing
-- ============================================
CREATE TABLE aws_pricing (
    id UUID PRIMARY KEY DEFAULT uuid_generate_v4(),
    service VARCHAR(50) NOT NULL, -- 'sqs', 'lambda', 'bedrock'
    region VARCHAR(50) NOT NULL,
    tier VARCHAR(50) NOT NULL DEFAULT 'standard',
    price_per_unit DECIMAL(15, 10) NOT NULL,
    unit VARCHAR(20) NOT NULL, -- 'per_million_requests', 'per_gb_second', 'per_1k_tokens'
    effective_from DATE NOT NULL DEFAULT CURRENT_DATE,
    effective_to DATE,
    is_active BOOLEAN NOT NULL DEFAULT TRUE,
    source_url VARCHAR(500),
    description TEXT,

    -- Constraints
    CONSTRAINT chk_price_positive CHECK (price_per_unit >= 0),
    CONSTRAINT chk_valid_dates CHECK (effective_to IS NULL OR effective_to >= effective_from)
);

-- Indexes
-- PostgreSQL table constraints cannot take a WHERE clause, so uniqueness
-- among active rows is enforced with a partial unique index.
CREATE UNIQUE INDEX uq_pricing_unique_active
    ON aws_pricing(service, region, tier, effective_from)
    WHERE is_active = TRUE;
CREATE INDEX idx_pricing_service ON aws_pricing(service);
CREATE INDEX idx_pricing_region ON aws_pricing(region);
CREATE INDEX idx_pricing_active ON aws_pricing(service, region, tier) WHERE is_active = TRUE;

-- ============================================
-- TABLE: reports
-- ============================================
CREATE TABLE reports (
    id UUID PRIMARY KEY DEFAULT uuid_generate_v4(),
    scenario_id UUID NOT NULL REFERENCES scenarios(id) ON DELETE CASCADE,
    format report_format NOT NULL,
    file_path VARCHAR(500) NOT NULL,
    file_size_bytes INTEGER,
    generated_at TIMESTAMP WITH TIME ZONE NOT NULL DEFAULT NOW(),
    generated_by VARCHAR(100), -- user_id or api_key_id
    metadata JSONB DEFAULT '{}'::jsonb
);

-- Indexes
CREATE INDEX idx_reports_scenario_id ON reports(scenario_id);
CREATE INDEX idx_reports_generated_at ON reports(generated_at DESC);

-- ============================================
-- TABLE: users (v0.5.0)
-- ============================================
CREATE TABLE users (
    id UUID PRIMARY KEY DEFAULT uuid_generate_v4(),
    email VARCHAR(255) NOT NULL UNIQUE,
    password_hash VARCHAR(255) NOT NULL,
    full_name VARCHAR(255),
    is_active BOOLEAN NOT NULL DEFAULT true,
    is_superuser BOOLEAN NOT NULL DEFAULT false,
    created_at TIMESTAMP WITH TIME ZONE NOT NULL DEFAULT NOW(),
    updated_at TIMESTAMP WITH TIME ZONE NOT NULL DEFAULT NOW(),
    last_login TIMESTAMP WITH TIME ZONE
);

-- Indexes
CREATE INDEX idx_users_email ON users(email);
-- The access method goes before the column list: ON table USING method (columns)
CREATE INDEX idx_users_created_at ON users USING brin (created_at);

-- Trigger for updated_at
CREATE TRIGGER update_users_updated_at
    BEFORE UPDATE ON users
    FOR EACH ROW
    EXECUTE FUNCTION update_updated_at_column();

-- ============================================
-- TABLE: api_keys (v0.5.0)
-- ============================================
CREATE TABLE api_keys (
    id UUID PRIMARY KEY DEFAULT uuid_generate_v4(),
    user_id UUID NOT NULL REFERENCES users(id) ON DELETE CASCADE,
    key_hash VARCHAR(255) NOT NULL UNIQUE,
    key_prefix VARCHAR(8) NOT NULL,
    name VARCHAR(255),
    scopes JSONB DEFAULT '[]'::jsonb,
    last_used_at TIMESTAMP WITH TIME ZONE,
    expires_at TIMESTAMP WITH TIME ZONE,
    is_active BOOLEAN NOT NULL DEFAULT true,
    created_at TIMESTAMP WITH TIME ZONE NOT NULL DEFAULT NOW()
);

-- Indexes
CREATE INDEX idx_api_keys_key_hash ON api_keys(key_hash);
CREATE INDEX idx_api_keys_user_id ON api_keys(user_id);
CREATE INDEX idx_api_keys_prefix ON api_keys(key_prefix);

-- ============================================
-- TABLE: report_schedules (v0.5.0)
-- ============================================
CREATE TABLE report_schedules (
    id UUID PRIMARY KEY DEFAULT uuid_generate_v4(),
    user_id UUID NOT NULL REFERENCES users(id) ON DELETE CASCADE,
    scenario_id UUID NOT NULL
        REFERENCES scenarios(id) ON DELETE CASCADE,
    name VARCHAR(255),
    frequency VARCHAR(20) NOT NULL CHECK (frequency IN ('daily', 'weekly', 'monthly')),
    day_of_week INTEGER CHECK (day_of_week BETWEEN 0 AND 6),
    day_of_month INTEGER CHECK (day_of_month BETWEEN 1 AND 31),
    hour INTEGER NOT NULL CHECK (hour BETWEEN 0 AND 23),
    minute INTEGER NOT NULL CHECK (minute BETWEEN 0 AND 59),
    format VARCHAR(10) NOT NULL CHECK (format IN ('pdf', 'csv')),
    include_logs BOOLEAN DEFAULT false,
    sections JSONB DEFAULT '[]'::jsonb,
    email_to TEXT[],
    is_active BOOLEAN NOT NULL DEFAULT true,
    last_run_at TIMESTAMP WITH TIME ZONE,
    next_run_at TIMESTAMP WITH TIME ZONE,
    created_at TIMESTAMP WITH TIME ZONE NOT NULL DEFAULT NOW()
);

-- Indexes
CREATE INDEX idx_schedules_user_id ON report_schedules(user_id);
CREATE INDEX idx_schedules_scenario_id ON report_schedules(scenario_id);
CREATE INDEX idx_schedules_next_run ON report_schedules(next_run_at) WHERE is_active = true;
```

### 3.3 Key Queries

```sql
-- Query: Get scenario with aggregated metrics
SELECT
    s.*,
    COUNT(DISTINCT sl.id) as total_logs,
    COUNT(DISTINCT CASE WHEN sl.has_pii THEN sl.id END) as pii_violations,
    SUM(sl.token_count) as total_tokens,
    SUM(sl.sqs_blocks) as total_sqs_blocks
FROM scenarios s
LEFT JOIN scenario_logs sl ON s.id = sl.scenario_id
WHERE s.id = :scenario_id
GROUP BY s.id;

-- Query: Get cost breakdown by service
SELECT
    metric_type,
    SUM(value) as total_value,
    unit
FROM scenario_metrics
WHERE scenario_id = :scenario_id
  AND metric_name LIKE '%cost%'
GROUP BY metric_type, unit;

-- Query: Get active pricing for service/region
SELECT *
FROM aws_pricing
WHERE service = :service
  AND region = :region
  AND is_active = TRUE
  AND (effective_to IS NULL OR effective_to >= CURRENT_DATE)
ORDER BY effective_from DESC
LIMIT 1;
```

---

## 4.
API Specifications ### 4.1 OpenAPI Overview ```yaml openapi: 3.0.0 info: title: mockupAWS API version: 0.3.0 description: AWS Cost Simulation Platform API servers: - url: http://localhost:8000/api/v1 description: Development server security: - BearerAuth: [] - ApiKeyAuth: [] ``` ### 4.2 Endpoints #### Scenarios API ```yaml # POST /scenarios - Create new scenario request: content: application/json: schema: type: object required: [name, region] properties: name: type: string minLength: 1 maxLength: 255 description: type: string tags: type: array items: type: string region: type: string enum: [us-east-1, us-west-2, eu-west-1, eu-central-1] tier: type: string enum: [standard, on-demand] default: standard response: 201: content: application/json: schema: $ref: '#/components/schemas/Scenario' # GET /scenarios - List scenarios parameters: - name: status in: query schema: type: string enum: [draft, running, completed, archived] - name: region in: query schema: type: string - name: page in: query schema: type: integer default: 1 - name: page_size in: query schema: type: integer default: 20 maximum: 100 response: 200: content: application/json: schema: type: object properties: items: type: array items: $ref: '#/components/schemas/Scenario' total: type: integer page: type: integer page_size: type: integer # GET /scenarios/{id} - Get scenario details # PUT /scenarios/{id} - Update scenario # DELETE /scenarios/{id} - Delete scenario # POST /scenarios/{id}/start - Start scenario # POST /scenarios/{id}/stop - Stop scenario # POST /scenarios/{id}/archive - Archive scenario ``` #### Ingest API ```yaml # POST /ingest - Ingest log headers: X-Scenario-ID: required: true schema: type: string format: uuid request: content: application/json: schema: type: object required: [message] properties: message: type: string minLength: 1 source: type: string default: unknown response: 202: description: Log accepted content: application/json: schema: type: object properties: status: type: string 
example: accepted log_id: type: string format: uuid estimated_cost_impact: type: number 400: description: Invalid scenario or scenario not running ``` #### Metrics API ```yaml # GET /scenarios/{id}/metrics - Get scenario metrics response: 200: content: application/json: schema: type: object properties: scenario_id: type: string summary: type: object properties: total_requests: type: integer total_cost_usd: type: number sqs_blocks: type: integer lambda_invocations: type: integer llm_tokens: type: integer pii_violations: type: integer cost_breakdown: type: array items: type: object properties: service: type: string cost_usd: type: number percentage: type: number timeseries: type: array items: type: object properties: timestamp: type: string format: date-time metric_type: type: string value: type: number ``` #### Reports API ```yaml # POST /scenarios/{id}/reports - Generate report request: content: application/json: schema: type: object required: [format] properties: format: type: string enum: [pdf, csv] include_logs: type: boolean default: false date_from: type: string format: date-time date_to: type: string format: date-time response: 202: description: Report generation started content: application/json: schema: type: object properties: report_id: type: string status: type: string enum: [pending, processing, completed] download_url: type: string # GET /reports/{id}/download - Download report # GET /reports/{id}/status - Check report status ``` #### Pricing API (Admin) ```yaml # GET /pricing - List pricing # POST /pricing - Create pricing entry # PUT /pricing/{id} - Update pricing # DELETE /pricing/{id} - Delete pricing (soft delete) ``` #### Authentication API (v0.5.0) ```yaml # POST /auth/register - Register new user request: content: application/json: schema: type: object required: [email, password, full_name] properties: email: type: string format: email password: type: string minLength: 8 pattern: "^(?=.*[a-z])(?=.*[A-Z])(?=.*\\d)(?=.*[!@#$%^&*])" full_name: 
type: string maxLength: 255 response: 201: content: application/json: schema: type: object properties: user: $ref: '#/components/schemas/User' access_token: type: string refresh_token: type: string token_type: type: string example: bearer # POST /auth/login - Authenticate user request: content: application/json: schema: type: object required: [email, password] properties: email: type: string format: email password: type: string response: 200: content: application/json: schema: type: object properties: access_token: type: string refresh_token: type: string token_type: type: string example: bearer 401: description: Invalid credentials # POST /auth/refresh - Refresh access token request: content: application/json: schema: type: object required: [refresh_token] properties: refresh_token: type: string response: 200: content: application/json: schema: type: object properties: access_token: type: string refresh_token: type: string token_type: type: string example: bearer # POST /auth/logout - Logout user (optional: blacklist token) security: - BearerAuth: [] response: 200: description: Successfully logged out # GET /auth/me - Get current user info security: - BearerAuth: [] response: 200: content: application/json: schema: $ref: '#/components/schemas/User' # POST /auth/reset-password-request - Request password reset request: content: application/json: schema: type: object required: [email] properties: email: type: string format: email response: 202: description: Reset email sent (if user exists) # POST /auth/reset-password - Reset password with token request: content: application/json: schema: type: object required: [token, new_password] properties: token: type: string new_password: type: string minLength: 8 response: 200: description: Password reset successful ``` #### API Keys API (v0.5.0) ```yaml # POST /api-keys - Create new API key security: - BearerAuth: [] request: content: application/json: schema: type: object required: [name] properties: name: type: string 
maxLength: 255 scopes: type: array items: type: string enum: [read:scenarios, write:scenarios, delete:scenarios, read:reports, write:reports, read:metrics, ingest:logs] expires_days: type: integer minimum: 1 maximum: 365 response: 201: description: API key created content: application/json: schema: type: object properties: id: type: string format: uuid name: type: string key: type: string description: Full key (shown ONLY once!) example: mk_a3f9b2c1_xK9mP2nQ8rS4tU7vW1yZ prefix: type: string example: a3f9b2c1 scopes: type: array items: type: string expires_at: type: string format: date-time created_at: type: string format: date-time # GET /api-keys - List user's API keys security: - BearerAuth: [] response: 200: content: application/json: schema: type: array items: type: object properties: id: type: string format: uuid name: type: string prefix: type: string scopes: type: array items: type: string last_used_at: type: string format: date-time expires_at: type: string format: date-time is_active: type: boolean created_at: type: string format: date-time # NOTE: key_hash is NOT included in response # DELETE /api-keys/{id} - Revoke API key security: - BearerAuth: [] parameters: - name: id in: path required: true schema: type: string format: uuid response: 204: description: API key revoked 404: description: API key not found # POST /api-keys/{id}/rotate - Rotate API key security: - BearerAuth: [] parameters: - name: id in: path required: true schema: type: string format: uuid response: 200: description: New API key generated content: application/json: schema: type: object properties: id: type: string format: uuid name: type: string key: type: string description: New full key (shown ONLY once!) 
prefix: type: string scopes: type: array items: type: string ``` #### Report Schedules API (v0.5.0) ```yaml # POST /schedules - Create report schedule security: - BearerAuth: [] request: content: application/json: schema: type: object required: [scenario_id, name, frequency, hour, minute, format] properties: scenario_id: type: string format: uuid name: type: string frequency: type: string enum: [daily, weekly, monthly] day_of_week: type: integer minimum: 0 maximum: 6 description: Required for weekly (0=Sunday) day_of_month: type: integer minimum: 1 maximum: 31 description: Required for monthly hour: type: integer minimum: 0 maximum: 23 minute: type: integer minimum: 0 maximum: 59 format: type: string enum: [pdf, csv] include_logs: type: boolean sections: type: array items: type: string email_to: type: array items: type: string format: email response: 201: content: application/json: schema: $ref: '#/components/schemas/Schedule' # GET /schedules - List user's schedules security: - BearerAuth: [] response: 200: content: application/json: schema: type: array items: $ref: '#/components/schemas/Schedule' # PUT /schedules/{id} - Update schedule security: - BearerAuth: [] response: 200: content: application/json: schema: $ref: '#/components/schemas/Schedule' # DELETE /schedules/{id} - Delete schedule security: - BearerAuth: [] response: 204: description: Schedule deleted ``` ### 4.3 Schemas ```yaml components: schemas: Scenario: type: object properties: id: type: string format: uuid name: type: string description: type: string tags: type: array items: type: string status: type: string enum: [draft, running, completed, archived] region: type: string created_at: type: string format: date-time updated_at: type: string format: date-time completed_at: type: string format: date-time total_requests: type: integer total_cost_estimate: type: number LogEntry: type: object properties: id: type: string format: uuid scenario_id: type: string format: uuid received_at: type: string 
format: date-time message_hash: type: string message_preview: type: string source: type: string size_bytes: type: integer has_pii: type: boolean token_count: type: integer sqs_blocks: type: integer User: type: object properties: id: type: string format: uuid email: type: string format: email full_name: type: string is_active: type: boolean is_superuser: type: boolean created_at: type: string format: date-time updated_at: type: string format: date-time last_login: type: string format: date-time required: - id - email - is_active - created_at APIKey: type: object properties: id: type: string format: uuid name: type: string key_prefix: type: string scopes: type: array items: type: string last_used_at: type: string format: date-time expires_at: type: string format: date-time is_active: type: boolean created_at: type: string format: date-time required: - id - key_prefix - scopes - is_active - created_at Schedule: type: object properties: id: type: string format: uuid user_id: type: string format: uuid scenario_id: type: string format: uuid name: type: string frequency: type: string enum: [daily, weekly, monthly] day_of_week: type: integer day_of_month: type: integer hour: type: integer minute: type: integer format: type: string enum: [pdf, csv] include_logs: type: boolean sections: type: array items: type: string email_to: type: array items: type: string format: email is_active: type: boolean last_run_at: type: string format: date-time next_run_at: type: string format: date-time created_at: type: string format: date-time required: - id - user_id - scenario_id - frequency - hour - minute - format - is_active securitySchemes: BearerAuth: type: http scheme: bearer bearerFormat: JWT ApiKeyAuth: type: apiKey in: header name: X-API-Key ``` --- ## 5. 
Data Flow ### 5.1 Authentication Flow (v0.5.0) ``` ┌─────────────────────────────────────────────────────────────────────────────┐ │ USER AUTHENTICATION FLOW │ └─────────────────────────────────────────────────────────────────────────────┘ Registration: ┌──────────┐ POST /auth/register ┌──────────────┐ ┌─────────────┐ │ Client │ ───────────────────────────> │ Backend │────>│ Validate │ │ (Browser)│ {email, password, name} │ │ │ Input │ └──────────┘ └──────┬───────┘ └──────┬──────┘ │ │ │ ▼ │ ┌─────────────┐ │ │ Check if │ │ │ email exists│ │ └──────┬──────┘ │ │ │ ▼ │ ┌─────────────┐ │ │ Hash with │ │ │ bcrypt(12) │ │ └──────┬──────┘ │ │ ▼ ▼ ┌──────────────┐ ┌─────────────┐ │ Create User │<────│ Insert │ │ in DB │ │ to users │ └──────┬───────┘ └─────────────┘ │ ▼ ┌──────────────┐ │ Generate │ │ JWT Tokens │ └──────┬───────┘ │ ▼ ┌──────────┐ 201 Created ┌──────────────┐ │ Client │ <─────────────────────────│ {user, │ │ (Browser)│ {access_token, │ tokens} │ └──────────┘ refresh_token} └──────────────┘ Login: ┌──────────┐ POST /auth/login ┌──────────────┐ ┌─────────────┐ │ Client │ ──────────────────────────>│ Backend │────>│ Find User │ │ (Browser)│ {email, password} │ │ │ by Email │ └──────────┘ └──────┬───────┘ └──────┬──────┘ │ │ │ ▼ │ ┌─────────────┐ │ │ Verify │ │ │ Password │ │ │ bcrypt │ │ └──────┬──────┘ │ │ ▼ │ ┌──────────────┐ │ │ Update │<──────────┘ │ last_login │ └──────┬───────┘ │ ▼ ┌──────────────┐ │ Generate │ │ JWT Tokens │ └──────┬───────┘ │ ▼ ┌──────────┐ 200 OK ┌──────────────┐ │ Client │ <─────────────────────────│ {access, │ │ (Browser)│ {access_token, │ refresh} │ └──────────┘ refresh_token} └──────────────┘ ``` ### 5.2 API Key Authentication Flow (v0.5.0) ``` ┌─────────────────────────────────────────────────────────────────────────────┐ │ API KEY AUTHENTICATION FLOW │ └─────────────────────────────────────────────────────────────────────────────┘ API Key Creation: ┌──────────┐ POST /api-keys ┌──────────────┐ ┌─────────────┐ │ Client │ 
───────────────────────────>│ Backend │────>│ Validate │ │(JWT Auth)│ {name, scopes, expires} │ (JWT Auth) │ │ Input │ └──────────┘ └──────┬───────┘ └──────┬──────┘ │ │ │ ▼ │ ┌─────────────┐ │ │ Generate │ │ │ Random Key │ │ │ mk_xxxx... │ │ └──────┬──────┘ │ │ ▼ │ ┌──────────────┐ │ │ Split Key │<────────────┘ │ prefix/hash │ └──────┬───────┘ │ ┌─────────────────────┼─────────────────────┐ │ │ │ ▼ ▼ ▼ ┌─────────────┐ ┌─────────────┐ ┌─────────────┐ │ Extract │ │ Hash with │ │ Store in │ │ Prefix │ │ SHA-256 │ │ api_keys │ │ (8 chars) │ │ │ │ table │ └─────────────┘ └──────┬──────┘ └──────┬──────┘ │ │ └────────────────────┘ │ ▼ ┌──────────┐ 201 Created ┌──────────────┐ ┌─────────────┐ │ Client │ <─────────────────────────│ Return │────>│ Store ONLY │ │(JWT Auth)│ {key: "mk_xxxx...", │ Response │ │ hash/prefix│ └──────────┘ prefix, scopes} │ ⚠️ SHOW ONCE│ │ (NOT full) │ └──────────────┘ └─────────────┘ API Key Usage: ┌──────────┐ X-API-Key: mk_xxxx ┌──────────────┐ ┌─────────────┐ │ Client │ ──────────────────────────>│ Backend │────>│ Extract │ │(API Key) │ GET /scenarios │ │ │ Prefix │ └──────────┘ └──────┬───────┘ └──────┬──────┘ │ │ │ ▼ │ ┌─────────────┐ │ │ Lookup by │ │ │ prefix in │ │ │ api_keys │ │ └──────┬──────┘ │ │ ▼ │ ┌──────────────┐ │ │ Hash Input │<──────────┘ │ Key & │ │ Compare │ └──────┬───────┘ │ ┌────────────────────┼────────────────────┐ │ │ │ ▼ ▼ ▼ ┌─────────────┐ ┌─────────────┐ ┌─────────────┐ │ Check │ │ Validate │ │ Update │ │ is_active │ │ Scopes │ │ last_used │ │ & expiry │ │ │ │ │ └──────┬──────┘ └──────┬──────┘ └──────┬──────┘ │ │ │ └────────────────────┼────────────────────┘ │ ▼ ┌──────────┐ 200 OK ┌──────────────┐ │ Client │ <──────────────────────────│ Process │ │(API Key) │ {scenarios: [...]} │ Request │ └──────────┘ └──────────────┘ ``` ### 5.3 Log Ingestion Flow ``` ┌──────────┐ POST /ingest ┌──────────────┐ │ Client │ ───────────────────────>│ FastAPI │ │(Logstash)│ Headers: │ Middleware │ │ │ X-Scenario-ID: uuid │ │ └──────────┘ 
└──────┬───────┘ │ │ 1. Validate scenario exists & running │ 2. Parse JSON payload ▼ ┌──────────────┐ │ Ingest │ │ Service │ └──────┬───────┘ │ ┌───────────────────────┼───────────────────────┐ │ │ │ ▼ ▼ ▼ ┌──────────────┐ ┌──────────────┐ ┌──────────────┐ │ PII Detector │ │ SQS Calculator│ │ Tokenizer │ │ • check email│ │ • calc blocks │ │ • count │ └──────┬───────┘ └──────┬───────┘ └──────┬───────┘ │ │ │ │ has_pii: bool │ sqs_blocks: int │ tokens: int └──────────────────────┼─────────────────────┘ │ ▼ ┌──────────────┐ │ LogRepo │ │ save() │ └──────┬───────┘ │ ▼ ┌──────────────┐ │ PostgreSQL │ │ scenario_logs│ └──────────────┘ ``` ### 5.2 Scenario State Machine ``` ┌─────────────────────────────────────────────────────────┐ │ │ ▼ │ ┌──────────┐ POST /start ┌──────────┐ │ ┌───────│ DRAFT │────────────────────>│ RUNNING │ │ │ └──────────┘ └────┬─────┘ │ │ ▲ │ │ │ │ │ POST /stop │ │ │ POST /archive ▼ │ │ │ ┌──────────┐ │ │ ┌────┴────┐<────────────────────│COMPLETED │──────────────────┘ │ │ARCHIVED │ └──────────┘ └──────>└─────────┘ ``` ### 5.3 Cost Calculation Flow ``` ┌─────────────────────────────────────────────────────────────────────────┐ │ COST CALCULATION PIPELINE │ └─────────────────────────────────────────────────────────────────────────┘ Input: scenario_logs row ├─ sqs_blocks ├─ token_count └─ (future: lambda_gb_seconds) │ ▼ ┌─────────────────┐ │ Pricing Service │ │ • get_active() │ └────────┬────────┘ │ Query: SELECT * FROM aws_pricing │ WHERE service IN ('sqs', 'lambda', 'bedrock') │ AND region = :scenario_region │ AND is_active = TRUE ▼ ┌─────────────────────────────────────────────────────────────────────────┐ │ COST FORMULAS │ ├─────────────────────────────────────────────────────────────────────────┤ │ │ │ SQS Cost: │ │ cost = blocks × price_per_million / 1,000,000 │ │ Example: 100 blocks × $0.40 / 1M = $0.00004 │ │ │ │ Lambda Cost: │ │ request_cost = invocations × price_per_million / 1,000,000 │ │ compute_cost = gb_seconds × price_per_gb_second │ │ 
total = request_cost + compute_cost │ │ Example: 1M invoc × $0.20/1M + 10GBs × $0.00001667 = $0.20 + $0.00017│ │ │ │ Bedrock Cost: │ │ input_cost = input_tokens × price_per_1k_input / 1,000 │ │ output_cost = output_tokens × price_per_1k_output / 1,000 │ │ total = input_cost + output_cost │ │ Example: 1000 tokens × $0.003/1K = $0.003 │ │ │ └─────────────────────────────────────────────────────────────────────────┘ │ ▼ ┌─────────────────┐ │ Update │ │ scenarios │ │ total_cost │ └─────────────────┘ ``` --- ## 6. Security Architecture ### 6.1 Authentication Architecture #### JWT Token Implementation (v0.5.0) ``` ┌─────────────────────────────────────────────────────────────────┐ │ JWT AUTHENTICATION FLOW │ ├─────────────────────────────────────────────────────────────────┤ │ │ │ ┌─────────────┐ POST /auth/login ┌──────────────┐ │ │ │ User │ ───────────────────────> │ Backend │ │ │ │ (Client) │ {email, password} │ │ │ │ └─────────────┘ └──────┬───────┘ │ │ │ │ │ │ 1. Validate │ │ │ credentials│ │ │ 2. 
Generate │ │ │ tokens │ │ ▼ │ │ ┌──────────────┐ │ │ ┌─────────────┐ {access, refresh} │ JWT │ │ │ │ User │ <────────────────────── │ Tokens │ │ │ └──────┬──────┘ └──────────────┘ │ │ │ │ │ │ Authorization: Bearer │ │ ▼ │ │ ┌─────────────┐ POST /auth/refresh ┌──────────────┐ │ │ │ Protected │ <───────────────────────> │ Refresh │ │ │ │ API │ {refresh_token} │ Token │ │ │ └─────────────┘ └──────────────┘ │ │ │ └─────────────────────────────────────────────────────────────────┘ ``` **Token Configuration:** | Parameter | Value | Security Level | |-----------|-------|----------------| | Algorithm | HS256 | Standard | | Secret Length | ≥32 chars | 256-bit minimum | | Access Token TTL | 30 minutes | Short-lived | | Refresh Token TTL | 7 days | Rotating | | bcrypt Cost | 12 | ~250ms/hash | **Token Rotation:** - New refresh token issued with each access token refresh - Old refresh tokens invalidated after use - Prevents replay attacks with stolen refresh tokens #### API Keys Architecture (v0.5.0) ``` ┌─────────────────────────────────────────────────────────────────┐ │ API KEYS SECURITY │ ├─────────────────────────────────────────────────────────────────┤ │ │ │ Key Format: mk__ │ │ Example: mk_a3f9b2c1_xK9mP2nQ8rS4tU7vW1yZ │ │ │ │ ┌──────────────┐ │ │ │ Generation │ │ │ ├──────────────┤ │ │ │ mk_ │ Fixed prefix │ │ │ a3f9b2c1 │ 8-char prefix (identification) │ │ │ xK9m... │ 32 random chars (base64url) │ │ └──────┬───────┘ │ │ │ │ │ ▼ │ │ ┌──────────────┐ ┌──────────────┐ │ │ │ Storage │ │ Database │ │ │ ├──────────────┤ ├──────────────┤ │ │ │ key_prefix │──────>│ a3f9b2c1 │ (plaintext) │ │ │ key_hash │──────>│ SHA-256(...) │ (hashed) │ │ │ scopes │──────>│ ["read:*"] │ (JSONB) │ │ └──────────────┘ └──────────────┘ │ │ │ │ ⚠️ Full key shown ONLY at creation time! 
│
│                                                                 │
└─────────────────────────────────────────────────────────────────┘
```

**API Key Scopes:**

| Scope | Permission | Description |
|-------|------------|-------------|
| `read:scenarios` | Read | View scenarios |
| `write:scenarios` | Write | Create/update scenarios |
| `delete:scenarios` | Delete | Delete scenarios |
| `read:reports` | Read | Download reports |
| `write:reports` | Write | Generate reports |
| `read:metrics` | Read | View metrics |
| `ingest:logs` | Special | Send logs to scenarios |

#### Authentication Layers

```
┌─────────────────────────────────────────────────────────────────┐
│                      AUTHENTICATION LAYERS                      │
├─────────────────────────────────────────────────────────────────┤
│                                                                 │
│  Layer 1: API Key (Programmatic Access)                         │
│  ├─ Header: X-API-Key: mk_<prefix>_<random>                     │
│  ├─ Rate limiting: 10 req/min (management)                      │
│  ├─ Rate limiting: 1000 req/min (ingest)                        │
│  ├─ Scope validation: Required                                  │
│  └─ Storage: Hash only (SHA-256)                                │
│                                                                 │
│  Layer 2: JWT Token (Web UI Access)                             │
│  ├─ Header: Authorization: Bearer <token>                       │
│  ├─ Algorithm: HS256                                            │
│  ├─ Secret: ≥32 chars (env var)                                 │
│  ├─ Access expiration: 30 minutes                               │
│  ├─ Refresh expiration: 7 days                                  │
│  ├─ Token rotation: Enabled                                     │
│  └─ Scope: Full access based on user role                       │
│                                                                 │
│  Layer 3: Role-Based Access Control (RBAC)                      │
│  ├─ superuser: Full system access                               │
│  ├─ user: CRUD own scenarios, own API keys                      │
│  └─ readonly: View scenarios, read metrics                      │
│                                                                 │
└─────────────────────────────────────────────────────────────────┘
```

### 6.2 Data Security

#### Security Controls Matrix

| Layer | Measure | Implementation | v0.5.0 Status |
|-------|---------|----------------|---------------|
| **Transport** | TLS 1.3 | Nginx reverse proxy | 🔄 Planned |
| **Auth Storage** | Password Hashing | bcrypt (cost=12) | ✅ Implemented |
| **API Key Storage** | Hashing | SHA-256 (hash only) | ✅ Implemented |
| **JWT** | Token Signing | HS256, ≥32 char secret | ✅ Implemented |
| **PII** | Detection + Truncation | Email regex, 500 char preview | ✅ Implemented |
| **API** | Rate Limiting | slowapi with tiered limits | ✅ Implemented |
| **DB** | Parameterized Queries | SQLAlchemy ORM (no raw SQL) | ✅ Implemented |
| **Secrets** | Environment Variables | python-dotenv, Docker secrets | ✅ Implemented |
| **CORS** | Origin Validation | Configured allowed origins | ✅ Implemented |
| **Input** | Validation | Pydantic schemas | ✅ Implemented |

#### Rate Limiting Configuration (v0.5.0)

```python
# Rate limit tiers
RATE_LIMITS = {
    "auth": {"requests": 5, "window": "1 minute"},          # Login/register
    "apikey_mgmt": {"requests": 10, "window": "1 minute"},  # API key CRUD
    "reports": {"requests": 10, "window": "1 minute"},      # Report generation
    "general": {"requests": 100, "window": "1 minute"},     # Standard API
    "ingest": {"requests": 1000, "window": "1 minute"},     # Log ingestion
}
```

#### CORS Configuration

```python
# Allowed origins (configurable via env)
ALLOWED_ORIGINS = [
    "http://localhost:5173",  # Development
    "http://localhost:3000",  # Alternative dev
    # Production origins configured via FRONTEND_URL env var
]

# CORS policy
CORS_CONFIG = {
    "allow_credentials": True,
    "allow_methods": ["GET", "POST", "PUT", "DELETE", "OPTIONS"],
    "allow_headers": ["*"],
    "max_age": 600,
}
```

### 6.3 PII Detection Strategy

```python
import re

# Pattern matching for common PII
def detect_pii(message: str) -> dict:
    patterns = {
        # [A-Za-z] for the TLD: a "|" inside a character class is a literal pipe
        'email': r'\b[A-Za-z0-9._%+-]+@[A-Za-z0-9.-]+\.[A-Za-z]{2,}\b',
        'ssn': r'\b\d{3}-\d{2}-\d{4}\b',
        'credit_card': r'\b(?:\d[ -]*?){13,16}\b',
        'phone': r'\b\d{3}[-.]?\d{3}[-.]?\d{4}\b'
    }

    # Count matches per PII type, keeping only the types that occur
    results = {}
    for pii_type, pattern in patterns.items():
        matches = re.findall(pattern, message)
        if matches:
            results[pii_type] = len(matches)

    return {
        'has_pii': len(results) > 0,
        'pii_types': list(results.keys()),
        'total_matches': sum(results.values())
    }
```

---

## 7. Technology Stack

### 7.1 Backend

| Component | Technology | Version | Purpose |
|-----------|------------|---------|---------|
| Framework | FastAPI | ≥0.110 | Web framework |
| Server | Uvicorn | ≥0.29 | ASGI server |
| Validation | Pydantic | ≥2.7 | Data validation |
| ORM | SQLAlchemy | ≥2.0 | Database ORM |
| Migrations | Alembic | latest | DB migrations |
| Driver | asyncpg | latest | Async PostgreSQL |
| Tokenizer | tiktoken | ≥0.6 | Token counting |
| Rate Limit | slowapi | latest | API rate limiting |
| Auth | python-jose | latest | JWT handling |
| Password Hash | bcrypt | ≥4.0 | Password hashing |
| Email | sendgrid-python | latest | Email notifications |
| Scheduling | apscheduler | ≥3.10 | Cron jobs |
| Testing | pytest | ≥8.1 | Test framework |
| HTTP Client | httpx | ≥0.27 | Async HTTP |

### 7.2 Frontend (v0.4.0 Implemented)

| Component | Technology | Version | Purpose | Status |
|-----------|------------|---------|---------|--------|
| Framework | React | ≥18 | UI library | ✅ Implemented |
| Language | TypeScript | ≥5.0 | Type safety | ✅ Implemented |
| Build | Vite | ≥5.0 | Build tool | ✅ Implemented |
| Styling | Tailwind CSS | ≥3.4 | CSS framework | ✅ Implemented |
| Components | shadcn/ui | latest | UI components | ✅ 15+ components |
| Icons | Lucide React | latest | Icon library | ✅ Implemented |
| State | TanStack Query | ≥5.0 | Server state | ✅ React Query v5 |
| HTTP | Axios | ≥1.6 | HTTP client | ✅ With interceptors |
| Routing | React Router | ≥6.0 | Navigation | ✅ Implemented |
| Charts | Recharts | ≥2.0 | Data viz | ✅ Implemented v0.4.0 |
| Theme | next-themes | latest | Dark/Light mode | ✅ Implemented v0.4.0 |
| E2E Testing | Playwright | ≥1.40 | Browser testing | ✅ 100 tests v0.4.0 |

**Note v0.4.0:**
- ✅ 5 pages complete: Dashboard, ScenarioDetail, ScenarioEdit, Compare, Reports
- ✅ 15+ shadcn/ui components integrated
- ✅ Recharts visualization (CostBreakdown, TimeSeries, Comparison charts)
- ✅ Dark/Light mode with system preference
detection
- ✅ React Query for data fetching with caching
- ✅ Axios with error interceptors and toast notifications
- ✅ Responsive design with Tailwind CSS
- ✅ E2E testing with Playwright (100 test cases)

### 7.3 Infrastructure (v0.4.0 Status)

| Component | Technology | Purpose | Status |
|-----------|------------|---------|--------|
| Container | Docker | Application containers | ✅ PostgreSQL |
| Orchestration | Docker Compose | Multi-container dev | ✅ Dev setup |
| Database | PostgreSQL 15+ | Primary data store | ✅ Running |
| E2E Testing | Playwright | Browser automation | ✅ 100 tests |
| Reverse Proxy | Nginx | SSL, static files | 🔄 Planned v1.0.0 |
| Process Manager | systemd / PM2 | Production process mgmt | 🔄 Planned v1.0.0 |

**Docker Services:**

```yaml
# Current (v0.4.0)
- postgres: PostgreSQL 15 with healthcheck
  Status: ✅ Tested and running
  Ports: 5432:5432
  Volume: postgres_data (persistent)

# Planned (v1.0.0)
- backend: FastAPI production image
- frontend: Nginx serving React build
- nginx: Reverse proxy with SSL
```

---

## 8. Project Structure (v0.3.0 - Implemented)

```
mockupAWS/
├── src/  # Backend FastAPI (Root level)
│   ├── main.py  # FastAPI app entry
│   ├── core/  # Core utilities
│   │   ├── config.py  # Settings & env vars
│   │   ├── database.py  # SQLAlchemy async config
│   │   └── exceptions.py  # Custom exception handlers
│   ├── models/  # SQLAlchemy models (v0.2.0)
│   │   ├── __init__.py
│   │   ├── scenario.py
│   │   ├── scenario_log.py
│   │   ├── scenario_metric.py
│   │   ├── aws_pricing.py
│   │   └── report.py
│   ├── schemas/  # Pydantic schemas
│   │   ├── __init__.py
│   │   ├── scenario.py
│   │   ├── scenario_log.py
│   │   └── scenario_metric.py
│   ├── api/  # API routes
│   │   ├── deps.py  # FastAPI dependencies (get_db)
│   │   └── v1/
│   │       ├── __init__.py  # API router aggregation
│   │       ├── scenarios.py  # CRUD endpoints (v0.2.0)
│   │       ├── ingest.py  # Log ingestion (v0.2.0)
│   │       └── metrics.py  # Metrics endpoints (v0.2.0)
│   ├── repositories/  # Repository pattern (v0.2.0)
│   │   ├── __init__.py
│   │   ├── base.py
│   │   ├── scenario.py
│   │   ├── scenario_log.py
│   │   ├── scenario_metric.py
│   │   └── aws_pricing.py
│   └── services/  # Business logic (v0.2.0)
│       ├── __init__.py
│       ├── pii_detector.py  # PII detection service
│       ├── cost_calculator.py  # AWS cost calculation
│       └── ingest_service.py  # Log ingestion orchestration
│
├── frontend/  # Frontend React (v0.4.0)
│   ├── src/
│   │   ├── App.tsx  # Root component with routing
│   │   ├── main.tsx  # React entry point
│   │   ├── components/
│   │   │   ├── layout/  # Layout components
│   │   │   │   ├── Header.tsx  # With theme toggle (v0.4.0)
│   │   │   │   ├── Sidebar.tsx
│   │   │   │   └── Layout.tsx
│   │   │   ├── ui/  # shadcn/ui components (v0.3.0)
│   │   │   │   ├── button.tsx
│   │   │   │   ├── card.tsx
│   │   │   │   ├── dialog.tsx
│   │   │   │   ├── input.tsx
│   │   │   │   ├── label.tsx
│   │   │   │   ├── table.tsx
│   │   │   │   ├── textarea.tsx
│   │   │   │   ├── toast.tsx
│   │   │   │   ├── toaster.tsx
│   │   │   │   ├── sonner.tsx
│   │   │   │   ├── tabs.tsx  # v0.4.0
│   │   │   │   ├── checkbox.tsx  # v0.4.0
│   │   │   │   └── select.tsx  # v0.4.0
│   │   │   ├── charts/  # Recharts components (v0.4.0)
│   │   │   │   ├── CostBreakdownChart.tsx
│   │   │   │   ├── TimeSeriesChart.tsx
│   │   │   │   └── ComparisonBarChart.tsx
│   │   │   ├── comparison/  # Comparison feature (v0.4.0)
│   │   │   │   ├── ScenarioComparisonTable.tsx
│   │   │   │   └── ComparisonMetrics.tsx
│   │   │   └── reports/  # Report generation UI (v0.4.0)
│   │   │       ├── ReportGenerator.tsx
│   │   │       └── ReportList.tsx
│   │   ├── pages/  # Page components (v0.4.0)
│   │   │   ├── Dashboard.tsx  # Scenarios list
│   │   │   ├── ScenarioDetail.tsx  # Scenario view/edit with charts
│   │   │   ├── ScenarioEdit.tsx  # Create/edit form
│   │   │   ├── Compare.tsx  # Compare scenarios (v0.4.0)
│   │   │   └── Reports.tsx  # Reports page (v0.4.0)
│   │   ├── hooks/  # React Query hooks (v0.4.0)
│   │   │   ├── useScenarios.ts
│   │   │   ├── useCreateScenario.ts
│   │   │   ├── useUpdateScenario.ts
│   │   │   ├── useComparison.ts  # v0.4.0
│   │   │   └── useReports.ts  # v0.4.0
│   │   ├── lib/  # Utilities
│   │   │   ├── api.ts  # Axios client config
│   │   │   ├── utils.ts  # Utility functions
│   │   │   ├── queryClient.ts  # React Query config
│   │   │   └── theme-provider.tsx  # Dark mode (v0.4.0)
│   │   └── types/
│   │       └── api.ts  # TypeScript types
│   ├── e2e/  # E2E tests (v0.4.0)
│   │   ├── tests/
│   │   │   ├── scenarios.spec.ts
│   │   │   ├── reports.spec.ts
│   │   │   ├── comparison.spec.ts
│   │   │   └── dark-mode.spec.ts
│   │   ├── fixtures/
│   │   └── TEST-RESULTS.md
│   ├── package.json
│   ├── vite.config.ts
│   ├── tsconfig.json
│   ├── tailwind.config.js
│   ├── playwright.config.ts  # E2E config (v0.4.0)
│   ├── components.json  # shadcn/ui config
│   └── Dockerfile  # Production build
│
├── alembic/  # Database migrations (v0.2.0)
│   ├── versions/  # 6 migrations implemented
│   │   ├── 8c29fdcbbf85_create_scenarios_table.py
│   │   ├── e46de4b0264a_create_scenario_logs_table.py
│   │   ├── 5e247ed57b77_create_scenario_metrics_table.py
│   │   ├── 48f2231e7c12_create_aws_pricing_table.py
│   │   ├── e80c6eef58b2_create_reports_table.py
│   │   └── 0892c44b2a58_seed_aws_pricing_data.py
│   ├── env.py
│   └── alembic.ini
│
├── export/  # Project documentation
│   ├── prd.md  # Product Requirements
│   ├── architecture.md  # This file
│   ├── kanban.md  # Task breakdown
│   └── progress.md  # Progress tracking
│
├── .opencode/  # OpenCode team config
│   └── agents/  # 6 agent configurations
│       ├── spec-architect.md
│       ├── backend-dev.md
│       ├── db-engineer.md
│       ├── frontend-dev.md
│       ├── devops-engineer.md
│       └── qa-engineer.md
│
├── docker-compose.yml  # PostgreSQL service
├── Dockerfile.backend  # Backend production image
├── pyproject.toml  # Python dependencies (uv)
├── uv.lock  # Locked dependencies
├── .env  # Environment variables
├── .gitignore  # Git ignore rules
└── README.md  # Project documentation
```

---

## 9. Architectural Decisions

### DEC-001: Async-First Architecture

**Decision:** Use Python async/await across the whole stack (FastAPI, SQLAlchemy, asyncpg)

**Rationale:**
- High throughput required (>1000 RPS)
- I/O-bound operations (DB, tokenizer)
- Better resource utilization than a sync stack

**Alternatives considered:**
- Sync + ThreadPool: simpler but less efficient
- Celery + Redis: too complex for the use case

**Consequences:**
- Learning curve for async
- More complex debugging
- Better scalability

---

### DEC-002: Repository Pattern

**Decision:** Implement the Repository Pattern for data access

**Rationale:**
- Separation between business logic and data access
- Easy testing with mock repositories
- Option to change the database in the future

**Structure:**

```python
from typing import Generic, TypeVar
from uuid import UUID

T = TypeVar("T")  # entity type handled by a concrete repository

class BaseRepository(Generic[T]):
    async def get(self, id: UUID) -> T | None: ...
    async def list(self, **filters) -> list[T]: ...
    async def create(self, obj: T) -> T: ...
    async def update(self, id: UUID, data: dict) -> T: ...
    async def delete(self, id: UUID) -> bool: ...
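

# ---------------------------------------------------------------------------
# Illustrative sketch, not taken from the mockupAWS codebase. DEC-002 cites
# easy testing with mock repositories, so a dict-backed stand-in that mirrors
# the five method names above might look like this. InMemoryRepository,
# FakeScenario and the `id` attribute convention are assumptions made only
# for this example.
# ---------------------------------------------------------------------------
import asyncio
from dataclasses import dataclass, field
from typing import Any
from uuid import UUID, uuid4


@dataclass
class FakeScenario:
    """Hypothetical entity used only for this example."""
    name: str
    status: str = "draft"
    id: UUID = field(default_factory=uuid4)


class InMemoryRepository:
    """Keeps objects in a dict keyed by their `id`; no database involved."""

    def __init__(self) -> None:
        self._items = {}

    async def get(self, id: UUID) -> Any:
        # Returns None when the id is unknown.
        return self._items.get(id)

    async def list(self, **filters: Any) -> Any:
        # Keep only objects whose attributes equal every given filter value.
        return [
            obj for obj in self._items.values()
            if all(getattr(obj, key, None) == value for key, value in filters.items())
        ]

    async def create(self, obj: Any) -> Any:
        self._items[obj.id] = obj
        return obj

    async def update(self, id: UUID, data: dict) -> Any:
        obj = self._items[id]  # raises KeyError for an unknown id
        for key, value in data.items():
            setattr(obj, key, value)
        return obj

    async def delete(self, id: UUID) -> bool:
        # True only if something was actually removed.
        return self._items.pop(id, None) is not None


async def _demo() -> tuple:
    # Exercise the fake the way a service-layer test might.
    repo = InMemoryRepository()
    scenario = await repo.create(FakeScenario(name="checkout-logs"))
    await repo.update(scenario.id, {"status": "running"})
    running = await repo.list(status="running")
    deleted = await repo.delete(scenario.id)
    return running[0].name, deleted, await repo.get(scenario.id)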
```

---

### DEC-003: Single Table Instead of Separate Database per Scenario

**Decision:** Use a single `scenario_logs` table with a `scenario_id` FK instead of separate databases

**Rationale:**
- Simpler to manage
- Cross-scenario queries are possible (comparisons)
- Simpler backup/restore

**Alternatives considered:**
- Schema per scenario: too much overhead
- Separate databases: too complex for the MVP

---

### DEC-004: Message Hashing for Deduplication

**Decision:** Use a SHA-256 hash of the message for deduplication

**Rationale:**
- Privacy: full messages are not stored
- Performance: O(1) hash lookup
- Storage: saves space

**Implementation:**

```python
import hashlib

message_hash = hashlib.sha256(message.encode()).hexdigest()
```

---

### DEC-005: Time-Series Metrics

**Decision:** Store metrics as time series in `scenario_metrics`

**Rationale:**
- Trend analysis becomes possible
- Flexible aggregations
- Audit trail

**Trade-off:**
- More storage than aggregated fields
- Queries are more complex, but indexed

---

## 10. Performance Considerations

### 10.1 Database Optimization

| Optimization | Implementation | Benefit |
|--------------|----------------|---------|
| Indexes | B-tree on foreign keys, timestamps | Fast lookups |
| GIN | tags (JSONB) | Fast array search |
| Partitioning | scenario_logs by date | Query pruning |
| Connection Pool | asyncpg pool (20-50) | Concurrency |

### 10.2 Caching Strategy (Future)

```
Layer 1: In-memory (FastAPI state)
├─ Active scenario metadata
└─ AWS pricing (rarely changes)

Layer 2: Redis (future)
├─ Session storage
├─ Rate limiting counters
└─ Report generation status
```

### 10.3 Query Optimization

- Use `selectinload` for relationships
- Batch inserts for logs via PostgreSQL `COPY` (asyncpg `copy_records_to_table`; `copy_expert` is psycopg2-only)
- Materialized views for reports
- Async tasks for heavy operations

---

## 11. Error Handling Strategy

### 11.1 Exception Hierarchy

```python
class AppException(Exception):
    """Base application exception"""
    status_code: int = 500
    code: str = "internal_error"

class NotFoundException(AppException):
    status_code = 404
    code = "not_found"

class ValidationException(AppException):
    status_code = 400
    code = "validation_error"

class ConflictException(AppException):
    status_code = 409
    code = "conflict"

class RateLimitException(AppException):
    status_code = 429
    code = "rate_limited"
```

### 11.2 Global Exception Handler

```python
from datetime import datetime, timezone

from fastapi import Request
from fastapi.responses import JSONResponse

# `app` is the FastAPI instance created in src/main.py
@app.exception_handler(AppException)
async def app_exception_handler(request: Request, exc: AppException):
    return JSONResponse(
        status_code=exc.status_code,
        content={
            "error": exc.code,
            "message": str(exc),
            # Timezone-aware UTC; datetime.utcnow() is deprecated since Python 3.12
            "timestamp": datetime.now(timezone.utc).isoformat(),
        },
    )
```

---

## 12. Deployment Architecture

### 12.1 Docker Compose (Development)

```yaml
version: '3.8'

services:
  postgres:
    image: postgres:15-alpine
    environment:
      POSTGRES_DB: mockupaws
      POSTGRES_USER: app
      POSTGRES_PASSWORD: ${DB_PASSWORD}
    volumes:
      - postgres_data:/var/lib/postgresql/data
    ports:
      - "5432:5432"
    healthcheck:
      test: ["CMD-SHELL", "pg_isready -U app -d mockupaws"]

  backend:
    build: ./backend
    environment:
      DATABASE_URL: postgresql+asyncpg://app:${DB_PASSWORD}@postgres:5432/mockupaws
    ports:
      - "8000:8000"
    depends_on:
      postgres:
        condition: service_healthy

  frontend:
    build: ./frontend
    ports:
      - "3000:80"
    depends_on:
      - backend

volumes:
  postgres_data:
```

### 12.2 Production Considerations

- Use managed PostgreSQL (AWS RDS, Azure PostgreSQL)
- Nginx as reverse proxy with SSL
- Environment-specific configuration
- Log aggregation (ELK or similar)
- Monitoring (Prometheus + Grafana)
- Health checks and readiness probes

---

## 13. Implementation Status & Changelog

### v0.2.0 - Backend Core ✅ COMPLETED

**Database Layer:**
- ✅ PostgreSQL 15 with 5 tables (scenarios, logs, metrics, pricing, reports)
- ✅ 6 Alembic migrations (including AWS pricing seed data)
- ✅ SQLAlchemy 2.0 async models with relationships
- ✅ Indexes and constraints optimized

**Backend API:**
- ✅ FastAPI application with structured routing
- ✅ Scenario CRUD endpoints (POST, GET, PUT, DELETE)
- ✅ Ingest API with PII detection
- ✅ Metrics API with cost calculation
- ✅ Repository pattern implementation
- ✅ Service layer (PII detector, Cost calculator, Ingest service)
- ✅ Exception handlers and validation

**Data Processing:**
- ✅ SHA-256 message hashing for deduplication
- ✅ Email PII detection with regex
- ✅ AWS cost calculation (SQS, Lambda, Bedrock)
- ✅ Token counting with tiktoken

### v0.3.0 - Frontend Implementation ✅ COMPLETED

**React Application:**
- ✅ Vite + TypeScript + React 18 setup
- ✅ Tailwind CSS integration
- ✅ shadcn/ui components (Button, Card, Dialog, Input, Label, Table, Textarea, Toast)
- ✅ Lucide React icons

**State Management:**
- ✅ TanStack Query (React Query) v5 for server state
- ✅ Axios HTTP client with interceptors
- ✅ Error handling with toast notifications

**Pages & Routing:**
- ✅ Dashboard - Scenarios list with pagination
- ✅ ScenarioDetail - View and edit scenarios
- ✅ ScenarioEdit - Create and edit form
- ✅ React Router v6 navigation

**API Integration:**
- ✅ TypeScript types for all API responses
- ✅ Custom hooks for data fetching (useScenarios, useCreateScenario, useUpdateScenario)
- ✅ Loading states and error boundaries
- ✅ Responsive design

**Docker & DevOps:**
- ✅ Docker Compose with PostgreSQL service
- ✅ Health checks for database
- ✅ Dockerfile for backend (production ready)
- ✅ Dockerfile for frontend (multi-stage build)
- ✅ Environment configuration

### v0.4.0 - Reports, Charts & Comparison ✅ COMPLETED (2026-04-07)

**Backend Features:**
- ✅ Report generation (PDF/CSV) with ReportLab and
Pandas
- ✅ Report storage and download API
- ✅ Rate limiting for report downloads (10/min)
- ✅ Automatic cleanup of old reports

**Frontend Features:**
- ✅ Interactive charts with Recharts (Pie, Area, Bar)
- ✅ Cost Breakdown chart in Scenario Detail
- ✅ Time Series chart for metrics
- ✅ Comparison Bar Chart for scenario compare
- ✅ Dark/Light mode toggle with system preference detection
- ✅ Scenario comparison page (2-4 scenarios side-by-side)
- ✅ Comparison tables with delta indicators
- ✅ Report generation UI (PDF/CSV)

**Testing:**
- ✅ E2E testing suite with Playwright
- ✅ 100 test cases covering all features
- ✅ Multi-browser support (Chromium, Firefox)
- ✅ Visual regression testing

**Technical:**
- ✅ next-themes for theme management
- ✅ Tailwind dark mode configuration
- ✅ Radix UI components (Tabs, Checkbox, Select)
- ✅ Responsive charts with theme adaptation

### v0.5.0 - Authentication & API Keys 🔄 IN PROGRESS

**Authentication & Authorization:**
- ✅ Database migrations (users, api_keys tables)
- ✅ JWT implementation (HS256, 30min access, 7days refresh)
- ✅ bcrypt password hashing (cost=12)
- ✅ Token rotation on refresh
- 🔄 Auth API endpoints (/auth/*)
- 🔄 API Keys service (generation, validation, hashing)
- 🔄 API Keys endpoints (/api-keys/*)
- ⏳ Protected route middleware
- ⏳ Frontend auth integration

**Security:**
- ✅ JWT secret configuration (≥32 chars)
- ✅ API key hashing (SHA-256)
- ✅ Rate limiting configuration
- ✅ CORS policy
- 🔄 Security documentation (SECURITY.md)
- ⏳ Input validation hardening
- ⏳ Security headers middleware

**Report Scheduling:**
- ⏳ Database migration (report_schedules table)
- ⏳ Scheduler service
- ⏳ Cron job runner
- ⏳ Email service (SendGrid/SES)
- ⏳ Schedule API endpoints

### v1.0.0 - Production Ready ⏳ PLANNED

**Infrastructure:**
- ⏳ Full Docker Compose stack (backend + frontend + nginx)
- ⏳ SSL/TLS configuration
- ⏳ Database backup automation
- ⏳ Monitoring and logging

**Documentation:**
- ⏳ Complete OpenAPI specification
- ⏳
User guide
- ⏳ API reference

---

## 14. Testing Status

### Current Coverage (v0.4.0)

| Layer | Type | Status | Coverage |
|-------|------|--------|----------|
| Backend Unit | pytest | ✅ Implemented | ~60% |
| Backend Integration | pytest | ✅ Implemented | All endpoints |
| Frontend Unit | Vitest | 🔄 Partial | Key components |
| E2E | Playwright | ✅ Implemented | 100 tests |

**E2E Test Results:**
- Total tests: 100
- Passing: 100
- Browsers: Chromium, Firefox
- Features covered: Scenarios, Reports, Comparison, Dark Mode

### Test Files

```
tests/
├── __init__.py
├── conftest.py  # Fixtures
├── unit/
│   ├── test_main.py  # Basic app tests (v0.1)
│   ├── test_services.py  # Service logic tests (planned)
│   └── test_cost_calculator.py
├── integration/
│   ├── test_api_scenarios.py
│   ├── test_api_ingest.py
│   └── test_api_metrics.py
└── e2e/
    └── test_full_flow.py  # Complete user journey
```

---

## 15. Known Limitations & Technical Debt

### Current (v0.5.0) - In Progress

1. **Authentication Implementation**: JWT and API keys being implemented
2. **No Caching**: Every request hits database (Redis planned v1.0.0)
3. **Limited Frontend Unit Tests**: Vitest coverage partial
4. **Email Service**: Configuration required for notifications
5. **HTTPS**: Requires production deployment setup

### Resolved in v0.4.0

- ✅ Report generation with PDF/CSV export
- ✅ Interactive charts with Recharts
- ✅ Scenario comparison feature
- ✅ Dark/Light mode toggle
- ✅ E2E testing with Playwright (100 tests)
- ✅ Rate limiting for report downloads

### Resolved in v0.3.0

- ✅ Database connection pooling
- ✅ Async SQLAlchemy implementation
- ✅ React Query for efficient data fetching
- ✅ Error handling with user-friendly messages
- ✅ Docker setup for consistent development

---

*Document created by @spec-architect*
*Version: 1.3*
*Last updated: 2026-04-07*
*Status: v0.5.0 In Development*