mockupAWS/export/architecture.md
Luca Sacchi Ricciardi 311a576f40 docs: update documentation and add Docker configuration for v0.3.0
- Update README.md with v0.3.0 completion status and improved setup instructions
- Update export/progress.md with completed tasks (53/55, 96% progress)
- Update export/architecture.md with current project structure and implementation status
- Add docker-compose.yml with PostgreSQL service
- Add Dockerfile.backend for production builds
- Add frontend/Dockerfile for multi-stage builds
- Update .gitignore with comprehensive rules for Python, Node.js, and Docker

Project status:
- v0.2.0: Database and Backend API 
- v0.3.0: Frontend React implementation 
- v0.4.0: Reports and visualization (planned)
2026-04-07 15:17:15 +02:00

# Architecture - mockupAWS
## 1. Overview
mockupAWS is an AWS cost-simulation platform that profiles log traffic and calculates the cost drivers (SQS, Lambda, Bedrock/LLM) before deploying to production.
**Architecture:** Layered architecture with the Repository and Service Layer patterns
**Paradigm:** Async-first (FastAPI + SQLAlchemy async)
**Deployment:** Container-based (Docker Compose)
---
## 2. System Architecture
### 2.1 High-Level Architecture
```
┌─────────────────────────────────────────────────────────────────────────────┐
│ CLIENT LAYER │
│ ┌──────────────────┐ ┌──────────────────┐ ┌──────────────────────────┐ │
│ │ Logstash │ │ React Web UI │ │ API Consumers │ │
│ │ (Log Source) │ │ (Dashboard) │ │ (CI/CD, Scripts) │ │
│ └────────┬─────────┘ └────────┬─────────┘ └───────────┬──────────────┘ │
└───────────┼─────────────────────┼────────────────────────┼───────────────────┘
│ │ │
│ HTTP POST │ HTTPS │ API Key + JWT
│ /ingest │ /api/v1/* │ /api/v1/*
▼ ▼ ▼
┌─────────────────────────────────────────────────────────────────────────────┐
│ API LAYER │
│ FastAPI + Uvicorn (ASGI) │
│ ┌──────────────────────────────────────────────────────────────────────┐ │
│ │ Middleware Stack │ │
│ │ ├── CORS │ │
│ │ ├── Rate Limiting (slowapi) │ │
│ │ ├── Authentication (JWT / API Key) │ │
│ │ ├── Request Validation (Pydantic) │ │
│ │ └── Error Handling │ │
│ └──────────────────────────────────────────────────────────────────────┘ │
│ │
│ ┌──────────────┐ ┌──────────────┐ ┌──────────────┐ ┌──────────────────┐ │
│ │ /scenarios │ │ /ingest │ │ /reports │ │ /pricing │ │
│ │ CRUD │ │ (log │ │ generate │ │ (admin) │ │
│ │ │ │ intake) │ │ download │ │ │ │
│ └──────┬───────┘ └──────┬───────┘ └──────┬───────┘ └────────┬─────────┘ │
└─────────┼────────────────┼────────────────┼──────────────────┼─────────────┘
│ │ │ │
▼ ▼ ▼ ▼
┌─────────────────────────────────────────────────────────────────────────────┐
│ SERVICE LAYER │
│ ┌──────────────────┐ ┌──────────────────┐ ┌──────────────────────────────┐ │
│ │ ScenarioService │ │ IngestService │ │ CostCalculator │ │
│ │ ─────────────── │ │ ────────────── │ │ ───────────── │ │
│ │ • create() │ │ • ingest_log() │ │ • calculate_sqs_cost() │ │
│ │ • update() │ │ • batch_process()│ │ • calculate_lambda_cost() │ │
│ │ • delete() │ │ • deduplicate() │ │ • calculate_bedrock_cost() │ │
│ │ • lifecycle() │ │ • persist() │ │ • get_total_cost() │ │
│ └──────────────────┘ └──────────────────┘ └──────────────────────────────┘ │
│ ┌──────────────────┐ ┌──────────────────┐ ┌──────────────────────────────┐ │
│ │ ReportService │ │ PIIDetector │ │ TokenizerService │ │
│ │ ────────────── │ │ ─────────── │ │ ─────────────── │ │
│ │ • generate_csv()│ │ • detect_email()│ │ • count_tokens() │ │
│ │ • generate_pdf()│ │ • scan_patterns()│ │ • encode() │ │
│ │ • compile() │ │ • report() │ │ • get_encoding() │ │
│ └──────────────────┘ └──────────────────┘ └──────────────────────────────┘ │
└─────────┬──────────────────────────────────────────────────────┬────────────┘
│ │
▼ ▼
┌─────────────────────────────────────────────────────────────────────────────┐
│ REPOSITORY LAYER │
│ ┌──────────────────┐ ┌──────────────────┐ ┌──────────────────────────────┐ │
│ │ ScenarioRepo │ │ LogRepo │ │ PricingRepo │ │
│ │ ───────────── │ │ ─────── │ │ ────────── │ │
│ │ • get_by_id() │ │ • save() │ │ • get_by_service_region() │ │
│ │ • list() │ │ • list_by_ │ │ • list_active() │ │
│ │ • create() │ │ scenario() │ │ • update() │ │
│ │ • update() │ │ • count_by_ │ │ • bulk_insert() │ │
│ │ • delete() │ │ hash() │ │ │ │
│ └──────────────────┘ └──────────────────┘ └──────────────────────────────┘ │
│ ┌──────────────────┐ ┌──────────────────┐ │
│ │ MetricRepo │ │ ReportRepo │ │
│ │ ────────── │ │ ────────── │ │ │
│ │ • save() │ │ • save() │ │ │
│ │ • get_aggregated│ │ • list() │ │ │
│ │ • list_by_type()│ │ • delete() │ │ │
│ └──────────────────┘ └──────────────────┘ │
└─────────────────────────────────────────────────────────────────────────────┘
│ SQLAlchemy 2.0 Async
│ asyncpg driver
┌─────────────────────────────────────────────────────────────────────────────┐
│ DATABASE LAYER │
│ PostgreSQL 15+ │
│ ┌──────────────────┐ ┌──────────────────┐ ┌──────────────────────────────┐ │
│ │ scenarios │ │ scenario_logs │ │ aws_pricing │ │
│ │ ───────── │ │ ───────────── │ │ ─────────── │ │
│ │ • metadata │ │ • logs storage │ │ • service prices │ │
│ │ • state machine │ │ • hash for dedup│ │ • history tracking │ │
│ │ • cost totals │ │ • PII flags │ │ • region-specific │ │
│ └──────────────────┘ └──────────────────┘ └──────────────────────────────┘ │
│ ┌──────────────────┐ ┌──────────────────┐ │
│ │ scenario_metrics│ │ reports │ │ │
│ │ ─────────────── │ │ ──────── │ │ │
│ │ • time-series │ │ • generated │ │ │
│ │ • aggregates │ │ • metadata │ │ │
│ │ • cost breakdown│ │ • file refs │ │ │
│ └──────────────────┘ └──────────────────┘ │
└─────────────────────────────────────────────────────────────────────────────┘
```
### 2.2 Layer Responsibilities
| Layer | Responsibility | Technologies |
|-------|----------------|--------------|
| **Client** | User interaction, log ingestion | Browser, Logstash, curl |
| **API** | Routing, validation, auth, middleware | FastAPI, Pydantic, slowapi |
| **Service** | Business logic, orchestration | Python async/await |
| **Repository** | Data access, query abstraction | SQLAlchemy 2.0 Repository pattern |
| **Database** | Persistenza, ACID, queries | PostgreSQL 15+ |
---
## 3. Database Schema
### 3.1 Entity Relationship Diagram
```
┌─────────────────────────────────────────────────────────────────────────┐
│ SCHEMA ERD │
└─────────────────────────────────────────────────────────────────────────┘
┌─────────────────────┐ ┌─────────────────────┐
│ scenarios │ │ aws_pricing │
├─────────────────────┤ ├─────────────────────┤
│ PK id: UUID │ │ PK id: UUID │
│ name: VARCHAR(255)│ │ service: VARCHAR │
│ description: TEXT│ │ region: VARCHAR │
│ tags: JSONB │ │ tier: VARCHAR │
│ status: ENUM │ │ price: DECIMAL │
│ region: VARCHAR │ │ unit: VARCHAR │
│ created_at: TS │ │ effective_from: D│
│ updated_at: TS │ │ effective_to: D │
│ completed_at: TS │ │ is_active: BOOL │
│ total_requests: INT│ │ source_url: TEXT │
│ total_cost: DEC │ └─────────────────────┘
└──────────┬──────────┘
│ 1:N
┌─────────────────────┐ ┌─────────────────────┐
│ scenario_logs │ │ scenario_metrics │
├─────────────────────┤ ├─────────────────────┤
│ PK id: UUID │ │ PK id: UUID │
│ FK scenario_id: UUID│ │ FK scenario_id: UUID│
│ received_at: TS │ │ timestamp: TS │
│ message_hash: V64│ │ metric_type: VAR │
│ message_preview │ │ metric_name: VAR │
│ source: VARCHAR │ │ value: DECIMAL │
│ size_bytes: INT │ │ unit: VARCHAR │
│ has_pii: BOOL │ │ metadata: JSONB │
│ token_count: INT │ └─────────────────────┘
│ sqs_blocks: INT │
└─────────────────────┘
│ 1:N (optional)
┌─────────────────────┐
│ reports │
├─────────────────────┤
│ PK id: UUID │
│ FK scenario_id: UUID│
│ format: ENUM │
│ file_path: TEXT │
│ generated_at: TS │
│ metadata: JSONB │
└─────────────────────┘
```
### 3.2 DDL - Schema Definition
```sql
-- ============================================
-- EXTENSIONS
-- ============================================
CREATE EXTENSION IF NOT EXISTS "uuid-ossp";
CREATE EXTENSION IF NOT EXISTS "pg_trgm"; -- For text search
-- ============================================
-- ENUMS
-- ============================================
CREATE TYPE scenario_status AS ENUM ('draft', 'running', 'completed', 'archived');
CREATE TYPE report_format AS ENUM ('pdf', 'csv');
-- ============================================
-- TABLE: scenarios
-- ============================================
CREATE TABLE scenarios (
id UUID PRIMARY KEY DEFAULT uuid_generate_v4(),
name VARCHAR(255) NOT NULL,
description TEXT,
tags JSONB DEFAULT '[]'::jsonb,
status scenario_status NOT NULL DEFAULT 'draft',
region VARCHAR(50) NOT NULL DEFAULT 'us-east-1',
created_at TIMESTAMP WITH TIME ZONE NOT NULL DEFAULT NOW(),
updated_at TIMESTAMP WITH TIME ZONE NOT NULL DEFAULT NOW(),
completed_at TIMESTAMP WITH TIME ZONE,
started_at TIMESTAMP WITH TIME ZONE,
total_requests INTEGER NOT NULL DEFAULT 0,
total_cost_estimate DECIMAL(12, 6) NOT NULL DEFAULT 0.000000,
-- Constraints
CONSTRAINT chk_name_not_empty CHECK (char_length(trim(name)) > 0),
CONSTRAINT chk_region_not_empty CHECK (char_length(trim(region)) > 0)
);
-- Indexes
CREATE INDEX idx_scenarios_status ON scenarios(status);
CREATE INDEX idx_scenarios_region ON scenarios(region);
CREATE INDEX idx_scenarios_created_at ON scenarios(created_at DESC);
CREATE INDEX idx_scenarios_tags ON scenarios USING GIN(tags);
-- Trigger for updated_at
CREATE OR REPLACE FUNCTION update_updated_at_column()
RETURNS TRIGGER AS $$
BEGIN
NEW.updated_at = NOW();
RETURN NEW;
END;
$$ LANGUAGE plpgsql;
CREATE TRIGGER update_scenarios_updated_at
BEFORE UPDATE ON scenarios
FOR EACH ROW
EXECUTE FUNCTION update_updated_at_column();
-- ============================================
-- TABLE: scenario_logs
-- ============================================
CREATE TABLE scenario_logs (
id UUID PRIMARY KEY DEFAULT uuid_generate_v4(),
scenario_id UUID NOT NULL REFERENCES scenarios(id) ON DELETE CASCADE,
received_at TIMESTAMP WITH TIME ZONE NOT NULL DEFAULT NOW(),
message_hash VARCHAR(64) NOT NULL, -- SHA256
message_preview VARCHAR(500),
source VARCHAR(100) DEFAULT 'unknown',
size_bytes INTEGER NOT NULL DEFAULT 0,
has_pii BOOLEAN NOT NULL DEFAULT FALSE,
token_count INTEGER NOT NULL DEFAULT 0,
sqs_blocks INTEGER NOT NULL DEFAULT 1,
-- Constraints
CONSTRAINT chk_size_positive CHECK (size_bytes >= 0),
CONSTRAINT chk_token_positive CHECK (token_count >= 0),
CONSTRAINT chk_blocks_positive CHECK (sqs_blocks >= 1)
);
-- Indexes
CREATE INDEX idx_logs_scenario_id ON scenario_logs(scenario_id);
CREATE INDEX idx_logs_received_at ON scenario_logs(received_at DESC);
CREATE INDEX idx_logs_message_hash ON scenario_logs(message_hash);
CREATE INDEX idx_logs_has_pii ON scenario_logs(has_pii) WHERE has_pii = TRUE;
-- ============================================
-- TABLE: scenario_metrics
-- ============================================
CREATE TABLE scenario_metrics (
id UUID PRIMARY KEY DEFAULT uuid_generate_v4(),
scenario_id UUID NOT NULL REFERENCES scenarios(id) ON DELETE CASCADE,
timestamp TIMESTAMP WITH TIME ZONE NOT NULL DEFAULT NOW(),
metric_type VARCHAR(50) NOT NULL, -- 'sqs', 'lambda', 'bedrock', 'safety'
metric_name VARCHAR(100) NOT NULL,
value DECIMAL(15, 6) NOT NULL DEFAULT 0.000000,
unit VARCHAR(20) NOT NULL, -- 'count', 'bytes', 'tokens', 'usd', 'invocations'
metadata JSONB DEFAULT '{}'::jsonb
);
-- Indexes
CREATE INDEX idx_metrics_scenario_id ON scenario_metrics(scenario_id);
CREATE INDEX idx_metrics_timestamp ON scenario_metrics(timestamp DESC);
CREATE INDEX idx_metrics_type ON scenario_metrics(metric_type);
CREATE INDEX idx_metrics_scenario_type ON scenario_metrics(scenario_id, metric_type);
-- ============================================
-- TABLE: aws_pricing
-- ============================================
CREATE TABLE aws_pricing (
id UUID PRIMARY KEY DEFAULT uuid_generate_v4(),
service VARCHAR(50) NOT NULL, -- 'sqs', 'lambda', 'bedrock'
region VARCHAR(50) NOT NULL,
tier VARCHAR(50) NOT NULL DEFAULT 'standard',
price_per_unit DECIMAL(15, 10) NOT NULL,
unit VARCHAR(20) NOT NULL, -- 'per_million_requests', 'per_gb_second', 'per_1k_tokens'
effective_from DATE NOT NULL DEFAULT CURRENT_DATE,
effective_to DATE,
is_active BOOLEAN NOT NULL DEFAULT TRUE,
source_url VARCHAR(500),
description TEXT,
-- Constraints
CONSTRAINT chk_price_positive CHECK (price_per_unit >= 0),
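CONSTRAINT chk_valid_dates CHECK (effective_to IS NULL OR effective_to >= effective_from)
);
-- Uniqueness among active rows only (a table-level UNIQUE constraint cannot take a WHERE clause)
CREATE UNIQUE INDEX uq_pricing_unique_active ON aws_pricing(service, region, tier, effective_from)
WHERE is_active = TRUE;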
-- Indexes
CREATE INDEX idx_pricing_service ON aws_pricing(service);
CREATE INDEX idx_pricing_region ON aws_pricing(region);
CREATE INDEX idx_pricing_active ON aws_pricing(service, region, tier) WHERE is_active = TRUE;
-- ============================================
-- TABLE: reports
-- ============================================
CREATE TABLE reports (
id UUID PRIMARY KEY DEFAULT uuid_generate_v4(),
scenario_id UUID NOT NULL REFERENCES scenarios(id) ON DELETE CASCADE,
format report_format NOT NULL,
file_path VARCHAR(500) NOT NULL,
file_size_bytes INTEGER,
generated_at TIMESTAMP WITH TIME ZONE NOT NULL DEFAULT NOW(),
generated_by VARCHAR(100), -- user_id or api_key_id
metadata JSONB DEFAULT '{}'::jsonb
);
-- Indexes
CREATE INDEX idx_reports_scenario_id ON reports(scenario_id);
CREATE INDEX idx_reports_generated_at ON reports(generated_at DESC);
```
### 3.3 Key Queries
```sql
-- Query: Get scenario with aggregated metrics
SELECT
s.*,
COUNT(DISTINCT sl.id) as total_logs,
COUNT(DISTINCT CASE WHEN sl.has_pii THEN sl.id END) as pii_violations,
SUM(sl.token_count) as total_tokens,
SUM(sl.sqs_blocks) as total_sqs_blocks
FROM scenarios s
LEFT JOIN scenario_logs sl ON s.id = sl.scenario_id
WHERE s.id = :scenario_id
GROUP BY s.id;
-- Query: Get cost breakdown by service
SELECT
metric_type,
SUM(value) as total_value,
unit
FROM scenario_metrics
WHERE scenario_id = :scenario_id
AND metric_name LIKE '%cost%'
GROUP BY metric_type, unit;
-- Query: Get active pricing for service/region
SELECT *
FROM aws_pricing
WHERE service = :service
AND region = :region
AND is_active = TRUE
AND (effective_to IS NULL OR effective_to >= CURRENT_DATE)
ORDER BY effective_from DESC
LIMIT 1;
```
---
## 4. API Specifications
### 4.1 OpenAPI Overview
```yaml
openapi: 3.0.0
info:
  title: mockupAWS API
  version: 0.3.0
  description: AWS Cost Simulation Platform API
servers:
  - url: http://localhost:8000/api/v1
    description: Development server
security:
  - BearerAuth: []
  - ApiKeyAuth: []
```
### 4.2 Endpoints
#### Scenarios API
```yaml
# POST /scenarios - Create new scenario
request:
  content:
    application/json:
      schema:
        type: object
        required: [name, region]
        properties:
          name:
            type: string
            minLength: 1
            maxLength: 255
          description:
            type: string
          tags:
            type: array
            items:
              type: string
          region:
            type: string
            enum: [us-east-1, us-west-2, eu-west-1, eu-central-1]
          tier:
            type: string
            enum: [standard, on-demand]
            default: standard
response:
  201:
    content:
      application/json:
        schema:
          $ref: '#/components/schemas/Scenario'
# GET /scenarios - List scenarios
parameters:
  - name: status
    in: query
    schema:
      type: string
      enum: [draft, running, completed, archived]
  - name: region
    in: query
    schema:
      type: string
  - name: page
    in: query
    schema:
      type: integer
      default: 1
  - name: page_size
    in: query
    schema:
      type: integer
      default: 20
      maximum: 100
response:
  200:
    content:
      application/json:
        schema:
          type: object
          properties:
            items:
              type: array
              items:
                $ref: '#/components/schemas/Scenario'
            total:
              type: integer
            page:
              type: integer
            page_size:
              type: integer
# GET /scenarios/{id} - Get scenario details
# PUT /scenarios/{id} - Update scenario
# DELETE /scenarios/{id} - Delete scenario
# POST /scenarios/{id}/start - Start scenario
# POST /scenarios/{id}/stop - Stop scenario
# POST /scenarios/{id}/archive - Archive scenario
```
#### Ingest API
```yaml
# POST /ingest - Ingest log
headers:
  X-Scenario-ID:
    required: true
    schema:
      type: string
      format: uuid
request:
  content:
    application/json:
      schema:
        type: object
        required: [message]
        properties:
          message:
            type: string
            minLength: 1
          source:
            type: string
            default: unknown
response:
  202:
    description: Log accepted
    content:
      application/json:
        schema:
          type: object
          properties:
            status:
              type: string
              example: accepted
            log_id:
              type: string
              format: uuid
            estimated_cost_impact:
              type: number
  400:
    description: Invalid scenario or scenario not running
```
#### Metrics API
```yaml
# GET /scenarios/{id}/metrics - Get scenario metrics
response:
  200:
    content:
      application/json:
        schema:
          type: object
          properties:
            scenario_id:
              type: string
            summary:
              type: object
              properties:
                total_requests:
                  type: integer
                total_cost_usd:
                  type: number
                sqs_blocks:
                  type: integer
                lambda_invocations:
                  type: integer
                llm_tokens:
                  type: integer
                pii_violations:
                  type: integer
            cost_breakdown:
              type: array
              items:
                type: object
                properties:
                  service:
                    type: string
                  cost_usd:
                    type: number
                  percentage:
                    type: number
            timeseries:
              type: array
              items:
                type: object
                properties:
                  timestamp:
                    type: string
                    format: date-time
                  metric_type:
                    type: string
                  value:
                    type: number
```
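The `cost_breakdown` array pairs each service with its cost and its share of the total. One way to derive it from per-service totals is sketched below. The function name `cost_breakdown`, the two-decimal rounding and the alphabetical ordering are assumptions, not part of the specification above.

```python
from decimal import Decimal


def cost_breakdown(costs: dict[str, Decimal]) -> list[dict]:
    """Turn {service: cost_usd} into rows shaped like the cost_breakdown items."""
    total = sum(costs.values(), Decimal(0))
    rows = []
    for service, cost in sorted(costs.items()):
        # Guard against division by zero when nothing has been ingested yet.
        percentage = (cost / total * 100) if total else Decimal(0)
        rows.append({
            "service": service,
            "cost_usd": float(cost),
            "percentage": float(round(percentage, 2)),
        })
    return rows
```

Costs are kept as `Decimal` until serialization to avoid accumulating float rounding errors on very small per-request amounts.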
#### Reports API
```yaml
# POST /scenarios/{id}/reports - Generate report
request:
  content:
    application/json:
      schema:
        type: object
        required: [format]
        properties:
          format:
            type: string
            enum: [pdf, csv]
          include_logs:
            type: boolean
            default: false
          date_from:
            type: string
            format: date-time
          date_to:
            type: string
            format: date-time
response:
  202:
    description: Report generation started
    content:
      application/json:
        schema:
          type: object
          properties:
            report_id:
              type: string
            status:
              type: string
              enum: [pending, processing, completed]
            download_url:
              type: string
# GET /reports/{id}/download - Download report
# GET /reports/{id}/status - Check report status
```
#### Pricing API (Admin)
```yaml
# GET /pricing - List pricing
# POST /pricing - Create pricing entry
# PUT /pricing/{id} - Update pricing
# DELETE /pricing/{id} - Delete pricing (soft delete)
```
### 4.3 Schemas
```yaml
components:
  schemas:
    Scenario:
      type: object
      properties:
        id:
          type: string
          format: uuid
        name:
          type: string
        description:
          type: string
        tags:
          type: array
          items:
            type: string
        status:
          type: string
          enum: [draft, running, completed, archived]
        region:
          type: string
        created_at:
          type: string
          format: date-time
        updated_at:
          type: string
          format: date-time
        completed_at:
          type: string
          format: date-time
        total_requests:
          type: integer
        total_cost_estimate:
          type: number
    LogEntry:
      type: object
      properties:
        id:
          type: string
          format: uuid
        scenario_id:
          type: string
          format: uuid
        received_at:
          type: string
          format: date-time
        message_hash:
          type: string
        message_preview:
          type: string
        source:
          type: string
        size_bytes:
          type: integer
        has_pii:
          type: boolean
        token_count:
          type: integer
        sqs_blocks:
          type: integer
  securitySchemes:
    BearerAuth:
      type: http
      scheme: bearer
      bearerFormat: JWT
    ApiKeyAuth:
      type: apiKey
      in: header
      name: X-API-Key
```
---
## 5. Data Flow
### 5.1 Log Ingestion Flow
```
┌──────────┐ POST /ingest ┌──────────────┐
│ Client │ ───────────────────────>│ FastAPI │
│(Logstash)│ Headers: │ Middleware │
│ │ X-Scenario-ID: uuid │ │
└──────────┘ └──────┬───────┘
│ 1. Validate scenario exists & running
│ 2. Parse JSON payload
┌──────────────┐
│ Ingest │
│ Service │
└──────┬───────┘
┌───────────────────────┼───────────────────────┐
│ │ │
▼ ▼ ▼
┌──────────────┐ ┌──────────────┐ ┌──────────────┐
│ PII Detector │ │ SQS Calculator│ │ Tokenizer │
│ • check email│ │ • calc blocks │ │ • count │
└──────┬───────┘ └──────┬───────┘ └──────┬───────┘
│ │ │
│ has_pii: bool │ sqs_blocks: int │ tokens: int
└──────────────────────┼─────────────────────┘
┌──────────────┐
│ LogRepo │
│ save() │
└──────┬───────┘
┌──────────────┐
│ PostgreSQL │
│ scenario_logs│
└──────────────┘
```
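The fan-out in the flow above (PII check, SQS block count, token count) can be sketched as a single pure function that derives the `scenario_logs` columns for one message. This is only an illustration under stated assumptions: `LogRecord` and `build_log_record` are hypothetical names, the 64 KiB block size follows SQS's billing of each 64 KB payload chunk as one request, only the email pattern is checked, and the default whitespace token counter is a stand-in for a real tokenizer such as tiktoken.

```python
import hashlib
import math
import re
from dataclasses import dataclass

SQS_BLOCK_BYTES = 64 * 1024  # assumption: one billable block per 64 KiB of payload
PREVIEW_CHARS = 500          # matches message_preview VARCHAR(500)
EMAIL_RE = re.compile(r"\b[A-Za-z0-9._%+-]+@[A-Za-z0-9.-]+\.[A-Za-z]{2,}\b")


@dataclass(frozen=True)
class LogRecord:
    message_hash: str
    message_preview: str
    size_bytes: int
    has_pii: bool
    token_count: int
    sqs_blocks: int


def build_log_record(message: str, count_tokens=lambda text: len(text.split())) -> LogRecord:
    """Derive the stored columns for one message.

    count_tokens is a hypothetical hook; the default whitespace split is only
    a placeholder for a real tokenizer.
    """
    raw = message.encode("utf-8")
    return LogRecord(
        message_hash=hashlib.sha256(raw).hexdigest(),
        message_preview=message[:PREVIEW_CHARS],
        size_bytes=len(raw),
        has_pii=EMAIL_RE.search(message) is not None,
        token_count=count_tokens(message),
        # At least one block, consistent with the chk_blocks_positive constraint.
        sqs_blocks=max(1, math.ceil(len(raw) / SQS_BLOCK_BYTES)),
    )
```

Keeping this step free of I/O means it can be tested without a database, with persistence left to `LogRepo.save()`.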
### 5.2 Scenario State Machine
```
┌─────────────────────────────────────────────────────────┐
│ │
▼ │
┌──────────┐ POST /start ┌──────────┐ │
┌───────│ DRAFT │────────────────────>│ RUNNING │ │
│ └──────────┘ └────┬─────┘ │
│ ▲ │ │
│ │ │ POST /stop │
│ │ POST /archive ▼ │
│ │ ┌──────────┐ │
│ ┌────┴────┐<────────────────────│COMPLETED │──────────────────┘
│ │ARCHIVED │ └──────────┘
└──────>└─────────┘
```
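The labelled transitions in the diagram can be expressed as a lookup table. The sketch below models only the three arrows that carry an endpoint label (`/start`, `/stop`, `/archive`); the unlabelled arrows are left out because their semantics are not stated. `ScenarioStatus`, `TRANSITIONS` and `next_status` are hypothetical names.

```python
from enum import Enum


class ScenarioStatus(str, Enum):
    DRAFT = "draft"
    RUNNING = "running"
    COMPLETED = "completed"
    ARCHIVED = "archived"


# (current status, action) -> next status; only the labelled arrows are modelled.
TRANSITIONS = {
    (ScenarioStatus.DRAFT, "start"): ScenarioStatus.RUNNING,
    (ScenarioStatus.RUNNING, "stop"): ScenarioStatus.COMPLETED,
    (ScenarioStatus.COMPLETED, "archive"): ScenarioStatus.ARCHIVED,
}


def next_status(current: ScenarioStatus, action: str) -> ScenarioStatus:
    """Return the status reached by applying action, or raise ValueError."""
    try:
        return TRANSITIONS[(current, action)]
    except KeyError:
        raise ValueError(
            f"cannot {action!r} a scenario in status {current.value!r}"
        ) from None
```

A service method could map the `ValueError` to a 409 response so that, for example, stopping a draft scenario is rejected rather than silently ignored.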
### 5.3 Cost Calculation Flow
```
┌─────────────────────────────────────────────────────────────────────────┐
│ COST CALCULATION PIPELINE │
└─────────────────────────────────────────────────────────────────────────┘
Input: scenario_logs row
├─ sqs_blocks
├─ token_count
└─ (future: lambda_gb_seconds)
┌─────────────────┐
│ Pricing Service │
│ • get_active() │
└────────┬────────┘
│ Query: SELECT * FROM aws_pricing
│ WHERE service IN ('sqs', 'lambda', 'bedrock')
│ AND region = :scenario_region
│ AND is_active = TRUE
┌─────────────────────────────────────────────────────────────────────────┐
│ COST FORMULAS │
├─────────────────────────────────────────────────────────────────────────┤
│ │
│ SQS Cost: │
│ cost = blocks × price_per_million / 1,000,000 │
│ Example: 100 blocks × $0.40 / 1M = $0.00004 │
│ │
│ Lambda Cost: │
│ request_cost = invocations × price_per_million / 1,000,000 │
│ compute_cost = gb_seconds × price_per_gb_second │
│ total = request_cost + compute_cost │
│ Example: 1M invoc × $0.20/1M + 10GBs × $0.00001667 = $0.20 + $0.00017│
│ │
│ Bedrock Cost: │
│ input_cost = input_tokens × price_per_1k_input / 1,000 │
│ output_cost = output_tokens × price_per_1k_output / 1,000 │
│ total = input_cost + output_cost │
│ Example: 1000 tokens × $0.003/1K = $0.003 │
│ │
└─────────────────────────────────────────────────────────────────────────┘
┌─────────────────┐
│ Update │
│ scenarios │
│ total_cost │
└─────────────────┘
```
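The three formulas translate directly into code. The sketch below uses `Decimal` to match the `DECIMAL` columns in the schema; the function names are illustrative and are not taken from the actual `CostCalculator` service, and the prices in the usage note are the example values from the diagram.

```python
from decimal import Decimal

MILLION = Decimal(1_000_000)
THOUSAND = Decimal(1_000)


def sqs_cost(blocks: int, price_per_million: Decimal) -> Decimal:
    """cost = blocks x price_per_million / 1,000,000"""
    return Decimal(blocks) * price_per_million / MILLION


def lambda_cost(invocations: int, gb_seconds: Decimal,
                price_per_million: Decimal, price_per_gb_second: Decimal) -> Decimal:
    """Request cost plus compute cost."""
    request_cost = Decimal(invocations) * price_per_million / MILLION
    compute_cost = gb_seconds * price_per_gb_second
    return request_cost + compute_cost


def bedrock_cost(input_tokens: int, output_tokens: int,
                 price_per_1k_input: Decimal, price_per_1k_output: Decimal) -> Decimal:
    """Input-token cost plus output-token cost, both priced per 1,000 tokens."""
    input_cost = Decimal(input_tokens) * price_per_1k_input / THOUSAND
    output_cost = Decimal(output_tokens) * price_per_1k_output / THOUSAND
    return input_cost + output_cost
```

With the diagram's numbers, `sqs_cost(100, Decimal("0.40"))` gives 0.00004, and `lambda_cost(1_000_000, Decimal("10"), Decimal("0.20"), Decimal("0.00001667"))` gives 0.2001667, which the diagram shows rounded as $0.20 + $0.00017.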
---
## 6. Security Architecture
### 6.1 Authentication & Authorization
```
┌─────────────────────────────────────────────────────────────────┐
│ AUTHENTICATION LAYERS │
├─────────────────────────────────────────────────────────────────┤
│ │
│ Layer 1: API Key (Programmatic Access) │
│ ├─ Header: X-API-Key: <key> │
│ ├─ Rate limiting: 1000 req/min │
│ └─ Scope: /ingest, /metrics (read-only on other resources) │
│ │
│ Layer 2: JWT Token (Web UI Access) │
│ ├─ Header: Authorization: Bearer <jwt> │
│ ├─ Expiration: 24h │
│ ├─ Refresh token: 7d │
│ └─ Scope: Full access based on roles │
│ │
│ Layer 3: Role-Based Access Control (RBAC) │
│ ├─ admin: Full access │
│ ├─ user: CRUD own scenarios, read pricing │
│ └─ readonly: View only │
│ │
└─────────────────────────────────────────────────────────────────┘
```
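The role rules in Layer 3 could be checked with a small pure function like the one below. This is a sketch under explicit assumptions: `Principal` and `is_allowed` are hypothetical names, the schema in this document has no owner column so `owner_id` is illustrative, and the diagram does not say whether a `user` may read other users' scenarios (this sketch denies it).

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class Principal:
    user_id: str
    role: str  # "admin", "user" or "readonly"


def is_allowed(principal: Principal, action: str, resource: str,
               owner_id: Optional[str] = None) -> bool:
    """action is create/read/update/delete; resource is e.g. 'scenario' or 'pricing'."""
    if principal.role == "admin":
        return True  # full access
    if principal.role == "readonly":
        return action == "read"  # view only
    if principal.role == "user":
        if resource == "pricing":
            return action == "read"
        if resource == "scenario":
            # A new scenario has no owner yet; other actions require ownership (assumption).
            return action == "create" or owner_id == principal.user_id
    return False
```

In a FastAPI application a check like this would typically sit in a dependency that runs after JWT or API-key validation.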
### 6.2 Data Security
| Layer | Measure | Implementation |
|-------|---------|----------------|
| **Transport** | TLS 1.3 | Nginx reverse proxy |
| **Storage** | Hashing | SHA-256 for message_hash |
| **PII** | Detection + Truncation | Email regex, 500 char preview limit |
| **API** | Rate Limiting | slowapi: 100/min public, 1000/min authenticated |
| **DB** | Parameterized Queries | SQLAlchemy ORM (no raw SQL) |
| **Secrets** | Environment Variables | python-dotenv, Docker secrets |
### 6.3 PII Detection Strategy
```python
import re

# Pattern matching for common PII
def detect_pii(message: str) -> dict:
    patterns = {
        # The TLD part accepts letters only, at least two characters
        'email': r'\b[A-Za-z0-9._%+-]+@[A-Za-z0-9.-]+\.[A-Za-z]{2,}\b',
        'ssn': r'\b\d{3}-\d{2}-\d{4}\b',
        'credit_card': r'\b(?:\d[ -]*?){13,16}\b',
        'phone': r'\b\d{3}[-.]?\d{3}[-.]?\d{4}\b'
    }
    # Count matches per PII type, keeping only types that occur
    results = {}
    for pii_type, pattern in patterns.items():
        matches = re.findall(pattern, message)
        if matches:
            results[pii_type] = len(matches)
    return {
        'has_pii': len(results) > 0,
        'pii_types': list(results.keys()),
        'total_matches': sum(results.values())
    }
```
---
## 7. Technology Stack
### 7.1 Backend
| Component | Technology | Version | Purpose |
|-----------|------------|---------|---------|
| Framework | FastAPI | ≥0.110 | Web framework |
| Server | Uvicorn | ≥0.29 | ASGI server |
| Validation | Pydantic | ≥2.7 | Data validation |
| ORM | SQLAlchemy | ≥2.0 | Database ORM |
| Migrations | Alembic | latest | DB migrations |
| Driver | asyncpg | latest | Async PostgreSQL |
| Tokenizer | tiktoken | ≥0.6 | Token counting |
| Rate Limit | slowapi | latest | API rate limiting |
| Auth | python-jose | latest | JWT handling |
| Testing | pytest | ≥8.1 | Test framework |
| HTTP Client | httpx | ≥0.27 | Async HTTP |
### 7.2 Frontend (v0.3.0 Implemented)
| Component | Technology | Version | Purpose | Status |
|-----------|------------|---------|---------|--------|
| Framework | React | ≥18 | UI library | ✅ Implemented |
| Language | TypeScript | ≥5.0 | Type safety | ✅ Implemented |
| Build | Vite | ≥5.0 | Build tool | ✅ Implemented |
| Styling | Tailwind CSS | ≥3.4 | CSS framework | ✅ Implemented |
| Components | shadcn/ui | latest | UI components | ✅ 10+ components |
| Icons | Lucide React | latest | Icon library | ✅ Implemented |
| State | TanStack Query | ≥5.0 | Server state | ✅ React Query v5 |
| HTTP | Axios | ≥1.6 | HTTP client | ✅ With interceptors |
| Routing | React Router | ≥6.0 | Navigation | ✅ Implemented |
| Charts | Recharts | ≥2.0 | Data viz | 🔄 Planned v0.4.0 |
| Forms | React Hook Form | latest | Form management | 🔄 Planned v0.4.0 |
| Validation | Zod | latest | Schema validation | 🔄 Planned v0.4.0 |
**Note v0.3.0:**
- ✅ 3 pages complete: Dashboard, ScenarioDetail, ScenarioEdit
- ✅ 10+ shadcn/ui components integrated
- ✅ React Query for data fetching with caching
- ✅ Axios with error interceptors and toast notifications
- ✅ Responsive design with Tailwind CSS
- 🔄 Charts and advanced forms in v0.4.0
### 7.3 Infrastructure (v0.3.0 Status)
| Component | Technology | Purpose | Status |
|-----------|------------|---------|--------|
| Container | Docker | Application containers | ✅ PostgreSQL |
| Orchestration | Docker Compose | Multi-container dev | ✅ Dev setup |
| Database | PostgreSQL 15+ | Primary data store | ✅ Running |
| Reverse Proxy | Nginx | SSL, static files | 🔄 Planned v0.4.0 |
| Process Manager | systemd / PM2 | Production process mgmt | 🔄 Planned v1.0.0 |
**Docker Services:**
```yaml
# Current (v0.3.0)
- postgres: PostgreSQL 15 with healthcheck
  Status: ✅ Tested and running
  Ports: 5432:5432
  Volume: postgres_data (persistent)
# Planned (v1.0.0)
- backend: FastAPI production image
- frontend: Nginx serving React build
- nginx: Reverse proxy with SSL
```
---
## 8. Project Structure (v0.3.0 - Implemented)
```
mockupAWS/
├── src/ # Backend FastAPI (Root level)
│ ├── main.py # FastAPI app entry
│ ├── core/ # Core utilities
│ │ ├── config.py # Settings & env vars
│ │ ├── database.py # SQLAlchemy async config
│ │ └── exceptions.py # Custom exception handlers
│ ├── models/ # SQLAlchemy models (v0.2.0)
│ │ ├── __init__.py
│ │ ├── scenario.py
│ │ ├── scenario_log.py
│ │ ├── scenario_metric.py
│ │ ├── aws_pricing.py
│ │ └── report.py
│ ├── schemas/ # Pydantic schemas
│ │ ├── __init__.py
│ │ ├── scenario.py
│ │ ├── scenario_log.py
│ │ └── scenario_metric.py
│ ├── api/ # API routes
│ │ ├── deps.py # FastAPI dependencies (get_db)
│ │ └── v1/
│ │ ├── __init__.py # API router aggregation
│ │ ├── scenarios.py # CRUD endpoints (v0.2.0)
│ │ ├── ingest.py # Log ingestion (v0.2.0)
│ │ └── metrics.py # Metrics endpoints (v0.2.0)
│ ├── repositories/ # Repository pattern (v0.2.0)
│ │ ├── __init__.py
│ │ ├── base.py
│ │ ├── scenario.py
│ │ ├── scenario_log.py
│ │ ├── scenario_metric.py
│ │ └── aws_pricing.py
│ └── services/ # Business logic (v0.2.0)
│ ├── __init__.py
│ ├── pii_detector.py # PII detection service
│ ├── cost_calculator.py # AWS cost calculation
│ └── ingest_service.py # Log ingestion orchestration
├── frontend/ # Frontend React (v0.3.0)
│ ├── src/
│ │ ├── App.tsx # Root component with routing
│ │ ├── main.tsx # React entry point
│ │ ├── components/
│ │ │ ├── layout/ # Layout components
│ │ │ │ ├── Header.tsx
│ │ │ │ ├── Sidebar.tsx
│ │ │ │ └── Layout.tsx
│ │ │ └── ui/ # shadcn/ui components (v0.3.0)
│ │ │ ├── button.tsx
│ │ │ ├── card.tsx
│ │ │ ├── dialog.tsx
│ │ │ ├── input.tsx
│ │ │ ├── label.tsx
│ │ │ ├── table.tsx
│ │ │ ├── textarea.tsx
│ │ │ ├── toast.tsx
│ │ │ ├── toaster.tsx
│ │ │ └── sonner.tsx
│ │ ├── pages/ # Page components (v0.3.0)
│ │ │ ├── Dashboard.tsx # Scenarios list
│ │ │ ├── ScenarioDetail.tsx # Scenario view/edit
│ │ │ └── ScenarioEdit.tsx # Create/edit form
│ │ ├── hooks/ # React Query hooks (v0.3.0)
│ │ │ ├── useScenarios.ts
│ │ │ ├── useCreateScenario.ts
│ │ │ └── useUpdateScenario.ts
│ │ ├── lib/ # Utilities
│ │ │ ├── api.ts # Axios client config
│ │ │ ├── utils.ts # Utility functions
│ │ │ └── queryClient.ts # React Query config
│ │ └── types/
│ │ └── api.ts # TypeScript types
│ ├── package.json
│ ├── vite.config.ts
│ ├── tsconfig.json
│ ├── tailwind.config.js
│ ├── components.json # shadcn/ui config
│ └── Dockerfile # Production build
├── alembic/ # Database migrations (v0.2.0)
│ ├── versions/ # 6 migrations implemented
│ │ ├── 8c29fdcbbf85_create_scenarios_table.py
│ │ ├── e46de4b0264a_create_scenario_logs_table.py
│ │ ├── 5e247ed57b77_create_scenario_metrics_table.py
│ │ ├── 48f2231e7c12_create_aws_pricing_table.py
│ │ ├── e80c6eef58b2_create_reports_table.py
│ │ └── 0892c44b2a58_seed_aws_pricing_data.py
│ ├── env.py
│ └── alembic.ini
├── export/ # Project documentation
│ ├── prd.md # Product Requirements
│ ├── architecture.md # This file
│ ├── kanban.md # Task breakdown
│ └── progress.md # Progress tracking
├── .opencode/ # OpenCode team config
│ └── agents/ # 6 agent configurations
│ ├── spec-architect.md
│ ├── backend-dev.md
│ ├── db-engineer.md
│ ├── frontend-dev.md
│ ├── devops-engineer.md
│ └── qa-engineer.md
├── docker-compose.yml # PostgreSQL service
├── Dockerfile.backend # Backend production image
├── pyproject.toml # Python dependencies (uv)
├── uv.lock # Locked dependencies
├── .env # Environment variables
├── .gitignore # Git ignore rules
└── README.md # Project documentation
```
---
## 9. Architectural Decisions
### DEC-001: Async-First Architecture
**Decision:** Use Python async/await across the whole stack (FastAPI, SQLAlchemy, asyncpg)
**Rationale:**
- High throughput required (>1000 RPS)
- I/O-bound operations (DB, tokenizer)
- Better resource utilization than a synchronous stack
**Alternatives considered:**
- Sync + ThreadPool: simpler but less efficient
- Celery + Redis: too complex for this use case
**Consequences:**
- Learning curve for async
- More complex debugging
- Better scalability
---
### DEC-002: Repository Pattern
**Decision:** Implement the Repository pattern for data access
**Rationale:**
- Separation between business logic and data access
- Easy testing with mock repositories
- Option to change the database in the future
**Structure:**
```python
from typing import Generic, TypeVar
from uuid import UUID

T = TypeVar("T")  # model type handled by the repository

class BaseRepository(Generic[T]):
    async def get(self, id: UUID) -> T | None: ...
    async def list(self, **filters) -> list[T]: ...
    async def create(self, obj: T) -> T: ...
    async def update(self, id: UUID, data: dict) -> T: ...
    async def delete(self, id: UUID) -> bool: ...
```
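The testing benefit can be illustrated with a dictionary-backed repository that exposes the same five methods, so services can be exercised without PostgreSQL. This is a sketch: `InMemoryRepository` and the `Scenario` dataclass are hypothetical stand-ins, not the project's SQLAlchemy classes.

```python
from __future__ import annotations

import asyncio
from dataclasses import dataclass, field
from typing import Generic, TypeVar
from uuid import UUID, uuid4


@dataclass
class Scenario:
    """Hypothetical stand-in for the SQLAlchemy model."""
    name: str
    region: str = "us-east-1"
    id: UUID = field(default_factory=uuid4)


T = TypeVar("T")


class InMemoryRepository(Generic[T]):
    """Dictionary-backed repository with the BaseRepository method set."""

    def __init__(self) -> None:
        self._items: dict[UUID, T] = {}

    async def get(self, id: UUID) -> T | None:
        return self._items.get(id)

    async def list(self, **filters) -> list[T]:
        # Keep objects whose attributes equal every given filter value.
        return [
            obj for obj in self._items.values()
            if all(getattr(obj, key) == value for key, value in filters.items())
        ]

    async def create(self, obj: T) -> T:
        self._items[obj.id] = obj
        return obj

    async def update(self, id: UUID, data: dict) -> T:
        obj = self._items[id]
        for key, value in data.items():
            setattr(obj, key, value)
        return obj

    async def delete(self, id: UUID) -> bool:
        return self._items.pop(id, None) is not None
```

A test can then run `asyncio.run(...)` over a coroutine that creates a `Scenario`, reads it back and deletes it, with no database container required.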
---
### DEC-003: Single Shared Table Instead of Separate Database per Scenario
**Decision:** Use a single `scenario_logs` table with a `scenario_id` FK instead of separate databases
**Rationale:**
- Simpler to manage
- Cross-scenario queries are possible (comparisons)
- Simpler backup/restore
**Alternatives considered:**
- Schema per scenario: too much overhead
- Separate databases: too complex for an MVP
---
### DEC-004: Message Hashing for Deduplication
**Decision:** Use a SHA-256 hash of the message for deduplication
**Rationale:**
- Privacy: full messages are not stored
- Performance: indexed hash lookup (O(log n) via the B-tree index `idx_logs_message_hash`)
- Storage: saves space
**Implementation:**
```python
import hashlib
message_hash = hashlib.sha256(message.encode()).hexdigest()
```
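How the hash supports deduplication can be sketched with an in-memory set standing in for the `count_by_hash()` repository lookup. `Deduplicator` is a hypothetical name, and a real implementation would query the indexed `message_hash` column, most likely scoped to one scenario.

```python
import hashlib


def message_hash(message: str) -> str:
    """SHA-256 hex digest, 64 characters, as stored in message_hash."""
    return hashlib.sha256(message.encode("utf-8")).hexdigest()


class Deduplicator:
    """In-memory stand-in for a hash lookup against scenario_logs."""

    def __init__(self) -> None:
        self._seen: set[str] = set()

    def is_duplicate(self, message: str) -> bool:
        """Return True if this exact message was seen before, else record it."""
        digest = message_hash(message)
        if digest in self._seen:
            return True
        self._seen.add(digest)
        return False
```

Only exact byte-for-byte repeats are detected this way; messages that differ by a timestamp or request ID hash differently and are treated as distinct.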
---
### DEC-005: Time-Series Metrics
**Decision:** Store metrics as time series in `scenario_metrics`
**Rationale:**
- Enables trend analysis
- Flexible aggregations
- Audit trail
**Trade-off:**
- More storage than aggregated fields
- More complex queries, though indexed
---
## 10. Performance Considerations
### 10.1 Database Optimization
| Optimization | Implementation | Benefit |
|--------------|----------------|---------|
| Indexes | B-tree on foreign keys, timestamps | Fast lookups |
| GIN index | GIN on `tags` (JSONB) | Fast containment search on tag arrays |
| Partitioning | scenario_logs by date | Query pruning |
| Connection Pool | asyncpg pool (20-50) | Concurrency |
### 10.2 Caching Strategy (Future)
```
Layer 1: In-memory (FastAPI state)
├─ Active scenario metadata
└─ AWS pricing (rarely changes)
Layer 2: Redis (future)
├─ Session storage
├─ Rate limiting counters
└─ Report generation status
```
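
Layer 1 could be as simple as a small time-to-live cache held in process memory. The class below is a sketch under assumptions: `TTLCache` and `get_or_load` are hypothetical names, and the clock is injectable only to make expiry testable.

```python
import time
from typing import Any, Callable


class TTLCache:
    """Tiny in-process cache with per-entry expiry (illustrative only)."""

    def __init__(
        self,
        ttl_seconds: float,
        clock: Callable[[], float] = time.monotonic,
    ) -> None:
        self._ttl = ttl_seconds
        self._clock = clock
        # key -> (expiry time, cached value)
        self._entries: dict[str, tuple[float, Any]] = {}

    def get_or_load(self, key: str, loader: Callable[[], Any]) -> Any:
        """Return the cached value, calling loader() only on a miss or expiry."""
        now = self._clock()
        entry = self._entries.get(key)
        if entry is not None and entry[0] > now:
            return entry[1]
        value = loader()
        self._entries[key] = (now + self._ttl, value)
        return value
```

An instance could be attached to the FastAPI application (for example on `app.state`) so that rarely changing data such as AWS pricing is read from the database once per TTL window. Each worker process would hold its own copy, which is one reason a shared Redis layer is listed separately.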
### 10.3 Query Optimization
- Use `selectinload` for relationships
- Batch inserts for logs via PostgreSQL `COPY` (asyncpg `copy_records_to_table`; `copy_expert` is a psycopg2 API)
- Materialized views for reports
- Async tasks for heavy operations
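
The batching idea can be sketched independently of the driver. The example below uses SQLite's `executemany` purely so that it runs standalone; with asyncpg the per-batch call would be a bulk API instead. The helper names and the `message_hash` column are assumptions.

```python
import sqlite3
from itertools import islice
from typing import Iterable, Iterator


def chunked(rows: Iterable[tuple], size: int) -> Iterator[list[tuple]]:
    """Yield successive lists of at most `size` rows."""
    iterator = iter(rows)
    while batch := list(islice(iterator, size)):
        yield batch


def insert_in_batches(
    conn: sqlite3.Connection,
    rows: Iterable[tuple[str, str]],
    batch_size: int = 500,
) -> int:
    """Insert rows batch by batch and return the number of rows written."""
    inserted = 0
    for batch in chunked(rows, batch_size):
        with conn:  # commits the batch, or rolls it back on error
            conn.executemany(
                "INSERT INTO scenario_logs (scenario_id, message_hash) VALUES (?, ?)",
                batch,
            )
        inserted += len(batch)
    return inserted
```

Sending rows in groups reduces round trips compared with one `INSERT` per log line, and bounding the batch size keeps memory use and transaction length predictable.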
---
## 11. Error Handling Strategy
### 11.1 Exception Hierarchy
```python
class AppException(Exception):
    """Base application exception"""
    status_code: int = 500
    code: str = "internal_error"


class NotFoundException(AppException):
    status_code = 404
    code = "not_found"


class ValidationException(AppException):
    status_code = 400
    code = "validation_error"


class ConflictException(AppException):
    status_code = 409
    code = "conflict"


class RateLimitException(AppException):
    status_code = 429
    code = "rate_limited"
```
### 11.2 Global Exception Handler
```python
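from datetime import datetime, timezone

from fastapi import Request
from fastapi.responses import JSONResponse


# `app` is the FastAPI instance; AppException is defined in section 11.1.
@app.exception_handler(AppException)
async def app_exception_handler(request: Request, exc: AppException) -> JSONResponse:
    return JSONResponse(
        status_code=exc.status_code,
        content={
            "error": exc.code,
            "message": str(exc),
            # Timezone-aware UTC; datetime.utcnow() is deprecated since Python 3.12
            "timestamp": datetime.now(timezone.utc).isoformat(),
        },
    )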
```
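
How a domain exception travels from a service to the error payload can be sketched without the web framework. The classes below mirror a subset of the hierarchy in 11.1; `to_error_response` and `get_scenario` are hypothetical helpers introduced only for this example.

```python
from datetime import datetime, timezone


class AppException(Exception):
    """Base application exception (mirrors section 11.1)."""
    status_code: int = 500
    code: str = "internal_error"


class NotFoundException(AppException):
    status_code = 404
    code = "not_found"


def to_error_response(exc: AppException) -> tuple[int, dict[str, str]]:
    """Build the (status, body) pair that a global handler would return."""
    body = {
        "error": exc.code,
        "message": str(exc),
        "timestamp": datetime.now(timezone.utc).isoformat(),
    }
    return exc.status_code, body


def get_scenario(scenarios: dict[str, str], scenario_id: str) -> str:
    """Hypothetical service function that raises a domain exception."""
    try:
        return scenarios[scenario_id]
    except KeyError:
        raise NotFoundException(f"Scenario {scenario_id} not found") from None
```

Services raise typed exceptions and never build HTTP responses themselves; the mapping to a status code and JSON body stays in one place.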
---
## 12. Deployment Architecture
### 12.1 Docker Compose (Development)
```yaml
version: '3.8'
services:
  postgres:
    image: postgres:15-alpine
    environment:
      POSTGRES_DB: mockupaws
      POSTGRES_USER: app
      POSTGRES_PASSWORD: ${DB_PASSWORD}
    volumes:
      - postgres_data:/var/lib/postgresql/data
    ports:
      - "5432:5432"
    healthcheck:
      test: ["CMD-SHELL", "pg_isready -U app -d mockupaws"]
  backend:
    build: ./backend
    environment:
      DATABASE_URL: postgresql+asyncpg://app:${DB_PASSWORD}@postgres:5432/mockupaws
    ports:
      - "8000:8000"
    depends_on:
      postgres:
        condition: service_healthy
  frontend:
    build: ./frontend
    ports:
      - "3000:80"
    depends_on:
      - backend
volumes:
  postgres_data:
```
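
One caveat with the `DATABASE_URL` above: Compose substitutes `${DB_PASSWORD}` verbatim, so a password containing characters such as `@` or `/` would produce an ambiguous URL. A possible workaround, sketched here with a hypothetical `build_database_url` helper, is to assemble the URL in application code and percent-encode the credentials.

```python
from urllib.parse import quote


def build_database_url(
    password: str,
    user: str = "app",
    host: str = "postgres",
    port: int = 5432,
    database: str = "mockupaws",
) -> str:
    """Build the SQLAlchemy asyncpg URL with percent-encoded credentials."""
    # safe="" also encodes "/" so it cannot be mistaken for a path separator
    encoded_user = quote(user, safe="")
    encoded_password = quote(password, safe="")
    return f"postgresql+asyncpg://{encoded_user}:{encoded_password}@{host}:{port}/{database}"
```

The defaults match the service names and credentials used in the Compose file; only the password would come from the environment.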
### 12.2 Production Considerations
- Use managed PostgreSQL (AWS RDS, Azure Database for PostgreSQL)
- Nginx as reverse proxy with SSL
- Environment-specific configuration
- Log aggregation (ELK or similar)
- Monitoring (Prometheus + Grafana)
- Health checks and readiness probes
---
## 13. Implementation Status & Changelog
### v0.2.0 - Backend Core ✅ COMPLETED
**Database Layer:**
- ✅ PostgreSQL 15 with 5 tables (scenarios, logs, metrics, pricing, reports)
- ✅ 6 Alembic migrations (including AWS pricing seed data)
- ✅ SQLAlchemy 2.0 async models with relationships
- ✅ Indexes and constraints optimized
**Backend API:**
- ✅ FastAPI application with structured routing
- ✅ Scenario CRUD endpoints (POST, GET, PUT, DELETE)
- ✅ Ingest API with PII detection
- ✅ Metrics API with cost calculation
- ✅ Repository pattern implementation
- ✅ Service layer (PII detector, Cost calculator, Ingest service)
- ✅ Exception handlers and validation
**Data Processing:**
- ✅ SHA-256 message hashing for deduplication
- ✅ Email PII detection with regex
- ✅ AWS cost calculation (SQS, Lambda, Bedrock)
- ✅ Token counting with tiktoken
### v0.3.0 - Frontend Implementation ✅ COMPLETED
**React Application:**
- ✅ Vite + TypeScript + React 18 setup
- ✅ Tailwind CSS integration
- ✅ shadcn/ui components (Button, Card, Dialog, Input, Label, Table, Textarea, Toast)
- ✅ Lucide React icons
**State Management:**
- ✅ TanStack Query (React Query) v5 for server state
- ✅ Axios HTTP client with interceptors
- ✅ Error handling with toast notifications
**Pages & Routing:**
- ✅ Dashboard - Scenarios list with pagination
- ✅ ScenarioDetail - View and edit scenarios
- ✅ ScenarioEdit - Create and edit form
- ✅ React Router v6 navigation
**API Integration:**
- ✅ TypeScript types for all API responses
- ✅ Custom hooks for data fetching (useScenarios, useCreateScenario, useUpdateScenario)
- ✅ Loading states and error boundaries
- ✅ Responsive design
**Docker & DevOps:**
- ✅ Docker Compose with PostgreSQL service
- ✅ Health checks for database
- ✅ Dockerfile for backend (production ready)
- ✅ Dockerfile for frontend (multi-stage build)
- ✅ Environment configuration
### v0.4.0 - Reports & Visualization 🔄 PLANNED
**Features:**
- 🔄 Report generation (PDF/CSV)
- 🔄 Scenario comparison view
- 🔄 Interactive charts (Recharts)
- 🔄 Dark/Light mode toggle
- 🔄 Advanced form validation (React Hook Form + Zod)
### v1.0.0 - Production Ready ⏳ PLANNED
**Security:**
- ⏳ JWT authentication
- ⏳ API key management
- ⏳ Role-based access control
**Infrastructure:**
- ⏳ Full Docker Compose stack (backend + frontend + nginx)
- ⏳ SSL/TLS configuration
- ⏳ Database backup automation
- ⏳ Monitoring and logging
**Documentation:**
- ⏳ Complete OpenAPI specification
- ⏳ User guide
- ⏳ API reference
---
## 14. Testing Status
### Current Coverage (v0.3.0)
| Layer | Type | Status | Coverage |
|-------|------|--------|----------|
| Backend Unit | pytest | ✅ Basic | ~45% |
| Backend Integration | pytest | 🔄 Partial | Key endpoints |
| Frontend Unit | Vitest | ⏳ Planned | - |
| E2E | Playwright | ⏳ Planned | - |
### Test Files
```
tests/
├── __init__.py
├── conftest.py # Fixtures
├── unit/
│ ├── test_main.py # Basic app tests (v0.1)
│ ├── test_services.py # Service logic tests (planned)
│ └── test_cost_calculator.py
├── integration/
│ ├── test_api_scenarios.py
│ ├── test_api_ingest.py
│ └── test_api_metrics.py
└── e2e/
└── test_full_flow.py # Complete user journey
```
---
## 15. Known Limitations & Technical Debt
### Current (v0.3.0)
1. **No Authentication**: API is open (JWT planned v1.0.0)
2. **No Rate Limiting**: API endpoints unprotected (slowapi planned v0.4.0)
3. **Frontend Charts Missing**: Recharts integration pending
4. **Report Generation**: Backend ready but no UI
5. **No Caching**: Every request hits database (Redis planned v1.0.0)
6. **Limited Test Coverage**: Only basic tests from v0.1
### Resolved in v0.3.0
- ✅ Database connection pooling
- ✅ Async SQLAlchemy implementation
- ✅ React Query for efficient data fetching
- ✅ Error handling with user-friendly messages
- ✅ Docker setup for consistent development
---
*Document created by @spec-architect*
*Version: 1.1*
*Last updated: 2026-04-07*
*Status: v0.3.0 Completed*