mockupAWS/infrastructure/monitoring/grafana/dashboards/overview.json
Luca Sacchi Ricciardi 38fd6cb562
release: v1.0.0 - Production Ready
Complete production-ready release with all v1.0.0 features:

Architecture & Planning (@spec-architect):
- Production architecture design with scalability and HA
- Security audit plan and compliance review
- Technical debt assessment and refactoring roadmap

Database (@db-engineer):
- 17 performance indexes and 3 materialized views
- PgBouncer connection pooling
- Automated backup/restore with PITR (RTO<1h, RPO<5min)
- Data archiving strategy (~65% storage savings)

Backend (@backend-dev):
- Redis caching layer with 3-tier strategy
- Celery async jobs with Flower monitoring
- API v2 with rate limiting (tiered: free/premium/enterprise)
- Prometheus metrics and OpenTelemetry tracing
- Security hardening (headers, audit logging)

Frontend (@frontend-dev):
- Bundle optimization: 308KB (code splitting, lazy loading)
- Onboarding tutorial (react-joyride)
- Command palette (Cmd+K) and keyboard shortcuts
- Analytics dashboard with cost predictions
- i18n (English + Italian) and WCAG 2.1 AA compliance

DevOps (@devops-engineer):
- Complete deployment guide (Docker, K8s, AWS ECS)
- Terraform AWS infrastructure (Multi-AZ RDS, ElastiCache, ECS)
- CI/CD pipelines with blue-green deployment
- Prometheus + Grafana monitoring with 15+ alert rules
- SLA definition and incident response procedures

QA (@qa-engineer):
- 153+ E2E test cases (85% coverage)
- k6 performance tests (1000+ concurrent users, p95<200ms)
- Security testing (0 critical vulnerabilities)
- Cross-browser and mobile testing
- Official QA sign-off

Production Features:
- Horizontal scaling ready
- 99.9% uptime target
- <200ms response time (p95)
- Enterprise-grade security
- Complete observability
- Disaster recovery
- SLA monitoring

Ready for production deployment! 🚀
2026-04-07 20:14:51 +02:00


{
"dashboard": {
"id": null,
"uid": "mockupaws-overview",
"title": "mockupAWS - Overview",
"tags": ["mockupaws", "overview"],
"timezone": "UTC",
"schemaVersion": 36,
"version": 1,
"refresh": "30s",
"annotations": {
"list": [
{
"builtIn": 1,
"datasource": {
"type": "grafana",
"uid": "-- Grafana --"
},
"enable": true,
"hide": true,
"iconColor": "rgba(0, 211, 255, 1)",
"name": "Annotations & Alerts",
"type": "dashboard"
}
]
},
"templating": {
"list": [
{
"name": "environment",
"type": "constant",
"current": {
"value": "production",
"text": "production"
},
"hide": 0
},
{
"name": "service",
"type": "query",
"datasource": {
"type": "prometheus",
"uid": "prometheus"
},
"query": "label_values(up{job=~\"mockupaws-.*\"}, job)",
"refresh": 1,
"hide": 0
}
]
},
"panels": [
{
"id": 1,
"title": "Uptime (30d)",
"type": "stat",
"targets": [
{
"expr": "avg_over_time(up{job=\"mockupaws-backend\"}[30d]) * 100",
"legendFormat": "Uptime %",
"refId": "A"
}
],
"fieldConfig": {
"defaults": {
"unit": "percent",
"min": 99,
"max": 100,
"thresholds": {
"mode": "absolute",
"steps": [
{"color": "red", "value": null},
{"color": "yellow", "value": 99.9},
{"color": "green", "value": 99.95}
]
}
}
},
"gridPos": {"h": 4, "w": 4, "x": 0, "y": 0}
},
{
"id": 2,
"title": "Requests/sec",
"type": "stat",
"targets": [
{
"expr": "sum(rate(http_requests_total{job=\"mockupaws-backend\"}[5m]))",
"legendFormat": "RPS",
"refId": "A"
}
],
"fieldConfig": {
"defaults": {
"unit": "reqps"
}
},
"gridPos": {"h": 4, "w": 4, "x": 4, "y": 0}
},
{
"id": 3,
"title": "Error Rate",
"type": "stat",
"targets": [
{
"expr": "sum(rate(http_requests_total{job=\"mockupaws-backend\",status=~\"5..\"}[5m])) / sum(rate(http_requests_total{job=\"mockupaws-backend\"}[5m])) * 100",
"legendFormat": "Error %",
"refId": "A"
}
],
"fieldConfig": {
"defaults": {
"unit": "percent",
"thresholds": {
"mode": "absolute",
"steps": [
{"color": "green", "value": null},
{"color": "yellow", "value": 0.1},
{"color": "red", "value": 1}
]
}
}
},
"gridPos": {"h": 4, "w": 4, "x": 8, "y": 0}
},
{
"id": 4,
"title": "Latency p50",
"type": "stat",
"targets": [
{
"expr": "histogram_quantile(0.50, sum(rate(http_request_duration_seconds_bucket{job=\"mockupaws-backend\"}[5m])) by (le)) * 1000",
"legendFormat": "p50",
"refId": "A"
}
],
"fieldConfig": {
"defaults": {
"unit": "ms",
"thresholds": {
"mode": "absolute",
"steps": [
{"color": "green", "value": null},
{"color": "yellow", "value": 200},
{"color": "red", "value": 500}
]
}
}
},
"gridPos": {"h": 4, "w": 4, "x": 12, "y": 0}
},
{
"id": 5,
"title": "Latency p95",
"type": "stat",
"targets": [
{
"expr": "histogram_quantile(0.95, sum(rate(http_request_duration_seconds_bucket{job=\"mockupaws-backend\"}[5m])) by (le)) * 1000",
"legendFormat": "p95",
"refId": "A"
}
],
"fieldConfig": {
"defaults": {
"unit": "ms",
"thresholds": {
"mode": "absolute",
"steps": [
{"color": "green", "value": null},
{"color": "yellow", "value": 500},
{"color": "red", "value": 1000}
]
}
}
},
"gridPos": {"h": 4, "w": 4, "x": 16, "y": 0}
},
{
"id": 6,
"title": "Active Scenarios",
"type": "stat",
"targets": [
{
"expr": "scenarios_active_total",
"legendFormat": "Active",
"refId": "A"
}
],
"gridPos": {"h": 4, "w": 4, "x": 20, "y": 0}
},
{
"id": 7,
"title": "Request Rate Over Time",
"type": "timeseries",
"targets": [
{
"expr": "sum(rate(http_requests_total{job=\"mockupaws-backend\"}[5m])) by (status)",
"legendFormat": "{{status}}",
"refId": "A"
}
],
"fieldConfig": {
"defaults": {
"unit": "reqps"
}
},
"options": {
"legend": {
"displayMode": "table",
"placement": "right",
"calcs": ["mean", "max"]
}
},
"gridPos": {"h": 8, "w": 12, "x": 0, "y": 4}
},
{
"id": 8,
"title": "Response Time Percentiles",
"type": "timeseries",
"targets": [
{
"expr": "histogram_quantile(0.50, sum(rate(http_request_duration_seconds_bucket{job=\"mockupaws-backend\"}[5m])) by (le)) * 1000",
"legendFormat": "p50",
"refId": "A"
},
{
"expr": "histogram_quantile(0.95, sum(rate(http_request_duration_seconds_bucket{job=\"mockupaws-backend\"}[5m])) by (le)) * 1000",
"legendFormat": "p95",
"refId": "B"
},
{
"expr": "histogram_quantile(0.99, sum(rate(http_request_duration_seconds_bucket{job=\"mockupaws-backend\"}[5m])) by (le)) * 1000",
"legendFormat": "p99",
"refId": "C"
}
],
"fieldConfig": {
"defaults": {
"unit": "ms",
"custom": {
"lineWidth": 2,
"fillOpacity": 10
}
}
},
"gridPos": {"h": 8, "w": 12, "x": 12, "y": 4}
},
{
"id": 9,
"title": "Error Rate Over Time",
"type": "timeseries",
"targets": [
{
"expr": "sum(rate(http_requests_total{job=\"mockupaws-backend\",status=~\"5..\"}[5m])) / sum(rate(http_requests_total{job=\"mockupaws-backend\"}[5m])) * 100",
"legendFormat": "5xx Error %",
"refId": "A"
},
{
"expr": "sum(rate(http_requests_total{job=\"mockupaws-backend\",status=~\"4..\"}[5m])) / sum(rate(http_requests_total{job=\"mockupaws-backend\"}[5m])) * 100",
"legendFormat": "4xx Error %",
"refId": "B"
}
],
"fieldConfig": {
"defaults": {
"unit": "percent"
}
},
"gridPos": {"h": 8, "w": 12, "x": 0, "y": 12}
},
{
"id": 10,
"title": "Top Endpoints by Latency",
"type": "table",
"targets": [
{
"expr": "topk(10, histogram_quantile(0.95, sum(rate(http_request_duration_seconds_bucket{job=\"mockupaws-backend\"}[5m])) by (handler, le)))",
"format": "table",
"instant": true,
"refId": "A"
}
],
"fieldConfig": {
"defaults": {
"unit": "s"
},
"overrides": [
{
"matcher": {"id": "byName", "options": "Value"},
"properties": [
{"id": "displayName", "value": "p95 Latency"},
{"id": "unit", "value": "s"}
]
}
]
},
"gridPos": {"h": 8, "w": 12, "x": 12, "y": 12}
},
{
"id": 11,
"title": "Infrastructure - CPU Usage",
"type": "timeseries",
"datasource": {
"type": "prometheus",
"uid": "prometheus"
},
"targets": [
{
"expr": "100 - (avg by (instance) (irate(node_cpu_seconds_total{mode=\"idle\"}[5m])) * 100)",
"legendFormat": "{{instance}}",
"refId": "A"
}
],
"fieldConfig": {
"defaults": {
"unit": "percent",
"min": 0,
"max": 100,
"thresholds": {
"mode": "absolute",
"steps": [
{"color": "green", "value": null},
{"color": "yellow", "value": 70},
{"color": "red", "value": 85}
]
}
}
},
"gridPos": {"h": 8, "w": 12, "x": 0, "y": 20}
},
{
"id": 12,
"title": "Infrastructure - Memory Usage",
"type": "timeseries",
"datasource": {
"type": "prometheus",
"uid": "prometheus"
},
"targets": [
{
"expr": "(node_memory_MemTotal_bytes - node_memory_MemAvailable_bytes) / node_memory_MemTotal_bytes * 100",
"legendFormat": "{{instance}}",
"refId": "A"
}
],
"fieldConfig": {
"defaults": {
"unit": "percent",
"min": 0,
"max": 100,
"thresholds": {
"mode": "absolute",
"steps": [
{"color": "green", "value": null},
{"color": "yellow", "value": 70},
{"color": "red", "value": 85}
]
}
}
},
"gridPos": {"h": 8, "w": 12, "x": 12, "y": 20}
}
]
}
}
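
A dashboard file like the one above can be sanity-checked before it is committed. The following is a minimal sketch, not part of the repository: the helper names (`load_panels`, `overlaps`, `lint_panels`) are hypothetical, and the checks assume only what the JSON shows (each panel has an `id` and a `gridPos` with `x`, `y`, `w`, `h`) plus Grafana's 24-column dashboard grid. It reports duplicate panel ids, panels that run past the right edge of the grid, and panels whose rectangles overlap.

```python
import json
from itertools import combinations

GRID_COLUMNS = 24  # Grafana lays panels out on a 24-column grid


def load_panels(document):
    """Return the panel list, accepting both the wrapped and the bare layout."""
    dashboard = document.get("dashboard", document)
    return dashboard.get("panels", [])


def overlaps(a, b):
    """True when two gridPos rectangles share any area."""
    return (
        a["x"] < b["x"] + b["w"] and b["x"] < a["x"] + a["w"]
        and a["y"] < b["y"] + b["h"] and b["y"] < a["y"] + a["h"]
    )


def lint_panels(panels):
    """Collect readable problems; an empty list means no issue was found."""
    problems = []
    seen_ids = set()
    for panel in panels:
        pid = panel.get("id")
        if pid in seen_ids:
            problems.append(f"duplicate panel id {pid}")
        seen_ids.add(pid)
        pos = panel["gridPos"]
        if pos["x"] + pos["w"] > GRID_COLUMNS:
            problems.append(f"panel {pid} extends past column {GRID_COLUMNS}")
    for a, b in combinations(panels, 2):
        if overlaps(a["gridPos"], b["gridPos"]):
            problems.append(f"panels {a.get('id')} and {b.get('id')} overlap")
    return problems


# Trimmed copy of the first two panels from the dashboard above.
SAMPLE = json.loads("""
{"dashboard": {"panels": [
  {"id": 1, "title": "Uptime (30d)", "gridPos": {"h": 4, "w": 4, "x": 0, "y": 0}},
  {"id": 2, "title": "Requests/sec", "gridPos": {"h": 4, "w": 4, "x": 4, "y": 0}}
]}}
""")

if __name__ == "__main__":
    for problem in lint_panels(load_panels(SAMPLE)):
        print(problem)
```

To check the real file, replace `SAMPLE` with `json.load(open(path))` for the dashboard path; a check like this could run as a CI step next to the existing validation jobs.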