36 Commits

Author SHA1 Message Date
Luca Sacchi Ricciardi 923621fd55 docs: add author contact email and update references 2026-04-25 19:00:37 +02:00
Luca Sacchi Ricciardi 5440887ef4 docs: update author and rights ownership to Luca Sacchi Ricciardi 2026-04-25 18:59:39 +02:00
Luca Sacchi Ricciardi f6ad08c4a0 docs: update README with new license policy 2026-04-25 18:58:10 +02:00
Luca Sacchi Ricciardi 3a8adafb6a docs: add bilingual proprietary license with Milan as venue 2026-04-25 18:57:14 +02:00
Luca Sacchi Ricciardi 831b41f487 docs: update README and documentation with current features 2026-04-25 18:10:49 +02:00
Luca Sacchi Ricciardi bfed2f60aa chore: move PRD.md to docs/, remove .env.local from tracking, update .gitignore 2026-04-25 18:06:34 +02:00
Luca Sacchi Ricciardi c217860ebc chore: remove obsolete artifacts and add test-results/ to .gitignore 2026-04-25 18:04:43 +02:00
Luca Sacchi Ricciardi b6fba35004 fix: chevron SVG uses inline width/height and transform instead of Tailwind classes 2026-04-25 18:00:59 +02:00
Luca Sacchi Ricciardi 24b27d26ce fix: bump service worker cache to v3 to force app.js refresh 2026-04-25 17:58:45 +02:00
Luca Sacchi Ricciardi 74f9501a9c feat: accordion layout for model show details modal 2026-04-25 17:55:27 +02:00
Luca Sacchi Ricciardi a83c1d1261 fix: stop running page spinner when server data is unavailable 2026-04-25 17:28:54 +02:00
Luca Sacchi Ricciardi 6739b84b9a fix: avoid worker fetch noise when server is offline 2026-04-25 16:30:46 +02:00
Luca Sacchi Ricciardi 1c76515d8c feat: show deferred details cache mode 2026-04-25 16:13:41 +02:00
Luca Sacchi Ricciardi 165ad9c02b fix: handle local storage quota for model cache 2026-04-25 16:11:12 +02:00
Luca Sacchi Ricciardi ac2089f921 test: integrate playwright cache navigation spec 2026-04-25 16:06:59 +02:00
Luca Sacchi Ricciardi 760c9cc923 test: add browser cache navigation check 2026-04-25 16:00:46 +02:00
Luca Sacchi Ricciardi 9649f2ccfb fix: serve cached server data before background sync 2026-04-25 15:57:37 +02:00
Luca Sacchi Ricciardi f60781bd7f feat: add multi-server control panel and host-aware sync 2026-04-25 15:40:20 +02:00
Luca Sacchi Ricciardi 3ba6a9a41c Translate UI to English and add PWA support 2026-04-25 15:32:10 +02:00
Luca Sacchi Ricciardi 2f28b6a52a Fix blank ReDoc by pinning stable redoc bundle 2026-04-25 15:25:32 +02:00
Luca Sacchi Ricciardi bfe301a52c Make Docker Tailwind stage work without package-lock 2026-04-25 15:12:58 +02:00
Luca Sacchi Ricciardi 229115ae87 Harden Tailwind Docker build and add deploy verification 2026-04-25 15:08:57 +02:00
Luca Sacchi Ricciardi 32302e2b06 Add modal click Playwright test utilities 2026-04-25 14:22:36 +02:00
Luca Sacchi Ricciardi eea6e2a80e Fix model details modal interactions and scrolling 2026-04-25 14:05:11 +02:00
Luca Sacchi Ricciardi 87ebd35ad5 Fix model details modal and make running models primary page 2026-04-25 13:31:04 +02:00
Luca Sacchi Ricciardi 1aee51b0d6 Add dedicated page for running Ollama models 2026-04-25 11:36:05 +02:00
Luca Sacchi Ricciardi 2f591e55ce feat: open model details modal on hover and refine cards layout
- Add on-hover modal opening for model cards with debounce
- Keep click-to-open behavior as fallback
- Prevent accidental hover triggers while moving inside same card
- Convert models list to responsive grid layout
- Improve card visual feedback and helper text for interaction
2026-04-24 20:15:19 +02:00
Luca Sacchi Ricciardi e05df7ce2b feat: show model details in modal with close controls
- Replace inline details panel with centered modal overlay
- Add close button (X) in top-right of modal
- Add close on backdrop click
- Add close on Escape key
- Lock body scroll while modal is open
2026-04-24 20:10:37 +02:00
Luca Sacchi Ricciardi f19c03b7bd fix: restore dashboard styling by tracking compiled Tailwind CSS
- Generate and add app/web/static/css/output.css
- Stop ignoring output.css in .gitignore
- Ensure UI has styles without requiring local Tailwind build step
2026-04-24 20:06:16 +02:00
Luca Sacchi Ricciardi 0789e5b8e9 fix: make model-card click reliable and remove Tailwind CDN warning
- Use encoded model key in data attribute to avoid lookup mismatch
- Decode key on click before resolving showByModel data
- Guard localStorage writes with try/catch to avoid silent UI failures
- Scroll details section into view when a card is clicked
- Remove tailwindcdn script from template (use compiled CSS only)
2026-04-24 20:04:06 +02:00
Luca Sacchi Ricciardi 57663400ce feat: load and cache Ollama show data per model with clickable model details
- Add GET /api/v1/models/{model_name}/show endpoint (proxy to Ollama /api/show)
- Worker now fetches show data for each model during model list sync
- Persist show details in localStorage under llm_monitor_models.showByModel
- Make model cards clickable to display cached show details in a dedicated panel
- Keep UI updates incremental without full page reload
- Add tests for show endpoint and OpenAPI path
- Update README and PRD with show-flow and click-card behavior
2026-04-24 19:41:46 +02:00
Luca Sacchi Ricciardi 32b1130632 feat: add favicon.ico and gate model write APIs by env flag
- Generate and serve real /favicon.ico from static assets
- Update HTML to use /favicon.ico
- Add ENABLE_MODEL_RW_API setting (default: false)
- Disable POST/DELETE model endpoints by default
- Hide write endpoints from OpenAPI when disabled
- Return 404 for write endpoints when disabled
- Update env.example with ENABLE_MODEL_RW_API documentation
- Update README and PRD with R/W API policy and remote compose notes
- Add tests to verify write endpoints are disabled by default
2026-04-24 19:35:24 +02:00
Luca Sacchi Ricciardi 893376dc14 fix: resolve console errors (localStorage in Worker, favicon, Tailwind CDN)
Issues fixed:
1. Web Worker localStorage error - Remove localStorage calls from worker
   - Worker cannot access localStorage (browser context only)
   - Worker now sends data to main thread via postMessage
   - Main thread handles all localStorage operations

2. Add favicon to avoid 404 error
   - Use inline SVG favicon (llama emoji)
   - No external file request

3. Optimize Tailwind CSS for production
   - Add tailwind.config.js for content scanning
   - Add app/web/static/css/input.css (Tailwind directives)
   - Update package.json with tailwind build commands
   - Update Dockerfile multi-stage build:
     * Stage 1: Node.js - compile Tailwind CSS
     * Stage 2: Python - install dependencies
     * Stage 3: Runtime - use compiled CSS
   - Update index.html to use compiled output.css
   - Add fallback to CDN for development

4. Add DEVELOPMENT.md documentation
   - Setup instructions for local development
   - Tailwind CSS workflow (watch mode)
   - Docker build explanation
   - Development tips and best practices

Benefits:
- No more localStorage errors in console
- No more 404 favicon requests
- Optimized CSS for production (~30KB minified)
- Clear development workflow
- Multi-stage Docker build is efficient (~300MB image)
2026-04-24 19:30:53 +02:00
Luca Sacchi Ricciardi b3beb525ad refactor: support remote Ollama server in docker-compose
- Remove Ollama service from docker-compose.yml (now external/remote)
- Remove ollama_data volume and network configuration
- Simplify compose to only llm-monitor service
- Use env_file for all configuration from .env
- Make API_PORT dynamic with ${API_PORT:-8000}
- Update env.example with Ollama remote server examples:
  - Local development: http://localhost:11434
  - Remote server: http://ollama.example.com:11434
  - Remote with SSL: https://ollama.example.com
- Improve documentation for remote Ollama setup

This allows deployment against any Ollama server (local or remote).
2026-04-24 19:25:00 +02:00
Luca Sacchi Ricciardi 40d8ae9f52 docs: add comprehensive Product Requirements Document (PRD)
- Executive summary with key highlights
- Vision and primary/secondary objectives
- Problem statement and proposed solution
- Target users with detailed use cases
- 6 main feature descriptions with specifications
- Technical requirements (backend, frontend, devops)
- Complete system architecture with data flow
- 6 user stories with acceptance criteria
- Feature acceptance criteria with test matrix
- Browser compatibility matrix
- 4-phase roadmap (MVP to Production)
- Success metrics (technical, business, engagement)
- Constraints and assumptions
- Implementation notes and references

This document provides complete product specification for stakeholders and team.
2026-04-24 19:18:15 +02:00
Luca Sacchi Ricciardi 9a6f835ddf feat: implement Web Worker architecture for efficient data sync
- Add data-sync.worker.js: separate thread for API calls (30s interval)
- Add app.js: main thread with DOM update logic and localStorage integration
- Update index.html: remove inline scripts, use external app.js
- Implement granular DOM updates (only update changed elements)
- Add localStorage persistence for health and models data
- Add Web Worker fallback for unsupported browsers
- Add WEB_WORKERS.md documentation with architecture details

Benefits:
- Main thread never blocked by network requests
- UI stays responsive at 60 FPS
- Offline support via localStorage
- Efficient DOM updates (no unnecessary re-renders)
- Better browser support and performance
2026-04-24 19:16:51 +02:00
40 changed files with 5285 additions and 293 deletions
-1
@@ -35,7 +35,6 @@ CONTRIBUTING.md
# Development
node_modules/
package-lock.json
Makefile
.env*
-15
@@ -1,15 +0,0 @@
# LLM Monitor - Local Development Environment
# Copy this file from env.example and customize it for your environment
OLLAMA_HOST=http://localhost:11434
OLLAMA_TIMEOUT=30
API_HOST=0.0.0.0
API_PORT=8000
API_WORKERS=1
CORS_ORIGINS=http://localhost:3000,http://localhost:5173,http://localhost:8000
LOG_LEVEL=DEBUG
ENVIRONMENT=development
+4 -1
@@ -93,6 +93,8 @@ celerybeat.pid
# Environments
.env
.env.local
.env.*.local
.venv
env/
venv/
@@ -134,7 +136,8 @@ node_modules/
package-lock.json
# Build outputs
app/web/static/css/output.css
test-results/
playwright-report/
# Database
*.db
+11 -2
@@ -2,6 +2,11 @@
Thank you for your interest in contributing to LLM Monitor! This document provides guidelines for contributing to the project.
## Author and Rights
- **Project author**: Luca Sacchi Ricciardi
- **Holder of all rights**: Luca Sacchi Ricciardi
## Code of Conduct
This project adheres to a Code of Conduct to ensure an inclusive and respectful environment.
@@ -116,8 +121,12 @@ feat: aggiungi endpoint per ottenere statistiche modelli
## License
By contributing, you agree that your contributions are licensed under the MIT License.
By contributing, you agree that your contributions are subject to the project's
proprietary license (all rights reserved).
---
Questions? Open an issue or contact the maintainer!
Questions? Open an issue or contact the maintainer:
- luca.sacchi@gmail.com
- luca@lucasacchi.net
+31 -4
@@ -1,7 +1,34 @@
# Multi-stage build for LLM Monitor
# Stage 1: Builder
FROM python:3.11-slim as builder
# Stage 1: Build CSS with Tailwind
FROM node:18-alpine AS css-builder
WORKDIR /app
# Copy the npm configuration files.
# Note: package-lock.json may be absent in some deployments.
COPY package*.json tailwind.config.js ./
# Install Node dependencies
RUN if [ -f package-lock.json ]; then npm ci; else npm install; fi
# Copy the input CSS and the files read by Tailwind's content scan.
# This step must run before the build so the cache is invalidated when templates/JS change.
COPY app/web/static/css/input.css ./app/web/static/css/
COPY app/web/templates/ ./app/web/templates/
COPY app/web/static/js/ ./app/web/static/js/
# Compile the Tailwind CSS
RUN npm run tailwind:build
# Blocking check: output.css must be compiled and non-empty.
RUN test -s ./app/web/static/css/output.css && \
CSS_LINES=$(wc -l < ./app/web/static/css/output.css) && \
echo "[css-builder] output.css lines: ${CSS_LINES}" && \
test "${CSS_LINES}" -ge 100
# Stage 2: Build Python packages
FROM python:3.11-slim AS builder
WORKDIR /app
@@ -19,7 +46,7 @@ ENV PATH="/opt/venv/bin:$PATH"
RUN pip install --no-cache-dir --upgrade pip setuptools wheel && \
pip install --no-cache-dir -r requirements.txt
# Stage 2: Runtime
# Stage 3: Runtime
FROM python:3.11-slim
WORKDIR /app
@@ -33,9 +60,9 @@ RUN apt-get update && apt-get install -y --no-install-recommends \
COPY --from=builder /opt/venv /opt/venv
# Copy the application code
COPY --from=css-builder /app/app/web/static/css/output.css ./app/web/static/css/output.css
COPY app/ /app/app/
COPY main.py /app/
COPY .env* /app/
# Set PATH
ENV PATH="/opt/venv/bin:$PATH"
+12 -17
@@ -1,21 +1,16 @@
MIT License
LLM Monitor - Proprietary License Notice
Copyright (c) 2024-2026 Luca Sacchi
Copyright (c) 2024-2026 Luca Sacchi Ricciardi.
All rights reserved.
Permission is hereby granted, free of charge, to any person obtaining a copy
of this software and associated documentation files (the "Software"), to deal
in the Software without restriction, including without limitation the rights
to use, copy, modify, merge, publish, distribute, sublicense, and/or sell
copies of the Software, and to permit persons to whom the Software is
furnished to do so, subject to the following conditions:
This software is proprietary. Permission is granted to use the software free
of charge, strictly "AS IS", without warranties, maintenance, or support,
subject to the terms in the following files:
The above copyright notice and this permission notice shall be included in all
copies or substantial portions of the Software.
- LICENSE.en.txt (English, authoritative fallback)
- LICENSE.it.txt (Italian)
THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE
AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM,
OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE
SOFTWARE.
If there is a conflict between language versions, the English version in
LICENSE.en.txt prevails.
Jurisdiction and venue for any dispute: Milan, Italy.
+73
@@ -0,0 +1,73 @@
LLM Monitor - Proprietary License (English)
Version: 1.0
Last updated: 2026-04-25
Copyright (c) 2024-2026 Luca Sacchi Ricciardi.
All rights reserved.
1. Ownership
The software, source code, binaries, assets, and documentation (collectively,
"Software") are owned by the copyright holder. No ownership rights are
transferred under this license.
2. Grant of Use (Free of Charge)
Subject to full compliance with this license, any person obtaining a copy of
this Software is granted a personal, non-exclusive, non-transferable,
non-sublicensable, revocable right to use the Software free of charge.
3. Restrictions
Unless expressly required by applicable law, you may not:
- remove or alter copyright, trademark, or license notices;
- represent the Software as your own work;
- use the Software in a way that violates applicable law;
- provide paid support, warranty, or indemnity on behalf of the copyright
holder;
- use the copyright holder's name, marks, or branding without prior written
permission.
4. Distribution
Redistribution of original or modified copies is allowed only if all of the
following are met:
- this license text is included in full;
- all copyright and attribution notices are preserved;
- recipients are clearly informed that the Software is provided "AS IS" with
no warranty and no support from the copyright holder.
5. No Warranty (AS IS)
THE SOFTWARE IS PROVIDED "AS IS", "AS AVAILABLE", AND WITH ALL FAULTS, WITHOUT
WARRANTIES OR CONDITIONS OF ANY KIND, EXPRESS OR IMPLIED, INCLUDING BUT NOT
LIMITED TO MERCHANTABILITY, FITNESS FOR A PARTICULAR PURPOSE, TITLE,
NON-INFRINGEMENT, ACCURACY, OR UNINTERRUPTED OPERATION.
6. No Support or Maintenance
THE COPYRIGHT HOLDER HAS NO OBLIGATION TO PROVIDE SUPPORT, MAINTENANCE,
UPDATES, ENHANCEMENTS, PATCHES, OR FIXES.
7. Limitation of Liability
TO THE MAXIMUM EXTENT PERMITTED BY LAW, IN NO EVENT SHALL THE COPYRIGHT HOLDER
BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL, CONSEQUENTIAL,
EXEMPLARY, OR PUNITIVE DAMAGES, OR FOR LOSS OF DATA, PROFITS, REVENUE,
GOODWILL, OR BUSINESS INTERRUPTION, ARISING OUT OF OR RELATED TO THE SOFTWARE,
WHETHER IN CONTRACT, TORT, STRICT LIABILITY, OR ANY OTHER THEORY, EVEN IF
ADVISED OF THE POSSIBILITY OF SUCH DAMAGES.
8. Indemnity
You agree to defend, indemnify, and hold harmless the copyright holder from and
against claims, liabilities, damages, and expenses arising from your use of the
Software or your breach of this license, to the extent permitted by law.
9. Termination
This license terminates automatically if you fail to comply with its terms.
Upon termination, you must cease use and distribution of the Software.
Sections intended by nature to survive termination shall survive.
10. Governing Law and Jurisdiction
This license is governed by the laws of Italy.
For any dispute arising out of or related to this Software or this license,
the exclusive place of jurisdiction and venue is Milan, Italy.
11. Severability and Entire Agreement
If any provision is held unenforceable, the remaining provisions remain in
full force. This license constitutes the entire agreement regarding use of the
Software and supersedes prior statements on this subject.
+72
@@ -0,0 +1,72 @@
LLM Monitor - Licenza Proprietaria (Italiano)
Versione: 1.0
Ultimo aggiornamento: 2026-04-25
Copyright (c) 2024-2026 Luca Sacchi Ricciardi.
Tutti i diritti riservati.
1. Titolarita
Il software, il codice sorgente, i binari, gli asset e la documentazione
(collettivamente, "Software") sono di proprieta del titolare dei diritti.
Questa licenza non trasferisce alcun diritto di proprieta.
2. Concessione d'uso (gratuita)
Nel rispetto integrale di questa licenza, a chi ottiene una copia del Software
viene concesso un diritto personale, non esclusivo, non trasferibile,
non sublicenziabile e revocabile di usare il Software gratuitamente.
3. Limitazioni
Salvo quanto imposto dalla legge applicabile, non e consentito:
- rimuovere o modificare avvisi di copyright, marchi o licenza;
- presentare il Software come opera propria;
- usare il Software in violazione di legge;
- offrire supporto, garanzie o manleve a pagamento in nome del titolare;
- usare nome, marchi o branding del titolare senza autorizzazione scritta.
4. Ridistribuzione
La ridistribuzione di copie originali o modificate e consentita solo se sono
rispettate tutte le seguenti condizioni:
- il testo della presente licenza e incluso integralmente;
- avvisi di copyright e attribuzione sono mantenuti;
- i destinatari sono informati in modo chiaro che il Software e fornito
"COSI COM'E" senza garanzia e senza supporto del titolare.
5. Esclusione di garanzia (AS IS)
IL SOFTWARE E FORNITO "COSI COM'E", "COME DISPONIBILE" E CON OGNI EVENTUALE
DIFETTO, SENZA GARANZIE O CONDIZIONI DI ALCUN TIPO, ESPRESSE O IMPLICITE,
INCLUSE, A TITOLO ESEMPLIFICATIVO, GARANZIE DI COMMERCIABILITA,
IDONEITA A UNO SCOPO SPECIFICO, TITOLARITA, NON VIOLAZIONE, ACCURATEZZA
O FUNZIONAMENTO ININTERROTTO.
6. Nessun supporto o manutenzione
IL TITOLARE NON HA ALCUN OBBLIGO DI FORNIRE SUPPORTO, MANUTENZIONE,
AGGIORNAMENTI, MIGLIORIE, PATCH O CORREZIONI.
7. Limitazione di responsabilita
NELLA MASSIMA MISURA CONSENTITA DALLA LEGGE, IL TITOLARE NON RISPONDE IN
ALCUN CASO DI DANNI DIRETTI, INDIRETTI, INCIDENTALI, SPECIALI,
CONSEQUENZIALI, ESEMPLARI O PUNITIVI, NE DI PERDITA DI DATI, PROFITTI,
RICAVI, AVVIAMENTO O INTERRUZIONE DI ATTIVITA, DERIVANTI DA O CONNESSI AL
SOFTWARE, A QUALSIASI TITOLO (CONTRATTUALE, EXTRACONTRATTUALE, RESPONSABILITA
OGGETTIVA O ALTRA TEORIA), ANCHE SE AVVISATO DELLA POSSIBILITA DI TALI DANNI.
8. Manleva
Nei limiti consentiti dalla legge, l'utente accetta di tenere indenne e
manlevare il titolare da pretese, responsabilita, danni e costi derivanti
dall'uso del Software o dalla violazione della presente licenza.
9. Risoluzione
La licenza si risolve automaticamente in caso di mancato rispetto dei suoi
termini. In caso di risoluzione, l'utente deve cessare uso e ridistribuzione
del Software. Le clausole che per natura devono sopravvivere restano efficaci.
10. Legge applicabile e foro competente
La presente licenza e regolata dalla legge italiana.
Per ogni controversia derivante da o connessa al Software o alla presente
licenza, il foro esclusivamente competente e Milano, Italia.
11. Clausola di salvaguardia e intero accordo
Se una clausola e ritenuta invalida o inapplicabile, le altre restano valide
ed efficaci. La presente licenza costituisce l'intero accordo relativo all'uso
del Software e sostituisce ogni precedente dichiarazione sul tema.
+13 -1
@@ -1,4 +1,4 @@
.PHONY: help install dev prod test lint format clean docker-build docker-up docker-down
.PHONY: help install dev prod test lint format clean docker-build docker-up docker-down docker-build-no-cache verify-css deploy-no-cache
help:
@echo "LLM Monitor - Makefile Commands"
@@ -11,8 +11,11 @@ help:
@echo "make format - Format the code"
@echo "make clean - Clean caches and temporary files"
@echo "make docker-build - Build the Docker image"
@echo "make docker-build-no-cache - Docker build without cache (Tailwind fix)"
@echo "make docker-up - Start the containers with Docker Compose"
@echo "make docker-down - Stop the containers with Docker Compose"
@echo "make verify-css - Verify the compiled output.css in the container"
@echo "make deploy-no-cache - Full deploy with no-cache build + CSS check"
install:
python3 -m venv venv
@@ -41,12 +44,21 @@ clean:
docker-build:
docker build -t llm-monitor:latest .
docker-build-no-cache:
docker compose build --no-cache
docker-up:
docker compose up -d
docker-down:
docker compose down
verify-css:
./scripts/verify-tailwind-css.sh
deploy-no-cache:
./scripts/deploy-no-cache.sh
docker-logs:
docker compose logs -f llm-monitor
+152 -34
@@ -4,12 +4,16 @@ Una dashboard web moderna e intuitiva per monitorare e gestire i modelli LLM car
## 🎯 Features
- **Intuitive dashboard** - Shows in real time the models loaded in Ollama
- **Intuitive dashboard** - Shows in real time the models available and running on Ollama
- 📊 **Model monitoring** - Full details for each model (name, size, memory, status)
- 🧩 **Accordion details on click** - Click a card to explore the `ollama show` data in collapsible panels (details, parameters, template, modelfile, license)
- 🖥️ **Multi-server** - Manage multiple Ollama instances with instant switching (`/servers` page)
- 🏃 **Running models** - Dedicated `/models-running` page with VRAM, remaining time, and GPU/CPU backend
- 📱 **PWA** - Installable as a desktop/mobile app with Service Worker and offline cache
- 🔌 **Documented REST API** - Interactive documentation with Swagger/OpenAPI
- 🎨 **Modern UI** - Elegant interface built with TailwindCSS
- 🐳 **Docker ready** - Always-on container (until stopped)
- **Performance** - Built on FastAPI and Uvicorn
- 🎨 **Modern UI** - Dark-mode interface built with TailwindCSS
- 🐳 **Docker ready** - Always-on container (restart: unless-stopped)
- **Performance** - FastAPI + Uvicorn, updates every 30s via a Web Worker without blocking the UI
- 🔐 **Flexible configuration** - `.env` file for customization
## 📋 Requirements
@@ -85,6 +89,7 @@ OLLAMA_TIMEOUT=30
API_HOST=0.0.0.0
API_PORT=8000
API_WORKERS=4
ENABLE_MODEL_RW_API=false
# CORS Configuration
CORS_ORIGINS=http://localhost:3000,http://localhost:5173
@@ -105,6 +110,7 @@ ENVIRONMENT=development
| `API_HOST` | `0.0.0.0` | Host the API binds to |
| `API_PORT` | `8000` | API port |
| `API_WORKERS` | `4` | Worker processes |
| `ENABLE_MODEL_RW_API` | `false` | Enables the `POST`/`DELETE` endpoints on models |
| `CORS_ORIGINS` | `http://localhost:3000` | Allowed CORS origins |
| `LOG_LEVEL` | `INFO` | Logging level |
| `ENVIRONMENT` | `development` | Environment (development/production) |
@@ -145,12 +151,33 @@ GET /api/v1/models
GET /api/v1/models/{model_name}
```
#### Extended details from Ollama show
```bash
GET /api/v1/models/{model_name}/show
```
#### Ollama API health check
```bash
GET /api/v1/health
```
#### Model R/W endpoints (optional)
By default the write endpoints are **disabled** and unavailable.
```bash
POST /api/v1/models/{model_name}/pull
DELETE /api/v1/models/{model_name}
```
To enable them, set the following in the `.env` file:
```env
ENABLE_MODEL_RW_API=true
```
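The gating described above can be sketched in plain Python. This is a minimal illustration, not the project's code: the set of accepted truthy spellings and the helper names `rw_api_enabled` and `write_endpoint_status` are assumptions; only the `ENABLE_MODEL_RW_API` name, its `false` default, and the 404 response for disabled write endpoints come from the changes listed here.

```python
import os
from typing import Mapping, Optional

# Assumption: which spellings count as "enabled" is illustrative only.
TRUTHY = {"1", "true", "yes", "on"}


def rw_api_enabled(env: Optional[Mapping[str, str]] = None) -> bool:
    """Return True only when ENABLE_MODEL_RW_API is explicitly truthy."""
    source = os.environ if env is None else env
    raw = source.get("ENABLE_MODEL_RW_API", "false")  # disabled by default
    return raw.strip().lower() in TRUTHY


def write_endpoint_status(env: Optional[Mapping[str, str]] = None) -> int:
    """Hypothetical helper: status a write endpoint would answer with."""
    # Disabled write endpoints respond as if they did not exist (404).
    return 200 if rw_api_enabled(env) else 404
```

Passing a mapping instead of reading `os.environ` directly keeps the check easy to exercise in tests without touching the real environment.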
**Response:**
```json
@@ -170,10 +197,19 @@ curl http://localhost:8000/api/v1/models
# Get info about a model
curl http://localhost:8000/api/v1/models/llama2
# Get extended show details
curl http://localhost:8000/api/v1/models/llama2/show
# Health check
curl http://localhost:8000/api/v1/health
```
### Dashboard behavior
- When the model list is refreshed, the `show` details are also fetched for each model.
- The data is saved in localStorage under the `llm_monitor_models` key (`showByModel` field).
- Clicking a model card displays the `show` details without reloading the page.
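The cache layout behind this behavior can be illustrated with a short Python sketch. The real logic runs in the browser (Web Worker plus main thread), so this only models the data shape: `build_models_cache`, the `fetch_show` callable, and the `models` field are hypothetical; `showByModel` and the `llm_monitor_models` key are the names given above.

```python
import json
from typing import Any, Callable, Dict, List


def build_models_cache(
    model_names: List[str],
    fetch_show: Callable[[str], Dict[str, Any]],
) -> Dict[str, Any]:
    """Collect per-model show details into one cache object."""
    show_by_model: Dict[str, Any] = {}
    for name in model_names:
        try:
            show_by_model[name] = fetch_show(name)
        except Exception:
            # A failing model is skipped so one error does not
            # discard the details already collected for the others.
            continue
    return {"models": list(model_names), "showByModel": show_by_model}


def serialize_for_storage(cache: Dict[str, Any]) -> str:
    """JSON string as it could be stored under 'llm_monitor_models'."""
    return json.dumps(cache, sort_keys=True)
```

Keying the details by model name means a card click only needs a dictionary lookup, which is why no page reload or extra request is required.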
## 🐳 Docker
### Building the image
@@ -203,6 +239,12 @@ Usa il file `docker-compose.yml` fornito:
# Start the services
docker compose up -d
# Full rebuild without cache (recommended after UI/Tailwind changes)
docker compose build --no-cache
# Check that the compiled CSS is not empty
docker exec llm-monitor-app wc -l /app/app/web/static/css/output.css
# View the logs
docker compose logs -f llm-monitor
@@ -213,18 +255,53 @@ docker compose down
docker compose restart llm-monitor
```
### Recommended deploy (Tailwind-safe)
If the interface appears unstyled or a modal is positioned incorrectly, deploy with a no-cache rebuild and verify the CSS:
```bash
cd /opt/llm-monitor
docker compose down
docker compose build --no-cache
docker compose up -d
sleep 5
docker exec llm-monitor-app wc -l /app/app/web/static/css/output.css
```
Alternatively, from the repository:
```bash
make deploy-no-cache
```
### Tailwind Build Process
- The Dockerfile's `css-builder` stage installs the Node dependencies with `npm ci` when `package-lock.json` is present, and with `npm install` otherwise.
- Before the Tailwind build, the HTML templates and JS files read by the content scan are copied in.
- After `npm run tailwind:build`, a blocking check verifies that `output.css` exists, is non-empty, and has at least 100 lines.
- The runtime stage copies the compiled `output.css` from `css-builder` with `COPY --from=css-builder`.
### UI troubleshooting
If the modal does not appear or components look unstyled:
1. Run `docker compose build --no-cache`.
2. Restart with `docker compose up -d`.
3. Check the compiled CSS: `docker exec llm-monitor-app wc -l /app/app/web/static/css/output.css`.
4. If the line count is below 100, the Tailwind build did not complete correctly.
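The same check can be run outside the container with a few lines of Python. This is a sketch under stated assumptions: `css_looks_compiled` is a hypothetical helper, not part of the repository scripts, and it mirrors the `wc -l` convention of counting newline characters against the 100-line threshold mentioned above.

```python
from pathlib import Path
from typing import Union

MIN_LINES = 100  # threshold taken from the check described above


def css_looks_compiled(path: Union[str, Path], min_lines: int = MIN_LINES) -> bool:
    """Return True if the file exists, is non-empty, and has enough lines."""
    css = Path(path)
    if not css.is_file() or css.stat().st_size == 0:
        return False
    # Count newline bytes in chunks, like `wc -l`, without loading the file whole.
    with css.open("rb") as handle:
        newline_count = sum(
            chunk.count(b"\n") for chunk in iter(lambda: handle.read(65536), b"")
        )
    return newline_count >= min_lines
```

A line-count threshold is a coarse heuristic: it catches an empty or stub stylesheet, but it does not prove that every class used by the templates was generated.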
### Always-on container
The Ollama container keeps running until it is stopped manually:
The `llm-monitor` container keeps running until it is stopped manually:
```bash
# Stop
docker compose stop ollama
docker compose stop llm-monitor
# or
docker stop llm-monitor
# Restart
docker compose start ollama
docker compose start llm-monitor
# or
docker start llm-monitor
```
@@ -235,38 +312,59 @@ docker start llm-monitor
llm-monitor/
├── main.py # Entry point dell'applicazione
├── requirements.txt # Dipendenze Python
├── env.example # Esempio di configurazione
├── Dockerfile # Configurazione Docker
├── docker-compose.yml # Composizione servizi
├── README.md # Questo file
├── .gitignore
├── requirements-dev.txt # Dipendenze sviluppo (pytest, black, flake8…)
├── env.example # Esempio di configurazione
├── Dockerfile # Build multi-stage (Node CSS + Python runtime)
├── docker-compose.yml # Composizione servizi
├── package.json # Script Node (Tailwind, Playwright)
├── tailwind.config.js # Configurazione TailwindCSS
├── playwright.config.js # Configurazione test E2E
├── Makefile # Comandi rapidi (dev, test, deploy…)
├── README.md # Questo file
├── CONTRIBUTING.md # Guida ai contributi
├── app/
│ ├── __init__.py
│ ├── config.py # Configurazione (variabili ambiente)
│ ├── main.py # Inizializzazione FastAPI
│ ├── config.py # Configurazione via variabili d'ambiente
│ │
│ ├── api/
│ │ ├── __init__.py
│ │ ├── models.py # Endpoint modelli
│ │ ├── health.py # Endpoint health
│ │ └── v1/
│ │ └── __init__.py
│ │ ├── models.py # Endpoint modelli (/api/v1/models)
│ │ └── health.py # Endpoint health (/api/v1/health)
│ │
│ ├── services/
│ │ ├── __init__.py
│ │ ├── ollama.py # Client Ollama
│ │ └── cache.py # Cache in-memory (opzionale)
│ │ └── ollama.py # Client HTTP verso Ollama
│ │
│ └── web/
│ ├── __init__.py
├── static/ # Assets statici (CSS compilato TailwindCSS)
└── templates/ # Template HTML
│ ├── static/
│ ├── css/
│ │ ├── input.css # Sorgente Tailwind
│ │ │ └── output.css # CSS compilato (generato)
│ │ └── js/
│ │ ├── app.js # App principale (dashboard modelli)
│ │ ├── servers.js # Pagina gestione server
│ │ ├── models-running.js # Pagina modelli in esecuzione
│ │ ├── data-sync.worker.js # Web Worker sincronizzazione dati
│ │ ├── server-config.js # Utilità multi-server e localStorage
│ │ ├── pwa-register.js # Registrazione Service Worker
│ │ └── service-worker.js # PWA Service Worker (cache-first)
│ └── templates/
│ ├── index.html # Dashboard modelli disponibili
│ ├── servers.html # Gestione istanze Ollama
│ └── models_running.html # Modelli attualmente in esecuzione
├── docs/
│ ├── PRD.md # Product Requirements Document
│ ├── DEVELOPMENT.md # Guida al setup e sviluppo locale
│ └── WEB_WORKERS.md # Architettura Web Worker e PWA
├── scripts/
│ ├── deploy-no-cache.sh # Deploy Docker con rebuild forzato
│ └── verify-tailwind-css.sh # Verifica CSS compilato in container
└── tests/
├── __init__.py
├── test_api.py
└── test_ollama.py
├── test_api.py # Unit test endpoint FastAPI
├── test_ollama.py # Unit test client Ollama
└── e2e/
└── cache-navigation.spec.js # Test E2E Playwright (cache/PWA)
```
## 🛠️ Sviluppo
@@ -302,6 +400,9 @@ pytest tests/ -v
# Test con coverage
pytest tests/ --cov=app
# Browser E2E test (cache-first navigation)
OLLAMA_HOST=http://192.168.254.115:11434 npm run test:e2e
# Hot reload durante sviluppo
uvicorn main:app --reload
```
@@ -354,7 +455,21 @@ lsof -ti :8000 | xargs kill -9
## 📜 License
This project is distributed under the **MIT** license. See the `LICENSE` file for details.
This project is distributed under a **proprietary license** (all rights reserved).
Author and sole holder of all rights: **Luca Sacchi Ricciardi**.
- Free use permitted
- Software provided "AS IS"
- No warranty
- No obligation to provide support or maintenance
- Exclusive venue: Milan, Italy
Full details:
- `LICENSE` (main notice)
- `LICENSE.en.txt` (full text in English)
- `LICENSE.it.txt` (full text in Italian)
## 🤝 Contributing
@@ -370,12 +485,15 @@ Le pull request sono benvenute! Per cambiamenti importanti, apri prima un issue
## 📞 Support
For questions or bug reports, open an **Issue** in the repository.
For questions or bug reports, open an **Issue** in the repository or contact the author:
- luca.sacchi@gmail.com
- luca@lucasacchi.net
---
**Made with ❤️ by [LucaSacchi.Net](https://lucasacchi.net)**
**Author: Luca Sacchi Ricciardi ([LucaSacchi.Net](https://lucasacchi.net), luca.sacchi@gmail.com, luca@lucasacchi.net)**
**Version**: 1.0.0
**Version**: 1.1.0
**Last modified**: April 2026
**Status**: 🟢 In Development
**Status**: 🟢 Active
+21 -5
@@ -2,11 +2,13 @@
Health check endpoints
"""
from fastapi import APIRouter, HTTPException
from fastapi import APIRouter, HTTPException, Query
from pydantic import BaseModel
from datetime import datetime
import requests
import logging
from typing import Optional
from urllib.parse import urlparse
from app.config import settings
logger = logging.getLogger(__name__)
@@ -26,18 +28,31 @@ class HealthResponse(BaseModel):
}
}
def resolve_ollama_host(host: Optional[str]) -> str:
"""Resolve target Ollama host, optionally overridden by query parameter."""
if not host:
return settings.OLLAMA_HOST
# Trim once so the validated value and the returned value are the same string
cleaned = host.strip()
parsed = urlparse(cleaned)
if parsed.scheme not in {"http", "https"} or not parsed.netloc:
raise HTTPException(status_code=422, detail="Invalid Ollama host URL")
return cleaned.rstrip("/")
@router.get("/health", response_model=HealthResponse)
async def health_check():
async def health_check(host: Optional[str] = Query(default=None)):
"""
Health check for the API and for the Ollama status
Returns:
HealthResponse: Status of the API and of Ollama
"""
target_host = resolve_ollama_host(host)
try:
# Check Ollama
response = requests.get(
f"{settings.OLLAMA_HOST}/api/tags",
f"{target_host}/api/tags",
timeout=settings.OLLAMA_TIMEOUT
)
ollama_status = "online" if response.status_code == 200 else "offline"
@@ -52,13 +67,14 @@ async def health_check():
)
@router.get("/ready")
async def ready():
async def ready(host: Optional[str] = Query(default=None)):
"""
Readiness probe for Kubernetes/Docker
"""
target_host = resolve_ollama_host(host)
try:
response = requests.get(
f"{settings.OLLAMA_HOST}/api/tags",
f"{target_host}/api/tags",
timeout=5
)
if response.status_code == 200:
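The host override added to these endpoints can be exercised in isolation. The sketch below is a stdlib-only approximation of `resolve_ollama_host`, not the project's code: `DEFAULT_OLLAMA_HOST` is a hypothetical stand-in for `settings.OLLAMA_HOST`, `ValueError` takes the place of the `HTTPException(status_code=422)` raised in the diff, and the value is trimmed before it is validated and returned.

```python
from typing import Optional
from urllib.parse import urlparse

# Assumption: placeholder for settings.OLLAMA_HOST in the real application.
DEFAULT_OLLAMA_HOST = "http://localhost:11434"


def resolve_host(host: Optional[str], default: str = DEFAULT_OLLAMA_HOST) -> str:
    """Return the default host, or a validated override without a trailing slash."""
    if not host:
        return default
    cleaned = host.strip()
    parsed = urlparse(cleaned)
    # Only absolute http(s) URLs with a network location are accepted.
    if parsed.scheme not in {"http", "https"} or not parsed.netloc:
        # The diff answers with HTTP 422 here; ValueError keeps the sketch stdlib-only.
        raise ValueError("Invalid Ollama host URL")
    return cleaned.rstrip("/")


if __name__ == "__main__":
    print(resolve_host(None))
    print(resolve_host(" http://10.0.0.5:11434/ "))
```

A value such as `localhost:11434` is rejected because it carries no `http` or `https` scheme, so callers have to pass a full URL in the `host` query parameter.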
+144 -8
@@ -2,17 +2,39 @@
Models endpoints - Ollama model management
"""
from fastapi import APIRouter, HTTPException
from fastapi import APIRouter, HTTPException, Query
from pydantic import BaseModel
from typing import List, Optional
from typing import Any, Dict, List, Optional
from datetime import datetime
import requests
import logging
from urllib.parse import urlparse
from app.config import settings
logger = logging.getLogger(__name__)
router = APIRouter()
def ensure_rw_api_enabled() -> None:
"""Block the write APIs unless they are explicitly enabled."""
if not settings.ENABLE_MODEL_RW_API:
raise HTTPException(
status_code=404,
detail="Endpoint non disponibile"
)
def resolve_ollama_host(host: Optional[str]) -> str:
"""Resolve target Ollama host, optionally overridden by query parameter."""
if not host:
return settings.OLLAMA_HOST
# Trim once so the validated value and the returned value are the same string
cleaned = host.strip()
parsed = urlparse(cleaned)
if parsed.scheme not in {"http", "https"} or not parsed.netloc:
raise HTTPException(status_code=422, detail="Invalid Ollama host URL")
return cleaned.rstrip("/")
class ModelInfo(BaseModel):
"""Information about a model"""
name: str
@@ -51,7 +73,7 @@ class ModelsResponse(BaseModel):
}
@router.get("/models", response_model=ModelsResponse)
async def get_models():
async def get_models(host: Optional[str] = Query(default=None)):
"""
Retrieves the list of all models loaded in Ollama
@@ -61,9 +83,10 @@ async def get_models():
Raises:
HTTPException: If Ollama is not available
"""
target_host = resolve_ollama_host(host)
try:
response = requests.get(
f"{settings.OLLAMA_HOST}/api/tags",
f"{target_host}/api/tags",
timeout=settings.OLLAMA_TIMEOUT
)
@@ -103,6 +126,8 @@ async def get_models():
status_code=502,
detail="Impossibile connettersi a Ollama"
)
except HTTPException:
raise
except Exception as e:
logger.error(f"Error fetching models: {e}")
raise HTTPException(
@@ -110,8 +135,57 @@ async def get_models():
detail="Errore nel recupero dei modelli"
)
@router.get("/models/running")
async def get_running_models(host: Optional[str] = Query(default=None)) -> Dict[str, Any]:
"""
Retrieves the models currently resident in memory, equivalent to `ollama ps`.
Returns:
Dict[str, Any]: Payload with the running models and their count
"""
target_host = resolve_ollama_host(host)
try:
response = requests.get(
f"{target_host}/api/ps",
timeout=settings.OLLAMA_TIMEOUT
)
if response.status_code != 200:
raise HTTPException(
status_code=502,
detail="Ollama non disponibile"
)
data = response.json()
models_data = data.get("models", [])
return {
"models": models_data,
"total": len(models_data)
}
except requests.exceptions.Timeout:
raise HTTPException(
status_code=504,
detail="Timeout: Ollama non ha risposto in tempo"
)
except requests.exceptions.ConnectionError:
raise HTTPException(
status_code=502,
detail="Impossibile connettersi a Ollama"
)
except HTTPException:
raise
except Exception as e:
logger.error(f"Error fetching running models: {e}")
raise HTTPException(
status_code=500,
detail="Errore nel recupero dei modelli residenti"
)
@router.get("/models/{model_name}", response_model=ModelInfo)
async def get_model(model_name: str):
async def get_model(model_name: str, host: Optional[str] = Query(default=None)):
"""
Retrieves the information for a specific model
@@ -124,9 +198,10 @@ async def get_model(model_name: str):
Raises:
HTTPException: If the model does not exist or Ollama is not available
"""
target_host = resolve_ollama_host(host)
try:
response = requests.get(
f"{settings.OLLAMA_HOST}/api/tags",
f"{target_host}/api/tags",
timeout=settings.OLLAMA_TIMEOUT
)
@@ -165,7 +240,63 @@ async def get_model(model_name: str):
detail="Errore nel recupero del modello"
)
@router.post("/models/{model_name}/pull")
@router.get("/models/{model_name}/show")
async def get_model_show(model_name: str, host: Optional[str] = Query(default=None)) -> Dict[str, Any]:
"""
Retrieves the extended information for a model through the Ollama /api/show endpoint.
Args:
model_name: Name of the model to query
Returns:
Dict[str, Any]: Extended model data
"""
target_host = resolve_ollama_host(host)
try:
response = requests.post(
f"{target_host}/api/show",
json={"model": model_name},
timeout=settings.OLLAMA_TIMEOUT
)
if response.status_code == 404:
raise HTTPException(
status_code=404,
detail=f"Modello '{model_name}' non trovato"
)
if response.status_code != 200:
raise HTTPException(
status_code=502,
detail="Errore durante il recupero dettagli modello"
)
return response.json()
except requests.exceptions.Timeout:
raise HTTPException(
status_code=504,
detail="Timeout: Ollama non ha risposto in tempo"
)
except requests.exceptions.ConnectionError:
raise HTTPException(
status_code=502,
detail="Impossibile connettersi a Ollama"
)
except HTTPException:
raise
except Exception as e:
logger.error(f"Error fetching model show data: {e}")
raise HTTPException(
status_code=500,
detail="Errore nel recupero dei dettagli modello"
)
@router.post(
"/models/{model_name}/pull",
include_in_schema=settings.ENABLE_MODEL_RW_API
)
async def pull_model(model_name: str):
"""
Downloads/loads a model into Ollama
@@ -176,6 +307,7 @@ async def pull_model(model_name: str):
Returns:
dict: Download status
"""
ensure_rw_api_enabled()
try:
response = requests.post(
f"{settings.OLLAMA_HOST}/api/pull",
@@ -198,7 +330,10 @@ async def pull_model(model_name: str):
detail="Errore nel pull del modello"
)
@router.delete("/models/{model_name}")
@router.delete(
"/models/{model_name}",
include_in_schema=settings.ENABLE_MODEL_RW_API
)
async def delete_model(model_name: str):
"""
Deletes a model from Ollama
@@ -209,6 +344,7 @@ async def delete_model(model_name: str):
Returns:
dict: Deletion confirmation
"""
ensure_rw_api_enabled()
try:
response = requests.delete(
f"{settings.OLLAMA_HOST}/api/delete",
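The read endpoints in this file share one error policy: a timeout becomes 504, a connection failure becomes 502, an `HTTPException` raised inside the `try` block is passed through unchanged, and anything else becomes 500. A minimal sketch of that mapping follows, under stated assumptions: the built-in `TimeoutError` and `ConnectionError` stand in for `requests.exceptions.Timeout` and `requests.exceptions.ConnectionError`, `UpstreamHTTPError` is a hypothetical substitute for `fastapi.HTTPException`, and the English detail messages are illustrative.

```python
class UpstreamHTTPError(Exception):
    """Hypothetical stand-in for fastapi.HTTPException (status code plus detail)."""

    def __init__(self, status_code: int, detail: str) -> None:
        super().__init__(detail)
        self.status_code = status_code
        self.detail = detail


def to_http_error(exc: Exception) -> UpstreamHTTPError:
    """Map a failure raised while calling Ollama to the status the API returns."""
    if isinstance(exc, UpstreamHTTPError):
        # Already an HTTP error (for example a 404 for a missing model): keep it.
        return exc
    if isinstance(exc, TimeoutError):
        return UpstreamHTTPError(504, "Ollama did not respond in time")
    if isinstance(exc, ConnectionError):
        return UpstreamHTTPError(502, "Cannot connect to Ollama")
    # Anything unexpected is reported as an internal error.
    return UpstreamHTTPError(500, "Unexpected error while querying Ollama")


if __name__ == "__main__":
    for failure in (TimeoutError(), ConnectionError(), RuntimeError("boom")):
        print(type(failure).__name__, to_http_error(failure).status_code)
```

Checking for an existing HTTP error first mirrors the `except HTTPException: raise` clause in the diff, which keeps a deliberate 404 or 502 from being rewritten as a 500 by the generic handler.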
+1
@@ -16,6 +16,7 @@ class Settings(BaseSettings):
API_HOST: str = "0.0.0.0"
API_PORT: int = 8000
API_WORKERS: int = 4
ENABLE_MODEL_RW_API: bool = False
# CORS
CORS_ORIGINS: str = "http://localhost:3000,http://localhost:5173,http://localhost:8000"
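`ENABLE_MODEL_RW_API` defaults to `False`, so the pull and delete routes answer 404 and are left out of the OpenAPI schema unless the flag is switched on. The sketch below shows one way such a gate can work, assuming the flag arrives as an environment variable of the same name. The accepted truthy spellings and the `LookupError` (standing in for the 404 `HTTPException`) are illustrative choices, not taken from the project.

```python
import os
from typing import Mapping

# Assumption: spellings treated as "enabled"; the real settings class may accept others.
TRUTHY = {"1", "true", "yes", "on"}


def rw_api_enabled(env: Mapping[str, str] = os.environ) -> bool:
    """Return True only when the flag is explicitly set to a truthy value."""
    raw = env.get("ENABLE_MODEL_RW_API", "")
    return raw.strip().lower() in TRUTHY


def ensure_rw_api_enabled(env: Mapping[str, str] = os.environ) -> None:
    """Refuse write operations unless the flag is on (the diff responds with 404)."""
    if not rw_api_enabled(env):
        raise LookupError("Endpoint not available")


if __name__ == "__main__":
    print(rw_api_enabled({}))
    print(rw_api_enabled({"ENABLE_MODEL_RW_API": "true"}))
```

Defaulting to disabled means a fresh deployment exposes only the read endpoints. Answering 404 instead of 403 also avoids advertising that the write routes exist.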
+3
@@ -0,0 +1,3 @@
@tailwind base;
@tailwind components;
@tailwind utilities;
File diff suppressed because it is too large
Binary file not shown.


+733
@@ -0,0 +1,733 @@
/**
* LLM Monitor - Main App
* Manages the Web Worker and updates the DOM efficiently
*/
class LLMMonitorApp {
constructor() {
this.worker = null;
this.activeServer = getActiveServer();
this.selectedModelName = null;
this.isModalOpen = false;
this.hoverOpenDelayMs = 180;
this.hoverOpenTimer = null;
this.lastData = {
health: null,
models: null
};
this.init();
}
init() {
if (!this.activeServer) {
this.renderNoServerState();
return;
}
this.updateServerContextUI();
// Load data from localStorage before any network sync.
this.loadFromLocalStorage();
// Initialize the Web Worker
if (typeof Worker !== 'undefined') {
this.worker = new Worker('/static/js/data-sync.worker.js');
this.worker.onmessage = (event) => this.handleWorkerMessage(event);
this.worker.onerror = (error) => {
console.error("Worker error:", error);
// Fallback: synchronize in the main thread
this.syncDataInMainThread();
};
const shouldSyncImmediately = this.shouldSyncImmediately();
this.worker.postMessage({
type: "SET_SERVER",
serverId: this.activeServer.id,
host: this.activeServer.host,
syncImmediately: shouldSyncImmediately,
lastSyncTimestamp: this.getLatestCacheTimestamp()
});
if (shouldSyncImmediately) {
this.renderLoadingState();
}
} else if (this.shouldSyncImmediately()) {
console.warn("Web Workers not supported, using main thread");
this.syncDataInMainThread();
}
// Listener for the manual refresh button
document.getElementById("refresh-btn")?.addEventListener("click", () => {
this.requestSync();
});
// Close the modal with the X button
document.getElementById("model-details-close")?.addEventListener("click", () => {
this.hideModelDetails();
});
// Close the modal by clicking the overlay
document.getElementById("model-details-backdrop")?.addEventListener("click", () => {
this.hideModelDetails();
});
// Close the modal with the Esc key
document.addEventListener("keydown", (event) => {
if (event.key === "Escape") {
this.hideModelDetails();
}
});
}
// Load data from localStorage
loadFromLocalStorage() {
const health = readServerCache(this.activeServer.id, "health");
const models = readServerCache(this.activeServer.id, "models");
if (health) {
this.lastData.health = health;
this.renderHealth(this.lastData.health);
}
if (models) {
this.lastData.models = models;
this.renderModels(this.lastData.models);
}
this.updateCacheModeIndicator(models);
}
// Handle messages from the Worker
handleWorkerMessage(event) {
const { type, health, modelsData, runningData, serverId } = event.data;
if (serverId && this.activeServer && serverId !== this.activeServer.id) {
return;
}
if (type === "DATA_UPDATED") {
if (health && JSON.stringify(this.lastData.health) !== JSON.stringify(health)) {
this.lastData.health = health;
try {
writeServerCache(this.activeServer.id, "health", health);
} catch (error) {
console.warn("Cannot persist health in localStorage:", error);
}
this.renderHealth(health);
}
if (modelsData && JSON.stringify(this.lastData.models) !== JSON.stringify(modelsData)) {
this.lastData.models = modelsData;
try {
const persistedModels = writeServerCache(this.activeServer.id, "models", modelsData);
if (persistedModels) {
this.lastData.models = persistedModels;
}
} catch (error) {
console.warn("Cannot persist models in localStorage:", error);
}
this.updateCacheModeIndicator(this.lastData.models);
this.renderModels(this.lastData.models);
if (this.selectedModelName) {
this.showModelDetails(this.selectedModelName);
}
}
if (runningData) {
try {
writeServerCache(this.activeServer.id, "running", runningData);
} catch (error) {
console.warn("Cannot persist running models in localStorage:", error);
}
}
}
}
// Render health (granular update)
renderHealth(health) {
if (!health) return;
const ollamaStatus = health.ollama_status;
const statusEl = document.getElementById("status-indicator");
const statusText = document.getElementById("status-text");
const ollamaStatusEl = document.getElementById("ollama-status");
if (ollamaStatus === "online") {
// Update only if changed
if (!statusEl.classList.contains("bg-green-500")) {
statusEl.className = "w-3 h-3 bg-green-500 rounded-full";
statusText.className = "text-sm text-green-400";
statusText.textContent = "Ollama Online";
ollamaStatusEl.innerHTML = "🟢 Online";
}
} else {
if (!statusEl.classList.contains("bg-red-500")) {
statusEl.className = "w-3 h-3 bg-red-500 rounded-full";
statusText.className = "text-sm text-red-400";
statusText.textContent = "Ollama Offline";
ollamaStatusEl.innerHTML = "🔴 Offline";
}
}
}
// Render models (granular update)
renderModels(modelsData) {
if (!modelsData) return;
// Update the count
document.getElementById("models-count").textContent = modelsData.total;
// Update the total size
document.getElementById("total-size").textContent = modelsData.totalSize;
// Update the model list
const container = document.getElementById("models-container");
const { models } = modelsData;
if (models.length === 0) {
container.innerHTML = `
<div class="text-center py-8 text-gray-400">
<p>No models loaded</p>
</div>
`;
return;
}
// Compare with the previous rendering (avoid a re-render if identical)
const newHTML = models.map(model => this.renderModelCard(model)).join("");
// Update only if actually different
if (container.innerHTML !== newHTML) {
container.innerHTML = newHTML;
this.bindModelCardInteractions();
}
}
// Bind card events after every render (more reliable than delegation on hover)
bindModelCardInteractions() {
const cards = document.querySelectorAll("#models-container [data-model-key]");
cards.forEach((card) => {
if (card.dataset.modalBound === "true") {
return;
}
const modelKey = card.getAttribute("data-model-key");
if (!modelKey) {
return;
}
const modelName = decodeURIComponent(modelKey);
card.dataset.modalBound = "true";
card.addEventListener("click", () => {
this.toggleModelDetails(modelName);
});
card.addEventListener("mouseenter", () => {
if (this.hoverOpenTimer) {
clearTimeout(this.hoverOpenTimer);
}
this.hoverOpenTimer = setTimeout(() => {
this.showModelDetails(modelName);
}, this.hoverOpenDelayMs);
});
card.addEventListener("mouseleave", () => {
if (this.hoverOpenTimer) {
clearTimeout(this.hoverOpenTimer);
this.hoverOpenTimer = null;
}
});
});
}
toggleModelDetails(modelName) {
if (this.isModalOpen && this.selectedModelName === modelName) {
this.hideModelDetails();
return;
}
this.showModelDetails(modelName);
}
// Render a single model card
renderModelCard(model) {
const formattedDate = this.formatDate(model.modified_at);
const modelName = this.escapeHtml(model.name);
const modelKey = encodeURIComponent(model.name);
return `
<div data-model-key="${modelKey}" class="bg-gray-700 rounded-lg p-4 border border-gray-600 hover:border-purple-500 hover:-translate-y-0.5 transition cursor-pointer h-full">
<div class="flex items-start justify-between mb-3">
<h3 class="text-lg font-semibold">${modelName}</h3>
<span class="bg-purple-600 px-3 py-1 rounded text-xs font-medium">Loaded</span>
</div>
<div class="grid grid-cols-2 gap-4 text-sm">
<div>
<p class="text-gray-400">Size</p>
<p class="font-semibold">${this.formatBytes(model.size)}</p>
</div>
<div>
<p class="text-gray-400">Last Updated</p>
<p class="font-semibold">${formattedDate}</p>
</div>
</div>
<div class="mt-3">
<p class="text-gray-400 text-xs">Digest</p>
<p class="font-mono text-xs bg-gray-800 p-2 rounded mt-1 break-all">${this.escapeHtml(model.digest.substring(0, 64))}...</p>
</div>
<p class="text-xs text-purple-300 mt-3">Hover or click to view show details</p>
</div>
`;
}
showModelDetails(modelName) {
const detailsModal = document.getElementById("model-details-modal");
const detailsDialog = document.getElementById("model-details-dialog");
const detailsName = document.getElementById("model-details-name");
const detailsContent = document.getElementById("model-details-content");
if (!detailsModal || !detailsDialog || !detailsName || !detailsContent || !this.lastData.models) {
return;
}
const showByModel = this.lastData.models.showByModel || {};
const showData = showByModel[modelName];
this.selectedModelName = modelName;
this.isModalOpen = true;
detailsModal.classList.remove("hidden");
detailsModal.classList.add("flex");
detailsDialog.classList.add("flex");
document.body.classList.add("overflow-hidden");
detailsName.textContent = modelName;
detailsModal.setAttribute("aria-hidden", "false");
if (!showData) {
detailsContent.innerHTML = `
<div class="flex items-center gap-2 text-gray-400 text-sm py-4">
<svg class="animate-spin w-4 h-4 shrink-0" fill="none" viewBox="0 0 24 24">
<circle class="opacity-25" cx="12" cy="12" r="10" stroke="currentColor" stroke-width="4"></circle>
<path class="opacity-75" fill="currentColor" d="M4 12a8 8 0 018-8v8H4z"></path>
</svg>
Loading details…
</div>`;
this.loadModelShowDetails(modelName, detailsContent);
return;
}
detailsContent.innerHTML = this.buildAccordionHTML(showData);
}
async loadModelShowDetails(modelName, detailsContent) {
try {
const response = await fetch(this.buildApiUrl(`/api/v1/models/${encodeURIComponent(modelName)}/show`));
if (!response.ok) {
throw new Error(`Failed to load show details for ${modelName}`);
}
const showData = await response.json();
if (!this.lastData.models) {
return;
}
if (!this.lastData.models.showByModel) {
this.lastData.models.showByModel = {};
}
this.lastData.models.showByModel[modelName] = showData;
if (this.selectedModelName === modelName) {
detailsContent.innerHTML = this.buildAccordionHTML(showData);
}
} catch (error) {
console.error(error);
if (this.selectedModelName === modelName) {
detailsContent.innerHTML = '<p class="text-gray-400 text-sm py-2">Show details are not available for this model.</p>';
}
}
}
// ── Accordion helpers ────────────────────────────────────────────────────
buildAccordionHTML(showData) {
if (!showData || typeof showData !== "object") {
return '<p class="text-gray-400 text-sm py-2">No details available.</p>';
}
const sectionOrder = ["details", "model_info", "parameters", "template", "modelfile", "license"];
const allKeys = Object.keys(showData);
const orderedKeys = [
...sectionOrder.filter(k => k in showData),
...allKeys.filter(k => !sectionOrder.includes(k))
];
let html = '<div class="space-y-2">';
orderedKeys.forEach((key, index) => {
const value = showData[key];
const isFirst = index === 0;
const contentId = `acc-${key.replace(/[^a-z0-9]/gi, "-")}`;
const label = this.formatAccordionLabel(key);
const body = this.renderAccordionBody(key, value);
html += `
<div class="border border-gray-700 rounded-lg overflow-hidden">
<button type="button"
class="accordion-header w-full flex items-center justify-between px-4 py-2.5 bg-gray-800 hover:bg-gray-700 text-left transition-colors duration-150"
onclick="app.toggleAccordion('${contentId}', this)"
aria-expanded="${isFirst}">
<span class="font-semibold text-sm text-gray-200">${label}</span>
<svg class="accordion-chevron text-gray-400"
width="16" height="16" style="flex-shrink:0;transition:transform 0.2s;transform:${isFirst ? "rotate(180deg)" : "rotate(0deg)"}"
fill="none" stroke="currentColor" viewBox="0 0 24 24">
<path stroke-linecap="round" stroke-linejoin="round" stroke-width="2" d="M19 9l-7 7-7-7"/>
</svg>
</button>
<div id="${contentId}" class="accordion-content bg-gray-900 border-t border-gray-700 ${isFirst ? "" : "hidden"}">
<div class="px-4 py-3">${body}</div>
</div>
</div>`;
});
html += "</div>";
return html;
}
formatAccordionLabel(key) {
const labels = {
details: "Details",
model_info: "Model Info",
parameters: "Parameters",
template: "Template",
modelfile: "Modelfile",
license: "License"
};
const icons = {
details: "▦",
model_info: "🧠",
parameters: "⚙",
template: "📄",
modelfile: "📦",
license: "📜"
};
const icon = icons[key] || "▸";
const text = labels[key] || key.replace(/_/g, " ").replace(/\b\w/g, c => c.toUpperCase());
return `<span class="mr-2 text-base">${icon}</span>${text}`;
}
renderAccordionBody(key, value) {
if (key === "details" && value && typeof value === "object" && !Array.isArray(value)) {
return this.renderDetailsGrid(value);
}
if (key === "model_info" && value && typeof value === "object" && !Array.isArray(value)) {
return this.renderModelInfoTable(value);
}
if (typeof value === "string") {
return `<pre class="text-xs text-gray-300 whitespace-pre-wrap font-mono leading-relaxed max-h-60 overflow-y-auto">${this.escapeHtml(value)}</pre>`;
}
if (value && typeof value === "object") {
return this.renderKeyValueList(value);
}
return `<span class="text-sm text-gray-300">${this.escapeHtml(String(value))}</span>`;
}
renderDetailsGrid(details) {
const labelMap = {
family: "Family",
families: "Families",
parameter_size: "Parameters",
quantization_level: "Quantization",
format: "Format",
parent_model: "Parent Model"
};
let html = '<div class="grid grid-cols-2 gap-x-6 gap-y-3">';
for (const [k, v] of Object.entries(details)) {
const label = labelMap[k] || k.replace(/_/g, " ").replace(/\b\w/g, c => c.toUpperCase());
const display = Array.isArray(v) ? v.join(", ") : String(v);
html += `
<div class="flex flex-col">
<span class="text-xs text-gray-500 uppercase tracking-wide">${this.escapeHtml(label)}</span>
<span class="text-sm text-gray-200 font-medium mt-0.5">${this.escapeHtml(display)}</span>
</div>`;
}
html += "</div>";
return html;
}
renderModelInfoTable(modelInfo) {
let html = '<dl class="space-y-1.5">';
for (const [k, v] of Object.entries(modelInfo)) {
const display = typeof v === "object" ? JSON.stringify(v) : String(v);
html += `
<div class="flex gap-3 text-xs">
<dt class="text-gray-500 font-mono shrink-0 w-5/12 truncate" title="${this.escapeHtml(k)}">${this.escapeHtml(k)}</dt>
<dd class="text-gray-300 break-all">${this.escapeHtml(display)}</dd>
</div>`;
}
html += "</dl>";
return html;
}
renderKeyValueList(obj) {
let html = '<dl class="space-y-1.5">';
for (const [k, v] of Object.entries(obj)) {
const display = typeof v === "object" ? JSON.stringify(v) : String(v);
html += `
<div class="flex gap-3 text-xs">
<dt class="text-gray-500 shrink-0 w-1/3">${this.escapeHtml(k)}</dt>
<dd class="text-gray-300 break-all">${this.escapeHtml(display)}</dd>
</div>`;
}
html += "</dl>";
return html;
}
escapeHtml(str) {
return String(str)
.replace(/&/g, "&amp;")
.replace(/</g, "&lt;")
.replace(/>/g, "&gt;")
.replace(/"/g, "&quot;")
.replace(/'/g, "&#39;");
}
toggleAccordion(contentId, btn) {
const content = document.getElementById(contentId);
if (!content) return;
const isHidden = content.classList.contains("hidden");
content.classList.toggle("hidden", !isHidden);
const chevron = btn.querySelector(".accordion-chevron");
if (chevron) {
chevron.style.transform = isHidden ? "rotate(180deg)" : "";
}
btn.setAttribute("aria-expanded", String(isHidden));
}
// ── End of accordion helpers ─────────────────────────────────────────────
hideModelDetails() {
const detailsModal = document.getElementById("model-details-modal");
const detailsDialog = document.getElementById("model-details-dialog");
if (!detailsModal || detailsModal.classList.contains("hidden")) {
return;
}
detailsModal.classList.add("hidden");
detailsModal.classList.remove("flex");
detailsDialog?.classList.remove("flex");
document.body.classList.remove("overflow-hidden");
detailsModal.setAttribute("aria-hidden", "true");
this.isModalOpen = false;
this.selectedModelName = null;
}
// Format bytes
formatBytes(bytes) {
if (!bytes || bytes <= 0) return "0 B";
const k = 1024;
const sizes = ["B", "KB", "MB", "GB", "TB"];
// Clamp the unit index so very large values never index past the last unit
const i = Math.min(sizes.length - 1, Math.floor(Math.log(bytes) / Math.log(k)));
return (bytes / Math.pow(k, i)).toFixed(2) + " " + sizes[i];
}
// Format date
formatDate(dateString) {
const date = new Date(dateString);
return date.toLocaleDateString("en-US", {
year: "numeric",
month: "short",
day: "numeric",
hour: "2-digit",
minute: "2-digit"
});
}
// Request a manual sync from the Worker
requestSync() {
if (!this.activeServer) {
return;
}
if (this.worker) {
this.worker.postMessage({ type: "SYNC_NOW" });
} else {
this.syncDataInMainThread();
}
}
// Fallback: synchronize in the main thread
async syncDataInMainThread() {
if (!this.activeServer) {
return;
}
try {
const response = await fetch(this.buildApiUrl("/api/v1/health"));
if (response.ok) {
const health = await response.json();
this.lastData.health = health;
writeServerCache(this.activeServer.id, "health", health);
this.renderHealth(health);
}
} catch (error) {
console.error("Health check error:", error);
}
try {
const response = await fetch(this.buildApiUrl("/api/v1/models"));
if (response.ok) {
const data = await response.json();
const models = data.models || [];
const showByModel = {};
await Promise.allSettled(
models.map(async (model) => {
const showResponse = await fetch(this.buildApiUrl(`/api/v1/models/${encodeURIComponent(model.name)}/show`));
if (showResponse.ok) {
showByModel[model.name] = await showResponse.json();
}
})
);
const modelsData = {
models,
total: models.length,
totalSize: this.formatBytes(models.reduce((sum, m) => sum + m.size, 0)),
showByModel,
timestamp: new Date().toISOString()
};
this.lastData.models = modelsData;
const persistedModels = writeServerCache(this.activeServer.id, "models", modelsData);
if (persistedModels) {
this.lastData.models = persistedModels;
}
this.updateCacheModeIndicator(this.lastData.models);
this.renderModels(this.lastData.models);
if (this.selectedModelName) {
this.showModelDetails(this.selectedModelName);
}
}
} catch (error) {
console.error("Models loading error:", error);
}
}
getStorageKey(suffix) {
return getServerStorageKey(this.activeServer.id, suffix);
}
shouldSyncImmediately() {
const health = readServerCache(this.activeServer.id, "health");
const models = readServerCache(this.activeServer.id, "models");
if (!health || !models) {
return true;
}
return isCacheStale(this.getLatestCacheTimestamp());
}
getLatestCacheTimestamp() {
return getLatestServerCacheTimestamp(this.activeServer.id, ["health", "models", "running"]);
}
buildApiUrl(path) {
const url = new URL(path, window.location.origin);
url.searchParams.set("host", this.activeServer.host);
return `${url.pathname}${url.search}`;
}
updateServerContextUI() {
const serverLabel = document.getElementById("active-server-label");
if (serverLabel) {
serverLabel.textContent = `Server: ${this.activeServer.name}`;
serverLabel.classList.remove("hidden");
}
const runningLink = document.getElementById("running-link");
if (runningLink) {
runningLink.href = buildServerUrl("/models-running", this.activeServer.id);
}
const serversLink = document.getElementById("servers-link");
if (serversLink) {
serversLink.href = "/servers";
}
}
renderNoServerState() {
const container = document.getElementById("models-container");
const count = document.getElementById("models-count");
const totalSize = document.getElementById("total-size");
const statusIndicator = document.getElementById("status-indicator");
const statusText = document.getElementById("status-text");
const ollamaStatus = document.getElementById("ollama-status");
const cacheModeIndicator = document.getElementById("cache-mode-indicator");
if (count) count.textContent = "0";
if (totalSize) totalSize.textContent = "0 B";
if (statusIndicator) statusIndicator.className = "w-3 h-3 bg-yellow-500 rounded-full";
if (statusText) {
statusText.className = "text-sm text-yellow-300";
statusText.textContent = "No server selected";
}
if (ollamaStatus) {
ollamaStatus.innerHTML = "🟡 Not configured";
}
if (cacheModeIndicator) {
cacheModeIndicator.classList.add("hidden");
}
if (container) {
container.innerHTML = `
<div class="text-center py-10 text-gray-300">
<p class="text-lg font-semibold">No server selected</p>
<p class="text-sm text-gray-400 mt-2">Configure or select a server from the control panel.</p>
<a href="/servers" class="inline-block mt-4 bg-purple-600 hover:bg-purple-700 px-4 py-2 rounded">Open Servers Control Panel</a>
</div>
`;
}
}
updateCacheModeIndicator(modelsData) {
const cacheModeIndicator = document.getElementById("cache-mode-indicator");
if (!cacheModeIndicator) {
return;
}
if (hasDeferredShowDetails(modelsData)) {
cacheModeIndicator.classList.remove("hidden");
return;
}
cacheModeIndicator.classList.add("hidden");
}
renderLoadingState() {
if (this.lastData.models) {
return;
}
const container = document.getElementById("models-container");
if (!container) {
return;
}
container.innerHTML = `
<div class="text-center py-8">
<div class="animate-spin inline-block w-8 h-8 border-4 border-gray-600 border-t-purple-500 rounded-full"></div>
<p class="text-gray-400 mt-4">Loading models...</p>
</div>
`;
}
}
// Initialize the app when the DOM is ready
document.addEventListener("DOMContentLoaded", () => {
window.app = new LLMMonitorApp();
});
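The byte formatting used for model sizes is easy to check outside the browser. The sketch below is a Python port for illustration only, not part of the commit. It keeps the same 1024-based units and two-decimal output, and additionally assumes a `TB` unit and a clamp on the unit index so that very large totals still get a label.

```python
UNITS = ["B", "KB", "MB", "GB", "TB"]  # assumption: TB added for very large models


def format_bytes(num_bytes: float) -> str:
    """Format a byte count with 1024-based units and two decimals."""
    if num_bytes <= 0:
        return "0 B"
    value = float(num_bytes)
    index = 0
    # Divide down until the value fits the unit, never going past the last unit.
    while value >= 1024 and index < len(UNITS) - 1:
        value /= 1024
        index += 1
    return f"{value:.2f} {UNITS[index]}"


if __name__ == "__main__":
    for size in (0, 1536, 1024 ** 3, 5 * 1024 ** 4):
        print(size, "->", format_bytes(size))
```

The loop replaces the logarithm used in the JavaScript helper, which sidesteps floating-point rounding at exact powers of 1024.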
+206
@@ -0,0 +1,206 @@
/**
* LLM Monitor - Data Sync Worker
* Updates data in the background and notifies the main thread
*/
const API_BASE = "/api/v1";
const REFRESH_INTERVAL = 30000; // 30 seconds
let activeServerId = null;
let activeHost = null;
let nextSyncTimeout = null;
// Format bytes
function formatBytes(bytes) {
if (!bytes || bytes <= 0) return "0 B";
const k = 1024;
const sizes = ["B", "KB", "MB", "GB", "TB"];
// Clamp the unit index so very large values never index past the last unit
const i = Math.min(sizes.length - 1, Math.floor(Math.log(bytes) / Math.log(k)));
return (bytes / Math.pow(k, i)).toFixed(2) + " " + sizes[i];
}
// Fetch health
async function fetchHealth() {
if (!activeHost) {
return null;
}
try {
const response = await fetch(buildApiUrl(`${API_BASE}/health`));
if (response.ok) {
const data = await response.json();
return {
status: data.status,
ollama_status: data.ollama_status,
timestamp: new Date().toISOString(),
serverId: activeServerId
};
}
} catch (error) {
console.error("Health check error:", error);
}
return null;
}
// Fetch models
async function fetchModels() {
if (!activeHost) {
return null;
}
try {
const response = await fetch(buildApiUrl(`${API_BASE}/models`));
if (!response.ok) {
return null;
}
const data = await response.json();
const models = data.models || [];
return {
models,
total: models.length,
totalSize: formatBytes(models.reduce((sum, m) => sum + m.size, 0)),
timestamp: new Date().toISOString(),
serverId: activeServerId
};
} catch (error) {
console.error("Error loading models:", error);
return null;
}
}
// Fetch show details for one model
async function fetchModelShow(modelName) {
try {
const response = await fetch(buildApiUrl(`${API_BASE}/models/${encodeURIComponent(modelName)}/show`));
if (!response.ok) {
return null;
}
return await response.json();
} catch (error) {
console.error(`Error loading show data for model ${modelName}:`, error);
return null;
}
}
// Fetch show details for all models
async function fetchAllModelsShow(models) {
const showByModel = {};
const results = await Promise.allSettled(
models.map(async (model) => {
const showData = await fetchModelShow(model.name);
return { name: model.name, showData };
})
);
results.forEach((result) => {
if (result.status === "fulfilled" && result.value.showData) {
showByModel[result.value.name] = result.value.showData;
}
});
return showByModel;
}
async function fetchRunningModels() {
if (!activeHost) {
return null;
}
try {
const response = await fetch(buildApiUrl(`${API_BASE}/models/running`));
if (!response.ok) {
return null;
}
const data = await response.json();
return {
models: data.models || [],
total: data.total || (data.models || []).length,
timestamp: new Date().toISOString(),
serverId: activeServerId
};
} catch (error) {
console.error("Error loading running models:", error);
return null;
}
}
// Synchronize the data
async function syncData() {
if (!activeHost) {
self.postMessage({
type: "DATA_UPDATED",
health: null,
modelsData: null,
serverId: activeServerId
});
return;
}
const health = await fetchHealth();
const isOnline = health?.ollama_status === "online";
const modelsData = isOnline ? await fetchModels() : null;
const runningData = isOnline ? await fetchRunningModels() : null;
if (modelsData && modelsData.models.length > 0) {
modelsData.showByModel = await fetchAllModelsShow(modelsData.models);
} else if (modelsData) {
modelsData.showByModel = {};
}
// Notify the main thread
// (the main thread handles localStorage)
self.postMessage({
type: "DATA_UPDATED",
health,
modelsData,
runningData,
serverId: activeServerId
});
}
function buildApiUrl(path) {
const url = new URL(path, self.location.origin);
url.searchParams.set("host", activeHost);
return `${url.pathname}${url.search}`;
}
function clearNextSync() {
if (nextSyncTimeout) {
clearTimeout(nextSyncTimeout);
nextSyncTimeout = null;
}
}
function scheduleNextSync(lastSyncTimestamp = 0) {
clearNextSync();
const ageMs = lastSyncTimestamp ? Math.max(0, Date.now() - lastSyncTimestamp) : REFRESH_INTERVAL;
const delayMs = Math.max(0, REFRESH_INTERVAL - ageMs);
nextSyncTimeout = setTimeout(async () => {
await syncData();
scheduleNextSync(Date.now());
}, delayMs);
}
// Handle messages from the main thread
self.onmessage = (event) => {
if (event.data.type === "SET_SERVER") {
activeServerId = event.data.serverId || null;
activeHost = event.data.host || null;
const lastSyncTimestamp = Number(event.data.lastSyncTimestamp || 0);
if (event.data.syncImmediately) {
syncData().finally(() => scheduleNextSync(Date.now()));
return;
}
scheduleNextSync(lastSyncTimestamp);
}
if (event.data.type === "SYNC_NOW") {
syncData().finally(() => scheduleNextSync(Date.now()));
}
};
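The delay logic in `scheduleNextSync` can be read in isolation as a pure function. The sketch below is only an illustration: `computeSyncDelay` is a hypothetical helper, and the 30000 ms interval is an assumption, since `REFRESH_INTERVAL` is defined outside this excerpt.

```javascript
// Assumed value; the worker's real REFRESH_INTERVAL is defined outside this excerpt.
const REFRESH_INTERVAL = 30000;

// Hypothetical helper mirroring the arithmetic inside scheduleNextSync.
function computeSyncDelay(lastSyncTimestamp, now = Date.now(), interval = REFRESH_INTERVAL) {
  // No known sync (0) is treated as "one full interval old", so the delay is 0.
  const ageMs = lastSyncTimestamp ? Math.max(0, now - lastSyncTimestamp) : interval;
  // Never return a negative delay, even when the cache is older than the interval.
  return Math.max(0, interval - ageMs);
}

const now = 100000;
const freshDelay = computeSyncDelay(90000, now);   // synced 10 s ago
const staleDelay = computeSyncDelay(10000, now);   // synced 90 s ago
const unknownDelay = computeSyncDelay(0, now);     // never synced
```

Under these assumptions, a cache synced 10 seconds ago waits the remaining 20 seconds, while a stale or missing cache syncs right away.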
+349
@@ -0,0 +1,349 @@
class RunningModelsPage {
constructor() {
this.activeServer = getActiveServer();
this.worker = null;
this.lastRunningData = null;
this.init();
}
init() {
this.updateServerContextUI();
if (!this.activeServer) {
this.renderNoServerState();
return;
}
this.loadFromLocalStorage();
if (typeof Worker !== "undefined") {
this.worker = new Worker("/static/js/data-sync.worker.js");
this.worker.onmessage = (event) => this.handleWorkerMessage(event);
this.worker.onerror = (error) => {
console.error("Worker error:", error);
this.loadRunningModels(true);
};
const shouldSyncImmediately = this.shouldSyncImmediately();
this.worker.postMessage({
type: "SET_SERVER",
serverId: this.activeServer.id,
host: this.activeServer.host,
syncImmediately: shouldSyncImmediately,
lastSyncTimestamp: this.getLatestCacheTimestamp()
});
if (shouldSyncImmediately && !this.lastRunningData) {
this.renderLoadingState();
}
} else if (this.shouldSyncImmediately()) {
this.loadRunningModels(true);
}
document.getElementById("refresh-btn")?.addEventListener("click", () => {
if (this.worker) {
this.worker.postMessage({ type: "SYNC_NOW" });
} else {
this.loadRunningModels(true);
}
});
}
loadFromLocalStorage() {
const runningData = readServerCache(this.activeServer.id, "running");
if (!runningData) {
return;
}
this.lastRunningData = runningData;
this.renderStats(runningData.models || [], runningData.timestamp);
this.renderRunningModels(runningData.models || []);
}
handleWorkerMessage(event) {
const { type, health, modelsData, runningData, serverId } = event.data;
if (type !== "DATA_UPDATED") {
return;
}
if (serverId && serverId !== this.activeServer.id) {
return;
}
if (health) {
try {
writeServerCache(this.activeServer.id, "health", health);
} catch (error) {
console.warn("Cannot persist health in localStorage:", error);
}
}
if (modelsData) {
try {
writeServerCache(this.activeServer.id, "models", modelsData);
} catch (error) {
console.warn("Cannot persist models in localStorage:", error);
}
}
if (!runningData) {
if (!this.lastRunningData) {
this.renderStats([], health?.timestamp || null);
this.renderRunningUnavailable(health);
}
return;
}
this.lastRunningData = runningData;
try {
writeServerCache(this.activeServer.id, "running", runningData);
} catch (error) {
console.warn("Cannot persist running models in localStorage:", error);
}
this.renderStats(runningData.models || [], runningData.timestamp);
this.renderRunningModels(runningData.models || []);
}
async loadRunningModels(forceNetwork = false) {
const container = document.getElementById("running-models");
if (!container) {
return;
}
if (!forceNetwork && this.lastRunningData) {
this.renderStats(this.lastRunningData.models || [], this.lastRunningData.timestamp);
this.renderRunningModels(this.lastRunningData.models || []);
return;
}
this.renderLoadingState();
try {
const response = await fetch(this.buildApiUrl("/api/v1/models/running"));
if (!response.ok) {
throw new Error("Failed to load running models");
}
const data = await response.json();
const models = data.models || [];
const runningData = {
models,
total: data.total || models.length,
timestamp: new Date().toISOString(),
serverId: this.activeServer.id
};
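this.lastRunningData = runningData;
try {
writeServerCache(this.activeServer.id, "running", runningData);
} catch (error) {
// A persistence failure must not be reported as a failed network load.
console.warn("Cannot persist running models in localStorage:", error);
}
this.renderStats(models, runningData.timestamp);
this.renderRunningModels(models);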
} catch (error) {
this.renderRunningUnavailable(null);
this.renderStats([], null);
console.error(error);
}
}
shouldSyncImmediately() {
const running = readServerCache(this.activeServer.id, "running");
if (!running) {
return true;
}
return isCacheStale(this.getLatestCacheTimestamp());
}
getLatestCacheTimestamp() {
return getLatestServerCacheTimestamp(this.activeServer.id, ["health", "models", "running"]);
}
buildApiUrl(path) {
const url = new URL(path, window.location.origin);
url.searchParams.set("host", this.activeServer.host);
return `${url.pathname}${url.search}`;
}
updateServerContextUI() {
if (!this.activeServer) {
return;
}
const serverLabel = document.getElementById("active-server-label");
if (serverLabel) {
serverLabel.textContent = `Server: ${this.activeServer.name}`;
serverLabel.classList.remove("hidden");
}
const availableLink = document.getElementById("available-link");
if (availableLink) {
availableLink.href = buildServerUrl("/models-available", this.activeServer.id);
}
const serversLink = document.getElementById("servers-link");
if (serversLink) {
serversLink.href = "/servers";
}
}
renderNoServerState() {
const container = document.getElementById("running-models");
const runningCountEl = document.getElementById("running-count");
const vramTotalEl = document.getElementById("vram-total");
const lastRefreshEl = document.getElementById("last-refresh");
if (runningCountEl) runningCountEl.textContent = "0";
if (vramTotalEl) vramTotalEl.textContent = "0 B";
if (lastRefreshEl) lastRefreshEl.textContent = "-";
if (container) {
container.innerHTML = `
<div class="text-center py-10 text-gray-300">
<p class="text-lg font-semibold">No server selected</p>
<p class="text-sm text-gray-400 mt-2">Select a server in the control panel to load ollama ps data.</p>
<a href="/servers" class="inline-block mt-4 bg-purple-600 hover:bg-purple-700 px-4 py-2 rounded">Open Servers Control Panel</a>
</div>
`;
}
}
renderStats(models, timestamp = null) {
const runningCountEl = document.getElementById("running-count");
const vramTotalEl = document.getElementById("vram-total");
const lastRefreshEl = document.getElementById("last-refresh");
const totalVram = models.reduce((sum, model) => sum + (model.size_vram || 0), 0);
if (runningCountEl) {
runningCountEl.textContent = String(models.length);
}
if (vramTotalEl) {
vramTotalEl.textContent = this.formatBytes(totalVram);
}
if (lastRefreshEl) {
lastRefreshEl.textContent = timestamp ? this.formatDateTime(timestamp) : "-";
}
}
renderRunningModels(models) {
const container = document.getElementById("running-models");
if (!container) {
return;
}
if (models.length === 0) {
container.innerHTML = `
<div class="text-center py-8 text-gray-400">
<p>No models are currently loaded in memory.</p>
</div>
`;
return;
}
container.innerHTML = models
.map((model) => this.renderModelCard(model))
.join("");
}
renderRunningUnavailable(health = null) {
const container = document.getElementById("running-models");
if (!container) {
return;
}
const isOffline = health?.ollama_status === "offline";
container.innerHTML = `
<div class="text-center py-8 ${isOffline ? "text-yellow-300" : "text-red-400"}">
<p>${isOffline ? "Selected server is offline." : "Failed to load ollama ps output."}</p>
<p class="text-sm text-gray-400 mt-2">Data will refresh automatically when the server becomes reachable.</p>
</div>
`;
}
renderModelCard(model) {
const name = this.escapeHtml(model.name || "unknown");
const modelId = this.escapeHtml(model.model || "-");
const size = this.formatBytes(model.size || 0);
const sizeVram = this.formatBytes(model.size_vram || 0);
const processor = this.escapeHtml(model.details?.processor || "-");
const expiresAt = model.expires_at ? this.formatDateTime(model.expires_at) : "-";
return `
<div class="bg-gray-700 rounded-lg p-4 border border-gray-600">
<div class="flex items-start justify-between gap-4">
<div>
<h3 class="text-lg font-semibold">${name}</h3>
<p class="text-xs text-gray-400 mt-1">${modelId}</p>
</div>
<span class="bg-green-700 text-green-100 text-xs px-2 py-1 rounded">Ready</span>
</div>
<div class="grid grid-cols-1 md:grid-cols-2 gap-3 mt-4 text-sm">
<div class="bg-gray-800 rounded p-3">
<p class="text-gray-400 text-xs">Model Size</p>
<p class="font-semibold mt-1">${size}</p>
</div>
<div class="bg-gray-800 rounded p-3">
<p class="text-gray-400 text-xs">VRAM Used</p>
<p class="font-semibold mt-1">${sizeVram}</p>
</div>
<div class="bg-gray-800 rounded p-3">
<p class="text-gray-400 text-xs">Processor</p>
<p class="font-semibold mt-1">${processor}</p>
</div>
<div class="bg-gray-800 rounded p-3">
<p class="text-gray-400 text-xs">Unload Time</p>
<p class="font-semibold mt-1">${expiresAt}</p>
</div>
</div>
</div>
`;
}
formatBytes(bytes) {
if (!bytes || bytes <= 0) {
return "0 B";
}
const units = ["B", "KB", "MB", "GB", "TB"];
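// Clamp to a valid unit index; a fractional value below 1 would otherwise yield a negative index.
const index = Math.max(0, Math.min(Math.floor(Math.log(bytes) / Math.log(1024)), units.length - 1));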
const value = bytes / Math.pow(1024, index);
return `${value.toFixed(2)} ${units[index]}`;
}
formatDateTime(isoDate) {
const date = new Date(isoDate);
if (Number.isNaN(date.getTime())) {
return "-";
}
return date.toLocaleString("en-US", {
year: "numeric",
month: "short",
day: "2-digit",
hour: "2-digit",
minute: "2-digit"
});
}
escapeHtml(text) {
const div = document.createElement("div");
div.textContent = String(text);
return div.innerHTML;
}
renderLoadingState() {
const container = document.getElementById("running-models");
if (!container || this.lastRunningData) {
return;
}
container.innerHTML = `
<div class="text-center py-8">
<div class="inline-block w-8 h-8 border-4 border-gray-600 border-t-purple-500 rounded-full animate-spin"></div>
<p class="text-gray-400 mt-4">Refreshing data...</p>
</div>
`;
}
}
document.addEventListener("DOMContentLoaded", () => {
window.runningModelsPage = new RunningModelsPage();
});
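The stats row in `renderStats` is derived from a single reduction over the running-models list. A minimal sketch of that aggregation follows, assuming hypothetical sample data; only the `name` and `size_vram` field names come from the code above, and `summarize` is an illustrative helper rather than part of the page.

```javascript
// Hypothetical sample shaped like the fields the page reads (name, size_vram).
const sampleModels = [
  { name: "model-a", size_vram: 4 * 1024 ** 3 },
  { name: "model-b", size_vram: 512 * 1024 ** 2 },
  { name: "model-c" } // no size_vram reported; counted as 0
];

// Illustrative helper: same reduction as renderStats, without touching the DOM.
function summarize(models) {
  const totalVram = models.reduce((sum, model) => sum + (model.size_vram || 0), 0);
  return { count: models.length, totalVram };
}

const summary = summarize(sampleModels);
```

Keeping the aggregation free of DOM access makes it easy to check outside the browser; the page itself then only formats and writes the two values.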
+13
@@ -0,0 +1,13 @@
(() => {
if (!("serviceWorker" in navigator)) {
return;
}
window.addEventListener("load", async () => {
try {
await navigator.serviceWorker.register("/service-worker.js", { scope: "/" });
} catch (error) {
console.error("Service worker registration failed:", error);
}
});
})();
+256
@@ -0,0 +1,256 @@
const SERVER_STORAGE_KEY = "llm_monitor_servers";
const ACTIVE_SERVER_KEY = "llm_monitor_active_server";
const DATA_REFRESH_INTERVAL_MS = 30000;
const SERVER_CACHE_SUFFIXES = ["health", "models", "running"];
function normalizeHost(host) {
if (!host) {
return "";
}
const trimmed = host.trim();
if (!trimmed) {
return "";
}
return trimmed.replace(/\/+$/, "");
}
function loadServers() {
const raw = localStorage.getItem(SERVER_STORAGE_KEY);
if (!raw) {
return [];
}
try {
const parsed = JSON.parse(raw);
if (!Array.isArray(parsed)) {
return [];
}
return parsed
.map((item) => ({
id: String(item.id || ""),
name: String(item.name || "").trim(),
host: normalizeHost(item.host || "")
}))
.filter((item) => item.id && item.name && item.host);
} catch {
return [];
}
}
function saveServers(servers) {
localStorage.setItem(SERVER_STORAGE_KEY, JSON.stringify(servers));
cleanupOrphanedServerCaches(servers);
}
function generateServerId() {
return `srv_${Date.now()}_${Math.random().toString(16).slice(2, 8)}`;
}
function getActiveServerId() {
return localStorage.getItem(ACTIVE_SERVER_KEY);
}
function setActiveServerId(serverId) {
localStorage.setItem(ACTIVE_SERVER_KEY, serverId);
}
function getServerById(serverId) {
return loadServers().find((server) => server.id === serverId) || null;
}
function getServerIdFromQuery() {
const params = new URLSearchParams(window.location.search);
return params.get("server") || "";
}
function getActiveServer() {
const queryServerId = getServerIdFromQuery();
if (queryServerId) {
const fromQuery = getServerById(queryServerId);
if (fromQuery) {
setActiveServerId(fromQuery.id);
return fromQuery;
}
}
const activeServerId = getActiveServerId();
if (activeServerId) {
const activeServer = getServerById(activeServerId);
if (activeServer) {
return activeServer;
}
}
const servers = loadServers();
if (servers.length > 0) {
setActiveServerId(servers[0].id);
return servers[0];
}
return null;
}
function buildServerUrl(path, serverId) {
const url = new URL(path, window.location.origin);
if (serverId) {
url.searchParams.set("server", serverId);
}
return `${url.pathname}${url.search}`;
}
function getServerStorageKey(serverId, suffix) {
return `llm_monitor_${suffix}_${serverId}`;
}
function readServerCache(serverId, suffix) {
if (!serverId) {
return null;
}
const raw = localStorage.getItem(getServerStorageKey(serverId, suffix));
if (!raw) {
return null;
}
try {
return JSON.parse(raw);
} catch {
return null;
}
}
function writeServerCache(serverId, suffix, value) {
if (!serverId) {
return value;
}
const storageKey = getServerStorageKey(serverId, suffix);
const candidates = [value];
cleanupOrphanedServerCaches();
if (suffix === "models") {
candidates.push(createSlimModelsCache(value));
}
for (const candidate of candidates) {
try {
localStorage.setItem(storageKey, JSON.stringify(candidate));
return candidate;
} catch (error) {
if (!isQuotaExceededError(error)) {
throw error;
}
}
}
// Last resort: free stale server caches and retry with the smallest payload.
cleanupOrphanedServerCaches(loadServers());
if (suffix === "models") {
const slimValue = createSlimModelsCache(value);
try {
localStorage.setItem(storageKey, JSON.stringify(slimValue));
return slimValue;
} catch (error) {
if (!isQuotaExceededError(error)) {
throw error;
}
console.warn(`Cache quota exceeded for ${storageKey}; using in-memory models data only.`);
return null;
}
}
console.warn(`Cache quota exceeded for ${storageKey}; skipping persistence for this payload.`);
return null;
}
function createSlimModelsCache(value) {
if (!value || typeof value !== "object") {
return value;
}
const slimValue = { ...value };
if (slimValue.showByModel) {
delete slimValue.showByModel;
slimValue.showDetailsDeferred = true;
}
return slimValue;
}
function isQuotaExceededError(error) {
return error instanceof DOMException && (
error.code === 22 ||
error.code === 1014 ||
error.name === "QuotaExceededError" ||
error.name === "NS_ERROR_DOM_QUOTA_REACHED"
);
}
function cleanupOrphanedServerCaches(servers = loadServers()) {
const validServerIds = new Set(servers.map((server) => server.id));
const keysToRemove = [];
for (let index = 0; index < localStorage.length; index += 1) {
const key = localStorage.key(index);
if (!key) {
continue;
}
for (const suffix of SERVER_CACHE_SUFFIXES) {
const prefix = `llm_monitor_${suffix}_`;
if (!key.startsWith(prefix)) {
continue;
}
const serverId = key.slice(prefix.length);
if (!validServerIds.has(serverId)) {
keysToRemove.push(key);
}
}
}
keysToRemove.forEach((key) => localStorage.removeItem(key));
}
function clearServerCaches(serverId) {
if (!serverId) {
return;
}
SERVER_CACHE_SUFFIXES.forEach((suffix) => {
localStorage.removeItem(getServerStorageKey(serverId, suffix));
});
}
function getCacheTimestamp(cacheValue) {
if (!cacheValue || !cacheValue.timestamp) {
return 0;
}
const parsed = Date.parse(cacheValue.timestamp);
return Number.isNaN(parsed) ? 0 : parsed;
}
function getLatestServerCacheTimestamp(serverId, suffixes) {
return suffixes.reduce((latest, suffix) => {
const value = readServerCache(serverId, suffix);
return Math.max(latest, getCacheTimestamp(value));
}, 0);
}
function isCacheStale(timestamp, maxAgeMs = DATA_REFRESH_INTERVAL_MS) {
if (!timestamp) {
return true;
}
return (Date.now() - timestamp) >= maxAgeMs;
}
function hasDeferredShowDetails(cacheValue) {
return Boolean(cacheValue && cacheValue.showDetailsDeferred);
}
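The quota handling in `writeServerCache` boils down to trying progressively smaller payloads. The sketch below illustrates that idea under stated assumptions: `TinyStorage` is a hypothetical in-memory stand-in for `localStorage` with an artificial size limit, and `slimModels` and `writeWithFallback` are simplified illustrations, not the functions from the file.

```javascript
// Hypothetical stand-in for localStorage that rejects values above maxChars.
class TinyStorage {
  constructor(maxChars) {
    this.maxChars = maxChars;
    this.items = new Map();
  }
  setItem(key, value) {
    if (value.length > this.maxChars) {
      const error = new Error("quota exceeded");
      error.name = "QuotaExceededError";
      throw error;
    }
    this.items.set(key, value);
  }
  getItem(key) {
    return this.items.has(key) ? this.items.get(key) : null;
  }
}

// Drop the heavy per-model details and flag them as deferred.
function slimModels(value) {
  const { showByModel, ...rest } = value;
  return showByModel ? { ...rest, showDetailsDeferred: true } : rest;
}

// Try the full payload first, then the slim one; return what was stored, or null.
function writeWithFallback(storage, key, value) {
  for (const candidate of [value, slimModels(value)]) {
    try {
      storage.setItem(key, JSON.stringify(candidate));
      return candidate;
    } catch (error) {
      if (error.name !== "QuotaExceededError") {
        throw error; // unrelated failures are not swallowed
      }
    }
  }
  return null;
}

const payload = { models: [{ name: "a" }], showByModel: { a: "x".repeat(500) } };
const storedFull = writeWithFallback(new TinyStorage(10000), "k", payload);
const storedSlim = writeWithFallback(new TinyStorage(100), "k", payload);
const storedNone = writeWithFallback(new TinyStorage(10), "k", payload);
```

Returning the candidate that was actually persisted lets a caller detect the deferred mode, which is what `hasDeferredShowDetails` checks through the `showDetailsDeferred` flag.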
+179
@@ -0,0 +1,179 @@
class ServersPage {
constructor() {
this.form = document.getElementById("server-form");
this.serverIdInput = document.getElementById("server-id");
this.serverNameInput = document.getElementById("server-name");
this.serverHostInput = document.getElementById("server-host");
this.clearFormBtn = document.getElementById("clear-form-btn");
this.serversList = document.getElementById("servers-list");
this.serversCount = document.getElementById("servers-count");
this.init();
}
init() {
this.form?.addEventListener("submit", (event) => {
event.preventDefault();
this.saveServer();
});
this.clearFormBtn?.addEventListener("click", () => this.resetForm());
this.renderServers();
}
saveServer() {
const name = this.serverNameInput?.value.trim() || "";
const host = normalizeHost(this.serverHostInput?.value || "");
if (!name || !host) {
return;
}
const existingId = this.serverIdInput?.value || "";
const servers = loadServers();
if (existingId) {
const index = servers.findIndex((server) => server.id === existingId);
if (index >= 0) {
// Persist and activate only when the edited server still exists.
servers[index] = { ...servers[index], name, host };
saveServers(servers);
setActiveServerId(existingId);
}
} else {
const newServer = {
id: generateServerId(),
name,
host
};
servers.push(newServer);
saveServers(servers);
setActiveServerId(newServer.id);
}
this.resetForm();
this.renderServers();
}
editServer(serverId) {
const server = getServerById(serverId);
if (!server) {
return;
}
this.serverIdInput.value = server.id;
this.serverNameInput.value = server.name;
this.serverHostInput.value = server.host;
}
deleteServer(serverId) {
const servers = loadServers().filter((server) => server.id !== serverId);
saveServers(servers);
clearServerCaches(serverId);
const activeServerId = getActiveServerId();
if (activeServerId === serverId) {
if (servers.length > 0) {
setActiveServerId(servers[0].id);
} else {
localStorage.removeItem(ACTIVE_SERVER_KEY);
}
}
this.renderServers();
}
selectServer(serverId) {
setActiveServerId(serverId);
this.renderServers();
}
openAvailable(serverId) {
window.location.href = buildServerUrl("/models-available", serverId);
}
openRunning(serverId) {
window.location.href = buildServerUrl("/models-running", serverId);
}
resetForm() {
this.serverIdInput.value = "";
this.serverNameInput.value = "";
this.serverHostInput.value = "";
}
renderServers() {
const servers = loadServers();
const activeServerId = getActiveServerId();
if (this.serversCount) {
this.serversCount.textContent = `${servers.length} server${servers.length === 1 ? "" : "s"}`;
}
if (!this.serversList) {
return;
}
if (servers.length === 0) {
this.serversList.innerHTML = `
<div class="text-center py-10 text-gray-400 border border-dashed border-gray-600 rounded-lg">
No servers configured yet. Add your first Ollama endpoint in the control panel.
</div>
`;
return;
}
this.serversList.innerHTML = servers
.map((server) => {
const isActive = server.id === activeServerId;
return `
<div class="bg-gray-700 border ${isActive ? "border-purple-500" : "border-gray-600"} rounded-lg p-4">
<div class="flex flex-col md:flex-row md:items-center md:justify-between gap-4">
<div>
<h3 class="text-lg font-semibold">${this.escapeHtml(server.name)}</h3>
<p class="text-xs text-gray-300 mt-1">${this.escapeHtml(server.host)}</p>
</div>
<div class="flex flex-wrap gap-2">
<button data-action="select" data-server-id="${server.id}" class="bg-gray-800 hover:bg-gray-900 px-3 py-2 rounded text-xs">${isActive ? "Selected" : "Select"}</button>
<button data-action="available" data-server-id="${server.id}" class="bg-blue-700 hover:bg-blue-800 px-3 py-2 rounded text-xs">Available</button>
<button data-action="running" data-server-id="${server.id}" class="bg-green-700 hover:bg-green-800 px-3 py-2 rounded text-xs">Running</button>
<button data-action="edit" data-server-id="${server.id}" class="bg-amber-700 hover:bg-amber-800 px-3 py-2 rounded text-xs">Edit</button>
<button data-action="delete" data-server-id="${server.id}" class="bg-red-700 hover:bg-red-800 px-3 py-2 rounded text-xs">Delete</button>
</div>
</div>
</div>
`;
})
.join("");
this.bindServerActions();
}
bindServerActions() {
this.serversList.querySelectorAll("button[data-action]").forEach((button) => {
button.addEventListener("click", () => {
const action = button.getAttribute("data-action");
const serverId = button.getAttribute("data-server-id") || "";
if (!serverId) {
return;
}
if (action === "select") this.selectServer(serverId);
if (action === "available") this.openAvailable(serverId);
if (action === "running") this.openRunning(serverId);
if (action === "edit") this.editServer(serverId);
if (action === "delete") this.deleteServer(serverId);
});
});
}
escapeHtml(text) {
const div = document.createElement("div");
div.textContent = text;
return div.innerHTML;
}
}
document.addEventListener("DOMContentLoaded", () => {
window.serversPage = new ServersPage();
});
+75
@@ -0,0 +1,75 @@
const CACHE_NAME = "llm-monitor-v3";
const APP_SHELL = [
"/",
"/servers",
"/models-running",
"/models-available",
"/static/css/output.css",
"/static/js/server-config.js",
"/static/js/app.js",
"/static/js/servers.js",
"/static/js/models-running.js",
"/static/js/data-sync.worker.js",
"/static/js/pwa-register.js",
"/manifest.webmanifest",
"/favicon.ico"
];
self.addEventListener("install", (event) => {
event.waitUntil(
caches.open(CACHE_NAME).then((cache) => cache.addAll(APP_SHELL))
);
self.skipWaiting();
});
self.addEventListener("activate", (event) => {
event.waitUntil(
caches.keys().then((keys) =>
Promise.all(
keys.filter((key) => key !== CACHE_NAME).map((key) => caches.delete(key))
)
)
);
self.clients.claim();
});
self.addEventListener("fetch", (event) => {
if (event.request.method !== "GET") {
return;
}
const requestUrl = new URL(event.request.url);
const isApiRequest = requestUrl.pathname.startsWith("/api/");
if (isApiRequest) {
event.respondWith(
fetch(event.request).catch(() =>
new Response(JSON.stringify({ detail: "Offline" }), {
status: 503,
headers: { "Content-Type": "application/json" }
})
)
);
return;
}
event.respondWith(
caches.match(event.request).then((cached) => {
if (cached) {
return cached;
}
return fetch(event.request)
.then((response) => {
if (!response || response.status !== 200 || response.type !== "basic") {
return response;
}
const responseClone = response.clone();
caches.open(CACHE_NAME).then((cache) => cache.put(event.request, responseClone));
return response;
})
.catch(() => caches.match("/servers"));
})
);
});
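The fetch handler above applies one of three behaviors depending on the request. As a rough sketch, that routing decision can be expressed as a pure function; `chooseStrategy` and its string labels are hypothetical names introduced here for illustration and do not appear in the service worker.

```javascript
// Hypothetical helper summarizing the routing rules of the fetch handler.
function chooseStrategy(method, url) {
  if (method !== "GET") {
    return "passthrough"; // the handler returns early and the browser handles it
  }
  const { pathname } = new URL(url);
  if (pathname.startsWith("/api/")) {
    return "network-only"; // live data; a 503 JSON body is returned when offline
  }
  return "cache-first"; // app shell and static assets
}

const apiStrategy = chooseStrategy("GET", "https://example.test/api/v1/models/running?host=h");
const shellStrategy = chooseStrategy("GET", "https://example.test/static/js/app.js");
const postStrategy = chooseStrategy("POST", "https://example.test/api/v1/anything");
```

Keeping API requests out of the cache means the page never shows a stale API response from the service worker; offline data comes from the `localStorage` cache instead.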
+18
@@ -0,0 +1,18 @@
{
"name": "LLM Monitor",
"short_name": "LLM Monitor",
"description": "Monitor available and running Ollama models.",
"start_url": "/",
"scope": "/",
"display": "standalone",
"background_color": "#111827",
"theme_color": "#111827",
"lang": "en",
"icons": [
{
"src": "/favicon.ico",
"sizes": "any",
"type": "image/x-icon"
}
]
}
+69 -136
@@ -1,10 +1,16 @@
<!DOCTYPE html>
<html lang="it">
<html lang="en">
<head>
<meta charset="UTF-8">
<meta name="viewport" content="width=device-width, initial-scale=1.0">
<title>LLM Monitor - Dashboard Ollama</title>
<script src="https://cdn.tailwindcss.com"></script>
<title>LLM Monitor - Ollama Dashboard</title>
<link rel="icon" href="/favicon.ico" sizes="any">
<link rel="manifest" href="/manifest.webmanifest">
<meta name="theme-color" content="#111827">
<meta name="application-name" content="LLM Monitor">
<meta name="description" content="Monitor available and running Ollama models.">
<!-- Tailwind CSS (compiled for production) -->
<link rel="stylesheet" href="/static/css/output.css">
<style>
@keyframes spin {
to { transform: rotate(360deg); }
@@ -12,6 +18,24 @@
.animate-spin {
animation: spin 1s linear infinite;
}
.modal-body {
max-height: 80vh;
overflow-y: auto;
padding-right: 10px;
scrollbar-width: thin;
scrollbar-color: #8b5cf6 #1f2937;
}
.modal-body::-webkit-scrollbar {
width: 8px;
}
.modal-body::-webkit-scrollbar-track {
background: #1f2937;
border-radius: 4px;
}
.modal-body::-webkit-scrollbar-thumb {
background: #8b5cf6;
border-radius: 4px;
}
</style>
</head>
<body class="bg-gray-900 text-white">
@@ -27,9 +51,12 @@
<h1 class="text-2xl font-bold">LLM Monitor</h1>
</div>
<div class="flex items-center gap-4">
<a id="servers-link" href="/servers" class="text-sm bg-gray-700 hover:bg-gray-600 px-3 py-2 rounded-lg transition">Servers</a>
<a id="running-link" href="/models-running" class="text-sm bg-gray-700 hover:bg-gray-600 px-3 py-2 rounded-lg transition">Running Models</a>
<span id="active-server-label" class="hidden text-xs text-gray-300 bg-gray-700 px-3 py-2 rounded-lg"></span>
<div id="health-status" class="flex items-center gap-2">
<div id="status-indicator" class="w-3 h-3 bg-gray-500 rounded-full"></div>
<span id="status-text" class="text-sm text-gray-400">Controllo...</span>
<span id="status-text" class="text-sm text-gray-400">Checking...</span>
</div>
</div>
</div>
@@ -42,15 +69,15 @@
<!-- Stats Cards -->
<div class="grid grid-cols-1 md:grid-cols-3 gap-6 mb-8">
<div class="bg-gray-800 rounded-lg p-6 border border-gray-700">
<div class="text-gray-400 text-sm font-medium">Modelli Caricati</div>
<div class="text-gray-400 text-sm font-medium">Loaded Models</div>
<div id="models-count" class="text-4xl font-bold mt-2">-</div>
</div>
<div class="bg-gray-800 rounded-lg p-6 border border-gray-700">
<div class="text-gray-400 text-sm font-medium">Spazio Totale</div>
<div class="text-gray-400 text-sm font-medium">Total Size</div>
<div id="total-size" class="text-4xl font-bold mt-2">-</div>
</div>
<div class="bg-gray-800 rounded-lg p-6 border border-gray-700">
<div class="text-gray-400 text-sm font-medium">Status Ollama</div>
<div class="text-gray-400 text-sm font-medium">Ollama Status</div>
<div id="ollama-status" class="text-4xl font-bold mt-2">-</div>
</div>
</div>
@@ -58,25 +85,29 @@
<!-- Models Section -->
<div class="bg-gray-800 rounded-lg border border-gray-700 p-6">
<div class="flex items-center justify-between mb-6">
<h2 class="text-xl font-bold">Modelli Disponibili</h2>
<button onclick="loadModels()" class="bg-purple-600 hover:bg-purple-700 px-4 py-2 rounded-lg text-sm font-medium transition">
🔄 Aggiorna
<div>
<h2 class="text-xl font-bold">Available Models</h2>
<p class="text-xs text-gray-400 mt-1">Hover or click a card to open the details modal.</p>
<p id="cache-mode-indicator" class="hidden text-xs text-amber-300 mt-2">Model details are loaded on demand to keep device storage usage low.</p>
</div>
<button id="refresh-btn" class="bg-purple-600 hover:bg-purple-700 px-4 py-2 rounded-lg text-sm font-medium transition">
Refresh
</button>
</div>
<!-- Models List -->
<div id="models-container" class="space-y-4">
<div id="models-container" class="grid grid-cols-1 md:grid-cols-2 xl:grid-cols-3 gap-4">
<div class="text-center py-8">
<div class="animate-spin inline-block w-8 h-8 border-4 border-gray-600 border-t-purple-500 rounded-full"></div>
<p class="text-gray-400 mt-4">Caricamento modelli...</p>
<p class="text-gray-400 mt-4">Loading models...</p>
</div>
</div>
</div>
<!-- API Documentation Section -->
<div class="mt-8 bg-blue-900 bg-opacity-20 border border-blue-700 rounded-lg p-6">
<h3 class="text-lg font-bold mb-4">📚 Documentazione API</h3>
<p class="text-gray-300 mb-4">La API è documentata e testabile direttamente da:</p>
<h3 class="text-lg font-bold mb-4">API Documentation</h3>
<p class="text-gray-300 mb-4">The API is documented and testable from:</p>
<div class="flex gap-3 flex-wrap">
<a href="/docs" target="_blank" class="inline-block bg-blue-600 hover:bg-blue-700 px-4 py-2 rounded-lg text-sm font-medium transition">
Swagger UI
@@ -92,133 +123,35 @@
<!-- Footer -->
<footer class="bg-gray-800 border-t border-gray-700 mt-12">
<div class="max-w-7xl mx-auto px-4 py-6 text-center text-gray-400 text-sm">
<p>LLM Monitor v1.0.0 • Fatto con ❤️ da <a href="https://lucasacchi.net" target="_blank" class="text-purple-400 hover:text-purple-300">LucaSacchi.Net</a></p>
<p>LLM Monitor v1.0.0 • Built by <a href="https://lucasacchi.net" target="_blank" class="text-purple-400 hover:text-purple-300">LucaSacchi.Net</a></p>
</div>
</footer>
</div>
<script>
const API_BASE = "/api/v1";
// Format bytes in a human-readable form
function formatBytes(bytes) {
if (bytes === 0) return "0 B";
const k = 1024;
const sizes = ["B", "KB", "MB", "GB"];
const i = Math.floor(Math.log(bytes) / Math.log(k));
return (bytes / Math.pow(k, i)).toFixed(2) + " " + sizes[i];
}
// Format date
function formatDate(dateString) {
const date = new Date(dateString);
return date.toLocaleDateString("it-IT", {
year: "numeric",
month: "short",
day: "numeric",
hour: "2-digit",
minute: "2-digit"
});
}
// Check health
async function checkHealth() {
try {
const response = await fetch(`${API_BASE}/health`);
if (response.ok) {
const data = await response.json();
const statusEl = document.getElementById("status-indicator");
const statusText = document.getElementById("status-text");
const ollamaStatus = data.ollama_status;
if (ollamaStatus === "online") {
statusEl.className = "w-3 h-3 bg-green-500 rounded-full";
statusText.className = "text-sm text-green-400";
statusText.textContent = "Ollama Online";
document.getElementById("ollama-status").innerHTML = "🟢 Online";
} else {
statusEl.className = "w-3 h-3 bg-red-500 rounded-full";
statusText.className = "text-sm text-red-400";
statusText.textContent = "Ollama Offline";
document.getElementById("ollama-status").innerHTML = "🔴 Offline";
}
}
} catch (error) {
console.error("Health check error:", error);
document.getElementById("status-indicator").className = "w-3 h-3 bg-red-500 rounded-full";
document.getElementById("status-text").textContent = "Errore connessione";
}
}
// Load models
async function loadModels() {
try {
const response = await fetch(`${API_BASE}/models`);
if (!response.ok) throw new Error("Errore nel caricamento");
const data = await response.json();
const models = data.models || [];
// Update count
document.getElementById("models-count").textContent = models.length;
// Compute total size
const totalSize = models.reduce((sum, m) => sum + m.size, 0);
document.getElementById("total-size").textContent = formatBytes(totalSize);
// Render models
if (models.length === 0) {
document.getElementById("models-container").innerHTML = `
<div class="text-center py-8 text-gray-400">
<p>Nessun modello caricato</p>
</div>
`;
} else {
document.getElementById("models-container").innerHTML = models.map(model => `
<div class="bg-gray-700 rounded-lg p-4 border border-gray-600 hover:border-purple-500 transition">
<div class="flex items-start justify-between mb-3">
<h3 class="text-lg font-semibold">${model.name}</h3>
<span class="bg-purple-600 px-3 py-1 rounded text-xs font-medium">Caricato</span>
</div>
<div class="grid grid-cols-2 gap-4 text-sm">
<div>
<p class="text-gray-400">Dimensione</p>
<p class="font-semibold">${formatBytes(model.size)}</p>
</div>
<div>
<p class="text-gray-400">Ultimo aggiornamento</p>
<p class="font-semibold">${formatDate(model.modified_at)}</p>
</div>
</div>
<div class="mt-3">
<p class="text-gray-400 text-xs">Digest</p>
<p class="font-mono text-xs bg-gray-800 p-2 rounded mt-1 break-all">${model.digest.substring(0, 64)}...</p>
</div>
</div>
`).join("");
}
} catch (error) {
console.error("Error loading models:", error);
document.getElementById("models-container").innerHTML = `
<div class="text-center py-8 text-red-400">
<p>❌ Errore nel caricamento dei modelli</p>
<p class="text-sm mt-2">${error.message}</p>
<!-- Model Show Details Modal -->
<div id="model-details-modal" class="hidden fixed inset-0 z-50 items-center justify-center" aria-hidden="true">
<div id="model-details-backdrop" class="absolute inset-0 bg-black/70"></div>
<div id="model-details-dialog" class="relative w-full min-h-screen items-center justify-center p-4">
<div id="model-details-section" class="w-full max-w-4xl bg-gray-800 rounded-lg border border-gray-700 p-6 shadow-xl">
<div class="flex items-center justify-between mb-4">
<div>
<h3 class="text-lg font-bold">Model Details</h3>
<span id="model-details-name" class="text-sm text-purple-300 font-medium"></span>
</div>
`;
}
}
<button id="model-details-close" type="button" class="text-gray-300 hover:text-white text-2xl leading-none px-2" aria-label="Close modal">×</button>
</div>
<div class="modal-body overflow-y-auto max-h-[75vh]">
<div id="model-details-content"></div>
</div>
</div>
</div>
</div>
// Initialization
document.addEventListener("DOMContentLoaded", () => {
checkHealth();
loadModels();
// Refresh every 30 seconds
setInterval(() => {
checkHealth();
loadModels();
}, 30000);
});
</script>
<!-- LLM Monitor Application -->
<!-- Web Worker for background data sync -->
<!-- localStorage for client-side persistence -->
<script src="/static/js/server-config.js"></script>
<script src="/static/js/app.js"></script>
<script src="/static/js/pwa-register.js"></script>
</body>
</html>
+86
@@ -0,0 +1,86 @@
<!DOCTYPE html>
<html lang="en">
<head>
<meta charset="UTF-8">
<meta name="viewport" content="width=device-width, initial-scale=1.0">
<title>LLM Monitor - Running Models</title>
<link rel="icon" href="/favicon.ico" sizes="any">
<link rel="manifest" href="/manifest.webmanifest">
<meta name="theme-color" content="#111827">
<meta name="application-name" content="LLM Monitor">
<meta name="description" content="View models currently loaded in Ollama memory.">
<link rel="stylesheet" href="/static/css/output.css">
<style>
@keyframes spin {
to { transform: rotate(360deg); }
}
.animate-spin {
animation: spin 1s linear infinite;
}
</style>
</head>
<body class="bg-gray-900 text-white">
<div class="min-h-screen flex flex-col">
<header class="bg-gray-800 border-b border-gray-700 sticky top-0 z-50">
<div class="max-w-7xl mx-auto px-4 py-6">
<div class="flex items-center justify-between gap-4">
<div class="flex items-center gap-3">
<div class="w-10 h-10 bg-gradient-to-br from-purple-500 to-pink-500 rounded-lg flex items-center justify-center font-bold text-lg">
🧠
</div>
<div>
<h1 class="text-2xl font-bold">Running Models</h1>
<p class="text-xs text-gray-400">Dedicated view for ollama ps output</p>
</div>
</div>
<div class="flex items-center gap-2">
<a id="servers-link" href="/servers" class="text-sm bg-gray-700 hover:bg-gray-600 px-3 py-2 rounded-lg transition">Servers</a>
<a id="available-link" href="/models-available" class="text-sm bg-gray-700 hover:bg-gray-600 px-3 py-2 rounded-lg transition">Available Models</a>
<span id="active-server-label" class="hidden text-xs text-gray-300 bg-gray-700 px-3 py-2 rounded-lg"></span>
<button id="refresh-btn" class="text-sm bg-purple-600 hover:bg-purple-700 px-3 py-2 rounded-lg transition">Refresh</button>
</div>
</div>
</div>
</header>
<main class="flex-1">
<div class="max-w-7xl mx-auto px-4 py-8">
<div class="grid grid-cols-1 md:grid-cols-3 gap-6 mb-8">
<div class="bg-gray-800 rounded-lg p-6 border border-gray-700">
<div class="text-gray-400 text-sm font-medium">Loaded in Memory</div>
<div id="running-count" class="text-4xl font-bold mt-2">-</div>
</div>
<div class="bg-gray-800 rounded-lg p-6 border border-gray-700">
<div class="text-gray-400 text-sm font-medium">Estimated Total VRAM</div>
<div id="vram-total" class="text-4xl font-bold mt-2">-</div>
</div>
<div class="bg-gray-800 rounded-lg p-6 border border-gray-700">
<div class="text-gray-400 text-sm font-medium">Last Refresh</div>
<div id="last-refresh" class="text-base font-semibold mt-3">-</div>
</div>
</div>
<div class="bg-gray-800 rounded-lg border border-gray-700 p-6">
<h2 class="text-xl font-bold mb-4">Ollama PS Output</h2>
<div id="running-models" class="space-y-3">
<div class="text-center py-8">
<div class="inline-block w-8 h-8 border-4 border-gray-600 border-t-purple-500 rounded-full animate-spin"></div>
<p class="text-gray-400 mt-4">Loading running models...</p>
</div>
</div>
</div>
</div>
</main>
<footer class="bg-gray-800 border-t border-gray-700 mt-12">
<div class="max-w-7xl mx-auto px-4 py-6 text-center text-gray-400 text-sm">
<p>LLM Monitor v1.0.0 • Models currently loaded in memory (ollama ps)</p>
</div>
</footer>
</div>
<script src="/static/js/server-config.js"></script>
<script src="/static/js/models-running.js"></script>
<script src="/static/js/pwa-register.js"></script>
</body>
</html>
+79
@@ -0,0 +1,79 @@
<!DOCTYPE html>
<html lang="en">
<head>
<meta charset="UTF-8">
<meta name="viewport" content="width=device-width, initial-scale=1.0">
<title>LLM Monitor - Servers</title>
<link rel="icon" href="/favicon.ico" sizes="any">
<link rel="manifest" href="/manifest.webmanifest">
<meta name="theme-color" content="#111827">
<meta name="application-name" content="LLM Monitor">
<meta name="description" content="Manage Ollama servers and open detailed dashboards.">
<link rel="stylesheet" href="/static/css/output.css">
</head>
<body class="bg-gray-900 text-white">
<div class="min-h-screen flex flex-col">
<header class="bg-gray-800 border-b border-gray-700 sticky top-0 z-50">
<div class="max-w-7xl mx-auto px-4 py-6">
<div class="flex items-center justify-between gap-4">
<div class="flex items-center gap-3">
<div class="w-10 h-10 bg-gradient-to-br from-purple-500 to-pink-500 rounded-lg flex items-center justify-center font-bold text-lg">
🌐
</div>
<div>
<h1 class="text-2xl font-bold">LLM Monitor Servers</h1>
<p class="text-xs text-gray-400">Configure Ollama endpoints and open per-server dashboards</p>
</div>
</div>
<div class="flex items-center gap-2">
<a href="/models-running" class="text-sm bg-gray-700 hover:bg-gray-600 px-3 py-2 rounded-lg transition">Running Models</a>
<a href="/models-available" class="text-sm bg-gray-700 hover:bg-gray-600 px-3 py-2 rounded-lg transition">Available Models</a>
</div>
</div>
</div>
</header>
<main class="flex-1">
<div class="max-w-7xl mx-auto px-4 py-8 grid grid-cols-1 xl:grid-cols-3 gap-6">
<section class="xl:col-span-2 bg-gray-800 rounded-lg border border-gray-700 p-6">
<div class="flex items-center justify-between mb-4">
<h2 class="text-xl font-bold">Configured Servers</h2>
<span id="servers-count" class="text-sm text-gray-400">0 servers</span>
</div>
<div id="servers-list" class="space-y-3"></div>
</section>
<section class="bg-gray-800 rounded-lg border border-gray-700 p-6">
<h2 class="text-xl font-bold mb-4">Control Panel</h2>
<form id="server-form" class="space-y-4">
<input id="server-id" type="hidden">
<div>
<label for="server-name" class="text-sm text-gray-300 block mb-1">Server Name</label>
<input id="server-name" type="text" required placeholder="Production Ollama" class="w-full bg-gray-900 border border-gray-600 rounded px-3 py-2 text-sm focus:outline-none focus:ring-2 focus:ring-purple-500">
</div>
<div>
<label for="server-host" class="text-sm text-gray-300 block mb-1">Ollama URL</label>
<input id="server-host" type="url" required placeholder="http://192.168.1.50:11434" class="w-full bg-gray-900 border border-gray-600 rounded px-3 py-2 text-sm focus:outline-none focus:ring-2 focus:ring-purple-500">
</div>
<div class="flex gap-2">
<button id="save-server-btn" type="submit" class="flex-1 bg-purple-600 hover:bg-purple-700 rounded px-3 py-2 text-sm font-semibold transition">Save Server</button>
<button id="clear-form-btn" type="button" class="bg-gray-700 hover:bg-gray-600 rounded px-3 py-2 text-sm transition">Clear</button>
</div>
</form>
<p class="text-xs text-gray-400 mt-4">All server profiles are saved to localStorage on this device.</p>
</section>
</div>
</main>
<footer class="bg-gray-800 border-t border-gray-700 mt-12">
<div class="max-w-7xl mx-auto px-4 py-6 text-center text-gray-400 text-sm">
<p>LLM Monitor v1.0.0 • Multi-server PWA control panel</p>
</div>
</footer>
</div>
<script src="/static/js/server-config.js"></script>
<script src="/static/js/servers.js"></script>
<script src="/static/js/pwa-register.js"></script>
</body>
</html>
+5 -45
@@ -1,23 +1,4 @@
version: '3.8'
services:
# Ollama Service
ollama:
image: ollama/ollama:latest
container_name: ollama-server
ports:
- "11434:11434"
environment:
OLLAMA_HOST: 0.0.0.0:11434
volumes:
- ollama_data:/root/.ollama
restart: unless-stopped
# Keep container running until stopped
stdin_open: true
tty: true
networks:
- llm-monitor-network
# LLM Monitor Dashboard
llm-monitor:
build:
@@ -25,45 +6,24 @@ services:
dockerfile: Dockerfile
container_name: llm-monitor-app
ports:
- "8000:8000"
environment:
# Load variables from .env
OLLAMA_HOST: http://ollama:11434
OLLAMA_TIMEOUT: 30
API_HOST: 0.0.0.0
API_PORT: 8000
API_WORKERS: 4
CORS_ORIGINS: http://localhost:3000,http://localhost:5173,http://localhost:8000
LOG_LEVEL: INFO
ENVIRONMENT: production
- "${API_PORT:-8000}:${API_PORT:-8000}"
env_file:
- .env
depends_on:
- ollama
restart: unless-stopped
stdin_open: true
tty: true
networks:
- llm-monitor-network
# Health check
healthcheck:
test: ["CMD", "curl", "-f", "http://localhost:8000/api/v1/health"]
test: ["CMD", "curl", "-f", "http://localhost:${API_PORT:-8000}/api/v1/health"]
interval: 30s
timeout: 10s
retries: 3
start_period: 10s
volumes:
ollama_data:
driver: local
networks:
llm-monitor-network:
driver: bridge
# Startup instructions:
# docker compose up -d # Start the services
# docker compose build --no-cache # Full rebuild (recommended if output.css is empty or the UI is broken)
# docker exec llm-monitor-app wc -l /app/app/web/static/css/output.css # Verify the compiled CSS
# docker compose logs -f # View the logs
# docker compose down # Stop the services
# docker compose stop ollama # Stop Ollama only
# docker compose start ollama # Restart Ollama
# docker compose restart # Restart the services
+261
@@ -0,0 +1,261 @@
# Development Setup - LLM Monitor
## 🛠️ Local Setup
### 1. Install Python Dependencies
```bash
python3 -m venv venv
source venv/bin/activate # Windows: venv\Scripts\activate
pip install -r requirements.txt
pip install -r requirements-dev.txt
```
### 2. Install Node Dependencies (for Tailwind CSS)
```bash
npm install
```
### 3. Compile Tailwind CSS
#### Development Mode (watch mode)
```bash
npm run tailwind:dev
```
This command:
- Compiles `app/web/static/css/input.css` into `app/web/static/css/output.css`
- Stays in watch mode and recompiles automatically on every save
- Reads its configuration from `tailwind.config.js`
#### Production Mode (minified)
```bash
npm run tailwind:build
```
This command:
- Compiles and minifies the CSS
- Produces output optimized for production
- Is the one used during the Docker build
### 4. Start the Application
In a separate terminal window (with `npm run tailwind:dev` running in watch mode):
```bash
source venv/bin/activate
python3 -m uvicorn main:app --reload --host 0.0.0.0 --port 8000
```
Or use the Makefile target:
```bash
make dev
```
Open: http://localhost:8000
---
## 📱 Development Workflow
### Developing the Frontend
1. **Terminal 1 - Tailwind Watcher:**
```bash
npm run tailwind:dev
```
2. **Terminal 2 - FastAPI Dev Server:**
```bash
source venv/bin/activate
uvicorn main:app --reload
```
3. **Edit the files:**
- HTML: `app/web/templates/index.html`, `servers.html`, `models_running.html`
- CSS input: `app/web/static/css/input.css` (rarely needed; prefer Tailwind classes)
- JavaScript: `app/web/static/js/app.js`, `servers.js`, `models-running.js`, `data-sync.worker.js`
> ⚠️ **Dynamic Tailwind classes**: Classes that are built dynamically at runtime and injected via `innerHTML` (e.g. in accordions or cards) are **not** detected by the JIT scanner, which only matches complete class names present in the scanned files. Use inline styles (`style="..."`) or classes hardcoded in the HTML templates in these cases.
4. **Everything is rebuilt automatically:**
- Tailwind regenerates `app/web/static/css/output.css` automatically
- FastAPI reloads the server automatically
- The browser reloads automatically (if enabled)
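To illustrate the note on dynamic classes, here is a minimal sketch of markup meant for `innerHTML` that relies on attributes and an inline style instead of Tailwind utilities. The `chevronSvg` helper and its markup are hypothetical and are not taken from `app.js`.

```javascript
// Hypothetical helper: builds a chevron icon for an accordion header.
// Size and rotation come from attributes and an inline style, so the
// result does not depend on any class being present in output.css.
function chevronSvg(isOpen) {
  const rotation = isOpen ? 180 : 0;
  return (
    `<svg width="16" height="16" viewBox="0 0 20 20" fill="currentColor" ` +
    `style="transform: rotate(${rotation}deg); transition: transform 0.2s;">` +
    `<path d="M5 7l5 5 5-5z"/></svg>`
  );
}

console.log(chevronSvg(true));
```

Because nothing here is a class name, the icon renders the same whether or not the scanner ever sees this string.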
---
## 🐳 Build Docker
The multi-stage Dockerfile:
1. **Stage 1 - CSS Builder (Node):**
- Installs npm dependencies
- Compiles Tailwind CSS
- Generates `app/web/static/css/output.css`
2. **Stage 2 - Python Builder:**
- Installs Python dependencies
- Creates the virtualenv
3. **Stage 3 - Runtime:**
- Copies the compiled CSS from Stage 1
- Copies the Python packages from Stage 2
- Produces an optimized final image (~300MB)
### Local build:
```bash
docker build -t llm-monitor:latest .
```
### Run the container:
```bash
docker run -p 8000:8000 --env-file .env llm-monitor:latest
```
---
## ⚙️ Tailwind Configuration
File: `tailwind.config.js`
```javascript
module.exports = {
content: [
"./app/web/templates/**/*.html",
"./app/web/static/**/*.js",
],
theme: {
extend: {},
},
plugins: [],
}
```
**Content**: Specifies which files Tailwind scans for the classes in use
---
## 🎯 CSS Architecture
### Input CSS
File: `app/web/static/css/input.css`
```css
@tailwind base;
@tailwind components;
@tailwind utilities;
```
### Output CSS
File: `app/web/static/css/output.css` (generated)
- Contains only the Tailwind classes actually used
- Minified in production (~30KB)
- Optimized for performance
### Usage in HTML
File: `app/web/templates/index.html`
```html
<!-- Use the compiled CSS (production) -->
<link rel="stylesheet" href="/static/css/output.css">
<!-- CDN fallback for development (if output.css does not exist) -->
<script src="https://cdn.tailwindcss.com"></script>
```
---
## 📝 Development Tips
### Hot Reload CSS
```bash
npm run tailwind:dev
# Watches the files and recompiles automatically
```
### Debug CSS Compilation
```bash
npm run tailwind:build
# If the CSS does not show up, check:
# 1. Are the classes used in the HTML/JS files?
# 2. Is there a CSS syntax error?
# 3. Are the paths in tailwind.config.js correct?
```
### Adding New Tailwind Classes
1. Edit the HTML/JS files and use the Tailwind classes
2. The Tailwind watcher detects them automatically
3. `output.css` is regenerated
```html
<!-- Newly added class -->
<div class="bg-gradient-to-r from-purple-500 to-pink-500">
<!-- It is added automatically to the compiled CSS -->
</div>
```
---
## 🚀 Production Checklist
- [ ] Run `npm run tailwind:build` to minify the CSS
- [ ] Verify that `output.css` has been generated
- [ ] Run the Python tests: `make test`
- [ ] Run the E2E tests: `npm run test:e2e`
- [ ] Check that the Docker container uses the compiled CSS
- [ ] Run a performance test with Lighthouse
- [ ] Check the bundle size of `output.css`
---
## 🧪 Testing
### Unit Tests (pytest)
```bash
# All tests
pytest tests/ -v
# With coverage
pytest tests/ --cov=app
```
### E2E Tests (Playwright)
The E2E tests verify browser behavior (cache-first loading, navigation, PWA).
```bash
# Install the Playwright browsers (first time only)
npx playwright install --with-deps
# Run the E2E tests (requires a running Ollama instance)
OLLAMA_HOST=http://<ollama-host>:11434 npm run test:e2e
# With an HTML report
npm run test:e2e -- --reporter=html
```
The tests live in `tests/e2e/`. The report is generated in `playwright-report/` (gitignored).
### Makefile
```bash
make test # pytest
make lint # flake8
make format # black
make dev # uvicorn --reload
make deploy-no-cache # forced Docker rebuild
```
---
## 🔗 Resources
- [Tailwind CSS Documentation](https://tailwindcss.com/docs)
- [Tailwind CLI](https://tailwindcss.com/docs/installation)
- [FastAPI Hot Reload](https://fastapi.tiangolo.com/#example-upgrade)
- [Docker Multi-Stage Builds](https://docs.docker.com/build/building/multi-stage/)
---
**Last updated:** April 2024
+683
@@ -0,0 +1,683 @@
# LLM Monitor - Product Requirements Document (PRD)
**Version:** 1.0.0
**Date:** April 2024
**Author:** Luca Sacchi Ricciardi
**Rights holder:** Luca Sacchi Ricciardi (all rights reserved)
**Status:** Active Development
---
## 📋 Table of Contents
1. [Executive Summary](#executive-summary)
2. [Vision & Objectives](#vision--obiettivi)
3. [Problem & Solution](#problema--soluzione)
4. [Target Users](#utenti-target)
5. [Main Features](#feature-principali)
6. [Technical Requirements](#requisiti-tecnici)
7. [Architecture](#architettura)
8. [User Stories](#user-stories)
9. [Acceptance Criteria](#acceptance-criteria)
10. [Timeline & Roadmap](#timeline--roadmap)
11. [Success Metrics](#success-metrics)
12. [Constraints & Assumptions](#constraints--assumptions)
---
## 🎯 Executive Summary
**LLM Monitor** is a **modern web dashboard** for real-time monitoring of the LLM models loaded in **Ollama**. The application provides an intuitive view of model status and resource usage, plus access to the data through a REST API documented with Swagger/OpenAPI.
### Highlights
- ✅ Reactive dashboard with no page reloads
- ✅ Web Worker for background data synchronization
- ✅ localStorage for local caching and offline support
- ✅ Fully documented REST API
- ✅ Containerized with Docker
- ✅ Modern client-server architecture
---
## 🚀 Vision & Objectives
### Vision
Give developers and DevOps engineers **complete, real-time visibility** into the LLM models available in Ollama, removing the need for CLI commands to monitor them.
### Primary Objectives
1. **View the models** loaded in Ollama without CLI commands
2. **Monitor resources** (size, memory, status)
3. **Access the API** through an intuitive dashboard
4. **Document the API** with Swagger for external integrations
5. **Deploy easily** with Docker/Docker Compose
6. **Update in real time** without page reloads
### Secondary Objectives
1. Offline support via localStorage
2. Optimal performance with Web Workers
3. Modern, responsive UI
4. Easy installation and configuration
---
## 🔍 Problem & Solution
### Problem
Currently, checking the LLM models in Ollama has these drawbacks:
- It requires CLI commands (`ollama list`)
- API calls have to be made manually with curl/Postman
- There is no dedicated visual dashboard
- Monitoring is difficult for non-developers
### Proposed Solution
**LLM Monitor** provides:
- ✅ An intuitive, modern web dashboard
- ✅ Automatic updates every 30 seconds
- ✅ No page reloads, thanks to Web Workers
- ✅ A documented API that can be tested directly
- ✅ Simple deployment with Docker
---
## 👥 Target Users
### Primary Users
1. **DevOps Engineers** - Monitor models in production
2. **ML Engineers** - Check model availability
3. **Developers** - Integrate via the API
### Secondary Users
1. **System Administrators** - Infrastructure overview
2. **Project Managers** - Status of the available models
### Use Cases
#### UC1: Check Loaded Models
- **Actor:** Developer
- **Goal:** See which models are available
- **Flow:** Open the dashboard → view the model list with details
- **Benefit:** No CLI needed, immediate overview
#### UC2: Monitor Disk Space
- **Actor:** DevOps
- **Goal:** Track the disk space consumed by the models
- **Flow:** Dashboard → view total and per-model space
- **Benefit:** Plan cleanup and capacity
#### UC3: Integrate via API
- **Actor:** Developer
- **Goal:** Automate scripts that consume model data
- **Flow:** Consult Swagger → write a script that calls the API endpoints
- **Benefit:** Automation and integration with other systems
#### UC4: Offline Mode
- **Actor:** Developer (without a connection)
- **Goal:** Access the saved model data
- **Flow:** localStorage provides the last known state
- **Benefit:** Partial access even when offline
---
## ⚡ Main Features
### 1. Main Dashboard
**Description:** Home page with an overview of the models
**Components:**
- Header with logo and Ollama status
- Stat cards: loaded models, total space, status
- Model list with:
  - Model name
  - Size
  - Last modified date
  - Digest (unique hash)
- Manual refresh button
- Model details panel opened by clicking a card
**Behavior:**
- Auto-refresh every 30 seconds
- Updates only the elements that changed (no full re-render)
- Shows a loading state during fetches
- Error handling with clear messages
- During a list refresh, calls `show` for each model and stores the details in the local cache
- Clicking a model card opens the `show` details without a page reload
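One way to update only the elements that changed is to diff the previous and the new model list before touching the DOM. The sketch below is an assumption about how that could look; `diffModels` is a hypothetical helper, and it treats `name` as the identity of a model, using the field names from the `/api/v1/models` response.

```javascript
// Hypothetical helper: compares two model lists and reports what changed,
// so the caller can touch only the affected cards instead of re-rendering all.
function diffModels(previous, next) {
  const prevByName = new Map(previous.map((m) => [m.name, m]));
  const nextByName = new Map(next.map((m) => [m.name, m]));

  const added = next.filter((m) => !prevByName.has(m.name));
  const removed = previous.filter((m) => !nextByName.has(m.name));
  // A model counts as changed when any displayed field differs.
  const changed = next.filter((m) => {
    const old = prevByName.get(m.name);
    return (
      old !== undefined &&
      (old.digest !== m.digest ||
        old.size !== m.size ||
        old.modified_at !== m.modified_at)
    );
  });
  return { added, removed, changed };
}

const result = diffModels(
  [{ name: "llama2", digest: "abc", size: 1, modified_at: "2024-04-15T10:30:00Z" }],
  [{ name: "llama2", digest: "def", size: 1, modified_at: "2024-04-15T10:30:00Z" }]
);
console.log(result.changed.map((m) => m.name));
```

The renderer can then insert cards for `added`, drop cards for `removed`, and patch only the cards listed in `changed`.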
---
### 2. Documented REST API
**Endpoints:**
#### `GET /api/v1/health`
Checks the status of the API and of Ollama
**Response:**
```json
{
"status": "healthy",
"ollama_status": "online",
"timestamp": "2024-04-15T10:30:00Z"
}
```
#### `GET /api/v1/models`
Returns the list of all models
**Response:**
```json
{
"models": [
{
"name": "llama2",
"digest": "abc123...",
"size": 3825922048,
"modified_at": "2024-04-15T10:30:00Z"
}
],
"total": 1
}
```
#### `GET /api/v1/models/{model_name}`
Returns the details of a specific model
**Response:**
```json
{
"name": "llama2",
"digest": "abc123...",
"size": 3825922048,
"modified_at": "2024-04-15T10:30:00Z"
}
```
#### `GET /api/v1/models/{model_name}/show`
Proxies the Ollama `POST /api/show` endpoint to return extended information about the model
#### `POST /api/v1/models/{model_name}/pull`
Downloads/loads a model (**disabled by default**)
#### `DELETE /api/v1/models/{model_name}`
Deletes a model (**disabled by default**)
#### R/W endpoint policy
- The `POST/DELETE` endpoints are **unavailable** by default.
- They are enabled only by setting the environment variable `ENABLE_MODEL_RW_API=true`.
- When not enabled, the endpoints are not exposed in Swagger and respond with `404`.
---
### 3. Web Worker for Synchronization
**Description:** Separate thread for data updates
**Features:**
- Performs HTTP requests without blocking the UI
- Refreshes the localStorage cache every 30 seconds (the write itself has to happen on the main thread, because `localStorage` is not available inside a Web Worker)
- Notifies the main thread of new data
- Fallback for browsers without Web Worker support
**Advantages:**
- The UI stays responsive (60 FPS)
- No lag during fetches
- Better scalability
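A single synchronization step could be sketched as follows. This is an illustration under assumptions, not the code in `data-sync.worker.js`: `syncOnce`, the injected `fetchJson` and `notify` callbacks, and the `SYNC_ERROR` message type are hypothetical, while the two endpoint paths and the `DATA_UPDATED` type come from the project documentation.

```javascript
// Hypothetical sync step. `fetchJson` and `notify` are injected so the same
// function can run inside a worker (fetch + postMessage) or in a test.
async function syncOnce(fetchJson, notify) {
  try {
    const [health, models] = await Promise.all([
      fetchJson("/api/v1/health"),
      fetchJson("/api/v1/models"),
    ]);
    notify({ type: "DATA_UPDATED", health, models, syncedAt: Date.now() });
    return true;
  } catch (error) {
    // Assumed message type for failures; not taken from the source.
    notify({ type: "SYNC_ERROR", message: String(error.message || error) });
    return false;
  }
}
```

Inside a worker this would typically be wired as `setInterval(() => syncOnce((url) => fetch(url).then((r) => r.json()), (msg) => self.postMessage(msg)), 30000)`. Since `localStorage` is not exposed in worker scope, the sketch leaves persistence to whoever receives the message.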
---
### 4. LocalStorage Persistence
**Description:** Local data cache
**Stored data:**
- `llm_monitor_health` - Health status
- `llm_monitor_models` - Model list + `showByModel` details map
**Benefits:**
- Offline support
- Instant loading
- Restores the last known state
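Because the `showByModel` map can grow large, a write to the cache may hit the browser storage quota. The sketch below shows one possible fallback, assuming the details can be dropped and fetched again later; `saveModelsCache` and its return values are hypothetical, and the project's actual quota handling may differ.

```javascript
// Hypothetical cache writer. `storage` is any object with the Web Storage
// setItem(key, value) method, e.g. window.localStorage in a browser.
function saveModelsCache(storage, key, payload) {
  try {
    storage.setItem(key, JSON.stringify(payload));
    return "full";
  } catch (error) {
    // Browsers throw (typically a QuotaExceededError) when the quota is hit.
    // Assumed fallback: retry without the bulky showByModel details.
    const { showByModel, ...slim } = payload;
    try {
      storage.setItem(key, JSON.stringify(slim));
      return "without-details";
    } catch (secondError) {
      return "skipped";
    }
  }
}
```

The return value lets the UI tell the user which cache mode is in effect, for example by marking the details as deferred.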
---
### 5. Swagger/OpenAPI Documentation
**Description:** Interactive API documentation
**URLs:**
- Swagger UI: `/docs`
- ReDoc: `/redoc`
**Features:**
- Test endpoints directly
- View schemas
- Generate client code (curl, Python, JS, etc.)
---
### 6. Docker Support
**Description:** Containerization of the application
**Components:**
- Optimized multi-stage Dockerfile
- docker-compose.yml for the dashboard only (external/remote Ollama)
- Configured health checks
- Stays running until manually stopped (`restart: unless-stopped`)
---
## 🏗️ Technical Requirements
### Backend
- **Language:** Python 3.10+
- **Framework:** FastAPI
- **Server:** Uvicorn
- **Validation:** Pydantic
- **HTTP Client:** requests/httpx
### Frontend
- **HTML5** - Base template
- **CSS:** TailwindCSS (utility-first)
- **JavaScript:** Vanilla JS (no frameworks)
- **Web APIs:**
  - Fetch API for HTTP
  - Web Workers for threading
  - localStorage for persistence
### DevOps
- **Container:** Docker
- **Orchestration:** Docker Compose
- **Network:** HTTP/HTTPS
### Database
- None (stateless)
- localStorage in the browser (client-side only)
---
## 🏛️ Architecture
### Main Components
```
┌────────────────────────────────────────────────────┐
│ Client (Browser)                                   │
│ ┌────────────────────────────────────────────────┐ │
│ │ index.html + app.js (Main Thread)              │ │
│ │ - Renders the UI                               │ │
│ │ - Reads localStorage                           │ │
│ │ - Updates the DOM granularly                   │ │
│ └────────────────────────────────────────────────┘ │
│ ┌────────────────────────────────────────────────┐ │
│ │ data-sync.worker.js (Web Worker Thread)        │ │
│ │ - Fetches /api/v1/health                       │ │
│ │ - Fetches /api/v1/models                       │ │
│ │ - Updates localStorage                         │ │
│ │ - Communicates with the main thread            │ │
│ └────────────────────────────────────────────────┘ │
└─────────────────────────┬──────────────────────────┘
                          │ HTTP REST API
┌─────────────────────────▼──────────────────────────┐
│ FastAPI Server (Python)                            │
│ ┌────────────────────────────────────────────────┐ │
│ │ main.py (Entry Point)                          │ │
│ │ - CORS middleware                              │ │
│ │ - Route setup                                  │ │
│ └────────────────────────────────────────────────┘ │
│ ┌────────────────────────────────────────────────┐ │
│ │ app/api/ (Endpoints)                           │ │
│ │ - health.py                                    │ │
│ │ - models.py                                    │ │
│ └────────────────────────────────────────────────┘ │
│ ┌────────────────────────────────────────────────┐ │
│ │ app/services/ (Business Logic)                 │ │
│ │ - ollama.py (OllamaClient)                     │ │
│ └────────────────────────────────────────────────┘ │
└─────────────────────────┬──────────────────────────┘
                          │ HTTP API
┌─────────────────────────▼──────────────────────────┐
│ Ollama Server (LLM Models)                         │
│ - API port: 11434                                  │
│ - Manages the LLM models                           │
│ - Endpoint: /api/tags                              │
└────────────────────────────────────────────────────┘
```
### Data Flow
1. **Initialization:**
   - The main thread loads localStorage
   - It renders the UI with the cached data
   - It starts the Web Worker
2. **Periodic Synchronization (every 30s):**
   - The Worker fetches `/api/v1/health`
   - The Worker fetches `/api/v1/models`
   - The Worker updates localStorage
   - The Worker sends a message to the main thread
3. **UI Update:**
   - The main thread receives the message from the Worker
   - It compares the old data with the new data
   - It updates only the elements that changed
4. **Manual Refresh:**
   - The user clicks the 🔄 button
   - The main thread calls `worker.postMessage({ type: "SYNC_NOW" })`
   - The Worker runs an immediate synchronization
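The main-thread side of this flow can be sketched as a small bridge around the worker. The `connectWorker` helper is hypothetical; the `SYNC_NOW` and `DATA_UPDATED` message types are the ones named in the project documentation, and the exact shape of the payload is an assumption.

```javascript
// Hypothetical wiring between the main thread and the sync worker.
// `worker` only needs postMessage() and an assignable onmessage handler,
// which is the interface a real `new Worker(...)` instance exposes.
function connectWorker(worker, onData) {
  worker.onmessage = (event) => {
    const message = event.data;
    if (message && message.type === "DATA_UPDATED") {
      onData(message);
    }
  };
  // The returned function backs the manual refresh button.
  return function requestSync() {
    worker.postMessage({ type: "SYNC_NOW" });
  };
}
```

In the browser, the returned function would be attached to the refresh button's click handler, and `onData` would run the compare-and-update step described in point 3.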
---
## 👤 User Stories
### US1: View Available Models
```
As a: Developer
I want: To see the list of models loaded in Ollama
So that: I know which models are available
Acceptance Criteria:
- The dashboard shows the model list
- For each model: name, size, last modified date
- The total number of models is shown in a stat card
- Data is refreshed every 30 seconds
```
### US2: Monitor Space Usage
```
As a: DevOps Engineer
I want: To check how much space the models take up
So that: I can plan capacity and cleanup
Acceptance Criteria:
- A stat card shows the total space
- Each model shows its size
- Human-readable format (GB, MB, etc.)
- Automatic updates
```
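The readable-format criterion in US2 could be met by a small formatter like the one below. This is a sketch using 1024-based units; the project's own `formatBytes` is not shown in this document and may round or label units differently.

```javascript
// Hypothetical formatter; the project's own formatBytes may differ.
function formatBytes(bytes) {
  if (!Number.isFinite(bytes) || bytes <= 0) return "0 B";
  const units = ["B", "KB", "MB", "GB", "TB"];
  let value = bytes;
  let unitIndex = 0;
  // Divide by 1024 until the value fits the current unit.
  while (value >= 1024 && unitIndex < units.length - 1) {
    value /= 1024;
    unitIndex += 1;
  }
  return `${value.toFixed(unitIndex === 0 ? 0 : 2)} ${units[unitIndex]}`;
}

console.log(formatBytes(3825922048)); // "3.56 GB"
```

The sample value is the `size` from the `/api/v1/models` example response.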
### US3: Check Ollama Status
```
As a: System Admin
I want: To know whether Ollama is online
So that: I can identify problems quickly
Acceptance Criteria:
- Status indicator in the header (green/red)
- Descriptive text ("Online/Offline")
- Health check every 30 seconds
```
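The status indicator in US3 can be derived from the `/api/v1/health` payload. In this sketch, `healthIndicator` is a hypothetical helper; it assumes that only the combination `status: "healthy"` and `ollama_status: "online"` counts as online, and that a missing payload means offline.

```javascript
// Hypothetical mapping from the /api/v1/health payload to the header badge.
function healthIndicator(health) {
  const online =
    Boolean(health) &&
    health.status === "healthy" &&
    health.ollama_status === "online";
  return online
    ? { color: "green", label: "Online" }
    : { color: "red", label: "Offline" };
}

console.log(healthIndicator({ status: "healthy", ollama_status: "online" }));
```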
### US4: Access the Documented API
```
As a: Developer
I want: To consult the API documentation with examples
So that: I can integrate the data into my scripts/apps
Acceptance Criteria:
- Swagger UI available at /docs
- ReDoc available at /redoc
- All endpoints documented
- Endpoints can be tested from the browser
```
### US5: Use the Dashboard Offline
```
As a: Developer
I want: To view the latest data even when offline
So that: I can access the information without a connection
Acceptance Criteria:
- localStorage persists the data
- The dashboard loads without the server
- It shows the timestamp of the last update
- It shows a warning if the data is out of date
```
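The out-of-date warning in US5 needs a rule for when cached data counts as stale. The sketch below is one possibility; `isStale` is a hypothetical helper, and the 90-second default (three missed 30-second refreshes) is an assumption rather than a project setting.

```javascript
// Hypothetical staleness check for cached data. The 90-second default
// (three missed 30-second refreshes) is an assumption, not a project setting.
function isStale(lastUpdatedIso, nowMs, maxAgeMs = 90000) {
  const lastUpdatedMs = Date.parse(lastUpdatedIso);
  if (Number.isNaN(lastUpdatedMs)) return true; // unknown age: treat as stale
  return nowMs - lastUpdatedMs > maxAgeMs;
}

console.log(isStale("2024-04-15T10:30:00Z", Date.now()));
```

Passing the current time as an argument keeps the function deterministic and easy to test.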
### US6: Manual Refresh
```
As a: User
I want: To refresh the data immediately
So that: I get the most recent information
Acceptance Criteria:
- A 🔄 button is present in the dashboard
- Clicking it refreshes the data immediately
- Loading state during the fetch
- No page reload
```
---
## ✅ Acceptance Criteria
### Functionality
| # | Feature | Acceptance |
|---|---------|--------------|
| 1 | Dashboard loads models | ✅ List visible within 2 seconds |
| 2 | Auto-refresh every 30s | ✅ No page reload, DOM update only |
| 3 | Ollama status | ✅ Correct green/red indicator |
| 4 | localStorage in sync | ✅ Data persists across sessions |
| 5 | Web Worker active | ✅ Main thread never blocked |
| 6 | Swagger API available | ✅ Endpoints testable at /docs |
| 7 | Docker container | ✅ Starts and stays running |
| 8 | Offline mode | ✅ Loads from localStorage |
### Performance
| # | Metric | Target |
|---|---------|--------|
| 1 | FCP (First Contentful Paint) | < 1s |
| 2 | LCP (Largest Contentful Paint) | < 2s |
| 3 | TTI (Time to Interactive) | < 3s |
| 4 | API response time | < 200ms |
| 5 | Dashboard refresh FPS | 60 FPS |
| 6 | Memory usage | < 50MB |
### Browser Compatibility
| Browser | Minimum Version | Status |
|---------|-----------------|--------|
| Chrome | 70+ | ✅ Supported |
| Firefox | 65+ | ✅ Supported |
| Safari | 12+ | ✅ Supported |
| Edge | 79+ | ✅ Supported |
| Opera | 57+ | ✅ Supported |
| IE11 | - | ❌ Not supported (no Fetch API) |
---
## 📅 Timeline & Roadmap
### Phase 1: MVP (Completed ✅)
**Duration:** 2 weeks
**Features:**
- [x] Basic dashboard with model list
- [x] REST API with 3 endpoints
- [x] Swagger documentation
- [x] Docker setup
- [x] Web Worker architecture
- [x] localStorage integration
**Release:** v1.0.0
---
### Phase 2: Enhancement (Planned 🔄)
**Duration:** 2 weeks
**Features:**
- [ ] Historical statistics (charts)
- [ ] Model search and filters
- [ ] Dark/Light theme toggle
- [ ] Configurable refresh rate
- [ ] Data export (CSV/JSON)
- [ ] Status change notifications
**Release:** v1.1.0
---
### Phase 3: Advanced (Future 🚀)
**Duration:** 3+ weeks
**Features:**
- [ ] Multi-tenant support
- [ ] Authentication & Authorization
- [ ] User preferences storage
- [ ] Service Worker per PWA
- [ ] Real-time updates (WebSocket)
- [ ] Model versioning
- [ ] Pull/Delete confirmation modal
- [ ] Advanced error handling
**Release:** v2.0.0
---
### Phase 4: Production (Future 🏆)
**Duration:** Ongoing
**Features:**
- [ ] Monitoring & Alerting
- [ ] Analytics dashboard
- [ ] Performance optimization
- [ ] Load testing & benchmarks
- [ ] Security audit
- [ ] GDPR compliance
**Release:** v2.1.0+
---
## 📊 Success Metrics
### Technical Metrics
| Metric | Target | Measured with |
|---------|--------|--------|
| Uptime | 99%+ | Monitoring |
| API latency | < 200ms | New Relic/DataDog |
| Error rate | < 0.1% | Logs |
| Test coverage | 80%+ | pytest coverage |
| Bundle size | < 100KB | webpack-bundle-analyzer |
### Business Metrics
| Metric | Target | Measured with |
|---------|--------|--------|
| Time to load | < 2s | Lighthouse |
| Page interactions/sec | 100+ | App metrics |
| User satisfaction | 4.5/5 | Feedback form |
| DevOps adoption | 70%+ | Usage analytics |
| Automation enabled | 50%+ | Script integrations |
### User Engagement
| Metric | Target | Measured with |
|---------|--------|--------|
| Monthly active users | 100+ | Analytics |
| Dashboard views/month | 1000+ | Google Analytics |
| API calls/day | 500+ | API logs |
| Feature usage rate | 80%+ | Telemetry |
---
## 🚫 Constraints & Assumptions
### Constraints
#### Technical
- Ollama must be running (hard requirement)
- Python 3.10+ is required
- Docker is required for containerization
- A modern browser is required (Web Workers)
#### Organizational
- Team: 1-2 developers
- Budget: Open source (free)
- Timeline: 2-week sprints
#### User
- Basic knowledge of Docker
- Local access to Ollama
- Modern browser
### Assumptions
#### Product
- The Ollama API remains stable
- LLM models are relatively static (they change less often than once every 24 hours)
- A 30-second refresh is adequate
#### Technical
- Web Workers are supported by the target browsers
- localStorage is available (not private mode)
- CORS is enabled between client and server
#### Market
- Ollama will become the standard for local LLMs
- Interest in monitoring tools is growing
- The community will contribute improvements
---
## 📝 Implementation Notes
### Dependencies
```
fastapi==0.104.1
uvicorn==0.24.0
pydantic==2.5.0
requests==2.31.0
jinja2==3.1.2
```
### Dev Dependencies
```
pytest==7.4.3
black==23.12.0
flake8==6.1.0
mypy==1.7.1
```
### File Structure
```
llm-monitor/
├── main.py # FastAPI entry point
├── app/
│ ├── config.py # Configuration
│ ├── api/ # Endpoints
│ ├── services/ # Business logic
│ └── web/ # Frontend (HTML, JS, CSS)
├── tests/ # Test suite
├── Dockerfile # Container
└── docker-compose.yml # Orchestration
```
---
## 🔗 References
### External Documentation
- [FastAPI Docs](https://fastapi.tiangolo.com/)
- [Ollama API](https://github.com/ollama/ollama/blob/main/docs/api.md)
- [Web Workers MDN](https://developer.mozilla.org/en-US/docs/Web/API/Web_Workers_API)
- [localStorage API](https://developer.mozilla.org/en-US/docs/Web/API/Window/localStorage)
### Repository
- [GitHub: llm-monitor](https://github.com/lucasacchiNet/llm-monitor)
- [Docker Hub: llm-monitor](https://hub.docker.com/r/lucasacchi/llm-monitor)
---
## ✍️ Changelog PRD
| Date | Version | Author | Changes |
|------|---------|--------|---------|
| 2024-04-24 | 1.0 | Luca Sacchi Ricciardi | Initial document |
| 2024-04-25 | 1.1 | - | TBD |
---
**Document approved:**
**Reviewer:** Product Team
**Last updated:** April 2024
**Next review:** June 2024
+190
@@ -0,0 +1,190 @@
# LLM Monitor - Web Worker Architecture
## 📊 Modern Architecture with Web Workers
This version of the dashboard uses **Web Workers** so that background data fetching never blocks the UI.
## 🏗️ Components
### 1. **data-sync.worker.js** (Web Worker)
A separate thread that:
- Sends HTTP requests to the API (`/api/v1/health`, `/api/v1/models`)
- Updates **localStorage** periodically (every 30 seconds)
- Sends messages with the updated data to the main thread
- **Never blocks the user interface**
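One sync cycle of the kind described above might look like the following. This is a sketch, not the project's worker code: `syncOnce`, the injected `fetchJson` and `post` parameters, the `SYNC_ERROR` message type, and the message shape are assumptions; only the two endpoints and the `DATA_UPDATED` name come from this document. Browsers do not expose `localStorage` inside dedicated workers, so the sketch only posts the data and assumes the main thread performs the write.

```javascript
// Sketch of one sync cycle. `fetchJson` and `post` are injected (assumption)
// so the logic can run outside a browser; inside a worker they would wrap
// fetch() and self.postMessage().
async function syncOnce(fetchJson, post) {
  try {
    // Request both endpoints in parallel.
    const [health, models] = await Promise.all([
      fetchJson('/api/v1/health'),
      fetchJson('/api/v1/models'),
    ]);
    post({
      type: 'DATA_UPDATED',
      health,
      models,
      timestamp: new Date().toISOString(),
    });
    return true;
  } catch (error) {
    // 'SYNC_ERROR' is a hypothetical message type for failed cycles.
    const message = error && error.message ? error.message : String(error);
    post({ type: 'SYNC_ERROR', message });
    return false;
  }
}

// Hypothetical wiring inside the worker (browser only):
// const fetchJson = (path) => fetch(path).then((r) => {
//   if (!r.ok) throw new Error(`HTTP ${r.status}`);
//   return r.json();
// });
// setInterval(() => syncOnce(fetchJson, (msg) => self.postMessage(msg)), 30000);
```

Injecting the two functions keeps the cycle testable without a browser; it also makes the offline case explicit, since a rejected fetch becomes a message instead of an uncaught error.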
### 2. **app.js** (Main Thread)
The main file, which:
- Initializes the Web Worker
- Loads data from **localStorage** at boot
- Receives messages from the Worker and updates the DOM
- Updates only the DOM elements that have actually changed
- Provides a fallback if Web Workers are not supported
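The "update only what changed" step can be approximated by comparing the previous and the new payload field by field. This is a minimal sketch under assumptions: `changedFields` and `applyUpdate` are hypothetical names, and `renderField` stands in for whatever DOM update `app.js` actually performs.

```javascript
// Returns the top-level keys whose values differ between two payloads.
// Values are compared by their JSON serialization, which is enough for
// plain data such as the health and models payloads.
function changedFields(previous, next) {
  const before = previous || {};
  const after = next || {};
  const keys = new Set([...Object.keys(before), ...Object.keys(after)]);
  return [...keys].filter(
    (key) => JSON.stringify(before[key]) !== JSON.stringify(after[key])
  );
}

// Calls `renderField` (hypothetical callback) only for changed keys,
// then stores the new payload as the current state.
function applyUpdate(state, next, renderField) {
  const changed = changedFields(state.current, next);
  changed.forEach((key) => renderField(key, next[key]));
  state.current = next;
  return changed;
}

const state = { current: { status: 'healthy', ollama_status: 'online' } };
const changed = applyUpdate(
  state,
  { status: 'healthy', ollama_status: 'offline' },
  (key, value) => console.log(`re-render ${key} ->`, value)
);
console.log(changed); // only 'ollama_status' is reported as changed
```

Serializing values is a simple comparison strategy; a real implementation might compare per model or per DOM node instead.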
### 3. **index.html**
HTML template with the base structure; it loads app.js.
## 🔄 Data Flow
```
┌─────────────────────────────────────────────────────┐
│ MAIN THREAD (UI Thread)                             │
│ - Renders the DOM                                   │
│ - Interacts with the user                           │
│ - Reads from localStorage                           │
└─────────────────┬───────────────────────────────────┘
                  postMessage() / onmessage
┌─────────────────▼───────────────────────────────────┐
│ WEB WORKER (Separate Thread)                        │
│ - Fetch /api/v1/health                              │
│ - Fetch /api/v1/models                              │
│ - localStorage.setItem()                            │
│ - Runs every 30 seconds                             │
└─────────────────────────────────────────────────────┘
                  postMessage({ DATA_UPDATED })
          ┌───────▼────────┐
          │  localStorage  │
          │   persistent   │
          └────────────────┘
```
## 💾 LocalStorage
Data is stored **per server** under dynamic keys:
- `llm_monitor_health_<serverId>` - Health check data
- `llm_monitor_models_<serverId>` - Available models data
- `llm_monitor_running_<serverId>` - Running models
- `llm_monitor_servers` - List of configured Ollama instances
- `llm_monitor_active_server` - ID of the active server
The `getServerStorageKey(serverId, suffix)` function in `server-config.js` builds the keys.
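Based on the key names listed above, the key builder could be as small as the following. This is a hypothetical reconstruction: the actual `getServerStorageKey` in `server-config.js` is not shown in this document and may validate or normalize its inputs differently.

```javascript
// Hypothetical reconstruction; the prefix is inferred from the key names above.
const STORAGE_PREFIX = 'llm_monitor';

function getServerStorageKey(serverId, suffix) {
  if (!serverId || !suffix) {
    throw new Error('serverId and suffix are required');
  }
  // Produces keys shaped like llm_monitor_<suffix>_<serverId>.
  return `${STORAGE_PREFIX}_${suffix}_${serverId}`;
}

console.log(getServerStorageKey('srv_e2e_cache', 'health'));
// prints "llm_monitor_health_srv_e2e_cache"
```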
### Health data structure:
```json
{
"status": "healthy",
"ollama_status": "online",
"timestamp": "2024-01-15T10:30:00.000Z"
}
```
### Models data structure:
```json
{
"models": [
{
"name": "llama2",
"digest": "abc123...",
"size": 3825922048,
"modified_at": "2024-01-15T10:30:00.000Z"
}
],
"total": 1,
"totalSize": "3.56 GB",
"showByModel": {
"llama2": { "details": {}, "model_info": {}, "parameters": "..." }
},
"timestamp": "2024-01-15T10:30:00.000Z"
}
```
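The `totalSize` string in the sample is consistent with the sum of the `size` values expressed in binary gigabytes (3825922048 / 1024³ ≈ 3.56). The sketch below assumes that convention; `formatBytes` and `summarizeModels` are hypothetical names, not functions taken from the project.

```javascript
// Formats a byte count using 1024-based units and two decimals (assumed convention).
function formatBytes(bytes) {
  const units = ['B', 'KB', 'MB', 'GB', 'TB'];
  let value = bytes;
  let unitIndex = 0;
  while (value >= 1024 && unitIndex < units.length - 1) {
    value /= 1024;
    unitIndex += 1;
  }
  return `${value.toFixed(2)} ${units[unitIndex]}`;
}

// Derives the `total` and `totalSize` fields from a list of models.
function summarizeModels(models) {
  const totalBytes = models.reduce((sum, model) => sum + (model.size || 0), 0);
  return { total: models.length, totalSize: formatBytes(totalBytes) };
}

console.log(summarizeModels([{ name: 'llama2', size: 3825922048 }]));
// prints { total: 1, totalSize: '3.56 GB' }
```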
## 🎯 Benefits
### ✅ Performance
- **Main thread never blocked by network I/O** - HTTP requests run in the Worker
- **Optimized DOM updates** - Only changed elements are updated
- **60 FPS target** - The UI stays responsive
### ✅ Offline Support
- Data remains in **localStorage** even when the server is offline
- The dashboard shows the last known state
- The **Service Worker** (`service-worker.js`) caches the app shell (HTML, CSS, JS) for offline navigation
- Current cache name: `llm-monitor-v3`
### ✅ Network Efficiency
- A single fetch cycle every 30 seconds (from the Worker)
- Gzip compression of the response
- Reduced bandwidth usage
### ✅ Scalability
- Multiple dashboard tabs do not overload the server
- LocalStorage is shared across tabs (updates stay in sync)
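Cross-tab updates through localStorage rely on the browser's `storage` event, which fires in the other same-origin tabs when a key changes. The sketch below is an assumption about how a tab could react to it: `parseStorageEvent` is a hypothetical helper, and only the `llm_monitor_` key prefix comes from this document.

```javascript
const KEY_PREFIX = 'llm_monitor_';

// Returns { key, value } for relevant storage events, or null otherwise.
function parseStorageEvent(event) {
  if (!event || typeof event.key !== 'string' || !event.key.startsWith(KEY_PREFIX)) {
    return null; // unrelated key, or a clear() call (key is null)
  }
  if (event.newValue === null) {
    return { key: event.key, value: null }; // the key was removed
  }
  try {
    return { key: event.key, value: JSON.parse(event.newValue) };
  } catch {
    return null; // plain-string values (not JSON) are ignored in this sketch
  }
}

// Browser-only wiring; skipped when no window object exists (for example in Node).
if (typeof window !== 'undefined') {
  window.addEventListener('storage', (event) => {
    const update = parseStorageEvent(event);
    if (update) {
      console.log('Updated in another tab:', update.key);
    }
  });
}
```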
## 🔧 Configuration
### Refresh interval
Change it in `data-sync.worker.js`:
```javascript
const REFRESH_INTERVAL = 30000; // 30 seconds
```
### Disabling the Web Worker (debug)
In the browser console:
```javascript
window.app.worker = null;
window.app.syncDataInMainThread();
```
## 🛠️ Development
### Debugging the Worker
```javascript
// In data-sync.worker.js
console.log("Worker sync triggered", new Date());
```
Browser console (DevTools > Dedicated Worker)
### Inspecting localStorage
```javascript
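// In the browser console (data keys are stored per server)
const serverId = localStorage.getItem('llm_monitor_active_server');
JSON.parse(localStorage.getItem(`llm_monitor_health_${serverId}`))
JSON.parse(localStorage.getItem(`llm_monitor_models_${serverId}`))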
```
## 📱 Browser Support
- ✅ Chrome/Edge 4+
- ✅ Firefox 3.6+
- ✅ Safari 4+
- ✅ Opera 10.6+
- ⚠️ Fallback available when Web Workers are not supported
## 🚀 Future Optimizations
- [ ] IndexedDB for larger datasets
- [ ] Cross-tab synchronization (BroadcastChannel API)
- [ ] Smart caching with TTL
- [ ] Data compression (Zstandard/Brotli)
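As an illustration of the TTL item, a freshness check could reuse the `timestamp` field that the cached payloads already carry. This is only a sketch of one possible approach; `isFresh` and the TTL value are hypothetical and not part of the current code.

```javascript
// Returns true when a cached entry is younger than ttlMs.
// `entry.timestamp` is expected to be an ISO 8601 string, as in the samples above.
function isFresh(entry, ttlMs, now = Date.now()) {
  if (!entry || !entry.timestamp) {
    return false;
  }
  const age = now - Date.parse(entry.timestamp);
  // Unparseable or future timestamps are treated as stale.
  return Number.isFinite(age) && age >= 0 && age < ttlMs;
}

const cached = { status: 'healthy', timestamp: '2024-01-15T10:30:00.000Z' };
const tenSecondsLater = Date.parse('2024-01-15T10:30:10.000Z');
console.log(isFresh(cached, 30000, tenSecondsLater)); // true
```

Passing `now` as a parameter keeps the check deterministic in tests.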
## 🔍 Troubleshooting
### Worker does not load
- Check CORS
- Check DevTools > Application > Service Workers
- Check the console for errors
### localStorage does not persist
- Incognito/private mode can disable localStorage or clear it when the session ends
- Storage full: clear localStorage
- Third-party cookies may be disabled
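When storage is full or blocked, `setItem` throws, so writes are usually wrapped in a try/catch. The following sketch assumes a hypothetical `safeSetItem` helper and an injected `storage` object (in the browser, `window.localStorage`); `createMemoryStorage` is only a stand-in that lets the example run outside a browser.

```javascript
// Writes a JSON value and reports success instead of throwing.
function safeSetItem(storage, key, value) {
  try {
    storage.setItem(key, JSON.stringify(value));
    return true;
  } catch (error) {
    // Browsers raise a DOMException (commonly named "QuotaExceededError")
    // when the quota is exhausted or storage is blocked.
    console.warn(`Could not persist ${key}: ${error.name}`);
    return false;
  }
}

// In-memory stand-in for localStorage with an artificial size limit.
function createMemoryStorage(maxChars) {
  const data = new Map();
  return {
    setItem(key, value) {
      if (String(value).length > maxChars) {
        const error = new Error('quota exceeded');
        error.name = 'QuotaExceededError';
        throw error;
      }
      data.set(key, String(value));
    },
    getItem(key) {
      return data.has(key) ? data.get(key) : null;
    },
  };
}
```

A caller can use the boolean result to fall back to a smaller payload, for example by skipping per-model details.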
### Updates not visible
- Check DevTools > Application > LocalStorage
- Verify that the Worker is active (DevTools > Dedicated Workers)
- Force a manual refresh with the 🔄 button
## 📚 References
- [MDN Web Workers](https://developer.mozilla.org/en-US/docs/Web/API/Web_Workers_API)
- [localStorage API](https://developer.mozilla.org/en-US/docs/Web/API/Window/localStorage)
- [Performance Best Practices](https://web.dev/performance/)
---
**Developed for LLM Monitor v1.1.0** 🦙
+11 -3
@@ -1,10 +1,14 @@
# LLM Monitor - Environment Configuration Example
# Copy this file to .env and adjust values for your environment
# Copy this file to .env and customize it for your environment
# ===========================================
# Ollama Configuration
# Ollama Configuration (Remote Server)
# ===========================================
# URL base dell'API Ollama
# Base URL of the Ollama API (remote server)
# Examples:
# - http://localhost:11434 (local development)
# - http://ollama.example.com:11434 (remote server)
# - https://ollama.example.com (with SSL)
OLLAMA_HOST=http://localhost:11434
# Timeout for requests to Ollama (seconds)
@@ -22,6 +26,10 @@ API_PORT=8000
# Number of worker processes for Uvicorn
API_WORKERS=4
# Enable the model read/write API (POST /pull, DELETE /models/{name})
# Safe default: false (endpoints unavailable)
ENABLE_MODEL_RW_API=false
# ===========================================
# CORS Configuration
# ===========================================
+65 -17
@@ -1,6 +1,6 @@
"""
LLM Monitor - Dashboard per controllare i modelli caricati in Ollama
Entry point dell'applicazione FastAPI
LLM Monitor - Dashboard to monitor Ollama models.
FastAPI application entry point.
"""
import logging
@@ -8,29 +8,30 @@ from fastapi import FastAPI
from fastapi.staticfiles import StaticFiles
from fastapi.responses import FileResponse
from fastapi.middleware.cors import CORSMiddleware
from fastapi.openapi.docs import get_redoc_html
from pathlib import Path
import os
# Configurazione logging
# Logging configuration
logging.basicConfig(level=logging.INFO)
logger = logging.getLogger(__name__)
# Importare le rotte
# Import API routes
from app.api.health import router as health_router
from app.api.models import router as models_router
from app.config import settings
# Creare l'app FastAPI
# Create FastAPI app
app = FastAPI(
title="LLM Monitor API",
description="Dashboard per il monitoraggio dei modelli LLM in Ollama",
description="Dashboard and API for monitoring Ollama LLM models",
version="1.0.0",
docs_url="/docs",
redoc_url="/redoc",
redoc_url=None,
openapi_url="/openapi.json"
)
# Configurare CORS
# Configure CORS
app.add_middleware(
CORSMiddleware,
allow_origins=settings.CORS_ORIGINS.split(","),
@@ -39,37 +40,84 @@ app.add_middleware(
allow_headers=["*"],
)
# Registrare le rotte API
# Register API routes
app.include_router(health_router, prefix="/api/v1", tags=["health"])
app.include_router(models_router, prefix="/api/v1", tags=["models"])
# Servire i file statici
# Serve static files
static_path = Path(__file__).parent / "app" / "web" / "static"
if static_path.exists():
app.mount("/static", StaticFiles(directory=static_path), name="static")
# Servire la dashboard web
# Serve web pages
templates_path = Path(__file__).parent / "app" / "web" / "templates"
@app.get("/")
async def root():
"""Redirect alla dashboard"""
return FileResponse(templates_path / "index.html")
"""Primary page: configured servers selector and control panel."""
return FileResponse(templates_path / "servers.html")
@app.get("/servers")
async def servers_page():
"""Configured Ollama servers page."""
return FileResponse(templates_path / "servers.html")
@app.get("/dashboard")
async def dashboard():
"""Dashboard principale"""
"""Legacy alias for configured servers page."""
return FileResponse(templates_path / "servers.html")
@app.get("/models-available")
async def models_available_page():
"""Page listing models available on disk."""
return FileResponse(templates_path / "index.html")
@app.get("/models-running")
async def models_running_page():
"""Page dedicated to models resident in memory (ollama ps)."""
return FileResponse(templates_path / "models_running.html")
@app.get("/manifest.webmanifest", include_in_schema=False)
async def web_manifest():
"""PWA web manifest."""
return FileResponse(static_path / "manifest.webmanifest", media_type="application/manifest+json")
@app.get("/service-worker.js", include_in_schema=False)
async def service_worker():
"""PWA service worker with root scope."""
return FileResponse(static_path / "js" / "service-worker.js", media_type="application/javascript")
@app.get("/redoc", include_in_schema=False)
async def redoc_html():
"""ReDoc documentation using a stable bundle."""
return get_redoc_html(
openapi_url=app.openapi_url,
title=f"{app.title} - ReDoc",
redoc_js_url="https://cdn.jsdelivr.net/npm/redoc@2/bundles/redoc.standalone.js",
with_google_fonts=False,
)
@app.get("/favicon.ico", include_in_schema=False)
async def favicon():
"""Application favicon."""
return FileResponse(static_path / "favicon.ico")
# Event hooks
@app.on_event("startup")
async def startup_event():
logger.info("🚀 LLM Monitor avviato")
logger.info(f"📊 Ollama host: {settings.OLLAMA_HOST}")
logger.info("LLM Monitor started")
logger.info(f"Ollama host: {settings.OLLAMA_HOST}")
@app.on_event("shutdown")
async def shutdown_event():
logger.info("🛑 LLM Monitor arrestato")
logger.info("LLM Monitor stopped")
if __name__ == "__main__":
import uvicorn
+5 -2
@@ -1,13 +1,16 @@
{
"name": "llm-monitor",
"version": "1.0.0",
"type": "commonjs",
"description": "Dashboard per controllare i modelli caricati in Ollama",
"private": true,
"scripts": {
"tailwind:dev": "tailwindcss -i app/web/static/css/input.css -o app/web/static/css/output.css --watch",
"tailwind:build": "tailwindcss -i app/web/static/css/input.css -o app/web/static/css/output.css --minify"
"tailwind:dev": "tailwindcss -i ./app/web/static/css/input.css -o ./app/web/static/css/output.css --watch",
"tailwind:build": "tailwindcss -i ./app/web/static/css/input.css -o ./app/web/static/css/output.css",
"test:e2e": "playwright test tests/e2e/cache-navigation.spec.js"
},
"devDependencies": {
"@playwright/test": "^1.59.1",
"tailwindcss": "^3.4.0"
}
}
+22
@@ -0,0 +1,22 @@
const { defineConfig } = require('@playwright/test');
const baseURL = process.env.TARGET_URL || 'http://127.0.0.1:8011';
module.exports = defineConfig({
testDir: './tests/e2e',
timeout: 45000,
fullyParallel: false,
retries: 0,
reporter: 'list',
use: {
baseURL,
headless: true,
serviceWorkers: 'block'
},
webServer: {
command: 'python3 -m uvicorn main:app --host 127.0.0.1 --port 8011',
url: baseURL,
reuseExistingServer: true,
timeout: 30000
}
});
+34
@@ -0,0 +1,34 @@
#!/usr/bin/env bash
set -euo pipefail
PROJECT_DIR="${PROJECT_DIR:-/opt/llm-monitor}"
CONTAINER_NAME="${CONTAINER_NAME:-llm-monitor-app}"
if [[ -d "$PROJECT_DIR" ]]; then
cd "$PROJECT_DIR"
else
echo "[deploy] PROJECT_DIR not found: $PROJECT_DIR"
echo "[deploy] using current directory: $PWD"
fi
echo "[deploy] stop stack"
docker compose down
if [[ ! -f ".env" && -f ".env.local" ]]; then
echo "[deploy] .env not found, copying .env.local -> .env"
cp .env.local .env
fi
echo "[deploy] build stack (no cache)"
docker compose build --no-cache
echo "[deploy] start stack"
docker compose up -d
echo "[deploy] waiting for container startup"
sleep 5
echo "[deploy] verify Tailwind CSS"
./scripts/verify-tailwind-css.sh "$CONTAINER_NAME"
echo "[deploy] completed successfully"
+28
@@ -0,0 +1,28 @@
#!/usr/bin/env bash
set -euo pipefail
CONTAINER_NAME="${1:-llm-monitor-app}"
CSS_PATH="/app/app/web/static/css/output.css"
MIN_LINES="${MIN_TAILWIND_LINES:-100}"
if ! docker ps --format '{{.Names}}' | grep -Fxq "$CONTAINER_NAME"; then
echo "[verify-css] ERROR: container '$CONTAINER_NAME' is not running"
exit 1
fi
if ! docker exec "$CONTAINER_NAME" test -f "$CSS_PATH"; then
echo "[verify-css] ERROR: CSS file not found: $CSS_PATH"
exit 1
fi
LINES=$(docker exec "$CONTAINER_NAME" wc -l "$CSS_PATH" | awk '{print $1}')
BYTES=$(docker exec "$CONTAINER_NAME" wc -c "$CSS_PATH" | awk '{print $1}')
echo "[verify-css] $CSS_PATH -> ${LINES} lines, ${BYTES} bytes"
if [[ "$LINES" -lt "$MIN_LINES" ]]; then
echo "[verify-css] ERROR: output.css has fewer than ${MIN_LINES} lines"
exit 1
fi
echo "[verify-css] OK: Tailwind CSS compiled correctly"
+11
@@ -0,0 +1,11 @@
/** @type {import('tailwindcss').Config} */
module.exports = {
content: [
"./app/web/templates/**/*.html",
"./app/web/static/**/*.js",
],
theme: {
extend: {},
},
plugins: [],
}
+78
@@ -0,0 +1,78 @@
const { test, expect } = require('@playwright/test');
const OLLAMA_HOST = process.env.OLLAMA_HOST || 'http://192.168.254.115:11434';
const SERVER_ID = process.env.TEST_SERVER_ID || 'srv_e2e_cache';
const SERVER_NAME = process.env.TEST_SERVER_NAME || 'E2E Cache Server';
const QUIET_WINDOW_MS = Number(process.env.QUIET_WINDOW_MS || 1500);
const CACHE_WAIT_TIMEOUT_MS = Number(process.env.CACHE_WAIT_TIMEOUT_MS || 20000);
test.describe('cache-first server navigation', () => {
test.beforeEach(async ({ page }) => {
await page.addInitScript(
({ serverId, serverName, host }) => {
localStorage.setItem(
'llm_monitor_servers',
JSON.stringify([
{
id: serverId,
name: serverName,
host
}
])
);
localStorage.setItem('llm_monitor_active_server', serverId);
},
{
serverId: SERVER_ID,
serverName: SERVER_NAME,
host: OLLAMA_HOST
}
);
});
test('serves cached data when navigating between running and available pages', async ({ context, page }) => {
const apiRequests = [];
context.on('request', (request) => {
const url = new URL(request.url());
if (url.pathname.startsWith('/api/v1/')) {
apiRequests.push(url.pathname + url.search);
}
});
const resetApiRequests = () => {
apiRequests.length = 0;
};
const waitForQuietWindow = async (label) => {
await page.waitForTimeout(QUIET_WINDOW_MS);
expect(apiRequests, `${label} should not issue API requests while cache is fresh`).toEqual([]);
};
await page.goto(`/models-running?server=${SERVER_ID}`, { waitUntil: 'domcontentloaded' });
await page.waitForFunction(
(serverId) => {
return ['health', 'models', 'running'].every((suffix) => {
return Boolean(localStorage.getItem(`llm_monitor_${suffix}_${serverId}`));
});
},
SERVER_ID,
{ timeout: CACHE_WAIT_TIMEOUT_MS }
);
expect(apiRequests.length).toBeGreaterThan(0);
resetApiRequests();
await page.goto(`/models-available?server=${SERVER_ID}`, { waitUntil: 'domcontentloaded' });
await waitForQuietWindow('running -> available');
resetApiRequests();
await page.goto(`/models-running?server=${SERVER_ID}`, { waitUntil: 'domcontentloaded' });
await waitForQuietWindow('available -> running');
resetApiRequests();
await page.goto(`/models-available?server=${SERVER_ID}`, { waitUntil: 'domcontentloaded' });
await waitForQuietWindow('running -> available again');
});
});
+118 -2
@@ -3,6 +3,7 @@ Test API endpoints
"""
import pytest
import requests
from unittest.mock import patch, MagicMock
def test_health_check(client):
@@ -46,13 +47,82 @@ def test_get_models(client, mock_models_response):
assert len(data["models"]) == 2
assert data["models"][0]["name"] == "llama2"
def test_get_models_with_host_override(client, mock_models_response):
"""Test host override is propagated to upstream models API call."""
with patch("requests.get") as mock_get:
mock_response = MagicMock()
mock_response.status_code = 200
mock_response.json.return_value = mock_models_response
mock_get.return_value = mock_response
response = client.get("/api/v1/models", params={"host": "http://example-host:11434"})
assert response.status_code == 200
assert mock_get.call_args.args[0] == "http://example-host:11434/api/tags"
def test_health_with_invalid_host_returns_422(client):
"""Invalid host query parameter must be rejected."""
response = client.get("/api/v1/health", params={"host": "not-a-url"})
assert response.status_code == 422
def test_model_show_with_invalid_host_returns_422(client):
"""Invalid host query parameter must be rejected on show endpoint."""
response = client.get("/api/v1/models/llama2/show", params={"host": "localhost:11434"})
assert response.status_code == 422
def test_get_running_models(client):
"""Test getting running models (ollama ps)."""
with patch("requests.get") as mock_get:
mock_response = MagicMock()
mock_response.status_code = 200
mock_response.json.return_value = {
"models": [
{
"name": "llama3.2:3b",
"size_vram": 2147483648,
"expires_at": "2026-04-24T10:30:00Z"
}
]
}
mock_get.return_value = mock_response
response = client.get("/api/v1/models/running")
assert response.status_code == 200
data = response.json()
assert "models" in data
assert data["total"] == 1
assert data["models"][0]["name"] == "llama3.2:3b"
def test_get_running_models_ollama_offline(client):
"""Test running models when Ollama is offline."""
with patch("requests.get") as mock_get:
mock_get.side_effect = Exception("Connection refused")
response = client.get("/api/v1/models/running")
assert response.status_code == 500
def test_get_models_ollama_offline(client):
"""Test getting models when Ollama is offline"""
with patch("requests.get") as mock_get:
mock_get.side_effect = Exception("Connection refused")
mock_get.side_effect = requests.exceptions.ConnectionError("Connection refused")
response = client.get("/api/v1/models")
assert response.status_code == 500
assert response.status_code == 502
def test_get_models_returns_502_when_upstream_is_unavailable(client):
"""Non-200 upstream response should remain a 502, not be converted to 500."""
with patch("requests.get") as mock_get:
mock_response = MagicMock()
mock_response.status_code = 503
mock_get.return_value = mock_response
response = client.get("/api/v1/models")
assert response.status_code == 502
def test_get_specific_model(client, mock_models_response):
"""Test getting specific model"""
@@ -78,6 +148,40 @@ def test_get_nonexistent_model(client, mock_models_response):
response = client.get("/api/v1/models/nonexistent")
assert response.status_code == 404
def test_get_model_show(client):
"""Test show endpoint for model details."""
with patch("requests.post") as mock_post:
mock_response = MagicMock()
mock_response.status_code = 200
mock_response.json.return_value = {
"details": {
"family": "llama",
"parameter_size": "8B"
},
"model_info": {
"general.architecture": "llama"
}
}
mock_post.return_value = mock_response
response = client.get("/api/v1/models/llama2/show")
assert response.status_code == 200
data = response.json()
assert "details" in data
assert data["details"]["family"] == "llama"
def test_get_model_show_not_found(client):
"""Test show endpoint when model is not found."""
with patch("requests.post") as mock_post:
mock_response = MagicMock()
mock_response.status_code = 404
mock_post.return_value = mock_response
response = client.get("/api/v1/models/nonexistent/show")
assert response.status_code == 404
def test_root_endpoint(client):
"""Test root endpoint redirects to dashboard"""
response = client.get("/", follow_redirects=False)
@@ -92,3 +196,15 @@ def test_openapi_schema(client):
assert "paths" in schema
assert "/api/v1/health" in schema["paths"]
assert "/api/v1/models" in schema["paths"]
assert "/api/v1/models/running" in schema["paths"]
assert "/api/v1/models/{model_name}/show" in schema["paths"]
assert "/api/v1/models/{model_name}/pull" not in schema["paths"]
def test_write_endpoints_disabled_by_default(client):
"""POST/DELETE on models must be unavailable by default."""
response_pull = client.post("/api/v1/models/llama2/pull")
assert response_pull.status_code == 404
response_delete = client.delete("/api/v1/models/llama2")
assert response_delete.status_code == 404