
# DocuMente & NotebookLM Agent

[![Python 3.10+](https://img.shields.io/badge/python-3.10+-blue.svg)](https://www.python.org/downloads/)
[![FastAPI](https://img.shields.io/badge/FastAPI-0.100+-009688.svg)](https://fastapi.tiangolo.com/)
[![Code style: ruff](https://img.shields.io/badge/code%20style-ruff-000000.svg)](https://github.com/astral-sh/ruff)
[![Tests](https://img.shields.io/badge/tests-pytest-blue.svg)](https://docs.pytest.org/)

> **Complete AI Platform: Multi-Provider RAG + NotebookLM Automation**

This repository contains **two complementary AI systems**:

1. **NotebookLM Agent** - REST API for programmatic automation of Google NotebookLM
2. **DocuMente (Agentic RAG)** - Advanced RAG system with multi-provider LLM support

---

## Table of Contents

- [Overview](#overview)
- [NotebookLM + RAG Integration](#notebooklm--rag-integration)
- [Components](#components)
- [Requirements](#requirements)
- [Installation](#installation)
- [Configuration](#configuration)
- [Running](#running)
- [Testing](#testing)
- [Project Structure](#project-structure)
- [Documentation](#documentation)
- [License](#license)

---

## Overview

### NotebookLM Agent

API and webhook interface for **Google NotebookLM** that provides:
- Programmatic management of notebooks, sources, and chats
- Automatic content generation (podcasts, videos, quizzes, flashcards, slides)
- Integration with other AI agents via webhooks
- End-to-end automation of NotebookLM workflows

**Ideal for:** automation engineers, content creators, AI agent developers

### DocuMente (Agentic RAG)

Standalone **Retrieval-Augmented Generation** system with:
- Support for 10 LLM providers (8 cloud, 2 local)
- Document upload and indexing (PDF, DOCX, TXT, MD)
- Conversational chat with your documents
- Modern web interface (React + TypeScript)
- **NotebookLM integration** - Semantic search over your notebooks

**Ideal for:** knowledge management, document analysis, research assistance

---

## NotebookLM + RAG Integration

You can sync your NotebookLM notebooks into DocuMente's RAG system, which lets you:

- **Run semantic searches** over the contents of your notebooks
- **Combine local documents and notebooks** in the same queries
- **Use any available LLM provider** to query your notebooks
- **Filter by specific notebooks** when searching

### Architecture

```
NotebookLM → NotebookLMIndexerService → Qdrant Vector Store
                                               ↓
                                     RAGService (filtered queries)
                                               ↓
                                     Multi-Provider LLM Response
```

### How it works

1. **Sync**: Notebook contents are extracted, split into chunks, and indexed in Qdrant
2. **Metadata**: Each chunk keeps information about its notebook and source of origin
3. **Search**: RAG queries can filter by specific `notebook_id` values
4. **Answer**: The LLM receives context from the selected notebooks
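
The steps above can be sketched in plain Python. This is a minimal illustration, not the project's `NotebookLMIndexerService`: the `Chunk` class, the function names, and the character-based chunking with overlap are all assumptions, and an in-memory list stands in for Qdrant.

```python
from dataclasses import dataclass


@dataclass
class Chunk:
    text: str
    notebook_id: str  # notebook the chunk came from
    source_id: str    # source within that notebook


def split_into_chunks(text: str, size: int = 200, overlap: int = 20) -> list[str]:
    """Split text into fixed-size character windows that overlap slightly."""
    if size <= overlap:
        raise ValueError("size must be larger than overlap")
    step = size - overlap
    return [text[i:i + size] for i in range(0, len(text), step)]


def index_source(notebook_id: str, source_id: str, text: str,
                 size: int = 200, overlap: int = 20) -> list[Chunk]:
    """Attach notebook and source metadata to every chunk of one source."""
    pieces = split_into_chunks(text, size, overlap)
    return [Chunk(piece, notebook_id, source_id) for piece in pieces]


def filter_by_notebooks(chunks: list[Chunk], notebook_ids: list[str]) -> list[Chunk]:
    """Keep only chunks that belong to one of the requested notebooks."""
    wanted = set(notebook_ids)
    return [chunk for chunk in chunks if chunk.notebook_id in wanted]
```

In a real deployment the filter would presumably be expressed as a vector-store query condition on the stored metadata rather than a Python list comprehension.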

📚 **[Complete Integration Guide](./docs/integration.md)** - API, examples, best practices

---

## Components

### NotebookLM Agent

```
src/notebooklm_agent/
├── api/                    # FastAPI REST API
│   ├── main.py            # Application entry
│   ├── routes/            # API endpoints
│   │   ├── notebooks.py   # Notebook CRUD
│   │   ├── sources.py     # Source management
│   │   ├── chat.py        # Interactive chat
│   │   ├── generation.py  # Content generation
│   │   └── webhooks.py    # Webhook management
│   └── models/            # Pydantic models
├── services/              # Business logic
└── webhooks/              # Webhook system
```

**Main features:**

| Category | Operations |
|----------|------------|
| **Notebooks** | Create, list, get, update, delete |
| **Sources** | Add URLs, PDFs, YouTube, Drive, web search |
| **Chat** | Query sources, conversation history |
| **Generation** | Audio (podcast), video, slides, quizzes, flashcards, reports, mind maps |
| **Webhooks** | Register endpoints, receive event notifications |

**Main API endpoints:**
- `POST /api/v1/notebooks` - Create a notebook
- `POST /api/v1/notebooks/{id}/sources` - Add sources
- `POST /api/v1/notebooks/{id}/chat` - Chat with the sources
- `POST /api/v1/notebooks/{id}/generate/audio` - Generate a podcast
- `POST /api/v1/webhooks` - Register a webhook

---

### DocuMente (Agentic RAG)

```
src/agentic_rag/
├── api/                   # FastAPI REST API
│   ├── main.py           # Application entry
│   └── routes/           # API endpoints
│       ├── documents.py  # Document upload
│       ├── query.py      # RAG queries
│       ├── chat.py       # Conversational chat
│       ├── providers.py  # LLM provider management
│       └── notebooklm_sync.py  # NotebookLM sync
├── services/             # Business logic
│   ├── rag_service.py    # Core RAG logic
│   ├── vector_store.py   # Qdrant integration
│   ├── document_service.py
│   └── notebooklm_indexer.py  # Indexing service
└── core/                 # Configuration
    ├── config.py        # Multi-provider config
    └── llm_factory.py   # LLM factory pattern
```
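
To illustrate the factory pattern that `llm_factory.py` is described as using, here is a minimal sketch. It does not reproduce the project's code: `LLMClient`, `EchoClient`, `register`, and `create_llm` are hypothetical names, and the stand-in client only echoes the prompt instead of calling a provider SDK.

```python
from typing import Callable, Protocol


class LLMClient(Protocol):
    def complete(self, prompt: str) -> str: ...


class EchoClient:
    """Stand-in client; a real one would call the provider's SDK."""

    def __init__(self, provider: str, model: str) -> None:
        self.provider = provider
        self.model = model

    def complete(self, prompt: str) -> str:
        return f"[{self.provider}/{self.model}] {prompt}"


# Maps a provider name to a function that builds a client for a given model.
_REGISTRY: dict[str, Callable[[str], LLMClient]] = {}


def register(provider: str) -> Callable:
    """Decorator that records a builder under the given provider name."""
    def decorator(builder: Callable[[str], LLMClient]) -> Callable[[str], LLMClient]:
        _REGISTRY[provider] = builder
        return builder
    return decorator


@register("openai")
def _build_openai(model: str) -> LLMClient:
    return EchoClient("openai", model)


@register("ollama")
def _build_ollama(model: str) -> LLMClient:
    return EchoClient("ollama", model)


def create_llm(provider: str, model: str) -> LLMClient:
    """Look up the builder for a provider and return a ready client."""
    try:
        builder = _REGISTRY[provider]
    except KeyError:
        raise ValueError(f"Unknown provider: {provider!r}") from None
    return builder(model)
```

With a registry like this, adding a provider means registering one more builder; the calling code keeps using `create_llm(provider, model)` unchanged.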

**NotebookLM integration API endpoints:**
- `POST /api/v1/notebooklm/sync/{notebook_id}` - Sync a notebook from NotebookLM
- `GET /api/v1/notebooklm/indexed` - List synced notebooks
- `DELETE /api/v1/notebooklm/sync/{notebook_id}` - Remove a sync
- `GET /api/v1/notebooklm/sync/{notebook_id}/status` - Check sync status
- `POST /api/v1/query/notebooks` - Query notebooks only

**Queries with notebook filters:**
```http
# Search specific notebooks; "include_documents": true also searches local documents
POST /api/v1/query
Content-Type: application/json

{
  "question": "What are the key points?",
  "notebook_ids": ["uuid-1", "uuid-2"],
  "include_documents": true
}

# Search notebooks only
POST /api/v1/query/notebooks
Content-Type: application/json

{
  "question": "Find information about...",
  "notebook_ids": ["uuid-1"],
  "k": 10
}
```
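
The request bodies above can also be assembled programmatically. The helper below is a hypothetical sketch, not part of the project: it uses only the field names shown in this README (`question`, `notebook_ids`, `include_documents`, `k`, `provider`) and leaves out any field that was not set. The empty-question check is a client-side assumption, not documented server behavior.

```python
import json
from typing import Optional, Sequence


def build_query_payload(
    question: str,
    notebook_ids: Optional[Sequence[str]] = None,
    include_documents: Optional[bool] = None,
    k: Optional[int] = None,
    provider: Optional[str] = None,
) -> dict:
    """Build a JSON-serializable body, leaving out fields that were not set."""
    if not question.strip():
        raise ValueError("question must not be empty")  # assumed client-side check
    optional_fields = {
        "notebook_ids": list(notebook_ids) if notebook_ids is not None else None,
        "include_documents": include_documents,
        "k": k,
        "provider": provider,
    }
    payload = {"question": question}
    payload.update(
        {name: value for name, value in optional_fields.items() if value is not None}
    )
    return payload


# Body for POST /api/v1/query (notebooks plus local documents)
mixed = build_query_payload(
    "What are the key points?", ["uuid-1", "uuid-2"], include_documents=True
)
# Body for POST /api/v1/query/notebooks (notebooks only)
notebooks_only = build_query_payload("Find information about...", ["uuid-1"], k=10)
print(json.dumps(mixed, indent=2))
```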

---

**Supported LLM providers:**

| Provider | Type | Main Models |
|----------|------|-------------------|
| **OpenAI** | Cloud | GPT-4o, GPT-4, GPT-3.5 |
| **Anthropic** | Cloud | Claude 3.5, Claude 3 |
| **Google** | Cloud | Gemini 1.5 Pro/Flash |
| **Mistral** | Cloud | Mistral Large/Medium |
| **Azure** | Cloud | GPT-4, GPT-4o |
| **Z.AI** | Cloud | zai-large, zai-medium |
| **OpenCode Zen** | Cloud | zen-1, zen-lite |
| **OpenRouter** | Cloud | Multi-model access |
| **Ollama** | 🏠 Local | llama3.2, mistral, qwen |
| **LM Studio** | 🏠 Local | Any loaded model |

---

## Requirements

- **Python** 3.10+
- **[uv](https://github.com/astral-sh/uv)** - Dependency management
- **[Node.js](https://nodejs.org/)** 18+ (DocuMente frontend only)
- **Docker** (optional)
- **Qdrant** (for the DocuMente vector store)

---

## Installation

```bash
# Clone the repository
git clone <repository-url>
cd documente

# Create a virtual environment
uv venv --python 3.10
source .venv/bin/activate

# Install all dependencies
uv sync --extra dev --extra browser
```

**For DocuMente (frontend):**

```bash
cd frontend
npm install
```

---

## Configuration

Create a `.env` file in the project root:

```env
# ========================================
# NotebookLM Agent Configuration
# ========================================
NOTEBOOKLM_AGENT_API_KEY=your-api-key
NOTEBOOKLM_AGENT_WEBHOOK_SECRET=your-webhook-secret
NOTEBOOKLM_AGENT_PORT=8000
NOTEBOOKLM_AGENT_HOST=0.0.0.0

# NotebookLM Authentication (via notebooklm-py)
# Run: notebooklm login (first time only)
NOTEBOOKLM_HOME=~/.notebooklm
NOTEBOOKLM_PROFILE=default

# ========================================
# DocuMente (Agentic RAG) Configuration
# ========================================

# LLM Provider API Keys (configure at least one)
OPENAI_API_KEY=sk-...
ZAI_API_KEY=...
OPENCODE_ZEN_API_KEY=...
OPENROUTER_API_KEY=...
ANTHROPIC_API_KEY=...
GOOGLE_API_KEY=...
MISTRAL_API_KEY=...
AZURE_API_KEY=...
AZURE_ENDPOINT=https://your-resource.openai.azure.com

# Local LLM Providers (no API key needed)
OLLAMA_BASE_URL=http://localhost:11434
LMSTUDIO_BASE_URL=http://localhost:1234

# Vector Store (Qdrant)
QDRANT_HOST=localhost
QDRANT_PORT=6333
QDRANT_COLLECTION=documents

# Embedding Configuration
EMBEDDING_MODEL=text-embedding-3-small
EMBEDDING_API_KEY=sk-...

# Default LLM Provider
default_llm_provider=openai

# ========================================
# General Configuration
# ========================================
LOG_LEVEL=INFO
LOG_FORMAT=json
DEBUG=false
```
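
As a rough sketch of how these variables might be read at startup, the snippet below loads the Qdrant settings and lists which cloud providers have an API key set. It is not the project's `config.py`: the fallback values simply mirror the sample `.env` above, and the provider identifiers other than `openai` and `anthropic` (which appear in the query examples) are hypothetical.

```python
import os
from dataclasses import dataclass
from typing import Mapping


@dataclass(frozen=True)
class QdrantSettings:
    host: str
    port: int
    collection: str


def load_qdrant_settings(env: Mapping[str, str] = os.environ) -> QdrantSettings:
    """Read Qdrant settings, falling back to the sample values from the .env above."""
    return QdrantSettings(
        host=env.get("QDRANT_HOST", "localhost"),
        port=int(env.get("QDRANT_PORT", "6333")),
        collection=env.get("QDRANT_COLLECTION", "documents"),
    )


# Provider identifier (assumed) -> environment variable from the sample .env
_API_KEY_VARS = {
    "openai": "OPENAI_API_KEY",
    "zai": "ZAI_API_KEY",
    "opencode_zen": "OPENCODE_ZEN_API_KEY",
    "openrouter": "OPENROUTER_API_KEY",
    "anthropic": "ANTHROPIC_API_KEY",
    "google": "GOOGLE_API_KEY",
    "mistral": "MISTRAL_API_KEY",
    "azure": "AZURE_API_KEY",
}


def configured_cloud_providers(env: Mapping[str, str] = os.environ) -> list[str]:
    """Return the providers whose API key variable is set and non-empty."""
    return [name for name, var in _API_KEY_VARS.items() if env.get(var)]
```

Passing the environment as a mapping keeps the functions easy to test: a plain `dict` can replace `os.environ`.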

---

## Running

### NotebookLM Agent

```bash
# 1. Authenticate with NotebookLM (first time only)
notebooklm login

# 2. Start the API server
uv run fastapi dev src/notebooklm_agent/api/main.py

# Server available at http://localhost:8000
# API docs: http://localhost:8000/docs
```

**API usage example:**

```bash
# Create a notebook
curl -X POST http://localhost:8000/api/v1/notebooks \
  -H "Content-Type: application/json" \
  -d '{"title": "AI Research", "description": "AI study"}'

# Add a URL source
curl -X POST http://localhost:8000/api/v1/notebooks/{id}/sources \
  -H "Content-Type: application/json" \
  -d '{"type": "url", "url": "https://example.com"}'

# Generate a podcast
curl -X POST http://localhost:8000/api/v1/notebooks/{id}/generate/audio \
  -H "Content-Type: application/json" \
  -d '{"format": "deep-dive", "length": "long"}'
```
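
The same calls can be prepared from Python with only the standard library. This is a hedged sketch: `json_post` and `podcast_workflow` are hypothetical helpers, the notebook ID is supplied by the caller because the shape of the create-notebook response is not documented here, and no authentication header is added since this README does not specify one.

```python
import json
from urllib.request import Request

BASE_URL = "http://localhost:8000/api/v1"


def json_post(path: str, payload: dict, base_url: str = BASE_URL) -> Request:
    """Build (but do not send) a JSON POST request for the given API path."""
    return Request(
        url=f"{base_url}{path}",
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def podcast_workflow(notebook_id: str, source_url: str) -> list[Request]:
    """Requests that add a URL source to a notebook, then ask for a podcast."""
    return [
        json_post(
            f"/notebooks/{notebook_id}/sources",
            {"type": "url", "url": source_url},
        ),
        json_post(
            f"/notebooks/{notebook_id}/generate/audio",
            {"format": "deep-dive", "length": "long"},
        ),
    ]
```

Each request can then be sent with `urllib.request.urlopen(request)` while the server is running.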

---

### DocuMente (Agentic RAG)

#### With Docker (Recommended)

```bash
docker-compose up
```

#### Manual

```bash
# 1. Start Qdrant (in a separate terminal)
docker run -p 6333:6333 qdrant/qdrant

# 2. Start the backend
uv run fastapi dev src/agentic_rag/api/main.py

# 3. Start the frontend (in another terminal)
cd frontend
npm run dev
```

**Available services:**
- **Web UI**: http://localhost:3000
- **API docs**: http://localhost:8000/api/docs

**API usage example:**

```bash
# Upload a document
curl -X POST http://localhost:8000/api/v1/documents \
  -F "file=@document.pdf"

# RAG query
curl -X POST http://localhost:8000/api/v1/query \
  -H "Content-Type: application/json" \
  -d '{
    "question": "What is the main content?",
    "provider": "openai",
    "model": "gpt-4o-mini"
  }'
```

---

### NotebookLM + RAG Integration

**Sync a notebook:**
```bash
# Sync a notebook from NotebookLM into the vector store
curl -X POST http://localhost:8000/api/v1/notebooklm/sync/{notebook_id}

# List synced notebooks
curl http://localhost:8000/api/v1/notebooklm/indexed

# Remove a sync
curl -X DELETE http://localhost:8000/api/v1/notebooklm/sync/{notebook_id}
```

**Querying notebooks:**
```bash
# Query notebooks only (no local documents)
curl -X POST http://localhost:8000/api/v1/query/notebooks \
  -H "Content-Type: application/json" \
  -d '{
    "question": "What are the main conclusions?",
    "notebook_ids": ["notebook-uuid"],
    "k": 10,
    "provider": "openai"
  }'

# Mixed query (documents + notebooks)
curl -X POST http://localhost:8000/api/v1/query \
  -H "Content-Type: application/json" \
  -d '{
    "question": "Compare the information in the documents and the notebooks",
    "notebook_ids": ["uuid-1", "uuid-2"],
    "include_documents": true,
    "provider": "anthropic"
  }'
```

---

## Testing

```bash
# Run all tests
uv run pytest

# With coverage
uv run pytest --cov=src --cov-report=term-missing

# Unit tests only
uv run pytest tests/unit/ -m unit

# NotebookLM Agent tests
uv run pytest tests/unit/test_notebooklm_agent/ -v

# DocuMente tests
uv run pytest tests/unit/test_agentic_rag/ -v
```

---

## Project Structure

```
documente/
├── src/
│   ├── notebooklm_agent/       # NotebookLM Agent API
│   │   ├── api/
│   │   ├── services/
│   │   ├── core/
│   │   └── webhooks/
│   └── agentic_rag/            # DocuMente RAG System
│       ├── api/
│       ├── services/
│       └── core/
├── tests/
│   ├── unit/
│   │   ├── test_notebooklm_agent/
│   │   └── test_agentic_rag/
│   ├── integration/
│   └── e2e/
├── frontend/                   # React + TypeScript UI
│   ├── src/
│   └── package.json
├── docs/                       # Documentation
├── prompts/                    # Prompt engineering
├── pyproject.toml             # Project configuration
├── docker-compose.yml
└── README.md
```

---

## Documentation

### 📚 Main Guides

| Document | Description |
|----------|-------------|
| **[docs/integration.md](./docs/integration.md)** | Complete NotebookLM + RAG integration guide - API, examples, best practices |
| **[docs/README.md](./docs/README.md)** | Overview of the API and endpoint documentation |

### 🤖 NotebookLM Agent

| Document | Description |
|----------|-------------|
| **[SKILL.md](./SKILL.md)** | Skill definition for AI agents - complete API reference |
| **[prd.md](./prd.md)** | Product Requirements Document |
| **[AGENTS.md](./AGENTS.md)** | Development guidelines |

### 🧠 DocuMente (Agentic RAG)

| Document | Description |
|----------|-------------|
| **[prd-v2.md](./prd-v2.md)** | Product Requirements Document v2 |
| **[frontend-plan.md](./frontend-plan.md)** | Frontend development plan |
| **[TEST_COVERAGE_REPORT.md](./TEST_COVERAGE_REPORT.md)** | Test coverage report |

### 📝 General

| Document | Description |
|----------|-------------|
| **[CONTRIBUTING.md](./CONTRIBUTING.md)** | How to contribute to the project |
| **[CHANGELOG.md](./CHANGELOG.md)** | Change history and releases |
| **[LICENSE](./LICENSE)** | License terms |

---

## License

This software is proprietary and belongs to Luca Sacchi Ricciardi.

All rights reserved. For any dispute arising from the use or development of this software, the Court of Milan, Italy, has exclusive jurisdiction.

See [LICENSE](./LICENSE) for the full terms.

---

## Contributing

See [CONTRIBUTING.md](./CONTRIBUTING.md) for guidelines on how to contribute to the project.

---

**Author**: Luca Sacchi Ricciardi
**Contact**: luca@lucasacchi.net
**Copyright**: (C) 2026 All rights reserved