Luca Sacchi Ricciardi e239829938

CI / test (3.10) (push) Has been cancelled

Details

CI / test (3.11) (push) Has been cancelled

Details

CI / test (3.12) (push) Has been cancelled

Details

CI / lint (push) Has been cancelled

Details

docs: reorganize README with improved documentation links

Reorganize documentation section in README.md:

- Add direct link to integration guide in Panoramica section
- Remove duplicate documentation references
- Organize docs into logical categories:
  * Guide Principali (integration guide, API overview)
  * NotebookLM Agent (SKILL.md, PRD, AGENTS.md)
  * DocuMente (prd-v2, frontend-plan, test coverage)
  * Generale (contributing, changelog, license)

- Add table format for better readability
- Update provider list to include Ollama and LM Studio
- Add emojis for visual organization
- Include Local LLM provider config in .env example

Improves discoverability of documentation and provides
clear navigation for users.

2026-04-06 18:36:19 +02:00

DocuMente & NotebookLM Agent

Complete AI Platform - Multi-Provider RAG + NotebookLM Automation

This repository contains two complementary AI systems:

NotebookLM Agent - REST API for programmatic automation of Google NotebookLM
DocuMente (Agentic RAG) - Advanced RAG system with multi-provider LLM support

Overview

NotebookLM Agent

API and webhook interface for Google NotebookLM that provides:

Programmatic management of notebooks, sources, and chat
Automatic content generation (podcasts, videos, quizzes, flashcards, slides)
Integration with other AI agents via webhooks
End-to-end automation of NotebookLM workflows

Ideal for: automation engineers, content creators, AI agent developers

DocuMente (Agentic RAG)

Standalone Retrieval-Augmented Generation system with:

Support for 8+ LLM providers (cloud and local)
Document upload and indexing (PDF, DOCX, TXT, MD)
Conversational chat with your documents
Modern web interface (React + TypeScript)
NotebookLM integration - semantic search over notebooks

Ideal for: knowledge management, document analysis, research assistance

NotebookLM + RAG Integration

You can now sync your NotebookLM notebooks into DocuMente's RAG system, which lets you:

Run semantic searches over the contents of your notebooks
Combine local documents and notebooks in the same queries
Use any of the available LLM providers to query the notebooks
Filter by specific notebooks during searches

Architecture

NotebookLM → NotebookLMIndexerService → Qdrant Vector Store
                                               ↓
                                     RAGService (filtered queries)
                                               ↓
                                     Multi-Provider LLM Response

How it works

Synchronization: Notebook contents are extracted, split into chunks, and indexed in Qdrant
Metadata: Each chunk keeps information about its notebook and source of origin
Search: RAG queries can filter by specific notebook_id values
Response: The LLM receives context from the selected notebooks

📚 Complete Integration Guide - API, examples, best practices
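
The chunking, metadata, and filtering steps can be illustrated with a minimal sketch. This is not the code in NotebookLMIndexerService: the `Chunk` class, the function names, and the character-window sizes are all assumptions chosen for illustration, and the real service stores embeddings in Qdrant rather than filtering a Python list.

```python
from dataclasses import dataclass


@dataclass
class Chunk:
    """Hypothetical chunk record: text plus the metadata used for filtering."""
    text: str
    notebook_id: str
    source_id: str


def split_into_chunks(text, notebook_id, source_id, size=500, overlap=50):
    """Split text into overlapping character windows tagged with their origin."""
    if size <= 0 or not 0 <= overlap < size:
        raise ValueError("size must be positive and overlap must be in [0, size)")
    step = size - overlap
    chunks = []
    for start in range(0, len(text), step):
        chunks.append(Chunk(text[start:start + size], notebook_id, source_id))
        if start + size >= len(text):
            break  # the last window already reached the end of the text
    return chunks


def filter_by_notebooks(chunks, notebook_ids):
    """Keep only chunks whose notebook_id is in the requested set."""
    wanted = set(notebook_ids)
    return [c for c in chunks if c.notebook_id in wanted]
```

Because every chunk carries its own `notebook_id`, a query can be restricted to selected notebooks without re-indexing anything.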

Components

NotebookLM Agent

src/notebooklm_agent/
├── api/                    # FastAPI REST API
│   ├── main.py            # Application entry
│   ├── routes/            # API endpoints
│   │   ├── notebooks.py   # Notebook CRUD
│   │   ├── sources.py     # Source management
│   │   ├── chat.py        # Interactive chat
│   │   ├── generation.py  # Content generation
│   │   └── webhooks.py    # Webhook management
│   └── models/            # Pydantic models
├── services/              # Business logic
└── webhooks/              # Webhook system

Main features:

Category	Operations
Notebook	Create, list, get, update, delete
Sources	Add URLs, PDFs, YouTube, Drive, web search
Chat	Query sources, conversation history
Generation	Audio (podcast), Video, Slides, Quizzes, Flashcards, Reports, Mind maps
Webhook	Register endpoints, receive event notifications

Main API endpoints:

POST /api/v1/notebooks - Create a notebook
POST /api/v1/notebooks/{id}/sources - Add sources
POST /api/v1/notebooks/{id}/chat - Chat with the sources
POST /api/v1/notebooks/{id}/generate/audio - Generate a podcast
POST /api/v1/webhooks - Register a webhook
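
As a rough illustration of calling these endpoints from Python, the sketch below builds JSON POST requests with the standard library only. The base URL is the default port from this README, the payload fields mirror the curl examples in the same document, and `build_request` and `NOTEBOOK_ID` are hypothetical names. Any authentication the server may require is not covered here.

```python
import json
import urllib.request

BASE_URL = "http://localhost:8000"  # default host and port used in this README


def build_request(path, payload):
    """Build (but do not send) a JSON POST request for the agent API."""
    body = json.dumps(payload).encode("utf-8")
    return urllib.request.Request(
        url=f"{BASE_URL}{path}",
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


create = build_request("/api/v1/notebooks", {"title": "AI Research"})
add_source = build_request(
    "/api/v1/notebooks/NOTEBOOK_ID/sources",  # NOTEBOOK_ID is a placeholder
    {"type": "url", "url": "https://example.com"},
)
# With the server running, a request is sent with urllib.request.urlopen(create).
```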

DocuMente (Agentic RAG)

src/agentic_rag/
├── api/                   # FastAPI REST API
│   ├── main.py           # Application entry
│   └── routes/           # API endpoints
│       ├── documents.py  # Document upload
│       ├── query.py      # RAG queries
│       ├── chat.py       # Conversational chat
│       ├── providers.py  # LLM provider management
│       └── notebooklm_sync.py  # Sync NotebookLM
├── services/             # Business logic
│   ├── rag_service.py    # Core RAG logic
│   ├── vector_store.py   # Qdrant integration
│   ├── document_service.py
│   └── notebooklm_indexer.py  # Indexing service
└── core/                 # Configuration
    ├── config.py        # Multi-provider config
    └── llm_factory.py   # LLM factory pattern

NotebookLM integration API endpoints:

POST /api/v1/notebooklm/sync/{notebook_id} - Sync a notebook from NotebookLM
GET /api/v1/notebooklm/indexed - List synced notebooks
DELETE /api/v1/notebooklm/sync/{notebook_id} - Remove a sync
GET /api/v1/notebooklm/sync/{notebook_id}/status - Check sync status
POST /api/v1/query/notebooks - Query notebooks only

Queries with notebook filters:

# Search specific notebooks; include_documents also searches local documents
POST /api/v1/query
{
  "question": "What are the key points?",
  "notebook_ids": ["uuid-1", "uuid-2"],
  "include_documents": true
}

# Search notebooks only
POST /api/v1/query/notebooks
{
  "question": "Find information about...",
  "notebook_ids": ["uuid-1"],
  "k": 10
}
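
A small helper can assemble these request bodies and catch obvious mistakes before a request is sent. The field names (`question`, `notebook_ids`, `include_documents`, `k`) come from the examples above. The function name and the validation rules are assumptions, since the server-side schema is not shown here.

```python
def build_query(question, notebook_ids=None, include_documents=None, k=None):
    """Assemble a query body, omitting optional fields that were not set."""
    if not question.strip():
        raise ValueError("question must not be empty")
    body = {"question": question}
    if notebook_ids:
        body["notebook_ids"] = list(notebook_ids)
    if include_documents is not None:
        body["include_documents"] = include_documents
    if k is not None:
        if k <= 0:
            raise ValueError("k must be a positive integer")
        body["k"] = k
    return body


# Mixed query for POST /api/v1/query
mixed = build_query("What are the key points?", ["uuid-1", "uuid-2"], include_documents=True)
# Notebook-only query for POST /api/v1/query/notebooks
notebooks_only = build_query("Find information about...", ["uuid-1"], k=10)
```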

Supported LLM providers:

Provider	Type	Main Models
OpenAI	Cloud	GPT-4o, GPT-4, GPT-3.5
Anthropic	Cloud	Claude 3.5, Claude 3
Google	Cloud	Gemini 1.5 Pro/Flash
Mistral	Cloud	Mistral Large/Medium
Azure	Cloud	GPT-4, GPT-4o
Z.AI	Cloud	zai-large, zai-medium
OpenCode Zen	Cloud	zen-1, zen-lite
OpenRouter	Cloud	Multi-model access
Ollama	🏠 Local	llama3.2, mistral, qwen
LM Studio	🏠 Local	Any loaded model

Requirements

Python 3.10+
uv - Dependency management
Node.js 18+ (only for the DocuMente frontend)
Docker (optional)
Qdrant (for the DocuMente vector store)

Installation

# Clone the repository
git clone <repository-url>
cd documente

# Create a virtual environment
uv venv --python 3.10
source .venv/bin/activate

# Install all dependencies
uv sync --extra dev --extra browser

For DocuMente (frontend):

cd frontend
npm install

Configuration

Create a .env file in the project root:

# ========================================
# NotebookLM Agent Configuration
# ========================================
NOTEBOOKLM_AGENT_API_KEY=your-api-key
NOTEBOOKLM_AGENT_WEBHOOK_SECRET=your-webhook-secret
NOTEBOOKLM_AGENT_PORT=8000
NOTEBOOKLM_AGENT_HOST=0.0.0.0

# NotebookLM Authentication (via notebooklm-py)
# Run: notebooklm login (first time only)
NOTEBOOKLM_HOME=~/.notebooklm
NOTEBOOKLM_PROFILE=default

# ========================================
# DocuMente (Agentic RAG) Configuration
# ========================================

# LLM Provider API Keys (configure at least one)
OPENAI_API_KEY=sk-...
ZAI_API_KEY=...
OPENCODE_ZEN_API_KEY=...
OPENROUTER_API_KEY=...
ANTHROPIC_API_KEY=...
GOOGLE_API_KEY=...
MISTRAL_API_KEY=...
AZURE_API_KEY=...
AZURE_ENDPOINT=https://your-resource.openai.azure.com

# Local LLM Providers (no API key needed)
OLLAMA_BASE_URL=http://localhost:11434
LMSTUDIO_BASE_URL=http://localhost:1234

# Vector Store (Qdrant)
QDRANT_HOST=localhost
QDRANT_PORT=6333
QDRANT_COLLECTION=documents

# Embedding Configuration
EMBEDDING_MODEL=text-embedding-3-small
EMBEDDING_API_KEY=sk-...

# Default LLM Provider
default_llm_provider=openai

# ========================================
# General Configuration
# ========================================
LOG_LEVEL=INFO
LOG_FORMAT=json
DEBUG=false
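
Since at least one provider has to be configured, a quick pre-flight check on the .env file can save a failed startup. The sketch below is an assumption-laden illustration, not the project's config.py: the variable names are copied from the example above, but the provider identifiers used as dictionary keys, the function names, and the rule that values ending in "..." are placeholders are all hypothetical. It also ignores that Azure additionally needs AZURE_ENDPOINT.

```python
# Env variable names are taken from the .env example; the provider
# identifiers used as keys are hypothetical.
PROVIDER_ENV = {
    "openai": "OPENAI_API_KEY",
    "zai": "ZAI_API_KEY",
    "opencode_zen": "OPENCODE_ZEN_API_KEY",
    "openrouter": "OPENROUTER_API_KEY",
    "anthropic": "ANTHROPIC_API_KEY",
    "google": "GOOGLE_API_KEY",
    "mistral": "MISTRAL_API_KEY",
    "azure": "AZURE_API_KEY",
    "ollama": "OLLAMA_BASE_URL",
    "lmstudio": "LMSTUDIO_BASE_URL",
}


def parse_env(text):
    """Parse simple KEY=VALUE lines, skipping blanks and # comments."""
    env = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#") or "=" not in line:
            continue
        key, _, value = line.partition("=")
        env[key.strip()] = value.strip()
    return env


def configured_providers(env):
    """Return providers whose variable is set to something other than a placeholder."""
    found = []
    for provider, variable in PROVIDER_ENV.items():
        value = env.get(variable, "")
        if value and not value.endswith("..."):
            found.append(provider)
    return found
```

A startup script could call `configured_providers(parse_env(open(".env").read()))` and stop with a clear message when the list is empty.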

Getting Started

NotebookLM Agent

# 1. NotebookLM authentication (first time only)
notebooklm login

# 2. Start the API server
uv run fastapi dev src/notebooklm_agent/api/main.py

# Server available at http://localhost:8000
# API docs: http://localhost:8000/docs

Example API usage:

# Create a notebook
curl -X POST http://localhost:8000/api/v1/notebooks \
  -H "Content-Type: application/json" \
  -d '{"title": "AI Research", "description": "AI study"}'

# Add a URL source
curl -X POST http://localhost:8000/api/v1/notebooks/{id}/sources \
  -H "Content-Type: application/json" \
  -d '{"type": "url", "url": "https://example.com"}'

# Generate a podcast
curl -X POST http://localhost:8000/api/v1/notebooks/{id}/generate/audio \
  -H "Content-Type: application/json" \
  -d '{"format": "deep-dive", "length": "long"}'

DocuMente (Agentic RAG)

With Docker (Recommended)

docker-compose up

Manual

# 1. Start Qdrant (in a separate terminal)
docker run -p 6333:6333 qdrant/qdrant

# 2. Start the backend
uv run fastapi dev src/agentic_rag/api/main.py

# 3. Start the frontend (in another terminal)
cd frontend
npm run dev

Available services:

Web UI: http://localhost:3000
API docs: http://localhost:8000/api/docs

Example API usage:

# Upload a document
curl -X POST http://localhost:8000/api/v1/documents \
  -F "file=@documento.pdf"

# RAG query
curl -X POST http://localhost:8000/api/v1/query \
  -H "Content-Type: application/json" \
  -d '{
    "question": "What is the main content?",
    "provider": "openai",
    "model": "gpt-4o-mini"
  }'

NotebookLM + RAG Integration

Syncing a notebook:

# Sync a notebook from NotebookLM into the vector store
curl -X POST http://localhost:8000/api/v1/notebooklm/sync/{notebook_id}

# List synced notebooks
curl http://localhost:8000/api/v1/notebooklm/indexed

# Remove a sync
curl -X DELETE http://localhost:8000/api/v1/notebooklm/sync/{notebook_id}
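
If a sync takes a while, a client may want to poll the status endpoint (GET /api/v1/notebooklm/sync/{notebook_id}/status) before querying. The sketch below shows one way to structure that loop. The response shape is not documented in this README, so the `status` field and the `completed` and `failed` values are assumptions, and `fetch_status` is a hypothetical callable that performs the HTTP request and returns the decoded JSON.

```python
import time


def wait_for_sync(fetch_status, attempts=10, delay=1.0, sleep=time.sleep):
    """Poll fetch_status() until a terminal state is reported or attempts run out."""
    for attempt in range(attempts):
        state = fetch_status().get("status")  # assumed field name
        if state in ("completed", "failed"):  # assumed terminal values
            return state
        if attempt < attempts - 1:
            sleep(delay)  # wait before the next poll
    return "timeout"
```

Passing the fetch function and the sleep function as arguments keeps the loop independent of any HTTP library and easy to exercise without a running server.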

Querying notebooks:

# Query notebooks only (no local documents)
curl -X POST http://localhost:8000/api/v1/query/notebooks \
  -H "Content-Type: application/json" \
  -d '{
    "question": "What are the main conclusions?",
    "notebook_ids": ["uuid-del-notebook"],
    "k": 10,
    "provider": "openai"
  }'

# Mixed query (documents + notebooks)
curl -X POST http://localhost:8000/api/v1/query \
  -H "Content-Type: application/json" \
  -d '{
    "question": "Compare the information in the documents and the notebooks",
    "notebook_ids": ["uuid-1", "uuid-2"],
    "include_documents": true,
    "provider": "anthropic"
  }'

Testing

# Run all tests
uv run pytest

# With coverage
uv run pytest --cov=src --cov-report=term-missing

# Unit tests only
uv run pytest tests/unit/ -m unit

# NotebookLM Agent tests
uv run pytest tests/unit/test_notebooklm_agent/ -v

# DocuMente tests
uv run pytest tests/unit/test_agentic_rag/ -v

Project Structure

documente/
├── src/
│   ├── notebooklm_agent/       # API NotebookLM Agent
│   │   ├── api/
│   │   ├── services/
│   │   ├── core/
│   │   └── webhooks/
│   └── agentic_rag/            # DocuMente RAG System
│       ├── api/
│       ├── services/
│       └── core/
├── tests/
│   ├── unit/
│   │   ├── test_notebooklm_agent/
│   │   └── test_agentic_rag/
│   ├── integration/
│   └── e2e/
├── frontend/                   # React + TypeScript UI
│   ├── src/
│   └── package.json
├── docs/                       # Documentation
├── prompts/                    # Prompt engineering
├── pyproject.toml             # Project configuration
├── docker-compose.yml
└── README.md

Documentation

📚 Main Guides

Document	Description
docs/integration.md	Complete NotebookLM + RAG integration guide - API, examples, best practices
docs/README.md	Overview of the API and endpoint documentation

🤖 NotebookLM Agent

Document	Description
SKILL.md	Skill definition for AI agents - complete API reference
prd.md	Product Requirements Document
AGENTS.md	Development guidelines

🧠 DocuMente (Agentic RAG)

Document	Description
prd-v2.md	Product Requirements Document v2
frontend-plan.md	Frontend development plan
TEST_COVERAGE_REPORT.md	Test coverage report

📝 General

Document	Description
CONTRIBUTING.md	How to contribute to the project
CHANGELOG.md	Change and release history
LICENSE	License terms

License

This software is the proprietary property of Luca Sacchi Ricciardi.

All rights reserved. For any dispute arising from the use or development of this software, the Court of Milan, Italy, has exclusive jurisdiction.

See LICENSE for the full terms.

Contributing

See CONTRIBUTING.md for guidelines on how to contribute to the project.
