# Novela 2.0 - Blueprint
> Replaces the `story-grabber` repository. New repo: **Novela**.
> Stack: FastAPI · Jinja2 · plain JS · PostgreSQL 16 · Docker / Portainer
---
## 1. Goal
Novela 2.0 is a fully self-hosted media library and e-reader for epub, pdf, and cbr/cbz.
It replaces Kavita (library), Calibre (metadata), and Sigil (epub editor) with a single web application.
Core principle: **the database is the fast index, the file is the source of truth.**
Every write always touches both: first the file, then the database. Reads always go through the database.

---
## 2. What is kept from v1
| Module | File | Notes |
|---|---|---|
| EPUB build | `epub.py` | `make_epub`, `make_chapter_xhtml`, `add_cover_to_epub` |
| EPUB read/write | `epub.py` | `read_epub_file`, `write_epub_file` |
| XHTML conversion | `xhtml.py` | `element_to_xhtml`, `is_break_element`, `configure_break_patterns` |
| Scrapers | `scrapers/` | base, awesomedude, gayauthors; the plugin pattern stays |
| SSE job streaming | `main.py` | `JOBS` dict + `/events/{job_id}` `StreamingResponse` |
| Migrations pattern | `migrations.py` | idempotent `CREATE TABLE IF NOT EXISTS`, `run_migrations()` at startup |
| Cover cache | DB table | `library_cover_cache`, WebP thumbnails 300x450 |
| Reading progress | DB table | CFI for epub, page number for pdf/cbr |
| Reading sessions | DB table | reading history per book |
| Break patterns | DB table | regex + css_class patterns for scene breaks |
---
## 3. Project structure
```text
novela/
├── containers/
│   └── novela/
│       ├── main.py
│       ├── migrations.py
│       ├── db.py
│       ├── epub.py
│       ├── xhtml.py
│       ├── pdf.py
│       ├── cbr.py
│       ├── routers/
│       │   ├── __init__.py
│       │   ├── library.py
│       │   ├── reader.py
│       │   ├── editor.py
│       │   ├── grabber.py
│       │   ├── backup.py
│       │   └── settings.py
│       ├── scrapers/
│       ├── static/
│       ├── templates/
│       ├── requirements.txt
│       └── Dockerfile
├── stack/
│   ├── stack.yml
│   └── novela.env
└── docs/
    ├── BLUEPRINT.md
    └── TECHNICAL.md
```
---
## 4. Library on disk
`output/` becomes `library/`.
```text
library/
├── epub/
│   └── {Publisher}/
│       └── {Author}/
│           ├── Stories/
│           │   └── {Title}.epub
│           └── Series/
│               └── {SeriesName}/
│                   └── {001 - Title}.epub
├── pdf/
│   └── {Author}/
│       └── {Title}.pdf
├── comics/
│   └── {Author or SeriesName}/
│       └── {001 - Title}.cbr
└── covers/
```
Naming rules:
- Strip invalid characters: `< > : " / \ | ? *` and control characters
- At most 80 characters per directory segment, 140 for the file name
- On conflict: `Title (2).epub`, `Title (3).epub`, and so on
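
The naming rules above could be sketched roughly as follows. The function names `sanitize_segment` and `unique_path` are hypothetical, and two details are assumptions not stated in this blueprint: whitespace left behind by stripped characters is collapsed, and the length limit is applied to the name without its extension.

```python
import re
from pathlib import Path

# Characters the naming rules forbid: < > : " / \ | ? * plus ASCII control chars.
_INVALID = re.compile(r'[<>:"/\\|?*\x00-\x1f\x7f]')


def sanitize_segment(name: str, max_len: int = 80) -> str:
    """Strip invalid characters and trim to max_len characters."""
    cleaned = _INVALID.sub("", name)
    # Assumption: collapse whitespace runs left behind by removed characters.
    cleaned = re.sub(r"\s+", " ", cleaned).strip()
    return cleaned[:max_len].rstrip()


def unique_path(directory: Path, stem: str, suffix: str) -> Path:
    """Return a free path, appending ' (2)', ' (3)', ... on conflict."""
    candidate = directory / f"{stem}{suffix}"
    counter = 2
    while candidate.exists():
        candidate = directory / f"{stem} ({counter}){suffix}"
        counter += 1
    return candidate
```

A directory segment would use the default limit, while a file name would pass `max_len=140`.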

Renaming after a metadata edit:
- Move the file on disk
- Update the DB references: `library`, `book_tags`, `reading_progress`, `reading_sessions`, `library_cover_cache`
- Clean up empty directories
---
## 5. Database schema
### 5.1 `library`
```sql
CREATE TABLE library (
id SERIAL PRIMARY KEY,
filename VARCHAR(600) UNIQUE NOT NULL,
media_type VARCHAR(10) NOT NULL DEFAULT 'epub',
title VARCHAR(500),
author VARCHAR(255),
publisher VARCHAR(255),
series VARCHAR(500),
series_index INTEGER DEFAULT 0,
publication_status VARCHAR(100),
has_cover BOOLEAN DEFAULT FALSE,
description TEXT DEFAULT '',
source_url VARCHAR(1000),
publish_date DATE,
archived BOOLEAN DEFAULT FALSE,
want_to_read BOOLEAN DEFAULT FALSE,
needs_review BOOLEAN DEFAULT FALSE,
created_at TIMESTAMP DEFAULT NOW(),
updated_at TIMESTAMP DEFAULT NOW()
);
```
### 5.2 `book_tags`
```sql
CREATE TABLE book_tags (
id SERIAL PRIMARY KEY,
filename VARCHAR(600) NOT NULL REFERENCES library(filename) ON DELETE CASCADE,
tag VARCHAR(255) NOT NULL,
tag_type VARCHAR(20) NOT NULL,
UNIQUE (filename, tag, tag_type)
);
CREATE INDEX idx_book_tags_filename ON book_tags (filename);
```
`tag_type`:
- `genre`
- `subgenre`
- `tag`
- `subject`
### 5.3 `reading_progress`
```sql
CREATE TABLE reading_progress (
id SERIAL PRIMARY KEY,
filename VARCHAR(600) UNIQUE NOT NULL REFERENCES library(filename) ON DELETE CASCADE,
cfi TEXT,
page INTEGER,
progress INTEGER DEFAULT 0,
updated_at TIMESTAMP DEFAULT NOW()
);
```
### 5.4 `reading_sessions`
```sql
CREATE TABLE reading_sessions (
id SERIAL PRIMARY KEY,
filename VARCHAR(600) NOT NULL REFERENCES library(filename) ON DELETE CASCADE,
read_at TIMESTAMP DEFAULT NOW()
);
CREATE INDEX idx_reading_sessions_filename ON reading_sessions (filename);
```
### 5.5 `library_cover_cache`
```sql
CREATE TABLE library_cover_cache (
filename VARCHAR(600) PRIMARY KEY REFERENCES library(filename) ON DELETE CASCADE,
mime_type VARCHAR(100) NOT NULL,
thumb_webp BYTEA NOT NULL,
updated_at TIMESTAMP DEFAULT NOW()
);
```
### 5.6 `credentials`
```sql
CREATE TABLE credentials (
id SERIAL PRIMARY KEY,
site VARCHAR(255) UNIQUE NOT NULL,
username VARCHAR(255) NOT NULL,
password VARCHAR(255) NOT NULL,
updated_at TIMESTAMP DEFAULT NOW()
);
```
### 5.7 `break_patterns`
```sql
CREATE TABLE break_patterns (
id SERIAL PRIMARY KEY,
pattern_type VARCHAR(20) NOT NULL,
pattern TEXT NOT NULL,
enabled BOOLEAN DEFAULT TRUE,
is_default BOOLEAN DEFAULT FALSE,
created_at TIMESTAMP DEFAULT NOW(),
UNIQUE (pattern_type, pattern)
);
```
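
As an illustration of how rows from `break_patterns` might be applied, here is a minimal sketch. It is not the v1 `is_break_element` implementation. The names `BreakPattern` and `is_scene_break` are hypothetical, and it assumes `pattern_type` takes the values `regex` and `css_class`, and that a regex must match the element's entire trimmed text.

```python
import re
from dataclasses import dataclass


@dataclass(frozen=True)
class BreakPattern:
    pattern_type: str  # assumed values: "regex" or "css_class"
    pattern: str
    enabled: bool = True


def is_scene_break(text: str, css_classes: list[str],
                   patterns: list[BreakPattern]) -> bool:
    """Return True if the text or class list matches an enabled pattern."""
    for p in patterns:
        if not p.enabled:
            continue  # disabled rows are skipped entirely
        if p.pattern_type == "regex" and re.fullmatch(p.pattern, text.strip()):
            return True
        if p.pattern_type == "css_class" and p.pattern in css_classes:
            return True
    return False
```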
### 5.8 `backup_log`
```sql
CREATE TABLE backup_log (
id SERIAL PRIMARY KEY,
status VARCHAR(20) NOT NULL,
files_count INTEGER,
size_bytes BIGINT,
error_msg TEXT,
started_at TIMESTAMP DEFAULT NOW(),
finished_at TIMESTAMP
);
```
---
## 6. Write principle: file and database in sync
Order of operations per edit:
1. Modify the file on disk
2. Update the database
3. Return success

Never update only the database without the file.
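
The ordering above can be sketched as a small helper. This is only an illustration under assumptions: `write_file_then_db` and its `update_db` callback are hypothetical names, and the atomic write through a temporary file is a design choice of the sketch, not something this blueprint prescribes. If the file write fails, the database is never touched. If the database step fails, the file already holds the new content, which is consistent with treating the file as the source of truth.

```python
import os
import tempfile
from pathlib import Path
from typing import Callable


def write_file_then_db(path: Path, data: bytes,
                       update_db: Callable[[], None]) -> None:
    """Write the file first, then run the database update."""
    # Write to a temp file in the same directory so os.replace is atomic.
    fd, tmp_name = tempfile.mkstemp(dir=path.parent, suffix=".tmp")
    try:
        with os.fdopen(fd, "wb") as tmp:
            tmp.write(data)
        os.replace(tmp_name, path)  # step 1: file on disk
    except BaseException:
        # The file write failed: clean up and leave the database untouched.
        if os.path.exists(tmp_name):
            os.unlink(tmp_name)
        raise
    update_db()  # step 2: database, only after the file is in place
```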
---
## 7. Cover strategy
Storage:
- EPUB cover inside the file (`OEBPS/Images/cover.{ext}`)
- Thumbnail as a `300x450` WebP in `library_cover_cache`

Missing cover:
- If there is no cover: add the `Cover Missing` tag
- A UI upload writes the cover to both the EPUB and the cache

Retrieval:
- Primary: `/library/cover-cached/{filename}`
- Fallback: `/library/cover/{filename}`

PDF and CBR:
- PDF: first page as thumbnail
- CBR/CBZ: first image as thumbnail
---
## 8. Delete flow
`DELETE /library/file/{filename}`:
1. Delete the file
2. Prune empty directories
3. Delete from `library` (the cascade removes rows in related tables)
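
A minimal sketch of these three steps, assuming `filename` is stored as a path relative to the library root. The names `prune_empty_dirs`, `delete_book` and the `delete_row` callback are hypothetical, and the sketch does not validate `filename` against path traversal, which a real endpoint would need to do.

```python
from pathlib import Path
from typing import Callable


def prune_empty_dirs(start: Path, root: Path) -> None:
    """Remove empty directories from start upward, never removing root."""
    current = start
    while current != root and root in current.parents:
        if any(current.iterdir()):
            break  # directory still has content: stop climbing
        current.rmdir()
        current = current.parent


def delete_book(root: Path, filename: str,
                delete_row: Callable[[str], None]) -> None:
    """Delete the file, prune empty folders, then delete the library row."""
    path = root / filename
    path.unlink()                        # 1. delete the file
    prune_empty_dirs(path.parent, root)  # 2. prune empty directories
    delete_row(filename)                 # 3. DB delete; cascade handles child rows
```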
---
## 9. Router overview
### 9.1 `routers/library.py`
- `GET /library`
- `GET /api/library`
- `POST /library/rescan`
- `POST /library/import`
- `DELETE /library/file/{filename}`
- `GET /library/cover/{filename}`
- `GET /library/cover-cached/{filename}`
- `POST /library/cover/{filename}`
- `POST /library/want-to-read/{filename}`
- `POST /library/archive/{filename}`
- `GET /home`
- `GET /api/home`
- `GET /stats`
- `GET /api/stats`
### 9.2 `routers/reader.py`
- `GET /library/read/{filename}`
- `GET /library/book/{filename}`
- `PATCH /library/book/{filename}`
- `GET /library/epub/{filename}`
- `GET /library/chapters/{filename}`
- `GET /library/chapter/{index}/{filename}`
- `GET /library/chapter-img/{path}`
- `GET /library/pdf/{filename}`
- `GET /library/cbr/{filename}/{page}`
- `GET /library/progress/{filename}`
- `POST /library/progress/{filename}`
- `DELETE /library/progress/{filename}`
- `POST /library/mark-read/{filename}`
- `GET /api/genres`
### 9.3 `routers/editor.py`
- `GET /library/editor/{filename}`
- `GET /api/edit/chapter/{index}/{filename}`
- `POST /api/edit/chapter/{index}/{filename}`
- `POST /api/edit/chapter/add/{filename}`
- `DELETE /api/edit/chapter/{index}/{filename}`
### 9.4 `routers/grabber.py`
- `GET /grabber`
- `POST /preload`
- `POST /convert`
- `GET /events/{job_id}`
- `GET /debug`
- `POST /debug/run`
- `GET /credentials`
- `POST /credentials`
- `DELETE /credentials/{site}`
### 9.5 `routers/backup.py`
- `GET /backup`
- `GET /api/backup/status`
- `POST /api/backup/run`
- `GET /api/backup/history`
### 9.6 `routers/settings.py`
- `GET /settings`
- `GET /api/break-patterns`
- `POST /api/break-patterns`
- `PATCH /api/break-patterns/{id}`
- `DELETE /api/break-patterns/{id}`
- `DELETE /api/reading-history`
---
## 10. New modules
### 10.1 `db.py`
Shared psycopg2 connection pool (`init_pool`, `get_conn`, `release_conn`).
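
One possible shape for this module, offered as a sketch rather than the actual implementation. The function names come from this blueprint; the parameters (`dsn`, `minconn`, `maxconn`), the choice of `ThreadedConnectionPool`, and the deferred import are assumptions.

```python
from typing import Any, Optional

# Holds a psycopg2 ThreadedConnectionPool once init_pool() has run.
_pool: Optional[Any] = None


def init_pool(dsn: str, minconn: int = 1, maxconn: int = 10) -> None:
    """Create the shared pool; call once at application startup."""
    global _pool
    # Imported here so the module can be loaded without the driver installed.
    from psycopg2.pool import ThreadedConnectionPool
    _pool = ThreadedConnectionPool(minconn, maxconn, dsn)


def get_conn() -> Any:
    """Borrow a connection from the pool."""
    if _pool is None:
        raise RuntimeError("init_pool() has not been called")
    return _pool.getconn()


def release_conn(conn: Any) -> None:
    """Return a borrowed connection to the pool."""
    if _pool is None:
        raise RuntimeError("init_pool() has not been called")
    _pool.putconn(conn)
```

Callers would typically pair `get_conn()` with `release_conn()` in a `try`/`finally` block so that a connection is returned even when a query fails.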
### 10.2 `pdf.py`
PyMuPDF rendering (`pdf_render_page`), page count, and cover thumbnail.
### 10.3 `cbr.py`
RAR/ZIP page listing, page extraction, and cover thumbnail.
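
For the ZIP half (CBZ), page listing and extraction can be sketched with the standard library alone. RAR archives need a third-party library and are left out here. The function names, the extension list, the natural sort order, and the zero-based page index are all assumptions of this sketch. Under those assumptions, the first entry of the page list would also be the image used for the cover thumbnail.

```python
import re
import zipfile

# Assumed set of image types that count as pages.
IMAGE_EXTENSIONS = (".jpg", ".jpeg", ".png", ".gif", ".webp")


def natural_key(name: str) -> list:
    """Sort key that orders 'page2' before 'page10'."""
    return [int(part) if part.isdigit() else part.lower()
            for part in re.split(r"(\d+)", name)]


def cbz_page_list(archive: zipfile.ZipFile) -> list[str]:
    """Return the image entries of a CBZ archive in reading order."""
    names = [n for n in archive.namelist()
             if n.lower().endswith(IMAGE_EXTENSIONS)]
    return sorted(names, key=natural_key)


def cbz_extract_page(archive: zipfile.ZipFile, page: int) -> bytes:
    """Return the raw bytes of the zero-based page index."""
    return archive.read(cbz_page_list(archive)[page])
```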
---
## 11. Cover flow per media type
| Action | EPUB | PDF | CBR/CBZ |
|---|---|---|---|
| Cover import | From OPF/Images | Render first page | First image from archive |
| Thumbnail | Pillow -> WebP | PyMuPDF + Pillow -> WebP | Pillow -> WebP |
| Storage | EPUB + cache | cache | cache |
| Replace cover | Yes | No | No |
| No cover | `Cover Missing` tag | `Cover Missing` tag | `Cover Missing` tag |
---
## 12. Database setup
- Start with a clean v2 database
- No migration path from v1 data
- `run_migrations()` at startup
- `CREATE TABLE IF NOT EXISTS` everywhere, so every migration is idempotent
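
The idempotent startup pattern might look like the sketch below. The `MIGRATIONS` list is a hypothetical name, and its statements are trimmed versions of the section 5 definitions (no `SERIAL`, no foreign key) so the same code can be tried against any DB-API connection, for example an in-memory `sqlite3` database. The real statements would be the PostgreSQL ones. Calling `run_migrations()` twice should leave the schema unchanged.

```python
# Trimmed, portable statements for illustration only; the real list would
# hold the PostgreSQL definitions with IF NOT EXISTS added.
MIGRATIONS = [
    """
    CREATE TABLE IF NOT EXISTS book_tags (
        id INTEGER PRIMARY KEY,
        filename VARCHAR(600) NOT NULL,
        tag VARCHAR(255) NOT NULL,
        tag_type VARCHAR(20) NOT NULL,
        UNIQUE (filename, tag, tag_type)
    )
    """,
    """
    CREATE INDEX IF NOT EXISTS idx_book_tags_filename
        ON book_tags (filename)
    """,
]


def run_migrations(conn) -> None:
    """Apply every statement; safe to call on each startup."""
    cur = conn.cursor()
    try:
        for statement in MIGRATIONS:
            cur.execute(statement)
        conn.commit()
    finally:
        cur.close()
```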
---
## 13. Docker stack
See [`stack/stack.yml`](../stack/stack.yml).

Key points:
- App container exposes `8099 -> 8000`
- PostgreSQL 16
- Adminer on `8098`
- `NOVELA_MASTER_KEY` is set in `stack/novela.env` and passed through in `stack/stack.yml` for encrypted credentials
---
## 14. Requirements
See [`containers/novela/requirements.txt`](../containers/novela/requirements.txt).

---
## 15. Preparing the files
Source: `/docker/develop/story-grabber/containers/story-grabber`.

Target: `/docker/develop/novela/containers/novela`.

Carry over:
- `epub.py`
- `xhtml.py`
- `scrapers/*`
- `static/*`
- `templates/*`

Write from scratch:
- `main.py`, `db.py`, `pdf.py`, `cbr.py`, `migrations.py`
- `routers/*`
---
## 16. Build order
1. `db.py`
2. `migrations.py`
3. `main.py`
4. `routers/library.py`
5. `routers/reader.py`
6. `routers/editor.py`
7. `routers/grabber.py`
8. `routers/settings.py`
9. `pdf.py` + reader extension
10. `cbr.py` + reader extension
11. `routers/backup.py`
12. Extend `routers/library.py` for pdf/cbr import