# Novela 2.0 - Technical Status (Develop)

## Scope

This document describes the current technical status of the `develop` codebase. It is the primary technical reference for the current implementation.

## Architecture

- Stack: FastAPI, Jinja2 templates, plain JavaScript, PostgreSQL 16, Docker.
- Startup lifecycle (`main.py`):
  1. `init_pool()`
  2. `run_migrations()`
  3. `start_backup_scheduler()`
  4. mount routers
- Shutdown lifecycle:
  1. `stop_backup_scheduler()`
  2. `close_pool()`
- Source-of-truth rule: files on disk are authoritative, the database is an index/cache.

## File Storage Paths

All files are stored under `library/` (relative to the app working directory, mapped via Docker volume).
`LIBRARY_DIR = Path("library")`, `LIBRARY_ROOT = LIBRARY_DIR.resolve()`.

### Path structure per format

| Format | Path pattern |
|--------|-------------|
| EPUB (no series) | `library/epub/{publisher}/{author}/Stories/{title}.epub` |
| EPUB (series) | `library/epub/{publisher}/{author}/Series/{series}/{idx:03d} - {title}.epub` |
| PDF | `library/pdf/{publisher}/{author}/{title}.pdf` |
| CBR | `library/comics/{publisher}/{author}/{title}.cbr` |
| CBZ | `library/comics/{publisher}/{author}/{title}.cbz` |

- Segments are sanitised: special chars stripped, max lengths applied (publisher/author 80, title 140, series 80).
- Series index is zero-padded to 3 digits (`001`, `002`, …), clamped to 1–999.
- Duplicate filenames get a `(2)`, `(3)`, … suffix.
- After any file move, empty parent directories are pruned up to `LIBRARY_ROOT`.

### Path logic

- `common.make_rel_path(media_type, publisher, author, title, series, series_index, ext)` - used by import and grabber.
- `reader.py _make_rel_path(publisher, author, title, series, series_index, ext)` - used by metadata PATCH; same logic, uses actual file extension.
- Both functions produce identical paths for all formats.

### Metadata save behaviour per format

| Format | File written? | DB written? |
|--------|--------------|-------------|
| EPUB | Yes - OPF metadata updated in-place | Yes |
| PDF | No | Yes |
| CBR | No | Yes |
| CBZ | No (tags/metadata); rating written to ComicInfo.xml | Yes |

---

## Router Status

### `routers/library.py`

- `GET /library` - library page
- `GET /api/library` - book list JSON (fast-path by default)
- `POST /library/rescan` - forced full disk rescan
- `POST /library/import` - upload EPUB/PDF/CBR/CBZ
- `DELETE /library/file/{filename}` - delete file + DB row + prune dirs
- `GET /download/{filename}` - download file with `Content-Disposition: attachment`
- `GET /library/cover/{filename}` - serve cover (EPUB from file; PDF/CBR from cache)
- `GET /library/cover-cached/{filename}` - serve cover from DB cache only
- `POST /library/cover/{filename}` - upload/replace cover (EPUB only)
- `POST /library/want-to-read/{filename}` - toggle want-to-read flag
- `POST /library/archive/{filename}` - toggle archived flag
- `POST /library/new/mark-reviewed` - bulk set `needs_review=false`
- `POST /library/rating/{filename}` - set/clear star rating `{"rating": 0-5}`
- `GET /home` - home page
- `GET /api/home` - home data JSON
- `GET /stats` - statistics page
- `GET /api/stats` - statistics data JSON
- `GET /library/list` - compat alias

`GET /api/library` runs in fast-path mode by default (DB-only, no full disk rescan). For a forced sync: `GET /api/library?rescan=true` or `POST /library/rescan`. `include_file_info=true` is optional for file size/mtime enrichment.
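
As a small illustration of the query parameters described above, the following sketch builds the request URL for the fast path, a forced rescan, or file-info enrichment. The helper name `library_api_url` and the base URL are hypothetical, and combining `rescan` with `include_file_info` in one request is an assumption rather than documented behaviour.

```python
from urllib.parse import urlencode


def library_api_url(base_url: str, *, rescan: bool = False,
                    include_file_info: bool = False) -> str:
    """Build a GET /api/library URL (hypothetical helper, not part of the codebase).

    With no flags set, the URL targets the default DB-only fast path.
    """
    params = {}
    if rescan:
        params["rescan"] = "true"             # request a forced full disk rescan
    if include_file_info:
        params["include_file_info"] = "true"  # request file size/mtime enrichment
    url = f"{base_url.rstrip('/')}/api/library"
    query = urlencode(params)
    return f"{url}?{query}" if query else url
```

For example, `library_api_url("http://localhost:8000", rescan=True)` yields `http://localhost:8000/api/library?rescan=true`; the host and port here are placeholders.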

`/api/home` returns:

- `continue_reading`
- `shorts_unread`
- `novels_unread`
- `shorts_read`
- `novels_read`

`/api/stats` returns totals plus chart/history data for `stats.html`:

- `reads_by_month`, `reads_by_dow`, `reads_by_hour`
- `genre_counts`, `publisher_counts`, `fav_genre`, `fav_publisher`
- `top_books`, `history`

Home sections exclude series books via:

- `COALESCE(series, '') = ''`
- `filename NOT LIKE '%/Series/%'`

Home read sections are ordered oldest-first:

- `shorts_read`: `ORDER BY MAX(read_at) ASC`
- `novels_read`: `ORDER BY MAX(read_at) ASC`

### `routers/reader.py`

- `GET /library/epub/{filename}` - serve EPUB inline (no attachment header)
- `GET /library/chapters/{filename}` - EPUB spine as JSON
- `GET /library/chapter/{index}/{filename}` - single EPUB chapter as HTML fragment
- `GET /library/chapter-img/{path}?filename=…` - image extracted from EPUB ZIP; `path` is the full internal ZIP path (e.g. `OEBPS/Images/cover.jpg` or `EPUB/images/cover.jpg`); case-insensitive fallback for mismatched folder names
- `GET /library/pdf/{filename}?page=N&dpi=150` - render PDF page as PNG
- `GET /api/pdf/info/{filename}` - `{"page_count": N}`
- `GET /library/cbr/{filename}/{page}` - CBR/CBZ page as image
- `GET /library/progress/{filename}` - read progress
- `POST /library/progress/{filename}` - save progress `{"cfi": "…", "progress": N}`
- `DELETE /library/progress/{filename}` - clear progress
- `POST /library/mark-read/{filename}` - mark as read (with optional date)
- `GET /library/book/{filename}` - book detail page
- `GET /api/genres` - all tags from `book_tags` (optional `?type=genre|subgenre|tag`)
- `PATCH /library/book/{filename}` - update metadata + tags; moves file if path fields change; DB-only for non-EPUB
- `POST /library/rating/{filename}` - set/clear 1–5 star rating; writes to EPUB OPF / CBZ ComicInfo.xml; DB-only for CBR/PDF
- `GET /library/read/{filename}` - reader page (EPUB or PDF); supports `?bm_ch=N&bm_scroll=F` to jump to bookmark position
- `GET /library/bookmarks/{filename}` - list bookmarks for a book
- `POST /library/bookmarks/{filename}` - add bookmark `{chapter_index, scroll_frac, chapter_title, note}`
- `PATCH /library/bookmarks/{id}` - update bookmark note
- `DELETE /library/bookmarks/{id}` - delete bookmark
- `GET /api/bookmarks` - all bookmarks across all books (includes `book_title`, `book_author`)

### `routers/editor.py`

- `GET /library/editor/{filename}` - EPUB chapter editor page
- `GET /api/edit/chapter/{index}/{filename}` - get chapter HTML
- `POST /api/edit/chapter/{index}/{filename}` - save chapter HTML
- `POST /api/edit/chapter/add/{filename}` - add new chapter
- `DELETE /api/edit/chapter/{index}/{filename}` - delete chapter

### `routers/grabber.py`

- `GET /grabber` - grabber page
- `GET /convert` - convert page
- `GET /credentials-manager` - credentials manager UI
- `GET /debug` - debug page
- `POST /debug/run` - run debug scrape
- `GET /credentials` - list stored credentials
- `POST /credentials` - save credential
- `DELETE /credentials/{site}` - delete credential
- `POST /preload` - preload book info from URL
- `POST /convert` - run scrape + convert to EPUB
- `GET /events/{job_id}` - SSE stream for job progress

### `routers/settings.py`

- `GET /settings` - settings page
- `GET /api/break-patterns` - list chapter-break patterns
- `POST /api/break-patterns` - add break pattern (type: `regex` or `css_class`)
- `PATCH /api/break-patterns/{id}` - update pattern (enable/disable or change value)
- `DELETE /api/break-patterns/{id}` - delete pattern
- `DELETE /api/reading-history` - wipe all reading sessions

### `routers/builder.py`

- `GET /builder` - Book Builder index (draft list + new draft form)
- `POST /builder` - create new draft; redirects to `/builder/{id}`
- `GET /builder/{draft_id}` - draft editor page
- `DELETE /api/builder/{draft_id}` - delete draft
- `GET /api/builder/{draft_id}` - draft JSON (id, title, author, publisher, source_url, chapters)
- `POST /api/builder/{draft_id}/chapter` - add chapter `{title, after_index}`; returns `{index, count}`
- `PUT /api/builder/{draft_id}/chapter/{idx}` - save chapter `{title?, content?}`
- `DELETE /api/builder/{draft_id}/chapter/{idx}` - delete chapter; returns `{index, count}`
- `POST /api/builder/{draft_id}/normalize/{idx}` - normalize chapter HTML (preview only, does not save); returns `{content}`
- `POST /api/builder/{draft_id}/publish` - normalize all chapters → `build_epub()` → write to `library/epub/` → `upsert_book()` → delete draft; returns `{filename}`; redirects browser to `/library/book/{filename}`

Publish flow: all chapters are run through `normalize_wysiwyg_html()`, then `build_epub()` produces an EPUB 2.0 ZIP. The file path is computed via `make_rel_path(media_type="epub", …)`. The book is inserted into the library with `needs_review=True`. The draft is deleted on success.

### `routers/backup.py`

- `GET /backup` - backup page
- `GET /api/backup/credentials` - Dropbox settings (includes `app_key_configured` flag)
- `POST /api/backup/credentials` - save Dropbox settings
- `DELETE /api/backup/credentials` - remove all Dropbox credentials
- `POST /api/backup/oauth/prepare` - save app key + secret, return Dropbox auth URL
- `POST /api/backup/oauth/exchange` - exchange authorization code for refresh token
- `GET /api/backup/health` - Dropbox connectivity check (includes `schedule_enabled`, `schedule_interval_hours`)
- `GET /api/backup/status` - current backup status
- `GET /api/backup/history` - backup run history (last 20)
- `GET /api/backup/progress` - live progress of running backup `{running, done, total, phase}`
- `POST /api/backup/run` - trigger backup (background task)

---

## Backup & Security

- Dropbox token (refresh token or legacy access token) stored encrypted in `credentials` (`site='dropbox'`).
- Dropbox app key stored encrypted in `credentials` (`site='dropbox_app_key'`).
- Dropbox app secret stored encrypted in `credentials` (`site='dropbox_app_secret'`).
- Dropbox backup root stored encrypted in `credentials` (`site='dropbox_backup_root'`).
- Retention (`snapshots to keep`) stored encrypted in `credentials` (`site='dropbox_backup_retention'`).
- Backup schedule (`enabled` + `interval_hours`) stored encrypted in `credentials` (`site='dropbox_backup_schedule'`).
- Encryption uses `NOVELA_MASTER_KEY` (Fernet).

### Dropbox authentication

- Preferred: OAuth2 refresh token (does not expire). Set up via the two-step flow on `/backup`:
  1. Enter App Key + App Secret → click **Generate Auth URL**
  2. Approve in browser → paste the code → click **Save & Activate**
- `_dbx()` uses `oauth2_refresh_token` + `app_key` + `app_secret` for automatic token renewal.
- Fallback: legacy short-lived access token (backwards compatible; works without app key/secret).

### Implementation details

- Versioned backups with deduplication:
  - file objects in Dropbox: `library_objects/{sha256_prefix}/{sha256}`
  - snapshots in Dropbox: `library_snapshots/snapshot-YYYYMMDD-HHMMSS.json`
- Each run creates a new snapshot version and uploads only missing objects.
- Retention removes older snapshots above the configured limit.
- Orphan object pruning removes objects no longer referenced by retained snapshots.
- Local manifest cache (`config/backup_manifest.json`) speeds up change detection.
- Database backup is done via `pg_dump` to Dropbox `postgres/`.
- `POST /api/backup/run` always starts a background task and returns immediately.
- `GET /api/backup/progress` returns in-memory progress updated per file; phases: `starting` → `scanning` → `uploading` → `snapshot` → `pg_dump`.
- Scheduler runs in the background (`start_backup_scheduler`) and triggers on interval when enabled.
- Concurrency guard: only one backup can run at a time.
- After container restart/crash, stale `running` logs are auto-marked as interrupted/error.

---

## Environment

`stack/novela.env` should include at least:

- `POSTGRES_DB`
- `POSTGRES_USER`
- `POSTGRES_PASSWORD`
- `NOVELA_MASTER_KEY`
- `CONFIG_DIR`

Dropbox settings are managed via the web UI on `/backup`.

---

## UI Notes

- Library import accepts EPUB/PDF/CBR/CBZ.
- Home supports the same import formats.
- Home includes search.
- Home header/dropzone alignment matches Library (search top-right, dropzone below).
- `New` view supports `Grid` and `List` mode.
- Bulk selection + `Remove from New` works only in `List` mode.
- `List` mode has a column visibility filter: Publisher, Author, Series, Volume, Title, Has cover, Updated, Genres, Sub-genres, Tags, Status.
- `List` mode supports multi-select with `Shift+click` range selection on checkboxes.
- `Grid` mode shows no selection checkboxes or bulk actions.
- `All books` view supports `Grid` and `List` mode (same columns as `New`).
- View mode persisted in `localStorage` as `novela.all.viewMode`.
- Column visibility persisted in `localStorage` as `novela.all.visibleColumns`.
- `List` mode has a checkbox column, column visibility filter, and multi-select with `Shift+click` range selection.
- `List` mode has a `Delete selected` bulk action: confirms then calls `DELETE /library/file/{filename}` for each selected book.
- Star ratings (1–5) shown under the cover in all grid views:
  - Display-only in grid cards (no click, prevents accidental taps while scrolling).
  - Interactive in Book Detail (1.1rem, clickable; clicking the active star clears the rating).
  - Amber: filled `#c8a03a`, unfilled `rgba(200, 160, 58, 0.25)`.
- Reader settings (hamburger menu):
  - Content width slider (30–100 vw), persisted as `reader-content-width-pct`.
  - Text colour: 5 warm-tone presets `#e8e2d9` → `#938d86`, persisted as `reader-text-colour`.
  - Hamburger and back-link separated with `margin-left: 1rem` on `.header-back`.
- Reader supports EPUB and PDF:
  - EPUB: chapter-text rendering; progress = `{chapterIndex}:{scrollFrac}`; progress % = `(chapterIndex + scrollFrac) / total * 100`.
  - PDF: page-image rendering via `/library/pdf/{filename}?page=N`; page count from `/api/pdf/info/{filename}`; progress = `{pageIndex}:0`; keyboard/button navigation identical.
  - `reader.html` branches on `FORMAT` variable injected by the server.
- `Edit EPUB` button in Book Detail is only shown for `.epub` files.
- Backup page supports: manual run, dry-run, Dropbox root, retention count, schedule (on/off + hours), status + history.
- Bookmarks: saved per book via `POST /library/bookmarks/{filename}`; shown in Library sidebar section; navigated via `?bm_ch=N&bm_scroll=F` URL params on reader page.
- Book Builder (`/builder`): create EPUB books from scratch; drafts stored in `builder_drafts` (JSONB chapters); contenteditable editor with toolbar (bold/italic/underline/blockquote/author-note/scene-break/normalize); autosave every 30 s + Ctrl+S; publish normalizes HTML via `normalize_wysiwyg_html()` and builds EPUB via `build_epub()`.

---

## Known Conventions

- Book deletion flow: `unlink` file → `prune_empty_dirs(parent)` → `DELETE FROM library` (cascade removes child rows).
- Empty dir pruning: `prune_empty_dirs(start)` walks up from `start` to `LIBRARY_ROOT`, removing each dir if empty; stops at first non-empty dir.
- Cover strategy:
  - EPUB: `GET /library/cover/{filename}` checks `library_cover_cache` first; on miss, extracts from ZIP and warms the cache. Cover upload (`POST /library/cover/{filename}`) replaces the image inside the EPUB ZIP (OPF located via `META-INF/container.xml`, old cover found in manifest and removed) and updates the cache so subsequent requests return the new cover immediately.
  - PDF: first page rendered as thumbnail, cached
  - CBR/CBZ: first page extracted, cached
- Rating storage:
  - EPUB: `` in OPF
  - CBZ: `N` in `ComicInfo.xml` inside the ZIP
  - CBR/PDF: DB only
  - `upsert_book` uses `CASE WHEN EXCLUDED.rating > 0 THEN EXCLUDED.rating ELSE library.rating END` to restore rating from file without overwriting existing DB value.
- Tag types in `book_tags`: `genre`, `subgenre`, `tag`, `subject`. No direct `genres`/`subgenres` fields on book objects; always use helpers `bookGenres()`, `bookSubgenres()`, `bookPlainTags()`.

---

## Performance Notes

- Library load is optimized for large datasets:
  - `list_library_json()` uses pre-aggregation for `reading_sessions`.
  - `has_cached_cover` is provided directly via SQL join instead of full cache fetch.
- Additional migration indexes:
  - `idx_library_sort_coalesce`
  - `idx_library_needs_review`
  - `idx_library_archived`
  - `idx_reading_sessions_filename_readat`
  - `idx_book_tags_filename_tag`

---

## Known Bugs Fixed

- `renderGenreView` and `renderSearchResults` in `library.js` referenced `b.genres` (non-existent). Fixed: use `bookGenres()`, `bookSubgenres()`, `bookPlainTags()`.
- `PillInput` in `book.js` did not handle comma as delimiter and did not flush on save. Fixed: comma keydown + `flush()` in `saveEdit()`.
- `PATCH /library/book` failed for PDFs: `_sync_epub_metadata` tried to open PDF as ZIP. Fixed: only called for `.epub`.
- `_make_rel_path` in `reader.py` lacked format prefix (`epub/`, `pdf/`, `comics/`). Fixed: aligned with `common.make_rel_path`.
- `common.make_rel_path` always generated `.cbr` extension for CBZ files (both map to `media_type="cbr"`). Fixed: accepts optional `ext` parameter; `library.py` import now passes actual suffix.
- `/download/{filename}` was referenced in `book.html` but no endpoint existed (404). Fixed: added `GET /download/{filename}` to `library.py`.
- PDF reader showed infinite loading: `reader.html` called EPUB-only `/library/chapters/`. Fixed: PDF path uses `/api/pdf/info/` + page-image rendering.
- Empty dir pruning only ran when file was moved. Fixed: `prune_empty_dirs(old_path.parent)` always runs after a successful metadata save.
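
For reference, the pruning behaviour that this fix relies on (walk up from a starting directory, remove each empty directory, stop at the first non-empty one or at the library root) might look roughly like the sketch below. It is an assumption-laden illustration: the real `prune_empty_dirs(start)` takes a single argument and uses `LIBRARY_ROOT`, whereas this version takes the root as an explicit `root` parameter so that it is self-contained, and the returned list of removed directories is an addition for demonstration.

```python
from pathlib import Path


def prune_empty_dirs(start: Path, root: Path) -> list[Path]:
    """Remove empty directories from `start` upward, never touching `root`.

    Sketch only: `root` stands in for LIBRARY_ROOT, and the return value
    (directories removed, deepest first) is not part of the documented API.
    """
    removed: list[Path] = []
    root = root.resolve()
    current = start.resolve()
    # Only act on directories strictly inside the library root.
    while current != root and root in current.parents:
        if not current.is_dir() or any(current.iterdir()):
            break  # missing or non-empty directory: stop walking up
        current.rmdir()
        removed.append(current)
        current = current.parent
    return removed
```

Calling it with the old parent directory after every successful metadata save is harmless when nothing moved, because the first directory checked still contains the file and the loop exits immediately.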