# Novela 2.0 - Technical Status (Develop)

## Scope

This document describes the current technical status of the `develop` codebase. It is the primary technical reference for the current implementation.

## Architecture

- Stack: FastAPI, Jinja2 templates, plain JavaScript, PostgreSQL 16, Docker.
- All routers import `templates` from `shared_templates.py` (a single `Jinja2Templates` instance). This module registers a `develop_mode()` callable as a Jinja2 global, making it available in every template without passing it explicitly per route.
- Startup lifecycle (`main.py`):
  1. `init_pool()`
  2. `run_migrations()`
  3. `start_backup_scheduler()`
  4. mount routers
- Shutdown lifecycle:
  1. `stop_backup_scheduler()`
  2. `close_pool()`
- Source-of-truth rule: files on disk are authoritative, the database is an index/cache.

## File Storage Paths

All files are stored under `library/` (relative to the app working directory, mapped via Docker volume). `LIBRARY_DIR = Path("library")`, `LIBRARY_ROOT = LIBRARY_DIR.resolve()`.

### Path structure per format

| Format | Path pattern |
|--------|-------------|
| EPUB (no series) | `library/epub/{publisher}/{author}/Stories/{title}.epub` |
| EPUB (series) | `library/epub/{publisher}/{author}/Series/{series}/{idx:03d}_-_{title}.epub` |
| PDF | `library/pdf/{publisher}/{author}/{title}.pdf` |
| CBR (no series) | `library/comics/{publisher}/{author}/{title}.cbr` |
| CBR (series) | `library/comics/{publisher}/{author}/Series/{series}/{idx:03d}_-_{title}.cbr` |
| CBZ (no series) | `library/comics/{publisher}/{author}/{title}.cbz` |
| CBZ (series) | `library/comics/{publisher}/{author}/Series/{series}/{idx:03d}_-_{title}.cbz` |

- Segments are sanitised: special chars stripped, spaces replaced with `_`, max lengths applied (publisher/author 80, title 140, series 80).
- Series index is zero-padded to 3 digits (`001`, `002`, …), clamped to 1–999.
- Duplicate filenames get a `(2)`, `(3)`, … suffix.
- After any file move, empty parent directories are pruned up to `LIBRARY_ROOT`.

### Path logic

- `common.make_rel_path(media_type, publisher, author, title, series, series_index, series_suffix, ext)` — used by import and grabber.
- `reader.py _make_rel_path(publisher, author, title, series, series_index, series_suffix, ext)` — used by metadata PATCH; same logic, uses actual file extension.
- `series_volume` is not part of the file path; it is stored in DB and OPF only.
- Both functions produce identical paths for all formats.

### Metadata save behaviour per format

| Format | File written? | DB written? |
|--------|--------------|-------------|
| EPUB | Yes — OPF metadata updated in-place | Yes |
| PDF | No | Yes |
| CBR | No | Yes |
| CBZ | No (tags/metadata); rating written to ComicInfo.xml | Yes |

---

## Router Status

### `routers/library.py`

- `GET /library` — library page
- `GET /api/library` — book list JSON (fast-path by default)
- `POST /library/rescan` — forced full disk rescan
- `POST /library/import` — upload EPUB/PDF/CBR/CBZ
- `DELETE /library/file/{filename}` — delete file + DB row + prune dirs
- `GET /download/{filename}` — download file with `Content-Disposition: attachment`
- `GET /library/cover/{filename}` — serve cover (EPUB from file; PDF/CBR from cache)
- `GET /library/cover-cached/{filename}` — serve cover from DB cache only
- `POST /library/cover/{filename}` — upload/replace cover; for EPUB files: embeds cover in the EPUB and updates cache; for DB-stored books: stores cover directly in `library_cover_cache` and sets `has_cover = TRUE`
- `POST /library/want-to-read/{filename}` — toggle want-to-read flag
- `POST /library/archive/{filename}` — toggle archived flag
- `POST /library/archive-series` — set `archived` for all books in a series; body: `{"series": "…", "archive": true|false}`; returns `{ok, archived, count}`
- `POST /library/new/mark-reviewed` — bulk set `needs_review=false`
- `POST /library/bulk-delete` — delete multiple files; accepts `{"filenames": [...]}`, removes files from disk and DB in one query per batch; returns `{ok, deleted, skipped}`
- `POST /library/rating/{filename}` — set/clear star rating `{"rating": 0-5}`
- `GET /home` — home page
- `GET /api/home` — home data JSON
- `GET /stats` — statistics page
- `GET /api/stats` — statistics data JSON
- `GET /api/disk` — partition usage for the library directory: `{total, used, free, pct_used}`
- `POST /api/bulk-check-duplicates` — accepts `{"items": [{title, author, series, volume}, ...]}`, returns `{"duplicates": [bool, ...]}` — checks by title+author+series_index; also checks by series+author+series_index as fallback (catches duplicates when the title format has changed); when volume is absent, matches on title+author only
- `GET /library/list` — compat alias

`GET /api/library` runs in fast-path mode by default (DB-only, no full disk rescan). For a forced sync: `GET /api/library?rescan=true` or `POST /library/rescan`. `include_file_info=true` is optional for file size/mtime enrichment.

ETag caching: response includes `ETag: "{count}-{max_updated_at_unix}"` and `Cache-Control: no-cache`. Client sends `If-None-Match`; server returns `304 Not Modified` when nothing changed.
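The ETag handshake described above can be sketched roughly as follows. This is a minimal, framework-free illustration, not the code in `routers/library.py`: the helper names `make_etag` and `conditional_status` are hypothetical, and it assumes `count` is the number of library rows and that the handler does an exact string comparison against a single `If-None-Match` value.

```python
from datetime import datetime, timezone
from typing import Dict, Optional, Tuple


def make_etag(count: int, max_updated_at: datetime) -> str:
    """Build the quoted tag in the documented "{count}-{max_updated_at_unix}" shape."""
    # Hypothetical helper: the real handler may derive these values differently.
    return f'"{count}-{int(max_updated_at.timestamp())}"'


def conditional_status(if_none_match: Optional[str], etag: str) -> Tuple[int, Dict[str, str]]:
    """Return (status, headers): 304 when the client's tag matches, otherwise 200."""
    headers = {"ETag": etag, "Cache-Control": "no-cache"}
    # Assumption: exact match on one tag; weak tags and tag lists are not handled.
    if if_none_match is not None and if_none_match.strip() == etag:
        return 304, headers
    return 200, headers


updated = datetime(2024, 1, 1, tzinfo=timezone.utc)
etag = make_etag(42, updated)

first_status, first_headers = conditional_status(None, etag)   # first visit, no tag yet
repeat_status, _ = conditional_status(first_headers["ETag"], etag)  # unchanged library
stale_status, _ = conditional_status('"41-1700000000"', etag)  # library changed since

print(etag, first_status, repeat_status, stale_status)
```

Because the tag is derived only from the row count and the newest `updated_at`, it can be computed with one cheap aggregate query before the full book list is built, which is presumably why it pairs well with the DB-only fast path.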
`/api/home` returns:

- `continue_reading`
- `shorts_unread`
- `novels_unread`
- `shorts_read`
- `novels_read`

`/api/stats` returns totals plus chart/history data for `stats.html`:

- `reads_by_month`, `reads_by_dow`, `reads_by_hour`
- `genre_counts`, `publisher_counts`, `fav_genre`, `fav_publisher`
- `top_books`, `history`

Home sections exclude series books via:

- `COALESCE(series, '') = ''`
- `filename NOT LIKE '%/Series/%'`

Home read sections are ordered oldest-first:

- `shorts_read`: `ORDER BY MAX(read_at) ASC`
- `novels_read`: `ORDER BY MAX(read_at) ASC`

### `routers/reader.py`

- `GET /library/db-images/{path:path}` — serve image from content-addressed imagestore (`library/images/`); security: path must be under `IMAGES_DIR`
- `POST /api/library/convert-to-db/{filename:path}` — convert on-disk EPUB to a DB-stored book; extracts chapters via `_epub_body_inner` (stores images in imagestore, rewrites src to `/library/db-images/…`), migrates all child tables (INSERT new library row → UPDATE children → DELETE old row), deletes EPUB file; returns `{ok, new_filename}`
- `GET /api/library/export-epub/{filename:path}` — build and stream an EPUB from a DB-stored book; `_rewrite_db_images_for_epub` rewrites `/library/db-images/…` back to `OEBPS/Images/…` paths (dedup by sha256); returns as `Content-Disposition: attachment`
- `GET /library/epub/{filename}` — serve EPUB inline (no attachment header)
- `GET /library/chapters/{filename}` — EPUB spine as JSON; for `storage_type='db'` books returns chapters from `book_chapters`
- `GET /library/chapter/{index}/{filename}` — single chapter as HTML fragment; for `storage_type='db'` books reads from `book_chapters`
- `GET /library/chapter-img/{path}?filename=…` — image extracted from EPUB ZIP; `path` is the full internal ZIP path (e.g. `OEBPS/Images/cover.jpg` or `EPUB/images/cover.jpg`); case-insensitive fallback for mismatched folder names
- `GET /library/pdf/{filename}?page=N&dpi=150` — render PDF page as PNG
- `GET /api/pdf/info/{filename}` — `{"page_count": N}`
- `GET /library/cbr/{filename}/{page}` — CBR/CBZ page as image
- `GET /library/progress/{filename}` — read progress
- `POST /library/progress/{filename}` — save progress `{"cfi": "…", "progress": N}`
- `DELETE /library/progress/{filename}` — clear progress
- `POST /library/mark-read/{filename}` — mark as read (with optional date)
- `GET /library/book/{filename}` — book detail page
- `GET /api/genres` — all tags from `book_tags` (optional `?type=genre|subgenre|tag`)
- `PATCH /library/book/{filename}` — update metadata + tags; moves file if path fields change; DB-only for non-EPUB; for `storage_type='db'` books: recomputes synthetic `db/…` filename, FK-safe rename (INSERT→UPDATE children→DELETE old), updates `book_chapters` + `bookmarks` as well
- `POST /library/rating/{filename}` — set/clear 1–5 star rating; writes to EPUB OPF / CBZ ComicInfo.xml; DB-only for CBR/PDF
- `GET /library/read/{filename}` — reader page (EPUB or PDF); supports `?bm_ch=N&bm_scroll=F` to jump to bookmark position
- `GET /api/series-nav/{filename}` — returns `{prev, next}` (`{filename, title, index, suffix}` or `null`) for the adjacent books in the same series ordered by `series_index ASC, series_suffix ASC`; used by the reader for series navigation buttons and `markRead()` redirect
- `GET /library/bookmarks/{filename}` — list bookmarks for a book
- `POST /library/bookmarks/{filename}` — add bookmark `{chapter_index, scroll_frac, chapter_title, note}`
- `PATCH /library/bookmarks/{id}` — update bookmark note
- `DELETE /library/bookmarks/{id}` — delete bookmark
- `GET /api/bookmarks` — all bookmarks across all books (includes `book_title`, `book_author`)

### `routers/bulk_import.py`

- `GET /bulk-import` — Bulk Import page
- `POST /library/bulk-import` — import files with pre-parsed metadata; accepts multipart `files[]`, `rows` (JSON array of per-file metadata), `shared` (JSON with author/publisher/status/genres/tags applied to all files)

Filename parsing is done client-side in `bulk_import.html`. The page uses a free-text `%placeholder%` pattern (e.g. `%series% - %series_volume% - %volume% - %title% - %year%`). Available placeholders: `%series%` `%series_volume%` `%volume%` `%title%` `%year%` `%month%` `%day%` `%author%` `%publisher%` `%ignore%`. Colored chips can be clicked (insert at cursor) or dragged onto the input. Pattern is converted to a regex at parse time.

Shared metadata fields (including "Year/Vol." for `series_volume`) override filename-parsed values. "Auto-generate titles" checkbox fills empty title cells as `Series (Year/Vol) #Number`. Skip checkbox is always visible for every row; skipped rows are excluded from import. Files are uploaded in batches of 5 with a progress bar.

### `routers/editor.py`

- `GET /library/editor/{filename}` — chapter editor page; supports both EPUB files and DB-stored books (`db/…` filenames); passes `is_db` flag to template; DB branch queries `library` table directly (no file check)
- `GET /api/edit/chapter/{index}/{filename}` — get chapter content; DB branch reads from `book_chapters` and returns `{index, href, title, content}`
- `POST /api/edit/chapter/{index}/{filename}` — save chapter; DB branch accepts `{content, title}`, calls `upsert_chapter` (updates `content_tsv` too)
- `POST /api/edit/chapter/add/{filename}` — add new chapter after `after_index`; DB branch shifts `chapter_index` up via `UPDATE … SET chapter_index = chapter_index + 1 WHERE chapter_index >= insert_idx` then inserts
- `DELETE /api/edit/chapter/{index}/{filename}` — delete chapter; DB branch deletes and re-indexes via `UPDATE … SET chapter_index = chapter_index - 1 WHERE chapter_index > index`

### `routers/grabber.py`

- `GET /grabber` — grabber page
- `GET /convert` — convert page
- `GET /credentials-manager` — credentials manager UI
- `GET /debug` — debug page
- `POST /debug/run` — run debug scrape
- `GET /credentials` — list stored credentials
- `POST /credentials` — save credential
- `DELETE /credentials/{site}` — delete credential
- `POST /preload` — preload book info from URL
- `POST /convert` — run scrape; body may include `storage_mode: "db"` (default) or `"epub"` to control output format
- `GET /events/{job_id}` — SSE stream for job progress; `done` event includes `storage_type` (`'db'` or `'file'`)

Scrape/convert flow (DB storage — default):

1. Fetch book info + chapters via scraper
2. Per chapter: download images → write to `library/images/{sha2}/{sha256}{ext}` (content-addressed) → rewrite `img[src]` to `/library/db-images/...`; break images replaced with `