# Novela 2.0 - Technical Status (Develop)

## Scope
This document describes the current technical status of the `develop` codebase.
It is the primary technical reference for the current implementation.

## Architecture
- Stack: FastAPI, Jinja2 templates, plain JavaScript, PostgreSQL 16, Docker.
- All routers import `templates` from `shared_templates.py` (a single `Jinja2Templates` instance). This module registers a `develop_mode()` callable as a Jinja2 global, making it available in every template without passing it explicitly per route.
- Startup lifecycle (`main.py`):
  1. `init_pool()`
  2. `run_migrations()`
  3. `start_backup_scheduler()`
  4. mount routers
- Shutdown lifecycle:
  1. `stop_backup_scheduler()`
  2. `close_pool()`
- Source-of-truth rule: files on disk are authoritative, the database is an index/cache.
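
The startup and shutdown order listed above can be pictured as a single async context manager. This is only a sketch: the hook bodies are placeholders that record their name, whether the real hooks are coroutines is an assumption, and `lifespan` is a hypothetical name rather than something taken from `main.py`.

```python
import asyncio
from contextlib import asynccontextmanager

calls = []  # records the order in which the placeholder hooks run


# Placeholder hooks: stand-ins for the real functions named in the list above.
async def init_pool():
    calls.append("init_pool")


async def run_migrations():
    calls.append("run_migrations")


def start_backup_scheduler():
    calls.append("start_backup_scheduler")


def stop_backup_scheduler():
    calls.append("stop_backup_scheduler")


async def close_pool():
    calls.append("close_pool")


@asynccontextmanager
async def lifespan():
    # Startup: pool first, then migrations, then the scheduler, then routers.
    await init_pool()
    await run_migrations()
    start_backup_scheduler()
    calls.append("mount routers")
    try:
        yield
    finally:
        # Shutdown: stop the scheduler before closing the pool it may use.
        stop_backup_scheduler()
        await close_pool()


async def main():
    async with lifespan():
        calls.append("serving")


asyncio.run(main())
```

The `try`/`finally` keeps the shutdown steps running even if serving raises.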

## File Storage Paths

All files are stored under `library/` (relative to the app working directory, mapped via Docker volume).
`LIBRARY_DIR = Path("library")`, `LIBRARY_ROOT = LIBRARY_DIR.resolve()`.

### Path structure per format

| Format | Path pattern |
|--------|-------------|
| EPUB (no series) | `library/epub/{publisher}/{author}/Stories/{title}.epub` |
| EPUB (series) | `library/epub/{publisher}/{author}/Series/{series}/{idx:03d}_-_{title}.epub` |
| PDF | `library/pdf/{publisher}/{author}/{title}.pdf` |
| CBR (no series) | `library/comics/{publisher}/{author}/{title}.cbr` |
| CBR (series) | `library/comics/{publisher}/{author}/Series/{series}/{idx:03d}_-_{title}.cbr` |
| CBZ (no series) | `library/comics/{publisher}/{author}/{title}.cbz` |
| CBZ (series) | `library/comics/{publisher}/{author}/Series/{series}/{idx:03d}_-_{title}.cbz` |

- Segments are sanitised: special chars stripped, spaces replaced with `_`, max lengths applied (publisher/author 80, title 140, series 80).
- Series index is zero-padded to 3 digits (`001`, `002`, …), clamped to 1–999.
- Duplicate filenames get a `(2)`, `(3)`, … suffix.
- After any file move, empty parent directories are pruned up to `LIBRARY_ROOT`.
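
As a rough illustration of the rules above, the sketch below builds EPUB paths. The function names are hypothetical. The exact set of "special chars" and the placement of the `(2)` suffix are assumptions; the real `make_rel_path` may differ on both.

```python
import re
from pathlib import PurePosixPath

# Maximum segment lengths as stated in the list above.
MAX_LEN = {"publisher": 80, "author": 80, "title": 140, "series": 80}


def sanitise_segment(value, kind):
    # Assumption: "special chars" means anything that is not a word
    # character, whitespace, or a hyphen.
    cleaned = re.sub(r"[^\w\s-]", "", value).strip()
    cleaned = re.sub(r"\s+", "_", cleaned)
    return cleaned[: MAX_LEN[kind]]


def series_prefix(series_index):
    # Clamp to 1..999, then zero-pad to three digits.
    return f"{max(1, min(999, series_index)):03d}"


def epub_rel_path(publisher, author, title, series=None, series_index=1):
    base = PurePosixPath(
        "library/epub",
        sanitise_segment(publisher, "publisher"),
        sanitise_segment(author, "author"),
    )
    name = sanitise_segment(title, "title")
    if series:
        folder = sanitise_segment(series, "series")
        return base / "Series" / folder / f"{series_prefix(series_index)}_-_{name}.epub"
    return base / "Stories" / f"{name}.epub"


def dedupe(path, taken):
    # Assumption: the counter is appended directly before the extension.
    if str(path) not in taken:
        return path
    n = 2
    while True:
        candidate = path.with_name(f"{path.stem}({n}){path.suffix}")
        if str(candidate) not in taken:
            return candidate
        n += 1
```

For example, `epub_rel_path("Pub", "A. Author", "My Book", "Saga", 3)` gives `library/epub/Pub/A_Author/Series/Saga/003_-_My_Book.epub` under these assumptions.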

### Path logic

- `common.make_rel_path(media_type, publisher, author, title, series, series_index, series_suffix, ext)` — used by import and grabber.
- `reader.py _make_rel_path(publisher, author, title, series, series_index, series_suffix, ext)` — used by metadata PATCH; same logic, uses actual file extension.
- `series_volume` is not part of the file path; it is stored in DB and OPF only.
- Both functions produce identical paths for all formats.

### Metadata save behaviour per format

| Format | File written? | DB written? |
|--------|--------------|-------------|
| EPUB | Yes — OPF metadata updated in-place | Yes |
| PDF | No | Yes |
| CBR | No | Yes |
| CBZ | No (tags/metadata); rating written to ComicInfo.xml | Yes |

---

## Router Status

### `routers/library.py`
- `GET /library` — library page
- `GET /api/library` — book list JSON (fast-path by default)
- `POST /library/rescan` — forced full disk rescan
- `POST /library/import` — upload EPUB/PDF/CBR/CBZ
- `DELETE /library/file/{filename}` — delete file + DB row + prune dirs
- `GET /download/{filename}` — download file with `Content-Disposition: attachment`
- `GET /library/cover/{filename}` — serve cover (EPUB from file; PDF/CBR from cache)
- `GET /library/cover-cached/{filename}` — serve cover from DB cache only
- `POST /library/cover/{filename}` — upload/replace cover; for EPUB files: embeds cover in the EPUB and updates cache; for DB-stored books: stores cover directly in `library_cover_cache` and sets `has_cover = TRUE`
- `POST /library/want-to-read/{filename}` — toggle want-to-read flag
- `POST /library/archive/{filename}` — toggle archived flag
- `POST /library/archive-series` — set `archived` for all books in a series; body: `{"series": "…", "archive": true|false}`; returns `{ok, archived, count}`
- `POST /library/new/mark-reviewed` — bulk set `needs_review=false`
- `POST /library/bulk-delete` — delete multiple files; accepts `{"filenames": [...]}`, removes files from disk and DB in one query per batch; returns `{ok, deleted, skipped}`
- `POST /library/rating/{filename}` — set/clear star rating `{"rating": 0-5}`
- `GET /home` — home page
- `GET /api/home` — home data JSON
- `GET /stats` — statistics page
- `GET /api/stats` — statistics data JSON
- `GET /api/disk` — partition usage for the library directory: `{total, used, free, pct_used}`
- `POST /api/bulk-check-duplicates` — accepts `{"items": [{title, author, series, volume}, ...]}`, returns `{"duplicates": [bool, ...]}`; checks by title+author+series_index, with a fallback check by series+author+series_index (catches duplicates whose title format has changed); when volume is absent, matches on title+author only
- `GET /library/list` — compat alias

`GET /api/library` runs in fast-path mode by default (DB-only, no full disk rescan).
For a forced sync: `GET /api/library?rescan=true` or `POST /library/rescan`.
`include_file_info=true` is optional for file size/mtime enrichment.
ETag caching: response includes `ETag: "{count}-{max_updated_at_unix}"` and `Cache-Control: no-cache`. Client sends `If-None-Match`; server returns `304 Not Modified` when nothing changed.
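
The ETag handshake described above can be sketched without any web framework. `make_etag` and `library_response` are hypothetical names, and the `(status, headers, body)` tuple is only a stand-in for a real response object.

```python
from datetime import datetime, timezone


def make_etag(count, max_updated_at):
    """Build the quoted ETag in the "{count}-{max_updated_at_unix}" form."""
    return f'"{count}-{int(max_updated_at.timestamp())}"'


def library_response(if_none_match, count, max_updated_at, load_books):
    """Return (status, headers, body); the body is only loaded on a miss."""
    etag = make_etag(count, max_updated_at)
    headers = {"ETag": etag, "Cache-Control": "no-cache"}
    if if_none_match == etag:
        # Nothing changed since the client's copy: skip building the list.
        return 304, headers, None
    return 200, headers, load_books()
```

Because the tag only combines the row count and the newest `updated_at`, it is cheap to compute before the full book list is queried.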

`/api/home` returns:
- `continue_reading`
- `shorts_unread`
- `novels_unread`
- `shorts_read`
- `novels_read`

`/api/stats` returns totals plus chart/history data for `stats.html`:
- `reads_by_month`, `reads_by_dow`, `reads_by_hour`
- `genre_counts`, `publisher_counts`, `fav_genre`, `fav_publisher`
- `top_books`, `history`

Home sections exclude series books via:
- `COALESCE(series, '') = ''`
- `filename NOT LIKE '%/Series/%'`

Home read sections are ordered oldest-first:
- `shorts_read`: `ORDER BY MAX(read_at) ASC`
- `novels_read`: `ORDER BY MAX(read_at) ASC`

### `routers/reader.py`
- `GET /library/db-images/{path:path}` — serve image from content-addressed imagestore (`library/images/`); security: path must be under `IMAGES_DIR`
- `POST /api/library/convert-to-db/{filename:path}` — convert on-disk EPUB to a DB-stored book; extracts chapters via `_epub_body_inner` (stores images in imagestore, rewrites src to `/library/db-images/…`), migrates all child tables (INSERT new library row → UPDATE children → DELETE old row), deletes EPUB file; returns `{ok, new_filename}`
- `GET /api/library/export-epub/{filename:path}` — build and stream an EPUB from a DB-stored book; `_rewrite_db_images_for_epub` rewrites `/library/db-images/…` back to `OEBPS/Images/…` paths (dedup by sha256); returns as `Content-Disposition: attachment`
- `GET /library/epub/{filename}` — serve EPUB inline (no attachment header)
- `GET /library/chapters/{filename}` — EPUB spine as JSON; for `storage_type='db'` books returns chapters from `book_chapters`
- `GET /library/chapter/{index}/{filename}` — single chapter as HTML fragment; for `storage_type='db'` books reads from `book_chapters`
- `GET /library/chapter-img/{path}?filename=…` — image extracted from EPUB ZIP; `path` is the full internal ZIP path (e.g. `OEBPS/Images/cover.jpg` or `EPUB/images/cover.jpg`); case-insensitive fallback for mismatched folder names
- `GET /library/pdf/{filename}?page=N&dpi=150` — render PDF page as PNG
- `GET /api/pdf/info/{filename}` — `{"page_count": N}`
- `GET /library/cbr/{filename}/{page}` — CBR/CBZ page as image
- `GET /library/progress/{filename}` — read progress
- `POST /library/progress/{filename}` — save progress `{"cfi": "…", "progress": N}`
- `DELETE /library/progress/{filename}` — clear progress
- `POST /library/mark-read/{filename}` — mark as read (with optional date)
- `GET /library/book/{filename}` — book detail page
- `GET /api/genres` — all tags from `book_tags` (optional `?type=genre|subgenre|tag`)
- `PATCH /library/book/{filename}` — update metadata + tags; moves file if path fields change; DB-only for non-EPUB; for `storage_type='db'` books: recomputes synthetic `db/…` filename, FK-safe rename (INSERT→UPDATE children→DELETE old), updates `book_chapters` + `bookmarks` as well
- `POST /library/rating/{filename}` — set/clear 1–5 star rating; writes to EPUB OPF / CBZ ComicInfo.xml; DB-only for CBR/PDF
- `GET /library/read/{filename}` — reader page (EPUB or PDF); supports `?bm_ch=N&bm_scroll=F` to jump to bookmark position
- `GET /api/series-nav/{filename}` — returns `{prev, next}` (`{filename, title, index, suffix}` or `null`) for the adjacent books in the same series ordered by `series_index ASC, series_suffix ASC`; used by the reader for series navigation buttons and `markRead()` redirect
- `GET /library/bookmarks/{filename}` — list bookmarks for a book
- `POST /library/bookmarks/{filename}` — add bookmark `{chapter_index, scroll_frac, chapter_title, note}`
- `PATCH /library/bookmarks/{id}` — update bookmark note
- `DELETE /library/bookmarks/{id}` — delete bookmark
- `GET /api/bookmarks` — all bookmarks across all books (includes `book_title`, `book_author`)

### `routers/bulk_import.py`
- `GET /bulk-import` — Bulk Import page
- `POST /library/bulk-import` — import files with pre-parsed metadata; accepts multipart `files[]`, `rows` (JSON array of per-file metadata), `shared` (JSON with author/publisher/status/genres/tags applied to all files)

Filename parsing is done client-side in `bulk_import.html`. The page uses a free-text `%placeholder%` pattern (e.g. `%series% - %series_volume% - %volume% - %title% - %year%`). Available placeholders: `%series%` `%series_volume%` `%volume%` `%title%` `%year%` `%month%` `%day%` `%author%` `%publisher%` `%ignore%`. Colored chips can be clicked (insert at cursor) or dragged onto the input. Pattern is converted to a regex at parse time. Shared metadata fields (including "Year/Vol." for `series_volume`) override filename-parsed values. "Auto-generate titles" checkbox fills empty title cells as `Series (Year/Vol) #Number`. Skip checkbox is always visible for every row; skipped rows are excluded from import. Files are uploaded in batches of 5 with a progress bar.
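
The placeholder-to-regex conversion happens in the page's JavaScript. The Python sketch below only illustrates the idea and is not a port of that code. Treating `%volume%`, `%year%`, `%month%` and `%day%` as digit-only, and `%ignore%` as an uncaptured wildcard, are assumptions.

```python
import re

PLACEHOLDERS = {
    "series", "series_volume", "volume", "title", "year",
    "month", "day", "author", "publisher", "ignore",
}
# Assumption: these placeholders only ever match digits.
NUMERIC = {"volume", "year", "month", "day"}


def pattern_to_regex(pattern):
    """Turn a %placeholder% pattern into a compiled, anchored regex."""
    parts = []
    pos = 0
    for m in re.finditer(r"%(\w+)%", pattern):
        # Literal text between placeholders must match exactly.
        parts.append(re.escape(pattern[pos:m.start()]))
        name = m.group(1)
        if name not in PLACEHOLDERS:
            raise ValueError(f"unknown placeholder: %{name}%")
        if name == "ignore":
            parts.append(r".+?")  # matched but not captured
        elif name in NUMERIC:
            parts.append(rf"(?P<{name}>\d+)")
        else:
            parts.append(rf"(?P<{name}>.+?)")
        pos = m.end()
    parts.append(re.escape(pattern[pos:]))
    return re.compile("^" + "".join(parts) + "$")


def parse_filename(pattern, stem):
    """Return the captured fields, or None when the name does not fit."""
    m = pattern_to_regex(pattern).match(stem)
    return m.groupdict() if m else None
```

In this sketch a placeholder other than `%ignore%` may appear only once per pattern, because Python rejects duplicate group names.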

### `routers/editor.py`
- `GET /library/editor/{filename}` — chapter editor page; supports both EPUB files and DB-stored books (`db/…` filenames); passes `is_db` flag to template; DB branch queries `library` table directly (no file check)
- `GET /api/edit/chapter/{index}/{filename}` — get chapter content; DB branch reads from `book_chapters` and returns `{index, href, title, content}`
- `POST /api/edit/chapter/{index}/{filename}` — save chapter; DB branch accepts `{content, title}`, calls `upsert_chapter` (updates `content_tsv` too)
- `POST /api/edit/chapter/add/{filename}` — add new chapter after `after_index`; DB branch shifts `chapter_index` up via `UPDATE … SET chapter_index = chapter_index + 1 WHERE chapter_index >= insert_idx` then inserts
- `DELETE /api/edit/chapter/{index}/{filename}` — delete chapter; DB branch deletes and re-indexes via `UPDATE … SET chapter_index = chapter_index - 1 WHERE chapter_index > index`
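
The shift-then-insert and delete-then-reindex steps for DB-stored chapters can be tried in isolation. This sketch uses an in-memory SQLite table purely so it runs anywhere; the application itself uses PostgreSQL. The column set, the `filename` filter, zero-based indices and `insert_idx = after_index + 1` are all assumptions.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Assumed minimal shape of book_chapters; the real table has more columns.
conn.execute(
    "CREATE TABLE book_chapters (filename TEXT, chapter_index INTEGER, title TEXT)"
)
conn.executemany(
    "INSERT INTO book_chapters VALUES (?, ?, ?)",
    [("db/x", i, f"Chapter {i + 1}") for i in range(3)],
)


def add_chapter(conn, filename, after_index, title):
    insert_idx = after_index + 1
    with conn:  # one transaction: shift later chapters up, then insert
        conn.execute(
            "UPDATE book_chapters SET chapter_index = chapter_index + 1 "
            "WHERE filename = ? AND chapter_index >= ?",
            (filename, insert_idx),
        )
        conn.execute(
            "INSERT INTO book_chapters VALUES (?, ?, ?)",
            (filename, insert_idx, title),
        )
    return insert_idx


def delete_chapter(conn, filename, index):
    with conn:  # one transaction: delete, then close the gap
        conn.execute(
            "DELETE FROM book_chapters WHERE filename = ? AND chapter_index = ?",
            (filename, index),
        )
        conn.execute(
            "UPDATE book_chapters SET chapter_index = chapter_index - 1 "
            "WHERE filename = ? AND chapter_index > ?",
            (filename, index),
        )


def chapters(conn, filename):
    rows = conn.execute(
        "SELECT chapter_index, title FROM book_chapters "
        "WHERE filename = ? ORDER BY chapter_index",
        (filename,),
    )
    return list(rows)
```

Running both statements in one transaction keeps the indices contiguous even if the second statement fails.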

### `routers/grabber.py`
- `GET /grabber` — grabber page
- `GET /convert` — convert page
- `GET /credentials-manager` — credentials manager UI
- `GET /debug` — debug page
- `POST /debug/run` — run debug scrape
- `GET /credentials` — list stored credentials
- `POST /credentials` — save credential
- `DELETE /credentials/{site}` — delete credential
- `POST /preload` — preload book info from URL
- `POST /convert` — run scrape; body may include `storage_mode: "db"` (default) or `"epub"` to control output format
- `GET /events/{job_id}` — SSE stream for job progress; `done` event includes `storage_type` (`'db'` or `'file'`)

Scrape/convert flow (DB storage — default):
1. Fetch book info + chapters via scraper
2. Per chapter: download images → write to `library/images/{sha2}/{sha256}{ext}` (content-addressed) → rewrite `img[src]` to `/library/db-images/...`; break images replaced with `<hr>` before `element_to_xhtml` runs → build `content_html` via `element_to_xhtml` with `break_img_path="/static/break.png"`
3. One DB transaction: `ensure_unique_db_filename` → `upsert_book` (storage_type='db') → `upsert_chapter` for each chapter → `upsert_cover_cache` if cover provided
4. Synthetic filename: `db/{publisher}/{author}/{title}` (or `db/{pub}/{auth}/Series/{series}/{idx} - {title}` for series)
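
The content-addressed image path in step 2 can be derived from the image bytes alone. In this sketch, reading `{sha2}` as the first two hex characters of the SHA-256 digest is an assumption, and both function names are hypothetical.

```python
import hashlib
from pathlib import PurePosixPath

IMAGES_DIR = PurePosixPath("library/images")


def image_store_path(data, ext):
    """Path for an image, determined only by its bytes and extension."""
    digest = hashlib.sha256(data).hexdigest()
    # Assumption: the {sha2} directory is the first two hex characters.
    return IMAGES_DIR / digest[:2] / f"{digest}{ext}"


def db_image_url(path):
    """URL that a rewritten img[src] would point at."""
    return "/library/db-images/" + path.relative_to(IMAGES_DIR).as_posix()
```

Identical bytes always map to the same path, so an image reused across chapters or books is stored once.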

Scrape/convert flow (EPUB file — `storage_mode: "epub"`):
1–2. Same as DB flow; `break_img_path="../Images/break.png"` passed to `element_to_xhtml`
3. Chapters converted to XHTML via `make_chapter_xhtml`; EPUB file built via `make_epub` (embeds `static/break.png` as `OEBPS/Images/break.png`) and written to `library/epub/…`
4. `upsert_book` called with `storage_type='file'`

### Scrapers (`scrapers/`)

All scrapers inherit `BaseScraper` and implement `matches(url)`, `login()`, `fetch_book_info()`, `fetch_chapter()`. Registration order in `scrapers/__init__.py` determines priority (first match wins).
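
A minimal sketch of first-match-wins dispatch follows. The classes are stand-ins, not the real scrapers: `matches` being a classmethod, the `SCRAPERS` list and `find_scraper` are assumptions, and the stand-in `NiftyScraper` deliberately matches every `nifty.org` host (which the real one does not) so that the effect of ordering is visible.

```python
from urllib.parse import urlparse


class BaseScraper:
    @classmethod
    def matches(cls, url):
        raise NotImplementedError


class NiftyNewScraper(BaseScraper):
    @classmethod
    def matches(cls, url):
        return urlparse(url).hostname == "new.nifty.org"


class NiftyScraper(BaseScraper):
    @classmethod
    def matches(cls, url):
        # Deliberately broad stand-in: any nifty.org host, subdomains included.
        host = urlparse(url).hostname or ""
        return host == "nifty.org" or host.endswith(".nifty.org")


# Registration order is the priority order: the narrower scraper goes first.
SCRAPERS = [NiftyNewScraper, NiftyScraper]


def find_scraper(url):
    for scraper in SCRAPERS:
        if scraper.matches(url):
            return scraper
    return None
```

With the order reversed, the broad stand-in would claim `new.nifty.org` URLs before the specific scraper saw them.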

| Scraper | Domain | Login | Notes |
|---|---|---|---|
| `ArchiveOfOurOwnScraper` | archiveofourown.org | Optional | Uses authenticity token; adult content gate via `?view_adult=true` |
| `AwesomeDudeScraper` | awesomedude.org | No | Chapter discovery via `.htm/.html` links in same directory; content extracted from largest non-layout block |
| `CodeysWorldScraper` | codeysworld.org | No | See below |
| `GayAuthorsScraper` | gayauthors.org | Optional | Genres + subgenres from `itemprop="genre"` links; tags from `ipsTags` list |
| `IomfatsScraper` | iomfats.org | No | See below; requires chapter URL as entry point |
| `NiftyNewScraper` | new.nifty.org | No | See below; registered before NiftyScraper |
| `NiftyScraper` | nifty.org (classic) | No | See below; excludes new.nifty.org; category/subcategory stored as tags |
| `TedLouisScraper` | tedlouis.com | No | Story index URL required as entry point; all pages use `?t=TOKEN` routing; chapter links in `<ul class="story-index-list">` |

#### NiftyNewScraper

`new.nifty.org` is a Next.js RSC application. Pages render proper HTML with semantic markup — no plain-text email format.

- URL normalisation: `_to_index_url()` strips a trailing `/N` (chapter index) so any URL (index or chapter) can be passed as entry point. Story URL pattern: `/stories/{slug}-{id}`.
- `fetch_book_info()`:
  - Title from `<h1>`; fallback: `<title>` with ` - … - Nifty Archive …` suffix stripped.
  - Author from `<strong itemprop="name">` inside `<a href="/authors/{id}">`.
  - Publication date from `<time itemprop="datePublished" datetime="…">`, updated date from `<time itemprop="dateModified" datetime="…">`; both truncated to `YYYY-MM-DD`.
  - Tags from all `<ul aria-label="Tags">` containers on the page — covers both the story category links (`/collections/…`) and the AI-generated content tags (`/search?query=tags%3A…`); deduplicated; `genres` and `subgenres` are always empty.
  - Description from `<meta name="description">`.
  - Chapter list: `<a>` links matching `/stories/{slug}/N` collected from page HTML; fallback: regex scan of RSC stream for `"index": N` values. URLs generated as `{index_url}/1` … `{index_url}/max`.
- `fetch_chapter()`:
  - Content extraction order:
    1. Chapter HTML (`{url}`): read `<article>` and collect `<p>` text
    2. Fallback on same HTML: extract escaped Next payload paragraphs (`\u003cp...\u003c/p`)
    3. Last fallback (`{url}?_rsc=1`): parse RSC line format (`{hex_id}:{json}`) for `["$","p",…]` nodes, then escaped paragraph fallback
  - Chapter title uses the precomputed chapter dict title (`Chapter N`).
  - Lead/tail boilerplate detection for common Nifty intro/donate text. Removed boilerplate is preserved as invisible HTML comments in chapter content:
    - `<!-- NIFTY_HIDDEN_LEAD: ... -->`
    - `<!-- NIFTY_HIDDEN_TAIL: ... -->`
  - No email-header stripping and no plain-text line-joining (those are specific to Nifty classic).
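
The URL normalisation and chapter URL generation for `new.nifty.org` might look roughly like this. It is a sketch only: the function names are hypothetical, and dropping the query string and fragment is an assumption.

```python
import re
from urllib.parse import urlsplit, urlunsplit


def to_index_url(url):
    """Strip a trailing /N chapter segment, leaving the story index URL."""
    parts = urlsplit(url)
    # Only a final all-digit path segment counts as a chapter index;
    # the "-{id}" at the end of the slug is untouched.
    path = re.sub(r"/\d+/?$", "", parts.path).rstrip("/")
    # Assumption: query string and fragment are dropped.
    return urlunsplit((parts.scheme, parts.netloc, path, "", ""))


def chapter_urls(index_url, max_index):
    """Generate {index_url}/1 ... {index_url}/max."""
    return [f"{index_url}/{n}" for n in range(1, max_index + 1)]
```

Because the story id is joined to the slug with a hyphen rather than a slash, an index URL passes through unchanged.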

#### NiftyScraper

Nifty classic pages are plain-text email submissions wrapped in a `<pre>` element.

- URL normalisation: `_to_index_url()` strips the chapter segment so any URL (index or chapter) can be passed as the entry point. Path structure: `/nifty/{category}/{subcategory}/{story}/` (index, 4 segments) vs `/nifty/{category}/{subcategory}/{story}/{chapter}` (chapter, 5 segments).
- `fetch_book_info()` performs up to 3 extra HTTP requests: chapter 1 (author + publication date), last chapter (`updated_date`), chapter 2 (boilerplate detection). Author and dates are extracted from the email headers (`From:`, `Date:`) embedded at the top of each chapter file. Date is parsed via `email.utils.parsedate` → `YYYY-MM-DD`.
- Boilerplate detection: leading paragraphs of chapters 1 and 2 (after email-header strip) are compared using normalised text (lowercase, whitespace collapsed). Consecutive matching paragraphs are recorded as `preamble_count` and stored in each chapter dict; `fetch_chapter()` skips them.
- `fetch_chapter()` pipeline:
  1. Extract `<pre>` text (fallback: full body text)
  2. Parse `Subject:` header → store as `<!-- Subject: … -->` comment in chapter content (invisible in reader, extractable later)
  3. Strip email header block (up to first blank line after `Date:`/`From:`/`Subject:` lines)
  4. Skip first `preamble_count` paragraphs
  5. Split on blank lines → paragraphs; join hard-wrapped lines within each paragraph with a space
  6. Detect and remove lead/tail boilerplate (common notice/disclaimer/author promo/donate blocks)
  7. Persist removed boilerplate as invisible comments:
     - `<!-- NIFTY_HIDDEN_LEAD: ... -->`
     - `<!-- NIFTY_HIDDEN_TAIL: ... -->`
  8. Scene-break patterns (`***`, `---`, `~~~`, `• • •`, etc.) → `<hr/>`
  9. Build `content_el` as a BeautifulSoup `<div>` of comments + `<p>` + `<hr/>` nodes
- Genres/subgenres from URL path: `category` (e.g. `gay` → `Gay`) and `subcategory` (e.g. `young-friends` → `Young Friends`).
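
Parts of the classic Nifty pipeline (header split, date parsing, paragraph joining, scene breaks) can be sketched with the standard library. Boilerplate removal and the BeautifulSoup tree are left out. The function names and the exact scene-break regex are assumptions, and the output is a plain HTML string rather than `content_el`.

```python
import html
import re
from email.utils import parsedate

# Assumed scene-break shapes: ***, ---, ~~~, and spaced bullets or stars.
SCENE_BREAK = re.compile(r"^(?:(?:\*\s*){3,}|-{3,}|~{3,}|(?:•\s*){3,})$")


def split_header(text):
    """Split at the first blank line into (header block, body)."""
    header, _, body = text.partition("\n\n")
    return header, body


def header_value(header, name):
    for line in header.splitlines():
        if line.lower().startswith(name.lower() + ":"):
            return line.split(":", 1)[1].strip()
    return None


def parse_header_date(value):
    """Parse an RFC 2822 style date into YYYY-MM-DD, or None."""
    parsed = parsedate(value)
    if parsed is None:
        return None
    return f"{parsed[0]:04d}-{parsed[1]:02d}-{parsed[2]:02d}"


def to_paragraphs(body, preamble_count=0):
    """Split on blank lines and join hard-wrapped lines with a space."""
    blocks = [b for b in re.split(r"\n\s*\n", body) if b.strip()]
    joined = [
        " ".join(line.strip() for line in b.splitlines() if line.strip())
        for b in blocks
    ]
    return joined[preamble_count:]


def to_html(paragraphs):
    out = []
    for p in paragraphs:
        if SCENE_BREAK.match(p):
            out.append("<hr/>")
        else:
            out.append(f"<p>{html.escape(p)}</p>")
    return "".join(out)
```

The sketch skips the preamble after splitting into paragraphs, which is the simplest way to count them; the real pipeline may order these steps differently.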

#### CodeysWorldScraper

- Entry point: any `codeysworld.org` URL.
- Title from `<h1>`; author from `<h2>` matching `"by …"` pattern; fallback: URL path segment `/{author}/{category}/filename`.
- Category from URL path (second-to-last segment, e.g. `remembrances` → tag `"Remembrances"`).
- Chapter discovery: `.htm/.html` links in the same directory as the entry URL; audio/image links skipped. No chapter links → single-file story (entry URL is the only chapter).
- `fetch_chapter()`: removes all `<h1>`/`<h2>` headings, back-navigation links, audio links (`.mp3`), mailto links; falls back to `<body>` when no content wrapper is found.

#### IomfatsScraper

All stories by an author are listed on a single author page (`/storyshelf/hosted/{author}/`). Individual story pages do not exist.

- Entry point must be a **chapter URL** (`/storyshelf/hosted/{author}/{story-folder}/{chapter}.html`). Passing the author page URL raises a `ValueError` with a user-visible message.
- On load: navigates to the author page and scans `<div id="content">` for the matching story.
- Two page structures detected:
  - **Single story**: outer `<h3>` = book title; chapters are direct `<li><a>` children of the following `<ul>`.
  - **Multi-part series**: outer `<h3>` = series name; nested `<li><h3>` = book title per part; chapters in the sub-`<ul>` matching `story_folder`.
- Series index extracted from folder name suffix: `*-part{N}` or `*-{N}`.
- Publication status from `<p><small>[…]</small></p>` after the book title heading.
- `fetch_chapter()`: content from `<div id="content">`; removes `<h2>`/`<h3>` headings, `.chapternav` divs, `div.important` footer blocks, anchor-name elements.
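
For the Iomfats entry-point check and the series index rule, a small sketch follows. The function names, the exact segment test and the error text are assumptions.

```python
import re
from urllib.parse import urlsplit

# Matches a folder ending in "-part{N}" or "-{N}".
_INDEX_SUFFIX = re.compile(r"-(?:part)?(\d+)$")


def parse_chapter_url(url):
    """Return (author, story_folder); reject anything but a chapter URL."""
    segments = [s for s in urlsplit(url).path.split("/") if s]
    is_chapter = (
        len(segments) == 5
        and segments[:2] == ["storyshelf", "hosted"]
        and segments[4].endswith(".html")
    )
    if not is_chapter:
        # Hypothetical wording; the real message is not given in this document.
        raise ValueError("Please pass a chapter URL, not the author page.")
    return segments[2], segments[3]


def series_index_from_folder(folder):
    """Extract N from a '*-part{N}' or '*-{N}' folder name, else None."""
    m = _INDEX_SUFFIX.search(folder)
    return int(m.group(1)) if m else None
```

A folder with no numeric suffix yields `None`, which a caller could treat as "not part of a series".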

#### TedLouisScraper

All pages on `tedlouis.com` use opaque token-based routing: `https://tedlouis.com/?t=<TOKEN>`. There are no predictable URL patterns — tokens must be followed from the story index page.

- Entry point must be a **story index URL** (the page listing all chapters). Passing a chapter URL raises a `ValueError` with a user-visible message. Detection: story index has `<h2 class="story-page-title">`, chapter page has `<h1 class="story-title">`.
- `fetch_book_info()`:
  - Title from direct `NavigableString` children of `<h2 class="story-page-title">` — the element also contains a "Back" button (`<a class="btn">`) and the author byline (`<span class="story-author-by-line">`), which are skipped.
  - Author from `<span class="story-author-by-line"> <a>`.
  - Publication status from `<span class="story-status-text">` with "Status: " prefix stripped.
  - Updated date from `<span class="story-last-updated">` ("Last Updated: Month D, YYYY") → `YYYY-MM-DD`.
  - Chapter list from all `<ul class="story-index-list">` elements (three columns on the page); relative `?t=TOKEN` hrefs resolved to absolute URLs. Order preserved; duplicates removed.
  - No genres, subgenres, tags or description available on the page.
- `fetch_chapter()`: content from `<div id="chapter">`; strips `<h1 class="story-title">`, `<h2 class="chapter-title">`, `div.chapter-copyright-line`, and `div.chapter-copyright-notice-text` blocks. Chapter title refined from `<h2 class="chapter-title"> <span>`.
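
Two of the text clean-ups for `tedlouis.com` (the updated date and the status prefix) are easy to sketch. Both function names are hypothetical, and the month name is parsed with `%B`, which assumes the process runs with an English or default C locale.

```python
from datetime import datetime


def parse_last_updated(text):
    """Turn 'Last Updated: Month D, YYYY' into 'YYYY-MM-DD'."""
    raw = text.strip()
    prefix = "Last Updated:"
    if raw.startswith(prefix):
        raw = raw[len(prefix):].strip()
    # %d accepts both 'March 5' and 'March 05'.
    return datetime.strptime(raw, "%B %d, %Y").strftime("%Y-%m-%d")


def parse_status(text):
    """Strip the 'Status: ' prefix from the status span text."""
    raw = text.strip()
    prefix = "Status:"
    return raw[len(prefix):].strip() if raw.startswith(prefix) else raw
```

A string that does not fit the expected format raises `ValueError` from `strptime`, so a caller would need to decide whether to swallow that.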

#### `xhtml.element_to_xhtml()` — Comment handling

`bs4.Comment` objects (a `NavigableString` subclass) are now emitted as XML comments: `<!-- … -->`. The `--` sequence (illegal inside XML comments) is sanitised to `- -`. This allows scrapers to embed invisible metadata (e.g. the Nifty `Subject:` header) in chapter content without it appearing in the rendered reader.
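
The `--` sanitising step can be shown on its own. `xml_comment` is a hypothetical helper, and repeating the replacement until no `--` remains is an assumption about how runs of three or more hyphens are handled.

```python
def xml_comment(text):
    """Wrap text in an XML comment, breaking up any '--' sequences."""
    safe = text
    # A single pass turns '---' into '- --', so repeat until none remain.
    while "--" in safe:
        safe = safe.replace("--", "- -")
    # The padding spaces also keep the text from ending right before '-->'.
    return f"<!-- {safe} -->"
```

For instance, `xml_comment("Subject: A -- B")` returns `<!-- Subject: A - - B -->`.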

### `routers/search.py`
- `GET /search` — full-text search page (`search.html`); Enter-to-search, `?q=` param auto-runs on load
- `GET /api/search?q=…&mode=phrase|words&filter=all|unread_novels|unread_shorts` — FTS over `book_chapters.content_tsv`; `mode=phrase` (default) uses `phraseto_tsquery` (words in order); `mode=words` uses `plainto_tsquery` (all words present, any order); `ts_rank` and `ts_headline` always use `plainto_tsquery`; also matches chapters whose `title` contains the query (case-insensitive LIKE fallback); no result limit; excludes archived books; `filter=unread_novels` restricts to books with no reading sessions/progress and no `Shorts` tag; `filter=unread_shorts` restricts to books with no reading sessions/progress and a `Shorts` tag; results include `filename`, `title`, `author`, `chapter_index`, `chapter_title`, `snippet`, `rank`
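
How the two search modes might differ in the generated SQL is sketched below. This is not the router's actual query: the `'english'` text search configuration, the `content` column, the psycopg-style `%(q)s` parameter and the omission of the archive and unread filters are all assumptions. Only the match function changes with `mode`; ranking and snippets stay on `plainto_tsquery`.

```python
def build_search_sql(mode="phrase"):
    """Return a parameterised SQL string for the given search mode."""
    if mode not in ("phrase", "words"):
        raise ValueError(f"unsupported mode: {mode}")
    # phrase: words must appear in order; words: all words, any order.
    match_fn = "phraseto_tsquery" if mode == "phrase" else "plainto_tsquery"
    return f"""
SELECT c.filename, c.chapter_index, c.title AS chapter_title,
       ts_rank(c.content_tsv, plainto_tsquery('english', %(q)s)) AS rank,
       ts_headline('english', c.content,
                   plainto_tsquery('english', %(q)s)) AS snippet
FROM book_chapters c
WHERE c.content_tsv @@ {match_fn}('english', %(q)s)
   OR c.title ILIKE '%%' || %(q)s || '%%'
ORDER BY rank DESC
"""
```

The user's query text is never interpolated into the string; only the function name, chosen from a fixed pair, is.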

### `routers/settings.py`
- `GET /settings` — settings page
- `GET /api/app-settings` — returns `{"develop_mode": bool, "break_image_url": str|null}`
- `PATCH /api/app-settings` — accepts `{"develop_mode": bool}`, persists to `app_settings` table
- `POST /api/app-settings/break-image` — multipart file upload (PNG/JPG/WebP); stores image in imagestore + overwrites `static/break.png`; saves `break_image_sha256` + `break_image_ext` to `app_settings`; returns `{"ok": true, "url": "/library/db-images/…"}`
- `GET /api/break-patterns` — list chapter-break patterns
- `POST /api/break-patterns` — add break pattern (type: `regex` or `css_class`)
- `PATCH /api/break-patterns/{id}` — update pattern (enable/disable or change value)
- `DELETE /api/break-patterns/{id}` — delete pattern
- `DELETE /api/reading-history` — wipe all reading sessions

`app_settings` table (single row, `id = 1`): `develop_mode BOOLEAN`, `break_image_sha256 VARCHAR(64)`, `break_image_ext VARCHAR(10)`.

### `routers/builder.py`
- `GET /builder` — Book Builder index (draft list + new draft form)
- `POST /builder` — create new draft; redirects to `/builder/{id}`
- `GET /builder/{draft_id}` — draft editor page
- `DELETE /api/builder/{draft_id}` — delete draft
- `GET /api/builder/{draft_id}` — draft JSON (id, title, author, publisher, source_url, chapters)
- `POST /api/builder/{draft_id}/chapter` — add chapter `{title, after_index}`; returns `{index, count}`
- `PUT /api/builder/{draft_id}/chapter/{idx}` — save chapter `{title?, content?}`
- `DELETE /api/builder/{draft_id}/chapter/{idx}` — delete chapter; returns `{index, count}`
- `POST /api/builder/{draft_id}/normalize/{idx}` — normalize chapter HTML (preview only, does not save); returns `{content}`
- `POST /api/builder/{draft_id}/publish` — normalize all chapters → `build_epub()` → write to `library/epub/` → `upsert_book()` → delete draft; returns `{filename}`; redirects browser to `/library/book/{filename}`

Publish flow: all chapters are run through `normalize_wysiwyg_html()`, then `build_epub()` produces an EPUB 2.0 ZIP. The file path is computed via `make_rel_path(media_type="epub", …)`. The book is inserted into the library with `needs_review=True`. The draft is deleted on success.

### `routers/following.py`
- `GET /following` — Following page (author URL management)
- `GET /api/following` — all distinct library authors with URL (if set), book count, and last-added date
- `POST /api/following/{author_name}` — set or clear URL for an author (empty `url` removes the record)

`GET /api/following` returns one entry per non-archived author:
```json
{ "name": "Author Name", "book_count": 5, "last_added": "2026-03-27T…", "url": "https://…" }
```

URL is stored in the `authors` table (`name` unique, `url`, `created_at`, `updated_at`).

### `routers/backup.py`
- `GET /backup` — backup page
- `GET /api/backup/credentials` — Dropbox settings (includes `app_key_configured` flag)
- `POST /api/backup/credentials` — save Dropbox settings
- `DELETE /api/backup/credentials` — remove all Dropbox credentials
- `POST /api/backup/oauth/prepare` — save app key + secret, return Dropbox auth URL
- `POST /api/backup/oauth/exchange` — exchange authorization code for refresh token
- `GET /api/backup/health` — Dropbox connectivity check (includes `schedule_enabled`, `schedule_interval_hours`)
- `GET /api/backup/status` — current backup status
- `GET /api/backup/history` — backup run history (last 20)
- `GET /api/backup/progress` — live progress of running backup `{running, done, total, phase}`
- `POST /api/backup/run` — trigger backup (background task)
- `GET /api/backup/snapshots` — list available snapshots `{ok, snapshots: [{name, created_at}]}`
- `GET /api/backup/snapshots/{snapshot_name}/files` — list files in a snapshot with local existence check `{ok, snapshot, files: [{path, size, sha256, exists_locally}]}`
- `POST /api/backup/restore` — restore files from a snapshot: `{snapshot_name, files: [rel_paths]}`; downloads from Dropbox, writes to disk, re-indexes via `scan_media` + `upsert_book`; returns `{ok, restored, total, results: [{path, ok, error?}]}`

---

## Backup & Security
- Dropbox token (refresh token or legacy access token) stored encrypted in `credentials` (`site='dropbox'`).
- Dropbox app key stored encrypted in `credentials` (`site='dropbox_app_key'`).
- Dropbox app secret stored encrypted in `credentials` (`site='dropbox_app_secret'`).
- Dropbox backup root stored encrypted in `credentials` (`site='dropbox_backup_root'`).
- Retention (`snapshots to keep`) stored encrypted in `credentials` (`site='dropbox_backup_retention'`).
- Backup schedule (`enabled` + `interval_hours`) stored encrypted in `credentials` (`site='dropbox_backup_schedule'`).
- Encryption uses `NOVELA_MASTER_KEY` (Fernet).

### Dropbox authentication
- Preferred: OAuth2 refresh token (does not expire). Set up via the two-step flow on `/backup`:
  1. Enter App Key + App Secret → click **Generate Auth URL**
  2. Approve in browser → paste the code → click **Save & Activate**
- `_dbx()` uses `oauth2_refresh_token` + `app_key` + `app_secret` for automatic token renewal.
- Fallback: legacy short-lived access token (backwards compatible; works without app key/secret).

### Implementation details
- Versioned backups with deduplication:
  - file objects in Dropbox: `library_objects/{sha256_prefix}/{sha256}`
  - snapshots in Dropbox: `library_snapshots/snapshot-YYYYMMDD-HHMMSS.json`
- Each run creates a new snapshot version and uploads only missing objects.
- Retention removes older snapshots above the configured limit.
- Orphan object pruning removes objects no longer referenced by retained snapshots.
- Local manifest cache (`config/backup_manifest.json`) speeds up change detection.
- Database backup is done via `pg_dump` to Dropbox `postgres/`.
- `POST /api/backup/run` always starts a background task and returns immediately.
- `GET /api/backup/progress` returns in-memory progress updated per file; phases: `starting` → `scanning` → `uploading` → `snapshot` → `pg_dump`.
- Scheduler runs in the background (`start_backup_scheduler`) and triggers on interval when enabled.
- Concurrency guard: only one backup can run at a time.
- After container restart/crash, stale `running` logs are auto-marked as interrupted/error.
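
The deduplication, retention and orphan-pruning rules above reduce to a few set operations. In this sketch the function names are hypothetical, the two-character `sha256_prefix` is an assumption, and a snapshot is modelled simply as the set of object digests it references.

```python
import hashlib
from datetime import datetime, timezone


def object_key(data):
    """Dropbox key for a file object, derived only from its bytes."""
    digest = hashlib.sha256(data).hexdigest()
    # Assumption: {sha256_prefix} is the first two hex characters.
    return f"library_objects/{digest[:2]}/{digest}"


def snapshot_name(now):
    return now.strftime("library_snapshots/snapshot-%Y%m%d-%H%M%S.json")


def missing_objects(local_digests, remote_digests):
    """Only objects not yet in Dropbox need uploading."""
    return set(local_digests) - set(remote_digests)


def apply_retention(snapshots, keep, remote_digests):
    """Return (kept, removed, orphans) for a {name: digests} mapping."""
    if keep < 1:
        raise ValueError("keep must be at least 1")
    # Timestamped names sort chronologically as plain strings.
    names = sorted(snapshots)
    kept, removed = names[-keep:], names[:-keep]
    referenced = set()
    for name in kept:
        referenced |= set(snapshots[name])
    # Orphans: stored objects that no retained snapshot points at.
    orphans = set(remote_digests) - referenced
    return kept, removed, orphans
```

Pruning is computed from the retained snapshots only, so an object shared between an old and a new snapshot survives when the old one is removed.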

---

## Environment
`stack/novela.env` should include at least:
- `POSTGRES_DB`
- `POSTGRES_USER`
- `POSTGRES_PASSWORD`
- `NOVELA_MASTER_KEY`
- `CONFIG_DIR`

Dropbox settings are managed via the web UI on `/backup`.

---

## Branding

Static assets in `static/`:

| File | Size | Purpose |
|------|------|---------|
| `logo.png` | 546×575, transparent | Sidebar wordmark (displayed at 26px height) |
| `favicon.ico` | 16×16 | Browser tab (legacy) |
| `favicon-32.png` | 32×32 | Browser tab (modern) |
| `favicon-256.png` | 256×256 | Pinned tabs / high-DPI |
| `apple-touch-icon.png` | 180×180 | iOS/iPadOS home screen icon |

All 15 page templates include:
```html
<link rel="icon" href="/static/favicon.ico" sizes="16x16"/>
<link rel="icon" type="image/png" sizes="32x32" href="/static/favicon-32.png"/>
<link rel="icon" type="image/png" sizes="256x256" href="/static/favicon-256.png"/>
<link rel="apple-touch-icon" sizes="180x180" href="/static/apple-touch-icon.png"/>
```

Sidebar logo: `logo.png` (26px, flex-aligned) next to the "No**vela**" wordmark ("No" in `--text`, "vela" in `--accent`).
`apple-touch-icon.png` uses `#0f0e0c` background (= `--bg`) with the orange N logo centered at 60% of canvas size.

---
|
||
|
||
## Shared CSS (`static/theme.css`)
|
||
|
||
Single `:root { }` block defining all global CSS custom properties. Loaded first on every page (`<link rel="stylesheet" href="/static/theme.css"/>`). No template defines its own global colours — only page-specific layout vars stay inline.
|
||
|
||
| Variable | Value | Role |
|
||
|---|---|---|
|
||
| `--bg` | `#0f0e0c` | Page background |
|
||
| `--surface` | `#1a1815` | Card/panel background |
|
||
| `--surface2` | `#221f1b` | Nested surface |
|
||
| `--border` | `#2e2a24` | Borders |
|
||
| `--accent` | `#ffa20e` | Orange highlight (logo colour) |
|
||
| `--accent2` | `#ffb840` | Lighter orange |
|
||
| `--text` | `#e8e2d9` | Body text |
|
||
| `--text-dim` | `#8a8278` | Muted text |
|
||
| `--text-faint` | `#4a453e` | Very muted text |
|
||
| `--success` | `#6baa6b` | Success state |
|
||
| `--warning` | `#c8a03a` | Warning state |
|
||
| `--error` | `#c85a3a` | Error state |
|
||
| `--radius` | `6px` | Border radius |
|
||
| `--sidebar` | `220px` | Sidebar width |
|
||
| `--mono` | `'DM Mono', monospace` | Monospace font stack |
|
||
| `--serif` | `'Libre Baskerville', Georgia, serif` | Serif font stack |
|
||
|
||
Page-specific overrides: `reader.html` (`--header-h`, `--footer-h`, `--content-w`); `backup.html` (`--ok`, `--warn`, `--err`); `editor.css` (`--danger`, `--header-h`, `--panel-w`).

## Shared JavaScript (`static/books.js`)

Loaded before any page-specific script on every page that needs book data or UI helpers.

| Function | Purpose |
|---|---|
| `esc(s)` | HTML-escape a string for safe insertion into markup |
| `strHash(s)` | Deterministic integer hash of a string (for colour selection) |
| `COVER_PALETTES` | Array of 8 `[bg, fg]` colour pairs for placeholder covers |
| `wrapText(ctx, text, x, y, maxW, lineH)` | Canvas word-wrap helper |
| `truncate(s, n)` | Truncate string with ellipsis |
| `makePlaceholderCover(canvas, title, author)` | Draw a generated book cover on a `<canvas>` |
| `_filenameBase(filename)` | Strip path and extension from a filename |
| `bookTitle(b)` | Return display title (falls back to filename parsing) |
| `bookAuthor(b)` | Return display author (falls back to filename parsing) |
| `tagValuesByType(b, type)` | Return tag strings of a given type from `b.tags` |
| `bookGenres(b)` | Tags of type `genre`; falls back to `subject` |
| `bookSubgenres(b)` | Tags of type `subgenre` |
| `bookPlainTags(b)` | Tags of type `tag` |
| `filterBooks(books, query)` | Filter book list by query across title, author, publisher, genre, sub-genre, tag |
| `setupSearchInput(inputId, clearId, onSearch)` | Wire input: show/hide clear button on input; call `onSearch(query)` on Enter |

## Shared JavaScript (`static/conversion.js`)

Loaded by `index.html` (Convert page) and `grabber.html` (Grabber page). Requires `books.js` for `esc()`.

| Function | Purpose |
|---|---|
| `addLog(msg, cls)` | Append a log line to `#log-lines` |
| `connectConversionStream(job_id)` | Open SSE stream `/events/{job_id}` and handle all conversion events: `status`, `meta`, `chapters`, `progress`, `warning`, `error`, `done` |

## UI Notes
- Library import accepts EPUB/PDF/CBR/CBZ.
- Home supports the same import formats.
- Home includes search.
- Home header/dropzone alignment matches Library (search top-right, dropzone below).
- `New` view supports `Grid` and `List` mode.
  - Bulk selection + `Remove from New` works only in `List` mode.
  - `List` mode has a column visibility filter: Publisher, Author, Series, Volume, Title, Has cover, Updated, Genres, Sub-genres, Tags, Status.
  - `List` mode supports multi-select with `Shift+click` range selection on checkboxes.
  - `Grid` mode shows no selection checkboxes or bulk actions.
- `All books` view supports `Grid` and `List` mode (same columns as `New`).
  - View mode persisted in `localStorage` as `novela.all.viewMode`.
  - Column visibility persisted in `localStorage` as `novela.all.visibleColumns`.
  - `List` mode has a checkbox column, column visibility filter, and multi-select with `Shift+click` range selection.
  - `List` mode has a `Delete selected` bulk action: confirms then calls `DELETE /library/file/{filename}` for each selected book.
- Publication status values: `Complete`, `Ongoing`, `Temporary Hold`, `Long-Term Hold` (blank = unknown). `Hiatus` was renamed to `Long-Term Hold` via startup migration `migrate_rename_hiatus()`.
- Status badges (top-right of grid card cover): circular icon, dark fill `rgba(15,14,12,0.82)` + `box-shadow: 0 0 0 2px #0f0e0c` ring for visibility on any cover colour. Icon colour per status: Complete=green `#6baa6b`, Ongoing=blue `#4a90b8`, Temporary Hold=amber `#c8a03a`, Long-Term Hold=orange `#c8783a`. `statusBadgeHtml()` in `library.js` is the single source for badge HTML across all grid views.
- Want-to-read star (top-left of grid card cover): same dark fill + ring as status badges.
- Status pills in Book Detail (`book.css`): `status-complete`, `status-ongoing`, `status-temporary-hold`, `status-long-term-hold`, using the same colour scheme as badges.
- Grabber status mapping (`grabber.py`): `Temporary-Hold` (gayauthors.org) → `Temporary Hold`; `Long-Term Hold` passes through unchanged.
- Star ratings (1–5) shown under the cover in all grid views:
  - Display-only in grid cards (no click, prevents accidental taps while scrolling).
  - Interactive in Book Detail (1.1rem, clickable; clicking the active star clears the rating).
  - Amber: filled `#c8a03a`, unfilled `rgba(200, 160, 58, 0.25)`.
- Reader settings (hamburger menu):
  - Content width slider (30–100 vw), persisted as `reader-content-width-pct`.
  - Font size slider (80–150%, default 105%), persisted as `reader-font-size`; applied via `--reader-font-size` CSS custom property on `#chapter-content`.
  - Text colour: 5 warm-tone presets `#e8e2d9` → `#938d86`, persisted as `reader-text-colour`.
- Hamburger and back-link separated with `margin-left: 1rem` on `.header-back`.
- Reader supports EPUB, PDF, and CBR/CBZ:
  - EPUB: chapter-text rendering; progress = `{chapterIndex}:{scrollFrac}`; progress % = `(chapterIndex + scrollFrac) / total * 100`.
  - PDF: page-image rendering via `/library/pdf/{filename}?page=N`; page count from `/api/pdf/info/{filename}`; progress = `{pageIndex}:0`; keyboard/button navigation identical.
  - `reader.html` branches on `FORMAT` variable injected by the server.
- Series navigation: on load, `loadSeriesNav()` fetches `/api/series-nav/{filename}` and activates prev/next volume buttons in the header (hidden when no series); `markRead()` redirects to `/library/read/{next.filename}` when a next volume exists, otherwise to the book detail page.
- `Edit EPUB` button in Book Detail is only shown for `.epub` files.
- Backup page supports: manual run, dry-run, Dropbox root, retention count, schedule (on/off + hours), status + history.
- Bookmarks: saved per book via `POST /library/bookmarks/{filename}`; shown in Library sidebar section; navigated via `?bm_ch=N&bm_scroll=F` URL params on reader page.
- Convert page: after loading metadata, if a book with the same title+author already exists in the library, a warning banner is shown (with a link to the existing book); user can still proceed with conversion. Check is done server-side in `/preload` response (`already_exists`, `existing_books`).
- Authors view (`#authors`): lists all authors across `allBooks` (active + archived); authors whose books are all archived still appear. Sidebar counter (`count-authors`) counts only active-book authors. Author detail view (`#authors/{name}`) also uses `allBooks`; archived books show the `.badge-archived` overlay on their cover.
- Publishers view (`#publishers`): same rule: `allBooks` (active + archived); publishers with only archived books still appear. Sidebar counter uses active books only. Publisher detail also uses `allBooks`.
- Series detail view (`#series/{name}`): shows all books in a series as a cover grid. Header contains an "Archive series" / "Unarchive series" button that calls `POST /library/archive-series` to set `archived` for every book in the series at once; the button label reflects whether any book is still active.
- Duplicates view (`#duplicates`): groups non-archived books by `(title, author)` (case-insensitive); shows only groups with ≥ 2 copies; counter in sidebar shows total number of duplicate books. Detection is entirely client-side from the existing library data.
- Incomplete view (`#incomplete`): shows all non-archived books where `publication_status` is not `Complete` (Ongoing, Temporary Hold, Long-Term Hold, or blank); sidebar counter included.
- Following page (`/following`): dedicated page in its own sidebar section between Library and Tools; shows all library authors with their external URL; two tabs: Following (authors with URL set) and All Authors; inline URL editing with keyboard support (Enter = save, Escape = cancel); clicking Visit opens the external URL in a new tab. Author URLs are stored in the `authors` table. Sidebar counter shows number of followed authors.
- Book Builder (`/builder`): create EPUB books from scratch; drafts stored in `builder_drafts` (JSONB chapters); contenteditable editor with toolbar (bold/italic/underline/blockquote/author-note/scene-break/normalize); autosave every 30 s + Ctrl+S; publish normalizes HTML via `normalize_wysiwyg_html()` and builds EPUB via `build_epub()`.
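
The EPUB progress format from the reader notes above (`{chapterIndex}:{scrollFrac}` and the percentage formula) can be illustrated with a short sketch. The function name `epub_progress_percent` is hypothetical, and the clamping to 0–100 and the handling of an empty book are assumptions rather than documented behaviour.

```python
def epub_progress_percent(progress, total_chapters):
    """Convert a '{chapterIndex}:{scrollFrac}' string into a percentage.

    Applies (chapterIndex + scrollFrac) / total * 100 as stated in the notes.
    Clamping and the zero-chapter guard are assumptions of this sketch.
    """
    chapter_str, _, frac_str = progress.partition(":")
    chapter_index = int(chapter_str)
    scroll_frac = float(frac_str) if frac_str else 0.0
    if total_chapters <= 0:
        return 0.0
    percent = (chapter_index + scroll_frac) / total_chapters * 100
    return max(0.0, min(100.0, percent))
```

For example, a reader halfway through the third chapter (index 2) of a ten-chapter book would be stored as `2:0.5`, which this formula maps to 25%.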

---

## Develop Mode

When enabled, every page shows a diagonal **DEVELOP** ribbon in the top-left corner and the browser tab title becomes **Novela Develop — …** instead of **Novela — …**.

- Persisted in `app_settings` table (single row, `id = 1`); created by `migrate_create_app_settings()`.
- `shared_templates._develop_mode()` reads this value from DB on every template render and is registered as a Jinja2 global (`develop_mode`), so all templates can use `{% if develop_mode() %}` without explicit context injection.
- Banner CSS lives in `static/sidebar.css` (`.develop-banner` / `.develop-banner-text`); rendered at the top of `templates/_sidebar.html`.
- Toggled via the **Develop mode** card on the Settings page (`/settings`); saving reloads the page so the banner and title take effect immediately.

---

## Known Conventions
- Book deletion flow: `unlink` file → `prune_empty_dirs(parent)` → `DELETE FROM library` (cascade removes child rows).
- Empty dir pruning: `prune_empty_dirs(start)` walks up from `start` to `LIBRARY_ROOT`, removing each dir if empty; stops at first non-empty dir.
- Cover strategy:
  - EPUB: `GET /library/cover/{filename}` checks `library_cover_cache` first; on miss, extracts from ZIP and warms the cache. Cover upload (`POST /library/cover/{filename}`) replaces the image inside the EPUB ZIP (OPF located via `META-INF/container.xml`, old cover found in manifest and removed) and updates the cache so subsequent requests return the new cover immediately.
  - PDF: first page rendered as thumbnail, cached
  - CBR/CBZ: first page extracted, cached
- Rating storage:
  - EPUB: `<meta name="novela:rating" content="N"/>` in OPF
  - CBZ: `<NovelaRating>N</NovelaRating>` in `ComicInfo.xml` inside the ZIP
  - CBR/PDF: DB only
  - `upsert_book` uses `CASE WHEN EXCLUDED.rating > 0 THEN EXCLUDED.rating ELSE library.rating END` to restore rating from file without overwriting existing DB value.
- Tag types in `book_tags`: `genre`, `subgenre`, `tag`, `subject`. No direct `genres`/`subgenres` fields on book objects; always use helpers `bookGenres()`, `bookSubgenres()`, `bookPlainTags()`.
- `series_volume` (e.g. `"1982"`) is used for annual comic series where issue numbers restart each year. It is separate from `series_index` (issue number within the year) and `series_suffix` (letter variant like `"a"`). Stored in DB and EPUB OPF (`novela:series_volume`); not reflected in the file path. Sort order: `series → series_volume → series_index → series_suffix`. In `getSeriesSlots`, gap-detection runs per volume independently when any book has `series_volume` set; slot labels show as `(year) #index`.
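
The empty-directory pruning rule in the list above can be sketched as follows. This is a minimal illustration and not the project's implementation: the real `prune_empty_dirs(start)` takes only `start` and uses `LIBRARY_ROOT`, whereas this version takes the root as an explicit `root` parameter so it can run standalone. Never removing the root itself and returning the list of removed directories are assumptions of the sketch.

```python
from pathlib import Path


def prune_empty_dirs(start, root):
    """Walk up from `start` towards `root`, removing each empty directory.

    Stops at the first non-empty directory. The root itself is never
    removed (an assumption of this sketch). Returns the removed paths.
    """
    removed = []
    root = Path(root).resolve()
    current = Path(start).resolve()
    # Only touch directories strictly below the root.
    while current != root and root in current.parents:
        if not current.is_dir() or any(current.iterdir()):
            break  # missing or non-empty: stop walking up
        current.rmdir()
        removed.append(current)
        current = current.parent
    return removed
```

In the deletion flow described above, this would be called with the parent directory of the file that was just unlinked.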

---

## Performance Notes
- Library load is optimized for large datasets (1000+ books):
  - `list_library_json()` uses `json_agg` in the main query to inline tags per book, which eliminates a separate `SELECT * FROM book_tags` query and Python merge loop.
  - `has_cached_cover` is provided directly via SQL join instead of full cache fetch.
  - `reading_sessions` is pre-aggregated in a subquery.
  - ETag on `/api/library`: cheap `COUNT + MAX(updated_at)` query before full load; `304 Not Modified` on cache hit.
  - Front-end rendering uses `IntersectionObserver` to defer both cover image loading and placeholder canvas drawing until cards enter the viewport, which prevents hundreds of simultaneous HTTP requests and canvas operations on initial render.
  - `renderBooksGrid`, `renderDuplicatesView`, `renderSeriesDetail` all use a single DOM pass: cover `<img>` and `<canvas>` are set up via `card.querySelector` immediately after `innerHTML` is set, eliminating a second full iteration with `document.getElementById` calls.
- Additional migration indexes:
  - `idx_library_sort_coalesce`
  - `idx_library_needs_review`
  - `idx_library_archived`
  - `idx_reading_sessions_filename_readat`
  - `idx_book_tags_filename_tag`
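
The ETag check on `/api/library` mentioned in the list above can be sketched roughly as follows. This is an illustration under stated assumptions, not the actual handler: `library_etag` and `respond` are hypothetical names, the hashing scheme is invented for the example, and the count and latest `updated_at` are passed in as arguments instead of being queried from PostgreSQL.

```python
import hashlib
from typing import Callable, Optional


def library_etag(book_count: int, max_updated_at: Optional[str]) -> str:
    """Derive a quoted ETag from the cheap COUNT + MAX(updated_at) values."""
    raw = f"{book_count}:{max_updated_at or ''}"
    return '"' + hashlib.sha256(raw.encode("utf-8")).hexdigest()[:16] + '"'


def respond(if_none_match: Optional[str], book_count: int,
            max_updated_at: Optional[str], load_full: Callable[[], list]):
    """Return (status, headers, body); the full load is skipped on a match."""
    etag = library_etag(book_count, max_updated_at)
    if if_none_match == etag:
        return 304, {"ETag": etag}, None  # client copy is still current
    return 200, {"ETag": etag}, load_full()  # expensive path
```

The point of the design is that `load_full` (the expensive library query) only runs when the count or the latest timestamp has changed since the client's last request.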

---

## DB-Stored Books

Books scraped via the grabber are stored entirely in PostgreSQL (`storage_type = 'db'`). No EPUB file is written.

### New tables

| Table | Key columns | Notes |
|---|---|---|
| `book_chapters` | `filename FK, chapter_index, title, content TEXT, content_tsv TSVECTOR` | Unique on `(filename, chapter_index)`; GIN index on `content_tsv` for FTS; `content_tsv` is `to_tsvector('simple', title \|\| ' ' \|\| stripped_html)` (title included for title-based FTS matches) |
| `book_images` | `sha256 PK, ext, media_type, size_bytes` | Content-addressed; files live at `library/images/{sha256[:2]}/{sha256}{ext}` |

### `library.storage_type`

| Value | Meaning |
|---|---|
| `'file'` | Book lives on disk (EPUB/PDF/CBR/CBZ); default for all existing books |
| `'db'` | Book content lives in `book_chapters`; no file on disk |

### Synthetic filename for DB books

`db/{publisher}/{author}/{title}`, or for series: `db/{publisher}/{author}/Series/{series}/{idx:03d} - {title}`

Same sanitization rules as file-based paths. Uniqueness enforced via `ensure_unique_db_filename` (DB lookup, not filesystem).
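
The two filename patterns above can be sketched as follows. All names here (`sanitize_segment`, `make_db_filename`, `ensure_unique`) are hypothetical, and the exact set of stripped characters and the `_2`, `_3` suffix scheme are assumptions; only the patterns and the length limits (publisher/author 80, title 140, series 80) come from this document.

```python
import re


def sanitize_segment(value, max_len):
    """Assumed sanitiser: drop chars outside word/space/dot/hyphen, spaces -> '_'."""
    cleaned = re.sub(r"[^\w\s.-]", "", value).strip()
    cleaned = re.sub(r"\s+", "_", cleaned)
    return cleaned[:max_len]


def make_db_filename(publisher, author, title, series=None, series_index=None):
    """Build the synthetic db/... name following the documented patterns."""
    parts = ["db", sanitize_segment(publisher, 80), sanitize_segment(author, 80)]
    name = sanitize_segment(title, 140)
    if series:
        idx = series_index or 0
        parts += ["Series", sanitize_segment(series, 80), f"{idx:03d} - {name}"]
    else:
        parts.append(name)
    return "/".join(parts)


def ensure_unique(candidate, exists):
    """Append _2, _3, ... until `exists(name)` is False (suffix scheme assumed)."""
    name, counter = candidate, 2
    while exists(name):
        name = f"{candidate}_{counter}"
        counter += 1
    return name
```

Here `exists` stands in for the database lookup; in a test it can simply be the `__contains__` method of a set of known filenames.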

### Chapter editor for DB books

`GET /library/editor/{filename}` supports DB-stored books. The Monaco editor shows `language: 'html'` for DB books (vs `'xml'` for EPUB). The header shows a title input instead of a read-only chapter name. Unsaved content and titles are preserved across chapter switches via `pendingContent` and `pendingTitles` maps. `editor.focus()` is called after every content load so the editor is immediately interactive.

### Imagestore

Images embedded in chapter HTML are stored content-addressed at `library/images/{sha256[:2]}/{sha256}{ext}`.
- Served via `GET /library/db-images/{path:path}`
- URLs embedded in `book_chapters.content` as absolute paths: `/library/db-images/...`
- `book_images` table registers each unique image (auto-deduplication via sha256)
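
The content-addressed layout described above can be sketched as follows. `image_store_path` and `store_image` are hypothetical names used for illustration (the document names `write_image_file` but does not give its signature), and the example leaves out the `book_images` row and the URL that would be embedded in chapter HTML.

```python
import hashlib
from pathlib import Path
from typing import Tuple

IMAGES_DIR = Path("library") / "images"


def image_store_path(data: bytes, ext: str, images_dir: Path = IMAGES_DIR) -> Tuple[str, Path]:
    """Return (sha256 hex digest, target path) for the given image bytes."""
    digest = hashlib.sha256(data).hexdigest()
    # Layout from this document: {sha256[:2]}/{sha256}{ext}
    return digest, images_dir / digest[:2] / f"{digest}{ext}"


def store_image(data: bytes, ext: str, images_dir: Path = IMAGES_DIR) -> str:
    """Write the image once; identical bytes map to the same file (deduplication)."""
    digest, path = image_store_path(data, ext, images_dir)
    if not path.exists():
        path.parent.mkdir(parents=True, exist_ok=True)
        path.write_bytes(data)
    return digest
```

Because the path is derived purely from the bytes, storing the same image from two different chapters or books results in a single file on disk.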

### EPUB → DB conversion

`POST /api/library/convert-to-db/{filename}` converts an on-disk EPUB to `storage_type='db'`:
1. Parse EPUB spine → per item: extract body HTML via `_epub_body_inner`, store images in imagestore via `write_image_file`, rewrite `img[src]` to `/library/db-images/…`
2. Compute new synthetic `db/…` filename via `make_rel_path(media_type="db", …)` + `ensure_unique_db_filename`
3. DB transaction: INSERT new library row (storage_type='db') → UPDATE all child tables (book_tags, reading_progress, reading_sessions, bookmarks, library_cover_cache, book_chapters) → DELETE old library row
4. Delete EPUB file from disk + `prune_empty_dirs`

### DB → EPUB export

`GET /api/library/export-epub/{filename}` streams an EPUB built from DB content:
1. Query metadata, tags, chapters, cover from DB
2. Per chapter: `_rewrite_db_images_for_epub` strips `/library/db-images/` prefix, reads files from `IMAGES_DIR`, deduplicates by sha256, assigns `OEBPS/Images/{sha256}{ext}` paths, rewrites `img[src]` to `../Images/…`
3. Build EPUB via `make_epub()`; return as `Content-Disposition: attachment`
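
The image rewrite in step 2 of the export can be sketched as follows. This is not the actual `_rewrite_db_images_for_epub`: the function name and signature below are hypothetical, a regular expression is used here where the real code may use an HTML parser, and the sketch only records which files would need to be copied instead of reading them from `IMAGES_DIR`. It also assumes the final path component of each URL is the `{sha256}{ext}` file name.

```python
import re
from posixpath import basename

DB_IMAGE_PREFIX = "/library/db-images/"

# Matches <img ... src="/library/db-images/<rel-path>"> (double-quoted src only).
_IMG_SRC_RE = re.compile(
    r'(<img\b[^>]*?\bsrc=")' + re.escape(DB_IMAGE_PREFIX) + r'([^"]+)(")'
)


def rewrite_db_images_for_epub(html, assets):
    """Rewrite image URLs to ../Images/... and collect the files to embed.

    `assets` maps the EPUB-internal path (OEBPS/Images/{name}) to the path
    relative to the image store; repeated images are registered only once.
    """
    def _swap(match):
        rel_path = match.group(2)
        name = basename(rel_path)
        assets.setdefault(f"OEBPS/Images/{name}", rel_path)
        return f"{match.group(1)}../Images/{name}{match.group(3)}"

    return _IMG_SRC_RE.sub(_swap, html)
```

Passing the same `assets` dictionary for every chapter gives book-wide deduplication, since an image used in several chapters maps to the same `OEBPS/Images/…` key.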

---

## Known Bugs Fixed
- `renderGenreView` and `renderSearchResults` in `library.js` referenced `b.genres` (non-existent). Fixed: use `bookGenres()`, `bookSubgenres()`, `bookPlainTags()`.
- `PillInput` in `book.js` did not handle comma as delimiter and did not flush on save. Fixed: comma keydown + `flush()` in `saveEdit()`.
- `PillInput._add` in `book.js` added a pasted comma-separated list as one tag instead of splitting it. Fixed: `_add` now splits the value on commas and pushes each trimmed, non-empty, non-duplicate part individually.
- `PATCH /library/book` failed for PDFs: `_sync_epub_metadata` tried to open PDF as ZIP. Fixed: only called for `.epub`.
- `_make_rel_path` in `reader.py` lacked format prefix (`epub/`, `pdf/`, `comics/`). Fixed: aligned with `common.make_rel_path`.
- `common.make_rel_path` always generated `.cbr` extension for CBZ files (both map to `media_type="cbr"`). Fixed: accepts optional `ext` parameter; `library.py` import now passes actual suffix.
- `/download/{filename}` was referenced in `book.html` but no endpoint existed (404). Fixed: added `GET /download/{filename}` to `library.py`.
- PDF reader showed infinite loading: `reader.html` called EPUB-only `/library/chapters/`. Fixed: PDF path uses `/api/pdf/info/` + page-image rendering.
- Empty dir pruning only ran when file was moved. Fixed: `prune_empty_dirs(old_path.parent)` always runs after a successful metadata save.