# Novela 2.0 - Technical Status (Develop)
## Scope
This document describes the current technical status of the `develop` codebase.
It is the primary technical reference for the current implementation.
## Architecture
- Stack: FastAPI, Jinja2 templates, plain JavaScript, PostgreSQL 16, Docker.
- Startup lifecycle (`main.py`):
1. `init_pool()`
2. `run_migrations()`
3. `start_backup_scheduler()`
4. mount routers
- Shutdown lifecycle:
1. `stop_backup_scheduler()`
2. `close_pool()`
- Source-of-truth rule: files on disk are authoritative, the database is an index/cache.
## File Storage Paths
All files are stored under `library/` (relative to the app working directory, mapped via Docker volume).
`LIBRARY_DIR = Path("library")`, `LIBRARY_ROOT = LIBRARY_DIR.resolve()`.
### Path structure per format
| Format | Path pattern |
|--------|-------------|
| EPUB (no series) | `library/epub/{publisher}/{author}/Stories/{title}.epub` |
| EPUB (series) | `library/epub/{publisher}/{author}/Series/{series}/{idx:03d} - {title}.epub` |
| PDF | `library/pdf/{publisher}/{author}/{title}.pdf` |
| CBR (no series) | `library/comics/{publisher}/{author}/{title}.cbr` |
| CBR (series) | `library/comics/{publisher}/{author}/Series/{series}/{idx:03d} - {title}.cbr` |
| CBZ (no series) | `library/comics/{publisher}/{author}/{title}.cbz` |
| CBZ (series) | `library/comics/{publisher}/{author}/Series/{series}/{idx:03d} - {title}.cbz` |
- Segments are sanitised: special chars stripped, max lengths applied (publisher/author 80, title 140, series 80).
- Series index is zero-padded to 3 digits (`001`, `002`, …), clamped to 1–999.
- Duplicate filenames get a `(2)`, `(3)`, … suffix.
- After any file move, empty parent directories are pruned up to `LIBRARY_ROOT`.
### Path logic
- `common.make_rel_path(media_type, publisher, author, title, series, series_index, ext)` — used by import and grabber.
- `reader.py _make_rel_path(publisher, author, title, series, series_index, ext)` — used by metadata PATCH; same logic, uses actual file extension.
- Both functions produce identical paths for all formats.
### Metadata save behaviour per format
| Format | File written? | DB written? |
|--------|--------------|-------------|
| EPUB | Yes — OPF metadata updated in-place | Yes |
| PDF | No | Yes |
| CBR | No | Yes |
| CBZ | No (tags/metadata); rating written to ComicInfo.xml | Yes |
---
## Router Status
### `routers/library.py`
- `GET /library` — library page
- `GET /api/library` — book list JSON (fast-path by default)
- `POST /library/rescan` — forced full disk rescan
- `POST /library/import` — upload EPUB/PDF/CBR/CBZ
- `DELETE /library/file/{filename}` — delete file + DB row + prune dirs
- `GET /download/{filename}` — download file with `Content-Disposition: attachment`
- `GET /library/cover/{filename}` — serve cover (EPUB from file; PDF/CBR from cache)
- `GET /library/cover-cached/{filename}` — serve cover from DB cache only
- `POST /library/cover/{filename}` — upload/replace cover (EPUB only)
- `POST /library/want-to-read/{filename}` — toggle want-to-read flag
- `POST /library/archive/{filename}` — toggle archived flag
- `POST /library/new/mark-reviewed` — bulk set `needs_review=false`
- `POST /library/bulk-delete` — delete multiple files; accepts `{"filenames": [...]}`, removes files from disk and DB in one query per batch; returns `{ok, deleted, skipped}`
- `POST /library/rating/{filename}` — set/clear star rating `{"rating": 0-5}`
- `GET /home` — home page
- `GET /api/home` — home data JSON
- `GET /stats` — statistics page
- `GET /api/stats` — statistics data JSON
- `GET /api/disk` — partition usage for the library directory: `{total, used, free, pct_used}`
- `POST /api/bulk-check-duplicates` — accepts `{"items": [{title, author, volume}, ...]}`, returns `{"duplicates": [bool, ...]}` — when `volume` is a number, requires title+author+series_index to all match; when volume is absent, matches on title+author only
- `GET /library/list` — compat alias
`GET /api/library` runs in fast-path mode by default (DB-only, no full disk rescan).
For a forced sync: `GET /api/library?rescan=true` or `POST /library/rescan`.
`include_file_info=true` is optional for file size/mtime enrichment.
ETag caching: response includes `ETag: "{count}-{max_updated_at_unix}"` and `Cache-Control: no-cache`. Client sends `If-None-Match`; server returns `304 Not Modified` when nothing changed.
`/api/home` returns:
- `continue_reading`
- `shorts_unread`
- `novels_unread`
- `shorts_read`
- `novels_read`
`/api/stats` returns totals plus chart/history data for `stats.html`:
- `reads_by_month`, `reads_by_dow`, `reads_by_hour`
- `genre_counts`, `publisher_counts`, `fav_genre`, `fav_publisher`
- `top_books`, `history`
Home sections exclude series books via:
- `COALESCE(series, '') = ''`
- `filename NOT LIKE '%/Series/%'`
Home read sections are ordered oldest-first:
- `shorts_read`: `ORDER BY MAX(read_at) ASC`
- `novels_read`: `ORDER BY MAX(read_at) ASC`
### `routers/reader.py`
- `GET /library/db-images/{path:path}` — serve image from content-addressed imagestore (`library/images/`); security: path must be under `IMAGES_DIR`
- `POST /api/library/convert-to-db/{filename:path}` — convert on-disk EPUB to a DB-stored book; extracts chapters via `_epub_body_inner` (stores images in imagestore, rewrites src to `/library/db-images/…`), migrates all child tables (INSERT new library row → UPDATE children → DELETE old row), deletes EPUB file; returns `{ok, new_filename}`
- `GET /api/library/export-epub/{filename:path}` — build and stream an EPUB from a DB-stored book; `_rewrite_db_images_for_epub` rewrites `/library/db-images/…` back to `OEBPS/Images/…` paths (dedup by sha256); returns as `Content-Disposition: attachment`
- `GET /library/epub/{filename}` — serve EPUB inline (no attachment header)
- `GET /library/chapters/{filename}` — EPUB spine as JSON; for `storage_type='db'` books returns chapters from `book_chapters`
- `GET /library/chapter/{index}/{filename}` — single chapter as HTML fragment; for `storage_type='db'` books reads from `book_chapters`
- `GET /library/chapter-img/{path}?filename=…` — image extracted from EPUB ZIP; `path` is the full internal ZIP path (e.g. `OEBPS/Images/cover.jpg` or `EPUB/images/cover.jpg`); case-insensitive fallback for mismatched folder names
- `GET /library/pdf/{filename}?page=N&dpi=150` — render PDF page as PNG
- `GET /api/pdf/info/{filename}` — `{"page_count": N}`
- `GET /library/cbr/{filename}/{page}` — CBR/CBZ page as image
- `GET /library/progress/{filename}` — read progress
- `POST /library/progress/{filename}` — save progress `{"cfi": "…", "progress": N}`
- `DELETE /library/progress/{filename}` — clear progress
- `POST /library/mark-read/{filename}` — mark as read (with optional date)
- `GET /library/book/{filename}` — book detail page
- `GET /api/genres` — all tags from `book_tags` (optional `?type=genre|subgenre|tag`)
- `PATCH /library/book/{filename}` — update metadata + tags; moves file if path fields change; DB-only for non-EPUB; for `storage_type='db'` books: recomputes synthetic `db/…` filename, FK-safe rename (INSERT→UPDATE children→DELETE old), updates `book_chapters` + `bookmarks` as well
- `POST /library/rating/{filename}` — set/clear 1–5 star rating; writes to EPUB OPF / CBZ ComicInfo.xml; DB-only for CBR/PDF
- `GET /library/read/{filename}` — reader page (EPUB or PDF); supports `?bm_ch=N&bm_scroll=F` to jump to bookmark position
- `GET /library/bookmarks/{filename}` — list bookmarks for a book
- `POST /library/bookmarks/{filename}` — add bookmark `{chapter_index, scroll_frac, chapter_title, note}`
- `PATCH /library/bookmarks/{id}` — update bookmark note
- `DELETE /library/bookmarks/{id}` — delete bookmark
- `GET /api/bookmarks` — all bookmarks across all books (includes `book_title`, `book_author`)
### `routers/bulk_import.py`
- `GET /bulk-import` — Bulk Import page
- `POST /library/bulk-import` — import files with pre-parsed metadata; accepts multipart `files[]`, `rows` (JSON array of per-file metadata), `shared` (JSON with author/publisher/status/genres/tags applied to all files)
Filename parsing is done client-side in `bulk_import.html`. The page uses a free-text `%placeholder%` pattern (e.g. `%series% - %volume% - %title% - %year%`). Available placeholders: `%series%` `%volume%` `%title%` `%year%` `%month%` `%day%` `%author%` `%publisher%` `%ignore%`. Colored chips can be clicked (insert at cursor) or dragged onto the input. Pattern is converted to a regex at parse time. Shared metadata fields override filename-parsed values. Files are uploaded in batches of 5 with a progress bar.
### `routers/editor.py`
- `GET /library/editor/{filename}` — chapter editor page; supports both EPUB files and DB-stored books (`db/…` filenames); passes `is_db` flag to template; DB branch queries `library` table directly (no file check)
- `GET /api/edit/chapter/{index}/{filename}` — get chapter content; DB branch reads from `book_chapters` and returns `{index, href, title, content}`
- `POST /api/edit/chapter/{index}/{filename}` — save chapter; DB branch accepts `{content, title}`, calls `upsert_chapter` (updates `content_tsv` too)
- `POST /api/edit/chapter/add/{filename}` — add new chapter after `after_index`; DB branch shifts `chapter_index` up via `UPDATE … SET chapter_index = chapter_index + 1 WHERE chapter_index >= insert_idx` then inserts
- `DELETE /api/edit/chapter/{index}/{filename}` — delete chapter; DB branch deletes and re-indexes via `UPDATE … SET chapter_index = chapter_index - 1 WHERE chapter_index > index`
### `routers/grabber.py`
- `GET /grabber` — grabber page
- `GET /convert` — convert page
- `GET /credentials-manager` — credentials manager UI
- `GET /debug` — debug page
- `POST /debug/run` — run debug scrape
- `GET /credentials` — list stored credentials
- `POST /credentials` — save credential
- `DELETE /credentials/{site}` — delete credential
- `POST /preload` — preload book info from URL
- `POST /convert` — run scrape; body may include `storage_mode: "db"` (default) or `"epub"` to control output format
- `GET /events/{job_id}` — SSE stream for job progress; `done` event includes `storage_type` (`'db'` or `'file'`)
Scrape/convert flow (DB storage — default):
1. Fetch book info + chapters via scraper
2. Per chapter: download images → write to `library/images/{sha2}/{sha256}{ext}` (content-addressed) → rewrite `img[src]` to `/library/db-images/...` → build `content_html` via `element_to_xhtml`
3. One DB transaction: `ensure_unique_db_filename` → `upsert_book` (storage_type='db') → `upsert_chapter` for each chapter → `upsert_cover_cache` if cover provided
4. Synthetic filename: `db/{publisher}/{author}/{title}` (or `db/{pub}/{auth}/Series/{series}/{idx} - {title}` for series)
Scrape/convert flow (EPUB file — `storage_mode: "epub"`):
1–2. Same as DB flow (images downloaded, HTML built)
3. Chapters converted to XHTML via `make_chapter_xhtml`; EPUB file built via `make_epub` and written to `library/epub/…`
4. `upsert_book` called with `storage_type='file'`
### `routers/search.py`
- `GET /search` — full-text search page (`search.html`); Enter-to-search, `?q=` param auto-runs on load
- `GET /api/search?q=…` — FTS over `book_chapters.content_tsv`; uses `plainto_tsquery('simple', q)` with `ts_rank` ordering and `ts_headline` for highlighted snippets; also matches chapters whose `title` contains the query (case-insensitive `ILIKE` fallback); LIMIT 30; excludes archived books; results include `filename`, `title`, `author`, `chapter_index`, `chapter_title`, `snippet`, `rank`
### `routers/settings.py`
- `GET /settings` — settings page
- `GET /api/break-patterns` — list chapter-break patterns
- `POST /api/break-patterns` — add break pattern (type: `regex` or `css_class`)
- `PATCH /api/break-patterns/{id}` — update pattern (enable/disable or change value)
- `DELETE /api/break-patterns/{id}` — delete pattern
- `DELETE /api/reading-history` — wipe all reading sessions
### `routers/builder.py`
- `GET /builder` — Book Builder index (draft list + new draft form)
- `POST /builder` — create new draft; redirects to `/builder/{id}`
- `GET /builder/{draft_id}` — draft editor page
- `DELETE /api/builder/{draft_id}` — delete draft
- `GET /api/builder/{draft_id}` — draft JSON (id, title, author, publisher, source_url, chapters)
- `POST /api/builder/{draft_id}/chapter` — add chapter `{title, after_index}`; returns `{index, count}`
- `PUT /api/builder/{draft_id}/chapter/{idx}` — save chapter `{title?, content?}`
- `DELETE /api/builder/{draft_id}/chapter/{idx}` — delete chapter; returns `{index, count}`
- `POST /api/builder/{draft_id}/normalize/{idx}` — normalize chapter HTML (preview only, does not save); returns `{content}`
- `POST /api/builder/{draft_id}/publish` — normalize all chapters → `build_epub()` → write to `library/epub/` → `upsert_book()` → delete draft; returns `{filename}`; redirects browser to `/library/book/{filename}`
Publish flow: all chapters are run through `normalize_wysiwyg_html()`, then `build_epub()` produces an EPUB 2.0 ZIP. The file path is computed via `make_rel_path(media_type="epub", …)`. The book is inserted into the library with `needs_review=True`. The draft is deleted on success.
### `routers/following.py`
- `GET /following` — Following page (author URL management)
- `GET /api/following` — all distinct library authors with URL (if set), book count, and last-added date
- `POST /api/following/{author_name}` — set or clear URL for an author (empty `url` removes the record)
`GET /api/following` returns one entry per non-archived author:
```json
{ "name": "Author Name", "book_count": 5, "last_added": "2026-03-27T…", "url": "https://…" }
```
URL is stored in the `authors` table (`name` unique, `url`, `created_at`, `updated_at`).
### `routers/backup.py`
- `GET /backup` — backup page
- `GET /api/backup/credentials` — Dropbox settings (includes `app_key_configured` flag)
- `POST /api/backup/credentials` — save Dropbox settings
- `DELETE /api/backup/credentials` — remove all Dropbox credentials
- `POST /api/backup/oauth/prepare` — save app key + secret, return Dropbox auth URL
- `POST /api/backup/oauth/exchange` — exchange authorization code for refresh token
- `GET /api/backup/health` — Dropbox connectivity check (includes `schedule_enabled`, `schedule_interval_hours`)
- `GET /api/backup/status` — current backup status
- `GET /api/backup/history` — backup run history (last 20)
- `GET /api/backup/progress` — live progress of running backup `{running, done, total, phase}`
- `POST /api/backup/run` — trigger backup (background task)
- `GET /api/backup/snapshots` — list available snapshots `{ok, snapshots: [{name, created_at}]}`
- `GET /api/backup/snapshots/{snapshot_name}/files` — list files in a snapshot with local existence check `{ok, snapshot, files: [{path, size, sha256, exists_locally}]}`
- `POST /api/backup/restore` — restore files from a snapshot: `{snapshot_name, files: [rel_paths]}`; downloads from Dropbox, writes to disk, re-indexes via `scan_media` + `upsert_book`; returns `{ok, restored, total, results: [{path, ok, error?}]}`
---
## Backup & Security
- Dropbox token (refresh token or legacy access token) stored encrypted in `credentials` (`site='dropbox'`).
- Dropbox app key stored encrypted in `credentials` (`site='dropbox_app_key'`).
- Dropbox app secret stored encrypted in `credentials` (`site='dropbox_app_secret'`).
- Dropbox backup root stored encrypted in `credentials` (`site='dropbox_backup_root'`).
- Retention (`snapshots to keep`) stored encrypted in `credentials` (`site='dropbox_backup_retention'`).
- Backup schedule (`enabled` + `interval_hours`) stored encrypted in `credentials` (`site='dropbox_backup_schedule'`).
- Encryption uses `NOVELA_MASTER_KEY` (Fernet).
### Dropbox authentication
- Preferred: OAuth2 refresh token (does not expire). Set up via the two-step flow on `/backup`:
1. Enter App Key + App Secret → click **Generate Auth URL**
2. Approve in browser → paste the code → click **Save & Activate**
- `_dbx()` uses `oauth2_refresh_token` + `app_key` + `app_secret` for automatic token renewal.
- Fallback: legacy short-lived access token (backwards compatible; works without app key/secret).
### Implementation details
- Versioned backups with deduplication:
- file objects in Dropbox: `library_objects/{sha256_prefix}/{sha256}`
- snapshots in Dropbox: `library_snapshots/snapshot-YYYYMMDD-HHMMSS.json`
- Each run creates a new snapshot version and uploads only missing objects.
- Retention removes older snapshots above the configured limit.
- Orphan object pruning removes objects no longer referenced by retained snapshots.
- Local manifest cache (`config/backup_manifest.json`) speeds up change detection.
- Database backup is done via `pg_dump` to Dropbox `postgres/`.
- `POST /api/backup/run` always starts a background task and returns immediately.
- `GET /api/backup/progress` returns in-memory progress updated per file; phases: `starting` → `scanning` → `uploading` → `snapshot` → `pg_dump`.
- Scheduler runs in the background (`start_backup_scheduler`) and triggers on interval when enabled.
- Concurrency guard: only one backup can run at a time.
- After container restart/crash, stale `running` logs are auto-marked as interrupted/error.
---
## Environment
`stack/novela.env` should include at least:
- `POSTGRES_DB`
- `POSTGRES_USER`
- `POSTGRES_PASSWORD`
- `NOVELA_MASTER_KEY`
- `CONFIG_DIR`
Dropbox settings are managed via the web UI on `/backup`.
---
## Branding
Static assets in `static/`:
| File | Size | Purpose |
|------|------|---------|
| `logo.png` | 546×575, transparent | Sidebar wordmark (displayed at 26px height) |
| `favicon.ico` | 16×16 | Browser tab (legacy) |
| `favicon-32.png` | 32×32 | Browser tab (modern) |
| `favicon-256.png` | 256×256 | Pinned tabs / high-DPI |
| `apple-touch-icon.png` | 180×180 | iOS/iPadOS home screen icon |
All 15 page templates include:
```html
```
Sidebar logo: `logo.png` (26px, flex-aligned) next to the "No**vela**" wordmark ("No" in `--text`, "vela" in `--accent`).
`apple-touch-icon.png` uses `#0f0e0c` background (= `--bg`) with the orange N logo centered at 60% of canvas size.
---
## Shared CSS (`static/theme.css`)
Single `:root { }` block defining all global CSS custom properties. Loaded first on every page (``). No template defines its own global colours — only page-specific layout vars stay inline.
| Variable | Value | Role |
|---|---|---|
| `--bg` | `#0f0e0c` | Page background |
| `--surface` | `#1a1815` | Card/panel background |
| `--surface2` | `#221f1b` | Nested surface |
| `--border` | `#2e2a24` | Borders |
| `--accent` | `#ffa20e` | Orange highlight (logo colour) |
| `--accent2` | `#ffb840` | Lighter orange |
| `--text` | `#e8e2d9` | Body text |
| `--text-dim` | `#8a8278` | Muted text |
| `--text-faint` | `#4a453e` | Very muted text |
| `--success` | `#6baa6b` | Success state |
| `--warning` | `#c8a03a` | Warning state |
| `--error` | `#c85a3a` | Error state |
| `--radius` | `6px` | Border radius |
| `--sidebar` | `220px` | Sidebar width |
| `--mono` | `'DM Mono', monospace` | Monospace font stack |
| `--serif` | `'Libre Baskerville', Georgia, serif` | Serif font stack |
Page-specific overrides: `reader.html` (`--header-h`, `--footer-h`, `--content-w`); `backup.html` (`--ok`, `--warn`, `--err`); `editor.css` (`--danger`, `--header-h`, `--panel-w`).
## Shared JavaScript (`static/books.js`)
Loaded before any page-specific script on every page that needs book data or UI helpers.
| Function | Purpose |
|---|---|
| `esc(s)` | HTML-escape a string for safe insertion into markup |
| `strHash(s)` | Deterministic integer hash of a string (for colour selection) |
| `COVER_PALETTES` | Array of 8 `[bg, fg]` colour pairs for placeholder covers |
| `wrapText(ctx, text, x, y, maxW, lineH)` | Canvas word-wrap helper |
| `truncate(s, n)` | Truncate string with ellipsis |
| `makePlaceholderCover(canvas, title, author)` | Draw a generated book cover on a `