Ivo Oskamp 92cd301658 Add PDF reader/editor support, fix metadata save and dir cleanup
- PDF reader: page-image rendering via /library/pdf/{filename}?page=N;
  new /api/pdf/info/{filename} endpoint returns page count; reader.html
  branches on FORMAT (epub/pdf) injected by server
- PDF metadata edit: PATCH /library/book now updates DB for all formats;
  _sync_epub_metadata only called for .epub; non-EPUB formats skip file write
- Fix file path on metadata save: _make_rel_path now includes format prefix
  (epub/, pdf/, comics/) matching common.make_rel_path used during import;
  previously files were moved outside their format directory
- Fix empty dir cleanup: prune_empty_dirs always runs after successful
  metadata save, not only when file was moved
- Hide Edit EPUB button for non-EPUB files in book detail
- Docs: TECHNICAL.md and changelog-develop.md updated

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-03-25 08:47:01 +01:00

Novela 2.0 - Technical Status (Develop)

Scope

This document describes the current technical status of the develop codebase. It is the primary technical reference for the current implementation.

Architecture

  • Stack: FastAPI, Jinja2 templates, plain JavaScript, PostgreSQL 16, Docker.
  • Startup lifecycle (main.py):
    1. init_pool()
    2. run_migrations()
    3. start_backup_scheduler()
    4. mount routers
  • Shutdown lifecycle:
    1. stop_backup_scheduler()
    2. close_pool()
  • Source-of-truth rule: files on disk are authoritative; the database is an index/cache.
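
The startup and shutdown ordering above can be sketched as an async context manager of the kind FastAPI accepts through its lifespan parameter. This is a minimal illustration, not the contents of main.py: the five functions are stubs that only record their call order, their real signatures (sync or async) are assumptions, and router mounting is left out.

```python
import asyncio
from contextlib import asynccontextmanager

# Records the order in which lifecycle steps run, for demonstration only.
calls: list[str] = []

# Stand-ins for the functions named in this section. The real signatures
# are not shown in this document, so these stubs are assumptions.
async def init_pool() -> None:
    calls.append("init_pool")

async def run_migrations() -> None:
    calls.append("run_migrations")

def start_backup_scheduler() -> None:
    calls.append("start_backup_scheduler")

def stop_backup_scheduler() -> None:
    calls.append("stop_backup_scheduler")

async def close_pool() -> None:
    calls.append("close_pool")

@asynccontextmanager
async def lifespan(app):
    # Startup: pool first, then migrations, then the backup scheduler.
    await init_pool()
    await run_migrations()
    start_backup_scheduler()
    try:
        yield
    finally:
        # Shutdown: stop the scheduler before the pool goes away.
        stop_backup_scheduler()
        await close_pool()

async def demo() -> list[str]:
    # app=None is enough here because the sketch never touches the app.
    async with lifespan(app=None):
        calls.append("serving")
    return calls
```

Stopping the scheduler before closing the pool means a scheduled backup cannot start against a pool that is already closed.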

Router Status

routers/library.py

  • GET /library
  • GET /api/library
  • POST /library/rescan
  • POST /library/import (EPUB/PDF/CBR/CBZ)
  • DELETE /library/file/{filename}
  • GET /library/cover/{filename}
  • GET /library/cover-cached/{filename}
  • POST /library/cover/{filename} (EPUB)
  • POST /library/want-to-read/{filename}
  • POST /library/archive/{filename}
  • POST /library/new/mark-reviewed (bulk needs_review=false)
  • POST /library/rating/{filename} (set/clear star rating, body: {"rating": 0-5})
  • GET /home
  • GET /api/home
  • GET /stats
  • GET /api/stats
  • GET /library/list (compat)

GET /api/library runs in fast-path mode by default (DB-only, no full disk rescan). To force a sync with disk, call GET /api/library?rescan=true or POST /library/rescan. Pass include_file_info=true to enrich each entry with file size and mtime.
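
As a small client-side illustration of the two query flags, the helper below builds the request URL. The function name and the base URL in the usage note are hypothetical; only the path and the two parameter names come from this document.

```python
from urllib.parse import urlencode

def library_url(base: str, rescan: bool = False, include_file_info: bool = False) -> str:
    """Build the GET /api/library URL; no flags means the DB-only fast path."""
    params: dict[str, str] = {}
    if rescan:
        params["rescan"] = "true"
    if include_file_info:
        params["include_file_info"] = "true"
    query = urlencode(params)
    return f"{base}/api/library" + (f"?{query}" if query else "")
```

For example, library_url("http://localhost:8000", rescan=True) requests a forced sync, while calling it without flags stays on the fast path.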

/api/home returns:

  • continue_reading
  • shorts_unread
  • novels_unread
  • shorts_read
  • novels_read

/api/stats returns totals plus chart/history data for stats.html:

  • reads_by_month, reads_by_dow, reads_by_hour
  • genre_counts, publisher_counts, fav_genre, fav_publisher
  • top_books, history

Home sections exclude series books via:

  • COALESCE(series, '') = ''
  • filename NOT LIKE '%/Series/%'

Home read sections are ordered oldest-first:

  • shorts_read: ORDER BY MAX(read_at) ASC
  • novels_read: ORDER BY MAX(read_at) ASC
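
The series exclusion and oldest-first ordering can be tried out with the sketch below. It uses an in-memory SQLite database only to stay self-contained (the project itself runs on PostgreSQL), and the two-table schema and sample rows are assumptions reduced to the columns named above. The split between shorts and novels is not described here, so it is omitted.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE library (filename TEXT PRIMARY KEY, series TEXT);
CREATE TABLE reading_sessions (filename TEXT, read_at TEXT);
INSERT INTO library VALUES
  ('epub/A/Standalone One.epub', NULL),
  ('epub/A/Standalone Two.epub', ''),
  ('epub/A/Series/Saga 1.epub', NULL),
  ('epub/B/Tagged.epub', 'Saga');
INSERT INTO reading_sessions VALUES
  ('epub/A/Standalone One.epub', '2026-02-01'),
  ('epub/A/Standalone One.epub', '2026-03-01'),
  ('epub/A/Standalone Two.epub', '2026-01-15'),
  ('epub/A/Series/Saga 1.epub', '2026-01-01'),
  ('epub/B/Tagged.epub', '2026-01-02');
""")

# Both filters must pass: no series value, and no /Series/ path segment.
READ_SECTION_SQL = """
SELECT l.filename, MAX(r.read_at) AS last_read
FROM library AS l
JOIN reading_sessions AS r ON r.filename = l.filename
WHERE COALESCE(l.series, '') = ''
  AND l.filename NOT LIKE '%/Series/%'
GROUP BY l.filename
ORDER BY MAX(r.read_at) ASC
"""

rows = conn.execute(READ_SECTION_SQL).fetchall()
filenames = [row[0] for row in rows]
```

The book with a series value and the book stored under a Series directory are both dropped, and the remaining books are sorted by their most recent read date, oldest first.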

routers/reader.py

  • EPUB serving/chapters/images
  • Reader page + book detail
  • Metadata patch (PATCH /library/book/{filename}): updates DB for all formats; writes to file only for EPUB
  • Progress read/write/delete
  • Mark-as-read
  • Star rating (POST /library/rating/{filename}): validates 0-5, writes to file (EPUB OPF / CBZ ComicInfo.xml) and DB; DB-only for CBR/PDF
  • PDF render endpoint (GET /library/pdf/{filename}?page=N&dpi=150) — returns page as PNG
  • PDF info endpoint (GET /api/pdf/info/{filename}) — returns {"page_count": N}
  • CBR/CBZ page endpoint
  • Genres endpoint
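
The rating rule in the list above (accept 0-5, write to the file for EPUB and CBZ, DB-only for CBR and PDF) can be sketched as two small helpers. Both function names and the target labels are hypothetical; the actual handler in routers/reader.py is not shown in this document.

```python
from pathlib import PurePosixPath

def validate_rating(value: object) -> int:
    """Return the rating as an int in 0..5, or raise ValueError."""
    # bool is a subclass of int, so reject it explicitly.
    if isinstance(value, bool) or not isinstance(value, int):
        raise ValueError("rating must be an integer")
    if not 0 <= value <= 5:
        raise ValueError("rating must be between 0 and 5")
    return value

def rating_targets(filename: str) -> list[str]:
    """List where a rating would be written for this file (labels are illustrative)."""
    ext = PurePosixPath(filename).suffix.lower()
    if ext == ".epub":
        return ["opf", "db"]        # OPF metadata inside the EPUB, plus DB
    if ext == ".cbz":
        return ["comicinfo", "db"]  # ComicInfo.xml inside the ZIP, plus DB
    return ["db"]                   # CBR/PDF: DB only
```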

routers/editor.py

  • Editor page
  • Chapter get/save
  • Chapter add
  • Chapter delete

routers/grabber.py

  • Grabber page + convert/debug flows
  • SSE events
  • Credential management for scraper sites
  • Credentials manager UI (/credentials-manager)

routers/backup.py

  • GET /backup
  • GET/POST/DELETE /api/backup/credentials
  • GET /api/backup/health
  • GET /api/backup/status
  • GET /api/backup/history
  • POST /api/backup/run

Backup & Security

  • Dropbox token is stored encrypted-at-rest in credentials (site='dropbox').
  • Dropbox backup root is stored encrypted in credentials (site='dropbox_backup_root').
  • Retention (snapshots to keep) is stored encrypted in credentials (site='dropbox_backup_retention').
  • Backup schedule (enabled + interval_hours) is stored encrypted in credentials (site='dropbox_backup_schedule').
  • Encryption uses NOVELA_MASTER_KEY (Fernet).

Implementation details:

  • Versioned backups with deduplication:
    • file objects in Dropbox: library_objects/{sha256_prefix}/{sha256}
    • snapshots in Dropbox: library_snapshots/snapshot-YYYYMMDD-HHMMSS.json
  • Each run creates a new snapshot version and uploads only missing objects.
  • Retention removes older snapshots above the configured limit.
  • Orphan object pruning removes objects no longer referenced by retained snapshots.
  • Local manifest cache (config/backup_manifest.json) speeds up change detection.
  • Database backup is done via pg_dump to Dropbox postgres/.
  • POST /api/backup/run always starts a background task and returns immediately.
  • Scheduler runs in the background (start_backup_scheduler) and triggers on interval when enabled.
  • Concurrency guard: only one backup can run at a time.
  • After container restart/crash, stale running logs are auto-marked as interrupted/error.
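
The deduplication, retention, and orphan-pruning rules above can be sketched with plain data structures. Everything here is illustrative: the function names are hypothetical, a snapshot is assumed to be a mapping from relative file path to object key, and the two-character prefix length is a guess, since this document does not state how long sha256_prefix is.

```python
import hashlib
from datetime import datetime

PREFIX_LEN = 2  # Assumption: the document does not state the prefix length.

def object_key(data: bytes) -> str:
    """Content-addressed key: identical bytes always map to the same object."""
    digest = hashlib.sha256(data).hexdigest()
    return f"library_objects/{digest[:PREFIX_LEN]}/{digest}"

def snapshot_name(now: datetime) -> str:
    """Snapshot path in the snapshot-YYYYMMDD-HHMMSS.json form."""
    return now.strftime("library_snapshots/snapshot-%Y%m%d-%H%M%S.json")

def objects_to_upload(snapshot: dict[str, str], remote: set[str]) -> set[str]:
    """Only objects that the remote side does not hold yet need uploading."""
    return set(snapshot.values()) - remote

def split_by_retention(names: list[str], keep: int) -> tuple[list[str], list[str]]:
    """Return (kept, removed); timestamped names sort chronologically."""
    ordered = sorted(names)
    cut = max(len(ordered) - max(keep, 0), 0)
    return ordered[cut:], ordered[:cut]

def orphan_objects(retained: list[dict[str, str]], remote: set[str]) -> set[str]:
    """Objects that no retained snapshot references any more."""
    referenced = {key for snap in retained for key in snap.values()}
    return remote - referenced
```

Because object keys depend only on file content, an unchanged file costs nothing on later runs: its key is already present remotely, so objects_to_upload leaves it out.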

Environment

stack/novela.env should include at least:

  • POSTGRES_DB
  • POSTGRES_USER
  • POSTGRES_PASSWORD
  • NOVELA_MASTER_KEY
  • CONFIG_DIR

Dropbox settings are managed via the web UI on /backup.
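
A startup check for the required variables listed above could look like the sketch below. The helper name is hypothetical, and whether the application performs such a check is not stated in this document.

```python
import os
from collections.abc import Mapping

REQUIRED_VARS = (
    "POSTGRES_DB",
    "POSTGRES_USER",
    "POSTGRES_PASSWORD",
    "NOVELA_MASTER_KEY",
    "CONFIG_DIR",
)

def missing_env(env: Mapping[str, str] = os.environ) -> list[str]:
    """Return the required variables that are unset or empty, in list order."""
    return [name for name in REQUIRED_VARS if not env.get(name)]
```

Calling missing_env() with no argument inspects the process environment; passing a dict makes the check easy to exercise in isolation.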

UI Notes

  • Library import accepts EPUB/PDF/CBR/CBZ.
  • Home supports the same import formats.
  • Home includes search.
  • Home header/dropzone alignment matches Library (search top-right, dropzone below).
  • New view supports Grid and List mode.
  • Bulk selection + Remove from New works only in List mode.
  • List mode has a column visibility filter with columns:
    • Publisher
    • Author
    • Series
    • Volume
    • Title
    • Has cover
    • Updated
    • Genres
    • Sub-genres
    • Tags
    • Status
  • List mode supports multi-select with Shift+click range selection on checkboxes.
  • Grid mode shows no selection checkboxes or bulk actions.
  • All books view supports Grid and List mode (same columns as New, no selection/bulk actions).
    • View mode persisted in localStorage as novela.all.viewMode.
    • Column visibility persisted in localStorage as novela.all.visibleColumns.
  • Star ratings (1-5) are shown under the cover in all grid views (Library, Home):
    • Display-only in grid cards (no click handler, prevents accidental taps).
    • Interactive in Book Detail (1.1rem, clickable; clicking the active star clears the rating).
    • Amber color: filled #c8a03a, unfilled rgba(200, 160, 58, 0.25).
  • Reader has a text colour setting in the hamburger menu:
    • 5 presets from #e8e2d9 (bright) to #938d86 (dim), persisted in localStorage as reader-text-colour.
    • Hamburger and back-link are visually separated with margin-left: 1rem on .header-back.
  • Backup page supports:
    • manual run and dry-run
    • Dropbox root settings
    • snapshot retention count
    • scheduled backup (on/off + interval in hours)
    • status + history overview
  • Reader supports EPUB and PDF:
    • EPUB: chapter-text rendering (existing flow)
    • PDF: page-image rendering via /library/pdf/{filename}?page=N; page count fetched from /api/pdf/info/{filename}; progress tracked per page; keyboard/button navigation identical to EPUB
    • reader.html branches on FORMAT variable injected by the server
  • Edit EPUB button in Book Detail is only shown for .epub files.

Known Bugs Fixed

  • renderGenreView and renderSearchResults in library.js referenced b.genres (non-existent field on the book object). All tag data lives in b.tags as {tag, tag_type} objects; the correct helpers are bookGenres(), bookSubgenres(), bookPlainTags().
  • PillInput in book.js did not handle comma as a delimiter and did not flush pending input on save. Fixed by adding a comma keydown handler and calling flush() in saveEdit().
  • PATCH /library/book/{filename} failed for PDFs: _sync_epub_metadata tried to open the PDF as a ZIP, throwing an exception that aborted the entire save (including the DB update). Fixed by only calling _sync_epub_metadata when ext == ".epub".
  • _make_rel_path in reader.py lacked the format prefix (epub/, pdf/, comics/) used by common.make_rel_path, causing files to be moved outside their format directory on metadata save. Fixed by aligning the path logic: EPUB → epub/{publisher}/{author}/…, PDF → pdf/{author}/{title}.pdf, CBR/CBZ → comics/{author}/{title}{ext}.
  • PDF reader showed an infinite loading state: reader.html always called /library/chapters/{filename} (EPUB-only) and tried to render chapter text. The PDF reader now fetches the page count and renders page images.
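
The format-prefix fix for _make_rel_path can be illustrated for the two layouts that are spelled out in full above. This is a hypothetical stand-in, not the function from reader.py: it does no sanitizing of author or title, and it deliberately leaves the EPUB branch unimplemented because that layout is only given in abbreviated form here.

```python
from pathlib import PurePosixPath

def make_rel_path_sketch(ext: str, author: str, title: str) -> str:
    """Build a library-relative path that starts with the format directory."""
    ext = ext.lower()
    if ext == ".pdf":
        return str(PurePosixPath("pdf") / author / f"{title}.pdf")
    if ext in (".cbr", ".cbz"):
        return str(PurePosixPath("comics") / author / f"{title}{ext}")
    # The EPUB layout continues past epub/{publisher}/{author}/ and is not
    # fully specified in this document, so it is not reproduced here.
    raise NotImplementedError(f"layout for {ext!r} is not covered by this sketch")
```

Keeping the prefix inside one helper means a move on metadata save and the original import resolve to the same directory tree.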

Known Conventions

  • Book deletion flow: delete file, prune empty directories, then DELETE FROM library (cascade removes child rows).
  • Cover strategy:
    • EPUB: cover from file + cache
    • PDF/CBR: thumbnail via cover cache
  • Rating storage:
    • EPUB: <meta name="novela:rating" content="N"/> in OPF
    • CBZ: <NovelaRating>N</NovelaRating> in ComicInfo.xml inside the ZIP
    • CBR/PDF: DB only
    • upsert_book uses CASE WHEN EXCLUDED.rating > 0 THEN EXCLUDED.rating ELSE library.rating END, so a rating stored in the file is restored on rescan, while a file without a rating (0) leaves the existing DB value untouched
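
The effect of the CASE expression in upsert_book can be seen in the sketch below. It runs against in-memory SQLite (3.24 or newer, for ON CONFLICT support) purely to stay self-contained, and the two-column table is an assumption; the real upsert_book statement covers more columns and targets PostgreSQL.

```python
import sqlite3

UPSERT_SQL = """
INSERT INTO library (filename, rating) VALUES (?, ?)
ON CONFLICT (filename) DO UPDATE SET
  rating = CASE WHEN EXCLUDED.rating > 0
                THEN EXCLUDED.rating
                ELSE library.rating END
"""

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE library (filename TEXT PRIMARY KEY, rating INTEGER NOT NULL DEFAULT 0)"
)

def upsert(filename: str, file_rating: int) -> int:
    """Upsert a scanned book and return the rating stored afterwards."""
    conn.execute(UPSERT_SQL, (filename, file_rating))
    row = conn.execute(
        "SELECT rating FROM library WHERE filename = ?", (filename,)
    ).fetchone()
    return row[0]

def set_db_rating(filename: str, rating: int) -> None:
    """Simulate a rating that exists only in the database (CBR/PDF case)."""
    conn.execute("UPDATE library SET rating = ? WHERE filename = ?", (rating, filename))
```

A rescan of a PDF always supplies rating 0, so without the CASE guard every rescan would reset DB-only ratings.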

Performance Notes

  • Library load is optimized for large datasets:
    • list_library_json() uses pre-aggregation for reading_sessions.
    • has_cached_cover is provided directly via SQL join instead of full cache fetch.
  • Additional migration indexes:
    • idx_library_sort_coalesce
    • idx_library_needs_review
    • idx_library_archived
    • idx_reading_sessions_filename_readat
    • idx_book_tags_filename_tag
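
Pre-aggregation for reading_sessions, as mentioned for list_library_json(), generally means collapsing the sessions to one row per book before joining, instead of joining every session row and grouping afterwards. The sketch below shows that pattern on in-memory SQLite; the schema, the aggregated columns, and the query itself are assumptions and not the project's actual SQL.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE library (filename TEXT PRIMARY KEY);
CREATE TABLE reading_sessions (filename TEXT, read_at TEXT);
INSERT INTO library VALUES ('epub/a.epub'), ('pdf/b.pdf');
INSERT INTO reading_sessions VALUES
  ('epub/a.epub', '2026-02-01'),
  ('epub/a.epub', '2026-03-01');
""")

# The subquery yields at most one row per filename, so the outer join
# cannot multiply library rows no matter how many sessions a book has.
PREAGG_SQL = """
SELECT l.filename,
       COALESCE(rs.read_count, 0) AS read_count,
       rs.last_read
FROM library AS l
LEFT JOIN (
    SELECT filename, COUNT(*) AS read_count, MAX(read_at) AS last_read
    FROM reading_sessions
    GROUP BY filename
) AS rs ON rs.filename = l.filename
ORDER BY l.filename
"""

rows = conn.execute(PREAGG_SQL).fetchall()
```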