Novela 2.0 - Technical Status (Develop)
Scope
This document describes the current technical status of the develop codebase.
It is the primary technical reference for the current implementation.
Architecture
- Stack: FastAPI, Jinja2 templates, plain JavaScript, PostgreSQL 16, Docker.
- All routers import templates from shared_templates.py (a single Jinja2Templates instance). This module registers a develop_mode() callable as a Jinja2 global, making it available in every template without passing it explicitly per route.
- Startup lifecycle (main.py):
  - init_pool()
  - run_migrations()
  - start_backup_scheduler()
  - mount routers
- Shutdown lifecycle:
  - stop_backup_scheduler()
  - close_pool()
- Source-of-truth rule: files on disk are authoritative, the database is an index/cache.
File Storage Paths
All files are stored under library/ (relative to the app working directory, mapped via Docker volume).
LIBRARY_DIR = Path("library"), LIBRARY_ROOT = LIBRARY_DIR.resolve().
Path structure per format
| Format | Path pattern |
|---|---|
| EPUB (no series) | library/epub/{publisher}/{author}/Stories/{title}.epub |
| EPUB (series) | library/epub/{publisher}/{author}/Series/{series}/{idx:03d}_-_{title}.epub |
| PDF | library/pdf/{publisher}/{author}/{title}.pdf |
| CBR (no series) | library/comics/{publisher}/{author}/{title}.cbr |
| CBR (series) | library/comics/{publisher}/{author}/Series/{series}/{idx:03d}_-_{title}.cbr |
| CBZ (no series) | library/comics/{publisher}/{author}/{title}.cbz |
| CBZ (series) | library/comics/{publisher}/{author}/Series/{series}/{idx:03d}_-_{title}.cbz |
- Segments are sanitised: special chars stripped, spaces replaced with _, max lengths applied (publisher/author 80, title 140, series 80).
- Series index is zero-padded to 3 digits (001, 002, …), clamped to 1–999.
- Duplicate filenames get a (2), (3), … suffix.
- After any file move, empty parent directories are pruned up to LIBRARY_ROOT.
Path logic
- common.make_rel_path(media_type, publisher, author, title, series, series_index, series_suffix, ext) — used by import and grabber.
- reader.py _make_rel_path(publisher, author, title, series, series_index, series_suffix, ext) — used by metadata PATCH; same logic, uses actual file extension.
- series_volume is not part of the file path; it is stored in DB and OPF only.
- Both functions produce identical paths for all formats.
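The sanitising and path rules above can be sketched as follows. This is a minimal illustration, not the code in common.py: the helper names sanitise_segment and epub_rel_path are hypothetical, the exact set of "special chars" is an assumption, and the duplicate-suffix and series_suffix handling are left out.

```python
import re
from pathlib import PurePosixPath

# Length limits taken from the rules above.
MAX_LEN = {"publisher": 80, "author": 80, "title": 140, "series": 80}


def sanitise_segment(value: str, kind: str) -> str:
    """Strip special characters, replace spaces with "_", then truncate."""
    # Assumption: "special chars" means anything other than word
    # characters, spaces, "." and "-".
    cleaned = re.sub(r"[^\w .-]", "", value).strip()
    cleaned = re.sub(r"\s+", "_", cleaned)
    return cleaned[: MAX_LEN[kind]]


def epub_rel_path(publisher, author, title, series=None, series_index=None):
    """Build an EPUB path following the table above (hypothetical helper)."""
    base = (
        PurePosixPath("library/epub")
        / sanitise_segment(publisher, "publisher")
        / sanitise_segment(author, "author")
    )
    name = sanitise_segment(title, "title")
    if series:
        # Clamp the index to 1-999 and zero-pad it to three digits.
        idx = min(max(int(series_index or 1), 1), 999)
        return base / "Series" / sanitise_segment(series, "series") / f"{idx:03d}_-_{name}.epub"
    return base / "Stories" / f"{name}.epub"
```

Calling epub_rel_path("Pub", "Ann", "Two", "Saga", 2) yields a path under Series/Saga/ with the 002_-_ prefix, while omitting the series places the file under Stories/.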
Metadata save behaviour per format
| Format | File written? | DB written? |
|---|---|---|
| EPUB | Yes — OPF metadata updated in-place | Yes |
| PDF | No | Yes |
| CBR | No | Yes |
| CBZ | No (tags/metadata); rating written to ComicInfo.xml | Yes |
Router Status
routers/library.py
- GET /library — library page
- GET /api/library — book list JSON (fast-path by default)
- POST /library/rescan — forced full disk rescan
- POST /library/import — upload EPUB/PDF/CBR/CBZ
- DELETE /library/file/{filename} — delete file + DB row + prune dirs
- GET /download/{filename} — download file with Content-Disposition: attachment
- GET /library/cover/{filename} — serve cover (EPUB from file; PDF/CBR from cache)
- GET /library/cover-cached/{filename} — serve cover from DB cache only
- POST /library/cover/{filename} — upload/replace cover; for EPUB files: embeds cover in the EPUB and updates cache; for DB-stored books: stores cover directly in library_cover_cache and sets has_cover = TRUE
- POST /library/want-to-read/{filename} — toggle want-to-read flag
- POST /library/archive/{filename} — toggle archived flag
- POST /library/archive-series — set archived for all books in a series; body: {"series": "…", "archive": true|false}; returns {ok, archived, count}
- POST /library/new/mark-reviewed — bulk set needs_review=false
- POST /library/bulk-delete — delete multiple files; accepts {"filenames": [...]}, removes files from disk and DB in one query per batch; returns {ok, deleted, skipped}
- POST /library/rating/{filename} — set/clear star rating {"rating": 0-5}
- GET /home — home page
- GET /api/home — home data JSON
- GET /stats — statistics page
- GET /api/stats — statistics data JSON
- GET /api/disk — partition usage for the library directory: {total, used, free, pct_used}
- POST /api/bulk-check-duplicates — accepts {"items": [{title, author, series, volume}, ...]}, returns {"duplicates": [bool, ...]}; checks by title+author+series_index, with a fallback check by series+author+series_index (catches duplicates when the title format changed); when volume is absent, matches on title+author only
- GET /library/list — compat alias
GET /api/library runs in fast-path mode by default (DB-only, no full disk rescan).
For a forced sync: GET /api/library?rescan=true or POST /library/rescan.
include_file_info=true is optional for file size/mtime enrichment.
ETag caching: response includes ETag: "{count}-{max_updated_at_unix}" and Cache-Control: no-cache. Client sends If-None-Match; server returns 304 Not Modified when nothing changed.
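A minimal sketch of the ETag check described above, written as plain functions rather than as a FastAPI handler. The function names are hypothetical, and how the server obtains count and max_updated_at is not shown.

```python
from datetime import datetime, timezone


def library_etag(count: int, max_updated_at: datetime) -> str:
    """Format the ETag as "{count}-{max_updated_at_unix}", quotes included."""
    return f'"{count}-{int(max_updated_at.timestamp())}"'


def status_for(if_none_match, etag: str) -> int:
    """Return 304 when the client's cached ETag still matches, else 200."""
    return 304 if if_none_match == etag else 200


# Example: the client replays the ETag it received earlier.
etag = library_etag(3, datetime.fromtimestamp(1700000000, tz=timezone.utc))
print(etag, status_for(etag, etag), status_for(None, etag))
```

Because the tag combines the row count with the newest update timestamp, both a changed book and an added or removed book produce a different value.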
/api/home returns:
- continue_reading
- shorts_unread
- novels_unread
- shorts_read
- novels_read
/api/stats returns totals plus chart/history data for stats.html:
- reads_by_month, reads_by_dow, reads_by_hour
- genre_counts, publisher_counts, fav_genre, fav_publisher
- top_books, history
Home sections exclude series books via:
- COALESCE(series, '') = ''
- filename NOT LIKE '%/Series/%'
Home read sections are ordered oldest-first:
- shorts_read: ORDER BY MAX(read_at) ASC
- novels_read: ORDER BY MAX(read_at) ASC
routers/reader.py
- GET /library/db-images/{path:path} — serve image from content-addressed imagestore (library/images/); security: path must be under IMAGES_DIR
- POST /api/library/convert-to-db/{filename:path} — convert on-disk EPUB to a DB-stored book; extracts chapters via _epub_body_inner (stores images in imagestore, rewrites src to /library/db-images/…), migrates all child tables (INSERT new library row → UPDATE children → DELETE old row), deletes EPUB file; returns {ok, new_filename}
- GET /api/library/export-epub/{filename:path} — build and stream an EPUB from a DB-stored book; _rewrite_db_images_for_epub rewrites /library/db-images/… back to OEBPS/Images/… paths (dedup by sha256); returns as Content-Disposition: attachment
- GET /library/epub/{filename} — serve EPUB inline (no attachment header)
- GET /library/chapters/{filename} — EPUB spine as JSON; for storage_type='db' books returns chapters from book_chapters
- GET /library/chapter/{index}/{filename} — single chapter as HTML fragment; for storage_type='db' books reads from book_chapters
- GET /library/chapter-img/{path}?filename=… — image extracted from EPUB ZIP; path is the full internal ZIP path (e.g. OEBPS/Images/cover.jpg or EPUB/images/cover.jpg); case-insensitive fallback for mismatched folder names
- GET /library/pdf/{filename}?page=N&dpi=150 — render PDF page as PNG
- GET /api/pdf/info/{filename} — {"page_count": N}
- GET /library/cbr/{filename}/{page} — CBR/CBZ page as image
- GET /library/progress/{filename} — read progress
- POST /library/progress/{filename} — save progress {"cfi": "…", "progress": N}
- DELETE /library/progress/{filename} — clear progress
- POST /library/mark-read/{filename} — mark as read (with optional date)
- GET /library/book/{filename} — book detail page
- GET /api/genres — all tags from book_tags (optional ?type=genre|subgenre|tag)
- PATCH /library/book/{filename} — update metadata + tags; moves file if path fields change; DB-only for non-EPUB; for storage_type='db' books: recomputes synthetic db/… filename, FK-safe rename (INSERT → UPDATE children → DELETE old), updates book_chapters + bookmarks as well
- POST /library/rating/{filename} — set/clear 1–5 star rating; writes to EPUB OPF / CBZ ComicInfo.xml; DB-only for CBR/PDF
- GET /library/read/{filename} — reader page (EPUB or PDF); supports ?bm_ch=N&bm_scroll=F to jump to bookmark position
- GET /api/series-nav/{filename} — returns {prev, next} ({filename, title, index, suffix} or null) for the adjacent books in the same series ordered by series_index ASC, series_suffix ASC; used by the reader for series navigation buttons and markRead() redirect
- GET /library/bookmarks/{filename} — list bookmarks for a book
- POST /library/bookmarks/{filename} — add bookmark {chapter_index, scroll_frac, chapter_title, note}
- PATCH /library/bookmarks/{id} — update bookmark note
- DELETE /library/bookmarks/{id} — delete bookmark
- GET /api/bookmarks — all bookmarks across all books (includes book_title, book_author)
routers/bulk_import.py
- GET /bulk-import — Bulk Import page
- POST /library/bulk-import — import files with pre-parsed metadata; accepts multipart files[], rows (JSON array of per-file metadata), shared (JSON with author/publisher/status/genres/tags applied to all files)
Filename parsing is done client-side in bulk_import.html. The page uses a free-text %placeholder% pattern (e.g. %series% - %series_volume% - %volume% - %title% - %year%). Available placeholders: %series% %series_volume% %volume% %title% %year% %month% %day% %author% %publisher% %ignore%. Colored chips can be clicked (insert at cursor) or dragged onto the input. Pattern is converted to a regex at parse time. Shared metadata fields (including "Year/Vol." for series_volume) override filename-parsed values. "Auto-generate titles" checkbox fills empty title cells as Series (Year/Vol) #Number. Skip checkbox is always visible for every row; skipped rows are excluded from import. Files are uploaded in batches of 5 with a progress bar.
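The pattern-to-regex conversion can be illustrated as below. The real parser runs client-side in bulk_import.html, so this Python version is only a sketch of the idea: the function names are hypothetical, the non-greedy matching strategy is an assumption, and a placeholder used twice in one pattern is not handled.

```python
import re

# Placeholder names listed in the paragraph above.
PLACEHOLDERS = {"series", "series_volume", "volume", "title", "year",
                "month", "day", "author", "publisher", "ignore"}


def pattern_to_regex(pattern: str) -> re.Pattern:
    """Turn a %placeholder% pattern into a regex with named groups."""
    parts = []
    # Splitting on a capturing group keeps placeholder names at odd indices.
    for i, token in enumerate(re.split(r"%(\w+)%", pattern)):
        if i % 2 == 0:
            parts.append(re.escape(token))      # literal text between placeholders
        elif token not in PLACEHOLDERS:
            raise ValueError(f"unknown placeholder: %{token}%")
        elif token == "ignore":
            parts.append("(?:.+?)")             # matched but not captured
        else:
            parts.append(f"(?P<{token}>.+?)")   # non-greedy named group
    return re.compile("^" + "".join(parts) + "$")


def parse_filename(pattern: str, stem: str) -> dict:
    """Return the captured fields, or an empty dict when nothing matches."""
    match = pattern_to_regex(pattern).match(stem)
    return match.groupdict() if match else {}
```

With the pattern %series% - %volume% - %title%, a file stem such as "Saga - 03 - The Return" splits into the three named fields; values set in the shared metadata form would then override them.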
routers/editor.py
- GET /library/editor/{filename} — chapter editor page; supports both EPUB files and DB-stored books (db/… filenames); passes is_db flag to template; DB branch queries library table directly (no file check)
- GET /api/edit/chapter/{index}/{filename} — get chapter content; DB branch reads from book_chapters and returns {index, href, title, content}
- POST /api/edit/chapter/{index}/{filename} — save chapter; DB branch accepts {content, title}, calls upsert_chapter (updates content_tsv too)
- POST /api/edit/chapter/add/{filename} — add new chapter after after_index; DB branch shifts chapter_index up via UPDATE … SET chapter_index = chapter_index + 1 WHERE chapter_index >= insert_idx then inserts
- DELETE /api/edit/chapter/{index}/{filename} — delete chapter; DB branch deletes and re-indexes via UPDATE … SET chapter_index = chapter_index - 1 WHERE chapter_index > index
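The index shifting in the DB branch can be tried out with the sketch below. SQLite stands in for PostgreSQL so the example runs on its own; the reduced table layout, the per-filename scoping, and insert_idx = after_index + 1 are assumptions, since the original statements are elided.

```python
import sqlite3

# SQLite stands in for PostgreSQL; the real book_chapters table has more columns.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE book_chapters (filename TEXT, chapter_index INTEGER, title TEXT)")
conn.executemany("INSERT INTO book_chapters VALUES ('db/x', ?, ?)",
                 [(0, "One"), (1, "Two"), (2, "Three")])


def add_chapter(conn, filename, after_index, title):
    insert_idx = after_index + 1  # assumption: new chapter goes right after after_index
    with conn:  # one transaction: shift later chapters up, then insert
        conn.execute("UPDATE book_chapters SET chapter_index = chapter_index + 1 "
                     "WHERE filename = ? AND chapter_index >= ?", (filename, insert_idx))
        conn.execute("INSERT INTO book_chapters VALUES (?, ?, ?)",
                     (filename, insert_idx, title))


def delete_chapter(conn, filename, index):
    with conn:  # one transaction: delete, then close the gap
        conn.execute("DELETE FROM book_chapters WHERE filename = ? AND chapter_index = ?",
                     (filename, index))
        conn.execute("UPDATE book_chapters SET chapter_index = chapter_index - 1 "
                     "WHERE filename = ? AND chapter_index > ?", (filename, index))


def titles(conn, filename):
    rows = conn.execute("SELECT title FROM book_chapters WHERE filename = ? "
                        "ORDER BY chapter_index", (filename,))
    return [row[0] for row in rows]
```

Running both statements inside one transaction keeps the indices contiguous even if the second statement fails.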
routers/grabber.py
- GET /grabber — grabber page
- GET /convert — convert page
- GET /credentials-manager — credentials manager UI
- GET /debug — debug page
- POST /debug/run — run debug scrape
- GET /credentials — list stored credentials
- POST /credentials — save credential
- DELETE /credentials/{site} — delete credential
- POST /preload — preload book info from URL
- POST /convert — run scrape; body may include storage_mode: "db" (default) or "epub" to control output format
- GET /events/{job_id} — SSE stream for job progress; done event includes storage_type ('db' or 'file')
Scrape/convert flow (DB storage — default):
1. Fetch book info + chapters via scraper
2. Per chapter: download images → write to library/images/{sha2}/{sha256}{ext} (content-addressed) → rewrite img[src] to /library/db-images/...; break images replaced with <hr> before element_to_xhtml runs → build content_html via element_to_xhtml with break_img_path="/static/break.png"
3. One DB transaction: ensure_unique_db_filename → upsert_book (storage_type='db') → upsert_chapter for each chapter → upsert_cover_cache if cover provided
4. Synthetic filename: db/{publisher}/{author}/{title} (or db/{pub}/{auth}/Series/{series}/{idx} - {title} for series)
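The content-addressed image path from the DB flow above can be sketched like this. Two points are assumptions: that {sha2} means the first two hex characters of the SHA-256 digest, and that the /library/db-images/ URL mirrors the layout below library/images/. Both helper names are hypothetical.

```python
import hashlib
from pathlib import PurePosixPath


def image_store_path(data: bytes, ext: str) -> PurePosixPath:
    """Derive the storage path from the image bytes alone."""
    digest = hashlib.sha256(data).hexdigest()
    # Assumption: {sha2} is the first two hex characters of the digest.
    return PurePosixPath("library/images") / digest[:2] / f"{digest}{ext}"


def image_url(path: PurePosixPath) -> str:
    """Map a stored image to the URL written into img[src]."""
    # Assumption: the URL mirrors the on-disk layout below library/images/.
    return "/library/db-images/" + str(path.relative_to("library/images"))
```

Since the path depends only on the bytes, the same image downloaded for two chapters or two books resolves to a single file.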
Scrape/convert flow (EPUB file — storage_mode: "epub"):
1–2. Same as DB flow; break_img_path="../Images/break.png" passed to element_to_xhtml
3. Chapters converted to XHTML via make_chapter_xhtml; EPUB file built via make_epub (embeds static/break.png as OEBPS/Images/break.png) and written to library/epub/…
4. upsert_book called with storage_type='file'
Scrapers (scrapers/)
All scrapers inherit BaseScraper and implement matches(url), login(), fetch_book_info(), fetch_chapter(). Registration order in scrapers/__init__.py determines priority (first match wins).
| Scraper | Domain | Login | Notes |
|---|---|---|---|
| ArchiveOfOurOwnScraper | archiveofourown.org | Optional | Uses authenticity token; adult content gate via ?view_adult=true |
| AwesomeDudeScraper | awesomedude.org | No | Chapter discovery via .htm/.html links in same directory; content extracted from largest non-layout block |
| CodeysWorldScraper | codeysworld.org | No | See below |
| GayAuthorsScraper | gayauthors.org | Optional | Genres + subgenres from itemprop="genre" links; tags from ipsTags list |
| IomfatsScraper | iomfats.org | No | See below; requires chapter URL as entry point |
| NiftyNewScraper | new.nifty.org | No | See below; registered before NiftyScraper |
| NiftyScraper | nifty.org (classic) | No | See below; excludes new.nifty.org; category/subcategory stored as tags |
| TedLouisScraper | tedlouis.com | No | Story index URL required as entry point; all pages use ?t=TOKEN routing; chapter links in <ul class="story-index-list"> |
NiftyNewScraper
new.nifty.org is a Next.js RSC application. Pages render proper HTML with semantic markup — no plain-text email format.
- URL normalisation: _to_index_url() strips a trailing /N (chapter index) so any URL (index or chapter) can be passed as entry point. Story URL pattern: /stories/{slug}-{id}.
- fetch_book_info():
  - Title from <h1>; fallback: <title> with - … - Nifty Archive … suffix stripped.
  - Author from <strong itemprop="name"> inside <a href="/authors/{id}">.
  - Publication date from <time itemprop="datePublished" datetime="…">, updated date from <time itemprop="dateModified" datetime="…">; both truncated to YYYY-MM-DD.
  - Tags from all <ul aria-label="Tags"> containers on the page — covers both the story category links (/collections/…) and the AI-generated content tags (/search?query=tags%3A…); deduplicated; genres and subgenres are always empty.
  - Description from <meta name="description">.
  - Chapter list: <a> links matching /stories/{slug}/N collected from page HTML; fallback: regex scan of RSC stream for "index": N values. URLs generated as {index_url}/1 … {index_url}/max.
- fetch_chapter():
  - Content extraction order:
    1. Chapter HTML ({url}): read <article> and collect <p> text
    2. Fallback on same HTML: extract escaped Next payload paragraphs (\u003cp...\u003c/p)
    3. Last fallback ({url}?_rsc=1): parse RSC line format ({hex_id}:{json}) for ["$","p",…] nodes, then escaped paragraph fallback
  - Chapter title uses the precomputed chapter dict title (Chapter N).
  - Lead/tail boilerplate detection for common Nifty intro/donate text. Removed boilerplate is preserved as invisible HTML comments in chapter content:
    - <!-- NIFTY_HIDDEN_LEAD: ... -->
    - <!-- NIFTY_HIDDEN_TAIL: ... -->
  - No email-header stripping and no plain-text line-joining (those are specific to Nifty classic).
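The URL normalisation and chapter URL generation for new.nifty.org can be sketched as below. The function names are illustrative stand-ins for _to_index_url() and the chapter list logic, and the sketch assumes the chapter index is always a purely numeric last path segment.

```python
import re
from urllib.parse import urlsplit, urlunsplit


def to_index_url(url: str) -> str:
    """Strip a trailing /N chapter index; query and fragment are dropped."""
    parts = urlsplit(url)
    # Only a purely numeric last segment is removed, so the "-{id}" that
    # ends /stories/{slug}-{id} is left untouched.
    path = re.sub(r"/\d+$", "", parts.path.rstrip("/"))
    return urlunsplit((parts.scheme, parts.netloc, path, "", ""))


def chapter_urls(index_url: str, max_index: int) -> list[str]:
    """Generate {index_url}/1 through {index_url}/max."""
    return [f"{index_url}/{n}" for n in range(1, max_index + 1)]
```

Normalising first means an index URL and any chapter URL of the same story lead to the same chapter list.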
NiftyScraper
Nifty classic pages are plain-text email submissions wrapped in a <pre> element.
- URL normalisation: _to_index_url() strips the chapter segment so any URL (index or chapter) can be passed as the entry point. Path structure: /nifty/{category}/{subcategory}/{story}/ (index, 4 segments) vs /nifty/{category}/{subcategory}/{story}/{chapter} (chapter, 5 segments).
- fetch_book_info() performs up to 3 extra HTTP requests: chapter 1 (author + publication date), last chapter (updated_date), chapter 2 (boilerplate detection). Author and dates are extracted from the email headers (From:, Date:) embedded at the top of each chapter file. Date is parsed via email.utils.parsedate → YYYY-MM-DD.
- Boilerplate detection: leading paragraphs of chapters 1 and 2 (after email-header strip) are compared using normalised text (lowercase, whitespace collapsed). Consecutive matching paragraphs are recorded as preamble_count and stored in each chapter dict; fetch_chapter() skips them.
- fetch_chapter() pipeline:
  1. Extract <pre> text (fallback: full body text)
  2. Parse Subject: header → store as <!-- Subject: … --> comment in chapter content (invisible in reader, extractable later)
  3. Strip email header block (up to first blank line after Date:/From:/Subject: lines)
  4. Skip first preamble_count paragraphs
  5. Split on blank lines → paragraphs; join hard-wrapped lines within each paragraph with a space
  6. Detect and remove lead/tail boilerplate (common notice/disclaimer/author promo/donate blocks)
  7. Persist removed boilerplate as invisible comments: <!-- NIFTY_HIDDEN_LEAD: ... --> and <!-- NIFTY_HIDDEN_TAIL: ... -->
  8. Scene-break patterns (***, ---, ~~~, • • •, etc.) → <hr/>
  9. Build content_el as a BeautifulSoup <div> of comments + <p> + <hr/> nodes
- Genres/subgenres from URL path: category (e.g. gay → Gay) and subcategory (e.g. young-friends → Young Friends).
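A few of the Nifty classic steps (date parsing, paragraph splitting, line joining, scene breaks) can be sketched with the standard library alone. The function names and the scene-break regex are assumptions; the real scraper builds BeautifulSoup nodes rather than strings, and header stripping and boilerplate detection are left out here.

```python
import html
import re
from email.utils import parsedate

# Assumption: a scene break is a line made of three or more of * - ~ •,
# optionally separated by spaces.
SCENE_BREAK = re.compile(r"(?:[*\-~•]\s*){3,}")


def header_date(value: str):
    """Parse an email Date: header into YYYY-MM-DD, or None if unparseable."""
    parsed = parsedate(value)
    if parsed is None:
        return None
    return f"{parsed[0]:04d}-{parsed[1]:02d}-{parsed[2]:02d}"


def to_blocks(text: str, preamble_count: int = 0) -> list[str]:
    """Split on blank lines, join hard-wrapped lines, map scene breaks to <hr/>."""
    paragraphs = [p for p in re.split(r"\n\s*\n", text.strip()) if p.strip()]
    blocks = []
    for para in paragraphs[preamble_count:]:  # skip the shared preamble
        joined = " ".join(line.strip() for line in para.splitlines() if line.strip())
        if SCENE_BREAK.fullmatch(joined):
            blocks.append("<hr/>")
        else:
            blocks.append(f"<p>{html.escape(joined)}</p>")
    return blocks
```

Joining the wrapped lines before the scene-break test keeps a paragraph that merely contains dashes from being mistaken for a break.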
CodeysWorldScraper
- Entry point: any codeysworld.org URL.
- Title from <h1>; author from <h2> matching "by …" pattern; fallback: URL path segment /{author}/{category}/filename.
- Category from URL path (second-to-last segment, e.g. remembrances → tag "Remembrances").
- Chapter discovery: .htm/.html links in the same directory as the entry URL; audio/image links skipped. No chapter links → single-file story (entry URL is the only chapter).
- fetch_chapter(): removes all <h1>/<h2> headings, back-navigation links, audio links (.mp3), mailto links; falls back to <body> when no content wrapper is found.
IomfatsScraper
All stories by an author are listed on a single author page (/storyshelf/hosted/{author}/). Individual story pages do not exist.
- Entry point must be a chapter URL (/storyshelf/hosted/{author}/{story-folder}/{chapter}.html). Passing the author page URL raises a ValueError with a user-visible message.
- On load: navigates to the author page and scans <div id="content"> for the matching story.
- Two page structures detected:
  - Single story: outer <h3> = book title; chapters are direct <li><a> children of the following <ul>.
  - Multi-part series: outer <h3> = series name; nested <li><h3> = book title per part; chapters in the sub-<ul> matching story_folder.
- Series index extracted from folder name suffix: *-part{N} or *-{N}.
- Publication status from <p><small>[…]</small></p> after the book title heading.
- fetch_chapter(): content from <div id="content">; removes <h2>/<h3> headings, .chapternav divs, div.important footer blocks, anchor-name elements.
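The series index rule for Iomfats folder names can be expressed as one regular expression. This is an illustrative sketch with a hypothetical function name, not the scraper's actual code.

```python
import re


def series_index_from_folder(folder: str):
    """Return N for folder names ending in -part{N} or -{N}, else None."""
    match = re.search(r"-(?:part)?(\d+)$", folder)
    return int(match.group(1)) if match else None
```

A folder such as my-story-part2 gives 2 and my-story-3 gives 3, while a folder without a numeric suffix gives None.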
TedLouisScraper
All pages on tedlouis.com use opaque token-based routing: https://tedlouis.com/?t=<TOKEN>. There are no predictable URL patterns — tokens must be followed from the story index page.
- Entry point must be a story index URL (the page listing all chapters). Passing a chapter URL raises a ValueError with a user-visible message. Detection: story index has <h2 class="story-page-title">, chapter page has <h1 class="story-title">.
- fetch_book_info():
  - Title from direct NavigableString children of <h2 class="story-page-title"> — the element also contains a "Back" button (<a class="btn">) and the author byline (<span class="story-author-by-line">), which are skipped.
  - Author from <span class="story-author-by-line"> <a>.
  - Publication status from <span class="story-status-text"> with "Status: " prefix stripped.
  - Updated date from <span class="story-last-updated"> ("Last Updated: Month D, YYYY") → YYYY-MM-DD.
  - Chapter list from all <ul class="story-index-list"> elements (three columns on the page); relative ?t=TOKEN hrefs resolved to absolute URLs. Order preserved; duplicates removed.
  - No genres, subgenres, tags or description available on the page.
- fetch_chapter(): content from <div id="chapter">; strips <h1 class="story-title">, <h2 class="chapter-title">, div.chapter-copyright-line, and div.chapter-copyright-notice-text blocks. Chapter title refined from <h2 class="chapter-title"> <span>.
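The "Last Updated" conversion can be sketched with datetime.strptime. The function name is hypothetical, str.removeprefix needs Python 3.9 or later, and %B depends on the process locale, so the sketch assumes English month names.

```python
from datetime import datetime


def parse_last_updated(text: str) -> str:
    """Convert "Last Updated: Month D, YYYY" into YYYY-MM-DD."""
    raw = text.strip().removeprefix("Last Updated:").strip()
    # %B expects a full month name in the current locale (English assumed).
    return datetime.strptime(raw, "%B %d, %Y").strftime("%Y-%m-%d")
```

For example, "Last Updated: March 5, 2026" becomes 2026-03-05; text in any other shape raises ValueError, which a scraper would need to catch.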
xhtml.element_to_xhtml() — Comment handling
bs4.Comment objects (a NavigableString subclass) are now emitted as XML comments: <!-- … -->. The -- sequence (illegal inside XML comments) is sanitised to - -. This allows scrapers to embed invisible metadata (e.g. the Nifty Subject: header) in chapter content without it appearing in the rendered reader.
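A minimal sketch of the comment sanitising, assuming a plain string input; the function name is hypothetical and the real code lives in xhtml.element_to_xhtml(). A single replace pass is not enough for runs of three or more hyphens, so the sketch repeats until no "--" remains.

```python
def to_xml_comment(text: str) -> str:
    """Wrap text in an XML comment, breaking up every "--" sequence."""
    # "---" becomes "- --" after one pass, so loop until no pair is left.
    while "--" in text:
        text = text.replace("--", "- -")
    # The surrounding spaces also keep a trailing "-" away from "-->".
    return f"<!-- {text} -->"
```

For example, to_xml_comment("Subject: a--b") produces a comment whose body reads "Subject: a- -b".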
routers/search.py
- GET /search — full-text search page (search.html); Enter-to-search, ?q= param auto-runs on load
- GET /api/search?q=…&mode=phrase|words&filter=all|unread_novels|unread_shorts — FTS over book_chapters.content_tsv
  - mode=phrase (default) uses phraseto_tsquery (words in order); mode=words uses plainto_tsquery (all words present, any order)
  - ts_rank and ts_headline always use plainto_tsquery
  - also matches chapters whose title contains the query (case-insensitive LIKE fallback)
  - no result limit; excludes archived books
  - filter=unread_novels restricts to books with no reading sessions/progress and no Shorts tag; filter=unread_shorts restricts to books with no reading sessions/progress and a Shorts tag
  - results include filename, title, author, chapter_index, chapter_title, snippet, rank
routers/settings.py
- GET /settings — settings page
- GET /api/app-settings — returns {"develop_mode": bool, "break_image_url": str|null}
- PATCH /api/app-settings — accepts {"develop_mode": bool}, persists to app_settings table
- POST /api/app-settings/break-image — multipart file upload (PNG/JPG/WebP); stores image in imagestore + overwrites static/break.png; saves break_image_sha256 + break_image_ext to app_settings; returns {"ok": true, "url": "/library/db-images/…"}
- GET /api/break-patterns — list chapter-break patterns
- POST /api/break-patterns — add break pattern (type: regex or css_class)
- PATCH /api/break-patterns/{id} — update pattern (enable/disable or change value)
- DELETE /api/break-patterns/{id} — delete pattern
- DELETE /api/reading-history — wipe all reading sessions
app_settings table (single row, id = 1): develop_mode BOOLEAN, break_image_sha256 VARCHAR(64), break_image_ext VARCHAR(10).
routers/builder.py
- GET /builder — Book Builder index (draft list + new draft form)
- POST /builder — create new draft; redirects to /builder/{id}
- GET /builder/{draft_id} — draft editor page
- DELETE /api/builder/{draft_id} — delete draft
- GET /api/builder/{draft_id} — draft JSON (id, title, author, publisher, source_url, chapters)
- POST /api/builder/{draft_id}/chapter — add chapter {title, after_index}; returns {index, count}
- PUT /api/builder/{draft_id}/chapter/{idx} — save chapter {title?, content?}
- DELETE /api/builder/{draft_id}/chapter/{idx} — delete chapter; returns {index, count}
- POST /api/builder/{draft_id}/normalize/{idx} — normalize chapter HTML (preview only, does not save); returns {content}
- POST /api/builder/{draft_id}/publish — normalize all chapters → build_epub() → write to library/epub/ → upsert_book() → delete draft; returns {filename}; redirects browser to /library/book/{filename}
Publish flow: all chapters are run through normalize_wysiwyg_html(), then build_epub() produces an EPUB 2.0 ZIP. The file path is computed via make_rel_path(media_type="epub", …). The book is inserted into the library with needs_review=True. The draft is deleted on success.
routers/following.py
- GET /following — Following page (author URL management)
- GET /api/following — all distinct library authors with URL (if set), book count, and last-added date
- POST /api/following/{author_name} — set or clear URL for an author (empty url removes the record)
GET /api/following returns one entry per non-archived author:
{ "name": "Author Name", "book_count": 5, "last_added": "2026-03-27T…", "url": "https://…" }
URL is stored in the authors table (name unique, url, created_at, updated_at).
routers/backup.py
- GET /backup — backup page
- GET /api/backup/credentials — Dropbox settings (includes app_key_configured flag)
- POST /api/backup/credentials — save Dropbox settings
- DELETE /api/backup/credentials — remove all Dropbox credentials
- POST /api/backup/oauth/prepare — save app key + secret, return Dropbox auth URL
- POST /api/backup/oauth/exchange — exchange authorization code for refresh token
- GET /api/backup/health — Dropbox connectivity check (includes schedule_enabled, schedule_interval_hours)
- GET /api/backup/status — current backup status
- GET /api/backup/history — backup run history (last 20)
- GET /api/backup/progress — live progress of running backup {running, done, total, phase}
- POST /api/backup/run — trigger backup (background task)
- GET /api/backup/snapshots — list available snapshots {ok, snapshots: [{name, created_at}]}
- GET /api/backup/snapshots/{snapshot_name}/files — list files in a snapshot with local existence check {ok, snapshot, files: [{path, size, sha256, exists_locally}]}
- POST /api/backup/restore — restore files from a snapshot: {snapshot_name, files: [rel_paths]}; downloads from Dropbox, writes to disk, re-indexes via scan_media + upsert_book; returns {ok, restored, total, results: [{path, ok, error?}]}
Backup & Security
- Dropbox token (refresh token or legacy access token) stored encrypted in credentials (site='dropbox').
- Dropbox app key stored encrypted in credentials (site='dropbox_app_key').
- Dropbox app secret stored encrypted in credentials (site='dropbox_app_secret').
- Dropbox backup root stored encrypted in credentials (site='dropbox_backup_root').
- Retention (snapshots to keep) stored encrypted in credentials (site='dropbox_backup_retention').
- Backup schedule (enabled + interval_hours) stored encrypted in credentials (site='dropbox_backup_schedule').
- Encryption uses NOVELA_MASTER_KEY (Fernet).
Dropbox authentication
- Preferred: OAuth2 refresh token (does not expire). Set up via the two-step flow on /backup:
  1. Enter App Key + App Secret → click Generate Auth URL
  2. Approve in browser → paste the code → click Save & Activate

  _dbx() uses oauth2_refresh_token + app_key + app_secret for automatic token renewal.
- Fallback: legacy short-lived access token (backwards compatible; works without app key/secret).
Implementation details
- Versioned backups with deduplication:
  - file objects in Dropbox: library_objects/{sha256_prefix}/{sha256}
  - snapshots in Dropbox: library_snapshots/snapshot-YYYYMMDD-HHMMSS.json
- Each run creates a new snapshot version and uploads only missing objects.
- Retention removes older snapshots above the configured limit.
- Orphan object pruning removes objects no longer referenced by retained snapshots.
- Local manifest cache (config/backup_manifest.json) speeds up change detection.
- Database backup is done via pg_dump to Dropbox postgres/.
- POST /api/backup/run always starts a background task and returns immediately.
- GET /api/backup/progress returns in-memory progress updated per file; phases: starting → scanning → uploading → snapshot → pg_dump.
- Scheduler runs in the background (start_backup_scheduler) and triggers on interval when enabled.
- Concurrency guard: only one backup can run at a time.
- After container restart/crash, stale running logs are auto-marked as interrupted/error.
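The deduplication and pruning logic described above can be sketched with plain sets. All function names are hypothetical, the two-character {sha256_prefix} is an assumption, and snapshots are modelled as simple path-to-hash dicts rather than the real snapshot JSON.

```python
from datetime import datetime


def object_path(sha256: str) -> str:
    # Assumption: {sha256_prefix} is the first two hex characters.
    return f"library_objects/{sha256[:2]}/{sha256}"


def snapshot_path(now: datetime) -> str:
    """Snapshot name in the snapshot-YYYYMMDD-HHMMSS.json form."""
    return now.strftime("library_snapshots/snapshot-%Y%m%d-%H%M%S.json")


def plan_uploads(local_files: dict, remote_objects: set) -> list:
    """Object paths for hashes not stored remotely yet, once per hash."""
    missing = {h for h in local_files.values() if h not in remote_objects}
    return sorted(object_path(h) for h in missing)


def prune_orphans(remote_objects: set, retained_snapshots: list) -> set:
    """Hashes that no retained snapshot references any more."""
    referenced = {h for snap in retained_snapshots for h in snap.values()}
    return remote_objects - referenced
```

Two files with identical content share one hash and are therefore uploaded once, and an object becomes prunable only after every snapshot that referenced it has been removed by retention.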
Environment
stack/novela.env should include at least:
- POSTGRES_DB
- POSTGRES_USER
- POSTGRES_PASSWORD
- NOVELA_MASTER_KEY
- CONFIG_DIR
Dropbox settings are managed via the web UI on /backup.
Branding
Static assets in static/:
| File | Size | Purpose |
|---|---|---|
| logo.png | 546×575, transparent | Sidebar wordmark (displayed at 26px height) |
| favicon.ico | 16×16 | Browser tab (legacy) |
| favicon-32.png | 32×32 | Browser tab (modern) |
| favicon-256.png | 256×256 | Pinned tabs / high-DPI |
| apple-touch-icon.png | 180×180 | iOS/iPadOS home screen icon |
All 15 page templates include:
<link rel="icon" href="/static/favicon.ico" sizes="16x16"/>
<link rel="icon" type="image/png" sizes="32x32" href="/static/favicon-32.png"/>
<link rel="icon" type="image/png" sizes="256x256" href="/static/favicon-256.png"/>
<link rel="apple-touch-icon" sizes="180x180" href="/static/apple-touch-icon.png"/>
Sidebar logo: logo.png (26px, flex-aligned) next to the "Novela" wordmark ("No" in --text, "vela" in --accent).
apple-touch-icon.png uses #0f0e0c background (= --bg) with the orange N logo centered at 60% of canvas size.
Shared CSS (static/theme.css)
Single :root { } block defining all global CSS custom properties. Loaded first on every page (<link rel="stylesheet" href="/static/theme.css"/>). No template defines its own global colours — only page-specific layout vars stay inline.
| Variable | Value | Role |
|---|---|---|
| --bg | #0f0e0c | Page background |
| --surface | #1a1815 | Card/panel background |
| --surface2 | #221f1b | Nested surface |
| --border | #2e2a24 | Borders |
| --accent | #ffa20e | Orange highlight (logo colour) |
| --accent2 | #ffb840 | Lighter orange |
| --text | #e8e2d9 | Body text |
| --text-dim | #8a8278 | Muted text |
| --text-faint | #4a453e | Very muted text |
| --success | #6baa6b | Success state |
| --warning | #c8a03a | Warning state |
| --error | #c85a3a | Error state |
| --radius | 6px | Border radius |
| --sidebar | 220px | Sidebar width |
| --mono | 'DM Mono', monospace | Monospace font stack |
| --serif | 'Libre Baskerville', Georgia, serif | Serif font stack |
Page-specific overrides: reader.html (--header-h, --footer-h, --content-w); backup.html (--ok, --warn, --err); editor.css (--danger, --header-h, --panel-w).
Shared JavaScript (static/books.js)
Loaded before any page-specific script on every page that needs book data or UI helpers.
| Function | Purpose |
|---|---|
| esc(s) | HTML-escape a string for safe insertion into markup |
| strHash(s) | Deterministic integer hash of a string (for colour selection) |
| COVER_PALETTES | Array of 8 [bg, fg] colour pairs for placeholder covers |
| wrapText(ctx, text, x, y, maxW, lineH) | Canvas word-wrap helper |
| truncate(s, n) | Truncate string with ellipsis |
| makePlaceholderCover(canvas, title, author) | Draw a generated book cover on a <canvas> |
| _filenameBase(filename) | Strip path and extension from a filename |
| bookTitle(b) | Return display title (falls back to filename parsing) |
| bookAuthor(b) | Return display author (falls back to filename parsing) |
| tagValuesByType(b, type) | Return tag strings of a given type from b.tags |
| bookGenres(b) | Tags of type genre; falls back to subject |
| bookSubgenres(b) | Tags of type subgenre |
| bookPlainTags(b) | Tags of type tag |
| filterBooks(books, query) | Filter book list by query across title, author, publisher, genre, sub-genre, tag |
| setupSearchInput(inputId, clearId, onSearch) | Wire input: show/hide clear button on input; call onSearch(query) on Enter |
Shared JavaScript (static/conversion.js)
Loaded by index.html (Convert page) and grabber.html (Grabber page). Requires books.js for esc().
| Function | Purpose |
|---|---|
| addLog(msg, cls) | Append a log line to #log-lines |
| connectConversionStream(job_id) | Open SSE stream /events/{job_id} and handle all conversion events: status, meta, chapters, progress, warning, error, done |
UI Notes
- Library import accepts EPUB/PDF/CBR/CBZ.
- Home supports the same import formats.
- Home includes search.
- Home header/dropzone alignment matches Library (search top-right, dropzone below).
Newview supportsGridandListmode.- Bulk selection +
Remove from Newworks only inListmode. Listmode has a column visibility filter: Publisher, Author, Series, Volume, Title, Has cover, Updated, Genres, Sub-genres, Tags, Status.Listmode supports multi-select withShift+clickrange selection on checkboxes.Gridmode shows no selection checkboxes or bulk actions.
- Bulk selection +
All booksview supportsGridandListmode (same columns asNew).- View mode persisted in
localStorageasnovela.all.viewMode. - Column visibility persisted in
localStorageasnovela.all.visibleColumns. Listmode has a checkbox column, column visibility filter, and multi-select withShift+clickrange selection.Listmode has aDelete selectedbulk action: confirms then callsDELETE /library/file/{filename}for each selected book.
- View mode persisted in
- Publication status values: `Complete`, `Ongoing`, `Temporary Hold`, `Long-Term Hold` (blank = unknown). `Hiatus` was renamed to `Long-Term Hold` via startup migration `migrate_rename_hiatus()`.
- Status badges (top-right of grid card cover): circular icon, dark fill `rgba(15,14,12,0.82)` + `box-shadow: 0 0 0 2px #0f0e0c` ring for visibility on any cover colour. Icon colour per status: Complete=green `#6baa6b`, Ongoing=blue `#4a90b8`, Temporary Hold=amber `#c8a03a`, Long-Term Hold=orange `#c8783a`. `statusBadgeHtml()` in `library.js` is the single source for badge HTML across all grid views.
- Want-to-read star (top-left of grid card cover): same dark fill + ring as status badges.
- Status pills in Book Detail (`book.css`): `status-complete`, `status-ongoing`, `status-temporary-hold`, `status-long-term-hold`, with the same colour scheme as badges.
- Grabber status mapping (`grabber.py`): `Temporary-Hold` (gayauthors.org) → `Temporary Hold`; `Long-Term Hold` passes through unchanged.
- Star ratings (1–5) shown under the cover in all grid views:
  - Display-only in grid cards (no click, prevents accidental taps while scrolling).
  - Interactive in Book Detail (1.1rem, clickable; clicking the active star clears the rating).
  - Amber: filled `#c8a03a`, unfilled `rgba(200, 160, 58, 0.25)`.
- Reader settings (hamburger menu):
  - Content width slider (30–100 vw), persisted as `reader-content-width-pct`.
  - Font size slider (80–150%, default 105%), persisted as `reader-font-size`; applied via `--reader-font-size` CSS custom property on `#chapter-content`.
  - Text colour: 5 warm-tone presets `#e8e2d9` → `#938d86`, persisted as `reader-text-colour`.
  - Hamburger and back-link separated with `margin-left: 1rem` on `.header-back`.
- Reader supports EPUB, PDF, and CBR/CBZ:
  - EPUB: chapter-text rendering; progress = `{chapterIndex}:{scrollFrac}`; progress % = `(chapterIndex + scrollFrac) / total * 100`.
  - PDF: page-image rendering via `/library/pdf/{filename}?page=N`; page count from `/api/pdf/info/{filename}`; progress = `{pageIndex}:0`; keyboard/button navigation identical.
  - `reader.html` branches on `FORMAT` variable injected by the server.
- Series navigation: on load, `loadSeriesNav()` fetches `/api/series-nav/{filename}` and activates prev/next volume buttons in the header (hidden when no series); `markRead()` redirects to `/library/read/{next.filename}` when a next volume exists, otherwise to the book detail page.
- `Edit EPUB` button in Book Detail is only shown for `.epub` files.
- Backup page supports: manual run, dry-run, Dropbox root, retention count, schedule (on/off + hours), status + history.
- Bookmarks: saved per book via `POST /library/bookmarks/{filename}`; shown in Library sidebar section; navigated via `?bm_ch=N&bm_scroll=F` URL params on reader page.
- Convert page: after loading metadata, if a book with the same title+author already exists in the library, a warning banner is shown (with a link to the existing book); user can still proceed with conversion. Check is done server-side in `/preload` response (`already_exists`, `existing_books`).
- Authors view (`#authors`): lists all authors across `allBooks` (active + archived); authors whose books are all archived still appear. Sidebar counter (`count-authors`) counts only active-book authors. Author detail view (`#authors/{name}`) also uses `allBooks`; archived books show the `.badge-archived` overlay on their cover.
- Publishers view (`#publishers`): same rule: `allBooks` (active + archived); publishers with only archived books still appear. Sidebar counter uses active books only. Publisher detail also uses `allBooks`.
- Series detail view (`#series/{name}`): shows all books in a series as a cover grid. Header contains an "Archive series" / "Unarchive series" button that calls `POST /library/archive-series` to set `archived` for every book in the series at once; the button label reflects whether any book is still active.
- Duplicates view (`#duplicates`): groups non-archived books by `(title, author)` (case-insensitive); shows only groups with ≥ 2 copies; counter in sidebar shows total number of duplicate books. Detection is entirely client-side from the existing library data.
- Incomplete view (`#incomplete`): shows all non-archived books where `publication_status` is not `Complete` (Ongoing, Temporary Hold, Long-Term Hold, or blank); sidebar counter included.
- Following page (`/following`): dedicated page in its own sidebar section between Library and Tools; shows all library authors with their external URL; two tabs, Following (authors with URL set) and All Authors; inline URL editing with keyboard support (Enter = save, Escape = cancel); clicking Visit opens the external URL in a new tab. Author URLs are stored in the `authors` table. Sidebar counter shows number of followed authors.
- Book Builder (`/builder`): create EPUB books from scratch; drafts stored in `builder_drafts` (JSONB chapters); contenteditable editor with toolbar (bold/italic/underline/blockquote/author-note/scene-break/normalize); autosave every 30 s + Ctrl+S; publish normalizes HTML via `normalize_wysiwyg_html()` and builds EPUB via `build_epub()`.
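The grouping rule of the Duplicates view can be illustrated with a short sketch. The real detection runs client-side in JavaScript; the Python below only mirrors the rule as described, and the dictionary keys `title`, `author`, and `archived` as well as both function names are assumptions made for illustration.

```python
from collections import defaultdict

def find_duplicates(books):
    """Group non-archived books by case-insensitive (title, author)."""
    groups = defaultdict(list)
    for book in books:
        if book.get("archived"):
            continue  # archived books are excluded from duplicate detection
        key = (book["title"].lower(), book["author"].lower())
        groups[key].append(book)
    # Keep only groups that contain at least two copies.
    return [copies for copies in groups.values() if len(copies) >= 2]

def duplicate_count(books):
    """Sidebar counter: total number of books that sit in a duplicate group."""
    return sum(len(copies) for copies in find_duplicates(books))
```

Counting books rather than groups matches the note that the sidebar shows the total number of duplicate books.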
Develop Mode
When enabled, every page shows a diagonal DEVELOP ribbon in the top-left corner and the browser tab title becomes Novela Develop — … instead of Novela — ….
- Persisted in `app_settings` table (single row, `id = 1`); created by `migrate_create_app_settings()`.
- `shared_templates._develop_mode()` reads this value from DB on every template render and is registered as a Jinja2 global (`develop_mode`), so all templates can use `{% if develop_mode() %}` without explicit context injection.
- Banner CSS lives in `static/sidebar.css` (`.develop-banner` / `.develop-banner-text`); rendered at the top of `templates/_sidebar.html`.
- Toggled via the Develop mode card on the Settings page (`/settings`); saving reloads the page so the banner and title take effect immediately.
Known Conventions
- Book deletion flow: `unlink` file → `prune_empty_dirs(parent)` → `DELETE FROM library` (cascade removes child rows).
- Empty dir pruning: `prune_empty_dirs(start)` walks up from `start` to `LIBRARY_ROOT`, removing each dir if empty; stops at first non-empty dir.
- Cover strategy:
  - EPUB: `GET /library/cover/{filename}` checks `library_cover_cache` first; on miss, extracts from ZIP and warms the cache. Cover upload (`POST /library/cover/{filename}`) replaces the image inside the EPUB ZIP (OPF located via `META-INF/container.xml`, old cover found in manifest and removed) and updates the cache so subsequent requests return the new cover immediately.
  - PDF: first page rendered as thumbnail, cached
  - CBR/CBZ: first page extracted, cached
- Rating storage:
  - EPUB: `<meta name="novela:rating" content="N"/>` in OPF
  - CBZ: `<NovelaRating>N</NovelaRating>` in `ComicInfo.xml` inside the ZIP
  - CBR/PDF: DB only
  - `upsert_book` uses `CASE WHEN EXCLUDED.rating > 0 THEN EXCLUDED.rating ELSE library.rating END` to restore rating from file without overwriting existing DB value.
- Tag types in `book_tags`: `genre`, `subgenre`, `tag`, `subject`. No direct `genres`/`subgenres` fields on book objects; always use helpers `bookGenres()`, `bookSubgenres()`, `bookPlainTags()`.
- `series_volume` (e.g. `"1982"`) is used for annual comic series where issue numbers restart each year. It is separate from `series_index` (issue number within the year) and `series_suffix` (letter variant like `"a"`). Stored in DB and EPUB OPF (`novela:series_volume`); not reflected in the file path. Sort order: `series → series_volume → series_index → series_suffix`. In `getSeriesSlots`, gap-detection runs per volume independently when any book has `series_volume` set; slot labels show as `(year) #index`.
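The empty-directory pruning convention listed above can be sketched as follows. This is a minimal illustration, not the project's actual `prune_empty_dirs`: the `root` parameter is added here so the sketch can be exercised outside the real library, and leaving the root directory itself in place is an assumption.

```python
from pathlib import Path

LIBRARY_ROOT = Path("library").resolve()

def prune_empty_dirs(start: Path, root: Path = LIBRARY_ROOT) -> None:
    """Remove empty directories from `start` upwards, stopping below `root`."""
    current = start.resolve()
    root = root.resolve()
    # Only touch directories strictly below the root (assumption: root is kept).
    while current != root and root in current.parents:
        try:
            current.rmdir()  # succeeds only when the directory is empty
        except OSError:
            break  # first non-empty (or missing) directory ends the walk
        current = current.parent
```

Relying on `rmdir()` failing for non-empty directories avoids a separate emptiness check and the race between checking and removing.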
Performance Notes
- Library load is optimized for large datasets (1000+ books):
  - `list_library_json()` uses `json_agg` in the main query to inline tags per book, which eliminates a separate `SELECT * FROM book_tags` query and Python merge loop.
  - `has_cached_cover` is provided directly via SQL join instead of full cache fetch.
  - `reading_sessions` is pre-aggregated in a subquery.
  - ETag on `/api/library`: cheap `COUNT + MAX(updated_at)` query before full load; `304 Not Modified` on cache hit.
- Front-end rendering uses `IntersectionObserver` to defer both cover image loading and placeholder canvas drawing until cards enter the viewport, which prevents hundreds of simultaneous HTTP requests and canvas operations on initial render.
- `renderBooksGrid`, `renderDuplicatesView`, `renderSeriesDetail` all use a single DOM pass: cover `<img>` and `<canvas>` are set up via `card.querySelector` immediately after `innerHTML` is set, eliminating a second full iteration with `document.getElementById` calls.
- Additional migration indexes:
  - `idx_library_sort_coalesce`
  - `idx_library_needs_review`
  - `idx_library_archived`
  - `idx_reading_sessions_filename_readat`
  - `idx_book_tags_filename_tag`
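The ETag shortcut on `/api/library` can be sketched in a framework-neutral way. The function names, the hashing scheme, and the tuple return shape below are assumptions for illustration; only the idea (derive a validator from `COUNT` and `MAX(updated_at)`, answer `304` when it matches) comes from the notes above.

```python
import hashlib
from datetime import datetime
from typing import Optional

def library_etag(book_count: int, max_updated_at: Optional[datetime]) -> str:
    """Derive a quoted ETag from the two cheap aggregates."""
    stamp = max_updated_at.isoformat() if max_updated_at else "none"
    digest = hashlib.sha256(f"{book_count}:{stamp}".encode("utf-8")).hexdigest()
    return f'"{digest[:16]}"'

def respond(if_none_match, book_count, max_updated_at, load_full):
    """Return (status, body, etag); load_full runs only on a cache miss."""
    etag = library_etag(book_count, max_updated_at)
    if if_none_match == etag:
        return 304, None, etag  # client copy is current, skip the full query
    return 200, load_full(), etag
```

In a real route, `if_none_match` would come from the `If-None-Match` request header and `load_full` would wrap the expensive library query.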
DB-Stored Books
Books scraped via the grabber are stored entirely in PostgreSQL (storage_type = 'db'). No EPUB file is written.
New tables
| Table | Key columns | Notes |
|---|---|---|
| `book_chapters` | `filename` FK, `chapter_index`, `title`, `content TEXT`, `content_tsv TSVECTOR` | Unique on `(filename, chapter_index)`; GIN index on `content_tsv` for FTS; `content_tsv` is `to_tsvector('simple', title …` |
| `book_images` | `sha256` PK, `ext`, `media_type`, `size_bytes` | Content-addressed; files live at `library/images/{sha256[:2]}/{sha256}{ext}` |
library.storage_type
| Value | Meaning |
|---|---|
| `'file'` | Book lives on disk (EPUB/PDF/CBR/CBZ); default for all existing books |
| `'db'` | Book content lives in `book_chapters`; no file on disk |
Synthetic filename for DB books
db/{publisher}/{author}/{title} — or for series: db/{publisher}/{author}/Series/{series}/{idx:03d} - {title}
Same sanitization rules as file-based paths. Uniqueness enforced via ensure_unique_db_filename (DB lookup, not filesystem).
Chapter editor for DB books
GET /library/editor/{filename} supports DB-stored books. The Monaco editor shows language: 'html' for DB books (vs 'xml' for EPUB). The header shows a title input instead of a read-only chapter name. Unsaved content and titles are preserved across chapter switches via pendingContent and pendingTitles maps. editor.focus() is called after every content load so the editor is immediately interactive.
Imagestore
Images embedded in chapter HTML are stored content-addressed at library/images/{sha256[:2]}/{sha256}{ext}.
- Served via `GET /library/db-images/{path:path}`
- URLs embedded in `book_chapters.content` as absolute paths: `/library/db-images/...`
- `book_images` table registers each unique image (auto-deduplication via sha256)
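The content-addressed layout above can be sketched as follows. `store_image` is a hypothetical helper written for this illustration (the project's own `write_image_file` may differ in signature and behaviour), and the `images_dir` parameter is added so the sketch can run against a temporary directory.

```python
import hashlib
from pathlib import Path

IMAGES_DIR = Path("library") / "images"

def store_image(data: bytes, ext: str, images_dir: Path = IMAGES_DIR) -> Path:
    """Write image bytes to {sha256[:2]}/{sha256}{ext} and return the path."""
    sha256 = hashlib.sha256(data).hexdigest()
    target = images_dir / sha256[:2] / f"{sha256}{ext}"
    if not target.exists():
        # Identical bytes always hash to the same path, so each image is stored once.
        target.parent.mkdir(parents=True, exist_ok=True)
        target.write_bytes(data)
    return target
```

Because the path is a pure function of the bytes, storing the same image from two chapters yields one file, which is the deduplication the `book_images` table relies on.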
EPUB → DB conversion
POST /api/library/convert-to-db/{filename} converts an on-disk EPUB to storage_type='db':
- Parse EPUB spine → per item: extract body HTML via `_epub_body_inner`, store images in imagestore via `write_image_file`, rewrite `img[src]` to `/library/db-images/…`
- Compute new synthetic `db/…` filename via `make_rel_path(media_type="db", …)` + `ensure_unique_db_filename`
- DB transaction: INSERT new library row (`storage_type='db'`) → UPDATE all child tables (`book_tags`, `reading_progress`, `reading_sessions`, `bookmarks`, `library_cover_cache`, `book_chapters`) → DELETE old library row
- Delete EPUB file from disk + `prune_empty_dirs`
DB → EPUB export
GET /api/library/export-epub/{filename} streams an EPUB built from DB content:
- Query metadata, tags, chapters, cover from DB
- Per chapter: `_rewrite_db_images_for_epub` strips `/library/db-images/` prefix, reads files from `IMAGES_DIR`, deduplicates by sha256, assigns `OEBPS/Images/{sha256}{ext}` paths, rewrites `img[src]` to `../Images/…`
- Build EPUB via `make_epub()`; return as `Content-Disposition: attachment`
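The per-chapter image rewrite can be illustrated with a rough regex-based sketch. It is not the project's `_rewrite_db_images_for_epub`: the function name, the `manifest` dictionary, and the assumption that the URL tail after `/library/db-images/` ends in `{sha256}{ext}` are all hypothetical, and the sketch only records which files would be packaged rather than reading them from `IMAGES_DIR`.

```python
import re
from pathlib import PurePosixPath

DB_IMAGE_PREFIX = "/library/db-images/"
# Sketch only: matches double-quoted src attributes on <img> tags.
IMG_SRC = re.compile(r'(<img\b[^>]*?\bsrc=")([^"]+)(")', re.IGNORECASE)

def rewrite_images_sketch(html: str, manifest: dict) -> str:
    """Point DB image URLs at ../Images/ and record them in `manifest`."""
    def swap(match):
        src = match.group(2)
        if not src.startswith(DB_IMAGE_PREFIX):
            return match.group(0)  # leave external or relative images untouched
        tail = src[len(DB_IMAGE_PREFIX):]
        name = PurePosixPath(tail).name  # assumed to be "{sha256}{ext}"
        # Keyed by EPUB path, so a repeated image is registered only once.
        manifest[f"OEBPS/Images/{name}"] = tail
        return f"{match.group(1)}../Images/{name}{match.group(3)}"
    return IMG_SRC.sub(swap, html)
```

A production version would more likely use an HTML parser than a regex, since attribute quoting and ordering vary in real chapter markup.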
Known Bugs Fixed
- `renderGenreView` and `renderSearchResults` in `library.js` referenced `b.genres` (non-existent). Fixed: use `bookGenres()`, `bookSubgenres()`, `bookPlainTags()`.
- `PillInput` in `book.js` did not handle comma as delimiter and did not flush on save. Fixed: comma keydown + `flush()` in `saveEdit()`.
- `PillInput._add` in `book.js` added a pasted comma-separated list as one tag instead of splitting it. Fixed: `_add` now splits the value on commas and pushes each trimmed, non-empty, non-duplicate part individually.
- `PATCH /library/book` failed for PDFs: `_sync_epub_metadata` tried to open PDF as ZIP. Fixed: only called for `.epub`.
- `_make_rel_path` in `reader.py` lacked format prefix (`epub/`, `pdf/`, `comics/`). Fixed: aligned with `common.make_rel_path`.
- `common.make_rel_path` always generated `.cbr` extension for CBZ files (both map to `media_type="cbr"`). Fixed: accepts optional `ext` parameter; `library.py` import now passes actual suffix.
- `/download/{filename}` was referenced in `book.html` but no endpoint existed (404). Fixed: added `GET /download/{filename}` to `library.py`.
- PDF reader showed infinite loading: `reader.html` called EPUB-only `/library/chapters/`. Fixed: PDF path uses `/api/pdf/info/` + page-image rendering.
- Empty dir pruning only ran when file was moved. Fixed: `prune_empty_dirs(old_path.parent)` always runs after a successful metadata save.
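The `PillInput._add` fix listed above (split on commas, keep each trimmed, non-empty, non-duplicate part) can be illustrated in a few lines. The real code lives in `book.js`; this Python version is only a sketch of the rule, and the function name and the case-sensitive duplicate check are assumptions.

```python
def add_tags(existing: list, raw: str) -> list:
    """Split a pasted value on commas and append each new, non-empty tag."""
    for part in raw.split(","):
        tag = part.strip()
        if tag and tag not in existing:  # skip blanks and tags already present
            existing.append(tag)
    return existing
```

For example, pasting `"romance, drama, , sci-fi"` into an input that already holds `drama` would add `romance` and `sci-fi` only.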