Compare commits

21 Commits

Author SHA1 Message Date
kikootwo 1711d256c2 Merge pull request #173 from MattiasC/feature/bulk-import-folder-fallback
Bulk import enhancement: group tagless files by folder and use folder name as search fallback
2026-05-14 16:15:41 -04:00
kikootwo 8376355233 Merge branch 'main' into feature/bulk-import-folder-fallback
Resolves conflicts in src/lib/integrations/audible.service.ts.

main switched the ASIN-detail fallback from HTML scraping to the JSON
catalog API (fetchAudibleDetailsFromApi), removing scrapeAudibleDetails.
The PR's lookupAsinFast was a fail-fast variant of the same pattern that
getAudiobookDetails now performs (Audnexus -> catalog API), so it's
redundant.

- Drop the lookupAsinFast method (delete entire HEAD-side conflict block)
- Take main's fetchAudibleDetailsFromApi verbatim (the scrapeAudibleDetails
  maxRetries parameterization is moot)
- In bulk-import scan route, swap lookupAsinFast for getAudiobookDetails

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-14 16:14:25 -04:00
kikootwo d1a980e210 Enhance download-torrent test mocks
Update tests/processors/download-torrent.processor.test.ts to better mock dependencies used by processDownloadTorrent. Add jobQueueMock.addNotificationJob.mockResolvedValue(undefined) to avoid unmocked job queue calls, and change prismaMock.request.update.mockResolvedValue from an empty object to include { type: 'audiobook', user: { plexUsername: 'testuser' } } in the affected test cases so the returned request shape matches code expectations.
2026-05-14 16:02:04 -04:00
kikootwo 5e4a38a340 Normalize notification events and update grab flow
Introduce a NotificationEventConfig interface and validate NOTIFICATION_EVENTS with `satisfies` for stronger typing and normalized metadata shape. Replace escaped emoji sequences with literal emoji, simplify helper functions (getEventMeta/getEventTitle) to use the typed registry, and clean up titleByRequestType typing.

In download-torrent.processor: include the requesting user when setting status to downloading to avoid an extra DB query, and use that returned user to enqueue a non-blocking `request_grabbed` notification.

Docs: note that `request_grabbed` notifications are opt-in for existing backends. Tests: add messageLabel rendering tests for Apprise and ntfy providers to validate emoji, label text, and type-specific titles.
2026-05-14 15:57:15 -04:00
kikootwo 4ded2cf219 Merge branch 'main' of https://github.com/kikootwo/ReadMeABook 2026-05-14 15:47:23 -04:00
kikootwo 21d811e2bf Merge pull request #162 from xFlawless11x/feature/on-grab-notification
feat: add On Grab notification event
2026-05-14 15:47:17 -04:00
kikootwo 247fe88b99 Refactor approval buttons into reusable component
Extract LoadingSpinner and ApprovalActionButtons components and replace duplicated approve/search/deny button blocks with the new ApprovalActionButtons to reduce duplication and centralize behavior/styles. Remove the inline LoadingSpinner in PendingApprovalSection, add an aria-label to the details button, and update the details modal's adminActions to use ApprovalActionButtons with callbacks that handle approval/denial/search and close modals as needed. Improves DRY, maintainability, and consistency of loading state handling.
2026-05-14 15:43:30 -04:00
kikootwo 3545ff6109 Merge pull request #158 from xFlawless11x/feature/admin-book-info-modal
feat: add book info modal to admin pending approval cards
2026-05-14 15:34:20 -04:00
kikootwo fb19c1a642 Merge branch 'main' of https://github.com/kikootwo/ReadMeABook 2026-05-14 15:34:19 -04:00
kikootwo 6c8ca9647d Support language/format/publisher for Audible
Expose language, formatType, and publisherName from the Audible catalog. Update audible.service to map format_type and publisher_name (and language) into the AudibleAudiobook model, update AudiobookDetailsModal to display language and format using the CSS "capitalize" class, and update documentation to list the new fields. Add unit tests to verify the mappings, details propagation, and behavior when fields are omitted.
2026-05-14 15:33:30 -04:00
kikootwo 18752dd02b Merge branch 'main' of https://github.com/kikootwo/ReadMeABook 2026-05-14 15:24:24 -04:00
kikootwo f8c70a6b9a Merge pull request #152 from Orvanix/feature/modal-view
feat(audiobook): add language, format and publisher to details modal
2026-05-14 15:24:22 -04:00
kikootwo fcae3bcf09 Audible: HTML refresh, multi-narrator & works dedup
Switch nightly discovery refresh to scrape Audible's curated HTML storefronts (popular, new releases, category pages) while keeping real-time user paths on the JSON catalog API. Add robust HTML resilience knobs (increased retries, capped jittered backoff, AdaptivePacer changes and per-batch cooldowns) to avoid failing nightly jobs during 503 storms. Implement multi-narrator capture via a new extractAllNarrators helper and update parsers to preserve all narrator anchors. Introduce two-pass dedup: in-memory deduplicateAndCollectGroups + collapseByExistingWorks that consults the works table, export metadataScore for consistent representative selection, and persist dedup groups (fire-and-forget). Wire collapseByExistingWorks into search/author/series routes and make defensive dedup in the refresh processor. Add HTML parsing helpers, runtime/lang-aware parsing, jitteredBackoff cap, and tests for the new behaviors.
2026-05-14 15:23:15 -04:00
xFlawless11x ba1efa88f5 feat: add On Grab notification event
Adds request_grabbed event that fires when a torrent/NZB is successfully
handed off to the configured download client, filling the gap between
request_approved (pre-search) and request_available (fully imported).

- Add request_grabbed to NOTIFICATION_EVENTS with titleByRequestType
  (Audiobook Grabbed / Ebook Grabbed), info severity, Details messageLabel
- Add NotificationEventConfig interface and update getEventMeta() return
  type to expose messageLabel to all providers without TypeScript errors
- Add messageLabel: 'Reason' to issue_reported event
- Fix all 4 providers (Discord, ntfy, Pushover, Apprise) to derive message
  field label from meta.messageLabel ?? 'Error' instead of hardcoded
  isIssue ternary — prevents grab details showing as Error
- Trigger request_grabbed in download-torrent.processor.ts after
  client.addDownload() succeeds; message carries torrent title, indexer,
  and download client name; requestType sourced from request.type
- Update notifications.md documentation

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-04-21 13:49:36 -04:00
Mattias Carlsson c9392c49c9 If ASIN lookup fails, use the folder name instead of the tag. 2026-04-19 22:09:46 +02:00
Mattias Carlsson 7b01cda955 Fix bulk import: merge untagged files into single tagged group per folder 2026-04-19 22:03:45 +02:00
Mattias Carlsson 9a6062d860 Decreased audible retries when doing manual imports. 2026-04-19 21:53:28 +02:00
Mattias Carlsson ad1ab3af05 Better searching when using ASIN from folder names. 2026-04-19 21:14:14 +02:00
Mattias Carlsson 35cb318389 Fix bulk import: group tagless files by folder, use folder name as search fallback 2026-04-10 10:22:01 +02:00
xFlawless11x e9d7a2359a feat: add book info modal to admin pending approval cards
Adds an info icon button (top-right of each card) in the Requests
Awaiting Approval section. Clicking it opens AudiobookDetailsModal
with full book details (cover, description, narrator, series, genres,
etc.) and embeds the Approve / Search / Deny action buttons so admins
can review and act without navigating away from the admin panel.

Implementation:
- AudiobookDetailsModal: adds optional `adminActions` prop rendered as
  a second row inside the existing sticky action bar
- admin/page.tsx: adds detailsAsin/detailsRequestId state, info button
  per card (conditional on audibleAsin presence), and AudiobookDetailsModal
  wired with admin action buttons matching the card button behaviour
- Documentation updated: request-approval.md, components.md, TABLEOFCONTENTS.md

Closes #157

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-03-24 11:29:26 -04:00
Orvanix 1abaff1677 feat(audiobook): add language, format and publisher to details modal 2026-03-14 17:45:31 +00:00
37 changed files with 1830 additions and 327 deletions
+3
@@ -45,6 +45,8 @@
- **Web scraping (popular, new releases)** → [integrations/audible.md](integrations/audible.md)
- **Database caching, real-time matching** → [integrations/audible.md](integrations/audible.md)
- **Book covers API for login page** → [frontend/pages/login.md](frontend/pages/login.md)
- **Dedup & works table (cross-ASIN identity)** → [integrations/audible.md](integrations/audible.md#dedup--works-table)
- **Multi-narrator capture in HTML scrapers** → [integrations/audible.md](integrations/audible.md#narrator-capture-in-html-scrapers)
## E-book Support (First-Class)
- **First-class ebook requests, separate tracking** → [integrations/ebook-sidecar.md](integrations/ebook-sidecar.md)
@@ -144,6 +146,7 @@
**"How do I delete requests?"** → [admin-features/request-deletion.md](admin-features/request-deletion.md)
**"How do I approve/deny user requests?"** → [admin-features/request-approval.md](admin-features/request-approval.md)
**"How do I enable auto-approve for requests?"** → [admin-features/request-approval.md](admin-features/request-approval.md)
**"How does the admin book info modal work?"** → [admin-features/request-approval.md](admin-features/request-approval.md#ui-features), [frontend/components.md](frontend/components.md#component-apis)
**"How do I customize audiobook folder organization?"** → [settings-pages.md](settings-pages.md#audiobook-organization-template), [phase3/file-organization.md](phase3/file-organization.md#target-structure)
**"How do I deploy?"** → [deployment/docker.md](deployment/docker.md) (multi-container), [deployment/unified.md](deployment/unified.md) (all-in-one)
**"How do I use the unified container?"** → [deployment/unified.md](deployment/unified.md)
@@ -259,8 +259,11 @@ Update user (includes autoApproveRequests field)
- Title and author
- User avatar and username
- Request timestamp (relative: "2 hours ago")
- Info button (ⓘ, top-right corner) — opens AudiobookDetailsModal for full book details
- Approve button (green, checkmark icon)
- Search button (blue, magnifier icon) — opens InteractiveTorrentSearchModal
- Deny button (red, X icon)
- **Info modal:** `AudiobookDetailsModal` rendered with `adminActions` prop containing Approve/Search/Deny buttons, allowing admin to review full book details (cover, description, series, genres, narrator, etc.) without leaving the approval workflow
- Auto-refreshes every 10 seconds (SWR)
- Loading states on buttons during approval/denial
- Success/error toast notifications
@@ -7,7 +7,7 @@ Sends notifications for audiobook request events (pending approval, approved, av
## Key Details
- **Backends:** Apprise (API), Discord (webhooks), ntfy (API), Pushover (API)
- **Events:** request_pending_approval, request_approved, request_available, request_error, issue_reported
- **Events:** request_pending_approval, request_approved, request_grabbed, request_available, request_error, issue_reported
- **Encryption:** AES-256-GCM for sensitive config (webhook URLs, API keys, notification URLs)
- **Delivery:** Async via Bull job queue (priority 5)
- **Failure Handling:** Non-blocking, Promise.allSettled (one backend fails, others succeed)
@@ -33,11 +33,14 @@ model NotificationBackend {
|-------|---------|------------------------|
| request_pending_approval | User creates request | Request needs admin approval |
| request_approved | Admin approves OR auto-approval | Request approved (manual or auto) |
| request_grabbed | Torrent/NZB added to download client | Download handed off to configured download client (title resolves by type) — **opt-in: existing backends do not auto-subscribe; enable in Settings** |
| request_available | Plex/ABS scan or ebook download completes | Request available (title resolves by type) |
| request_error | Download/import fails | Request failed at any stage |
| issue_reported | User reports issue | User reports problem with available audiobook |
**Dynamic Titles:** Events can define `titleByRequestType` in `notification-events.ts` for type-specific titles.
- `request_grabbed` + `requestType: 'audiobook'` → "Audiobook Grabbed"
- `request_grabbed` + `requestType: 'ebook'` → "Ebook Grabbed"
- `request_available` + `requestType: 'audiobook'` → "Audiobook Available"
- `request_available` + `requestType: 'ebook'` → "Ebook Available"
- `request_available` + no requestType → "Request Available" (fallback)
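The title resolution listed above can be sketched as follows. This is a minimal illustration, not the contents of `notification-events.ts`: the `title` fallback field, the `'Request Grabbed'` fallback string, the `severity` value for `request_available`, and the exact shape of `NotificationEventConfig` are assumptions. Only the event names, the type-specific titles, the `info` severity and `Details` label for `request_grabbed`, and the "Request Available" fallback come from the documentation.

```typescript
type RequestType = 'audiobook' | 'ebook';

// Assumed shape; the real interface may carry more fields.
interface NotificationEventConfig {
  title: string; // fallback when no requestType is supplied
  titleByRequestType?: Partial<Record<RequestType, string>>;
  severity: 'info' | 'success' | 'warning' | 'error';
  messageLabel?: string;
}

// `satisfies` checks every entry against the interface while keeping
// the literal keys, so `keyof typeof` below stays a precise union.
const NOTIFICATION_EVENTS = {
  request_grabbed: {
    title: 'Request Grabbed', // assumed fallback text
    titleByRequestType: { audiobook: 'Audiobook Grabbed', ebook: 'Ebook Grabbed' },
    severity: 'info',
    messageLabel: 'Details',
  },
  request_available: {
    title: 'Request Available',
    titleByRequestType: { audiobook: 'Audiobook Available', ebook: 'Ebook Available' },
    severity: 'success', // assumed
  },
} satisfies Record<string, NotificationEventConfig>;

type NotificationEvent = keyof typeof NOTIFICATION_EVENTS;

function getEventTitle(event: NotificationEvent, requestType?: RequestType): string {
  const meta: NotificationEventConfig = NOTIFICATION_EVENTS[event];
  // Prefer the type-specific title, else fall back to the generic one.
  return (requestType && meta.titleByRequestType?.[requestType]) ?? meta.title;
}
```

Calling `getEventTitle('request_grabbed', 'ebook')` yields "Ebook Grabbed", while `getEventTitle('request_available')` falls back to "Request Available".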
@@ -66,6 +69,11 @@ model NotificationBackend {
- Approve (with or without pre-selected torrent): After job triggered → request_approved
- Deny: No notification
**Download Grabbed (processor: download-torrent)**
- After `client.addDownload()` succeeds and `DownloadHistory` record created → request_grabbed
- `message` field: `"${torrent.title} via ${indexer} (${clientType})"`
- `requestType`: from `request.type` (audiobook/ebook)
**Audiobook Available (processors: scan-plex, plex-recently-added)**
- After `status: 'available'` update → request_available (requestType: 'audiobook')
- Includes user info in query (plexUsername)
+9 -4
@@ -13,9 +13,13 @@ Lets admins scan a server folder recursively, discover audiobook subfolders, mat
## Key Details
- **Access:** Admin-only, modal opened from admin dashboard Quick Actions
- **Audio detection:** Uses `AUDIO_EXTENSIONS` from `src/lib/constants/audio-formats.ts`
- **Audiobook boundary:** A folder containing audio files = one audiobook; subfolders not scanned further
- **Metadata extraction:** ffprobe reads `album` (title), `album_artist` (author), `composer` (narrator) from first audio file
- **Fallback:** If metadata tags are empty, folder name used as search term; "Low Confidence" badge shown
- **Audiobook boundary:** A folder containing audio files = one audiobook. Files with matching metadata tags are grouped by title+author+narrator. Files with no metadata title tag are all grouped together per folder (one entry, not one per file).
- **Metadata extraction:** ffprobe reads `album` (title), `album_artist` (author), `composer` (narrator) from all audio files in folder
- **Search term fallback chain** (when no `album` tag):
1. **ASIN in folder name** — scans folder name for pattern `B[A-Z0-9]{9}` bounded by bracket/paren/space; if found, uses direct ASIN lookup instead of text search; no badge shown
2. **Folder name** — cleaned (strips bracketed ASIN/year, underscores→spaces); skipped if generic (CD1, Disc 2, Part 3, Vol 1, etc.); shows "Low Confidence" badge
3. **First file name** — last resort; shows "Low Confidence" badge
- **Generic folder detection:** `/^(cd|disc|disk|part|vol(ume)?)\s*\d+$/i` — these names are skipped as search terms
- **Author/narrator dedup:** Splits on `,;& ` delimiters, removes names appearing in both fields
- **Scan depth:** Max 10 levels recursion
- **Rate limiting:** 1.5s delay between Audible searches (same as existing scraping rate limit)
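The search term fallback chain above can be sketched roughly like this. The function and type names (`pickSearchTerm`, `cleanFolderName`, `SearchTerm`) are hypothetical, and the cleaning step is a simplification that drops every bracketed or parenthesised chunk rather than only ASINs and years. The two regular expressions follow the patterns quoted in the documentation, with the ASIN boundary written out as an assumption.

```typescript
type SearchTerm =
  | { kind: 'asin'; asin: string; lowConfidence: false }
  | { kind: 'text'; term: string; lowConfidence: true };

// ASIN bounded by bracket/paren/space (string edges also accepted here).
const ASIN_IN_NAME = /(?:^|[\s\[(])(B[A-Z0-9]{9})(?=$|[\s\])])/;
// Generic folder names that make useless search terms.
const GENERIC_FOLDER = /^(cd|disc|disk|part|vol(ume)?)\s*\d+$/i;

function cleanFolderName(name: string): string {
  return name
    .replace(/[\[(][^\])]*[\])]/g, ' ') // drop bracketed chunks (ASIN, year)
    .replace(/_/g, ' ')                 // underscores become spaces
    .replace(/\s+/g, ' ')
    .trim();
}

function pickSearchTerm(folderName: string, firstFileName: string): SearchTerm {
  // 1. ASIN in folder name: direct lookup, no badge.
  const asin = ASIN_IN_NAME.exec(folderName)?.[1];
  if (asin) return { kind: 'asin', asin, lowConfidence: false };

  // 2. Cleaned folder name, unless it is generic (CD1, Disc 2, ...).
  const cleaned = cleanFolderName(folderName);
  if (cleaned && !GENERIC_FOLDER.test(cleaned)) {
    return { kind: 'text', term: cleaned, lowConfidence: true };
  }

  // 3. Last resort: first file name without its extension.
  const stem = firstFileName.replace(/\.[^.]+$/, '');
  return { kind: 'text', term: stem, lowConfidence: true };
}
```

Under these assumptions, a folder named `Dune [B002V1O7UK]` resolves to a direct ASIN lookup, while a folder named `CD1` is skipped and the first file name is used instead.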
@@ -56,7 +60,8 @@ Lets admins scan a server folder recursively, discover audiobook subfolders, mat
| Already in library | 40% opacity, green "In Library" badge, toggle disabled |
| Active request exists | 40% opacity, purple "Requested" badge, toggle disabled |
| No Audible match | Red "No Match" badge, folder name shown, pre-skipped |
| Low confidence (folder name fallback) | Amber "Low Confidence" badge |
| ASIN extracted from folder name | No badge (high confidence — direct ASIN lookup) |
| Low confidence (folder name or file name fallback, no ASIN) | Amber "Low Confidence" badge |
## Files
+2 -1
@@ -30,7 +30,7 @@ src/components/
**Audiobooks**
- **AudiobookCard** ✅ - Cover, title, author, narrator, duration, request button, clickable to open details modal. Shows "Requested by [username]" when someone else has requested the book, "Requested" when current user has requested it
- **AudiobookGrid** - Responsive grid (1/2/3/4 cols)
- **AudiobookDetailsModal** ✅ - Full-screen modal with comprehensive metadata (description, genres, rating, release date, narrator, request functionality). Shows requesting user's name when applicable
- **AudiobookDetailsModal** ✅ - Full-screen modal with comprehensive metadata (description, genres, rating, release date, narrator, language, format, publisher, request functionality). Shows requesting user's name when applicable
**Requests**
- **RequestCard** ✅ - Cover, title, author, status badge, progress bar, timestamps, action buttons (cancel, manual search, interactive search)
@@ -113,6 +113,7 @@ interface AudiobookDetailsModalProps {
requestStatus?: string | null;
isAvailable?: boolean;
requestedByUsername?: string | null;
adminActions?: React.ReactNode; // Optional admin buttons (Approve/Search/Deny) rendered as second row in action bar
}
interface RequestCardProps {
+86 -21
@@ -1,29 +1,40 @@
# Audible Integration
**Status:** Implemented | Unauthenticated Audible JSON catalog API (primary) + Audnexus API (per-ASIN details)
**Status:** Implemented | Hybrid — curated HTML for discovery refresh + Audible JSON catalog API for user-facing real-time + Audnexus for per-ASIN details
## Overview
Audiobook metadata for discovery, search, and detail pages. All catalog operations (search, popular, new releases, categories, category books, author books, single-product details) now call Audible's unauthenticated public JSON catalog API (`api.audible.<tld>/1.0/catalog/*`). Per-ASIN detail lookups prefer Audnexus; the catalog API is used as fallback.
Audiobook metadata for discovery, search, and detail pages. Split by access pattern:
- **Nightly discovery refresh** (popular / new releases / category lists) — scraped from Audible's **curated HTML storefronts** (`www.audible.<tld>/adblbestsellers`, `/newreleases`, `/search?node=<id>`). The HTML pages reflect Audible's own editorial picks.
- **User-facing real-time** (search, author books, categories listing, per-ASIN details) — Audible's unauthenticated public **JSON catalog API** (`api.audible.<tld>/1.0/catalog/*`).
- **Per-ASIN detail lookups** — Audnexus (`api.audnex.us/books/{asin}`) primary; catalog API used as fallback when Audnexus returns 404.
## Architecture
- **Primary data source:** Audible JSON catalog API, same endpoint used by the official Audible mobile apps. No authentication, no API key, no user credentials, no special headers.
- **Per-ASIN details:** Audnexus (`api.audnex.us/books/{asin}`) remains primary; catalog API (`/1.0/catalog/products/{asin}`) is the fallback when Audnexus returns 404.
- **HTML scraping:** Removed from `audible.service.ts`. The only remaining HTML path is `audible-series.ts` (series-page scraping, out of scope).
- **`www.audible.<tld>`:** Still used by `audible-series.ts` and by `getBaseUrl()` for "View on Audible" link generation. Not used for any catalog operation.
- **Curated HTML (refresh job only):** the three methods called solely by `audible-refresh.processor.ts` (`getPopularAudiobooks`, `getNewReleases`, `getCategoryBooks`) scrape Audible's storefront HTML to inherit editorial curation. Beefed-up retry/backoff knobs (12 retries, 3-min jittered cap) handle 503 storms patiently on the nightly job without slowing healthy users.
- **JSON catalog API (real-time):** `search`, `searchByAuthorAsin`, `getCategories` (categories listing), and `fetchAudibleDetailsFromApi` (per-ASIN fallback). Same endpoint used by the official Audible mobile apps. No authentication, no API key, no user credentials, no special headers.
- **Audnexus (per-ASIN):** `getAudiobookDetails` and `getRuntime` prefer Audnexus, with catalog API fallback for `getAudiobookDetails`.
- **`www.audible.<tld>`:** Used by HTML refresh scraping, by `audible-series.ts`, and by `getBaseUrl()` for "View on Audible" link generation.
## Data Sources
All catalog operations are HTTP GET against `{apiBaseUrl}` (region-dependent, e.g. `https://api.audible.com`):
### Nightly refresh (HTML — `htmlClient`, baseURL `www.audible.<tld>`)
| Operation | Endpoint | Key params |
|---|---|---|
| Popular | `/adblbestsellers` | `pageSize=50`, `page=<n>` (omitted on first page) |
| New releases | `/newreleases` | `pageSize=50`, `page=<n>` (omitted on first page) |
| Category books | `/search` | `node=<categoryId>&pageSize=50&sort=popularity-rank&page=<n>` |
Parsed via cheerio. Selectors: `.productListItem` (popular/new releases), `.s-result-item, .productListItem` (categories).
### Real-time (JSON catalog API — `apiClient`, baseURL `api.audible.<tld>`)
| Operation | Endpoint | Key params |
|---|---|---|
| Search | `/1.0/catalog/products` | `keywords=<q>` |
| Author books | `/1.0/catalog/products` | `author=<name>` (name, NOT ASIN) |
| Popular | `/1.0/catalog/products` | `products_sort_by=BestSellers` |
| New releases | `/1.0/catalog/products` | `products_sort_by=-ReleaseDate` |
| Category books | `/1.0/catalog/products` | `category_id=<id>&products_sort_by=BestSellers` |
| Categories listing | `/1.0/catalog/categories` | (none) |
| Single product | `/1.0/catalog/products/{asin}` | — |
| Audnexus (per-ASIN) | `https://api.audnex.us/books/{asin}` | `region={audnexusParam}` |
@@ -48,20 +59,20 @@ Populates every `AudibleAudiobook` field. Covered:
## Gotchas
- **Catalog API cannot filter preorders or surface curated bestsellers.** The API's `BestSellers` sort is a right-now velocity rank that spikes on launch-day promos and preorder windows; the `-ReleaseDate` sort returns 100% future preorders. There is no server-side `release_time`, `released-only`, `customer_rights`, or alternate sort (`Reviewed`, `MostListened`, etc.) — every plausible variant was tested and silently ignored. This is why the nightly refresh job uses the curated HTML storefront pages instead.
- **`author=` takes a name, not an ASIN.** The catalog API has no ASIN-based author param. `searchByAuthorAsin()` queries by name, then filters client-side: keeps only products where `products[].authors[].asin === authorAsin`. Preserves ASIN-authoritative author identity. Also filters by `product.language` via `isAcceptedLanguage()` for the configured region.
- **Invalid ASIN returns HTTP 200 with stub body.** `/1.0/catalog/products/{asin}` responds 200 with `{product: {asin: INPUT}}` and no other fields. `fetchAudibleDetailsFromApi()` detects this via missing `product.title` and returns `null`.
- **`publisher_summary` is HTML.** Service strips tags via inline `stripHtml()` helper (regex-based, no cheerio) before populating `description`. Falls back to `merchandising_summary` (plain text) if `publisher_summary` missing.
- **Series is an array.** `products[].series[]` — a book may belong to multiple series. Service picks the first entry with non-empty `sequence`, else the first entry. `sequence` is cleaned by extracting first `/\d+(?:\.\d+)?/` match for numeric ordering.
- **Stub `product_images`:** cover URL reads from `product_images['500']`; missing keys fall back to `undefined`.
- **`page` is 0-indexed.** Despite the default value appearing to be 1, the API returns items `(page * num_results)` through `((page + 1) * num_results - 1)`. So `page=1` fetches items 51-100, not 1-50. All service methods accept a 1-indexed `page` and subtract 1 at the axios call. The symptom of getting this wrong is silent: queries whose `total_results ≤ num_results` return an empty `products` array while `total_results` is populated (e.g. author searches for small catalogues).
- **`page` is 0-indexed (catalog API only).** Despite the default value appearing to be 1, the API returns items `(page * num_results)` through `((page + 1) * num_results - 1)`. So `page=1` fetches items 51-100, not 1-50. All catalog-API service methods accept a 1-indexed `page` and subtract 1 at the axios call. The symptom of getting this wrong is silent: queries whose `total_results ≤ num_results` return an empty `products` array while `total_results` is populated (e.g. author searches for small catalogues). HTML paths use Audible's native 1-indexed `page` query param and omit it on the first page.
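Two of the gotchas above, the 0-indexed `page` and the stub body returned for an invalid ASIN, lend themselves to a short sketch. The helper names (`toApiPage`, `itemRange`, `isStubProduct`) and the `CatalogProductResponse` type are hypothetical, and a page size of 50 is assumed. The service's real code is not shown in this diff.

```typescript
const NUM_RESULTS = 50; // assumed page size

// Service methods take a 1-indexed page; the catalog API wants 0-indexed.
function toApiPage(page: number): number {
  if (!Number.isInteger(page) || page < 1) {
    throw new RangeError(`page must be a positive integer, got ${page}`);
  }
  return page - 1;
}

// 1-based positions of the first and last item on a given API page.
function itemRange(apiPage: number, numResults: number = NUM_RESULTS): [number, number] {
  return [apiPage * numResults + 1, (apiPage + 1) * numResults];
}

// Minimal view of the single-product response body.
interface CatalogProductResponse {
  product?: { asin?: string; title?: string };
}

// An invalid ASIN comes back as HTTP 200 with only `asin` echoed,
// so a missing title is treated as "not found".
function isStubProduct(body: CatalogProductResponse): boolean {
  return !body.product?.title;
}
```

With these helpers, `itemRange(1)` covers items 51 through 100, which is why sending an unconverted `page=1` skips the first 50 results.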
## Rate Limiting & Resilience
- 503s still possible but dramatically less frequent than the HTML surface.
- `fetchWithRetry()` — jittered exponential backoff, 5 retries, retries on 503/429/5xx.
- `AdaptivePacer` circuit-breaker preserved.
- Inter-page base delay on API paths: **500-1500ms** (down from 2000-4000ms for HTML).
- API responses include `Cache-Control: private, max-age=1800`.
- **Real-time JSON API paths:** 503s are uncommon. `fetchWithRetry()` uses jittered exponential backoff, 5 retries, retries on 503/429/5xx. API responses include `Cache-Control: private, max-age=1800`.
- **Nightly HTML refresh paths:** 503s are more likely (the HTML storefront is more rate-sensitive). Same `fetchWithRetry()`, but with `HTML_MAX_RETRIES=12` and `HTML_MAX_BACKOFF_MS=180_000` (3-minute cap on jittered backoff). Healthy refreshes still complete fast (per-page success on attempt 0); refreshes hit by sustained 503 storms keep retrying patiently rather than being abandoned.
- **`AdaptivePacer`:** inter-page delay 2-4 s baseline, scales up multiplicatively under retry pressure, with a 45-60 s circuit-breaker cooldown after 3 consecutive retry-pages.
- **Per-batch cooldowns** in `audible-refresh.processor.ts`: 15-30 s between popular/new-releases, 10-20 s between categories.
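A capped, jittered exponential backoff of the kind described for the HTML refresh path might look like the sketch below. The 1-second base delay and the "full jitter" strategy (a uniform draw between 0 and the capped ceiling) are assumptions; the diff only states that backoff is jittered, exponential, and capped at `HTML_MAX_BACKOFF_MS=180_000`. The `random` parameter is injected purely so the function is testable.

```typescript
const HTML_MAX_BACKOFF_MS = 180_000; // 3-minute cap, from the documentation

function jitteredBackoff(
  attempt: number,
  baseMs: number = 1_000, // assumed base delay
  maxMs: number = HTML_MAX_BACKOFF_MS,
  random: () => number = Math.random,
): number {
  // Exponential growth, clamped so late attempts never exceed the cap.
  const ceiling = Math.min(maxMs, baseMs * 2 ** attempt);
  // Full jitter: pick uniformly in [0, ceiling) to spread out retries.
  return Math.floor(random() * ceiling);
}
```

Without the cap, attempt 11 alone could wait over half an hour at a 1-second base; with it, every one of the 12 retries waits at most three minutes.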
## Region Configuration
@@ -101,8 +112,8 @@ Configurable Audible region for accurate metadata matching across international
- Automatic refresh: Region change triggers `audible_refresh` job.
**Per-region HTTP clients (on init):**
- `apiClient`: `baseURL=apiBaseUrl`, `Accept: application/json`, `User-Agent: ReadMeABook/1.0`, no language/ipRedirect params.
- `htmlClient`: `baseURL=baseUrl`, browser headers, default params `ipRedirectOverride=true` + `language=<audibleLocaleParam>`. Used only by `audible-series.ts` and `getBaseUrl()`-based link generation.
- `apiClient`: `baseURL=apiBaseUrl`, `Accept: application/json`, `User-Agent: ReadMeABook/1.0`, no language/ipRedirect params. Used for the real-time JSON catalog operations (search, author books, categories listing, per-ASIN details fallback).
- `htmlClient`: `baseURL=baseUrl`, rotating browser headers (`pickUserAgent` + `getBrowserHeaders`), default params `ipRedirectOverride=true` + `language=<audibleLocaleParam>`. Used by the nightly discovery refresh (`/adblbestsellers`, `/newreleases`, `/search?node=...`), by `audible-series.ts`, and by `getBaseUrl()`-based link generation.
- Audnexus calls include `region=<audnexusParam>`.
**Files:**
@@ -130,6 +141,44 @@ Single matching algorithm used everywhere (search, popular, new-releases, jobs).
**Note:** Fuzzy matching (70% threshold) is preserved in `ranking-algorithm.ts` for Prowlarr torrent ranking. Library availability checks require exact ASIN matches only.
## Dedup & Works Table
**Status:** ✅ Implemented | Two-pass dedup on every discovery view + cross-batch identity via works table
Discovery views (search, author books, series detail) collapse duplicate Audible listings for the same recording (publisher re-listings, regional re-issues, full-cast vs single-narrator productions) into a single card. Two passes run in sequence:
1. **Local pass — `deduplicateAndCollectGroups()`** (`src/lib/utils/deduplicate-audiobooks.ts`)
- Stateless, in-memory. Keys books by normalized title + sorted narrator set + duration (±max(5%, 10 min) tolerance), with subtitle compatibility to keep distinct series entries separate.
- Picks a canonical representative per group by `metadataScore()` (cover + rating + duration + description + narrator + release date + genres).
- Emits `DedupGroup[]` describing every multi-ASIN collapse → handed to `persistDedupGroups()` for the works table.
2. **Works pass — `collapseByExistingWorks()`** (`src/lib/services/works.service.ts`)
- Async DB lookup. Reads `work_asins` for every ASIN in the local-passed list and collapses any books sharing a `workId` to one representative (same `metadataScore()` ranking).
- Catches duplicates the local pass misses: source-metadata divergence (e.g. HTML scraper captured different narrators), cross-page splits (paginated series), or non-matching field shapes.
- Degrades gracefully — returns the input unchanged on DB failure (view still renders).
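The two passes can be illustrated with a small in-memory sketch. It is not the code in `deduplicate-audiobooks.ts` or `works.service.ts`: the names `durationsMatch`, `collapseByWork`, and `ScoredBook` are hypothetical, the 5% tolerance is assumed to be taken from the longer of the two durations, and the works lookup is replaced by a plain `Map` standing in for the `work_asins` query.

```typescript
// Local pass ingredient: durations match within max(5%, 10 min).
function durationsMatch(aMinutes: number, bMinutes: number): boolean {
  const tolerance = Math.max(0.05 * Math.max(aMinutes, bMinutes), 10);
  return Math.abs(aMinutes - bMinutes) <= tolerance;
}

interface ScoredBook {
  asin: string;
  score: number; // stand-in for metadataScore()
}

// Works pass: keep one representative per workId, highest score wins.
// Books with no known work are keyed by their own ASIN and pass through.
function collapseByWork(
  books: ScoredBook[],
  workIdByAsin: Map<string, string>,
): ScoredBook[] {
  const best = new Map<string, ScoredBook>();
  for (const book of books) {
    const key = workIdByAsin.get(book.asin) ?? `asin:${book.asin}`;
    const current = best.get(key);
    if (!current || book.score > current.score) best.set(key, book);
  }
  // Map keeps first-insertion order, so list order stays stable.
  return Array.from(best.values());
}
```

A failed lookup can be modelled by passing an empty map, in which case every book is returned unchanged, matching the graceful-degradation behaviour described above.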
### Works Table Schema
- `Work { id, title, author }` — one row per logical book
- `WorkAsin { id, workId, asin, narrator?, durationMinutes?, isCanonical, source, createdAt }` — many ASINs per Work
### Population Layers
- **Layer 1 (auto):** `persistDedupGroups()` writes whenever the local pass finds a duplicate. Merges across pre-existing works when a new group spans them.
- **Layer 2 (seed):** `seedAsin()` writes a single-ASIN work at request creation time, ensuring every requested ASIN has an entry to grow from.
### Read Paths
- **`collapseByExistingWorks()`** — view-level collapse (this section).
- **`getSiblingAsins()`** — library availability matching (`audiobook-matcher.ts`), request-creation duplicate prevention (`request-creator.service.ts`), ignored-audiobook expansion. Returns sibling ASINs grouped by input ASIN.
### Narrator Capture in HTML Scrapers
- HTML scrapers (`audible-series.ts`, the two `parse*Items` parsers in `audible.service.ts`) capture **all** narrator anchors via `extractAllNarrators()` (`src/lib/utils/extract-narrator.ts`). Multi-narrator productions render each name as its own `<a href="?searchNarrator=...">` link; capturing only the first (prior bug) made co-narrated audiobooks fail to dedup. Order is not significant — `normalizeNarrator()` sorts before comparison.
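Why capturing only the first narrator broke dedup can be shown with a small order-insensitive key. `narratorKey` below is a hypothetical stand-in for `normalizeNarrator()`, and it assumes the joined narrator string is comma-separated, which the diff does not state.

```typescript
// Build an order-insensitive key from a joined narrator string.
function narratorKey(narrators: string): string {
  return narrators
    .split(',')
    .map((name) => name.trim().toLowerCase())
    .filter((name) => name.length > 0)
    .sort()
    .join('|');
}

// Two listings of the same co-narrated recording, anchors in different order.
const listingA = 'Jane Doe, John Smith';
const listingB = 'John Smith, Jane Doe';

// With every narrator captured, the keys agree regardless of order.
const allNarratorsMatch = narratorKey(listingA) === narratorKey(listingB);

// With only the first anchor captured (the prior behaviour), they diverge.
const firstOnlyMatch =
  narratorKey(listingA.split(',')[0]) === narratorKey(listingB.split(',')[0]);
```

Here `allNarratorsMatch` is `true` and `firstOnlyMatch` is `false`, which mirrors how co-narrated listings previously failed to merge.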
### Wired Routes
- `src/app/api/audiobooks/search/route.ts`
- `src/app/api/authors/[asin]/books/route.ts`
- `src/app/api/series/[asin]/route.ts`
Watched-list background jobs (`watched-lists.service.ts`) run the local pass only — they don't render a view, and the downstream `request-creator.service.ts` already does sibling-aware dedup at request creation time.
## Database-First Approach
**Status:** Implemented
@@ -137,12 +186,12 @@ Single matching algorithm used everywhere (search, popular, new-releases, jobs).
Discovery APIs serve cached data from DB with real-time matching.
**Flow:**
1. `audible_refresh` cron runs daily → fetches 200 popular + 200 new releases + user-configured categories via catalog API.
1. `audible_refresh` cron runs daily → fetches 200 popular + 200 new releases + user-configured categories by scraping Audible's curated HTML storefronts (`/adblbestsellers`, `/newreleases`, `/search?node=<id>&sort=popularity-rank`).
2. Downloads and caches cover thumbnails locally.
3. Stores metadata in `audible_cache`, ranked entries in `audible_cache_categories` with reserved IDs (`__popular__`, `__new_releases__`) and user category IDs.
4. Cleans up unused thumbnails after sync.
5. API routes query `AudibleCacheCategory` by categoryId → join with `AudibleCache` metadata → apply real-time matching → return enriched results.
6. Homepage loads instantly (no Audible API hits).
6. Homepage loads instantly (no Audible HTTP hits at request time).
## Thumbnail Caching
@@ -201,6 +250,9 @@ interface AudibleAudiobook {
series?: string;
seriesPart?: string;
seriesAsin?: string;
language?: string;
formatType?: string;
publisherName?: string;
}
interface EnrichedAudibleAudiobook extends AudibleAudiobook {
@@ -228,12 +280,25 @@ interface AuthorBooksResult {
## Tech Stack
- `axios` (HTTP, two clients: `apiClient` for JSON catalog, `htmlClient` for series-page scraping only)
- `axios` (HTTP, two clients: `apiClient` for JSON catalog API, `htmlClient` for HTML refresh + series scraping)
- `cheerio` (HTML parsing for refresh job and `audible-series.ts`)
- Audnexus API (per-ASIN details, primary)
- PostgreSQL (`audible_cache`, `audible_cache_categories`)
## Fixed Issues
**Series-page duplicates not collapsing across user views (2026-05-14)**
- **Problem:** Two re-listings of the same audiobook (same title, same narrator set, same duration, different ASINs) showed as two cards on series detail pages, even after the works table had already linked them via search-page dedup.
- **Root cause (two-part):** (1) HTML scrapers used `$el.find('a[href*="searchNarrator="]').first()` for multi-narrator productions, capturing only the first co-narrator. So two listings of the same recording landed in `deduplicateAndCollectGroups` with mismatched single-narrator strings and never merged. (2) `deduplicateAndCollectGroups` was stateless — it wrote to the works table but never read it back, so even when one path (e.g. search) successfully merged two ASINs and persisted the Work, every other path (series, author books) re-derived the dedup decision from scratch and split them again.
- **Fix:** (1) New `extractAllNarrators()` helper (`src/lib/utils/extract-narrator.ts`) captures every `searchNarrator=` anchor and joins them; all three HTML scrapers route through it. (2) New `collapseByExistingWorks()` consults the works table after the local pass and collapses any remaining books sharing a `workId`. Wired into the three user-facing discovery routes (search / author books / series detail). Skipped for watched-list background jobs — those feed `request-creator.service.ts` which already does sibling-aware dedup.
- **Location:** `src/lib/utils/extract-narrator.ts` (new); `src/lib/integrations/audible-series.ts` (parseSeriesBooks); `src/lib/integrations/audible.service.ts` (parseProductListItems + parseSearchResultItems); `src/lib/utils/deduplicate-audiobooks.ts` (`metadataScore` exported); `src/lib/services/works.service.ts` (`collapseByExistingWorks` added); three API routes updated.
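The two parts of the fix above can be sketched as follows. This is an illustrative approximation, not the code in `extract-narrator.ts` or `works.service.ts`: `joinNarrators`, `collapseByWorkId`, the `NarratorAnchor` and `Book` shapes, the `, ` separator, the name de-duplication, and the "highest score wins" rule are all assumptions made for the example. The real helpers operate on cheerio elements and the works table.

```typescript
// Hypothetical shapes for illustration only.
interface NarratorAnchor {
  href: string;
  text: string;
}

interface Book {
  asin: string;
  score: number; // assumed stand-in for a metadata completeness score
}

// Part 1: collect every narrator anchor instead of only the first one.
function joinNarrators(anchors: NarratorAnchor[]): string {
  const names = anchors
    .filter((a) => a.href.indexOf('searchNarrator=') !== -1)
    .map((a) => a.text.trim())
    .filter((name) => name.length > 0);
  // Drop repeated names while preserving first-seen order.
  return names.filter((name, i) => names.indexOf(name) === i).join(', ');
}

// Part 2: collapse books that a works lookup maps to the same workId.
// `workIdByAsin` stands in for a read against the works table.
function collapseByWorkId(
  books: Book[],
  workIdByAsin: Record<string, string | undefined>,
): Book[] {
  const bestByWork: Record<string, Book | undefined> = {};
  const result: Book[] = [];
  for (const book of books) {
    const workId = workIdByAsin[book.asin];
    if (workId === undefined) {
      result.push(book); // unknown to the works table: keep as-is
      continue;
    }
    const existing = bestByWork[workId];
    if (existing === undefined) {
      bestByWork[workId] = book;
      result.push(book);
    } else if (book.score > existing.score) {
      // Swap in the better entry at the position of the earlier one.
      result[result.indexOf(existing)] = book;
      bestByWork[workId] = book;
    }
  }
  return result;
}
```

Running the works-table pass after the local title/narrator/duration pass means a merge persisted by one route (for example search) is also honored by the others, even when their scraped metadata differs.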
**Discovery refresh reverted to curated HTML scraping (2026-05-14)**
- **Problem:** After switching all catalog ops to the JSON catalog API in `f564d0a`, the nightly discovery refresh (Popular / New Releases / user-configured Categories) started serving junk: New Releases became 100% preorders out to 2027, and Popular was dominated by launch-day no-name shovelware.
- **Root cause:** `products_sort_by=BestSellers` is a right-now sales velocity rank that spikes on launch promos and preorder windows; `-ReleaseDate` returns all catalog items in date order with no released-only filter. The catalog API exposes no server-side filter to exclude preorders and no sort by established popularity (verified by exhaustively testing `release_time`, `availability_status`, `customer_rights`, and the `Reviewed`/`MostListened`/`SalesRank` sorts, all of which were silently ignored or rejected). Curating client-side would have made RMAB the editorial curator, a job that Audible's storefront pages already do well.
- **Fix:** Hybrid architecture — the three refresh-only methods (`getPopularAudiobooks`, `getNewReleases`, `getCategoryBooks`) went back to scraping Audible's curated HTML storefronts (`/adblbestsellers`, `/newreleases`, `/search?node=<id>&sort=popularity-rank`). All user-facing real-time paths (search, author books, categories listing, per-ASIN details) stayed on the JSON catalog API. To keep the higher-503-risk HTML traffic resilient on the unattended nightly job, `fetchWithRetry()` accepts an optional `maxBackoffMs` cap and HTML callers use `HTML_MAX_RETRIES=12` + `HTML_MAX_BACKOFF_MS=180_000` (3-min cap). Healthy users finish quickly; 503-blocked users grind through patiently.
- **Location:** `src/lib/integrations/audible.service.ts` (three methods + two private parsers `parseProductListItems` / `parseSearchResultItems`); `src/lib/utils/scrape-resilience.ts` (`jitteredBackoff` cap parameter).
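The capped backoff described in the fix can be sketched as below. This is a hypothetical approximation: the actual formula inside `jitteredBackoff` in `scrape-resilience.ts` is not shown here, so the doubling schedule, the up-to-50% jitter, and the injectable `random` parameter are assumptions. Only the `(attempt, baseMs, maxBackoffMs)` argument order mirrors the call site `jitteredBackoff(attempt, 1000, maxBackoffMs)`.

```typescript
// Hypothetical capped exponential backoff with jitter.
// `random` is injectable so the behavior can be checked deterministically.
function cappedBackoffMs(
  attempt: number,
  baseMs: number = 1000,
  maxBackoffMs: number = Number.POSITIVE_INFINITY,
  random: () => number = Math.random,
): number {
  // Delay doubles with each attempt: base, 2x base, 4x base, ...
  const exponential = baseMs * Math.pow(2, attempt);
  // Add up to 50% extra so concurrent clients do not retry in lockstep.
  const jitter = exponential * 0.5 * random();
  // The cap keeps late attempts from waiting arbitrarily long.
  return Math.min(Math.round(exponential + jitter), maxBackoffMs);
}

// Values used by the HTML callers, as stated in the fix above.
const HTML_MAX_RETRIES = 12;
const HTML_MAX_BACKOFF_MS = 180_000;
```

Under this sketch an uncapped schedule would pass the 3-minute mark by the eighth attempt (1000 ms × 2^8 = 256 s), so the cap is what makes 12 retries practical for an unattended nightly job.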
**Audiobookshelf metadata matching not respecting configured region (2026-01-28)**
- **Problem:** `triggerABSItemMatch()` hardcoded `'audible'` provider (audible.com) instead of respecting user's configured Audible region.
- **Impact:** Users with non-US regions (CA, UK, AU, IN) had incorrect metadata matching in Audiobookshelf, causing wrong ASINs.
+118 -44
@@ -14,8 +14,10 @@ import { RecentRequestsTable } from './components/RecentRequestsTable';
import { ToastProvider, useToast } from '@/components/ui/Toast';
import { ReportedIssuesSection } from './components/ReportedIssuesSection';
import { InteractiveTorrentSearchModal } from '@/components/requests/InteractiveTorrentSearchModal';
import { AudiobookDetailsModal } from '@/components/audiobooks/AudiobookDetailsModal';
import { BulkImportWizard } from '@/components/admin/BulkImportWizard';
import { TorrentResult } from '@/lib/utils/ranking-algorithm';
import { InformationCircleIcon } from '@heroicons/react/24/outline';
import { formatDistanceToNow } from 'date-fns';
import { useState } from 'react';
@@ -56,15 +58,78 @@ function formatTorrentSize(bytes: number): string {
return gb >= 1 ? `${gb.toFixed(1)} GB` : `${mb.toFixed(0)} MB`;
}
function LoadingSpinner() {
return (
<svg className="animate-spin h-4 w-4" fill="none" viewBox="0 0 24 24">
<circle className="opacity-25" cx="12" cy="12" r="10" stroke="currentColor" strokeWidth="4" />
<path className="opacity-75" fill="currentColor" d="M4 12a8 8 0 018-8V0C5.373 0 0 5.373 0 12h4zm2 5.291A7.962 7.962 0 014 12H0c0 3.042 1.135 5.824 3 7.938l3-2.647z" />
</svg>
);
}
interface ApprovalActionButtonsProps {
isLoading: boolean;
onApprove: () => void;
onSearch: () => void;
onDeny: () => void;
}
function ApprovalActionButtons({ isLoading, onApprove, onSearch, onDeny }: ApprovalActionButtonsProps) {
return (
<>
<button
onClick={onApprove}
disabled={isLoading}
className="flex-1 inline-flex items-center justify-center gap-1.5 px-3 py-2 bg-green-600 hover:bg-green-700 disabled:bg-green-400 disabled:cursor-not-allowed text-white text-sm font-medium rounded-lg transition-colors"
>
{isLoading ? <LoadingSpinner /> : (
<svg className="w-4 h-4" fill="none" stroke="currentColor" viewBox="0 0 24 24">
<path strokeLinecap="round" strokeLinejoin="round" strokeWidth={2} d="M5 13l4 4L19 7" />
</svg>
)}
<span>Approve</span>
</button>
<button
onClick={onSearch}
disabled={isLoading}
className="flex-1 inline-flex items-center justify-center gap-1.5 px-3 py-2 bg-blue-600 hover:bg-blue-700 disabled:bg-blue-400 disabled:cursor-not-allowed text-white text-sm font-medium rounded-lg transition-colors"
>
<svg className="w-4 h-4" fill="none" stroke="currentColor" viewBox="0 0 24 24">
<path strokeLinecap="round" strokeLinejoin="round" strokeWidth={2} d="M21 21l-6-6m2-5a7 7 0 11-14 0 7 7 0 0114 0z" />
</svg>
<span>Search</span>
</button>
<button
onClick={onDeny}
disabled={isLoading}
className="flex-1 inline-flex items-center justify-center gap-1.5 px-3 py-2 bg-red-600 hover:bg-red-700 disabled:bg-red-400 disabled:cursor-not-allowed text-white text-sm font-medium rounded-lg transition-colors"
>
{isLoading ? <LoadingSpinner /> : (
<svg className="w-4 h-4" fill="none" stroke="currentColor" viewBox="0 0 24 24">
<path strokeLinecap="round" strokeLinejoin="round" strokeWidth={2} d="M6 18L18 6M6 6l12 12" />
</svg>
)}
<span>Deny</span>
</button>
</>
);
}
function PendingApprovalSection({ requests }: { requests: PendingApprovalRequest[] }) {
const toast = useToast();
const [loadingStates, setLoadingStates] = useState<Record<string, boolean>>({});
const [searchModalRequestId, setSearchModalRequestId] = useState<string | null>(null);
const [detailsAsin, setDetailsAsin] = useState<string | null>(null);
const [detailsRequestId, setDetailsRequestId] = useState<string | null>(null);
const searchModalRequest = searchModalRequestId
? requests.find((r) => r.id === searchModalRequestId)
: null;
const detailsRequest = detailsRequestId
? requests.find((r) => r.id === detailsRequestId)
: null;
const handleApproveRequest = async (requestId: string) => {
setLoadingStates((prev) => ({ ...prev, [requestId]: true }));
@@ -125,13 +190,6 @@ function PendingApprovalSection({ requests }: { requests: PendingApprovalRequest
await mutate('/api/admin/metrics');
};
const LoadingSpinner = () => (
<svg className="animate-spin h-4 w-4" fill="none" viewBox="0 0 24 24">
<circle className="opacity-25" cx="12" cy="12" r="10" stroke="currentColor" strokeWidth="4" />
<path className="opacity-75" fill="currentColor" d="M4 12a8 8 0 018-8V0C5.373 0 0 5.373 0 12h4zm2 5.291A7.962 7.962 0 014 12H0c0 3.042 1.135 5.824 3 7.938l3-2.647z" />
</svg>
);
return (
<div className="mb-8">
{/* Section Header */}
@@ -170,8 +228,23 @@ function PendingApprovalSection({ requests }: { requests: PendingApprovalRequest
return (
<div
key={request.id}
className="bg-white dark:bg-gray-800 border-2 border-amber-200 dark:border-amber-800 rounded-lg shadow-sm hover:shadow-md transition-shadow overflow-hidden"
className="relative bg-white dark:bg-gray-800 border-2 border-amber-200 dark:border-amber-800 rounded-lg shadow-sm hover:shadow-md transition-shadow overflow-hidden"
>
{/* Info Button — opens AudiobookDetailsModal */}
{request.audiobook.audibleAsin && (
<button
onClick={() => {
setDetailsAsin(request.audiobook.audibleAsin);
setDetailsRequestId(request.id);
}}
className="absolute top-2 right-2 z-10 p-1 text-gray-400 hover:text-blue-500 dark:hover:text-blue-400 transition-colors rounded-full hover:bg-gray-100 dark:hover:bg-gray-700"
title="View book details"
aria-label="View book details"
>
<InformationCircleIcon className="w-5 h-5" />
</button>
)}
{/* Card Content */}
<div className="p-4">
<div className="flex gap-3">
@@ -314,42 +387,12 @@ function PendingApprovalSection({ requests }: { requests: PendingApprovalRequest
{/* Action Buttons */}
<div className="border-t border-amber-200 dark:border-amber-800 bg-gray-50 dark:bg-gray-900/50 px-4 py-3 flex gap-2">
<button
onClick={() => handleApproveRequest(request.id)}
disabled={isLoading}
className="flex-1 inline-flex items-center justify-center gap-1.5 px-3 py-2 bg-green-600 hover:bg-green-700 disabled:bg-green-400 disabled:cursor-not-allowed text-white text-sm font-medium rounded-lg transition-colors"
>
{isLoading ? <LoadingSpinner /> : (
<svg className="w-4 h-4" fill="none" stroke="currentColor" viewBox="0 0 24 24">
<path strokeLinecap="round" strokeLinejoin="round" strokeWidth={2} d="M5 13l4 4L19 7" />
</svg>
)}
<span>Approve</span>
</button>
<button
onClick={() => setSearchModalRequestId(request.id)}
disabled={isLoading}
className="flex-1 inline-flex items-center justify-center gap-1.5 px-3 py-2 bg-blue-600 hover:bg-blue-700 disabled:bg-blue-400 disabled:cursor-not-allowed text-white text-sm font-medium rounded-lg transition-colors"
>
<svg className="w-4 h-4" fill="none" stroke="currentColor" viewBox="0 0 24 24">
<path strokeLinecap="round" strokeLinejoin="round" strokeWidth={2} d="M21 21l-6-6m2-5a7 7 0 11-14 0 7 7 0 0114 0z" />
</svg>
<span>Search</span>
</button>
<button
onClick={() => handleDenyRequest(request.id)}
disabled={isLoading}
className="flex-1 inline-flex items-center justify-center gap-1.5 px-3 py-2 bg-red-600 hover:bg-red-700 disabled:bg-red-400 disabled:cursor-not-allowed text-white text-sm font-medium rounded-lg transition-colors"
>
{isLoading ? <LoadingSpinner /> : (
<svg className="w-4 h-4" fill="none" stroke="currentColor" viewBox="0 0 24 24">
<path strokeLinecap="round" strokeLinejoin="round" strokeWidth={2} d="M6 18L18 6M6 6l12 12" />
</svg>
)}
<span>Deny</span>
</button>
<ApprovalActionButtons
isLoading={isLoading}
onApprove={() => handleApproveRequest(request.id)}
onSearch={() => setSearchModalRequestId(request.id)}
onDeny={() => handleDenyRequest(request.id)}
/>
</div>
</div>
);
@@ -375,6 +418,37 @@ function PendingApprovalSection({ requests }: { requests: PendingApprovalRequest
}}
/>
)}
{/* Book Details Modal — opened via info button on each approval card */}
{detailsAsin && detailsRequestId && (
<AudiobookDetailsModal
asin={detailsAsin}
isOpen={true}
onClose={() => { setDetailsAsin(null); setDetailsRequestId(null); }}
requestStatus="awaiting_approval"
requestedByUsername={detailsRequest?.user.plexUsername ?? null}
adminActions={
<ApprovalActionButtons
isLoading={loadingStates[detailsRequestId] || false}
onApprove={async () => {
await handleApproveRequest(detailsRequestId);
setDetailsAsin(null);
setDetailsRequestId(null);
}}
onSearch={() => {
setSearchModalRequestId(detailsRequestId);
setDetailsAsin(null);
setDetailsRequestId(null);
}}
onDeny={async () => {
await handleDenyRequest(detailsRequestId);
setDetailsAsin(null);
setDetailsRequestId(null);
}}
/>
}
/>
)}
</div>
);
}
+36 -3
@@ -159,10 +159,42 @@ export async function POST(request: NextRequest) {
let hasActiveRequest = false;
try {
const searchResult = await audibleService.search(book.searchTerm);
// If the scanner extracted an ASIN directly from the folder name,
// use a direct ASIN lookup (Audnexus API) — more reliable than a
// keyword text search. Fall back to text search if the lookup fails.
if (book.extractedAsin) {
try {
const asinResult = await audibleService.getAudiobookDetails(book.extractedAsin);
if (asinResult) {
match = asinResult;
}
} catch {
/* ASIN lookup failed — fall through to text search */
}
}
if (searchResult.results.length > 0) {
match = searchResult.results[0];
if (!match) {
// When an ASIN was extracted from the folder name but the direct
// lookup failed, prefer the folder name as the text search term
// over book.searchTerm. book.searchTerm may come from a single
// tagged file whose album tag is unreliable (e.g. a series name
// or intro track), whereas the folder name is the human-assigned
// title and is more likely to be accurate.
const textSearchTerm = book.extractedAsin
? book.folderName
.replace(/[\[\(][A-Z0-9]{10}[\]\)]/g, '') // strip ASIN
.replace(/[\[\(]\d{4}[\]\)]/g, '') // strip year
.replace(/[_]/g, ' ')
.replace(/\s+/g, ' ')
.trim()
: book.searchTerm;
const searchResult = await audibleService.search(textSearchTerm);
if (searchResult.results.length > 0) {
match = searchResult.results[0];
}
}
if (match) {
// Check library availability
const plexMatch = await findPlexMatch({
@@ -208,6 +240,7 @@ export async function POST(request: NextRequest) {
audioFileCount: book.audioFileCount,
totalSizeBytes: book.totalSizeBytes,
metadataSource: book.metadataSource,
extractedAsin: book.extractedAsin,
searchTerm: book.searchTerm,
audioFiles: book.audioFiles,
match: match
+7 -4
@@ -7,7 +7,7 @@ import { NextRequest, NextResponse } from 'next/server';
import { getAudibleService } from '@/lib/integrations/audible.service';
import { enrichAudiobooksWithMatches } from '@/lib/utils/audiobook-matcher';
import { deduplicateAndCollectGroups } from '@/lib/utils/deduplicate-audiobooks';
import { persistDedupGroups } from '@/lib/services/works.service';
import { persistDedupGroups, collapseByExistingWorks } from '@/lib/services/works.service';
import { getCurrentUser } from '@/lib/middleware/auth';
import { RMABLogger } from '@/lib/utils/logger';
import { annotateWithIgnoreStatus } from '@/lib/utils/ignored-audiobooks';
@@ -41,16 +41,19 @@ export async function GET(request: NextRequest) {
const currentUser = getCurrentUser(request);
const userId = currentUser?.sub || undefined;
// Deduplicate before enrichment to avoid wasted DB queries on duplicate entries
// Two-pass dedup: local title/narrator/duration matching first, then collapse
// any remaining duplicates that the works table already knows are the same book
// (handles cases where source metadata diverges across paths or pages).
const { books: dedupedResults, groups } = deduplicateAndCollectGroups(results.results);
// Fire-and-forget: persist dedup groups to works table for cross-ASIN matching
if (groups.length > 0) {
persistDedupGroups(groups).catch(() => {});
}
const collapsedResults = await collapseByExistingWorks(dedupedResults);
// Enrich search results with availability and request status information
const enrichedResults = await enrichAudiobooksWithMatches(dedupedResults, userId);
const enrichedResults = await enrichAudiobooksWithMatches(collapsedResults, userId);
// Annotate with per-user ignore status
const annotatedResults = await annotateWithIgnoreStatus(enrichedResults, userId);
+7 -4
@@ -7,7 +7,7 @@ import { NextRequest, NextResponse } from 'next/server';
import { getAudibleService } from '@/lib/integrations/audible.service';
import { enrichAudiobooksWithMatches } from '@/lib/utils/audiobook-matcher';
import { deduplicateAndCollectGroups } from '@/lib/utils/deduplicate-audiobooks';
import { persistDedupGroups } from '@/lib/services/works.service';
import { persistDedupGroups, collapseByExistingWorks } from '@/lib/services/works.service';
import { getCurrentUser } from '@/lib/middleware/auth';
import { RMABLogger } from '@/lib/utils/logger';
import { annotateWithIgnoreStatus } from '@/lib/utils/ignored-audiobooks';
@@ -56,17 +56,20 @@ export async function GET(
const audibleService = getAudibleService();
const result = await audibleService.searchByAuthorAsin(authorName.trim(), asin, page);
// Deduplicate before enrichment to avoid wasted DB queries on duplicate entries
// Two-pass dedup: local title/narrator/duration matching first, then collapse
// any remaining duplicates that the works table already knows are the same book
// (handles cases where source metadata diverges across paths or pages).
const { books: dedupedBooks, groups } = deduplicateAndCollectGroups(result.books);
// Fire-and-forget: persist dedup groups to works table for cross-ASIN matching
if (groups.length > 0) {
persistDedupGroups(groups).catch(() => {});
}
const collapsedBooks = await collapseByExistingWorks(dedupedBooks);
// Enrich with library availability and request status
const userId = currentUser.sub || undefined;
const enrichedBooks = await enrichAudiobooksWithMatches(dedupedBooks, userId);
const enrichedBooks = await enrichAudiobooksWithMatches(collapsedBooks, userId);
// Annotate with per-user ignore status
const annotatedBooks = await annotateWithIgnoreStatus(enrichedBooks, userId);
+7 -4
@@ -9,7 +9,7 @@ import { RMABLogger } from '@/lib/utils/logger';
import { scrapeSeriesPage } from '@/lib/integrations/audible-series';
import { enrichAudiobooksWithMatches } from '@/lib/utils/audiobook-matcher';
import { deduplicateAndCollectGroups } from '@/lib/utils/deduplicate-audiobooks';
import { persistDedupGroups } from '@/lib/services/works.service';
import { persistDedupGroups, collapseByExistingWorks } from '@/lib/services/works.service';
import { annotateWithIgnoreStatus } from '@/lib/utils/ignored-audiobooks';
const logger = RMABLogger.create('API.Series.Detail');
@@ -52,17 +52,20 @@ export async function GET(
);
}
// Deduplicate before enrichment to avoid wasted DB queries on duplicate entries
// Two-pass dedup: local title/narrator/duration matching first, then collapse
// any remaining duplicates that the works table already knows are the same book
// (handles cases where source metadata diverges across paths or pages).
const { books: dedupedBooks, groups } = deduplicateAndCollectGroups(detail.books);
// Fire-and-forget: persist dedup groups to works table for cross-ASIN matching
if (groups.length > 0) {
persistDedupGroups(groups).catch(() => {});
}
const collapsedBooks = await collapseByExistingWorks(dedupedBooks);
// Enrich books with library availability and request status
const userId = currentUser.sub || undefined;
const enrichedBooks = await enrichAudiobooksWithMatches(dedupedBooks, userId);
const enrichedBooks = await enrichAudiobooksWithMatches(collapsedBooks, userId);
// Annotate with per-user ignore status
const annotatedBooks = await annotateWithIgnoreStatus(enrichedBooks, userId);
@@ -39,7 +39,12 @@ function BookRow({
const isDisabled = book.inLibrary || book.hasActiveRequest;
const isSkipped = book.skipped;
const hasMatch = book.match !== null;
const isLowConfidence = book.metadataSource === 'file_name';
// Low confidence when search term came from a filename or folder name fallback,
// BUT not when an ASIN was extracted directly from the folder name (that's a
// direct lookup and is as reliable as embedded metadata tags).
const isLowConfidence =
(book.metadataSource === 'file_name' || book.metadataSource === 'folder_name') &&
!book.extractedAsin;
return (
<div
+3 -1
@@ -34,7 +34,9 @@ export interface ScannedBook {
relativePath: string;
audioFileCount: number;
totalSizeBytes: number;
metadataSource: 'tags' | 'file_name';
metadataSource: 'tags' | 'folder_name' | 'file_name';
/** ASIN extracted directly from the folder name, if present. */
extractedAsin?: string;
searchTerm: string;
audioFiles: string[];
match: AudibleMatch | null;
@@ -38,6 +38,8 @@ interface AudiobookDetailsModalProps {
hideRequestActions?: boolean;
hasReportedIssue?: boolean;
aiReason?: string | null;
/** Optional admin action buttons (Approve / Search / Deny) rendered as a second row in the action bar */
adminActions?: React.ReactNode;
}
// Status helper
@@ -80,6 +82,7 @@ export function AudiobookDetailsModal({
hideRequestActions = false,
hasReportedIssue = false,
aiReason = null,
adminActions,
}: AudiobookDetailsModalProps) {
const { user } = useAuth();
const { squareCovers } = usePreferences();
@@ -548,6 +551,30 @@ export function AudiobookDetailsModal({
</a>
</div>
{/* Language */}
{audiobook.language && (
<div>
<p className="text-gray-500 dark:text-gray-400">Language</p>
<p className="text-gray-900 dark:text-gray-100 capitalize">{audiobook.language}</p>
</div>
)}
{/* Format */}
{audiobook.formatType && (
<div>
<p className="text-gray-500 dark:text-gray-400">Format</p>
<p className="text-gray-900 dark:text-gray-100 capitalize">{audiobook.formatType}</p>
</div>
)}
{/* Publisher */}
{audiobook.publisherName && (
<div>
<p className="text-gray-500 dark:text-gray-400">Publisher</p>
<p className="text-gray-900 dark:text-gray-100">{audiobook.publisherName}</p>
</div>
)}
{/* Download Link - subtle utility, visible from any context */}
{isAvailable && downloadAvailable && requestId && user?.permissions?.download !== false && (
<div>
@@ -739,6 +766,13 @@ export function AudiobookDetailsModal({
)}
</div>
{/* Admin Actions Row (Approve / Search / Deny) — injected by admin pages */}
{adminActions && (
<div className="flex items-center gap-2 mt-3 pt-3 border-t border-amber-200 dark:border-amber-700/50">
{adminActions}
</div>
)}
</div>
)}
+36 -11
@@ -1,4 +1,4 @@
/**
/**
* Component: Notification Event Constants
* Documentation: documentation/backend/services/notifications.md
*
@@ -10,16 +10,28 @@ export type NotificationSeverity = 'info' | 'success' | 'error' | 'warning';
export type NotificationPriority = 'normal' | 'high';
/**
* Central registry of notification events.
* Normalized interface for event metadata.
* Each entry in NOTIFICATION_EVENTS is structurally validated against this via `satisfies`.
*
* Each entry defines:
* - `label`: Human-readable name shown in the UI
* - `title`: Default title used in notification messages
* - `titleByRequestType`: Optional map of request-type-specific titles (e.g. audiobook → "Audiobook Available")
* - `emoji`: Emoji prefix for notification titles
* - `severity`: Drives provider formatting (colors, Apprise types, ntfy tags)
* - `priority`: Drives notification urgency (Pushover/ntfy priority levels)
* - `messageLabel`: Optional label for the `message` payload field (defaults to "Error" if omitted)
*/
export interface NotificationEventConfig {
label: string;
title: string;
titleByRequestType?: Record<string, string>;
emoji: string;
severity: NotificationSeverity;
priority: NotificationPriority;
messageLabel?: string;
}
/** Central registry of notification events. */
export const NOTIFICATION_EVENTS = {
request_pending_approval: {
label: 'Request Pending Approval',
@@ -31,17 +43,29 @@ export const NOTIFICATION_EVENTS = {
request_approved: {
label: 'Request Approved',
title: 'Request Approved',
emoji: '\u2705',
emoji: '✅',
severity: 'success' as const,
priority: 'normal' as const,
},
request_grabbed: {
label: 'Request Grabbed',
title: 'Download Grabbed',
titleByRequestType: {
audiobook: 'Audiobook Grabbed',
ebook: 'Ebook Grabbed',
},
emoji: '\u{1F4E5}',
severity: 'info' as const,
priority: 'normal' as const,
messageLabel: 'Details',
},
request_available: {
label: 'Request Available',
title: 'Request Available',
titleByRequestType: {
audiobook: 'Audiobook Available',
ebook: 'Ebook Available',
} as Record<string, string>,
},
emoji: '\u{1F389}',
severity: 'success' as const,
priority: 'high' as const,
@@ -49,7 +73,7 @@ export const NOTIFICATION_EVENTS = {
request_error: {
label: 'Request Error',
title: 'Request Error',
emoji: '\u274C',
emoji: '❌',
severity: 'error' as const,
priority: 'high' as const,
},
@@ -59,8 +83,9 @@ export const NOTIFICATION_EVENTS = {
emoji: '\u{1F6A9}',
severity: 'warning' as const,
priority: 'high' as const,
messageLabel: 'Reason',
},
} as const;
} satisfies Record<string, NotificationEventConfig>;
/** Union type of all valid notification event keys */
export type NotificationEvent = keyof typeof NOTIFICATION_EVENTS;
@@ -72,7 +97,7 @@ export const NOTIFICATION_EVENT_KEYS = Object.keys(NOTIFICATION_EVENTS) as [Noti
export type NotificationEventMeta = (typeof NOTIFICATION_EVENTS)[NotificationEvent];
/** Helper: get event metadata by key */
export function getEventMeta(event: NotificationEvent) {
export function getEventMeta(event: NotificationEvent): NotificationEventConfig {
return NOTIFICATION_EVENTS[event];
}
@@ -82,9 +107,9 @@ export function getEventMeta(event: NotificationEvent) {
* returns the type-specific title. Otherwise falls back to the default `title`.
*/
export function getEventTitle(event: NotificationEvent, requestType?: string): string {
const meta = NOTIFICATION_EVENTS[event];
if (requestType && 'titleByRequestType' in meta) {
const typeTitle = (meta as typeof meta & { titleByRequestType: Record<string, string> }).titleByRequestType[requestType];
const meta = getEventMeta(event);
if (requestType && meta.titleByRequestType) {
const typeTitle = meta.titleByRequestType[requestType];
if (typeTitle) return typeTitle;
}
return meta.title;
+3
@@ -34,6 +34,9 @@ export interface Audiobook {
requestedByUsername?: string | null; // Username who requested (only if not current user)
hasReportedIssue?: boolean; // True if an open issue exists for this audiobook
isIgnored?: boolean; // True if this user has ignored this audiobook from auto-requests
language?: string;
formatType?: string;
publisherName?: string;
}
export function useAudiobooks(type: 'popular' | 'new-releases', limit: number = 20, page: number = 1, hideAvailable: boolean = false) {
+3 -4
@@ -19,6 +19,7 @@ import {
import { RMABLogger } from '../utils/logger';
import { parseRuntime } from '../utils/parse-runtime';
import { randomDelay } from '../utils/scrape-resilience';
import { extractAllNarrators } from '../utils/extract-narrator';
const logger = RMABLogger.create('Audible.Series');
@@ -442,10 +443,8 @@ function parseSeriesBooks(
const authorHref = authorLink.attr('href') || '';
const authorAsinMatch = authorHref.match(/\/author\/[^/]+\/([A-Z0-9]{10})/);
// Narrator
const narratorText = $el.find('a[href*="searchNarrator="]').first().text().trim() ||
$el.find('.narratorLabel').text().trim() ||
'';
// Narrator — capture all narrator links (multi-narrator productions are common)
const narratorText = extractAllNarrators($, $el);
// Cover art
const coverArtUrl = $el.find('img').first().attr('src')?.replace(/\._.*_\./, '._SL500_.') || '';
+237 -81
@@ -4,21 +4,26 @@
*/
import axios, { AxiosInstance } from 'axios';
import * as cheerio from 'cheerio';
import { RMABLogger } from '../utils/logger';
import { getConfigService } from '../services/config.service';
import { AudibleRegion, AUDIBLE_REGIONS, DEFAULT_AUDIBLE_REGION } from '../types/audible';
import {
getLanguageForRegion,
isAcceptedLanguage,
stripPrefixes,
buildContainsSelector,
type LanguageConfig,
} from '../constants/language-config';
import {
pickUserAgent,
getBrowserHeaders,
jitteredBackoff,
randomDelay,
AdaptivePacer,
FetchResultMeta,
} from '../utils/scrape-resilience';
import { parseRuntime as parseRuntimeUtil } from '../utils/parse-runtime';
import { extractAllNarrators } from '../utils/extract-narrator';
const logger = RMABLogger.create('Audible');
@@ -27,6 +32,13 @@ const AUDIBLE_PAGE_SIZE = 50;
const CATALOG_RESPONSE_GROUPS =
'contributors,product_desc,product_attrs,product_extended_attrs,media,rating,series,category_ladders,product_details';
// Retry/backoff knobs for HTML scraping (nightly refresh job only).
// Healthy users still finish quickly — per-page success returns on attempt 0
// with a 2-4s inter-page delay. Struggling users grind through 503 storms
// patiently: up to ~12 retries per request, with each backoff capped at 3 min.
const HTML_MAX_RETRIES = 12;
const HTML_MAX_BACKOFF_MS = 180_000;
export interface AudibleAudiobook {
asin: string;
title: string;
@@ -42,6 +54,9 @@ export interface AudibleAudiobook {
series?: string;
seriesPart?: string;
seriesAsin?: string;
language?: string;
formatType?: string;
publisherName?: string;
}
export interface AudibleSearchResult {
@@ -93,6 +108,8 @@ interface CatalogProduct {
runtime_length_min?: number;
release_date?: string;
language?: string;
format_type?: string;
publisher_name?: string;
rating?: {
overall_distribution?: {
display_stars?: number;
@@ -183,6 +200,9 @@ function mapCatalogProduct(product: CatalogProduct): AudibleAudiobook {
series,
seriesPart,
seriesAsin,
language: product.language ?? undefined,
formatType: product.format_type ?? undefined,
publisherName: product.publisher_name ?? undefined,
};
}
@@ -298,6 +318,7 @@ export class AudibleService {
config: any = {},
maxRetries: number = 5,
client: AxiosInstance = this.htmlClient,
maxBackoffMs: number = Number.POSITIVE_INFINITY,
): Promise<{ data: any; meta: FetchResultMeta }> {
let lastError: Error | null = null;
let retriesUsed = 0;
@@ -324,7 +345,7 @@ export class AudibleService {
retriesUsed++;
const backoffMs = jitteredBackoff(attempt);
const backoffMs = jitteredBackoff(attempt, 1000, maxBackoffMs);
logger.info(
` Request failed (${status || 'network error'}), retrying in ${backoffMs}ms (attempt ${attempt + 1}/${maxRetries})...`,
);
@@ -379,6 +400,12 @@ export class AudibleService {
throw lastError || new Error('External API request failed after retries');
}
/**
* Popular audiobooks from Audible's curated /adblbestsellers HTML page.
* Uses HTML scraping (not the catalog API) because the API's BestSellers sort
* is a right-now velocity rank that surfaces launch-day shovelware and preorders;
* the HTML page reflects Audible's editorial curation.
*/
async getPopularAudiobooks(limit: number = 20): Promise<AudibleAudiobook[]> {
await this.initialize();
@@ -395,42 +422,36 @@ export class AudibleService {
logger.info(` Fetching page ${page}/${maxPages}...`);
const { data: response, meta } = await this.fetchWithRetry(
'/1.0/catalog/products',
'/adblbestsellers',
{
params: {
products_sort_by: 'BestSellers',
num_results: AUDIBLE_PAGE_SIZE,
page: page - 1,
response_groups: CATALOG_RESPONSE_GROUPS,
ipRedirectOverride: 'true',
pageSize: AUDIBLE_PAGE_SIZE,
...(page > 1 ? { page } : {}),
},
},
5,
this.apiClient,
HTML_MAX_RETRIES,
this.htmlClient,
HTML_MAX_BACKOFF_MS,
);
const envelope: CatalogProductsResponse = response.data;
const products = envelope.products ?? [];
const totalResults = envelope.total_results ?? 0;
const foundOnPage = this.parseProductListItems(
response.data,
audiobooks,
limit,
);
for (const product of products) {
if (audiobooks.length >= limit) break;
if (audiobooks.some((b) => b.asin === product.asin)) continue;
audiobooks.push(mapCatalogProduct(product));
logger.info(` Found ${foundOnPage} audiobooks on page ${page}`);
if (foundOnPage < AUDIBLE_PAGE_SIZE / 2) {
logger.info(` Reached end of available pages`);
break;
}
logger.info(` Found ${products.length} audiobooks on page ${page}`);
const hasMore =
totalResults > 0
? totalResults > page * AUDIBLE_PAGE_SIZE
: products.length >= AUDIBLE_PAGE_SIZE;
if (!hasMore) break;
page++;
if (page <= maxPages && audiobooks.length < limit) {
await this.delay(this.apiPageDelay(meta));
await this.delay(this.pacer.reportPageResult(meta));
}
} catch (error) {
logger.error(`Failed to fetch page ${page} of popular audiobooks`, {
@@ -445,6 +466,11 @@ export class AudibleService {
return audiobooks;
}
/**
* New release audiobooks from Audible's curated /newreleases HTML page.
* Uses HTML scraping (not the catalog API) because the API's -ReleaseDate sort
* returns 100% future preorders with no released-only filter available.
*/
async getNewReleases(limit: number = 20): Promise<AudibleAudiobook[]> {
await this.initialize();
@@ -461,42 +487,36 @@ export class AudibleService {
logger.info(` Fetching page ${page}/${maxPages}...`);
const { data: response, meta } = await this.fetchWithRetry(
'/1.0/catalog/products',
'/newreleases',
{
params: {
products_sort_by: '-ReleaseDate',
num_results: AUDIBLE_PAGE_SIZE,
page: page - 1,
response_groups: CATALOG_RESPONSE_GROUPS,
ipRedirectOverride: 'true',
pageSize: AUDIBLE_PAGE_SIZE,
...(page > 1 ? { page } : {}),
},
},
5,
this.apiClient,
HTML_MAX_RETRIES,
this.htmlClient,
HTML_MAX_BACKOFF_MS,
);
const envelope: CatalogProductsResponse = response.data;
const products = envelope.products ?? [];
const totalResults = envelope.total_results ?? 0;
const foundOnPage = this.parseProductListItems(
response.data,
audiobooks,
limit,
);
for (const product of products) {
if (audiobooks.length >= limit) break;
if (audiobooks.some((b) => b.asin === product.asin)) continue;
audiobooks.push(mapCatalogProduct(product));
logger.info(` Found ${foundOnPage} audiobooks on page ${page}`);
if (foundOnPage < AUDIBLE_PAGE_SIZE / 2) {
logger.info(` Reached end of available pages`);
break;
}
logger.info(` Found ${products.length} audiobooks on page ${page}`);
const hasMore =
totalResults > 0
? totalResults > page * AUDIBLE_PAGE_SIZE
: products.length >= AUDIBLE_PAGE_SIZE;
if (!hasMore) break;
page++;
if (page <= maxPages && audiobooks.length < limit) {
await this.delay(this.apiPageDelay(meta));
await this.delay(this.pacer.reportPageResult(meta));
}
} catch (error) {
logger.error(`Failed to fetch page ${page} of new releases`, {
@@ -677,6 +697,9 @@ export class AudibleService {
series: data.seriesPrimary?.name || undefined,
seriesPart: data.seriesPrimary?.position || undefined,
seriesAsin: data.seriesPrimary?.asin || undefined,
language: data.language || undefined,
formatType: data.formatType || undefined,
publisherName: data.publisherName || undefined,
};
if (result.coverArtUrl && !result.coverArtUrl.includes('_SL500_')) {
@@ -791,6 +814,11 @@ export class AudibleService {
}
}
/**
* Category audiobooks from Audible's HTML /search?node=<categoryId> page,
* sorted by popularity-rank. Uses HTML scraping (not the catalog API) so
* results match Audible's curated category-storefront ordering.
*/
async getCategoryBooks(categoryId: string, limit: number = 200): Promise<AudibleAudiobook[]> {
await this.initialize();
@@ -805,43 +833,35 @@ export class AudibleService {
while (audiobooks.length < limit && page <= maxPages) {
try {
const { data: response, meta } = await this.fetchWithRetry(
'/1.0/catalog/products',
'/search',
{
params: {
category_id: categoryId,
products_sort_by: 'BestSellers',
num_results: AUDIBLE_PAGE_SIZE,
page: page - 1,
response_groups: CATALOG_RESPONSE_GROUPS,
ipRedirectOverride: 'true',
node: categoryId,
pageSize: AUDIBLE_PAGE_SIZE,
sort: 'popularity-rank',
...(page > 1 ? { page } : {}),
},
},
5,
this.apiClient,
HTML_MAX_RETRIES,
this.htmlClient,
HTML_MAX_BACKOFF_MS,
);
const envelope: CatalogProductsResponse = response.data;
const products = envelope.products ?? [];
const totalResults = envelope.total_results ?? 0;
const foundOnPage = this.parseSearchResultItems(
response.data,
audiobooks,
limit,
);
for (const product of products) {
if (audiobooks.length >= limit) break;
if (audiobooks.some((b) => b.asin === product.asin)) continue;
audiobooks.push(mapCatalogProduct(product));
}
logger.info(`Category ${categoryId}: found ${foundOnPage} books on page ${page}`);
logger.info(`Category ${categoryId}: found ${products.length} books on page ${page}`);
const hasMore =
totalResults > 0
? totalResults > page * AUDIBLE_PAGE_SIZE
: products.length >= AUDIBLE_PAGE_SIZE;
if (!hasMore) break;
if (foundOnPage < AUDIBLE_PAGE_SIZE / 2) break;
page++;
if (page <= maxPages && audiobooks.length < limit) {
await this.delay(this.apiPageDelay(meta));
await this.delay(this.pacer.reportPageResult(meta));
}
} catch (error) {
logger.error(`Failed to fetch category ${categoryId} page ${page}`, {
@@ -858,12 +878,148 @@ export class AudibleService {
return audiobooks;
}
private apiPageDelay(meta: FetchResultMeta): number {
if (meta.retriesUsed > 0) {
return this.pacer.reportPageResult(meta);
}
this.pacer.reportPageResult(meta);
return randomDelay(500, 1500);
private getLangConfig(): LanguageConfig {
return getLanguageForRegion(this.region);
}
private parseRuntime(runtimeText: string): number | undefined {
return parseRuntimeUtil(runtimeText, this.getLangConfig());
}
/**
* Parse the `.productListItem` blocks used by /adblbestsellers and /newreleases.
* Pushes matched books into `audiobooks` (skipping duplicates and respecting `limit`)
* and returns the count parsed from this page.
*/
private parseProductListItems(
html: string,
audiobooks: AudibleAudiobook[],
limit: number,
): number {
const $ = cheerio.load(html);
const langConfig = this.getLangConfig();
let foundOnPage = 0;
$('.productListItem').each((_index, element) => {
if (audiobooks.length >= limit) return false;
const $el = $(element);
const asin =
$el.find('li').attr('data-asin') ||
$el.find('a').attr('href')?.match(/\/(?:pd|ac)\/[^\/]+\/([A-Z0-9]{10})/)?.[1] ||
'';
if (!asin) return;
if (audiobooks.some((book) => book.asin === asin)) return;
const title =
$el.find('h3 a').text().trim() ||
$el.find('.bc-heading a').text().trim();
const authorText =
$el.find('.authorLabel').text().trim() ||
$el.find('.bc-size-small .bc-text-bold').first().text().trim();
const authorHref = $el.find('a[href*="/author/"]').first().attr('href') || '';
const authorAsinMatch = authorHref.match(/\/author\/[^\/]+\/([A-Z0-9]{10})/);
// Narrator — capture all narrator links (multi-narrator productions are common);
// fall back to .narratorLabel text, then to the bc-text-bold sibling for layouts
// that omit both anchor links and the .narratorLabel span.
const narratorText =
extractAllNarrators($, $el) ||
$el.find('.bc-size-small .bc-text-bold').eq(1).text().trim();
const coverArtUrl = $el.find('img').attr('src') || '';
const ratingText = $el.find('.ratingsLabel').text().trim();
const rating = ratingText ? parseFloat(ratingText.split(' ')[0]) : undefined;
audiobooks.push({
asin,
title,
author: stripPrefixes(authorText, langConfig.scraping.authorPrefixes),
authorAsin: authorAsinMatch?.[1] || undefined,
narrator: stripPrefixes(narratorText, langConfig.scraping.narratorPrefixes),
coverArtUrl: coverArtUrl.replace(/\._.*_\./, '._SL500_.'),
rating,
});
foundOnPage++;
});
return foundOnPage;
}
/**
* Parse the `.s-result-item` / `.productListItem` blocks used by
* /search?node=<categoryId>. Pushes matched books into `audiobooks`
* (skipping duplicates and respecting `limit`) and returns the count parsed
* from this page.
*/
private parseSearchResultItems(
html: string,
audiobooks: AudibleAudiobook[],
limit: number,
): number {
const $ = cheerio.load(html);
const langConfig = this.getLangConfig();
let foundOnPage = 0;
$('.s-result-item, .productListItem').each((_index, element) => {
if (audiobooks.length >= limit) return false;
const $el = $(element);
const asin =
$el.find('li').attr('data-asin') ||
$el.find('a').attr('href')?.match(/\/(?:pd|ac)\/[^\/]+\/([A-Z0-9]{10})/)?.[1] ||
'';
if (!asin) return;
if (audiobooks.some((b) => b.asin === asin)) return;
const title =
$el.find('h2').first().text().trim() ||
$el.find('h3 a').text().trim() ||
$el.find('.bc-heading a').text().trim();
const authorLink = $el.find('a[href*="/author/"]').first();
const authorText =
authorLink.text().trim() ||
$el.find('.authorLabel').text().trim();
const authorHref = authorLink.attr('href') || '';
const authorAsinMatch = authorHref.match(/\/author\/[^\/]+\/([A-Z0-9]{10})/);
// Narrator — capture all narrator links (multi-narrator productions are common)
const narratorText = extractAllNarrators($, $el);
const coverArtUrl = $el.find('img').attr('src') || '';
const runtimeText =
$el.find('.runtimeLabel').text().trim() ||
$el.find(buildContainsSelector('span', langConfig.scraping.lengthLabels)).text().trim();
const durationMinutes = this.parseRuntime(runtimeText);
const ratingText =
$el.find('.ratingsLabel').text().trim() ||
$el.find('.a-icon-star span').first().text().trim();
const rating = ratingText ? parseFloat(ratingText.split(' ')[0]) : undefined;
audiobooks.push({
asin,
title,
author: stripPrefixes(authorText, langConfig.scraping.authorPrefixes),
authorAsin: authorAsinMatch?.[1] || undefined,
narrator: stripPrefixes(narratorText, langConfig.scraping.narratorPrefixes),
coverArtUrl: coverArtUrl.replace(/\._.*_\./, '._SL500_.'),
durationMinutes,
rating,
});
foundOnPage++;
});
return foundOnPage;
}
private async delay(ms: number): Promise<void> {
@@ -138,16 +138,37 @@ async function persistSectionBooks(
logger: ReturnType<typeof RMABLogger.forJob>,
labelForErrors: string,
): Promise<number> {
// Defensive dedup: the (asin, categoryId) unique constraint means a duplicate ASIN
// in `books` crashes the second .create() with P2002. The HTML parser already dedupes
// per page and across pages against the cumulative accumulator, but a warn-on-fire
// signal here lets us detect upstream surprises (e.g. Audible serving the same item
// in both a carousel and the main grid) without the noisy duplicate-key Postgres
// errors. Keep the first occurrence so Audible's editorial ordering is preserved.
const seenAsins = new Set<string>();
const dedupedBooks = books.filter((b) => {
if (!b?.asin || seenAsins.has(b.asin)) return false;
seenAsins.add(b.asin);
return true;
});
const droppedCount = books.length - dedupedBooks.length;
if (droppedCount > 0) {
logger.warn(
`Dropped ${droppedCount} duplicate ASIN(s) from ${categoryId} input list before persist`,
);
}
// Wipe previous entries for this section
logger.info(`Clearing previous data for ${categoryId}...`);
await prisma.audibleCacheCategory.deleteMany({
where: { categoryId },
});
logger.info(`Cleared previous entries for ${categoryId}, saving ${books.length} books...`);
logger.info(
`Cleared previous entries for ${categoryId}, saving ${dedupedBooks.length} books...`,
);
let saved = 0;
for (let i = 0; i < books.length; i++) {
const book = books[i];
for (let i = 0; i < dedupedBooks.length; i++) {
const book = dedupedBooks[i];
try {
// Cache thumbnail if coverArtUrl exists
let cachedCoverPath: string | null = null;
@@ -31,13 +31,16 @@ export async function processDownloadTorrent(payload: DownloadTorrentPayload): P
try {
// Update request status to downloading
await prisma.request.update({
const request = await prisma.request.update({
where: { id: requestId },
data: {
status: 'downloading',
progress: 0,
updatedAt: new Date(),
},
include: {
user: { select: { plexUsername: true } },
},
});
// Detect protocol from result and get appropriate client
@@ -103,8 +106,22 @@ export async function processDownloadTorrent(payload: DownloadTorrentPayload): P
logger.info(`Created download history record: ${downloadHistory.id}`);
// Trigger monitor download job with initial delay
// Send grab notification (non-blocking — failures here don't fail the download)
const jobQueue = getJobQueueService();
const grabMessage = `${torrent.title} via ${torrent.indexer} (${client.clientType})`;
await jobQueue.addNotificationJob(
'request_grabbed',
requestId,
audiobook.title,
audiobook.author,
request.user.plexUsername || 'Unknown User',
grabMessage,
request.type
).catch((error) => {
logger.error('Failed to queue grab notification', { error: error instanceof Error ? error.message : String(error) });
});
// Trigger monitor download job with initial delay
await jobQueue.addMonitorJob(
requestId,
downloadHistory.id,
@@ -127,6 +127,7 @@ export class AppriseProvider implements INotificationProvider {
private formatMessage(payload: NotificationPayload): { title: string; body: string } {
const { event, title, author, userName, message, requestType } = payload;
const meta = getEventMeta(event);
const isIssue = event === 'issue_reported';
const messageLines = [
@@ -136,7 +137,9 @@ export class AppriseProvider implements INotificationProvider {
];
if (message) {
messageLines.push(isIssue ? `\u{1F4DD} Reason: ${message}` : `\u26A0\uFE0F Error: ${message}`);
const messageLabel = meta.messageLabel ?? 'Error';
const msgEmoji = meta.severity === 'error' ? '\u26A0\uFE0F' : '\u{1F4DD}';
messageLines.push(`${msgEmoji} ${messageLabel}: ${message}`);
}
return {
@@ -71,7 +71,7 @@ export class DiscordProvider implements INotificationProvider {
];
if (message) {
fields.push({ name: isIssue ? 'Reason' : 'Error', value: message, inline: false });
fields.push({ name: meta.messageLabel ?? 'Error', value: message, inline: false });
}
return {
@@ -84,6 +84,7 @@ export class NtfyProvider implements INotificationProvider {
private formatMessage(payload: NotificationPayload): { title: string; message: string } {
const { event, title, author, userName, message, requestType } = payload;
const meta = getEventMeta(event);
const isIssue = event === 'issue_reported';
const messageLines = [
@@ -93,7 +94,9 @@ export class NtfyProvider implements INotificationProvider {
];
if (message) {
messageLines.push(isIssue ? `\u{1F4DD} Reason: ${message}` : `\u26A0\uFE0F Error: ${message}`);
const messageLabel = meta.messageLabel ?? 'Error';
const msgEmoji = meta.severity === 'error' ? '\u26A0\uFE0F' : '\u{1F4DD}';
messageLines.push(`${msgEmoji} ${messageLabel}: ${message}`);
}
return {
@@ -91,7 +91,9 @@ export class PushoverProvider implements INotificationProvider {
];
if (message) {
messageLines.push('', isIssue ? `\u{1F4DD} Reason: ${message}` : `\u26A0\uFE0F Error: ${message}`);
const messageLabel = meta.messageLabel ?? 'Error';
const msgEmoji = meta.severity === 'error' ? '\u26A0\uFE0F' : '\u{1F4DD}';
messageLines.push('', `${msgEmoji} ${messageLabel}: ${message}`);
}
return {
+92 -1
@@ -9,7 +9,8 @@
import { prisma } from '@/lib/db';
import { RMABLogger } from '@/lib/utils/logger';
import type { DedupGroup } from '@/lib/utils/deduplicate-audiobooks';
import { metadataScore, type DedupGroup } from '@/lib/utils/deduplicate-audiobooks';
import type { AudibleAudiobook } from '@/lib/integrations/audible.service';
const logger = RMABLogger.create('WorksService');
@@ -182,6 +183,96 @@ export async function seedAsin(
}
}
// ---------------------------------------------------------------------------
// View-level collapse (consult the works table after local dedup)
// ---------------------------------------------------------------------------
/**
* Collapse books that already share a Work record according to the works table.
*
* The local `deduplicateAndCollectGroups()` pass is title/narrator/duration-based
* and stateless — it can fail to merge ASINs whose source metadata diverges (e.g.
* a series-page scrape captures different "first narrators" for two ASINs of the
* same recording, or two paginated pages each contain one ASIN and never compare
* them). The works table is the durable source of truth for "same book" identity,
* populated by every prior dedup pass and by request-time seeding. This pass
* applies that knowledge to the current view.
*
* Behavior:
* - Books whose ASINs map to a shared workId collapse to a single representative
* chosen by `metadataScore()` (same ranking as local dedup).
* - Books not present in any work, or in single-ASIN works, pass through untouched.
* - Original ordering is preserved (the kept representative sits at the position
* of the first occurrence of its work in the input list).
* - DB failure is non-fatal: the input list is returned unchanged so the view
* still renders (degrades to local-dedup-only behavior).
*/
export async function collapseByExistingWorks(
books: AudibleAudiobook[],
): Promise<AudibleAudiobook[]> {
if (books.length <= 1) return books;
try {
const asins = books.map(b => b.asin);
const entries = await prisma.workAsin.findMany({
where: { asin: { in: asins } },
select: { asin: true, workId: true },
});
if (entries.length === 0) return books;
// Map ASIN → workId for fast lookup in the loop below
const asinToWorkId = new Map<string, string>();
for (const entry of entries) {
asinToWorkId.set(entry.asin, entry.workId);
}
// Walk the input once, preserving position. For each work seen, keep a
// running "best" book; for books not in any work, emit immediately.
const result: AudibleAudiobook[] = [];
const workIdToResultIndex = new Map<string, number>();
for (const book of books) {
const workId = asinToWorkId.get(book.asin);
if (!workId) {
result.push(book);
continue;
}
const existingIndex = workIdToResultIndex.get(workId);
if (existingIndex === undefined) {
workIdToResultIndex.set(workId, result.length);
result.push(book);
continue;
}
// A sibling from this work is already in the result. Keep whichever
// has the richer metadata; on tie, keep the earlier entry (already there).
const existing = result[existingIndex];
if (metadataScore(book) > metadataScore(existing)) {
result[existingIndex] = book;
}
}
const collapsed = books.length - result.length;
if (collapsed > 0) {
logger.debug('Collapsed books via works table', {
inputCount: books.length,
outputCount: result.length,
collapsed,
});
}
return result;
} catch (error) {
logger.error('collapseByExistingWorks failed; returning input unchanged', {
error: error instanceof Error ? error.message : String(error),
bookCount: books.length,
});
return books;
}
}
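The single-pass collapse in `collapseByExistingWorks()` can be tried without a database. The sketch below is a simplified illustration, not the project's code: `Book`, `score`, and `collapseWithLookup` are hypothetical names, the Prisma query is replaced by a prebuilt `Map`, and the scoring function only counts the two fields visible in this diff (`coverArtUrl`, `rating`).

```typescript
// Hypothetical, trimmed-down book shape for illustration only.
interface Book {
  asin: string;
  title: string;
  coverArtUrl?: string;
  rating?: number;
}

// Simplified stand-in for metadataScore(): counts only two fields.
function score(book: Book): number {
  let s = 0;
  if (book.coverArtUrl) s++;
  if (book.rating != null) s++;
  return s;
}

// Same walk as the diff: first occurrence of a work reserves the slot,
// later siblings replace it only when they score strictly higher.
function collapseWithLookup(books: Book[], asinToWorkId: Map<string, string>): Book[] {
  const result: Book[] = [];
  const workIdToResultIndex = new Map<string, number>();

  for (const book of books) {
    const workId = asinToWorkId.get(book.asin);
    if (!workId) {
      result.push(book); // not in any work: pass through untouched
      continue;
    }
    const existingIndex = workIdToResultIndex.get(workId);
    if (existingIndex === undefined) {
      workIdToResultIndex.set(workId, result.length);
      result.push(book);
      continue;
    }
    if (score(book) > score(result[existingIndex])) {
      result[existingIndex] = book; // richer sibling takes the reserved slot
    }
  }
  return result;
}

// Made-up sample data: two ASINs of one work, with an unrelated book between them.
const sampleBooks: Book[] = [
  { asin: 'B000000001', title: 'Dune' },
  { asin: 'B000000002', title: 'Emma', rating: 4.2 },
  { asin: 'B000000003', title: 'Dune', coverArtUrl: 'https://example.com/dune.jpg', rating: 4.7 },
];
const sampleLookup = new Map<string, string>([
  ['B000000001', 'work-1'],
  ['B000000003', 'work-1'],
]);

const collapsed = collapseWithLookup(sampleBooks, sampleLookup);
console.log(collapsed.map((b) => b.asin));
```

The richer sibling ends up in the slot reserved by the first occurrence of its work, which is how the original ordering is preserved even when the representative changes.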
// ---------------------------------------------------------------------------
// Sibling ASIN lookup (for library matching expansion)
// ---------------------------------------------------------------------------
+115 -38
@@ -21,6 +21,12 @@ export const MAX_SCAN_DEPTH = 10;
/** Maximum concurrent ffprobe calls for metadata reads. */
const METADATA_CONCURRENCY = 10;
/**
* Folder names matching this pattern are considered generic and should not be
* used as Audible search terms (e.g. "CD1", "Disc 2", "Part 3", "Volume 1").
*/
const GENERIC_FOLDER_NAME_RE = /^(cd|disc|disk|part|vol(ume)?)\s*\d+$/i;
/** Metadata extracted from an audio file via ffprobe. */
export interface AudioFileMetadata {
title?: string; // From 'album' tag (book title)
@@ -39,7 +45,8 @@ export interface DiscoveredAudiobook {
totalSizeBytes: number;
metadata: AudioFileMetadata;
searchTerm: string; // Constructed search query for Audible
metadataSource: 'tags' | 'file_name'; // Where the search term came from
metadataSource: 'tags' | 'folder_name' | 'file_name'; // Where the search term came from
extractedAsin?: string; // ASIN extracted directly from folder name, if present
audioFiles: string[]; // File names (relative to folderPath) belonging to this book
groupingKey: string; // Normalized key for cross-folder deduplication
}
@@ -60,6 +67,18 @@ function isAudioFile(filename: string): boolean {
return (AUDIO_EXTENSIONS as readonly string[]).includes(ext);
}
/**
* Extract an Audible ASIN from a string (typically a folder name).
* Audible ASINs start with 'B' and are exactly 10 alphanumeric characters.
* The ASIN must be bounded by a bracket, parenthesis, whitespace, or string
* boundary to avoid false positives from random alphanumeric sequences.
* Returns the ASIN string or null if not found.
*/
export function extractAsinFromString(str: string): string | null {
const match = str.match(/(?:^|[\s\[\(])([B][A-Z0-9]{9})(?:$|[\s\]\)])/);
return match ? match[1] : null;
}
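To make the boundary rule concrete, here is a small runnable sketch. The function body is copied from the hunk above so the snippet stands alone; the folder names are invented for illustration and do not come from the PR.

```typescript
// Copied from the diff so this snippet runs on its own.
function extractAsinFromString(str: string): string | null {
  const match = str.match(/(?:^|[\s\[\(])([B][A-Z0-9]{9})(?:$|[\s\]\)])/);
  return match ? match[1] : null;
}

// Hypothetical folder names, chosen only to exercise each boundary case.
const sampleFolderNames: string[] = [
  'Project Hail Mary [B08G9PRS1K]', // bracketed: accepted
  'B08G9PRS1K',                     // whole string: accepted
  'Andy Weir B08G9PRS1K 2021',      // whitespace on both sides: accepted
  'XB08G9PRS1K',                    // preceded by a letter: rejected
  'b08g9prs1k',                     // lowercase: rejected (the regex has no /i flag)
];

for (const name of sampleFolderNames) {
  console.log(`${name} -> ${extractAsinFromString(name)}`);
}
```

Because the pattern only checks shape, any bounded 10-character uppercase token starting with `B` also matches (the word `BESTSELLER`, for instance), so callers may want to treat the result as a hint to verify.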
/**
* Read audio metadata from a file using ffprobe.
* Extracts album, album_artist, composer, and title tags.
@@ -140,15 +159,36 @@ export function deduplicateNames(
}
/**
* Build a search term from metadata or file name.
* Clean a raw string (folder name or file name) for use as an Audible search term.
* Strips file extension, bracketed ASINs, bracketed years, leading track numbers,
* underscores, and collapses whitespace.
*/
function cleanSearchString(raw: string): string {
return raw
.replace(/\.[^.]+$/, '') // Remove file extension
.replace(/[\[\(][A-Z0-9]{10}[\]\)]/g, '') // Remove ASIN in brackets
.replace(/[\[\(]\d{4}[\]\)]/g, '') // Remove year in brackets
.replace(/^\d+[\s._-]+/, '') // Remove leading track numbers
.replace(/[_]/g, ' ') // Underscores to spaces
.replace(/\s+/g, ' ') // Collapse whitespace
.trim();
}
/**
* Build a search term from metadata or folder/file name.
* Returns the search term and the source it was derived from.
*
* Fallback chain (when no album metadata tag is present):
* 1. Folder name — if provided and not a generic name (CD1, Disc 2, Part 3, etc.)
* 2. First audio file name — last resort, always available
*
* When metadata tags are present, constructs "Title Author Narrator ContributingArtists".
* When tags are empty, falls back to the first audio file's name (cleaned).
*/
export function buildSearchTerm(
metadata: AudioFileMetadata,
firstFileName: string
): { searchTerm: string; source: 'tags' | 'file_name' } {
firstFileName: string,
folderName?: string
): { searchTerm: string; source: 'tags' | 'folder_name' | 'file_name' } {
const { author, narrator, contributingArtists } = deduplicateNames(
metadata.author,
metadata.narrator,
@@ -165,23 +205,23 @@ export function buildSearchTerm(
return { searchTerm: parts.join(' '), source: 'tags' };
}
// Fallback: clean up the first audio file name and use it as search term
const cleaned = firstFileName
.replace(/\.[^.]+$/, '') // Remove file extension
.replace(/[\[\(][A-Z0-9]{10}[\]\)]/g, '') // Remove ASIN in brackets
.replace(/[\[\(]\d{4}[\]\)]/g, '') // Remove year in brackets
.replace(/^\d+[\s._-]+/, '') // Remove leading track numbers
.replace(/[_]/g, ' ') // Underscores to spaces
.replace(/\s+/g, ' ') // Collapse whitespace
.trim();
// Fallback 1: folder name (if provided and not generic)
if (folderName && !GENERIC_FOLDER_NAME_RE.test(folderName.trim())) {
const cleaned = cleanSearchString(folderName);
if (cleaned) {
return { searchTerm: cleaned, source: 'folder_name' };
}
}
// Fallback 2: first audio file name
const cleaned = cleanSearchString(firstFileName);
return { searchTerm: cleaned || firstFileName, source: 'file_name' };
}
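The tagless fallback chain can be exercised in isolation. In the sketch below, the regex and `cleanSearchString()` are copied from the diff, while `pickFallbackTerm` is a hypothetical helper that reproduces only the no-tags branch of `buildSearchTerm()`; the sample folder and file names are invented.

```typescript
const GENERIC_FOLDER_NAME_RE = /^(cd|disc|disk|part|vol(ume)?)\s*\d+$/i;

// Copied from the diff so this snippet runs on its own.
function cleanSearchString(raw: string): string {
  return raw
    .replace(/\.[^.]+$/, '')                   // Remove file extension
    .replace(/[\[\(][A-Z0-9]{10}[\]\)]/g, '')  // Remove ASIN in brackets
    .replace(/[\[\(]\d{4}[\]\)]/g, '')         // Remove year in brackets
    .replace(/^\d+[\s._-]+/, '')               // Remove leading track numbers
    .replace(/[_]/g, ' ')                      // Underscores to spaces
    .replace(/\s+/g, ' ')                      // Collapse whitespace
    .trim();
}

type FallbackSource = 'folder_name' | 'file_name';

// Hypothetical helper: the branch of buildSearchTerm() taken when no tags exist.
function pickFallbackTerm(
  firstFileName: string,
  folderName?: string,
): { searchTerm: string; source: FallbackSource } {
  // Fallback 1: folder name, unless it is generic (CD1, Disc 2, ...).
  if (folderName && !GENERIC_FOLDER_NAME_RE.test(folderName.trim())) {
    const cleaned = cleanSearchString(folderName);
    if (cleaned) return { searchTerm: cleaned, source: 'folder_name' };
  }
  // Fallback 2: first audio file name.
  const cleaned = cleanSearchString(firstFileName);
  return { searchTerm: cleaned || firstFileName, source: 'file_name' };
}

console.log(pickFallbackTerm('01 - Chapter One.mp3', 'The Hobbit (1937) [B0099RKRSK]'));
console.log(pickFallbackTerm('01 - The Hobbit.mp3', 'CD1'));
console.log(cleanSearchString('Dr. No'));
```

One behavior that may be worth a follow-up check: the extension-stripping step also runs on folder names, so a folder called `Dr. No` is reduced to `Dr` before it is used as a search term.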
/**
* Build a normalized grouping key from metadata.
* Used to determine which files belong to the same book.
* Returns null if metadata has no title (ungroupable).
* Returns null if metadata has no title (ungroupable by metadata).
*/
function buildGroupingKey(metadata: AudioFileMetadata): string | null {
if (!metadata.title) return null;
@@ -259,17 +299,23 @@ async function asyncPool<T, R>(
* Group audio files in a directory by their metadata.
* Reads metadata from all files using a concurrency pool, then groups them
* by a normalized key of title + author + narrator.
* Files with no metadata title each become their own group.
*
* Files with a metadata title are grouped by their shared key. Files with no
* metadata title are all grouped together under a single '__ungrouped_folder'
* key (rather than one entry per file), treating the folder as one book.
* If a folder contains both tagged and untagged files, the untagged files form
* one extra group alongside the tagged groups.
*/
async function groupAudioFilesByMetadata(
dirPath: string,
audioFiles: string[],
audioSizes: Map<string, number>
audioSizes: Map<string, number>,
folderName: string
): Promise<Array<{
files: string[];
totalSize: number;
metadata: AudioFileMetadata;
metadataSource: 'tags' | 'file_name';
metadataSource: 'tags' | 'folder_name' | 'file_name';
searchTerm: string;
groupingKey: string;
}>> {
@@ -291,14 +337,12 @@ async function groupAudioFilesByMetadata(
metadata: AudioFileMetadata;
}>();
let ungroupedCounter = 0;
for (const { fileName, metadata } of metadataResults) {
const key = buildGroupingKey(metadata);
const fileSize = audioSizes.get(fileName) || 0;
if (key) {
// Has metadata — group with others sharing the same key
// Has metadata title — group with others sharing the same key
const existing = groups.get(key);
if (existing) {
existing.files.push(fileName);
@@ -311,20 +355,45 @@ async function groupAudioFilesByMetadata(
});
}
} else {
// No title metadata — treat as individual book
const uniqueKey = `__ungrouped_${ungroupedCounter++}`;
groups.set(uniqueKey, {
files: [fileName],
totalSize: fileSize,
metadata,
});
// No title metadata — collect all such files under one folder-level group.
// Key must start with '__ungrouped_' so deduplicateDiscoveries treats it
// as unique per folder (prefixes it with folderPath before deduplication).
const ungroupedKey = '__ungrouped_folder';
const existing = groups.get(ungroupedKey);
if (existing) {
existing.files.push(fileName);
existing.totalSize += fileSize;
} else {
groups.set(ungroupedKey, {
files: [fileName],
totalSize: fileSize,
metadata,
});
}
}
}
// If there is exactly one tagged group alongside an ungrouped group, absorb
// the untagged files into the tagged group. Untagged files in the same folder
// almost certainly belong to the same book (e.g. one chapter was ripped
// without tags, or a cover/intro file carries different metadata).
// Only do this when there is a single tagged group — multiple tagged groups
// mean genuinely different books are mixed in the folder, so keep them separate.
const ungrouped = groups.get('__ungrouped_folder');
if (ungrouped) {
const taggedKeys = Array.from(groups.keys()).filter((k) => k !== '__ungrouped_folder');
if (taggedKeys.length === 1) {
const taggedGroup = groups.get(taggedKeys[0])!;
taggedGroup.files.push(...ungrouped.files);
taggedGroup.totalSize += ungrouped.totalSize;
groups.delete('__ungrouped_folder');
}
}
// Build result with search terms
return Array.from(groups.entries()).map(([groupingKey, group]) => {
group.files.sort((a, b) => a.localeCompare(b));
const { searchTerm, source } = buildSearchTerm(group.metadata, group.files[0]);
const { searchTerm, source } = buildSearchTerm(group.metadata, group.files[0], folderName);
return {
files: group.files,
totalSize: group.totalSize,
@@ -398,6 +467,7 @@ function deduplicateDiscoveries(
metadata: first.metadata,
searchTerm: first.searchTerm,
metadataSource: first.metadataSource,
extractedAsin: first.extractedAsin,
audioFiles: combinedFiles,
groupingKey: first.groupingKey,
});
@@ -434,9 +504,10 @@ function findCommonParent(paths: string[]): string {
*
* Scans every folder for audio files. When audio files are found, they are
* grouped by metadata (title + author + narrator) — each group becomes a
* separate discovered audiobook. Files with no metadata are treated as
* individual books. Scanning ALWAYS recurses into subfolders regardless of
* whether the current folder has audio files.
* separate discovered audiobook. Files with no metadata are all grouped
* together per folder (treated as one book) rather than one entry per file.
* Scanning ALWAYS recurses into subfolders regardless of whether the current
* folder has audio files.
*
* After the full walk, discoveries sharing the same grouping key across
* different folders (e.g., CD1/ and CD2/) are merged.
@@ -460,11 +531,13 @@ export async function discoverAudiobooks(
foldersScanned++;
const folderName = path.basename(currentPath);
onProgress?.({
phase: 'discovering',
foldersScanned,
audiobooksFound: results.length,
currentFolder: path.basename(currentPath),
currentFolder: folderName,
});
// Check if this folder contains audio files
@@ -486,19 +559,22 @@ export async function discoverAudiobooks(
phase: 'grouping',
foldersScanned,
audiobooksFound: results.length,
currentFolder: path.basename(currentPath),
currentFolder: folderName,
});
// Group audio files by metadata
// Group audio files by metadata, passing folder name for fallback search terms
const groups = await groupAudioFilesByMetadata(
currentPath,
audioResult.audioFiles,
audioSizes
audioSizes,
folderName
);
const folderName = path.basename(currentPath);
const relativePath = path.relative(rootPath, currentPath).replace(/\\/g, '/');
// Extract ASIN from folder name once for all groups in this folder
const extractedAsin = extractAsinFromString(folderName) ?? undefined;
for (const group of groups) {
results.push({
folderPath: currentPath.replace(/\\/g, '/'),
@@ -509,6 +585,7 @@ export async function discoverAudiobooks(
metadata: group.metadata,
searchTerm: group.searchTerm,
metadataSource: group.metadataSource,
extractedAsin,
audioFiles: group.files,
groupingKey: group.groupingKey,
});
@@ -518,7 +595,7 @@ export async function discoverAudiobooks(
phase: 'reading_metadata',
foldersScanned,
audiobooksFound: results.length,
currentFolder: path.basename(currentPath),
currentFolder: folderName,
});
}
+6 -1
@@ -109,7 +109,12 @@ export function areDurationsCompatible(a?: number, b?: number): boolean {
// Metadata scoring (for picking best representative)
// ---------------------------------------------------------------------------
function metadataScore(book: AudibleAudiobook): number {
/**
* Score a book by how much metadata it carries. Used as the tie-breaker when
* collapsing duplicates — the entry with the richest metadata wins. Exported
* so the works-table collapse pass can apply the same ranking.
*/
export function metadataScore(book: AudibleAudiobook): number {
let score = 0;
if (book.coverArtUrl) score++;
if (book.rating != null) score++;
+37
@@ -0,0 +1,37 @@
/**
* Component: Narrator Extraction Utility
* Documentation: documentation/integrations/audible.md
*
* Shared helper for Audible HTML scrapers. Audible product listings render
* each narrator as a separate `<a href="?searchNarrator=...">` link; using
* `.first()` on that selector silently drops co-narrators and breaks dedup
* for multi-narrator productions (e.g. full-cast audiobooks). This helper
* captures every narrator link and joins them, falling back to the
* `.narratorLabel` span when no anchor links are present.
*/
import type * as cheerio from 'cheerio';
import type { AnyNode } from 'domhandler';
/**
* Extract a comma-joined narrator string from an Audible product list item.
*
* Order is not semantically significant — downstream `normalizeNarrator()`
* sorts before comparison — but document-order preserves a stable, legible
* value for caching and logging.
*/
export function extractAllNarrators(
$: cheerio.CheerioAPI,
$el: cheerio.Cheerio<AnyNode>,
): string {
const links = $el.find('a[href*="searchNarrator="]');
if (links.length > 0) {
const names: string[] = [];
links.each((_, link) => {
const name = $(link).text().trim();
if (name) names.push(name);
});
if (names.length > 0) return names.join(', ');
}
return $el.find('.narratorLabel').text().trim();
}
+9 -3
@@ -38,12 +38,18 @@ export function getBrowserHeaders(userAgent: string): Record<string, string> {
}
/**
* Jittered exponential backoff: 2^attempt * baseMs * random(0.5, 1.5)
* Jittered exponential backoff: 2^attempt * baseMs * random(0.5, 1.5),
* optionally capped so high attempt counts don't produce absurd waits.
* Avoids predictable retry timing that is trivially fingerprinted.
*/
export function jitteredBackoff(attempt: number, baseMs: number = 1000): number {
export function jitteredBackoff(
attempt: number,
baseMs: number = 1000,
maxBackoffMs: number = Number.POSITIVE_INFINITY,
): number {
const jitter = 0.5 + Math.random(); // 0.5 to 1.5
return Math.round(Math.pow(2, attempt) * baseMs * jitter);
const raw = Math.pow(2, attempt) * baseMs * jitter;
return Math.round(Math.min(raw, maxBackoffMs));
}
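A quick way to see what the new cap does is to compare capped and uncapped delays across attempts. The function below is copied from the hunk above; `ASSUMED_MAX_BACKOFF_MS` is an illustrative value, since the real `HTML_MAX_BACKOFF_MS` constant is not shown in this diff.

```typescript
// Copied from the diff so this snippet runs on its own.
function jitteredBackoff(
  attempt: number,
  baseMs: number = 1000,
  maxBackoffMs: number = Number.POSITIVE_INFINITY,
): number {
  const jitter = 0.5 + Math.random(); // in [0.5, 1.5)
  const raw = Math.pow(2, attempt) * baseMs * jitter;
  return Math.round(Math.min(raw, maxBackoffMs));
}

// Assumption for illustration only; not the project's actual constant.
const ASSUMED_MAX_BACKOFF_MS = 30000;

for (let attempt = 0; attempt <= 8; attempt++) {
  const uncapped = jitteredBackoff(attempt);
  const capped = jitteredBackoff(attempt, 1000, ASSUMED_MAX_BACKOFF_MS);
  console.log(`attempt ${attempt}: uncapped ${uncapped} ms, capped ${capped} ms`);
}
```

Note that the cap is applied after the jitter, so once the raw delay exceeds the cap every retry waits exactly the cap and the jitter no longer varies the timing.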
/** Random integer in [minMs, maxMs] */
+420 -88
@@ -49,6 +49,8 @@ interface ProductOverrides {
runtime_length_min?: number;
release_date?: string;
language?: string;
format_type?: string;
publisher_name?: string;
rating?: { overall_distribution?: { display_stars?: number } };
category_ladders?: Array<{ ladder: Array<{ name: string }> }>;
series?: Array<{ asin?: string; title?: string; sequence?: string }>;
@@ -81,6 +83,122 @@ function apiResponse(envelope: object) {
return { data: envelope };
}
// ---------------------------------------------------------------------------
// HTML fixture helpers (for getPopularAudiobooks / getNewReleases / getCategoryBooks,
// which scrape Audible's curated HTML pages)
// ---------------------------------------------------------------------------
interface HtmlBookOverrides {
asin?: string;
title?: string;
author?: string;
authorAsin?: string;
/** Single-narrator shorthand; mutually exclusive with `narrators`. */
narrator?: string;
/** Multi-narrator productions render each name as its own searchNarrator anchor. */
narrators?: string[];
coverArtUrl?: string;
rating?: number;
}
/** Render one or more narrator anchor links suitable for embedding in .narratorLabel. */
function renderNarratorLinks(names: string[]): string {
return names
.map(
(name) =>
`<a href="/search?searchNarrator=${encodeURIComponent(name)}">${name}</a>`,
)
.join(', ');
}
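To make the fixture's round trip concrete, the sketch below repeats `renderNarratorLinks` and pairs it with a dependency-free stand-in for the extraction step. `extractNarratorNames` is a hypothetical helper that uses a regular expression purely for illustration; the code under test uses cheerio selectors (`a[href*="searchNarrator="]`), not a regex.

```typescript
// Same helper as in the fixture above.
function renderNarratorLinks(names: string[]): string {
  return names
    .map(
      (name) =>
        `<a href="/search?searchNarrator=${encodeURIComponent(name)}">${name}</a>`,
    )
    .join(', ');
}

// HYPOTHETICAL stand-in for extractAllNarrators(): collects the text of every
// searchNarrator anchor in document order and joins the names with ', '.
function extractNarratorNames(html: string): string {
  const pattern = /<a [^>]*searchNarrator=[^>]*>([^<]*)<\/a>/g;
  const names: string[] = [];
  let match: RegExpExecArray | null;
  while ((match = pattern.exec(html)) !== null) {
    const name = match[1].trim();
    if (name) names.push(name);
  }
  return names.join(', ');
}

const label = `<span class="narratorLabel">Narrated by: ${renderNarratorLinks([
  'Kristin Atherton',
  'Roy McMillan',
])}</span>`;
const roundTripped = extractNarratorNames(label); // 'Kristin Atherton, Roy McMillan'
```

Note that the fixture interpolates `name` into the anchor text without HTML-escaping it, which is fine for the plain names used in these tests but would need care for names containing `<` or `&`.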
/**
* Produces a single .productListItem block matching the selectors parsed by
* parseProductListItems(). The parser looks for an `<li data-asin>` descendant,
* with an `<a href="/pd/...">` fallback — using a real `<li>` here both
* exercises the primary path and keeps the markup well-formed.
*/
function makeProductListItemHtml(overrides: HtmlBookOverrides = {}): string {
const {
asin = 'B000000001',
title = 'Test Book',
author = 'Test Author',
authorAsin = 'A000000001',
narrator = 'Test Narrator',
narrators,
coverArtUrl = 'https://images.example.com/cover._SL500_.jpg',
rating = 4.5,
} = overrides;
// Real Audible storefront markup embeds each narrator as its own anchor inside
// .narratorLabel for multi-narrator productions. The single-narrator case keeps
// the original plain-text span for backward compatibility with existing tests.
const narratorMarkup = narrators && narrators.length > 0
? `<span class="narratorLabel">Narrated by: ${renderNarratorLinks(narrators)}</span>`
: `<span class="narratorLabel">${narrator}</span>`;
return `
<div class="productListItem">
<ul>
<li data-asin="${asin}">
<img src="${coverArtUrl}" />
<h3><a href="/pd/test/${asin}">${title}</a></h3>
<a class="authorLabel" href="/author/test/${authorAsin}">${author}</a>
${narratorMarkup}
<span class="ratingsLabel">${rating} out of 5</span>
</li>
</ul>
</div>
`;
}
/**
* Produces a single .s-result-item block matching the selectors parsed by
* parseSearchResultItems(). Used for /search?node=<categoryId> category pages.
*/
function makeSearchResultItemHtml(overrides: HtmlBookOverrides = {}): string {
const {
asin = 'B000000001',
title = 'Test Book',
author = 'Test Author',
authorAsin = 'A000000001',
narrator = 'Test Narrator',
narrators,
coverArtUrl = 'https://images.example.com/cover._SL500_.jpg',
rating = 4.5,
} = overrides;
const narratorLinks = narrators && narrators.length > 0
? renderNarratorLinks(narrators)
: `<a href="/search?searchNarrator=${encodeURIComponent(narrator)}">${narrator}</a>`;
return `
<div class="s-result-item">
<ul>
<li data-asin="${asin}">
<img src="${coverArtUrl}" />
<h2><a href="/pd/test/${asin}">${title}</a></h2>
<a href="/author/test/${authorAsin}">${author}</a>
${narratorLinks}
<span class="ratingsLabel">${rating} out of 5</span>
</li>
</ul>
</div>
`;
}
/** Wrap one or more item-HTML strings in a minimal page document. */
function makeHtmlPage(items: string[]): string {
return `<html><body>${items.join('')}</body></html>`;
}
/**
* Produces the value that client.get() should resolve to for HTML responses.
* cheerio.load() is called on response.data, so .data must be the raw HTML string.
*/
function htmlResponse(html: string) {
return { data: html };
}
// ---------------------------------------------------------------------------
// Test setup
// ---------------------------------------------------------------------------
@@ -499,6 +617,47 @@ describe('AudibleService', () => {
const genreSet = new Set(results[0].genres);
expect(genreSet.size).toBe(5);
});
it('maps language from catalog product', async () => {
const products = [makeProduct({ language: 'english' })];
apiClientMock.get.mockResolvedValue(apiResponse(makeProductsResponse(products)));
const service = new AudibleService();
const { results } = await service.search('test', 1);
expect(results[0].language).toBe('english');
});
it('maps format_type to formatType from catalog product', async () => {
const products = [makeProduct({ format_type: 'unabridged' })];
apiClientMock.get.mockResolvedValue(apiResponse(makeProductsResponse(products)));
const service = new AudibleService();
const { results } = await service.search('test', 1);
expect(results[0].formatType).toBe('unabridged');
});
it('maps publisher_name to publisherName from catalog product', async () => {
const products = [makeProduct({ publisher_name: 'Penguin Random House Audio' })];
apiClientMock.get.mockResolvedValue(apiResponse(makeProductsResponse(products)));
const service = new AudibleService();
const { results } = await service.search('test', 1);
expect(results[0].publisherName).toBe('Penguin Random House Audio');
});
it('leaves formatType and publisherName undefined when catalog product omits them', async () => {
const products = [makeProduct()];
apiClientMock.get.mockResolvedValue(apiResponse(makeProductsResponse(products)));
const service = new AudibleService();
const { results } = await service.search('test', 1);
expect(results[0].formatType).toBeUndefined();
expect(results[0].publisherName).toBeUndefined();
});
});
// -------------------------------------------------------------------------
@@ -683,61 +842,66 @@ describe('AudibleService', () => {
});
// -------------------------------------------------------------------------
// getPopularAudiobooks()
// getPopularAudiobooks() — HTML scraping of /adblbestsellers
// -------------------------------------------------------------------------
describe('getPopularAudiobooks()', () => {
it('uses products_sort_by: BestSellers', async () => {
apiClientMock.get.mockResolvedValue(apiResponse(makeProductsResponse([])));
it('hits /adblbestsellers on the htmlClient with pageSize=50', async () => {
htmlClientMock.get.mockResolvedValue(htmlResponse(makeHtmlPage([makeProductListItemHtml()])));
const service = new AudibleService();
await service.getPopularAudiobooks(1);
expect(apiClientMock.get.mock.calls[0][1].params.products_sort_by).toBe('BestSellers');
expect(htmlClientMock.get).toHaveBeenCalledWith(
'/adblbestsellers',
expect.objectContaining({
params: expect.objectContaining({ pageSize: 50 }),
}),
);
});
it('subtracts 1 from public page=1 before calling the API', async () => {
apiClientMock.get.mockResolvedValue(apiResponse(makeProductsResponse([])));
it('does not include a page param on the first request (only from page 2 onward)', async () => {
htmlClientMock.get.mockResolvedValue(htmlResponse(makeHtmlPage([makeProductListItemHtml()])));
const service = new AudibleService();
const delaySpy = vi.spyOn(service as any, 'delay').mockResolvedValue(undefined);
await service.getPopularAudiobooks(1);
expect(apiClientMock.get.mock.calls[0][1].params.page).toBe(0);
expect(htmlClientMock.get.mock.calls[0][1].params.page).toBeUndefined();
delaySpy.mockRestore();
});
it('makes a second call with page=1 when paginating to page 2', async () => {
const page1Products = Array.from({ length: 50 }, (_, i) =>
makeProduct({ asin: `B${String(i).padStart(9, '0')}`, title: `Book ${i}` }),
it('includes page=2 on the second request when paginating', async () => {
const page1Items = Array.from({ length: 50 }, (_, i) =>
makeProductListItemHtml({ asin: `B${String(i).padStart(9, '0')}`, title: `Book ${i}` }),
);
const page2Products = Array.from({ length: 25 }, (_, i) =>
makeProduct({ asin: `B${String(i + 50).padStart(9, '0')}`, title: `Book ${i + 50}` }),
const page2Items = Array.from({ length: 25 }, (_, i) =>
makeProductListItemHtml({ asin: `B${String(i + 50).padStart(9, '0')}`, title: `Book ${i + 50}` }),
);
apiClientMock.get
.mockResolvedValueOnce(apiResponse(makeProductsResponse(page1Products, 75)))
.mockResolvedValueOnce(apiResponse(makeProductsResponse(page2Products, 75)));
htmlClientMock.get
.mockResolvedValueOnce(htmlResponse(makeHtmlPage(page1Items)))
.mockResolvedValueOnce(htmlResponse(makeHtmlPage(page2Items)));
const service = new AudibleService();
const delaySpy = vi.spyOn(service as any, 'delay').mockResolvedValue(undefined);
await service.getPopularAudiobooks(75);
expect(apiClientMock.get.mock.calls[1][1].params.page).toBe(1);
expect(htmlClientMock.get.mock.calls[1][1].params.page).toBe(2);
delaySpy.mockRestore();
});
it('paginates and returns up to the requested limit', async () => {
const page1Products = Array.from({ length: 50 }, (_, i) =>
makeProduct({ asin: `B${String(i).padStart(9, '0')}`, title: `Book ${i}` }),
it('paginates across pages and returns up to the requested limit', async () => {
const page1Items = Array.from({ length: 50 }, (_, i) =>
makeProductListItemHtml({ asin: `B${String(i).padStart(9, '0')}`, title: `Book ${i}` }),
);
const page2Products = Array.from({ length: 25 }, (_, i) =>
makeProduct({ asin: `B${String(i + 50).padStart(9, '0')}`, title: `Book ${i + 50}` }),
const page2Items = Array.from({ length: 25 }, (_, i) =>
makeProductListItemHtml({ asin: `B${String(i + 50).padStart(9, '0')}`, title: `Book ${i + 50}` }),
);
apiClientMock.get
.mockResolvedValueOnce(apiResponse(makeProductsResponse(page1Products, 75)))
.mockResolvedValueOnce(apiResponse(makeProductsResponse(page2Products, 75)));
htmlClientMock.get
.mockResolvedValueOnce(htmlResponse(makeHtmlPage(page1Items)))
.mockResolvedValueOnce(htmlResponse(makeHtmlPage(page2Items)));
const service = new AudibleService();
const delaySpy = vi.spyOn(service as any, 'delay').mockResolvedValue(undefined);
@@ -747,176 +911,338 @@ describe('AudibleService', () => {
delaySpy.mockRestore();
});
it('stops early when a page returns fewer than the page size', async () => {
const products = [makeProduct()];
apiClientMock.get.mockResolvedValueOnce(apiResponse(makeProductsResponse(products, 1)));
it('stops early when a page returns fewer than half the page size', async () => {
htmlClientMock.get.mockResolvedValueOnce(
htmlResponse(makeHtmlPage([makeProductListItemHtml()])),
);
const service = new AudibleService();
const results = await service.getPopularAudiobooks(50);
expect(results).toHaveLength(1);
expect(apiClientMock.get).toHaveBeenCalledTimes(1);
expect(htmlClientMock.get).toHaveBeenCalledTimes(1);
});
it('deduplicates by ASIN across pages', async () => {
const sharedProduct = makeProduct({ asin: 'BDUP000001', title: 'Duplicated Book' });
const uniqueProduct = makeProduct({ asin: 'BUNIQ000001', title: 'Unique Book' });
const sharedAsin = 'BDUP000001';
const uniqueAsin = 'BUNIQ000001';
apiClientMock.get
.mockResolvedValueOnce(
apiResponse(makeProductsResponse([sharedProduct], 51)),
)
.mockResolvedValueOnce(
// page 2 returns the same ASIN plus a new one
apiResponse(makeProductsResponse([sharedProduct, uniqueProduct], 51)),
);
// Build a "full" first page (50 items: the shared ASIN plus 49 unique fillers)
// so that pagination proceeds to page 2.
const page1Items = [
makeProductListItemHtml({ asin: sharedAsin, title: 'Duplicated Book' }),
...Array.from({ length: 49 }, (_, i) =>
makeProductListItemHtml({ asin: `BFILL${String(i).padStart(5, '0')}`, title: `Filler ${i}` }),
),
];
const page2Items = [
makeProductListItemHtml({ asin: sharedAsin, title: 'Duplicated Book' }),
makeProductListItemHtml({ asin: uniqueAsin, title: 'Unique Book' }),
...Array.from({ length: 48 }, (_, i) =>
makeProductListItemHtml({ asin: `BFILL2${String(i).padStart(4, '0')}`, title: `Filler2 ${i}` }),
),
];
htmlClientMock.get
.mockResolvedValueOnce(htmlResponse(makeHtmlPage(page1Items)))
.mockResolvedValueOnce(htmlResponse(makeHtmlPage(page2Items)));
const service = new AudibleService();
const delaySpy = vi.spyOn(service as any, 'delay').mockResolvedValue(undefined);
const results = await service.getPopularAudiobooks(100);
const results = await service.getPopularAudiobooks(150);
const asins = results.map((r) => r.asin);
expect(asins.filter((a) => a === 'BDUP000001')).toHaveLength(1);
expect(asins.filter((a) => a === sharedAsin)).toHaveLength(1);
expect(asins).toContain(uniqueAsin);
delaySpy.mockRestore();
});
it('returns empty array on error without throwing', async () => {
const error: Error & { response?: { status: number } } = new Error('Not Found');
error.response = { status: 404 };
apiClientMock.get.mockRejectedValue(error);
htmlClientMock.get.mockRejectedValue(error);
const service = new AudibleService();
const results = await service.getPopularAudiobooks(5);
expect(results).toEqual([]);
});
it('uses htmlClient (not apiClient) for the request', async () => {
htmlClientMock.get.mockResolvedValue(htmlResponse(makeHtmlPage([makeProductListItemHtml()])));
const service = new AudibleService();
await service.getPopularAudiobooks(1);
expect(htmlClientMock.get).toHaveBeenCalled();
expect(apiClientMock.get).not.toHaveBeenCalled();
});
it('maps title, author, narrator, and rating from the parsed item', async () => {
htmlClientMock.get.mockResolvedValue(
htmlResponse(
makeHtmlPage([
makeProductListItemHtml({
asin: 'B0HTMLMAP1',
title: 'Mapped Title',
author: 'Mapped Author',
authorAsin: 'A00MAPAUTH',
narrator: 'Mapped Narrator',
rating: 4.7,
}),
]),
),
);
const service = new AudibleService();
const [book] = await service.getPopularAudiobooks(1);
expect(book.asin).toBe('B0HTMLMAP1');
expect(book.title).toBe('Mapped Title');
expect(book.author).toBe('Mapped Author');
expect(book.authorAsin).toBe('A00MAPAUTH');
expect(book.narrator).toBe('Mapped Narrator');
expect(book.rating).toBeCloseTo(4.7);
});
it('captures every co-narrator on multi-narrator productions (regression: prior code took only the first link)', async () => {
htmlClientMock.get.mockResolvedValue(
htmlResponse(
makeHtmlPage([
makeProductListItemHtml({
asin: 'B0FULLCAST',
narrators: [
'Kristin Atherton',
'Roy McMillan',
'Clare Corbett',
'Tom Bateman',
'Patience Tomlinson',
'Shaheen Khan',
],
}),
]),
),
);
const service = new AudibleService();
const [book] = await service.getPopularAudiobooks(1);
// Every narrator must round-trip — order is not significant downstream,
// but document order should be preserved for stable cache values.
expect(book.narrator).toBe(
'Kristin Atherton, Roy McMillan, Clare Corbett, Tom Bateman, Patience Tomlinson, Shaheen Khan',
);
});
});
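The pagination behaviour these tests pin down can be sketched roughly as follows. This is an illustration inferred from the assertions only (page size 50, no `page` param on the first request, `page=2` on the second, early stop on a page with fewer than half the page size, dedup by ASIN), not the service's actual implementation. `collectPaged`, `FetchPage`, and `Book` are hypothetical names, and the inter-page `delay` the tests stub out is omitted.

```typescript
interface Book {
  asin: string;
  title: string;
}

// HYPOTHETICAL fetcher type: returns the items parsed from one HTML page.
type FetchPage = (params: { pageSize: number; page?: number }) => Promise<Book[]>;

async function collectPaged(
  fetchPage: FetchPage,
  limit: number,
  pageSize: number = 50,
): Promise<Book[]> {
  const seen = new Set<string>();
  const books: Book[] = [];

  for (let page = 1; books.length < limit; page++) {
    // The first request carries no page param; later ones send page=2, 3, ...
    const params = page === 1 ? { pageSize } : { pageSize, page };
    const items = await fetchPage(params);

    for (const item of items) {
      if (books.length >= limit) break;
      if (seen.has(item.asin)) continue; // dedup by ASIN across pages
      seen.add(item.asin);
      books.push(item);
    }

    // A page with fewer than half the page size is treated as the last one.
    if (items.length < pageSize / 2) break;
  }

  return books;
}
```

Under these assumptions, a request for 75 books against pages of 50 and 25 items makes exactly two calls, and a request for 50 books against a single one-item page stops after the first call, matching the expectations above.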
// -------------------------------------------------------------------------
// getNewReleases()
// getNewReleases() — HTML scraping of /newreleases
// -------------------------------------------------------------------------
describe('getNewReleases()', () => {
it('uses products_sort_by: -ReleaseDate', async () => {
apiClientMock.get.mockResolvedValue(apiResponse(makeProductsResponse([])));
it('hits /newreleases on the htmlClient with pageSize=50', async () => {
htmlClientMock.get.mockResolvedValue(htmlResponse(makeHtmlPage([makeProductListItemHtml()])));
const service = new AudibleService();
await service.getNewReleases(1);
expect(apiClientMock.get.mock.calls[0][1].params.products_sort_by).toBe('-ReleaseDate');
expect(htmlClientMock.get).toHaveBeenCalledWith(
'/newreleases',
expect.objectContaining({
params: expect.objectContaining({ pageSize: 50 }),
}),
);
});
it('subtracts 1 from public page=1 before calling the API', async () => {
apiClientMock.get.mockResolvedValue(apiResponse(makeProductsResponse([])));
it('does not include a page param on the first request', async () => {
htmlClientMock.get.mockResolvedValue(htmlResponse(makeHtmlPage([makeProductListItemHtml()])));
const service = new AudibleService();
const delaySpy = vi.spyOn(service as any, 'delay').mockResolvedValue(undefined);
await service.getNewReleases(1);
expect(apiClientMock.get.mock.calls[0][1].params.page).toBe(0);
expect(htmlClientMock.get.mock.calls[0][1].params.page).toBeUndefined();
delaySpy.mockRestore();
});
it('subtracts 1 from public page=2 when paginating to the second page', async () => {
const page1Products = Array.from({ length: 50 }, (_, i) =>
makeProduct({ asin: `B${String(i).padStart(9, '0')}` }),
it('includes page=2 on the second request when paginating', async () => {
const page1Items = Array.from({ length: 50 }, (_, i) =>
makeProductListItemHtml({ asin: `B${String(i).padStart(9, '0')}` }),
);
const page2Items = Array.from({ length: 50 }, (_, i) =>
makeProductListItemHtml({ asin: `B${String(i + 50).padStart(9, '0')}` }),
);
const page2Products = [makeProduct({ asin: 'BNEW000099' })];
apiClientMock.get
.mockResolvedValueOnce(apiResponse(makeProductsResponse(page1Products, 51)))
.mockResolvedValueOnce(apiResponse(makeProductsResponse(page2Products, 51)));
htmlClientMock.get
.mockResolvedValueOnce(htmlResponse(makeHtmlPage(page1Items)))
.mockResolvedValueOnce(htmlResponse(makeHtmlPage(page2Items)));
const service = new AudibleService();
const delaySpy = vi.spyOn(service as any, 'delay').mockResolvedValue(undefined);
await service.getNewReleases(51);
expect(apiClientMock.get.mock.calls[1][1].params.page).toBe(1);
await service.getNewReleases(100);
expect(htmlClientMock.get.mock.calls[1][1].params.page).toBe(2);
delaySpy.mockRestore();
});
it('deduplicates by ASIN across pages', async () => {
const sharedProduct = makeProduct({ asin: 'BDUP000002' });
apiClientMock.get
.mockResolvedValueOnce(apiResponse(makeProductsResponse([sharedProduct], 51)))
.mockResolvedValueOnce(apiResponse(makeProductsResponse([sharedProduct], 51)));
const sharedAsin = 'BDUP000002';
const page1Items = [
makeProductListItemHtml({ asin: sharedAsin }),
...Array.from({ length: 49 }, (_, i) =>
makeProductListItemHtml({ asin: `BNEW${String(i).padStart(6, '0')}` }),
),
];
const page2Items = [
makeProductListItemHtml({ asin: sharedAsin }),
...Array.from({ length: 49 }, (_, i) =>
makeProductListItemHtml({ asin: `BNEW2${String(i).padStart(5, '0')}` }),
),
];
htmlClientMock.get
.mockResolvedValueOnce(htmlResponse(makeHtmlPage(page1Items)))
.mockResolvedValueOnce(htmlResponse(makeHtmlPage(page2Items)));
const service = new AudibleService();
const delaySpy = vi.spyOn(service as any, 'delay').mockResolvedValue(undefined);
const results = await service.getNewReleases(100);
const results = await service.getNewReleases(150);
expect(results.filter((r) => r.asin === 'BDUP000002')).toHaveLength(1);
expect(results.filter((r) => r.asin === sharedAsin)).toHaveLength(1);
delaySpy.mockRestore();
});
it('returns empty array on error without throwing', async () => {
const error: Error & { response?: { status: number } } = new Error('Not Found');
error.response = { status: 404 };
apiClientMock.get.mockRejectedValue(error);
htmlClientMock.get.mockRejectedValue(error);
const service = new AudibleService();
const results = await service.getNewReleases(5);
expect(results).toEqual([]);
});
it('uses htmlClient (not apiClient) for the request', async () => {
htmlClientMock.get.mockResolvedValue(htmlResponse(makeHtmlPage([makeProductListItemHtml()])));
const service = new AudibleService();
await service.getNewReleases(1);
expect(htmlClientMock.get).toHaveBeenCalled();
expect(apiClientMock.get).not.toHaveBeenCalled();
});
});
// -------------------------------------------------------------------------
// getCategoryBooks()
// getCategoryBooks() — HTML scraping of /search?node=<categoryId>
// -------------------------------------------------------------------------
describe('getCategoryBooks()', () => {
it('sends category_id and BestSellers sort param', async () => {
apiClientMock.get.mockResolvedValue(apiResponse(makeProductsResponse([])));
it('hits /search on the htmlClient with node, pageSize, and popularity-rank sort', async () => {
htmlClientMock.get.mockResolvedValue(
htmlResponse(makeHtmlPage([makeSearchResultItemHtml()])),
);
const service = new AudibleService();
await service.getCategoryBooks('18685580011', 1);
const params = apiClientMock.get.mock.calls[0][1].params;
expect(params.category_id).toBe('18685580011');
expect(params.products_sort_by).toBe('BestSellers');
const params = htmlClientMock.get.mock.calls[0][1].params;
expect(htmlClientMock.get.mock.calls[0][0]).toBe('/search');
expect(params.node).toBe('18685580011');
expect(params.pageSize).toBe(50);
expect(params.sort).toBe('popularity-rank');
});
it('subtracts 1 from public page=1 before calling the API', async () => {
apiClientMock.get.mockResolvedValue(apiResponse(makeProductsResponse([])));
it('does not include a page param on the first request', async () => {
htmlClientMock.get.mockResolvedValue(
htmlResponse(makeHtmlPage([makeSearchResultItemHtml()])),
);
const service = new AudibleService();
const delaySpy = vi.spyOn(service as any, 'delay').mockResolvedValue(undefined);
await service.getCategoryBooks('CAT001', 1);
expect(apiClientMock.get.mock.calls[0][1].params.page).toBe(0);
expect(htmlClientMock.get.mock.calls[0][1].params.page).toBeUndefined();
delaySpy.mockRestore();
});
it('subtracts 1 from public page=2 when paginating to the second page', async () => {
const page1Products = Array.from({ length: 50 }, (_, i) =>
makeProduct({ asin: `B${String(i).padStart(9, '0')}` }),
it('includes page=2 on the second request when paginating', async () => {
const page1Items = Array.from({ length: 50 }, (_, i) =>
makeSearchResultItemHtml({ asin: `B${String(i).padStart(9, '0')}` }),
);
const page2Items = Array.from({ length: 50 }, (_, i) =>
makeSearchResultItemHtml({ asin: `B${String(i + 50).padStart(9, '0')}` }),
);
const page2Products = [makeProduct({ asin: 'BCAT000099' })];
apiClientMock.get
.mockResolvedValueOnce(apiResponse(makeProductsResponse(page1Products, 51)))
.mockResolvedValueOnce(apiResponse(makeProductsResponse(page2Products, 51)));
htmlClientMock.get
.mockResolvedValueOnce(htmlResponse(makeHtmlPage(page1Items)))
.mockResolvedValueOnce(htmlResponse(makeHtmlPage(page2Items)));
const service = new AudibleService();
const delaySpy = vi.spyOn(service as any, 'delay').mockResolvedValue(undefined);
await service.getCategoryBooks('CAT001', 51);
expect(apiClientMock.get.mock.calls[1][1].params.page).toBe(1);
await service.getCategoryBooks('CAT001', 100);
expect(htmlClientMock.get.mock.calls[1][1].params.page).toBe(2);
delaySpy.mockRestore();
});
it('deduplicates by ASIN across pages', async () => {
const sharedProduct = makeProduct({ asin: 'BDUP000003' });
apiClientMock.get
.mockResolvedValueOnce(apiResponse(makeProductsResponse([sharedProduct], 51)))
.mockResolvedValueOnce(apiResponse(makeProductsResponse([sharedProduct], 51)));
const sharedAsin = 'BDUP000003';
const page1Items = [
makeSearchResultItemHtml({ asin: sharedAsin }),
...Array.from({ length: 49 }, (_, i) =>
makeSearchResultItemHtml({ asin: `BCAT${String(i).padStart(6, '0')}` }),
),
];
const page2Items = [
makeSearchResultItemHtml({ asin: sharedAsin }),
...Array.from({ length: 49 }, (_, i) =>
makeSearchResultItemHtml({ asin: `BCAT2${String(i).padStart(5, '0')}` }),
),
];
htmlClientMock.get
.mockResolvedValueOnce(htmlResponse(makeHtmlPage(page1Items)))
.mockResolvedValueOnce(htmlResponse(makeHtmlPage(page2Items)));
const service = new AudibleService();
const delaySpy = vi.spyOn(service as any, 'delay').mockResolvedValue(undefined);
const results = await service.getCategoryBooks('CAT001', 100);
const results = await service.getCategoryBooks('CAT001', 150);
expect(results.filter((r) => r.asin === 'BDUP000003')).toHaveLength(1);
expect(results.filter((r) => r.asin === sharedAsin)).toHaveLength(1);
delaySpy.mockRestore();
});
it('uses htmlClient (not apiClient) for the request', async () => {
htmlClientMock.get.mockResolvedValue(
htmlResponse(makeHtmlPage([makeSearchResultItemHtml()])),
);
const service = new AudibleService();
await service.getCategoryBooks('CAT001', 1);
expect(htmlClientMock.get).toHaveBeenCalled();
expect(apiClientMock.get).not.toHaveBeenCalled();
});
it('captures every co-narrator on multi-narrator productions (regression: prior code took only the first link)', async () => {
htmlClientMock.get.mockResolvedValue(
htmlResponse(
makeHtmlPage([
makeSearchResultItemHtml({
asin: 'B0FULLCAST',
narrators: ['Alice', 'Bob', 'Carol', 'Dan'],
}),
]),
),
);
const service = new AudibleService();
const [book] = await service.getCategoryBooks('CAT001', 1);
expect(book.narrator).toBe('Alice, Bob, Carol, Dan');
});
});
// -------------------------------------------------------------------------
@@ -979,6 +1305,9 @@ describe('AudibleService', () => {
runtimeLengthMin: '300',
genres: ['Fiction'],
rating: '4.7',
language: 'english',
formatType: 'unabridged',
publisherName: 'Test Publisher',
},
});
@@ -988,6 +1317,9 @@ describe('AudibleService', () => {
expect(details?.title).toBe('Audnexus Book');
expect(details?.author).toBe('Author A');
expect(details?.durationMinutes).toBe(300);
expect(details?.language).toBe('english');
expect(details?.formatType).toBe('unabridged');
expect(details?.publisherName).toBe('Test Publisher');
// Catalog API should NOT be called when Audnexus succeeds.
expect(apiClientMock.get).not.toHaveBeenCalled();
});
@@ -198,4 +198,69 @@ describe('processAudibleRefresh', () => {
const { processAudibleRefresh } = await import('@/lib/processors/audible-refresh.processor');
await expect(processAudibleRefresh({ jobId: 'job-2' })).rejects.toThrow('DB down');
});
it('deduplicates ASINs in the input list before persisting, preserving order', async () => {
// Two `A` entries should collapse to one. Final ranks must be contiguous
// (1, 2, 3) and follow Audible's editorial ordering (A, B, C).
const popular = [
{ asin: 'A', title: 'Book A', author: 'X', coverArtUrl: null },
{ asin: 'B', title: 'Book B', author: 'X', coverArtUrl: null },
{ asin: 'A', title: 'Book A (duplicate)', author: 'X', coverArtUrl: null },
{ asin: 'C', title: 'Book C', author: 'X', coverArtUrl: null },
];
audibleServiceMock.getPopularAudiobooks.mockResolvedValue(popular);
audibleServiceMock.getNewReleases.mockResolvedValue([]);
thumbnailCacheMock.cleanupUnusedThumbnails.mockResolvedValue(0);
prismaMock.audibleCache.upsert.mockResolvedValue({});
prismaMock.audibleCacheCategory.deleteMany.mockResolvedValue({ count: 0 });
prismaMock.audibleCacheCategory.create.mockResolvedValue({});
prismaMock.userHomeSection.findMany.mockResolvedValue([]);
prismaMock.audibleCache.findMany.mockResolvedValue([]);
const { processAudibleRefresh } = await import('@/lib/processors/audible-refresh.processor');
const result = await processAudibleRefresh({ jobId: 'job-dedup' });
expect(result.popularSaved).toBe(3);
// Only 3 category entries created — the duplicate `A` was dropped.
const popularCreates = (prismaMock.audibleCacheCategory.create.mock.calls as Array<[{ data: { asin: string; categoryId: string; rank: number } }]>)
.map((c) => c[0].data)
.filter((d) => d.categoryId === '__popular__');
expect(popularCreates).toHaveLength(3);
expect(popularCreates.map((d) => d.asin)).toEqual(['A', 'B', 'C']);
expect(popularCreates.map((d) => d.rank)).toEqual([1, 2, 3]);
// upsert called once per unique ASIN, not per input row.
expect(prismaMock.audibleCache.upsert).toHaveBeenCalledTimes(3);
});
it('drops entries with missing ASINs as part of dedup', async () => {
const popular = [
{ asin: 'A', title: 'Book A', author: 'X', coverArtUrl: null },
{ asin: '', title: 'Book with empty asin', author: 'X', coverArtUrl: null },
{ asin: null, title: 'Book with null asin', author: 'X', coverArtUrl: null },
{ asin: 'B', title: 'Book B', author: 'X', coverArtUrl: null },
];
audibleServiceMock.getPopularAudiobooks.mockResolvedValue(popular as any);
audibleServiceMock.getNewReleases.mockResolvedValue([]);
thumbnailCacheMock.cleanupUnusedThumbnails.mockResolvedValue(0);
prismaMock.audibleCache.upsert.mockResolvedValue({});
prismaMock.audibleCacheCategory.deleteMany.mockResolvedValue({ count: 0 });
prismaMock.audibleCacheCategory.create.mockResolvedValue({});
prismaMock.userHomeSection.findMany.mockResolvedValue([]);
prismaMock.audibleCache.findMany.mockResolvedValue([]);
const { processAudibleRefresh } = await import('@/lib/processors/audible-refresh.processor');
const result = await processAudibleRefresh({ jobId: 'job-empty-asin' });
expect(result.popularSaved).toBe(2);
const popularCreates = (prismaMock.audibleCacheCategory.create.mock.calls as Array<[{ data: { asin: string; categoryId: string; rank: number } }]>)
.map((c) => c[0].data)
.filter((d) => d.categoryId === '__popular__');
expect(popularCreates.map((d) => d.asin)).toEqual(['A', 'B']);
expect(popularCreates.map((d) => d.rank)).toEqual([1, 2]);
});
});
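The dedup contract exercised by these two tests (first occurrence wins, input order preserved, empty or null ASINs dropped, ranks contiguous from 1) could look roughly like the sketch below. `dedupeByAsin` and `assignRanks` are hypothetical names; the processor's real helper is not shown in this diff.

```typescript
// HYPOTHETICAL helper: keeps the first occurrence of each ASIN in input order
// and drops entries whose ASIN is empty, null, or undefined.
function dedupeByAsin<T extends { asin?: string | null }>(items: T[]): T[] {
  const seen = new Set<string>();
  const unique: T[] = [];
  for (const item of items) {
    if (!item.asin || seen.has(item.asin)) continue;
    seen.add(item.asin);
    unique.push(item);
  }
  return unique;
}

// Ranks are assigned after dedup, so they stay contiguous (1, 2, 3, ...).
function assignRanks<T extends { asin?: string | null }>(
  items: T[],
): Array<{ asin: string; rank: number }> {
  return dedupeByAsin(items).map((item, index) => ({
    asin: item.asin as string, // non-empty after dedupeByAsin
    rank: index + 1,
  }));
}

const rankedPopular = assignRanks([
  { asin: 'A', title: 'Book A' },
  { asin: 'B', title: 'Book B' },
  { asin: 'A', title: 'Book A (duplicate)' },
  { asin: 'C', title: 'Book C' },
]);
// rankedPopular: A -> 1, B -> 2, C -> 3
```

Assigning ranks only after filtering is what keeps them gap-free; ranking by original index would leave a hole wherever a duplicate or ASIN-less row was dropped.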
@@ -59,6 +59,7 @@ describe('processDownloadTorrent', () => {
vi.clearAllMocks();
// Restore default implementations cleared by clearAllMocks
configMock.getMany.mockResolvedValue({ prowlarr_api_key: null });
jobQueueMock.addNotificationJob.mockResolvedValue(undefined);
});
const torrentPayload = {
@@ -110,7 +111,7 @@ describe('processDownloadTorrent', () => {
enabled: true,
category: 'readmeabook',
});
prismaMock.request.update.mockResolvedValue({});
prismaMock.request.update.mockResolvedValue({ type: 'audiobook', user: { plexUsername: 'testuser' } });
prismaMock.downloadHistory.create.mockResolvedValue({ id: 'dh-1' });
const { processDownloadTorrent } = await import('@/lib/processors/download-torrent.processor');
@@ -141,7 +142,7 @@ describe('processDownloadTorrent', () => {
enabled: true,
category: 'readmeabook',
});
prismaMock.request.update.mockResolvedValue({});
prismaMock.request.update.mockResolvedValue({ type: 'audiobook', user: { plexUsername: 'testuser' } });
prismaMock.downloadHistory.create.mockResolvedValue({ id: 'dh-2' });
const { processDownloadTorrent } = await import('@/lib/processors/download-torrent.processor');
@@ -186,7 +187,7 @@ describe('processDownloadTorrent', () => {
enabled: true,
category: 'readmeabook',
});
prismaMock.request.update.mockResolvedValue({});
prismaMock.request.update.mockResolvedValue({ type: 'audiobook', user: { plexUsername: 'testuser' } });
prismaMock.downloadHistory.create.mockResolvedValue({ id: 'dh-1' });
const { processDownloadTorrent } = await import('@/lib/processors/download-torrent.processor');
+58
@@ -458,6 +458,64 @@ describe('AppriseProvider', () => {
});
});
describe('messageLabel rendering by event', () => {
const basePayload = {
requestId: 'req-1',
title: 'Test Book',
author: 'Test Author',
userName: 'Test User',
timestamp: new Date('2024-01-01T00:00:00Z'),
};
it('renders "⚠️ Error:" with error emoji for request_error', async () => {
fetchMock.mockResolvedValue({ ok: true, text: async () => 'ok' });
const { AppriseProvider } = await import('@/lib/services/notification');
const provider = new AppriseProvider();
await provider.send(
{ serverUrl: 'http://apprise:8000', urls: 'slack://token' },
{ ...basePayload, event: 'request_error', message: 'Boom' }
);
const body = JSON.parse(fetchMock.mock.calls[0][1].body);
expect(body.body).toContain('⚠️ Error: Boom');
expect(body.body).not.toContain('📝');
});
it('renders "📝 Reason:" with note emoji for issue_reported', async () => {
fetchMock.mockResolvedValue({ ok: true, text: async () => 'ok' });
const { AppriseProvider } = await import('@/lib/services/notification');
const provider = new AppriseProvider();
await provider.send(
{ serverUrl: 'http://apprise:8000', urls: 'slack://token' },
{ ...basePayload, event: 'issue_reported', issueId: 'iss-1', message: 'Chapter 3 cuts off' }
);
const body = JSON.parse(fetchMock.mock.calls[0][1].body);
expect(body.body).toContain('📝 Reason: Chapter 3 cuts off');
expect(body.body).not.toContain('⚠️');
expect(body.body).not.toContain('Error:');
});
it('renders "📝 Details:" with note emoji for request_grabbed', async () => {
fetchMock.mockResolvedValue({ ok: true, text: async () => 'ok' });
const { AppriseProvider } = await import('@/lib/services/notification');
const provider = new AppriseProvider();
await provider.send(
{ serverUrl: 'http://apprise:8000', urls: 'slack://token' },
{ ...basePayload, event: 'request_grabbed', message: 'Test Book [M4B] via NZBGeek (SABnzbd)', requestType: 'audiobook' }
);
const body = JSON.parse(fetchMock.mock.calls[0][1].body);
expect(body.body).toContain('📝 Details: Test Book [M4B] via NZBGeek (SABnzbd)');
expect(body.body).not.toContain('⚠️');
expect(body.body).not.toContain('Error:');
expect(body.title).toBe('Audiobook Grabbed');
});
});
describe('integration with NotificationService.sendToBackend', () => {
it('decrypts sensitive fields and sends to Apprise', async () => {
fetchMock.mockResolvedValue({
+58
@@ -267,6 +267,64 @@ describe('NtfyProvider', () => {
});
});
describe('messageLabel rendering by event', () => {
const basePayload = {
requestId: 'req-1',
title: 'Test Book',
author: 'Test Author',
userName: 'Test User',
timestamp: new Date('2024-01-01T00:00:00Z'),
};
it('renders "⚠️ Error:" with error emoji for request_error', async () => {
fetchMock.mockResolvedValue({ ok: true, json: async () => ({ id: 'msg' }) });
const { NtfyProvider } = await import('@/lib/services/notification');
const provider = new NtfyProvider();
await provider.send(
{ topic: 'audiobooks' },
{ ...basePayload, event: 'request_error', message: 'Boom' }
);
const body = JSON.parse(fetchMock.mock.calls[0][1].body);
expect(body.message).toContain('⚠️ Error: Boom');
expect(body.message).not.toContain('📝');
});
it('renders "📝 Reason:" with note emoji for issue_reported', async () => {
fetchMock.mockResolvedValue({ ok: true, json: async () => ({ id: 'msg' }) });
const { NtfyProvider } = await import('@/lib/services/notification');
const provider = new NtfyProvider();
await provider.send(
{ topic: 'audiobooks' },
{ ...basePayload, event: 'issue_reported', issueId: 'iss-1', message: 'Chapter 3 cuts off' }
);
const body = JSON.parse(fetchMock.mock.calls[0][1].body);
expect(body.message).toContain('📝 Reason: Chapter 3 cuts off');
expect(body.message).not.toContain('⚠️');
expect(body.message).not.toContain('Error:');
});
it('renders "📝 Details:" with note emoji for request_grabbed', async () => {
fetchMock.mockResolvedValue({ ok: true, json: async () => ({ id: 'msg' }) });
const { NtfyProvider } = await import('@/lib/services/notification');
const provider = new NtfyProvider();
await provider.send(
{ topic: 'audiobooks' },
{ ...basePayload, event: 'request_grabbed', message: 'Test Book [M4B] via NZBGeek (SABnzbd)', requestType: 'audiobook' }
);
const body = JSON.parse(fetchMock.mock.calls[0][1].body);
expect(body.message).toContain('📝 Details: Test Book [M4B] via NZBGeek (SABnzbd)');
expect(body.message).not.toContain('⚠️');
expect(body.message).not.toContain('Error:');
expect(body.title).toBe('Audiobook Grabbed');
});
});
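Reviewer note: the three `messageLabel` cases above pin down an event-to-label mapping. A minimal sketch of that mapping follows, assuming a small lookup table. `MESSAGE_LABELS`, `LabelledEvent`, and `formatMessageLine` are hypothetical names for illustration and are not taken from the project's notification module, which may be structured differently.

```typescript
// Hypothetical sketch: names and shapes here are assumptions for illustration only.
type LabelledEvent = 'request_error' | 'issue_reported' | 'request_grabbed';

interface MessageLabel {
  emoji: string;
  label: string;
}

// One entry per event exercised by the tests above.
const MESSAGE_LABELS: Record<LabelledEvent, MessageLabel> = {
  request_error: { emoji: '⚠️', label: 'Error' },
  issue_reported: { emoji: '📝', label: 'Reason' },
  request_grabbed: { emoji: '📝', label: 'Details' },
};

// Builds the single line that the tests look for inside the notification body.
function formatMessageLine(event: LabelledEvent, message: string): string {
  const { emoji, label } = MESSAGE_LABELS[event];
  return `${emoji} ${label}: ${message}`;
}
```

Keeping the emoji and label in one table means a provider only has to interpolate the result, which is consistent with the Apprise and ntfy suites asserting identical strings.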
describe('integration with NotificationService.sendToBackend', () => {
it('decrypts accessToken and sends to ntfy', async () => {
fetchMock.mockResolvedValue({
+189
@@ -6,6 +6,15 @@
import { beforeEach, describe, expect, it, vi } from 'vitest';
import { createPrismaMock } from '../helpers/prisma';
import type { DedupGroup } from '@/lib/utils/deduplicate-audiobooks';
import type { AudibleAudiobook } from '@/lib/integrations/audible.service';
function makeBook(overrides: Partial<AudibleAudiobook> & { asin: string }): AudibleAudiobook {
return {
title: 'Test Book',
author: 'Test Author',
...overrides,
};
}
const prismaMock = createPrismaMock();
@@ -304,3 +313,183 @@ describe('getSiblingAsins', () => {
expect(result.has('ASIN_LONELY')).toBe(false);
});
});
describe('collapseByExistingWorks', () => {
beforeEach(() => {
vi.clearAllMocks();
vi.resetModules();
});
it('returns input unchanged when the list is empty or has one entry', async () => {
const { collapseByExistingWorks } = await import('@/lib/services/works.service');
expect(await collapseByExistingWorks([])).toEqual([]);
expect(prismaMock.workAsin.findMany).not.toHaveBeenCalled();
const single = [makeBook({ asin: 'A1' })];
expect(await collapseByExistingWorks(single)).toEqual(single);
expect(prismaMock.workAsin.findMany).not.toHaveBeenCalled();
});
it('returns input unchanged when none of the ASINs are in any work', async () => {
prismaMock.workAsin.findMany.mockResolvedValue([]);
const { collapseByExistingWorks } = await import('@/lib/services/works.service');
const books = [
makeBook({ asin: 'A1', title: 'Alpha' }),
makeBook({ asin: 'A2', title: 'Beta' }),
];
const result = await collapseByExistingWorks(books);
expect(result).toEqual(books);
});
it('collapses two ASINs that share a work to a single representative', async () => {
prismaMock.workAsin.findMany.mockResolvedValue([
{ asin: 'A1', workId: 'work-1' },
{ asin: 'A2', workId: 'work-1' },
]);
const { collapseByExistingWorks } = await import('@/lib/services/works.service');
const books = [
makeBook({ asin: 'A1', title: 'The Passengers', coverArtUrl: 'cover.jpg' }),
makeBook({ asin: 'A2', title: 'The Passengers' }),
];
const result = await collapseByExistingWorks(books);
expect(result).toHaveLength(1);
// A1 wins — it has the cover URL (higher metadata score)
expect(result[0].asin).toBe('A1');
});
it('keeps the richest-metadata entry when collapsing, regardless of input order', async () => {
prismaMock.workAsin.findMany.mockResolvedValue([
{ asin: 'A1', workId: 'work-1' },
{ asin: 'A2', workId: 'work-1' },
]);
const { collapseByExistingWorks } = await import('@/lib/services/works.service');
// A1 first (sparse), A2 second (rich) — A2 should win on score
const books = [
makeBook({ asin: 'A1', title: 'Book' }),
makeBook({
asin: 'A2',
title: 'Book',
coverArtUrl: 'cover.jpg',
rating: 4.5,
durationMinutes: 600,
narrator: 'Full Cast',
description: 'Rich book',
releaseDate: '2024-01-01',
genres: ['Fiction'],
}),
];
const result = await collapseByExistingWorks(books);
expect(result).toHaveLength(1);
expect(result[0].asin).toBe('A2');
});
it('preserves position of the work in the input order', async () => {
prismaMock.workAsin.findMany.mockResolvedValue([
{ asin: 'A2', workId: 'work-1' },
{ asin: 'A4', workId: 'work-1' },
]);
const { collapseByExistingWorks } = await import('@/lib/services/works.service');
const books = [
makeBook({ asin: 'A1', title: 'Alpha' }),
makeBook({ asin: 'A2', title: 'Beta' }),
makeBook({ asin: 'A3', title: 'Gamma' }),
makeBook({ asin: 'A4', title: 'Beta' }),
makeBook({ asin: 'A5', title: 'Delta' }),
];
const result = await collapseByExistingWorks(books);
// A2 and A4 collapse to one entry at position 1 (the first occurrence)
expect(result.map(b => b.asin)).toEqual(['A1', 'A2', 'A3', 'A5']);
});
it('handles multiple independent works in the same batch', async () => {
prismaMock.workAsin.findMany.mockResolvedValue([
{ asin: 'A1', workId: 'work-1' },
{ asin: 'A2', workId: 'work-1' },
{ asin: 'B1', workId: 'work-2' },
{ asin: 'B2', workId: 'work-2' },
{ asin: 'B3', workId: 'work-2' },
]);
const { collapseByExistingWorks } = await import('@/lib/services/works.service');
const books = [
makeBook({ asin: 'A1' }),
makeBook({ asin: 'B1' }),
makeBook({ asin: 'A2' }),
makeBook({ asin: 'B2' }),
makeBook({ asin: 'B3' }),
makeBook({ asin: 'C1' }),
];
const result = await collapseByExistingWorks(books);
expect(result.map(b => b.asin)).toEqual(['A1', 'B1', 'C1']);
});
it('passes through books that are not in any work alongside collapsed ones', async () => {
prismaMock.workAsin.findMany.mockResolvedValue([
{ asin: 'A1', workId: 'work-1' },
{ asin: 'A2', workId: 'work-1' },
]);
const { collapseByExistingWorks } = await import('@/lib/services/works.service');
const books = [
makeBook({ asin: 'STANDALONE_1', title: 'Standalone 1' }),
makeBook({ asin: 'A1', title: 'Same Book' }),
makeBook({ asin: 'STANDALONE_2', title: 'Standalone 2' }),
makeBook({ asin: 'A2', title: 'Same Book' }),
];
const result = await collapseByExistingWorks(books);
expect(result).toHaveLength(3);
expect(result.map(b => b.asin)).toEqual(['STANDALONE_1', 'A1', 'STANDALONE_2']);
});
it('returns input unchanged on DB failure (does not throw)', async () => {
prismaMock.workAsin.findMany.mockRejectedValue(new Error('DB exploded'));
const { collapseByExistingWorks } = await import('@/lib/services/works.service');
const books = [
makeBook({ asin: 'A1' }),
makeBook({ asin: 'A2' }),
];
const result = await collapseByExistingWorks(books);
expect(result).toEqual(books);
});
it('only queries the workAsin table once per call', async () => {
prismaMock.workAsin.findMany.mockResolvedValue([
{ asin: 'A1', workId: 'work-1' },
{ asin: 'A2', workId: 'work-1' },
]);
const { collapseByExistingWorks } = await import('@/lib/services/works.service');
await collapseByExistingWorks([
makeBook({ asin: 'A1' }),
makeBook({ asin: 'A2' }),
makeBook({ asin: 'A3' }),
]);
expect(prismaMock.workAsin.findMany).toHaveBeenCalledTimes(1);
expect(prismaMock.workAsin.findMany).toHaveBeenCalledWith({
where: { asin: { in: ['A1', 'A2', 'A3'] } },
select: { asin: true, workId: true },
});
});
});
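Reviewer note: the `collapseByExistingWorks` cases above describe a keep-first-slot, keep-richest-entry collapse. The sketch below shows one way to get that behaviour under stated assumptions. `collapseByWork`, `metadataScore`, and the `Book` shape are hypothetical; the scoring rule (one point per populated optional field) is a guess, and the database lookup and its error fallback are replaced by a prebuilt `Map` so the example stays self-contained.

```typescript
// Hypothetical sketch: not the project's implementation.
interface Book {
  asin: string;
  title?: string;
  coverArtUrl?: string;
  rating?: number;
  durationMinutes?: number;
  narrator?: string;
  description?: string;
  releaseDate?: string;
  genres?: string[];
}

// Assumed scoring rule: one point per populated optional metadata field.
function metadataScore(book: Book): number {
  const scalars = [
    book.coverArtUrl,
    book.rating,
    book.durationMinutes,
    book.narrator,
    book.description,
    book.releaseDate,
  ];
  let score = scalars.filter(v => v !== undefined && v !== '').length;
  if (book.genres !== undefined && book.genres.length > 0) score += 1;
  return score;
}

// workIdByAsin stands in for the rows returned by the single workAsin query.
function collapseByWork(books: Book[], workIdByAsin: Map<string, string>): Book[] {
  if (books.length <= 1) return books;

  const result: Book[] = [];
  const slotByWork = new Map<string, number>(); // workId -> index in result

  for (const book of books) {
    const workId = workIdByAsin.get(book.asin);
    if (workId === undefined) {
      result.push(book); // not part of any known work: pass through
      continue;
    }
    const slot = slotByWork.get(workId);
    if (slot === undefined) {
      slotByWork.set(workId, result.length); // first occurrence fixes the position
      result.push(book);
      continue;
    }
    const current = result[slot];
    // Strictly greater, so ties keep the earlier entry.
    if (current !== undefined && metadataScore(book) > metadataScore(current)) {
      result[slot] = book;
    }
  }
  return result;
}
```

The strict comparison is what makes the tie case in the position test resolve to `A2` rather than `A4`; a real implementation could break ties differently and still pass the other cases.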
+95
@@ -0,0 +1,95 @@
/**
* Component: Narrator Extraction Utility Tests
* Documentation: documentation/integrations/audible.md
*/
import { describe, expect, it } from 'vitest';
import * as cheerio from 'cheerio';
import { extractAllNarrators } from '@/lib/utils/extract-narrator';
function load(html: string) {
const $ = cheerio.load(`<div id="item">${html}</div>`);
return { $, $el: $('#item') };
}
describe('extractAllNarrators', () => {
it('returns the single narrator name when only one searchNarrator link is present', () => {
const { $, $el } = load(
`<a href="/search?searchNarrator=Andy%20Serkis">Andy Serkis</a>`,
);
expect(extractAllNarrators($, $el)).toBe('Andy Serkis');
});
it('joins multiple narrator names from separate searchNarrator links', () => {
const { $, $el } = load(`
<a href="/search?searchNarrator=Kristin%20Atherton">Kristin Atherton</a>,
<a href="/search?searchNarrator=Roy%20McMillan">Roy McMillan</a>,
<a href="/search?searchNarrator=Clare%20Corbett">Clare Corbett</a>,
<a href="/search?searchNarrator=Tom%20Bateman">Tom Bateman</a>,
<a href="/search?searchNarrator=Patience%20Tomlinson">Patience Tomlinson</a>,
<a href="/search?searchNarrator=Shaheen%20Khan">Shaheen Khan</a>
`);
expect(extractAllNarrators($, $el)).toBe(
'Kristin Atherton, Roy McMillan, Clare Corbett, Tom Bateman, Patience Tomlinson, Shaheen Khan',
);
});
it('preserves document order (downstream sorts before comparing, but order should be stable)', () => {
const { $, $el } = load(`
<a href="/search?searchNarrator=Z">Zelda</a>
<a href="/search?searchNarrator=A">Alice</a>
<a href="/search?searchNarrator=M">Mallory</a>
`);
expect(extractAllNarrators($, $el)).toBe('Zelda, Alice, Mallory');
});
it('falls back to .narratorLabel text when no searchNarrator links exist', () => {
const { $, $el } = load(
`<span class="narratorLabel">Narrated by: Single Narrator</span>`,
);
expect(extractAllNarrators($, $el)).toBe('Narrated by: Single Narrator');
});
it('prefers searchNarrator links over .narratorLabel when both are present', () => {
const { $, $el } = load(`
<span class="narratorLabel">Narrated by: ONLY ONE</span>
<a href="/search?searchNarrator=First">First</a>
<a href="/search?searchNarrator=Second">Second</a>
`);
expect(extractAllNarrators($, $el)).toBe('First, Second');
});
it('returns empty string when neither links nor .narratorLabel exist', () => {
const { $, $el } = load(`<span>some other content</span>`);
expect(extractAllNarrators($, $el)).toBe('');
});
it('skips empty link text and joins only non-empty names', () => {
const { $, $el } = load(`
<a href="/search?searchNarrator=A"></a>
<a href="/search?searchNarrator=B">Bob</a>
<a href="/search?searchNarrator=C"> </a>
<a href="/search?searchNarrator=D">Diana</a>
`);
expect(extractAllNarrators($, $el)).toBe('Bob, Diana');
});
it('trims whitespace from each captured name', () => {
const { $, $el } = load(`
<a href="/search?searchNarrator=A"> Alice </a>
<a href="/search?searchNarrator=B">
Bob
</a>
`);
expect(extractAllNarrators($, $el)).toBe('Alice, Bob');
});
it('falls back to .narratorLabel when all searchNarrator links are empty', () => {
const { $, $el } = load(`
<a href="/search?searchNarrator=A"></a>
<a href="/search?searchNarrator=B"> </a>
<span class="narratorLabel">Fallback Narrator</span>
`);
expect(extractAllNarrators($, $el)).toBe('Fallback Narrator');
});
});
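Reviewer note: stripped of the cheerio selection, the `extractAllNarrators` cases above reduce to a small rule: trim each link text, drop the empty ones, join with `', '`, and fall back to the label text only when nothing is left. The sketch below captures just that rule. `joinNarrators` is a hypothetical helper that takes already-extracted strings; trimming the fallback label is an assumption the tests neither confirm nor rule out.

```typescript
// Hypothetical sketch: operates on plain strings instead of a cheerio selection.
// linkTexts: text of each searchNarrator link, in document order.
// labelText: text of the .narratorLabel element, or '' when it is absent.
function joinNarrators(linkTexts: string[], labelText: string): string {
  const names = linkTexts
    .map(text => text.trim())
    .filter(text => text.length > 0);

  // Links win whenever at least one of them carries a non-empty name.
  if (names.length > 0) return names.join(', ');

  // Assumption: the fallback label is returned trimmed but otherwise untouched,
  // so a "Narrated by:" prefix is kept, as in the fallback test case.
  return labelText.trim();
}
```

Usage, mirroring two of the cases: `joinNarrators(['', 'Bob', ' ', 'Diana'], '')` yields `'Bob, Diana'`, and `joinNarrators(['', ' '], 'Fallback Narrator')` yields `'Fallback Narrator'`.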
+18
@@ -67,6 +67,24 @@ describe('jitteredBackoff', () => {
expect(value).toBeGreaterThanOrEqual(250);
expect(value).toBeLessThanOrEqual(750);
});
it('caps the result at maxBackoffMs when the raw backoff would exceed it', () => {
// attempt=10 with base=1000 produces 2^10 * 1000 * [0.5..1.5] = 512_000..1_536_000,
// all of which exceed a 60_000ms cap.
for (let i = 0; i < 50; i++) {
const value = jitteredBackoff(10, 1000, 60_000);
expect(value).toBeLessThanOrEqual(60_000);
}
});
it('returns the un-capped jittered value when below the cap', () => {
// attempt=0 with base=1000 produces 500..1500, all below a 60_000ms cap.
for (let i = 0; i < 50; i++) {
const value = jitteredBackoff(0, 1000, 60_000);
expect(value).toBeGreaterThanOrEqual(500);
expect(value).toBeLessThanOrEqual(1500);
}
});
});
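Reviewer note: the two new `jitteredBackoff` cases imply the formula `2^attempt * base * [0.5..1.5]`, clamped to `maxBackoffMs`. A minimal sketch of that arithmetic follows. `jitteredBackoffSketch` and its injectable `random` parameter are hypothetical additions for illustration; whether the real function rounds the result or takes a random source is not visible in this diff.

```typescript
// Hypothetical sketch of capped exponential backoff with +/-50% jitter.
function jitteredBackoffSketch(
  attempt: number,
  baseMs: number,
  maxBackoffMs: number = Infinity,
  random: () => number = Math.random, // injectable so the bounds can be tested
): number {
  const raw = Math.pow(2, attempt) * baseMs; // exponential growth per attempt
  const jittered = raw * (0.5 + random()); // multiplier in [0.5, 1.5)
  return Math.min(jittered, maxBackoffMs); // never exceed the cap
}
```

With `attempt = 10` and `base = 1000` the smallest possible raw value is 512,000 ms, so any cap of 60,000 ms always wins, which is why the first new test can assert on every one of its 50 samples.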
describe('randomDelay', () => {