Adds file hash-based matching for Audiobookshelf library items to ensure 100% accurate ASIN assignment for RMAB-organized content. Removes fuzzy matching from library availability checks, making all matching ASIN-only to eliminate false positives and race conditions. Updates database schema, processors, and matcher utilities; adds new tests and documentation for the new matching strategy. Removes obsolete scripts, Dockerfile, and related tests; updates docker-compose for test environments.
Plex Media Server Integration
Status: ✅ Implemented
Connectivity to Plex for OAuth, library management, content detection, and automatic scanning. Database stores all audiobooks from Plex as source of truth for availability.
Data Flow
- Plex Scan Job → Fetches all audiobooks → Populates DB with availabilityStatus: 'available'
- Audible Refresh → Matches by ASIN against Plex data in DB → Sets availabilityStatus: 'available' for matches
- UI → Queries DB → Shows "In Your Library" badge → Prevents duplicate requests
Key Principle: Database reflects Plex content. Audible data matched against this.
Core Endpoints
- GET {server_url}/identity - Server info (machineIdentifier, version, platform); also used for access verification
- GET {server_url}/library/sections - List libraries with IDs and types
- GET {server_url}/library/sections/{id}/all?type=9 - All albums (type 9 = audiobooks)
- GET {server_url}/library/sections/{id}/all?type=9&sort=addedAt:desc&X-Plex-Container-Start=0&X-Plex-Container-Size=10 - Recently added (lightweight polling)
- GET {server_url}/library/sections/{id}/refresh - Trigger async scan
- GET {server_url}/library/metadata/{rating_key} - Item metadata (includes user's personal rating)
- GET {server_url}/library/sections/{id}/search?title={query} - Search
- DELETE {server_url}/library/metadata/{rating_key} - Delete library item (requires deletion enabled in Plex settings)
Auth: X-Plex-Token header
Response: XML by default (requires xml2js parsing to JSON); JSON when the Accept: application/json header is set
API Docs: /PlexMediaServerAPIDocs.json
Security: During OAuth, user's accessible servers are fetched from plex.tv/api/v2/resources. Only users with the configured server in their resource list can authenticate.
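As an illustration of the endpoint and auth conventions above, the following minimal sketch builds the "recently added" request. The helper name buildRecentlyAddedRequest and the PlexRequest shape are hypothetical and not taken from the codebase; only the path, query parameters, and X-Plex-Token header come from this document.

```typescript
// Hypothetical helper (not from the codebase): describes the HTTP request
// for the lightweight "recently added" poll without sending it.
interface PlexRequest {
  url: string;
  headers: Record<string, string>;
}

function buildRecentlyAddedRequest(
  serverUrl: string,
  sectionId: string,
  token: string,
  size = 10,
): PlexRequest {
  const base = serverUrl.replace(/\/+$/, ""); // tolerate a trailing slash
  const query = [
    "type=9", // album-level items, i.e. books
    "sort=addedAt:desc",
    "X-Plex-Container-Start=0",
    `X-Plex-Container-Size=${size}`,
  ].join("&");
  return {
    url: `${base}/library/sections/${encodeURIComponent(sectionId)}/all?${query}`,
    headers: {
      "X-Plex-Token": token,
      Accept: "application/json", // ask for JSON instead of the XML default
    },
  };
}
```

The returned object can be handed to whichever HTTP client is in use (axios or node-fetch per the tech stack); keeping URL construction separate makes it easy to unit test.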
Plex OAuth
Base: https://plex.tv/api/v2
- POST /pins → Get PIN id and code
- Build auth URL: https://app.plex.tv/auth#?clientID={id}&code={code}
- GET /pins/{id} → Poll until authToken populated
- GET /users/account → Get user info with token
- Security check: Get server machineIdentifier from configured server
- Security check: Fetch user's accessible servers (GET plex.tv/api/v2/resources with user token)
- Security check: Verify configured server's machineIdentifier is in user's resource list
- Only grant access if server found in user's accessible resources (validates shared access)
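The auth URL and the final access decision in the flow above can be sketched as two small pure functions. This is an illustrative sketch, not the project's code: buildAuthUrl, canAccessServer, and the PlexResource shape are hypothetical names, and it assumes each entry in the resources response exposes the server's machineIdentifier as clientIdentifier and lists "server" in provides.

```typescript
// Assumed, simplified shape of one entry from plex.tv/api/v2/resources.
interface PlexResource {
  name: string;
  clientIdentifier: string; // assumption: equals the server's machineIdentifier
  provides: string; // comma-separated roles, e.g. "server" or "client,player"
}

// Builds the URL the user is sent to after a PIN has been created.
function buildAuthUrl(clientId: string, pinCode: string): string {
  return (
    "https://app.plex.tv/auth#?clientID=" +
    encodeURIComponent(clientId) +
    "&code=" +
    encodeURIComponent(pinCode)
  );
}

// Grants access only if the configured server is among the user's resources.
function canAccessServer(
  resources: PlexResource[],
  machineIdentifier: string,
): boolean {
  return resources.some(
    (r) =>
      r.provides.split(",").indexOf("server") !== -1 &&
      r.clientIdentifier === machineIdentifier,
  );
}
```

Because the check runs against the user's own resource list, a user who merely holds a valid plex.tv account but has not been given access to the configured server is rejected.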
Audiobook Detection
- Plex has no dedicated audiobook type
- Stored as Music library (type="artist")
- Admin selects library during setup
- Query with type=9 for Album-level items (books)
- item.title = book name, item.parentTitle = author
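The field mapping described above can be sketched as follows. The PlexAlbumItem and DetectedBook types and the toDetectedBook function are hypothetical, and the "Unknown Author" fallback is an assumption added for the example rather than documented behavior.

```typescript
// Assumed subset of an album-level (type=9) item returned by Plex.
interface PlexAlbumItem {
  ratingKey: string;
  guid: string;
  title: string; // book name
  parentTitle?: string; // artist of the album, i.e. the author
}

interface DetectedBook {
  ratingKey: string;
  guid: string;
  title: string;
  author: string;
}

// Maps a Music-library album to a book record.
function toDetectedBook(item: PlexAlbumItem): DetectedBook {
  return {
    ratingKey: item.ratingKey,
    guid: item.guid,
    title: item.title,
    // Assumption: fall back to a placeholder when Plex has no artist.
    author: item.parentTitle ?? "Unknown Author",
  };
}
```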
Library Scanning
Full Library Scan
Scan Process:
- Fetch all audiobooks via API (type=9)
- For each:
  - Exists by plexGuid? Update metadata
  - New? Create entry in plex_library table
- Match downloaded requests (status: 'downloaded'):
  - Uses centralized audiobook-matcher.ts (ASIN matching, title normalization, narrator support)
  - Matched → Update request status to 'available' + link plexGuid
- Return summary (total, new count, updated count, matched downloads)

Trigger: Scheduled (every 6 hours default) or manual admin action

Note: Heavy operation, scans entire library
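The create-or-update step of the scan can be sketched with an in-memory table keyed by plexGuid. This is a simplified illustration under stated assumptions: LibraryRow, ScanSummary, and applyScan are hypothetical names, a Map stands in for the plex_library table, and request matching is left out.

```typescript
interface LibraryRow {
  plexGuid: string;
  title: string;
  author: string;
}

interface ScanSummary {
  total: number;
  created: number;
  updated: number;
}

// Upserts every scanned item into the table and counts what happened.
function applyScan(
  table: Map<string, LibraryRow>,
  scanned: LibraryRow[],
): ScanSummary {
  const summary: ScanSummary = { total: scanned.length, created: 0, updated: 0 };
  for (const row of scanned) {
    if (table.has(row.plexGuid)) {
      summary.updated += 1; // existing entry: refresh metadata
    } else {
      summary.created += 1; // new entry
    }
    table.set(row.plexGuid, { ...row });
  }
  return summary;
}
```

Keying on plexGuid makes the scan idempotent: running it twice over the same library creates nothing new on the second pass.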
Recently Added Check (Lightweight Polling)
Process:
- Query top 10 items sorted by addedAt:desc with pagination
- For each item:
  - New? Create in plex_library table
  - Existing? Update metadata
- Match downloaded requests:
  - Uses centralized audiobook-matcher.ts (same as full scan and homepage)
  - Searches entire plex_library table for matches
- Return summary (new, updated, matched downloads)

Trigger: Scheduled (every 5 minutes default), enabled by default

Benefits: Lightweight polling for new items + comprehensive matching for downloaded requests

Note: Requests transition: pending → searching → downloading → processing → downloaded → available (after detection)
Auto-Completion of Stuck Requests
Library scans (full and incremental) now check all non-terminal requests for matches:
Eligible statuses:
- pending, searching, downloading, processing, downloaded
- failed, awaiting_search, awaiting_import, warn
Excluded statuses:
- available (already completed)
- cancelled (user cancelled)
Use Case:
- Request stuck in 'awaiting_search' or 'failed' status
- User manually imports audiobook to library (via Plex/ABS or external tool)
- Next library scan (manual trigger or scheduled recently-added check)
- Request auto-matches and marks as 'available'
- Error messages and retry counters cleared
State Cleanup on Match:
- errorMessage → null
- searchAttempts → 0
- downloadAttempts → 0
- importAttempts → 0
- completedAt → scan timestamp
Edge Cases:
- Active downloads/jobs continue but become no-ops (download completes, organize skips)
- Torrent/NZB remains in download client (manual cleanup if desired)
Logging:
- Transitions from non-downloaded statuses logged with original status: Match found! "Book" → "Library Book" (was 'failed')
- Provides visibility into which stuck requests were auto-completed
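The eligibility rule and the state cleanup described in this section can be sketched as a single pure function. The BookRequest type and the completeOnMatch function are hypothetical; the status names and the fields that get reset are taken from the lists above.

```typescript
type RequestStatus =
  | "pending" | "searching" | "downloading" | "processing" | "downloaded"
  | "failed" | "awaiting_search" | "awaiting_import" | "warn"
  | "available" | "cancelled";

interface BookRequest {
  status: RequestStatus;
  errorMessage: string | null;
  searchAttempts: number;
  downloadAttempts: number;
  importAttempts: number;
  completedAt: Date | null;
  plexGuid?: string;
}

// Statuses that a library scan must never touch.
const EXCLUDED_STATUSES: RequestStatus[] = ["available", "cancelled"];

function isEligibleForAutoCompletion(status: RequestStatus): boolean {
  return EXCLUDED_STATUSES.indexOf(status) === -1;
}

// Returns the completed request, or the same object if it is not eligible.
function completeOnMatch(
  request: BookRequest,
  plexGuid: string,
  scanTime: Date,
): BookRequest {
  if (!isEligibleForAutoCompletion(request.status)) return request;
  return {
    ...request,
    status: "available",
    plexGuid,
    errorMessage: null,
    searchAttempts: 0,
    downloadAttempts: 0,
    importAttempts: 0,
    completedAt: scanTime,
  };
}
```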
Data Models
interface PlexAudiobook {
  ratingKey: string;
  guid: string;
  title: string;
  author: string; // from parentTitle
  narrator?: string;
  duration: number; // ms
  year?: number;
  summary?: string;
  thumb?: string;
  addedAt: number;
  updatedAt: number;
  filePath: string;
}

interface PlexLibrary {
  id: string;
  title: string;
  type: string; // "artist", "audio"
  locations: string[];
  itemCount: number;
}
BookDate Ratings
Problem: The library scan runs with the system Plex token, so the cache stores the ratings of that token's owner. Different users need their own ratings for recommendations.
Solution:
- Local admin users: Use cached ratings (from system Plex token)
- Plex-authenticated users (including admins): Fetch library with user's token to get personal ratings
How Per-User Ratings Work:
- Key insight: /library/sections/{id}/all returns items with the authenticated user's ratings
- Plex ratings are tied to user accounts (stored on plex.tv), not the server
- When fetched with a user's token, each item includes that user's personal userRating
- No special permissions needed - works for all authenticated users (admin and non-admin)
Implementation:
- getLibraryContent(serverUrl, userToken, libraryId) - Fetches library with user-specific ratings
- Returns PlexAudiobook[] with userRating field specific to the authenticated user
- Plex-authenticated users: Fetch full library (~1-2s), match by plexGuid/ratingKey against cached structure
- Local admin: Use cached ratings (skip API call, user has no Plex account)
BookDate Integration:
- enrichWithUserRatings(userId, cachedBooks) - Determines user type and returns appropriate ratings
- Local admin (plexId starts with 'local-') → cached ratings from system token (no API call)
- Plex-authenticated (everyone else) → user's plex.tv token + stored machineIdentifier → server access token → fetch library with user's ratings
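The user-type decision and the merge of personal ratings onto the cached structure might look like the sketch below. It is not the actual enrichWithUserRatings implementation: the RatedBook and AppUser types and both function names are hypothetical, and the choice to leave userRating undefined when the user has not rated a book is an assumption.

```typescript
interface RatedBook {
  plexGuid: string;
  title: string;
  userRating?: number;
}

interface AppUser {
  plexId: string;
}

// Local admins have no Plex account; their plexId carries a 'local-' prefix.
function isLocalAdmin(user: AppUser): boolean {
  return user.plexId.indexOf("local-") === 0;
}

// Overlays the user's own ratings on the cached books, matched by plexGuid.
// Assumption: a book the user has not rated ends up with no rating at all,
// rather than inheriting the system token owner's rating.
function mergeUserRatings(
  cached: RatedBook[],
  userLibrary: RatedBook[],
): RatedBook[] {
  const ratings = new Map<string, number | undefined>();
  for (const book of userLibrary) {
    ratings.set(book.plexGuid, book.userRating);
  }
  return cached.map((book) => ({
    ...book,
    userRating: ratings.get(book.plexGuid),
  }));
}
```

For a local admin the caller would skip mergeUserRatings and return the cached list unchanged.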
Notes:
- System Plex token (configured during setup) is used for library scanning, testing, admin operations only
- Cached ratings reflect whoever owns that system token
- Local admins use cached ratings because they don't have Plex accounts (user.authToken is bcrypt hash)
- Token types: Plex uses two token types per the API documentation
  - plex.tv OAuth tokens: For authenticating to plex.tv services
  - Server access tokens: For talking to individual PMS instances
  - Must call /api/v2/resources with plex.tv token + machineIdentifier to get server-specific access tokens
  - Each server in user's resources list has its own accessToken
- Security: machineIdentifier stored in Configuration during setup to avoid accessing system token for user operations
- BookDate correctly fetches server-specific access tokens without touching the system Plex token
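Resolving a server access token from the resources list can be sketched as a lookup by the stored machineIdentifier. The function name findServerAccessToken and the ResourceEntry shape are hypothetical, and the sketch assumes the identifier appears on each entry as clientIdentifier.

```typescript
// Assumed subset of one entry from the /api/v2/resources response.
interface ResourceEntry {
  clientIdentifier: string; // assumption: matches the stored machineIdentifier
  accessToken?: string;
}

// Returns the server-specific token, or null if the user cannot see the server.
function findServerAccessToken(
  resources: ResourceEntry[],
  machineIdentifier: string,
): string | null {
  for (const entry of resources) {
    if (entry.clientIdentifier === machineIdentifier) {
      return entry.accessToken ?? null;
    }
  }
  return null;
}
```

Only the user's plex.tv token and the stored machineIdentifier are needed as inputs, so the system Plex token never has to be read on this path.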
Fixed Issues ✅
1. Response Format Handling
- Issue: Server info "unknown", libraries failing to load
- Cause: Modern Plex returns JSON when Accept: application/json is set, not XML
- Fix: Added JSON handling alongside XML parsing, optional chaining for $ attributes
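A minimal sketch of this kind of dual-format handling is shown below. It is not the project's code: readIdentity and the two interfaces are hypothetical, and the XML branch assumes xml2js default options, where element attributes are placed under a "$" key.

```typescript
interface IdentityAttrs {
  machineIdentifier?: string;
  version?: string;
  platform?: string;
}

// JSON responses carry the fields directly on MediaContainer;
// xml2js-parsed XML carries them under MediaContainer.$ (assumed defaults).
interface IdentityResponse {
  MediaContainer?: IdentityAttrs & { $?: IdentityAttrs };
}

interface ServerIdentity {
  machineIdentifier: string;
  version: string;
  platform: string;
}

function readIdentity(parsed: IdentityResponse): ServerIdentity {
  const container = parsed.MediaContainer;
  const attrs: IdentityAttrs | undefined = container?.$ ?? container;
  return {
    machineIdentifier: attrs?.machineIdentifier ?? "unknown",
    version: attrs?.version ?? "unknown",
    platform: attrs?.platform ?? "unknown",
  };
}
```

Optional chaining keeps a missing "$" key from throwing, which is what produced the "unknown" server info when JSON arrived where XML was expected.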
2. OAuth Callback Missing pinId
- Issue: "Missing pinId parameter" after auth
- Fix: Modified getOAuthUrl() to append pinId to callback URL
3. Scan Architecture
- Issue: Matched requests instead of populating library (0 matches when DB empty)
- User Feedback: "Seeing books on homepage I know are in library"
- Fix: Rewrote to populate ALL Plex audiobooks to DB as source of truth, Audible matches against this
4. Mapping Artist Instead of Album
- Issue: Author names as titles, undefined authors
- Cause: Querying without type=9 returned Artist items, not Albums
- Fix: Added type=9 parameter, changed grandparentTitle to parentTitle for author
5. Immediate Plex Search After File Organization (400 Error)
- Issue: organize_files job triggered match_plex immediately after copying files
- Cause: Plex hadn't scanned new files yet, search API returned 400 error
- User Experience: Error logs despite successful download
- Fix: Removed immediate match_plex trigger, changed workflow:
- organize_files → status: 'downloaded' (green)
- Scheduled scan_plex (every 6 hours) → matches downloaded requests → status: 'available'
6. Recently Added Check Used Different Matching Criteria
- Issue: Recently added check didn't match downloaded requests that full scan matched
- Cause: Recently added used AND logic (title >= 70% AND author >= 70%), full scan used weighted average (title × 0.7 + author × 0.3 >= 0.7)
- User Experience: "The Tenant" → "The Tenant (Unabridged)" matched in full scan but not in recently added check
- Fix: Changed recently added check to use same weighted scoring algorithm as full scan
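The difference between the two criteria is easiest to see with numbers. The sketch below implements both rules exactly as stated in the cause above; the function names are hypothetical and the similarity values in the usage note are illustrative, not measured scores for any real title.

```typescript
const MATCH_THRESHOLD = 0.7;

// Old recently-added rule: both scores must clear the threshold on their own.
function matchesWithAndLogic(title: number, author: number): boolean {
  return title >= MATCH_THRESHOLD && author >= MATCH_THRESHOLD;
}

// Full-scan rule: title counts for 70% of the score, author for 30%.
function weightedScore(title: number, author: number): number {
  return title * 0.7 + author * 0.3;
}

function matchesWithWeightedScore(title: number, author: number): boolean {
  return weightedScore(title, author) >= MATCH_THRESHOLD;
}
```

With a title similarity of 0.8 and an author similarity of 0.5, the AND rule rejects the pair because 0.5 < 0.7, while the weighted rule accepts it because 0.8 × 0.7 + 0.5 × 0.3 = 0.71, which is at least 0.7. A strong title match can therefore compensate for a weak author match only under the weighted rule.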
7. Scan Methods Not Using Centralized Matcher
- Issue: Full scan and recently added check had custom matching logic, different from homepage matcher
- Cause: Each component implemented its own fuzzy matching without title normalization, ASIN matching, or narrator support
- User Experience: Inconsistent matching behavior across the application
- Fix: Both scan methods now use audiobook-matcher.ts utility (same as homepage)
  - ASIN matching: Checks plexGuid for exact ASIN (100% confidence)
  - Title normalization: Removes "(Unabridged)", "(Abridged)", etc.
  - Narrator matching: Can match narrator to Plex author field
  - ASIN filtering: Rejects candidates with wrong ASINs in plexGuid
  - Consistent 70% weighted threshold everywhere
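Title normalization of the kind listed above could be sketched as follows. This is an assumption-laden illustration, not the contents of audiobook-matcher.ts: the function name normalizeTitle is hypothetical, only the two documented suffixes are handled, and lowercasing is added as a guess at what a comparison-friendly form would need.

```typescript
// Strips "(Unabridged)" / "(Abridged)" markers and collapses whitespace
// so that titles compare equal regardless of edition labels.
function normalizeTitle(title: string): string {
  return title
    .replace(/\((?:un)?abridged\)/gi, " ")
    .replace(/\s+/g, " ")
    .trim()
    .toLowerCase();
}
```

After normalization, "The Tenant" and "The Tenant (Unabridged)" both become "the tenant", so an exact comparison succeeds where the raw strings differ.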
8. BookDate Token Decryption Failures
- Issue: Decryption errors when fetching user ratings for BookDate recommendations
- User Experience: "Failed to decrypt user authToken" / "Failed to decrypt system Plex token"
- Cause: Tokens may be stored as plain text (from before encryption implementation or different encryption key)
- Fix: Added fallback to use tokens as plain text if decryption fails
- User Plex token: Try decrypt, fallback to plain text
- System Plex token: Try decrypt, fallback to plain text (before architectural fix)
- Allows BookDate to function with both encrypted and plain text tokens
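The try-decrypt-then-fall-back pattern can be sketched independently of any particular cipher. The readStoredToken name and the Decrypt type are hypothetical; the caller supplies whatever decryption routine the application actually uses, and the sketch assumes that routine throws on input it cannot decrypt.

```typescript
// Any function that returns the plaintext or throws on invalid input.
type Decrypt = (stored: string) => string;

// Tries to decrypt the stored value; if that fails, treats it as plain text.
function readStoredToken(stored: string, decrypt: Decrypt): string {
  try {
    return decrypt(stored);
  } catch {
    // Assumption: the value predates encryption or was written with
    // a different key, so it is used as-is.
    return stored;
  }
}
```

One trade-off of this approach is that a corrupted ciphertext is passed on as if it were a token, so the failure surfaces later as an authentication error rather than a decryption error.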
9. BookDate Accessing System Token for User Operations ⚡ ARCHITECTURAL FIX
- Issue: Every BookDate user request was decrypting system Plex token to get machineIdentifier
- User Experience: Unnecessary decryption operations, security concern (users shouldn't access admin token)
- Cause: machineIdentifier was fetched via testConnection() using system token for each user request
- Fix: Store machineIdentifier in Configuration during setup, use stored value for user operations
  - Added plex_machine_identifier to Configuration table
  - Setup/complete route saves machineIdentifier from test-plex response
  - config.service.ts returns machineIdentifier from config
  - enrichWithUserRatings() uses stored machineIdentifier (no system token access)
  - System token now only used for: library scanning, setup, testing, admin operations
  - User flow: user's plex.tv token + stored machineIdentifier → server access token
- Security: Users never access or decrypt the system Plex token
10. OAuth Callback Re-fetching machineIdentifier ⚡ ARCHITECTURAL FIX
- Issue: auth/plex/callback route was calling testConnection() to fetch machineIdentifier on every user login
- User Experience: Unnecessary Plex API call on every authentication (adds latency, wastes resources)
- Cause: Inconsistent architecture - setup/settings save machineIdentifier, but callback re-fetched it
- Fix: Use stored machineIdentifier from config (via getPlexConfig().machineIdentifier)
- auth/plex/callback now reads from database instead of API call
- Consistent with BookDate and other user operations
- testConnection() only used for: testing connections, initial fetching during setup/settings
- Result: Faster authentication, no unnecessary API calls, consistent architecture
Library Item Deletion
Endpoint: DELETE /library/metadata/{ratingKey}
Use Case: When admin deletes a request, also delete from Plex library to keep in sync
Requirements:
- Deletion must be enabled: Settings > Server > Library in Plex webui
- Without this setting enabled, DELETE requests will fail
Implementation:
- deleteItem(serverUrl, authToken, ratingKey) - Deletes library item by ratingKey
- Called during request deletion when backend mode is 'plex'
- Extracts ratingKey from audiobook.plexGuid (format: plex://album/{ratingKey})
- Mirrors ABS deletion behavior for consistency
Error Handling:
- 404: Item not found (already deleted) - logged but not thrown
- Other errors: Logged but deletion continues (prevents blocking request deletion)
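Extracting the ratingKey from a plexGuid in the documented plex://album/{ratingKey} format can be sketched with a single regular expression. The function name ratingKeyFromPlexGuid is hypothetical, and returning null for any other format is an assumption about how unparseable values should be treated.

```typescript
// Returns the ratingKey from "plex://album/{ratingKey}", or null otherwise.
function ratingKeyFromPlexGuid(plexGuid: string): string | null {
  const match = /^plex:\/\/album\/([^/?#]+)$/.exec(plexGuid);
  return match?.[1] ?? null;
}
```

A null result lets the caller skip the Plex DELETE call and still proceed with deleting the request, in line with the non-blocking error handling above.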
Availability Checking
- DB Population: Plex scan creates/updates records with plexGuid + ASIN + availabilityStatus: 'available'
- Audible Matching: Real-time ASIN-only matching (100% confidence, exact matches only)
- API Enrichment: Discovery APIs use real-time ASIN matching at query time
- UI: AudiobookCard shows "In Your Library" if isAvailable: true (ASIN exact match)
- Server Validation: /api/requests returns 409 if availabilityStatus === 'available'
Match Priority (ASIN-Only):
- ASIN in dedicated field (100% confidence) → Match
- ASIN in plexGuid (backward compatibility) → Match
- No ASIN match → Return null (no fuzzy fallback)
Note: Fuzzy matching (70% threshold) is preserved in ranking-algorithm.ts for Prowlarr torrent selection, but NOT used for library availability checks. This eliminates false positives (e.g., "Foundation" matching "Foundation and Empire").
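The three-step priority can be sketched as a lookup that never falls back to title comparison. The LibraryEntry type and findByAsin function are hypothetical, and the sketch assumes that a legacy plexGuid contains the ASIN verbatim as a substring, which may differ from how the real matcher parses it.

```typescript
interface LibraryEntry {
  plexGuid: string;
  asin?: string | null;
  title: string;
}

// ASIN-only lookup: dedicated field first, then plexGuid, otherwise null.
function findByAsin(asin: string, library: LibraryEntry[]): LibraryEntry | null {
  const wanted = asin.trim().toUpperCase();
  if (wanted === "") return null;

  // 1. ASIN stored in the dedicated field.
  for (const entry of library) {
    if ((entry.asin ?? "").toUpperCase() === wanted) return entry;
  }
  // 2. ASIN embedded in plexGuid (backward compatibility; assumed substring).
  for (const entry of library) {
    if (entry.plexGuid.toUpperCase().indexOf(wanted) !== -1) return entry;
  }
  // 3. No ASIN match: no fuzzy fallback.
  return null;
}
```

Because titles are never compared, a request for "Foundation" cannot be satisfied by a library copy of "Foundation and Empire" unless the two share an ASIN.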
Tech Stack
- axios/node-fetch
- xml2js (XML → JSON)
- string-similarity (fuzzy matching)