mirror of
https://github.com/kikootwo/ReadMeABook.git
synced 2026-06-02 20:30:10 +00:00
Add first-class ebook request support and UI
Implements first-class ebook requests with their own type, parent-child relationship to audiobook requests, and separate status flow. Updates database schema and migrations to support 'type' and 'parentRequestId' fields on requests. Adds processors and job types for ebook search and direct HTTP download from Anna's Archive, with FlareSolverr integration for Cloudflare bypass. Enhances admin UI tables and request actions to display and manage ebook requests, including orange badge and source links. Updates documentation to reflect new ebook support, configuration, and behavior.
@@ -1,19 +1,29 @@
# E-book Sidecar
# E-book Support

**Status:** ✅ Implemented | Optional e-book downloads from Anna's Archive
**Status:** ✅ Implemented | First-class ebook requests with Anna's Archive integration

## Overview
Automatically downloads e-books from Anna's Archive to accompany audiobooks, placing them in the same folder.
Ebooks are first-class citizens in RMAB, with their own request type, tracking, and UI representation. When an audiobook request completes, an ebook request is automatically created (if enabled). Ebooks are downloaded directly from Anna's Archive via HTTP.

## Key Details
- **When:** Runs during file organization (after audiobook copied, after cover art)
- **Matching:** ASIN-based search (exact match)
- **Non-blocking:** Failures don't affect audiobook download
- **Atomic:** Either succeeds or fails gracefully
- **Location:** E-book placed in same directory as audiobook
- **Filename:** `[Title] - [Author].[format]` (sanitized)

## Configuration
### First-Class Ebook Requests
- **Request Type:** `type: 'ebook'` (vs `'audiobook'`)
- **Parent Relationship:** Ebook requests are children of audiobook requests (`parentRequestId`)
- **Terminal State:** `downloaded` (ebooks don't have "available" state like audiobooks)
- **UI Badge:** Orange (#f16f19) ebook badge to distinguish from audiobooks
- **Separate Tracking:** Own progress, status, and error handling

### Flow
1. Audiobook organization completes
2. Ebook request created automatically (if enabled)
3. `search_ebook` job searches Anna's Archive
4. `start_direct_download` downloads via HTTP
5. `organize_files` copies to audiobook folder
6. Request marked as `downloaded` (terminal)
7. "Available" notification sent

### Configuration

**Admin Settings → E-book Sidecar tab**

@@ -21,287 +31,155 @@ Automatically downloads e-books from Anna's Archive to accompany audiobooks, pla
|-----|---------|---------|-------------|
| `ebook_sidecar_enabled` | `false` | `true/false` | Enable feature |
| `ebook_sidecar_preferred_format` | `epub` | `epub, pdf, mobi, azw3, any` | Preferred format |
| `ebook_sidecar_base_url` | `https://annas-archive.li` | URL | Base URL (mirror resilience) |
| `ebook_sidecar_flaresolverr_url` | `` (empty) | URL | FlareSolverr proxy URL (optional) |
| `ebook_sidecar_base_url` | `https://annas-archive.li` | URL | Base URL |
| `ebook_sidecar_flaresolverr_url` | `` (empty) | URL | FlareSolverr proxy (optional) |

**Stored in:** `Configuration` table (database)
## Database Schema

**Request model additions:**
```prisma
type String @default("audiobook") // 'audiobook' | 'ebook'
parentRequestId String? @map("parent_request_id")
parentRequest Request? @relation("EbookParent", fields: [parentRequestId], references: [id])
childRequests Request[] @relation("EbookParent")
```

**Indexes:** `type`, `parentRequestId`

## Job Processors

### search_ebook
- Searches Anna's Archive by ASIN first, then title + author
- Creates download history record with `downloadClient: 'direct'`
- Triggers `start_direct_download` job

### start_direct_download
- Downloads file via HTTP with progress tracking
- Tries multiple slow download links on failure
- Triggers `organize_files` on success

### monitor_direct_download
- Future use for async download monitoring
- Currently, most tracking happens in `start_direct_download`

## Ranking Algorithm

Ebook ranking (for future multi-source support):
- **Format Score:** 40 pts (exact match) to 10 pts (different format)
- **Size Score:** 30 pts (inverse - smaller files preferred)
- **Source Score:** 30 pts (Anna's Archive gets full score)
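
The three components above add up to a score out of 100. The following is a minimal sketch of how they might be combined, not the project's `rankEbooks` implementation. The `EbookCandidate` shape, the function names, the `"annas-archive"` source label, and the linear size scaling are assumptions made for illustration, and only the two documented format endpoints (40 and 10) are used.

```typescript
// Hypothetical candidate shape; the real types in ranking-algorithm.ts may differ.
interface EbookCandidate {
  format: string; // e.g. "epub"
  sizeBytes: number;
  source: string; // e.g. "annas-archive" (assumed label)
}

interface ScoredCandidate {
  candidate: EbookCandidate;
  score: number;
}

function scoreEbookCandidate(
  candidate: EbookCandidate,
  preferredFormat: string,
  largestSizeBytes: number
): number {
  // Format: 40 pts for an exact match, 10 pts for any other format.
  const formatScore = candidate.format === preferredFormat ? 40 : 10;

  // Size: up to 30 pts, scaled inversely (assumed linear) against the largest candidate.
  const ratio =
    largestSizeBytes > 0 ? Math.min(candidate.sizeBytes / largestSizeBytes, 1) : 0;
  const sizeScore = 30 * (1 - ratio);

  // Source: Anna's Archive gets the full 30 pts; other sources get 0 in this sketch.
  const sourceScore = candidate.source === "annas-archive" ? 30 : 0;

  return formatScore + sizeScore + sourceScore;
}

function rankEbookCandidates(
  candidates: EbookCandidate[],
  preferredFormat: string
): ScoredCandidate[] {
  const largest = candidates.reduce((max, c) => Math.max(max, c.sizeBytes), 0);
  return candidates
    .map((candidate) => ({
      candidate,
      score: scoreEbookCandidate(candidate, preferredFormat, largest),
    }))
    .sort((a, b) => b.score - a.score); // highest score first
}
```

With a 2 MB EPUB and a 4 MB PDF from the same source and `epub` preferred, the EPUB ranks first under these assumptions.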

## Delete Behavior

**Ebook deletion is different from audiobook deletion:**
- Only deletes ebook files (`.epub`, `.pdf`, `.mobi`, etc.)
- Does NOT delete the title folder (audiobook files remain)
- Does NOT delete from backend library (Plex/ABS)
- Does NOT clear audiobook availability linkage
- Soft-deletes the ebook request record
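
The "only ebook files" rule above amounts to filtering the title folder by extension and leaving everything else alone. A minimal sketch, assuming the extension list: the source names `.epub`, `.pdf`, and `.mobi` and then says "etc.", so `.azw3` is added here only because it appears among the preferred-format options. The function names are hypothetical and this is not the code in `request-delete.service.ts`.

```typescript
// Assumed extension list; the source only enumerates .epub, .pdf, .mobi "etc.".
const EBOOK_EXTENSIONS = [".epub", ".pdf", ".mobi", ".azw3"];

// True when the filename ends in one of the ebook extensions (case-insensitive).
function isEbookFile(filename: string): boolean {
  const dot = filename.lastIndexOf(".");
  if (dot <= 0) return false; // no extension, or a dotfile such as ".epub"
  const ext = filename.slice(dot).toLowerCase();
  return EBOOK_EXTENSIONS.indexOf(ext) !== -1;
}

// Given the contents of a title folder, return only the files an ebook
// delete should remove. Audiobook files and cover art are left untouched.
function selectEbookFilesToDelete(filenames: string[]): string[] {
  return filenames.filter(isEbookFile);
}
```

The caller would delete only the returned files and never the folder itself, which matches the second bullet above.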

## UI Representation

### RequestCard
- Orange ebook badge displayed next to status badge
- Orange book icon for placeholder cover art
- Interactive search disabled (Anna's Archive only)

### Status Flow
```
pending → searching → downloading → processing → downloaded (terminal)
↘ awaiting_search (retry) ↗
```
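
The status flow above can be read as a small state machine. Below is a minimal sketch of a transition table for it. The exact edges into and out of `awaiting_search` are an assumption (the diagram does not pin down where the retry arrows attach), and the type and function names are hypothetical.

```typescript
type EbookStatus =
  | "pending"
  | "searching"
  | "awaiting_search"
  | "downloading"
  | "processing"
  | "downloaded";

// Allowed next states for each status. The awaiting_search edges are assumed:
// a failed search or failed download parks the request for retry, and a retry
// re-enters searching.
const EBOOK_TRANSITIONS: Record<EbookStatus, EbookStatus[]> = {
  pending: ["searching"],
  searching: ["downloading", "awaiting_search"],
  awaiting_search: ["searching"],
  downloading: ["processing", "awaiting_search"],
  processing: ["downloaded"],
  downloaded: [], // terminal: ebooks have no "available" state
};

function canTransition(from: EbookStatus, to: EbookStatus): boolean {
  return EBOOK_TRANSITIONS[from].indexOf(to) !== -1;
}

function isTerminal(status: EbookStatus): boolean {
  return EBOOK_TRANSITIONS[status].length === 0;
}
```

Keeping the edges in one table makes it easy to reject an invalid status update before it reaches the database.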

## FlareSolverr Integration

Anna's Archive uses Cloudflare protection which may block direct scraping requests. FlareSolverr solves this by using a headless browser to bypass the protection.

### What is FlareSolverr?
- Proxy server using headless Chrome/Chromium
- Automatically solves Cloudflare challenges
- Returns HTML content after challenge is solved
- Open source: https://github.com/FlareSolverr/FlareSolverr

### When to Use FlareSolverr
- **Required:** When e-book downloads consistently fail with no search results
- **Optional:** If direct requests work (depends on Cloudflare's current state)
- **Recommended:** For reliable, consistent downloads
Anna's Archive uses Cloudflare protection. FlareSolverr bypasses this using a headless browser.

### Setup
1. Run FlareSolverr via Docker:
```bash
docker run -d --name flaresolverr -p 8191:8191 ghcr.io/flaresolverr/flaresolverr:latest
```
2. In Admin Settings → E-book Sidecar, enter: `http://localhost:8191`
3. Click "Test Connection" to verify
```bash
docker run -d --name flaresolverr -p 8191:8191 ghcr.io/flaresolverr/flaresolverr:latest
```

### How It Works
1. Requests are routed through FlareSolverr
2. FlareSolverr loads the page in headless Chrome
3. If Cloudflare challenge appears, it waits for solution
4. HTML is returned after page loads
5. Falls back to direct requests if FlareSolverr fails
Configure URL in Admin Settings → E-book Sidecar: `http://localhost:8191`

### Performance Impact
- **First request:** ~5-10 seconds (browser startup)
- **Subsequent requests:** ~2-5 seconds per page
- **Total time:** ~15-30 seconds per e-book (vs ~5-15 without)
### Performance
- First request: ~5-10 seconds
- Subsequent: ~2-5 seconds per page
- Total: ~15-30 seconds per ebook

## How It Works
## Scraping Strategy

### Flow
1. **Trigger:** File organization completes audiobook copy
2. **Check:** `ebook_sidecar_enabled === 'true'`
3. **Search:** Try ASIN first (if available), then fall back to title + author
4. **Extract MD5:** First search result → MD5 hash
5. **Get Download Links:** Find "no waitlist" slow download links
6. **Extract URL:** Parse slow download page for actual file server URL
7. **Download:** Stream file to audiobook directory
8. **Rename:** Sanitize filename based on metadata

### Scraping Strategy

**Method 1: ASIN Search (exact match)**
### Method 1: ASIN Search (exact match)
```
Search: https://annas-archive.li/search?ext=epub&lang=en&q="asin:B09TWSRMCB"
↓
MD5 Page: https://annas-archive.li/md5/[md5]
↓ (Filter: "slow partner server" links)
Slow Download: https://annas-archive.li/slow_download/[md5]/0/5
↓ (Parse for actual download URL)
File Server: http://[server-ip]:port/path/to/file.epub
```

**Method 2: Title + Author Search (fallback)**
```
Search: https://annas-archive.li/search?q=The+Housemaid+Freida+McFadden
&ext=epub
&content=book_nonfiction&content=book_fiction&content=book_unknown
&lang=en
↓
(Same flow as ASIN search from MD5 page onwards)
Slow Download: https://annas-archive.li/slow_download/[md5]/0/5
↓
File Server: http://[server]/path/to/file.epub
```

### Matching Priority
1. **ASIN** (exact match - most accurate, if available)
2. **Title + Author** (fuzzy match with book/language filters)

### Retry Logic
- **Max attempts:** 5 slow download links
- **Timeout:** 60 seconds per download
- **Delays:** 1.5 seconds between requests
- **Retries:** 3x for 5xx errors with exponential backoff
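
The last bullet (retry on 5xx with exponential backoff) can be sketched as a small wrapper around any async operation. This is an illustrative sketch only. `HttpError`, `withRetry`, the 1000 ms base delay, and the reading of "3x" as three retries after the first attempt are all assumptions, and the sleep function is injectable so the delays can be observed without waiting.

```typescript
// Hypothetical error type carrying the HTTP status of a failed request.
class HttpError extends Error {
  readonly status: number;
  constructor(status: number) {
    super(`HTTP ${status}`);
    this.status = status;
    Object.setPrototypeOf(this, HttpError.prototype); // keep instanceof working on older targets
  }
}

const defaultSleep = (ms: number): Promise<void> =>
  new Promise<void>((resolve) => setTimeout(resolve, ms));

// Runs `operation`, retrying only on 5xx responses. Delays double each time:
// baseDelayMs, 2x, 4x, ... Any other error is rethrown immediately.
async function withRetry<T>(
  operation: () => Promise<T>,
  maxRetries: number = 3,
  baseDelayMs: number = 1000,
  sleep: (ms: number) => Promise<void> = defaultSleep
): Promise<T> {
  for (let attempt = 0; ; attempt++) {
    try {
      return await operation();
    } catch (err) {
      const retryable =
        err instanceof HttpError && err.status >= 500 && err.status < 600;
      if (!retryable || attempt >= maxRetries) throw err;
      await sleep(baseDelayMs * 2 ** attempt);
    }
  }
}
```

A caller would wrap each page fetch or file download in `withRetry` and move on to the next slow download link once the retries are exhausted.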

## Format Support

| Format | Extension | Recommended | Notes |
|--------|-----------|-------------|-------|
| EPUB | `.epub` | ✅ Yes | Most compatible with e-readers |
| PDF | `.pdf` | ⚠️ Sometimes | Best for fixed-layout books |
| MOBI | `.mobi` | ⚠️ Legacy | Kindle (older devices) |
| AZW3 | `.azw3` | ⚠️ Sometimes | Kindle (newer devices) |
| Any | `[first available]` | ❌ No | Downloads first match |

**Recommendation:** Use EPUB for maximum compatibility.
### Method 2: Title + Author (fallback)
```
Search: https://annas-archive.li/search?q=Title+Author&ext=epub&lang=en
↓ (Same flow from MD5 page)
```
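
Both search methods hit the same `/search` endpoint and differ only in the query string. A minimal sketch of building those URLs with the standard `URL` API follows. The function names are hypothetical, omitting `ext` for the `any` format is an assumption, and the parameter order and percent-encoding produced here may differ from what `ebook-scraper.ts` actually sends.

```typescript
// Builds a search URL from the configured base URL, a query, and a format.
function buildSearchUrl(baseUrl: string, query: string, format: string): string {
  const url = new URL("/search", baseUrl);
  url.searchParams.set("q", query);
  if (format !== "any") {
    url.searchParams.set("ext", format); // assumed: "any" means no extension filter
  }
  url.searchParams.set("lang", "en");
  return url.toString();
}

// Method 1: the ASIN is wrapped as a quoted "asin:..." term for an exact match.
function buildAsinSearchUrl(baseUrl: string, asin: string, format: string): string {
  return buildSearchUrl(baseUrl, `"asin:${asin}"`, format);
}

// Method 2: title and author are joined into one free-text query.
function buildTitleSearchUrl(
  baseUrl: string,
  title: string,
  author: string,
  format: string
): string {
  return buildSearchUrl(baseUrl, `${title} ${author}`, format);
}
```

For example, `buildTitleSearchUrl("https://annas-archive.li", "The Housemaid", "Freida McFadden", "epub")` yields `https://annas-archive.li/search?q=The+Housemaid+Freida+McFadden&ext=epub&lang=en`.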

## File Naming

**Pattern:** `[Title] - [Author].[format]`

**Sanitization:**
- Remove invalid chars: `<>:"/\|?*`
- Collapse multiple spaces
- Trim leading/trailing spaces and dots
- Limit to 100 characters

**Examples:**
- `The Housemaid - Freida McFadden.epub`
- `Project Hail Mary - Andy Weir.pdf`
- Remove: `<>:"/\|?*`
- Collapse spaces, trim, limit to 200 chars
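
The sanitization rules above translate fairly directly into a few string operations. This is a minimal sketch, not the project's `organizeEbook` code. The function names are hypothetical, the 200-character default follows the newer rule in this diff, and applying the limit to the base name (before the extension is appended) is an assumption.

```typescript
// Applies the documented rules in order: strip invalid characters, collapse
// whitespace, trim spaces and dots, then cap the length.
function sanitizeFilename(name: string, maxLength: number = 200): string {
  return name
    .replace(/[<>:"\/\\|?*]/g, "") // remove invalid chars: <>:"/\|?*
    .replace(/\s+/g, " ") // collapse runs of whitespace
    .replace(/^[ .]+|[ .]+$/g, "") // trim leading/trailing spaces and dots
    .slice(0, maxLength) // limit length
    .replace(/[ .]+$/, ""); // re-trim in case the cut landed on a space or dot
}

// Builds "[Title] - [Author].[format]" from request metadata.
function buildEbookFilename(title: string, author: string, format: string): string {
  return `${sanitizeFilename(`${title} - ${author}`)}.${format}`;
}
```

For example, `buildEbookFilename("The Housemaid", "Freida McFadden", "epub")` returns `The Housemaid - Freida McFadden.epub`.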

## Error Handling

**Graceful Failures (non-blocking):**
- No ASIN available → Skip silently (log info)
- No search results → Log warning, continue audiobook
- No download links → Log warning, continue audiobook
- All downloads fail → Log error, continue audiobook
- Download timeout → Log error, continue audiobook
**Non-blocking errors:**
- No search results → Request goes to `awaiting_search` for retry
- All downloads fail → Same retry behavior
- Audiobook organization never affected

**Never Blocks Audiobook:**
- All e-book errors are non-fatal
- Audiobook organization completes regardless
- Errors logged to job events (visible in admin)
## Technical Files

## Logging
**Processors:**
- `src/lib/processors/search-ebook.processor.ts`
- `src/lib/processors/direct-download.processor.ts`
- `src/lib/processors/organize-files.processor.ts` (ebook branch)

**Success (with FlareSolverr):**
```
E-book sidecar enabled, searching for e-book...
Using FlareSolverr at http://localhost:8191
Searching by ASIN: B09TWSRMCB (format: epub)...
Found via ASIN: 3b6f9c0f1665c4ba6e3214d43c37e1de
Found MD5: 3b6f9c0f1665c4ba6e3214d43c37e1de
Found 8 download link(s)
Attempting download link 1/5...
Downloading from: 93.123.118.12
E-book downloaded: The Housemaid - Freida McFadden.epub
```
**Services:**
- `src/lib/services/ebook-scraper.ts`
- `src/lib/services/job-queue.service.ts` (ebook job types)

**Success (ASIN match, direct):**
```
E-book sidecar enabled, searching for e-book...
Searching by ASIN: B09TWSRMCB (format: epub)...
Found via ASIN: 3b6f9c0f1665c4ba6e3214d43c37e1de
Found MD5: 3b6f9c0f1665c4ba6e3214d43c37e1de
Found 8 download link(s)
Attempting download link 1/5...
Downloading from: 93.123.118.12
E-book downloaded: The Housemaid - Freida McFadden.epub
```
**Utils:**
- `src/lib/utils/file-organizer.ts` (`organizeEbook` method)
- `src/lib/utils/ranking-algorithm.ts` (`rankEbooks` function)

**Success (Title fallback):**
```
E-book sidecar enabled, searching for e-book...
Searching by ASIN: B09TWSRMCB (format: epub)...
No results for ASIN, falling back to title + author search...
Searching by title + author: "The Housemaid" by Freida McFadden...
Found via title search: 3b6f9c0f1665c4ba6e3214d43c37e1de
Found MD5: 3b6f9c0f1665c4ba6e3214d43c37e1de
Found 8 download link(s)
E-book downloaded: The Housemaid - Freida McFadden.epub
```
**UI:**
- `src/components/requests/RequestCard.tsx` (ebook badge)

**Failure:**
```
E-book sidecar enabled, searching for e-book...
Searching by ASIN: B09TWSRMCB (format: epub)...
No results for ASIN, falling back to title + author search...
Searching by title + author: "The Housemaid" by Freida McFadden...
No search results found for title: "The Housemaid" by Freida McFadden
E-book download failed: No search results found (tried ASIN and title+author)
```
**Delete:**
- `src/lib/services/request-delete.service.ts` (ebook-specific logic)

## Troubleshooting
## Format Support

### E-book Not Downloaded

**Cause:** No matching e-book in Anna's Archive (tried ASIN and title+author)
**Solution:** Not all audiobooks have e-book equivalents; this is expected

**Cause:** ASIN mismatch (Anna's Archive has different ASIN)
**Solution:** Feature now automatically falls back to title + author search

**Cause:** All download links failed
**Solution:** Check job logs for errors; the cause may be temporary server issues

### Wrong Format Downloaded

**Cause:** Preferred format not available
**Solution:** Anna's Archive doesn't have that format, so the download falls back to an available format

### Download Timeout

**Cause:** Slow file server or large file
**Solution:** Automatic retry with next download link

### Feature Not Working

**Cause:** Feature disabled
**Solution:** Admin Settings → E-book Sidecar → Enable toggle

### Cloudflare Blocking

**Cause:** Anna's Archive has Cloudflare protection enabled
**Solution:** Configure FlareSolverr (see FlareSolverr Integration section)

**Symptoms:**
- No search results found
- Requests timing out
- Errors about Cloudflare challenge

### FlareSolverr Not Working

**Cause:** FlareSolverr not running or unreachable
**Solution:**
1. Verify FlareSolverr is running: `docker ps | grep flaresolverr`
2. Check URL is correct (usually `http://localhost:8191`)
3. Test connection in Admin Settings

**Cause:** FlareSolverr timing out
**Solution:** FlareSolverr may need more time; check container logs for errors

## Security & Legal

**Important Notes:**
- Anna's Archive is a shadow library
- Use at your own discretion and responsibility
- Ensure compliance with local laws and regulations
- Feature is optional and disabled by default
- No API key required (web scraping)

**Privacy:**
- User-Agent: `ReadMeABook/1.0 (Audiobook Automation)`
- No tracking or analytics
- Distributed (each user scrapes for themselves)

## Technical Implementation

**Files:**
- Service: `src/lib/services/ebook-scraper.ts`
- Integration: `src/lib/utils/file-organizer.ts` (line 265+)
- Settings API: `src/app/api/admin/settings/ebook/route.ts`
- FlareSolverr Test API: `src/app/api/admin/settings/ebook/test-flaresolverr/route.ts`
- UI: `src/app/admin/settings/page.tsx` (ebook tab)

**Dependencies:**
- axios (HTTP requests)
- cheerio (HTML parsing)
- fs/promises (file operations)

**Caching:**
- MD5 lookups cached in-memory (prevents re-scraping same ASIN)
- Cache cleared on service restart

## Performance

**Impact:**
- **Network:** 3-5 requests per e-book (search, MD5, slow download pages)
- **Time:** ~5-15 seconds per e-book (depends on file server)
- **Storage:** E-books typically 1-50 MB
- **CPU:** Minimal (streaming download)
| Format | Extension | Recommended |
|--------|-----------|-------------|
| EPUB | `.epub` | ✅ Yes |
| PDF | `.pdf` | ⚠️ Sometimes |
| MOBI | `.mobi` | ⚠️ Legacy |
| AZW3 | `.azw3` | ⚠️ Sometimes |

## Limitations

1. **Match Accuracy:** Title + author search may return wrong book if title is common
2. **Format Availability:** Depends on Anna's Archive catalog
3. **Download Speed:** Depends on file server load
4. **Language:** Title search filters for English books only
5. **Success Rate:** ~70-90% (ASIN has higher accuracy, title fallback is less precise)

## Future Enhancements

- ISBN-13 fallback matching (between ASIN and title search)
- Format preference priority list (try EPUB, then PDF, then MOBI)
- Per-request override (API endpoint)
- Statistics tracking (success rate, formats, match method)
- Rate limit monitoring
- Relevance scoring for title search results
1. Single source (Anna's Archive) - future Prowlarr support stubbed
2. Title search may return wrong book for common titles
3. Download speed depends on file server load
4. English books only (title search filter)

## Related
- [File Organization](../phase3/file-organization.md) - Where e-book download happens
- [File Organization](../phase3/file-organization.md) - Ebook organization
- [Settings Pages](../settings-pages.md) - Configuration UI
- [Configuration Service](../backend/services/config.md) - Settings storage
- [Ranking Algorithm](../phase3/ranking-algorithm.md) - Ebook ranking
- [Request Deletion](../admin-features/request-deletion.md) - Delete behavior