Add language config and locale-aware parsing

Introduce a centralized language configuration and wire locale-aware behavior through scraping and ranking. Add src/lib/constants/language-config.ts with per-language scraping rules, stop words, and character replacements. Replace AudibleRegion.isEnglish with a language field in the types and in AUDIBLE_REGIONS. Update AudibleService, the ebook scraper, the processors, and the API routes to call getLanguageForRegion, so that Anna's Archive searches, scraping selectors, runtime/rating parsing, and ranking all use language-specific parameters and filters. Extend the ranking algorithm to accept stopWords and characterReplacements and apply them during normalization and matching. Update the UI selects to mark non-English regions, and adjust the tests accordingly.
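
The normalization step described above can be sketched roughly as follows. This is a minimal illustration, not the contents of language-config.ts: the LanguageConfig shape, the sample stop words and replacements, the REGION_LANGUAGE table, the English fallback, and the normalizeTitle helper are all assumptions made for the example. Only the names getLanguageForRegion, stopWords, and characterReplacements come from the commit message.

```typescript
// Hypothetical shapes; the real language-config.ts is not shown in this view.
type LanguageCode = "en" | "de" | "fr";

interface LanguageConfig {
  stopWords: string[];
  characterReplacements: { [from: string]: string };
}

// Sample data for illustration only.
const LANGUAGE_CONFIG: { [code in LanguageCode]: LanguageConfig } = {
  en: { stopWords: ["the", "a", "an", "of"], characterReplacements: {} },
  de: {
    stopWords: ["der", "die", "das", "und"],
    characterReplacements: { "ä": "ae", "ö": "oe", "ü": "ue", "ß": "ss" },
  },
  fr: {
    stopWords: ["le", "la", "les", "de"],
    characterReplacements: { "é": "e", "è": "e", "ê": "e", "ç": "c" },
  },
};

// Illustrative region table; only the idea of a per-region language
// mirrors the commit.
const REGION_LANGUAGE: { [region: string]: LanguageCode } = {
  us: "en",
  uk: "en",
  de: "de",
  fr: "fr",
};

// Assumed behavior: unknown regions fall back to English.
function getLanguageForRegion(region: string): LanguageCode {
  const language = REGION_LANGUAGE[region];
  return language !== undefined ? language : "en";
}

// Lowercase the title, apply character replacements, then drop stop words.
// Splitting is on whitespace only, so punctuation is left untouched.
function normalizeTitle(title: string, config: LanguageConfig): string {
  let text = title.toLowerCase();
  for (const from of Object.keys(config.characterReplacements)) {
    text = text.split(from).join(config.characterReplacements[from]);
  }
  return text
    .split(/\s+/)
    .filter((word) => word.length > 0 && config.stopWords.indexOf(word) === -1)
    .join(" ");
}
```

Under these assumptions, a caller would look up the config once per request, for example `normalizeTitle(title, LANGUAGE_CONFIG[getLanguageForRegion(region)])`, and compare the normalized forms of the search query and the candidate title during ranking.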
kikootwo
2026-02-20 06:32:44 -05:00
parent c146383735
commit 5d8ac2f73d
18 changed files with 525 additions and 112 deletions
+2 -2
@@ -115,11 +115,11 @@ export function BackendSelectionStep({
>
{Object.values(AUDIBLE_REGIONS).map((region) => (
<option key={region.code} value={region.code}>
-{region.name}{!region.isEnglish ? ' *' : ''}
+{region.name}{region.language !== 'en' ? ' *' : ''}
</option>
))}
</select>
-{AUDIBLE_REGIONS[audibleRegion]?.isEnglish === false && (
+{AUDIBLE_REGIONS[audibleRegion]?.language !== 'en' && (
<div className="bg-amber-50 dark:bg-amber-900/20 rounded-lg p-4 border border-amber-200 dark:border-amber-800 mt-2">
<div className="flex gap-3">
<svg