From 1e8cf87331e7c5c63e3cf29014a243117d2afe22 Mon Sep 17 00:00:00 2001
From: Roland Knall
Date: Mon, 15 Jun 2026 08:03:02 +0000
Subject: [PATCH] fix(scan): confirm audiobook files via embedded tags when path heuristics fail

The per-audiobook file scan attributed candidate files using only
path/name heuristics: a file was kept only if its filename or folder
contained the audiobook title, or its path contained the author. Layouts
where the folder/filename does not carry that information (e.g.
AudioBookShelf-style series-creator folders with numbered episode
filenames) had every candidate rejected, so correctly-placed files were
never linked (0 files imported).

Add an embedded-tag confirmation fallback: for candidates the path
heuristics reject, read the file's ID3/MP4 tags (reusing
PathMetadataParser via the bundled ffprobe) and attribute the file when
the embedded ASIN matches the audiobook (definitive), or when both title
and author agree after normalization. Tags are read concurrently with a
bounded degree of parallelism; the match decision is applied on a single
thread. The match logic is factored into
ScanBackgroundService.MatchEmbeddedTags and covered by unit tests.
---
 CHANGELOG.md                                  |   1 +
 .../Audiobooks/ScanBackgroundService.cs       | 109 ++++++++++++++++
 .../ScanBackgroundServiceTagMatchTests.cs     | 122 ++++++++++++++++++
 3 files changed, 232 insertions(+)
 create mode 100644 tests/Features/Api/Services/ScanBackgroundServiceTagMatchTests.cs

diff --git a/CHANGELOG.md b/CHANGELOG.md
index 982b7c6df..d10c146eb 100644
--- a/CHANGELOG.md
+++ b/CHANGELOG.md
@@ -8,6 +8,7 @@ and this project adheres to [Semantic Versioning](https://semver.org/spec/v2.0.0
 ## [Unreleased]
 
 ### Fixed
+- **Library scan: files are matched to an audiobook by their embedded tags when the folder/filename layout does not carry the title or author.** The per-audiobook scan previously kept a candidate file only if its filename or folder contained the audiobook title, or its path contained the author — so AudioBookShelf-style layouts (series-creator folders, numbered episode filenames) left correctly-placed files unmatched and the audiobook stuck at zero files. The scan now falls back to reading embedded ID3/MP4 tags via the bundled ffprobe and attributes a file when its embedded ASIN matches the audiobook (definitive), or when both title and author agree after normalization. Tags are read (concurrently, with bounded parallelism) only for the files the path heuristics reject, so the common already-matching case is unaffected.
 - **Authentication settings: startup-config save no longer offers a downloadable `config.json` fallback when the backend refuses the save as invalid.** `SettingsView.saveSettings()` previously wrapped `apiService.saveStartupConfig` in a bare `catch {}` and treated every failure as a disk-persistence problem — offering the user a downloadable `config.json` containing the *server-rejected* values so they could save it manually. That bypasses the new backend admin-existence guard entirely: a user who tries to enable the login screen with no admin user gets the backend's 400, the FE catches it, and the FE offers a download of the same `AuthenticationRequired=true` config the server just refused. The catch now inspects the thrown error's `status`: 4xx responses are validation refusals and surface as a hard error toast (no download offered); 5xx and network failures fall through to the existing download fallback, which is the right escape hatch for "server wants to save but can't write to disk."
 - **Authentication settings: enabling the login screen now refuses to persist when no admin user exists.** `ConfigurationService.SaveStartupConfigAsync` queries `IUserService.GetAdminUsersAsync` whenever the incoming save *transitions* `AuthenticationRequired` from disabled to enabled, and throws if the admin user list is empty. This closes the carveout left by the credential-visibility and admin-provisioning fixes below: the settings DTO clears blank fields before save, so a user who flips "Enable login screen" with empty (or username-only) admin credentials silently skipped provisioning entirely and still reached the startup-config write, locking themselves out of an admin-less instance (recoverable by editing `config/config.json` back to `"AuthenticationRequired": "false"`, but a confusing first-time-setup trap). The check is scoped to the transition: subsequent saves while auth is already on (API key regenerations, port changes, log-level tweaks) don't re-query the admin list, and the common "just updating other startup fields with auth off" path stays unaffected. The admin block in `SaveApplicationSettings` runs before the startup-config write in the same save flow, so the typical "supply credentials and enable login in the same save" sequence has the admin row in place by the time the check runs.
 - **Authentication settings: admin provisioning failures no longer silently let the auth-required toggle proceed.** `ConfigurationService.SaveApplicationSettingsAsync` previously caught any exception from `CreateUserAsync` / `UpdatePasswordAsync`, logged it, and returned successfully — so when admin credentials were supplied but the user-service rejected them (password policy violation, repo I/O error, concurrent-write race), `SettingsView.saveSettings()` would still go on to persist `AuthenticationRequired=true` on its second request. The result was an instance that required login but had no working admin account — exactly the lockout shape the credential-visibility fix below was meant to prevent. The catch now re-throws the failure so the caller aborts before the auth-toggle write. The settings row itself is still saved before the admin block (non-admin changes like notification triggers and webhooks shouldn't disappear because admin provisioning failed), and the no-credentials path remains an unchanged silent skip.
diff --git a/listenarr.application/Audiobooks/ScanBackgroundService.cs b/listenarr.application/Audiobooks/ScanBackgroundService.cs
index 7ffa3e857..fe9fc59c3 100644
--- a/listenarr.application/Audiobooks/ScanBackgroundService.cs
+++ b/listenarr.application/Audiobooks/ScanBackgroundService.cs
@@ -19,6 +19,7 @@ using Listenarr.Application.Interfaces;
 using Listenarr.Application.Interfaces.Repositories;
 using Listenarr.Application.Mapping;
+using Listenarr.Application.Metadata;
 using Listenarr.Application.Notification;
 using Listenarr.Application.Security;
 using Listenarr.Domain.Common;
@@ -79,6 +80,8 @@ protected override async Task ExecuteAsync(CancellationToken stoppingToken)
                     var audiobookRepository = scope.ServiceProvider.GetRequiredService();
                     var fileRepository = scope.ServiceProvider.GetRequiredService();
                     var historyRepository = scope.ServiceProvider.GetRequiredService();
+                    // Optional: used by the embedded-tag confirmation fallback below.
+                    var ffmpegService = scope.ServiceProvider.GetService();
                     var audiobook = await audiobookRepository.GetByIdAsync(job.AudiobookId);
                     if (audiobook == null)
                     {
@@ -317,6 +320,56 @@ protected override async Task ExecuteAsync(CancellationToken stoppingToken)
                         }
                     }
 
+                    // Embedded-tag confirmation fallback: any audio candidates that the
+                    // path/name heuristics above could not attribute to this audiobook are
+                    // re-checked against their embedded tags (ID3/MP4). This rescues layouts
+                    // where the folder/filename does not carry the title/author (e.g.
+                    // AudioBookShelf-style series folders or numbered episode filenames) but
+                    // the files are tagged correctly. The embedded ASIN is a definitive,
+                    // layout-independent match; title+author tags are a softer fallback.
+                    var unconfirmed = candidates.Where(f => !unique.Contains(f)).ToList();
+                    if (unconfirmed.Count > 0)
+                    {
+                        var ffprobePath = ffmpegService != null ? await ffmpegService.GetFfprobePathAsync() : null;
+                        if (string.IsNullOrEmpty(ffprobePath))
+                        {
+                            _logger.LogDebug("Scan job {JobId}: {Count} candidate(s) unmatched by path heuristics but ffprobe is unavailable; skipping tag confirmation", job.Id, unconfirmed.Count);
+                        }
+                        else
+                        {
+                            // Read embedded tags concurrently (ffprobe is process/IO-bound) with a
+                            // bounded degree of parallelism, then apply matches sequentially so the
+                            // shared foundFiles/unique state is mutated on a single thread.
+                            var tagsByFile = new System.Collections.Concurrent.ConcurrentDictionary<string, PathParsedMetadata>(StringComparer.OrdinalIgnoreCase);
+                            var maxDop = Math.Max(1, Math.Min(4, Environment.ProcessorCount));
+                            await Parallel.ForEachAsync(unconfirmed,
+                                new ParallelOptions { MaxDegreeOfParallelism = maxDop, CancellationToken = stoppingToken },
+                                async (f, token) =>
+                                {
+                                    try
+                                    {
+                                        var tags = await PathMetadataParser.ReadEmbeddedTagsAsync(f, ffprobePath, token);
+                                        if (tags != null) tagsByFile[f] = tags;
+                                    }
+                                    catch (Exception tagEx) when (tagEx is not OperationCanceledException && tagEx is not OutOfMemoryException && tagEx is not StackOverflowException)
+                                    {
+                                        _logger.LogDebug(tagEx, "Scan job {JobId}: failed reading embedded tags for {File}", job.Id, LogRedaction.SanitizeFilePath(f));
+                                    }
+                                });
+
+                            foreach (var f in unconfirmed)
+                            {
+                                if (!tagsByFile.TryGetValue(f, out var tags)) continue;
+                                var reason = MatchEmbeddedTags(audiobook, tags);
+                                if (reason != TagMatchReason.None && unique.Add(f))
+                                {
+                                    foundFiles.Add(f);
+                                    _logger.LogInformation("Scan job {JobId}: confirmed '{File}' for audiobook {AudiobookId} via embedded tags ({Reason})", job.Id, LogRedaction.SanitizeFilePath(f), audiobook.Id, reason == TagMatchReason.Asin ? "ASIN" : "title+author");
+                                }
+                            }
+                        }
+                    }
+
                     // Calculate base path for the audiobook files
                     var basePath = CalculateBasePath(foundFiles);
                     if (!string.IsNullOrEmpty(basePath))
@@ -584,6 +637,62 @@ protected override async Task ExecuteAsync(CancellationToken stoppingToken)
             }
         }
 
+        /// <summary>Why an audio file was attributed to an audiobook via its embedded tags.</summary>
+        internal enum TagMatchReason
+        {
+            None = 0,
+            Asin,
+            TitleAndAuthor,
+        }
+
+        /// <summary>
+        /// Decides whether an audio file's embedded tags identify it as belonging to
+        /// <paramref name="audiobook"/>. A matching ASIN is definitive; otherwise both the
+        /// title and the author must agree (after normalization) so that a shared author or
+        /// a generic title alone cannot produce a false match.
+        /// </summary>
+        internal static TagMatchReason MatchEmbeddedTags(Audiobook audiobook, PathParsedMetadata? tags)
+        {
+            if (audiobook == null || tags == null) return TagMatchReason.None;
+
+            var wantAsin = audiobook.Asin?.Trim();
+            if (!string.IsNullOrWhiteSpace(wantAsin)
+                && !string.IsNullOrWhiteSpace(tags.Asin)
+                && string.Equals(tags.Asin.Trim(), wantAsin, StringComparison.OrdinalIgnoreCase))
+            {
+                return TagMatchReason.Asin;
+            }
+
+            var wantTitle = NormalizeTagToken(audiobook.Title);
+            var tagTitle = NormalizeTagToken(tags.Title);
+            var titleMatch = wantTitle.Length > 0 && tagTitle.Length > 0
+                && (tagTitle.Contains(wantTitle, StringComparison.Ordinal)
+                    || wantTitle.Contains(tagTitle, StringComparison.Ordinal));
+
+            var tagAuthor = NormalizeTagToken(tags.Author);
+            var authorMatch = tagAuthor.Length > 0
+                && (audiobook.Authors ?? new List<string>())
+                    .Select(NormalizeTagToken)
+                    .Where(a => a.Length > 0)
+                    .Any(a => tagAuthor.Contains(a, StringComparison.Ordinal) || a.Contains(tagAuthor, StringComparison.Ordinal));
+
+            return titleMatch && authorMatch ? TagMatchReason.TitleAndAuthor : TagMatchReason.None;
+        }
+
+        /// <summary>Lowercase and collapse runs of non-alphanumeric characters to a single space.</summary>
+        private static string NormalizeTagToken(string? s)
+        {
+            if (string.IsNullOrWhiteSpace(s)) return string.Empty;
+            var sb = new System.Text.StringBuilder(s.Length);
+            var lastWasSpace = false;
+            foreach (var ch in s.ToLowerInvariant())
+            {
+                if (char.IsLetterOrDigit(ch)) { sb.Append(ch); lastWasSpace = false; }
+                else if (!lastWasSpace) { sb.Append(' '); lastWasSpace = true; }
+            }
+            return sb.ToString().Trim();
+        }
+
         private string CalculateBasePath(List<string> filePaths)
         {
             if (!filePaths.Any())
diff --git a/tests/Features/Api/Services/ScanBackgroundServiceTagMatchTests.cs b/tests/Features/Api/Services/ScanBackgroundServiceTagMatchTests.cs
new file mode 100644
index 000000000..bd5e1b2ed
--- /dev/null
+++ b/tests/Features/Api/Services/ScanBackgroundServiceTagMatchTests.cs
@@ -0,0 +1,122 @@
+/*
+ * Listenarr - Audiobook Management System
+ * Copyright (C) 2024-2026 Listenarr Contributors
+ *
+ * This program is free software: you can redistribute it and/or modify
+ * it under the terms of the GNU Affero General Public License as published
+ * by the Free Software Foundation, either version 3 of the License, or
+ * (at your option) any later version.
+ *
+ * This program is distributed in the hope that it will be useful,
+ * but WITHOUT ANY WARRANTY; without even the implied warranty of
+ * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the
+ * GNU Affero General Public License for more details.
+ *
+ * You should have received a copy of the GNU Affero General Public License
+ * along with this program. If not, see <https://www.gnu.org/licenses/>.
+ */
+using Listenarr.Application.Audiobooks;
+using Listenarr.Application.Metadata;
+using Listenarr.Domain.Models;
+using Xunit;
+
+namespace Listenarr.Tests.Features.Api.Services
+{
+    // Covers ScanBackgroundService.MatchEmbeddedTags — the embedded-tag confirmation
+    // fallback used when path/filename heuristics cannot attribute a file to a book.
+    public class ScanBackgroundServiceTagMatchTests
+    {
+        private static Audiobook Book(string? title, string? asin, params string[] authors) => new()
+        {
+            Title = title,
+            Asin = asin,
+            Authors = new List<string>(authors),
+        };
+
+        private static PathParsedMetadata Tags(string? title = null, string? author = null, string? asin = null) =>
+            new() { Title = title, Author = author, Asin = asin };
+
+        [Fact]
+        public void MatchingAsin_IsDefinitive()
+        {
+            var book = Book("Das Tierarztpraktikum", "B004VQF7K2", "Markus Dittrich");
+            var tags = Tags(title: "something totally different", author: "Nobody", asin: "B004VQF7K2");
+
+            Assert.Equal(ScanBackgroundService.TagMatchReason.Asin, ScanBackgroundService.MatchEmbeddedTags(book, tags));
+        }
+
+        [Fact]
+        public void Asin_IsCaseAndWhitespaceInsensitive()
+        {
+            var book = Book("X", "B004VQF7K2");
+            var tags = Tags(asin: " b004vqf7k2 ");
+
+            Assert.Equal(ScanBackgroundService.TagMatchReason.Asin, ScanBackgroundService.MatchEmbeddedTags(book, tags));
+        }
+
+        [Fact]
+        public void TitleAndAuthor_MatchWhenAsinDiffers()
+        {
+            // ab3-style: record ASIN differs from the file's, but title + author agree.
+            var book = Book("Das Tierarztpraktikum", "B00U6W36DU", "Markus Dittrich");
+            var tags = Tags(title: "Das Tierarztpraktikum", author: "Markus Dittrich", asin: "B004VQF7K2");
+
+            Assert.Equal(ScanBackgroundService.TagMatchReason.TitleAndAuthor, ScanBackgroundService.MatchEmbeddedTags(book, tags));
+        }
+
+        [Fact]
+        public void TitleMatch_ToleratesPunctuationAndSubtitle()
+        {
+            // Combined "Title: Subtitle" tag still matches the bare record title (normalized, substring).
+            var book = Book("A Dance with Dragons", null, "George R.R. Martin");
+            var tags = Tags(title: "A Dance with Dragons: A Song of Ice and Fire, Book 5", author: "George R. R. Martin");
+
+            Assert.Equal(ScanBackgroundService.TagMatchReason.TitleAndAuthor, ScanBackgroundService.MatchEmbeddedTags(book, tags));
+        }
+
+        [Fact]
+        public void TitleMatchButAuthorMismatch_IsNotAMatch()
+        {
+            var book = Book("Das Tierarztpraktikum", "B00U6W36DU", "Markus Dittrich");
+            var tags = Tags(title: "Das Tierarztpraktikum", author: "Someone Else", asin: "B004VQF7K2");
+
+            Assert.Equal(ScanBackgroundService.TagMatchReason.None, ScanBackgroundService.MatchEmbeddedTags(book, tags));
+        }
+
+        [Fact]
+        public void AuthorMatchButTitleMismatch_IsNotAMatch()
+        {
+            var book = Book("Das Tierarztpraktikum", "B00U6W36DU", "Markus Dittrich");
+            var tags = Tags(title: "An Unrelated Book", author: "Markus Dittrich", asin: "B004VQF7K2");
+
+            Assert.Equal(ScanBackgroundService.TagMatchReason.None, ScanBackgroundService.MatchEmbeddedTags(book, tags));
+        }
+
+        [Fact]
+        public void NoUsableSignals_IsNotAMatch()
+        {
+            var book = Book("Das Tierarztpraktikum", "B00U6W36DU", "Markus Dittrich");
+            var tags = Tags(title: null, author: null, asin: null);
+
+            Assert.Equal(ScanBackgroundService.TagMatchReason.None, ScanBackgroundService.MatchEmbeddedTags(book, tags));
+        }
+
+        [Fact]
+        public void NullTags_IsNotAMatch()
+        {
+            var book = Book("Das Tierarztpraktikum", "B00U6W36DU", "Markus Dittrich");
+
+            Assert.Equal(ScanBackgroundService.TagMatchReason.None, ScanBackgroundService.MatchEmbeddedTags(book, null));
+        }
+
+        [Fact]
+        public void EmptyRecordAsin_DoesNotMatchEmptyTagAsin()
+        {
+            // Two missing ASINs must not be treated as "equal" — fall through to title/author.
+            var book = Book("Das Tierarztpraktikum", asin: null, "Markus Dittrich");
+            var tags = Tags(title: "Unrelated", author: "Markus Dittrich", asin: null);
+
+            Assert.Equal(ScanBackgroundService.TagMatchReason.None, ScanBackgroundService.MatchEmbeddedTags(book, tags));
+        }
+    }
+}