PDF.load() throws ObjectParseError on encrypted linearized PDFs instead of returning isEncrypted: true, isAuthenticated: false
Repo: https://github.com/LibPDF-js/core/issues
Version: @libpdf/core@0.3.4
Summary
When loading an encrypted PDF whose xref is stored as an encrypted object stream (common for linearized / PDF 1.5+ files), PDF.load() throws "ObjectParseError: Invalid object stream index at entry 0: expected object number, got eof" instead of returning a PDF instance with isEncrypted: true and isAuthenticated: false.
This makes it impossible to detect password protection through the documented API: the thrown error is a generic parse error, not a SecurityError, so callers cannot distinguish "encrypted, needs password" from "corrupt file."
Reproduction
import { PDF } from '@libpdf/core';
import { readFile } from 'node:fs/promises';
const bytes = await readFile('encrypted-linearized.pdf');
const pdf = await PDF.load(bytes); // throws
The PDF in question:
- %PDF-1.7, linearized
- Trailer contains /Encrypt 1431 0 R
- Cross-reference is itself an /XRef stream encrypted under the file key
Stack trace
ObjectParseError: Invalid object stream index at entry 0: expected object number, got eof
at ObjectStreamParser.parseIndex (libpdf_core.js:59991:48)
at ObjectStreamParser.parse (libpdf_core.js:59978:23)
at ObjectStreamParser.getObject (libpdf_core.js:60007:10)
at Object.getObject (libpdf_core.js:60939:30)
at ObjectRegistry.resolver (libpdf_core.js:68007:42)
at ObjectRegistry.resolve (libpdf_core.js:57860:24)
at walk (libpdf_core.js:64625:20)
at PDFPageTree.load (libpdf_core.js:64639:5)
at PDF.load (libpdf_core.js:68014:42)
What appears to happen
- PDF.load() parses the trailer and discovers /Encrypt.
- The empty-password attempt for the Standard security handler fails (the file requires a real user password).
- Without a valid file key, decrypting the object streams produces garbage.
- The page-tree walker (PDFPageTree.load) eagerly resolves refs that point into those streams, and the object-stream parser sees garbage bytes and throws.
- The throw escapes before the PDF instance is returned, so pdf.isEncrypted / pdf.isAuthenticated / pdf.getSecurity() are never reachable.
{ lenient: true } is the default (ParseOptions) but doesn't recover here. Passing credentials: "<wrong>" produces the same failure.
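To make the sequence above concrete, here is a minimal sketch of how the eager walk could be gated on authentication. This is a model, not libpdf's real internals: DeferredPageTree, SecurityState and the walk callback are hypothetical names I made up for illustration, and only isEncrypted / isAuthenticated correspond to the documented API.

```typescript
// Hypothetical model, not libpdf's actual code. Only the property names
// isEncrypted / isAuthenticated are taken from the documented API.
interface SecurityState {
  isEncrypted: boolean;
  isAuthenticated: boolean;
}

// Wraps an expensive walk so it runs on first access instead of at load time.
class DeferredPageTree<T> {
  private readonly security: SecurityState;
  private readonly walk: () => T; // stands in for the eager PDFPageTree.load step
  private loaded = false;
  private value: T | undefined;

  constructor(security: SecurityState, walk: () => T) {
    this.security = security;
    this.walk = walk;
  }

  get(): T {
    // Refuse to touch encrypted object streams without a valid file key.
    if (this.security.isEncrypted && !this.security.isAuthenticated) {
      throw new Error("NOT_AUTHENTICATED");
    }
    // Walk at most once, then serve the cached result.
    if (!this.loaded) {
      this.value = this.walk();
      this.loaded = true;
    }
    return this.value as T;
  }
}
```

With a gate like this, constructing the document object never resolves refs inside encrypted streams, so the load call itself could return and expose the security flags. The parse would only be attempted once a caller asks for pages on an authenticated document.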
Expected behavior
When the trailer contains /Encrypt and authentication fails, PDF.load() should resolve with a PDF instance where:
- isEncrypted === true
- isAuthenticated === false
…and defer (or skip) eager parsing of objects that live in encrypted streams. Callers can then either prompt the user for a password and call PDF.load(bytes, { credentials }), or treat the file as protected and stop.
Alternatively, throw a SecurityError with code NOT_AUTHENTICATED / INVALID_PASSWORD so callers have a typed signal.
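Either option would let calling code branch cleanly. A rough sketch of what that could look like on the caller side follows. The LoadedPdf interface, the LoadOutcome type and both helper functions are hypothetical, and I am assuming the typed error would expose its code on a `code` property, which the library may or may not do.

```typescript
// Hypothetical stand-ins: only isEncrypted, isAuthenticated, SecurityError,
// NOT_AUTHENTICATED and INVALID_PASSWORD are names taken from this report.
type LoadOutcome = "ok" | "needs-password" | "corrupt-or-unsupported";

interface LoadedPdf {
  isEncrypted: boolean;
  isAuthenticated: boolean;
}

// Option 1: load() resolves, and the two flags carry the signal.
function classifyLoaded(pdf: LoadedPdf): LoadOutcome {
  return pdf.isEncrypted && !pdf.isAuthenticated ? "needs-password" : "ok";
}

// Option 2: load() rejects with a typed error carrying a `code` (assumed shape).
function classifyFailure(err: unknown): LoadOutcome {
  const code = (err as { code?: unknown } | null)?.code;
  return code === "NOT_AUTHENTICATED" || code === "INVALID_PASSWORD"
    ? "needs-password"
    : "corrupt-or-unsupported";
}
```

A caller would pass the resolved instance to classifyLoaded, or call classifyFailure(err) in the catch block around PDF.load(bytes). With the current behavior, the ObjectParseError lands in the "corrupt-or-unsupported" branch, which is exactly the ambiguity described in the summary.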
Current workaround
Catch the throw and scan the trailer bytes for the literal /Encrypt keyword, since the trailer is never inside an encrypted stream:
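import { PDF } from '@libpdf/core';

export async function isProtected(bytes: Uint8Array): Promise<boolean> {
  try {
    const pdf = await PDF.load(bytes);
    return pdf.isEncrypted && !pdf.isAuthenticated;
  } catch {
    // Fallback: scan the last 8 KiB of the file for the literal "/Encrypt" keyword.
    const tail = bytes.subarray(Math.max(0, bytes.length - 8192));
    // latin1 maps every byte to one character, so the ASCII keyword survives
    // even when the surrounding bytes are binary stream data.
    return new TextDecoder('latin1').decode(tail).includes('/Encrypt');
  }
}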
This works but bypasses the library entirely for the detection path, which is what the API should be handling.
Sample file
Happy to share a redacted reproduction PDF privately — let me know the preferred channel.