Summary
Part of #488. Extract src/core/parsers/ into a standalone @libscope/parsers package. Parsers have zero upward imports to the rest of libscope — this is the lowest-risk extraction in the split.
Problem / Motivation
Format parsers (PDF, DOCX, EPUB, PPTX, CSV, JSON, YAML, HTML → text/markdown) are useful independently of semantic search. Today they're buried inside @libscope/core and carry all of core's dependencies along for the ride. Extracting them allows projects that only need format conversion to take a minimal dependency.
Proposed Solution
Move src/core/parsers/ into packages/parsers/src/ with its own package.json. Each parser's format-specific library (pdf-parse, mammoth, epub2, etc.) stays as an optionalDependency so consumers only install what they need.
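Because each format library would be optional, a parser has to cope with its library not being installed. A minimal sketch of one way to do that, assuming a lazy load at first use: `loadOptional` and `MissingOptionalDependencyError` are hypothetical names, not existing libscope APIs, and the `load` callback is injected so the sketch does not assume CommonJS or ESM.

```typescript
// Hypothetical helper; nothing here exists in libscope today.
class MissingOptionalDependencyError extends Error {
  constructor(readonly packageName: string, readonly format: string) {
    super(
      `Parsing ${format} requires the optional dependency "${packageName}". ` +
        `Install it with: npm install ${packageName}`,
    );
    this.name = "MissingOptionalDependencyError";
  }
}

// Node reports a missing module with one of these codes (CommonJS and ESM respectively).
const NOT_FOUND_CODES = ["MODULE_NOT_FOUND", "ERR_MODULE_NOT_FOUND"];

/**
 * Loads `packageName` through the supplied loader. A "module not found"
 * failure becomes an actionable error; anything else is rethrown untouched.
 */
function loadOptional<T>(
  packageName: string,
  format: string,
  load: (name: string) => T,
): T {
  try {
    return load(packageName);
  } catch (err) {
    const code = (err as { code?: unknown } | null)?.code;
    if (typeof code === "string" && NOT_FOUND_CODES.indexOf(code) !== -1) {
      throw new MissingOptionalDependencyError(packageName, format);
    }
    throw err; // A different failure is a real problem in the library; do not mask it.
  }
}
```

In a CommonJS build `load` could simply be `require`; in an ESM build it could be a function created with `createRequire(import.meta.url)` from `node:module`. Calling it inside the parser function rather than at module top level keeps unused formats from failing at import time.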
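Acceptance Criteria
- @libscope/parsers builds independently with npm run build, with zero imports from @libscope/core or any other @libscope/* package
- […] @libscope/parsers package directory
- Each parser's format-specific library (pdf-parse, mammoth, csv-parse, js-yaml, epub2, node-html-markdown, etc.) is declared as an optionalDependency
- @libscope/core depends on @libscope/parsers and all parser imports in core are updated to the new package path
- libscope CLI behaviour is unchanged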
Out of Scope
- Adding new parsers or changing parser output format
- Removing the gzip auto-detection logic in pack file I/O (that lives in core/packs.ts, not parsers)
Technical Notes
- src/core/parsers/ currently has zero upward imports: no imports from indexing, search, DB, or providers. Clean extraction.
- The parsers directory contains: markdown.ts, pdf.ts, docx.ts, xlsx.ts, csv.ts, json.ts, yaml.ts, html.ts, epub.ts, pptx.ts
- All parsers share src/logger.ts and src/errors.ts; these will need to be re-exported from @libscope/core or duplicated into a shared internal package
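The zero-upward-imports claim could be guarded in CI with a small check along these lines. This is a rough sketch under stated assumptions: `findBoundaryViolations` and `escapesRoot` are hypothetical helpers, the regex only approximates import syntax (comments and string literals are not skipped), and `depth` assumes the caller knows how many directories below the package root a file sits (0 for the flat layout listed above). If the shared src/logger.ts and src/errors.ts are imported via relative paths, a check like this would report them too.

```typescript
// Hypothetical CI helper: lists import specifiers that leave the parsers package.

/** True when a relative specifier climbs above the package root. */
function escapesRoot(specifier: string, depth: number): boolean {
  let level = depth; // directories between the importing file and the package root
  for (const segment of specifier.split("/")) {
    if (segment === "..") {
      level -= 1;
      if (level < 0) return true;
    } else if (segment !== "." && segment !== "") {
      level += 1;
    }
  }
  return false;
}

function findBoundaryViolations(source: string, depth = 0): string[] {
  // Captures the specifier in `from "x"`, `import "x"`, `import("x")` and `require("x")`.
  const pattern =
    /(?:\bfrom\s*|\bimport\s*\(?\s*|\brequire\s*\(\s*)["']([^"']+)["']/g;
  const violations: string[] = [];
  let match: RegExpExecArray | null;
  while ((match = pattern.exec(source)) !== null) {
    const specifier = match[1];
    const isRelative = specifier.charAt(0) === ".";
    // Any @libscope/* specifier counts, since the package should depend on none of them.
    const isLibscopePackage = specifier.indexOf("@libscope/") === 0;
    if ((isRelative && escapesRoot(specifier, depth)) || isLibscopePackage) {
      violations.push(specifier);
    }
  }
  return violations;
}

const sample = [
  'import { log } from "../../logger";',
  'import { parse } from "csv-parse/sync";',
  'import { helper } from "./shared";',
  'export * from "@libscope/core";',
].join("\n");

// Reports "../../logger" and "@libscope/core"; the other two imports stay inside the boundary.
console.log(findBoundaryViolations(sample));
```

A regex is enough for a first guard; a stricter version could walk the TypeScript AST or use an existing lint rule instead.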