Converts a PDF to text (Markdown) by rendering each page to an image and sending it through the deepseek-ocr:latest model served by Ollama via its OpenAI-compatible API (Responses API).
- Ollama running locally and the model pulled:
ollama pull deepseek-ocr:latest
- On Linux,
pdf2imagerequires Poppler:- Debian/Ubuntu:
sudo apt-get install poppler-utils
- Debian/Ubuntu:
pipx install git+https://github.com/arrase/OCR.gitocr 2512.15741v1.pdfYou can include/exclude pages (1-based) and use both at the same time; --include is applied first and then --exclude.
Examples:
# Only page 1
ocr --include 1 2512.15741v1.pdf
# Pages 1 to 5 except 3
ocr --include 1-5 --exclude 3 2512.15741v1.pdf
# Combinations
ocr --include 1,3,5-8 --exclude 6-7 2512.15741v1.pdfOutput: creates 2512.15741v1.md in the same directory.
OLLAMA_BASE_URL(defaulthttp://localhost:11434/v1)OLLAMA_MODEL(defaultdeepseek-ocr:latest)