Skip to content
/ OCR Public

Converts a PDF to text (Markdown) by rendering each page to an image and sending it through the deepseek-ocr:latest model served by Ollama via its OpenAI-compatible API (Responses API).

Notifications You must be signed in to change notification settings

arrase/OCR

Folders and files

NameName
Last commit message
Last commit date

Latest commit

 

History

5 Commits
 
 
 
 
 
 
 
 
 
 

Repository files navigation

OCR (PDF → Markdown) with Ollama + deepseek-ocr

Converts a PDF to text (Markdown) by rendering each page to an image and sending it through the deepseek-ocr:latest model served by Ollama via its OpenAI-compatible API (Responses API).

Requirements

  • Ollama running locally and the model pulled:
    • ollama pull deepseek-ocr:latest
  • On Linux, pdf2image requires Poppler:
    • Debian/Ubuntu: sudo apt-get install poppler-utils

Installation (pipx)

pipx install git+https://github.com/arrase/OCR.git

Usage

ocr 2512.15741v1.pdf

Page selection

You can include/exclude pages (1-based) and use both at the same time; --include is applied first and then --exclude.

Examples:

# Only page 1
ocr --include 1 2512.15741v1.pdf

# Pages 1 to 5 except 3
ocr --include 1-5 --exclude 3 2512.15741v1.pdf

# Combinations
ocr --include 1,3,5-8 --exclude 6-7 2512.15741v1.pdf

Output: creates 2512.15741v1.md in the same directory.

Environment variables (optional)

  • OLLAMA_BASE_URL (default http://localhost:11434/v1)
  • OLLAMA_MODEL (default deepseek-ocr:latest)

About

Converts a PDF to text (Markdown) by rendering each page to an image and sending it through the deepseek-ocr:latest model served by Ollama via its OpenAI-compatible API (Responses API).

Topics

Resources

Stars

Watchers

Forks

Releases

No releases published

Packages

No packages published

Languages