sift

Ask questions about a PDF, get answers with clickable, page-anchored citations, and see the source passages highlighted in a side-by-side PDF viewer.

Built on the Claude Agent SDK and PyMuPDF. Authentication piggybacks on your local Claude Code session — no API keys to manage.

Features

Two-pane web UI — PDF on the left, chat on the right
Color-matched citations — each citation gets a distinct pastel color, applied to the PDF highlight, the citation chip's left border, and the inline (p. N) pill in the answer text. When several citations land on the same page, the colors tell you which highlight goes with which chip at a glance.
Click-to-jump — both inline (p. 4) references and the citation chips scroll the PDF to the right page on click.
Three answer modes — pick by how much initiative you want the model to take:
- auto (default) — concise, extractive, shaped to the question. Use when you're asking about the paper. Examples: "What does the paper claim about X?", "Which datasets did they evaluate on?", "Summarize section 4."
- strict — pure extraction, never infer or interpret. Use when you want the literal text only. Examples: "List exactly the methods they evaluated.", "Quote the threats-to-validity section."
- freehand — the model is your collaborator; the user prompt is the spec. Inference, synthesis, application, and structural framing are encouraged — every factual claim still anchored to a citation, but inferences are marked ("suggests", "implies", "extending this"). Use for anything generative. Examples: "Fill this SLR rubric for me: [paste rubric]", "Draft a related-work paragraph that critically engages with this paper.", "What hypotheses does the framework in §3 suggest about [my domain]?", "Identify weaknesses in their evaluation."
Rule of thumb: auto for "tell me about the paper," freehand for "use the paper to do something for me," strict when you don't want the model adding any flavor.
Per-PDF chat memory — follow-up questions like "make that more concise" or "what did I ask first?" work because each turn sees the prior conversation. A clear button wipes history; turns rehydrate when you switch back to a PDF.
Abstract is off-limits — extract_pages wraps the abstract in explicit [BEGIN ABSTRACT] / [END ABSTRACT] markers and the prompt forbids citing inside them, forcing the model to anchor claims in the body where they're elaborated.
Markdown answers with bold, lists, headings, and inline code.
Activity trail — collapsed Thought for Ns · K steps per turn shows the agent's reasoning and tool calls.
Robust word-coordinate highlighting — multi-line wraps, hyphenated breaks, and minor paraphrases all match. Cross-page fallback: if the cited page misses, every other page is scanned and the longest match wins, so off-by-one page numbers from the model self-heal.
Model selector — Haiku 4.5 (fast/cheap default), Sonnet 4.6, Opus 4.7, or whatever your CLI's /model is set to (inherit).
CLI — same agent, headless: python pdf_qa.py paper.pdf "question" [--mode auto|strict|brainstorm].

Prerequisites

Python 3.10 or newer. The Claude Agent SDK requires it. If pip install reports No matching distribution found for claude-agent-sdk, your Python is too old — check with python --version and install a newer one (python.org).
Claude Code CLI installed and signed in once interactively (claude from a terminal). The Agent SDK spawns the claude CLI as a subprocess and inherits whatever it's authenticated with — works with a Claude Pro/Max subscription (OAuth) or an ANTHROPIC_API_KEY configured via the CLI.

Where Claude Code stores its credentials per OS:

OS             Path
macOS / Linux  ~/.claude/
Windows        %USERPROFILE%\.claude\ (C:\Users\<you>\.claude\)

Nothing in this repo touches that directory; auth lives entirely outside the project, so cloning is enough — no .env files, no key configuration.
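
A quick preflight check for the two prerequisites could look like the sketch below. It is not part of the repo: the preflight function and its parameters are hypothetical names chosen for illustration. It only confirms that a claude executable is on PATH, not that you are signed in.

```python
import shutil
import sys
from typing import Callable, List, Optional, Sequence


def preflight(
    version: Optional[Sequence[int]] = None,
    which: Callable[[str], Optional[str]] = shutil.which,
) -> List[str]:
    """Return a list of human-readable problems; an empty list means OK.

    `version` and `which` are injectable so the check can be exercised
    without changing the real interpreter or PATH (hypothetical design).
    """
    major_minor = tuple(version) if version is not None else tuple(sys.version_info[:2])
    problems: List[str] = []
    # The README states Python 3.10 or newer is required.
    if major_minor < (3, 10):
        problems.append(
            f"Python {major_minor[0]}.{major_minor[1]} found; 3.10 or newer is required."
        )
    # The Agent SDK spawns the `claude` CLI, so it must be discoverable.
    if which("claude") is None:
        problems.append("claude CLI not found on PATH; install Claude Code and sign in once.")
    return problems


if __name__ == "__main__":
    for problem in preflight():
        print("preflight:", problem)
```

Run it inside the activated venv before pip install -r requirements.txt, so it checks the interpreter the project will actually use.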

Setup

macOS / Linux

git clone https://github.com/<you>/sift.git
cd sift

python3 -m venv venv
source venv/bin/activate
pip install -r requirements.txt

If python3 resolves to an older interpreter, run the venv step with the specific binary, e.g. python3.12 -m venv venv.

Windows (PowerShell)

git clone https://github.com/<you>/sift.git
cd sift

python -m venv venv
venv\Scripts\Activate.ps1
pip install -r requirements.txt

If PowerShell blocks the activate script with an execution-policy error, run once: Set-ExecutionPolicy -Scope CurrentUser RemoteSigned. If you are working in cmd.exe rather than PowerShell, activate with venv\Scripts\activate.bat instead.

Run

Web UI

uvicorn app:app --port 8000

Open http://localhost:8000. Upload a PDF, ask a question, click any (p. N) reference or citation chip to jump.

CLI

python pdf_qa.py path/to/paper.pdf "What is the main contribution?"
python pdf_qa.py --model sonnet --mode brainstorm paper.pdf "Limitations?"

Outputs: paper_highlighted.pdf and paper_citations.json next to the input PDF.

How it works

PyMuPDF extracts per-page text into <paper>.pages.txt.
The agent reads that file, identifies passages that ground each claim, and writes a tiny script that calls highlight_lib.highlight_pdf(input, output, citations, passages).
highlight_lib matches each passage at the word-coordinate level: it pulls every word's bounding box via page.get_text("words"), normalizes both the page words and the quote to lowercase alphanumerics, and finds the longest contiguous run of matching words. Word-level matching survives what search_for chokes on: line wraps, hyphenated breaks, ligatures.
Citations JSON records the actual highlighted text plus a per-citation pastel color, so the chip's border, the inline (p. N) pill, and the highlighted PDF region always agree.
Per-PDF chat memory lives in an in-process dict (CHATS) keyed by filename. Each /ask prepends the last 10 turns to the prompt as a PRIOR CONVERSATION block, so the agent can answer follow-ups that reference earlier turns. Memory is volatile (lost on uvicorn restart); add a JSON dump in app.py if you want persistence.
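
The word-level matching and cross-page fallback described above can be sketched as follows. This is a simplified illustration, not the code in highlight_lib: longest_run, locate, and the min_words threshold are hypothetical names, and the sketch does not rejoin hyphenated line breaks. It takes word tuples in the shape PyMuPDF returns from page.get_text("words"), that is (x0, y0, x1, y1, text, block_no, line_no, word_no), so it runs without a PDF.

```python
from typing import Dict, List, Optional, Tuple

# One entry of PyMuPDF's page.get_text("words") output.
Word = Tuple[float, float, float, float, str, int, int, int]


def normalize(token: str) -> str:
    """Lowercase a token and keep only its alphanumeric characters."""
    return "".join(ch for ch in token.lower() if ch.isalnum())


def longest_run(page_words: List[Word], quote: str) -> List[Word]:
    """Return the longest contiguous run of page words that also appears
    contiguously in the quote (empty list if nothing matches)."""
    # Drop words that normalize to nothing (e.g. stray punctuation).
    kept = [(w, normalize(w[4])) for w in page_words]
    kept = [(w, tok) for w, tok in kept if tok]
    quote_tokens = [tok for tok in (normalize(q) for q in quote.split()) if tok]

    best_end, best_len = 0, 0
    # Longest-common-substring DP over tokens, one row at a time.
    prev = [0] * (len(quote_tokens) + 1)
    for i, (_, page_tok) in enumerate(kept):
        cur = [0] * (len(quote_tokens) + 1)
        for j, quote_tok in enumerate(quote_tokens):
            if page_tok == quote_tok:
                cur[j + 1] = prev[j] + 1
                if cur[j + 1] > best_len:
                    best_len, best_end = cur[j + 1], i + 1
        prev = cur
    return [w for w, _ in kept[best_end - best_len:best_end]]


def locate(
    pages: Dict[int, List[Word]],
    quote: str,
    cited_page: int,
    min_words: int = 3,  # hypothetical threshold for what counts as a hit
) -> Optional[Tuple[int, List[Word]]]:
    """Try the cited page first; on a miss, scan every other page and
    keep the page with the longest matching run."""
    run = longest_run(pages.get(cited_page, []), quote)
    if len(run) >= min_words:
        return cited_page, run
    best: Optional[Tuple[int, List[Word]]] = None
    for page_no, words in pages.items():
        if page_no == cited_page:
            continue
        run = longest_run(words, quote)
        if len(run) >= min_words and (best is None or len(run) > len(best[1])):
            best = (page_no, run)
    return best
```

The bounding boxes of the returned words (the first four fields of each tuple) are what a highlighter would turn into annotation rectangles. Because the match works on word tokens rather than on rendered text, a line wrap in the middle of the quote makes no difference.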

Project layout

.
├── app.py             FastAPI server (web UI + SSE streaming)
├── agent_core.py      Shared agent setup and prompt for CLI + web
├── highlight_lib.py   Word-coordinate highlighting library
├── pdf_qa.py          CLI entrypoint
├── static/index.html  Two-pane UI (vanilla JS, no build step)
├── pdfs/              User PDFs and generated artifacts (gitignored)
└── requirements.txt

Configuration

Endpoints (app.py):

GET   /              static UI
GET   /config        model + mode choices
POST  /upload        multipart PDF upload
GET   /pdfs          list uploaded PDFs
GET   /pdf/{id}      serve PDF (?highlighted=true for annotated copy)
GET   /history/{id}  per-PDF chat history for rehydration on reload
POST  /clear/{id}    wipe chat history for a PDF
POST  /ask           SSE stream: stats, text, tool, done, error

Per-turn options sent to /ask: { file_id, question, model: "haiku|sonnet|opus|inherit", mode: "auto|strict|freehand" }.
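
For scripting against the server, a minimal standard-library client might look like the sketch below. Several things here are assumptions rather than facts from app.py: that /ask accepts a JSON body, that the stream uses standard SSE event: and data: fields, and the helper names build_ask_payload, iter_sse_events, and stream_answer. The file_id value in the usage line is a placeholder.

```python
import json
import urllib.request
from typing import Iterable, Iterator, Tuple

ALLOWED_MODELS = {"haiku", "sonnet", "opus", "inherit"}
ALLOWED_MODES = {"auto", "strict", "freehand"}


def build_ask_payload(file_id: str, question: str,
                      model: str = "haiku", mode: str = "auto") -> bytes:
    """Encode the per-turn options as JSON (assumed request format)."""
    if model not in ALLOWED_MODELS:
        raise ValueError(f"unknown model: {model}")
    if mode not in ALLOWED_MODES:
        raise ValueError(f"unknown mode: {mode}")
    body = {"file_id": file_id, "question": question, "model": model, "mode": mode}
    return json.dumps(body).encode("utf-8")


def iter_sse_events(lines: Iterable[str]) -> Iterator[Tuple[str, str]]:
    """Yield (event, data) pairs from Server-Sent Events lines.

    A blank line ends an event; lines starting with ':' are comments.
    """
    event, data = "message", []
    for raw in lines:
        line = raw.rstrip("\r\n")
        if not line:
            if data:
                yield event, "\n".join(data)
            event, data = "message", []
        elif line.startswith(":"):
            continue
        elif line.startswith("event:"):
            event = line[len("event:"):].strip()
        elif line.startswith("data:"):
            value = line[len("data:"):]
            data.append(value[1:] if value.startswith(" ") else value)


def stream_answer(file_id: str, question: str,
                  url: str = "http://localhost:8000/ask", **options: str) -> None:
    """Send one question and print events as they arrive (needs a running server)."""
    request = urllib.request.Request(
        url,
        data=build_ask_payload(file_id, question, **options),
        headers={"Content-Type": "application/json"},  # assumed content type
    )
    with urllib.request.urlopen(request) as response:
        decoded = (chunk.decode("utf-8") for chunk in response)
        for event, data in iter_sse_events(decoded):
            print(f"[{event}] {data}")
            if event in ("done", "error"):
                break
```

With the server running, stream_answer("paper.pdf", "What is the main contribution?", mode="strict") would print each event in order. Validating model and mode on the client side catches typos before a request is sent.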

The defaults (DEFAULT_MODEL, DEFAULT_MODE) and per-PDF turn cap (MAX_TURNS_KEPT) live in agent_core.py and app.py respectively.
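
The per-PDF turn cap and the PRIOR CONVERSATION block can be pictured with the sketch below. It is an illustration under assumptions, not the code in app.py: the value 10 comes from the "last 10 turns" description, a turn is assumed to be one question/answer pair, and remember, build_prompt, and clear_history are hypothetical helper names. Only CHATS and MAX_TURNS_KEPT are names taken from this README.

```python
from collections import defaultdict
from typing import DefaultDict, Dict, List

MAX_TURNS_KEPT = 10  # assumed value, based on "the last 10 turns"

# In-process memory keyed by filename; lost when the server restarts.
CHATS: DefaultDict[str, List[Dict[str, str]]] = defaultdict(list)


def remember(file_id: str, question: str, answer: str) -> None:
    """Append a finished turn and drop anything older than the cap."""
    turns = CHATS[file_id]
    turns.append({"question": question, "answer": answer})
    del turns[:-MAX_TURNS_KEPT]  # no-op while the list is within the cap


def build_prompt(file_id: str, question: str) -> str:
    """Prepend the stored turns so follow-ups can refer to earlier ones."""
    turns = CHATS.get(file_id, [])
    if not turns:
        return question
    lines = ["PRIOR CONVERSATION:"]
    for turn in turns:
        lines.append(f"User: {turn['question']}")
        lines.append(f"Assistant: {turn['answer']}")
    lines += ["", "CURRENT QUESTION:", question]
    return "\n".join(lines)


def clear_history(file_id: str) -> None:
    """Wipe the stored turns for one PDF, as the clear button does."""
    CHATS.pop(file_id, None)
```

Capping at write time keeps the dict bounded no matter how long a session runs. For persistence, the same structure serializes directly with json.dump.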

Notes

max_buffer_size is set to 20 MB on the SDK transport because the Read tool returns large JSON payloads when invoked on big files. Don't lower it.
The Read tool is intentionally pointed at the extracted .pages.txt, never at the PDF directly — invoking Read on a PDF returns each page as base64 image data and immediately blows past any sane buffer.
