NoteGenerator

NoteGenerator is a CLI workflow for turning PDF courseware into image-based AI notes and writing the result into Notion.

It renders each PDF page to an image, uploads those images to Cloudinary, sends the images to a multimodal model for page-by-page analysis, generates a deck summary, and writes both summary and per-page content into Notion with block-safe rendering.

What It Solves

Analyze lecture slides from images instead of relying on OCR-only text extraction
Route files to an existing Notion page automatically when the user does not provide a URL
Resume long jobs from the failed page after transient network errors
Keep detailed logs and structured checkpoints for monitoring and recovery
Write Notion body content and comments using rendering-safe structures
Expose the whole workflow through a command-line interface suitable for OpenClaw agents

Pipeline

flowchart TD
    A["PDF input"] --> B["Render each page to PNG"]
    B --> C["Upload page images to Cloudinary"]
    C --> D["Send image URLs to Kimi vision model"]
    D --> E["Generate page analyses"]
    E --> F["Generate deck summary"]
    F --> G["Convert body Markdown to Notion blocks"]
    E --> H["Convert page analysis to comment-safe rich_text"]
    G --> I["Write summary and page body to Notion"]
    H --> J["Attach per-page comments to heading blocks"]
    I --> K["manifest.json / summary.md / run.log / checkpoint.json"]
    J --> K

Key Behaviors

Fixed model target: kimi-k2.5
Page analysis is image-first
The analyzer is explicitly instructed not to over-interpret cover pages, agenda pages, and transition pages
Notion body content is written as blocks
Per-page analysis is attached as comments on each page's 第 N 页 ("Page N") heading block
Comment-unsafe content such as tables and fenced code blocks is moved into the page body
Long-running jobs maintain checkpoint.json and heartbeat.json
Multi-file processing is serial and ordered by file modification time
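
The serial, modification-time ordering can be sketched as below. This is a minimal illustration, not the project's implementation: order_by_mtime is a hypothetical helper name, and oldest-first ordering is an assumption, since the list above does not state the direction.

```python
import os
import tempfile
from pathlib import Path


def order_by_mtime(paths):
    """Return paths sorted by modification time, oldest first (an assumption)."""
    return sorted((Path(p) for p in paths), key=lambda p: p.stat().st_mtime)


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as tmp:
        newer = Path(tmp) / "a.pdf"
        older = Path(tmp) / "b.pdf"
        newer.write_bytes(b"")
        older.write_bytes(b"")
        # Force distinct timestamps so the ordering is deterministic.
        os.utime(newer, (2_000, 2_000))
        os.utime(older, (1_000, 1_000))
        # Files are then handled one at a time, in this order.
        for pdf in order_by_mtime([newer, older]):
            print(pdf.name)
```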

Repository Layout

.
├── .env.example
├── docs/
│   └── page_analyzer_notion_manual.md
├── skills/
│   └── notegenerator-openclaw/
│       └── SKILL.md
├── src/notegenerator_cli/
│   ├── cli.py
│   ├── workflow.py
│   ├── notion_utils.py
│   ├── ai_client.py
│   ├── router.py
│   └── install_agent.py
└── tests/

Requirements

Python 3.11+
A Notion integration token with access to the target pages
A Cloudinary account
An OpenAI-compatible multimodal endpoint
Optional: OpenClaw if you want to run the workflow from an agent

Installation

git clone git@github.com:BojayL/NoteGenerator.git
cd NoteGenerator
python3 -m venv .venv
source .venv/bin/activate
pip install -e .

Configuration

Copy .env.example to .env and replace placeholders:

cp .env.example .env

Supported environment variables:

NOTEGENERATOR_AI_BASE_URL
NOTEGENERATOR_AI_API_KEY
NOTEGENERATOR_MODEL
NOTION_API_KEY
NOTION_VERSION
CLOUDINARY_URL
NOTEGENERATOR_PAGE_DPI
NOTEGENERATOR_DEFAULT_TELEGRAM_TOKEN
NOTEGENERATOR_DEFAULT_TELEGRAM_USER_ID

Configuration load order:

Project .env
Process environment variables
~/.openclaw/openclaw.json Bailian defaults
~/.config/notion/api_key
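
A layered lookup over these sources might look like the sketch below. The list above gives the load order but does not say which source wins on a conflict, so the sketch assumes the first source that yields a non-empty value wins. parse_dotenv and resolve are hypothetical helpers, not functions from this repository.

```python
import os


def parse_dotenv(text):
    """Parse simple KEY=VALUE lines, skipping blanks and # comments."""
    values = {}
    for raw in text.splitlines():
        line = raw.strip()
        if not line or line.startswith("#") or "=" not in line:
            continue
        key, _, value = line.partition("=")
        values[key.strip()] = value.strip().strip('"').strip("'")
    return values


def resolve(name, sources):
    """Return the first non-empty value for name across ordered sources.

    Assumption: earlier sources take precedence over later ones.
    """
    for source in sources:
        value = source.get(name)
        if value:
            return value
    return None


if __name__ == "__main__":
    dotenv = parse_dotenv('NOTION_API_KEY="placeholder-token"\n# a comment\n')
    # The project .env is consulted first, then the process environment.
    print(resolve("NOTION_API_KEY", [dotenv, dict(os.environ)]))
```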

CLI Commands

Environment check:

notegenerator doctor

Single file with explicit Notion target:

notegenerator ingest /absolute/path/course.pdf "https://www.notion.so/target-page"

Batch import with explicit target:

notegenerator batch /path/a.pdf /path/b.pdf --notion-url "https://www.notion.so/target-page"

Batch import with automatic routing:

notegenerator batch /path/a.pdf /path/b.pdf --auto-route

Route without writing:

notegenerator route /absolute/path/course.pdf

Inspect a run:

notegenerator inspect-run /absolute/path/to/run-dir

Resume from a failed page:

notegenerator resume-analysis /absolute/path/to/run-dir --start-page 17

Resume only a range:

notegenerator resume-analysis /absolute/path/to/run-dir --start-page 17 --end-page 24

Finalize an existing run when page analysis is complete but the summary is still missing:

notegenerator finalize-run /absolute/path/to/run-dir

OpenClaw inbound entrypoint:

notegenerator openclaw-run --scan-inbound --limit 5

Install the companion OpenClaw agent bundle:

notegenerator install-openclaw-agent

Run Artifacts

Each execution creates a dedicated directory under runs/ containing:

run.log: human-readable runtime log
events.jsonl: structured event stream
images/: rendered page PNG files
page_analyses/: page-level analysis outputs
summary.md: deck-level summary
manifest.json: final output manifest
checkpoint.json: resumable page state
heartbeat.json: live progress marker, removed automatically when done
error.json: failure details when a run aborts
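
For a quick look at a run directory without the CLI, the artifact names above can be checked directly. The sketch below is a heuristic under stated assumptions, not the behavior of inspect-run: present_artifacts and guess_state are hypothetical helpers, and the state labels are inferred only from the descriptions of heartbeat.json, error.json, and manifest.json in the list above.

```python
import tempfile
from pathlib import Path

# Names come from the artifact list above; a trailing "/" marks a directory.
ARTIFACTS = [
    "run.log", "events.jsonl", "images/", "page_analyses/", "summary.md",
    "manifest.json", "checkpoint.json", "heartbeat.json", "error.json",
]


def present_artifacts(run_dir):
    """Return the known artifacts that exist in run_dir, in list order."""
    root = Path(run_dir)
    found = []
    for name in ARTIFACTS:
        path = root / name.rstrip("/")
        exists = path.is_dir() if name.endswith("/") else path.is_file()
        if exists:
            found.append(name)
    return found


def guess_state(run_dir):
    """Infer a rough run state from marker files (heuristic only)."""
    found = set(present_artifacts(run_dir))
    if "error.json" in found:
        return "aborted"
    if "heartbeat.json" in found:
        return "running"  # may be stale if the process was killed
    if "manifest.json" in found:
        return "finished"
    return "unknown"
```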

Recovery Model

If the job fails because of a transient connection issue, do not restart from page 1.

Use inspect-run to read next_page_to_analyze, then resume:

notegenerator inspect-run /absolute/path/to/run-dir
notegenerator resume-analysis /absolute/path/to/run-dir --start-page <next_page>

The workflow reuses existing page renders, Cloudinary URLs, and completed page analyses before regenerating the summary and completing the Notion write.
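
The reuse of completed page analyses can be illustrated with a checkpointed loop. This is a sketch under assumptions, not the code in workflow.py: the page_NNN.md file naming, the single-field checkpoint.json layout, and the analyze callable are all hypothetical.

```python
import json
import tempfile
from pathlib import Path


def analyze_pages(run_dir, total_pages, analyze, start_page=1):
    """Analyze pages in order, skipping pages that already have output."""
    run_dir = Path(run_dir)
    out_dir = run_dir / "page_analyses"
    out_dir.mkdir(parents=True, exist_ok=True)
    checkpoint = run_dir / "checkpoint.json"
    for page in range(start_page, total_pages + 1):
        target = out_dir / f"page_{page:03d}.md"  # hypothetical naming
        if target.exists():
            continue  # reuse a completed analysis
        # Record the page in flight first, so an interruption
        # leaves a usable resume point behind.
        checkpoint.write_text(json.dumps({"next_page_to_analyze": page}))
        target.write_text(analyze(page), encoding="utf-8")
    # Nothing left to analyze.
    checkpoint.write_text(json.dumps({"next_page_to_analyze": None}))
```

If analyze raises partway through, the checkpoint still names the page that failed, and a second call with start_page set to that value continues without repeating earlier pages.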

If inspect-run shows:

next_page_to_analyze = null
page_content_written = true
summary_written = false

then page analysis and page-body/comment writing are already finished, and only the deck summary is missing. In that case, do not rerun ingest or resume-analysis. Use:

notegenerator finalize-run /absolute/path/to/run-dir

finalize-run reuses local page_analyses/, regenerates only the summary, and appends only the missing summary blocks to Notion.
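
The recovery rules above reduce to a small decision function. The sketch assumes the inspect-run fields are already available as a Python dict; how the CLI serializes them is not stated here, and next_command is a hypothetical helper.

```python
import shlex


def next_command(status, run_dir):
    """Map inspect-run fields to the next CLI command, or None if none applies."""
    next_page = status.get("next_page_to_analyze")
    if next_page is not None:
        # Page analysis is incomplete: resume from the reported page.
        return ["notegenerator", "resume-analysis", run_dir,
                "--start-page", str(next_page)]
    if status.get("page_content_written") and not status.get("summary_written"):
        # Only the deck summary is missing.
        return ["notegenerator", "finalize-run", run_dir]
    # Either the run is complete or the state is not covered by the rules above.
    return None


if __name__ == "__main__":
    status = {"next_page_to_analyze": 17,
              "page_content_written": False,
              "summary_written": False}
    print(shlex.join(next_command(status, "/absolute/path/to/run-dir")))
```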

Notion Rendering Strategy

Page body

Body content is converted into Notion blocks. The workflow currently writes:

headings
paragraphs
bulleted lists
numbered lists
quotes
callouts
dividers
code blocks
tables
external images

Page comments

Per-page analysis is attached as comments to each page's 第 N 页 ("Page N") heading block.

Only comment-safe rich_text should remain in comments:

plain text
inline bold / italic
inline code
links
simple one-level textual lists

The following should be moved to the page body instead:

Markdown tables
fenced code blocks
block-level callouts and quotes
images
nested structures
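
One way to picture the split is a line-based filter over the model's Markdown output. The sketch below is a simplified illustration, not the logic in notion_utils.py: split_comment_safe is a hypothetical helper, the patterns are rough approximations of Markdown syntax, and block-level callouts are not handled.

```python
import re

FENCE = re.compile(r"^\s*(`{3,}|~{3,})")           # fenced code delimiter
TABLE_ROW = re.compile(r"^\s*\|.*\|\s*$")          # pipe-delimited table row
IMAGE = re.compile(r"!\[[^\]]*\]\([^)]*\)")        # inline image syntax
BLOCKQUOTE = re.compile(r"^\s*>")                  # block quote line
NESTED_ITEM = re.compile(r"^\s{2,}([-*+]|\d+\.)\s")  # indented list item


def split_comment_safe(markdown):
    """Return (comment_text, body_text) for one page analysis."""
    comment, body = [], []
    in_fence = False
    for line in markdown.splitlines():
        if FENCE.match(line):
            body.append(line)
            in_fence = not in_fence  # simplification: any fence toggles
            continue
        unsafe = (
            in_fence
            or TABLE_ROW.match(line)
            or IMAGE.search(line)
            or BLOCKQUOTE.match(line)
            or NESTED_ITEM.match(line)
        )
        (body if unsafe else comment).append(line)
    return "\n".join(comment).strip(), "\n".join(body).strip()
```

Plain paragraphs, inline formatting, and top-level list items stay in the comment text, while everything matched by one of the patterns is collected for the page body.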

OpenClaw Integration

The repository includes support for installing a dedicated OpenClaw agent workspace.

Recommended agent behavior:

Trigger openclaw-run or ingest
Capture run_dir
Every 2 minutes, call inspect-run
If next_page_to_analyze has a value, resume from that page
If next_page_to_analyze = null and only summary_written is false, call finalize-run
Report the PDF path, selected Notion page, run directory, and final result

The reusable OpenClaw-oriented skill is included at:

skills/notegenerator-openclaw/SKILL.md

Deploy to OpenClaw

This section describes a practical end-to-end deployment path for running NoteGenerator from an OpenClaw agent.

1. Prepare the NoteGenerator project

Clone the repository, create the virtual environment, and install the package:

git clone git@github.com:BojayL/NoteGenerator.git
cd NoteGenerator
python3 -m venv .venv
source .venv/bin/activate
pip install -e .

Create .env from the public template:

cp .env.example .env

At minimum, fill in:

NOTEGENERATOR_AI_API_KEY
NOTION_API_KEY
CLOUDINARY_URL

If you want the OpenClaw installer to bind a Telegram bot without passing CLI flags every time, also fill in:

NOTEGENERATOR_DEFAULT_TELEGRAM_TOKEN
NOTEGENERATOR_DEFAULT_TELEGRAM_USER_ID

Validate the environment:

./.venv/bin/notegenerator doctor

2. Make sure OpenClaw is already installed

This repository does not install OpenClaw itself. Before installing the agent bundle, you should already have:

a working OpenClaw installation
a writable ~/.openclaw/openclaw.json
an enabled Telegram channel in OpenClaw if you want Telegram-triggered imports

The CLI can also reuse OpenClaw model defaults from ~/.openclaw/openclaw.json if NOTEGENERATOR_AI_API_KEY is not provided in .env.

3. Install the companion OpenClaw agent bundle

The simplest install path is:

cd NoteGenerator
source .venv/bin/activate
notegenerator install-openclaw-agent

If you want to make the deployment explicit, pass the full parameter set:

notegenerator install-openclaw-agent \
  --agent-id courseware-notion-router \
  --account-id courseware-notion-router \
  --display-name "Courseware Importer" \
  --persona-theme "Receives PDF courseware, routes it to the right Notion page, and runs the CLI workflow." \
  --emoji "📚" \
  --telegram-token "$NOTEGENERATOR_DEFAULT_TELEGRAM_TOKEN" \
  --telegram-user-id "$NOTEGENERATOR_DEFAULT_TELEGRAM_USER_ID"

4. What the installer writes

The installer creates an OpenClaw agent workspace under:

~/.openclaw/agents/<agent_id>/

The generated bundle includes:

workspace/AGENTS.md
workspace/IDENTITY.md
workspace/TOOLS.md
workspace/HEARTBEAT.md
workspace/USER.md
workspace/SOUL.md
workspace/skills/courseware-importer/SKILL.md

It also updates ~/.openclaw/openclaw.json by:

registering the agent
adding the Telegram account binding
connecting the agent ID to the Telegram account ID you passed

If ~/.openclaw/agents/pingping/agent/models.json exists locally, the installer copies it into the new agent's agent/ directory as a convenience.

5. Restart OpenClaw if needed

Some OpenClaw setups hot-reload configuration changes and some do not.

If your runtime does not pick up the new agent automatically, restart the relevant OpenClaw process after running the installer.

You should verify that:

the new agent directory exists
~/.openclaw/openclaw.json now contains your agent and Telegram binding
the Telegram bot token/account ID pair is present under the Telegram accounts section

6. Understand how the installed agent behaves

The generated courseware-importer skill encodes the operational workflow:

If the user provides a Notion page URL, the agent should run ingest.
If the user does not provide a page URL, the agent should run route or use openclaw-run --scan-inbound.
Once a run starts, the agent should capture the run_dir.
Every 2 minutes, the agent should call inspect-run.
If network instability interrupts analysis, the agent should resume from next_page_to_analyze.
If multiple files arrive, they should be processed serially in modification-time order.
When the run is complete, heartbeat.json should disappear automatically.

7. Triggering the workflow from OpenClaw

There are two common ways to operate it.

Option A: Telegram-driven workflow

Use Telegram as the transport:

Send one or more PDF files to the Telegram bot bound to the installed agent.
If you already know the target Notion page, include the URL in the conversation.
If no URL is provided, let the agent route automatically.
The agent should process the files with the installed workflow rules.

Option B: Inbound-folder workflow

If your OpenClaw setup writes incoming media to ~/.openclaw/media/inbound, you can run:

cd NoteGenerator
source .venv/bin/activate
notegenerator openclaw-run --scan-inbound --limit 5

This is especially useful for debugging the same workflow outside the full agent loop.

8. Monitoring and recovery from OpenClaw

During a long run, inspect the active run directory:

notegenerator inspect-run /absolute/path/to/run-dir

Important fields:

uploaded_pages
analyzed_pages
next_page_to_analyze
notion_written
heartbeat_exists

If the run fails due to a transient connection problem, resume from the failed page:

notegenerator resume-analysis /absolute/path/to/run-dir --start-page <next_page>

Use the next_page_to_analyze value returned by inspect-run.

9. Recommended post-install verification

After deployment, verify the stack in this order:

notegenerator doctor
notegenerator route /absolute/path/to/sample.pdf
notegenerator ingest /absolute/path/to/sample.pdf "https://www.notion.so/target-page"
notegenerator inspect-run /absolute/path/to/run-dir

For OpenClaw-specific verification, confirm that:

the agent workspace was created
Telegram binding was added
a sample PDF can be processed through the installed agent
per-page comments and body blocks render correctly in Notion

10. Deployment notes

NOTEGENERATOR_DEFAULT_TELEGRAM_TOKEN and NOTEGENERATOR_DEFAULT_TELEGRAM_USER_ID are only defaults for the installer. You can always override them via CLI flags.
The installed OpenClaw agent is meant to automate the same CLI you can run manually. Debugging the CLI first is usually faster than debugging the full agent loop.
If your OpenClaw environment already has a preferred agent naming convention, pass your own --agent-id and --account-id during installation.

Analyzer Output Contract

The page analyzer is constrained by:

docs/page_analyzer_notion_manual.md

That manual exists to keep model output convertible to stable Notion structures and to prevent over-interpretation of low-information pages.

Development

Run tests:

python -m unittest discover -s tests -v

Useful implementation entrypoints:

src/notegenerator_cli/cli.py
src/notegenerator_cli/workflow.py
src/notegenerator_cli/notion_utils.py
src/notegenerator_cli/install_agent.py

Publishing Notes

This public repository intentionally excludes private runtime state and credentials:

.env is not committed
runs/ is not committed
.venv/ is not committed
all repository examples use placeholders instead of live API keys
