Slide Morphology Triage Agent

Open-source whole-slide pathology agent for morphology-triage ROI collection, morphology-first diagnosis, and interactive WSI exploration. It supports general slides and acute myeloid leukemia (AML) slides.

The project combines deterministic slide reduction, embedding-based candidate retrieval, and a vision-language model that navigates high-value regions instead of trying to reason over an entire gigapixel slide at once.

Highlights

Interactive FastAPI workbench for browser-based runs and live monitoring.
Headless evaluation scripts for single-slide, batch, and cache-precomputation workflows.
Support for standard WSI formats plus MIRAX files and server-local slide browsing.

Project Visuals

System overview and WSI workflow.

Web workbench run view and result flow.

Performance Benchmarks

Evaluated on a private dataset of 372 bone marrow WSIs. VLMs were run in official FP8 quantized variants where available.

VLM Performance (ROI Collection)

Results are averaged across available feature extractors. roi5 % is the percentage of runs where the model successfully reached the 5-ROI target in our task.
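As a minimal sketch of the roi5 % definition above, the helper below counts the runs that collected at least five ROIs. The name `roi5_rate`, the input list, and the choice of all runs as the denominator are assumptions for illustration, not code from this repository.

```python
def roi5_rate(roi_counts, target=5):
    """Percentage of runs that collected at least `target` ROIs.

    Assumption: every run counts in the denominator.
    """
    if not roi_counts:
        raise ValueError("no runs to summarise")
    reached = sum(1 for n in roi_counts if n >= target)
    return 100.0 * reached / len(roi_counts)

# Hypothetical ROI counts from four runs; three reached the 5-ROI target.
print(f"{roi5_rate([5, 5, 3, 5]):.2f}")  # 75.00
```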

Model	Success %	roi5 %	Avg. tool calls
Gemma-4-31B-it	100.00	71.30	21.24
Qwen3.5-397B-A17B-FP8	99.66	99.26	25.07
DeepSeek-V4	100.00	99.73	27.10
GLM-4.6V-FP8	92.14	95.63	26.28
GPT-OSS-120B	99.63	23.99	37.36

AML Diagnosis Results

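Gemma-4-31B-it and Qwen3.5-397B-A17B-FP8 were the main diagnosis models explored here. Gemma-4-31B-it appeared better attuned to AML morphology (higher accuracy and more true positives in the table below, at the cost of more false positives), while Qwen3.5-397B-A17B-FP8 was more general in blood-cell image understanding and produced fewer false positives.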

Prompt wording has some effect, but the results are not dominated by prompt changes alone.

Model	Extractor	Acc %	TP/FN	TN/FP	NPM1 %	HistSim
Gemma-4-31B-it	DinoBloom	70.16	234/85	27/26	74.43	0.889
Gemma-4-31B-it	H-optimus-1	72.85	246/73	25/28	71.74	0.897
Gemma-4-31B-it	UNI2	78.23	269/50	22/31	73.02	0.919
Gemma-4-31B-it	Virchow2	77.96	270/49	20/33	72.00	0.907
Qwen3.5-397B-A17B-FP8	DinoBloom	60.48	177/142	48/5	65.67	0.885
Qwen3.5-397B-A17B-FP8	H-optimus-1	61.02	182/137	45/8	65.96	0.896
Qwen3.5-397B-A17B-FP8	UNI2	69.09	215/104	42/11	64.15	0.915
Qwen3.5-397B-A17B-FP8	Virchow2	66.13	207/112	39/14	67.88	0.906
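
The accuracy column is consistent with (TP + TN) / (TP + FN + TN + FP) over the 372 slides. The sketch below recomputes it for the first row and also derives sensitivity and specificity, which the table does not list. The function name is illustrative only.

```python
def confusion_metrics(tp, fn, tn, fp):
    """Accuracy, sensitivity, and specificity as percentages."""
    total = tp + fn + tn + fp
    return {
        "accuracy": 100.0 * (tp + tn) / total,
        "sensitivity": 100.0 * tp / (tp + fn),
        "specificity": 100.0 * tn / (tn + fp),
    }

# Counts from the Gemma-4-31B-it / DinoBloom row (TP/FN = 234/85, TN/FP = 27/26).
m = confusion_metrics(tp=234, fn=85, tn=27, fp=26)
print({k: round(v, 2) for k, v in m.items()})
# {'accuracy': 70.16, 'sensitivity': 73.35, 'specificity': 50.94}
```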

HistSim measures histogram similarity between clinician-selected ROIs and agent-selected ROIs.
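
This README does not say which similarity measure HistSim uses or what is histogrammed; the definition lives in docs/EVALUATION_METRICS.md. Purely as an illustration of the idea, the sketch below uses histogram intersection on normalised bin counts, which is one common choice and is an assumption here, as are the example bins.

```python
def histogram_intersection(hist_a, hist_b):
    """Similarity in [0, 1] between two histograms defined over the same bins."""
    if len(hist_a) != len(hist_b):
        raise ValueError("histograms must share the same bins")
    sum_a, sum_b = sum(hist_a), sum(hist_b)
    if sum_a == 0 or sum_b == 0:
        raise ValueError("histograms must be non-empty")
    # Normalise each histogram, then sum the per-bin overlap.
    return sum(min(a / sum_a, b / sum_b) for a, b in zip(hist_a, hist_b))

clinician = [4, 3, 2, 1]  # hypothetical per-bin counts for clinician ROIs
agent = [3, 3, 3, 1]      # hypothetical per-bin counts for agent ROIs
print(round(histogram_intersection(clinician, agent), 3))  # 0.9
```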

Quick Start

Install

uv sync
source .venv/bin/activate

Configure

If your model backend needs API keys or other runtime settings, create a .env file with the variables you use locally.

The OpenAI-compatible client settings live in configs/config.yaml under the agent section:

agent:
  OPENAI_API_BASE: "http://pluto/v1"
  OPENAI_API_KEY: "local"

Environment variables still override the YAML values, so you can also set them in your shell or .env:

export OPENAI_API_BASE="http://your-server/v1"
export OPENAI_API_KEY="your-key"
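
The override order described above can be sketched as follows. The function name `resolve_agent_settings` is hypothetical, and a plain dict stands in for the parsed `agent` section of configs/config.yaml; the project's actual loader may differ.

```python
import os

def resolve_agent_settings(yaml_agent, environ=os.environ):
    """Return client settings, letting environment variables win over YAML."""
    keys = ("OPENAI_API_BASE", "OPENAI_API_KEY")
    return {key: environ.get(key, yaml_agent.get(key)) for key in keys}

# Stand-in for the parsed `agent` section (assumption: already loaded as a dict).
yaml_agent = {"OPENAI_API_BASE": "http://pluto/v1", "OPENAI_API_KEY": "local"}
settings = resolve_agent_settings(
    yaml_agent, environ={"OPENAI_API_BASE": "http://your-server/v1"}
)
print(settings)
# {'OPENAI_API_BASE': 'http://your-server/v1', 'OPENAI_API_KEY': 'local'}
```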

To change which models the workbench exposes, set MODEL_NAME and ALLOWED_MODEL_NAMES in main.py and wsi_core.py.

To expose server-local slide roots in the web Explorer, set SERVER_SLIDE_ROOTS before starting the app:

export SERVER_SLIDE_ROOTS="/some/other/root"

Supported sources:

Standard slide files: .svs, .tif, .tiff, .ndpi
MIRAX files: .mrxs, .mrsx
MIRAX companion folders: select the folder whose sibling .mrxs or .mrsx has the same stem
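
The source rules above can be expressed as a small classifier. This is a sketch under the stated rules only; `classify_slide_source` and its return labels are hypothetical names, not the project's API.

```python
from pathlib import Path

STANDARD_EXTS = {".svs", ".tif", ".tiff", ".ndpi"}
MIRAX_EXTS = {".mrxs", ".mrsx"}

def classify_slide_source(path):
    """Return 'standard', 'mirax', 'mirax_folder', or None for a path."""
    p = Path(path)
    if p.is_dir():
        # A companion folder counts only if a sibling MIRAX file shares its stem.
        for ext in MIRAX_EXTS:
            if (p.parent / (p.name + ext)).is_file():
                return "mirax_folder"
        return None
    suffix = p.suffix.lower()
    if suffix in STANDARD_EXTS:
        return "standard"
    if suffix in MIRAX_EXTS:
        return "mirax"
    return None

print(classify_slide_source("case01.SVS"))   # standard
print(classify_slide_source("case02.mrxs"))  # mirax
```

Only the folder branch touches the filesystem; file paths are classified by extension alone, so the check can run before an upload has been saved.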

Run The Workbench

python main.py

The app starts on port 3008 by default and moves up to the next free port if that one is already in use.
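
One way to implement this kind of port fallback is to try binding successive ports until one succeeds. The sketch below is an assumption about the approach, not the code in main.py; `find_free_port` and `max_tries` are illustrative names.

```python
import socket

def find_free_port(start=3008, max_tries=50, host="127.0.0.1"):
    """Return the first port >= start that can be bound on host."""
    for port in range(start, start + max_tries):
        with socket.socket(socket.AF_INET, socket.SOCK_STREAM) as sock:
            try:
                sock.bind((host, port))
            except OSError:
                continue  # port is taken, try the next one
            return port
    raise RuntimeError(f"no free port in {start}..{start + max_tries - 1}")

port = find_free_port()
print(port)
```

The probe socket is closed before the server binds, so another process could still claim the port in between; a server that handles the bind error itself avoids that race.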

How It Fits Together

WSI
  -> tile extraction and quality filtering
  -> embedding-based candidate ranking
  -> agentic ROI navigation and selection
  -> morphology-only diagnosis from accepted ROIs

aml_auto is the default production path:

WSI
  -> aml_roi
      -> roi_collection.json + ROI images
  -> aml_diagnosis
      -> final JSON diagnosis + report

Workbench

The web UI is organized around three panels:

Input and Run: slide source, agent, model, extractor, tile size, batch size, and tile filter.
Slide Viewer: slide overview plus ROI snapshots collected during navigation.
Run Status: live steps, current state, errors, and report link.

Slide Sources

You can start a run from:

uploaded slide files such as .svs, .tif, .tiff, .ndpi
MIRAX folders or zip bundles
the built-in server Explorer for server-local or HPC slide roots

Main Controls

Agent: aml_auto, aml_roi, aml_diagnosis, tile, or wsi
Model: which VLM is exposed in the workbench
Feature extractor: embedding backbone used for ROI candidate preparation
Tile size: patch size for the extractor path
Batch size: embedding throughput during tile feature extraction
Tile filter: candidate prefilter before expensive embedding

Tile filter modes:

Quality score: ranks tiles by focus, stain, texture, and artifact heuristics.
Coarse to fine: thumbnail-level region prefilter before embedding.
Hybrid: combines region screening with raw-tile quality prefiltering.
None: keeps only the baseline foreground and texture gating.
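
To illustrate the focus part of the Quality score mode, the sketch below ranks tiles by the variance of a 4-neighbour Laplacian, a common sharpness heuristic. It is an assumption that the project uses anything similar; the stain, texture, and artifact terms are not shown, and both function names are hypothetical.

```python
def laplacian_variance(gray):
    """Variance of a 4-neighbour Laplacian over a 2D grayscale grid (>= 3x3)."""
    h, w = len(gray), len(gray[0])
    values = []
    for y in range(1, h - 1):
        for x in range(1, w - 1):
            values.append(
                gray[y - 1][x] + gray[y + 1][x] + gray[y][x - 1] + gray[y][x + 1]
                - 4 * gray[y][x]
            )
    mean = sum(values) / len(values)
    return sum((v - mean) ** 2 for v in values) / len(values)

def rank_by_focus(tiles):
    """Sort (name, gray) pairs from sharpest to blurriest."""
    return sorted(tiles, key=lambda t: laplacian_variance(t[1]), reverse=True)

flat = [[128] * 4 for _ in range(4)]  # featureless tile, no edges
checker = [[255 * ((x + y) % 2) for x in range(4)] for y in range(4)]  # sharp edges
print([name for name, _ in rank_by_focus([("flat", flat), ("checker", checker)])])
# ['checker', 'flat']
```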

Typical Flow

Choose a slide source or browse the server Explorer.
Pick the agent, model, extractor, tile size, batch size, and tile filter.
Start the run.
Follow the live trace while reviewing the overview and ROI panes.

Agents

`agent_type`	Purpose	Slide input	Output
`aml_auto`	Full two-stage AML workflow	required	ROI bundle + diagnosis report
`aml_roi`	Stage 1 ROI collection only	required	`roi_collection.json` + ROI images
`aml_diagnosis`	Stage 2 diagnosis from existing ROIs	not required	strict AML JSON diagnosis
`tile`	Save good and bad tiles for curation or training	required	labeled tiles under `Selected_Tiles`
`wsi`	General-purpose pathology exploration agent	required	task-shaped report and saved ROIs

AML ROI Collector

aml_roi is the evidence-acquisition stage. It navigates the slide, opens ranked candidates, and marks acceptable high-power ROIs toward the configured target of 5 for downstream diagnosis.
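
The collection loop can be pictured as opening candidates in rank order and stopping once the target is met. In the sketch below a plain predicate stands in for the VLM's accept/reject judgement; `collect_rois`, `max_opened`, and the candidate tuples are hypothetical.

```python
def collect_rois(ranked_candidates, is_acceptable, target=5, max_opened=50):
    """Open candidates in rank order and keep acceptable ones until target is met."""
    accepted, opened = [], 0
    for candidate in ranked_candidates:
        if len(accepted) >= target or opened >= max_opened:
            break
        opened += 1
        if is_acceptable(candidate):
            accepted.append(candidate)
    return {
        "rois": accepted,
        "opened": opened,
        "reached_target": len(accepted) >= target,
    }

# Hypothetical candidates: (tile_id, quality) pairs already sorted by rank.
candidates = list(enumerate([0.9, 0.2, 0.8, 0.7, 0.1, 0.95, 0.6, 0.85]))
result = collect_rois(candidates, is_acceptable=lambda c: c[1] >= 0.6)
print(len(result["rois"]), result["opened"], result["reached_target"])
# 5 7 True
```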

Outputs include:

roi_collection.json
images/roi_N.jpg
images/slide_overview.jpg
images/roi_candidates.jpg
report.md (navigation summary)

AML Diagnosis

aml_diagnosis receives ROI images only. It does not navigate the slide and does not see slide-level metadata. The output is a strict JSON object with morphology summary, accepted ROI reasoning, blast range, triage zone, final decision, limitations, confidence, and conditional NPM1 prediction.
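
A strict output contract like this is usually enforced by validating the model reply before accepting it. The sketch below shows one way to do so; the key names are guesses derived from the field list above, not the project's actual schema.

```python
import json

# Hypothetical key names derived from the fields listed above.
REQUIRED_KEYS = {
    "morphology_summary", "accepted_roi_reasoning", "blast_range",
    "triage_zone", "final_decision", "limitations", "confidence",
    "npm1_prediction",
}

def parse_diagnosis(raw):
    """Parse a model reply and reject anything that is not the expected object."""
    data = json.loads(raw)
    if not isinstance(data, dict):
        raise ValueError("diagnosis must be a JSON object")
    missing = REQUIRED_KEYS - data.keys()
    extra = data.keys() - REQUIRED_KEYS
    if missing or extra:
        raise ValueError(f"missing={sorted(missing)} extra={sorted(extra)}")
    return data

# Values are left as null here; only the shape of the object is checked.
reply = json.dumps({key: None for key in REQUIRED_KEYS})
print(sorted(parse_diagnosis(reply)) == sorted(REQUIRED_KEYS))  # True
```

Rejecting unknown keys as well as missing ones keeps the output strict, and a failed parse is a natural point to retry the model call.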

General WSI Agent

wsi is the non-AML mode for broader pathology tasks such as tumour description, MSI-oriented inspection, inflammation review, or other prompt-defined workflows.

When prompted for MSI screening, the agent samples at least three distinct tumour regions, marks representative ROIs, and returns a qualitative MSI-H or MSS impression with the caveat that definitive status still requires IHC or molecular testing.

CLI And Batch Runs

See the dedicated CLI guide: evaluate/README.md.

Main entrypoints:

evaluate/run_single_slide.py: single-case headless run
evaluate/run_batch_aml.sh: CSV-driven AML batch
evaluate/run_batch_aml_suite.sh: model/extractor suite
evaluate/preextract_hybrid_cache.py: feature-cache prewarm
evaluate/preextract_hybrid_cache_suite.sh: multi-extractor prewarm

Reference Embeddings

AML mode uses retrieval against curated reference tiles to improve ROI ranking. Prebuilding the reference index can reduce startup cost significantly.
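
As an illustration of retrieval-based ranking, the sketch below scores each candidate tile by its best cosine similarity to any reference embedding. It uses brute-force search on toy 2-D vectors; the `reference_hnsw` cache name suggests the project uses an approximate nearest-neighbour index instead, and all names here are hypothetical.

```python
import math

def cosine(a, b):
    """Cosine similarity of two equal-length vectors (0.0 if either is zero)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0

def rank_candidates(candidates, references):
    """Sort candidate ids by their best cosine match against the references."""
    scored = {
        cid: max(cosine(vec, ref) for ref in references)
        for cid, vec in candidates.items()
    }
    return sorted(scored, key=scored.get, reverse=True)

references = [[1.0, 0.0], [0.8, 0.6]]  # toy reference embeddings (assumption)
candidates = {"tile_a": [0.0, 1.0], "tile_b": [0.9, 0.1], "tile_c": [0.6, 0.8]}
print(rank_candidates(candidates, references))
# ['tile_b', 'tile_c', 'tile_a']
```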

Quick start:

python -m wsi_core_pkg.embeddings.prebuild_reference_embeddings \
    --tiles-root ./Selected_Tiles \
    --output-dir ./outputs/cache/reference_hnsw \
    --extractor reddino

Detailed notes live in evaluate/README.md under Reference Embeddings.

Documentation Map

evaluate/README.md: CLI workflows and reference-embedding setup
docs/AML_AGENT_PIPELINE_METHODOLOGY.md: methodology and algorithmic description
docs/AML_AGENT_PIPELINE_IMPLEMENTATION.md: implementation details
docs/AML_AGENT_TOOLS_APPENDIX.md: tool and navigation appendix
docs/AML_PROMPTS_APPENDIX.md: prompt appendix
docs/EVALUATION_METRICS.md: metric definitions
