Skip to content

Proactive Reasoning-with-Retrieval Framework for Medical Multimodal Large Language Models

License

Notifications You must be signed in to change notification settings

xmed-lab/Med-RwR

Repository files navigation

Med-RwR

Arxiv Model

📍 Overview

Med-RwR is the first Multimodal Medical Reasoning-with-Retrieval framework, which proactively retrieves external knowledge by querying observed symptoms or domain-specific medical concepts during reasoning. This approach encourages the model to ground its diagnostic analysis in verifiable external information retrieved after analyzing both visual and textual inputs.

model

Overview of Med-RwR framework

🔥 News

  • [2025/11] Demo codes released.
  • [2025/11] The model is released on HuggingFace.
  • [2025/10] The paper is available on arXiv.

🛠️ Environment Installation

The required dependencies are as follows.

conda create -n medrwr python==3.10
conda activate medrwr
pip install torch==2.5.1 torchvision==0.20.1 torchaudio==2.5.1 --index-url https://download.pytorch.org/whl/cu124
pip install -r requirements.txt

Our code is implemented based on SWIFT. Please setup the required version (v3.3.0) with the following commands.

git clone https://github.com/xmed-lab/Med-RwR.git
pip install -e .

⚡️ Quick Start

Start the Retriever

The knowledge base is available at HuggingFace. Please download from the link and put inside retrieve/knowledge_base.

Start the retriever, where we employ BGE-M3 as an example:

python retrieve/retrieve.py

Inference

Run the demo code:

python demo.py

Update the question and image path in the code with your own values before running it.

🙏 Acknowledgements

We refer to the codes from SWIFT, R1-Searcher, ZeroSearch. Thank the authors for their contribution to the community.

📚 Citation

If you find this project useful, please consider citing:

@article{wang2025proactive,
  title={Proactive Reasoning-with-Retrieval Framework for Medical Multimodal Large Language Models},
  author={Wang, Lehan and Qin, Yi and Yang, Honglong and Li, Xiaomeng},
  journal={arXiv preprint arXiv:2510.18303},
  year={2025}
}

About

Proactive Reasoning-with-Retrieval Framework for Medical Multimodal Large Language Models

Resources

License

Stars

Watchers

Forks

Releases

No releases published

Packages

No packages published