Official implementation of the paper "DuoProto: Dual-Branch Prototype-Guided Framework for Early Recurrence Prediction in Hepatocellular Carcinoma (HCC)". DuoProto is a dual-branch, prototype-guided learning framework that leverages limited multi-phase CT during training to enhance single-phase early recurrence prediction.
Install the required dependencies:
pip install -r requirements.txt

Requirements:
- Python >= 3.8
- PyTorch >= 1.9.0
- MONAI >= 0.9.0
- scikit-learn
- matplotlib
- seaborn
- pandas
- numpy
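
To confirm that an environment satisfies the version floors listed above, a small check along the following lines may help. This is a minimal sketch, not part of the repository: the helper names (`parse_version`, `meets_minimum`, `check_environment`) are hypothetical, and the dictionary keys assume the usual PyPI distribution names (PyTorch is published as `torch`).

```python
# importlib.metadata requires Python 3.8+, so an ImportError on the next line
# already signals an interpreter older than the stated minimum.
from importlib import metadata
import re

# Minimum versions from the requirements list; None means "any version".
REQUIREMENTS = {
    "torch": "1.9.0",
    "monai": "0.9.0",
    "scikit-learn": None,
    "matplotlib": None,
    "seaborn": None,
    "pandas": None,
    "numpy": None,
}


def parse_version(text):
    """Turn '1.9.0+cu111' into (1, 9, 0), dropping local and pre-release suffixes."""
    parts = []
    for piece in text.split("."):
        match = re.match(r"\d+", piece)
        if match is None:
            break
        parts.append(int(match.group()))
    return tuple(parts)


def meets_minimum(installed, minimum):
    """True if the installed version is at least the minimum (or no minimum is set)."""
    if minimum is None:
        return True
    have, need = parse_version(installed), parse_version(minimum)
    width = max(len(have), len(need))
    # Pad with zeros so that '1.9' compares equal to '1.9.0'.
    have += (0,) * (width - len(have))
    need += (0,) * (width - len(need))
    return have >= need


def check_environment(requirements=REQUIREMENTS):
    """Return a list of problems; an empty list means every check passed."""
    problems = []
    for name, minimum in requirements.items():
        try:
            installed = metadata.version(name)
        except metadata.PackageNotFoundError:
            problems.append(f"{name} is not installed")
            continue
        if not meets_minimum(installed, minimum):
            problems.append(f"{name} {installed} is older than {minimum}")
    return problems


if __name__ == "__main__":
    for problem in check_environment():
        print(problem)
```

The version comparison is deliberately simple (leading numeric components only); for stricter handling of pre-releases, the `packaging` library is a common alternative.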
Note: This project uses private medical data collected from hospital collaborations, which cannot be publicly released due to patient privacy and institutional regulations. Therefore, we do not provide access to the original dataset. If you wish to adapt this framework to your own data, please follow the format described below.
Describe each sample as a dictionary with the following structure:
Single-phase data, portal venous (PV) phase only:
{
    "PVimg": "/path/to/pv_image.nii.gz",
    "PVmask": "/path/to/pv_mask.nii.gz",
    "label": 0,   # 0 or 1: ER (1) vs. NER (0)
    "bclc": 0,    # 0, 1, 2, or 3: BCLC stage 0, A, B, or C
    "PID": "patient_id",
}

Multi-phase data:
{
    "preimg": "/path/to/pre_image.nii.gz",
    "premask": "/path/to/pre_mask.nii.gz",
    "Aimg": "/path/to/arterial_image.nii.gz",
    "Amask": "/path/to/arterial_mask.nii.gz",
    "PVimg": "/path/to/pv_image.nii.gz",
    "PVmask": "/path/to/pv_mask.nii.gz",
    "Delayimg": "/path/to/delay_image.nii.gz",
    "Delaymask": "/path/to/delay_mask.nii.gz",
    "label": 0,   # 0 or 1: ER (1) vs. NER (0)
    "bclc": 0,    # 0, 1, 2, or 3: BCLC stage 0, A, B, or C
    "PID": "patient_id",
}

Implement your data loading logic in utils/dataloader.py by modifying the get_custom_data() function:
def get_custom_data():
    # TODO: Implement your custom data loading here
    files = []   # List of data dictionaries
    labels = []  # List of corresponding labels
    # Your data loading logic here
    # ...
    return files, labels

Run training with the DuoProto framework:
bash scripts/train.sh

Inference is performed automatically after training.
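
Before training can run on your own data, get_custom_data() needs an implementation. One possible approach, sketched below, is to keep the sample dictionaries in a JSON manifest and validate each one against the format described earlier. The manifest file, the `multiphase` flag, and the helper names (`validate_sample`, `load_manifest`) are assumptions for illustration and do not exist in the repository; only the dictionary keys and the `(files, labels)` return shape come from this README.

```python
import json
from pathlib import Path

# Keys required by the single-phase format.
REQUIRED_KEYS = ("PVimg", "PVmask", "label", "bclc", "PID")
# Additional keys carried by a multi-phase sample.
MULTIPHASE_KEYS = ("preimg", "premask", "Aimg", "Amask", "Delayimg", "Delaymask")


def validate_sample(sample, multiphase=False):
    """Raise ValueError if a sample dictionary does not match the expected format."""
    expected = REQUIRED_KEYS + (MULTIPHASE_KEYS if multiphase else ())
    missing = [key for key in expected if key not in sample]
    if missing:
        raise ValueError(f"sample {sample.get('PID')!r} is missing keys: {missing}")
    if sample["label"] not in (0, 1):
        raise ValueError(f"label must be 0 (NER) or 1 (ER), got {sample['label']!r}")
    if sample["bclc"] not in (0, 1, 2, 3):
        raise ValueError(f"bclc must be 0, 1, 2, or 3, got {sample['bclc']!r}")


def load_manifest(manifest_path, multiphase=False):
    """Hypothetical helper: read a JSON list of sample dictionaries.

    Returns (files, labels), the same shape get_custom_data() returns.
    """
    samples = json.loads(Path(manifest_path).read_text())
    files, labels = [], []
    for sample in samples:
        validate_sample(sample, multiphase=multiphase)
        files.append(sample)
        labels.append(sample["label"])
    return files, labels
```

With such a helper in place, get_custom_data() could simply return the result of load_manifest() called on your own manifest path. Validating up front means a missing mask path or an out-of-range label fails immediately rather than partway through training.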
For visualization, see visualize.ipynb.
Repository structure:
├── main.py                     # Main training script
├── train.sh                    # Training shell script
├── requirements.txt            # Python dependencies
├── utils/
│   ├── dataloader.py           # Data loading utilities
│   └── balanced_sampler.py     # Balanced sampling for class imbalance
├── models/
│   ├── ViT.py                  # Vision Transformer implementation
│   ├── multiphase_vit.py       # Multi-phase ViT model (auxiliary branch)
│   └── proto_model.py          # DuoProto fusion model
├── trainer/
│   └── training.py             # Dual-branch training loop
└── inference/
    └── evaluation.py           # Evaluation metrics and visualization

Citation:
@misc{yu2025learninglimitedmultiphasect,
  title={Learning from Limited Multi-Phase CT: Dual-Branch Prototype-Guided Framework for Early Recurrence Prediction in HCC},
  author={Hsin-Pei Yu and Si-Qin Lyu and Yi-Hsien Hsieh and Weichung Wang and Tung-Hung Su and Jia-Horng Kao and Che Lin},
  year={2025},
  eprint={2510.07347},
  archivePrefix={arXiv},
  primaryClass={q-bio.QM},
  url={https://arxiv.org/abs/2510.07347},
}
