Engaging preference optimization alignment in large language model for continual radiology report generation: A hybrid approach
Amaan Izhar, Norisma Idris, Nurul Japar
Large language models (LLMs) remain relatively underutilized in medical imaging, particularly in radiology, which is essential for disease diagnosis and management. Meanwhile, radiology report generation (RRG) is a time-consuming task prone to delays and inconsistencies. To address these challenges, we present a novel hybrid approach that integrates multi-modal radiology information and preference-optimization alignment in an LLM for continual RRG. Our method uses a pre-trained small multi-modal model to analyze radiology images and generate an initial report, which an LLM then refines and aligns via odds ratio preference optimization (ORPO), together with historical patient data and assessments, to mimic radiologist-like responses, bypassing reinforcement learning from human feedback (RLHF)-based alignment. This two-stage fusion, supervised fine-tuning followed by preference optimization, ensures high accuracy while minimizing hallucinations and errors. We also propose a data-field curation strategy, extendable to various other RRG modality datasets, that focuses on selecting relevant responses for preference alignment. We evaluate our approach on two public datasets, achieving state-of-the-art performance with average BLEU scores of 0.375 and 0.647, METEOR scores of 0.495 and 0.714, ROUGE-L scores of 0.483 and 0.732, and average F1-RadGraph scores of 0.488 and 0.487 on the chest X-ray and lung CT scan datasets, respectively. We further provide in-depth qualitative analyses and ablation studies to explain the workings of our model and to assess its clinical relevance for RRG. This work presents the first application of preference optimization in continual RRG, representing a significant advancement toward automating clinically reliable report generation. By reducing the cognitive burden on radiologists through AI-powered reasoning and alignment in LLMs, the proposed model improves decision-making, perception, and diagnostic precision, streamlining workflows and enhancing patient care.
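For intuition, here is a minimal sketch of the preference-optimization stage using `ORPOTrainer` from Hugging Face TRL; ORPO augments the supervised negative log-likelihood loss with a log-odds-ratio penalty that favors the preferred response over the dispreferred one. The backbone model, hyperparameters, and example triplet below are illustrative assumptions, not the exact configuration used in the paper:

```python
# Minimal ORPO alignment sketch (Hugging Face TRL). All names below are
# illustrative assumptions, not the paper's exact setup.
from datasets import Dataset
from transformers import AutoModelForCausalLM, AutoTokenizer
from trl import ORPOConfig, ORPOTrainer

model_name = "Qwen/Qwen2-0.5B"  # hypothetical small backbone for the sketch
model = AutoModelForCausalLM.from_pretrained(model_name)
tokenizer = AutoTokenizer.from_pretrained(model_name)

# ORPO learns from preference triplets: given a prompt (e.g. the draft report
# from the small multi-modal model plus patient history), the radiologist-style
# reference is "chosen" and a weaker draft is "rejected".
train_dataset = Dataset.from_dict({
    "prompt": ["Refine this draft chest X-ray report: Heart size normal. Lungs clear."],
    "chosen": ["The cardiomediastinal silhouette is within normal limits. The lungs are clear without focal consolidation, effusion, or pneumothorax."],
    "rejected": ["Heart size normal. Lungs clear. No findings."],
})

args = ORPOConfig(
    output_dir="orpo-ckpt",
    beta=0.1,                       # weight of the odds-ratio penalty term
    per_device_train_batch_size=1,
    num_train_epochs=1,
)
trainer = ORPOTrainer(
    model=model,
    args=args,
    train_dataset=train_dataset,
    processing_class=tokenizer,     # called `tokenizer=` in older TRL releases
)
trainer.train()
```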
If you find this work useful, please consider citing our paper and giving this repository a ⭐:
```bibtex
@article{izhar2025r2gpoallm,
  title={Engaging preference optimization alignment in large language model for continual radiology report generation: A hybrid approach},
  author={Izhar, Amaan and Idris, Norisma and Japar, Nurul},
  journal={Cognitive Computation},
  volume={17},
  number={1},
  pages={53},
  year={2025},
  publisher={Springer},
  doi={10.1007/s12559-025-10404-6}
}
```
| Component | Specification |
|---|---|
| OS | Ubuntu 22.04 |
| GPU | ≥ 24 GB VRAM |
| RAM | ≥ 16 GB |
| Disk Space | ≥ 200 GB |
| Env Modules | Miniconda, Python 3.10 |
| Dependencies | CUDA ≥ 12.1, Hugging Face token |
Run:

```shell
# Clone the repo
git clone https://github.com/AI-14/r2gpoallm.git
cd r2gpoallm

# Create and activate conda environment
conda create -n env python=3.10 --yes
conda activate env
conda install pip
pip install -r requirements.txt

# Update specific package
pip install -U datasets
```
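Before launching any experiments, it can help to confirm that the GPU stack is visible to PyTorch. This check is our suggestion (assuming PyTorch is installed via `requirements.txt`), not part of the repository's scripts:

```python
# Quick environment sanity check (assumes PyTorch came in via requirements.txt).
import torch

print("CUDA available:", torch.cuda.is_available())
print("CUDA version:", torch.version.cuda)
if torch.cuda.is_available():
    props = torch.cuda.get_device_properties(0)
    print(f"GPU: {props.name}, VRAM: {props.total_memory / 1024**3:.1f} GB")
```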
- Follow instructions from the PKATransNet repository.
- After running experiments there, rename `beam-search-predictions.csv` → `test_p.csv`.
- Move `test_p.csv` into the correct `datasets/<sub-directory>/` folder in this repository (a file-copy sketch follows this list).
- Copy the entire `datasets` folder from PKATransNet into this repository.
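As a small Python equivalent of the manual rename-and-move step above; both paths are placeholders that you must adapt to your local PKATransNet output location and the intended dataset sub-directory:

```python
# Hypothetical helper mirroring the manual file steps; both paths are
# placeholders and must be adapted to your local layout.
from pathlib import Path
import shutil

src = Path("../PKATransNet/beam-search-predictions.csv")  # placeholder location
dst = Path("datasets/<sub-directory>")                    # fill in the actual sub-directory

dst.mkdir(parents=True, exist_ok=True)
shutil.copy2(src, dst / "test_p.csv")  # the rename happens via the destination filename
```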
Update your Hugging Face token in the following files:

- `scripts/iuxray/base_sft_like.sh`
- `scripts/iuxray/base_po_like.sh`
- `scripts/iuxray/sft.sh`
- `scripts/iuxray/po.sh`
Run the scripts in order (judging by their names, `pp.sh` preprocesses the data, `sft.sh` and `po.sh` run the supervised fine-tuning and preference-optimization stages, and the `base_*` scripts train the corresponding baseline variants):

```shell
source scripts/iuxray/pp.sh
source scripts/iuxray/base_sft_like.sh
source scripts/iuxray/base_po_like.sh
source scripts/iuxray/sft.sh
source scripts/iuxray/po.sh
```
Update your Hugging Face token in the following files:

- `scripts/covctr/base_sft_like.sh`
- `scripts/covctr/base_po_like.sh`
- `scripts/covctr/sft.sh`
- `scripts/covctr/po.sh`
Run:

```shell
source scripts/covctr/pp.sh
source scripts/covctr/base_sft_like.sh
source scripts/covctr/base_po_like.sh
source scripts/covctr/sft.sh
source scripts/covctr/po.sh
```
Run:

```shell
# Leave the repo, remove the conda environment, and delete the sources
cd ..
conda deactivate
conda remove --name env --all
rm -r r2gpoallm
```
⚠️ Note: Due to the small dataset sizes and sensitivity to random-seed initialization, results may vary slightly across runs. For rigorous experiments, evaluate across multiple seeds and report averaged metrics.
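As a sketch of what a multi-seed evaluation could look like (the seed list is arbitrary and `run_experiment` is a hypothetical stand-in, not part of this repository):

```python
# Hedged sketch of a multi-seed evaluation loop; `run_experiment` is a
# hypothetical stand-in for launching one full training/evaluation run.
import random

import numpy as np
import torch

def set_seed(seed: int) -> None:
    """Fix the common RNG sources so repeated runs are comparable."""
    random.seed(seed)
    np.random.seed(seed)
    torch.manual_seed(seed)
    torch.cuda.manual_seed_all(seed)

for seed in (13, 42, 2025):
    set_seed(seed)
    # run_experiment(seed)  # hypothetical: launch one run per seed, then average metrics
```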