
CCD: Mitigating Hallucinations in Radiology MLLMs via Clinical Contrastive Decoding

🔥 News

  • [17 Oct 2025] 🔩 CCD has been upgraded to support view classification for chest X-rays — see the Supported Expert Models section for details.
  • [06 Oct 2025] 🎮 The online demo is available at Hugging Face Spaces. Feel free to try it out!
  • [30 Sep 2025] 🗂️ The processed test data for quick start are now available — enjoy exploring with the provided guidelines!
  • [27 Sep 2025] ⛳ Our preprint is now live on arXiv — check it out for details.

Overview

Multimodal large language models (MLLMs) are advancing radiology by combining image and text understanding, but often generate inaccurate or unsupported clinical details—so-called medical hallucinations. We propose Clinical Contrastive Decoding (CCD), a training-free and retrieval-free inference framework that integrates structured clinical signals from task‑specific radiology expert models. CCD reduces hallucinations and improves clinical accuracy without changing the base model. Experiments show CCD boosts performance on multiple datasets and models, offering a practical way to make radiology MLLMs more reliable.

CCD's Framework

(Figure: overview of the CCD framework)


⛏️ Installation

Tip

Use uv for installation — it's faster and more reliable than pip.

Option 1:

Install the latest version directly from GitHub for quick setup:

```shell
uv pip install git+https://github.com/X-iZhang/CCD.git
```

Note

Requirements: Python 3.9 or later, and a CUDA-compatible GPU (recommended)

Option 2:

If you plan to modify the code or contribute to the project, you can clone the repository and install it in editable mode:

  1. Clone the repository and navigate to the project folder:

```shell
git clone https://github.com/X-iZhang/CCD.git
cd CCD
```

  2. Set up the environment and install in editable mode:

```shell
conda create -n CCD python=3.10 -y
conda activate CCD
pip install uv  # enable uv support
uv pip install -e .
```

🔄 Upgrade to the latest code base:

```shell
git pull
uv pip install -e .
```

⚡ Quick Start

CLI Inference

You can perform inference directly from the command line using our CLI tool:

```shell
python -m ccd.run_ccd \
  --model-path "X-iZhang/libra-maira-2" \
  --image "./path/to/Chest_Xray.jpg" \
  --question "Is there evidence of any abnormalities?" \
  --max-new-tokens 128
```

Optional arguments:

| Argument | Description | Default |
| --- | --- | --- |
| `--alpha` | Clinical guidance weight (range: 0.0–1.0) | 0.5 |
| `--beta` | Expert token weight (range: 0.0–1.0) | 0.5 |
| `--gamma` | Token bias magnitude (choices: 2, 5, 10) | 10 |
| `--expert-model` | Choice of expert model: "DenseNet" or "MedSiglip" | DenseNet |

Script Inference

You can run inference programmatically using the ccd_eval function from ccd/run_ccd.py.
After installing this repository, you can easily launch a model (either your own trained model or ours) locally or in Google Colab.

```python
from ccd import ccd_eval

# Run CCD inference on a chest X-ray
output = ccd_eval(
    model_path="X-iZhang/libra-maira-2",  # or your custom radiology MLLM
    image="./path/to/Chest_Xray.jpg",
    question="Describe the findings in this chest X-ray.",
    alpha=0.5,        # Clinical guidance weight
    beta=0.5,         # Expert token weight
    gamma=10,         # Token bias magnitude
    temperature=0.9,  # Sampling temperature
    top_p=0.9,        # Nucleus sampling probability
    top_k=50,         # Top-k sampling
    expert_model="DenseNet",    # or "MedSiglip"
    max_new_tokens=256
)
print(output)
```
💡 You can also use run_eval to test the original model output (without CCD):

```python
from ccd import run_eval

# Run standard inference without CCD
output = run_eval(
    model_path="X-iZhang/libra-maira-2",
    image="./path/to/Chest_Xray.jpg",
    question="Describe the findings in this chest X-ray.",
    max_new_tokens=128,
    num_beams=1
)
print(output)
```

Gradio Web Interface

You can launch the Gradio demo locally with:

```shell
python -m ccd.app
```

Once the Gradio web interface launches, open it via the URL printed in your terminal. The default MAIRA-2 model and the expert models come preconfigured, with more models available in the list. Simply upload a chest X-ray image, enter your question, and click 🚀 Generate to view the results!

(Figure: Gradio demo interface)

🛠️ Advanced Usage

Supported MLLM Models

CCD is compatible with any radiology MLLM that follows the Libra/LLaVA architecture:

Note

To switch MLLM models, simply set the --model-path argument (CLI) or model_path parameter (Python) to one of the following checkpoints.

| Model | Checkpoint |
| --- | --- |
| Libra-v1.0-7B | `X-iZhang/libra-v1.0-7b` |
| Libra-v1.0-3B | `X-iZhang/libra-v1.0-3b` |
| MAIRA-2 | `X-iZhang/libra-maira-2` |
| LLaVA-Med-v1.5 | `X-iZhang/libra-llava-med-v1.5-mistral-7b` |
| LLaVA-Rad | `X-iZhang/libra-llava-rad` |
| Med-CXRGen-F | `X-iZhang/Med-CXRGen-F` |
| Med-CXRGen-I | `X-iZhang/Med-CXRGen-I` |

Warning

The model adapted from the Libra repository is intended for demonstration purposes only. For accurate evaluation, please refer to the original model weights and configuration settings, particularly the chat template.

Supported Expert Models

CCD integrates two expert models for clinical signal extraction:

Note

To switch expert models, simply set the --expert-model argument (CLI) or expert_model parameter (Python) to one of the following names.

| Model | Checkpoint | Note |
| --- | --- | --- |
| DenseNet | `torchxrayvision/densenet121-res224-chex` | CheXpert (Stanford) |
| MedSiglip | `google/medsiglip-448` | Variant of SigLIP |
| View Model | `ChestViewSplit` | 'Frontal' or 'Lateral' |
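As an illustration of how structured signals from an expert classifier might be folded into the decoding context, the sketch below converts a hypothetical set of pathology probabilities (the kind a DenseNet-style chest X-ray classifier produces) into a short clinical-guidance string. The function name, threshold, and message wording are assumptions for illustration, not CCD's actual implementation.

```python
def build_clinical_guidance(pathology_probs, threshold=0.5):
    """Turn expert pathology probabilities into a guidance sentence.

    `pathology_probs` maps finding names to probabilities, e.g. from a
    chest X-ray classifier. Findings at or above `threshold` are listed
    as likely present. Illustrative sketch only, not CCD's code.
    """
    positives = sorted(
        name for name, p in pathology_probs.items() if p >= threshold
    )
    if not positives:
        return "Expert model: no findings flagged above threshold."
    return "Expert model suggests possible: " + ", ".join(positives) + "."

# Example with made-up probabilities
probs = {"Cardiomegaly": 0.82, "Edema": 0.31, "Pleural Effusion": 0.67}
print(build_clinical_guidance(probs))
# → Expert model suggests possible: Cardiomegaly, Pleural Effusion.
```

A string like this could then be appended to the prompt or used to steer decoding, which is the role the guidance weight plays in the parameters below.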

Tip

DenseNet has been upgraded to work alongside the view classification expert model, which helps the system identify the view position of a chest X-ray (frontal or lateral) and thereby improves the accuracy of report generation. MedSigLIP has been configured accordingly. The design is inspired by the MAIRA-2 chat template.

Parameter Settings

  • alpha (0.0-1.0): Weight for clinical guidance text

    • Higher = more influence from expert-generated guidance
    • Recommended: 0.3-0.7
  • beta (0.0-1.0): Weight for direct token biasing

    • Higher = stronger push toward clinical terminology
    • Recommended: 0.3-0.7
  • gamma (2, 5, 10): Maximum token bias magnitude

    • 2: Subtle influence
    • 5: Moderate influence
    • 10: Strong influence (default)
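As a rough sketch of how these three knobs could interact (not the paper's exact formulation), the snippet below mixes per-token logits contrastive-decoding style: `alpha` interpolates toward guidance-conditioned logits, `beta` weights a clinical-term bias vector, and `gamma` caps that bias's magnitude. All function and variable names are hypothetical.

```python
import numpy as np

def ccd_style_logits(base_logits, guided_logits, term_bias,
                     alpha=0.5, beta=0.5, gamma=10):
    """Illustrative combination of base and expert-guided logits.

    A hedged sketch, not CCD's actual implementation: alpha blends in
    the guidance-conditioned distribution, beta scales a clinical-term
    bias, and gamma clips that bias before it is applied.
    """
    base = np.asarray(base_logits, dtype=float)
    guided = np.asarray(guided_logits, dtype=float)
    bias = np.clip(np.asarray(term_bias, dtype=float), -gamma, gamma)
    return (1 - alpha) * base + alpha * guided + beta * bias

# A raw bias of 20 on the first token is clipped to gamma = 10
adjusted = ccd_style_logits([1.0, 2.0], [3.0, 0.0], [20.0, 0.0],
                            alpha=0.5, beta=0.5, gamma=10)
print(adjusted)  # → [7. 1.]
```

This makes the tuning intuition concrete: raising `alpha` pulls the distribution toward the expert guidance, raising `beta` amplifies clinical-term preference, and `gamma` bounds how far any single token can be pushed.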

Tip

These parameters can be set beyond the recommended range for adversarial testing to observe CCD’s behaviour under extreme conditions.

🗂️ Dataset

CCD supports multiple medical imaging datasets commonly used in radiology research:

  • MIMIC-CXR — Chest X-ray images with corresponding radiology reports.
  • IU-Xray — Chest X-ray dataset with structured annotations.
  • CheXpert Plus — Large-scale dataset for chest X-ray interpretation.
  • Medical-CXR-VQA — A dataset for visual question answering in chest X-rays.

Note

To facilitate hands-on testing, we provide pre-processed test splits for MIMIC-CXR, IU-Xray, CheXpert Plus and Medical-CXR-VQA, available on Hugging Face Collections.

Warning

Carefully read the dataset READMEs. Note that the images in these datasets have been compressed for efficient storage and sharing; use the original datasets for formal evaluation.

📊 Evaluation

For evaluating generated reports, we recommend using RadEval — a unified framework for radiology text evaluation that integrates multiple standard metrics. Details can be found in the GitHub repository.

You can install RadEval via pip:

pip install RadEval

Tip

RadEval supports metrics such as BLEU, ROUGE, BERTScore, CheXbert F1, and RadGraph F1, making it ideal for comprehensive evaluation of radiology report generation models.
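RadEval's own API is documented in its repository. Purely to illustrate the lexical-overlap idea that metrics such as BLEU and ROUGE refine, here is a minimal unigram F1 between a generated and a reference report; this is a toy stand-in, not RadEval code.

```python
def unigram_f1(generated, reference):
    """Token-level F1 between two reports (illustrative only).

    Real metrics (BLEU, ROUGE, CheXbert F1, RadGraph F1) are far more
    sophisticated; this just shows the overlap idea they build on.
    """
    gen = set(generated.lower().split())
    ref = set(reference.lower().split())
    overlap = len(gen & ref)
    if overlap == 0:
        return 0.0
    precision = overlap / len(gen)
    recall = overlap / len(ref)
    return 2 * precision * recall / (precision + recall)

print(unigram_f1("no acute cardiopulmonary process",
                 "no acute cardiopulmonary abnormality"))
# → 0.75
```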

📝 Citation

If you find our paper and code useful in your research and applications, please cite using this BibTeX:

@misc{zhang2025ccdmitigatinghallucinationsradiology,
      title={CCD: Mitigating Hallucinations in Radiology MLLMs via Clinical Contrastive Decoding}, 
      author={Xi Zhang and Zaiqiao Meng and Jake Lever and Edmond S. L. Ho},
      year={2025},
      eprint={2509.23379},
      archivePrefix={arXiv},
      primaryClass={cs.CL},
      url={https://arxiv.org/abs/2509.23379}, 
}

📚 Acknowledgments

This project builds upon the following outstanding open-source works:

  • Libra — A flexible toolkit supporting multiple radiology LLM backbones, covering the full pipeline from training to inference.
  • TorchXRayVision — A library for chest X-ray datasets and models.
  • MedSigLIP — A medical variant of SigLIP (Sigmoid Loss for Language-Image Pre-training).
  • RadEval — A unified framework for radiology text evaluation.

We thank the authors for their valuable contributions to the medical AI community.

📜 License

This project is licensed under the MIT License - see the LICENSE file for details.

🧰 Intended Use

CCD is designed to assist clinical practitioners, researchers, and medical trainees in generating and analysing chest X-ray reports, with a focus on temporal reasoning and context-aware description of radiological findings.

Key Applications

  • 🩺 Clinical Decision Support — Produces preliminary findings or comparative analyses that can aid radiologists in drafting and reviewing reports.
  • 🎓 Educational Tool — Demonstrates example interpretations and temporal progressions for teaching radiology residents and students.
  • 🔬 Research Utility — Enables investigation of automated report generation, visual-language alignment, and temporal feature learning in medical imaging.

Important

All outputs must be reviewed and validated by qualified radiologists or medical professionals before informing any clinical decision.


Limitations and Recommendations

  1. Data Bias — Performance may degrade on underrepresented populations or rare disease categories.
  2. Clinical Oversight — CCD is a supportive system, not a replacement for professional medical judgment.
  3. Temporal Sensitivity — Although TAC enhances temporal alignment, subtle or atypical longitudinal changes may remain unrecognised.
  4. Generalisation — Performance may vary on image types or clinical contexts not present in the training distribution.

Ethical Considerations

  • Patient Privacy — All input data must be fully de-identified and compliant with HIPAA, GDPR, or equivalent local regulations.
  • Responsible Deployment — CCD’s outputs may contain inaccuracies; users should interpret them with appropriate caution.
  • Accountability — The responsibility for clinical verification and safe deployment lies with the end-user organisation or researcher.

Disclaimer

This model and accompanying tools are intended solely for research and educational purposes.
CCD is not approved by the FDA, CE, or other regulatory authorities for clinical use.
For medical diagnosis or treatment decisions, please consult a licensed healthcare professional.