If you use or extend this work, please cite:
```bibtex
@misc{liu2024factual,
  title={Factual Serialization Enhancement: A Key Innovation for Chest X-ray Report Generation},
  author={Kang Liu and Zhuoqi Ma and Mengmeng Liu and Zhicheng Jiao and Xiaolu Kang and Qiguang Miao and Kun Xie},
  year={2024},
  eprint={2405.09586},
  archivePrefix={arXiv},
  primaryClass={eess.IV}
}
```

Requirements:
- Python 3.9
- torch==2.1.2+cu118
- transformers==4.23.1
- torchvision==0.16.2+cu118
- radgraph==0.09
⚠️ Because RadGraph has its own environment requirements, we recommend two separate virtual environments (a setup sketch follows this list):
- RadGraph environment: for structural entity extraction (`knowledge_encoder/radgraph_requirements.txt`)
- Main FSE environment: for running the rest of the framework (`requirements.txt`)
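A minimal setup sketch for the main FSE environment, assuming conda is available; the environment name `fse` is an assumption, and the RadGraph-side environment is covered by the dygiepp instructions below:

```bash
# Main FSE environment (Python 3.9, per the requirements above).
# The name "fse" is illustrative, not fixed by the repo.
conda create -n fse python=3.9 -y
conda activate fse
pip install -r requirements.txt
# Set up the RadGraph-side environment separately; see the dygiepp steps
# and knowledge_encoder/radgraph_requirements.txt above.
```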
Datasets:
- IU X-Ray 📥 Images & Reports: Google Drive
- MIMIC-CXR 📥 Images: PhysioNet (license required) 📥 Reports: Google Drive

Checkpoints:
- MIMIC-CXR: Baidu Netdisk (code: `MK13`)
- IU X-Ray: Baidu Netdisk (code: `MK13`)
RadGraph builds on DyGIE++; set up the dygiepp environment as follows:

```bash
git clone https://github.com/dwadden/dygiepp.git
conda create -n dygiepp python=3.7
conda activate dygiepp
cd dygiepp
pip install -r requirements.txt
conda develop .
```

Refer to `knowledge_encoder/radgraph_requirements.yml` for additional dependencies.
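An optional sanity check that the DyGIE++ install is visible from the environment (a sketch; the `dygie` package name comes from the dygiepp repository):

```bash
# After "conda develop .", the dygie package should be importable.
python -c "import dygie; print('dygiepp is importable')"
```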
- RadGraph model: PhysioNet RadGraph
- Annotation JSON: Google Drive (requires PhysioNet license)
Set local paths for:
- `radgraph_model_path`
- `ann_path` (annotation.json)
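Structural entity extraction depends on RadGraph, so run the extraction step below from the RadGraph-side environment rather than the main FSE one (a sketch; the environment name `dygiepp` is an assumption):

```bash
# Assumption: "dygiepp" is the environment holding the RadGraph dependencies.
conda activate dygiepp
```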
Run the following steps (a combined sketch appears after the results below):

1. Extract factual serializations:

   ```bash
   python knowledge_encoder/factual_serialization.py
   ```

2. Pretrain on MIMIC-CXR:

   ```bash
   bash pretrain_mimic_cxr.sh
   ```

3. Configure the `--load` argument in `pretrain_inference_mimic_cxr.sh`, then run:

   ```bash
   bash pretrain_inference_mimic_cxr.sh
   ```

4. Configure the `--load` argument in `finetune_mimic_cxr.sh`, then run:

   ```bash
   bash finetune_mimic_cxr.sh
   ```

5. To test, download the images, reports (`mimic_cxr_annotation_sen_best_reports_keywords_20.json`), and checkpoints (`finetune_model_best.pth`). Configure `--load` and `--mimic_cxr_ann_path` in `test_mimic_cxr.sh`, then run:

   ```bash
   bash test_mimic_cxr.sh
   ```

Results:
- MIMIC-CXR (FSE-5, $M_{gt}=100$):
- IU X-Ray (FSE-20, $M_{gt}=60$):
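For convenience, here are the MIMIC-CXR steps above as one sequential sketch; it assumes the `--load` and `--mimic_cxr_ann_path` arguments inside each script already point at the right checkpoints and annotation file, and the environment names `dygiepp`/`fse` are the assumptions used earlier:

```bash
# Sequential sketch of the full MIMIC-CXR pipeline (arguments inside each
# script must already be configured, as described in the steps above).
conda activate dygiepp                              # RadGraph-side environment (name assumed)
python knowledge_encoder/factual_serialization.py   # structural entity extraction
conda activate fse                                  # main FSE environment (name assumed)
bash pretrain_mimic_cxr.sh                          # pretraining
bash pretrain_inference_mimic_cxr.sh                # inference with the pretrained model
bash finetune_mimic_cxr.sh                          # finetuning
bash test_mimic_cxr.sh                              # evaluation with finetune_model_best.pth
```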
- Some code is adapted from R2Gen [1].
- Some code is adapted from R2GenCMN [2].
- Some code is adapted from MGCA [3].
[1] Chen, Z., Song, Y., Chang, T.H., Wan, X., 2020. Generating radiology reports via memory-driven transformer, in: EMNLP, pp. 1439–1449.
[2] Chen, Z., Shen, Y., Song, Y., Wan, X., 2021. Cross-modal memory networks for radiology report generation, in: ACL, pp. 5904–5914.
[3] Wang, F., Zhou, Y., Wang, S., Vardhanabhuti, V., Yu, L., 2022. Multi-granularity cross-modal alignment for generalized medical visual representation learning, in: NeurIPS, pp. 33536–33549.