Methods: We created a large-scale dataset of 10,021 thoracic CTs, encompassing 157 labels, and applied an ensemble of 3D anatomy segmentation models to extract anatomical pseudo-labels. These labels were projected onto a two-dimensional plane, resembling a chest X-ray (CXR), enabling the training of detailed semantic segmentation models without any manual annotation effort.
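The projection step can be illustrated with a minimal NumPy sketch. It is not the pipeline used in this work: it assumes the pseudo-labels are stored as one boolean volume per label, and that a pixel receives a label when any voxel along the projection ray carries it. The function name `project_labels`, the axis layout, and the toy data are hypothetical.

```python
import numpy as np


def project_labels(volume_labels: np.ndarray, axis: int) -> np.ndarray:
    """Collapse a (num_labels, D, H, W) boolean volume into 2D masks.

    Assumption: a pixel belongs to a structure if any voxel along the
    projection ray belongs to it, so projected masks may overlap.
    """
    return volume_labels.any(axis=axis)


# Toy volume with 2 labels on a 4x4x4 grid (hypothetical data).
vol = np.zeros((2, 4, 4, 4), dtype=bool)
vol[0, 1, 1:3, 1:3] = True  # label 0: a 2x2 patch in a single depth slice
vol[1, :, 0, 0] = True      # label 1: a column running along the depth axis

frontal = project_labels(vol, axis=1)  # collapse depth -> shape (2, 4, 4)
lateral = project_labels(vol, axis=3)  # collapse width -> shape (2, 4, 4)
```

Because several structures can lie on the same ray, the projected masks are kept as one channel per label rather than merged into a single label map.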
Results: Our segmentation models reached a high average model-annotator agreement with two radiologists, with mIoU scores of 0.93 and 0.85 for frontal and lateral anatomies, compared with an inter-annotator agreement of 0.95 and 0.83 mIoU. Additionally, our anatomical segmentations allowed for the accurate extraction of relevant explainable medical features such as the cardiothoracic ratio (CTR).
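As a rough illustration of the two quantities mentioned above, the sketch below computes a mean IoU over per-label binary masks and approximates the cardiothoracic ratio as the widest horizontal extent of a heart mask divided by that of a thorax mask. The function names, the choice to skip labels that are empty in both masks, and the use of a generic thorax mask (for example, a union of lung masks) are assumptions for illustration, not the evaluation code behind the reported numbers.

```python
import numpy as np


def mean_iou(pred: np.ndarray, target: np.ndarray) -> float:
    """Mean IoU over labels for (num_labels, H, W) boolean masks.

    Labels that are empty in both pred and target are skipped (assumption).
    """
    inter = np.logical_and(pred, target).sum(axis=(1, 2))
    union = np.logical_or(pred, target).sum(axis=(1, 2))
    valid = union > 0
    if not valid.any():
        return float("nan")
    return float((inter[valid] / union[valid]).mean())


def horizontal_extent(mask: np.ndarray) -> int:
    """Width in pixels between the leftmost and rightmost foreground column."""
    cols = np.flatnonzero(mask.any(axis=0))
    return int(cols[-1] - cols[0] + 1) if cols.size else 0


def cardiothoracic_ratio(heart: np.ndarray, thorax: np.ndarray) -> float:
    """Approximate CTR as heart width divided by thorax width."""
    thorax_width = horizontal_extent(thorax)
    if thorax_width == 0:
        raise ValueError("thorax mask is empty")
    return horizontal_extent(heart) / thorax_width


# Toy frontal masks on a 10x10 grid (hypothetical data).
heart = np.zeros((10, 10), dtype=bool)
heart[4:8, 3:7] = True    # 4 pixels wide
thorax = np.zeros((10, 10), dtype=bool)
thorax[1:9, 1:9] = True   # 8 pixels wide

pred = np.stack([heart, thorax])
target = pred.copy()
target[0] = False
target[0, 4:8, 3:5] = True  # reference heart covers half of the prediction

ctr = cardiothoracic_ratio(heart, thorax)
miou = mean_iou(pred, target)
```

In this toy case the heart spans 4 of the 8 thorax columns, and the two labels have IoUs of 0.5 and 1.0 respectively.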
Conclusion: Our method of volumetric pseudo-labeling paired with CT projection offers a promising approach for detailed anatomical segmentation of CXR with a high agreement with human annotators. This technique can have important clinical implications, particularly in the analysis of various thoracic pathologies.
Abstract: In clinical radiology reports, doctors capture important information about the patient's health status. They convey their observations from raw medical imaging data about the inner structures of a patient. As such, formulating reports requires medical experts to possess wide-ranging knowledge about anatomical regions with their normal, healthy appearance as well as the ability to recognize abnormalities. This explicit grasp of both the patient's anatomy and its appearance is missing in current medical image-processing systems, as annotations are especially difficult to gather. As a result, the models remain narrow experts, e.g. for identifying specific diseases. In this work, we recover this missing link by adding human anatomy into the mix and enable the association of content in medical reports with its occurrence in associated imagery (medical phrase grounding). To exploit anatomical structures in this scenario, we present a sophisticated automatic pipeline to gather and integrate human bodily structures from computed tomography datasets, which we incorporate into PAXRay: A Projected dataset for the segmentation of Anatomical structures in X-Ray data. Our evaluation shows that methods that take advantage of anatomical information benefit heavily in visually grounding radiologists' findings, as our anatomical segmentations allow for an absolute improvement of up to 50% in grounding results on the OpenI dataset compared to commonly used region proposals.
The dataset is available by clicking the folders:
We have not yet benchmarked models on the PAXRay++ dataset; benchmark results will be released soon.
For immediate use, refer to CXAS.
The pre-trained models are available by clicking the respective link below:
Model | mIoU | Code | Link |
---|---|---|---|
UNet (100%) | 60.6 | code | weights |
UNet (50%) | 57.07 | code | weights |
UNet (25%) | 52.77 | code | weights |
If you use this work or dataset, please cite:
@inproceedings{paxray,
author = {Seibold, Constantin and Reiß, Simon and Sarfraz, Saquib and Fink, Matthias A. and Mayer, Victoria and Sellner, Jan and Kim, Moon Sung and Maier-Hein, Klaus H. and Kleesiek, Jens and Stiefelhagen, Rainer},
title = {Detailed Annotations of Chest X-Rays via CT Projection for Report Understanding},
booktitle = {Proceedings of the 33rd British Machine Vision Conference (BMVC)},
year = {2022}
}