COVID-19 Lung Lesion Segmentation Using a Sparsely Supervised Mask R-CNN on Chest X-rays Automatically Computed from Volumetric CTs
V. Ramesh, B. Rister, and D. L. Rubin. “COVID-19 Lung Lesion Segmentation Using a Sparsely Supervised Mask R-CNN on Chest X-rays Automatically Computed from Volumetric CTs.” arXiv:2105.08147 [eess.IV], May 2021.
Chest X-rays of COVID-19 patients are frequently obtained to determine the extent of lung disease and are a valuable source of data for creating AI models. Most work to date assessing disease severity on chest imaging has focused on segmenting CT images; however, given that CTs are performed much less frequently than chest X-rays for COVID-19 patients, automated lung lesion segmentation on chest X-rays is clinically valuable. To accelerate severity detection and augment the amount of publicly available chest X-ray training data for supervised DL models, we propose an automated pipeline for segmentation of COVID-19 lung lesions on chest X-rays comprised of a Mask R-CNN trained on CheXMix, our newly released dataset containing a mixture of open-source chest X-rays and coronal X-ray projections computed from annotated volumetric CTs.
We develop an automated pipeline for COVID-19 lung lesion segmentation on chest X-rays. Because little annotated chest X-ray data is publicly available, we implement a pixel-based algorithm that generates coronal X-ray projections from annotated volumetric CTs to augment the training dataset. A Mask R-CNN framework is then trained on this mixed dataset. The resulting model outperforms the published baseline we compared against while requiring only limited supervised training.
CT → CXR Conversion
We implement a pixel-based re-projection method, modeled as a sub-problem of ray tracing[1], to compute chest X-rays as coronal projections of volumes of axial CT slices:
The algorithm is implemented in Python as follows:
from PIL import Image
import numpy as np

def create_CXR(images):  # images is a list of paths to axial CT slice images
    # One output row per slice (z), one output column per x position
    width = Image.open(images[0]).convert('L').size[0]
    xray = np.zeros((len(images), width))
    for z in range(len(images)):
        img = Image.open(images[z]).convert('L')  # convert image to 8-bit grayscale
        WIDTH, HEIGHT = img.size  # PIL reports size as (width, height)
        data = list(img.getdata())  # convert image data to a list of integers
        # convert that to 2D list (list of lists of integers)
        pixels = [data[offset:offset+WIDTH] for offset in range(0, WIDTH*HEIGHT, WIDTH)]
        # Loop from left to right on the CT slice
        for x in range(WIDTH):
            # Sum y values in the current x column
            total = 0
            for y in range(HEIGHT):
                total += pixels[y][x]
            # Assign the sum to the point (x, z) on the coronal image - xray[z][x],
            # since z represents height (rows) and x represents length (columns).
            # Rows are flipped so the last slice ends up at the top of the image.
            xray[len(images) - 1 - z][x] = total
    return xray
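The same column-sum projection can also be written in vectorized form. The following is a minimal sketch, not the repository's code: it assumes the axial slices have already been loaded into a NumPy array of shape (Z, Y, X), and the function name `project_coronal` is hypothetical.

```python
import numpy as np

def project_coronal(volume):
    """Sum a (Z, Y, X) stack of axial slices along y to get a (Z, X) image."""
    volume = np.asarray(volume, dtype=np.float64)
    # Summing over axis 1 collapses the in-slice row (y) direction,
    # which is what the inner y loop does for each (z, x) pair.
    projection = volume.sum(axis=1)
    # Flip the rows so the last slice ends up in row 0, matching
    # xray[len(images) - 1 - z][x] in the loop version.
    return projection[::-1]

# Tiny synthetic volume: 3 slices, each 2 rows (y) by 4 columns (x)
demo = np.arange(24).reshape(3, 2, 4)
print(project_coronal(demo).shape)
```

The summed values are not rescaled, so they would typically need to be normalized to a displayable intensity range before being saved as an image.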
CheXMix: a chest X-ray dataset containing a mixture of patient X-rays and coronal CT projections with COVID-19 lung lesion annotations
We present CheXMix, the first publicly available, open-source chest X-ray dataset containing over 100 images (assembled from a variety of public sources) with COVID-19 lung lesion annotations produced by our Mask R-CNN model:
Lung Lesion Segmentation
We employ a naive implementation of the Mask R-CNN framework for the task of instance segmentation. In a Mask R-CNN architecture, training samples are fed into a ResNet-101 backbone network, convolved, and passed to the Region Proposal Network (RPN) to generate a set of proposed regions possibly containing lung lesions. Anchors corresponding with each region of interest are then passed through a series of feature maps to generate masks outlining COVID-19 lung lesions on the input chest X-ray. Object classes and bounding boxes are computed via a series of fully connected layers. The task of COVID-19 lung lesion segmentation is posed as a problem of binary classification between the image background and lung lesions. The final output is a predicted mask corresponding with the input chest X-ray, which can then be overlaid on the input image for clinical use.
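The final overlay step can be illustrated with a short NumPy sketch. This is an assumption about how a predicted binary mask might be blended onto a grayscale chest X-ray for display, not the code used in this project; the function name `overlay_mask`, the highlight color, and the `alpha` value are all hypothetical.

```python
import numpy as np

def overlay_mask(image, mask, color=(255, 0, 0), alpha=0.4):
    """Blend a binary mask onto a grayscale image and return an RGB uint8 array."""
    image = np.asarray(image, dtype=np.float64)
    mask = np.asarray(mask, dtype=bool)
    # Replicate the grayscale channel three times to get an RGB image.
    rgb = np.stack([image, image, image], axis=-1)
    # Alpha-blend the highlight color into the masked pixels only;
    # pixels outside the mask keep their original intensity.
    rgb[mask] = (1.0 - alpha) * rgb[mask] + alpha * np.asarray(color, dtype=np.float64)
    return np.clip(rgb, 0, 255).astype(np.uint8)

# Hypothetical 2x2 image with a single predicted lesion pixel
xray = np.full((2, 2), 100)
pred = np.array([[True, False], [False, False]])
print(overlay_mask(xray, pred, alpha=0.5)[0, 0])
```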
Environment Setup[2]
Our models were trained on a single GPU (Tesla P4 GPU provided by Google Colab, 16 GB memory). The code is implemented using TensorFlow v1, but is compatible with TensorFlow v2 and can be ported to the most recent version of TensorFlow if desired. To install all required dependencies, run the following:
pip install tensorflow==1.15.2 keras==2.1.0 Pillow scikit-image opencv-python numpy glob2 regex os-sys argparse matplotlib
Afterwards, set up the Mask R-CNN model:
git clone --quiet
cd ~/Mask_RCNN
pip install -q PyDrive
python install
cp ~/Mask_RCNN/samples/balloon/ ./
sed -i -- 's/balloon/lesion/g'
sed -i -- 's/Balloon/Lesion/g'
The following commands can be used to train the Mask R-CNN model:
# Train a new model starting from pre-trained ImageNet weights
python train --dataset='/path/to/data/' --weights=imagenet
# Train a new model starting from pre-trained COCO weights
python train --dataset='/path/to/data/' --weights=coco
# Continue training a model that you had trained earlier
python train --dataset='/path/to/data/' --weights=/path/to/weights/
# Continue training the last model you trained. This will find
# the last trained weights in the model directory.
python train --dataset='/path/to/data/' --weights=last
To train with data augmentation[3], run:
python train --dataset='/path/to/data/' --weights=imagenet/coco/last --aug='y'
The CT to X-ray re-projection algorithm can be executed in isolation as follows:
python ⟨path/to/CT/volume⟩ ⟨path/to/mask/volume⟩
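Since the script takes a mask volume alongside the CT volume, the lesion annotations presumably have to be carried into the projected image as well. One way this could be done is sketched below. It is an assumption, not a description of the released script: a projected pixel is marked as lesion whenever any voxel along its y column is annotated, and the name `project_mask` is hypothetical.

```python
import numpy as np

def project_mask(mask_volume):
    """Collapse a (Z, Y, X) annotation volume into a (Z, X) binary mask."""
    # Treat any non-zero label as lesion (assumed binary annotation).
    lesion = np.asarray(mask_volume) > 0
    # A projected pixel is lesion if any voxel along y is lesion.
    projected = lesion.any(axis=1)
    # Flip rows so the orientation matches the projected X-ray,
    # where the last slice is placed in the top row.
    return projected[::-1]

# Tiny synthetic annotation: one lesion voxel in slice 0
demo = np.zeros((3, 2, 4), dtype=np.uint8)
demo[0, 1, 2] = 1
print(project_mask(demo).astype(int))
```

An any-along-the-ray rule marks every pixel whose ray crosses a lesion, so thin lesions appear as prominently as thick ones; a thickness threshold would be an alternative design choice.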
Pre-trained Models
Training dataset | Train/test split | Data augmentation (y/n) | Checkpoint |
X-rays only | 60/40 | y | Download |
Mixed | 60/40 | y | Download |
X-rays only | 80/20 | y | Download |
Mixed | 80/20 | y | Download |
X-rays only | 80/20 | n | Download |
Mixed | 80/20 | n | Download |
Our model's results far exceed the few existing published baselines. For instance, Tang, Sun, and Li’s U-Net segmentation model* (the only published COVID-19 lung lesion segmentation framework with publicly available model schematics), when implemented and trained on Datasets 1 and 2, achieved Intersection over Union (IoU) scores of 0.38 ± 0.03 and 0.49 ± 0.03, respectively, both of which are significantly lower than our model’s corresponding IoU scores of 0.81 ± 0.03 and 0.79 ± 0.03. Since we trained and tested our model and the baseline model on the same datasets, our Mask R-CNN likely outperformed Tang, Sun, and Li’s U-Net segmentation architecture due to its structure as a series of recurring feature maps rather than contracting and expansive paths, the presence of the RPN, and its greater complexity in the form of a ResNet-101 backbone rather than a ResNet-18 backbone.
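For reference, the IoU between a predicted mask and a ground-truth mask can be computed as in the sketch below. It shows the standard definition only; how the scores above were aggregated across images is not specified here, and the function name `mask_iou` and the empty-mask convention are assumptions.

```python
import numpy as np

def mask_iou(pred, target):
    """Intersection over Union of two binary masks of the same shape."""
    pred = np.asarray(pred, dtype=bool)
    target = np.asarray(target, dtype=bool)
    intersection = np.logical_and(pred, target).sum()
    union = np.logical_or(pred, target).sum()
    # Assumed convention: two empty masks count as a perfect match.
    if union == 0:
        return 1.0
    return float(intersection) / float(union)

# One overlapping pixel out of three marked pixels in total
pred = np.array([[1, 1], [0, 0]])
target = np.array([[1, 0], [1, 0]])
print(mask_iou(pred, target))
```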
Furthermore, the fact that the model achieved similar results when trained on either Dataset 1 or Dataset 2 indicates that we can replace more than 83% of chest X-ray training images with X-ray projections generated from CTs while maintaining model accuracy.
Representative results are shown in the figures below.
What next?
A limitation of our study is that we used small amounts of publicly available data; however, our results still suggest that improved accuracy can be obtained by augmenting chest X-ray data with large numbers of frontal projections of public CT volumes. Training and testing our model on larger datasets could improve future results.