Can Machine Learning Read Chest X-rays like Radiologists?

Using adversarial networks to achieve human-level performance for chest x-ray organ segmentation

David W. Dai · Jun 24

Healthcare Needs AI to Scale

Today, only about 10% of the world's 7 billion people have access to good healthcare services, and half of the world does not even have access to essential health services.
Even in developed countries, healthcare systems are under strain, with rising costs and long wait times.
To train up enough physicians and care providers for the growing demands within a short period of time is impractical, if not impossible.
The solution has to involve technological breakthroughs.
And that’s where Machine Learning (ML) and Artificial Intelligence (AI) can make a big impact.
In this post, I will introduce a simple but extremely effective Deep Learning approach I developed to understand chest x-ray images.
More details can be found in the original paper.
There are LOTs of Chest X-rays (CXRs)

CXRs are the most common type of medical imaging, often 2x-10x more common than advanced imaging methods such as MRI, CT, and PET scans:

Among the 20+ million X-rays, over 8 million are chest x-rays, making chest x-rays the most common standardized medical imaging.
(source)

Some reasons CXRs are so widely used: (1) lower dose of radiation; (2) lower cost; (3) taking an image requires less than a minute (compared with, say, an hour or more for a CT scan).
As a result, CXRs are widely used as a screening tool.
If there's something wrong in your lungs that requires more evidence to diagnose, your doctor usually first prescribes a CXR.
A CXR provides a low-fidelity view that paves the way to other, more sophisticated imaging methods.
From talking to radiologists I learned that a sizable hospital can generate hundreds if not thousands of CXRs a day, all of which need to be read by a radiologist or, less preferably, other physicians.
And it is often paramount for the reading to be done within hours to detect urgent conditions (such as those that develop in in-patients).
In short, reading CXR is quite a demanding task for radiologists and physicians alike.
CXR Reading Involves Many Steps and Can Be Time Consuming

The average time it takes a well-trained radiologist to read a CXR is about 1–2 minutes.
It is hard to speed that up because CXR reading is a very systematic process.
One popular mnemonic for CXR reading is the following: ABCDEFGHI.
A for airways, B for bones, C for cardiac… you get the idea.
It’s not exactly short, and taking shortcuts means risking overlooking important findings.
From working on CXRs I also realized that reading them is actually very hard.
I once brought CXRs with confirmed tuberculosis (TB) diagnoses to a general physician, and for the most part he could not tell which patients were TB positive.
The radiology resident I spoke to told me that during their residency program they read about 10,000 CXR images to become proficient.
This reminds me that most professional baseball batters in MLB need to swing 10,000 times to be able to hit the ball.
It seems that it takes that amount of training data for humans to start recognizing the patterns in CXRs.
This steep learning curve might be due to the fact that CXR is so different from natural images we are trained on throughout our lives.
This turned out to be a hurdle for AI system as well, which we shall revisit later.
Radiologists are in Dire Shortage

So far we have only been talking about CXRs.
As CT scans and other imaging technology becomes more popular, radiologists’ workloads will increase dramatically.
The chronic shortage of radiologists in the developed world is well documented.
For example, the UK publishes reports on clinical radiology, and the main finding for several years has been the "increased workforce shortages and spiraling costs. The radiology workforce is showing signs of stress and burnout".
The shortage of trained radiologists is even more severe in developing countries where the healthcare infrastructure lags behind.
Organ Segmentation in CXR

A fundamental task in understanding CXR is to recognize the lung fields and the heart regions:

Left: CXR from the Japanese Society of Radiological Technology. Right: The same CXR overlaid with human-labeled left lung, right lung, and heart contours.
There is actually a lot of information in the lung and heart contours: an abnormally large heart might suggest cardiomegaly (the abnormal enlargement of the heart); blunting of the costophrenic angle (#3 in the image below) might suggest pleural effusion.
It can also be helpful to restrict diagnostic AI algorithms to the lung fields, minimizing spurious signals from other parts of the image.
(This is a known issue, as neural net classifiers can sometimes exploit artifacts in the CXR such as exposure differences and text markings.)

Important contour landmarks around the lung fields: the aortic arch (1) is excluded from the lung fields; the costophrenic angles (3) and cardiodiaphragmatic angles (2) should be visible in healthy patients. The hila and other vascular structures (4) are part of the lung fields. The rib cage contour (5) should be clear in healthy lungs.
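Restricting analysis to the lung fields is, mechanically, a simple masking operation. Here is a minimal NumPy sketch, assuming a hypothetical predicted boolean lung mask (function and variable names are mine, not from the paper):

```python
import numpy as np

# Zero out everything outside a (hypothetical) predicted lung mask, so a
# downstream classifier cannot pick up signals from text markings, exposure
# artifacts, or regions outside the thorax.
def apply_lung_mask(cxr: np.ndarray, lung_mask: np.ndarray) -> np.ndarray:
    """cxr: 2-D grayscale image; lung_mask: boolean array of the same shape."""
    return np.where(lung_mask, cxr, 0)

cxr = np.full((4, 4), 100)           # toy image with uniform intensity
mask = np.zeros((4, 4), dtype=bool)
mask[1:3, 1:3] = True                # pretend these 4 pixels are lung
masked = apply_lung_mask(cxr, mask)
print(int(masked.sum()))             # 400: only the 4 lung pixels survive
```

In practice the mask would come from the segmentation model introduced later, and you might dilate it slightly so boundary pixels are not clipped.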
Clinical Applications with CXR Segmentation

In addition to assisting computer-aided diagnosis, CXR segmentation directly enables automated calculation of the cardiothoracic ratio (CTR).
CTR is simply the width of the heart divided by the width of the lungs (see image below).
CTR is a key clinical indicator: CTR > 0.5 suggests cardiomegaly, or the enlargement of the heart, which often results from heart disease or prior heart attacks.
Measuring CTR manually is very tedious: it involves pinpointing the left-most and right-most points of the heart and the lungs, and then actually taking the measurements.
As a result, most radiologists simply skip this measurement and just eyeball whether the heart is too large.
In some countries, such as China, CXR readers are required to take explicit CTR measurements, which can significantly increase radiologist workloads.
It's easy to see that high-quality lung segmentation leads directly to automated CTR calculation:

These are CTR measurement lines computed from the lung masks generated by our method (to be introduced in Part 2).
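Once you have binary organ masks, the CTR computation reduces to a few lines. Here is a minimal sketch, assuming hypothetical 2-D boolean masks from a segmentation model (names and shapes are illustrative, not the paper's exact pipeline):

```python
import numpy as np

# Compute the cardiothoracic ratio (CTR) from binary masks: the heart width
# is the horizontal extent of the heart mask, and the thoracic width is the
# horizontal extent of the combined lung masks.
def cardiothoracic_ratio(heart_mask: np.ndarray, lung_mask: np.ndarray) -> float:
    """Masks are 2-D boolean arrays of the same shape (True = organ pixel)."""
    def width(mask):
        cols = np.where(mask.any(axis=0))[0]  # columns containing the organ
        return cols.max() - cols.min() + 1
    return width(heart_mask) / width(lung_mask)

# Toy example: heart spans columns 3-6 (width 4), lungs span columns 1-8
# (width 8), so CTR = 0.5 -- right at the cardiomegaly threshold.
heart = np.zeros((10, 10), dtype=bool); heart[4:7, 3:7] = True
lungs = np.zeros((10, 10), dtype=bool); lungs[2:9, 1:9] = True
print(cardiothoracic_ratio(heart, lungs))  # 0.5
```

On real masks you would likely also smooth or take the largest connected component first, so a stray false-positive pixel cannot inflate the measured widths.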
Indeed, in our follow-up work we found that our automated CTR calculation is highly accurate, with a root mean square error (RMSE) of only 6%, which is comparable to, and possibly better than, existing work such as (Dallal et al 2017)^.
^The numbers aren't directly comparable, as we don't have access to their dataset.
Challenges of Segmenting CXR with Neural Networks

Challenge #1: Implicit Medical Knowledge

Because a CXR is a 2-D projection of a 3-D human body, many physiological structures lie on top of each other in the image, and it is often a judgment call where to draw the boundary.
Take the following case as an example:

Left: CXR with mild deformity. Right: Human-labeled left and right lung regions. (Source)

The image exhibits some scarring in the left lower lobe (the right side of the image) as well as in the apex of the left lung.
They blur the lung contour substantially.
Therefore the red contour has to be drawn by inferring the lung shape using medical knowledge.
The segmentation model must acquire a global concept of contour shape in order to resolve the local ambiguity around the blurred boundaries and produce the correct contour like those by human labelers.
Challenge #2: Non-natural Images

CXR images look nothing like the natural images we see in everyday life:

Most existing computer vision neural networks are designed for colorful natural images and take advantage of the rich textures present in them.
This makes it hard to directly apply off-the-shelf solutions to CXR.
Challenge #3: Small Training Data

Public medical image datasets for CXR are much smaller than natural image datasets, due to privacy concerns and administrative barriers, among other reasons.
Furthermore, unlike natural images, which can be labeled by almost any annotator, medical image labeling can only be done by doctors and trained professionals, making label acquisition expensive.
To my knowledge, there are only two publicly available CXR datasets with pixel-level labels of the lung fields, one with 247 images and the other with 138.
This is at least 3,000 times smaller than the ImageNet challenge, which has anywhere from 1.2 million to 14 million labeled images.
In fact, neural nets trained on the ImageNet dataset are so powerful that practically all existing neural net segmentation models are initialized with parameters learned on the ImageNet challenge (such as from ResNet or VGG).
It's not clear a priori whether such a small dataset is enough for data-hungry neural nets with millions to hundreds of millions of parameters.
Sneak Peek of the Solution

In Part 2 of the series we design our models to address each of the challenges above.
Here's a quick preview:

Unlike natural images, CXRs are grayscale and highly standardized (challenge #2).
This observation led us to design the segmentation network to use much fewer convolutional channels compared with networks used on the ImageNet dataset with diverse colors and shapes.
This change unfortunately makes it impractical to do transfer learning from ImageNet-trained models.
However, by using fewer filters, our model has very few parameters (small model capacity), which minimizes the risk of overfitting on the small training data (challenge #3).
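The effect of narrower layers on parameter count is easy to see with back-of-the-envelope arithmetic. The channel widths below are illustrative, not the exact architecture from the paper:

```python
# Parameter count of a single k x k convolutional layer (weights plus biases).
def conv_params(in_ch: int, out_ch: int, k: int = 3) -> int:
    return in_ch * out_ch * k * k + out_ch

# A VGG-style first block for RGB images: 3 -> 64 -> 64 channels.
wide = conv_params(3, 64) + conv_params(64, 64)
# A slimmed-down block for grayscale CXRs: 1 -> 8 -> 8 channels.
narrow = conv_params(1, 8) + conv_params(8, 8)

print(wide, narrow)  # 38720 664 -- roughly a 58x reduction
```

Since parameter count in the deeper layers grows with the product of input and output channels, shrinking the width pays off quadratically as the network gets wider.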
Finally, perhaps the most challenging, is how to teach the segmentation model the medical knowledge humans possess (challenge #1).
The key insight here is to use adversarial learning to guide the segmentation model to generate more natural images, which we will show in Part 2 to be highly effective.
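The training signal behind this idea can be sketched in a few lines. This is a highly simplified illustration, not the paper's exact losses: all names, shapes, and the weighting factor `lam` are assumptions of mine. The segmentation network is trained on per-pixel cross-entropy plus a term that rewards fooling a discriminator into rating its masks as human-labeled:

```python
import numpy as np

def pixel_cross_entropy(pred: np.ndarray, target: np.ndarray) -> float:
    """Mean binary cross-entropy over all pixels."""
    eps = 1e-7
    p = np.clip(pred, eps, 1 - eps)
    return float(-(target * np.log(p) + (1 - target) * np.log(1 - p)).mean())

def segmentor_loss(pred_mask, true_mask, disc_score, lam=0.01):
    """disc_score: discriminator's probability that pred_mask is a real label."""
    seg = pixel_cross_entropy(pred_mask, true_mask)
    adv = -np.log(max(disc_score, 1e-7))  # small when the discriminator is fooled
    return seg + lam * adv

# Toy 2x2 "mask": mostly correct predictions, discriminator undecided (0.5).
pred = np.array([[0.9, 0.1], [0.8, 0.2]])
true = np.array([[1.0, 0.0], [1.0, 0.0]])
loss = segmentor_loss(pred, true, disc_score=0.5)
print(round(loss, 3))  # 0.171
```

The adversarial term acts as a learned shape prior: contours that a discriminator can easily flag as machine-generated (for example, implausible lung silhouettes) incur extra loss, pushing the segmentor toward anatomically plausible shapes.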
The architecture of the final solution that addresses all of the challenges looks like this:

This is Part 1 of a two-part series.
See Part 2 for details of the model design and performance.
About the author: David Dai is a Senior Machine Learning Engineer at Apple, an advisor at Wayfinder AI, and a former Senior Director of Engineering at Petuum.
He holds a PhD in Machine Learning from Carnegie Mellon University and was named to Pittsburgh's 30 Under 30.
@daiwei89 | Medium | david@wayfinder.
References

- SCAN: Structure Correcting Adversarial Network for Organ Segmentation in Chest X-rays
- World Bank and WHO: Half the world lacks access to essential health services, 100 million still pushed into extreme poverty because of health expenses
- Diagnostic Imaging Dataset by National Health Service, UK
- Chest radiograph assessment using ABCDEFGHI
- Clinical radiology UK workforce census 2018 report
- Unsupervised Domain Adaptation for Automatic Estimation of Cardiothoracic Ratio
- Automatic estimation of heart boundaries and cardiothoracic ratio from chest x-ray images
- ImageNet-trained CNNs are biased towards texture; increasing shape bias improves accuracy and robustness
- Development of a digital image database for chest radiographs with and without a lung nodule: receiver operating characteristic analysis of radiologists' detection of pulmonary nodules
- Two public chest X-ray datasets for computer-aided screening of pulmonary diseases