Review: MultiChannel — Segment Colon Histology Images (Biomedical Image Segmentation)
Foreground Segmentation using FCN + Edge Detection Using HED + Object Detection Using Faster R-CNN
SH Tsang, Feb 17

In this story, MultiChannel is briefly reviewed.
It is a deep multichannel neural network for gland instance segmentation.
This approach, as seen in the above figure, fuses the results of 3 sub-networks: foreground segmentation using FCN, edge detection using HED, and object detection using Faster R-CNN.
State-of-the-art results are achieved on the 2015 MICCAI Gland Segmentation Challenge dataset.
The authors first published MultiChannel in 2016 MICCAI using only 2 sub-networks: foreground segmentation using FCN and edge detection using HED.
They then enhanced the conference version by adding object detection using Faster R-CNN.
This enhanced version was published in 2017 TBE.
I have read both, but since the transactions version is much more detailed, I will present it here.
(SH Tsang @ Medium)

Outline
1. 1st Sub-Network: Foreground Segmentation Channel
2. 2nd Sub-Network: Edge Detection Channel
3. 3rd Sub-Network: Object Detection Channel
4. Fusing Multichannel
5. Comparison With State-of-the-art Approaches
6. Further Ablation Study

1. 1st Sub-Network: Foreground Segmentation Channel
FCN-32s is used as the network in the foreground segmentation channel.
However, FCN-32s produces a small output feature map, which is unfavourable for segmentation.
Dilated convolution, proposed in DilatedNet, is therefore used to enhance FCN.
The strides of pool4 and pool5 are set to 1, and the subsequent convolution layers use dilated convolution to enlarge the receptive field.
Softmax cross entropy loss is used while training.
Pre-trained FCN-32s is used.
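To see why changing the pool4 and pool5 strides matters, here is a small arithmetic sketch. It assumes a VGG-16 style backbone with five stride-2 poolings, an input side of 480 pixels (a hypothetical size, not from the paper), and illustrative dilation rates of 2 and 4, since the review does not state the rates actually used. It also ignores the padding tricks of the real FCN-32s.

```python
def output_size(input_size, strides):
    """Spatial size after a chain of strided layers (simple integer division)."""
    size = input_size
    for s in strides:
        size //= s
    return size


def effective_kernel(kernel, dilation):
    """Extent covered by a dilated kernel: k + (k - 1) * (d - 1)."""
    return kernel + (kernel - 1) * (dilation - 1)


# Five stride-2 poolings: overall output stride 32 (assumed VGG-16 layout).
fcn32s_strides = [2, 2, 2, 2, 2]
# pool4 and pool5 switched to stride 1: overall output stride 8.
dilated_strides = [2, 2, 2, 1, 1]

side = 480  # hypothetical input side length
coarse = output_size(side, fcn32s_strides)  # 15
fine = output_size(side, dilated_strides)   # 60

# A 3x3 kernel with illustrative dilation rates 2 and 4 covers 5 and 9 pixels.
spans = [effective_kernel(3, d) for d in (2, 4)]  # [5, 9]
```

The feature map is 4 times larger per side, while the dilated kernels keep covering a wide context without extra parameters.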
2. 2nd Sub-Network: Edge Detection Channel
The edge channel is based on the Holistically-nested Edge Detector (HED).
It learns hierarchically embedded multiscale edge fields to account for the low-, mid-, and high-level information of contours and object boundaries.
For the m-th prediction, the output is the sigmoid function σ applied to the feature map h().
The final output is a weighted fusion of the edge fields at different scales.
Sigmoid cross entropy loss is used during training.
Xavier initialization is used.
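The side outputs, their fusion and the loss can be sketched in a few lines of plain Python. This is only one reading of the description above: the fusion follows the original HED idea of taking the sigmoid of a weighted sum of the side feature maps, and the function names, the equal weights and the tiny 1×2 maps are all made up for illustration.

```python
import math


def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))


def side_output(feature_map):
    """Edge probability map for one scale: sigmoid applied element-wise."""
    return [[sigmoid(v) for v in row] for row in feature_map]


def fuse(feature_maps, weights):
    """Sigmoid of the weighted sum of same-sized side feature maps."""
    height, width = len(feature_maps[0]), len(feature_maps[0][0])
    return [
        [
            sigmoid(sum(w * fm[y][x] for w, fm in zip(weights, feature_maps)))
            for x in range(width)
        ]
        for y in range(height)
    ]


def sigmoid_cross_entropy(probs, labels, eps=1e-12):
    """Mean binary cross entropy between edge probabilities and 0/1 labels."""
    total, count = 0.0, 0
    for p_row, l_row in zip(probs, labels):
        for p, label in zip(p_row, l_row):
            total -= label * math.log(p + eps) + (1 - label) * math.log(1 - p + eps)
            count += 1
    return total / count


# Three hypothetical side feature maps (already resized to the same shape).
maps = [[[2.0, -2.0]], [[1.0, -1.0]], [[3.0, -3.0]]]
fused = fuse(maps, weights=[1 / 3, 1 / 3, 1 / 3])
loss = sigmoid_cross_entropy(fused, labels=[[1, 0]])
```

In a real network the fusion weights are learned together with the side outputs rather than fixed as here.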
Ground-truth edge labels are generated from the region labels.
A pixel is not an edge if its neighbouring (top, bottom, left, right) pixels are all foreground or all background; otherwise it is an edge.
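The labelling rule is easy to express in code. The sketch below assumes a binary foreground mask stored as a list of rows and simply skips neighbours that fall outside the image; both choices are assumptions of this example, not details given in the paper.

```python
def edge_labels(mask):
    """mask: list of rows of 0/1 (1 = gland foreground). Returns a 0/1 edge map."""
    height, width = len(mask), len(mask[0])
    edges = [[0] * width for _ in range(height)]
    for y in range(height):
        for x in range(width):
            neighbours = [
                mask[ny][nx]
                for ny, nx in ((y - 1, x), (y + 1, x), (y, x - 1), (y, x + 1))
                if 0 <= ny < height and 0 <= nx < width  # ignore out-of-image neighbours
            ]
            # Non-edge when the neighbours are all foreground or all background.
            if len(set(neighbours)) > 1:
                edges[y][x] = 1
    return edges


mask = [
    [0, 0, 0, 0, 0],
    [0, 1, 1, 1, 0],
    [0, 1, 1, 1, 0],
    [0, 1, 1, 1, 0],
    [0, 0, 0, 0, 0],
]
edges = edge_labels(mask)
```

For this 3×3 square, the centre pixel and the image corners are non-edge, while the pixels on both sides of the square's boundary are marked as edge.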
3. 3rd Sub-Network: Object Detection Channel
Faster R-CNN is used here, but with a modification.
A filling operation is applied after the region proposals are generated.
The value of each pixel in the regions covered by bounding boxes equals the number of bounding boxes it belongs to.
For example, if a pixel lies in the overlapping area of three bounding boxes, its value is three.
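A minimal sketch of such a filling step is shown below. The function name `fill` and the box format (inclusive integer pixel coordinates `x0, y0, x1, y1`) are assumptions made for this example.

```python
def fill(boxes, height, width):
    """Count, for every pixel, how many bounding boxes cover it.

    boxes: iterable of (x0, y0, x1, y1) with inclusive integer coordinates
    (this coordinate convention is an assumption of the sketch).
    """
    canvas = [[0] * width for _ in range(height)]
    for x0, y0, x1, y1 in boxes:
        # Clip each box to the image before counting.
        for y in range(max(y0, 0), min(y1, height - 1) + 1):
            for x in range(max(x0, 0), min(x1, width - 1) + 1):
                canvas[y][x] += 1
    return canvas


boxes = [(0, 0, 3, 3), (2, 2, 5, 5), (3, 1, 4, 3)]
filled = fill(boxes, height=6, width=6)
```

Here the pixel at row 3, column 3 lies inside all three boxes, so `filled[3][3]` is 3, while a pixel outside every box stays 0.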
The loss is the same as the one in Faster R-CNN, i.e. the sum of the classification loss and the regression loss.
Pre-trained Faster R-CNN is used.
Ground-truth bounding boxes are generated as the smallest rectangle enclosing each gland.
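Deriving these boxes from an instance-labelled ground truth is straightforward. The sketch below assumes an instance map where 0 is background and every gland has its own positive integer id; that encoding and the function name are assumptions for illustration.

```python
def bounding_boxes(instance_map):
    """Map each gland id (> 0) to its tight box (x0, y0, x1, y1), inclusive."""
    boxes = {}
    for y, row in enumerate(instance_map):
        for x, gland_id in enumerate(row):
            if gland_id == 0:  # 0 is assumed to mean background
                continue
            if gland_id not in boxes:
                boxes[gland_id] = (x, y, x, y)
            else:
                x0, y0, x1, y1 = boxes[gland_id]
                boxes[gland_id] = (min(x0, x), min(y0, y), max(x1, x), max(y1, y))
    return boxes


instance_map = [
    [0, 1, 1, 0, 0],
    [0, 1, 0, 0, 2],
    [0, 0, 0, 2, 2],
]
gt_boxes = bounding_boxes(instance_map)
# gt_boxes == {1: (1, 0, 2, 1), 2: (3, 1, 4, 2)}
```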
4. Fusing Multichannel
A 7-layer CNN is used.
Again, dilated convolution, as in DilatedNet, is used here to replace downsampling.
Xavier initialization is used.
5. Comparison With State-of-the-art Approaches
Dataset
MICCAI 2015 Gland Segmentation Challenge Contest
165 labeled colorectal cancer histological images
Most of the original images are 775×522.
Training set: 85 images
Test set: 80 images (test set A contains 60 images and test set B contains 20 images).
There are 37 benign sections and 48 malignant ones in the training set, 33 benign sections and 27 malignant ones in test set A, and 4 benign sections and 16 malignant ones in test set B.
Data Augmentation
Data Augmentation Strategy I: Horizontal flipping and rotation by 0°, 90°, 180° and 270°.
Data Augmentation Strategy II: Elastic transformation, like the one in U-Net.
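Strategy I can be sketched with plain Python lists. Whether the authors combine the flip with every rotation (giving 8 views per image, as below) or apply them separately is not stated here, so treat the combination as an assumption.

```python
def rot90(image):
    """Rotate a 2-D list of pixels by 90 degrees clockwise."""
    return [list(row) for row in zip(*image[::-1])]


def hflip(image):
    """Mirror a 2-D list of pixels left to right."""
    return [row[::-1] for row in image]


def augment(image):
    """Return the 0/90/180/270 degree rotations of the image and of its flip."""
    variants = []
    for base in (image, hflip(image)):
        current = base
        for _ in range(4):
            variants.append(current)
            current = rot90(current)
    return variants


image = [[1, 2],
         [3, 4]]
views = augment(image)  # 8 views; views[1] == [[3, 1], [4, 2]]
```

The same transformation has to be applied to the label mask of each image so that pixels and labels stay aligned.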
Evaluation
Three indicators are used: F1 Score, ObjectDice and ObjectHausdorff.
F1 Score: A score computed from precision P and recall R. A segmented object with more than 50% overlap is counted as a true positive.
ObjectDice: A metric for segmentation.
ObjectHausdorff: A metric for measuring the shape similarity.
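As a rough illustration of the first two indicators, objects can be represented as sets of pixel coordinates. Measuring the 50% overlap relative to the ground-truth object is an assumption of this sketch, and `dice` below is the plain pixel-level Dice index, not the object-level weighted variant used in the challenge.

```python
def overlap_ratio(segmented, ground_truth):
    """Fraction of the ground-truth pixels covered by the segmented object."""
    return len(segmented & ground_truth) / len(ground_truth)


def f1_score(tp, fp, fn):
    """F1 = 2PR / (P + R), with P = tp / (tp + fp) and R = tp / (tp + fn)."""
    if tp == 0:
        return 0.0
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return 2 * precision * recall / (precision + recall)


def dice(a, b):
    """Plain Dice index between two pixel sets: 2|A ∩ B| / (|A| + |B|)."""
    return 2 * len(a & b) / (len(a) + len(b))


ground_truth = {(0, 0), (0, 1), (1, 0), (1, 1)}
segmented = {(0, 0), (0, 1), (1, 1)}

is_true_positive = overlap_ratio(segmented, ground_truth) > 0.5  # 0.75 > 0.5
score = f1_score(tp=8, fp=2, fn=2)  # P = R = 0.8
d = dice(segmented, ground_truth)   # 6 / 7
```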
(For details, please read my review on CUMedVision2 / DCAN.)
RS and WRS are rank sums based on F1 score, ObjectDice and ObjectHausdorff.
We can see that MultiChannel ranks first on nearly every indicator on both the Part A and Part B test sets, which means that MultiChannel outperforms CUMedVision1, CUMedVision2 / DCAN, FCN and Dilated FCN (DeepLab).
Some qualitative results:
Comparison with Instance Segmentation Approaches
MultiChannel is better than all of the instance segmentation approaches, such as MNC.
Segmenting only within the bounding boxes (i.e. the second-last row) is also inferior to the fusion approach.
Edge3 means the edge is dilated by a disc filter with a radius of 3.
Some qualitative results:
6. Further Ablation Study
Data Augmentation
Using data augmentation strategy II (elastic transformation) gives better results.
Different Fusion Variants of MultiChannel
Edge3 means the edge is dilated by a disc filter with a radius of 3.
This increases the width of the edges so as to deal with the imbalance between edge and non-edge pixels during training.
First 3 rows: Without dilated convolution, the performance is inferior.
Last 2 rows: With only 2 channels (sub-networks) for fusion, the performance is also inferior.
Middle 3 rows: With dilated convolution plus 3 channels, the performance is the best.
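For readers wondering what the Edge3 dilation does in practice, here is a pure-Python sketch of binary dilation with a disc. The disc is defined as all offsets with dy² + dx² ≤ r², which is an assumption about the exact filter shape, and the function names are made up for this example.

```python
def disc_offsets(radius):
    """Offsets (dy, dx) inside a disc: dy^2 + dx^2 <= radius^2."""
    return [
        (dy, dx)
        for dy in range(-radius, radius + 1)
        for dx in range(-radius, radius + 1)
        if dy * dy + dx * dx <= radius * radius
    ]


def dilate(edge_map, radius):
    """Binary dilation of a 0/1 edge map with a disc of the given radius."""
    height, width = len(edge_map), len(edge_map[0])
    out = [[0] * width for _ in range(height)]
    offsets = disc_offsets(radius)
    for y in range(height):
        for x in range(width):
            if not edge_map[y][x]:
                continue
            for dy, dx in offsets:
                ny, nx = y + dy, x + dx
                if 0 <= ny < height and 0 <= nx < width:
                    out[ny][nx] = 1
    return out


# A single edge pixel in the middle of a 9x9 map.
edge_map = [[0] * 9 for _ in range(9)]
edge_map[4][4] = 1
thick = dilate(edge_map, radius=3)
edge_pixels = sum(map(sum, thick))  # 29 pixels instead of 1
```

One edge pixel grows to 29 pixels, so the edge class becomes far less rare relative to non-edge pixels during training.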
References
[2016 MICCAI] [MultiChannel] Gland Instance Segmentation by Deep Multichannel Side Supervision
[2017 TBE] [MultiChannel] Gland Instance Segmentation Using Deep Multichannel Neural Networks

My Previous Reviews
Image Classification: [LeNet] [AlexNet] [ZFNet] [VGGNet] [Highway] [SPPNet] [PReLU-Net] [STN] [DeepImage] [GoogLeNet / Inception-v1] [BN-Inception / Inception-v2] [Inception-v3] [Inception-v4] [Xception] [MobileNetV1] [ResNet] [Pre-Activation ResNet] [RiR] [RoR] [Stochastic Depth] [WRN] [FractalNet] [Trimps-Soushen] [PolyNet] [ResNeXt] [DenseNet] [PyramidNet] [DRN]
Object Detection: [OverFeat] [R-CNN] [Fast R-CNN] [Faster R-CNN] [DeepID-Net] [R-FCN] [ION] [MultiPathNet] [NoC] [G-RMI] [TDM] [SSD] [DSSD] [YOLOv1] [YOLOv2 / YOLO9000] [YOLOv3] [FPN] [RetinaNet] [DCN]
Semantic Segmentation: [FCN] [DeconvNet] [DeepLabv1 & DeepLabv2] [SegNet] [ParseNet] [DilatedNet] [PSPNet] [DeepLabv3] [DRN]
Biomedical Image Segmentation: [CUMedVision1] [CUMedVision2 / DCAN] [U-Net] [CFS-FCN] [U-Net+ResNet]
Instance Segmentation: [DeepMask] [SharpMask] [MultiPathNet] [MNC] [InstanceFCN] [FCIS]
Super Resolution: [SRCNN] [FSRCNN] [VDSR] [ESPCN] [RED-Net] [DRCN] [DRRN] [LapSRN & MS-LapSRN]