In this tutorial we will consider colorectal histology tissue classification using the ResNet architecture and the PyTorch framework.
Introduction
Recently, machine learning (ML) applications have become widespread in the healthcare industry: the omics fields (genomics, transcriptomics, proteomics), drug discovery, radiology, and digital histology. Deep-learning-based image analysis in histopathology covers different tasks (e.g., classification, semantic segmentation, detection, and instance segmentation) and various additional applications (e.g., stain normalization, cell/gland/region structure analysis); the main goal of ML in this field is the automatic detection, grading, and prognosis of cancer. However, digital pathology poses several challenges. Histology slides are usually large hematoxylin and eosin (H&E) stained images with color variations and artifacts, and different levels of magnification yield different levels of information (cell/gland/region). A single Whole Slide Image (WSI) is a multi-gigabyte image with a typical resolution of 100,000 x 100,000 pixels. In the supervised classification scenario considered in this article, a WSI is usually divided into patches with some stride; a CNN architecture then extracts feature vectors from the patches, which can be passed to traditional machine learning algorithms such as SVM or gradient boosting for further processing.
Typical steps for machine learning in digital pathological image analysis.
In this article we will apply the ResNet CNN architecture to classify colon tissue types; we will consider patches with labels such as mucosa, tumor, stroma, and lympho. We won't consider the transfer learning case and will train the CNN from scratch, because pretrained weights were obtained from ImageNet images, which are unrelated to histology and won't help the model converge faster.
Dataset
As a dataset I selected a collection of textures in colorectal cancer histology, which can be considered an MNIST for biologists. You can find this dataset on Zenodo or on the Kaggle platform.
The dataset contains two zipped folders:
- “Kather_texture_2016_image_tiles_5000.zip”: a zipped folder containing 5000 histological images of 150 x 150 px each (74 x 74 µm). Each image belongs to exactly one of eight tissue categories (specified by the folder name).
- “Kather_texture_2016_larger_images_10.zip”: a zipped folder containing 10 larger histological images of 5000 x 5000 px each. These images contain more than one tissue type.
All images are RGB, 0.495 µm per pixel, digitized with an Aperio ScanScope (Aperio/Leica biosystems), magnification 20x. Histological samples are fully anonymized images of formalin-fixed paraffin-embedded human colorectal adenocarcinomas (primary tumors) from pathology archive (Institute of Pathology, University Medical Center Mannheim, Heidelberg University, Mannheim, Germany).
Colorectal MNIST images classification with ResNet
Import the necessary libraries and list the input directory with the data to observe the folder structure and stored files. To run the kernel I used Kaggle Notebooks, where you can import the appropriate data directly without downloading it.
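The listing step might look roughly like this; the exact Kaggle input path is an assumption based on the dataset name and may differ in your environment:

```python
import os

def list_dataset_dirs(data_dir):
    """Return the entries of the dataset directory, sorted by name."""
    return sorted(os.listdir(data_dir))

# On Kaggle the dataset is mounted read-only, e.g. (path is an assumption):
# list_dataset_dirs("/kaggle/input/colorectal-histology-mnist")
```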
In [1]:
Helpers
In [2]:
We will use the directory with small 150 x 150 images. To feed the images into the ResNet CNN model, they need to be resized to 224 x 224.
In [3]:
Data exploration
Here we can see 8 folders with names corresponding to the labels for our model.
In [4]:
['03_COMPLEX',
'08_EMPTY',
'04_LYMPHO',
'01_TUMOR',
'02_STROMA',
'06_MUCOSA',
'05_DEBRIS',
'07_ADIPOSE']
In [5]:
['CRC-Prim-HE-05_APPLICATION.tif',
'CRC-Prim-HE-04_APPLICATION.tif',
'CRC-Prim-HE-10_APPLICATION.tif',
'CRC-Prim-HE-06_APPLICATION.tif',
'CRC-Prim-HE-03_APPLICATION.tif',
'CRC-Prim-HE-01_APPLICATION.tif',
'CRC-Prim-HE-08_APPLICATION.tif',
'CRC-Prim-HE-02-APPLICATION.tif',
'CRC-Prim-HE-07_APPLICATION.tif',
'CRC-Prim-HE-09_APPLICATION.tif']
Let’s check the number of images per folder and select random samples to display from the input dataset.
In [6]:
03_COMPLEX 625
08_EMPTY 625
04_LYMPHO 625
01_TUMOR 625
02_STROMA 625
06_MUCOSA 625
05_DEBRIS 625
07_ADIPOSE 625
In [7]:
5C8E_CRC-Prim-HE-08_005.tif_901_Col_1.tif
1429C_CRC-Prim-HE-06_005.tif_5401_Col_6451.tif
6408_CRC-Prim-HE-05_004.tif_451_Col_1.tif
154F0_CRC-Prim-HE-09_024.tif_151_Col_151.tif
13F70_CRC-Prim-HE-07_014.tif_751_Col_1351.tif
1754A_CRC-Prim-HE-06_001.tif_601_Col_751.tif
1688C_CRC-Prim-HE-08_023.tif_451_Col_151.tif
16CE8_CRC-Prim-HE-03_012.tif_1801_Col_901.tif
In [8]:
Form a DataFrame with image paths and corresponding labels to use in the PyTorch Dataset class.
In [9]:
In [10]:
| | img_path | label |
|---|---|---|
| 0 | /kaggle/input/colorectal-histology-mnist/kathe... | 03_COMPLEX |
| 1 | /kaggle/input/colorectal-histology-mnist/kathe... | 03_COMPLEX |
| 2 | /kaggle/input/colorectal-histology-mnist/kathe... | 03_COMPLEX |
| 3 | /kaggle/input/colorectal-histology-mnist/kathe... | 03_COMPLEX |
| 4 | /kaggle/input/colorectal-histology-mnist/kathe... | 03_COMPLEX |
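A sketch of how such a DataFrame can be assembled from the class sub-folders; `build_dataframe` and the column names are illustrative, not the notebook's exact code:

```python
import os
import pandas as pd

def build_dataframe(data_dir):
    """Collect (image path, folder-name label) pairs into a DataFrame.

    Assumes one sub-folder per tissue class, as in this dataset.
    """
    records = []
    for label in sorted(os.listdir(data_dir)):
        class_dir = os.path.join(data_dir, label)
        if not os.path.isdir(class_dir):
            continue
        for fname in sorted(os.listdir(class_dir)):
            records.append({"img_path": os.path.join(class_dir, fname),
                            "label": label})
    return pd.DataFrame(records)
```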
Let’s map the string labels to integers (a label-encoding procedure).
In [11]:
In [12]:
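A minimal sketch of the label-encoding step with a hand-built mapping; the tiny example DataFrame stands in for the real one:

```python
import pandas as pd

# Tiny stand-in for the real DataFrame with a string `label` column.
df = pd.DataFrame({"label": ["03_COMPLEX", "01_TUMOR", "03_COMPLEX"]})

# Map each class name to a stable integer id.
classes = sorted(df["label"].unique())
label2id = {name: idx for idx, name in enumerate(classes)}
df["label_id"] = df["label"].map(label2id)
print(label2id)  # -> {'01_TUMOR': 0, '03_COMPLEX': 1}
```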
PyTorch Dataset, Dataloaders and Transforms preparation
In [13]:
Here are the basic transforms for the train, validation, and test datasets; you can add other augmentations to increase the variance of the data.
In [14]:
Split the data into train, validation, and test sets. The train set is used to adjust the weights, the validation set for hyperparameter optimization, and the test set for evaluating model performance.
In [15]:
In [16]:
Train DF shape: (4000, 3)
Valid DF shape: (200, 3)
Test DF shape: (800, 3)
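A stratified split reproducing the 4000/200/800 row counts might look like this; the synthetic DataFrame and the `random_state` value are placeholders:

```python
import pandas as pd
from sklearn.model_selection import train_test_split

# Synthetic stand-in: 5000 rows, 8 balanced classes, as in the dataset.
df = pd.DataFrame({
    "img_path": [f"img_{i}.tif" for i in range(5000)],
    "label": [f"class_{i % 8}" for i in range(5000)],
})

# 4000 train rows, then 200 validation / 800 test from the remainder.
train_df, rest_df = train_test_split(
    df, train_size=4000, stratify=df["label"], random_state=42)
valid_df, test_df = train_test_split(
    rest_df, train_size=200, stratify=rest_df["label"], random_state=42)

print(train_df.shape, valid_df.shape, test_df.shape)
```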
Create the dataset objects and corresponding data loaders.
In [17]:
In [18]:
In [19]:
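A minimal custom `Dataset` consistent with the path/label DataFrame built earlier; `HistologyDataset` and the `label_id` column name are illustrative, not the notebook's exact code:

```python
from PIL import Image
from torch.utils.data import Dataset, DataLoader

class HistologyDataset(Dataset):
    """Reads (image path, integer label) pairs from a DataFrame-like object."""

    def __init__(self, df, transform=None):
        self.paths = list(df["img_path"])
        self.labels = list(df["label_id"])
        self.transform = transform

    def __len__(self):
        return len(self.paths)

    def __getitem__(self, idx):
        img = Image.open(self.paths[idx]).convert("RGB")
        if self.transform is not None:
            img = self.transform(img)
        return img, self.labels[idx]

# Usage sketch (batch size is an assumption; `train_transform` is whatever
# transform pipeline you defined):
# loader = DataLoader(HistologyDataset(train_df, transform=train_transform),
#                     batch_size=64, shuffle=True, num_workers=2)
```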
Train loop
In [20]:
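The epoch loop can be sketched as a single function; this is a simplified version of what a training cell like this typically contains, not the notebook's exact code:

```python
import torch

def run_epoch(model, loader, criterion, optimizer, device, train=True):
    """One pass over `loader`; returns (mean loss, accuracy)."""
    model.train(train)
    total_loss, correct, seen = 0.0, 0, 0
    # Gradients are only tracked in training mode.
    with torch.set_grad_enabled(train):
        for images, labels in loader:
            images, labels = images.to(device), labels.to(device)
            outputs = model(images)
            loss = criterion(outputs, labels)
            if train:
                optimizer.zero_grad()
                loss.backward()
                optimizer.step()
            total_loss += loss.item() * images.size(0)
            correct += (outputs.argmax(dim=1) == labels).sum().item()
            seen += images.size(0)
    return total_loss / seen, correct / seen
```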
Model setup and training
Here a ResNet model with 50 layers is used; we replace the last linear layer to match the required number of classes. Additionally, a step scheduler reduces the learning rate of the Adam optimizer every 7 epochs.
In [21]:
In [22]:
Epoch 0/9:
100%|██████████| 63/63 [00:25<00:00, 2.46it/s]
train Loss: 1.0313 Acc: 0.6414
100%|██████████| 4/4 [00:01<00:00, 3.25it/s]
val Loss: 0.7102 Acc: 0.7578
Saving model for best loss
Best_loss: 0.7102
Best_acc_score: 0.7578
Epoch 1/9:
100%|██████████| 63/63 [00:26<00:00, 2.42it/s]
train Loss: 0.6983 Acc: 0.7587
100%|██████████| 4/4 [00:01<00:00, 3.25it/s]
val Loss: 4.5117 Acc: 0.6250
Best_loss: 0.7102
Best_acc_score: 0.7578
Epoch 2/9:
100%|██████████| 63/63 [00:25<00:00, 2.45it/s]
train Loss: 0.6180 Acc: 0.7862
100%|██████████| 4/4 [00:01<00:00, 3.40it/s]
val Loss: 0.4103 Acc: 0.8477
Saving model for best loss
Best_loss: 0.4103
Best_acc_score: 0.8477
Epoch 3/9:
100%|██████████| 63/63 [00:25<00:00, 2.43it/s]
train Loss: 0.5134 Acc: 0.8209
100%|██████████| 4/4 [00:01<00:00, 3.32it/s]
val Loss: 0.3510 Acc: 0.8516
Saving model for best loss
Best_loss: 0.3510
Best_acc_score: 0.8516
Epoch 4/9:
100%|██████████| 63/63 [00:25<00:00, 2.47it/s]
train Loss: 0.4767 Acc: 0.8182
100%|██████████| 4/4 [00:01<00:00, 3.18it/s]
val Loss: 2.5895 Acc: 0.6641
Best_loss: 0.3510
Best_acc_score: 0.8516
Epoch 5/9:
100%|██████████| 63/63 [00:25<00:00, 2.46it/s]
train Loss: 0.4765 Acc: 0.8259
100%|██████████| 4/4 [00:01<00:00, 3.25it/s]
val Loss: 0.4351 Acc: 0.8398
Best_loss: 0.3510
Best_acc_score: 0.8516
Epoch 6/9:
100%|██████████| 63/63 [00:26<00:00, 2.42it/s]
train Loss: 0.3701 Acc: 0.8676
100%|██████████| 4/4 [00:01<00:00, 3.34it/s]
val Loss: 0.1979 Acc: 0.9414
Saving model for best loss
Best_loss: 0.1979
Best_acc_score: 0.9414
Epoch 7/9:
100%|██████████| 63/63 [00:25<00:00, 2.45it/s]
train Loss: 0.3140 Acc: 0.8886
100%|██████████| 4/4 [00:01<00:00, 3.49it/s]
val Loss: 0.1852 Acc: 0.9414
Saving model for best loss
Best_loss: 0.1852
Best_acc_score: 0.9414
Epoch 8/9:
100%|██████████| 63/63 [00:26<00:00, 2.42it/s]
train Loss: 0.2892 Acc: 0.9000
100%|██████████| 4/4 [00:01<00:00, 3.30it/s]
val Loss: 0.1873 Acc: 0.9375
Best_loss: 0.1852
Best_acc_score: 0.9414
Epoch 9/9:
100%|██████████| 63/63 [00:25<00:00, 2.45it/s]
train Loss: 0.2876 Acc: 0.9015
100%|██████████| 4/4 [00:01<00:00, 2.71it/s]
val Loss: 0.1765 Acc: 0.9414
Saving model for best loss
Best_loss: 0.1765
Best_acc_score: 0.9414
Validation and test results
We can see that our model quickly converged to good results.
In [23]:
In [24]:
100%|██████████| 13/13 [00:04<00:00, 3.17it/s]
In [25]:
In [26]:
In [27]:
Confusion matrix, without normalization
[[98 0 1 0 0 1 0 0]
[ 0 88 5 0 7 0 0 0]
[ 2 11 83 3 0 1 0 0]
[ 0 0 5 95 0 0 0 0]
[ 0 4 2 0 89 1 4 0]
[ 1 0 2 4 2 91 0 0]
[ 0 0 0 0 1 0 96 3]
[ 0 0 0 0 0 0 1 99]]
In [28]:
precision recall f1-score support
01_TUMOR 0.97 0.98 0.98 100
02_STROMA 0.85 0.88 0.87 100
03_COMPLEX 0.85 0.83 0.84 100
04_LYMPHO 0.93 0.95 0.94 100
05_DEBRIS 0.90 0.89 0.89 100
06_MUCOSA 0.97 0.91 0.94 100
07_ADIPOSE 0.95 0.96 0.96 100
08_EMPTY 0.97 0.99 0.98 100
accuracy 0.92 800
macro avg 0.92 0.92 0.92 800
weighted avg 0.92 0.92 0.92 800
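The evaluation metrics above can be computed with scikit-learn; `y_true` and `y_pred` here are tiny stand-ins for the real test-set labels and model predictions:

```python
from sklearn.metrics import confusion_matrix, classification_report

# Stand-in integer class ids; in the notebook these come from the test loader
# and the trained model's argmax predictions.
y_true = [0, 0, 1, 1, 2]
y_pred = [0, 1, 1, 1, 2]

cm = confusion_matrix(y_true, y_pred)  # rows = true class, cols = predicted
print(cm)
print(classification_report(y_true, y_pred))
```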
Conclusion
We trained the ResNet-50 model for 10 epochs, and it reached good accuracy. From the results on the test dataset we can see that the tumor and empty classes are recognized with an F1 score of 0.98; the most confusable label is complex, which probably represents combinations of other tissue types.