High-precision neural-network discrimination of human plasma samples to detect pancreatic cancer using specialized data-augmentation method
Meiyappan Solaiyappan1, Santosh Kumar Bharti1, Paul T Winnard Jr1, Mohamad Dbouk2, Michael G Goggins2,3,4, and Zaver M Bhujwalla1,3,5
1Department of Radiology, The Johns Hopkins University School of Medicine, Baltimore, MD, United States, 2Department of Pathology, The Johns Hopkins University School of Medicine, Baltimore, MD, United States, 3Department of Oncology, The Johns Hopkins University School of Medicine, Baltimore, MD, United States, 4Department of Medicine, The Johns Hopkins University School of Medicine, Baltimore, MD, United States, 5Department of Radiation Oncology and Molecular Radiation Sciences, The Johns Hopkins University School of Medicine, Baltimore, MD, United States
We developed and demonstrated an artificial neural network, designed using a specialized data-augmentation technique, that discriminates human plasma samples with high precision to provide early, specific detection of pancreatic cancer (pancreatic ductal adenocarcinoma, PDAC).
Figure 2: (a) The scatter plot shows the 2D embedding of the neural network's hidden-layer output, illustrating well-separated clusters of control (green), benign (blue), and malignant (red) samples with very little overlap. This clustering gives a visual indication of the high-precision discrimination obtained in the final output. (b) Receiver operating characteristic (ROC) curves show the sensitivity and specificity performance of the neural network, with the area under the curve (AUC) above 0.95 for all three classes.
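As an illustration of how a per-class AUC like those in Figure 2(b) can be obtained, the sketch below computes a one-vs-rest AUC from classifier scores using the rank (Mann-Whitney) identity. The function name `auc_one_vs_rest`, the score values, and the labels are hypothetical and are not taken from the study; the authors' actual tooling is not stated.

```python
def auc_one_vs_rest(scores, labels, positive):
    """One-vs-rest AUC for class `positive`.

    Uses the identity AUC = P(score of a positive sample > score of a
    negative sample), counting ties as one half.
    """
    pos = [s for s, y in zip(scores, labels) if y == positive]
    neg = [s for s, y in zip(scores, labels) if y != positive]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative sample")

    wins = 0.0
    for p in pos:
        for q in neg:
            if p > q:
                wins += 1.0
            elif p == q:
                wins += 0.5
    return wins / (len(pos) * len(neg))


# Hypothetical example data (NOT from the study): the network's output
# score for the "malignant" class on six plasma samples.
labels = ["control", "control", "benign", "benign", "malignant", "malignant"]
malignant_scores = [0.1, 0.2, 0.3, 0.6, 0.5, 0.9]

auc_malignant = auc_one_vs_rest(malignant_scores, labels, "malignant")
print(f"malignant-vs-rest AUC: {auc_malignant:.3f}")
```

Repeating the call with the control and benign scores would give the other two curves' AUC values. The pairwise loop is quadratic in the number of samples, which is acceptable for small cohorts; a sort-based rank computation would scale better.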
Figure 3: Confusion matrix of the cancer plasma predictions. The green diagonal boxes show the correct predictions for each class, and the red boxes indicate misclassifications. The numbers in each box give the number of samples (and their percentage of the total data). The right-most column shows the precision for each predicted class (in green). The bottom row shows the prediction accuracy for each class (in green), and the bottom-right corner box shows the overall accuracy (in green) and error rate (in red). Cancer plasma classification resulted in 95.2% correct predictions.
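The summary quantities described in the Figure 3 caption can be derived from the raw counts as in the following minimal sketch. It assumes rows index the predicted class and columns index the true class, which matches the caption's "precision for each predicted class" in the right-most column. The function name `confusion_summary` and the counts are hypothetical and are not the study's data.

```python
def confusion_summary(matrix):
    """Summarize a square confusion matrix of counts.

    Assumed layout: matrix[i][j] is the number of samples predicted as
    class i whose true class is j.
    """
    n = len(matrix)
    total = sum(sum(row) for row in matrix)
    correct = sum(matrix[i][i] for i in range(n))

    # Right-most column: fraction of each predicted class that is correct.
    row_totals = [sum(matrix[i]) for i in range(n)]
    precision = [matrix[i][i] / row_totals[i] if row_totals[i] else float("nan")
                 for i in range(n)]

    # Bottom row: fraction of each true class that is predicted correctly.
    col_totals = [sum(matrix[i][j] for i in range(n)) for j in range(n)]
    per_class = [matrix[j][j] / col_totals[j] if col_totals[j] else float("nan")
                 for j in range(n)]

    overall = correct / total
    return {
        "precision": precision,
        "per_class_accuracy": per_class,
        "overall_accuracy": overall,
        "error_rate": 1.0 - overall,
    }


# Hypothetical counts (NOT from the study); class order: control, benign, malignant.
example = [
    [10, 1, 0],
    [0, 8, 1],
    [0, 1, 9],
]
summary = confusion_summary(example)
print(summary)
```

The overall accuracy is the sum of the diagonal divided by the total number of samples, and the error rate is its complement, corresponding to the green and red values in the bottom-right box.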