MetastaSite: Predicting metastasis to different sites using deep learning with gene expression data
doi: 10.3389/fmolb.2022.913602.
eCollection 2022.
Front Mol Biosci.
Abstract
Deep learning has considerable potential for predicting phenotype from different omics profiles. However, deep neural networks are viewed as black boxes that provide predictions without explanation. The demand for interpretable models is therefore growing, especially in the medical field. Here we propose a deep learning-based computational framework that takes the gene expression profile of any primary cancer sample and predicts whether the patient’s sample is primary (localized) or has metastasized to the brain, bone, lung, or liver. Specifically, we first constructed an AutoEncoder framework to learn the non-linear relationships between genes, and then applied DeepLIFT to calculate the genes’ importance scores. Next, to mine the top essential genes that distinguish primary from metastasized tumors, we iteratively added the top-ranked genes, ten at a time and ordered by importance score, to train a DNN model. We then trained a final multi-class DNN that takes the output of the previous part as input and predicts whether samples are primary or metastasized to the brain, bone, lung, or liver. The AUC values ranged from 0.82 to 0.93. We further designed the model’s workflow to provide a second functionality beyond metastasis site prediction, i.e., identifying the biological functions that the DL model uses to perform the prediction. To our knowledge, this is the first multi-class DNN model developed for the generic prediction of metastasis to various sites.
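The incremental gene-selection step described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the data are synthetic, the importance scores are a simple class-mean difference standing in for DeepLIFT scores, and a centroid-based scorer stands in for the DNN. All function and variable names (`auc_score`, `centroid_scores`, `incremental_selection`) are hypothetical.

```python
import numpy as np

def auc_score(y_true, scores):
    """Pairwise (Mann-Whitney) AUC; ties count as half a win."""
    pos = scores[y_true == 1]
    neg = scores[y_true == 0]
    diff = pos[:, None] - neg[None, :]
    return float(np.mean((diff > 0) + 0.5 * (diff == 0)))

def centroid_scores(X_train, y_train, X_test):
    """Stand-in classifier (NOT a DNN): project onto the class-mean difference."""
    mu_pos = X_train[y_train == 1].mean(axis=0)
    mu_neg = X_train[y_train == 0].mean(axis=0)
    return X_test @ (mu_pos - mu_neg)

def incremental_selection(X_train, y_train, X_test, y_test,
                          importance, step=10, max_genes=50):
    """Add top-ranked genes `step` at a time and record the test AUC."""
    order = np.argsort(importance)[::-1]  # most important gene first
    results = []
    for k in range(step, max_genes + 1, step):
        top = order[:k]
        scores = centroid_scores(X_train[:, top], y_train, X_test[:, top])
        results.append((k, auc_score(y_test, scores)))
    return results

# Synthetic data (assumption): 200 samples, 100 genes, the first 20 informative.
rng = np.random.default_rng(0)
y = rng.integers(0, 2, size=200)           # 0 = primary, 1 = metastasized
X = rng.normal(size=(200, 100))
X[:, :20] += y[:, None] * 1.0              # shift informative genes by class

X_train, y_train = X[:150], y[:150]
X_test, y_test = X[150:], y[150:]

# Stand-in importance scores (assumption): absolute class-mean difference
# on the training split, used here in place of DeepLIFT scores.
importance = np.abs(X_train[y_train == 1].mean(axis=0)
                    - X_train[y_train == 0].mean(axis=0))

results = incremental_selection(X_train, y_train, X_test, y_test, importance)
for k, auc in results:
    print(f"top {k:2d} genes: AUC = {auc:.3f}")
```

Plotting AUC against the number of genes, as this loop records it, gives the kind of curve summarized in Figure 4; in practice the gene-set size would be chosen where the curve plateaus.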
Keywords:
artificial intelligence; clinical decision-making; deep learning; gene expression; machine learning; metastasis; metastasis site.
Copyright © 2022 Albaradei, Albaradei, Alsaedi, Uludag, Thafar, Gojobori, Essack and Gao.
Conflict of interest statement
The authors declare that the research was conducted in the absence of any commercial or financial relationships that could be construed as a potential conflict of interest.
Figures
FIGURE 1
General overview of the proposed deep learning-based computational framework, which takes the gene expression profile of any primary cancer sample and predicts whether the patient’s sample is primary (localized) or has metastasized to the brain, bone, lung, or liver.

FIGURE 2
Workflow of the first part of our model’s framework. (A) The architecture of the AutoEncoder, (B) applying DeepLIFT to compute the importance scores in the Encoder network, (C) using a DNN as a baseline method to perform the metastasis prediction.

FIGURE 3
The workflow represents the second part of our model’s framework, which determines the significant neurons in the network to predict metastasis status.

FIGURE 4
AUC for different numbers of featured genes using a DNN for the bone, brain, lung, and liver sites. AUC is shown in blue and the error rate in red.

FIGURE 5
The prediction performance of the final multi-class DNN model.

FIGURE 6
The prediction performance of the final multi-class DNN model using external testing data from the TCGA datasets. Note that the test set contains only 2 brain samples.

FIGURE 7
The prediction performance of the final multi-class DNN model using a specific population-based cohort.

FIGURE 8
The biological interpretation of our deep neural network approach.

FIGURE 9
A simplified network showing each layer’s enriched pathways based only on the metastasis sites.