Early detection and management of sarcopenia are clinically important. We aimed to develop a chest X-ray-based deep learning model to predict the presence of sarcopenia.
Data from participants who visited the osteoporosis clinic at Severance Hospital, Seoul, South Korea, between January 2020 and June 2021 were used as the derivation cohort, which was split into training, validation and test sets (65:15:20). A community-based cohort of older adults (KURE) was used as the external test set. Sarcopenia was defined according to the Asian Working Group for Sarcopenia 2019 guideline. A deep learning model was trained to predict appendicular lean mass (ALM), handgrip strength (HGS) and chair rise test performance from chest X-ray images; a machine learning model (SARC-CXR score) was then built using age, sex, body mass index and the chest X-ray-predicted muscle parameters, along with their estimation uncertainty values.
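The sarcopenia label used as the prediction target can be sketched in code. The example below is a minimal illustration, not the authors' implementation. It assumes the commonly cited Asian Working Group for Sarcopenia 2019 cutoffs for DXA-measured ALM index, HGS and the five-time chair stand test. The abstract does not state which measurements or cutoffs were applied, and the `Participant` class and `has_sarcopenia` function are hypothetical names.

```python
from dataclasses import dataclass

# Assumed AWGS 2019 cutoffs; the abstract does not list the ones actually used.
ALM_INDEX_CUTOFF = {"M": 7.0, "F": 5.4}  # kg/m^2, ALM / height^2 measured by DXA
HGS_CUTOFF = {"M": 28.0, "F": 18.0}      # kg, handgrip strength
CHAIR_STAND_CUTOFF_S = 12.0              # seconds to complete five chair rises


@dataclass
class Participant:
    """Hypothetical record holding the measurements needed for the definition."""
    sex: str              # "M" or "F"
    height_m: float
    alm_kg: float
    hgs_kg: float
    chair_stand_s: float


def has_sarcopenia(p: Participant) -> bool:
    """Low muscle mass plus low strength or low physical performance."""
    low_mass = p.alm_kg / p.height_m ** 2 < ALM_INDEX_CUTOFF[p.sex]
    low_strength = p.hgs_kg < HGS_CUTOFF[p.sex]
    low_performance = p.chair_stand_s >= CHAIR_STAND_CUTOFF_S
    return low_mass and (low_strength or low_performance)


# Made-up example values, for illustration only.
example = Participant(sex="F", height_m=1.55, alm_kg=12.0, hgs_kg=16.0, chair_stand_s=10.0)
print(has_sarcopenia(example))  # True: low ALM index and low handgrip strength
```

Under this definition, low muscle mass alone is not sufficient; it has to be accompanied by low strength or low physical performance, which is why the model predicts all three muscle parameters.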
The mean ages of the derivation cohort (n = 926; women n = 700, 76%; sarcopenia n = 141, 15%) and the external test cohort (n = 149; women n = 95, 64%; sarcopenia n = 18, 12%) were 61.4 and 71.6 years, respectively. In the internal test set (a hold-out set, n = 189, from the derivation cohort) and the external test set (n = 149), the concordance correlation coefficients for ALM prediction were 0.80 and 0.76, with average differences of 0.18 ± 2.71 and 0.21 ± 2.28, respectively. Gradient-weighted class activation mapping for the deep neural network models predicting ALM and HGS commonly highlighted heavily weighted pixels in the bilateral lung fields and part of the cardiac contour. The SARC-CXR score showed good discriminatory performance for sarcopenia in both the internal test set [area under the receiver operating characteristic curve (AUROC) 0.813, area under the precision-recall curve (AUPRC) 0.380, sensitivity 0.844, specificity 0.739, F1-score 0.540] and the external test set (AUROC 0.780, AUPRC 0.440, sensitivity 0.611, specificity 0.855, F1-score 0.458). Among the SARC-CXR model features, low ALM predicted from chest X-ray was the most important predictor of sarcopenia based on SHapley Additive exPlanations values. Higher estimation uncertainty for HGS contributed to a higher predicted risk of sarcopenia. In the internal test set, the SARC-CXR score showed better discriminatory performance than the SARC-F score (AUROC 0.813 vs. 0.691, P = 0.029).
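The agreement and classification metrics reported here can be computed from raw values with a few lines of Python. The sketch below is illustrative and is not the authors' code: `lins_ccc` and `binary_metrics` are hypothetical helper names, and the concordance correlation coefficient is computed with population (1/n) moments. The confusion-matrix counts in the usage lines are not reported in the abstract. They were back-calculated from the external test figures (n = 149, 18 with sarcopenia, sensitivity 0.611, specificity 0.855) and should be read as an assumption.

```python
from statistics import fmean


def lins_ccc(y_true, y_pred):
    """Lin's concordance correlation coefficient using population moments."""
    if len(y_true) != len(y_pred) or len(y_true) < 2:
        raise ValueError("inputs must have equal length of at least 2")
    n = len(y_true)
    mean_t, mean_p = fmean(y_true), fmean(y_pred)
    var_t = sum((t - mean_t) ** 2 for t in y_true) / n
    var_p = sum((p - mean_p) ** 2 for p in y_pred) / n
    cov = sum((t - mean_t) * (p - mean_p) for t, p in zip(y_true, y_pred)) / n
    # Unlike Pearson's r, the denominator penalises a shift between the means.
    return 2 * cov / (var_t + var_p + (mean_t - mean_p) ** 2)


def binary_metrics(tp, fp, tn, fn):
    """Sensitivity, specificity and F1-score from confusion-matrix counts."""
    return {
        "sensitivity": tp / (tp + fn),
        "specificity": tn / (tn + fp),
        "f1": 2 * tp / (2 * tp + fp + fn),
    }


# Back-calculated (assumed) external test counts: 18 positives, 131 negatives.
external = binary_metrics(tp=11, fp=19, tn=112, fn=7)
print({name: round(value, 3) for name, value in external.items()})
# {'sensitivity': 0.611, 'specificity': 0.855, 'f1': 0.458}
```

Because only 12% of the external cohort had sarcopenia, the F1-score and AUPRC stay modest even when sensitivity and specificity are reasonably high, which is worth keeping in mind when comparing the two test sets.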
A chest X-ray-based deep learning model improved the detection of sarcopenia, a finding that merits further investigation.
Appendicular lean mass; Artificial intelligence; Chest X-ray-based deep learning model; Chest radiograph; Sarcopenia.