
# Multi-cohort and longitudinal Bayesian clustering study of stage and subtype in Alzheimer’s disease

A major contribution of this study is the transition from a cross-sectional understanding of AD subtypes to the perspective brought by longitudinal clustering. Some of the previously reported AD subtypes seem to reflect different stages of the disease that can be observed in our five estimated longitudinal atrophy patterns. Hence, our data contribute a step towards solving the long-standing problem of disentangling disease stages from actual disease subtypes. This was enabled by modeling longitudinal data on a clear timescale, i.e., the time from disease onset over eight years, in a large multiethnic cohort of 891 AD dementia cases from four continents. Another important finding is that AD subtypes with clearly distinct atrophy trajectories may converge in late disease stages. This introduces a new understanding of neurodegeneration in AD, which, combined with knowledge of neuropathological and clinical heterogeneity, could set the ground for future personalized predictions of biological changes and cognitive decline in AD.

At the modeled clinical disease onset, our method successfully identified the same patterns of atrophy previously identified in neuropathological and neuroimaging subtyping studies (minimal atrophy, limbic predominant, typical AD, and hippocampal sparing)5,7,8,13,16. Our results revealed two main pathways of atrophy. We introduce the term pathway to describe AD patients that show a similar spatial distribution of atrophied brain regions over time. Within the same atrophy pathway, some patients may progress faster (LPA+) than others (LPA and MA), but their spatial distribution of atrophy over time is similar. This first pathway contrasts with the second atrophy pathway in AD, which has a different spatial distribution with mainly cortical atrophy over time. The differences in progression rates also reflect the rates of cognitive decline of the patients. An important future aim is to understand the factors underlying these differences in progression, both within the same pathway and between the different pathways that we have identified.

The minimal atrophy (atrophy limited to the entorhinal cortex), the limbic predominant (atrophy mainly in limbic areas), and the typical (widespread atrophy in the hippocampus, temporal, parietal, and frontal lobes) AD subtypes16 were each identified at some disease stage of our MA, LPA, or LPA+ longitudinal atrophy clusters. MA was the most representative cluster in the datasets under investigation and had the highest within-cluster variability. Clustering methods often identify one cluster that represents the most prevalent pattern in a dataset, which is an average of more heterogeneous observations than the patterns that result from the remaining clusters16. It is important to stress that our MA cluster includes patients that are grouped in the minimal and limbic predominant patterns of atrophy, and potentially some early stage typical AD patients reported in the literature7. This is because our study models trajectories of atrophy from the disease onset while accounting for longitudinal structural changes in CU $$A\beta$$ negative subjects. Through this type of modeling, we connected patterns of atrophy from the literature by modeling atrophy trajectories, and therefore disease staging, explicitly. Our MA and LPA clusters probably belong to the same AD subtype observed in two distinct stages, since MA patients reached the LPA levels (baseline) two years after the AD onset. The differences in cognitive intercepts (MMSE and ADAS word recall) between our MA and LPA clusters support the view that they reflect different disease stages. The LPA+ cluster appears to be on the same atrophy pathway but with faster atrophy rates in comparison to the MA and LPA clusters. Patients in the LPA+ cluster had the steepest decline in cognition among the five identified clusters, including memory and orientation. LPA+ patients had similar APOE e4 frequency7, education, and disease onset to those in MA and LPA. However, premorbid intelligence, a proxy for cognitive reserve17, was significantly higher in LPA+ than in MA and LPA. We believe that, owing to high cognitive reserve, patients in the LPA+ cluster can reach higher levels of brain atrophy than those in the MA and LPA clusters while maintaining similar clinical severity until they reach the AD onset17. The dynamics of brain atrophy over time differed between the MA, LPA, and LPA+ clusters. However, our current data seem to indicate that these three longitudinal atrophy clusters belong to the same atrophy pathway in AD, namely the mediotemporal atrophy pathway. Atrophy in this well-documented pathway has been shown to correlate with neurofibrillary tangle pathology at autopsy1,5,18. Even though these three clusters (MA, LPA, and LPA+) belong to the same atrophy pathway, their rates of atrophy and cognitive decline differ substantially, which can have important clinical implications. These observed differences are likely due to a combination of protective and risk factors as well as potential concomitant non-AD brain pathologies7. For example, Ferreira and colleagues showed that the location and frequency of markers of small vessel disease differ between AD subtypes19.

Our HS cluster resembles the hippocampal sparing subtype described in previous neuropathological and neuroimaging subtyping studies5,7,8,13,16. This subtype is more often characterized by cortical atrophy in comparison to the other AD subtypes7,8,16,18. In our study, the characteristics of the HS cluster included steep atrophy trajectories, a lower frequency of the APOE e4 allele7, high premorbid intelligence, more years of education, and early AD onset, which is in line with the characteristics associated with the hippocampal sparing subtype in previous studies7,8,13,16. This cluster had the lowest frequency, which is also in line with previous studies7,8. The chances of finding more hippocampal sparing patients were reduced because the cohort selection criteria included the amnestic phenotypic presentation of AD, which is frequently related to typical AD and thus to the mediotemporal atrophy pathway4. Significantly affected constructional and ideational praxis is a key characteristic of the hippocampal sparing subtype7,13,16, which was also confirmed in our study. Comparisons between the covariance patterns of our MA and HS clusters revealed network differences between these two groups. In the MA cluster, anatomical differences due to the disease were predominantly localized in the medial-temporal lobe and cortical regions combined as a network at the AD onset. In the HS cluster, by contrast, network differences at the AD onset also involved the basal ganglia. Moreover, the HS cluster had higher nodal strength at the intercept in some ventromedial prefrontal and medial temporal regions than the MA cluster. Based on all these results, we believe that the HS pattern of atrophy represents a distinct atrophy pathway in AD, namely the cortical pathway.
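As an illustration of the nodal strength measure referred to above, the following minimal sketch assumes the common graph-theoretic definition: the strength of a region is the sum of the absolute weights of its connections to all other regions in a covariance (or correlation) network. Whether absolute values were used in the present analysis is an assumption, and the function name `nodal_strength`, the region names, and the weights are hypothetical toy values, not study data.

```python
def nodal_strength(weights):
    """Return, for each node, the sum of absolute edge weights to all other nodes.

    `weights` is a square, symmetric matrix given as a list of lists.
    The diagonal (self-connections) is ignored.
    """
    n = len(weights)
    return [
        sum(abs(weights[i][j]) for j in range(n) if j != i)
        for i in range(n)
    ]


# Hypothetical toy network of three regions (illustrative values only).
regions = ["hippocampus", "entorhinal", "putamen"]
toy_weights = [
    [1.0, 0.6, -0.1],
    [0.6, 1.0, 0.2],
    [-0.1, 0.2, 1.0],
]

strengths = dict(zip(regions, nodal_strength(toy_weights)))
for region, strength in strengths.items():
    print(f"{region}: {strength:.2f}")
```

Comparing such per-region strengths between two cluster-specific networks is one way to express statements of the form "cluster A had higher nodal strength than cluster B in region R".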

Explaining the atrophy trajectories of our DA cluster is challenging, since excessive frontal and temporal atrophy was already present at the clinical onset. Our data showed that in advanced stages of the mediotemporal and cortical pathways of atrophy, AD patients may develop levels of atrophy comparable to those of our DA cluster. As a result, this cluster of patients can potentially belong to either of the two pathways of atrophy. As in our LPA+ cluster, cognitive reserve in our DA cluster (education exceeded 15 years on average) may explain the greater atrophy levels at dementia onset7,17. Our DA cluster had a similar pattern of atrophy to that of the typical AD atrophy subtype reported in the literature7,8,13,16, but a lower frequency. In a recent cross-sectional clustering study using tau PET that mainly included preclinical AD, no cluster had a spatial tau distribution similar to the typical AD pattern of atrophy, but the cortical and medial-temporal patterns of tau were observed10. Further, two other studies in prodromal AD found clusters of individuals with decreased temporal-parietal glucose metabolism20 or increased temporal-parietal atrophy21 (typical AD pattern), but at low sample frequencies, which is in line with our findings.

Recently, it was proposed that $$A\beta$$ aggregation in the default mode network (DMN) is predominantly associated with within-network but distant glucose hypometabolism22. Moreover, glucose metabolism, atrophy, and tau pathology are closely linked in AD7,18,22. We speculate that the mediotemporal path of neurodegeneration in AD may be initiated in the vulnerable temporal lobe after enough $$A\beta$$ is deposited in distant DMN regions. In contrast, patients on the cortical atrophy pathway may show less initial temporal lobe atrophy (and amnestic symptomatology), partially because they respond differently to $$A\beta$$ aggregation in the DMN due to compensation mechanisms22 such as cognitive reserve17.

Our study has addressed some important methodological challenges that the existing literature on biological subtypes has not overcome so far. To our knowledge, this is the first time that AD atrophy subtypes have been discovered based on modeling longitudinal biomarker trajectories8. An immediate advantage of our longitudinal clustering approach is that it overcomes the assumption, made in cross-sectional analyses, that subjects of a cluster remain in the same cluster when the disease advances, which is unrealistic8. Previous studies have employed arbitrary timescales to model biomarker progression8,10,13. Our estimates are based on a clearly defined timescale, namely the time from clinical onset. This approach provides the unique possibility to generate interpretations based on disease staging that help to trace abnormal changes early in the disease course of each cluster. Previously, longitudinal interpretations could not directly relate back to the data at hand because they were not anchored to a specific timescale13. We calculated atrophy w-values for each patient, corrected for the effects of aging on brain morphology based on a longitudinal dataset of $$A\beta$$ negative CU individuals. Our model for the correction of aging effects on the atrophy values, as shown in the results, correctly identified the excess atrophy due to AD at different ages and is in line with the literature comparing early and late onset AD23. This approach helped to estimate the within-subject variance more precisely and therefore to account for the effects observed in aging9,15,24, which has been a limitation of cross-sectional estimations9,16,18. A common pitfall of clustering studies is to focus on finding labels for observations depending on their features in a population, which tends to overfit the training set. External validation datasets help to assess the ability of clustering models to generalize8. We found that our longitudinal atrophy estimates and the unseen atrophy patterns in the validation dataset were highly concordant. Moreover, the application of longitudinal clustering separately in the ADNI and J-ADNI/AIBL cohorts showed longitudinal atrophy patterns similar to those found in the whole discovery dataset, with small variations. The low sample percentages that some clusters exhibited are attributed to the underrepresentation of rare subtypes in some cohorts that focused on the typical AD phenotype, to the smaller samples used for clustering in the separate cohorts, and to the ability of our method to identify clusters of very low prevalence if they exist15. Concordance was high for the most prevalent atrophy patterns and lower for DA and HS, due to low sample sizes and cohort differences. Between the ADNI and J-ADNI/AIBL cohorts, a quantitative assessment showed increased similarity in longitudinal atrophy trajectories, with small variations due to small sample sizes and cohort variability. Of interest, the hippocampal sparing and diffuse atrophy patterns were found in both datasets, but with lower prevalence than in the complete discovery dataset. This happened because the discovery dataset was split into smaller datasets that underrepresent the AD population. AD subtypes of lower prevalence in the population7 are bound to be underrepresented or to disappear when clustering is applied to small datasets9. Analyzing the cohorts of the discovery dataset jointly, instead of building one clustering model per cohort, yielded a single statistical model that produced more accurate estimates owing to the larger sample size. Importantly, since our study was mainly based on longitudinal information from repeated cross-sectional measurements, we avoided interpreting structural relations between brain regions based on cross-sectional correlations. Instead, we focused only on the longitudinal correlation between brain regions, which is based on within-patient longitudinal trajectories.
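The age correction behind the atrophy w-values mentioned above can be illustrated with a deliberately simplified sketch. It assumes the widely used w-score construction: a regional volume is regressed on age in a reference group of CU individuals, and each patient's deviation from the age-expected volume is divided by the residual standard deviation of the reference fit. The study itself used a longitudinal model of $$A\beta$$ negative CU individuals, so the cross-sectional linear regression here is a stand-in, not the authors' model. The function names `fit_age_norm` and `w_score` and all numbers are hypothetical.

```python
def fit_age_norm(ages, volumes):
    """Least-squares fit of volume = intercept + slope * age in reference data.

    Returns (intercept, slope, residual_sd), where residual_sd uses
    n - 2 degrees of freedom.
    """
    n = len(ages)
    age_mean = sum(ages) / n
    vol_mean = sum(volumes) / n
    sxx = sum((a - age_mean) ** 2 for a in ages)
    sxy = sum((a - age_mean) * (v - vol_mean) for a, v in zip(ages, volumes))
    slope = sxy / sxx
    intercept = vol_mean - slope * age_mean
    squared_residuals = [
        (v - (intercept + slope * a)) ** 2 for a, v in zip(ages, volumes)
    ]
    residual_sd = (sum(squared_residuals) / (n - 2)) ** 0.5
    return intercept, slope, residual_sd


def w_score(age, volume, norm):
    """Age-corrected deviation of a volume from the reference expectation.

    Negative values mean the volume is smaller than expected for that age.
    """
    intercept, slope, residual_sd = norm
    expected = intercept + slope * age
    return (volume - expected) / residual_sd


# Hypothetical reference data (ages in years, volumes in arbitrary units).
reference_ages = [60, 65, 70, 75, 80]
reference_volumes = [4.0, 3.9, 3.7, 3.6, 3.4]
norm = fit_age_norm(reference_ages, reference_volumes)

# Hypothetical patient: 72 years old with a markedly reduced volume.
patient_w = w_score(72, 3.1, norm)
print(f"w-score: {patient_w:.2f}")
```

In this sign convention, a more negative score indicates more atrophy than aging alone would predict; a study may equally report the sign-flipped quantity so that larger values mean more atrophy.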

Our study has some limitations. Only atrophy markers were modeled in the context of AD heterogeneity. Pre-AD scans were not included, which reduced our ability to infer atrophy patterns that precede the diagnosis of AD dementia. In the future, we envision combining and comparing other imaging modalities longitudinally, thus extending our current analyses to incorporate information about tau-related pathology. Moreover, the future addition of biomarkers of non-AD pathologies to the design of clustering studies will help in understanding the contribution of comorbidities to AD subtypes. The inclusion of subjects from four different continents is a strength, since it increased the variability in the sample and therefore represented the AD population better, but it is also a limitation due to variability in MRI assessments, although the cohorts were harmonized to reduce MRI variability. Another limitation is the short follow-up period for the AD patients included in the study. A future re-estimation of atrophy trajectories will include more MRI visits per patient to obtain better estimates. However, a strong methodological aspect of this study is the reconstruction of longitudinal subtype-atrophy profiles over the dementia part of the AD continuum, based on longitudinal individual patients’ data that comprised short segments of the disease continuum. Future studies should also include multiple MRIs from patients that are followed up from the preclinical until the dementia stage. Beyond these limitations, we assumed that the CU population has homogeneous brain morphology. Future studies should investigate whether CU individuals age differently and incorporate this information in the context of AD heterogeneity.

In conclusion, based on a large multiethnic cohort of AD dementia patients, we discovered five longitudinal patterns of brain atrophy that group the previously reported AD subtypes into two atrophy pathways (a mediotemporal and a cortical pathway). We introduced a different understanding of the neurodegenerative aspect of AD heterogeneity by shifting from the cross-sectional understanding of AD subtypes to the perspective brought by longitudinal clustering. Our study is a step toward answering an urgent question: whether the observed heterogeneity in AD reflects disease stages or distinct biological subtypes. We believe that with the help of our proposed model, it will be possible to unravel the heterogeneity in AD, thus enabling precision medicine and potentially leading to successful disease-modifying treatments in the future.
