Data were collected as part of the ongoing Personalized Parkinson Project (PPP), a prospective, longitudinal, single-center study (Clinical Trials NCT033648) of 520 people with early-stage Parkinson’s disease, diagnosed within the last 5 years27. Study participants wear a smartwatch (Verily Study Watch) for up to 23 h/day for the 3-year duration of the study; the watch passively collects raw sensor data from IMU, gyroscope, photoplethysmography, and skin conductance sensors.
Set 1 (N = 198 participants) was selected for video-based consensus scoring by matching age, gender, and MDS-UPDRS III score to be representative of the overall PPP study. Two assessors independently scored videos of the exams. When difficulties in rating MDS-UPDRS Part III tasks arose due to poor video quality, assessors provided scores only when confident in their assessment. MDS-UPDRS Part III consensus scores were computed as the median of the in-person rating and both video ratings.
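As an illustration of the consensus rule described above, the following minimal sketch takes the median of the in-person rating and the two video ratings for a single MDS-UPDRS Part III item. The function name `consensus_score` is hypothetical, and the handling of a missing video rating (returning no score) is an assumption; the text only states that assessors scored when confident.

```python
from statistics import median
from typing import Optional


def consensus_score(in_person: int,
                    video_a: Optional[int],
                    video_b: Optional[int]) -> Optional[float]:
    """Median of the in-person rating and both video ratings.

    Assumption (not stated in the text): if either video rating is
    missing because the assessor was not confident, no consensus
    score is produced for that item.
    """
    if video_a is None or video_b is None:
        return None
    return median([in_person, video_a, video_b])
```

With three ratings, the median equals the majority value whenever two raters agree, so a single outlying rating does not shift the consensus.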
Starting in May 2020, participants were offered the opportunity to enroll in a substudy, which asks them to perform an active assessment (Parkinson’s Disease Virtual Motor Exam, PD-VME) in the clinic and in remote, unsupervised settings. The PD-VME was deployed fully remotely, using digital instructions and an over-the-air firmware update to the watches of consented participants. A total of 370 participants enrolled in the substudy (Set 2).
The smartwatch guides participants through the series of structured motor tasks comprising the PD-VME. It also allows patients on symptomatic medication to log the timing of their medication intake. The study design and patient-facing UI of the PD-VME are summarized in Fig. 1.
Each week, participants were asked to perform the PD-VME twice on the same day, at two predefined times: first in the off state (selected as a time when they typically experienced their worst motor function), and then in the on state (at a time when they typically experienced good motor function later in the day). Participants not taking medication were instructed to complete the PD-VME twice, one hour apart. The helpdesk at the site (Radboudumc) monitored wear time and PD-VME completion and reached out to participants if more than three consecutive weekly assessments were missed.
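The outreach rule can be expressed as a simple run-length check. The sketch below is illustrative only: the function name `needs_outreach` and the representation of adherence as one boolean per week are assumptions, not a description of the helpdesk's actual tooling.

```python
def needs_outreach(weekly_completed, max_missed=3):
    """Return True if more than `max_missed` consecutive weeks were missed.

    `weekly_completed` holds one boolean per study week (hypothetical
    encoding): True if at least one PD-VME was completed that week.
    """
    run = 0  # length of the current streak of missed weeks
    for done in weekly_completed:
        run = 0 if done else run + 1
        if run > max_missed:
            return True
    return False
```

Three missed weeks in a row do not trigger outreach under this reading; the fourth consecutive missed week does.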
Starting in July 2020, participants enrolled in the PD-VME substudy were asked to perform the PD-VME during their in-clinic visit (in the same manner as they did remotely), while the assessor observed its execution without providing feedback or any additional instructions. The in-clinic PD-VME was performed within 1 h after completion of the MDS-UPDRS Part III off state exam, and before dopaminergic medication intake.
Demographic and clinical characteristics of the study population are presented in Table 1, for participants in Set 1 and Set 2. Distributions of the side on which the participants chose to wear the smartwatch are also included.
Median smartwatch wear time across all PPP participants (N = 520)27,28 was 22.1 h/day, with a median follow-up period of 390 days. Variations in follow-up duration are largely due to the 126 participants who had not completed the study at the time of publication; loss to follow-up was only 5.4%. Reasons for participant drop-out are indicated in Supplementary Table 2. Participants in Set 2 completed 22,668 PD-VMEs, corresponding to 59% of per-protocol test sessions during the 70-week follow-up period (Supplementary Fig. 1). In the first week, 80% of participants completed at least one PD-VME; in week 52, 40% did so.
Participants’ ability to perform the PD-VME was assessed during the in-clinic visit. Participants were able to complete the tasks in the exam (100% for tremor and upper-extremity bradykinesia and 98.5% for gait). Major protocol deviations were recorded as follows: participants did not place their hands on their lap during rest tremor tasks (8.2% of cases), participants performed the arm-twist using both arms (3.1% of cases), and participants either walked with their arms crossed across their chest (in 3.1% of cases) or sat down repeatedly (6.8% of cases) during the gait task. Detailed results are summarized in Supplementary Table 3.
Among the three measurements that were considered for measuring tremor severity, the lateral tremor acceleration measurement is presented here because it showed the strongest correlation with in-clinic MDS-UPDRS ratings and the strongest ability to separate on state from off state measurements. Results for additional measures are included in Supplementary Table 4.
The Spearman rank correlation between the median lateral acceleration during the rest tremor task and expert consensus rating of MDS-UPDRS task 3.17 was 0.70 [0.61, 0.77], N = 138 (Fig. 2a). For 56 participants, video quality was insufficient to ensure high confidence consensus ratings. Wrist acceleration signals intuitively map to the clinical observations during the MDS-UPDRS (Fig. 2b). Next, the sensitivity to on-off changes of the rest-tremor acceleration measurement was assessed (Fig. 2c). A small effect (Cohen’s d of 0.2) was observed comparing the on and off state. The mean difference in the measure was 0.10 [0.05, 0.1].
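For readers who wish to reproduce this type of analysis, the sketch below computes a Spearman rank correlation with a percentile bootstrap confidence interval. The text does not state how the bracketed intervals were derived, so the bootstrap is an assumption, and all function names (`average_ranks`, `spearman`, `bootstrap_ci`) are hypothetical.

```python
import random
from statistics import mean


def average_ranks(values):
    """1-based ranks; tied values receive the mean of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        avg = (i + j) / 2 + 1  # mean of positions i+1 .. j+1
        for k in range(i, j + 1):
            ranks[order[k]] = avg
        i = j + 1
    return ranks


def pearson(x, y):
    mx, my = mean(x), mean(y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / (sxx * syy) ** 0.5


def spearman(x, y):
    """Spearman's rho: Pearson correlation of the rank-transformed data."""
    return pearson(average_ranks(x), average_ranks(y))


def bootstrap_ci(x, y, n_boot=2000, alpha=0.05, seed=0):
    """Percentile bootstrap interval for rho (assumed method)."""
    rng = random.Random(seed)
    n = len(x)
    stats = []
    for _ in range(n_boot):
        idx = [rng.randrange(n) for _ in range(n)]
        xs = [x[i] for i in idx]
        ys = [y[i] for i in idx]
        if len(set(xs)) < 2 or len(set(ys)) < 2:
            continue  # rho is undefined for a constant resample
        stats.append(spearman(xs, ys))
    stats.sort()
    lo = stats[int((alpha / 2) * len(stats))]
    hi = stats[int((1 - alpha / 2) * len(stats)) - 1]
    return lo, hi
```

A rank-based correlation suits this comparison because MDS-UPDRS item scores are ordinal (0 to 4), so many participants share the same rating and ties must be handled explicitly.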
Test-retest reliability is reported in Fig. 2d, with intra-class correlation (ICC) of 0.71 [0.58–0.81] week-on-week (N = 208), and ICC of 0.90 [0.84–0.94] for monthly averaged measures (N = 139).
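The following sketch shows one way a test-retest ICC can be computed from repeated measurements. The ICC form used in the study is not specified in this section; the one-way random-effects ICC(1,1) below is an assumption chosen for simplicity, and the name `icc_oneway` is hypothetical.

```python
from statistics import mean


def icc_oneway(scores):
    """One-way random-effects ICC(1,1).

    `scores` is a list with one entry per participant, each a list of
    k repeated measurements (e.g. two consecutive weeks). All
    participants must have the same k >= 2.
    """
    n = len(scores)
    k = len(scores[0])
    grand = mean([v for row in scores for v in row])
    row_means = [mean(row) for row in scores]
    # Between-participant and within-participant mean squares (one-way ANOVA)
    ms_between = k * sum((m - grand) ** 2 for m in row_means) / (n - 1)
    ms_within = sum(
        (v - m) ** 2 for row, m in zip(scores, row_means) for v in row
    ) / (n * (k - 1))
    return (ms_between - ms_within) / (ms_between + (k - 1) * ms_within)
```

Averaging several sessions before computing the ICC reduces within-participant variance, which is consistent with the higher reliability reported for monthly averaged measures.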
Finally, the distribution of remote measurements compared to the sensor measurement during the in-clinic PD-VME is shown in Fig. 2e. The in-clinic PD-VME measure was between the 25th and the 75th percentiles of the remote PD-VME measures for 41% of the participants.
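The comparison between in-clinic and remote measurements amounts to checking, per participant, whether the in-clinic value lies within the interquartile range of that participant's remote values. The sketch below illustrates this; the function names are hypothetical and the percentile interpolation method (`"inclusive"`) is an assumption, since the text does not specify one.

```python
from statistics import quantiles


def within_iqr(in_clinic, remote):
    """True if the in-clinic value lies between the 25th and 75th
    percentiles of the participant's remote values."""
    q1, _, q3 = quantiles(remote, n=4, method="inclusive")
    return q1 <= in_clinic <= q3


def fraction_within_iqr(pairs):
    """`pairs` is a list of (in_clinic_value, remote_values) tuples,
    one per participant; returns the fraction inside the IQR."""
    hits = [within_iqr(value, remote) for value, remote in pairs]
    return sum(hits) / len(hits)
```

If the in-clinic measurement were simply another draw from the same distribution as the remote measurements, roughly half of the participants would be expected to fall within this range.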
Among the four measurements that were considered for measuring upper-extremity bradykinesia severity, no single measure showed both a strong correlation with in-clinic MDS-UPDRS ratings and a strong ability to separate on from off state measurements. Therefore, results are included below for both the arm-twist amplitude and the arm-twist rate.
The highest correlation with expert consensus rating of MDS-UPDRS task 3.6 was observed for the arm-twist amplitude measure, with ρ = −0.62 [−0.73, −0.49], N = 159 (Fig. 3a). However, the effect of medication state (Cohen’s d of −0.07) was very small (Fig. 3c)29. The mean on-off difference in the measure was −0.9 [−1.6, 0.0] degrees. Test-retest ICC (Fig. 3d) was 0.71 [0.59–0.80] week-on-week (N = 208) and 0.89 [0.84–0.94] for monthly-averaged measures (N = 136). The in-clinic PD-VME measure was between the 25th and the 75th percentiles of the remote PD-VME measures for 45% of the participants.
The assessors observed during the in-clinic PD-VME exam that some patients mainly focussed on the speed of the arm-twist movement rather than the amplitude. Therefore, sensor-based measures of the rate of arm-twist and the combination of rate and amplitude were investigated as well. Correlations with the consensus MDS-UPDRS ratings of ρ = 0.06 [−0.25, +0.13] for arm-twist rate and ρ = −0.42 [−0.55, −0.28] for the product of rate and amplitude were observed. Both metrics showed a significant change between the on and off states: Cohen’s d of −0.22 and mean change of −0.16 [−0.20, −0.13] s−1 for arm-twist rate, and Cohen’s d of −0.26 and mean change of −8 [−10, −6] degrees/s for the combination. The full results are included in Supplementary Table 6.
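The on/off effect sizes reported for these measures can be illustrated with a short sketch of Cohen's d. The exact variant used in the study is not stated here; the pooled-standard-deviation form and the sign convention (on minus off) below are assumptions, and the function names are hypothetical.

```python
from statistics import mean, variance


def mean_difference(on, off):
    """Mean of the on state values minus mean of the off state values."""
    return mean(on) - mean(off)


def cohens_d(on, off):
    """Cohen's d with a pooled standard deviation (assumed variant).

    `on` and `off` are lists of sensor measurements in each
    medication state; each needs at least two values.
    """
    n_on, n_off = len(on), len(off)
    pooled_var = (
        (n_on - 1) * variance(on) + (n_off - 1) * variance(off)
    ) / (n_on + n_off - 2)
    return mean_difference(on, off) / pooled_var ** 0.5
```

Because d scales the mean difference by the spread of the measurements, a measure can show a statistically reliable on-off change while still having a small effect size.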
Arm swing during gait
Among the three measurements that were considered for measuring gait impairment, arm swing acceleration was selected. While it was not the best outcome measure across any of the criteria, it showed solid performance across all of them. Results for the measures that were not selected are included in Supplementary Table 7.
The Spearman rank correlation between the arm swing acceleration during the gait task and expert consensus rating of MDS-UPDRS task 3.10 was ρ = −0.46 [−0.58, −0.31], N = 164 (Fig. 4a). A small effect (Cohen’s d of 0.44) was observed comparing the on and off state. The mean difference in the measure was −0.8 [−1.2, −0.5] m s−2. Test-retest ICC (Fig. 4d) was 0.43 [0.30–0.56] week-on-week (N = 210), and 0.75 [0.66–0.84] for monthly-averaged measures (N = 139). The in-clinic PD-VME measure was between the 25th and the 75th percentiles of the remote PD-VME measures for 39% of the participants.