Workgroups 2025

Proposed Work Groups and Analyses: We will have groups addressing the following common challenges: 

Work Group 1: Measurement Error in Mediation Modeling when imperfect neuroimaging or neuropath measures are used as mediators

Work Group 2: Simulations for Power Calculations to estimate data needs to fulfill NAPA research goals on ADRD prevention 

Work Group 3: Data Quality and Covariance Structure preservation when applying canned modules to create synthetic data from existing cohorts

Work Group 4: Outcome Crosswalk via a Third Dataset to combine evidence from multiple studies using different assessments of the same cognitive construct

Work Group 5: Outcome Crosswalk Using Other Predictors to combine evidence from multiple studies using different assessments of the same cognitive construct


Specific Training Areas: Links between statistical and data generation models; Directed Acyclic Graphs and coding for simulating data; mixed models; measurement models.

Dataset: The Health and Retirement Study (HRS) and the National Health and Aging Trends Study (NHATS) will serve as template data sources for teaching purposes. Participants will be encouraged to leverage other datasets.

Content Experts: Adina Zeki Al Hazzouri, PhD (Columbia); Tom Belin, PhD (UCLA); Matthew Fox (BU); Sarah Ackley (Brown). (Note: none confirmed; an additional expert in differential privacy is needed.)

Work Group Descriptions

Work Group 1: Measurement Error in Mediation Modeling when imperfect neuroimaging or neuropath measures are used as mediators

Simulation for bias analyses illustrating measurement error in mediation modeling of cognitive aging. Extensive work evaluates measured neuroimaging characteristics as mediators between risk factors and cognitive function. However, available neuroimaging measures are likely imprecise and subject to measurement error. In this group, we will try to estimate the impact of measurement error on published results about neuroimaging (and neuropathology?) measures as mediators of the effects of exposures on cognitive outcomes. Possible data sets: ADNI, KHANDLE, STAR, or other.

  • In “Chandra A, Anjum R, Waters S, Proitsi P, Smith LJ, Marshall CR, Alzheimer’s Disease Neuroimaging Initiative. Marital dissolution and cognition: The mediating effect of Aβ neuropathology. Alzheimer’s & Dementia: Diagnosis, Assessment & Disease Monitoring. 2024 Oct;16(4):e70032.” the estimated effect of marital dissolution on cognition was partially mediated by Aβ PET. We anticipate that this mediated effect might be underestimated due to measurement error in Aβ PET, or overestimated due to confounders of marital dissolution and Aβ or confounders of Aβ PET and cognition. This project would entail using synthetic data to evaluate the sensitivity of the findings to these sources of bias.
  • In “Ottoy J, Ozzoude M, Zukotynski K, Adamo S, Scott C, Gaudet V, Ramirez J, Swardfager W, Cogo‐Moreira H, Lam B, Bhan A. Vascular burden and cognition: mediating roles of neurodegeneration and amyloid PET. Alzheimer’s & Dementia. 2023 Apr;19(4):1503-17.” the estimated indirect effect exceeds the estimated total effect. This result implies that white matter hyperintensity volume has a beneficial direct effect (not mediated by amyloid PET or cortical thickness) on cognitive outcomes. This very surprising result seems plausibly attributable to correlated measurement error, collider bias, or both in the mediation analysis. One project would be to simulate data under a causal model that differs from the one the authors conclude but could still generate their empirical results. There may be multiple causal models that would fulfill this, demonstrating the extent to which the conclusions of the paper rest on a priori assumptions that ignore possible sources of exposure-mediator confounding, exposure-outcome confounding, non-linearities, or measurement error.
  • Mediation modeling is common across the literature, and group members are invited to identify papers of particular interest to evaluate by augmenting the original data with synthetic data. These papers do not need to involve neuroimaging; mediation by other factors is also of interest.
  • A side project would be to develop one or more synthetic data sets and perform mediation analyses comparing the PROCESS macro (available in SPSS, R, and SAS) with mediation packages in other software.
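
As an illustration of the kind of bias analysis described above, the following is a minimal sketch, not a proposed group method. It assumes a simple linear exposure-mediator-outcome model with no confounding and classical (independent, additive) measurement error in the mediator. All function names and parameter values are hypothetical and are not taken from either paper.

```python
import numpy as np

def simulate_mediation(n, a=0.5, b=0.4, c_direct=0.2, error_sd=0.0, seed=0):
    """Simulate exposure x, observed mediator m_obs, and outcome y.

    True indirect effect is a * b; true direct effect is c_direct.
    error_sd controls classical measurement error added to the mediator.
    """
    rng = np.random.default_rng(seed)
    x = rng.normal(size=n)
    m_true = a * x + rng.normal(size=n)
    y = c_direct * x + b * m_true + rng.normal(size=n)
    m_obs = m_true + rng.normal(scale=error_sd, size=n)
    return x, m_obs, y

def ols(y, *predictors):
    """Ordinary least squares; returns [intercept, slope_1, slope_2, ...]."""
    design = np.column_stack([np.ones_like(y), *predictors])
    coef, *_ = np.linalg.lstsq(design, y, rcond=None)
    return coef

def product_of_coefficients(x, m, y):
    """Product-of-coefficients estimates of the indirect and direct effects."""
    a_hat = ols(m, x)[1]                  # exposure -> mediator
    _, direct_hat, b_hat = ols(y, x, m)   # outcome on exposure and mediator
    return {"indirect": a_hat * b_hat, "direct": direct_hat}

if __name__ == "__main__":
    for error_sd in (0.0, 0.5, 1.0):
        x, m, y = simulate_mediation(200_000, error_sd=error_sd)
        print(error_sd, product_of_coefficients(x, m, y))
```

Under these assumptions, increasing error_sd attenuates the estimated indirect effect and inflates the estimated direct effect. Correlated errors or confounding would need to be added to the data-generating step to explore the other bias scenarios listed above.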

Work Group 2: Simulations for Power Calculations to estimate data needs to fulfill NAPA research goals on ADRD prevention 

Simulations for power calculations: The updated National Plan to Address Alzheimer’s Disease (NAPA 2024) calls for research evaluating racial/ethnic and social disparities in AD/ADRD and modifiable determinants of AD/ADRD. These calls have not been tied to clear plans to invest in data infrastructure to support this research. This work group will use data simulations to propose specific data structures or sets of data structures, including sample sizes of individuals and places, that would be needed to give informative estimates for the research questions prioritized in NAPA. The group will also consider how existing data sources could be combined to fulfill these goals. No data set needed.
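
A minimal sketch of a simulation-based power calculation over numbers of individuals and places follows. It assumes a binary place-level exposure, a normally distributed place random effect, and a continuous outcome. The analysis compares place-level mean outcomes with a two-sample z-test, which is a simplification chosen for brevity (a mixed model would be the more usual choice). All names and parameter values are hypothetical.

```python
import numpy as np

def simulate_power(n_places, n_per_place, effect, place_sd=0.3, resid_sd=1.0,
                   n_sims=500, seed=0):
    """Estimate power to detect a place-level exposure effect.

    Half of the places are exposed. Each simulated data set is analyzed by
    comparing place-level mean outcomes with a two-sample z-test
    (normal approximation, two-sided alpha = 0.05).
    """
    rng = np.random.default_rng(seed)
    exposed = np.arange(n_places) % 2 == 0
    rejections = 0
    for _ in range(n_sims):
        place_effect = rng.normal(scale=place_sd, size=n_places)
        # Average of individual-level residuals within each place.
        noise = rng.normal(scale=resid_sd,
                           size=(n_places, n_per_place)).mean(axis=1)
        place_mean = effect * exposed + place_effect + noise
        g1, g0 = place_mean[exposed], place_mean[~exposed]
        se = np.sqrt(g1.var(ddof=1) / g1.size + g0.var(ddof=1) / g0.size)
        z = (g1.mean() - g0.mean()) / se
        rejections += int(abs(z) > 1.96)
    return rejections / n_sims

if __name__ == "__main__":
    for n_places in (20, 50, 100):
        for n_per_place in (10, 50):
            power = simulate_power(n_places, n_per_place, effect=0.3)
            print(n_places, n_per_place, power)
```

Looping over a grid like this shows how power trades off between the number of places and the number of individuals per place. With few places the normal approximation is somewhat optimistic, so a t-based or mixed-model analysis would be preferable in practice.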

Work Group 3: Data Quality and Covariance Structure preservation when applying canned modules to create synthetic data from existing cohorts

This work group will apply established methods for generating synthetic data sets from a real source data set, and evaluate the extent to which covariance structures and risk factor associations are preserved in the synthetic data. Possible data sets: HRS, NHATS, or ADNI.

  • A huge advantage of this approach is that the synthetic data are fully shareable (because they are completely fabricated, although they still mirror covariance structures of the original data). Other popular options include the R synthpop package and the fabricatr package. In this project, we would use data from HRS HCAP to evaluate the links between cognitive function and educational attainment, age, diagnosis of diabetes, hypertension, or stroke, and/or depressive symptoms. We would also use HRS core to evaluate how well each of these covariates, measured in 2006, predicts cognitive change through 2022. Both analyses would be run in the original data and in synthetic versions of the data, comparing simplicity of use and performance of the competing synthetic data packages. Other reading: Viana D, Teixeira R, Baptista J, Pinto T. Synthetic Data Generation Models for Time Series: A Literature Review. In: 2024 International Conference on Electrical, Computer and Energy Technologies (ICECET); 2024 Jul 25. p. 1-6. IEEE; and https://medinform.jmir.org/2020/2/e16492/
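
The evaluation step could look something like the following sketch. It uses a deliberately simple multivariate normal generator as a stand-in synthesizer; this is an assumption made for illustration and is not how synthpop or fabricatr work. The "source" data here are themselves simulated, with hypothetical variables (age, education, cognition) and made-up coefficients, since no real cohort data are included.

```python
import numpy as np

def make_source(n, seed=0):
    """Stand-in for a real cohort: columns are age, education, cognition."""
    rng = np.random.default_rng(seed)
    age = rng.normal(70, 8, n)
    education = rng.normal(12, 3, n)
    cognition = 0.15 * education - 0.05 * (age - 70) + rng.normal(0, 1, n)
    return np.column_stack([age, education, cognition])

def synthesize_gaussian(source, n, seed=1):
    """Toy synthesizer: draw from a multivariate normal fitted to the source."""
    rng = np.random.default_rng(seed)
    mean = source.mean(axis=0)
    cov = np.cov(source, rowvar=False)
    return rng.multivariate_normal(mean, cov, size=n)

def max_correlation_gap(source, synthetic):
    """Largest absolute difference between the two correlation matrices."""
    gap = np.corrcoef(source, rowvar=False) - np.corrcoef(synthetic, rowvar=False)
    return float(np.abs(gap).max())

def slopes(data):
    """Regress cognition (column 2) on age and education (columns 0 and 1)."""
    design = np.column_stack([np.ones(len(data)), data[:, :2]])
    coef, *_ = np.linalg.lstsq(design, data[:, 2], rcond=None)
    return coef[1:]

if __name__ == "__main__":
    source = make_source(20_000)
    synthetic = synthesize_gaussian(source, 20_000)
    print("max correlation gap:", max_correlation_gap(source, synthetic))
    print("source slopes:   ", slopes(source))
    print("synthetic slopes:", slopes(synthetic))
```

The same two checks (correlation matrix gap and agreement of regression coefficients) could be applied unchanged to the output of any synthetic data package. A Gaussian synthesizer preserves linear structure by construction, so real packages should also be tested on non-linear and categorical associations.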

Work Group 4: Outcome Crosswalk via a Third Dataset to combine evidence from multiple studies using different assessments of the same cognitive construct

Integrating evidence regarding risk factors for cognitive aging and dementia from two studies that included different outcomes by using a third data set to create a crosswalk. Often, we aim to meta-analyze evidence from studies that measured cognition with related but not identical assessments. To effectively synthesize associations of various exposures with these discrepant outcomes, we need a way to crosswalk. Historically, the tools to do this have involved psychometric methods with linking items or strong distributional assumptions. In this workgroup we use data from a third data set where both outcomes are measured to create a crosswalk not between individual scores on the two measures but between coefficients derived from regressions (of whatever type) using the two measures. We then apply this crosswalk to meta-analyze results from existing studies. A possible additional analysis would be to compare the performance of this approach with a psychometric approach that estimates the latent variable and combines results. Another valuable problem would be to determine the limitations of meta-analysis across multiple cognitive outcomes by determining, for example, whether crosswalks are the same across racial/ethnic or APOE-e4 dosage groups. Possible data sets: ADAMS-HCAP, Core HRS-HCAP, ADNI.
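
One possible way to build and apply such a coefficient crosswalk is sketched below; the work group may well choose a different approach. The sketch assumes that both tests are linear functions of a single latent cognition score with different scalings, estimates a conversion factor in the third data set as the ratio of the two tests' regression slopes on a shared predictor (age), and then pools the two studies with a fixed-effect inverse-variance meta-analysis. Uncertainty in the conversion factor is ignored for simplicity. All names, scalings, and effect sizes are hypothetical.

```python
import numpy as np

def slope_and_se(y, x):
    """Simple linear regression of y on x: slope and its standard error."""
    xc = x - x.mean()
    yc = y - y.mean()
    sxx = xc @ xc
    b = (xc @ yc) / sxx
    resid = yc - b * xc
    se = np.sqrt((resid @ resid) / (len(y) - 2) / sxx)
    return float(b), float(se)

def simulate(n, effect, loadings, rng):
    """Predictor -> latent cognition -> one test score per loading."""
    predictor = rng.normal(size=n)
    latent = effect * predictor + rng.normal(size=n)
    scores = [lam * latent + rng.normal(scale=0.5, size=n) for lam in loadings]
    return predictor, scores

def run_example(n=20_000, seed=0):
    rng = np.random.default_rng(seed)
    load_a, load_b = 1.0, 2.5  # hypothetical scalings of the two tests

    # Third (bridging) data set: both tests measured, age as shared predictor.
    age, (test_a, test_b) = simulate(n, -0.5, [load_a, load_b], rng)
    factor = slope_and_se(test_a, age)[0] / slope_and_se(test_b, age)[0]

    # Study A used test A; study B used test B; same true exposure effect.
    x_a, (y_a,) = simulate(n, 0.3, [load_a], rng)
    x_b, (y_b,) = simulate(n, 0.3, [load_b], rng)
    b_a, se_a = slope_and_se(y_a, x_a)
    b_b, se_b = slope_and_se(y_b, x_b)

    # Convert study B to the test A scale (uncertainty in factor ignored).
    b_b_conv, se_b_conv = factor * b_b, abs(factor) * se_b

    # Fixed-effect inverse-variance pooling on the test A scale.
    weights = np.array([1 / se_a**2, 1 / se_b_conv**2])
    pooled = float(weights @ np.array([b_a, b_b_conv]) / weights.sum())
    pooled_se = float(1 / np.sqrt(weights.sum()))
    return {"factor": factor, "study_a": b_a, "study_b_converted": b_b_conv,
            "pooled": pooled, "pooled_se": pooled_se}

if __name__ == "__main__":
    print(run_example())
```

Repeating the factor estimation within subgroups of the third data set (for example, by racial/ethnic group or APOE-e4 dosage) would be one way to check whether a single crosswalk is adequate.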

Work Group 5: Outcome Crosswalk Using Other Predictors to combine evidence from multiple studies using different assessments of the same cognitive construct

Integrating evidence regarding risk factors for cognitive aging and dementia from two studies that included different outcomes by creating a crosswalk using other predictors. Sometimes, we aim to combine evidence from studies that measured cognition with related but not identical assessments. To effectively synthesize associations of various exposures with these discrepant outcomes, we need a way to crosswalk. Historically, the tools to do this have involved psychometric methods with linking items or strong distributional assumptions. In this workgroup we create a crosswalk not between individual scores on the two measures but between coefficients derived from regressions (of whatever type) using the two measures. We do this by using a set of benchmark exposures with varying strengths of association with the outcomes, and plotting the regression coefficients for each exposure-outcome association against one another. We then apply this crosswalk to convert the regression coefficient we would obtain with one outcome to the coefficient we would obtain with the other outcome. A possible additional analysis would be to compare the performance of this approach with a psychometric approach that estimates the latent variable and combines results. Possible data sets: ADAMS-HCAP, Core HRS-HCAP, ADNI.
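
The benchmark-exposure idea can be sketched as follows. The sketch assumes two cohorts that each measured a different cognitive test but share the same set of benchmark exposures, and that both tests are linear functions of one latent cognition score. The crosswalk is taken to be a straight line fitted through the paired coefficients, which is an assumption for illustration; the relationship could be non-linear in real data. All names, effect sizes, and scalings are hypothetical.

```python
import numpy as np

def simulate_cohort(n, effects, loading, rng):
    """Benchmark exposures -> latent cognition -> observed test score."""
    exposures = rng.normal(size=(n, len(effects)))
    latent = exposures @ effects + rng.normal(size=n)
    outcome = loading * latent + rng.normal(scale=0.5, size=n)
    return exposures, outcome

def coefficients(exposures, outcome):
    """Regression coefficients of the outcome on all benchmark exposures."""
    design = np.column_stack([np.ones(len(outcome)), exposures])
    coef, *_ = np.linalg.lstsq(design, outcome, rcond=None)
    return coef[1:]

def fit_crosswalk(coef_from, coef_to):
    """Fit a line mapping coefficients on one outcome to the other outcome."""
    slope, intercept = np.polyfit(coef_from, coef_to, deg=1)
    return float(slope), float(intercept)

def run_example(n=20_000, seed=0):
    rng = np.random.default_rng(seed)
    # Hypothetical benchmark effects with varying strength and direction.
    effects = np.array([-0.6, -0.3, -0.1, 0.1, 0.2, 0.4, 0.5])
    coef_a = coefficients(*simulate_cohort(n, effects, 1.0, rng))  # test A
    coef_b = coefficients(*simulate_cohort(n, effects, 2.5, rng))  # test B
    slope, intercept = fit_crosswalk(coef_b, coef_a)
    # Convert a coefficient reported on test B to the test A scale.
    converted = intercept + slope * 0.75
    return {"slope": slope, "intercept": intercept, "converted": converted}

if __name__ == "__main__":
    print(run_example())
```

Plotting coef_a against coef_b gives the scatter described above, and the spread of points around the fitted line gives a rough sense of how reliable the crosswalk is.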