IMPACT-MH

Project Leaders

Project Number

1UF1MH141632-01

Awardee Organization

MASSACHUSETTS GENERAL HOSPITAL

Program Official

ERIN MICHELE KING

Project Description

Maternal infection during pregnancy is one of the most strongly implicated environmental risk factors for neurodevelopmental disorders in offspring. This risk has been demonstrated in national registries, large epidemiologic studies, and electronic health records (EHR) studies, including work from the investigators. These human data are bolstered by mechanistic investigations in animal models that demonstrate the impact of maternal immune activation on rodent and non-human primate neurodevelopmental and behavioral phenotypes. Nonetheless, population prevalence estimates suggest that most offspring born to mothers with infections during pregnancy will not develop autism, ADHD, or another developmental disorder. Thus, to enable prevention and early intervention, paradigms that more precisely estimate risk among offspring exposed to maternal infection are needed.

Building reliable, generalizable risk prediction methods requires access to large, representative populations with longitudinal follow-up and the capacity to conduct phenotyping at low cost. In prior NIH-supported work, we have demonstrated that electronic health records make it possible to identify and characterize very large cohorts in silico, a subset of whom can then be contacted for more intensive follow-up. We have similarly linked an obstetric biobank to large-scale electronic health records in investigations of SARS-CoV-2 maternal and fetal immunology. This nested cohort approach combines the scale and generalizability of health records research with the precision of individual phenotyping.

The present study aims to build a series of risk stratification models that extend standard analyses of electronic health records, estimating at each step the extent to which additional features improve the capacity to predict outcomes. We will predict neurodevelopmental outcomes in early childhood (prior to age 8) among offspring of mothers with infections in pregnancy. We will generate these predictions at two actionable time points in development, birth and preschool age (3-5 years), and externally validate them in a second health system. The electronic health records models will be extended with clinical features derived from large language models, with maternal questionnaire data, and with child behavioral phenotyping.
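As a rough illustration of how the added value of each feature set might be quantified, the sketch below compares discrimination (area under the ROC curve) for an EHR-only risk score against an EHR-plus-survey risk score on the same children. The function names `auc` and `incremental_auc`, the choice of AUC as the metric, and the toy scores are all assumptions made for illustration; they are not the study's specified method or data.

```python
from typing import Sequence, Tuple


def auc(labels: Sequence[int], scores: Sequence[float]) -> float:
    """Area under the ROC curve by pairwise comparison (ties count as 0.5)."""
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative label")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))


def incremental_auc(
    labels: Sequence[int],
    base_scores: Sequence[float],
    extended_scores: Sequence[float],
) -> Tuple[float, float, float]:
    """Return (base AUC, extended AUC, difference) for two nested models."""
    base = auc(labels, base_scores)
    extended = auc(labels, extended_scores)
    return base, extended, extended - base


# Toy, made-up values for illustration only (1 = later diagnosis, 0 = none).
labels = [1, 1, 1, 0, 0, 0, 0, 0]
ehr_only = [0.8, 0.4, 0.3, 0.5, 0.35, 0.2, 0.1, 0.1]
ehr_plus_survey = [0.9, 0.6, 0.4, 0.5, 0.3, 0.2, 0.1, 0.1]

base, extended, delta = incremental_auc(labels, ehr_only, ehr_plus_survey)
print(f"EHR only: {base:.3f}  EHR + survey: {extended:.3f}  gain: {delta:.3f}")
```

In practice, the difference would be estimated on held-out or external validation data with a confidence interval (for example by bootstrap), since a raw gain on a small sample can be noise.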

Aim 1. Apply standard machine learning methods, augmented by large language models, to electronic health records from more than 150,000 maternal-fetal dyads spanning two large health systems to a) characterize maternal infection, pregnancy complications, and offspring phenotypes, validated against prospectively collected obstetric biobank data, and b) predict emergence of neurodevelopmental risk signatures by age 6-8 among offspring exposed to infection in utero.

Aim 2. Among offspring exposed to infection in utero, for a randomly selected subset of children at high risk for neurodevelopmental diagnoses and matched children at low risk for these diagnoses by age 3-5, a) collect parental questionnaire data regarding behavior at age 3-5, and b) in this nested case-control sample, predict neurodevelopmental diagnoses at age 6-8, comparing EHR alone to EHR plus parental survey and validating the resulting models in a second large health system.

Aim 3. Among offspring exposed to infection in utero, for the same subset of children at high risk for neurodevelopmental symptoms and matched children at low risk, a) conduct remote behavioral and neurocognitive phenotyping at age 3-5, and b) in this nested case-control sample, predict neurodevelopmental diagnoses at age 6-8 with behavioral phenotyping at age 3-5 added to EHR, validating the resulting models in a second large health system.