Impact-MH

Project Number

1U01MH136059-01

Awardee Organization

MASSACHUSETTS GENERAL HOSPITAL

Program Official

SARAH E. MORRIS

Project Description

A 2021 Surgeon General Advisory highlighted a crisis in youth mental health, including massive challenges in accessing resources for diagnosis and care. The report notes that improved models of psychopathology trajectories could reveal opportunities for scalable and stepped-care therapies. The literature to date indicates that prediction in transition-age youth is particularly critical; individuals in their late teens and early twenties are navigating key psychosocial milestones at a time when normative variation in behavior is difficult to disentangle from emerging mental illness. This is also a period when the impact of falling off developmental trajectories can be profound. Youth may be unable to complete their education or secure stable employment. They may also increase substance use and become entangled with the legal system, with potential for lifelong consequences.

Work by the investigators and others has demonstrated the utility of electronic health records (EHR) for developing risk stratification models in psychiatric illness. At the same time, we have shown that the coded data readily available in EHR are insufficient to reliably predict the evolution of psychopathology over time; additional data types may be required to achieve these critical goals.

RFA MH-23-105 explicitly seeks strategies to augment EHR-based models. Among the broad range of potential measures, batteries that capture dimensional traits, including quantitative symptoms and cognition, are at the core of evidence-based, developmentally relevant psychopathology frameworks, including NIMH’s Research Domain Criteria effort. Importantly, variation in such traits is not well captured by standard clinical assessments of psychopathology, even though these traits may represent important elements of emerging illness that contribute to functional outcomes. Addressing these gaps thus represents an opportunity to extend the standard characterization available in EHR. We propose to integrate such measures (i.e., symptom inventories and neuroscience-based cognitive tasks), in combination with natural language processing (NLP) of retrospectively ascertained patient notes, to enhance prediction of prospective outcomes among transition-age youth. We hypothesize that a combination of highly scalable computed phenotypes and prospectively administered brief symptom inventories and neurocognitive tests (that is, jointly observed scalable phenotypes) will improve performance of models for predicting longitudinal neuropsychiatric and functional outcomes. We will test this hypothesis by applying methods developed by the investigators to a cohort of N = 10,000 individuals, aged 18 to 20, identified from EHR in the Mass General Brigham Health Care system and assessed prospectively every 6 months for 2 years.

Aim 1. Identify, enroll, and retrospectively characterize 10,000 18- to 20-year-olds in the health systems of two large academic medical centers. We will extend coded clinical data with validated NLP methods to characterize dimensional psychiatric symptoms and cognitive functioning retrospectively for up to 10 years, using data censored 6 or 24 months prior to baseline to predict current status. H1. Models incorporating EHR + NLP measures that estimate psychiatric and cognitive features will achieve clinically and statistically significant superiority to models based on EHR data alone in predicting current psychiatric status (diagnoses and treatment).
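
The censoring step in Aim 1 can be illustrated with a minimal sketch. It is not the investigators' pipeline: the function name `censor_records`, the record layout, the example diagnosis codes and dates, and the approximation of a month as 30 days are all assumptions made for illustration. The sketch keeps only records that fall within the 10-year lookback window and at least 6 or 24 months before baseline, so that later records cannot leak into a model predicting current status.

```python
from datetime import date, timedelta


def censor_records(records, baseline, gap_months, lookback_years=10):
    """Return records dated inside the lookback window and at least
    gap_months before baseline (hypothetical helper, not from the source)."""
    # Assumption: a month is approximated as 30 days, a year as 365 days.
    cutoff = baseline - timedelta(days=30 * gap_months)
    earliest = baseline - timedelta(days=365 * lookback_years)
    return [r for r in records if earliest <= r["date"] <= cutoff]


# Illustrative placeholder records, not study data.
records = [
    {"date": date(2015, 3, 1), "code": "F41.1"},
    {"date": date(2023, 1, 15), "code": "F32.9"},
    {"date": date(2023, 11, 20), "code": "F90.0"},
]
baseline = date(2024, 1, 1)

# Two censoring horizons, matching the 6- and 24-month gaps named in Aim 1.
six_month_view = censor_records(records, baseline, gap_months=6)
two_year_view = censor_records(records, baseline, gap_months=24)
```

Under these assumptions, the 24-month view is always a subset of the 6-month view, which is what allows the two horizons to be compared on the same individuals.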

Aim 2. Collect enhanced phenotyping (i.e., neurocognitive and self-report psychiatric and psychosocial functioning data) on this cohort every 6 months, and apply both standard and novel interpretable machine learning methods to derive predictors of 180-day psychiatric outcomes. H2. Models incorporating enhanced measures, or jointly observed phenotypes, will achieve clinically and statistically significant superiority in predicting psychiatric outcomes versus models with EHR data alone.

Aim 3. Apply interpretable machine learning methods to derive predictors of 24-month outcomes using EHR data alone and with enhanced phenotyping. H3. Models incorporating EHR + enhanced measures will be clinically and statistically superior to models incorporating EHR data alone in predicting 24-month outcomes.

Impact and Significance. This study builds on a decade of work by the investigators in longitudinal phenotyping using EHR and in psychiatric and neurocognitive assessment of adolescents and young adults. In a large, generalizable cohort, we will determine the extent to which enhanced, virtual measures improve clinical prediction tools for risk stratification and mapping trajectories in transition-age youth. The low economic and implementation burden of the proposed strategy will facilitate translation to broad, diverse communities.