HALO: Developing Data-Driven Clinical Signatures for People Who Experience Hallucinations

Project Leaders

Contact PI

DROR BEN-ZEEV

Professor of Psychiatry and Behavioral Sciences

University of Washington

MPI

TREVOR COHEN

Professor of Biomedical Informatics and Medical Education

UW Medicine

Project Number

1U01MH135901-01

Awardee Organization

UNIVERSITY OF WASHINGTON

Program Official

SARAH E. MORRIS

Project Description

Hallucinations are prevalent in the context of a wide variety of mental disorders but also occur in approximately 10% of the general population. Hallucinatory experiences are readily identifiable by those experiencing them, but they are not always indicators of conditions that lead to serious negative outcomes such as hospitalizations, use of emergency services, and suicide attempts. Our risk evaluation capabilities are hampered, in part, by the limitations of our assessment strategies, which typically involve resource-intensive clinical interviews administered by trained clinicians in clinic settings. Ubiquitous smartphone technologies offer us novel opportunities to administer behavioral measures that can capture the experience and impact of hallucinations at a scope, scale, and ecological validity that far exceed clinic-based assessment capabilities. Applying powerful computational methods to the rich data collected using mobile behavioral tasks has the potential to yield tools for identification of those at heightened risk for major clinical and functional impairments. In response to RFA-MH-23-105, we propose to recruit a large sample of people experiencing hallucinations to install a smartphone behavioral measurement package that will prompt them to complete targeted brief self-report measures, audio diaries describing their hallucinatory experiences, and validated audio-administered verbal memory tasks in their own environments. Participants will also complete clinical outcome measures prospectively.
Our team will derive data-driven clinical signatures from the mobile behavioral tasks to predict individual differences in severe negative outcomes among people experiencing hallucinations; identify and mitigate bias in modelling across groups defined by race, sex, and age; examine whether adding smartphone-captured behavioral data to information that is typically available in the clinical record improves model clinical utility; and produce machine learning-ready data structures that adhere to FAIR (Findable, Accessible, Interoperable, Reusable) principles and can be shared with the broader scientific community for ongoing iterative testing and refinement. If successful, the project will produce data-driven tools that will advance our ability to allocate the right clinical resources, at the right time, to the right people.

Aim 1: Derive data-driven clinical signatures from mobile behavioral tasks to predict individual differences in severe negative outcomes among people experiencing hallucinations. Using multiple computational approaches, we will evaluate which self-report items, speech-derived features, and verbal recall patterns differentiate individuals who do or do not develop severe negative outcomes (i.e., emergency service use, hospitalization, suicide attempt) prospectively, over a one-year period.

Aim 2: Identify and mitigate bias in modelling across groups defined by race, sex, and age. All speech recognition and feature extraction methods will be evaluated for differences across participant subgroups. We will fine-tune speech and language models on publicly available subpopulation-derived data, extend lexical dictionaries as needed, and de-bias downstream machine learning models to address observed discrepancies.
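One way the subgroup evaluation of speech recognition in Aim 2 could be approached is to compare word error rate (WER) across participant groups. The sketch below is illustrative only: the project description does not specify a metric, and the function names, group labels, and sample transcripts are hypothetical.

```python
from collections import defaultdict


def word_edit_distance(ref_words, hyp_words):
    """Levenshtein distance between two word lists (substitutions, insertions, deletions)."""
    prev = list(range(len(hyp_words) + 1))
    for i, ref_word in enumerate(ref_words, 1):
        curr = [i]
        for j, hyp_word in enumerate(hyp_words, 1):
            cost = 0 if ref_word == hyp_word else 1
            curr.append(min(prev[j] + 1,          # deletion
                            curr[j - 1] + 1,      # insertion
                            prev[j - 1] + cost))  # match or substitution
        prev = curr
    return prev[-1]


def wer_by_group(samples):
    """Pooled WER per group from (group, reference, hypothesis) triples."""
    errors = defaultdict(int)
    words = defaultdict(int)
    for group, reference, hypothesis in samples:
        ref_words = reference.split()
        errors[group] += word_edit_distance(ref_words, hypothesis.split())
        words[group] += len(ref_words)
    return {g: errors[g] / words[g] for g in words if words[g] > 0}


# Hypothetical transcripts; the group labels "a" and "b" are placeholders.
samples = [
    ("a", "i heard a voice", "i heard a voice"),
    ("b", "i heard a voice", "i hear voice"),
]
rates = wer_by_group(samples)
gap = max(rates.values()) - min(rates.values())
```

A large gap between groups would flag the recognizer for the kind of fine-tuning or dictionary extension the aim describes; what counts as an acceptable gap is a study design decision not stated here.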

Aim 3: Examine whether adding smartphone-captured behavioral data to information that is typically available in the clinical record improves model clinical utility. We will use likelihood ratio tests and model fit indices to compare hierarchically nested models, testing whether the smartphone-derived parameters add predictive value above and beyond the individual-level data typically available in electronic health records.
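As a rough illustration of the nested-model comparison in Aim 3, a likelihood ratio test takes twice the difference in maximized log-likelihood between a full model and a reduced model and refers it to a chi-square distribution, with degrees of freedom equal to the number of added parameters. The sketch below assumes the two log-likelihoods have already been obtained from fitted models. The function names and numeric values are hypothetical and are not taken from the project.

```python
import math


def chi2_sf(x, df):
    """Upper-tail probability of a chi-square variable with integer df >= 1."""
    if x <= 0:
        return 1.0
    if df % 2 == 0:
        # Even df: exp(-x/2) * sum_{r=0}^{df/2 - 1} (x/2)^r / r!
        term, total = 1.0, 1.0
        for r in range(1, df // 2):
            term *= (x / 2) / r
            total += term
        return math.exp(-x / 2) * total
    # Odd df: erfc(sqrt(x/2)) + sqrt(2/pi) * exp(-x/2) * sum_r x^(r-1/2) / (1*3*...*(2r-1))
    term, total = math.sqrt(x), 0.0
    for r in range(1, (df - 1) // 2 + 1):
        if r > 1:
            term *= x / (2 * r - 1)
        total += term
    return math.erfc(math.sqrt(x / 2)) + math.sqrt(2 / math.pi) * math.exp(-x / 2) * total


def likelihood_ratio_test(loglik_reduced, loglik_full, df_diff):
    """Return (test statistic, p-value) for a reduced model nested in a full model."""
    statistic = max(0.0, 2.0 * (loglik_full - loglik_reduced))
    return statistic, chi2_sf(statistic, df_diff)


# Hypothetical values: a record-only model versus the same model plus two
# smartphone-derived features (two extra parameters).
statistic, p_value = likelihood_ratio_test(-100.0, -97.0, df_diff=2)
```

A small p-value would suggest that the added smartphone-derived parameters improve fit beyond the record-only model. The test is valid only when the reduced model is a special case of the full model and both are fit to the same observations.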

Aim 4: Produce machine learning-ready data structures that adhere to FAIR (Findable, Accessible, Interoperable, Reusable) principles. Working closely with the Data Coordinating Center, we will apply a tailored de-identification system to transcribed text. De-identified transcripts and extracted features will be harmonized with baseline data, enriched with granular metadata, indexed in searchable resources, and made available to the broader scientific community in AI-ready formats to advance machine learning efforts in this area.

Public Health Relevance Statement

Our team proposes to collect data from a large sample of people who experience hallucinations using smartphone behavioral measurement tools. With the aid of innovative computational modelling strategies, these data will be used to develop “clinical signatures” that indicate which individuals are at heightened risk for severe outcomes such as hospitalization and suicide. If successful, these measures and models can be used to guide scalable clinical decision making, resource allocation, treatment, and impactful prevention efforts.