Abstract
Background: This study examined the use of novel data mining methods such as natural language processing (NLP) on electronic health records (EHRs) to screen for and detect individuals at risk of psychosis.
Method: The study included all patients receiving a first index diagnosis of nonorganic and nonpsychotic mental disorder within the South London and Maudsley (SLaM) NHS Foundation Trust between January 1, 2008, and July 28, 2018. LASSO-regularised Cox regression was used to refine and externally validate a previously developed 5-item individualised, transdiagnostic, clinically based risk calculator (Harrell’s C = 0.79) that had been piloted for implementation. The refined version included 14 additional NLP predictors: tearfulness, poor appetite, weight loss, insomnia, cannabis, cocaine, guilt, irritability, delusions, hopelessness, disturbed sleep, poor insight, agitation and paranoia.
Results: A total of 92,151 patients with a first index diagnosis of nonorganic and nonpsychotic mental disorder within the SLaM Trust were included in the derivation (n = 28,297) or external validation (n = 63,854) data sets. Mean age was 33.6 years, 50.7% were women, and 67.0% were of white race/ethnicity. Mean follow-up was 1590 days. The overall 6-year risk of psychosis in secondary mental health care was 3.4% (95% CI, 3.3–3.6). External validation indicated strong performance on unseen data (Harrell’s C = 0.85; 95% CI, 0.84–0.86), an increase of 0.06 over the original model.
Conclusions: Using NLP on EHRs can considerably enhance the prognostic accuracy of psychosis risk calculators. This can help identify patients at risk of psychosis who require assessment and specialized care, facilitating earlier detection and potentially improving patient outcomes.
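For readers unfamiliar with the headline metric, Harrell’s C is the proportion of comparable patient pairs in which the patient with the earlier observed event was also given the higher predicted risk. The following is a minimal pure-Python sketch of that definition, not the study's code: the function name `harrell_c`, the handling of ties, and the toy data are all illustrative assumptions.

```python
from itertools import combinations


def harrell_c(times, events, risk_scores):
    """Harrell's concordance index for right-censored data (illustrative sketch).

    times       -- follow-up time for each patient
    events      -- 1 if the event was observed, 0 if the patient was censored
    risk_scores -- predicted risk; higher means the event is expected sooner
    """
    concordant = 0.0
    comparable = 0
    for i, j in combinations(range(len(times)), 2):
        if times[i] == times[j]:
            continue  # simplifying assumption: pairs with tied times are skipped
        # a is the patient with the shorter follow-up time
        a, b = (i, j) if times[i] < times[j] else (j, i)
        if not events[a]:
            continue  # shorter time is censored, so the pair is not comparable
        comparable += 1
        if risk_scores[a] > risk_scores[b]:
            concordant += 1.0
        elif risk_scores[a] == risk_scores[b]:
            concordant += 0.5  # tied predictions count as half-concordant
    if comparable == 0:
        raise ValueError("no comparable pairs")
    return concordant / comparable


# Made-up toy data, not taken from the study
toy_times = [5, 10, 15, 20]
toy_events = [1, 1, 0, 1]
toy_scores = [0.9, 0.2, 0.4, 0.1]
print(harrell_c(toy_times, toy_events, toy_scores))
```

A value of 0.5 corresponds to chance-level ranking and 1.0 to perfect ranking, which is the scale on which the reported change from 0.79 to 0.85 should be read.
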
Original language | English |
---|---|
Journal | Schizophrenia Bulletin |
DOIs | |
Publication status | Published - 11 Aug 2020 |