Abstract
Background
Advances in molecular biology, genetics and neuroimaging have the potential to improve our understanding of psychotic disorders. However, the clinical classification of psychotic disorders has remained largely unchanged and is based on criterion-based diagnostic systems (such as ICD-10 and DSM-5) which do not necessarily reflect their underlying aetiology and pathophysiology. A more refined characterisation of clinical phenotype could help to improve our understanding of these disorders.
Clinical data are increasingly recorded in the form of electronic health records (EHRs). Automated information extraction methods such as natural language processing (NLP) offer the opportunity to quickly extract and analyse large volumes of clinical data from EHRs. We sought to characterise the range of presenting symptoms in a large sample of patients with psychotic disorders using NLP.
Methods
Dataset: South London and Maudsley NHS Trust (SLaM) Biomedical Research Centre (BRC) Case Register comprising pseudonymised EHRs of over 270,000 people.
Clinical sample: 18,761 patients with an ICD-10 diagnosis of a psychotic disorder (F20, F25 or F31) and a control group of 57,999 patients with a non-psychotic disorder diagnosis (mood/affective/personality disorders without psychotic symptoms).
Data collection: The NLP software package TextHunter was used. All sentences containing keywords relevant to the following symptom categories were analysed using a support vector machine (SVM) machine learning approach: positive symptoms, negative symptoms, disorganisation, mania and catatonia. Data on 46 symptoms were obtained, with 37,211 instances annotated to provide training and gold standard data for machine learning. Of these annotations, 2,950 instances were independently annotated to determine inter-annotator agreement.
Outcomes: prevalence of psychotic symptoms and their association with ICD-10 diagnosis.
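The keyword step in the data collection description can be illustrated with a minimal sketch. It is not TextHunter's implementation: the keyword stems, the category subset, the naive sentence splitter and the function names below are all assumptions made for illustration, since the abstract does not list the actual terms used.

```python
import re

# Hypothetical keyword stems; the abstract does not give the real lists.
SYMPTOM_KEYWORDS = {
    "positive symptoms": ["paranoi", "hallucinat", "delusion"],
    "negative symptoms": ["anhedonia", "blunted affect", "apathy"],
    "catatonia": ["mutism", "catatoni", "stupor"],
}


def split_sentences(text):
    """Naive splitter: break after '.', '!' or '?' followed by whitespace."""
    return [s.strip() for s in re.split(r"(?<=[.!?])\s+", text) if s.strip()]


def candidate_sentences(text):
    """Return (category, sentence) pairs for sentences that mention a keyword."""
    hits = []
    for sentence in split_sentences(text):
        lowered = sentence.lower()
        for category, stems in SYMPTOM_KEYWORDS.items():
            if any(stem in lowered for stem in stems):
                hits.append((category, sentence))
    return hits


note = (
    "Patient denies hallucinations. Slept well. "
    "Mother reports ongoing paranoia and mutism."
)
for category, sentence in candidate_sentences(note):
    print(category, "|", sentence)
```

The first sentence of the toy note matches a keyword even though the symptom is negated, which is one reason a keyword match alone is not enough and each candidate sentence is then passed to a trained classifier (an SVM in this study).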
Results
A good degree of inter-annotator agreement was achieved (Cohen’s κ: 0.83). Machine learning NLP achieved a mean precision (positive predictive value) of 83% and recall (sensitivity) of 78%. Among patients with psychotic disorders, the most frequently documented symptoms were paranoia, disturbed sleep and hallucinations. Psychotic symptoms were not limited to patients with an ICD-10 diagnosis of a psychotic disorder and were also present in the control group.
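For readers unfamiliar with the metrics reported above, the following sketch shows how Cohen's κ, precision and recall are conventionally computed from paired labels. The label lists are toy data invented for illustration, not study data, and the function names are hypothetical.

```python
from collections import Counter


def cohens_kappa(labels_a, labels_b):
    """Cohen's kappa for two annotators labelling the same instances."""
    n = len(labels_a)
    observed = sum(a == b for a, b in zip(labels_a, labels_b)) / n
    counts_a, counts_b = Counter(labels_a), Counter(labels_b)
    # Chance agreement: product of each annotator's marginal label rates.
    expected = sum(counts_a[c] * counts_b[c] for c in counts_a) / (n * n)
    # Assumes expected < 1, i.e. the annotators did not both use one label only.
    return (observed - expected) / (1 - expected)


def precision_recall(gold, predicted, positive=1):
    """Precision (PPV) and recall (sensitivity) for the positive class."""
    pairs = list(zip(gold, predicted))
    tp = sum(g == positive and p == positive for g, p in pairs)
    fp = sum(g != positive and p == positive for g, p in pairs)
    fn = sum(g == positive and p != positive for g, p in pairs)
    # Assumes at least one predicted positive and one gold positive.
    return tp / (tp + fp), tp / (tp + fn)


# Toy labels: 1 = symptom present in the sentence, 0 = absent.
annotator_a = [1, 1, 1, 0, 0, 0, 1, 0, 1, 0]
annotator_b = [1, 1, 0, 0, 0, 0, 1, 0, 1, 1]
model_output = [1, 1, 0, 0, 0, 1, 1, 0, 1, 0]

kappa = cohens_kappa(annotator_a, annotator_b)
precision, recall = precision_recall(annotator_a, model_output)
print(round(kappa, 2), round(precision, 2), round(recall, 2))
```

Here κ is computed between two human annotators, while precision and recall compare the classifier's output against the gold standard annotations, matching the two separate uses described in the abstract.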
Discussion
We found that psychotic symptoms were not limited to patients with a specific ICD-10 diagnosis and were present in a wide range of ICD-10 disorders. These findings highlight the utility of detailed NLP-derived symptom data to better characterise psychotic disorders.
Original language | English |
---|---|
Pages (from-to) | S154-S155 |
Number of pages | 2 |
Journal | Schizophrenia Bulletin |
Volume | 44 |
Issue number | S1 |
Publication status | Published - 1 Apr 2018 |
Title | T101. ENRICHING PSYCHOTIC DISORDER CLASSIFICATION USING NATURAL LANGUAGE PROCESSING |
Projects
2 finished projects:
- Linking electronic health records with passive smartphone activity data to predict outcomes in psychotic disorders
  Patel, R. (Primary Investigator), McGuire, P. (Primary Investigator) & Curcin, V. (Primary Investigator)
  14/02/2018 → 13/02/2021
  Project: Research
- Symptom dimensions in first episode psychosis: predicting clinical outcomes using natural language processing
  Patel, R. (Primary Investigator)
  3/10/2016 → 2/10/2018
  Project: Research