T101. ENRICHING PSYCHOTIC DISORDER CLASSIFICATION USING NATURAL LANGUAGE PROCESSING

Rashmi Patel; Richard George Jackson; Robert James Stewart; Philip McGuire

doi:10.1093/schbul/sby016.377

Research output: Contribution to journal › Meeting abstract › peer-review

Abstract

Background
Advances in molecular biology, genetics and neuroimaging have the potential to improve our understanding of psychotic disorders. However, the clinical classification of psychotic disorders has remained largely unchanged and is based on criterion-based diagnostic systems (such as ICD-10 and DSM-5) which do not necessarily reflect their underlying aetiology and pathophysiology. A more refined characterisation of clinical phenotype could help to improve our understanding of these disorders.
Clinical data are increasingly recorded in the form of electronic health records (EHRs). Automated information extraction methods such as natural language processing (NLP) offer the opportunity to quickly extract and analyse large volumes of clinical data from EHRs. We sought to characterise the range of presenting symptoms in a large sample of patients with psychotic disorders using NLP.

Methods
Dataset: South London and Maudsley NHS Trust (SLaM) Biomedical Research Centre (BRC) Case Register comprising pseudonymised EHRs of over 270,000 people.
Clinical sample: 18,761 patients with an ICD-10 diagnosis of a psychotic disorder (F20, F25 or F31) and a control group of 57,999 patients with a non-psychotic disorder diagnosis (mood/affective/personality disorders without psychotic symptoms).
Data collection: The NLP software package TextHunter was used. All sentences containing keywords relevant to the following symptom categories were analysed using a support vector machine (SVM) learning approach: positive symptoms, negative symptoms, disorganisation, mania and catatonia. Data on 46 symptoms were obtained, with 37,211 instances annotated to provide training and gold-standard data for machine learning. A further 2,950 instances were independently annotated to determine inter-annotator agreement.
Outcomes: prevalence of psychotic symptoms and their association with ICD-10 diagnosis.
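
The keyword-based sentence selection described under Data collection can be illustrated with a minimal sketch. It is not TextHunter's implementation: the keyword stems, the naive sentence splitter and the function names below are all hypothetical, chosen only to show how candidate sentences might be gathered before a classifier is applied.

```python
import re

# Hypothetical keyword stems; the abstract does not list the actual keywords used.
SYMPTOM_KEYWORDS = {
    "paranoia": ["paranoi"],
    "hallucinations": ["hallucinat"],
    "disturbed sleep": ["insomnia", "poor sleep", "disturbed sleep"],
}


def split_sentences(text):
    """Naive splitter: break after '.', '!' or '?' followed by whitespace."""
    return [s.strip() for s in re.split(r"(?<=[.!?])\s+", text) if s.strip()]


def candidate_sentences(text, keywords=SYMPTOM_KEYWORDS):
    """Return (symptom, sentence) pairs for sentences mentioning a keyword stem."""
    hits = []
    for sentence in split_sentences(text):
        lowered = sentence.lower()
        for symptom, stems in keywords.items():
            if any(stem in lowered for stem in stems):
                hits.append((symptom, sentence))
    return hits


if __name__ == "__main__":
    note = "Patient denies hallucinations. Reports poor sleep for two weeks. Mood stable."
    for symptom, sentence in candidate_sentences(note):
        print(symptom, "->", sentence)
```

In this toy note the first hit is a negated mention ("denies hallucinations"). A keyword match alone cannot separate affirmed from negated mentions, which is one plausible reason for classifying each candidate sentence with a trained model rather than counting keyword hits.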

Results
A good degree of inter-annotator agreement was achieved (Cohen’s κ: 0.83). Machine learning NLP achieved a mean precision (positive predictive value) of 83% and recall (sensitivity) of 78%. Among patients with psychotic disorders, the most frequently documented symptoms were paranoia, disturbed sleep and hallucinations. Psychotic symptoms were not limited to patients with an ICD-10 diagnosis of a psychotic disorder and were also present in the control group.
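
For readers unfamiliar with the metrics reported above, the following sketch shows how Cohen's κ, precision and recall are computed for binary labels. The label vectors are invented toy data, not study data, and the function names are illustrative. The abstract reports precision and recall as means, which this sketch does not reproduce; it evaluates a single label set.

```python
def cohens_kappa(labels_a, labels_b):
    """Cohen's kappa for two annotators who labelled the same items."""
    if len(labels_a) != len(labels_b) or not labels_a:
        raise ValueError("label lists must be non-empty and of equal length")
    n = len(labels_a)
    # Observed agreement: fraction of items given the same label.
    observed = sum(a == b for a, b in zip(labels_a, labels_b)) / n
    # Chance agreement: product of each annotator's marginal label rates.
    categories = set(labels_a) | set(labels_b)
    expected = sum(
        (labels_a.count(c) / n) * (labels_b.count(c) / n) for c in categories
    )
    if expected == 1.0:
        return 1.0  # both annotators used one identical label throughout
    return (observed - expected) / (1 - expected)


def precision_recall(gold, predicted):
    """Precision (positive predictive value) and recall (sensitivity)."""
    tp = sum(1 for g, p in zip(gold, predicted) if g and p)
    fp = sum(1 for g, p in zip(gold, predicted) if not g and p)
    fn = sum(1 for g, p in zip(gold, predicted) if g and not p)
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    return precision, recall


if __name__ == "__main__":
    # Toy labels: 1 = symptom present in the sentence, 0 = absent.
    annotator_a = [1, 1, 1, 0, 0, 0, 1, 0, 1, 0]
    annotator_b = [1, 1, 0, 0, 0, 0, 1, 0, 1, 1]
    print(f"kappa = {cohens_kappa(annotator_a, annotator_b):.2f}")
    p, r = precision_recall(annotator_a, annotator_b)
    print(f"precision = {p:.2f}, recall = {r:.2f}")
```

κ corrects raw agreement for the agreement expected by chance, so it is lower than the simple percentage of matching labels whenever chance agreement is non-zero.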

Discussion
We found that psychotic symptoms were not limited to patients with a specific ICD-10 diagnosis and were present in a wide range of ICD-10 disorders. These findings highlight the utility of detailed NLP-derived symptom data to better characterise psychotic disorders.

Original language	English
Pages (from-to)	S154-S155
Number of pages	2
Journal	Schizophrenia Bulletin
Volume	44
Issue number	S1
DOIs	https://doi.org/10.1093/schbul/sby016.377
Publication status	Published - 1 Apr 2018

Access to Document

10.1093/schbul/sby016.377 (Licence: CC BY)

Linking electronic health records with passive smartphone activity data to predict outcomes in psychotic disorders
Patel, R. (Primary Investigator), McGuire, P. (Primary Investigator) & Curcin, V. (Primary Investigator)
MRC Medical Research Council
14/02/2018 → 13/02/2021
Project: Research

Symptom dimensions in first episode psychosis: predicting clinical outcomes using natural language processing
Patel, R. (Primary Investigator)
Academy of Medical Sciences
3/10/2016 → 2/10/2018
Project: Research

Cite this

@article{8cc656c9a52d403bbbbb5657c7eec8dd,

title = "T101. ENRICHING PSYCHOTIC DISORDER CLASSIFICATION USING NATURAL LANGUAGE PROCESSING",

author = "Rashmi Patel and Jackson, {Richard George} and Stewart, {Robert James} and Philip McGuire",

year = "2018",

month = apr,

day = "1",

doi = "10.1093/schbul/sby016.377",

language = "English",

volume = "44",

pages = "S154--S155",

journal = "Schizophrenia Bulletin",

issn = "0586-7614",

publisher = "Oxford University Press",

number = "S1",

}

