Deep learning with anaphora resolution for the detection of tweeters with depression: Algorithm development and validation study

Akkapon Wongkoblap; Miguel A. Vadillo; Vasa Curcin

doi:10.2196/19824

Akkapon Wongkoblap^*, Miguel A. Vadillo, Vasa Curcin

^*Corresponding author for this work

Research output: Contribution to journal › Article › peer-review

24 Citations (Scopus)

Abstract

Background: Mental health problems are widely recognized as a major public health challenge worldwide. This concern highlights the need to develop effective tools for detecting mental health disorders in the population. Social networks are a promising source of data wherein patients publish rich personal information that can be mined to extract valuable psychological cues; however, these data come with their own set of challenges, such as the need to disambiguate between statements about oneself and third parties. Traditionally, natural language processing techniques for social media have looked at text classifiers and user classification models separately, hence presenting a challenge for researchers who want to combine text sentiment and user sentiment analysis.

Objective: The objective of this study is to develop a predictive model that can detect users with depression from Twitter posts and instantly identify textual content associated with mental health topics. The model can also address the problem of anaphoric resolution and highlight anaphoric interpretations.

Methods: We retrieved the data set from Twitter by using a regular expression or stream of real-time tweets comprising 3682 users, of which 1983 self-declared their depression and 1699 declared no depression. Two multiple instance learning models were developed, one with and one without an anaphoric resolution encoder, to identify users with depression and highlight posts related to the mental health of the author. Several previously published models were applied to our data set, and their performance was compared with that of our models.

Results: The maximum accuracy, F1 score, and area under the curve of our anaphoric resolution model were 92%, 92%, and 90%, respectively. The model outperformed alternative predictive models, which ranged from classical machine learning models to deep learning models.

Conclusions: Our model with anaphoric resolution shows promising results when compared with other predictive models and provides valuable insights into textual content that is relevant to the mental health of the tweeter.

Original language	English
Article number	e19824
Journal	JMIR Mental Health
Volume	8
Issue number	8
DOIs	https://doi.org/10.2196/19824
Publication status	Published - Aug 2021

Keywords

Anaphora resolution
Deep learning
Depression
Depression markers
Mental health
Multiple-instance learning
Social media
Twitter



Cite this

@article{f6bbf8182b50495dbb1b838fbccf3fbc,
title = "Deep learning with anaphora resolution for the detection of tweeters with depression: Algorithm development and validation study",
abstract = "Background: Mental health problems are widely recognized as a major public health challenge worldwide. This concern highlights the need to develop effective tools for detecting mental health disorders in the population. Social networks are a promising source of data wherein patients publish rich personal information that can be mined to extract valuable psychological cues; however, these data come with their own set of challenges, such as the need to disambiguate between statements about oneself and third parties. Traditionally, natural language processing techniques for social media have looked at text classifiers and user classification models separately, hence presenting a challenge for researchers who want to combine text sentiment and user sentiment analysis. Objective: The objective of this study is to develop a predictive model that can detect users with depression from Twitter posts and instantly identify textual content associated with mental health topics. The model can also address the problem of anaphoric resolution and highlight anaphoric interpretations. Methods: We retrieved the data set from Twitter by using a regular expression or stream of real-time tweets comprising 3682 users, of which 1983 self-declared their depression and 1699 declared no depression. Two multiple instance learning models were developed---one with and one without an anaphoric resolution encoder---to identify users with depression and highlight posts related to the mental health of the author. Several previously published models were applied to our data set, and their performance was compared with that of our models. Results: The maximum accuracy, F1 score, and area under the curve of our anaphoric resolution model were 92%, 92%, and 90%, respectively. The model outperformed alternative predictive models, which ranged from classical machine learning models to deep learning models. Conclusions: Our model with anaphoric resolution shows promising results when compared with other predictive models and provides valuable insights into textual content that is relevant to the mental health of the tweeter.",
keywords = "Anaphora resolution, Deep learning, Depression, Depression markers, Mental health, Multiple-instance learning, Social media, Twitter",
author = "Akkapon Wongkoblap and Vadillo, {Miguel A.} and Vasa Curcin",
note = "Funding Information: AW is fully funded by a scholarship from the Royal Thai Government to study for a PhD. MAV was supported by Comunidad de Madrid (grants 2016-T1/SOC-1395 and 2020-5A/SOC-19723) and AEI/UE FEDER (grant PSI2017-85159-P). This work was partially supported by the UK Engineering and Physical Sciences Research Council (EPSRC) under grant EP/P010105/1 (CONSULT: Collaborative Mobile Decision Support for Managing Multiple Morbidities). VC is also supported by the National Institute for Health Research (NIHR) Biomedical Research Centre based at Guy's and St Thomas' National Health Service (NHS) Foundation Trust and King's College London, and the Public Health and Multimorbidity Theme of the National Institute for Health Research{\textquoteright}s Applied Research Collaboration (ARC) South London. The opinions in this paper are those of the authors and do not necessarily reflect the opinions of the funders. Publisher Copyright: {\textcopyright} Akkapon Wongkoblap, Miguel A Vadillo, Vasa Curcin. Copyright: Copyright 2021 Elsevier B.V., All rights reserved.",
year = "2021",
month = aug,
doi = "10.2196/19824",
language = "English",
volume = "8",
journal = "JMIR Mental Health",
issn = "2368-7959",
publisher = "JMIR Publications",
number = "8",
}

TY - JOUR
T1 - Deep learning with anaphora resolution for the detection of tweeters with depression
T2 - Algorithm development and validation study
AU - Wongkoblap, Akkapon
AU - Vadillo, Miguel A.
AU - Curcin, Vasa
N1 - Funding Information: AW is fully funded by a scholarship from the Royal Thai Government to study for a PhD. MAV was supported by Comunidad de Madrid (grants 2016-T1/SOC-1395 and 2020-5A/SOC-19723) and AEI/UE FEDER (grant PSI2017-85159-P). This work was partially supported by the UK Engineering and Physical Sciences Research Council (EPSRC) under grant EP/P010105/1 (CONSULT: Collaborative Mobile Decision Support for Managing Multiple Morbidities). VC is also supported by the National Institute for Health Research (NIHR) Biomedical Research Centre based at Guy's and St Thomas' National Health Service (NHS) Foundation Trust and King's College London, and the Public Health and Multimorbidity Theme of the National Institute for Health Research’s Applied Research Collaboration (ARC) South London. The opinions in this paper are those of the authors and do not necessarily reflect the opinions of the funders. Publisher Copyright: © Akkapon Wongkoblap, Miguel A Vadillo, Vasa Curcin. Copyright: Copyright 2021 Elsevier B.V., All rights reserved.
PY - 2021/8
Y1 - 2021/8

AB - Background: Mental health problems are widely recognized as a major public health challenge worldwide. This concern highlights the need to develop effective tools for detecting mental health disorders in the population. Social networks are a promising source of data wherein patients publish rich personal information that can be mined to extract valuable psychological cues; however, these data come with their own set of challenges, such as the need to disambiguate between statements about oneself and third parties. Traditionally, natural language processing techniques for social media have looked at text classifiers and user classification models separately, hence presenting a challenge for researchers who want to combine text sentiment and user sentiment analysis. Objective: The objective of this study is to develop a predictive model that can detect users with depression from Twitter posts and instantly identify textual content associated with mental health topics. The model can also address the problem of anaphoric resolution and highlight anaphoric interpretations. Methods: We retrieved the data set from Twitter by using a regular expression or stream of real-time tweets comprising 3682 users, of which 1983 self-declared their depression and 1699 declared no depression. Two multiple instance learning models were developed—one with and one without an anaphoric resolution encoder—to identify users with depression and highlight posts related to the mental health of the author. Several previously published models were applied to our data set, and their performance was compared with that of our models. Results: The maximum accuracy, F1 score, and area under the curve of our anaphoric resolution model were 92%, 92%, and 90%, respectively. The model outperformed alternative predictive models, which ranged from classical machine learning models to deep learning models. 
Conclusions: Our model with anaphoric resolution shows promising results when compared with other predictive models and provides valuable insights into textual content that is relevant to the mental health of the tweeter.

KW - Anaphora resolution
KW - Deep learning
KW - Depression
KW - Depression markers
KW - Mental health
KW - Multiple-instance learning
KW - Social media
KW - Twitter
UR - http://www.scopus.com/inward/record.url?scp=85112064830&partnerID=8YFLogxK
U2 - 10.2196/19824
DO - 10.2196/19824
M3 - Article
AN - SCOPUS:85112064830
SN - 2368-7959
VL - 8
JO - JMIR Mental Health
JF - JMIR Mental Health
IS - 8
M1 - e19824
ER -
