Abstract
1. Introduction
2. Related work
3. Measuring severity of depression
4. The DEPTWEET dataset
5. Experimental design
6. Results and discussions
7. Conclusions and future work
Funding
CRediT authorship contribution statement
Data availability
References
Abstract
Mental health research through data-driven methods has been hindered by the lack of a standard typology and the scarcity of adequate data. In this study, we leverage the clinical articulation of depression to build a typology for detecting the severity of depression in social media texts. The typology emulates the standard clinical assessment procedures of the Diagnostic and Statistical Manual of Mental Disorders (DSM-5) and the Patient Health Questionnaire (PHQ-9) to capture subtle indications of depressive disorders in tweets. Along with the typology, we present a new dataset of 40191 tweets labeled by expert annotators. Each tweet is labeled as ‘non-depressed’ or ‘depressed’. Moreover, three severity levels are considered for ‘depressed’ tweets: (1) mild, (2) moderate, and (3) severe. A confidence score is provided with each label to validate the quality of annotation. We examine the quality of the dataset by presenting summary statistics and set strong baseline results using attention-based models such as BERT and DistilBERT. Finally, we extensively address the limitations of the study to provide directions for further research.
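The labeling scheme described in the abstract (a binary label, three severity levels for ‘depressed’ tweets, and a confidence score per label) can be sketched as a small data structure. This is only an illustrative sketch: the class and field names, the assumption that confidence lies in [0, 1], and the placeholder records are hypothetical and are not taken from the released dataset.

```python
from dataclasses import dataclass
from enum import Enum
from typing import List


class Severity(Enum):
    """The four classes named in the abstract."""
    NON_DEPRESSED = "non-depressed"
    MILD = "mild"
    MODERATE = "moderate"
    SEVERE = "severe"


@dataclass(frozen=True)
class LabeledTweet:
    # Field names are hypothetical; the released files may use other names.
    text: str
    label: Severity
    confidence: float  # assumed here to lie in [0, 1]

    @property
    def is_depressed(self) -> bool:
        # Binary view: any of the three severity levels counts as 'depressed'.
        return self.label is not Severity.NON_DEPRESSED


def filter_by_confidence(tweets: List[LabeledTweet], threshold: float) -> List[LabeledTweet]:
    """Keep only tweets whose label confidence is at least `threshold`."""
    return [t for t in tweets if t.confidence >= threshold]


# Toy placeholder records, not taken from the dataset.
samples = [
    LabeledTweet("placeholder text a", Severity.NON_DEPRESSED, 0.95),
    LabeledTweet("placeholder text b", Severity.MILD, 0.55),
    LabeledTweet("placeholder text c", Severity.SEVERE, 0.80),
]
kept = filter_by_confidence(samples, threshold=0.7)
```

Keeping the severity levels in one enumeration lets the same record serve both the binary task (through `is_depressed`) and the four-class task, while the confidence threshold offers one possible way to restrict training or evaluation to higher-agreement labels.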
Introduction
Analyzing the presence of mood and psychological disorders through behavioral and linguistic cues in social media data remains a critical area of interdisciplinary research. Over the last decade, a rapidly growing number of studies have also attempted to assess related symptomatology, such as depressive disorders, self-harm, and severity of mental illness, using non-clinical data (Bucci et al., 2019). Because of the massive scale of their data, social media platforms and other online discussion forums have been particularly appealing to the research community for purposes such as population-level mental health monitoring (Conway & O’Connor, 2016), personal traits detection (Marouf et al., 2020), and cyberbullying detection (Bozyiğit et al., 2021). This massive data flow results from increasing rates of internet access and from people spontaneously and anonymously sharing their suffering, pain, and struggles on these platforms (Ofek et al., 2015). Recognizing the early symptoms of depressive disorder through a person’s language use can help prevent disastrous outcomes such as self-harm and suicide, and can help deliver effective treatment in time.
Moreover, the outbreak of the COVID-19 pandemic is likely to have devastating impacts on the mental health of millions of individuals: sharp rises have been reported under lockdown in affected areas in the incidence of mood disorders, including acute stress disorder, post-traumatic stress disorder, generalized anxiety disorder, and overall sub-clinical mental health deterioration (Singh et al., 2020). The scale of mental health deterioration during the COVID-19 pandemic and the comprehensive nature of diagnosing depressive disorders have created an unprecedented need to infer the mental states of individuals from a wide range of resources.
Recent studies have revealed that valuable insights into the impact of the pandemic on population-level mental health can be inferred from posts or comments on social media (Low et al., 2020).
Conclusions and future work
This work introduced a new typology for diagnosing depression severity from social media texts, as well as a unique dataset of labeled tweets with a confidence score for each label. The dataset was constructed based on strong ground truths and clinical validation, and it is expected to help alleviate the scarcity of mental health data to some extent. The description of the process and the challenges in creating such a dataset may motivate researchers to collect similar corpora of this scale from other social media platforms and discussion forums. The experimental results indicated that existing state-of-the-art models often fail to understand the contextual undertone of the data samples. Developing a model that is capable of comprehending the subtle relationships and differences among depression severities could lead to an even better understanding of human cognition. Moreover, analysis of the classification performance indicates that there is no distinct division of keywords among different depression severities. The same keyword might be used in different ways to express different emotions, so understanding the context of a tweet matters more than its keywords when diagnosing the severity of depression. Broader implications of this research may include enabling health professionals to personalize and direct preventative and awareness messages to the users in need.