Using machine learning to track the impact of the pandemic on mental health


Newswise – CAMBRIDGE, MA – Coping with a global pandemic has impacted the mental health of millions of people. A team of researchers from MIT and Harvard University has shown that they can measure these effects by analyzing the language people use to express their anxiety online.

Using machine learning to analyze the text of more than 800,000 Reddit posts, the researchers were able to identify changes in the tone and content of the language people used during the first wave of the Covid-19 pandemic, from January to April 2020. Their analysis revealed several key changes in mental health conversations, including an overall increase in the discussion of anxiety and suicide.

“We found that there were these natural clusters that emerged linked to suicide and loneliness, and the number of posts in these clusters more than doubled during the pandemic compared to the same months of the previous year, which is a serious concern,” says Daniel Low, a graduate student in the Program in Speech and Hearing Bioscience and Technology at Harvard and MIT and lead author of the study.

The analysis also revealed varying impacts on people who already suffer from different types of mental illness. The findings could help psychiatrists, or potentially the moderators of the Reddit forums that were studied, better identify and help people whose mental health is suffering, the researchers say.

“When the mental health needs of so many in our society are inadequately met, even in the beginning, we wanted to bring attention to the ways many people are suffering during this period, in order to amplify and inform the allocation of resources to support them,” says Laurie Rumker, a graduate student in Harvard’s PhD program in Bioinformatics and Integrative Genomics and one of the study’s authors.

Satrajit Ghosh, a principal investigator at MIT’s McGovern Institute for Brain Research, is the senior author of the study, which was published last month in the Journal of Medical Internet Research. Other authors of the paper include Tanya Talkar, a graduate student in the Program in Speech and Hearing Bioscience and Technology at Harvard and MIT; John Torous, director of the digital psychiatry division at Beth Israel Deaconess Medical Center; and Guillermo Cecchi, a principal research staff member at IBM’s Thomas J. Watson Research Center.

A wave of anxiety

The new study grew out of the MIT class 6.897 / HST.956 (Machine Learning for Healthcare), in MIT’s Department of Electrical Engineering and Computer Science. Low, Rumker, and Talkar, who were taking the course last spring, had done some previous research on using machine learning to detect mental health disorders based on how people speak and what they say. After the onset of the Covid-19 pandemic, they decided to focus their class project on analyzing Reddit forums dedicated to different types of mental illness.

“When Covid arrived, we were all curious to know if it was affecting some communities more than others,” says Low. “Reddit gives us the opportunity to go through all of these subreddits that are specialized support groups. It’s a truly unique opportunity to see how these different communities were affected differently while the wave was happening, in real time.”

The researchers analyzed posts from 15 subreddit groups dedicated to a variety of mental illnesses, including schizophrenia, depression and bipolar disorder. They also included a handful of groups dedicated to topics not specifically related to mental health, such as personal finance, fitness, and parenting.

Using different types of natural language processing algorithms, the researchers measured the frequency of words associated with topics such as anxiety, death, isolation, and substance abuse, and grouped the posts based on similarities in the language used. These approaches allowed the researchers to identify similarities between each group’s posts after the onset of the pandemic, as well as distinctive differences between the groups.
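As a rough illustration of the word-frequency step described above, the following minimal Python sketch computes, for a single post, the share of its words that belong to a topic word list. The word lists, the `tokenize` helper, and the `topic_frequencies` function are hypothetical and invented for this example; the article does not say which lexicons or tools the researchers used.

```python
import re
from collections import Counter

# Hypothetical mini-lexicons, invented for illustration only; the study's
# actual word lists are not given in this article.
LEXICONS = {
    "anxiety": {"anxious", "worried", "panic", "nervous"},
    "isolation": {"alone", "lonely", "isolated", "quarantine"},
}

def tokenize(text):
    """Lowercase a post and split it into alphabetic word tokens."""
    return re.findall(r"[a-z']+", text.lower())

def topic_frequencies(post):
    """Return, per topic, the share of the post's tokens found in that lexicon."""
    tokens = tokenize(post)
    if not tokens:
        return {topic: 0.0 for topic in LEXICONS}
    counts = Counter(tokens)
    return {
        topic: sum(counts[word] for word in words) / len(tokens)
        for topic, words in LEXICONS.items()
    }

# A made-up example post, not taken from the study's data.
example = "I feel so alone and anxious, alone in quarantine"
freqs = topic_frequencies(example)
```

Averaging such per-post shares over all posts in a forum and a given month would give one simple way to compare topics across groups and over time, though the study's own measures may well differ.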

The researchers found that while people in most support groups started posting about Covid-19 in March, the health anxiety group started much earlier, in January. However, as the pandemic progressed, the other mental health groups began to look a lot like the health anxiety group, in terms of the language that was most often used. At the same time, the personal finance group showed the most negative semantic shift from January to April 2020, and significantly increased the use of words related to economic stress and negative sentiment.

They also found that the mental health groups most negatively affected at the start of the pandemic were those linked to ADHD and eating disorders. The researchers speculate that without their usual social support systems in place, due to the lockdowns, people suffering from those disorders found it much more difficult to manage their condition. In those groups, the researchers found posts about hyperfocusing on the news and relapsing into anorexia-type behaviors because meals were not being monitored by others during quarantine.

Using another algorithm, the researchers grouped the posts into clusters such as loneliness or substance use and then tracked how these clusters changed as the pandemic progressed. Suicide-related posts more than doubled from pre-pandemic levels, and the groups that became significantly associated with the suicide cluster during the pandemic were the support groups for borderline personality disorder and post-traumatic stress disorder.
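The tracking step can be sketched in a few lines of Python, assuming each post has already been assigned a cluster label by some earlier clustering step. The records, the period names "pre" and "mid", and the helper functions below are made up for illustration and are not the study's data or method.

```python
from collections import Counter

# Made-up (period, cluster_label) records for illustration only; these are
# not data from the study. Labels are assumed to come from a prior
# clustering step.
posts = [
    ("pre", "loneliness"), ("pre", "substance use"), ("pre", "substance use"),
    ("mid", "loneliness"), ("mid", "loneliness"), ("mid", "substance use"),
]

def cluster_counts(records, period):
    """Count how many posts fall in each cluster during one period."""
    return Counter(label for p, label in records if p == period)

def growth_ratio(records, cluster, before="pre", after="mid"):
    """Ratio of a cluster's post count after vs. before; None if no baseline."""
    base = cluster_counts(records, before)[cluster]
    if base == 0:
        return None
    return cluster_counts(records, after)[cluster] / base

loneliness_ratio = growth_ratio(posts, "loneliness")
substance_ratio = growth_ratio(posts, "substance use")
```

A ratio above 2 for a cluster would correspond to the "more than doubled" kind of change described above. In practice the counts would also need to be normalized by overall posting volume, which this sketch leaves out.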

The researchers also found that new topics had emerged, related specifically to seeking mental health help or social interaction. “The topics within these subreddit support groups were changing a bit, as people tried to adjust to a new life and focused on how they could get more help if needed,” says Talkar.

While the authors point out that they cannot implicate the pandemic as the sole cause of the observed language changes, they note that there was a much more significant change during the period from January to April in 2020 than in the same months in 2019 and 2018, indicating that the changes cannot be explained by normal annual trends.

Mental health resources

This type of analysis could help healthcare professionals identify segments of the population that are most vulnerable to declines in mental health caused not only by the Covid-19 pandemic but also by other mental health stressors such as controversial elections or natural disasters, the researchers say.

Additionally, when applied to Reddit or other social media posts in real time, this analysis could be used to offer users additional resources, such as guidance to a different support group, information on how to find mental health treatment, or the number of a suicide hotline.

“Reddit is a valuable source of support for many people with mental health problems, many of whom may not have formal access to other types of mental health support, so there are implications of this work for ways that support within Reddit could be provided,” Rumker says.

The researchers now intend to apply this approach to investigate whether posts on Reddit and other social media sites can be used to detect mental health disorders. An ongoing project involves screening posts on a social media site for veterans for suicide risk and post-traumatic stress disorder.


The research was funded by the National Institutes of Health and the McGovern Institute.


