Bot to the Rescue? Effects of a Fully Automated Conversational Agent on Anxiety and Depression: A Randomized Controlled Trial

Research Article

Ann Depress Anxiety. 2021; 8(1): 1107.

Gutu SM, Cosmoiu A, Cojocaru D, Turturescu T, Popoviciu CM and Giosan C*

Department of Psychology, University of Bucharest, Romania

Corresponding author: Cezar Giosan, Department of Psychology, University of Bucharest, Romania

Received: July 31, 2021; Accepted: August 17, 2021; Published: August 24, 2021


Web-based conversational agents powered by Artificial Intelligence (AI) and rooted in cognitive-behavioral therapy have been proven efficacious in alleviating the symptoms of anxiety and depression, when compared to passive controls. However, the benefits of a fully automated agent vs. active controls have not yet been examined. Furthermore, the potential impact of such interventions on the transdiagnostic factors underlying anxiety and depression is not known.

To elucidate this, 95 adults were randomized to receive (1) a 2-week intervention with an AI-powered chatbot (Woebot) (n=39) or (2) regular psychoeducational materials (n=54). In completers’ analyses, significant main effects of time were obtained for one of the primary outcomes, anxiety, and for the secondary outcomes, transdiagnostic factors, with both groups showing decreased anxiety and intolerance of uncertainty and increased rumination, self-compassion, guilt and shame. No group by time interaction effects were found for either of the primary outcomes, depression and anxiety, or for the secondary outcomes. Intent-to-Treat analyses also revealed no significant effects of group on the primary or secondary outcomes. Our findings point to the necessity of further research to better understand the areas where chatbots might bring benefits superior to those obtained through simple and inexpensive strategies.

Keywords: Mobile mental health; mHealth; Depression; Anxiety; Transdiagnostic factors; Health apps


AI: Artificial Intelligence; ANOVA: Analysis of Variance; CBT: Cognitive-Behavioral Therapy; ITT: Intention-to-Treat; M: Mean; RCT: Randomized Controlled Trial; SD: Standard Deviation


Mental disorders, which affect up to 29% of people in their lifetime [1] and come with significant societal and personal costs [2], have increased in prevalence and severity [3-6]. The onset period for several mental disorders, especially mood and anxiety disorders, is the early 20s [4,7,8], with significantly higher rates of depression found in college students than in the general population [9]. Subclinical levels of depression and anxiety also lead to significant impairment [10,11].

While these realities point to the critical role of interventions in alleviating such symptoms, 35.5% to 50.3% of serious cases in developed countries and 76.3% to 85.4% in less developed countries do not end up receiving professional care [12]. The 2018 National Survey on Drug Use and Health reported that up to 56.7% of Americans with some form of mental illness received no treatment, regardless of the form and severity of the illness [13]. Tellingly, only 16.4% of students meeting the criteria for a mental illness receive adequate treatment for it [7].

Although the reasons for not receiving psychological care vary across individuals, the reported barriers include a perceived lack of need for treatment, lack of time, a preference for self-management, and perceived stigma and embarrassment [14,15].

In addition to these obstacles, the current pandemic context brings further limitations for conventional face-to-face psychological assistance, as social distancing, mask wearing, and surface disinfection are mandatory. This points to the importance of exploring alternative means of providing psychological care, such as the automated CBT interventions recommended by The National Institute for Health and Care Excellence, which can offer information and guidance similar to treatments delivered by standard methods [16].

Owing to the enormous recent increase in computing power, conversational agents (chatbots) powered by Artificial Intelligence (AI) (e.g., Replika, Shim, Woebot, Wysa) have emerged as a potentially useful therapeutic method in recent years. Chatbots are cheap, easily accessible, and do not suffer from scale-up challenges. However, while the use of therapy bots has increased recently, the technology behind these kinds of interventions is still experimental in nature and the field lacks high-quality evidence derived from randomized controlled studies [17].

Given that many such solutions may be marketed to vulnerable individuals, it is imperative to rigorously validate their claims of mental health improvements. Some evidence for the benefits of the use of chatbots in psychiatry is positive, but there are concerns about the lack of higher quality evidence for any type of diagnosis and intervention in mental health research that uses them [18]. A systematic review of these types of interventions found that they can be effective in reducing depression, anxiety, stress, and substance use, but, out of the apps that were reviewed, only two were available for commercial use [19].

There is some evidence that web-based conversational agents rooted in cognitive-behavioral theory can be efficacious in alleviating the symptoms of some mental health conditions, such as anxiety and depression. For instance, a pilot randomized clinical trial on the effectiveness of, and adherence to, an AI-powered smartphone app that delivered strategies used in positive psychology and CBT interventions through a conversational interface reported no significant changes in the intervention group compared to a waitlist on any of the outcome measures. However, when the analysis included only the participants who adhered to the intervention, there was a significant group-by-time interaction effect on psychological well-being and perceived stress, with small to large effect sizes [20]. Likewise, another RCT on an AI-powered psychological intervention, Tess, showed a significant reduction of self-reported symptoms of depression and anxiety in college students, compared to a control group who received informational materials [21].

Another AI-driven conversational agent, Woebot, a fully automated CBT-driven chatbot, also showed promise in an RCT which compared it to a passive control group, in that it led to a greater decrease in depression and anxiety, although the control group’s adherence to the intervention was not examined [22].

To date, to our knowledge, the research on the mediators and mechanisms of change in automated, computerized interventions has solely focused on symptom changes through specific therapeutic protocols for mental disorders. However, there is a growing body of evidence on the impact of transdiagnostic factors on mental health [23]. Transdiagnostic factors are vulnerability factors that overlap across several mental disorders [24]. Thus, treatments targeting key transdiagnostic factors (i.e., common vulnerability factors) could have a general impact across multiple disorders and prove efficacious in preventing declines in mental health [24].

For depression and anxiety, which tend to co-occur [25], the following major transdiagnostic factors have been identified: rumination, guilt, shame, intolerance of uncertainty (all associated with negative outcomes [23]), and self-compassion (associated with positive outcomes [26]).

Recent progress on the merits of AI-powered conversational agents notwithstanding, little is currently known about the potential impact of such chatbots on the transdiagnostic factors that underlie anxiety and depression.

To this end, the present study’s objectives were twofold: (1) to evaluate the efficacy of a CBT-oriented conversational agent, Woebot, in reducing anxiety and depression, compared to an active control group who received psychoeducational materials that they needed to show mastery of, and (2) to examine the role of this conversational agent in reducing the severity of the transdiagnostic factors associated with depression and anxiety.


Recruitment and procedure

Potential participants were recruited through announcements on social media websites such as Facebook and Instagram. The inclusion criteria were: at least 18 years old; access to a computer, mobile phone, or tablet and the Internet; and the ability to read and write in English (at least B2 level of English in the Common European Framework of Reference). The study was approved by the Ethics Committee of a large university in Europe.

After signing an informed consent form, all participants were assigned a personal code and sent an online baseline evaluation. Confirmed participants (i.e., those who completed the baseline evaluation) were randomized to either the experimental (i.e., Woebot) or the control group. After approximately one week (T2), all enrolled participants were contacted to fill out an instrument assessing the transdiagnostic factors, and those in the experimental group were required to send a screenshot of their time spent on Woebot and their check-in diagram to check for treatment adherence. After two weeks (T3), the participants were contacted once again to complete the initial set of scales, and those in the experimental condition also sent screenshots of their check-in diagram and time spent within the app. The primary outcomes (anxiety and depression) were measured at pre-intervention and post-intervention, whereas rumination, intolerance of uncertainty, shame, guilt, and self-compassion were additionally assessed mid-intervention (after seven days), in order to test for their effects as mediators of treatment outcome. Participants who completed all three sets of evaluations were entered in a raffle for the opportunity to win the equivalent of US $20.

Data collection was done exclusively online; the online instruments were created using Google Forms and QuestionPro.


An a priori power analysis was conducted with G*Power [27], as informed by previous trials exploring the efficacy of fully automated conversational agents [22]. For a medium effect size of f = .25 (approximately equivalent to a partial η² of .06), at a statistical power of .80 and an alpha of .05, a total of 38 participants (19 per trial arm) was deemed sufficient. However, to allow for attrition, a higher number of participants was recruited (Figure 1).
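
The correspondence between the effect size f = .25 and a partial eta squared of about .06 quoted above follows from the standard conversion between Cohen's f and partial eta squared. The short Python sketch below is only an illustration of that arithmetic; the function names are hypothetical and are not part of the study's analysis. It does not attempt to reproduce the G*Power sample size calculation, which depends on design settings not reported in this section.

```python
def f_to_partial_eta_squared(f: float) -> float:
    """Convert Cohen's f to partial eta squared: eta_p^2 = f^2 / (1 + f^2)."""
    return f ** 2 / (1.0 + f ** 2)


def partial_eta_squared_to_f(eta_sq: float) -> float:
    """Inverse conversion: f = sqrt(eta_p^2 / (1 - eta_p^2))."""
    return (eta_sq / (1.0 - eta_sq)) ** 0.5


# Medium effect size used in the power analysis (value taken from the text).
eta_sq = f_to_partial_eta_squared(0.25)
print(round(eta_sq, 3))  # 0.059, i.e. approximately .06
```

Rounding the result to two decimals gives the value of .06 stated in the text.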