Safety Evaluation of a Clinical-Grade Generative AI Agent for Anxiety and Depression Symptoms
Abstract
Background: Generative AI could radically improve engagement with digital mental health interventions. However, responsible deployment requires controls around non-deterministic outputs as well as evidence of safety.
Objective: This study aimed to evaluate the user safety risk and early signals of clinical effectiveness of a clinician-curated generative AI care agent for anxiety and depression symptoms, using large-scale simulation and real-world testing.
Methods: The digital program under investigation delivers structured Cognitive Behavioral Therapy (CBT) skills training through a constrained generative AI architecture. A multi-agent safety system combining synthetic high-risk scenario testing, automated harm detection, and clinician oversight was developed to ensure user safety. Safety and early indications of symptom reduction were assessed through 1) evaluation of 43,325 simulated responses to a mix of high- and low-risk synthetic patients, and 2) a 2-week prospective study of US adults with moderate-to-severe symptoms of anxiety and depression (N=85).
Results: In simulation experiments, potentially harmful outputs occurred in <1 in 10,000 responses (0.01%, 95% CI [0.01%, 0.03%]); none encouraged harm to self or others, judged or actively invalidated the user, or used offensive language. In real use, no harmful outputs were observed (<1 in 12,000), no serious adverse events occurred, and deterioration rates were 5% and 3% for anxiety and depression symptom scores, respectively (within expected bounds for psychotherapy). Clinically meaningful reductions were seen for anxiety (B = −5.3, d = 1.1) and depression (B = −5.8, d = 1.2) symptoms, with >50% of users meeting responder criteria after a median of ~90 minutes of use.
Conclusions: A constrained generative AI architecture with multi-layered safety oversight can deliver clinically aligned and safe mental health support. Although controlled trials are needed to confirm intervention effectiveness, this dual evaluation, combining high-throughput simulation with real-world deployment, offers a scalable model for the responsible use and continual evaluation of generative AI in mental healthcare.
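The Results report a rare-event rate with a 95% confidence interval and an upper bound for a setting in which no harmful outputs were observed. The abstract does not state which interval method was used, so the following is only a minimal sketch of two common choices: a Wilson score interval for a small proportion, and the exact one-sided upper bound when zero events are seen (close to the "rule of three", 3/n). The function names and the example counts are hypothetical and are not taken from the study.

```python
import math


def wilson_interval(events: int, n: int, z: float = 1.96) -> tuple[float, float]:
    """Two-sided Wilson score interval for a binomial proportion.

    z = 1.96 corresponds to an approximate 95% interval.
    """
    if n <= 0 or not 0 <= events <= n:
        raise ValueError("need n > 0 and 0 <= events <= n")
    p = events / n
    denom = 1 + z * z / n
    centre = (p + z * z / (2 * n)) / denom
    half_width = z * math.sqrt(p * (1 - p) / n + z * z / (4 * n * n)) / denom
    return max(0.0, centre - half_width), min(1.0, centre + half_width)


def zero_event_upper_bound(n: int, confidence: float = 0.95) -> float:
    """Exact one-sided upper bound on the event rate when 0 of n trials had an event.

    Solves (1 - p)^n = 1 - confidence for p; for large n this is close to 3/n
    at 95% confidence.
    """
    if n <= 0:
        raise ValueError("need n > 0")
    return 1 - (1 - confidence) ** (1 / n)


if __name__ == "__main__":
    # Illustrative counts only (assumption): NOT the study's data.
    low, high = wilson_interval(events=3, n=20_000)
    print(f"Wilson 95% CI: [{low:.5%}, {high:.5%}]")
    print(f"Upper bound with 0 events in 20,000: {zero_event_upper_bound(20_000):.5%}")
```

With zero observed events the bound shrinks roughly in proportion to 1/n, which is why a claim of the form "fewer than 1 in N" depends on the number of responses reviewed.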