AI models reveal distinct voice patterns related to mental health

Monday 31 Jan 22


Jakob Eyvind Bardram
Head of Sections, Professor
DTU Health Tech
+45 45 25 53 11


Lars Vedel Kessing
Psychiatric Center Copenhagen, Rigshospitalet
+45 38 64 70 81


Maria Faurholt-Jepsen
Psychiatric Center Copenhagen
+45 38 64 70 73


Darius Adam Rohani

In two recently published papers [1,2], we demonstrate how voice signals recorded from a smartphone can be used to:

  • discriminate among patients with a bipolar or depressive disorder and healthy participants, and
  • predict mood and symptom changes in patients with bipolar or depressive disorders.

Lars V. Kessing, a professor of psychiatry, has been treating patients with bipolar disorder for more than 20 years. During his conversations with patients, he often noticed that their tone of voice was strongly associated with their current mood. This motivated us to investigate whether a patient's mood could be detected automatically using voice analysis of phone conversations. More specifically, we wanted to investigate whether the voice could serve as an objective marker for classifying persons with a mood disorder, and whether voice changes are associated with the symptoms that bipolar and depressive patients are currently experiencing, such as depressive mood or increased activity.

Across two RCT studies, RADMIS and Bio, we installed the Monsenso System [3] on the participants’ phones to record voice signals from both incoming and outgoing conversations. The dataset was the first to include as many as 228 people, including 121 patients with bipolar disorder and 48 patients with a depressive disorder. Furthermore, we had voice data from first-degree relatives of patients with a mood disorder, who are at higher genetic risk of developing a mood disorder later in life.

We fed the voice data to a Random Forest model and found that the recorded voice data of healthy participants were statistically significantly different from those of the bipolar [1] and depressive patients [2]. Surprisingly, the difference was also present when compared to the first-degree relatives [1]. We saw no difference when comparing bipolar patients and depressive patients [2].

Models that used the recorded voice signals to predict changes in symptoms in patients with bipolar and depressive disorders had low sensitivity and specificity (illustrated in Figure 1A).

[Figure 1: (A) user-independent ROC curves, (B) user-dependent ROC curves]

Figure 1 A&B. The ROC curve for the classifications of different states based on voice features in patients with bipolar disorder.

A) The user-independent models. B) The user-dependent models.

ROC curve

The figure shows one ROC curve for each recorded symptom in the patients with bipolar disorder. A diagonal line represents a model that performs no better than chance. In contrast, a curve that reaches closer to the upper left corner indicates a model that is better at discriminating between the different symptom states.
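To make the ROC description concrete, the following is a minimal sketch of how a ROC curve and the area under it (AUC) can be computed from model scores. It is not the code used in the papers. The labels and scores are made-up illustration values, and the function names `roc_points` and `auc` are hypothetical. An AUC of 0.5 corresponds to the diagonal line, and an AUC of 1.0 to a curve that passes through the upper left corner.

```python
def roc_points(labels, scores):
    """Return (false positive rate, true positive rate) pairs, one per threshold.

    labels: 1 = symptom present, 0 = symptom absent (illustrative encoding).
    scores: model output, where a higher score means "more likely present".
    """
    positives = sum(labels)
    negatives = len(labels) - positives
    if positives == 0 or negatives == 0:
        raise ValueError("need at least one positive and one negative label")

    points = [(0.0, 0.0)]
    # Sweep the decision threshold from the highest score downwards.
    # Everything scoring at or above the threshold is predicted positive.
    for threshold in sorted(set(scores), reverse=True):
        tp = sum(1 for y, s in zip(labels, scores) if s >= threshold and y == 1)
        fp = sum(1 for y, s in zip(labels, scores) if s >= threshold and y == 0)
        sensitivity = tp / positives           # true positive rate
        false_positive_rate = fp / negatives   # equals 1 - specificity
        points.append((false_positive_rate, sensitivity))
    return points


def auc(points):
    """Area under the ROC curve, using the trapezoidal rule."""
    area = 0.0
    for (x0, y0), (x1, y1) in zip(points, points[1:]):
        area += (x1 - x0) * (y0 + y1) / 2
    return area


# Made-up example: six recordings, three with the symptom present.
example_labels = [1, 1, 0, 1, 0, 0]
example_scores = [0.9, 0.8, 0.7, 0.6, 0.3, 0.2]
print(round(auc(roc_points(example_labels, example_scores)), 3))  # 0.889
```

Each point on the curve pairs one sensitivity value with one value of 1 minus specificity, so a model with low sensitivity and specificity stays close to the diagonal.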

Interestingly, when we developed a person-specific model, which was trained on a single patient, the sensitivity and specificity increased (illustrated in Figure 1B).
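The difference between the two kinds of models lies in how the data is divided for training and testing. The sketch below illustrates one common way to set this up. It is an assumption for illustration, not the validation scheme reported in the papers: the user-independent case is shown as leave-one-participant-out, and the user-dependent case as a chronological split within each participant. The record fields `participant` and `day`, the function names, and the `train_fraction` parameter are all hypothetical.

```python
from collections import defaultdict


def user_independent_splits(recordings):
    """Leave-one-participant-out: test on a person the model has never seen."""
    participants = sorted({r["participant"] for r in recordings})
    for held_out in participants:
        train = [r for r in recordings if r["participant"] != held_out]
        test = [r for r in recordings if r["participant"] == held_out]
        yield held_out, train, test


def user_dependent_splits(recordings, train_fraction=0.75):
    """Per participant: train on earlier recordings, test on later ones."""
    by_participant = defaultdict(list)
    for r in recordings:
        by_participant[r["participant"]].append(r)
    for participant, items in sorted(by_participant.items()):
        items = sorted(items, key=lambda r: r["day"])
        cut = int(len(items) * train_fraction)
        yield participant, items[:cut], items[cut:]


# Made-up example: three participants with four recordings each.
example = [
    {"participant": p, "day": d}
    for p in ("a", "b", "c")
    for d in range(4)
]

for held_out, train, test in user_independent_splits(example):
    print("independent", held_out, len(train), len(test))

for participant, train, test in user_dependent_splits(example):
    print("dependent", participant, len(train), len(test))
```

In the user-independent setup the model must generalize to a new person's voice, whereas in the user-dependent setup it only needs to recognize changes relative to that same person's earlier recordings.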

In conclusion, we saw that voice features contain patterns that a model can use to distinguish between healthy participants and patients with mood disorders. However, these voice features were not generic. We could not train a generic model that uses voice features to classify whether a new patient is experiencing depressive or manic symptoms. Instead, when a model was personalized to a given patient, we could use voice to characterize when the person entered (hypo)manic or depressive phases.

These results are significant as they pave the way for a supportive tool for patients newly discharged from the hospital. A system that uses such a model could warn a patient or the psychiatrist if the patient is entering a depressive or manic phase. This could be used in a telemedicine setup, where clinical staff could call and support the patient and thereby prevent severe symptoms.


  1. Faurholt-Jepsen, M., Rohani, D. A., Busk, J., Vinberg, M., Bardram, J. E., & Kessing, L. V. (2021). Voice analyses using smartphone-based data in patients with bipolar disorder, unaffected relatives and healthy control individuals, and during different affective states. International Journal of Bipolar Disorders, 9(1), 1-13. [online]
  2. Faurholt-Jepsen, M., Rohani, D. A., Busk, J., Tønning, M. L., Vinberg, M., Bardram, J. E., & Kessing, L. V. (2021). Discriminating between patients with unipolar disorder, bipolar disorder, and healthy control individuals based on voice features collected from naturalistic smartphone calls. Acta Psychiatrica Scandinavica. [online]
  3. Bardram, J. E., Frost, M., Szántó, K., Faurholt-Jepsen, M., Vinberg, M., & Kessing, L. V. (2013, April). Designing mobile health technology for bipolar disorder: a field trial of the monarca system. In Proceedings of the SIGCHI conference on human factors in computing systems (pp. 2627-2636). [online]