Table 2 Classification results of emotion fusion

From: Hearing vocals to recognize schizophrenia: speech discriminant analysis with fusion of emotions and features based on deep learning

| Model | (a, b, c) | Accuracy (%) | Balanced Accuracy (%) | Sensitivity (%) | Specificity (%) | AUC (%) |
|---|---|---|---|---|---|---|
| ResNet18 | 0.15, 0.75, 0.10; 0.20, 0.75, 0.05 | 87.0 (3.4) | 83.2 (4.5) | 93.6 (2.2) | 72.9 (7.3) | 92.3 (3.2) |
| ResNet18_ASE | 0.00, 0.50, 0.50 | 89.6 (4.2) | 87.7 (4.6) | 93.0 (5.1) | 82.4 (7.8) | 92.9 (4.3) |
| ResNet18_MFCC | 0.45, 0.45, 0.10 | 91.3 (4.6) | 89.6 (6.0) | 94.2 (6.2) | 85.0 (13.0) | 96.2 (2.4) |
| ResNet18_ASE_MFCC | 0.10, 0.75, 0.15 | 91.7 (5.0) | 90.0 (6.1) | 94.9 (4.9) | 85.1 (10.9) | 96.3 (3.1) |

  1. The values outside the parentheses are the means obtained from five-fold cross-validation; the values in parentheses are the standard deviations. In the (a, b, c) column, the ResNet18 model yields the same results for two parameter sets, and the boldfaced number indicates the largest of the three parameters
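
Balanced accuracy in the table is the mean of sensitivity and specificity (e.g., for ResNet18, (93.6 + 72.9)/2 ≈ 83.2). The sketch below illustrates one plausible reading of the (a, b, c) parameters as weights in a score-level fusion of three emotion-specific model outputs; the function names, array shapes, and the fusion rule itself are illustrative assumptions, not the paper's implementation.

```python
import numpy as np

# Illustrative only: a possible score-level fusion of three emotion-specific
# outputs with weights (a, b, c); the paper's actual fusion rule may differ.
def fuse_emotion_scores(p1, p2, p3, a, b, c):
    """Weighted sum of per-emotion class-probability arrays.

    p1, p2, p3: arrays of shape (n_samples, n_classes), e.g. softmax outputs.
    a, b, c:    non-negative fusion weights, assumed here to sum to 1.
    Returns the predicted class index for each sample.
    """
    fused = a * np.asarray(p1) + b * np.asarray(p2) + c * np.asarray(p3)
    return fused.argmax(axis=1)

# Example with the ResNet18_ASE_MFCC weights from the table and random
# placeholder probabilities for 8 samples and 2 classes (patient / control).
rng = np.random.default_rng(0)
p1, p2, p3 = (rng.dirichlet([1.0, 1.0], size=8) for _ in range(3))
print(fuse_emotion_scores(p1, p2, p3, a=0.10, b=0.75, c=0.15))
```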