Deep Multimodal Fusion Convolutional Neural Network for Emotion Recognition
Abstract
Emotion recognition plays an important role in identifying a person’s feelings. Relying on a single feature modality often fails to produce accurate recognition when that modality is ambiguous. This research develops a new model, a deep convolutional neural network with trial-and-error-based fusion (TE-DCNN), for emotion recognition. The proposed TE-DCNN model extracts features from the audio, visual, and text modalities to enhance the emotion recognition process. In this approach, three DCNN models are trained, one per modality, which reduces time dependencies and makes recognition faster than other methods. The model adopts a trial-and-error-based (TE) fusion method to combine the three modalities, which helps avoid over-fitting. The TE-DCNN model delivers better results while minimizing computational complexity, and it is flexible and scalable for recognizing human emotions. Its performance was evaluated using five metrics: accuracy, specificity, precision, recall, and F1 score, achieving 94.33%, 94.58%, 93.80%, 94.08%, and 93.94%, respectively, surpassing other state-of-the-art methods.
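The abstract does not specify how the TE fusion is realized, so the following is a minimal sketch of one plausible reading: three small per-modality CNNs produce class probabilities, and the fusion weights are searched by trial and error (a grid search) on held-out data. All layer sizes, input shapes, the seven-class label set, and every name below are illustrative assumptions, not the authors' published configuration.

```python
# Minimal sketch of a TE-DCNN-style pipeline: per-modality CNNs plus a
# trial-and-error (grid) search over decision-level fusion weights.
# Everything here is an illustrative assumption, not the paper's setup.
import itertools

import torch
import torch.nn as nn

NUM_CLASSES = 7  # assumed number of emotion categories


class ModalityCNN(nn.Module):
    """Toy 1-D CNN over a per-frame feature sequence for one modality."""

    def __init__(self, in_channels: int, num_classes: int = NUM_CLASSES):
        super().__init__()
        self.features = nn.Sequential(
            nn.Conv1d(in_channels, 32, kernel_size=3, padding=1),
            nn.ReLU(),
            nn.AdaptiveAvgPool1d(1),  # pool over time -> (batch, 32, 1)
        )
        self.classifier = nn.Linear(32, num_classes)

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        h = self.features(x).squeeze(-1)                  # (batch, 32)
        return torch.softmax(self.classifier(h), dim=-1)  # class probs


def fuse(probs, weights):
    """Weighted sum of the per-modality probability tensors."""
    return sum(w * p for w, p in zip(weights, probs))


def trial_and_error_fusion(probs, labels, steps=10):
    """Grid-search fusion weights (summing to 1) that maximize accuracy."""
    best_w, best_acc = None, -1.0
    for ia, iv in itertools.product(range(steps + 1), repeat=2):
        it = steps - ia - iv  # remaining weight goes to the text branch
        if it < 0:
            continue
        w = (ia / steps, iv / steps, it / steps)
        acc = (fuse(probs, w).argmax(dim=-1) == labels).float().mean().item()
        if acc > best_acc:
            best_w, best_acc = w, acc
    return best_w, best_acc


# Toy usage with random tensors standing in for validation-set features.
audio_net, video_net, text_net = ModalityCNN(40), ModalityCNN(3), ModalityCNN(300)
xa = torch.randn(8, 40, 50)    # e.g. 40 MFCC coefficients x 50 frames (assumed)
xv = torch.randn(8, 3, 50)     # e.g. 3 visual feature channels (assumed)
xt = torch.randn(8, 300, 50)   # e.g. 300-d word embeddings (assumed)
labels = torch.randint(0, NUM_CLASSES, (8,))
with torch.no_grad():
    probs = [audio_net(xa), video_net(xv), text_net(xt)]
weights, acc = trial_and_error_fusion(probs, labels)
print(f"best fusion weights {weights}, held-out accuracy {acc:.2f}")
```

Under this reading, the three networks are trained independently and the fusion step only searches a small grid of weight combinations, which is consistent with the abstract's claims of reduced time dependencies and resistance to over-fitting.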