Til hovedinnhold
Norsk English

Emotion recognition using speech and neural structured learning to facilitate edge intelligence

Sammendrag

Emotions are quite important in our daily communications and recent years have witnessed a lot of research works to develop reliable emotion recognition systems based on various types data sources such as audio and video. Since there is no apparently visual information of human faces, emotion analysis based on only audio data is a very challenging task. In this work, a novel emotion recognition is proposed based on robust features and machine learning from audio speech. For a person independent emotion recognition system, audio data is used as input to the system from which, Mel Frequency Cepstrum Coefficients (MFCC) are calculated as features. The MFCC features are then followed by discriminant analysis to minimize the inner-class scatterings while maximizing the inter-class scatterings. The robust discriminant features are then applied with an efficient and fast deep learning approach Neural Structured Learning (NSL) for emotion training and recognition. The proposed approach of combining MFCC, discriminant analysis and NSL generated superior recognition rates compared to other traditional approaches such as MFCC-DBN, MFCC-CNN, and MFCC-RNN during the experiments on an emotion dataset of audio speeches. The system can be adopted in smart environments such as homes or clinics to provide affective healthcare. Since NSL is fast and easy to implement, it can be tried on edge devices with limited datasets collected from edge sensors. Hence, we can push the decision-making step towards where data resides rather than conventionally processing of data and making decisions from far away of the data sources. The proposed approach can be applied in different practical applications such as understanding peoples’ emotions in their daily life and stress from the voice of the pilots or air traffic controllers in air traffic management systems.
Les publikasjonen

Kategori

Vitenskapelig artikkel

Språk

Engelsk

Institusjon(er)

  • SINTEF Digital / Sustainable Communication Technologies

År

2020

Publisert i

Engineering Applications of Artificial Intelligence

ISSN

0952-1976

Forlag

Elsevier

Årgang

94

Vis denne publikasjonen hos Cristin