Speech Emotion Recognition using Machine Learning

IEEE BASE PAPER ABSTRACT:

The aim of the paper is to detect the emotions expressed by a speaker while speaking. Emotion detection has become an essential task. Speech expressing fear, anger, or joy has a higher and wider pitch range, whereas speech expressing other emotions has a lower pitch range. Detecting emotion in speech is useful in assisting human-machine interaction. Here we use different classification algorithms to recognize emotions, namely Support Vector Machine and Multi-layer Perceptron, with the audio features MFCC, MEL, chroma, and Tonnetz. These models have been trained to recognize the following emotions: calm, neutral, surprise, happy, sad, angry, fearful, and disgust. We obtained an accuracy of 86.5%, and testing with input audio gave the same result.
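The MEL and MFCC features named in the abstract both build on the mel scale, which spaces frequencies the way human hearing perceives pitch. The following is a minimal sketch of that conversion, assuming the commonly used formula m = 2595 * log10(1 + f / 700). The function names, the 8000 Hz upper bound, and the filter count are illustrative choices, not details taken from the paper.

```python
import math


def hz_to_mel(freq_hz):
    """Convert a frequency in Hz to mels (assumed formula: 2595 * log10(1 + f / 700))."""
    return 2595.0 * math.log10(1.0 + freq_hz / 700.0)


def mel_to_hz(mel):
    """Inverse of hz_to_mel: convert mels back to Hz."""
    return 700.0 * (10.0 ** (mel / 2595.0) - 1.0)


def mel_center_frequencies(n_filters, f_min=0.0, f_max=8000.0):
    """Return n_filters center frequencies in Hz, evenly spaced on the mel scale.

    The defaults for f_min and f_max are placeholders for illustration.
    """
    mel_min = hz_to_mel(f_min)
    mel_max = hz_to_mel(f_max)
    # n_filters interior points split the mel range into n_filters + 1 equal steps.
    step = (mel_max - mel_min) / (n_filters + 1)
    return [mel_to_hz(mel_min + step * (i + 1)) for i in range(n_filters)]
```

Calling mel_center_frequencies(5) gives five center frequencies that are equally spaced in mels, so their spacing in Hz grows toward the high end. A mel filterbank built on such centers resolves low frequencies more finely than high ones.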

ALGORITHM / MODEL USED:

Artificial Neural Network model (ANN model)

OUR PROPOSED ABSTRACT:

One of the quickest and most natural ways for humans to communicate is through speech. Speech Emotion Recognition (SER) is the process of predicting a person’s emotion from their speech. It improves the way people and computers communicate. Annotating audio is tricky, and predicting a person’s sentiment is difficult because emotions are subjective, but SER makes this possible.

Researchers have created a variety of systems to extract emotions from the speech stream. Speech qualities in particular help distinguish between emotions, and when these qualities are unclear, identifying an emotion from a speaker’s speech becomes challenging. A variety of datasets for speech emotions, their modelling, and their types are available, and they aid in determining the style of speech.

After feature extraction, the classification of speech emotions is a crucial component. In this proposed system, we introduce an Artificial Neural Network (ANN) model that is used to distinguish emotions such as angry, disgust, fear, happy, neutral, sad, and surprise. The proposed ANN model achieved a training accuracy of 100% and a validation accuracy of 99%.
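The classification step described above can be sketched as a forward pass through a small feedforward network: extracted features go through a hidden layer and a softmax output over the seven emotion labels. This is only an illustration under stated assumptions. The source does not give the network's architecture, so the layer sizes, the 40-value feature vector, and the random weights below are hypothetical placeholders, and a trained model would use learned weights instead.

```python
import math
import random

EMOTIONS = ["angry", "disgust", "fear", "happy", "neutral", "sad", "surprise"]


def dense(inputs, weights, biases):
    """Fully connected layer: one weighted sum plus bias per output unit."""
    return [sum(w * x for w, x in zip(row, inputs)) + b
            for row, b in zip(weights, biases)]


def relu(values):
    """Rectified linear activation applied element-wise."""
    return [max(0.0, v) for v in values]


def softmax(logits):
    """Numerically stable softmax: subtract the max before exponentiating."""
    peak = max(logits)
    exps = [math.exp(v - peak) for v in logits]
    total = sum(exps)
    return [e / total for e in exps]


def random_layer(n_in, n_out, rng):
    """Placeholder weights and zero biases (hypothetical, not trained values)."""
    weights = [[rng.uniform(-0.5, 0.5) for _ in range(n_in)] for _ in range(n_out)]
    return weights, [0.0] * n_out


def predict(features, hidden, output):
    """Forward pass: features -> hidden layer (ReLU) -> output layer (softmax)."""
    activations = relu(dense(features, *hidden))
    probs = softmax(dense(activations, *output))
    best = max(range(len(probs)), key=probs.__getitem__)
    return EMOTIONS[best], probs


rng = random.Random(0)
n_features = 40  # assumed feature size, e.g. 40 time-averaged MFCC coefficients
hidden = random_layer(n_features, 16, rng)
output = random_layer(16, len(EMOTIONS), rng)
features = [rng.uniform(-1.0, 1.0) for _ in range(n_features)]  # stand-in for real audio features
label, probs = predict(features, hidden, output)
```

Here label is the emotion with the highest probability and probs holds one probability per entry in EMOTIONS. With random weights the prediction is meaningless; the sketch only shows how a feature vector is mapped to a distribution over the emotion classes.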

SYSTEM REQUIREMENTS:

HARDWARE REQUIREMENTS:

System : Intel Core i3 Processor.
Hard Disk : 500 GB.
Monitor : 15" LED.
Input Devices : Keyboard, Mouse.
RAM : 4 GB.

SOFTWARE REQUIREMENTS:

Operating System : Windows 10 / 11.
Coding Language : Python 3.8.
Web Framework : Flask.
Frontend : HTML, CSS, JavaScript.

REFERENCE:

Kotikalapudi Vamsi Krishna, Navuluri Sainath, A. Mary Posonia, “Speech Emotion Recognition using Machine Learning”, 2022 6th International Conference on Computing Methodologies and Communication (ICCMC), IEEE Conference, 2022.
