An Efficient Human Facial Expression Recognition System
ABSTRACT:
Facial expression is an important channel for human communication and can be applied in many real applications. One critical step for facial expression recognition (FER) is to accurately extract emotional features. Current approaches to FER in static images have not fully considered and utilized the features of facial elements and muscle movements, which represent the static and dynamic, as well as the geometric and appearance, characteristics of facial expressions. This paper proposes an approach that overcomes this limitation using ‘salient’ distance features, which are obtained by extracting patch-based 3D Gabor features, selecting the ‘salient’ patches, and performing patch matching operations. The experimental results demonstrate a high correct recognition rate (CRR), significant performance improvements due to the consideration of facial element and muscle movements, promising results under face registration errors, and fast processing time. A comparison with the state of the art confirms that the proposed approach achieves the highest CRR on the JAFFE database and is among the top performers on the Cohn-Kanade (CK) database.
EXISTING SYSTEM:
- The vast majority of the past work on FER does not take the dynamics of facial expressions into account.
- Some efforts have been made to capture and utilize facial movement features, and almost all of them are video-based.
- These efforts adopt either geometric features of tracked facial points (e.g., shape vectors, facial animation parameters, distances and angles, and trajectories), appearance differences between holistic facial regions in consecutive frames (e.g., optical flow and differential-AAM), or texture and motion changes in local facial regions (e.g., surface deformation, motion units, spatiotemporal descriptors, animation units, and pixel differences).
- Although these approaches have achieved promising results, they often require accurate location and tracking of facial points, which remains problematic.
PROPOSED SYSTEM:
Our proposed approach solves this limitation using ‘salient’ distance features, which are obtained by extracting patch-based 3D Gabor features, selecting the ‘salient’ patches, and performing patch matching operations. The experimental results demonstrate a high correct recognition rate (CRR), significant performance improvements due to the consideration of facial element and muscle movements, promising results under face registration errors, and fast processing time.
MODULES:
- Preprocessing Module
- Finding Probability of Human Face Module
- Binary Image Conversion Module
- Facial Movement Feature Extraction Module
- Eye Feature Extraction Module
- Lip Feature Extraction Module
- Training Module
- Test Module
MODULE DESCRIPTION:
- Preprocessing Module
For skin color segmentation, we first apply contrast enhancement to the image and then perform the skin color segmentation.
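The text does not specify which skin rule the segmentation uses, so the following is only a minimal C# sketch built on a common RGB skin heuristic; the thresholds are illustrative assumptions, not the project's actual values.

using System;
using System.Drawing;

static class SkinSegmentation
{
    // A widely used RGB skin heuristic (assumed here; the project text
    // does not state which rule it applies).
    static bool IsSkin(Color c)
    {
        int max = Math.Max(c.R, Math.Max(c.G, c.B));
        int min = Math.Min(c.R, Math.Min(c.G, c.B));
        return c.R > 95 && c.G > 40 && c.B > 20
            && max - min > 15
            && Math.Abs(c.R - c.G) > 15
            && c.R > c.G && c.R > c.B;
    }

    // Skin pixels become white, everything else black.
    public static Bitmap Segment(Bitmap src)
    {
        var mask = new Bitmap(src.Width, src.Height);
        for (int y = 0; y < src.Height; y++)
            for (int x = 0; x < src.Width; x++)
                mask.SetPixel(x, y, IsSkin(src.GetPixel(x, y)) ? Color.White : Color.Black);
        return mask;
    }
}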
- Finding Probability of Human Face Module
Then we find the largest connected region and check the probability that it is a face: if the region's height and width are both at least 50 pixels and its height/width ratio lies between 1 and 2, it may be a face. In that case, a new form opens showing the largest connected region.
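The plausibility test follows directly from the stated thresholds (the 50-pixel minimum and the 1-to-2 height/width ratio); the method name below is our own.

static class FaceCheck
{
    // Heuristic from the text: a candidate region may be a face if both
    // sides are at least 50 pixels and height/width lies between 1 and 2.
    public static bool MayBeFace(int width, int height)
    {
        if (width < 50 || height < 50) return false;
        double ratio = (double)height / width;
        return ratio >= 1.0 && ratio <= 2.0;
    }
}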
- Binary Image Conversion Module
For face detection, we first convert the RGB image to a binary image. To do this, we calculate the average of the R, G, and B values for each pixel; if the average is below 110 we replace the pixel with a black pixel, otherwise with a white pixel. This yields a binary image from the RGB image.
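This binarization rule is fully specified in the text (average of R, G, and B against a threshold of 110), so it translates almost line for line into C#:

using System.Drawing;

static class Binarizer
{
    // Average the R, G and B of each pixel; below 110 becomes black,
    // otherwise white, exactly as described above.
    public static Bitmap ToBinary(Bitmap src)
    {
        var bin = new Bitmap(src.Width, src.Height);
        for (int y = 0; y < src.Height; y++)
            for (int x = 0; x < src.Width; x++)
            {
                Color c = src.GetPixel(x, y);
                int avg = (c.R + c.G + c.B) / 3;
                bin.SetPixel(x, y, avg < 110 ? Color.Black : Color.White);
            }
        return bin;
    }
}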
Then we locate the forehead in the binary image. We start scanning from the middle of the image, looking for a run of continuous white pixels that follows a run of continuous black pixels. We then find the maximum width of the white region by scanning vertically, on both the left and the right side. If the width of a new row is smaller than half of the previous maximum width, we stop the scan, because this situation arises when the scan reaches the eyebrows. We then cut the face starting from the top of the forehead, with a height equal to 1.5 times its width.
Here, X denotes the maximum width of the forehead. The result is an image that contains only the eyes, nose, and lips. We then cut the RGB image according to the binary image.
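A sketch of the forehead scan and crop, assuming the binary image from the previous step with the face roughly centered and hair (black) above the forehead; edge handling is simplified.

using System;
using System.Drawing;

static class FaceCropper
{
    public static Rectangle FindFaceBox(Bitmap bin)
    {
        int midX = bin.Width / 2;

        // Walk down the middle column: skip the initial black run (hair);
        // the first white run after it is the top of the forehead.
        int y = 0;
        while (y < bin.Height && bin.GetPixel(midX, y).R == 0) y++;
        int foreheadTop = y;

        // Measure the white run's width on each row by scanning left and
        // right; stop when the width drops below half of the maximum so
        // far, which happens when the scan reaches the eyebrows.
        int maxWidth = 0;
        while (y < bin.Height)
        {
            int left = midX, right = midX;
            while (left > 0 && bin.GetPixel(left - 1, y).R == 255) left--;
            while (right < bin.Width - 1 && bin.GetPixel(right + 1, y).R == 255) right++;
            int width = right - left + 1;
            if (maxWidth > 0 && width < maxWidth / 2) break;
            maxWidth = Math.Max(maxWidth, width);
            y++;
        }

        // Crop from the forehead top: width X = maximum forehead width,
        // height = 1.5 * X, as the text specifies.
        int boxLeft = Math.Max(0, midX - maxWidth / 2);
        int boxHeight = Math.Min((int)(1.5 * maxWidth), bin.Height - foreheadTop);
        return new Rectangle(boxLeft, foreheadTop, maxWidth, boxHeight);
    }
}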
- Facial Movement Feature Extraction Module
- Eye Feature Extraction Module
For eye detection, we convert the RGB face image to a binary image. Let w denote the width of the face image. We scan the columns from w/4 to (w - w/4) to find the middle position between the two eyes: the column in this range with the longest continuous run of white pixels is taken as the midline.

Then we find the upper (starting) position of the two eyebrows by searching vertically. For the left eye we search from w/8 to mid, and for the right eye from mid to w - w/8, where mid is the middle position between the two eyes.

There may be some white pixels between an eyebrow and its eye. To connect the eyebrow and the eye, we place continuous black pixel-lines vertically from the eyebrow down to the eye. For the left eye the vertical lines are placed between mid/4 and mid/2, and for the right eye between mid + (w - mid)/4 and mid + 3*(w - mid)/4; the lines run from the eyebrow's starting height down to (h - eyebrow starting position)/4, where h is the height of the image.

Then we find the lower position of the two eyes by searching for black pixels vertically, from the lower end of the image up to the starting position of the eyebrow. For the left eye we search the columns from mid/4 to mid - mid/4, and for the right eye from mid + (w - mid)/4 to mid + 3*(w - mid)/4.

Then we find the right side of the left eye by searching for black pixels horizontally, from the middle position toward the first black pixel lying between the upper and lower positions of the left eye; likewise, for the left side of the right eye we search from mid toward the first black pixel between the upper and lower positions of the right eye. The left side of the left eye is the left edge of the image, and the right side of the right eye is the right edge of the image. Finally, we cut the two eyes out of the RGB image using these upper, lower, left, and right bounds.
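The first step, locating the midline between the eyes, is specified precisely enough to sketch: scan the columns from w/4 to w - w/4 and pick the column with the longest continuous run of white pixels.

using System;
using System.Drawing;

static class EyeLocator
{
    // Column between w/4 and w - w/4 with the longest continuous white
    // run, taken as the midline between the two eyes.
    public static int FindEyeMidline(Bitmap bin)
    {
        int best = bin.Width / 2, bestRun = -1;
        for (int x = bin.Width / 4; x < bin.Width - bin.Width / 4; x++)
        {
            int run = 0, longest = 0;
            for (int y = 0; y < bin.Height; y++)
            {
                run = bin.GetPixel(x, y).R == 255 ? run + 1 : 0;
                longest = Math.Max(longest, run);
            }
            if (longest > bestRun) { bestRun = longest; best = x; }
        }
        return best;
    }
}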
To apply a Bezier curve to an eye, we first have to remove the eyebrow. To do so, we scan the binary image of the eye box for a first run of continuous black pixels, then a run of continuous white pixels, then another run of black pixels. We remove the first black run from the box, which leaves a box containing only the eye.
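A sketch of the eyebrow removal on the binary eye box: rows are treated as part of the eyebrow until the first all-white gap, matching the black-white-black pattern described above.

using System.Drawing;

static class EyebrowRemover
{
    // Whether any pixel in row y of the eye box is black.
    static bool RowHasBlack(Bitmap box, int y)
    {
        for (int x = 0; x < box.Width; x++)
            if (box.GetPixel(x, y).R == 0) return true;
        return false;
    }

    // Whiten the first run of rows containing black pixels (the eyebrow),
    // leaving only the eye in the box.
    public static void RemoveEyebrow(Bitmap eyeBox)
    {
        int y = 0;
        while (y < eyeBox.Height && !RowHasBlack(eyeBox, y)) y++; // leading white rows
        while (y < eyeBox.Height && RowHasBlack(eyeBox, y))       // eyebrow rows
        {
            for (int x = 0; x < eyeBox.Width; x++)
                eyeBox.SetPixel(x, y, Color.White);
            y++;
        }
    }
}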
The eye box now contains only the eye, but there is still some skin, or skin-colored area, around it. So we apply the same skin-color test used for the lips to isolate the eye region, and then apply the 'big connect' step to find the largest connected region. That region is the eye, because within the eye box the eye is the largest object that does not match the skin color.
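The 'big connect' step is a largest-connected-region search. A minimal sketch using a breadth-first flood fill over the black (non-skin) pixels follows; the same routine serves the lip module below.

using System.Collections.Generic;
using System.Drawing;

static class BigConnect
{
    // Largest 4-connected region of black pixels in a binary image.
    public static List<Point> LargestBlackRegion(Bitmap bin)
    {
        var visited = new bool[bin.Width, bin.Height];
        var largest = new List<Point>();
        var neighbors = new[] { new Point(1, 0), new Point(-1, 0),
                                new Point(0, 1), new Point(0, -1) };
        for (int sy = 0; sy < bin.Height; sy++)
            for (int sx = 0; sx < bin.Width; sx++)
            {
                if (visited[sx, sy] || bin.GetPixel(sx, sy).R != 0) continue;
                var region = new List<Point>();
                var queue = new Queue<Point>();
                queue.Enqueue(new Point(sx, sy));
                visited[sx, sy] = true;
                while (queue.Count > 0)
                {
                    Point p = queue.Dequeue();
                    region.Add(p);
                    foreach (Point d in neighbors)
                    {
                        int nx = p.X + d.X, ny = p.Y + d.Y;
                        if (nx < 0 || ny < 0 || nx >= bin.Width || ny >= bin.Height) continue;
                        if (visited[nx, ny] || bin.GetPixel(nx, ny).R != 0) continue;
                        visited[nx, ny] = true;
                        queue.Enqueue(new Point(nx, ny));
                    }
                }
                if (region.Count > largest.Count) largest = region;
            }
        return largest;
    }
}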
- Lip Feature Extraction Module
For lip detection, we determine a lip box and assume that the lips must lie inside it. First we measure the distance between the forehead and the eyes. We add this distance to the lower edge of the eye boxes to obtain the upper edge of the lip box. Horizontally, the box starts at the 1/4 position of the left eye box and ends at the 3/4 position of the right eye box, and it extends down to the lower end of the face image. The box therefore contains only the lips and possibly part of the nose. We then cut the RGB image according to this box.
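A sketch of the lip-box geometry described above; leftEye and rightEye are assumed to be the bounding rectangles of the detected eye boxes, and foreheadToEyeDistance the measured forehead-to-eye distance (the parameter names are ours).

using System;
using System.Drawing;

static class LipLocator
{
    public static Rectangle LipBox(Rectangle leftEye, Rectangle rightEye,
                                   int foreheadToEyeDistance, int faceHeight)
    {
        // Upper edge: lower edge of the eyes plus the forehead-to-eye distance.
        int top = Math.Max(leftEye.Bottom, rightEye.Bottom) + foreheadToEyeDistance;
        // Horizontal span: 1/4 into the left eye box to 3/4 into the right eye box.
        int left = leftEye.Left + leftEye.Width / 4;
        int right = rightEye.Left + 3 * rightEye.Width / 4;
        // The box extends down to the lower end of the face image.
        return new Rectangle(left, top, right - left, faceHeight - top);
    }
}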
So, to detect the eyes and the lips, we only need to convert the RGB image to a binary image and perform a few searches over that binary image.
The lip box contains the lips and possibly part of the nose, and the area around them is skin or skin-colored. So we convert skin pixels to white and all other pixels to black; we also find pixels that are similar to skin pixels and convert them to white. Two pixels are called similar if the difference between their RGB values is at most a threshold. We use a histogram to measure the distance between the lower and the higher average RGB values: if this distance is less than 70 we use a threshold of 7, and if it is greater than or equal to 70 we use 10. The threshold for finding similar pixels thus depends on the quality of the image: 7 for high-quality images and 10 for low-quality ones.
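The similarity test and the quality-dependent threshold translate directly; histogramSpread stands for the described distance between the lower and higher average RGB values.

using System;
using System.Drawing;

static class SkinSimilarity
{
    // 7 when the histogram spread is small (high-quality image), 10 otherwise.
    public static int Threshold(int histogramSpread)
    {
        return histogramSpread < 70 ? 7 : 10;
    }

    // Two pixels are "similar" when every RGB channel differs by at most
    // the chosen threshold.
    public static bool IsSimilar(Color a, Color b, int threshold)
    {
        return Math.Abs(a.R - b.R) <= threshold
            && Math.Abs(a.G - b.G) <= threshold
            && Math.Abs(a.B - b.B) <= threshold;
    }
}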
In the resulting binary image there are black regions on the lips, on the nose, and possibly on a few other small areas whose color differs slightly from skin. We then find the largest connected black region (using the same largest-region search sketched in the eye module), and we can be sure this region is the lips, because within the lip box the lips are the largest object that differs from the skin.
- Training Module
Our database contains two tables. The table "Person" stores each person's name together with four indices, one per kind of emotion, into the second table, "Position". For each index, the "Position" table stores 6 control points for the lip Bezier curve, 6 control points for the left-eye Bezier curve, 6 control points for the right-eye Bezier curve, plus the height and width of the lips, of the left eye, and of the right eye. In this way, the program learns each person's emotions.
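The two tables can be pictured as the following C# records; the field names are illustrative, since the text describes the schema only in prose.

using System.Drawing;

// "Person": a name plus one index per emotion into the "Position" table.
class Person
{
    public string Name;
    public int[] EmotionPositionIds = new int[4];   // four kinds of emotion
}

// "Position": the learned curve data for one (person, emotion) pair.
class Position
{
    public int Id;
    public PointF[] LipCurve = new PointF[6];        // Bezier control points
    public PointF[] LeftEyeCurve = new PointF[6];
    public PointF[] RightEyeCurve = new PointF[6];
    public int LipWidth, LipHeight;
    public int LeftEyeWidth, LeftEyeHeight;
    public int RightEyeWidth, RightEyeHeight;
}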
- Test Module
To detect the emotion in an image, we first find the Bezier curves of the lips, the left eye, and the right eye. We then rescale each curve so that its width becomes 100, with its height scaled in proportion. If the person's emotion information is available in the database, the program finds the stored emotion whose height is nearest to the current height and outputs that emotion.
If the person's emotion information is not available in the database, the program computes the average height of each emotion over all people in the database and makes its decision according to these average heights.
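A sketch of the matching rule for one feature (the lips); the same normalization, width rescaled to 100 with the height scaled proportionally, applies to the eye curves as well.

using System;

static class EmotionMatcher
{
    // storedHeights[i] is the trained, normalized lip height for emotion i
    // (either the person's own entry or the database-wide average).
    public static int NearestEmotion(double lipWidth, double lipHeight,
                                     double[] storedHeights)
    {
        double normalized = lipHeight * 100.0 / lipWidth;  // width -> 100
        int best = 0;
        double bestDiff = double.MaxValue;
        for (int i = 0; i < storedHeights.Length; i++)
        {
            double diff = Math.Abs(storedHeights[i] - normalized);
            if (diff < bestDiff) { bestDiff = diff; best = i; }
        }
        return best;  // index of the nearest emotion
    }
}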
SYSTEM REQUIREMENTS:
HARDWARE REQUIREMENTS:
Processor : Intel Dual Core.
Hard Disk : 60 GB.
Floppy Drive : 1.44 MB.
Monitor : LCD Colour.
Mouse : Optical Mouse.
RAM : 512 MB.
SOFTWARE REQUIREMENTS:
Operating system : Windows XP.
Coding Language : C#.NET