
Facial expression recognition based on adaptation of the classifier to videos of the user
E.N. Churaev 1, A.V. Savchenko 1,2

1 HSE University, Laboratory of Algorithms and Technologies for Networks Analysis, 603093, Nizhny Novgorod, Russia, Rodionova 136;
2 Sber AI, 121170, Moscow, Russia, Kutuzovsky prospekt 32, building 2


DOI: 10.18287/2412-6179-CO-1269

Pages: 806-815.

Full text of article: Russian language.

Abstract:
In this paper, an approach is proposed that can significantly increase the accuracy of facial emotion recognition by adapting the model to the emotions of a particular user (e.g., a smartphone owner). At the first stage, a neural network model previously trained to recognize facial expressions in static photos is used to extract visual features of the face in each frame. Next, the facial features of the video frames are aggregated into a single descriptor of a short video fragment, on which a neural network classifier is trained. At the second stage, it is proposed to adapt (fine-tune) this classifier using a small set of videos with the facial expressions of a particular user. After emotion classification, the user can correct the predicted emotions to further improve the accuracy of the personal model. An experimental study on the RAVDESS dataset shows that adapting the model to a specific user can significantly (by 20 – 50 %) improve the accuracy of facial expression recognition in video.
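The two-stage pipeline described in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: the pretrained CNN feature extractor is replaced by simulated Gaussian frame features, and the neural classifier is reduced to a single softmax layer trained by gradient descent; all names and parameter values are hypothetical.

```python
import numpy as np

rng = np.random.default_rng(0)
D, C = 16, 4  # feature dimension, number of emotion classes

def make_clips(n_per_class, class_means, n_frames=10, noise=1.0):
    """Simulate per-frame features for short clips and mean-aggregate
    each clip into a single descriptor (stage 1 of the pipeline)."""
    X, y = [], []
    for c, mu in enumerate(class_means):
        for _ in range(n_per_class):
            frames = mu + noise * rng.standard_normal((n_frames, D))
            X.append(frames.mean(axis=0))  # descriptor = average of frame features
            y.append(c)
    return np.array(X), np.array(y)

def softmax(z):
    z = z - z.max(axis=1, keepdims=True)
    e = np.exp(z)
    return e / e.sum(axis=1, keepdims=True)

def train(W, b, X, y, lr, epochs):
    """Gradient descent on the cross-entropy loss of a softmax classifier."""
    Y = np.eye(C)[y]  # one-hot labels
    for _ in range(epochs):
        P = softmax(X @ W + b)
        g = (P - Y) / len(X)
        W -= lr * X.T @ g
        b -= lr * g.sum(axis=0)
    return W, b

def accuracy(W, b, X, y):
    return float(((X @ W + b).argmax(axis=1) == y).mean())

# Stage 1: train a speaker-independent classifier on "general" clips.
general_means = rng.standard_normal((C, D))
Xg, yg = make_clips(100, general_means)
W, b = train(np.zeros((D, C)), np.zeros(C), Xg, yg, lr=0.5, epochs=200)
acc_general = accuracy(W, b, Xg, yg)

# The target user expresses emotions somewhat differently: shifted class means.
user_means = general_means + 0.8 * rng.standard_normal((C, D))
Xu_adapt, yu_adapt = make_clips(5, user_means)   # small labeled adaptation set
Xu_test, yu_test = make_clips(50, user_means)

acc_before = accuracy(W, b, Xu_test, yu_test)
# Stage 2: fine-tune on the user's few clips with a small learning rate.
W, b = train(W, b, Xu_adapt, yu_adapt, lr=0.1, epochs=100)
acc_after = accuracy(W, b, Xu_test, yu_test)
print(f"user accuracy before adaptation: {acc_before:.2f}, after: {acc_after:.2f}")
```

The sketch mirrors the design choice in the paper: the (frozen) feature extractor and descriptor aggregation are shared across users, and only the lightweight classifier head is fine-tuned on a handful of the user's own clips.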

Keywords:
facial expression classification, neural network classifier adaptation, speaker-dependent emotion recognition.

Citation:
Churaev EN, Savchenko AV. Facial expression recognition based on adaptation of the classifier to videos of the user. Computer Optics 2023; 47(5): 806-815. DOI: 10.18287/2412-6179-CO-1269.

Acknowledgements:
This work was supported by the Russian Science Foundation, grant No. 20-71-10010.

References:

  1. Ekman P. Basic emotions. In Book: Dalgleish T, Power MJ, eds. Handbook of cognition and emotion. New York: John Wiley & Sons; 1999: 45-60. DOI: 10.1002/0470013494.ch3.
  2. Livingstone SR, Russo FA. The Ryerson Audio-Visual Database of Emotional Speech and Song (RAVDESS): A dynamic, multimodal set of facial and vocal expressions in North American English. PLoS ONE 2018; 13(5): e0196391. DOI: 10.1371/journal.pone.0196391.
  3. Mollahosseini A, Hasani B, Mahoor MH. Affectnet: A database for facial expression, valence, and arousal computing in the wild. IEEE Trans Affect Comput 2017; 10(1): 18-31. DOI: 10.1109/TAFFC.2017.2740923.
  4. Chang WY, Hsu SH, Chien JH. FATAUVA-Net: An integrated deep learning framework for facial attribute recognition, action unit detection, and valence-arousal estimation. 2017 IEEE Conf on Computer Vision and Pattern Recognition Workshops (CVPRW) 2017: 17-25. DOI: 10.1109/CVPRW.2017.246.
  5. Arunnehru J, Kalaiselvi Geetha M. Automatic human emotion recognition in surveillance video. In Book: Dey N, Santhi V, eds. Intelligent techniques in signal processing for multimedia security. Cham: Springer International Publishing Switzerland; 2017: 321-342. DOI: 10.1007/978-3-319-44790-2_15.
  6. Lee KW, Yoon HS, Song JM, Park KR. Convolutional neural network-based classification of driver’s emotion during aggressive and smooth driving using multi-modal camera sensors. Sensors 2018; 18(4): 957. DOI: 10.3390/s18040957.
  7. Scherr SA, Elberzhager F, Holl K. Acceptance testing of mobile applications: Automated emotion tracking for large user groups. 2018 IEEE/ACM 5th Int Conf on Mobile Software Engineering and Systems (MOBILESoft) 2018: 247-251.
  8. Naas SA, Sigg S. Real-time emotion recognition for sales. 2020 16th Int Conf on Mobility, Sensing and Networking (MSN) 2020: 584-591. DOI: 10.1109/MSN50589.2020.00096.
  9. Tkalcic M, Kosir A, Tasic J. Affective recommender systems: The role of emotions in recommender systems. The RecSys 2011 Workshop on Human Decision Making in Recommender Systems 2011: 9-13.
  10. Liu X, Xie L, Wang Y, Zou J, Xiong J, Ying Z, Vasilakos AV. Privacy and security issues in deep learning: A survey. IEEE Access 2020; 9: 4566-4593. DOI: 10.1109/ACCESS.2020.3045078.
  11. Savchenko AV. Facial expression and attributes recognition based on multi-task learning of lightweight neural networks. 2021 IEEE 19th Int Symposium on Intelligent Systems and Informatics (SISY) 2021: 119-124. DOI: 10.1109/SISY52375.2021.9582508.
  12. Savchenko AV, Savchenko LV, Makarov I. Classifying emotions and engagement in online learning based on a single facial expression recognition neural network. IEEE Trans Affect Comput 2022; 13(4): 2132-2143. DOI: 10.1109/TAFFC.2022.3188390.
  13. Churaev E, Savchenko AV. Touching the limits of a dataset in video-based facial expression recognition. 2021 Int Russian Automation Conf (RusAutoCon) 2021: 633-638. DOI: 10.1109/RusAutoCon52004.2021.9537388.
  14. Bagheri E, Bagheri A, Esteban PG, Vanderborght B. A novel model for emotion detection from facial muscles activity. In Book: Silva MF, Lima JL, Reis LP, Sanfeliu A, Tardioli D, eds. Robot 2019: Fourth Iberian robotics conference. Cham: Springer; 2019: 237-249. DOI: 10.1007/978-3-030-36150-1_20.
  15. Luna-Jiménez C, Griol D, Callejas Z, Kleinlein R, Montero JM, Fernández-Martínez F. Multimodal emotion recognition on RAVDESS dataset using transfer learning. Sensors 2021; 21(22): 7665. DOI: 10.3390/s21227665.
  16. Churaev E, Savchenko AV. Multi-user facial emotion recognition in video based on user-dependent neural network adaptation. 2022 VIII Int Conf on Information Technology and Nanotechnology (ITNT) 2022: 1-5. DOI: 10.1109/ITNT55410.2022.9848645.
  17. Savchenko L, Savchenko AV. Speaker-aware training of speech emotion classifier with speaker recognition. In Book: Karpov A, Potapova R, eds. Speech and Computer. Cham: Springer Nature Switzerland AG; 2021: 614-625. DOI: 10.1007/978-3-030-87802-3_55.
  18. Li CJ, Spigner M. Partially speaker-dependent automatic speech recognition using deep neural networks. Journal of the South Carolina Academy of Science 2021; 19(2): 93-99.
  19. Cootes TF, Edwards GJ, Taylor CJ. Active appearance models. IEEE Trans Pattern Anal Mach Intell 2001; 23(6): 681-685. DOI: 10.1109/34.927467.
  20. Shan C, Gong S, McOwan PW. Facial expression recognition based on local binary patterns: A comprehensive study. Image Vis Comput 2009; 27(6): 803-816. DOI: 10.1016/j.imavis.2008.08.005.
  21. Wang Z, Ying Z. Facial expression recognition based on local phase quantization and sparse representation. 2012 8th Int Conf on Natural Computation 2012: 222-225. DOI: 10.1109/ICNC.2012.6234551.
  22. Tan M, Le Q. EfficientNet: Rethinking model scaling for convolutional neural networks. Int Conf on Machine Learning 2019: 6105-6114.
  23. Capotondi A, Rusci M, Fariselli M, Benini L. CMix-NN: Mixed low-precision CNN library for memory-constrained edge devices. IEEE Trans Circuits Syst II Express Briefs 2020; 67(5): 871-875. DOI: 10.1109/TCSII.2020.2983648.
  24. Wang P, Fan E, Wang P. Comparative analysis of image classification algorithms based on traditional machine learning and deep learning. Pattern Recogn Lett 2021; 141: 61-67. DOI: 10.1016/j.patrec.2020.07.042.
  25. Lomotin K, Makarov I. Automated image and video quality assessment for computational video editing. In Book: van der Aalst WMP, Batagelj V, Ignatov DI, Khachay M, Koltsova O, Kutuzov A, Kuznetsov SO, Lomazova IA, Loukachevitch N, Napoli A, Panchenko A, Pardalos PM, Pelillo M, Savchenko AV, Tutubalina E, eds. Analysis of images, social networks and texts. Cham: Springer Nature Switzerland AG; 2021: 243-256. DOI: 10.1007/978-3-030-72610-2_18.
  26. Zhang K, Zhang Z, Li Z, Qiao Y. Joint face detection and alignment using multitask cascaded convolutional networks. IEEE Signal Process Lett 2016; 23(10): 1499-1503. DOI: 10.1109/LSP.2016.2603342.
  27. Cao Q, Shen L, Xie W, Parkhi OM, Zisserman A. VGGFace2: A dataset for recognising faces across pose and age. 2018 13th IEEE Int Conf on Automatic Face & Gesture Recognition (FG 2018) 2018: 67-74. DOI: 10.1109/FG.2018.00020.
  28. Barsoum E, Zhang C, Ferrer CC, Zhang Z. Training deep networks for facial expression recognition with crowd-sourced label distribution. ICMI '16: Proc 18th ACM Int Conf on Multimodal Interaction 2016: 279-283. DOI: 10.1145/2993148.2993165.
  29. Meng D, Peng X, Wang K, Qiao Y. Frame attention networks for facial expression recognition in videos. 2019 IEEE Int Conf on Image Processing (ICIP) 2019: 3866-3870. DOI: 10.1109/ICIP.2019.8803603.
  30. Vaswani A, Shazeer N, Parmar N, Uszkoreit J, Jones L, Gomez AN, Kaiser L, Polosukhin I. Attention is all you need. NIPS'17: Proc 31st Int Conf on Neural Information Processing Systems 2017: 6000-6010.
  31. Deng J, Guo J, Xue N, Zafeiriou S. ArcFace: Additive angular margin loss for deep face recognition. 2019 IEEE/CVF Conf on Computer Vision and Pattern Recognition (CVPR) 2019: 4690-4699. DOI: 10.1109/CVPR.2019.00482.
  32. Jaratrotkamjorn A, Choksuriwong A. Bimodal emotion recognition using deep belief network. 2019 23rd Int Computer Science and Engineering Conf (ICSEC) 2019: 103-109. DOI: 10.1109/ICSEC47112.2019.8974707.
  33. Alshamsi H, Kepuska V, Alshamsi H, Meng H. Automated facial expression and speech emotion recognition app development on smart phones using cloud computing. 2018 IEEE 9th Annual Information Technology, Electronics and Mobile Communication Conf (IEMCON) 2018: 730-738. DOI: 10.1109/IEMCON.2018.8614831.
  34. Rzayeva Z, Alasgarov E. Facial emotion recognition using convolutional neural networks. 2019 IEEE 13th Int Conf on Application of Information and Communication Technologies (AICT) 2019: 1-5. DOI: 10.1109/AICT47866.2019.8981757.
  35. He Z, Jin T, Basu A, Soraghan J, Di Caterina G, Petropoulakis L. Human emotion recognition in video using subtraction pre-processing. ICMLC '19: Proc 2019 11th Int Conf on Machine Learning and Computing 2019: 374-379. DOI: 10.1145/3318299.3318321.
  36. Baltrušaitis T, Robinson P, Morency LP. OpenFace: An open source facial behavior analysis toolkit. 2016 IEEE Winter Conf on Applications of Computer Vision (WACV) 2016: 1-10. DOI: 10.1109/WACV.2016.7477553.
  37. Noyes E, Davis JP, Petrov N, Gray KL, Ritchie KL. The effect of face masks and sunglasses on identity and expression recognition with super-recognizers and typical observers. Royal Soc Open Sci 2021; 8(3): 201169. DOI: 10.1098/rsos.201169.
  38. Bhattacharya S, Gupta M. A survey on: Facial emotion recognition invariant to pose, illumination and age. 2019 Second Int Conf on Advanced Computational and Communication Paradigms (ICACCP) 2019: 1-6. DOI: 10.1109/ICACCP.2019.8883015.
  39. Savchenko AV. Personalized frame-level facial expression recognition in video. In Book: Yacoubi ME, Granger E, Yuen PC, Pal U, Vincent N, eds. Pattern recognition and artificial intelligence. Cham: Springer Nature Switzerland AG; 2022: 447-458. DOI: 10.1007/978-3-031-09037-0_37.

© 2009, IPSI RAS
151, Molodogvardeiskaya str., Samara, 443001, Russia; E-mail: journal@computeroptics.ru ; Tel: +7 (846) 242-41-24 (Executive secretary), +7 (846) 332-56-22 (Issuing editor), Fax: +7 (846) 332-56-20