
Facial expression recognition based on adaptation of the classifier to videos of the user
E.N. Churaev 1, A.V. Savchenko 1,2

1 National Research University Higher School of Economics,
Laboratory of Algorithms and Technologies for Network Analysis

2 Sber, Artificial Intelligence Laboratory,
121170, Russia, Moscow, Kutuzovsky prospect 32, bldg. 2


DOI: 10.18287/2412-6179-CO-1269

Pages: 806-815.

Abstract:
In this paper, we propose a method for video-based facial expression recognition that substantially improves accuracy by adapting the model to the emotions of a specific user, e.g., the owner of a mobile device. At the first stage, a neural network model pre-trained to recognize facial expressions in static photos is applied to extract visual facial features from each video frame. These features are then aggregated into a single descriptor of a short video fragment, after which a neural network classifier is trained. At the second stage, this classifier is adapted using a small set of videos with facial expressions of the specific user. After a decision is made, the user can correct the predicted emotions to further improve the accuracy of the personalized model. In an experimental study on the RAVDESS dataset, it is shown that adapting the model to a specific user significantly (by 20 – 50 %) improves the accuracy of video-based facial expression recognition.
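The two-stage pipeline described in the abstract (per-frame feature extraction, aggregation into a clip descriptor, training a generic classifier, then adapting it to one user) can be sketched as follows. This is a minimal illustration, not the paper's implementation: synthetic random vectors stand in for embeddings from a real pre-trained CNN, average pooling is assumed as the aggregation step, and scikit-learn's `MLPClassifier` with `partial_fit` stands in for the neural classifier and its user adaptation.

```python
import numpy as np
from sklearn.neural_network import MLPClassifier

rng = np.random.default_rng(0)
n_classes, dim, n_frames = 8, 128, 16  # 8 emotions, 128-d frame features

def aggregate(frame_features):
    # Average pooling over frames: one simple way to turn per-frame
    # embeddings into a single descriptor of a short video fragment.
    return frame_features.mean(axis=0)

def synth_clips(n, class_means):
    # Stand-in for real data: per-frame "embeddings" are drawn around a
    # per-class mean, mimicking the output of a pre-trained extractor.
    ys = rng.integers(0, n_classes, size=n)
    X = np.stack([aggregate(rng.normal(class_means[y], 1.0, size=(n_frames, dim)))
                  for y in ys])
    return X, ys

# Stage 1: train a generic classifier on clip descriptors of many subjects.
base_means = rng.normal(0.0, 2.0, size=(n_classes, dim))
X_base, y_base = synth_clips(400, base_means)
clf = MLPClassifier(hidden_layer_sizes=(64,), max_iter=300, random_state=0)
clf.fit(X_base, y_base)

# Stage 2: adapt the same classifier with a few clips of one user, whose
# expressions deviate slightly from the generic population.
user_means = base_means + rng.normal(0.0, 0.5, size=base_means.shape)
X_user, y_user = synth_clips(40, user_means)
for _ in range(20):  # a few extra gradient passes on the user's data
    clf.partial_fit(X_user, y_user)

X_test, y_test = synth_clips(100, user_means)
print("accuracy on the user's clips:", clf.score(X_test, y_test))
```

In this sketch the adaptation step simply continues gradient descent on the user's labeled clips; the user-supplied corrections mentioned in the abstract would feed additional `(X_user, y_user)` pairs into the same `partial_fit` loop.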

Keywords:
facial expression recognition, adaptation of a neural network classifier, face recognition.

Acknowledgements
This work was supported by the Russian Science Foundation (project No. 20-71-10010).

Citation:
Churaev EN, Savchenko AV. Facial expression recognition based on adaptation of the classifier to videos of the user. Computer Optics 2023; 47(5): 806-815. DOI: 10.18287/2412-6179-CO-1269.


© 2009, IPSI RAS