Innovative Integration of Residual Networks for Enhanced In-loop Filtering in VVC Using Deep Convolutional Neural Networks
M.K.I. Ibraheem 1, A.V. Dvorkovich 1, A.M.S. Al-Temimi 2
1 Department of Multimedia Technologies and Telecommunications,
Phystech-School of Radio Engineering and Computer Technologies (FRKT),
Moscow Institute of Physics and Technology (MIPT),
9, Institutsky Lane, 141701, Dolgoprudny, Russia;
2 Iraqi National Data Center, General Secretariat for the Council of Ministers,
Karada Maryam, 10069, Baghdad, Iraq
DOI: 10.18287/2412-6179-CO-1572
Pages: 692-701.
Full text of article: English language.
Abstract:
This paper integrates Residual Networks (ResNets) into the in-loop filtering (ILF) stage of the Versatile Video Coding (VVC) standard, using Deep Convolutional Neural Networks (DCNNs) to improve both compression efficiency and video quality. It introduces a novel architecture, the Residual Deep Convolutional Neural Network (RDCNN), designed to replace the conventional VVC in-loop filtering modules: the Deblocking Filter (DBF), Sample Adaptive Offset (SAO), and Adaptive Loop Filter (ALF). Using Rate Distortion Optimization (RDO), the RDCNN model is applied to every coding unit (CU) to balance video quality against bitrate. The model is trained offline with specific parameters on the TensorFlow-GPU platform; during encoding, it extracts features and predicts the optimal filtering decisions for each video frame. The results show that the proposed RDCNN significantly reduces the bitrate while maintaining high visual quality, outperforming existing methods in compression efficiency and peak signal-to-noise ratio (PSNR) across various video files in the YUV color space. Specifically, the RDCNN achieved a YUV PSNR of 41.2 dB and BD-rate values of −2.43% for the Y component, −6.96% for the U component, and −9.43% for the V component, where a negative BD-rate denotes a bitrate saving at equal quality. Evaluation on the YUV video files Stefan_cif, Soccer, Mobile, Harbour, Crew, and Bus yielded consistently higher average YUV PSNR values than both VTM 22.2 and other related methods, indicating improved compression efficiency together with enhanced visual quality, which is crucial for diverse video processing tasks. These results underscore the potential of deep learning techniques, particularly ResNets, for addressing the complexities of video compression and enhancing the VVC standard.
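The per-CU rate-distortion decision and the PSNR metric mentioned in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation: it assumes the usual Lagrangian cost J = D + λR with the sum of squared errors as distortion, and the function names, the candidate labels "off" and "rdcnn", the sample values, and the rate figures are all hypothetical.

```python
import math


def sse(original, reconstructed):
    """Sum of squared errors between two equal-length sample sequences."""
    if len(original) != len(reconstructed) or not original:
        raise ValueError("inputs must be non-empty and of equal length")
    return sum((x - y) ** 2 for x, y in zip(original, reconstructed))


def psnr(original, reconstructed, max_value=255):
    """PSNR in dB for samples with peak value max_value (255 assumes 8-bit video)."""
    mse = sse(original, reconstructed) / len(original)
    if mse == 0:
        return math.inf  # identical signals
    return 10.0 * math.log10(max_value ** 2 / mse)


def rd_cost(distortion, rate_bits, lam):
    """Lagrangian rate-distortion cost J = D + lambda * R."""
    return distortion + lam * rate_bits


def select_filter(original, candidates, lam):
    """Pick the candidate reconstruction of one CU with the lowest RD cost.

    candidates maps a label to (reconstructed_samples, rate_bits).
    Returns (best_label, best_cost).
    """
    best_label, best_cost = None, math.inf
    for label, (recon, rate_bits) in candidates.items():
        cost = rd_cost(sse(original, recon), rate_bits, lam)
        if cost < best_cost:
            best_label, best_cost = label, cost
    return best_label, best_cost


# Hypothetical 4-sample "CU": unfiltered vs. CNN-filtered reconstruction.
original = [100, 102, 104, 106]
candidates = {
    "off": ([98, 105, 101, 109], 1),     # filter disabled, 1 signalling bit
    "rdcnn": ([100, 103, 104, 105], 1),  # CNN filter enabled, 1 signalling bit
}
label, cost = select_filter(original, candidates, lam=10.0)
print(label, cost, round(psnr(original, candidates[label][0]), 2))
```

In a real encoder the distortion and rate would come from the codec itself for each CU; the sketch only shows how the filtering choice follows from comparing RD costs.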
Keywords:
Deep Learning, Residual Deep Convolutional Neural Network, Versatile Video Coding, Video Compression, VTM.
Citation:
Ibraheem MKI, Dvorkovich AV, Al-Temimi AMS. Innovative Integration of Residual Networks for Enhanced In-loop Filtering in VVC Using Deep Convolutional Neural Networks. Computer Optics 2025; 49 (4): 692-701. DOI: 10.18287/2412-6179-CO-1572.