Using a Haar wavelet transform, principal component analysis and neural networks for OCR in the presence of impulse noise
V.G. Spitsyn, Yu.A. Bolotova, N.H. Phan, T.T.T. Bui

 

Tomsk Polytechnic University, Tomsk, Russia

Ba Ria-Vung Tau University, Vietnam

Full text of the article: in Russian.


Abstract:
In this paper we propose a novel algorithm for optical character recognition in the presence of impulse noise that combines a wavelet transform, principal component analysis, and neural networks. The Haar wavelet transform is used to isolate the low-frequency components, suppress noise, and extract features. Principal component analysis then reduces the dimensionality of the extracted features. Classification is performed by a set of multilayer neural networks, one for each character, which take the reduced feature set as input. Creating a separate neural network for each type of character is one of the key features of the proposed approach. The experimental results show that the proposed algorithm effectively recognizes characters in images corrupted by impulse noise; the results are comparable with those of ABBYY FineReader and Tesseract OCR.
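
The pipeline described in the abstract (Haar low-frequency features, PCA reduction, one network per character) might be sketched roughly as follows. This is only an illustrative sketch, not the authors' implementation: the function names (`haar_ll`, `fit_pca`, `train_binary_mlp`, `mlp_score`), the two synthetic glyphs, the number of decomposition levels, the number of principal components, and the network size are all assumptions chosen for brevity.

```python
import numpy as np

def haar_ll(image, levels=1):
    """Low-frequency (LL) sub-band of an orthonormal 2D Haar transform."""
    out = np.asarray(image, dtype=float)
    for _ in range(levels):
        h, w = out.shape
        out = out[: h - h % 2, : w - w % 2]  # crop an odd row/column
        # Each LL coefficient is the sum of a 2x2 block divided by 2.
        out = (out[0::2, 0::2] + out[0::2, 1::2]
               + out[1::2, 0::2] + out[1::2, 1::2]) / 2.0
    return out

def fit_pca(features, n_components):
    """Return the feature mean and the top principal axes (as rows)."""
    mean = features.mean(axis=0)
    _, _, vt = np.linalg.svd(features - mean, full_matrices=False)
    return mean, vt[:n_components]

def train_binary_mlp(x, y, hidden=8, epochs=300, lr=0.5, seed=0):
    """One-hidden-layer network answering 'is this character c?'."""
    rng = np.random.default_rng(seed)
    w1 = rng.normal(0.0, 0.5, (x.shape[1], hidden))
    b1 = np.zeros(hidden)
    w2 = rng.normal(0.0, 0.5, hidden)
    b2 = 0.0
    for _ in range(epochs):
        h = np.tanh(x @ w1 + b1)
        p = 1.0 / (1.0 + np.exp(-(h @ w2 + b2)))
        g = (p - y) / len(y)                 # d(cross-entropy)/d(logit)
        gh = np.outer(g, w2) * (1.0 - h**2)  # backpropagate through tanh
        w2 -= lr * (h.T @ g)
        b2 -= lr * g.sum()
        w1 -= lr * (x.T @ gh)
        b1 -= lr * gh.sum(axis=0)
    return w1, b1, w2, b2

def mlp_score(x, params):
    """Sigmoid output of a trained per-character network."""
    w1, b1, w2, b2 = params
    return 1.0 / (1.0 + np.exp(-(np.tanh(x @ w1 + b1) @ w2 + b2)))

# Two hypothetical 16x16 glyphs standing in for real character images.
rng = np.random.default_rng(0)
templates = {"I": np.zeros((16, 16)), "O": np.zeros((16, 16))}
templates["I"][2:14, 7:9] = 1.0
templates["O"][2:14, 4:12] = 1.0
templates["O"][4:12, 6:10] = 0.0

def noisy(img, p=0.05):
    """Corrupt a fraction p of pixels with salt-and-pepper (impulse) noise."""
    out = img.copy()
    mask = rng.random(img.shape) < p
    out[mask] = rng.integers(0, 2, mask.sum())
    return out

labels = np.array([c for c in templates for _ in range(20)])
images = [noisy(templates[c]) for c in labels]

# 1) Haar LL features, 2) PCA reduction, 3) one network per character.
feats = np.array([haar_ll(im, levels=2).ravel() for im in images])
mean, comps = fit_pca(feats, n_components=5)
reduced = (feats - mean) @ comps.T
nets = {c: train_binary_mlp(reduced, (labels == c).astype(float))
        for c in templates}

# The character whose network gives the highest score wins.
scores = np.column_stack([mlp_score(reduced, nets[c]) for c in templates])
predicted = np.array(list(templates))[scores.argmax(axis=1)]
```

Keeping only the LL sub-band averages each 2x2 block, which tends to damp isolated impulse pixels while shrinking the feature vector before PCA. Training one binary network per character, rather than a single multi-class network, means a new character could be added by training one more network without retraining the rest.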

Keywords:
optical character recognition; wavelet transform; principal component analysis; neural networks.

Citation:
Spitsyn VG, Bolotova YuA, Phan NH, Bui TTT. Using a Haar wavelet transform, principal component analysis and neural networks for OCR in the presence of impulse noise. Computer Optics 2016; 40(2): 249-257. DOI: 10.18287/2412-6179-2016-40-2-249-257.


© 2009, IPSI RAS
Institution of Russian Academy of Sciences, Image Processing Systems Institute of RAS, Russia, 443001, Samara, Molodogvardeyskaya Street 151; E-mail: journal@computeroptics.ru; Phones: +7 (846) 332-56-22, Fax: +7 (846) 332-56-20