  
U-Net-bin: hacking the document image binarization contest
P.V. Bezmaternykh1,2, D.A. Ilin1, D.P. Nikolaev1,3
1 Smart Engines Service LLC, 117312, Moscow, Russia
2 Federal Research Center "Computer Science and Control" of RAS, 117312, Moscow, Russia
3 Institute for Information Transmission Problems of RAS, 127051, Moscow, Russia
DOI: 10.18287/2412-6179-2019-43-5-825-832
Pages: 825-832.
Article language: English.
Abstract:
Image binarization is still a challenging task in a variety of applications. In particular, the Document Image Binarization Contest (DIBCO) is organized regularly to track state-of-the-art techniques for historical document binarization. In this work, we present a binarization method that was ranked first in the DIBCO'17 contest. It is a convolutional neural network (CNN) based method that uses the U-Net architecture, originally designed for biomedical image segmentation. We describe our approach to training data preparation and contest ground truth examination and provide multiple insights on its construction (so-called hacking). This led to a more accurate problem statement for historical document binarization with respect to the challenges one could face in open-access datasets. A Docker container with the final network, along with all the supplementary data we used in the training process, has been published on GitHub.
Keywords:
historical document processing, binarization, DIBCO, deep learning, U-Net architecture, training dataset augmentation, document analysis.
Citation:
Bezmaternykh, P.V. U-Net-bin: hacking the document image binarization contest / P.V. Bezmaternykh, D.A. Ilin, D.P. Nikolaev // Computer Optics. – 2019. – Vol. 43(5). – P. 825-832. – DOI: 10.18287/2412-6179-2019-43-5-825-832.
 
Acknowledgements:
The work was partially funded by the Russian Foundation for Basic Research (projects 17-29-07092 and 17-29-07093).
© 2009, IPSI RAS
Russia, 443001, Samara, Molodogvardeyskaya St., 151; e-mail: ko@smr.ru; tel.: +7 (846) 242-41-24 (executive secretary), +7 (846) 332-56-22 (technical editor); fax: +7 (846) 332-56-20