Multivariate mixed kernel density estimators and their application in machine learning for classification of biological objects based on spectral measurements
A.A. Sirota¹, A.O. Donskikh¹, A.V. Akimov¹, D.A. Minakov¹
¹ Voronezh State University, Voronezh, Russia
 PDF, 1171 kB
DOI: 10.18287/2412-6179-2019-43-4-677-691
Pages: 677-691.
Full text of article: Russian language.
Abstract:
The problem of non-parametric multivariate density estimation for machine learning and data augmentation is considered. A new mixed density estimation method is proposed. It computes the convolution of two independently obtained components: kernel density estimates of the unknown distributions of informative features, and a known (or independently estimated) density of the non-informative interference that occurs during measurements. The properties of the resulting mixed density estimates are analyzed, and the method is compared with the conventional Parzen-Rosenblatt window method applied directly to the training data. The equivalence of the mixed kernel density estimator and a data augmentation procedure based on the known (or estimated) statistical model of interference is proven theoretically and confirmed experimentally. The applicability of mixed density estimators to training machine learning algorithms for the classification of biological objects (elements of grain mixtures) from spectral measurements in the visible and near-infrared regions is evaluated.
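The convolution idea and its link to data augmentation can be illustrated with a minimal one-dimensional sketch. It is not the authors' implementation: it assumes a Gaussian kernel and zero-mean Gaussian interference, in which case the convolution of the two densities is again a Gaussian kernel with variance h² + σ². The function names (`gaussian_kde_1d`, `mixed_kde_1d`, `augment`) and all parameter values are hypothetical and chosen only for illustration.

```python
import numpy as np


def gaussian_kde_1d(x, samples, bandwidth):
    """Parzen-Rosenblatt estimate with a Gaussian kernel, evaluated at points x."""
    x = np.asarray(x, dtype=float)[:, None]
    z = (x - np.asarray(samples, dtype=float)[None, :]) / bandwidth
    norm = len(samples) * bandwidth * np.sqrt(2.0 * np.pi)
    return np.exp(-0.5 * z**2).sum(axis=1) / norm


def mixed_kde_1d(x, samples, bandwidth, noise_sigma):
    """KDE of the features convolved with an assumed Gaussian interference density.

    For a Gaussian kernel (std = bandwidth) and Gaussian noise (std = noise_sigma)
    the convolution has a closed form: a Gaussian kernel whose variance is the sum
    of the two variances.
    """
    effective = np.sqrt(bandwidth**2 + noise_sigma**2)
    return gaussian_kde_1d(x, samples, effective)


def augment(samples, bandwidth, noise_sigma, n_new, rng):
    """Draw synthetic points: resample the data, add kernel jitter, add interference."""
    centers = rng.choice(np.asarray(samples, dtype=float), size=n_new, replace=True)
    feature_part = centers + bandwidth * rng.standard_normal(n_new)
    return feature_part + noise_sigma * rng.standard_normal(n_new)


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    train = rng.normal(0.0, 1.0, size=50)       # stand-in for measured features
    grid = np.linspace(-6.0, 6.0, 241)
    density = mixed_kde_1d(grid, train, bandwidth=0.3, noise_sigma=0.5)
    synthetic = augment(train, 0.3, 0.5, n_new=10_000, rng=rng)
    # A normalized histogram of the synthetic points should follow `density`.
    hist, edges = np.histogram(synthetic, bins=60, range=(-6.0, 6.0), density=True)
    print(density.max(), hist.max())
```

Under these assumptions, a histogram of the points returned by `augment` approaches the curve returned by `mixed_kde_1d` as the number of synthetic points grows, which is the sense in which the estimator and the augmentation procedure coincide. With a non-Gaussian kernel or interference model the convolution generally has no such closed form and would have to be evaluated numerically.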
Keywords:
machine learning, pattern classification, data augmentation, kernel density estimation, spectral measurements
Citation: 
Sirota AA, Donskikh AO, Akimov AV, Minakov DA. Multivariate mixed kernel density estimators and their application in machine learning for classification of biological objects based on spectral measurements. Computer Optics 2019; 43(4): 677-691. DOI: 10.18287/2412-6179-2019-43-4-677-691.
 
 
  