Nonparametric pattern recognition algorithm for testing a hypothesis of the independence of random variables
I.V. Zenkov 1,3, A.V. Lapko 2,3, V.A. Lapko 2,3, E.V. Kiryushina 1, V.N. Vokin 1

1 Siberian Federal University,
660041, Krasnoyarsk, Russia, Svobodny Av. 79,
2 Institute of Computational Modelling SB RAS,
660036, Krasnoyarsk, Russia, Akademgorodok 50,
3 Reshetnev Siberian State University of Science and Technology,
660037, Krasnoyarsk, Russia, Krasnoyarsky Rabochy Av. 31

DOI: 10.18287/2412-6179-CO-871

Pages: 767-772.

Full text of the article: in Russian.

Abstract:
A new method for testing a hypothesis of the independence of multidimensional random variables is proposed. The technique under consideration is based on the use of a nonparametric pattern recognition algorithm that meets a maximum likelihood criterion. In contrast to the traditional formulation of the pattern recognition problem, there is no a priori training sample. The initial information is represented by statistical data, which are made up of the values of a multivariate random variable. The distribution laws of random variables in the classes are estimated according to the initial statistical data for the conditions of their dependence and independence. When selecting optimal bandwidths for nonparametric kernel-type probability density estimates, the minimum standard deviation is used as a criterion. Estimates of the probability of pattern recognition error in the classes are calculated. Based on the minimum value of the estimates of the probabilities of pattern recognition errors, a decision is made on the independence or dependence of the random variables. The technique developed is used in the spectral analysis of remote sensing data.
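
The idea in the abstract can be illustrated with a minimal Python sketch. It is one possible reading of the abstract, not the authors' implementation. The "dependence" class is modelled by a joint product-kernel density estimate, and the "independence" class by the product of the marginal kernel estimates, both built from the same sample. Each sample point is then assigned to the class with the larger (leave-one-out) density value, and the share of points assigned to the other class serves as the error estimate for each hypothesis. The function name `independence_check`, the Gaussian kernel, and the rule-of-thumb bandwidths are assumptions made for illustration; the article itself selects bandwidths by an optimality criterion, which is not reproduced here.

```python
import numpy as np


def _kernel_values(x, h):
    """Per-coordinate Gaussian kernel values K((x_i - x_j) / h_v) / h_v, shape (n, n, k)."""
    u = (x[:, None, :] - x[None, :, :]) / h
    return np.exp(-0.5 * u**2) / (np.sqrt(2.0 * np.pi) * h)


def independence_check(x):
    """Hypothetical helper: compare 'dependent' and 'independent' density models.

    x : array of shape (n, k), one row per observation.
    Returns (err_dep, err_indep, decision).
    """
    x = np.asarray(x, dtype=float)
    n, k = x.shape
    sigma = x.std(axis=0, ddof=1)
    off_diag = ~np.eye(n, dtype=bool)  # leave-one-out mask: skip j == i

    # Class "dependent": joint product-kernel estimate.
    # Bandwidth is a rule-of-thumb stand-in (assumption), not the optimised one.
    h_joint = sigma * n ** (-1.0 / (k + 4))
    kern = _kernel_values(x, h_joint)
    p_dep = (kern.prod(axis=2) * off_diag).sum(axis=1) / (n - 1)

    # Class "independent": product of one-dimensional marginal estimates.
    h_marg = sigma * n ** (-1.0 / 5.0)
    kern = _kernel_values(x, h_marg)
    marginals = (kern * off_diag[:, :, None]).sum(axis=1) / (n - 1)
    p_indep = marginals.prod(axis=1)

    # Maximum-likelihood rule: pick the class with the larger density value.
    assigned_dep = p_dep >= p_indep

    # If the data are dependent, every point assigned to "independent" is an error,
    # and vice versa.
    err_dep = 1.0 - assigned_dep.mean()
    err_indep = assigned_dep.mean()
    decision = "dependent" if err_dep < err_indep else "independent"
    return err_dep, err_indep, decision


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    a = rng.normal(size=300)
    strongly_dependent = np.column_stack([a, a + 0.1 * rng.normal(size=300)])
    print(independence_check(strongly_dependent))
```

For independent data the two error estimates in this sketch tend to be close to each other, so a practical test would need an explicit criterion for deciding when the difference is significant; the simple comparison above is only meant to show the structure of the computation.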

Keywords:
testing a hypothesis of the independence of random variables, multidimensional random variables, pattern recognition, nonparametric probability density estimation, bandwidths of kernel functions, Kolmogorov–Smirnov criterion, spectral analysis of remote sensing data.

Citation:
Zenkov IV, Lapko AV, Lapko VA, Kiryushina EV, Vokin VN. Nonparametric pattern recognition algorithm for testing a hypothesis of the independence of random variables. Computer Optics 2021; 45(5): 767-772. DOI: 10.18287/2412-6179-CO-871.

Acknowledgements:
The research was funded by the Russian Foundation for Basic Research, government of Krasnoyarsk Territory, and Krasnoyarsk Regional Science Foundation under project No. 20-41-240001.
