
Neural network task specialization via domain constraining
R.O. Malashin1, D.A. Ilyukhin1

1 Pavlov Institute of Physiology RAS, Saint Petersburg, Makarova emb. 6

DOI: 10.18287/COJ1673

Article ID: 1673

Abstract:
This paper introduces the concept of neural network specialization via task-specific domain constraining, aimed at enhancing network performance on the data subspace in which the network operates. The study presents experiments on training specialists for image classification and object detection tasks. The results demonstrate that specialization can enhance a generalist's accuracy even without additional data or changes to the training regime, solely by constraining the network's class label space. Theoretical and experimental analyses indicate that effective specialization requires modifying traditional fine-tuning methods and constraining the data space to semantically coherent subsets. A specialist extraction phase preceding network tuning is proposed for maximal performance gains. We also analyze the evolution of the feature space during specialization. This study paves the way for future research on more advanced dynamically configurable image analysis systems, in which the computation depends on the specific input. Additionally, the proposed methods can help improve system performance in scenarios where certain data domains should be excluded from consideration by the generalist network.
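
The simplest form of domain constraining, restricting the label space at inference time, can be illustrated with a minimal sketch (an illustration under stated assumptions, not the authors' implementation): a trained generalist's logits are masked so that the prediction is taken only over the classes known to occur in the current domain. The dog-class index range below is the commonly cited ILSVRC-2012 one and serves purely as an example.

    import numpy as np

    def constrained_predict(logits: np.ndarray, allowed: list[int]) -> np.ndarray:
        """Restrict a trained generalist's predictions to a class subset.

        logits:  (batch, num_classes) raw network outputs
        allowed: indices of the classes that can occur in the current domain
        """
        mask = np.full_like(logits, -np.inf)   # forbid every class by default
        mask[:, allowed] = 0.0                 # re-admit the allowed subset
        return np.argmax(logits + mask, axis=1)

    # Example: a 1000-class ImageNet generalist queried on a dog-only domain.
    rng = np.random.default_rng(0)
    logits = rng.standard_normal((4, 1000))    # stand-in for model(images)
    dog_classes = list(range(151, 269))        # ILSVRC-2012 dog synsets, 151-268
    print(constrained_predict(logits, dog_classes))

This inference-time masking only changes which class the argmax can select; per the abstract, the larger gains come from additionally extracting and then fine-tuning the specialist on the constrained, semantically coherent domain.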

Keywords:
neural network specialization, task-specific domain constraining, dynamic configurability.

Acknowledgements:
Initial experiments were supported by a grant from the St. Petersburg Science Foundation. The study was supported by State funding allocated to the Pavlov Institute of Physiology, Russian Academy of Sciences (№ 1021062411653-4-3.1.8).

Citation:
Malashin RO, Ilyukhin DA. Neural network task specialization via domain constraining. Computer Optics 2026; 50(1): 1673. DOI: 10.18287/COJ1673.
