Mathematical Analysis of Raman Spectra Data Arrays Using Machine Learning Algorithms

Yana A. Byuchkova
Immanuel Kant Baltic Federal University, Kaliningrad, Russian Federation

Andrey Y. Zyubin
Immanuel Kant Baltic Federal University, Kaliningrad, Russian Federation

Vladimir V. Rafalskiy
Immanuel Kant Baltic Federal University, Kaliningrad, Russian Federation

Ekaterina M. Moiseeva
Immanuel Kant Baltic Federal University, Kaliningrad, Russian Federation

Ilia G. Samusev
Immanuel Kant Baltic Federal University, Kaliningrad, Russian Federation

Paper #8684 received 5 Mar 2023; revised manuscript received 13 Jun 2023; accepted for publication 16 Jun 2023; published online 29 Jun 2023.


This paper applies mathematical methods of classification and differentiation to low-resolution Raman scattering spectral data arrays of complex biological objects such as human platelets. The analyzed data arrays consisted of 1266 spectra from 4 groups of patients, 152 people in total. A random forest algorithm was used. Potential biomarkers of the differences between patient groups were identified, and the algorithms were tested on them. Using the random forest algorithm to classify spectra of healthy patients without therapy against spectra of patients with cardiovascular pathologies without therapy, we achieved an accuracy of 83.4%. Classification of healthy patients on and off therapy gave an accuracy of 76.26%, and classification of patients with cardiovascular pathologies gave an accuracy of 70%.


spectroscopy; surface-enhanced Raman scattering; cardiovascular disease; machine learning; random forest algorithm
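
As an illustration of the kind of workflow summarized in the abstract, the following is a minimal sketch of random forest classification of spectra, not the authors' implementation. It assumes scikit-learn and NumPy are available and uses synthetic two-class "spectra" in place of the platelet data; the Raman shift axis, peak positions, noise level, and hyperparameters are all hypothetical. Feature importances are used here as one possible way to point at spectral regions that separate the groups.

```python
import numpy as np
from sklearn.ensemble import RandomForestClassifier
from sklearn.metrics import accuracy_score
from sklearn.model_selection import train_test_split


def make_synthetic_spectra(n_per_class=200, n_points=300, seed=0):
    """Generate toy two-class spectra (illustrative only, not real data).

    Both classes share a peak at 1000 cm^-1; the peak at 1450 cm^-1 has a
    class-dependent amplitude. Gaussian noise is added to every spectrum.
    """
    rng = np.random.default_rng(seed)
    # Hypothetical Raman shift axis in cm^-1.
    shift = np.linspace(400.0, 1800.0, n_points)

    def peak(center, width):
        return np.exp(-0.5 * ((shift - center) / width) ** 2)

    spectra, labels = [], []
    for label, amplitude in ((0, 0.5), (1, 1.0)):
        clean = peak(1000.0, 30.0) + amplitude * peak(1450.0, 25.0)
        noisy = clean + rng.normal(0.0, 0.2, size=(n_per_class, n_points))
        spectra.append(noisy)
        labels.append(np.full(n_per_class, label))
    return shift, np.vstack(spectra), np.concatenate(labels)


shift, X, y = make_synthetic_spectra()

# Hold out a stratified test set so both classes appear in each split.
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.25, stratify=y, random_state=0
)

# Assumed hyperparameters; the paper's settings are not given in this section.
model = RandomForestClassifier(n_estimators=200, random_state=0)
model.fit(X_train, y_train)

accuracy = accuracy_score(y_test, model.predict(X_test))

# Spectral positions with the highest importance: candidate discriminating bands.
top_idx = np.argsort(model.feature_importances_)[::-1][:5]
top_shifts = shift[top_idx]

print(f"test accuracy: {accuracy:.3f}")
print("most important shifts (cm^-1):", np.round(top_shifts, 1))
```

With real data in which several spectra come from the same patient, a patient-wise split (for example scikit-learn's `GroupShuffleSplit`) may be preferable, so that spectra from one person do not appear in both the training and test sets.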

