Mathematical Analysis of Raman Spectra Data Arrays Using Machine Learning Algorithms

Yana A. Byuchkova
Immanuel Kant Baltic Federal University, Kaliningrad, Russian Federation

Andrey Y. Zyubin
Immanuel Kant Baltic Federal University, Kaliningrad, Russian Federation

Vladimir V. Rafalskiy
Immanuel Kant Baltic Federal University, Kaliningrad, Russian Federation

Ekaterina M. Moiseeva
Immanuel Kant Baltic Federal University, Kaliningrad, Russian Federation

Ilia G. Samusev
Immanuel Kant Baltic Federal University, Kaliningrad, Russian Federation

Paper #8684 received 5 Mar 2023; revised manuscript received 13 Jun 2023; accepted for publication 16 Jun 2023; published online 29 Jun 2023.


This paper applies mathematical methods of classification and differentiation to low-resolution Raman scattering spectral data arrays of complex biological compounds such as human platelets. Spectral data arrays comprising 1266 spectra from 4 groups of patients, 152 people in total, were analyzed using a random forest algorithm. Potential biomarkers of differences between patient groups were identified, and the algorithms were tested on them. Using the random forest algorithm to classify spectra of healthy patients without therapy and patients with cardiovascular pathologies without therapy, we achieved an accuracy of 83.4%. Classification of healthy patients on and off therapy gave an accuracy of 76.26%, and classification of patients with cardiovascular pathologies gave an accuracy of 70%.
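As a rough illustration of the kind of analysis summarized above, the following minimal sketch trains a small random forest on synthetic two-class "spectra" and reports the classification accuracy together with the most informative spectral bins. It is not the authors' pipeline: the data are simulated, and the band positions (`BANDS`), tree depth, number of trees, and the split-gain importance score used to rank bins are all assumptions made for the example. In practice a library implementation such as scikit-learn's `RandomForestClassifier` would normally be used instead of hand-written trees.

```python
import random
from collections import Counter

def gini(labels):
    """Gini impurity of a non-empty list of class labels."""
    n = len(labels)
    return 1.0 - sum((c / n) ** 2 for c in Counter(labels).values())

def best_split(X, y, features):
    """Return (feature, threshold, gain) of the best split among candidate features."""
    best = (None, None, 0.0)
    parent = gini(y)
    for f in features:
        values = sorted(set(row[f] for row in X))
        for lo, hi in zip(values, values[1:]):
            t = (lo + hi) / 2  # midpoint, so both sides are non-empty
            left = [lab for row, lab in zip(X, y) if row[f] <= t]
            right = [lab for row, lab in zip(X, y) if row[f] > t]
            gain = parent - (len(left) * gini(left) + len(right) * gini(right)) / len(y)
            if gain > best[2]:
                best = (f, t, gain)
    return best

def build_tree(X, y, depth, n_try, rng, importance):
    """Grow a tree recursively; a leaf is the majority class label (an int)."""
    if depth == 0 or len(set(y)) == 1:
        return Counter(y).most_common(1)[0][0]
    # Random feature subset at every node, as in Breiman's random forests.
    features = rng.sample(range(len(X[0])), n_try)
    f, t, gain = best_split(X, y, features)
    if f is None:
        return Counter(y).most_common(1)[0][0]
    importance[f] += gain * len(y)  # accumulate sample-weighted impurity decrease
    left = [i for i, row in enumerate(X) if row[f] <= t]
    right = [i for i, row in enumerate(X) if row[f] > t]
    return (f, t,
            build_tree([X[i] for i in left], [y[i] for i in left],
                       depth - 1, n_try, rng, importance),
            build_tree([X[i] for i in right], [y[i] for i in right],
                       depth - 1, n_try, rng, importance))

def predict_tree(node, row):
    """Follow splits until a leaf label is reached."""
    while isinstance(node, tuple):
        f, t, left, right = node
        node = left if row[f] <= t else right
    return node

def fit_forest(X, y, n_trees=25, depth=4, seed=0):
    """Train n_trees trees, each on a bootstrap sample of the training spectra."""
    rng = random.Random(seed)
    n_try = max(1, int(len(X[0]) ** 0.5))
    importance = [0.0] * len(X[0])
    trees = []
    for _ in range(n_trees):
        idx = [rng.randrange(len(X)) for _ in X]  # sample with replacement
        trees.append(build_tree([X[i] for i in idx], [y[i] for i in idx],
                                depth, n_try, rng, importance))
    return trees, importance

def predict_forest(trees, row):
    """Majority vote over all trees."""
    return Counter(predict_tree(t, row) for t in trees).most_common(1)[0][0]

def make_spectra(n_per_class, n_bins, bands, rng):
    """Synthetic data: class 1 has extra intensity at the hypothetical band bins."""
    X, y = [], []
    for label in (0, 1):
        for _ in range(n_per_class):
            row = [rng.gauss(1.0, 0.3) for _ in range(n_bins)]
            if label == 1:
                for b in bands:
                    row[b] += 1.5
            X.append(row)
            y.append(label)
    return X, y

BANDS = (4, 9, 15)  # hypothetical discriminating bins, not taken from the paper
rng = random.Random(42)
X, y = make_spectra(60, 20, BANDS, rng)
order = list(range(len(X)))
rng.shuffle(order)
train, test = order[:90], order[90:]  # simple hold-out split

trees, importance = fit_forest([X[i] for i in train], [y[i] for i in train])
pred = [predict_forest(trees, X[i]) for i in test]
accuracy = sum(p == y[i] for p, i in zip(pred, test)) / len(test)
top_bins = sorted(range(len(importance)), key=importance.__getitem__, reverse=True)[:3]
print(f"hold-out accuracy = {accuracy:.2f}, top bins by split gain = {top_bins}")
```

With real patient data, several spectra typically come from the same person, so a split that keeps each patient's spectra entirely in either the training or the test set would give a more honest accuracy estimate than the per-spectrum shuffle used in this sketch.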


Keywords: spectroscopy; surface-enhanced Raman scattering; cardiovascular disease; machine learning; random forest algorithm
