Raman and autofluorescence analysis of human body fluids from patients with malignant tumors

In this study we measured Raman and autofluorescence spectral features of blood and urine from patients with various cancers. In total, 26 blood samples from patients with lung cancer and 12 blood samples from patients with other cancers, as well as 10 urine samples from patients with lung cancer and 9 urine samples from patients with other tumors, were tested. Processing of the experimental data and identification of the informative bands for body fluid spectral analysis were performed on the basis of the PLS-DA method. No significant correlation was found between the most informative criteria for blood and urine. This shows that simultaneous study of blood and urine samples can increase the informativeness of the analysis. In general, the developed approach to body fluid analysis may become the basis of an inexpensive, quick and reliable method of lung cancer screening.


Introduction
According to World Health Organization data, the cancer rate is steadily increasing, and lung cancer is the most common cancer [1]. To decrease cancer mortality, a simple and reliable method for early cancer detection has to be developed. Currently, X-ray, tomographic, endoscopic and cyto-histological techniques are widely used for diagnosing lung cancer. However, some of these diagnostic methods are ineffective in the early stages of tumor growth [2]. Therefore, effective diagnostics requires the development of new methods for early cancer detection.
The development of cancer leads to a progressive deterioration of the patient's health associated with weakened immunity, progressing cachexia and changes in the functionality of internal organs [3]. Since these changes alter body fluid homeostasis, the component composition of urine, blood, saliva and other body fluids can be analyzed for cancer detection. Presently, biochemical analysis [4][5][6] and the analysis of tumor markers [7,8] are widely used for cancer diagnosis from body fluids. However, biochemical analysis has low informativeness because the examined body fluid components have poor specificity for detecting a particular cancer localization. Analysis of tumor markers is much more informative than biochemical analysis, but the applicability of most tumor markers in screening studies is also limited by low specificity, and tumor markers are generally used for monitoring the course of the disease.
Today, in addition to laboratory methods, a variety of physical methods of analysis [9] may be successfully used for examining the component composition of body fluids. Compared with chemical methods of analysis, physical methods offer simple sample preparation, a wide dynamic range and great versatility. Therefore, body fluid analysis with optical methods can become a successful alternative to existing laboratory methods. Raman spectroscopy (RS) and autofluorescence (AF) analysis [10] allow homeostasis changes in body fluids to be detected at the molecular level. These techniques are successfully used in different branches of clinical medicine and in experimental studies of body fluid composition for detecting cancers of various localizations.
For example, RS monitoring of blood composition yielded sensitivities of 92.3%, 79.5% and 86.4% and specificities of 85.7%, 91.0% and 80.0% for patients with oral cancer [11], stomach cancer [12] and colorectal cancer [13], respectively. RS of urine allowed prostate cancer to be detected [14] with 100% sensitivity and 89% specificity. Moreover, the diagnostic accuracy of tumor detection can be increased by combining the analyses of several body fluids. Thus, combined AF analysis of blood plasma, an acetone extract of cellular components, sputum and urine achieved 90% accuracy of lung cancer detection [15].
The low level of cancer detection achieved with a single diagnostic method and a single type of body fluid calls for a more complex analysis that combines RS and AF for the simultaneous study of several body fluids (such as urine and blood). It is important to note that in the majority of the studies reviewed above, the spectral features of body fluids were examined in order to separate a healthy group from a precancerous (or cancerous) group of only one localization. The aim of this paper is to study the application of RS and AF analysis of body fluids for determining cancer location.

Experimental setup
The spectral features of body fluids were studied with the experimental setup shown in Fig. 1. Spectra were excited by a Luxx Master LML-785.0RB-04 laser module (central wavelength 785 nm). An RPB785 fiber-optic Raman probe focuses the exciting radiation and collects and filters the scattered radiation. The collected signal was decomposed into a spectrum using a high-resolution Shamrock SR-500i-D1-R spectrograph with an integrated ANDOR DU416A-LDC-DD digital camera cooled down to -65°C. The tested body fluids were placed in a PMMA cuvette with an aluminum coating. The cuvette geometry (depth 6.5 mm, radius of curvature of the recess 19 mm) was optimized to match the working distance of the probe focusing lens. The Raman probe was positioned along the normal, on the axis of the recess.

Body fluid samples and experimental preparation
The blood and urine samples were collected from patients with cancers of different localizations. The collected samples were placed in sterile test tubes and stored at +2 to +4°C before analysis. The collected body fluids were analyzed within 60 h after sample collection. Patients of the Samara Regional Clinical Oncology Dispensary with malignant or benign tumors were enrolled in this study.
Patients with systemic diseases and patients taking antitumor drugs were excluded from the study. The study was performed for two cohorts of patients. The first cohort included 26 blood samples from patients with lung cancer, 12 blood samples from patients with other tumors (2 benign tumors, 6 stomach cancers, 3 mediastinum cancers, 1 kidney cancer), 10 urine samples from patients with lung cancer, and 9 urine samples from patients with other tumors (2 benign tumors, 1 stomach cancer, 5 mediastinum cancers, 1 kidney cancer). For the second cohort, comprising 10 samples from patients with lung cancer and 3 samples from patients with other tumors (1 benign tumor, 1 stomach cancer, 1 kidney cancer), the spectral properties of urine and blood were recorded simultaneously. The studies were approved by the ethical committee of Samara State Medical University.

Spectra processing
All spectra were recorded in the spectral range 780-950 nm with an exposure time of 20 seconds. Three spectra were recorded sequentially for each tested sample, and the final spectrum was obtained by averaging the three recorded spectra. The total recording time of the final spectrum was 3 minutes. The recorded spectra were processed by the method proposed by Zeng et al. [16] for separating the AF and Raman signals.
The raw spectrum of a urine sample is presented in Fig. 2. The spectrum contains a broad decreasing AF component and narrow Raman peaks. The AF component was approximated by a tenth-order polynomial function. The Raman component of the spectrum was obtained by subtracting the AF component from the raw spectrum. Further analysis of the RS and AF spectra was performed independently. Processing of the experimental data and calculation of the a posteriori probability of discriminating lung cancer from other types of tumors by body fluid spectral characteristics were performed on the basis of regression analysis. Prior to regression analysis, the raw spectral data were centered, smoothed by the Savitzky-Golay filter, and normalized by the standard normal variate (SNV) method [17]. Data centering decreases the model rank by one and is applicable in uniform model cases. The SNV method subtracts the mean value from each spectrum and divides each signal value by the standard deviation of the whole spectrum. It is used for leveling the dispersion of the experimental data [17].
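The SNV step described above can be illustrated with a minimal sketch. The function name `snv`, the made-up intensity values and the use of the sample standard deviation (n - 1 in the denominator) are assumptions for illustration only; the text does not state which estimator the authors' software applies.

```python
from statistics import mean, stdev


def snv(spectrum):
    """Standard normal variate: center one spectrum and scale it by its own
    standard deviation (sample estimator, an assumption of this sketch)."""
    m = mean(spectrum)
    s = stdev(spectrum)
    if s == 0:
        raise ValueError("a flat spectrum cannot be SNV-normalized")
    return [(x - m) / s for x in spectrum]


# Hypothetical intensities of a single spectrum (not data from the study)
raw = [10.0, 12.0, 15.0, 11.0, 9.0]
normalized = snv(raw)
```

After this transformation every spectrum has zero mean and unit standard deviation, so that spectra recorded with different overall intensity become comparable.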
The recorded spectra of body fluids may contain hidden links between different spectral bands because the same chemical components contribute to these bands, which results in multiple correlations (collinearities). The analyzed spectral data are therefore multicollinear, and projection methods are required for their analysis. Since the exact cancer type corresponding to each study sample is known a priori, the task can be posed as a training and classification problem. The most popular approach to such problems is discriminant analysis with regression on latent structures, PLS-DA [18].
The regression problem is solved by the PLS method, and the resulting regression prediction can then be applied to the classification of new samples. The PLS solution follows this algorithm: a two-dimensional matrix of spectra (predictors block) and a one-dimensional matrix of diagnoses (responses block) are decomposed into a matrix of scores, a matrix of loadings and a residual matrix. The matrix of loadings provides information about the role of the variables, and the matrix of scores provides information about the similarity and correlation of the data. After model construction, k-fold cross validation was used for accuracy testing.
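In a commonly used notation, the decomposition described above can be written as follows. The symbols are assumptions introduced here for illustration and do not appear in the original text: X is the n × p matrix of spectra, y is the vector of diagnosis codes, T is the n × A matrix of scores, P is the p × A matrix of loadings, q is the vector of response loadings, and E and f are the residuals for A latent components.

```latex
X = T P^{\mathsf{T}} + E, \qquad y = T q + f
```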
Cross validation was carried out as follows: 10% of the samples are excluded from the data set, a model is built on the remaining samples and then applied to the excluded part. Then the excluded part of the samples is returned, and the cycle is repeated 9 more times, each time excluding a different part of the samples.
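The splitting scheme of 10-fold cross validation can be sketched as below. This is only an illustration of how samples are partitioned: the function name `k_fold_indices`, the random shuffling and the seed are assumptions, and the model fitting itself is omitted. The value 38 corresponds to the 26 + 12 blood samples of the first cohort.

```python
import random


def k_fold_indices(n_samples, k=10, seed=0):
    """Yield (train, test) index lists so that each sample is held out once."""
    indices = list(range(n_samples))
    random.Random(seed).shuffle(indices)       # shuffling is an assumption
    folds = [indices[i::k] for i in range(k)]  # k nearly equal parts
    for i, test_idx in enumerate(folds):
        # The model is trained on every fold except the held-out one
        train_idx = [j for f_i, fold in enumerate(folds) if f_i != i for j in fold]
        yield train_idx, test_idx


splits = list(k_fold_indices(38, k=10))
for train_idx, test_idx in splits:
    # A PLS-DA model would be fitted on train_idx and evaluated on test_idx
    pass
```

Each sample appears in exactly one held-out part, so every prediction used for accuracy testing comes from a model that did not see that sample during training.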
The informative spectral bands were determined during regression model construction by analyzing the variable importance in projection (VIP) [19]. VIP evaluates the influence of individual variables from the predictors block on the PLS model. The higher the VIP-score of an individual variable, the more significant it is in model construction. Variables with a low VIP-score are less important and may be regarded as candidates for exclusion from the model. The VIP distribution makes it possible to define the most informative spectral bands in the blood and urine spectra for constructing a regression model for classification of patients with lung cancer and patients with other tumors.
Multivariate analysis was carried out using the TPT cloud beta software module (https://tptcloud.com). The statistical processing of the results, analysis of the correlation dependence and calculation of the Pearson correlation coefficient were performed in the IBM SPSS Statistics ver. 23 software package.
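For reference, one widely used definition of the VIP-score is given below. It is not quoted from the original text, and the variant implemented in [19] or in the software used may differ. The notation is an assumption: p is the number of spectral variables, A is the number of latent components, w_a is the weight vector of component a, t_a is its score vector and q_a is its response loading.

```latex
\mathrm{VIP}_j = \sqrt{\, p \, \frac{\sum_{a=1}^{A} SS_a \left( w_{ja} / \lVert w_a \rVert \right)^2}{\sum_{a=1}^{A} SS_a} \,}, \qquad SS_a = q_a^{2} \, t_a^{\mathsf{T}} t_a
```

With this definition the mean of the squared VIP-scores over all variables equals one, which is why a VIP-score above one is often taken as a rule of thumb for an informative variable.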

Raman spectra of blood
The Raman spectra of blood are presented in Fig. 3. As shown in Fig. 3, blood samples of patients with various tumors have qualitatively coinciding spectra. Differences are observed in the intensity of the individual spectral bands. Human body fluids have a complex chemical composition; therefore, the shape of body fluid Raman spectra and the intensities of certain spectral bands depend on the contribution of molecular vibrations of several components. This set of blood spectra was subjected to multivariate analysis for constructing a regression model. Fig. 4 shows the VIP-scores of the Raman spectra matrix of blood samples for the constructed regression model of lung cancer detection among tumors of other localizations.
The results shown in Fig. 4 make it possible to define the most informative spectral bands in the constructed regression model for discriminating lung cancer from other tumors by the analysis of blood Raman spectra. For example, the spectral band 790-820 cm⁻¹ corresponds to glutathione [20]. Oncological diseases are accompanied by a change in the relative amount of neutrophils, which stimulates oxidative stress in the patient's blood. Changes in glutathione concentration cause a decrease in the antioxidant activity of plasma; therefore, this decrease may be regarded as an informative criterion for assessing the oxidative stress of the organism [21]. Cancer tissues are characterized by increased proteolysis and an increased concentration of acute phase proteins [22]. The informative Raman bands associated with these changes are 946-970 cm⁻¹ (proteins), 1465-1475 cm⁻¹ (lipids, proteins) and 1640-1660 cm⁻¹ (proteins, phospholipids) [23-25]. The intensity of the spectral band 1135-1140 cm⁻¹ is proportional to the mannose concentration [26].
The metabolic imbalance of minor sugars and changes in the mannose concentration lead to changes in glycoprotein synthesis and in glycosylation [27]. As a result, the organism produces "abnormal" immunoglobulins, and the ability of the immune system to identify "abnormal" cells decreases. The constructed regression model makes it possible to discriminate lung cancer from other tumors by analysis of blood sample spectral characteristics with 84.9% a posteriori probability. To improve the informativeness of our research, the spectral characteristics of urine samples from cancer patients were analyzed.

Spectral characteristics of urine
Porphyrins (nitrogen-containing pigments) accumulate in sites of active cell division and are excreted in urine [28]. Alterations in the AF urine spectrum reflect changes and metabolic imbalance of porphyrins. Therefore, the AF intensity of urine can be used as an informative criterion of oncopathology growth. The AF spectrum of porphyrins has features in the red and near-infrared spectral ranges [29], so excitation of the AF spectra by a 785 nm laser makes it possible to evaluate the presence of porphyrins in the test sample. The raw spectra of urine samples from a patient with stomach cancer and a patient with lung cancer at different exposure times of laser radiation are shown in Fig. 5 (a, b). Fig. 5 demonstrates changes in AF intensity and in the photobleaching process for patients with different diagnoses. The photobleaching mechanism for various porphyrins in urine is quite complicated, since photosensitizing porphyrins may interact with various photo-oxidizing molecules in biological fluids [30]. To estimate AF correctly, the spectra were recorded in a standardized way, with a sample irradiation time of 3 minutes. Approximation curves of urine AF for the test samples are shown in Fig. 6. The features of AF of various urine samples are caused by changes in porphyrin metabolism and by the interaction of porphyrins with various organic molecules [29,30].
A regression model was built on the basis of the set of AF approximating curves. For the obtained model, the a posteriori probability of discriminating lung cancer from other tumors was 83.3%. To improve the informativeness of the analysis of urine spectral characteristics for detecting lung cancer, the Raman spectra of urine samples from cancer patients were also studied. Raman spectra of urine samples are presented in Fig. 7. A multivariate analysis based on the obtained urine Raman spectra was carried out, and a regression model was built. VIP-scores of the urine Raman spectra for the constructed regression model of lung cancer detection are shown in Fig. 8. Fig. 8 demonstrates the most informative spectral bands in the regression model constructed for discriminating lung cancer from other tumors by the analysis of the urine Raman spectra. These bands are 1000-1015 cm⁻¹ (urea) and 1525-1560 cm⁻¹ (tryptophan, proteins) [31,32]. The progress of oncopathology growth is accompanied by increased proteolysis, which corresponds to the changes in the intensity of the 1525-1560 cm⁻¹ band. The ammonia formed during proteolysis is then converted in the liver into urea. Urea is the end product of nitrogen metabolism in protein metabolism, and it may serve as a criterion for evaluating protein metabolism in the body cells [33].
Fig. 9. a - urea (1003 cm⁻¹) in urine and glutathione (803 cm⁻¹) in blood; b - tryptophan (1553 cm⁻¹) in urine and glutathione (803 cm⁻¹) in blood; c - pyruvate (1700 cm⁻¹) in urine and glutathione (803 cm⁻¹) in blood; d - urea (1003 cm⁻¹) in urine and mannose (1138 cm⁻¹) in blood; e - tryptophan (1553 cm⁻¹) in urine and mannose (1138 cm⁻¹) in blood; f - pyruvate (1700 cm⁻¹) in urine and mannose (1138 cm⁻¹) in blood; g - urea (1003 cm⁻¹) in urine and protein (1660 cm⁻¹) in blood; h - tryptophan (1553 cm⁻¹) in urine and protein (1660 cm⁻¹) in blood; i - pyruvate (1700 cm⁻¹) in urine and protein (1660 cm⁻¹) in blood.
Tumor cells are characterized by a high glucose intake, which is accompanied by anaerobic glycolysis. The marker of increased glycolysis is lactate dehydrogenase (LDH) [34]. LDH affects the concentration of pyruvic acid, which corresponds to the spectral band 1690-1705 cm⁻¹ (pyruvate) [32]. The constructed regression model makes it possible to discriminate lung cancer from other tumors by analysis of urine spectral characteristics with an a posteriori probability of 93.9%.

Combined analysis of urine and blood spectral data
The accuracy of the proposed approach to lung cancer determination can be improved by combining the spectral analysis data of blood and urine. A two-dimensional distribution of intensities proportional to the previously described changes of Raman spectra components in blood and urine is shown in Fig. 9 (a-i).
The a posteriori probability of lung cancer determination for the selected Raman bands of urine and blood spectra shown in Fig. 9 ranged from 76.2% to 94.9%. Proteins (1660 cm⁻¹) in blood and pyruvate (1700 cm⁻¹) in urine form the most informative combination of blood and urine components indicative of lung cancer.
We estimated the correlation between the main informative Raman bands of urine and blood. Urine is a product of blood filtration through the kidneys. Consequently, an increase in the concentration of blood components up to a certain reabsorption threshold can lead to a change in the concentration of the corresponding components in urine. Therefore, the urine component concentration can correlate with the blood component concentration. As follows from the VIP distribution, the most informative Raman bands for lung cancer determination are I_bl.803 (glutathione), I_bl.955 (proteins), I_bl.1138 (mannose), I_bl.1471 (proteins, lipids), I_bl.1556 (tryptophan) and I_bl.1660 (proteins, phospholipids) in blood analysis, and I_ur.1003 (urea), I_ur.1375 (arabinose), I_ur.1525 (proteins, tryptophan), I_ur.1553 (tryptophan) and I_ur.1700 (pyruvate) in urine analysis. Here I_bl(ur).i is the Raman intensity of the i-th band of the blood (bl) or urine (ur) spectra. Table 1 shows pair correlations between the most informative criteria for discriminating lung cancer from other tumors for both tested body fluids. Significant correlations (p-value < 0.01) are in bold type.
It follows from Table 1 that there is no significant correlation between the most informative criteria of lung cancer detection in blood and those in urine. Consequently, simultaneous analysis of the spectral characteristics of several body fluids may improve the accuracy of the proposed lung cancer detection method. The significant correlations between the I_bl.803 - I_bl.955, I_bl.803 - I_bl.1138 and I_bl.803 - I_bl.1660 criteria can be explained by the presence of glutathione. The glutathione spectrum has strong Raman peaks at 953 cm⁻¹, 1143 cm⁻¹ and 1660 cm⁻¹ [32], and glutathione therefore contributes to the corresponding blood spectral bands. Likewise, the significant correlation between I_bl.1556 and I_bl.803 is probably related to the fact that the tryptophan Raman spectrum has peaks near 803 cm⁻¹ and 1556 cm⁻¹, so tryptophan may contribute to the corresponding spectral bands [32].
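The pair correlations discussed above are Pearson correlation coefficients, which the authors computed in IBM SPSS Statistics. As a minimal sketch of the underlying calculation, the coefficient for two intensity series can be computed as follows; the function name `pearson_r` and the numerical values are hypothetical and are not taken from Table 1.

```python
from math import sqrt


def pearson_r(x, y):
    """Pearson correlation coefficient between two equally long sequences."""
    n = len(x)
    if n != len(y) or n < 2:
        raise ValueError("need two sequences of equal length >= 2")
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / sqrt(sxx * syy)


# Hypothetical band intensities for five samples (not data from the study)
i_bl_803 = [0.81, 0.92, 1.10, 1.25, 1.31]
i_bl_955 = [0.40, 0.47, 0.52, 0.66, 0.63]
r = pearson_r(i_bl_803, i_bl_955)
```

The significance test behind the p-values reported in Table 1 is not reproduced here.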

Discussion and Conclusions
Multivariate analysis of the blood sample experimental data shows that the intensity changes of the Raman bands are proportional to the concentration changes of glutathione, mannose and proteins, and these bands may be informative criteria for discriminating lung cancer from other tumors. The fluorescence intensity changes associated with porphyrins and the Raman intensity changes corresponding to urea, tryptophan and pyruvate are the most informative criteria for classifying lung cancer and other tumors by urine analysis. The a posteriori probabilities of separating lung cancer from other tumors based on the proposed methods of blood and urine analysis are presented in Table 2.
In the current study, the highest a posteriori probability of lung cancer detection is 94.9%. It was achieved by simultaneous analysis of urine and blood by the RS method. Separate RS analysis of blood and urine achieves 84.9% and 93.9% a posteriori probability of lung cancer detection, respectively. AF urine analysis made it possible to separate lung cancer from other tumors with 83.3% a posteriori probability. However, the blood spectra analysis was performed for a larger number of samples, and the lower a posteriori probability in blood RS analysis compared with urine RS analysis may be associated with this fact. Therefore, additional studies with a large number of body fluid samples are necessary in order to determine the precise capabilities of the proposed method. On the other hand, there is no significant correlation between the most informative criteria of lung cancer detection in the two body fluids. This shows that simultaneous study of blood and urine samples can increase the informativeness of the analysis and improve the probability of separating lung cancer from other tumors by using a combination of RS and AF.
Comparison of the obtained results with those of other studies shows that the proposed optical method may become the basis for cancer screening and may be used in combination with other methods to enhance the informativeness of research. For example, a non-invasive and cost-effective cancer screening method (breast cancer, cervical cancer, colon cancer, leukemia, esophageal cancer, liver cancer, bladder cancer) based on fluorescence analysis was demonstrated by V. Masilamani et al. [35]. That study showed 86.7% a posteriori probability of detecting various cancers by changes in flavoproteins and porphyrins excreted in urine. Adding the data of urine and blood Raman analysis makes it possible to increase the accuracy of lung cancer detection by including information about the urea, tryptophan and pyruvate content in the analysis. G. Del Mistro et al. [14] demonstrated 95% a posteriori probability of prostate cancer detection by urine RS analysis. The greatest spectral changes for the urine samples of the prostate cancer group are associated with changes in the 6-oxypurine content. A diagnostic study of body fluid spectral characteristics using RS for oral cancer detection was demonstrated by S. Jaychandran et al. [31]. Analysis of 158 urine samples, 158 blood samples and 158 saliva samples made it possible to define the differences between the healthy, precancerous and cancer groups with 90.5%, 78% and 93.1% a posteriori probability, respectively, for each body fluid. For the studied groups, the main spectral differences of blood samples are associated with changes in phenylalanine, lipids, collagen, purine and amide I; the spectral differences of urine samples are associated with changes in creatinine, tryptophan and indoxyl sulfate. Thus, the informativeness of the above-mentioned studies could be increased by adding data on the glutathione, porphyrin and pyruvate content to the analysis.
Besides the combined analysis of several body fluids, the accuracy of cancer detection by analysis of body fluid spectral characteristics can be increased by preliminary isolation of certain markers from the tested samples, as demonstrated by Shangyuan Feng et al. [36]. The proposed method separates modified nucleotides from urine samples by affinity chromatography, followed by RS analysis of the nucleotides. PLS-DA analysis of the spectroscopic data achieves 95% a posteriori probability of separating nasopharyngeal cancer, esophageal cancer and a healthy group by urine RS analysis. This method demonstrated high accuracy; however, such analysis is complicated, since separating a specific substance from a body fluid sample requires a certain ligand.
The discussed approaches to detecting various cancers demonstrate that the proposed method may prove to be an alternative to the available cancer detection techniques. The informativeness of blood and urine studies may be increased by combined AF and RS study and joint analysis of the registered spectroscopic data. However, a comprehensive understanding of the cancer detection capability of the proposed method requires that the number of enrolled patients be increased. It is also advisable to check the sensitivity and specificity of the method for detecting cancer among patients with non-oncological diseases and healthy people. To do this, numerous studies with body fluid samples from people without oncological pathologies should be performed. In addition to studying the spectral characteristics of urine and blood, other body fluids may also be used [37,38] as research objects for increasing the cancer detection accuracy.