Multidimensional Analysis of Dermoscopic Images and Spectral Information for the Diagnosis of Skin Tumors

The paper is devoted to the identification of skin tumors and the interpretation of their component composition based on multidimensional analysis of Raman scattering spectral data and dermoscopic images. The dataset contains 65 samples of malignant melanomas, 107 seborrheic keratoses, and 166 nevi. Multivariate curve resolution alternating least squares analysis was applied to the Raman spectra to extract the spectral profiles of the main skin components and their relative concentrations. The obtained biochemical profiles of skin neoplasms were analyzed by the gradient boosting method. Dermoscopic image analysis was performed using a convolutional neural network with a modified Visual Geometry Group 16-layer model architecture. Joint analysis of component and spatial features was carried out using logistic regression of the predicted values of the model based on Raman spectra and the model based on image analysis. As a result, a binary model for the classification of malignant melanoma versus pigmented benign neoplasms was constructed, showing an area under the receiver operating characteristic curve of 0.94 (0.90−0.98, 95% confidence interval). Combining spatial features with component features derived from Raman spectra makes it possible to increase the efficiency of diagnosing skin cancer. © 2024 Journal of Biomedical Photonics & Engineering


Introduction
Skin neoplasms are among the most commonly diagnosed forms of cancer [1]. However, visual diagnosis of skin tumors depends significantly on the qualifications and professional experience of the physician, and its accuracy varies from 40 to 80% because many different nosologies of skin tumors share similar distinctive features [2]. The gold standard in oncology is histological examination, but it requires tissue sampling and is not applicable to initial examination or mass monitoring. Therefore, there is an urgent need to develop tools and information processing methods that allow the doctor to diagnose skin tumors in a timely and accurate manner.
In recent years, optical methods have been increasingly used to investigate biological tissues and fluids [3−6]. Raman spectroscopy (RS) methods are the most sensitive to changes in the chemical composition of neoplasms relative to healthy skin, because they record the intensities of the vibrational-rotational modes of the functional groups of nucleic acid, protein, lipid, and hydrocarbon molecules [7−9]. It has been shown that discriminant analysis of RS samples using projection to latent structures (PLS-DA) provides a statistically reliable result [10]. Principal component analysis was successfully used to obtain differences in spectral data [11,12]. Optical diagnostics of skin cancer has been further developed through the use of classification algorithms based on neural networks, which demonstrate high sensitivity and specificity in real time. It was shown that the use of convolutional neural networks (CNN) for analyzing RS spectra makes it possible to determine the type of skin tumor with an accuracy comparable to the diagnostic accuracy of qualified dermatologists [13,14]. However, in most studies this approach is presented as a "black box" that focuses exclusively on the classification problem; the classification criteria very often remain unknown and therefore cannot be physically interpreted.
J of Biomedical Photonics & Eng 10(1) 2024 22 Mar 2024 © J-BPE 010307-2
Recently, the method of multivariate curve resolution (MCR) has found effective application in biology and medicine for determining the concentrations of mixture components from spectra [15−17]. The MCR method has been applied to the analysis of RS spectra of mixtures of proteinogenic amino acids [18], to the study of the effect of noise on the resolution of RS spectra [19], and to the diagnosis of skin diseases [20]. It was shown [20] that each RS spectrum may be represented with 95% accuracy by the concentration profile of eight groups of spectral components: elastin and natural moisturizing factor (1386, 1518, 1174, and 1230 cm−1), keratin, collagen and elastin (1248, 1450, 1656 cm−1), water (1642 cm−1), proteins, lipids and nucleic acids (1248, 1271, 1301, 1318 cm−1), proteins, lipids, natural moisturizing factor and melanin (1144, 1275, 1750, 1355, 1386, 1559, 1694 cm−1), proteins and water (1450, 1642 cm−1), and the contribution of the optical system. The relative component composition of skin neoplasms reconstructed using MCR analysis of Raman spectra can be physically interpreted and contains sufficient information for the classification of neoplasms [20]. However, the area under the receiver operating characteristic curve (ROC AUC) does not exceed 0.70 for discriminant analysis by PLS-DA.
It is known that the efficiency of tumor recognition can be increased by implementing multimodal diagnostic methods [21] that combine several qualitatively different groups of features. Since RS uses spectral and component profiles as characteristic distinguishing features, it is most natural to supplement them with spatial properties that characterize the heterogeneity of tumor growth and can be obtained from the analysis of dermoscopic images. Machine learning algorithms perform well in analyzing dermoscopic images when working with several hundred features [22,23], but classification accuracy decreases as the number of features grows. To overcome this shortcoming, deep learning methods are used [24−26], which require a huge amount of data. It is difficult to collect such a volume of data, so methods for enlarging the dataset and for additionally training the classifier as new data are added are of interest.
Therefore, the aim of this research is to develop a method for identifying skin tumors and interpreting their component composition based on multidimensional analysis of RS data and dermoscopic images.

Experimental Data
All Raman spectra and dermoscopic images of skin lesions were obtained using the experimental system (see Fig. 1), which combines a portable RS unit and a multispectral digital dermatoscope. The RS unit records Raman spectra in the near-infrared region under excitation by 785 nm radiation. The multispectral digital dermatoscope with a Basler acA1920-25uc camera (RGB color system, resolution ~ 13 μm/pixel) [28] records images in polarized visible light (1920 × 1080, 48 bits, 96 dpi, .tiff).
All measurements of in vivo RS spectra were carried out with a spectral resolution of 0.2 nm in the range from 800 to 1000 nm at a signal accumulation time of 60 s. The optical detector was located above the tissue sample at a distance of 7−8 mm. The diameter of the probe radiation beam on the skin was approximately 3 mm. The laser power density on the skin did not exceed 1.56 W/cm2. Preprocessing of RS spectra included cropping to the range of 860−920 nm, which corresponds to 1114−1874 cm−1, smoothing with a Savitzky-Golay filter, and baseline removal to eliminate autofluorescence. Dermoscopic images were recorded in polarized illumination mode.
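The correspondence between the cropped wavelength window and the Raman shift range can be checked with a short sketch. It assumes a nominal excitation wavelength of exactly 785 nm; the function name `raman_shift_cm1` is illustrative and not taken from the paper. With this nominal value the window maps to roughly 1111−1869 cm−1, within a few cm−1 of the reported 1114−1874 cm−1, which would be consistent with an actual laser line slightly below 785 nm.

```python
def raman_shift_cm1(wavelength_nm, excitation_nm=785.0):
    """Raman shift (cm^-1) of scattered light relative to the excitation line."""
    # 1 nm = 1e-7 cm, so the wavenumber in cm^-1 equals 1e7 / wavelength_nm.
    return 1e7 / excitation_nm - 1e7 / wavelength_nm


# Edges of the preprocessing window quoted in the text (860-920 nm).
low = raman_shift_cm1(860.0)
high = raman_shift_cm1(920.0)
print(round(low), round(high))
```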
The study protocols were approved by the ethics committee of Samara State Medical University (protocol No. 132 of May 29, 2013). An excisional biopsy was taken for each of the lesions, and the diagnosis was established by histological analysis. The dataset contains 65 samples of malignant melanomas (MM), 107 seborrheic keratoses (SK), and 166 nevi (NE). The dermoscopic image set contains 65 MM, 166 NE, and 107 SK. The electronic resource HAM10K (224 × 224, 24 bits, 96 dpi, .jpeg) with 1113 MM, 6705 NE, and 1099 SK images [29] was used for preliminary training of the neural network.

Analysis of Raman Spectra by Multivariate Curve Resolution Method
The alternating least squares (ALS) algorithm implemented in MCR-ALS GUI 2.0 [30] was used to reconstruct the concentrations and spectral profiles of skin components. Applying the MCR method to in vivo RS spectra makes it possible to capture the contribution of the most significant skin components. As a classification method, we used the gradient boosting algorithm, namely the Light Gradient Boosting Machine (LightGBM) framework. It is an ensemble algorithm based on decision tree models that are sequentially added to the ensemble to correct the errors made by previous models [31].
The proposed approach makes it possible to extract important diagnostic information from spectral data that is understandable to the doctor, namely the relative concentrations of the components in the sample.
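The idea behind MCR-ALS can be illustrated with a minimal NumPy sketch. This is not the MCR-ALS GUI 2.0 tool used in the paper: it assumes a bilinear model D ≈ C·S, enforces non-negativity by simple clipping rather than by a proper non-negative least squares solver, and runs on synthetic two-component data. All names (`mcr_als`, `pure`, `conc`) are hypothetical.

```python
import numpy as np


def mcr_als(D, n_components, n_iter=100, seed=0):
    """Factor D (spectra x channels) as C @ S with C >= 0 and S >= 0."""
    rng = np.random.default_rng(seed)
    # Initial spectral estimates: randomly chosen measured spectra.
    rows = rng.choice(D.shape[0], size=n_components, replace=False)
    S = D[rows].copy()
    for _ in range(n_iter):
        # Least-squares update of concentrations, clipped so that C >= 0.
        C = np.clip(D @ np.linalg.pinv(S), 0.0, None)
        # Least-squares update of spectral profiles, clipped so that S >= 0.
        S = np.clip(np.linalg.pinv(C) @ D, 0.0, None)
    return C, S


# Synthetic mixtures of two Gaussian "pure component" bands.
x = np.linspace(0.0, 1.0, 200)
pure = np.vstack([np.exp(-((x - 0.3) / 0.05) ** 2),
                  np.exp(-((x - 0.7) / 0.08) ** 2)])
conc = np.random.default_rng(1).uniform(0.1, 1.0, size=(30, 2))
D = conc @ pure

C, S = mcr_als(D, n_components=2)
# Relative component composition of each spectrum (rows sum to about 1).
rel = C / np.maximum(C.sum(axis=1, keepdims=True), 1e-12)
```

The rows of `rel` play the role of the relative concentrations that are later passed to the classifier as features.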

Convolutional Neural Network
For the analysis of spatial data, we used a CNN based on a modified Visual Geometry Group 16-layer model (VGG16) architecture (see Fig. 2). We used the Python library Keras to build the CNN architecture. The neural network receives dermoscopic images with dimensions of 224 × 224 pixels as input. The classifier consists of five alternating blocks of Conv2D and MaxPooling2D layers (with the ReLU activation function) followed by fully connected Dense layers at the output. The Conv2D layer performs the convolution operation with a 3 × 3 sliding window. The window slides along the feature map and extracts a three-dimensional patch of a fixed shape (window height, window width, input depth) at each possible position. MaxPooling2D layers are used to decrease the resolution of the feature map, which reduces the number of trainable parameters. MaxPooling2D selects the maximum value from each extracted window of the input feature map. In this paper, the window size was 2 × 2, which halves the resolution. The total number of trainable parameters was 1,097,345. The sigmoid was used as the activation function of the output layer, and the optimizer was Adam.
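The layer structure described above can be sketched in Keras as follows. The paper does not list the filter counts, the size of the Dense layers, the padding mode, or the loss function, so the values below (`filters`, `dense_units`, `padding="same"`, binary cross-entropy) are assumptions chosen only for illustration, and the sketch is not expected to reproduce the reported 1,097,345 parameters. The helper `pooled_size` simply shows how five 2 × 2 pooling steps shrink a 224 × 224 input.

```python
def pooled_size(size=224, n_blocks=5, pool=2):
    """Side length of the feature map after n_blocks of pool x pool max pooling."""
    for _ in range(n_blocks):
        size //= pool
    return size


def build_cnn(filters=(16, 32, 64, 64, 64), dense_units=64):
    """VGG-style binary classifier; filter and unit counts are illustrative guesses."""
    # Imported lazily so that pooled_size() can be used without TensorFlow installed.
    from tensorflow import keras
    from tensorflow.keras import layers

    model = keras.Sequential([keras.Input(shape=(224, 224, 3))])
    for f in filters:
        # One block: 3 x 3 convolution with ReLU, then 2 x 2 max pooling.
        model.add(layers.Conv2D(f, 3, padding="same", activation="relu"))
        model.add(layers.MaxPooling2D(2))
    model.add(layers.Flatten())
    model.add(layers.Dense(dense_units, activation="relu"))
    model.add(layers.Dense(1, activation="sigmoid"))  # binary output
    model.compile(optimizer="adam", loss="binary_crossentropy",
                  metrics=["accuracy"])
    return model
```

With "same" padding, the spatial size goes 224, 112, 56, 28, 14, 7, so the last convolutional block feeds a 7 × 7 feature map into the Dense layers.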
The data were divided into training, validation, and test sets in a ratio of 8:1:1. The dermoscopic image dataset may contain an arbitrary number of neoplasms of each type; therefore, when forming the training, validation, and test sets, the classes were balanced as follows. The set of the smaller class included all available samples, and the set of the larger class was formed randomly from the available samples of that class. Given the random nature of the sampling, the classification models were constructed at least 10 times, and the results were calculated by averaging the model metrics over the experiments performed. Thus, at each stage of constructing a classification model for MM versus benign pigmented neoplasms, 65 MM, 26 SK, and 39 NE were used, randomly taken from the dataset obtained by the multispectral digital dermatoscope. Since the classes in the HAM10K dataset are also unbalanced, the number of images of the smaller class (MM) was increased using various geometric transformations, such as rotation by a random angle, shifts, zooming, etc.
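The balancing step can be sketched in plain Python. The sample identifiers and the function name `balanced_sample` are hypothetical; only the class sizes (65 MM, 107 SK, 166 NE) and the numbers drawn per run (26 SK and 39 NE) come from the text.

```python
import random


def balanced_sample(mm, sk, ne, n_sk=26, n_ne=39, seed=0):
    """Keep every MM sample and draw a random benign subset of equal total size."""
    rng = random.Random(seed)
    benign = rng.sample(sk, n_sk) + rng.sample(ne, n_ne)
    return list(mm), benign


# Placeholder identifiers standing in for the recorded lesions.
mm_ids = [f"MM{i}" for i in range(65)]
sk_ids = [f"SK{i}" for i in range(107)]
ne_ids = [f"NE{i}" for i in range(166)]

mm_set, benign_set = balanced_sample(mm_ids, sk_ids, ne_ids)
```

Changing `seed` between runs gives the repeated random draws over which the metrics are averaged.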
The stability of the resulting models was checked by 10-fold cross-validation. The cross-validation was carried out using the following algorithm: 10% of the objects in the set were excluded from it and became the test set. The model was rebuilt on the remaining part of the data. Then the excluded objects were returned, a random 10% of the objects that had not yet been in the test set were selected as a new test set, and the model was rebuilt again. The cycle was repeated 10 times, and the final performance of the model was calculated as the average of the ten test results obtained. Cross-validation makes it possible to determine the optimal parameters of the class separation model and to avoid overfitting.
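The cross-validation loop described above amounts to partitioning the samples into ten disjoint test folds. A minimal sketch, assuming 130 balanced samples and a hypothetical generator name `ten_fold_indices`:

```python
import random


def ten_fold_indices(n_samples, n_folds=10, seed=0):
    """Yield (train, test) index lists; each sample is in the test set exactly once."""
    idx = list(range(n_samples))
    random.Random(seed).shuffle(idx)
    # Every n_folds-th shuffled index goes to the same fold.
    folds = [idx[i::n_folds] for i in range(n_folds)]
    for k in range(n_folds):
        test = folds[k]
        train = [i for j, fold in enumerate(folds) if j != k for i in fold]
        yield train, test


splits = list(ten_fold_indices(130))
```

The model is refitted on each `train` list and scored on the matching `test` list, and the ten scores are averaged.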
The learning process was based on the concept of transfer learning: the network was first trained on the HAM10K data and then additionally trained on the second dataset, obtained by the multispectral digital dermatoscope presented in Fig. 1. During the additional training, the four upper layers were frozen. Freezing the layers is necessary to preserve the common features previously identified by the classifier and the resulting weights.
Two datasets are used because the dataset obtained by the multispectral digital dermatoscope contains few images and because the characteristics of the dermoscopic equipment differ. Training on HAM10K alone is not enough for the CNN to work with other data; the network must be additionally trained on images recorded by the multispectral digital dermatoscope.
The optimal number of epochs was determined experimentally by analyzing the accuracy graphs for the training and validation sets. Accuracy was calculated as the proportion of correct responses of the neural network. To analyze what the trained CNN pays attention to when forming a prediction, attention maps were built using the Python library TensorFlow. The areas to which the CNN pays more or less attention when forming a diagnosis are highlighted in yellow or blue, respectively.

Multidimensional Analysis by Logistic Regression
In this research, we attempted a joint analysis of the spectral component features identified by the MCR method and the spatial features extracted by a CNN from dermoscopic images. It was proposed to combine the classification models at the very last stage, where each model outputs a predicted value for each observation. Thus, we used logistic regression of the predicted values of both models for multidimensional analysis of neoplasms based on their RS-derived component compositions and the spatial properties of dermoscopic images. ROC AUC was chosen as a scalar characteristic for comparing several models. ROC AUC confidence intervals (CI) were calculated for a probability level of at least 95%.
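The late-fusion step can be sketched in plain Python: a two-feature logistic regression takes the predicted values of the two models as inputs, and ROC AUC is computed as the probability that a randomly chosen sample of class 1 scores above a randomly chosen sample of class 0. The numbers below are toy values, not data from the study, and the fitting routine (plain gradient descent with a hypothetical learning rate and iteration count) is an assumption; the paper does not specify how the regression was fitted.

```python
import math


def roc_auc(labels, scores):
    """Fraction of (class 1, class 0) pairs ranked correctly; ties count as 1/2."""
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def fit_logistic(X, y, lr=0.5, n_iter=2000):
    """Gradient descent on the mean log-loss; returns (weights, bias)."""
    w, b = [0.0] * len(X[0]), 0.0
    for _ in range(n_iter):
        gw, gb = [0.0] * len(w), 0.0
        for xi, yi in zip(X, y):
            p = sigmoid(sum(wj * xj for wj, xj in zip(w, xi)) + b)
            for j, xj in enumerate(xi):
                gw[j] += (p - yi) * xj
            gb += p - yi
        w = [wj - lr * gj / len(X) for wj, gj in zip(w, gw)]
        b -= lr * gb / len(X)
    return w, b


def predict(X, w, b):
    return [sigmoid(sum(wj * xj for wj, xj in zip(w, xi)) + b) for xi in X]


# Toy predicted values: (RS model, CNN model) per lesion; 0 = MM, 1 = benign.
X = [(0.2, 0.3), (0.6, 0.2), (0.3, 0.7), (0.4, 0.1),
     (0.8, 0.6), (0.4, 0.9), (0.7, 0.4), (0.9, 0.8)]
y = [0, 0, 0, 0, 1, 1, 1, 1]

auc_rs = roc_auc(y, [x[0] for x in X])
auc_cnn = roc_auc(y, [x[1] for x in X])
w, b = fit_logistic(X, y)
fused = predict(X, w, b)
auc_fused = roc_auc(y, fused)
```

In practice the regression would be fitted on training folds only and evaluated on held-out lesions, as in the cross-validation scheme described earlier.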

Analysis of Raman spectra by Gradient Boosting Method
Information about the relative component composition of each neoplasm was used as the feature set for recognizing RS spectra. To overcome the low accuracy of the PLS-DA method in analyzing MCR-resolved RS spectra reported in Ref. [20], it was proposed to use the ensemble algorithm LightGBM. As a result of the selection of the LightGBM hyperparameters, the maximum number of leaves was set to 30, and the maximum number of trees did not exceed 100. The stability of the resulting models was checked by 10-fold cross-validation. A binary model for classifying MM versus benign pigmented neoplasms (SK+NE) was constructed (RS model). The ROC curve is shown in Fig. 3 (blue line). Model quality metrics are presented in Table 1. The model shows an accuracy, sensitivity, and specificity of 0.78, 0.77, and 0.80, respectively, and a ROC AUC of 0.82 (0.76−0.89, 95% CI), which is significantly higher than the value of 0.69 (0.63−0.76, 95% CI) obtained with the PLS-DA method [10,20]. Automating the recognition of MM while preserving the possibility of physically interpreting the obtained data on the relative composition of the skin area makes the algorithm a candidate for widespread use in decision control systems. However, despite the superiority of the LightGBM model over the results of previous studies, the ROC AUC values do not exceed 0.90. To increase the efficiency of tumor recognition, it is necessary to supplement the spectral classification features with qualitatively new characteristics, which leads us to the analysis of dermoscopic images.
Fig. 3 ROC curves of RS model (blue line), spatial CNN model (green line), and multidimensional model (red line). ROC AUC confidence intervals are calculated for a probability level of at least 95%.
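The two hyperparameters reported for the RS model can be written down as a small configuration sketch. It assumes the scikit-learn style interface of the `lightgbm` package and treats the stated upper bound of 100 trees as the value of `n_estimators`; all other settings are left at library defaults, which may differ from what the authors used. The names `LGBM_PARAMS` and `build_rs_classifier` are illustrative.

```python
# Hyperparameters stated in the text; everything else is a library default.
LGBM_PARAMS = {"num_leaves": 30, "n_estimators": 100}


def build_rs_classifier(params=LGBM_PARAMS):
    """Return an untrained LightGBM classifier for MM vs. SK+NE."""
    # Lazy import: lightgbm is only required when the model is actually built.
    from lightgbm import LGBMClassifier
    return LGBMClassifier(**params)
```

The classifier would then be fitted on the matrix of relative component concentrations produced by the MCR step, one row per lesion.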

Analysis of Dermoscopic Images by the Convolutional Neural Network
Using the transfer learning approach described above, a CNN classifier for MM versus benign pigmented neoplasms (SK+NE) was constructed (spatial CNN model). The accuracy graph for the preliminary CNN training on the HAM10K dataset shows that the model is trained smoothly (Fig. 4a) and that the number of correctly classified images increases. The attention maps for test images (Fig. 5a) show that the neural network trained on the HAM10K dataset pays attention to the tumor boundary, focusing on the size and shape of the object in the decision-making process. At the next stage, when the network is additionally trained on images recorded using the experimental system, there is a general trend toward increasing accuracy (Fig. 4b), but the model behaves somewhat unstably, which may be due to the small amount of data used for additional training. As for the attention maps (Fig. 5b), the model begins to pay attention to the neoplasm itself, assessing not only its shape and size but also its relief. By taking textural features into account, the neural network analyzes the heterogeneity of tumor growth.
The ROC curve is shown in Fig. 3 (green line). Model quality metrics are presented in Table 1. The use of a CNN in combination with transfer learning and an additional dataset for preliminary training makes it possible to obtain a stable classification model with an accuracy of 85%. The model shows a sensitivity and specificity of 0.85 each and a ROC AUC of 0.87 (0.82−0.93, 95% CI), which is slightly higher than the value of 0.82 (0.76−0.89, 95% CI) obtained by analyzing the component compositions derived from RS spectra using the LightGBM method.

Multidimensional Analysis of Raman Spectra and Dermoscopic Images of Neoplasms
As a result of logistic regression of the predicted values of the two independent models, the component composition model and the spatial CNN model, a joint binary model for classifying MM versus benign pigmented neoplasms (SK+NE) was constructed (multidimensional model). The stability of the resulting combined multidimensional model was tested by 10-fold cross-validation. The ROC curve is shown in Fig. 3 (red line). Model quality metrics are presented in Table 1. The presented metrics show that the classification model based on multidimensional analysis distinguishes well between MM and SK+NE. The model shows an accuracy, sensitivity, and specificity of 0.84, 0.83, and 0.85, respectively. The quality metrics calculated for the 0.5 threshold are slightly inferior to the results provided by the CNN model. However, these metrics depend on the chosen classification threshold. For a commonly used classification threshold of 0.67 (two-thirds of the maximum possible value), an accuracy of 0.87 and a specificity of 1.00 can be achieved. This result suggests that the described multidimensional method can be used effectively for diagnosing neoplasms in clinical practice and in population screening programs, since for the classification case under consideration the model will not miss any MM: it identifies as MM all samples that are really MM. Of course, in this case some of the samples (not more than 25%) will be identified as MM by the model although they are in fact benign pigmented neoplasms. However, all these cases can be resolved by an in-depth study, which is always carried out when diagnosing MM.
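The dependence of the metrics on the classification threshold can be illustrated with a short sketch. It follows the coding used in this paper, in which predicted values near 0 indicate MM and values near 1 indicate benign neoplasms, so that a sample is called MM when its predicted value falls below the threshold. The labels and predicted values are toy numbers, and the function and key names are hypothetical; the rates are named by class rather than as sensitivity and specificity to avoid ambiguity about which class is treated as positive.

```python
def rates_at_threshold(labels, preds, threshold):
    """labels: 0 = MM, 1 = benign; a prediction below the threshold is called MM."""
    called_mm = [p < threshold for p in preds]
    n_mm = sum(1 for y in labels if y == 0)
    n_benign = len(labels) - n_mm
    mm_found = sum(1 for y, c in zip(labels, called_mm) if y == 0 and c)
    benign_found = sum(1 for y, c in zip(labels, called_mm) if y == 1 and not c)
    return {
        "mm_detected": mm_found / n_mm,
        "benign_detected": benign_found / n_benign,
        "accuracy": (mm_found + benign_found) / len(labels),
    }


labels = [0, 0, 0, 0, 1, 1, 1, 1]
preds = [0.10, 0.30, 0.45, 0.60, 0.55, 0.70, 0.80, 0.95]

at_050 = rates_at_threshold(labels, preds, 0.5)
at_067 = rates_at_threshold(labels, preds, 0.67)
```

In this toy case, raising the threshold from 0.5 to 0.67 catches the one MM that was missed at 0.5, at the price of calling one benign lesion MM, which mirrors the trade-off described in the text.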
The ROC AUC of the multidimensional model is 0.94 (0.90−0.98, 95% CI), which surpasses the ROC AUC of the RS and spatial CNN models, where the ROC AUC was 0.82 (0.76−0.89, 95% CI) and 0.87 (0.78−0.95, 95% CI), respectively. This means that, in general, the multidimensional model distinguishes between MM and benign pigmented neoplasms better than the models based on component analysis by the LightGBM method or on CNN analysis of dermoscopic images. This is explained by the ability of the multidimensional model to use spatial classification features when prediction by spectral features fails, and vice versa. This can be seen in Fig. 6, where the studied neoplasms are shown in the three-dimensional space of the predicted values of the constructed RS, spatial CNN, and multidimensional models. Each marker in Fig. 6a is one of the neoplasms, placed at the coordinates of its predicted values for the three models, and its color indicates the true diagnosis of the sample obtained by histological examination: yellow for MM, blue for SK or NE. The predicted values of the models range from 0 to 1, where 0 means MM and 1 means benign pigmented neoplasms (SK+NE). If the predicted value is less than the threshold value (in this case 0.5), the model classifies the sample as MM; if it exceeds the threshold value, the sample is classified as SK+NE. The projection of the tumor markers onto the prediction plane of the RS and spatial CNN models is shown in Fig. 6b. Here, the outlines of the markers indicate whether the multidimensional model determined the diagnosis correctly. Samples that were correctly recognized by the multidimensional model are circled in green, and samples that were incorrectly recognized are circled in red.
As one can see, only a small part of the borderline cases is subject to incorrect classification (when independent RS and spatial CNN models make different decisions, that is, when one model shows a predicted value above the classification threshold, and the other model shows a lower one).These cases are shown in Fig. 6b in the graphs in quadrants 2 and 4. Thus, the multidimensional method allows us to resolve controversial cases when spectral (component compositions) features indicate one diagnosis, and spatial features indicate another, and draw a conclusion about the correct diagnosis.
Compared with the results of other studies, the proposed model shows a clear advantage. In study [2], the average sensitivity and specificity of 58 dermatologists for the classification of 100 skin lesions were 86.6% (±9.3%) and 71.3% (±11.2%), respectively. This corresponds to a mean ROC AUC of 0.79 (±0.06). Moreover, the average values of sensitivity, specificity, and ROC AUC for the expert group are 89% (±9.2%), 74.5% (±12.6%), and 0.82 (±0.06), respectively [2]. Thus, the combined classification model can help ordinary dermatologists to identify skin lesions more effectively on average.

Conclusion
The results obtained support the feasibility of multidimensional analysis of multimodal optical skin data, namely RS spectra and dermoscopic images. Combining spatial features of dermoscopic images and the component features identified by the MCR method allows us to preserve the advantages of the physical interpretability of the results of MCR analysis and at the same time allows us to include in the analysis the entire volume of spectral information and textural features of tumors visible on the images.
Due to the complementarity of the distinguished spatial, spectral, and component features, the method is characterized by high stability and accuracy, with a ROC AUC of 0.94 (0.90−0.98, 95% CI) for the binary classification of MM versus benign pigmented neoplasms. The ROC AUC is thus higher than when using only spatial features or only spectral and component features, by 0.07 and 0.12 (7 and 12 percentage points), respectively.
The developed methods, models, and algorithms of multidimensional analysis make it possible to recognize the type of neoplasm and increase the efficiency of diagnosing skin cancer to a level comparable to or exceeding the efficiency of diagnosing neoplasms by general practitioners and medical specialists by combining spatial, spectral, and component signs.

Fig. 4 Accuracy graphs: (a) for the model trained on the HAM10K set, (b) for the model additionally trained on the second data set registered by the experimental system.

Fig. 5 Attention maps for images from the test set: (a) for the model trained on the HAM10K set, (b) for the model additionally trained on the second set registered by the experimental system (yellow: more attention, blue: less attention). The axes indicate pixels.

Fig. 6 Scatter diagram of predicted values of designed RS, spatial CNN, and multidimensional models of RS spectra and images: (a) three-dimensional space of the predicted values of all models, (b) projection on prediction plane of RS and spatial CNN models.

Table 1
Quality metrics for the binary classification model of MM and SK+NE.