Visualization and Characterization of Nucleotide Sequences in Bioinformatics: State-of-Art in the Past and Present
Paper #8980 received 29 May 2023; revised manuscript received 29 Aug 2023; accepted for publication 29 Aug 2023; published online 21 Nov 2023.
DOI: 10.18287/JBPE23.09.040201
References
1. S. Goodwin, J. D. McPherson, and W. R. McCombie, “Coming of age: ten years of next-generation sequencing technologies,” Nature Reviews Genetics 17, 333–351 (2016).
2. S. Neidle, M. Sanderson, Principles of nucleic acid structure, 2nd ed., Academic Press (2021). ISBN: 9780128196786.
3. J. F.-W. Chan, S. Yuan, K.-H. Kok, K. K.-W. To, H. Chu, J. Yang, F. Xing, J. Liu, C. C.-Y. Yip, R. W.-S. Poon, H.-W. Tsoi, S. K.-F. Lo, K.-H. Chan, V. K.-M. Poon, W.-M. Chan, J. D. Ip, J.-P. Cai, V. C.-C. Cheng, H. Chen, C. K.-M. Hui, and K.-Y. Yuen, “A familial cluster of pneumonia associated with the 2019 novel coronavirus indicating person-to-person transmission: a study of a family cluster,” Lancet 395(10223), 514–523 (2020).
4. “Official hCoV-19 Reference Sequence, hCoV-19/Wuhan/WIV04/2019,” GISAID (accessed on 10 April 2023). [https://www.epicov.org/epi3/frontend#541d4].
5. “Influenza A virus (A/Weiss/43 (H1N1)) neuraminidase (NA) gene, complete cds,” National Library of Medicine (accessed on 10 April 2023). [https://www.ncbi.nlm.nih.gov/nuccore/AF250365.2].
6. “Variola virus, complete genome,” National Library of Medicine (accessed on 10 April 2023). [https://www.ncbi.nlm.nih.gov/nuccore/NC_001611.1].
7. A. S. Borovik, A. Yu. Grosberg, and M. D. Frank-Kamenetskii, “Fractality of DNA texts,” Journal of Biomolecular Structure and Dynamics 12(3), 655–669 (1994).
8. C.-K. Peng, S. V. Buldyrev, A. L. Goldberger, S. Havlin, F. Sciortino, M. Simons, and H. E. Stanley, “Long-range correlations in nucleotide sequences,” Nature 356(6365), 168–170 (1992).
9. W. Li, K. Kaneko, “Long-range correlation and partial 1/f^α spectrum in a noncoding DNA sequence,” Europhysics Letters 17(7), 655–660 (1992).
10. W. Li, “Generating nontrivial long-range correlations and 1/f spectra by replication and mutation,” International Journal of Bifurcation and Chaos 2(1), 137–154 (1992).
11. R. F. Voss, “Evolution of long-range fractal correlations and 1/f noise in DNA base sequences,” Physical Review Letters 68(25), 3805–3808 (1992).
12. B.-L. Hao, “Fractals from genomes–exact solutions of a biology-inspired problem,” Physica A 282(1–2), 225–246 (2000).
13. S. Buldyrev, N. Dokholyan, A. Goldberger, S. Havlin, C.-K. Peng, H. Stanley, and G. Viswanathan, “Analysis of DNA sequences using methods of statistical physics,” Physica A: Statistical Mechanics and its Applications 249(1–4), 430–438 (1998).
14. A. N. Kolmogorov, “Three approaches to the quantitative definition of information,” Problems of Information Transmission 1(1), 3–11 (1965).
15. J. Ziv, A. Lempel, “On the complexity of finite sequences,” IEEE Transactions on Information Theory 22(1), 75–81 (1976).
16. S. Grumbach, F. Tahi, “A new challenge for compression algorithms: genetic sequences,” Information Processing & Management 30(6), 875–885 (1994).
17. L. Allison, T. Edgoose, and T. I. Dix, “Compression of strings with approximate repeats,” Proceedings on Intelligent Systems for Molecular Biology, 8–16 (1998).
18. V. D. Gusev, L. A. Nemytikova, and N. A. Chuzhanova, “On the complexity measures of genetic sequences,” Bioinformatics 15(12), 994–999 (1999).
19. O. G. Troyanskaya, O. Arbell, Y. Koren, G. M. Landau, and A. Bolshoy, “Sequence complexity profiles of prokaryotic genomic sequences: a fast algorithm for calculating linguistic complexity,” Bioinformatics 18, 679–688 (2002).
20. G. Gordon, “Multi-dimensional linguistic complexity,” Journal of Biomolecular Structure & Dynamics 20(6), 747–750 (2003).
21. Yu. L. Orlov, R. Te Boekhorst, and I. I. Abnizova, “Statistical measures of the structure of genomic sequences: entropy, complexity, and position information,” Journal of Bioinformatics and Computational Biology 4(2), 523–536 (2006).
22. J. Riolo, A. J. Steckl, “Comparative analysis of genome code complexity and manufacturability with engineering benchmarks,” Scientific Reports 12(1), 2808 (2022).
23. M. Anisimova, J. P. Bielawski, and Z. Yang, “Accuracy and power of Bayes prediction of amino acid sites under positive selection,” Molecular Biology and Evolution 19(6), 950–958 (2002).
24. E. Rivas, S. R. Eddy, “Noncoding RNA gene detection using comparative sequence analysis,” BMC Bioinformatics 2, 1–19 (2001).
25. I. Abnizova, K. Walter, R. Te Boekhorst, G. Elgar, and W. R. Gilks, “Statistical information characterization of conserved non-coding elements in vertebrates,” Journal of Bioinformatics and Computational Biology 5(02b), 533–547 (2007).
26. S. R. Eddy, “A model of the statistical power of comparative genome sequence analysis,” PLoS Biology 3(1), e10 (2005).
27. D. G. Hwang, P. Green, “Bayesian Markov chain Monte Carlo sequence analysis reveals varying neutral substitution patterns in mammalian evolution,” Proceedings of the National Academy of Sciences 101(39), 13994–14001 (2004).
28. A. J. Pinho, S. P. Garcia, D. Pratas, and P. J. S. G. Ferreira, “DNA sequences at a glance,” PLoS ONE 8(11), e79922 (2013).
29. J. T. Machado, A. M. Lopes, “Multidimensional scaling and visualization of patterns in prime numbers,” Communications in Nonlinear Science and Numerical Simulation 83, 105128 (2020).
30. J. A. T. Machado, J. M. Rocha-Neves, F. Azevedo, and J. P. Andrade, “Advances in the computational analysis of SARS-COV2 genome,” Nonlinear Dynamics 106(2), 1525–1555 (2021).
31. E. Hamori, J. Ruskin, “H curves, a novel method of representation of nucleotide series especially suited for long DNA sequences,” Journal of Biological Chemistry 258(2), 1318–1327 (1983).
32. M. A. Gates, “A simple way to look at DNA,” Journal of Theoretical Biology 119(3), 319–328 (1986).
33. A. Nandy, “A new graphical representation and analysis of DNA sequence structure. I: methodology and application to globin genes,” Current Science 66, 309–314 (1994).
34. P. M. Leong, S. Morgenthaler, “Random walk and gap plots of DNA sequences,” Bioinformatics 11(5), 503–507 (1995).
35. M. Randic, M. Vracko, N. Lers, and O. Plavsic, “Novel 2-D graphical representation of DNA sequences and their numerical characterization,” Chemical Physics Letters 368(1–2), 1–6 (2003).
36. M. Randic, M. Vracko, N. Lers, and O. Plavsic, “On 3-D graphical representation of DNA primary sequence and their numerical characterization,” Journal of Chemical Information and Computer Sciences 40(5), 1235–1244 (2000).
37. Z. G. Yu, B. Wang, “A time series model of CDS sequences in complete genome,” Chaos, Solitons & Fractals 12(3), 519–526 (2001).
38. G. Xie, Z. Mo, “Three 3D graphical representations of DNA primary sequences based on the classifications of DNA bases and their applications,” Journal of Theoretical Biology 269(1), 123–130 (2011).
39. N. Jafarzadeh, A. Iranmanesh, “A novel graphical and numerical representation for analyzing DNA sequences based on codons,” Match-Communications in Mathematical and in Computer Chemistry 68(2), 611–620 (2012).
40. N. Jafarzadeh, A. Iranmanesh, “C-curve: A novel 3D graphical representation of DNA sequence based on codons,” Mathematical Biosciences 241(2), 217–224 (2013).
41. H. J. Jeffrey, “Chaos game representation of gene structure,” Nucleic Acids Research 18(8), 2163–2170 (1990).
42. P. K. Burma, A. Raj, J. K. Deb, and S. K. Brahmachari, “Genome analysis: a new approach for visualization of sequence organization in genomes,” Journal of Biosciences 17(4), 395–411 (1992).
43. M. A. Huynen, D. A. M. Konings, and P. Hogeweg, “Equal G and C contents in histone genes indicate selection pressures on mRNA secondary structure,” Journal of Molecular Evolution 34(4), 280–291 (1992).
44. K. A. Hill, N. J. Schisler, and S. M. Singh, “Chaos game representation of coding regions of human globin genes and alcohol dehydrogenase genes of phylogenetically divergent species,” Journal of Molecular Evolution 35(3), 261–269 (1992).
45. J. S. Almeida, J. A. Carrico, A. Maretzek, P. A. Noble, and M. Fletcher, “Analysis of genomic sequences by chaos game representation,” Bioinformatics 17(5), 429–437 (2001).
46. S. V. Korolev, V. G. Tumanyan, “Fractal dimensions of oligonucleotide compositions of DNA sequences,” Bioinformatics, Supercomputing and Complex Genome Analysis 635–638 (1993).
47. V. V. Solovyev, H. A. Lim, L. Milanesi, and C. Charles, “Application of fractal representation of genetic texts for recognition of genome functional and coding regions,” Bioinformatics, Supercomputing and Complex Genome Analysis 609–622 (1993).
48. P. J. Deschavanne, A. Giron, J. Vilain, G. Fagot, and B. Fertil, “Genomic signature: characterization and classification of species assessed by chaos game representation of sequences,” Molecular Biology and Evolution 16(10), 1391–1399 (1999).
49. Z. Sun, S. Pei, L. H. Rong, and S. S.-T. Yau, “A novel numerical representation for proteins: Three-dimensional chaos game representation and its extended natural vector,” Computational and Structural Biotechnology Journal 18, 1904–1913 (2020).
50. R. Touati, S. Haddad-Boubaker, I. Ferchichi, I. Messaoudi, A. E. Ouesleti, H. Triki, Z. Lachiri, and M. Kharrat, “Comparative genomic signature representations of the emerging COVID-19 coronavirus and other coronaviruses: High identity and possible recombination between bat and pangolin coronaviruses,” Genomics 112(6), 4189–4202 (2020).
51. D. C. Sengupta, M. D. Hill, K. R. Benton, and H. N. Banerjee, “Similarity studies of corona viruses through chaos game representation,” Computational Molecular Bioscience 10(3), 61 (2020).
52. G. S. Randhawa, M. P. M. Soltysiak, H. El Roz, C. P. E. de Souza, K. A. Hill, and L. Kari, “Machine learning using intrinsic genomic signatures for rapid classification of novel pathogens: COVID-19 case study,” PLoS ONE 15(4), e0232391 (2020).
53. T. Paul, S. Vainio, and J. Roning, “Detection of intra-family coronavirus genome sequences through graphical representation and artificial neural network,” Expert Systems With Applications 194, 116559 (2022).
54. E. M. Anitas, “Small-angle scattering and multifractal analysis of DNA sequences,” International Journal of Molecular Sciences 21(13), 4651 (2020).
55. E. M. Anitas, “Fractal analysis of DNA sequences using frequency chaos game representation and small-angle scattering,” International Journal of Molecular Sciences 23, 1847 (2022).
56. T. D. Schneider, R. M. Stephens, “Sequence logos: a new way to display consensus sequences,” Nucleic Acids Research 18(20), 6097–6100 (1990).
57. S. S. Ulyanov, O. V. Ulianova, S. S. Zaytsev, Y. V. Saltykov, and V. A. Feodorova, “Statistics on gene-based laser speckles with a small number of scatterers: implications for the detection of polymorphism in the Chlamydia trachomatis omp1 gene,” Laser Physics Letters 15(4), 045601 (2018).
58. J. W. Goodman, Statistical Optics, 2nd ed., John Wiley & Sons (2015).
59. D. A. Zimnyakov, M. V. Alonova, An. V. Skripal, S. S. Zaitsev, and V. A. Feodorova, “Polarization analysis of gene sequence structures: mapping of extreme local polarization states,” Journal of Biomedical Photonics & Engineering 8(4), 040322 (2022).
60. D. Zimnyakov, M. Alonova, An. Skripal, S. Dobdin, and V. Feodorova, “Quantification of the diversity in gene structures using the principles of polarization mapping,” Current Issues in Molecular Biology 45(2), 1720–1740 (2023).
61. “Official hCoV-19 Reference Sequence, hCoV-19/Italy/LOM_ASSTMonza_6000027610_20220207092700/2022,” GISAID (accessed on 10 April 2023). [https://www.epicov.org/epi3/frontend#c8999].
62. “Official hCoV-19 Reference Sequence, hCoV-19/England/MILK-1681E7D/2021,” GISAID (accessed on 10 April 2023). [https://www.epicov.org/epi3/frontend#6379b4].