Siberian Journal of Clinical and Experimental Medicine

An algorithm for assessing the pathogenicity of genetic mutations in tumor based on a retrospective study of pathogenic and neutral genetic variants

https://doi.org/10.29001/2073-8552-2025-40-1-226-234

Abstract

Introduction. Cancer accounts for approximately 16.8% of all deaths and 22.8% of deaths from noncommunicable diseases. The diagnostic, prognostic, and therapeutic aspects of patient management depend largely on the mutations that drive the oncogenic process. However, evaluating the clinical significance of a variant remains a major challenge, and many variants end up classified as variants of unknown significance (VUS).

Aim. To create a new algorithm for classifying missense variants.

Material and Methods. Data from the NCBI Assembly, UniProt, gnomAD, and OncoKB databases were processed with Python 3 to assess the oncogenicity and population frequency of missense variants, as well as their occurrence in orthologous sequences. For the training and testing datasets, we selected 314 known benign polymorphisms and 332 reported pathogenic mutations of the BRCA1, BRCA2, DICER1, PIK3CA, and TP53 genes from the ClinVar database.
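The ortholog-occurrence check mentioned above can be illustrated with a minimal Python sketch. It is not the authors' implementation: the function names, the gap character "-", the toy alignment, and the reading of "appropriately aligned" as "no gaps at the variant column and its two flanking columns" are all assumptions made for illustration.

```python
from typing import Optional, Sequence


def alignment_column(aligned_human: str, protein_pos: int) -> int:
    """Map a 1-based position in the ungapped human protein to a 0-based alignment column."""
    seen = 0
    for col, residue in enumerate(aligned_human):
        if residue != "-":
            seen += 1
            if seen == protein_pos:
                return col
    raise ValueError("position lies outside the human sequence")


def occurs_in_orthologs(
    aligned_human: str,
    aligned_orthologs: Sequence[str],
    protein_pos: int,
    alt_residue: str,
) -> Optional[bool]:
    """Return True/False if the variant residue is seen in an ortholog, None if not assessable.

    Assumption: a column is usable only when it and both neighbouring columns
    are free of gaps, in the human row and in the ortholog row being checked.
    """
    col = alignment_column(aligned_human, protein_pos)
    window = range(col - 1, col + 2)
    if window.start < 0 or window.stop > len(aligned_human):
        return None  # no flanking column on one side
    if any(aligned_human[i] == "-" for i in window):
        return None  # human row is gapped next to the variant
    usable = [s for s in aligned_orthologs if all(s[i] != "-" for i in window)]
    if not usable:
        return None  # no ortholog is cleanly aligned here
    return any(s[col] == alt_residue for s in usable)


# Toy alignment (hypothetical sequences, not real orthologs).
human = "MKT-AYL"
orthologs = ["MKTQAYL", "MRT-SYL", "MK--AYL"]
print(occurs_in_orthologs(human, orthologs, 2, "R"))  # K2R: R is present in one ortholog
print(occurs_in_orthologs(human, orthologs, 4, "S"))  # flanking gap in the human row
```

Returning None for positions that cannot be assessed keeps "not observed in orthologs" distinct from "no usable alignment", which matters because the algorithm is stated to apply only to appropriately aligned positions.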

Results. We developed an algorithm that assesses the potential pathogenicity of a variant using three criteria: its oncogenicity, its population frequency, and its occurrence in orthologous sequences. A variant was classified as neutral if both of the following were true: a) the variant does not meet the criterion for oncogenicity; b) the variant meets at least one of the two remaining criteria. All other variants were deemed pathogenic. The new algorithm demonstrated high sensitivity (94.95% (88.61%, 98.34%)) and specificity (96.52% (91.33%, 99.04%)) in classifying benign and pathogenic variants. The algorithm requires that the position of a variant be represented in population databases and that this position, together with its two adjacent positions, correspond to an appropriately aligned region in a multiple sequence alignment of orthologs.
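The decision rule described above can be written as a short Python sketch. The abstract does not say how each criterion is evaluated, so the three criteria are taken here as precomputed booleans. The class and function names are hypothetical, and the evaluation helper treats "pathogenic" as the positive class as an assumption.

```python
from dataclasses import dataclass
from typing import Sequence, Tuple


@dataclass(frozen=True)
class CriteriaResult:
    """Outcome of the three criteria for one missense variant (hypothetical container)."""
    oncogenicity: bool          # True if the variant meets the oncogenicity criterion
    population_frequency: bool  # True if it meets the population-frequency criterion
    ortholog_occurrence: bool   # True if it meets the ortholog-occurrence criterion


def classify(result: CriteriaResult) -> str:
    """Neutral only if oncogenicity is not met AND at least one other criterion is met."""
    neutral = (not result.oncogenicity) and (
        result.population_frequency or result.ortholog_occurrence
    )
    return "neutral" if neutral else "pathogenic"


def sensitivity_specificity(
    predicted: Sequence[str], actual: Sequence[str]
) -> Tuple[float, float]:
    """Sensitivity and specificity with 'pathogenic' as the positive class.

    Assumes both classes are present in `actual`; otherwise a division by zero occurs.
    """
    pairs = list(zip(predicted, actual))
    tp = sum(p == "pathogenic" and a == "pathogenic" for p, a in pairs)
    fn = sum(p == "neutral" and a == "pathogenic" for p, a in pairs)
    tn = sum(p == "neutral" and a == "neutral" for p, a in pairs)
    fp = sum(p == "pathogenic" and a == "neutral" for p, a in pairs)
    return tp / (tp + fn), tn / (tn + fp)


# Illustrative inputs only; these are not variants from the study.
print(classify(CriteriaResult(False, True, False)))   # neutral
print(classify(CriteriaResult(False, False, False)))  # pathogenic: no criterion met
print(classify(CriteriaResult(True, True, True)))     # pathogenic: oncogenicity met
```

Under this rule a variant that meets none of the criteria falls into the pathogenic class by default, so the classifier errs toward flagging variants and does not assume they are benign.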

Conclusion. The algorithm might also be used to evaluate variants in other cancer-associated genes, which could make the classification of genetic variants more precise and strengthen molecular diagnostics.

About the Authors

D. S. Bug
Pavlov First Saint Petersburg State Medical University (Pavlov University)
Russian Federation

Dmitrii S. Bug, Junior Research Scientist, Bioinformatics Research Center of Scientific Educational Institute of Biomedicine

6-8, L’va Tolstogo str., Saint Petersburg, 197022



A. N. Narkevich
V.F. Voino-Yasenetsky Krasnoyarsk State Medical University
Russian Federation

Artem N. Narkevich, Dr. Sci. (Med.), Associate Professor, Dean, Prof.

1, Partizana Zheleznyaka str., Krasnoyarsk, 660022



A. V. Tishkov
Pavlov First Saint Petersburg State Medical University (Pavlov University)
Russian Federation

Artem V. Tishkov, Cand. Sci. (Phys.-Math.), Head of the Physics, Mathematics, and Informatics Department

6-8, L’va Tolstogo str., Saint Petersburg, 197022



N. V. Petukhova
Pavlov First Saint Petersburg State Medical University (Pavlov University)
Russian Federation

Natalia V. Petukhova, Cand. Sci. (Biol.), Head of the Bioinformatics Research Center of Scientific Educational Institute of Biomedicine

6-8, L’va Tolstogo str., Saint Petersburg, 197022



For citations:


Bug D.S., Narkevich A.N., Tishkov A.V., Petukhova N.V. An algorithm for assessing the pathogenicity of genetic mutations in tumor based on a retrospective study of pathogenic and neutral genetic variants. Siberian Journal of Clinical and Experimental Medicine. 2025;40(1):226-234. https://doi.org/10.29001/2073-8552-2025-40-1-226-234


Creative Commons License
This work is licensed under a Creative Commons Attribution 4.0 License.


ISSN 2713-2927 (Print)
ISSN 2713-265X (Online)