Predicting HIV drug resistance using weighted machine learning method at target protein sequence-level

  • Original Article
  • Molecular Diversity

Abstract

Acquired immune deficiency syndrome (AIDS) is a fatal disease caused by human immunodeficiency virus (HIV). Although 23 different drugs are available, the treatment of AIDS remains challenging because the virus mutates very quickly, which can lead to drug resistance. Predicting drug resistance before treatment is therefore crucial for individualized therapy. Here, based on HIV target protein sequence information, we analyzed resistance to 21 drugs caused by mutated residues using machine learning (ML) methods. To transform target sequences into numeric vectors, seven physicochemical properties were used, which represent the interaction characteristics of the target proteins well. Principal component analysis (PCA) was then adopted to reduce the feature dimensionality. Random forest (RF) and support vector machine (SVM) models with three different kernel functions (linear, polynomial and radial basis function (RBF)) were employed. By comparison, we found that the RBF-based SVM performs comparably to the RF model. Further, we added weight information to the RBF-based SVM using four different weight evaluation methods: RF, eXtreme Gradient Boosting (XGB), CfsSubsetEval and ReliefFAttributeEval. Results show that the RF-weighted RBF-based SVM yields the best performance: 13 of the 21 drug models give correlation coefficients (R²) over 0.8, and 3 of them exceed 0.9. Finally, position-specific importance analysis indicates that most of the mutated residues with high RF weight scores have been shown in previous reports to be closely related to drug resistance. Overall, we expect that this method can serve as a supplementary tool for predicting HIV drug resistance for newly discovered mutations.
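The sequence-to-vector step described in the abstract can be illustrated with a minimal sketch. The abstract does not name the seven physicochemical properties, so this example uses a single stand-in scale, the Kyte-Doolittle hydropathy index, purely as an assumption. The function names and the toy fragments are hypothetical and are not taken from the paper; PCA and model fitting are not shown.

```python
# Kyte-Doolittle hydropathy index, used here as ONE illustrative property scale.
# Assumption: the paper's seven properties are not listed in the abstract.
KD_HYDROPATHY = {
    "A": 1.8, "R": -4.5, "N": -3.5, "D": -3.5, "C": 2.5,
    "Q": -3.5, "E": -3.5, "G": -0.4, "H": -3.2, "I": 4.5,
    "L": 3.8, "K": -3.9, "M": 1.9, "F": 2.8, "P": -1.6,
    "S": -0.8, "T": -0.7, "W": -0.9, "Y": -1.3, "V": 4.2,
}


def encode_sequence(seq, scale=KD_HYDROPATHY):
    """Map each residue to its property value (KeyError on unknown residues)."""
    return [scale[aa] for aa in seq.upper()]


def standardize_columns(rows, eps=1e-12):
    """Z-score each position (column) across sequences.

    Positions that do not vary across the sequences are set to 0.0.
    """
    if len({len(r) for r in rows}) != 1:
        raise ValueError("all sequences must have the same length")
    n = len(rows)
    cols = list(zip(*rows))
    means = [sum(c) / n for c in cols]
    stds = [(sum((v - m) ** 2 for v in c) / n) ** 0.5
            for c, m in zip(cols, means)]
    return [
        [(v - m) / s if s > eps else 0.0 for v, m, s in zip(row, means, stds)]
        for row in rows
    ]


# Hypothetical toy fragments of equal length (not real HIV sequences).
sequences = ["PQITLW", "PQVTLW", "PQITMW"]
features = standardize_columns([encode_sequence(s) for s in sequences])
```

With several property scales, the per-residue values would simply be concatenated into a longer vector before any dimensionality reduction.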

Graphic abstract

Based on HIV target protein sequence information, we analyzed resistance to 21 drugs caused by mutated residues, using machine learning (ML) methods that incorporate weight information for the different mutation positions.
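The text here does not say how the weights enter the SVM. One common way to fuse per-feature weights into an RBF kernel is to scale each squared difference by its weight, which is sketched below as an assumption rather than as the authors' method. The importance scores are hypothetical placeholders for values that a random forest might produce.

```python
import math


def normalize_weights(importances):
    """Scale non-negative importance scores so that they sum to 1."""
    if any(v < 0 for v in importances):
        raise ValueError("importances must be non-negative")
    total = sum(importances)
    if total <= 0:
        raise ValueError("at least one importance must be positive")
    return [v / total for v in importances]


def weighted_rbf_kernel(x, y, weights, gamma=1.0):
    """Compute exp(-gamma * sum_j w_j * (x_j - y_j)^2)."""
    if not (len(x) == len(y) == len(weights)):
        raise ValueError("x, y and weights must have the same length")
    d2 = sum(w * (a - b) ** 2 for w, a, b in zip(weights, x, y))
    return math.exp(-gamma * d2)


# Hypothetical importance scores for three features (not from the paper).
weights = normalize_weights([3.0, 1.0, 0.0])

x = [0.0, 0.0, 0.0]
y = [1.0, 0.0, 0.0]  # differs from x in a heavily weighted feature
z = [0.0, 0.0, 1.0]  # differs from x only in a zero-weight feature

k_xy = weighted_rbf_kernel(x, y, weights)
k_xz = weighted_rbf_kernel(x, z, weights)
```

In this sketch a difference at a zero-weight feature leaves the kernel value at 1, so such features no longer influence similarity, while differences at heavily weighted features reduce it most.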



Funding

This research did not receive any specific grant from funding agencies in the public, commercial, or not-for-profit sectors.

Author information

Corresponding author

Correspondence to Yanzhi Guo.

Ethics declarations

Conflict of interests

The authors declare no competing financial interests.

Additional information

Publisher's Note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Supplementary Information

Below is the link to the electronic supplementary material.

Supplementary file 1 (DOC 108 KB)

About this article

Cite this article

Cai, Q., Yuan, R., He, J. et al. Predicting HIV drug resistance using weighted machine learning method at target protein sequence-level. Mol Divers 25, 1541–1551 (2021). https://doi.org/10.1007/s11030-021-10262-y

  • DOI: https://doi.org/10.1007/s11030-021-10262-y

Keywords