Graphical Modeling of Multiple Biological Pathways in Genomic Studies

Modern Statistical Methods for Health Research

Part of the book series: Emerging Topics in Statistics and Biostatistics (ETSB)

Abstract

Complex diseases are associated with a variety of genomic factors. Identifying such risk factors can help us better understand the pathogenesis of these diseases, which in turn may aid the development of prevention and intervention strategies in the form of personalized treatments. Disease-associated risk factors can be identified by genome-wide association studies (GWAS) or by detecting differentially expressed (DE) genes. Traditional single-marker or single-gene approaches lack adequate statistical power because of the stringent thresholds imposed by multiple-testing corrections. To address this problem, researchers have proposed pathway-based approaches, which leverage knowledge of gene–gene interactions to increase power when a large number of statistical tests are conducted. Many pathway-based approaches treat pathways as lists of genes and ignore the topological structure among the genes. Including the topological structure in the analysis allows specific gene–gene interactions to be modeled. Previous approaches of this kind consider only two-way interactions and incorporate only a single pathway in GWAS or DE analysis. However, genes participate in different biological processes and thus interact with each other simultaneously in different pathways. We therefore extend the approach by combining multiple biological pathways in genomic studies. In the proposed approaches, the topological structures of biological pathways are modeled by a Markov random field, a graphical model that represents the dependence structure in the dataset. A Bayesian framework is then constructed to combine the knowledge from the topological structures of biological pathways with the evidence from biological experiments. Inference about gene status, such as disease association status, can be made based on the marginal posterior probability obtained from the Bayesian analysis. The proposed approaches are evaluated with simulation studies and a lung cancer dataset. The results show that combining multiple biological pathways can enhance detection power and control the false positive rate.
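
The abstract does not give the model's equations, so the following is only a rough sketch of the general idea, not the chapter's method. It assumes a binary status per gene, an Ising-type Markov random field prior with one hypothetical weight per pathway, z-score evidence that is N(0, 1) for unassociated genes and N(mu, 1) for associated ones, and a plain Gibbs sampler to estimate the marginal posterior probabilities. The function name, every parameter, and all numeric values are illustrative assumptions.

```python
import math
import random


def gibbs_marginals(z, pathways, h=-1.0, betas=None, mu=2.0,
                    n_iter=2000, burn_in=500, seed=0):
    """Estimate P(s_i = 1 | z) by Gibbs sampling (illustrative sketch).

    z        : list of per-gene z-scores (assumed evidence model).
    pathways : list of edge lists; each edge is a pair of gene indices.
    h        : hypothetical baseline log-odds of a gene being associated.
    betas    : hypothetical interaction weight for each pathway.
    mu       : assumed mean z-score of an associated gene.
    """
    rng = random.Random(seed)
    n = len(z)
    if betas is None:
        betas = [0.5] * len(pathways)

    # One neighbor table per pathway, so each pathway keeps its own weight.
    neighbors = []
    for edges in pathways:
        table = [[] for _ in range(n)]
        for i, j in edges:
            table[i].append(j)
            table[j].append(i)
        neighbors.append(table)

    state = [0] * n
    counts = [0] * n
    for it in range(n_iter):
        for i in range(n):
            # Log-likelihood ratio of N(mu, 1) against N(0, 1) at z[i].
            log_odds = h + mu * z[i] - 0.5 * mu * mu
            for beta, table in zip(betas, neighbors):
                # A neighbor in state 1 pulls gene i toward 1, a neighbor in state 0 toward 0.
                log_odds += beta * sum(1 if state[j] == 1 else -1 for j in table[i])
            # Clamp to keep math.exp within floating-point range.
            log_odds = max(-50.0, min(50.0, log_odds))
            p_one = 1.0 / (1.0 + math.exp(-log_odds))
            state[i] = 1 if rng.random() < p_one else 0
        if it >= burn_in:
            for i in range(n):
                counts[i] += state[i]

    kept = n_iter - burn_in
    return [c / kept for c in counts]


# Toy usage: four genes, two pathways that share gene 1.
z_scores = [3.0, 1.5, 0.1, -0.2]
toy_pathways = [[(0, 1)], [(1, 2)]]
posterior = gibbs_marginals(z_scores, toy_pathways)
print(posterior)
```

Keeping a separate neighbor table for each pathway lets a gene that appears in several pathways receive a contribution from each of them, which is the multi-pathway idea the abstract describes; how the chapter actually weights or combines pathways is not stated here.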

Yujing Cao and Yu Zhang contributed equally.

Acknowledgements

This work was partially supported by the National Institutes of Health [grant R15GM131390 to X.W.].

Corresponding author

Correspondence to Min Chen.

Copyright information

© 2021 The Author(s), under exclusive license to Springer Nature Switzerland AG

Cite this chapter

Cao, Y., Zhang, Y., Wang, X., Chen, M. (2021). Graphical Modeling of Multiple Biological Pathways in Genomic Studies. In: Zhao, Y., Chen, D.-G. (eds) Modern Statistical Methods for Health Research. Emerging Topics in Statistics and Biostatistics. Springer, Cham. https://doi.org/10.1007/978-3-030-72437-5_19
