Helper functions to compute GO enrichment tests using the Bioconductor R-packages GOstats and GSEABase
# Source this R script to install goEnrichment and all dependencies:
if (!require(devtools)) {
  install.packages("devtools")
  require(devtools)
}
packages <- c("GOstats", "GSEABase")
if (length(setdiff(packages, rownames(installed.packages()))) > 0) {
  source("http://bioconductor.org/biocLite.R")
  biocLite(setdiff(packages, rownames(installed.packages())))
}
install_github("asishallab/goEnrichment")
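On recent R and Bioconductor releases, biocLite has been superseded by the BiocManager package. A minimal sketch of the same installation using BiocManager follows. The helper names missing_pkgs and install_go_enrichment are hypothetical and not part of goEnrichment; nothing is installed until install_go_enrichment() is called.

```r
# Hypothetical helper: which of `pkgs` are not yet installed?
missing_pkgs <- function(pkgs, installed = rownames(installed.packages())) {
  setdiff(pkgs, installed)
}

# Hypothetical wrapper around the installation steps, assuming BiocManager
# is the appropriate installer for your R version.
install_go_enrichment <- function() {
  to.install <- missing_pkgs(c("GOstats", "GSEABase"))
  if (length(to.install) > 0) {
    if (!requireNamespace("BiocManager", quietly = TRUE)) {
      install.packages("BiocManager")
    }
    BiocManager::install(to.install)
  }
  if (!requireNamespace("devtools", quietly = TRUE)) {
    install.packages("devtools")
  }
  devtools::install_github("asishallab/goEnrichment")
}

# Run the installation explicitly when ready:
# install_go_enrichment()
```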
Copy the supplied example input file ./inst/input.R, edit the copy to your needs, and run it:
Rscript path/2/goEnrichment-package/exec/runGoEnrichment.R your_edited_input.R
If you want to test enrichment with genes from species other than Arabidopsis or C. hirsuta, you need to provide a custom background GeneSetCollection, which you can build in an interactive R shell. Note that you need a Gene Ontology (GO) annotation table of your background. The table is required to have three columns:
V1 must hold the GO term accessions, V2 the evidence codes, e.g. IEA, and V3 the gene identifiers (accessions).
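The three-column layout described above can be checked before building the GeneSetCollection. The following is a minimal sketch in base R. The helper check_goa_table is hypothetical and not part of goEnrichment, the gene identifiers in the toy table are invented for illustration, and the check assumes GO accessions of the form GO: followed by seven digits.

```r
# Hypothetical helper: returns a character vector describing problems found
# in a three-column GO annotation table (empty if none were found).
check_goa_table <- function(goa.tbl) {
  stopifnot(is.data.frame(goa.tbl), ncol(goa.tbl) == 3)
  problems <- character(0)
  # Column 1: GO term accessions, assumed to look like "GO:0008150"
  bad.go <- !grepl("^GO:[0-9]{7}$", goa.tbl[[1]])
  if (any(bad.go)) {
    problems <- c(problems, sprintf("%d malformed GO accession(s) in V1", sum(bad.go)))
  }
  # Column 2: evidence codes must not be empty
  if (any(is.na(goa.tbl[[2]]) | goa.tbl[[2]] == "")) {
    problems <- c(problems, "empty evidence code(s) in V2")
  }
  # Column 3: gene identifiers must not be empty
  if (any(is.na(goa.tbl[[3]]) | goa.tbl[[3]] == "")) {
    problems <- c(problems, "empty gene identifier(s) in V3")
  }
  problems
}

# Toy table in the expected tab-separated layout (gene IDs are invented)
example.tbl <- read.table(
  text = "GO:0008150\tIEA\tgeneA\nGO:0003674\tIEA\tgeneB\nGO:0005575\tIEA\tgeneA",
  sep = "\t", stringsAsFactors = FALSE
)
check_goa_table(example.tbl)
```

An empty result means none of these simple checks failed; it does not guarantee that the accessions exist in the current GO release.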
Having such a table, run the following code:
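require(GOstats)
require(GSEABase)
# Read the three-column GO annotation table (V1: GO term, V2: evidence code, V3: gene ID)
goa.tbl <- read.table(
  "path/2/go_annos_table.txt",
  stringsAsFactors = FALSE, sep = "\t"
)
# All annotated gene identifiers of the background
univ.go.annos <- sort(unique(goa.tbl$V3))
# GOFrame expects the annotation data.frame (GO ID, evidence code, gene ID),
# not a vector of gene IDs; replace "Homo sapiens" with your organism
goFrame <- GOFrame(goa.tbl[, c("V1", "V2", "V3")], organism = "Homo sapiens")
goAllFrame <- GOAllFrame(goFrame)
gsc <- GeneSetCollection(goAllFrame, setType = GOCollection())
# Now save both your GeneSetCollection and the GO annotation table in binary format:
save(gsc, file = "custom_gsc.RData")
save(goa.tbl, file = "custom_goa.RData")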
The two custom objects prepared above can subsequently be used in an adjusted input.R, as explained there.
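Before pointing input.R at the saved files, it can help to confirm that each .RData file restores the expected object name. The following is a minimal sketch in base R. The helper rdata_objects is hypothetical, and a small stand-in data frame is saved to a temporary file so that the snippet runs without the Bioconductor packages.

```r
# Hypothetical helper: load an .RData file into a fresh environment and
# return the names of the objects it contains.
rdata_objects <- function(path) {
  env <- new.env()
  load(path, envir = env)
  sort(ls(env))
}

# Stand-in for the real annotation table (values invented for illustration)
goa.tbl <- data.frame(
  V1 = "GO:0008150", V2 = "IEA", V3 = "geneA",
  stringsAsFactors = FALSE
)
tmp.file <- tempfile(fileext = ".RData")
save(goa.tbl, file = tmp.file)

# Lists the object names stored in the file
rdata_objects(tmp.file)
```

Applied to the files written by the code above, rdata_objects("custom_gsc.RData") should list gsc and rdata_objects("custom_goa.RData") should list goa.tbl, since save() stores objects under their variable names.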
If you want to test goEnrichment, you can run it with test_input.R. On a *nix-like system that would be:
Rscript path/2/goEnrichment/exec/runGoEnrichment.R path/2/goEnrichment/test_input.R