get_anno_genes {GOfuncR} | R Documentation |
Given a vector of GO-IDs, e.g. c('GO:0072025','GO:0072221'), this function returns all genes that are annotated to those GO-categories. This includes genes that are annotated to any of the child nodes of a GO-category.
get_anno_genes(go_ids, database = 'Homo.sapiens', genes = NULL, annotations = NULL, term_df = NULL, graph_path_df = NULL, godir = NULL)
go_ids |
character() vector of GO-IDs, e.g. c('GO:0051082', 'GO:0042254'). |
database |
optional character() defining an OrganismDb or OrgDb annotation package from Bioconductor, like 'Mus.musculus' (mouse) or 'org.Pt.eg.db' (chimp). |
genes |
optional character() vector of gene-symbols. If defined, only annotations of those genes are returned. |
annotations |
optional data.frame() with two character() columns: gene-symbols and GO-categories. Alternative to 'database'. |
term_df |
optional data.frame() with an ontology 'term' table. Alternative to the default integrated GO-graph or godir. |
graph_path_df |
optional data.frame() with an ontology 'graph_path' table. Alternative to the default integrated GO-graph or godir. |
godir |
optional character() specifying a directory that contains the ontology tables 'term.txt' and 'graph_path.txt'. Alternative to the default integrated GO-graph or term_df + graph_path_df. |
Besides the default 'Homo.sapiens', other OrganismDb or OrgDb packages from Bioconductor, such as 'Mus.musculus' (mouse) or 'org.Pt.eg.db' (chimp), can be used. It is also possible to provide a data.frame() with annotations directly; it is then searched for the input GO-categories and their child nodes.
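A minimal sketch of the 'annotations' alternative, assuming GOfuncR is installed. The gene symbols and their GO assignments below are invented for illustration, and the column names 'gene' and 'go_id' are an arbitrary choice; the argument description only fixes the column order (genes first, GO-categories second) and type (character).

```r
# Toy annotation table: first column gene symbols, second column GO-IDs.
# These assignments are made up and carry no biological meaning.
custom_anno <- data.frame(
  gene  = c('G1', 'G2', 'G2', 'G3'),
  go_id = c('GO:0072025', 'GO:0072221', 'GO:0072025', 'GO:0005634'),
  stringsAsFactors = FALSE  # keep both columns as character()
)

# Run the lookup only if GOfuncR is available on this system.
if (requireNamespace('GOfuncR', quietly = TRUE)) {
  res <- GOfuncR::get_anno_genes(
    go_ids = c('GO:0072025', 'GO:0072221'),
    annotations = custom_anno
  )
  print(res)
}
```

Because 'annotations' replaces 'database', no annotation package should need to be loaded for this call; child nodes are still resolved through the GO-graph.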
By default the package's integrated GO-graph is used to find child nodes, but a custom ontology can be defined, too. For details on how to use a custom ontology with term_df + graph_path_df or godir, please refer to the package's vignette. The advantage of term_df + graph_path_df over godir is that the latter reads the files 'term.txt' and 'graph_path.txt' from disk and therefore takes longer.
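To illustrate what "including child nodes" means, here is a toy base-R sketch. It is not GOfuncR's implementation and does not use the real 'graph_path' schema: the tables, the column names 'ancestor' and 'descendant', the term IDs and the helper toy_anno_genes() are all hypothetical.

```r
# Hypothetical closure table for the chain T:1 > T:2 > T:3: one row per
# (ancestor, descendant) pair, including each term paired with itself.
toy_paths <- data.frame(
  ancestor   = c('T:1', 'T:1', 'T:1', 'T:2', 'T:2', 'T:3'),
  descendant = c('T:1', 'T:2', 'T:3', 'T:2', 'T:3', 'T:3'),
  stringsAsFactors = FALSE
)

# Hypothetical direct annotations: each gene is annotated to one term.
toy_anno <- data.frame(
  gene  = c('A', 'B', 'C'),
  go_id = c('T:1', 'T:2', 'T:3'),
  stringsAsFactors = FALSE
)

toy_anno_genes <- function(ids, paths, anno) {
  # Expand each query term to itself plus all of its descendants.
  expanded <- paths[paths$ancestor %in% ids, ]
  # Attach the genes that are directly annotated to those descendants.
  hits <- merge(expanded, anno, by.x = 'descendant', by.y = 'go_id')
  # Report genes under the query term, not under the descendant.
  out <- unique(hits[, c('ancestor', 'gene')])
  names(out) <- c('go_id', 'gene')
  # Order by term and gene, like the documented return value.
  out <- out[order(out$go_id, out$gene), ]
  rownames(out) <- NULL
  out
}

res <- toy_anno_genes(c('T:2', 'T:3'), toy_paths, toy_anno)
print(res)
```

In this toy data gene 'C' is annotated only to 'T:3', yet it is also reported for 'T:2' because 'T:3' is a descendant of 'T:2'.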
A data.frame() with two columns: GO-IDs (character()) and the annotated genes (character()). The output is ordered by GO-ID and gene-symbol.
Steffi Grote
[1] Ashburner, M. et al. (2000). Gene Ontology: tool for the unification of biology. Nature Genetics 25, 25-29.
get_anno_categories
get_ids
get_names
get_child_nodes
get_parent_nodes
## find all genes that are annotated to GO:0000109
## ("nucleotide-excision repair complex")
get_anno_genes(go_ids='GO:0000109')

## find out which genes from a set of genes
## are annotated to some GO-categories
genes = c('AGTR1', 'ANO1', 'CALB1', 'GYG1', 'PAX2')
gos = c('GO:0001558', 'GO:0005536', 'GO:0072205', 'GO:0006821')
anno_genes = get_anno_genes(go_ids=gos, genes=genes)
# add the names and domains of the GO-categories
cbind(anno_genes, get_names(anno_genes$go_id)[,2:3])

## find all annotations to GO-categories containing 'serotonin receptor'
sero_ids = get_ids('serotonin receptor')
sero_anno = get_anno_genes(go_ids=sero_ids$go_id)
# merge with names of GO-categories
head(merge(sero_ids, sero_anno))