gene_df |
Data object containing the genes
(see gene_input for options on how
the genes can be stored within the object).
Can be one of the following formats:
matrix : A sparse or dense matrix.
data.frame : A data.frame,
data.table, or tibble.
list : A list or character vector.
Genes, transcripts, proteins, SNPs, or genomic ranges
can be provided in any format
(HGNC, Ensembl, RefSeq, UniProt, etc.) and will be
automatically converted to gene symbols unless
specified otherwise with the ... arguments.
Note: If you set method="homologene", you
must either supply genes in gene symbol format (e.g. "Sox2")
or set standardise_genes=TRUE.
|
gene_input |
Which aspect of gene_df to
get gene names from:
"rownames" : From row names of data.frame/matrix.
"colnames" : From column names of data.frame/matrix.
<column name> : From a column in gene_df,
e.g. "gene_names".
|
gene_output |
How to return genes.
Options include:
"rownames" : As row names of gene_df.
"colnames" : As column names of gene_df.
"columns" : As new columns "input_gene", "ortholog_gene"
(and "input_gene_standard" if standardise_genes=TRUE)
in gene_df.
"dict" : As a dictionary (named list) where the names
are input_gene and the values are ortholog_gene.
"dict_rev" : As a reversed dictionary (named list)
where the names are ortholog_gene and the values are input_gene.
|
standardise_genes |
If TRUE and
gene_output="columns", a new column "input_gene_standard"
will be added to gene_df containing standardised HGNC symbols
identified by gorth.
|
input_species |
Name of the input species (e.g. "mouse", "fly").
Use map_species to return a full list
of available species.
|
output_species |
Name of the output species (e.g. "human", "chicken").
Use map_species to return a full list
of available species.
|
method |
R package to use for gene mapping:
"gprofiler" : Slower but more species and genes.
"homologene" : Faster but fewer species and genes.
"babelgene" : Faster but fewer species and genes.
Also gives consensus scores for each gene mapping based on
several different data sources.
|
drop_nonorths |
Drop genes that don't have an ortholog
in the output_species.
|
non121_strategy |
How to handle genes that don't have
1:1 mappings between input_species and output_species.
Options include:
"drop_both_species" or "dbs" or 1 :
Drop genes that have duplicate
mappings in either the input_species or output_species
(DEFAULT).
"drop_input_species" or "dis" or 2 :
Only drop genes that have duplicate
mappings in the input_species.
"drop_output_species" or "dos" or 3 :
Only drop genes that have duplicate
mappings in the output_species.
"keep_both_species" or "kbs" or 4 :
Keep all genes regardless of whether
they have duplicate mappings in either species.
"keep_popular" or "kp" or 5 :
Return only the most "popular" interspecies ortholog mappings.
This procedure tends to yield a greater number of returned genes
but at the cost of many of them not being true biological 1:1 orthologs.
"sum","mean","median","min" or "max" :
When gene_df is a matrix and gene_output="rownames",
these options will aggregate many-to-one gene mappings
(input_species-to-output_species)
after dropping any duplicate genes in the output_species.
|
mthreshold |
Maximum number of ortholog names per gene to show.
Passed to gorth.
Only used when method="gprofiler" (DEFAULT: Inf).
|
as_sparse |
Convert gene_df to a sparse matrix.
Only works if gene_df is one of the following classes:
matrix
Matrix
data.frame
data.table
tibble
If gene_df is a sparse matrix to begin with,
it will be returned as a sparse matrix
(so long as gene_output="rownames" or "colnames").
|
sort_rows |
Sort gene_df rows alphanumerically.
|
verbose |
Print messages.
|
... |
Additional arguments to be passed to
gorth or homologene.
NOTE: To return only the most "popular"
interspecies ortholog mappings,
supply mthreshold=1 here AND set method="gprofiler" above.
This procedure tends to yield a greater number of returned genes but at
the cost of many of them not being true biological 1:1 orthologs.
For more details, please see
here.
|
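
To make the non121_strategy options more concrete, the following is a minimal base R sketch of how the four drop/keep strategies could filter a two-column ortholog map. It is an illustration under stated assumptions, not the package's implementation: the gene names are invented, filter_non121 is a hypothetical helper, and "duplicate mappings in the input_species" is assumed to mean an input gene that appears in more than one row (likewise for the output species).

```r
# Toy ortholog map with invented gene names (for illustration only).
# Bbb1 and Bbb2 both map to BBB1 (many-to-one);
# Ccc1 maps to both CCC1 and CCC2 (one-to-many).
orth_map <- data.frame(
  input_gene    = c("Aaa1", "Bbb1", "Bbb2", "Ccc1", "Ccc1", "Ddd1"),
  ortholog_gene = c("AAA1", "BBB1", "BBB1", "CCC1", "CCC2", "DDD1"),
  stringsAsFactors = FALSE
)

# Hypothetical helper (not part of any package): applies one of the
# documented strategy names to a mapping table.
filter_non121 <- function(map, strategy = "drop_both_species") {
  # Rows whose input gene occurs more than once in the map.
  dup_in <- map$input_gene %in%
    map$input_gene[duplicated(map$input_gene)]
  # Rows whose ortholog occurs more than once in the map.
  dup_out <- map$ortholog_gene %in%
    map$ortholog_gene[duplicated(map$ortholog_gene)]

  keep <- switch(strategy,
    drop_both_species   = !dup_in & !dup_out,
    drop_input_species  = !dup_in,
    drop_output_species = !dup_out,
    keep_both_species   = rep(TRUE, nrow(map)),
    stop("Unsupported strategy: ", strategy)
  )
  map[keep, , drop = FALSE]
}

# Strict 1:1 orthologs only.
one2one <- filter_non121(orth_map, "drop_both_species")
print(one2one)
```

Under these assumptions, "drop_both_species" keeps only the Aaa1 and Ddd1 rows, while "drop_input_species" and "drop_output_species" each keep four rows and "keep_both_species" keeps all six. The "keep_popular" and aggregation options ("sum", "mean", and so on) are not covered by this sketch.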