This package provides long descriptions of genes collected from the RefSeq database. For each RefSeq record, the text in the “COMMENT” section that starts with “Summary:” is extracted as the description of the gene, as in the following example record:
LOCUS       NM_012363                936 bp    mRNA    linear   PRI 12-FEB-2021
DEFINITION  Homo sapiens olfactory receptor family 1 subfamily N member 1
            (OR1N1), mRNA.
ACCESSION   NM_012363 XM_071152
VERSION     NM_012363.1
KEYWORDS    RefSeq; MANE Select.
SOURCE      Homo sapiens (human)
  ORGANISM  Homo sapiens
            Eukaryota; Metazoa; Chordata; Craniata; Vertebrata; Euteleostomi;
            Mammalia; Eutheria; Euarchontoglires; Primates; Haplorrhini;
            Catarrhini; Hominidae; Homo.
REFERENCE   1  (bases 1 to 936)
  AUTHORS   Malnic B, Godfrey PA and Buck LB.
  TITLE     The human olfactory receptor gene family
  JOURNAL   Proc Natl Acad Sci U S A 101 (8), 2584-2589 (2004)
   PUBMED   14983052
  REMARK    Erratum:[Proc Natl Acad Sci U S A. 2004 May 4;101(18):7205]
REFERENCE   2  (bases 1 to 936)
  AUTHORS   Fuchs T, Malecova B, Linhart C, Sharan R, Khen M, Herwig R,
            Shmulevich D, Elkon R, Steinfath M, O'Brien JK, Radelof U, Lehrach
            H, Lancet D and Shamir R.
  TITLE     DEFOG: a practical scheme for deciphering families of genes
  JOURNAL   Genomics 80 (3), 295-302 (2002)
   PUBMED   12213199
REFERENCE   3  (bases 1 to 936)
  AUTHORS   Rouquier S, Taviaux S, Trask BJ, Brand-Arpon V, van den Engh G,
            Demaille J and Giorgi D.
  TITLE     Distribution of olfactory receptor genes in the human genome
  JOURNAL   Nat Genet 18 (3), 243-250 (1998)
   PUBMED   9500546
  REMARK    Erratum:[Nat Genet 1998 May;19(1):102]
COMMENT     REVIEWED REFSEQ: This record has been curated by NCBI staff. The
            reference sequence was derived from AL359636.17.
            On Apr 5, 2004 this sequence version replaced XM_071152.1.
            Summary: Olfactory receptors interact with odorant molecules in the
            nose, to initiate a neuronal response that triggers the perception
            of a smell. The olfactory receptor proteins are members of a large
            family of G-protein-coupled receptors (GPCR) arising from single
            coding-exon genes. Olfactory receptors share a 7-transmembrane
            domain structure with many neurotransmitter and hormone receptors
            and are responsible for the recognition and G protein-mediated
            transduction of odorant signals. The olfactory receptor gene family
            is the largest in the genome. The nomenclature assigned to the
            olfactory receptor genes and proteins for this organism is
            independent of other organisms. [provided by RefSeq, Jul 2008].
            ##RefSeq-Attributes-START##
            MANE Ensembl match :: ENST00000304880.2/ ENSP00000306974.2
            RefSeq Select criteria :: based on single protein-coding transcript
            ##RefSeq-Attributes-END##
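The extraction rule described above can be sketched in a few lines of base R. This is only an illustration under stated assumptions, not the package's actual parser: the helper name extract_summary() is hypothetical, the COMMENT text is assumed to be available as a single string, and the summary is assumed to end at the trailing "[provided by RefSeq, ...]" tag (or, failing that, at the "##" attributes block).

```r
# Hypothetical helper (not part of the package): return the text that follows
# "Summary:" in a COMMENT block, without the trailing "[provided by ...]" tag.
extract_summary = function(comment) {
    # collapse line breaks and runs of whitespace into single spaces
    comment = gsub("\\s+", " ", comment)
    # records without a "Summary:" paragraph have no description
    if (!grepl("Summary:", comment, fixed = TRUE)) return(NA_character_)
    # keep everything after the first "Summary:"
    txt = sub("^.*?Summary:\\s*", "", comment, perl = TRUE)
    # cut at the provenance tag if present ...
    txt = sub("\\s*\\[provided by [^]]*\\].*$", "", txt, perl = TRUE)
    # ... and at the attributes block in case the tag is missing
    txt = sub("\\s*##.*$", "", txt, perl = TRUE)
    trimws(txt)
}

# shortened COMMENT text modelled on the record above
comment = "REVIEWED REFSEQ: This record has been curated by NCBI staff.
Summary: Olfactory receptors interact with odorant molecules in the
nose, to initiate a neuronal response that triggers the perception
of a smell. [provided by RefSeq, Jul 2008].
##RefSeq-Attributes-START##"

summary_text = extract_summary(comment)
summary_text
```

Whitespace is normalised first so that the hard line wrapping of the flat file does not end up inside the extracted description.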
The function loadGeneSummary() returns the gene summary table. Setting the organism argument to either the full organism name or the corresponding taxon ID returns a table of genes and their summaries for that organism:
## Gene summaries were retrieved from RefSeq database release 220 (December 21, 2022).
tb = loadGeneSummary(organism = 9606)
# or use the full organism name
# tb = loadGeneSummary(organism = "Homo sapiens")
dim(tb)
## [1] 73575 6
head(tb)
## RefSeq_accession Organism Taxon_ID Gene_ID Review_status
## 1 NR_030309.1 Homo sapiens 9606 693168 PROVISIONAL REFSEQ
## 2 NM_001353788.2 Homo sapiens 9606 321 REVIEWED REFSEQ
## 3 NM_001004748.1 Homo sapiens 9606 401667 PROVISIONAL REFSEQ
## 4 NM_001370511.1 Homo sapiens 9606 5723 REVIEWED REFSEQ
## 5 NM_206891.3 Homo sapiens 9606 23181 REVIEWED REFSEQ
## 6 NM_130846.3 Homo sapiens 9606 5801 REVIEWED REFSEQ
## Gene_summary
## 1 microRNAs (miRNAs) are short (20-24 nt) non-coding RNAs that are involved in post-transcriptional regulation of gene expression in multicellular organisms by affecting both the stability and translation of mRNAs. miRNAs are transcribed by RNA polymerase II as part of capped and polyadenylated primary transcripts (pri-miRNAs) that can be either protein-coding or non-coding. The primary transcript is cleaved by the Drosha ribonuclease III enzyme to produce an approximately 70-nt stem-loop precursor miRNA (pre-miRNA), which is further cleaved by the cytoplasmic Dicer ribonuclease to generate the mature miRNA and antisense miRNA star (miRNA*) products. The mature miRNA is incorporated into a RNA-induced silencing complex (RISC), which recognizes target mRNAs through imperfect base pairing with the miRNA and most commonly results in translational inhibition or destabilization of the target mRNA. The RefSeq represents the predicted microRNA stem-loop.
## 2 The protein encoded by this gene is a member of the X11 protein family. It is a neuronal adapter protein that interacts with the Alzheimer's disease amyloid precursor protein (APP). It stabilizes APP and inhibits production of proteolytic APP fragments including the A beta peptide that is deposited in the brains of Alzheimer's disease patients. This gene product is believed to be involved in signal transduction processes. It is also regarded as a putative vesicular trafficking protein in the brain that can form a complex with the potential to couple synaptic vesicle exocytosis to neuronal cell adhesion.
## 3 Olfactory receptors interact with odorant molecules in the nose, to initiate a neuronal response that triggers the perception of a smell. The olfactory receptor proteins are members of a large family of G-protein-coupled receptors (GPCR) arising from single coding-exon genes. Olfactory receptors share a 7-transmembrane domain structure with many neurotransmitter and hormone receptors and are responsible for the recognition and G protein-mediated transduction of odorant signals. The olfactory receptor gene family is the largest in the genome. The nomenclature assigned to the olfactory receptor genes and proteins for this organism is independent of other organisms.
## 4 The protein encoded by this gene belongs to a subfamily of the phosphotransferases. This encoded enzyme is responsible for the third and last step in L-serine formation. It catalyzes magnesium-dependent hydrolysis of L-phosphoserine and is also involved in an exchange reaction between L-serine and L-phosphoserine. Deficiency of this protein is thought to be linked to Williams syndrome.
## 5 The protein encoded by this gene may be involved in axon patterning in the central nervous system. This gene is not highly expressed. Several transcript variants encoding different isoforms have been found for this gene.
## 6 The protein encoded by this gene is a member of the protein tyrosine phosphatase (PTP) family. PTPs are known to be signaling molecules that regulate a variety of cellular processes including cell growth, differentiation, mitotic cycle, and oncogenic transformation. This PTP possesses an extracellular region, a single transmembrane region, and a single intracellular catalytic domain, and thus represents a receptor-type PTP. Silencing of this gene has been associated with colorectal cancer. Multiple transcript variants encoding different isoforms have been found for this gene. This gene shares a symbol (PTPRQ) with another gene, protein tyrosine phosphatase, receptor type, Q (GeneID 374462), which is also located on chromosome 12.
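The printed output suggests that the result is a regular data frame, so a summary can be looked up by gene ID with standard subsetting. The sketch below uses a small hand-made stand-in for the table: the column names and IDs are taken from the output above, the summaries are truncated, and the remaining columns are left out. With the package installed, tb would come from loadGeneSummary() instead.

```r
# Stand-in for the table returned by loadGeneSummary(); only the columns
# needed here are included and the summaries are shortened.
tb = data.frame(
    RefSeq_accession = c("NR_030309.1", "NM_001353788.2", "NM_001004748.1"),
    Gene_ID = c(693168, 321, 401667),
    Review_status = c("PROVISIONAL REFSEQ", "REVIEWED REFSEQ", "PROVISIONAL REFSEQ"),
    Gene_summary = c(
        "microRNAs (miRNAs) are short (20-24 nt) non-coding RNAs ...",
        "The protein encoded by this gene is a member of the X11 protein family. ...",
        "Olfactory receptors interact with odorant molecules in the nose, ..."
    ),
    stringsAsFactors = FALSE
)

# look up the summary of one gene by its ID
gene_summary = tb$Gene_summary[tb$Gene_ID == 321]
gene_summary

# rows are keyed by RefSeq accession, so a Gene_ID may occur more than once;
# keep the first row per gene if one summary per gene is wanted
tb_unique = tb[!duplicated(tb$Gene_ID), ]
nrow(tb_unique)
```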
Setting organism to NULL returns a table for all organisms.
##
## Aedes aegypti Aotus nancymaae
## 1 1
## Aplysia californica Bison bison bison
## 1 1
## Callorhinchus milii Macaca nemestrina
## 1 1
## Mandrillus leucophaeus Rhinopithecus roxellana
## 1 1
## Anas platyrhynchos Cercocebus atys
## 2 2
## Chelonia mydas Colobus angolensis palliatus
## 2 2
## Crassostrea gigas Geospiza fortis
## 2 2
## Latimeria chalumnae Loxodonta africana
## 2 2
## Melopsittacus undulatus Python bivittatus
## 2 2
## Alligator sinensis Amphimedon queenslandica
## 3 3
## Chlorocebus sabaeus Columba livia
## 3 3
## Falco cherrug Falco peregrinus
## 3 3
## Nannospalax galili Oncorhynchus mykiss
## 3 3
## Orycteropus afer afer Pelodiscus sinensis
## 3 3
## Salmo salar Zonotrichia albicollis
## 3 3
## Alligator mississippiensis Bos mutus
## 4 4
## Ficedula albicollis Meleagris gallopavo
## 4 4
## Myotis brandtii Myotis davidii
## 4 4
## Pseudopodoces humilis Ailuropoda melanoleuca
## 4 5
## Astyanax mexicanus Balaenoptera acutorostrata
## 5 5
## Balaenoptera acutorostrata scammoni Camelus ferus
## 5 5
## Elephantulus edwardii Panthera tigris
## 5 5
## Poecilia formosa Chrysemys picta
## 5 6
## Heterocephalus glaber Otolemur garnettii
## 6 6
## Physeter catodon Saimiri boliviensis
## 6 6
## Sorex araneus Cavia porcellus
## 6 7
## Chinchilla lanigera Dasypus novemcinctus
## 7 7
## Leptonychotes weddellii Myotis lucifugus
## 7 7
## Octodon degus Ceratotherium simum simum
## 7 8
## Condylura cristata Echinops telfairi
## 8 8
## Erinaceus europaeus Jaculus jaculus
## 8 8
## Mesocricetus auratus Mustela putorius furo
## 8 8
## Ochotona princeps Pteropus alecto
## 8 8
## Vicugna pacos Chrysochloris asiatica
## 8 9
## Ictidomys tridecemlineatus Lipotes vexillifer
## 9 9
## Odobenus rosmarus divergens Orcinus orca
## 9 9
## Trichechus manatus latirostris Tursiops truncatus
## 9 9
## Felis catus Microtus ochrogaster
## 10 10
## Papio anubis Bubalus bubalis
## 10 11
## Macaca fascicularis Nomascus leucogenys
## 11 11
## Peromyscus maniculatus bairdii Callithrix jacchus
## 11 15
## Hydra vulgaris Pongo abelii
## 20 22
## Strongylocentrotus purpuratus Sarcophilus harrisii
## 64 65
## Xenopus laevis Brassica rapa
## 85 89
## Saccoglossus kowalevskii Cucumis melo
## 90 105
## Ovis aries Acyrthosiphon pisum
## 117 125
## Malus domestica Takifugu rubripes
## 132 141
## Citrus sinensis Solanum lycopersicum
## 146 152
## Vitis vinifera Oryzias latipes
## 156 161
## Zea mays Pan paniscus
## 166 179
## Tupaia chinensis Solanum tuberosum
## 184 215
## Cricetulus griseus Xenopus tropicalis
## 236 245
## Taeniopygia guttata Apis mellifera
## 249 257
## Capra hircus Anolis carolinensis
## 277 294
## Brachypodium distachyon Oryctolagus cuniculus
## 312 323
## Ciona intestinalis Tribolium castaneum
## 331 334
## Nasonia vitripennis Gorilla gorilla
## 350 374
## Ornithorhynchus anatinus Bombyx mori
## 418 423
## Sus scrofa Danio rerio
## 427 468
## Eptesicus fuscus Glycine max
## 494 671
## Macaca mulatta Monodelphis domestica
## 677 685
## Pan troglodytes Gallus gallus
## 686 969
## Canis lupus familiaris Equus caballus
## 1101 1463
## Bos taurus Rattus norvegicus
## 1968 2332
## Mus musculus Homo sapiens
## 11008 73575
##
## PREDICTED REFSEQ INFERRED REFSEQ VALIDATED REFSEQ PROVISIONAL REFSEQ
## 13 2374 8886 18920
## REVIEWED REFSEQ
## 73610
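The calls that produced the two tabulations above are not shown in this rendering. Counts of this kind can be obtained with table() and sort(). The following is a sketch on a hypothetical miniature table that reuses the Organism and Review_status column names from the earlier output. The toy rows are made up for illustration and are not RefSeq data.

```r
# Toy rows for illustration only; the real table would come from
# loadGeneSummary(organism = NULL).
tb = data.frame(
    Organism = c("Homo sapiens", "Homo sapiens", "Homo sapiens",
                 "Mus musculus", "Mus musculus", "Bos taurus"),
    Review_status = c("REVIEWED REFSEQ", "REVIEWED REFSEQ", "PROVISIONAL REFSEQ",
                      "VALIDATED REFSEQ", "REVIEWED REFSEQ", "PROVISIONAL REFSEQ"),
    stringsAsFactors = FALSE
)

# number of records per organism, smallest first
org_counts = sort(table(tb$Organism))
org_counts

# number of records per review status, smallest first
status_counts = sort(table(tb$Review_status))
status_counts
```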
A specific review status can be selected via the status argument, e.g. only "reviewed":
## REVIEWED REFSEQ
## 73610
Version of the data:
## RefSeq gene summaries
## RefSeq release: 220
## Source: https://ftp.ncbi.nih.gov/refseq/release/complete/*.rna.gbff.gz
## Number of organisms: 129
## Built date: 2023-09-23
## R version 4.3.1 (2023-06-16)
## Platform: x86_64-pc-linux-gnu (64-bit)
## Running under: Ubuntu 22.04.3 LTS
##
## Matrix products: default
## BLAS: /home/biocbuild/bbs-3.18-bioc/R/lib/libRblas.so
## LAPACK: /usr/lib/x86_64-linux-gnu/lapack/liblapack.so.3.10.0
##
## locale:
## [1] LC_CTYPE=en_US.UTF-8 LC_NUMERIC=C
## [3] LC_TIME=en_GB LC_COLLATE=C
## [5] LC_MONETARY=en_US.UTF-8 LC_MESSAGES=en_US.UTF-8
## [7] LC_PAPER=en_US.UTF-8 LC_NAME=C
## [9] LC_ADDRESS=C LC_TELEPHONE=C
## [11] LC_MEASUREMENT=en_US.UTF-8 LC_IDENTIFICATION=C
##
## time zone: America/New_York
## tzcode source: system (glibc)
##
## attached base packages:
## [1] stats graphics grDevices utils datasets methods base
##
## other attached packages:
## [1] GeneSummary_0.99.6
##
## loaded via a namespace (and not attached):
## [1] digest_0.6.33 R6_2.5.1 fastmap_1.1.1 xfun_0.40
## [5] cachem_1.0.8 knitr_1.44 htmltools_0.5.6 rmarkdown_2.25
## [9] cli_3.6.1 sass_0.4.7 jquerylib_0.1.4 compiler_4.3.1
## [13] tools_4.3.1 evaluate_0.21 bslib_0.5.1 yaml_2.3.7
## [17] rlang_1.1.1 jsonlite_1.8.7