computeFeaturesCage {ORFik} | R Documentation |
Normally you should not use this function; use [computeFeatures()] instead.
computeFeaturesCage(grl, RFP, RNA = NULL, Gtf = NULL, tx = NULL, fiveUTRs = NULL, cds = NULL, threeUTRs = NULL, faFile = NULL, riboStart = 26, riboStop = 34, extension = NULL, orfFeatures = TRUE, cageFiveUTRs = NULL, includeNonVarying = TRUE, grl.is.sorted = FALSE)
grl |
a GRangesList, for example of ORFs (see orfFeatures) |
RFP |
Ribo-seq reads as a GAlignments, GRanges or GRangesList object |
RNA |
RNA-seq reads as a GAlignments, GRanges or GRangesList object |
Gtf |
a TxDb object made from a GTF file |
tx |
a GRangesList of transcripts, normally obtained from exonsBy(Gtf, by = "tx", use.names = TRUE). Only supply this if you are not including Gtf. You do not need to reassign these to the CAGE peaks; the function does that for you. |
fiveUTRs |
5' UTRs as a GRangesList; these must be the original, unchanged 5' UTRs |
cds |
a GRangesList of coding sequences |
threeUTRs |
a GRangesList of transcript 3' UTRs, normally obtained from threeUTRsByTranscript(Gtf, use.names = TRUE) |
faFile |
a FaFile or BSgenome from the fasta file, see ?FaFile |
riboStart |
usually 26, the start of the floss interval, see ?floss |
riboStop |
usually 34, the end of the floss interval |
extension |
a numeric/integer; it must be set (the default is NULL). Set it to 0 if you did not use CAGE. If you used CAGE to change the TSSs when finding the ORFs, the standard CAGE extension is 1000. |
orfFeatures |
a logical: is grl a list of ORFs? Must be assigned. |
cageFiveUTRs |
a GRangesList. If you used CAGE data to extend the 5' UTRs, include them here; extension must then match the extension used to create them. |
includeNonVarying |
a logical (TRUE). If TRUE, include all features that do not depend on Ribo-seq or RNA-seq data, that is: Kozak, fractionLengths, distORFCDS, isInFrame, isOverlapping and rankInTx |
grl.is.sorted |
a logical (FALSE). If you know that grl is sorted, set this to TRUE for a speed-up. |
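The annotation arguments tx, fiveUTRs, cds and threeUTRs are all GRangesList objects that can be derived from a TxDb. The following is a minimal sketch, assuming GenomicFeatures is installed and using its bundled sample database (the same file used in the example on this page). The call to cdsBy() is an assumption for building cds; this page does not name the function to use for it.

```r
library(GenomicFeatures)

# Load the small sample TxDb shipped with GenomicFeatures
txdbFile <- system.file("extdata", "hg19_knownGene_sample.sqlite",
                        package = "GenomicFeatures")
txdb <- loadDb(txdbFile)

# Exons grouped by transcript, as described for the tx argument
tx <- exonsBy(txdb, by = "tx", use.names = TRUE)
# Original, unchanged 5' UTRs, as required for the fiveUTRs argument
fiveUTRs <- fiveUTRsByTranscript(txdb, use.names = TRUE)
# Coding sequences grouped by transcript (assumed source for cds)
cds <- cdsBy(txdb, by = "tx", use.names = TRUE)
# 3' UTRs, as described for the threeUTRs argument
threeUTRs <- threeUTRsByTranscript(txdb, use.names = TRUE)
```

As the tx description states, these are only needed when Gtf is not supplied; passing the TxDb as Gtf should make the separate objects unnecessary.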
A specialized version for when you used CAGE data and do not have a new TxDb with reassigned leaders, transcripts and gene starts. If you do have a TxDb with CAGE reassignments, use computeFeatures instead. Each feature has a link to an article describing it; try ?floss
a data.table of scores; each column is one score type, and the column names are the names of the scores, e.g. [floss()] or [fpkm()]
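Because the result is an ordinary data.table with one column per score, it can be filtered and sorted with standard data.table syntax. The sketch below uses a hand-made stand-in table; the column names floss and fpkmRFP and all values are hypothetical, since the exact columns depend on the ORFik version and the arguments used.

```r
library(data.table)

# Hypothetical stand-in for the returned table: one row per ORF,
# one column per score (names and values are illustrative only)
features <- data.table(floss   = c(0.12, 0.45, 0.30),
                       fpkmRFP = c(10.5, 0, 3.2))

# Column names are the score names
names(features)

# Keep rows with Ribo-seq signal, then sort by the floss column
kept <- features[fpkmRFP > 0]
kept <- kept[order(floss)]
```

On a real result, names() is the quickest way to see which scores were computed before selecting columns.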
Other features: computeFeatures, disengagementScore, distToCds, entropy, floss, fpkm_calc, fpkm, fractionLength, insideOutsideORF, isInFrame, isOverlapping, kozakSequenceScore, orfScore, rankOrder, ribosomeReleaseScore, ribosomeStallingScore, subsetCoverage, translationalEff
# a small example without cage-seq data:
# we will find ORFs in the 5' utrs
# and then calculate features on them
## Not run:
if (requireNamespace("BSgenome.Hsapiens.UCSC.hg19")) {
  library(GenomicFeatures)
  # Get the gtf txdb file
  txdbFile <- system.file("extdata", "hg19_knownGene_sample.sqlite",
                          package = "GenomicFeatures")
  txdb <- loadDb(txdbFile)

  # Extract sequences of fiveUTRs.
  fiveUTRs <- fiveUTRsByTranscript(txdb, use.names = TRUE)[1:10]
  faFile <- BSgenome.Hsapiens.UCSC.hg19::Hsapiens
  # need to suppress warning because of bug in GenomicFeatures, will
  # be fixed soon.
  tx_seqs <- suppressWarnings(extractTranscriptSeqs(faFile, fiveUTRs))

  # Find all ORFs on those transcripts and get their genomic coordinates
  fiveUTR_ORFs <- findMapORFs(fiveUTRs, tx_seqs)
  unlistedORFs <- unlistGrl(fiveUTR_ORFs)
  # group GRanges by ORFs instead of Transcripts
  fiveUTR_ORFs <- groupGRangesBy(unlistedORFs, unlistedORFs$names)

  # make some toy ribo seq and rna seq data
  starts <- unlistGrl(ORFik:::firstExonPerGroup(fiveUTR_ORFs))
  RFP <- promoters(starts, upstream = 0, downstream = 1)
  score(RFP) <- rep(29, length(RFP)) # the original read widths

  # set RNA seq to duplicate transcripts
  RNA <- unlistGrl(exonsBy(txdb, by = "tx", use.names = TRUE))

  cageNotUsed <- 0 # used to inform that no cage was used
  computeFeaturesCage(grl = fiveUTR_ORFs, orfFeatures = TRUE, RFP = RFP,
                      RNA = RNA, Gtf = txdb, faFile = faFile,
                      extension = cageNotUsed)
}
# See vignettes for more examples

## End(Not run)