computeFeaturesCage {ORFik}    R Documentation

Get all possible features in ORFik

Description

Normally you should not call this function directly; use computeFeatures() instead.

Usage

computeFeaturesCage(grl, RFP, RNA = NULL, Gtf = NULL, tx = NULL,
  fiveUTRs = NULL, cds = NULL, threeUTRs = NULL, faFile = NULL,
  riboStart = 26, riboStop = 34, extension = NULL, orfFeatures = TRUE,
  cageFiveUTRs = NULL, includeNonVarying = TRUE, grl.is.sorted = FALSE)

Arguments

grl

a GRangesList object, usually of ORFs, but it can also contain leaders, CDSs or 3' UTRs. ORFs are a special case, see argument tx_len

RFP

Ribo-seq reads as a GAlignments, GRanges or GRangesList object

RNA

RNA-seq reads as a GAlignments, GRanges or GRangesList object

Gtf

a TxDb object of a gtf file

tx

a GRangesList of transcripts, normally obtained with: exonsBy(Gtf, by = "tx", use.names = TRUE). Only supply this if you are not passing Gtf. You do not need to reassign these to the CAGE peaks; the function does that for you.

fiveUTRs

a GRangesList of 5' UTRs; these must be the original, unchanged fiveUTRs

cds

a GRangesList of coding sequences

threeUTRs

a GRangesList of transcript 3' UTRs, normally obtained with: threeUTRsByTranscript(Gtf, use.names = TRUE)

faFile

a FaFile or BSgenome from the fasta file, see ?FaFile

riboStart

usually 26, the start of the floss interval, see ?floss

riboStop

usually 34, the end of the floss interval

extension

a numeric/integer; it must be set. Use 0 if you did not use CAGE. If you used CAGE to change the TSSs when finding the ORFs, the standard CAGE extension is 1000.

orfFeatures

a logical, is the grl a list of orfs? Must be assigned.

cageFiveUTRs

a GRangesList; include this if you used CAGE data to extend the 5' UTRs. The extension argument must match the extension used to create these.

includeNonVarying

a logical (TRUE); if TRUE, include all features that do not depend on Ribo-seq or RNA-seq data, that is: Kozak, fractionLengths, distORFCDS, isInFrame, isOverlapping and rankInTx

grl.is.sorted

a logical (FALSE); set this to TRUE for a speed-up if you know that argument grl is sorted.

Details

A specialized version for when you have used CAGE data and do not have a new txdb with reassigned leaders, transcripts and gene starts. If you do have a txdb with CAGE reassignments, use computeFeatures instead. Each feature has a link to an article describing it; try ?floss
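
As an illustration of how the riboStart and riboStop arguments bound the floss interval, the following is a minimal base R sketch. It assumes the commonly cited definition of a FLOSS-style score (half the summed absolute difference between two read-length distributions). The helper flossSketch and the toy read lengths are hypothetical and not part of ORFik; the package's own floss() may differ in details.

```r
# Hypothetical helper, NOT part of ORFik: a FLOSS-style score computed
# from raw read lengths, restricted to the interval riboStart:riboStop.
flossSketch <- function(orfLengths, refLengths,
                        riboStart = 26, riboStop = 34) {
  nBins <- riboStop - riboStart + 1
  # Fraction of reads at each length inside the interval
  fraction <- function(x) {
    x <- x[x >= riboStart & x <= riboStop]  # drop reads outside the interval
    counts <- tabulate(x - riboStart + 1, nbins = nBins)
    if (sum(counts) == 0) return(rep(0, nBins))  # sketch choice: no reads, all zeros
    counts / sum(counts)
  }
  # Half the summed absolute difference between the two distributions
  0.5 * sum(abs(fraction(orfLengths) - fraction(refLengths)))
}

# Toy read lengths (made up for illustration)
orfReads <- c(28, 29, 29, 30, 40)  # 40 lies outside 26:34 and is ignored
cdsReads <- c(28, 29, 29, 30)
flossSketch(orfReads, cdsReads)    # 0: same distribution inside the interval
```

Under this sketch, a score near 0 means the read-length distribution over the ORF resembles the reference, while a score near 1 means the two distributions barely overlap.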

Value

a data.table with scores. Each column is one score type, and the column names are the names of the scores, e.g. floss() or fpkm()

See Also

Other features: computeFeatures, disengagementScore, distToCds, entropy, floss, fpkm_calc, fpkm, fractionLength, insideOutsideORF, isInFrame, isOverlapping, kozakSequenceScore, orfScore, rankOrder, ribosomeReleaseScore, ribosomeStallingScore, subsetCoverage, translationalEff

Examples

 # a small example without cage-seq data:
 # we will find ORFs in the 5' utrs
 # and then calculate features on them
 ## Not run: 
 if (requireNamespace("BSgenome.Hsapiens.UCSC.hg19")) {
  library(GenomicFeatures)
  # Get the gtf txdb file
  txdbFile <- system.file("extdata", "hg19_knownGene_sample.sqlite",
                          package = "GenomicFeatures")
  txdb <- loadDb(txdbFile)

  # Extract sequences of fiveUTRs.
  fiveUTRs <- fiveUTRsByTranscript(txdb, use.names = TRUE)[1:10]
  faFile <- BSgenome.Hsapiens.UCSC.hg19::Hsapiens
  # need to suppress warning because of bug in GenomicFeatures, will
  # be fixed soon.
  tx_seqs <- suppressWarnings(extractTranscriptSeqs(faFile, fiveUTRs))

  # Find all ORFs on those transcripts and get their genomic coordinates
  fiveUTR_ORFs <- findMapORFs(fiveUTRs, tx_seqs)
  unlistedORFs <- unlistGrl(fiveUTR_ORFs)
  # group GRanges by ORFs instead of Transcripts
  fiveUTR_ORFs <- groupGRangesBy(unlistedORFs, unlistedORFs$names)

  # make some toy ribo seq and rna seq data
  starts <- unlistGrl(ORFik:::firstExonPerGroup(fiveUTR_ORFs))
  RFP <- promoters(starts, upstream = 0, downstream = 1)
  score(RFP) <- rep(29, length(RFP)) # the original read widths

  # set RNA seq to duplicate transcripts
  RNA <- unlistGrl(exonsBy(txdb, by = "tx", use.names = TRUE))

  cageNotUsed <- 0 # used to inform that no cage was used

  computeFeaturesCage(grl = fiveUTR_ORFs, orfFeatures = TRUE, RFP = RFP,
                      RNA = RNA, Gtf = txdb, faFile = faFile,
                      extension = cageNotUsed)

}
# See vignettes for more examples

## End(Not run)


[Package ORFik version 1.0.0 Index]