pmapToTranscriptF {ORFik} | R Documentation |
Map range coordinates between features in the transcriptome and genome (reference) space. The length of x must equal the length of transcripts. The only exception is when x has integer names such as (1, 3, 3, 5), so that x[1] maps to transcript 1, x[2] maps to transcript 3, and so on.
pmapToTranscriptF( x, transcripts, ignore.strand = FALSE, x.is.sorted = TRUE, tx.is.sorted = TRUE )
x |
GRangesList/GRanges/IRangesList/IRanges to map to transcriptomic coordinates |
transcripts |
a GRangesList/GRanges/IRangesList/IRanges to map against (the genomic coordinates). Must not be of a lower abstraction level than x; for example, if x is GRanges, transcripts cannot be IRanges. |
ignore.strand |
When ignore.strand is TRUE, strand is ignored in overlap operations (i.e., all strands are considered "+") and the strand in the output is "*". |
x.is.sorted |
if x is a GRangesList object, are "-" strand groups pre-sorted in decreasing order within group, default: TRUE |
tx.is.sorted |
if transcripts is a GRangesList object, are "-" strand groups pre-sorted in decreasing order within group, default: TRUE |
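The ordering that x.is.sorted and tx.is.sorted refer to can be illustrated with a minimal base R sketch. It assumes a single transcript described only by its exon start coordinates; the helper names is_tx_sorted and sort_tx are hypothetical and not part of ORFik.

```r
# Hypothetical helpers (not ORFik functions): check and enforce 5' to 3'
# exon order for a single transcript.
# On "+" the exons run in increasing genomic order,
# on "-" they run in decreasing genomic order.
is_tx_sorted <- function(exon_start, strand = "+") {
  if (strand == "-") {
    !is.unsorted(rev(exon_start))
  } else {
    !is.unsorted(exon_start)
  }
}

sort_tx <- function(exon_start, strand = "+") {
  exon_start[order(exon_start, decreasing = (strand == "-"))]
}

is_tx_sorted(c(29, 5), "-")  # exons already in decreasing order
is_tx_sorted(c(5, 29), "-")  # exons in increasing order, not sorted for "-"
sort_tx(c(5, 29), "-")       # reorder so the 5' exon comes first
```

If the input is not known to be in this order, setting the corresponding argument to FALSE is the conservative choice.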
This version tries to fix the shortcomings of the GenomicFeatures version (pmapToTranscripts). It is much faster and uses less memory. It is implemented as optimized C++ code using dynamic programming.
An object of the same class as the input x; names from the ranges are kept.
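The coordinate arithmetic behind a genome-to-transcript mapping can be sketched in base R. This is an illustration only, not the package's C++ implementation: it assumes one transcript given as integer exon starts and ends, and the function name genome_to_tx is hypothetical.

```r
# Hypothetical helper (not an ORFik function): map genomic positions to
# 1-based transcript positions for one transcript.
genome_to_tx <- function(pos, exon_start, exon_end, strand = "+") {
  # Put exons in 5' to 3' order: increasing on "+", decreasing on "-"
  ord <- order(exon_start, decreasing = (strand == "-"))
  exon_start <- exon_start[ord]
  exon_end <- exon_end[ord]
  widths <- exon_end - exon_start + 1L
  # Number of transcript bases that precede each exon
  offsets <- cumsum(c(0L, widths[-length(widths)]))
  vapply(pos, function(p) {
    i <- which(p >= exon_start & p <= exon_end)
    if (length(i) != 1L) return(NA_integer_)  # position is not in an exon
    # Distance from the 5' end of the exon that contains p
    within <- if (strand == "-") exon_end[i] - p else p - exon_start[i]
    as.integer(offsets[i] + within + 1L)
  }, integer(1))
}

# Exons 5-27 and 29-30, the same transcript layout as in the examples
genome_to_tx(c(26L, 27L, 29L), c(5L, 29L), c(27L, 30L), "+")
# Same exons read from the "-" strand: position 30 is the first base
genome_to_tx(c(30L, 27L), c(5L, 29L), c(27L, 30L), "-")
```

On the "+" strand the first exon contributes 23 bases, so genomic positions 26, 27 and 29 fall at transcript positions 22, 23 and 24; position 28 lies in the intron and yields NA.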
library(ORFik)
library(GenomicFeatures)
# Two ranges objects are needed: the target region and the whole transcript
# x is the target region
x <- GRanges("chr1", IRanges(start = c(26, 29), end = c(27, 29)), "+")
names(x) <- rep("tx1_ORF1", length(x))
x <- groupGRangesBy(x)
# tx is the whole region
tx_gr <- GRanges("chr1", IRanges(c(5, 29), c(27, 30)), "+")
names(tx_gr) <- rep("tx1", length(tx_gr))
tx <- groupGRangesBy(tx_gr)
pmapToTranscriptF(x, tx)
pmapToTranscripts(x, tx)
# Reuse names for matching
x <- GRanges("chr1", IRanges(start = c(26, 29, 5), end = c(27, 29, 18)), "+")
names(x) <- c(rep("tx1_1", 2), "tx1_2")
x <- groupGRangesBy(x)
tx1_2 <- GRanges("chr1", IRanges(c(4, 28), c(26, 31)), "+")
names(tx1_2) <- rep("tx1", 2)
tx <- c(tx, groupGRangesBy(tx1_2))
a <- pmapToTranscriptF(x, tx[txNames(x)])
b <- pmapToTranscripts(x, tx[txNames(x)])
identical(a, b)
seqinfo(a)
# Note: a and b have only 1 seqlength, even though the 2 "tx1"
# transcripts differ in size. This is an artifact of using duplicated names.

## Also see asTx for a similar, useful function.