windows_pipeline_quantification {FLAMES} | R Documentation |
This is the final step in the three-step Windows FLAMES pipeline. It should be run
after windows_pipeline_isoforms, once read realignment has been performed.
windows_pipeline_quantification(pipeline_vars)
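pipeline_vars | the list returned from windows_pipeline_isoforms, with realign_bam set to the path of the externally realigned BAM file (as in the example below)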
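Because this step relies on realignment having been done externally, it can help to check the list before passing it on. The following is a minimal sketch, not part of FLAMES: check_pipeline_vars is a hypothetical helper, and the realign_bam element name is taken from the example on this page.

```r
# Hypothetical helper (not part of FLAMES): verify that the list passed to
# windows_pipeline_quantification() points at an existing realigned BAM file.
check_pipeline_vars <- function(pipeline_vars) {
    if (!is.list(pipeline_vars)) {
        stop("pipeline_vars must be a list")
    }
    bam <- pipeline_vars$realign_bam
    if (is.null(bam) || !is.character(bam) || length(bam) != 1) {
        stop("pipeline_vars$realign_bam must be a single file path")
    }
    if (!file.exists(bam)) {
        stop("realigned BAM file not found: ", bam)
    }
    invisible(TRUE)
}

# Toy usage with an empty placeholder file standing in for a real BAM
toy_bam <- tempfile(fileext = ".bam")
file.create(toy_bam)
check_result <- check_pipeline_vars(list(realign_bam = toy_bam))
```

The check only confirms that the path exists; it does not validate that the file is a sorted, indexed BAM.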
windows_pipeline_quantification
returns a SummarizedExperiment object (or a SingleCellExperiment when the function
is used in the FLAMES single cell pipeline) containing a count matrix as an assay,
gene annotations under metadata, and a list of the other output files generated by
the pipeline. The pipeline also writes the following files to the given outdir
directory:
transcript_count.csv.gz - a transcript count matrix (also contained in the SummarizedExperiment)
isoform_annotated.filtered.gff3 - isoforms in gff3 format (also contained in the SummarizedExperiment)
transcript_assembly.fa - transcript sequence from the isoforms
align2genome.bam - sorted BAM file with reads aligned to genome
realign2transcript.bam - sorted realigned BAM file using the transcript_assembly.fa as reference
tss_tes.bedgraph - TSS TES enrichment for all reads (for QC)
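After a run, the list above can be turned into a quick check of the output directory. This is a sketch using base R only; missing_outputs is a hypothetical helper rather than a FLAMES function, and the file names are copied from the list above.

```r
# File names as listed on this documentation page
expected_outputs <- c(
    "transcript_count.csv.gz",
    "isoform_annotated.filtered.gff3",
    "transcript_assembly.fa",
    "align2genome.bam",
    "realign2transcript.bam",
    "tss_tes.bedgraph"
)

# Hypothetical helper (not part of FLAMES): return the names of the
# expected files that are not present in the output directory.
missing_outputs <- function(outdir, expected = expected_outputs) {
    expected[!file.exists(file.path(outdir, expected))]
}

# Toy usage: a temporary directory containing only one of the files
toy_outdir <- tempfile("flames_out_")
dir.create(toy_outdir)
file.create(file.path(toy_outdir, "transcript_assembly.fa"))
still_missing <- missing_outputs(toy_outdir)
```

In the example on this page the BAM files are supplied from a separate location, so they may legitimately be absent from outdir in the Windows pipeline.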
## example windows pipeline for BULK data. See Vignette for single cell data.

# download the two fastq files, move them to a folder to be merged together
temp_path <- tempfile()
bfc <- BiocFileCache::BiocFileCache(temp_path, ask=FALSE)
file_url <-
    "https://raw.githubusercontent.com/OliverVoogd/FLAMESData/master/data"

# download the required fastq files, and move them to new folder
fastq1 <- bfc[[names(BiocFileCache::bfcadd(bfc, "Fastq1",
    paste(file_url, "fastq/sample1.fastq.gz", sep="/")))]]
fastq2 <- bfc[[names(BiocFileCache::bfcadd(bfc, "Fastq2",
    paste(file_url, "fastq/sample2.fastq.gz", sep="/")))]]

# the downloaded fastq files need to be in a directory to be merged together
fastq_dir <- paste(temp_path, "fastq_dir", sep="/")
dir.create(fastq_dir)
file.copy(c(fastq1, fastq2), fastq_dir)
unlink(c(fastq1, fastq2)) # the original files can be deleted

# run the FLAMES bulk pipeline setup
#pipeline_variables <- bulk_windows_pipeline_setup(
#    annot=system.file("extdata/SIRV_anno.gtf", package="FLAMES"),
#    fastq=fastq_dir,
#    outdir=tempdir(),
#    genome_fa=system.file("extdata/SIRV_genomefa.fasta", package="FLAMES"),
#    config_file=system.file("extdata/SIRV_config_default.json", package="FLAMES"))

# read alignment is handled externally (below downloads aligned bam for example)
# genome_bam <- paste0(temp_path, "/align2genome.bam")
# file.rename(bfc[[names(BiocFileCache::bfcadd(bfc, "Genome BAM",
#     paste(file_url, "align2genome.bam", sep="/")))]], genome_bam)
#
# genome_index <- paste0(temp_path, "/align2genome.bam.bai")
# file.rename(bfc[[names(BiocFileCache::bfcadd(bfc, "Genome BAM Index",
#     paste(file_url, "align2genome.bam.bai", sep="/")))]], genome_index)
# pipeline_variables$genome_bam = genome_bam
#
# # run the FLAMES bulk pipeline find isoforms step
# pipeline_variables <- windows_pipeline_isoforms(pipeline_variables)
#
# # read realignment is handled externally
# realign_bam <- paste0(temp_path, "/realign2genome.bam")
# file.rename(bfc[[names(BiocFileCache::bfcadd(bfc, "Realign BAM",
#     paste(file_url, "realign2transcript.bam", sep="/")))]], realign_bam)
#
# realign_index <- paste0(temp_path, "/realign2genome.bam.bai")
# file.rename(bfc[[names(BiocFileCache::bfcadd(bfc, "Realign BAM Index",
#     paste(file_url, "realign2transcript.bam.bai", sep="/")))]], realign_index)
# pipeline_variables$realign_bam <- realign_bam
#
# # finally, quantification, which returns a Summarized Experiment object
# se <- windows_pipeline_quantification(pipeline_variables)
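Once quantification has finished, the gzipped count matrix written to outdir can be read back with base R. The sketch below works on a toy file instead of real pipeline output. The column names (transcript_id, gene_id, sample1) are assumptions made for illustration only, since this page does not describe the layout of transcript_count.csv.gz.

```r
# Build a toy gzipped CSV standing in for transcript_count.csv.gz.
# ASSUMPTION: these column names are illustrative; the real file layout
# is not documented on this page.
toy_counts <- data.frame(
    transcript_id = c("tx1", "tx2", "tx3"),
    gene_id = c("geneA", "geneA", "geneB"),
    sample1 = c(10L, 0L, 5L),
    stringsAsFactors = FALSE
)
toy_file <- file.path(tempdir(), "toy_transcript_count.csv.gz")
con <- gzfile(toy_file, "w")
write.csv(toy_counts, con, row.names = FALSE)
close(con)

# Read the compressed file back through a gzfile connection
counts <- read.csv(gzfile(toy_file), stringsAsFactors = FALSE)

# Sum transcript-level counts per gene
gene_totals <- tapply(counts$sample1, counts$gene_id, sum)
```

For real output, replace toy_file with the path to transcript_count.csv.gz in outdir and adjust the column names after inspecting the header with names(counts).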