bulk_windows_pipeline_setup {FLAMES}    R Documentation

FLAMES Windows Bulk Pipeline

Description

An implementation of the FLAMES pipeline designed to run on Windows, or on any other OS where minimap2 is not available for read alignment and realignment. Read alignment must therefore be performed externally, in between pipeline calls.

Usage

bulk_windows_pipeline_setup(
  annot,
  fastq,
  in_bam = NULL,
  outdir,
  genome_fa,
  downsample_ratio = 1,
  config_file
)

Arguments

annot

gene annotations file in gff3 format

fastq

file path to input fastq file (the example below passes a directory containing the fastq files to be merged)

in_bam

optional bam file to use instead of fastq files (skips read alignment step)

outdir

directory to store all output files.

genome_fa

genome fasta file.

downsample_ratio

downsampling ratio if performing downsampling analysis.

config_file

JSON configuration file. If specified, config_file overrides all configuration parameters.

Details

bulk_windows_pipeline_setup is the first of the three steps in the Windows FLAMES bulk pipeline. The steps should be run in this order: run bulk_windows_pipeline_setup, perform read alignment externally, run windows_pipeline_isoforms, perform read realignment externally, and finally run windows_pipeline_quantification. The state of the pipeline is carried in a list, pipeline_variables, which contains the information required to continue the pipeline. bulk_windows_pipeline_setup creates and returns this list; it should then be passed into each later function and, where that function returns an updated list, replaced with the returned list. See the vignette 'Vignette for FLAMES bulk on Windows' for more details.
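
Between steps, the path of the externally produced BAM file is added to pipeline_variables by hand (the Examples use the fields genome_bam and realign_bam). A minimal sketch of guarding that hand-off follows. The helper attach_bam is hypothetical and not part of FLAMES, and the assumption that a ".bai" index must sit next to the BAM is taken only from the file names used in the Examples.

```r
# Hypothetical helper (NOT part of FLAMES): record an externally produced BAM
# in pipeline_variables only if the BAM and its .bai index both exist.
attach_bam <- function(pipeline_variables, field, bam) {
  index <- paste0(bam, ".bai")          # assumed index location: <bam>.bai
  paths <- c(bam, index)
  absent <- paths[!file.exists(paths)]
  if (length(absent) > 0) {
    stop("missing alignment file(s): ", paste(absent, collapse = ", "))
  }
  pipeline_variables[[field]] <- bam    # e.g. field = "genome_bam"
  pipeline_variables
}

# Demonstration with a stand-in list and empty placeholder files,
# so that the sketch runs without FLAMES or real alignments.
demo_dir <- tempfile("flames_demo")
dir.create(demo_dir)
demo_bam <- file.path(demo_dir, "align2genome.bam")
file.create(demo_bam, paste0(demo_bam, ".bai"))
demo_vars <- attach_bam(list(outdir = demo_dir), "genome_bam", demo_bam)
```

Failing early here, rather than inside windows_pipeline_isoforms, makes it easier to see that the external alignment step was skipped or wrote its output somewhere else.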

Value

a list pipeline_variables with the required variables for execution of later Windows pipeline steps. File paths required to perform minimap2 alignment are given in pipeline_variables$return_files. This list should be given as input for windows_pipeline_isoforms after minimap2 alignment has taken place; windows_pipeline_isoforms is the continuation of this pipeline.
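
The external alignment itself is run outside R, on a machine where minimap2 is available. As a rough sketch, the commands could be assembled as below. The helper build_alignment_commands is hypothetical, the names of the entries inside pipeline_variables$return_files are not documented here and so the paths are passed in explicitly, and the minimap2 options shown (spliced long-read preset, thread count) are an assumption rather than the settings FLAMES itself uses.

```r
# Hypothetical helper (NOT part of FLAMES): build, but do not run, the shell
# commands for an external genome alignment. Paths are quoted for a
# Bourne-style shell because the commands are meant for a Unix-like machine.
build_alignment_commands <- function(genome_fa, fastq, out_bam, threads = 4) {
  q <- function(x) shQuote(x, type = "sh")
  sam <- sub("\\.bam$", ".sam", out_bam)   # intermediate SAM next to the BAM
  c(
    # spliced alignment of long reads, SAM output (options are an assumption)
    align = paste("minimap2 -ax splice -t", threads, q(genome_fa), q(fastq),
                  ">", q(sam)),
    # coordinate-sort into BAM, then index (writes <out_bam>.bai)
    sort  = paste("samtools sort -o", q(out_bam), q(sam)),
    index = paste("samtools index", q(out_bam))
  )
}

cmds <- build_alignment_commands("genome.fa", "reads.fastq",
                                 "align2genome.bam", threads = 2)
writeLines(cmds)
```

The printed lines can be copied to the machine that has minimap2 and samtools installed; the resulting BAM and index are then copied back and recorded in pipeline_variables before windows_pipeline_isoforms is called.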

Examples

## example windows pipeline for BULK data. See Vignette for single cell data.

# download the two fastq files, move them to a folder to be merged together
temp_path <- tempfile()
bfc <- BiocFileCache::BiocFileCache(temp_path, ask=FALSE)
file_url <- 
    "https://raw.githubusercontent.com/OliverVoogd/FLAMESData/master/data"
# download the required fastq files, and move them to new folder
fastq1 <- bfc[[names(BiocFileCache::bfcadd(bfc, "Fastq1", paste(file_url, "fastq/sample1.fastq.gz", sep="/")))]]
fastq2 <- bfc[[names(BiocFileCache::bfcadd(bfc, "Fastq2", paste(file_url, "fastq/sample2.fastq.gz", sep="/")))]]
fastq_dir <- paste(temp_path, "fastq_dir", sep="/") # the downloaded fastq files need to be in a directory to be merged together
dir.create(fastq_dir)
file.copy(c(fastq1, fastq2), fastq_dir)
unlink(c(fastq1, fastq2)) # the original files can be deleted

# run the FLAMES bulk pipeline setup
# pipeline_variables <- bulk_windows_pipeline_setup(
#     annot=system.file("extdata/SIRV_anno.gtf", package="FLAMES"),
#     fastq=fastq_dir,
#     outdir=tempdir(),
#     genome_fa=system.file("extdata/SIRV_genomefa.fasta", package="FLAMES"),
#     config_file=system.file("extdata/SIRV_config_default.json", package="FLAMES"))
# read alignment is handled externally (below downloads aligned bam for example)
# genome_bam <- paste0(temp_path, "/align2genome.bam")
# file.rename(bfc[[names(BiocFileCache::bfcadd(bfc, "Genome BAM", paste(file_url, "align2genome.bam", sep="/")))]], genome_bam)
# 
# genome_index <- paste0(temp_path, "/align2genome.bam.bai")
# file.rename(bfc[[names(BiocFileCache::bfcadd(bfc, "Genome BAM Index", paste(file_url, "align2genome.bam.bai", sep="/")))]], genome_index)
# pipeline_variables$genome_bam <- genome_bam
# 
# # run the FLAMES bulk pipeline find isoforms step
# pipeline_variables <- windows_pipeline_isoforms(pipeline_variables)
# 
# # read realignment is handled externally
# realign_bam <- paste0(temp_path, "/realign2genome.bam")
# file.rename(bfc[[names(BiocFileCache::bfcadd(bfc, "Realign BAM", paste(file_url, "realign2transcript.bam", sep="/")))]], realign_bam)
# 
# realign_index <- paste0(temp_path, "/realign2genome.bam.bai")
# file.rename(bfc[[names(BiocFileCache::bfcadd(bfc, "Realign BAM Index", paste(file_url, "realign2transcript.bam.bai", sep="/")))]], realign_index)
# pipeline_variables$realign_bam <- realign_bam
# 
# # finally, quantification, which returns a SummarizedExperiment object
# se <- windows_pipeline_quantification(pipeline_variables)

[Package FLAMES version 1.0.2 Index]