GDCprepare {TCGAbiolinks} | R Documentation |
Reads the downloaded data and prepares it as an R object.
GDCprepare(
  query,
  save = FALSE,
  save.filename,
  directory = "GDCdata",
  summarizedExperiment = TRUE,
  remove.files.prepared = FALSE,
  add.gistic2.mut = NULL,
  mut.pipeline = "mutect2",
  mutant_variant_classification = c("Frame_Shift_Del", "Frame_Shift_Ins",
    "Missense_Mutation", "Nonsense_Mutation", "Splice_Site", "In_Frame_Del",
    "In_Frame_Ins", "Translation_Start_Site", "Nonstop_Mutation")
)
query |
A query object created by the GDCquery function |
save |
Save result as RData object? |
save.filename |
Name of the file to be saved. If empty, a name is generated automatically. |
directory |
Directory/Folder where the data was downloaded. Default: GDCdata |
summarizedExperiment |
Create a SummarizedExperiment? Default: TRUE (if possible) |
remove.files.prepared |
Remove the files after they have been read? Default: FALSE. This argument is considered only if the save argument is set to TRUE. |
add.gistic2.mut |
If a list of genes (gene symbols) is given, two kinds of columns are added to the sample matrix of the SummarizedExperiment object: columns with GISTIC2 results from GDAC Firehose (hg19), and a TRUE/FALSE column indicating whether the gene is mutated in the sample (hg38; see the MAF file for more information). |
mut.pipeline |
Used only if add.gistic2.mut is not NULL. Four separate variant calling pipelines are implemented for GDC data harmonization. Options: muse, varscan2, somaticsniper, mutect2 (default). For more information: https://gdc-docs.nci.nih.gov/Data/Bioinformatics_Pipelines/DNA_Seq_Variant_Calling_Pipeline/ |
mutant_variant_classification |
Variant classifications that cause a sample to be considered mutant; a sample with none of them is considered non-mutant. Default: "Frame_Shift_Del", "Frame_Shift_Ins", "Missense_Mutation", "Nonsense_Mutation", "Splice_Site", "In_Frame_Del", "In_Frame_Ins", "Translation_Start_Site", "Nonstop_Mutation" |
A SummarizedExperiment or a data.frame
## Not run: 
query <- GDCquery(project = "TCGA-KIRP",
                  data.category = "Simple Nucleotide Variation",
                  data.type = "Masked Somatic Mutation",
                  workflow.type = "MuSE Variant Aggregation and Masking")
GDCdownload(query, method = "api", directory = "maf")
maf <- GDCprepare(query, directory = "maf")

# Get GISTIC values
gistic.query <- GDCquery(project = "TCGA-ACC",
                         data.category = "Copy Number Variation",
                         data.type = "Gene Level Copy Number Scores",
                         access = "open")
GDCdownload(gistic.query)
gistic <- GDCprepare(gistic.query)

## End(Not run)
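
The mutant_variant_classification argument decides which variant classes count when a sample is flagged as mutant for a gene. The base R sketch below illustrates that idea on a small made-up table. It is not the TCGAbiolinks implementation: the toy data, the helper flag_mutant() and the sample names are hypothetical, and the column names (Hugo_Symbol, Tumor_Sample_Barcode, Variant_Classification) are assumed to follow the usual MAF convention.

```r
# Toy MAF-like table (made-up values, for illustration only).
maf <- data.frame(
  Hugo_Symbol = c("TP53", "TP53", "KRAS", "TP53"),
  Tumor_Sample_Barcode = c("S1", "S2", "S1", "S3"),
  Variant_Classification = c("Missense_Mutation", "Silent",
                             "Nonsense_Mutation", "Frame_Shift_Del"),
  stringsAsFactors = FALSE
)

# Same classes as the documented default of mutant_variant_classification.
mutant_classes <- c("Frame_Shift_Del", "Frame_Shift_Ins", "Missense_Mutation",
                    "Nonsense_Mutation", "Splice_Site", "In_Frame_Del",
                    "In_Frame_Ins", "Translation_Start_Site",
                    "Nonstop_Mutation")

# flag_mutant() is a hypothetical helper, not part of TCGAbiolinks.
# It returns a named logical vector: TRUE if the sample has at least one
# variant in `gene` whose classification is listed in `classes`.
flag_mutant <- function(maf, gene, samples, classes) {
  hits <- maf$Hugo_Symbol == gene & maf$Variant_Classification %in% classes
  stats::setNames(samples %in% maf$Tumor_Sample_Barcode[hits], samples)
}

samples <- c("S1", "S2", "S3", "S4")
tp53 <- flag_mutant(maf, "TP53", samples, mutant_classes)
print(tp53)
```

In this toy table, S2 carries only a Silent variant in TP53 and S4 has no variant at all, so neither is flagged. Passing a narrower vector of classes changes which samples are reported as mutant.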