cBioDataPack {cBioPortalData} | R Documentation |
The cBioDataPack function allows the user to download and process cancer study datasets found in MSKCC's cBioPortal.
Output datasets use the MultiAssayExperiment data representation to facilitate analysis and data management operations.
cBioDataPack( cancer_study_id, use_cache = TRUE, names.field = c("Hugo_Symbol", "Entrez_Gene_Id", "Gene"), cleanup = TRUE, ask = TRUE )
cancer_study_id |
character(1) The study identifier from cBioPortal as in https://cbioportal.org/webAPI |
use_cache |
logical(1) (default TRUE) create the default cache location and use it to track downloaded data. If data are found in the cache, they will not be re-downloaded. A path to the data cache location can also be provided. |
names.field |
A character vector of possible column names for the column that is used to label ranges from a mutations or copy number file. |
cleanup |
logical(1) whether to delete the |
ask |
A logical vector of length one indicating whether to prompt the user before downloading and loading study |
The full list of study identifiers (studyIds) can be obtained from getStudies(). Currently, only about 72% of datasets can be represented as MultiAssayExperiment data objects from the data tarballs. Refer to getStudies(..., buildReport = TRUE) and its "pack_build" column to see which study identifiers are not building. Users who would like to prioritize particular datasets should open GitHub issues at the URL in the DESCRIPTION file. For a more fine-grained approach to downloading data from the cBioPortal API, refer to the cBioPortalData function.
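The "pack_build" check described above can be sketched in R. This is a minimal sketch, not the package's own code: the helper name non_building_studies is hypothetical, and it assumes the report is a data frame with a "studyId" column and a logical "pack_build" column (the column type is an assumption). A small mock table with invented values stands in for the result of getStudies(cbio, buildReport = TRUE) so that the snippet runs without network access.

```r
# Hypothetical helper: list study identifiers whose data packs do not build.
# Assumes `studies` has a character "studyId" column and a logical
# "pack_build" column (the logical type is an assumption).
non_building_studies <- function(studies) {
    stopifnot(all(c("studyId", "pack_build") %in% colnames(studies)))
    # Treat a missing build status as "not building".
    builds <- !is.na(studies[["pack_build"]]) & studies[["pack_build"]]
    studies[["studyId"]][!builds]
}

# Mock report standing in for getStudies(cbio, buildReport = TRUE);
# the identifiers and build statuses are invented for illustration only.
mock_report <- data.frame(
    studyId = c("study_a", "study_b", "study_c"),
    pack_build = c(TRUE, FALSE, NA),
    stringsAsFactors = FALSE
)

failing <- non_building_studies(mock_report)
print(failing)
```

With a live connection, the same helper could be applied to the table returned by getStudies() with buildReport = TRUE, provided the column assumptions hold.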
A MultiAssayExperiment object
The cBioDataPack function accesses data from the cBio_URL option. By default, it points to an Amazon S3 bucket location. Previously, it pointed to 'http://download.cbioportal.org'. This recent change (> 2.1.17) should provide faster and more reliable downloads for all users. See the URL using cBioPortalData:::.url_location. This can be changed if there are mirrors that host this data by setting the cBio_URL option
with options(cBio_URL = "https://some.url.com/") before running the function.
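Setting and restoring the option can be sketched with base R alone. In base R, options() sets a global option and getOption() reads it back. The mirror URL below is the placeholder from the text, not a real mirror, and the snippet does not download anything.

```r
# Set the cBio_URL option to a placeholder mirror (not a real host),
# keeping the previous settings so they can be restored afterwards.
old_opts <- options(cBio_URL = "https://some.url.com/")

# getOption() only reads the value; it does not set it.
current_url <- getOption("cBio_URL")
print(current_url)

# ... a call to cBioDataPack() would go here ...

# Restore whatever was set before (including "unset").
options(old_opts)
restored_url <- getOption("cBio_URL")
```

Saving the return value of options() and passing it back is a common way to keep a temporary mirror setting from leaking into the rest of the session.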
Levi Waldron, Marcel R., Ino dB.
https://www.cbioportal.org/datasets, cBioPortalData
cbio <- cBioPortal()
head(getStudies(cbio)[["studyId"]])
# ask=FALSE for non-interactive use
mae <- cBioDataPack("acc_tcga", ask = FALSE)