processPathFile {PloGO2} | R Documentation |
For each pathway extract all ID's from the file. If abundance data is present extract and merge.
processPathFile(fname, AnnotIDlist, datafile=NULL, datafile.ignore.cols=1, format=c("compact","long"), aggregateFun="sum")
fname |
The pathway file name, in either Wego native format or long format |
AnnotIDlist |
The list of pathway annotation ID |
datafile |
The file containing abundance or NULL if none. |
datafile.ignore.cols |
How many columns in the abundance file to ignore. By default assume the first only, containing identifiers. |
format |
Either "compact" or "long"; see details |
aggregateFun |
The aggregation function for abundance data |
The format is "compact" by default, meaning a text file containing ID's in the first column, and pathway identifiers in the second, separated by spaces or semicolons. This is the same as the "Wego native format". A "long" format is also accepted, meaning a text file with two or more columns separated by spaces, containing an identifier, followed by a pathway id, followed optionally by other columns which are ignored. The pathway id's will first be aggregated for each identifier.
A list with the following components
counts |
A vector of the same length as the list of pathway id's of interest giving the number of ID's in each category |
ID.list |
The list of ID's for each pathway category |
datafile |
The abundance datafile provided passed through |
abundance |
A matrix with as many rows as the pathway list provided, and as many columns as the abundance data file |
N |
The number of protein (gene etc) identifiers in each file |
fname |
The filename without the file path |
J. Wu
# use one of the stored files dir <- system.file("files", package="PloGO2") fname <- paste(dir,"PWFiles/AllData.txt", sep="/") datafile <- paste(dir, "Abundance_data.csv", sep="/") AnnotIDlist <- unique(unlist(sapply(read.delim(fname, stringsAsFactors=FALSE)[,2], function(x) strsplit(x, split=" ")[[1]]))) # or if abundance in present aggregate that by category processPathFile(fname, AnnotIDlist, datafile=datafile)