SELEX {SELEX} | R Documentation |
Functions to assist in discovering transcription factor DNA binding specificities from SELEX-seq experimental data, following the approach of Slattery et al. (2011). For a more comprehensive example, see the package vignette. Sample data used in Slattery et al. is stored in the extdata folder of the package and can be accessed using either the base R function system.file or the package function selex.exampledata.
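As a minimal sketch of the system.file route described above, the snippet below locates the extdata folder and lists its contents. It assumes the SELEX package is installed; system.file returns an empty string otherwise, so the listing is guarded. The variable names are illustrative only.

```r
# Locate the extdata folder shipped with the SELEX package.
# system.file() returns "" when the package or folder cannot be found.
extdataDir <- system.file("extdata", package = "SELEX")

if (nzchar(extdataDir)) {
  # Show the bundled sample files (names depend on the installed version)
  print(list.files(extdataDir))
} else {
  message("SELEX extdata folder not found; is the package installed?")
}
```

The package function selex.exampledata offers the alternative route: it is called with a target directory, as in the example at the end of this page.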
Functions available:
selex.affinities | Construct a K-mer affinity table |
selex.config | Set SELEX system parameters |
selex.counts | Construct or retrieve a K-mer count table |
selex.countSummary | Summarize available K-mer count tables |
selex.defineSample | Define annotation for an individual sample |
selex.exampledata | Extract example data files |
selex.fastqPSFM | Construct a diagnostic PSFM for a FASTQ file |
selex.getAttributes | Display sample handle attributes |
selex.getRound0 | Obtain round zero sample handle |
selex.getSeqfilter | Display sequence filter attributes |
selex.infogain | Compute or retrieve information gain between rounds |
selex.infogainSummary | Summarize available information gain values |
selex.jvmStatus | Display current JVM memory usage |
selex.kmax | Calculate kmax for a dataset |
selex.kmerPSFM | Construct a PSFM from a K-mer table |
selex.loadAnnotation | Load a sample annotation file |
selex.mm | Build or retrieve a Markov model |
selex.mmProb | Compute prior probability of sequence using Markov model |
selex.mmSummary | Summarize Markov model properties |
selex.revcomp | Create forward-reverse complement data pairs |
selex.run | Run a standard SELEX analysis |
selex.sample | Create a sample handle |
selex.sampleSummary | Show samples visible to the current SELEX session |
selex.saveAnnotation | Save sample annotations to file |
selex.seqfilter | Create a sequence filter |
selex.setwd | Set or change the working directory |
selex.split | Randomly split a dataset |
selex.summary | Display all count table, Markov model, and information gain summaries |
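To make the notion of a K-mer count table concrete, here is a small base R illustration that counts every length-k window in a few toy reads. This is only a sketch of the idea, not the package's implementation: the helper countKmers, the toy reads, and the column names are all hypothetical, and selex.counts may differ in its details.

```r
# Toy reads; hypothetical data, not taken from the package
reads <- c("ACGTACGT", "ACGTTTGA", "TTGAACGT")

# Hypothetical helper: tally every K-mer of length k using a sliding window
countKmers <- function(reads, k) {
  kmers <- unlist(lapply(reads, function(r) {
    nWindows <- max(0, nchar(r) - k + 1)   # reads shorter than k contribute nothing
    starts <- seq_len(nWindows)
    substring(r, starts, starts + k - 1)
  }))
  tab <- sort(table(kmers), decreasing = TRUE)
  data.frame(Kmer = names(tab),
             ObservedCount = as.integer(tab),
             stringsAsFactors = FALSE)
}

kmerTable <- countKmers(reads, k = 4)
print(head(kmerTable))
```

Each 8-base read contributes five 4-mers, so the counts in this toy table sum to 15.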
Package: | SELEX |
Type: | Package |
Version: | .99 |
Date: | 2014-11-3 |
License: | GPL |
Chaitanya Rastogi, Dahong Liu, and Harmen Bussemaker
Maintainer: Harmen Bussemaker hjb2004@columbia.edu
Slattery, M., Riley, T.R., Liu, P., Abe, N., Gomez-Alcala, P., Dror, I., Zhou, T., Rohs, R., Honig, B., Bussemaker, H.J., and Mann, R.S. (2011) Cofactor binding evokes latent differences in DNA binding specificity between Hox proteins. Cell 147:1270–1282.
Riley, T.R., Slattery, M., Abe, N., Rastogi, C., Liu, D., Mann, R.S., and Bussemaker, H.J. (2014) SELEX-seq: a method for characterizing the complete repertoire of binding site preferences for transcription factor complexes. Methods Mol. Biol. 1196:255–278.
# Initialize the SELEX package
#options(java.parameters="-Xmx1500M")
#library(SELEX)

# Configure the current session
workDir = file.path(".", "SELEX_workspace")
selex.config(workingDir=workDir, verbose=FALSE, maxThreadNumber=4)

# Extract sample data from package, including XML database
sampleFiles = selex.exampledata(workDir)

# Load & display all sample files using XML database
selex.loadAnnotation(sampleFiles[3])
selex.sampleSummary()

# Create sample handles
r0 = selex.sample(seqName="R0.libraries", sampleName="R0.barcodeGC", round=0)
r2 = selex.sample(seqName='R2.libraries', sampleName='ExdHox.R2', round=2)

# Split the r0 sample into testing and training sets
r0.split = selex.split(sample=r0)
r0.split

# Display all currently loaded samples
selex.sampleSummary()

# Find kmax on the test dataset
k = selex.kmax(sample=r0.split$test)

# Build the Markov model on the training dataset
mm = selex.mm(sample=r0.split$train, order=NA, crossValidationSample=r0.split$test)

# See Markov model R^2 values
selex.mmSummary()

# Kmer counting with an offset
t1 = selex.counts(sample=r2, k=2, offset=14, markovModel=NULL)

# Kmer counting with a Markov model (produces expected counts)
t2 = selex.counts(sample=r2, k=4, markovModel=mm)

# Display all available kmer statistics
selex.countSummary()

# Calculate information gain
ig = selex.infogain(sample=r2, k=8, mm)

# View information gain results
selex.infogainSummary()

# Perform the default analysis
selex.run(trainingSample=r0.split$train, crossValidationSample=r0.split$test, infoGainSample=r2)

# View all stats
selex.summary()
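The commented-out initialization lines in the example set the JVM memory limit before the package is loaded. A minimal sketch of that ordering follows. It assumes SELEX is installed (the load is guarded) and that the Java heap option only takes effect if it is set before the JVM starts, which is the usual behavior for Java-backed R packages. The 1500M value is the one shown in the example, not a recommendation.

```r
# Set the Java heap limit first; once the JVM is running the option is ignored
options(java.parameters = "-Xmx1500M")

# Load SELEX only if it is available, then report current JVM memory usage
if (requireNamespace("SELEX", quietly = TRUE)) {
  library(SELEX)
  selex.jvmStatus()
} else {
  message("SELEX is not installed; skipping JVM status check.")
}
```

If the reported memory looks too small for a dataset, restarting the R session and raising the -Xmx value before loading the package is the typical remedy.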