simInheritance {methInheritSim} | R Documentation |
Simulate a multigenerational methylation experiment with inheritance
Description
Simulate a multigenerational methylation case versus control
experiment with inheritance relations, using a real control dataset.
The simulation can be parametrized to fit different models. Adjustable
parameters include the number of cases and controls, the proportion of
cases affected by the treatment (penetrance), the effect of the treatment
on the mean of the distribution, the proportion of sites inherited, and the
proportion of differentially methylated sites inherited from the preceding
generation.
The function simulates a multigenerational dataset resembling the output of
a bisulfite sequencing experiment. The simulated data include control and
case information for each generation.
Usage
simInheritance(pathOut, pref, k, nbCtrl, nbCase, treatment, sample.id,
generation, stateInfo, propDiff, propDiffsd, diffValue, propInheritance,
rateDiff, minRate, propInherite, propHetero, minReads, maxPercReads, context,
assembly, meanCov, diffRes, saveGRanges, saveMethylKit, runAnalysis)
Arguments
pathOut |
a string of character or NULL, the path where the
files created by the function will be saved. When NULL, the files
are saved in the current directory.
|
pref |
a string of character representing the parameters of a
specific simulation. The string is composed of the following elements,
separated by "_":
- a fileID
- the chromosome number, a number between 1 and nbSynCHR
- the number of samples, a number in the vNbSample vector
- the mean proportion of samples that have, for a specific position,
differentially methylated values, a number in the vpDiff vector
- the proportion of C/T for a differentially methylated case that follows
a shifted beta distribution, a number in the vDiff vector
- the proportion of cases that inherit differentially methylated sites,
a number in the vInheritance vector
|
k |
a positive integer, an ID for the current simulation.
|
nbCtrl |
a positive integer, the number of controls.
|
nbCase |
a positive integer, the number of cases.
|
treatment |
a vector of integer denoting controls and cases. The
vector length must correspond to the sum of cases and controls.
|
sample.id |
a matrix containing the name of each sample for each generation (row)
and each case and control (column).
|
generation |
a positive integer, the number of generations
simulated.
|
stateInfo |
a GRanges that contains the CpG (or methylated) sites.
The GRanges has four metadata columns from the real dataset:
- chrOri a numeric, the chromosome from the real dataset
- startOri a numeric, the position of the site in the real dataset
- meanCTRL a numeric, the mean of the control in the real dataset
- varCTRL a numeric, the variance of the control in the real dataset
|
propDiff |
a double greater than 0 and less than or equal
to 1, the mean value for the proportion of samples that will have,
for a specific position, differentially methylated values. It can be
interpreted as the penetrance.
|
propDiffsd |
a non-negative double, the
standard deviation associated with vpDiff. Note that
vpDiff and vpDiffsd must be the same length.
|
diffValue |
a non-negative double
in [0,1], the proportion of C/T for a differentially methylated case,
which follows a beta distribution whose mean is shifted by vDiff
from the CTRL distribution.
|
propInheritance |
a non-negative double
in [0,1], the proportion of cases
that inherit differentially methylated sites.
|
rateDiff |
a positive double less than 1, the mean of
the chance that a site is differentially methylated.
|
minRate |
a non-negative double less than 1, the
minimum rate for differentially methylated sites.
Default: 0.01.
|
propInherite |
a non-negative double less than or equal
to 1,
the proportion of differentially methylated regions that
are inherited.
|
propHetero |
a non-negative double in [0,1], the
reduction of vDiff for the second and following generations.
|
minReads |
a positive integer, sites and regions having lower
coverage than this count are discarded. The parameter
corresponds to the lo.count parameter in
the methylKit package.
|
maxPercReads |
a double in [0,100], the percentile of read
counts that is used as the upper cutoff. Sites and regions
having coverage above this percentile are discarded. This parameter
is used for both CpG sites and tiles analysis. The parameter
corresponds to the hi.perc parameter in the methylKit package.
|
context |
a string of character, the short description of the
methylation context, such as "CpG", "CpH", "CHH", etc.
|
assembly |
a string of character, the short description of the
genome assembly, such as "mm9", "hg18", etc.
|
meanCov |
a positive integer, the mean of the coverage
at the simulated CpG sites.
|
diffRes |
a list with 2 entries:
- stateDiff a vector of integer (0 and 1) with length corresponding
to the length of stateInfo. The vector indicates, using a 1, the
positions where the CpG sites are differentially methylated.
- stateInherite a vector of integer (0 and 1) with length corresponding
to the length of stateInfo. The vector indicates, using a 1, the
positions where the CpG values are inherited.
When NULL, new ones are generated with getDiffMeth.
|
saveGRanges |
a logical, when TRUE, the package saves two
types of files. The first, generated for each simulation, contains a list.
The length of the list corresponds to the number of generations.
The generations are stored in order (first entry = first generation,
second entry = second generation, etc.). All samples related to one
generation are contained in a GRangesList.
The GRangesList stores a list of GRanges. Each
GRanges stores the raw methylation data of one sample.
The second file contains a numeric vector denoting controls and cases
(a file is generated for each entry in the vector parameter
vNbSample).
|
saveMethylKit |
a logical, when TRUE, the package saves
a file that contains a list. The length of the
list corresponds to the number of generations. The generations are
stored in order (first entry = first generation,
second entry = second generation, etc.). All samples related to one
generation are contained in a S4 methylRawList object. The
methylRawList object contains two slots:
1. treatment: A numeric vector denoting controls and cases.
2. .Data: A list of methylRaw objects. Each object stores the
raw methylation data of one sample.
|
runAnalysis |
a logical, if TRUE, two files are saved:
1. The first file is the methylObj... file formatted
with the methylKit package in a S4 methylBase
object (with the methylKit
functions: filterByCoverage, normalizeCoverage and
unite).
2. The second file contains a S4 methylDiff object
generated with the methylKit function calculateDiffMeth
using the first file.
|
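The argument descriptions above can be illustrated with a short base R sketch that assembles some of the inputs by hand. It builds a pref string matching the five-field value used in the Examples section, a treatment vector, a sample.id matrix and a diffRes list. The 0/1 coding of treatment (0 for controls, 1 for cases), the sample naming scheme and every value below are assumptions made for illustration; they are not taken from the package source.

```r
## All values below are hypothetical and mirror the Examples section.
fileID          <- "S1"
nbPerGroup      <- 6L     # assumed: same number of controls and cases
nbCtrl          <- nbPerGroup
nbCase          <- nbPerGroup
generation      <- 3L
propDiff        <- 0.9
diffValue       <- 0.8
propInheritance <- 0.5

## pref: fields joined by "_" (five fields, as in the Examples section)
pref <- paste(fileID, nbPerGroup, propDiff, diffValue, propInheritance,
              sep = "_")

## treatment: one integer per sample; 0 = control, 1 = case (assumed coding)
treatment <- c(rep(0L, nbCtrl), rep(1L, nbCase))

## sample.id: one row per generation, one column per sample
## (the "F<generation>_<sample>" naming scheme is made up for this sketch)
sample.id <- outer(seq_len(generation), seq_len(nbCtrl + nbCase),
                   function(g, s) paste0("F", g, "_", s))

## diffRes: two 0/1 vectors, one value per site in stateInfo
nbSites <- 3L
diffRes <- list(stateDiff     = c(1L, 0L, 1L),
                stateInherite = c(1L, 0L, 0L))

## Basic consistency checks derived from the argument descriptions
stopifnot(length(treatment) == nbCtrl + nbCase,
          nrow(sample.id) == generation,
          ncol(sample.id) == nbCtrl + nbCase,
          length(diffRes$stateDiff) == nbSites,
          length(diffRes$stateInherite) == nbSites)
```

Checking the lengths before calling the function is a cheap way to catch a mismatch between treatment, sample.id, diffRes and stateInfo early.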
Value
0
indicating that the function has been successful.
Author(s)
Pascal Belleau, Astrid Deschenes
Examples
## Name of the directory that will contain the generated files
temp_dir <- "test_simInheritance"
## Load dataset
data(dataSimExample)
## Generate a stateDiff object with length corresponding to
## nbBlock * nbCpG from stateInformation
stateDiff <- list()
stateDiff[["stateDiff"]] <- c(1, 0, 1)
stateDiff[["stateInherite"]] <- c(1, 0, 0)
## Simulate multigenerational methylation experiment with inheritance
methInheritSim:::simInheritance(pathOut = temp_dir,
    pref = "S1_6_0.9_0.8_0.5", k = 1, nbCtrl = 6, nbCase = 6,
    treatment = dataSimExample$treatment,
    sample.id = dataSimExample$sample.id,
    generation = 3, stateInfo = dataSimExample$stateInfo[1:3],
    propDiff = 0.9, propDiffsd = 0.1, diffValue = 0.8,
    propInheritance = 0.5, rateDiff = 0.3, minRate = 0.3,
    propInherite = 0.3, propHetero = 0.5, minReads = 10, maxPercReads = 99,
    assembly = "RNOR_5.0", context = "CpG", meanCov = 40, diffRes = stateDiff,
    saveGRanges = FALSE, saveMethylKit = FALSE, runAnalysis = FALSE)
## Delete directory
if (dir.exists(temp_dir)) {
    unlink(temp_dir, recursive = TRUE, force = FALSE)
}
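The coverage thresholds controlled by minReads and maxPercReads can be illustrated on a plain vector, independently of the package. The base R sketch below is not the methylKit implementation: the coverage values are made up, and maxPercReads is set to 90 rather than 99 so that the upper cutoff has a visible effect on only ten values.

```r
## Hypothetical per-site read coverage for one sample
coverage <- c(3L, 12L, 25L, 40L, 41L, 38L, 500L, 9L, 44L, 36L)

minReads     <- 10L   # lower cutoff, expressed as a read count
maxPercReads <- 90    # upper cutoff, expressed as a percentile in [0,100]

## Convert the percentile into a coverage value
## (default quantile type; the exact method used by methylKit may differ)
hiCut <- quantile(coverage, probs = maxPercReads / 100, names = FALSE)

## Keep sites whose coverage lies between the two cutoffs
keep <- coverage >= minReads & coverage <= hiCut
filteredCoverage <- coverage[keep]
```

Here the two low-coverage sites and the single very high-coverage site are dropped, which is the behaviour the two arguments are meant to provide.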
[Package methInheritSim version 1.16.0]