mvrnorm_sim {microbiomeDASim} | R Documentation |
This function is used in the gen_norm_microbiome call when the user specifies the method as mvrnorm.
mvrnorm_sim(n_control, n_treat, control_mean, sigma, num_timepoints, rho,
  corr_str = c("ar1", "compound", "ind"),
  func_form = c("linear", "quadratic", "cubic", "M", "W", "L_up", "L_down"),
  beta, IP = NULL, missing_pct, missing_per_subject, miss_val = NA,
  dis_plot = FALSE, plot_trend = FALSE, zero_trunc = TRUE)
n_control |
integer value specifying the number of control individuals |
n_treat |
integer value specifying the number of treated individuals |
control_mean |
numeric value specifying the mean value for control subjects. all control subjects are assumed to have the same population mean value. |
sigma |
numeric value specifying the global population standard deviation for both control and treated individuals. |
num_timepoints |
integer value specifying the number of timepoints per subject. |
rho |
value for the correlation parameter. must be between [0, 1].
see |
corr_str |
correlation structure selected. see
|
func_form |
character value specifying the functional form for the
longitudinal mean trend. see |
beta |
vector value specifying the parameters for the differential
abundance function. see |
IP |
vector specifying any inflection points. depends on the type of
functional form specified. see |
missing_pct |
numeric value that must be between [0, 1] that specifies what proportion of the individuals will have missing values. |
missing_per_subject |
integer value specifying how many observations per
subject should be dropped. note that we assume that all individuals must
have a baseline value, meaning that the maximum number of
missing_per_subject is equal to num_timepoints - 1. |
miss_val |
value used to induce missingness from the simulated data. by default missing values are assumed to be NA but other common choices include 0. |
dis_plot |
logical argument on whether to plot the simulated data or not. by default plotting is turned off. |
plot_trend |
specifies whether to plot the true mean trend. see
|
zero_trunc |
logical indicator designating whether simulated outcomes should be zero truncated. default is set to TRUE. |
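The three corr_str options control how sigma and rho combine into the covariance of one subject's repeated measurements. The base R sketch below uses the textbook definitions of AR(1), compound symmetry, and independence. The helper name subject_cov is hypothetical and is not part of microbiomeDASim, and it is an assumption that the package uses the same parameterization (sigma^2 times a correlation matrix).

```r
# Hypothetical helper (not a microbiomeDASim function): covariance block for
# one subject under the standard definition of each correlation structure.
subject_cov <- function(num_timepoints, sigma, rho,
                        corr_str = c("ar1", "compound", "ind")) {
  corr_str <- match.arg(corr_str)
  t_idx <- seq_len(num_timepoints)
  R <- switch(corr_str,
    # AR(1): correlation decays as rho^|i - j|
    ar1 = rho^abs(outer(t_idx, t_idx, "-")),
    # compound symmetry: every pair of timepoints has correlation rho
    compound = matrix(rho, num_timepoints, num_timepoints) +
      diag(1 - rho, num_timepoints),
    # independence: no correlation between timepoints
    ind = diag(1, num_timepoints)
  )
  sigma^2 * R
}

ar1_block      <- subject_cov(3, sigma = 1, rho = 0.5, corr_str = "ar1")
compound_block <- subject_cov(3, sigma = 1, rho = 0.5, corr_str = "compound")
ind_block      <- subject_cov(3, sigma = 2, rho = 0.5, corr_str = "ind")

ar1_block
compound_block
ind_block
```

Under "ar1" the correlation between the first and third timepoints is rho^2, whereas under "compound" it stays at rho for every pair.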
This function returns a list with the following objects:
df - data.frame object with complete outcome Y, subject ID, time, group, and outcome with missing data
Y - vector of complete outcome
Mu - vector of complete mean specifications used during simulation
Sigma - block diagonal symmetric matrix of complete data used during simulation
N - total number of observations
miss_data - data.frame object that lists which IDs and timepoints were randomly selected to induce missingness
Y_obs - vector of outcome with induced missingness
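To illustrate how Mu, Sigma, N, and Y relate to one another, the following base R sketch builds a small block-diagonal covariance matrix and draws one multivariate normal vector from it. It is only an approximation of the idea under stated assumptions: the values are illustrative, the mean is held constant with no treatment effect, and the draw uses a Cholesky factor rather than whatever routine the package calls internally.

```r
set.seed(1)  # reproducible draw

n_subjects     <- 4  # illustrative stand-in for n_control + n_treat
num_timepoints <- 3
sigma          <- 1
rho            <- 0.5

# AR(1) covariance block for a single subject
t_idx <- seq_len(num_timepoints)
block <- sigma^2 * rho^abs(outer(t_idx, t_idx, "-"))

# Block-diagonal Sigma: observations from different subjects are independent
Sigma <- kronecker(diag(n_subjects), block)

# Total number of observations and a constant mean vector
N  <- n_subjects * num_timepoints
Mu <- rep(5, N)

# One draw from N(Mu, Sigma): chol() returns upper-triangular U with
# t(U) %*% U == Sigma, so t(U) %*% z has covariance Sigma
Y <- as.vector(Mu + t(chol(Sigma)) %*% rnorm(N))

dim(Sigma)
length(Y)
```

Entries of Sigma that link two different subjects are zero, which is why inspecting the leading num_timepoints by num_timepoints corner is enough to see the within-subject structure.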
library(microbiomeDASim)

num_subjects_per_group <- 20
sim_obj <- mvrnorm_sim(n_control = num_subjects_per_group,
                       n_treat = num_subjects_per_group,
                       control_mean = 5, sigma = 1, num_timepoints = 5,
                       rho = 0.95, corr_str = "ar1", func_form = "linear",
                       beta = c(0, 0.25), missing_pct = 0.6,
                       missing_per_subject = 2)

# checking the output
head(sim_obj$df)

# total number of observations is 2(num_subjects_per_group)(num_timepoints)
sim_obj$N

# there should be approximately 60% of the IDs with missing observations
length(unique(sim_obj$miss_data$miss_id)) / length(unique(sim_obj$df$ID))

# checking the subject covariance structure
sim_obj$Sigma[seq_len(5), seq_len(5)]
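The roles of missing_pct and missing_per_subject in the example can be sketched in base R without the package. This is an assumed selection scheme, not the package's actual code: it picks a share of subjects at random and, for each, drops timepoints other than the baseline. The column name miss_time is hypothetical; only miss_id appears in the example above.

```r
set.seed(42)

n_subjects          <- 40  # 20 control + 20 treated, as in the example
num_timepoints      <- 5
missing_pct         <- 0.6
missing_per_subject <- 2

# Choose which subjects will have missing observations
n_miss_ids <- round(missing_pct * n_subjects)
miss_id    <- sample(seq_len(n_subjects), n_miss_ids)

# For each chosen subject, drop timepoints other than the baseline (time 1).
# miss_time is a hypothetical column name used only in this sketch.
miss_data <- do.call(rbind, lapply(miss_id, function(id) {
  data.frame(miss_id   = id,
             miss_time = sample(2:num_timepoints, missing_per_subject))
}))

# Share of subjects with at least one missing observation
length(unique(miss_data$miss_id)) / n_subjects
```

Because the baseline is never eligible, at most num_timepoints - 1 observations can be dropped for any one subject in this scheme.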