replace_missing_data {PrInCE}    R Documentation

Replace missing data with median ± random noise

Description

Replace missing data within each numeric column of a data frame with the column median, plus or minus some random noise. This makes it possible to train classifiers that cannot readily handle missing values (e.g. random forests or support vector machines).

Usage

replace_missing_data(dat, noise_pct = 0.05)

Arguments

dat

the data frame to replace missing data in

noise_pct

the standard deviation of the normal distribution from which the added noise is drawn, expressed as a fraction of the standard deviation of the non-missing values in each column (the default of 0.05 corresponds to 5%)
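
To make the role of noise_pct concrete, the following is a minimal base R sketch for a single numeric vector. It is an illustration of the behaviour described here, not the package's source code; the vector x and the variable names are hypothetical, and it assumes the noise is zero-centred and drawn with rnorm.

```r
# Illustrative sketch only: impute one numeric vector as described above.
set.seed(1)                                   # make the random noise reproducible
x <- c(2.0, 4.0, NA, 6.0, 8.0, NA)            # hypothetical column with two missing values
noise_pct <- 0.05                             # same default as in the Usage section

missing <- is.na(x)                           # positions to fill in
noise_sd <- noise_pct * sd(x, na.rm = TRUE)   # noise SD scaled by the SD of observed values

# Median of the observed values, plus zero-centred normal noise per missing entry
x[missing] <- median(x, na.rm = TRUE) +
  rnorm(sum(missing), mean = 0, sd = noise_sd)
```

With a small noise_pct the imputed values stay close to the median of the observed values while still differing slightly from each other.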

Value

a data frame with missing values in each numeric column replaced by the column median, plus or minus some random noise
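
Examples

A minimal usage sketch, assuming the PrInCE package is installed and that replace_missing_data is exported from its namespace. The toy data frame and its column names are made up for illustration, and the call is skipped if the package is unavailable.

```r
# Hypothetical toy data: two numeric feature columns and one character column
dat <- data.frame(
  score_a = c(0.1, NA, 0.4, 0.8, NA),
  score_b = c(1.2, 3.4, NA, 5.6, 7.8),
  label   = c("TP", "FP", "TP", "FP", "TP"),
  stringsAsFactors = FALSE
)

# Only run the call when PrInCE is available
if (requireNamespace("PrInCE", quietly = TRUE)) {
  set.seed(42)  # the added noise is random, so fix the seed for reproducibility
  imputed <- PrInCE::replace_missing_data(dat, noise_pct = 0.05)
  print(imputed)
}
```

Because the noise is random, repeated calls without a fixed seed would be expected to give slightly different imputed values.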


[Package PrInCE version 1.6.0 Index]