.iterativeNormForKmers {monaLisa}    R Documentation

Adjust for k-mer composition (multiple iterations)

Description

Runs '.normForKmers' repeatedly until the weights converge. The final weights are used to correct the background sequences for k-mer composition differences relative to the foreground. The procedure closely follows HOMER's normalizeSequence() function in Motif2.cpp. Unlike HOMER, which calls normalizeSequence() one last time after going through all iterations or reaching a low error, this function does not perform that final run.

Usage

.iterativeNormForKmers(
  df,
  maxKmerSize = 3L,
  minSeqWgt = 0.001,
  maxIter = 160L,
  verbose = FALSE
)

Arguments

df

A DataFrame with sequence information, as returned by .calculateGCweight.

maxKmerSize

Integer scalar giving the maximum k-mer size to consider. The default is set to 3 (like in HOMER), meaning that k-mers of size 1, 2 and 3 are considered.

minSeqWgt

Numeric scalar greater than zero giving the minimal weight of a sequence. The default value (0.001) was also used by HOMER (HOMER_MINIMUM_SEQ_WEIGHT constant in Motif2.h).

maxIter

An integer scalar giving the maximum number of times to run .normForKmers. The default is set to 160 (as in HOMER).

verbose

A logical scalar. If TRUE, report on k-mer composition adjustment.
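The roles of minSeqWgt and maxIter in an iterative reweighting of this kind can be illustrated with a small toy sketch in base R. It is not the monaLisa or HOMER code: the helpers randCounts, kmerFreq and toyIterate, the simulated count matrices, the restriction to k = 1, and the stopping rule (stop as soon as the squared error no longer decreases) are all assumptions made for illustration only.

```r
# Toy illustration only; NOT the monaLisa or HOMER implementation.
# All names below (randCounts, kmerFreq, toyIterate) are hypothetical.
set.seed(1)

# Simulated mononucleotide counts (columns A, C, G, T) for 20-bp sequences
randCounts <- function(n, prob) t(rmultinom(n, size = 20, prob = prob))
fg <- randCounts(50,  c(0.2, 0.3, 0.3, 0.2))  # foreground, GC-rich
bg <- randCounts(200, c(0.3, 0.2, 0.2, 0.3))  # background, AT-rich

# Weighted k-mer (here: k = 1) frequencies over a set of sequences
kmerFreq <- function(counts, w) {
  tot <- colSums(counts * w)  # row i of counts is scaled by w[i]
  tot / sum(tot)
}

toyIterate <- function(fg, bg, minSeqWgt = 0.001, maxIter = 160L) {
  w   <- rep(1, nrow(bg))                  # start with equal weights
  fFg <- kmerFreq(fg, rep(1, nrow(fg)))    # target composition
  err <- sum((fFg - kmerFreq(bg, w))^2)    # squared composition error
  errStart <- err
  nIter <- 0L
  for (i in seq_len(maxIter)) {
    # foreground / background frequency ratio per k-mer
    ratio <- fFg / kmerFreq(bg, w)
    # scale each weight by the mean ratio of the k-mers in that sequence
    wNew <- w * as.vector(bg %*% ratio) / rowSums(bg)
    # never let a sequence weight drop below the minimum
    wNew <- pmax(wNew, minSeqWgt)
    errNew <- sum((fFg - kmerFreq(bg, wNew))^2)
    if (errNew >= err) break               # assumed rule: stop when no gain
    w <- wNew
    err <- errNew
    nIter <- i
  }
  list(weights = w, errStart = errStart, errEnd = err, nIter = nIter)
}

res <- toyIterate(fg, bg)
```

In this sketch, res$weights holds one weight per background sequence, and comparing res$errStart with res$errEnd shows how far the weighted background composition has moved toward the foreground. The real function operates on the DataFrame described under Arguments and considers all k-mer sizes up to maxKmerSize.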

Value

a DataFrame containing:


[Package monaLisa version 1.0.0 Index]