pengls {pengls}    R Documentation

Iterative estimation of penalised generalised least squares

Description

Fits a penalised generalised least squares model by iterating between glmnet (penalised regression coefficients) and nlme::gls (correlation structure).

Usage

pengls(
  data,
  glsSt,
  xNames,
  outVar,
  corMat,
  lambda,
  foldid,
  cvType = c("random", "blocked"),
  maxIter = 30,
  tol = 0.05,
  verbose = FALSE,
  optControl = lmeControl(opt = "optim", maxIter = 500, msVerbose = verbose,
    msMaxIter = 500, niterEM = 1000, msMaxEval = 1000),
  nfolds = 10,
  ...
)

Arguments

data

A data matrix or data frame

glsSt

A correlation structure, as supplied to the "correlation" argument of nlme::gls

xNames

names of the regressors in data

outVar

name of the outcome variable in data

corMat

A starting value for the correlation matrix. Taken to be a diagonal matrix if missing

lambda

The penalty value for glmnet. If missing, the optimal value found by vanilla glmnet, fitted without an autocorrelation component, is used

foldid

An optional vector defining the fold of each observation

cvType

A character string defining the type of cross-validation, either "random" or "blocked". Ignored if foldid is provided

maxIter

maximum number of iterations between glmnet and gls

tol

A convergence tolerance

verbose

a boolean, should output be printed?

optControl

Control arguments, passed on to the control argument of nlme::gls

nfolds

an integer, the number of folds used in cv.glmnet to find lambda

...

Further arguments, passed on to glmnet::glmnet

Details

This function does not provide cross-validation. It fits the model for the lambda penalty value provided or, if none is given, for the optimal lambda value of the vanilla glmnet. Cross-validation of the fitted model needs to be implemented by the user. Because the data exhibit autocorrelation, this may need to be blocked cross-validation or some other dedicated method.
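
As an illustration of the difference between random and blocked folds, the sketch below builds both kinds of fold vector in base R. The helper makeFolds is hypothetical and not part of pengls; it assumes the observations are already sorted along the autocorrelation dimension (for example, time), so that contiguous indices form a block.

```r
# Hypothetical helper (not part of pengls): assign n observations to folds.
# Assumes observations are ordered along the autocorrelation dimension.
makeFolds <- function(n, nfolds, type = c("random", "blocked")) {
  type <- match.arg(type)
  if (type == "blocked") {
    # Contiguous blocks: neighbouring observations share a fold
    as.integer(ceiling(seq_len(n) * nfolds / n))
  } else {
    # Random folds of (almost) equal size
    sample(rep(seq_len(nfolds), length.out = n))
  }
}

set.seed(1)
blockedFolds <- makeFolds(50, nfolds = 5, type = "blocked")
randomFolds <- makeFolds(50, nfolds = 5, type = "random")
table(blockedFolds) # Fold sizes
```

A vector of this shape, one fold label per observation, is what the foldid argument expects. In a user-written cross-validation loop, each fold would be held out in turn while the model is fitted on the remaining observations.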

Value

A list with components

corMat

The square root of the inverse correlation matrix

Coef

The coefficients of the correlation object
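
To illustrate what a square root of the inverse correlation matrix does, the sketch below builds a small AR(1) correlation matrix and checks that pre-multiplying by its inverse square root removes the correlation. The values of nObs and rho are illustrative, and the use of the symmetric (eigendecomposition-based) root is an assumption; this page does not state which factorisation pengls uses internally.

```r
# AR(1) correlation matrix for 5 equally spaced time points (illustrative values)
nObs <- 5
rho <- 0.5
Sigma <- rho^abs(outer(seq_len(nObs), seq_len(nObs), "-"))

# Symmetric inverse square root via the eigendecomposition
# (assumption: pengls may use a different factorisation)
eig <- eigen(Sigma, symmetric = TRUE)
SigmaInvSqrt <- eig$vectors %*% diag(1 / sqrt(eig$values)) %*% t(eig$vectors)

# Transforming by this matrix yields uncorrelated errors (identity matrix)
whitened <- SigmaInvSqrt %*% Sigma %*% t(SigmaInvSqrt)
all.equal(whitened, diag(nObs))
```

Multiplying both the outcome and the design matrix by such a matrix turns a generalised least squares problem into an ordinary one, which is what allows a standard penalised regression to be applied to the transformed data.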

Examples

### Example 1: spatial data
library(nlme)
# Define the dimensions of the data
n <- 50 # Sample size
p <- 100 # Number of features
g <- 10 # Size of the grid
#Generate grid
Grid <- expand.grid("x" = seq_len(g), "y" = seq_len(g))
# Sample points from grid without replacement
GridSample <- Grid[sample(nrow(Grid), n, replace = FALSE),]
#Generate outcome and regressors
b <- matrix(rnorm(p * n), n, p)
a <- rnorm(n, mean = b %*% rbinom(p, size = 1, prob = 0.2)) # 20% signal
# Compile to a data frame
df <- data.frame("a" = a, "b" = b, GridSample)
# Define the correlation structure (see ?nlme::gls), with initial nugget 0.5 and range 5
corStruct <- corGaus(form = ~ x + y, nugget = TRUE, value = c("range" = 5, "nugget" = 0.5))
# Fit the pengls model; for simplicity, lambda is not supplied and is chosen internally
penglsFit <- pengls(data = df, outVar = "a",
    xNames = grep("^b", names(df), value = TRUE),
    glsSt = corStruct, nfolds = 5)

### Example 2: timecourse data
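# Reuse outcome and regressors from Example 1, now indexed by time
dfTime <- data.frame("a" = a, "b" = b, "t" = seq_len(n))
# AR(1) correlation structure with initial autocorrelation 0.5
corStructTime <- corAR1(form = ~ t, value = 0.5)
penglsFitTime <- pengls(data = dfTime, outVar = "a",
    xNames = grep("^b", names(dfTime), value = TRUE),
    glsSt = corStructTime, nfolds = 5)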

[Package pengls version 1.0.0]