Simulation and Extrapolation for Measurement Error Correction

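simexreg corrects a regression object that includes a variable measured with error, using the simulation-extrapolation (SIMEX) method of Cook and Stefanski (1994) for continuous variables and the misclassification SIMEX of Küchenhoff et al. (2006) for categorical variables.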

simexreg(
  reg = NULL,
  formula = NULL,
  data = NULL,
  weights = NULL,
  MEvariable = NULL,
  MEvartype = NULL,
  MEerror = NULL,
  variance = FALSE,
  lambda = c(0.5, 1, 1.5, 2),
  B = 200
)

# S3 method for simexreg
coef(object, ...)

# S3 method for simexreg
vcov(object, ...)

# S3 method for simexreg
sigma(object, ...)

# S3 method for simexreg
formula(x, ...)

# S3 method for simexreg
family(object, ...)

# S3 method for simexreg
predict(object, ...)

# S3 method for simexreg
model.frame(formula, ...)

# S3 method for simexreg
print(x, ...)

# S3 method for simexreg
summary(object, ...)

# S3 method for summary.simexreg
print(x, digits = 4, ...)

# S3 method for simexreg
update(object, ..., evaluate = TRUE)

Arguments

reg	naive regression object. See `Details`.
formula	regression formula
data	new dataset for `reg`
weights	new weights for `reg`
MEvariable	variable measured with error
MEvartype	type of the variable measured with error. Can be `continuous` or `categorical` (first 3 letters are enough).
MEerror	the standard deviation of the measurement error (when `MEvartype` is `continuous`) or the misclassification matrix (when `MEvartype` is `categorical`).
variance	a logical value. If `TRUE`, the variance-covariance matrix of the coefficients is estimated by the jackknife. Default is `FALSE`.
lambda	a vector of lambdas for SIMEX. Default is `c(0.5, 1, 1.5, 2)`.
B	number of simulations for SIMEX. Default is `200`.
object	an object of class `simexreg`
...	additional arguments
x	an object of class `simexreg`
digits	minimal number of significant digits. See print.default.
evaluate	a logical value. If `TRUE`, the updated call is evaluated. Default is `TRUE`.
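
When `MEvartype` is `categorical`, `MEerror` is a misclassification matrix. The glm example under Examples draws the observed category for true category j from column j of the matrix, which suggests that column j holds P(observed = i | true = j) and that each column sums to 1. A minimal sketch of building and checking such a matrix under that assumed convention follows; `is_valid_mc_matrix` is a hypothetical helper written for this illustration and is not part of CMAverse.

```r
# Misclassification matrix for a 3-level variable.
# Assumed convention (read off the glm example): column j holds
# P(observed = i | true = j), so every column sums to 1.
MEerror <- matrix(c(0.80, 0.10, 0.10,
                    0.20, 0.70, 0.10,
                    0.05, 0.25, 0.70), nrow = 3)
dimnames(MEerror) <- list(observed = as.character(1:3),
                          true = as.character(1:3))

# Hypothetical helper, not part of CMAverse: checks that the matrix is
# square, non-negative, and has columns that sum to 1.
is_valid_mc_matrix <- function(m, tol = 1e-8) {
  is.matrix(m) && nrow(m) == ncol(m) &&
    all(m >= 0) && all(abs(colSums(m) - 1) < tol)
}

is_valid_mc_matrix(MEerror)
```

Running a check like this before calling simexreg can catch a matrix that was accidentally entered row-wise, since its transpose generally fails the column-sum test.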

Value

If MEvariable is not in the regression formula, reg is returned. If MEvariable is in the regression formula, an object of class simexreg is returned:

call

the function call,

NAIVEreg

the naive regression object,

a list of MEvariable, MEvartype, MEerror, variance, lambda and B,

RCcoef

coefficient estimates corrected by SIMEX,

RCsigma

the residual standard deviation of a linear regression object corrected by SIMEX,

RCvcov

the var-cov matrix of coefficients corrected by SIMEX,

...

Details

Regression objects fitted by lm, glm (with family gaussian, binomial or poisson), multinom, polr, coxph or survreg are supported.
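
For intuition, the SIMEX idea for a continuous covariate in a linear model can be sketched in base R. This is a minimal illustration and not the code inside simexreg: the simulated data, the quadratic extrapolant, and all variable names are assumptions made for the example. Only the `lambda` grid and `B` mirror the simexreg defaults.

```r
# Hand-rolled SIMEX for one continuous covariate in a linear model.
# Illustrative only: not the implementation used by simexreg().
set.seed(1)
n <- 2000
x_true <- rnorm(n, mean = 2, sd = 1)
sd_me <- 0.5                           # known measurement error SD
w <- x_true + rnorm(n, sd = sd_me)     # error-prone version of x_true
y <- 1 + 4 * x_true + rnorm(n, sd = 2)

lambda <- c(0.5, 1, 1.5, 2)   # same grid as the simexreg() default
B <- 200                      # simulations per lambda

# Simulation step: add extra noise with variance lambda * sd_me^2,
# refit the model, and average the slope over B replicates.
slope_at <- function(lam) {
  mean(replicate(B, {
    w_b <- w + sqrt(lam) * rnorm(n, sd = sd_me)
    coef(lm(y ~ w_b))[["w_b"]]
  }))
}
lam_all <- c(0, lambda)
slopes <- c(coef(lm(y ~ w))[["w"]], vapply(lambda, slope_at, numeric(1)))

# Extrapolation step: fit a quadratic in lambda (an assumed extrapolant)
# and evaluate it at lambda = -1, which corresponds to no measurement error.
extrap <- lm(slopes ~ lam_all + I(lam_all^2))
slope_simex <- unname(predict(extrap, newdata = data.frame(lam_all = -1)))

slope_naive <- slopes[1]
c(naive = slope_naive, simex = slope_simex)
```

In this setup the naive slope is attenuated toward zero, and the extrapolated slope moves back toward the true value of 4. A quadratic extrapolant typically removes most of the bias but not all of it.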

Methods (by generic)

coef(simexreg): Extract coefficients corrected by simexreg
vcov(simexreg): Extract the var-cov matrix of coefficients corrected by simexreg
sigma(simexreg): Extract the residual standard deviation of a linear regression object corrected by simexreg
formula(simexreg): Extract the regression formula
family(simexreg): Extract the family of a regression of class lm or glm
predict(simexreg): Predict with new data
model.frame(simexreg): Extract the model frame
print(simexreg): Print results of simexreg nicely
summary(simexreg): Summarize results of simexreg nicely
update(simexreg): Update simexreg

Functions

print(summary.simexreg): Print summary of simexreg nicely

References

Carroll RJ, Ruppert D, Stefanski LA, Crainiceanu C (2006). Measurement Error in Nonlinear Models: A Modern Perspective, Second Edition. London: Chapman & Hall.

Cook JR, Stefanski LA (1994). Simulation-extrapolation estimation in parametric measurement error models. Journal of the American Statistical Association. 89(428): 1314-1328.

Küchenhoff H, Mwalili SM, Lesaffre E (2006). A general method for dealing with misclassification in regression: the misclassification SIMEX. Biometrics. 62(1): 85-96.

Stefanski LA, Cook JR (1995). Simulation-extrapolation: the measurement error jackknife. Journal of the American Statistical Association. 90(432): 1247-1256.

Examples


if (FALSE) {
rm(list=ls())
library(CMAverse)

# lm
n <- 1000
x1 <- rnorm(n, mean = 5, sd = 3)
x2_true <- rnorm(n, mean = 2, sd = 1)
error1 <- rnorm(n, mean = 0, sd = 0.5)
x2_error <- x2_true + error1
x3 <- rbinom(n, size = 1, prob = 0.4)
y <- 1 + 2 * x1 + 4 * x2_true + 2 * x3  + rnorm(n, mean = 0, sd = 2)
data <- data.frame(x1 = x1, x2_true = x2_true, x2_error = x2_error,
                   x3 = x3, y = y)
reg_naive <- lm(y ~ x1 + x2_error + x3, data = data)
reg_true <- lm(y ~ x1 + x2_true + x3, data = data)
reg_simex <- simexreg(reg = reg_naive, data = data,
                      MEvariable = "x2_error", MEvartype = "con",
                      MEerror = 0.5, variance = TRUE)
coef(reg_simex)
vcov(reg_simex)
sigma(reg_simex)
formula(reg_simex)
family(reg_simex)
predict(reg_simex, newdata = data[1, ])
reg_simex_model <- model.frame(reg_simex)
reg_simex_update <- update(reg_simex, data = data, weights = rep(1, n))
reg_simex_summ <- summary(reg_simex)
                
# glm
n <- 1000
x1 <- rnorm(n, mean = 5, sd = 3)
x2_true <- sample(x = c(1:3), size = n, prob = c(0.2,0.3,0.5), replace = TRUE)
MEerror <- matrix(c(0.8,0.1,0.1,0.2,0.7,0.1,0.05,0.25,0.7), nrow = 3)
x2_error <- x2_true
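# Misclassify each true category j using column j of MEerror.
# Index on x2_true so that already-misclassified values are not redrawn.
for (j in 1:3) {
  idx <- which(x2_true == j)
  x2_error[idx] <- sample(x = 1:3, size = length(idx),
                          prob = MEerror[, j], replace = TRUE)
}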
x2_true <- as.factor(x2_true)
x2_error <- as.factor(x2_error)
x3 <- rnorm(n, mean = 2, sd = 1)
linearpred <- 1 + 0.3 * x1 - 1.5*(x2_true == 2) - 2.5*(x2_true == 3) - 0.2 * x3
py <- exp(linearpred) / (1 + exp(linearpred))
y <- rbinom(n, size = 1, prob = py)
data <- data.frame(x1 = x1, x2_true = x2_true, x2_error = x2_error,
                   x3 = x3, y = y)
reg_naive <- glm(y ~ x1 + x2_error + x3, data = data, family = binomial("logit"))
reg_true <- glm(y ~ x1 + x2_true + x3, data = data, family = binomial("logit"))
reg_simex <- simexreg(reg = reg_naive, data = data,
                      MEvariable = "x2_error", MEvartype = "cat",
                      MEerror = MEerror, variance = TRUE)
}