vignettes/multiple_imputation.Rmd
multiple_imputation.Rmd
This example demonstrates how to estimate causal effects with multiple imputations by using cmest
when the dataset contains missing values. For this purpose, we simulate some data containing a continuous baseline confounder \(C_1\), a binary baseline confounder \(C_2\), a binary exposure \(A\), a binary mediator \(M\) and a binary outcome \(Y\). The true regression models for \(A\), \(M\) and \(Y\) are: \[logit(E(A|C_1,C_2))=0.2+0.5C_1+0.1C_2\] \[logit(E(M|A,C_1,C_2))=1+2A+1.5C_1+0.8C_2\] \[logit(E(Y|A,M,C_1,C_2)))=-3-0.4A-1.2M+0.5AM+0.3C_1-0.6C_2\]
To create the dataset with missing values, we randomly delete 10% of values in the dataset.
set.seed(1)
expit <- function(x) exp(x)/(1+exp(x))
n <- 10000
C1 <- rnorm(n, mean = 1, sd = 0.1)
C2 <- rbinom(n, 1, 0.6)
A <- rbinom(n, 1, expit(0.2 + 0.5*C1 + 0.1*C2))
M <- rbinom(n, 1, expit(1 + 2*A + 1.5*C1 + 0.8*C2))
Y <- rbinom(n, 1, expit(-3 - 0.4*A - 1.2*M + 0.5*A*M + 0.3*C1 - 0.6*C2))
data_noNA <- data.frame(A, M, Y, C1, C2)
missing <- sample(1:(5*n), n*0.1, replace = FALSE)
C1[missing[which(missing <= n)]] <- NA
C2[missing[which((missing > n)*(missing <= 2*n) == 1)] - n] <- NA
A[missing[which((missing > 2*n)*(missing <= 3*n) == 1)] - 2*n] <- NA
M[missing[which((missing > 3*n)*(missing <= 4*n) == 1)] - 3*n] <- NA
Y[missing[which((missing > 4*n)*(missing <= 5*n) == 1)] - 4*n] <- NA
data <- data.frame(A, M, Y, C1, C2)
The DAG for this scientific setting is:
library(CMAverse)
cmdag(outcome = "Y", exposure = "A", mediator = "M",
basec = c("C1", "C2"), postc = NULL, node = TRUE, text_col = "white")
To conduct multiple imputations, we set the multimp
argument to be TRUE
. The regression-based approach is used for illustration. The multiple imputation results:
res_multimp <- cmest(data = data, model = "rb", outcome = "Y", exposure = "A",
mediator = "M", basec = c("C1", "C2"), EMint = TRUE,
mreg = list("logistic"), yreg = "logistic",
astar = 0, a = 1, mval = list(1),
estimation = "paramfunc", inference = "delta",
multimp = TRUE, m = 10)
summary(res_multimp)
## Causal Mediation Analysis
##
## # Regressions with imputed dataset 1
##
## ## Outcome regression:
##
## Call:
## glm(formula = Y ~ A + M + A * M + C1 + C2, family = binomial(),
## data = getCall(x$reg.output[[1L]]$yreg)$data, weights = getCall(x$reg.output[[1L]]$yreg)$weights)
##
## Deviance Residuals:
## Min 1Q Median 3Q Max
## -0.5603 -0.2432 -0.1779 -0.1640 3.0447
##
## Coefficients:
## Estimate Std. Error z value Pr(>|z|)
## (Intercept) -3.5224 0.7620 -4.622 3.79e-06 ***
## A 0.6683 0.5283 1.265 0.20585
## M -1.0747 0.3738 -2.875 0.00404 **
## C1 0.8998 0.6909 1.302 0.19280
## C2 -0.8122 0.1427 -5.690 1.27e-08 ***
## A:M -0.3553 0.5540 -0.641 0.52132
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
##
## (Dispersion parameter for binomial family taken to be 1)
##
## Null deviance: 2053.5 on 9999 degrees of freedom
## Residual deviance: 1997.1 on 9994 degrees of freedom
## AIC: 2009.1
##
## Number of Fisher Scoring iterations: 7
##
##
## ## Mediator regressions:
##
## Call:
## glm(formula = M ~ A + C1 + C2, family = binomial(), data = getCall(x$reg.output[[1L]]$mreg[[1L]])$data,
## weights = getCall(x$reg.output[[1L]]$mreg[[1L]])$weights)
##
## Deviance Residuals:
## Min 1Q Median 3Q Max
## -3.2944 0.1110 0.1624 0.2412 0.5924
##
## Coefficients:
## Estimate Std. Error z value Pr(>|z|)
## (Intercept) 0.01294 0.64619 0.020 0.98402
## A 1.70241 0.14542 11.707 < 2e-16 ***
## C1 2.42112 0.65530 3.695 0.00022 ***
## C2 0.97261 0.13661 7.120 1.08e-12 ***
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
##
## (Dispersion parameter for binomial family taken to be 1)
##
## Null deviance: 2264.5 on 9999 degrees of freedom
## Residual deviance: 2035.7 on 9996 degrees of freedom
## AIC: 2043.7
##
## Number of Fisher Scoring iterations: 7
##
##
## # Regressions with imputed dataset 2
##
## ## Outcome regression:
##
## Call:
## glm(formula = Y ~ A + M + A * M + C1 + C2, family = binomial(),
## data = getCall(x$reg.output[[2L]]$yreg)$data, weights = getCall(x$reg.output[[2L]]$yreg)$weights)
##
## Deviance Residuals:
## Min 1Q Median 3Q Max
## -0.6003 -0.2437 -0.1724 -0.1610 3.0547
##
## Coefficients:
## Estimate Std. Error z value Pr(>|z|)
## (Intercept) -3.3781 0.7686 -4.395 1.11e-05 ***
## A 0.8585 0.5115 1.678 0.09328 .
## M -1.0685 0.3742 -2.856 0.00429 **
## C1 0.7463 0.6968 1.071 0.28412
## C2 -0.8598 0.1448 -5.939 2.87e-09 ***
## A:M -0.5478 0.5386 -1.017 0.30910
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
##
## (Dispersion parameter for binomial family taken to be 1)
##
## Null deviance: 2022.7 on 9999 degrees of freedom
## Residual deviance: 1960.1 on 9994 degrees of freedom
## AIC: 1972.1
##
## Number of Fisher Scoring iterations: 7
##
##
## ## Mediator regressions:
##
## Call:
## glm(formula = M ~ A + C1 + C2, family = binomial(), data = getCall(x$reg.output[[2L]]$mreg[[1L]])$data,
## weights = getCall(x$reg.output[[2L]]$mreg[[1L]])$weights)
##
## Deviance Residuals:
## Min 1Q Median 3Q Max
## -3.2720 0.1133 0.1625 0.2496 0.5705
##
## Coefficients:
## Estimate Std. Error z value Pr(>|z|)
## (Intercept) 0.2632 0.6391 0.412 0.680508
## A 1.7188 0.1450 11.853 < 2e-16 ***
## C1 2.1723 0.6470 3.357 0.000787 ***
## C2 0.9131 0.1350 6.765 1.33e-11 ***
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
##
## (Dispersion parameter for binomial family taken to be 1)
##
## Null deviance: 2286.6 on 9999 degrees of freedom
## Residual deviance: 2060.8 on 9996 degrees of freedom
## AIC: 2068.8
##
## Number of Fisher Scoring iterations: 7
##
##
## # Regressions with imputed dataset 3
##
## ## Outcome regression:
##
## Call:
## glm(formula = Y ~ A + M + A * M + C1 + C2, family = binomial(),
## data = getCall(x$reg.output[[3L]]$yreg)$data, weights = getCall(x$reg.output[[3L]]$yreg)$weights)
##
## Deviance Residuals:
## Min 1Q Median 3Q Max
## -0.5452 -0.2430 -0.1729 -0.1611 3.0363
##
## Coefficients:
## Estimate Std. Error z value Pr(>|z|)
## (Intercept) -3.4209 0.7681 -4.454 8.43e-06 ***
## A 0.6180 0.5279 1.171 0.24176
## M -1.0370 0.3726 -2.783 0.00539 **
## C1 0.8083 0.6970 1.160 0.24623
## C2 -0.8459 0.1442 -5.865 4.49e-09 ***
## A:M -0.3704 0.5531 -0.670 0.50309
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
##
## (Dispersion parameter for binomial family taken to be 1)
##
## Null deviance: 2030.4 on 9999 degrees of freedom
## Residual deviance: 1973.7 on 9994 degrees of freedom
## AIC: 1985.7
##
## Number of Fisher Scoring iterations: 7
##
##
## ## Mediator regressions:
##
## Call:
## glm(formula = M ~ A + C1 + C2, family = binomial(), data = getCall(x$reg.output[[3L]]$mreg[[1L]])$data,
## weights = getCall(x$reg.output[[3L]]$mreg[[1L]])$weights)
##
## Deviance Residuals:
## Min 1Q Median 3Q Max
## -3.2646 0.1145 0.1658 0.2469 0.5667
##
## Coefficients:
## Estimate Std. Error z value Pr(>|z|)
## (Intercept) 0.2910 0.6446 0.451 0.651683
## A 1.6746 0.1438 11.647 < 2e-16 ***
## C1 2.1524 0.6519 3.302 0.000961 ***
## C2 0.9278 0.1352 6.864 6.71e-12 ***
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
##
## (Dispersion parameter for binomial family taken to be 1)
##
## Null deviance: 2286.6 on 9999 degrees of freedom
## Residual deviance: 2067.3 on 9996 degrees of freedom
## AIC: 2075.3
##
## Number of Fisher Scoring iterations: 7
##
##
## # Regressions with imputed dataset 4
##
## ## Outcome regression:
##
## Call:
## glm(formula = Y ~ A + M + A * M + C1 + C2, family = binomial(),
## data = getCall(x$reg.output[[4L]]$yreg)$data, weights = getCall(x$reg.output[[4L]]$yreg)$weights)
##
## Deviance Residuals:
## Min 1Q Median 3Q Max
## -0.5517 -0.2469 -0.1710 -0.1617 3.0467
##
## Coefficients:
## Estimate Std. Error z value Pr(>|z|)
## (Intercept) -3.2476 0.7689 -4.224 2.40e-05 ***
## A 0.7042 0.5288 1.332 0.18299
## M -1.0429 0.3735 -2.792 0.00523 **
## C1 0.6137 0.6974 0.880 0.37888
## C2 -0.8758 0.1451 -6.035 1.59e-09 ***
## A:M -0.4094 0.5547 -0.738 0.46048
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
##
## (Dispersion parameter for binomial family taken to be 1)
##
## Null deviance: 2022.7 on 9999 degrees of freedom
## Residual deviance: 1962.9 on 9994 degrees of freedom
## AIC: 1974.9
##
## Number of Fisher Scoring iterations: 7
##
##
## ## Mediator regressions:
##
## Call:
## glm(formula = M ~ A + C1 + C2, family = binomial(), data = getCall(x$reg.output[[4L]]$mreg[[1L]])$data,
## weights = getCall(x$reg.output[[4L]]$mreg[[1L]])$weights)
##
## Deviance Residuals:
## Min 1Q Median 3Q Max
## -3.2841 0.1103 0.1624 0.2477 0.5675
##
## Coefficients:
## Estimate Std. Error z value Pr(>|z|)
## (Intercept) 0.3423 0.6427 0.533 0.59430
## A 1.7518 0.1464 11.962 < 2e-16 ***
## C1 2.0718 0.6497 3.189 0.00143 **
## C2 0.9545 0.1360 7.018 2.24e-12 ***
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
##
## (Dispersion parameter for binomial family taken to be 1)
##
## Null deviance: 2279.3 on 9999 degrees of freedom
## Residual deviance: 2046.0 on 9996 degrees of freedom
## AIC: 2054
##
## Number of Fisher Scoring iterations: 7
##
##
## # Regressions with imputed dataset 5
##
## ## Outcome regression:
##
## Call:
## glm(formula = Y ~ A + M + A * M + C1 + C2, family = binomial(),
## data = getCall(x$reg.output[[5L]]$yreg)$data, weights = getCall(x$reg.output[[5L]]$yreg)$weights)
##
## Deviance Residuals:
## Min 1Q Median 3Q Max
## -0.5654 -0.2397 -0.1732 -0.1600 3.0569
##
## Coefficients:
## Estimate Std. Error z value Pr(>|z|)
## (Intercept) -3.5296 0.7754 -4.552 5.32e-06 ***
## A 0.7121 0.5285 1.347 0.17787
## M -1.0714 0.3741 -2.864 0.00419 **
## C1 0.8859 0.7035 1.259 0.20797
## C2 -0.8338 0.1455 -5.732 9.93e-09 ***
## A:M -0.4124 0.5549 -0.743 0.45736
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
##
## (Dispersion parameter for binomial family taken to be 1)
##
## Null deviance: 1999.6 on 9999 degrees of freedom
## Residual deviance: 1942.5 on 9994 degrees of freedom
## AIC: 1954.5
##
## Number of Fisher Scoring iterations: 7
##
##
## ## Mediator regressions:
##
## Call:
## glm(formula = M ~ A + C1 + C2, family = binomial(), data = getCall(x$reg.output[[5L]]$mreg[[1L]])$data,
## weights = getCall(x$reg.output[[5L]]$mreg[[1L]])$weights)
##
## Deviance Residuals:
## Min 1Q Median 3Q Max
## -3.2857 0.1114 0.1619 0.2469 0.5815
##
## Coefficients:
## Estimate Std. Error z value Pr(>|z|)
## (Intercept) 0.1650 0.6421 0.257 0.797241
## A 1.7349 0.1457 11.909 < 2e-16 ***
## C1 2.2561 0.6500 3.471 0.000519 ***
## C2 0.9456 0.1355 6.977 3.03e-12 ***
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
##
## (Dispersion parameter for binomial family taken to be 1)
##
## Null deviance: 2286.6 on 9999 degrees of freedom
## Residual deviance: 2054.0 on 9996 degrees of freedom
## AIC: 2062
##
## Number of Fisher Scoring iterations: 7
##
##
## # Regressions with imputed dataset 6
##
## ## Outcome regression:
##
## Call:
## glm(formula = Y ~ A + M + A * M + C1 + C2, family = binomial(),
## data = getCall(x$reg.output[[6L]]$yreg)$data, weights = getCall(x$reg.output[[6L]]$yreg)$weights)
##
## Deviance Residuals:
## Min 1Q Median 3Q Max
## -0.6038 -0.2447 -0.1726 -0.1601 3.0703
##
## Coefficients:
## Estimate Std. Error z value Pr(>|z|)
## (Intercept) -3.3346 0.7611 -4.381 1.18e-05 ***
## A 0.7255 0.5008 1.449 0.147373
## M -1.2184 0.3598 -3.386 0.000709 ***
## C1 0.8314 0.6959 1.195 0.232159
## C2 -0.8757 0.1450 -6.041 1.53e-09 ***
## A:M -0.3815 0.5288 -0.721 0.470610
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
##
## (Dispersion parameter for binomial family taken to be 1)
##
## Null deviance: 2030.4 on 9999 degrees of freedom
## Residual deviance: 1963.2 on 9994 degrees of freedom
## AIC: 1975.2
##
## Number of Fisher Scoring iterations: 7
##
##
## ## Mediator regressions:
##
## Call:
## glm(formula = M ~ A + C1 + C2, family = binomial(), data = getCall(x$reg.output[[6L]]$mreg[[1L]])$data,
## weights = getCall(x$reg.output[[6L]]$mreg[[1L]])$weights)
##
## Deviance Residuals:
## Min 1Q Median 3Q Max
## -3.2844 0.1127 0.1614 0.2450 0.5849
##
## Coefficients:
## Estimate Std. Error z value Pr(>|z|)
## (Intercept) 0.06128 0.64567 0.095 0.924385
## A 1.70250 0.14526 11.720 < 2e-16 ***
## C1 2.39068 0.65431 3.654 0.000258 ***
## C2 0.92563 0.13585 6.814 9.52e-12 ***
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
##
## (Dispersion parameter for binomial family taken to be 1)
##
## Null deviance: 2271.9 on 9999 degrees of freedom
## Residual deviance: 2047.1 on 9996 degrees of freedom
## AIC: 2055.1
##
## Number of Fisher Scoring iterations: 7
##
##
## # Regressions with imputed dataset 7
##
## ## Outcome regression:
##
## Call:
## glm(formula = Y ~ A + M + A * M + C1 + C2, family = binomial(),
## data = getCall(x$reg.output[[7L]]$yreg)$data, weights = getCall(x$reg.output[[7L]]$yreg)$weights)
##
## Deviance Residuals:
## Min 1Q Median 3Q Max
## -0.5621 -0.2412 -0.1745 -0.1606 3.0527
##
## Coefficients:
## Estimate Std. Error z value Pr(>|z|)
## (Intercept) -3.5515 0.7683 -4.623 3.79e-06 ***
## A 0.6686 0.5284 1.265 0.20573
## M -1.0724 0.3737 -2.869 0.00411 **
## C1 0.9297 0.6964 1.335 0.18189
## C2 -0.8361 0.1445 -5.787 7.16e-09 ***
## A:M -0.3752 0.5543 -0.677 0.49846
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
##
## (Dispersion parameter for binomial family taken to be 1)
##
## Null deviance: 2022.7 on 9999 degrees of freedom
## Residual deviance: 1965.1 on 9994 degrees of freedom
## AIC: 1977.1
##
## Number of Fisher Scoring iterations: 7
##
##
## ## Mediator regressions:
##
## Call:
## glm(formula = M ~ A + C1 + C2, family = binomial(), data = getCall(x$reg.output[[7L]]$mreg[[1L]])$data,
## weights = getCall(x$reg.output[[7L]]$mreg[[1L]])$weights)
##
## Deviance Residuals:
## Min 1Q Median 3Q Max
## -3.2768 0.1122 0.1641 0.2454 0.5672
##
## Coefficients:
## Estimate Std. Error z value Pr(>|z|)
## (Intercept) 0.2980 0.6440 0.463 0.64353
## A 1.7011 0.1453 11.710 < 2e-16 ***
## C1 2.1390 0.6519 3.281 0.00103 **
## C2 0.9495 0.1361 6.977 3.01e-12 ***
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
##
## (Dispersion parameter for binomial family taken to be 1)
##
## Null deviance: 2271.9 on 9999 degrees of freedom
## Residual deviance: 2047.4 on 9996 degrees of freedom
## AIC: 2055.4
##
## Number of Fisher Scoring iterations: 7
##
##
## # Regressions with imputed dataset 8
##
## ## Outcome regression:
##
## Call:
## glm(formula = Y ~ A + M + A * M + C1 + C2, family = binomial(),
## data = getCall(x$reg.output[[8L]]$yreg)$data, weights = getCall(x$reg.output[[8L]]$yreg)$weights)
##
## Deviance Residuals:
## Min 1Q Median 3Q Max
## -0.5715 -0.2448 -0.1721 -0.1582 3.0600
##
## Coefficients:
## Estimate Std. Error z value Pr(>|z|)
## (Intercept) -3.5523 0.7673 -4.630 3.66e-06 ***
## A 0.6917 0.5288 1.308 0.19089
## M -1.0386 0.3731 -2.784 0.00538 **
## C1 0.9411 0.6932 1.358 0.17459
## C2 -0.9014 0.1445 -6.237 4.47e-10 ***
## A:M -0.4069 0.5543 -0.734 0.46285
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
##
## (Dispersion parameter for binomial family taken to be 1)
##
## Null deviance: 2038.1 on 9999 degrees of freedom
## Residual deviance: 1975.1 on 9994 degrees of freedom
## AIC: 1987.1
##
## Number of Fisher Scoring iterations: 7
##
##
## ## Mediator regressions:
##
## Call:
## glm(formula = M ~ A + C1 + C2, family = binomial(), data = getCall(x$reg.output[[8L]]$mreg[[1L]])$data,
## weights = getCall(x$reg.output[[8L]]$mreg[[1L]])$weights)
##
## Deviance Residuals:
## Min 1Q Median 3Q Max
## -3.2663 0.1143 0.1622 0.2518 0.5674
##
## Coefficients:
## Estimate Std. Error z value Pr(>|z|)
## (Intercept) 0.2690 0.6423 0.419 0.675350
## A 1.7138 0.1450 11.821 < 2e-16 ***
## C1 2.1813 0.6492 3.360 0.000779 ***
## C2 0.8835 0.1345 6.571 5e-11 ***
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
##
## (Dispersion parameter for binomial family taken to be 1)
##
## Null deviance: 2286.6 on 9999 degrees of freedom
## Residual deviance: 2065.7 on 9996 degrees of freedom
## AIC: 2073.7
##
## Number of Fisher Scoring iterations: 7
##
##
## # Regressions with imputed dataset 9
##
## ## Outcome regression:
##
## Call:
## glm(formula = Y ~ A + M + A * M + C1 + C2, family = binomial(),
## data = getCall(x$reg.output[[9L]]$yreg)$data, weights = getCall(x$reg.output[[9L]]$yreg)$weights)
##
## Deviance Residuals:
## Min 1Q Median 3Q Max
## -0.5511 -0.2427 -0.1731 -0.1611 3.0410
##
## Coefficients:
## Estimate Std. Error z value Pr(>|z|)
## (Intercept) -3.3264 0.7595 -4.380 1.19e-05 ***
## A 0.5437 0.5175 1.051 0.29339
## M -1.1552 0.3580 -3.227 0.00125 **
## C1 0.8108 0.6951 1.167 0.24340
## C2 -0.8389 0.1443 -5.813 6.15e-09 ***
## A:M -0.2790 0.5436 -0.513 0.60773
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
##
## (Dispersion parameter for binomial family taken to be 1)
##
## Null deviance: 2030.4 on 9999 degrees of freedom
## Residual deviance: 1971.7 on 9994 degrees of freedom
## AIC: 1983.7
##
## Number of Fisher Scoring iterations: 7
##
##
## ## Mediator regressions:
##
## Call:
## glm(formula = M ~ A + C1 + C2, family = binomial(), data = getCall(x$reg.output[[9L]]$mreg[[1L]])$data,
## weights = getCall(x$reg.output[[9L]]$mreg[[1L]])$weights)
##
## Deviance Residuals:
## Min 1Q Median 3Q Max
## -3.2797 0.1120 0.1655 0.2430 0.5746
##
## Coefficients:
## Estimate Std. Error z value Pr(>|z|)
## (Intercept) 0.2267 0.6412 0.354 0.723653
## A 1.6866 0.1446 11.667 < 2e-16 ***
## C1 2.2028 0.6497 3.390 0.000698 ***
## C2 0.9728 0.1362 7.141 9.25e-13 ***
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
##
## (Dispersion parameter for binomial family taken to be 1)
##
## Null deviance: 2279.3 on 9999 degrees of freedom
## Residual deviance: 2052.9 on 9996 degrees of freedom
## AIC: 2060.9
##
## Number of Fisher Scoring iterations: 7
##
##
## # Regressions with imputed dataset 10
##
## ## Outcome regression:
##
## Call:
## glm(formula = Y ~ A + M + A * M + C1 + C2, family = binomial(),
## data = getCall(x$reg.output[[10L]]$yreg)$data, weights = getCall(x$reg.output[[10L]]$yreg)$weights)
##
## Deviance Residuals:
## Min 1Q Median 3Q Max
## -0.6092 -0.2405 -0.1749 -0.1602 3.0630
##
## Coefficients:
## Estimate Std. Error z value Pr(>|z|)
## (Intercept) -3.5684 0.7693 -4.639 3.51e-06 ***
## A 0.8190 0.5116 1.601 0.10944
## M -1.1175 0.3743 -2.985 0.00283 **
## C1 0.9642 0.6974 1.383 0.16679
## C2 -0.8361 0.1445 -5.786 7.21e-09 ***
## A:M -0.4960 0.5387 -0.921 0.35720
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
##
## (Dispersion parameter for binomial family taken to be 1)
##
## Null deviance: 2022.7 on 9999 degrees of freedom
## Residual deviance: 1960.7 on 9994 degrees of freedom
## AIC: 1972.7
##
## Number of Fisher Scoring iterations: 7
##
##
## ## Mediator regressions:
##
## Call:
## glm(formula = M ~ A + C1 + C2, family = binomial(), data = getCall(x$reg.output[[10L]]$mreg[[1L]])$data,
## weights = getCall(x$reg.output[[10L]]$mreg[[1L]])$weights)
##
## Deviance Residuals:
## Min 1Q Median 3Q Max
## -3.2796 0.1141 0.1610 0.2463 0.5836
##
## Coefficients:
## Estimate Std. Error z value Pr(>|z|)
## (Intercept) 0.02446 0.64559 0.038 0.969771
## A 1.68768 0.14535 11.611 < 2e-16 ***
## C1 2.45189 0.65446 3.746 0.000179 ***
## C2 0.89215 0.13547 6.585 4.53e-11 ***
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
##
## (Dispersion parameter for binomial family taken to be 1)
##
## Null deviance: 2264.5 on 9999 degrees of freedom
## Residual deviance: 2045.9 on 9996 degrees of freedom
## AIC: 2053.9
##
## Number of Fisher Scoring iterations: 7
##
##
## # Effect decomposition on the risk ratio scale via the regression-based approach with multiple imputation
##
## Closed-form parameter function estimation with
## delta method standard errors, confidence intervals and p-values
##
## Estimate Std.error 95% CIL 95% CIU P.val
## Rcde 1.34659 0.22892 0.96501 1.879 0.0800 .
## Rpnde 1.43309 0.24492 1.02518 2.003 0.0353 *
## Rtnde 1.36413 0.22702 0.98445 1.890 0.0621 .
## Rpnie 0.93070 0.03640 0.86202 1.005 0.0664 .
## Rtnie 0.88592 0.05343 0.78715 0.997 0.0446 *
## Rte 1.26960 0.20277 0.92836 1.736 0.1350
## ERcde 0.31730 0.20479 -0.08409 0.719 0.1213
## ERintref 0.11667 0.11739 -0.11340 0.347 0.3203
## ERintmed -0.09468 0.09544 -0.28175 0.092 0.3212
## ERpnie -0.06928 0.03639 -0.14061 0.002 0.0569 .
## ERcde(prop) 1.17873 0.24213 0.70417 1.653 1.13e-06 ***
## ERintref(prop) 0.43015 0.47475 -0.50035 1.361 0.3649
## ERintmed(prop) -0.34900 0.38596 -1.10547 0.407 0.3659
## ERpnie(prop) -0.25988 0.24107 -0.73236 0.213 0.2810
## pm -0.60888 0.49210 -1.57338 0.356 0.2160
## int 0.08115 0.08977 -0.09480 0.257 0.3660
## pe -0.17873 0.24213 -0.65329 0.296 0.4604
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
##
## (Rcde: controlled direct effect risk ratio; Rpnde: pure natural direct effect risk ratio; Rtnde: total natural direct effect risk ratio; Rpnie: pure natural indirect effect risk ratio; Rtnie: total natural indirect effect risk ratio; Rte: total effect risk ratio; ERcde: excess relative risk due to controlled direct effect; ERintref: excess relative risk due to reference interaction; ERintmed: excess relative risk due to mediated interaction; ERpnie: excess relative risk due to pure natural indirect effect; ERcde(prop): proportion ERcde; ERintref(prop): proportion ERintref; ERintmed(prop): proportion ERintmed; ERpnie(prop): proportion ERpnie; pm: overall proportion mediated; int: overall proportion attributable to interaction; pe: overall proportion eliminated)
##
## Relevant variable values:
## $a
## [1] 1
##
## $astar
## [1] 0
##
## $yval
## [1] "1"
##
## $mval
## $mval[[1]]
## [1] 1
##
##
## $basecval
## $basecval[[1]]
## [1] 0.9993387
##
## $basecval[[2]]
## [1] 0.6060235
Compare the multiple imputation results with true results:
res_noNA <- cmest(data = data_noNA, model = "rb", outcome = "Y", exposure = "A",
mediator = "M", basec = c("C1", "C2"), EMint = TRUE,
mreg = list("logistic"), yreg = "logistic",
astar = 0, a = 1, mval = list(1),
estimation = "paramfunc", inference = "delta")
summary(res_noNA)
## Causal Mediation Analysis
##
## # Outcome regression:
##
## Call:
## glm(formula = Y ~ A + M + A * M + C1 + C2, family = binomial(),
## data = getCall(x$reg.output$yreg)$data, weights = getCall(x$reg.output$yreg)$weights)
##
## Deviance Residuals:
## Min 1Q Median 3Q Max
## -0.5681 -0.2414 -0.1741 -0.1604 3.0521
##
## Coefficients:
## Estimate Std. Error z value Pr(>|z|)
## (Intercept) -3.5731 0.7713 -4.633 3.61e-06 ***
## A 0.7084 0.5283 1.341 0.17990
## M -1.0471 0.3736 -2.803 0.00507 **
## C1 0.9337 0.6973 1.339 0.18054
## C2 -0.8415 0.1443 -5.829 5.56e-09 ***
## A:M -0.4222 0.5541 -0.762 0.44607
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
##
## (Dispersion parameter for binomial family taken to be 1)
##
## Null deviance: 2022.7 on 9999 degrees of freedom
## Residual deviance: 1965.0 on 9994 degrees of freedom
## AIC: 1977
##
## Number of Fisher Scoring iterations: 7
##
##
## # Mediator regressions:
##
## Call:
## glm(formula = M ~ A + C1 + C2, family = binomial(), data = getCall(x$reg.output$mreg[[1L]])$data,
## weights = getCall(x$reg.output$mreg[[1L]])$weights)
##
## Deviance Residuals:
## Min 1Q Median 3Q Max
## -3.2578 0.1145 0.1647 0.2519 0.5563
##
## Coefficients:
## Estimate Std. Error z value Pr(>|z|)
## (Intercept) 0.4353 0.6410 0.679 0.49712
## A 1.7076 0.1443 11.837 < 2e-16 ***
## C1 1.9982 0.6474 3.086 0.00203 **
## C2 0.9024 0.1345 6.709 1.96e-11 ***
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
##
## (Dispersion parameter for binomial family taken to be 1)
##
## Null deviance: 2294.0 on 9999 degrees of freedom
## Residual deviance: 2072.5 on 9996 degrees of freedom
## AIC: 2080.5
##
## Number of Fisher Scoring iterations: 7
##
##
## # Effect decomposition on the risk ratio scale via the regression-based approach
##
## Closed-form parameter function estimation with
## delta method standard errors, confidence intervals and p-values
##
## Estimate Std.error 95% CIL 95% CIU P.val
## Rcde 1.33138 0.22278 0.95912 1.848 0.0872 .
## Rpnde 1.41988 0.23708 1.02358 1.970 0.0358 *
## Rtnde 1.34928 0.22056 0.97940 1.859 0.0669 .
## Rpnie 0.93338 0.03576 0.86587 1.006 0.0719 .
## Rtnie 0.88697 0.05295 0.78902 0.997 0.0445 *
## Rte 1.25939 0.19804 0.92535 1.714 0.1425
## ERcde 0.30416 0.20019 -0.08820 0.697 0.1287
## ERintref 0.11571 0.11472 -0.10914 0.341 0.3131
## ERintmed -0.09387 0.09323 -0.27659 0.089 0.3140
## ERpnie -0.06662 0.03576 -0.13670 0.003 0.0624 .
## ERcde(prop) 1.17262 0.23540 0.71124 1.634 6.32e-07 ***
## ERintref(prop) 0.44610 0.49051 -0.51528 1.407 0.3631
## ERintmed(prop) -0.36189 0.39881 -1.14354 0.420 0.3642
## ERpnie(prop) -0.25684 0.23474 -0.71691 0.203 0.2739
## pm -0.61872 0.50289 -1.60437 0.367 0.2186
## int 0.08421 0.09264 -0.09736 0.266 0.3633
## pe -0.17262 0.23540 -0.63400 0.289 0.4634
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
##
## (Rcde: controlled direct effect risk ratio; Rpnde: pure natural direct effect risk ratio; Rtnde: total natural direct effect risk ratio; Rpnie: pure natural indirect effect risk ratio; Rtnie: total natural indirect effect risk ratio; Rte: total effect risk ratio; ERcde: excess relative risk due to controlled direct effect; ERintref: excess relative risk due to reference interaction; ERintmed: excess relative risk due to mediated interaction; ERpnie: excess relative risk due to pure natural indirect effect; ERcde(prop): proportion ERcde; ERintref(prop): proportion ERintref; ERintmed(prop): proportion ERintmed; ERpnie(prop): proportion ERpnie; pm: overall proportion mediated; int: overall proportion attributable to interaction; pe: overall proportion eliminated)
##
## Relevant variable values:
## $a
## [1] 1
##
## $astar
## [1] 0
##
## $yval
## [1] "1"
##
## $mval
## $mval[[1]]
## [1] 1
##
##
## $basecval
## $basecval[[1]]
## [1] 0.9993463
##
## $basecval[[2]]
## [1] 0.6061