This example demonstrates how to estimate causal effects with multiple imputations by using cmest when the dataset contains missing values. For this purpose, we simulate some data containing a continuous baseline confounder $$C_1$$, a binary baseline confounder $$C_2$$, a binary exposure $$A$$, a binary mediator $$M$$ and a binary outcome $$Y$$. The true regression models for $$A$$, $$M$$ and $$Y$$ are: $logit(E(A|C_1,C_2))=0.2+0.5C_1+0.1C_2$ $logit(E(M|A,C_1,C_2))=1+2A+1.5C_1+0.8C_2$ $logit(E(Y|A,M,C_1,C_2)))=-3-0.4A-1.2M+0.5AM+0.3C_1-0.6C_2$

To create the dataset with missing values, we randomly delete 10% of values in the dataset.

set.seed(1)
expit <- function(x) exp(x)/(1+exp(x))
n <- 10000
C1 <- rnorm(n, mean = 1, sd = 0.1)
C2 <- rbinom(n, 1, 0.6)
A <- rbinom(n, 1, expit(0.2 + 0.5*C1 + 0.1*C2))
M <- rbinom(n, 1, expit(1 + 2*A + 1.5*C1 + 0.8*C2))
Y <- rbinom(n, 1, expit(-3 - 0.4*A - 1.2*M + 0.5*A*M + 0.3*C1 - 0.6*C2))
data_noNA <- data.frame(A, M, Y, C1, C2)
missing <- sample(1:(5*n), n*0.1, replace = FALSE)
C1[missing[which(missing <= n)]] <- NA
C2[missing[which((missing > n)*(missing <= 2*n) == 1)] - n] <- NA
A[missing[which((missing > 2*n)*(missing <= 3*n) == 1)] - 2*n] <- NA
M[missing[which((missing > 3*n)*(missing <= 4*n) == 1)] - 3*n] <- NA
Y[missing[which((missing > 4*n)*(missing <= 5*n) == 1)] - 4*n] <- NA
data <- data.frame(A, M, Y, C1, C2)

The DAG for this scientific setting is:

library(CMAverse)
## Registered S3 methods overwritten by 'lme4':
##   method                          from
##   cooks.distance.influence.merMod car
##   influence.merMod                car
##   dfbeta.influence.merMod         car
##   dfbetas.influence.merMod        car
cmdag(outcome = "Y", exposure = "A", mediator = "M",
basec = c("C1", "C2"), postc = NULL, node = TRUE, text_col = "white")

To conduct multiple imputations, we set the multimp argument to be TRUE. The regression-based approach is used for illustration. The multiple imputation results:

res_multimp <- cmest(data = data, model = "rb", outcome = "Y", exposure = "A",
mediator = "M", basec = c("C1", "C2"), EMint = TRUE,
mreg = list("logistic"), yreg = "logistic",
astar = 0, a = 1, mval = list(1),
estimation = "paramfunc", inference = "delta",
multimp = TRUE, args_mice = list(m = 10))
summary(res_multimp)
## Causal Mediation Analysis
##
## # Regressions with imputed dataset 1
##
## ## Outcome regression:
##
## Call:
## glm(formula = Y ~ A + M + A * M + C1 + C2, family = binomial(),
##     data = getCall(x$reg.output[[1L]]$yreg)$data, weights = getCall(x$reg.output[[1L]]$yreg)$weights)
##
## Deviance Residuals:
##     Min       1Q   Median       3Q      Max
## -0.6544  -0.2446  -0.1733  -0.1571   3.0818
##
## Coefficients:
##             Estimate Std. Error z value Pr(>|z|)
## (Intercept)  -3.6155     0.7563  -4.781 1.75e-06 ***
## A             0.8454     0.4866   1.737 0.082324 .
## M            -1.1828     0.3592  -3.293 0.000993 ***
## C1            1.1130     0.6888   1.616 0.106133
## C2           -0.9115     0.1442  -6.322 2.58e-10 ***
## A:M          -0.5107     0.5149  -0.992 0.321299
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
##
## (Dispersion parameter for binomial family taken to be 1)
##
##     Null deviance: 2061.1  on 9999  degrees of freedom
## Residual deviance: 1986.3  on 9994  degrees of freedom
## AIC: 1998.3
##
## Number of Fisher Scoring iterations: 7
##
##
## ## Mediator regressions:
##
## Call:
## glm(formula = M ~ A + C1 + C2, family = binomial(), data = getCall(x$reg.output[[1L]]$mreg[[1L]])$data, ## weights = getCall(x$reg.output[[1L]]$mreg[[1L]])$weights)
##
## Deviance Residuals:
##     Min       1Q   Median       3Q      Max
## -3.2753   0.1143   0.1637   0.2465   0.5893
##
## Coefficients:
##             Estimate Std. Error z value Pr(>|z|)
## (Intercept)  0.04186    0.63872   0.066 0.947748
## A            1.68676    0.14358  11.748  < 2e-16 ***
## C1           2.39526    0.64720   3.701 0.000215 ***
## C2           0.92552    0.13464   6.874 6.24e-12 ***
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
##
## (Dispersion parameter for binomial family taken to be 1)
##
##     Null deviance: 2301.4  on 9999  degrees of freedom
## Residual deviance: 2075.6  on 9996  degrees of freedom
## AIC: 2083.6
##
## Number of Fisher Scoring iterations: 7
##
##
## # Regressions with imputed dataset 2
##
## ## Outcome regression:
##
## Call:
## glm(formula = Y ~ A + M + A * M + C1 + C2, family = binomial(),
##     data = getCall(x$reg.output[[2L]]$yreg)$data, weights = getCall(x$reg.output[[2L]]$yreg)$weights)
##
## Deviance Residuals:
##     Min       1Q   Median       3Q      Max
## -0.5705  -0.2402  -0.1732  -0.1601   3.0503
##
## Coefficients:
##             Estimate Std. Error z value Pr(>|z|)
## (Intercept)  -3.5202     0.7726  -4.556 5.21e-06 ***
## A             0.7264     0.5289   1.374  0.16959
## M            -1.0590     0.3737  -2.834  0.00459 **
## C1            0.8821     0.6995   1.261  0.20727
## C2           -0.8319     0.1452  -5.728 1.01e-08 ***
## A:M          -0.4478     0.5548  -0.807  0.41961
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
##
## (Dispersion parameter for binomial family taken to be 1)
##
##     Null deviance: 2007.3  on 9999  degrees of freedom
## Residual deviance: 1950.4  on 9994  degrees of freedom
## AIC: 1962.4
##
## Number of Fisher Scoring iterations: 7
##
##
## ## Mediator regressions:
##
## Call:
## glm(formula = M ~ A + C1 + C2, family = binomial(), data = getCall(x$reg.output[[2L]]$mreg[[1L]])$data, ## weights = getCall(x$reg.output[[2L]]$mreg[[1L]])$weights)
##
## Deviance Residuals:
##     Min       1Q   Median       3Q      Max
## -3.2907   0.1102   0.1616   0.2451   0.5747
##
## Coefficients:
##             Estimate Std. Error z value Pr(>|z|)
## (Intercept)   0.2246     0.6457   0.348 0.727979
## A             1.7384     0.1467  11.852  < 2e-16 ***
## C1            2.2056     0.6532   3.377 0.000734 ***
## C2            0.9561     0.1366   6.998 2.59e-12 ***
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
##
## (Dispersion parameter for binomial family taken to be 1)
##
##     Null deviance: 2264.5  on 9999  degrees of freedom
## Residual deviance: 2034.0  on 9996  degrees of freedom
## AIC: 2042
##
## Number of Fisher Scoring iterations: 7
##
##
## # Regressions with imputed dataset 3
##
## ## Outcome regression:
##
## Call:
## glm(formula = Y ~ A + M + A * M + C1 + C2, family = binomial(),
##     data = getCall(x$reg.output[[3L]]$yreg)$data, weights = getCall(x$reg.output[[3L]]$yreg)$weights)
##
## Deviance Residuals:
##     Min       1Q   Median       3Q      Max
## -0.6001  -0.2442  -0.1722  -0.1596   3.0502
##
## Coefficients:
##             Estimate Std. Error z value Pr(>|z|)
## (Intercept)  -3.4724     0.7663  -4.532 5.85e-06 ***
## A             0.8326     0.5111   1.629  0.10332
## M            -1.0319     0.3731  -2.766  0.00568 **
## C1            0.8459     0.6935   1.220  0.22259
## C2           -0.8750     0.1443  -6.063 1.34e-09 ***
## A:M          -0.5568     0.5375  -1.036  0.30027
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
##
## (Dispersion parameter for binomial family taken to be 1)
##
##     Null deviance: 2038.1  on 9999  degrees of freedom
## Residual deviance: 1974.7  on 9994  degrees of freedom
## AIC: 1986.7
##
## Number of Fisher Scoring iterations: 7
##
##
## ## Mediator regressions:
##
## Call:
## glm(formula = M ~ A + C1 + C2, family = binomial(), data = getCall(x$reg.output[[3L]]$mreg[[1L]])$data, ## weights = getCall(x$reg.output[[3L]]$mreg[[1L]])$weights)
##
## Deviance Residuals:
##     Min       1Q   Median       3Q      Max
## -3.2757   0.1151   0.1629   0.2471   0.5941
##
## Coefficients:
##             Estimate Std. Error z value Pr(>|z|)
## (Intercept)  -0.0388     0.6391  -0.061 0.951589
## A             1.6807     0.1436  11.706  < 2e-16 ***
## C1            2.4881     0.6482   3.838 0.000124 ***
## C2            0.9087     0.1344   6.762 1.36e-11 ***
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
##
## (Dispersion parameter for binomial family taken to be 1)
##
##     Null deviance: 2301.4  on 9999  degrees of freedom
## Residual deviance: 2077.1  on 9996  degrees of freedom
## AIC: 2085.1
##
## Number of Fisher Scoring iterations: 7
##
##
## # Regressions with imputed dataset 4
##
## ## Outcome regression:
##
## Call:
## glm(formula = Y ~ A + M + A * M + C1 + C2, family = binomial(),
##     data = getCall(x$reg.output[[4L]]$yreg)$data, weights = getCall(x$reg.output[[4L]]$yreg)$weights)
##
## Deviance Residuals:
##     Min       1Q   Median       3Q      Max
## -0.5675  -0.2413  -0.1751  -0.1606   3.0619
##
## Coefficients:
##             Estimate Std. Error z value Pr(>|z|)
## (Intercept)  -3.6023     0.7713  -4.671 3.00e-06 ***
## A             0.6911     0.5286   1.307  0.19111
## M            -1.0792     0.3742  -2.884  0.00392 **
## C1            0.9686     0.6978   1.388  0.16515
## C2           -0.8408     0.1445  -5.817 5.98e-09 ***
## A:M          -0.3692     0.5548  -0.665  0.50575
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
##
## (Dispersion parameter for binomial family taken to be 1)
##
##     Null deviance: 2022.7  on 9999  degrees of freedom
## Residual deviance: 1964.0  on 9994  degrees of freedom
## AIC: 1976
##
## Number of Fisher Scoring iterations: 7
##
##
## ## Mediator regressions:
##
## Call:
## glm(formula = M ~ A + C1 + C2, family = binomial(), data = getCall(x$reg.output[[4L]]$mreg[[1L]])$data, ## weights = getCall(x$reg.output[[4L]]$mreg[[1L]])$weights)
##
## Deviance Residuals:
##     Min       1Q   Median       3Q      Max
## -3.2808   0.1111   0.1632   0.2465   0.5690
##
## Coefficients:
##             Estimate Std. Error z value Pr(>|z|)
## (Intercept)   0.3146     0.6450   0.488  0.62577
## A             1.7271     0.1458  11.845  < 2e-16 ***
## C1            2.1044     0.6523   3.226  0.00125 **
## C2            0.9591     0.1360   7.054 1.74e-12 ***
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
##
## (Dispersion parameter for binomial family taken to be 1)
##
##     Null deviance: 2279.3  on 9999  degrees of freedom
## Residual deviance: 2048.9  on 9996  degrees of freedom
## AIC: 2056.9
##
## Number of Fisher Scoring iterations: 7
##
##
## # Regressions with imputed dataset 5
##
## ## Outcome regression:
##
## Call:
## glm(formula = Y ~ A + M + A * M + C1 + C2, family = binomial(),
##     data = getCall(x$reg.output[[5L]]$yreg)$data, weights = getCall(x$reg.output[[5L]]$yreg)$weights)
##
## Deviance Residuals:
##     Min       1Q   Median       3Q      Max
## -0.5604  -0.2433  -0.1709  -0.1579   3.0599
##
## Coefficients:
##             Estimate Std. Error z value Pr(>|z|)
## (Intercept)  -3.5033     0.7704  -4.547 5.44e-06 ***
## A             0.6653     0.5285   1.259  0.20812
## M            -1.0524     0.3737  -2.816  0.00486 **
## C1            0.8870     0.6969   1.273  0.20308
## C2           -0.8892     0.1456  -6.105 1.03e-09 ***
## A:M          -0.3842     0.5545  -0.693  0.48840
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
##
## (Dispersion parameter for binomial family taken to be 1)
##
##     Null deviance: 2015.0  on 9999  degrees of freedom
## Residual deviance: 1953.6  on 9994  degrees of freedom
## AIC: 1965.6
##
## Number of Fisher Scoring iterations: 7
##
##
## ## Mediator regressions:
##
## Call:
## glm(formula = M ~ A + C1 + C2, family = binomial(), data = getCall(x$reg.output[[5L]]$mreg[[1L]])$data, ## weights = getCall(x$reg.output[[5L]]$mreg[[1L]])$weights)
##
## Deviance Residuals:
##     Min       1Q   Median       3Q      Max
## -3.2809   0.1115   0.1641   0.2454   0.5732
##
## Coefficients:
##             Estimate Std. Error z value Pr(>|z|)
## (Intercept)   0.2656     0.6424   0.413 0.679244
## A             1.7167     0.1452  11.827  < 2e-16 ***
## C1            2.1534     0.6495   3.315 0.000916 ***
## C2            0.9635     0.1360   7.086 1.38e-12 ***
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
##
## (Dispersion parameter for binomial family taken to be 1)
##
##     Null deviance: 2279.3  on 9999  degrees of freedom
## Residual deviance: 2050.7  on 9996  degrees of freedom
## AIC: 2058.7
##
## Number of Fisher Scoring iterations: 7
##
##
## # Regressions with imputed dataset 6
##
## ## Outcome regression:
##
## Call:
## glm(formula = Y ~ A + M + A * M + C1 + C2, family = binomial(),
##     data = getCall(x$reg.output[[6L]]$yreg)$data, weights = getCall(x$reg.output[[6L]]$yreg)$weights)
##
## Deviance Residuals:
##     Min       1Q   Median       3Q      Max
## -0.5671  -0.2411  -0.1711  -0.1576   3.0601
##
## Coefficients:
##             Estimate Std. Error z value Pr(>|z|)
## (Intercept)  -3.4978     0.7662  -4.565 4.99e-06 ***
## A             0.6005     0.5175   1.160  0.24591
## M            -1.1364     0.3582  -3.172  0.00151 **
## C1            0.9577     0.7003   1.368  0.17145
## C2           -0.8730     0.1455  -5.999 1.98e-09 ***
## A:M          -0.3284     0.5441  -0.604  0.54612
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
##
## (Dispersion parameter for binomial family taken to be 1)
##
##     Null deviance: 2007.3  on 9999  degrees of freedom
## Residual deviance: 1945.8  on 9994  degrees of freedom
## AIC: 1957.8
##
## Number of Fisher Scoring iterations: 7
##
##
## ## Mediator regressions:
##
## Call:
## glm(formula = M ~ A + C1 + C2, family = binomial(), data = getCall(x$reg.output[[6L]]$mreg[[1L]])$data, ## weights = getCall(x$reg.output[[6L]]$mreg[[1L]])$weights)
##
## Deviance Residuals:
##     Min       1Q   Median       3Q      Max
## -3.2673   0.1146   0.1624   0.2531   0.5829
##
## Coefficients:
##             Estimate Std. Error z value Pr(>|z|)
## (Intercept)   0.1661     0.6337   0.262 0.793244
## A             1.7269     0.1437  12.015  < 2e-16 ***
## C1            2.2465     0.6415   3.502 0.000461 ***
## C2            0.9027     0.1333   6.775 1.25e-11 ***
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
##
## (Dispersion parameter for binomial family taken to be 1)
##
##     Null deviance: 2330.8  on 9999  degrees of freedom
## Residual deviance: 2097.8  on 9996  degrees of freedom
## AIC: 2105.8
##
## Number of Fisher Scoring iterations: 7
##
##
## # Regressions with imputed dataset 7
##
## ## Outcome regression:
##
## Call:
## glm(formula = Y ~ A + M + A * M + C1 + C2, family = binomial(),
##     data = getCall(x$reg.output[[7L]]$yreg)$data, weights = getCall(x$reg.output[[7L]]$yreg)$weights)
##
## Deviance Residuals:
##     Min       1Q   Median       3Q      Max
## -0.5652  -0.2426  -0.1738  -0.1608   3.0640
##
## Coefficients:
##             Estimate Std. Error z value Pr(>|z|)
## (Intercept)  -3.4733     0.7707  -4.507 6.58e-06 ***
## A             0.6897     0.5285   1.305  0.19189
## M            -1.1095     0.3750  -2.958  0.00309 **
## C1            0.8568     0.6994   1.225  0.22061
## C2           -0.8486     0.1449  -5.855 4.77e-09 ***
## A:M          -0.3552     0.5551  -0.640  0.52227
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
##
## (Dispersion parameter for binomial family taken to be 1)
##
##     Null deviance: 2015.0  on 9999  degrees of freedom
## Residual deviance: 1955.7  on 9994  degrees of freedom
## AIC: 1967.7
##
## Number of Fisher Scoring iterations: 7
##
##
## ## Mediator regressions:
##
## Call:
## glm(formula = M ~ A + C1 + C2, family = binomial(), data = getCall(x$reg.output[[7L]]$mreg[[1L]])$data, ## weights = getCall(x$reg.output[[7L]]$mreg[[1L]])$weights)
##
## Deviance Residuals:
##     Min       1Q   Median       3Q      Max
## -3.2870   0.1124   0.1613   0.2440   0.5882
##
## Coefficients:
##             Estimate Std. Error z value Pr(>|z|)
## (Intercept)  0.03183    0.64602   0.049 0.960703
## A            1.70552    0.14538  11.731  < 2e-16 ***
## C1           2.41571    0.65432   3.692 0.000223 ***
## C2           0.93227    0.13603   6.853 7.22e-12 ***
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
##
## (Dispersion parameter for binomial family taken to be 1)
##
##     Null deviance: 2264.5  on 9999  degrees of freedom
## Residual deviance: 2039.8  on 9996  degrees of freedom
## AIC: 2047.8
##
## Number of Fisher Scoring iterations: 7
##
##
## # Regressions with imputed dataset 8
##
## ## Outcome regression:
##
## Call:
## glm(formula = Y ~ A + M + A * M + C1 + C2, family = binomial(),
##     data = getCall(x$reg.output[[8L]]$yreg)$data, weights = getCall(x$reg.output[[8L]]$yreg)$weights)
##
## Deviance Residuals:
##     Min       1Q   Median       3Q      Max
## -0.5849  -0.2491  -0.1701  -0.1614   3.0541
##
## Coefficients:
##             Estimate Std. Error z value Pr(>|z|)
## (Intercept)  -3.0819     0.7609  -4.050 5.12e-05 ***
## A             0.7038     0.5002   1.407 0.159435
## M            -1.1883     0.3593  -3.307 0.000942 ***
## C1            0.5817     0.6971   0.834 0.404029
## C2           -0.8908     0.1449  -6.147 7.89e-10 ***
## A:M          -0.3894     0.5279  -0.738 0.460721
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
##
## (Dispersion parameter for binomial family taken to be 1)
##
##     Null deviance: 2030.4  on 9999  degrees of freedom
## Residual deviance: 1963.7  on 9994  degrees of freedom
## AIC: 1975.7
##
## Number of Fisher Scoring iterations: 7
##
##
## ## Mediator regressions:
##
## Call:
## glm(formula = M ~ A + C1 + C2, family = binomial(), data = getCall(x$reg.output[[8L]]$mreg[[1L]])$data, ## weights = getCall(x$reg.output[[8L]]$mreg[[1L]])$weights)
##
## Deviance Residuals:
##     Min       1Q   Median       3Q      Max
## -3.2753   0.1142   0.1640   0.2446   0.5847
##
## Coefficients:
##             Estimate Std. Error z value Pr(>|z|)
## (Intercept)  0.06976    0.64465   0.108 0.913822
## A            1.67172    0.14392  11.615  < 2e-16 ***
## C1           2.37882    0.65253   3.646 0.000267 ***
## C2           0.93134    0.13539   6.879 6.03e-12 ***
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
##
## (Dispersion parameter for binomial family taken to be 1)
##
##     Null deviance: 2279.3  on 9999  degrees of freedom
## Residual deviance: 2058.5  on 9996  degrees of freedom
## AIC: 2066.5
##
## Number of Fisher Scoring iterations: 7
##
##
## # Regressions with imputed dataset 9
##
## ## Outcome regression:
##
## Call:
## glm(formula = Y ~ A + M + A * M + C1 + C2, family = binomial(),
##     data = getCall(x$reg.output[[9L]]$yreg)$data, weights = getCall(x$reg.output[[9L]]$yreg)$weights)
##
## Deviance Residuals:
##     Min       1Q   Median       3Q      Max
## -0.5696  -0.2417  -0.1702  -0.1580   3.0560
##
## Coefficients:
##             Estimate Std. Error z value Pr(>|z|)
## (Intercept)  -3.4477     0.7728  -4.461 8.14e-06 ***
## A             0.6996     0.5294   1.322  0.18631
## M            -1.0701     0.3737  -2.863  0.00419 **
## C1            0.8415     0.7009   1.201  0.22994
## C2           -0.8754     0.1461  -5.993 2.07e-09 ***
## A:M          -0.4294     0.5554  -0.773  0.43942
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
##
## (Dispersion parameter for binomial family taken to be 1)
##
##     Null deviance: 1999.6  on 9999  degrees of freedom
## Residual deviance: 1939.4  on 9994  degrees of freedom
## AIC: 1951.4
##
## Number of Fisher Scoring iterations: 7
##
##
## ## Mediator regressions:
##
## Call:
## glm(formula = M ~ A + C1 + C2, family = binomial(), data = getCall(x$reg.output[[9L]]$mreg[[1L]])$data, ## weights = getCall(x$reg.output[[9L]]$mreg[[1L]])$weights)
##
## Deviance Residuals:
##     Min       1Q   Median       3Q      Max
## -3.2908   0.1099   0.1594   0.2463   0.5717
##
## Coefficients:
##             Estimate Std. Error z value Pr(>|z|)
## (Intercept)   0.2425     0.6477   0.374 0.708141
## A             1.7544     0.1476  11.889  < 2e-16 ***
## C1            2.1962     0.6549   3.354 0.000798 ***
## C2            0.9334     0.1367   6.830 8.51e-12 ***
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
##
## (Dispersion parameter for binomial family taken to be 1)
##
##     Null deviance: 2249.6  on 9999  degrees of freedom
## Residual deviance: 2021.0  on 9996  degrees of freedom
## AIC: 2029
##
## Number of Fisher Scoring iterations: 7
##
##
## # Regressions with imputed dataset 10
##
## ## Outcome regression:
##
## Call:
## glm(formula = Y ~ A + M + A * M + C1 + C2, family = binomial(),
##     data = getCall(x$reg.output[[10L]]$yreg)$data, weights = getCall(x$reg.output[[10L]]$yreg)$weights)
##
## Deviance Residuals:
##     Min       1Q   Median       3Q      Max
## -0.5648  -0.2429  -0.1716  -0.1598   3.0392
##
## Coefficients:
##             Estimate Std. Error z value Pr(>|z|)
## (Intercept)  -3.4182     0.7682  -4.450 8.59e-06 ***
## A             0.6923     0.5289   1.309  0.19056
## M            -1.0344     0.3727  -2.775  0.00552 **
## C1            0.8076     0.6967   1.159  0.24639
## C2           -0.8595     0.1447  -5.940 2.85e-09 ***
## A:M          -0.4509     0.5541  -0.814  0.41578
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
##
## (Dispersion parameter for binomial family taken to be 1)
##
##     Null deviance: 2022.7  on 9999  degrees of freedom
## Residual deviance: 1964.4  on 9994  degrees of freedom
## AIC: 1976.4
##
## Number of Fisher Scoring iterations: 7
##
##
## ## Mediator regressions:
##
## Call:
## glm(formula = M ~ A + C1 + C2, family = binomial(), data = getCall(x$reg.output[[10L]]$mreg[[1L]])$data, ## weights = getCall(x$reg.output[[10L]]$mreg[[1L]])$weights)
##
## Deviance Residuals:
##     Min       1Q   Median       3Q      Max
## -3.2811   0.1121   0.1615   0.2459   0.5713
##
## Coefficients:
##             Estimate Std. Error z value Pr(>|z|)
## (Intercept)   0.2112     0.6472   0.326 0.744143
## A             1.7095     0.1461  11.700  < 2e-16 ***
## C1            2.2442     0.6551   3.426 0.000613 ***
## C2            0.9230     0.1362   6.779 1.21e-11 ***
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
##
## (Dispersion parameter for binomial family taken to be 1)
##
##     Null deviance: 2257.0  on 9999  degrees of freedom
## Residual deviance: 2035.2  on 9996  degrees of freedom
## AIC: 2043.2
##
## Number of Fisher Scoring iterations: 7
##
##
## # Effect decomposition on the odds ratio scale via the regression-based approach with multiple imputation
##
## Closed-form parameter function estimation with
##  delta method standard errors, confidence intervals and p-values
##
##                Estimate Std.error  95% CIL 95% CIU    P.val
## Rcde            1.33974   0.22906  0.95827   1.873   0.0871 .
## Rpnde           1.43121   0.24439  1.02413   2.000   0.0358 *
## Rtnde           1.35828   0.22703  0.97885   1.885   0.0669 .
## Rpnie           0.92992   0.03662  0.86084   1.005   0.0651 .
## Rtnie           0.88254   0.05397  0.78285   0.995   0.0410 *
## Rte             1.26309   0.20237  0.92269   1.729   0.1449
## ERcde           0.31081   0.20477 -0.09053   0.712   0.1291
## ERintref        0.12117   0.11823 -0.11056   0.353   0.3055
## ERintmed       -0.09834   0.09609 -0.28667   0.090   0.3061
## ERpnie         -0.07006   0.03661 -0.14182   0.002   0.0556 .
## ERcde(prop)     1.18229   0.24912  0.69402   1.671 2.08e-06 ***
## ERintref(prop)  0.46292   0.50976 -0.53619   1.462   0.3638
## ERintmed(prop) -0.37585   0.41497 -1.18916   0.437   0.3651
## ERpnie(prop)   -0.26937   0.25241 -0.76409   0.225   0.2859
## pm             -0.64522   0.53551 -1.69480   0.404   0.2283
## int             0.08708   0.09587 -0.10083   0.275   0.3638
## pe             -0.18229   0.24912 -0.67057   0.306   0.4643
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
##
## (Rcde: controlled direct effect odds ratio; Rpnde: pure natural direct effect odds ratio; Rtnde: total natural direct effect odds ratio; Rpnie: pure natural indirect effect odds ratio; Rtnie: total natural indirect effect odds ratio; Rte: total effect odds ratio; ERcde: excess relative risk due to controlled direct effect; ERintref: excess relative risk due to reference interaction; ERintmed: excess relative risk due to mediated interaction; ERpnie: excess relative risk due to pure natural indirect effect; ERcde(prop): proportion ERcde; ERintref(prop): proportion ERintref; ERintmed(prop): proportion ERintmed; ERpnie(prop): proportion ERpnie; pm: overall proportion mediated; int: overall proportion attributable to interaction; pe: overall proportion eliminated)
##
## Relevant variable values:
## $a ## [1] 1 ## ##$astar
## [1] 0
##
## $yval ## [1] "1" ## ##$mval
## $mval[[1]] ## [1] 1 ## ## ##$basecval
## $basecval[[1]] ## [1] 0.9993387 ## ##$basecval[[2]]
## [1] 0.6060235

Compare the multiple imputation results with true results:

res_noNA <- cmest(data = data_noNA, model = "rb", outcome = "Y", exposure = "A",
mediator = "M", basec = c("C1", "C2"), EMint = TRUE,
mreg = list("logistic"), yreg = "logistic",
astar = 0, a = 1, mval = list(1),
estimation = "paramfunc", inference = "delta")
summary(res_noNA)
## Causal Mediation Analysis
##
## # Outcome regression:
##
## Call:
## glm(formula = Y ~ A + M + A * M + C1 + C2, family = binomial(),
##     data = getCall(x$reg.output$yreg)$data, weights = getCall(x$reg.output$yreg)$weights)
##
## Deviance Residuals:
##     Min       1Q   Median       3Q      Max
## -0.5681  -0.2414  -0.1741  -0.1604   3.0521
##
## Coefficients:
##             Estimate Std. Error z value Pr(>|z|)
## (Intercept)  -3.5731     0.7713  -4.633 3.61e-06 ***
## A             0.7084     0.5283   1.341  0.17990
## M            -1.0471     0.3736  -2.803  0.00507 **
## C1            0.9337     0.6973   1.339  0.18054
## C2           -0.8415     0.1443  -5.829 5.56e-09 ***
## A:M          -0.4222     0.5541  -0.762  0.44607
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
##
## (Dispersion parameter for binomial family taken to be 1)
##
##     Null deviance: 2022.7  on 9999  degrees of freedom
## Residual deviance: 1965.0  on 9994  degrees of freedom
## AIC: 1977
##
## Number of Fisher Scoring iterations: 7
##
##
## # Mediator regressions:
##
## Call:
## glm(formula = M ~ A + C1 + C2, family = binomial(), data = getCall(x$reg.output$mreg[[1L]])$data, ## weights = getCall(x$reg.output$mreg[[1L]])$weights)
##
## Deviance Residuals:
##     Min       1Q   Median       3Q      Max
## -3.2578   0.1145   0.1647   0.2519   0.5563
##
## Coefficients:
##             Estimate Std. Error z value Pr(>|z|)
## (Intercept)   0.4353     0.6410   0.679  0.49712
## A             1.7076     0.1443  11.837  < 2e-16 ***
## C1            1.9982     0.6474   3.086  0.00203 **
## C2            0.9024     0.1345   6.709 1.96e-11 ***
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
##
## (Dispersion parameter for binomial family taken to be 1)
##
##     Null deviance: 2294.0  on 9999  degrees of freedom
## Residual deviance: 2072.5  on 9996  degrees of freedom
## AIC: 2080.5
##
## Number of Fisher Scoring iterations: 7
##
##
## # Effect decomposition on the odds ratio scale via the regression-based approach
##
## Closed-form parameter function estimation with
##  delta method standard errors, confidence intervals and p-values
##
##                Estimate Std.error  95% CIL 95% CIU    P.val
## Rcde            1.33138   0.22278  0.95912   1.848   0.0872 .
## Rpnde           1.41988   0.23708  1.02358   1.970   0.0358 *
## Rtnde           1.34928   0.22056  0.97940   1.859   0.0669 .
## Rpnie           0.93338   0.03576  0.86587   1.006   0.0719 .
## Rtnie           0.88697   0.05295  0.78902   0.997   0.0445 *
## Rte             1.25939   0.19804  0.92535   1.714   0.1425
## ERcde           0.30416   0.20019 -0.08820   0.697   0.1287
## ERintref        0.11571   0.11472 -0.10914   0.341   0.3131
## ERintmed       -0.09387   0.09323 -0.27659   0.089   0.3140
## ERpnie         -0.06662   0.03576 -0.13670   0.003   0.0624 .
## ERcde(prop)     1.17262   0.23540  0.71124   1.634 6.32e-07 ***
## ERintref(prop)  0.44610   0.49051 -0.51528   1.407   0.3631
## ERintmed(prop) -0.36189   0.39881 -1.14354   0.420   0.3642
## ERpnie(prop)   -0.25684   0.23474 -0.71691   0.203   0.2739
## pm             -0.61872   0.50289 -1.60437   0.367   0.2186
## int             0.08421   0.09264 -0.09736   0.266   0.3633
## pe             -0.17262   0.23540 -0.63400   0.289   0.4634
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
##
## (Rcde: controlled direct effect odds ratio; Rpnde: pure natural direct effect odds ratio; Rtnde: total natural direct effect odds ratio; Rpnie: pure natural indirect effect odds ratio; Rtnie: total natural indirect effect odds ratio; Rte: total effect odds ratio; ERcde: excess relative risk due to controlled direct effect; ERintref: excess relative risk due to reference interaction; ERintmed: excess relative risk due to mediated interaction; ERpnie: excess relative risk due to pure natural indirect effect; ERcde(prop): proportion ERcde; ERintref(prop): proportion ERintref; ERintmed(prop): proportion ERintmed; ERpnie(prop): proportion ERpnie; pm: overall proportion mediated; int: overall proportion attributable to interaction; pe: overall proportion eliminated)
##
## Relevant variable values:
## $a ## [1] 1 ## ##$astar
## [1] 0
##
## $yval ## [1] "1" ## ##$mval
## $mval[[1]] ## [1] 1 ## ## ##$basecval
## $basecval[[1]] ## [1] 0.9993463 ## ##$basecval[[2]]
## [1] 0.6061