This post translates into R the Stata code for calculating the standard errors of predictions found in the document How are average marginal effects and their standard errors computed by margins using the delta method? by Jeff Pitblado of StataCorp.

First, read the data into R. If you have not already done so, you will need to install the R package readstata13 by issuing the command install.packages('readstata13'). Recall that Stata (until Stata 16) could hold only a single dataset in memory at a time, so commands issued in Stata to generate variables and the like do not explicitly name the data being used. In R, we will load the data into an object, dta.

library('readstata13')
dta <- read.dta13('http://www.stata-press.com/data/r14/margex.dta')

Fit the logistic regression model

We fit the same logistic regression model and check to see that it looks the same as Pitblado's:

fit <- glm(outcome ~ treatment + distance, data=dta, family=binomial)
summary(fit)
## 
## Call:
## glm(formula = outcome ~ treatment + distance, family = binomial, 
##     data = dta)
## 
## Deviance Residuals: 
##     Min       1Q   Median       3Q      Max  
## -0.8221  -0.7995  -0.4271  -0.4106   2.2555  
## 
## Coefficients:
##              Estimate Std. Error z value Pr(>|z|)    
## (Intercept) -2.337762   0.096241 -24.291  < 2e-16 ***
## treatment    1.427760   0.113082  12.626  < 2e-16 ***
## distance    -0.004775   0.001105  -4.321 1.55e-05 ***
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## 
## (Dispersion parameter for binomial family taken to be 1)
## 
##     Null deviance: 2732.1  on 2999  degrees of freedom
## Residual deviance: 2485.6  on 2997  degrees of freedom
## AIC: 2491.6
## 
## Number of Fisher Scoring iterations: 7

Looks good.

Standard errors for predictive margins

Now we will calculate the standard error of the predicted margin at each of the two levels of the treatment variable (0 and 1). Stata stores scalars, macros, and matrices from an analysis, which can be accessed via special names such as _b[<name>], which gives the coefficient associated with the named variable in a model. R, in contrast, stores results as attributes of the model fit object (in our case, the fit object above). I create a couple of objects here that correspond to the returned values of the margins command in Stata:

R object     Stata return value
betas        _b
vcov(fit)    e(V)

I also create two alternate datasets, d0 and d1, in which the value of treatment equals 0 and 1, respectively, and all other variables remain the same.

betas <- coef(fit)

d0 <- d1 <- dta    ## Create two copies of the data
d0$treatment <- 0  ## Set values of treatment to equal 0 in all rows
d1$treatment <- 1  ## Set values of treatment to equal 1 in all rows

Now we obtain the predicted values of all observations where treatment==0 and all observations where treatment==1. To do this on the probability scale, we use the plogis() function in R, which corresponds to the invlogit() function in Stata. We summarize the two sets of predictions to confirm that they match Pitblado's.

p0 <- plogis(betas['treatment']*d0$treatment + betas['distance']*d0$distance + betas['(Intercept)'])
p1 <- plogis(betas['treatment']*d1$treatment + betas['distance']*d1$distance + betas['(Intercept)'])
lapply(list(p0, p1), summary)
## [[1]]
##     Min.  1st Qu.   Median     Mean  3rd Qu.     Max. 
## 0.002695 0.082913 0.085490 0.079115 0.086995 0.087948 
## 
## [[2]]
##    Min. 1st Qu.  Median    Mean 3rd Qu.    Max. 
## 0.01114 0.27376 0.28045 0.26002 0.28432 0.28676

Success!¹

Now we calculate the matrices that make up the Jacobian matrix. The vecaccum command in Stata multiplies the first variable listed by the matrix formed from the remaining variables. In Stata, that matrix comes from the dataset in memory; in R, we construct it manually. We test our Jacobian against Pitblado's:

dp0dxb <- p0*(1-p0)
dp1dxb <- p1*(1-p1)
zero <- rep(0, nrow(dta))
one <- rep(1, nrow(dta))

J0 <- dp0dxb %*% matrix(c(zero, one, zero, dta$distance), ncol=4)
J1 <- dp1dxb %*% matrix(c(zero, one, one, dta$distance), ncol=4)
Jac <- matrix(c(J0/nrow(dta), J1/nrow(dta)), nrow=2, byrow=TRUE)

Notice that it doesn't look exactly the same. That is because Stata expects the intercept (what it calls _cons) to be in the last row/column of the variance-covariance matrix, whereas R expects the intercept to be in the first row/column.
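As a cross-check (a sketch of my own, not part of Pitblado's document), the same Jacobian can be built directly in R's coefficient order using model.matrix() on the two counterfactual datasets, which sidesteps the reordering issue entirely:

```r
## Model matrices for the counterfactual datasets; columns come out in
## R's coefficient order: (Intercept), treatment, distance.
X0 <- model.matrix(~ treatment + distance, data = d0)
X1 <- model.matrix(~ treatment + distance, data = d1)

## Weight each row by the derivative of the inverse link, then average
## over observations to get one Jacobian row per margin.
J <- rbind(colMeans(dp0dxb * X0), colMeans(dp1dxb * X1))

## This should reproduce the variance matrix V computed below.
J %*% vcov(fit) %*% t(J)
```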

With that adjustment, we can calculate the variances using the last three columns of the Jacobian. Using only those three matches the output from R's vcov(); Stata carries an extra row and column representing the reference category of the treatment variable, and all of the cells in that extra row and column equal zero. Now we test the variance matrix that Pitblado calculated:

V <- Jac[,2:4] %*% vcov(fit) %*% t(Jac[,2:4])
V
##               [,1]          [,2]
## [1,]  4.824092e-05 -1.180452e-07
## [2,] -1.180452e-07  1.249294e-04

And we get the same values!
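To finish the translation (a sketch of my own, continuing with the objects computed above), the standard errors that margins reports are the square roots of the diagonal of V, and the predictive margins themselves are simply the means of p0 and p1:

```r
## Predictive margins and their delta-method standard errors.
## The margins should be approximately 0.0791 and 0.2600, and the
## standard errors approximately 0.0069 and 0.0112.
cbind(margin = c(mean(p0), mean(p1)), std.error = sqrt(diag(V)))
```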


  1. As a side note, you can accomplish the same result in R using the predict() function. We could type p0 <- predict(fit, newdata=d0, type='response') instead to get the same result.
