Standard errors for regression adjustment: margins vs teffects ra

Daniele Girardi

Join Date: Dec 2023

Posts: 14
#1

22 Apr 2025, 07:08

Hello,

To estimate treatment effects using regression adjustment (RA) in Stata, one can use either teffects ra or margins.

The Stata manual entry on "margins, contrast" explains how to use margins to implement regression adjustment (pp. 1714-1718 in the Stata manual for version 18). It says: "The point estimates of the ATE [using teffects ra] are identical to those we obtained using margins, though the standard errors differ slightly from those reported by margins. The standard errors from the two estimators are, however, asymptotically equivalent, meaning they would coincide with a sufficiently large dataset." I can add that in all the examples and tests I ran, the standard errors from the margins implementation were slightly larger.

I would like to better understand the source of this difference in standard errors between the margins and the teffects ra implementations of RA. My understanding is that teffects ra estimates RA in one step: it writes down the RA estimator as a system of equations (GMM style) and then estimates the parameters of interest and their variance using quasi-maximum likelihood (QML). I'm less clear about how standard errors are estimated in the margins implementation of RA. The Stata manual says that the "linearization method" is used to estimate the variance when the vce(unconditional) option is specified, which is required when implementing RA with margins. But I'm not really familiar with the linearization method, and I don't understand precisely how it differs from the QML method used by teffects ra.

I would be very grateful for any guidance in understanding (even just intuitively, as a start) the source of the difference in standard errors; whether one of the two methods has better finite-sample properties in estimating standard errors; and whether it is always true that standard errors using margins are more conservative than those using teffects ra.

Thanks

Daniele
FernandoRios

Join Date: Apr 2014

Posts: 2470
#2

22 Apr 2025, 07:32

I believe the main difference comes from the degrees-of-freedom adjustment.
teffects ra uses GMM and thus reports robust standard errors with no degrees-of-freedom adjustment.
margins uses whatever approach was used in the regression step. So if -regress- uses no degrees-of-freedom adjustment, neither will margins.
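
A rough way to see the size of this effect, offered as a sketch rather than a statement from the manual: write N for the number of observations and k for the number of coefficients in the fully interacted regression (both symbols are notation introduced here). If the robust variance from -regress- carries the usual N/(N-k) multiplier and teffects ra applies none, the two standard errors would be related by

```latex
\[
\widehat{\mathrm{SE}}_{\text{margins}}
  = \sqrt{\frac{N}{N-k}}\;\widehat{\mathrm{SE}}_{\text{teffects ra}},
\qquad
\sqrt{\frac{N}{N-k}} \ge 1,
\qquad
\sqrt{\frac{N}{N-k}} \to 1 \ \text{as } N \to \infty \ (k \text{ fixed}).
\]
```

Under that assumption, the margins standard error is never smaller than the teffects ra one, and the gap shrinks as the sample grows.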
Daniele Girardi

Join Date: Dec 2023

Posts: 14
#3

23 Apr 2025, 03:09

Thanks Fernando! This makes sense to me, and it is consistent with two facts: the SEs from the margins implementation of RA seem to be systematically more conservative than those from teffects ra, and the discrepancy is larger in smaller samples (where the dof correction has more bite). I have been trying to confirm this explanation, either from the Stata manual or by changing the dof correction in the regression before margins to reconcile the standard errors, but with no luck so far. I also can't determine whether this means that the SEs produced by margins are overly conservative. That might be the case if the dof adjustment also corrects for interacted fixed effects that are actually partly overlapping. If you (or anyone else) have any further information or thoughts about this, or materials you can point me to, I would of course be super grateful.

Enrique Pinzon (StataCorp)

StataCorp Employee

Join Date: Jan 2015

Posts: 217
#4

23 Apr 2025, 05:07

Hi Daniele,

Below is an example of what Fernando was mentioning

Code:

. sysuse auto, clear 
(1978 automobile data)

. // teffects
. teffects ra (mpg price) (foreign)

Iteration 0:  EE criterion =  3.966e-29  
Iteration 1:  EE criterion =  8.381e-30  

Treatment-effects estimation                    Number of obs     =         74
Estimator      : regression adjustment
Outcome model  : linear
Treatment model: none
------------------------------------------------------------------------------
             |               Robust
         mpg | Coefficient  std. err.      z    P>|z|     [95% conf. interval]
-------------+----------------------------------------------------------------
ATE          |
     foreign |
   (Foreign  |
         vs  |
  Domestic)  |   5.366788   1.270333     4.22   0.000     2.876981    7.856594
-------------+----------------------------------------------------------------
POmean       |
     foreign |
   Domestic  |   19.75523   .6185438    31.94   0.000      18.5429    20.96755
------------------------------------------------------------------------------

. // margins with unconditional standard errors
. quietly regress mpg i.foreign##(c.price), vce(robust)

. margins r.foreign, vce(unconditional) contrast(nowald) 

Contrasts of predictive margins                             Number of obs = 74

Expression: Linear prediction, predict()

------------------------------------------------------------------------
                       |            Unconditional
                       |   Contrast   std. err.     [95% conf. interval]
-----------------------+------------------------------------------------
               foreign |
(Foreign vs Domestic)  |   5.366788   1.306124      2.761806     7.97177
------------------------------------------------------------------------

. di sqrt(r(V)[1,1]*(70/74))
1.2703329
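
To unpack the last line of the log: the factor 70/74 appears to be (N-k)/N, with N = 74 observations and k = 4 coefficients in the interacted regression (constant, foreign, price, and their interaction). The symbols N and k are notation added here, not part of the output. As a worked check using the numbers reported above:

```latex
\[
1.306124 \times \sqrt{\frac{74-4}{74}}
  = 1.306124 \times \sqrt{\frac{70}{74}}
  \approx 1.306124 \times 0.972598
  \approx 1.27033 ,
\]
```

which agrees with the teffects ra standard error of 1.270333.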


Daniele Girardi

Join Date: Dec 2023

Posts: 14
#5

23 Apr 2025, 05:19

Thank you Enrique, this example is super helpful and clarifying.


Daniele Girardi

Join Date: Dec 2023

Posts: 14
#6

25 Apr 2025, 06:05

Hi Enrique,

To generalise from your example: with robust (but not clustered) standard errors, i.e. vce(robust), I can reconcile the standard errors by computing

Code:

sqrt(r(V)[1,1]*((e(N)-e(df_m)-1)/e(N)))

If we had cluster-robust standard errors, i.e. vce(cluster clustvar), then based on my understanding of how Stata computes cluster-robust SEs, I guess we would have to compute

Code:

sqrt(r(V)[1,1]*((e(N)-e(df_m)-1)/(e(N)-1))*((e(N_clust)-1)/e(N_clust)))

Are these general versions of the 'conversion formulas' correct, or am I getting anything wrong? Thanks so much for your help
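
For anyone who wants to try both conversions end to end, here is a minimal sketch. It assumes that e(df_m) + 1 equals the number of estimated regression coefficients and that the ATE coefficient stored by teffects ra is named ATE:r1vs0.<treatvar>. The scalar names (se_te, se_mg, adj, and so on) and the choice of mage as cluster variable are purely illustrative.

```stata
* Sketch: recover the teffects ra standard error from regress + margins.
* Scalar names below are illustrative and not part of any command's output.

* --- Robust (non-clustered) case ---------------------------------------
sysuse auto, clear

* Benchmark: teffects ra (robust SEs, no small-sample adjustment)
quietly teffects ra (mpg price) (foreign)
scalar se_te = _se[ATE:r1vs0.foreign]

* Same point estimate from a fully interacted regression plus margins
quietly regress mpg i.foreign##c.price, vce(robust)
quietly margins r.foreign, vce(unconditional) contrast(nowald)
matrix V = r(V)

* Undo the N/(N-k) factor, with k = e(df_m) + 1 (assumption)
scalar se_mg = sqrt(V[1,1])
scalar adj   = sqrt((e(N) - e(df_m) - 1) / e(N))
display "teffects ra SE:      " se_te
display "margins SE:          " se_mg
display "margins SE rescaled: " se_mg * adj

* --- Cluster-robust case (cluster variable chosen for illustration) -----
webuse cattaneo2, clear

quietly teffects ra (bweight prenatal1 mmarried mage fbaby) (mbsmoke), ///
    vce(cluster mage)
scalar se_te_cl = _se[ATE:r1vs0.mbsmoke]

quietly regress bweight ///
    i.mbsmoke##(c.prenatal1 c.mmarried c.mage c.fbaby), vce(cluster mage)
quietly margins r.mbsmoke, vce(unconditional) contrast(nowald)
matrix V = r(V)

* Undo both the (N-1)/(N-k) and the G/(G-1) factors
scalar se_mg_cl = sqrt(V[1,1])
scalar adj_cl   = sqrt(((e(N) - e(df_m) - 1) / (e(N) - 1)) * ///
                       ((e(N_clust) - 1) / e(N_clust)))
display "teffects ra SE (cluster):      " se_te_cl
display "margins SE rescaled (cluster): " se_mg_cl * adj_cl
```

If the assumptions hold, the rescaled margins SE and the teffects ra SE should agree up to numerical precision in each case. Note that e(N), e(df_m), and e(N_clust) are read from the -regress- results, which margins leaves in place when the post option is not used.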

Last edited by Daniele Girardi; 25 Apr 2025, 06:07.
Enrique Pinzon (StataCorp)

StataCorp Employee

Join Date: Jan 2015

Posts: 217
#7

25 Apr 2025, 06:47

Hi Daniele,

What you are doing is more general than what I wrote and will therefore be more helpful to those who search for this question. Your intuition and computations are correct.

Enrique
P.S: I am happy to see that -lpdid- is so popular.
Daniele Girardi

Join Date: Dec 2023

Posts: 14
#8

25 Apr 2025, 07:06

Thanks Enrique for confirming that my formulas are correct. With many regressors/fixed effects, -teffects ra- can be pretty slow, and I think being able to obtain exactly the same estimates & SEs but faster using -margins- is a very useful trick.

PS: I'm very glad that people are finding -lpdid- helpful!

Ben Jann

Join Date: Sep 2014

Posts: 262
#9

25 Apr 2025, 08:47

Hi Daniele,

You can also use kmatch ra, which will be much faster than teffects ra (and is not affected by the convergence problems sometimes observed with teffects ra) and probably also faster and more robust than regress followed by margins.

kmatch divides by N-1, so you also need to apply a small correction if you want the exact same SE as reported by teffects.

Example:

Code:

. webuse cattaneo2
(Excerpt from Cattaneo (2010) Journal of Econometrics 155: 138–154)

. teffects ra (bweight prenatal1 mmarried mage fbaby) (mbsmoke)

Iteration 0:  EE criterion = 7.734e-24  
Iteration 1:  EE criterion = 1.196e-25  

Treatment-effects estimation                    Number of obs     =      4,642
Estimator      : regression adjustment
Outcome model  : linear
Treatment model: none
----------------------------------------------------------------------------------------
                       |               Robust
               bweight | Coefficient  std. err.      z    P>|z|     [95% conf. interval]
-----------------------+----------------------------------------------------------------
ATE                    |
               mbsmoke |
(Smoker vs Nonsmoker)  |  -239.6392   23.82402   -10.06   0.000    -286.3334    -192.945
-----------------------+----------------------------------------------------------------
POmean                 |
               mbsmoke |
            Nonsmoker  |   3403.242   9.525207   357.29   0.000     3384.573    3421.911
----------------------------------------------------------------------------------------

. kmatch ra mbsmoke (bweight = prenatal1 mmarried mage fbaby), nomtable

Regression adjustment                                    Number of obs = 4,642

Treatment   : mbsmoke = 1
RA equations: bweight = prenatal1 mmarried mage fbaby _cons

Treatment-effects estimation
------------------------------------------------------------------------------
     bweight | Coefficient  Std. err.      t    P>|t|     [95% conf. interval]
-------------+----------------------------------------------------------------
         ATE |  -239.6392   23.82659   -10.06   0.000    -286.3506   -192.9278
------------------------------------------------------------------------------

. di %9.0g _se[ATE] * sqrt((e(N)-1) / e(N))
 23.82402

With clustered SEs:

Code:

. teffects ra (bweight prenatal1 mmarried mage fbaby) (mbsmoke), vce(cluster mage)

Iteration 0:  EE criterion = 7.734e-24  
Iteration 1:  EE criterion = 1.196e-25  

Treatment-effects estimation                    Number of obs     =      4,642
Estimator      : regression adjustment
Outcome model  : linear
Treatment model: none
                                            (Std. err. adjusted for 33 clusters in mage)
----------------------------------------------------------------------------------------
                       |               Robust
               bweight | Coefficient  std. err.      z    P>|z|     [95% conf. interval]
-----------------------+----------------------------------------------------------------
ATE                    |
               mbsmoke |
(Smoker vs Nonsmoker)  |  -239.6392   23.19119   -10.33   0.000    -285.0931   -194.1853
-----------------------+----------------------------------------------------------------
POmean                 |
               mbsmoke |
            Nonsmoker  |   3403.242   14.19689   239.72   0.000     3375.417    3431.068
----------------------------------------------------------------------------------------

. kmatch ra mbsmoke (bweight = prenatal1 mmarried mage fbaby), nomtable vce(cluster mage)

Regression adjustment                                    Number of obs = 4,642

Treatment   : mbsmoke = 1
RA equations: bweight = prenatal1 mmarried mage fbaby _cons

Treatment-effects estimation
                                  (Std. err. adjusted for 33 clusters in mage)
------------------------------------------------------------------------------
     bweight | Coefficient  Std. err.      t    P>|t|     [95% conf. interval]
-------------+----------------------------------------------------------------
         ATE |  -239.6392   23.55077   -10.18   0.000    -287.6106   -191.6679
------------------------------------------------------------------------------

. di %9.0g _se[ATE] * sqrt((e(N_clust)-1) / e(N_clust))
 23.19119
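
Both corrections can be checked by hand from the reported numbers. As a sketch, with N = 4,642 observations and G = 33 clusters (the symbols N and G are introduced here for illustration):

```latex
\[
23.82659 \times \sqrt{\frac{N-1}{N}}
  = 23.82659 \times \sqrt{\frac{4641}{4642}} \approx 23.82402,
\qquad
23.55077 \times \sqrt{\frac{G-1}{G}}
  = 23.55077 \times \sqrt{\frac{32}{33}} \approx 23.1912 .
\]
```

These agree with the teffects ra standard errors of 23.82402 and 23.19119 shown in the logs.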

ben


Daniele Girardi

Join Date: Dec 2023

Posts: 14
#10

25 Apr 2025, 08:54

Thanks Ben, I will definitely try -kmatch ra- out!

Ben Jann

Join Date: Sep 2014
Posts: 262

#11

27 Apr 2025, 05:29

By the way, you could also use the command listreg (speed will be similar to kmatch), even though listreg was written for something else (analysis of data from so-called list experiments). An advantage of listreg is that it has the option normal, which will give you the same SEs as teffects ra. The default estimator implemented in listreg is equivalent to a regression-adjustment estimator of the average treatment effect on the treated (ATET). Example:

Code:

. webuse cattaneo2
(Excerpt from Cattaneo (2010) Journal of Econometrics 155: 138–154)

. teffects ra (bweight prenatal1 mmarried mage fbaby) (mbsmoke), atet

Iteration 0:  EE criterion = 7.629e-24  
Iteration 1:  EE criterion = 2.697e-26  

Treatment-effects estimation                    Number of obs     =      4,642
Estimator      : regression adjustment
Outcome model  : linear
Treatment model: none
----------------------------------------------------------------------------------------
                       |               Robust
               bweight | Coefficient  std. err.      z    P>|z|     [95% conf. interval]
-----------------------+----------------------------------------------------------------
ATET                   |
               mbsmoke |
(Smoker vs Nonsmoker)  |  -223.3017    22.7422    -9.82   0.000    -267.8755   -178.7278
-----------------------+----------------------------------------------------------------
POmean                 |
               mbsmoke |
            Nonsmoker  |   3360.961   12.75749   263.45   0.000     3335.957    3385.966
----------------------------------------------------------------------------------------

. listreg bweight mbsmoke, controls(prenatal1 mmarried mage fbaby) normal

List experiment regression                               Number of obs = 4,642

------------------------------------------------------------------------------
             |               Robust
     bweight | Coefficient  std. err.      z    P>|z|     [95% conf. interval]
-------------+----------------------------------------------------------------
       _cons |  -223.3017    22.7422    -9.82   0.000    -267.8755   -178.7278
------------------------------------------------------------------------------
Long-list indicator: mbsmoke
Group sizes:         3778 in short-list, 864 in long-list
Short-list controls: prenatal1 mmarried mage fbaby

However, with a little trick (specify two outcome variables, the original variable and its reverse), you can also estimate the ATE:

Code:

. teffects ra (bweight prenatal1 mmarried mage fbaby) (mbsmoke)

Iteration 0:  EE criterion = 7.734e-24  
Iteration 1:  EE criterion = 1.196e-25  

Treatment-effects estimation                    Number of obs     =      4,642
Estimator      : regression adjustment
Outcome model  : linear
Treatment model: none
----------------------------------------------------------------------------------------
                       |               Robust
               bweight | Coefficient  std. err.      z    P>|z|     [95% conf. interval]
-----------------------+----------------------------------------------------------------
ATE                    |
               mbsmoke |
(Smoker vs Nonsmoker)  |  -239.6392   23.82402   -10.06   0.000    -286.3334    -192.945
-----------------------+----------------------------------------------------------------
POmean                 |
               mbsmoke |
            Nonsmoker  |   3403.242   9.525207   357.29   0.000     3384.573    3421.911
----------------------------------------------------------------------------------------

. gen _bweight = bweight * -1

. listreg bweight _bweight = mbsmoke, controls(prenatal1 mmarried mage fbaby) normal

List experiment regression                               Number of obs = 4,642

------------------------------------------------------------------------------
             |               Robust
             | Coefficient  std. err.      z    P>|z|     [95% conf. interval]
-------------+----------------------------------------------------------------
       _cons |  -239.6392   23.82402   -10.06   0.000    -286.3334    -192.945
------------------------------------------------------------------------------
Double-list method:  pooled
Outcome variables:   bweight _bweight
Long-list indicator: mbsmoke
List 1 group sizes:  3778 in short-list, 864 in long-list
List 2 group sizes:  864 in short-list, 3778 in long-list
Short-list controls: prenatal1 mmarried mage fbaby

A cool thing about listreg is that you can use it for analysis of treatment-effect heterogeneity:

Code:

. // ATET
. listreg bweight mbsmoke prenatal1 mmarried mage fbaby, normal

List experiment regression                              Number of obs =  4,642
                                                        Wald chi2(4)  =  20.31
                                                        Prob > chi2   = 0.0004

------------------------------------------------------------------------------
             |               Robust
     bweight | Coefficient  std. err.      z    P>|z|     [95% conf. interval]
-------------+----------------------------------------------------------------
   prenatal1 |  -39.29726   48.86623    -0.80   0.421    -135.0733     56.4788
    mmarried |  -27.28955   48.76806    -0.56   0.576    -122.8732    68.29409
        mage |  -9.917709   4.705036    -2.11   0.035    -19.13941   -.6960073
       fbaby |   112.7685   44.30192     2.55   0.011     25.93834    199.5987
       _cons |   24.42354    117.549     0.21   0.835    -205.9682    254.8153
------------------------------------------------------------------------------
Long-list indicator: mbsmoke
Group sizes:         3778 in short-list, 864 in long-list
Short-list controls: prenatal1 mmarried mage fbaby

. // ATE
. listreg bweight _bweight = mbsmoke prenatal1 mmarried mage fbaby, normal

List experiment regression                              Number of obs =  4,642
                                                        Wald chi2(4)  =  20.31
                                                        Prob > chi2   = 0.0004

------------------------------------------------------------------------------
             |               Robust
             | Coefficient  std. err.      z    P>|z|     [95% conf. interval]
-------------+----------------------------------------------------------------
   prenatal1 |  -39.29726   48.86623    -0.80   0.421    -135.0733     56.4788
    mmarried |  -27.28955   48.76806    -0.56   0.576    -122.8732    68.29409
        mage |  -9.917709   4.705036    -2.11   0.035    -19.13941   -.6960073
       fbaby |   112.7685   44.30192     2.55   0.011     25.93834    199.5987
       _cons |   24.42354    117.549     0.21   0.835    -205.9682    254.8153
------------------------------------------------------------------------------
Double-list method:  pooled
Outcome variables:   bweight _bweight
Long-list indicator: mbsmoke
List 1 group sizes:  3778 in short-list, 864 in long-list
List 2 group sizes:  864 in short-list, 3778 in long-list
Short-list controls: prenatal1 mmarried mage fbaby

Best,
ben


Daniele Girardi

Join Date: Dec 2023

Posts: 14
#12

28 Apr 2025, 03:51

Thanks Ben! That's very useful and I will try it out too.