Clustered Standard Errors differ across -mean, over() cluster()- and -clttest, by() cluster()-

Kristoffer Bjarkefur

Join Date: Feb 2016

Posts: 53
#1

19 Feb 2016, 12:09

Hi all,

I work with Impact Evaluations, where an important part of validating our analysis is balance tables (also known as difference-in-means tables). I am currently writing a command that quickly generates balance tables. I have found similar commands on SSC, but none of them allows for multiple treatment arms or includes all the options that the researchers in my team would like to have. We will test this command thoroughly within the team, and then happily share it as part of the ietoolkit package on SSC.

While writing this command I have researched the different ways that people in the Stata community estimate the statistics presented in balance tables, and I found something that I cannot understand. Posts in this forum have recommended using the built-in command -mean- as well as the user-written command -clttest- to generate standard errors for clustered group means. But the two commands generate different standard errors.

Specified as below I expected the two commands to generate the same output, but they don't. Output is similar, but I expected an exact match, as it is a straightforward estimation. Code below, output attached as images (I hope output in images is not discouraged in this forum, output looks terrible when I copy and paste from Stata to the forum).

mean HH_head_gender , over(tmt) cluster(village)
clttest HH_head_gender , by(tmt) cluster(village)

Please let me know if there is something I do not understand here. I obviously much prefer to base my command on built-in standard commands, but I am afraid I am missing some assumption that makes either of the two commands less precise for what I am trying to do. I have read the documentation for both commands, but I still don't understand why the standard errors are different.

I have emailed this question to the author of clttest as well, but since it touches on a general topic in statistics and not just on how clttest is implemented, I thought it was worthwhile to post it here too.

Any insight on this issue is much appreciated.

Thank you,

Kristoffer Bjarkefur

2 Photos
Tags: balance table, clttest, cluster, standard errors
Jeph Herrin

Join Date: Apr 2014

Posts: 335
#2

19 Feb 2016, 17:05

"Output is similar, but I expected an exact match, as it is a straightforward estimation."

As described in the Stata documentation, -mean- with the -cluster- option uses the robust sandwich estimator (hence the output labels the results as "Robust Std. Err."). In contrast, as the help file for -clttest- explains, -clttest- uses a variance inflation factor proposed by Donner & Klar in the reference cited. Thus, neither is a "straightforward" estimation.

Notably, the approach used by -clttest- assumes that the clusters ('village') are nested within the comparison groups ('tmt'). If this is not true, you should use Stata's -mean- command. If they are nested, both approaches are appropriate, but I think -mean, cluster()- is generally more conservative.
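
The two approaches can be run side by side with a minimal sketch like the one below. It assumes the variable names from post #1 and that -clttest- is already installed. The -regress- and -loneway- lines are additions for illustration and were not part of the original comparison: -regress- is another built-in route to a cluster-robust standard error on the difference in means, and -loneway- reports the intraclass correlation, the quantity that a variance-inflation adjustment of this kind depends on. Nothing here implies that the numbers will agree across commands.

```stata
* Cluster-robust (sandwich) standard errors for each group mean
mean HH_head_gender, over(tmt) vce(cluster village)

* Cluster-adjusted t-test (user-written command, installed separately)
clttest HH_head_gender, by(tmt) cluster(village)

* Illustration only (assumption, not in the original posts):
* difference in means with a cluster-robust standard error
regress HH_head_gender i.tmt, vce(cluster village)

* Illustration only: intraclass correlation of the outcome within villages
loneway HH_head_gender village
```

If villages are not nested within treatment arms, only the -mean- and -regress- lines are appropriate, per the nesting assumption described above.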

Hope this helps.

cheers,
Jeph
Kristoffer Bjarkefur

Join Date: Feb 2016

Posts: 53
#3

19 Feb 2016, 17:30

Great, thank you! I guess clustered standard errors are more complicated than I had understood. And sorry for not catching the differences when reading the documentation myself.

I will proceed with -mean- for my command as it is more general, and I prefer that my command not inherit assumptions about how clusters are nested, even though it sounds as if your approach is more efficient when it is appropriate to use.

Anyways, thank you again for your help and for your quick reply!

Best,
Kristoffer
Kristoffer Bjarkefur

Join Date: Feb 2016

Posts: 53
#4

12 Aug 2019, 08:52

Following up on my earlier posts after receiving questions: this command has been published in the ietoolkit package, which can be installed from SSC. Instructions and documentation are here: https://github.com/worldbank/ietoolkit