Clustered Standard errors, F Statistics.

Mustafa Ozer

Join Date: Apr 2015

Posts: 40
#1

12 Apr 2015, 11:44

Dear Statalist,

My question is the following:

I am using the Demographic and Health Survey of Turkey to estimate the equation below. Standard errors are clustered on the 26 regions in which individuals lived as children. The model includes 26 region fixed effects, 12 age fixed effects, three mother-tongue categories, and parents' educational attainment. It also includes dummies for whether the data come from the 2008 survey and whether the woman was young at the time of the survey, as well as an urban/rural dummy and a continuous wealth variable.

The equation below is my first-stage estimate, which I use to estimate the impact of education on women's health behaviours. However, when I estimated the model, Stata did not report the F statistic or its p-value for the model as a whole. It still reports the coefficients of the independent variables and their t statistics.

Could you please explain this to me? The model below was estimated with Stata 11.2, and the number of observations in each region is given after the output.

Thank you very much,

reg Woman_Edcation i.2008##i.young i.childhoodregion26 i.Age_Fixed_Effct i.mothertongue i.Parents_Edc urban wealths [aw=weight], nocon vce (cluster childhoodregion26)

Number of obs = 4026
F( 17, 26) = .
Prob > F = .
R-squared = 0.2814
Root MSE = .37696

   Childhood |
   region 26 |      Freq.
-------------+-----------
           1 |        160
           2 |         52
           3 |         68
           4 |         50
           5 |         47
           6 |        106
           7 |         84
           8 |         85
           9 |         99
          10 |        127
          11 |         78
          12 |        194
          13 |        245
          14 |        121
          15 |        163
          16 |         84
          17 |         68
          18 |        201
          19 |        198
          20 |        178
          21 |        208
          22 |        135
          23 |        344
          24 |        294
          25 |        452
          26 |        250
      Abroad |         50
-------------+-----------
       Total |      4,141
Clyde Schechter

Join Date: Apr 2014

Posts: 30091
#2

12 Apr 2015, 12:32

Using the cluster robust vce estimator, you only have 26 degrees of freedom, which you have more than exhausted with all those predictor variables.
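
A sketch of the reasoning behind this reply, assuming the usual cluster-robust (sandwich) variance formula for a weighted linear regression. The symbols G (number of clusters), q (number of restrictions tested), W (the weight matrix) and s_g (the score contribution of cluster g) are notation introduced here for illustration; they do not appear in the thread.

```latex
\[
\widehat{V} \;=\; c\,(X'WX)^{-1}\Big(\sum_{g=1}^{G} s_g s_g'\Big)(X'WX)^{-1},
\qquad s_g = X_g' W_g \hat{u}_g .
\]
% c is a scalar finite-sample correction; it does not affect the rank.
% The weighted least-squares normal equations imply that the s_g sum to zero:
\[
\sum_{g=1}^{G} s_g \;=\; X'W\hat{u} \;=\; 0
\quad\Longrightarrow\quad
\operatorname{rank}(\widehat{V}) \;\le\; G-1 .
\]
% A joint Wald test of q linear restrictions R\beta = 0 is based on
\[
F \;=\; \frac{1}{q}\,(R\hat{\beta})'\,\big(R\widehat{V}R'\big)^{-1}(R\hat{\beta}),
\]
% which requires R \widehat{V} R' to be invertible, hence q \le G-1.
```

In the output above the denominator degrees of freedom, 26, correspond to G - 1 (the frequency table lists 27 categories: 26 regions plus "Abroad"). The model has many more than 26 coefficients, so a joint test of all of them cannot be computed. A t test on a single coefficient needs only one diagonal element of the variance matrix, which is why those are still reported.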
Mustafa Ozer

Join Date: Apr 2015

Posts: 40
#3

13 Apr 2015, 03:42

Dear Statalist,

Thank you for your response.

The attached paper clusters standard errors on 26 sub-regions of Turkey. It then fits a model with regional fixed effects (26 regions), age fixed effects (12 different ages), year-of-birth fixed effects (17 different years), and other background characteristics, some of which also have more than one category. So the paper obviously has more than 26 parameters to test.

My question is whether this paper made a mistake by using too many fixed-effect variables. The authors used a difference-in-differences methodology. The paper was published as an NBER working paper and then in the journal World Development. They have more observations than my study, but they still cluster on 26 sub-regions of Turkey.

Does my problem occur because I have fewer observations in each cluster?

The model I am estimating is also a difference-in-differences model.

Young=treatment group
2008=the data set which was collected after treatment.

reg Woman_Edcation i.2008##i.young i.childhoodregion26 i.Age_Fixed_Effct i.mothertongue i.Parents_Edc urban wealths [aw=weight], nocon vce (cluster childhoodregion26)

Another point: the literature says that having fewer than 30 clusters may cause substantial problems with the standard errors of the regression coefficients.

Would using the "wild cluster bootstrap-t" instead of cluster-robust standard errors solve the problems with the degrees of freedom and the standard errors?

Regards
Attached file: Women’s Education Harbinger of Another Spring Evidence from a natural experiment in turkey.pdf
Mustafa Ozer

Join Date: Apr 2015

Posts: 40
#4

13 Apr 2015, 09:55

Thank you, I solved the problem.
Carlo Lazzaro

Join Date: Apr 2014

Posts: 17707
#5

13 Apr 2015, 23:44

Mustafa:
it would be interesting (for me, at least) to know how you fixed your problem. Thanks.

Kind regards,
Carlo
(Stata 19.0)
Mustafa Ozer

Join Date: Apr 2015

Posts: 40
#6

14 Apr 2015, 04:16

My first problem was the missing F statistic. What I learnt is that, because of the degrees of freedom, the overall F statistic may not be available, but the t statistic and p-value for each independent variable can still be used.
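
A minimal sketch of this point, using the nlsw88 example dataset that ships with Stata rather than the survey data from this thread. The choice of variables is purely illustrative: it assumes only that a model with few clusters and many indicator variables behaves like the one above, and the exact output will depend on the data.

```stata
* Illustration only: nlsw88 ships with Stata; it is not the DHS data from this thread.
sysuse nlsw88, clear

* Only 12 industry clusters, but the model has many more coefficients than that,
* so the overall model F test may be reported as missing.
regress wage i.industry i.occupation grade tenure, vce(cluster industry)

* A joint test of a small subset of coefficients (2 restrictions here)
* asks far less of the cluster-robust variance matrix.
test grade tenure

* Single-coefficient t statistics and p-values are in the regression table itself.
```

The same pattern applies to any subset of coefficients whose number stays below the number of clusters minus one.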

As for the small number of clusters, unfortunately there is no Stata code available for IV with the wild cluster bootstrap-t. Andrew M Menger, who writes such code for Stata, told me this. He said it will take a while to write it.

Fortunately, instead of clustering on region of childhood I clustered on province of childhood, which gave me 81 clusters. That solved the problem of having a small number of clusters.

Thanks again,

Kind Regards,

Mustafa
Carlo Lazzaro

Join Date: Apr 2014

Posts: 17707
#7

14 Apr 2015, 04:37

Mustafa:
thanks a lot for closing out this thread with some more details about the way you dealt with your original problem.

Kind regards,
Carlo
(Stata 19.0)
