  • Clustered Standard errors, F Statistics.

    Dear Statalist,

    My question is the following:

    I am using the Turkey Demographic and Health Survey to estimate the equation below. Standard errors are clustered on the 26 regions in which individuals lived as children. The model includes 26 region fixed effects, 12 age fixed effects, three mother-tongue categories, and parents' educational attainment. It also includes dummies for whether the data come from the 2008 survey and whether the woman was young at the time of the survey (with their interaction), an urban/rural dummy, and a continuous wealth variable.

    The equation below is my first-stage estimate, which I need in order to estimate the impact of education on women's health behaviours. However, when I estimated the model, Stata did not report the F statistic or the probability value for the whole model. It still reports the coefficients of the independent variables and their t statistics.

    Could you please explain this to me? The model below was estimated with Stata 11.2, and the number of observations in each region is given.

    Thank you very much,




    reg Woman_Edcation i.2008##i.young i.childhoodregion26 i.Age_Fixed_Effct i.mothertongue i.Parents_Edc urban wealths [aw=weight], nocon vce (cluster childhoodregion26)

    Number of obs = 4026
    F( 17, 26) = .
    Prob > F = .
    R-squared = 0.2814
    Root MSE = .37696





    Childhood |
    region 26 |      Freq.
    ----------+-----------
            1 |        160
            2 |         52
            3 |         68
            4 |         50
            5 |         47
            6 |        106
            7 |         84
            8 |         85
            9 |         99
           10 |        127
           11 |         78
           12 |        194
           13 |        245
           14 |        121
           15 |        163
           16 |         84
           17 |         68
           18 |        201
           19 |        198
           20 |        178
           21 |        208
           22 |        135
           23 |        344
           24 |        294
           25 |        452
           26 |        250
       Abroad |         50
    ----------+-----------
        Total |      4,141

  • #2
    With the cluster-robust VCE estimator you have only 26 degrees of freedom, which you have more than exhausted with all those predictor variables. That is why Stata cannot compute the overall model F statistic and reports it as missing.
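
A minimal sketch of how one might check this in Stata, using simulated data rather than the survey itself. All variable names (region, age, x, y) are hypothetical stand-ins. After a regression with vce(cluster ...), the stored results e(N_clust), e(df_r) and e(df_m) hold the number of clusters, the denominator degrees of freedom and the model degrees of freedom. A joint test of a block with fewer restrictions than clusters, run with testparm, should still be feasible.

```stata
* Sketch on simulated data; every variable name here is hypothetical.
clear
set seed 12345
set obs 2600
generate region = ceil(_n/100)        // 26 clusters of 100 observations each
generate age    = 1 + mod(_n, 12)     // 12 age groups
generate x      = rnormal()
generate y      = 0.5*x + rnormal()

* Region and age fixed effects, standard errors clustered on region
regress y x i.region i.age, vce(cluster region)

* Compare the number of clusters with the degrees of freedom Stata used
display "clusters       = " e(N_clust)
display "denominator df = " e(df_r)
display "model df       = " e(df_m)

* Joint test of the 11 age dummies: fewer restrictions than clusters
testparm i.age
```

The same three display lines can be run directly after the original regression to see how many clusters and degrees of freedom Stata actually used.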


    • #3
      Dear Statalist,

      Thank you for your response.

      The paper attached to the mail clusters standard errors on 26 sub-regions of Turkey. It then fits the model with regional fixed effects (26 regions), age fixed effects (12 different ages), year-of-birth fixed effects (17 different years), and other background characteristics, some of which also have more than one category. So the paper obviously has more than 26 parameters to test.

      My question is whether this paper made a mistake by using too many fixed-effect variables. They were using a difference-in-differences methodology. It was published as an NBER working paper and then in the journal World Development. By the way, they have more observations than my study, but they still cluster on 26 sub-regions of Turkey.

      Does my problem occur because I have fewer observations in each cluster?

      In the meantime, note that the model I am estimating is a difference-in-differences model too.

      Young=treatment group
      2008=the data set which was collected after treatment.


      reg Woman_Edcation i.2008##i.young i.childhoodregion26 i.Age_Fixed_Effct i.mothertongue i.Parents_Edc urban wealths [aw=weight], nocon vce (cluster childhoodregion26)


      Another point is that, according to the literature, having fewer than 30 clusters may cause substantial problems with the standard errors of the regression coefficients.

      Would I solve the problems with the degrees of freedom and the standard errors if I used the "wild cluster bootstrap-t" instead of cluster-robust standard errors?

      Regards


      • #4
        Thank you, I solved the problem.


        • #5
          Mustafa:
          it would be interesting (for me, at least) to know how you fixed your problem. Thanks.
          Kind regards,
          Carlo
          (Stata 19.0)


          • #6
            My first problem was the missing F statistic. What I learnt is that, because of the degrees of freedom, I may not get the F statistic, but the t statistics and p-values for each independent variable can still be used.
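
A short sketch of why this happens, in notation that is assumed here and does not come from the thread: $G$ is the number of clusters, $X_g$ and $\hat u_g$ are the regressors and residuals of cluster $g$, and the finite-sample correction factor is omitted.

```latex
\[
\hat V_{\mathrm{CR}}
  = (X'X)^{-1}\Bigl(\sum_{g=1}^{G} X_g'\hat u_g \hat u_g' X_g\Bigr)(X'X)^{-1},
\qquad
\operatorname{rank}\bigl(\hat V_{\mathrm{CR}}\bigr) \le G .
\]
\[
F = \frac{1}{q}\,(R\hat\beta)'\bigl(R\,\hat V_{\mathrm{CR}}\,R'\bigr)^{-1}(R\hat\beta)
\quad\text{requires } R\,\hat V_{\mathrm{CR}}\,R' \text{ to be invertible, which fails when } q > G .
\]
```

Each cluster contributes one rank-one term to the middle matrix, so a joint test of $q$ restrictions cannot be computed once $q$ exceeds the number of clusters. A t test involves a single restriction ($q = 1$), which is why the individual t statistics are still reported.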

            As for the problem of having a small number of clusters, unfortunately there is no Stata code available for IV with the wild cluster bootstrap-t. Andrew M Menger, who writes the code for Stata, told me this. He said it will take a while to create the code.

            Fortunately, however, instead of clustering on region of childhood I clustered on province of childhood, which gave me 81 clusters. That solved the problem of having a small number of clusters.
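
A minimal sketch of that change on simulated data, not the survey itself. The variable names are hypothetical, and keeping the fixed effects at the region level while clustering on province is an assumption, since the post does not say which fixed effects were kept.

```stata
* Sketch on simulated data; every variable name here is hypothetical.
clear
set seed 12345
set obs 8100
generate province = ceil(_n/100)            // 81 provinces of 100 observations each
generate region   = ceil(province*26/81)    // provinces nested in 26 regions
generate x        = rnormal()
generate y        = 0.5*x + rnormal()

* Region fixed effects, but standard errors clustered on the finer province level
regress y x i.region, vce(cluster province)

display "clusters       = " e(N_clust)
display "denominator df = " e(df_r)
```

With 81 clusters the denominator degrees of freedom rise to 80, which leaves room for a joint test of the region dummies.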

            Thanks again,

            Kind Regards,

            Mustafa


            • #7
              Mustafa:
              thanks a lot for closing out this thread with some more details about the way you dealt with your original problem.
              Kind regards,
              Carlo
              (Stata 19.0)
