  • clustering robust standard errors in one wave dataset

    Hello,

    I have a one-wave dataset of 158 firms. Is it useful to cluster the standard errors? If so, I would like to cluster them by industry; however, I have 35 industry dummies. I tried to do it the way I found suggested online:

    Code:
    . regress abschangecarbonintensity firmsize profitability leverage age capitalintensity CAPEX KZindex elektrici
    > tygenerator Carbonleakage industry10 industry11 industry13 industry16 industry17 industry19 industry20 indust
    > ry21 industry22 industry23 industry24 industry25 industry28 industry29 industry30 industry35 industry42 indus
    > try46 industry47 industry49 industry52 industry63 industry70 industry72 industry81 WestFlanders Hainaut Antwe
    > rp Brussels FlemishBrabant Limbourg Liege Namur WalloonBrabant Luxembourg SME  publicfirm, robust cluster  in
    > dustry10 industry11 industry13 industry16 industry17 industry19 industry20 industry21 industry22 industry23 i
    > ndustry24 industry25 industry28 industry29 industry30 industry35 industry42 industry46 industry47 industry49 
    > industry52 industry63 industry70 industry72 industry81
    option cluster incorrectly specified
    r(198);
    but I get an error. Do you have any advice?

    Kind regards,
    Timea De Wispelaere

  • #2
    Timea:
    -robust- and -cluster- cannot go together.
    If you want to cluster the standard errors, just use -vce(cluster clusterid)-, as you can see from the following toy example:
    Code:
    . sysuse auto.dta
    (1978 Automobile Data)
    
    . regress price mpg trunk, robust cluster
    option cluster incorrectly specified
    r(198);
    
    . regress price mpg trunk, robust vce(cluster foreign)
    options vce() and robust may not be combined
    r(198);
    
    . regress price mpg trunk, vce(cluster foreign)
    
    Linear regression                               Number of obs     =         74
                                                    F(0, 1)           =          .
                                                    Prob > F          =          .
                                                    R-squared         =     0.2222
                                                    Root MSE          =     2637.6
    
                                    (Std. Err. adjusted for 2 clusters in foreign)
    ------------------------------------------------------------------------------
                 |               Robust
           price |      Coef.   Std. Err.      t    P>|t|     [95% Conf. Interval]
    -------------+----------------------------------------------------------------
             mpg |  -220.1649   45.58852    -4.83   0.130    -799.4219    359.0922
           trunk |   43.55851   9.441408     4.61   0.136    -76.40595     163.523
           _cons |   10254.95   185.7531    55.21   0.012     7894.732    12615.17
    ------------------------------------------------------------------------------
    
    .
    As an aside, my guess is that you can also make your code more efficient by preferring the long data layout (see -reshape-) and by relying on -fvvarlist- notation to create the categorical variables (and interactions, if any) instead of typing out each dummy.
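    Something along these lines might do it (an untested sketch only: it assumes the 35 industry dummies have been collapsed into a single categorical variable -industry- and the province dummies into -province-; both names are hypothetical):
    Code:
    * untested sketch: -industry- and -province- are assumed to be single
    * categorical variables, not the separate 0/1 dummies used in #1
    regress abschangecarbonintensity firmsize profitability leverage age ///
        capitalintensity CAPEX KZindex elektricitygenerator Carbonleakage ///
        i.industry i.province SME publicfirm, vce(cluster industry)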
    Kind regards,
    Carlo
    (Stata 19.0)

    • #3
      Carlo: One should never cluster standard errors with only two clusters. It matters a lot how many clusters there are: something on the order of 30 should be a minimum, and even then it depends on the cluster sizes. The context matters a lot.

      Timea: The question of whether to cluster is a subtle one. The data weren't obtained by cluster sampling, correct? So there's no reason to cluster to account for the sampling scheme. Further, it looks to me like the variables you're interested in are measured at the firm level. Again, no reason for clustering. It certainly can make sense to include industry dummies, but you don't need to cluster at the industry level. Industries with only a single firm, if there are any, will not contribute to the estimation.
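      (In Stata terms, that advice amounts to something like the untested sketch below, where -industry- is a hypothetical categorical variable and the remaining controls follow #1:)
      Code:
      * untested sketch: industry dummies via factor-variable notation, with
      * heteroskedasticity-robust rather than cluster-robust standard errors
      regress abschangecarbonintensity firmsize profitability leverage ///
          i.industry, vce(robust)   // other firm-level controls omitted for brevity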

      May I recommend my paper with Abadie, Athey, and Imbens, "When Should You Adjust Standard Errors for Clustering?" It is available as an NBER working paper.

      • #4
        Jeff: you're obviously correct.
        The idea was to propose a toy example.
        In the real world I would never use clustered standard errors with such a scant number of clusters.
        Many thanks for reminding interested listers and myself of the NBER working paper, which I downloaded some time ago.
        Last edited by Carlo Lazzaro; 06 Apr 2020, 08:08.
        Kind regards,
        Carlo
        (Stata 19.0)

        • #5
          Thank you very much, this was very helpful!

          • #6
            Originally posted by Jeff Wooldridge

            Timea: The question to cluster is a subtle one. The data weren't obtained by cluster sampling, correct? So there's no reason to cluster to account for the sampling scheme.
            It depends how you look at it. The sample obtained is actually all firms that are regulated by the EU emission trading system. Whether or not a firm is regulated by this system depends (among other things) on the industry it is active in. So if you look at the population of Belgian ETS-regulated firms, no sampling has been done. However, in a sense, cluster sampling has been done from the total population of Belgian firms. What is your view on this?

            • #7
              Carlo, understood. But in my view the danger of toy examples is that beginners don't understand that, to extrapolate from the "toy" example, certain conditions must be met. I'd wager there are plenty of people on this forum who don't know that one should have around three dozen clusters in order to have faith in the standard errors. If I may kindly make a suggestion: use data sets that are up to the job of illustrating the task at hand. The auto.dta data set is overused and really can't illustrate much beyond basic regression.

              Timea: If you want to view assignment to the "treatment" as being essentially at the industry level, then there is a case to be made for clustering by industry. But it would be better to use the "absorb" option so that you don't see the clustered standard errors on the industry dummies, as those are useless. To get further help, you need to show exactly how you typed the command; I can't see that from the output you showed.
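              (A hedged sketch of that route via -areg-, assuming -industry- is a categorical variable; the other variable names follow #1 and are abbreviated:)
              Code:
              * untested sketch: -areg- absorbs the industry dummies so their
              * (uninformative) cluster-robust standard errors are not reported
              areg abschangecarbonintensity firmsize profitability leverage ///
                  capitalintensity CAPEX KZindex SME publicfirm, ///
                  absorb(industry) vce(cluster industry)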

              • #8
                Jeff:
                I agree with you. The (over)use of the -auto- dataset is probably due to the fact that it's easy to recall (-sysuse auto.dta-).
                The dataset is good for presenting many facets of -regress-, but cannot replace the features of datasets created for panel data analysis.
                With the benefit of hindsight, I should have used -nlswork.dta- as the basis for a toy example:
                Code:
                . use "https://www.stata-press.com/data/r16/nlswork.dta"
                (National Longitudinal Survey.  Young Women 14-26 years of age in 1968)
                
                . regress ln_wage wks_ue tenure , robust cluster
                option cluster incorrectly specified
                r(198);
                
                . regress ln_wage wks_ue tenure , robust vce(cluster idcode)
                options vce() and robust may not be combined
                r(198);
                
                . regress ln_wage wks_ue tenure , vce(cluster idcode)
                
                Linear regression                               Number of obs     =     22,445
                                                                F(2, 4630)        =     729.98
                                                                Prob > F          =     0.0000
                                                                R-squared         =     0.1323
                                                                Root MSE          =     .43705
                
                                             (Std. Err. adjusted for 4,631 clusters in idcode)
                ------------------------------------------------------------------------------
                             |               Robust
                     ln_wage |      Coef.   Std. Err.      t    P>|t|     [95% Conf. Interval]
                -------------+----------------------------------------------------------------
                      wks_ue |  -.0042301   .0004815    -8.78   0.000    -.0051741    -.003286
                      tenure |   .0492824   .0013837    35.62   0.000     .0465698     .051995
                       _cons |   1.514731   .0062338   242.99   0.000     1.502509    1.526952
                ------------------------------------------------------------------------------
                
                .
                Kind regards,
                Carlo
                (Stata 19.0)

                • #9
                  On second thought: as the original poster (OP) reported one wave of data only, my last reply can be tweaked a bit to meet the OP's needs:
                  Code:
                  . use "https://www.stata-press.com/data/r16/nlswork.dta"
                  (National Longitudinal Survey.  Young Women 14-26 years of age in 1968)
                  
                  . regress ln_wage wks_ue tenure if year==70, robust cluster
                  option cluster incorrectly specified
                  r(198);
                  
                  . regress ln_wage wks_ue tenure if year==70, robust vce(cluster idcode)
                  options vce() and robust may not be combined
                  r(198);
                  
                  . regress ln_wage wks_ue tenure if year==70, vce(cluster idcode)
                  
                  Linear regression                               Number of obs     =      1,612
                                                                  F(2, 1611)        =      43.34
                                                                  Prob > F          =     0.0000
                                                                  R-squared         =     0.0902
                                                                  Root MSE          =     .38165
                  
                                               (Std. Err. adjusted for 1,612 clusters in idcode)
                  ------------------------------------------------------------------------------
                               |               Robust
                       ln_wage |      Coef.   Std. Err.      t    P>|t|     [95% Conf. Interval]
                  -------------+----------------------------------------------------------------
                        wks_ue |  -.0062348   .0019186    -3.25   0.001     -.009998   -.0024715
                        tenure |   .0964186   .0128734     7.49   0.000     .0711682    .1216691
                         _cons |   1.400249   .0195057    71.79   0.000      1.36199    1.438508
                  ------------------------------------------------------------------------------
                  
                  .
                  Kind regards,
                  Carlo
                  (Stata 19.0)

                  • #10
                    Carlo: That example doesn't illustrate anything because each cluster has one unit. So it's the same as not clustering. (You'll see that replacing vce(cluster idcode) with vce(robust) delivers the same standard errors.) Using a single wave, there isn't a variable in that data set that one would cluster on. If a policy were applied at, say, the county level, and we knew each individual's county of residence, then we could cluster on county.
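                    (For completeness, a quick check along these lines would confirm it; output omitted:)
                    Code:
                    * with one observation per cluster, these two commands report
                    * the same standard errors
                    use "https://www.stata-press.com/data/r16/nlswork.dta", clear
                    regress ln_wage wks_ue tenure if year==70, vce(cluster idcode)
                    regress ln_wage wks_ue tenure if year==70, vce(robust)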

                    • #11
                      Jeff is correct.
                      I tried to correct my last reply (where the machinery took control over the theory) yesterday, but I was caught up in a web meeting that ran too long for me to get back to this thread and amend it.
                      With only one wave of data in that example, there is nothing to cluster on. Sorry for the confusion.
                      Kind regards,
                      Carlo
                      (Stata 19.0)

                      • #12
                        Thank you all for the valuable input!
