  • Error occurring in one version of Stata but not the other: option cluster() incorrectly specified: too many variables specified

    Hello!
    I have access to two installations of Stata 17, running on the same server (Windows Server 2019):
    - 19-user 4-core, which runs the code below
    - Single-user 12-core, which yields the following error message:
    option cluster() incorrectly specified: too many variables specified
    - both are Stata 17.0 MP—Parallel Edition

    I have replicated the error that occurred on my data (I was checking if one version was faster than the other) with the following code:

    Code:
    webuse nlswork, clear
    reg ln_w grade age ttl_exp tenure south, vce(cluster idcode occ_code)
    I have already tried
    Code:
    set maxvar 120000, perma
    as this seemed to be the most common solution I found, but without success.
    Any suggestions on how I can fix it?

    Thank you for your help!
    Hélder

  • #2
    Helder:
    Stata is as usual correct.
    Multiple cluster variables are not allowed with -regress-:
    Code:
    . webuse nlswork, clear
    (National Longitudinal Survey of Young Women, 14-24 years old in 1968)
    
    . 
    . reg ln_w grade age ttl_exp tenure south, vce(cluster idcode occ_code)
    option cluster() incorrectly specified: too many variables specified
    r(198);
    
    . reg ln_w grade age ttl_exp tenure south, vce(cluster idcode)
    
    Linear regression                               Number of obs     =     28,091
                                                    F(5, 4696)        =     870.00
                                                    Prob > F          =     0.0000
                                                    R-squared         =     0.3367
                                                    Root MSE          =     .38921
    
                                 (Std. err. adjusted for 4,697 clusters in idcode)
    ------------------------------------------------------------------------------
                 |               Robust
         ln_wage | Coefficient  std. err.      t    P>|t|     [95% conf. interval]
    -------------+----------------------------------------------------------------
           grade |   .0703772   .0021427    32.84   0.000     .0661765    .0745779
             age |   -.004424   .0009259    -4.78   0.000    -.0062392   -.0026087
         ttl_exp |   .0293041   .0018139    16.16   0.000      .025748    .0328602
          tenure |   .0191088   .0015944    11.98   0.000      .015983    .0222347
           south |  -.1376237   .0091203   -15.09   0.000    -.1555038   -.1197436
           _cons |    .737228   .0338776    21.76   0.000      .670812    .8036441
    ------------------------------------------------------------------------------
    
    .
    Kind regards,
    Carlo
    (Stata 19.0)


    • #3
      Hello Carlo,
      Thank you for your reply. I usually accept that Stata is correct and try to find a solution. The issue here is that I have two Stata installations: one runs the code and the other doesn't. Specifically, the 19-user 4-core version does run the code.
      Best regards,
      Hélder
      Last edited by Helder Costa; 02 May 2023, 04:18.


      • #4
        Hélder:
        this is interesting indeed.
        Could you please share what you typed and what Stata gave you back in the two instances? Thanks.
        Last edited by Carlo Lazzaro; 02 May 2023, 05:00.
        Kind regards,
        Carlo
        (Stata 19.0)


        • #5
          Yes, here are the logs:

          -----------------------------------------------------------------------------------------------------------------------------------------
          name: <unnamed>
          log: C:\Users\helder.ascosta\Desktop\stata_error\error.txt
          log type: text
          opened on: 2 May 2023, 15:16:56
          r; t=0.00 15:16:56

          .
          . webuse nlswork, clear
          (National Longitudinal Survey of Young Women, 14-24 years old in 1968)
          r; t=1.61 15:16:58

          . reg ln_w grade age ttl_exp tenure south, vce(cluster idcode occ_code)
          option cluster() incorrectly specified: too many variables specified
          r(198); t=0.00 15:16:58

          end of do-file

          r(198); t=1.63 15:16:58


          And here is the log from the one that ran everything:

          --------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
          name: <unnamed>
          log: C:\Users\Hélder Costa\Desktop\stata_error\noerror.txt
          log type: text
          opened on: 2 May 2023, 16:15:32

          .
          . webuse nlswork, clear
          (National Longitudinal Survey of Young Women, 14-24 years old in 1968)

          . reg ln_w grade age ttl_exp tenure south, vce(cluster idcode occ_code)

                Source |       SS           df       MS      Number of obs   =    27,971
          -------------+----------------------------------   F(5, 27965)     =   2836.20
                 Model |  2149.34046         5  429.868091   Prob > F        =    0.0000
              Residual |  4238.51578    27,965   .15156502   R-squared       =    0.3365
          -------------+----------------------------------   Adj R-squared   =    0.3364
                 Total |  6387.85624    27,970  .228382418   Root MSE        =    .38931

                                 (Std. err. adjusted for clustering on idcode occ_code)
          ------------------------------------------------------------------------------
                       |               Robust
               ln_wage | Coefficient  std. err.      t    P>|t|     [95% conf. interval]
          -------------+----------------------------------------------------------------
                 grade |   .0702746   .0108826     6.46   0.000     .0465634    .0939858
                   age |  -.0043243   .0018094    -2.39   0.034    -.0082667    -.000382
               ttl_exp |   .0292316   .0037572     7.78   0.000     .0210454    .0374178
                tenure |   .0191016   .0029133     6.56   0.000     .0127541    .0254492
                 south |  -.1379479    .020722    -6.66   0.000    -.1830972   -.0927986
                 _cons |   .7361279   .1700289     4.33   0.001     .3656666    1.106589
          ------------------------------------------------------------------------------

          Cluster combination levels
                    idcode      4685
                  occ_code        13
           idcode occ_code      9002

          .
          . capture log close
          Thank you!


          • #6
            Hélder:
            with Stata 17 I am not able to replicate the different outcomes.
            Kind regards,
            Carlo
            (Stata 19.0)


            • #7
              I'm using Stata 17 as well. One is the 19-user 4-core, which runs the code. The other is the Single-user 12-core, which doesn't run it. I guess this is the most I can say about the versions:

              Stata/MP 17.0 for Windows (64-bit x86-64)
              Revision 10 May 2022
              Copyright 1985-2021 StataCorp LLC

              Total physical memory: 512.00 GB
              Available physical memory: 359.59 GB

              Stata license: 19-user 4-core network

              Stata/MP 17.0 for Windows (64-bit x86-64)
              Revision 08 Mar 2023
              Copyright 1985-2021 StataCorp LLC

              Total physical memory: 512.00 GB
              Available physical memory: 359.48 GB

              Stata license: Single-user 12-core


              • #8
                You'll notice that one of the Stata installations is a year out of date (revision 10 May 2022). I think this is the underlying cause of the difference.
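
                To compare the two installations directly, something like the following could be run in each one. This is only a sketch: -about- and -update query- are standard commands, c(born_date) is assumed here to report the build date of the executable (it may not match the "Revision" line exactly), and -update query- needs internet access.

                ```stata
                * Show edition, revision date, and license of the running copy
                about

                * Version number and executable date as stored in c()
                display "Stata version:   " c(stata_version)
                display "Executable date: " c(born_date)

                * Compare the installed files with the latest official update (needs internet)
                update query
                ```

                If the two copies report different revision dates, that would be consistent with an update having changed how -vce(cluster)- behaves.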


                • #9
                  Originally posted by Leonardo Guizzetti
                  You'll notice that the one Stata version is more than a year out of date (10 May 2022). This is what I think the underlying cause is for the difference.
                  Unless an update disabled multi-way clustering, the version should not be the issue. But that is what seems to make the most sense. I did not know that regress allowed multi-way clustering, but this is documented in the manual: https://www.stata.com/manuals/rregress.pdf

                  With cluster–robust standard errors for clustering by levels of cvar1 and cvar2
                  regress y x1 x2 i.a, vce(cluster cvar1 cvar2)
                  On the most updated version of Stata 17, I cannot cluster using more than 1 variable. So maybe the OP should contact Technical Services and inquire about this issue.


                  • #10
                    I actually wonder if what Helder is experiencing is a fluke (an easter egg, so to speak) left over from some early attempts by Stata to introduce the multiway cluster option.
                    Stata 18 does allow for that. It's one of the new features shown in the latest release, but it is not in Stata 17.
                    That being said, if you still want to use two-way clustering, you may want to look into the community-contributed program -vcemway-.
                    F
                    Regarding documentation, the Stata 17 manual is at https://www.stata.com/manuals17/rregress.pdf
                    The one Andrew pointed to is for Stata 18.
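
                    A minimal sketch of the -vcemway- route, assuming the package is available from SSC and that its syntax is a prefix of the form -vcemway cmdline, cluster(varlist)-. The variables are the ones from the original post, and the output has not been checked here.

                    ```stata
                    * Install the community-contributed command from SSC (needed once)
                    ssc install vcemway

                    webuse nlswork, clear

                    * Two-way clustering on idcode and occ_code around an ordinary -regress- call
                    vcemway regress ln_w grade age ttl_exp tenure south, cluster(idcode occ_code)
                    ```

                    Because this does not rely on -regress- itself accepting several cluster variables, it should behave the same way on both installations.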


                    • #11
                      Originally posted by Andrew Musau

                      Unless an update disabled multi-way clustering, the version should not be the issue. But that is what seems to make the most sense. I did not know that regress allowed multi-way clustering, but this is documented in the manual: https://www.stata.com/manuals/rregress.pdf
                      I think it might have been enabled in an earlier version of Stata 17, for whatever reason, but later disabled. The quote from the manual is now for Stata 18, which I did notice allows for multi-way clustering, but this is not documented for Stata 17.
                      Last edited by Leonardo Guizzetti; 02 May 2023, 09:53. Reason: crossed with #10.


                      • #12
                        Another way to do it is using reghdfe with the -noabsorb- option.

                        Code:
                        net install reghdfe, from("https://raw.githubusercontent.com/sergiocorreia/reghdfe/master/src/")

                        Example:

                        Code:
                        webuse nlswork, clear
                        reghdfe ln_w grade age ttl_exp tenure south, vce(cluster idcode tenure) noabsorb
                        Result:

                        Code:
                        . reghdfe ln_w grade age ttl_exp tenure south, vce(cluster idcode tenure) noabsorb
                        (MWFE estimator converged in 1 iterations)
                        
                        HDFE Linear regression                            Number of obs   =     28,091
                        Absorbing 1 HDFE group                            F(   5,    269) =     717.86
                        Statistics robust to heteroskedasticity           Prob > F        =     0.0000
                                                                          R-squared       =     0.3367
                                                                          Adj R-squared   =     0.3365
                        Number of clusters (idcode)  =      4,697         Within R-sq.    =     0.3367
                        Number of clusters (tenure)  =        270         Root MSE        =     0.3892
                        
                                                (Std. err. adjusted for 270 clusters in idcode tenure)
                        ------------------------------------------------------------------------------
                                     |               Robust
                             ln_wage | Coefficient  std. err.      t    P>|t|     [95% conf. interval]
                        -------------+----------------------------------------------------------------
                               grade |   .0703772   .0022745    30.94   0.000     .0658991    .0748554
                                 age |   -.004424   .0009882    -4.48   0.000    -.0063696   -.0024783
                             ttl_exp |   .0293041   .0018216    16.09   0.000     .0257176    .0328906
                              tenure |   .0191088   .0021053     9.08   0.000     .0149638    .0232539
                               south |  -.1376237   .0101428   -13.57   0.000    -.1575931   -.1176543
                               _cons |    .737228    .034514    21.36   0.000     .6692762    .8051799
                        ------------------------------------------------------------------------------


                        • #13
                          The version of multiway clustering that was available for a short time in Stata 17 was intended to be used as the back end for computations in a feature that we planned to make available during Stata 17. This multiway cluster feature was not documented in -regress- (or in -areg- or -xtreg, fe-) and was not intended for public consumption at that time. We did not end up releasing the feature that was going to rely on multiway clustering from -regress- (or from -areg- or -xtreg, fe-) in Stata 17. Therefore, the -vce(cluster)- option was reverted to its previous behavior. We did, however, continue developing multiway clustering in a more general way, and it is now available for -regress-, -areg-, and -xtreg, fe- in Stata 18.
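
                          For do-files that have to run on both releases, one possible pattern is to branch on the running version. This is a sketch rather than an official recommendation: it assumes that c(stata_version) of 18 or higher means the documented multiway -vce(cluster)- syntax is available, and it falls back to one-way clustering on idcode otherwise.

                          ```stata
                          webuse nlswork, clear

                          if c(stata_version) >= 18 {
                              * Documented multiway clustering (Stata 18 and later)
                              regress ln_w grade age ttl_exp tenure south, vce(cluster idcode occ_code)
                          }
                          else {
                              * Older releases: only one cluster variable is documented
                              display as text "Multiway vce(cluster) is not documented before Stata 18; clustering on idcode only"
                              regress ln_w grade age ttl_exp tenure south, vce(cluster idcode)
                          }
                          ```

                          Note that the two branches do not estimate the same standard errors, so results from the fallback branch are not comparable with the two-way clustered ones.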


                          • #14
                            Thank you all for your help, I now understand what was happening.
