  • Missing output with lagged variables

    Hello, I'm running a fixed effects model with year dummies. I would also like to run a different version of the model with a lag on one of the variables. When I run the original model, everything works out. I get this table:

    xtreg totreprtabled difftwo noemplyeesw3cnt totrevenue totnetassetend totprgmrevnue totliabend othrsalwages i.year, fe robust

    Fixed-effects (within) regression Number of obs = 747,838
    Group variable: ein Number of groups = 313,138

    R-sq: Obs per group:
    within = 0.0027 min = 1
    between = 0.3574 avg = 2.4
    overall = 0.2461 max = 3

    F(9,313137) = 35.55
    corr(u_i, Xb) = 0.2006 Prob > F = 0.0000

    (Std. Err. adjusted for 313,138 clusters in ein)
    ---------------------------------------------------------------------------------
                    |               Robust
      totreprtabled |      Coef.   Std. Err.      t    P>|t|     [95% Conf. Interval]
    ----------------+----------------------------------------------------------------
            difftwo |   15374.22   4112.481     3.74   0.000     7313.872    23434.57
    noemplyeesw3cnt |    .672055   .5552385     1.21   0.226    -.4161968    1.760307
         totrevenue |   8.35e-06   .0000163     0.51   0.609    -.0000237    .0000404
     totnetassetend |   .0002926   .0001683     1.74   0.082    -.0000373    .0006224
      totprgmrevnue |   .0005129   .0002528     2.03   0.042     .0000174    .0010083
         totliabend |   .0000966    .000178     0.54   0.587    -.0002522    .0004454
       othrsalwages |   .0108264   .0021668     5.00   0.000     .0065795    .0150733
                    |
               year |
              2013  |   5232.222   511.6966    10.23   0.000     4229.312    6235.133
              2014  |   18496.99   3616.263     5.11   0.000     11409.22    25584.76
                    |
              _cons |   139248.7   4706.256    29.59   0.000     130024.6    148472.9
    ----------------+----------------------------------------------------------------
            sigma_u |  730233.74
            sigma_e |  845176.67
                rho |  .42742571   (fraction of variance due to u_i)
    ---------------------------------------------------------------------------------

    Everything is as it should be. When I run the model with a lag on the variable difftwo, I have a problem with the output.

    xtreg totreprtabled L.difftwo noemplyeesw3cnt totrevenue totnetassetend totprgmrevnue totliabend othrsalwages i.year, fe robust

    Fixed-effects (within) regression Number of obs = 422,215
    Group variable: ein Number of groups = 253,714

    R-sq: Obs per group:
    within = 0.0011 min = 1
    between = 0.2780 avg = 1.7
    overall = 0.1993 max = 2

    F(0,253713) = .
    corr(u_i, Xb) = 0.2049 Prob > F = .

    (Std. Err. adjusted for 253,714 clusters in ein)
    ---------------------------------------------------------------------------------
                    |               Robust
      totreprtabled |      Coef.   Std. Err.      t    P>|t|     [95% Conf. Interval]
    ----------------+----------------------------------------------------------------
            difftwo |
                L1. |   7210.704          .        .       .            .           .
                    |
    noemplyeesw3cnt |   1.647141          .        .       .            .           .
         totrevenue |   .0000835          .        .       .            .           .
     totnetassetend |   .0003486          .        .       .            .           .
      totprgmrevnue |   .0006594          .        .       .            .           .
         totliabend |   .0001677          .        .       .            .           .
       othrsalwages |   .0082254          .        .       .            .           .
                    |
               year |
              2014  |    12662.6          .        .       .            .           .
                    |
              _cons |   164728.3          .        .       .            .           .
    ----------------+----------------------------------------------------------------
            sigma_u |  965074.59
            sigma_e |  1162071.5
                rho |  .40817665   (fraction of variance due to u_i)
    ---------------------------------------------------------------------------------


    It's missing all of the key information: the standard errors, t statistics, p-values, confidence intervals, and the F statistic.

    Any ideas what's wrong?

    Here's a sample of the data. The variable difftwo is a binary indicator. There are no ones in this small sample; the full data set has 1,000,000 observations, with about 10,000 ones. The variable also has no data for the year 2011.

    [CODE]
    * Example generated by -dataex-. To install: ssc install dataex
    clear
    input long ein float(year difftwo)
    10018922 2011 .
    10018922 2012 0
    10018922 2013 0
    10018922 2014 0
    10018923 2012 0
    10018923 2013 0
    10018923 2014 0
    10018927 2011 .
    10018927 2012 0
    10018927 2013 0
    10018927 2014 0
    10018930 2011 .
    10018930 2012 0
    10018930 2013 0
    10018930 2014 0
    10019705 2011 .
    10019705 2012 0
    10019705 2013 0
    10019705 2014 0
    10021545 2011 .
    10021545 2012 0
    10021545 2013 0
    10021545 2014 0
    10022320 2011 .
    10022320 2012 0
    10022320 2013 0
    10022320 2014 0
    10022415 2011 .
    10022415 2013 0
    10022415 2014 0
    10024245 2011 .
    10024245 2012 0
    10024245 2013 0
    10024455 2011 .
    10024455 2012 0
    10024455 2013 0
    10024455 2014 0
    10024645 2011 .
    10024645 2012 0
    10024645 2013 0
    10024645 2014 0
    10027741 2011 .
    10027741 2013 0
    10027741 2014 0
    10027746 2011 .
    10027746 2012 0
    10027746 2013 0
    10027746 2014 0
    10027747 2011 .
    10027747 2012 0
    10027747 2014 0
    10027748 2011 .
    10027748 2012 0
    10027748 2013 0
    10027748 2014 0
    10027751 2011 .
    10027751 2012 0
    10027751 2013 0
    10027751 2014 0
    10028980 2012 0
    10028980 2013 0
    10029480 2011 .
    10029480 2012 0
    10029480 2013 0
    10032215 2011 .
    10032215 2012 0
    10032215 2013 0
    10032215 2014 0
    10039552 2011 .
    10039552 2012 0
    10039552 2013 0
    10039552 2014 0
    10043280 2011 .
    10043280 2012 0
    10043280 2013 0
    10043280 2014 0
    10043285 2011 .
    10043285 2012 0
    10043285 2013 0
    10043285 2014 0
    10052260 2012 0
    10052260 2013 0
    10055140 2011 .
    10055140 2012 0
    10055140 2013 0
    10055140 2014 0
    10056837 2011 .
    10056837 2012 0
    10056837 2013 0
    10056837 2014 0
    10061300 2011 .
    10061300 2012 0
    10061300 2013 0
    10061310 2011 .
    10061310 2012 0
    10061310 2013 0
    10061310 2014 0
    10063057 2011 .
    10063057 2012 0
    10063057 2013 0
    end
    [/CODE]
    Last edited by Philip Gigliotti; 28 Apr 2017, 14:36.

  • #2
    Philip:
    I would guess that the core of your problem is an intersection between the missing values for -difftwo-, the way lagged variables work, and the balance between clusters and degrees of freedom as far as robustified standard errors are concerned.
    Summing up qualitatively, the standard errors of the parameters (and everything derived from them) cannot be estimated.
    Kind regards,
    Carlo
    (Stata 18.0 SE)


    • #3
      It works when I specify the
      Code:
      vce(cluster ein)
      option. I don't know if that helps figure out what's going on.


      • #4
        Philip:
        Weird (and interesting) indeed, as robustified and clustered standard errors (SEs) are expected to do the same job under -xtreg-.
        Perhaps posting what you typed and what Stata gave you back with both SE options would help track down what is going on.
        Kind regards,
        Carlo
        (Stata 18.0 SE)


        • #5
          When I enter this command:

          Code:
          xtreg compnsatncurrofcr L.diffthree totrevenue totassetsend totliabend programratio othrsalwages i.year, fe vce(cluster einstring)
          I get the missing output.

          When I enter this command, with the errors clustered on my panelvar, I also get the missing output:

          Code:
          xtreg compnsatncurrofcr L.diffthree totrevenue totassetsend totliabend programratio othrsalwages i.year, fe vce(cluster ein)
          It's only when I enter this command, with the errors clustered on a string version of my panelvar, that I get the full output:

          Code:
          xtreg compnsatncurrofcr L.diffthree totrevenue totassetsend totliabend programratio othrsalwages i.year, fe vce(cluster einstring)


          • #6
            My guess is that the difference between the second and third commands is that, in the third command, Stata does not recognize that einstring used to define the clusters for variance estimation is the same as ein used to define your panels, and does something that is not appropriate when the clusters for variance estimation are the panels.

            I see no difference at all between the first and third commands; I think one or the other was copied incorrectly.


            • #7
              My mistake. The first command should be:

              Code:
              xtreg compnsatncurrofcr L.diffthree totrevenue totassetsend totliabend programratio othrsalwages i.year, fe robust


              • #8
                The robust and vce(cluster) commands don't work, and I know that they are equivalent in recent versions of Stata. Do you think clustering on a string variable that is identical to my panelvar produces a mathematically identical result?

                I have had some success with other panel models. For instance,

                Code:
                xtpoisson noemplyeesw3cnt L.difftwo totrevenue totassetsend totliabend programratio compnsatncurrofcr othrsalwages i.year, fe vce(robust)
                works. I think this may be because the robust Poisson standard errors are not produced by clustering either.


                • #9
                  The robust and vce(cluster) commands don't work
                  I disagree. I again suggest, as I did in post #6, that these estimates work properly and are telling you that, given your model and your data (including the effect of lagging the difftwo variable) the Huber/White/sandwich VCE estimator cannot be calculated. Clustering on a variable identical to your panel variable constitutes misinforming xtreg about the nature of your clustering, so that it does not incorporate the equivalence of your clusters and your panels in its calculations, and the results cannot be mathematically identical.

                  Were you to cluster on a numeric copy of your cluster variable, rather than a string copy, for example
                  Code:
                  * long, not the default float, so that large ein values are copied exactly
                  generate long gnxl = ein
                  xtreg compnsatncurrofcr L.diffthree totrevenue totassetsend totliabend programratio othrsalwages i.year, fe vce(cluster gnxl)
                  I expect you would get exactly the same results as you do when you cluster on the string copy of your panel variable, because again you have misinformed xtreg about the nature of your clustering.
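
                  For anyone who wants to compare the three variance specifications side by side, here is a minimal sketch. It does not use the poster's data: it assumes the grunfeld example dataset that is available through webuse, and company_copy is a hypothetical variable name introduced only for illustration. Whether the third set of standard errors differs from the first two may depend on the Stata version.

                  ```stata
                  * Sketch on a Stata example dataset, not the data discussed in this thread
                  webuse grunfeld, clear
                  xtset company year

                  * vce(robust) and vce(cluster panelvar) under -xtreg, fe-
                  xtreg invest L.mvalue kstock, fe vce(robust)
                  estimates store m_robust
                  xtreg invest L.mvalue kstock, fe vce(cluster company)
                  estimates store m_cluster

                  * company_copy is a hypothetical numeric duplicate of the panel identifier
                  generate long company_copy = company
                  xtreg invest L.mvalue kstock, fe vce(cluster company_copy)
                  estimates store m_copy

                  * Coefficients and standard errors of the three fits, side by side
                  estimates table m_robust m_cluster m_copy, b(%10.4f) se(%10.4f)
                  ```

                  The coefficients should agree across all three columns; only the standard errors are worth comparing.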


                  • #10
                    Why would these standard errors not be able to be calculated?


                    • #11
                      Perhaps someone more familiar with robust VCE estimation can explain that; I cannot.

                      All I know is that reporting missing values for results is a common way that Stata responds to a violation of the conditions required to calculate the result.
                      Code:
                      . display "result of 1/0 is " 1/0
                      result of 1/0 is .
                      While I understand enough mathematics to explain why Stata cannot divide one by zero, I do not know enough about robust VCE estimation, nor about your data, to make a similar explanation for your problem.
                      Last edited by William Lisowski; 01 May 2017, 20:33.


                      • #12
                        I tried purging the dataset of the years in which the problem variable "difftwo" is missing. However, the output still shows the missing results. I also tried running the model without the lag on difftwo and with a lag on a different variable, and the output was still missing. So it seems to be something about lagging in general that is problematic. One thing that I can think of is that my panel is only 3 years, so when I lag a variable, there are only 2 years of data left in my panel. Does the cluster option not work with only two years of data?
                        Last edited by Philip Gigliotti; 03 May 2017, 07:34.


                        • #13
                          Of course removing the observations with difftwo missing had no effect - Stata's handling of missing values in xtreg excluded those observations, and any observation with a missing value on one of the variables in the model, automatically.
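
                          One way to see this exclusion directly is to inspect the estimation sample after fitting the model. The sketch below again assumes the grunfeld example dataset from webuse rather than the data in this thread; used and n_used are hypothetical variable names.

                          ```stata
                          * Sketch on a Stata example dataset, not the data discussed in this thread
                          webuse grunfeld, clear
                          xtset company year

                          * The first year of each panel has no lagged value, so it is dropped
                          xtreg invest L.mvalue kstock, fe
                          display "observations used: " e(N) " of " _N

                          * e(sample) marks the observations that entered the estimation
                          generate byte used = e(sample)
                          count if !used

                          * Number of usable observations per panel after lagging
                          bysort company (year): egen n_used = total(used)
                          tabulate n_used if used
                          ```

                          Run against the actual data, the last table would show how many panels are left with only one or two usable observations once the lag is applied.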

                          Perhaps someone else can comment on the data requirements for the Huber/White/sandwich VCE estimator to exist. Two observations per cluster seem like too few to estimate cluster effects reliably.


                          • #14
                            What is a good alternative to robust standard errors, if the cluster option is computationally impossible?


                            • #15
                              I contacted Stata tech support and they got back to me, suggesting the problem was that I was not running the most recent version of Stata. After updating my software, I am now able to produce the previously missing output. Incidentally, the results are identical to those produced with the "trick" of using a copy of the panelvar as the cluster variable.
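
                              For readers who hit the same symptom, a short sketch of how one might check the installed build before rerunning the models. These are standard Stata commands; what they report depends on the installation.

                              ```stata
                              * Show the installed Stata flavor, version, and revision date
                              about
                              display c(stata_version)

                              * Check whether official updates are available, without installing them
                              update query

                              * Install all available official updates
                              update all
                              ```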

                              I'm attaching the comments from tech support for posterity.
                              Before going any further, let's make sure that you are using the latest
                              version of the software. Please execute the command line below and follow the
                              instructions.

                              update all

                              After the update please run your -xtreg- command lines again. I get standard
                              errors for all your models fitted with -xtreg-. Here is the output I get:

                              xtreg compnsatncurrofcr L.difftwo noemplyeesw3cnt totrevenue totassetsend ///
                              totliabend programratio othrsalwages i.year if cthree == 1, fe robust

                              Fixed-effects (within) regression Number of obs = 308,983
                              Group variable: ein Number of groups = 185,968

                              R-sq: Obs per group:
                              within = 0.0217 min = 1
                              between = 0.4241 avg = 1.7
                              overall = 0.4154 max = 2

                              F(8,185967) = 15.80
                              corr(u_i, Xb) = 0.2914 Prob > F = 0.0000

                              (Std. Err. adjusted for 185,968 clusters in ein)
                              ---------------------------------------------------------------------------------
                                              |               Robust
                              compnsatncurr~r |      Coef.   Std. Err.      t    P>|t|     [95% Conf. Interval]
                              ----------------+----------------------------------------------------------------
                                      difftwo |
                                          L1. |    8247.59   2470.901     3.34   0.001     3404.682     13090.5
                                              |
                              noemplyeesw3cnt |   1.069199   1.074298     1.00   0.320      -1.0364    3.174798
                                   totrevenue |   .0004013   .0003484     1.15   0.249    -.0002815    .0010842
                                 totassetsend |   .0004682   .0003114     1.50   0.133    -.0001422    .0010785
                                   totliabend |  -.0003042   .0002494    -1.22   0.223    -.0007929    .0001846
                                 programratio |   4.330959    3.10064     1.40   0.162    -1.746224    10.40814
                                 othrsalwages |   .0052239   .0018116     2.88   0.004     .0016731    .0087747
                                              |
                                         year |
                                        2014  |   4718.509    805.612     5.86   0.000     3139.528    6297.489
                                              |
                                        _cons |   128005.9   5991.887    21.36   0.000       116262    139749.9
                              ----------------+----------------------------------------------------------------
                                      sigma_u |  600800.66
                                      sigma_e |  209218.34
                                          rho |  .89184911   (fraction of variance due to u_i)
                              ---------------------------------------------------------------------------------


                              xtreg compnsatncurrofcr L.difftwo noemplyeesw3cnt totrevenue totassetsend ///
                              totliabend programratio othrsalwages i.year if cthree == 1, fe vce(robust)

                              Fixed-effects (within) regression Number of obs = 308,983
                              Group variable: ein Number of groups = 185,968

                              R-sq: Obs per group:
                              within = 0.0217 min = 1
                              between = 0.4241 avg = 1.7
                              overall = 0.4154 max = 2

                              F(8,185967) = 15.80
                              corr(u_i, Xb) = 0.2914 Prob > F = 0.0000

                              (Std. Err. adjusted for 185,968 clusters in ein)
                              ---------------------------------------------------------------------------------
                                              |               Robust
                              compnsatncurr~r |      Coef.   Std. Err.      t    P>|t|     [95% Conf. Interval]
                              ----------------+----------------------------------------------------------------
                                      difftwo |
                                          L1. |    8247.59   2470.901     3.34   0.001     3404.682     13090.5
                                              |
                              noemplyeesw3cnt |   1.069199   1.074298     1.00   0.320      -1.0364    3.174798
                                   totrevenue |   .0004013   .0003484     1.15   0.249    -.0002815    .0010842
                                 totassetsend |   .0004682   .0003114     1.50   0.133    -.0001422    .0010785
                                   totliabend |  -.0003042   .0002494    -1.22   0.223    -.0007929    .0001846
                                 programratio |   4.330959    3.10064     1.40   0.162    -1.746224    10.40814
                                 othrsalwages |   .0052239   .0018116     2.88   0.004     .0016731    .0087747
                                              |
                                         year |
                                        2014  |   4718.509    805.612     5.86   0.000     3139.528    6297.489
                                              |
                                        _cons |   128005.9   5991.887    21.36   0.000       116262    139749.9
                              ----------------+----------------------------------------------------------------
                                      sigma_u |  600800.66
                                      sigma_e |  209218.34
                                          rho |  .89184911   (fraction of variance due to u_i)
                              ---------------------------------------------------------------------------------



                              xtreg compnsatncurrofcr L.difftwo noemplyeesw3cnt totrevenue totassetsend ///
                              totliabend programratio othrsalwages i.year if cthree == 1,fe vce(cluster ein)

                              Fixed-effects (within) regression Number of obs = 308,983
                              Group variable: ein Number of groups = 185,968

                              R-sq: Obs per group:
                              within = 0.0217 min = 1
                              between = 0.4241 avg = 1.7
                              overall = 0.4154 max = 2

                              F(8,185967) = 15.80
                              corr(u_i, Xb) = 0.2914 Prob > F = 0.0000

                              (Std. Err. adjusted for 185,968 clusters in ein)
                              ---------------------------------------------------------------------------------
                                              |               Robust
                              compnsatncurr~r |      Coef.   Std. Err.      t    P>|t|     [95% Conf. Interval]
                              ----------------+----------------------------------------------------------------
                                      difftwo |
                                          L1. |    8247.59   2470.901     3.34   0.001     3404.682     13090.5
                                              |
                              noemplyeesw3cnt |   1.069199   1.074298     1.00   0.320      -1.0364    3.174798
                                   totrevenue |   .0004013   .0003484     1.15   0.249    -.0002815    .0010842
                                 totassetsend |   .0004682   .0003114     1.50   0.133    -.0001422    .0010785
                                   totliabend |  -.0003042   .0002494    -1.22   0.223    -.0007929    .0001846
                                 programratio |   4.330959    3.10064     1.40   0.162    -1.746224    10.40814
                                 othrsalwages |   .0052239   .0018116     2.88   0.004     .0016731    .0087747
                                              |
                                         year |
                                        2014  |   4718.509    805.612     5.86   0.000     3139.528    6297.489
                                              |
                                        _cons |   128005.9   5991.887    21.36   0.000       116262    139749.9
                              ----------------+----------------------------------------------------------------
                                      sigma_u |  600800.66
                                      sigma_e |  209218.34
                                          rho |  .89184911   (fraction of variance due to u_i)
                              ---------------------------------------------------------------------------------
