  • xtreg run provides coefficients, no standard errors

    I've done an xtreg run that provides statistics but no standard errors or t statistics. Here is an example. (I've done other runs with a more extended list of variables, lengthier output, but a similar result.)

    . xtreg totdaysupp comp hmo_sim cdhp, fe vce(robust)

    Fixed-effects (within) regression               Number of obs     =    230,424
    Group variable: enrolid                         Number of groups  =    115,212

    R-sq:                                           Obs per group:
         within  = 0.0013                                         min =          2
         between = 0.0004                                         avg =        2.0
         overall = 0.0004                                         max =          2

                                                    F(0,115211)       =          .
    corr(u_i, Xb)  = 0.0061                         Prob > F          =          .

                               (Std. Err. adjusted for 115,212 clusters in enrolid)
    ------------------------------------------------------------------------------
                 |               Robust
      totdaysupp |      Coef.   Std. Err.      t    P>|t|     [95% Conf. Interval]
    -------------+----------------------------------------------------------------
            comp |    11.3995          .        .       .            .           .
         hmo_sim |    5.72344          .        .       .            .           .
            cdhp |   15.25799          .        .       .            .           .
           _cons |   495.9115          .        .       .            .           .
    -------------+----------------------------------------------------------------
         sigma_u |  341.95897
         sigma_e |  174.60141
             rho |  .79320767   (fraction of variance due to u_i)


    Can someone explain or point me in the direction of how to investigate why this is happening? Thanks.




  • #2
    Charles:
    it would seem that your data have no within-panel variation.
    Can you please provide an example/excerpt of your dataset via -dataex-? Thanks.
    Last edited by Carlo Lazzaro; 01 Aug 2019, 11:18.
    Kind regards,
    Carlo
    (Stata 19.0)
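
    One way to check for within-panel variation, and to produce the kind of excerpt requested here, might look like the sketch below. It assumes the panel is declared on enrolid and a year variable (year does not appear in the first post, so that part is an assumption) and uses only the variables from the posted command.

    ```stata
    * Sketch only: assumes enrolid and year identify the panel
    xtset enrolid year

    * Overall, between, and within standard deviations for the model variables.
    * A within standard deviation of 0 would indicate no within-panel variation.
    xtsum totdaysupp comp hmo_sim cdhp

    * Small shareable excerpt of the variables used in the model
    dataex enrolid year totdaysupp comp hmo_sim cdhp in 1/20
    ```

    If the data cannot be shared, the xtsum table alone may already show whether the regressors move within enrolid.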



    • #3
      Hi Carlo -

      Before performing the run I modified the data set to ensure within panel variation. I reviewed and slightly revised that process today. I will explain it below and also provide an additional xtreg estimation that, accounting for constants, is essentially the same. Please see below.

      The data set is proprietary, so I cannot provide it as such, but my list of 1/20 below shows the within-group variation.

      I will also suggest another hypothesis for why standard errors are not being reported in this run: The data are skewed so that possibly a linear-linear functional form fits poorly causing a large sum of squared residuals, resulting in standard errors too large to report. Just a guess.

      . */ Correct the logic in 2019_07_26_... that performs an estimation run only with enrolids that change plantyps year to year to account for combination of plantyps 3,4,5, and 7 into hmo_sim /*

      . */ Consider these as pseudo-plantyps: 2, 4 (to include 3,4,5, and 7), 6, and 8 /*

      . gen psplantyp = 0

      . replace psplantyp = 2 if plantypmode==2
      (547,326 real changes made)

      . replace psplantyp = 4 if plantypmode==3|plantypmode==4|plantypmode==5|plantypmode==7
      (1,128,201 real changes made)

      . replace psplantyp = 6 if plantypmode==6
      (1,729,248 real changes made)

      . replace psplantyp = 8 if plantypmode==8
      (22,647 real changes made)

      . tab psplantyp if year==2005

      psplantyp | Freq. Percent Cum.
      ------------+-----------------------------------
      2 | 282,415 16.48 16.48
      4 | 569,872 33.25 49.73
      6 | 855,012 49.89 99.63
      8 | 6,412 0.37 100.00
      ------------+-----------------------------------
      Total | 1,713,711 100.00

      . tab psplantyp if year==2006

      psplantyp | Freq. Percent Cum.
      ------------+-----------------------------------
      2 | 264,911 15.46 15.46
      4 | 558,329 32.58 48.04
      6 | 874,236 51.01 99.05
      8 | 16,235 0.95 100.00
      ------------+-----------------------------------
      Total | 1,713,711 100.00

      . */ Now strategize to include only those enrolids that change psplantyp year 1 to year 2 /*

      . gen psplan05=0

      . replace psplan05=psplantyp if year==2005
      (1,713,711 real changes made)

      . gen psplan06=0

      . replace psplan06=psplantyp if year==2006
      (1,713,711 real changes made)

      . by enrolid: egen psplanin05=max(psplan05)

      . by enrolid: egen psplanin06=max(psplan06)

      . */ The "replace" code puts the correct psplantyp in as psplan05 or psplan06 /*

      . */ depending on the year. For year 2005, psplan06 remains 0. Likewise for year 2006, /*

      . */ psplan05 remains = 0. /*

      . */ The egen code puts the correct value of psplanin05 and 06 in each record /*

      . */ regardless of year. /*

      . */ Now I wish to uncover the enrolids where psplan05=psplan06; i.e., no change/*

      . */ in psplantyp between years /*

      . gen psplandiff = psplanin06-psplanin05

      . tab psplandiff, missing

       psplandiff |      Freq.     Percent        Cum.
      ------------+-----------------------------------
               -6 |         14        0.00        0.00
               -4 |     31,304        0.91        0.91
               -2 |     32,634        0.95        1.87
                0 |  3,227,312       94.16       96.03
                2 |     64,954        1.90       97.92
                4 |     70,534        2.06       99.98
                6 |        670        0.02      100.00
      ------------+-----------------------------------
            Total |  3,427,422      100.00

      . preserve

      . drop if psplandiff==0
      (3,227,312 observations deleted)

      . xtset enrolid year
      panel variable: enrolid (strongly balanced)
      time variable: year, 2005 to 2006
      delta: 1 unit

      . xtreg totdaysupp i.psplantyp, fe vce(robust)

      Fixed-effects (within) regression Number of obs = 200,110
      Group variable: enrolid Number of groups = 100,055

      R-sq: Obs per group:
      within = 0.0014 min = 2
      between = 0.0032 avg = 2.0
      overall = 0.0011 max = 2

      F(0,100054) = .
      corr(u_i, Xb) = 0.0209 Prob > F = .

      (Std. Err. adjusted for 100,055 clusters in enrolid)
      ------------------------------------------------------------------------------
      | Robust
      totdaysupp | Coef. Std. Err. t P>|t| [95% Conf. Interval]
      -------------+----------------------------------------------------------------
      psplantyp |
      4 | -6.153864 . . . . .
      6 | -10.85658 . . . . .
      8 | 4.113842 . . . . .
      |
      _cons | 506.4588 . . . . .
      -------------+----------------------------------------------------------------
      sigma_u | 343.10152
      sigma_e | 175.71487
      rho | .79221477 (fraction of variance due to u_i)
      ------------------------------------------------------------------------------


      . tab psplantyp, missing

      psplantyp | Freq. Percent Cum.
      ------------+-----------------------------------
      2 | 53,618 26.79 26.79
      4 | 44,541 22.26 49.05
      6 | 90,890 45.42 94.47
      8 | 11,061 5.53 100.00
      ------------+-----------------------------------
      Total | 200,110 100.00

      . sort enrolid year

      . list psplantyp in 1/20

      +----------+
      | psplan~p |
      |----------|
      1. | 6 |
      2. | 2 |
      3. | 6 |
      4. | 2 |
      5. | 6 |
      |----------|
      6. | 2 |
      7. | 2 |
      8. | 6 |
      9. | 6 |
      10. | 2 |
      |----------|
      11. | 6 |
      12. | 2 |
      13. | 2 |
      14. | 6 |
      15. | 6 |
      |----------|
      16. | 2 |
      17. | 6 |
      18. | 2 |
      19. | 6 |
      20. | 2 |
      +----------+

      Thanks for any suggestions.

      Charles Bondi
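
      The recoding and change-detection steps in this post can be sketched more compactly. This is an illustrative alternative, not the code that produced the log: pstype and changed are hypothetical variable names, and the logic assumes exactly one record per enrolid in each of 2005 and 2006, as the tabulations and the "strongly balanced" xtset message suggest.

      ```stata
      * Hypothetical compact version of the recoding (pstype stands in for psplantyp)
      gen byte pstype = 0
      replace pstype = 2 if plantypmode == 2
      replace pstype = 4 if inlist(plantypmode, 3, 4, 5, 7)
      replace pstype = 6 if plantypmode == 6
      replace pstype = 8 if plantypmode == 8

      * Within each enrolid, sorted by year, compare the first and last record.
      * With one 2005 and one 2006 record per enrolid this flags a plan change.
      bysort enrolid (year): gen byte changed = pstype[1] != pstype[_N]
      tab changed, missing

      * Keep only the switchers for the estimation, leaving the full data recoverable
      preserve
      keep if changed
      xtset enrolid year
      ```

      Under the stated assumption, changed == 1 should mark the same enrolids as psplandiff != 0 in the log above.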








      • #4
        Charles:
        does anything change if you use default standard errors?
        Kind regards,
        Carlo
        (Stata 19.0)



        • #5
          I'll be back with the data on Monday and address that question. Thanks.



          • #6

            Hi Carlo,

            Indeed, I get standard errors with both the default and the bootstrap VCE, though not with cluster or, as previously shown, robust. The Stata documentation indicates that robust and cluster are available with the fe option, but perhaps this is not the case for the first-difference (panel size of 2) application.

            It is somewhat reassuring that the bootstrap and default standard errors are close.

            Would you have any idea if this is the case, or how I would investigate?

            Thanks,

            Charles


            . xtreg totdaysupp i.psplantyp, fe

            Fixed-effects (within) regression               Number of obs     =    200,110
            Group variable: enrolid                         Number of groups  =    100,055

            R-sq:                                           Obs per group:
                 within  = 0.0014                                         min =          2
                 between = 0.0032                                         avg =        2.0
                 overall = 0.0011                                         max =          2

                                                            F(3,100052)       =      46.67
            corr(u_i, Xb)  = 0.0209                         Prob > F          =     0.0000

            ------------------------------------------------------------------------------
              totdaysupp |      Coef.   Std. Err.      t    P>|t|     [95% Conf. Interval]
            -------------+----------------------------------------------------------------
               psplantyp |
                      4  |  -6.153864   1.516421    -4.06   0.000    -9.126031   -3.181696
                      6  |  -10.85658   1.080787   -10.05   0.000    -12.97491   -8.738249
                      8  |   4.113842   2.591551     1.59   0.112    -.9655654    9.193249
                         |
                   _cons |   506.4588   .9072635   558.23   0.000     504.6806    508.2371
            -------------+----------------------------------------------------------------
                 sigma_u |  343.10152
                 sigma_e |  175.71487
                     rho |  .79221477   (fraction of variance due to u_i)
            ------------------------------------------------------------------------------
            F test that all u_i=0: F(100054, 100052) = 7.55              Prob > F = 0.0000

            .
            .
            . xtreg totdaysupp i.psplantyp, fe vce(robust)

            Fixed-effects (within) regression Number of obs = 200,110
            Group variable: enrolid Number of groups = 100,055

            R-sq: Obs per group:
            within = 0.0014 min = 2
            between = 0.0032 avg = 2.0
            overall = 0.0011 max = 2

            F(0,100054) = .
            corr(u_i, Xb) = 0.0209 Prob > F = .

            (Std. Err. adjusted for 100,055 clusters in enrolid)
            ------------------------------------------------------------------------------
            | Robust
            totdaysupp | Coef. Std. Err. t P>|t| [95% Conf. Interval]
            -------------+----------------------------------------------------------------
            psplantyp |
            4 | -6.153864 . . . . .
            6 | -10.85658 . . . . .
            8 | 4.113842 . . . . .
            |
            _cons | 506.4588 . . . . .
            -------------+----------------------------------------------------------------
            sigma_u | 343.10152
            sigma_e | 175.71487
            rho | .79221477 (fraction of variance due to u_i)
            ------------------------------------------------------------------------------

            . xtreg totdaysupp i.psplantyp, fe vce(cluster enrolid)

            Fixed-effects (within) regression Number of obs = 200,110
            Group variable: enrolid Number of groups = 100,055

            R-sq: Obs per group:
            within = 0.0014 min = 2
            between = 0.0032 avg = 2.0
            overall = 0.0011 max = 2

            F(0,100054) = .
            corr(u_i, Xb) = 0.0209 Prob > F = .

            (Std. Err. adjusted for 100,055 clusters in enrolid)
            ------------------------------------------------------------------------------
            | Robust
            totdaysupp | Coef. Std. Err. t P>|t| [95% Conf. Interval]
            -------------+----------------------------------------------------------------
            psplantyp |
            4 | -6.153864 . . . . .
            6 | -10.85658 . . . . .
            8 | 4.113842 . . . . .
            |
            _cons | 506.4588 . . . . .
            -------------+----------------------------------------------------------------
            sigma_u | 343.10152
            sigma_e | 175.71487
            rho | .79221477 (fraction of variance due to u_i)
            ------------------------------------------------------------------------------

            . xtreg totdaysupp i.psplantyp, fe vce(bootstrap, reps(300) seed(123) nodots)

            Fixed-effects (within) regression               Number of obs     =    200,110
            Group variable: enrolid                         Number of groups  =    100,055

            R-sq:                                           Obs per group:
                 within  = 0.0014                                         min =          2
                 between = 0.0032                                         avg =        2.0
                 overall = 0.0011                                         max =          2

                                                            Wald chi2(3)      =     147.41
            corr(u_i, Xb)  = 0.0209                         Prob > chi2       =     0.0000

                                        (Replications based on 100,055 clusters in enrolid)
            ------------------------------------------------------------------------------
                         |   Observed   Bootstrap                         Normal-based
              totdaysupp |      Coef.   Std. Err.      z    P>|z|     [95% Conf. Interval]
            -------------+----------------------------------------------------------------
               psplantyp |
                      4  |  -6.153864   1.501876    -4.10   0.000    -9.097487    -3.21024
                      6  |  -10.85658   1.111523    -9.77   0.000    -13.03512   -8.678035
                      8  |   4.113842   2.254456     1.82   0.068    -.3048107    8.532495
                         |
                   _cons |   506.4588   1.430039   354.16   0.000      503.656    509.2617
            -------------+----------------------------------------------------------------
                 sigma_u |  343.10152
                 sigma_e |  175.71487
                     rho |  .79221477   (fraction of variance due to u_i)
            ------------------------------------------------------------------------------
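
            To put the VCE choices side by side and look at what the clustered fit actually stored, something along the following lines could be tried. It is a sketch only: fe_default and fe_cluster are hypothetical names for the stored estimates, and nothing here is claimed about what the output will show for these data.

            ```stata
            * Fit the same model under two VCE choices and store each result
            quietly xtreg totdaysupp i.psplantyp, fe
            estimates store fe_default

            quietly xtreg totdaysupp i.psplantyp, fe vce(cluster enrolid)
            estimates store fe_cluster

            * Coefficients with standard errors beneath them, one column per fit
            estimates table fe_default fe_cluster, se

            * Inspect the stored results behind the clustered fit
            estimates restore fe_cluster
            display e(df_m)       // model degrees of freedom (the 0 in F(0, ...))
            display e(N_clust)    // number of clusters used
            matrix list e(V)      // the variance matrix the standard errors come from
            ```

            Listing e(V) directly shows whether the variance entries themselves are missing or zero, which narrows down where the dots in the table come from.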



            • #7
              Charles:
              take a look at -help j_robustsingular-.
              Kind regards,
              Carlo
              (Stata 19.0)



              • #8
                From the help output:

                Are any standard errors missing?

                If any standard errors are reported as dots, something is wrong with your model: one or more coefficients could not be estimated in the normal
                statistical sense. You need to address that problem and ignore the rest of this discussion.

                ------------------------------------------------------------------------------
                OK but this is a little confusing:

                1. The default and bootstrap vce options refer to the same model, yet they provide standard errors. So it appears the problem is the choice of vce option, despite the documentation indicating that they (robust and cluster) are available.
                2. Given the nature of my indicator (psplantyp has 4 values), I expect and receive 3 coefficients regardless of vce option. And they are computed to the same value, so the coefficient computation is unaffected by the choice of vce.

                Sincerely,

                Charles
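
                Two cross-checks that might help isolate whether the problem is specific to xtreg are sketched below. With exactly two observations per enrolid and no year dummies, the within estimator and OLS on first differences without a constant give the same slope coefficients, so the second command is a like-for-like comparison of coefficients. The dummy names pt1 to pt4 are hypothetical, and whether either command returns non-missing robust standard errors for these data is not established in this thread.

                ```stata
                * (a) Same fixed-effects model through areg, absorbing enrolid
                areg totdaysupp i.psplantyp, absorb(enrolid) vce(cluster enrolid)

                * (b) First-difference version; D. needs the panel to be xtset
                xtset enrolid year

                * Create one indicator per psplantyp value: pt1 (type 2) through pt4 (type 8)
                tabulate psplantyp, generate(pt)

                * One differenced observation per enrolid, so vce(robust) plays the
                * role that clustering on enrolid plays in the two-period FE model
                regress D.totdaysupp D.(pt2 pt3 pt4), noconstant vce(robust)
                ```

                The standard errors from (b) need not match the xtreg ones exactly because the degrees-of-freedom adjustments differ.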




                • #9

                  Hi again, Carlo,

                  I just happened to be searching on the web and found your September 2017 comments on robust and cluster in the xtreg context.

                  "
                  Prash:
                  I keep replying along the same lines as the answer looks (to me, at any rate), the same.
                  Provided that what you're keeping asking is well covered in -xtreg- entry of Stata .pdf manual:
                  - if you use -robust- option in -regress- (regardless of whether you have cross-sectional or panel data) your standard errors will accommodate heteroskedasticity only. If you actually have panel data and you use -regress- but omit to cluster the standard errors on -panelid-, your results will be biased (ie, untrustworthy), as you consider all the observations as independent and neglect their panel structure. Hence, it is not that -robust- outperforms -cluster- if you use -regress- with panel data: -robust- is simply wrong;
                  - if you use -robust- under -xtreg- your standard errors will take both heteroskedasticity and/or autocorrelation of the systematic error into account. Hence if your panel data are affected by heteroskedasticity only; serial correlation only; both heteroskedasticity and serial correlation, -robust- (or -cluster-) option will account for them."

                  OK, so if robust is not appropriate here, and therefore moot, it remains that vce(cluster enrolid) does not provide standard errors.

                  FYI, as I expand my variable set I'm finding that bootstrap and default standard errors provide similar inferences according to the t and z statistics.

                  Regards,

                  Charles



                  • #10
                    Hi Carlo, I just got back to work. I haven't seen any more posts from you on this topic since 6 August. Thanks for your comments and help.


                    Charles
