  • List of observations in regression

    Hello,

    I am running a GEE regression in Stata (xtgee):

    xtgee Profit number_employees company_size , link(log) corr(ar1) iterate(100) family(igaussian) vce(robust)


    When I run the regression, Stata tells me:

    note: observations not equally spaced
    modal spacing is delta Year = 1 unit
    12 groups omitted from estimation
    note: some groups have fewer than 2 observations
    not possible to estimate correlations for those groups
    25 groups omitted from estimation

    This is totally fine for me. But I would like to know which observations ended up being part of the regression. Is there a way to do that?

    Thanks for the help!!!

  • #2
    One of the results that Stata saves after estimation is called "e(sample)". It is effectively a dummy that is 1 if the observation was included in the estimation and 0 if not. Because it is "temporary" (the next estimation command replaces it), I recommend generating a new variable equal to e(sample) immediately and then using that variable to see which observations are included and which are excluded.
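
    A minimal sketch of that advice applied to the model in the question. The panel identifier firm_id and the names insample and firstobs are assumptions for illustration (the original post does not name the panel variable); Year is taken from the notes Stata printed.

    ```stata
    * Fit the model from the question (data assumed already xtset firm_id Year)
    xtgee Profit number_employees company_size, link(log) corr(ar1) iterate(100) family(igaussian) vce(robust)

    * Copy e(sample) right away, before another estimation command replaces it
    generate byte insample = e(sample)

    * How many observations were used and how many were dropped
    tabulate insample

    * List the observations that entered the estimation
    list firm_id Year if insample == 1

    * Count the panels (groups) that were kept; firm_id is hypothetical
    egen byte firstobs = tag(firm_id) if insample == 1
    count if firstobs == 1
    ```

    Storing the indicator as a regular variable means it survives later estimation commands and can be reused in if qualifiers.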

    • #3
      Olmaba:
      -e(sample)- is probably the answer:
      Code:
      . use http://www.stata-press.com/data/r16/nlswork2.dta
      . xtgee ln_w grade age c.age#c.age, corr(indep) nmp
      
      Iteration 1: tolerance = 8.741e-13
      
      GEE population-averaged model                   Number of obs     =     16,085
      Group variable:                     idcode      Number of groups  =      3,913
      Link:                             identity      Obs per group:
      Family:                           Gaussian                    min =          1
      Correlation:                   independent                    avg =        4.1
                                                                    max =          9
                                                      Wald chi2(3)      =    4241.04
      Scale parameter:                  .1408958      Prob > chi2       =     0.0000
      
      Pearson chi2(16081):               2265.75      Deviance          =    2265.75
      Dispersion (Pearson):             .1408958      Dispersion        =   .1408958
      
      ------------------------------------------------------------------------------
           ln_wage |      Coef.   Std. Err.      z    P>|z|     [95% Conf. Interval]
      -------------+----------------------------------------------------------------
             grade |   .0724483   .0014229    50.91   0.000     .0696594    .0752372
               age |   .1064874   .0083644    12.73   0.000     .0900935    .1228812
                   |
       c.age#c.age |  -.0016931   .0001655   -10.23   0.000    -.0020174   -.0013688
                   |
             _cons |  -.8681487   .1024896    -8.47   0.000    -1.069025   -.6672728
      ------------------------------------------------------------------------------
      
      . gen flag = e(sample)
      
      . sum flag
      
          Variable |        Obs        Mean    Std. Dev.       Min        Max
      -------------+---------------------------------------------------------
              flag |     16,094    .9994408    .0236418          0          1
      
      . tab flag
      
             flag |      Freq.     Percent        Cum.
      ------------+-----------------------------------
                0 |          9        0.06        0.06
                1 |     16,085       99.94      100.00
      ------------+-----------------------------------
            Total |     16,094      100.00
      
      .
      PS: I am happy to see that my reply is consistent with Rich's helpful (and faster) advice!
      Kind regards,
      Carlo
      (Stata 19.0)
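
      Continuing the example above, a short sketch for inspecting the nine excluded observations. It assumes flag was created as shown and that the dataset contains a year variable; missing values in the model variables are one plausible reason for exclusion, which the second command checks rather than assumes.

      ```stata
      * Show the observations that xtgee left out of the estimation sample
      list idcode year ln_wage grade age if flag == 0

      * Summarize missing values among the model variables for those observations
      misstable summarize ln_wage grade age if flag == 0
      ```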
