  • xtreg, r(2001) insufficient observations

    Hello,

    I am currently writing my thesis on STEM gaps. My dataset contains 106,000 independent observations. I declared the data as a panel using "xtset OBSNUM". However, when I try to run a regression such as the following:

    xtreg lnearn c.age i.kid##i.sex

    the error message "r(2001) insufficient observations" appears. I do have some missing values for lnearn, but dropping them doesn't change anything.

    Does anyone know what I am doing wrong?

    Thanks for your help,
    Elias

    See the attachment for the dataex output.

  • #2
    Thanks for attempting dataex, but please use it as specified: copy and paste the results into the forum software. Images are not nearly as helpful as you hope. They don't allow copy and paste, and (in your case for me, and perhaps for others) the image may be unreadable anyway.



    • #3


      Code:
      * Example generated by -dataex-. For more info, type help dataex
      clear
      input float lnearn long(OBSNUM sex kid) byte age long(married degree) byte sector float stemcat
      11.608235  99671 0 0 40 1 6 2 0
      10.668956  92121 1 1 40 1 6 1 0
      11.0021  27204 1 1 34 1 6 2 0
      11.617286   5203 0 0 46 0 5 2 0
      9.903487  19266 1 1 32 1 6 3 0
      .  67306 1 1 71 0 5 0 0
      11.407565  46965 0 1 56 1 5 3 1
      10.434115  59541 1 1 35 1 5 3 0
      11.289782  40288 1 1 58 0 5 1 0
      11.289782  47151 1 0 38 0 5 2 2
      10.99541  36556 1 0 43 1 6 1 1
      .  75303 0 0 74 0 5 0 0
      10.08581  88199 1 0 74 1 6 1 0
      10.518673  16216 1 0 42 0 6 3 1
      11.512925  46218 0 0 40 0 7 1 1
      10.094108  54854 1 1 36 1 6 0 0
      11.695247  31147 0 1 37 1 6 3 1
      12.323855  77221 1 0 32 0 6 3 1
      11.91839 102601 1 1 52 1 5 3 1
      9.903487   1579 1 1 34 1 6 1 0
      11.326596  67003 0 0 31 0 5 2 1
      11.350407  56278 1 1 38 1 5 3 0
      .  56630 0 1 37 0 5 0 0
      11.561716  71363 1 0 32 0 5 1 0
      12.733755  97740 1 0 30 1 6 3 1
      12.100712  32735 0 1 38 1 7 3 1
      11.751943 105248 1 0 61 0 7 1 0
      12.380026  32479 0 1 37 1 7 3 0
      11.225244  54875 1 1 31 0 5 3 0
      11.472103  30889 1 1 41 1 5 3 0
      11.695247  38476 1 1 45 0 6 1 0
      11.225244  51800 0 0 37 0 5 3 1
      10.819778  70229 1 0 32 0 8 3 0
      . 101395 0 1 69 1 6 0 0
      11.695247  59353 0 1 57 1 5 3 0
      11.289782  23177 1 1 37 0 6 2 0
      . 102154 1 1 32 1 5 2 1
      12.206073  22403 0 1 39 1 6 3 0
      .  18936 1 0 68 1 6 0 0
      12.542545  22318 0 0 41 1 6 3 0
      10.714417  48712 1 1 33 1 5 3 0
      12.556163  35418 1 1 31 1 5 3 1
      12.429216 100700 0 1 40 1 6 3 0
      12.388394  31194 0 1 37 1 7 3 0
      11.589887  83422 1 1 44 1 5 1 0
      11.81303  74846 0 0 35 0 7 3 1
      11.964  54828 1 1 43 1 7 3 2
      12.206073  83125 0 1 58 1 6 3 1
      11.81303  93977 0 1 33 1 5 3 0
      7.313221  15072 0 0 59 0 5 0 0
      10.596635  61363 0 0 50 0 5 3 0
      10.325482  63799 0 1 35 1 8 0 0
      11.225244 103236 0 0 34 0 5 3 0
      11.456355  76049 1 1 32 1 7 3 0
      11.695247 104104 0 1 38 1 6 2 0
      11.512925  73044 0 0 57 1 5 3 0
      11.184422  73519 1 0 75 0 6 0 0
      11.512925  54690 0 1 32 1 8 2 0
      12.7367  64745 0 0 32 0 8 3 0
      8.699514  10390 0 0 72 0 6 3 0
      10.12663  28437 1 0 35 1 6 1 2
      10.714417  23614 1 1 35 0 6 3 0
      11.695247  30328 0 0 60 1 5 3 1
      10.91509 101608 1 1 36 0 6 1 0
      . 102393 1 0 75 1 5 0 0
      11.15625   2938 1 0 33 0 6 3 0
      9.903487  25240 1 0 30 0 6 3 1
      12.468437   8639 0 0 33 0 7 3 1
      11.580585  65520 1 0 46 0 6 1 0
      10.714417   9329 0 1 38 1 6 2 2
      11.407565 103863 1 1 35 1 5 3 1
      11.711777  70853 0 0 52 0 5 3 0
      6.633318  53471 0 0 73 0 6 0 0
      11.429543 105735 1 0 33 0 8 2 0
      12.061047  63114 0 1 52 1 5 3 0
      10.819778  95921 0 1 60 1 7 1 0
      10.878047  42682 1 1 33 1 6 1 0
      12.15478 104852 0 1 36 1 5 3 1
      .  57430 0 0 65 1 6 0 0
      10.12663  21534 0 1 33 1 6 3 0
      11.15625  70291 1 1 40 1 6 1 0
      10.714417   1457 1 1 48 0 6 2 0
      11.184422  43601 1 1 37 1 5 1 0
      11.429543  17004 1 1 35 1 6 3 0
      .  62450 0 0 66 1 6 0 0
      11.804326  47418 0 1 38 1 6 3 1
      11.407565   5710 1 0 71 0 6 3 0
      11.635143  93449 0 1 38 1 6 3 0
      11.77529  21005 1 0 33 1 5 3 0
      11.925035  90559 0 1 39 1 6 3 1
      12.0137  64212 1 0 33 0 6 3 0
      11.350407  74471 1 1 39 1 5 3 1
      11.512925  25293 0 1 36 1 5 3 1
      12.388915  41036 0 0 61 1 7 3 0
      11.385092  79374 0 1 39 0 5 3 1
      12.08954  18851 0 0 59 1 5 3 0
      11.571195  96938 0 1 57 1 5 3 0
      .  47890 0 0 59 1 6 0 0
      11.751943  44539 0 1 45 1 5 3 1
      9.903487   9255 0 1 46 1 8 3 0
      end
      label values sex genderlabel
      label def genderlabel 0 "M", modify
      label def genderlabel 1 "F", modify
      label values kid yesnolabel
      label values married yesnolabel
      label def yesnolabel 0 "N", modify
      label def yesnolabel 1 "Y", modify
      label values degree degreelabel
      label def degreelabel 5 "1", modify
      label def degreelabel 6 "2", modify
      label def degreelabel 7 "3", modify
      label def degreelabel 8 "4", modify
      label values sector sectorlabel
      label def sectorlabel 0 "L", modify
      label def sectorlabel 1 "educational institution", modify
      label def sectorlabel 2 "government", modify
      label def sectorlabel 3 "business/industry", modify
      label values stemcat stemlabel
      label def stemlabel 0 "others", modify
      label def stemlabel 1 "GEMP", modify
      label def stemlabel 2 "LPS", modify



      • #4
        Your fixed effect is a dummy for every observation?



        • #5
          At least in your example data, each OBSNUM occurs exactly once. In what sense is this panel data? Unless you have more observations in which OBSNUM is repeated, this is not panel data, and that is the reason for the error message.
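A quick way to check the point made in #5 is to count how often each identifier occurs. The following is a minimal sketch, assuming OBSNUM is the only candidate panel identifier; the four rows are copied from the dataex listing in #3, and the variable name n_per_id is introduced here purely for illustration.

```stata
* First four rows of the dataex example in #3 (lnearn, OBSNUM, age only)
clear
input float lnearn long OBSNUM byte age
11.608235 99671 40
10.668956 92121 40
11.0021   27204 34
11.617286  5203 46
end

* isid exits with an error if OBSNUM does not uniquely identify the rows.
* If it passes silently, every OBSNUM occurs once: no unit is repeated.
isid OBSNUM

* Count rows per identifier (n_per_id is a hypothetical helper variable).
* A panel needs at least some identifiers with more than one row.
bysort OBSNUM: generate n_per_id = _N
tabulate n_per_id
```

If every identifier shows a count of 1, there is no within-unit variation for a panel estimator to use, which is consistent with the explanation given above.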



          • #6
            Oh OK, yes, thanks. The data was, however, collected in 4 different years. Although individuals aren't observed multiple times, is it possible to include time fixed effects in some way?

            What would the best method be?



            • #7
              Elias:
              as per your example, you do not have a panel dataset. Therefore, I'd go:
              Code:
              . regress lnearn i.sex i.kid c.age##c.age i.married i.degree i.sector i.stemcat
              
                    Source |       SS           df       MS      Number of obs   =        90
              -------------+----------------------------------   F(13, 76)       =      4.55
                     Model |  38.1598991        13  2.93537686   Prob > F        =    0.0000
                  Residual |  49.0383721        76  .645241739   R-squared       =    0.4376
              -------------+----------------------------------   Adj R-squared   =    0.3414
                     Total |  87.1982713        89  .979755857   Root MSE        =    .80327
              
              ------------------------------------------------------------------------------------------
                                lnearn | Coefficient  Std. err.      t    P>|t|     [95% conf. interval]
              -------------------------+----------------------------------------------------------------
                                   sex |
                                    F  |   .1154282   .2121206     0.54   0.588    -.3070466    .5379031
                                       |
                                   kid |
                                    Y  |  -.2431639   .2246946    -1.08   0.283    -.6906819    .2043542
                                   age |   .1103801   .0736976     1.50   0.138    -.0364014    .2571616
                                       |
                           c.age#c.age |  -.0012263    .000741    -1.66   0.102    -.0027021    .0002494
                                       |
                               married |
                                    Y  |   .3504836   .2210022     1.59   0.117    -.0896805    .7906477
                                       |
                                degree |
                                    2  |  -.0921419   .2025288    -0.45   0.650     -.495513    .3112291
                                    3  |   .6133596   .2903066     2.11   0.038      .035164    1.191555
                                    4  |   .0651655   .3919277     0.17   0.868    -.7154263    .8457572
                                       |
                                sector |
              educational institution  |   1.426503   .4612709     3.09   0.003     .5078021    2.345203
                           government  |   1.864617    .477633     3.90   0.000     .9133282    2.815905
                    business/industry  |    1.85534   .4197086     4.42   0.000     1.019418    2.691262
                                       |
                               stemcat |
                                 GEMP  |   .2034165   .2102973     0.97   0.336    -.2154269    .6222598
                                  LPS  |  -.5708022   .4456755    -1.28   0.204    -1.458442    .3168376
                                       |
                                 _cons |   7.084807     1.7402     4.07   0.000     3.618898    10.55072
              ------------------------------------------------------------------------------------------
              Kind regards,
              Carlo
              (Stata 19.0)



              • #8
                You have a repeated cross section. If you don’t know the year an observation was collected, you can’t include time fixed effects.



                • #9
                  Thanks a lot, Carlo!

                  Jeff, thanks for your answer. Yes, I can identify the year in which each individual was observed. However, I was unable to use the year in xtset because I only have 4 different dates (2013, 2015, 2017, 2019), so many individuals share the same year. Is there any solution for this? Or is it best to simply include a year dummy in the regression?



                  • #10
                    Elias:
                    if you have a repeated cross-sectional study there's no need to -xtset-, unless you want to use the -xtsum- command.
                    If different individuals share the same year, this is not an issue.
                    I would add an -i.time- predictor in the right-hand side of your regression equation.
                    Kind regards,
                    Carlo
                    (Stata 19.0)
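The suggestion in #10 can be sketched as a pooled regression with survey-year indicators. This is only an illustration under stated assumptions: the data below are simulated (not the thesis data), the variable name year is hypothetical since the thread never names the survey-year variable, and the specification simply reuses the terms from the original xtreg call.

```stata
* Simulated repeated cross-section, for illustration only
clear
set seed 12345
set obs 400
generate long OBSNUM = _n                          // each person appears once
generate int  year   = 2013 + 2*mod(_n, 4)         // 2013, 2015, 2017, 2019
generate byte sex    = runiform() < 0.5
generate byte kid    = runiform() < 0.5
generate byte age    = 30 + floor(46*runiform())   // ages 30 to 75
generate double lnearn = 11 + 0.01*age - 0.1*sex ///
    + 0.02*(year - 2013) + rnormal(0, 0.5)

* Pooled OLS with year dummies; no -xtset- is needed for this
regress lnearn c.age i.kid##i.sex i.year

* Joint test that all year indicators are zero
testparm i.year
```

Here i.year plays the role of the -i.time- predictor: it absorbs differences in average log earnings across the four survey waves, with the first year as the omitted base category.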

