  • Stata not considering all observations during regression

    Hello all,

    I am running a regression to analyse the impact of per capita spending on education on student test scores. The test scores are obtained from the PISA survey, which is conducted every 3 years. My dataset contains PISA scores since 2000 (i.e. 2000, 2003, 2006, ..., 2018). When I run the regression, I see that Stata is not using the PISA scores for 2000, 2003, and 2006. Can anyone help me understand why this is happening and how I can correct it? Here is an excerpt of my dataset:

    Code:
    * Example generated by -dataex-. For more info, type help dataex
    clear
    input int Year float(country_id Overall_Average_Score) double(education_spending__GDP gdp_per_capita gini)
    2000 1  528.2785 2.016412 28312.86654    .
    2001 1         .        . 29546.38199    .
    2002 1         .        . 30807.51408    .
    2003 1  524.8465        . 32391.46999    .
    2004 1         .        . 34000.18232    .
    2005 1         . 2.101533 35659.12549    .
    2006 1 519.89355        . 37938.79208    .
    2007 1         .        . 39687.44861    .
    2008 1         . 2.056001 40130.34169    .
    2009 1  518.8372 2.205964 41672.93617    .
    2010 1         . 2.275109 42816.44175    .
    2011 1         . 2.138796 44440.57948    .
    2012 1 512.48315 2.074064  43884.6356 .326
    2013 1         . 2.061826  47763.2159    .
    2014 1         . 2.043287 47606.75493 .337
    2015 1 502.26355 1.993189 47226.75975    .
    2016 1         . 1.991675 50136.79838  .33
    2017 1         . 2.028433 50690.61954    .
    2018 1  498.9854 2.066418  52991.2149 .325
    2019 1         . 2.115632 52732.06568    .
    2020 1         .        .  55690.9186 .318
    2021 1         .        . 61977.19618    .
    2000 2   492.056        . 29380.03111    .
    2001 2         .        . 29707.46226    .
    2002 2         .        . 31178.05144    .
    2003 2  498.1521        . 32158.22591    .
    2004 2         .        . 33784.43265    .
    2005 2         .        . 35024.55748    .
    2006 2  502.1716        . 37659.84067    .
    2007 2         .        . 39436.42013 .284
    2008 2         .        . 41316.02264  .28
    2009 2    486.84        . 40929.33675 .289
    2010 2         .        . 42020.55064  .28
    2011 2         .        . 44469.20964 .281
    2012 2  500.3105 2.251973 46477.65508 .276
    2013 2         .   2.2804 47936.67796  .28
    2014 2         . 2.202749 48813.53441 .274
    2015 2  492.2151 2.187135 49942.05629 .275
    2016 2         . 2.184957 52665.08746 .284
    2017 2         . 2.130408 54188.36067 .275
    2018 2 491.03845 2.025426 56956.11056  .28
    2019 2         . 2.001613 59719.33165 .274
    2020 2         .        . 57253.30056    .
    2021 2         .        . 59976.26467    .
    2000 3  507.1259        . 27789.05351    .
    2001 3         .        . 28791.40584    .
    2002 3         .        . 30281.66799    .
    2003 3  518.1369        . 30934.59859    .
    2004 3         .        .  32063.6738    .
    2005 3         . 4.092949 33176.68088    .
    2006 3  510.5377        . 35253.92329    .
    2007 3         .        .  36794.2342    .
    2008 3         . 4.353625 37883.23342    .
    2009 3  509.2645 4.359445 37753.27627    .
    2010 3         . 4.315599 39837.99795    .
    2011 3         . 4.291925 40943.34348    .
    2012 3  509.3383 2.746605 42290.47767    .
    2013 3         . 2.757966 43672.71229    .
    2014 3         . 2.716016 44929.93333    .
    2015 3 502.50275 2.654477 46201.68589    .
    2016 3         . 2.633516 48599.20268    .
    2017 3         . 2.613172 50442.94752    .
    2018 3  499.9026 2.569389 52530.84151 .258
    2019 3         .  2.53095 55800.82599 .262
    2020 3         .        . 54539.03253    .
    2021 3         .        . 58806.11925    .
    2000 4 534.31274        . 29362.28957 .315
    2001 4         .        . 30231.00503 .317
    2002 4         .        . 30963.22201 .317
    2003 4  530.2002        . 32350.07088 .315
    2004 4         .        . 33925.89256 .321
    2005 4         .        . 36327.66206 .315
    2006 4  529.4961        . 38120.17679 .316
    2007 4         .        . 39575.32497 .317
    2008 4         .        . 40376.33017 .315
    2009 4   526.584        . 38865.40602 .316
    2010 4         .        . 40099.49789 .316
    2011 4         .   1.5377 41666.71885 .313
    2012 4  522.2119 1.501831 42290.87908 .317
    2013 4         . 1.471402 44298.51446 .319
    2014 4         . 1.447075 45753.78294 .312
    2015 4 523.33997 1.435967 44670.05437 .318
    2016 4         . 1.475583 46472.36877 .307
    2017 4         . 1.369036 48317.19116  .31
    2018 4     516.7 1.350653 49992.81142 .304
    2019 4         . 1.352826 49832.29798   .3
    2020 4         .        . 47228.37249  .28
    2021 4         .        . 53074.08826    .
    2000 5  409.5568        . 9440.041983    .
    2001 5         .        . 9845.971228    .
    2002 5         .        . 10205.70338    .
    2003 5         .        . 10779.63889    .
    2004 5         .        . 11682.59892    .
    2005 5         . 1.851618 12600.90585    .
    2006 5  430.5397        . 15589.15685    .
    2007 5         .        .  16771.1829    .
    2008 5         . 1.851339 16449.61404    .
    2009 5  439.2991 1.905145    16028.55  .48
    2010 5         . 1.723623 18041.31482    .
    2011 5         . 1.671707  20235.5706 .471
    end

  • #2
    Missing data
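
    Estimation commands drop any observation that has a missing value in the dependent variable or in any regressor. A minimal sketch of how to check this before estimating, using the variable names from the -dataex- excerpt above (which regressors enter the actual model is an assumption here):

    Code:
    * count missing values for each variable assumed to be in the model
    misstable summarize Overall_Average_Score education_spending__GDP gdp_per_capita gini

    * tabulate the years that have complete data on all four variables;
    * missing() returns 1 if any of its arguments is missing
    tabulate Year if !missing(Overall_Average_Score, education_spending__GDP, gdp_per_capita, gini)

    Any year absent from the second table has no observation with complete data and so cannot enter the regression.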



    • #3
      Kanika:
      Elaborating a bit on Jared's correct diagnosis: only 10 observations in your excerpt have no missing values in any variable, and those are the only ones included in the panel data regression:
      Code:
      . egen check=rowmiss( Year- gini )
      
      . tab check
      
            check |      Freq.     Percent        Cum.
      ------------+-----------------------------------
                0 |         10       10.00       10.00
                1 |         26       26.00       36.00
                2 |         41       41.00       77.00
                3 |         23       23.00      100.00
      ------------+-----------------------------------
            Total |        100      100.00
      
      . xtset country_id Year
      
      Panel variable: country_id (unbalanced)
       Time variable: Year, 2000 to 2021
               Delta: 1 unit
      
      . xtreg Overall_Average_Score education_spending__GDP gdp_per_capita gini i.Year,fe
      note: 2018.Year omitted because of collinearity.
      
      Fixed-effects (within) regression               Number of obs     =         10
      Group variable: country_id                      Number of groups  =          5
      
      R-squared:                                      Obs per group:
           Within  = 1.0000                                         min =          1
           Between = 0.7189                                         avg =        2.0
           Overall = 0.4425                                         max =          3
      
                                                      F(5,0)            =          .
      corr(u_i, Xb) = -0.9927                         Prob > F          =          .
      
      -----------------------------------------------------------------------------------------
        Overall_Average_Score | Coefficient  Std. err.      t    P>|t|     [95% conf. interval]
      ------------------------+----------------------------------------------------------------
      education_spending__GDP |  -47.16203          .        .       .            .           .
               gdp_per_capita |   -.007108          .        .       .            .           .
                         gini |   730.6377          .        .       .            .           .
                              |
                         Year |
                        2012  |  -51.60186          .        .       .            .           .
                        2015  |  -37.39955          .        .       .            .           .
                        2018  |          0  (omitted)
                              |
                        _cons |   707.1017          .        .       .            .           .
      ------------------------+----------------------------------------------------------------
                      sigma_u |  212.66236
                      sigma_e |          .
                          rho |          .   (fraction of variance due to u_i)
      -----------------------------------------------------------------------------------------
      F test that all u_i=0: F(4, 0) = .                           Prob > F =      .
      
      .
      Kind regards,
      Carlo
      (Stata 19.0)
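
      As a follow-up check, the observations actually used can be listed after estimation with e(sample), which Stata sets to 1 for every observation in the estimation sample. A short sketch, assuming it is run immediately after the -xtreg- command shown above:

      Code:
      * years and countries that entered the last estimation
      tabulate Year if e(sample)
      tabulate country_id if e(sample)

      * list the retained observations for inspection
      list country_id Year Overall_Average_Score if e(sample), sepby(country_id)

      In the posted excerpt, the only complete observations fall in 2009, 2012, 2015 and 2018, which matches the year dummies reported in the output above.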
