  • Dummy variable interpretation

    Code:
         Source |       SS           df       MS      Number of obs   =     1,315
    -------------+----------------------------------   F(28, 1286)     =    273.58
           Model |  946328.028        28  33797.4296   Prob > F        =    0.0000
        Residual |  158872.156     1,286  123.539779   R-squared       =    0.8563
    -------------+----------------------------------   Adj R-squared   =    0.8531
           Total |  1105200.18     1,314  841.096031   Root MSE        =    11.115
    
    ------------------------------------------------------------------------------
           enrol |      Coef.   Std. Err.      t    P>|t|     [95% Conf. Interval]
    -------------+----------------------------------------------------------------
             CPI |   .1847218   .0313612     5.89   0.000     .1231971    .2462465
     newmarriage |    -.32916   .0818746    -4.02   0.000    -.4897824   -.1685375
          depend |  -.2282612   .0387019    -5.90   0.000     -.304187   -.1523355
         edspend |   2.111124   .2682445     7.87   0.000     1.584879    2.637368
       mortality |  -.3557503   .0344384   -10.33   0.000     -.423312   -.2881887
     newfemteach |   .3355659   .0259151    12.95   0.000     .2847253    .3864065
           urban |   .1179031   .0243665     4.84   0.000     .0701007    .1657055
           lnGDP |   2.257544   .9004769     2.51   0.012     .4909787    4.024109
              UM |  -2.232604   1.267144    -1.76   0.078    -4.718501    .2532922
              LM |  -1.696402   1.918191    -0.88   0.377    -5.459528    2.066725
               L |  -2.078713   2.783371    -0.75   0.455    -7.539159    3.381733
                 |
            year |
           2001  |   .2627674   1.840499     0.14   0.886    -3.347943    3.873477
           2002  |  -.1156184   1.834911    -0.06   0.950    -3.715366     3.48413
           2003  |   .1621663   1.842056     0.09   0.930    -3.451598     3.77593
           2004  |  -.7602889   1.842209    -0.41   0.680    -4.374354    2.853777
           2005  |  -.8418906    1.84367    -0.46   0.648    -4.458822    2.775041
           2006  |  -1.357138   1.844013    -0.74   0.462    -4.974742    2.260465
           2007  |  -1.241886   1.846762    -0.67   0.501    -4.864884    2.381112
           2008  |  -1.327063   1.850116    -0.72   0.473    -4.956639    2.302513
           2009  |  -1.505653   1.861639    -0.81   0.419    -5.157836    2.146531
           2010  |  -.7093271   1.864264    -0.38   0.704     -4.36666    2.948006
           2011  |  -.2339199   1.863438    -0.13   0.900    -3.889632    3.421792
           2012  |  -.5687192   1.864502    -0.31   0.760    -4.226518     3.08908
           2013  |     2.6685   1.867584     1.43   0.153    -.9953454    6.332345
           2014  |   3.751158    1.87445     2.00   0.046     .0738434    7.428473
           2015  |   4.425729   1.869453     2.37   0.018     .7582165    8.093242
           2016  |   5.518136   1.867712     2.95   0.003      1.85404    9.182233
           2017  |   5.279886   1.869216     2.82   0.005     1.612839    8.946933
                 |
           _cons |    50.3543   9.066823     5.55   0.000     32.56691    68.14169
    -----------------------------------------------------------------------------

  • #2
    Hi,

    Above is my regression output. I am investigating the effect of corruption on education (specifically enrolment rates). I have added dummy variables for country income classification: L = low income country, LM = lower-middle income country, UM = upper-middle income country, with high income country as the base category. However, since I have also included year dummies, I am not sure how to interpret the constant. I think the constant represents a high income country in the year 2000. Is there any way I can find just the effect of a country being classified as high income across the years? Or can I only interpret it for each year separately?

    Thanks for any help!


    • #3
      Your interpretation of the constant is correct, with the caveat that it also holds all of the other covariates at zero. But what do you mean by "across the years"? Your model assumes that the effect of income does not vary by year; that is, the model is additive and there is no year-by-income interaction. So the effect of LM, for example, is -1.696 (rounded) regardless of year.
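
      To make the additivity concrete, the fitted model can be written out as below. This is only a sketch of the posted regression: the symbols (x for the continuous covariates, gamma for the income dummies, delta for the year dummies) are notation introduced here, and it assumes 2000 is the omitted base year, as the original poster states.

      ```latex
      \begin{aligned}
      \text{enrol}_{it} &= \beta_0 + \mathbf{x}_{it}'\boldsymbol{\beta}
        + \gamma_{UM}\,UM_{it} + \gamma_{LM}\,LM_{it} + \gamma_{L}\,L_{it}
        + \sum_{s=2001}^{2017} \delta_s\,\mathbf{1}[t = s] + \varepsilon_{it} \\[4pt]
      E[\text{enrol} \mid LM, \mathbf{x}, t] - E[\text{enrol} \mid \text{High}, \mathbf{x}, t]
        &= \gamma_{LM} \approx -1.696 \quad \text{(no dependence on } t\text{)} \\[4pt]
      E[\text{enrol} \mid \text{High}, \mathbf{x}, t]
        &= \beta_0 + \mathbf{x}'\boldsymbol{\beta} + \delta_t, \qquad \delta_{2000} = 0 \\[4pt]
      E[\text{enrol} \mid \text{High}, \mathbf{x} = \mathbf{0}, t = 2000]
        &= \beta_0 \approx 50.35
      \end{aligned}
      ```

      Because the year terms cancel in the second line, the income contrast is the same in every year. The last line shows why the constant on its own is hard to interpret: it refers to a high income country in 2000 with CPI, lnGDP, and the other covariates all equal to zero.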
      Richard T. Campbell
      Emeritus Professor of Biostatistics and Sociology
      University of Illinois at Chicago


      • #4
        Thanks for your response, Dick.

        What I meant was: is there a way to see the effect of a country being high income, regardless of year? (Similar to how the effect of LM, regardless of year, is -1.696.)


        • #5
          A brute force approach would be to simply change the reference category. But the point to see here is that your equation already contains all of the possible information you can get about the effect of income. As it stands, you have chosen to see those effects in terms of contrasts to the high income category. You will notice that all of the dummy variables for income categories included in your model have negative coefficients, meaning that, given your model, relative to high income countries, lower income reduces enrollment, regardless of year and regardless of the values of the other variables. So, for example, low income countries (coded L) have estimated enrollments 2.078713 below those of high income countries in every year and regardless of the values of edspend, mortality, etc. Interpreting sets of dummy variables like year and income categories requires you to think hard about exactly what the model means, but it is effort well spent.
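
          As a rough illustration of changing the reference category, the sketch below refits the posted model with low income as the base. The variable names are taken from the output above; the indicator H is a hypothetical name introduced here, and the sketch assumes that UM, LM, and L are 0/1 dummies that together identify every non-high-income observation.

          ```stata
          * Posted model: high income is the omitted (base) category
          regress enrol CPI newmarriage depend edspend mortality newfemteach ///
              urban lnGDP UM LM L i.year

          * Contrast between two non-base categories, without refitting
          lincom _b[L] - _b[LM]      // low vs. lower-middle income

          * Hypothetical indicator for high income (assumption: not in the data)
          generate byte H = (UM == 0 & LM == 0 & L == 0) if !missing(UM, LM, L)

          * Same model with low income as the base instead
          regress enrol CPI newmarriage depend edspend mortality newfemteach ///
              urban lnGDP H UM LM i.year
          ```

          The second regression is a reparameterization of the first, so on the same estimation sample the fit is unchanged and the coefficient on H should simply be the coefficient on L with its sign reversed (about 2.08).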

          If you were to treat income as a factor variable, you could use Stata's margins and then marginsplot to see this graphically. If you are not familiar with these techniques, you might want to look at a Stata Journal paper by Rich Williams, which explains in detail how to do it (The Stata Journal (2012) 12, Number 2, pp. 308–331). This paper is accessible for free from the SJ website.
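
          A minimal sketch of that approach follows. It assumes the three dummies can be collapsed into a single categorical variable; incgrp and incgrp_lbl are hypothetical names introduced here, and the rest of the variable list is copied from the output above.

          ```stata
          * Build one categorical income variable from the existing dummies
          * (incgrp is a hypothetical name; 1 = high income)
          generate byte incgrp = 1 if !missing(UM, LM, L)
          replace incgrp = 2 if UM == 1
          replace incgrp = 3 if LM == 1
          replace incgrp = 4 if L == 1
          label define incgrp_lbl 1 "High" 2 "Upper-middle" 3 "Lower-middle" 4 "Low"
          label values incgrp incgrp_lbl

          * Same model, with income entered as a factor variable (base = high)
          regress enrol CPI newmarriage depend edspend mortality newfemteach ///
              urban lnGDP i.incgrp i.year

          * Predicted enrolment for each income group, averaged over the
          * observed years and covariate values
          margins incgrp
          marginsplot

          * Contrasts of each group with the base category
          margins, dydx(incgrp)
          ```

          The margins incgrp table includes a row for high income, which is probably the closest thing to "the effect of being high income regardless of year": a predicted enrolment level averaged over all years, rather than a coefficient.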
          Richard T. Campbell
          Emeritus Professor of Biostatistics and Sociology
          University of Illinois at Chicago
