  • Logistic regressions: Implications of changing base year dummy

    Dear all,

    When I change the base year dummy (from the start year, 2002, to the end year, 2010) in the logistic regressions I run (logit and xtlogit, pa), two changes occur: i) the coefficients and standard errors of the year dummies change, and ii) the coefficient and standard error of the constant term change.

    Two questions on this:
    1) Why do the coefficient and standard error of the constant term change?
    2) Why do the coefficients and standard errors of the other independent variables not change?

    Many thanks in advance,
    Matthias

  • #2
    You can think of the constant term in a regression as the expected value of the dependent variable (in a logistic regression, the log odds of the outcome) when all independent variables are zero. When you have a block of indicator variables for the years 2002-2010 and the base year is 2010, the constant term is the expected value in 2010 (i.e. when all of the indicators for 2002-2009 are zero, along with every other independent variable), whereas if the base year is 2002, the constant term is the expected value in 2002 (when all of the indicators for 2003-2010 are zero). So those will generally differ.

    The coefficients of the independent variables, however, can be thought of as representing the expected change in the dependent variable if that independent variable increases by 1 and all other independent variables remain constant. For one of the year indicators, a change from zero to one is a unit increment, and therefore the coefficient estimates the expected difference in the dependent variable between the indicated year and the base year, so it will depend on the base year. For all of the other variables, a unit increase, holding everything else constant, leaves the year indicators untouched. In the absence of interaction effects, the effect of that increase is independent of the year, so the representation of years makes no difference.
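
    A minimal sketch of the point above, assuming Stata's shipped union example dataset as a stand-in for the original poster's data (the outcome union and the covariates age and grade are illustrative choices, not the poster's model). It fits the same logit twice, once with the first year and once with the last year as the base, and lines the estimates up side by side:

    ```stata
    * Assumption: -webuse union- stands in for the poster's data
    webuse union, clear

    * Base year = first year observed in the data
    logit union age grade ib(first).year
    estimates store base_first

    * Base year = last year observed in the data
    logit union age grade ib(last).year
    estimates store base_last

    * Compare coefficients and standard errors across the two fits
    estimates table base_first base_last, b(%9.4f) se(%9.4f)
    ```

    In the resulting table the rows for age and grade should be identical in both columns, while the year rows and _cons differ. The two fits are the same model in a different parameterization, so the log likelihood is also the same.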

    • #3
      Another way to think about it: the coefficients for the dummy variables show you how the intercept for each group differs from the intercept for the reference category. So, for example, suppose group 1 is the reference group, and the coefficients for dummy 2 and dummy 3 are 7 and 8 respectively. Each of these values may significantly differ from the intercept for the reference group.

      But suppose you make group 2 the reference. The coefficient for dummy 1 will equal -7, and the coefficient for dummy 3 will equal 1 (because group 3 has a value one point higher than group 2). The latter may not be significant. That is, the fact that groups 2 and 3 both differ from group 1 does not tell you whether groups 2 and 3 significantly differ from each other.

      When working with dummies, two handy commands are testparm and pwcompare. testparm tests whether the dummy variables as a group have effects that significantly differ from 0. pwcompare lets you compare every group with every other group, showing you whether or not the differences are statistically significant. Here are examples:

      Code:
      . webuse nhanes2f,clear
      
      . reg weight i.race height
      
            Source |       SS       df       MS              Number of obs =   10337
      -------------+------------------------------           F(  3, 10333) = 1053.44
             Model |  570783.717     3  190261.239           Prob > F      =  0.0000
          Residual |  1866243.53 10333  180.610039           R-squared     =  0.2342
      -------------+------------------------------           Adj R-squared =  0.2340
             Total |  2437027.25 10336    235.7805           Root MSE      =  13.439
      
      ------------------------------------------------------------------------------
            weight |      Coef.   Std. Err.      t    P>|t|     [95% Conf. Interval]
      -------------+----------------------------------------------------------------
              race |
            Black  |   3.342126   .4315817     7.74   0.000     2.496142     4.18811
            Other  |  -4.093021   .9641624    -4.25   0.000    -5.982966   -2.203076
                   |
            height |   .7537809   .0137332    54.89   0.000     .7268612    .7807007
             _cons |  -54.74336   2.308216   -23.72   0.000    -59.26791   -50.21881
      ------------------------------------------------------------------------------
      
      . reg weight ib2.race height
      
            Source |       SS       df       MS              Number of obs =   10337
      -------------+------------------------------           F(  3, 10333) = 1053.44
             Model |  570783.717     3  190261.239           Prob > F      =  0.0000
          Residual |  1866243.53 10333  180.610039           R-squared     =  0.2342
      -------------+------------------------------           Adj R-squared =  0.2340
             Total |  2437027.25 10336    235.7805           Root MSE      =  13.439
      
      ------------------------------------------------------------------------------
            weight |      Coef.   Std. Err.      t    P>|t|     [95% Conf. Interval]
      -------------+----------------------------------------------------------------
              race |
            White  |  -3.342126   .4315817    -7.74   0.000     -4.18811   -2.496142
            Other  |  -7.435147   1.037341    -7.17   0.000    -9.468536   -5.401757
                   |
            height |   .7537809   .0137332    54.89   0.000     .7268612    .7807007
             _cons |  -51.40123   2.340395   -21.96   0.000    -55.98886   -46.81361
      ------------------------------------------------------------------------------
      
      . testparm i.race
      
       ( 1)  1.race = 0
       ( 2)  3.race = 0
      
             F(  2, 10333) =   40.66
                  Prob > F =    0.0000
      
      . pwcompare race, pv
      
      Pairwise comparisons of marginal linear predictions
      
      Margins      : asbalanced
      
      --------------------------------------------------------
                      |                            Unadjusted
                      |   Contrast   Std. Err.      t    P>|t|
      ----------------+---------------------------------------
                 race |
      Black vs White  |   3.342126   .4315817     7.74   0.000
      Other vs White  |  -4.093021   .9641624    -4.25   0.000
      Other vs Black  |  -7.435147   1.037341    -7.17   0.000
      --------------------------------------------------------
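
      As a further check on the two regressions above, a short sketch using lincom (not part of the original post) shows that each fit can reproduce the other's reference-dependent estimates. It relies only on the race coding visible in the output above (1 = White, 2 = Black, 3 = Other):

      ```stata
      webuse nhanes2f, clear

      * White as reference: _cons plus the Black coefficient is the
      * intercept for Black, i.e. _cons of the ib2.race model (-51.40123)
      regress weight i.race height
      lincom _cons + 2.race

      * Black as reference: Other minus White recovers the
      * "Other vs White" contrast (-4.093021)
      regress weight ib2.race height
      lincom 3.race - 1.race
      ```

      lincom also reports a standard error for each combination, which should match the corresponding rows of the regression and pwcompare output above.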
      -------------------------------------------
      Richard Williams, Notre Dame Dept of Sociology
      StataNow Version: 19.5 MP (2 processor)

      EMAIL: [email protected]
      WWW: https://www3.nd.edu/~rwilliam
