  • Two equivalent regressions are giving me slightly different estimates: Is this a precision issue? How do I control it?

    Good afternoon,

    If we look at the formula for the OLS estimator b = inv(X'X)*X'Y we see that the OLS estimator is invariant to division of all the variables (dependent and all independent) by the same constant.
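
    To spell the invariance out, here is a short derivation in exact arithmetic. The symbol c (any nonzero constant) and the tilde notation for the rescaled estimator are introduced here for illustration only:

    ```latex
    \begin{align*}
    \tilde b &= \left[(X/c)'(X/c)\right]^{-1}(X/c)'(Y/c) \\
             &= \left[c^{-2}\,X'X\right]^{-1} c^{-2}\,X'Y \\
             &= c^{2}\,(X'X)^{-1}\,c^{-2}\,X'Y \\
             &= (X'X)^{-1}X'Y = b
    \end{align*}
    ```

    The factors of c cancel exactly, so any difference between the two sets of estimates has to come from how the numbers are stored or computed, not from the algebra.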

    But when I do this, the OLS estimates differ in the fifth digit after the decimal point; see the code below. In the original regression I obtain

    . dis _b[mpg]
    -56.194159

    . dis _b[headroom]
    -675.59623


    vs

    . dis _b[mpg]
    -56.194129

    . dis _b[headroom]
    -675.59625

    after I divide throughout by a constant. The estimates are close, but they are not the same.

    My questions are:

    1) Is this a precision issue?
    2) How do I control this issue and make it disappear?
    3) Should we be worried about this?

    Code:
    . clear
    
    . sysuse auto
    (1978 automobile data)
    
    . gen ones = 1
    
    . reg price mpg headroom weight ones, hascons
    
          Source |       SS           df       MS      Number of obs   =        74
    -------------+----------------------------------   F(3, 70)        =     11.09
           Model |   204556469         3  68185489.6   Prob > F        =    0.0000
        Residual |   430508927        70  6150127.53   R-squared       =    0.3221
    -------------+----------------------------------   Adj R-squared   =    0.2931
           Total |   635065396        73  8699525.97   Root MSE        =    2479.9
    
    ------------------------------------------------------------------------------
           price | Coefficient  Std. err.      t    P>|t|     [95% conf. interval]
    -------------+----------------------------------------------------------------
             mpg |  -56.19416   85.07654    -0.66   0.511     -225.874    113.4856
        headroom |  -675.5962   392.3504    -1.72   0.090    -1458.115     106.922
          weight |   2.061945   .6586383     3.13   0.003      .748332    3.375557
            ones |   3158.306   3617.449     0.87   0.386    -4056.468    10373.08
    ------------------------------------------------------------------------------
    
    . dis _b[mpg]
    -56.194159
    
    . dis _b[headroom]
    -675.59623
    
    . predict double e, resid
    
    . summ e
    
        Variable |        Obs        Mean    Std. dev.       Min        Max
    -------------+---------------------------------------------------------
               e |         74   -6.15e-14    2428.453  -3354.365   7108.818
    
    . sca SD = r(sd)
    
    . for var price mpg headroom weight ones: replace X = X/SD
    
    ->  replace price = price/SD
    variable price was int now float
    (74 real changes made)
    
    ->  replace mpg = mpg/SD
    variable mpg was int now float
    (74 real changes made)
    
    ->  replace headroom = headroom/SD
    (74 real changes made)
    
    ->  replace weight = weight/SD
    variable weight was int now float
    (74 real changes made)
    
    ->  replace ones = ones/SD
    (74 real changes made)
    
    . reg price mpg headroom weight ones, hascons
    
          Source |       SS           df       MS      Number of obs   =        74
    -------------+----------------------------------   F(3, 70)        =     11.09
           Model |  34.6859748         3  11.5619916   Prob > F        =    0.0000
        Residual |  72.9999978        70  1.04285711   R-squared       =    0.3221
    -------------+----------------------------------   Adj R-squared   =    0.2931
           Total |  107.685973        73  1.47515031   Root MSE        =    1.0212
    
    ------------------------------------------------------------------------------
           price | Coefficient  Std. err.      t    P>|t|     [95% conf. interval]
    -------------+----------------------------------------------------------------
             mpg |  -56.19413   85.07654    -0.66   0.511    -225.8739    113.4857
        headroom |  -675.5963   392.3504    -1.72   0.090    -1458.115     106.922
          weight |   2.061945   .6586383     3.13   0.003     .7483322    3.375557
            ones |   3158.305   3617.449     0.87   0.386    -4056.469    10373.08
    ------------------------------------------------------------------------------
    
    . dis _b[mpg]
    -56.194129
    
    . dis _b[headroom]
    -675.59625

  • #2
    It's a precision issue. Replace

    Code:
    . for var price mpg headroom weight ones: replace X = X/SD
    with

    Code:
    foreach var in price mpg headroom weight ones {
        recast double `var'
        replace `var' = `var'/SD
    }
    (or the equivalent ancient for code, which I am not willing to look up now).


    • #3
      Thank you, Daniel !

      I initially thought that this was the problem, but then somehow I convinced myself that it could not be.

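      Let me elaborate for future reference: the precision problem arises in the data I feed to the second regression. The -regress- command does everything in full precision, but the division I do before the second regression is not in full precision: -replace- stores the quotients in float variables (note the "was int now float" messages in the log), and that is where the rounding comes from.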
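
      A minimal sketch of where the rounding enters, meant to be run from a do-file. It assumes the auto dataset; the scalar divisor and the variables p_float, p_double and gap are hypothetical names used only for illustration, and the standard deviation of price stands in for the residual SD used above:

      ```stata
      sysuse auto, clear

      * Any non-integer constant will do; here the SD of price (an assumption).
      summarize price
      scalar divisor = r(sd)

      * Same quotient, stored at two different precisions.
      generate float  p_float  = price/scalar(divisor)   // float storage, about 7 significant digits
      generate double p_double = price/scalar(divisor)   // double storage, about 16 significant digits

      * The gap between the two is the rounding introduced by float storage.
      generate double gap = p_float - p_double
      summarize gap
      ```

      If gap is not identically zero, the float copies differ slightly from the double copies, so a regression on them is run on slightly different data.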

      One way to fix this issue is the way Daniel showed: change the storage type of the variables to double before replacing them. Another way is to generate a new set of variables in full precision:

      Code:
      . clear
      
      . sysuse auto
      (1978 automobile data)
      
      . gen ones = 1
      
      . reg price mpg headroom weight ones, hascons
      
            Source |       SS           df       MS      Number of obs   =        74
      -------------+----------------------------------   F(3, 70)        =     11.09
             Model |   204556469         3  68185489.6   Prob > F        =    0.0000
          Residual |   430508927        70  6150127.53   R-squared       =    0.3221
      -------------+----------------------------------   Adj R-squared   =    0.2931
             Total |   635065396        73  8699525.97   Root MSE        =    2479.9
      
      ------------------------------------------------------------------------------
             price | Coefficient  Std. err.      t    P>|t|     [95% conf. interval]
      -------------+----------------------------------------------------------------
               mpg |  -56.19416   85.07654    -0.66   0.511     -225.874    113.4856
          headroom |  -675.5962   392.3504    -1.72   0.090    -1458.115     106.922
            weight |   2.061945   .6586383     3.13   0.003      .748332    3.375557
              ones |   3158.306   3617.449     0.87   0.386    -4056.468    10373.08
      ------------------------------------------------------------------------------
      
      . predict double e, resid
      
      . summ e
      
          Variable |        Obs        Mean    Std. dev.       Min        Max
      -------------+---------------------------------------------------------
                 e |         74   -6.15e-14    2428.453  -3354.365   7108.818
      
      . sca SD = r(sd)
      
      . for var price mpg headroom weight ones: gen double Xdouble = X/SD
      
      ->  gen double pricedouble = price/SD
      
      ->  gen double mpgdouble = mpg/SD
      
      ->  gen double headroomdouble = headroom/SD
      
      ->  gen double weightdouble = weight/SD
      
      ->  gen double onesdouble = ones/SD
      
      . reg pricedouble mpgdouble headroomdouble weightdouble onesdouble, hascons
      
            Source |       SS           df       MS      Number of obs   =        74
      -------------+----------------------------------   F(3, 70)        =     11.09
             Model |  34.6859758         3  11.5619919   Prob > F        =    0.0000
          Residual |          73        70  1.04285714   R-squared       =    0.3221
      -------------+----------------------------------   Adj R-squared   =    0.2931
             Total |  107.685976        73  1.47515035   Root MSE        =    1.0212
      
      --------------------------------------------------------------------------------
         pricedouble | Coefficient  Std. err.      t    P>|t|     [95% conf. interval]
      ---------------+----------------------------------------------------------------
           mpgdouble |  -56.19416   85.07654    -0.66   0.511     -225.874    113.4856
      headroomdouble |  -675.5962   392.3504    -1.72   0.090    -1458.115     106.922
        weightdouble |   2.061945   .6586383     3.13   0.003      .748332    3.375557
          onesdouble |   3158.306   3617.449     0.87   0.386    -4056.468    10373.08
      --------------------------------------------------------------------------------
      So now they are the same, as they should be.
