
  • Total sum of squares when using reg and noconstant


    In the artificial data set below I estimated depvar = beta0 + beta1*year and depvar = beta1*year. I understand that the formula for beta1 changes, but I would have thought that the total sum of squares is a function only of depvar, not of how beta1 is estimated.

    In the first case, the result makes sense as the sample mean of depvar is 10.5, so total sum of squares = 10*(0.5)^2 = 2.5.
    In the second case, where I use the noconstant option, of course I expect the estimated beta1 to be different, but now the total sum of squares = 1105.

    I see that TSS = 1105 if I pretend the sample mean of depvar = 0, but of course it remains as before, 10.5.


    How does one get TSS = 1105 in the second estimation below? Pretend y-bar = 0.
    Then sum of (y - ybar)^2 = 5*10^2 + 5*11^2 = 500 + 605 = 1105.

    Does anyone know why TSS is computed this way when the noconstant option is used?

    . list

         +-----------------------------------+
         | userid   depvar   year   plantype |
         |-----------------------------------|
      1. |      1       10      0          0 |
      2. |      1       11      1          1 |
      3. |      2       10      0          0 |
      4. |      2       11      1          1 |
      5. |      3       10      0          0 |
         |-----------------------------------|
      6. |      3       11      1          1 |
      7. |      4       10      0          0 |
      8. |      4       11      1          1 |
      9. |      5       10      0          0 |
     10. |      5       11      1          0 |
         +-----------------------------------+

    . reg depvar year

          Source |       SS           df       MS      Number of obs   =        10
    -------------+----------------------------------   F(1, 8)         =         .
           Model |         2.5         1         2.5   Prob > F        =         .
        Residual |           0         8           0   R-squared       =    1.0000
    -------------+----------------------------------   Adj R-squared   =    1.0000
           Total |         2.5         9  .277777778   Root MSE        =         0

    ------------------------------------------------------------------------------
          depvar |      Coef.   Std. Err.      t    P>|t|     [95% Conf. Interval]
    -------------+----------------------------------------------------------------
            year |          1          .        .       .            .           .
           _cons |         10          .        .       .            .           .
    ------------------------------------------------------------------------------

    . reg depvar year, noconstant

          Source |       SS           df       MS      Number of obs   =        10
    -------------+----------------------------------   F(1, 9)         =     10.89
           Model |         605         1         605   Prob > F        =    0.0092
        Residual |         500         9  55.5555556   R-squared       =    0.5475
    -------------+----------------------------------   Adj R-squared   =    0.4972
           Total |        1105        10       110.5   Root MSE        =    7.4536

    ------------------------------------------------------------------------------
          depvar |      Coef.   Std. Err.      t    P>|t|     [95% Conf. Interval]
    -------------+----------------------------------------------------------------
            year |         11   3.333333     3.30   0.009     3.459476    18.54052
    ------------------------------------------------------------------------------
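
For anyone who wants to check the second table by hand, here is a minimal sketch in Python (not Stata) that recomputes the no-intercept slope and the sums of squares from the listed data. The variable names are my own, and I am assuming the usual least squares formula for a regression through the origin, beta1 = sum(x*y) / sum(x*x).

```python
# Data from the listing above: depvar is 10 when year is 0 and 11 when year is 1.
year = [0, 1] * 5
depvar = [10, 11] * 5

# Least squares slope with no intercept: sum(x*y) / sum(x*x).
beta1 = sum(x * y for x, y in zip(year, depvar)) / sum(x * x for x in year)

# Fitted values and the two pieces of the sum-of-squares decomposition.
fitted = [beta1 * x for x in year]
model_ss = sum(f ** 2 for f in fitted)  # not centered on the mean
resid_ss = sum((y - f) ** 2 for y, f in zip(depvar, fitted))

# R-squared computed against the total reported in the noconstant table.
total_ss = model_ss + resid_ss
r_squared = model_ss / total_ss

print(beta1, model_ss, resid_ss, total_ss, round(r_squared, 4))
```

The pieces come out as 605 and 500, so the Total of 1105 in the noconstant table is simply their sum, and the reported R-squared is 605/1105.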



  • #2
    According to the manual (Methods and formulas under -regress- in r.pdf): "The total sum of squares, TSS, equals y'y if there is no intercept and y'y - (1'y)^2/n otherwise." If you are interested in a general treatment of the issue, you might want to look at Gordon, HA (1981), "Errors in computer packages. Least squares regression through the origin," The Statistician, 30(1): 23-29, or Eisenhauer, JG (2003), "Regression through the origin," Teaching Statistics, 25(3): 76-80.
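
The two definitions quoted from the manual can be checked against the data in the first post. This is only a rough Python sketch, not Stata code, and the names `yty` and `one_y` are my own labels for y'y and 1'y.

```python
# depvar from the listing in the first post: five 10s and five 11s.
depvar = [10, 11] * 5
n = len(depvar)

yty = sum(y * y for y in depvar)  # y'y, the raw sum of squares
one_y = sum(depvar)               # 1'y, the plain sum of depvar

# TSS without an intercept: y'y.
tss_noconstant = yty

# TSS with an intercept: y'y - (1'y)^2 / n.
tss_constant = yty - one_y ** 2 / n

# The gap between the two is n * ybar^2.
ybar = one_y / n
gap = n * ybar ** 2

print(tss_noconstant, tss_constant, gap)
```

With these data the gap is 10 * 10.5^2 = 1102.5, which is exactly the difference between the 1105 reported with noconstant and the 2.5 reported with the constant.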



    • #3
      Thanks Rich, and also for the background references. Yes, I should have checked methods and formulas for this surprising result.
