  • Major change in coefficient when adding a squared variable in fixed effects model. Interpretation?

    When I use the xtreg command with fixed effects to look at the relationship between marijuana use and my dependent variable, income, I get a coefficient of 119. However, when I add the squared term for marijuana use (coefficient -72), the coefficient on the non-squared marijuana variable jumps from 119 to 586.

    How should I interpret this?

    (I use the NLSY79 data set)

  • #2
    Yes, and the constant term may have changed, too. Perhaps other coefficients in your model did, as well. Nothing surprising about that.

    Your quadratic model says that there is an inverse-U shaped relationship between marijuana usage and income: income = constant term + 586*marijuana - 72*marijuana^2. If you were to graph that, you would find that it is an upside-down U-shape (strictly speaking, a parabola) with its peak at marijuana = -586/(2*(-72)) = 4.07 (to two decimal places).
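
    As a quick check of that arithmetic, here is a minimal sketch using only the two coefficients reported in #1. The scalar names b1 and b2 are made up for illustration. It also shows why 586 should not be compared to 119: in the quadratic model the slope is b1 + 2*b2*marijuana, so 586 is the slope only at marijuana = 0, not an overall slope.

    ```stata
    // Coefficients reported in #1 (b1, b2 are illustrative names)
    scalar b1 = 586     // coefficient on marijuana
    scalar b2 = -72     // coefficient on marijuana squared

    // The parabola b1*x + b2*x^2 peaks where its slope b1 + 2*b2*x is zero
    display "peak at marijuana = " %4.2f -scalar(b1)/(2*scalar(b2))

    // Slope of the parabola at a few usage levels: it shrinks, then turns negative
    foreach x of numlist 0 2 4 6 {
        display "slope at marijuana = `x': " scalar(b1) + 2*scalar(b2)*`x'
    }
    ```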

    What I recommend you do, to get a really good sense of what is going on here, is to re-run both regressions. Following each one, use -predict- to create a variable with the predicted values. Then graph both sets of predicted values against the marijuana variable. You will see that the predicted values from the quadratic model fall on a parabola as I've described, and that the predicted values from the model without the quadratic term fall on a straight line, which is a chord (or, rather, a secant) running through the parabola that is struggling to find the best fit between the ups and downs of the parabola.



    • #3
      I am not familiar with the plotting capabilities in Stata. If you could post the code for plotting the graph you suggested, it would be a great opportunity for learning.



      • #4
        Code:
        //    CREATE A TOY DATA SET TO DEMONSTRATE THE CODE
        clear
        set obs 100
        set seed 1234
        gen marijuana = runiformint(0, 10)
        gen income = 1000 + 586 * marijuana - 72 * marijuana^2 + rnormal(0, 250)
        
        
        regress income marijuana // AND PERHAPS OTHER VARIABLES
        predict linear_prediction
        
        // ADD A QUADRATIC TERM
        regress income c.marijuana##c.marijuana
        predict quadratic_prediction
        
        // DO A PLOT
        graph twoway line *_prediction marijuana, sort || scatter income marijuana
        The code above would be sufficient for the purpose at hand. Since my data set here is just made up to demonstrate the principle, it does not exactly reproduce the results reported in #1, but it is somewhat similar. The graph illustrates the point I'm trying to make: the linear fit is a secant through the quadratic-fit parabola that tries to accommodate the ups and downs of the data as well as a straight line can (which is not very well), so its slope will not resemble the linear coefficient in the quadratic model.
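
        To put numbers on the same point, the following is a sketch (not part of the code above) that rebuilds the same toy data and compares the two slopes directly. It uses -margins- to show the slope of the fitted parabola at several usage levels and -nlcom- to estimate the turning point; the usage levels in at() are arbitrary choices for illustration.

        ```stata
        // Rebuild the toy data so this block runs on its own
        clear
        set obs 100
        set seed 1234
        gen marijuana = runiformint(0, 10)
        gen income = 1000 + 586 * marijuana - 72 * marijuana^2 + rnormal(0, 250)

        // Linear model: a single average slope over the whole range of marijuana
        regress income marijuana
        display "linear-model slope: " _b[marijuana]

        // Quadratic model: _b[marijuana] is the slope at marijuana = 0 only
        regress income c.marijuana##c.marijuana
        display "slope at marijuana = 0: " _b[marijuana]

        // Slope of the fitted parabola at selected usage levels
        margins, dydx(marijuana) at(marijuana = (0 2 4 6 8 10))

        // Estimated turning point, -b1/(2*b2), with a delta-method standard error
        nlcom -_b[marijuana] / (2 * _b[c.marijuana#c.marijuana])
        ```

        With the panel data in #1, the same steps should carry over by replacing -regress- with -xtreg ..., fe- after -xtset-, provided the squared term is entered as c.marijuana##c.marijuana rather than as a separately generated variable, so that -margins- knows the two terms belong together.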
        Last edited by Clyde Schechter; 02 Mar 2019, 19:04.
