  • Predicting fitted values at means

    Hello everyone, I would like to predict the fitted values after a regression to explore how the fitted line relates to a given variable. Since the model includes other explanatory variables, I want to explore that relationship with the other variables held at their means. I was wondering if there's any way to do what I want with margins or predict. Here is an example of what I would like to do:
    Code:
    sysuse auto, clear
    generate mpgsq = mpg^2
    
    regress price mpg mpgsq weight
    
    scalar b0 = _b[_cons]
    scalar b1 = _b[mpg]
    scalar b2 = _b[mpgsq]
    scalar b3 = _b[weight]
    
    summarize weight, meanonly
    scalar mwgt = r(mean)
    
    generate yhat = b0 + b1*mpg + b2*mpgsq + b3*mwgt
    
    twoway (line yhat mpg, sort)
    I was hoping to use regress price c.mpg##c.mpg weight and then use margins or predict instead of having to calculate them the way I did.

    Thanks for any pointers.
    Alfonso Sanchez-Penalver

  • #2
    Code:
    regress price c.mpg##c.mpg weight
    margins, at(mpg = (12(1)41) (mean) weight)
    marginsplot
    will get you pretty close. The differences between that and what you show with your example are:

    1. The scale of the vertical axis is different, because -marginsplot- by default includes confidence intervals around all those points. That expands the scale of values that needs to be displayed. It also makes the graph pretty messy when plotting this many points. So for both reasons, you probably would prefer to add the -noci- option to -marginsplot-.

    2. The plot created by -marginsplot- is a -twoway connected-, not -line-. If that difference is important to you, you can use the -saving()- option on -margins- and, instead of using -marginsplot-, you can just -use- the file you saved the -margins- output into and then create whatever graph you want from the data.
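
    A minimal sketch of that second route might look like the following. The file name mpg_margins is hypothetical, and the variable names _margin (the fitted value) and _at1 (the value of mpg) are my assumption about how the saved file is laid out, so run -describe- on it to confirm before plotting.

    ```stata
    sysuse auto, clear
    regress price c.mpg##c.mpg weight

    * Save the margins results to a dataset instead of plotting them directly
    * (mpg_margins is a hypothetical file name, written to the working directory)
    margins, at(mpg = (12(1)41) (mean) weight) saving(mpg_margins, replace)

    * Load the saved results and check the variable names
    use mpg_margins, clear
    describe

    * Assumed names: _margin = fitted value, _at1 = value of mpg
    twoway line _margin _at1, sort xtitle("mpg") ytitle("Fitted price")
    ```

    Once the results are an ordinary dataset, any -twoway- plot type or option can be applied to them.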


    • #3
      Originally posted by Clyde Schechter
      2. The plot created by -marginsplot- is a -twoway connected-, not -line-. If that difference is important to you, you can use the -saving()- option on -margins- and, instead of using -marginsplot-, you can just -use- the file you saved the -margins- output into and then create whatever graph you want from the data.
      Note that marginsplot has a recast() option for this, so recast(line) should do the trick.

      Best
      Daniel
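
      For reference, a short sketch of the whole sequence with that option, combined with the -noci- option suggested in #2:

      ```stata
      sysuse auto, clear
      regress price c.mpg##c.mpg weight
      margins, at(mpg = (12(1)41) (mean) weight)

      * recast(line) draws a plain line instead of connected markers;
      * noci suppresses the confidence intervals
      marginsplot, recast(line) noci
      ```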


      • #4
        Thanks for pointing that out, Daniel. I had forgotten about that.


        • #5
          So margins calculates fitted values, unless you specify another option either directly, through dydx() for example, or with predict? I see that if you don't specify any values, it automatically assumes that you want the means of the variables. If you do specify the values, then it will calculate a fitted value for each of the values you specify and the means of the rest of the variables. Is this right?

          Thanks guys!
          Alfonso Sanchez-Penalver


          • #6
            So margins calculates fitted values, unless you specify another option either directly, through dydx() for example, or with predict?
            Depends what you mean by "fitted values." In a logistic model, for example, the default for -margins- is to calculate predicted probabilities, whereas some people might use the term "fitted values" to refer to xb (which is available, but is not the default). In general, when using -margins- after an estimation command, you should check that command's postestimation help (e.g. -help xtologit postestimation-) and then click on the -margins- link to find out what the default prediction is, and what alternatives you can specify.
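
            As a small illustration, assuming a logit model fit to the auto data (my choice of example, not the model from #1), the default and the xb prediction can be compared like this:

            ```stata
            sysuse auto, clear
            logit foreign mpg weight

            * Default after logit: average predicted probability Pr(foreign = 1)
            margins
            matrix m_pr = r(b)

            * Request the linear prediction (xb) instead of the default
            margins, predict(xb)
            matrix m_xb = r(b)

            display "average Pr(foreign): " m_pr[1,1]
            display "average xb:          " m_xb[1,1]
            ```

            The two numbers are on different scales: the first is a probability, the second is on the log-odds scale.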

            I see that if you don't specify any values, it automatically assumes that you want the means of the variables.
            No. For variables whose values are not specified in the -margins- command, it leaves them at their existing values in the data, applies -predict- (or -predictnl-, or something equivalent to these), and then calculates the mean of those results. So it is the mean of the predicted values, which is, in general, different from the prediction at mean values.

            If you do specify the values, then it will calculate a fitted value for each of the values you specify...
            Essentially yes, the fitted value or whatever it has been asked to estimate.

            ...and the means of the rest of the variables.
            No. This is wrong in the same way that the second quote above is wrong. It calculates the mean of the predicted values holding those variables at their existing values in the data. The mean of the predictions is different, in general, from the prediction at the means.
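
            A sketch of the distinction, again assuming a logit model on the auto data purely for illustration. The -atmeans- option is what requests the prediction at the means; the variable name phat is arbitrary.

            ```stata
            sysuse auto, clear
            logit foreign mpg weight

            * Mean of the predictions: covariates stay at their observed values
            margins
            matrix m_avg = r(b)

            * Prediction at the means: every covariate is set to its sample mean
            margins, atmeans
            matrix m_atm = r(b)

            * Reproduce the first number by hand: predict, then average
            predict double phat, pr
            summarize phat, meanonly
            scalar byhand = r(mean)

            display "mean of predictions (by hand): " byhand
            display "margins:                       " m_avg[1,1]
            display "margins, atmeans:              " m_atm[1,1]
            ```

            The first two numbers should agree with each other, while the third should not agree with them.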


            • #7
              Thanks Clyde, I see the difference between the means of the predicted values and the predicted values at the means. Since I was using it after regress, and in the case of a linear model they are the same, I was getting confused. Thank you for pointing that out.
              Alfonso Sanchez-Penalver
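
              A quick check of this for the model in this thread, as a sketch: with mpg fixed in at(), the only variable being averaged over is weight, which enters the model linearly, so leaving it at its observed values or setting it to its mean should give the same fitted values up to rounding. The matrix names are arbitrary.

              ```stata
              sysuse auto, clear
              regress price c.mpg##c.mpg weight

              * weight left at its observed values: predictions are averaged
              margins, at(mpg = (12(1)41))
              matrix m_avg = r(b)

              * weight fixed at its sample mean
              margins, at(mpg = (12(1)41) (mean) weight)
              matrix m_mean = r(b)

              * Largest relative difference across the 30 fitted values
              display mreldif(m_avg, m_mean)
              ```

              The equality relies on weight entering linearly. If mpg were not fixed in at(), the squared term would make the two approaches differ even after regress.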
