  • Negative predicted margins with non negative DV

    Hello everyone,
    I have looked around but could not find an answer to the following question. Any help is appreciated.
    I am trying to predict the log of the amount fundraised (lnraised), which cannot be negative in my data (it ranges from 0 to 22). My code is as follows:

    reg lnraised $controls c.months##c.rating, robust
    qui margins, at(c.months=(48(12)84) rating =( 2 4)) atmeans
    marginsplot

    As you can see below, the graph I obtain predicts negative values, up to -5, of the DV at higher values of "months".

    Am I doing something wrong? How can I obtain non-negative predicted values?

    Thank you in advance,

    Cristiano





  • #2
    This can happen with linear models. In fact, with a linear model, there is always some set of values of the predictor variables that will give a negative prediction for the outcome. With a good model of data where the outcome variable is always non-negative, that value will be out of the range of the observed data, or perhaps just barely within that range.

    The implication is that a linear model may not be a great fit to your data. Now, -5 is not all that different from zero, and there is no reason mathematically why the value of ln(something) cannot be negative. And if 84 is at the extreme end of the range of values of months in your data, or is outside that range, then maybe the model is actually making a prediction that, if followed out as far as 84 months, on average the value of lnraised could be negative. If the model otherwise seems to fit your data well, then that would be an extrapolated prediction of the model.

    The other possibility is that the model really isn't a good one for the data, that the relationship is really non-linear and a linear regression doesn't represent it well. Remember, the linear regression line is the best fitting line for your data--but the best fit may not be a very good fit. The stocking that best fits your hand would not make a good glove. So you might want to look into a non-linear model that tapers off rather than continuing relentlessly downward with increasing months.
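
    A minimal sketch of how one might check both possibilities, assuming the variable names and the $controls macro from the original post. The name xb_hat is made up for illustration, and the lowess plot is just one of several reasonable ways to look for curvature:

    ```stata
    * Compare the margins grid (months = 48 to 84) with the observed range
    summarize months rating lnraised, detail

    * Refit the model from post #1 and look at the in-sample linear predictions
    regress lnraised $controls c.months##c.rating, vce(robust)
    predict double xb_hat if e(sample), xb    // xb_hat is a hypothetical name
    summarize xb_hat, detail
    count if xb_hat < 0 & e(sample)           // in-sample observations with a negative prediction

    * Rough graphical check of whether the relationship tapers off
    lowess lnraised months
    ```

    If few or no in-sample predictions are negative and 84 lies at or beyond the largest observed value of months, the negative margins are probably extrapolation rather than a sign of a badly specified model.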



    • #3
      Thank you Clyde, this clarifies a lot. I appreciate the response.

      When you say non-linear, do you mean including curvilinear independent variables or using a different regression model? If the latter, may I ask what type of model you would suggest?

      Thanks again,

      Cristiano



      • #4
        Both are possible. Curvilinear independent variables can, if you are aggressive enough about them, be fit to almost anything--the drawback being that such models are usually unrelated to any actual real world data generating process because very few real world phenomena are truly polynomial. Linear splines or cubic splines work well. Linear splines are also relatively easy to interpret and understand, cubic splines not so much. And both have the drawback that they do not play well with -margins- and require a really laborious workaround. In terms of a different regression model, if you give up the log transformation on the dependent variable and go back to the original variable, a Poisson model might do the trick, or a gamma-model with log link, or even with a power link. Doing some graphical exploration of the data might give you some ideas about what seems potentially fruitful.
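
        A rough sketch of the two alternative models, not a tested recipe. It assumes the untransformed outcome is stored in a variable called raised (a hypothetical name; the original post only shows lnraised) and reuses $controls and the margins grid from post #1:

        ```stata
        * Poisson regression on the untransformed outcome.
        * Robust standard errors, since the outcome is an amount, not a true count.
        poisson raised $controls c.months##c.rating, vce(robust)
        margins, at(months=(48(12)84) rating=(2 4)) atmeans
        marginsplot

        * Gamma GLM with a log link; the gamma family is usually meant for
        * strictly positive outcomes, so check how zeros in raised should be handled.
        glm raised $controls c.months##c.rating, family(gamma) link(log) vce(robust)
        margins, at(months=(48(12)84) rating=(2 4)) atmeans
        marginsplot
        ```

        With a log link, -margins- reports predictions on the scale of raised itself, computed as the exponential of the linear predictor, so they cannot fall below zero.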



        • #5
          Fantastic, thank you for the thorough explanation Clyde, much appreciated. I will look into all of them. Thank you.



          • #6
            Hi Clyde Schechter
            I should mention that I did come up with a workaround for margins that "plays well" with splines (or other transformations).
            Check out the command "f_able" from SSC.
            I also wrote (but have not yet submitted for upload) a program to create splines, which should make it far easier to work with polynomial splines and restricted cubic splines.
            Best regards
            Fernando



            • #7
              FernandoRios Thanks for letting me know about f_able. Looks very interesting and useful.
