
  • square of partly negative independent variable

    Hello everyone,

    I have a rather simple question about how to interpret the squared version of an independent variable.
    I'm running a panel regression with fixed effects, trying to measure the effect of a climate beta (cbeta) on market risk over the course of 12 years.
    In addition to cbeta, I generate a squared version of cbeta called cbeta2.
    Now my question is how to interpret cbeta2 correctly, because cbeta holds both negative and positive values, so cbeta2 has only positive values.
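    For reference, a minimal sketch of my setup (the panel identifiers and the outcome name are placeholders):

        * firm = panel id, year = time, mrisk = market risk (placeholder names)
        xtset firm year
        generate cbeta2 = cbeta^2
        xtreg mrisk cbeta cbeta2, fe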

    Thank you in advance!

  • #2
    This seems backwards to me. Why square a predictor (you say independent variable) if you don't know how to interpret the results?

    Sometimes squaring has intrinsic meaning. In elementary physics there are many examples. In these cases, there always seems to be a zero defined by the problem (correct me if I am wrong), such as time since a ball was dropped, or whatever.

    Sometimes squaring a predictor has a quite different rationale. In microeconomics and perhaps other social sciences people often seem to find age and age squared to be useful predictors for capturing curvature in some relationship (which need not imply that a turning point occurs within the range of the data). In that kind of situation, it's the joint effects of both predictors together that are crucial.

    A scatter plot smooth of your outcome versus climate risk might be useful here, as sketched below. On the other hand, it would not be surprising if several other predictors make any pattern hard to see.
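    For example (a minimal sketch; mrisk is a placeholder for your outcome, whose name you haven't told us):

        * exploratory smooth of the outcome against the climate beta
        lowess mrisk cbeta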



    • #3
      Thank you for your answer!

      It's my first time really working with Stata. I squared cbeta because I remembered doing so with age (as you mentioned) and other independent variables in a course a few years back.
      In my regression the squared cbeta has a much lower p-value than the plain cbeta; that's why I want to include it.
      I only have a problem interpreting it correctly, because the original variable has both positive and negative values.
      You don't think that's a good idea?



      • #4
        Your question isn't really about using Stata. The same issue arises with any software that supports regression. If you've been using statistics for some years, that's good.

        The key point is that it's the joint effect of the two -- the original and the square -- that's important. It makes no sense to look at the P-value of either as if the other were just another predictor. There isn't, even notionally, any sense in which you can think of holding one constant while the other is free to vary.

        Stata gives you separate P-values -- but that doesn't imply that they have meaning.
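        In Stata, factor-variable notation keeps the pair together, and a joint test is then direct. A minimal sketch, with mrisk again a placeholder for your outcome:

            xtreg mrisk c.cbeta##c.cbeta, fe
            testparm c.cbeta c.cbeta#c.cbeta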

        I am still hopeful that you will show the graph I asked for.



        • #5
          I'm not really sure if these are the ones you were asking for.
          [Attachment: 2.png (two graphs)]



          • #6
            Both graphs (especially the second) are helpful, but would be clarified by ms(Oh) mcolor(blue%20) scheme(s1color).
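            That is, something along these lines (variable names assumed from your description):

                scatter mrisk cbeta, ms(Oh) mcolor(blue%20) scheme(s1color)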

            If you're using some version of Stata that isn't the latest (18 as I write) please note our longstanding request to tell us what version you're using.

            This might be a problem in which (unusually in my experience) the square alone is a candidate. But note the leverage of moderate outliers.



            • #7
              It is easier to understand if you subtract the mean of x from the value used in the regression; call the result xbar. Then the regression includes xbar and xbar squared. This won't change your predicted values of y or the error terms, and the coefficient on the squared term stays the same; only the linear coefficient and the estimated constant change, and they exactly compensate for the change from x to xbar. Now you can see that the coefficient on xbar is the effect of changes in x on the predicted value of y at the mean of x. If the coefficient on xbar squared is positive, the effect of x increases as x rises above its mean; if negative, the effect of x becomes smaller or more negative as x rises above its mean.
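              A minimal sketch in Stata (variable names are placeholders; summarize leaves the mean in r(mean)):

                  summarize cbeta, meanonly
                  generate cbeta_c = cbeta - r(mean)    // centered climate beta
                  generate cbeta_c2 = cbeta_c^2
                  xtreg mrisk cbeta_c cbeta_c2, fe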

              Adding a squared term is usually not the best way to model a non-linear effect. Have you tried the log or cube root?



              • #8
                Which variable is to be logged here, Daniel?



                • #9
                  y = a*x + b*x^2

                  dy/dx = a + 2*b*x

                  As Nick says, the two coefficients do not stand alone, so you can't just look at one or the other's p-value (though, if the t-stat on b is small, then you might exclude it, as it is a test of non-linearity).

                  Do not concern yourself with negative values becoming positive when squared. The sign is restored in the derivative (b is multiplied by x).
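                  With factor-variable notation, margins will evaluate dy/dx = a + 2*b*x at values of cbeta you choose. A sketch, again assuming the outcome is called mrisk and with the evaluation points purely illustrative:

                      xtreg mrisk c.cbeta##c.cbeta, fe
                      margins, dydx(cbeta) at(cbeta = (-1 0 1))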



                  • #10
                    I was thinking that the log of x might be a better way to allow for a non-linearity. I realize it is nothing like the square, but unless the OP has a reason to prefer squaring (and isn't just looking for a general way to allow for some non-linearity) the log or cube root may be fine, and don't impose an extreme effect on extreme values.
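                    The cube root is defined for negative values, though in Stata a negative base with a fractional exponent returns missing, so it needs the sign/abs device:

                        * signed cube root: preserves the sign of cbeta
                        generate cbeta_cbrt = sign(cbeta) * abs(cbeta)^(1/3)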



                    • #11
                      Daniel Feenberg Thanks for spelling out your thinking. Unfortunately the x variable is often negative, which rules out logarithms beyond some device such as sign(x) log1p(abs(x)) .
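                      In Stata that device could be written as follows, spelling log1p(u) out as ln(1 + u):

                          * signed log transform: sign(x) * log(1 + |x|)
                          generate cbeta_slog = sign(cbeta) * ln(1 + abs(cbeta))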

