  • Sign/interpretation of squared coefficient

    Hi all,

    This is probably quite a basic question, but I just wanted to clarify something. In my regression, I consider the effect of overlapping and non-overlapping knowledge between a target and an acquiring firm on the innovative performance (number of patents, a count variable) of the acquiring firm. Since I suspect a non-linear relationship, I have included the quantity of overlapping patents (a count variable) as well as its squared term. The coefficient on the quantity of overlapping patents is positive and significant, and the coefficient on the squared term is negative and significant. Hence, this suggests an inverse-U relationship between the quantity of overlapping patents and innovative performance.

    However, for the non-overlapping quantity (also a count variable), the sign is negative and its squared term is positive (both significant). What exactly does a positive squared term show? A non-linear, U-shaped relationship between the two? Intuitively, I expected an inverse-U relationship.

    Apologies if this question is elementary.
    Thanks
    Chris

  • #2
    However, for the non-overlapping quantity (also a count variable), the sign is negative and its squared term is positive (both significant). What exactly does a positive squared term show? A non-linear, U-shaped relationship between the two?
    Yes, a U-relationship.

    That said, for both the U- and inverse-U- relationships, you should also verify that the vertex (turning point) actually lies within the realm of the data you are working with. The turning point will be at quantity = -linear coefficient /( 2 * quadratic coefficient). If this turning point is outside the range of values of quantity in your data, or very near the edge, then even though the regression is picking up some non-linearity in the relationship, the relationship becomes truly curved only at or beyond the edge of the data, which is not really what people understand by a U- or inverse-U relationship. For example, try this:

    Code:
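    clear*
    * arbitrary seed so the simulated noise is reproducible
    set seed 12345
    set obs 5
    * x takes the values 6 through 10
    gen x = 5 + _n
    * quadratic by construction, with its true turning point at x = -0.5
    gen y = x*x + x + rnormal(0, 0.02)

    regress y c.x##c.x

    * turning point = -linear coefficient / (2 * quadratic coefficient)
    nlcom -_b[x]/(2*_b[c.x#c.x])

    graph twoway line y x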
    The regression does indeed pick up a quadratic term, and it is highly statistically significant. But the turning point is near -0.5, which is far outside the range of the actual values of x, and the graph clearly looks much more linear than it does U-shaped.

    So you always need to check the turning point in these things before drawing conclusions.
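
    A minimal sketch of that check on the same simulated data: it compares the estimated turning point with the observed range of x. The scalar names tp, xmin and xmax and the seed are illustrative choices, not part of the original example.

    ```stata
    clear*
    set seed 12345
    set obs 5
    gen x = 5 + _n
    gen y = x*x + x + rnormal(0, 0.02)
    regress y c.x##c.x

    * point estimate of the turning point
    scalar tp = -_b[x]/(2*_b[c.x#c.x])

    * observed range of x
    summarize x, meanonly
    scalar xmin = r(min)
    scalar xmax = r(max)

    display "turning point = " tp ", data range = [" xmin ", " xmax "]"
    display cond(tp > xmin & tp < xmax, "turning point inside the data range", "turning point outside the data range")

    * delta-method confidence interval for the turning point
    nlcom (tp: -_b[x]/(2*_b[c.x#c.x]))
    ```

    If the confidence interval from nlcom overlaps or sits close to either end of the data range, that is a reason to be cautious about calling the relationship U-shaped.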




    • #3
      Excellent tip, Clyde. Thanks a lot for your reply, I'll be sure to check that.



      • #4
        While I agree with Clyde that one should always check the turning point, I do not agree that, even in general, it must lie within the range of the data for the quadratic to be useful (not quite what Clyde said, but I think it's a fair restatement). When there are ceiling (or floor) effects, a quadratic model can be useful, but in that situation one would expect the turning point to be outside the range of the data.

        An interesting digression: many years ago Bill Gould (STB 15) wrote an article about piecewise linear regression in which he said the following: "Although the quadratic is widely used to approximate nonlinear functions, it is typically done under the assumption that the turning point lies outside the data." <grin>



        • #5
          I think there is no disagreement between me and Rich Goldstein. Let me clarify my statement. I did not intend to imply that a quadratic term is not useful if the turning point lies outside the range of the data. What I wanted to say is that if the turning point is outside the range of the data, the quadratic relationship should not be referred to as "(inverse-) U-shaped." The U- would be an extrapolation.

          In the example I gave, the relationship is clearly a quadratic one by design. But as the data are far from the turning point it is not sensible to call it U-shaped. Rather the quadratic is picking up a degree of non-linearity in the relationship, which is very real. It is, indeed, often the case that we need to model a degree of curvilinearity, and quadratic or higher order terms are often a useful approach. My remark was meant to distinguish between a possibly local quadratic relationship and a globally U-shaped one.



          • #6
            I agree with Clyde Schechter that there is no disagreement between us <grin>



            • #7
              I would like to add something more to the discussion. Having a negative linear term and a positive squared term is only a necessary, not a sufficient, condition for a U-shaped function. It is helpful to read the paper by Lind and Mehlum published in the Oxford Bulletin of Economics and Statistics in 2010. They show that the sufficient condition for a U-shaped relationship is that the first derivative of the relationship is negative at the minimum of the range of the independent variable and positive at the maximum value of the regressor. Of course, the turning point should lie within a credible range, i.e. between the minimum and the maximum of the variable. A negative linear term and a positive quadratic term could even be consistent with a monotonically increasing convex function. That said, have a look at the cited paper. The good news is that the authors also developed a Stata routine that makes the test easy to run.

              Code:
              ssc install utest
              The test is quite flexible and is useful to detect an inverse U shape as well.
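
              A hedged sketch of how the test might be run on the simulated data from #2. It assumes utest is already installed, and it assumes that utest expects the linear and squared terms as two ordinary variables listed after the estimation command, rather than factor-variable notation; the variable name x2 and the seed are illustrative. The exact syntax and options should be confirmed with -help utest-.

              ```stata
              clear*
              set seed 12345
              set obs 5
              gen x = 5 + _n
              gen y = x*x + x + rnormal(0, 0.02)

              * squared term created by hand (assumed to be what utest needs)
              gen x2 = x^2
              regress y x x2

              * test for a U shape over the observed range of x
              utest x x2
              ```

              Because the turning point in this example lies far outside the observed range of x, the test would not be expected to support a U shape here.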

              I hope this helps.



              • #8
                The condition about the derivative of the relationship at the minimum and the maximum of the range, as applied to a quadratic model, is entirely equivalent to the turning point being located within the range of the data. If the derivative is negative at the minimum value of x, and positive at the maximum value of x, then since a quadratic function is everywhere continuously differentiable, by the intermediate value theorem, the derivative will take on the value 0 somewhere between those endpoints, and that point will necessarily be the minimum (and a turning point), and by calculus it will be at the turning point given by the formula x = -linear coefficient/(2*quadratic coefficient). Conversely, if the quadratic term has a positive coefficient, and the turning point lies between the maximum and minimum, then the derivative will necessarily be negative at all values of x less than the turning point, and positive at all values of x greater than the turning point, therefore necessarily negative at the minimum of x and positive at the maximum of x.
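
                A minimal sketch of this equivalence on the simulated data from #2, assuming -margins- is used to evaluate the slope at the two ends of the observed range; the local macro names and the seed are illustrative. It is meant to be run as a do-file so that the local macros persist between lines.

                ```stata
                clear*
                set seed 12345
                set obs 5
                gen x = 5 + _n
                gen y = x*x + x + rnormal(0, 0.02)
                regress y c.x##c.x

                * endpoints of the observed range of x
                summarize x, meanonly
                local xmin = r(min)
                local xmax = r(max)

                * slope (linear coefficient + 2 * quadratic coefficient * x) at each endpoint
                margins, dydx(x) at(x = (`xmin' `xmax'))

                * turning point, for comparison with the range above
                nlcom -_b[x]/(2*_b[c.x#c.x])
                ```

                In this example the data are generated with slope 2x + 1, so both endpoint slopes should come out positive (roughly 13 and 21), which agrees with the turning point lying below the minimum of x.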

                Of course, quadratics are not the only way to model U- or inverse-U-shaped relationships. But if we go beyond the quadratic functions, then the condition about the derivatives does not really guarantee a U-shaped relationship exists. For example, run -graph twoway function y = cos(x), range(1 11)- and you will see that the derivative condition is satisfied at both ends of the range, but the relationship is more like a W than a U.
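
                A small sketch that makes the endpoint check explicit for this example, using the fact that the derivative of cos(x) is -sin(x). The second plotted function is added here only for illustration.

                ```stata
                * slope of cos(x) at the two ends of the range
                display "slope at x = 1:  " -sin(1)
                display "slope at x = 11: " -sin(11)

                * the function and its derivative over the same range
                twoway (function y = cos(x), range(1 11)) (function y = -sin(x), range(1 11))
                ```

                The first slope is negative and the second is positive, yet the derivative changes sign three times inside the range.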

                When you have a negative linear term and a positive quadratic term, this could be even consistent with a monotonically increasing convex function.
                No, that's not possible. The second derivative of a quadratic function is just two times the quadratic coefficient, at all values of x. If the quadratic coefficient is positive, then the second derivative is positive, and the function is therefore nowhere convex. Now, it certainly could be consistent with a monotone increasing function over the particular range of values of x in the data: in fact my example in #2 is like that. But not convex.

                Added: My error: such a function would be everywhere convex. So, yes, that's possible. Sorry for the confusion.
                Last edited by Clyde Schechter; 17 Aug 2018, 18:11.



                • #9
                  Thanks Clyde. Regarding the idea of identifying a U shape by checking the signs of the linear and quadratic terms and adding the requirement that the turning point is within the data range, this is the explanation given in the paper I cited before (Lind and Mehlum (2010), "With or Without U? The Appropriate Test for a U-Shaped Relationship", Oxford Bulletin of Economics and Statistics 72(1)):

                  In most empirical work trying to identify U shapes, the researcher includes a nonlinear (usually quadratic) term in an otherwise standard regression model. If this term is significant and, in addition, the estimated extremum point is within the data range, it is common to conclude that there is a U-shaped relationship. We argue in this paper that this criterion is too weak. The problem arises when the true relationship is convex but monotone over relevant data values. A quadratic specification may then erroneously yield an extreme point and hence a U shape.
                  And later in the paper:

                  Most works use the criterion that if both beta and gamma [the coefficients on the linear and quadratic terms] are significant and if the implied extreme point is within the data range, they have found a U. This is a sensible criterion but it is neither sufficient nor necessary. It is insufficient as the estimated extreme point may be too close, given the uncertainty, to an end point of the data range. It is not generally necessary as beta may be zero if the data range extends to both sides of x=0.
                  Last edited by Dario Maimone Ansaldo Patti; 17 Aug 2018, 19:12.



                  • #10
                    Most works use the criterion that if both beta and gamma [the coefficients on the linear and quadratic terms] are significant and if the implied extreme point is within the data range, they have found a U. This is a sensible criterion but it is neither sufficient nor necessary. It is insufficient as the estimated extreme point may be too close, given the uncertainty, to an end point of the data range. It is not generally necessary as beta may be zero if the data range extends to both sides of x=0.
                    This is not really different from what I said in #2, where I pointed out that if the turning point is outside of the range of the data, or very near the edge, then the relationship should not be considered U-shaped notwithstanding the quadratic term. I will also note here, though I did not say it earlier, that I agree that the significance or lack of significance of the linear term is of no importance whatsoever. (I have made that point in other posts on quadratic regression in this Forum.)

                    What I said in #8 remains true: restricting our attention to quadratic functions, the derivative condition and the condition that the turning point lies inside the range of the data are entirely equivalent. For more general functions, however, they are not. The derivative condition will always imply the existence of a critical point within the range, but the converse is not true in the unrestricted case. And the derivative condition, in the unrestricted case, does not imply that the relationship is U-shaped because more complicated possibilities can arise.

