
  • Question about U-shaped relationship

    Hello everyone,

    1. To identify a U-shaped relationship, if I add a variable and its square to the model and both are significant, is that enough to establish a U-shape?
    2. To check whether it is an inverted or a direct U-shape, is using -utest- enough?
    3. If I am considering a model with an interaction, can I still interpret this kind of relationship?
    for example:
    if the model is:
    Code:
    Y = b0 + b1*X1 + b2*X1^2 + b3*X2 + b4*X1*X2 + error term
    and b1 and b2 are significant, does that mean we have a U-shaped relationship, or do I have to check it in a model without the interaction term?

    4. If there is a U-shaped relationship between X1 and Y and I want to check the effect of the interaction of X1 and X2, should I use X1^2*X2, or is X1*X2 enough?

    Thank you in advance for your help and comments,
    Kind regards,
    Michael
    Last edited by Michael Lee; 23 Jul 2022, 05:20.

  • #2
    A U-shape and a quadratic in a predictor, y = b_0 + b_1 x + b_2 x^2, are not one and the same. A counter-example is given by cosh(), which is U-shaped but not quadratic. In fact, I would try not to call a parabola U-shaped myself.

    A quadratic can qualify as a good fit but that does not mean that a turning point occurs in the range of the data.

    I think in general formal test results are never enough. You need to plot fitted relationships and the data to see what shape you have.
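
    For instance, a minimal sketch of that check (y and x here are placeholder variable names):
    Code:
    * overlay the raw data and the fitted quadratic
    regress y c.x##c.x
    predict yhat, xb
    twoway (scatter y x) (line yhat x, sort)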



    • #3
      "if the variable and its squared added to model and both of them were significant"

      The main effect does not need to be significant. In general, once you add things like squared terms and interactions, the meaning and interpretation of the main effects change. For more, see

      https://www3.nd.edu/~rwilliam/stats2/l53.pdf
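
      As a minimal sketch of the point (y and x are placeholder names; the at() values are arbitrary): after a quadratic fit, _b[x] is only the slope at x = 0, and -margins- recovers slopes elsewhere.
      Code:
      regress y c.x##c.x
      margins, dydx(x)                     // average marginal effect of x
      margins, dydx(x) at(x = (10 30 50))  // slope at chosen values of x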
      -------------------------------------------
      Richard Williams, Notre Dame Dept of Sociology
      StataNow Version: 19.5 MP (2 processor)

      EMAIL: [email protected]
      WWW: https://www3.nd.edu/~rwilliam



      • #4
        As Nick Cox pointed out above, "A quadratic can qualify as a good fit but that does not mean that a turning point occurs in the range of the data." Take that to heart. You need to check the location of the turning point of the quadratic: if it is outside the range of the data, or close to the extreme ends of the data, then you have established some degree of curvilinearity, but not a U-shape. You can do this by using -nlcom- to calculate the turning point x = -b/(2a), where b is the linear coefficient and a is the quadratic coefficient.

        Also note that there are situations where even with this test, you will be misled into thinking there is a U-shaped relationship when there isn't. Run the following code:
        Code:
        clear*
        set obs 100
        gen x = _n
        gen y = log(x)
        
        regress y c.x##c.x
        
        nlcom -_b[x]/(2*_b[c.x#c.x])
        
        predict fitted, xb
        graph twoway line y fitted x, sort
        and you will see that the quadratic term is "highly significant," the turning point of the fitted model is comfortably within the range of x, but the log function is clearly not U-shaped, and the graph confirms that the model has a rather different shape from the data, despite a very high R-squared.



        • #5
          Thank you everyone for your helpful answers.

          Here I ran the regression and plotted y and the predicted y against x and against x^2:


          y vs. x^2: [screenshot]

          y vs. x: [screenshot]


          From these plots, it seems we do not have a U-shaped relationship.

          Thank you everyone again.
          Best regards,
          Michael
          Last edited by Michael Lee; 24 Jul 2022, 03:27.



          • #6
            That is not the best way to check. Here is one better way; there are others. The example is not a good analysis in itself, but it shows how to extract the part of the prediction that is a function of a predictor and its square, in the presence of other predictors.

            Code:
            . sysuse auto, clear
            (1978 automobile data)
            
            . gen weightsq = weight^2
            
            . regress mpg weight weightsq length
            
                  Source |       SS           df       MS      Number of obs   =        74
            -------------+----------------------------------   F(3, 70)        =     48.87
                   Model |  1653.84284         3  551.280945   Prob > F        =    0.0000
                Residual |  789.616624        70  11.2802375   R-squared       =    0.6768
            -------------+----------------------------------   Adj R-squared   =    0.6630
                   Total |  2443.45946        73  33.4720474   Root MSE        =    3.3586
            
            ------------------------------------------------------------------------------
                     mpg | Coefficient  Std. err.      t    P>|t|     [95% conf. interval]
            -------------+----------------------------------------------------------------
                  weight |  -.0117278   .0045789    -2.56   0.013    -.0208601   -.0025955
                weightsq |   1.18e-06   6.43e-07     1.83   0.072    -1.06e-07    2.46e-06
                  length |  -.0560631   .0559624    -1.00   0.320    -.1676765    .0555504
                   _cons |   55.81875   7.394533     7.55   0.000     41.07082    70.56668
            ------------------------------------------------------------------------------
            
            . gen partial = _b[weight] * weight + _b[weightsq] * weightsq
            
            . line partial weight , sort
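
            A natural follow-up, echoing #4, is to locate the turning point of this partial quadratic and compare it with the observed range of weight; a minimal sketch:
            Code:
            nlcom -_b[weight]/(2*_b[weightsq])
            summarize weight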



            • #7
              Thank you Nick,
              Here is the result:
              [screenshot]

              Its peak is at ~122, and max(x) = 144.
              So, can I say something about an inverted U-shaped relationship?



              • #8
                Hello everyone,
                If we have an inverted U-shaped relationship between y and x1, and we want to check the effect of the interaction of x1 and x2 on y, what should I do? Should I consider the x1*x2 coefficient, or x2*x1^2?

                Thank you in advance,
                Best regards,
                Michael



                • #9
                  Should I consider the x1*x2 coefficient, or x2*x1^2?
                  You must consider both jointly.
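
                  For instance, assuming continuous x1 and x2 (placeholder names), a sketch of fitting both interaction terms so they can be tested jointly:
                  Code:
                  regress y c.x1##c.x1##c.x2
                  test c.x1#c.x2 c.x1#c.x1#c.x2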



                  • #10
                    Originally posted by Clyde Schechter View Post
                    You must consider both jointly.
                    Thank you.
                    If the coefficients of x1 and x1^2 are not significant but the interactions x1*x2 and x1^2*x2 are significant, how can I interpret this?
                    P.S.: In the model without the interaction terms (x1*x2 and x1^2*x2), the coefficients of x1 and x1^2 were significant!

                    Thank you again!
                    Best regards,
                    Mike



                    • #11
                      Even for people who take statistical significance seriously, the difference between statistically significant and not statistically significant is, itself, not statistically significant.

                      More important, the coefficients of x1 and x1^2 in a model without interaction with x2 do not mean the same thing as their coefficients in a model where they are interacted with x2, so talking about whether they have changed in some way between the models is a meaningless waste of effort.

                      Working within the statistical significance framework, if your question is whether the interaction between x1 and x2 is statistically significant (taking into account that x1 enters the model as both a linear and quadratic term) that is approached by jointly testing the x2#x1 and x2#x1#x1 terms for significance. In code:
                      Code:
                      test c.x2#c.x1 c.x2#c.x1#c.x1
                      (I'm assuming here that x2 is a continuous variable. If it's discrete the code is different and slightly more complicated.)
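
                      A sketch of that discrete-x2 variant might look like this (x1, x2, y are again placeholder names):
                      Code:
                      regress y i.x2##(c.x1##c.x1)
                      testparm i.x2#c.x1 i.x2#c.x1#c.x1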

                      In these models where there are interactions and quadratics, it is difficult to maintain intuition about the meanings of particular coefficients and, frankly, it is best not to even try unless your hobby is algebra. And it is never wise to compare the coefficient of "the same term" across models, because what looks like "the same term" is usually something altogether different. I really recommend graphing the model predictions or marginal effects as the best way to understand these models. The -margins- and -marginsplot- commands do that for you. Looking at those will give you more understanding in an instant than you will get from puzzling over coefficients for a week.
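
                      For instance, a minimal sketch of that graphing approach (the at() grids are illustrative; adapt them to the ranges in your data):
                      Code:
                      regress y c.x1##c.x1##c.x2
                      margins, at(x1 = (0(10)100) x2 = (10 20 30))
                      marginsplot, xdimension(x1)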

