
  • Statistically significant squared term, but insignificant level term

    I am not sure if this is the appropriate place to ask this question. If not, please let me know and I will delete it.

    I have estimated, for example, the following model to capture increasing/decreasing marginal effect of x on y:

    \[ y=\alpha + \beta_1 x + \beta_2 x^2 + e \]
    where $\beta_1$ is statistically insignificant, but $\beta_2$ is statistically significant.

    My questions are as follows:

    1. What is the implication of this pattern of significance and insignificance for interpreting the coefficients?
    2. Can I still meaningfully calculate the turning point of $x$ using the formula $x^* = -\frac{\beta_1}{2\beta_2}$?

    Thanks.
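A minimal Python sketch of the turning-point arithmetic, on made-up noiseless data (illustrative only; the numbers are hypothetical, not from any real dataset):

```python
import numpy as np

# Hypothetical noiseless data with a known turning point at x = 3:
# y = 1 + 6x - x^2 has dy/dx = 6 - 2x, which is zero at x* = 3.
x = np.linspace(0.0, 6.0, 50)
y = 1.0 + 6.0 * x - x**2

# Fit y = a + b1*x + b2*x^2 (np.polyfit returns highest degree first).
b2, b1, a = np.polyfit(x, y, 2)

# Turning point of a quadratic: x* = -b1 / (2*b2).
x_star = -b1 / (2.0 * b2)
print(round(x_star, 6))
```

With real data the fitted coefficients, and hence x*, carry sampling uncertainty, which is what the rest of the thread is about.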
    Last edited by Taz Raihan; 21 Apr 2018, 03:01.

  • #2
    In general, you don't worry too much about the significance of lower-order terms once higher-order terms are in the model. Just interpret things the way you usually would.
    -------------------------------------------
    Richard Williams, Notre Dame Dept of Sociology
    Stata Version: 17.0 MP (2 processor)

    EMAIL: [email protected]
    WWW: https://www3.nd.edu/~rwilliam



    • #3
      Have a look at this thread (response #2); Clyde explained it very well (as always). Nick's (#7) and Maarten's (#8) responses may also be of interest.
      Last edited by Roman Mostazir; 21 Apr 2018, 09:28.
      Roman



      • #4
        Thank you very much, Richard Williams and Roman Mostazir. What I gather from Clyde Schechter's post is that, when one of the two terms (x or x-squared) is insignificant, I should do a joint hypothesis test. If that test is significant, then I may proceed to calculate the turning point, x*. Otherwise, I may not?
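The joint test mentioned above can be sketched by hand. A Python illustration on simulated data (this mimics, but is not, Stata's -test-; the data-generating numbers are made up and assume a genuine quadratic effect):

```python
import numpy as np

# Simulated data with a genuine quadratic effect (all numbers made up).
rng = np.random.default_rng(4)
n = 200
x = rng.uniform(0, 6, n)
y = 1 + 6*x - x**2 + rng.normal(0, 1, n)

# Full model: y = a + b1*x + b2*x^2 + e
X = np.column_stack([np.ones(n), x, x**2])
beta, rss, *_ = np.linalg.lstsq(X, y, rcond=None)
rss_full = rss[0]

# Restricted model (b1 = b2 = 0): intercept only.
rss_restr = np.sum((y - y.mean())**2)

# F statistic for the joint hypothesis b1 = b2 = 0, with 2 and n-3 df.
F = ((rss_restr - rss_full) / 2) / (rss_full / (n - 3))
print(round(F, 1))  # a large F means the two terms are jointly significant
```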



        • #5
          On a side-issue, but you raised it: You can't delete a thread you started. This is explicit in the FAQ Advice all posters are asked to read before posting. Do please read https://www.statalist.org/forums/help#closure and the rest of that document.

          I don't see a real problem in an occasional statistical question without Stata content. I would see a real problem if almost all questions here were statistical without Stata content, but we'd have done something about that long before it happened. As Paracelsus supposedly said, the dose makes the poison, i.e. we just need to think quantitatively.

          On your question: I would not rely on a significance test alone. I can readily imagine situations in which I know from one or more of (a) looking at the data, (b) experience with similar datasets, or (c) theory or subject-matter knowledge that curvature is needed, or at least helpful, which is a motive for fitting a quadratic rather than a single linear term. If I thought that, lack of significance would not necessarily be fatal to fitting a quadratic. Further, if other people working in the same territory were using a quadratic in their projects, doing the same is often a good idea.

          The lack of independence of a variable and its square means that they are a double act. You can't ignore one if you fit both.
          Last edited by Nick Cox; 22 Apr 2018, 04:56.
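The "double act" point can be illustrated numerically. A hedged Python sketch on made-up data: when x takes only positive values, x and x^2 are nearly collinear, which is why their coefficients have to be read together; centering weakens that linear association.

```python
import numpy as np

# Illustrative only: a simulated 1-7 scale (numbers are hypothetical).
rng = np.random.default_rng(0)
x = rng.uniform(1.0, 7.0, 1000)

# x and its square are strongly correlated when x is positive.
r_raw = np.corrcoef(x, x**2)[0, 1]

# Centering (subtracting the mean) weakens that linear association.
xc = x - x.mean()
r_centered = np.corrcoef(xc, xc**2)[0, 1]

print(round(r_raw, 3), round(r_centered, 3))
```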



          • #6
            Hi, Taz: A formal test of a U or an inverted-U relationship between y and x can be performed with the -utest- command (ssc install utest). You might want to go through the paper: J. T. Lind and H. Mehlum, "With or without U? The appropriate test for a U-shaped relationship," Oxford Bulletin of Economics and Statistics 72(1): 109-118 (2010).

            Ho-Chuan (River) Huang
            Stata 17.0, MP(4)



            • #7
              Nick Cox Thank you for clarifying the side-issue. Would you care to elaborate on your following statement?
              "The lack of independence of a variable and its square means that they are a double act. You can't ignore one if you fit both."
              Does this mean that x being statistically insignificant prevents us from calculating the turning point, x*, even though squared-x is statistically significant?

              River Huang Thank you, sir. I am now reading the paper you suggested.



              • #8
                Naturally not. If you have coefficients for linear and quadratic terms you can calculate the turning point, but you should always consider how uncertain it is, as discussed in the cited thread.
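The uncertainty in a turning point can be quantified with a delta-method standard error. A Python sketch on simulated data (an illustration, not Stata's -nlcom-; all numbers are made up):

```python
import numpy as np

# Simulate y = a + b1*x + b2*x^2 + e with a true turning point at x = 3.
rng = np.random.default_rng(1)
n = 500
x = rng.uniform(0, 6, n)
y = 1 + 6*x - x**2 + rng.normal(0, 1, n)

# OLS by hand: coefficients and their covariance matrix.
X = np.column_stack([np.ones(n), x, x**2])
beta, *_ = np.linalg.lstsq(X, y, rcond=None)
resid = y - X @ beta
s2 = resid @ resid / (n - 3)
V = s2 * np.linalg.inv(X.T @ X)

a, b1, b2 = beta
x_star = -b1 / (2*b2)

# Delta method: gradient of x* with respect to (a, b1, b2).
g = np.array([0.0, -1/(2*b2), b1/(2*b2**2)])
se = np.sqrt(g @ V @ g)
print(round(x_star, 2), round(se, 3))
```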



                • #9
                  Thank you, Dr. Cox.



                  • #10
                    As I said before, once you have higher-order terms in the model you generally don't worry about the statistical significance of the lower-order terms. You shouldn't drop x if x^2 is in the model.

                    To think of it another way — if x and x^2 were in the model, you probably wouldn’t think there is a problem if the coefficient for x were negative, would you? Nor would you probably be worried if it were positive. So why should it worry you that the coefficient is zero or thereabouts?



                    • #11
                      Richard Williams, thank you for your response. I wasn't actually concerned with dropping or keeping the lower-order term, x. I was just wondering if I could compute x* = -beta1/(2*beta2) when one of these coefficients is statistically insignificant.



                      • #12
                        In your equation, zero is a legitimate value for B1. If B1 equals zero, it will be statistically insignificant. So again, there is no reason B1 can't be zero.



                        • #13
                          Thank you for your valuable input, Dr. Williams.



                          • #14
                            One more argument for not worrying about the significance of B1 when x and x^2 are in the model: you can add an arbitrary constant to x, or subtract one from it. For example, instead of having a variable run from 1 to 7, you can subtract 4 and have it run from -3 to 3. Or, sometimes people subtract the mean from each case to ease interpretation; e.g. after centering, a score of 0 on x means the case has an average score.

                            When you add or subtract a constant, B1, and the significance of B1, will change. But B2 stays the same, the fit stays exactly the same, etc. If I had played around with it a bit, I could have found some value to subtract from x that would have made B1 insignificant. But it wouldn't have mattered; it still would have been the same model.

                            In this example, centering causes B1 to actually change signs. But it is still the exact same model.

                            Code:
                            . webuse nhanes2f, clear
                            
                            . sum weight
                            
                                Variable |        Obs        Mean    Std. Dev.       Min        Max
                            -------------+---------------------------------------------------------
                                  weight |     10,337    71.90088    15.35515      30.84     175.88
                            
                            . gen xweight = weight - r(mean)
                            
                            . reg health c.weight##c.weight
                            
                                  Source |       SS           df       MS      Number of obs   =    10,335
                            -------------+----------------------------------   F(2, 10332)     =     15.20
                                   Model |   44.117835         2  22.0589175   Prob > F        =    0.0000
                                Residual |  14990.9035    10,332  1.45091982   R-squared       =    0.0029
                            -------------+----------------------------------   Adj R-squared   =    0.0027
                                   Total |  15035.0214    10,334   1.4549082   Root MSE        =    1.2045
                            
                            -----------------------------------------------------------------------------------
                                       health |      Coef.   Std. Err.      t    P>|t|     [95% Conf. Interval]
                            ------------------+----------------------------------------------------------------
                                       weight |   .0139346   .0049689     2.80   0.005     .0041947    .0236746
                                              |
                            c.weight#c.weight |   -.000111   .0000316    -3.51   0.000     -.000173    -.000049
                                              |
                                        _cons |   3.012006   .1905019    15.81   0.000     2.638586    3.385427
                            -----------------------------------------------------------------------------------
                            
                            . reg health c.xweight##c.xweight
                            
                                  Source |       SS           df       MS      Number of obs   =    10,335
                            -------------+----------------------------------   F(2, 10332)     =     15.20
                                   Model |  44.1178342         2  22.0589171   Prob > F        =    0.0000
                                Residual |  14990.9035    10,332  1.45091982   R-squared       =    0.0029
                            -------------+----------------------------------   Adj R-squared   =    0.0027
                                   Total |  15035.0214    10,334   1.4549082   Root MSE        =    1.2045
                            
                            -------------------------------------------------------------------------------------
                                         health |      Coef.   Std. Err.      t    P>|t|     [95% Conf. Interval]
                            --------------------+----------------------------------------------------------------
                                        xweight |  -.0020291   .0008504    -2.39   0.017    -.0036961   -.0003621
                                                |
                            c.xweight#c.xweight |   -.000111   .0000316    -3.51   0.000     -.000173    -.000049
                                                |
                                          _cons |   3.440015    .014002   245.68   0.000     3.412568    3.467462
                            -------------------------------------------------------------------------------------
                            For more on centering and how adding or subtracting constants to x may ease interpretation without changing the meaning of the model, see

                            https://www3.nd.edu/~rwilliam/stats2/l53.pdf

                            https://www3.nd.edu/~rwilliam/stats2/l55.pdf
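The same invariance can be checked outside Stata. A minimal Python sketch with simulated, weight-like data (purely illustrative; the coefficients are made up): shifting x changes the linear coefficient but leaves the quadratic coefficient and the fit untouched.

```python
import numpy as np

# Simulated data on a weight-like range (all numbers hypothetical).
rng = np.random.default_rng(2)
x = rng.uniform(30, 176, 1000)
y = 3 + 0.014*x - 0.00011*x**2 + rng.normal(0, 1.2, 1000)

def fit(xv):
    """OLS of y on (1, xv, xv^2); returns coefficients and residual SS."""
    X = np.column_stack([np.ones_like(xv), xv, xv**2])
    beta, rss, *_ = np.linalg.lstsq(X, y, rcond=None)
    return beta, rss[0]

b_raw, ss_raw = fit(x)              # raw x
b_cen, ss_cen = fit(x - x.mean())   # centered x

# Linear terms differ, but the quadratic terms and the fit are identical.
print(round(b_raw[2], 7), round(b_cen[2], 7))
print(round(ss_raw - ss_cen, 6))
```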



                            • #15
                              Hello All,

                              This is again a very trivial question and may already have been answered, so please bear with me for repeating it.
                              In one of my estimations, for a different explanatory variable, I have a statistically insignificant level term (negative) and a significant squared term (also negative). Earlier I was doubtful about its relevance, but after going through some discussions of similar findings, I can safely say that there is nothing "unusual" about such a result. However, I am still struggling with how to interpret it. To do so, I calculate the marginal effect of this variable using:
                              Code:
                              margins, dydx(Leverage)
                              The coefficient is negative as well as significant at 1%.
                              Can I simply say that "leverage has a significant negative impact on the outcome variable" without mentioning the impact of the Leverage level or Leverage squared terms separately? Do I even need to mention the impact of these terms in isolation at all? I guess not! If yes, then how should I actually go about it?

                              Please correct me if I am wrong and clarify regarding my interpretation.


                              P.S. I have run a joint significance test on these two terms and, as per the result, I should include both of them.

                              regards,
                              Mohina
                              Last edited by mohina saxena; 18 Sep 2019, 13:20.

