  • How to interpret the non-linear effect of age

    Dear All,

    I am trying to look at the non-linear effect of age on Y.

    But I got the following puzzling results.

    The effect of age on Y is NOT statistically significant, but
    the effect of age_squared on Y is significant at the 0.01 level.

    Does this mean the relationship between age and Y is quadratic? Or should I exclude age_squared from the model because age (not squared) is not significant? (That is, does it make sense to include age_squared if age itself is not significant?)

    My colleague advised me to include both age and age_squared. But I'm still not sure about it.
    Does anyone know how to interpret this result?

    Best,
    DS

  • #2
    Your colleague is correct. To help understand what is going on, make a graph using mcp (user-written; do a search to find and download it) or twoway function. You leave in the age term because it makes it easier to interpret the entire quadratic; see Nelder, J. A. (1998), "The selection of terms in response-surface models--How strong is the weak-heredity principle?", The American Statistician, 52(4): 315-318.
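
    As a minimal sketch of the graphing suggestion above, the fitted quadratic can be drawn with twoway function after the regression. The variable names y and age are placeholders for the poster's variables, and the joint test is an added illustration, not part of the original advice:

    ```stata
    * Assumed variable names: y (outcome) and age (predictor)
    regress y c.age##c.age

    * Test the linear and squared terms jointly rather than one at a time
    testparm c.age c.age#c.age

    * Plot the fitted quadratic over the observed range of age
    twoway function y = _b[_cons] + _b[age]*x + _b[c.age#c.age]*x^2, ///
        range(age) xtitle("age") ytitle("Predicted y")
    ```

    Using factor-variable notation (c.age##c.age) rather than a hand-made age_squared variable keeps the two terms tied together in postestimation commands.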

    • #3
      Dear Rich,

      Thank you very much for your suggestions!
      I will try an mcp plot and check Nelder's article.

      Best,
      DS

      • #4
        David:
        as an aside to Rich's helpful insights, even if your squared term for age is statistically significant, the turning point should fall within the range of your -age- variable. Otherwise, the usefulness of the squared term for age is hard to justify.
        Kind regards,
        Carlo
        (Stata 19.0)
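
        One way to check the point above, sketched under the assumption that the outcome is called y and the predictor age: for a fit y = b0 + b1*age + b2*age^2, the turning point is at age = -b1/(2*b2), which can be estimated with nlcom and compared with the observed range.

        ```stata
        * Assumed variable names: y (outcome) and age (predictor)
        regress y c.age##c.age

        * Turning point of the fitted quadratic, with a delta-method
        * standard error and confidence interval
        nlcom -_b[age] / (2*_b[c.age#c.age])

        * Compare the estimate with the minimum and maximum of age
        summarize age
        ```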

        • #5
          Quadratics can be useful for capturing some curvature. I wouldn't say that the turning point has to be visible within the range of the data.

          But the linear and the squared terms are a team effort. As the linear and square effects are correlated, their effects are not simply separable (short of using orthogonal polynomials).
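
          The correlation between the two terms, and the orthogonal-polynomial alternative mentioned above, can be sketched with the auto dataset. The new variable names (weight2, ow1, ow2) are made up for illustration:

          ```stata
          sysuse auto, clear

          * Raw linear and squared terms are typically highly correlated
          generate weight2 = weight^2
          correlate weight weight2

          * Orthogonal polynomials of degree 2 in weight
          orthpoly weight, generate(ow1 ow2) degree(2)
          correlate ow1 ow2

          * Same fitted values as the raw quadratic, but the linear and
          * squared components are now uncorrelated with each other
          regress mpg ow1 ow2
          ```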

          But if in principle a turning point makes no sense, i.e. there is no independent scientific rationale for a turning point, then it may be that the nonlinearity is better captured e.g. by an exponential, logarithm or reciprocal. Perhaps this is Carlo's main point restated.

          In Stata terms a simple example is the regression of mpg on weight in the auto dataset. There is a visible curvature and using the square of weight as an extra predictor does help. But better yet is to regress the reciprocal of mpg on weight, which is closer to the physics (or engineering).
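
          The two models just described can be fitted as follows. This is only a sketch; the variable name gpm is an invention for illustration:

          ```stata
          sysuse auto, clear

          * Quadratic in weight to capture the curvature
          regress mpg c.weight##c.weight

          * Alternative: reciprocal of mpg (gallons per mile) on weight
          generate gpm = 1/mpg
          label variable gpm "Gallons per mile"
          regress gpm weight
          ```

          Note that the two regressions have different response variables, so their R-squared values are not directly comparable.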

          Perhaps more typical is the relation of income to age, where the relationship (in observational data) is a composite of several social effects. Here people seem to find quadratics useful.

          • #6
            Nick's take is, as always, much smarter than mine. I do second Nick's remark that theory should be considered before statistical procedures, to avoid presenting results that could trouble reviewers and the audience at large in a given research field. For instance, if the risk of a given disease is age-related, then even if a turning point (let's say a maximum) emerges from squaring age, it is difficult to defend, outside that sample, the claim that the risk decreases after the turning point, as tons of literature claim exactly the opposite.
            As far as the main part of my previous reply is concerned, that is, the relationship between data range and turning point, it may well be due to a personal attitude of playing on the safe side of the matter, or possibly because I was impressed by the contents of the following article, which is still one of my favourite readings on regression-related topics: http://www.bmj.com/content/317/7155/409.
            Kind regards,
            Carlo
            (Stata 19.0)
