
  • Is it correct to interpret the sign of a statistically non-significant coefficient?

    Dear Members
    This question may seem foolish or trivial to many in this forum, but a clear-cut answer, if at all possible, from members of this forum can uproot the doubt in my mind.

    Question 1: Can we interpret the "SIGN" of a non-significant coefficient?
    Answer 1:
    "If a coefficient's t-statistic is not significant, don't interpret it at all. You can't be sure that the value of the corresponding parameter in the underlying regression model isn't really zero."
    DeVeaux, Velleman, and Bock (2012), Stats: Data and Models, 3rd edition, Addison-Wesley

    p. 801 (in Chapter 10: Multiple Regression, under the heading "What Can Go Wrong?")

    Answer 2: Some authors interpret at least the sign, stating that the nature of the relationship is negative (or positive) but not statistically significant.
    Which of the above is more correct? Faced with an insignificant coefficient, should we ignore it completely and move on, or should we stop and interpret the sign?
    I have read a similar post on this forum but couldn't find an answer to my doubt.

  • #2
    I second your first answer.
    The second one sounds a bit like an intellectual speculation.
    Kind regards,
    (Stata 16.0 SE)


    • #3
      I would not put any meaning in such a coefficient. For example, if your coefficient is -0.00001, it is negative but very close to zero. In the population, it could easily be a little larger and above 0. So this positive / negative distinction is quite arbitrary and rather useless when the coefficient is very weak.
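To make the point above concrete, here is a minimal simulation sketch (my own illustration, not from the thread; all numbers are arbitrary): when the true slope is essentially zero, the estimated coefficient comes out negative in roughly half of the samples, so its sign carries almost no information.

```python
# Hypothetical simulation: how often does a near-zero coefficient come out
# negative purely by sampling variation? (Numbers are arbitrary choices.)
import random

random.seed(1)
TRUE_SLOPE = 0.00001   # essentially zero, as in the -0.00001 example above
N, REPS = 200, 1000

def ols_slope(x, y):
    """Slope of a simple least-squares fit: cov(x, y) / var(x)."""
    mx = sum(x) / len(x)
    my = sum(y) / len(y)
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    var = sum((a - mx) ** 2 for a in x)
    return cov / var

negative = 0
for _ in range(REPS):
    x = [random.gauss(0, 1) for _ in range(N)]
    y = [TRUE_SLOPE * xi + random.gauss(0, 1) for xi in x]
    if ols_slope(x, y) < 0:
        negative += 1

print(f"{negative / REPS:.0%} of samples give a negative estimated slope")
```

With these settings the share of negative estimates hovers around one half: the positive/negative distinction is, as the post says, arbitrary for a very weak coefficient.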
      Best wishes

      (Stata 16.1 MP)


      • #4
        Thanks, Felix Bittmann & Carlo Lazzaro, for the help. So I should report those insignificant results but not say anything about them (at best we can say those coefficients are not significant at a chosen threshold level of significance), right? Okay, have a good day.


        • #5
          Just to add to your confusion: the whole concept of statistical significance as a clear line separating "significant" from "non-significant" results is in doubt (to put it mildly). This makes sense: the purpose of inference is to quantify uncertainty, so the answer is unlikely to be binary (significant/not significant). So there is a false premise in your question: you assumed that significant versus non-significant is a meaningful distinction, which modern statistics doubts.

          So can you interpret the sign of a non-significant parameter? Yes and no. Yes you can, but you have to be careful. Often, but not always, that "have to be careful" means using so many qualifiers in your interpretation that the interpretation becomes meaningless.
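One way to see the "yes and no" quantitatively is this back-of-the-envelope sketch (my own illustration, not part of the post): under a normal approximation with a flat prior, the probability that the population coefficient shares the sign of the estimate is Phi(|b|/se), where Phi is the standard normal CDF.

```python
# Illustrative calculation: how confident can we be in the SIGN of an
# estimate, given its t-ratio |b|/se? (Normal approximation, flat prior --
# my own simplifying assumptions for illustration.)
import math

def normal_cdf(z):
    """Standard normal CDF via the error function."""
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))

def prob_sign_correct(b, se):
    """P(true coefficient has the same sign as b), under the assumptions above."""
    return normal_cdf(abs(b) / se)

# |t| = 1.0 is "not significant", yet the sign is right about 84% of the time;
# |t| = 0.1 is barely better than a coin flip.
for t in (0.1, 1.0, 1.96):
    print(f"|t| = {t:>4}: P(sign correct) = {prob_sign_correct(t, 1.0):.3f}")
```

So a non-significant coefficient can still carry some information about the sign, but for small t-ratios the honest qualifier is "scarcely better than a coin toss", which is exactly the kind of hedging that can make the interpretation meaningless.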

          Maarten L. Buis
          University of Konstanz
          Department of history and sociology
          box 40
          78457 Konstanz



          • #6
            Just follow Maarten's excellent advice when drafting your paper.
            Kind regards,
            (Stata 16.0 SE)


            • #7
              Thanks, Maarten Buis, though as you predicted I got a little more confused. Given that the distinction between statistical significance and non-significance is itself in doubt (these days), my question rests on a shaky premise.
              Yes you can, but you have to be careful.
              - I agree, but to what extent I can say that, and how to put it with brevity, is the difficult thing.
              Thanks, Carlo, for alerting me.


              • #8
                Some recent papers that are of interest are available here:


                • #9
                  Great references, thanks ericmelse


                  • #10
                    just to top off Eric's excellent list, you may want to take a look at
                    Kind regards,
                    (Stata 16.0 SE)


                    • #11
                      Going back to the original question on whether it is correct to interpret a statistically non-significant coefficient: You should always try to interpret your data. The more basic question is: Do you want to make inferences from the sample to the population or do you want to make sense of the data in your sample?

                      Describing and trying to understand the sample data is a precondition before you go on to make inferences about the population. It is often forgotten that regression analysis, with regression coefficients and R²s, can be purely descriptive (which is why it also belongs in the "descriptive statistics" section of textbooks), even if regression tables are full of F-, t-, or z-statistics and p-values. What is the nature of your variables? What is the data generating process? Did you look at the univariate distributions of your data? Did you look at scatterplots of pairs of variables? In this sense: of course you should interpret the sign of a non-significant coefficient! Statistical significance comes into play only if you want to talk about the reality of the population from which your sample is drawn (hopefully with some kind of useful random sampling, and respecting your sampling design -- which also often tends to be ignored). Here all the excellent references by Eric and Carlo should be part of the canon of "must reads".
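A minimal sketch of this descriptive use of regression (toy data of my own, not from the thread): slope, intercept, and R² simply summarize the sample in hand; no t-statistic or p-value is needed unless we want to generalize beyond it.

```python
# Purely descriptive simple regression on toy sample data: the coefficients
# summarize THIS sample; inference to a population is a separate step.
xs = [1.0, 2.0, 3.0, 4.0, 5.0]
ys = [2.1, 2.9, 3.6, 4.4, 5.2]

n = len(xs)
mx, my = sum(xs) / n, sum(ys) / n
sxx = sum((x - mx) ** 2 for x in xs)
sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))

slope = sxy / sxx                      # descriptive slope of the fitted line
intercept = my - slope * mx
ss_res = sum((y - (intercept + slope * x)) ** 2 for x, y in zip(xs, ys))
ss_tot = sum((y - my) ** 2 for y in ys)
r2 = 1 - ss_res / ss_tot               # share of sample variation accounted for

print(f"slope={slope:.3f}  intercept={intercept:.3f}  R2={r2:.4f}")
```

Nothing in this summary involves sampling; the sign of `slope` here is a plain fact about the data, whatever a significance test would later say about the population.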

                      I would like to add a famous article by Jacob Cohen to the list:
                      Last edited by Dirk Enzmann; 01 Jun 2021, 05:45. Reason: Added on second thoughts


                      • #12
                        Thanks, Dirk Enzmann, for adding to the list of "must read" articles. My confusion arose because textbooks are either silent or do not explain what to do with a non-significant coefficient. On the other hand, articles sometimes (depending upon the discipline) interpret the sign with a disclaimer about the statistical significance, for example:

                        The coefficient on the two-item interaction term of the unpredicted acquirer dummy and log(1 + excess cash ratio) is positive (0.025) but statistically insignificant
                        Now, as many in this thread pointed out, I forgot that inferences are drawn based on sampling, and I didn't acknowledge that most often our intention is to make inferences from the sample to the population.

                        Thanks, Carlo Lazzaro, for the reference in #10. As an aside, most often I am forced to discard non-significant results (hence no further checks or reading on them) and asked to come up with something with that magical p-value < .05.
                        Last edited by lal mohan kumar; 01 Jun 2021, 06:26.


                        • #13
                          Imagine you have full information on the cash ratios of an asset (i.e. complete data on the "population", thus no sample data!); in that case you would need no test of significance at all. To which population do you want to make inferences? To the future? As far as I can see, you can't predict the probability of the future by way of a test of significance (others may disagree on that).


                          • #14
                            Thanks Dirk Enzmann. I agree.


                            • #15
                              Just to make some further points that might deserve mention:

                              Historically, a leading motive for significance tests was to guard against over-interpreting results from (very) small samples. The point is not far from common sense: most people wouldn't take 7 heads out of 10 tosses as convincing evidence of a biased coin, but 70/100 is stronger evidence, and 700/1000 stronger yet.
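The coin-toss intuition can be made exact (my own sketch, not part of the post): the one-sided binomial probability of seeing at least this many heads from a fair coin shrinks sharply as the same 70% proportion comes from more tosses.

```python
# Exact one-sided binomial tail probability under a fair coin: the same 70%
# heads rate is weak evidence at n=10 and overwhelming evidence at n=1000.
from math import comb

def p_at_least(heads, tosses):
    """P(X >= heads) for X ~ Binomial(tosses, 0.5)."""
    return sum(comb(tosses, k) for k in range(heads, tosses + 1)) / 2 ** tosses

for heads, tosses in ((7, 10), (70, 100), (700, 1000)):
    print(f"{heads}/{tosses} heads: P = {p_at_least(heads, tosses):.2e}")
```

For 7/10 the tail probability is about 0.17, entirely compatible with a fair coin; each tenfold increase in tosses at the same proportion drives it down by many orders of magnitude.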

                              That motive has not become redundant. Small datasets remain common whenever measurement is hard work or data are otherwise elusive, but in many fields people now routinely have very large samples and can afford to be a little indulgent about including predictors in published models.

                              There can be special reasons for indulgence too:

                              1. Leaving a predictor in a model even if its coefficient is very small or hard to interpret is a way of saying to yourself, and to potential critics: yes, there is an idea that covariate C is important; I've let the data do their best to show how important it is, and the answer is "not very". Dropping it on a second pass because the first pass of modelling showed insignificance is widely condemned as cherry-picking, P-hacking, or whatever. And people in every field are on the lookout for a reviewer likely to squawk: "But what about C? You've not even tested its contribution."

                              2. With observational data in particular -- which is what most of us here deal with -- no covariate C is ever just itself; it can be at least in part a proxy for other things. So even if a predictor is conventionally significant, it can be a bit of a puzzle quite what it is capturing. In geography and the environmental sciences, aspect -- the direction a slope faces -- is often included to catch its effects on temperature and moisture, but it can by accident helpfully capture something of the underlying geological structure related to folding, faulting, and so forth. As far as I can see, in many medical and social sciences people reach for age as an obvious predictor, but quite why can vary: it often features as a proxy for something else -- physical strength, personal experience or skill, different immunological status, or whatever.

                              3. Often predictors belong together in little teams, so you should take the lot and see what they imply jointly. The most familiar example is likely to be a bundle of indicators for a categorical variable with many categories, where the price of adding fine control is that some of the indicators may not achieve conventional significance, although together they help. A more esoteric example in my experience is using sines and cosines as predictors of seasonality or other periodic variation, where there are good mathematical and physical arguments for treating a sine and cosine as yoked together even if one but not the other coefficient achieves significance.
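The sine/cosine example rests on a trigonometric identity, sketched here (my own illustration, arbitrary numbers): any phase-shifted seasonal wave A·sin(x + φ) is exactly the linear combination b1·sin(x) + b2·cos(x) with b1 = A·cos(φ) and b2 = A·sin(φ), so dropping the "non-significant" half of the pair constrains the phase, not just the amplitude.

```python
# Why sin(x) and cos(x) are yoked as predictors: together they can represent
# a seasonal wave of ANY phase; either one alone cannot.
import math

A, phi = 2.0, 0.3          # arbitrary amplitude and phase for illustration
b1 = A * math.cos(phi)     # coefficient on sin(x)
b2 = A * math.sin(phi)     # coefficient on cos(x)

for x in (0.0, 1.0, 2.5, 4.0):
    lhs = A * math.sin(x + phi)                     # phase-shifted wave
    rhs = b1 * math.sin(x) + b2 * math.cos(x)       # the sin/cos pair
    assert math.isclose(lhs, rhs)
print("phase-shifted wave reproduced exactly by the sin/cos pair")
```

This is why a joint test of the pair, rather than the two individual t-tests, is the natural way to ask whether the seasonal term belongs in the model.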

                              It's a counsel of perfection that your results must tell a clear story, just as it is that your choice of covariates should be theory-led. But much theory doesn't get further than saying that some factor should be important, without saying how it can be measured or how it should appear in an equation. Even good research can be a bit of a muddle, with some mix of mystery as well as clarity in its results.
                              Last edited by Nick Cox; 01 Jun 2021, 12:01.