  • Hypothesis Testing Regression Coefficients: One-sided

    Helpful Link: https://www.stata.com/support/faqs/statistics/one-sided-tests-for-coefficients/


    Hello everyone. I realize there are a few other posts on this topic, but I have had a hard time piecing together the link above with the comments in those other threads. This post should also add value in that the link above does not address testing the difference between two regression coefficients when the 'test' command returns an F test.

    Code:
    reg PctGrow ib1.Factor_Arm#ib2018.Year if Year == 2018 & TermFlag == 0 ,  vce(cluster ClinicID)
        test 2.Factor_Arm#2018.Year - 3.Factor_Arm#2018.Year = 0
        local sign_fac = sign(_b[3.Factor_Arm#2018.Year]-_b[2.Factor_Arm#2018.Year ])
        display "H_0: Fac3 coef >= Fac2 coef. p-value = " ttail(r(df_r),`sign_fac'*sqrt(r(F)))
    OUTPUT:
    Code:
    . reg PctGrow ib1.Factor_Arm#ib2018.Year if Year == 2018 & TermFlag == 0 ,  vce(cluster ClinicID)
    
    Linear regression                               Number of obs     =      2,700
                                                    F(2, 134)         =       1.66
                                                    Prob > F          =     0.1934
                                                    R-squared         =     0.0041
                                                    Root MSE          =     .46162
    
                                    (Std. Err. adjusted for 135 clusters in ClinicID)
    ---------------------------------------------------------------------------------
                    |               Robust
            PctGrow |      Coef.   Std. Err.      t    P>|t|     [95% Conf. Interval]
    ----------------+----------------------------------------------------------------
    Factor_Arm#Year |
            2 2018  |   .0449931    .037292     1.21   0.230    -.0287639      .11875
            3 2018  |  -.0261276   .0370407    -0.71   0.482    -.0993876    .0471324
                    |
              _cons |    .619855   .0245349    25.26   0.000     .5713293    .6683808
    ---------------------------------------------------------------------------------
    
    .
    end of do-file
    
    . do "C:\Users\...
    
    .         test 2.Factor_Arm#2018.Year - 3.Factor_Arm#2018.Year = 0
    
     ( 1)  2.Factor_Arm#2018b.Year - 3.Factor_Arm#2018b.Year = 0
    
           F(  1,   134) =    3.24
                Prob > F =    0.0739
    
    .         local sign_fac = sign(_b[3.Factor_Arm#2018.Year]-_b[2.Factor_Arm#2018.Year ])
    
    .         display "H_0: Fac3 coef >= Fac2 coef. p-value = " ttail(r(df_r),`sign_fac'*sqrt(r(F)))
    H_0: Fac3 coef >= Fac2 coef. p-value = .96305365
    
    .
    I realize there are numerous caveats around one-sided statistical tests. But I want to be able to reject (statistically) the null that the coefficient on 3.Factor_Arm... is greater than the coefficient on 2.Factor_Arm..., and a one-sided test seems most appropriate, since it is hard to tell from the confidence intervals alone that the two are distinctly different.

    The code is largely taken from the link above. My problem is the "p-value" displayed at the bottom. Perhaps the confusion is my own over whether it should be ttail() or 1-ttail(). The local `sign_fac' is certainly going to be negative, so I am feeding a negative value into the ttail() function, and it returns a value saying that 96% of the distribution lies above that statistic. I am not convinced this is the 'p-value' as it is traditionally interpreted, namely the probability of obtaining an even MORE extreme value.

    For example, if the 2.Factor_Arm coefficient were even larger, the statistic would be even more negative and the "p-value" these commands report would be even closer to 1.

    Can someone clarify for me?

    1. To get a real 'p-value', shouldn't this be 1-ttail() in this case?
    2. Or alternatively, should I remove the `sign_fac' multiplication inside ttail() entirely? Perhaps I do not understand the role of this macro in the ttail() call, because sqrt(r(F)) is automatically positive and I want the probability of obtaining an even greater value. (A reproducible sketch on Stata's auto dataset follows below.)
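
    For concreteness, the FAQ's construction can be reproduced on the auto dataset that ships with Stata (a sketch, not my data); the two display lines below are the two candidate quantities from my questions above, and by construction they sum to 1:

    Code:
    * sketch on Stata's shipped auto data, mirroring the FAQ's sign trick
    sysuse auto, clear
    regress price mpg weight
    test mpg = weight                         // F test that the two coefficients are equal
    local sgn = sign(_b[mpg] - _b[weight])    // sign of the estimated difference
    display ttail(r(df_r), `sgn'*sqrt(r(F)))      // the FAQ's one-sided quantity
    display 1 - ttail(r(df_r), `sgn'*sqrt(r(F)))  // its complement; the two sum to 1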

    Thanks in advance for any responses. Hopefully others find this discussion helpful and constructive.


    Last edited by Robert Niewoehner; 01 May 2019, 14:56.

  • #2
    Somehow the link behind the "Helpful Link" at the top of post #1 picked up an extra character at the end so clicking on it does not do what it should. The corrected clickable link is

    https://www.stata.com/support/faqs/statistics/one-sided-tests-for-coefficients/



    • #3
      Your test statistic should be constructed to match your null hypothesis. Your null hypothesis is that
      Code:
      3.Factor_Arm#2018.Year > 2.Factor_Arm#2018.Year
      in which case
      Code:
      3.Factor_Arm#2018.Year - 2.Factor_Arm#2018.Year > 0
      but your test statistic is
      Code:
      2.Factor_Arm#2018.Year - 3.Factor_Arm#2018.Year
      so as the difference you care about gets larger (more positive), the test statistic gets smaller (more negative), which leads to a larger (closer to 1) p-value.

      I think.



      • #4
        The American Statistical Association now recommends against null hypothesis significance testing. https://www.tandfonline.com/doi/full...5.2019.1583913. A better approach to this problem, which is consistent with those recommendations and is also inherently much less confusing, is
        Code:
        lincom  3.Factor_Arm#2018.Year - 2.Factor_Arm#2018.Year
        which will give you an estimate of the difference between the two coefficients (in the order you want), along with a confidence interval. You will then be able to see at a glance whether the difference is positive or not, and also how much of the confidence interval lies in positive territory, and whether the dip into negative territory (if there is one) is extensive enough to matter from a practical perspective.
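
        For anyone unfamiliar with lincom, here is a minimal sketch of what it reports, using the auto dataset that ships with Stata rather than your model:

        Code:
        * minimal illustration of lincom on the shipped auto data
        sysuse auto, clear
        regress price mpg weight
        lincom mpg - weight               // estimated difference of coefficients with a 95% CI
        lincom mpg - weight, level(90)    // the same difference with a 90% interval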



        • #5
          Originally posted by William Lisowski
          Somehow the link behind the "Helpful Link" at the top of post #1 picked up an extra character at the end so clicking on it does not do what it should. The corrected clickable link is...
          Thanks for that. Unfortunately, it appears that I cannot now edit my original post, otherwise I would update the link.

          Your test statistic should be constructed to match your null hypothesis. Your null hypothesis is that ...
          I'm not sure that I follow you. The test statistic does not care whether it is constructed as coef1 - coef2 = 0 or as coef2 - coef1 = 0; the resulting F statistic is the same (see the quick check below). I think my point still holds: this is calculating the inverse p-value, that is, the probability of finding a test statistic "larger" than the one I calculated (since my test statistic is negative).
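
          A quick check, using the auto dataset that ships with Stata rather than my own data:

          Code:
          * test returns the same F whichever way the difference is written;
          * only the sign of the estimated difference picks the tail
          sysuse auto, clear
          regress price mpg weight
          test mpg - weight = 0
          display r(F)
          test weight - mpg = 0
          display r(F)          // identical to the F above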

          A better approach to this problem which is consistent with these recommendations, and also is inherently much less confusing is
          Thanks for the comment, Clyde. You are correct: that approach is more intuitive, clearer, and more consistent with the ASA's recommendations. It would still be helpful to be able to 1) present my results in both formats in case questions arise and 2) clarify my own understanding of hypothesis testing.

          Can you comment on my questions?



          • #6
            The more I ponder my setting, the more I start to convince myself: if I construct the following test statistic...

            Code:
            3.Factor_Arm#2018.Year - 2.Factor_Arm#2018.Year
            I will get a negative test statistic, for sure. Recall that ttail() returns the probability of finding a value greater than my statistic, that is, P(T > t). By inputting a negative value into ttail(), I am finding the probability of observing a value greater than that negative statistic. If this returns 0.96, it can be interpreted as saying there is only a 4% chance of finding an even more negative value, which would result from an even greater distance between the two coefficients. Thus we can reject the null that 3.Factor_Arm is greater than 2.Factor_Arm ...
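
            A numerical check of the symmetry I am leaning on, with df = 134 and t = sqrt(3.24) = 1.8 taken from the (rounded) output in #1:

            Code:
            * Pr(T > -t) = 1 - Pr(T > t) for the symmetric t distribution
            display ttail(134, -1.8)       // roughly .963, close to the value reported in #1
            display 1 - ttail(134, 1.8)    // the same number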

            I think. Feel free to correct me.



            • #7
              In post #1 you present
              Code:
              local sign_fac = sign(_b[3.Factor_Arm#2018.Year]-_b[2.Factor_Arm#2018.Year ])
              display "H_0: Fac3 coef >= Fac2 coef. p-value = " ttail(r(df_r),`sign_fac'*sqrt(r(F)))
              You created this by adapting the following code from the linked FAQ
              Code:
              local sign_mpg = sign(_b[mpg])  
              display "Ho: coef <= 0  p-value = " ttail(r(df_r),`sign_mpg'*sqrt(r(F)))
              Your adaptation consisted of substituting
              Code:
              _b[3.Factor_Arm#2018.Year]-_b[2.Factor_Arm#2018.Year]
              for
              Code:
              _b[mpg]
              in the FAQ code.

              But this means your calculation is computing the p-value for testing
              Code:
              Ho: _b[3.Factor_Arm#2018.Year]-_b[2.Factor_Arm#2018.Year] <= 0
              or in other words for
              Code:
              Ho: _b[3.Factor_Arm#2018.Year] <= _b[2.Factor_Arm#2018.Year]
              rather than for
              Code:
              H_0: Fac3 coef >= Fac2 coef
              as your code asserts.
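
              If the null you actually want is the one in your display string, one way to line the code up with it is the following sketch, which reuses the r(F) and r(df_r) left behind by your test command in post #1:

              Code:
              * p-value for H_0: Fac3 coef >= Fac2 coef (alternative: Fac3 coef < Fac2 coef)
              local sign_fac = sign(_b[3.Factor_Arm#2018.Year] - _b[2.Factor_Arm#2018.Year])
              display "H_0: Fac3 coef >= Fac2 coef. p-value = " ttail(r(df_r), -`sign_fac'*sqrt(r(F)))

              With your results that should work out to 1 minus the .963 you reported, about .037, which is half of the two-sided Prob > F = 0.0739 returned by test.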



              • #8
                Originally posted by William Lisowski
                In post #1 you present...
                Ah, you are most correct: I swapped those two without realizing it while going between the example and my own code. Thank you for the explicit correction; it is very helpful.

