  • Post estimation Wald test question

    I have an OLS regression that looks at the effect of 6 conditions on attitude, and each observation is randomized into 1 of the 6 conditions.

    reg attitude i.condition

    To compare the effect of condition 2 vs condition 4, I used a post estimation Wald test that uses all the observations from my sample:
    test 2.condition = 4.condition

    and the F stat result is...

    F( 1, 814) = 6.55
    Prob > F = 0.0107

    However, suppose I create a dichotomous variable, feedbackinc, that equals 0 if condition == 2 and 1 if condition == 4, and is missing for all other conditions. My Wald test results then change because the degrees of freedom are different.

    reg attitude i.feedbackinc
    test 0.feedbackinc = 1.feedbackinc
    F( 1, 271) = 7.18
    Prob > F = 0.0078

    I'm wondering which regression and Wald test is correct for comparing condition 2 to condition 4: the one with all observations, or the one that only includes people in conditions 2 and 4?

    Thank you!

  • #2
    Yes, and the change in degrees of freedom, in turn, reflects a change in the estimation sample: in the second analysis you are excluding all observations with values of condition other than 2 and 4.

    You have asked two different questions of the data, and as a result, you have gotten two different answers. Each of them is the correct answer to the corresponding question. Your problem is to determine which question you meant to ask.

    Before we get to the details of that, let me also point out that while the answers are not identical, they aren't very different either. Frankly, I can't imagine a situation where the difference between a p-value of 0.0107 and a p-value of 0.0078 would be meaningful. So for practical purposes, and to the extent that p-values matter to what you are doing, you have actually gotten the same answer to your two different questions.

    Your first question of the data is asking: conditional on all of the observations obtained in this study with 6 randomized groups, what is the probability, under random sampling of the population, that we would observe an attitude difference between groups 2 and 4 as large as, or larger than, the one observed? Your second question is asking: conditional on the counterfactual situation that only groups 2 and 4 existed, what, now, is the probability that, under random sampling of the population, we would observe an attitude difference between groups 2 and 4 as large as, or larger than, the one observed? Clearly the domains of random sampling for these two situations are different, and so the sampling distribution of the difference between group 2 and group 4 attitudes will differ. The estimated difference itself is the same in both analyses; what changes is the residual variance estimate and its degrees of freedom, because the second approach does not allow the information about the variation in attitudes in the population obtained in the full study to be applied.
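
    To make the comparison concrete, here is a sketch of the arithmetic behind the two F statistics. It assumes the default (homoskedastic) OLS variance estimator, i.e. no vce() option on regress; the symbols below are illustrative notation, not Stata output. The residual degrees of freedom reported above imply N = 814 + 6 = 820 observations in the full sample and n_2 + n_4 = 271 + 2 = 273 in the restricted one.

    ```latex
    % Both tests share the same numerator: the squared difference in group means.
    F = \frac{(\bar{y}_2 - \bar{y}_4)^2}{s^2 \left( \frac{1}{n_2} + \frac{1}{n_4} \right)}

    % Full-sample regression: residual variance pooled over all 6 groups.
    s^2_{\text{full}} = \frac{\sum_{g=1}^{6} \sum_{i} (y_{gi} - \bar{y}_g)^2}{N - 6},
    \qquad N - 6 = 814

    % Restricted regression: residual variance from groups 2 and 4 only.
    s^2_{\text{sub}} = \frac{\sum_{g \in \{2,4\}} \sum_{i} (y_{gi} - \bar{y}_g)^2}{n_2 + n_4 - 2},
    \qquad n_2 + n_4 - 2 = 271

    % Hence the ratio of the two F statistics is the inverse ratio of the variances.
    \frac{F_{\text{sub}}}{F_{\text{full}}} = \frac{s^2_{\text{full}}}{s^2_{\text{sub}}}
    \approx \frac{7.18}{6.55} \approx 1.10
    ```

    Under these assumptions, the two results differ only because attitudes in groups 2 and 4 happen to be slightly less variable than in the six groups pooled, and because the reference F distribution has 271 rather than 814 denominator degrees of freedom.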

    • #3
      Thank you, Clyde. This is very helpful.

      Xuyang