
  • ttest means and differences

    Hello,

    In the ttest help, Stata notes that the difference is computed by subtracting the mean of the group with the highest value from the mean of the group with the lowest value: 'By default, the mean of the group corresponding to the largest value in the variable in by() is subtracted from the mean of the group with the smallest value in by()'.


    My data set contains values of 1, 2 and missing.
    Value of 1 has 3,000,000 observations
    Value of 2 has 600,000 observations
    Missing has 2,400,000 observations

    When I run the ttest the results output notes 'diff = mean(1) - mean(2) '

    My code to run the ttest specifies unequal variances, and I do not specify the option to 'reverse' the means test.

    Why do the Stata results report 'diff = mean(1) - mean(2)' when this seems contrary to the default setting in Stata, where the mean for the highest value is subtracted from the mean for the lowest value?

    Regards
    Imtiaz

  • #2
    Yes, that's right. The mean of group 2 (the largest value) is being subtracted from the mean of group 1 (the smallest value, because here 1 is smaller than 2). I think largest/smallest refers to the values of the by() variable, not to the group means or sample sizes.
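As a rough illustration of that rule (a Python sketch, not Stata's implementation): the function below picks the two groups purely by the values of the grouping variable and subtracts the mean of the larger-valued group from the mean of the smaller-valued group. The name `default_diff` and the toy data are made up for the example, and dropping observations with a missing group value is an assumption about how such observations are handled.

```python
def default_diff(by_values, outcomes):
    """Return (low, high, diff), where diff = mean(low group) - mean(high group).

    'low' and 'high' are the smallest and largest non-missing values of the
    grouping variable; the group means and group sizes play no role in the order.
    """
    groups = {}
    for g, y in zip(by_values, outcomes):
        if g is None or y is None:  # assumption: missing values are dropped
            continue
        groups.setdefault(g, []).append(y)
    if len(groups) != 2:
        raise ValueError("expected exactly two non-missing groups")
    low, high = min(groups), max(groups)
    mean_low = sum(groups[low]) / len(groups[low])
    mean_high = sum(groups[high]) / len(groups[high])
    return low, high, mean_low - mean_high


# Hypothetical toy data: group 1 has the smaller mean and more observations.
by = [1, 1, 1, 2, 2, None]
y = [4.0, 6.0, 5.0, 10.0, 12.0, 7.0]

low, high, diff = default_diff(by, y)
print(f"diff = mean({low}) - mean({high}) = {diff}")
```

Swapping which group has the larger mean or the larger sample would not change the label; only recoding the group values would.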



    • #3
      Thanks Ken. My question though is why does the results screen say that values for 1 have been subtracted from 2, whereas it should be the other way around as the value 2 is higher (see the results below and the line in red font)?



      Two-sample t test with unequal variances
      ------------------------------------------------------------------------------
         Group |     Obs        Mean    Std. err.   Std. dev.   [95% conf. interval]
      ---------+--------------------------------------------------------------------
             1 | 1589265   -6.164901     .0915939    115.4688   -6.344422   -5.98538
             2 | 364,655   -10.67484     .7331251    442.7098   -12.11174  -9.237935
      ---------+--------------------------------------------------------------------
      Combined | 1953920   -7.006579     .1557942    217.7733    -7.31193  -6.701228
      ---------+--------------------------------------------------------------------
          diff |            4.509938     .7388247                3.061863   5.958012
      ------------------------------------------------------------------------------
          diff = mean(1) - mean(2)                                      t =   6.1042



      • #4
        Ken has given you great advice already.

        In answer to your question in #3, the subtraction doesn't have to go one way or the other (e.g., lower mean minus higher mean); what matters is that the choice is consistent and carried through to the other statistics. After all, the results are symmetric (i.e., the same magnitude but the opposite sign). I believe the default is the lower category value minus the higher category value. Nevertheless, as of Stata 16, I think, StataCorp introduced a -reverse- option, which does the opposite computation. You could also reverse the category coding in earlier versions if you prefer.
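To see the symmetry with the numbers from #3, here is a small Python sketch that recomputes the difference, its standard error, and the t statistic from the reported group means and standard errors, in both orders. The helper name `welch_from_summary` is hypothetical, and the standard error formula sqrt(se_a^2 + se_b^2) is the usual one for the unequal-variances test; this is a check on the arithmetic, not a reproduction of Stata's code.

```python
import math

# Group means and standard errors copied from the output in #3.
mean1, se1 = -6.164901, 0.0915939
mean2, se2 = -10.67484, 0.7331251


def welch_from_summary(mean_a, se_a, mean_b, se_b):
    """Return (diff, se, t) for mean_a - mean_b with unequal variances."""
    diff = mean_a - mean_b
    se = math.sqrt(se_a ** 2 + se_b ** 2)  # SE of the difference in means
    return diff, se, diff / se


default_order = welch_from_summary(mean1, se1, mean2, se2)  # mean(1) - mean(2)
reverse_order = welch_from_summary(mean2, se2, mean1, se1)  # mean(2) - mean(1)

print("mean(1) - mean(2): diff=%.6f se=%.7f t=%.4f" % default_order)
print("mean(2) - mean(1): diff=%.6f se=%.7f t=%.4f" % reverse_order)
```

The first line should agree with the diff row and the t statistic in the output up to rounding; the second has the same standard error, with the signs of the difference and t flipped.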
        Last edited by Leonardo Guizzetti; 12 Nov 2021, 17:50.



        • #5
          OK, thanks. I just found it strange that Stata, without my using the reverse option, chose to compute the difference in what looked like reverse order, i.e., 1 minus 2.



          • #6
            As I said, Stata uses the category values to determine the default order. Exactly which way it goes is also detailed in the documentation.



            • #7
              Originally posted by Imtiaz Bhayat
              Thanks Ken. My question though is why does the results screen say that values for 1 have been subtracted from 2, whereas it should be the other way around as the value 2 is higher (see the results below and the line in red font)?
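              Oh, I see. The results screen is correct. I think you read it backward. In English, the expression A - B is read as "subtract B from A," not "subtract A from B." So 'diff = mean(1) - mean(2)' means that the mean of group 2 is subtracted from the mean of group 1, which is exactly the default described in the help.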



              • #8
                Thanks Ken and Leonardo!
