  • Paired or Unpaired T-Test

    Hello, I have a sample of weekly data spanning two years, and I want to test whether the means of a variable are different across the two years. Should I be using a paired T-Test or an unpaired T-Test?

  • #2
    You have not described the data sufficiently to answer that question. Let's take two extreme cases:

    Case 1. The data in the first year come from a sample of an altogether different population than the data in the second year (so that year is actually confounded with population). In this situation, an unpaired t-test would be called for because there is nothing that relates any one datum from the first year to any specific datum in the second year.

    Case 2. The data are weekly closing prices of the same securities in each year. So the same entity (price of a specific security) is measured twice: once in the first year and once in the second year. Moreover, there is reason to believe that these prices exhibit some degree of seasonality, so that there is a correlation between the price of a security on week N of year 1 and its price on week N of year 2. These are completely paired observations and a paired T-test would be mandatory here.

    Put in general terms, if there is an actual real-world correspondence between each of the observations in year 1 and some specific corresponding observation in year 2, then you need a paired T-test. But if there is nothing that links an observation in year 1 with any particular observation in year 2, then they can be treated as independent samples and an unpaired T-test can be used.
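
    As a rough sketch of how the two cases map onto Stata's -ttest- syntax, assuming hypothetical variables y_year1 and y_year2 (wide layout, one row per week) or y and year (long layout, one row per week in each year):

    Code:
    * paired: week N of year 1 is matched with week N of year 2
    ttest y_year1 == y_year2
    * unpaired, equal variances assumed
    ttest y_year1 == y_year2, unpaired
    * unpaired, unequal variances allowed
    ttest y_year1 == y_year2, unpaired unequal
    * unpaired, long layout (year takes exactly two values)
    ttest y, by(year)

    The variable names are placeholders for illustration, not anything given in the original question.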


    • #3
      Zachariah:
      as an aside to Clyde's helpful advice, you may want to consider:
      Code:
      regress <depvar> i.year
      Kind regards,
      Carlo
      (Stata 19.0)
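
      For what it is worth, with only two years the coefficient on year in this regression has the same t statistic (up to sign) and p-value as the equal-variance unpaired t-test. A small sketch, assuming a long layout with a hypothetical outcome hours and a two-valued year variable:

      Code:
      * hours and year are assumed names, one row per week in each year
      regress hours i.year
      * equal-variance two-sample t-test on the same data
      ttest hours, by(year)

      Neither command makes use of any week-to-week pairing across the two years.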


      • #4
        Thanks Clyde,
        The data are payroll data for a farm, and I am interested in testing for a difference across the two years in the total hours worked by females and in the share of the workforce that is female. There is definitely seasonality over the sample, and the weeks from the first year match the weeks from the second year. So in this case, do you think a paired test is more appropriate?


        • #5
          Yes, it sounds like a paired test is more appropriate. Actually, not just more appropriate--required.


          • #6
            Thanks, Clyde. Do you know where I can find details about the actual test statistic Stata computes for the paired t-test?


            • #7
              Run -help ttest-. Then click on the View complete PDF manual entry link. In the PDF manual page that opens, click on the Methods and formulas link. A description of the formula used for the paired t-test (which they call the test for matched observations) appears at the end of that section.
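
              In short, the paired statistic is the mean of the within-pair differences divided by its standard error, with n - 1 degrees of freedom. A minimal sketch that computes it by hand and checks it against -ttest-, assuming hypothetical wide-format variables y_year1 and y_year2 with one row per matched week:

              Code:
              * within-pair differences (variable names are assumed)
              generate double d = y_year1 - y_year2
              quietly summarize d
              * t = mean(d) / (sd(d) / sqrt(n)), with n - 1 degrees of freedom
              scalar tstat = r(mean) / (r(sd) / sqrt(r(N)))
              scalar df = r(N) - 1
              display "t = " tstat "   df = " df "   two-sided p = " 2*ttail(df, abs(tstat))
              * should agree with the built-in paired test
              ttest y_year1 == y_year2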


              • #8
                Thanks, Clyde! I appreciate your help.


                • #9
                  I'd support Clyde's point at #2 that knowing more about your situation would be helpful. For example, from the description here, I wouldn't be sure whether your units of analysis are "weeks" or "persons," although I suspect the former. My thinking is that the choice of a paired vs. unpaired t-test, and the details of the formula used, are among the less difficult aspects of making a valid comparison, what with the complexities of seasonality and so forth. Our many economist colleagues here are quite knowledgeable about such matters, and giving them more substantive info to work with might elicit a good response.


                  • #10
                    Hi Mike,
                    Thanks for chiming in. I actually have individual-level payroll data that I am aggregating to the week level for years 1 and 2. I want to test whether the average number of hours worked by females per week and the female share of hours worked per week differ between the two years. If you have more guidance about whether I should be using a paired test or an unpaired test, I am all ears.
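
                    A minimal sketch of this kind of aggregation, under several assumptions: the individual-level file has hypothetical variables hours, female (0/1), year (coded 1 and 2) and week (numbered the same way in both years), and "female hours" means total hours worked by females in a week:

                    Code:
                    * hours worked by females only (0 for males)
                    generate double female_hours = hours * female
                    * one row per year-week: female hours and total hours
                    collapse (sum) female_hours total_hours=hours, by(year week)
                    generate double female_share = female_hours / total_hours
                    * put year 1 and year 2 side by side for each week
                    reshape wide female_hours total_hours female_share, i(week) j(year)
                    * paired comparisons, week N of year 1 against week N of year 2
                    ttest female_hours1 == female_hours2
                    ttest female_share1 == female_share2

                    If year were stored as calendar years instead, the reshaped variable names would carry those values as suffixes.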


                    • #11
                      If the goal is to test whether the mean has changed over the two years I don’t see why a paired test is needed. In fact, I don’t see how that properly accounts for the serial correlation across time. I would define a dummy variable for, say, the second year and then use a simple regression with Newey-West standard errors:

                      Code:
                      newey y year2, lag(h)
                      where y is the outcome and h, the maximum lag of autocorrelation to allow for, must be chosen. My guess is 3 or 4 for weekly data.
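
                      A fuller sketch of the same idea, assuming a long layout with one row per week and hypothetical variables y, year (coded 1 and 2) and week; -newey- needs the data to be declared as a time series first, and lag(4) is just the guess above, not a recommendation:

                      Code:
                      * put the weeks in calendar order and build a running time index
                      sort year week
                      generate t = _n
                      tsset t
                      * dummy for the second year
                      generate byte year2 = (year == 2)
                      * OLS point estimates with Newey-West standard errors
                      newey y year2, lag(4)
                      * same coefficients with default standard errors, for comparison
                      regress y year2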


                      • #12
                        Thanks, Jeff. Just curious, do these tests still need to be done if the data contain all the information for the population? I was going to ask you about that in an email, but I might as well ask you here since I have you on the line.
