  • Compare paired data - on separate lines, but identical ID

    Dear all,

    I am hoping that some of you may be able to tell me how to solve the following problem:

    Background: In my dataset of about 100 individuals, we have measured a number of variables on two occasions. Each individual therefore has two sets of data, i.e. two lines of data, like this:

    id time var1time1 var1time2 var2
    13 1 10 . 1
    13 2 . 11 2
    14 1 24 . 4
    14 2 . 22 3
    15 1 7 . 2
    15 2 . 8 5

    Question/Problem: I’d like to compare the paired values measured on the two occasions (using either a paired t-test or the Wilcoxon signed-rank test), but the values of var1time1 and var1time2 are on different lines, so e.g. "signrank var1time1=var1time2" doesn't make sense. How do I make Stata "understand" that var1time1 is to be compared with var1time2 for the same id? (Merging the two lines is not an option, because most variables are set up like var2 above.)

    Thanks in advance!

  • #2
    Perhaps just reshape the data to wide form, so that you have a single observation for each id?
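
    A minimal sketch of the reshape route, using the example data from the question. The helper variable var1 is an assumption introduced here for illustration and is not part of the original data. Because var2 differs between the two occasions, it is listed in the reshape as well, so it ends up as var21 and var22 rather than blocking the command.

```stata
* Sketch only: var1 is a hypothetical helper name, not from the original post
clear
input byte(id time var1time1 var1time2 var2)
13 1 10  . 1
13 2  . 11 2
14 1 24  . 4
14 2  . 22 3
15 1  7  . 2
15 2  .  8 5
end

* Fold the two half-empty columns into a single long-form variable
generate var1 = cond(time == 1, var1time1, var1time2)
drop var1time1 var1time2

* One observation per id; each time-varying variable gets the time value as a suffix
reshape wide var1 var2, i(id) j(time)

* The paired tests now compare two columns within the same observation
ttest var11 == var12
signrank var11 = var12
```

    After the reshape, var11 and var12 hold the first and second measurement of var1. If the long layout is needed again, reshape long var1 var2, i(id) j(time) should restore it.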

    Alternatively, you could do something like:
    Code:
    egen _var1time1 = max(var1time1), by(id)
    egen _var1time2 = max(var1time2), by(id)
    That will give you the same value in both observations for each id, so just make sure you use only one of them for the tests (say, with a condition like if time == 1).
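
    Put together with the example data from the question, the egen approach might look like the sketch below. It assumes each id has exactly one nonmissing value per variable, so that max() simply copies that value to both rows.

```stata
* Sketch only: assumes one nonmissing value per id for each variable
clear
input byte(id time var1time1 var1time2 var2)
13 1 10  . 1
13 2  . 11 2
14 1 24  . 4
14 2  . 22 3
15 1  7  . 2
15 2  .  8 5
end

* max() ignores missing values, so each id's single value is spread to both rows
egen _var1time1 = max(var1time1), by(id)
egen _var1time2 = max(var1time2), by(id)

* Restrict to one row per id so that each pair is counted only once
ttest _var1time1 == _var1time2 if time == 1
signrank _var1time1 = _var1time2 if time == 1
```

    Without the if time == 1 condition, every pair would enter the test twice, which would overstate the sample size.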
    Last edited by Hemanshu Kumar; 29 Aug 2022, 06:00.


    • #3
      Code:
      * Example generated by -dataex-. For more info, type help dataex
      clear
      input byte(id time var1time1 var1time2 var2)
      13 1 10  . 1
      13 2  . 11 2
      14 1 24  . 4
      14 2  . 22 3
      15 1  7  . 2
      15 2  .  8 5
      end
      
      bysort id (time) : replace var1time2 = var1time2[2]
      
      ttest var1time1 = var1time2 if time == 1
      
      Paired t test
      ------------------------------------------------------------------------------
      Variable |     Obs        Mean    Std. err.   Std. dev.   [95% conf. interval]
      ---------+--------------------------------------------------------------------
      var1ti~1 |       3    13.66667    5.238745    9.073772   -8.873832    36.20717
      var1ti~2 |       3    13.66667    4.255715    7.371115   -4.644198    31.97753
      ---------+--------------------------------------------------------------------
          diff |       3           0           1    1.732051   -4.302653    4.302653
      ------------------------------------------------------------------------------
           mean(diff) = mean(var1time1 - var1time2)                     t =   0.0000
       H0: mean(diff) = 0                              Degrees of freedom =        2
      
       Ha: mean(diff) < 0           Ha: mean(diff) != 0           Ha: mean(diff) > 0
       Pr(T < t) = 0.5000         Pr(|T| > |t|) = 1.0000          Pr(T > t) = 0.5000
      Please note the use of dataex to give an example as explained at https://www.statalist.org/forums/help#stata
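
      The [2] subscript above relies on every id having exactly two observations, with time 1 sorted first and time 2 second. A hedged variant that checks this first and leaves var1time2 untouched is sketched below; the name var1t2_filled is hypothetical and chosen only for illustration.

```stata
* Sketch only: var1t2_filled is a hypothetical name, not from the original post
clear
input byte(id time var1time1 var1time2 var2)
13 1 10  . 1
13 2  . 11 2
14 1 24  . 4
14 2  . 22 3
15 1  7  . 2
15 2  .  8 5
end

* Confirm the layout that the [2] subscript depends on
isid id time
bysort id (time) : assert _N == 2
bysort id (time) : assert time[1] == 1 & time[2] == 2

* Copy the second-occasion value to both rows of each id in a new variable
bysort id (time) : generate var1t2_filled = var1time2[2]

* Wilcoxon signed-rank test, using one row per id
signrank var1time1 = var1t2_filled if time == 1
```

      If any of the assert commands fails, Stata stops with an error before the new variable is created, which flags ids with missing or duplicated occasions.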


      • #4
        Thank you so much, Hemanshu Kumar! That solved it! A good opportunity for me to learn how to use "egen".


        • #5
          And thank you, Nick Cox! Another helpful way to solve the problem!
