  • Identifying paired samples in panel data

    Hello,

    I am using panel data to compare two methods of analyzing milk samples (the methods are "24" and "MD"); see the dataex example below. I want to compare values of "energy" between the two methods. I am hoping for help with two things:
    1. Some of the samples were analyzed more than once and need to be averaged into a single value for energy. For example, person 1 has two values for type "MD" and two values for type "24." I can tell these are reanalyses of the same sample because they occur on the same date. So I want to generate a variable that holds the mean of "energy" across all MD samples taken from the same person on the same date, and likewise for the 24 samples.
    2. I then want to identify paired samples in the dataset (samples from the same person that are taken on the same date but are of different types). I imagine this would be a new variable that indicates whether the samples are from day 1, 2, 3, etc.

    Any advice is much appreciated!

    Sarah



    Code:
    * Example generated by -dataex-. For more info, type help dataex
    clear
    input float pid str2 samplename1 str8 date float energy
     1 "24" "7/21/22"   82.57
     1 "MD" "7/21/22"   84.83
     1 "MD" "7/21/22"    84.5
     1 "24" "7/21/22"   82.17
     2 "24" "7/21/22"   89.45
     2 "MD" "7/21/22"   92.39
     2 "MD" "7/21/22"   94.01
     2 "24" "7/21/22"   89.03
     2 "MD" "7/21/22"   93.22
     3 "MD" "9/27/22"  110.39
     3 "24" "10/13/22"   90.6
     3 "MD" "10/21/22"  88.45
     3 "MD" "10/21/22"  89.04
     3 "24" "10/13/22"  87.95
     3 "MD" "10/6/22"   79.42
     3 "MD" "9/27/22"  111.39
     3 "24" "10/21/22"  93.42
     4 "MD" "10/6/22"   67.57
     4 "MD" "10/21/22"  71.95
     4 "MD" "10/6/22"   67.69
     4 "24" "10/6/22"   68.17
     4 "24" "10/21/22"  73.24
     4 "24" "10/21/22"   73.5
     4 "24" "10/6/22"   68.72
     4 "24" "10/27/22"  72.35
     4 "MD" "11/3/22"   73.92
     4 "MD" "11/3/22"   73.15
     4 "MD" "10/27/22"  67.92
     4 "24" "11/3/22"   70.93
     4 "24" "10/27/22"  72.13
     4 "MD" "10/21/22"  71.41
     4 "MD" "10/27/22"  67.23
     4 "24" "10/27/22"  71.49
     4 "24" "11/3/22"   69.91
     5 "24" "10/21/22"  67.28
     5 "MD" "10/21/22"  72.03
     5 "MD" "10/21/22"  72.08
     5 "24" "10/21/22"  67.62
     6 "24" "9/8/22"    88.57
     6 "MD" "9/8/22"   102.56
     6 "24" "9/8/22"    88.37
     6 "MD" "9/8/22"   102.76
     7 "24" "9/8/22"    83.72
     7 "MD" "9/8/22"   121.31
     7 "24" "9/8/22"    84.04
     7 "24" "10/21/22"  82.37
     7 "MD" "10/11/22" 139.59
     7 "24" "11/3/22"  125.37
     7 "24" "11/3/22"  124.58
     7 "MD" "9/22/22"  124.52
     7 "24" "10/11/22"  98.67
     7 "24" "9/27/22"   98.94
     7 "MD" "10/21/22"  80.34
     7 "24" "10/21/22"  82.31
     7 "MD" "10/11/22" 139.73
     7 "24" "9/27/22"   98.74
     7 "MD" "9/22/22"  120.98
     7 "MD" "11/1/22"  114.13
     7 "MD" "10/21/22"   80.5
     7 "24" "10/11/22"  98.33
     7 "MD" "9/8/22"   114.94
     7 "MD" "11/1/22"  114.66
     8 "24" "11/16/22"  80.43
     8 "24" "11/16/22"  80.36
     8 "MD" "11/16/22"  80.78
     8 "MD" "11/16/22"  80.89
     9 "MD" "11/10/22"   80.5
     9 "MD" "11/10/22"  80.84
     9 "24" "11/10/22"  75.96
     9 "24" "11/10/22"  74.96
    10 "24" "12/17/22"  76.99
    10 "24" "12/17/22"  76.91
    10 "MD" "12/17/22"  92.19
    10 "MD" "12/17/22"  91.52
    11 "24" "11/30/22"  72.42
    11 "MD" "11/10/22"  90.26
    11 "MD" "11/10/22"  89.63
    11 "24" "11/30/22"  71.48
    12 "MD" "7/7/22"    82.67
    12 "24" "6/23/22"   84.36
    12 "24" "6/23/22"   84.66
    12 "MD" "7/7/22"    80.79
    13 "MD" "10/6/22"   75.35
    13 "MD" "10/21/22"  84.06
    13 "24" "10/21/22"  80.47
    13 "24" "10/21/22"  89.09
    13 "24" "10/21/22"   89.6
    13 "24" "10/21/22"  80.59
    13 "24" "10/6/22"   84.18
    13 "MD" "10/21/22"  79.25
    13 "MD" "10/21/22"  78.54
    13 "MD" "10/6/22"   75.52
    13 "MD" "10/21/22"  84.31
    13 "24" "10/6/22"   83.67
    14 "MD" "11/30/22"  79.12
    14 "MD" "11/10/22"  75.78
    14 "MD" "11/10/22"   75.8
    14 "MD" "11/30/22"  79.54
    14 "24" "11/30/22"  81.16
    14 "24" "11/30/22"  80.27
    end

  • #2
    First, collapse and average by pid, sample type, and date. Then reshape wide to pair your sample types by pid and date.

    Code:
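    * Average the reanalyses: one row per person, method, and date
    collapse (mean) mean_energy_=energy, by(pid samplename1 date)
    * Put the two methods side by side as mean_energy_24 and mean_energy_MD
    reshape wide mean_energy_, i(pid date) j(samplename1) string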
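
    Once the data are in wide form, flagging complete pairs and numbering each person's sampling days could look like the sketch below. The names has_pair, sdate, day, and diff are illustrative and not from the original post, and the date mask assumes every two-digit year belongs to the 2000s.

    ```stata
    * One row per pid and date now holds mean_energy_24 and mean_energy_MD.
    * A pair is complete only when both methods have a value on that date.
    generate byte has_pair = !missing(mean_energy_24, mean_energy_MD)

    * Convert the string date to a Stata daily date so it sorts chronologically
    * (assumption: all two-digit years are 20xx).
    generate sdate = daily(date, "MD20Y")
    format sdate %td

    * Number each person's sampling dates 1, 2, 3, ... in date order.
    bysort pid (sdate): generate day = _n

    * Method difference, defined only for complete pairs.
    generate diff = mean_energy_MD - mean_energy_24 if has_pair
    ```

    Rows with has_pair equal to 0 are dates on which only one method was run (for example, pid 11 in the example data), so they can be listed or dropped before any paired comparison.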


    • #3
      Thank you Daniel! Is it possible to collapse more than one variable at a time? I also have fat and carbs variables that I would like to explore. Or do I need to do each one separately?


      • #4
        Code:
        collapse (mean) mean_energy = energy mean_fat = fat mean_carbs = carbs, by(pid samplename1 date)
        will work just fine. Do read -help collapse- for more information on this versatile command that is a "bread and butter" piece of Stata.
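
        To carry all three variables through the pairing step from #2 as well, the collapse and reshape might be combined as in this sketch. It assumes fat and carbs are numeric variables named exactly as in #3; the trailing underscores in the new names are only a readability choice so that the wide variables come out as, for example, mean_fat_24 and mean_fat_MD.

        ```stata
        * Average all three measures within person, method, and date
        collapse (mean) mean_energy_ = energy mean_fat_ = fat mean_carbs_ = carbs, ///
            by(pid samplename1 date)

        * One row per person and date, with a 24 and an MD column for each measure
        reshape wide mean_energy_ mean_fat_ mean_carbs_, i(pid date) j(samplename1) string
        ```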
