  • Calculating means

    Dear Stata experts,

    I need help calculating the mean of 'X_A3', 'Y_A3', and 'S_A3' across runs for each SampleID. Each blood sample (variable SampleID) is analyzed once, twice, or three times, and the variable Run numbers those analyses. The results from each analysis are presented in the 'X_A3', 'Y_A3', and 'S_A3' parameters. I want to calculate the mean of the run results for each sample, and the new 'X_A3', 'Y_A3', and 'S_A3' shall have only one value (the mean) per sample. This is an example of the data:

    Code:
    * Example generated by -dataex-. To install: ssc install dataex
    clear
    input long SampleID byte Run double(X_A3 Y_A3) int S_A3
    45635001 1             4.95 5.20999999999999 142
    45635001 2 4.96999999999999             5.11 142
    45635002 1             4.49                5 148
    45635002 2             4.44             4.99 147
    45635002 3             9.31 5.12999999999999 172
    45635003 1             9.58             5.15 169
    45635004 1 7.13999999999999             4.45 135
    45635005 1             4.07              4.4 145
    45635005 2 4.21999999999999             4.79 147
    45635005 3             4.19             4.82 147
    45635006 1 6.45999999999999 4.58999999999999 137
    45635006 2             6.53 4.54999999999999 138
    45635007 1              4.5             4.79 144
    45635007 2 4.50999999999999 4.75999999999999 145
    45635007 3 5.54999999999999             4.23 135
    45635008 1             5.48             4.23 135
    45635009 1 5.16999999999999              4.9 152
    45635009 2 5.17999999999999 4.87999999999999 153
    45635010 1              5.2             4.57 135
    45635010 2             5.28             4.54 133
    end
    I appreciate your help with the appropriate command(s).

    Thank you in advance.
    Best,
    Amanda
    Last edited by Amanda Ode; 17 Sep 2023, 07:05.

  • #2
    You can use the egen command with the bysort prefix:

    Code:
    bysort SampleID: egen X_A3_mean = mean(X_A3)
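
    If the mean is wanted for all three variables, the same line can be repeated in a loop. This is only a sketch: the _mean suffix follows the line above, and the marker variable first is a hypothetical name. The last two lines are optional and keep a single row per SampleID.

    Code:
    * per-sample mean of each variable, repeated on every row of that sample
    foreach v of varlist X_A3 Y_A3 S_A3 {
        bysort SampleID: egen `v'_mean = mean(`v')
    }
    * optional: flag one row per SampleID and drop the rest
    egen byte first = tag(SampleID)
    keep if first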


    • #3
      Thank you Luca! What I am trying to get is the following:

      Example: For SampleID 45635001, X_A3 = mean of Run 1 (4.95) and Run 2 (4.96999) = 4.96
      Y_A3 = mean of Run 1 (5.2099) and Run 2 (5.11) = 5.16, and so on.
      i.e., for each SampleID, the mean value for the Runs related to that specific sample is desired.
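
      A hand calculation like this can be checked without changing the data. A minimal sketch, assuming the example data from #1 is in memory: summarize reports the mean for one sample, and tabstat tabulates the mean and the number of runs for every sample.

      Code:
      * means for a single sample
      summarize X_A3 Y_A3 S_A3 if SampleID == 45635001
      * means and number of runs for every sample, data left untouched
      tabstat X_A3 Y_A3 S_A3, by(SampleID) statistics(mean n)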


      • #4
        From #1:
        I want to calculate the mean of the run results for each sample, and the new 'X_A3', 'Y_A3', and 'S_A3' shall have only one value (the mean) per sample.
        I believe you want to use -collapse-. E.g.,

        Code:
        collapse (mean) X_A3 Y_A3 S_A3 (max) Runs=Run, by(SampleID)
        For the data you posted:

        Code:
        . list
        
             +---------------------------------------------------+
             | SampleID        X_A3        Y_A3      S_A3   Runs |
             |---------------------------------------------------|
          1. | 45635001        4.96        5.16       142      2 |
          2. | 45635002        6.08        5.04   155.667      3 |
          3. | 45635003        9.58        5.15       169      1 |
          4. | 45635004        7.14        4.45       135      1 |
          5. | 45635005        4.16        4.67   146.333      3 |
             |---------------------------------------------------|
          6. | 45635006       6.495        4.57     137.5      2 |
          7. | 45635007   4.8533333   4.5933333   141.333      3 |
          8. | 45635008        5.48        4.23       135      1 |
          9. | 45635009       5.175        4.89     152.5      2 |
         10. | 45635010        5.24       4.555       134      2 |
             +---------------------------------------------------+
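
        Note that -collapse- replaces the data in memory, so the individual runs are gone afterwards. If they are still needed, one option is to wrap the command in -preserve- and -restore- and save the means to a separate file. This is a sketch under assumptions: the file name sample_means and the variable name n_runs are hypothetical, and (count) is used here in place of (max) so that the result is the number of non-missing Run values per sample.

        Code:
        preserve
        * one row per sample: means of the results plus the number of runs
        collapse (mean) X_A3 Y_A3 S_A3 (count) n_runs = Run, by(SampleID)
        save sample_means, replace
        * bring back the original run-level data
        restore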
        --
        Bruce Weaver
        Email: [email protected]
        Version: Stata/MP 19.5 (Windows)


        • #5
          Thank you Bruce! It was the collapse command that I needed. I have not used Stata in years and I seem to have forgotten a lot. The command worked perfectly.
