  • Analysis between three groups

    gender age hiv_aids MM
    0 0 0 10.8
    1 0 0 127.3867
    1 0 0 28.97162
    0 0 0 21.2
    1 1 0 52.04571
    1 1 0 37.44932
    1 1 0 34.1
    1 1 0 28.01327
    1 0 0 19.6
    1 1 0 97.23553
    0 0 0 29.85625
    1 0 0 49.24438
    1 0 0 11.5739
    1 0 0 25.6
    1 1 0 24.54847
    1 0 0 26.3
    1 0 0 14.2
    0 0 0 28.2
    0 0 0 20.7
    0 1 0 22.11574
    0 0 0 9.1
    1 1 0 73.35053
    1 0 0 74.53004
    0 1 0 29.6
    1 0 0 67.08441
    1 1 0 14.59639
    1 1 1 41.4
    1 1 1 31.6
    1 1 1 96.27718
    1 1 1 39

    Age: 0 = child, 1 = adult. Gender: 0 = female, 1 = male. MM: molecular marker.



    I am trying to see whether age, gender and HIV infection status have an effect on this molecular marker. I can do it with two variables:

    bysort age: ranksum MM, by(gender)

    But I want to test, for example, the values of MM by gender and age among those who are HIV negative: among those who are HIV negative, is there a difference between female children and male children?

    I could not attach the .dta file, sorry.

    Thank you.

  • #2
    You'll increase your chances of a helpful answer by following the FAQ on asking questions - provide Stata code in code delimiters, readable Stata output, and sample data using dataex.

    Without more info, I can't be sure, but this looks like a standard 2 sample t test using groups - look at documentation for ttest. There are also regression approaches that should give the same results. You mention ranksum - are you looking for non-parametric estimators? If so, you need to say so.
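
    A minimal sketch of the two approaches mentioned above, restricted to HIV-negative children as one example subgroup. The variable names come from the listing in #1; the choice of subgroup is only an illustration.

    Code:
    * two-sample t test by gender (equal variances assumed by default)
    ttest MM if age==0 & hiv_aids==0, by(gender)
    * regression on a gender indicator: the t statistic on the gender
    * coefficient matches the t test above, up to sign
    regress MM i.gender if age==0 & hiv_aids==0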



    • #3
      Gabriel:
      it seems a job for -regress-.
      Kind regards,
      Carlo
      (Stata 18.0 SE)



      • #4
        Phil Bromiley Yes, this variable is not normally distributed, so I am looking for a non-parametric test. ranksum works well and gives the expected results. However, I want to test a specific group.

        Carlo Lazzaro I believe that regress will not be helpful, for the same reason (in any case, I tried regress and logistic and they did not work out for gender).

        My goal is to test whether, among those who are HIV negative, there is a difference by gender within the children group, and the same for adults. All HIV-positive persons are adults, so it is not necessary to test this within that group. (Please find attached the data in .dta format.)



        • #5
          Among those who are HIV negative, is there a difference between female children and male children?
          Unless I am missing something, you just need to proceed in the usual Stata way of selecting groups, i.e., using the if qualifier.

          Code:
          bys age: ranksum MM if hiv==0, by(gender)
          More generally, all that ranksum does is test whether independent samples are from populations with the same distribution (some view it as a test of medians). Therefore, with your data in #1, I can test whether the distributions differ between HIV-negative female children (gender 0, age 0, hiv 0) and HIV-positive adult males (gender 1, age 1, hiv 1); the syntax is flexible enough to allow this.

          Code:
          ranksum MM if (gender==0 & age==0 & hiv==0) | (gender==1 & age==1 & hiv==1), by(gender)
          Just define the groups and then modify the syntax.
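
          One way to define the groups explicitly is to build an indicator variable first and pass it to by(). This is only a sketch: grp is a hypothetical variable name, and the other names follow the commands above (hiv is assumed to resolve to the HIV status variable, as it does in those commands).

          Code:
          * grp is a hypothetical helper variable, not part of the original data
          generate byte grp = .
          replace grp = 0 if gender==0 & age==0 & hiv==0
          replace grp = 1 if gender==1 & age==1 & hiv==1
          * keep only the observations assigned to one of the two groups
          ranksum MM if !missing(grp), by(grp)
          Changing the two replace conditions is then enough to set up any other pairwise comparison.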



          • #6
            Dear Andrew Musau, this is exactly what I was trying to test. Thank you very much! But I still have a problem understanding these data. It does not make sense to me (biologically) to test the difference between HIV-negative female children and HIV-positive adult males. I need to test male versus female children (HIV negative), male versus female adults (HIV negative), and HIV-negative versus HIV-positive adult males. I have played around with your syntax and only had a problem with the test between HIV-positive adult males and HIV-negative adult males.

            Another question: when I split the variable mm into two groups (based on a median cutoff), I found a difference between HIV-negative adult males and females.

            (bys hiv: tab mmc gender, ex)

            What is the best way to interpret these data: dichotomizing the variable into two groups, or analyzing it directly with ranksum?
            Last edited by Gabriel Reis Ferreira; 18 Aug 2018, 11:07.



            • #7
              In future please use dataex to present data examples.

              Code:
              * Example generated by -dataex-. To install: ssc install dataex
              clear
              input byte(age gender hiv) float(mm mmc)
              1 1 0 14.59639  0
              0 1 0     25.6  0
              0 1 0 28.97162  0
              1 1 0 37.44932 30
              1 1 0 73.35053 30
              1 1 0     34.1 30
              1 1 0 24.54847  0
              0 1 0     14.2  0
              0 0 0     20.7  0
              1 1 0 52.04571 30
              0 1 0 67.08441 30
              1 0 0 22.11574  0
              1 1 0 97.23553 30
              0 1 0     26.3  0
              0 1 0 74.53004 30
              1 0 0     10.8  0
              1 0 0     29.6  0
              0 1 0 127.3867 30
              0 0 0     28.2  0
              0 0 0      9.1  0
              0 1 0  11.5739  0
              0 0 0 29.85625  0
              1 1 0 28.01327  0
              0 0 0 24.62219  0
              0 1 0     19.6  0
              0 1 0 49.24438 30
              0 0 0     21.2  0
              1 1 1       39 30
              1 1 1 96.27718 30
              1 1 1     31.6 30
              1 1 1     41.4 30
              end
              It does not make sense to me (biologically) to test the difference between HIV-negative female children and HIV-positive adult males.
              My example was just an illustration. It is however sensible to use theory to decide what tests make sense and what don't.

              I need to test male versus female children (HIV negative), male versus female adults (HIV negative), and HIV-negative versus HIV-positive adult males. I have played around with your syntax and only had a problem with the test between HIV-positive adult males and HIV-negative adult males.
              I would first graph the data before doing the tests. In this way, I know what to expect. The syntax changes little.

              Code:
              set scheme s1color
              * children, HIV negative, by gender
              graph box mm if (gender==0 & age==0 & hiv==0) | (gender==1 & age==0 & hiv==0), by(gender)
              * replace allows the block to be re-run when g1.gph already exists
              gr save g1, replace

              * adults, HIV negative, by gender
              graph box mm if (gender==0 & age==1 & hiv==0) | (gender==1 & age==1 & hiv==0), by(gender)
              gr save g2, replace

              * adult males, by HIV status
              graph box mm if (gender==1 & age==1 & hiv==0) | (gender==1 & age==1 & hiv==1), by(hiv)
              gr save g3, replace

              * combine the three saved graphs into one figure
              gr combine g1.gph g2.gph g3.gph
              [Figure: gcombine.png, the three box plots combined]


              Tests

              Code:
              *children, hiv negative, by gender
              ranksum mm if gender==0 & age==0 & hiv==0| gender==1 & age==0 &hiv==0, by(gender)
              
              *adults, hiv negative, by gender
              ranksum mm if gender==0 & age==1 & hiv==0| gender==1 & age==1 &hiv==0, by(gender)
              
              
              *adults male, by hiv status 
              ranksum mm if gender==1 & age==1 & hiv==0| gender==1 & age==1 &hiv==1, by(hiv)
              Code:
              . ranksum mm if gender==0 & age==0 & hiv==0| gender==1 & age==0 &hiv==0, by(gender)
              
              Two-sample Wilcoxon rank-sum (Mann-Whitney) test
              
                    gender |      obs    rank sum    expected
              -------------+---------------------------------
                         0 |        6          41          51
                         1 |       10          95          85
              -------------+---------------------------------
                  combined |       16         136         136
              
              unadjusted variance       85.00
              adjustment for ties        0.00
                                   ----------
              adjusted variance         85.00
              
              Ho: mm(gender==0) = mm(gender==1)
                           z =  -1.085
                  Prob > |z| =   0.2781
              
              . 
              . 
              . 
              . *adults, hiv negative, by gender
              
              . 
              . ranksum mm if gender==0 & age==1 & hiv==0| gender==1 & age==1 &hiv==0, by(gender)
              
              Two-sample Wilcoxon rank-sum (Mann-Whitney) test
              
                    gender |      obs    rank sum    expected
              -------------+---------------------------------
                         0 |        3          10          18
                         1 |        8          56          48
              -------------+---------------------------------
                  combined |       11          66          66
              
              unadjusted variance       24.00
              adjustment for ties        0.00
                                   ----------
              adjusted variance         24.00
              
              Ho: mm(gender==0) = mm(gender==1)
                           z =  -1.633
                  Prob > |z| =   0.1025
              
              
              . 
              . *adults male, by hiv status 
              
              . 
              . ranksum mm if gender==1 & age==1 & hiv==0| gender==1 & age==1 &hiv==1, by(hiv)
              
              Two-sample Wilcoxon rank-sum (Mann-Whitney) test
              
                       hiv |      obs    rank sum    expected
              -------------+---------------------------------
                         0 |        8          48          52
                         1 |        4          30          26
              -------------+---------------------------------
                  combined |       12          78          78
              
              unadjusted variance       34.67
              adjustment for ties        0.00
                                   ----------
              adjusted variance         34.67
              
              Ho: mm(hiv==0) = mm(hiv==1)
                           z =  -0.679
                  Prob > |z| =   0.4969
              So we cannot reject the null hypothesis of no difference in the distributions in any of the three comparisons.
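
              Because each comparison holds two of the three variables fixed, the if conditions can probably be shortened. A sketch, assuming gender and hiv take only the values 0 and 1, as in the data example:

              Code:
              * children, HIV negative, by gender
              ranksum mm if age==0 & hiv==0, by(gender)
              * adults, HIV negative, by gender
              ranksum mm if age==1 & hiv==0, by(gender)
              * adult males, by HIV status
              ranksum mm if gender==1 & age==1, by(hiv)
              Here by() already separates the two groups, so the if qualifier only needs to state what the groups have in common. In the longer form, & is evaluated before |, which is why it works without parentheses.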

              Another question: when I split the variable mm into two groups (based on a median cutoff), I found a difference between HIV-negative adult males and females.

              (bys hiv: tab mmc gender, ex)

              What is the best way to interpret these data: dichotomizing the variable into two groups, or analyzing it directly with ranksum?
              It appears that you are dichotomizing the "mm" variable here. By doing so, you are throwing away valuable information and there is some literature arguing that you should not do this. However, I have no clue what a molecular marker is and this is not my field of work, so consult your colleagues (or other forum members) to see if dichotomization makes sense here. There may be good reasons for it.
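
              To see the two approaches side by side on the same subgroup, a sketch along these lines might help. mm_high is a hypothetical variable name, and the cutoff used here is the overall sample median, which may not be the cutoff used to build mmc.

              Code:
              * mm_high is hypothetical: 1 if mm is above the overall median, else 0
              summarize mm, detail
              generate byte mm_high = mm > r(p50) if !missing(mm)
              * Fisher's exact test on the dichotomized marker, HIV-negative adults
              tabulate mm_high gender if age==1 & hiv==0, exact
              * rank-based test on the original values, same subgroup
              ranksum mm if age==1 & hiv==0, by(gender)
              The first test only uses which side of the cutoff each value falls on, while ranksum uses the full ordering of the values.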



              • #8
                Thank you, Andrew Musau.

                I did the graphs first; that was why I chose these groups to compare.

                Code:
                graph box mm, over(age) over(gender) over(hiv)

                [Figure: gender age hiv mm.png, box plots of mm over age, gender and hiv]

                If I test without dividing into groups, we do find a difference.


                Code:
                ranksum mm, by (gender)
                
                Two-sample Wilcoxon rank-sum (Mann-Whitney) test
                
                      gender |      obs    rank sum    expected
                -------------+---------------------------------
                           0 |        9          88         144
                           1 |       22         408         352
                -------------+---------------------------------
                    combined |       31         496         496
                
                unadjusted variance      528.00
                adjustment for ties        0.00
                                     ----------
                adjusted variance        528.00
                
                Ho: mm(gender==0) = mm(gender==1)
                             z =  -2.437
                    Prob > |z| =   0.0148
                Code:
                . ranksum mm if hiv==0, by (gender)
                
                Two-sample Wilcoxon rank-sum (Mann-Whitney) test
                
                      gender |      obs    rank sum    expected
                -------------+---------------------------------
                           0 |        9          88         126
                           1 |       18         290         252
                -------------+---------------------------------
                    combined |       27         378         378
                
                unadjusted variance      378.00
                adjustment for ties        0.00
                                     ----------
                adjusted variance        378.00
                
                Ho: mm(gender==0) = mm(gender==1)
                             z =  -1.955
                    Prob > |z| =   0.050
                Thank you again for your help, I will keep chasing it.
