  • create new categorical variable with different cutoffs dependent on condition (sex)

    Dear forum,

    I have a dataset of 407 participants, and I am trying to categorise them by their hip to waist ratio. This is a continuous variable from which I would like to create a new categorical variable.

    I am trying to categorise my continuous hip:waist ratio into a new categorical variable of three categories 1 "low" 2 "average" 3 "high".

    However, the cutoffs for low, average, and high are different for males and females.

    Is there a way to do this with the egen cut() function, or is something more complex needed?

    My variables: sex
    waisthip


    Grateful for any help,


  • #2
    Update:
    I managed (rather inelegantly) to get halfway there by doing the following:
    tabulate waisthip, m
    generate waisthipsexmale = waisthip if sex==1
    recode waisthipsexmale (min/0.91=2)
    recode waisthipsexmale (min/0.97=3)
    recode waisthipsexmale (min/1.5=4)
    tabulate waisthipsexmale, m

    generate waisthipsexfemale=waisthip
    recode waisthipsexfemale (min/0.86=2)
    recode waisthipsexfemale (min/0.93=3)
    recode waisthipsexfemale (min/1.9=4)
    tabulate waisthipsexfemale, m

    I now have two new variables which are appropriately recoded. Is there any way to combine them back into a single variable?



    • #3
      I have now done it by generating a new variable, after finding much useful advice in previous posts. Apologies for wasting the time of anyone who has read this, and thank you for such a useful resource. I will read more previous posts in future. Thanks all.



      • #4
        So, you have a measured quantity and you want to degrade it to categories because 1.49 and 0.97 (say) have the same implications for males? See e.g. https://onlinelibrary.wiley.com/doi/....1002/sim.2331 for some advice against similar habits.

        (Stata technique anyway...)

        If I understand this correctly, your breakpoints for males are 0.91 0.97 1.5 and for females 0.86 0.93 1.9, but I don't understand your recode statements, which appear to overwrite each other and sometimes work on data for both sexes.

        Compare this reproducible example.

        Code:
        . clear
        
        . sysuse auto
        (1978 Automobile Data)
        
        . recode mpg min/20=1
        (mpg: 38 changes made)
        
        . tab mpg
        
            Mileage |
              (mpg) |      Freq.     Percent        Cum.
        ------------+-----------------------------------
                  1 |         38       51.35       51.35
                 21 |          5        6.76       58.11
                 22 |          5        6.76       64.86
                 23 |          3        4.05       68.92
                 24 |          4        5.41       74.32
                 25 |          5        6.76       81.08
                 26 |          3        4.05       85.14
                 28 |          3        4.05       89.19
                 29 |          1        1.35       90.54
                 30 |          2        2.70       93.24
                 31 |          1        1.35       94.59
                 34 |          1        1.35       95.95
                 35 |          2        2.70       98.65
                 41 |          1        1.35      100.00
        ------------+-----------------------------------
              Total |         74      100.00
        
        . recode mpg min/30=2
        (mpg: 69 changes made)
        
        . tab mpg
        
            Mileage |
              (mpg) |      Freq.     Percent        Cum.
        ------------+-----------------------------------
                  2 |         69       93.24       93.24
                 31 |          1        1.35       94.59
                 34 |          1        1.35       95.95
                 35 |          2        2.70       98.65
                 41 |          1        1.35      100.00
        ------------+-----------------------------------
              Total |         74      100.00
        
        . recode mpg min/40=3
        (mpg: 73 changes made)
        
        . tab mpg
        
            Mileage |
              (mpg) |      Freq.     Percent        Cum.
        ------------+-----------------------------------
                  3 |         73       98.65       98.65
                 41 |          1        1.35      100.00
        ------------+-----------------------------------
              Total |         74      100.00

        Three breakpoints create four classes. I dislike recode, which must be pure prejudice, because many people like it. If I recall correctly, the dark secret is that it was based on syntax used by Some Previous Statistical Software.
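
        For anyone who does prefer recode, the overwriting shown in the example above can be avoided by putting all the rules in a single call. This is only a sketch on the same auto data; the new variable name mpg_cat and the integer cutoffs are illustrative choices, not anything from the original question.

        ```stata
        * Sketch on the auto data used in the example above.
        sysuse auto, clear

        * All rules go in ONE recode call, so no rule re-recodes the output of another.
        * generate() writes the result to a new variable (mpg_cat is a hypothetical name)
        * and leaves mpg itself untouched.
        * mpg holds integers, so these three ranges do not overlap.
        recode mpg (min/20=1) (21/30=2) (31/max=3), generate(mpg_cat)

        tabulate mpg_cat
        ```

        Going by the frequencies tabulated above, the three classes should hold 38, 31 and 5 cars. With a non-integer variable such as waisthip, adjacent ranges would have to share endpoints, so which class a boundary value lands in needs checking.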

        Here is one way to calculate male categories. Naturally the details should be modified if you want different inequalities at breakpoints or the breakpoints are different.

        Code:
        local w waisthip 
        gen male_cat = cond(`w' <= 0.91, 1, cond(`w' <= 0.97, 2, cond(`w' <= 1.5, 3, 4))) if sex == 1 & `w' < .

        It takes a little practice to read (indeed, to write) cond() statements, but it's entirely possible. More at https://www.stata-journal.com/articl...article=pr0016
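
        The same idea extends to a single variable covering both sexes: generate it for males, then replace it for females using their breakpoints. What follows is a minimal sketch under stated assumptions: the toy data are made up, the name whr_cat is hypothetical, and females are assumed to be coded sex == 2 (the thread only shows sex == 1 for males).

        ```stata
        * Toy data, made up purely for illustration.
        * ASSUMPTION: sex is coded 1 = male, 2 = female.
        clear
        input byte sex float waisthip
        1 0.85
        1 0.95
        1 1.20
        2 0.85
        2 0.90
        2 1.00
        2 .
        end

        * Male breakpoints 0.91 0.97 1.5, as in the cond() line above.
        generate byte whr_cat = cond(waisthip <= 0.91, 1, cond(waisthip <= 0.97, 2, cond(waisthip <= 1.5, 3, 4))) if sex == 1 & waisthip < .

        * Female breakpoints 0.86 0.93 1.9 fill in the rows left missing so far.
        replace whr_cat = cond(waisthip <= 0.86, 1, cond(waisthip <= 0.93, 2, cond(waisthip <= 1.9, 3, 4))) if sex == 2 & waisthip < .

        * Observations with missing waisthip stay missing in whr_cat.
        list sex waisthip whr_cat, sepby(sex)
        ```

        If waisthip is stored as a float and some observations sit exactly on a breakpoint, comparing against float(0.91) and so on, rather than 0.91, may be needed to get the intended class.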


