
  • Create new variable based on multiple variables with specified values?

    Hi all,

    I’m trying to generate a new variable that tells me whether or not each participant in the dataset had an elevated value for a test. The threshold for elevation is specific to age group (30s, 40s, 50s, 60+) and gender. E.g. if you’re 51 and male, an elevated value would be 0.6 while if you’re 40 and female it may be 0.5.

    I recoded my continuous age variable into groups, with the variable agegrp:
    0 = less than 40
    1 = 40 – less than 50
    2 = 50 – less than 60
    3 = 60+

    gender
    male = 1
    female = 2

    I tried to start with the men, combining the four age-specific thresholds into one expression, with the idea that I could then extend it to the women:

    gen elevated_value = gender==1 & agegrp==0 & test_value > 0.553|gender==1 & agegrp==1 & test_value > 0.626|gender==1 & agegrp==2 & test_value > 0.687|gender==1 & agegrp==3 & test_value > 0.791

    However, this doesn’t seem to be working. I just want to create a variable that tells me that the test value is high (or not) for that person, given the gender and age category. Any ideas? Thanks in advance!

  • #2
    Nora:
    welcome to the list.
    You may want to combine -gender- and -age- into a single variable via -egen- with the -group()- function.
    You should end up with something along the following lines:
    Code:
    set obs 10
    g gender=0 in 1/5
    replace gender=1 in 6/10
    g age=40 in 1/2
    replace age=50 in 3/4
    replace age=60 in 5/6
    replace age=70 in 7/8
    replace age=80 in 9/10
    label define gender 0 "male" 1 "female"
    label val gender gender
    egen combined=group(gender age)
    Then you can -label- the different groups and -generate- the -elevated_value- accordingly.
    Kind regards,
    Carlo
    (Stata 19.0)
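
    One way to finish this step, sketched with the coding from #1 (gender 1 = male, agegrp 0 to 3) and the male cutoffs quoted there: store each group's cutoff in a helper variable, then compare once. The name -cutoff- is hypothetical, and the cutoffs for women are not given in #1, so those rows stay missing until they are filled in.

    ```stata
    * Sketch only: -cutoff- is a hypothetical helper variable.
    * Male cutoffs are the ones quoted in #1.
    gen double cutoff = .
    replace cutoff = 0.553 if gender == 1 & agegrp == 0
    replace cutoff = 0.626 if gender == 1 & agegrp == 1
    replace cutoff = 0.687 if gender == 1 & agegrp == 2
    replace cutoff = 0.791 if gender == 1 & agegrp == 3
    * Cutoffs for women (gender == 2) would be added the same way.

    * 1 if above the group's cutoff, 0 if not,
    * missing if the test value or the cutoff is missing.
    gen byte elevated_value = test_value > cutoff if !missing(test_value, cutoff)
    ```

    Keeping the cutoffs in a variable makes them easy to inspect, for example with -tab agegrp gender, summarize(cutoff)-.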



    • #3
      Thank you, Carlo! I managed to get the combined age group and gender variable thanks to your helpful advice. I haven't been successful with the second step: creating an elevated-value variable that flags whether each individual's test value exceeded the threshold for their age group and gender.


      This worked well to create a variable with a combo of age group and sex:
      *create age group matched with gender categories
      egen agesex=group(agegrp gender), label
      tab agesex

      This is the part with the bad syntax:

      generate elevated_value=.
      replace elevated_value=1 if {(agesex==1 & test_value>0.561)|
      (agesex==2 & test_value >0.543)|
      (agesex==3 & test_value >0.623)|
      (agesex==4 & test_value >0.605)|
      (agesex==5 & test_value >0.686)|
      (agesex==6 & test_value >0.651)|
      (agesex==7 & test_value >0.771)|
      (agesex==8 & test_value >0.682)}

      Any ideas?



      • #4
        The braces (curly brackets {}) should be parentheses (round brackets ()).
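
        A note on the multi-line layout, offered as a sketch rather than a definitive fix: Stata reads one command per line, so neither braces nor parentheses carry a command onto the next line. Inside a do-file, /// at the end of a line continues the command, and -#delimit ;- switches the command terminator to a semicolon. Neither works when typed into the Command window, and the continued lines have to be run together. The example below reuses the first two thresholds from #3 and assumes -agesex- and -test_value- already exist.

        ```stata
        * Run this from a do-file, executing the whole block at once.
        #delimit ;
        generate elevated_value = .;
        replace elevated_value = 1 if
            (agesex == 1 & test_value > 0.561) |
            (agesex == 2 & test_value > 0.543);
        #delimit cr
        ```

        After -#delimit cr- the usual one-command-per-line behaviour is back.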



        • #5
          thanks!

          I tried the following using that suggestion but it did not work:
          generate elevated_value=.
          replace elevated_value=1 if ((agesex==1 & test_value>0.561)|
          (agesex==2 & test_value >0.543)|
          (agesex==3 & test_value >0.623)|
          (agesex==4 & test_value >0.605)|
          (agesex==5 & test_value >0.686)|
          (agesex==6 & test_value >0.651)|
          (agesex==7 & test_value >0.771)|
          (agesex==8 & test_value >0.682))

          I had tried the brackets so that I could use multiple lines in my do file. Instead of that I then tried:
          generate elevated_value=.
          replace elevated_value=1 if ((agesex==1 & test_value>0.561)| ///
          (agesex==2 & test_value >0.543)| ///
          (agesex==3 & test_value >0.623)| ///
          (agesex==4 & test_value >0.605)| ///
          (agesex==5 & test_value >0.686)| ///
          (agesex==6 & test_value >0.651)| ///
          (agesex==7 & test_value >0.771)| ///
          (agesex==8 & test_value >0.682))

          This did not work either.
          For both of those attempts I had this message:

          too few ')' or ']'
          r(132);

          For the attempts with the curly brackets I had:
          invalid syntax
          r(198);

          What am I missing? I had tried using "[" instead, with no success. I'm not well versed in splitting lines, and I'm only attempting it now because this one command has to be long to incorporate everything I want.



          • #6
            Your second approach looks good to me. I suggest that (0, 1) indicators are more useful than (1, missing) indicators.

            Code:
            clear 
            set obs 100 
            set seed 2803
            gen test_value = runiform() 
            sort test_value 
            gen age_sex = ceil(_n/10) 
            
            generate elevated_value = ((age_sex==1 & test_value>0.561)|  ///
             (age_sex==2 & test_value >0.543)|  ///
             (age_sex==3 & test_value >0.623)|  /// 
             (age_sex==4 & test_value >0.605)|  ///
             (age_sex==5 & test_value >0.686)|  ///
             (age_sex==6 & test_value >0.651)|  ///
             (age_sex==7 & test_value >0.771)|  ///
             (age_sex==8 & test_value >0.682))
            
            tab age_sex elevated_value
            
                       |    elevated_value
               age_sex |         0          1 |     Total
            -----------+----------------------+----------
                     1 |        10          0 |        10 
                     2 |        10          0 |        10 
                     3 |        10          0 |        10 
                     4 |        10          0 |        10 
                     5 |        10          0 |        10 
                     6 |        10          0 |        10 
                     7 |        10          0 |        10 
                     8 |         0         10 |        10 
                     9 |        10          0 |        10 
                    10 |        10          0 |        10 
            -----------+----------------------+----------
                 Total |        90         10 |       100
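
            One caveat, assuming the real -test_value- can be missing (the simulated data above has no missing values): Stata treats numeric missing as larger than any number, so -test_value > 0.561- is true when -test_value- is missing and such observations would be flagged as elevated. A minimal sketch of a guard, using a toy dataset and the hypothetical names -naive- and -guarded-:

            ```stata
            clear
            set obs 4
            gen age_sex = 1
            gen test_value = 0.2 in 1
            replace test_value = 0.9 in 2
            * Observations 3 and 4 keep a missing test_value.

            * Unguarded: missing test values count as above the threshold.
            gen byte naive = (age_sex == 1 & test_value > 0.561)

            * Guarded: the indicator is left missing when test_value is missing.
            gen byte guarded = (age_sex == 1 & test_value > 0.561) if !missing(test_value)

            list
            ```

            The same -if !missing(test_value)- qualifier can be appended to the longer -generate- command above.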



            • #7
              That worked!! Thank you so much!!
