  • Saving & using multiple scalars to generate new variables

    Hi all,

    I am trying to create a new indicator variable based on whether an individual's height, age, or bmi is in the top 5 percent for white, asian, or other. Here is what I have so far. I am struggling to create the new variable based on the minimum of the 95th percentile values, which I store in a local.
    Many thanks in advance for any help on this.

    Kind regards,
    Michelle

    local VARS height age bmi
    local RACE2 white asian other

    foreach var of local VARS {
    foreach race2 of local RACE2 {
    summarize `var' if race2 == "`race2'", d
    local p95 `r(p95)'
    dis "for variable `var' & race2 == `race2'; p95 = `r(p95)', mean (SD) = `r(mean)' (`r(sd)')
    }
    foreach var of local VARS
    gen `var'_group = 1 if `var' >= `p95' & `var' < .
    replace `var'_group = 0 if `var' < `p95'
    }
    }
    Last edited by Michelle Hall; 14 Jun 2021, 16:49.

  • #2
    Here is some untested code that may start you in a useful direction.
    Code:
    local VARS height age bmi
    local RACE2 white asian other
    
    foreach var of local VARS {
        generate `var'_group = .
        foreach race2 of local RACE2 {
            summarize `var' if race2 == "`race2'", d
            dis "for variable `var' & race2 == `race2'; p95 = `r(p95)', mean (SD) = `r(mean)' (`r(sd)')"
            replace `var'_group = 1 if race2 == "`race2'" & `var' >= `r(p95)' & `var' < .
            replace `var'_group = 0 if race2 == "`race2'" & `var' < `r(p95)'
        }
    }


    • #3
      Dear William,

      Thank you for your suggestion. The problem I am still having is that the cut-off value used to assign 1 in the newly generated variables is not based on the minimum of the 95th percentiles across the races. I have tried:
      Michelle

      Code:
       
      local VARS height age bmi
      local RACE2 white asian other
            foreach var of local VARS {
            generate `var'_group = .
                 foreach race2 of local RACE2 {
                 summarize `var' if race2 == "`race2'", d
                 dis "for variable `var' & race2 == `race2'; p95 = `r(p95)', mean (SD) = `r(mean)' (`r(sd)')"
                 replace `var'_group = 1 if race2 == "`race2'" & `var' >= min(`r(p95)') & `var' < .
                 replace `var'_group = 0 if race2 == "`race2'" & `var' < min(`r(p95)')
            }
      }
      Last edited by Michelle Hall; 14 Jun 2021, 18:29.


      • #4
        I am afraid I do not understand what you mean by "the minimum of the 95th for each race".

        Within each race, you have three different 95th percentiles, from three very different distributions, and comparing them seems unhelpful. The 95th percentile of bmi seems to me to be guaranteed to be lower than the 95th percentile of weight (in either pounds or kilograms) and height (in inches or centimeters).

        Try writing what you expect to do. Can you complete the following description?

        1) compute the 95th percentile of height, age, and bmi for each of three races - a total of nine numbers. Let's call these values p95_white_height, p95_asian_height, p95_other_height, p95_white_age, ... , p95_other_bmi

        2) then what?
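
        Step 1 above could be sketched roughly as follows. This is only an illustration under stated assumptions: the data are simulated so the block runs on its own, race2 is assumed to be a string variable with the values white, asian and other, and the scalar names simply follow the p95_white_height pattern used in the description.

```stata
* Simulated stand-in data (an assumption; not the poster's dataset)
clear
set seed 12345
set obs 300
generate str5 race2 = cond(mod(_n, 3) == 0, "white", cond(mod(_n, 3) == 1, "asian", "other"))
generate height = rnormal(170, 10)
generate age    = floor(18 + 63*runiform())
generate bmi    = rnormal(25, 4)

local VARS height age bmi
local RACE2 white asian other

* Store each 95th percentile in its own named scalar: nine scalars in total
foreach var of local VARS {
    foreach race2 of local RACE2 {
        quietly summarize `var' if race2 == "`race2'", detail
        scalar p95_`race2'_`var' = r(p95)
    }
}

* Show all stored scalars
scalar list
```

        When a stored value is used later in an expression, writing scalar(p95_white_height) rather than the bare name avoids any clash with a variable of the same or an abbreviated name.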


        • #5
          Thank you for your help with this. Apologies for the confusion:

          2) then take the minimum of the three 95th percentile values for age (one each for white, asian and other) and generate a new variable equal to 1 if age is at or above that minimum, and 0 otherwise. Then repeat this process for BMI. Point taken about height and weight.

          Thanks again.
          Michelle


          • #6
            Perhaps this does what you want.
            Code:
            local VARS height age bmi
            local RACE2 white asian other
            
            foreach var of local VARS {
                * start at missing; min() ignores missing arguments
                local p95min = .
                foreach race2 of local RACE2 {
                    summarize `var' if race2 == "`race2'", d
                    display "for variable `var' & race2 == `race2'; p95 = `r(p95)', mean (SD) = `r(mean)' (`r(sd)')"
                    * keep the smallest 95th percentile seen so far
                    local p95min = min(`r(p95)',`p95min')
                }
                display "for variable `var' p95min = `p95min'"
                * 1 if at or above the cutoff, 0 if below, missing if `var' is missing
                generate `var'_group = .
                replace  `var'_group = 1 if `var' >= `p95min' & `var' < .
                replace  `var'_group = 0 if `var' <  `p95min'
            }
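
            A variant of the loop in #6 that keeps the cutoff in a scalar rather than a local macro, in line with the thread title, might look like the sketch below. It is not a replacement for the code above. The data are simulated so that the block runs on its own, and the scalar name p95min and the single-line generate are choices made here for illustration.

```stata
* Simulated stand-in data (an assumption; not the poster's dataset)
clear
set seed 12345
set obs 300
generate str5 race2 = cond(mod(_n, 3) == 0, "white", cond(mod(_n, 3) == 1, "asian", "other"))
generate age = floor(18 + 63*runiform())
generate bmi = rnormal(25, 4)

foreach var in age bmi {
    * start at missing; min() ignores missing arguments
    scalar p95min = .
    foreach race2 in white asian other {
        quietly summarize `var' if race2 == "`race2'", detail
        scalar p95min = min(r(p95), scalar(p95min))
    }
    display "for variable `var' p95min = " scalar(p95min)
    * 1 if at or above the cutoff, 0 if below, missing where `var' is missing
    generate byte `var'_group = (`var' >= scalar(p95min)) if !missing(`var')
}

tabulate age_group
tabulate bmi_group
```

            Because the cutoff is the smallest of the three race-specific 95th percentiles, the indicator will usually flag somewhat more than 5 percent of the sample overall.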


            • #7
              Thank you so much William, this does exactly what I want.
              Again, many thanks
              Michelle
