
  • How to randomly assign integer weights from a series of decimal weights

    Good Morning Everyone,

    I am attempting to find a way in Stata to randomly assign integer weights to a dataset from a series of decimal weights so that the number of people we are weighting up to remains the same.

    For example:

    We have a group of five people who have each been assigned a weight of 4.2.

    Person # | Weight
    Person 1 | 4.2
    Person 2 | 4.2
    Person 3 | 4.2
    Person 4 | 4.2
    Person 5 | 4.2

    Weighted Total = 21 people


    Instead, I am attempting to find a way to assign a weight of 4 to four of those people and a weight of 5 to the fifth, with the person who receives the 5 selected at random from the group of five. This way, we still weight up to 21 people. If we rounded every weight to the nearest integer, we would only be able to weight up to 20.

    Rounded to nearest integer:
    Person # | Weight
    Person 1 | 4
    Person 2 | 4
    Person 3 | 4
    Person 4 | 4
    Person 5 | 4

    Weighted Total = 20 people

    Ideal scenario:
    Person # | Weight
    Person 1 | 4
    Person 2 | 4
    Person 3 | 5
    Person 4 | 4
    Person 5 | 4

    Weighted Total = 21 people


    Any help would be much appreciated. Please let me know if I can clarify my endeavor further.

    Thanks a lot!

  • #2
    Like this?

    Code:
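    clear
    input person weight
    1 4.2
    2 4.2
    3 4.2
    4 4.2
    5 4.2
    end
    * total of the original decimal weights
    summarize weight, meanonly
    local sum = r(sum)
    * truncate each weight to its integer part and total again
    gen newweight = int(weight)
    summarize newweight, meanonly
    local newsum = r(sum)
    * put the observations in random order
    gen x = runiform()
    sort x
    * give the shortfall (rounded total minus truncated total) to the first observation
    replace newweight = int(weight) + round(`sum', 1) - round(`newsum', 1) if _n == 1
    sort person
    drop x
    * compare the two totals and list the result
    tabstat weight newweight, stat(sum)
    list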
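The `_n == 1` step above hands the whole shortfall to a single person, which is what the example needs because only one extra unit is missing. When several units are missing, one option is to give +1 to that many randomly chosen people instead. A minimal sketch of that variation, assuming a made-up weight of 4.6 (total 23) and an arbitrary seed; the names `target`, `k`, and `u` are illustrative and do not come from the posts:

```stata
clear
set seed 2014                        // arbitrary seed, used only for reproducibility
input person weight
1 4.6
2 4.6
3 4.6
4 4.6
5 4.6
end

gen newweight = floor(weight)        // every person starts at 4

summarize weight, meanonly
local target = round(r(sum))         // rounded total of the decimal weights
summarize newweight, meanonly
local k = `target' - r(sum)          // number of people who still need +1

gen double u = runiform()
sort u                               // random order
if `k' > 0 {
    replace newweight = newweight + 1 in 1/`k'
}

sort person
drop u
tabstat weight newweight, statistics(sum)
```

With these assumed weights, three people end up with 5 and two with 4, so the integer weights total 23.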



    • #3
      Hey Scott,

      Thanks so much for your reply! Your code worked for a set of 5 people, and helped me to devise a way to accomplish this for any data set. To be clear, I have developed code that will change decimal weights for a given data set into integer weights, but will keep the total number of cases approximately the same by randomly assigning +1 to the appropriate number of cases. For future reference, I have printed my code here with annotations:

      *****************************************
      set more off

      sort pweight /*or whatever your original decimal weight variable is called*/

      egen weightv = group(pweight) /*creates a variable that groups the values of weights, so all 1's are together, all 1.2's are together, all 2's are together, etc.*/

      tab weightv /*to see how many groups of weight you have, as groups will be numbered from 1 to n, where n = the number of groups of decimal weights in your data set*/

      set seed /*insert seed number here*/

      gen x = runiform()

      sort weightv x /*sorts data into weight groups, then puts the data in random order within each group*/

      gen finalweight = int(pweight) /*rounds each decimal weight down to nearest integer*/

      /* For each number 1 to 70 (the total number of weight groups in my data set), we multiply the number of records in the group by the difference between the decimal weight and the integer weight to get the number of records in the group that need a weight that is 1 higher. The line numbers are then used to pick that many records within each group. */
      gen long xrow = _n /*stores the current line number of each record; egen's documentation advises against using _n directly inside egen functions. The x prefix means it is removed by "drop x*" below*/
      forval num = 1/70 {
      count if weightv == `num' /*how many observations are in the particular weight group (note ==, not =)*/
      gen y`num' = r(N) /*generating a variable that represents the number of observations in the particular weight group*/
      replace y`num' = round(y`num' * (pweight - int(pweight))) if weightv==`num' /*the variable now holds, for this weight group, the number of obs that need their weight changed*/
      egen x`num' = min(xrow) if weightv==`num' /*generates a variable that represents the first line number of the particular weight group*/
      replace finalweight = (finalweight + 1) if weightv==`num' & xrow<(x`num' + y`num') /*adds 1 to the appropriate number of records in each weight group, randomly*/
      }

      *check to see if totals are approximately the same
      total pweight
      total finalweight


      *drop unneeded variables
      drop x*
      drop y*


      ***************************************

      Please feel free to let me know if you have any questions about or suggestions for my code. Thanks again for your help!

      Ryan
      Last edited by Ryan Klein; 26 Aug 2014, 09:40.
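The per-group loop above can also be expressed with by-groups, which avoids hard-coding the number of weight groups and the extra `x`/`y` variables. What follows is only a sketch under assumptions: toy data with three made-up weights (4.2, 1.4, and 2), an arbitrary seed, and helper names `u` and `nbump` that are not in the original code.

```stata
clear
set seed 2014                        // arbitrary seed, used only for reproducibility
set obs 15
* toy data: five records each with weight 4.2, 1.4, and 2 (assumed values)
gen double pweight = cond(_n <= 5, 4.2, cond(_n <= 10, 1.4, 2))

gen double u = runiform()

* within each distinct weight, in random order, work out how many records need +1
bysort pweight (u): gen long nbump = round(_N * (pweight - floor(pweight)))

* the first nbump records of each group get floor + 1, the rest get floor
by pweight: gen long finalweight = floor(pweight) + (_n <= nbump)

drop u nbump

* check that the totals are approximately the same
total pweight
total finalweight
```

With this toy data both totals come to 38. In general the two totals agree only approximately, because the number of records bumped is rounded separately within each weight group, as in the loop version.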
