Dear STATALIST,
I am coming to you with an embarrassingly simple request.
I have three variables (resp1, resp2, resp3), representing 3 response options that each voter could pick (only one response) at each voting round. The votes were conducted in households, and in each household, all eligible inhabitants could vote (so more than one voter per household). The voting round is represented by variable round (4 voting rounds), and variable hhid is household ID. So in each household and at each round, you have x number of voters who picked resp1, y number who picked resp2, etc. Variable nbvoter is the number of voters per household.
All I wish to do is, for each voting round, calculate the total number of respondents who voted for each response option, so that I could say for instance: At round 1, n (30%) voters voted for resp1, n (10%) voters for resp2, and n (60%) voters for resp3.
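The round-level totals described above could be sketched as follows. This is a minimal sketch, not a definitive solution: it assumes the variable names in this post (resp1, resp2, resp3, nbvoter, round), and the pct_* names are made up for illustration. The first line is only a sanity check: in the data example some households have response counts that do not add up to nbvoter (for instance k001 has 0, 9 and 9 with nbvoter 9), so the choice of denominator may need a second look.

```stata
* Sanity check: how many rows have response counts that do not sum to nbvoter?
count if resp1 + resp2 + resp3 != nbvoter

* Sum votes and voters over households within each round
preserve
collapse (sum) resp1 resp2 resp3 nbvoter, by(round)

* Percentage of all voters in the round choosing each option
* (pct_* are hypothetical names, not variables in the original data)
foreach v of varlist resp1 resp2 resp3 {
    generate pct_`v' = 100 * `v' / nbvoter
}

list round resp1 pct_resp1 resp2 pct_resp2 resp3 pct_resp3 nbvoter, noobs
restore
```

The preserve/restore pair keeps the household-level data in memory, which would still be needed for any survey-based confidence intervals.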
What Stata is doing, for resp1 for instance, is calculating the number of times resp1 was not picked, the number of times it was picked by a single person, the number of times it was picked by 2 respondents, etc. (so frequencies). However, given that the variables are not categorical but continuous, this does not make sense. For each voting round, and by answer option, I need to compute the total number of respondents who picked, say, resp1, and divide it by the total number of voters. I will also have to use svy to compute 95% CIs around the computed proportions to account for repeated observations within households; the data are already svyset.
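The quantity described here (votes for an option summed over households, divided by voters summed over households) is a ratio of totals, so one possible route is Stata's ratio command with the svy prefix. The sketch below assumes the data are already svyset as stated and that nbvoter is the intended denominator. The labels p1, p2 and p3 are arbitrary names chosen for illustration.

```stata
* Ratio of summed votes to summed voters, separately for each round,
* with design-based standard errors and 95% confidence intervals
svy: ratio (p1: resp1/nbvoter) (p2: resp2/nbvoter) (p3: resp3/nbvoter), over(round)
```

Because the intervals come from the survey design, whether they reflect repeated observations on the same household depends on how the svyset declaration defines the sampling units.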
I have tried to collapse the data several times in different ways to generate new variables which would enable me to solve the issue, but to no avail. I would much appreciate your help.
Here is the dataset.
Thank you very much again.
Code:
* Example generated by -dataex-. To install: ssc install dataex
clear
input str7 hhid float(resp1 resp2 resp3 nbvoter round)
"k001" 0 9 9 9 1
"a001" 1 9 16 17 1
"e001" 0 2 5 7 1
"1741001" 0 0 15 15 1
"2989001" 0 3 6 9 1
"2267001" 3 1 9 13 1
"3184001" 0 1 6 7 1
"s001" 1 3 9 13 1
"3018001" 0 2 3 5 1
"2759001" 0 1 20 21 1
"1479001" 4 7 5 16 1
"2835001" 2 9 14 16 1
"2441001" 0 9 8 8 1
"1760001" 0 3 7 10 1
"v001" 0 9 11 11 1
"y001" 0 9 6 6 1
"1713001" 0 1 7 8 1
"l001" 0 6 1 7 1
"3124001" 3 0 15 18 1
"n001" 2 2 4 8 1
"2761001" 0 4 7 11 1
"3612001" 0 1 12 13 1
"w001" 0 9 10 10 1
"3612002" 2 3 8 13 2
"3124002" 0 0 6 6 2
"1741002" 0 1 7 8 2
"3184002" 0 0 6 6 2
"2989002" 0 0 6 6 2
"y002" 0 2 7 9 2
"s002" 1 0 7 8 2
"2441002" 0 0 10 10 2
"e002" 1 1 4 6 2
"v002" 1 2 7 10 2
"2267002" 0 1 9 10 2
"1479002" 0 3 6 9 2
"2835002" 0 0 6 6 2
"w002" 0 1 9 10 2
"1713002" 0 2 8 10 2
"l002" 0 5 1 6 2
"3018002" 1 0 6 7 2
"a002" 0 4 1 5 2
"k002" 0 0 7 7 2
"2759002" 0 0 9 9 2
"n002" 0 3 6 9 2
"1760002" 1 0 7 8 2
"2761002" 0 0 9 9 2
"e003" 0 0 7 7 3
"2759003" 0 0 7 7 3
"3124003" 0 0 12 12 3
"2835003" 0 0 6 6 3
"n003" 0 0 6 6 3
"s003" 0 0 11 11 3
"w003" 1 0 7 8 3
"a003" 0 0 7 7 3
"2989003" 0 0 6 6 3
"3612003" 1 0 5 6 3
"1713003" 0 0 8 8 3
"2441003" 0 0 5 5 3
"k003" 0 1 7 8 3
"3018003" 1 2 7 10 3
"1479003" 0 2 8 10 3
"2267003" 2 1 8 11 3
"1741003" 0 1 6 7 3
"y003" 0 2 5 7 3
"1760003" 0 0 5 5 3
"3184003" 0 0 6 6 3
"l003" 0 0 8 8 3
"2761003" 1 0 5 6 3
"v003" 0 0 6 6 3
"a004" 0 0 6 6 4
"2441004" 0 1 8 9 4
"1713004" 0 1 10 11 4
"3184004" 0 0 6 6 4
"k004" 0 1 7 8 4
"1760004" 0 1 8 9 4
"3018004" 0 0 7 7 4
"1479004" 2 3 6 11 4
"1741004" 0 0 10 10 4
"2989004" 0 0 6 6 4
"3612004" 0 4 7 11 4
"2835004" 1 1 5 7 4
"n004" 0 0 12 12 4
"l004" 1 0 9 10 4
"v004" 0 0 5 5 4
"2267004" 0 1 9 10 4
"e004" 0 1 8 9 4
"y004" 0 1 7 8 4
"s004" 0 0 8 8 4
"2759004" 0 3 8 11 4
"w004" 3 2 5 10 4
"3124004" 0 0 13 13 4
"2761004" 0 0 9 9 4
end