Adding weights to variables containing within group percentages

Tim Bergmann

Join Date: Nov 2022
Posts: 2

18 Nov 2022, 06:44

Hello, I have a question about adding weights to survey data.

Code:

clear
input str20 Country double projection_weight byte L5 float perc_L5
"Cameroon"           11120.53503054633  1      43.4
"Chile"             10973.507432230832  1   86.8868
"Chile"              8541.300724619616  1   86.8868
"China"              211097.9105238506  1  22.35104
"Congo Brazzaville"  7753.040300652525  2  20.91743
"Croatia"            3540.037078778114  1  52.22222
"Guatemala"          4633.447452234091  1  61.63636
"Guinea"             5669.176581309484  1  51.40351
"Hungary"           2042.3258052764882  1  67.12963
"India"             370129.64921075094 98 18.004147
"Italy"              75734.34207291917  1      70.4
"Jordan"             3650.272141344148  2  39.76024
"Kuwait"            1390.7818018971584  3 16.019417
"Liberia"           3387.1228642331657  1      54.4
"Mexico"            164398.15696774196  1 69.330666
"North Macedonia"   3331.2486706403165  1  60.18518
"Panama"             3247.522338947211  2 18.518518
"Poland"            65426.710048699206  1  52.40741
end

I have individual-level data for over 100 countries. The variable L5 can take four values, and I want to know what percentage of the observations within each Country falls on each value.
First, I tried to calculate the percentages shown in the variable perc_L5.

Code:

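* Count the observations in each Country-L5 cell
bysort Country L5: gen prop = _N
* Divide by the number of observations in the country (unweighted)
by Country: replace prop = 100 * prop/_N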

Because these unweighted results would not be accurate, how can I rewrite the code to apply weights to the observations?

George Ford

Join Date: Aug 2014
Posts: 3138

18 Nov 2022, 07:02

I added some data so that the calculation could be evaluated.

Code:

clear
input str20 Country double projection_weight byte L5 float perc_L5
"Cameroon"           11120.53503054633  1      43.4
"Chile"             10973.507432230832  1   86.8868
"Chile"              8541.300724619616  1   86.8868
"China"              211097.9105238506  1  22.35104
"Congo Brazzaville"  7753.040300652525  2  20.91743
"Croatia"            3540.037078778114  1  52.22222
"Guatemala"          4633.447452234091  1  61.63636
"Guinea"             5669.176581309484  1  51.40351
"Hungary"           2042.3258052764882  1  67.12963
"India"             370129.64921075094 98 18.004147
"Italy"              75734.34207291917  1      70.4
"Jordan"             3650.272141344148  2  39.76024
"Kuwait"            1390.7818018971584  3 16.019417
"Liberia"           3387.1228642331657  1      54.4
"Mexico"            164398.15696774196  1 69.330666
"North Macedonia"   3331.2486706403165  1  60.18518
"Panama"             3247.522338947211  2 18.518518
"Poland"            65426.710048699206  1  52.40741

"Cameroon"           11120.53503054633  2      43.4
"Chile"             10973.507432230832  2   86.8868
"Chile"              8541.300724619616  3   86.8868
"China"              211097.9105238506  2  22.35104
"Congo Brazzaville"  7753.040300652525  3  20.91743
"Croatia"            3540.037078778114  2  52.22222
"Guatemala"          4633.447452234091  98  61.63636
"Guinea"             5669.176581309484  2  51.40351
"Hungary"           2042.3258052764882  3  67.12963
"India"             370129.64921075094  2 18.004147
"Italy"              75734.34207291917  2      70.4
"Jordan"             3650.272141344148  1  39.76024
"Kuwait"            1390.7818018971584  2 16.019417
"Liberia"           3387.1228642331657  98      54.4
"Mexico"            164398.15696774196  2 69.330666
"North Macedonia"   3331.2486706403165  2  60.18518
"Panama"             3247.522338947211  1 18.518518
"Poland"            65426.710048699206  2  52.40741
end

bysort Country: gen totcount = _N 
bysort Country L5: gen L5count = _N 
by Country: g prop = 100 * L5count/totcount

Tim Bergmann

Join Date: Nov 2022
Posts: 2

18 Nov 2022, 08:05

Thank you for the fast reply. Unfortunately, the result is the same as the one I had before.
Admittedly, the data example was a bit short, so I prepared this one:

Code:

clear
input str20 Country double projection_weight byte L5
"Argentina"  72469.49024515405  1
"Argentina" 38175.479902532374  1
"Argentina" 10344.113584815686  1
"Argentina"  13304.34326130613  1
"Argentina" 32442.106755731922  1
"Argentina" 14904.989909521484  1
"Argentina" 26818.060658717528  1
"Argentina"  8477.792322793595  1
"Argentina" 38114.442527840496  1
"Argentina"  33779.38315709478  1
"Argentina"  8601.732608068187  2
"Chile"     11233.189109101633  1
"Chile"     30451.322637124158  1
"Chile"      6418.568607528032  1
"Chile"        5925.4961486188  1
"Chile"      6418.568607528032  1
"Chile"        5671.3947625337  1
"Chile"      4982.109525043076  1
"Chile"      7439.584914735958  1
"Chile"      31328.01872372968  1
"Chile"     13542.925150552212  1
"Chile"     10678.479235837947  1
"Chile"       11850.9922972376  1
"Chile"     12665.355641090746  2
"Chile"     15770.762179453848  2
"Germany"   23506.413243750743  1
"Germany"   31689.304472218402  1
"Germany"    75829.11830406115  1
"Germany"    53452.73260016138  1
"Germany"   58255.412092256745  1
"Germany"   25067.480354849886  2
"Germany"   112135.42338214809  2
"Germany"   20459.242487361444  2
"Germany"     98136.4130176962  2
"Germany"   204592.42487361442  2
"Germany"    25378.48467264748  2
"Germany"    72828.84298011378  3
"Germany"   169250.12805237703 98
"Germany"    93643.06511706572 99
end

This time I reduced it to the important variables (Country, the projection weight, and the variable of interest, L5) for three countries.
The generated variable should contain the results shown in either of the last two columns of this table.

Code:

  tab L5[iweight=projection_weight] if Country == "Chile" & _randomtag == 1
 Do you think that |
    climate change is a |
 very serious threat, a |
    somewhat serious th |      Freq.     Percent        Cum.
------------------------+-----------------------------------
    Very serious threat | 145,940.65       83.69       83.69
Somewhat serious threat | 28,436.118       16.31      100.00
------------------------+-----------------------------------
                  Total | 174,376.77      100.00
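
One possible way to get a variable holding these weighted within-country percentages is to sum the weights instead of counting observations. The lines below are only a sketch: they assume the three-country example data above are already in memory, and the names wt_total, wt_L5, and wprop are made up for illustration. Whether iweights are the right weight type for this survey is a separate question that the sketch does not settle.

```stata
* Sketch only: assumes Country, L5, and projection_weight are in memory.
* wt_total, wt_L5, and wprop are hypothetical variable names.

* Sum of weights over all observations in each country
bysort Country: egen double wt_total = total(projection_weight)

* Sum of weights in each Country-L5 cell
bysort Country L5: egen double wt_L5 = total(projection_weight)

* Weighted percentage of each L5 value within its country
gen double wprop = 100 * wt_L5 / wt_total

* Compare one country against the weighted tabulation
tabulate L5 [iweight = projection_weight] if Country == "Chile"
list Country L5 wprop if Country == "Chile", sepby(L5)
```

With the example data, the wprop values for Chile should line up with the Percent column of the table above (about 83.69 for L5 == 1 and 16.31 for L5 == 2), because the weights in those two cells sum to the frequencies that tabulate reports.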
