  • Recover population size by group

    Hi,

    I am trying to construct a "population" variable from household survey data. The data are collected at the individual and household levels, and survey weights are reported. My goal is a measure of population size at the district level (code below). I have generated a variable that counts the number of respondents by district, but because it ignores the weights it only reports the number of observations per district. Is there a way to recover the district-level population? Thank you.

    Code:
    bysort district: egen pop4 = count(hhidall)

    Here is the data example

    Code:
    * Example generated by -dataex-. To install: ssc install dataex
    clear
    input float ivid_all double(hhidall district) float hhwt
     1.010109e+13  101010898944 10101  441.1645
     1.010109e+13  101010898944 10101  441.1645
    1.0101212e+13  101012111360 10101 411.47345
     1.010121e+14 1010121048064 10101       669
     1.010109e+14 1010108989440 10101       821
     1.010115e+14 1010115018752 10101       450
     1.010121e+13  101012103168 10101       449
     1.010109e+13  101010898944 10101       536
    1.1111287e+14  101011701760 10101       449
    1.0101031e+13  101010317312 10101  232.2846
    2.0202301e+13  101011496960 10101       506
    1.0101171e+13  101011701760 10101       537
    3.0303094e+13  101010300928 10101       449
     2.020242e+13  101012103168 10101       505
     1.010103e+14 1010103025664 10101       635
     1.010121e+14 1010121048064 10101       669
    3.0303094e+13  101010300928 10101       505
     1.010109e+13  101010898944 10101       536
    1.2121476e+14  101012299776 10101       534
    1.0101031e+13  101010317312 10101  232.2846
     1.010121e+14 1010121048064 10101       669
     1.010121e+13  101012103168 10101       449
    end

  • #2
    Check your survey documentation to verify that hhwt is the correct weight to use (the inverse of the probability of selection) and that no other weights or design variables need to be specified for higher stages of sampling. Then -svyset- your data appropriately and use -svy: tab, count-:
    Code:
    svyset [pweight = hhwt]
    svy: tab district, count
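
    If you want the estimate stored as a variable rather than shown in a table, here is a minimal sketch. It assumes hhwt is the final expansion weight and applies to every person in the household (please confirm this in the documentation). Under that assumption, the sum of the weights within a district is the same point estimate that -svy: tab, count- reports. The variable name pop_wt is only an illustration.
    Code:
    * Assumption: hhwt is the final expansion weight for each person record
    * pop_wt (hypothetical name) = sum of weights within district,
    * i.e. the estimated number of people the district's respondents represent
    bysort district: egen double pop_wt = total(hhwt)

    * Compare with the unweighted number of observations per district
    tabstat pop4 pop_wt, by(district) statistics(mean)

    This gives point estimates only. For standard errors you still need -svyset- with the full design (PSUs and strata, if your survey has them).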
