Hi Stata Experts,
I am using DHS data, a nationally representative household survey with a two-stage or three-stage cluster design in which census enumeration units were sampled at the first stage and households at the second stage. As is usual, only one weight variable is provided (wij). The sampling design is explained in the following excerpts from the report.
(EAs refer to Census enumeration areas; ward size indicates number of households)
Using Stata 15, I was able to survey set my data but then I got stuck. My Stata code for survey setting is shown below.
I wish to use districts as the group level so that I have both enough groups and a reasonable average number of observations per group. However, district does not coincide with any sampling stage, so "svy: melogit" with district as the group does not work. Using the stratum as the group level does not work either. If I use the PSU as the group, the average number of observations per group becomes too small. Also, I don't see the LR test in the output. Should I take this to mean that a multilevel model is not needed here? The output for the null models is shown below.
The sample sizes per group are shown below.
Example data are shown below:
I would appreciate it if someone could point me in the right direction. Many thanks.
Each province was stratified into urban and rural areas, yielding 14 sampling strata. Samples of wards were selected independently in each stratum…...
In the first stage, 383 wards were selected with probability proportional to ward size and with independent selection in each sampling stratum. ….
Due to the large size of the urban wards, in a second stage of sample selection, one EA was randomly selected from each of the sample urban wards….
In the last stage of selection, a fixed number of 30 households per cluster were selected with an equal probability systematic selection from the newly created household listing.
Code:
* Group level weight
gen x=1

* Individual level weight: w i|j from wij
gen wt=v005/1000000

* Level 1 weights using scaling method 1: New weights sum to effective cluster size
gen sqw = wt*wt
egen sumsqw = sum(sqw), by(sdist)
egen sumw = sum(wt), by(sdist)
gen pwt1 = wt*sumw/sumsqw

* Level 1 weights using scaling method 2: New weights sum to cluster sample size
egen nj = count(lbw), by(sdist)
gen pwt2 = wt*nj/sumw

. svyset v001 , strata(v022) || _n, weight (pwt1) singleunit(missing)

Note: Stage 1 is sampled with replacement; further stages will be ignored for variance estimation.

      pweight: <none>
          VCE: linearized
  Single unit: missing
     Strata 1: v022
         SU 1: v001
        FPC 1: <zero>
     Weight 1: <none>
     Strata 2: <one>
         SU 2: <observations>
        FPC 2: <zero>
     Weight 2: pwt1
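As a sanity check on the two rescalings in the code above, the within-district sums of the scaled weights can be compared with the district sample size and with the effective size (sum of weights squared, divided by the sum of squared weights). Algebraically, wt*sumw/sumsqw sums to the effective cluster size and wt*nj/sumw sums to the actual cluster size. What follows is only a minimal sketch on made-up toy data: the five input rows and the names pwt_eff, pwt_size, neff, sum_eff and sum_size are hypothetical and do not come from the survey.

```stata
* Toy data (hypothetical): two districts with unequal and equal weights
clear
input byte sdist double wt
1 1
1 3
2 2
2 2
2 2
end

* Building blocks, computed within district as in the post
gen double sqw = wt*wt
egen double sumsqw = total(sqw), by(sdist)
egen double sumw   = total(wt),  by(sdist)
egen nj            = count(wt),  by(sdist)

* The two rescaled level-1 weights
gen double pwt_eff  = wt*sumw/sumsqw   // same expression as pwt1 in the post
gen double pwt_size = wt*nj/sumw       // same expression as pwt2 in the post

* Within-district sums of the rescaled weights, and the effective size
egen double sum_eff  = total(pwt_eff),  by(sdist)
egen double sum_size = total(pwt_size), by(sdist)
gen double neff = sumw^2/sumsqw

list sdist wt pwt_eff pwt_size sum_eff neff sum_size nj, sepby(sdist) noobs
```

In the listing, sum_eff should match neff and sum_size should match nj in every district. Running the same comparison on the real data, restricted to the estimation sample, would show which target each of pwt1 and pwt2 actually hits.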
Code:
. svy:melogit lbw || sdist:, cov(unstruct)
(running melogit on estimation sample)
hierarchical groups are not nested within v001
an error occurred when svy executed melogit
r(459);

. svy:melogit lbw || v022:, cov(unstruct)
(running melogit on estimation sample)
hierarchical groups are not nested within v001
an error occurred when svy executed melogit
r(459);

svy:melogit lbw || v001:, cov(unstruct)
(running melogit on estimation sample)

Survey: Mixed-effects logistic regression

Number of strata   =         14                 Number of obs     =      1,641
Number of PSUs     =        366                 Population size   = 1,468.6222
                                                Design df         =        352
                                                F(   0,    352)   =          .
                                                Prob > F          =          .

------------------------------------------------------------------------------
             |             Linearized
         lbw |      Coef.   Std. Err.      t    P>|t|     [95% Conf. Interval]
-------------+----------------------------------------------------------------
       _cons |  -1.995664   .1053414   -18.94   0.000    -2.202842   -1.788487
-------------+----------------------------------------------------------------
v001         |
   var(_cons)|   .2488302   .1706466                      .0645861    .9586658
------------------------------------------------------------------------------
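The r(459) message above concerns how the model's grouping variable relates to the PSU variable v001. One way to look at that relationship directly is to count how many distinct districts each PSU touches and how many PSUs each district contains. The following is a minimal diagnostic sketch, not a fix: the five toy rows and the helper names tag_pd, n_dist_in_psu and n_psu_in_dist are hypothetical, and only v001 and sdist are taken from the post.

```stata
* Toy data (hypothetical). With real data in memory, skip down to the egen lines.
clear
input int v001 byte sdist
1 3
1 3
2 4
2 4
3 4
end

* Tag one observation per distinct (PSU, district) pair
egen byte tag_pd = tag(v001 sdist)

* Number of distinct districts within each PSU
egen n_dist_in_psu = total(tag_pd), by(v001)

* Number of distinct PSUs within each district
egen n_psu_in_dist = total(tag_pd), by(sdist)

summarize n_dist_in_psu n_psu_in_dist

* PSUs are nested within districts if no PSU spans more than one district
count if n_dist_in_psu > 1
```

If the count is zero while n_psu_in_dist is usually greater than one, PSUs sit inside districts rather than the other way round, which is the pattern in the example data posted here.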
Code:
. quietly melogit lbw || sdist:, cov(unstruct)

. estat group

-------------------------------------------------------------
                |     No. of       Observations per Group
 Group Variable |     Groups    Minimum    Average    Maximum
----------------+--------------------------------------------
          sdist |         71          1       23.1        109
-------------------------------------------------------------

. quietly melogit lbw || v022:, cov(unstruct)

. estat group

-------------------------------------------------------------
                |     No. of       Observations per Group
 Group Variable |     Groups    Minimum    Average    Maximum
----------------+--------------------------------------------
           v022 |         14         81      117.2        155
-------------------------------------------------------------

. quietly melogit lbw || v001:, cov(unstruct)

. estat group

-------------------------------------------------------------
                |     No. of       Observations per Group
 Group Variable |     Groups    Minimum    Average    Maximum
----------------+--------------------------------------------
           v001 |        366          1        4.5         12
-------------------------------------------------------------
Code:
* Example generated by -dataex-. To install: ssc install dataex
clear
input int(v001 v002) byte v003 long v005 int v021 byte(v022 v024 v025 sdist) float(lbw age matgr)
1 17 4 1084429 1 1 1 1 3 0 51 0
1 49 2 1084429 1 1 1 1 3 0 57 0
1 57 2 1084429 1 1 1 1 3 . 47 1
1 81 4 1084429 1 1 1 1 3 . 34 0
1 97 3 1084429 1 1 1 1 3 0 30 0
2 4 2 274377 2 1 1 1 4 0 5 0
2 36 3 274377 2 1 1 1 4 . 10 1
2 62 1 274377 2 1 1 1 4 0 33 0
2 166 2 274377 2 1 1 1 4 . 31 0
3 45 4 566707 3 1 1 1 4 0 26 0
3 87 3 566707 3 1 1 1 4 0 37 1
3 94 3 566707 3 1 1 1 4 0 0 0
3 150 2 566707 3 1 1 1 4 0 51 0
3 164 2 566707 3 1 1 1 4 0 23 1
3 171 2 566707 3 1 1 1 4 0 37 0
3 171 6 566707 3 1 1 1 4 0 23 0
3 206 5 566707 3 1 1 1 4 0 48 1
4 111 1 979347 4 1 1 1 4 0 51 0
4 118 7 979347 4 1 1 1 4 0 7 0
4 125 4 979347 4 1 1 1 4 1 9 0
4 132 2 979347 4 1 1 1 4 0 45 1
4 146 3 979347 4 1 1 1 4 0 51 1
4 195 4 979347 4 1 1 1 4 1 8 0
4 195 9 979347 4 1 1 1 4 1 21 0
5 84 2 864348 5 1 1 1 4 . 17 1
5 109 6 864348 5 1 1 1 4 0 47 1
5 235 4 864348 5 1 1 1 4 1 2 1
5 322 1 864348 5 1 1 1 4 . 58 0
5 360 2 864348 5 1 1 1 4 0 18 0
6 36 2 969976 6 1 1 1 4 0 18 1
6 45 1 969976 6 1 1 1 4 . 40 0
6 120 2 969976 6 1 1 1 4 0 41 1
6 167 1 969976 6 1 1 1 4 0 49 0
6 176 3 969976 6 1 1 1 4 . 34 0
6 223 4 969976 6 1 1 1 4 0 32 0
6 232 2 969976 6 1 1 1 4 0 13 1
6 251 2 969976 6 1 1 1 4 . 4 0
6 260 5 969976 6 1 1 1 4 0 35 0
6 279 2 969976 6 1 1 1 4 0 51 0
7 216 3 981948 7 1 1 1 4 0 6 0
8 7 2 1020485 8 1 1 1 4 1 52 1
8 85 1 1020485 8 1 1 1 4 0 40 0
9 64 8 1026315 9 1 1 1 5 0 3 0
9 83 4 1026315 9 1 1 1 5 0 32 0
10 63 4 874170 10 1 1 1 5 0 10 0
10 154 2 874170 10 1 1 1 5 0 2 0
11 128 2 1023708 11 1 1 1 5 0 58 0
11 158 4 1023708 11 1 1 1 5 . 0 1
11 166 2 1023708 11 1 1 1 5 0 50 0
11 219 2 1023708 11 1 1 1 5 . 20 0
end
label values v022 V022
label def V022 1 "province 1 - urban", modify
label values v024 V024
label def V024 1 "province 1", modify
label values v025 V025
label def V025 1 "urban", modify
label values sdist SDIST
label def SDIST 3 "ilam", modify
label def SDIST 4 "jhapa", modify
label def SDIST 5 "morang", modify
label values lbw A
label def A 0 "No", modify
label def A 1 "Yes", modify
label values matgr matgr
label def matgr 0 "20-24", modify
label def matgr 1 "<20", modify