Hi,
Here is the svyset for my survey data:
svyset psu [pw=wt_final], strata(strata1) fpc(fpc1) || school_id, strata(strata2) fpc(fpc2) || id, strata(strata3) fpc(fpc3) singleunit(centered)
I used svy: mean with the subpop() option to get the mean for grades 4 and 6 in region A. I then ran svy: mean again with over(region grade) to compare the results. The point estimates are identical, but the standard errors differ slightly and the confidence intervals differ more noticeably. Could someone explain why? I really appreciate your input.
Thanks!
Jennifer
. svy, subpop(if grade==4 & region=="REGION A"): mean adj_orf
(running mean on estimation sample)

Survey: Mean estimation

Number of strata =       1        Number of obs     =       1574
Number of PSUs   =      10        Population size   =     182028
                                  Subpop. no. obs   =        503
                                  Subpop. size      =    63187.1
                                  Design df         =          9

--------------------------------------------------------------
             |             Linearized
             |       Mean   Std. Err.     [95% Conf. Interval]
-------------+------------------------------------------------
     adj_orf |   4.637633   1.727553      .7296358     8.54563
--------------------------------------------------------------
Note: 2 strata omitted because they contain no subpopulation
      members.
Note: Strata with single sampling unit centered at overall
      mean.

. svy, subpop(if grade==6 & region==1): mean adj_orf
(running mean on estimation sample)

Survey: Mean estimation

Number of strata =       1        Number of obs     =       1577
Number of PSUs   =      10        Population size   =     182253
                                  Subpop. no. obs   =        517
                                  Subpop. size      =    47116.9
                                  Design df         =          9

--------------------------------------------------------------
             |             Linearized
             |       Mean   Std. Err.     [95% Conf. Interval]
-------------+------------------------------------------------
     adj_orf |   24.52088   4.222985      14.96783    34.07394
--------------------------------------------------------------
Note: 2 strata omitted because they contain no subpopulation
      members.
Note: Strata with single sampling unit centered at overall
      mean.

. svy: mean adj_orf, over(region grade)
(running mean on estimation sample)

Survey: Mean estimation

Number of strata =       3        Number of obs     =       3428
Number of PSUs   =      28        Population size   =     461175
                                  Design df         =         25

        Over: region grade
   _subpop_1: REGION A 4
   _subpop_2: REGION A 6
   _subpop_3: REGION B 4
   _subpop_4: REGION B 6
   _subpop_5: REGION C 4
   _subpop_6: REGION C 6

--------------------------------------------------------------
             |             Linearized
        Over |       Mean   Std. Err.     [95% Conf. Interval]
-------------+------------------------------------------------
adj_orf      |
   _subpop_1 |   4.637633   1.727522      1.079735    8.195532
   _subpop_2 |   24.52088   4.226485      15.81627    33.22549
   _subpop_3 |   10.21037   1.837862      6.425223    13.99552
   _subpop_4 |   35.90704   5.374154      24.83876    46.97532
   _subpop_5 |   10.59444    2.29463      5.868564    15.32032
   _subpop_6 |   33.77606   4.203415      25.11897    42.43316
--------------------------------------------------------------
Note: Strata with single sampling unit centered at overall
      mean.
.