Population size smaller than number of observations, svyset

Kateryna Kravchenko

Join Date: Apr 2020
Posts: 5

12 Apr 2020, 15:15

Dear Stata users,

After applying the svyset command to a dataset that includes weights and 15 strata generated according to the sample design:

Code:

svyset cluster [pweight=weight], strata(strata)

I am getting a result of

Code:

svy: mean bcg

with a population size smaller than the number of observations.

Code:

Number of strata =      15        Number of obs   =        622
Number of PSUs   =     336        Population size = 613.447816
                                  Design df       =        321

--------------------------------------------------------------
             |             Linearized
             |       Mean   Std. Err.     [95% Conf. Interval]
-------------+------------------------------------------------
         bcg |   .9828574    .008315      .9664986    .9992163
--------------------------------------------------------------

Would you be so kind as to suggest what might be wrong in this case?

Phil Bromiley

Join Date: Apr 2014

Posts: 4348
#2

13 Apr 2020, 10:34

Welcome to Statalist. As I remember from a recent posting on Statalist, using weights can sometimes generate what appear to be extra observations. I could be wrong about how this applies to survey analysis.
Kateryna Kravchenko

Join Date: Apr 2020

Posts: 5
#3

13 Apr 2020, 15:38

Dear Phil,

Thank you for your reply.

What confused me was that the estimated population size is smaller than the number of observations (613.448 versus 622). I assumed the problem might lie with the assigned weights, but I am not certain about that.
Is there any way to correct the weights? Or perhaps I am missing something else.

Any help would be highly appreciated.
Kateryna Kravchenko

Join Date: Apr 2020

Posts: 5
#4

23 Apr 2020, 05:39

Dear all,

I found out what was behind my concern about the population size.

If the weights were normalized (as should be stated in the sample design description) so that the weighted number of households/individuals equals the unweighted number of sample cases with completed questionnaires, the reported population size will not equal the true population total. Simply summing the weights should return the number of observations.

In other words, normalized weights cannot be used to estimate population totals; however, they still give the same point estimates, such as means or proportions, as non-normalized weights.
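
To illustrate the point with data anyone can load, here is a minimal sketch using Stata's NHANES II example dataset (webuse nhanes2, which needs an internet connection), not the survey discussed in this thread. The variable normwgt and the scalars n and tot are names made up for the illustration; finalwgt, strata, psu, and height are variables shipped with that dataset.

```stata
* Sketch on Stata's NHANES II example data (not the poster's survey).
webuse nhanes2, clear

* Sum of the original sampling weights: an estimate of the population size.
quietly summarize finalwgt
scalar n   = r(N)
scalar tot = r(sum)
display "N = " scalar(n) "   sum of original weights = " %15.0fc scalar(tot)

* Normalize the weights so that they sum to the number of observations.
generate double normwgt = finalwgt * scalar(n) / scalar(tot)
quietly summarize normwgt
display "Sum of normalized weights = " %12.2f r(sum)

* Original weights: "Population size" in the header is the weight total.
svyset psu [pweight=finalwgt], strata(strata)
svy: mean height

* Normalized weights: "Population size" now equals the number of
* observations, while the estimated mean is unchanged.
svyset psu [pweight=normwgt], strata(strata)
svy: mean height
```

Only the "Population size" line of the header should differ between the two runs; the mean is a ratio of weighted sums, so a constant rescaling of the weights cancels out.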
Samruddhi Borate

Join Date: Jul 2024

Posts: 4
#5

17 Mar 2025, 18:06

Hi, thank you for bringing this up. I had the exact same doubt and reached the same conclusion. I was wondering if you were able to find any way to calculate the population size though?
Richard Williams

Join Date: Apr 2014

Posts: 5008
#6

17 Mar 2025, 18:39

Originally posted by Samruddhi Borate

Hi, thank you for bringing this up. I had the exact same doubt and reached the same conclusion. I was wondering if you were able to find any way to calculate the population size though?

I suspect your best bet is to look at the study documentation, or to see whether non-normalized weights are also included in the data.

-------------------------------------------
Richard Williams, Notre Dame Dept of Sociology
StataNow Version: 19.5 MP (2 processor)
EMAIL: [email protected]
WWW: https://www3.nd.edu/~rwilliam
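
As a rough sketch of the two suggestions above: the first part looks for other weight variables in the file and compares the sum of the weights with the number of observations; the second part rescales the weights to a population total. It assumes the survey data are in memory with the variable names from post #1 (weight, cluster, strata, bcg). The population total N_pop is a hypothetical placeholder that would have to come from the study documentation, popweight and wsum are made-up names, and the rescaling is only appropriate if the documentation confirms that a single normalization constant was applied to the whole sample.

```stata
* Assumes the survey data from post #1 are already in memory.

* List variables whose name or label mentions "weight" or "wgt",
* in case non-normalized weights are also included in the file.
lookfor weight wgt

* Compare the sum of the weights with the number of observations.
quietly summarize weight
scalar wsum = r(sum)
display "N = " r(N) "   sum of weights = " %12.3f scalar(wsum)

* HYPOTHETICAL population total; replace with the documented figure.
scalar N_pop = 1000000

* Rescale so that the weights sum to N_pop. This is valid only if one
* overall normalization constant was used (an assumption to verify).
generate double popweight = weight * scalar(N_pop) / scalar(wsum)

* With rescaled weights, svy: total estimates a population count.
svyset cluster [pweight=popweight], strata(strata)
svy: total bcg
```

If the weights were normalized separately within strata or subsamples, one overall constant will not recover the original weights, and the documented non-normalized weights are needed instead.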