
  • Number of PSU in svyset

    Hello everyone,
    I am analysing LSMS household survey data (panel data, 3 waves). I would like to declare the survey design with svyset, but I am having trouble with the number of PSUs. My primary sampling units are communities (EAs). My data contain (final) sampling weights as well as one variable for the strata used.

    When I just count the number of communities in my data, I get 373. Several observations are missing the community identifier, so I think there are 372 communities in my dataset.

    However, when I use
    "svyset comm [pw=wgt], strata(stratum)",
    Stata tells me (svydescribe) that there are 442 communities in my data.

    On the other hand, when I try
    "svyset comm [pw=wgt]",
    Stata tells me that I have one stratum and 372 units which seems to be correct, from my point of view.

    So, how can I define the svyset command correctly including all the information on the sampling design I have?

    Any help is greatly appreciated.

    Thanks,
    Heidi

  • #2
    The FAQ (sec 12) asks that you show us the results of your commands, not just the commands themselves.

    The first svyset is correct (442 communities). You don't tell us how many strata you really have (the output of the first svyset and of svydescribe would show this), but what is happening is that community numbering is independent within strata. So, for example, community "1" in stratum 1 is different from community "1" in stratum 2. Yet when you ignore strata, the two communities numbered "1" are erroneously treated as the same community.
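
    A minimal sketch of how one might check this diagnosis directly. It assumes the variable names comm, stratum, and wgt from the original post; first_comm, first_pair, and psu are hypothetical names introduced only for illustration. If community codes repeat across strata, the second count should exceed the first.

```stata
* Sketch only: comm, stratum and wgt come from the original post;
* first_comm, first_pair and psu are hypothetical new variables.

* Number of distinct community codes, ignoring strata
egen byte first_comm = tag(comm)
count if first_comm

* Number of distinct stratum-community pairs
egen byte first_pair = tag(stratum comm)
count if first_pair

* Optional: a community identifier that is unique across strata
egen long psu = group(stratum comm)
svyset psu [pw=wgt], strata(stratum)
svydescribe
```

    By default, both tag() and group() skip observations with missing values in the listed variables. The counts may still differ slightly from what svydescribe reports, since svydescribe also sets aside observations with a missing weight.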
    Last edited by Steve Samuels; 07 Jul 2015, 14:50.
    Steve Samuels
    Statistical Consulting

    Stata 14.2


    • #3
      Thanks a lot for the explanation.
      I have 6 strata in total. This is what the output looks like:


      . svyset comm [pw=wgt], strata(stratum)

      pweight: wgt
      VCE: linearized
      Single unit: missing
      Strata 1: stratum
      SU 1: comm
      FPC 1: <zero>

      . svydescribe

      Survey: Describing stage 1 sampling units

      pweight: wgt
      VCE: linearized
      Single unit: missing
      Strata 1: stratum
      SU 1: comm
      FPC 1: <zero>

                                            #Obs per Unit
                                    ----------------------------
      Stratum    #Units     #Obs      min       mean      max
      --------  --------  --------  --------  --------  --------
             1        68      3477         1      51.1       207
             2        79      8598         1     108.8       258
             3        73     11121         1     152.3       338
             4        74     11957        10     161.6       376
             5        80     10843         2     135.5       303
             6        68      9899         2     145.6       274
      --------  --------  --------  --------  --------  --------
             6       442     55895         1     126.5       376

                              3159 = #Obs with missing values in the
                          --------   survey characteristics
                             59054


      • #4
        Thanks. The output confirms my diagnosis: there are six strata and 442 first-stage units, and your first svyset statement is correct. You might want to investigate why over 5% of your observations are missing at least one of the survey design characteristics.
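
        One possible starting point for that investigation, sketched under the assumption that the design variables are named comm, stratum, and wgt as in the svyset command above. It counts missing values per design variable and then the observations missing at least one of them.

```stata
* Sketch only: comm, stratum and wgt are the names used in the svyset command above.

* Missing values in each design variable
count if missing(comm)
count if missing(stratum)
count if missing(wgt)

* Observations missing at least one design variable
count if missing(comm) | missing(stratum) | missing(wgt)

* Share of affected observations, using the totals from the svydescribe output
display %5.2f 100 * 3159 / 59054
```

        The last line reproduces the "over 5%" figure from the reported totals (3159 of 59054 observations, about 5.3%).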

        • #5
          Thanks again for your help!
