  • Replication Weights using Jackknife Method: Cluster Sampling

    Dear all,

    I am currently working with survey data gathered from a cluster sampling frame of primary schools. At each school, the 4th graders sat an examination and completed a survey.

    Because of the complex sample design used in this survey, standard methods for estimating standard errors cannot be used, since with a clustered sample they would typically underestimate the sampling variance considerably. I therefore need to use replication methods. As far as I understand, the replication weights were created using the Jackknife Repeated Replication method with 2 PSUs per stratum (JK2); however, the replication weights are not included in the database, and only the following variables are available:

    jkzone: is the variable that captures the assignment of schools or students to variance zones
    jkrep: is the variable that captures whether the case is to be dropped or have its weight doubled for each set of replicate weights
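
    For reference, replicate weights of this kind are often built by hand from the zone and replicate-indicator variables: in replicate h, cases in zone h have their weight either doubled or set to zero, and all other cases keep their full weight. The following is only a sketch under assumptions that are not confirmed for this dataset: that jkzone takes integer values 1..K, that jkrep is coded 0 (drop) and 1 (double) with no missing values, and that a JK2 multiplier of 1 is appropriate. The stem jkw_ is a hypothetical name.

    ```stata
    * Sketch only. Assumptions: jkzone = 1..K, jkrep coded 0/1, no missing values.
    * jkw_ is a hypothetical stem for the new replicate-weight variables.
    quietly summarize jkzone
    local nzones = r(max)

    forvalues h = 1/`nzones' {
        * Replicate h: every case starts with its full student weight
        generate double jkw_`h' = totwgt
        * Inside zone h: weight is doubled (jkrep == 1) or set to zero (jkrep == 0)
        quietly replace jkw_`h' = totwgt * 2 * jkrep if jkzone == `h'
    }

    * Declare the design with user-supplied jackknife replicate weights.
    * multiplier(1) and mse are assumed here, not taken from the data provider.
    svyset [pweight = totwgt], jkrweight(jkw_*, multiplier(1)) vce(jackknife) mse
    ```

    After this, estimation commands prefixed with svy: would use the replicate weights; whether the resulting standard errors match those intended by the data provider would need to be checked against any published benchmark figures.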

    To build the replication weights, the data producers paired 2 PSUs per jackknife zone (jkzone), not per stratum, and the sampling zones were constructed within explicit strata (idstrate). When an explicit stratum has an odd number of schools, either by design or because of school non-response, the students in the remaining school were used for pairing. Each sampling zone therefore consists of a pair of schools or a pair of student groups.

    The PSUs in the sample are schools, and both school (schwgt) and student (totwgt) weights are included. The database also includes the stratification variables idstrate and idstrati, explicit and implicit respectively. However, according to the company that collected the data, they should only be used for subgroup comparisons. Otherwise, there is no need to use them.

    I used the survwgt create command (from the survwgt package written by Nicholas Winter) to generate the jackknife replication weights as follows, but I obtained an error message:

    survwgt create jk2, strata(jkzone) psu(idschool) weight( totwgt ) stem(jkn_)
    stratum with more than 2 PSUs detected
    fpc must be >= 0
    Then I tried the jkn method, just to see what was happening, but again I obtained an error message:

    survwgt create jkn, strata(jkzone) psu(idschool) weight( totwgt ) stem(jkn_)
    stratum with only one PSU detected
    fpc must be >= 0
    I guess this is happening because they are choosing the sampling zones either by schools or students (when a specific stratum has an odd number of schools). The survwgt create documentation says that it only works for:

    survey designs that exactly match the specifications for the type of weights requested (two PSUs per stratum for BRR, etc.) Any collapsing of strata or PSUs, splitting of certainty PSUs, or other adjustments to approximate the appropriate design must be done outside of the program.
    However, these last wrinkles are beyond my Stata programming ability at this point.
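
    One way to check the guess above is to count how many distinct schools fall in each jackknife zone. Zones with a single school (where students were split into two halves) or with more than two schools would explain the two error messages. This is a minimal sketch; tag_psu, npsu and tag_zone are hypothetical helper names.

    ```stata
    * Sketch only; tag_psu, npsu and tag_zone are hypothetical helper variables.

    * Flag one observation per school within each zone
    egen byte tag_psu = tag(jkzone idschool)

    * Number of distinct schools in each zone
    egen npsu = total(tag_psu), by(jkzone)

    * Flag one observation per zone, then tabulate the school counts
    egen byte tag_zone = tag(jkzone)
    tabulate npsu if tag_zone

    * List the zones that do not contain exactly two schools
    list jkzone npsu if tag_zone & npsu != 2
    ```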

    There is no methodological note available for the database, but according to the company that collected the data, the replication weights for this specific sampling design can only be created using SPSS. I do not know how to use SPSS and I do not have an SPSS license. That is why I would like to use Stata, but I am not sure that I can actually use it to create the replication weights. So, (i) can anyone please tell me whether it is possible to use Stata to calculate the replication weights under this sample design, and (ii) if it is possible, can you explain why I am obtaining the error messages mentioned?

    Thank you very much in advance for your help. I look forward to your reply.

    Tatiana Zarate



  • #2
    Dear Tatiana,
    Did you find a solution to your problem? I'm facing the exact same issue and don't know how to proceed. Any help/tips would be greatly appreciated.
    Thanks!
    Sönke Matthewes
