  • replicate variance of "svy: mean"

    Hi everyone,

    I've been trying to replicate the variance produced by "svy: mean varname" in a single-stage stratified sampling design. Here is my code:
    Code:
    clear all
    set seed 1234
    
    * generate data
    set obs 20
    g weights = rpoisson(5)
    g y = rnormal(100, 100)
    g strata = rbinomial(1, 0.5)
    svyset [pw = weights], strata(strata)
    drop if weights==0
    
    * Survey mean
    svy: mean y
    
    * try to replicate the standard error by hand
    
    * demean y within strata
    g y_w = y * weights
    bys strata: egen YW = total(y_w)
    bys strata: egen W = total(weights)
    g y_demeaned = y - YW/W
    g N = 1
    g inffun = (weights * y_demeaned)^2
    
    * collapse to strata
    collapse (sum) N W=weights inffun, by(strata)
    g V = (N/(N-1)) * inffun
    
    * collapse strata-specific variances
    collapse (sum) W V
    g se = sqrt(1/W^2 * V)
    list se
    The code reproduces Stata's standard error exactly if there is just one stratum (g strata = 1). With two strata, however, my replicated standard error does not quite match the one reported by "svy: mean". The difference disappears asymptotically, but I'd like to know where it comes from in a small sample.
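
    To make the computation easier to check, here is what I believe the code above computes, written as a formula. The notation is my own and does not come from the Stata manual: h indexes strata, i indexes observations within a stratum, n_h is the number of observations in stratum h, w_hi is the weight, and y_hi is the outcome.

    ```latex
    \[
    \widehat{\mathrm{se}}
      = \frac{1}{\hat{W}}
        \sqrt{\sum_{h} \frac{n_h}{n_h - 1}
              \sum_{i=1}^{n_h} \bigl[ w_{hi}\,(y_{hi} - \bar{y}_h) \bigr]^2 },
    \qquad
    \bar{y}_h = \frac{\sum_{i} w_{hi}\, y_{hi}}{\sum_{i} w_{hi}},
    \qquad
    \hat{W} = \sum_{h} \sum_{i} w_{hi}
    \]
    ```

    Note that y is centered on the weighted mean of its own stratum here, not on the overall weighted mean, and that the squared terms are not re-centered after multiplying by the weights.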

    Thanks for any help on this.