  • Comparing proportions using svy command

    Hello,

    I am a PhD student and have been working in Stata for only two years.

    I need to compare proportions of sexual outcomes (ever had sex, condom use at first sex...) in two independent datasets, one of which is a survey dataset. I am having trouble combining the two datasets and using commands that take into account the fact that one of the samples is weighted.

    The first dataset (named coverte) contains data from participants with HIV. It is not a survey. N=284

    The second dataset (BS2010) contains survey data with its own weight. N=2,899

    I appended the two datasets after first opening the BS2010 dataset: "append using coverte07022023.dta, gen(base)"
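
    For reference, the append step can be sketched as below. This is a minimal sketch, not the exact session: the file name BS2010.dta and the label name base_lbl are assumptions for illustration. With generate(base), Stata sets base to 0 for the data already in memory and to 1 for the appended file.

```stata
* Sketch only: BS2010.dta and base_lbl are hypothetical names
use BS2010, clear
append using coverte07022023.dta, generate(base)

* base = 0 for BS2010 (data in memory), 1 for the appended coverte file
label define base_lbl 0 "BS2010 (survey)" 1 "coverte (HIV sample)"
label values base base_lbl

* Check that both sources are present in the combined data
tabulate base, missing
```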

    Typing "svydescribe" shows the following:

    svydescribe

    Survey: Describing stage 1 sampling units

    pweight: RD2TOT
    VCE: linearized
    Single unit: missing
    Strata 1: <one>
    SU 1: <observations>
    FPC 1: <zero>

                                     #Obs per Unit
                             ----------------------------
     Stratum   #Units     #Obs      min     mean      max
    -------- -------- -------- -------- -------- --------
           1    2,899    2,899        1      1.0        1
    -------- -------- -------- -------- -------- --------
           1    2,899    2,899        1      1.0        1

                           284 = #Obs with missing values in the
                      --------   survey characteristics
                         3,183

    For example, I want to compare the proportions of participants who report ever having had sex, which is the variable RS_ni (RS stands for "rapport sexuel" in French).

    I first used the tabulate command, but the result does not take into account the weights in the BS2010 survey, so my understanding is that it is not correct:

    . tab RS_ni base, col chi2

    +-------------------+
    | Key               |
    |-------------------|
    |     frequency     |
    | column percentage |
    +-------------------+

               |         base
         RS_ni |          0         1 |     Total
    -----------+----------------------+----------
             0 |        302        47 |       349
               |      10.42     16.91 |     10.99
    -----------+----------------------+----------
             1 |      2,592       227 |     2,819
               |      89.41     81.65 |     88.73
    -----------+----------------------+----------
             3 |          5         4 |         9
               |       0.17      1.44 |      0.28
    -----------+----------------------+----------
         Total |      2,899       278 |     3,177
               |     100.00    100.00 |    100.00

    Pearson chi2(2) = 25.8040   Pr = 0.000

    When I use the svy command for proportions, it ignores the coverte sample (number of observations: 2,899). I tried the svy: logit command and ran into the same issue.

    svy: proportion RS_ni, over (base)
    (running proportion on estimation sample)

    Survey: Proportion estimation

    Number of strata =       1        Number of obs   =      2,899
    Number of PSUs   =   2,899        Population size = 3,311.8753
                                      Design df       =      2,898

    --------------------------------------------------------------
                 |              Linearized         Logit
                 |  Proportion   Std. Err.    [95% Conf. Interval]
    -------------+------------------------------------------------
    RS_ni@base   |
             0 0 |    .1227894     .007692     .108486    .1386854
             1 0 |    .8750966    .0077625    .8590646    .8895394
             3 0 |     .002114    .0012226    .0006795    .0065567
    --------------------------------------------------------------
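
    The svydescribe output above suggests why the coverte rows are dropped: the 284 appended observations have a missing value for the pweight variable RD2TOT, so svy excludes them from the estimation sample. One possible workaround, sketched here under assumptions rather than offered as a definitive analysis, is to build a combined weight and treat the two independent samples as separate strata. The variable name wt_comb and the choice of a unit weight for coverte are hypothetical, and a unit weight does not reproduce any standardization applied to the coverte sample.

```stata
* Assumptions: base == 1 flags the appended coverte rows
* (created by append ..., generate(base)); RD2TOT is the BS2010
* sampling weight and is missing for the coverte rows.
count if missing(RD2TOT)    // should match the 284 reported by svydescribe

* Hypothetical combined weight: survey weight for BS2010, 1 for coverte
generate double wt_comb = RD2TOT
replace wt_comb = 1 if base == 1

* Treat the two independent samples as separate strata,
* with each observation as its own sampling unit
svyset _n [pweight = wt_comb], strata(base)

* Both samples should now enter the estimation sample
svy: proportion RS_ni, over(base)

* Design-based Pearson test of association between RS_ni and sample
svy: tabulate RS_ni base, column pearson
```

    Whether a unit weight is appropriate for the non-survey sample depends on the comparison intended, so the weight assigned to the coverte rows is the main design choice to revisit.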


    I would be very grateful if someone could help me. I have checked the Stata Survey Data Reference Manual, but the only section on combining samples from multiple surveys did not help me.

  • #2
    I just want to add some important information before anyone answers:

    The proportions in the first dataset (coverte) have been standardized to the age and education-level distribution of the second dataset (BS). BS is a representative sample of the French population.
    That is to say, what I actually need to compare is:
    - standardized proportions in dataset no. 1
    - and weighted proportions in dataset no. 2.

    To compute the standardized proportions in the first dataset, I svyset the data as follows:
    g pop = 1
    svyset [pweight=pop]

    Then, to compute the standardized proportion of RS_ni in the first dataset, I did the following:
    svy, subpop(female): proportion RS_ni, stdize(age_dip) stdweight(std_wt)
    svy, subpop(male): proportion RS_ni, stdize(age_dip) stdweight(std_wt)

    Thank you very much.
