
  • survey data (pweight)

    Hello everyone. I made some tables showing the top ten occupations by gender according to 1) absolute number and 2) percentage. The code for women (women are coded as 1 and men as 0) was:
    Code:
    egen n_female = total (gender==1), by(occupation)
    egen p_female = mean(gender), by(occupation)
    egen tag = tag(occupation)
    
    * 1) rank by absolute number
    gsort -tag -n_female
    gen order = _n
    su gender if gender==1, meanonly
    gen ppcumtotal_female = 100 * sum(n_female) / r(N)   // cumulative percentage of all women
    tabdisp order in 1/10, c(occupation n_female ppcumtotal_female p_female)
    
    * 2) rank by percentage
    gsort -tag -p_female
    replace order = _n
    tabdisp order in 1/10, c(occupation p_female n_female)
    The problem is that I realized today that I have to weight the survey data with pweights. The weight variable is called "weight" and has a different value for each observation, so I don't have to calculate the weights myself. I probably have to change something in this code, but I don't really know what.



    I would be very thankful for some help.

    Last edited by Anshul Anand; 29 May 2015, 08:35.

  • #2
    I think the main problem is that "egen" does not support "svy" commands. But how can this problem be solved?



    • #3
      There's nothing in your code above that requires svy as far as I can see. (You are not calculating SEs.) For calculating a total, above you coded
      Code:
      egen n_female = total (gender==1), by(occupation)
      To get a weighted total, I think you could change it to
      Code:
      egen n_female = total (  weight*(gender==1)  ), by(occupation)
      with similar tricks elsewhere



      • #4
        To continue Stephen's code:

        Code:
        egen n_pop = total(weight*(gender!=.)), by(occupation)
        gen p_female = n_female/n_pop
        Steve Samuels
        Statistical Consulting

        Stata 14.2
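In formula form, the two egen lines above appear to compute the following per-occupation quantities. The notation is introduced here for illustration only and is not from the thread: $w_i$ is the value of "weight" for observation $i$, $g_i$ is "gender", and $o$ indexes an occupation.

```latex
\hat{T}_o = \sum_{i \in o} w_i \,\mathbf{1}[g_i = 1]
\qquad
\hat{N}_o = \sum_{i \in o} w_i \,\mathbf{1}[g_i \neq \text{missing}]
\qquad
\hat{p}_o = \frac{\hat{T}_o}{\hat{N}_o}
```

Here $\hat{T}_o$ corresponds to n_female, $\hat{N}_o$ to n_pop, and $\hat{p}_o$ to p_female. With all weights equal to 1, these reduce to the unweighted count and mean in the original post.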



        • #5
          Thanks a lot for the help! One question: how do I deal with the cumulative percentage? Like this:
          Code:
          svy: su gender if gender==1, meanonly
          gen pccumtotal_female = 100 * sum(n_female) / r(N)
          where n_female is the weighted total?
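One possible way to handle the cumulative percentage, sketched under assumptions: the denominator r(N) in the original code is an unweighted count of women, so a weighted numerator probably needs a weighted denominator as well. The sketch below replaces r(N) with the weighted total of all women. So that it runs without the poster's data, it borrows the auto dataset as a stand-in (foreign for gender, rep78 for occupation, trunk for weight); the helper variable w_women is a hypothetical name, not something from the thread.

```stata
* Sketch only: auto data stands in for the survey (an assumption)
sysuse auto, clear
rename foreign gender        // 1 plays "female", 0 plays "male"
rename rep78 occupation      // stand-in for occupation
gen weight = trunk           // stand-in for the pweight variable

* Weighted number of women per occupation, one tagged row per occupation
egen n_female = total(weight*(gender==1)), by(occupation)
egen tag = tag(occupation)

* Weighted number of all women: takes the place of the unweighted r(N)
egen w_women = total(weight*(gender==1))

* Largest occupations first, then a running weighted percentage
gsort -tag -n_female
gen order = _n
gen pccumtotal_female = 100 * sum(n_female) / w_women if tag
tabdisp order in 1/5, c(occupation n_female pccumtotal_female)
```

As in the original code, women with a missing occupation still count in the denominator, so the last cumulative percentage can fall short of 100. No svy prefix is needed for this step, because only point estimates are computed.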



          • #6
            I have no idea what your problem is. Stephen and I created the weighted variables that you asked for, so that you could use them in your original code. Now you 1) add a svy command that has no connection to anything; and 2) you ask how to "deal" with a variable that presumably was working with the non-weighted variables (since you had already created the table). Please read FAQ Section 12.
            Steve Samuels
            Statistical Consulting

            Stata 14.2



            • #7
              I think I see your problem, but if you want only the top 10 occupations, you need to compute the cumulative proportions starting with the largest. Here's code based on the auto data. Here "rep78" plays the role of occupation and "foreign" is your "female". Set a macro "k" for the number of categories you want. In the example I take k=2.

              Code:
              local k = 2 /* number of top ranks; substitute 10 in your problem */
              sysuse auto, clear
              
              egen tag = tag(rep78)
              gen  wt = trunk  // sample weight
              
              /* Create weighted totals and means */
              egen ttot = total(wt*foreign), by(rep78)
              egen tn   = total(wt*(foreign!=.)), by(rep78)
              gen  fmean = ttot/tn
              
              tempfile t1
              save `t1', replace
              
              keep if tag
              keep  rep78 ttot fmean
              
              /* top totals: highest to smallest */
              gsort -ttot, gen(rankt)
              /* cumulative total percentages, starting with largest */
              gen cumpct = sum(ttot)
              replace cumpct= 100*cumpct/cumpct[_N]
              
              list  rankt ttot cumpct rep78
              
              /* top means: highest to smallest */
              gsort  -fmean, gen(rankm)
              list rankm fmean rep78  if rankm<=`k'
              
              /* Merge back with original data */
              merge 1:m rep78 using `t1'
              keep if _merge==3
              drop _merge
              
              /* sample analyses */
              
              tabdisp rankt if rankt<=`k', c(rep78  cumpct ttot)
              
              svyset _n [pweight = wt]
              svy, subpop(if  rankm<=`k'):  reg price trunk i.rep78
              Steve Samuels
              Statistical Consulting

              Stata 14.2
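As a sanity check on the hand-built weighted variables, one could compare them with the point estimates from Stata's survey estimators. This is a minimal sketch, assuming the same stand-ins as the post above (auto data, trunk as the sample weight); it is a cross-check, not part of the posted solution.

```stata
* Sketch: cross-check egen-based weighted totals and means against svy
sysuse auto, clear
gen wt = trunk                        // stand-in sample weight, as above

* Hand-built weighted total and mean of foreign within rep78
egen ttot = total(wt*foreign), by(rep78)
egen tn   = total(wt*(foreign!=.)), by(rep78)
gen  fmean = ttot/tn

* Survey estimators with the same weights
svyset _n [pweight = wt]
svy: total foreign, over(rep78)       // point estimates should match ttot
svy: mean  foreign, over(rep78)       // point estimates should match fmean

* One row per rep78 for comparison with the svy output
egen tag = tag(rep78)
sort rep78
list rep78 ttot fmean if tag, noobs
```

The svy commands also report standard errors, which the egen approach does not; that is the only part of the task where the svy prefix is actually required.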



              • #8
                Thank you very much Sir!!

