  • Create variable with mean values calculated with survey weights and data subset

    I am trying to create a bar graph of calculated means of a variable from the NSFG. I learned that I should not drop cases or use if-statements to subset the data when using survey weights, so I created a variable 'samp40' that identifies women ages 40 and older, my population of interest. I use the following code to calculate the percentage of women who would like to have another child, by race/ethnic group:

    svy, subpop(samp40): mean wantsanother, over(race_eth)

    I would now like to create a bar graph of this information (percentage of women over 40 who want another child, by race), but I cannot figure out how to create a single variable with these values. Normally, I would create it using the following:

    egen mnwantsanother_race=wtmean(wantsanother) if samp40, by(race_eth) weight(weightvar)

    However, this does not give me the correct mean values because I used the if-statement to subset the data. I also cannot use the svy prefix with the egen command.

    I appreciate any help you can provide!

    (I am using Stata 16 on Windows 11)

  • #2
    It may be possible to save the output as a new dataset using svmat, then plot it. Here is an example. A bit clunky but should work:

    Code:
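    * load the example survey data and declare the design
    webuse nhanes2f, clear
    svyset psuid [pweight = finalwgt], strata(stratid)
    svy: mean zinc, over(sex)
    
    * keep a copy of the original data, then store the transposed results table
    preserve
    matrix a = r(table)'
    mat list a
    
    * turn the matrix into a two-observation dataset (variables ab, ase, ...)
    clear
    svmat a, names(matcol)
    gen str10 sex = ""
    replace sex = "Male"   in 1
    replace sex = "Female" in 2
    
    * plot the means, then bring back the original data
    graph bar ab, over(sex)
    restore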



    • #3
      Thanks Ken! I couldn't quite get this code to work for me, so I was wondering if you could clarify a few things: 1) what is the 'b' for in graph bar 'ab'? 2) why do I need to do the preserve and restore for this?



      • #4
        Originally posted by Kathleen Broussard
        Thanks Ken! I couldn't quite get this code to work for me, so I was wondering if you could clarify a few things: 1) what is the 'b' for in graph bar 'ab'? 2) why do I need to do the preserve and restore for this?
        If you take away the preserve/restore, there should be no problem. Just be aware that when we use "svmat", the saved matrix is tacked onto the end of the dataset as new variables. By clearing the original data and then running svmat, you get a cleaner dataset. Once the graph is made, there is no need to keep the results, so we can restore the original data. As I said, it does not hurt to skip preserve/restore; just make sure to delete the matrix variables at the end of the data afterwards. (The "end" here means new variables added to the right side of the dataset, not new cases at the bottom.)
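
        As a rough sketch of the no-preserve route described above, using the same nhanes2f example: svmat appends the results as new variables on the right side of the existing data, and they can be dropped once they are no longer needed. The drop range assumes the new variables are the last ones in the dataset, which is where svmat puts them.

```stata
* same setup as the example in #2
webuse nhanes2f, clear
svyset psuid [pweight = finalwgt], strata(stratid)
svy: mean zinc, over(sex)
matrix a = r(table)'

* no preserve/clear: results are appended as new variables (ab, ase, ...)
svmat a, names(matcol)

* only the first two observations hold results; the rest are missing
list ab ase all aul in 1/3

* clean up the appended variables when done
drop ab-aeform
```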

        ab is just the name svmat gives to the two mean zinc levels. In r(table) that row is called "b"; after transposing it becomes a column, and because the matrix is called "a", svmat with names(matcol) gives it the concatenated name "ab". You can check help svmat to see other options for handling the naming.
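
        One such naming option, as a minimal sketch on the same example data: names(col) uses the matrix's column names as they are, so the means end up in a variable called b instead of ab.

```stata
webuse nhanes2f, clear
svyset psuid [pweight = finalwgt], strata(stratid)
svy: mean zinc, over(sex)
matrix a = r(table)'

preserve
clear
* names(col): variables are named after the matrix columns (b, se, t, ...)
svmat a, names(col)
describe
list b se ll ul
restore
```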

        Here is the matrix, a:
        Code:
                              b         se          t     pvalue         ll         ul         df       crit      eform
        c.zinc@1.sex  90.753322  .58531174  155.05126  2.268e-46  89.559571  91.947074         31  2.0395134          0
        c.zinc@2.sex  83.874435  .47049143  178.26985  3.014e-48  82.914862  84.834009         31  2.0395134          0
        Here are the variables after svmat:

        Code:
             +---------------------------------------------------------------------------------------------------+
             |       ab        ase         at   apvalue        all        aul   adf      acrit   aeform      sex |
             |---------------------------------------------------------------------------------------------------|
          1. | 90.75332   .5853117   155.0513         0   89.55957   91.94707    31   2.039513        0     Male |
          2. | 83.87444   .4704914   178.2699         0   82.91486   84.83401    31   2.039513        0   Female |
             +---------------------------------------------------------------------------------------------------+
        If this is still confusing, here is the overall scheme:

        Once you have run a svy: mean command, use return list and you should see a saved item called "r(table)". That r(table) contains the means. If you use "mat list r(table)" you will see them in the first row. After you save that as a matrix (I called it a, transposed), we can use it as a two-case dataset and generate the bar graph.
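
        The same scheme also works with subpop(), which is what the original question needs. The sketch below is only a stand-in on the example data: samp40 is built here from age (assuming nhanes2f carries an age variable), zinc plays the role of wantsanother, and sex plays the role of race_eth. With the real NSFG data, the svy line would be the poster's own command and the group labels would follow the categories of race_eth.

```stata
webuse nhanes2f, clear
svyset psuid [pweight = finalwgt], strata(stratid)

* stand-in subpopulation indicator: 1 = age 40 or older, 0 otherwise
generate byte samp40 = (age >= 40 & !missing(age))

* subpopulation means by group, then store the transposed results table
svy, subpop(samp40): mean zinc, over(sex)
matrix m = r(table)'

preserve
clear
svmat m, names(col)

* rows follow the order of the over() groups (1 = Male, 2 = Female here)
generate byte group = _n
label define grp 1 "Male" 2 "Female"
label values group grp

graph bar b, over(group) ytitle("Mean zinc, ages 40+")
restore
```

        If the outcome is a 0/1 indicator, as a "wants another child" variable presumably is, multiplying b by 100 before graphing gives percentages.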
        Last edited by Ken Chui; 06 Jul 2022, 16:02.



        • #5
          Thank you so much! This worked beautifully!
