  • Calculating percentile shares of total, by groups, using pshare

    Dear Statalisters

    I have person-level microdata (in Stata 15) which describes hourly wages (hw), with people nested in locations ('city'). I'm interested in calculating the share of total city wages held by individuals in the top 1, and separately top 10 percent of each city wage distribution, for each of my 700 or so cities. Ideally I would put this value in a new variable for all individuals in that city.

    I stumbled across a program written by Ben Jann called pshare (http://www.stata-journal.com/article...article=st0432) which seems to do what I want, kind of. However, it does not work on groups using 'by' for instance. Nor does it create a new variable with the estimated shares.

    I'm trying to figure out the best workaround. I was thinking about doing something like the following (for the top 1 percent share):

    Code:
    gen w1psh = .
    levelsof city, local(levels)
    foreach i of local levels {
        pshare estimate hw [fw=p] if city == `i', p(99)
        replace w1psh = XXXXXXXXX if city == `i'
    }
    Here XXXXXXXXX would need to be a value stored by the pshare command. But I cannot find where pshare saves what it estimates. Its help file says:

    pshare estimate stores the following in e():

    Scalars
      e(N)                number of observations
      e(N_over)           number of subpopulations
      e(N_clust)          number of clusters
      e(k_eq)             number of equations in e(b)
      e(bins)             number of bins (percentile groups) per equation
      e(df_r)             sample degrees of freedom
      e(rank)             rank of e(V)
      e(level)            confidence level for CIs

    Macros
      e(cmd)              pshare
      e(cmdline)          command as typed
      e(depvar)           name(s) of outcome variable(s)
      e(pvar)             name of variable specified in pvar()
      e(type)             proportion, percent, density, sum, average, or generalized
      e(norm)             # or names of reference variables or empty
      e(normpop)          total or overvar = # or empty
      e(percentiles)      percentile thresholds
      e(step)             step or empty
      e(gini)             gini or empty
      e(over)             name of over() variable
      e(over_namelist)    values from over() variable
      e(over_labels)      labels from over() variable
      e(total)            total or empty
      e(contrast)         contrast or empty
      e(baseval)          + or value/name of base for contrasts
      e(ratio)            ratio or empty
      e(lnratio)          lnratio or empty
      e(wtype)            weight type
      e(wexp)             weight expression
      e(clustvar)         name of cluster variable
      e(vce)              vcetype specified in vce()
      e(vcetype)          title used to label Std. Err.
      e(title)            title in estimation output
      e(properties)       b V or b

    Am I correct that none of these actually contains the estimated percentile share? This seems really odd, so I'm sure I am missing something.

    Any thoughts?
    Best,
    Tom
    Last edited by Tom Kemeny; 24 May 2018, 08:06.

  • #2
    The estimated percentile shares are stored in the coefficient vector e(b), so you can use _b[] to access the values. Type matrix list e(b) after running pshare to have a look at e(b). Here is an example:
    Code:
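    sysuse nlsw88, clear
    gen w1psh = .
    levelsof industry, local(levels)
    foreach i of local levels {
        // p(99) splits the distribution into two bins: 0-99 and 99-100
        pshare estimate wage if industry == `i', p(99)
        // store the top 1 percent share for this group
        replace w1psh = _b[99-100] if industry == `i'
    }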
    Alternatively, you could also copy e(b) into a regular matrix and then select the relevant matrix element from there:
    Code:
    sysuse nlsw88, clear
    gen w1psh = .
    levelsof industry, local(levels)
    foreach i of local levels {
        pshare estimate wage if industry == `i', p(99)
        // e(b) is a 1 x 2 row vector here: column 1 is 0-99, column 2 is 99-100
        matrix b = e(b)
        replace w1psh = b[1,2] if industry == `i'
    }
    Furthermore, note that pshare does work on groups. The relevant option is called over(); other Stata commands, such as mean, use the same option name. Using this option, you could do the following:
    Code:
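    sysuse nlsw88, clear
    pshare estimate wage, p(99) over(industry)
    matrix b = e(b)
    gen w1psh = .
    local j = 0
    foreach i in `e(over_namelist)' {
        // each group contributes two coefficients (0-99, 99-100),
        // so the top 1 percent share sits in every second column
        local j = `j' + 2
        replace w1psh = b[1,`j'] if industry == `i'
    }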
    ben
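
    The original question also asks for the top 10 percent share, so here is a minimal sketch of one way to get both shares from a single call. It assumes that p(90 99) produces three bins per group (0-90, 90-99, 99-100) in that order, in line with the two-bin layout of the p(99) examples above, and that the default proportion scale is used so that bins can be added. The variable names w1psh and w10psh are only illustrative.

```stata
* Sketch only: assumes pshare is installed and that the bins are
* ordered 0-90, 90-99, 99-100 within each group
sysuse nlsw88, clear
pshare estimate wage, p(90 99) over(industry)
matrix b = e(b)
gen w1psh  = .
gen w10psh = .
local j = 0
foreach i in `e(over_namelist)' {
    // three coefficients per group, so step through e(b) in threes
    local j = `j' + 3
    local k = `j' - 1
    // top 1 percent share: the 99-100 bin
    replace w1psh = b[1,`j'] if industry == `i'
    // top 10 percent share: the 90-99 bin plus the 99-100 bin
    replace w10psh = b[1,`k'] + b[1,`j'] if industry == `i'
}
```

    Running matrix list e(b) after the pshare call is a quick way to confirm the bin order before relying on the column positions.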
