  • weight issue with pctile

    Greetings all,

    I'm trying to construct percentiles using the pctile command with Stata.

    This is the code that works, with no weights:

    pctile pct = (robot_pred), nq(722) genp(percent)

    This is the code that does not work. It includes my attempt to introduce population weights:

    pctile pct = (robot_pred) [ipums_pop_1990=weight], nq(722) genp(percent)

    The error that is returned is:

    weights not allowed
    r(101);


    Does anyone know what may be causing this error? I'm happy to indulge work-arounds, but am positively befuddled as to why pctile is generating everything fine, but not allowing weights to be introduced. My understanding is that pctile allows weights when used in this way.

    For context: I'm trying to population-weight what is basically community-level data. The underlying unit of observation is the community (of which there are 722 in this dataset). And the weight would just be each community's population in 1990. All on Stata 17.

  • #2
    See the output of
    Code:
    help weight
    for instructions on the correct format for specifying weights, with examples. Apparently what you typed confused Stata into providing an unhelpful error message.

    I am surprised that the unweighted code you ran did work, since the output of
    Code:
    help pctile
    does not suggest the use of parentheses surrounding the variable name.


    • #3
      As William said, you have confused Stata, and the error message is not meaningful. This sort of stuff happens when we make up our own syntax and do not read the help file for the command we are using. The syntax of the command is

      pctile [type] newvar = exp [if] [in] [weight] [, pctile_options]

      so your command should look something like this:

      pctile pct = robot_pred [pw = ipums_pop_1990], nq(722) genp(percent)
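
      To see what the weight changes, here is a minimal sketch on made-up data. The variables score and pop and their values are hypothetical stand-ins for robot_pred and ipums_pop_1990, not taken from your dataset.

      ```stata
      * Toy data: five hypothetical communities (score and pop are invented)
      clear
      input score pop
      1 100
      2 100
      3 100
      4 100
      5 600
      end

      * nq(2) stores a single cut point, the median, in observation 1
      pctile p_unw = score, nq(2)
      pctile p_wt  = score [pweight = pop], nq(2)

      * Compare the unweighted and the population-weighted median
      list p_unw p_wt in 1
      ```

      Because the last community holds most of the total population, the weighted median is pulled toward its score, while the unweighted median treats all five communities alike.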


      • #4
        Please note our firm preference for full real names as explained at

        https://www.statalist.org/forums/help#realnames

        https://www.statalist.org/forums/help#adviceextras #3

        and re-register as there explained.



        • #5
          On a closer reading of post #1, I think you may have misunderstood the meaning of the nq() option.

          To me, "quartile" means dividing a population into four groups, "decile" means 10 groups, and "percentile" means 100 groups. In post #1 you say you want to construct "percentiles", yet you divide your 722 communities into 722 groups. That is an unusual objective in any event, and "quantiles" would be a better description than "percentiles".

          In the following simplified example, my objective is to divide 20 observations of x into four groups. Since the observations are sorted by the value of x, and since the 20 values are distinct, it's particularly easy to see the differences in three different approaches.
          Code:
          . clear all
          
          . set obs 20
          Number of observations (_N) was 0, now 20.
          
          . generate x = _n
          
          . pctile q1 = x, nq(4)
          
          . pctile q2 = x, nq(20)
          
          . xtile  q3 = x, nq(4)
          
          . list, noobs
          
            +-----------------------+
            |  x     q1     q2   q3 |
            |-----------------------|
            |  1    5.5    1.5    1 |
            |  2   10.5    2.5    1 |
            |  3   15.5    3.5    1 |
            |  4      .    4.5    1 |
            |  5      .    5.5    1 |
            |-----------------------|
            |  6      .    6.5    2 |
            |  7      .    7.5    2 |
            |  8      .    8.5    2 |
            |  9      .    9.5    2 |
            | 10      .   10.5    2 |
            |-----------------------|
            | 11      .   11.5    3 |
            | 12      .   12.5    3 |
            | 13      .   13.5    3 |
            | 14      .   14.5    3 |
            | 15      .   15.5    3 |
            |-----------------------|
            | 16      .   16.5    4 |
            | 17      .   17.5    4 |
            | 18      .   18.5    4 |
            | 19      .   19.5    4 |
            | 20      .      .    4 |
            +-----------------------+
          As you can see, q3 receives the group number each observation is assigned to. q1 receives the three values that separate the observations into four groups. q2, akin to what you did, receives the 19 values that separate the 20 observations into 20 groups.
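
          If what you want is a weighted group assignment for each observation, xtile accepts weights in the same bracket syntax as pctile. The following is only a sketch that extends the example above; the weight variable w and its values are made up for illustration.

          ```stata
          * Same 20 observations as above, plus a hypothetical weight variable w
          clear
          set obs 20
          generate x = _n

          * Invented weights: the upper half of the data counts double
          generate w = cond(_n > 10, 2, 1)

          * Group assignment without and with weights
          xtile g_unw = x, nq(4)
          xtile g_wt  = x [pweight = w], nq(4)

          list x w g_unw g_wt, noobs
          ```

          With weights, the groups are formed so that each holds roughly a quarter of the total weight rather than a quarter of the observations, so the group boundaries shift toward the more heavily weighted values of x.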
