  • Appropriate weight for regression using Census Public Use Microdata (PUMS)

    Hi all,

    I have had a hard time finding a definitive answer to this question and was hoping someone could fill in the blank. Specifically, what is the appropriate weight type to use with Census Public Use Microdata when running regressions in STATA? I've never found a consistent answer (i.e., I've seen pweight, fweight, and aweight all used). This pertains to regression analysis with PUMS data, where the person weight is pwgtp and the housing weight is wgtp.

    Thanks,

    Justin

  • #2
    The weights are pweights. American Community Survey data come with Successive Difference Replicate (SDR) weights (see, e.g., https://www2.census.gov/programs-sur..._ch12_2014.pdf). The data must be svyset to reflect this structure, or the standard errors will be wrong. For the person weights:
    Code:
    svyset [pw=pwgtp], sdrweight(pwgtp1-pwgtp80) vce(sdr)
    and for the household (HH) weights:
    Code:
    svyset [pw=wgtp], sdrweight(wgtp1-wgtp80) vce(sdr)
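
As a quick check, and only as a sketch: running svyset with no arguments redisplays the current survey settings, so you can confirm that the declaration took effect. The file name psam_p.dta below is hypothetical; pwgtp and pwgtp1-pwgtp80 are the person-weight variables named above.

```stata
* Hypothetical file name; substitute your own PUMS person-level extract
use psam_p.dta, clear

* Declare the person weight and the 80 SDR replicate weights
svyset [pw=pwgtp], sdrweight(pwgtp1-pwgtp80) vce(sdr)

* With no arguments, svyset redisplays the current survey settings
svyset
```
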
    Let me add that "STATA" is not the correct spelling; "Stata" is. The reason: "Stata" is not an acronym, which my OS X dictionary defines as
    an abbreviation formed from the initial letters of other words and pronounced as a word (e.g., ASCII, NASA)
    Two well-known programs whose names are acronyms: SAS ("Statistical Analysis System") and SPSS ("Statistical Package for the Social Sciences"). See the last FAQ.
    Last edited by Steve Samuels; 17 Apr 2018, 15:37.
    Steve Samuels
    Statistical Consulting

    Stata 14.2

    • #3
      Thanks for your help Steve.
      So after running the svyset code above, is it safe to run a regression, e.g., reg x y? Or is there an additional step I need before using the regression results?




      • #4
        You need to run one of Stata's survey commands, i.e., those that take the svy prefix. They are listed in the Stata 15 manual.
        So a regression would be:
        Code:
        svy: reg y x
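
Putting the two steps together, a minimal sketch of the full sequence might look like the following. The outcome wagp and the predictor agep are stand-ins chosen for illustration and are assumed to exist in the person-level file; replace them with your own variables.

```stata
* Step 1: declare the design once per session (person-level analysis)
svyset [pw=pwgtp], sdrweight(pwgtp1-pwgtp80) vce(sdr)

* Step 2: prefix the estimation command with svy:
* wagp and agep are illustrative variable names, not taken from this thread
svy: regress wagp agep
```

Note that a plain regress without the svy: prefix does not use the svyset declaration, so the SDR replicate weights would not enter the standard errors.
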
        Steve Samuels
        Statistical Consulting

        Stata 14.2
