Regression discontinuity with aweight vs pweight?

Andy dB

Join Date: Sep 2015

Posts: 23
#1


01 Apr 2017, 14:31

Hi everyone,

I'd like to write a simple, "sharp" RD command with a triangular kernel - but I'm wondering whether I should use aw or pw weights.

I've seen both. This previous thread uses pw, whereas the "rd" ado seems to use aw. I've tried to read up on it but I'm confused.

I create my weights as follows:

Code:

gen weight=max(0,`band'-abs(forcing))

I'm not sure whether I should run

Code:

reg dv above forcing forcing2 aboveXforcing aboveXforcing2 [aw=weight]

or

Code:

reg dv above forcing forcing2 aboveXforcing aboveXforcing2 [pw=weight]

where
"band" = the bandwidth
"forcing" = the forcing variable, centered at the cutoff
"above" = indicates being above the cutoff, i.e. assignment to treatment

Could you help?

Thank you in advance,

Andy
Tags: syntax, weight
Clyde Schechter

Join Date: Apr 2014

Posts: 30100
#2

01 Apr 2017, 16:00

The difference between aweights and pweights is not about what command you are using. It's about the relationship of the dataset to the sample and population.

If you used a sampling design in which different people (firms, or whatever entities you are analyzing) enter the sample with different probabilities, that is handled with -pweights-. They apply when the sample is not a simple random sample.

-aweights- are different. They are intended for situations where the individual observations in the data set are composites of different numbers of units, and the values in the data are averages over those units. For example, each person may have been measured a certain number of times (different numbers for different people), and the data set contains one observation per person with the average. Or each firm has a certain number of employees, each observation is at the firm level, and it reports the average compensation of all the firm's employees. Situations like that are what -aweights- are for.
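To make the firm example concrete, here is a minimal sketch on simulated data. The variable names (n_employees, x, avg_comp), the seed, and all the numbers are made up for illustration. The only substantive point is that an average over n units has variance proportional to 1/n, which is the situation -aweights- assume.

```stata
* Illustrative sketch only: simulated firm-level data, all names hypothetical
clear
set seed 2017
set obs 200

gen n_employees = ceil(200*runiform())   // firm size, between 1 and 200
gen x = rnormal()                        // some firm-level covariate

* each observation is an average over n_employees workers,
* so its error standard deviation shrinks like 1/sqrt(n_employees)
gen avg_comp = 10 + 0.3*x + rnormal()/sqrt(n_employees)

* weight each firm-level average by the number of units behind it
regress avg_comp x [aw=n_employees]
```

Firms with more employees report more precise averages, so they get more weight in the fit.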
Andy dB

Join Date: Sep 2015

Posts: 23
#3

01 Apr 2017, 16:26

Thank you very much. It looks like pweights are the way to go here. It's not that observations closer to the cutoff had a different probability of being sampled; rather, they're treated as "more important" (and they don't represent more than one observation, either). In fact, it looks like -rd- actually uses pweights. Sorry about the confusion.
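A note for readers comparing the two: with -regress-, aweights and pweights give the same coefficients, so the choice affects only the standard errors (pweights imply robust ones). Below is a minimal sketch on simulated data. The variable names follow this thread; the seed, sample size, bandwidth of 0.5, and true jump of 0.5 are assumptions made up for the example. My understanding is that adding vce(robust) to the aweight call reproduces the pweight standard errors, and the last line lets you check that for yourself.

```stata
* Illustrative sketch only: simulated sharp RD with a triangular kernel
clear
set seed 12345
set obs 1000

gen forcing = runiform(-1, 1)            // forcing variable, centered at the cutoff
gen above   = forcing >= 0               // assignment to treatment
gen dv      = 0.5*above + forcing + rnormal()

local band 0.5                           // assumed bandwidth
gen weight = max(0, `band' - abs(forcing))

gen forcing2       = forcing^2
gen aboveXforcing  = above*forcing
gen aboveXforcing2 = above*forcing2

regress dv above forcing forcing2 aboveXforcing aboveXforcing2 [aw=weight]
matrix b_aw = e(b)

regress dv above forcing forcing2 aboveXforcing aboveXforcing2 [pw=weight]
matrix b_pw = e(b)
matrix V_pw = e(V)

regress dv above forcing forcing2 aboveXforcing aboveXforcing2 [aw=weight], vce(robust)
matrix V_awr = e(V)

* largest relative difference between the two coefficient vectors
display mreldif(b_aw, b_pw)
* same comparison for pweight vs. aweight + robust variance matrices
display mreldif(V_pw, V_awr)
```

The coefficient on above is the estimated jump at the cutoff in each call.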