Sample Split reghdfe and pweight

Rob Vo

Join Date: May 2022

Posts: 1
#1

12 May 2022, 09:53

Dear all,
I have a question about using the reghdfe command with pweight in Stata 16. I am running an interaction analysis and am interested in how the effect of VAR1 depends on VAR2. I use weights because they make sense for my research question. This works fine.

gen VAR1_VAR2 = VAR1*VAR2

reghdfe ///
ln_gross_investment_total ///
VAR1 VAR2 ///
VAR1_VAR2 ///
[pweight = WEIGHT], cluster(CLUSTER) a(FE) keepsingleton

Then I split the sample according to a dummy variable SPLIT into high (=1) and low (=0) groups:

reghdfe ///
ln_gross_investment_total ///
VAR1 VAR2 ///
VAR1_VAR2 ///
if SPLIT==1 [pweight = WEIGHT], cluster(CLUSTER) a(FE) keepsingleton

reghdfe ///
ln_gross_investment_total ///
VAR1 VAR2 ///
VAR1_VAR2 ///
if SPLIT==0 [pweight = WEIGHT], cluster(CLUSTER) a(FE) keepsingleton

Again, this works fine and I get reasonable results. Now the problem begins: when I estimate the whole sample again, using interactions with SPLIT to test the significance of the difference, I get a different SPLIT_VAR1_VAR2 estimate than when I compute the difference manually.

gen SPLIT_VAR1 = SPLIT*VAR1
gen SPLIT_VAR2 = SPLIT*VAR2
gen SPLIT_VAR1_VAR2 = SPLIT*VAR1_VAR2

reghdfe ///
ln_gross_investment_total ///
VAR1 VAR2 ///
VAR1_VAR2 ///
SPLIT_VAR1 ///
SPLIT_VAR2 ///
SPLIT_VAR1_VAR2 ///
[pweight = WEIGHT], cluster(CLUSTER) a(FE) keepsingleton

Naturally, the coefficient on SPLIT_VAR1_VAR2 should equal the coefficient on VAR1_VAR2 (if SPLIT==1) minus the coefficient on VAR1_VAR2 (if SPLIT==0).

When I drop the weights from my analysis, this holds. When I use pweight, it no longer does. Could pweight somehow be biasing my results? I thought pweight would also work well in subsamples.

Thanks for your answers. I hope I have not forgotten anything. Let me know if you need further information.

Best,
Robert
FernandoRios

Join Date: Apr 2014

Posts: 2534
#2

12 May 2022, 10:19

Hi Robert
The problem is not the weights but the interaction.
When you run these models:

reg y x1 x2 x3 i.id if split==0
reg y x1 x2 x3 i.id if split==1

it is as if you were interacting x1, x2, x3 and i.id with split.
However, when you run this model:

reghdfe y split##c.(x1 x2 x3) , abs(id)

it is as if you were doing this:

reg y split##c.(x1 x2 x3) i.id

This assumes that id is not interacted with split. That is why your results do not match what you were expecting.
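
To make the point concrete, here is a minimal sketch on a dataset shipped with Stata. Everything in it is an assumption for illustration, not part of the original question: nlsw88 stands in for the real data, south plays the role of split, industry the role of id, and the weight w is made up. It also assumes the community-contributed reghdfe is installed (ssc install reghdfe). The idea is to absorb the fixed effects interacted with the split dummy, so that the pooled model is fully interacted.

```stata
* Illustration only: stand-in data and made-up weights (assumptions, not the thread's data)
sysuse nlsw88, clear
gen double w = 1 + mod(idcode, 5)          // hypothetical pweight

* Split-sample estimates: store the tenure coefficient from each subsample
reghdfe wage tenure hours if south==0 [pweight = w], absorb(industry)
scalar b0 = _b[tenure]
reghdfe wage tenure hours if south==1 [pweight = w], absorb(industry)
scalar b1 = _b[tenure]

* Pooled model: interact the regressors AND the fixed effects with the split dummy
gen south_tenure = south*tenure
gen south_hours  = south*hours
egen ind_south   = group(industry south)   // one fixed effect per industry-by-south cell
reghdfe wage tenure hours south_tenure south_hours [pweight = w], absorb(ind_south)

* The two numbers should agree up to numerical precision
display "difference from split samples:  " b1 - b0
display "pooled interaction coefficient: " _b[south_tenure]
```

With the variable names from the question, the analogous step would be egen FE_SPLIT = group(FE SPLIT) followed by a(FE_SPLIT) in the pooled regression. Only the point estimates are expected to match; the standard errors can still differ from the split-sample ones because the pooled model estimates the variance jointly.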

HTH
Fernando