  • Sample Split reghdfe and pweight

    Dear all,
    I have a question about using the reghdfe command with pweight in Stata 16. I run an interaction analysis and am interested in the effect of VAR1 conditional on VAR2. I weight the observations, as this makes sense for my question. That works fine.

    gen VAR1_VAR2 = VAR1*VAR2

    reghdfe ///
    ln_gross_investment_total ///
    VAR1 VAR2 ///
    VAR1_VAR2 ///
    [pweight = WEIGHT], cluster(CLUSTER) a(FE) keepsingleton

    Then I perform a sample split, dividing my sample by a dummy variable SPLIT into high (=1) and low (=0) values of that variable:

    reghdfe ///
    ln_gross_investment_total ///
    VAR1 VAR2 ///
    VAR1_VAR2 ///
    if SPLIT==1 [pweight = WEIGHT], cluster(CLUSTER) a(FE) keepsingleton

    reghdfe ///
    ln_gross_investment_total ///
    VAR1 VAR2 ///
    VAR1_VAR2 ///
    if SPLIT==0 [pweight = WEIGHT], cluster(CLUSTER) a(FE) keepsingleton

    Again, this works fine and I get reasonable results. Now the problem begins: when I estimate the whole sample again, using an interaction with SPLIT to test the significance of the difference, I get a different SPLIT_VAR1_VAR2 estimate than if I compute the difference manually.

    gen SPLIT_VAR1 = SPLIT*VAR1
    gen SPLIT_VAR2 = SPLIT*VAR2
    gen SPLIT_VAR1_VAR2 = SPLIT*VAR1_VAR2

    reghdfe ///
    ln_gross_investment_total ///
    VAR1 VAR2 ///
    VAR1_VAR2 ///
    SPLIT_VAR1 ///
    SPLIT_VAR2 ///
    SPLIT_VAR1_VAR2 ///
    [pweight = WEIGHT], cluster(CLUSTER) a(FE) keepsingleton

    Naturally, the coefficient on SPLIT_VAR1_VAR2 should equal the VAR1_VAR2 coefficient (if SPLIT==1) minus the VAR1_VAR2 coefficient (if SPLIT==0).

    When I exclude the weighting from my analysis, this holds true. When I use pweight, this is no longer the case. Can it be that pweight somehow biases my results in one direction or the other? I thought that pweight would also work well in subsamples.

    Thanks for your answers. I hope I have not forgotten anything. Let me know if you need further information.

    Best,
    Robert




  • #2
    Hi Robert
    The problem is not the weights but the interaction.
    When you run these models:

    reg y x1 x2 x3 i.id if split==0
    reg y x1 x2 x3 i.id if split==1

    it is as if you are interacting x1, x2, x3 and i.id with split.
    However, when you run this model:

    reghdfe y split##c.(x1 x2 x3) , abs(id)

    it is as if you are doing this:

    reg y split##c.(x1 x2 x3) i.id

    This assumes id is not interacted with split. That is why your results do not match what you expected.
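
    One way to make the pooled model comparable to the two split regressions is to let the absorbed fixed effects vary by split as well. The following is a minimal sketch, not a tested solution: it reuses the variable names from the original post (FE, SPLIT, WEIGHT, CLUSTER and the generated SPLIT_* interactions) and assumes FE is a single categorical variable, so that reghdfe's interaction syntax in absorb() applies.

    ```stata
    * Pooled model with the fixed effects interacted with SPLIT.
    * Assumes SPLIT_VAR1, SPLIT_VAR2 and SPLIT_VAR1_VAR2 were generated as in post #1.
    * No separate SPLIT main effect is needed: it is absorbed by FE#SPLIT.
    reghdfe ///
        ln_gross_investment_total ///
        VAR1 VAR2 VAR1_VAR2 ///
        SPLIT_VAR1 SPLIT_VAR2 SPLIT_VAR1_VAR2 ///
        [pweight = WEIGHT], cluster(CLUSTER) a(FE#SPLIT) keepsingleton
    ```

    With every regressor and the fixed effects interacted with SPLIT, the coefficient on SPLIT_VAR1_VAR2 should equal the difference between the two split-sample VAR1_VAR2 coefficients, and its t-statistic tests that difference. The standard errors need not coincide exactly with those from the separate regressions, because the pooled model uses one clustered covariance matrix and different degrees-of-freedom adjustments.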

    HTH
    Fernando
