Bootstrap program in two stage estimation by group

Konstantina Maragkou New

Join Date: Mar 2022

Posts: 11
#1

15 Nov 2023, 07:08

Dear all,

I am trying to get bootstrapped standard errors for a two-stage estimation procedure. The first stage estimates a group-level measure that is then used as an independent variable in the second stage. My dataset is at the individual level in the first stage and is reduced to the group level in the second stage. I'd like to write a program that can be called by the bootstrap command and does the following:

Code:

statsby, by(group) saving(test, replace): reg y1 i.x
merge m:1 group using "test.dta", nogen
g mod=_b[1.x]
duplicates drop group, force
reg y2 mod

This is the program I have written so far, but it returns the error "no observations", which is probably due to how I treat the first stage. Any help would be very much appreciated.

Code:

program "twostage", eclass
    * First stage:
    tempfile test
    statsby, by (group) saving(`test', replace): reg y1 i.x
    merge m:1 group using `test', nogen
    tempvar mod
    g `mod'=_b[1.x]
    * Second stage
    duplicates drop group, force
    reg y2 `mod'
    tempvar used
    g byte `used' = e(sample)
    tempname b
    mat `b' = e(b)
    mat colnames `b' = mod _cons
    ereturn post `b', esample(`used')
end

bootstrap _b, reps(1000) seed(16121992) cluster(group): twostage
Bert Lloyd

Join Date: Apr 2014

Posts: 110
#2

15 Nov 2023, 17:38

The double-quotes around twostage in the program definition line look like they could cause a problem. Maybe not the problem you're having, though.
Konstantina Maragkou New

Join Date: Mar 2022

Posts: 11
#3

21 Nov 2023, 05:48

Thank you, Bert. You're correct, I appreciate the insight about not using quotes. I made progress in the first stage by correcting an additional step, specifically by creating the temporary variable `mod' as part of the statsby command. Now, the first stage runs smoothly.
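
The revised first stage is not shown in the thread. As a minimal sketch of what "creating `mod' as part of the statsby command" could look like, the example below names the statistic directly in statsby and then merges it back onto the individual-level data. The simulated data and the fixed variable name mod (instead of a temporary variable) are assumptions made for illustration only.

```stata
* Hypothetical simulated data: 10 groups of 20 individuals each
clear
set seed 12345
set obs 200
generate group = ceil(_n/20)
generate byte x = mod(_n, 2)
generate y1 = 1 + 0.5*x + rnormal()

* First stage: store the coefficient on 1.x for each group under the name mod
tempfile test
statsby mod=_b[1.x], by(group) saving(`test', replace): regress y1 i.x

* Attach the group-level estimate to every individual in that group
merge m:1 group using `test', nogenerate
```

Naming the statistic in statsby avoids referring to _b[1.x] after the merge, when it would only reflect whichever regression happened to run last.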

The challenge is now in the second stage, when I attempt to run the regression at the group level by dropping the duplicate observations. I tried to preserve and restore the data, thinking that this would allow the regression to run at the group level and then restore the data for the next replication, but this approach does not seem to work. I'm looking for alternative options. Is there a way to restrict the regression to run only at the group level?

Your guidance on tackling this would be highly appreciated!
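
One possible way to run the second stage at the group level without dropping observations is to flag a single observation per group with egen's tag() function and restrict the regression to the flagged rows. The sketch below is untested against the poster's data and rests on several assumptions: the simulated dataset, the variable names mod and bsgroup, and the use of a loop over groups instead of statsby are all hypothetical. The bootstrap options idcluster() and nodrop are included because a group drawn more than once would otherwise be pooled into one group, and because bootstrap may otherwise keep only the observations in e(sample) of the second-stage regression; whether this resolves the error reported above is not confirmed in the thread.

```stata
* Hypothetical simulated data: 30 groups of 20 individuals each
clear
set seed 12345
set obs 600
generate group = ceil(_n/20)
generate byte x = mod(_n, 2)
generate double y1 = 1 + (group/30)*x + rnormal()

* y2 is a group-level outcome, constant within group
generate double u = rnormal()
egen double e2 = mean(u), by(group)
generate double y2 = 2 + group/30 + e2

* Copy of the cluster id; idcluster() is expected to rewrite it in each resample
* (assumption: creating it beforehand follows a commonly used pattern, please verify)
generate bsgroup = group

capture program drop twostage
program define twostage, eclass
    * Fixed name so the coefficient label is identical in every replication
    capture drop mod
    tempvar tag
    quietly generate double mod = .

    * First stage: one regression per (resampled) group
    * Assumes x takes both values within every group
    quietly levelsof bsgroup, local(groups)
    foreach g of local groups {
        quietly regress y1 i.x if bsgroup == `g'
        quietly replace mod = _b[1.x] if bsgroup == `g'
    }

    * Second stage: flag exactly one observation per group, keep all rows
    egen byte `tag' = tag(bsgroup)
    regress y2 mod if `tag'
end

bootstrap _b, reps(200) seed(16121992) cluster(group) idcluster(bsgroup) nodrop: twostage
```

Because the program leaves the regress results in e(), no separate ereturn post step is needed in this sketch, and the data are never reduced, so nothing has to be preserved or restored between replications.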