  • Bootstrap program in two stage estimation by group

    Dear all,

    I am trying to get bootstrapped standard errors for a two-stage estimation procedure. I use the first stage to estimate a group-level measure that serves as an independent variable in the second stage. My dataset is at the individual level in the first stage and is then reduced to the group level in the second stage. I'd like to write a program that can be called by the bootstrap command and does the following:

    Code:
    statsby, by(group) saving(test, replace): reg y1 i.x
     
    merge m:1 group using "test.dta", nogen
     
    g mod=_b[1.x]
     
    duplicates drop group, force
     
    reg y2 mod
    This is the program I have written so far, but it returns the error “no observations”, which is presumably due to how I treat the first stage. Any help would be very much appreciated.

    Code:
    program “twostage”, eclass
    
    *First stage:
    tempfile test
    statsby, by (group) saving(`test', replace): reg y1 i.x
    
    merge m:1 group using `test', nogen  
    
    tempvar mod
    g `mod' = _b[1.x]
    
    *Second stage
    duplicates drop group, force 
    reg y2 `mod'
     
    tempvar used
    g byte `used' = e(sample)
    
    tempname b
    mat `b' = e(b)
    mat colnames `b' = mod _cons
     
    ereturn post `b', esample(`used')
    
    end
    
    bootstrap _b, reps(1000) seed(16121992) cluster(group): twostage

  • #2
    The double-quotes around twostage in the program definition line look like they could cause a problem. Maybe not the problem you're having, though.

    • #3
      Thank you, Bert. You're correct, and I appreciate the insight about not using quotes. I made progress in the first stage by correcting an additional step, specifically by creating the temporary variable `mod' as part of the statsby command. Now, the first stage runs smoothly.

      The challenge is now in the second stage, when I attempt to run the regression at the group level by dropping the duplicate observations. I tried to preserve and restore the data, thinking that this would let the regression run at the group level and then restore the data for the next replication, but this approach does not seem to work. I'm looking for alternative options. Is there a way to specify during the regression to only run at the group level?

      Your guidance on tackling this would be highly appreciated!
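
      One possible direction, given only as a rough sketch and not a tested solution: instead of dropping duplicates, tag one observation per group and restrict the second-stage regression to the tagged rows, so the resampled data stay intact. The sketch assumes that group is numeric, that y2 is constant within group, that x takes both values 0 and 1 in every group, and that a cluster identifier called newgroup (a name chosen here for illustration) is available. It also swaps statsby for a plain loop over groups and does not post e(sample).

```stata
capture program drop twostage
program define twostage, eclass
    // ASSUMPTION: newgroup identifies the (resampled) clusters; it is a
    // hypothetical name, not a variable from the original post
    tempvar mod tag
    tempname b
    quietly {
        // First stage: one regression per group, storing the 1.x coefficient
        generate double `mod' = .
        levelsof newgroup, local(groups)
        foreach g of local groups {
            regress y1 i.x if newgroup == `g'
            replace `mod' = _b[1.x] if newgroup == `g'
        }
        // Second stage: keep all rows, but use one tagged row per group
        egen byte `tag' = tag(newgroup)
        regress y2 `mod' if `tag'
    }
    // Post the second-stage coefficients under stable column names
    matrix `b' = e(b)
    matrix colnames `b' = mod _cons
    ereturn post `b'
end
```

      A call might then look like the following. The idcluster() option gives each resampled cluster its own identifier, so a group drawn twice is treated as two groups, and nodrop asks bootstrap not to drop observations outside e(sample). The variable newgroup is created beforehand so that the program also runs on the original data.

```stata
generate newgroup = group
bootstrap _b, reps(1000) seed(16121992) cluster(group) idcluster(newgroup) nodrop: twostage
```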
