Dear Statalisters,
I am trying to estimate how learning experience (variable "HWB") affects task performance (variable "performance"), which is continuous. HWB is endogenous, so I use IV regression to address the endogeneity. To correct for sample selection, I estimate the inverse Mills ratio (IMR) from a probit regression where the DV is "worked", which indicates whether the worker worked in that hourly slot. I then include the IMR in the main equation to estimate the effect of HWB on performance, as follows:
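xtset, clear
capture program drop heckman
program heckman, eclass
    * Selection equation: probit for whether the worker worked in the hourly slot
    sum worked
    probit worked avgcomp_last HWB controls1
    matrix b1 = e(b)
    * Probit score, used as the IMR in the main equation
    capture drop IMR
    predict IMR, score
    * Main equation: fixed-effects IV regression, HWB instrumented by HWB_lagday
    xtset courier_id
    xi: xtivreg2 performance controls1 controls2 IMR (HWB = HWB_lagday), fe
    matrix b2 = e(b)
    * Stack both coefficient vectors so that bootstrap covers both steps
    matrix coleq b1 = choice
    matrix coleq b2 = level
    matrix b = b2, b1
    ereturn post b
end
bootstrap, reps(50) seed(12345) cluster(courier_id) idcluster(newid): heckman
est sto m1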
However, one of my DVs, "performance1", is observed only when the variable "stockout_reqsub"==1. In short, there is a second selection issue here, and I cannot find a proper way to deal with it. My question is: should I add another probit regression,
sum stockout_reqsub
probit stockout_reqsub controls3
matrix b3=e(b)
capture drop IMR2
predict IMR2, score
and then include both IMR (from the "worked" equation) and IMR2 (from the "stockout_reqsub" equation) in the final equation?
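To make the question concrete, here is a rough sketch of what I have in mind. It only combines the two pieces of code above; the names heckman2, choice2, and m2 are made up for illustration, and I am not sure whether the approach itself is valid.

```stata
xtset, clear
capture program drop heckman2
program heckman2, eclass
    * First selection step: did the worker work in this hourly slot?
    probit worked avgcomp_last HWB controls1
    matrix b1 = e(b)
    capture drop IMR
    predict IMR, score
    * Second selection step: is performance1 observed (stockout_reqsub == 1)?
    probit stockout_reqsub controls3
    matrix b3 = e(b)
    capture drop IMR2
    predict IMR2, score
    * Main equation with both selection terms (this is the part I am unsure about)
    xtset courier_id
    xtivreg2 performance1 controls1 controls2 IMR IMR2 (HWB = HWB_lagday), fe
    matrix b2 = e(b)
    * Stack all three coefficient vectors for the bootstrap
    matrix coleq b1 = choice
    matrix coleq b3 = choice2
    matrix coleq b2 = level
    matrix b = b2, b1, b3
    ereturn post b
end
bootstrap, reps(50) seed(12345) cluster(courier_id) idcluster(newid): heckman2
est sto m2
```

As in the first program, both probit steps run inside the bootstrapped program, so the standard errors would reflect the estimation of IMR and IMR2.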
My dataset is as below:
Code:
* Example generated by -dataex-. For more info, type help dataex
clear
input long order_id float worked double performance float stockout_reqsub long num_item double num_stockouts float(avg_numstockout_reqsub time_dum) byte day_of_week float(avgcomp_last HWB CSF_day precip_hourly)
      . 0 . 0  . .         0 . 2        6 0     9    0
      . 0 . 0  . .  6.333333 . 4      9.3 0     0    0
      . 0 . 0  . .       1.5 . 6     6.64 7 53.55    0
6847080 1 3 1 16 4 2.3333333 2 5 8.592857 0   5.5    0
      . 0 . 0  . .  7.621622 . 5     7.88 5 45.55 .019
      . 0 . 0  . . 3.0714285 . 6   9.8125 6  50.3    0
      . 0 . 0  . .         0 . 6 8.700001 0  12.9    0
      . 0 . 0  . . 4.2222223 . 4    13.95 0   5.5    0
5962762 1 0 1 74 1 2.7011495 4 5 8.525001 4 52.25    0
      . 0 . 0  . .  3.029412 . 0     6.64 0     0    0
5775032 1 . 0 48 0 2.2173913 1 6 7.306667 0 11.64    0
4603736 1 0 1 37 2 3.3809524 4 3 8.394285 2 29.55    0
      . 0 . 0  . .  1.728395 . 6        7 0     0 .002
      . 0 . 0  . .         0 . 1   6.8375 6 53.45    0
      . 0 . 0  . .         0 . 4     7.73 0  6.14    0
      . 0 . 0  . .  5.457627 . 5 6.897143 5 43.95    0
      . 0 . 0  . .         1 . 6     9.75 0     0    0
      . 0 . 0  . .       4.3 . 0      9.9 1 18.85    0
6053104 1 2 1 15 2      2.75 2 6        8 0     6    0
      . 0 . 0  . . 2.2857144 . 5      5.5 0     0    0
5102814 1 . 0  6 0     3.375 4 2      9.3 3 25.04    0
end