Hi
I am trying to run a regression for each industry-year group within a data set, and then generate variables of the estimated coefficients. My two current code attempts are below:
On the second industry-year combo, this produces an error that the new variable b_x1 is already defined. What I am trying to do is to essentially get this for each industry-year combo. I guess it may need a nested loop to capture the fact it is for each industry-year group .. ?
Other code I had tried was using -runby-:
The issue in this case was that I only want to run the regression for industry-years where I have at least 10 observations (of y x1 x2) in the regression. I couldn't seem to get a way to only run for industry-years with at least 10 observations in the regression. Instead, it ran for all and removed some observations from my data set. It feels like there should be a simple fix for this, but I am yet to find one which does the trick and feel like I've been going round in circles..
Data sample below (although very small relative to the full data set):
Any advice is much appreciated.
Paul
I am trying to run a regression for each industry-year group within a data set, and then generate variables of the estimated coefficients. My two current code attempts are below:
Code:
forval y=2002/2021 {
forval i= 1/97 {
di "Year = `y' and sic2 = `i'"
count if sic2 == `i' & Year == `y' & !missing(yy, x1, x2)
if r(N) > 10 {
regress yy x1 x2 if sic2== `i' & Year==`y'
foreach v of varlist x1 x2 {
gen b_`v' = _b[`v']
}
}
}
}
On the second industry-year combo, this produces an error that the new variable b_x1 is already defined. What I am trying to do is to essentially get this for each industry-year combo. I guess it may need a nested loop to capture the fact it is for each industry-year group .. ?
Other code I had tried was using -runby-:
Code:
capture program drop myregress
program define myregress
regress y x1 x2
foreach v of varlist x1 x2 {
gen b_`v' = _b[`v']
}
gen n_obs = e(N)
exit
end
runby myregress, by(sic2 Year)
The issue in this case was that I only want to run the regression for industry-years where I have at least 10 observations (of y x1 x2) in the regression. I couldn't seem to get a way to only run for industry-years with at least 10 observations in the regression. Instead, it ran for all and removed some observations from my data set. It feels like there should be a simple fix for this, but I am yet to find one which does the trick and feel like I've been going round in circles..
Data sample below (although very small relative to the full data set):
Code:
* Example generated by -dataex-. For more info, type help dataex clear input int Year byte sic2 float(id y x1 x2) 2001 62 1 . . . 2002 62 1 . . . 2003 62 1 . . . 2004 62 1 . . . 2005 62 1 . . . 2006 62 1 . . . 2007 62 1 . . . 2008 62 1 . . . 2009 62 1 . . . 2010 62 1 . . . 2011 62 1 . . . 2012 62 1 -.015869012 .00001050928 -.09859806 2013 62 1 -.13666852 9.380775e-06 . 2014 62 1 -.06423958 .000013209865 . 2015 62 1 -.20667164 .00001164741 .09992313 2016 62 1 .005849376 3.9231227e-06 -.01027858 2017 62 1 . 4.980378e-06 . 2018 62 1 . . . 2019 62 1 . . . 2020 62 1 . . . 2021 62 1 . . . 2001 35 2 . . . 2002 35 2 . . . 2003 35 2 . . . 2004 35 2 . . . 2005 35 2 . . . 2006 35 2 . . . 2007 35 2 . . . 2008 35 2 . . . 2009 35 2 . . . 2010 35 2 -.08069245 9.844865e-07 -.08591416 2011 35 2 -.0487175 1.043447e-06 -.1507656 2012 35 2 . 1.2754988e-06 . 2013 35 2 . . . 2014 35 2 . . . 2015 35 2 . . . 2016 35 2 . . . 2017 35 2 . . . 2018 35 2 . . . 2019 35 2 . . . 2020 35 2 . . . 2021 35 2 . . . 2001 36 3 . . . 2002 36 3 . . . 2003 36 3 . . . 2004 36 3 . . . 2005 36 3 . . . 2006 36 3 . . . 2007 36 3 . . . 2008 36 3 . . . 2009 36 3 . . . 2010 36 3 . . . 2011 36 3 . . . 2012 36 3 . .00013303179 . 2013 36 3 . . . 2014 36 3 . . . 2015 36 3 . . . 2016 36 3 . . . 2017 36 3 . . . 2018 36 3 . . . 2019 36 3 . . . 2020 36 3 . . . 2021 36 3 . . . 2001 73 4 . . . 2002 73 4 . . . 2003 73 4 . . . 2004 73 4 . . . 2005 73 4 . . . 2006 73 4 . . . 2007 73 4 . . . 2008 73 4 . . . 2009 73 4 . . . 2010 73 4 . . . 2011 73 4 -.034475815 8.390318e-06 .08454923 2012 73 4 . 7.091846e-06 . 2013 73 4 . . . 2014 73 4 . . . 2015 73 4 . . . 2016 73 4 . . . 2017 73 4 . . . 2018 73 4 . . . 2019 73 4 . . . 2020 73 4 . . . 2021 73 4 . . . 2001 14 5 . . . 2002 14 5 . . . 2003 14 5 . . . 2004 14 5 . . . 2005 14 5 . . . 2006 14 5 . . . 2007 14 5 . . . 2008 14 5 . . . 2009 14 5 . . . 2010 14 5 . . . 2011 14 5 . . . 2012 14 5 .009635898 1.6602168e-06 .1071836 2013 14 5 .011130786 1.5824262e-06 .09883676 2014 14 5 -.01621402 1.521586e-06 .017382598 2015 14 5 .002984429 1.556011e-06 -.00879613 2016 14 5 -.06915921 1.202309e-06 .06596348 end
Any advice is much appreciated.
Paul

Comment