Dear Statlist Users,
I have a data set with 500 firms and 550 mergers, covering 1996 to 2010.
For each merger, I have an acquirer and observations for the acquirer firm from 5 years before to 5 years after the merger.
I want to predict a_sales_{t+1,i} using independent variables including a_sales_{t,i}, a_sales_{t-1,i}, a_market_{t-1,i} and purpose1_str_i.
Here i is a firm and t is the year in which the merger occurred, namely merger_year. So year_i = t if year_i = merger_year_i.
Each merger has a unique id sdcdealno, each firm has a unique id a_cusip6. As you can imagine, the same acquirer id a_cusip6 can appear for different mergers.
My question concerns how to model this data conceptually and in Stata.
It is not a proper pooled cross-sectional data set, since the same firm could appear in the range 1996-2006 and in the range 2000-2010 if it makes acquisitions in different years. Thus, the assumption of independent samples is violated.
It is not a balanced panel data set, because I don't have observations for each firm for the entire period (1996 to 2010) but only for the t-5 to t+5 window.
I have some merger observations where, say, t-5 is missing. Yet I don't think the definition of unbalanced panel data applies, because even if I dropped those observations, the data set would still not be balanced in strict terms.
Code:
* Example generated by -dataex-. To install: ssc install dataex
clear
input str10 sdcdealno str6 a_cusip6 int merger_year double year str17(purpose1_str purpose2_str) double(a_market a_sales)
"1000844020" "500453" 2000 1995 "synergy" "none" 0 512.248
"1000844020" "500453" 2000 1996 "synergy" "none" 14 579.096
"1000844020" "500453" 2000 1997 "synergy" "none" 0 632.47
"1000844020" "500453" 2000 1998 "synergy" "none" 16 330.872
"1000844020" "500453" 2000 1999 "synergy" "none" 16 335.249
"1000844020" "500453" 2000 2000 "synergy" "none" 7 358.463
"1000844020" "500453" 2000 2001 "synergy" "none" 0 282.613
"1000844020" "500453" 2000 2002 "synergy" "none" 20 286.704
"1000844020" "500453" 2000 2003 "synergy" "none" 9 438.292
"1000844020" "500453" 2000 2004 "synergy" "none" 8 458.377
"1000844020" "500453" 2000 2005 "synergy" "none" 0 685.946
"1018256020" "882508" 2001 1996 "strengthen" "none" 3406 13128
"1018256020" "882508" 2001 1997 "strengthen" "none" 0 9940
"1018256020" "882508" 2001 1998 "strengthen" "none" 795 9750
"1018256020" "882508" 2001 1999 "strengthen" "none" 805 8460
"1018256020" "882508" 2001 2000 "strengthen" "none" 43 9468
"1018256020" "882508" 2001 2001 "strengthen" "none" 808.5 11860
"1018256020" "882508" 2001 2002 "strengthen" "none" 946.8333129882813 8201
"1018256020" "882508" 2001 2003 "strengthen" "none" 2022.6666259765625 8383
"1018256020" "882508" 2001 2004 "strengthen" "none" 0 9834
"1018256020" "882508" 2001 2005 "strengthen" "none" 0 12580
"1018256020" "882508" 2001 2006 "strengthen" "none" 1 13392
"1019582020" "920355" 2002 1997 "product_extension" "technology" 2 790.175
"1019582020" "920355" 2002 1998 "product_extension" "technology" 0 859.799
"1019582020" "920355" 2002 1999 "product_extension" "technology" 12 1017.271
"1019582020" "920355" 2002 2000 "product_extension" "technology" 0 1155.134
"1019582020" "920355" 2002 2001 "product_extension" "technology" 12 1387.677
"1019582020" "920355" 2002 2002 "product_extension" "technology" 22 1483.32
"1019582020" "920355" 2002 2003 "product_extension" "technology" 40 1920.97
"1019582020" "920355" 2002 2004 "product_extension" "technology" 9 2126.853
end
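To make the structure concrete: mechanically, Stata does accept the deal (rather than the firm) as the panel unit, so that a firm with two mergers contributes two separate windows. The following is only a minimal sketch of that idea and not something I am sure is conceptually right. It reuses six rows of my example data so that it runs on its own, and deal_id, a_sales_lead1, a_sales_lag1 and a_market_lag1 are hypothetical names I made up for illustration.

```stata
* Six rows copied from my example: years t-1, t and t+1 for two deals.
clear
input str10 sdcdealno str6 a_cusip6 int merger_year double year double(a_market a_sales)
"1000844020" "500453" 2000 1999 16 335.249
"1000844020" "500453" 2000 2000 7 358.463
"1000844020" "500453" 2000 2001 0 282.613
"1019582020" "920355" 2002 2001 12 1387.677
"1019582020" "920355" 2002 2002 22 1483.32
"1019582020" "920355" 2002 2003 40 1920.97
end

* deal_id is a hypothetical numeric id; xtset needs a numeric panel variable.
egen long deal_id = group(sdcdealno)

* Declare the deal, not the firm, as the panel unit.
xtset deal_id year

* Lead and lags are then taken within one deal window only.
generate double a_sales_lead1 = F.a_sales
generate double a_sales_lag1  = L.a_sales
generate double a_market_lag1 = L.a_market

* Show the merger-year row of each deal with its lead and lags.
list sdcdealno year a_sales_lag1 a_sales a_sales_lead1 a_market_lag1 if year == merger_year, noobs
```

This works only as long as each (sdcdealno, year) pair appears once, which holds in my example but is an assumption for the full data. It also does nothing about the same firm-year appearing in two overlapping deal windows.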
So:
Q1: How should I declare in Stata the type of data I have?
xtset does not seem the most appropriate, for the reasons mentioned above.
Q2: What regression model should I use? FE would eliminate time-invariant variables such as purpose1_str.
Perhaps either RE or ordinary pooled OLS, specifying in both cases either the lagged independent variables or, alternatively, the lead of the dependent variable.
Thank you in advance for your suggestions.
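For concreteness, the pooled OLS option I have in mind would look roughly like the sketch below, run on the full data set rather than on the small example. deal_id, firm_id and purpose1 are hypothetical helper variables, and clustering the standard errors on the firm is my own assumption (to allow for acquirers that appear in several deals), not something I know to be the right choice.

```stata
* Assumes the full data set is in memory with the variable names from my example.
* deal_id, firm_id and purpose1 are hypothetical helper variables.
egen long deal_id = group(sdcdealno)
egen long firm_id = group(a_cusip6)
encode purpose1_str, generate(purpose1)

* Deal as the panel unit, so F. and L. stay inside one deal window.
xtset deal_id year

* One row per deal (the merger year t): a_sales at t+1 on a_sales at t and t-1,
* a_market at t-1, and the purpose categories. The if-condition restricts the
* estimation sample; the operators still look up t+1 and t-1 in the other rows.
* vce(cluster firm_id) is an assumption, see the text above.
regress F.a_sales a_sales L.a_sales L.a_market i.purpose1 ///
    if year == merger_year, vce(cluster firm_id)
```

With only the merger-year rows used, this is effectively a cross-section of deals, which is part of why I am unsure whether RE is meaningful here.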