Dear Stata Community,
As I am fairly new to econometrics and the theory behind it, I have a few questions about using the three regression models RE, FE, and OLS, and comparing them, when analyzing panel data. I hope you can help me answer them.
I am currently running regressions in Stata and performing both the Hausman test and the LM or F-test to determine which model is the appropriate one to use.
My data set contains all listed companies in the Nordics during the time period 2005-2017.
The models I am estimating have the investment ratio as the dependent variable and lagged cash flow as the independent variable. My model changes in two respects:
1) Time period: I run the same regression for three different time periods (2005-2007, 2008-2009, and 2010-2017).
2) Dependent variable: after running and analyzing the three regressions in point 1, I rerun them for the same three time periods using two other dependent variables.
So far, I have run the two tests to determine which regression model to use for each of the three time periods.
I have done exactly the same for the models with the other dependent variables, again for each of the three periods. The tests give different results as to which model is the correct one to use.
My questions are:
1) Can I compare the results for the three time periods if they are estimated with different models?
2) When I later use the two other dependent variables in the three time periods, can I also use different models across time periods and dependent variables and still compare the results?
To show you my code, I have included it below. The sections marked in red are the important ones (the dependent variable I change, the different time periods used, and the tests of which model to use).
Thank you very much in advance!
Code:
* Load data
clear
insheet using "C:\Users\dkkrz\Desktop\stata3middel.csv", delimiter(";")
* Encode panel data
encode selskabsnavn, gen(selskab)
tsset selskab r
* Create investments divided by lagged total assets variable
gen totalassetskorrigeret_1 = l.totalassetskorrigeret
gen capex_ratio = (capitalexpendituresadditiontofix/totalassetskorrigeret_1)
* Exclude negative CFLOW
drop if cflow<0
* Create lagged cflow variable
gen cflow_1 = l.cflow
****************************************
* Random effects models all years last year
xtreg capex_ratio cflow_1, re
estimates store random_effects
* Fixed effects models all years last year
xtreg capex_ratio cflow_1, fe
estimates store fixed_effects
* Test random effects vs. fixed effects
hausman fixed_effects random_effects
* Regular OLS all years last year
regress capex_ratio cflow_1
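* Test OLS vs. random effects (Breusch-Pagan LM test)
* Note: estat hettest after regress tests for heteroskedasticity, not for random effects;
* the LM test for random effects is xttest0, run after xtreg, re
xtreg capex_ratio cflow_1, re
xttest0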
*******************************************
* Create lagged net sales and sales growth variables
gen netsalesorrevenues_1 = l.netsalesorrevenues
gen sales_growth = (netsalesorrevenues/netsalesorrevenues_1 - 1)*100
gen sales_growth_1 = l.sales_growth
gen sales_growth_sq = sales_growth^2
* Encode industry
encode industry10, gen(industries10)
* regress MTB
regress mtb sales_growth_1 sales_growth_sq i.industries10
* Get mtb_f and mtb_r
predict mtb_f
predict mtb_r, residuals
* Lag mtb_f and mtb_r
gen mtb_f_1 = l.mtb_f
gen mtb_r_1 = l.mtb_r
****************************************
* Subset for correct period
keep if hllcllhc == "LL-HC" | hllcllhc == "HL-LC"
gen period = ""
replace period = "pre_crisis" if r < 2008
replace period = "crisis" if r >= 2008 & r<=2009
replace period = "post_crisis" if r > 2009
keep if period == "post_crisis"
* Variable - model
encode hllcllhc, gen(hllcllhc_factor)
encode country, gen(countries)
encode period, gen(period_factor)
encode cap, gen(cap_factor)
****************************************
* Random effects models all years last year
xtreg capex_ratio cflow_1, re
estimates store random_effects
* Fixed effects models all years last year
xtreg capex_ratio cflow_1, fe
estimates store fixed_effects
* Test random effects vs. fixed effects
hausman fixed_effects random_effects
* Regular OLS all years last year
regress capex_ratio cflow_1
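* Test OLS vs. random effects (Breusch-Pagan LM test)
* Note: estat hettest after regress tests for heteroskedasticity, not for random effects;
* the LM test for random effects is xttest0, run after xtreg, re
xtreg capex_ratio cflow_1, re
xttest0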
******************** MOST IMPORTANT MODEL ********************
* Random effects models with categories
xtreg capex_ratio i.hllcllhc_factor##c.cflow_1 mtb_f_1 mtb_r_1, re
estimates store random_effects
* Fixed effects models with categories
xtreg capex_ratio i.hllcllhc_factor##c.cflow_1 mtb_f_1 mtb_r_1, fe
estimates store fixed_effects
* Test random effects vs. fixed effects
hausman fixed_effects random_effects
* Regular OLS with categories
regress capex_ratio i.hllcllhc_factor##c.cflow_1 mtb_f_1 mtb_r_1
margins i.hllcllhc_factor, dydx(c.cflow_1)
marginsplot
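* Test OLS vs. random effects (Breusch-Pagan LM test)
* Note: estat hettest after regress tests for heteroskedasticity, not for random effects;
* the LM test for random effects is xttest0, run after xtreg, re
xtreg capex_ratio i.hllcllhc_factor##c.cflow_1 mtb_f_1 mtb_r_1, re
xttest0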
*************************************************************************************************************************************
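The per-period testing described in the post (Hausman test plus LM or F-test for each of the three periods) could also be run in one pass instead of re-running the script with a different `keep if period == ...` line. The following is only a minimal sketch under assumptions: the data are already loaded and `tsset`, `capex_ratio`, `cflow_1`, and the string variable `period` exist as created in the code above, and the `keep if period == "post_crisis"` line has not been executed. The stored-estimate names `fe_...` and `re_...` are made up for illustration.

```stata
* Sketch: run the model-selection tests separately for each period.
* Assumes capex_ratio, cflow_1 and the string variable period already exist
* and that the data have NOT been subset to a single period.
foreach p in pre_crisis crisis post_crisis {
    display as text _newline "Period: `p'"

    * Fixed effects; the F test that all u_i = 0 (FE vs. pooled OLS)
    * is reported at the bottom of the xtreg, fe output
    xtreg capex_ratio cflow_1 if period == "`p'", fe
    estimates store fe_`p'

    * Random effects, followed by the Breusch-Pagan LM test (RE vs. pooled OLS)
    xtreg capex_ratio cflow_1 if period == "`p'", re
    estimates store re_`p'
    xttest0

    * Hausman test (FE vs. RE) on the stored estimates for this period
    hausman fe_`p' re_`p'
}
```

To repeat this for the other two dependent variables, `capex_ratio` could be replaced by a local macro in an outer `foreach` loop over the dependent variable names.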
