I am examining the impact of six winsorised and standardised independent variables (LM_QA_w, LM_ALL_w, ML_B_QA_w, ML_B_ALL_w, ML_U_QA_w, ML_U_ALL_w) on SHARE_TURNOVER_Le (a lead variable), with eight control variables: SHARE_TURNOVER, SIZE, LEVERAGE_w, AGE, RD_w, BM, DIV_w, and FILE_SIZE. I did not winsorise the log variables (current-year SHARE_TURNOVER, SIZE, AGE, and BM). I have also created two sets of moderating variables: a "continuous position" and "position dummies". I would appreciate your insights on the correctness of my Stata code.
Constructing Continuous Position:
egen quarter_id = group(fyearq fqtr)
sort quarter_id tic
by quarter_id: gen alph_rank = _n
by quarter_id: egen total_firms = count(tic)
gen continuous_position = (alph_rank - 1) / (total_firms - 1)
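A minimal sketch of a sanity check for this step, assuming each ticker should appear at most once per quarter. If a ticker repeats within a quarter, the order of the tied rows after sort is arbitrary, so alph_rank would not be reproducible. The variables ending in _chk are hypothetical names used only for the check.

```stata
* Sketch only: *_chk variables are hypothetical names for this check.
* Stops with an error if quarter_id and tic do not uniquely identify rows
* (isid also stops on missing values unless the missok option is given).
isid quarter_id tic

* Rebuild the rank and the group size with bysort and _N for comparison.
bysort quarter_id (tic): gen long alph_rank_chk = _n
by quarter_id: gen long total_firms_chk = _N
assert alph_rank_chk == alph_rank

* count(tic) skips missing tickers while _N does not, so count any mismatch.
count if total_firms_chk != total_firms

* The position should run from 0 (first firm) to 1 (last firm); it is
* missing in a quarter with a single firm because of division by zero.
assert inrange(continuous_position, 0, 1) if total_firms > 1
```

If any assert fails, Stata stops and reports how many observations contradict the condition, which points to the quarters that need a closer look.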
Constructing Position Dummies:
gen first_5 = alph_rank <= total_firms*0.05
gen first_10 = alph_rank <= total_firms*0.10
gen first_5_10 = alph_rank > total_firms*0.05 & alph_rank <= total_firms*0.10
gen first_10_25 = alph_rank > total_firms*0.10 & alph_rank <= total_firms*0.25
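A short sketch of checks on the dummies as defined above. It relies only on the variables already created. Note that first_10 equals first_5 + first_5_10 by construction, so it could not enter a regression together with both of them without being perfectly collinear.

```stata
* first_10 should equal the union of the two non-overlapping bands below it.
assert first_10 == first_5 + first_5_10

* No firm-quarter should fall in more than one of the three bands.
assert first_5 + first_5_10 + first_10_25 <= 1

* Share of observations in each band (roughly 5%, 5%, and 15% expected).
summarize first_5 first_5_10 first_10_25
```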
Standardizing Independent Variables:
egen LM_QA_w = std(LM_QA1_w)
egen LM_ALL_w = std(LM_ALL1_w)
egen ML_B_QA_w = std(ML_B_QA1_w)
egen ML_B_ALL_w = std(ML_B_ALL1_w)
egen ML_U_QA_w = std(ML_U_QA1_w)
egen ML_U_ALL_w = std(ML_U_ALL1_w)
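The six egen lines could also be written as a loop. This is a sketch of an alternative to those lines, not something to run in addition to them, since egen stops if the target variable already exists. The loop macro name stem is arbitrary.

```stata
* Alternative to the six separate egen lines: loop over the variable stems.
foreach stem in LM_QA LM_ALL ML_B_QA ML_B_ALL ML_U_QA ML_U_ALL {
    * std() standardises over all nonmissing observations in the sample.
    egen `stem'_w = std(`stem'1_w)
    * Mean should be approximately 0 and the standard deviation 1.
    summarize `stem'_w
}
```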
Constructing Interaction Terms for Continuous Positions:
gen interaction1_cont = LM_QA_w * continuous_position
gen interaction2_cont = LM_ALL_w * continuous_position
gen interaction3_cont = ML_B_QA_w * continuous_position
gen interaction4_cont = ML_B_ALL_w * continuous_position
gen interaction5_cont = ML_U_QA_w * continuous_position
gen interaction6_cont = ML_U_ALL_w * continuous_position
Constructing Interaction Terms for Position Dummies:
foreach var in LM_QA_w LM_ALL_w ML_B_QA_w ML_B_ALL_w ML_U_QA_w ML_U_ALL_w {
gen interaction_`var'_5 = `var' * first_5
gen interaction_`var'_5_10 = `var' * first_5_10
gen interaction_`var'_10_25 = `var' * first_10_25
}
Regression for LM_QA_w:
ssc install estout, replace
eststo clear
eststo: reghdfe SHARE_TURNOVER_Le LM_QA_w continuous_position first_5 first_5_10 first_10_25 interaction1_cont interaction_LM_QA_w_5 interaction_LM_QA_w_5_10 interaction_LM_QA_w_10_25 SHARE_TURNOVER SIZE LEVERAGE_w AGE RD_w BM DIV_w FILE_SIZE_LM_QA_w, absorb(quarter_id fama_french_49) vce(cluster gvkey_numeric)
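For comparison, a sketch of the same specification written with factor-variable operators, which reghdfe accepts, so that the hand-built interaction variables are not needed. The terms are intended to match the regression above one for one. The value 0.025 in the lincom line is purely illustrative and not taken from the data.

```stata
* Sketch: same regressors, with interactions built by factor-variable operators.
* c.LM_QA_w##c.continuous_position adds both main effects and their product.
eststo: reghdfe SHARE_TURNOVER_Le ///
    c.LM_QA_w##c.continuous_position ///
    1.first_5 1.first_5_10 1.first_10_25 ///
    1.first_5#c.LM_QA_w 1.first_5_10#c.LM_QA_w 1.first_10_25#c.LM_QA_w ///
    SHARE_TURNOVER SIZE LEVERAGE_w AGE RD_w BM DIV_w FILE_SIZE_LM_QA_w, ///
    absorb(quarter_id fama_french_49) vce(cluster gvkey_numeric)

* Implied slope of LM_QA_w for a firm in the first 5% band at position 0.025
* (0.025 is an illustrative value chosen for this example).
lincom LM_QA_w + 0.025*c.LM_QA_w#c.continuous_position + 1.first_5#c.LM_QA_w
```

Because the continuous position and the dummies both interact with LM_QA_w, the coefficient on LM_QA_w alone is the slope only for a firm at position 0 that is outside all three bands, which is why a combined estimate such as the lincom line may be easier to interpret.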
I am posting the regression code only for my first independent variable (LM_QA_w). Please review my approach, suggest any potential improvements, or confirm whether the methodology is appropriate.
Thank you for your guidance!