Hello everyone,
I wanted to ask the most appropriate method for finding the best polynomial model for multivariate regression.
So far, I have been working with single covariate regressions and used the following code to generate a table showing the R2, AIC and BIC.
Below is the regression I am working on, including a dummy treatment variable (T) and two interaction variables.
For reference, I am using the heterogeneous local average treatment effect (HLATE) proposed by Becker et al. (2012) in the context of RDD.
I, therefore, have two questions to ask.
The first is more theoretical and probably very elementary: the degree of the covariates functions have to be all the same, or can I have a regression with x12 and x23?
The second question concerns the use of STATA. Is there a way to test which model is the best, using code similar to the one posted earlier in the message, or should I use another approach altogether?
I apologize if the question is not precise or I used imprecise terms. I hope I have made myself clear.
edit: I can't change the title but most likely I am referring to a multivariable regression modeling, not multivariate.
I wanted to ask the most appropriate method for finding the best polynomial model for multivariate regression.
So far, I have been working with single covariate regressions and used the following code to generate a table showing the R2, AIC and BIC.
Code:
foreach i of numlist 1/5 { if `i' != 1{ gen x`i' = x^`i' } qui reg y x* estimate store M_'i' qui estat ic matrix M_IC_`i' = r(S) scalar aic_`i' = M_IC_`i'[1, 5] scalar bic_`i' = M_IC_`i'[1, 6] } estout M_*, cells(b(star fmt(3)) se(par fmt(3))) stats(r2 r2_a aic bic df_r, labels(R^2 aR^2 AIC BIC df))
For reference, I am using the heterogeneous local average treatment effect (HLATE) proposed by Becker et al. (2012) in the context of RDD.
Code:
reg AR GDP_dev* ceqi* T TxGDP* Txceqi*
The first is more theoretical and probably very elementary: the degree of the covariates functions have to be all the same, or can I have a regression with x12 and x23?
The second question concerns the use of STATA. Is there a way to test which model is the best, using code similar to the one posted earlier in the message, or should I use another approach altogether?
I apologize if the question is not precise or I used imprecise terms. I hope I have made myself clear.
edit: I can't change the title but most likely I am referring to a multivariable regression modeling, not multivariate.
Comment