Hello everybody!
I am working on a causal probit model, so my focus is not prediction; rather, I want to examine the sign and magnitude of the individual coefficients.
As the sample size is pretty small, I want to do some kind of validation in order to check for / avoid overfitting.
If I understand correctly, commands like crossfold focus on the predictive power of a model: statistics like root mean squared error (RMSE) or pseudo-R2 inform about the model as a whole, not the individual regressors.
Given my interest in the individual variables, I want to validate my causal models like this:
Code:
* My main model is:
probit Success x1 x2 x3

* Randomly dividing the data into two samples, five times
* (v==1 marks roughly 75% of the observations):
gen v1 = rbinomial(1,.75)
gen v2 = rbinomial(1,.75)
gen v3 = rbinomial(1,.75)
gen v4 = rbinomial(1,.75)
gen v5 = rbinomial(1,.75)

* Validation:
probit Success x1 x2 x3 if v1==1
probit Success x1 x2 x3 if v2==1
probit Success x1 x2 x3 if v3==1
probit Success x1 x2 x3 if v4==1
probit Success x1 x2 x3 if v5==1
Finally, I would compare every single regressor in the five validation regressions with the main regression that draws on the whole data.
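To make the comparison of each regressor across the five subsample regressions and the full-sample regression concrete, it could be sketched roughly as follows. This is only an illustration under assumptions: the seed value, the indicator name insample, and the stored-estimate names full and sub1 to sub5 are placeholders that do not appear in my actual do-file.

```stata
* Sketch only: assumes Success, x1, x2 and x3 are already in memory.
* The seed and the names insample, full, sub1-sub5 are illustrative.
set seed 12345                      // arbitrary seed, so the draws can be reproduced

* Full-sample model, stored for later comparison
quietly probit Success x1 x2 x3
estimates store full

* Five random subsamples of roughly 75% each, re-estimated and stored
generate byte insample = .
forvalues i = 1/5 {
    quietly replace insample = rbinomial(1, .75)
    quietly probit Success x1 x2 x3 if insample == 1
    estimates store sub`i'
}

* Coefficients and standard errors side by side, one column per model
estimates table full sub1 sub2 sub3 sub4 sub5, b(%9.3f) se(%9.3f) stats(N)
```

The resulting table has one row per regressor, so a sign change or a large shift in magnitude between the full column and any sub column is visible at a glance.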
As this is my first time doing such an analysis, it would be great if someone could tell me whether this is an appropriate approach.
Or have I overlooked a well-established method for validating models when the focus is causal analysis and hence on every single regressor?
Thank you very much for your help!
Kind regards
Antonio

