I have a recommendation for the csdid command, though it could just as easily apply to many regression commands. Please pardon any ignorance on my part.
Developers could add a "fast" option to estimation commands. The option would drop all observations that are not used in the model estimation, whether because of missing values or because of other model-specific criteria. When the data set is large and many of its observations go unused, this would reduce the memory burden of the estimation and could also reduce processing time.
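To make the request concrete, here is a rough sketch of what such an option might do, written as a wrapper around regress. The program name fastreg is made up for illustration and is not an existing command. It only handles missing values plus if/in restrictions, not the model-specific criteria a command like csdid may apply.

```stata
* Hypothetical wrapper: "fastreg" is an invented name, not a real command.
capture program drop fastreg
program define fastreg
    syntax varlist(numeric) [if] [in]

    * marksample flags observations that satisfy if/in and have
    * no missing values in any variable of the varlist.
    marksample touse

    * Permanently drop everything the model would not use.
    keep if `touse'

    * Estimate on the trimmed data.
    regress `varlist'
end

* Usage on Stata's bundled example data
sysuse auto, clear
fastreg price mpg rep78
```

Because the wrapper changes the data in memory, a real implementation would probably want to make the dropping opt-in, or pair it with preserve and restore.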
Right now, one work-around is to run the regression, use 'predict' to create predicted values, drop the observations without a predicted value, and then re-run the same regression more quickly. The downside is that the initial regression can still be time-consuming. It would be nice to have an option for 'self-cleaning' data that drops observations at optimal steps during the estimation process.
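A minimal sketch of this two-pass work-around, using regress on Stata's bundled auto data as a stand-in for a real model. Instead of 'predict', it uses e(sample), which regress sets to mark the observations it actually used. That is a swapped-in variant of the idea: 'predict' can also fill in values for observations that have all regressors but a missing outcome, so e(sample) is the tighter marker. Whether csdid leaves a usable e(sample) behind is an assumption I have not checked.

```stata
* Sketch only: auto.dta and these variables stand in for a real model.
sysuse auto, clear

* First pass on the full data (still the slow step on large data).
regress price mpg rep78

* e(sample) is 1 for observations used in the fit and 0 otherwise.
keep if e(sample)

* Second pass runs on the trimmed data only.
regress price mpg rep78
```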
Another partial work-around is to drop observations that have a missing value in any variable used in the estimation. But some models have additional criteria for omitting observations.
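This partial work-around could look something like the sketch below, which uses mark and markout to flag complete cases before any estimation is run. The variable list is illustrative, and as noted above it will not catch observations that a model omits for reasons other than missing values.

```stata
* Sketch only: the variable list is illustrative.
sysuse auto, clear
local modelvars price mpg rep78

* Start with every observation flagged as usable.
mark touse

* Set the flag to 0 wherever any listed variable is missing.
markout touse `modelvars'

* Keep complete cases only, then estimate.
keep if touse
drop touse
regress `modelvars'
```

Running preserve before the keep and restore afterwards would bring the full data set back once the estimation is done.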