Dear Statalist members,
We are glad to announce that our package samregc is now available in the SSC repository.
samregc is a Stata package for robustness and sensitivity analysis of regression coefficients. It improves on existing tools by distinguishing sample effects from omitted-variable/collinearity effects (with the samesample and sisters options). It also expands computational feasibility and analytical detail. The ncomb() option allows targeted combinatorial analysis of control variables instead of exhaustive all-subset regression, raising the practical number of manageable control variables well beyond the traditional limit of about 25 for all-subset approaches. The griterateover() option adds flexibility by iterating over groups of conceptually related variables, which reduces complexity and aligns the analysis with theoretical frameworks. Finally, samregc improves interpretability through detailed diagnostic tables and plots that highlight the control variables with the greatest impact on the robustness of the main variables of interest, and that separate sample-driven variability from genuine covariate effects.
The package can be installed by typing:
ssc install samregc
The new package is designed to build upon and extend the functionality of checkrob (Barslund, 2007), a tool that has been widely used for nearly two decades to explore robustness issues in empirical research. However, in response to the growing complexity and evolving methodological demands of contemporary applied work, we identified key areas where targeted improvements can significantly enhance functionality and broaden analytical scope.
Key methodological contributions of samregc include:
1. Differentiating sources of coefficient variability.
A core challenge in robustness checks lies in distinguishing whether changes in estimated coefficients are driven by differences in sample composition (due to missing values introduced by some control variables) or by the inclusion or exclusion of potentially collinear or omitted variables. Unlike checkrob, samregc offers two features to explicitly address this issue:
- The samesample option ensures all model specifications are estimated using the largest common sample, thereby isolating the effect of variable inclusion from sample-driven changes.
- The sisters() option executes paired regressions for each iteration: one with the selected controls and another without, using an identical sample for both. This enables direct visualization and quantification of omitted-variable or multicollinearity effects (e.g., via scatter or arrow plots).
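The logic behind these two options can be reproduced by hand. The following is only an illustrative sketch using Stata's built-in auto dataset and plain regress; it is not samregc's internal implementation, and the variable name insample is hypothetical. Because rep78 has missing values, adding it as a control also shrinks the estimation sample:

```stata
sysuse auto, clear

* Model with the control: observations with missing rep78 are dropped
regress price mpg rep78
generate byte insample = e(sample)

* Same sample, control omitted: any change in the mpg coefficient
* relative to the model above reflects omitting rep78, not the sample
regress price mpg if insample

* Full sample, control omitted: the change relative to the previous
* model reflects sample composition only
regress price mpg
```

Comparing the first two models isolates the omitted-variable/collinearity effect, while comparing the last two isolates the sample effect.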
2. Expanding computational feasibility.
All-subset regression (the only combinatorial alternative in checkrob) becomes computationally infeasible as the number of potential controls grows (typically beyond 25). samregc addresses this through:
- The ncomb(#1,#2) option, which limits the iteration space to combinations containing between #1 and #2 variables, extending feasibility to substantially larger covariate sets.
- The griterateover() option, allowing users to iterate over grouped variables (e.g., dummies, lags, instruments), reducing dimensionality and aligning analysis with theoretical structures.
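To give a sense of the scale involved, the size of the iteration space can be checked with Stata's built-in comb() function. This is a back-of-the-envelope sketch, assuming 25 candidate controls and a restriction to combinations of one to three variables; the exact number of regressions samregc runs may differ (for example, if a base model is added):

```stata
* Number of control-variable subsets with k = 25 candidate controls
local k = 25

* All possible subsets, including the empty one: 2^k
display "All subsets:            " %12.0fc 2^`k'

* Subsets containing between 1 and 3 controls
display "Subsets of size 1 to 3: " %12.0fc comb(`k',1) + comb(`k',2) + comb(`k',3)
```

With 25 controls there are 33,554,432 possible subsets, but only 2,625 of them contain between one and three variables.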
3. Improving interpretability through richer diagnostics.
In addition to standard summary tables, samregc generates a richer set of diagnostics to inform robustness interpretation:
- Kernel density plots of coefficient and t-statistic distributions for each main variable.
- Tabulated summaries of coefficient sign, significance, and variability across specifications, highlighting the most influential control variables.
- Comparative plots and structured tables (enabled via samesample, unbalanced, or sisters options) to distinguish between sample-based and omitted-variable/collinearity-based coefficient changes.
To assess how the coefficient on main_var in a regression of depvar changes when adding combinations of control1 and control2, simply type:
samregc depvar main_var, iterateover(control1 control2)
This runs four regressions: the base model with main_var only, and three extended models including each control separately and jointly. The output includes a dataset (samregc.dta) with all coefficients, t-statistics, and sample sizes, as well as automatically generated plots and summary tables.
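For a concrete picture, the four specifications can be written out by hand with Stata's built-in auto dataset. This is a sketch under the assumption that price is the dependent variable, mpg the main variable, and weight and length the controls; the final line follows the syntax shown above and requires samregc to be installed:

```stata
sysuse auto, clear

* The four specifications, estimated one by one
regress price mpg
regress price mpg weight
regress price mpg length
regress price mpg weight length

* The same set of models requested in a single call
* (assumes samregc was installed with: ssc install samregc)
samregc price mpg, iterateover(weight length)
```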
The package supports a range of estimation commands via the cmdest() option (e.g., xtreg, ivregress) and allows the use of weights.
Full documentation and worked examples are available in the help file (help samregc). We hope samregc proves useful for researchers seeking to implement comprehensive sensitivity analyses.
We also thank Professor Christopher F. Baum for his patience and submission guidance.
With best regards,
Pablo Gluzmann (CEDLAS-FCE-UNLP & CONICET). Email: [email protected]. Academic Profile: https://www.researchgate.net/profile/Pablo-Gluzmann.
Demian Panigo (Instituto Malvinas, UNLP & CONICET). Email: [email protected]. Academic Profile: https://www.researchgate.net/profile/Demian-Panigo.