I am using the did_multiplegt_stat package to estimate the effects of a continuous treatment that may vary in every period, with no dynamic effects (current version 2025-02-07, from https://github.com/chaisemartinPacka...ree/main/STATA). I have some questions about the way control variables work, and about how observations are selected for inclusion in regressions. To demonstrate my questions, I am using the sample data provided by the authors of the package.
I understand that the controls option in did_multiplegt_dyn (a related package by the same author group) controls for the first differences of the varlist. The GitHub documentation for did_multiplegt_stat says, “When time-varying control variables are inputted to the command, the command compares the t–1-to-t outcome evolution of switchers and stayers with the same baseline treatment, and with the same controls at period t–1.” Does this mean (1) that comparisons are made only between observations that have identical values of the control variables in period t–1? Or (2) that the command uses the period t–1 values of the control variables as regressors? Or (3) that, as in did_multiplegt_dyn, it uses first-differenced controls?
My code and output are below. They suggest that comparisons are not made only between observations that have identical values for the control variables; if that were the case, the second regression below would not run, because each observation has unique values of the control variable. So, perhaps one of my interpretations (2) or (3) above is correct.
What are the criteria for including observations in regressions when using the did_multiplegt_stat package? Why does the number of observations differ across the two regressions below (1587 without controls, 1584 with them)? And why are both so much smaller than the full extent of the panel (which would be 48 x 42 = 2016)?
Code:
. use "https://github.com/chaisemartinPackages/ApplicationData/raw/main/data_gazoline.dta", clear

. distinct lngca id year tau no_auto // this tells me there are 2064 observations, with 2064 distinct values of a control variable no_auto

--------------------------------
         |     total   distinct
---------+----------------------
   lngca |      2064       2064
      id |      2064         48
    year |      2064         43
     tau |      2064        249
 no_auto |      2064       2064
--------------------------------

. did_multiplegt_stat lngca id year tau, estimator(was)
----------------------------------------------
Number of observations = 1587
WAS Estimation method = doubly-robust
Polynomial order = (1)
----------------------------------------------
--------------------------------------------------------------------------------
                          Weighted Average Slope (WAS)
--------------------------------------------------------------------------------
             |  Estimate         SE      LB CI      UB CI  Switchers    Stayers
-------------+------------------------------------------------------------------
         WAS | -.0038867   .0009433  -.0057355  -.0020379        384       1203
--------------------------------------------------------------------------------
--------------------------------------------------------------------------------

. did_multiplegt_stat lngca id year tau, estimator(was) controls(no_auto)
----------------------------------------------
Number of observations = 1584
WAS Estimation method = doubly-robust
Polynomial order = (1)
----------------------------------------------
--------------------------------------------------------------------------------
                          Weighted Average Slope (WAS)
--------------------------------------------------------------------------------
             |  Estimate         SE      LB CI      UB CI  Switchers    Stayers
-------------+------------------------------------------------------------------
         WAS | -.0040948   .0010031  -.0060608  -.0021287        383       1201
--------------------------------------------------------------------------------
--------------------------------------------------------------------------------
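To help narrow down the sample-size question, here is a diagnostic sketch that counts the t–1-to-t cells available in the raw data and checks whether any two cells share the same lagged control value. It is not a replication of the command's internal sample selection. The variable names d_lngca, d_tau, switcher, and l_no_auto are my own, and defining a stayer as a cell whose treatment change is exactly zero is an assumption on my part. It also assumes id and year are numeric, so that xtset accepts them.

```stata
* Diagnostic sketch only: this does NOT reproduce did_multiplegt_stat's own sample selection
use "https://github.com/chaisemartinPackages/ApplicationData/raw/main/data_gazoline.dta", clear
xtset id year   // assumes id and year are numeric

* First differences of outcome and treatment (missing in each unit's first year)
generate double d_lngca = D.lngca
generate double d_tau   = D.tau

* Cells with an observed t-1-to-t evolution (at most 48 x 42 = 2016 in a balanced panel)
count if !missing(d_lngca, d_tau)

* Assumed definition: stayer = no change in treatment, switcher = any change
generate byte switcher = (d_tau != 0) if !missing(d_tau)
tabulate switcher if !missing(d_lngca)

* Lagged control: do any two cells share the same period t-1 value of no_auto?
generate double l_no_auto = L.no_auto
duplicates report l_no_auto if !missing(l_no_auto)
```

If the first count comes out near 2016 while the command reports 1587, the gap would have to come from the command's own selection rules rather than from missing data; if duplicates report shows no repeated values, exact matching on the lagged control (my interpretation 1) could not be what the command does.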
Thanks!
John