I am trying to run a difference-in-differences regression and see the coefficients relative to the control group before the time of treatment. This DID has three categories: RGGI, Leaker, and Control.
> reghdfe log_netgen b1.category_num#i.after_RGGI, absorb(plantstate obsyear) vce(cluster plantstate)
Here category_num represents the type of state (RGGI, Leaker, or Control) and after_RGGI is a dummy variable equal to 1 when the date in the data is after 2009.
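In equation form, the intended specification would be roughly the following. This is only a sketch: the subscripts (plant i, state s, year t) and the symbol names are assumed shorthand, not anything Stata reports, and log_netgen is taken to be the outcome as it appears in the command.

```latex
\text{log\_netgen}_{ist}
  = \beta_{L}\,\bigl(\text{Leaker}_{s} \times \text{After}_{t}\bigr)
  + \beta_{R}\,\bigl(\text{RGGI}_{s} \times \text{After}_{t}\bigr)
  + \alpha_{s} + \gamma_{t} + \varepsilon_{ist}
```

Here \alpha_s and \gamma_t stand for the absorbed plantstate and obsyear effects, After_t is after_RGGI, and Control states before treatment are the reference cell, so \beta_L and \beta_R correspond to Leaker#1 and RGGI#1.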
My aim is to see the coefficients for Leaker#1 and RGGI#1, so I specified b1 as the base category, as 1 = Control for my category_num variable. Stata gives the following output:
HDFE Linear regression                            Number of obs   =    183,543
Absorbing 2 HDFE groups                           F(   2,     50) =      13.59
Statistics robust to heteroskedasticity           Prob > F        =     0.0000
                                                  R-squared       =     0.0364
                                                  Adj R-squared   =     0.0360
                                                  Within R-sq.    =     0.0011
Number of clusters (plantstate_num) =         51  Root MSE        =     3.1621
                       (Std. err. adjusted for 51 clusters in plantstate_num)
---------------------------------------------------------------------------------------------
                            |               Robust
                 log_netgen | Coefficient  std. err.      t    P>|t|     [95% conf. interval]
----------------------------+----------------------------------------------------------------
category_numeric#after_RGGI |
                  Control#1 |   .6075423   .1186053     5.12   0.000     .3693165    .8457681
                   Leaker#0 |   -.697489   .2695983    -2.59   0.013    -1.238993   -.1559848
                   Leaker#1 |          0  (omitted)
                     RGGI#0 |          0  (omitted)
                     RGGI#1 |          0  (omitted)
                            |
                      _cons |   9.601487   .0604883   158.73   0.000     9.479993    9.722982
---------------------------------------------------------------------------------------------
Why is the regression omitting Leaker#1, RGGI#0, and RGGI#1 instead of omitting the Control and 0 levels? I expected 0 to be the base category, since i.after_RGGI would usually make it so.
Basically, how can I make my regression output report only RGGI#1 and Leaker#1, omitting all other combinations? Thank you!