Main Category of Interest Omitted from a DiD regression with -reghdfe-

Michael Duarte Goncalves

Join Date: Oct 2022
Posts: 500

11 Feb 2025, 03:05

Hi Statalist Community,

I need your help, please. I don't understand the result of a regression I'm doing.
I'm attaching a -dataex- and also the command used. I used the -reghdfe- command from SSC.

Code:

* Example generated by -dataex-. For more info, type help dataex
clear
input float(EV_share municipio_madrid after suburbios_madrid lejos_madrid) int ym_date long postal_code byte(province powertraintype) float mean_dist_mad_central_km
 .016949153 1 0 0 0 660 28001 30 2 2.2617676
 .016393442 1 0 0 0 661 28001 30 2 2.2617676
 .016393442 1 0 0 0 662 28001 30 2 2.2617676
 .033898305 1 0 0 0 663 28001 30 2 2.2617676
 .016666668 1 0 0 0 664 28001 30 2 2.2617676
          0 1 0 0 0 665 28001 30 2 2.2617676
          0 1 0 0 0 666 28001 30 2 2.2617676
          0 1 0 0 0 667 28001 30 2 2.2617676
  .01754386 1 0 0 0 668 28001 30 2 2.2617676
          0 1 0 0 0 669 28001 30 2 2.2617676
          0 1 0 0 0 670 28001 30 2 2.2617676
  .04081633 1 0 0 0 671 28001 30 2 2.2617676
  .02173913 1 0 0 0 672 28001 30 2 2.2617676
         .1 1 0 0 0 673 28001 30 2 2.2617676
 .029850746 1 0 0 0 674 28001 30 2 2.2617676
        .06 1 0 0 0 675 28001 30 2 2.2617676
  .03076923 1 0 0 0 676 28001 30 2 2.2617676
    .015625 1 0 0 0 677 28001 30 2 2.2617676
  .08510638 1 0 0 0 678 28001 30 2 2.2617676
         .1 1 0 0 0 679 28001 30 2 2.2617676
  .14583333 1 0 0 0 680 28001 30 2 2.2617676
  .03448276 1 0 0 0 681 28001 30 2 2.2617676
  .14285715 1 0 0 0 682 28001 30 2 2.2617676
  .05084746 1 0 0 0 683 28001 30 2 2.2617676
        .14 1 0 0 0 684 28001 30 2 2.2617676
  .06122449 1 0 0 0 685 28001 30 2 2.2617676
  .15686275 1 0 0 0 686 28001 30 2 2.2617676
   .1777778 1 0 0 0 687 28001 30 2 2.2617676
       .125 1 0 0 0 688 28001 30 2 2.2617676
  .14864865 1 0 0 0 689 28001 30 2 2.2617676
 .065789476 1 0 0 0 690 28001 30 2 2.2617676
  .11764706 1 0 0 0 691 28001 30 2 2.2617676
  .04347826 1 0 0 0 692 28001 30 2 2.2617676
  .17391305 1 0 0 0 693 28001 30 2 2.2617676
  .08108108 1 0 0 0 694 28001 30 2 2.2617676
         .2 1 0 0 0 695 28001 30 2 2.2617676
  .06896552 1 0 0 0 696 28001 30 2 2.2617676
  .11764706 1 0 0 0 697 28001 30 2 2.2617676
        .18 1 0 0 0 698 28001 30 2 2.2617676
    .220339 1 0 0 0 699 28001 30 2 2.2617676
  .11764706 1 1 0 0 700 28001 30 2 2.2617676
  .11864407 1 1 0 0 701 28001 30 2 2.2617676
  .19753087 1 1 0 0 702 28001 30 2 2.2617676
   .2037037 1 1 0 0 703 28001 30 2 2.2617676
  .22222222 1 1 0 0 704 28001 30 2 2.2617676
  .24561404 1 1 0 0 705 28001 30 2 2.2617676
  .15789473 1 1 0 0 706 28001 30 2 2.2617676
  .23404256 1 1 0 0 707 28001 30 5 2.2617676
         .2 1 1 0 0 708 28001 30 2 2.2617676
end
format %tm ym_date

Basically, I ran the following regression:

Code:

xtset postal_code ym_date  // Panel structure

reghdfe EV_share i.municipio_madrid#c.after i.suburbios_madrid#c.after i.lejos_madrid#c.after, ///
  absorb(ym_date) vce(cluster postal_code)

But the category I'm most interested in isn't displayed, and I don't understand why. Here are the results of my regression:

Code:

. reghdfe EV_share i.municipio_madrid#c.after i.suburbios_madrid#c.after i.lejos_madrid#c.afte
> r, ///
>   absorb(ym_date) vce(cluster postal_code)
(MWFE estimator converged in 1 iterations)
note: 1.municipio_madrid#c.after omitted because of collinearity
note: 1.suburbios_madrid#c.after omitted because of collinearity
note: 1.lejos_madrid#c.after omitted because of collinearity

HDFE Linear regression                            Number of obs   =     15,732
Absorbing 1 HDFE group                            F(   3,    280) =      38.55
Statistics robust to heteroskedasticity           Prob > F        =     0.0000
                                                  R-squared       =     0.2334
                                                  Adj R-squared   =     0.2302
                                                  Within R-sq.    =     0.0268
Number of clusters (postal_code) =        281     Root MSE        =     0.1095

                                      (Std. err. adjusted for 281 clusters in postal_code)
------------------------------------------------------------------------------------------
                         |               Robust
                EV_share | Coefficient  std. err.      t    P>|t|     [95% conf. interval]
-------------------------+----------------------------------------------------------------
municipio_madrid#c.after |
                      0  |   .0196474    .013661     1.44   0.151    -.0072438    .0465386
                      1  |          0  (omitted)
                         |
suburbios_madrid#c.after |
                      0  |   .0349717   .0136989     2.55   0.011     .0080058    .0619376
                      1  |          0  (omitted)
                         |
    lejos_madrid#c.after |
                      0  |   .0869142     .01226     7.09   0.000     .0627806    .1110477
                      1  |          0  (omitted)
                         |
                   _cons |   .0595808   .0094204     6.32   0.000     .0410371    .0781245
------------------------------------------------------------------------------------------

Absorbed degrees of freedom:
-----------------------------------------------------+
 Absorbed FE | Categories  - Redundant  = Num. Coefs |
-------------+---------------------------------------|
     ym_date |        62           0          62     |
-----------------------------------------------------+

.

Could you please give me a hand with this? I'd like the “1s” that are omitted to appear, as this is the category I'm interested in.

Thank you in advance for your help and patience.


Nils Enevoldsen

Join Date: Oct 2014

Posts: 296
#2

12 Feb 2025, 11:15

For every observation, is exactly one of the following true: municipio_madrid, suburbios_madrid, and lejos_madrid?

If so, try

Code:

reghdfe EV_share 1.municipio_madrid#c.after 1.suburbios_madrid#c.after 1.lejos_madrid#c.after, ///
  absorb(ym_date) vce(cluster postal_code)
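
A minimal sketch of why the level-1 terms are omitted and how requesting them explicitly restores them, run on simulated toy data rather than the poster's panel. The variable names mirror the thread; `zone`, `n_zones`, the effect sizes, and the comparison group in which all three dummies are 0 are assumptions made purely for illustration. It needs -reghdfe- from SSC.

```stata
* Toy panel, NOT the poster's data: 200 postal codes x 10 months.
* Requires -reghdfe- (ssc install reghdfe; it depends on -ftools-).
clear
set seed 12345
set obs 200
generate long postal_code = 28000 + _n
generate byte zone = mod(_n, 4)              // 0 = assumed comparison group
expand 10
bysort postal_code: generate int ym_date = 659 + _n
format %tm ym_date
generate byte after = ym_date > 664
generate byte municipio_madrid = zone == 1
generate byte suburbios_madrid = zone == 2
generate byte lejos_madrid     = zone == 3
generate double EV_share = 0.10*municipio_madrid*after ///
    + 0.20*suburbios_madrid*after + 0.30*lejos_madrid*after ///
    + 0.01*(ym_date - 659) + rnormal(0, 0.1)

* How many of the three dummies are 1 in each observation?
egen byte n_zones = rowtotal(municipio_madrid suburbios_madrid lejos_madrid)
tabulate n_zones

* Original specification: each i.x#c.after expands to 0.x#c.after and
* 1.x#c.after. Their sum equals after, which the month fixed effects
* already absorb, so one level of each pair is omitted.
reghdfe EV_share i.municipio_madrid#c.after i.suburbios_madrid#c.after ///
    i.lejos_madrid#c.after, absorb(ym_date) vce(cluster postal_code)

* Suggested specification: request only the level-1 interactions.
reghdfe EV_share 1.municipio_madrid#c.after 1.suburbios_madrid#c.after ///
    1.lejos_madrid#c.after, absorb(ym_date) vce(cluster postal_code)
```

Because 0.x#c.after equals after minus 1.x#c.after, and after is absorbed by the ym_date fixed effects, each level-1 coefficient in the second regression is the negative of the corresponding level-0 coefficient in the first, with the same standard error. If the three dummies were exhaustive (n_zones equal to 1 in every observation), the three level-1 interactions would sum to after and one of them would still be dropped.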
Carlo Lazzaro

Join Date: Apr 2014

Posts: 17712
#3

13 Feb 2025, 01:07

Michael:
your data excerpt does not allow me to re-run your code:

. reghdfe EV_share i.municipio_madrid#c.after i.suburbios_madrid#c.after i.lejos_madrid#c.after, absorb(ym_date) vce(cluster postal_code)
(dropped 49 singleton observations)
insufficient observations
r(2001);

In addition:
1) the community-contributed module -reghdfe- does not require you to -xtset- your dataset beforehand;
2) it is not clear to me why you did not -absorb- the -panelid-, too;
3) as you're assumed to be using the latest Stata release, why not use -xtdidregress-?

Kind regards,
Carlo
(Stata 19.0)
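
The points above about absorbing the panel identifier and trying -xtdidregress- can be sketched as follows, again on simulated toy data rather than the poster's panel. This assumes a single treated zone and a comparison zone; `treat_post` and the effect size are hypothetical, and -xtdidregress- is assumed to be available (Stata 17 or later).

```stata
* Toy panel, NOT the poster's data: one treated zone versus a comparison zone.
* -reghdfe- is from SSC (ssc install reghdfe; it depends on -ftools-).
clear
set seed 2025
set obs 100
generate long postal_code = 28000 + _n
generate byte municipio_madrid = _n <= 50
expand 8
bysort postal_code: generate int ym_date = 659 + _n
format %tm ym_date
generate byte after = ym_date > 663
generate byte treat_post = municipio_madrid*after     // hypothetical name
generate double EV_share = 0.05*treat_post + 0.01*(ym_date - 659) ///
    + rnormal(0, 0.05)

* Two-way fixed effects: absorb the panel identifier as well as the month.
* -reghdfe- does not need the data to be -xtset-.
reghdfe EV_share treat_post, absorb(postal_code ym_date) vce(cluster postal_code)

* Built-in alternative; the data are -xtset- first because this is a panel command.
xtset postal_code ym_date
xtdidregress (EV_share) (treat_post), group(postal_code) time(ym_date)
```

Both commands fit a model with postal-code and month fixed effects, so the point estimate on treat_post should agree between them. With several treated zones, as in the original question, the -reghdfe- form with one interaction per zone is the more direct route, since -xtdidregress- takes a single treatment variable.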
Michael Duarte Goncalves

Join Date: Oct 2022

Posts: 500
#4

11 Mar 2025, 04:56

Hi Nils Enevoldsen
Hi Carlo Lazzaro

Your suggestions enabled me to solve my problem successfully. Thank you so much!
Best,

Michael