  • Regression with grouped variables

    Dear Stata Community

    In my paper, I want to investigate the impact of import competition on the incidence of zombie firms. Import competition (penetration) and the share of zombie firms (share_zombiesBH1) are industry-level variables. The variable sic identifies the sector each firm belongs to, gvkey is the unique firm identifier, and at is each firm's total assets.

    The data look as follows:

    Code:
    * Example generated by -dataex-. To install: ssc install dataex
    clear
    input long gvkey float sic double year float(penetration share_zombiesBH1) double at
     1934 20 1989  .03984093 .030927835   184.433
    10271 20 1989  .03984093 .030927835   133.559
     8935 20 1989  .03984093 .030927835    4381.7
     5824 20 1989  .03984093 .030927835    3717.6
     4054 20 1989  .03984093 .030927835    17.268
    15131 20 1989  .03984093 .030927835     2.884
     1369 20 1989  .03984093 .030927835   139.793
    15090 20 1989  .03984093 .030927835     1.486
    11748 20 1989  .03984093 .030927835    95.965
     1498 20 1989  .03984093 .030927835   423.518
     5848 20 1989  .03984093 .030927835  1352.919
    13324 20 1989  .03984093 .030927835    21.304
     5597 20 1989  .03984093 .030927835  1814.101
     6544 20 1989  .03984093 .030927835      2946
     8852 20 1989  .03984093 .030927835    3221.9
     4809 20 1989  .03984093 .030927835   448.037
    13641 20 1989  .03984093 .030927835     7.981
    13607 20 1989  .03984093 .030927835     1.987
     1722 20 1989  .03984093 .030927835  4728.308
     7507 20 1989  .03984093 .030927835  1841.913
    12825 20 1989  .03984093 .030927835    66.594
     6102 20 1989  .03984093 .030927835   844.339
     2663 20 1989  .03984093 .030927835    3932.1
     9303 20 1989  .03984093 .030927835   111.086
     3138 20 1989  .03984093 .030927835   448.532
     2909 20 1989  .03984093 .030927835    40.322
     3362 20 1989  .03984093 .030927835  4804.161
     5568 20 1989  .03984093 .030927835  4487.451
     1729 20 1989  .03984093 .030927835     4.372
     7770 20 1989  .03984093 .030927835   380.202
    13318 20 1989  .03984093 .030927835   186.896
    12409 20 1989  .03984093 .030927835   289.361
    10793 20 1989  .03984093 .030927835   2586.08
    12785 20 1989  .03984093 .030927835   291.102
    10177 20 1989  .03984093 .030927835     9.145
    14356 20 1989  .03984093 .030927835    42.618
     5141 20 1989  .03984093 .030927835   754.733
    13323 20 1989  .03984093 .030927835    81.168
    12736 20 1989  .03984093 .030927835     2.685
    14891 20 1989  .03984093 .030927835   104.623
    13592 20 1989  .03984093 .030927835   115.337
     7146 20 1989  .03984093 .030927835   864.511
     5185 20 1989  .03984093 .030927835   191.928
    13864 20 1989  .03984093 .030927835     6.809
     5599 20 1989  .03984093 .030927835   171.749
     8479 20 1989  .03984093 .030927835   15126.7
    14273 20 1989  .03984093 .030927835   243.038
     1462 20 1989  .03984093 .030927835   178.954
    12201 20 1989  .03984093 .030927835   161.111
     4078 20 1989  .03984093 .030927835   139.408
    10899 20 1989  .03984093 .030927835    111.94
    14070 20 1989  .03984093 .030927835   329.232
     6375 20 1989  .03984093 .030927835    3390.4
    15000 20 1989  .03984093 .030927835    51.464
     9433 20 1989  .03984093 .030927835   481.846
    11791 20 1989  .03984093 .030927835    39.347
    20338 20 1989  .03984093 .030927835    34.025
     3013 20 1989  .03984093 .030927835   185.989
    11424 20 1989  .03984093 .030927835    47.818
     2606 20 1989  .03984093 .030927835    47.165
    12309 20 1989  .03984093 .030927835   132.147
    14455 20 1989  .03984093 .030927835   213.396
    12756 20 1989  .03984093 .030927835  4731.946
     9774 20 1989  .03984093 .030927835   164.886
    13930 20 1989  .03984093 .030927835     15.11
     1408 20 1989  .03984093 .030927835   11394.2
     6340 20 1989  .03984093 .030927835    13.616
    13136 20 1989  .03984093 .030927835   109.704
    10345 20 1989  .03984093 .030927835   113.399
    14382 20 1989  .03984093 .030927835     3.464
     2435 20 1989  .03984093 .030927835  1020.984
     8336 20 1989  .03984093 .030927835    14.301
     4050 20 1989  .03984093 .030927835    461.52
     2710 20 1989  .03984093 .030927835   139.293
    14332 20 1989  .03984093 .030927835   749.157
    11902 20 1989  .03984093 .030927835    28.139
     2674 20 1989  .03984093 .030927835   382.507
     3245 20 1989  .03984093 .030927835    25.191
    14057 20 1989  .03984093 .030927835   194.622
     2562 20 1989  .03984093 .030927835    3704.7
     5709 20 1989  .03984093 .030927835   727.429
    11713 20 1989  .03984093 .030927835    57.376
    19437 20 1989  .03984093 .030927835     1.733
     3144 20 1989  .03984093 .030927835  8282.536
     9411 20 1989  .03984093 .030927835  6522.732
     2675 20 1989  .03984093 .030927835   781.051
    13830 20 1989  .03984093 .030927835     8.531
    13063 20 1989  .03984093 .030927835    17.732
    12614 20 1989  .03984093 .030927835     70.89
    15087 20 1989  .03984093 .030927835     1.894
     8582 20 1989  .03984093 .030927835   193.591
     3657 20 1989  .03984093 .030927835   479.687
     1663 20 1989  .03984093 .030927835    9025.7
     2597 20 1989  .03984093 .030927835  3434.042
    12566 20 1989  .03984093 .030927835   135.253
     3821 20 1989  .03984093 .030927835   744.759
    10551 20 1989  .03984093 .030927835   116.692
    11790 21 1989 .003395724          0   432.161
     3642 21 1989 .003395724          0   392.816
     1932 21 1989 .003395724          0 18655.548
    end
    When I run the regression, I get the output below. As the output shows, Stata treats gvkey as the group variable, whereas sic should be the group variable. How can I solve this problem?

    Code:
     xtreg share_zombiesBH1 penetration at, fe
    
    Fixed-effects (within) regression               Number of obs     =     39,091
    Group variable: gvkey                           Number of groups  =      4,927
    
    R-sq:                                           Obs per group:
         within  = 0.1767                                         min =          1
         between = 0.1245                                         avg =        7.9
         overall = 0.1052                                         max =         23
    
                                                    F(2,34162)        =    3667.10
    corr(u_i, Xb)  = -0.5759                        Prob > F          =     0.0000
    
    ------------------------------------------------------------------------------
    share_zomb~1 |      Coef.   Std. Err.      t    P>|t|     [95% Conf. Interval]
    -------------+----------------------------------------------------------------
     penetration |   .4333692   .0052005    83.33   0.000     .4231761    .4435624
              at |   1.87e-07   2.61e-08     7.14   0.000     1.35e-07    2.38e-07
           _cons |   .0014647   .0009884     1.48   0.138    -.0004727     .003402
    -------------+----------------------------------------------------------------
         sigma_u |  .05474494
         sigma_e |  .03663516
             rho |  .69069127   (fraction of variance due to u_i)
    ------------------------------------------------------------------------------
    F test that all u_i=0: F(4926, 34162) = 10.17                Prob > F = 0.0000
    Many thanks for your help.
    Roman

  • #2
    You can do many things; the simplest is probably -areg, absorb(sic) cluster(sic)-.
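
    Spelled out with the variable names from post #1, that suggestion might look like the sketch below. It also shows one alternative, which assumes the gvkey group variable comes from an earlier -xtset- declaration: redeclare sic as the panel variable and rerun -xtreg, fe-. Here vce(cluster sic) is the current spelling of the cluster(sic) option. Treat this as an illustration, not a tested specification for this dataset.

    ```stata
    * Option 1: absorb sector fixed effects with -areg-,
    * clustering standard errors by sector
    areg share_zombiesBH1 penetration at, absorb(sic) vce(cluster sic)

    * Option 2: make sic the panel (group) variable, then rerun -xtreg, fe-
    * Note: this replaces the existing -xtset- declaration on gvkey
    xtset sic
    xtreg share_zombiesBH1 penetration at, fe vce(cluster sic)
    ```

    Both commands fit the same within-sector model, so the coefficient estimates should agree. The clustered standard errors may differ somewhat because the two commands apply different degrees-of-freedom adjustments.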


    • #3
      Thank you very much for your reply, Joro Kolev. It works really well.
