  • Omitted dummy variable in fixed effect regression

    Hello everybody,

    For my master's thesis, I am investigating the effects of company-level and macro-financial variables on the gross written premium growth rate of insurers in the life and composite segments. I wanted to test whether a composite company has a significant advantage from offering both life and non-life products, so I included a dummy variable that is 1 if the company is composite and 0 otherwise, but it gets omitted when I put it in my regression. I think that is because of the fixed-effects structure, but I don't know what my options are to fix this.
    Thanks for your suggestions and help!

    Code:
    . xtreg log_growthrate log_total_revenue_life log_solvency_ratio GDP_Growth Inflation Interest_Rate log_No_firms log_den composite_insurance, fe robust
    note: composite_insurance omitted because of collinearity.
    
    Fixed-effects (within) regression               Number of obs     =      6,092
    Group variable: company1                        Number of groups  =        853
    
    R-squared:                                      Obs per group:
         Within  = 0.1141                                         min =          1
         Between = 0.0050                                         avg =        7.1
         Overall = 0.0081                                         max =          9
    
                                                    F(7, 852)         =      25.98
    corr(u_i, Xb) = -0.9624                         Prob > F          =     0.0000
    
                                           (Std. err. adjusted for 853 clusters in company1)
    ----------------------------------------------------------------------------------------
                           |               Robust
            log_growthrate | Coefficient  std. err.      t    P>|t|     [95% conf. interval]
    -----------------------+----------------------------------------------------------------
    log_total_revenue_life |   .2235154    .022139    10.10   0.000     .1800621    .2669687
        log_solvency_ratio |   .0512521   .0167348     3.06   0.002      .018406    .0840983
                GDP_Growth |   .0049925   .0013801     3.62   0.000     .0022837    .0077014
                 Inflation |   .0017915   .0024391     0.73   0.463     -.002996    .0065789
             Interest_Rate |   .0281604   .0051809     5.44   0.000     .0179915    .0383293
              log_No_firms |  -.0877924    .047825    -1.84   0.067    -.1816611    .0060762
                   log_den |   .1150995   .0342001     3.37   0.001     .0479732    .1822257
       composite_insurance |          0  (omitted)
                     _cons |  -3.305812   .3521752    -9.39   0.000    -3.997045   -2.614579
    -----------------------+----------------------------------------------------------------
                   sigma_u |  .58271337
                   sigma_e |   .3070798
                       rho |  .78264982   (fraction of variance due to u_i)
    ----------------------------------------------------------------------------------------
    
    . 
    end of do-file

  • #2
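    Yes, the problem here is that your variable, composite, is a time-invariant attribute of the companies, so it is collinear with the fixed effects and gets omitted. The fixed-effects (within) estimator works with each variable's deviation from its company mean, and for a variable that never changes within a company those deviations are all zero. That's linear algebra, and there is no way around it in this fixed-effects model.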
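
    A quick way to check this in the data is -xtsum-. This is only a sketch, and it assumes the data are already -xtset- with company1 as the panel variable, as the output above suggests.

    ```stata
    * Assumes the data in memory are already -xtset- on company1
    xtsum composite_insurance
    * A "within" standard deviation of 0 means the dummy never changes
    * inside any company, so the fixed effects absorb it completely
    ```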

    The numerous economists and econometricians who populate this Forum can probably come up with more, possibly better, ideas. But I see three approaches that may help you achieve your research goals.

    The first is to leave the fixed-effects model behind and recognize that you need a different model to estimate the effect of a time-invariant attribute. Such an effect acts purely on differences between companies, not within companies (at least in your data). For this purpose, I would use Mundlak regression. If you are using Stata version 18.5, this is the -xtreg, cre- command. If you are using an earlier version of Stata, you can install -xthybrid- from SSC and use that. This will enable you to separately estimate within-company and between-company effects in the data. For the time-varying variables, the within-company effects will be identical to what you would get from a fixed-effects model. But for variables like composite, you will be able to get a between-company estimate.
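
    For illustration, here is a minimal sketch of a Mundlak regression done by hand, which should work in any recent Stata version. The variable names come from the original post; the m_ prefix and the insample flag are names I made up, and the code assumes the data are already -xtset- on company1 and that no other variables in the data start with m_.

    ```stata
    * Time-varying regressors, as in the original post
    local tv log_total_revenue_life log_solvency_ratio GDP_Growth Inflation ///
        Interest_Rate log_No_firms log_den

    * Flag the estimation sample so the company means use the same observations
    quietly xtreg log_growthrate `tv', fe
    generate byte insample = e(sample)

    * Company-level mean of each time-varying regressor (hypothetical m_ prefix)
    foreach v of local tv {
        bysort company1: egen double m_`v' = mean(cond(insample, `v', .))
    }

    * Random-effects model augmented with the company means and the dummy
    xtreg log_growthrate `tv' m_* composite_insurance if insample, re vce(cluster company1)
    ```

    The coefficients on the original time-varying variables should closely match the fixed-effects estimates, while composite_insurance now gets a between-company estimate.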

    The second is to use a different outcome variable. Instead of modeling the (log) growth rate of premiums, model the level of premiums, and on the right-hand side of the equation use composite, year, and their interaction in a fixed-effects model. The composite variable itself will still be omitted due to collinearity, but the composite#year interaction will be preserved and its coefficient(s) will give you the contrast between the rate of increase in premiums in composite and non-composite companies.
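
    A sketch of what that could look like, under assumptions: log_premium (the log of the premium level) and year are hypothetical variable names that do not appear in the original post, and the time-varying controls are left out for brevity.

    ```stata
    * Hypothetical names: log_premium (outcome in levels) and year (time variable)
    xtset company1 year

    * The main effect of composite_insurance is still omitted, but the
    * interaction with a linear time trend is identified
    xtreg log_premium i.composite_insurance##c.year, fe vce(cluster company1)

    * 1.composite_insurance#c.year is the difference in average annual growth
    * (in log points) between composite and non-composite companies.
    * Use ##i.year instead for a separate contrast in each year.
    ```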

    The third is to get a different data set in which composite is no longer time-invariant. That is, get a new data set consisting of firms that started out as non-composite and then became composite at some point in time. Then you can do a difference-in-differences regression of their growth rates. I have no idea whether such a data set can feasibly be found, but if it can, it offers another way out of your dilemma.
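
    If such data existed, a basic two-way fixed-effects version of that regression might look like the sketch below. It assumes a hypothetical data set in which composite_insurance switches from 0 to 1 within some companies, and a time variable named year, neither of which is confirmed by the original post.

    ```stata
    * Hypothetical data: composite_insurance changes from 0 to 1 within some firms
    xtset company1 year

    * Company fixed effects via -fe-, year fixed effects via i.year
    xtreg log_growthrate i.composite_insurance i.year, fe vce(cluster company1)

    * The coefficient on 1.composite_insurance is the difference-in-differences
    * estimate of becoming a composite insurer
    ```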



    • #3
      Thank you very much for your response. It worked!
      But I also have another question. I used vce(robust) to adjust for heteroskedasticity in my data, but when I use vce(cluster panel_id) I get exactly the same results. Am I doing something wrong, or is this normal? The Wooldridge test was insignificant. For simplicity, the example below uses a regression without the composite dummy.
      Code:
      . xtreg log_growthrate log_total_revenue_life log_solvency_ratio GDP_Growth Inflation Interest_Rate HHI Market_openness log_den, fe vce(cluster company1)
      
      Fixed-effects (within) regression               Number of obs     =      5,750
      Group variable: company1                        Number of groups  =        850
      
      R-squared:                                      Obs per group:
           Within  = 0.1249                                         min =          1
           Between = 0.0060                                         avg =        6.8
           Overall = 0.0079                                         max =          8
      
                                                      F(8, 849)         =      22.04
      corr(u_i, Xb) = -0.9674                         Prob > F          =     0.0000
      
                                             (Std. err. adjusted for 850 clusters in company1)
      ----------------------------------------------------------------------------------------
                             |               Robust
              log_growthrate | Coefficient  std. err.      t    P>|t|     [95% conf. interval]
      -----------------------+----------------------------------------------------------------
      log_total_revenue_life |   .2530106   .0258874     9.77   0.000     .2021997    .3038214
          log_solvency_ratio |   .0697309   .0223406     3.12   0.002     .0258817    .1135802
                  GDP_Growth |    .005953   .0015325     3.88   0.000     .0029452    .0089608
                   Inflation |  -.0105186     .00529    -1.99   0.047    -.0209017   -.0001355
               Interest_Rate |   .0296831   .0060567     4.90   0.000     .0177953    .0415709
                         HHI |   .4215014   .3666259     1.15   0.251    -.2980979    1.141101
             Market_openness |   .2234144   .0908967     2.46   0.014     .0450057     .401823
                     log_den |   .1140602   .0369959     3.08   0.002     .0414461    .1866744
                       _cons |  -5.914331   .7765018    -7.62   0.000    -7.438419   -4.390243
      -----------------------+----------------------------------------------------------------
                     sigma_u |  .65209959
                     sigma_e |  .30718366
                         rho |  .81839379   (fraction of variance due to u_i)
      ----------------------------------------------------------------------------------------
      
      . 
      end of do-file
      Code:
      xtreg log_growthrate log_total_revenue_life log_solvency_ratio GDP_Growth Inflation Interest_Rate HHI Market_openness log_den, fe vce(robust)
      
      Fixed-effects (within) regression               Number of obs     =      5,750
      Group variable: company1                        Number of groups  =        850
      
      R-squared:                                      Obs per group:
           Within  = 0.1249                                         min =          1
           Between = 0.0060                                         avg =        6.8
           Overall = 0.0079                                         max =          8
      
                                                      F(8, 849)         =      22.04
      corr(u_i, Xb) = -0.9674                         Prob > F          =     0.0000
      
                                             (Std. err. adjusted for 850 clusters in company1)
      ----------------------------------------------------------------------------------------
                             |               Robust
              log_growthrate | Coefficient  std. err.      t    P>|t|     [95% conf. interval]
      -----------------------+----------------------------------------------------------------
      log_total_revenue_life |   .2530106   .0258874     9.77   0.000     .2021997    .3038214
          log_solvency_ratio |   .0697309   .0223406     3.12   0.002     .0258817    .1135802
                  GDP_Growth |    .005953   .0015325     3.88   0.000     .0029452    .0089608
                   Inflation |  -.0105186     .00529    -1.99   0.047    -.0209017   -.0001355
               Interest_Rate |   .0296831   .0060567     4.90   0.000     .0177953    .0415709
                         HHI |   .4215014   .3666259     1.15   0.251    -.2980979    1.141101
             Market_openness |   .2234144   .0908967     2.46   0.014     .0450057     .401823
                     log_den |   .1140602   .0369959     3.08   0.002     .0414461    .1866744
                       _cons |  -5.914331   .7765018    -7.62   0.000    -7.438419   -4.390243
      -----------------------+----------------------------------------------------------------
                     sigma_u |  .65209959
                     sigma_e |  .30718366
                         rho |  .81839379   (fraction of variance due to u_i)
      ----------------------------------------------------------------------------------------
      
      . 
      end of do-file



      • #4
        Some time in the past, I think version 13, Stata changed -xtreg, fe- so that invoking vce(robust) would automatically be reinterpreted as vce(cluster panel_variable). So identical results from the two commands are normal.
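
        You can confirm this yourself on any panel. The sketch below uses the nlswork example data set that ships with Stata rather than your data; the matrix names V_robust and V_cluster are just labels I chose.

        ```stata
        * Example panel shipped with Stata (not the insurance data from this thread)
        webuse nlswork, clear
        xtset idcode year

        * Same model, two ways of requesting the variance estimator
        quietly xtreg ln_wage age tenure, fe vce(robust)
        matrix V_robust = e(V)
        quietly xtreg ln_wage age tenure, fe vce(cluster idcode)
        matrix V_cluster = e(V)

        * Relative difference between the two variance matrices
        display mreldif(V_robust, V_cluster)
        assert mreldif(V_robust, V_cluster) < 1e-12
        ```

        If the two options were treated differently, the -assert- would fail.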
