  • Panel data xtreg i.years - multiple years omitted

    Hello,

    I am doing research on the effect of gender diversity on firm performance using panel data.

    As a control variable, I added quota dummies. For instance, the law required 20% women on boards by 2014 and 40% by 2017. My dataset runs from 2005 to 2019.

    So my dummies are:
    NoQuota=1 from 2005 to 2013
    BoardQuota20=1 if ReportingYear==2014, 2015 and 2016
    BoardQuota40=1 if ReportingYear==2017, 2018 and 2019

    Now, when I run xtreg with i.ReportingYear, several years are omitted.
    2005 is omitted because Stata uses it as the base year.
    It seems that 2016 is omitted because of BoardQuota20 and 2019 is omitted because of NoQuota and BoardQuota40.


    I searched the forum and asked my supervisor, but did not find a solution. How would you keep a quota dummy and still avoid this issue?



    To create my dummies, I did:

    Code:
    * NoQuota = 1 for every year outside 2014-2019
    generate NoQuota = !inrange(ReportingYear, 2014, 2019)

    * BoardQuota20 = 1 for 2014-2016
    generate BoardQuota20 = inrange(ReportingYear, 2014, 2016)

    * BoardQuota40 = 1 for 2017-2019
    generate BoardQuota40 = inrange(ReportingYear, 2017, 2019)


    Then for the regressions, I did:
    Code:
    xtset RIC ReportingYear
    xtreg ROA BoardGenderDiversity i.ReportingYear
    --> Here I had no problem, no variable omitted

    Code:
    xtreg ROA BoardGenderDiversity NoQuota i.ReportingYear
    note: 2019.ReportingYear omitted because of collinearity
    Code:
    xtreg ROA BoardGenderDiversity BoardQuota20 i.ReportingYear
    note: 2016.ReportingYear omitted because of collinearity
    Code:
    xtreg ROA BoardGenderDiversity BoardQuota40 i.ReportingYear
    note: 2019.ReportingYear omitted because of collinearity
    Code:
    xtreg ROA BoardGenderDiversity NoQuota BoardQuota20 BoardQuota40 i.ReportingYear
    note: BoardQuota40 omitted because of collinearity
    note: 2016.ReportingYear omitted because of collinearity
    note: 2019.ReportingYear omitted because of collinearity


    Here is the full output from the last regression:
    Code:
    . xtreg ROA BoardGenderDiversity NoQuota BoardQuota20 BoardQuota40 i.ReportingYear
    note: BoardQuota40 omitted because of collinearity
    note: 2016.ReportingYear omitted because of collinearity
    note: 2019.ReportingYear omitted because of collinearity
    
    Random-effects GLS regression                   Number of obs     =      1,058
    Group variable: RIC                             Number of groups  =        112

    R-sq:                                           Obs per group:
         within  = 0.0457                                         min =          1
         between = 0.0054                                         avg =        9.4
         overall = 0.0104                                         max =         15

                                                    Wald chi2(15)     =      45.45
    corr(u_i, X)   = 0 (assumed)                    Prob > chi2       =     0.0001

    --------------------------------------------------------------------------------------
                     ROA |      Coef.   Std. Err.      z    P>|z|     [95% Conf. Interval]
    ---------------------+----------------------------------------------------------------
    BoardGenderDiversity |   .0069789   .0127523     0.55   0.584    -.0180151    .0319728
                 NoQuota |   .0141971   .0066756     2.13   0.033     .0011132    .0272811
            BoardQuota20 |   .0009718   .0039353     0.25   0.805    -.0067413    .0086849
            BoardQuota40 |          0  (omitted)
                         |
           ReportingYear |
                   2006  |  -.0021102   .0049543    -0.43   0.670    -.0118204    .0076001
                   2007  |  -.0012082   .0048026    -0.25   0.801    -.0106212    .0082047
                   2008  |  -.0042524   .0047132    -0.90   0.367    -.0134902    .0049854
                   2009  |  -.0218363   .0049033    -4.45   0.000    -.0314466    -.012226
                   2010  |  -.0092194   .0047844    -1.93   0.054    -.0185967    .0001579
                   2011  |  -.0097298   .0048925    -1.99   0.047    -.0193189   -.0001407
                   2012  |  -.0138445   .0051586    -2.68   0.007    -.0239552   -.0037338
                   2013  |  -.0139143   .0053079    -2.62   0.009    -.0243175    -.003511
                   2014  |  -.0027877   .0040655    -0.69   0.493    -.0107559    .0051806
                   2015  |   .0010093   .0037689     0.27   0.789    -.0063776    .0083962
                   2016  |          0  (omitted)
                   2017  |   .0024925   .0038145     0.65   0.513    -.0049837    .0099687
                   2018  |   .0072331   .0039234     1.84   0.065    -.0004567    .0149229
                   2019  |          0  (omitted)
                         |
                   _cons |   .0439863   .0087964     5.00   0.000     .0267457     .061227
    ---------------------+----------------------------------------------------------------
                 sigma_u |    .068035
                 sigma_e |  .02444248
                     rho |   .8856846   (fraction of variance due to u_i)
    --------------------------------------------------------------------------------------
    Thank you very much for your help,
    Best,
    Last edited by Ambroise Lescudier; 18 Jul 2020, 09:24.

  • #2
    You have a logical problem. If you have variables that are identical to the year, then you can't control for both the year and that variable. One alternative would be to move from dummies for year to a continuous variable.
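
    A minimal sketch of the point above, assuming the variables created in post #1 are in memory. The three assert lines check that each quota dummy is an exact sum of year indicators, which is why Stata cannot separate it from i.ReportingYear. The final xtreg line is one possible reading of the suggested alternative: a linear time trend (c.ReportingYear) in place of the year dummies, with NoQuota left out as the base category. That specification is an assumption for illustration, not something the original poster ran.

    ```stata
    * Sketch only: assumes NoQuota, BoardQuota20 and BoardQuota40 from post #1 exist.
    * Each quota dummy is an exact sum of year indicators, so these
    * assertions should pass silently.
    assert BoardQuota20 == (ReportingYear == 2014) + (ReportingYear == 2015) + (ReportingYear == 2016)
    assert BoardQuota40 == (ReportingYear == 2017) + (ReportingYear == 2018) + (ReportingYear == 2019)
    assert NoQuota + BoardQuota20 + BoardQuota40 == 1

    * Assumed alternative: linear time trend instead of year dummies.
    * NoQuota is omitted on purpose so that it serves as the base category.
    xtset RIC ReportingYear
    xtreg ROA BoardGenderDiversity BoardQuota20 BoardQuota40 c.ReportingYear
    ```

    Under the trend specification the quota coefficients are identified only relative to the assumed linear trend, so they also absorb any other shock specific to 2014-2016 or 2017-2019.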



    • #3
      Originally posted by Phil Bromiley
      You have a logical problem. If you have variables that are identical to the year, then you can't control for both the year and that variable. One alternative would be to move from dummies for year to a continuous variable.
      Hello,
      Thank you very much. However, why aren't all of those years omitted, then? Shouldn't including BoardQuota20 omit 2014, 2015 and 2016, and not only 2016?

      My ultimate goal is a difference-in-differences analysis of the impact of the quota requirements, but for that I need the panel regression not to omit any years.

      I tried the continuous variable:

      Code:
      * Quota = 0 before 2014, 1 for 2014-2016, 2 for 2017-2019
      generate Quota = 0
      replace Quota = 1 if inrange(ReportingYear, 2014, 2016)
      replace Quota = 2 if inrange(ReportingYear, 2017, 2019)
      and got the following:
      Code:
      xtset RIC ReportingYear
             panel variable:  RIC (strongly balanced)
              time variable:  ReportingYear, 2005 to 2019
                      delta:  1 unit
      
      . 
      . xtreg ROA BoardGenderDiversity Quota i.ReportingYear
      note: 2019.ReportingYear omitted because of collinearity
      
      Random-effects GLS regression                   Number of obs     =      1,058
      Group variable: RIC                             Number of groups  =        112
      
      R-sq:                                           Obs per group:
           within  = 0.0457                                         min =          1
           between = 0.0054                                         avg =        9.4
           overall = 0.0104                                         max =         15
      
                                                      Wald chi2(15)     =      45.45
      corr(u_i, X)   = 0 (assumed)                    Prob > chi2       =     0.0001
      
      --------------------------------------------------------------------------------------
                       ROA |      Coef.   Std. Err.      z    P>|z|     [95% Conf. Interval]
      ---------------------+----------------------------------------------------------------
      BoardGenderDiversity |   .0069789   .0127523     0.55   0.584    -.0180151    .0319728
                     Quota |  -.0070986   .0033378    -2.13   0.033    -.0136405   -.0005566
                           |
             ReportingYear |
                     2006  |  -.0021102   .0049543    -0.43   0.670    -.0118204    .0076001
                     2007  |  -.0012082   .0048026    -0.25   0.801    -.0106212    .0082047
                     2008  |  -.0042524   .0047132    -0.90   0.367    -.0134902    .0049854
                     2009  |  -.0218363   .0049033    -4.45   0.000    -.0314466    -.012226
                     2010  |  -.0092194   .0047844    -1.93   0.054    -.0185967    .0001579
                     2011  |  -.0097298   .0048925    -1.99   0.047    -.0193189   -.0001407
                     2012  |  -.0138445   .0051586    -2.68   0.007    -.0239552   -.0037338
                     2013  |  -.0139143   .0053079    -2.62   0.009    -.0243175    -.003511
                     2014  |  -.0089144   .0038665    -2.31   0.021    -.0164926   -.0013362
                     2015  |  -.0051174   .0037141    -1.38   0.168    -.0123968     .002162
                     2016  |  -.0061268   .0037436    -1.64   0.102    -.0134641    .0012106
                     2017  |   .0024925   .0038145     0.65   0.513    -.0049837    .0099687
                     2018  |   .0072331   .0039234     1.84   0.065    -.0004567    .0149229
                     2019  |          0  (omitted)
                           |
                     _cons |   .0581835   .0074573     7.80   0.000     .0435674    .0727996
      ---------------------+----------------------------------------------------------------
                   sigma_u |    .068035
                   sigma_e |  .02444248
                       rho |   .8856846   (fraction of variance due to u_i)
      --------------------------------------------------------------------------------------
      2019 still drops out in this case. Can you think of something else?

      Thank you very much for your help.
      Best,
      Last edited by Ambroise Lescudier; 20 Jul 2020, 17:23.



      • #4
        Ambroise:
        it may well be that -i.2019- is perfectly correlated to another predictor.
        But the substantive issue with your model seems to be a very low R-sq between (actually, this is the R-sq to consider under a -re- specification): I would check whether your model is misspecified and whether all the predictors needed to give a true and fair view of the data generating process were actually included.
        Kind regards,
        Carlo
        (Stata 19.0)
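
        One way to check whether a regressor is perfectly explained by the year dummies, sketched here for the Quota variable from post #3 (assumed to be in memory), is to regress it on i.ReportingYear. Quota equals 1 times the sum of the 2014-2016 year indicators plus 2 times the sum of the 2017-2019 indicators, so the R-squared should come out as 1. That is a single exact linear dependency, which is why dropping one variable (here 2019.ReportingYear) is enough to restore full rank and why the other quota years are kept.

        ```stata
        * Sketch only: assumes Quota from post #3 exists in memory.
        * Regress the suspect variable on the full set of year dummies.
        regress Quota i.ReportingYear

        * An R-squared of 1 means Quota is an exact linear combination
        * of the year dummies and the constant.
        display "R-squared = " e(r2)
        ```

        The same check can be run on NoQuota, BoardQuota20 or BoardQuota40 by swapping the dependent variable.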



        • #5
          Hello Carlo,
          Thank you for your response.
          it may well be that -i.2019- is perfectly correlated to another predictor.
          This is what I concluded.

          About the R-sq: I simplified my model to isolate this problem. My full model has more control variables and thus a higher R-sq. I just wanted to resolve the Quota issue. But thank you!
