
  • Issue concerning i.Year

    Hello everyone, I am working on a panel data set whose time period goes from 2004 to 2016. I am experiencing an issue with time dummies both when using i.Year and when generating time dummies manually as STATA keeps on using 2 base groups instead of 1.

    My dataset is coded in this way:
    Code:
    . xtset countrynum Year, yearly
           panel variable:  countrynum (strongly balanced)
            time variable:  Year, 2004 to 2016
                    delta:  1 year
    
    .
    When using i.Year STATA uses both 2004 and 2005 as base years:
    Code:
    . reg Y I logYlevel_1 n H  i.countrynum Cor_1 i.countrynum##c.Cor_1 i.Year, vce(cluster countrynu
    > m)
    note: Cor_1 omitted because of collinearity
    
    Linear regression                               Number of obs     =        240
                                                    F(14, 19)         =          .
                                                    Prob > F          =          .
                                                    R-squared         =     0.7647
                                                    Root MSE          =     .01438
    
                                      (Std. Err. adjusted for 20 clusters in countrynum)
    ------------------------------------------------------------------------------------
                       |               Robust
                     Y |      Coef.   Std. Err.      t    P>|t|     [95% Conf. Interval]
    -------------------+----------------------------------------------------------------
                     I |  -.0050199   .0195443    -0.26   0.800    -.0459266    .0358867
           logYlevel_1 |  -.1255014   .0508644    -2.47   0.023    -.2319618    -.019041
                     n |  -.0001286   .0003074    -0.42   0.680    -.0007721    .0005148
                     H |   .1810461   .1962326     0.92   0.368    -.2296734    .5917656
                       |
            countrynum |
                    2  |  -.0075186   .0115144    -0.65   0.522    -.0316185    .0165812
                    3  |  -.0415829   .0185308    -2.24   0.037    -.0803684   -.0027975
                    4  |  -.0301643   .0216167    -1.40   0.179    -.0754087      .01508
                    5  |   .0385532   .0177765     2.17   0.043     .0013464    .0757599
                    6  |   .0338513   .0113286     2.99   0.008     .0101402    .0575623
                    7  |   .0341033   .0208821     1.63   0.119    -.0096034    .0778101
                    8  |    .023549   .0119514     1.97   0.064    -.0014656    .0485635
                    9  |   .0537951   .0334736     1.61   0.125     -.016266    .1238563
                   10  |    .008647   .0057067     1.52   0.146    -.0032972    .0205912
                   11  |  -.0228318   .0093685    -2.44   0.025    -.0424402   -.0032233
                   12  |   .0125716   .0159917     0.79   0.441    -.0208994    .0460427
                   13  |  -.0263945   .0187979    -1.40   0.176    -.0657389    .0129498
                   14  |  -.0221471   .0113101    -1.96   0.065    -.0458194    .0015251
                   15  |  -.0336907   .0185583    -1.82   0.085    -.0725337    .0051524
                   16  |   .0114023   .0125498     0.91   0.375    -.0148647    .0376693
                   17  |   .0544347   .0217663     2.50   0.022     .0088773    .0999922
                   18  |  -.0102997   .0046371    -2.22   0.039    -.0200052   -.0005942
                   19  |   .0412845   .0238289     1.73   0.099      -.00859     .091159
                   20  |   .0377269   .0146951     2.57   0.019     .0069696    .0684842
                       |
                 Cor_1 |  -44.65295   475.2509    -0.09   0.926    -1039.364    950.0586
                 Cor_1 |          0  (omitted)
                       |
    countrynum#c.Cor_1 |
                    2  |  -785.9514     338.93    -2.32   0.032     -1495.34   -76.56274
                    3  |  -350.8146   914.5333    -0.38   0.706    -2264.955    1563.326
                    4  |   29.94349   449.2771     0.07   0.948    -910.4044    970.2914
                    5  |   909.2216   706.9965     1.29   0.214     -570.539    2388.982
                    6  |  -797.9921   222.6642    -3.58   0.002    -1264.034   -331.9506
                    7  |    -308.33   548.4422    -0.56   0.581    -1456.233    839.5728
                    8  |  -62.57157   447.8015    -0.14   0.890     -999.831    874.6878
                    9  |   372.6863   449.6995     0.83   0.418    -568.5455    1313.918
                   10  |   75.53298    746.382     0.10   0.920    -1486.663    1637.728
                   11  |   -32.4799   463.6609    -0.07   0.945    -1002.933    937.9734
                   12  |   1929.619    461.482     4.18   0.001     963.7258    2895.512
                   13  |  -957.3895   506.8818    -1.89   0.074    -2018.305    103.5263
                   14  |   1648.217   965.5464     1.71   0.104    -372.6945    3669.129
                   15  |   53.18744   615.2624     0.09   0.932    -1234.572    1340.947
                   16  |    2202.39   565.7829     3.89   0.001     1018.192    3386.587
                   17  |   638.5813   549.3248     1.16   0.259    -511.1688    1788.331
                   18  |   327.0107    517.551     0.63   0.535     -756.236    1410.257
                   19  |    582.496   405.3486     1.44   0.167    -265.9084      1430.9
                   20  |   9.871007   1133.115     0.01   0.993    -2361.765    2381.507
                       |
                  Year |
                 2006  |   .0142296   .0039022     3.65   0.002     .0060622     .022397
                 2007  |   .0049212   .0039376     1.25   0.227    -.0033202    .0131626
                 2008  |  -.0247142   .0052067    -4.75   0.000    -.0356119   -.0138166
                 2009  |   -.065904   .0061165   -10.77   0.000    -.0787059   -.0531021
                 2010  |  -.0084701   .0080807    -1.05   0.308    -.0253833     .008443
                 2011  |  -.0142801   .0077827    -1.83   0.082    -.0305695    .0020092
                 2012  |   -.044804   .0102844    -4.36   0.000    -.0663294   -.0232785
                 2013  |  -.0459268   .0135691    -3.38   0.003    -.0743273   -.0175263
                 2014  |  -.0290396   .0113149    -2.57   0.019     -.052722   -.0053573
                 2015  |  -.0111417   .0152856    -0.73   0.475    -.0431348    .0208515
                 2016  |  -.0155312   .0126094    -1.23   0.233    -.0419229    .0108605
                       |
                 _cons |   1.252449   .5100886     2.46   0.024     .1848215    2.320077
    ------------------------------------------------------------------------------------
    When I manually calculate time dummies, I use 2004 as a base group, but STATA says that 2016 was omitted for multicollinearity:
    Code:
    . reg Y I logYlevel_1 n H  i.countrynum Cor_1 i.countrynum##c.Cor_1 d2005 d2006 d2007 d2008 d2009
    >  d2010 d2011 d2012 d2013 d2014 d2015 d2016 , vce(cluster countrynum)
    note: Cor_1 omitted because of collinearity
    note: d2016 omitted because of collinearity
    
    Linear regression                               Number of obs     =        240
                                                    F(14, 19)         =          .
                                                    Prob > F          =          .
                                                    R-squared         =     0.7647
                                                    Root MSE          =     .01438
    
                                      (Std. Err. adjusted for 20 clusters in countrynum)
    ------------------------------------------------------------------------------------
                       |               Robust
                     Y |      Coef.   Std. Err.      t    P>|t|     [95% Conf. Interval]
    -------------------+----------------------------------------------------------------
                     I |  -.0050199   .0195443    -0.26   0.800    -.0459266    .0358867
           logYlevel_1 |  -.1255014   .0508644    -2.47   0.023    -.2319618    -.019041
                     n |  -.0001286   .0003074    -0.42   0.680    -.0007721    .0005148
                     H |   .1810461   .1962326     0.92   0.368    -.2296734    .5917656
                       |
            countrynum |
                    2  |  -.0075186   .0115144    -0.65   0.522    -.0316185    .0165812
                    3  |  -.0415829   .0185308    -2.24   0.037    -.0803684   -.0027975
                    4  |  -.0301643   .0216167    -1.40   0.179    -.0754087      .01508
                    5  |   .0385532   .0177765     2.17   0.043     .0013464    .0757599
                    6  |   .0338513   .0113286     2.99   0.008     .0101402    .0575623
                    7  |   .0341033   .0208821     1.63   0.119    -.0096034    .0778101
                    8  |    .023549   .0119514     1.97   0.064    -.0014656    .0485635
                    9  |   .0537951   .0334736     1.61   0.125     -.016266    .1238563
                   10  |    .008647   .0057067     1.52   0.146    -.0032972    .0205912
                   11  |  -.0228318   .0093685    -2.44   0.025    -.0424402   -.0032233
                   12  |   .0125716   .0159917     0.79   0.441    -.0208994    .0460427
                   13  |  -.0263945   .0187979    -1.40   0.176    -.0657389    .0129498
                   14  |  -.0221471   .0113101    -1.96   0.065    -.0458194    .0015251
                   15  |  -.0336907   .0185583    -1.82   0.085    -.0725337    .0051524
                   16  |   .0114023   .0125498     0.91   0.375    -.0148647    .0376693
                   17  |   .0544347   .0217663     2.50   0.022     .0088773    .0999922
                   18  |  -.0102997   .0046371    -2.22   0.039    -.0200052   -.0005942
                   19  |   .0412845   .0238289     1.73   0.099      -.00859     .091159
                   20  |   .0377269   .0146951     2.57   0.019     .0069696    .0684842
                       |
                 Cor_1 |  -44.65295   475.2509    -0.09   0.926    -1039.364    950.0586
                 Cor_1 |          0  (omitted)
                       |
    countrynum#c.Cor_1 |
                    2  |  -785.9514     338.93    -2.32   0.032     -1495.34   -76.56274
                    3  |  -350.8146   914.5333    -0.38   0.706    -2264.955    1563.326
                    4  |   29.94349   449.2771     0.07   0.948    -910.4044    970.2914
                    5  |   909.2216   706.9965     1.29   0.214     -570.539    2388.982
                    6  |  -797.9921   222.6642    -3.58   0.002    -1264.034   -331.9506
                    7  |    -308.33   548.4422    -0.56   0.581    -1456.233    839.5728
                    8  |  -62.57157   447.8015    -0.14   0.890     -999.831    874.6878
                    9  |   372.6863   449.6995     0.83   0.418    -568.5455    1313.918
                   10  |   75.53298    746.382     0.10   0.920    -1486.663    1637.728
                   11  |   -32.4799   463.6609    -0.07   0.945    -1002.933    937.9734
                   12  |   1929.619    461.482     4.18   0.001     963.7258    2895.512
                   13  |  -957.3895   506.8818    -1.89   0.074    -2018.305    103.5263
                   14  |   1648.217   965.5464     1.71   0.104    -372.6945    3669.129
                   15  |   53.18744   615.2624     0.09   0.932    -1234.572    1340.947
                   16  |    2202.39   565.7829     3.89   0.001     1018.192    3386.587
                   17  |   638.5813   549.3248     1.16   0.259    -511.1688    1788.331
                   18  |   327.0107    517.551     0.63   0.535     -756.236    1410.257
                   19  |    582.496   405.3486     1.44   0.167    -265.9084      1430.9
                   20  |   9.871007   1133.115     0.01   0.993    -2361.765    2381.507
                       |
                 d2005 |   .0155312   .0126094     1.23   0.233    -.0108605    .0419229
                 d2006 |   .0297608   .0104419     2.85   0.010     .0079056    .0516161
                 d2007 |   .0204524   .0107642     1.90   0.073    -.0020774    .0429822
                 d2008 |   -.009183   .0113028    -0.81   0.427    -.0328401     .014474
                 d2009 |  -.0503728   .0090671    -5.56   0.000    -.0693504   -.0313952
                 d2010 |   .0070611   .0079591     0.89   0.386    -.0095974    .0237196
                 d2011 |   .0012511   .0063774     0.20   0.847     -.012097    .0145992
                 d2012 |  -.0292728   .0047347    -6.18   0.000    -.0391825    -.019363
                 d2013 |  -.0303956   .0096392    -3.15   0.005    -.0505708   -.0102204
                 d2014 |  -.0135084   .0036453    -3.71   0.002    -.0211382   -.0058787
                 d2015 |   .0043896   .0062192     0.71   0.489    -.0086274    .0174065
                 d2016 |          0  (omitted)
                 _cons |   1.236918   .5037766     2.46   0.024     .1825014    2.291334
    -----------------------------------------------------------------------------------
    The same happens to 2004 if I use 2016 as a base group:
    Code:
    . reg Y I logYlevel_1 n H  i.countrynum Cor_1 i.countrynum##c.Cor_1 d2004 d2005 d2006 d2007 d2008
    >  d2009 d2010 d2011 d2012 d2013 d2014 d2015 , vce(cluster countrynum)
    note: Cor_1 omitted because of collinearity
    note: d2004 omitted because of collinearity
    
    Linear regression                               Number of obs     =        240
                                                    F(14, 19)         =          .
                                                    Prob > F          =          .
                                                    R-squared         =     0.7647
                                                    Root MSE          =     .01438
    
                                      (Std. Err. adjusted for 20 clusters in countrynum)
    ------------------------------------------------------------------------------------
                       |               Robust
                     Y |      Coef.   Std. Err.      t    P>|t|     [95% Conf. Interval]
    -------------------+----------------------------------------------------------------
                     I |  -.0050199   .0195443    -0.26   0.800    -.0459266    .0358867
           logYlevel_1 |  -.1255014   .0508644    -2.47   0.023    -.2319618    -.019041
                     n |  -.0001286   .0003074    -0.42   0.680    -.0007721    .0005148
                     H |   .1810461   .1962326     0.92   0.368    -.2296734    .5917656
                       |
            countrynum |
                    2  |  -.0075186   .0115144    -0.65   0.522    -.0316185    .0165812
                    3  |  -.0415829   .0185308    -2.24   0.037    -.0803684   -.0027975
                    4  |  -.0301643   .0216167    -1.40   0.179    -.0754087      .01508
                    5  |   .0385532   .0177765     2.17   0.043     .0013464    .0757599
                    6  |   .0338513   .0113286     2.99   0.008     .0101402    .0575623
                    7  |   .0341033   .0208821     1.63   0.119    -.0096034    .0778101
                    8  |    .023549   .0119514     1.97   0.064    -.0014656    .0485635
                    9  |   .0537951   .0334736     1.61   0.125     -.016266    .1238563
                   10  |    .008647   .0057067     1.52   0.146    -.0032972    .0205912
                   11  |  -.0228318   .0093685    -2.44   0.025    -.0424402   -.0032233
                   12  |   .0125716   .0159917     0.79   0.441    -.0208994    .0460427
                   13  |  -.0263945   .0187979    -1.40   0.176    -.0657389    .0129498
                   14  |  -.0221471   .0113101    -1.96   0.065    -.0458194    .0015251
                   15  |  -.0336907   .0185583    -1.82   0.085    -.0725337    .0051524
                   16  |   .0114023   .0125498     0.91   0.375    -.0148647    .0376693
                   17  |   .0544347   .0217663     2.50   0.022     .0088773    .0999922
                   18  |  -.0102997   .0046371    -2.22   0.039    -.0200052   -.0005942
                   19  |   .0412845   .0238289     1.73   0.099      -.00859     .091159
                   20  |   .0377269   .0146951     2.57   0.019     .0069696    .0684842
                       |
                 Cor_1 |  -44.65295   475.2509    -0.09   0.926    -1039.364    950.0586
                 Cor_1 |          0  (omitted)
                       |
    countrynum#c.Cor_1 |
                    2  |  -785.9514     338.93    -2.32   0.032     -1495.34   -76.56274
                    3  |  -350.8146   914.5333    -0.38   0.706    -2264.955    1563.326
                    4  |   29.94349   449.2771     0.07   0.948    -910.4044    970.2914
                    5  |   909.2216   706.9965     1.29   0.214     -570.539    2388.982
                    6  |  -797.9921   222.6642    -3.58   0.002    -1264.034   -331.9506
                    7  |    -308.33   548.4422    -0.56   0.581    -1456.233    839.5728
                    8  |  -62.57157   447.8015    -0.14   0.890     -999.831    874.6878
                    9  |   372.6863   449.6995     0.83   0.418    -568.5455    1313.918
                   10  |   75.53298    746.382     0.10   0.920    -1486.663    1637.728
                   11  |   -32.4799   463.6609    -0.07   0.945    -1002.933    937.9734
                   12  |   1929.619    461.482     4.18   0.001     963.7258    2895.512
                   13  |  -957.3895   506.8818    -1.89   0.074    -2018.305    103.5263
                   14  |   1648.217   965.5464     1.71   0.104    -372.6945    3669.129
                   15  |   53.18744   615.2624     0.09   0.932    -1234.572    1340.947
                   16  |    2202.39   565.7829     3.89   0.001     1018.192    3386.587
                   17  |   638.5813   549.3248     1.16   0.259    -511.1688    1788.331
                   18  |   327.0107    517.551     0.63   0.535     -756.236    1410.257
                   19  |    582.496   405.3486     1.44   0.167    -265.9084      1430.9
                   20  |   9.871007   1133.115     0.01   0.993    -2361.765    2381.507
                       |
                 d2004 |          0  (omitted)
                 d2005 |   .0155312   .0126094     1.23   0.233    -.0108605    .0419229
                 d2006 |   .0297608   .0104419     2.85   0.010     .0079056    .0516161
                 d2007 |   .0204524   .0107642     1.90   0.073    -.0020774    .0429822
                 d2008 |   -.009183   .0113028    -0.81   0.427    -.0328401     .014474
                 d2009 |  -.0503728   .0090671    -5.56   0.000    -.0693504   -.0313952
                 d2010 |   .0070611   .0079591     0.89   0.386    -.0095974    .0237196
                 d2011 |   .0012511   .0063774     0.20   0.847     -.012097    .0145992
                 d2012 |  -.0292728   .0047347    -6.18   0.000    -.0391825    -.019363
                 d2013 |  -.0303956   .0096392    -3.15   0.005    -.0505708   -.0102204
                 d2014 |  -.0135084   .0036453    -3.71   0.002    -.0211382   -.0058787
                 d2015 |   .0043896   .0062192     0.71   0.489    -.0086274    .0174065
                 _cons |   1.236918   .5037766     2.46   0.024     .1825014    2.291334
    ------------------------------------------------------------------------------------
    
    .

    Why do you think this is the case?

  • #2
    Simona:
    It cannot be that two years are used as reference categories at the same time, as the aim of the omission is to protect your regression model from the so-called dummy variable trap (see https://en.wikipedia.org/wiki/Dummy_...le_(statistics)).
    My guess is that one year is omitted due to collinearity and the other is omitted because it is the reference category (by the way, -fvvarlist- notation allows you to decide which year, or which level of any categorical variable, you want to set as the reference category).
    Kind regards,
    Carlo
    (Stata 19.0)
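
    A minimal sketch of choosing the reference year explicitly with factor-variable notation, reusing the command from #1 and changing only the Year term. Picking 2016 as the base is an illustrative assumption, not a recommendation.

    Code:
    * Option 1: name the base level inline with the ib#. operator
    * (/// continues a command across lines and needs a do-file)
    regress Y I logYlevel_1 n H i.countrynum Cor_1 i.countrynum##c.Cor_1 ///
        ib2016.Year, vce(cluster countrynum)

    * Option 2: ask for the last level as base without naming it
    regress Y I logYlevel_1 n H i.countrynum Cor_1 i.countrynum##c.Cor_1 ///
        ib(last).Year, vce(cluster countrynum)

    * Option 3: store the base once; later uses of i.Year then honor it
    fvset base 2016 Year

    Whichever year is the base, a second year can still be dropped if it is collinear with other terms or absent from the observations actually used.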



    • #3
      This cannot be answered in specific terms without seeing an example of your data. In general terms, the problem is that one or more of the variables other than year in your model are collinear with the time indicators. For example, if Cor_1 is defined as being 1 if Year > some threshold year, and 0 otherwise, then Cor_1 would be the "culprit." One way you can find out what's going on is:

      Code:
      reg d2004 I logYlevel_1 n H i.countrynum Cor_1 i.countrynum##c.Cor_1   ///
          d2005 d2006 d2007 d2008 d2009 d2010 d2011 d2012 d2013 d2014 d2015
      The results will have R2 = 1. Most of the coefficients will be 0 (either exactly or with very small rounding errors); the remaining coefficients will show you what d2004 is collinear with in this context.

      Depending on what you find, you may conclude that your data contains errors that need to be fixed. Or if it is due to something like Cor_1 corresponding to a threshold division of time as suggested in the preceding paragraph, then you will just be happy to have two time indicators omitted.
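
      On running the diagnostic: /// is Stata's line-continuation marker. It works in a do-file (or the Do-file Editor) but not when typed in the Command window, where the whole command should go on one line. A sketch, assuming the dummies d2004-d2015 from #1 exist, that also checks whether d2004 varies at all among the observations actually used:

      Code:
      * Run from a do-file: /// joins the two physical lines into one command
      regress d2004 I logYlevel_1 n H i.countrynum Cor_1 i.countrynum##c.Cor_1   ///
          d2005 d2006 d2007 d2008 d2009 d2010 d2011 d2012 d2013 d2014 d2015

      * e(sample) marks the observations the regression above used
      summarize d2004 if e(sample)

      If the minimum and maximum are both 0, no 2004 observation entered the estimation, and the diagnostic regression has a constant outcome with nothing to explain.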

      Added: Crossed with #2. But primary reason for editing is to complete the post. Somehow, the post posted itself without my hitting "Post Reply" while I was only about half way through it.
      Last edited by Clyde Schechter; 04 Mar 2019, 10:45.



      • #4
        I note 20 countries and 13 years, which would give 260 observations, yet only 240 are used, so 20 observations are excluded because of missing values. Something is biting there.

        For STATA read Stata passim and please read http://www.statalist.org/forums/help to the end!
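
        One way to locate the excluded observations, sketched with the variable names from #1:

        Code:
        * Report the number of missing values in each model variable
        misstable summarize Y I logYlevel_1 n H Cor_1

        * Refit the model quietly, then tabulate the years that enter the estimation sample
        quietly regress Y I logYlevel_1 n H i.countrynum Cor_1 i.countrynum##c.Cor_1 i.Year, vce(cluster countrynum)
        tabulate Year if e(sample)

        A year that is missing from the tabulation contributes no observations, so its indicator is all zeros in the sample and cannot be estimated.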



        • #5
          Thank you for your replies!

          I have uploaded a screenshot of my data to show you the dataset I am dealing with - in case the problem is related to it.
          [Screenshot of the dataset: Screen Shot 2019-03-05 at 12.57.19.png]



          Carlo:
          -fvvarlist- suggests 2004 to be the base year as when I run
          Code:
           list i.Year
          I get the following (I only copied part of the result).
          Code:
          260. | 2004b. | 2005. | 2006. | 2007. | 2008. | 2009. | 2010. | 2011. | 2012. | 2013. | 2014. |
               |   Year |  Year |  Year |  Year |  Year |  Year |  Year |  Year |  Year |  Year |  Year |
               |      0 |     0 |     0 |     0 |     0 |     0 |     0 |     0 |     0 |     0 |     0 |
               |----------------------------------------------------------------------------------------|
               |                   2015.                   |                   2016.                    |
               |                    Year                   |                    Year                    |
               |                       0                   |                       1                    |
          I have also tried using 2015 as the base year, to check whether the problem only appears with 2004 and 2016, but again 2004 is omitted due to collinearity. How should I deal with this?



          Clyde:
          Cor takes many different values, and they do not depend on whether Year > some threshold year, since Cor is the number of registered corruption crimes.
          Also, I am unsure about the meaning of the /// but I have run your code without the /// and this is my result:
          Code:
          . reg d2004 I logYlevel_1 n H i.countrynum Cor_1 i.countrynum##c.Cor_1  d2005 d2006 d2007 d2008 d
          > 2009 d2010 d2011 d2012 d2013 d2014 d2015
          note: Cor_1 omitted because of collinearity
          
                Source |       SS           df       MS      Number of obs   =       240
          -------------+----------------------------------   F(54, 185)      =         .
                 Model |           0        54           0   Prob > F        =         .
              Residual |           0       185           0   R-squared       =         .
          -------------+----------------------------------   Adj R-squared   =         .
                 Total |           0       239           0   Root MSE        =         0
          
          ------------------------------------------------------------------------------------
                       d2004 |      Coef.   Std. Err.      t    P>|t|     [95% Conf. Interval]
          -------------------+----------------------------------------------------------------
                           I |          0  (omitted)
                 logYlevel_1 |          0  (omitted)
                           n |          0  (omitted)
                           H |          0  (omitted)
                             |
                  countrynum |
                          2  |          0  (omitted)
                          3  |          0  (omitted)
                          4  |          0  (omitted)
                          5  |          0  (omitted)
                          6  |          0  (omitted)
                          7  |          0  (omitted)
                          8  |          0  (omitted)
                          9  |          0  (omitted)
                         10  |          0  (omitted)
                         11  |          0  (omitted)
                         12  |          0  (omitted)
                         13  |          0  (omitted)
                         14  |          0  (omitted)
                         15  |          0  (omitted)
                         16  |          0  (omitted)
                         17  |          0  (omitted)
                         18  |          0  (omitted)
                         19  |          0  (omitted)
                         20  |          0  (omitted)
                             |
                       Cor_1 |          0  (omitted)
                       Cor_1 |          0  (omitted)
                             |
          countrynum#c.Cor_1 |
                          2  |          0  (omitted)
                          3  |          0  (omitted)
                          4  |          0  (omitted)
                          5  |          0  (omitted)
                          6  |          0  (omitted)
                          7  |          0  (omitted)
                          8  |          0  (omitted)
                          9  |          0  (omitted)
                         10  |          0  (omitted)
                         11  |          0  (omitted)
                         12  |          0  (omitted)
                         13  |          0  (omitted)
                         14  |          0  (omitted)
                         15  |          0  (omitted)
                         16  |          0  (omitted)
                         17  |          0  (omitted)
                         18  |          0  (omitted)
                         19  |          0  (omitted)
                         20  |          0  (omitted)
                             |
                       d2005 |          0  (omitted)
                       d2006 |          0  (omitted)
                       d2007 |          0  (omitted)
                       d2008 |          0  (omitted)
                       d2009 |          0  (omitted)
                       d2010 |          0  (omitted)
                       d2011 |          0  (omitted)
                       d2012 |          0  (omitted)
                       d2013 |          0  (omitted)
                       d2014 |          0  (omitted)
                       d2015 |          0  (omitted)
                       _cons |          0  (omitted)
          ------------------------------------------------------------------------------------
          Nick:
          I believe I lose 20 observations because the 2004 values drop out when I use the lagged value of Corruption (Cor_1) instead of Corruption (Cor).
          Last edited by Simona Battipaglia; 05 Mar 2019, 06:02. Reason: Edited to better explain my answer
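
          A sketch of why a lag costs the first year, assuming the unlagged variable is named Cor and the data are -xtset- as in #1. Cor_lag is a hypothetical name used only for illustration; how Cor_1 was actually built is not shown in the thread.

          Code:
          xtset countrynum Year, yearly
          * L. takes the previous year's value within each country
          generate Cor_lag = L.Cor
          * 2004 has no predecessor in the data, so the lag is missing there
          count if missing(Cor_lag) & Year == 2004

          With 20 countries in a balanced panel starting in 2004, this count would be 20, matching the gap between 260 and 240 observations.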



          • #6
            I believe you that Cor_1 is a lagged version of Cor. If it's going to be omitted from the model, then don't include it at all and then you should get all 260 observations used.

            I think Stata's thinking goes

            1. Include Cor_1, but we can't use 20 observations because of missing values.

            2. In fact Cor_1 is omitted as a predictor.

            But step 1 is not reversed. Hence try not feeding Cor_1 to the command in the first place.

            All that said, the number of parameters being estimated here is to me uncomfortably large for this size of dataset.
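
            On the "Cor_1 omitted because of collinearity" note: i.countrynum##c.Cor_1 already expands to the main effects of countrynum and Cor_1 plus their interaction, so listing i.countrynum and Cor_1 separately repeats terms. A sketch of the same specification with each term written once:

            Code:
            * ## supplies both main effects and the interaction
            regress Y I logYlevel_1 n H i.countrynum##c.Cor_1 i.Year, vce(cluster countrynum)

            Note that Cor_1 still enters through the interaction here, so observations where it is missing would presumably still be excluded.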



            • #7
              Simona:
              dealing with collinearity means dealing with your data and your specification.
              Your data are what they are: end of story; you can possibly deal with the specification (I do share Nick's warning about the sky-rocketing number of parameters vs your limited sample size).
              You may want to consider a linear (vs categorical) role of time as a predictor, including a squared term too, to search for a possible turning-point.
              As an aside, the Centre-Southern Italian region of Abruzzo (that offers both mountains and sea) is miswritten -Abbruzzo-.
              That said, the usual warning about not posting screenshots still applies.
              Kind regards,
              Carlo
              (Stata 19.0)
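
              A sketch of the linear-plus-squared time specification. The variable t (years since 2004) is hypothetical, introduced only to keep the squared term on a manageable scale; the other names are from #1.

              Code:
              generate t = Year - 2004
              * c.t##c.t enters t and t squared
              regress Y I logYlevel_1 n H i.countrynum##c.Cor_1 c.t##c.t, vce(cluster countrynum)
              * For b1*t + b2*t^2 the turning point is at t = -b1/(2*b2)
              nlcom -_b[t] / (2 * _b[c.t#c.t])

              Adding 2004 to the -nlcom- estimate converts it back to a calendar year; a turning point outside 2004-2016 would suggest no reversal within the sample period.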



              • #8
                Thank you for noticing the spelling issue.

                Concerning the Cor_1 issue, I am looking for 2004 data to avoid losing observations, as Cor_1 is the main variable of interest in my analysis. (Edit: I have now dealt with the issue, and the model below uses 260 observations.)

                Carlo: Do you mean I should simply add Year as a continuous variable, as follows?

                Code:
                  
                
                . reg Y I logYlevel_1 n H  i.countrynum Cor_1 i.countrynum##c.Cor_1 Year , vce(cluster countrynum
                > )
                note: Cor_1 omitted because of collinearity
                
                Linear regression                               Number of obs     =        260
                                                                F(4, 19)          =          .
                                                                Prob > F          =          .
                                                                R-squared         =     0.2829
                                                                Root MSE          =      .0237
                
                                                  (Std. Err. adjusted for 20 clusters in countrynum)
                ------------------------------------------------------------------------------------
                                   |               Robust
                                 Y |      Coef.   Std. Err.      t    P>|t|     [95% Conf. Interval]
                -------------------+----------------------------------------------------------------
                                 I |   .0136168   .0254749     0.53   0.599    -.0397028    .0669365
                       logYlevel_1 |  -.2853915     .05492    -5.20   0.000    -.4003404   -.1704427
                                 n |  -.0012297   .0002192    -5.61   0.000    -.0016885   -.0007709
                                 H |  -.1132615     .23479    -0.48   0.635    -.6046826    .3781595
                                   |
                        countrynum |
                                2  |  -.0523238   .0123011    -4.25   0.000    -.0780702   -.0265773
                                3  |  -.0942043    .019376    -4.86   0.000    -.1347587   -.0536499
                                4  |  -.0787444   .0267068    -2.95   0.008    -.1346423   -.0228465
                                5  |    .092117    .022458     4.10   0.001     .0451119    .1391221
                                6  |   .0668656   .0124769     5.36   0.000     .0407511    .0929801
                                7  |    .103312   .0239863     4.31   0.000     .0531081    .1535159
                                8  |   .0656904   .0147672     4.45   0.000     .0347824    .0965984
                                9  |   .1017578   .0432451     2.35   0.030     .0112448    .1922708
                               10  |   .0222183   .0069949     3.18   0.005     .0075778    .0368588
                               11  |  -.0434028   .0092926    -4.67   0.000    -.0628524   -.0239532
                               12  |   .0183361   .0201064     0.91   0.373    -.0237471    .0604193
                               13  |  -.0958896    .022312    -4.30   0.000    -.1425892   -.0491901
                               14  |  -.0490688    .012275    -4.00   0.001    -.0747607    -.023377
                               15  |  -.1008043   .0225146    -4.48   0.000    -.1479279   -.0536808
                               16  |   .0685415   .0121233     5.65   0.000     .0431672    .0939158
                               17  |   .1286768   .0246013     5.23   0.000     .0771858    .1801679
                               18  |   .0068771   .0058449     1.18   0.254    -.0053565    .0191107
                               19  |    .105893   .0265508     3.99   0.001     .0503214    .1614645
                               20  |   .0782352   .0202825     3.86   0.001     .0357836    .1206869
                                   |
                             Cor_1 |  -404.2615   162.8172    -2.48   0.023    -745.0418   -63.48109
                             Cor_1 |          0  (omitted)
                                   |
                countrynum#c.Cor_1 |
                                2  |   387.9583   159.7976     2.43   0.025       53.498    722.4186
                                3  |  -2541.722    620.688    -4.10   0.001    -3840.837   -1242.607
                                4  |  -515.0294   123.4335    -4.17   0.001    -773.3787   -256.6802
                                5  |   792.5986   294.5226     2.69   0.014     176.1556    1409.042
                                6  |  -801.7563   251.8317    -3.18   0.005    -1328.846   -274.6666
                                7  |  -141.5688   294.1889    -0.48   0.636    -757.3132    474.1756
                                8  |   328.4078   124.0316     2.65   0.016     68.80671     588.009
                                9  |   323.6399   211.7618     1.53   0.143    -119.5827    766.8624
                               10  |   2062.257   308.0757     6.69   0.000     1417.447    2707.066
                               11  |   332.6196   153.1288     2.17   0.043     12.11747    653.1218
                               12  |   3809.895   373.1752    10.21   0.000      3028.83     4590.96
                               13  |  -792.5549   236.7223    -3.35   0.003     -1288.02   -297.0895
                               14  |  -1531.622   553.0615    -2.77   0.012    -2689.193    -374.051
                               15  |   375.3612   188.5123     1.99   0.061    -19.19972     769.922
                               16  |  -1581.248   348.5597    -4.54   0.000    -2310.792   -851.7044
                               17  |    486.136   153.3866     3.17   0.005     165.0941    807.1778
                               18  |   1114.851   181.1655     6.15   0.000     735.6673    1494.035
                               19  |   735.8404   130.8447     5.62   0.000     461.9793    1009.701
                               20  |  -2401.573   542.7916    -4.42   0.000    -3537.649   -1265.497
                                   |
                              Year |  -.0029798   .0013202    -2.26   0.036    -.0057431   -.0002165
                             _cons |   8.874816   2.940713     3.02   0.007     2.719833     15.0298
                ------------------------------------------------------------------------------------
                Last edited by Simona Battipaglia; 05 Mar 2019, 10:48.



                • #9
                  Simona:
                  not quite.
                  I meant (please note the c.Year##c.Year interaction):
                  Code:
                  reg Y I logYlevel_1 n H   i.countrynum##c.Cor_1 c.Year##c.Year , vce(cluster countrynum )
                  You should also avoid repeating
                  Code:
                  i.countrynum Cor_1
                  because it is redundant: the (so-called) conditional main effects of both predictors are already included by the double-# -fvvarlist- notation
                  Code:
                  i.countrynum##c.Cor_1
                  .
                  As a general rule, tidying up your code can help you see what is the matter with your data (efficiency, you know...).
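                  A minimal sketch of the redundancy point, assuming the variable names used earlier in this thread; the -fvexpand- call is only an illustration and was not part of the original advice:

                  ```stata
                  * i.countrynum##c.Cor_1 is shorthand for both main effects plus their interaction
                  fvexpand i.countrynum##c.Cor_1
                  display "`r(varlist)'"

                  * Long form and short form: the two fits should be identical
                  reg Y I logYlevel_1 n H i.countrynum c.Cor_1 i.countrynum#c.Cor_1 c.Year##c.Year, vce(cluster countrynum)
                  reg Y I logYlevel_1 n H i.countrynum##c.Cor_1 c.Year##c.Year, vce(cluster countrynum)
                  ```

                  Listing i.countrynum and Cor_1 again in front of the ## term adds nothing to the model; Stata just reports the duplicate as omitted.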
                  Last edited by Carlo Lazzaro; 05 Mar 2019, 11:46.
                  Kind regards,
                  Carlo
                  (Stata 19.0)



                  • #10
                    The output you show in #5 implies that your d2004 variable is always 0. That is why it is always omitted, and why some other time variable also must be omitted when d2004 is your chosen base category. Now, perhaps d2004 is not zero in every observation in the data set, but it is apparently always 0 in those observations that participate in the regression. You can confirm this independently by running:

                    Code:
                    tab d2004 if e(sample)
                    after re-running your original regression.

                    Apparently all of the 2004 observations have some variable with a missing value. In fact, you say as much yourself: since Cor_1 is the lagged value of Cor, and 2004 is the first year of your data set, Cor_1 is always missing in 2004 observations. Mystery solved.
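
                    A minimal sketch of how this could be checked directly, assuming Cor_1 is the one-year lag of Cor (how it was actually constructed is not shown in the thread):

                    ```stata
                    * If Cor_1 is missing throughout 2004, these two counts should match
                    count if Year == 2004
                    count if Year == 2004 & missing(Cor_1)

                    * After re-running the original regression: 2004 should be absent from the estimation sample
                    tab Year if e(sample)
                    ```

                    If 2004 does not appear in the last table, the regression effectively starts in 2005, which is why a second year indicator has to be dropped.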

                    Note: While the attempt to share your data is appreciated, screenshots are not helpful. First, as is the case with so many screenshots, the one you posted is not readable, at least not on my computer. Next, screenshots fail to convey aspects of the data that are often important, such as storage types and labeling. Finally, if somebody wanted to troubleshoot for you by working with your data, there is no way to import the data from a screenshot to Stata. The helpful way to show example data is by using the -dataex- command. If you are running version 15.1 or a fully updated version 14.2, -dataex- is already part of your official Stata installation. If not, run -ssc install dataex- to get it. Either way, run -help dataex- to read the simple instructions for using it. -dataex- will save you time; it is easier and quicker than typing out tables. It includes complete information about aspects of the data that are often critical to answering your question but cannot be seen from tabular displays or screenshots. It also makes it possible for those who want to help you to create a faithful representation of your example to try out their code, which in turn makes it more likely that their answer will actually work in your data.



                    • #11
                      Thank you both for your replies; I will use -dataex- next time. While
                      Code:
                       
                       reg Y I logYlevel_1 n H   i.countrynum##c.Cor_1 c.Year##c.Year , vce(cluster countrynum )
                      does not pass the RESET test, the following does
                      Code:
                       reg Y I logYlevel_1 n H i.countrynum##c.Cor_1 c.Year, vce(cluster countrynum )
                      .

                      Am I right to interpret the result as saying that Year affects Y (GDP growth) by X% (according to the estimated coefficient)? Also, once c.Year is included, H (human capital) becomes insignificant, which would imply that, when accounting for Year, GDP growth is no longer affected by human capital. However, this seems like an odd result. Do you think this would be enough to justify dropping c.Year?



                      • #12
                        Simona:
                        - as you did not log the regressand, I do not think that a percentage change is the right interpretation;
                        - if -H- loses its statistical significance when -Year- is added to the predictors, it simply means that your data do not provide evidence of any effect of -H- on -Y- when adjusted for the remaining predictors. As usual, this result may well have different causes (and your limited sample size may be one of them);
                        - dropping a variable because your data do not confirm your expectations is neither scientific nor informative.
                        Kind regards,
                        Carlo
                        (Stata 19.0)



                        • #13
                          Thank you all for your extremely helpful advice and for your time!

                          Kind regards,
                          Simona



                          • #14
                            Dear all,

                            I am having an issue similar to Simona's: for some reason unknown to me, two of my years are being dropped from my estimations, one as the base and the other because of collinearity.
                            I am using the following command in Stata 14.0:

                            Code:
                            xtset districtself year
                            xtreg logspikehs lmwhs median_age median_education share_illeterate share_retired share_students share_unmarried share_women share_agri share_services share_trading share_manufacturing year1 year2 year3 year4 year5  , fe vce(cluster districtself)
                            and the following is the output I get from Stata:
                            Code:
                            note: year5 omitted because of collinearity
                            
                            Fixed-effects (within) regression               Number of obs     =        588
                            Group variable: districtself                    Number of groups  =         98
                            
                            R-sq:                                           Obs per group:
                                 within  = 0.8149                                         min =          6
                                 between = 0.0578                                         avg =        6.0
                                 overall = 0.7789                                         max =          6
                            
                                                                            F(16,97)          =     121.23
                            corr(u_i, Xb)  = -0.0548                        Prob > F          =     0.0000
                            
                                                             (Std. Err. adjusted for 98 clusters in districtself)
                            -------------------------------------------------------------------------------------
                                                |               Robust
                                     logspikehs |      Coef.   Std. Err.      t    P>|t|     [95% Conf. Interval]
                            --------------------+----------------------------------------------------------------
                                          lmwhs |   5.306975   .4264071    12.45   0.000     4.460675    6.153275
                                     median_age |   .0231407   .0153189     1.51   0.134     -.007263    .0535444
                               median_education |   .0202031   .0225883     0.89   0.373    -.0246285    .0650346
                               share_illeterate |   .0042084   .0033828     1.24   0.216    -.0025056    .0109223
                                  share_retired |  -.0561454   .0248822    -2.26   0.026    -.1055297    -.006761
                                 share_students |  -.0083063   .0206381    -0.40   0.688    -.0492673    .0326547
                                share_unmarried |   .0130591   .0065427     2.00   0.049     .0000737    .0260445
                                    share_women |  -.0139748   .0049258    -2.84   0.006     -.023751   -.0041985
                                     share_agri |   .0027159   .0034336     0.79   0.431    -.0040989    .0095307
                                 share_services |   .0066551   .0031882     2.09   0.039     .0003274    .0129828
                                  share_trading |  -.0029541   .0049015    -0.60   0.548    -.0126821     .006774
                            share_manufacturing |    .001841    .005731     0.32   0.749    -.0095335    .0132154
                                          year1 |   8.557515   .6452879    13.26   0.000     7.276797    9.838233
                                          year2 |   5.869519   .4716393    12.44   0.000     4.933445    6.805592
                                          year3 |    3.56583    .291818    12.22   0.000     2.986652    4.145008
                                          year4 |   1.300326    .222288     5.85   0.000     .8591456    1.741506
                                          year5 |          0  (omitted)
                                          _cons |  -54.81592   3.928375   -13.95   0.000    -62.61266   -47.01918
                            --------------------+----------------------------------------------------------------
                                        sigma_u |  .18579891
                                        sigma_e |  .41654963
                                            rho |  .16593968   (fraction of variance due to u_i)
                            -------------------------------------------------------------------------------------
                            
                            .
                            My example data are as follows:
                            -----------------------
                            Code:
                            * Example generated by -dataex-. To install: ssc install dataex
                            clear
                            input float(logspikehs lmwhs median_age median_education share_illeterate share_retired share_students share_unmarried share_women share_agri share_services share_trading share_manufacturing year) byte(year1 year2 year3 year4 year5)
                             -2.558409 8.039157   35 10  10.90684 2.3139918  2.023752 26.797155  13.73886 4.3713713  44.90681   13.2719  4.224636 2004 1 0 0 0 0
                            -2.3657022 8.476371   36 10  10.11645  .5896513  2.232368  23.83671  12.69499  .4683629   52.9002   4.71407  3.142393 2006 0 1 0 0 0
                             -2.542258 8.881836   35 10  8.807885     1.994 2.0053542 25.155243  12.78523  5.743135  39.88345 11.423695  2.910006 2008 0 0 1 0 0
                            -4.0047326  9.03884   35 10  9.257348 1.1297612  3.469716  24.63409 14.048172  2.913295  57.50639  7.941474  5.878557 2010 0 0 0 1 0
                             -5.181784 9.290352   37 12 11.313846 1.1189824  3.246734   23.5888  15.25079  2.674123  42.88018 1.3306923  5.635561 2012 0 0 0 0 1
                             -2.510615 9.578034 35.5 10 14.250965   1.44758  5.981085 24.215256 18.896103  5.309056  41.92178 11.290117 4.3161345 2014 0 0 0 0 0
                            -1.9919136 8.039157   35  8  24.94258  2.411582 2.0687857 23.043375  5.596759 18.071205 37.827724 16.069403   9.58853 2004 1 0 0 0 0
                            -2.3418057 8.476371   33 10  23.63444 1.3409512 1.1733979 29.169844  6.616175  2.940082   39.2728  11.80261 11.010347 2006 0 1 0 0 0
                            -2.5490444 8.881836   40 10   29.7847  5.690811  1.241238  20.39517  6.313539  30.08603  18.02271   11.8409  7.477448 2008 0 0 1 0 0
                            -3.8635964  9.03884   36  8  24.49669 2.2567503 2.0568764  25.74037  7.116378 13.233109  29.95527 17.699556  6.658371 2010 0 0 0 1 0
                             -4.028917 9.290352   37  9 21.075203  3.399511 2.2731442  22.63762   7.94648 21.418705 21.197315  .4158455  6.402743 2012 0 0 0 0 1
                               -2.5458 9.578034   40  9  29.85759 4.2597065  3.772423 17.042093 17.281464 35.977978 17.591042   14.6199  7.165493 2014 0 0 0 0 0
                            -2.3334374 8.039157   36 10 16.486708  2.691559 1.8031697  26.97494  8.780628  8.136788  38.99605 17.440176  6.688824 2004 1 0 0 0 0
                            -2.3025851 8.476371   35 10   14.3521 2.2873812 1.0050443 29.273617  7.893749  .5711666  44.88651 16.005455  6.434148 2006 0 1 0 0 0
                            -2.7405205 8.881836   37 10  15.11578  2.526976  .7542983  26.42409  8.041978 11.740806  30.21699  19.36924  6.276433 2008 0 0 1 0 0
                             -4.918653  9.03884   36 10  14.27757  2.270407 2.1635392  25.77645 10.269762  7.460679  41.17196 18.242336  6.679374 2010 0 0 0 1 0
                            -4.4283075 9.290352   37 10  11.80966  3.051733  3.081607  24.10761 11.634595  8.838715 30.151203  .7680569  11.12961 2012 0 0 0 0 1
                            -2.3738942 9.578034   38 10   11.3621  3.106778  2.462341  22.41693 14.661825 11.769317 25.401844 17.888924 12.493567 2014 0 0 0 0 0
                               -2.0726 8.039157   37 10 23.114164  2.536887  1.723712  29.60225 13.199022 28.670237 27.447746  12.94578  6.789503 2004 1 0 0 0 0
                             -1.961854 8.476371   34 10 16.297745  1.700985 .16370514 36.159004  4.049959  .9187898 38.144917  11.26712  9.229283 2006 0 1 0 0 0
                            end
                            ------------------


                            I have also tried using i.year in my estimations, but Stata again drops one of the years, and it is not always year5.

                            Code:
                             xtreg logspikehs lmwhs median_age median_education share_illeterate share_retired share_students share_u
                            > nmarried share_women share_agri share_services share_trading share_manufacturing i.year , fe vce(cluster
                            >  districtself)
                            note: 2014.year omitted because of collinearity
                            
                            Fixed-effects (within) regression               Number of obs     =        588
                            Group variable: districtself                    Number of groups  =         98
                            
                            R-sq:                                           Obs per group:
                                 within  = 0.8149                                         min =          6
                                 between = 0.0578                                         avg =        6.0
                                 overall = 0.7789                                         max =          6
                            
                                                                            F(16,97)          =     121.23
                            corr(u_i, Xb)  = -0.0548                        Prob > F          =     0.0000
                            
                                                             (Std. Err. adjusted for 98 clusters in districtself)
                            -------------------------------------------------------------------------------------
                                                |               Robust
                                     logspikehs |      Coef.   Std. Err.      t    P>|t|     [95% Conf. Interval]
                            --------------------+----------------------------------------------------------------
                                          lmwhs |   -.253906   .0445865    -5.69   0.000    -.3423978   -.1654143
                                     median_age |   .0231407   .0153189     1.51   0.134     -.007263    .0535444
                               median_education |   .0202031   .0225883     0.89   0.373    -.0246285    .0650346
                               share_illeterate |   .0042084   .0033828     1.24   0.216    -.0025056    .0109223
                                  share_retired |  -.0561454   .0248822    -2.26   0.026    -.1055297    -.006761
                                 share_students |  -.0083063   .0206381    -0.40   0.688    -.0492673    .0326547
                                share_unmarried |   .0130591   .0065427     2.00   0.049     .0000737    .0260445
                                    share_women |  -.0139748   .0049258    -2.84   0.006     -.023751   -.0041985
                                     share_agri |   .0027159   .0034336     0.79   0.431    -.0040989    .0095307
                                 share_services |   .0066551   .0031882     2.09   0.039     .0003274    .0129828
                                  share_trading |  -.0029541   .0049015    -0.60   0.548    -.0126821     .006774
                            share_manufacturing |    .001841    .005731     0.32   0.749    -.0095335    .0132154
                                                |
                                           year |
                                          2006  |  -.2567017   .0717093    -3.58   0.001    -.3990249   -.1143784
                                          2008  |  -.3056472   .0510029    -5.99   0.000    -.4068738   -.2044207
                                          2010  |  -1.698069   .0703665   -24.13   0.000    -1.837727   -1.558411
                                          2012  |  -1.599768   .1206321   -13.26   0.000     -1.83919   -1.360347
                                          2014  |          0  (omitted)
                                                |
                                          _cons |  -1.553611    .865015    -1.80   0.076    -3.270426    .1632045
                            --------------------+----------------------------------------------------------------
                                        sigma_u |  .18579891
                                        sigma_e |  .41654963
                                            rho |  .16593968   (fraction of variance due to u_i)
                            -------------------------------------------------------------------------------------
                            
                            .
                            Any suggestions as to what is going wrong with my estimations?
                            Thank you in advance.
                            Last edited by Zahid Khan; 28 Sep 2023, 10:23.



                            • #15
                              I believe the problem is due to the variable lmwhs. In your example data, this variable is constant within year. Put otherwise, lmwhs is itself a year variable in disguise, as it were. As such, it is collinear with the explicit year* variables. This collinearity forces the omission of an additional year to break it. It also means that the effects of both lmwhs and the years are inestimable, so that their coefficients are all meaningless. (The coefficients of the other variables are fine.)

                              It is not possible to simultaneously include a variable whose value is constant within year and also include the year variables. If estimating the effect of lmwhs is necessary to achieve your research goals, then you must omit all the year variables. If you were including lmwhs only to adjust for its effects, then you can omit lmwhs and include the year variables, secure in the knowledge that the year variables themselves are already adjusting for the lmwhs effect.
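
                              A minimal sketch of how this could be checked and acted on, assuming the variable names from #14. The local macro `controls' is a hypothetical helper introduced only to shorten the commands, and the /// line continuations mean this should be run from a do-file:

                              ```stata
                              * Check whether lmwhs takes a single value within each year
                              bysort year (lmwhs): assert lmwhs[1] == lmwhs[_N]   // errors out if lmwhs varies within any year
                              tabstat lmwhs, by(year) statistics(min max)

                              * Hypothetical local macro holding the other regressors from #14
                              local controls median_age median_education share_illeterate share_retired ///
                                  share_students share_unmarried share_women share_agri share_services ///
                                  share_trading share_manufacturing

                              * Option 1: estimate the lmwhs effect and leave out the year indicators
                              xtreg logspikehs lmwhs `controls', fe vce(cluster districtself)

                              * Option 2: keep the year indicators and leave out lmwhs
                              xtreg logspikehs `controls' i.year, fe vce(cluster districtself)
                              ```

                              If the -assert- passes on the full data set, lmwhs carries no information beyond the year indicators, and only one of the two options is estimable.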
