Panel data

Enrique Santiago Pajuelo

Join Date: Dec 2020

Posts: 11
#1


30 Apr 2021, 15:41

Good afternoon,

I am doing a panel data study to test the effect of the ideology of political regimes on demonstrations (the dependent variable, which is continuous). While the Hausman test tells me I need to use a fixed-effects model, I am not so sure. The thing is, the between variation in ideology is far larger than the within variation, and I have some variables, such as education (which may affect the dependent variable), that obviously would not make sense to add to a fixed-effects model. In addition, the R-squared explains the between-country variation better than the within-country variation.

My question is: should I keep using a fixed-effects model? Are fixed-effects models better because finding significant within-country effects is more valuable? I wonder whether, because within-country estimators control for many other variables that are fixed in these models, they might be better predictors...
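
A minimal sketch of how the within versus between variation described above could be inspected. It assumes the panel identifiers are named id and year and uses the variable names from the -dataex- excerpt posted later in the thread:

```stata
* Assumption: id and year identify the panel, as in the later -dataex- excerpt
xtset id year

* Reports the overall, between and within standard deviations of each variable
xtsum protest_level ideology
```

A between standard deviation much larger than the within one means the fixed-effects estimator relies on a small share of the total variation in that variable.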

Thank you so much in advance.

Kind regards.
Rhys Williams

Join Date: Apr 2020

Posts: 224
#2

30 Apr 2021, 16:45

Personally, I think that if your Hausman test says you should be using fixed effects then you need to think carefully about what omitted variable bias could be occurring before disregarding the FE test.

The point about including education doesn't really matter: if it's time-invariant, it will be eliminated by FE, so it makes no difference whether you control for it or not (it's automatically omitted), unless you are interested in studying the effect of education...

FE is valuable because it removes time-invariant omitted variable bias. Using RE doesn't.

I would advise you to think more about the OVB that will arise from omitting FE.
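
A minimal sketch of the FE versus RE comparison discussed here, assuming the variable names from the -dataex- excerpt posted later in the thread and default (non-robust) standard errors, which the -hausman- command requires:

```stata
* Assumption: id and year identify the panel
xtset id year

* Fixed effects: consistent even if the country effects correlate with the regressors
xtreg protest_level ideology internet_users gdp_pc opposition_size executive_corruption, fe
estimates store fixed

* Random effects: efficient only if the country effects are uncorrelated with the regressors
xtreg protest_level ideology internet_users gdp_pc opposition_size executive_corruption, re
estimates store random

* Null hypothesis: the difference in coefficients is not systematic
hausman fixed random
```

The stored names fixed and random are arbitrary labels chosen for this illustration.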

Best,
Rhys
Carlo Lazzaro

Join Date: Apr 2014

Posts: 17699
#3

01 May 2021, 03:33

Enrique:
welcome to this forum.
as an aside to Rhys' valuable advice, you do not tell interested listers how many predictors you have plugged in the right-hand side of your regression equation.
That's why new posters' welcome comes with the recommendation of reading (and acting on) the FAQ, especially as far as sharing Stata codes and outcomes with the list is concerned.
I also find it difficult to get the relevance of education at national level: is it the average level of education pursued by inhabitants of each country?
As Rhys reminded, the -fe- estimator wipes out time-invariant omitted variable bias: however, it may well be that your model remains misspecified as far as time-varying predictors are concerned.
In sum, I do not find in your post enough details to give you a more positive guidance.

Kind regards,
Carlo
(Stata 19.0)
Enrique Santiago Pajuelo

Join Date: Dec 2020

Posts: 11
#4

01 May 2021, 05:00

Hello both Rhys and Carlo,

Thank you for your useful insights and the welcoming messages.

Regarding what Rhys says, the Hausman test clearly indicates that I need to use a fixed-effects model, since it returns a p-value of 0.000. Since education is time-invariant, I wouldn't include it if I finally go for fixed effects (it would already be accounted for).

Answering Carlo: education is measured as the average years of schooling of people aged 15 and over (it is time-invariant because, in the 10 years of observation that I use in my panel data, the average years of schooling do not change for most of my observations, which cover 176 countries). The alternative explanatory variables that I use are internet use, GDP per capita, opposition party size and executive corruption; all of them can be used in the fixed-effects model since they vary from year to year.

A question that I have is: taking into account that ideology varies more between countries than within countries and that ideology is significant (p<0.05), would it be right to mention in my paper that the effect of ideology on my dependent variable (and also on the R-squared) would be even larger between countries? For example, under -fe- corruption could have a larger effect than ideology, but if the difference in corruption between countries were not large, then ideology might be a better explanatory variable, right?

As I said, at first I wanted to fit a between-effects model, since the variation in ideology is larger between countries than within. But if you tell me that FE is usually more valuable because it controls for fixed variables that I may not be accounting for, then I could use that to strengthen my argument.

Thanks again for your valuable comments.

Last edited by Enrique Santiago Pajuelo; 01 May 2021, 05:04.
Carlo Lazzaro

Join Date: Apr 2014

Posts: 17699
#5

01 May 2021, 05:58

Enrique:
a first idea would be to interact -ideology- with -corruption- under the -fe- specification and see what happens to your coefficients. Statistical significance per se is not a relevant achievement (even though most of us were taught differently at the university).
Besides, you can take a look at the community-contributed -mundlak- command as a compromise halfway between -fe- and -re- specifications.
Again, seeing what you typed and what Stata gave you back (as per FAQ) would help enormously.
As a final sidelight, please note that R-sqs have different interpretations depending on the panel data regression specification (see -xtreg- entry in Stata 16 .pdf manual for more details) and do not necessarily overlap with the OLS R-sq.
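
A minimal sketch of the interaction idea above, assuming the variable names from the -dataex- excerpt posted later in the thread (where corruption is called executive_corruption) and a panel identified by id and year:

```stata
* Assumption: id and year identify the panel
xtset id year

* c.a##c.b expands to both main effects plus their product (factor-variable notation)
xtreg protest_level c.ideology##c.executive_corruption internet_users gdp_pc opposition_size, fe vce(cluster id)
```

With the interaction included, the coefficient on ideology is its effect when executive_corruption equals zero, so it is not directly comparable with the coefficient from the model without the interaction.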

Kind regards,
Carlo
(Stata 19.0)

Enrique Santiago Pajuelo

Join Date: Dec 2020
Posts: 11
#6

01 May 2021, 08:50

Hello Carlo,

Thanks again for answering. I ran the Mundlak test and it also gave a p-value of 0.000. I therefore interpret it as another indication that I need to use fixed effects, right?

Code:

 ( 1)  mean_x2 = 0
 ( 2)  mean_x3 = 0
 ( 3)  mean_x4 = 0
 ( 4)  mean_x5 = 0

           chi2(  5) =   32.69
         Prob > chi2 =    0.0000
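
A joint test on group means like the one above can be sketched by hand as follows. This is only an illustration of the Mundlak approach, not necessarily the code that produced the output above; the mean_* variable names are hypothetical, and the other names come from the -dataex- excerpt in this thread:

```stata
* Assumption: id and year identify the panel
xtset id year

* Country-level means of each time-varying regressor (hypothetical mean_* names)
foreach v of varlist ideology internet_users gdp_pc opposition_size executive_corruption {
    bysort id: egen double mean_`v' = mean(`v')
}

* Random effects augmented with the group means
xtreg protest_level ideology internet_users gdp_pc opposition_size executive_corruption mean_*, re vce(cluster id)

* Null hypothesis: the group means are jointly zero, i.e. RE is adequate
testparm mean_*
```

In an unbalanced panel the means would ideally be computed over the estimation sample only, which this sketch does not enforce.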

This is the result I get for Hausman test:

Code:

                 ---- Coefficients ----
             |      (b)          (B)            (b-B)     sqrt(diag(V_b-V_B))
             |     fixed        random       Difference          S.E.
-------------+----------------------------------------------------------------
    ideology |    -.148681    -.2077445        .0590635        .0051272
internet_u~s |   -.0052374    -.0045964        -.000641               .
      gdp_pc |    7.28e-06     2.23e-06        5.05e-06        2.87e-06
opposition~e |   -.3162369    -.2607343       -.0555026        .0107827       .
executive_~n |    -.857176    -.9801461          .12297        .0160216
------------------------------------------------------------------------------
                           b = consistent under Ho and Ha; obtained from xtreg
            B = inconsistent under Ha, efficient under Ho; obtained from xtreg

    Test:  Ho:  difference in coefficients not systematic

                  chi2(6) = (b-B)'[(V_b-V_B)^(-1)](b-B)
                          =      192.12
                Prob>chi2 =      0.0000
                (V_b-V_B is not positive definite)

Here is an excerpt of my data, produced with -dataex- as the FAQ recommends.

Code:

* Example generated by -dataex-. To install: ssc install dataex
clear
input double(protest_level ideology internet_users gdp_pc opposition_size executive_corruption)
 -.495 -1.043           4  1627.67   .379 .904
 -.495 -1.043           5     1792   .379 .904
 -.495 -1.043 5.454545455     1945   .379 .904
 -.495 -1.043         5.9     2025   .379 .871
 -.495 -1.043           7     2022   .379 .871
 -.141 -1.043        8.26     1928   .379 .794
 -.073 -1.043        8.26     1929   .379 .828
 -.073 -1.043 11.44768809  2014.75   .379 .828
 -.073 -1.043 11.44768809  1934.56   .379 .865
  .341 -1.553 11.44768809  1934.56   .379 .907
 1.005 -1.296 11.44768809  1934.56   .379 .866
 -.087   .192         2.8  7520.67  -.675 .938
 -.356   .192         3.1     8016  -.661 .938
 -.356   .192         6.5     8190  -.661 .938
 -.356   .192         8.9     8508  -.661  .92
 -.356   .192        21.4     8673  -.661  .92
 -.356   .192        12.4     8689  -.661  .92
 -.356   .192          13     8453  -.661  .92
  .592   .192 14.33907936  8146.44  -.661 .689
  .592   .524 14.33907936  7771.44  -.661 .515
  .592   .375 14.33907936  7771.44  -.661 .495
  .592   .203 14.33907936  7771.44  -.661 .495
  .291  -.417          45  9222.97 -1.303 .779
  .291  -.417          49     9484 -1.303 .779
  .291  -.417 54.65595904     9592 -1.303 .762
  .413  -.417        57.2     9660 -1.303 .827
  .413  -.417        60.1     9808 -1.303 .848
   .28  -.417  63.2529327    10032 -1.303 .848
   .28  -.417  66.3634447    10342 -1.303 .848
   .28  -.417  71.8470405 10702.12 -1.303 .867
   .28  -.417  71.8470405 11104.17 -1.303 .867
  .201  -.492 69.64285467 11104.17 -1.303 .867
  .201  -.492 69.64285467 11104.17 -1.303 .867
-2.737  1.537          68  60112.4  -1.14  .22
-2.737  1.537          78    65307  -1.14  .22
-2.737  1.537  84.9999915    68255  -1.14  .22
-3.129  1.537          88    70791  -1.14  .22
-3.129  1.537        90.4    72601  -1.14  .22
-3.129  1.537        90.5    74746  -1.14  .22
 -3.81  1.537 90.60000732    75876  -1.14  .22
 -3.81  1.537 94.81992254  76643.5 -1.603  .22
 -3.81  1.537 98.45000178 76397.82 -1.603  .22
 -3.81  1.537 99.14999796 76397.82 -1.603  .22
 -3.81  1.537 99.14999796 76397.82 -1.603  .22
 2.041  -.146          45 18979.99 -1.368 .518
 2.041  -.146          51    20003 -1.368 .518
 2.041  -.146        55.8    19599 -1.368 .518
 2.041  -.146        59.9    19873 -1.368 .567
 2.041  -.146        64.7    19183 -1.368 .567
 1.954 -1.403 68.04306411    19502 -1.368 .454
 1.954 -1.968 70.96898082    18875 -1.368 .395
 1.954 -1.968 74.29490687 19200.91 -1.368 .395
 1.954 -1.968 74.29490687 18556.38 -1.368 .361
 1.954 -1.968 74.29490687 18556.38 -1.368 .409
 2.211  -.374 74.29490687 18556.38 -1.368 .398
   .69  -.594          25  8330.81  1.049 .819
   .69  -.594          32     8465  1.049 .819
   .69  -.594        37.5     9077   .477 .819
   .69  -.594        41.9     9385   .477 .828
   .69  -.594 54.62280586     9735   .477 .828
   .69  -.594 59.10083377    10042   .477 .828
   .69  -.594 64.34602977    10080  1.671 .828
   .69  -.594 64.74488433 10859.38  1.027 .814
 1.133 -1.112 68.24505226 11454.43  1.956 .447
 1.635 -1.392 68.24505226 11454.43  -.741  .15
  .442  -1.75 68.24505226 11454.43   .143 .234
  1.87 -1.588          76 45400.22 -3.321 .034
  1.87 -1.588 79.48769771    46132 -3.321 .034
  1.87 -1.588          79    46999 -3.321 .034
  1.87 -1.511 83.45349717    47250 -3.321 .034
  1.87 -1.511          84    47867 -3.321 .033
  1.87 -1.511 84.56051491    48357 -3.321 .032
  1.87 -1.511       86.54    48845 -3.321 .027
  1.87 -1.511 86.54504885 49265.61 -3.321 .027
  1.87 -1.511 86.54504885  49830.8 -3.321 .043
 1.349  -.403 86.54504885  49830.8 -3.321  .03
 1.349 -1.556 86.54504885  49830.8 -3.321  .03
 1.887 -1.381       75.17 40288.35 -1.957 .047
 1.887 -1.381  78.7399931    41446 -1.957 .068
 1.887 -1.381 80.02999392    41565 -1.957 .046
 1.887 -1.381     80.6188    41375 -1.957 .046
 1.887 -1.381 80.99582496    41338 -1.957 .046
 1.887 -1.381 83.94014193    41294 -1.957 .046
 1.887 -1.381 84.32374257    41445 -1.957 .046
 1.887  -.841 87.93558659 42177.37 -1.957 .046
 1.887   -.77 87.47913723 42988.07 -1.957 .068
 1.887   -.77 87.75220479 42988.07 -1.957 .067
 1.887 -1.336 87.75220479 42988.07 -1.957 .045
-1.198   .054          46 16153.84  1.377  .91
-1.198   .054          50    16176  1.377  .91
-1.198   .054        54.2    16359  1.377  .91
-1.534   .054 73.00000137    17133  1.377  .91
 -1.58   .054 75.00001564    17439  1.377  .91
 -1.58   .054          77    17460  1.377  .91
 -1.58   .054        78.2    16645  1.377 .911
 -1.58   .256          79 16522.31  1.377 .903
-1.614   .256 79.79999549 16628.06  1.127 .903
-1.614   .302 79.79999549 16628.06  1.127 .867
-1.614   .606 79.79999549 16628.06   .588 .909
   -.7  -.934           1   717.61   .352 .856
end

For fixed effects I get the following results:

Code:

xtreg $ylist $xlist, fe vce(cluster id) 

Fixed-effects (within) regression               Number of obs     =      1,840
Group variable: id                              Number of groups  =        172

R-sq:                                           Obs per group:
     within  = 0.2001                                         min =          1
     between = 0.2354                                         avg =       10.7
     overall = 0.2255                                         max =         11

                                                F(6,171)          =       9.58
corr(u_i, Xb)  = -0.0096                        Prob > F          =     0.0000

                                           (Std. Err. adjusted for 172 clusters in id)
--------------------------------------------------------------------------------------
                     |               Robust
       protest_level |      Coef.   Std. Err.      t    P>|t|     [95% Conf. Interval]
---------------------+----------------------------------------------------------------
            ideology |   -.148681   .0563847    -2.64   0.009    -.2599808   -.0373813
      internet_users |  -.0052374   .0013372    -3.92   0.000    -.0078769   -.0025978
              gdp_pc |   7.28e-06   8.08e-06     0.90   0.368    -8.66e-06    .0000232
     opposition_size |  -.3162369   .0770215    -4.11   0.000    -.4682722   -.1642015
executive_corruption |   -.857176   .3663637    -2.34   0.020    -1.580354   -.1339983
               _cons |    .731889   .2265359     3.23   0.001     .2847222    1.179056
---------------------+----------------------------------------------------------------
             sigma_u |  1.2978051
             sigma_e |  .26377467
                 rho |  .96032952   (fraction of variance due to u_i)
--------------------------------------------------------------------------------------


Carlo Lazzaro

Join Date: Apr 2014

Posts: 17699
#7

01 May 2021, 09:22

Enrique:
since I found neither -panelid- nor -timevar- in your -dataex- excerpt (thanks for using it anyway), it was impossible for me to replicate your regression.
As an aside, if you invoke non-default standard errors, you should switch from -hausman- to the community-contributed module -xtoverid-.

Kind regards,
Carlo
(Stata 19.0)

Enrique Santiago Pajuelo

Join Date: Dec 2020
Posts: 11
#8

01 May 2021, 10:01

Hello Carlo again,

Sorry for not including id and year; the corrected excerpt is below.

As for -xtoverid-, it returns an error saying that I cannot use it after -xtreg, fe-. Running it after -xtreg, re-, I get the following result:

Code:

Test of overidentifying restrictions: fixed vs random effects
Cross-section time-series model: xtreg re  robust cluster(id)
Sargan-Hansen statistic  94.470  Chi-sq(5)    P-value = 0.0000

Code:

* Example generated by -dataex-. To install: ssc install dataex
clear
input int id double(year protest_level ideology internet_users gdp_pc opposition_size executive_corruption)
 1 2010  -.495 -1.043           4  1627.67   .379 .904
 1 2011  -.495 -1.043           5     1792   .379 .904
 1 2012  -.495 -1.043 5.454545455     1945   .379 .904
 1 2013  -.495 -1.043         5.9     2025   .379 .871
 1 2014  -.495 -1.043           7     2022   .379 .871
 1 2015  -.141 -1.043        8.26     1928   .379 .794
 1 2016  -.073 -1.043        8.26     1929   .379 .828
 1 2017  -.073 -1.043 11.44768809  2014.75   .379 .828
 1 2018  -.073 -1.043 11.44768809  1934.56   .379 .865
 1 2019   .341 -1.553 11.44768809  1934.56   .379 .907
 1 2020  1.005 -1.296 11.44768809  1934.56   .379 .866
 2 2010  -.087   .192         2.8  7520.67  -.675 .938
 2 2011  -.356   .192         3.1     8016  -.661 .938
 2 2012  -.356   .192         6.5     8190  -.661 .938
 2 2013  -.356   .192         8.9     8508  -.661  .92
 2 2014  -.356   .192        21.4     8673  -.661  .92
 2 2015  -.356   .192        12.4     8689  -.661  .92
 2 2016  -.356   .192          13     8453  -.661  .92
 2 2017   .592   .192 14.33907936  8146.44  -.661 .689
 2 2018   .592   .524 14.33907936  7771.44  -.661 .515
 2 2019   .592   .375 14.33907936  7771.44  -.661 .495
 2 2020   .592   .203 14.33907936  7771.44  -.661 .495
 3 2010   .291  -.417          45  9222.97 -1.303 .779
 3 2011   .291  -.417          49     9484 -1.303 .779
 3 2012   .291  -.417 54.65595904     9592 -1.303 .762
 3 2013   .413  -.417        57.2     9660 -1.303 .827
 3 2014   .413  -.417        60.1     9808 -1.303 .848
 3 2015    .28  -.417  63.2529327    10032 -1.303 .848
 3 2016    .28  -.417  66.3634447    10342 -1.303 .848
 3 2017    .28  -.417  71.8470405 10702.12 -1.303 .867
 3 2018    .28  -.417  71.8470405 11104.17 -1.303 .867
 3 2019   .201  -.492 69.64285467 11104.17 -1.303 .867
 3 2020   .201  -.492 69.64285467 11104.17 -1.303 .867
 4 2010 -2.737  1.537          68  60112.4  -1.14  .22
 4 2011 -2.737  1.537          78    65307  -1.14  .22
 4 2012 -2.737  1.537  84.9999915    68255  -1.14  .22
 4 2013 -3.129  1.537          88    70791  -1.14  .22
 4 2014 -3.129  1.537        90.4    72601  -1.14  .22
 4 2015 -3.129  1.537        90.5    74746  -1.14  .22
 4 2016  -3.81  1.537 90.60000732    75876  -1.14  .22
 4 2017  -3.81  1.537 94.81992254  76643.5 -1.603  .22
 4 2018  -3.81  1.537 98.45000178 76397.82 -1.603  .22
 4 2019  -3.81  1.537 99.14999796 76397.82 -1.603  .22
 4 2020  -3.81  1.537 99.14999796 76397.82 -1.603  .22
 5 2010  2.041  -.146          45 18979.99 -1.368 .518
 5 2011  2.041  -.146          51    20003 -1.368 .518
 5 2012  2.041  -.146        55.8    19599 -1.368 .518
 5 2013  2.041  -.146        59.9    19873 -1.368 .567
 5 2014  2.041  -.146        64.7    19183 -1.368 .567
 5 2015  1.954 -1.403 68.04306411    19502 -1.368 .454
 5 2016  1.954 -1.968 70.96898082    18875 -1.368 .395
 5 2017  1.954 -1.968 74.29490687 19200.91 -1.368 .395
 5 2018  1.954 -1.968 74.29490687 18556.38 -1.368 .361
 5 2019  1.954 -1.968 74.29490687 18556.38 -1.368 .409
 5 2020  2.211  -.374 74.29490687 18556.38 -1.368 .398
 6 2010    .69  -.594          25  8330.81  1.049 .819
 6 2011    .69  -.594          32     8465  1.049 .819
 6 2012    .69  -.594        37.5     9077   .477 .819
 6 2013    .69  -.594        41.9     9385   .477 .828
 6 2014    .69  -.594 54.62280586     9735   .477 .828
 6 2015    .69  -.594 59.10083377    10042   .477 .828
 6 2016    .69  -.594 64.34602977    10080  1.671 .828
 6 2017    .69  -.594 64.74488433 10859.38  1.027 .814
 6 2018  1.133 -1.112 68.24505226 11454.43  1.956 .447
 6 2019  1.635 -1.392 68.24505226 11454.43  -.741  .15
 6 2020   .442  -1.75 68.24505226 11454.43   .143 .234
 7 2010   1.87 -1.588          76 45400.22 -3.321 .034
 7 2011   1.87 -1.588 79.48769771    46132 -3.321 .034
 7 2012   1.87 -1.588          79    46999 -3.321 .034
 7 2013   1.87 -1.511 83.45349717    47250 -3.321 .034
 7 2014   1.87 -1.511          84    47867 -3.321 .033
 7 2015   1.87 -1.511 84.56051491    48357 -3.321 .032
 7 2016   1.87 -1.511       86.54    48845 -3.321 .027
 7 2017   1.87 -1.511 86.54504885 49265.61 -3.321 .027
 7 2018   1.87 -1.511 86.54504885  49830.8 -3.321 .043
 7 2019  1.349  -.403 86.54504885  49830.8 -3.321  .03
 7 2020  1.349 -1.556 86.54504885  49830.8 -3.321  .03
 8 2010  1.887 -1.381       75.17 40288.35 -1.957 .047
 8 2011  1.887 -1.381  78.7399931    41446 -1.957 .068
 8 2012  1.887 -1.381 80.02999392    41565 -1.957 .046
 8 2013  1.887 -1.381     80.6188    41375 -1.957 .046
 8 2014  1.887 -1.381 80.99582496    41338 -1.957 .046
 8 2015  1.887 -1.381 83.94014193    41294 -1.957 .046
 8 2016  1.887 -1.381 84.32374257    41445 -1.957 .046
 8 2017  1.887  -.841 87.93558659 42177.37 -1.957 .046
 8 2018  1.887   -.77 87.47913723 42988.07 -1.957 .068
 8 2019  1.887   -.77 87.75220479 42988.07 -1.957 .067
 8 2020  1.887 -1.336 87.75220479 42988.07 -1.957 .045
 9 2010 -1.198   .054          46 16153.84  1.377  .91
 9 2011 -1.198   .054          50    16176  1.377  .91
 9 2012 -1.198   .054        54.2    16359  1.377  .91
 9 2013 -1.534   .054 73.00000137    17133  1.377  .91
 9 2014  -1.58   .054 75.00001564    17439  1.377  .91
 9 2015  -1.58   .054          77    17460  1.377  .91
 9 2016  -1.58   .054        78.2    16645  1.377 .911
 9 2017  -1.58   .256          79 16522.31  1.377 .903
 9 2018 -1.614   .256 79.79999549 16628.06  1.127 .903
 9 2019 -1.614   .302 79.79999549 16628.06  1.127 .867
 9 2020 -1.614   .606 79.79999549 16628.06   .588 .909
10 2010    -.7  -.934           1   717.61   .352 .856
end

Last edited by Enrique Santiago Pajuelo; 01 May 2021, 10:03.


Eric de Souza

Join Date: Mar 2014

Posts: 587
#9

01 May 2021, 10:07

-xtoverid- tests whether the additional assumptions imposed on the model to go from FE to RE are valid.
Run -xtreg- without the -fe- option. You will then be estimating the model assuming RE.
Then run -xtoverid-. The null hypothesis is that the additional assumptions required by RE are valid.
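
The steps above can be sketched as follows, assuming the variable names from the -dataex- excerpt in this thread. -xtoverid- is community-contributed and, as far as I know, available from SSC; it may also depend on other community-contributed packages such as -ivreg2- and -ranktest-:

```stata
* One-off installation of the community-contributed command (assumed to be on SSC)
* ssc install xtoverid

* Assumption: id and year identify the panel
xtset id year

* Random-effects model with cluster-robust standard errors (no -fe- option)
xtreg protest_level ideology internet_users gdp_pc opposition_size executive_corruption, re vce(cluster id)

* Null hypothesis: the additional restrictions imposed by RE are valid
xtoverid
```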
Enrique Santiago Pajuelo

Join Date: Dec 2020

Posts: 11
#10

01 May 2021, 10:26

Hello Eric, thank you for the comment. I have run -xtoverid- and it gives me the following result, which indicates that I reject the null and, therefore, that I should use fixed effects, right?

Code:

Test of overidentifying restrictions: fixed vs random effects
Cross-section time-series model: xtreg re  robust cluster(id)
Sargan-Hansen statistic  94.470  Chi-sq(5)    P-value = 0.0000

Last edited by Enrique Santiago Pajuelo; 01 May 2021, 10:29.

Carlo Lazzaro

Join Date: Apr 2014
Posts: 17699

#11

01 May 2021, 10:45

Enrique:
you're correct.
The -xtoverid- outcome points you toward the -fe- specification.
That said, thanks to your updated -dataex- excerpt, I could run the code below: the -xtoverid- recommendation about going -fe- still holds (by the way, please note the -xi:- prefix at the beginning of the -xtreg- code; being a bit old-fashioned, the community-contributed command -xtoverid- does not support -fvvarlist- notation):

Code:

. xi: xtreg protest_level ideology opposition_size internet_users gdp_pc executive_corruption i.year, re vce(cluster id)
i.year            _Iyear_2010-2020    (naturally coded; _Iyear_2010 omitted)

Random-effects GLS regression                   Number of obs     =        100
Group variable: id                              Number of groups  =         10

R-sq:                                           Obs per group:
     within  = 0.1793                                         min =          1
     between = 0.9005                                         avg =       10.0
     overall = 0.8698                                         max =         11

                                                Wald chi2(9)      =          .
corr(u_i, X)   = 0 (assumed)                    Prob > chi2       =          .

                                            (Std. Err. adjusted for 10 clusters in id)
--------------------------------------------------------------------------------------
                     |               Robust
       protest_level |      Coef.   Std. Err.      z    P>|z|     [95% Conf. Interval]
---------------------+----------------------------------------------------------------
            ideology |  -.2421335   .2877043    -0.84   0.400    -.8060235    .3217566
     opposition_size |  -.5123841    .187669    -2.73   0.006    -.8802085   -.1445597
      internet_users |   .0107887   .0080149     1.35   0.178    -.0049202    .0264975
              gdp_pc |  -.0000763   .0000273    -2.79   0.005    -.0001298   -.0000228
executive_corruption |  -2.364413   .9393591    -2.52   0.012    -4.205523   -.5233027
         _Iyear_2011 |   .0866948   .0926493     0.94   0.349    -.0948944     .268284
         _Iyear_2012 |   .0460992   .1091596     0.42   0.673    -.1678496     .260048
         _Iyear_2013 |  -.0116144   .1101165    -0.11   0.916    -.2274388      .20421
         _Iyear_2014 |  -.0360378   .1399065    -0.26   0.797    -.3102495     .238174
         _Iyear_2015 |  -.0870544   .1480279    -0.59   0.556    -.3771837     .203075
         _Iyear_2016 |  -.1249127   .1368746    -0.91   0.361     -.393182    .1433567
         _Iyear_2017 |  -.1251427   .1858755    -0.67   0.501    -.4894519    .2391665
         _Iyear_2018 |  -.1748437   .2209839    -0.79   0.429    -.6079642    .2582767
         _Iyear_2019 |  -.3608061   .2776857    -1.30   0.194    -.9050602    .1834479
         _Iyear_2020 |  -.3602549   .3188789    -1.13   0.259     -.985246    .2647362
               _cons |    2.30865   .9299739     2.48   0.013     .4859349    4.131366
---------------------+----------------------------------------------------------------
             sigma_u |  .14570577
             sigma_e |  .23725576
                 rho |  .27386538   (fraction of variance due to u_i)
--------------------------------------------------------------------------------------

. xtoverid

Test of overidentifying restrictions: fixed vs random effects
Cross-section time-series model: xtreg re  robust cluster(id)
Sargan-Hansen statistic 1113.140  Chi-sq(6)   P-value = 0.0000

You can also switch to the community-contributed module -mundlak-, even with its -hybrid- option:

Code:

. xi: mundlak protest_level ideology opposition_size internet_users gdp_pc executive_corruption i.year
i.year            _Iyear_2010-2020    (naturally coded; _Iyear_2010 omitted)

The variable gdp_pc does not vary sufficiently within groups and will not be used to create additional regressors.
0% of the total variance in gdp_pc is within groups.

+------------------------------------------------+
|             Variable |     RE     |  Mundlak   |
|----------------------+------------+------------|
|             ideology |     -0.242 |      0.091 |
|      opposition_size |     -0.512 |      0.099 |
|       internet_users |      0.011 |     -0.011 |
|               gdp_pc |     -0.000 |     -0.000 |
| executive_corruption |     -2.364 |     -1.302 |
|          _Iyear_2011 |      0.087 |      0.142 |
|          _Iyear_2012 |      0.046 |      0.245 |
|          _Iyear_2013 |     -0.012 |      0.296 |
|          _Iyear_2014 |     -0.036 |      0.379 |
|          _Iyear_2015 |     -0.087 |      0.434 |
|          _Iyear_2016 |     -0.125 |      0.380 |
|          _Iyear_2017 |     -0.125 |      0.524 |
|          _Iyear_2018 |     -0.175 |      0.515 |
|          _Iyear_2019 |     -0.361 |      0.536 |
|          _Iyear_2020 |     -0.360 |      0.511 |
|       mean__ideology |            |     -0.303 |
| mean__opposition_s~e |            |     -0.392 |
| mean__internet_users |            |      0.026 |
| mean__executive_co~n |            |     -4.551 |
|    mean___Iyear_2011 |            |      5.371 |
|                _cons |      2.309 |      4.285 |
|----------------------+------------+------------|
|                    N |        100 |        100 |
|                  N_g |     10.000 |     10.000 |
|                g_min |      1.000 |      1.000 |
|                g_avg |     10.000 |     10.000 |
|                g_max |     11.000 |     11.000 |
|                  rho |      0.274 |      0.274 |
|                 rmse |      0.483 |      0.251 |
|                 chi2 |    178.635 |    893.284 |
|                    p |      0.000 |      0.000 |
|                 df_m |     15.000 |     20.000 |
|                sigma |      0.278 |      0.278 |
|              sigma_u |      0.146 |      0.146 |
|              sigma_e |      0.237 |      0.237 |
|                 r2_w |      0.179 |      0.490 |
|                 r2_o |      0.870 |      0.976 |
|                 r2_b |      0.900 |      0.993 |
+------------------------------------------------+

. xi: mundlak protest_level ideology opposition_size internet_users gdp_pc executive_corruption i.year, hybrid
i.year            _Iyear_2010-2020    (naturally coded; _Iyear_2010 omitted)

The variable gdp_pc does not vary sufficiently within groups and will not be used to create additional regressors.
0% of the total variance in gdp_pc is within groups.

+------------------------------------------------+
|             Variable |     RE     |   Hybrid   |
|----------------------+------------+------------|
|             ideology |     -0.242 |            |
|      opposition_size |     -0.512 |            |
|       internet_users |      0.011 |            |
|               gdp_pc |     -0.000 |     -0.000 |
| executive_corruption |     -2.364 |            |
|          _Iyear_2011 |      0.087 |            |
|          _Iyear_2012 |      0.046 |            |
|          _Iyear_2013 |     -0.012 |            |
|          _Iyear_2014 |     -0.036 |            |
|          _Iyear_2015 |     -0.087 |            |
|          _Iyear_2016 |     -0.125 |            |
|          _Iyear_2017 |     -0.125 |            |
|          _Iyear_2018 |     -0.175 |            |
|          _Iyear_2019 |     -0.361 |            |
|          _Iyear_2020 |     -0.360 |            |
|       diff__ideology |            |      0.103 |
| diff__opposition_s~e |            |      0.079 |
| diff__internet_users |            |     -0.010 |
| diff__executive_co~n |            |     -1.261 |
|    diff___Iyear_2011 |            |      0.155 |
|    diff___Iyear_2012 |            |      0.262 |
|    diff___Iyear_2013 |            |      0.317 |
|    diff___Iyear_2014 |            |      0.400 |
|    diff___Iyear_2015 |            |      0.463 |
|    diff___Iyear_2016 |            |      0.411 |
|    diff___Iyear_2017 |            |      0.556 |
|    diff___Iyear_2018 |            |      0.552 |
|    diff___Iyear_2019 |            |      0.569 |
|    diff___Iyear_2020 |            |      0.544 |
|       mean__ideology |            |     -0.017 |
| mean__opposition_s~e |            |     -0.289 |
| mean__internet_users |            |      0.017 |
| mean__executive_co~n |            |     -6.754 |
|    mean___Iyear_2011 |            |      8.381 |
|                _cons |      2.309 |      5.247 |
|----------------------+------------+------------|
|                    N |        100 |        100 |
|                  N_g |     10.000 |     10.000 |
|                g_min |      1.000 |      1.000 |
|                g_avg |     10.000 |     10.000 |
|                g_max |     11.000 |     11.000 |
|                  rho |      0.274 |      0.000 |
|                 rmse |      0.483 |      0.277 |
|                 chi2 |    178.635 |   3539.270 |
|                    p |      0.000 |      0.000 |
|                 df_m |     15.000 |     20.000 |
|                sigma |      0.278 |      0.237 |
|              sigma_u |      0.146 |      0.000 |
|              sigma_e |      0.237 |      0.237 |
|                 r2_w |      0.179 |      0.475 |
|                 r2_o |      0.870 |      0.978 |
|                 r2_b |      0.900 |      0.996 |
+------------------------------------------------+

As you can see -i.year- is now a predictor of your regression code.

Kind regards,
Carlo
(Stata 19.0)


Enrique Santiago Pajuelo

Join Date: Dec 2020

Posts: 11
#12

01 May 2021, 11:45

Hello Carlo,

Thank you very much, I am honestly very grateful for your answer.

Hopefully these are my last questions to close this topic:

1. You have used -xtreg, re- in order to test -xtoverid-, but I should use fixed effects myself for the research, right? At least that's what xtoverid and Hausman indicate.
2. When you say that the -i.year- dummies are now predictors in my regression code, does that mean I have to include them in my -xtreg- as independent variables? Also, when presenting the results in my research paper, should I report the i.year dummies, or just say that I control for year while reporting only my main independent variable and the alternative explanations?

Thank you in advance.

Kind regards,
Enrique.

Last edited by Enrique Santiago Pajuelo; 01 May 2021, 12:02.
Comment
Carlo Lazzaro

Join Date: Apr 2014

Posts: 17699
#13

02 May 2021, 04:33

Enrique:
1) as Eric wisely suggested in #9, the community-contributed programme -xtoverid- tests -re- only, the null being that -re- is the way to go. In your case, the -xtoverid- outcome rejects the null and points you toward -fe- instead;
2) yes, you have to include them and make each year coefficient explicit in your table.
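
A minimal sketch of the two steps above, meant to be run from a do-file. The variable names (protest_level, ideology, internet_users, gdp_pc, opposition_size, executive_corruption, id, year) are assumed from the output posted in this thread, and the clustering on id is an assumption as well; adjust both to your own data.

```stata
* Sketch only: variable names are assumed from the thread's output.
xtset id year

* -xtoverid- is community-contributed and does not accept factor-variable
* notation, hence the -xi:- prefix. Install once with: ssc install xtoverid
xi: xtreg protest_level ideology internet_users gdp_pc opposition_size ///
    executive_corruption i.year, re vce(cluster id)
xtoverid                  // H0: the extra orthogonality conditions of -re- hold

* Fixed-effects specification with the year dummies as explicit predictors
xtreg protest_level ideology internet_users gdp_pc opposition_size ///
    executive_corruption i.year, fe vce(cluster id)
testparm i.year           // joint test of the year dummies
```

If -testparm- rejects the null, that would support keeping the year dummies in the reported specification.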

Kind regards,
Carlo
(Stata 19.0)
Comment

Enrique Santiago Pajuelo

Join Date: Dec 2020
Posts: 11

#14

02 May 2021, 09:18

Hello Carlo:

I added the -i.year- dummies to my fixed-effects model, and the output is shown below. My reading is that my main independent variable, ideology, is significant and has a relevant effect on the dependent variable (although corruption has a larger one). Is that a good interpretation?

Code:

Fixed-effects (within) regression               Number of obs     =      1,840
Group variable: id                              Number of groups  =        172

R-sq:                                           Obs per group:
     within  = 0.2135                                         min =          1
     between = 0.2341                                         avg =       10.7
     overall = 0.2236                                         max =         11

                                                F(16,171)         =       5.68
corr(u_i, Xb)  = -0.1088                        Prob > F          =     0.0000

                                           (Std. Err. adjusted for 172 clusters in id)
--------------------------------------------------------------------------------------
                     |               Robust
       protest_level |      Coef.   Std. Err.      t    P>|t|     [95% Conf. Interval]
---------------------+----------------------------------------------------------------
            ideology |  -.1407932   .0574657    -2.45   0.015    -.2542266   -.0273597
      internet_users |  -.0010012   .0019363    -0.52   0.606    -.0048233    .0028209
              gdp_pc |   .0000124   8.73e-06     1.42   0.158    -4.84e-06    .0000296
     opposition_size |  -.3005896   .0781382    -3.85   0.000    -.4548293     -.14635
executive_corruption |  -.9609528   .3637076    -2.64   0.009    -1.678888   -.2430179
                     |
                year |
               2011  |   .0051467   .0241086     0.21   0.831     -.042442    .0527354
               2012  |  -.0048074   .0283033    -0.17   0.865    -.0606762    .0510614
               2013  |  -.0373195   .0326917    -1.14   0.255    -.1018508    .0272118
               2014  |  -.0669302   .0367843    -1.82   0.071    -.1395399    .0056795
               2015  |  -.0752066   .0429393    -1.75   0.082    -.1599661    .0095528
               2016  |  -.1157768   .0468634    -2.47   0.014    -.2082822   -.0232715
               2017  |   -.103583   .0508403    -2.04   0.043    -.2039385   -.0032276
               2018  |  -.1097797   .0546515    -2.01   0.046    -.2176581   -.0019012
               2019  |   -.136404   .0566913    -2.41   0.017    -.2483088   -.0244991
               2020  |  -.1664269   .0601279    -2.77   0.006    -.2851154   -.0477383
                     |
               _cons |   .5787519   .2357694     2.45   0.015     .1133587    1.044145
---------------------+----------------------------------------------------------------
             sigma_u |  1.3059579
             sigma_e |  .26235264
                 rho |  .96120905   (fraction of variance due to u_i)
--------------------------------------------------------------------------------------

Also, I read that -areg- is better for reporting R-squared because the result is more intuitive. I ran it and it shows 0.97, which is strangely high. Is this the one that I should report in my study?

Code:

areg $ylist $xlist i.year, absorb(id)

Linear regression, absorbing indicators         Number of obs     =      1,840
                                                F(  16,   1652)   =      28.03
                                                Prob > F          =     0.0000
                                                R-squared         =     0.9728
                                                Adj R-squared     =     0.9697
                                                Root MSE          =     0.2624

--------------------------------------------------------------------------------------
       protest_level |      Coef.   Std. Err.      t    P>|t|     [95% Conf. Interval]
---------------------+----------------------------------------------------------------
            ideology |  -.1407932   .0217453    -6.47   0.000    -.1834444   -.0981419
      internet_users |  -.0010012   .0011427    -0.88   0.381    -.0032424    .0012401
              gdp_pc |   .0000124   4.34e-06     2.86   0.004     3.88e-06    .0000209
     opposition_size |  -.3005896   .0275566   -10.91   0.000    -.3546392   -.2465401
executive_corruption |  -.9609528   .0989993    -9.71   0.000     -1.15513   -.7667754
                     |
                year |
               2011  |   .0051467    .029066     0.18   0.859    -.0518634    .0621567
               2012  |  -.0048074   .0296329    -0.16   0.871    -.0629294    .0533147
               2013  |  -.0373195   .0306978    -1.22   0.224    -.0975303    .0228913
               2014  |  -.0669302   .0321451    -2.08   0.037    -.1299796   -.0038808
               2015  |  -.0752066   .0338111    -2.22   0.026    -.1415237   -.0088896
               2016  |  -.1157768   .0356009    -3.25   0.001    -.1856044   -.0459492
               2017  |   -.103583   .0380022    -2.73   0.006    -.1781206   -.0290454
               2018  |  -.1097797   .0392564    -2.80   0.005    -.1867771   -.0327822
               2019  |   -.136404   .0402481    -3.39   0.001    -.2153467   -.0574612
               2020  |  -.1664269   .0405677    -4.10   0.000    -.2459965   -.0868572
                     |
               _cons |   .5787519    .094865     6.10   0.000     .3926836    .7648202
---------------------+----------------------------------------------------------------
                  id |      F(171, 1652) =    149.403   0.000         (172 categories)

Last edited by Enrique Santiago Pajuelo; 02 May 2021, 09:22.

Comment

Carlo Lazzaro

Join Date: Apr 2014

Posts: 17699
#15

02 May 2021, 10:18

Enrique:
-xtreg- and -areg- are not exactly equivalent.
I would stick with -xtreg- and check whether:
- the functional form of the regressand is OK (use an approach similar to -linktest- to investigate this issue);
- -fe- is actually the way to go.
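
A rough sketch of how these checks might look, to be run from a do-file. Variable names are assumed from the output in #14, and the names fitted and sq_fitted are hypothetical. Two points from that output: -xtreg- was run with standard errors clustered on id while -areg- used default ones, which is why the coefficients match but the standard errors do not; and the -areg- R-squared also counts the variance explained by the 172 absorbed id indicators, whereas -xtreg, fe- reports a within R-squared.

```stata
* Sketch only: variable names are assumed from the output in #14.
xtreg protest_level ideology internet_users gdp_pc opposition_size ///
    executive_corruption i.year, fe vce(cluster id)
scalar r2_within = e(r2_w)          // within R-squared from -xtreg, fe-

* -linktest--like check: refit on the linear prediction and its square
predict double fitted, xb
generate double sq_fitted = fitted^2
xtreg protest_level fitted sq_fitted, fe vce(cluster id)
test sq_fitted            // a significant squared term hints at misspecification

* Same point estimates via -areg-, now with the same clustering
areg protest_level ideology internet_users gdp_pc opposition_size ///
    executive_corruption i.year, absorb(id) vce(cluster id)
display "xtreg within R-sq: " r2_within
display "areg R-sq (includes the absorbed id effects): " e(r2)
```

Even with the same clustering, the two sets of standard errors may still differ somewhat because the commands apply different degrees-of-freedom adjustments.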

Kind regards,
Carlo
(Stata 19.0)
Comment
