
  • Interactions, inconsistent results

    Hi all,

    I am getting inconsistent results and hope that someone can help me understand the source.
    I have panel data and am looking at oil revenue's effect on human rights. The human rights index (HRI) is the dependent variable, oil revenue is the main explanatory variable, and I also have a binary variable (democracy) that is equal to 1 for democracies and 0 for autocracies. When I run the models:

    1) xtreg HRI oil democracy oil#democracy
    2) xtreg HRI oil if democracy==0

    I should get the same coefficient for oil in both, namely the effect of oil for autocracies, but I get different coefficients, and the statistical significance also varies. I tried several datasets (both cross-sectional and panel, including one available from Stata) and, as expected, the coefficients were the same, but in my dataset I end up getting different coefficients. My only explanation is that the oil variable has many zeros, but I even created a dataset with many zero values and, as expected, got the same results there too.

    Thank you in advance for your help

  • #2
    Hovhannes:
    posting what you typed and what Stata gave you back (between CODE delimiters, please), with only a brief description of the presumable problem with your data, saves the time of interested listers (and yours, too), as they can spot something weird immediately from the code, tables, and numbers and reply helpfully. Thanks
    Kind regards,
    Carlo
    (Stata 19.0)

    Comment


    • #3
      Originally posted by Hovhannes Nahapetyan View Post
      I should get the same coefficient for oil which is the effect of oil for autocracies . . . I tried several datasets (both cross sectional and panel including one available from Stata) and as expected the coefficients are the same . . .
      What do you mean by "the same"? I could see how they might be similar, but you're fitting models to different samples of data, and so I would not expect the coefficients to be "the same", especially if democracy affects things (or is correlated with things) in ways that your two models don't take into account.

      Code:
      . version 15.1

      . clear *

      . set seed `=strreverse("1472452")'

      . quietly set obs 100

      . generate int country = _n

      . generate byte democracy = mod(_n, 2)

      . generate double country_u = rnormal() + !democracy * rnormal()

      . quietly expand 5

      . quietly bysort country: generate byte year = _n

      . generate double oil = runiform()

      . generate double HRI = oil / 2.5 + democracy * oil / 2.5 + country_u + rnormal()

      . program define dem
        1.         version 15.1
        2.         quietly `0'
        3.         display in smcl as text "oil coefficient = " as result %05.3f _b[oil] ///
      >         " ± " %05.3f _se[oil]
        4. end

      . quietly xtset country year

      . dem xtreg HRI c.oil i.democracy c.oil#i.democracy
      oil coefficient = 0.468 ± 0.236

      . dem xtreg HRI c.oil if democracy ==0
      oil coefficient = 0.475 ± 0.245

      . exit

      end of do-file

      Comment


      • #4
        Hi Carlo,

        Thank you very much for the reply. Please see the code and results below; here polity_cat has 3 categories coded 0-2:

        1) xtreg physint L_oilcap_log i.polity_cat c.L_oilcap_log#i.polity_cat, re
        2) xtreg physint L_oilcap_log if polity_cat==0, re

        For regression (1), the coefficient for L_oilcap_log should be the effect of this variable when polity_cat=0, i.e. the baseline category.
        Regression (2) should give the same coefficient for L_oilcap_log, but it does not, and this is what is really surprising.
        Not only are the coefficients different, but the first regression shows no statistical significance while the second is statistically significant.
        Thank you for your help

        Regression 1

        Code:
        . xtreg physint L_oilcap_log i.polity_cat c.L_oilcap_log#i.polity_cat, re

        Random-effects GLS regression                   Number of obs     =      3,566
        Group variable: cow                             Number of groups  =        165

        R-sq:                                           Obs per group:
             within  = 0.0329                                         min =          4
             between = 0.2637                                         avg =       21.6
             overall = 0.1559                                         max =         25

                                                        Wald chi2(5)      =     149.12
        corr(u_i, X)   = 0 (assumed)                    Prob > chi2       =     0.0000

        -------------------------------------------------------------------------------------------
                          physint |      Coef.   Std. Err.      z    P>|z|     [95% Conf. Interval]
        --------------------------+----------------------------------------------------------------
                     L_oilcap_log |   .0267375   .0356157     0.75   0.453    -.0430681     .096543
                                  |
                       polity_cat |
                         anocracy |  -.1486905     .10117    -1.47   0.142    -.3469801    .0495991
                        democracy |   .9677071   .1079842     8.96   0.000     .7560619    1.179352
                                  |
        polity_cat#c.L_oilcap_log |
                         anocracy |  -.0167481   .0306848    -0.55   0.585    -.0768893     .043393
                        democracy |  -.0783344   .0380519    -2.06   0.040    -.1529147   -.0037541
                                  |
                            _cons |   4.388589   .1636918    26.81   0.000     4.067759    4.709419
        --------------------------+----------------------------------------------------------------
                          sigma_u |  1.6303355
                          sigma_e |  1.3036242
                              rho |  .60999118   (fraction of variance due to u_i)
        -------------------------------------------------------------------------------------------

        Regression 2

        Code:
        . xtreg physint L_oilcap_log if polity_cat==0, re

        Random-effects GLS regression                   Number of obs     =      1,072
        Group variable: cow                             Number of groups  =         90

        R-sq:                                           Obs per group:
             within  = 0.0117                                         min =          1
             between = 0.0075                                         avg =       11.9
             overall = 0.0047                                         max =         25

                                                        Wald chi2(1)      =       6.81
        corr(u_i, X)   = 0 (assumed)                    Prob > chi2       =     0.0091

        ------------------------------------------------------------------------------
             physint |      Coef.   Std. Err.      z    P>|z|     [95% Conf. Interval]
        -------------+----------------------------------------------------------------
        L_oilcap_log |   .1252923   .0480051     2.61   0.009      .031204    .2193805
               _cons |   3.539986   .2164304    16.36   0.000      3.11579    3.964182
        -------------+----------------------------------------------------------------
             sigma_u |  1.6170188
             sigma_e |  1.3803844
                 rho |  .57845747   (fraction of variance due to u_i)
        ------------------------------------------------------------------------------

        Thank you very much.

        Comment


        • #5
          And here is an example where the two show the same result.

          Code:
          use https://stats.idre.ucla.edu/stat/data/hsbdemo, clear
          regress write female##c.socst
          regress write socst if female==0

          Comment


          • #6
            Sorry, the results for this example didn't post properly.
            This is an example where the two regressions show the same results.

            regress write female##c.socst
            regress write socst if female==0

            Regression 1

            Code:
            . regress write female##c.socst

                  Source |       SS           df       MS      Number of obs   =       200
            -------------+----------------------------------   F(3, 196)       =     49.26
                   Model |  7685.43528         3  2561.81176   Prob > F        =    0.0000
                Residual |  10193.4397       196  52.0073455   R-squared       =    0.4299
            -------------+----------------------------------   Adj R-squared   =    0.4211
                   Total |   17878.875       199   89.843593   Root MSE        =    7.2116

            --------------------------------------------------------------------------------
                     write |      Coef.   Std. Err.      t    P>|t|     [95% Conf. Interval]
            ---------------+----------------------------------------------------------------
                    female |
                   female  |   15.00001    5.09795     2.94   0.004     4.946132    25.05389
                     socst |   .6247968   .0670709     9.32   0.000     .4925236    .7570701
                           |
            female#c.socst |
                   female  |  -.2047288   .0953726    -2.15   0.033    -.3928171   -.0166405
                           |
                     _cons |    17.7619   3.554993     5.00   0.000     10.75095    24.77284
            --------------------------------------------------------------------------------

            Regression 2

            Code:
            . regress write socst if female==0

                  Source |       SS           df       MS      Number of obs   =        91
            -------------+----------------------------------   F(1, 89)        =     79.62
                   Model |  4513.09285         1  4513.09285   Prob > F        =    0.0000
                Residual |  5044.57748        89  56.6806458   R-squared       =    0.4722
            -------------+----------------------------------   Adj R-squared   =    0.4663
                   Total |  9557.67033        90  106.196337   Root MSE        =    7.5287

            ------------------------------------------------------------------------------
                   write |      Coef.   Std. Err.      t    P>|t|     [95% Conf. Interval]
            -------------+----------------------------------------------------------------
                   socst |   .6247968   .0700195     8.92   0.000     .4856695    .7639241
                   _cons |    17.7619   3.711281     4.79   0.000     10.38766    25.13613
            ------------------------------------------------------------------------------
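            The exact equality above is an algebraic identity of OLS: with the dummy, the slope, and their interaction all in the model, the baseline intercept and slope are determined entirely by the female==0 observations. A minimal sketch in numpy (simulated data standing in for the hsbdemo file; the coefficients and variable names are made up for illustration):

```python
# Sketch of the OLS identity: a fully interacted model reproduces the
# subsample coefficients exactly. Simulated data, not the hsbdemo file.
import numpy as np

rng = np.random.default_rng(12345)
n = 200
female = rng.integers(0, 2, n)          # binary group, like female
socst = rng.uniform(20, 70, n)          # continuous regressor, like socst
write = 18 + 0.6 * socst + 15 * female - 0.2 * female * socst + rng.normal(0, 7, n)

# Full model: write ~ 1 + female + socst + female:socst
X_full = np.column_stack([np.ones(n), female, socst, female * socst])
b_full, *_ = np.linalg.lstsq(X_full, write, rcond=None)

# Subsample model: write ~ 1 + socst, restricted to female == 0
m = female == 0
X_sub = np.column_stack([np.ones(m.sum()), socst[m]])
b_sub, *_ = np.linalg.lstsq(X_sub, write[m], rcond=None)

# Baseline intercept and slope from the full model match the subsample fit
assert np.allclose(b_full[[0, 2]], b_sub)
print(b_full[2], b_sub[1])
```

            The identity holds exactly only when the grouping dummy is interacted with every other regressor, including the constant; drop any of those terms and the two fits diverge.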

            Thanks


            Comment


            • #7
              Hovhannes:
              as Joseph highlighted, you cannot (and should not) expect similar results from two regression models that use different specifications and, in addition, different sample sizes (200 vs 91, if I am not mistaken).
              A simpler regression toy-example may help:
              Code:
              . use "C:\Program Files (x86)\Stata15\ado\base\a\auto.dta"
              (1978 Automobile Data)
              
              . regress price i.rep78##i.foreign
              note: 1b.rep78#1.foreign identifies no observations in the sample
              note: 2.rep78#1.foreign identifies no observations in the sample
              note: 5.rep78#1.foreign omitted because of collinearity
              
                    Source |       SS           df       MS      Number of obs   =        69
              -------------+----------------------------------   F(7, 61)        =      0.39
                     Model |    24684607         7  3526372.43   Prob > F        =    0.9049
                  Residual |   552112352        61  9051022.16   R-squared       =    0.0428
              -------------+----------------------------------   Adj R-squared   =   -0.0670
                     Total |   576796959        68  8482308.22   Root MSE        =    3008.5
              
              -------------------------------------------------------------------------------
                      price |      Coef.   Std. Err.      t    P>|t|     [95% Conf. Interval]
              --------------+----------------------------------------------------------------
                      rep78 |
                         2  |   1403.125   2378.422     0.59   0.557    -3352.823    6159.073
                         3  |   2042.574   2204.707     0.93   0.358    -2366.011    6451.159
                         4  |   1317.056   2351.846     0.56   0.578    -3385.751    6019.863
                         5  |       -360   3008.492    -0.12   0.905    -6375.851    5655.851
                            |
                    foreign |
                   Foreign  |   2088.167   2351.846     0.89   0.378     -2614.64    6790.974
                            |
              rep78#foreign |
                 1#Foreign  |          0  (empty)
                 2#Foreign  |          0  (empty)
                 3#Foreign  |  -3866.574   2980.505    -1.30   0.199    -9826.462    2093.314
                 4#Foreign  |  -1708.278   2746.365    -0.62   0.536    -7199.973    3783.418
                 5#Foreign  |          0  (omitted)
                            |
                      _cons |     4564.5   2127.325     2.15   0.036      310.651    8818.349
              -------------------------------------------------------------------------------
              
              . regress price i.rep78 if foreign==0
              
                    Source |       SS           df       MS      Number of obs   =        48
              -------------+----------------------------------   F(4, 43)        =      0.45
                     Model |  19111892.1         4  4777973.01   Prob > F        =    0.7734
                  Residual |   458855805        43  10671065.2   R-squared       =    0.0400
              -------------+----------------------------------   Adj R-squared   =   -0.0493
                     Total |   477967697        47  10169525.5   Root MSE        =    3266.7
              
              ------------------------------------------------------------------------------
                     price |      Coef.   Std. Err.      t    P>|t|     [95% Conf. Interval]
              -------------+----------------------------------------------------------------
                     rep78 |
                        2  |   1403.125   2582.521     0.54   0.590    -3805.025    6611.275
                        3  |   2042.574     2393.9     0.85   0.398    -2785.185    6870.334
                        4  |   1317.056   2553.665     0.52   0.609    -3832.901    6467.012
                        5  |       -360    3266.66    -0.11   0.913    -6947.847    6227.847
                           |
                     _cons |     4564.5   2309.877     1.98   0.055     -93.8113    9222.811
              ------------------------------------------------------------------------------
              As an aside, I fail to see why you do not compact your code instead of typing the interaction and the conditional main effects of the two predictors separately. This habit makes your code more error-prone (other things being equal, the more instructions your code comprises, the higher the likelihood of mistyping/forgetting something along the way) and, basically, wastes your time.
              Kind regards,
              Carlo
              (Stata 19.0)

              Comment


              • #8
                Thank you Carlo and Joseph

                Comment


                • #9
                  Hi Carlo,

                  Sorry, actually I just realized that your example proves my point: as you can see, when restricting the sample to foreign==0 it gives the same results for rep78 as in the full model, even though the sample sizes differ. This is contrary to my regression results, which give different coefficients.

                  Comment


                  • #10
                    Originally posted by Hovhannes Nahapetyan View Post
                    . . . your example proves my point as you can see when restricting the sample to foreign==0 it gives the same results for rep78 as in the full model even though sample sizes differ. This is contrary to my regression results, which give different coefficients.
                    You're comparing apples to oranges when you try to use a regress example to justify your expectations for an xtreg, re problem.

                    My only suggestion is to reiterate my point above, which is to pay closer attention to the corr(u_i, X) = 0 (assumed) note that is apparently insufficiently prominently displayed in the "Random-effects GLS regression" output.
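                    One way to see the apples-to-oranges point: xtreg, re runs OLS on quasi-demeaned data, subtracting θ times each panel mean, where θ is built from variance components estimated on whatever sample enters the model. A rough numpy sketch of that transform (hand-rolled, not Stata's exact FGLS; the θ values 0.6 and 0.4 are hypothetical stand-ins for what each run might estimate):

```python
# Sketch: the random-effects "theta transform" breaks the OLS identity
# between a fully interacted model and a subsample fit as soon as the
# two runs use different theta values. Hand-rolled, not Stata's FGLS.
import numpy as np

rng = np.random.default_rng(0)
n_groups, T = 60, 5
g = np.repeat(np.arange(n_groups), T)       # panel id
dem = (np.arange(n_groups) % 2)[g]          # time-invariant dummy
u = rng.normal(0, 1, n_groups)[g]           # random effect
oil = rng.uniform(0, 1, n_groups * T)
hri = 0.4 * oil + 0.3 * dem * oil + dem + u + rng.normal(0, 1, n_groups * T)

def quasi_demean(v, theta):
    # subtract theta * panel mean of v
    means = np.bincount(g, v) / T
    return v - theta * means[g]

def re_fit(theta, mask, X_cols, y):
    # OLS on theta-transformed data, restricted to mask
    X = np.column_stack([quasi_demean(c, theta)[mask] for c in X_cols])
    b, *_ = np.linalg.lstsq(X, quasi_demean(y, theta)[mask], rcond=None)
    return b

ones = np.ones_like(oil)
all_rows = np.ones_like(oil, dtype=bool)

# Full sample, fully interacted model; one hypothetical theta
b_full = re_fit(0.6, all_rows, [ones, dem, oil, dem * oil], hri)
# Subsample dem == 0 with a different hypothetical theta
b_sub = re_fit(0.4, dem == 0, [ones, oil], hri)
# Subsample with the SAME theta as the full model
b_same = re_fit(0.6, dem == 0, [ones, oil], hri)

print(b_full[2], b_sub[1], b_same[1])
assert np.allclose(b_full[2], b_same[1])    # identical theta: same oil slope
assert not np.isclose(b_full[2], b_sub[1])  # different theta: slopes diverge
```

                    With a common θ the OLS-style identity survives the transform; with different θs, as in two separate xtreg, re runs that estimate their own variance components, it does not.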

                    Comment


                    • #11
                      Hovhannes:
                      I notice that my toy-example was actually unfortunate.
                      However, point estimates are only one of the results that -regress- (or any other inference procedure) brings to our attention: basically, their contribution in explaining (other things being equal) the variation in the dependent variable is supported by the confidence interval and p-value. Hence, they should be read alongside the standard errors (which are influenced by sample size) and confidence intervals.
                      Kind regards,
                      Carlo
                      (Stata 19.0)

                      Comment


                      • #12
                        Thank you very much Carlo and Joseph!

                        Joseph, I have asked this question before but still am not quite sure (even after reading a lot more on it) and want to bring it up since you mentioned xtreg, re and corr(u_i, X) = 0.
                        If we test for the presence of unit-level heterogeneity (xttest0) and find that it is present, but then run a Hausman test and find that it is not correlated with X, i.e. corr(u_i, X) = 0, how do we decide between xtreg y x, re cluster(robust) and regress y x, cluster(robust)? Is it correct to say that if corr(u_i, X) = 0, so we do not need fixed effects, then OLS with clustered standard errors is preferred to random effects? And either way, why or why not?
                        Thank you!

                        Comment


                        • #13
                          Hovhannes:
                          1) corr(u_i, X) = 0 is an assumption of the -re- machinery, which may hold in some instances and not in others. It is often difficult to detect whether it holds, and it represents a possible downside of the -re- specification, which balances out the -fe- shortcoming of wiping out time-invariant predictors;
                          2) under pooled -regress- you cluster the standard errors because you should tell Stata that observations are not independent due to the panel structure of your dataset;
                          3) under -xtreg- you cluster/robustify the standard errors when you suspect/detect heteroskedasticity and/or autocorrelation in your dataset (usually the latter bites harder in T>N panel datasets, which should be analyzed with -xtgls--like commands);
                          4) if you impose non-default standard errors, -hausman- cannot help you anymore, and you should switch to its community-contributed cousin -xtoverid-;
                          5) it is true that the pooled OLS estimator is consistent when the -re- assumptions hold; the issue is about efficiency.
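                          Point 5) can be illustrated with a small Monte Carlo: when corr(u_i, X) = 0 holds, both pooled OLS and the θ-transform GLS estimator are consistent for the slope, but GLS has a smaller sampling variance. A hand-rolled numpy sketch (it builds θ from the true variance components, unlike Stata's feasible GLS):

```python
# Monte Carlo sketch: under a valid random-effects DGP, pooled OLS and
# quasi-demeaned GLS are both (nearly) unbiased for the slope, but GLS
# has the smaller sampling variance. Theta uses the true components here.
import numpy as np

rng = np.random.default_rng(42)
n_groups, T, reps = 50, 5, 1000
sigma_u, sigma_e, beta = 2.0, 1.0, 1.0
g = np.repeat(np.arange(n_groups), T)
# True theta for the random-effects quasi-demeaning transform
theta = 1 - np.sqrt(sigma_e**2 / (T * sigma_u**2 + sigma_e**2))

def slope(x, y):
    X = np.column_stack([np.ones_like(x), x])
    return np.linalg.lstsq(X, y, rcond=None)[0][1]

b_ols, b_re = [], []
for _ in range(reps):
    x = rng.uniform(0, 1, n_groups * T)
    u = rng.normal(0, sigma_u, n_groups)[g]                   # random effect
    y = beta * x + u + rng.normal(0, sigma_e, n_groups * T)
    b_ols.append(slope(x, y))                                 # pooled OLS
    xm = np.bincount(g, x) / T                                # panel means
    ym = np.bincount(g, y) / T
    b_re.append(slope(x - theta * xm[g], y - theta * ym[g]))  # RE GLS

b_ols, b_re = np.array(b_ols), np.array(b_re)
print(b_ols.mean(), b_re.mean())    # both close to the true slope
print(b_ols.var(), b_re.var())      # GLS variance is smaller
assert b_re.var() < b_ols.var()
```

                          The efficiency gap grows with σ_u relative to σ_e; with σ_u = 0, θ is 0 and the two estimators coincide.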
                          Kind regards,
                          Carlo
                          (Stata 19.0)

                          Comment


                          • #14
                            Thank you so much Carlo! As always a prompt and comprehensive response!

                            Comment
