
  • Interactions, inconsistent results

    Hi all,

    I am getting inconsistent results and hope that someone can help me understand the source.
    I have panel data and am looking at oil revenue's effect on human rights. The human rights index (HRI) is the dependent variable, oil revenue is the main explanatory variable, and I also have a binary variable (democracy) that is equal to 1 for democracies and 0 for autocracies. When I run the models:

    1) xtreg HRI oil democracy oil#democracy
    2) xtreg HRI oil if democracy==0

    I should get the same coefficient for oil in both, namely the effect of oil for autocracies, but I get different coefficients, and the statistical significance also varies. I tried several datasets (both cross-sectional and panel, including one available from Stata) and, as expected, the coefficients were the same, but in my dataset I end up getting different coefficients. My only explanation is that the oil variable has many zeros, but I even created a dataset with many zero values and, as expected, got the same results there too.

    Thank you in advance for your help

  • #2
    Hovhannes:
    posting what you typed and what Stata gave you back (between CODE delimiters, please), with only a brief description of the presumable problem with your data, saves the time of interested listers (and yours, too), as they can spot something weird immediately from the code, tables, and numbers and reply helpfully. Thanks
    Kind regards,
    Carlo
    (Stata 19.0)

    Comment


    • #3
      Originally posted by Hovhannes Nahapetyan View Post
      I should get the same coefficient for oil which is the effect of oil for autocracies . . . I tried several datasets (both cross sectional and panel including one available from Stata) and as expected the coefficients are the same . . .
      What do you mean by "the same"? I could see how they might be similar, but you're fitting models to different samples of data, and so I would not expect the coefficients to be "the same", especially if democracy affects things (or is correlated with things) in ways that your two models don't take into account.

      Code:
      . version 15.1

      . clear *

      . set seed `=strreverse("1472452")'

      . quietly set obs 100

      . generate int country = _n

      . generate byte democracy = mod(_n, 2)

      . generate double country_u = rnormal() + !democracy * rnormal()

      . quietly expand 5

      . quietly bysort country: generate byte year = _n

      . generate double oil = runiform()

      . generate double HRI = oil / 2.5 + democracy * oil / 2.5 + country_u + rnormal()

      . program define dem
        1.         version 15.1
        2.         quietly `0'
        3.         display in smcl as text "oil coefficient = " as result %05.3f _b[oil] ///
      >         " ± " %05.3f _se[oil]
        4. end

      . quietly xtset country year

      . dem xtreg HRI c.oil i.democracy c.oil#i.democracy
      oil coefficient = 0.468 ± 0.236

      . dem xtreg HRI c.oil if democracy ==0
      oil coefficient = 0.475 ± 0.245

      . exit

      end of do-file

      Comment


      • #4
        Hi Carlo,

        Thank you very much for the reply. Please see the code and results below; here polity_cat has 3 categories coded 0-2:

        1) xtreg physint L_oilcap_log i.polity_cat c.L_oilcap_log#i.polity_cat, re
        2) xtreg physint L_oilcap_log if polity_cat==0, re

        For regression (1), the coefficient for L_oilcap_log should be the effect of this variable when polity_cat=0, i.e. the baseline category.
        Regression (2) should give the same coefficient for L_oilcap_log, but it does not, and this is what is really surprising.
        Not only are the coefficients different, but the first regression shows no statistical significance while the second is statistically significant.
        Thank you for your help

        Regression 1

        Code:
        . xtreg physint L_oilcap_log i.polity_cat c.L_oilcap_log#i.polity_cat, re

        Random-effects GLS regression                   Number of obs     =      3,566
        Group variable: cow                             Number of groups  =        165

        R-sq:                                           Obs per group:
             within  = 0.0329                                         min =          4
             between = 0.2637                                         avg =       21.6
             overall = 0.1559                                         max =         25

                                                        Wald chi2(5)      =     149.12
        corr(u_i, X)   = 0 (assumed)                    Prob > chi2       =     0.0000

        -------------------------------------------------------------------------------------------
                          physint |      Coef.   Std. Err.      z    P>|z|     [95% Conf. Interval]
        --------------------------+----------------------------------------------------------------
                     L_oilcap_log |   .0267375   .0356157     0.75   0.453    -.0430681     .096543
                                  |
                       polity_cat |
                         anocracy |  -.1486905     .10117    -1.47   0.142    -.3469801    .0495991
                        democracy |   .9677071   .1079842     8.96   0.000     .7560619    1.179352
                                  |
        polity_cat#c.L_oilcap_log |
                         anocracy |  -.0167481   .0306848    -0.55   0.585    -.0768893     .043393
                        democracy |  -.0783344   .0380519    -2.06   0.040    -.1529147   -.0037541
                                  |
                            _cons |   4.388589   .1636918    26.81   0.000     4.067759    4.709419
        --------------------------+----------------------------------------------------------------
                          sigma_u |  1.6303355
                          sigma_e |  1.3036242
                              rho |  .60999118   (fraction of variance due to u_i)
        -------------------------------------------------------------------------------------------

        Regression 2

        Code:
        . xtreg physint L_oilcap_log if polity_cat==0, re

        Random-effects GLS regression                   Number of obs     =      1,072
        Group variable: cow                             Number of groups  =         90

        R-sq:                                           Obs per group:
             within  = 0.0117                                         min =          1
             between = 0.0075                                         avg =       11.9
             overall = 0.0047                                         max =         25

                                                        Wald chi2(1)      =       6.81
        corr(u_i, X)   = 0 (assumed)                    Prob > chi2       =     0.0091

        ------------------------------------------------------------------------------
             physint |      Coef.   Std. Err.      z    P>|z|     [95% Conf. Interval]
        -------------+----------------------------------------------------------------
        L_oilcap_log |   .1252923   .0480051     2.61   0.009      .031204    .2193805
               _cons |   3.539986   .2164304    16.36   0.000      3.11579    3.964182
        -------------+----------------------------------------------------------------
             sigma_u |  1.6170188
             sigma_e |  1.3803844
                 rho |  .57845747   (fraction of variance due to u_i)
        ------------------------------------------------------------------------------

        Thank you very much.

        Comment


        • #5
          And here is an example where the two show the same result.

          Code:
          use https://stats.idre.ucla.edu/stat/data/hsbdemo, clear
          regress write female##c.socst
          regress write socst if female==0

          Comment


          • #6
            Sorry, the results for this example didn't post properly.
            This is an example where the two regressions show the same results.

            regress write female##c.socst
            regress write socst if female==0

            Regression 1

            Code:
            . regress write female##c.socst

                  Source |       SS           df       MS      Number of obs   =       200
            -------------+----------------------------------   F(3, 196)       =     49.26
                   Model |  7685.43528         3  2561.81176   Prob > F        =    0.0000
                Residual |  10193.4397       196  52.0073455   R-squared       =    0.4299
            -------------+----------------------------------   Adj R-squared   =    0.4211
                   Total |   17878.875       199   89.843593   Root MSE        =    7.2116

            --------------------------------------------------------------------------------
                     write |      Coef.   Std. Err.      t    P>|t|     [95% Conf. Interval]
            ---------------+----------------------------------------------------------------
                    female |
                   female  |   15.00001    5.09795     2.94   0.004     4.946132    25.05389
                     socst |   .6247968   .0670709     9.32   0.000     .4925236    .7570701
                           |
            female#c.socst |
                   female  |  -.2047288   .0953726    -2.15   0.033    -.3928171   -.0166405
                           |
                     _cons |    17.7619   3.554993     5.00   0.000     10.75095    24.77284
            --------------------------------------------------------------------------------

            Regression 2

            Code:
            . regress write socst if female==0

                  Source |       SS           df       MS      Number of obs   =        91
            -------------+----------------------------------   F(1, 89)        =     79.62
                   Model |  4513.09285         1  4513.09285   Prob > F        =    0.0000
                Residual |  5044.57748        89  56.6806458   R-squared       =    0.4722
            -------------+----------------------------------   Adj R-squared   =    0.4663
                   Total |  9557.67033        90  106.196337   Root MSE        =    7.5287

            ------------------------------------------------------------------------------
                   write |      Coef.   Std. Err.      t    P>|t|     [95% Conf. Interval]
            -------------+----------------------------------------------------------------
                   socst |   .6247968   .0700195     8.92   0.000     .4856695    .7639241
                   _cons |    17.7619   3.711281     4.79   0.000     10.38766    25.13613
            ------------------------------------------------------------------------------
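            The exact equality above is an algebraic identity of OLS: with the dummy, the slope, and their interaction all in the model, the baseline intercept and slope are determined entirely by the female==0 observations. A minimal sketch in numpy (simulated data standing in for the hsbdemo file; the coefficients and variable names are made up for illustration):

```python
# Sketch of the OLS identity: a fully interacted model reproduces the
# subsample coefficients exactly. Simulated data, not the hsbdemo file.
import numpy as np

rng = np.random.default_rng(12345)
n = 200
female = rng.integers(0, 2, n)          # binary group, like female
socst = rng.uniform(20, 70, n)          # continuous regressor, like socst
write = 18 + 0.6 * socst + 15 * female - 0.2 * female * socst + rng.normal(0, 7, n)

# Full model: write ~ 1 + female + socst + female:socst
X_full = np.column_stack([np.ones(n), female, socst, female * socst])
b_full, *_ = np.linalg.lstsq(X_full, write, rcond=None)

# Subsample model: write ~ 1 + socst, restricted to female == 0
m = female == 0
X_sub = np.column_stack([np.ones(m.sum()), socst[m]])
b_sub, *_ = np.linalg.lstsq(X_sub, write[m], rcond=None)

# Baseline intercept and slope from the full model match the subsample fit
assert np.allclose(b_full[[0, 2]], b_sub)
print(b_full[2], b_sub[1])
```

            The identity holds exactly only when the grouping dummy is interacted with every other regressor, including the constant; drop any of those terms and the two fits diverge.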

            Thanks


            Comment


            • #7
              Hovhannes:
              as Joseph highlighted, you cannot (and should not) expect similar results from two regression models that use different specifications and, in addition, different sample sizes (200 vs 91, if I am not mistaken).
              A simpler regression toy-example may help:
              Code:
              . use "C:\Program Files (x86)\Stata15\ado\base\a\auto.dta"
              (1978 Automobile Data)
              
              . regress price i.rep78##i.foreign
              note: 1b.rep78#1.foreign identifies no observations in the sample
              note: 2.rep78#1.foreign identifies no observations in the sample
              note: 5.rep78#1.foreign omitted because of collinearity
              
                    Source |       SS           df       MS      Number of obs   =        69
              -------------+----------------------------------   F(7, 61)        =      0.39
                     Model |    24684607         7  3526372.43   Prob > F        =    0.9049
                  Residual |   552112352        61  9051022.16   R-squared       =    0.0428
              -------------+----------------------------------   Adj R-squared   =   -0.0670
                     Total |   576796959        68  8482308.22   Root MSE        =    3008.5
              
              -------------------------------------------------------------------------------
                      price |      Coef.   Std. Err.      t    P>|t|     [95% Conf. Interval]
              --------------+----------------------------------------------------------------
                      rep78 |
                         2  |   1403.125   2378.422     0.59   0.557    -3352.823    6159.073
                         3  |   2042.574   2204.707     0.93   0.358    -2366.011    6451.159
                         4  |   1317.056   2351.846     0.56   0.578    -3385.751    6019.863
                         5  |       -360   3008.492    -0.12   0.905    -6375.851    5655.851
                            |
                    foreign |
                   Foreign  |   2088.167   2351.846     0.89   0.378     -2614.64    6790.974
                            |
              rep78#foreign |
                 1#Foreign  |          0  (empty)
                 2#Foreign  |          0  (empty)
                 3#Foreign  |  -3866.574   2980.505    -1.30   0.199    -9826.462    2093.314
                 4#Foreign  |  -1708.278   2746.365    -0.62   0.536    -7199.973    3783.418
                 5#Foreign  |          0  (omitted)
                            |
                      _cons |     4564.5   2127.325     2.15   0.036      310.651    8818.349
              -------------------------------------------------------------------------------
              
              . regress price i.rep78 if foreign==0
              
                    Source |       SS           df       MS      Number of obs   =        48
              -------------+----------------------------------   F(4, 43)        =      0.45
                     Model |  19111892.1         4  4777973.01   Prob > F        =    0.7734
                  Residual |   458855805        43  10671065.2   R-squared       =    0.0400
              -------------+----------------------------------   Adj R-squared   =   -0.0493
                     Total |   477967697        47  10169525.5   Root MSE        =    3266.7
              
              ------------------------------------------------------------------------------
                     price |      Coef.   Std. Err.      t    P>|t|     [95% Conf. Interval]
              -------------+----------------------------------------------------------------
                     rep78 |
                        2  |   1403.125   2582.521     0.54   0.590    -3805.025    6611.275
                        3  |   2042.574     2393.9     0.85   0.398    -2785.185    6870.334
                        4  |   1317.056   2553.665     0.52   0.609    -3832.901    6467.012
                        5  |       -360    3266.66    -0.11   0.913    -6947.847    6227.847
                           |
                     _cons |     4564.5   2309.877     1.98   0.055     -93.8113    9222.811
              ------------------------------------------------------------------------------
              As an aside, I fail to see why you do not compact your code instead of typing the interaction and the conditional main effects of the two predictors separately. This habit makes your code more error-prone (other things being equal, the more instructions your code comprises, the higher the likelihood of mistyping/forgetting something along the way) and, basically, wastes your time.
              Kind regards,
              Carlo
              (Stata 19.0)

              Comment


              • #8
                Thank you Carlo and Joseph

                Comment


                • #9
                  Hi Carlo,

                  Sorry, actually I just realized that your example proves my point: as you can see, when restricting the sample to foreign==0 it gives the same results for rep78 as in the full model, even though the sample sizes differ. This is contrary to my regression results, which give different coefficients.

                  Comment


                  • #10
                    Originally posted by Hovhannes Nahapetyan View Post
                    . . . your example proves my point as you can see when restricting the sample to foreign==0 it gives the same results for rep78 as in the full model even though sample sizes differ. This is contrary to my regression results, which give different coefficients.
                    You're comparing apples to oranges when you try to use a regress example to justify your expectations for an xtreg, re problem.

                    My only suggestion is to reiterate my point above, which is to pay closer attention to the corr(u_i, X) = 0 (assumed) note that is apparently insufficiently prominently displayed in the "Random-effects GLS regression" output.
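                    One way to see the apples-to-oranges point: xtreg, re runs OLS on quasi-demeaned data, subtracting θ times each panel mean, where θ is built from variance components estimated on whatever sample enters the model. A rough numpy sketch of that transform (hand-rolled, not Stata's exact FGLS; the θ values 0.6 and 0.4 are hypothetical stand-ins for what each run might estimate):

```python
# Sketch: the random-effects "theta transform" breaks the OLS identity
# between a fully interacted model and a subsample fit as soon as the
# two runs use different theta values. Hand-rolled, not Stata's FGLS.
import numpy as np

rng = np.random.default_rng(0)
n_groups, T = 60, 5
g = np.repeat(np.arange(n_groups), T)       # panel id
dem = (np.arange(n_groups) % 2)[g]          # time-invariant dummy
u = rng.normal(0, 1, n_groups)[g]           # random effect
oil = rng.uniform(0, 1, n_groups * T)
hri = 0.4 * oil + 0.3 * dem * oil + dem + u + rng.normal(0, 1, n_groups * T)

def quasi_demean(v, theta):
    # subtract theta * panel mean of v
    means = np.bincount(g, v) / T
    return v - theta * means[g]

def re_fit(theta, mask, X_cols, y):
    # OLS on theta-transformed data, restricted to mask
    X = np.column_stack([quasi_demean(c, theta)[mask] for c in X_cols])
    b, *_ = np.linalg.lstsq(X, quasi_demean(y, theta)[mask], rcond=None)
    return b

ones = np.ones_like(oil)
all_rows = np.ones_like(oil, dtype=bool)

# Full sample, fully interacted model; one hypothetical theta
b_full = re_fit(0.6, all_rows, [ones, dem, oil, dem * oil], hri)
# Subsample dem == 0 with a different hypothetical theta
b_sub = re_fit(0.4, dem == 0, [ones, oil], hri)
# Subsample with the SAME theta as the full model
b_same = re_fit(0.6, dem == 0, [ones, oil], hri)

print(b_full[2], b_sub[1], b_same[1])
assert np.allclose(b_full[2], b_same[1])    # identical theta: same oil slope
assert not np.isclose(b_full[2], b_sub[1])  # different theta: slopes diverge
```

                    With a common θ the OLS-style identity survives the transform; with different θs, as in two separate xtreg, re runs that estimate their own variance components, it does not.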

                    Comment


                    • #11
                      Hovhannes:
                      I notice that my toy-example was actually unfortunate.
                      However, point estimates are only one of the results that -regress- (or any other inference procedure) brings to our attention: basically, their contribution in explaining (other things being equal) the variation in the dependent variable is supported by the confidence interval and p-value. Hence, they should be read alongside the standard errors (which are influenced by sample size) and confidence intervals.
                      Kind regards,
                      Carlo
                      (Stata 19.0)

                      Comment


                      • #12
                        Thank you very much Carlo and Joseph!

                        Joseph, I have asked this question before but still am not quite sure (even after reading a lot more on it) and want to bring it up since you mentioned xtreg, re and corr(u_i, X) = 0.
                        If we test for the presence of unit-level heterogeneity (xttest0) and find that it is present, but then run a Hausman test and find that it is not correlated with X, i.e. corr(u_i, X) = 0, how do we decide between xtreg y x, re cluster(robust) and regress y x, cluster(robust)? Is it correct to say that if corr(u_i, X) = 0, so we do not need fixed effects, then OLS with clustered standard errors is preferred to random effects? And either way, why or why not?
                        Thank you!

                        Comment


                        • #13
                          Hovhannes:
                          1) corr(u_i, X) = 0 is an assumption of the -re- machinery, which may hold in some instances and not in others. It is often difficult to detect whether it holds, and it represents a possible downside of the -re- specification, which balances out the -fe- shortcoming of wiping out time-invariant predictors;
                          2) under pooled -regress- you cluster the standard errors because you should tell Stata that observations are not independent due to the panel structure of your dataset;
                          3) under -xtreg- you cluster/robustify the standard errors when you suspect/detect heteroskedasticity and/or autocorrelation in your dataset (usually the latter bites harder in T>N panel datasets, which should be analyzed with -xtgls--like commands);
                          4) if you impose non-default standard errors, -hausman- cannot help you anymore, and you should switch to its community-contributed cousin -xtoverid-;
                          5) it is true that the pooled OLS estimator is consistent when the -re- assumptions hold; the issue is about efficiency.
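                          Point 5) can be illustrated with a small Monte Carlo: when corr(u_i, X) = 0 holds, both pooled OLS and the θ-transform GLS estimator are consistent for the slope, but GLS has a smaller sampling variance. A hand-rolled numpy sketch (it builds θ from the true variance components, unlike Stata's feasible GLS):

```python
# Monte Carlo sketch: under a valid random-effects DGP, pooled OLS and
# quasi-demeaned GLS are both (nearly) unbiased for the slope, but GLS
# has the smaller sampling variance. Theta uses the true components here.
import numpy as np

rng = np.random.default_rng(42)
n_groups, T, reps = 50, 5, 1000
sigma_u, sigma_e, beta = 2.0, 1.0, 1.0
g = np.repeat(np.arange(n_groups), T)
# True theta for the random-effects quasi-demeaning transform
theta = 1 - np.sqrt(sigma_e**2 / (T * sigma_u**2 + sigma_e**2))

def slope(x, y):
    X = np.column_stack([np.ones_like(x), x])
    return np.linalg.lstsq(X, y, rcond=None)[0][1]

b_ols, b_re = [], []
for _ in range(reps):
    x = rng.uniform(0, 1, n_groups * T)
    u = rng.normal(0, sigma_u, n_groups)[g]                   # random effect
    y = beta * x + u + rng.normal(0, sigma_e, n_groups * T)
    b_ols.append(slope(x, y))                                 # pooled OLS
    xm = np.bincount(g, x) / T                                # panel means
    ym = np.bincount(g, y) / T
    b_re.append(slope(x - theta * xm[g], y - theta * ym[g]))  # RE GLS

b_ols, b_re = np.array(b_ols), np.array(b_re)
print(b_ols.mean(), b_re.mean())    # both close to the true slope
print(b_ols.var(), b_re.var())      # GLS variance is smaller
assert b_re.var() < b_ols.var()
```

                          The efficiency gap grows with σ_u relative to σ_e; with σ_u = 0, θ is 0 and the two estimators coincide.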
                          Kind regards,
                          Carlo
                          (Stata 19.0)

                          Comment


                          • #14
                            Thank you so much Carlo! As always a prompt and comprehensive response!

                            Comment
