  • Diff in Diff

    Dear Users,
    I want to run a difference-in-differences regression to understand whether my y variable has been affected by entry into the European Union. My dataset is a panel of 40 countries over 15 years and looks like this:

    country year y id ue due d04 did04
    Albania 2000 53.6 1 0 0 0 0 0
    Albania 2001 56.6 1 0 0 0 0 0
    Albania 2002 56.8 1 0 0 0 0 0
    Albania 2003 56.8 1 0 0 0 0 0


    I generate both the time dummy and the treatment dummy with:

    generate due=0
    replace due=1 if ue<=year
    gen d04 = (year>=2004) & !missing(year)

    because my "treatment" is entry into the European Union. I generated an interaction variable with:

    generate did04=d04*due

    When I try to perform the diff n diff, writing

    regress y d04 due did04, robust

    Stata omits the interaction term because of multicollinearity, reporting this note:

    note: did04 omitted because of collinearity

    I use Stata 13.1. Where is my error? Thank you in advance for any help, and excuse me for any unintentional mistake I made in expressing the problem.

  • #2
    In the data sample that you post, ue, due, d04, and did04 are all constant, so it is unsurprising that Stata would drop all but one of them in a regression. Presumably that is not the case in your full data--but in this unrepresentative sample it's impossible to know what's going on.

    More to the point, your description, to the limited extent I understand it, makes it sound as if you do not actually have the data necessary for a difference-in-differences model.

    In a difference-in-differences model there are two groups of entities (countries): a group that got the treatment (entrance into the EU) and another group that did not. You also observe the outcome variable on both groups both before and after the coalescence of the EU. If you don't have a group of countries that never entered the EU, then you don't have a difference-in-differences model and you need to analyze it simply as pre-post treatment comparisons. This gives rise to 2 variables: the treatment variable distinguishes the group that eventually joins the EU (1) from those that do not (0). The second variable is the before vs after variable, which in this situation would be 0 in years before 2004 and 1 from 2004 onward. The interaction term you want is then the product of these.

    The before-after variable is your d04 variable. You have not yet, as far as I can see, generated the treatment variable that indicates which countries entered the EU and which did not. Presumably you need to do this with something like -gen treatment = inlist(country, "Cyprus", "Czech Republic", "Estonia", "Hungary", ...etc.)-. Evidently, you need to write out this list explicitly, as Stata doesn't know what ... means. Also, with string arguments inlist() only allows 9 countries to be listed here, and as there were 10 countries that joined the EU in 2004, you need to leave one out. Let's say you leave out Slovenia, so you need to add -| country == "Slovenia"- to the end of the command.
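
As a rough illustration of the inlist() approach just described, here is a minimal sketch on toy data. The country spellings are assumptions and would have to match the spellings in the real dataset exactly; the variable name treatment is the one used in this thread.

```stata
* Toy data: the spellings below are assumed, not taken from the poster's file
clear
input str20 country
"Albania"
"Cyprus"
"Czech Republic"
"Estonia"
"Hungary"
"Latvia"
"Lithuania"
"Malta"
"Poland"
"Slovakia"
"Slovenia"
"Italy"
end

* With strings, inlist() takes at most 9 values after the first argument,
* so the tenth 2004 entrant is added with a separate | condition
generate byte treatment = inlist(country, "Cyprus", "Czech Republic", "Estonia", ///
    "Hungary", "Latvia", "Lithuania", "Malta", "Poland", "Slovakia") ///
    | country == "Slovenia"

tabulate treatment
```

Because the expression is a logical one, treatment is 1 for the ten listed countries and 0 for every other observation, with no missing values.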

    Anyway, once you figure out how to calculate a treatment variable, let's call it treatment, you do not need to generate a new interaction variable. Stata will do that for you in your estimation command if you use factor variable notation. (See -help fvvarlist-). The syntax would look like this:

    Code:
    estimation_command y i.treatment##i.d04
    Finally, if this is panel data, you probably should not be using the -regress- command to analyze it, because the assumption that error terms in all observations are independent is most likely violated. You should look into the -xtreg- command.

    Added later: if you do move to the -xtreg- command and choose the fixed effects model, the -treatment- main effect will be dropped by Stata because it is collinear with the fixed effects. That is not a problem and you shouldn't be concerned when that happens. It is the interaction term that provides information about the effect of interest here.
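
To make the fixed-effects case concrete, here is a small sketch on simulated data. Everything in it (the 40 units, the effect sizes, the variable u) is a made-up assumption for illustration, not the poster's data. It shows the time-invariant treatment main effect being omitted while the interaction is still estimated.

```stata
* Simulated panel: 40 units, 15 years, half of them treated from 2004 on
clear
set seed 12345
set obs 40
generate id = _n
generate byte treatment = id <= 20        // time-invariant group indicator
generate u = rnormal(0, 5)                // hypothetical unit-level effect
expand 15
bysort id: generate year = 1999 + _n      // 2000 to 2014
generate byte d04 = year >= 2004
* Hypothetical data-generating process with a true interaction effect of 3
generate y = 50 + u + 2*d04 + 3*treatment*d04 + rnormal()

xtset id year
* 1.treatment is reported as omitted (collinear with the fixed effects);
* the 1.treatment#1.d04 row is the difference-in-differences estimate
xtreg y i.treatment##i.d04, fe vce(cluster id)
```

The interaction coefficient can be retrieved afterwards as _b[1.treatment#1.d04].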


    • #3
      Dear Clyde,
      thank you very much for your useful answer. First, let me reassure you: the data sample I posted is poor, but I have countries both inside and outside the EU, before and after 2004. Actually, in my dataset I have three kinds of countries: ones that never joined the EU in any of the observed 15 years (like Albania), ones that were already in the EU before 2000 (like Italy), and ones that joined the EU between 2000 and 2015 (like Bulgaria). I don't understand why you say that I didn't generate the treatment variable. Isn't due, in the post above, the treatment dummy? I have a variable called ue which represents the year in which a country joined the EU; I generated the treatment variable (due) by coding:

      generate due=0
      replace due=1 if ue<=year

      This should (in my mind, please correct me if I'm wrong!) generate a dummy which is 0 if a country hasn't joined the EU by the year of the observation and 1 if it has. Thank you very much for your useful comments about the model, and for teaching me about factor variable notation. Am I wrong, or is it in principle exactly the same as creating an interaction variable? Of course it is far more convenient, thank you for the tip. Actually, I'm also looking at an FE model, clustering on countries. I hope it's appropriate.


      • #4
        It looks like your calculation of due is, indeed, the treatment variable. That was not apparent from your post in #1 because there was no explanation of what ue is, and in the data example ue is always 0, so it doesn't look like something that makes sense to compare with year. But now I understand that ue is set to zero when there is no entry to the EU.
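
One point may be worth checking, under the assumption stated above that ue really is coded 0 for countries that never join: the condition ue<=year is then also true for those countries (0 is below any year), so they would be coded as treated. The sketch below, on toy data with illustrative entry years, adds an explicit guard; ever_eu is a hypothetical name for a time-invariant group indicator.

```stata
* Toy data: ue is the entry year, with 0 assumed to mean "never joined"
clear
input str10 country ue
"Albania"  0
"Italy"    1958
"Bulgaria" 2007
end
expand 3
bysort country: generate year = 2005 + _n   // 2006, 2007, 2008

* Time-varying indicator: 1 only once a country has actually joined.
* Without (ue > 0), Albania would get due == 1 in every year.
generate byte due = (ue > 0) & (ue <= year) & !missing(ue)

* Time-invariant indicator: 1 for countries that join at some point
generate byte ever_eu = (ue > 0) & !missing(ue)

list country year ue due ever_eu, sepby(country)
```

If ue is instead stored as missing for never-members, ue<=year is already false for them, since Stata treats missing as larger than any number, and the guard is harmless.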

        In principle, just as far as the regression itself goes, there is no difference between using factor variable notation and calculating your own interaction term. But only if you use factor variable notation will you be able to use the -margins- command after that. Maybe you don't need it, but particularly in models with interaction terms, it is usually important to calculate effects in the different subgroups and perhaps make graphs etc. Before Stata Corp. created -margins- those tasks required tedious and error-prone series of commands using -predict- or -lincom-, etc.; -margins- will save you hours of work in those tasks.

        If all you're going to do is run the regression and hand off the regression coefficients to someone, then there is no advantage to factor notation. But if you're going to delve into understanding the results, I think you'll find it very helpful to use factor notation, and then -margins-.
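
For anyone who has not used -margins- before, here is a minimal sketch of what it looks like after a factor-variable regression. The data are simulated and the effect sizes are arbitrary assumptions; only the commands matter.

```stata
* Simulated cross-section with hypothetical effects
clear
set seed 2004
set obs 400
generate byte treatment = runiform() < 0.5
generate byte d04 = runiform() < 0.5
generate y = 50 + 5*treatment + 4*d04 - 7*treatment*d04 + rnormal(0, 3)

regress y i.treatment##i.d04, vce(robust)

* Predicted mean of y in each of the four treatment-by-period cells
margins treatment#d04

* Treated-minus-untreated difference, separately before and after;
* the gap between these two numbers is the interaction coefficient
margins d04, dydx(treatment)
```

Neither -margins- call would work if the interaction had been entered as a hand-made product variable, because Stata would not know that it is tied to treatment and d04.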

        An FE model clustering countries sounds appropriate to me.


        • #5
          Clyde,
          thank you very much, and excuse me for the poor explanation in the first post. You have been amazingly kind and helpful, thank you!


          • #6
            I have a further question, which is more an econometric one than one strictly related to Stata. If I'm off topic, please just tell me, and excuse me for the error. I managed (thanks to Clyde's help!) to do the diff-in-diff. I prefer to do it with the diff command, a user-written command (diff, by Juan M. Villa, Brooks World Poverty Institute, The University of Manchester, 2011. DIFF: Stata Module to Perform Differences in Differences Estimation. Statistical Software Components. Boston College. Department of Economics). As far as I understand, the command is just a more convenient syntax for the factor variable notation Clyde suggested. I ran three different diff-in-diff estimations and three different panel FE regressions. My variable of interest is ffc, the treatment dummy is due, the time dummy is d04, and the interaction is did04. All the other variables are controls. Now, coding:

            diff ffc, treated(due) period(d04) robust cov (gvtspending gw_gnppc ethnfrac legor_uk legor_so legor_fr legor_sc)

            xtset id year
            xtreg ffc d04 due did04 gvtspending gw_gnppc ethnfrac legor_uk legor_so legor_fr legor_sc, fe vce (cluster country)

            drop if year<2000
            drop if year>2008

            diff ffc, treated(due) period(d04) robust cov (gvtspending gw_gnppc ethnfrac legor_uk legor_so legor_fr legor_sc)
            xtreg ffc d04 due did04 gvtspending gw_gnppc ethnfrac legor_uk legor_so legor_fr legor_sc, fe vce (cluster country)

            drop if year<2002
            drop if year>2006

            diff ffc, treated(due) period(d04) robust cov (gvtspending gw_gnppc ethnfrac legor_uk legor_so legor_fr legor_sc)
            xtreg ffc d04 due did04 gvtspending gw_gnppc ethnfrac legor_uk legor_so legor_fr legor_sc, fe vce (cluster country)

            I obtain two groups of three different estimations, one with the diff-in-diff method and the other with the panel FE. As you may see, the difference between the regressions is the number of years included: the first sample includes all the years, the second the years between 2000 and 2008, and the third the years between 2002 and 2006.

            Now the question: can someone explain to me why the interaction variable is significant (in terms of p-value) in the diff-in-diff with all the years and non-significant in the other diff-in-diff regressions, while the same variable is significant in the panel FE regression with only the years 2002 to 2006 and non-significant in the others? Thank you, and excuse me for the long post.


            • #7
              From the help file for -diff-
              diff performs several difference in differences (diff-in-diff) treatment effect estimations of a given outcome variable from a pooled baseline and follow up dataset: ... diff is also suitable for estimating repeated cross sections diff-in-diff.... [emphasis added]
              -diff- is not intended for use with panel data.


              • #8
                Thank you very much for the quick answer Clyde, I missed that one, sorry. Just allow me a brief follow-up for clarity: if I code:
                Code:
                regress ffc i.due##i.d04 gvtspending gw_gnppc ethnfrac legor_uk legor_so legor_fr legor_sc, robust
                
                xtset id year
                xtreg ffc d04 due did04 gvtspending gw_gnppc ethnfrac legor_uk legor_so legor_fr legor_sc, fe vce (cluster country)
                
                drop if year<2000
                drop if year>2008
                
                regress ffc i.due##i.d04 gvtspending gw_gnppc ethnfrac legor_uk legor_so legor_fr legor_sc, robust
                
                xtreg ffc d04 due did04 gvtspending gw_gnppc ethnfrac legor_uk legor_so legor_fr legor_sc, fe vce (cluster country)
                
                drop if year<2002
                
                drop if year>2006
                
                regress ffc i.due##i.d04 gvtspending gw_gnppc ethnfrac legor_uk legor_so legor_fr legor_sc, robust
                
                xtreg ffc d04 due did04 gvtspending gw_gnppc ethnfrac legor_uk legor_so legor_fr legor_sc, fe vce (cluster country)
                I obtain exactly the same results. Do you have an explanation for this difference in significance? Is the -regress- model in factor notation also inappropriate for panel data? Thank you so much.


                • #9
                  I'm completely confused by your post. You say the results are exactly the same, and then ask if I can explain the difference?!? And you don't even show the results.
                  Please post back showing the actual Stata output and clarify the question. I'd like to help, but I really have no idea what you're talking about here.


                  • #10
                    Dear Clyde,
                    thank you for the answer and excuse me for being unclear. I meant that these are the same results I found with the diff command. Here is the output of the code posted above:

                    . regress ffc i.due##i.d04 gvtspending gw_gnppc ethnfrac legor_uk legor_so legor_fr legor_sc, robust

                    Linear regression Number of obs = 105
                    F( 10, 94) = 118.97
                    Prob > F = 0.0000
                    R-squared = 0.8690
                    Root MSE = 8.9005

                    ------------------------------------------------------------------------------
                    | Robust
                    ffc | Coef. Std. Err. t P>|t| [95% Conf. Interval]
                    -------------+----------------------------------------------------------------
                    due |
                    Sì | 3.623588 3.468423 1.04 0.299 -3.263046 10.51022
                    |
                    d04 |
                    Sì | 2.684728 2.783901 0.96 0.337 -2.842773 8.212229
                    |
                    due#d04 |
                    Sì#Sì | -2.173693 3.620438 -0.60 0.550 -9.362157 5.01477
                    |
                    gvtspending | 2.142442 .3786619 5.66 0.000 1.3906 2.894284
                    gw_gnppc | -4.225914 1.210857 -3.49 0.001 -6.630099 -1.821729
                    ethnfrac | 49.22688 10.78852 4.56 0.000 27.80602 70.64774
                    legor_uk | -3.781459 2.347466 -1.61 0.111 -8.442408 .8794904
                    legor_so | -46.19743 3.247645 -14.22 0.000 -52.64571 -39.74916
                    legor_fr | -24.57273 2.982864 -8.24 0.000 -30.49528 -18.65019
                    legor_sc | .6177467 3.481514 0.18 0.860 -6.294882 7.530375
                    _cons | 41.91866 6.108815 6.86 0.000 29.78946 54.04785
                    ------------------------------------------------------------------------------

                    .
                    . xtset id year
                    panel variable: id (strongly balanced)
                    time variable: year, 2002 to 2006
                    delta: 1 unit

                    .
                    . xtreg ffc d04 due did04 gvtspending gw_gnppc ethnfrac legor_uk legor_so legor_fr legor_sc, fe vce (cluster country)
                    note: gw_gnppc omitted because of collinearity
                    note: ethnfrac omitted because of collinearity
                    note: legor_uk omitted because of collinearity
                    note: legor_so omitted because of collinearity
                    note: legor_fr omitted because of collinearity
                    note: legor_sc omitted because of collinearity

                    Fixed-effects (within) regression Number of obs = 105
                    Group variable: id Number of groups = 21

                    R-sq: within = 0.0867 Obs per group: min = 5
                    between = 0.3038 avg = 5.0
                    overall = 0.2696 max = 5

                    F(3,20) = .
                    corr(u_i, Xb) = -0.6143 Prob > F = .

                    (Std. Err. adjusted for 21 clusters in country)
                    ------------------------------------------------------------------------------
                    | Robust
                    ffc | Coef. Std. Err. t P>|t| [95% Conf. Interval]
                    -------------+----------------------------------------------------------------
                    d04 | 2.121648 2.965457 0.72 0.483 -4.064188 8.307483
                    due | -5.202446 .9272655 -5.61 0.000 -7.136688 -3.268205
                    did04 | -1.140376 3.176692 -0.36 0.723 -7.766839 5.486086
                    gvtspending | -.190276 .4041911 -0.47 0.643 -1.033404 .6528518
                    gw_gnppc | 0 (omitted)
                    ethnfrac | 0 (omitted)
                    legor_uk | 0 (omitted)
                    legor_so | 0 (omitted)
                    legor_fr | 0 (omitted)
                    legor_sc | 0 (omitted)
                    _cons | 73.11453 8.265563 8.85 0.000 55.87287 90.35619
                    -------------+----------------------------------------------------------------
                    sigma_u | 25.575719
                    sigma_e | 3.1209402
                    rho | .98532778 (fraction of variance due to u_i)
                    ------------------------------------------------------------------------------

                    .
                    . drop if year<2000
                    (0 observations deleted)

                    . drop if year>2008
                    (0 observations deleted)

                    .
                    . regress ffc i.due##i.d04 gvtspending gw_gnppc ethnfrac legor_uk legor_so legor_fr legor_sc, robust

                    Linear regression Number of obs = 105
                    F( 10, 94) = 118.97
                    Prob > F = 0.0000
                    R-squared = 0.8690
                    Root MSE = 8.9005

                    ------------------------------------------------------------------------------
                    | Robust
                    ffc | Coef. Std. Err. t P>|t| [95% Conf. Interval]
                    -------------+----------------------------------------------------------------
                    due |
                    Sì | 3.623588 3.468423 1.04 0.299 -3.263046 10.51022
                    |
                    d04 |
                    Sì | 2.684728 2.783901 0.96 0.337 -2.842773 8.212229
                    |
                    due#d04 |
                    Sì#Sì | -2.173693 3.620438 -0.60 0.550 -9.362157 5.01477
                    |
                    gvtspending | 2.142442 .3786619 5.66 0.000 1.3906 2.894284
                    gw_gnppc | -4.225914 1.210857 -3.49 0.001 -6.630099 -1.821729
                    ethnfrac | 49.22688 10.78852 4.56 0.000 27.80602 70.64774
                    legor_uk | -3.781459 2.347466 -1.61 0.111 -8.442408 .8794904
                    legor_so | -46.19743 3.247645 -14.22 0.000 -52.64571 -39.74916
                    legor_fr | -24.57273 2.982864 -8.24 0.000 -30.49528 -18.65019
                    legor_sc | .6177467 3.481514 0.18 0.860 -6.294882 7.530375
                    _cons | 41.91866 6.108815 6.86 0.000 29.78946 54.04785
                    ------------------------------------------------------------------------------

                    .
                    . xtreg ffc d04 due did04 gvtspending gw_gnppc ethnfrac legor_uk legor_so legor_fr legor_sc, fe vce (cluster country)
                    note: gw_gnppc omitted because of collinearity
                    note: ethnfrac omitted because of collinearity
                    note: legor_uk omitted because of collinearity
                    note: legor_so omitted because of collinearity
                    note: legor_fr omitted because of collinearity
                    note: legor_sc omitted because of collinearity

                    Fixed-effects (within) regression Number of obs = 105
                    Group variable: id Number of groups = 21

                    R-sq: within = 0.0867 Obs per group: min = 5
                    between = 0.3038 avg = 5.0
                    overall = 0.2696 max = 5

                    F(3,20) = .
                    corr(u_i, Xb) = -0.6143 Prob > F = .

                    (Std. Err. adjusted for 21 clusters in country)
                    ------------------------------------------------------------------------------
                    | Robust
                    ffc | Coef. Std. Err. t P>|t| [95% Conf. Interval]
                    -------------+----------------------------------------------------------------
                    d04 | 2.121648 2.965457 0.72 0.483 -4.064188 8.307483
                    due | -5.202446 .9272655 -5.61 0.000 -7.136688 -3.268205
                    did04 | -1.140376 3.176692 -0.36 0.723 -7.766839 5.486086
                    gvtspending | -.190276 .4041911 -0.47 0.643 -1.033404 .6528518
                    gw_gnppc | 0 (omitted)
                    ethnfrac | 0 (omitted)
                    legor_uk | 0 (omitted)
                    legor_so | 0 (omitted)
                    legor_fr | 0 (omitted)
                    legor_sc | 0 (omitted)
                    _cons | 73.11453 8.265563 8.85 0.000 55.87287 90.35619
                    -------------+----------------------------------------------------------------
                    sigma_u | 25.575719
                    sigma_e | 3.1209402
                    rho | .98532778 (fraction of variance due to u_i)
                    ------------------------------------------------------------------------------

                    .
                    . drop if year<2002
                    (0 observations deleted)

                    .
                    . drop if year>2006
                    (0 observations deleted)

                    .
                    . regress ffc i.due##i.d04 gvtspending gw_gnppc ethnfrac legor_uk legor_so legor_fr legor_sc, robust

                    Linear regression Number of obs = 105
                    F( 10, 94) = 118.97
                    Prob > F = 0.0000
                    R-squared = 0.8690
                    Root MSE = 8.9005

                    ------------------------------------------------------------------------------
                    | Robust
                    ffc | Coef. Std. Err. t P>|t| [95% Conf. Interval]
                    -------------+----------------------------------------------------------------
                    due |
                    Sì | 3.623588 3.468423 1.04 0.299 -3.263046 10.51022
                    |
                    d04 |
                    Sì | 2.684728 2.783901 0.96 0.337 -2.842773 8.212229
                    |
                    due#d04 |
                    Sì#Sì | -2.173693 3.620438 -0.60 0.550 -9.362157 5.01477
                    |
                    gvtspending | 2.142442 .3786619 5.66 0.000 1.3906 2.894284
                    gw_gnppc | -4.225914 1.210857 -3.49 0.001 -6.630099 -1.821729
                    ethnfrac | 49.22688 10.78852 4.56 0.000 27.80602 70.64774
                    legor_uk | -3.781459 2.347466 -1.61 0.111 -8.442408 .8794904
                    legor_so | -46.19743 3.247645 -14.22 0.000 -52.64571 -39.74916
                    legor_fr | -24.57273 2.982864 -8.24 0.000 -30.49528 -18.65019
                    legor_sc | .6177467 3.481514 0.18 0.860 -6.294882 7.530375
                    _cons | 41.91866 6.108815 6.86 0.000 29.78946 54.04785
                    ------------------------------------------------------------------------------

                    .
                    . xtreg ffc d04 due did04 gvtspending gw_gnppc ethnfrac legor_uk legor_so legor_fr legor_sc, fe vce (cluster country)
                    note: gw_gnppc omitted because of collinearity
                    note: ethnfrac omitted because of collinearity
                    note: legor_uk omitted because of collinearity
                    note: legor_so omitted because of collinearity
                    note: legor_fr omitted because of collinearity
                    note: legor_sc omitted because of collinearity

                    Fixed-effects (within) regression Number of obs = 105
                    Group variable: id Number of groups = 21

                    R-sq: within = 0.0867 Obs per group: min = 5
                    between = 0.3038 avg = 5.0
                    overall = 0.2696 max = 5

                    F(3,20) = .
                    corr(u_i, Xb) = -0.6143 Prob > F = .

                    (Std. Err. adjusted for 21 clusters in country)
                    ------------------------------------------------------------------------------
                    | Robust
                    ffc | Coef. Std. Err. t P>|t| [95% Conf. Interval]
                    -------------+----------------------------------------------------------------
                    d04 | 2.121648 2.965457 0.72 0.483 -4.064188 8.307483
                    due | -5.202446 .9272655 -5.61 0.000 -7.136688 -3.268205
                    did04 | -1.140376 3.176692 -0.36 0.723 -7.766839 5.486086
                    gvtspending | -.190276 .4041911 -0.47 0.643 -1.033404 .6528518
                    gw_gnppc | 0 (omitted)
                    ethnfrac | 0 (omitted)
                    legor_uk | 0 (omitted)
                    legor_so | 0 (omitted)
                    legor_fr | 0 (omitted)
                    legor_sc | 0 (omitted)
                    _cons | 73.11453 8.265563 8.85 0.000 55.87287 90.35619
                    -------------+----------------------------------------------------------------
                    sigma_u | 25.575719
                    sigma_e | 3.1209402
                    rho | .98532778 (fraction of variance due to u_i)
                    ------------------------------------------------------------------------------

                    .
                    end of do-file


                    I hope it is clear. If it is not, could you suggest how I can copy and paste from Stata in a more reader-friendly way? Thank you again so much for your amazing availability and help.


                    • #11
                      Maybe it is better this way. In the first table, I have the diff-in-diff regressions, with 3 different periods (1995-2015/2000-2008/2002-2006).

                      Difference in Differences 1995-2015/2000-2008/2002-2006
                      --------------------------------------------------------------------
                      (1) (2) (3)
                      Freedom fr~n Freedom fr~n Freedom fr~n
                      --------------------------------------------------------------------
                      No 0 0 0
                      (.) (.) (.)

                      Sì 5.197** 3.293 3.624
                      (2.75) (1.36) (1.04)

                      No 0 0 0
                      (.) (.) (.)

                      Sì 3.897* 2.584 2.685
                      (2.09) (1.10) (0.96)

                      No # No 0 0 0
                      (.) (.) (.)

                      No # Sì 0 0 0
                      (.) (.) (.)

                      Sì # No 0 0 0
                      (.) (.) (.)

                      Sì # Sì -7.039** -2.547 -2.174
                      (-2.93) (-0.86) (-0.60)

                      Government spending 1.965*** 2.001*** 2.142***
                      (9.83) (7.62) (5.66)

                      Government Wa..C. -4.682*** -4.448*** -4.226***
                      (-6.20) (-5.16) (-3.49)

                      Ethno-Linguistic F~i 40.88*** 41.70*** 49.23***
                      (6.81) (4.95) (4.56)

                      Legal origins UK -4.870** -4.235* -3.781
                      (-2.93) (-2.10) (-1.61)

                      Legal origins Soci~t -44.56*** -47.40*** -46.20***
                      (-29.95) (-21.25) (-14.22)

                      Legal origins France -25.93*** -25.33*** -24.57***
                      (-14.19) (-10.82) (-8.24)

                      Legal origins Scan~n -1.977 0.408 0.618
                      (-1.07) (0.17) (0.18)

                      Constant 48.00*** 47.53*** 41.92***
                      (12.09) (9.88) (6.86)
                      --------------------------------------------------------------------
                      Observations 399 189 105
                      Adjusted R-squared 0.824 0.851 0.855
                      --------------------------------------------------------------------
                      t statistics in parentheses
                      * p<0.05, ** p<0.01, *** p<0.001

                      In this second table, I have the Panel FE regression for the same period.

                      Panel FE 1995-2015/2000-2008/2002-2006
                      --------------------------------------------------------------------
                      (1) (2) (3)
                      Freedom fr~n Corruption~x Control of~n
                      --------------------------------------------------------------------
                      2004? 4.713 2.846 0.139
                      (1.51) (1.55) (1.45)

                      In EU? 3.165 4.317 0.187
                      (1.56) (1.41) (1.71)

                      Interaction d04 due -4.794 -3.197 -0.326**
                      (-1.39) (-1.37) (-3.02)

                      Government spending -0.0697 -0.376 0.00440
                      (-0.23) (-0.87) (0.25)

                      Government Wa..C. 0 0 0
                      (.) (.) (.)

                      Ethno-Linguistic F~i 0 0 0
                      (.) (.) (.)

                      Legal origins UK 0 0 0
                      (.) (.) (.)

                      Legal origins Soci~t 0 0 0
                      (.) (.) (.)

                      Legal origins France 0 0 0
                      (.) (.) (.)

                      Legal origins Scan~n 0 0 0
                      (.) (.) (.)

                      Constant 65.06*** 71.45*** 1.057**
                      (10.57) (7.83) (2.98)
                      --------------------------------------------------------------------
                      Observations 399 391 336
                      Adjusted R-squared 0.034 0.056 0.170
                      --------------------------------------------------------------------
                      t statistics in parentheses
                      * p<0.05, ** p<0.01, *** p<0.001


                      As you may see, in the first table the interaction variable is significant for the first period (1995-2015) and insignificant in the others. On the other hand, in the second table the interaction variable is significant in the last period (2002-2006) and insignificant in the others. Do you have any explanation for this? Thank you, I hope I have been clear this time.


                      • #12
                        What you posted in #11 doesn't help me. I don't know what command(s) it comes from, for one, and, I don't know how to relate the variable labels in that output to the variable names in your commands and -regress- and -xtreg- output. And, in any case, the numbers in that output don't have any obvious relationship to those in the regression output, so I have really no idea what they represent.

                        It's better to focus on the direct output of the analyses. So returning to #10, the way to make it more user friendly is to put it in a code block. I suggested that before. Please see FAQ #12 for instructions how to do that. In any case, I am able to read the output there well enough to comment on it. You are getting identical results from all of these analyses because you are running the exact same commands on the exact same data in each group of -reg- and -xtreg- commands. Although you are attempting to change the data with your -drop if- commands, notice that in the Stata output, immediately after each command, Stata tells you "(0 observations deleted)." So it appears that you do not have any observations with year < 2002 or year > 2006 by the time you reach this point in the code.

                        The data you showed in #1 clearly has observations that would be candidates for -drop-ping, so somewhere along the way they got removed before you reached this point in your analysis. You will have to troubleshoot how that happened.
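
One way to avoid this kind of accident, sketched here on simulated data, is to leave the dataset intact and restrict each estimation with an -if- qualifier, checking beforehand how many observations fall outside the window. The variables x and y and the 21-unit panel are hypothetical placeholders, not the poster's variables.

```stata
* Simulated panel: 21 units observed 1995 to 2015
clear
set seed 1
set obs 21
generate id = _n
expand 21
bysort id: generate year = 1994 + _n
generate x = rnormal()
generate y = 1 + 2*x + rnormal()

* How many observations lie outside the narrowest window?
* A count of 0 here would mean the data have already been trimmed.
count if year < 2002 | year > 2006

* Same model on three nested windows, without dropping anything
regress y x
display "full sample N = " e(N)
regress y x if inrange(year, 2000, 2008)
display "2000-2008 N = " e(N)
regress y x if inrange(year, 2002, 2006)
display "2002-2006 N = " e(N)
```

Because nothing is dropped, the commands can be rerun in any order and each one reports its own number of observations, which makes a mismatch between the intended and the actual sample easy to spot.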


                        • #13
                          Thank you for your perseverance in helping me, Clyde! I made a mistake with the dataset, you are totally right. Here is the correct output (with the code):

                          Code:
                          . regress ffc i.due##i.d04 gvtspending gw_gnppc ethnfrac legor_uk legor_so legor_fr legor_sc, robust
                          
                          Linear regression                                      Number of obs =     399
                                                                                 F( 10,   388) =  416.51
                                                                                 Prob > F      =  0.0000
                                                                                 R-squared     =  0.8283
                                                                                 Root MSE      =  9.7484
                          
                          ------------------------------------------------------------------------------
                                       |               Robust
                                   ffc |      Coef.   Std. Err.      t    P>|t|     [95% Conf. Interval]
                          -------------+----------------------------------------------------------------
                                   due |
                                   Sì  |   5.196876   1.891159     2.75   0.006     1.478673    8.915079
                                       |
                                   d04 |
                                   Sì  |   3.896697     1.8663     2.09   0.037     .2273695    7.566024
                                       |
                               due#d04 |
                                Sì#Sì  |  -7.039191   2.398964    -2.93   0.004    -11.75579   -2.322595
                                       |
                           gvtspending |   1.964749    .199936     9.83   0.000     1.571656    2.357843
                              gw_gnppc |  -4.681705   .7545846    -6.20   0.000    -6.165292   -3.198119
                              ethnfrac |   40.87855   6.003651     6.81   0.000     29.07479    52.68231
                              legor_uk |  -4.869977   1.664385    -2.93   0.004     -8.14232   -1.597635
                              legor_so |  -44.56216   1.487803   -29.95   0.000    -47.48733     -41.637
                              legor_fr |  -25.93025   1.827158   -14.19   0.000    -29.52262   -22.33788
                              legor_sc |  -1.976997   1.848762    -1.07   0.286    -5.611843    1.657848
                                 _cons |   48.00006   3.970148    12.09   0.000     40.19436    55.80575
                          ------------------------------------------------------------------------------
                          
                          . 
                          . xtset id year 
                                 panel variable:  id (unbalanced)
                                  time variable:  year, 1995 to 2016, but with gaps
                                          delta:  1 unit
                          
                          . 
                          . xtreg ffc d04 due did04 gvtspending gw_gnppc ethnfrac legor_uk legor_so legor_fr legor_sc, fe vce (cluster country)
                          note: gw_gnppc omitted because of collinearity
                          note: ethnfrac omitted because of collinearity
                          note: legor_uk omitted because of collinearity
                          note: legor_so omitted because of collinearity
                          note: legor_fr omitted because of collinearity
                          note: legor_sc omitted because of collinearity
                          
                          Fixed-effects (within) regression               Number of obs      =       399
                          Group variable: id                              Number of groups   =        21
                          
                          R-sq:  within  = 0.0435                         Obs per group: min =        19
                                 between = 0.0235                                        avg =      19.0
                                 overall = 0.0094                                        max =        19
                          
                                                                          F(4,20)            =      1.27
                          corr(u_i, Xb)  = 0.0437                         Prob > F           =    0.3141
                          
                                                         (Std. Err. adjusted for 21 clusters in country)
                          ------------------------------------------------------------------------------
                                       |               Robust
                                   ffc |      Coef.   Std. Err.      t    P>|t|     [95% Conf. Interval]
                          -------------+----------------------------------------------------------------
                                   d04 |   4.713141   3.124605     1.51   0.147    -1.804671    11.23095
                                   due |    3.16513   2.031957     1.56   0.135    -1.073458    7.403717
                                 did04 |  -4.794282   3.450961    -1.39   0.180    -11.99286    2.404296
                           gvtspending |  -.0697054    .301616    -0.23   0.820    -.6988653    .5594544
                              gw_gnppc |          0  (omitted)
                              ethnfrac |          0  (omitted)
                              legor_uk |          0  (omitted)
                              legor_so |          0  (omitted)
                              legor_fr |          0  (omitted)
                              legor_sc |          0  (omitted)
                                 _cons |   65.06137   6.155886    10.57   0.000     52.22041    77.90232
                          -------------+----------------------------------------------------------------
                               sigma_u |  22.959057
                               sigma_e |  5.8452696
                                   rho |  .93912692   (fraction of variance due to u_i)
                          ------------------------------------------------------------------------------
                          
                          . 
                          . drop if year<2000
                          (227 observations deleted)
                          
                          . drop if year>2008
                          (367 observations deleted)
                          
                          . 
                          . regress ffc i.due##i.d04 gvtspending gw_gnppc ethnfrac legor_uk legor_so legor_fr legor_sc, robust
                          
                          Linear regression                                      Number of obs =     189
                                                                                 F( 10,   178) =  226.15
                                                                                 Prob > F      =  0.0000
                                                                                 R-squared     =  0.8585
                                                                                 Root MSE      =  9.0463
                          
                          ------------------------------------------------------------------------------
                                       |               Robust
                                   ffc |      Coef.   Std. Err.      t    P>|t|     [95% Conf. Interval]
                          -------------+----------------------------------------------------------------
                                   due |
                                   Sì  |   3.292563   2.417664     1.36   0.175    -1.478409    8.063534
                                       |
                                   d04 |
                                   Sì  |   2.583956   2.357938     1.10   0.275    -2.069153    7.237066
                                       |
                               due#d04 |
                                Sì#Sì  |  -2.547058   2.958382    -0.86   0.390    -8.385072    3.290957
                                       |
                           gvtspending |   2.001437   .2625081     7.62   0.000     1.483408    2.519465
                              gw_gnppc |  -4.447996   .8628452    -5.16   0.000    -6.150718   -2.745274
                              ethnfrac |   41.69702   8.418821     4.95   0.000     25.08347    58.31056
                              legor_uk |  -4.235492   2.018431    -2.10   0.037    -8.218626   -.2523582
                              legor_so |  -47.40386   2.231187   -21.25   0.000    -51.80684   -43.00088
                              legor_fr |  -25.33003   2.342062   -10.82   0.000    -29.95181   -20.70825
                              legor_sc |   .4080025   2.464494     0.17   0.869    -4.455383    5.271388
                                 _cons |   47.52872   4.810217     9.88   0.000     38.03633    57.02111
                          ------------------------------------------------------------------------------
                          
                          . 
                          . xtreg ffc d04 due did04 gvtspending gw_gnppc ethnfrac legor_uk legor_so legor_fr legor_sc, fe vce (cluster country)
                          note: gw_gnppc omitted because of collinearity
                          note: ethnfrac omitted because of collinearity
                          note: legor_uk omitted because of collinearity
                          note: legor_so omitted because of collinearity
                          note: legor_fr omitted because of collinearity
                          note: legor_sc omitted because of collinearity
                          
                          Fixed-effects (within) regression               Number of obs      =       189
                          Group variable: id                              Number of groups   =        21
                          
                          R-sq:  within  = 0.0699                         Obs per group: min =         9
                                 between = 0.2722                                        avg =       9.0
                                 overall = 0.2378                                        max =         9
                          
                                                                          F(4,20)            =      2.76
                          corr(u_i, Xb)  = -0.5945                        Prob > F           =    0.0561
                          
                                                         (Std. Err. adjusted for 21 clusters in country)
                          ------------------------------------------------------------------------------
                                       |               Robust
                                   ffc |      Coef.   Std. Err.      t    P>|t|     [95% Conf. Interval]
                          -------------+----------------------------------------------------------------
                                   d04 |   1.795276   2.285201     0.79   0.441     -2.97157    6.562121
                                   due |  -1.497031   2.052316    -0.73   0.474    -5.778087    2.784024
                                 did04 |  -.0147426   2.653078    -0.01   0.996    -5.548966    5.519481
                           gvtspending |  -.7580804   .2877919    -2.63   0.016    -1.358404    -.157757
                              gw_gnppc |          0  (omitted)
                              ethnfrac |          0  (omitted)
                              legor_uk |          0  (omitted)
                              legor_so |          0  (omitted)
                              legor_fr |          0  (omitted)
                              legor_sc |          0  (omitted)
                                 _cons |   81.33537   5.958517    13.65   0.000     68.90612    93.76462
                          -------------+----------------------------------------------------------------
                               sigma_u |  25.617948
                               sigma_e |  3.5726054
                                   rho |  .98092273   (fraction of variance due to u_i)
                          ------------------------------------------------------------------------------
                          
                          . 
                          . drop if year<2002
                          (91 observations deleted)
                          
                          . 
                          . drop if year>2006
                          (92 observations deleted)
                          
                          . 
                          . regress ffc i.due##i.d04 gvtspending gw_gnppc ethnfrac legor_uk legor_so legor_fr legor_sc, robust
                          
                          Linear regression                                      Number of obs =     105
                                                                                 F( 10,    94) =  118.97
                                                                                 Prob > F      =  0.0000
                                                                                 R-squared     =  0.8690
                                                                                 Root MSE      =  8.9005
                          
                          ------------------------------------------------------------------------------
                                       |               Robust
                                   ffc |      Coef.   Std. Err.      t    P>|t|     [95% Conf. Interval]
                          -------------+----------------------------------------------------------------
                                   due |
                                   Sì  |   3.623588   3.468423     1.04   0.299    -3.263046    10.51022
                                       |
                                   d04 |
                                   Sì  |   2.684728   2.783901     0.96   0.337    -2.842773    8.212229
                                       |
                               due#d04 |
                                Sì#Sì  |  -2.173693   3.620438    -0.60   0.550    -9.362157     5.01477
                                       |
                           gvtspending |   2.142442   .3786619     5.66   0.000       1.3906    2.894284
                              gw_gnppc |  -4.225914   1.210857    -3.49   0.001    -6.630099   -1.821729
                              ethnfrac |   49.22688   10.78852     4.56   0.000     27.80602    70.64774
                              legor_uk |  -3.781459   2.347466    -1.61   0.111    -8.442408    .8794904
                              legor_so |  -46.19743   3.247645   -14.22   0.000    -52.64571   -39.74916
                              legor_fr |  -24.57273   2.982864    -8.24   0.000    -30.49528   -18.65019
                              legor_sc |   .6177467   3.481514     0.18   0.860    -6.294882    7.530375
                                 _cons |   41.91866   6.108815     6.86   0.000     29.78946    54.04785
                          ------------------------------------------------------------------------------
                          
                          . 
                          . xtreg ffc d04 due did04 gvtspending gw_gnppc ethnfrac legor_uk legor_so legor_fr legor_sc, fe vce (cluster country)
                          note: gw_gnppc omitted because of collinearity
                          note: ethnfrac omitted because of collinearity
                          note: legor_uk omitted because of collinearity
                          note: legor_so omitted because of collinearity
                          note: legor_fr omitted because of collinearity
                          note: legor_sc omitted because of collinearity
                          
                          Fixed-effects (within) regression               Number of obs      =       105
                          Group variable: id                              Number of groups   =        21
                          
                          R-sq:  within  = 0.0867                         Obs per group: min =         5
                                 between = 0.3038                                        avg =       5.0
                                 overall = 0.2696                                        max =         5
                          
                                                                          F(3,20)            =         .
                          corr(u_i, Xb)  = -0.6143                        Prob > F           =         .
                          
                                                         (Std. Err. adjusted for 21 clusters in country)
                          ------------------------------------------------------------------------------
                                       |               Robust
                                   ffc |      Coef.   Std. Err.      t    P>|t|     [95% Conf. Interval]
                          -------------+----------------------------------------------------------------
                                   d04 |   2.121648   2.965457     0.72   0.483    -4.064188    8.307483
                                   due |  -5.202446   .9272655    -5.61   0.000    -7.136688   -3.268205
                                 did04 |  -1.140376   3.176692    -0.36   0.723    -7.766839    5.486086
                           gvtspending |   -.190276   .4041911    -0.47   0.643    -1.033404    .6528518
                              gw_gnppc |          0  (omitted)
                              ethnfrac |          0  (omitted)
                              legor_uk |          0  (omitted)
                              legor_so |          0  (omitted)
                              legor_fr |          0  (omitted)
                              legor_sc |          0  (omitted)
                                 _cons |   73.11453   8.265563     8.85   0.000     55.87287    90.35619
                          -------------+----------------------------------------------------------------
                               sigma_u |  25.575719
                               sigma_e |  3.1209402
                                   rho |  .98532778   (fraction of variance due to u_i)
                          ------------------------------------------------------------------------------
                          
                          .
                          As you can see, in the factor-notation regressions the interaction term is significant for the first period (1995-2015) and insignificant in the others. In the panel FE regressions, on the other hand, the interaction term is significant in the last period (2002-2006) and insignificant in the others. Do you have any explanation for this? Thank you; I hope I have finally managed to be clear this time.



                          • #14
                            Thank you! This output is very easy to work with.

                            So, I'm not going to try to figure out what's going on in the -regress- analyses because they're just wrong. You have panel data: -regress- is fatally flawed here due to omitted variable bias, as well as correlated error terms, so these analyses should just be discarded.

                            Now, factor variable notation really has nothing to do with it. You could have (and, I would argue, should have) used factor variable notation in your -xtreg- analyses also. The results would be the same either way. (The advantage of factor variable notation comes when you use -margins- later on.)
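
                            As a sketch of what that might look like with the variable names from #13 (I am assuming that due and d04 are coded 0/1 and that the data used in #13 are in memory), the fixed-effects model and a follow-up -margins- call could be written as follows. The time-invariant covariates are kept only to mirror your command; Stata will omit them again.

                            ```stata
                            * Same fixed-effects model as in #13, with the interaction
                            * built by factor-variable notation instead of did04
                            xtset id year
                            xtreg ffc i.due##i.d04 gvtspending gw_gnppc ethnfrac ///
                                legor_uk legor_so legor_fr legor_sc, fe vce(cluster country)

                            * Linear predictions of ffc in each due-by-d04 cell
                            margins due#d04

                            * The interaction coefficient, with its confidence interval
                            lincom 1.due#1.d04
                            ```

                            The coefficient reported for 1.due#1.d04 should match the one you got for did04, provided did04 is exactly the product of due and d04.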

                            Your interaction effect does get quite attenuated as you remove observations farther away in time from the treatment implementation year (2004). One possible explanation is that the effects of the intervention actually increase over time, so that the paths of the two groups of countries diverge more and more as time goes on. I would check this out graphically by plotting the mean ffc in both groups to see if the paths diverge following 2004.

                            Code:
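                            * -collapse- overwrites the data in memory, so keep a copy to return to
                            preserve
                            collapse (mean) ffc, by(due year)
                            reshape wide ffc, i(year) j(due)
                            graph twoway connected ffc* year, xline(2004)
                            restore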
                            If you see progressively diverging paths, this would argue for replacing the d04 variable by a linear spline in time in the modeling (-help mkspline-).
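
                            A rough sketch of the spline version, to be run with the full panel in memory (not the collapsed means). The names yr_pre and yr_post are just placeholders I made up, the knot at 2004 is the treatment year from your d04 variable, and I am assuming due is coded 0/1; treat this as a starting point, not a finished specification:

                            ```stata
                            * Linear spline in year with one knot at 2004:
                            *   yr_pre  = min(year, 2004)      (slope before 2004)
                            *   yr_post = max(year - 2004, 0)  (slope from 2004 on)
                            mkspline yr_pre 2004 yr_post = year

                            * Let each slope differ between the two groups of countries
                            xtset id year
                            xtreg ffc i.due##c.yr_pre i.due##c.yr_post gvtspending, ///
                                fe vce(cluster country)
                            ```

                            Here the coefficient on 1.due#c.yr_post estimates how much the yearly trend from 2004 onward differs between the groups, and the one on 1.due#c.yr_pre does the same for the trend before 2004.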

                            Another explanation is just sampling variation in a sample of only moderate size. In the largest of your analyses, the first one, the 95% confidence interval around the interaction coefficient is very wide, and the values seen in the two later analyses on more restricted samples fall comfortably inside that interval. In fact, all 3 of the estimates fall comfortably within all 3 confidence intervals. So this may just be noise in the data.



                            • #15
                              Thank you very much, Clyde; your suggestions about interpretation are very valuable.

