
  • Can the if option of xtreg unbalance Panel Data?

    Hi everyone,

    I have a tricky question regarding the way xtreg works.

    Using a balanced panel dataset with 10 observations per id, I want to add an if qualifier to xtreg, fe so that only observations meeting a specific criterion are used.
    The problem is that not all ids meet this criterion, e.g.:

    xtreg x z if z == "yes", fe robust

    id | time | z
    1 | 2000 | yes
    1 | 2001 | yes
    1 | 2002 | no
    2 | 2000 | yes
    2 | 2001 | no
    2 | 2002 | yes

    First, I assumed that xtreg would automatically dismiss all the ids with missing values, but then I realised that, since the observations are not missing but manually excluded, this may actually unbalance my panel and bias the results. On the other hand, the x-values are already within values and therefore the results should correctly describe the regression for the case in which z == "yes". I'm confused.

    So does it unbalance my panel or not, and can I still safely use xtreg?

    Thank you in advance
    Alexandra


  • #2
    -xtreg- works just fine with unbalanced panels. There is nothing to worry about here.

    That said, you should think about the meaning and implications of imposing the z == "yes" restriction. If it makes sense, from a scientific perspective, to include observations on ids for which z is sometimes yes and sometimes no, then applying the -if- condition to -xtreg- is the way to go. On the other hand, if your analysis is only meaningful for entities that have z = "yes" at all times, then you need to find a way to exclude ids with any z = "no". My point is that this is not a statistical issue about unbalanced vs balanced panels. It's an issue of whether any z = "no" observations at all simply make an id "not in universe" for your research question. That's what you should focus on.

    If you do need to exclude ids that have any z = "no", you can do that as follows:

    Code:
    egen always_yes = min(z == "yes"), by(id)
    xtreg whatever if always_yes
    Last edited by Clyde Schechter; 03 Apr 2016, 09:27.
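To see what the -egen- line in #2 does, here is a minimal sketch on toy data. Ids 1 and 2 mirror the table in #1; id 3 is a hypothetical addition so that at least one id has z == "yes" in every period. The expression z == "yes" evaluates to 1 or 0, so its minimum within an id is 1 only when every observation of that id qualifies.

```stata
* Toy data: ids 1 and 2 follow the table in #1; id 3 is hypothetical
clear
input id time str3 z
1 2000 "yes"
1 2001 "yes"
1 2002 "no"
2 2000 "yes"
2 2001 "no"
2 2002 "yes"
3 2000 "yes"
3 2001 "yes"
3 2002 "yes"
end

* (z == "yes") is 1 or 0 per observation; the by-id minimum is 1
* only if the id never has z == "no"
egen always_yes = min(z == "yes"), by(id)
list, sepby(id)

* Number of observations that would enter -xtreg ... if always_yes-
count if always_yes
```

In this toy dataset only the observations of id 3 satisfy always_yes, whereas a plain -if z == "yes"- would also keep the "yes" rows of ids 1 and 2.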



    • #3
      Originally posted by Clyde Schechter
      -xtreg- works just fine with unbalanced panels. There is nothing to worry about here.
      Hello Clyde,

      Thank you for the quick answer. But the question remains: does this technically make it an unbalanced panel analysis?

      Thanks



      • #4
        If excluding observations using -if- leaves different numbers of observations in each panel, then, yes, it's an unbalanced panel analysis. But that doesn't matter: -xtreg- doesn't care whether the panel is balanced or not.
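One way to check whether a given -if- condition leaves the estimation sample unbalanced is to count the surviving observations per id. The sketch below uses illustrative toy data (not the poster's dataset); the variable name n_used is made up for the example.

```stata
* Illustrative toy panel: id 1 loses one observation under the
* -if- condition, id 2 loses none
clear
input id time str3 z
1 2000 "yes"
1 2001 "yes"
1 2002 "no"
2 2000 "yes"
2 2001 "yes"
2 2002 "yes"
end
xtset id time

* Observations per id that satisfy the condition
egen n_used = total(z == "yes"), by(id)
tabulate id if z == "yes"

* Participation pattern of the restricted sample
xtdescribe if z == "yes"
```

If the counts differ across ids, the restricted sample is unbalanced in the sense described in #4, and -xtreg, fe- will still run on it.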



        • #5
          Originally posted by Clyde Schechter
          -xtreg- works just fine with unbalanced panels. There is nothing to worry about here.

          That said, you should think about the meaning and implications of imposing the z == "yes" restriction. If it makes sense, from a scientific perspective, to include observations on ids for which z is sometimes yes and sometimes no, then applying the -if- condition to -xtreg- is the way to go. On the other hand, if your analysis is only meaningful for entities that have z = "yes" at all times, then you need to find a way to exclude ids with any z = "no". My point is that this is not a statistical issue about unbalanced vs balanced panels. It's an issue of whether any z = "no" observations at all simply make an id "not in universe" for your research question. That's what you should focus on.

          If you do need to exclude ids that have any z = "no", you can do that as follows:

          Code:
          egen always_yes = min(z == "yes"), by(id)
          xtreg whatever if always_yes
          Dear Clyde Schechter ,

          Apologies for asking you a question on this topic again. I am not sure if I am understanding the - if - condition correctly, when used in combination with xtreg, fe.

          If I use an individual fixed effect specification, and impose an - if - condition (say if z=="yes"), will the fixed effect specification not automatically drop observations for which z is sometimes yes, and sometimes no? I.e. will the fixed effect specification not ensure that only observations for which z is always yes will be included?

          Kind regards,

          Anna



          • #6
            If I use an individual fixed effect specification, and impose an - if - condition (say if z=="yes"), will the fixed effect specification not automatically drop observations for which z is sometimes yes, and sometimes no? I.e. will the fixed effect specification not ensure that only observations for which z is always yes will be included?
            Only those specific observations for which z != "yes" will be omitted. If the same individual has other observations with z == "yes", those will be retained in the analysis.

            If what you need to do is analyze only those observations where the individual always has z == "yes", use the code that was explained at the end of #2 (and at the end of the quote from that post which you show in #5).



            • #7
              Apologies, I should have been more specific! I only have two years of data. In that case, if I use an if condition together with fixed effects, will the individuals for whom z == "yes" in one wave but z == "no" in the other effectively not be taken into account, since each of them contributes only one data point?



              • #8
                It makes no difference whether you have two years of data or two thousand. -if z == "yes"- will select for inclusion all and only those observations in which z == "yes"; it will not notice whether the same individual has another observation with -z != "yes"-.

                That said, when there are only two observations per individual, the retained observations of individuals for whom z is once yes and once no will be singletons. In that case, when running -xtreg, fe-, those observations will have no influence on the regression coefficients and their standard errors. But they will still be counted in the number of observations and number of groups, and they will affect sigma_u and rho.
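The point about singletons can be checked empirically. The sketch below uses made-up toy data with hypothetical variables y and x: ids 1 to 4 have z == "yes" in both waves, while ids 5 and 6 each contribute a single "yes" observation. It fits -xtreg, fe- with and without the singletons and compares the slope and its standard error; the scalar names are chosen only for this example.

```stata
* Made-up two-wave panel; ids 5 and 6 become singletons under the -if-
clear
input id wave y x str3 z
1 1 2.0 1.0 "yes"
1 2 3.5 2.0 "yes"
2 1 1.0 0.5 "yes"
2 2 1.8 1.5 "yes"
3 1 4.0 3.0 "yes"
3 2 4.2 3.5 "yes"
4 1 0.5 1.0 "yes"
4 2 2.5 2.5 "yes"
5 1 9.0 7.0 "yes"
5 2 1.0 0.0 "no"
6 1 3.0 2.0 "no"
6 2 8.0 1.0 "yes"
end
xtset id wave

* Fit 1: every observation with z == "yes", singletons included
xtreg y x if z == "yes", fe
scalar b_all  = _b[x]
scalar se_all = _se[x]
scalar ng_all = e(N_g)

* Fit 2: only ids that have z == "yes" in both waves
egen n_yes = total(z == "yes"), by(id)
xtreg y x if z == "yes" & n_yes == 2, fe
scalar b_sub  = _b[x]
scalar se_sub = _se[x]
scalar ng_sub = e(N_g)

* Slope and standard error should agree up to rounding;
* the number of groups should not
display "reldif(b)  = " reldif(b_all, b_sub)
display "reldif(se) = " reldif(se_all, se_sub)
display "groups: " ng_all " vs " ng_sub
```

The reasoning behind the comparison: a singleton's within-id deviations are zero for both y and x, so it adds nothing to the within estimator, and it adds one observation and one group at the same time, which leaves the residual degrees of freedom unchanged.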



                • #9
                  Clyde Schechter Thank you very much! Now I finally understand! A final question (promised!): I want to run the regression on observations that have z==no in wave 1, but z==yes in wave 2. How can I do so? I tried creating the always_yes and always_no variables, and then running the regression with the condition if always_yes!=1 & always_no!=1, but then I also include observations that have z==yes in wave 1 and z==no in wave 2 (and I only want to include those observations that have the reverse).

                  Many thanks for all your help!



                  • #10
                    Code:
                    label define boolean 0 "no" 1 "yes"
                    encode z, gen(n_z) label(boolean)
                    assert inlist(n_z, 0, 1)
                    isid id wave, sort
                    by id: egen z1 = max(cond(wave == 1, n_z, .))
                    by id: egen z2 = max(cond(wave == 2, n_z, .))
                    xtreg whatever if z2== 1 & z1 == 0
                    Use your actual variable names for id and wave. This code assumes that z is always either "yes" or "no" (as a string variable) with no other alternatives and no missing values.
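As a quick sanity check of the code in #10, the sketch below runs the same steps on a made-up three-id dataset and flags the ids that switch from "no" in wave 1 to "yes" in wave 2. The indicator name switch_on is hypothetical and only serves the illustration.

```stata
* Made-up data: id 1 switches no -> yes, id 2 switches yes -> no,
* id 3 is always yes
clear
input id wave str3 z
1 1 "no"
1 2 "yes"
2 1 "yes"
2 2 "no"
3 1 "yes"
3 2 "yes"
end

* Same steps as in #10: numeric 0/1 copy of z, then spread each
* wave's value to all observations of the id
label define boolean 0 "no" 1 "yes"
encode z, gen(n_z) label(boolean)
assert inlist(n_z, 0, 1)
isid id wave, sort
by id: egen z1 = max(cond(wave == 1, n_z, .))
by id: egen z2 = max(cond(wave == 2, n_z, .))

* Indicator equivalent to the -if- condition used with -xtreg-
generate byte switch_on = z1 == 0 & z2 == 1
list, sepby(id)
```

Because z1 and z2 are constant within id, the condition keeps both waves of a switching id, so those ids are not reduced to singletons.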

