
  • cluster-invariant independent variable with fixed effects

    Hi everyone,

    I am estimating linear models where the unit of analysis is the county, and counties are nested in states. There is no time component, and the number of counties in each state varies widely. The main independent variable is a dichotomous variable that varies within some of the states but not others. In other words, some states have “ones” or “zeros” in all of their counties, whereas in others the independent variable does vary across counties.

    I might be getting this wrong, but my understanding is that if I run a fixed-effects model to account for state-level unobservables, all counties belonging to states where the independent variable of interest does not vary should be dropped from the analysis, as the state fixed-effect and the independent variable are perfectly collinear in those cases.

    However, when I run the models in Stata this is not the case. I start with the pooled estimator ignoring the nesting of counties in states:

    reg y x, robust

    Now, the number of observations remains the same if I do either:

    xtset state
    xtreg y x, fe vce(cluster state)

    or

    reg y x i.state, vce(cluster state)

    In the latter case, the output gives me coefficients for all states (except of course for the reference category), in other words none are omitted. Shouldn’t some of them be unidentifiable?

    I am confused why nothing is being dropped out in the fixed effect model even if the independent variable is invariant for some of the clusters. Is it appropriate to run a fixed effect model with this kind of data, or should I just treat it as a cross-section perhaps with std errors clustered by state?

    Thank you.
    Last edited by Maria Nolan; 29 Mar 2017, 16:16.
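Before choosing an estimator, it can help to count how many states actually have within-state variation in the regressor. The following is a minimal sketch on made-up data, not the data from this thread; the variable names `state` and `x` follow the commands in the post above, while `varies` and `onestate` are hypothetical helper names. It assumes `x` has no missing values.

```stata
* Toy data: 6 hypothetical states with 20 counties each
clear
set obs 6
gen state = _n
expand 20
bysort state: gen byte x = mod(_n, 2)   // alternating 0/1 within each state
replace x = 0 if state == 5              // an all-control state
replace x = 1 if state == 6              // an all-treated state

* A state "varies" if its smallest and largest x differ
bysort state (x): gen byte varies = x[1] != x[_N]

* Tag one observation per state so each state is counted once
egen byte onestate = tag(state)
tabulate varies if onestate
```

In this toy example the table reports four states with variation and two without. Running the last three commands on real data gives the corresponding split.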

  • #2
    Maria:
    welcome to the list.
    I'm not clear why you're using -xtreg,fe- if your data have no panel structure.
    That said, please post what Stata gave you back, too (as per FAQ). Thanks.
    Kind regards,
    Carlo
    (Stata 18.0 SE)



    • #3
      Carlo,

      Thank you very much for your response. As you point out, I do not have a panel structure since there is no time involved. I am working with hierarchically structured data, with counties nested in states. Substantively, I am only interested in county-level covariates, but as a robustness check I would like to control for state-level heterogeneity by treating the states as fixed effects.

      Ignoring the hierarchical structure of the data gives me this:

      Code:
      reg outcomevar i.treatedcounty, robust
      
      Linear regression                               Number of obs     =      2,228
                                                      F(1, 2226)        =      24.75
                                                      Prob > F          =     0.0000
                                                      R-squared         =     0.0091
                                                      Root MSE          =     3.6775
      
      -------------------------------------------------------------------------------
                    |               Robust
         outcomevar |      Coef.   Std. Err.      t    P>|t|     [95% Conf. Interval]
      --------------+----------------------------------------------------------------
      treatedcounty |
           treated  |  -.8029753   .1613879    -4.98   0.000    -1.119462   -.4864888
              _cons |   6.897014   .0948097    72.75   0.000     6.711089    7.082939
      -------------------------------------------------------------------------------

      Now if I add state dummies and clustered errors this is what I get:


      Code:
      reg outcomevar i.treatedcounty i.statecode, vce(cluster statecode) base
      
      Linear regression                               Number of obs     =      2,228
                                                      F(0, 30)          =          .
                                                      Prob > F          =          .
                                                      R-squared         =     0.1947
                                                      Root MSE          =     3.3379
      
                                    (Std. Err. adjusted for 31 clusters in statecode)
      -------------------------------------------------------------------------------
                    |               Robust
         outcomevar |      Coef.   Std. Err.      t    P>|t|     [95% Conf. Interval]
      --------------+----------------------------------------------------------------
      treatedcounty |
           control  |          0  (base)
           treated  |  -.0865117   .1314476    -0.66   0.515    -.3549635    .1819402
                    |
          statecode |
                 1  |          0  (base)
                 2  |  -3.436622   .0563347   -61.00   0.000    -3.551673   -3.321572
                 3  |  -.5861505   .0563347   -10.40   0.000    -.7012013   -.4710997
                 4  |  -1.645042   .0563347   -29.20   0.000    -1.760093   -1.529991
                 5  |  -3.496759   .0421241   -83.01   0.000    -3.582788    -3.41073
                 6  |  -5.885945    .058682  -100.30   0.000    -6.005789     -5.7661
                 7  |  -6.067232   .0563347  -107.70   0.000    -6.182283   -5.952181
                 8  |  -2.556956   .0563347   -45.39   0.000    -2.672007   -2.441905
                 9  |  -3.826613    .024361  -157.08   0.000    -3.876365   -3.776862
                10  |  -7.198257   .0605076  -118.96   0.000     -7.32183   -7.074684
                11  |   -7.83342   .0018778 -4171.54   0.000    -7.837255   -7.829585
                12  |  -3.674835   .0439005   -83.71   0.000    -3.764491   -3.585178
                13  |  -5.189244   .0606314   -85.59   0.000     -5.31307   -5.065418
                14  |  -3.929479   .0331381  -118.58   0.000    -3.997156   -3.861803
                15  |  -6.057091   .0308632  -196.26   0.000    -6.120122    -5.99406
                16  |  -4.330975   .0037556 -1153.19   0.000    -4.338645   -4.323305
                17  |  -4.237297   .0166918  -253.86   0.000    -4.271386   -4.203208
                18  |   -1.08158   .0563347   -19.20   0.000    -1.196631   -.9665293
                19  |  -5.002663   .0216803  -230.75   0.000    -5.046941   -4.958386
                20  |  -6.748452   .0402259  -167.76   0.000    -6.830604   -6.666299
                21  |  -5.949709    .042251  -140.82   0.000    -6.035997   -5.863421
                22  |    .237328   .0563347     4.21   0.000     .1222772    .3523788
                23  |  -5.576166   .0434477  -128.34   0.000    -5.664898   -5.487434
                24  |  -2.050062   .0093891  -218.34   0.000    -2.069237   -2.030887
                25  |  -2.282927   .0563347   -40.52   0.000    -2.397977   -2.167876
                26  |   .6615083   .0176736    37.43   0.000      .625414    .6976027
                27  |  -2.760879   .0563347   -49.01   0.000     -2.87593   -2.645828
                28  |  -4.950849   .0279136  -177.36   0.000    -5.007857   -4.893842
                29  |  -6.035505   .0431899  -139.74   0.000     -6.12371   -5.947299
                30  |  -2.358259   .0563347   -41.86   0.000     -2.47331   -2.243208
                31  |  -5.282389    .042251  -125.02   0.000    -5.368677   -5.196101
                    |
              _cons |    11.5765   .0563347   205.49   0.000     11.46144    11.69155
      -------------------------------------------------------------------------------

      I use xtreg, fe just to get a shorter output since I am not interested in interpreting the state effects. Isn't that equivalent to manually introducing the state dummies as above? I always thought it was. Essentially I want to remove all variation between states and estimate the parameter of interest relying on the within-state variation only. I noticed there is a slight difference in the standard error when I do xtreg, fe vs including i.statecode as predictor (maybe a degrees of freedom issue?), but overall Stata does seem to be doing the same thing:

      Code:
      xtset statecode
             panel variable:  statecode (unbalanced)
      
      xtreg outcomevar i.treatedcounty, fe vce(cluster statecode) base
      
      Fixed-effects (within) regression               Number of obs     =      2,228
      Group variable: statecode                       Number of groups  =         31
      
      R-sq:                                           Obs per group:
           within  = 0.0001                                         min =          3
           between = 0.2221                                         avg =       71.9
           overall = 0.0091                                         max =        550
      
                                                      F(1,30)           =       0.44
      corr(u_i, Xb)  = 0.1941                         Prob > F          =     0.5126
      
                                    (Std. Err. adjusted for 31 clusters in statecode)
      -------------------------------------------------------------------------------
                    |               Robust
         outcomevar |      Coef.   Std. Err.      t    P>|t|     [95% Conf. Interval]
      --------------+----------------------------------------------------------------
      treatedcounty |
           control  |          0  (base)
           treated  |  -.0865117   .1305589    -0.66   0.513    -.3531484    .1801251
                    |
              _cons |   6.709859   .0341047   196.74   0.000     6.640208     6.77951
      --------------+----------------------------------------------------------------
            sigma_u |  2.2694765
            sigma_e |  3.3379468
                rho |   .3161302   (fraction of variance due to u_i)
      -------------------------------------------------------------------------------

      So as you can see I have 2,228 observations (counties) nested in 31 clusters (states). There is within-state variation in the relevant county-level independent variable in 21 of the 31 clusters. The other ten have ones or zeros in all of their units. When I introduce state fixed effects, I assume the coefficient on "treatedcounty" is being identified only using the observations in those 21 states?

      I am not sure fixed effects is the way to go here since I am throwing away a lot of information, but on the other hand there is the concern of controlling for potential unobserved state effects. Independently of this modeling decision, however, I am puzzled as to whether there is any difference between restricting the sample to those 21 states and running the analysis with the entire sample of counties and fixed effects.

      Thank you again!

      Last edited by Maria Nolan; 30 Mar 2017, 09:59.



      • #4
        I use xtreg, fe just to get a shorter output since I am not interested in interpreting the state effects. Isn't that equivalent to manually introducing the state dummies as above? I always thought it was. Essentially I want to remove all variation between states and estimate the parameter of interest relying on the within-state variation only. I noticed there is a slight difference in the standard error when I do xtreg, fe vs including i.statecode as predictor (maybe a degrees of freedom issue?), but overall
        Correct.

        The other ten have ones or zeros in all of their units. When I introduce state fixed effects, I assume the coefficient on "treatedcounty" is being identified only using the observations in those 21 states?
        Not true. A variable which is constant within county in every county is colinear with the fixed effects and gets omitted. But when it is constant within some counties, that is not a colinearity problem and the variable is retained. The observations in those counties are also retained. And they matter. If you drop those counties from the analysis, the results will change, and they will most likely be wrong due to selection bias from excluding those counties!

        Yes, you need the fixed effects here. If there are state-level influences on the outcome variable that apply to all the counties within a state but not to counties in the other states, then you have non-independent observations at the county level. This makes the OLS results invalid: independence of observations is a critical assumption there. You must use -xtreg- for this. Ordinarily, I would hedge this warning by saying that you could ignore this if there are actually no state-level effects on the variable. But the results themselves clearly show that there are: your intrastate correlation estimate came out rho = 0.316. That is a level of non-independence of observations that is far too large to simply ignore.
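The equivalence asked about in #3 can be checked directly. This is a rough sketch on simulated data (8 hypothetical states of 25 counties, not the thread's data); the scalar names are made up for the illustration. The coefficient on `x` should agree between the dummy-variable regression and -xtreg, fe-, while the clustered standard errors differ slightly. In #3 the two standard errors differ by a factor of about 1.0068, which is what sqrt((2228-2)/(2228-32)) gives, so the gap there looks like a degrees-of-freedom correction in which the dummy-variable regression counts the state dummies and -xtreg, fe- does not. That reading is an inference from the posted output, not something stated in the thread.

```stata
* Simulated data: 8 hypothetical states, 25 counties each
clear
set seed 12345
set obs 8
gen state = _n
gen u = rnormal()                  // state-level unobservable
expand 25
gen byte x = runiform() < 0.5      // county-level binary regressor
gen y = 1 + u + 0.5*x + rnormal()

* (a) explicit state dummies
regress y x i.state, vce(cluster state)
scalar b_lsdv  = _b[x]
scalar se_lsdv = _se[x]

* (b) within (fixed-effects) estimator
xtset state
xtreg y x, fe vce(cluster state)
scalar b_fe  = _b[x]
scalar se_fe = _se[x]

display "coef (dummies) = " b_lsdv "   coef (xtreg, fe) = " b_fe
display "SE   (dummies) = " se_lsdv "   SE   (xtreg, fe) = " se_fe

* Ratio implied by the sample sizes reported in #3 (N = 2,228; 32 vs 2 parameters)
display sqrt((2228 - 2)/(2228 - 32))
```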



        • #5
          Maria:
          thanks for further clarifications.
          Clyde gave, as usual, excellent and comprehensive advice.
          I have an aside only: in your first regression (pooled OLS) you robustified the standard errors (SEs): that takes heteroskedasticity into account, but not the violation of independence of observations (an OLS prerequisite): for that, the -cluster()- option comes in handy.
          It is also worth noting that the above-mentioned difference does not hold, say, for -xtreg-, where the two SE options do the same job.
          Kind regards,
          Carlo
          (Stata 18.0 SE)
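Carlo's aside can be illustrated with a short sketch on simulated data (hypothetical states and counties, not the data from this thread; scalar names are made up). Under -xtreg, fe-, `vce(robust)` and `vce(cluster state)` should report the same standard error when `state` is the panel variable, whereas under -regress- the two options are different estimators.

```stata
* Simulated data: 8 hypothetical states, 25 counties each
clear
set seed 4321
set obs 8
gen state = _n
gen u = rnormal()                  // state-level unobservable
expand 25
gen byte x = runiform() < 0.5
gen y = 1 + u + 0.5*x + rnormal()
xtset state

* With -xtreg, fe-, robust is treated as clustering on the panel variable
xtreg y x, fe vce(robust)
scalar se_rob = _se[x]
xtreg y x, fe vce(cluster state)
scalar se_cl = _se[x]
display "xtreg robust SE = " se_rob "   xtreg cluster SE = " se_cl

* With -regress-, the two options generally give different standard errors
regress y x, vce(robust)
regress y x, vce(cluster state)
```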



          • #6

            Thank you so much, Clyde and Carlo.

            Clyde, when you said

            A variable which is constant within county in every county is colinear with the fixed effects and gets omitted. But when it is constant within some counties, that is not a colinearity problem and the variable is retained.
            You meant a county-level variable which is constant in every state would be colinear with the fixed effects and be omitted, but when it is constant within some states only, there is no colinearity problem?

            This clarifies the issue. Thanks again for your generous help!



            • #7
              You meant a county-level variable which is constant in every state would be colinear with the fixed effects and be omitted, but when it is constant within some states only, there is no colinearity problem?
              Yes. Sorry for the error.



              • #8
                I think there is still a bit of confusion here. Clyde is correct that a variable will only drop out if it has no within-state variation for any state. And, of course, if that were the case, Stata would not give you a coefficient estimate on treatedcounty.

                Having said that, Maria is also correct in thinking that identification is purely off of the states with some variation across counties within the state. Without any other controls that have some within-state variation, the FE estimate using all of the states will be identical to dropping the 10 states without any variation in treatedcounty.

                To see this, do the following:

                Code:
                egen totaltreat = sum(treatedcounty), by(statecode)
                xtreg outcomevar treatedcounty if totaltreat > 0, fe
                You should get an identical answer on the treatedcounty variable as using all of the states.
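The algebra behind this can be written out for the case with a single regressor. The notation here is an assumption introduced for illustration: $s$ indexes states, $c$ indexes counties within a state, and bars denote state means.

```latex
\hat{\beta}_{FE}
  = \frac{\sum_{s}\sum_{c} (x_{sc} - \bar{x}_{s})(y_{sc} - \bar{y}_{s})}
         {\sum_{s}\sum_{c} (x_{sc} - \bar{x}_{s})^{2}}
```

If $x_{sc}$ is constant within state $s$, then $x_{sc} - \bar{x}_{s} = 0$ for every county in that state, so the state adds zero to both the numerator and the denominator. The point estimate is therefore unchanged when such states are dropped, whether they are all-treated or all-control.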

                Let me also say that I suspect fixed effects will be the only convincing analysis here unless you have some sort of random assignment to treatment and control. Fixed effects is doing what it should: it accounts for state-level differences and identifies the effect off of states that have some variation in treatment status. This is not a bad thing.

                In your particular application, you don't get a significant result. The standard error is large relative to the coefficient estimate. That's the way it goes. You might try putting in state-level controls -- as many as you can find -- and use a random effects analysis. But that will always be less robust than fixed effects.

                I hope this helps.

                Jeff



                • #9
                  Jeff is right in what he says. My statement was accurate, but unclearly worded. The coefficient estimates will be the same whether you include or exclude the counties for which the variable is constant. But the standard errors will be different (and, hence, the confidence intervals and p-values will differ), as will the estimates of the variance components. See the following example:

                  Code:
                  set more off
                  clear
                  set obs 10
                  set seed 1234
                  gen id = _n
                  gen fe = rnormal(0, 1)
                  expand 10
                  gen x = rnormal(0, 1)
                  gen y = 2.5 + fe + 7*x + rnormal(0, 1)
                  
                  replace x = 0.5 if id == 9
                  replace x = -0.5 if id == 10
                  
                  xtset id
                  
                  xtreg y x, fe
                  
                  xtreg y x if id <= 8, fe



                  • #10
                    I agree with Clyde that the standard errors can be a bit different, but that, I think, is due to degrees-of-freedom conventions. A quirk in the above code causes bigger differences than one would see in practice. Notice that x is set to a constant for ids 9 and 10 after y was generated. I would have generated y after my x variable was generated; otherwise, it introduces a serious misspecification. Now the result on dropping ids 9 and 10 for beta hat is algebraic. But the above way of generating the data exaggerates the likely differences. The small cross-sectional size likely has something to do with it, too.

                    Code:
                    set more off
                    clear
                    set obs 10
                    set seed 1234
                    gen id = _n
                    gen fe = rnormal(0, 1)
                    expand 10
                    gen x = rnormal(0, 1)
                    replace x = 0.5 if id == 9
                    replace x = -0.5 if id == 10  
                    gen y = 2.5 + fe + 7*x + rnormal(0, 1)  
                    xtset id
                    xtreg y x, fe
                    xtreg y x if id <= 8, fe



                    • #11
                      Thank you very much, Jeff and Clyde, for an informative exchange and the time you dedicate to this forum.

                      I just have one remaining question: are state clustered errors appropriate in this type of setting, with 20 to 30 clusters? I understand the need to account for potential within-cluster correlation in the error, but I want to make sure the number of groups is large enough, given the asymptotic assumption for cluster-robust inference.

                      I am analyzing several outcome variables with this hierarchically structured data. With some of my outcomes and always introducing state fixed effects, clustering the errors versus only accounting for heteroskedasticity (by manually introducing the state dummies in Stata with i.statecode and specifying the robust option) can lead to different inferences. The difference in the errors is typically not too large, but in some cases can be large enough to make the difference for statistical significance at conventional levels. I assume the best approach here is to be conservative and opt for the clustered errors, but I wonder if you have any particular suggestions on this.

                      As for the results when states without any variation in the independent variable are dropped or not, the FE estimates are indeed identical, and there is a very small difference in the standard error (as long as no other county-level controls are included, of course).

                      With the full sample:

                      Code:
                      xtset statecode
                      xtreg outcomevar treatedcounty, fe vce(cluster statecode)
                      
                      Fixed-effects (within) regression               Number of obs     =      2,228
                      Group variable: statecode                       Number of groups  =         31
                      
                      R-sq:                                           Obs per group:
                           within  = 0.0001                                         min =          3
                           between = 0.2221                                         avg =       71.9
                           overall = 0.0091                                         max =        550
                      
                                                                      F(1,30)           =       0.44
                      corr(u_i, Xb)  = 0.1941                         Prob > F          =     0.5126
                      
                                                    (Std. Err. adjusted for 31 clusters in statecode)
                      -------------------------------------------------------------------------------
                                    |               Robust
                         outcomevar |      Coef.   Std. Err.      t    P>|t|     [95% Conf. Interval]
                      --------------+----------------------------------------------------------------
                      treatedcounty |  -.0865117   .1305589    -0.66   0.513    -.3531484    .1801251
                              _cons |   6.709859   .0341047   196.74   0.000     6.640208     6.77951
                      --------------+----------------------------------------------------------------
                            sigma_u |  2.2694765
                            sigma_e |  3.3379468
                                rho |   .3161302   (fraction of variance due to u_i)
                      -------------------------------------------------------------------------------

                      Dropping states without county-level variation in the independent variable:

                      Code:
                      egen totaltreat = sum(treatedcounty), by(statecode)
                      
                      xtreg outcomevar treatedcounty if totaltreat > 0, fe vce(cluster statecode)
                      
                      Fixed-effects (within) regression               Number of obs     =      1,797
                      Group variable: statecode                       Number of groups  =         21
                      
                      R-sq:                                           Obs per group:
                           within  = 0.0001                                         min =          7
                           between = 0.0651                                         avg =       85.6
                           overall = 0.0011                                         max =        550
                      
                                                                      F(1,20)           =       0.43
                      corr(u_i, Xb)  = 0.0613                         Prob > F          =     0.5185
                      
                                                    (Std. Err. adjusted for 21 clusters in statecode)
                      -------------------------------------------------------------------------------
                                    |               Robust
                         outcomevar |      Coef.   Std. Err.      t    P>|t|     [95% Conf. Interval]
                      --------------+----------------------------------------------------------------
                      treatedcounty |  -.0865117   .1316146    -0.66   0.518     -.361055    .1880317
                              _cons |   6.293216   .0426264   147.64   0.000     6.204299    6.382134
                      --------------+----------------------------------------------------------------
                            sigma_u |  2.1237033
                            sigma_e |  3.4129803
                                rho |  .27911636   (fraction of variance due to u_i)
                      -------------------------------------------------------------------------------

                      Thanks again!



                      • #12
                        Hi Maria:

                        Clustering with only 31 clusters, when some of the cluster sizes are really large, is a stretch. It could work okay. I might bootstrap just to see that the results are similar. This doesn't prove it's okay, but hopefully the cluster bootstrap SEs are similar.
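One way to run the comparison suggested here is sketched below on simulated data (30 hypothetical states of 20 counties, not the thread's data). It relies on -xtreg- resampling whole panels, here states, when `vce(bootstrap)` is specified after -xtset-. The number of replications, the seed, and the scalar names are arbitrary choices for the illustration.

```stata
* Simulated data: 30 hypothetical states, 20 counties each
clear
set seed 987
set obs 30
gen state = _n
gen u = rnormal()                  // state-level unobservable
expand 20
gen byte x = runiform() < 0.5
gen y = 1 + u + 0.5*x + rnormal()
xtset state

* Analytic cluster-robust standard error
xtreg y x, fe vce(cluster state)
scalar se_cluster = _se[x]

* Bootstrap standard error, resampling states rather than counties
xtreg y x, fe vce(bootstrap, reps(200) seed(987))
scalar se_boot = _se[x]

display "cluster-robust SE = " se_cluster "   bootstrap SE = " se_boot
```

With the real data, the same two -xtreg- calls with outcomevar, treatedcounty and statecode in place of the simulated variables would give the two standard errors to compare.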
