  • Struggling with Collinearity in Panel Data

    Hi,

    I am struggling with how to set up my regression robustness check. I am currently running a regression of recycling rates on income, population density and several other variables.

    I have panel data on 350 local authorities over 20 quarters, so I have used
    xtset acode qdate
    xtreg recycling loginc logpopden loghhsize unitarydummy md11 md12 md13 md14 md15 md16 md17 md18 md19 md20 md21 md22 md23 md24 md25 md26 md27 md28 md29 md291 wasteavg dryavg quarter2 quarter3 quarter4, fe vce(robust)


    Some local authorities are unitary and some are not. I would like to test whether unitary authorities have a higher recycling rate. However, whenever I include my dummy for unitary status (it takes the value one if the local authority is unitary), I run into collinearity. Have I missed something?
    Should I run two separate regressions, one for unitary = 1 and one for unitary = 0, and test whether they are statistically different?

    Below is a sample of my data.
    Thank you!

    Code:
    * Example generated by -dataex-. To install: ssc install dataex
    clear
    input str43 name str9 code int year byte quarter float(qdate quarter2 quarter3 quarter4 recycling loginc logpopden loghhsize) byte unitarydummy
    "Hartlepool Borough Council"           "E06000001" 2012 1 208 0 0 0  31.64322 9.576926 2.2529745 .8586616 1
    "Hartlepool Borough Council"           "E06000001" 2012 2 209 1 0 0  29.11372 9.576926  2.261659 .8586616 1
    "Hartlepool Borough Council"           "E06000001" 2012 3 210 0 1 0 24.318804 9.576926  2.261659 .8586616 1
    "Hartlepool Borough Council"           "E06000001" 2012 4 211 0 0 1  23.49204 9.576926  2.261659 .8586616 1
    "Hartlepool Borough Council"           "E06000001" 2013 1 212 0 0 0  29.75906  9.58011 2.2631164 .8628899 1
    "Hartlepool Borough Council"           "E06000001" 2013 2 213 1 0 0 25.608576  9.58011 2.2631164 .8628899 1
    "Hartlepool Borough Council"           "E06000001" 2013 3 214 0 1 0  19.14898  9.58011 2.2631164 .8628899 1
    "Hartlepool Borough Council"           "E06000001" 2013 4 215 0 0 1 26.363016  9.58011 2.2631164 .8628899 1
    "Hartlepool Borough Council"           "E06000001" 2014 1 216 0 0 0  29.21854 9.606159 2.2677865 .8671005 1
    "Hartlepool Borough Council"           "E06000001" 2014 2 217 1 0 0 22.891203 9.606159 2.2631164 .8671005 1
    "Hartlepool Borough Council"           "E06000001" 2014 3 218 0 1 0 22.664324 9.606159 2.2677865 .8671005 1
    "Hartlepool Borough Council"           "E06000001" 2014 4 219 0 0 1    19.374 9.606159 2.2677865 .8671005 1
    "Hartlepool Borough Council"           "E06000001" 2015 1 220 0 0 0  25.64599 9.642772 2.2669578 .8671005 1
    "Hartlepool Borough Council"           "E06000001" 2015 2 221 1 0 0 24.093536 9.642772 2.2669578 .8671005 1
    "Hartlepool Borough Council"           "E06000001" 2015 3 222 0 1 0 23.910435 9.642772 2.2669578 .8671005 1
    "Hartlepool Borough Council"           "E06000001" 2015 4 223 0 0 1  22.74303 9.642772 2.2669578 .8671005 1
    "Hartlepool Borough Council"           "E06000001" 2016 1 224 0 0 0 25.628105 9.620527  2.265921 .8712934 1
    "Hartlepool Borough Council"           "E06000001" 2016 2 225 1 0 0 21.490993 9.620527  2.265921 .8712934 1
    "Hartlepool Borough Council"           "E06000001" 2016 3 226 0 1 0  20.56008 9.620527  2.265921 .8712934 1
    "Hartlepool Borough Council"           "E06000001" 2016 4 227 0 0 1  19.26644 9.620527  2.265921 .8712934 1
    "Middlesbrough Borough Council"        "E06000002" 2012 1 208 0 0 0  15.01909 9.554639  3.262778 .8586616 1
    "Middlesbrough Borough Council"        "E06000002" 2012 2 209 1 0 0 14.171424 9.554639  3.234316 .8586616 1
    "Middlesbrough Borough Council"        "E06000002" 2012 3 210 0 1 0 13.453314 9.554639  3.234316 .8586616 1
    "Middlesbrough Borough Council"        "E06000002" 2012 4 211 0 0 1  13.24626 9.554639  3.234316 .8586616 1
    "Middlesbrough Borough Council"        "E06000002" 2013 1 212 0 0 0  14.57947 9.564863  3.236794 .8628899 1
    "Middlesbrough Borough Council"        "E06000002" 2013 2 213 1 0 0 14.828068 9.564863  3.236794 .8628899 1
    "Middlesbrough Borough Council"        "E06000002" 2013 3 214 0 1 0 14.709766 9.564863  3.236794 .8628899 1
    "Middlesbrough Borough Council"        "E06000002" 2013 4 215 0 0 1  20.44064 9.564863  3.236794 .8628899 1
    "Middlesbrough Borough Council"        "E06000002" 2014 1 216 0 0 0 34.159927 9.601301 3.2381685 .8671005 1
    "Middlesbrough Borough Council"        "E06000002" 2014 2 217 1 0 0  24.25953 9.601301  3.236794 .8671005 1
    "Middlesbrough Borough Council"        "E06000002" 2014 3 218 0 1 0  24.04574 9.601301 3.2381685 .8671005 1
    "Middlesbrough Borough Council"        "E06000002" 2014 4 219 0 0 1    24.127 9.601301 3.2381685 .8671005 1
    "Middlesbrough Borough Council"        "E06000002" 2015 1 220 0 0 0  27.94404 9.626811  3.239502 .8671005 1
    "Middlesbrough Borough Council"        "E06000002" 2015 2 221 1 0 0 23.334343 9.626811  3.239502 .8671005 1
    "Middlesbrough Borough Council"        "E06000002" 2015 3 222 0 1 0  19.43632 9.626811  3.239502 .8671005 1
    "Middlesbrough Borough Council"        "E06000002" 2015 4 223 0 0 1  23.32905 9.626811  3.239502 .8671005 1
    "Middlesbrough Borough Council"        "E06000002" 2016 1 224 0 0 0  23.50695 9.613669   3.24228 .8712934 1
    "Middlesbrough Borough Council"        "E06000002" 2016 2 225 1 0 0 20.070557 9.613669   3.24228 .8712934 1
    "Middlesbrough Borough Council"        "E06000002" 2016 3 226 0 1 0 19.601873 9.613669   3.24228 .8712934 1
    "Middlesbrough Borough Council"        "E06000002" 2016 4 227 0 0 1  21.82984 9.613669   3.24228 .8712934 1
    "Redcar and Cleveland Borough Council" "E06000003" 2012 1 208 0 0 0  23.51096 9.537339  1.728642 .8586616 1
    "Redcar and Cleveland Borough Council" "E06000003" 2012 2 209 1 0 0  20.10306 9.537339  1.712536 .8586616 1
    "Redcar and Cleveland Borough Council" "E06000003" 2012 3 210 0 1 0  19.72007 9.537339  1.712536 .8586616 1
    "Redcar and Cleveland Borough Council" "E06000003" 2012 4 211 0 0 1 22.403687 9.537339  1.712536 .8586616 1
    "Redcar and Cleveland Borough Council" "E06000003" 2013 1 212 0 0 0 24.170063 9.545955  1.710911 .8628899 1
    "Redcar and Cleveland Borough Council" "E06000003" 2013 2 213 1 0 0  23.99578 9.545955  1.710911 .8628899 1
    "Redcar and Cleveland Borough Council" "E06000003" 2013 3 214 0 1 0 24.617693 9.545955  1.710911 .8628899 1
    "Redcar and Cleveland Borough Council" "E06000003" 2013 4 215 0 0 1  30.60893 9.545955  1.710911 .8628899 1
    "Redcar and Cleveland Borough Council" "E06000003" 2014 1 216 0 0 0  36.69286 9.575816 1.7105495 .8671005 1
    "Redcar and Cleveland Borough Council" "E06000003" 2014 2 217 1 0 0  23.24818 9.575816  1.710911 .8671005 1
    "Redcar and Cleveland Borough Council" "E06000003" 2014 3 218 0 1 0  28.21425 9.575816 1.7105495 .8671005 1
    "Redcar and Cleveland Borough Council" "E06000003" 2014 4 219 0 0 1    33.378 9.575816 1.7105495 .8671005 1
    "Redcar and Cleveland Borough Council" "E06000003" 2015 1 220 0 0 0 34.828026 9.600556  1.711272 .8671005 1
    "Redcar and Cleveland Borough Council" "E06000003" 2015 2 221 1 0 0 26.965475 9.600556  1.711272 .8671005 1
    "Redcar and Cleveland Borough Council" "E06000003" 2015 3 222 0 1 0 18.484371 9.600556  1.711272 .8671005 1
    "Redcar and Cleveland Borough Council" "E06000003" 2015 4 223 0 0 1   22.8024 9.600556  1.711272 .8671005 1
    "Redcar and Cleveland Borough Council" "E06000003" 2016 1 224 0 0 0 25.000637 9.583902  1.713077 .8712934 1
    "Redcar and Cleveland Borough Council" "E06000003" 2016 2 225 1 0 0  23.34894 9.583902  1.713077 .8712934 1
    "Redcar and Cleveland Borough Council" "E06000003" 2016 3 226 0 1 0  20.75586 9.583902  1.713077 .8712934 1
    "Redcar and Cleveland Borough Council" "E06000003" 2016 4 227 0 0 1 25.922733 9.583902  1.713077 .8712934 1
    "Stockton-on-Tees Borough Council"     "E06000004" 2012 1 208 0 0 0  20.72435 9.607841   2.19778 .8586616 1
    "Stockton-on-Tees Borough Council"     "E06000004" 2012 2 209 1 0 0  19.85636 9.607841 2.1946657 .8586616 1
    "Stockton-on-Tees Borough Council"     "E06000004" 2012 3 210 0 1 0 17.278917 9.607841 2.1946657 .8586616 1
    "Stockton-on-Tees Borough Council"     "E06000004" 2012 4 211 0 0 1  20.15345 9.607841 2.1946657 .8586616 1
    "Stockton-on-Tees Borough Council"     "E06000004" 2013 1 212 0 0 0  22.03593 9.608176  2.197891 .8628899 1
    "Stockton-on-Tees Borough Council"     "E06000004" 2013 2 213 1 0 0 18.222332 9.608176  2.197891 .8628899 1
    "Stockton-on-Tees Borough Council"     "E06000004" 2013 3 214 0 1 0  17.56083 9.608176  2.197891 .8628899 1
    "Stockton-on-Tees Borough Council"     "E06000004" 2013 4 215 0 0 1  19.13032 9.608176  2.197891 .8628899 1
    "Stockton-on-Tees Borough Council"     "E06000004" 2014 1 216 0 0 0 21.184946 9.632138  2.201991 .8671005 1
    "Stockton-on-Tees Borough Council"     "E06000004" 2014 2 217 1 0 0 13.187984 9.632138  2.201991 .8671005 1
    "Stockton-on-Tees Borough Council"     "E06000004" 2014 3 218 0 1 0 16.005465 9.632138  2.201991 .8671005 1
    "Stockton-on-Tees Borough Council"     "E06000004" 2014 4 219 0 0 1    18.041 9.632138  2.201991 .8671005 1
    "Stockton-on-Tees Borough Council"     "E06000004" 2015 1 220 0 0 0 20.246767 9.662816  2.206735 .8671005 1
    "Stockton-on-Tees Borough Council"     "E06000004" 2015 2 221 1 0 0 16.660748 9.662816  2.206735 .8671005 1
    "Stockton-on-Tees Borough Council"     "E06000004" 2015 3 222 0 1 0  15.44666 9.662816  2.206735 .8671005 1
    "Stockton-on-Tees Borough Council"     "E06000004" 2015 4 223 0 0 1 18.055502 9.662816  2.206735 .8671005 1
    "Stockton-on-Tees Borough Council"     "E06000004" 2016 1 224 0 0 0 17.611841 9.641798 2.2102504 .8712934 1
    "Stockton-on-Tees Borough Council"     "E06000004" 2016 2 225 1 0 0 14.851618 9.641798 2.2102504 .8712934 1
    "Stockton-on-Tees Borough Council"     "E06000004" 2016 3 226 0 1 0 14.507548 9.641798 2.2102504 .8712934 1
    "Stockton-on-Tees Borough Council"     "E06000004" 2016 4 227 0 0 1  17.83563 9.641798 2.2102504 .8712934 1
    "Darlington Borough Council"           "E06000005" 2012 1 208 0 0 0  42.36351 9.570878 1.6325684 .8586616 1
    "Darlington Borough Council"           "E06000005" 2012 2 209 1 0 0 32.836056 9.570878  1.678964 .8586616 1
    "Darlington Borough Council"           "E06000005" 2012 3 210 0 1 0  31.34309 9.570878  1.678964 .8586616 1
    "Darlington Borough Council"           "E06000005" 2012 4 211 0 0 1   32.1052 9.570878  1.678964 .8586616 1
    "Darlington Borough Council"           "E06000005" 2013 1 212 0 0 0 28.556936 9.597573 1.6757873 .8628899 1
    "Darlington Borough Council"           "E06000005" 2013 2 213 1 0 0 30.584833 9.597573 1.6757873 .8628899 1
    "Darlington Borough Council"           "E06000005" 2013 3 214 0 1 0  28.63693 9.597573 1.6757873 .8628899 1
    "Darlington Borough Council"           "E06000005" 2013 4 215 0 0 1  21.55379 9.597573 1.6757873 .8628899 1
    "Darlington Borough Council"           "E06000005" 2014 1 216 0 0 0  20.11268 9.606159 1.6770965 .8671005 1
    "Darlington Borough Council"           "E06000005" 2014 2 217 1 0 0 26.908495 9.606159 1.6757873 .8671005 1
    "Darlington Borough Council"           "E06000005" 2014 3 218 0 1 0  25.60923 9.606159 1.6770965 .8671005 1
    "Darlington Borough Council"           "E06000005" 2014 4 219 0 0 1  30.62015 9.606159 1.6770965 .8671005 1
    "Darlington Borough Council"           "E06000005" 2015 1 220 0 0 0 31.480186  9.65098 1.6769096 .8671005 1
    "Darlington Borough Council"           "E06000005" 2015 2 221 1 0 0  29.34417  9.65098 1.6769096 .8671005 1
    "Darlington Borough Council"           "E06000005" 2015 3 222 0 1 0   29.5002  9.65098 1.6769096 .8671005 1
    "Darlington Borough Council"           "E06000005" 2015 4 223 0 0 1 25.388384  9.65098 1.6769096 .8671005 1
    "Darlington Borough Council"           "E06000005" 2016 1 224 0 0 0  29.34344 9.647757 1.6770965 .8712934 1
    "Darlington Borough Council"           "E06000005" 2016 2 225 1 0 0  27.49854 9.647757 1.6770965 .8712934 1
    "Darlington Borough Council"           "E06000005" 2016 3 226 0 1 0  28.14532 9.647757 1.6770965 .8712934 1
    "Darlington Borough Council"           "E06000005" 2016 4 227 0 0 1  28.00815 9.647757 1.6770965 .8712934 1
    end
    format %tq qdate

  • #2
    Darcy,

    In this example, unitarydummy is 1 for all panels, so I cannot replicate the problem. Nevertheless, if unitarydummy is time-invariant, it will be dropped from the model because the panel fixed effects already absorb that information. This is similar to how a gender dummy is automatically dropped by xtreg, fe in a panel of people. This Statalist discussion will be helpful (fixed-effects and time-invariant variables), as will this Stack Exchange post on how to keep time-invariant variables in FE models (Link).
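
    A quick way to confirm this in your own data is to check whether the dummy varies within any panel. The sketch below uses a small simulated panel: the data, the seed and the covariate x are made up for illustration, and only the names acode, qdate, recycling and unitarydummy follow this thread. A within-panel standard deviation of zero in every panel means the variable is time-invariant, and xtreg, fe will then omit it.

    ```stata
    * Toy panel (hypothetical data): 4 authorities observed over 5 quarters
    clear
    set seed 12345
    set obs 4
    generate acode = _n
    generate byte unitarydummy = acode > 2      // fixed within each authority
    expand 5                                    // 5 quarters per authority
    bysort acode: generate qdate = _n
    generate double x = rnormal()               // time-varying covariate (made up)
    generate double recycling = 20 + 2*x + 3*unitarydummy + rnormal()
    xtset acode qdate

    * within-panel standard deviation: zero everywhere means no within variation
    bysort acode: egen double sd_unitary = sd(unitarydummy)
    summarize sd_unitary

    * xtsum splits the variation into overall, between and within parts
    xtsum unitarydummy x

    * the fixed-effects estimator therefore omits unitarydummy
    xtreg recycling x unitarydummy, fe
    ```

    On the real data, only the last three commands are needed, run after xtset acode qdate.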
    Last edited by Ashish Tyagi; 28 Feb 2019, 04:36. Reason: Added Statalist post link.

    • #3
      Ashish,

      Thank you so much, that is very helpful. Can you recommend a way to test whether a local authority being unitary (dummy = 1) is a significant factor in its recycling rate?
      I am not sure how to approach this.

      • #4
        Darcy,

        Based on this thread (here) and a quick reading of Schunck (2013), you can try the method below. I worked with the 100-observation example you provided and set unitarydummy = 0 for the first 2 utilities (otherwise it is just a vector of ones).

        Code:
        encode code, gen(acode) // convert the string code to a numeric panel id
        replace unitarydummy = 0 if acode <= 2 // otherwise unitarydummy is a constant vector of ones
        xtset acode qdate
        * no adjustment: unitarydummy is dropped
        xtreg recycling loginc logpopden loghhsize unitarydummy quarter2 quarter3 quarter4, fe vce(robust)
        est store base
        
        * Try hybrid model, undertake transformation
        * Post 4, url: https://www.statalist.org/forums/forum/general-stata-discussion/general/485328-time-invariant-variables-in-fixed-effects-model
        foreach var of varlist loginc logpopden loghhsize {
            by acode, sort : egen `var'_between = mean(`var')
            generate `var'_within = `var' - `var'_between
        }
        
        * run hybrid model - Post #4
        xtreg recycling loginc_between loginc_within logpopden_between logpopden_within loghhsize_between loghhsize_within ///
            unitarydummy quarter2 quarter3 quarter4, re vce(robust)
        est store hybrid1
        
        * run hybrid model without 'within', replaced by non-transformed covariate, post #6
        xtreg recycling loginc_between loginc logpopden_between logpopden loghhsize_between loghhsize ///
            unitarydummy quarter2 quarter3 quarter4, re vce(robust)
        est store hybrid2
        
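        * note: r_2 is not a stored result, so its row in the table is blank; xtreg stores e(r2_w), e(r2_b) and e(r2_o)
        est table base hybrid1 hybrid2, star stats(N r_2)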
        
        . est table base hybrid1 hybrid2, star stats(N r_2)
        
        --------------------------------------------------------------
            Variable |     base           hybrid1         hybrid2     
        -------------+------------------------------------------------
              loginc |  49.410769                       49.410782     
           logpopden | -264.09921**                    -264.09923***  
           loghhsize | -138.03425                       -138.0343     
        unitarydummy |  (omitted)      -4.7426959      -4.7426959     
            quarter2 | -3.4954348      -3.4954347*     -3.4954348*    
            quarter3 | -4.6135072*      -4.613507**    -4.6135072**   
            quarter4 | -2.9118043*      -2.911804**    -2.9118042**   
        loginc_bet~n |                 -40.544609      -89.955391     
        loginc_wit~n |                  49.411277                     
        logpopden~en |                 -7.6251447**     256.47409***  
        logpopden~in |                 -264.09992***                  
        loghhsize~en |                  (omitted)       (omitted)     
        loghhsize~in |                 -138.03642                     
               _cons |  257.12663       435.24388       554.70003     
        -------------+------------------------------------------------
                   N |        100             100             100     
                 r_2 |                                                
        --------------------------------------------------------------
                              legend: * p<0.05; ** p<0.01; *** p<0.001
        
        * Interpretation: the *_within coefficients correspond to the FE model estimates.
        * With 100 obs, loghhsize_between is omitted due to collinearity, but that could be due to the sparse example data.
        * Additional reading: Schunck, Reinhard. (2013). Within and between estimates in random-effects models: Advantages and drawbacks of correlated random effects and hybrid models. The Stata Journal, 13(1):65-76.

        • #5
          Ashish, thank you again, this was very helpful.

          I am unsure which estimate to use, hybrid1 or hybrid2. And what is the test actually doing?

          Also, am I correct to have calculated within and between estimates for the other variables in my regression, which refer to the method and frequency of recycling?
          Below is part of my code.

          //hybrid model for unitary
          foreach var of varlist loginc logpopden loghhsize md11 md12 md13 md14 md15 md16 md17 md18 md19 md20 md21 md22 md23 md24 md25 md26 md27 md28 md29 md291 md31 md32 md33 md34 md35 md36 md37 wasteavg dryavg {
              by acode, sort : egen `var'_between = mean(`var')
              generate `var'_within = `var' - `var'_between
          }

          xtreg recycling loginc logpopden loghhsize unitarydummy md11 md12 md13 md14 md15 md16 md17 md18 md19 md20 md21 md22 md23 md24 md25 md26 md27 md28 md29 md291 wasteavg dryavg quarter2 quarter3 quarter4, fe vce(cluster acode)
          est store base
          //Step 1
          xtreg recycling loginc_between loginc_within logpopden_between logpopden_within loghhsize_between loghhsize_within md11_between unitarydummy md11_within md12_between md12_within md13_between md13_within md14_between md14_within md15_between md15_within md16_between md16_within md17_between md17_within md18_between md18_within md19_between md19_within md20_between md20_within md21_between md21_within md22_between md22_within md23_between md23_within md24_between md24_within md25_between md25_within md26_between md26_within md27_between md27_within md28_between md28_within md29_between md29_within md291_between md291_within md31_between wasteavg_between wasteavg_within dryavg_between dryavg_within quarter2 quarter3 quarter4, re vce(robust)
          est store hybrid1
          //Step 2
          xtreg recycling loginc_between loginc logpopden_between logpopden loghhsize_between loghhsize unitarydummy md11_between md11 md12_between md12 md13_between md13 md14_between md14 md15_between md15 md16_between md16 md17_between md17 md18_between md18 md19_between md19 md20_between md20 md21_between md21 md22_between md22 md23_between md23 md24_between md24 md25_between md25 md26_between md26 md27_between md27 md28_between md28 md29_between md29 md291_between md291 wasteavg_between wasteavg dryavg_between dryavg quarter2 quarter3 quarter4, re vce(robust)
          est store hybrid2


          est table base hybrid1 hybrid2, star stats(N r_2)


          • #6
            Darcy,

            The code looks fine. Schunck (2013) explains the reasoning quite well. The FE estimator uses only within variation (changes within a utility over time) and dummies out between variation (across utilities). As a result, a variable like unitary is dropped because it has no within variation. One can estimate a between-effects (BE) model to find the effect of unitary status on recycling, but a pure BE model would ignore variation within a utility over time. The random-effects model is a weighted average of BE and FE, and the way it is implemented in this method makes the BE and FE parts explicit. Please give Schunck (2013) a detailed read. This link explaining the difference between BE and FE may also help clear things up (link).

            As for hybrid1 and hybrid2, hybrid1 is the technically correct model (the *_within coefficients are the FE estimates).
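
            To see the FE, BE and hybrid estimators side by side, here is a minimal sketch on a simulated balanced panel. Everything in it is an assumption for illustration (the data-generating process, the seed, the single covariate x); it is not the recycling data. It is only meant to show that the coefficient on x_within in the hybrid model reproduces the FE coefficient on x, while unitarydummy, omitted under FE, gets an estimate in the BE and hybrid models.

            ```stata
            * Toy balanced panel (hypothetical data, not the recycling data set)
            clear
            set seed 2019
            set obs 30
            generate acode = _n
            generate byte unitarydummy = mod(acode, 2)   // time-invariant 0/1
            generate double u = rnormal()                // authority-level effect
            expand 8                                     // 8 quarters per authority
            bysort acode: generate qdate = _n
            generate double x = 0.5*u + rnormal()        // covariate correlated with u
            generate double recycling = 20 + 2*x + 3*unitarydummy + u + rnormal()
            xtset acode qdate

            * split x into its authority mean and the deviation from that mean
            bysort acode: egen double x_between = mean(x)
            generate double x_within = x - x_between

            * FE: within variation only, unitarydummy is omitted
            xtreg recycling x unitarydummy, fe
            scalar b_fe = _b[x]

            * BE: authority means only, unitarydummy is estimated
            xtreg recycling x unitarydummy, be

            * hybrid RE model: both parts in one equation, unitarydummy is estimated
            xtreg recycling x_within x_between unitarydummy, re vce(robust)
            scalar b_hybrid = _b[x_within]

            display "FE coefficient on x:            " b_fe
            display "hybrid coefficient on x_within: " b_hybrid
            ```

            Storing the transformed variables as double keeps rounding error out of the comparison; with the default float type the two coefficients can differ in the later decimal places.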

            • #7
              Dear Ashish, this looks great. I have run the regression and got results. The Schunck (2013) paper was helpful. I am still a little confused about why we run the hybrid2 equation, and I would be grateful if you could clarify.

              • #8
                Darcy,

                Hybrid2 is a popular model called the correlated random effects (CRE) model, and it yields the same within estimates. Hybrid1 is eq. (3) of Schunck (2013), which he refers to as the 'hybrid model'. Hybrid2 is eq. (4), the CRE model. Eq. (5) shows the link between the two models. In the CRE model, the coefficient on mean(x_i) is not the between estimator beta_3 but the difference (between - within). This can be seen in the example: for loginc, hybrid1's (-40.544 - 49.411) = hybrid2's (-89.955). If the focus is only on the within estimators, the CRE model (hybrid2) is sufficient. The hybrid model additionally identifies the between effects. I hope this addresses the confusion.
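
                For a single covariate, the link between the two models can be sketched as follows. The notation is my own shorthand rather than a copy of Schunck's equations: beta_W and beta_B are the within and between coefficients, x-bar_i is the authority mean of x, z_i is a time-invariant variable such as unitarydummy, and pi is the coefficient on the mean in the CRE model.

                ```latex
                \[
                \begin{aligned}
                \text{hybrid (hybrid1):}\quad
                  y_{it} &= \beta_0 + \beta_W\,(x_{it} - \bar{x}_i) + \beta_B\,\bar{x}_i + \gamma z_i + u_i + e_{it} \\
                         &= \beta_0 + \beta_W\,x_{it} + (\beta_B - \beta_W)\,\bar{x}_i + \gamma z_i + u_i + e_{it} \\
                \text{CRE (hybrid2):}\quad
                  y_{it} &= \beta_0 + \beta_W\,x_{it} + \pi\,\bar{x}_i + \gamma z_i + u_i + e_{it},
                  \qquad \pi = \beta_B - \beta_W
                \end{aligned}
                \]
                ```

                Checking this against the results table in #4: for loginc, -40.544609 - 49.410782 = -89.955391, which is the hybrid2 coefficient on loginc_between. For logpopden, -7.6251447 - (-264.09923) = 256.4740853, which matches hybrid2's 256.47409 to the digits shown. The coefficient on unitarydummy is the same (-4.7426959) in both models, as the algebra suggests it should be.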
