  • Fixed Effects Model with Time Invariant Independent Variables

    I am working with a panel data set in Stata 16.1 that has 14,690 observations and 19 variables. My dependent variable is a policy stringency index, which varies by day over the 130-day study period, and I am regressing it on a series of independent variables. My main independent variable of interest is the World Bank income classification group, a time-invariant categorical variable. Several of my other independent variables, used as controls, are also time-invariant. I want a coefficient for each of these time-invariant variables while also using a fixed-effects model to capture other time-invariant effects. However, I am struggling to specify a fixed-effects model that does not omit my time-invariant independent variables. My current code is:

    Code:
    xtreg strin_ind i.income_cat i.n_polity over65 covid_cases_prev_day ghsindex i.date, fe
    Stata omits all of my independent variables except for covid_cases_prev_day. I have seen studies that treat the other time-invariant independent variables as controls. How can I do this?

    Thank you!

  • #2
    It is not possible to estimate the effects of time-invariant variables in the presence of fixed effects: they are perfectly collinear with the fixed effects. The within transformation wipes out both the time-invariant fixed effects and the time-invariant variables.
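
    One way to see this in the data is to look at the within-panel variation of each regressor. This is only a sketch: it assumes the data are already -xtset- and uses the variable names from #1. Any variable whose "within" standard deviation is reported as zero is constant within panels and will be dropped by -xtreg, fe-.

    Code:
    * Decompose each regressor's variation into overall, between, and within parts
    * (assumes the panel has already been declared with xtset)
    xtsum income_cat n_polity over65 ghsindex covid_cases_prev_day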


    • #3
      You are seeking something that does not exist. It is a straight calculation in linear algebra that in any fixed-effects model, time-invariant variables do not have identifiable effects. It is mathematically impossible to get around this. Whatever studies you have seen that provide estimates for the effects of time-invariant predictors are not using a fixed-effects model.

      You may have seen studies that use a hybrid model. -xthybrid-, available from SSC, can do that for you. But that model is not a true fixed-effects model, and it does not automatically adjust for unobserved time-invariant effects the way a true fixed-effects model would.
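
      For reference, a hybrid specification can also be built by hand, which makes explicit what is and is not being adjusted for. The following is only a sketch and has not been run on the poster's data. It assumes the variable names from #1, data that are already -xtset-, and a panel identifier called id (an assumed name); mean_cases and dev_cases are names made up for illustration.

      Code:
      * Panel-specific mean of the one time-varying regressor (the "between" part)
      bysort id: egen double mean_cases = mean(covid_cases_prev_day)
      * Deviation from the panel mean (the "within" part)
      generate double dev_cases = covid_cases_prev_day - mean_cases
      * Random-intercept model: dev_cases plays the role of the within estimate,
      * time-invariant regressors enter directly, and i.date adds day dummies
      xtreg strin_ind dev_cases mean_cases i.income_cat i.n_polity over65 ghsindex i.date, re vce(cluster id)

      The coefficients on the time-invariant variables in such a model are still vulnerable to bias from unobserved time-invariant confounders, for the reason given above.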

      Added: Crossed with #2, which makes the same point.


      • #4
        Clyde and Joro,

        Thank you for clarifying the theory behind the fixed-effects model. I believe the -xthybrid- command would be helpful. How can I control for day and country effects with this hybrid model?

        I set my panel data with the following code:
        Code:
        xtset id date, daily


        • #5
          "I believe the -xthybrid- command would be helpful. How can I control for day and country effects with this hybrid model?"
          Schunck, R. (2013). Within and between estimates in random-effects models: Advantages and drawbacks of correlated random effects and hybrid models. The Stata Journal, 13(1), 65-76.


          • #6
            Thank you for providing the citation, Chris. I referenced that paper, but I still do not know how to control for day effects.

            I also created dummy variables for my categorical variables, as the Schunck paper suggested, using the code below:
            Code:
            generate lowincome = income_cat==1 if income_cat!=.
            generate lowmidincome = income_cat==2 if income_cat!=.
            generate upmidincome = income_cat==3 if income_cat!=.
            generate highincome = income_cat==4 if income_cat!=.
            
            
            generate autocracy = n_polity_2018==1 if n_polity_2018!=.
            generate anocracy = n_polity_2018==2 if n_polity_2018!=.
            generate democracy = n_polity_2018==3 if n_polity_2018!=.
            However, when I run -xthybrid-, my highincome and democracy variables are omitted.

            Code:
            xthybrid strin_ind lowincome lowmidincome upmidincome highincome autocracy anocracy democracy over65_2019 ghsindex covid_cases_prev_day date, clusterid(id) se
            
            The variable 'lowincome' does not vary sufficiently within clusters
            and will not be used to create additional regressors.
            [~0% of the total variance in 'lowincome' is within clusters]
            The variable 'lowmidincome' does not vary sufficiently within clusters
            and will not be used to create additional regressors.
            [~0% of the total variance in 'lowmidincome' is within clusters]
            The variable 'upmidincome' does not vary sufficiently within clusters
            and will not be used to create additional regressors.
            [~0% of the total variance in 'upmidincome' is within clusters]
            The variable 'highincome' does not vary sufficiently within clusters
            and will not be used to create additional regressors.
            [~0% of the total variance in 'highincome' is within clusters]
            The variable 'autocracy' does not vary sufficiently within clusters
            and will not be used to create additional regressors.
            [~0% of the total variance in 'autocracy' is within clusters]
            The variable 'anocracy' does not vary sufficiently within clusters
            and will not be used to create additional regressors.
            [~0% of the total variance in 'anocracy' is within clusters]
            The variable 'democracy' does not vary sufficiently within clusters
            and will not be used to create additional regressors.
            [~0% of the total variance in 'democracy' is within clusters]
            The variable 'over65_2019' does not vary sufficiently within clusters
            and will not be used to create additional regressors.
            [~0% of the total variance in 'over65_2019' is within clusters]
            The variable 'ghsindex' does not vary sufficiently within clusters
            and will not be used to create additional regressors.
            [~0% of the total variance in 'ghsindex' is within clusters]
            
            Hybrid model. Family: gaussian. Link: identity.
            
            +-----------------------------------+
            |             Variable |   model    |
            |----------------------+------------|
            | strin_ind            |            |
            |         R__lowincome |   -11.5418 |
            |                      |     5.7931 |
            |      R__lowmidincome |    -3.6237 |
            |                      |     3.9716 |
            |       R__upmidincome |     1.5030 |
            |                      |     3.3585 |
            |        R__highincome |  (omitted) |
            |                      |            |
            |         R__autocracy |    -5.0748 |
            |                      |     4.1473 |
            |          R__anocracy |    -3.4765 |
            |                      |     2.9943 |
            |         R__democracy |  (omitted) |
            |                      |            |
            |       R__over65_2019 |    -0.5860 |
            |                      |     0.2605 |
            |          R__ghsindex |    -0.0419 |
            |                      |     0.1209 |
            | W__covid_cases_pre~y |    -0.0000 |
            |                      |     0.0000 |
            |              W__date |     0.4418 |
            |                      |     0.0052 |
            | B__covid_cases_pre~y |     0.0000 |
            |                      |     0.0000 |
            |              B__date |  (omitted) |
            |                      |            |
            |                _cons |    69.4817 |
            |                      |     6.9082 |
            |----------------------+------------|
            |        var(_cons[id])|            |
            |                _cons |   119.1077 |
            |                      |    16.3798 |
            |----------------------+------------|
            |      var(e.strin_ind)|            |
            |                _cons |   521.6886 |
            |                      |     6.1107 |
            |----------------------+------------|
            | Statistics           |            |
            |                   ll | -6.700e+04 |
            |                 chi2 |  7564.0897 |
            |                    p |     0.0000 |
            |                  aic |  1.340e+05 |
            |                  bic |  1.341e+05 |
            +-----------------------------------+
                                     legend: b/se
            Level 1: 14690 units. Level 2: 113 units.


            • #7
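              Stata will omit one dummy if you specify all possible categories, because the full set of dummies sums to one for every observation and is therefore perfectly collinear with the constant. Don't worry, this is normal. You can then interpret the remaining dummies relative to the baseline (i.e., the omitted category).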
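
              If you would rather choose the baseline yourself than leave it to Stata, leave that dummy out of the command. A sketch reusing the syntax and variable names from #6, not run on your data: here lowincome and autocracy are dropped, so the remaining income and regime coefficients are relative to low-income countries and autocracies, respectively.

              Code:
              * Same call as in #6, but with lowincome and autocracy left out as the reference categories
              xthybrid strin_ind lowmidincome upmidincome highincome anocracy democracy over65_2019 ghsindex covid_cases_prev_day date, clusterid(id) se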


              • #8
                Thank you, Chris. Glad this isn't an error.


                • #9
                  What sensitivity, or specification, checks should I perform when using the xthybrid command? I did not find a discussion of sensitivity checks in the Schunck 2013 paper Chris referenced above.
                  Last edited by Morgan Pincombe; 29 Jul 2020, 09:01.
