  • Confusion w/ panel data, fixed effects, dummies, clustered standard errors etc

    Hey Stata community,

    I am currently working on a project and am stuck in the data analysis, so I need your help. Sorry for opening a new thread with these basic questions. I just want to understand the Stata command once, so that I can apply it in the future without major questions.

    I have firm-level data over multiple years. The literature I am following uses sector and year dummies, pseudo fixed effects, and standard errors clustered at the industry level. Now I wonder how I can fit all of that into one model.

    I set my panel using:
    xtset companynum DataYearFiscal, yearly
    where companynum is the company identifier and DataYearFiscal is the year.

    So far, I managed to build the following model:
    xtreg DepVar IndVar1 IndVar2 ... IndVarn i.DataYearFiscal i.companynum, vce (cluster Industry)
    I have some questions to better understand the command.
    Does it fit the above-mentioned requirements?
    i.DataYearFiscal and i.companynum are dummies, aren't they? Are they used to capture year-specific and firm-specific effects, respectively?
    vce (cluster Industry) requests standard errors clustered at the industry level to account for differences between industries (e.g. manufacturing vs retail). What happens if I exclude it?
    If I include fe, i.companynum is omitted because of collinearity.

    The above-mentioned model, however, has an R-Square between = 1.0000. That seems odd to me. Why is this so? Where did I take the wrong exit?

    Thank you very much!
    Tobi

  • #2
    Tobias:
    welcome to this forum.
    Some comments about your query:
    - your -xtset- looks OK;
    - -i.companynum- is redundant, as you have already included this fixed effect in -xtset-. As expected, it is wiped out by the -fe- machinery, being a time-invariant predictor;
    - your take on -i.DataYearFiscal- and -i.companynum- is right. However, I would recommend that you take a look at any decent textbook on panel data econometrics;
    - the cluster/robust option accounts for heteroskedasticity and/or autocorrelation;
    - your R-sq actually looks strange. What is the outcome of -xttest0- after -xtreg, re-?
    Kind regards,
    Carlo
    (Stata 18.0 SE)
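
The points above can be tried out on made-up data. The following is a minimal sketch, not the poster's actual setup: the variable names are taken from the thread, but the data are simulated, and the number of firms, years, and industries is an assumption chosen only so that the commands run. It estimates a firm fixed-effects model with year dummies and industry-clustered standard errors, leaving out -i.companynum- because -fe- already absorbs the firm effects.

```stata
* Simulated firm-year panel; every number below is made up for illustration
clear
set seed 12345
* 200 hypothetical firms
set obs 200
generate companynum = _n
* 20 hypothetical industries, with firms nested within industries
generate Industry = ceil(companynum/10)
* Time-invariant firm effect
generate u = rnormal()
* 8 fiscal years per firm (2011 to 2018)
expand 8
bysort companynum: generate DataYearFiscal = 2010 + _n
generate IndVar1 = rnormal() + 0.5*u
generate IndVar2 = rnormal()
generate DepVar = 1 + 0.5*IndVar1 - 0.3*IndVar2 + u + rnormal()

xtset companynum DataYearFiscal, yearly

* Firm fixed effects come from -fe-, so no i.companynum is needed
xtreg DepVar IndVar1 IndVar2 i.DataYearFiscal, fe vce(cluster Industry)

* Same model with conventional standard errors, for comparison
xtreg DepVar IndVar1 IndVar2 i.DataYearFiscal, fe
```

Comparing the two outputs shows what dropping the cluster option does: the coefficients are identical and only the standard errors change. Note that sector dummies that do not vary within a firm would be omitted under -fe- for the same collinearity reason as -i.companynum-.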


    • #3
      Hi,

      Thank you for your quick response!

      I do not need to include either i.companynum or fe because I use a panel regression!? But what does "pseudo fixed effects" refer to? I am a bit confused by the "pseudo".

      I will consult a textbook to check how to apply them correctly.

      Ok. I do understand the concept of heteroskedasticity. However, does it make sense to do that at the industry level if the panel is at the firm level?

      Regarding the R-square issue, my outputs are:

      R-sq:
      within = 0.0764
      between = 1.0000
      overall = 0.1672

      Wald chi2(14) = .
      Prob > chi2 = .

      xttest0

      Breusch and Pagan Lagrangian multiplier test for random effects

      DepVar[companynum,t] = Xb + u[companynum] + e[companynum,t]

      Estimated results:
                       |       Var     sd = sqrt(Var)
              ---------+-----------------------------
                DepVar |   1773.943       42.11821
                     e |   1739.027       41.70165
                     u |          0              0

      Test: Var(u) = 0
      chibar2(01) = 0.00
      Prob > chibar2 = 1.0000

      I have honestly no clue what this tells me now. *facepalm*


      • #4
        Tobias:
        as suspected from the sky-rocketing between R-sq, the outcome of -xttest0- tells you that your dataset does not show evidence of a panel-wise effect; hence, you should switch to pooled OLS.
        On pseudo-panels (i.e., cohorts of individuals instead of individuals followed up over time), you may want to take a look at: https://www.insee.fr/en/statistiques...uillerm_EN.pdf.
        Kind regards,
        Carlo
        (Stata 18.0 SE)
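
The -xttest0- check and the pooled OLS alternative can be sketched as follows. This is an illustration on simulated data, not a rerun of the poster's model: the variable names come from the thread, everything else is assumed. One point that may be worth checking before settling on pooled OLS: if the random-effects model still contains -i.companynum-, the firm dummies soak up all between-firm variation, which is a likely reason for a between R-sq of 1.0000 and an estimated Var(u) of 0. The sketch therefore runs -xtreg, re- without the firm dummies before calling -xttest0-.

```stata
* Simulated firm-year panel; every number below is made up for illustration
clear
set seed 12345
set obs 200
generate companynum = _n
* 20 hypothetical industries, with firms nested within industries
generate Industry = ceil(companynum/10)
* Firm effect present by construction, so the test result here will differ
* from the one reported in the thread
generate u = rnormal()
expand 8
bysort companynum: generate DataYearFiscal = 2010 + _n
generate IndVar1 = rnormal() + 0.5*u
generate IndVar2 = rnormal()
generate DepVar = 1 + 0.5*IndVar1 - 0.3*IndVar2 + u + rnormal()

xtset companynum DataYearFiscal, yearly

* Random effects WITHOUT firm dummies, then the Breusch-Pagan LM test
xtreg DepVar IndVar1 IndVar2 i.DataYearFiscal, re
xttest0

* Pooled OLS with sector and year dummies, SEs clustered by industry
regress DepVar IndVar1 IndVar2 i.Industry i.DataYearFiscal, vce(cluster Industry)
```

With industry dummies and industry clusters in the same model there are more coefficients than clusters, so Stata may report the overall F statistic as missing, much like the "Wald chi2(14) = ." shown earlier in the thread. The individual coefficients and their standard errors are still reported.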


        • #5
          Hi Carlo,

          I read through the document. However, I can't get my head around why I would have a pseudo-panel rather than a normal one. My data is at the individual firm level and I do follow the same firms for the whole time period. Is it because my panel is unbalanced?

          Do unbalanced panels require different methods/commands or controls? Or do I have to drop the firm-years with missing variables? I thought Stata ignores them anyway...

          Thank you very much!
          Tobi


          • #6
            Tobias:
            if you have a sample composed of the same units (firms, in your case) that you follow up over the years, you have a panel dataset.
            The fact that your panel is unbalanced has nothing to do with the pseudo-panel definition.
            Finally, as you correctly surmise, Stata can handle both balanced and unbalanced panel datasets with no problem (and with no difference in the -xt- suite commands to be used for your regression).
            That said, as per the -xttest0- outcome, your panel dataset does not show evidence of a panel-wise effect; hence, it should be estimated via pooled OLS.
            Kind regards,
            Carlo
            (Stata 18.0 SE)
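
How Stata treats an unbalanced panel with missing values can be seen in a small sketch. The data are again simulated and the pattern of gaps is an arbitrary assumption; only the variable names come from the thread. -xtdescribe- summarizes the participation pattern, and the estimation command then uses only the firm-years with complete data, without any manual dropping.

```stata
* Simulated firm-year panel; every number below is made up for illustration
clear
set seed 12345
set obs 200
generate companynum = _n
generate Industry = ceil(companynum/10)
generate u = rnormal()
expand 8
bysort companynum: generate DataYearFiscal = 2010 + _n
generate IndVar1 = rnormal() + 0.5*u
generate IndVar2 = rnormal()
generate DepVar = 1 + 0.5*IndVar1 - 0.3*IndVar2 + u + rnormal()

* Make the panel unbalanced: every fourth firm leaves the sample after 2015
drop if mod(companynum, 4) == 0 & DataYearFiscal > 2015
* Set a regressor to missing for some firm-years
replace IndVar1 = . if mod(companynum, 10) == 1 & DataYearFiscal == 2012

xtset companynum DataYearFiscal, yearly

* Show the pattern of firm-years present in the data
xtdescribe

* Same command as for a balanced panel; incomplete firm-years are skipped
xtreg DepVar IndVar1 IndVar2 i.DataYearFiscal, fe vce(cluster Industry)
display e(N)
```

The count shown by -display e(N)- is the number of firm-years actually used, which is smaller than the number of rows in the dataset whenever a model variable is missing.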


            • #7
              Hi Carlo,

              Sorry for not getting back to you! Thank you very much for your explanation. It helps tremendously!

              All the best
