  • High Dimensional Fixed Effects (Firm-Quarter and year-quarter)

    Dear Stata community,
    I am trying to estimate a regression of the impact of weather changes on company investment decisions. I have quarterly panel data on different companies and their locations in different states.
    Since weather changes are seasonal I would like to include firm-quarter fixed effects (4 effects for each company). Can someone explain what is the difference between firm and quarter fixed effects versus firm-quarter fixed effects.
    I also plan to have year-quarter fixed effects. Can someone advise on the meaning of year-quarter fixed effects.
    I plan to cluster the standard errors at the state level since weather changes affect different states differently.
    Is this the correct specification of the model?

  • #2
    "Can someone explain what is the difference between firm and quarter fixed effects versus firm-quarter fixed effects."
    Firm fixed effects are variables that characterize each firm but do not vary over time. Because firm fixed effects are included in fixed-effects regression, these models are not biased by the omission of time-invariant attributes of firms. In practice, they are implemented in Stata either by -xtset firm- and specifying the -fe- option in the regression, or by including i.firm in the varlist of a -regress- command.
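
    As a minimal sketch of those two implementations, assuming hypothetical variables invest (the outcome), weather (the regressor of interest), and a numeric identifier firm:

    ```stata
    * Hypothetical variable names: invest, weather, firm (numeric id)
    xtset firm
    xtreg invest weather, fe

    * Same coefficient on weather, with the firm indicators entered explicitly
    regress invest weather i.firm
    ```

    If the firm identifier is a string, something like -egen firm_id = group(firm_name)- would be needed first (firm_name is likewise a made-up name).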

    Quarterly time fixed effects are variables that characterize each quarter (treating 2001Q1 and 2002Q1 as different quarters) but do not vary across firms. They are often included in models in economics and finance to adjust the analysis for time-specific shocks to the outcome that apply across the board to all firms and vary from one quarter to the next. In terms of implementation, note that including a time variable in the -xtset- command does not lead to inclusion of time fixed effects in the model. If you want quarter fixed effects in your model, you must include i.quarterly_date in your varlist.
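
    A sketch of that point, again with made-up variable names. It assumes qdate is a Stata quarterly date (for example, one built with yq(year, quarter) and formatted %tq):

    ```stata
    * Declaring qdate in -xtset- does NOT add time effects by itself
    xtset firm qdate

    * The quarterly time effects enter only through i.qdate
    xtreg invest weather i.qdate, fe
    ```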

    Seasonal quarter fixed effects are variables that simply distinguish the four quarters of each year. There would be four such effects (of which one will be omitted as the base category). These variables adjust the analysis for seasonal effects that apply to all firms in the same way and also apply the same way from one year to the next. (That is, for these variables 2001Q1 and 2002Q1 are both just Q1.) Thus a Q1 seasonal effect is roughly an indicator for "everything that happens to everyone everywhere each year during winter." For implementation you need a variable, call it season, that takes on values 1, 2, 3, and 4 corresponding to the four calendar quarters, and you include i.season in your varlist.
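
    One way the season variable might be built, assuming (hypothetically) that qdate is a Stata quarterly date in %tq format and that invest and weather are the outcome and regressor:

    ```stata
    * dofq() converts the quarterly date to a daily date; quarter() then returns 1-4
    gen season = quarter(dofq(qdate))

    xtset firm
    xtreg invest weather i.season, fe
    ```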

    Firm-quarterly time fixed effects are variables that characterize each combination of firm and year and quarter. If you have 10 firms observed over 15 quarters, there will be 10 firm fixed effects (of which 1 is omitted as the base), and there will be 15 quarter effects (of which one is omitted as the base), but there will be 150 firm-quarter effects (of which 1 is omitted as the base category). The use of firm-quarter effects would adjust the model for shocks that are unique to each firm in each quarter. Note that this is only possible if you have multiple observations of each firm within each quarter. If, as I infer from your description, you have only one observation of each firm in each quarter, then you cannot do this because the firm-quarter effects will soak up all of your degrees of freedom and you will get no estimates for the model parameters of interest.
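
    A quick way to check whether there is more than one observation per firm per quarter, sketched with the hypothetical identifiers firm and qdate:

    ```stata
    * Count observations in each firm-by-quarterly-date cell
    bysort firm qdate: gen n_in_cell = _N
    tabulate n_in_cell
    ```

    If n_in_cell is 1 everywhere, each firm-quarter effect would fit its single observation exactly, which is the situation described above.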

    "I also plan to have year-quarter fixed effects. Can someone advise on the meaning of year-quarter fixed effects."
    Year-quarter fixed effects are what I have referred to above as quarterly-time fixed effects.

    "I plan to cluster the standard errors at the state level since weather changes affect different states differently."
    Non sequitur. If weather changes affect different states differently, then you are dealing with an interaction between state and weather. That has nothing to do with clustering standard errors. You would cluster errors at the state level if there is reason to believe that regression residuals are correlated within states.
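
    To make the distinction concrete, here is a hedged sketch with made-up variable names (invest, weather, firm, qdate, and a numeric state identifier). The two ideas correspond to different pieces of syntax:

    ```stata
    * (a) Effect of weather allowed to differ by state: an interaction
    regress invest c.weather##i.state i.firm i.qdate

    * (b) Residuals allowed to be correlated within state: clustered standard errors
    regress invest weather i.firm i.qdate, vce(cluster state)
    ```

    In (a), if firms never change state, the state main effects are collinear with the firm indicators and Stata will omit them; the weather-by-state interaction terms remain.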

    "Is this the correct specification of the model?"
    That's not a statistical question. That's a content-based question, for which you should seek advice from colleagues in your field or from the literature on the topic. My guess, speaking as a person who is completely uninformed in this content area, is that it's not correct, because it sounds like you have a three-level data generating process: observations repeated over time, nested in firms, which are nested in turn within states (or perhaps related to states through a mixed-membership structure). So the use of a two-level model with standard errors clustered at a higher level is a mis-specification of the process. It is, admittedly, a commonly used one in some disciplines because it avoids the use of random effects (which are viewed with skepticism in some disciplines). Sometimes there is no completely satisfactory specification of the model that can, as a practical matter, be estimated with data that can be obtained in the real world.
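
    For illustration only, a three-level specification of the kind alluded to above might look like the following. The variable names are hypothetical, and the sketch assumes each firm belongs to exactly one state; a mixed-membership structure would need a different setup.

    ```stata
    * Random intercepts for states and for firms nested within states
    mixed invest weather i.qdate || state: || firm:
    ```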

    Once you have settled on the structure of your model, if you have questions about how to implement that model in Stata code, do post back with a full description of the modeling decisions you have made and some specimen code or pseudo-code that you think might be the solution. In such a post, be sure also to include example data, using the -dataex- command. If you are running version 15.1 or a fully updated version 14.2, -dataex- is already part of your official Stata installation. If not, run -ssc install dataex- to get it. Either way, run -help dataex- to read the simple instructions for using it. -dataex- will save you time; it is easier and quicker than typing out tables. It includes complete information about aspects of the data that are often critical to answering your question but cannot be seen from tabular displays or screenshots. It also makes it possible for those who want to help you to create a faithful representation of your example to try out their code, which in turn makes it more likely that their answer will actually work in your data.
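
    A typical call might look like this; the variable names are placeholders for whatever is in your data set:

    ```stata
    * Produce a copy-and-paste example of the first 20 observations
    dataex firm state qdate invest weather in 1/20
    ```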



    • #3
      Dear Clyde,
      I have a similar question. I am using a quarterly survey of different regions. It is a repeated cross-section in which I have years and quarters. I am running an individual-level regression. When I add region-specific year_quarter dummies, my results mean nothing. My question is about the right way of controlling for year effects and season effects for each region.
      Let's say my variables are: region, year, quarter, and year_quarter. Which of the following do you think is the right way to control for these effects?

      1-) i.region#i.year_quarter
      2-) i.region#i.year and i.region#i.quarter
      3-) i.year and i.quarter

      I know the third one is not region specific, but the problem is that the first one seems to absorb all the variation, leaving nothing for the parameter of interest. Do you think the second option is a reasonable one?

      thank you for your help



      • #4
        Well, you don't provide much specific information about your project and its goals, so I can't really answer your question except in very general, somewhat vague terms.

        The use of i.year i.quarter to adjust for annual shocks and seasonal variation is simple and straightforward. By contrast, the use of year_quarter adjusts for quarterly shocks, but does not identify any seasonal pattern of variation, nor any annual deviations. But which is appropriate for your situation depends on the nature of the variables and the kinds of chronological variation that can reasonably be expected in your data generating process. So I can't say anything more about that.
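
        Side by side, the two specifications might be written as follows. The names year, quarter, and year_quarter come from the question above; y and x are placeholders for the outcome and the regressor of interest.

        ```stata
        * Annual shocks plus a seasonal pattern common to all years
        regress y x i.year i.quarter

        * A separate shock for every calendar quarter
        regress y x i.year_quarter
        ```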

        As for region#year_quarter taking up all the variation and leaving nothing for the parameter of interest, this might be a good thing or a bad thing. If your parameter of interest is, in fact, irrelevant other than being some kind of proxy marker for the passage of time and range over space, then it is good and appropriate to include region#year_quarter precisely so you do not falsely attribute to your parameter of interest an importance that it does not deserve. Now, on the other hand, if the parameter of interest is some policy or practice which has a side consequence of driving people (or firms, or whatever) to move to different regions, and if this change in location is reflected in your data, then including region as a covariate constitutes adjustment of the analysis for a variable that lies on the causal pathway between your parameter of interest and the outcome. In that case it would be quite wrong to include these covariates: you never want to adjust for something on the causal pathway. But again, knowing almost nothing about your actual problem, I can't advise you which of these considerations applies to you.
