  • Fixed Effects for a Panel at a Coarser Level

    Hello,

    I want to include some fixed effects in my model that I believe are difficult to include, so any advice on how exactly this can be done would be very helpful. My data is at the Firm x Year level, but the firms have different exposures to different countries in every year, and I want to include Country x Year fixed effects (as well as Firm fixed effects). The predictor variable of interest is a linear combination (weighted by each firm's country shares) of a Country x Year variable, which is why Country x Year fixed effects are important.

    For example, my data looks like this:
    Firm  Year  Yt  Xt
    1     1     8   6.5
    1     2     3   2.8
    2     1     6   4.4
    2     2     15  11.1
    3     1     7   5.75
    3     2     13  9.6
    where I know the following (in addition to the shares of each firm, which are used to calculate Xt):
    - Firm 1 is in countries 1,2,3 in Year 1; and in countries 1,2 in year 2
    - Firm 2 is in countries 1 & 3 in Year 1; and in countries 2 & 3 in year 2
    - Firm 3 is in all 3 countries in both time periods.

    What I would like to do is regress Yt on Xt, Firm fixed effects, and Country x Year fixed effects. I also want to cluster standard errors at the Firm level. To complicate things, my dataset is actually very large (thousands of firms, thousands of geographies, and close to 100 time periods) so whatever I do has to be doable with large datasets.

    I did some research and potentially found a way to do it by expanding the data to be at the Firm x Country x Year level, but this would be pretty difficult to do with my original dataset (would increase its size dramatically) and I am not sure if "repeating" the observations within my data would affect my estimates, because it would make it look like I have more observations than I actually do. To illustrate, in the above example, this data would be expanded to something like this:
    Firm  Country  Year  Yt  Xt
    1     1        1     8   6.5
    1     2        1     8   6.5
    1     3        1     8   6.5
    1     1        2     3   2.8
    1     2        2     3   2.8
    2     1        1     6   4.4
    2     3        1     6   4.4
    2     2        2     15  11.1
    2     3        2     15  11.1
    3     1        1     7   5.75
    3     2        1     7   5.75
    3     3        1     7   5.75
    3     1        2     13  9.6
    3     2        2     13  9.6
    3     3        2     13  9.6
    So I'm just listing the countries a firm is in and repeating the firm's Yt and Xt. Then I can run this command:

    Code:
    reghdfe y x, absorb(firm country#year) vce(cluster firm)
    Again, I am not sure this solution is ok, especially because I'm manually repeating each firm's observations several times. Any help would be appreciated!
    Last edited by Yorgi Gratian; 11 Feb 2019, 09:40.
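
For reference, the expansion described above does not have to be typed by hand. The following is a minimal sketch, assuming the list of countries each firm operates in is available as a separate firm x country x year dataset; the tempfile name `exposure` and the variable names `firm`, `country`, `year`, `y`, and `x` are illustrative and not taken from the real data. It only shows the mechanics of the expansion. Whether repeating each firm's rows is statistically appropriate is exactly the open question of this thread.

```stata
* Toy presence list: one row per country a firm operates in, by year
* (mirrors the example above; the tempfile name is an assumption)
clear
input firm country year
1 1 1
1 2 1
1 3 1
1 1 2
1 2 2
2 1 1
2 3 1
2 2 2
2 3 2
3 1 1
3 2 1
3 3 1
3 1 2
3 2 2
3 3 2
end
tempfile exposure
save `exposure'

* Firm x Year panel from the example above
clear
input firm year y x
1 1 8 6.5
1 2 3 2.8
2 1 6 4.4
2 2 15 11.1
3 1 7 5.75
3 2 13 9.6
end

* Pair every firm-year row with each country the firm is in that year,
* giving one row per firm x country x year with y and x repeated
joinby firm year using `exposure'
sort firm year country
list, sepby(firm year)
```

With thousands of firms and geographies the expanded file can become very large, so it may be worth checking the row count of the presence list before running the `joinby`.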

  • #2
    Duplicating observations seems unlikely to be a good solution.

    I'm not sure why you don't run reghdfe on the original dataset. It is pretty flexible. However, if your data is at the firm level (not the firm's x within each country), then I'm not sure that this is the way to go. You seem to have additional data that is not shown in your example (what you describe below the table), and it is hard to think about this without seeing that data.

    It seems like you want to control for country conditions in each year, added up to the firm level. What if, instead of trying to do all these dummies, you created a variable that is the firm's proportion in a given country times the average for all other firms in that country? You'd end up with one variable for each country. Country-year dummies are problematic since the firm has different amounts of activity in each country - the country-year should have different influences on each firm.
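
One possible reading of this suggestion can be sketched as follows. The post does not say which quantity is averaged over the other firms, so this sketch assumes it is the outcome `y`, computed as a leave-one-out mean within each country-year. The shares below are made up purely for illustration, and every variable name (`share`, `loo_mean_y`, `ctrl`) is hypothetical.

```stata
* Hypothetical long data: one row per firm x country x year.
* y is taken from the first example; the shares are invented for illustration.
clear
input firm country year share y
1 1 1 0.5 8
1 2 1 0.3 8
1 3 1 0.2 8
1 1 2 0.6 3
1 2 2 0.4 3
2 1 1 0.7 6
2 3 1 0.3 6
2 2 2 0.5 15
2 3 2 0.5 15
3 1 1 0.2 7
3 2 1 0.3 7
3 3 1 0.5 7
3 1 2 0.4 13
3 2 2 0.4 13
3 3 2 0.2 13
end

* Leave-one-out mean of y among the other firms in the same country-year
bysort country year: egen double sum_y = total(y)
bysort country year: gen n_firms = _N
gen double loo_mean_y = (sum_y - y) / (n_firms - 1) if n_firms > 1

* Firm's share in the country times the other firms' average
gen double ctrl = share * loo_mean_y

* One control variable per country, back at the firm x year level
keep firm year country y ctrl
reshape wide ctrl, i(firm year) j(country)

* Assumption: no presence in a country means a zero contribution
foreach v of varlist ctrl* {
    replace `v' = 0 if missing(`v')
}
list, clean
```

With thousands of geographies this produces thousands of control variables, so it is mainly useful for seeing what the construction looks like on a small case.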



    • #3
      Thanks for the reply, Phil. I will give a simpler example to illustrate the entire problem.


      Say there are three countries, two firms, and two time periods. The exposure of the firms to the countries is as follows (shares across all countries add up to 1 in each time period):
      Year  Firm  Country 1  Country 2  Country 3
      1     1     0.6        0.4        0
      1     2     0.9        0          0.1
      2     1     0.7        0.3        0
      2     2     0.8        0.1        0.1

      The predictor of interest is constructed by taking a weighted average of a variable at the country x year level (say it's a measure of resources in the country in that year):
      Year  Country 1  Country 2  Country 3
      1     5          3          4
      2     6          3          20

      So the constructed predictor (xit) is as follows:
      Firm  Year  Xit
      1     1     4.2
      2     1     4.9
      1     2     5.1
      2     2     7.1
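
The weighted average above can be reproduced in a few lines. This is a minimal sketch using the numbers from the two tables; the variable names `share` and `res` (for the country-level resource measure) and the tempfile name `shares` are illustrative choices, not names from the original data.

```stata
* Firm exposure shares, wide by country (values from the table above)
clear
input year firm share1 share2 share3
1 1 0.6 0.4 0
1 2 0.9 0 0.1
2 1 0.7 0.3 0
2 2 0.8 0.1 0.1
end
reshape long share, i(firm year) j(country)
tempfile shares
save `shares'

* Country x Year variable (values from the table above)
clear
input year res1 res2 res3
1 5 3 4
2 6 3 20
end
reshape long res, i(year) j(country)

* Attach the country-year value to every firm-country-year row
merge 1:m country year using `shares', nogenerate

* Share-weighted sum over countries gives the firm x year predictor
gen double contrib = share * res
collapse (sum) x = contrib, by(firm year)
list, clean
```

Keeping the shares in long form (one row per firm x country x year) is also the shape needed for any approach that uses the country-level exposures directly, rather than only their weighted sum.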

      I also have data reported by the firms every year that will be my dependent variable, so the entire dataset at the end looks like this:
      Firm  Year  Xit  Yit
      1     1     4.2  3
      2     1     4.9  6
      1     2     5.1  4
      2     2     7.1  11

      Now, the most straightforward approach is to do something like this:

      Code:
      xtset firm year
      xtreg y x i.year, fe vce(cluster firm)
      This does not really make use of the variation in firms' exposures to different countries (except that the exposures are all used to calculate the aggregate xit). I would like to use the variation in the exposures somehow: firms operating in a given country in a given year are affected differently depending on how exposed they are to that country in that year. Is this possible to do, or have you come across any papers that do something similar?
