
  • Non-linear regression with industry and year FE

    Dear colleagues,

    Greetings!

    I have been searching different forums for the past few weeks, but the more I dive into the topic, the more confused I get.

    I have a sample of 51,000 new investments made by foreign entrants within the context of 1 country, 61 industries, and 7 years of observation.

    My dependent variable is a binary indicator that equals 1 if a foreign entrant chooses a joint venture (JV) over a wholly-owned subsidiary (WOS) as the initial ownership mode of its investment. Some firms change their ownership mode later and some do not, but my research interest is in the initial choice.

    Because my explanatory variables are measured at the industry level and vary over time, I think I need to implement a FE regression. However, as noted above, I only care about the initial ownership mode choice of firms within each industry, so I cannot use xtlogit, fe: it does not even converge, because it assumes the FE are at the firm level.

    Therefore, I have a problem: in articles with a similar research design, the authors run logit regressions but do not explain in detail how they control for industry and year FE. If they used, say, xtlogit, re, I think that decision was not entirely correct, because they also dealt with firm-level initial transactions within certain industries that took the form of JV vs WOS, or WOS vs minority JV vs majority JV, and that did not change at the firm level over the years of observation (e.g., Li & Li (2010), Flexibility versus commitment: MNEs’ ownership strategy in China: 1 country, 2000-2006 observation period, 5,000 new foreign investments in manufacturing industries, analysis by multinomial logit regression).

    Also, considering that including industry and year dummies (i.ind, i.year) in a non-linear regression does not equal controlling for their FE, what can be done about this? I have found a suggestion to use CRE probit, but I am not sure it will solve the problem. I am also not fully familiar with that kind of model, so I am cautious about my ability to interpret the output correctly: https://www.statalist.org/forums/for...60#post1606360.

    I have also found that interpreting the output of margins, dydx after a non-linear regression with FE can be very dangerous: the FE are not estimated, so one cannot correctly predict the probability of the outcome. I did not know this before, so now I am more confused than ever. What is the best approach, then?

    Thank you for your replies in advance!

  • #2
    I think the relevant discussion for you is this one: https://www.statalist.org/forums/for...uated-at-means. Your description suggests that you want to run a logit regression with industry and time dummies, and I do not see any issues with that considering your sample sizes.
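The suggestion above, a pooled logit with industry and year dummies, can be sketched as follows. This is only an illustration on simulated data, not the original poster's model: the variable names (jv, x, ind, year), the parameter values, and the choice to cluster standard errors by industry are all assumptions made for the example.

```stata
* Minimal sketch on simulated data. All names (jv, x, ind, year) and all
* parameter values are hypothetical and are not taken from the thread.
clear
set seed 12345
set obs 5000

generate ind  = ceil(61 * runiform())          // 61 industries
generate year = 2000 + ceil(7 * runiform())    // 7 entry years, 2001-2007

* x varies at the industry-year level, like the regressors described in #1
bysort ind year: generate double x = rnormal() if _n == 1
by ind year: replace x = x[1]

* One observation per entrant: JV (1) vs WOS (0) at entry
generate byte jv = runiform() < invlogit(-0.5 + 0.8*x + 0.5*sin(ind) + 0.1*(year - 2004))

* Pooled logit with industry and year dummies.
* Clustering by industry is an assumption of this sketch, not advice from the thread.
logit jv x i.ind i.year, vce(cluster ind)

* Average marginal effect of x on Pr(jv = 1)
margins, dydx(x)
```

With many observations per industry and per year, the dummy coefficients are estimated like any other parameters, which is presumably why the sample size matters for this approach.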


    • #3
      Originally posted by Andrew Musau
      I think the relevant discussion for you is this one: https://www.statalist.org/forums/for...uated-at-means. Your description suggests that you want to run a logit regression with industry and time dummies, and I do not see any issues with that considering your sample sizes.
      Dear Andrew,

      Thank you for your reply!

      I guess I got confused because I was trying to set up my data in firm ID-year format (xtset id year), but the later regressions did not work properly because my DV (choice of initial ownership mode) is time invariant and recorded only in the year of market entry.

      Therefore, if I use logit with i.ind i.year dummies, will that be sufficient? Interestingly, I found that reghdfe, ab(ind year) (a community-contributed command) produces results very similar to logit i.ind i.year, and I do not understand why.


      • #4
        reghdfe, ab(ind year)
        This estimates a linear probability model (LPM) in which the industry and year dummies are absorbed rather than reported. The logit command, on the other hand, fits a logistic regression model. In many cases, the average marginal effects from logit will be close to the LPM coefficients, which is why the two sets of results look similar.
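The comparison described above can be checked directly by putting the LPM slope next to the logit average marginal effect. The sketch below uses simulated data with hypothetical names and parameter values (jv, x, ind, year), and it uses regress with explicit dummies so that it runs without extra installs; if reghdfe is installed (ssc install reghdfe), reghdfe jv x, absorb(ind year) should return the same slope on x.

```stata
* Minimal sketch on simulated data; names and parameter values are hypothetical.
clear
set seed 12345
set obs 5000
generate ind  = ceil(61 * runiform())
generate year = 2000 + ceil(7 * runiform())
bysort ind year: generate double x = rnormal() if _n == 1
by ind year: replace x = x[1]
generate byte jv = runiform() < invlogit(-0.5 + 0.8*x + 0.5*sin(ind) + 0.1*(year - 2004))

* Linear probability model with industry and year dummies
regress jv x i.ind i.year
scalar b_lpm = _b[x]

* Logit with the same dummies, then the average marginal effect of x
logit jv x i.ind i.year
margins, dydx(x) post
scalar ame = _b[x]

display "LPM coefficient on x:   " %6.4f scalar(b_lpm)
display "Logit AME of x:         " %6.4f scalar(ame)
```

The raw logit coefficient is on the log-odds scale and is not comparable to the LPM slope; it is the average marginal effect that tends to line up with it.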

        Therefore, if I use logit with i.ind i.year dummies, will that be sufficient?
        From my reading of #1, that is what you say is done in the literature. Not estimating a conditional logit model leaves some unobserved firm heterogeneity unaccounted for, but typically you include control variables to capture as much of it as possible. In any case, you appear to be constrained: if the dependent variable is time-invariant, there is no within-firm variation to exploit.
