I have been struggling for a couple of months now to choose the right Stata estimator for my data and research questions, and I would be really grateful for any insights.
I have a panel of U.S. manufacturing firms investing abroad (in around 100 host countries) over the period 2008-2012. The dependent variable (hereafter DV) is a binary indicator of investment: it equals one if a given firm invests in a given host country in a given year, and zero otherwise. I want to examine how the DV depends on firm characteristics (e.g., size, past performance, financial leverage; all time-varying), on the host country's characteristics (e.g., culture, institutions, distance; culture is constant over time), and on the interactions between these two types of variables.
Since I would like to examine the impact of both firm (individual) characteristics and country (alternative) characteristics on the investment decision, McFadden's conditional logit (which drops variables that do not vary across alternatives) is not a suitable option for my research questions. The only way to estimate the effect of firm characteristics in McFadden's model is to interact them with alternative dummies, and I have around 100 alternatives (host countries). Analysing the impact of firm characteristics separately for each alternative would be overwhelming and is not of theoretical interest to my readers. I do not need to compare the coefficients on firm characteristics across alternatives one by one.
So I ran a different specification. I used clogit, but instead of grouping the observations by choice occasion (which would drop the individual-specific variables because they do not vary across alternatives), I grouped the observations by firm. Each firm is observed over five years, i.e., five choice occasions, each identified by a year dummy. To me, this approach partially resembles the use of clogit in medicine, where treated subjects and controls are grouped together for comparison. Here, all the observations for a firm form one group, which absorbs the firm's fixed tendencies. Is this estimator preferable to the simple logit command in terms of consistency? (Since McFadden's choice model is not on the table, I need to work with the remaining options.) Would you suggest any other estimator or specification to model this process?
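To make the comparison concrete, here is a minimal sketch of the two specifications in Stata. All variable names (invest, firm_id, year, size, leverage, culture, distance) are placeholders assumed for illustration, not the actual names in my dataset, and the covariate list is abbreviated.

```stata
* Placeholder variable names (assumed for illustration only):
*   invest            = 1 if the firm invests in the host country that year, else 0
*   firm_id, year     = firm identifier and calendar year (2008-2012)
*   size, leverage    = time-varying firm characteristics
*   culture, distance = host-country characteristics

* (a) Conditional logit grouped by firm: conditions out a firm-specific intercept
clogit invest c.size c.leverage c.culture c.distance c.size#c.culture i.year, ///
    group(firm_id) vce(cluster firm_id)

* (b) Pooled logit with the same covariates and standard errors clustered by firm
logit invest c.size c.leverage c.culture c.distance c.size#c.culture i.year, ///
    vce(cluster firm_id)
```

Note that in (a) clogit drops any firm whose DV never varies (all zeros or all ones), so the two commands may not use the same estimation sample.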
PS1: Adding country dummies doesn't make much difference, except that country culture drops out because it is constant over time (I would rather avoid this, given my hypotheses).
PS2: I have also tried melogit (as well as meqrlogit and similar commands) with random intercepts for parent firm and host country. I used the startvalues and startgrid options to get past the initial error ("initial values not feasible"). After getting past this error, I left the model running for a full 48 hours, but it completed only three iterations. So mixed-effects logit appears to be computationally infeasible for my data.
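For reference, this is a sketch of the kind of crossed random-intercepts specification I mean, again with placeholder variable names (invest, firm_id, country_id, and the covariates are assumptions for illustration). The intmethod(laplace) option is shown only as one possible way to reduce the computational burden relative to the default adaptive quadrature; it is not something tested here.

```stata
* Placeholder variable names (assumed for illustration only).
* Crossed random intercepts: host country (via _all: R.country_id) and firm.
* intmethod(laplace) requests the Laplacian approximation instead of the
* default mean-variance adaptive Gauss-Hermite quadrature.
melogit invest c.size c.leverage c.culture c.distance i.year ///
    || _all: R.country_id || firm_id:, intmethod(laplace)
```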