
  • (Quasi-)Three-dimensional panel data with binary dependent and independent variables

    Dear Stata-community,
    I have just started my PhD, am quite new to Stata, and could use some guidance on choosing the correct model for my data.

    Generally speaking, I am analyzing whether top management team characteristics have an effect on a certain external event. Both the dependent variable and the key independent variables are binary, whereas most controls are continuous. The data contains around 23,000 firm-year observations over a 15-year time frame. Each firm-year observation includes data on CEO and CFO characteristics (e.g. gender, age, salary) and firm fundamentals. The data looks something like this:
    firmid  ceoid  year  cfoid  y  x_ceo  x_cfo  control1
    1       1      2002  1      1  1      1      2151
    2       2      2002  2      1  0      1      2341
    3       3      2002  3      0  1      1      212
    1       1      2003  4      1  1      0      131
    2       2      2003  2      0  1      0      14245
    (y = dependent variable; x_ceo and x_cfo = key independent variables for the CEO and CFO)
    Given that this is panel data, I definitely want to include time and firm fixed effects. So far I have used a mixed-effects probit model with time and firm fixed effects:

    meprobit y L.x_ceo L.x_cfo L.controls || firmid:, vce(robust) intpoints(12)
    However, as a mixed-effects probit model probably does not work well with fixed effects, I am unsure whether this is the right way to proceed. I therefore have a few questions on which I would need further guidance:
    1. Which model would you recommend for the data at hand? Should I use a logit instead of the mixed-effects probit model with time and firm fixed effects? Or revert to an OLS regression?
    2. Do I need to account somehow for the fact that both the main independent variables and the dependent variable are binary?
    3. Should I cluster standard errors at the management level rather than at the firm level, given that firm-year observations are not independent for each CEO/CFO combination? If so, how would I include standard errors clustered at the management level if my panel is set with firmid and year as the panel variables?

    Thank you in advance for your help, and please let me know if I need to specify anything further in my post (this might very well be the case, given that I am new to this community)!

  • #2
    While not strictly speaking correct, adding i.year to the right-hand-side variables and using xtlogit would be a possibility. This handles the binary dependent variable. You don't need to do anything about a binary right-hand-side variable except interpret it correctly. Note that the fixed-effects estimator takes out the base level of every included variable, so the coefficients rest on variation around that level, which may make the interpretation different. Clustering at the firm level is probably adequate.
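
    A minimal sketch of what this suggestion could look like, assuming the variable names from the example table in the first post (firmid, year, y, x_ceo, x_cfo, control1) and one-year lags as in the original meprobit call. The lag_* names are made up for illustration. As far as I know, xtlogit with the fe option does not accept vce(cluster), so the clogit line is shown as one way to get firm-clustered standard errors; treat it as an option to check against the manual, not a definitive recommendation.

```stata
* Declare the panel structure (firm as panel variable, year as time variable)
xtset firmid year

* Build explicit one-year lags; the lag_* names are hypothetical
generate lag_x_ceo    = L.x_ceo
generate lag_x_cfo    = L.x_cfo
generate lag_control1 = L.control1

* Fixed-effects (conditional) logit with year dummies on the right-hand side
xtlogit y lag_x_ceo lag_x_cfo lag_control1 i.year, fe

* Same conditional logit via clogit, with standard errors clustered by firm
clogit y lag_x_ceo lag_x_cfo lag_control1 i.year, group(firmid) vce(cluster firmid)
```

    Keep in mind that firms whose y never changes over the sample period drop out of the conditional logit, so the estimation sample may be noticeably smaller than the full 23,000 firm-years.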

    In future, please follow the FAQ on asking questions.