  • Rare binary dependent variable with panel data

    I’m having a problem that may be more of a statistics issue than a Stata issue. I am using Stata/IC 15.1 for Windows.

    I have panel data going from 2011 to 2017 with 1,974 firms and 13,818 observations (strongly balanced). My dependent variable is binary: Y=1 if the firm does an IPO and 0 otherwise. My dependent variable is rare: 84 positive outcomes (Y=1) and 13,734 negative outcomes (Y=0).

    I want to estimate the impact of a one-year lagged continuous variable, X(t), on the probability of doing an IPO the following year, P(Y(t+1)=1). I also want to use a fixed-effects model to avoid time-invariant omitted variable bias. My variables are heteroscedastic, so I would need robust standard errors.

    Below is an example of my dataset. Please note that Date is the year before the IPO. IPO equals 1 when the firm will issue an IPO at Date + 1 year.

    Code:
    * Example generated by -dataex-. To install: ssc install dataex
    clear
    input int(ID Date IPO) double xlag
     1 2011 0               -3.09
     1 2012 0               -6.84
     1 2013 0                2.39
     1 2014 0                4.57
     1 2015 0                4.94
     1 2016 0                5.51
     1 2017 1                 .04
     2 2011 0                   .
     2 2012 0                   .
     2 2013 0                   .
     2 2014 1                   .
     2 2015 0                   .
     2 2016 1               -1.35
     2 2017 0               -2.66
     3 2011 0              -46.27
     3 2012 0              -43.34
     3 2013 1               -2.98
     3 2014 1               -5.53
     3 2015 1              -11.66
     3 2016 1 -3.5500000000000003
     3 2017 0               -6.45
    end
    I can use -xtlogit- with the -fe- option, but I fear this would not account for the rarity of my binary dependent variable and would not let me use heteroscedasticity-robust standard errors.
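
    As a minimal sketch of that option, using the variable names from the -dataex- example above: -xtlogit, fe- and -clogit- with -group(ID)- fit the same conditional fixed-effects logit, and -clogit- additionally accepts -vce(cluster ID)-. Firms whose IPO value never changes over the sample drop out of the estimation, and neither command corrects for rare events.

    Code:
    * Declare the panel structure (firm ID, yearly Date)
    xtset ID Date

    * Conditional fixed-effects logit
    xtlogit IPO xlag, fe

    * Same estimator via -clogit-, with standard errors clustered by firm
    clogit IPO xlag, group(ID) vce(cluster ID)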

    After reading previous posts on Statalist, I am aware of the following two user-written commands:
    1. -relogit- by King et al.
    2. -firthlogit- by Coveney
    As I understand it, these commands are not designed for panel data, so they would not help in my case.

    Which model available in Stata would best suit my case?

    Thank you in advance for your help.

  • #2
    One approach would be to pair each firm that does an IPO with the most similar firm(s) you can find (propensity score matching is currently fashionable in many areas). You could then run a conditional logit with the matched pair as the panel (group) variable.
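
    A rough sketch of the second step, assuming the matching has already been done and has produced a hypothetical variable pair_id that identifies each matched set (it is not in the posted data):

    Code:
    * pair_id is an assumed identifier for each matched set
    * (one IPO firm plus its matched control firm or firms)
    * Conditional logit with the matched set as the group
    clogit IPO xlag, group(pair_id) vce(cluster pair_id)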

    In many cases, folks use panel dummies when it is not clear what the appropriate panel estimator is (e.g., using dummies with tobit).



    • #3
      Thank you for your answer, Phil! I will use matching and conditional logit, then.
