  • strongly unbalanced panel: xtlogit vs logit?

    Dear community,

    I have learned a lot from reading threads across this forum, but there are a few preliminary questions I think it is better to ask directly.

    I am studying a rather standard dataset of criminal offences. Each row corresponds to a crime. My dependent variable is a binary indicator for whether the crime belongs to a certain category (use of weapons). My independent variables are a large set of time dummies, the area code where the crime was committed, a few demographic indicators of the offenders, and some other variables I created. I have about 100k observations and 100 area groups.

    My original plan was to start by setting up a panel logit estimation with fixed effects at the area-code level (xtlogit, fe). This is a typical approach in the literature, where it is also common to re-frame the problem from an offender-based analysis into one based on the neighborhood (or whatever other area code is preferred). However, given the nature of my study, a neighborhood-level analysis would make little sense.


    My problem:

    For some reason, Stata gets stuck at the estimation phase. I enter the command xtlogit y x, fe, receive several "note" messages about a few collinear time dummies, and then nothing happens: Stata keeps running indefinitely with no output whatsoever. I would like to know whether this is related to the panel structure I am using and whether it is advisable to drop it. If so, I have a large set of time-invariant regressors describing the neighborhoods that I could include to make up for the FE.

    Thank you for your time and patience.
    Last edited by Paola Bertolini; 22 Nov 2021, 17:42.

  • #2
    EDIT:

    I ended up getting this message:

    note: multiple positive outcomes within groups encountered.

    followed by the critical error:

    2,730 (group size) take 575 (# positives) combinations results in numeric overflow; computations cannot proceed

    Not sure why I get this as I don't have categorical variables?

    I tried xtreg and I don't get errors.

    Can I use logit with i.neighborhood dummies instead of xtlogit to overcome the issue?

    thanks
    Last edited by Paola Bertolini; 22 Nov 2021, 18:28.


    • #3
      Please refer to here for the error message in #2.
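
      As a rough illustration of why the message in #2 appears (a sketch, not an official account of Stata's internals): the conditional fixed-effects logit likelihood involves, for each group, the number of ways to choose the observed positives from the group. For the group reported in #2 that count is far beyond what a double can hold (about 1.8e308). The lines below use only the built-in functions comb() and lnfactorial(); the numbers 2730 and 575 are taken from the error message.

      ```stata
      * Direct evaluation: comb() returns missing (.) when the result is too large
      display comb(2730, 575)

      * Same quantity on the log10 scale, which does not overflow
      display (lnfactorial(2730) - lnfactorial(575) - lnfactorial(2155)) / ln(10)
      ```

      The second line should report an exponent on the order of 600, so the problem comes from the group size and the number of positive outcomes in it, not from categorical regressors.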

      I think you have pooled cross-sectional data at the level of criminal offenses rather than a typical panel dataset. So using -logit- with area FE and time FE would suffice, but it may be extremely time-consuming given your sample size and the number of dummies. There is a Chamberlain-Mundlak device, which is similar to what you said in #1 (using time-invariant neighborhood characteristics to mimic FE), but it was developed for typical panel data, and I'm not sure whether it can be applied to a pooled cross section.
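
      A minimal sketch of the two options mentioned above, for illustration only. All variable names are hypothetical (y for the binary outcome, x1 and x2 for offender-level covariates, area for the area code, month for the time period), and clustering the standard errors by area is an assumption rather than something stated in this thread.

      ```stata
      * Hypothetical names: y, x1, x2, area, month

      * Option 1: pooled logit with area and time dummies
      logit y x1 x2 i.area i.month, vce(cluster area)

      * Option 2: Mundlak-style specification, replacing the area dummies
      * with area-level means of the covariates
      bysort area: egen double x1_bar = mean(x1)
      bysort area: egen double x2_bar = mean(x2)
      logit y x1 x2 x1_bar x2_bar i.month, vce(cluster area)
      ```

      Option 1 avoids the conditional likelihood entirely, so the overflow in #2 cannot occur. Whether Option 2 is appropriate for a pooled cross section is exactly the open question raised above, so treat it as a starting point rather than a recommendation.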
