  • Random Effect Logistic Panel Model

    Hi,

    I am running several random effect panel models using the British Election Study. My data has 4 waves with more than 27,000 respondents in each wave. I am using Stata version 14.
    euRefVote is my binary dependent variable.

    I first ran some fixed effects panel models for time-varying variables and I am now running random effects panel models. The fixed effects models worked fine. However, Stata starts running the random effect model but then stops at Iteration 1.
    I also tried running different random effect panel models with the same panel data to check whether one specific variable was causing the problem. Yet all of my variables showed the same 'not concave' problem.
    I also tried running a random effect panel model for another dataset I have, which worked fine.

    This is the code I have been using:

    xtset id wave

    xtlogit euRefVote age, re
    xtlogit euRefVote i.age gender marital1 i.household i.ethnicity, re
    xtlogit euRefVote partyID RiskPoverty, re




    On the display screen I get this result:

    Fitting comparison model:

    Iteration 0: log likelihood = -79595.428
    Iteration 1: log likelihood = -78032.547
    Iteration 2: log likelihood = -78032.037
    Iteration 3: log likelihood = -78032.037

    Fitting full model:

    tau = 0.0 log likelihood = -78032.037
    tau = 0.1 log likelihood = -75599.757
    tau = 0.2 log likelihood = -73145.166
    tau = 0.3 log likelihood = -70640.428
    tau = 0.4 log likelihood = -68051.496
    tau = 0.5 log likelihood = -65334.372
    tau = 0.6 log likelihood = -62427.947
    tau = 0.7 log likelihood = -59238.334
    tau = 0.8 log likelihood = -55591.027

    Iteration 0: log likelihood = -59239.613
    Iteration 1: log likelihood = -50640.545 (not concave)



    What could the problem be with my panel data?

    Thanks!
    Josefine

  • #2
    It is hard to know. The likelihood function for a random effects logistic model is not a simple, well-behaved one, and estimating these models can be tricky. Among the things they can be sensitive to:

    1. A rare outcome.
    2. Small or very sparse cells defined by the predictor variables. (i.age sounds like a recipe for this difficulty unless age actually encodes fairly coarse age groupings.)
    3. Predictor variables not on similar scales. (This doesn't look likely here.)

    Another issue that can foil these estimations is if sigma_u is very close to zero. Try running the same model with -estimate(0)- specified. That will cause Stata to stop early and show you its interim results. If the value for sigma_u is very close to zero in that output, that would suggest that this is the problem. The solution would be to just go to ordinary logistic regression.

    You could also learn other things from the output of the -estimate(0)- output: is there a predictor whose coefficient or standard error is unreasonably large or unreasonably small? That might identify a culprit variable.

    If none of these turn up a solution, try running the model with the -difficult- option, or choose something other than the default for the -technique()- option; sometimes you will have better luck with an alternate estimation algorithm.

    Yet another approach is to run this under -melogit- or -meqrlogit-, as they estimate the same model as -xtlogit- but use different approaches to estimation.
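
    As a rough sketch of the suggestions above (not a tested recipe), the checks and alternative estimators could look like this, reusing the variable names from the first post. The choice of technique(bfgs) is only an illustrative non-default algorithm, and the predictor list is an assumption taken from one of the models in #1.

    ```stata
    * Look for a rare outcome and for sparse cells in a categorical predictor
    tabulate euRefVote
    tabulate age euRefVote

    * Ask the maximizer to step differently through non-concave regions
    xtlogit euRefVote partyID RiskPoverty, re difficult

    * Try a non-default maximization algorithm (bfgs is just one choice)
    xtlogit euRefVote partyID RiskPoverty, re technique(bfgs)

    * Fit the random-intercept model with the mixed-effects commands;
    * id is the panel variable declared in xtset
    melogit euRefVote partyID RiskPoverty || id:
    meqrlogit euRefVote partyID RiskPoverty || id:
    ```

    If one of these converges and the others do not, it is worth comparing its estimates against an ordinary -logit- fit before relying on them.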





    • #3
      Dear Clyde,
      Thanks for the info and advice.

      I checked the distribution of my age variable and re-ran a random effects logistic panel model using the original variable, ageGroup, as it was before I recoded it into fewer categories. You were right: even with the original variable, sigma_u was already very small.
      Below is the output from the random effects panel model with the original variable, which already shows a small sigma_u.


      xtlogit euRefVote ageGroup, re

      Fitting comparison model:

      Iteration 0: log likelihood = -15651.098
      Iteration 1: log likelihood = -15373.345
      Iteration 2: log likelihood = -15373.288
      Iteration 3: log likelihood = -15373.288

      Fitting full model:

      tau = 0.0 log likelihood = -15373.288
      tau = 0.1 log likelihood = -15373.545

      Iteration 0: log likelihood = -15373.545
      Iteration 1: log likelihood = -15373.309
      Iteration 2: log likelihood = -15373.305
      Iteration 3: log likelihood = -15373.297
      Iteration 4: log likelihood = -15373.296
      Iteration 5: log likelihood = -15373.296

      Random-effects logistic regression              Number of obs     =     22,700
      Group variable: id                              Number of groups  =     22,700

      Random effects u_i ~ Gaussian                   Obs per group:
                                                                    min =          1
                                                                    avg =        1.0
                                                                    max =          1

      Integration method: mvaghermite                 Integration pts.  =         12

                                                      Wald chi2(1)      =     448.08
      Log likelihood  = -15373.296                    Prob > chi2       =     0.0000

      ------------------------------------------------------------------------------
         euRefVote |      Coef.   Std. Err.      z    P>|z|     [95% Conf. Interval]
      -------------+----------------------------------------------------------------
          ageGroup |    .204816   .0096758    21.17   0.000     .1858518    .2237802
             _cons |   -.867695   .0497022   -17.46   0.000    -.9651096   -.7702805
      -------------+----------------------------------------------------------------
          /lnsig2u |  -3.344092    2.34073                     -7.931839    1.243655
      -------------+----------------------------------------------------------------
           sigma_u |   .1878623   .2198675                      .0189506    1.862329
               rho |   .0106137   .0245801                      .0001091    .5131989
      ------------------------------------------------------------------------------
      LR test of rho=0: chibar2(01) = 0.02                   Prob >= chibar2 = 0.450



      Does this very small sigma_u imply that the between-variance between my four waves of data is too small for running random effect panel models using logistic regression?
      I also tried running the same random effect logistic regression model with -estimate(0)- specified, but I can't seem to get the syntax right:
      xtlogit euRefVote ImmigSelf, re estimates(0)


      Returning to my current dataset, I have tried running multiple combinations of random effects ordinal logistic regression models on different variables but they all return the same error:

      xtologit euRefVote age ImmigSelf

      Fitting comparison model:

      Iteration 0: log likelihood = -79595.428
      Iteration 1: log likelihood = -67518.53
      Iteration 2: log likelihood = -67411.917
      Iteration 3: log likelihood = -67411.673
      Iteration 4: log likelihood = -67411.673
      _gsem_eval_ordinal(): 3204 matrix found where scalar required
      _gsem_eval_iid__obs(): - function returned error
      _gsem_eval_iid__wrk(): - function returned error
      _gsem_eval_iid(): - function returned error
      mopt__calluser_v(): - function returned error
      opt__eval_nr_v2(): - function returned error
      opt__eval(): - function returned error
      _optimize_evaluate(): - function returned error
      _mopt__evaluate(): - function returned error
      _moptimize_evaluate(): - function returned error
      _gsem_build__start_fixed(): - function returned error
      _gsem_build__start(): - function returned error
      _gsem_build(): - function returned error
      _gsem_parse(): - function returned error
      st_gsem_parse(): - function returned error
      <istmt>: - function returned error
      r(3204);

      end of do-file

      Is my panel data too small for running ordinal logistic regression models? Or is there an undefined variable I should include in the random effects model above?

      Lastly, I ran a series of random effects probit models, which worked and gave me significant results for multiple variables and a small sigma_u, above 3. Would it be better to simply switch to probit regression and random effects probit models instead?

      Thanks!



      • #4
        OK, first let me correct my own error from #2. The option to specify is not -estimates(0)- it's -iterate(0)-. Very sorry about that.
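
        For example, a minimal sketch using the variable names from #3 (the choice of predictor is only illustrative). With -iterate(0)-, Stata stops before taking any maximization steps and reports the estimates at that point, so the numbers are interim values, not converged results:

        ```stata
        * Declare the panel structure, then stop after zero iterations
        xtset id wave
        xtlogit euRefVote ImmigSelf, re iterate(0)
        ```

        The things to look at in that output are sigma_u and any coefficient or standard error that is wildly out of scale.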

        Having looked at the fuller output you showed this time, there is something very wrong here:

        Code:
        Random-effects logistic regression              Number of obs     =     22,700
        Group variable: id                              Number of groups  =     22,700

        Random effects u_i ~ Gaussian                   Obs per group:
                                                                      min =          1
                                                                      avg =        1.0
                                                                      max =          1

        You cannot validly estimate a random effects model when there is one observation per group, because there is no way to identify the group and observation level error terms separately. It is surprising to me that you got the remainder of that output. I would have expected convergence to fail in this example, too. In any case, I would not put any faith in those results.

        Either there is something wrong with your data, or you have -xtset- your data incorrectly. But for a panel data analysis you should have, at least in most cases, more than one observation per panel. If your data only contains one observation per "panel" then you don't really have panel data: you have a simple cross section, and you should analyze it with single-level models such as -logit-.
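
        A quick way to check this, sketched under the assumption that id and wave are the panel and time variables from #1 (n_obs is a made-up helper name, not something in the data):

        ```stata
        * Declare the panel and show participation patterns across waves
        xtset id wave
        xtdescribe

        * Count observations per respondent; if every value is 1,
        * the data are a cross section, not a panel
        bysort id: generate n_obs = _N
        tabulate n_obs

        * Confirm that id and wave together identify each row uniquely
        duplicates report id wave
        ```

        With four waves loaded, the tabulation of n_obs should show values up to 4 for respondents who took part in every wave.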

        I don't know what's going on in the ordinal logistic regression model. But I think your first task is to either fix your data so it's real panel data, or revert to single-level modeling. If you continue to run into problems after doing that, post back.



        • #5
          Hi Clyde,
          Yes, thanks for noticing the problem in the panel data. I went through the datasets I was using, and by mistake I was only using one of the 4 waves of my panel data in my random effects logistic regression model. I have formatted the datasets and I am now running a random effects logistic regression panel model that includes all 4 waves.

          With the four waves included, I ran a couple of random effects logistic regression models using xtlogit with the -iterate(0)- option for different variables. Yet every time sigma_u = 1.527.
          Does this suggest that the between-variance in this panel data is too small to use xtlogit? Should I use xtprobit or ordinal logistic regression instead?
          Last edited by Josefine Reimer Lynggaard; 03 Dec 2017, 03:24.



          • #6
            If sigma_u = 1.527, that is far enough away from zero that it should not be interfering with convergence. Are you seeing any outlandish standard errors or coefficient estimates in the output?

            Are you still experiencing convergence difficulties now that you have expanded your data set? If not, then I would suggest you forget about this and move on. If so, then I think you are left with trying different estimation techniques and the -difficult- option. And don't forget about trying -melogit- and -meqrlogit-, which also estimate the random effects logistic model and use different numerical approaches. If your model is identified, it is likely, though not guaranteed, that one of these approaches will get the model to converge.

            -xtprobit- is another option to consider. I'm not fond of probit models myself because they are difficult to interpret, although the -margins- command will usually tell you everything you really need to know. I don't see how going to ordinal logistic regression will help you. Your outcome has only two levels, and while you can estimate that with ordinal logistic models, the multi-level ordinal logistic likelihood function is even less well behaved than the logistic one, and I would expect that it would be even harder to coax it into convergence.
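
            A minimal sketch of the -xtprobit- route followed by -margins-, reusing variable names from earlier in the thread. The predictor list is only illustrative, and predict(pu0) is one possible choice: it requests the probability of a positive outcome assuming the random effect is zero, which may or may not be the quantity you want.

            ```stata
            * Random effects probit on the full four-wave panel
            xtset id wave
            xtprobit euRefVote age ImmigSelf, re

            * Average marginal effects on the probability scale,
            * evaluated with the random effect set to zero
            margins, dydx(*) predict(pu0)
            ```

            Reading the -margins- output as changes in probability sidesteps most of the difficulty of interpreting raw probit coefficients.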

