  • Using region fixed effects with -xtlogit- command

    Hello, I have a fairly large dataset (over 314,000 observations nested within 310 regions/clusters, for the years 2000 and 2010). I installed -dataex- and read through its instructions, but I do not think that even a sub-sample drawn from the data would fully reproduce the problem. Below I provide the code I used, as well as the error message that follows it. I want to fit an -xtlogit- model with fixed effects. In addition, -xtlogit- would not allow the robust or cluster(region) options. The DV 'poorfairhlth' is a binary variable coded 1 if a person is in poor or fair health, and 0 otherwise. The variables 'povrate' and 'conpov' are region-level measures of poverty and concentration of poverty, respectively. The variable 'race' is an individual's race/ethnicity. The individuals are not the same from 2000 to 2010, but the regions are.
    Code:
    xtlogit poorfairhlth povrate conpov i.race, fe 
    3,150 (group size) take 565 (# positives) combinations results in numeric overflow; computations cannot proceed
    r(1400);
    Thank you for any help.

  • #2
    There isn't anything obviously wrong with your code or what you are trying to do.

    I vaguely remember a similar post some time back. The resolution was that there was a problem with the calculations -xtlogit- did and, if I recall correctly, StataCorp eventually fixed it. So I would say make sure your Stata is fully updated. If you are using an old version of Stata, this may remain unfixed as of that version's final update, in which case you will need to find somebody with a current, fully updated Stata to run this for you.

    I am not completely sure I am remembering this correctly. Certainly if you are running version 15.1 and if, after a complete update, you get the same problem, then something else is going on. Perhaps another Forum member has encountered this and knows what to do. If you don't get help here, this sounds like something you may need to contact technical support about.

    • #3
      Hello, thank you very much for your helpful answer. I had only neglected to mention that I also ran -xtset- before running -xtlogit-:
      Code:
      xtset region // set panel variable 'region'
             panel variable:  region (unbalanced)
      I am currently running Stata/MP version 15.1 (32-bit for Windows). I don't know whether it has the latest available update. I believe it was last updated on my machine on January 11, 2018.
      Last edited by Straso Jovanovski; 01 May 2018, 18:05. Reason: provide more information on last Stata update

      • #4
        The latest update is April 18, 2018. But I think the fix that I'm talking about is from considerably earlier. Still, it never hurts to -update all- and try again.
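
        For reference, a minimal sketch of checking and applying updates from within Stata. Both are standard Stata commands; whether updating resolves this particular error is not established in this thread.

        ```stata
        * Compare the installed update level with what is currently available
        update query

        * Download and install all available official updates
        update all
        ```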

        • #5
          A quick search, combined with experience, yields the following suggestions and questions:

          1. What happens if you include the explanatory variables one by one? Where does the machinery break?

          2. Have you tried estimating a similar model with logit, combined with i.region as an explanatory variable?

          3. Have you tried using the estimates from 2, or from some other model (suggestions even include a standard OLS), as initial values? From what I gather, error 1400 sometimes stems from infeasible starting values, so supplying the model with reasonable starting values might help.
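
          Suggestion 2 could be sketched as follows, using the variable names from the original post. This is only an illustration, not a tested specification for this dataset. The vce(cluster region) option is an assumption, added here because the original poster wanted clustered standard errors.

          ```stata
          * Ordinary logit with region fixed effects entered as indicator variables
          * (vce(cluster region) is an assumption, not part of the suggestion above)
          logit poorfairhlth povrate conpov i.race i.region, vce(cluster region)
          ```

          Since povrate and conpov are measured at the region level, their coefficients in this specification would be identified only from within-region change between 2000 and 2010.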

          • #6
            Hello, thank you very much for your suggestions and help. I will re-enter the error message I am getting in Stata below, since I left out the initial line in that output (I had been running the models using -quietly- so all the various underlying steps were not shown).
            Code:
            . xtlogit poorfairhlth povrate conpov i.race , fe
            note: multiple positive outcomes within groups encountered.
            3,270 (group size) take 587 (# positives) combinations results in numeric overflow; computations cannot proceed
            r(1400);
            
            end of do-file
            
            r(1400);
            .
            To answer question 1 from above: I have tried a simple bivariate version of the regression model and still receive the same message. I have also tried various reduced forms, and swapping in a different single control/predictor variable, with the same result.
            On question 2: I did try using i.region, and it does run and produce estimates. However, there are hundreds of regions and the output is long.
            Lastly, on question 3: I honestly am not familiar with the concept of starting values. I do remember attempting to run a multilevel model using the -melogit- command in the form "melogit depvar indepvar control1 control2 [pw=wt], || region: " (with the vce(cluster region) option) and at one time getting a similar error message. After lengthy processing it would not converge, and the screen would show an endless sequence of iteration lines: ".... Iteration 11208: log pseudolikelihood = -3.740e+08 (backed up) ....". In other instances it would return the error message "initial values not feasible, r(1400)".
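
            On the long output from the i.region model mentioned above, one possible way to show only the coefficients of interest is sketched here. It assumes the same variable names as in the original post, and the display formats are arbitrary choices.

            ```stata
            * Fit the model without printing the several hundred region coefficients
            quietly logit poorfairhlth povrate conpov i.race i.region

            * Show coefficients and standard errors for the two poverty measures only
            estimates table, keep(povrate conpov) b(%9.4f) se(%9.4f)
            ```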

            • #7
              Dear Straso Jovanovski,

              I fear the problem is that the computational cost of xtlogit increases very quickly with the number of observations per group, and you have thousands. So the problem is not just a Stata limitation: estimating such a model would take many years.
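
              To give a sense of the magnitudes involved, here is a rough sketch of the count named in the error message ("3,270 take 587"), computed on the log scale. The group size and number of positives are taken from the error output earlier in the thread. The scalar name lnC is just illustrative, and this shows only the size of the count, not how xtlogit computes its likelihood internally.

              ```stata
              * Log of "3,270 choose 587", computed via log factorials to avoid overflow
              scalar lnC = lnfactorial(3270) - lnfactorial(587) - lnfactorial(3270 - 587)
              display "ln(number of combinations) = " lnC

              * Log of the largest value a double can hold, for comparison
              display "ln(largest double)         = " ln(c(maxdouble))
              ```

              The first value works out to roughly 1,500, while the second is about 709, so the count itself cannot be stored as a double.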

              Fortunately, in your case, solution 2 proposed by Ariel Karlinsky will be feasible and lead to valid results as long as you have "large" samples in each group.

              Best wishes,

              Joao

              • #8
                Thank you very much, Joao and Ariel, for all of your help. I understand the suggested solution and will go in that direction.

                I noticed something recently that may be related. I tried again to run a multilevel mixed-effects logit model with -melogit-. It runs fine without the person-level sampling (probability) weights, and in fact takes only a few minutes to complete and produce estimates. It breaks down when I add the weights.

                Here is what I get:
                Code:
                . melogit poorfairhlth povrate conpov i.race [pw=wt], vce(cluster region) || region:  
                
                Fitting fixed-effects model:
                
                Iteration 0:   log likelihood = -1.258e+08  
                Iteration 1:   log likelihood = -1.254e+08  
                Iteration 2:   log likelihood = -1.254e+08  
                Iteration 3:   log likelihood = -1.254e+08  
                
                Refining starting values:
                
                Grid node 0:   log likelihood =          .
                Grid node 1:   log likelihood =          .
                Grid node 2:   log likelihood =          .
                Grid node 3:   log likelihood =          .
                (note: Grid search failed to find values that will yield a log likelihood value.)
                
                Fitting full model:
                
                initial values not feasible
                r(1400);
