  • xsmle: initial values not feasible - numerical overflow

    Hi all,

    I have a problem using the xsmle command for spatial panel estimations. I'm trying to run:
    Code:
    xsmle rate_t rate_sol_t region_type s_for s_m_u25 diff_s_vote s_yw_pc s_no_t s_u, re wmatrix(W) model(sar)
    But get the error message:
    Code:
    initial values not feasible
    r(1400);
    The error code translates into "numerical overflow".

    So here are my questions:
    1. Why is this error occurring? My panel contains 5226 obs, so it is hardly too large for Stata 13 MP to handle. I also tried different models, weight matrices, var-lists (down to only one independent var), and time-frames (down to 2 years), and the error persists.
    2. I've read that with other ML-estimations, it might be possible to set initial values manually, yet I'm not quite clear on how to do that for xsmle. Is that even possible? And exactly what values would have to be set?

    I would be grateful for any help or advice you guys could give me.

    Best regards,

    Tim Umbach




    Last edited by Tim Umbach; 16 Jun 2017, 04:04. Reason: Added Tags

  • #2
    Welcome to Statalist, Tim.

    I note that xsmle is a user-written command rather than part of the official Stata distribution. The result of search xsmle shows three sources for it: Stata Journal package st0470, the SSC archives, and an author's website. The version from the latter two sources seems to be 1.4.5 from June 2017, while the SJ version is 1.4.4 from December 2016. If the SJ version is what you have (the output of ado dir will show st0470), you should consider replacing it with the version from one of the other sources.
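
    To check which version is actually installed and, if needed, replace it, something like the following should work. This is only a sketch: it assumes the SJ package is installed under the name st0470, as described above, and that the SSC copy is the one you want.
    Code:
    * show the path and version line of the xsmle that Stata actually finds
    which xsmle
    * list installed user-written packages (look for st0470)
    ado dir
    * remove the SJ package, then install the current SSC version
    ado uninstall st0470
    ssc install xsmle, replace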

    With regard to setting initial values, help xsmle shows "maximize options" among the options, and clicking on that scrolls the display to the section

    maximize_options: difficult, technique(algorithm_spec), iterate(#), [no]log, from(init_specs), tolerance(#),
    ltolerance(#), nrtolerance(#), and nonrtolerance; see [R] maximize. These options are seldom used.
    So these are the standard maximization options within Stata's maximization code. And from that, turning to the maximize section of the Stata Base Reference Manual PDF, or looking at help maximize, shows us that the from() option is the mechanism for setting starting values.
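
    As a minimal illustration of from() with an official command (not xsmle itself; the auto dataset and the logit model are just stand-ins), starting values can be passed as a matrix:
    Code:
    sysuse auto, clear
    * fit once and keep the coefficient vector
    logit foreign mpg weight
    matrix b0 = e(b)
    matrix list b0
    * refit, starting the optimizer at b0 (values are matched by column name)
    logit foreign mpg weight, from(b0)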



    • #3
      Hi William,

      thank you very much for your advice. Unfortunately, updating and setting initial values only partially fixed the problem.
      I used reg to obtain initial values and used from() to set them, but when I run xsmle with the entire panel it still produces the same error. However, if I reduce the number of years from 13 to 7 or so, the code works. Would changing the kind of regression used to obtain initial values do anything? I also tried xtreg, with no difference.

      Do you think this is a limitation of my hardware? (I'm running Stata 13 MP on a current i5 processor with 8GB RAM.) Would set memory do anything useful? I currently use set max_mem.

      I don't think there is anything wrong with my data, as it does not matter which 7 year period I choose.

      If you (or anybody else) have any further ideas, I would be very grateful.

      Code:
      . xtset region_id year
             panel variable:  region_id (strongly balanced)
              time variable:  year, 2006 to 2015
                      delta:  1 unit
      
      . reg rate_t rate_sol_t region_type s_for s_m_u25 s_tot_o65 diff_s_vote s_yw_pc s_no_t s_u s_u_lt dhi_pc b_pol b_den b_czk b_aut b_swi b_fra
      > b_lux b_bel b_nel
      
            Source |       SS       df       MS              Number of obs =    4020
      -------------+------------------------------           F( 20,  3999) =  436.65
             Model |  2.2464e+10    20  1.1232e+09           Prob > F      =  0.0000
          Residual |  1.0287e+10  3999   2572313.1           R-squared     =  0.6859
      -------------+------------------------------           Adj R-squared =  0.6843
             Total |  3.2751e+10  4019   8148970.7           Root MSE      =  1603.8
      
      ------------------------------------------------------------------------------
            rate_t |      Coef.   Std. Err.      t    P>|t|     [95% Conf. Interval]
      -------------+----------------------------------------------------------------
        rate_sol_t |  -32.38778   4.035666    -8.03   0.000    -40.29994   -24.47563
       region_type |    3437.14   81.93767    41.95   0.000     3276.496    3597.783
             s_for |   80.96482   8.475192     9.55   0.000     64.34872    97.58092
           s_m_u25 |  -355.7466   52.02088    -6.84   0.000    -457.7365   -253.7567
         s_tot_o65 |  -45.43011   14.96746    -3.04   0.002    -74.77467   -16.08555
       diff_s_vote |  -8.003447   3.735393    -2.14   0.032     -15.3269   -.6799956
           s_yw_pc |   3.185143   .2676818    11.90   0.000     2.660337    3.709949
            s_no_t |   14.33078   14.20213     1.01   0.313    -13.51331    42.17487
               s_u |   412.6714   25.81091    15.99   0.000     362.0677    463.2752
            s_u_lt |   24.40051    24.0888     1.01   0.311    -22.82697      71.628
            dhi_pc |  -.1429188   .0144692    -9.88   0.000    -.1712865   -.1145511
             b_pol |    636.543   181.5281     3.51   0.000     280.6467    992.4392
             b_den |   140.6781   302.9236     0.46   0.642    -453.2211    734.5773
             b_czk |  -389.8239   141.8957    -2.75   0.006    -668.0185   -111.6293
             b_aut |     857.41    149.658     5.73   0.000     563.9968    1150.823
             b_swi |    1484.18   301.3307     4.93   0.000     893.4037    2074.956
             b_fra |   1128.569   166.0384     6.80   0.000     803.0416    1454.097
             b_lux |  -954.9572   317.1258    -3.01   0.003    -1576.701   -333.2138
             b_bel |   1125.069   272.5064     4.13   0.000     590.8045    1659.333
             b_nel |   1149.353   170.0713     6.76   0.000     815.9186    1482.788
             _cons |   10377.05   678.1937    15.30   0.000     9047.417    11706.69
      ------------------------------------------------------------------------------
      
      . matrix b = get(_b)
      
      . xsmle rate_t rate_sol_t region_type s_for s_m_u25 s_tot_o65 diff_s_vote s_yw_pc s_no_t s_u s_u_lt dhi_pc b_pol b_den b_czk b_aut b_swi b_fr
      > a b_lux b_bel b_nel , re wmatrix(W1) ematrix(W1) model(gspre) from(b, skip) difficult
      initial values not feasible
      r(1400);
      Last edited by Tim Umbach; 16 Jun 2017, 12:17.



      • #4
        You've made some progress, at least. That the smaller model (fewer years) converges is important information.

        I will say I am not a user of xsmle, but that won't prevent me from giving advice from my broader experience.

        First, I will confidently assert that the problem is not a function of the hardware, including memory, on which you are running. It's a nice excuse for buying a new toy, but I expect the result would be disappointing. Hardware constraints generally don't manifest themselves this way.

        Second, my reading of help xsmle for the gspre model tells me it includes a random effect at the panel level. The regress command with which you obtained your initial parameter estimates does not. Perhaps if you were to replace regress with xtreg ... , re the starting values would be improved.

        Third, I note the existence of a number of parameters b_pol b_den ... which look like names one would give indicator (dummy) variables for nationality. If that is the case, and you have used a categorical variable that gives nationality to generate these indicator variables, you might consider switching to using factor variables (help factor variables), which help xsmle says are supported, on the slim chance that xsmle takes that additional information (that is, that the variables are 0/1 and constrained to at most one of the set being 1) into account in its calculations. I'm not persuaded by my argument, however, and would view this as a last resort.

        Finally, the authors do include their names and email addresses in help xsmle as well as in the net describe document, where they qualify those addresses with the label "support". If all else fails, and you aren't lucky enough to have an actual user of xsmle see this topic and advise you, you might write to them and ask for suggestions. I notice that most topics about xsmle consist of only the original post, suggesting that expertise on it is thin on the ground on Statalist. That said, Federico Belotti, one of the three authors, gave advice back in 2015, although not more recently.

        Good luck!



        • #5
          Thanks again. Yes, some progress has been made, but nothing beyond that yet. The vars b_pol etc. are indeed dummies; however, they denote whether a region is a border region and were coded by hand, so there is no categorical variable to fall back on. But I don't think it matters, because the problem doesn't change if I remove them from the equation.

          I've now written to the authors of xsmle and will post the answer here, in case someone else encounters this problem.



          • #6
            Thanks for the feedback on indicator variables; my guess as to their meaning was incorrect, obviously. Were you able to obtain initial values using xtreg ..., re rather than regress?



            • #7
              Ok, so I've solved the problem, with the help of Gordon Hughes, one of the authors of xsmle (my thanks again). The issue is that (now rather obviously) xtreg ..., re only provides coefficients for the variables that are not spatially lagged. But the entire list of coefficients that need to be initialized looks like this:
              Code:
                          Main:         Main:         Main:         Main:         Main:
                  rate_sol_b~g        rate_t         s_for       s_m_u25     s_tot_o65
              y1    -.24585949     .00557478      4.065996     15.517071     6.4535302
              
                          Main:         Main:         Main:         Main:         Main:
                   diff_s_vote       s_yw_pc        s_no_t           s_u        s_u_lt
              y1     1.6286804     .10150995     1.0054709     4.1077511     .19045279
              
                          Main:         Main:         Main:         Main:         Main:
                        dhi_pc         b_pol         b_czk         b_aut         b_swi
              y1     .00151508    -16.686157    -48.398926     -56.92746    -72.079627
              
                          Main:         Main:         Main:         Main:           Wx:
                         b_fra         b_bel         b_nel         _cons           s_u
              y1     4.8284097     36.742601     51.301852     -260.3339    -.62917013
              
                            Wx:           Wx:           Wx:      Spatial:     Variance:
                        s_no_t   diff_s_vote        dhi_pc           rho     lgt_theta
              y1    -.04329075     .14526405     .00014957     .03820055    -1.8238058
              
                      Variance:
                      sigma2_e
              y1     1342.7501
              So my solution for this problem is to manually set the initial values of the spatially lagged coefficients and rho to 0 and of the variance parameters to 1. The command that works looks like this, where b1 is the coefficient vector of the xtreg ..., re regression:
              Code:
              xsmle rate_burg rate_sol_burg rate_t s_for s_m_u25 s_tot_o65 diff_s_vote s_yw_pc s_no_t s_u s_u_lt dhi_pc b_pol b_czk b_aut ...
              b_swi b_fra b_bel b_nel , re wmatrix(W1) ematrix(W1) from(b1 0 0 0 0 0 1 1, copy skip) model(sdm) difficult durbin( s_u s_no_t diff_s_vote dhi_pc)
              Alternatively, I could have used the coefficients from an xsmle regression on a smaller part of the panel that worked.
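
              A sketch of building that starting vector as an explicit matrix instead. The names y, x1, x2, x3 and W are hypothetical stand-ins, and the parameter order (Main, Wx, rho, lgt_theta, sigma2_e) is taken from the listing above for the random-effects SDM; it is worth checking it against matrix list e(b) from a run that does converge.
              Code:
              * starting values for the non-spatial coefficients and _cons
              xtreg y x1 x2 x3, re
              matrix b1 = e(b)
              * append: 1 Wx coefficient and rho set to 0, lgt_theta and sigma2_e set to 1
              matrix init = b1, J(1, 2, 0), J(1, 2, 1)
              matrix list init
              * copy assigns the values by position, so the column names of init are ignored
              xsmle y x1 x2 x3, re wmatrix(W) model(sdm) durbin(x1) from(init, copy) difficult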

              So thanks again for all your help, and all the best,

              Tim Umbach
              Last edited by Tim Umbach; 22 Jun 2017, 04:05.



              • #8
                Greetings. I am getting similar errors for my spatial error model. From this post, I understand that I have to set initial values using the from() option, but I am trying to understand what values I should put in it. I also noticed that I have to provide n+3 values (n = number of variables). What do the extra 3 values stand for? Lastly, in "from(b1 0 0 0 0 0 1 1, copy skip)", what is b1 and how can I define it?
                Thanks in advance.
