  • missing value problem with gsvy and gologit2?

    Hello, my first post here although I have been reading the forum for a long time.

    I tried to use gologit2 for my svy data but the error message is puzzling. I am positive that I have no missing values in the data because I dropped those.
    Actually, I am not even sure missing data is the problem here. I am using the most recent version of gologit2.

    Code:
    . nmissing  bday_3 age rengrp educ_4 marst_3 if sample1==1 & female==0
    . svyset psu [pweight=newperwt], strata(strata)
    . gsvy, subpop(if sample1==1 & female==0): gologit2 bday_3 age i.rengrp i.educ_4 i.marst_3, auto nolabel
    matrix has missing values
    an error occurred when svy executed gologit2
    Any insight is appreciated! Thanks!
    Last edited by Jocelyn Li; 07 Dec 2017, 15:01.

  • #2
    I ran gologit2 without gsvy, this time on data that does have missing values for the dependent variable, and gologit2 worked. So I guess the problem is the svy part?
    Code:
    . gologit2 bday_3 age i.rengrp i.educ_4 i.marst_3 if sample1==1 & female==0, auto nolabel
    
    ------------------------------------------------------------------------------
    Testing parallel lines assumption using the .05 level of significance...
    
    Step  1:  Constraints for parallel lines imposed for 7.rengrp (P Value = 0.5841)
    Step  2:  Constraints for parallel lines imposed for 8.rengrp (P Value = 0.0742)
    Step  3:  Constraints for parallel lines are not imposed for 
              age (P Value = 0.00000)
              3.rengrp (P Value = 0.00000)
              1.educ_4 (P Value = 0.00000)
              2.educ_4 (P Value = 0.00000)
              3.educ_4 (P Value = 0.00000)
              2.marst_3 (P Value = 0.00000)
              3.marst_3 (P Value = 0.00000)
    
    Wald test of parallel lines assumption for the final model:
    
     ( 1)  [eq1]7.rengrp - [eq2]7.rengrp = 0
     ( 2)  [eq1]8.rengrp - [eq2]8.rengrp = 0
    
               chi2(  2) =    3.49
             Prob > chi2 =    0.1746
    
    An insignificant test statistic indicates that the final model
    does not violate the proportional odds/ parallel lines assumption
    
    If you re-estimate this exact same model with gologit2, instead 
    of autofit you can save time by using the parameter
    
    pl(1b.rengrp 2o.rengrp 4o.rengrp 5o.rengrp 6o.rengrp 7.rengrp 8.rengrp 0b.educ_4 4o.educ_4 1b.marst_3)
    
    ------------------------------------------------------------------------------
    
    Generalized Ordered Logit Estimates             Number of obs     =     88,262
                                                    LR chi2(16)       =    3481.83
                                                    Prob > chi2       =     0.0000
    Log likelihood = -69434.378                     Pseudo R2         =     0.0245
    
     ( 1)  [eq1]7.rengrp - [eq2]7.rengrp = 0
     ( 2)  [eq1]8.rengrp - [eq2]8.rengrp = 0
    ------------------------------------------------------------------------------------------------
                            bday_3 |      Coef.   Std. Err.      z    P>|z|     [95% Conf. Interval]
    -------------------------------+----------------------------------------------------------------
    eq1                            |
                               age |  -.0041027   .0006267    -6.55   0.000    -.0053311   -.0028744
                                   |
                            rengrp |
            non-Hispanic NB black  |  -.1836288   .0222085    -8.27   0.000    -.2271567   -.1401008
             native-born Hispanic  |  -.0358244   .0262893    -1.36   0.173    -.0873505    .0157017
            foreign-born Hispanic  |  -.8225086   .0274713   -29.94   0.000    -.8763513   -.7686659
                                   |
                            educ_4 |
                              <hs  |  -.0974978   .0257083    -3.79   0.000    -.1478851   -.0471104
                               hs  |  -.1447337   .0196244    -7.38   0.000    -.1831969   -.1062704
    some college/associate degree  |   .0032983   .0189259     0.17   0.862    -.0337958    .0403925
                     BA and above  |          0  (omitted)
                                   |
                           marst_3 |
       separated/divorced/widowed  |   .1463106   .0200851     7.28   0.000     .1069446    .1856766
                    never married  |     .01379   .0180391     0.76   0.445    -.0215659    .0491459
                                   |
                             _cons |  -.3780634   .0323696   -11.68   0.000    -.4415066   -.3146201
    -------------------------------+----------------------------------------------------------------
    eq2                            |
                               age |   .0265874   .0011619    22.88   0.000     .0243101    .0288647
                                   |
                            rengrp |
            non-Hispanic NB black  |   .0952442   .0354787     2.68   0.007     .0257073    .1647811
             native-born Hispanic  |  -.0358244   .0262893    -1.36   0.173    -.0873505    .0157017
            foreign-born Hispanic  |  -.8225086   .0274713   -29.94   0.000    -.8763513   -.7686659
                                   |
                            educ_4 |
                              <hs  |   .9329333   .0457874    20.38   0.000     .8431916    1.022675
                               hs  |    .678657   .0404978    16.76   0.000     .5992828    .7580313
    some college/associate degree  |   .6199999   .0402559    15.40   0.000     .5410998       .6989
                     BA and above  |          0  (omitted)
                                   |
                           marst_3 |
       separated/divorced/widowed  |   .4139872   .0325959    12.70   0.000     .3501003     .477874
                    never married  |   .2074178   .0342287     6.06   0.000     .1403308    .2745047
                                   |
                             _cons |  -4.415686   .0651466   -67.78   0.000    -4.543371   -4.288001
    ------------------------------------------------------------------------------------------------
    
    . nmissing bday_3 age rengrp educ_4 marst_3 if sample1==1 & female==0
    
    bday_3            586



    • #3
      I did more exploration and now I am really puzzled. I used the example data and code that includes gsvy from the gologit2 help file. However, it didn't work either. I got the same error message! What could be going wrong here?

      The first two commands show that I have the most current version of gologit2. Is it possible that even this most current version of gologit2 is not compatible with my Stata 15?

      Code:
      . ssc install gologit2, replace
      checking gologit2 consistency and verifying not already installed...
      all files already exist and are up to date.
      
      . update all
      (contacting http://www.stata.com)
      
      Update status
          Last check for updates:  07 Dec 2017
          New update available:    none         (as of 07 Dec 2017)
          Current update level:    21 Nov 2017  (what's new)
      
      Possible actions
      
          Do nothing; all files are up to date.
      
      . webuse nhanes2f, clear
      
      . gsvy, subpop(female): gologit2 health i.black age c.age#c.age, autofit
      matrix has missing values
      an error occurred when svy executed gologit2
      r(504);



      • #4
        I agree that this is weird.

        I just ran

        Code:
        webuse nhanes2f, clear
        gsvy, subpop(female): gologit2 health i.black age c.age#c.age, autofit
        in Stata 14.2, and it works fine. So something about Stata 15 is zapping gologit2. Further, version control is not helping.

        In the short run, if you have an older version of Stata, try using it. I will try to see what is wrong with 15.
        -------------------------------------------
        Richard Williams, Notre Dame Dept of Sociology
        StataNow Version: 19.5 MP (2 processor)

        EMAIL: [email protected]
        WWW: https://www3.nd.edu/~rwilliam



        • #5
          I think I have found a temporary fix and have emailed it to Jocelyn. If anybody else desperately needs it before there is a permanent fix, you can email me.

          FYI, the problem seems to be with lincom.ado. The 15.1 version differs a little from the 14.2 version, and those changes zap gologit2. Version control does not help. I don't know if any other program in the world is affected, but mine is. gologit2's calls to lincom are usually not necessary so the temporary fix disables them.

          I have written Stata about it. Hopefully they will tweak lincom.ado. But if not I think I can work out a fix so that lincom.ado never needs to be called.
          -------------------------------------------
          Richard Williams, Notre Dame Dept of Sociology
          StataNow Version: 19.5 MP (2 processor)

          EMAIL: [email protected]
          WWW: https://www3.nd.edu/~rwilliam



          • #6
            Thank you so much, Richard!

            I used the fix and reran some of the programs. The code works for the nhanes2f dataset, but this is what I got for my data.

            Code:
            . gsvy, subpop(if sample1==1 & female==0): gologit2 bday_3 age i.rengrp i.educ_4 i.marst_3, auto nolabel
            
            ------------------------------------------------------------------------------
            Testing parallel lines assumption using the .05 level of significance...
            
            Step  1:  Constraints for parallel lines imposed for 7.rengrp (P Value = 0.7697)
            Step  2:  Constraints for parallel lines imposed for 8.rengrp (P Value = 0.3610)
            Step  3:  Constraints for parallel lines are not imposed for
                      age (P Value = 0.00000)
                      3.rengrp (P Value = 0.00000)
                      1.educ_4 (P Value = 0.00000)
                      2.educ_4 (P Value = 0.00000)
                      3.educ_4 (P Value = 0.00000)
                      2.marst_3 (P Value = 0.00000)
                      3.marst_3 (P Value = 0.00000)
            
            Wald test of parallel lines assumption for the final model:
            
            Adjusted Wald test
            
             ( 1)  [eq1]7.rengrp - [eq2]7.rengrp = 0
             ( 2)  [eq1]8.rengrp - [eq2]8.rengrp = 0
            
                   F(  2,   299) =    0.45
                        Prob > F =    0.6409
            
            An insignificant test statistic indicates that the final model
            does not violate the proportional odds/ parallel lines assumption
            
            If you re-estimate this exact same model with gologit2, instead
            of autofit you can save time by using the parameter
            
            pl(1b.rengrp 2o.rengrp 4o.rengrp 5o.rengrp 6o.rengrp 7.rengrp 8.rengrp 0b.educ_4 4o.educ_4 1b.marst_3)
            
            ------------------------------------------------------------------------------
            
            Survey: Generalized Ordered Logit Estimates
            
            Number of strata   =       300                Number of obs     =      933,920
            Number of PSUs     =       600                Population size   =  304,728,355
                                                          Subpop. no. obs   =       88,262
                                                          Subpop. size      = 31,553,033.6
                                                          Design df         =          300
                                                          F(  16,    285)   =       148.80
                                                          Prob > F          =       0.0000
            
             ( 1)  [eq1]7.rengrp - [eq2]7.rengrp = 0
             ( 2)  [eq1]8.rengrp - [eq2]8.rengrp = 0
            ------------------------------------------------------------------------------------------------
                                           |             Linearized
                                    bday_3 |      Coef.   Std. Err.      t    P>|t|     [95% Conf. Interval]
            -------------------------------+----------------------------------------------------------------
            eq1                            |
                                       age |  -.0050333   .0007596    -6.63   0.000     -.006528   -.0035385
                                           |
                                    rengrp |
                    non-Hispanic NB black  |  -.2088175   .0266952    -7.82   0.000     -.261351    -.156284
                     native-born Hispanic  |  -.0386773   .0317493    -1.22   0.224    -.1011569    .0238022
                    foreign-born Hispanic  |  -.7992292   .0333405   -23.97   0.000      -.86484   -.7336184
                                           |
                                    educ_4 |
                                      <hs  |  -.0671374   .0321279    -2.09   0.037    -.1303619   -.0039128
                                       hs  |  -.1561431   .0247395    -6.31   0.000    -.2048281   -.1074582
            some college/associate degree  |   .0156297   .0238495     0.66   0.513    -.0313038    .0625631
                             BA and above  |          0  (omitted)
                                           |
                                   marst_3 |
               separated/divorced/widowed  |   .1254352   .0227145     5.52   0.000     .0807352    .1701351
                            never married  |   .0022532   .0228905     0.10   0.922    -.0427932    .0472995
                                           |
                                     _cons |  -.3463112    .040932    -8.46   0.000    -.4268615   -.2657609
            -------------------------------+----------------------------------------------------------------
            eq2                            |
                                       age |   .0259341   .0013227    19.61   0.000     .0233311     .028537
                                           |
                                    rengrp |
                    non-Hispanic NB black  |   .0625554   .0388302     1.61   0.108    -.0138586    .1389694
                     native-born Hispanic  |  -.0386773   .0317493    -1.22   0.224    -.1011569    .0238022
                    foreign-born Hispanic  |  -.7992292   .0333405   -23.97   0.000      -.86484   -.7336184
                                           |
                                    educ_4 |
                                      <hs  |   .9907265   .0567714    17.45   0.000     .8790058    1.102447
                                       hs  |   .6540072   .0476726    13.72   0.000     .5601922    .7478223
            some college/associate degree  |   .6319154    .045443    13.91   0.000      .542488    .7213429
                             BA and above  |          0  (omitted)
                                           |
                                   marst_3 |
               separated/divorced/widowed  |   .4076962   .0368969    11.05   0.000     .3350867    .4803058
                            never married  |   .2129917   .0413782     5.15   0.000     .1315634    .2944199
                                           |
                                     _cons |  -4.399057   .0778894   -56.48   0.000    -4.552336   -4.245779
            ------------------------------------------------------------------------------------------------
            if not found
            r(111);
            
            end of do-file
            
            r(111);
            I guess the program also worked for my data because I got results and they seem to make sense. However, the error message at the bottom concerns me. Is it a problem, or can I safely ignore it?



            • #7
              It doesn't like the way you are specifying subpop. Instead of using if, try something like

              gen mysample = sample1==1 & female==0

              gsvy, subpop(mysample): gologit2

              Make sure that the mysample var is computed correctly. I'll have to see what is causing this. But at least there is an easy workaround (I think).
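A minimal sketch of that workaround, assuming the variable names from the original post (sample1, female) and the model from #6; mysample is just an illustrative name. In Stata a comparison such as female==0 evaluates to 0 when female is missing, so observations with missing values on these variables fall outside the subpopulation rather than becoming missing on the indicator:

```stata
* Build the subpopulation indicator once (1 = in subpopulation, 0 = out).
generate byte mysample = (sample1 == 1 & female == 0)

* Check the indicator: it should contain only 0s and 1s, no missing values.
tabulate mysample, missing

* Number of flagged observations, to compare with what svy reports.
count if mysample == 1

* Pass the indicator to subpop() instead of an if-expression.
gsvy, subpop(mysample): gologit2 bday_3 age i.rengrp i.educ_4 i.marst_3, auto nolabel
```

The count of flagged observations can be compared with the "Subpop. no. obs" line in the svy output; a large discrepancy would suggest the indicator was not computed as intended.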

              -------------------------------------------
              Richard Williams, Notre Dame Dept of Sociology
              StataNow Version: 19.5 MP (2 processor)

              EMAIL: [email protected]
              WWW: https://www3.nd.edu/~rwilliam



              • #8
                Awesome! Problem solved! Thanks so much Richard!



                • #9
                  OK, more questions.

                  I re-installed Stata 14 on my computer and used it to run my code again. The good news is that the results are identical to what I got from 15. However, this time I would like to use the gamma option, and it doesn't work, even with 14. It seems that gamma doesn't do anything at all when gsvy is used. Is this something that you already knew, Richard?

                  Code:
                  . gsvy, subpop(msample): gologit2 bday_3 $dems, auto nolabel gamma
                  
                  ------------------------------------------------------------------------------
                  Testing parallel lines assumption using the .05 level of significance...
                  
                  Step  1:  Constraints for parallel lines imposed for 7.rengrp (P Value = 0.7697)
                  Step  2:  Constraints for parallel lines imposed for 8.rengrp (P Value = 0.3610)
                  Step  3:  Constraints for parallel lines are not imposed for 
                            age (P Value = 0.00000)
                            3.rengrp (P Value = 0.00000)
                            1.educ_4 (P Value = 0.00000)
                            2.educ_4 (P Value = 0.00000)
                            3.educ_4 (P Value = 0.00000)
                            2.marst_3 (P Value = 0.00000)
                            3.marst_3 (P Value = 0.00000)
                  
                  Wald test of parallel lines assumption for the final model:
                  
                  Adjusted Wald test
                  
                   ( 1)  [eq1]7.rengrp - [eq2]7.rengrp = 0
                   ( 2)  [eq1]8.rengrp - [eq2]8.rengrp = 0
                  
                         F(  2,   299) =    0.45
                              Prob > F =    0.6409
                  
                  An insignificant test statistic indicates that the final model
                  does not violate the proportional odds/ parallel lines assumption
                  
                  If you re-estimate this exact same model with gologit2, instead 
                  of autofit you can save time by using the parameter
                  
                  pl(1b.rengrp 2o.rengrp 4o.rengrp 5o.rengrp 6o.rengrp 7.rengrp 8.rengrp 0b.educ_4 4o.educ_4 1b.marst_3)
                  
                  ------------------------------------------------------------------------------
                  
                  Survey: Generalized Ordered Logit Estimates
                  
                  Number of strata   =       300                Number of obs     =      933,920
                  Number of PSUs     =       600                Population size   =  304,728,355
                                                                Subpop. no. obs   =       88,262
                                                                Subpop. size      = 31,553,033.6
                                                                Design df         =          300
                                                                F(  16,    285)   =       148.80
                                                                Prob > F          =       0.0000
                  
                   ( 1)  [eq1]7.rengrp - [eq2]7.rengrp = 0
                   ( 2)  [eq1]8.rengrp - [eq2]8.rengrp = 0
                  ------------------------------------------------------------------------------------------------
                                                 |             Linearized
                                          bday_3 |      Coef.   Std. Err.      t    P>|t|     [95% Conf. Interval]
                  -------------------------------+----------------------------------------------------------------
                  eq1                            |
                                             age |  -.0050333   .0007596    -6.63   0.000     -.006528   -.0035385
                                                 |
                                          rengrp |
                          non-Hispanic NB white  |          0  (base)
                          non-Hispanic NB black  |  -.2088175   .0266952    -7.82   0.000     -.261351    -.156284
                           native-born Hispanic  |  -.0386773   .0317493    -1.22   0.224    -.1011569    .0238022
                          foreign-born Hispanic  |  -.7992292   .0333405   -23.97   0.000      -.86484   -.7336184
                                                 |
                                          educ_4 |
                                            <hs  |  -.0671374   .0321279    -2.09   0.037    -.1303619   -.0039128
                                             hs  |  -.1561431   .0247395    -6.31   0.000    -.2048281   -.1074582
                  some college/associate degree  |   .0156297   .0238495     0.66   0.513    -.0313038    .0625631
                                   BA and above  |          0  (omitted)
                                                 |
                                         marst_3 |
                    married/living with partner  |          0  (base)
                     separated/divorced/widowed  |   .1254352   .0227145     5.52   0.000     .0807352    .1701351
                                  never married  |   .0022532   .0228905     0.10   0.922    -.0427932    .0472995
                                                 |
                                           _cons |  -.3463112    .040932    -8.46   0.000    -.4268615   -.2657609
                  -------------------------------+----------------------------------------------------------------
                  eq2                            |
                                             age |   .0259341   .0013227    19.61   0.000     .0233311     .028537
                                                 |
                                          rengrp |
                          non-Hispanic NB white  |          0  (base)
                          non-Hispanic NB black  |   .0625554   .0388302     1.61   0.108    -.0138586    .1389694
                           native-born Hispanic  |  -.0386773   .0317493    -1.22   0.224    -.1011569    .0238022
                          foreign-born Hispanic  |  -.7992292   .0333405   -23.97   0.000      -.86484   -.7336184
                                                 |
                                          educ_4 |
                                            <hs  |   .9907265   .0567714    17.45   0.000     .8790058    1.102447
                                             hs  |   .6540072   .0476726    13.72   0.000     .5601922    .7478223
                  some college/associate degree  |   .6319154    .045443    13.91   0.000      .542488    .7213429
                                   BA and above  |          0  (omitted)
                                                 |
                                         marst_3 |
                    married/living with partner  |          0  (base)
                     separated/divorced/widowed  |   .4076962   .0368969    11.05   0.000     .3350867    .4803058
                                  never married  |   .2129917   .0413782     5.15   0.000     .1315634    .2944199
                                                 |
                                           _cons |  -4.399057   .0778894   -56.48   0.000    -4.552336   -4.245779
                  ------------------------------------------------------------------------------------------------
                  The following code doesn't work either with Stata 14.
                  Code:
                  . webuse nhanes2f, clear
                  
                  . gsvy, subpop(female): gologit2 health i.black age c.age#c.age, autofit gamma



                  • #10
                    You don't show the error. If you use the official gologit2 with Stata 14.2, it works fine. If you use the tweaked temporary fix I sent, you get an error, so I guess my temporary fix won't be the permanent one.

                    So, best advice for now: use 14.2 until I get a better fix. If somebody can only use gologit2 with Stata 15, and is getting the types of errors you originally got, I can email them a temporary fix that works so long as they don't use the gamma option. The gamma option uses lincom and lincom in Stata 15 causes gologit2 to have problems, so the temporary fix disables gamma, which most people don't use anyway.

                    Since you re-installed 14, run -update all- to make sure it is up to date.
                    -------------------------------------------
                    Richard Williams, Notre Dame Dept of Sociology
                    StataNow Version: 19.5 MP (2 processor)

                    EMAIL: [email protected]
                    WWW: https://www3.nd.edu/~rwilliam



                    • #11
                      Thanks to Kit Baum, a new version of gologit2 is now on SSC. I think it fixes most of the problems noted in this thread.
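For anyone hitting the same errors, a quick way to pick up the new version and confirm it is the one Stata finds, reusing the help-file example run earlier in this thread. This is only a sketch; whether the error disappears depends on the installed Stata and gologit2 versions:

```stata
* Fetch the current gologit2 from SSC, replacing any older copy.
ssc install gologit2, replace

* Show which gologit2.ado Stata will run and its version line.
which gologit2

* Rerun the example that failed with r(504) under Stata 15.
webuse nhanes2f, clear
gsvy, subpop(female): gologit2 health i.black age c.age#c.age, autofit
```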
                      -------------------------------------------
                      Richard Williams, Notre Dame Dept of Sociology
                      StataNow Version: 19.5 MP (2 processor)

                      EMAIL: [email protected]
                      WWW: https://www3.nd.edu/~rwilliam



                      • #12
                        Hi Richard,

                        I appear to be having issues similar to those above. I am using Stata 15.1 and version 3.1.2 16dec20167 of gologit2.

                        Code:
                        which gologit2
                        c:\ado\plus\g\gologit2.ado
                        *! version 3.1.2 16dec20167 Richard Williams, [email protected]
                        I have set my survey data with a pweight only.
                        Code:
                        svyset [pweight=WS]
                        
                              pweight: WS
                                  VCE: linearized
                          Single unit: missing
                             Strata 1: <one>
                                 SU 1: <observations>
                                FPC 1: <zero>
                        I then run the model
                        Code:
                        svy:gologit2 ocastydr age
                        The output is:
                        Code:
                        (running gologit2 on estimation sample)
                        - invalid name
                        an error occurred while attempting to compute scores
                        r(322);
                        The model runs fine without the prefix, even when the weights are included. I have tried both the gsvy and svy prefixes. An ologit model runs on the same data.

                        Dropping the observations with missing data gives the same error. The example data set ran without a problem.

                        Any help would be greatly appreciated.

                        Kind regards,

                        Tiarnán



                        • #13
                          Try adding the -nolabel- option. It sometimes fixes weird problems.

                          If problems persist, if you are free to share the data and code with me, I will take a look.
                          -------------------------------------------
                          Richard Williams, Notre Dame Dept of Sociology
                          StataNow Version: 19.5 MP (2 processor)

                          EMAIL: [email protected]
                          WWW: https://www3.nd.edu/~rwilliam



                          • #14
                            Thank you very much, that seems to have worked.

                            So is there a problem in my labels, such as an incompatible character, even though the model works without the prefix?

                            Many thanks again



                            • #15
                              You don't show the labels so I don't know what the problem is. The gologit2 help says

                              nolabel causes the equations to be named eq1, eq2, etc. The default is to use the first 32 characters of the value labels and/or the values of Y as the equation labels. Note that some characters cannot be used in equation names, e.g. the space ( ), the period (.), the dollar sign ($), and the colon (:), and will be replaced with the underscore (_) character. Square brackets ([]) and parentheses will be replaced with curly brackets ({}). The default behavior works well when the value labels are short and descriptive. It may not work well when value labels are very long and/or include characters that have to be changed. If the printout looks unattractive and/or you are getting strange errors, try changing the value labels of Y or else use the nolabel option.
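To see whether the value labels of the dependent variable contain any of the characters listed above, one could list them before choosing between relabeling and nolabel. A minimal sketch, assuming the outcome ocastydr and predictor age from #12; the macro name lbl is illustrative:

```stata
* Look up the name of the value label attached to the outcome, if any.
local lbl : value label ocastydr

* List the label text so long labels or special characters are visible.
if "`lbl'" != "" {
    label list `lbl'
}

* Name the equations eq1, eq2, ... instead of deriving them from the labels.
svy: gologit2 ocastydr age, nolabel
```

Run this from a do-file or in one go, since the local macro does not persist across separate interactive runs of a do-file editor selection.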
                              -------------------------------------------
                              Richard Williams, Notre Dame Dept of Sociology
                              StataNow Version: 19.5 MP (2 processor)

                              EMAIL: [email protected]
                              WWW: https://www3.nd.edu/~rwilliam

