  • Error with levelsof but not levels command

    Hi everyone,
    Has anyone seen errors with levelsof that do not occur with the levels command? This curious case arises in the following attempt to find the observation number (_n) at which rho is maximized or MAE is minimized under specific conditions. The `variables' list contains "ihap irlx isad" and `panelrange' is "1/3". The following code, which uses levels, works:


    if "`Ecriterion'"=="rho" | "`Ecriterion'"=="mae" {
        qui gen long obsno=_n
        foreach var of varlist `variables' {
            foreach id of numlist `panelrange' {
                capture noisily {
                    quietly summarize SP_`Ecriterion'_`var'_ID if SP_id_`var'_ID==`id' & SP_d_`var'_ID==`diff', meanonly
                    if "`Ecriterion'"=="rho" loc a="max"
                    else if "`Ecriterion'"=="mae" loc a="min"
                    levels obsno if SP_`Ecriterion'_`var'_ID == r(`a') & SP_id_`var'_ID==`id' & SP_d_`var'_ID==`diff'
                    loc emax`var'`id'=SP_e_`var'_ID[`r(levels)']
                }
            }
        }
    }

    The code is not perfect and could use a clean-up, but it works just fine. I have confirmed this by checking the contents of the `emax`var'`id'' macros. However, curiously, if the "levels" command is replaced with "levelsof", the following errors result:

    SP_e_ihap_ID not found
    SP_e_ihap_ID not found
    SP_e_ihap_ID not found
    SP_e_irlx_ID not found
    SP_e_irlx_ID not found
    SP_e_irlx_ID not found
    SP_e_isad_ID not found
    SP_e_isad_ID not found
    SP_e_isad_ID not found


    This is very strange, I think, because the two commands are often interchangeable (and meant to be so)? If anyone has experienced anything similar or can diagnose the problem, please let me know! The gtools version of levelsof also produces the same errors.

    Thanks for any input and time you can offer!
    Mike

  • #2
    Sorry, this is all with Stata 15.1 MP8.



    • #3
      It is, indeed, very strange. Can you post an example data set that this code can be run with and which reproduces the error? (Use -dataex-.)



      • #4
        Thanks for your reply, Clyde, it sounds like the problem is not obvious (?).

        To reproduce the problem I've saved the data I'm using -- just to be sure. The link is here. If you run the following in a .do file you can see that levels works, but levelsof does not.

        Any help is greatly appreciated!!
        Mike


        Code:
        loc Ecriterion="rho"
        loc variables="ihap irlx isad"
        loc panelrange="1/5"
        loc diff=1
        tempvar obsnO
        
        if "`Ecriterion'"=="rho" | "`Ecriterion'"=="mae" {
            qui gen long `obsnO'=_n
            foreach var in ihap irlx isad {
                foreach id of numlist `panelrange' {
                    capture noisily {
                        quietly summarize SP_`Ecriterion'_`var'_ID if SP_id_`var'_ID==`id' & SP_d_`var'_ID==`diff', meanonly
                        if "`Ecriterion'"=="rho" loc a="max"
                        else if "`Ecriterion'"=="mae" loc a="min"
                        qui levelsof `obsnO' if SP_`Ecriterion'_`var'_ID == r(`a') & SP_id_`var'_ID==`id' & SP_d_`var'_ID==`diff'
                        loc emax`var'`id'=SP_e_`var'_ID[`r(levels)']
                        di "`emax`var'`id'' for `var' and ID `id'"
                    }
                }
            }
        }


        • #5
          Well, I'm able to replicate your problem (in version 16). I can't really trace this to its roots because ultimately -levelsof- calls a Mata function that I don't really understand. By contrast, -levels- does not use Mata and is purely Stata based.

          What I did find out, however, by adding -return list- after the -levelsof- (resp. -levels-) command, is that with -levels- we appropriately get r(N) == 1, but with -levelsof- we get the incorrect result that r(N) == 0. So for some reason -levelsof- is not finding any observations that meet its -if- condition. My first thought was that this might be a precision issue: comparing floating point numbers for exact equality is hazardous due to rounding errors and precision limits. But when I modified the -if- condition to require only that -float(SP_`Ecriterion'_`var'_ID) == float(r(`a'))- (along with the other two conjuncts), the same thing still happened. So I don't really know what is going on here.

          If somebody who is fluent in Mata is following this thread and is interested in taking this further, that would be great. If nobody does, then I think this should go to technical support as a bug in -levelsof-.



          • #6
            Hi Clyde,
            Thanks very much for your time and help with this -- much appreciated. Yes, I also suspected precision issues here given how they plague so many == tests, but neither using float() nor setting the variables to double resolves it. Very curious indeed! Perhaps Nick Cox would have some thoughts if he comes across this thread, given his contribution to the commands? Notably, the same fault exists with glevelsof, so it's not unique to the official Stata version; perhaps Mauricio Caceres Bravo, the author of gtools, would know what's going on.

            Anyway, thanks again!
            Mike



            • #7
              I don't think this is a precision issue. The first thing that the levelsof ado does is to clear r() stored results. This is done before the syntax and marksample commands are used to parse the command and evaluate the if/in condition, so by the time the condition is evaluated r(max) is missing and no observation matches. This is undoubtedly a bug in levelsof. In the meantime, a simple workaround is to copy the stored result to a regular scalar and use that in the condition.

              Here's a quick and simple illustration:

              Code:
              . sysuse auto, clear
              (1978 Automobile Data)
              
              . sum gear_ratio
              
                  Variable |        Obs        Mean    Std. Dev.       Min        Max
              -------------+---------------------------------------------------------
                gear_ratio |         74    3.014865    .4562871       2.19       3.89
              
              . count if gear_ratio == r(max)
                1
              
              . 
              . sysuse auto, clear
              (1978 Automobile Data)
              
              . sum gear_ratio
              
                  Variable |        Obs        Mean    Std. Dev.       Min        Max
              -------------+---------------------------------------------------------
                gear_ratio |         74    3.014865    .4562871       2.19       3.89
              
              . levelsof make if gear_ratio == r(max)
              
              
              . return list
              
              scalars:
                                r(N) =  0
                                r(r) =  0
              
              . 
              . sysuse auto, clear
              (1978 Automobile Data)
              
              . sum gear_ratio
              
                  Variable |        Obs        Mean    Std. Dev.       Min        Max
              -------------+---------------------------------------------------------
                gear_ratio |         74    3.014865    .4562871       2.19       3.89
              
              . scalar maxval = r(max)
              
              . levelsof make if gear_ratio == maxval
              `"Datsun 200"'
              
              .
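
              The same workaround could be applied to the loop in #4 roughly as follows. This is only a sketch: it assumes the SP_* variables from that post are in memory, and the scalar name `best' and the macro names rows and first are hypothetical names chosen for illustration.

              ```stata
              * Sketch: copy r(max) or r(min) into a scalar BEFORE calling levelsof,
              * because levelsof clears r() before it evaluates the if condition.
              local Ecriterion "rho"
              local variables "ihap irlx isad"
              local panelrange "1/5"
              local diff = 1
              tempvar obsno
              tempname best      // hypothetical temporary scalar name

              quietly generate long `obsno' = _n
              if "`Ecriterion'" == "rho" local a "max"
              else if "`Ecriterion'" == "mae" local a "min"

              foreach var of local variables {
                  foreach id of numlist `panelrange' {
                      quietly summarize SP_`Ecriterion'_`var'_ID ///
                          if SP_id_`var'_ID == `id' & SP_d_`var'_ID == `diff', meanonly
                      scalar `best' = r(`a')    // survives the r() reset inside levelsof
                      quietly levelsof `obsno' ///
                          if SP_`Ecriterion'_`var'_ID == `best' ///
                          & SP_id_`var'_ID == `id' & SP_d_`var'_ID == `diff', local(rows)
                      local first : word 1 of `rows'    // first matching row if there are ties
                      if "`first'" != "" {
                          local emax`var'`id' = SP_e_`var'_ID[`first']
                          display "`emax`var'`id'' for `var' and ID `id'"
                      }
                  }
              }
              ```

              Taking only the first word of the returned list guards against ties, and the check on `first' skips panels where nothing matches instead of raising an error.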



              • #8
                Hi Robert,
                Thanks for that and the insight. It's probably best practice to store such things in general just to be sure. It does sound like a bit of a bug in levelsof, or at least quite unexpected given the way most Stata procedures work.

                Thanks again!
                Mike



                • #9
                  I first wrote this command but it’s now official. I had no involvement with the recent rewrite. levels was the name of its predecessor; the name was, I believe, thought too good to use and there was perhaps a vague idea that the name should be kept in reserve for something to do with factor variables.



                  • #10
                    Yes, it is a bug in levelsof. levelsof is clearing r() before handling the if condition, which it should not do. It will be fixed in an update.
                    Bill Sribney (StataCorp)
