
  • Stata arbitrarily (but not randomly) resetting plugins

    I'm posting this mostly for the search engines. I wrote to Stata tech support about this issue a month ago and haven't heard back.

    I created a package that lets you run Julia inside of Stata. (reghdfejl uses it for a big speed-up over reghdfe on hard problems. Try it!) At the core of the package is a C++ plugin for Stata, written in compliance with the Stata plugin interface. In normal usage, the plugin is called many times, as a sequence of commands is sent to Julia. The intention is that the state of the Julia environment is preserved between calls.

    I noticed that the package would occasionally cause Stata to hard-crash. Sometimes there is a crash report; sometimes the Stata windows just disappear. I eventually figured out that occasionally Stata would just reset or reload the plugin, zapping the Julia environment. The plugin would then proceed as if the Julia shared library had been loaded when it effectively no longer was. A pointer that was supposed to point to a function in the Julia system would in fact point to nothing. Calling the function would crash Stata.
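
    One defensive pattern against this failure mode, sketched under assumptions rather than taken from the actual package: keep the resolved function pointer in static storage, which is reset to null whenever the shared library is reloaded, and check it on every entry before calling through it. The names `eval_fn`, `fake_eval`, `ensure_julia_loaded`, `guarded_call`, and `simulate_plugin_reload` are all hypothetical, and `fake_eval` merely stands in for a function that would really be looked up in the Julia library.

```cpp
#include <cassert>
#include <string>

// Hypothetical type of a function pointer resolved from the Julia library.
using eval_fn = int (*)(const std::string&);

// Static storage: reset to these initial values every time the plugin's
// shared library is (re)loaded, which is what makes the check below work.
static eval_fn jl_eval = nullptr;
static int demo_init_count = 0;  // demo only: counts (re)initializations

// Stand-in for the real Julia entry point (assumption for illustration).
static int fake_eval(const std::string& cmd) {
    return static_cast<int>(cmd.size());
}

// Resolve the pointer if it is missing; a real plugin would load the
// Julia shared library and look up its symbols here.
static bool ensure_julia_loaded() {
    if (jl_eval != nullptr) return true;   // still valid from an earlier call
    jl_eval = &fake_eval;
    ++demo_init_count;
    return jl_eval != nullptr;
}

// Mimics the effect of a plugin reload on static storage.
static void simulate_plugin_reload() { jl_eval = nullptr; }

// Entry point used on every call: never dereference an unchecked pointer.
int guarded_call(const std::string& cmd) {
    if (!ensure_julia_loaded()) return 198;  // arbitrary nonzero error code
    assert(jl_eval != nullptr);
    return jl_eval(cmd);
}
```

    A guard like this would only turn the hard crash into a clean re-initialization. Whatever was held in the Julia session before the reload would still be gone.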

    I created the following reproducible demonstration. I found that it "works" in Stata 19 in Windows, macOS, and Ubuntu Linux.

    The complete code of the demo plugin is in a file called jl2.cpp. The plugin initializes a global variable called `state` to 0 when it is loaded. Each time it is called, it sets `state` to the number of arguments it receives, printing the value before and after doing so.
    Code:
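    #include <string>
    #include "stplugin.h"
    using namespace std;

    // Global state: set to 0 each time the plugin library is loaded.
    int state = 0;

    STDLL stata_call(int argc, char* argv[]) {
        // Report the value left behind by the previous call (0 after a fresh load).
        SF_error((char *) (to_string(state) + " is the state on entry\n").c_str());

        // Record the number of arguments passed in this call.
        state = argc;

        SF_error((char *) (to_string(state) + " is the state on exit\n").c_str());
        return 0;
    }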
    Compile this into jl2.plugin according to StataCorp's instructions for building plugins.

    I created this ado file to wrap the plugin. I call it "jl2.ado":
    Code:
    program define jl2
      plugin call _julia, `*'
    end
     
    program _julia, plugin using(jl2.plugin)
    I find that, in a fresh Stata session, running the following peculiar sequence causes the state variable to reset to 0 between the two calls to jl2. It should not do that:
    Code:
    sysuse auto
    jl2 a
    scatter mpg length
    qui reg trunk
    qui esttab
    jl2 a b
    Log:
    Code:
    . sysuse auto
    (1978 automobile data)
     
    . jl2 a
    0 is the state on entry
    1 is the state on exit
     
    . scatter mpg length
     
    . qui reg trunk
     
    . qui esttab
     
    . jl2 a b
    0 is the state on entry
    2 is the state on exit
    (This does require the user-written esttab command, which is part of the estout package on SSC.)

    Notice that at the bottom jl2 shows the state as having reset from 1 to 0, which it should not have done. Making small changes to this example, like dropping the esttab command, makes the problem go away.

    I don't know if my assumption that Stata preserves a plugin's state between calls was just wrong. But almost always, it holds. And I need it to hold so that the julia package does not (occasionally) crash Stata.

  • #2
    Update: Hua Peng (StataCorp) explained to me that, to manage memory, Stata will sometimes unload ado files. If all ado files referencing a plugin are unloaded, then the plugin may be unloaded too. There isn't a perfect way to prevent this.

    In my two packages that call -jl-, boottest and reghdfejl, I added lines to declare extra programs to wrap the plugin. Example:

    Code:
    capture program _julia_reghdfejl, plugin using(jl.plugin)
    in reghdfejl.ado. With this in place, jl.plugin is at risk of unloading only if Stata first unloads both jl.ado and reghdfejl.ado, which is an improvement.
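
    To see why the extra declaration helps, here is a toy model of the rule as described above. It is not Stata's implementation: `PluginModel` and its members are hypothetical, and the model assumes the plugin is always dropped as soon as the last referencing ado file is unloaded, whereas in Stata it only may be.

```cpp
#include <cassert>
#include <set>
#include <string>

// Toy model only; all names here are assumptions for illustration.
struct PluginModel {
    std::set<std::string> loaded_ados;  // ado files in memory that declare the plugin
    int state = 0;                      // stands in for the plugin's global state
    bool loaded = false;                // whether the plugin library is in memory

    // An ado file that declares the plugin is brought into memory.
    void load_ado(const std::string& ado) {
        loaded_ados.insert(ado);
        loaded = true;
    }

    // Stata drops an ado file; the plugin goes only with its last referrer.
    void unload_ado(const std::string& ado) {
        loaded_ados.erase(ado);
        if (loaded_ados.empty() && loaded) {
            loaded = false;
            state = 0;  // a later reload starts from freshly initialized globals
        }
    }
};
```

    In this model, with both jl.ado and reghdfejl.ado declaring the plugin, unloading jl.ado alone leaves `state` intact. It is lost only once reghdfejl.ado is unloaded as well.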
