
  • Updating Stata Journal packages and SSC (the parallel module)

    Hi all,

    Recently I realized that the parallel module, which I maintain, has not been updated in SSC. Although users can still use the GitHub repository (https://github.com/gvegayon/parallel), I think it would be nice to have the latest version on SSC. My problem is that I currently have access to Stata 16 only, which would mean that, if updated, users with older versions of Stata wouldn't be able to use parallel. That said, I have a couple of questions:
    1. Would it be too terrible to update parallel on SSC with a version compiled in Stata 16? How many users would be left out (any idea)?
    2. Are there any volunteers who could help us compile parallel with older versions of Stata?
    3. And finally, is it possible to update the version published with the Stata Journal article?
    Any tips would be highly appreciated.

    George

  • #2
    3. Naturally you may submit an update to the Stata Journal. Submission of an update should be to the same address as used to submit the original article.

    1. If a version on SSC rules out users of versions 15 and earlier, then either that is what it is, or you maintain two commands with different names.

    I imagine you would want to keep the same name(s) in different places.

    Why would updating rule out users of earlier versions anyway? The version statement in the program may name the earliest version the program requires. It doesn't have to be the same as the version of Stata you are currently using unless the program uses features introduced in that version.
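
    To illustrate the point about the version statement, here is a minimal sketch. The command name mycmd is hypothetical, and version 11 is only an assumed minimum; neither is taken from parallel.

    ```stata
    * mycmd.ado (hypothetical command, for illustration only)
    program define mycmd
        * Declare the earliest Stata version whose syntax and behavior
        * the code relies on, not the version used to write it.
        version 11
        syntax varlist(numeric)
        summarize `varlist'
    end
    ```

    Written this way, the ado code itself can be edited in Stata 16 and still run in Stata 11 and later, as long as it uses no newer features.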


    • #3
      The main issue is not the version statement, that's OK; the problem has to do with the binaries. The last time I checked, a command that ships an mlib compiled with a newer version of Stata fails to load in older versions. Am I correct? I don't have any Stata version other than 16, so I wouldn't know.


      • #4
        I would ask StataCorp technical support for guidance here, but as you won't be submitting using Stata 17, I don't follow what is different from your 2019 submission to SJ.


        • #5
          One possibility is to include the Mata code in the main ado-file. That way, the first time parallel is called, the code is compiled using whichever version the caller is running. The downside is that the Mata routines are then private to parallel.ado. Whether that is a problem, I cannot tell.

          A variation of this solution is to include an (a)do-file in the distribution that will (re-)compile the source code. You could have a dedicated subcommand, e.g.,

          Code:
          parallel recompile
          that would then (re-)compile the source using whichever version the caller is currently running.
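
          A minimal sketch of such a recompile routine follows. It assumes the Mata source ships as a file called mypkg.mata and that the library is called lmypkg; both names, and the program name, are hypothetical and are not parallel's actual file names. It also assumes the PERSONAL directory exists and is writable.

          ```stata
          * Hypothetical helper; mypkg.mata and lmypkg are assumed names.
          program define mypkg_recompile
              version 11
              * Locate the Mata source along the ado-path.
              quietly findfile mypkg.mata
              local src `"`r(fn)'"'
              * Drop everything currently held in Mata's memory (a side
              * effect for the user), then compile the source with
              * whichever Stata is running right now.
              mata: mata clear
              quietly run `"`src'"'
              * Save all compiled functions to a library in PERSONAL
              * and refresh the list of libraries that Mata searches.
              mata: mata mlib create lmypkg, dir(PERSONAL) replace
              mata: mata mlib add lmypkg *(), dir(PERSONAL)
              mata: mata mlib index
          end
          ```

          The main command could dispatch to this routine when it sees the recompile subcommand.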


          • #6
            Originally posted by Nick Cox
            I would ask StataCorp technical support for guidance here, but as you won't be submitting using Stata 17, I don't follow what is different from your 2019 submission to SJ.
            Good point, Nick! I guess I was thinking of the case where we want to update it because of new features or fixes.

            Daniel, that's a very interesting idea! I'll give it some thought and see how feasible it is.


            • #7
              George, in this thread here https://www.statalist.org/forums/for...unction-s-code
              daniel klein explained that yes, if you rely on pre-compiled Mata code you're in a sort of Catch-22 situation:
              1. If you compile on Stata 16, users of Stata versions earlier than 16 would not be able to use your command.
              2. If you compile on an early Stata, say Stata 11, it would not run as fast as it could have had you compiled it on Stata 16.

              David Roodman seems to have resolved the problem by having his code compiled for both recent and ancient Statas. E.g., when I try his -boottest- on an ancient Stata 11, I get the following message:
              Code:
              . boottest price
              This version of boottest requires Stata version 13 or later. An older version compatible with Stata 11
              is at https://github.com/droodman/boottest/releases/tag/v2.6.0.
              r(9);
              Originally posted by George Vega
              The main issue is not the version statement, that's OK; the problem has to do with the binaries. The last time I checked, a command that ships an mlib compiled with a newer version of Stata fails to load in older versions. Am I correct? I don't have any Stata version other than 16, so I wouldn't know.
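
              A message like the one boottest prints can be produced with a guard at the top of the ado-file. The sketch below is an assumption about how such a check might look, not boottest's actual code; mycmd and the URL placeholder are hypothetical.

              ```stata
              * Hypothetical guard; not copied from boottest.
              program define mycmd
                  * c(stata_version) is the version of the running Stata.
                  if c(stata_version) < 13 {
                      display as error "This version of mycmd requires Stata version 13 or later."
                      display as error "An older, compatible release is at <URL of the old release>."
                      exit 9
                  }
                  version 13
                  * ... the rest of the command goes here ...
              end
              ```

              The check comes before the version statement so that an older Stata shows the custom message and exits with r(9) instead of stopping at version 13 with its own error.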


              • #8
                Hi George,

                A few thoughts:
                • I have managed to beg, borrow, or steal all Stata versions back to 7, that being the first version I heavily used. Not sure if they all run on my current computer. This is probably not 100% legal, but anyway I give back to StataCorp and the Stata community. I'm happy to do an occasional compile for you.
                • I do regularly get queries about my programs from people running older Stata versions. E.g., as of now, I would not be too surprised to hear from a Stata 13 user.
                • For -parallel- it may suffice to just compile the .mata every time the program is called, as suggested. I've avoided this because compile time for my longer programs can reach 3-5 seconds, which I don't want when, say, running a test that should take 0.05 seconds. But people aren't going to use parallel (which is a great package) unless they're expecting long run times.
                • I settled on what I thought was an elegant solution for staying compatible with old versions while being able to take advantage of newer Mata features I wanted to use if available (notably the panelsum() function). See these lines in the cmp.mata and these in cmp.ado. I ship cmp.mata compiled with an old Stata. Every time the ado runs, it compares the current Stata version with the compile-time version. It only recompiles if there is a discrepancy. It determines the compile version by calling a simple function in cmp.mata/lcmp.mlib--which it can run because initially cmp.mata is compiled in an old version.
                • But this solution does have a downside: some users are working in networked settings in which they lack write permission for the directory where the new lcmp.mlib should go. So this crashes and requires intervention from local IT support. So I have moved away from this approach unless there's a new Mata feature that makes a big difference. Now I'm mostly just shipping mlibs compiled in old versions. Possibly this issue will affect you: if parallel is going to compile every time, you need to make sure it compiles to a directory where it has write access and which the newly spawned Stata instances will be able to access.
                • As of Stata 13, the -net- command accepts https links. This enables direct installation from GitHub. So in new-enough versions of Stata, instead of the sort of message Joro quoted from boottest, which just points to a GitHub repo and requires manual installation, you can tell users to type something like "net install boottest, replace from(https://raw.github.com/droodman/boottest/v2.6.0)". The v2.6.0 is the tag for a GitHub release.
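
                The compare-and-recompile idea from the fourth bullet can be sketched as follows. This is a hedged illustration, not the code in cmp.mata or cmp.ado; mypkg, mypkg_built_with() and mypkg_recompile are hypothetical names, and the two parts are assumed to live in separate files.

                ```stata
                * --- mypkg.mata (hypothetical), shipped compiled with an old Stata ---
                mata:
                // The macro below is expanded when this file is compiled, so the
                // function keeps returning the version of the compiling Stata.
                string scalar mypkg_built_with() return("`c(stata_version)'")
                end

                * --- mypkg.ado (hypothetical) ---
                program define mypkg
                    version 11
                    * Ask the library which Stata compiled it; if the call fails
                    * or the answer differs from the running Stata, rebuild.
                    capture mata: st_local("built", mypkg_built_with())
                    if _rc | "`built'" != "`c(stata_version)'" {
                        mypkg_recompile   // assumed helper that rebuilds lmypkg.mlib
                    }
                    * ... the rest of the command goes here ...
                end
                ```

                The rebuild step is where the write-permission problem from the fifth bullet would surface, so the helper needs a target directory the user can write to.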
                --David
                Last edited by David Roodman; 12 Jun 2021, 07:38.


                • #9
                  @George Vega Dear vegayon, I have a problem when running your command parallel. I am running it in Stata 17.0 on Mac 15.

                  . parallel initialize 2, f
                  N Child processes: 2
                  Stata dir: /Applications/Stata 17/StataSE.app/Contents/MacOS/stata-se

                  . sysuse auto
                  (1978 automobile data)

                  . parallel, by(foreign): egen maxp = max(price)
                  --------------------------------------------------------------------------------
                  Parallel Computing with Stata
                  Child processes: 2
                  pll_id : 3nlji4ert5
                  Running at : /Applications/Stata 17/ado/personal/Replications/内生SFA
                  Randtype : datetime

                  cannot find Stata directory
                  cannot find Stata directory
                  Waiting for the child processes to finish...
                  child process 0001 Exited with error -700- while running the command/dofile (vie
                  > w log)...
                  child process 0002 Exited with error -700- while running the command/dofile (vie
                  > w log)...
                  --------------------------------------------------------------------------------
                  Enter -parallel printlog #- to checkout logfiles.
                  --------------------------------------------------------------------------------
                  2 child processes encountered errors. Throwing last error.
                  r(700);

                  . which parallel
                  /Applications/Stata 17/ado/plus/p/parallel.ado
                  *! version 1.20.0 19mar2019
                  *! PARALLEL: Stata module for parallel computing
                  *! by George G. Vega [cre,aut], Brian Quistorff [aut]
                  *!
                  *! Project website:
                  *! https://github.com/gvegayon/parallel
                  *! Bug reports:
                  *! https://github.com/gvegayon/parallel/issues

                  Can you help me to solve this problem?

                  Best
                  Raymond
                  Last edited by Raymond Zhang; 12 Jun 2021, 08:56.
                  Best regards.

                  Raymond Zhang
                  Stata 17.0,MP


                  • #10
                    Dear all, I have solved this problem by myself. The Stata directory name cannot contain a space between "Stata" and "16". You should change your Stata directory to "/Applications/Stata16/StataSE.app/Contents/MacOS/stata-se"; the right directory name is "Stata16", not "Stata 16". Then it will run successfully.
                    Best regards.

                    Raymond Zhang
                    Stata 17.0,MP


                    • #11
                      David, Joro, thanks for the suggestions. We do have instructions showing how to install parallel directly from the GitHub repo, so I think the idea of including a message with instructions for installing pre-compiled versions would be the way to go.

                      David, thanks for offering the occasional compile; I really appreciate it! I think that could be the way to go (will bug you later for this!).

                      I think it would be great to be able to have a docker image for this. I recently started working with Docker and am now addicted to it. I have an R package for parallel computing called slurmR, and I have found a way to test it on GitHub using GitHub Actions with Docker in a Slurm Environment (see here). My dream would be to have a few Docker images with various versions of Stata, so that way checking and compiling parallel could be done automagically on GitHub.
