
  • Submitting packages with platform-specific content ('g' manifest lines) to SSC

    I authored svmachines some years ago and am now trying to do some maintenance. It was published to SSC, but a number of problems leave the package there broken. The trickiest part is that it uses plugins, which must be built separately for each platform (some or all of: WIN, WIN64, WIN64A, MACINTEL, MACINTEL64, MACARM64, LINUX, LINUX64 ...), and then at runtime the .ado files run
    Code:
    program _svmachines, plugin
    to load the plugin, and it needs to load the plugin that matches the current platform or else sadness ensues.

    [R] net explains a way to cut the knot

    G platformname filename1 filename2. It specifies that the file be installed in system directories only if the user’s operating system is of type platformname. filename1 is the name of the file on the server (the file to be copied), and filename2 is to be the name of the file on the user’s system.

    h filename asserts that filename must be loaded, or this package is not to be installed

    For example, you might code

    G WIN64 mydll.forwin mydll.plugin
    G LINUX64 mydll.forlinux mydll.plugin
    h mydll.plugin

    if you were offering the plugin mydll.plugin for Windows and Linux only.
    I leaned heavily on that, carefully crafting a .pkg file that worked on all platforms, which you can view here and test with

    Code:
    . net link sj
    . net cd software
    . net cd sj16-4
    . net describe st0461
    Actually just for reference here's the complete original manifest:

    Code:
    d SJ16-4 st0461. Support vector machines
    d Support vector machines
    d by Nick Guenther, University of Waterloo,
    d      Waterloo, Canada
    d    Matthias Schonlau, University of Waterloo,
    d      Waterloo, Canada
    d Support:  nguenthe@@uwaterloo.ca,
    d           schonlau@@uwaterloo.ca
    d After installation, type help ^svmachines^,
    d   ^export svmlight^, and ^import svmlight^
    
    g LINUX st0461/_svmachines.forlinux _svmachines.plugin
    g LINUX st0461/_svm_dlopenable.forlinux _svm_dlopenable.plugin
    g LINUX st0461/_svm_getenv.forlinux _svm_getenv.plugin
    g LINUX st0461/_svmlight.forlinux _svmlight.plugin
    g LINUX st0461/_svm_setenv.forlinux _svm_setenv.plugin
    g LINUX64 st0461/_svmachines.forlux64 _svmachines.plugin
    g LINUX64 st0461/_svm_dlopenable.forlux64 _svm_dlopenable.plugin
    g LINUX64 st0461/_svm_getenv.forlux64 _svm_getenv.plugin
    g LINUX64 st0461/_svmlight.forlux64 _svmlight.plugin
    g LINUX64 st0461/_svm_setenv.forlux64 _svm_setenv.plugin
    g MACINTEL64 st0461/_svmachines.formacintel64 _svmachines.plugin
    g MACINTEL64 st0461/_svm_dlopenable.formacintel64 _svm_dlopenable.plugin
    g MACINTEL64 st0461/_svm_getenv.formacintel64 _svm_getenv.plugin
    g MACINTEL64 st0461/_svmlight.formacintel64 _svmlight.plugin
    g MACINTEL64 st0461/_svm_setenv.formacintel64 _svm_setenv.plugin
    G WIN64A st0461/libsvm.forwin64a libsvm.dll
    g WIN64A st0461/_svmachines.forwin64a _svmachines.plugin
    g WIN64A st0461/_svm_dlopenable.forwin64a _svm_dlopenable.plugin
    g WIN64A st0461/_svm_getenv.forwin64a _svm_getenv.plugin
    g WIN64A st0461/_svmlight.forwin64a _svmlight.plugin
    g WIN64A st0461/_svm_setenv.forwin64a _svm_setenv.plugin
    f st0461/attitude_indicators.dta
    f st0461/binary_classification_example.do
    f st0461/class_probability_example.do
    f st0461/export_svmlight.ado
    f st0461/export_svmlight.sthlp
    f st0461/import_svmlight.ado
    f st0461/import_svmlight.sthlp
    f st0461/multiclass_classification_example.do
    f st0461/regression_example.do
    f st0461/svmachines.ado
    f st0461/svmachines_example.ado
    f st0461/svmachines.sthlp
    f st0461/svm_ensurelib.ado
    f st0461/svm_examples.ihlp
    f st0461/_svm_model2stata.ado
    f st0461/svm_predict.ado
    f st0461/_svm_train.ado
    This is quite different from the manifest on SSC -- files are renamed and there are no 'g' lines anywhere.

    Code:
    d 'SVMACHINES': module providing Support Vector Machines for both Classification and Regression
    d
    d     svmachines fits a support vector machine (SVM) model.  SVM is
    d not one, but several, variant models each    based upon the
    d principles of splitting hyperplanes and the culling of
    d unimportant observations.    The basic SVM idea is to find a
    d linear boundary--a hyperplane--in high-dimensional space:  for  
    d classification, this is a boundary between two classes; for
    d regression it is a line near which points    should be--much like
    d in OLS, while simultaneously minimizing the number of
    d observations required to    distinguish this hyperplane.  The
    d unimportant observations are ignored after fitting is done, which
    d    makes SVM very memory efficient.     Each observation can be
    d thought of as a vector, so the support vectors are those
    d observations which the    algorithm deems critical to the fit.
    d
    d KW: log files
    d KW: dual log
    d KW: save results
    d KW: document preparation
    d KW: output
    d
    d Requires: Stata version 13
    d
    d Distribution-Date: 20220321
    d
    d Author: Nick Guenther , University of Waterloo
    d Support: email nguenthe@@uwaterloo.ca
    d
    d Author: Matthias Schonlau, University of Waterloo
    d Support: email schonlau@@uwaterloo.ca
    d
    f svmachines.ado
    f svmachines.sthlp
    f svmlight.sthlp
    f svm_ensurelib.ado
    f svm_predict.ado
    f ../_/_svm_model2stata.ado
    f ../_/_svm_train.ado
    f svm_binary_classification_example.do
    f svm_class_probability_example.do
    f svm_multiclass_classification_example.do
    f svm_regression_example.do
    f svmachines_example.ado
    f ../i/import_svmlight.ado
    f ../i/import_svmlight.sthlp
    f svm_examples.ihlp
    f ../e/export_svmlight.ado
    f ../e/export_svmlight.sthlp
    f ../a/attitude_indicators.dta
    f ../_/_svm_dlopenable_LINUX64.plugin
    f ../_/_svm_dlopenable_MACARM64.plugin
    f ../_/_svm_dlopenable_MACINTEL64.plugin
    f ../_/_svm_dlopenable_WIN64A.plugin
    f ../_/_svm_getenv_LINUX64.plugin
    f ../_/_svm_getenv_MACARM64.plugin
    f ../_/_svm_getenv_MACINTEL64.plugin
    f ../_/_svm_getenv_WIN64A.plugin
    f ../_/_svm_setenv_LINUX64.plugin
    f ../_/_svm_setenv_MACARM64.plugin
    f ../_/_svm_setenv_MACINTEL64.plugin
    f ../_/_svm_setenv_WIN64A.plugin
    f ../_/_svmachines_LINUX64.plugin
    f ../_/_svmachines_MACARM64.plugin
    f ../_/_svmachines_MACINTEL64.plugin
    f ../_/_svmachines_WIN64A.plugin
    f ../_/_svmlight_LINUX64.plugin
    f ../_/_svmlight_MACARM64.plugin
    f ../_/_svmlight_MACINTEL64.plugin
    f ../_/_svmlight_WIN64A.plugin
    f ../l/libsvm_WIN64A.dll
    f ../l/libsvm_MACARM64.dylib
    f ../l/libsvm_MACINTEL64A.dylib

    The SSC guidelines state

    It is not necessary nor desirable to generate a Stata-format package (.pkg) file, since the SSC archive software generates the package file automatically from the RePEc template.
    but they do not explain what that software does. All I know is that it somehow took a working package, dropped all the 'g' lines, renamed a bunch of files, and produced one that users can't use.
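
    One possible workaround on the package side, given the platform-suffixed names in the SSC listing above, is to name the plugin file explicitly when declaring it instead of relying on the default lookup of _svmachines.plugin. The following is only a sketch, not the svmachines source. The local `platform' is hypothetical and hard-coded purely for illustration, and the sketch assumes the suffixed file was installed somewhere on the adopath.

    ```stata
    * Sketch only: load a platform-suffixed plugin by explicit file name.
    * `platform' is a hypothetical local; how it gets set is left open.
    local platform "LINUX64"

    * findfile searches the adopath and leaves the full path in r(fn);
    * it exits with an error if the file cannot be found.
    findfile "_svmachines_`platform'.plugin"

    * using() points the plugin declaration at that file, so the handle
    * _svmachines no longer depends on a file named _svmachines.plugin.
    program _svmachines, plugin using("`r(fn)'")
    ```

    This moves platform selection from install time (the 'g' lines) to load time, so it is a different design from the original manifest rather than a repair of it.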

    I would like to know how to format the zip file I submit to SSC to give the best possible chance that it's imported intact, and also to share that knowledge for future plugin writers.

    I found one thread covering exactly this question -- ironically it predates my package -- but it has no answer. However, I discovered that the package in question has in fact made it to SSC -- with its 'g' lines intact!

    Code:
    d 'SADI': module to compute Sequence Analysis Distance Measures
    d
    d  Several measures of distance between pairs of sequences,
    d including  the simple Hamming distance, the optimal matching
    d distance and its  variant, OMv, as described in Halpin,
    d {it:Sociological Methods and    Research}, 2010, Hollister's
    d localised OM, Lesnard's dynamic Hamming  distance, Time-Warp Edit
    d Distance and a version of Elzinga's number of  common
    d subsequences for duration-weighted spell data. {p_end}  Several
    d utilities are also included.
    d
    d KW:  sequence analysis
    d KW: distance
    d KW: edit distance
    d
    d Requires: Stata version 10
    d
    d Distribution-Date: 20230314
    d
    d Author: Brendan Halpin, University of Limerick, Ireland
    d Support: email brendan.halpin@@ul.ie
    d
    f sadi.sthlp
    f ../a/adjusted_rand_index.mata
    f ../a/ari.ado
    f ../a/ari.sthlp
    f ../c/cal2spell.ado
    f ../c/cal2spell.sthlp
    f ../c/combinadd.ado
    f ../c/combinadd.sthlp
    f ../c/combinprep.ado
    f ../c/combinprep.sthlp
    f ../c/corrsqm.ado
    f ../c/corrsqm.sthlp
    f ../c/cumuldur.ado
    f ../c/cumuldur.sthlp
    f ../d/distances.do
    f ../d/dynhamming.ado
    f ../d/dynhamming.sthlp
    f ../e/elzspelladd.c
    f ../m/maketrpr.ado
    f ../m/maketrpr.sthlp
    f ../m/metricp.ado
    f ../m/metricp.sthlp
    f ../m/mvad.dta
    f ../n/ndsub.ado
    f ../n/ndsub.sthlp
    f ../n/nspells.ado
    f ../n/nspells.sthlp
    f ../o/oma.ado
    f ../o/oma.sthlp
    f ../o/omamatv3.c
    f ../o/omav.ado
    f ../o/omav.sthlp
    f ../p/permtab.ado
    f ../p/permtab.sthlp
    f ../p/permtabga.ado
    f sdchronogram.ado
    f sdchronogram.sthlp
    f sdchronoplot.ado
    f sdchronoplot.sthlp
    f sddiscrep.ado
    f sddiscrep.sthlp
    f sdentropy.ado
    f sdentropy.sthlp
    f sdhamming.ado
    f sdhamming.sthlp
    f sdhollister.ado
    f sdhollister.sthlp
    f sdindexplot.ado
    f sdindexplot.sthlp
    f sdplot.ado
    f sdstripe.ado
    f sdstripe.sthlp
    f stplugin.c
    f stplugin.h
    f ../t/trans2subs.ado
    f ../t/trans2subs.sthlp
    f ../t/trprgr.ado
    f ../t/trprgr.sthlp
    f ../t/turbulence.ado
    f ../t/turbulence.sthlp
    f ../t/twed.ado
    f ../t/twed.sthlp
    f ../u/uthash.h
    g WIN ../o/omamatv3.w32.plugin omamatv3.plugin
    g WIN64A ../o/omamatv3.w64.plugin omamatv3.plugin
    g LINUX ../o/omamatv3.linux.plugin omamatv3.plugin
    g LINUX64 ../o/omamatv3.linux64.plugin omamatv3.plugin
    g MACINTEL64 ../o/omamatv3.plugin.x86_64 omamatv3.plugin
    g MACARM64 ../o/omamatv3.plugin.arm64 omamatv3.plugin
    h omamatv3.plugin
    g WIN ../e/elzspelladd.w32.plugin elzspelladd.plugin
    g WIN64A ../e/elzspelladd.w64.plugin elzspelladd.plugin
    g LINUX ../e/elzspelladd.linux.plugin elzspelladd.plugin
    g LINUX64 ../e/elzspelladd.linux64.plugin elzspelladd.plugin
    g MACINTEL64 ../e/elzspelladd.plugin.x86_64 elzspelladd.plugin
    g MACARM64 ../e/elzspelladd.plugin.arm64 elzspelladd.plugin
    h elzspelladd.plugin
    So what did they do that I did not?

    Thank you for any kind insights you have to share about this niche problem.

  • #2
    For a package that includes binary plugins, g and h lines can be used to identify the OS-specific plugin, which is renamed to a generic name referenced in the h line. SSC packages are produced automatically from RePEc software templates, and that facility knows how to handle .ado, .sthlp, .mata, .dlg, .dta, etc. It does not know how to handle plugins; those require manual intervention to set up the g and h lines. It is preferable to have only one plugin per package, as most do.
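
    Applied to svmachines, and assuming the platform-suffixed file names and relative paths from the SSC listing in #1, the manual setup for a single plugin might look like the lines below. This is an illustrative sketch following the SADI pattern, not the manifest SSC would actually produce.

    ```
    g LINUX64 ../_/_svmachines_LINUX64.plugin _svmachines.plugin
    g MACARM64 ../_/_svmachines_MACARM64.plugin _svmachines.plugin
    g MACINTEL64 ../_/_svmachines_MACINTEL64.plugin _svmachines.plugin
    g WIN64A ../_/_svmachines_WIN64A.plugin _svmachines.plugin
    h _svmachines.plugin
    ```

    Each g line installs its file only on the named platform and renames it to the generic _svmachines.plugin; the h line then refuses the install if no matching file was loaded. Every additional plugin needs its own block like this, which is one reason fewer plugins per package means less manual work.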


    • #3
      Hello Kit, thanks for answering!

      Given that the current copy of svmachines is broken, would you re-import it if I submitted another copy? The last version is archived here; it's not in SSC's layout, but I could arrange it so and send you a zip. I gather that would still impose some manual labour on your end though?

      Is there a public repository containing these software templates that I could examine? It would be a lot easier to debug what went wrong with the last import if I could just see the script that arranged everything and understand its outer limits. I want to help make it as easy as possible to import so that I can make it as easy as possible for my users to install it. By the way, would you or RePEc be open to accepting patches to the import script(s)? It probably wouldn't be too hard to teach it about Stata plugins (I know perl and bash and lots of things, I can probably figure it out).

      I'm working on cutting down the number of plugins in svmachines, though I still plan to distribute two.

      Thanks.
