  • error using command join from package ftools

    Hello Statalisters

    I am using the command -join- from the package -ftools-, from SSC, in Stata 14.0 to speed up some code on a dataset of ~4 million observations. The following pseudocode is causing a problem:

    Code:
    tempfile full_ds
    save "`full_ds'"
    
    * (transform and collapse the data)
    
    join, into("`full_ds'") by(year)
    Stata returns the following error message:

    Code:
    (obs: 16; levels: 16; method: hash0; dict size: 16)
    invalid file specification
    stata(): 3598 Stata returned error
    join(): - function returned error
    <istmt>: - function returned error
    I have tried saving the file to disk instead of creating a temporary file, but this hasn't helped. I have also tried a traditional -merge 1:m-, which works just fine but will be much too slow on my full dataset. Uninstalling and re-installing -ftools- made no difference either.

    Googling tells me that "3598" is a Mata error; however, I have decided to post my question here as I presume the error is in my use of Stata, not the underlying Mata code. Any suggestions to help me find the source of this problem are much appreciated.
    Last edited by Sebastien Willis; 13 Feb 2019, 02:37.

  • #2
    Sebastien:
    the first step is checking whether your Stata release is fully updated (the answer is probably yes, as you wisely tell the list that you're using Stata 14.0, but just in case...).
    Kind regards,
    Carlo
    (Stata 19.0)


    • #3
      This is a bug in ftools' join.ado (*! version 2.24.4 12mar2018); from a quick glance, the bug is in the subroutine ParseUsing. Report this to Sergio Correia using the email address given in the help file for ftools. Here is a reproducible example:

      Code:
      sysuse auto , clear
      generate id = _n
      preserve
      keep id make price mpg
      tempfile tmp
      save "`tmp'"
      restore
      keep id foreign
      join , into("`tmp'") by(id)
      Best
      Daniel


      • #4
        Thanks both of you for your help. I will report the bug with the example to the package author.

        In this particular instance, I realised it is possible to avoid the -join- statement by using the -merge- option of the -fcollapse- command, should anyone find themselves in a similar situation.
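
        A minimal sketch of that approach, using the auto data from #3 rather than my own dataset. The variable name mean_price is made up for illustration, and I am assuming the -merge- option behaves as described in the -fcollapse- help file (the collapsed statistic is added to the data in memory, which keeps all its observations), so please check it against your installed version of -ftools-:

```stata
* Sketch only: requires -ftools- from SSC (ssc install ftools)
sysuse auto, clear

* Compute the group mean and attach it to every observation in memory,
* instead of collapsing first and joining back afterwards.
* mean_price is a hypothetical name chosen for this example.
fcollapse (mean) mean_price = price, by(foreign) merge

* The data should still have one row per car, now with the group mean
list make foreign price mean_price in 1/5
```

        This replaces the save / collapse / -join- sequence with a single command, so the temporary file is not needed at all.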


        • #5
          daniel klein, thanks for your sharp eye! That was indeed the bug; I fixed it yesterday, but I would have saved quite some time had I read your post in time.

          Sebastien Willis, glad you find that -merge- option useful! With Mauricio Caceres (author of gtools), we've learned that -collapse- can be extended in many ways at a very small cost in terms of code. You can not only "collapse, merge", but also "collapse, append", collapse using your own functions, etc.!
