  • SPSS file becoming corrupted when saved as Stata file

    I have a very large (about 3 million rows and about 50 variables) uncorrupted file created in SPSS (Version 24). I am trying to convert it to Stata but when I try to save it as a .dta file I get the message: "The file does not record <strls> where expected. Either the file was written incorrectly in the first place or the file has become corrupted."

    Although the data editor shows no data when I try to open the file, the corrupted file is taking up several megabytes of memory, so the data is in there somewhere!

    The file works fine in SPSS, so I don't understand why I am getting this message. All I can think of is that the "Save As" options offered by SPSS go only up to Version 13 I/C (which I selected), whereas I am running Stata Version 14 I/C. However, I would not have thought it was a problem to be running a newer version of Stata.

    Any advice gratefully received. Thank you.

  • #2
    One possibility that has worked for others is saving the dataset in Stata 12 format, so I would advise you to try options other than the most recent one. It appears that SPSS is creating strL variables for some of its string variables; if you go back to a format older than Stata 13, it's possible SPSS will handle them differently.

    At some point, Stata changed the internal layout of its .dta datasets, specifically the storage of strL variables, and non-Stata programs that try to create .dta datasets were broken and required updating. So, also, be sure you have the latest update in your copy of SPSS, and if you have access to SPSS support, perhaps they can advise you further.


    • #3
      Thanks for the advice William, I will try that.


      • #4
        I am unclear. Is SPSS giving the error when it saves or is Stata giving the error when it reads?

        There are various ways to convert files. If William’s idea doesn’t work we can toss out some others.
        -------------------------------------------
        Richard Williams, Notre Dame Dept of Sociology
        Stata Version: 15.1MP (2 processor)

        EMAIL: rwilliam@ND.Edu
        WWW: https://www3.nd.edu/~rwilliam


        • #5
          Hi Richard. The file appears to save from SPSS with no problems. The error message appears when I try to open the resultant Stata file.
          I've not been able to try William's solution yet as I won't have access to my data until Monday morning (11.02.19).


          • #6
            If you can afford it, or can get someone else to do it for you, Stat/Transfer is often the best solution. I only use it a few times a year, but it is invaluable when I do.

            https://stattransfer.com/

            The freebie download is crippled but should let you know whether the full version will meet your needs.

            Sergiy Radyakin's usespss can often do the trick. I don't know if it works with SPSS 24 or not. See

            http://www.radyakin.org/transfer/use...espss_faq.html

            When it works, rioweb is free and very convenient:

            https://gallery.shinyapps.io/rioweb/

            Your file is huge and apparently contains string variables. That may be making life more difficult than you would like. If you don't need the string variables consider dropping them.
            -------------------------------------------
            Richard Williams, Notre Dame Dept of Sociology
            Stata Version: 15.1MP (2 processor)

            EMAIL: rwilliam@ND.Edu
            WWW: https://www3.nd.edu/~rwilliam


            • #7
              I wonder whether saving as .xls or .csv, then importing to Stata wouldn't produce a better result. If the issue persists, then it was not related to the SPSS origin itself.
              Best regards,

              Marcos


              • #8
                Thanks Richard and Marcos for your suggestions. The solution of removing string variables, oddly enough, had not occurred to me! Replacing them with nominal variables would be an easy fix if they alone are causing the problem.
                I wasn't aware of the Stat/Transfer system but that looks worth a try as well.
                Unfortunately the .xls or .csv option won't easily work for me, as the version of Excel I use has a maximum of 1 million rows or so. It might be possible to break the files down and import them into Stata piece by piece, but the SPSS file I am currently working with is only the first of many (and some of the ones to come will be larger still), and so this process might become quite time-consuming.
                Anyway I think I have a few options to try now. Many thanks again for your assistance.


                • #9
                  You don’t have to read the data into Excel, as the CSV option just produces plain ASCII text that Stata can import directly. But go for a more direct transfer if you can. Otherwise you may have to redefine labels, missing value codes, etc.
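To illustrate that a CSV export is just text and never has to pass through Excel, here is a minimal Python sketch that streams a delimited file in fixed-size pieces. The helper name `iter_chunks` and the sample data are invented for the example; within Stata itself the import would normally be done with `import delimited` and no chunking at all.

```python
import csv
import io
from itertools import islice

def iter_chunks(lines, chunk_size):
    """Yield (header, rows) pairs holding at most chunk_size data rows each."""
    reader = csv.reader(lines)
    header = next(reader, None)
    if header is None:  # empty input: nothing to yield
        return
    while True:
        rows = list(islice(reader, chunk_size))
        if not rows:
            break
        yield header, rows

# Hypothetical three-row export; a real file would be opened with open(path, newline="").
sample = io.StringIO("id,region\n1,North\n2,South\n3,East\n")
for header, rows in iter_chunks(sample, 2):
    print(header, len(rows))
```

Because only one chunk is held in memory at a time, the row limit of a spreadsheet program does not come into play.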

                  My guess is that SPSS is not doing a good job of converting, perhaps because of the string variables. If you don’t need them or can convert them to numeric, you may have a fighting chance. Or, as William suggests, SPSS may be better at saving a Stata 12 file than it is Stata 13.

                  Depending on your status, you may be able to get Stat/Transfer for between $39 and $149 a year. If you wind up wasting ridiculous amounts of time on this, especially if you will have to do this with other files, I would splurge and buy Stat/Transfer. It may depend on whether you have more time or money. Just try the demo first and make sure it works for what you want before buying.
                  -------------------------------------------
                  Richard Williams, Notre Dame Dept of Sociology
                  Stata Version: 15.1MP (2 processor)

                  EMAIL: rwilliam@ND.Edu
                  WWW: https://www3.nd.edu/~rwilliam


                  • #10
                    Success! I converted all the string variables to numeric ones, then saved as Stata 12 version. The file opened without problems. If I'd been more logical I would have tried these fixes one at a time, as I now don't know which one (or both) was responsible for the issue, but no matter.
                    The direct transfer solution is on standby for further occurrences.
                    Many thanks to all of you for your help - you have saved me a lot of time.


                    • #11
                      Hi John. Out of curiosity, did you use the AUTORECODE command in SPSS to convert string variables to numeric? I ask because, if you did, the original string values would be preserved as value labels, and they could come in handy down the road.
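For readers unfamiliar with the idea, the sketch below shows in Python what such a recode amounts to: each distinct string gets a consecutive integer code, and the code-to-string mapping is kept as labels. It is only an illustration of the concept, assuming alphabetical ordering of the codes; it does not reproduce SPSS's exact handling of blanks or missing values, and the function name `autorecode` is borrowed purely for readability.

```python
def autorecode(values):
    """Map distinct strings to 1..n in sorted order; return (codes, labels)."""
    distinct = sorted(set(values))
    code_of = {text: code for code, text in enumerate(distinct, start=1)}
    codes = [code_of[text] for text in values]
    # Keep the reverse mapping so the original strings survive as labels.
    labels = {code: text for text, code in code_of.items()}
    return codes, labels

codes, labels = autorecode(["South", "North", "South", "East"])
print(codes)   # [3, 2, 3, 1]
print(labels)  # {1: 'East', 2: 'North', 3: 'South'}
```

The numeric column is what gets written to the .dta file, while the labels dictionary plays the role of the value labels.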

                      Cheers,
                      Bruce
                      --
                      Bruce Weaver
                      Email: bweaver@lakeheadu.ca
                      Web: http://sites.google.com/a/lakeheadu.ca/bweaver/
                      Stata version: 15.1 IC (Windows)


                      • #12
                        Originally posted by John Stephenson
                        ...However, I would not have thought it was a problem to be running a newer version of Stata..
                        John has already solved his problem, but I would be interested in understanding what actually went wrong. Stata 14 supports Stata 13 data (if it is correctly written) and has greatly improved the diagnostics and error messages issued when opening data files. Based on the above symptoms, the problem likely lies with SPSS.

                        John, could you please remove the bulk of your data from the original SPSS file leaving perhaps a handful of observations (ideally with non-empty values of those strings that you've purged) and check that you can still reproduce the issue with a smaller file. If so, I would be glad to receive the original SAV (small) and the saved-as DTA (small). You can scramble the content of those variables, I don't think this would affect reproducibility of the problem.
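The request above (keep a handful of observations with non-empty strings, scramble their content) could be prototyped along these lines. This is a sketch on plain Python dictionaries, not on an actual SAV file; the helper names `scramble` and `small_sample` and the column names are hypothetical. It keeps each string's length and punctuation, on the assumption that those matter more for reproducing the problem than the actual characters.

```python
import random
import string

def scramble(text, rng):
    """Replace letters and digits with random ones, keeping length and other characters."""
    out = []
    for ch in text:
        if ch.isalpha():
            out.append(rng.choice(string.ascii_letters))
        elif ch.isdigit():
            out.append(rng.choice(string.digits))
        else:
            out.append(ch)
    return "".join(out)

def small_sample(rows, string_columns, n=5, seed=0):
    """Keep the first n rows whose string columns are all non-empty, with scrambled text."""
    if n <= 0:
        return []
    rng = random.Random(seed)
    kept = []
    for row in rows:
        if all(row[col].strip() for col in string_columns):
            new_row = dict(row)  # copy so the original rows stay untouched
            for col in string_columns:
                new_row[col] = scramble(row[col], rng)
            kept.append(new_row)
            if len(kept) == n:
                break
    return kept

# Hypothetical rows standing in for a few observations from the large file.
rows = [{"id": 1, "name": "Ann Lee"}, {"id": 2, "name": ""}, {"id": 3, "name": "Bob-7"}]
print(small_sample(rows, ["name"], n=2))
```

One caveat: replacing every letter with an ASCII letter would hide a problem caused by non-ASCII characters, so if the small scrambled file no longer triggers the error, that is itself a useful clue.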

                        Thank you, Sergiy Radyakin


