  • STATA not reading in all obs of large file

    I have a large file in SAS format which I have exported to csv format. The SAS output is: "76446565 records created". The exported csv and sas7bdat files are both approx 12GB.

    But when I read the csv file into STATA, I get "21 vars, 4,684,295 obs".

    I don't know why STATA only reads in a small subset of the obs (roughly 4.7M out of 76M). I have tried to adjust the memory, but when I do, STATA tells me "Memory no longer needs to be set in modern Statas; memory adjustments are performed on the fly automatically." I don't know what else to try.

    Thanks in advance for your help.

  • #2
    Also, I am using STATA/MP 14.2 so I am not limited by the number of obs I can read in.

    • #3
      Sometimes looking at the last few observations in Stata and the next few records in the data file can show a problem. Otherwise it's very difficult to diagnose these problems in a forum: the file is not visible to us and we can't expect you to post it.

      If you don't get a better answer you may need to wave this in front of StataCorp technical support.

      Meanwhile please note http://www.statalist.org/forums/help#spelling

      • #4
        A few ideas:
        What command did you use to import the data (-insheet-, -import delimited-, etc.)?
        Try opening the .csv file exported from SAS in a text editor (not a word processor) to confirm the size and number of rows in the file. You can also try -import delimited- with the rowrange() option to see whether the last few rows load (e.g. rowrange(76446560:76446565)), or import just the first column with colrange(1:1) and see whether all the observations/rows load.
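If a 12GB file is too big for your text editor, the row count can be checked outside Stata by streaming the file. A minimal Python sketch follows; the function name `count_lines` and the path in the usage note are hypothetical, and it assumes rows end in LF or CRLF (a file using bare CR line endings would need a different count).

```python
def count_lines(path, chunk_size=1 << 20):
    """Count physical lines in a file without loading it into memory.

    Assumes rows end in LF or CRLF; bare-CR line endings are not counted.
    """
    lines = 0
    last_byte = b""
    with open(path, "rb") as fh:
        while True:
            chunk = fh.read(chunk_size)
            if not chunk:
                break
            lines += chunk.count(b"\n")
            last_byte = chunk[-1:]
    # A final row with no trailing newline still counts as a line.
    if last_byte and last_byte != b"\n":
        lines += 1
    return lines
```

Calling `count_lines("bigfile.csv")` on the exported file should give about 76446565, plus one if the first line holds variable names. A count close to the 4,684,295 that Stata read would point at the file itself rather than at the import step.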

        I'd also check the end of line character (eol) for the last row (or the row just before or after that row) reached by Stata - perhaps some sort of character or structural break in the formatting is causing issues. -hexdump, from() to()- can come in handy for this type of troubleshooting.
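As an alternative to -hexdump-, the raw bytes around the row where Stata stopped can be pulled out with a short script. This is only a sketch: `raw_lines_around` and `suspicious_bytes` are hypothetical helper names, and which control bytes count as "suspicious" is my assumption, not something Stata documents.

```python
from itertools import islice


def raw_lines_around(path, line_no, context=2):
    """Return (line number, raw bytes) pairs for the lines around line_no (1-based)."""
    first = max(1, line_no - context)
    last = line_no + context
    with open(path, "rb") as fh:
        # Iterating a binary file splits on b"\n" only, so b"\r" and other
        # control bytes stay visible in the returned lines.
        window = islice(fh, first - 1, last)
        return [(first + i, raw) for i, raw in enumerate(window)]


def suspicious_bytes(raw):
    """List control bytes other than tab, CR and LF found in a raw line."""
    allowed = {0x09, 0x0A, 0x0D}
    return sorted({b for b in raw if b < 0x20 and b not in allowed})
```

If the first line of the csv holds variable names, observation 4,684,295 sits on line 4,684,296, so `raw_lines_around("bigfile.csv", 4684296)` would show that row and its neighbours. Printing each pair with `repr()` makes the line endings and any stray control characters visible, and `suspicious_bytes(raw)` flags the latter.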

        Rather than exporting directly from SAS to CSV, you could try saving an XPORT (transport) file in SAS and importing it in Stata using -import sasxport-, or use Stat/Transfer to convert directly from the sas7bdat file.
        Eric A. Booth | Senior Director of Research | Far Harbor | Austin TX

        • #5
          Thank you for the help. Will do as you suggest.
