  • Extracting a subset of data from a big zip file

    I have a very large dataset in a zip file. Is there any way I could extract a subset of the data without unzipping the file? Thanks in advance for your comments!

  • #2
    There is a user-written command -zipuse- that you can get from SSC which will do this for you.



    • #3
      Let me point out that the zipuse command simply saves you from having to unzip the file manually: it unzips the file into a temporary file, which is subsequently deleted. If you are short of disk space on your system, zipuse is likely to run into the same limit as manually unzipping the file.



      • #4
        @Clyde Schechter how could I get a subset of the data? I do not want to run the entire database.



        • #5
          With the caveat that William noted, that zipuse, behind the scenes, unzips the entire file, you can nevertheless restrict what you bring into Stata by specifying a varlist of the variables you want to bring in, and by restricting which observations you take in with -if- and -in-, just as you would with native Stata's -use- applied to a Stata data set.

          If you are not familiar with these things, read the help file and manual section for the -use- command, and you can also read the help for -zipuse- by running -rnethelp "http://fmwww.bc.edu/RePEc/bocode/z/zipuse.hlp"- if you haven't installed -zipuse- on your computer.
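
          As a minimal sketch of what the varlist, -if-, and -in- restrictions look like with native -use-, here is an illustration on the auto.dta dataset that ships with Stata (not your data). The commented-out -zipuse- line at the end is an assumption based on the description above, and "bigdata.zip" is a hypothetical file name, so check the -zipuse- help file for the exact syntax before relying on it.

```stata
* Illustration with Stata's bundled auto.dta, saved to a temporary file
* so that it can be read back with -use ... using-.
sysuse auto, clear
tempfile full
save `full'

* Load only four variables, and only the observations for foreign cars.
use make price mpg foreign if foreign == 1 using `full', clear
describe, short
count

* Load only the first 10 observations of two variables.
use make price in 1/10 using `full', clear
count

* Assumed -zipuse- equivalent (hypothetical file name; verify the syntax
* in the -zipuse- help file):
* zipuse make price mpg foreign if foreign == 1 using "bigdata.zip", clear
```

          Only the requested variables and observations end up in memory, although, as noted earlier in the thread, -zipuse- still has to unzip the whole file to disk first.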



          • #6
            I need to select a certain code data from a list of various codes. What should I do?



            • #7
              Your question is unclear. "Code" usually refers to commands given to Stata in the Command window or do-file, whereas data refers to the variables and observations that Stata manipulates in response to those commands. Thus the term "code data" is inherently a contradiction in terms. On the other hand, in some contexts, "code" is often the name of a variable in a data set.

              So please write a longer post detailing what you are trying to do. Also, it is important to keep threads on topic. So unless what you are trying to do involves unzipping a zip file, please start a New Topic rather than continuing here.
